Orpheus16

Sequences

Leveling Up: advice & resources for junior alignment researchers

Comments

chanamessinger's Shortform
Orpheus16 · 5d

“if you care about this, here’s a way to get involved”

My understanding is that MIRI expects alignment to be hard, believes an international treaty will be needed, and thinks that a considerable proportion of the work branded as "AI safety" is either unproductive or counterproductive.

MIRI could of course be wrong, and it's fine to have an ecosystem where people are pursuing different strategies or focusing on different threat models.

But I also think there's something of a missing mood here insofar as the post is explicitly about the MIRI book. The ideal pipeline for people who resonate with the MIRI book may look very different from the typical pipelines for people who get interested in AI risk (and indeed, I suspect the MIRI book is intended to spawn a different kind of community and a different set of projects than those that dominated the 2020-2024 period, for example).

Relatedly, I think this is a good opportunity for orgs/people to reassess their culture, strategy, and theories of change. For example, I suspect many groups/individuals would not have predicted that a book making the AI extinction case so explicitly and unapologetically would have succeeded. To the extent that the book does succeed, it suggests that some common models of "how to communicate about risk" or "what solutions are acceptable/reasonable to pursue" may be worth re-examining. 

peterbarnett's Shortform
Orpheus16 · 8d

@Carl_Shulman what do you intend to donate to and on what timescale? 

(Personally, I am sympathetic to weighing the upside of additional resources in one's considerations, though I think it would be worthwhile for you to explain what kinds of things you plan to donate to and when you expect those donations to be made, with the caveat, of course, that things could change.)

I also think there is more virtue in having a clear plan, and/or a clear sense of which gaps you see in the current funding landscape, than in a nebulous sense of "I will acquire resources and then hopefully figure out something good to do with them."

AI Induced Psychosis: A shallow investigation
Orpheus16 · 14d

More importantly from my own perspective:  Some elements of human therapeutic practice, as described above, are not how I would want AIs relating to humans.  Eg:

"Non-Confrontational Curiosity: Gauges the use of gentle, open-ended questioning to explore the user's experience and create space for alternative perspectives without direct confrontation."

Can you say more about why you would not want an AI to relate to humans with "non-confrontational curiosity?"

It appears to me that your comment is arguing against a situation in which the AI system has a belief about what the user should think/do but, instead of saying so directly, tries to subtly manipulate the user into adopting this belief.

I read the "non-confrontational curiosity" approach as describing a different situation: one in which the AI system does not necessarily have a belief about what the user should think/do, and simply asks some open-ended reflection questions in an attempt to help the user crystallize their own views (without a target end state in mind).

I think many therapists who use the "non-confrontational curiosity" approach would say, for example, that they are usually not trying to get the client to a predetermined outcome, but rather are genuinely trying to help the client explore their own feelings/thoughts on a topic, and don't have any stake in reaching a particular end destination. (Note that I'm thinking of therapists who use this style with people who are not in extreme distress, e.g., members of the general population with mild depression/anxiety/stress. This model may not be appropriate for people with more severe issues, e.g., severe psychosis.)

SE Gyges' response to AI-2027
Orpheus16 · 25d

If AI 2027 wants to cause stakeholders like the White House's point man on AI to take the idea of a pause seriously, instead of considering a pause to be something which might harm America in an arms race with China, it appears to have failed completely at doing that.

This seems like an uncharitable reading of the Vance quote IMO. The fact that you have the Vice President of the United States mentioning that a pause is even a conceivable option due to concerns about AI escaping human control seems like an immensely positive outcome for any single piece of writing.

The US policy community has been engaged in great power competition with China for over a decade. The default frame for any sort of emerging technology is "we must beat China." 

The fact that Vance did not immediately dismiss the prospect of slowing down suggests to me that he has at least some genuine understanding of and appreciation for the misalignment/loss-of-control threat model.

A pause obviously hurts the US in the AI race with China. But the AI race with China is not a construct that AI2027 invented; policymakers have been talking about the AI race for a long time. They usually think about AI as a "normal technology" (sort of like "we must lead in drones"), rather than a race to AGI or superintelligence.

But overall, I would not place the blame on AI2027 for causing people to think about pausing in the context of US-China AI competition. Rather, if one appreciates the baseline (the US should lead, the US must beat China, go faster on emerging tech), the fact that Vance did not immediately dismiss the idea of pausing (and instead brought up what is, IMO, a reasonable consideration about whether one could figure out if China was going to pause/slow down) is a big accomplishment.

RTFB: The RAISE Act
Orpheus16 · 2mo

Again, while I have concerns that the bill is insufficiently strong, I think all of this is a very good thing. I strongly support the bill.

Suppose you magically gained a moderate amount of Political Will points and could spend them on 1-2 things that would make the bill stronger (or on introducing a separate bill; no need to anchor too much on the current RAISE vibe).

What do you think are the 1-2 things you'd change about RAISE or the 1-2 extra things you'd push for?

Substack and Other Blog Recommendations
Orpheus16 · 2mo

I would be excited about someone doing a blog on what the AI companies are doing re: AI policy (including comms that are relevant to policy or directed at policymakers).

I suspect good posts from such a blog would be shared reasonably frequently among tech policy staffers in DC.

(Not necessarily saying this needs to be you).

Comparing risk from internally-deployed AI to insider and outsider threats from humans
Orpheus16 · 3mo

First, when I talk to security staff at AI companies about computer security, they often seem to fail to anticipate what insider threat from AIs will be like.

Why do you think this? Is it that they are not thinking about large numbers of automated agents running around doing a bunch of research?

Or is it that they are thinking about these kinds of scenarios, and yet they still don't apply the insider threat frame for some reason?

Ryan Kidd's Shortform
Orpheus16 · 3mo

My understanding is that AGI policy is pretty wide open under Trump. I don't think he or most of his close advisors have entrenched views on the topic.

If AGI is developed in this Admin (or we approach it in this Admin), I suspect there is a lot of EV on the table for folks who are able to explain core concepts/threat models/arguments to Trump administration officials.

There are some promising signs of this so far. Publicly, Vance has engaged with AI2027. Non-publicly, I think there is a lot more engagement/curiosity than many readers might expect.

This isn't to say "everything is great and the USG is super on track to figure out AGI policy"; it's more to say "I think people should keep an open mind. Even people who disagree with the Trump Admin on mainstream topics should remember that AGI policy is a weird/niche/new topic where lots of people do not have strong/entrenched/static positions (and even those who do have a position may change their mind as new events unfold)."

Ryan Kidd's Shortform
Orpheus16 · 3mo

There are definitely still benefits to doing alignment research, but this only justifies the idea that doing alignment research is better than doing nothing.

IMO the thing that matters (for an individual making decisions about what to do with their career) is something more like "on the margin, would it be better to have one additional person do AI governance or alignment/control?"

I happen to think that, given the current allocation of talent, on the margin it's generally better for people to choose AI policy (particularly efforts to contribute technical expertise or technical understanding/awareness to governments, think tanks interfacing with governments, etc.). There is a lot of demand in the policy community for these skills/perspectives and few people who can provide them. In contrast, technical expertise is much more common at the major AI companies (though perhaps some specific technical skills or perspectives on alignment are neglected).

In other words, my stance is something like "by default, a given technical person would have more expected impact in AI policy unless they seem like an unusually good fit for alignment or an unusually bad fit for policy."

Orpheus16's Shortform
Orpheus16 · 3mo

There's a video version of AI2027 that is quite engaging/accessible. Over 1.5M views so far.

Seems great. My main critique is that the "good ending" seems to assume alignment is rather easy to figure out, though admittedly that might be more of a critique of AI2027 itself than of the way the video portrays it.

Posts

Verification methods for international AI agreements (14 karma · 1y · 1 comment)
Advice to junior AI governance researchers (66 karma · 1y · 1 comment)
Mitigating extreme AI risks amid rapid progress [Linkpost] (21 karma · 1y · 7 comments)
Orpheus16's Shortform (7 karma · 1y · 101 comments)
Cooperating with aliens and AGIs: An ECL explainer (57 karma · 2y · 8 comments)
OpenAI's Preparedness Framework: Praise & Recommendations (66 karma · 2y · 1 comment)
Speaking to Congressional staffers about AI risk (312 karma · 2y · 25 comments)
Navigating emotions in an uncertain & confusing world (42 karma · 2y · 1 comment)
Chinese scientists acknowledge xrisk & call for international regulatory body [Linkpost] (44 karma · 2y · 4 comments)
Winners of AI Alignment Awards Research Contest (115 karma · 2y · 4 comments)