Orpheus16
Comments

RTFB: The RAISE Act
Orpheus16 · 5d · 30

Again, while I have concerns that the bill is insufficiently strong, I think all of this is a very good thing. I strongly support the bill.

Suppose you magically gained a moderate amount of Political Will points and could spend them on one or two things that would make the bill stronger (or introduce a separate bill– no need to anchor too much on the current RAISE vibe).

What are the one or two things you'd change about RAISE, or the one or two extra things you'd push for?

Substack and Other Blog Recommendations
Orpheus16 · 5d · 20

I would be excited about someone writing a blog on what the companies are doing RE AI policy (including comms that are relevant to policy or directed at policymakers).

I suspect good posts from such a blog would be shared reasonably frequently among tech policy staffers in DC.

(Not necessarily saying this needs to be you).

Comparing risk from internally-deployed AI to insider and outsider threats from humans
Orpheus16 · 12d · 20

"First, when I talk to security staff at AI companies about computer security, they often seem to fail to anticipate what insider threat from AIs will be like."

Why do you think this? Is it that they are not thinking about large numbers of automated agents running around doing a bunch of research?

Or is it that they are thinking about these kinds of scenarios, and yet they still don't apply the insider threat frame for some reason?

Ryan Kidd's Shortform
Orpheus16 · 15d · 20

My understanding is that AGI policy is pretty wide open under Trump. I don't think he and most of his close advisors have entrenched views on the topic.

If AGI is developed in this Admin (or we approach it in this Admin), I suspect there is a lot of EV on the table for folks who are able to explain core concepts/threat models/arguments to Trump administration officials.

There are some promising signs of this so far. Publicly, Vance has engaged with AI2027. Non-publicly, I think there is a lot more engagement/curiosity than many readers might expect.

This isn't to say "everything is great and the USG is super on track to figure out AGI policy"; it's more to say "I think people should keep an open mind– even people who disagree with the Trump Admin on mainstream topics should remember that AGI policy is a weird/niche/new topic where lots of people do not have strong/entrenched/static positions (and even those who do have a position may change their minds as new events unfold)."

Ryan Kidd's Shortform
Orpheus16 · 17d · 95

"There are definitely still benefits to doing alignment research, but this only justifies the idea that doing alignment research is better than doing nothing."

IMO the thing that matters (for an individual making decisions about what to do with their career) is something more like "on the margin, would it be better to have one additional person do AI governance or alignment/control?"

I happen to think that, given the current allocation of talent, on the margin it's generally better for people to choose AI policy. (Particularly efforts to contribute technical expertise or technical understanding/awareness to governments, think tanks interfacing with governments, etc.) There is a lot of demand in the policy community for these skills/perspectives and few people who can provide them. In contrast, technical expertise is much more common at the major AI companies (though perhaps some specific technical skills or perspectives on alignment are neglected).

In other words, my stance is something like "by default, anon technical person would have more expected impact in AI policy unless they seem like an unusually good fit for alignment or an unusually bad fit for policy."

Orpheus16's Shortform
Orpheus16 · 1mo · 110

There's a video version of AI2027 that is quite engaging/accessible. Over 1.5M views so far.

Seems great. My main critique is that the "good ending" seems to assume alignment is rather easy to figure out, though admittedly that might be more of a critique of AI2027 itself rather than the way the video portrays it.

What We Learned from Briefing 70+ Lawmakers on the Threat from AI
Orpheus16 · 1mo · 324

This is fantastic work. There's also something about this post that feels deeply empathic and humble, in ways that are hard to articulate but seem important for (some forms of) effective policymaker engagement.

A few questions:

  • Are you planning to do any of this in the US?
  • What have your main policy proposals or "solutions" been? It's becoming a lot more common for me to encounter policymakers who understand the problem (at least a bit) but are more confused about what kinds of solutions/interventions/proposals are needed (both in the short term and the long term).
  • Can you say more about what kinds of questions you encounter when describing loss of control, as well as what kinds of answers have been most helpful? I'm increasingly of the belief that getting people to understand "AI has big risks" is less important than getting people to understand "some of the most significant risks come from this unique thing called loss of control that you basically don't really have to think about for other technologies, and this is one of the most critical ways in which AI is different than other major/dangerous/dual-use technologies."
  • Did you notice any major differences between parties? Did you change your approach based on whether you were talking to Conservatives or Labour? Did they have different perspectives or questions? (My own view is that people on the outside probably overestimate the extent to which there are partisan splits on these concerns-- they're so novel that I don't think the mainstream parties have really entrenched themselves in different positions. But I would be curious if you disagree.)
    • Sub-question: Was there any sort of backlash against Rishi Sunak's focus on existential risks? Or the UK AI Security Institute? In the US, it's somewhat common for Republicans to assume that things Biden did were bad (and for Democrats to assume that things Trump does are bad). Have you noticed anything similar?
We're Not Advertising Enough (Post 3 of 7 on AI Governance)
Orpheus16 · 1mo · 1212

I think we should be careful not to overestimate the success of AI2027. "Vance has engaged with your work" is an impressive feat, but it's still relatively far away from something like "Vance and others in the Admin have taken your work seriously enough to start to meaningfully change their actions or priorities based on it." (That bar is very high, but my impression is that the AI2027 folks would be like "yeah, that's what would need to happen in order to steer toward meaningfully better futures.")

My impression is that AI2027 will have (even) more success if it is accompanied by an ambitious policymaker outreach effort (e.g., lots of 1-1 meetings with relevant policymakers and staffers, writing specific pieces of legislation or EOs and forming a coalition around those ideas, publishing short FAQ memos that address misconceptions or objections they are hearing in their meetings with policymakers, etc.) 

This isn't to say that research is unnecessary-- much of the success of AI2027 comes from Daniel (and others on the team) having dedicated much of their lives to research and deep understanding. There are plenty of Government Relations people who are decent at "general policy engagement" but will fail to provide useful answers when staffers ask things like "But why won't we just code in the goals we want?", or "But don't you think the real thing here is about how quickly we diffuse the technology?", or "Why don't you think existing laws will work to prevent this?" or a whole host of other questions.

But on the margin, I would probably have Daniel/AI2027 spend more time on policymaker outreach and less time on additional research (especially now that AI2027 is done). There is some degree of influence one can have with the "write something that is thoroughly researched and hope it spreads organically" effort, and I think AI2027 has essentially saturated that. For additional influence, I expect it will be useful for Daniel (or other competent communicators on his team) to advance to "get really good at having meetings with the ~100-1000 most important people, understanding their worldviews, going back and forth with them, understanding their ideological or political constraints, and finding solutions/ideas/arguments that are tailored to these particular individuals." This is still a very intellectual task in some ways, but it involves a lot more "having meetings" and "forming models of social/political reality" than the classic "sit in your room with a whiteboard and understand technical reality" stuff that we typically associate with research.

Eliezer and I wrote a book: If Anyone Builds It, Everyone Dies
Orpheus16 · 2mo · 328

Note that IFP (a DC-based think tank) recently had someone deliver copies of their new book to all 535 US Congressional offices.

Note also that my impression is that DC people (even staffers) are much less "online" than tech audiences. Whether or not you copy IFP, I would suggest thinking about in-person distribution opportunities for DC.

RA x ControlAI video: What if AI just keeps getting smarter?
Orpheus16 · 2mo · 70

"I think there are organizations that themselves would be more likely to be robustly trustworthy and would be more fine to link to"

I would be curious to hear your thoughts on which organizations you feel are robustly trustworthy.

Bonus points for a list that is kind of a weighted sum of "robustly trustworthy" and "having a meaningful impact RE improving public/policymaker understanding". (Adding this in because I suspect that it's easier to maintain "robustly trustworthy" status if one simply chooses not to do a lot of externally-focused comms, so it's particularly impressive to have the combination of "doing lots of useful comms/policy work" and "managing to stay precise/accurate/trustworthy").

Posts

14 · Verification methods for international AI agreements · 10mo · 1
66 · Advice to junior AI governance researchers · 1y · 1
21 · Mitigating extreme AI risks amid rapid progress [Linkpost] · 1y · 7
7 · Orpheus16's Shortform · 1y · 101
57 · Cooperating with aliens and AGIs: An ECL explainer · 1y · 8
66 · OpenAI's Preparedness Framework: Praise & Recommendations · 2y · 1
312 · Speaking to Congressional staffers about AI risk · 1y · 25
42 · Navigating emotions in an uncertain & confusing world · 2y · 1
44 · Chinese scientists acknowledge xrisk & call for international regulatory body [Linkpost] · 2y · 4
115 · Winners of AI Alignment Awards Research Contest · 2y · 4