This post is spot-on about basically everything it covers, and I'm really, really glad to see that someone like you thought of at least half of this on your own, discovering it independently. It's really good news that we have thinkers like that here.

The one thing that is not spot-on is the claim that "politics probably aren't as hard as you think". Politics are much harder, more hostile/malevolent, less predictable, and more evil than they appear. We didn't have to be born in a timeline where AI alignment was ever conceived of at all, in the first place, as opposed to being born on a timeline where people built AI but the concept of the Control Problem never occurred to anyone. So I think we're very fortunate that the concept of AI alignment exists in the first place, and it would be such an unfortunate waste if the whole enchilada were to be eviscerated by the political scene.

AI governance, and governance in general, is immensely complicated and full of self-interested and outright vicious people. Many of them are also extremely smart, competent, and/or paranoid about others encroaching on the little empire that they spend their entire lives building for themselves, brick by brick, as J. Edgar Hoover did. Any really good idea of governance is probably full of these random, unforeseeable "aha" moments that completely invalidate the entire good idea, because of some random factor that most smart people couldn't possibly have reasonably anticipated.

Please don't be discouraged, this is an uncharacteristically high-quality post on AI governance and I look forward to seeing more from you in the future. I've learned a lot from it and many others have too.

I recommend contributing to the $20k AI alignment rhetoric and one-liner contest, it needs more entries from competent people like you who know what they're talking about. It was forced off the front page by a bunch of naive people who know nothing about the situation with governance, so very few people are aware of the existence of that contest. If you (or anyone, really) put in 30 minutes thinking of a sorta clever quote (or just finding one) that can convince policymakers that AI alignment is a big deal, you will probably end up with $500 in your pocket; that's how badly the contest is neglected right now.

[-]Nicholas Kross4y*30

Thanks! FWIW part of the point here is that "AI Governance" includes (but is not limited to) "real politics", which I assume are as bad as (or worse than) everyone here thinks. Hence the examples section mostly being NGOs.

And thanks for letting me know about the contest, ~~is there a limit on number of submissions?~~ (EDIT: there appears to not be a limit beyond whatever LW already uses for spam filtering, ofc). I can write a lot of quotes for $500.

[-][anonymous]4y90

It's good that you're willing to make a lot of submissions for $500, because the way things are going, you'll probably get $500 per submission for several submissions.

[-]sludgepuddle4y40

How do we deal with institutions that don't want to be governed, say idk the Chevron corporation, North Korea, or the US military?

[-]samshap4y60

In my model, Chevron and the US military are probably open to AI governance, because: 1 - they are institutions traditionally enmeshed in larger cooperative/rule-of-law systems, AND 2 - their leadership is unlikely to believe they can do AI 'better' than the larger AI community.

My worry is instead about criminal organizations and 'anti-social' states (e.g. North Korea) because of #1, and big tech because of #2.

Because of location, EA can (and should) make decent connections with US big tech. I think the bigger challenge will be tech companies in other countries, especially China.

[-]Nicholas Kross4y20

My co-blogger Devin saw this comment before I did, so these points are his. Just paraphrasing:

We can still do a lot without "coordinating" every player, and governance doesn't mean we should be ham-fisted about it.

Furthermore, even just doing coordination/governance work with some of the major US tech companies (OpenAI, Google, Microsoft, Facebook) would be really good, since they tend to be ahead of the curve (as far as we know) with the relevant technologies.

Devin also noted that there could be tension between "we're coordinating to all extend our AI development timelines somewhat so things aren't rushing forward" and "OpenAI originally had a goal to develop aligned AI before anyone else developed unaligned AI". However, I think this sort of thing is minor, and doing more governance now requires some flexibility anyway.

[-]Sandi4y30

How many of the decision makers in the companies mentioned care about or even understand the control problem? My impression was: not many.

Coordination is hard even when you share the same goals, but we don't have that luxury here.

An OpenAI team is getting ready to train a new model, but they're worried about its self-improvement capabilities getting out of hand. Luckily, they can consult MIRI's 2025 Reflexivity Standards when reviewing their codebase, and get 3rd-party auditing done by The Actually Pretty Good Auditing Group (founded 2023).

Current OpenAI wants to build AGI.^[1] Current MIRI could confidently tell them that this is a very bad idea. Sure, they could be advised that step 25 of their AGI building plan is dangerous, but so were steps 1 through 24.

MIRI's advice to them won't be "oh implement this safety measure and you're golden" because there's no such safety measure because we won't have solved alignment by then. The advice will be "don't do that", as it is currently, and OpenAI will ignore it, as they do currently.

^[1] Sure, they could actually mean "build AGI in a few decades when alignment is solved and we're gonna freeze all our current AGI building efforts long before then", but no they don't.

[-]Nicholas Kross4y20

At one point (working off memory here), Sam Altman (leader of OpenAI) didn't quite agree with the orthogonality thesis. After some discussion and emailing with someone on the Eleuther discord (iirc), he shifted to agree with it more fully. I think.

This ties into my overall point of "some of this might be adversarial, but first let's see if it's just straight-up neglected along some vector we haven't looked much at yet".

Quick Thoughts on A.I. Governance

The case for governance now (Skip if you're already convinced)

Specific stories of how this would really look in the real world, for real

Find and use existing coordination mechanisms

Politics VS the other stuff