
Regarding the situation at OpenAI, I think it's important to keep a few historical facts in mind:

  1. The AI alignment community has long stated that an ideal FAI project would have a lead over competing projects. See e.g. this post:

Requisite resource levels: The project must have adequate resources to compete at the frontier of AGI development, including whatever mix of computational resources, intellectual labor, and closed insights are required to produce a 1+ year lead over less cautious competing projects.

  2. The scaling hypothesis wasn't obviously true around the time OpenAI was founded. Back then, it was widely assumed that regulation would be ineffectual because algorithms can't be regulated. Only now, with GPUs looking like the bottleneck, does the regulation strategy seem viable.

What happened with OpenAI? One story is something like:

  • AI safety advocates attracted a lot of attention in Silicon Valley with a particular story about AI dangers and what needed to be done.

  • Part of this story involved an FAI project with a lead over competing projects. But the story didn't come with easy-to-evaluate criteria for whether a leading project counted as a good "FAI project" or a bad "UFAI project". Thinking about AI alignment is epistemically cursed; people who think about the topic independently rarely have similar models.

  • DeepMind was originally the consensus "FAI project", but Elon Musk started OpenAI because Larry Page has e/acc beliefs.

  • OpenAI hired employees with a distribution of beliefs about AI alignment difficulty, some of whom may be motivated primarily by greed or power-seeking.

  • At a certain point, that distribution got "truncated" when the most safety-concerned employees left to form Anthropic.

Presumably at this point, every major project believes it's best if it wins, due to self-serving biases.

Some possible lessons:

  • Do more message red-teaming. If an organization like AI Lab Watch had been founded 10+ years ago, and its work had been baked into AI safety messaging along with "the FAI project needs a clear lead", then we could've spent the past 10 years building consensus on how to anoint one or just a few "FAI projects". And the campaign for an AI Pause could instead be a campaign to "pause all AGI projects except the anointed FAI project". So, when we look back on the current messaging in 10 years, what mistakes will seem obvious in hindsight? And if the present situation is partly a result of MIRI's past messaging, perhaps we should ask hard questions about its current pivot towards messaging? (Note: I could be accused of grinding my personal axe here, because I'm rather dissatisfied with current AI Pause messaging.)

  • Assume AI acts like a magnet for greedy power-seekers. Make decisions accordingly.