Staying Sane While Taking Ideas Seriously


I mean, I don't really care how much e.g. Facebook AI thinks they're racing right now. They're not in the game at this point.

The race dynamics are not just about who's leading. FB is 1-2 years behind (looking at LLM metrics), and it doesn't seem like they're getting further behind OpenAI/Anthropic with each generation, so I expect that the lag at the end will be at most a few years.

That means that if Facebook is unconstrained, the leading labs have only that much time to slow down for safety (or prepare a pivotal act) as they approach AGI before Facebook gets there with total recklessness.

If Microsoft!OpenAI lags the new leaders by less than FB (and I think that's likely to be the case), that shortens the safety window further.

I suspect my actual crux with you is your belief (correct me if I'm misinterpreting you) that your research program will solve alignment and that it will not take much of a safety window for the leading lab to incorporate the solution, and therefore the only thing that matters is finishing the solution and getting the leading lab on board. It would be very nice if you were right, but I put a low probability on it.

I'm surprised that nobody has yet brought up the development that the board offered Dario Amodei the position as a merger with Anthropic (and Dario said no!).

(There's no additional important content in the original article by The Information, so I linked the Reuters paywall-free version.)

Crucially, this doesn't tell us in what order the board made this offer to Dario and the other known figures (GitHub CEO Nat Friedman and Scale AI CEO Alex Wang) before getting Emmett Shear, but it's plausible that merging with Anthropic was Plan A all along. Moreover, I strongly suspect that the bad blood between Sam and the Anthropic team was strong enough that Sam had to be ousted in order for a merger to be possible.

So under this hypothesis, the board decided it was important to merge with Anthropic (probably to slow the arms race), booted Sam (using the additional fig leaf of whatever lies he's been caught in), immediately asked Dario and were surprised when he rejected them, did not have an adequate backup plan, and have been scrambling ever since.

P.S. Shear is known to be very much on record worrying that alignment is necessary and not likely to be easy; I'm curious what Friedman and Wang are on record as saying about AI x-risk.

No, I don't think the board's motives were power politics; I'm saying that they failed to account for the kind of political power moves that Sam would make in response.

In addition to this, Microsoft will exert greater pressure to extract mundane commercial utility from models, compared to pushing forward the frontier. Not sure how much that compensates for the second round of evaporative cooling of the safety-minded.

If they thought this would be the outcome of firing Sam, they would not have done so.

The risk they took was calculated, but man, are they bad at politics.

  1. The quote is from Emmett Shear, not a board member.
  2. The board is also following the "don't say anything literally false" policy by saying practically nothing publicly.
  3. Just as I infer from Shear's qualifier that the firing did have something to do with safety, I infer from the board's public silence that their reason for the firing isn't one that would win back the departing OpenAI members (or would only do so at a cost that's not worth paying). 
  4. This is consistent with it being a safety concern shared by the superalignment team (who by and large didn't sign the statement at first) but not by the rest of OpenAI (who view pushing capabilities forward as a good thing, because like Sam they believe the EV of OpenAI building AGI is better than the EV of unilaterally stopping). That's my current main hypothesis.

It's too late for a conditional surrender now that Microsoft is a credible threat to get 100% of OpenAI's capabilities team; Ilya and Jan are communicating unconditional surrender because the alternative is even worse.

I agree, it's critical to have a very close reading of "The board did *not* remove Sam over any specific disagreement on safety".

This is the kind of situation where every qualifier in a statement needs to be understood as essential—if the statement were true without the word "specific", then I can't imagine why that word would have been inserted.

The most likely explanation I can think of, for what look like about-faces by Ilya and Jan this morning, is realizing that the worst plausible outcome is exactly what we're seeing: Sam running a new OpenAI at Microsoft, free of that pesky charter. Any amount of backpedaling, and even resigning in favor of a less safety-conscious board, is preferable to that.

They came at the king and missed.

Load More