Here's the question: Is one year of additional alignment research more beneficial than one more year of hardware overhang is harmful?

One problem I see is that alignment research can be of variable or questionable value, while hardware overhang is an economic certainty.

What if we get to the 2020s and it turns out all the powerful AIs are LLMs? /s

I don't know how much that has affected the value of the alignment research already completed, but that twist in the story must have had some impact on our understanding of what the important or useful research ought to be.

2 comments

“What if we get to the 2020s and it turns out all the powerful AIs are LLMs?”

Did you mean 2030s, or am I misinterpreting something?

I think what the OP was saying is that in, say, 2013, there was no way we could have predicted the type of agent that LLMs are, or that they would become the most powerful AIs available. So nobody was saying "What if we get to the 2020s and it turns out all the powerful AIs are LLMs?" back then, and that raises a question about the value of the alignment work done before then.

If we extend that to the future, we would expect most good alignment research to happen within a few years of AGI, once it becomes clear what type of agent we're going to get. Alignment research is much harder if, ten years from now, the thing that becomes AGI is as unexpected to us as LLMs were ten years ago.

Thus, goes the argument, there's not much difference between getting AGI in five years with LLMs and getting it in fifteen years with God only knows what, since it's the last few years that matter.

A hardware overhang, on the other hand, would be bad. Imagine LLMs had come onto the scene with 2030s hardware already available: you'd have Vaswani et al. coming out in 2032, and by 2035 you'd have GPT-8. That would be terrible.

Therefore, says the steelman, the best scenario is if we are currently in a slow takeoff that gives us time. The hardware overhang is never going to be smaller than it is right now, which means scaling the LLM paradigm is limited not only by conceptual understanding and engineering effort but also by the sheer availability of compute, and that caps how fast things can go. That constraint may not hold if a new type of agent arrives in ten years.