What Indicators Should We Watch to Disambiguate AGI Timelines?
(Cross-post from https://amistrongeryet.substack.com/p/are-we-on-the-brink-of-agi, lightly edited for LessWrong. The original has a lengthier introduction and a bit more explanation of jargon.)

No one seems to know whether transformational AGI is coming within a few short years. Or rather, everyone seems to know, but they all have conflicting opinions.

Have we entered into what will in hindsight be not even the early stages, but actually the middle stage, of the mad tumbling rush into singularity? Or are we just witnessing the exciting early period of a new technology, full of discovery and opportunity, akin to the boom years of the personal computer and the web?

AI is approaching elite skill at programming, possibly barreling into superhuman status at advanced mathematics, and only picking up speed. Or so the framing goes.

And yet, most of the reasons for skepticism are still present. We still evaluate AI only on neatly encapsulated, objective tasks, because those are the easiest to evaluate. (As Arvind Narayanan says, “The actually hard problems for AI are the things that don't tend to be measured by benchmarks”.) There’s been no obvious progress on long-term memory. o1 and o3, the primary source of the recent “we are so back” vibe, mostly don’t seem better than previous models at problems that don’t have black-and-white answers[1]. As Timothy Lee notes, “LLMs are much worse than humans at learning from experience”, “large language models struggle with long contexts”, and “[LLMs] can easily become fixated on spurious correlations in their training data”.

Perhaps most jarringly, LLMs still haven’t really done anything of major impact in the real world. There are good reasons for this – it takes time to find productive applications for a new technology, people are slow to take advantage, etc. – but still, it’s dissatisfying.

I recently attempted to enumerate the fundamental questions that lie underneath most disagreements about AI policy, and number one on the l