
General Reasoning in LLMs

May 15, 2025 by eggsyntax

Some ML research suggests that LLMs, despite being highly capable in some ways, are not capable of the kind of general reasoning that humans perform, and are unable to generalize to new domains.

The main reason that this is important from a safety perspective is that it seems likely to significantly impact timelines. If LLMs are fundamentally incapable of certain kinds of reasoning, and scale won't solve this (at least in the next couple of orders of magnitude), and scaffolding doesn't adequately work around it, then we're at least one significant breakthrough away from dangerous AGI -- it's pretty hard to imagine an AI system executing a coup if it can't successfully schedule a meeting with several of its co-conspirator instances.

Further, one key feature of many fast-timeline scenarios such as AI 2027 and Situational Awareness is LLMs accelerating AI research. Their ability to do so depends substantially on whether they can do the sort of general reasoning, especially in novel domains, that humans do.

This sequence attempts to evaluate whether LLMs are in fact capable of general reasoning.

  1. LLM Generality is a Timeline Crux (06/2024) introduces the question, and attempts to evaluate the strongest arguments on each side.
  2. LLMs Look Increasingly Like General Reasoners (11/2024) gives an update, pointing out that a number of the strongest arguments against general reasoning no longer hold for the models that have come out in the interim.
  3. Numberwang: LLMs Doing Autonomous Research, and a Call for Input (01/2025, with @ncase) presents an initial empirical experiment trying to answer the question, lays out a larger experiment, and requests feedback on whether that larger experiment would provide a compelling answer.
  4. (Forthcoming) describes the results of that experiment, providing evidence that LLMs are capable of scientific research in (toy) novel domains. They are unreliable at it, but research from METR suggests that LLM reliability across the board increases over time in a fairly predictable way; if they can do something unreliably now, they'll likely be able to do it reliably later.

I now consider this question answered to my satisfaction: I see no fundamental architectural limitation that will prevent LLMs from scaling to AGI and beyond. Of course that doesn't mean I'm 100% confident, just that I'm sufficiently confident that trying to answer this question better no longer seems like my most valuable use of time. I'd say I'm 85% confident that, if progress continues as expected (i.e. if we don't hit a data wall, have unexpected compute shortages, etc.), LLMs/LRMs will reach the point of contributing meaningfully to AI research, and 80% confident that they'll reach AGI in the sense of being 'able to do the large majority of cognitive tasks as well as a large majority of humans'.
