[Intro to brain-like-AGI safety] 1. What's the problem & Why work on it now?
(Last revised: January 2026. See changelog at the bottom.)

1.1 Post summary / Table of contents

This is the first of a series of blog posts on the technical safety problem for hypothetical future brain-like Artificial General Intelligence (AGI) systems. That previous sentence might raise a few questions, such as:

* What is “AGI”?
* What is “brain-like AGI”?
* What is “the technical safety problem for brain-like AGI”?
* If these are “hypothetical future systems”, then why on Earth am I wasting my time reading about them right now?

…So my immediate goal in this post is to answer all those questions! After we have that big-picture motivation under our belt, the other 14 posts of this 15-post series will dive into neuroscience and AGI safety in glorious technical detail. See the series cover page for the overall roadmap.

Summary of this first post:

* In §1.2, I define the “AGI technical safety problem”, put it in the context of other types of safety research (e.g. inventing passively-safe nuclear power plant designs), and relate it to the bigger picture of what it will take for AGI to realize its potential benefits to humanity.
* In §1.3, I define “brain-like AGI” as algorithms with big-picture similarity to key ingredients of human intelligence. Future researchers might make such algorithms by reverse-engineering aspects of the brain, or by independently reinventing the same tricks. Doesn’t matter. I argue that “brain-like AGI” is a yet-to-be-invented AI paradigm, quite different from large language models (LLMs). I will also bring up the counterintuitive idea that “brain-like AGI” can (and probably will) have radically nonhuman motivations. I won’t explain that here, but I’ll finish that story by the end of Post #3.
* In §1.4, I define the term “AGI”, as I’m using it in this series.
* In §1.5, I discuss whether it’s likely that people will eventually make brain-like AGIs, as opposed to some other kind of AGI (or just not invent AGI at all). The section includes seven