By contrast, some reasons to be skeptical that AI will be automating more than a few percent of the economy by 2033 (still absent AI R&D feedback loops and/or catastrophe):
I currently expect substantial AI R&D acceleration from AIs that are capable of cheap and fast superhuman performance at arbitrary easy-and-cheap-to-check domains (especially if these AIs are very superhuman). Correspondingly, I think "absent AI R&D feedback loops" might be doing a lot of the work here. Minimally, I think full automation of research engineering would yield a large acceleration (e.g. 3x faster AI progress), though this requires high performance on (some) non-formal domains. I think if AIs were (very) superhuman at easy-and-cheap-to-check stuff, you'd probably be able to use them for a lot of coding tasks and for some small-scale research tasks that might transfer well enough, and there would be a decent chance of enough transfer to extend substantially beyond this.
I think I still agree with the bottom line "it's plausible that for several years in the late 2020s/early 2030s, we will have AI that is vastly superhuman at formal domains including math, but still underperforms humans at most white-collar jobs". And I agree more strongly if you change "underperforms humans at most white-collar jobs" to "can't yet fully automate AI R&D".
Reliable informal-to-formal translation: solution verifiers need to be robust enough to avoid too much reward hacking, which probably requires natural language problems and solutions to be formalized to some degree (a variety of arrangements seem possible here, but it's hard to see how something purely informal can provide sufficiently scalable supervision, and it's hard to see how something purely formal can capture mathematicians' intuitions about what problems are interesting).
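As a toy illustration of what "formalized to some degree" might look like, here's a minimal sketch in Lean 4 (assuming only core Lean, no Mathlib; the theorem name and variable names are just illustrative) of an informal claim, "the sum of two even natural numbers is even," translated into a statement the proof checker can verify mechanically:

```lean
-- Informal claim: "the sum of two even natural numbers is even."
-- A toy formalization checked by the Lean kernel; evenness is written out
-- directly as an existential rather than using a library definition.
theorem even_add_even {a b : Nat}
    (ha : ∃ i, a = 2 * i) (hb : ∃ j, b = 2 * j) :
    ∃ k, a + b = 2 * k := by
  cases ha with
  | intro i hi =>
    cases hb with
    | intro j hj =>
      -- a + b = 2 * i + 2 * j = 2 * (i + j)
      exact ⟨i + j, by rw [hi, hj, Nat.mul_add]⟩
```

Once a claim is in this form, checking it is cheap and essentially unhackable; the hard part is the translation step, i.e. getting from the informal problem and solution to something like the above without losing what made the informal version interesting.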
Do you think OpenAI and GDM's recent results on IMO are driven by formalization in training (e.g. the AI formalizes and checks the proof) or some other verification strategy? I'd pretty strongly guess some other verification strategy that involves getting AIs to be able to (more) robustly check informal proofs. (Though maybe this verification ability was partially trained using formalization? I currently suspect not.) This also has the advantage of allowing for (massive amounts of) runtime search that operates purely using informal proofs.
I think an important alternative to informal-to-formal translation is instead just getting very robust AI verification of informal proofs that scales sufficiently with capabilities. This is mostly how humans have gotten increasingly good at proving things in mathematics, and I don't see a strong reason to think this is infeasible.
If superhuman math is most easily achieved by getting robust verification, and this robust verification strategy generalizes to other non-formal but relatively easy-to-check domains (e.g. various software engineering tasks), then we'll also see high levels of capability in these other domains, including domains that are more relevant to AI R&D and the economy.
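To gesture at what "relatively easy to check" means for software engineering tasks, here's a minimal Python sketch (the task, test suite, and function names are hypothetical, not from any particular training setup) where the verifier just runs a fixed test suite against a candidate solution and returns a binary signal:

```python
# Hypothetical sketch: an "easy to check" software engineering task where the
# verifier runs a fixed test suite against a candidate solution and returns
# pass/fail. All names here are illustrative.
import subprocess
import sys
import tempfile
from pathlib import Path

TESTS = """\
from solution import add
assert add(2, 3) == 5
assert add(-1, 1) == 0
"""

def check_candidate(candidate_source: str) -> bool:
    """Return True iff the candidate passes the fixed test suite."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "solution.py").write_text(candidate_source)
        Path(tmp, "run_tests.py").write_text(TESTS)
        result = subprocess.run(
            [sys.executable, "run_tests.py"],
            cwd=tmp,
            capture_output=True,
            timeout=10,
        )
        return result.returncode == 0

# A correct candidate passes; a buggy one fails.
print(check_candidate("def add(a, b):\n    return a + b"))  # True
print(check_candidate("def add(a, b):\n    return a - b"))  # False
```

Outcome checks like this are cheap and fairly robust, which is part of why these domains seem like plausible targets for the same sort of scalable verification.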
Probably it would be more accurate to say "doesn't seem to help much, while it helps a lot for OpenAI models".
Some people seem to think my timelines have shifted a bunch, while they've actually only changed moderately.
Relative to my views at the start of 2025, my median (50th percentile) for AIs fully automating AI R&D was pushed back by around 2 years—from something like Jan 2032 to Jan 2034. My 25th percentile has shifted similarly (though perhaps more importantly) from maybe July 2028 to July 2030. Obviously, my numbers aren't fully precise and vary some over time. (E.g., I'm not sure I would have quoted these exact numbers for this exact milestone at the start of the year; these numbers for the start of the year are partially reverse engineered from this comment.)
Fully automating AI R&D is a pretty high milestone; my current numbers for something like "AIs accelerate AI R&D as much as what would happen if employees ran 10x faster (e.g. by ~fully automating research engineering and some other tasks)" are probably 50th percentile Jan 2032 and 25th percentile Jan 2029.[1]
I'm partially posting this so there is a record of my views; I think it's somewhat interesting to observe this over time. (That said, I don't want to anchor myself, which does seem like a serious downside. I should slide around a bunch and be somewhat incoherent if I'm updating as much as I should: my past views are always going to be somewhat obviously confused from the perspective of my current self.)
While I'm giving these numbers, note that I think Precise AGI timelines don't matter that much.
See this comment for the numbers I would have given for this milestone at the start of the year. ↩︎
I've updated towards somewhat longer timelines again over the last 5 months. Maybe my 50th percentile for this milestone is now Jan 2032.
Mostly some AI company employees with shorter timelines than mine. I also think that "why I don't agree with X" is a good prompt for expressing some deeper aspect of my models/views. It also makes for a reasonably engaging hook for a blog post.
I might write some posts responding to arguments for longer timelines that I disagree with, if I feel like I have something interesting to say.
This is only somewhat related to what you were saying, but I do think the difference between 100-year medians and 10-year medians matters a bunch.
Presumably, another problem with your caveated version of the title is that you don't expect literally everyone to die (at least not with high confidence) even if AIs take over.