Was a philosophy PhD student, left to work at AI Impacts, then Center on Long-Term Risk, then OpenAI. Quit OpenAI due to losing confidence that it would behave responsibly around the time of AGI. Now executive director of the AI Futures Project. I subscribe to Crocker's Rules and am especially interested to hear unsolicited constructive criticism. http://sl4.org/crocker.html
Some of my favorite memes:
(by Rob Wiblin)
(xkcd)
My EA Journey, depicted on the whiteboard at CLR:
(h/t Scott Alexander)
Thank you! I continued your ChatGPT conversation and got the following tidbits of extra info:
For my 2020 predictions, I asked it to summarize my biggest sources of Brier loss:
Top 5 Brier losses from FALSE outcomes (overconfidence)
| Rank | Claim | p (2020) | Loss |
|---|---|---|---|
| 1 | 100+ drone swarms operationally fielded | 0.50 | 0.250 |
| 2 | 10%+ news articles AI-generated | 0.20 | 0.040 |
| 3 | $1 deepfake costs $100+ to detect | 0.35 | 0.123 |
| 4 | Orbital rocket reused 100 times | 0.30 | 0.090 |
| 5 | 90%+ Rio Tinto mines automated | 0.30 | 0.090 |
| Rank | Claim | p (2020) | Loss |
|---|---|---|---|
| 1 | AI wins International Math Olympiad | 0.15 | 0.723 |
| 2 | Neural net writes B-average college essays | 0.20 | 0.640 |
| 3 | Crypto restaurants near Berkeley | 0.20 | 0.640 |
| 4 | Combat drones <50 lbs fielded | 0.50 | 0.250 |
| 5 | AI beats top StarCraft players | 0.70 | 0.090 |
The IMO prediction alone is your single largest loss across all 2020 items.
Interestingly the two biggest hits to Brier score I took (on jan 1 2020) were from underestimating LLMs basically.
Also interestingly, it seems like maybe I took significantly more Brier loss from saying things wouldn't happen, that did happen, then from saying things would happen, that didn't? In other words I was too cautious overall rather than too credulous?
I basically agree with Eli, though I'll say that I don't think the gap between extrapolating METR specifically and AI revenue is huge. I think ideally I'd do some sort of weighted mix of both, which is sorta what I'm doing in my ATC.
Well done for making these predictions and scoring them!
Well done for making these predictions and scoring them!
A major plausible class of worlds in which we don't get superhuman coders by end of 2026 is worlds where the METR trend continues at roughly the same or only slightly greater slope than the slope it had in 2025. Right?
Insofar as zero is significantly smaller than epsilon, yes.
lol oh yeah.
We did a survey to choose the name, so I have data on this! Apparently my top choice was "What 2027 Looks Like," my second was "Crunchtime 2027" and my third choice was "What Superintelligence Looks Like." With the benefit of hindsight I think only my third choice would have actually been better.
Note that we did the survey after having already talked about it a bunch; IIRC my original top choice was "What 2027 looks like" with "What superintelligence looks like" runner-up, but I had been convinced by discussions that 27 should be in the title and that the title should be short. I ended up using "What superintelligence looks like" as a sort of unofficial subtitle, see here: https://www.lesswrong.com/posts/TpSFoqoG2M5MAAesg/ai-2027-what-superintelligence-looks-like-1
"What 2027 looks like" was such an appealing title to me because this whole project was inspired by the success of "What 2026 looks like," a blog post I wrote in 2021 that held up pretty well and which I published without bringing the story to its conclusion. I saw this project as (sort of) fulfilling a promise I made back then to finish the story.
I should do a blog post retrospective, now that it's almost 2026, going over the whole post & comparing to reality!
...that's a lot of work though, if anyone else wants to do it I'd be very grateful! As a bonus they might be less biased than me.
I'd be happy to link to it here in the OP.
ChatGPT goes on to say that I outperformed Rick (who also made predictions in 2020 in the spreadsheet.) However, looking over the data briefly, I'm not sure I agree with some of the scores, e.g. are there really robotaxis in 20+ cities now? And drone delivery?