Thank you for writing this, I find it very relatable. I'd heart react the post if that feature existed, so I'll heart react my comment instead.
This benchmark includes a Slay the Spire environment! When it was written, Gemini 2.5 did the best, getting roughly halfway through a non-Ascension run.
This very roughly implies that the median METR staff prediction for the 50% time horizon by EOY 2026 is a bit higher than 20 hours.
I very roughly polled METR staff (using Fatebook) what the 50% time horizon will be by EOY 2026, conditional on METR reporting something analogous to today's time horizon metric.
I got the following results: 29% average probability that it will surpass 32 hours. 68% average probability that it will surpass 16 hours.
The first question got 10 respondents and the second question got 12. Around half of the respondents were technical researchers. I expect the sample to be close to representative, but perhaps skewed a bit more toward short timelines than the rest of METR staff.
The average probability that the question doesn't resolve AMBIGUOUS is somewhere around 60%.
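For what it's worth, here's a minimal sketch of how the ~20-hour figure can be backed out from those two data points, assuming the staff-average exceedance probabilities fall roughly linearly in log(hours) between the two thresholds (that interpolation is my assumption, not how the poll was analyzed):

```python
import math

# Poll data points: average probability that the EOY-2026 50% time horizon
# exceeds each threshold (in hours).
points = {16: 0.68, 32: 0.29}

# Assume P(exceed h) falls roughly linearly in log2(h) between the two
# thresholds, and solve for where it crosses 50%.
(h_lo, p_lo), (h_hi, p_hi) = sorted(points.items())
frac = (p_lo - 0.50) / (p_lo - p_hi)  # fraction of the way from h_lo to h_hi
median_hours = h_lo * 2 ** (frac * math.log2(h_hi / h_lo))

print(f"Implied median: ~{median_hours:.0f} hours")  # ~22 hours
```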
I think my median is now 4 years, due to 2025 progress being underwhelming. I plan to write a follow-up post sometime soon.
I don't endorse the timelines in this post anymore (my median is now around EOY 2029 instead of EOY 2027) but I think the recommendations stand up.
In person, especially in 2024, many people would mention my post to me, and I think it helped people think about their career plans. I still endorse the robustly good actions.
How did my 2025 predictions hold up? Pretty well! I plan to write up a full post reviewing my predictions, but they seem pretty calibrated. I think I overestimated public attention and FrontierMath, and slightly overestimated SWE-bench Verified and OSWorld. I think all of the preparedness categories were hit.
I don't see why either of those things stop you from having a family.
I think we might be using different operationalizations of "having a family" here. I was imagining it to mean something that at least includes "raise kids from the age of ~0 to 18". If x-risk were to materialize within the next ~19 years, I would be literally stopped from "having a family" by all of us getting killed.
But under a definition of "have a family" that means "raise a child from the age of ~0 to 1", then yeah, I think P(doom) is <20% in the next 2 years and I'm probably not literally getting stopped.
Also to be clear, my P(ASI within our lifetimes) is like 85%, and my P(doom) is like 2/3.
This is because the correct answer is option three: try to modify the button to lower the 60 and raise the 15, until such time as a 1-in-5 chance of survival is a net improvement relative to your default situation.
Yes, the counterfactual I was imagining in this button world was just living a normal life and dying at the end. If indeed there's a way to shift around the probabilities I'd devote my life to it. Which is what we're doing!
It's been honestly very freeing to be able to discuss these things somewhere other than this community.
I agree. This year I've had the policy of being very direct about what I think about crazy AI futures, even with people outside of the AI safety community. I gave a PowerPoint presentation to my close family members about AGI and AI safety and how the world is going to be crazy in the coming decades. When my relatives ask me about having kids, I say "By the time I'd have had kids, if humanity is even around, who knows what the concept of kids will look like. Maybe we'll be growing them in vats. Maybe we'll all be uploaded."
Of course, I don't say all of that every time. Most of the time people aren't in the mood for those sorts of discussions. But people have started taking these arguments more seriously as AI has had more and more of an effect and appeared more and more in the news.
that's unfair; if there's no utopia, none of the other interventions work either,
You're so right! Thanks for catching this.
I think I probably want to be clearer about the units of measurement being slightly different. Every intervention except the cryonics one is naively reducing acute micromorts, which can be converted into "microutopias" by multiplying by P(utopia). The cryonics one is about increasing microutopias directly, because the counterfactual is ~purely in utopian worlds.
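To make the unit conversion concrete, here's a toy sketch with made-up numbers (the P(utopia) value is purely illustrative, not a figure from the post):

```python
# Toy illustration with made-up numbers (not from the post).
P_UTOPIA = 0.15  # hypothetical P(utopia), for illustration only

def microutopias_from_micromorts(micromorts_averted: float,
                                 p_utopia: float = P_UTOPIA) -> float:
    """Convert acutely-averted micromorts into microutopias by weighting
    by the probability that surviving lands you in a utopia at all."""
    return micromorts_averted * p_utopia

# e.g. an intervention averting 1000 acute micromorts:
print(microutopias_from_micromorts(1000))  # 150.0 microutopias
```

The cryonics intervention skips this discount, since essentially all of its counterfactual value already sits in utopian worlds.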
I think that space-based power grabs are unlikely as long as powers care about, and are equally-matched on, Earth.
This is the rough story that I think is unlikely to happen:
This story seems unlikely to me because, in this scenario, Superpower A probably still has most of its human population on Earth (relocating millions of people to space would probably be very slow). Therefore, as long as mutually assured destruction is maintained on Earth, Superpower B will retain a lot of its bargaining power despite having a disadvantage in space infrastructure.