I just wanna state this gives me a feeling of the blind leading the blind.
I think I agree with this. To illustrate: when I met John (which was an overall pleasant interaction) I really did not think that the hat and sunglasses looked cool, but just assumed this was Berkeley-style idiosyncrasy. Usually I would not deem it appropriate to comment on this publicly, but since it was used as an example in the post, it feels like a relevant enough data point to bring up.
Thanks for publishing this!
My main disagreement is about a missing consideration: shrinking time to get alignment right. Despite us finding out that frontier models are less misaligned by default than most here would have predicted[1], the bigger problem to me is that we have made barely any progress on crossing the remaining alignment gap. As a concrete example: LLMs will in conversation display great understanding of and agreement with human values, but in agentic settings (the Claude 4 system card examples of blackmail) act quite differently. More importantly, on the research side: to my knowledge, there has been neither a recognized breakthrough nor generally recognized smooth progress towards actually getting values into LLMs.
Similarly, a top consideration for me that AFAICT is not in your list: the geopolitical move towards right-wing populism (particularly in the USA) seems to reduce the chances of sensible governance quite severely.
Less risk. AI is progressing fast, but there is still a huge amount of ground to cover. Median AGI timeline vibes seem to be moving backwards. This increases the chance of a substantial time for regulation while AI grows. It decreases the chance that AI will just be 50% of the economy before governance gets its shoes on.
This seems basically true to me if we are comparing against early 2025 vibes, but not against e.g. 2023 vibes ("I think vibes-wise I am a bit less worried about AI than I was a couple of years ago"). Hard to provide evidence for this, but I'd gesture at the relatively smooth progress between the release of ChatGPT and now, which I'd summarize as "AI is not hitting a wall, at the very most a little speedbump".
Less risk. AI revenue seems more spread.
This is an interesting angle, and feels important. The baseline prior should imo be: governing more entities with near 100% effectiveness is harder than governing fewer. While I agree that conditional on having lots of companies it is likelier that some governance structure exists, it seems that the primary question is whether we get a close to zero miss rate for "deploying dangerous AGI". And that seems much harder to do when you have 20 to 30 companies that are in a race dynamic, rather than 3. Having said that, I agree with your other point about AI infrastructure becoming really expensive and that the exact implications are poorly understood.
I think about two-thirds of this perceived effect is due to LLMs not having much in the way of goals at all, rather than them having human-compatible goals.
I upvoted. While I disagree with most of the reasoning, it seems relatively clear to me that going against community opinion is the main reason for the downvotes. Consider this: if an author well known for his work in forecasting had preregistered that he was going to write a bunch of not fully fleshed out arguments in favor of or against updating in a particular direction, most people here would be encouraging of publishing it. I don't think there has ever been a consistent standard of "only publish highly thought-out arguments" here, and we should not engage in isolated demands for rigor, even if the topic is somewhat dicey.
This conversation uses "underdog" in different ways, giving rise to confusion. Yes, the point of an underdog story is indeed that the underdog wins, but this just makes the heroes of the story more awesome. Ultimately, you empathize with somebody who is super strong.
The OP, however, describes a phenomenon where the groups see themselves as weaker and in fact unlikely to win. cousin_it attributes this to weakness being desirable due to Christianity. Socrates is a good counterexample, but the 300 are less so.
It is unclear to me that the described phenomenon exists to the degree assumed. If two equally powerful countries or sports teams battle each other, each group of supporters will believe they are likelier to win on average.
Thanks for posting! Which model was used for this eval: gpt-5-thinking, gpt-thinking-high, or another? I think it could be good to specify (or update) this for future evaluation reports.
It's not a defined sentence to say, "everyone has equal moral worth"
make sure they can construct their ideology
These seem like excessive and unusual demands in the context of such a discussion. I concede there is some argument to be had for defining the terms, since they are inherently vague, but this is not a philosophical treatise, where that would feel more appropriate. This feels similar to how, in arguments about AGI, some folks argue that you should not use the word intelligence (as in intelligence explosion) since it is undefined. Moral worth, just like intelligence, seems like a useful enough concept to apply without needing to define it. To wit, John originally indicated disgust at people less ambitious than him and used the words "so-called fellow humans", and at that depth of analysis it feels congruent for other people to intuit that he assigns less moral worth to these people and to vaguely argue against it.
Great study!
A strong motivating aspect of the study is measuring AI R&D acceleration. I am somewhat wary of using this methodology to find negative evidence for this kind of acceleration happening at labs:
I expect this to be a good but not perfect analogy to how an AI-related catastrophic event could trigger political change. My understanding is that a crucial part of public discourse was, as other commenters allude to, a perceived taboo against being anti-war, such that even center-left, reputable mainstream sources did not in fact doubt the evidence for Iraq's alleged WMD. Likely a crucial component is a sort of moral dimension to the debate ("are you suggesting we should not do anything about 9/11?") that prevents people from speaking out.
I expect an AI-related fiasco to have less of this moral load, and instead think that scenarios like the Wenzhou train accident or the 2018 bridge collapse in Italy are more analogous, in that the catastrophe is a clear accident which, while perhaps caused by recklessness, was not caused by a clearly evil entity. The wiki article on the bridge collapse makes it sound like there was a lot of blaming going on in the aftermath, but no mention of any effort to invest more in infrastructure.
So:
Overall, I think these are a wash, but no strong yes. At any rate, I have the hunch that these factors were derived from large social movements whose path to impact was very bottom-up and centered on issues of high salience to the public. At least for the moderates in this debate, this is not the path to impact, I think; it is rather about influencing key stakeholders.