It is unclear to me that the described phenomenon exists to the degree assumed. If two equally powerful countries or sports teams face off, each group of supporters will, on average, believe their own side is likelier to win.
Thanks for posting! Which model was used for this eval? gpt-5-thinking, gpt-thinking-high, or another? I think it would be good to specify (or update) this for future evaluation reports.
>It's not a defined sentence to say, "everyone has equal moral worth"

>make sure they can construct their ideology
These seem like excessive and unusual demands in the context of such a discussion. I concede there is some argument to be had for defining the terms, since they are inherently vague, but this is not a philosophical treatise, where that would feel more appropriate. This feels similar to how, in arguments about AGI, some folks argue that you should not use the word intelligence (as in intelligence explosion) since it is undefined. Moral worth, just like intelligence, seems like a useful enough concept to apply without needing to define it. To wit, John originally indicated disgust at people less ambitious than him and used the words "so-called fellow humans", and at that depth of analysis it feels congruent for other people to intuit that he assigns less moral worth to these people and vaguely argue against it.
Great study!
A strong motivating aspect of the study is measuring AI R&D acceleration. I am somewhat wary of using this methodology to find negative evidence for this kind of acceleration happening at labs:
I expect this to be a good but not perfect analogy for how an AI-related catastrophic event could trigger political change. My understanding is that a crucial part of the public discourse was, as other commenters allude to, a perceived taboo against being anti-war, such that even center-left, reputable mainstream sources did not in fact doubt the evidence for Iraq's alleged WMDs. Likely another important component is a sort of moral dimension to the debate ("are you suggesting we should not do anything about 9/11?") that prevents people from speaking out.
I expect an AI-related fiasco to carry less of this moral load, and instead think that scenarios like the Wenzhou train collision or the 2018 bridge collapse in Italy are more analogous, in that the catastrophe is a clear accident that, while perhaps caused by recklessness, was not caused by a clearly evil entity. The wiki article on the bridge collapse makes it sound like there was a lot of blaming going on in the aftermath, but no mention of any effort to invest more in infrastructure.
Congratulations on the first video! The production quality is really impressive.
One challenge with broader public communication is that, as you allude to in the post, most people have not been exposed to much of the AI discourse in the way that e.g. members of this forum or 80k employees have. Do you have any approach to figuring out how to communicate most effectively with such an audience, such as what background level to assume, how doom-y to sound, and so on?
I don't see the awfulness, although tbh I have not read the original reactions. If you are not desensitized to what this community would consider irresponsible AI development speed, responding with "You are building and releasing an AI that can do THAT?!" is rather understandable. It is relatively unfortunate that it is the safety testing people who get the flak (if this impression is accurate), though.
This is a good post, but it applies unrealistic standards and therefore draws conclusions that are too strong.
>And at least OpenAI and Anthropic have been caught lying about their motivations:
Just face it: it is very normal for big companies to lie. That does make many of their press and public-facing statements untrustworthy, but it is not predictive of their general value system and therefore their actions. Plus, Anthropic, unlike most labs, did in fact support a version of SB 1047. That has to count for something.
>There is a missing mood here. I don't know what's going on inside the heads of x-risk people such that they see new evidence on the potentially imminent demise of humanity and they find it "exciting".
In a similar vein, humans do not act or feel rationally in light of their beliefs, and completely changing your behavior in response to an event that is years away is just not in the cards for the vast majority of folks. So do not be surprised that there is a missing mood, just as it is not surprising that people who genuinely believe humanity will end due to climate change do not adjust their behavior accordingly. Having said that, I did sense a general increase in anxiety when o3 was announced; perhaps that was the point where it started to feel real for many folks.
Either way, I really want to stress that concluding much about people's beliefs from these reactions is very tenuous, just like concluding that a researcher must not really care about AI safety because they watch some TV in the evening instead of working a bit more.
For the sake of correctness/completeness: the chemical compound purchase was not done by ARC, but by another, unspecified red team.
This conversation uses "underdog" in different ways, giving rise to confusion. Yes, the point of an underdog story is indeed that the underdog wins, but this just makes the heroes of the story all the more awesome. Ultimately, you empathize with somebody who is super strong.
The OP, however, describes a phenomenon where the groups see themselves as weaker and in fact unlikely to win. cousin_it attributes this to weakness being seen as desirable due to Christianity. Socrates is a good counterexample, but the 300 are less so.