Why do you think that?
Doesn't the mountain of posts on optimization pressure explain why ending with "U3 was up a queen and was a giga-grandmaster and hardly needed the advantage. Humanity was predictably toast" is actually sufficient? In other words, doesn't someone who understands all the posts on optimization pressure not need the rest of the story after the "U3 was up a queen" part to understand that the AIs could actually take over?
If you disagree, then what do you think the story offers that makes it a helpful concrete example for people who are both skeptical that AIs can take over and already familiar with the posts on optimization pressure?
Ryan disagree-reacted to the bold part of this sentence in my comment above and I'm not sure why: "This tweet predicts two objections to this story that align with my first and third bullet point (common objections) above."
This seems pretty unimportant to gain clarity on, but I'll explain my original sentence more clearly anyway:
For reference, my third bullet point was the common objection: "How would humanity fail to notice this and/or stop this?"
To my mind, someone objecting that the story is unrealistic because "there's no reason why OpenAI would ever let the model do its thinking steps in opaque vectors instead of written out in English" (as stated in the tweet) is making an objection of the form "humanity wouldn't fail to stop AI from sneakily engaging in power-seeking behavior by thinking in opaque vectors." Like, it's a "sure, AI could take over if humanity were dumb like that, but there's no way OpenAI would be dumb like that."
It seems like Ryan was disagreeing with this with his emoji, but maybe I misunderstood it.
Good point. At the same time, I think the underlying cruxes that lead people to be skeptical of the possibility that AIs could actually take over are commonly:
I mention these points because people who mention these objections typically wouldn't raise these objections to the idea of an intelligent alien species invading Earth and taking over.
People generally have no problem granting that aliens may not share our values, may have actuators / the ability to physically wage war against humanity, and could plausibly overpower us with their superior intellect and technological know-how.
Providing a detailed story of what a particular alien takeover process might look like, then, isn't necessarily helpful for addressing the objections people raise about AI takeover.
I'd propose that authors of AI takeover stories should therefore make sure that they aren't just describing aspects of a plausible AI takeover story that could just as easily be aspects of an alien takeover story, but are instead actually addressing people's underlying reasons for being skeptical that AI could take over.
This means doing things like focusing on explaining:
(With this comment I don't intend to make a claim about how well the OP story does these things, though that could be analyzed. I'm just making a meta point about what kind of description of a plausible AI takeover scenario I'd expect to actually engage with the actual reasons for disagreement of the people who say "can the AIs actually take over".)
Edited to add: This tweet predicts two objections to this story that align with my first and third bullet point (common objections) above:
It was a good read, but the issue most people are going to have with this is how U3 develops that misalignment in its thoughts in the first place.
That, plus there's no reason why OpenAI would ever let the model do its thinking steps in opaque vectors instead of written out in English, as it is currently
Thanks for the story. I found the beginning the most interesting.
U3 was up a queen and was a giga-grandmaster and hardly needed the advantage. Humanity was predictably toast.
I think ending the story like this is actually fine for many (most?) AI takeover stories. The "point of no return" has already occurred at this point (unless the takeover wasn't highly likely to be successful), and so humanity's fate is effectively already sealed even though the takeover hasn't happened yet.
What happens leading up to the point of no return is the most interesting part because it's the part where humanity can actually still make a difference to how the future goes.
After the point of no return, I primarily want to know what the (now practically inevitable) AI takeover implies for the future: does it mean near-term human extinction, or a future in which humanity is confined to Earth, or a managed utopia, etc?
Trying to come up with a detailed, concrete, plausible story of what the actual process of takeover looks like isn't as interesting (at least to me). So I would have preferred to see more detail and effort put into the beginning of the story, explaining how humanity managed to fail to stop the creation of a powerful agentic AI that would take over, rather than into imagining how the takeover actually happens.
Specifically I’m targeting futures that are at my top 20th percentile of rate of progress and safety difficulty.
Does this mean that you think AI takeover within 2 years is at least 20% likely?
Or are there scenarios where progress is even faster and safety is even more difficult than illustrated in your story and yet humanity avoids AI takeover?
The fake names are a useful reminder and clarification that it's fiction.
I'm going to withdraw from this comment thread since I don't think my further participation is a good use of time/energy. Thanks for sharing your thoughts and sorry we didn't come to agreement.
I agree that that would be evidence of OP being more curious. I just don't think that given what OP actually did it can be said that she wasn't curious at all.
Thanks for the feedback, Holly. I really don't want to accuse the OP of making a personal attack if that wasn't OP's intent. The reality is that I'm uncertain, and I can see a clear possibility that OP has no ill will toward Kat personally, so I'm not going to take the risk by making the accusation. Maybe my being on the autism spectrum is making me oblivious or something, in which case I'm sorry I'm not able to see things as you see them, but this is how I'm viewing the situation.
Thanks for the clarification. My conclusion is that I think your emoji was meant to signal disagreement with the claim that 'opaque vector reasoning makes a difference' rather than a thing I believe.
I had rogue AIs in mind as well, and I'll take your word on "for catching already rogue AIs and stopping them, opaque vector reasoning doesn't make much of a difference".