So, The Possessed Machines. There's been some discussion already. It is a valuable piece -- it has certainly provoked some thought in me! -- but it has some major flaws. It (sneakily!) dismisses specific arguments about AI existential risk, and broad swaths of the discourse altogether, without actually arguing against them. Also, the author is untrustworthy at the moment; readers should be skeptical of purported first-person information in the piece.
[Image: from a different "book review" of Demons. It's an excellent piece. I highly recommend it.]
Before getting into it, I want to praise the title. "Possessed" has four relevant meanings: demonic; ideologically possessed; frenzied/manic/mad; belonging to someone. "Machines" has three possible referents: AI; people; an efficient group of powerful people/institutions. That's twelve combinations. I see the following eight (!) as being applicable.
1. Demonic machines; machines that are intelligent and evil.
2. Machines that belong to us; AI is something humanity currently possesses.
3a. Frenzied, manically productive people (AI-folk).
3b. Demonic, machine-like people.
3c. Ideologically possessed people. (They are machines for their ideology).
4a. The accelerationist AI industry.[1]
4b. The out-of-control technocapitalist machine.[2]
4c. The cabal of AI tech elites and their political allies.
Okay, let's get into it.
Dismissal of pivotal acts
An important idea in Possessed Machines is the Shigalyovian system/argument.[3] This system/argument is defined as follows:
No one can quite refute the argument. And this is Dostoevsky's point: the argument cannot be refuted on its own terms because its premises, once accepted, do indeed lead to its conclusions. The error is in the premises, but the premises are hidden behind such a mass of reasoning that they are difficult to locate.
I want to be very direct about the contemporary relevance of this passage. The AI safety community has developed its own versions of Shigalyovism—systems of thought that begin with freedom and end with despotism, proposals that would sacrifice almost everything to preserve what they define as valuable.
In theory, then, one should be able to dismantle a Shigalyovian system by identifying its hidden premises and arguing against them. Now, part of what makes an argument Shigalyovian is that its premises are difficult to locate, so this might be difficult. Indeed, part of the rhetorical and memetic success of these systems comes from this difficulty. Nevertheless, these premises exist and can be discovered.
The author then gives an example of this in today's AI world:
The concept of a "pivotal act" is perhaps the clearest example. A pivotal act, in AI safety discourse, is an action taken by a powerful AI system that permanently prevents certain catastrophic outcomes. The canonical example is using an aligned AI to prevent all other AI development—establishing a kind of permanent monopoly on artificial intelligence.
This is Shigalyovism in digital form. It begins with the desire to protect humanity and ends with a proposal for a single point of failure controlling all future technological development. The reasoning is internally consistent: if unaligned AI would destroy humanity, and if many independent AI projects increase the probability of unaligned AI, then preventing independent AI development reduces existential risk. QED.
But the conclusion is monstrous. A world in which a single entity controls all AI development is a world without meaningful freedom, without the possibility of exit, without any check on the power of whoever controls that entity. It is Shigalyov's one-tenth ruling over his nine-tenths, with the moral framework of "preventing extinction" replacing the moral framework of "achieving paradise."
The implied conclusion here is that we shouldn't use an aligned AI to prevent all other AI development. But the author doesn't actually argue for this. In this Shigalyovian framework, what they need to do to rebut the pivotal act argument is find the hidden premises that are objectionable and argue against those. But the author doesn't do this.
To put this another way: the argument here is of the form:
1. X->Y is a system that begins with freedom and ends with despotism.
2. Thus X->Y is Shigalyovian.
3. [Implied] Thus X->Y is valid but unsound.
4. [Implied] Thus Y is wrong.
Where X->Y is the "pivotal act" system: <<The desire to protect humanity -> using an aligned AI to prevent all other AI development.>>
There's something wrong with this argument. The problem here is that the author hasn't actually shown that X->Y is unsound. Just because a valid argument starts from freedom and ends with despotism doesn't mean it's wrong! To figure out if the conclusion is wrong, you have to look at the assumptions -- the "hidden premises".
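To spell out the logic (this gloss is mine, not the essay's): a valid argument guarantees only that its premises entail its conclusion,

$$(P_1 \land P_2 \land \cdots \land P_n) \rightarrow C,$$

while a sound argument is a valid argument whose premises are all actually true. If you grant validity but want to reject $C$, you are therefore obligated to exhibit at least one false $P_i$. That identification step is exactly what the essay skips.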
What are the hidden premises in the pivotal act argument? And what is the error in them? I don't know! But if you want to argue against pivotal acts... you need to engage with these questions substantively. Merely pointing out that the system starts from freedom and ends with "despotism" and that its conclusion is "monstrous" to you... is not enough. It's not a real argument.
Dismissal of calm, rational discourse
A central claim of Possessed Machines is that the heart of the problem is the moral deficit of certain powerful people in the industry. In the chapter titled "What Is to Be Done?", the author writes:
The core problem is that the people making the key decisions are, many of them, damaged in ways that disqualify them from making these decisions wisely.
They continue:
This damage is not primarily intellectual. The people I am thinking of are intelligent, often extraordinarily so. It is something more like moral—a failure of the channels that connect knowledge to action, that make abstract truths feel binding, that generate appropriate emotional responses to contemplated harms.
There are two characters in Demons that have this moral deficit: Pyotr Verkhovensky and Stavrogin.
Verkhovensky is charming, clever, and absolutely without moral content. He believes in nothing except his own power and the excitement of watching things burn...
Stavrogin is brilliant, beautiful, charismatic, and utterly empty... He is capable of intellectual engagement at the highest level but experiences it as performance rather than connection.
Possessed Machines makes a specific point: that some of the most powerful people in AI are Verkhovenskys and Stavrogins. I have no qualms with this.
Then there's a related, broader point that the essay makes, which is something like: most of the calm, "rational" discussion on AI existential risk comes from a deprived place. In one case, this deprivation is an emotional/moral "numbness":
Some of the people who speak most calmly about human extinction are not calm because they have achieved wisdom but because they have achieved numbness. They have looked at the abyss so long that they no longer see it. Their equanimity is not strength; it is the absence of appropriate emotional response.
In another case, this deprivation is "the aestheticization of darkness" or "performance":
Stavrogin's confession fails because he has turned it into a performance. He wants the shock value without the repentance. He wants to be seen as someone who has done terrible things and faces them without flinching—but this desire is itself a form of flinching, a way of converting a moral reality into an aesthetic pose.
I see this dynamic throughout the rationalist-adjacent world. The willingness to discuss existential risk, to contemplate human extinction, to reason about torture and genocide and civilizational collapse—all of this is valuable insofar as it helps us think more clearly about these topics. But it becomes dangerous when the willingness to discuss becomes the primary thing, when people compete to be the most willing to face the darkest topics, when the pose of unflinching analysis substitutes for genuine moral engagement.
The implication in these passages is that this deprivation makes the calm, "rational" AI existential risk discourse fundamentally unsound. The discourse (so the author claims) comes from numbness, not wisdom. The discourse is performance, not genuine moral engagement.
But wait. Does that mean the arguments are wrong? Again, we have a case in which the author does not actually engage with the arguments. Like in their discussion of the "pivotal act", the author provides meta-reasons for dismissing the arguments of the other side but does not actually engage with these arguments. The reader is left with the feeling that a valid rebuttal has been made, but it hasn't.
What is it, exactly, that the author wants? Reading those passages again, I gather they want "appropriate emotional response" and "genuine moral engagement". As opposed to... equanimity? Unflinching analysis?
Call me crazy, but I think that equanimity and unflinching analysis are good. Now, perhaps the author isn't criticizing the presence of these things but rather the lack of the other things. Why not both?
Okay. What is the correct emotional response to all this? That's actually a good question; seriously, ask yourself that. And what about genuine moral engagement? "When the pose of unflinching analysis substitutes for genuine moral engagement" sounds nice, but what does it mean? What is the alternative to unflinching analysis?
Can we trust the author?
No. If the author is indeed who they say they are, they should provide verification. Why do I think this?
1. I think the author is being dishonest about how this piece was written.
There is a lot of AI in the writing of Possessed Machines. The bottom of the webpage states "To conceal stylistic identifiers of the authors, the above text is a sentence-for-sentence rewrite of an original hand-written composition processed via Claude Opus 4.5." As I wrote in a comment:
Ah, this [statement] was not there when I read the piece (Jan 23). You can see an archived version here in which it doesn't say that.
I don't actually believe that this is how the document was made. A few reasons. First, I don't think this is what a sentence-for-sentence rewrite looks like; I don't think you get that much of the AI style that this piece has with that^. Second, the stories in the interlude are superrrrr AI-y, not just in sentence-by-sentence style but in other ways. Third, the chapter and part titles seem very AI generated...
The piece has 31 uses of “genuine”/“genuinely” in ~17000 words. One “genuine” every 550 words.
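For what it's worth, this kind of count is easy to reproduce. A minimal sketch, assuming you've saved the essay's text locally (the filename is hypothetical):

```python
import re

# Count whole-word uses of "genuine"/"genuinely" in a locally saved copy
# of the essay. "possessed_machines.txt" is a hypothetical filename.
with open("possessed_machines.txt", encoding="utf-8") as f:
    text = f.read()

hits = re.findall(r"\bgenuine(?:ly)?\b", text, flags=re.IGNORECASE)
words = re.findall(r"\b\w+\b", text)
print(f"{len(hits)} uses in {len(words)} words; "
      f"one every ~{len(words) // max(len(hits), 1)} words")
```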
2. Fishiness
From kaiwilliams:
There's some stuff that feels a little bit weird here. The author says they left in early 2024 and then spent the "following months" reading Dostoevsky and writing this essay. Was the essay a bit older and only got put up? (It has to have been edited relatively recently, if it was run through 4.5.) Who are the editors alluded to at the very end? Is it supposed to be Tim Hwang? A little bit more transparency would be much appreciated (the disclaimer about Opus 4.5 being used for anonymization was only added on the 24th, after some people had pointed out that it sounded rather AI-written).
Another weirdness: why did Hwang put up another microsite about Demons that's written by an anonymous author "still working in industry" that has clear LLM-writing patterns at basically the same time? https://shigalyovism.com/. Though this one is much less in-depth.
At the bottom of the webpage in an "About the Author" box, we are told "Correspondence may be directed to the editors." This is weird, because we don't know who the editors are. Probably this was something that Claude added and the human author didn't check.
Richard_Kennaway points out:
There are some anomalies in the chapter numbering:
Part IV ends with Chapter 18; Part V begins with Chapter 21... [etc.]
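Such gaps are easy to surface mechanically if someone transcribes the table of contents. A minimal sketch; the part-to-chapter mapping below is illustrative (built from the quoted anomaly), not the essay's actual table of contents:

```python
# Find gaps in chapter numbering across parts. The mapping is illustrative,
# not a full transcription of the essay's table of contents.
chapters_by_part = {"IV": [16, 17, 18], "V": [21, 22]}

seen = sorted(n for nums in chapters_by_part.values() for n in nums)
missing = sorted(set(range(seen[0], seen[-1] + 1)) - set(seen))
print("missing chapters:", missing)  # -> missing chapters: [19, 20]
```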
3. This piece could have been written by someone who wasn't an AI insider
If you're immersed in 2025/2026 ~rationalist AI discourse, you have the information needed to write Possessed Machines. That is, there's no "inside information" in the piece. There is a lot of "I saw people at the lab do this [thing that I, a non-insider, already thought that people at the lab did]". Leogao has made this same point: "it seems plausible that the piece was written by someone who only has access to public writings."
[1] From the essay: "Not by ideology, not by any single vision, but by the spirit of acceleration itself—the drive toward "more" and "faster" that has no end point and no criterion for success except continued motion."
[2] "Technocapitalist machine" = the system made up of VCs, startups, labs, government, etc. The machine is out of control in the sense that it has goals of its own, we can't control it, and it's creating something evil. It's a possessed machine.
[3] I understand "system" as meaning something like "system of beliefs"; synonyms I'd use are "worldview" or "memeplex".