Under fast takeoff models, the first rogue AGI posing a serious takeover/extinction risk to humanity would very likely also be the last, leaving no chance for serious opposition (e.g. Sable). This model seems theoretically compelling to me.
However, there is some recent empirical evidence that the basin of "roughly human" intelligence may not be trivial to escape. LLM agents seem increasingly competent and general, but they continue to lag behind humans on long-term planning. If capabilities continue to develop in a highly jagged fashion, we may face rather dangerous rogue AIs that still have exploitable weaknesses. Also, the current (neuro-scaffold) paradigm is compute- and data-hungry, and perhaps not easily amenable to recursive self-improvement (RSI). Though I suspect strongly superhuman models would be able to invent a much more efficient paradigm, it seems reasonable to give some weight to the possibility that early rogue neuro-scaffold AGI will undergo a relatively slow takeoff.[1]
Therefore, a competent civilization would have a governmental agency (or team) designated to rapidly shut down (and thoroughly purge/contain) rogue AGI. My question is: which agencies currently hold that responsibility? Surprisingly, I have not been able to find much previous discussion of the practical aspects of this question (e.g. the legal aspects of shutting down a rogue AI running on AWS).
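To make "practical aspects" a bit more concrete: here is a minimal sketch of what the cloud-side step of such a shutdown might look like, assuming the responding team already holds credentials for the relevant AWS account and can identify the suspect machines. The region, tag filter, and security-group ID below are hypothetical placeholders, not a real incident-response playbook.

```python
# Minimal sketch: emergency quarantine + shutdown of suspect EC2 instances.
# Assumes the responders have account-level AWS credentials and have
# pre-created a deny-all "quarantine" security group. The region, tag,
# and group ID are hypothetical.
import boto3

REGION = "us-east-1"                    # hypothetical region
QUARANTINE_SG = "sg-0123456789abcdef0"  # pre-created deny-all security group
SUSPECT_FILTER = {"Name": "tag:workload", "Values": ["rogue-agent"]}  # hypothetical tag

ec2 = boto3.client("ec2", region_name=REGION)

# 1. Find running instances matching the suspect tag.
reservations = ec2.describe_instances(Filters=[SUSPECT_FILTER])["Reservations"]
instance_ids = [
    inst["InstanceId"]
    for res in reservations
    for inst in res["Instances"]
    if inst["State"]["Name"] == "running"
]

# 2. Cut network access first, so the workload cannot exfiltrate itself
#    while the rest of the response is coordinated.
for iid in instance_ids:
    ec2.modify_instance_attribute(InstanceId=iid, Groups=[QUARANTINE_SG])

# 3. Stop (rather than terminate) so disks survive for forensic analysis.
if instance_ids:
    ec2.stop_instances(InstanceIds=instance_ids)
```

Of course, this only covers one account in one region of one cloud provider; a rogue AI that has already replicated elsewhere would be untouched, which is exactly why the organizational and cross-border question matters more than the API calls.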
Ideally, such an agency would be international, since rogue AGI can easily cross borders and may even negotiate with, bribe, or blackmail governments. In practice, though, I would guess that some cybercrime unit within the (U.S.) DoD is probably the best candidate. While the UK AISI seems the most "on the ball," as far as I know it is not well equipped to aggressively pursue rogue AGI across borders, which may require very rapid response and escalation spanning cyber-defense and conventional strikes on data centers.
At a bare minimum, a strong candidate for this role should actually run drills simulating shutdown attempts against rogue AGI. Such drills will probably be possible to carry out in a somewhat useful form very soon (or even now, with red-team human assistance).
[1] If neuro-scaffold AI is inherently too weak to reach AGI, then the first rogue AGI may arise from a more dangerous paradigm, e.g. "brain-like AGI". This would be unfortunate and likely, but it is not the focus of this post.
A loss-of-control scenario would likely result in rogue AIs replicating themselves across the internet, as discussed here: https://metr.org/blog/2024-11-12-rogue-replication-threat-model/