Rejecting Violence as an AI Safety Strategy

by James_Miller
22nd Sep 2025
4 min read

5 comments, sorted by top scoring

J Thomas Moros

If your goal is to discourage violence, I feel you've missed a number of key considerations that you would need to address in order to actually discourage people. Specifically, I find myself confused by several things:

  • Why the assumption that the targets would be AI researchers rather than AI infrastructure (e.g., training clusters and chip fabs)?
  • You claim that the state would have the force advantage, but independent terrorist actors often have advantages against large state actors. Any violence wouldn't be some kind of organized militia vs. the state. It would be a loose coalition of "terrorist" cells seeking to disrupt and slow AI development.
  • You assume a strong association with mainstream AI safety and significant reputational harm to it. However, any violence would likely have to come from a separate faction that would seek to distinguish itself from the mainstream. For example, there are a few ecoterrorist and "deep green" movements, but they are clearly distinct from mainstream climate activism, and I don't think they have done its reputation much harm.

So, by all means, we at LessWrong condemn any attempt to use violence to solve the race to ASI that kills everyone. But if you were attempting to prevent some group from splintering off to seek violent means of resistance, I think you've somewhat missed the mark.

samuelshadrach

This type of post has bad epistemics, because people who support terrorism are probably not going to come and correct you. You will receive no feedback, and your points, even if incorrect, will get amplified.

James_Miller

That is a valid point. I did ask two AIs to point out mistakes in the article, so I got some criticism. One AI wanted me to steelman the position in favor of violence, which I didn't do because I feared it would be taken out of context, and because I feared that some might think I was really advocating for violence and putting in the anti-violence positions as cover.

avturchin

But isn't a hunger strike a borderline violent thing? If a person is actually able to die from hunger, it will put enormous guilt on the AI creators, which they will perceive as a very personal attack.

James_Miller

Doomers are claiming that those building AI are threatening the lives of everyone, so that is already an attempt to put a lot of guilt on the builders.

[comment deleted]

Violence against AI developers would increase rather than reduce the existential risk from AI. This analysis shows how such tactics would catastrophically backfire and counters the potential misconception that a consequentialist AI doomer might rationally endorse violence by non-state actors.

  1. Asymmetry of force. Violence would shift the contest from ideas to physical force, a domain where AI safety advocates would face overwhelming disadvantages. States and corporations command vast security apparatuses and intelligence networks. While safety advocates can compete intellectually through research and argumentation, entering a physical conflict would likely result in swift, decisive defeat.
  2. Network resilience and geographic distribution. The AI development ecosystem spans multiple continents, involves thousands of researchers, and commands trillions in resources. Targeting individuals would likely redistribute talent and capital to more secure locations without altering the fundamental trajectory.
  3. Economic and strategic imperatives. AI development represents both unprecedented economic opportunity and perceived national security necessity. These dual incentives create momentum that violence would be unlikely to meaningfully disrupt. States view AI supremacy as existential, while markets see it as the next great transformation. No amount of intimidation is likely to overcome these structural forces.
  4. International coordination collapse. Effective AI governance would require unprecedented global cooperation, particularly between the US and China. China's government maintains zero tolerance for non-state violence and would likely immediately cease engagement with any movement associated with such tactics. This could eliminate the already slim possibility of coordinated international action on AI risk, perhaps the only viable path to meaningful safety guarantees.
  5. Stigma contagion and guilt by association. A single act of violence would likely permanently brand the entire movement as extremist. This reputational contamination would operate as a cognitive shortcut, allowing critics to dismiss all safety arguments without engaging their substance. The movement's actual concerns could become inaudible beneath the noise of its worst actors' choices.
  6. Securitization and democratic bypass. Violence would likely trigger immediate reclassification of AI safety from a policy debate to a security threat. Decision-making could shift from open forums to classified settings dominated by defense agencies. This securitization would potentially eliminate public oversight precisely when scrutiny matters most, while personal security fears might override rational risk assessment.
  7. Fear-driven acceleration. Leaders facing credible threats to their lives would naturally compress their time horizons and seek rapid resolution. Rather than pausing development, they would likely accelerate deployment to reach perceived safety through technological superiority.
  8. Infrastructure for repression. Violence would provide justification for comprehensive surveillance, asset seizure, and deplatforming. Payment processors, hosting services, and venues would likely blacklist safety organizations. These cascading restrictions could eliminate funding channels, communication platforms, and physical spaces necessary for advocacy, effectively dismantling the movement's operational capacity.
  9. Transparency destruction and dissent suppression. Labs would likely invoke security concerns to classify research, cancel external audits, and eliminate whistleblower protections. Internal critics could face not just professional consequences but also potential criminal liability for speaking out. This opacity would likely blind both policymakers and the public to genuine risks while silencing the employees best positioned to identify them.
  10. Regulatory capture through fear. Politicians and regulators would likely avoid any association with a movement linked to violence. Meetings would be canceled, hearings postponed, and briefings rejected. The careful technical arguments that might influence policy could lose their only remaining channels of influence, leaving regulation to those least concerned with catastrophic risks.
  11. Selection effects in leadership. Crisis and conflict would likely elevate different personality types to power. Violence would systematically promote leaders comfortable with secrecy, confrontation, and rapid decision-making while potentially marginalizing those inclined toward caution, transparency, and deliberation. This adverse selection could entrench exactly the wrong decision-makers at the most critical juncture.
  12. Narrative capture and media distortion. Violence would transform complex technical debates into simple crime stories. Media coverage would likely focus exclusively on threats, victims, and law enforcement responses rather than existential risks or alignment challenges. This narrative hijacking could ensure public discourse remains permanently divorced from substantive safety concerns.
  13. Martyrdom and tribal consolidation. Harming AI lab employees would likely transform them into symbols of institutional loyalty. Internal communities would probably close ranks against external criticism, treating safety advocates as enemies rather than partners. Employees raising concerns could be viewed as potential security threats rather than conscientious objectors, possibly destroying internal advocacy channels.
  14. Collateral damage to critical systems. AI infrastructure interconnects with hospitals, emergency services, financial systems, and utilities through shared cloud providers and data centers. Any disruption could cascade through these dependencies, potentially causing deaths and widespread suffering. Public outrage over such collateral damage would likely permanently destroy any remaining sympathy for safety concerns.
  15. False flag vulnerability. Once violence becomes associated with AI safety, opponents would gain a devastating tool: stage an attack, attribute it to safety advocates, and justify unlimited crackdowns. The movement would likely lack the means to prove innocence while bearing collective punishment for acts it didn't commit.
  16. Trust erosion and proximity denial. AI labs would likely systematically exclude anyone sympathetic to safety concerns from leadership access. The informal conversations and chance encounters where minds might change would probably disappear behind layers of security. Those most capable of articulating risks could be kept furthest from those empowered to address them.
  17. Generational talent exodus. Association with violence would create lasting stigma that could repel precisely those researchers most concerned with careful, safe development. The field would likely systematically select for risk-tolerant personalities while excluding the cautious voices most needed. This multi-decade talent distortion could eliminate internal brakes on dangerous development.
  18. It can get worse. No matter how bad a situation is, it can always get worse. Even marginal increases in extinction probability, say from 98% to 99%, would represent catastrophic losses in expected value, because they halve the probability of survival and with it the expected value of the future (see the short calculation after this list). Violence might increase not just extinction risk but also the probability of worse-than-extinction outcomes involving vast suffering, as actors leverage anticipated future capabilities to coerce present opponents.
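A minimal sketch of that expected-value point, under a simplifying assumption that is mine rather than the post's: extinction outcomes are worth zero, and essentially all future value V is realized only in non-extinction outcomes.

\[
\mathbb{E}[\text{value}] = (1 - p_{\text{ext}})\,V
\]
\[
p_{\text{ext}} = 0.98:\ \mathbb{E}[\text{value}] = 0.02\,V \qquad\qquad p_{\text{ext}} = 0.99:\ \mathbb{E}[\text{value}] = 0.01\,V
\]

On this toy model, a one-percentage-point rise in extinction risk is not a marginal change: it halves everything at stake in expectation. That is why any action that nudges p_ext upward at all, as the points above argue violence would, is so costly.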