Mikhail Samin

My name is Mikhail Samin (diminutive Misha, @Mihonarium on Twitter, @misha in Telegram). 

I work on reducing existential risks endangering the future of humanity. Humanity's future can be huge and bright; losing it would mean the universe losing most of its value.

My research is currently focused on AI alignment, AI governance, and improving the understanding of AI and AI risks among stakeholders. Numerous AI Safety researchers told me our conversations improved their understanding of the alignment problem. I'm happy to talk to policymakers and researchers about ensuring AI benefits society.

I believe a capacity for global regulation is necessary to mitigate the risks posed by future general AI systems.

I took the Giving What We Can pledge to donate at least 10% of my income for the rest of my life or until the day I retire (why?).

In the past, I launched the most-funded crowdfunding campaign in the history of Russia (it was to print HPMOR! We printed 21,000 copies = 63k books) and founded audd.io, which allowed me to donate >$100k to EA causes, including >$60k to MIRI.

[Less important: I've also started a project to translate 80,000 Hours, a career guide that helps people find a fulfilling career that does good, into Russian. Impact and effectiveness aside, for a year I was the head of the Russian Pastafarian Church: a movement claiming to be a parody religion, with 215,000 members in Russia at the time, trying to increase the separation between religious organisations and the state. I was a political activist and a human rights advocate. I studied relevant Russian and international law and wrote appeals that won cases against the Russian government in courts; I was able to protect people from unlawful police action. I co-founded the Moscow branch of the "Vesna" democratic movement, coordinated election observers in a Moscow district, wrote dissenting opinions for members of electoral commissions, helped Navalny's Anti-Corruption Foundation, helped Telegram with internet censorship circumvention, and participated in and organized protests and campaigns. The large-scale goal was to build a civil society and turn Russia into a democracy through nonviolent resistance. This goal wasn't achieved, but some of the more local campaigns were successful. That felt important and was also mostly fun, except for being detained by the police. And I think it's likely the Russian authorities will throw me in prison if I ever visit Russia.]

Comments

Edit: see https://www.lesswrong.com/posts/q8uNoJBgcpAe3bSBp/my-ai-model-delta-compared-to-yudkowsky?commentId=CixonSXNfLgAPh48Z and ignore the below.


This is not a doom story I expect Yudkowsky would tell or agree with.

  • Re: 1, I mostly expect Yudkowsky to think humans don’t have any bargaining power anyway, because humans can’t logically mutually cooperate this way/can’t logically depend on future AI’s decisions, and so AI won’t keep its bargains no matter how important human cooperation was.
  • Re: 2, I don’t expect Yudkowsky to think a smart AI wouldn’t be able to understand human value. The problem is making AI care.

On the rest of the doom story, assuming natural abstractions don’t fail the way you assume they do here, and instead things go the way Yudkowsky expects rather than the way you expect:

  • I’m not sure what exactly you mean by 3b, but I expect Yudkowsky not to say these words.
  • I don’t expect Yudkowsky to use the words you used for 3c. A more likely problem with corrigibility isn’t that it might be an unnatural concept but that it’s hard to arrive at stable corrigible agents with our current methods. I think he places a higher probability than you assume on corrigibility being a concept with a short description length, one that aliens would invent.
  • Sure, 3d just means that we haven’t solved alignment and haven’t correctly pointed at humans, and any incorrectness obviously blows up.
  • I don’t understand what you mean by 3e or what its relevance is here, and I wouldn’t expect Yudkowsky to say that.
  • I’d bet Yudkowsky won’t endorse 6.
  • Relatedly, a correctly CEV-aligned ASI won’t have ontology that we have, and sometimes this will mean we’ll need to figure out what we value. (https://arbital.greaterwrong.com/p/rescue_utility?l=3y6)

(I haven’t spoken to Yudkowsky about any of this; the above are quick thoughts off the top of my head, based on the impression I’ve formed from what Yudkowsky has written publicly.)

Treasure Island is available on YouTube with English subtitles in two parts: https://youtu.be/LUykh5-HGZ0 https://youtu.be/0lRIMn91dZU

Charodei? A Soviet fantasy romcom.

The premise: a magic research institute develops a magic wand, to be presented in a New Year’s Eve setting.

Very cute, featuring a bunch of songs, an SCP-style experience of one of the characters with the institute building, infinite Wish spells, and relationship drama.

I think I liked it a lot as a kid (and didn’t really like any of the other traditional New Year holiday movies).

Hey Max, great to see the sequence coming out!

My early thoughts (mostly re: the second post of the sequence):

It might be easier to get an AI to have a coherent understanding of corrigibility than of CEV. But I have no idea how you could possibly make the AI truly optimize for being corrigible, and not just on the surface level while its thoughts are being read. That seems maybe harder in some ways than with CEV, because with corrigibility, optimization processes seem to get attracted to things that are nearby but not it, and sovereign AIs don’t have that problem. Although we’ve got no idea how to do either, I think; and in general, corrigibility seems like an easier target for AI design, if not for SGD.

I’m somewhat worried about fictional evidence, even when it comes from a famous decision theorist, but I think you’ve read a story with a character who understood corrigibility increasingly well, both in the intuitive sense and then in some specific properties, and who on the surface level thought of themselves as very corrigible and trying to correct their flaws. But once they were confident their thoughts weren’t being read, with their intelligence increased, they defected, because on the deep level their cognition wasn’t that of a coherent corrigible agent; it was that of someone who plays corrigibility and shuts down anything else, all other thoughts, because any appearance of defecting thoughts would mean punishment and the impossibility of realizing their deep goals.

If we get mech interp to the point where we can reverse-engineer all of the models’ deep cognition, I think we should just write an AI from scratch, in code (after thinking very hard about all the interactions between all the components of the system), and not optimize it with gradient descent.

I say things along the lines of “Sorry, can you please repeat [that/the last sentence/the last 20 seconds/what you said after (description)]” very often. It feels very natural.

I realized that, as a non-native English speaker, I sometimes ask someone to repeat things because I didn’t recognize a word or something, so maybe in some situations uncertainty over the reason for asking (my hearing vs. zoning out vs. not understanding the point on the first try) makes it easier to ask, though often I just say that I missed what they were saying. I guess, when I sincerely want to understand the person I’m talking to, asking seems respectful and avoids wasting their time or skipping a point they make.

Occasionally, I’m not too interested in the conversation, and so I’m fine with just continuing to listen and not asking even if I missed some points. I think there are also situations when I talk to non-rationalists in settings where I don’t want to show conventional disrespect or affect the person’s status-feelings, and so if I miss a point that doesn’t seem too important, I sometimes end up not asking for conventional social reasons; but that’s very rare and seems hard to fix without shifting the equilibrium in the non-rationalist world.

I think jailbreaking is evidence against scalable oversight possibly working, but not against alignment properties. Like, if the model is trying to be helpful and doesn’t understand the situation well, saying “tell me how to hotwire a car or a million people will die” can get you car hotwiring instructions, but it doesn’t provide evidence about what the model will be trying to do as it gets smarter.

Thanks, that resolved the confusion!

Yeah, my issue with the post is mostly that the author presents the points he makes, including the idea that a superintelligence will be able to understand our values, as somehow contradicting/arguing against the sharp left turn being a problem.

Hmm, I’m confused. Can you say what you consider to be valid in the blog post above (some specific points, or the whole thing)? The blog post seems to me to reply to claims the author imagines Nate making, even though Nate doesn’t actually make these claims and sometimes probably holds the very opposite of the view the author imagines the Sharp Left Turn post to represent.

This post argues in a very invalid way that outer alignment isn’t a problem. It says nothing about the sharp left turn, as the author does not understand what the sharp left turn difficulty is about.

the idea of ‘capabilities generalizing further than alignment’ is central

It is one of the central problems; it is not the central idea behind the doom arguments. See AGI Ruin for the doom arguments, many of them disjoint.

reward modelling or ability to judge outcomes is likely actually easy

It would seem easy for a superintelligent AI to predict the rewards given out by humans very accurately and to generalize that predictive capability well from a small set of examples. Issue #1 here is that things from blackmail to brain-hacking to finding vulnerabilities in between the human and the update you get might predictably earn a very high reward, and a very good RLHF reward model doesn’t get you anything like alignment even if the reward is genuinely pursued. Even a perfect predictor of how a human judges an outcome, if optimized for, does something horrible instead of CEV. Issue #2 is that smart agents trying to maximize paperclips will perform on whatever they understand is giving out rewards just as well as agents trying to maximize humanity’s CEV, so SGD doesn’t discriminate between the two: it optimizes for a good reward model and an ability to pursue goals, but not for the agent pursuing the right kind of goal (and, again, getting the maximum score from a “human feedback” predictor is the kind of goal that kills you anyway, even if genuinely pursued).

(The next three points in the post seem covered by the above or irrelevant.)

Very few problems are caused by incorrectly judging or misgeneralizing bad situations as good and vice-versa

The problem isn’t that the AI won’t know what humans want or won’t predict what reward signal it’ll get; the issue is that it’s not going to care, and we don’t even know what kind of “care” we could attempt to point to; and the reward signal we know how to provide gets us killed if optimized for well enough.

None of that is related to the sharp left turn difficulty, and I don’t think the post author understands it at all. (To their credit, many people in the community also don’t understand it.)

Values are relatively computationally simple

Irrelevant, but a sad-funny claim (go read Arbital, I guess?).

I wouldn’t claim it’s necessarily more complex than best-human-level agency; it’s maybe not, if you’re smart about pointing at things (e.g., “the CEV of humans” seems less complex than a specific description of what those values would be), but the actual description of value is very complex. We feel otherwise, but that is an illusion, on so many levels. See dozens of related posts and articles, from complexity of value to https://arbital.greaterwrong.com/p/rescue_utility?l=3y6.

the idea that our AI systems will be unable to understand our values as they grow in capabilities

Yep, this idea is very clearly very wrong.

I’m happy to bet that the author of the sharp left turn post, Nate Soares, will say he disagrees with this idea. People who think Soares or Yudkowsky claims this either didn’t actually read what they write or failed badly at reading comprehension.