Eliezer Yudkowsky

Contradict my take on OpenPhil's past AI beliefs

At many points now, I've been asked in private for a critique of EA / EA's history / EA's impact and I have ad-libbed statements that I feel guilty about because they have not been subjected to EA critique and refutation. I need to write up my take and let...

Dec 20, 2025 · 196

Eliezer's Unteachable Methods of Sanity

"How are you coping with the end of the world?" journalists sometimes ask me, and the true answer is something they have no hope of understanding and I have no hope of explaining in 30 seconds, so I usually answer something like, "By having a great distaste for drama, and...

Dec 7, 2025 · 494

The Tale of the Top-Tier Intellect

Once upon a time in the medium-small town of Skewers, Washington, there lived a 52-year-old man by the name of Mr. Humman, who considered himself a top-tier chess-player. Now, Mr. Humman was not generally considered the strongest player in town; if you asked the other inhabitants of Skewers, most of...

Nov 3, 2025 · 113

On Fleshling Safety: A Debate by Klurl and Trapaucius.

(23K words; best considered as nonfiction with a fictional-dialogue frame, not a proper short story.) Prologue: Klurl and Trapaucius were members of the machine race. And no ordinary citizens they, but Constructors: licensed, bonded, and insured; proven, experienced, and reputed. Together Klurl and Trapaucius had collaborated on such famed artifices...

Oct 26, 2025 · 257

Why Corrigibility is Hard and Important (i.e. "Whence the high MIRI confidence in alignment difficulty?")

A lot of objection and confusion to the MIRI worldview seems to come from a perspective of "but, it.... shouldn't be possible to be that confident in something that's never happened before at all, with anything like the current evidence and the sorts of arguments you're making here." And while I...

Sep 30, 2025 · 95

Re: recent Anthropic safety research

Crossposted from X by the LessWrong team, with permission. A reporter asked me for my off-the-record take on recent safety research from Anthropic. After I drafted an off-the-record reply, I realized that I was actually fine with it being on the record, so: Since I never expected any of the...

Aug 6, 2025 · 153

The Problem

This is a new introduction to AI as an extinction threat, previously posted to the MIRI website in February alongside a summary. It was written independently of Eliezer and Nate's forthcoming book, If Anyone Builds It, Everyone Dies, and isn't a sneak peek of the book. Since the book is...

Aug 5, 2025 · 325

AGI Ruin: A List of Lethalities

Preface

Making Beliefs Pay Rent (in Anticipated Experiences)
