The Best of LessWrong

Here you can find the best posts of LessWrong. Once a post is more than a year old, the LessWrong community reviews it and votes on how well it has stood the test of time. These are the posts that have ranked highest in each annual review since 2018, when our tradition of choosing the least wrong of LessWrong began.

For the years 2018, 2019, and 2020 we also published the results of the annual vote as physical books, which you can buy and learn more about here.

2022

AGI Ruin: A List of Lethalities, by Eliezer Yudkowsky
MIRI announces new "Death With Dignity" strategy, by Eliezer Yudkowsky
Where I agree and disagree with Eliezer, by paulfchristiano
Inner and outer alignment decompose one hard problem into two extremely hard problems, by TurnTrout
On how various plans miss the hard bits of the alignment challenge, by So8res
Simulators, by janus
Epistemic Legibility, by Elizabeth
Tyranny of the Epistemic Majority, by Scott Garrabrant
Counterarguments to the basic AI x-risk case, by KatjaGrace
Let’s think about slowing down AI, by KatjaGrace
Reward is not the optimization target, by TurnTrout
Six Dimensions of Operational Adequacy in AGI Projects, by Eliezer Yudkowsky
What Are You Tracking In Your Head?, by johnswentworth
Safetywashing, by Adam Scholl
Threat-Resistant Bargaining Megapost: Introducing the ROSE Value, by Diffractor
Nonprofit Boards are Weird, by HoldenKarnofsky
Optimality is the tiger, and agents are its teeth, by Veedrac
chinchilla's wild implications, by nostalgebraist
It Looks Like You're Trying To Take Over The World, by gwern
Staring into the abyss as a core life skill, by benkuhn
You Are Not Measuring What You Think You Are Measuring, by johnswentworth
Losing the root for the tree, by Adam Zerner
Worlds Where Iterative Design Fails, by johnswentworth
Decision theory does not imply that we get to have nice things, by So8res
Comment reply: my low-quality thoughts on why CFAR didn't get farther with a "real/efficacious art of rationality", by AnnaSalamon
What an actually pessimistic containment strategy looks like, by lc
Introduction to abstract entropy, by Alex_Altair
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover, by Ajeya Cotra
Sazen, by Duncan Sabien
Luck based medicine: my resentful story of becoming a medical miracle, by Elizabeth
A Mechanistic Interpretability Analysis of Grokking, by Neel Nanda
The Redaction Machine, by Ben
Butterfly Ideas, by Elizabeth
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research], by LawrenceC
Language models seem to be much better than humans at next-token prediction, by Buck
Toni Kurz and the Insanity of Climbing Mountains, by GeneSmith
Useful Vices for Wicked Problems, by HoldenKarnofsky
What should you change in response to an "emergency"? And AI risk, by AnnaSalamon
Models Don't "Get Reward", by Sam Ringer
How To Go From Interpretability To Alignment: Just Retarget The Search, by johnswentworth
Security Mindset: Lessons from 20+ years of Software Security Failures Relevant to AGI Alignment, by elspood
Why Agent Foundations? An Overly Abstract Explanation, by johnswentworth
A central AI alignment problem: capabilities generalization, and the sharp left turn, by So8res
Humans provide an untapped wealth of evidence about alignment, by TurnTrout
Learning By Writing, by HoldenKarnofsky
Limerence Messes Up Your Rationality Real Bad, Yo, by Raemon
The Onion Test for Personal and Institutional Honesty, by chanamessinger
Counter-theses on Sleep, by Natália
The shard theory of human values, by Quintin Pope
How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme, by Collin
ProjectLawful.com: Eliezer's latest story, past 1M words, by Eliezer Yudkowsky

2021

How factories were made safe, by jasoncrawford
Cryonics signup guide #1: Overview, by mingyuan
Making Vaccine, by johnswentworth
Strong Evidence is Common, by Mark Xu
“PR” is corrosive; “reputation” is not., by AnnaSalamon
Your Cheerful Price, by Eliezer Yudkowsky
Taboo "Outside View", by Daniel Kokotajlo
All Possible Views About Humanity's Future Are Wild, by HoldenKarnofsky
Another (outer) alignment failure story, by paulfchristiano
Split and Commit, by Duncan Sabien
What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs), by Andrew_Critch
There’s no such thing as a tree (phylogenetically), by eukaryote
ARC's first technical report: Eliciting Latent Knowledge, by paulfchristiano
This Can't Go On, by HoldenKarnofsky
Rationalism before the Sequences, by Eric Raymond
The Plan, by johnswentworth
Trapped Priors As A Basic Problem Of Rationality, by Scott Alexander
Finite Factored Sets, by Scott Garrabrant
Selection Theorems: A Program For Understanding Agents, by johnswentworth
Slack Has Positive Externalities For Groups, by johnswentworth
My research methodology, by paulfchristiano
Lies, Damn Lies, and Fabricated Options, by Duncan Sabien
Fun with +12 OOMs of Compute, by Daniel Kokotajlo
What 2026 looks like, by Daniel Kokotajlo
The Rationalists of the 1950s (and before) also called themselves “Rationalists”, by Owain_Evans
Ruling Out Everything Else, by Duncan Sabien
Leaky Delegation: You are not a Commodity, by Darmani
Feature Selection, by Zack_M_Davis
Cup-Stacking Skills (or, Reflexive Involuntary Mental Motions), by Duncan Sabien
larger language models may disappoint you [or, an eternally unfinished draft], by nostalgebraist
Ngo and Yudkowsky on alignment difficulty, by Eliezer Yudkowsky
How To Write Quickly While Maintaining Epistemic Rigor, by johnswentworth
Science in a High-Dimensional World, by johnswentworth
Self-Integrity and the Drowning Child, by Eliezer Yudkowsky
Comments on Carlsmith's “Is power-seeking AI an existential risk?”, by So8res
Working With Monsters, by johnswentworth
Simulacrum 3 As Stag-Hunt Strategy, by johnswentworth
Elephant seal 2, by KatjaGrace
EfficientZero: How It Works, by 1a3orn
Lars Doucet's Georgism series on Astral Codex Ten, by Sune
Catching the Spark, by LoganStrohl
Specializing in Problems We Don't Understand, by johnswentworth
Shoulder Advisors 101, by Duncan Sabien
Notes from "Don't Shoot the Dog", by juliawise
Why has nuclear power been a flop?, by jasoncrawford

2020

The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables, by johnswentworth
Coordination as a Scarce Resource, by johnswentworth
Inaccessible information, by paulfchristiano
Cortés, Pizarro, and Afonso as Precedents for Takeover, by Daniel Kokotajlo
My computational framework for the brain, by Steven Byrnes
Inner Alignment: Explain like I'm 12 Edition, by Rafael Harth
Draft report on AI timelines, by Ajeya Cotra
An overview of 11 proposals for building safe advanced AI, by evhub
When Money Is Abundant, Knowledge Is The Real Wealth, by johnswentworth
Against GDP as a metric for timelines and takeoff speeds, by Daniel Kokotajlo
Anti-Aging: State of the Art, by JackH
Interfaces as a Scarce Resource, by johnswentworth
Most Prisoner's Dilemmas are Stag Hunts; Most Stag Hunts are Schelling Problems, by abramdemski
An Orthodox Case Against Utility Functions, by abramdemski
Is Success the Enemy of Freedom? (Full), by alkjash
microCOVID.org: A tool to estimate COVID risk from common activities, by catherio
Alignment By Default, by johnswentworth
The Solomonoff Prior is Malign, by Mark Xu
Introduction To The Infra-Bayesianism Sequence, by Diffractor
Radical Probabilism, by abramdemski
Reality-Revealing and Reality-Masking Puzzles, by AnnaSalamon
Why haven't we celebrated any major achievements lately?, by jasoncrawford
Some AI research areas and their relevance to existential safety, by Andrew_Critch
Search versus design, by Alex Flint
Seeing the Smoke, by Jacob Falkovich
Pain is not the unit of Effort, by alkjash
The ground of optimization, by Alex Flint
"Can you keep this confidential? How do you know?", by Raemon
Discontinuous progress in history: an update, by KatjaGrace
The Treacherous Path to Rationality, by Jacob Falkovich
How uniform is the neocortex?, by zhukeepa
To listen well, get curious, by benkuhn
Motive Ambiguity, by Zvi
Simulacra Levels and their Interactions, by Zvi
What Money Cannot Buy, by johnswentworth
AGI safety from first principles: Introduction, by Richard_Ngo
The Felt Sense: What, Why and How, by Kaj_Sotala
The First Sample Gives the Most Information, by Mark Xu
Nuclear war is unlikely to cause human extinction, by Jeffrey Ladish
Can crimes be discussed literally?, by Benquo
Swiss Political System: More than You ever Wanted to Know (I.), by Martin Sustrik
Studies On Slack, by Scott Alexander
Transportation as a Constraint, by johnswentworth
The date of AI Takeover is not the day the AI takes over, by Daniel Kokotajlo
Reply to Eliezer on Biological Anchors, by HoldenKarnofsky
Biology-Inspired AGI Timelines: The Trick That Never Works, by Eliezer Yudkowsky
CFAR Participant Handbook now available to all, by Duncan Sabien

2019

What failure looks like, by paulfchristiano
Risks from Learned Optimization: Introduction, by evhub
The Parable of Predict-O-Matic, by abramdemski
Noticing Frame Differences, by Raemon
Yes Requires the Possibility of No, by Scott Garrabrant
"Other people are wrong" vs "I am right", by Buck
Rest Days vs Recovery Days, by Unreal
Seeking Power is Often Convergently Instrumental in MDPs, by TurnTrout
Chris Olah’s views on AGI safety, by evhub
Being the (Pareto) Best in the World, by johnswentworth
Book Review: The Secret Of Our Success, by Scott Alexander
Rule Thinkers In, Not Out, by Scott Alexander
Reframing Superintelligence: Comprehensive AI Services as General Intelligence, by Rohin Shah
The strategy-stealing assumption, by paulfchristiano
Reframing Impact, by TurnTrout
Understanding “Deep Double Descent”, by evhub
Moloch Hasn’t Won, by Zvi
Integrity and accountability are core parts of rationality, by habryka
Book summary: Unlocking the Emotional Brain, by Kaj_Sotala
Asymmetric Justice, by Zvi
Heads I Win, Tails?—Never Heard of Her; Or, Selective Reporting and the Tragedy of the Green Rationalists, by Zack_M_Davis
Gears-Level Models are Capital Investments, by johnswentworth
In My Culture, by Duncan Sabien
Make more land, by jefftk
Forum participation as a research strategy, by Wei Dai
Unconscious Economics, by jacobjacob
Mistakes with Conservation of Expected Evidence, by abramdemski
Selection vs Control, by abramdemski
You Get About Five Words, by Raemon
The Schelling Choice is "Rabbit", not "Stag", by Raemon
Bioinfohazards, by Spiracular
Excerpts from a larger discussion about simulacra, by Benquo
human psycholinguists: a critical appraisal, by nostalgebraist
AI Safety "Success Stories", by Wei Dai
Do you fear the rock or the hard place?, by Ruby
Propagating Facts into Aesthetics, by Raemon
Gradient hacking, by evhub
The Amish, and Strategic Norms around Technology, by Raemon
Power Buys You Distance From The Crime, by Elizabeth
Paper-Reading for Gears, by johnswentworth
How to Ignore Your Emotions (while also thinking you're awesome at emotions), by Hazard
The Real Rules Have No Exceptions, by Said Achmiz
Coherent decisions imply consistent utilities, by Eliezer Yudkowsky
Alignment Research Field Guide, by abramdemski
Blackmail, by Zvi
The Curse Of The Counterfactual, by pjeby
The Credit Assignment Problem, by abramdemski
Reason isn't magic, by Benquo
Mental Mountains, by Scott Alexander
Simple Rules of Law, by Zvi
Is Rationalist Self-Improvement Real?, by Jacob Falkovich
Literature Review: Distributed Teams, by Elizabeth
Steelmanning Divination, by Vaniver
Book Review: Design Principles of Biological Circuits, by johnswentworth
Building up to an Internal Family Systems model, by Kaj_Sotala
Evolution of Modularity, by johnswentworth
[Answer] Why wasn't science invented in China?, by Ruby
Gears vs Behavior, by johnswentworth

2018

Prediction Markets: When Do They Work?, by Zvi
Coherence arguments do not entail goal-directed behavior, by Rohin Shah
Is Science Slowing Down?, by Scott Alexander
Embedded Agents, by abramdemski
The Rocket Alignment Problem, by Eliezer Yudkowsky
Local Validity as a Key to Sanity and Civilization, by Eliezer Yudkowsky
Robustness to Scale, by Scott Garrabrant
A voting theory primer for rationalists, by Jameson Quinn
Toolbox-thinking and Law-thinking, by Eliezer Yudkowsky
A Sketch of Good Communication, by Ben Pace
Paul's research agenda FAQ, by zhukeepa
An Untrollable Mathematician Illustrated, by abramdemski
Arguments about fast takeoff, by paulfchristiano
The Costly Coordination Mechanism of Common Knowledge, by Ben Pace
Anti-social Punishment, by Martin Sustrik
Varieties Of Argumentative Experience, by Scott Alexander
Specification gaming examples in AI, by Vika
Meta-Honesty: Firming Up Honesty Around Its Edge-Cases, by Eliezer Yudkowsky
My attempt to explain Looking, insight meditation, and enlightenment in non-mysterious terms, by Kaj_Sotala
Naming the Nameless, by sarahconstantin
Inadequate Equilibria vs. Governance of the Commons, by Martin Sustrik
The Tails Coming Apart As Metaphor For Life, by Scott Alexander
Babble, by alkjash
More Babble, by alkjash
Noticing the Taste of Lotus, by Valentine
The Pavlov Strategy, by sarahconstantin
Being a Robust Agent, by Raemon
Spaghetti Towers, by eukaryote
Beyond Astronomical Waste, by Wei Dai
Research: Rescuers during the Holocaust, by Martin Sustrik
Prune, by alkjash
The Loudest Alarm Is Probably False, by orthonormal
The Intelligent Social Web, by Valentine
Open question: are minimal circuits daemon-free?, by paulfchristiano
On the Loss and Preservation of Knowledge, by Samo Burja
Is Clickbait Destroying Our General Intelligence?, by Eliezer Yudkowsky
What makes people intellectually active?, by abramdemski
Why did everything take so long?, by KatjaGrace
Challenges to Christiano’s capability amplification proposal, by Eliezer Yudkowsky
Historical mathematicians exhibit a birth order effect too, by Eli Tyre
Towards a New Impact Measure, by TurnTrout
Birth order effect found in Nobel Laureates in Physics, by Bucky