I think this is less a critique of the story and more a refusal to engage with its premise. Three Worlds Collide is a thought experiment about genuinely incompatible values. Saying "but maybe they don't actually conflict, maybe it's a misunderstanding" sidesteps the dilemma rather than engaging with it. It's like responding to the trolley problem by asking whether the trolley has brakes.
On the translation point, the story's translation system isn't some bilingual dictionary; it's essentially a superintelligent AI. Doubting its accuracy feels like an objection the story has already addressed.
If we take the "how do we really know we understand each other?" question seriously, it applies not just to first contact with aliens but to all communication between any two minds. There are other stories that are far more vulnerable to this critique, and many others that engage with the question.
Three Worlds Collide assumes calibration is solved
Eliezer Yudkowsky's Three Worlds Collide (2009) is a first-contact novella designed as a thought experiment about naturalistic metaethics. The setup: a human starship encounters two alien species simultaneously.
The Babyeaters are a civilized species — they have art, empathy, stories — who consider eating their own sentient offspring a sacred moral duty. It emerges from their evolutionary history and they see it the way we see any foundational moral principle: as obviously correct.
The Super Happy Fun People are a vastly more powerful species who find all suffering morally abhorrent. They want to modify humanity to eliminate pain, and stop the Babyeaters from eating their children. They see human tolerance of pain the way humans see baby-eating: as a barbaric practice that must end.
The story is usually read as a tragedy of incompatible values. Three species, each with genuine moral reasoning, each horrifying to the others. The philosophical punch: if we insist the Babyeaters must stop eating babies, what stops the Super Happies from insisting we stop experiencing pain? The plot culminates in a decision to nova a star, killing the 15 billion people in its system, to prevent the Super Happies from reaching and modifying humanity.
It's a brilliant setup. But I think there's a hidden assumption at its foundation that deserves examination.
The story takes for granted that all three species actually understand each other.
The xenopsychologist translates Babyeater moral arguments into human terms and says he understands their position. The Super Happies declare they comprehend why humans value pain. Everyone proceeds as if mutual comprehension is achieved, and the rest of the plot is game theory over incompatible utility functions.
But there's a step missing.
Steven Pinker draws a useful distinction between shared knowledge and common knowledge. Shared knowledge is when we both know the same fact — I know your position, you know mine. Common knowledge is when I know that you know that I know — the recursive chain that's actually required for coordination. The difference matters: two generals who each independently learn the enemy's position have shared knowledge. But they can only coordinate an attack if they have common knowledge — if each knows the other knows, and knows the other knows they know.
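To make the recursion concrete, here is a toy sketch (not from Pinker or the story; the integer "depth" encoding and the threshold are my own simplification): depth 0 means "knows the fact," depth 1 means "knows the other side knows," and so on. Shared knowledge only guarantees depth 0 on each side; common knowledge is the limit where every depth holds.

```python
# Toy model of knowledge depth: 0 = knows the fact, 1 = knows the other side knows,
# 2 = knows the other side knows they know, and so on. The depths and the required
# threshold below are illustrative simplifications, not anything from the story.

def shared_knowledge(depth_a: int, depth_b: int) -> bool:
    """Both sides know the fact itself."""
    return depth_a >= 0 and depth_b >= 0

def can_coordinate(depth_a: int, depth_b: int, required_depth: int) -> bool:
    """A joint plan that depends on 'I know that you know that I know...'
    only goes through if both sides reach the required depth. Common knowledge
    is the limit in which this holds for every required_depth."""
    return depth_a >= required_depth and depth_b >= required_depth

# Two generals who each independently learned the enemy's position:
print(shared_knowledge(0, 0))      # True  -- shared knowledge exists
print(can_coordinate(0, 0, 2))     # False -- but not the depth needed to attack together
print(can_coordinate(5, 5, 2))     # True  -- enough mutual depth for this particular plan
```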
A concrete example from the story: the Super Happies say they want to "combine and compromise the utility functions of the three species until we mutually satisfice." The humans hear this and interpret it through their own cognitive framework — as something like forced modification of human nature. But what does "combine utility functions" actually mean in Super Happy cognition? These are beings whose communication literally is sex, whose experience of individuality is fundamentally alien to humans. When they say "compromise," do they mean what humans mean by that word? The story never checks.
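One way to see how underdetermined that phrase is, even before any translation issues: in ordinary human math, "combine the utility functions" could mean a utilitarian sum, a Rawlsian minimum, or a Nash product, and those rules can rank the same candidate compromises differently. The sketch below uses invented utility numbers purely to illustrate that ambiguity; none of it comes from the story.

```python
# Three standard aggregation rules, all of which count as "combining utility functions",
# applied to invented utility numbers for (humans, babyeaters, superhappies).
from math import prod

candidate_compromises = {
    "compromise A": (0.9, 0.1, 0.9),
    "compromise B": (0.5, 0.5, 0.5),
    "compromise C": (0.8, 0.3, 0.7),
}

rules = {
    "utilitarian sum":  lambda u: sum(u),
    "rawlsian minimum": lambda u: min(u),
    "nash product":     lambda u: prod(u),
}

for name, rule in rules.items():
    best = max(candidate_compromises, key=lambda option: rule(candidate_compromises[option]))
    print(f"{name:16s} -> {best}")
# With these numbers the three rules pick three different compromises:
# sum -> A, minimum -> B, product -> C.
```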
And it's not just about translation accuracy. Even if the translation were perfect, there's the question of whether anyone knows the translation is perfect. The Lord Pilot makes the decision to nova a star and kill 15 billion people based on his model of Super Happy intentions. His confidence in that model is never examined. No one asks: "How sure are we that we're understanding them correctly, and how sure are they that we're understanding them, and do we know that they know that we know?"
Without that recursive verification, you can't actually distinguish between two very different scenarios:
1. Genuine values conflict — we truly understand each other and our values are incompatible. This is a real tragedy with no easy solution.
2. Coordination failure from miscalibrated comprehension — we believe we understand each other, but our confidence in that understanding exceeds its actual accuracy. The conflict is partly or wholly an artifact of misunderstanding.
The story presents scenario 1 as a given and builds its entire moral dilemma on that foundation. But scenario 2 is never ruled out, and no one in the story has the tools to rule it out.
This isn't just a nitpick about the fiction. It reflects a broader pattern in how we think about disagreements — one that I think the rationalist community is particularly susceptible to. When two parties are confident they understand each other's positions and still disagree, we tend to conclude they have fundamentally different values. But confidence in understanding is not the same as actual understanding. The gap between the two is what you might call metacognitive miscalibration — and it's measurable, if anyone bothers to measure it.
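The "measurable" claim can be made concrete with a toy calculation, assuming you collect two things per exchange: each side's stated confidence that it understands the other, and the accuracy the other side actually assigns to its paraphrases. All numbers below are invented for illustration.

```python
# A hedged sketch of measuring what the post calls metacognitive miscalibration:
# the gap between self-reported confidence in understanding and verified accuracy.
from statistics import mean

# Each record: (self-reported confidence in understanding, graded paraphrase accuracy),
# both on a 0.0-1.0 scale. Values are invented for illustration.
judgments = [
    (0.95, 0.60),
    (0.90, 0.55),
    (0.85, 0.70),
    (0.99, 0.40),
]

confidence = mean(c for c, _ in judgments)
accuracy = mean(a for _, a in judgments)
miscalibration = confidence - accuracy  # positive = overconfident in understanding

print(f"mean stated confidence: {confidence:.2f}")
print(f"mean verified accuracy: {accuracy:.2f}")
print(f"overconfidence gap:     {miscalibration:+.2f}")
```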
Consider how a simple protocol could have changed the story's trajectory: before any negotiation, each species paraphrases back the other's core position in their own terms, and the other species rates the accuracy. "You believe eating children is morally necessary because [X, Y, Z] — is that correct?" If the Babyeaters respond "No, you're missing the key point, it's actually about [W]," then you've learned something crucial — your model was wrong, and every decision based on that model would have been wrong too.
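Here is a minimal sketch of that loop; the function names, the 0.9 threshold, and the canned demo responses are all my own illustrative assumptions, not anything the story specifies.

```python
def verify_understanding(make_paraphrase, grade_paraphrase, threshold=0.9, max_rounds=5):
    """Paraphrase the other side's position in your own terms, let them grade the
    paraphrase, and fold their correction into the next attempt. Only report the
    model as verified once the graded accuracy clears the threshold."""
    correction = None
    for round_number in range(1, max_rounds + 1):
        paraphrase = make_paraphrase(correction)          # incorporate the last correction, if any
        accuracy, correction = grade_paraphrase(paraphrase)
        if accuracy >= threshold:
            return True, round_number                     # model verified; negotiate on it
    return False, max_rounds                              # never verified; treat the model as suspect

# Toy demo with canned responses standing in for the other species:
canned = iter([
    (0.4, "No, you're missing the key point; it's actually about W."),
    (0.95, ""),
])
verified, rounds = verify_understanding(
    make_paraphrase=lambda correction: f"Our model of your position (revised after: {correction})",
    grade_paraphrase=lambda paraphrase: next(canned),
)
print(verified, rounds)   # True 2 -- the model is only trusted after the correction lands
```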
Nobody in Three Worlds Collide runs this loop. The xenopsychologist's understanding is taken at face value. The Super Happies' stated comprehension of human values is accepted without verification. And as a result, we genuinely don't know — within the fiction — whether the tragedy is driven by incompatible values or by miscalibrated mutual understanding.
In real human conflicts (which the alien species are obviously metaphors for), it's almost always a mix of both — and in my experience, more of the second than people expect. People are systematically overconfident in their understanding of positions they disagree with. And that overconfidence is invisible to them, which makes it especially dangerous: you can't fix a comprehension gap you don't know exists.
The story would have been even more interesting — and more true to the actual structure of real disagreements — if it had grappled with this uncertainty instead of assuming it away. Not because it would change the ending necessarily, but because it would add a layer of doubt that makes the moral dilemma even harder: what if we're about to kill 15 billion people over a misunderstanding?