AI Alignment Fieldbuilding · Decision theory · AI

If one surviving civilization can rescue others, shouldn't civilizations randomize?

by Knight Lee
20th May 2025
2 min read

4 comments, sorted by top scoring
Dagon · 4mo

Can you explain your model for what "survive" and "resurrect" means for a civilization, as opposed to individuals that happen to exist within a civilizational context?  Relatedly, what's your model for a civilization's decision theory that makes "random strategy" a coherent idea?

My model is that a civilization is an emergent set of behaviors and expectations of individuals that are coexistent in time and space.   And I'm not sure your thinking is applicable on that level.

Knight Lee · 4mo

Oops oh no. I used the wrong word. I meant planetary civilization, e.g. humanity or an alien civilization. Sorry.

I'll edit the post to replace "civilization" with "planetary civilization." Thank you for commenting, you saved me from confusing everyone!

In the discussion of You can, in fact, bamboozle an unaligned AI into sparing your life, the people from planet 1 can revive the people of another planet (planet 2) that was taken over by a misaligned ASI, provided that ASI saved the brain states of planet 2's people before killing them.

Both the people from planet 1 and the ASI from planet 2 might colonize the stars, expanding further and further until they meet each other. The ASI might then sell the brain states of planet 2's people to planet 1's people, so that planet 1's people can revive them.

Planet 1's people agree to this deal because they care about saving people from other planets. The ASI from planet 2 agrees because planet 1's people might give it a tiny bit more resources for making paperclips.
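
A toy way to see why both sides might accept: a sketch with made-up numbers (the variable names and magnitudes are purely illustrative, not anything from the original discussion).

```python
# Toy payoff comparison for the brain-state trade described above.
# All numbers are invented purely for illustration; "resources" is an arbitrary unit.

resource_payment = 1e-9                   # fraction of planet 1's resources offered to the ASI
value_of_revival_to_planet1 = 1.0         # planet 1 cares a lot about reviving planet 2's people
cost_to_asi_of_handing_over_data = 1e-12  # tiny cost of storing/transferring the brain states

# The ASI accepts if the extra paperclip resources exceed its (tiny) cost.
asi_gain = resource_payment - cost_to_asi_of_handing_over_data

# Planet 1 accepts if reviving planet 2's people is worth more than the payment.
planet1_gain = value_of_revival_to_planet1 - resource_payment

print(f"ASI gain:      {asi_gain:+.2e}")      # positive, so the ASI prefers to trade
print(f"Planet 1 gain: {planet1_gain:+.2e}")  # positive, so planet 1 prefers to trade
```

The point is only that the ASI's side of the trade can be positive even when the payment is minuscule, because handing over stored brain states costs it almost nothing.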

This was one of many ideas for how one surviving planetary civilization could revive others.

Dagon · 4mo

Thanks for the clarification - I'm still a bit unsure if "planetary civilization" is distinct from "the specific set of individuals inhabiting a planet", and I should admit that I'm highly skeptical of the value (to an AGI or even to other humans) of a specific individual's brain-state, and I have a lot of trouble following arguments that imply migration or resurrection of more than a few percent of biological intelligences.

Knight Lee · 4mo

Sorry, yes a planetary civilization is simply the specific set of individuals inhabiting a planet. I'm not sure what's the best way to describe that in two words :/

What I described there was only one out of very many ideas proposed in the discussion of You can, in fact, bamboozle an unaligned AI into sparing your life. The overall idea is that a few surviving civilizations can do a lot of good.

How valuable a few surviving civilizations are depends on your ontology. If you believe in the many-worlds interpretation of quantum mechanics, or believe that the universe is infinitely big, then there are infinitely many exact copies of Earth. Even if only 0.1% of Earths were saved, there would still be infinitely many copies of future you alive, just at 0.1% of the density.

The planetary civilization saving Earth may have immense resources in the post-singularity world. With millions of years of technological progress, technology will be limited only by the laws of physics. They can expand outward at close to the speed of light and control the matter and energy of 10^22 stars. Meanwhile, the energy required to simulate all of humanity on the most efficient computers possible is probably not much more than that of running one electric car.[1]

They could easily simulate 1000 copies of humanity.

This means that for every 1000 identical copies of you, 999 might die while one survives and is duplicated 1000 times.
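
A quick back-of-the-envelope version of that claim, plugging in the 0.1% and 1000-copies figures from above (the numbers are illustrative, not a real cosmological model).

```python
# Back-of-the-envelope version of the claim above (illustrative numbers, not a real model).

original_copies = 1000            # 1000 initially identical copies of you across Earths
survival_fraction = 0.001         # only 0.1% of Earths make it through the singularity
simulations_per_survivor = 1000   # each surviving civilization later simulates humanity 1000 times

survivors = original_copies * survival_fraction   # 1 copy survives "natively"
revived = survivors * simulations_per_survivor    # ...and is duplicated 1000 times

print(f"Copies alive afterwards: {survivors + revived:.0f} of the original {original_copies}")
# Roughly as many copies of you exist afterwards as before, despite 99.9% of Earths dying.
```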


If you don't care about personal survival, but about whether the average sentient life in all of existence is happy or miserable, then it's also good for planetary civilizations to randomize their strategies: this ensures at least a few survive and can use their immense resources to create far more happy lives than all the miserable lives from pre-singularity times.

  1. ^

    The human brain uses 20 watts of energy, but is very inefficient. Each neuron firing uses about 6×10^8 ATP molecules. If a simulated neuron firing only used the energy equivalent of 60 ATP molecules, it would be 10^7 times more efficient, and 8 billion people would use only about 16,000 watts, similar to one electric car.
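
    As a sanity check, the same arithmetic in a few lines (all figures are the rough order-of-magnitude estimates quoted above):

```python
# Checking the footnote's arithmetic; all figures are rough order-of-magnitude estimates.

watts_per_brain = 20                      # metabolic power of one human brain
atp_per_biological_spike = 6e8            # ATP molecules consumed per neuron firing
atp_equivalent_per_simulated_spike = 60   # assumed energy cost of one simulated firing

efficiency_gain = atp_per_biological_spike / atp_equivalent_per_simulated_spike  # 1e7

population = 8e9
power_to_simulate_humanity = population * watts_per_brain / efficiency_gain

print(f"Efficiency gain: {efficiency_gain:.0e}x")                          # ~1e+07
print(f"Power to simulate humanity: {power_to_simulate_humanity:,.0f} W")  # ~16,000 W
```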

In the comments section of You can, in fact, bamboozle an unaligned AI into sparing your life, both supporters and critics of the idea seemed to agree on two assumptions:

  • Surviving planetary civilizations have some hope of rescuing planetary civilizations killed by misaligned AI, though supporters and critics disagree on the best method of rescue.
  • The big worry is that there may be almost zero surviving planetary civilizations, because if we're unlucky, all planetary civilizations will die the same way.

What if, to ensure at least some planetary civilizations survive (and hopefully rescue others), each planetary civilization picked a random strategy?

Maybe if every planetary civilization follows a random strategy, that increases the chance that at least some survive the singularity, and also the chance that the average sentient life in all of existence is happy rather than miserable. It reduces logical risk.

History is already random, but perhaps we could further randomize the strategy we pick.

For example, if the random number generated using Dawson et al.'s method (at some prearranged date) is greater than the 95th percentile, we could all adopt MIRI's extremely pessimistic strategy, and do whatever Eliezer Yudkowsky and Nate Soares suggest with less arguing and more urgency. If they tell you that your AI lab, working on both capabilities and alignment, is a net negative, then you quit and work on something else. If you are more reluctant to do so, you might insist on the 99th percentile instead.
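
For concreteness, here is a minimal sketch of the thresholding step, assuming everyone can observe the same publicly verifiable random value on the prearranged date. The beacon string and the hash-to-uniform trick are placeholders made up for illustration; they are not Dawson et al.'s actual method.

```python
# Minimal sketch: turn a shared public random value into a shared strategy choice.
# The beacon string, hashing trick, and percentile are placeholders, not a real protocol.
import hashlib

def shared_uniform_draw(public_beacon_value: str) -> float:
    """Map an agreed-upon public random string to a number in [0, 1)."""
    digest = hashlib.sha256(public_beacon_value.encode()).hexdigest()
    return int(digest, 16) / 16**64

beacon = "public-random-value-observed-on-the-prearranged-date"  # placeholder
draw = shared_uniform_draw(beacon)

threshold = 0.95  # or 0.99 if you are more reluctant
if draw > threshold:
    strategy = "extremely pessimistic: follow Yudkowsky and Soares with urgency"
else:
    strategy = "whatever you would have done anyway"

print(f"draw = {draw:.3f} -> {strategy}")
```

Because everyone hashes the same public value, everyone lands on the same side of the threshold without any coordination beyond agreeing on the date.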

Does this make sense or am I going insane again?

Total utilitarianism objections

If you are a total utilitarian who doesn't care how happy the average life is, only about the total number of happy lives, then you might say this is a bad idea: it increases the chance that at least some planetary civilizations survive, but reduces the total expected number of happy lives.

However, it also reduces the total expected number of miserable lives. If zero planetary civilizations survive, the number of miserable lives may be huge, due to misaligned AI simulating all possible histories. If even a few planetary civilizations survive, they may trade with these misaligned AIs (causally or acausally) to greatly reduce that suffering, since the misaligned AIs gain only a tiny bit from causing astronomical suffering: they lose only a tiny bit of accuracy if they cut the suffering in half.
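
A toy expected-value comparison of the two regimes, from the total utilitarian's perspective; every number here is invented purely to illustrate the shape of the argument, not an estimate of anything real.

```python
# Toy expected-value comparison (all numbers invented purely to illustrate the argument).

miserable_lives_if_nobody_survives = 1e15   # misaligned AIs simulating all possible histories
suffering_cut_by_surviving_traders = 0.5    # survivors trade the suffering down by ~2x

def expected_miserable_lives(p_total_extinction: float) -> float:
    # If nobody survives, the full amount of suffering happens;
    # if anyone survives, trade reduces it by the assumed factor.
    p_some_survive = 1 - p_total_extinction
    return (p_total_extinction * miserable_lives_if_nobody_survives
            + p_some_survive * miserable_lives_if_nobody_survives
              * (1 - suffering_cut_by_surviving_traders))

# Everyone using the single "best" strategy: outcomes are correlated, so total
# extinction stays a live possibility. Randomizing: some civilizations almost surely survive.
print(f"Same strategy everywhere: {expected_miserable_lives(0.5):.2e} expected miserable lives")
print(f"Randomized strategies:    {expected_miserable_lives(0.01):.2e} expected miserable lives")
```

Under these made-up numbers, randomizing lowers the expected number of miserable lives even though it lowers the expected number of surviving civilizations, which is exactly the tradeoff the objection and this reply are arguing about.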

This idea is only morally bad if you are both a total utilitarian and indifferent to suffering (caring only about happiness). But really, we should have moral uncertainty and give weight to more than one philosophy (total utilitarianism, average utilitarianism, etc.).