Cross-posted, as always, from Putanumonit.


When constructing a theory of the universe and everything else, most people start with the universe.

For example: a theory that the universe was created by God, starting with important things like grass (Genesis 1:12), the moon (Genesis 1:16), and big crocodiles (Genesis 1:21), before finally getting to humans. These humans eventually also gain self awareness, but that doesn’t happen until Genesis 3.

Or: there are a bunch of possible mathematical structures, and some of them exist because they describe physical structures, and some of those physical structures are complex enough to become Self-Aware Substructures™. I think those “SAS” refer to you and me, though I’m not sure. Max gave me a copy of Our Mathematical Universe at this nerdy comedy thing I did but I never got around to finishing it.

What I do know: if a theory of the universe starts with the universe, it usually takes a while to get to conscious beings that can have experiences. Self awareness feels somewhat tacked on to whatever structure is supposed to explain grass and moons.

But what if we started from just the fact of experience itself? Taking a few steps beyond the most narrow solipsism, here is a short list of observations you may arrive at, in order of decreasing certainty:

  1. You have some sort of conscious experience.
  2. You have different sorts of conscious experience.
  3. Some experiences (“thoughts”) seem to follow others in a chain, as if your internal state is crunching through some process step by step. Let’s call this process “computation”.
  4. But other experiences (“perceptions”) aren’t entirely predicted by your prior internal state, suggesting there’s some external world that creates them.
  5. This external world seems orderly. That is: you can compress what’s happening into a short description of what’s where and how it behaves according to fixed rules, as opposed to everything happening at random and all over the place.
  6. But insofar as this external world gives rise to experience, it doesn’t seem to do it in a computationally efficient manner. There’s way too much empty space where no experience is likely happening, a lot of hard-to-compute physics happening inside unconscious (?) stars as opposed to inside conscious brains, etc. 

This all may suggest the following theory:

Computation happens, and may result in an “experience moment” like the one you’re having now. Since all computations are possible in principle, the chance of a particular experience being experienced is the probability that a random program run on some general computer will output that experience. This corresponds to an experience’s likelihood falling off exponentially with the length of the shortest program that outputs it. An experience that takes 147 bits to produce is twice as likely as one that takes 148 bits: the computer crunches the bits one by one, so the 147-bit experience would be output along the way for both possible values of the 148th bit. It’s a million times more likely than one that takes 167 bits, because 2^20 (one factor of 2 per extra bit) ≈ 1,000,000. You should think of every moment of your experience as being randomly drawn according to the above measure of likelihood. Is there anything other than “experience moments”? We can’t really know, and so we don’t really care.

That’s it, that’s the tweet: UDASSA. It stands for Universal Distribution (the likelihood distribution of strings by length) + Absolute Self Selection Assumption (that you should think of yourself as randomly selected among these). It’s credited to Wei Dai and explained in more technical detail here by Hal Finney and here by Paul Christiano, or argued against by Joe Carlsmith using terms like “Universal Turing Machine” and “Solomonoff Induction” and “high measure observer-moment”. If you’re familiar with these names you know we’re dealing with deep Rationalist lore here, venerable oral tradition passed from the erstwhile LessWrong giants to their co-conspirators, from them to my friend Tristan, and from him to yours truly.
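If the bit-counting feels abstract, here is the same arithmetic as a few lines of Python. This is just my own illustration of the measure, not anything from the canonical write-ups:

```python
# Toy arithmetic for the universal distribution: an experience whose shortest
# generating program is L bits long gets weight proportional to 2^(-L).

def relative_measure(bits_a: int, bits_b: int) -> float:
    """How much likelier is an experience needing bits_a than one needing bits_b?"""
    return 2.0 ** (bits_b - bits_a)

print(relative_measure(147, 148))  # 2.0: one extra bit halves the weight
print(relative_measure(147, 167))  # 1048576.0: 2^20, roughly a million
```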

If this sounds fun, stay tuned for why this crazy-sounding theory intuitively reconciles a lot of seeming paradoxes, from the simulation argument to the repugnant conclusion. If you’re horrified and wondering what happened to writing about big titties, worry not: I’ve got plenty more to write about sex and dating. But after the LessWrong crowd insulted me last week I will have my revenge: getting Rationalists to argue about anthropics in the comments.

Also: since I learned UDASSA by way of oral gossip, some of the technical details below are either wrong or simply not canon-UDASSA. Consider this my spin on the theory, not an authoritative treatment.

Note on Qualia

The first-hand explanation of UDASSA I got equivocated between “experience moments” and “qualia”, but I personally quibble with applying it to qualia specifically. 

I like the following intuition for qualia: imagine that you are simulated in precise atomic detail on a supercomputer by a scientist named Mary. Mary can query the simulation in many ways and know many things about you: that you see a red stop sign over there, that seeing the color red raises your blood pressure and hunger slightly, that you recognize it being the same color as a ripe tomato. These are all bits of knowledge that can be decoded, compared, replicated — for the purposes of UDASSA or anything else. But the redness of red, this particular quality of the experience that exists for you but is inaccessible to Mary, can’t be “decoded”, it’s something different from information or computation.

I think of UDASSA as applying to everything a scientist querying your simulation could potentially know, but not to qualia itself.

Doing Bits

What does it mean to specify an experience moment using a string of bits fed into a universal computer? We can suppose this string is broken into three parts: the specs for generating the universe, the query for locating your brain (or any other structure whose physical state corresponds to the experience), and a decoder for translating the state of the query target (e.g., a brain) into an experience.

For example, the string of bits 01110…11 (spec) 000…10 (query) 101…01 (decoder) could encode:

  1. Spec: the precise physical state of some small volume of the universe shortly after the big bang, and the laws of physics that would evolve that volume in time to include the entire part of the universe visible from the Milky Way on January 4th, 2023.
  2. Query: the location of the Earth in this space along with the timestamp, and enough detail to find you on Earth at this time. This may require specifying your branch of the quantum multiverse; I won’t really dwell on this because quantum randomness and branching aren’t super relevant to most of what I’ll discuss, though UDASSA does have a neat account of multiverse quantum probabilities.
  3. Decoder: the translation of your precise brain state right now into the experience of reading this sentence.
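To make the three-part structure concrete, here is a minimal Python sketch. The bit counts are made-up placeholders, not estimates of anything real:

```python
from dataclasses import dataclass

@dataclass
class ExperienceProgram:
    spec_bits: int     # bits to specify initial conditions and the laws of physics
    query_bits: int    # bits to locate the brain (place, time, branch) in that universe
    decoder_bits: int  # bits to translate the located brain state into an experience

    @property
    def total_bits(self) -> int:
        return self.spec_bits + self.query_bits + self.decoder_bits

# The relative measure of two experience moments depends only on the difference
# in total program length: 2^-(extra bits).
you = ExperienceProgram(spec_bits=500, query_bits=300, decoder_bits=200)
hidden_you = ExperienceProgram(spec_bits=500, query_bits=310, decoder_bits=200)
print(2.0 ** (hidden_you.total_bits - you.total_bits))  # 1024.0: ten extra query bits means ~1,000x less measure
```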

We can immediately think of some factors that will make parts of this string shorter or longer, and thus the experience moment it encodes likelier or scarcer:

  • A small universe with simple laws will have a shorter spec than a large one with complex or changing laws.
  • A world with many “experiencers” will require longer queries to locate any one of them, especially if they are relatively similar to each other and are close together.

This already tells you a lot about where you happen to find yourself right now.

The physics of the universe

It should not be surprising that you experience a universe with simple laws and locality. Locality means that only a finite section of the universe containing you needs to be specified, and fixed laws allow it to be specified at only a single moment in time (or a single slice of some dimension, if you don’t like your physics to have privileged time in it). We should in fact slightly raise our expectation that an even simpler system underlies the laws of physics we currently think we know. And again, we don’t care how hard it is to compute the universe based on these laws, only how many bits they take to specify.

Also, this suggests that the starting volume of stuff (particles, fields, etc.) needed to specify our cosmic surroundings is much smaller than the volume of stuff in your brain right now. If this weren’t so, it would be shorter to just fully specify the entire state of your brain, not caring if it was a free-floating Boltzmann brain in the void. But any random brain could have any random experience, so it would be highly unlikely for a brain to experience a lawful-universe-evolved-from-big-bang by sheer accident. The shortest bit string that generates your brain state must do so by simulating a tiny volume of space expanding and evolving for 13.7 billion years.

By the same logic, the number of dimensions we observe is likely the minimum required for self-aware experience, and the magnitudes of the physical constants we observe are likely the ones that let our environs be computed from the smallest starting volume.

Copies and slices

A common conundrum in thinking of experience and ethics is: if we duplicate a physical system that has an experience, did we double the experience itself? Is there twice as much suffering and pleasure as there was before the duplication? The same amount? Somewhere in between?

Nick Bostrom suggests the “sliced computer” thought experiment: let’s take an experiencing computer brain and lay all its wires flat on a big table. What happens if we slice all the wires in half lengthwise, creating two separated “thin” computers with the exact same pattern creating the exact same experience? 

The view that this slicing duplicates the measure of experience itself leads to weird conclusions. Do we “kill” a being by gluing the computer back together? Or suppose that all the wires are 10 atoms thick. Is the computer having a single experience, or should we think of it as two computers consisting of layers 1-5 and 6-10 stacked on top of each other? Or of four computers in layers 1-3, 4, 5-8, and 9-10? Or of 512 computers simultaneously containing all possible combinations? Is the utility monster just a cockroach that’s simulated on a computer using extremely thick wires?

UDASSA says: duplicating the physical substrate doesn’t increase the measure of experience. In the split computer case, the universe spec would be the same as before it was split, and the query would just look for the 2D pattern of the computer regardless of how many times that pattern is found in the world. If the two sliced computers are far enough apart there may be a slight increase in measure if one of them becomes easier to locate in the world, but in general, splitting or merging identical copies doesn’t change much.

Momma always said you’re special

Extending the above logic from computers to people, UDASSA implies that a person-like being has a higher measure, which means it is likelier to exist and carries more moral weight, when it is rarer and more distinct.

One implication of this is defusing the “Lindy Doomsday Argument”, which states simply that it should be very surprising to find yourself very early on in history. In other words: assuming that you are equally likely to be any human, you should assign a probability no higher than 1% to being one of the first 1% of humans to exist. And since fewer than 100 billion recognizable Homo sapiens have existed so far, we can say with 99% certainty that fewer than 10 trillion humans will ever exist. This means that the scenario of humans colonizing the galaxy and thriving for millions of years is very unlikely, and the scenario where we all go extinct or dwindle in numbers inexorably is all but guaranteed.

But UDASSA says that it’s almost 1,000 times easier to locate a person in a galaxy containing 8 billion people than it would be to locate one in a galaxy of 8 trillion, ignoring the effects of physical diversity. This gives roughly equal total weight to the humans alive at any given time in which humans exist at all. Time-wise, humans have been around for 50-100 thousand years, so we’re not finding ourselves shockingly early on if people will go on for a few million more.
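The rough arithmetic behind that factor of 1,000, in Python. This is a crude sketch: I'm treating log2(population) as the location cost, while a real query could also exploit physical structure:

```python
import math

def locate_bits(population: int) -> float:
    """Rough bits needed to single out one person among `population` similar people."""
    return math.log2(population)

small = locate_bits(8_000_000_000)      # ~33 bits for 8 billion people
large = locate_bits(8_000_000_000_000)  # ~43 bits for 8 trillion people

# Each person in the larger population carries about 2^-(extra bits) of the
# per-person measure in the smaller one:
print(2 ** (small - large))  # ~0.001, i.e. roughly 1,000 times less each
```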

UDASSA also implies that in a populous universe it’s important to make yourself easy to find. You can do it by always remaining in close proximity to a simple but unusual physical object, which is exactly what Tristan, who told me about UDASSA, always does.

Utilitarian ethics is chiefly concerned with aggregate amounts of joy and suffering. “Experience moments” seem like the natural basis for adding those up, and so UDASSA has important implications for utilitarianism.

One important consequence is defusing the repugnant conclusion. Is it better to have a planet of 1 billion beings living in perfect bliss or a planet with 100 billion drudges whose lives are only 1.01% as good as the blissful few? UDASSA says that we shouldn’t just add the utility of the larger population linearly, since in a larger population of presumably somewhat-similar beings each one has a lower weight, given the extra information required to locate them.

In general, UDASSA suggests that the proper aggregation of utility across similar beings (like all humans broadly) is somewhere between simply adding them all up as total utilitarianism does and doing no aggregation at all as average utilitarianism does. Many people (including me) have the intuition, with respect to population ethics, that diversity should matter along with the sheer total of experience. UDASSA gives us a quasi-mathematical framework to justify this intuition, which is about as much solid ground as one can hope for when talking about population ethics.
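As a toy model of that “somewhere in between” claim (my own sketch, not anything from the canonical UDASSA write-ups): suppose locating one of N similar people costs alpha * log2(N) extra bits, where alpha near 1 means the people are nearly interchangeable and alpha near 0 means each is distinctive enough to single out cheaply. Then:

```python
def aggregate_utility(u_per_person: float, population: int, alpha: float) -> float:
    """Sum of per-person utilities, each weighted by 2^-(alpha * log2(N)) = N^-alpha."""
    weight = population ** (-alpha)
    return population * weight * u_per_person

N = 1_000_000_000
print(aggregate_utility(100.0, N, alpha=0.0))  # 1e+11: total utilitarianism, scales with N
print(aggregate_utility(100.0, N, alpha=1.0))  # 100.0: average utilitarianism, ignores N
print(aggregate_utility(100.0, N, alpha=0.5))  # ~3.2e+06: somewhere in between
```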

The simulation argument against simulation

Although UDASSA is based on the idea that every experience is the result of some computation, it actually suggests that you are almost certainly in “base reality” as opposed to a simulation run by another experiencing being. The simulation argument relies on the fact that if, for example, our descendants are simulating their ancestors then they could run a huge number of such simulations — much larger than the number of their ancestors who actually lived. If you give these simulations equal weight, their sheer number makes it exceedingly likely that you are one of them.

UDASSA “penalizes” the measure of these simulations in two ways: first, if they are running at the same time and location then finding a specific one takes extra bits, which lowers its measure in direct proportion to the number of simulations. In addition, a simulation needs a longer decoder, since it requires decoding the emulation into the thing being emulated and only then decoding that thing into the experience, instead of just the final step. Or, going the other way, a simulation requires going [universe]->[computer]->[simulated brain]->[experience] as opposed to the shorter [universe]->[brain]->[experience].

The low measure of simulations also implies that destructively scanning your brain to upload into a computer is akin to death, in the sense of eliminating almost the entire weight of your experience. If it takes a mere kilobyte to decode the upload, UDASSA’s math makes it equivalent to being uploaded to a computer that then has a mere 1 in 2^8000 chance of ever being turned on. This is probably worse than death, since I would give at least a 1 in 2^8000 chance to Islam being true enough that this unnatural upload-death would deprive me of all the raisins I would get in heaven, or any similar Pascal’s wager.
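The arithmetic behind the kilobyte figure, as a back-of-the-envelope in Python (my own illustration, taking a kilobyte as 1,000 bytes of 8 bits each):

```python
import math

extra_bits = 1000 * 8  # one kilobyte of extra decoder program

# 2^-8000 underflows an ordinary float, so work in log space instead:
log10_penalty = extra_bits * math.log10(2)
print(f"upload measure penalty ~ 1 in 10^{log10_penalty:.0f}")  # ~ 1 in 10^2408
```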

UDASSA and U

An important feature of crazy grand theories is that they should add up to normality.

Normal people believe that they are neither Boltzmann brains nor living in a simulation, that physics works, that the highest good is neither tiling the universe with identical copies of rat brains on heroin nor total extinction that leaves behind a single happy being. Sometimes these normal people discover LessWrong and read about crazy thought experiments on electron suffering and acausal demiurges and astronomical basilisks. And if they don’t immediately run screaming and actually think through all this stuff, it should definitely nudge what “normal” means to them. But it shouldn’t turn their ethics and metaphysics and worldview entirely on its head.

This, I think, is the neat thing about UDASSA: it’s a very Rationalist way to get back the same intuitions that Rationality often pushes people away from. Or at the very least, a framework that lets people who hold “normal intuitions” hang out with Rats without feeling inadequately nerdy when the conversation turns to metaphysics.

Does this mean that UDASSA is true? That it adds up to normality could be as much evidence for it as against it: the intuitiveness of its conclusions makes us want to believe it more than is rationally justified. To me, it’s more of a reminder to be skeptical, an anchor to keep from being swept away to wild conclusions by grand theories with hard-to-spot errors or hidden assumptions. UDASSA seems as solid as most such philosophies, and they should all be taken equally seriously.

Comments

But after the LessWrong crowd insulted me last week I will have my revenge: getting Rationalists to argue about anthropics in the comments.

Noooo! Mercy, have mercy, please!

i've thought about this a bunch before, and in a recent post i suggest hard-coding the universal program instead of varying over spec, such that the (query, decoder) pair is the only thing left varying. is this something that has been considered by other people about UDASSA?

I think your proposal is the same as regular UDASSA. The "claw" and "world" aren't intrinsic parts of the framework, they're just names Carlsmith uses to denote a two-part structure that frequently appears in programs used to generate our observations.

An important feature of crazy grand theories is that they should add up to normality.

This is actually not a great criterion here, primarily because it assumes normality is actually a property of the universe.

I get it, it is a useful heuristic, but I don't nearly agree with this being raised to actual decisions.

EDIT: I believe that the simulation argument actually does add up to normality through a different procedure.

This seems to equate the weight of a person with the utility that we should assign to that person. I don't see a reason for this equivalence.

But which universal distribution, though? The mapping between bitstrings and computations depends on what model of computation we use. You can only declare a world "simple" once you've picked a particular model of computation, and I don't see any non-arbitrary way of singling one out.

Using the Universal Distribution in the context of the simulation argument makes a lot of sense if we think that the base reality has no intelligent simulators, as it fits with our expectation that a randomly generated simulator is very likely to be concise. But for simulations generated by humans (or any agent-simulators), a more natural prior is how easy the simulation is to run (Simplicity Assumption), since agent-simulators face concrete tradeoffs in using computational resources, while they have no pressing tradeoffs on the length of the program.

See here for more info on the latter assumption.

Nice post, thanks!

Is there a formulation of UDASSA that uses the self-indication assumption instead? What would be the implications of this?