I think most of the arguments in this essay fail to bind to reality. This essay seems to have been backchained from a cool idea for the future into arguments for it. The points about how "The cosmos will be divided between different value systems" are all quite vague and don't provide much insight into what the future actually looks like, or into how any of these processes would lead to a future like the one described, yet the descriptions of each individual layer are very specific.
(I can imagine that maybe after some kind of long reflection, we all agree on something like this, but I expect that any actual war scenarios end up with a winner-takes-all lock-in)
I do think the intuition that a stratified utopia is desirable is somewhat interesting. I think that dividing up the universe into various chunks is probably a way to create a future that most people would be happy with. The "Nothing to Mourn" principle is really nice.
Then again, I think that a simple forward-chaining application of the Nothing to Mourn principle immediately runs into the real, difficult problem of allocating resources to different values: some people's utopias are mutually net negative. For example, if one person thinks the natural world is a horrid hell-pit of suffering and another thinks that living in a fully AI-managed environment is a kind of torture for everyone involved, they just can't compromise. It's not possible. This is the real challenge of allocating value by splitting up the universe, and the fact that you didn't really address it gives the whole essay a kind of "Communist art students planning their lives on the commune after the revolution" vibe.
It would be cool to do a dive into this concept that focuses more on what kind of a thing a value actually is, and on what moral uncertainty actually means (some people, especially EAs, talk about moral uncertainty as if they're moral realists; but firstly I think moral realism is incoherent, and secondly they don't actually endorse moral realism), and that also addresses the problem of mutually net negative ideal worlds.
(1)
You're correct that the essay is backchainy. Stratified utopia is my current best bet for "most desirable future given our moral uncertainty", which motivates me to evaluate its likelihood. I don't think it's very likely, maybe 5-10%, and I could easily shift with further thought.
Starting with the most desirable future and then evaluating its likelihood does risk privileging the hypothesis. This is a fair critique: better epistemics would start with the most likely future and then evaluate its desirability.
(2)
Regarding your example: "if one person thinks the natural world is a horrid hell-pit of suffering and another thinks that living in a fully AI-managed environment is a kind of torture for everyone involved, they just can't compromise."
I should clarify that resource-compatibility is a claim about the mundane and exotic values humans actually hold. It's a contingent claim, not a necessary one. Yes, some people think the natural world is a hell-pit of suffering (negative utilitarians like Brian Tomasik), but they're typically scope-sensitive and longtermist, so they'd care far more about the distal resources.
You could construct a value profile like "utility = -1 if suffering exists on Earth, else 0", which is an exotic value seeking proximal resources. I don't have a good answer for handling such cases. But empirically, this value profile seems rare.
More common are cases involving contested sacred sites, which also violate the Nothing-To-Mourn Principle. For example, some people would mourn if the Third Temple were never rebuilt on the Temple Mount, while others would mourn if the Al-Aqsa Mosque were destroyed to make way for it.
Summary: "Stratified utopia" is an outcome where mundane values get proximal resources (near Earth in space and time) and exotic values get distal resources (distant galaxies and far futures). I discuss whether this outcome is likely or desirable.
I hold mundane values, such as partying on the weekend, the admiration of my peers, not making a fool of myself, finishing this essay, raising children, etc. I also have more exotic values, such as maximizing total wellbeing, achieving The Good, and bathing in the beatific vision. These values aren't fully-compatible, i.e. I won't be partying on the weekend in any outcome which maximizes total wellbeing[1].
But I think these values are nearly-compatible. My mundane values can get 99% of what they want (near Earth in space and time) and my exotic values can get 99% of what they want (distant galaxies and far futures). This is a happy coincidence.
I call this arrangement "stratified utopia." In this essay, I discuss two claims: that stratified utopia is likely, and that it is desirable.
The simple case in favour: when different values care about different kinds of resources, it's both likely and desirable that resources are allocated to the values which care about them the most. I discuss considerations against both claims.
Humans hold diverse values. I'm using a thick notion of "values" that encompasses not only a utility function over world states but also ontologies, decision procedures, non-consequentialist principles, and meta-values about value idealizations.[2] You could use the synonyms "normative worldviews", "conceptions of goodness", "moral philosophies", and "policies about what matters". Each human holds multiple values to different degrees in different contexts, and this value profile varies across individuals.
Values have many structural properties, and empirically certain properties cluster together. There is a cluster whose two poles I will call 'mundane' and 'exotic':
| Mundane Values | Exotic Values |
|---|---|
| Scope-Insensitive: These values saturate at human-scale quantities. | Scope-Sensitive: These values scale linearly with resources, without a saturation point. |
| Shorttermist: These values weight the near future more heavily than the far future. | Longtermist: These values weight all times equally, with no temporal discounting. |
| Substrate-Specific: The atoms matter, not just the computational pattern. | Substrate-Independent: Only the computation matters, not the physical implementation. |
| Causal-focused: These values care about things reachable through normal physical causation. | Acausal-focused: These values embrace acausal cooperation with distant civilisations we'll never meet. |
Cosmic resources similarly possess many structural properties, and empirically certain properties cluster together. There is a cluster whose two poles I will call 'proximal' and 'distal':
| Proximal Resources | Distal Resources |
|---|---|
| Spatial Proximity: Near Earth in space. The solar system. Maybe the local galaxy cluster. | Spatially Distant: Galaxies billions of light-years away. The vast majority of the observable universe. |
| Temporal Proximity: The near future. The next million years. | Temporally Distant: The far future, trillions of years from now, after the last stars have burned out. |
| Substrate Proximity: Physical reality. Atoms and matter you can touch. | Substrate Distant: Simulated worlds. Computation rather than physical matter. |
| Causal Proximity: Resources we can influence through normal causation. | Causally Distant: Distant civilizations which are correlated with our actions but cannot be physically reached. |
It follows that mundane values want proximal resources. Exotic values want distal resources. This is the happy coincidence making these values near-compatible.
The cosmos gets divided into shells expanding outward from Earth-2025, like layers of an onion. Each shell is a stratum. The innermost shell, Earth, satisfies our most mundane values. Moving outward, each shell satisfies increasingly exotic values.
The innermost shells contain a tiny proportion of the total volume, but fortunately the mundane values are scope-insensitive. The outermost shells cannot be reached for billions of years, because of the speed of light, but fortunately the exotic values are longtermist.
The descriptions below shouldn’t be taken too seriously, but I think it’s helpful to be concrete.
Level 0: The Humane World. This is our world but without atrocities such as factory farming and extreme poverty. People are still driving to school in cars, working ordinary jobs, and partying on the weekend.
Level 1: Better Governance. Everything from Level 0, plus better individual and collective rationality, improved economic, scientific, and governmental institutions, and the absence of coordination failures. Think of Yudkowsky's Dath Ilan.
Level 2: Longevity. Everything from Level 1, plus people live for centuries instead of decades. Different distributions of personality traits and cognitive capabilities, but nothing unprecedented.
Level 3: Post-Scarcity. Everything from Level 2, plus material abundance. Think of Carlsmith's "Concrete Utopia"[3].
Level 4: Biological Transhumanism. Everything from Level 3, plus fully-customizable biological anatomy.
Level 5: Mechanical Transhumanism. Everything from Level 4, but including both biological and mechanical bodies. Imagine a utopian version of Futurama or Star Trek, or the future Yudkowsky describes in Fun Theory.
Level 6: Virtual Worlds. Similar to Level 5, except everything runs in a computer simulation, possibly far in the future. The minds here still have recognizably human psychology; they value science, love, achievement, meaning. But the physics is whatever works best.
Level 7: Non-Human Minds. There are agents optimizing for things we care about, such as scientific discovery or freedom. But they aren't human in any meaningful sense. This is closer to what Carlsmith calls "Sublime Utopia".
Level 8: Optimized Welfare. Moral patients in states of optimized welfare. This might be vast numbers of small minds experiencing constant euphoria, i.e. "shrimp on heroin". Or it might be a single supergalactic utility monster. Whatever satisfies our mature population ethics. There might not be any moral agents here.
Level 9: Sludge. In this stratum, there might not even be moral patients. It is whatever configuration of matter optimizes for our idealized values. This might look like classical -oniums, pure substances maximizing for recognizable moral concepts[4]. Or it might be optimizing for something incomprehensible, e.g. "enormously complex patterns of light ricocheting off intricate, nano-scale, mirror-like machines" computing IGJC #4[5].
Predicting the longterm future of the cosmos is tricky, but I think something like stratified utopia has ~5% likelihood.[6] The argument rests on four premises:

1. Efficient Allocation: the cosmos will be divided among value systems through some mechanism that allocates resources to the values which care about them most.
2. Value Composition: at allocation time, mundane and exotic values will dominate with comparable weighting.
3. Resource Compatibility: mundane values will remain proximal-focused and exotic values will remain distal-focused.
4. Persistence: the allocation, once made, will persist.
It follows from these premises that proximal resources satisfy mundane values and distal resources satisfy exotic values. I'll examine each premise in turn.
The cosmos will be divided among value systems through some mechanism. Here are the main possibilities:
I expect resources will be allocated through some mixture of these mechanisms and others. Stratification emerges across this range because these mechanisms share a common feature: when different values care about different resources, those resources get allocated to whoever values them most.
Some considerations against this premise:
At allocation time, mundane and exotic values will dominate with comparable weighting.
Several factors push toward this premise:
Several factors push against this premise:
Will mundane values remain proximal-focused? Several factors push toward yes:
Several factors push toward no:
Several factors push toward persistence:
Several factors push against persistence:
Here are some moral intuitions that endorse stratified utopia:
But I have some countervailing moral intuitions:
If we want to split the cosmos between mundane and exotic values, we have two basic options. We could stratify temporally, saying the first era belongs to mundane values and later eras belong to exotic values. Or we could stratify spatially, saying the inner regions belong to mundane values and the outer regions belong to exotic values.
I think that spatial stratification is better than temporal stratification. Under temporal stratification, the first million years of the cosmos belong to mundane values, but after that deadline passes, exotic values take over everywhere, including Earth and the nearby stars.
Spatial stratification has several moral advantages over temporal stratification:
If stratification can optimally satisfy mixed values, it can also pessimize them: the same structure, inverted, is a stratified dystopia. The strata might look like:
Level 0: Ordinary suffering of people alive today
Levels 1-4: Enhanced biological torture
Levels 5-7: Simulated hells
Levels 8-9: Pure suffering substrate, maximally efficient negative utility
Stratified utopia requires firewalls, which block the flow of information from higher strata to lower ones. This has some advantages:
Lower strata inhabitants don't know about upper strata because firewalls prevent that knowledge. They lead lives that are, by most welfare standards, worse than lives in higher strata. This creates a moral problem: we sacrifice individual welfare for non-welfarist values like tradition and freedom. This seems unjust to those in inner strata.
To mitigate this injustice, when people die in Levels 0-6, they can be uplifted into higher strata if they would endorse this upon reflection. Someone whose worldview includes "death should be final" gets their wish: no uplifting. But everyone else would be uplifted, either physically (to Levels 1-5, which are substrate-specific utopias) or uploaded (to Levels 6-7, which are substrate-neutral). Alternatively, we could uplift someone at multiple points in their life, forking them into copies: one continues in the inner stratum, another moves to a higher stratum.
Utilitarianism does not love you, nor does it hate you, but you’re made of atoms that it can use for something else. In particular: hedonium (that is: optimally-efficient pleasure, often imagined as running on some optimally-efficient computational substrate). — Being nicer than Clippy (Joe Carlsmith, Jun 2021)
Second, the idealising procedure itself, if subjective, introduces its own set of free parameters. How does an individual or group decide to resolve internal incoherencies in their preferences, if they even choose to prioritize consistency at all? How much weight is given to initial intuitions versus theoretical virtues like simplicity or explanatory power? Which arguments are deemed persuasive during reflection? How far from one's initial pre-reflective preferences is one willing to allow the idealization process to take them? — Better Futures (William MacAskill, August 2025)
For a defence of the subjectivity of idealization procedure, see On the limits of idealized values (Joe Carlsmith, Jun 2021).
Glancing at various Wikipedias, my sense is that literary depictions of Utopia often involve humans in some slightly-altered political and material arrangement: maybe holding property in common, maybe with especially liberated sexual practices, etc. And when we imagine our own personal Utopias, it can be easy to imagine something like our current lives, but with none of the problems, more of the best bits, a general overlay of happiness and good-will, and some favored aesthetic — technological shiny-ness, pastoralness, punk rock, etc — in the background. — Actually possible: thoughts on Utopia (Joe Carlsmith, Jan 2021)
Classical -oniums include: alethonium (most truthful), areteonium (most virtuous), axionium (most valuable), dikaionium (most just), doxonium (most glorious), dureonium (most enduring), dynamonium (most powerful), eirenium (most peaceful), eleutheronium (most free), empathonium (most compassionate), eudaimonium (most flourishing), harmonium (most harmonious), hedonium (most pleasurable), holonium (most complete), kalionium (most beautiful), magnanimium (most generous), philonium (most loving), pneumonium (most spiritual), praxonium (most righteous), psychonium (most mindful), sophonium (most wise), teleonium (most purposeful), timonium (most honourable).
Suppose, for example, that a candidate galaxy Joe---a version of myself created by giving original me 'full information' via some procedure involving significant cognitive enhancement---shows me his ideal world. It is filled with enormously complex patterns of light ricocheting off of intricate, nano-scale, mirror-like machines that appear to be in some strange sense 'flowing.' These, he tells me, are computing something he calls [incomprehensible galaxy Joe concept (IGJC) #4], in a format known as [IGJC #5], undergirded and 'hedged' via [IGJC #6]. He acknowledges that he can't explain the appeal of this to me in my current state.
'I guess you could say it's kind of like happiness,' he says, warily. He mentions an analogy with abstract jazz.
'Is it conscious?' I ask.
'Um, I think the closest short answer is no,' he says.
Suppose I can create either this galaxy Joe's favorite world, or a world of happy puppies frolicking in the grass. The puppies, from my perspective, are a pretty safe bet: I myself can see the appeal. Expected value calculations under moral uncertainty aside, suppose I start to feel drawn towards the puppies. Galaxy Joe tells me with grave seriousness: 'Creating those puppies instead of IGJC #4 would be a mistake of truly ridiculous severity.' I hesitate. Is he right, relative to me?
— On the limits of idealized values (Joe Carlsmith, Jun 2021).
My aim in this essay is not to offer quantitative probabilities, but I will offer some here as an invitation for pushback: Efficient Allocation (75%), Value Composition (50%), Resource Compatibility (45%), Persistence (30%). A naive multiplication gives 5% for Stratified Utopia, which seems reasonable.
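Spelled out, the naive product of those four numbers:

0.75 × 0.50 × 0.45 × 0.30 ≈ 0.051, i.e. roughly 5%.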
In the future, there could be potential for enormous gains from trade and compromise between groups with different moral views. Suppose, for example, that most in society have fairly commonsense ethical views, such that common-sense utopia (from the last essay) achieves most possible value, whereas a smaller group endorses total utilitarianism. If so, then an arrangement where the first group turns the Milky Way into a common-sense utopia, and the second group occupies all the other accessible galaxies and turns them into a total utilitarian utopia, would be one in which both groups get a future that is very close to as good as it could possibly be. Potentially, society could get to this arrangement even if one group was a much smaller minority than the other, via some sort of trade. Through trade, both groups get a future that is very close to as good as it could possibly be, by their lights. — Better Futures (William MacAskill, August 2025)
Consider mundane utility U_m = 100(1-x) + (1-y) and exotic utility U_e = x + 100y, where x and y are the proportions of proximal and distal resources allocated to exotic values. Starting from equal division (0.5, 0.5) as the disagreement point, both the Nash and Kalai-Smorodinsky (K-S) bargaining solutions select the corner solution (0,1) where mundane gets all proximal and exotic gets all distal. For Nash: this maximizes the product of gains, since both parties get resources they value 100 times more than what they give up. For K-S: this is the only Pareto-efficient point providing positive equal gains (each party gets utility 100, gaining 49.5 from disagreement). The anti-stratified corner (1,0) leaves both worse off than disagreement.
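Here is a minimal sketch (my own construction, not from the essay) that brute-forces the Nash solution for these toy utilities over a grid, using the equal-division disagreement point described above:

```python
# Sketch: check that the corner (x, y) = (0, 1) maximizes the Nash product
# for the toy utilities above.
# x = share of proximal resources given to exotic values,
# y = share of distal resources given to exotic values.

def u_mundane(x, y):
    return 100 * (1 - x) + (1 - y)   # mundane values weight proximal resources 100:1

def u_exotic(x, y):
    return x + 100 * y               # exotic values weight distal resources 100:1

D = (u_mundane(0.5, 0.5), u_exotic(0.5, 0.5))  # disagreement point: equal division

best = None
steps = 101
for i in range(steps):
    for j in range(steps):
        x, y = i / (steps - 1), j / (steps - 1)
        gain_m = u_mundane(x, y) - D[0]
        gain_e = u_exotic(x, y) - D[1]
        if gain_m < 0 or gain_e < 0:
            continue                 # Nash bargaining only considers points above disagreement
        nash_product = gain_m * gain_e
        if best is None or nash_product > best[0]:
            best = (nash_product, x, y)

print(best)   # (2450.25, 0.0, 1.0): mundane keeps all proximal, exotic gets all distal
```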
This is similar to the Market mechanism, except the allocation doesn't involve the transfer of property rights or prices.
There is also a possibility (although it seems to me less likely) that my exotic values become more proximal-focused, perhaps due to mature infinite ethics undermining total utilitarianism.
If Loud Aliens Explain Human Earliness, Quiet Aliens Are Also Rare (Robin Hanson et al., 2021)
I suspect the most common attitude among people today would either be to reject the idea of reflection on the good (de dicto) as confusing or senseless, to imagine one's present views as unlikely to be moved by reflection, or to see one's idealised reflective self as an undesirably alien creature. — Section 2.3.1 Better Futures (William MacAskill, August 2025)
Hanson argues that history is a competition to control the distant future, but behavior has been focused on the short term. Eventually, competition will select for entities capable of taking longer views and planning over longer timescales, and these will dominate. He calls this transition point "Long View Day." See Long Views Are Coming (Robin Hanson, November 2018)
Near mode and far mode refer to different styles of thinking identified in construal level theory. Near mode is concrete, detailed, and contextual — how we think about things physically, temporally, or socially close to us. Far mode is abstract, schematic, and decontextualized — how we think about distant things. See Robin Hanson's summary.
See Section 4.2.3. Defense-dominance, Better Futures (William MacAskill, August 2025)
As vast robotic fleets sweep across the cosmos, constructing astronomical megastructures with atomic precision, hear a single song echoing throughout all strata: "Non, Je ne regrette rien".
The expected choiceworthiness approach assigns each theory a utility function and maximizes the credence-weighted sum of utilities. See What to Do When You Don't Know What to Do (Andrew Sepielli, 2009) and Moral Uncertainty (MacAskill, Bykvist & Ord, 2020).
This faces the problem of intertheoretic comparisons: different theories may use different utility scales. But we can solve this with normalisation: Moral Uncertainty and Its Consequences (Ted Lockhart, 2000) proposes range normalization, equalizing each theory's range between best and worst options. Statistical Normalization Methods in Interpersonal and Intertheoretic Comparisons (Cotton-Barratt, MacAskill & Ord, 2020) proposes variance normalization, equalizing each theory's variance across possible outcomes.
On either normalisation scheme, the expected choiceworthiness is maximised when proximal resources satisfy mundane values and distal resources satisfy exotic values.
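As a concrete illustration (my own toy setup, not from the footnote), here is a sketch reusing the toy utilities U_m = 100(1-x) + (1-y) and U_e = x + 100y from the bargaining footnote, with 50/50 credences and a hypothetical option set, showing that range normalization plus expected choiceworthiness selects the stratified allocation:

```python
# Sketch: expected choiceworthiness with range normalization over a few
# candidate allocations (x, y), where
# x = share of proximal resources given to exotic values,
# y = share of distal resources given to exotic values.

options = {
    "stratified":      (0.0, 1.0),   # mundane gets proximal, exotic gets distal
    "anti-stratified": (1.0, 0.0),
    "equal division":  (0.5, 0.5),
    "all to mundane":  (0.0, 0.0),
    "all to exotic":   (1.0, 1.0),
}

theories = {
    "mundane": lambda x, y: 100 * (1 - x) + (1 - y),
    "exotic":  lambda x, y: x + 100 * y,
}
credences = {"mundane": 0.5, "exotic": 0.5}   # assumed 50/50 split for illustration

def range_normalize(scores):
    # Rescale a theory's utilities so its worst option is 0 and its best is 1.
    lo, hi = min(scores.values()), max(scores.values())
    return {k: (v - lo) / (hi - lo) for k, v in scores.items()}

normalized = {
    name: range_normalize({opt: u(*xy) for opt, xy in options.items()})
    for name, u in theories.items()
}
choiceworthiness = {
    opt: sum(credences[t] * normalized[t][opt] for t in theories)
    for opt in options
}
print(max(choiceworthiness, key=choiceworthiness.get))  # "stratified"
```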
A Bargaining-Theoretic Approach to Moral Uncertainty (Hilary Greaves & Owen Cotton-Barratt, 2019)
Normative Uncertainty as a Voting Problem (William MacAskill, 2016) and The Parliamentary Approach to Moral Uncertainty (Toby Newberry & Toby Ord, 2021)
The Property Rights Approach to Moral Uncertainty (Harry Lloyd, 2022)
For discussion of both approaches, see Section 5 of Moral Decision-Making Under Uncertainty (Tarsney, Thomas, & MacAskill, SEP 2024).
Credit to Avi Parrack for this point.