Why Subagents?

Of course, this is just a heuristic argument, and if partial preference orderings in real life have some special structure, the conclusion might differ.

Hmm, I may be missing something here, but I suspect that "partial preference orderings in real life have some special structure" is very likely true in the relevant sense. Human preferences don't appear to be a random sample from the set of all possible partial orders over "world states" (or, more accurately, human models of worlds).

First of all, if you model human preferences as a vector-valued utility function (i.e. one element of the vector per subagent) it seems that it has to be continuous, and probably Lipschitz, in the sense that we're limited in how much we can care about small changes in the world state. There's probably some translation of this property into graph theory that I'm not aware of.

Also, it seems like there's one or a handful of preferred factorizations of our world model into axes-of-value, and different subagents will care about different factors/axes. More specifically, it appears that human preferences have a strong tendency to track the same abstractions that we use for empirical prediction of the world; as John says, human values are a function of humans' latent variables. If you stop believing that souls and afterlives exist as a matter of science, it's hard to continue sincerely caring about what happens to your soul after you die. We also don't tend to care about weird contrived properties with no explanatory/predictive power like "grue" (green before 1 January 2030 and blue afterward).

To the extent this is the case, it should dramatically (exponentially, I think) reduce the number of posets that are really possible, and therefore the number of subagents needed to describe them.
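
To make the vector-valued picture concrete, here's a minimal sketch in which each subagent contributes one coordinate of utility, and one state is preferred to another only when no subagent is worse off and at least one is strictly better off (Pareto dominance). The function names, the two axes, and all the numbers are made up for illustration; nothing here is taken from the original post.

```python
from typing import Dict, Sequence, Tuple

def prefers(u_a: Sequence[float], u_b: Sequence[float]) -> bool:
    """True if the state with utilities u_a is strictly preferred to the one with u_b.

    Each coordinate is one subagent's utility (an illustrative assumption).
    """
    if len(u_a) != len(u_b):
        raise ValueError("utility vectors must have the same length")
    no_worse = all(a >= b for a, b in zip(u_a, u_b))
    some_better = any(a > b for a, b in zip(u_a, u_b))
    return no_worse and some_better

# Hypothetical states scored on two hypothetical axes: (money, leisure).
STATES: Dict[str, Tuple[float, float]] = {
    "status_quo": (1.0, 1.0),
    "more_money": (2.0, 1.0),
    "more_leisure": (1.0, 2.0),
    "both": (2.0, 2.0),
}

def comparable(x: str, y: str) -> bool:
    """True if the partial order ranks x and y in either direction."""
    return prefers(STATES[x], STATES[y]) or prefers(STATES[y], STATES[x])
```

With these numbers, "both" beats everything else, while "more_money" and "more_leisure" are incomparable. The fewer axes the subagents actually care about, the fewer coordinates you need to reproduce an ordering like this.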

We need a theory of anthropic measure binding

I'll address your points in reverse order.

What if all of the worlds with the lowest EU are completely bizarre (like, boltzmann brains, or worlds that have somehow fallen under the rule of fantastical devils with literally no supporters).

The Boltzmann brain issue is addressed in infra-Bayesian physicalism with a "fairness" condition that excludes from the EU calculation any world where you are run with fake memories, or where the history of your actions is inconsistent with what your policy says you would actually do. Vanessa talks about this in AXRP episode 14. The "worlds that have somehow fallen under the rule of fantastical devils" thing is only a problem if that world is actually assigned high measure in one of the sa-measures (fancy affine-transformed probability distributions) in your prior. The maximin rule is only used to select the sa-measure in your convex set with the lowest EU, and then you maximize EU given that distribution. You don't pick the literal worst conceivable world.

Notably, if you don't like the maximin rule, it's been shown in Section 4 of this post that infra-Bayesian logic still works with optimism in the face of Knightian uncertainty; you just don't get worst-case guarantees anymore. I'd suspect that you could also get away with something like "maximize 10th percentile EU" to get more tempered risk-averse behavior.
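
Here's a toy sketch of the difference between the pessimistic and optimistic rules. To keep it simple it uses a small finite set of ordinary probability distributions as a stand-in for the convex set of sa-measures, which is a big simplification and my own assumption. The worlds, actions, and payoffs are all hypothetical. Each action is scored by its lowest (or highest) expected utility across the set, and the best-scoring action is chosen.

```python
from typing import Dict, List

def expected_utility(dist: List[float], utils: List[float]) -> float:
    """Expected utility of one action under one distribution over worlds."""
    return sum(p * u for p, u in zip(dist, utils))

def choose(actions: Dict[str, List[float]],
           credal_set: List[List[float]],
           pessimistic: bool = True) -> str:
    """Pick the action with the best worst-case EU (maximin) or, if
    pessimistic is False, the best best-case EU (the optimistic rule)."""
    aggregate = min if pessimistic else max
    return max(
        actions,
        key=lambda a: aggregate(expected_utility(d, actions[a]) for d in credal_set),
    )

# Hypothetical worlds, in order: [ordinary, lucky, ruled_by_devils].
# No distribution in the set puts any weight on the devil world.
CREDAL_SET = [
    [0.7, 0.3, 0.0],
    [0.3, 0.7, 0.0],
]

# Hypothetical payoffs of each action in each world.
ACTIONS = {
    "cautious": [4.0, 4.0, 4.0],
    "sensible": [6.0, 5.0, -1000.0],  # catastrophic only in the devil world
    "gamble": [0.0, 12.0, 0.0],
}
```

With these numbers, `choose(ACTIONS, CREDAL_SET)` returns `"sensible"` even though that action is catastrophic in the devil world, because no distribution in the set gives that world any weight. Passing `pessimistic=False` returns `"gamble"` instead.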

Solomonoff inducting, producing an estimate of the measure of my existence (the rate of the occurrence of the experience I'm currently having) across all possible universe-generators weighted inversely to their complexity seems totally coherent to me. (It's about 0.1^10^10^10^10)

I'm not sure I follow your argument. I thought your view was that minds implemented in more places, perhaps with more matter/energy, have more anthropic measure? The Kolmogorov complexity of the mind seems like an orthogonal issue.

Maybe you're already familiar with it, but I think Stuart Armstrong's Anthropic Decision Theory paper (along with some of his LW posts on anthropics) does a good job of "deflating" anthropic probabilities and shifting the focus to your values and decision theory.

We need a theory of anthropic measure binding

Update: I don't think I agree with this anymore, after listening to what Vanessa Kosoy said about anthropics and infra-Bayesianism during her recent AXRP interview. Her basic point is that the notion of "number of copies" of an agent, which I take to be closely related to anthropic measure, is sort of incoherent and not definable in the general case. Instead you're just supposed to ask, given some hypothesis H, what is the probability that the computation corresponding to my experience is running somewhere, anywhere?

If we assume that you start out with full Knightian uncertainty over which of the two brains you are, then infra-Bayesianism would (I think) tell you to act as if you're the brain whose future you believe to have the lowest expected utility, since that way you avoid the worst possible outcome in expectation.

EDT with updating double counts

Yeah, I think this is right. It seems like the whole problem arises from ignoring the copies of you who see "X is false." If your prior on X is 0.5, then really the behavior of the clones that see "X is false" should be exactly analogous to yours, and if you're going to be a clone-altruist you should care about all the clones of you whose behavior and outcomes you can easily predict.

I should also point out that this whole setup assumes that there are 0.99N clones who see one calculator output and 0.01N clones who see the opposite, but that's really going to depend on what exact type of multiverse you're considering (quantum vs. inflationary vs. something else) and what type of randomness is injected into the calculator (classical or quantum). But if you include both the "X is true" and "X is false" copies then I think it ends up not mattering.

We need a theory of anthropic measure binding

Thank you for bringing attention to this issue. I think it's an under-appreciated problem. I agree with you that the "force" measure is untenable, and the "pattern" view, while better, probably can't work either.

Count-based measures seem to fail because they rely on drawing hard boundaries between minds. Also, there are going to be cases where it's not even clear whether a system counts as a mind or not, and if we take the "count" view we will probably be forced to make definitive decisions in those cases.

Mass/energy-based measures seem better because they allow you to treat anthropic measure as the continuous variable that it is, but I also don't think they can be the answer. In particular, they seem to imply that more efficient implementations of a mind (in terms of component size or power consumption or whatever) would have lower measure than less efficient ones, even if they have all the same experiences.

This is debatable, but it strikes me that anthropic measure and "degree of consciousness" are closely related concepts. Fundamentally, for a system to have any anthropic measure at all, it needs to be able to count as an "observer" or an "experiencer", which seems pretty close to saying that it's conscious on some level.

If we equate consciousness with a kind of information processing, then anthropic measure could be a function of "information throughput" or something like that. If System A can "process" more bits of information per unit time than System B, then it can have more experiences than System B, and arguably should be given more anthropic measure. In other words, if you identify "yourself" with the set of experiences you're having in a given moment, then it's more likely that those experiences are being realized in a system with more computing power, and so more ability to have experiences, than in a system with less compute. Note that, on this view, the information that's being processed doesn't have to be compressed/deduplicated in any way; systems running the same computation on many threads in parallel would still have more measure than single-threaded systems ceteris paribus.
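
As a toy version of this, here's a sketch that takes measure to be directly proportional to total throughput, with parallel threads simply added up rather than deduplicated. The linear scaling, the normalization into shares, and the example systems are all my own simplifying assumptions; "a function of throughput" could take many other forms.

```python
from typing import Dict, Tuple

def throughput_measure(systems: Dict[str, Tuple[float, int]]) -> Dict[str, float]:
    """Map each system to its share of total throughput.

    systems: name -> (bits per second per thread, number of threads).
    Threads running the same computation are counted separately.
    """
    raw = {name: rate * threads for name, (rate, threads) in systems.items()}
    total = sum(raw.values())
    if total <= 0:
        raise ValueError("total throughput must be positive")
    return {name: value / total for name, value in raw.items()}

# Two hypothetical systems running the same mind at the same per-thread rate.
shares = throughput_measure({
    "single_threaded": (1e6, 1),
    "four_threads": (1e6, 4),
})
```

Here the four-threaded system ends up with four times the share of the single-threaded one, which is the "ceteris paribus" claim above written out as arithmetic.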

There's a lot that needs to be fleshed out with this "computational theory of anthropic measure", but it seems like the truth has to be something in this general direction.