Eric Chen

13

The 'individual rationality condition' is about the payoffs in equilibrium, not about the strategies. It says that the equilibrium payoff profile must yield to each player at least their minmax payoff. Here, the minmax payoff for a given player is -99.3 (which comes from the player best responding with 30 forever to everyone else setting their dials to 100 forever). The equilibrium payoff is -99 (which comes from everyone setting their dials to 99 forever). Since -99 > -99.3, the individual rationality condition of the Folk Theorem is satisfied.
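The arithmetic behind those two numbers can be checked with a small sketch. It assumes a setup the comment does not spell out: 100 players, integer dial settings from 30 to 100, and a stage payoff equal to minus the average dial setting. These assumptions are chosen because they reproduce the -99.3 figure; the function names are purely illustrative.

```python
# Assumed setup (not stated in the comment): 100 players, integer dials in
# [30, 100], and each player's stage payoff is minus the average dial setting.
N_PLAYERS = 100
DIALS = range(30, 101)


def payoff(own_dial, others_dial, n=N_PLAYERS):
    """Stage payoff when one player sets own_dial and the other n-1 set others_dial."""
    return -(own_dial + (n - 1) * others_dial) / n


def minmax_payoff(n=N_PLAYERS):
    """The others pick the dial that hurts the player most; the player best responds."""
    return min(
        max(payoff(own, others, n) for own in DIALS)
        for others in DIALS
    )


minmax = minmax_payoff()          # others at 100, player best responds with 30
equilibrium = payoff(99, 99)      # everyone at 99
print(minmax, equilibrium, equilibrium >= minmax)
```

Under these assumptions the last printed value is the individual rationality check itself: the equilibrium payoff is compared against the minmax payoff, not against any particular strategy.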

42

Because the meaning of statements does not, in general, consist entirely in observations/anticipated experiences, and it makes sense for people to have various attitudes (centrally, beliefs and desires) towards propositions that refer to unobservable-in-principle things.

Accepting that beliefs should pay rent in anticipated experience does not mean accepting that the meanings of sentences are determined entirely by observables/anticipated experiences. We can hold that the meanings of sentences are the propositions they express, and that the truth-conditions of propositions are generally states-of-affairs-in-the-world and not just observations/anticipated experiences. Eliezer himself puts it nicely here: "The meaning of a statement is *not* the future experimental predictions that it brings about, nor isomorphic up to those predictions [...] you can have meaningful statements with no experimental consequences, for example: "Galaxies continue to exist after the expanding universe carries them over the horizon of observation from us.""

As to how to choose one belief over another when both are observationally equivalent in some sense, there are many relevant considerations. One is *our best theories predict it:* if our best cosmological theories predict that something does not cease to exist the moment it exits our lightcone, then we should assign higher probability to the statement "objects continue to exist outside our lightcone" than to the statement "objects vanish at the boundary of our lightcone". Another is *simplicity-based priors:* the many-worlds interpretation of quantum mechanics is strictly simpler/has a shorter description length than the Copenhagen interpretation (Many-Worlds = wave function + Schrödinger evolution; Copenhagen interpretation = wave function + Schrödinger evolution + collapse postulate), so we should assign a higher prior to many-worlds than to Copenhagen.

If your concern is instead that attitudes towards such propositions have no behavioural implications and thus cannot in principle be elicited from agents, then the response is to point to the various decision-theoretic representation theorems available in the literature. Take the Jeffrey framework: as long as your preferences over propositions satisfy certain conditions (e.g. Ordering, Averaging), I can derive both a quantitative desirability measure and a probability measure, characterising your desire and belief attitudes (respectively) towards the propositions you are considering. The actual procedure to elicit this preference relation looks like asking people to consider and compare actualising various propositions, which we can think of as gambles. For example, a gamble might look like "If the coin lands Heads, then one person comes into existence outside of our future lightcone and experiences bliss; if the coin lands Tails, then one person comes into existence outside of our future lightcone and experiences suffering". Note that the propositions here can refer to unobservables. Also, it seems reasonable to prefer the above gamble involving a fair coin to the same gamble but with the coin biased towards Tails. Moreover, the procedure to elicit an agent's attitudes to such propositions merely consists in the agent considering what they would do *if* they were choosing which of various propositions to bring about, and does not cash out in terms of observations/anticipated experiences.
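To make the coin example concrete, here is a minimal sketch of how a gamble's desirability comes out as a probability-weighted average of the desirabilities of its branches. The numbers (+1 for the bliss proposition, -1 for the suffering proposition, 0.25 for the Tails-biased coin's chance of Heads) are hypothetical, and the sketch runs in the opposite direction from a representation theorem, which starts from the preference ordering and derives the measures.

```python
# Hypothetical desirabilities for the two (unobservable) outcome propositions.
V_BLISS = 1.0
V_SUFFERING = -1.0


def gamble_desirability(p_heads, v_heads=V_BLISS, v_tails=V_SUFFERING):
    """Desirability of the gamble as a probability-weighted average of its branches."""
    return p_heads * v_heads + (1.0 - p_heads) * v_tails


fair = gamble_desirability(0.5)           # fair coin
tails_biased = gamble_desirability(0.25)  # coin biased towards Tails
print(fair, tails_biased, fair > tails_biased)
```

Nothing in the calculation refers to anything the agent will observe; the ranking of the two gambles depends only on the agent's attitudes towards the propositions.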

(As an aside, doing acausal reasoning in general requires agents to have beliefs and desires towards unobservable-in-principle stuff in, e.g., distant parts of our universe or other Everett branches.)

20

Same as Sylvester, though my credence in consciousness-collapse interpretations of quantum mechanics has moved from 0.00001 to 0.000001.

50

Yeah great point, thanks. We tried but couldn't really get a set-up where she just learns a phenomenal fact. If you have a way of having the only difference in the 'Tails, Tuesday' case be that Mary learns a phenomenal fact, we will edit it in!

62

Thanks, the clarification of UDT vs. "updateless" is helpful.

But now I'm a bit confused as to why you would still regard UDT as "EU maximisation, where the thing you're choosing is policies". If I have a preference ordering over lotteries that violates independence, the vNM theorem implies that I cannot be represented as maximising EU.
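As an illustration of that implication, the sketch below uses the classic Allais lotteries (my choice of example, not something raised in this thread). The common preference pattern, A over B and D over C, violates independence, and an exact search finds no utility assignment that rationalises it. The normalisation u($0) = 0, u($5M) = 1 and the grid over u($1M) are assumptions of the sketch.

```python
from fractions import Fraction as F

# Lotteries as probabilities over outcomes ($0, $1M, $5M): the Allais setup.
A = (F(0), F(1), F(0))
B = (F(1, 100), F(89, 100), F(10, 100))
C = (F(89, 100), F(11, 100), F(0))
D = (F(90, 100), F(0), F(10, 100))


def expected_utility(lottery, utilities):
    """Expected utility of a lottery under a utility assignment to the outcomes."""
    return sum(p * u for p, u in zip(lottery, utilities))


def rationalises(utilities):
    """True if these utilities rank A above B and D above C."""
    return (
        expected_utility(A, utilities) > expected_utility(B, utilities)
        and expected_utility(D, utilities) > expected_utility(C, utilities)
    )


# Normalise u($0) = 0 and u($5M) = 1, then scan u($1M) exactly on a fine grid.
candidates = [(F(0), F(k, 1000), F(1)) for k in range(1001)]
witnesses = [u for u in candidates if rationalises(u)]
print(len(witnesses))
```

The grid is only a check. Algebraically, A over B requires 0.11·u($1M) > 0.10·u($5M) + 0.01·u($0), while D over C requires the reverse strict inequality, so no utility function can deliver both.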

In fact, after reading Vladimir_Nesov's comment, it doesn't even seem fully accurate to view UDT as taking in a preference ordering over lotteries. Here's the way I'm thinking of UDT: your prior over possible worlds uniquely determines the probabilities of a single lottery L, and selecting a global policy is equivalent to choosing the outcomes of this lottery L. Now different UDT agents may prefer different lotteries, but this is in no sense expected utility maximisation. It is simply this: some UDT agents think one lottery is the best, others might think another is the best. There is nothing in this story that resembles a cardinal utility function over outcomes that the agents are multiplying with their prior probabilities to maximise EU with respect to.

It seems that to get an EU representation of UDT, you need to impose coherence on the preference ordering over lotteries (i.e. over different prior distributions), but since UDT agents come with some fixed prior over worlds which is not updated, it's not at all clear why rationality would demand coherence in your preference between lotteries (let alone coherence that satisfies independence).

21

Okay this is very clarifying, thanks!

If the preference ordering over lotteries violates independence, then it will not be representable as maximising EU with respect to the probabilities in the lotteries (by the vNM theorem). Do you think it's a mistake then to think of UDT as "EU maximisation, where the thing you're choosing is policies"? If so, I believe this is the most common way UDT is framed in LW discussions, and so this would be a pretty important point for you to make more visibly (unless you've already made this point before in a post, in which case I'd love to read it).

10

Yeah by "having a utility function" I just mean "being representable as trying to maximise expected utility".

20

Ah okay, interesting. Do you think that updateless agents need not accept any separability axiom at all? And if not, what justifies using the EU framework for discussing UDT agents?

In many discussions on LW about UDT, it seems that a starting point is that the agent is maximising some notion of expected utility, and the updatelessness comes in via the EU formula iterating over policies rather than actions. But if we give up on some separability axiom, it seems that this EU starting point is not warranted, since every major EU representation theorem needs some version of separability.

70

Don't updateless agents with suitably coherent preferences still have utility functions?

Oh yeah, the Folk Theorem is totally consistent with a Nash equilibrium of the repeated game here being 'everyone plays 30 forever', since the payoff profile '-30 for everyone' is feasible and individually rational. In fact, 'everyone plays 30' is the unique NE of the stage game and also the unique subgame-perfect NE of any finitely repeated version of the game.

To sustain '-30 for everyone forever', I don't even need a punishment for off-equilibrium deviations. The strategy for everyone can just be 'unconditionally play 30 forever' and there is no profitable unilateral deviation for anyone here.
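A quick sketch of why no punishment is needed, under an assumed setup that the comment does not state explicitly: 100 players, integer dials from 30 to 100, and a stage payoff equal to minus the average dial setting. Because the others play 30 unconditionally, a deviation never changes their future play, so checking the one-shot gain is enough.

```python
# Assumed setup (not stated in the comment): 100 players, integer dials in
# [30, 100], and each player's stage payoff is minus the average dial setting.
N_PLAYERS = 100
DIALS = range(30, 101)


def payoff(own_dial, others_dial, n=N_PLAYERS):
    """Stage payoff when one player sets own_dial and the other n-1 set others_dial."""
    return -(own_dial + (n - 1) * others_dial) / n


def best_one_shot_gain(common_dial):
    """Largest gain from a unilateral one-period deviation when everyone plays common_dial."""
    on_path = payoff(common_dial, common_dial)
    best_deviation = max(payoff(own, common_dial) for own in DIALS)
    return best_deviation - on_path


print(best_one_shot_gain(30))  # no dial does better than 30 against all-30
print(best_one_shot_gain(99))  # positive: all-99 needs a punishment threat
```

Under these assumptions the gain from deviating is zero when everyone plays 30, whereas at 'everyone plays 99' a player gains in the current period by dropping to 30, which is why that profile has to be propped up by conditional strategies.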

The relevant Folk Theorem here just says that any feasible and individually-rational payoff profile in the stage game (i.e. setting dials at a given time) is a Nash equilibrium payoff profile in the infinitely repeated game. Here, that's everything in the interval [-99.3, -30] for a given player. The theorem itself doesn't really help constrain our expectations about which of the possible Nash equilibria will in fact be played in the game.