Oh yeah, the Folk Theorem is totally consistent with the Nash equilibrium of the repeated game here being 'everyone plays 30 forever', since the payoff profile '-30 for everyone' is feasible and individually-rational. In fact, this is the unique NE of the stage game and also the unique subgame-perfect NE of any finitely repeated version of the game.
To sustain '-30 for everyone forever', I don't even need a punishment for off-equilibrium deviations. The strategy for everyone can just be 'unconditionally play 30 forever' and there is no profitable unilateral...
The 'individual rationality condition' is about the payoffs in equilibrium, not about the strategies. It says that the equilibrium payoff profile must yield to each player at least their minmax payoff. Here, the minmax payoff for a given player is -99.3 (which comes from the player best responding with 30 forever to everyone else setting their dials to 100 forever). The equilibrium payoff is -99 (which comes from everyone setting their dials to 99 forever). Since -99 > -99.3, the individual rationality condition of the Folk Theorem is satisfied.
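To make the arithmetic concrete, here is a minimal sketch of the payoff comparison. It assumes a setup that this comment does not spell out: 100 players, dials restricted to integers in [30, 100], and a stage payoff equal to minus the average dial. The names `payoff` and `best_response_payoff` are hypothetical helpers for illustration only.

```python
# Assumed setup (not stated in the comment itself): N = 100 players, integer
# dials in [30, 100], and each player's stage payoff is minus the average dial.
N = 100
LOW, HIGH = 30, 100

def payoff(dials):
    """Stage payoff shared by every player: negative average of all dials."""
    return -sum(dials) / len(dials)

def best_response_payoff(others_dial):
    """Best stage payoff one player can get when the other N-1 all play others_dial."""
    return max(
        payoff([own] + [others_dial] * (N - 1))
        for own in range(LOW, HIGH + 1)
    )

# Minmax payoff: the others choose the dial that makes the player's
# best-response payoff as low as possible (here, everyone else at 100).
minmax = min(best_response_payoff(d) for d in range(LOW, HIGH + 1))

equilibrium = payoff([99] * N)  # everyone at 99 forever
all_30 = payoff([30] * N)       # everyone at 30 forever

print(minmax, equilibrium, all_30)
print(equilibrium > minmax)     # individual rationality check for the 99 profile
```

Under these assumptions the best response to everyone else playing 100 is to play 30, which gives -(99 * 100 + 30) / 100 = -99.3, so the -99 profile clears the individual rationality bar by a small margin. The same helper also shows that when everyone else plays 30, no single player can do better than -30.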
Because the meaning of statements does not, in general, consist entirely in observations/anticipated experiences, and it makes sense for people to have various attitudes (centrally, beliefs and desires) towards propositions that refer to unobservable-in-principle things.
Accepting that beliefs should pay rent in anticipated experience does not mean accepting that the meanings of sentences are determined entirely by observables/anticipated experiences. We can have that the meanings of sentences are the propositions they express, and the truth-conditions of pr...
Same as Sylvester, though my credence in consciousness-collapse interpretations of quantum mechanics has moved from 0.00001 to 0.000001.
Yeah great point, thanks. We tried but couldn't really get a set-up where she just learns a phenomenal fact. If you have a way of having the only difference in the 'Tails, Tuesday' case be that Mary learns a phenomenal fact, we will edit it in!
Thanks, the clarification of UDT vs. "updateless" is helpful.
But now I'm a bit confused as to why you would still regard UDT as "EU maximisation, where the thing you're choosing is policies". If I have a preference ordering over lotteries that violates independence, the vNM theorem implies that I cannot be represented as maximising EU.
In fact, after reading Vladimir_Nesov's comment, it doesn't even seem fully accurate to view UDT as taking in a preference ordering over lotteries. Here's the way I'm thinking of UDT: your prior over possible worlds uniquely det...
Okay this is very clarifying, thanks!
If the preference ordering over lotteries violates independence, then it will not be representable as maximising EU with respect to the probabilities in the lotteries (by the vNM theorem). Do you think it's a mistake then to think of UDT as "EU maximisation, where the thing you're choosing is policies"? If so, I believe this is the most common way UDT is framed in LW discussions, and so this would be a pretty important point for you to make more visibly (unless you've already made this point before in a post, in which case I'd love to read it).
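A standard illustration of this point (not taken from the discussion above) is the Allais pattern: preferring A to B and D to C in the hypothetical lotteries below violates independence, and no assignment of utilities to the three outcomes makes both preferences come out as higher expected utility. The sketch checks this with exact fractions; the lotteries, outcome labels, and candidate utility assignments are all assumptions made for the example.

```python
from fractions import Fraction as F

# Hypothetical lotteries (the classic Allais pattern), as {outcome: probability}.
A = {"1M": F(1)}
B = {"1M": F(89, 100), "5M": F(10, 100), "0": F(1, 100)}
C = {"1M": F(11, 100), "0": F(89, 100)}
D = {"5M": F(10, 100), "0": F(90, 100)}

def eu(lottery, u):
    """Expected utility of a lottery under the utility assignment u."""
    return sum(p * u[x] for x, p in lottery.items())

def allais_gaps(u):
    """Return (EU(A) - EU(B), EU(D) - EU(C)) for the utility assignment u."""
    return eu(A, u) - eu(B, u), eu(D, u) - eu(C, u)

# A few arbitrary utility assignments, chosen only for illustration.
candidates = [
    {"0": F(0), "1M": F(1), "5M": F(2)},
    {"0": F(0), "1M": F(9), "5M": F(10)},
    {"0": F(-5), "1M": F(3), "5M": F(4)},
]

# The two gaps always sum to exactly zero, so they cannot both be positive:
# no utility function rationalises "A over B" together with "D over C".
gap_sums = [sum(allais_gaps(u)) for u in candidates]
print(gap_sums)
```

The cancellation is not specific to these three candidates: EU(A) - EU(B) = 0.11 u(1M) - 0.10 u(5M) - 0.01 u(0), and EU(D) - EU(C) is exactly its negative, whatever the utilities are.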
Yeah by "having a utility function" I just mean "being representable as trying to maximise expected utility".
Ah okay, interesting. Do you think that updateless agents need not accept any separability axiom at all? And if not, what justifies using the EU framework for discussing UDT agents?
In many discussions on LW about UDT, it seems that a starting point is that the agent is maximising some notion of expected utility, and the updatelessness comes in via the EU formula iterating over policies rather than actions. But if we give up on some separability axiom, it seems that this EU starting point is not warranted, since every major EU representation theorem needs some version of separability.
Don't updateless agents with suitably coherent preferences still have utility functions?
That's a coherent utility function, but it seems bizarre. When you're undergoing extreme suffering, in that moment you'd presumably prefer death to continuing to exist in suffering, almost by nature of what extreme suffering is. Why defer to your current preferences rather than your preferences in such moments?
Also, are you claiming this is just your actual preferences or is this some ethical claim about axiology?
What's the countervailing good that makes you indifferent between tortured lives and nonexistence? Presumably the extreme suffering is a bad that adds negative value to their lives. Do you think just existing or being conscious (regardless of the valence) is intrinsically very good?
In particular, other alignment researchers tend to think that competitive supervision (e.g. AIs competing for reward to provide assistance in AI control that humans evaluate positively, via methods such as debate and alignment bootstrapping, or ELK schemes).
I think there's a typo in the last paragraph of Section I?
And let’s use “≻” to mean “better than” (or, “preferred to,” or “chosen over,” or whatever), “≺” to mean “at least as good as,”
"≺" should be "≽"
Yeah, by "actual utility" I mean the sum of the utilities you get from the outcomes of each decision problem you face. You're right that if my utility function were defined over lifetime trajectories, then this would amount to quite a substantive assumption, i.e. the utility of each iteration contributes equally to the overall utility and whatnot.
And I think I get what you mean now, and I agree that for the iterated decisions argument to be internally motivating for an agent, it does require stronger assumptions than the representation theorem argum...
The assumption is that you want to maximize your actual utility. Then, if you expect to face arbitrarily many i.i.d. iterations of a choice among lotteries over outcomes with certain utilities, picking the lottery with the highest expected utility each time gives you the highest actual utility.
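A minimal simulation sketch of this claim, which rests on the law of large numbers: over many i.i.d. repetitions, the summed utility from repeatedly picking the higher-EU lottery ends up above the summed utility from the lower-EU one. The two lotteries, the number of iterations, and the seed are hypothetical choices for illustration.

```python
import random

def expected_utility(lottery):
    """Expected utility of a lottery given as a list of (probability, utility) pairs."""
    return sum(p * u for p, u in lottery)

def realised_total(lottery, n, rng):
    """Sum of realised utilities over n independent draws from the lottery."""
    utilities = [u for _, u in lottery]
    weights = [p for p, _ in lottery]
    return sum(rng.choices(utilities, weights=weights, k=n))

# Hypothetical lotteries: a coin flip between utility 0 and 10 (EU = 5),
# and a sure thing worth utility 4 (EU = 4).
risky = [(0.5, 0.0), (0.5, 10.0)]
safe = [(1.0, 4.0)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000

total_risky = realised_total(risky, n, rng)
total_safe = realised_total(safe, n, rng)

print(expected_utility(risky), expected_utility(safe))
print(total_risky > total_safe)
```

With only a handful of iterations the sure thing can easily come out ahead, which is one way of seeing why the argument needs many repetitions of the same problem.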
It's really not that interesting an argument, nor is it very compelling as a general argument for EUM. In practice, you will almost never face the exact same decision problem, with the same options, same outcomes, same probabilities, and same utilities, over and over again.
Yeah, that's a good argument that if your utility is monotonically increasing in some good X (e.g. wealth), then the type of the iterated decision you expect to face involving lotteries over that good can determine that the best way to maximize your utility is to maximize a particular function (e.g. linear) of that good.
But this is not what the 'iterated decisions' argument for EUM amounts to. In a sense, it's quite a bit less interesting. The 'iterated decisions' argument does not start with some weak assumption on your utility function and then att...
The 'iterated decisions'-type arguments support EUM in a given decision problem if you expect to face the exact same decision problem over and over again. The 'representation theorem' arguments support EUM for a given decision problem, without qualification.
In either case, your utility function is meant to be constructed from your underlying preference relation over the set of alternatives for the given problem. The form of the function can be linear in some things or not, that's something to be determined by your preference relation and not the arguments for EUM.
Another good resource on this, which distinguishes the affectable universe, the observable universe, the eventually observable universe, and the ultimately observable universe: The Edges of Our Universe by Toby Ord.
What's your current credence that we're in a simulation?
I think that by count across all the possible worlds (and the impossible ones) the vast majority of observers like us are in simulations. And probably by count in our universe the vast majority of observers like us are in simulations, except that everything is infinite and so counting observers is pretty meaningless (which just helps to see that it was never the thing you should care about).
I'm not sure "we're in a simulation" is the kind of thing it's meaningful to talk about credences in, but it's definitely coherent to talk about betting odds (i.e. how ...
Philosophical Zombies: inconceivable, conceivable but not metaphysically possible, or metaphysically possible?
Do you think progress has been made on the question of "which AIs are good successors?" Is this still your best guess for the highest impact question in moral philosophy right now? Which other moral philosophy questions, if any, would you put in the bucket of questions that are of comparable importance?