Equilibrate



In particular, other alignment researchers tend to think that competitive supervision (e.g. AIs competing for reward to provide assistance in AI control that humans evaluate positively, via methods such as debate and alignment bootstrapping, or ELK schemes).

Unfinished sentence?

I think there's a typo in the last paragraph of Section I?

And let’s use “≻” to mean “better than” (or, “preferred to,” or “chosen over,” or whatever), “≺” to mean “at least as good as,”

"≺" should be "≽"

Yeah, by "actual utility" I mean the sum of the utilities you get from the outcomes of each decision problem you face. You're right that if my utility function were defined over lifetime trajectories, then this would amount to quite a substantive assumption, i.e. the utility of each iteration contributes equally to the overall utility and what not.

And I think I get what you mean now, and I agree that for the iterated decisions argument to be internally motivating for an agent, it does require stronger assumptions than the representation theorem arguments. In the standard 'iterated decisions' argument, my utility function is defined over outcomes, which are the prizes in the lotteries that I choose from in each iterated decision. It simply underspecifies what my preferences over trajectories of decision problems might look like (or whether I even have any). In this sense, the 'iterated decisions' argument is not as self-contained as (i.e., requires stronger assumptions than) 'representation theorem' arguments: representation theorems justify EUM entirely in reference to the agent's existing attitudes, whereas the 'iterated decisions' argument relies on external considerations that are not fixed by the attitudes of the agent.

Does this get at the point you were making?

The assumption is that you want to maximize your actual utility. Then, if you expect to face arbitrarily many i.i.d. iterations of a choice among lotteries over outcomes with certain utilities, picking the lottery with the highest expected utility each time gives you the highest actual utility.
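A rough way to write this down, assuming (as above) that actual utility is the sum of the utilities realized in each iteration and that every lottery has finite expected utility. The symbols $U_i^L$, $L$ and $L^*$ are notation introduced here for illustration. Let $U_1^L, U_2^L, \dots$ be the i.i.d. utilities you realize by picking lottery $L$ in every iteration. The strong law of large numbers gives

$$\frac{1}{n}\sum_{i=1}^{n} U_i^L \;\longrightarrow\; \mathbb{E}\left[U^L\right] \quad \text{almost surely as } n \to \infty.$$

So if $\mathbb{E}[U^{L^*}] > \mathbb{E}[U^{L}]$, then with probability 1 there is some $N$ such that

$$\sum_{i=1}^{n} U_i^{L^*} \;>\; \sum_{i=1}^{n} U_i^{L} \quad \text{for all } n \ge N.$$

That is the sense in which always picking the lottery with the highest expected utility gives the highest actual utility in the long run. It says nothing about any finite number of iterations.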

It's really not that interesting of an argument, nor is it very compelling as a general argument for EUM. In practice, you will almost never face the exact same decision problem, with the same options, same outcomes, same probabilities, and same utilities, over and over again.

Yeah, that's a good argument that if your utility is monotonically increasing in some good X (e.g. wealth), then the type of iterated decision you expect to face involving lotteries over that good can determine that the best way to maximize your utility is to maximize a particular function (e.g. linear) of that good.

But this is not what the 'iterated decisions' argument for EUM amounts to. In a sense, it's quite a bit less interesting. The 'iterated decisions' argument does not start with some weak assumption on your utility function and then attempt to impose more structure on your utility function in iterated choice situations. It doesn't assume anything about your utility function, other than that you have one (or can be represented as having one).

All it's saying is that, if you expect to face arbitrarily many i.i.d. iterations of a choice among lotteries (i.e. known probability distributions) over outcomes that you have assigned utilities to already, you should pick the lottery that has the highest expected utility. Note that the utility assignments do not have to be linear or monotonically increasing in any particular feature of the outcomes (such as the amount of money I gain if that outcome obtains), and that the utility function is basically assumed to be there.
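To make that concrete, here is a minimal simulation sketch. The lotteries, their probabilities, and the utility numbers are all made up for illustration, and the utilities are just arbitrary numbers attached to outcomes rather than a function of money or any other good. It illustrates the claim; it does not prove it.

```python
import random

# Hypothetical lotteries: each is a list of (probability, utility) pairs.
# The utilities are arbitrary values already assigned to the outcomes.
LOTTERIES = {
    "A": [(0.5, 10.0), (0.5, -4.0)],  # expected utility 3.0
    "B": [(0.9, 2.0), (0.1, 5.0)],    # expected utility 2.3
    "C": [(0.2, 20.0), (0.8, -3.0)],  # expected utility 1.6
}


def expected_utility(lottery):
    """Probability-weighted sum of the outcome utilities."""
    return sum(p * u for p, u in lottery)


def draw(lottery, rng):
    """Sample one realized utility from the lottery."""
    r = rng.random()
    cumulative = 0.0
    for p, u in lottery:
        cumulative += p
        if r < cumulative:
            return u
    return lottery[-1][1]  # guard against floating-point rounding


def average_realized_utility(lottery, n, rng):
    """Average utility actually received over n i.i.d. iterations."""
    return sum(draw(lottery, rng) for _ in range(n)) / n


def simulate(n=200_000, seed=0):
    """Pick the same lottery n times, for each lottery in turn."""
    rng = random.Random(seed)
    return {
        name: average_realized_utility(lottery, n, rng)
        for name, lottery in LOTTERIES.items()
    }


if __name__ == "__main__":
    for name, avg in simulate().items():
        eu = expected_utility(LOTTERIES[name])
        print(f"{name}: expected {eu:.2f}, realized average {avg:.3f}")
```

With many iterations, each realized average sits close to the corresponding expected utility, so the lottery with the highest expected utility also ends up with the highest total. With only a handful of iterations the ranking can easily come out differently.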

The 'iterated decisions'-type arguments support EUM in a given decision problem if you expect to face the exact same decision problem over and over again. The 'representation theorem' arguments support EUM for a given decision problem, without qualification.

In either case, your utility function is meant to be constructed from your underlying preference relation over the set of alternatives for the given problem. The form of the function can be linear in some things or not; that is determined by your preference relation, not by the arguments for EUM.

Another good resource on this, which distinguishes the affectable universe, the observable universe, the eventually observable universe, and the ultimately observable universe: The Edges of Our Universe by Toby Ord.

What's your current credence that we're in a simulation?

Philosophical Zombies: inconceivable, conceivable but not metaphysically possible, or metaphysically possible?

Do you think progress has been made on the question of "which AIs are good successors?" Is this still your best guess for the highest impact question in moral philosophy right now? Which other moral philosophy questions, if any, would you put in the bucket of questions that are of comparable importance?