## LESSWRONG

Neel_Krishnaswami

Trust in Bayes

Eliezer: Never mind having the expectation of a sum of an infinite number of variables not equalling the sum of the expectations; here we have the expectation of the sum of two bets not equalling the sum of the expectations.

If you have an alternating series which is conditionally but not absolutely convergent, the Riemann series theorem says that reordering its terms can change the result, or force divergence. So you can't pull a series of bets apart into two series, and expect their sums to equal the sum of the original. But the fact that you assumed you could is a perfect illustration of the point; if you had a collection of bets in which you could do this, then no limit-based Dutch book is possible.
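To make the reordering point concrete, here is a minimal sketch using the alternating harmonic series. The choice of series and the "two positive terms, then one negative term" reordering are my own illustration, not anything from McGee's paper. The usual order converges to ln 2, while the reordered version converges to (3/2) ln 2.

```python
import math

def alternating_harmonic(n_terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in the usual order."""
    return sum((-1) ** (k + 1) / k for k in range(1, n_terms + 1))

def rearranged(n_blocks):
    """Same terms, reordered: two positive terms, then one negative term."""
    total = 0.0
    odd, even = 1, 2
    for _ in range(n_blocks):
        total += 1 / odd + 1 / (odd + 2)  # next two odd-denominator terms
        odd += 4
        total -= 1 / even                 # next even-denominator term
        even += 2
    return total

usual = alternating_harmonic(300_000)
reordered = rearranged(100_000)

print(usual, math.log(2))            # both close to ln 2
print(reordered, 1.5 * math.log(2))  # both close to (3/2) ln 2
```

Every term of the original series appears exactly once in the reordered one, yet the two sums differ, which is why splitting a conditionally convergent series of bets into two subseries is not an innocent move.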

Ensuring that this property holds necessarily restricts the possible shapes of a utility function. We need to bound the utility function to avoid St. Petersburg-style problems, but the addition of time adds another infinite dimension to the event space, so we need to ensure that expectations of infinite sums of random variables indexed by time are also equal to the sum of the expectations. For example, one familiar way of doing this is to assume a time-separable, discounted utility function. Then you can't force an agent into infinitely delayed gratification, because there's a bounded utility and a minimum, nonzero payment to delay the reward -- at some point, you run out of the space you need to force delay.

If you're thinking that the requirement that expected utility actually works puts very stringent limits on what forms of utility functions you can use -- you're probably right. If you think that philosophical analyses of rationality can justify a wider selection of utility functions or preference relations -- you're probably still right. But the one thing you can't do is to pick a function of the second kind and still insist that ordinary decision-theoretic methods are valid with it. Decision-theoretic methods require utility to form a proper random variable. If your utility function can't satisfy this need, you can't use decision-theoretic methods with it.

Trust in Bayes

You've profoundly misunderstood McGee's argument, Eliezer. The reason you need the expectation of the sum of an infinite number of random variables to equal the sum of the expectations of those random variables is exactly to ensure that choosing an action based on the expected value actually yields an optimal course of action.

McGee observed that if you have an infinite event space and unbounded utilities, there is a collection of random utility functions U1, U2, ... such that E(U1 + U2 + ...) != E(U1) + E(U2) + .... McGee then observes that if you restrict utilities to a bounded range, then in fact E(U1 + U2 + ...) == E(U1) + E(U2) + ..., which ensures that a series of choices based on the expected value always gives the correct result. In contrast, the other paper -- which you apparently approve of -- happily accepts that when E(U1 + U2 + ...) != E(U1) + E(U2) + ..., an agent can be Dutch booked and defends this as still rational behavior.

Right now, you're a "Bayesian decision theorist" who a) doesn't believe in making choices based on expected utility, and b) accepts Dutch Books as rational. This is goofy.

The "Intuitions" Behind "Utilitarianism"

I think claims like "exactly twice as bad" are ill-defined.

Suppose you have some preference relation on possible states R, so that X is preferred to Y if and only if R(X, Y) holds. Next, suppose we have a utility function U, such that if R(X, Y) holds, then U(X) > U(Y). Now, take any monotone transformation of this utility function. For example, we can take the exponential of U, and define U'(X) = 2^(U(X)). Now, note that U(X) > U(Y) if and only if U'(X) > U'(Y). Now, even if U is additive along some dimension of X, U' won't be.

But there's no principled reason to believe that U is a "truer" reflection of the agent's preferences than U', since both of them are equally faithful to the underlying preference relation. So if you want to do meaningful comparisons of utility you have to do them in a way that's invariant under monotone transformations. Since "twice as bad" isn't invariant under such a transformation, it's not evidently a meaningful claim.
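A small sketch of the invariance point, assuming a toy utility U(x) = x over a handful of numeric outcomes (the outcomes and this choice of U are hypothetical, picked only for illustration). The transformation U'(x) = 2^(U(x)) ranks the outcomes identically, but it changes ratios and destroys additivity.

```python
outcomes = [1, 2, 3, 4]

def U(x):
    """Hypothetical utility, additive in x."""
    return x

def U_prime(x):
    """Monotone transformation of U, as in the comment: 2^(U(x))."""
    return 2 ** U(x)

# Both functions induce exactly the same ranking of outcomes.
same_ranking = sorted(outcomes, key=U) == sorted(outcomes, key=U_prime)

# "Twice as good" under U is "four times as good" under U'.
ratio_U = U(4) / U(2)
ratio_U_prime = U_prime(4) / U_prime(2)

# U is additive, U' is not.
additive_U = U(1 + 2) == U(1) + U(2)
additive_U_prime = U_prime(1 + 2) == U_prime(1) + U_prime(2)

print(same_ranking, ratio_U, ratio_U_prime, additive_U, additive_U_prime)
```

Nothing in the preference relation itself picks out which of the two ratios is the "real" one.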

Now, there might be some additional principle you can advance to justify claims like that, but I haven't seen it, or its justification, yet.

The "Intuitions" Behind "Utilitarianism"

Bob: Sure, if you specify a disutility function that mandates lots-o'-specks to be worse than torture, decision theory will prefer torture. But that is literally begging the question, since you can write down a utility function to come to any conclusion you like. On what basis are you choosing that functional form? That's where the actual moral reasoning goes. For instance, here's a disutility function, without any of your dreaded asymptotes, that strictly prefers specks to torture:

U(T,S) = ST + S

Freaking out about asymptotes reflects a basic misunderstanding of decision theory, though. If you've got a rational preference relation, then you can always give a bounded utility function. (For example, the function I wrote above can be transformed to U(T,S) = (ST + S)/(ST + S + 1), which always gives you a function in [0,1], and gives rise to the same preference relation as the original.) If you absolutely require unbounded utilities in utility functions, then you become subject to a Dutch book (see Vann McGee's "An Airtight Dutch Book"). Attempts to salvage unbounded utility pretty much always end up accepting certain Dutch books as rational, which means you've rejected the whole decision-theoretic justification of Bayesian probability theory. Now, the existence of bounds means that if you have a monotone utility function, then the limits will be well-defined.

So asymptotic reasoning about monotonically increasing harms is entirely legit, and you can't rule it out of bounds without giving up on either Bayesianism or rational preferences.
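As a quick check of the bounding trick, here is a sketch that evaluates U(T,S) = ST + S and its transform u/(u + 1) on a small grid of non-negative integers. The grid size is an arbitrary assumption, and exact fractions are used to avoid rounding issues. It confirms that the transformed values stay in [0,1) and that the ordering of every pair of points is unchanged.

```python
from fractions import Fraction
from itertools import product

def U(T, S):
    """The unbounded function from the comment: U(T,S) = ST + S."""
    return S * T + S

def U_bounded(T, S):
    """The bounded transform: (ST + S) / (ST + S + 1)."""
    u = U(T, S)
    return Fraction(u, u + 1)

# Hypothetical test grid: T and S each range over 0..5.
grid = list(product(range(6), repeat=2))

in_unit_interval = all(0 <= U_bounded(T, S) < 1 for T, S in grid)

# Every pairwise comparison comes out the same under both functions.
order_preserved = all(
    (U(*p) < U(*q)) == (U_bounded(*p) < U_bounded(*q))
    for p in grid
    for q in grid
)

print(in_unit_interval, order_preserved)
```

This works because x/(x + 1) is strictly increasing for x >= 0, so it cannot swap the order of any two values.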

0 And 1 Are Not Probabilities

If you don't want to assume the existence of certain propositions, you're asking for a probability theory corresponding to a co-intuitionistic variant of minimal logic. (Cointuitionistic logic is the logic of affirmatively false propositions, and is sometimes called Popperian logic.) This is a logic with false, or, and (but not truth), and an operation called co-implication, which I will write a <-- b.

Take your event space L to be a distributive lattice (with ordering <), which does not necessarily have a top element, but does have dual relative pseudo-complements. The co-implication (a <-- b) is the element of L characterized by the following property:

for all x in L, b < (a or x) if and only if (a <-- b) < x

Now, we take a probability function to be a function from elements of L to the reals, satisfying the following axioms:

1. P(false) = 0
2. if A < B then P(A) <= P(B)
3. P(A or B) + P(A and B) = P(A) + P(B)

There you go. Probability theory without certainty.
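For a concrete instance, here is a sketch of one model of these axioms. The choice of lattice is my own assumption: finite sets of natural numbers ordered by inclusion, with union as "or" and intersection as "and". This lattice has no top element, since the set of all naturals is not finite. The weights 2^-(n+1) are also just an illustrative choice.

```python
from fractions import Fraction
from itertools import combinations

def P(event):
    """Hypothetical probability function: atom n carries weight 2^-(n+1)."""
    return sum((Fraction(1, 2 ** (n + 1)) for n in event), Fraction(0))

# A sample of lattice elements: all subsets of {0..4} with at most 3 members.
atoms = range(5)
events = [frozenset(c) for r in range(4) for c in combinations(atoms, r)]

# Axiom 1: P(false) = 0, with the empty set playing the role of false.
axiom1 = P(frozenset()) == 0

# Axiom 2: monotonicity with respect to the lattice ordering (inclusion).
axiom2 = all(P(a) <= P(b) for a in events for b in events if a <= b)

# Axiom 3: P(A or B) + P(A and B) = P(A) + P(B).
axiom3 = all(
    P(a | b) + P(a & b) == P(a) + P(b) for a in events for b in events
)

# No event in the lattice is certain.
never_certain = all(P(e) < 1 for e in events)

print(axiom1, axiom2, axiom3, never_certain)
```

In this model the dual relative pseudo-complement is just set difference, and no finite event ever reaches probability 1.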

This is not terribly satisfying, though, since Bayes's theorem stops working. It fails because conditional probabilities stop working -- they arise from a forced normalization that occurs when you try to construct a lattice homomorphism between an event space and a conditionalized event space.

That is, in ordinary probability theory (where L is a Boolean algebra, and P(true) = 1), you can define a conditionalization space L|A as follows:

L|A = { X in L | X < A }

true' = A

false' = false

and' = and

or' = or

not'(X) = not(X) and A

P'(X) = P(X)/P(A)

with a lattice homomorphism X|A = X and A

Then, the probability of a conditionalized event P'(X|A) = P(X and A)/P(A), which is just what we're used to. Note that the definition of P' is forced by the fact that L|A must be a probability space. In the non-certain variant, there's no unique definition of P', so conditional probabilities are not well-defined.
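Here is a sketch of the ordinary Boolean case, assuming a fair six-sided die as the sample space (my example, chosen only to have concrete numbers). Events are subsets, the homomorphism X|A is intersection with A, and P' is P rescaled so that the new top element A gets probability 1.

```python
from fractions import Fraction

# Hypothetical sample space: one roll of a fair die.
omega = frozenset(range(1, 7))

def P(event):
    """Uniform probability on subsets of omega."""
    return Fraction(len(event), len(omega))

A = frozenset({2, 4, 6})  # conditioning event: the roll is even

def restrict(X):
    """The lattice homomorphism X|A = X and A."""
    return X & A

def P_cond(X):
    """P' on the conditionalized space L|A; X must be a subset of A."""
    return P(X) / P(A)

X = frozenset({4, 5, 6})  # the roll is at least 4

# P'(true') = 1 is what forces the division by P(A).
top_is_certain = P_cond(A) == 1

conditional = P_cond(restrict(X))  # P(X and A) / P(A)
print(top_is_certain, conditional)
```

Without a requirement like P'(true') = 1 there is nothing to pin down the scaling factor, which is the non-uniqueness the comment describes.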

To regain something like this for cointuitionistic logic, we can switch to tracking degrees of disbelief, rather than degrees of belief. Say that:

1. D(false) = 1
2. for all A, D(A) > 0
3. if A < B then D(A) >= D(B)
4. D(A or B) + D(A and B) = D(A) + D(B)

This will give you the bounds you need to nail down a conditional disbelief function. I'll leave that as an exercise for the reader.

With the graphical-network insight in hand, you can give a mathematical explanation of exactly why first-order logic has the wrong properties for the job, and express the correct solution in a compact way that captures all the common-sense details in one elegant swoop.

Consider the following example, from Menzies's "Causal Models, Token Causation, and Processes"[*]:

An assassin puts poison in the king's coffee. The bodyguard responds by pouring an antidote in the king's coffee. If the bodyguard had not put the antidote in the coffee, the king would have died. On the other hand, the antidote is fatal when taken by itself and if the poison had not been poured in first, it would have killed the king. The poison and the antidote are both lethal when taken singly but neutralize each other when taken together. In fact, the king drinks the coffee and survives.

We can model this situation with the following structural equation system:

A = true

G = A

S = (A and G) or (not-A and not-G)

where A is a boolean variable denoting whether the Assassin put poison in the coffee or not, G is a boolean variable denoting whether the Guard put the antidote in the coffee or not, and S is a boolean variable denoting whether the king Survives or not.

According to Pearl and Halpern's definition of actual causation, the assassin putting poison in the coffee causes the king to survive, since changing the assassin's action changes the king's survival when we hold the guard's action fixed. This is clearly an incorrect account of causation.
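The counterfactual test described above can be sketched directly from the three structural equations. This is only a simplified illustration of "intervene on A while holding G fixed", not a full implementation of the Halpern-Pearl definition. The `do_a` and `do_g` parameters are hypothetical names for interventions that override an equation.

```python
def survives(a, g):
    """S = (A and G) or (not-A and not-G)."""
    return (a and g) or (not a and not g)

def model(do_a=None, do_g=None):
    """Evaluate the structural equations, with optional interventions.

    Passing do_a or do_g replaces that variable's equation with a constant.
    """
    a = True if do_a is None else do_a  # A = true
    g = a if do_g is None else do_g     # G = A
    s = survives(a, g)
    return a, g, s

# What actually happened: poison, antidote, the king survives.
actual = model()

# Remove the poison and let the guard respond: no antidote, king survives.
no_poison_guard_free = model(do_a=False)

# Remove the poison but hold the guard's action at its actual value:
# the antidote alone kills the king.
no_poison_guard_fixed = model(do_a=False, do_g=True)

print(actual, no_poison_guard_free, no_poison_guard_fixed)
```

Only the third evaluation flips S, and it is that contingency that leads the definition to count the poisoning as a cause of the king's survival.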

IMO, graphical models and related techniques represent the biggest advance in thinking about causality since Lewis's work on counterfactuals (though James Heckman disagrees, which should make us a bit more circumspect). But they aren't the end of the line, even if we restrict our attention to manipulationist accounts of causality.

[*] The paper is found here. As an aside, I do not agree with Menzies's proposed resolution.

Torture vs. Dust Specks

g: that's exactly what I'm saying. In fact, you can show something stronger than that.

Suppose that we have an agent with rational preferences, and who is minimally ethical, in the sense that they always prefer fewer people with dust specks in their eyes, and fewer people being tortured. This seems to be something everyone agrees on.

Now, because they have rational preferences, we know that a bounded utility function consistent with their preferences exists. Furthermore, the fact that they are minimally ethical implies that this function is monotone in the number of people being tortured, and monotone in the number of people with dust specks in their eyes. The combination of a bound on the utility function, plus the monotonicity of their preferences, means that the utility function has a well-defined limit as the number of people with specks in their eyes goes to infinity. However, the existence of the limit doesn't tell you what it is -- it may be any value within the bounds.

Concretely, we can supply utility functions that justify either choice, and are consistent with minimal ethics. (I'll assume the bound is the [0,1] interval.) In particular, all disutility functions of the form:

U(T, S) = A(T/(T+1)) + B(S/(S+1))

satisfy minimal ethics, for all positive A and B such that A plus B is less than one. Since A and B are free parameters, you can choose them to make either specks or torture preferred.
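To see the two parameter regimes side by side, here is a sketch with two hypothetical choices of A and B (the specific values are mine). One torture has disutility A/2, while specks alone approach B from below, so everything turns on whether A/2 is at least B. A count of 10^9 stands in for "a very large number of specks".

```python
from fractions import Fraction

def disutility(T, S, A, B):
    """U(T, S) = A(T/(T+1)) + B(S/(S+1)), computed exactly."""
    return A * Fraction(T, T + 1) + B * Fraction(S, S + 1)

many = 10 ** 9  # stand-in for an enormous number of specks

# Regime 1: A/2 >= B, so one torture is worse than any number of specks.
A1, B1 = Fraction(4, 5), Fraction(1, 10)
specks_preferred = disutility(0, many, A1, B1) < disutility(1, 0, A1, B1)

# Regime 2: A/2 < B, so enough specks are worse than one torture.
A2, B2 = Fraction(1, 10), Fraction(4, 5)
torture_preferred = disutility(1, 0, A2, B2) < disutility(0, many, A2, B2)

print(specks_preferred, torture_preferred)
```

Both parameter choices satisfy A + B < 1 and are monotone in T and S, so minimal ethics alone cannot decide between them.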

Likewise, Robin and Eliezer seem to have an implicit disutility function of the form

U_ER(T, S) = AT + BS

If you normalize to get [0,1] bounds, you can make something up like

U'(T, S) = (AT + BS)/(AT + BS + 1).

Now, note U' also satisfies minimal ethics, in that if T is set to 1, then in the limit as S goes to infinity, U' will still always go to one and exceed A/(A+1). So that's why they tend to have the intuition that torture is the right answer. (Incidentally, this disproves my suggestion that bounded utility functions vitiate the force of E's argument -- but the bounds proved helpful in the end by letting us use limit analysis. So my focus on this point was accidentally correct!)

Now, consider yet another disutility function,

U''(T,S) = (ST + S)/ (ST + S + 1)

This is also minimally ethical, and doesn't have any of the free parameters that Tom didn't like. But this function also always implies a preference for any number of dust specks to even a single instance of torture.

Basically, if you think the answer is obvious, then you have to make some additional assumptions about the structure of the aggregate preference relation.

Torture vs. Dust Specks

Tom, your claim is false. Consider the disutility function

D(Torture, Specks) = [10 * (Torture/(Torture + 1))] + (Specks/(Specks + 1))

Now, with this function, disutility increases monotonically with the number of people with specks in their eyes, satisfying your "slight aggregation" requirement. However, it's also easy to see that going from 0 to 1 person tortured is worse than going from 0 to any number of people getting dust specks in their eyes, including 3^^^3.
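A quick numeric check of this function, as a sketch: 3^^^3 is far too large to write down, so 10^100 is used as a stand-in (an assumption that does no harm here, since the specks term is below 1 for every finite count).

```python
from fractions import Fraction

def D(torture, specks):
    """D = 10 * (Torture/(Torture + 1)) + Specks/(Specks + 1), exactly."""
    return 10 * Fraction(torture, torture + 1) + Fraction(specks, specks + 1)

one_torture = D(1, 0)

huge = 10 ** 100  # stand-in for 3^^^3
many_specks = D(0, huge)

# Disutility strictly increases with each additional speck.
monotone_in_specks = all(D(0, s) < D(0, s + 1) for s in range(1000))

print(one_torture, float(many_specks), monotone_in_specks)
```

The specks term never reaches 1, while a single torture already costs 5, so no number of specks can outweigh it.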

The basic objection to this kind of functional form is that it's not additive. However, it's wrong to assume an additive form, because that assumption mandates unbounded utilities, which are a bad idea, because they are not computationally realistic and admit Dutch books. With bounded utility functions, you have to confront the aggregation problem head-on, and depending on how you choose to do it, you can get different answers. Decision theory does not affirmatively tell you how to judge this problem. If you think it does, then you're wrong.

Torture vs. Dust Specks

Eliezer, both you and Robin are assuming the additivity of utility. This is not justifiable, because it is false for any computationally feasible rational agent.

If you have a bounded amount of computation to make a decision, we can see that the number of distinctions a utility function can make is in turn bounded. Concretely, if you have N bits of memory, a utility function using that much memory can distinguish at most 2^N states. Obviously, this is not compatible with additivity of disutility, because by picking enough people you can identify more distinct states than the 2^N distinctions your computational process can make.

Now, the reason for adopting additivity comes from the intuition that 1) hurting two people is at least as bad as hurting one, and 2) that people are morally equal, so that it doesn't matter which people are hurt. Note that these intuitions mathematically only require that harm should be monotone in the number of people with dust specks in their eyes. Furthermore, this requirement is compatible with the finite computation requirements -- it implies that there is a finite number of specks beyond which disutility does not increase.
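One way to picture this is a disutility stored in an N-bit register. The sketch below assumes N = 8 and a saturating counter; both are hypothetical choices for illustration, not a claim about how any real agent represents utility. The function is monotone in the number of people, takes at most 2^N distinct values, and therefore cannot be additive.

```python
N_BITS = 8                    # hypothetical memory budget
MAX_VALUE = 2 ** N_BITS - 1   # largest value an N-bit register can hold

def saturating_disutility(n_people):
    """Monotone disutility that fits in N_BITS: counts up, then stays flat."""
    return min(n_people, MAX_VALUE)

values = [saturating_disutility(n) for n in range(1000)]

# At most 2^N distinct values, however many people are involved.
distinct = len(set(values))

# More people harmed is never judged better.
monotone = all(a <= b for a, b in zip(values, values[1:]))

# Additivity fails once the register saturates.
additive = saturating_disutility(200 + 200) == 2 * saturating_disutility(200)

print(distinct, monotone, additive)
```

Past 255 people the function is flat, which is the "finite number of specks beyond which disutility does not increase".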

If we want to generalize away from the specific number N of bits we have available, we can take an order-theoretic viewpoint, and simply require that all increasing chains of utilities have limits. (As an aside, this idea lies at the heart of the denotational semantics of programming languages.) This forms a natural restriction on the domain of utility functions, corresponding to the idea that utility functions are bounded.

Torture vs. Dust Specks

Eliezer, in your response to g, are you suggesting that we should strive to ensure that our probability distribution over possible beliefs sums to 1? If so, I disagree: I don't think this can be considered a plausible requirement for rationality. When you have no information about the distribution, you ought to assign probabilities uniformly, according to Laplace's principle of indifference. But the principle of indifference only works for distributions over finite sets. So for infinite sets you have to make an arbitrary choice of distribution, which violates indifference.