In Probability Space & Aumann Agreement, I wrote that probabilities can be thought of as weights that we assign to possible world-histories. But what are these weights supposed to mean? Here I’ll give a few interpretations that I've considered and held at one point or another, and their problems. (Note that in the previous post, I implicitly used the first interpretation in the following list, since that seems to be the mainstream view.)

  1. Only one possible world is real, and probabilities represent beliefs about which one is real.
    • Which world gets to be real seems arbitrary.
    • Most possible worlds are lifeless, so we’d have to be really lucky to be alive.
    • We have no information about the process that determines which world gets to be real, so how can we decide what the probability mass function p should be? 
  2. All possible worlds are real, and probabilities represent beliefs about which one I’m in.
    • Before I’ve observed anything, there seems to be no reason to believe that I’m more likely to be in one world than another, but we can’t let all their weights be equal.
  3. Not all possible worlds are equally real, and probabilities represent “how real” each world is. (This is also sometimes called the “measure” or “reality fluid” view.)
    • Which worlds get to be “more real” seems arbitrary.
    • Before we observe anything, we don't have any information about the process that determines the amount of “reality fluid” in each world, so how can we decide what the probability mass function p should be?
  4. All possible worlds are real, and probabilities represent how much I care about each world. (To make sense of this, recall that these probabilities are ultimately multiplied with utilities to form expected utilities in standard decision theories.)
    • Which worlds I care more or less about seems arbitrary. But perhaps this is less of a problem because I’m “allowed” to have arbitrary values.
    • Or, from another perspective, this drops another hard problem on top of the pile of problems called “values”, where it may never be solved.
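
To make the parenthetical in interpretation 4 concrete, here is a minimal sketch of how the weights enter an expected-utility calculation. The worlds, actions, and numbers are hypothetical and chosen only for illustration. The arithmetic is the same whether the weights are read as beliefs, as "reality fluid", or as amounts of caring.

```python
def expected_utility(weights, utilities):
    """Sum of weight(world) * utility(world) over all worlds."""
    return sum(weights[w] * utilities[w] for w in weights)


def best_action(weights, utility_by_action):
    """Return the action with the largest weighted utility."""
    return max(
        utility_by_action,
        key=lambda action: expected_utility(weights, utility_by_action[action]),
    )


# Hypothetical toy example: two possible worlds and two actions.
weights = {"world_A": 0.75, "world_B": 0.25}
utility_by_action = {
    "bet_on_A": {"world_A": 1.0, "world_B": -1.0},
    "bet_on_B": {"world_A": -1.0, "world_B": 1.0},
}

choice = best_action(weights, utility_by_action)

# Multiplying every weight by the same positive constant leaves the ranking
# of actions unchanged, so only the relative weights matter for the decision.
rescaled = {world: 10 * weight for world, weight in weights.items()}
choice_rescaled = best_action(rescaled, utility_by_action)
```

In this sketch nothing in the calculation depends on which interpretation of the weights is intended; the interpretations differ only in where the numbers are supposed to come from.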

As you can see, I think the main problem with all of these interpretations is arbitrariness. The unconditioned probability mass function is supposed to represent my beliefs before I have observed anything in the world, so it must represent a state of total ignorance. But there seems to be no way to specify such a function without introducing some information, which anyone could infer by looking at the function.

For example, suppose we use a universal distribution, where we believe that the world-history is the output of a universal Turing machine given a uniformly random input tape. But then the distribution contains the information of which UTM we used. Where did that information come from?
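
One common way to write down such a distribution, given here as a sketch in standard notation rather than as the exact definition intended above, assumes a prefix-free universal machine $U$ and sums over the programs $p$ that make it output $x$:

```latex
m_U(x) \;=\; \sum_{p \,:\, U(p) = x} 2^{-|p|}
\qquad\text{and, for any two universal machines } U, V:\qquad
m_U(x) \;\ge\; c_{U,V}\, m_V(x) \quad \text{for all } x,\ \text{with } c_{U,V} > 0 .
```

Changing the machine changes the weights only up to a multiplicative constant, but that constant depends on the pair of machines chosen, which is one way of seeing where the extra information enters.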

One could argue that we do have some information even before we observe anything, because we're products of evolution, which would have built some useful information into our genes. But to the extent that we can trust the prior specified by our genes, it must be that evolution approximates a Bayesian updating process, and our prior distribution approximates the posterior distribution of such a process. The "prior of evolution" still has to represent a state of total ignorance.

These considerations lead me to lean toward the last interpretation, which is the most tolerant of arbitrariness. This interpretation also fits well with the idea that expected utility maximization with Bayesian updating is just an approximation of UDT that works in most situations. I and others have already motivated UDT by considering situations where Bayesian updating doesn't work, but it seems to me that even if we set those aside, there is still reason to consider a UDT-like interpretation of probability where the weights on possible worlds represent how much we care about those worlds.


In order to answer questions like "What are X, anyway?", we can (phenomenologically) turn the question into something like "What can we do with X?" or "What consequences does X have?"

For example, consider the question "What are ordered pairs, anyway?". Sometimes you see "definitions" of ordered pairs in terms of set theory. Wikipedia says that the standard definition of ordered pairs is:

(a, b) := {{a}, {a, b}}

Many mathematicians find this "definition" unsatisfactory, and view it not as a definition, but an encoding or translation. The category-theoretic notion of a product might be more satisfactory. It pins down the properties that the ordered pair already had before the "definition" was proposed and in what sense ANY construction with those properties could be used. Lambda calculus has a couple constructions that look superficially quite different from the set-theory ones, but satisfy the category-theoretic requirements.
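
As a rough illustration of the point that quite different constructions can play the role of an ordered pair, the sketch below implements the set-theoretic encoding with Python frozensets and a lambda-calculus-style (Church) encoding with closures. The function names are hypothetical, and the components are assumed to be hashable values. Both versions satisfy the same two projection laws, which is the property the category-theoretic view singles out.

```python
def kuratowski_pair(a, b):
    """Set-theoretic encoding (a, b) := {{a}, {a, b}}."""
    return frozenset({frozenset({a}), frozenset({a, b})})


def kuratowski_first(pair):
    # The first component is the single element shared by every member set.
    (a,) = frozenset.intersection(*pair)
    return a


def kuratowski_second(pair):
    # The second component is whatever else appears in the union; if nothing
    # else appears, the pair was (a, a) and the second component is a itself.
    a = kuratowski_first(pair)
    rest = frozenset.union(*pair) - {a}
    return next(iter(rest)) if rest else a


def church_pair(a, b):
    """Lambda-calculus-style encoding: a pair is a function awaiting a selector."""
    return lambda selector: selector(a, b)


def church_first(pair):
    return pair(lambda a, b: a)


def church_second(pair):
    return pair(lambda a, b: b)
```

The two encodings look nothing alike as data, yet `first(pair(a, b))` gives back `a` and `second(pair(a, b))` gives back `b` in both, which is all that a user of ordered pairs relies on.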

I guess this is a response at the meta level, recommending this sort of "phenomenological" lens as the way to resolve these sorts of questions.

... as does the set-theoretic one. ETA: Now that I read more closely, you didn't imply otherwise.

This word "possible" carries a LOT of hidden baggage. If math tells us anything, it's that LOTS of things SEEM possible to us because we aren't logically omniscient but aren't really possible.

While we're at it, how about we drop "worlds" from the mix. I don't think it adds anything. If we replace it with "information flows" do things work better?

Do you mean something precise by "information flows"?

Wei Dai
Possible world is a standard term in several related fields, such as philosophy and linguistics. Are you arguing against my particular usage, or all usage of the term in general?

Lumping probabilities in with utilities sounds pretty close to Vladimir Nesov's Representing Preference by Probability Measures.

Copied from a chat where I tried to explain interpretations 3 and 4 a bit more:

I'm not sure what it means for a world to be more real either, but to the extent the idea makes sense in the many worlds interpretation of quantum mechanics (where some Everett branches are somehow "more real" or "exist more") it seems reasonable to extend that to other mathematical structures. One intuition pump is to imagine that the multiverse literally consists of an infinite collection of universal Turing machines, each initialized with a random input tape. So that's #3 i

... (read more)

You're getting yourself in trouble because you assume that puzzling questions must have deep answers, when usually the question itself is flawed or misleading. In this case there just doesn't seem to be a need for any explanation of the kind you offer, nor would one be of any use anyway.

These 'explanations' you offer of probability aren't really explaining anything. Certainly we do successfully use probability to reason about systems that behave in a deterministic classical fashion (rolling dice probably counts). No matter what sort of probability you believe in you hav... (read more)


All possible worlds are real, and probabilities represent how much I care about each world.

Right, so maybe we need to rethink this whole rationality thing, then? I mean, since there are possible worlds where god exists, under this view, the only difference between a creationist and a rational atheist is one of taste?

To me, the god world seems much easier to deal with and more pleasant. So why not shun rationality altogether if probabilities are actually arbitrary - if thinking it really does make it so?

Wei Dai
In this view, rationality doesn't play a role in choosing the initial weights on the possible universes. That job would be handed over to moral philosophy, just like choosing the right utility function already is. No, thinking it doesn't make it so. Even in this view, the right beliefs and decisions aren't arbitrary, because they depend in a lawful way on your preferences. You still want to be rational in order to make the best decisions to satisfy your preferences.
Right, but I don't actually have a strong preference for the simplicity prior that science uses: if I can just choose what kind of reality to endorse - and there is really no fact of the matter about which one is real - it seems silly to endorse the reality based on the occam prior of science. According to science - i.e. according to the probability distribution you get from updating the complexity/occam prior with the evidence - the world is allowed to do lots of horrible things to me, like kill me. It would be much more pleasant to endorse some other prior - for example, one where everything just happens to work out to match my preferences - the "wishful thinking" prior. In general, if there is no fact of the matter about what is real, then why would anyone bother to endorse anything other than their own personal wishful thinking as real? It would seem to be irrational not to.
Wei Dai
Presumably you don't do that because that's not your actual prior - you don't just care about one particular possible world where things happen to turn out exactly the way you want. You also care about other possible worlds and want to make decisions in ways that make those worlds better. It would be for the same reason that you don't change your utility function to give everything an infinite utility.
Presumably there are infinitely many possible worlds where things happen to turn out exactly the way I want: I care about some small finite subset of the world, and the rest is allowed to vary. Why should I expend energy worrying about one particular infinity of worlds that are hard to optimize when I have already got infinitely many where I win easily or by default? There are presumably also infinitely many possible worlds where all varieties of bizarre decision/action algorithms are the way to win. For example, the world where the extent to which your preferences get satisfied is determined by what fraction of your skin is covered in red body paint, etc, etc. Also, there are other classes of worlds where I lose: for example, anti-inductive worlds. Why should I pay special attention to the worlds that loosely obey the occam/complexity prior? Perhaps I could frame it this way: the complexity prior is (in fact) counterintuitive and alien to the human mind. Why should I pay special attention to worlds that conform to it (simple worlds)? The answer I used to have was "because it works", which seemed to cash out as "if I use a complexity prior to repeatedly make decisions, then my subjective experience will be (mostly) of winning", which I used to think was because the Real world that we live in is, in fact, a simple one, rather than a wishful-thinking one, a red-body-paint one, or an anti-inductive one.
Wei Dai
It sounds like you're assuming that people use a wishful-thinking prior by default, and have to be argued into a complexity-based prior. This seems implausible to me. I think the phenomenon of wishful thinking doesn't come from one's prior, but from evolution being too stupid to design a rational decision process. That is, a part of my brain rewards me for increasing the anticipation of positive future experiences, even if that increase is caused by faulty reasoning instead of good decisions. This causes me to engage in wishful thinking (i.e., miscalculating the implications of my prior) in order to increase my reward. I dispute this. Sure, some of the implications of the complexity prior are counterintuitive, but it would be surprising if none of them were. I mean, some theorems of number theory are counterintuitive, but that doesn't mean integers are alien to the human mind. Suppose someone gave you a water-tight argument that all possible worlds are in fact real, and you have to make decisions based on which worlds you care more about. Would you really adopt the "wishful-thinking" prior and start putting all your money into lottery tickets or something similar, or would your behavior be more or less unaffected? If it's the latter, don't you already care more about worlds that are simple? Perhaps this is just one of the ways an algorithm that cares about each world in proportion to its inverse complexity could feel from the inside?
this is a good point, I'll have to think about it.
I think that there would be a question about what "I" would actually experience. There have been times in my younger days when I tried a bit of wishful thinking - I think everyone has. Maybe, just maybe, if I wish hard enough for X, X will happen? Well what you actually experience after doing that is ... failure. Wishing for something doesn't make it happen - or if it does in some worlds, then I have evidence that I don't inhabit those worlds. So I suppose I am using my memory - which points to me having always been in a world that behaves exactly as the complexity prior would predict - as evidence that the thread of my subjective experience will always be in a world that behaves as the complexity prior would predict, which is sort of like saying that only one particular simple world is real.
You don't believe in affirmations? The self-help books about the power of positive thinking don't work for you? What do you make of the following quote? "Personal optimism correlates strongly with self-esteem, with psychological well-being and with physical and mental health. Optimism has been shown to be correlated with better immune systems in healthy people who have been subjected to stress." *
This is not the kind of wishful thinking I was talking about: I was talking about wishing for $1000 and it just appearing in your bank account.
When crafting one's wishes, one should have at least some minor element of realism. Also, your wish should be something your subconscious can help you with. For example, instead of wishfully thinking about money appearing in your bank account, you could wishfully think about finding it on the sidewalk. Or, alternatively you could wishfully think about yourself as a money magnet. If you previously did not bear such points in mind, you might want to consider revisiting the technique, to see if you can make something of it. Unless you figure you are already too optimistic, that is.
Probability is Subjectively Objective.
Isn't that conflating instrumental rationality and epistemic rationality?
Epistemic rationality can be seen as a kind of instrumental rationality. See Scoring rule, Epistemic vs. Instrumental Rationality: Approximations.
You seem to be confusing plausibility with possibility. The existence of God seems plausible to many people, but whether or not the existence of God is truly possible is not clear. Reasonable people believe that God is impossible, others that God is possible, and others that God is necessary (i.e. God's nonexistence is impossible).
Well, there are many weird and wonderful gods that are indeed possible, even if the particular one that many people profess to believe in is self-contradictory, and therefore incoherent.
It wouldn't quite throw all of our shit in the fan. If you know you're living in a QM many worlds universe you still have to optimize the Born probabilities, for example. I think we can rule out the popular religions as being impossible worlds, but simulated worlds are possible worlds, and in some subset of them, you can know this. In the ones where you can differentiate to some degree, there are certainly actions that one could take to help his 'simulated' selves at the cost of the 'nonsimulated' selves, if you cared. I guess the question is whether it's even consistent to care about being "simulated" or not, and where you draw the line (what if you have some information rate in from the outside and have some influence over it? What if it's the exact same hardware just plugged in like in 'the matrix'?) My guess is that it is gonna turn out to not make any sense to care about them differently, and that there's some natural weighting which we haven't yet figured out. Maybe weight each copy by the redundancy in the processor (e.g. if each transistor is X atoms big, then that can be thought of as X copies living in the same house) or by the power they have to influence the world, or something. Both of those have problems, but I can't think of anything better.
There are possible worlds that are pretty good approximations to popular religions. I don't understand this...
True... The paper does a much more thorough job than I, but the summary is that the only consistent way to carve is into Born probabilities, so you have to weight branches accordingly. I think this has to do with the amplitude squared being conserved, so that the Ebborians' equivalent would be their thickness, but I admit some confusion here. This means there's at least some sense of probability that you don't get to 'wish away', though it's still possible to only care about worlds where "X" is true (though in general you actually do care about the other worlds).
There are plenty of possible worlds (infinitely many of them) where quantum mechanics is false; so I don't see how this helps.
It means that if you are in one, probability does not come down to only preferences. I suppose that since you can never be absolutely sure you're in one, you still have to find out your weightings between worlds where there might be nothing but preferences. The other point is that I seriously doubt there's anything built into you that makes you not care about possible worlds where QM is true, so even if it does come down to 'mere preferences', you can still make mistakes. The existence of an objective weighting scheme within one set of possible worlds gives me some hope of an objective weighting between all possible worlds, but not all that much, and it's not clear to me what that would be. Maybe the set of all possible worlds is countable, and each world is weighted equally?
I am not really sure what to make of weightings on possible worlds. Overall, on this issue, I think I am going to have to admit that I am thoroughly confused. By the way, do you mean "finite" here, rather than countable?
Yeah, but the confusion gets better as the worlds become more similar. How to weight between QM worlds and non-QM worlds is something I haven't even seen an attempt to explain, but how to weight within QM worlds has been explained, and how to weight in the sleeping beauty problem is quite straightforward. I meant countable, but now that you mention it I think I should have said finite. I'll have to think about this some more.

Before I’ve observed anything, there seems to be no reason to believe that I’m more likely to be in one world than another, but we can’t let all their weights be equal.

We can't? Why not? Estimating the probability of two heads on two coinflips as 25% is giving existence in worlds with heads-heads, heads-tails, tails-heads, and tails-tails equal weight. The same is true of a more complicated proposition like "There is a low probability that Bigfoot exists" - giving every possible arrangement of objects/atoms/information equal weight, and then ruling out the ones that don't result in the evidence we've observed, few of these worlds contain Bigfoot.
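
The two-coin-flip example can be spelled out as a small enumeration. This is only a sketch of the equal-weight calculation described above, with hypothetical names, and it works only because the set of worlds being counted is finite.

```python
from fractions import Fraction
from itertools import product


def uniform_probability(worlds, event):
    """Probability of an event when every listed world gets equal weight."""
    worlds = list(worlds)
    favourable = sum(1 for world in worlds if event(world))
    return Fraction(favourable, len(worlds))


# The four equally weighted worlds: HH, HT, TH, TT.
two_flips = list(product("HT", repeat=2))

p_two_heads = uniform_probability(two_flips, lambda world: world == ("H", "H"))
p_at_least_one_head = uniform_probability(two_flips, lambda world: "H" in world)
```

Here `p_two_heads` comes out as 1/4, matching the 25% in the comment. The disagreement in the replies is about what happens when the list of worlds cannot be finite.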

Without an arbitrary upper bound on complexity, there are infinitely many possible arrangements.
Scott Alexander
Theoretically, it's not infinite because of the granularity of time/space, speed of light, and so on. Practically, we can get around this because we only care about a tiny fraction of the possible variation in arrangements of the universe. In a coin flip, we only care about whether a coin is heads-up or tails-up, not the energy state of every subatomic particle in the coin. This matters in the case of a biased coin - let's say biased towards heads 66%. This, I think, is what Wei meant when he said we couldn't just give equal weights to all possible universes - the ones where the coin lands on heads and the ones where it lands on tails. But I think "universes where the coin lands on heads" and "universes where the coin lands on tails" are unnatural categories. Consider how the probability of winning the lottery isn't .5 because we choose with equal weight between the two alternatives "I win" and "I don't win". Those are unnatural categories, and instead we need to choose with equal weight between "I win", "John Q. Smith of Little Rock, Arkansas, wins", "Mary Brown of San Antonio, Texas, wins" and so on to millions of other people. The unnatural category "I don't win" contains millions of more natural categories. So on the biased coin flip, the categories "the coin lands heads" and "the coin lands tails" contain a bunch of categories of lower-level events about collisions of air molecules and coin molecules and amounts of force one can use to flip a coin, and two-thirds of those events are in the "coin lands heads" category. But among those lower-level events, you choose with equal weight. True, beneath these lower-level categories about collisions of air molecules, there are probably even lower things like vibrations of superstrings or bits in the world-simulation or whatever the lowest level of reality is, but as long as these behave mathematically I don't see why they prevent us from basing a theory of probability on the effects of low level conditions.
Wei Dai
These initial weights are supposed to be assigned before taking into account anything you have observed. But even now (under the second interpretation in my list) you can't be sure that the world you're in is finite. So, suppose there is one possible world for each integer in the set of all integers, or one possible world for each set in the class of all sets. How could one assign equal weight to all possible worlds, and have the weights add up to 1? I don't think that gets around the problem, because there is an infinite number of possible worlds where the energy state of nearly every subatomic particle encodes some valuable information.
By the same method we do calculus. Instead of a sum over the possible worlds we integrate over the possible worlds (which is an infinite sum of infinitesimally small values). For an explicit construction of how this is done, any basic calculus book is enough.
Wei Dai
My understanding is that it's possible to have a uniform distribution over a finite set, or an interval of the reals, but not over all integers, or all reals, which is why I said in the sentence before the one you quoted, "suppose there is one possible world for each integer in the set of all integers."
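
The integer case can be checked in one line, assuming the weights are required to be countably additive (the usual axiom for probability measures). If every integer gets the same weight $c$, then

```latex
\sum_{n \in \mathbb{Z}} c \;=\;
\begin{cases}
0, & c = 0,\\
\infty, & c > 0,
\end{cases}
\qquad\text{whereas, for example,}\qquad
\sum_{n \in \mathbb{Z}} \frac{2^{-|n|}}{3} \;=\; \frac{1}{3}\Bigl(1 + 2\sum_{n \ge 1} 2^{-n}\Bigr) \;=\; 1 .
```

So the total is never 1 with equal weights, while unequal weights such as the illustrative $2^{-|n|}/3$ do sum to 1, at the cost of favouring some integers over others.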
There is a 1:1 mapping between "the set of reals in [0,1]" and "the set of all reals". So take your uniform distribution on [0,1] and put it through such a mapping... and the result is non-uniform. Which pretty much kills the idea of "uniform <=> each element has the same probability as each other". There is no such thing as a continuous distribution on a set alone, it has to be on a metric space. Even if you make a metric space out of the set of all possible universes, that doesn't give you a universal prior, because you have to choose what metric it should be uniform with respect to. (Can you have a uniform "continuous" distribution without a continuum? The rationals in [0,1]?)
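
The effect of pushing a uniform distribution through a one-to-one mapping can be seen numerically. The sketch below is an illustration under stated assumptions: it uses the open interval (0, 1) rather than [0, 1], and it picks one particular bijection, tan(π(x − 1/2)), out of many possible ones. Because this map is increasing, the probability of landing in [a, b] is just the length of the preimage of [a, b].

```python
import math


def to_reals(x):
    """A bijection from the open interval (0, 1) onto the real line."""
    return math.tan(math.pi * (x - 0.5))


def from_reals(y):
    """Inverse of to_reals: maps a real number back into (0, 1)."""
    return math.atan(y) / math.pi + 0.5


def mass(a, b):
    """Probability that to_reals(X) lies in [a, b] when X is uniform on (0, 1)."""
    return from_reals(b) - from_reals(a)


mass_centre = mass(-1.0, 1.0)  # an interval of length 2 around zero
mass_far = mass(9.0, 11.0)     # an interval of the same length far from zero
```

Two intervals of equal length receive very different mass, so the image distribution (a standard Cauchy distribution, for this choice of map) is not uniform on the reals. A different bijection would give a different result, which is the dependence on a choice of metric mentioned above.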
As there is a 1:1 mapping between the set of all reals and the unit interval, we can just use the unit interval and define a uniform distribution there. Whatever distribution you choose, we can map it into the unit interval, as Pengvado said. In the case of the set of all integers I'm not completely certain. But I'd look at the set of computable reals, which we can use for much of mathematics. Normal calculus can be done with just computable reals (the set of all numbers for which there is an algorithm that provides an arbitrary decimal in finite time). So basically we have a mapping from the computable reals on the unit interval into the set of all integers. Another question is whether the uniform distribution is the entropy-maximising distribution when we consider the set of all integers. From a physical standpoint, why are you interested in countably infinite probability distributions? If we assume discrete physical laws we'd have a finite number of possible worlds; on the other hand, if we assume continuous laws we'd have an uncountably infinite number, which can be mapped into the unit interval. Off the top of my head I can imagine a set of discrete worlds of all sizes, which would be countably infinite. What other kinds of worlds could there be where this would be relevant?
(Nitpick: Spacetime isn't quantized AFAIK in standard physics, and then there are still continuous quantum amplitudes.) I thought Wei was talking about single worlds (whatever those may be), not sets of worlds. Applied to sets of worlds, this seems correct.
Yvain said the finiteness well, but I think the "infinitely many possible arrangements" needs a little elaboration. In any continuous probability distribution we have infinitely many (actually uncountably infinitely many) possibilities, and this makes the probability of any single outcome 0. This is the reason why, in the case of continuous distributions, we talk about the probability of the outcome being in a certain interval (a collection of infinitely many arrangements). So instead of counting the individual arrangements we calculate integrals over some set of arrangements. Infinitely many arrangements is no hindrance to applying probability theory. Actually, if we can assume a continuous distribution it makes some things much easier.
Good point. Does this work over all infinite sets, though? Integers? Rationals?
It does work. Actually, if we're using integers (there are as many integers as rationals, so we don't need to care about the latter set) we get the good old discrete probability distribution, where we have either a finite number of possibilities or at most a countable infinity of possibilities, e.g. the set of all integers. The real numbers are a strictly larger set than the integers, so in a continuous distribution we have, in a sense, more possibilities than in a countably infinite discrete distribution.

Hmmm - caring as a part of reality? Why not just flip things up, and consider that emotion is also part of reality. Random by any other name. Try to exclude it and you'll find you can't no matter how infinitely many worlds you suppose. There's also calculus to irrationality . . .

The "caring" interpretation doesn't say that caring is part of reality (except insofar as minds are implemented in reality). Rather, it says that probability isn't part of reality, it's part of decision theory (again except insofar as minds are implemented in reality).
cool! but can you really posit artificial intelligence (decision theory has to get enacted somewhere) and not allow mind as part of reality?

All possible worlds are real, and probabilities represent how much I care about each world. ... Which worlds I care more or less about seems arbitrary.

This view seems appealing to me, because 1) deciding that all possible worlds are real seems to follow from the Copernican principle, and 2) if all worlds are real from the perspective of their observers, as you said it seems arbitrary to say which worlds are more real.

But on this view, what do I do with the observed frequencies of past events? Whenever I've flipped a coin, heads has come up about half the time. If I accept option 4, am I giving up on the idea that these regularities mean anything?

What does real even mean, by the way? Interpretation 1 with real taken to mean ‘of or pertaining to the world I'm in’ (as I would) is equivalent to Interpretation 2 with real taken to mean ‘possible’ (as Tegmark would, IIUC) and to Interpretation 3 with real taken to mean ‘likely’ and to Interpretation 4 with real taken to mean ‘important to me’.

It depends. We use the term "probability" to cover a variety of different things, which can be handled by similar mathematics but are not the same.

For example, suppose that I'm playing blackjack. Given a certain disposition of cards, I can calculate a probability that asking for the next card will bust me. In this case the state of the world is fixed, and probability measures my ignorance. The fact that I don't know which card would be dealt to me doesn't change the fact that there's a specific card on the top of the deck waiting to be dealt.... (read more)

2 and 4 are much the same if you only care about worlds you are in.

The post would be much better if a definition of "possible world" were given. While giving definitions, it would perhaps also be beneficial to define precisely what "real" means.

More or less, I interpret "reality" as all things which can be observed. "Possible", in my language, is something which I can imagine and which doesn't contradict facts that I already know. This is a somewhat subjective definition, but possibility obviously depends on subjective knowledge. I have flipped a coin. Before I have looked at the result... (read more)

All possible worlds are real, and probabilities represent how much I care about each world.

Could you elaborate on what it means to have a given amount of "care" about a world? For example, suppose that I assign (or ought to assign) probability 0.5 to a coin's coming up heads. How do you translate this probability assignment into language involving amounts of care for worlds?

You care equally for your selves that see heads and your selves that see tails. If you don't care what happens to you after you see heads, then you would assign probability one to tails. Of course, you'd be wrong in about half the worlds, but hey, no skin off your nose. You're the one who sees tails. Those other guys ... they don't matter.
A bizarre interpretation. For example, caring about "living until tomorrow" does not normally mean assigning a zero probability to death in the interim. If anything that would tend to make you fearless - indifferent to whether you stepped in front of a bus or not - the very opposite of what we normally mean by "caring" about some outcome.
Thanks. That makes it a lot clearer. It seems like this "caring" could be analyzed a lot more, though. For example, suppose I were an altruist who continued to care about the "heads" worlds even after I learned that I'm not in them. Wouldn't I still assign probability ~1 to the proposition that the coin came up tails in my own world? What does that probability assignment of ~1 mean in that case? I suppose the idea is that a probability captures not only how much I care about a world, but also how much I think that I can influence that world by acting on my values.
Wei Dai
See for more details. Many of my later posts can be considered explanations/justifications for the "design choices" I made in that post.

Why should probabilities mean anything? How would you behave differently if you decided (or learned) a given interpretation was correct?

As long as there's no difference, and your actions add up to normality under any of the interpretations, then I don't see why an interpretation is needed at all.

Wei Dai
The different interpretations suggest different approaches to answer the question of "what is the right prior?" and also different approaches to decision theory. I mentioned that the "caring" interpretation fits well with UDT.
Can't you choose your (arational) preferences to get any behaviour (decision theory) no matter what interpretation you choose?
Wei Dai
Preferences may be arational, but they're not completely arbitrary. In moral philosophy there are still arguments for what one's preferences should be, even if they are generally much weaker than the arguments in rationality. Different interpretations influence what kinds of arguments apply or make sense to you, and therefore influence your preferences.
How can there be arguments about what preferences should be? Aren't they, well, a sort of unmoved mover, a primal cause? (To use some erstwhile philosophical terms :-) I can understand meta-arguments that say your preferences should be consistent in some sense, or that argue about subgoal preferences given some supergoals. But even under strict constraints of that kind, you have a lot of latitude, from humans to paperclip maximizers on out. Within that range, does interpreting probabilities differently really give you extra power you can't get by finetuning your prefs? Edit: the reason I'd prefer editing prefs is that talking about the Meaning of Probabilities sets off my materialism sensors. It leads to things like multiple-world theories because they're easy to think about as an interpretation of QM, regardless of whether they actually exist. Then they can actually negatively affect our prefs or behavior.
Wei Dai
Well, I don't know what many of my preferences should be. How can I find out except by looking for and listening to arguments? No, not for humans anyway.
That implies there's some objectively-definable standard for preferences which you'll be able to recognize once you see it. Also, it begs the question of what in your current preferences says "I have to go out and get some more/different preferences!" From a goal-driven intelligence's POV, asking others to modify your prefs in unspecified ways is pretty much the anti-rational act.
Wei Dai
I think we need to distinguish between what a rational agent should do, and what a non-rational human should do to become more rational. Nesov's reply to you also concerns the former, I think, but I'm more interested in the latter here. Unlike a rational agent, we don't have well-defined preferences, and the preferences that we think we have can be changed by arguments. What to do about this situation? Should we stop thinking up or listening to arguments, and just fill in the fuzzy parts of our preferences with randomness or indifference, in order to emulate a rational agent in the most direct manner possible? That doesn't make much sense to me. I'm not sure what we should do exactly, but whatever it is, it seems like arguments must make up a large part of it.
That arguments modify preference means that you are (denotationally) arriving at different preferences depending on arguments. This means that, from the perspective of a specific given preference (or "true" neutral preference not biased by specific arguments), you fail to obtain an optimal rational decision algorithm, and thus to achieve a high-preference strategy. But at the same time, "absence of action" is also an action, so not exploring the arguments may well be a worse choice, since you won't be moving forward towards a clearer understanding of your own preference, even if the preference that you are going to understand will be somewhat biased compared to the unknown original one. Thus, there is a tradeoff:
* Irrational perception of arguments leads to modification of preference, which is bad for the original preference, but
* Considering moral arguments leads to a clearer understanding of some preference close to the original one, which allows more rational decisions to be made, which is good for the original preference.
Please see my reply to Nesov above, too. I think we shouldn't try to emulate rational agents at all, in the sense that we shouldn't pretend to have rationality-style preferences and supergoals; as a matter of fact we don't have them. Up to here we seem to agree, we just use different terminology. I just don't want to conflate rational preferences with human preferences because the two systems behave very differently. Just as an example, in signalling theories of behaviour, you may consciously believe that your preferences are very different from what your behaviour is actually optimizing for when no one is looking. A rational agent wouldn't normally have separate conscious/unconscious minds unless only the conscious part was subject to outside inspection. In this example, it makes sense to update signalling-preferences sometimes, because they're not your actual acting-preferences. But if you consciously intend to act out your (conscious) preferences, and also intend to keep changing them in not-always-foreseeable ways, then that isn't rationality, and when there could be confusion due to context (such as on LW most of the time) I'd prefer not to use the term "preferences" about humans, or to make clear what is meant.
FWIW, my preferences have not been changed by arguments in the last 20 years. So I don't think your "we" includes me.
As an example, consider the arguments in form of proofs/disproofs of the statements that you are interested in. Information doesn't necessarily "change" or "determine arbitrarily" the things you take from it, it may help you to compute an object in which you are already interested, without changing that object, and at the same time be essential in moving forward. If you have an algorithm, it doesn't mean that you know what this algorithm will give you in the end, what the algorithm "means". Resist the illusion of transparency.
I don't understand what you're saying as applied to this argument. That Wei Dai has an algorithm for modifying his preferences and he doesn't know what the end output of that algorithm will be?
There will always be something about preference that you don't know, and it's not the question of modifying preference, it's a question of figuring out what the fixed unmodifiable preference implies. Modifying preference is exactly the wrong way of going about this. If we figure out the conceptual issues of FAI, we'd basically have the algorithm that is our preferences, but not in infinite and unknowable normal "execution trace" denotational "form".
As Wei says below, we should consider rational agents (who have explicit preferences separate from the rest of their cognitive architecture) separately from humans who want to approximate that in some ways. I think that if we first define separate preferences, and then proceed to modify them over and over again, this is so different from rational agents that we shouldn't call it preferences at all. We can talk about e.g. morals instead, or about habits, or biases. On the other hand if we define human preferences as 'whatever human behavior happens to optimize', then there's nothing interesting about changing our preferences, this is something that happens all the time whether we want it to or not. Under this definition Wei's statement that he deliberately makes it happen is unclear (the totality of a human's behaviour, knowledge, etc. is subtly changing over time in any case) so I assumed he was using the former definition.
There is no clear-cut dichotomy between defining something completely at the beginning and doing things arbitrarily as we go. Instead of defining preference for rational agents, in a complete, finished form, and then seeing what happens, consider a process of figuring out what preference is. This is neither a way to arrive at the final answer, at any point, nor a history of observing "whatever happens". A rational agent is an impossible construct, but something irrational agents aspire to be, never obtaining. What they want to become isn't directly related to what they "appear" to strive towards.
I understand. So you're saying we should indeed use the term 'preference' for humans (and a lot of other agents) because no really rational agents can exist. Actually, why is this true? I don't know about perfect rationality, but why shouldn't an agent exist whose preferences are completely specified and unchanging?
Right. Except that really rational agents might exist, but not if their preferences are powerful enough, as humans' have every chance to be. And whatever we irrational humans, or our godlike but still, strictly speaking, irrational FAI try to do, the concept of "preference" still needs to be there. Again, it's not about changing preference. See these comments. An agent can have a completely specified and unchanging preference, but still not know everything about it (and never be able to know everything about it). In particular, this is a consequence of the halting problem: if you have the source code of a program, this code completely specifies whether the program halts, and you may run this code for an arbitrarily long time without ever changing it, but still not know whether it halts, and never be able to figure that out, unless you are lucky enough to arrive at a solution in this particular case.
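The halting-problem point can be illustrated with a small sketch. The program below is fixed and never modified, yet running it under a finite step budget can only ever answer "it halted" or "unknown so far". The Collatz iteration and the names `collatz_step` and `halts_within` are illustrative assumptions chosen for this sketch; they do not come from the discussion itself.

```python
from typing import Optional


def collatz_step(n: int) -> int:
    """One step of the Collatz iteration: halve if even, else 3n + 1."""
    return n // 2 if n % 2 == 0 else 3 * n + 1


def halts_within(n: int, budget: int) -> Optional[bool]:
    """Run the fixed program from n for at most `budget` steps.

    Returns True if the iteration reached 1 within the budget.
    Returns None if the budget ran out: that is "unknown", not "never halts".
    """
    steps = 0
    while n != 1:
        if steps >= budget:
            return None  # out of resources; the question stays open
        n = collatz_step(n)
        steps += 1
    return True


if __name__ == "__main__":
    print(halts_within(27, 10))   # budget too small: None
    print(halts_within(27, 200))  # larger budget settles it: True
```

The source of `halts_within` is the same in both calls; only the amount of computation spent differs. No finite budget can turn a `None` into a definite "never halts", which is the sense in which a fully specified object can still have unknown properties.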
OK, I understand now what you're saying. I think the main difference, then, between preferences in humans and in perfect (theoretical) agents is that our preferences aren't separate from the rest of our mind.
I don't understand this point.
Rational (designed) agents can have an architecture with preferences (decision making parts) separate from other pieces of their minds (memory, calculations, planning, etc.) Then it's easy (well, easier) to reason about changing their preferences because we can hold the other parts constant. We can ask things like "given what this agent knows, how would it behave under preference system X"? The agent may also be able to simulate proposed modifications to its preferences without having to simulate its entire mind (which would be expensive). And, indeed, a sufficiently simple preference system may be chosen so that it is not subject to the halting problem and can be reasoned about. In humans though, preferences and every other part of our minds influence one another. While I'm holding a philosophical discussion about morality and deciding how to update my so-called preferences, my decisions happen to be affected by hunger or tiredness or remembering having had good sex last night. There are lots of biases that are not perceived directly. We can't make rational decisions easily. In rational agents that self-modify their preferences, the new prefs are determined by the old prefs, i.e. via second-order prefs. But in humans prefs are potentially determined by the entire state of mind, so perhaps we should talk about "modifying our minds" and not our prefs, since it's hard to completely exclude most of our mind from the process.
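As a rough sketch of the "separate preferences" architecture described in this comment, the toy agent below keeps its beliefs and its utility function in separate fields, so the question "given what this agent knows, how would it behave under preference system X?" becomes a cheap substitution. The class `Agent`, its fields, and the example actions and outcomes are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Beliefs map each action to a probability distribution over outcomes.
Beliefs = Dict[str, Dict[str, float]]
Utility = Callable[[str], float]


@dataclass(frozen=True)
class Agent:
    beliefs: Beliefs   # what the agent knows (held constant below)
    utility: Utility   # the separable preference component

    def expected_utility(self, action: str) -> float:
        """Sum of probability times utility over the action's outcomes."""
        return sum(p * self.utility(o) for o, p in self.beliefs[action].items())

    def choose(self) -> str:
        """Pick the action with the highest expected utility."""
        return max(self.beliefs, key=self.expected_utility)

    def with_preferences(self, utility: Utility) -> "Agent":
        """Same beliefs, different preferences: a hypothetical variant."""
        return Agent(self.beliefs, utility)


beliefs: Beliefs = {
    "work": {"money": 0.9, "rest": 0.1},
    "nap": {"money": 0.0, "rest": 1.0},
}


def likes_money(outcome: str) -> float:
    return 1.0 if outcome == "money" else 0.0


def likes_rest(outcome: str) -> float:
    return 1.0 if outcome == "rest" else 0.0


agent = Agent(beliefs, likes_money)
variant = agent.with_preferences(likes_rest)

if __name__ == "__main__":
    print(agent.choose())    # work
    print(variant.choose())  # nap
```

Nothing comparable is available for a mind in which the utility function cannot be factored out of the rest of the state, which is the contrast with humans that the comment draws.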
As per Pei Wang's suggestion, I'm stating that I'm going to opt out of this conversation until you take seriously (accept/investigate/argue against) the statement that preference is not to be modified, something that I stressed in several of the last comments.
There are other relevant differences as well, of course. For instance, a good rational agent would be able to literally rewrite its preferences, while humans have trouble with self-binding their future selves.
Re: "How can there be arguments about what preferences should be?" The idea that some preferences are "better" than other ones is known as "moral realism".
Wikipedia says moral realists (in general) claim that moral propositions can be true or false as objective facts but their truth cannot be observed or verified. This doesn't make any sense. Sounds like religion.
Are you looking at ...? Care to quote an offending section about moral truths not being observable or verifiable?
Under the section "Criticisms": Regarding the emotivist criticism, it begs a lot of questions. Surely not all negative emotional reactions signal wrong moral actions. Besides, emotivism isn't aligned with moral realism.
I see - thanks. That some criticisms of moral realism appear to lack coherence does not seem to me to be a point that counts against the idea. I expect moral realists would deny that morality is any more nonmaterial than any other kind of information - and would also deny that it does not appear to be accessible to the scientific method.
If moral realism acts as a system of logical propositions and deductions, then it has to have moral axioms. How are these grounded in material reality? How can they be anything more than "because I said so and I hope you'll agree"? Isn't the choice of axioms done using a moral theory nominally opposed to moral realism, such as emotivism, or (amoral) utilitarianism?
One way would be to consider the future of civilization. At the moment, we observe a Shifting Moral Zeitgeist. However, in the future we may see ideas about how to behave towards other agents settle down into an optimal region. If that turns out to be a global optimum - rather than a local one - i.e. much the same rules would be found by most surviving aliens - then that would represent a good foundation for the ideas of moral realism. Even today, it should be pretty obvious that some moral systems are "better" than others ("better" in the sense of promoting the survival of those systems). That doesn't necessarily mean there's a "best" one - but it leaves that possibility open.
It might also sound like science - don't scientists generally claim that propositions about the world can be true or false, but cannot be directly observed or verified? Joshua Greene's thesis "The Terrible, Horrible, No Good, Very Bad Truth about Morality and What to Do About it" might be a decent introduction to moral realism / irrealism. Overall it is an argument for irrealism.
In science, a proposition about the world can generally be proven or disproven with arbitrary probability, so you can become as sure about it as you like if you invest enough resources. In moral realism, propositions are purely logical constructs, and can be proven true or false just like a mathematical proposition. Their truth is one with the truth of the axioms used, and the axioms can't be proven or disproven with any degree of certainty; they are simply accepted or not accepted. The morality is internally consistent, but you can't derive it from the real world, and you can't derive any fact about the real world from the morality. That sounds just like theology to me. (The difference between this and ordinary math or logic, is that mathematical constructs aren't supposed to lead to should or ought statements about behavior.) I will read Greene's thesis, but as far as I can tell it argues against moral realism (and does it well), so it won't help me understand why anyone would believe in it.
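One way to make the contrast in this comment concrete is a minimal Bayesian sketch. It assumes independent observations that each favor the hypothesis by a fixed likelihood ratio; the function name `posterior_after` and the numbers used are illustrative assumptions, not anything stated in the thread.

```python
def posterior_after(prior: float, likelihood_ratio: float, n: int) -> float:
    """Posterior probability after n independent observations.

    Works in odds form: posterior odds = prior odds * likelihood_ratio ** n.
    Assumes 0 < prior < 1 and likelihood_ratio > 0.
    """
    odds = prior / (1.0 - prior) * likelihood_ratio ** n
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Evidence that discriminates (ratio 2): confidence grows with effort.
    for n in (0, 10, 20):
        print(n, posterior_after(0.5, 2.0, n))
    # "Evidence" that does not discriminate (ratio 1): no amount helps.
    print(posterior_after(0.5, 1.0, 1000))
```

With a likelihood ratio of 2, ten observations move a 0.5 prior to 1024/1025, and more observations push it as close to 1 as desired. With a ratio of exactly 1, which is how the comment characterizes the axioms, the posterior stays at the prior however many observations are collected.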