A common background assumption on LW seems to be that it's rational to act in accordance with the dispositions one would wish to have. (Rationalists must WIN, and all that.)

E.g., Eliezer:

It is, I would say, a general principle of rationality - indeed, part of how I define rationality - that you never end up envying someone else's mere choices.  You might envy someone their genes, if Omega rewards genes, or if the genes give you a generally happier disposition.  But [two-boxing] Rachel, above, envies [one-boxing] Irene her choice, and only her choice, irrespective of what algorithm Irene used to make it.  Rachel wishes just that she had a disposition to choose differently.

And more recently, from AdamBell:

I [previously] saw Newcomb’s Problem as proof that it was sometimes beneficial to be irrational. I changed my mind when I realized that I’d been asking the wrong question. I had been asking which decision would give the best payoff at the time and saying it was rational to make that decision. Instead, I should have been asking which decision theory would lead to the greatest payoff.

Within academic philosophy, this is the position advocated by David Gauthier.  Derek Parfit has constructed some compelling counterarguments against Gauthier, so I thought I'd share them here to see what the rest of you think.

First, let's note that there definitely are possible cases where it would be "beneficial to be irrational".  For example, suppose an evil demon ('Omega') will scan your brain, assess your rational capacities, and torture you iff you surpass some minimal baseline of rationality.  In that case, it would very much be in your interests to fall below the baseline!  Or suppose you're rewarded every time you honestly believe the conclusion of some fallacious reasoning.  We can easily multiply cases here.  What's important for now is just to acknowledge this phenomenon of 'beneficial irrationality' as a genuine possibility.

This possibility poses a problem for the Eliezer-Gauthier methodology. (Quoting Eliezer again:)

Rather than starting with a concept of what is the reasonable decision, and then asking whether "reasonable" agents leave with a lot of money, start by looking at the agents who leave with a lot of money, develop a theory of which agents tend to leave with the most money, and from this theory, try to figure out what is "reasonable".

The problem, obviously, is that it's possible for irrational agents to receive externally-generated rewards for their dispositions, without this necessarily making their downstream actions any more 'reasonable'.  (At this point, you should notice the conflation of 'disposition' and 'choice' in the first quote from Eliezer.  Rachel does not envy Irene her choice at all.  What she wishes is to have the one-boxer's dispositions, so that the predictor puts a million in the first box, and then to confound all expectations by unpredictably choosing both boxes and reaping the most riches possible.)
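To make Rachel's wish concrete, here is a minimal sketch of the Newcomb payoff table that treats the predictor's forecast and the agent's actual choice as separate inputs. The $1,000 in the second box is the conventional figure and is an assumption here (the post only mentions the million); the function and variable names are hypothetical.

```python
# Payoffs in Newcomb's problem, with the predictor's forecast and the
# agent's actual choice treated as separate inputs.
MILLION = 1_000_000   # placed in the first box iff one-boxing is predicted
THOUSAND = 1_000      # assumed contents of the second box (conventional figure)

def payoff(predicted: str, chosen: str) -> int:
    """Return the agent's winnings for a (prediction, choice) pair."""
    first_box = MILLION if predicted == "one-box" else 0
    second_box = THOUSAND if chosen == "two-box" else 0
    return first_box + second_box

# Enumerate all four (prediction, choice) cells.
table = {(p, c): payoff(p, c)
         for p in ("one-box", "two-box")
         for c in ("one-box", "two-box")}

best_cell = max(table, key=table.get)
for cell, value in table.items():
    print(cell, value)
print("best cell:", best_cell)
```

The best cell is the one Rachel wants: predicted to one-box, actually two-boxing. Whether an agent can ever land in that cell is exactly what the stipulation about the predictor's accuracy puts in question.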

To illustrate, consider (a variation on) Parfit's story of the threat-fulfiller and threat-ignorer.  Tom has a transparent disposition to fulfill his threats, no matter the cost to himself.  So he straps on a bomb, walks up to his neighbour Joe, and threatens to blow them both up unless Joe shines his shoes.  Seeing that Tom means business, Joe sensibly gets to work.  Not wanting to repeat the experience, Joe later goes and pops a pill to acquire a transparent disposition to ignore threats, no matter the cost to himself. The next day, Tom sees that Joe is now a threat-ignorer, and so leaves him alone.

So far, so good.  It seems this threat-ignoring disposition was a great one for Joe to acquire.  Until one day... Tom slips up.  Due to an unexpected mental glitch, he threatens Joe again.  Joe follows his disposition and ignores the threat.  BOOM.

Here Joe's final decision seems as disastrously foolish as Tom's slip-up.  It was good to have the disposition to ignore threats, but that doesn't necessarily make it a good idea to act on it.  We need to distinguish the desirability of a disposition to X from the rationality of choosing to do X.
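The distinction can be sketched numerically. The utilities and the glitch probability below are made-up illustrative numbers, not anything from Parfit's story; only their ordering matters, and the function names are hypothetical.

```python
# Hypothetical utilities for Joe (illustrative numbers only).
U_LEFT_ALONE = 0        # Tom makes no threat
U_SHINE_SHOES = -10     # Joe complies with a threat
U_BOOM = -1_000_000     # Joe ignores a threat and Tom detonates

def disposition_value(ignores_threats: bool, p_glitch: float) -> float:
    """Joe's expected utility per day, given his transparent disposition.

    Assumes Tom always threatens a complier, and threatens a
    threat-ignorer only when he glitches (probability p_glitch).
    """
    if ignores_threats:
        return (1 - p_glitch) * U_LEFT_ALONE + p_glitch * U_BOOM
    return U_SHINE_SHOES

def action_value(ignore: bool) -> int:
    """Joe's utility once a threat has actually been made."""
    return U_BOOM if ignore else U_SHINE_SHOES

p = 1e-6  # assumed chance of Tom's glitch on a given day
print(disposition_value(True, p), disposition_value(False, p))
print(action_value(True), action_value(False))
```

With these numbers the threat-ignoring disposition is the better one to have (roughly -1 against -10 in expectation), while acting on it once a threat has been made is far worse (-1,000,000 against -10).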



Thanks for posting. Your analysis is an improvement over the LW conventional wisdom, but you still don't get it right, where right, to me, means the way it is analyzed by the guys who won all those Nobel prizes in economics. You write:

First, let's note that there definitely are possible cases where it would be "beneficial to be irrational".

But in every example you supply, what you really want is not exactly to be irrational; rather it is to be believed irrational by the other player in the game. But you don't notice this because in each of your artificial examples, the other player is effectively omniscient, so the only way to be believed irrational is to actually be irrational. But then, once the other player really believes, his strategies and actions are modified in such a way that your expected behavior (which would have been irrational if the other player had not come to believe you irrational) is now no longer irrational!

But, better yet, let's Taboo the word irrational. What you really want him to believe is that you will play some particular strategy. If he does, in fact, believe, then he will choose a particular strategy, and your own best response is t... (read more)

But in every example you supply, what you really want is not exactly to be irrational; rather it is to be believed irrational by the other player in the game.

I don't think that's the real problem: after all, Parfit's Hitchhiker and Newcomb's problem also eliminate this distinction by positing an Omega that will not be wrong in its predictions.

The real problem is that Chappell has delineated a failure mode that we don't care about. TDT/UDT are optimized for situations in which the world only cares about what you would do, not why you decide to do so. In Chappell's examples, there's no corresponding action that forms the basis of the failure; the "ritual of cognition" alone determines your punishment.

The EY article he linked to ("Newcomb's Problem and the Regret of Rationality") makes the irrelevance of these cases very clear:

Next, let's turn to the charge that Omega favors irrationalists. I can conceive of a superbeing who rewards only people born with a particular gene, regardless of their choices. I can conceive of a superbeing who rewards people whose brains inscribe the particular algorithm of "Describe your options in English and choose the l

... (read more)
It's not from LW, but here's Scott Aaronson [http://scottaaronson.com/blog/?p=30]:
| Has anyone written anything on LW about responding to Omega by randomizing? Yes [http://lesswrong.com/lw/nc/newcombs_problem_and_regret_of_rationality/1q3c?c=1] . It is often explicitly ruled out by the supplied scenario.
You get a point for that.
Yes, back when we discussed Newcomblike problems frequently I more or less used a form letter to reply to that objection. Any useful treatment of Newcomblike problems will specify explicitly or implicitly how Omega will handle (quantum) randomness if it is allowed. The obvious response for Omega is to either give you nothing (or maybe a grenade!) for being a smart ass or, more elegantly, handle the reward given in a manner commensurate with the probabilities. If probabilistic decisions are to be allowed then an Omega that can handle probabilistic decisions quite clearly needs to be supplied. I downvoted the parent. How on earth is Perplexed comparing LW conventional wisdom to that of Nobel prize winning economists when he thinks coin tossing is a big deal?
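One way to read the suggestion that Omega scale the reward to the probabilities: if you one-box with probability p, Omega fills the first box with probability p. That rule and the dollar amounts are assumptions made for this sketch, since the comment does not spell them out, and the names are hypothetical.

```python
MILLION = 1_000_000   # prize in the first box when Omega fills it
THOUSAND = 1_000      # assumed contents of the second box

def expected_payoff(p_one_box: float) -> float:
    """Expected winnings when Omega fills box one with probability p_one_box.

    Winnings are MILLION if the box was filled, plus THOUSAND if the
    agent two-boxes, so the expectation is linear in the two probabilities.
    """
    p_two_box = 1 - p_one_box
    return p_one_box * MILLION + p_two_box * THOUSAND

# Search a coarse grid of mixing probabilities.
grid = [i / 10 for i in range(11)]
best_p = max(grid, key=expected_payoff)
print(best_p, expected_payoff(best_p))
```

Under this assumed rule the expected payoff rises steadily with p, so randomizing buys nothing over simply one-boxing.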
At the risk of appearing stupid, I have to ask: exactly what is a "useful treatment of Newcomb-like problems" used for? So far, the only effect that all the Omega-talk has had on me is to make me honestly suspect that you guys must be into some kind of mind-over-matter quantum woo. Seriously, Omega is not just counterfactual, he is impossible. Why do you guys keep asking us to believe so many impossible things before breakfast? Jaynes says not to include impossible propositions among the conditions in a conditional probability. Bad things happen if you do. Impossible things need to have zero-probability priors. Omega just has no business hanging around with honest Bayesians. When I read that you all are searching for improved decision theories that "solve" the one-shot prisoner's dilemma and the one-shot Parfit hitchhiker, I just cringe. Surely you shouldn't change the standard, well-established, and correct decision theories. If you don't like the standard solutions, you should instead revise the problems from unrealistic one-shots to more realistic repeated games or perhaps even more realistic games with observers - observers who may play games with you in the future. In every case I have seen so far where Eliezer has denigrated the standard game solution because it fails to win, he has been analyzing a game involving a physically and philosophically impossible fictional situation. Let me ask the question this way: What evidence do you have that the standard solution to the one-shot PD can be improved upon without creating losses elsewhere? My impression is that you are being driven by wishful thinking and misguided intuition.

Here's another way of looking at the situation that may or may not be helpful. Suppose I ask you, right here and now, what you'd do in the hypothetical future Parfit's Hitchhiker scenario if your opponent was a regular human with Internet access. You have several options:

  1. Answer truthfully that you'd pay $100, thus proving that you don't subscribe to CDT or EDT. (This is the alternative I would choose.)

  2. Answer that you'd refuse to pay. Now you've created evidence on the Internet, and if/when you face the scenario in real life, the driver will Google your name, check the comments on LW and leave you in the desert to die. (Assume the least convenient possible world where you can't change or delete your answer once it's posted.)

  3. Answer that you'd pay up, but secretly plan to refuse. This means you'd be lying to us here in the comments - surely not a very nice thing to do. But if you subscribe to CDT with respect to utterances as well as actions, this is the alternative you're forced to choose. (Which may or may not make you uneasy about CDT.)

What makes me uneasy is the assumption I wouldn't want to pay $100 to somebody who rescued me from the desert. Given that, lying to people whom I don't really know should be a piece of cake!
I would of course choose option #1, adding that, due to an affliction giving me a trembling hand, I tend to get stranded in the desert and the like a lot and hence that I would appreciate it if he would spread the story of my honesty among other drivers. I might also promise to keep secret the fact of his own credulity in this case, should he ask me to. :) I understand quite well that the best and simplest way to appear honest is to actually be honest. And also that, as a practical matter, you never really know who might observe your selfish actions and how that might hurt you in the future. But these prudential considerations can already be incorporated into received decision theory (which, incidentally, I don't think matches up with either CDT or EDT - at least as those acronyms seem to be understood here.) We don't seem to need TDT and UDT to somehow glue them in to the foundations. Hmmm. Is EY perhaps worried that an AI might need even stronger inducements toward honesty? Maybe it would, but I don't see how you solve the problem by endowing the AI with a flawed decision theory.
...What? Also, it doesn't matter if he's impossible. He's an easy way to tack on arbitrary rules to hypotheticals without overly tortured explanations, because people are used to getting arbitrary rules from powerful agents. It's also impossible for a perfectly Absent Minded Driver to come to one of only two possible intersections with 3 destinations with known payoffs and no other choices. To say nothing of the impossibly horrible safety practices of our nation's hypothetical train system.
Are you sure? I'm not objecting to the arbitrary payoffs or complaining because he doesn't seem to be maximizing his own utility. I'm objecting to his ability to predict my actions. Give me a scenario which doesn't require me to assign a non-zero prior to woo and in which a revisionist decision theory wins. If you can't, then your "improved" decision theory is no better than woo itself. Regarding the Absent Minded Driver, I didn't recognize the reference. Googling, I find a .pdf by one of my guys (Nobelist Robert Aumann) and an LW article by Wei-Dai. Cool, but since it is already way past my bedtime, I will have to read them in the morning and get back to you.
The only 'woo' here seems to be your belief that your actions are not predictable (even in principle!). Even I can predict your actions within some tolerances, and we do not need to posit that I am a superintelligence! Examples: you will not hang yourself to death within the next five minutes, and you will ever make another comment on Less Wrong.
"ever"? No, "never".
Wha? In case it wasn't clear, it was a one-off prediction and I was already correct.
In case mine wasn't clear, it was a bad Gilbert & Sullivan joke. Deservedly downvoted. Apparently.
You need a little more context/priming or to make the joke longer for anyone to catch this. Or you need to embed it in a more substantive and sensible reply. Otherwise it will hardly ever work.
Counterexample [http://www.martyhill.net/humor.xena.html]
I'd call that a long joke, wouldn't you?
See what I mean? I made it long and it still didn't work. :)
I wasn't sure, so I held off posting my reply (a decision I now regret). It would have been, "Well, hardly ever."
Why? What about you is fundamentally logically impossible to predict? Do you not find that you often predict the actions of others? (i.e., giving them gifts that you know they'll like) And that others predict your reactions? (i.e., choosing not to give you spider-themed horror movies if you're arachnophobic)
Omega is a perfect super-intelligence, existing in a computer simulation like universe that can be modeled by a set of physical laws and a very long string of random numbers. Omega knows the laws and the numbers.
Ok, I've read the paper (most of it) and Wei-Dai's article now. Two points. 1. In a sense, I understand how you might think that the Absent Minded Driver is no less contrived and unrealistic than Newcomb's Paradox. Maybe different people have different intuitions as to what toy examples are informative and which are misleading. Someone else (on this thread?) responded to me recently with the example of frictionless pulleys and the like from physics. All I can tell you is that my intuition tells me that the AMD, the PD, frictionless pulleys, and even Parfit's Hitchhiker all strike me as admirable teaching tools, whereas Newcomb problems and the old questions of irresistible force vs immovable object in physics are simply wrong problems which can only create confusion. 2. Reading Wei-Dai's snarking about how the LW approach to decision theory (with zero published papers to date) is so superior to the confusion in which mere misguided Nobel laureates struggle - well, I almost threw up. It is extremely doubtful that I will continue posting here for long.
It wasn't meant to be a snark. I was genuinely trying to figure out how the "LW approach" might be superior, because otherwise the most likely explanation is that we're all deluded in thinking that we're making progress. I'd be happy to take any suggestions on how I could have reworded my post so that it sounded less like a snark.

Wei-Dai wrote a post entitled The Absent-Minded Driver which I labeled "snarky". Moreover, I suggested that the snarkiness was so bad as to be nauseating, so as to drive reasonable people to flee in horror from LW and SIAI. I here attempt to defend these rather startling opinions. Here is what Wei-Dai wrote that offended me:

This post examines an attempt by professional decision theorists to treat an example of time inconsistency, and asks why they failed to reach the solution (i.e., TDT/UDT) that this community has more or less converged upon. (Another aim is to introduce this example, which some of us may not be familiar with.) Before I begin, I should note that I don't think "people are crazy, the world is mad" (as Eliezer puts it) is a good explanation. Maybe people are crazy, but unless we can understand how and why people are crazy (or to put it more diplomatically, "make mistakes"), how can we know that we're not being crazy in the same way or making the same kind of mistakes?

The paper that Wei-Dai reviews is "The Absent-Minded Driver" by Robert J. Aumann, Sergiu Hart, and Motty Perry. Wei-Dai points out, rather condescendingly... (read more)

How is Wei Dai being condescending there? He's pointing out how weak it is to dismiss people with these credentials by just calling them crazy. ETA: In other words, it's an admonishment directed at LWers. That, at any rate, was my read.
I'm sure it would be Wei-Dai's read as well. The thing is, if Wei-Dai had not mistakenly come to the conclusion that the authors are wrong and not as enlightened as LWers, that admonishment would not be necessary. I'm not saying he condescends to LWers. I say he condescends to the rest of the world, particularly game theorists.
Are you essentially saying you are nauseated because Wei Dai disagreed with the authors?
No. Not at all. It is because he disagreed through the wrong channels, and then proceeded to propose rather insulting hypotheses as to why they had gotten it wrong. Just read that list of possible reasons! And there are people here arguing that "of course we want to analyze the cause of mistakes". Sheesh. No wonder folks here are so in love with Evolutionary Psychology. Ok, I'm probably going to get downvoted to hell because of that last paragraph. And, you know what, that downvoting impulse due to that paragraph pretty much makes my case for why Wei Dai was wrong to do what he did. Think about it.
Interestingly enough I think that it is this paragraph that people will downvote, and not the one above. Mind you, the premise in "No wonder folks here are so in love with Evolutionary Psychology." does seem so incredibly backward that I almost laughed. I can understand your explanation here. Without agreeing with it myself I can see how it follows from your premises.
I'm having trouble following you. Are you saying that you read him differently, and that he would somehow be misinterpreting himself? The admonishment is necessary if LWers are likely to wrongly dismiss Aumann et al. as "crazy". In other words, to think that the admonishment is necessary is to think that LWers are too inclined to dismiss other people as crazy. I got that. Who said anything about condescending to LWers?
Preliminary notes: You can call me "Wei Dai" (that's firstname lastname). "He" is ok. I have taken a graduate level course in game theory (where I got a 4.0 grade, in case you suspect that I coasted through it), and have Fudenberg and Tirole's "Game Theory" and Joyce's "Foundations of Causal Decision Theory" as two of the few physical books that I own. I can't see where they made this point. At the top of Section 4, they say "How, then, should the driver reason at the action stage?" and go on directly to describe action-optimality. If they said something like "One possibility is to just recompute and apply the planning-optimal solution. But if you insist ..." please point out where. See also page 108: If Aumann et al. endorse using planning-optimality at the action stage, why would they say the driver has some hard thinking to do? Again, why not just recompute and apply the planning-optimal solution? I also do not see how subgame perfection is relevant here. Can you explain? This footnote? Since p* is the action-optimal solution, they are pointing out the formal relationship between their notion of action-optimality and Nash equilibrium. How is this footnote an argument for "it" (it being "recomputing the planning-optimal decision at each intersection and carrying it out")?
Ok, so it is me who is convicted of condescending without having the background to justify it. :( FWIW I have never taken a course, though I have been reading in the subject for more than 45 years. My apologies. More to come on the substance.
Relevance of Subgame perfection. Selten suggested subgame perfection as a refinement of Nash equilibrium which requires that decisions that seemed rational at the planning stage ought to still seem rational at the action stage. This at least suggests that we might want to consider requiring "subgame perfection" even if we only have a single player making two successive decisions. Relevance of Footnote #4. This points out that one way to think of problems where a single player makes a series of decisions is to pretend that the problem has a series of players making the decisions - one decision per player, but that these fictitious players are linked in that they all share the same payoffs (but not necessarily the same information). This is a standard "trick" in game theory, but the footnote points out that in this case, since both fictitious players have the same information (because of the absent-mindedness) the game between driver-version-1 and driver-version-2 is symmetric, and that is equivalent to the constraint p1 = p2. Does Footnote #4 really amount to "they had already argued for [just recalculating the planning-optimal solution]"? Well, no it doesn't really. I blew it in offering that as evidence. (Still think it is cool, though!) Do they "argue for it" anywhere else? Yes, they do. Section 5, where they apply their methods to a slightly more complicated example, is an extended argument for the superiority of the planning-optimal solution to the action-optimal solutions. As they explain, there can be multiple action-optimal solutions, even if there is only one (correct) planning-optimal solution, and some of those action-optimal solutions are wrong *even though they appear to promise a higher expected payoff than does the planning optimal solution. I really don't see why you are having so much trouble parsing this. "If indeed he chose p1 , there is no problem" is an endorsement of the correctness of the planning-optimal solution. The sentence dealing with
It is a statement that the planning-optimal action is the correct one, but it's not an endorsement that it is correct to use the planning-optimality algorithm to compute what to do when you are already at an intersection. Do you see the difference? ETA (edited to add): According to my reading of that paragraph, what they actually endorse is to compute the planning-optimal action at START, remember that, then at each intersection, compute the set of action-optimal actions, and pick the element of the set that coincides with the planning-optimal action. BTW, you can use "\" to escape special characters like "*" and "_".
Thx for the escape character info. That really ought to be added to the editing help popup. Yes, I see the difference. I claim that what they are saying here is that you need to do the planning-optimal calculation in order to find p*1 as the unique best solution (among the three solutions that the action-optimal method provides). Once you have this, you can use it at the first intersection. But at the other intersections, you have some choices: either recalculate the planning-optimal solution each time, or write down enough information so that you can recognize that p*1 is the solution you are already committed to among the three (in section 5) solutions returned by the action-optimality calculation. ETA in response to your ETA. Yes they do. Good point. I'm pretty sure there are cases more complicated than this perfectly amnesiac driver where that would be the only correct policy. (ETA:To be more specific, cases where the planning-optimal solution is not a sequential equilibrium). But then I have no reason to think that UDT would yield the correct answer in those more complicated cases either.
I deleted my previous reply since it seems unnecessary given your ETA. What would be the only correct policy? What I wrote after "According to my reading of that paragraph"? If so, I don't understand your "cases where the planning-optimal solution is not a sequential equilibrium". Please explain.
Yes. I would have thought it would be self explanatory. It looks like I will need to construct and analyze examples slightly more complicated than the Absent Minded Driver. That may take a while. Questions before I start: Does UDT encompass game theory, or is it limited to analyzing single-player situations? Is UDT completely explained in your postings, or is it, like TDT, still in the process of being written up?
Wei has described a couple versions of UDT. His descriptions seemed to me to be mathematically rigorous. Based on Wei's posts, I wrote this pdf [http://www.scribd.com/doc/32853039/Updateless-Decision-Theory], which gives just the definition of a UDT agent (as I understand it), without motivation or justification. The difficulty with multiple agents looks like it will be very hard to get around within the UDT framework. UDT works essentially by passing the buck to an agent who is at the planning stage*. That planning-stage agent then performs a conventional expected-utility calculation. But some scenarios seem best described by saying that there are multiple planning-stage agents. That means that UDT is subject to all of the usual difficulties that arise when you try to use expected utility alone in multiplayer games (e.g., prisoners dilemma). It's just that these difficulties arise at the planning stage instead of at the action stage directly. *Somewhat more accurately, the buck is passed to the UDT agent's simulation of an agent who is at the planning stage.
What I meant was, what point were you trying to make with that statement? According to Aumann's paper, every planning-optimal solution is also an action-optimal solution, so the decision procedure they endorse will end up picking the planning-optimal solution. (My complaint is just that it goes about it in an unnecessarily round-about way.) If theirs is a correct policy, then the policy of just recomputing the planning-optimal solution must also be correct. That seems to disprove your "only correct policy" claim. I thought your "sequential equilibrium" line was trying to preempt this argument, but I can't see how. Pretty much single-player for now. A number of people are trying to extend the ideas to multi-player situations, but it looks really hard. No, it's not being written up further. (Nesov is writing up some of his ideas, which are meant to be an advance over UDT.)
My understanding of their paper has changed somewhat since we began this discussion. I now believe that repeating the planning-optimal analysis at every decision node is only guaranteed to give ideal results in simple cases like this one in which every decision point is in the same information set. In more complicated cases, I can imagine that the policy of planning-optimal-for-the first-move, then action-optimal-thereafter might do better. I would need to construct an example to assert this with confidence. In this simple example, yes. Perhaps not in more complicated cases. And I can't see how to explain it without an example
While I wait, did you see anything in Aumann's paper that hints at "the policy of planning-optimal-for-the first-move, then action-optimal-thereafter might do better"? Or is that your original research (to use Wikipedia-speak)? It occurs to me that if you're correct about that, the authors of the paper should have realized it themselves and mentioned it somewhere, since it greatly strengthens their position.
Answering that is a bit tricky. If I am wrong, it is certainly "original research". But my belief is based upon readings in game theory (including stuff by Aumann) which are not explicitly contained in that paper. Please bear with me. I have a multi-player example in mind, but I hope to be able to find a single-player one which makes the reasoning clearer. Regarding your last sentence, I must point out that the whole reason we are having this discussion is my claim to the effect that you don't really understand their position, and hence cannot judge what does or does not strengthen it.
Ok, I now have at least a sketch of an example. I haven't worked it out in detail, so I may be wrong, but here is what I think. In any scenario in which you gain and act on information after the planning stage, you should not use a recalculated planning-stage solution for any decisions after you have acted upon that information. Instead, you need to do the action-optimal analysis. For example, let us complicate the absent-minded driver scenario that you diagrammed by adding an information-receipt and decision node prior to those two identical intersections. The driver comes in from the west and arrives at a T intersection where he can turn left (north) or right (south). At the intersection is a billboard advertising today's lunch menu at Casa de Maria, his favorite restaurant. If the billboard promotes chile, he will want to turn right so as to have a good chance of reaching Maria's for lunch. But if the billboard promotes enchiladas, which he dislikes, he probably wants to turn the other way and try for Marcello's Pizza. Whether he turns right or left at the billboard, he will face two consecutive identical intersections (four identical intersections total). The day is cloudy, so he cannot tell whether he is traveling north or south. Working this example in detail will take some work. Let me know if you think the work is necessary.
Ok, I see. I'll await your example.
It is a part of the problem statement that you can't distinguish between being at any of the intersections. So you have to use the same algorithm at all of them. How are you getting this from their words? What about "this coordination can take place only before he starts out at the planning stage"? And "If he chose something else, or nothing at all, then at the action stage he will have some hard thinking to do"? Why would they say "hard thinking" if they meant "recalculate the planning-optimal solution"? (Especially when the planning-optimality calculation is simpler than the action-optimality calculation.)
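For readers who have not seen the paper, the two calculations being compared can be sketched in a few lines. The payoffs used here (0 for exiting at the first intersection, 4 for exiting at the second, 1 for continuing past both) are the ones usually quoted for this problem and are an assumption, since the thread never states them; the function names are hypothetical, and the action-stage check is one reading of action-optimality rather than the paper's own notation.

```python
from fractions import Fraction

# Assumed payoffs for the absent-minded driver (not stated in this thread).
EXIT_FIRST, EXIT_SECOND, CONTINUE_BOTH = 0, 4, 1

def planning_payoff(p):
    """Expected payoff if the driver continues with probability p at
    every intersection (he cannot tell the intersections apart)."""
    return ((1 - p) * EXIT_FIRST
            + p * (1 - p) * EXIT_SECOND
            + p * p * CONTINUE_BOTH)

def action_stage_gain(p):
    """Marginal gain from continuing at the current intersection only,
    holding behavior at the other intersection fixed at p and using the
    belief 1 / (1 + p) of being at the first intersection."""
    at_first = 1 / (1 + p)
    continue_from_first = (1 - p) * EXIT_SECOND + p * CONTINUE_BOTH
    gain_at_first = continue_from_first - EXIT_FIRST
    gain_at_second = CONTINUE_BOTH - EXIT_SECOND
    return at_first * gain_at_first + (1 - at_first) * gain_at_second

# Exact grid search over continuation probabilities in steps of 1/300.
grid = [Fraction(i, 300) for i in range(301)]
p_star = max(grid, key=planning_payoff)
print(p_star, planning_payoff(p_star), action_stage_gain(p_star))
```

Under these assumptions the grid search lands on p = 2/3 with an expected payoff of 4/3, and the action-stage gain at that p is zero, so the driver at an intersection has no incentive to deviate from it. That is consistent with the claim made earlier in the thread that every planning-optimal solution is also action-optimal.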
You can use a backslash to escape special characters in markdown. If you type \*, that will show up as * in the posted text.
In the comment section of Wei Dai's post in question [http://lesswrong.com/lw/182/the_absentminded_driver/], taw and pengvado completed his solution so conclusively that if you really take the time to understand the object level (instead of the meta level where some people are a priori smarter because they won a prize), you can't help but feel the snarking was justified :-)
1A. It may well be a wrong problem. If so, it ought to be dissolved. 1B. If so, many theorists (including presumably Nobel Prize winners) have missed it since 1969. 1C. Your intuition should not be considered a persuasive argument, even by you. 2. Even ignoring any singularitarian predictions, given the degree to which knowledge acceleration has already advanced, you should expect to see cases where old standards are blown away with seemingly little effort. Maybe this isn't one of those cases, but it should not surprise you if we learn that humanity as a whole has done more decision theory in the past few years than in all previous history. Given that similar accelerations are happening in many fields, there are probably several past-Nobel-level advances by rank amateurs with no special genius.
OK, I've got some big guns pointed at me, so I need to respond. I need to respond intelligently and carefully. That will take some time. Within a week at most.
A couple more comments: 1. For a long time I also didn't think that Newcomb's Problem was worth thinking about. Then I read something by Eliezer that pointed out the connection to Prisoner's Dilemma. (According to Prisoners' Dilemma is a Newcomb Problem [http://www.jstor.org/pss/2265034], others saw the connection as early as 1969.) See also my Newcomb's Problem vs. One-Shot Prisoner's Dilemma [http://lesswrong.com/lw/6r/newcombs_problem_vs_oneshot_prisoners_dilemma/] where I explored how they are different as well. 2. I'm curious what you now think about my perspective on the Absent Minded Driver, on both the object level and meta level (assuming I convinced you that it wasn't meant to be a snark). You're the only person who has indicated actually having read Aumann et al.'s paper.
The possible connection between Newcomb and PD is seen by anyone who considers Jeffrey's version of decision theory (EDT). So I have seen it mentioned by philosophers long before I had heard of EY. Game theorists, of course, reject this, unless they are analysing games with "free precommitment". I instinctively reject it too, for what that is worth, though I am beginning to realize that publishing your unchangeable source code is pretty much equivalent to free precommitment. My analysis of your analysis of AMD is in my response to your comment below.
Omega is not obviously impossible: in theory, someone could scan your brain and simulate how you react in a specific situation. If you're already an upload and running as pure code, this is even easier. The question is particularly relevant when trying to develop a decision theory for artificial intelligences: there's nothing impossible about the notion of two adversarial AIs having acquired each others' source codes and basing their actions on how a simulated copy of the other would react. If you presume that this scenario is possible, and there seems to be no reason to assume that it wouldn't be, then developing a decision theory capable of handling this situation is an important part of building an AI.
What on Earth gives you that impression? I agree that scenarios with Omega will probably have little impact on practical matters, at least in the near future, but quantum woo? Why is Omega physically impossible? What is philosophically impossible, in general?
Omega makes a decision to put the money in the box, or not. In my model of (MWI) reality, that results in a branching - there are now 2 worlds (one with money, one without). The only problem is, I don't know which world I am in. Next, I decide whether to one-box or to two-box. In my model, that results in 4 possible worlds now. Or more precisely, someone who knows neither my decision nor Omega's would count 4 worlds. But now we are asked to consider some kind of weird quantum correlation between Omega's choice and my own. Omega's choice is an event within my own past light-cone. By the usual physical assumptions, my choice should not have any causal influence on his choice. But I am asked to believe that if I choose to two-box, then he will have chosen not to leave money, whereas if I just believe as Omega wishes me to believe, then my choice will make me rich by reaching back and altering the past (selecting my preferred history?). And you ask "What on Earth gives me the impression that this is quantum woo?"
Incorrect. Omega's decision is no more indeterministic than the output of a calculation. Asking (say) me "Does two plus two equal three?" does not create two worlds, one in which I say "yes" and one in which I say "no" - overwhelmingly I will tell you "no".
Your model ought to be branching at every subatomic event, not at every conscious intelligent choice. This makes reality (even humans) predictable.
As others have said, Omega-talk is possible in a purely classical world, and is clearer in a classical world. Omega simply scans my brain and deterministically decides whether to put the money in or not. Then I decide whether I take one or two of the boxes. To say my choice should not have any causal influence on his choice is misleading at least. It may be true (depending on how exactly one defines causality); however, it doesn't exclude correlations between the two choices, simply because they are both consequences of a common cause (the state of my brain and the relevant portion of the world immediately before the scenario began). There is no need to include quantumness or even MWI in this scenario, and no certain reason why quantum effects would prevent it from happening. That said, I don't say that something similar is probably going to happen soon.
That's the case if you somehow manage to use a quantum coin in your decision. Your decision could be close enough to deterministic that the measure of the worlds where you decide differently is billions of times or more smaller and can safely be neglected.
Have you read the original article [http://lesswrong.com/lw/nc/newcombs_problem_and_regret_of_rationality/]? The payoff is less if you follow ordinary decision theory, and yet the whole point of decision theory is to maximize the payoff.
0 and 1 are not probabilities [http://lesswrong.com/lw/mp/0_and_1_are_not_probabilities/]. I certainly don't have a prior of 0 that Omega's existence is impossible; he's not defined in a contradictory fashion, and even if he was I harbor the tiniest bit of doubt that I'm wrong about how contradictions work.
I am using sloppy language here, perhaps. But to illustrate my usage, I claim that the probability that 2+2=4 is 1. And that p(2+2=5)=0.
If you were a Bayesian and assigned 0 probability to 2+2=5, you'd be in unrecoverable epistemic trouble if you turned out to be wrong about that. See How to convince me 2+2=3 [http://lesswrong.com/lw/jr/how_to_convince_me_that_2_2_3/].
EY to the contrary, I remain smug in my evaluation p(2+2=5)=0. Of all the evidences that Eliezer offered, the only one to convince me was the one which demonstrated to me that I was confused about the meaning of the digit 5. Yes, by Cromwell's rule, I think it possible I might be mistaken about how to count. "1, 2, 3, 5, 6, 4, 7", I recite to myself. "Yes, I had been wrong about that. Thanks for correcting me." I might then write down p(Eliezer Yupkowski is the guru of Less Wrong)=0.999999999. Once again, I would be mistaken. (It is "Yudkowski", not "Yupkowski".) But in neither case am I in unrecoverable epistemic trouble. Those were typos. Correcting them is a simple search-and-replace, not a Bayesian updating. Or so I understand.
It's Yudkowsky. Might want to update your general confidence evaluations.
Yudkowsky, in fact.
If you run out of material, here's an academic paper that claims to resolve many of the same problems as are being addressed on this site: "DISPOSITION-BASED DECISION THEORY" [http://www.justin-fisher.com/papers/DBDT.pdf]
CODT (Cop Out Decision Theory) : In which you precommit to every beneficial precommitment.
This Omega [http://lesswrong.com/lw/nc/newcombs_problem_and_regret_of_rationality/] is not impossible. It says: "Omega has been correct on each of 100 observed occasions so far". Not particularly hard - if you pick on decision theorists who had previously publicly expressed an opinion on the subject.
Ah! So I need to assign priors to three hypotheses. (1) Omega is a magician (i.e. illusion artist) (2) Omega had bribed people to lie about his past success. (3) He is what he claims. So I assign a prior of zero probability to hypothesis #3, and cheerfully one-box using everyday decision theory.
First: http://lesswrong.com/lw/mp/0_and_1_are_not_probabilities/ You don't seem to be entering into the spirit of the problem. You are "supposed" to reach the conclusion that there's a good chance that Omega can predict your actions in this domain pretty well - from what he knows about you - after reading the premise of the problem. If you think that's not a practical possibility, then I recommend that you imagine yourself as a deterministic robot - where such a scenario becomes more believable - and then try the problem again.
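To put rough numbers on "a good chance that Omega can predict your actions pretty well", here is a small sketch. It assumes the standard Newcomb payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one) and treats the predictor's accuracy as a single probability that is the same whichever choice is made; the function names are illustrative only. Note that this conditions the box contents on the choice, which is precisely the step two-boxers dispute, so it shows what the one-boxing calculation looks like rather than settling the argument.

```python
def newcomb_expected_payoffs(accuracy, big=1_000_000, small=1_000):
    """Expected payoff of each choice if the predictor is right with
    probability `accuracy` (payoff values are the standard assumed ones)."""
    one_box = accuracy * big                # box is full iff predicted correctly
    two_box = small + (1 - accuracy) * big  # box is full only on a misprediction
    return one_box, two_box


def break_even_accuracy(big=1_000_000, small=1_000):
    """Accuracy at which the two expected payoffs are equal."""
    return (big + small) / (2 * big)


if __name__ == "__main__":
    for accuracy in (0.5, 0.51, 0.9, 0.99):
        print(accuracy, newcomb_expected_payoffs(accuracy))
    print("break-even:", break_even_accuracy())
```

On these assumptions the break-even accuracy is 0.5005, so on this way of calculating, even a predictor only slightly better than chance favors one-boxing.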
If I imagine myself as a deterministic robot, who knows that he is a deterministic robot, I am no longer able to maintain the illusion that I care about this problem.
Do you think you aren't a deterministic robot? Or that you are, but you don't know it?
It is a quantum universe. So I would say that I am a stochastic robot. And Omega cannot predict my future actions.
...then you need to imagine you made the robot, it is meeting Omega on your behalf - and that it then gives you all its winnings.
I like this version! Now the answer seems quite obvious. In this case, I would design the robot to be a one-boxer. And I would harbour the secret hope that a stray cosmic ray will cause the robot to pick both boxes anyway.
Yes - but you would still give its skull a lead-lining - and make use of redundancy to produce reliability...
For what it's worth, I have written programs that cooperate on the prisoner's dilemma if and only if their opponent will cooperate, without caring about the opponent's rituals of cognition, only about his behaviour. Unfortunately, this margin is too small to contain them, I mean, they're not ready for prime time. I'll probably write up a post on that in the near future.
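The programs mentioned above are not shown in the thread, so the following is only a toy illustration of the general idea, not the commenter's construction. It assumes each player is a function that receives its opponent and a simulation budget (`fuel`, a made-up parameter), and that a player who runs out of budget optimistically cooperates. `mirror_bot` looks only at what the opponent does when simulated, not at how the opponent reasons.

```python
def cooperate_bot(opponent, fuel):
    """Always cooperates, whatever the opponent is."""
    return "C"


def defect_bot(opponent, fuel):
    """Always defects, whatever the opponent is."""
    return "D"


def mirror_bot(opponent, fuel):
    """Plays whatever the opponent would play against mirror_bot.

    The base case (cooperate when the budget is exhausted) is an
    assumption made so that the mutual simulation terminates.
    """
    if fuel == 0:
        return "C"
    return opponent(mirror_bot, fuel - 1)


def play(a, b, fuel=10):
    """Return the pair of moves (a's move, b's move)."""
    return a(b, fuel), b(a, fuel)


if __name__ == "__main__":
    print(play(mirror_bot, mirror_bot))
    print(play(mirror_bot, cooperate_bot))
    print(play(mirror_bot, defect_bot))
```

In this sketch `mirror_bot` cooperates with itself and with `cooperate_bot`, and defects against `defect_bot`. The optimistic base case is a design choice; a pessimistic one would make two mirror bots defect against each other.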
Now that you mention it, I was wondering earlier what would happen if you roll a die and one-box on odd and two-box on even...
The problem states that Omega has never been wrong, which would imply that if there are cases where he can't be certain about his prediction, he won't offer the bargain in the first place.
Dice are deterministic, but there's still hope for quantum randomness...

There are a few essential questions here:

  1. Does a reasonable model of reality actually cause us to anticipate any scenarios where it is beneficial to have an irrational disposition?
  2. Are these common enough that choosing to surrender one's rational disposition would have an overall positive expected utility?
  3. If you've gotten far enough to be able to wield rationality skillfully enough to correctly determine the answers to those questions, is it really possible to force yourself to forget how to be that rational, if you decide it would be instrumentally bene
...
By "irrational", do you mean in the sense of "would pay the $100 as Parfit's Hitchhiker"? If so, then the answer to all three questions is yes: there are lots of scenarios in real life where we are called upon to pay debts both positive and negative (repay favors, retaliate against aggression) and we think the benefit to be gained from doing so will be less than the cost. There are enough such scenarios that a disposition to pay debts without stopping to do utility calculations usually pays off handsomely over a lifetime.
But I don't believe such claims are true, so why would I attempt to argue for them? My claim is purely theoretical: we need to distinguish, conceptually, between desirable dispositions and rational actions. It seems to me that many on LW fail to make this conceptual distinction, which can lead to mistaken (or at least under-argued) theorizing about rationality. The dispute between one-boxers and two-boxers is interesting and significant even if both sides agree about most "real world" cases.

My claim is purely theoretical: we need to distinguish, conceptually, between desirable dispositions and rational actions. It seems to me that many on LW fail to make this conceptual distinction, which can lead to mistaken (or at least under-argued) theorizing about rationality

This is because actions only ever arise from dispositions. Yes, given that Omega has predicted you will one-box, it would (as an abstract fact) be to your benefit to two-box; but in order for you to actually two-box, you would have to execute some instruction in your source code, which, if it were present, Omega would have read, and thus would not have predicted that you would one-box.

Hence only dispositions are of interest.

Is this the argument? 1. It is impossible to have the one-boxing disposition and then two-box. 2. Ought implies can. 3. Therefore, it is false that someone with a one-boxing disposition ought to two-box. Or are you agreeing that you ought to two-box, but claiming that this fact isn't interesting because of premise 1? At any rate, it seems like a bad argument, since analogous arguments will entail that whenever you have some decisive disposition, it is false that you ought to act differently. (It will entail, for instance, NOT[people who have a decisive loss aversion disposition should follow expected utility theory].)
Yes, if "ought" merely means the outcome would be better, and doesn't imply "can". As far as I can tell, it would only have that implication in situations where an outcome depended directly on one's disposition (as opposed to one's actions).
I don't think so: 1. John has the loss-aversion disposition. 2. It is impossible to have the loss-aversion disposition and maximize expected utility in case C. 3. Ought implies can. 4. Therefore, it is false that John ought to maximize expected utility in case C. Or, for Newcomb: 1. It is impossible for someone with the two-boxing disposition to one-box. 2. Ought implies can. 3. Therefore, it is false that someone with the two-boxing disposition ought to one-box.
Either "ought" applies to dispositions, or actions, but one mustn't equivocate. If "what John ought to do" means "the disposition John should have", then perhaps John ought to maximize expected utility even if he's not currently so disposed. If the outcomes depend on John's disposition only indirectly via his actions, and his current disposition will lead to a suboptimal action, then we may very well say that John "ought" to do something different, meaning that he should have a different disposition. If, however, John is involved in a Newcomblike problem where there is a causal arrow leading directly from his disposition to the outcome, and his current disposition is optimal with respect to outcome, then one cannot say that he "ought" to do differently, on this (dispositional) usage of "ought".
Everyone agrees about what the best disposition to have is. The disagreement is about what to do. I have uniformly meant "ought" in the action sense, not the dispositional sense. (FYI: this is always the sense in which philosophers (incl. Richard) mean "ought", unless otherwise specified.) BTW: I still don't understand the relevance of the fact that it is impossible for people with one-boxing dispositions to two-box. If you don't like the arguments that I formalized for you, could you tell me what other premises you are using to reach your conclusion?
That sense is entirely uninteresting, as I explained in my first comment in this thread. It's the sense in which one "ought" to two-box after having been predicted by Omega to one-box -- a stipulated impossibility. Philosophers who, after having considered the distinction, remain concerned with the "action" sense, would tend to be -- shall we say -- vehemently suspected of non-reductionist thinking; of forgetting that actions are completely determined by dispositions (i.e. the algorithms running in the mind of the agent). Having said that, if one does use "ought" in the action sense, then there should be no difficulty in saying that one "ought" to two-box in the situation where Omega has predicted you will one-box. That's just a restatement of the assumption that the outcome of (one-box predicted, two-box) is higher in the preference ordering than that of (one-box predicted, one-box). Normally, the two meanings of "ought" coincide, because outcomes normally depend on actions that happen to be determined by dispositions, not directly on dispositions themselves. Hence it's easy to be deceived into thinking that the action sense is the appropriate sense of "ought". But this breaks down in situations of the Newcomb type. There, the dispositional sense is clearly the right one, because that's the sense in which you ought to one-box; since the dispositional sense also gives the same answers as the action sense for "normal" situations, we may as well say that the dispositional sense is what we mean by "ought" in general.
So, you're really interested in this question: what is the best decision algorithm? And then you're interested, in a subsidiary way, in what you ought to do. You think the "action" sense is silly, since you can't run one algorithm and make some other choice. Your answer to my objection involving the parody argument is that you ought to do something else (not go with loss aversion) because there is some better decision algorithm (that you could, in some sense of "could", use?) that tells you to do something else. What do you do with cases where it is impossible for you to run a different algorithm? You can't exactly use your algorithm to switch to some other algorithm, unless your original algorithm told you to do that all along, so these cases won't be that rare. How do you avoid the result that you should just always use whatever algorithm you started with? However you answer this objection, why can't two-boxers who care about the "action sense" of ought answer your objection analogously?
Just take causal decision theory and then crank it with an account of counterfactuals whereby there is probably a counterfactual dependency between your box-choice and your early disposition. Arntzenius called something like this "counterfactual decision theory" in 2002 [http://stanford.edu/~joelv/teaching/184/arntzenius%20-%20reflections%20on%20sleeping%20beauty.pdf] . The counterfactual decision theorist would assign high probability to the dependency hypotheses "if I were to one-box now then my past disposition was one-boxing" and "if I were to two-box now then my past disposition was two-boxing." She would assign much lower probability to the dependency hypotheses on which her current action is independent of her past disposition (these would be the cognitive glitch/spasm sorts of cases).
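Here is one way the calculation described above might look with numbers attached. This is a sketch under stated assumptions, not Arntzenius's formalism: it uses the standard Newcomb payoffs, a single credence `q` in the hypothesis that the current action tracks the past disposition (and hence the prediction), and a prior `f` that the opaque box is full in the glitch/spasm cases where the action is independent of the disposition. All names and parameters are hypothetical.

```python
def counterfactual_expected_utilities(q, f, big=1_000_000, small=1_000):
    """Expected utility of one-boxing and two-boxing.

    q: credence that the action depends on the past disposition.
    f: probability the opaque box is full when the action is independent
       of the disposition (the glitch/spasm hypotheses).
    """
    # If the action tracks the disposition, one-boxing goes with a full
    # box and two-boxing with an empty one.
    dependent_one, dependent_two = big, small
    # If the action is a glitch, the box contents are fixed regardless.
    independent_one = f * big
    independent_two = f * big + small
    eu_one = q * dependent_one + (1 - q) * independent_one
    eu_two = q * dependent_two + (1 - q) * independent_two
    return eu_one, eu_two


if __name__ == "__main__":
    print(counterfactual_expected_utilities(q=0.9, f=0.5))
    print(counterfactual_expected_utilities(q=0.0005, f=0.5))
```

With these assumptions the difference between the two expected utilities works out to q * 1,000,000 - 1,000, independent of f, so one-boxing comes out ahead whenever q exceeds 0.001.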
I agree that this fact [you can't have a one-boxing disposition and then two-box] could appear as a premise in an argument, together with an alternative proposed decision theory, for the conclusion that one-boxing is a bad idea. If that was the implicit argument, then I now understand the point. To be clear: I have not been trying to argue that you ought to take two boxes in Newcomb's problem. But I thought this fact [you can't have a one-boxing disposition and then two-box] was supposed to be a part of an argument that did not use a decision theory as a premise. Maybe I was misreading things, but I thought it was supposed to be clear that two-boxers were irrational, and that this should be pretty clear once we point out that you can't have the one-boxing disposition and then take two boxes.
Not irrational by their own lights. "Take the action such that an unanticipated local miracle causing me to perform that action would be at least as good news as local miracles causing me to perform any of the alternative actions" is a coherent normative principle, even though such miracles do not occur. Other principles with different miracles are coherent too. Arguments for one decision theory or another only make sense for humans because we aren't clean implementations of any of these theories, and can be swayed by considerations like "agents following this rule regularly get rich."
I agree with all of this.

It was good to have the disposition to ignore threats

But not as good as the disposition to ignore threats, except when the threats are caused by transparently accidental mental glitches (which would not be encouraged by the disposition).

Eliezer's theory is more-or-less causal decision theory with a different account of dependency hypotheses/counterfactuals. The most relevant philosophical disputes would be about whether to use "local miracle" counterfactuals rather than various backtracking counterfactuals, or logical/mathematical counterfactuals (Eliezer's timeless decision theory idea).

Or reduce counterfactuals [http://lesswrong.com/lw/2l2/what_a_reduction_of_could_could_look_like/] and get them out of the analysis of problem statement, rather than explicitly as part of the problem statement. Decision theories that run on explicit notions of dependency only compete with each other on the correctness of informal dependence analysis established by guidelines (specific to a particular theory) for presenting dependencies. And for each such theory, we can find a problem statement where the guidelines collapse. Actual progress requires understanding where dependencies themselves come from (and for now it's UDT/ADT).

"Due to an unexpected mental glitch, he threatens Joe again. Joe follows his disposition and ignores the threat. BOOM. Here Joe's final decision seems as disastrously foolish as Tom's slip up."

But of course, the initial decision to take the pill may be rational, and the "final decision" is constrained so much that we might regard it as a "decision" in name only. The way I see it: When Joe takes the pill, he will stop rational versions of Tom from threatening him, meaning he benefits, but will be at increased risk of irration...

This reminds me of a quote from Scott Aaronson's On Self-Delusion and Bounded Rationality [http://www.scottaaronson.com/writings/selfdelusion.html] :
We aren't transparent. The only reason to fulfill our threats is to make it so later people will know that we will, in which case it's totally rational by any decision theory.
These "pills" and "dispositions" are equivalent to pre-commitments. If you're interested in the math and some interesting examples, I'd suggest reading The Strategy of Conflict.
Yep (I actually discuss the case of emotions in the linked post [http://www.philosophyetc.net/2005/03/rational-irrationality.html]!)

Sort of a side note to the main topic of discussion but being as my post was quoted, maybe worth responding:

The great thing about comparing an argument to one in the philosophical literature is that it provides access to a whole range of papers on the issue so that ideas don't need to be rediscovered. The corresponding bad thing though is it makes it easy to accidentally commit a straw man attack if the argument isn't actually the same as the one in the literature. So I'll outline my argument (basically I'll extend on the quote of mine you used).

If we thin...

Hi Adam, can I ask for a little more clarification here? You write: Newcomb's Problem is a case where Omega punishes those who are disposed to follow strong dominance reasoning. But how, exactly, does it follow from this that dominance reasoning isn't rational? It may just be a case where Omega punishes those who are disposed to reason rationally. (If dominance reasoning is indeed rational, then this is the right way to describe the case.)
Edit: Hang on, let me try that again before you respond. I suppose it depends what you mean by rationality but it seems to me that the same argument that is often used to make people favour strong dominance (regardless of the world state, strong dominance leads to better outcomes) can actually be used to argue that it's not a very good decision procedure (because there are world states where using this decision procedure does not lead to a better outcome), at least as long as there are decision theories that do lead to better outcomes in general (regardless of the world state, these decision theories lead to better outcomes than other decision theories - or the weaker but more realistic, in more world states, these decision theories lead to sensible outcomes.). Just as the rationality of a strong dominance decision is justified by it leading to better outcomes than other decisions, the rationality of a decision theory could be justified by whether it leads to better outcomes than other decision theories. If that's not what you mean by rationality that's fine but then what establishes strong dominance as being a rational way of acting and hence what makes two boxing on Newcomb's rational? I'm not saying there's no answer to that but I am saying that I will struggle to respond to your question without knowing how you think about rationality in that situation. I'm confident you know more about this topic than me so I will try to understand your points but so far, I haven't seen anything which would: a.) Establish a decision based on strong dominance at an individual point in time as being rational without: b.) Establishing strong dominance as an irrational decision procedure by using a similar argument but applied to decision procedures rather than individual decisions. I'd be interested to know whether you think this is flawed as I'd be happy to either change my mind or learn to explain my reasoning better, depending on what the flaw was.
Rationality and winning may not be the same thing. But I do think they’re linked. If we’re asked to judge whether the principle of strong dominance is rational, we say yes because it always leads to the best outcome (leads to “winning”). If we were asked to choose from a 10% chance of winning $100 or a 20% chance, we would say it was rational to choose the 20% chance, once again because there’s a higher chance of winning. In fact, it seems to me that people do judge whether a decision is rational based on whether it leads to "winning" but they just get confused by multiple possible meanings of winning in the case of Newcomb's Problem which I think comes from confusing two possible questions about the rationality of a decision in the problem (discussed later). Regardless, even if that's not true, it seems that rationality and winning are at least related. Now I believe that, in just the same way, the rationality of a decision theory or procedure can be judged based on the same basis. So it may be rational to follow TDT instead of CDT (as an example, I’m not getting into the conversation of which is better here) because it may lead to a greater chance of winning. The justification here is just the same as it is in the strong dominance and lottery example in the first paragraph. Which means there are two questions: 1.) What is the rational decision to make in the circumstance? The answer here may well be the strongly dominant decision (two boxing) 2.) What is the rational decision theory to follow? The answer here might be (for example) TDT and hence the decision that flows from this is one boxing. But that means the question of whether one boxing or two boxing is the rational decision in the case of Newcomb’s Problem can mean one of two things: 1.) Is it a rational decision? 2.) Did it follow from a rational decision theory? Previously, I provided more weight to the second of these and said that as it followed from a rational decision theory, that was what matte
But remember, in Newcomb the one-boxer wins in virtue of her disposition, not in virtue of her decision per se. On your broader point, I agree that we need to distinguish the two questions you note, though I find it a little obscure to talk of a "rational decision theory" (as by this I had previously taken you to mean the theory which correctly specifies rational decisions, when you really mean something more like what I'm calling desirable dispositions). I agree with you that one-boxing is the more desirable disposition (or decision-procedure to have inculcated). But it's a separate question what the rational act is; and I think it'd be a mistake to assume that two-boxing can't be a rational choice just because a disposition to so choose would not be rational to inculcate. Well, I think that depends on one's purposes. If you're interested in creature-building, then I guess you want to know what decision procedure would be best (regardless of the rationality of the decisions it leads to). But if - like me - you're just interested in understanding rationality, then what you want is a criterion or general theory of which particular actions are rational (and why) -- regardless of whether we can reliably implement or follow it. (See also my previous contrast [http://lesswrong.com/lw/1r1/epistemic_luck/1lbk?c=1] between the projects of constructing theoretical 'accounts' vs. practical 'instruction manuals'.)
Yes, I'm willing to concede the possibility that I could be using words in unclear ways and that may lead to problems. I am interested though in how you define a rational decision if not in terms of which leads to the better outcome?
Maybe the focus shouldn't be on the decision (or action) that leads to the best outcome, but on the decision procedure (or theory or algorithm) that leads to the best outcome. If the outcome is entirely independent of the procedure, the difference is unimportant, so you can speak of "rational decision" and "rational decision procedure" interchangeably. But in Newcomb's problem, that's not the case.
Yes, that's my basic view. The difficulty in part is that people seem to have different ideas of what it means to be rational.
That sounds fine to me. (Well, technically I think it's a primitive concept, but that's not important here.) It's applying the term 'rational' to decision theories that I found ambiguous in the way noted.
Which means that one boxing is the better choice because it leads to the better outcome. I say that slightly tongue in cheek because I know you know that but, at the same time, I don't really understand the position that says: 1.) The rational decision is the one that leads to the better outcome. 2.) In Newcomb's Problem one boxing would actually lead to the better outcome. 3.) But the principle of strong dominance suggests that this shouldn't be the case. I don't understand how 3, a statement about how things should be, outweighs 2, a statement about how things are. It seems like the sensible thing to do is say, well, due to point 2, one boxing does lead to the better outcome. Due to point 1, this means one boxing is rational. A corollary of this is that strong dominance must not be a rational way of making decisions (in all cases).
No, the choice of one-boxing doesn't lead to the better outcome. It's one's prior possession of the disposition to one-box that leads to the good outcome. It would be best of all to have the general one-boxing disposition and yet (somehow, perhaps flukily) manage to choose both boxes. (Compare Parfit's case. Ignoring threats doesn't lead to better outcomes. It's merely the possession of the disposition that does so.)
Okay, so your dispositions are basically the counterfactual "If A occurred then I would do B" and your choice, C, is what you actually do when A occurs. In the perfect predictor version of Newcomb's, Omega predicts perfectly the choice you make, not your disposition. It may generate it's own counterfactual for this "If A occurs then this person will do B" but that's not to say it cares about your disposition just because the two counterfactuals look similar. Because Omega's prediction of C is perfect, that means that if a stray bolt of lightning hits you and switches your decision, Omega will have taken that lightning into account. You will always be sad if it changes your choice, C, to two boxing because Omega perfectly predicts C and so will punish you. Inversely, the rational disposition in Newcomb's isn't to one box. Instead, your disposition has no bearing on Newcomb's except insofar as it is related to C (if you always act in line with your dispositions, for example, what your dispositions matter). It isn't a disposition to one box that leads to Omega loading the boxes a certain way, it's a choice to one box so your disposition neither helps nor hinders you. As such, your choice of whether to one or two box is what is relevant. And hence, the choice of one boxing is what leads to the better outcome. Your disposition to one box plays no role whatsoever. Hence, based on the maximising utility definition of rationality, the rational choice is to one box because its this choice itself that leads to the boxes being loaded in a certain way (note on causality at the bottom of the post). So to restate it in the terms used in the above comments: A prior possession of the disposition to one-box is irrelevant to Newcomb's because Omega is interested in your choices not your dispositions to choices and is perfect at predicting your choices not your dispositions. Flukily choosing two boxes would be bad because Omega would have perfectly predicted the fluky choice and s
We're going in circles a little, aren't we (my fault, I'll grant). Okay, so there are two questions: 1.) Is it a rational choice to one box? Answer: No. 2.) Is it rational to have a disposition to one box? Answer: Yes. As mentioned earlier, I think I'm more interested in creating a decision theory that wins than one that's rational. But let's say you are interested in a decision theory that captures rationality: It still seems arbitrary to say that the rationality of the choice is more important than the rationality of the decision. Yes, you could argue that choice is the domain of study for decision theory but the number of decision theorists that would one box (outside of LW) suggests that other people have a different idea of what decision theory would be. I guess my question is this: Is the whole debate over one or two boxing on Newcomb's just a disagreement over which question decision theory should be studying or are there people who use choice to mean the same thing that you do that think one boxing is the rational choice?
I don't understand the distinction between choosing to one-box and being the sort of person who chooses to one-box. Can you formalize that difference?
The latter, I think. (Otherwise, one-boxers would not really be disagreeing with two-boxers. We two-boxers already granted that one-boxing is the better disposition. So if they're merely aiming to construct a theory of desirable dispositions, rather than rational choice, then their claims would be utterly uncontroversial.)
I thought that debate was about free will.

For any given concept of "rational (action)" that's not defined as "(the action) arranging for the best expected winning", you can of course find a situation where that concept and winning are at odds. But if you define them to be the same, it's no longer possible. At that point, you can be taxed for being a given program and not another program (or for the fact that pi is less than 10, for that matter), something you don't control, but such a criterion won't be about rationality of your decision-making, because it doesn't provide a suggest...

I'm curious about the downvotes. Do others disagree with me that Parfit's threat ignorer case (and the distinction it illustrates between evaluating dispositions and actions) is worth considering?

You can't have a disposition to act in a certain way without counter-factually acting that way. You can't counter-factually act a certain way without actually acting that way in a situation indistinguishable from the counter-factual. What you seem to be talking about appears to be pretending to have a certain disposition (e.g. acting according to that disposition unless the stakes are really high and trying to hide that fact). In other words you are talking about signaling, and I don't think the decision theory discussions here have progressed far enough for complicating the matter by trying to incorporate a theory of signalling to be productive at this point. (or perhaps you believe in magical acausal free will)
No, neither. It's more the idea that certain identifiable dispositions needn't be 100% determinative. I may be disposed to X in C so long as I do X in a sufficiently high proportion of C-situations. But if (say) an unpredictable mental glitch leads me to do otherwise one day, that may well be all the better. My point is then that it would be a mistake to condemn this more-fortunate choice as "irrational", in such cases.
If it's due to a random glitch and not any qualities that you regard as part of defining who you are, I don't see how it could possibly be described as a choice. Randomness is incompatible with sensibly defined choice (of course the act of deciding to leave something up to chance itself is a choice, but which of the possible outcomes actually comes about is not). If your disposition is to flip a quantum coin in certain situations that is a fact about your disposition. If your disposition is to decide differently in certain high stake situations that also is a fact about your disposition. You may choose to try to hide such facts and pretend your disposition is more simple than it actually is, but that's a question of signaling, not of what your disposition is like. (Of course a disposition to signal in a certain way is also a disposition).
It's an interesting question how to draw the line between chosen action and mere behaviour. If the "glitch" occurs at an early enough stage, and the subsequent causal process includes enough of my usual reasons-responsive mechanisms (and so isn't wildly contrary to my core values, etc.), then I don't see why the upshot couldn't, in principle, qualify as "my" choice -- even if it's rather surprising, at least to a casual observer, that I ended up acting contrary to my usual disposition. Your second point involves the notion of a kind of totalizing, all things considered disposition, such that your total disposition + environmental stimuli strictly entails your response (modulo quantum complications). Granted, the kind of distinction I'm wanting to draw won't be applicable when we're talking about such total dispositions. But there are cases where it is applicable. In particular, there are cases where everyone involved is less than omniscient (even about such local matters as the precise arrangement of matter in my head). They might have some fantastic knowledge -- e.g. they might know everything there is to know about my brain that can be captured using the language of ordinary folk psychology. This can include various important dispositional facts about me. But if folk psychology is too coarse-grained to capture my total disposition, then we need to distinguish (and separately evaluate) my coarse-grained dispositions from my actual actions.
If your normal decision making apparatus continues to work afterwards, has the chance to compensate for the glitch, doesn't, and the glitch changes the result, it would have to be almost exactly balanced in a counterfactual case without the glitch. How likely is that? And even so it doesn't strike me as conceptually all that different from unconsciously incorporating a small random element in the decision making process right from the start. In either case the more important the random element the less accurately is the outcome described as your choice, as far as I'm concerned (maybe some would define the random element as the real you, and not the parts that include your values, experiences, your reasoning ability and so on; or possibly argue that for mysterious reasons they are so conveniently entangled that they are somehow the same thing). But that's just a map-territory difference. If you use disposition as your word for "map of the decision making process" of course that map will sometimes have inaccuracies. But calling the difference between map and territory "choice" strikes me as ... well.. it matches the absolutely crazy way some people think about free will, but is worse than useless. Unless you want to outlaw psychology because it's akin to slavery, trying to take away people's choice by understanding them, oh the horror!
Eh? That's not what I'm doing. I'm pointing out that there's a respectable (coarse-grained) sense of 'disposition' (i.e. tendency) according to which one can have a disposition to X without this necessarily entailing that one will actually do X. (There's another sense of 'total disposition' where the entailment does hold. N.B. We make choices either way, but it only makes sense to separately evaluate choices from coarse-grained dispositions.) I take these general dispositions to accurately correspond to real facts in the world -- they're just at a sufficiently high level of abstraction that they allow for various exceptions. (Ceteris paribus laws are not, just for that reason, "inaccurate".)
My take on this is the following: It's easier to see what is meant by disposition if you look at it in terms of AI. Replace the human with an AI, replace "disposition" with "source code" and replace "change your disposition to do some action X" with "rewrite your source code so that it does action X". Of course it would still want to incorporate the probability of a glitch as someone else already suggested. If an AI that is running CDT expects to encounter a Newcomb-like problem, it would be rational for it to self-modify (in advance) to use a decision theory which one-boxes (i.e. the AI will change its disposition).
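The AI framing above can be made concrete with a toy model. This is only a minimal sketch under stated assumptions: the names `cdt_agent`, `one_boxing_agent`, `play_newcomb` and `best_program` are hypothetical, the predictor is assumed to be perfect because it simply runs the agent's program, and the payoffs are the usual $1,000 and $1,000,000. It illustrates the common ground (which program it pays to be), not the disputed question of whether the act itself is rational.

```python
from typing import Callable, Sequence

# An "agent" is just its source code: a function returning its move.
Agent = Callable[[], str]

SMALL = 1_000        # always in the transparent box
BIG = 1_000_000      # in the opaque box only if one-boxing was predicted


def cdt_agent() -> str:
    # Reasons that the boxes are already filled, so taking both dominates.
    return "two-box"


def one_boxing_agent() -> str:
    return "one-box"


def play_newcomb(agent: Agent) -> int:
    """Payoff when the predictor (assumed perfect) runs the agent's program."""
    prediction = agent()                      # predictor inspects the source code
    opaque = BIG if prediction == "one-box" else 0
    choice = agent()                          # the agent then actually chooses
    return opaque + (SMALL if choice == "two-box" else 0)


def best_program(candidates: Sequence[Agent]) -> Agent:
    """The program an AI would rewrite itself into, judged by payoff alone."""
    return max(candidates, key=play_newcomb)


if __name__ == "__main__":
    for agent in (cdt_agent, one_boxing_agent):
        print(agent.__name__, play_newcomb(agent))
    print("self-modify into:", best_program([cdt_agent, one_boxing_agent]).__name__)
```

Because prediction and choice come from the same deterministic program here, there is no room for the "glitch" cases discussed elsewhere in the thread; modelling those would need an explicit noise term.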
Likewise, an AI surrounded by threat-fulfillers would rationally self-modify to become a threat-ignorer. (The debate is not about whether these are desirable dispositions to acquire -- that's common ground.) Do you think it follows from this that the act of ignoring a doomsday threat is also rational?
But you use disposition as a word for the map, right? Otherwise why would you have mentioned folk psychology? If so, talking about disposition in games involving other players is talking about signaling. If not, what would it even mean to act contrary to one's disposition? That there exists a possible coarse-grained model of one's decision making process that predicts a majority of one's actions (where is the cutoff? 50%? 90%?), but doesn't predict that particular action? How do you know that's not the case for most actions? Or that the mathematically most simple model of one's decision making process that predicts a high enough percentage of one's actions doesn't predict that particular action?
No, I referenced folk psychology just to give a sense of the appropriate level of abstraction. I assume that beliefs and desires (etc.) correspond to real (albeit coarse-grained) patterns in people's brains, and so in that sense concern the 'territory' and not just the 'map'. But I take it that these are also not exhaustive of one's total disposition -- human brains also contain a fair bit of 'noise' that the above descriptions fail to capture. Regardless, this isn't anything to do with signalling, since there's no possibility of manipulated or false belief: it's stipulated that your standing beliefs, desires, etc. are all completely transparent. (And we may also stipulate, in a particular case, that the remaining 'noise' is not something that the agents involved have any changing beliefs about. Let's just say it's common knowledge that the noise leads to unpredictable outcomes in a very small fraction of cases. But don't think of it as the agent building randomness into their source code -- as that would presumably have a folk-psychological analogue. It's more a matter of the firmware being a little unreliable at carrying out the program.) The upshot, as I see things, is as follows: the vast majority of people who "win" at Newcomb's will be one-boxers. After all, it's precisely the disposition to one-box that is being rewarded. But the predictor (in the variation I'm considering) is not totally omniscient: she can accurately see the patterns in people's brains that correspond to various folk-psychological attributes (beliefs, desires, etc.), but is sometimes confounded by the remaining 'noise'. So it's compatible with having a one-boxing disposition (in the specified sense) that one go on to choose two boxes. And an individual who does this gains the most of all. (Though obviously one couldn't plan on winning this way, or their disposition would be for two-boxing. But if they have an unexpected and unpredictable 'change of heart' at the moment of decision, my c
I still don't see how statements about disposition in your sense are supposed to have an objective truth value (what does someone look like in visually simplified?), and why you think this disposition is supposed to better correlate with people's predictions about decisions than the non-random component of the decision making process (total disposition) does (or why you think this concept is useful if it doesn't), but I suspect discussing this further won't lead anywhere. Let's try leaving the disposition discussion aside for a moment: You are postulating a scenario where someone spontaneously changes from a one-boxer into a two-boxer after the predictor has already made the prediction, just long enough to open the right-hand box and collect the $1000. Is that right? And the question is whether I should regret not being able to change myself back into a one-boxer in time to refuse the $1000? Obviously if my behavior in this case was completely uncorrelated to the odds of finding the $1,000,000 box empty I should not. But the normal assumption for cases where your behavior is unpredictable (e.g. when you are using a quantum coin) is that P(two box) = P(left box empty). Otherwise I would try to contrive to one-box with a probability of just over 0.5. So the details depend on P. If P>0.001 (I'm assuming constant utility per dollar, which is unrealistic) my expected dollars before opening the left box have been reduced, and I bitterly regret my temporary lapse from sanity since it might have cost me $1,000,000. The rationale is the same as in the normal Newcomb problem. If P<0.001 my expected dollars right at that point have increased, and according to some possible decision theories that one-box I should not regret the spontaneous change, since I already know I was lucky. But nevertheless my overall expected payoff in all branches is lower than it would be if temporary lapses like that were not possible. Since I'm a Counterfactual muggee [http://lesswrong.c
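The 0.001 threshold in the comment above follows from a short expected-value calculation. A minimal sketch, assuming constant utility per dollar (as the comment does), the usual payoffs of $1,000 and $1,000,000, and reading P as the probability that the $1,000,000 box is empty given a lapse into two-boxing; the function names are hypothetical.

```python
SMALL = 1_000        # guaranteed contents of the $1000 box
BIG = 1_000_000      # contents of the other box when one-boxing was predicted


def expected_gain_from_lapse(p_empty: float) -> float:
    """Expected dollars gained by a lapse into two-boxing.

    Baseline: a reliable one-boxer collects BIG for certain.
    With the lapse: SMALL for certain, plus BIG with probability (1 - p_empty).
    The difference is SMALL - p_empty * BIG.
    """
    return SMALL - p_empty * BIG


def break_even_probability() -> float:
    """The P at which the lapse neither helps nor hurts in expectation."""
    return SMALL / BIG


if __name__ == "__main__":
    print("break-even P:", break_even_probability())
    for p in (0.0001, 0.01, 0.5):
        print(p, expected_gain_from_lapse(p))
```

Above the break-even probability the lapse lowers expected dollars, below it the lapse raises them, which matches the two cases the comment distinguishes.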
What is the relevance of this? Are you using this argument? (See comment above.)* 1. It is impossible to have the one-boxing disposition and then two-box. 2. Ought implies can. 3. Therefore, it is false that someone with a one-boxing disposition ought to two-box. If that isn't your argument, what is the force of the quoted text? At any rate, it seems like a bad argument, since analogous arguments will entail that whenever you have some decisive disposition, it is false that you ought to act differently. (It will entail, for instance, NOT[people who have a decisive loss aversion disposition should follow expected utility theory].) Notice that an analogous argument also cuts the other way: 1. It is impossible for someone with the two-boxing disposition to one-box. 2. Ought implies can. 3. Therefore, it is false that someone with the two-boxing disposition ought to one box. *I made a similar comment above, but I don't know how to link to it. Help appreciated.
Type (See [comment above](http://lesswrong.com/lw/2lg/desirable_dispositions_and_rational_actions/2gg4?c=1).) to get "(See comment above.)" as a link. (I got the URI http://lesswrong.com/lw/2lg/desirable_dispositions_and_rational_actions/2gg4?c=1 from the Permalink on your comment above.)
Making a decision means discovering your disposition (if we are using that word, we could call it something else if that avoids terminology confusion. What I mean is the non-random element of how you react to a specific input) in respect to a certain action. In a certain sense you are your dispositions, and everything else is just meaningless extras (that is your values, experiences, non-value preferences, reasoning ability etc. collectively form your dispositions and are part of them). Controlling your dispositions is how you control your actions. And your dispositions are what is doing that controlling. Making a choice between A and B doesn't mean letting disposition a and disposition b fight and pick a winner, it means that preferences vs A and B are the cause for your disposition being what it is. You can change your disposition vs act X in the sense that your disposition vs any X before time t is Y and your disposition for any X after t is Z, but not in the sense that you can change your disposition vs X at time t from Y to Z. Whatever you actually do (modulo randomness) at time t, that's your one and only disposition vs X at time t. Assume you prefer red to blue, but more strongly prefer cubes to spheres. When given the choice between a red sphere and a blue cube and only one of them you can't just pick the red cube. And it's not the case that you ought to pick the red after you already have the cube, that's just nonsense. The problem is more than just impossibility.
Okay, I understand how you use the word "disposition" now. This is not the way I was using the word, but I don't think that is relevant to our disagreement. I hereby resolve to use the phrase "disposition to A" in the same way as you for the rest of our conversation. I still don't understand how this point suggests that people with one-boxing dispositions ought not to two-box. I can only understand it in one way: as in the argument in my original reply to you. But that argument form leads to this absurd conclusion: (a) whenever you have a disposition to A and you do A, it is false that you ought to have done something else. In particular, it backfires for the intended argumentative purpose, since it entails that two-boxers shouldn't one-box.
No, when you have disposition a and do A it may be the case that you ought to have disposition b and do B, perhaps disposition a was formed by habit and disposition b would counter-factually have resulted if the disposition had formed on the basis of likely effects and your preferences. What is false is that you ought to have disposition a and do B.
OK. So the argument is this one: 1. According to two-boxers, you ought to (i) have the disposition to one-box, and (ii) take two boxes. 2. It is impossible to do (i) and (ii). 3. Ought implies can. 4. So two-boxers are wrong. But, on your use of "disposition", two-boxers reject 1. They do not believe that you should have a FAWS-disposition to one-box, since having a FAWS-disposition to one-box just means "actually taking one box, where this is not a result of randomness". Two-boxers think you should non-randomly choose to take two boxes. ETA: Some two-boxers may hesitate to agree that you "ought to have a disposition to one-box", even in the philosopher's sense of "disposition". This is because they might want "ought" to only apply to actions; such people would, at most, agree that you ought to make yourself a one-boxer.
From the original post: Richard is probably using disposition in a different sense (possibly the model someone has of someone's disposition in my sense) but I believe Eliezer's usage was closer to mine, and either way disposition in my sense is what she would need to actually get the million dollars.
It's definitely worth considering; but it seems intuitively clear at least that having the disposition of negotiating with counterfactual terrorists tends to lead to much greater utility loss than being screwed over now and again by terrorists who are mindlessly destructive irrespective of any gains they could make. I'm not sure exactly what argument would lead one to believe that such mindless terrorists are rare; something like Omohundro's basic AI drives might indicate that Bayesian utility-maximizing superintelligences are unlikely to be stubbornly destructive at any rate. (By the way, I like your blog, and am glad to see you posting here on Less Wrong.)
It appears to be typical cold war reasoning, nothing new. Was MAD rational? Perhaps not in the idealized world of abstraction, but many real world situations rely on both sides maintaining a feeling of control by threatening each other. Maybe there is more to the Parfit case? It seems very easy to model and to involve typical day-to-day types of threat assessment.

In most of these cases we can distinguish further: what is rational is to act in a certain way and to have a certain reputation. This has the benefit of being more airtight - one can argue for a logical relationship between disposition and action. (In Newcomb, the existence of an omniscient agent makes them all equivalent, but weird assumptions lead to weird conclusions.)

Your discussion of the threat game is utterly dissolved by game theory. The game between Tom and Joe has a mixed Nash equilibrium where both make some sort of "probabilistic precommitments", and neither can improve their outcome by changing their "disposition" while assuming the other's "disposition" as given.

I've been tinkering with the idea of making a top level post on this issue, but figured it would get excessively downvoted. So I'll risk it here.

For any decision theory, isn't there some hypothetical where Omega can say, "I've analyzed your decision theory, and I'm giving you proposition X, such that if you act the way your decision theory believes is optimal, you will lose?" The "Omega scans your brain and tortures you if you're too rational" case would be an obvious example of this.

Designing a decision theory around any such problem seems ...

I think of Omega as a simplified stand-in for other people. The part about Omega being omniscient and knowably trustworthy isn't solved. But I think the problem of Omega rewarding bizarre irrational behaviour on your part mostly goes away if you assume it's fairly human-like, perhaps following UDT or some other decision theory itself. The human motivation for it posing Newcomb's problem could be that it wants one of the boxes kept closed for some reason, and will reward you for keeping it closed. To make it fit this explanation, Omega should say it doesn't want you to open the box, and preferably give a reason. Kinds of things the human-like Omega might do: * trust you or not based on its prediction of your behaviour. * prefer you to be rewarded if you act how it wants. * prefer you be punished if you harm it. * tell you what it wants of you. But it should be less likely to reward you for acting irrationally for no reason, or for doing what it wants you not to do.
This isn't obvious. In particular, note that your "obvious example" violates the basic assumption all these attempts at a decision theory are using, that the payoff depends only on your choice and not how you arrived at it. Of course this is not necessarily a realistic assumption, but that is, IINM, the problem they're trying to solve.
Omega simulates you in a variety of scenarios. If you consistently make rational decisions he tortures you.
My reply to this was going to be essentially the same as my comment on bentarm's thread [http://lesswrong.com/lw/2lm/how_can_we_compare_decision_theories/2gv1?c=1], so I'll just point you there.
That does make it somewhat more useful, if that's the constraint under which it's operating. It still strikes me as probable that, insofar as decision theory A+ makes decisions that theory A- does not, there must be some way to reward A- and punish A+. I may well be wrong about this. The other flaws, namely the fact that actual decision makers do not encounter omniscient entities with entirely inscrutable motives that are unwaveringly honest, still seem to render the pursuit futile. It's decidedly less futile if Omega is constrained to outcome-based reward/punishment.