jsalvata

Comments

Beautiful Probability

Eliezer:

The results of the two experimenters in the example are different: to begin with, the 2nd experimenter's first result is a non-cure (otherwise he would have stopped there with 100% success); at least one of the three following results is also a non-cure (otherwise he would have stopped at 75%); etc. Also, his last result is a cure (otherwise he would have stopped one patient earlier).

The first experimenter almost certainly got different results, unless he was as lucky as a lottery winner: the odds that a series of Bernoulli trials produces a sequence x1..x100 in which no prefix x1..xN has a higher rate of successes than the whole sequence are really small.

Note that this argument gets weaker as you change the definition of "definitely greater than 60%" to require greater statistical confidence (indeed .99 results are less sensitive to methodological biases than .95 results), but even at .99 the odds that the sequence obtained by the 1st doctor would finish exactly where the 2nd doctor would stop are well below 1/10 (I just did a quick upper-bound calculation; the actual figure is even smaller).
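The claim about prefixes can be checked roughly with a Monte Carlo run. The sketch below is only an illustration: the cure probability of 0.7, the number of trials and the function names are assumptions of mine, and it tests the simple "no proper prefix has a higher success rate than the whole sequence" condition rather than any particular confidence-based stopping rule.

```python
import random

def no_prefix_beats_total(xs):
    """True if no proper prefix has a strictly higher success rate than the whole sequence."""
    n, total = len(xs), sum(xs)
    successes = 0
    for i, x in enumerate(xs[:-1], start=1):
        successes += x
        # successes/i > total/n, compared with integers to avoid rounding
        if successes * n > total * i:
            return False
    return True

def estimate(p=0.7, n=100, trials=20000, seed=0):
    """Fraction of simulated sequences in which no prefix beats the final rate.

    p is an assumed true cure probability, not a figure from the comment.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        xs = [1 if rng.random() < p else 0 for _ in range(n)]
        if no_prefix_beats_total(xs):
            hits += 1
    return hits / trials

print(estimate())
```

Under these assumptions the printed fraction should come out as a small number, in line with the "really small" odds argued above.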

The problem is that (1) when the results are reported in a journal, you only get the total counts -- which hides the methodological trap, and (2) even if you got the full results, you most likely don't have the computational power to discover the difference (except of course in the ~60% of reports from doctor 2 where he reports on a single patient).

The Allais Paradox

While Eliezer's argument is still correct (that you should multiply to make decisions based on probabilistic knowledge), I see a perfectly rational and utilitarian explanation for choosing 1A and 2B in the stated problem.

The clue lies in Colin Reid's comment: "people do not ascribe a low positive utility to winning nothing or close to nothing - they actively fear it". This fear is explained by Kingreaper: "in scenario 1B if you lose you know it's your fault you got nothing".

That makes the two cases, stated as they are, different. In game 1, U1($0) has a negative value: a sense of guilt (or shame) over having made the bad choice, which doesn't seem possible in game 2 (because game 2 is stated in terms of abstract probabilities, see below).

This makes the inequalities compatible:

U($24,000)   >   33/34 U($27,000) + 1/34 U1($0)

e.g. 24 > 33/34 · 27 + 1/34 · (-1000)

0.34 U($24,000) + 0.66 U2($0)   <   0.33 U($27,000) + 0.67 U2($0)

e.g. 0.34 · 24 + 0.66 · 0 < 0.33 · 27 + 0.67 · 0
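The example numbers can be verified with exact arithmetic. This is just a check of the figures above: utilities are in thousands as in the examples, and the variable names are mine.

```python
from fractions import Fraction as F

# Example utilities from the text (in thousands); U1($0) carries the guilt penalty.
U_24K, U_27K = 24, 27
U1_ZERO = -1000   # game 1: losing is "your fault"
U2_ZERO = 0       # game 2: no guilt attached to winning nothing

game1_a = F(U_24K)
game1_b = F(33, 34) * U_27K + F(1, 34) * U1_ZERO
game2_a = F(34, 100) * U_24K + F(66, 100) * U2_ZERO
game2_b = F(33, 100) * U_27K + F(67, 100) * U2_ZERO

print("game 1:", float(game1_a), ">", float(game1_b), game1_a > game1_b)
print("game 2:", float(game2_a), "<", float(game2_b), game2_a < game2_b)
```

With these values 1B is worth -109/34 (about -3.2) against 24 for 1A, while 2A is worth 8.16 against 8.91 for 2B, so both inequalities hold at once.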

Note that stating the game with the "switch" rule turns game 2 into one (let's call it 3) in which the guilt/shame reappears, making U3=U1 -- so a rational player with the described negative U1 would choose A in game 3 and there would be no money pump.

This solution to the paradox is less valid if it is made clear that the subject will be allowed to play the game many times.

Another interesting way to remove this as a possible solution would be to restate case 2 in more concrete terms, to make it clear that you won't get away with not knowing that "it was your fault" if you lose:

4A. If a 100-sided die falls on <=34, win $24,000, otherwise win nothing.

4B. If a 100-sided die falls on <=33, win $27,000, otherwise win nothing.

Just to prevent the subject from pattern-matching instead of thinking, we should add the phrase "note that if the die falls on a 34 and you've chosen A, you win 24k, but if you've chosen B, you get nothing".

I believe game 4 is pretty equivalent to game 3 (the one with the switch).
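To make the mechanics of game 4 explicit, here is a small sketch that tabulates the payoffs per die roll. The function and variable names are mine, and it assumes a fair die with faces 1 to 100.

```python
def payoff(choice, roll):
    """Payoff in dollars for game 4, given the choice ("A" or "B") and the die roll (1..100)."""
    if choice == "A":
        return 24_000 if roll <= 34 else 0
    if choice == "B":
        return 27_000 if roll <= 33 else 0
    raise ValueError(f"unknown choice: {choice!r}")

rolls = range(1, 101)

# Rolls on which A pays out but B leaves you with nothing.
regret_rolls_for_b = [r for r in rolls if payoff("A", r) > 0 and payoff("B", r) == 0]

# Expected monetary value of each choice under a fair die.
expected = {c: sum(payoff(c, r) for r in rolls) / 100 for c in ("A", "B")}

print(regret_rolls_for_b)
print(expected)
```

The only roll on which the choice decides between winning and getting nothing is 34, which is exactly the case the added phrase points out to the subject.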

I've checked Allais' document and it suffers from the same flaw: it's not an actual experiment in which people are asked to choose A or B and actually allowed to play the game, but a questionnaire asking subjects what they would choose. This is not the same, among other reasons because it doesn't force the experimenter or subject to detail the mechanics of the game (and hence it is not stated whether the subject will be given that sense of shame or even allowed to "chase the rabbit").

It would be interesting to know the result of an actual experiment with this design, possibly with smaller figures to reduce the non-linearity of the utility functions (since that's not what's being discussed here), and with subjects filtered against innumeracy (since those are beyond hope anyway).