It seems to me you're using "perceived probability" and "probability" interchangeably. That is, you're "defining" probability as the probability that an observer assigns based on certain pieces of information. Is it not true that when one rolls a fair 1d6, there is an actual 1/6 probability of getting any one specific value? Or using your biased coin example: our information may tell us to assume a 50/50 chance, but the man may be correct in saying that the coin has a bias--that is, the coin may really come up heads 80% of the...
"Is it not true that when one rolls a fair 1d6, there is an actual 1/6 probability of getting any one specific value?"
No. The unpredictability of a die roll or coin flip is not due to any inherent physical property of the objects; it is simply due to lack of information. Even with quantum uncertainty, you could predict the result of a coin flip or die roll with high accuracy if you had precise enough measurements of the initial conditions.
Let's look at the simpler case of the coin flip. As Jaynes explains it, consider the phase space for the coin's motion at the moment it leaves your fingers. Some points in that phase space will result in the coin landing heads up; color these points black. Other points in the phase space will result in the coin landing tails up; color these points white. If you examined the phase space under a microscope (metaphorically speaking) you would see an intricate pattern of black and white, with even a small movement in the phase space crossing many boundaries between a black region and a white region.
If you knew the initial conditions precisely enough, you would know whether the coin was in a white or black region of phase space, and you...
Case in point:
There are dice designed with very sharp corners in order to improve their randomness.
If randomness were an inherent property of dice, simply refining the shape shouldn't change the randomness; they are still plain balanced dice, after all.
But when you think of a "random" throw of the dice as a combination of the position of the dice in the hand, the angle of the throw, the speed and angle of the dice as they hit the table, the relative friction between the dice and the table, and the sharpness of the corners as they tumble to a stop, you realize that if you have all the relevant information you can predict the roll of the dice with high certainty.
It's only because we don't have the relevant information that we say the probabilities are 1/6.
GBM:
Q: What is the probability for a pseudo-random number generator to generate a specific number as its next output?
A: 1 or 0 because you can actually calculate the next number if you have the available information.
Q: What probability do you assign to a specific number as being its next output if you don't have the information to calculate it?
Replace pseudo-random number generator with dice and repeat.
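The two questions above can be made concrete with a small sketch. It uses Python's standard `random.Random` as a stand-in pseudo-random die; the seed value and the helper names `informed_probability` and `uninformed_probability` are made up for illustration and come from nowhere in this discussion.

```python
import random
from fractions import Fraction

SEED = 2008  # hypothetical seed, known only to the informed observer


def roll(generator):
    """One roll of a six-sided die driven by a pseudo-random generator."""
    return generator.randint(1, 6)


def informed_probability(face, seed=SEED):
    """An observer who knows the seed replays the generator: the answer is 0 or 1."""
    return Fraction(int(roll(random.Random(seed)) == face))


def uninformed_probability(face):
    """An observer without the seed has no reason to favour any face."""
    return Fraction(1, 6)


actual = roll(random.Random(SEED))
for face in range(1, 7):
    print(face, informed_probability(face), uninformed_probability(face))
print("actual roll:", actual)
```

The generator is the same object in both cases; only the observer's information differs, and so does the number each observer assigns.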
Even more important, I think, is the realization that, to decide how much you're willing to bet on a specific outcome, all of the following are essentially the same:
The bottom line is that you don't know what the next value will be, and that's the only thing that matters.
So therefore a person with perfect knowledge would not need probability. Is this another interpretation of "God does not play dice?" :-)
The Bayesian says, "Uncertainty exists in the map, not in the territory. In the real world, the coin has either come up heads, or come up tails."
Alas, the coin was part of an erroneous stamping, and is blank on both sides.
Here is another example that my dad, my brother and I came up with when we were discussing probability.
Suppose there are 4 cards, an ace and 3 kings. They are shuffled and placed face side down. I didn't look at the cards, my dad looked at the first card, my brother looked at the first and second cards. What is the probability of the ace being one of the last 2 cards? For me: 1/2. For my dad: if he saw the ace it is 0, otherwise 2/3. For my brother: if he saw the ace it is 0, otherwise 1.
How can there be different probabilities of the same event? It is because probability is something in the mind calculated because of imperfect knowledge. It is not a property of reality. Reality will take only a single path. We just don't know what that path is. It is pointless to ask for "the real likelihood" of an event. The likelihood depends on how much information you have. If you had all the information, the likelihood of the event would be 100% or 0%.
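The three answers can be checked by enumeration. This is a minimal sketch, assuming the ace is equally likely to be in any of the four positions; the function name `p_ace_in_last_two` and its parameters are illustrative only.

```python
from fractions import Fraction

ACE_POSITIONS = range(4)  # the ace is equally likely to sit at index 0, 1, 2 or 3


def p_ace_in_last_two(cards_seen, saw_ace):
    """Probability that the ace is among the last two cards, given a peek.

    cards_seen: how many cards from the front the observer looked at.
    saw_ace:    whether the ace was among the cards looked at.
    (An impossible observation, such as seeing the ace among zero cards,
    leaves no consistent positions and raises ZeroDivisionError.)
    """
    consistent = [a for a in ACE_POSITIONS if (a < cards_seen) == saw_ace]
    favourable = [a for a in consistent if a >= 2]
    return Fraction(len(favourable), len(consistent))


print("me:     ", p_ace_in_last_two(0, False))   # 1/2
print("dad:    ", p_ace_in_last_two(1, True), p_ace_in_last_two(1, False))   # 0 2/3
print("brother:", p_ace_in_last_two(2, True), p_ace_in_last_two(2, False))   # 0 1
```

Each observer conditions the same four possibilities on a different observation, which is all that separates the three answers.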
The competent frequentist would presumably not be befuddled by these supposed paradoxes. Since he would not be befuddled (or so I am fairly certain), the "paradoxes" fail to prove the superiority of the Bayesian approach. Frankly, the treatment of these "paradoxes" in terms of repeated experiments seems so straightforward that I don't know how you can possibly think there's a problem.
"Probabilities express uncertainty, and it is only agents who can be uncertain. A blank map does not correspond to a blank territory. Ignorance is in the mind."
Eliezer, in quantum mechanics, one does not say that one does not have knowledge of both position and momentum of a particle simultaneously. Rather, one says that one CANNOT have such knowledge. This contradicts your statement that ignorance is in the mind. If quantum mechanics is true, then ignorance/uncertainty is a part of nature and not just something that agents have.
Constant: The competent frequentist would presumably not be befuddled by these supposed paradoxes.
Not the last two paradoxes, no. But the first case given, the biased coin whose bias is not known, is indeed a classic example of the difference between Bayesians and frequentists. The frequentist says:
"The coin's bias is not a random variable! It's a fixed fact! If you repeat the experiment, it won't come out to a 0.5 long-run frequency of heads!" (Likewise when the fact to be determined is the speed of light, or whatever.) "If you flip the coin 10 times, I can make a statement about the probability that the observed ratio will be within some given distance of the inherent propensity, but to say that the coin has a 50% probability of turning up heads on the first occasion is nonsense - that's just not the real probability, which is unknown."
According to the frequentist, apparently there is no rational way to manage your uncertainty about a single flip of a coin of unknown bias, since whatever you do, someone else will be able to criticize your belief as "subjective" - such a devastating criticism that you may as well, um, flip a coin. Or consul...
I think EY's example here should actually be targeted at the probability as propensity theory of Von Mises (Richard, not Ludwig), not the frequentist theory, although even frequentists often conflate the two.
The probability for you is not some inherent propensity of the physical situation, because the coin will flip depending on how it is weighted and how hard it is flipped. The randomness isn't in the physical situation, but in our limited knowledge of the physical situation.
The argument against frequentist thinking is that we're not interested in a long term frequency of an experiment. We want to know how to bet now. If you're only going to talk about long term frequencies of repeatable experiments, you're not that useful when I'm facing one con man with a biased coin.
That singular event is what it is. If you're going to argue that you have to find the right class of events in your head to sample from, you're already halfway down the road to bayesianism. Now you just have to notice that the class of events is different for the con man than it is for you, because of your differing states of knowledge, and you'll make it all the way there.
Notice how you thought up a symmetrically ...
Maybe I'm stupid here... what difference does it make?
Sure, if we had a coin-flip-predicting robot with quick eyes it might be able to guess right/predict the outcome 90% of the time. And if we were precognitive we could clean up at Vegas.
In terms of non-hypothetical real decisions that confront people, what is the outcome of this line of reasoning? What do you suggest people do differently and in what context? Mark cards?
B/c currently, as far as I can see, you're saying, "The coin won't end up 'heads or tails' -- it'll end up heads, or it'll end u...
Sudeep: the inverse certainty of the position and momentum is a mathematical artifact and does not depend upon the validity of quantum mechanics. (Er, at least to the extent that math is independent of the external world!)
PK: I like your posts, and don't take this the wrong way, but, to me, your example doesn't have as much shocking unintuitiveness as the ones Eliezer Yudkowsky (no underscore) listed.
I'd like to understand: Are frequentist "probability" and subjective "probability" simply two different concepts, to be distinguished carefully? Or is there some true debate here?
I think that Jaynes shows a derivation, following Bayesian principles, of the frequentist probability from the subjective probability. I'd love to see one of Eliezer's lucid explanations on that.
You can derive frequentist probabilities from subjective probabilities but not the other way around.
Silas: My post wasn't meant to be "shockingly unintuitive", it was meant to illustrate Eliezer's point that probability is in the mind and not out there in reality in a ridiculously obvious way.
Am I somehow talking about something entirely different than what Eliezer was talking about? Or should I complexificationafize my vocabulary to seem more academic? English isn't my first language after all.
If I'm being asked to accept or reject a number meant to correspond to the calculated or measured likelihood of heads coming up, and I trust the information about it being biased, then the only correct move is to reject the 0.5 probability.
Alas, no. Here's the deal: implicit in all the coin toss toy problems is the idea that the observations may be modeled as exchangeable. It really really helps to have a grasp on what the math looks like when we assume exchangeability.
In models where (infinite) exchangeability is assumed, the concept of long-run frequen...
Eliezer, I have no argument with the Bayesian use of the probability calculus and so I do not side with those who say "there is no rational way to manage your uncertainty", but I think I probably do have an argument with the insistence that it is the one true way. None of the problems you have so far outlined, including the coin one, really seem to doom either frequentism specifically, or more generally, an objective account of probability. I agree with this:
Even before a fair coin is tossed, the notion that it has an inherent 50% probability of...
No way to do it the other way around? Nothing along the lines of, say, considering a set of various "things to be explained" and for each a hypothesis explaining it, and then talking about subsets of those? I.e., a subset in which 1/10 of the hypotheses in that subset are objectively true would be a set of hypotheses assigned .1 probability, or something?
Yeah, the notion of how to do this exactly is, admittedly, fuzzy in my head, but I have to say that it sure does seem like there ought to be some way to use the notion of frequentist probability to construct subjective probability along these lines.
I may be completely wrong though.
"Suppose our information about bias in favour of heads is equivalent to our information about bias in favour of tail. Our pdf for the long-run frequency will be symmetrical about 0.5 and its expectation (which is the probability in any single toss) must also be 0.5. It is quite possible for an expectation to take a value which has zero probability density."
What I said: if all you know is that it's a trick coin, you can lay even odds on heads.
"We can refuse to believe that the long-run frequency will converge to exactly 0.5 while simultaneou...
But frequentists emphatically are not talking about individual tosses. They are talking about infinitely repeated tosses.
In other words, they are talking about tail events. That a frequentist probability (i.e., a long-run frequency) even exists can be a zero-probability event -- but you have to give axioms for probability before you can even make this claim. (Furthermore, I'm never going to observe a tail event, so I don't much care about them.)
Conrad,
Okay, so unpack "ungrounded" for me. You've used the phrases "probability" and "calculated or measured likelihood of heads coming up", but I'm not sure how you're defining them.
I'm going to do two things. First, I'm going to Taboo "probability" and "likelihood" (for myself -- you too, if you want). Second, I'm going to ask you exactly which specific observable event it is we're talking about. (First toss? Twenty-third toss? Infinite collection of tosses?) I have a definite feeling that our disagreement is about word usage.
If you honestly subscribe to this view of probability, please never give the odds for winning the lottery again. Or any odds for anything else.
What does telling me your probability that you assign something actually tell me about the world? If I don't know the information you are basing it on, very little.
I'm also curious about a formulation of probability theory that completely ignores random numbers and other theories that are based upon them (e.g. the law of large numbers, the central limit theorem).
Heck a re-write of http://en.wikipedia.org/wiki/Probability_theory with all mention of probabilities in the external world removed might be useful.
I'm not sure the many-worlds interpretation fully eliminates the issue of quantum probability as part of objective reality. You can call it "anthropic pseudo-uncertainty" when you get split and find that your instances face different outcomes. But what determines the probability you will see those various outcomes? Just your state of knowledge? No, theory says it is an objective element of reality, the amplitude of the various elements of the quantum wave function. This means that probability, or at least its close cousin amplitude, is indeed an ...
Roland and Ian C. both help me understand where Eliezer is coming from. And PK's comment that "Reality will only take a single path" makes sense. That said, when I say a die has a 1/6 probability of landing on a 3, that means: Over a series of rolls in which no effort is made to systematically control the outcome (e.g. by always starting with 3 facing up before tossing the die), the die will land on a 3 about 1 in 6 times. Obviously, with perfect information, everything can be calculated. That doesn't mean that we can't predict the probability of...
::Okay, so unpack "ungrounded" for me. You've used the phrases "probability" and "calculated or measured likelihood of heads coming up", but I'm not sure how you're defining them.::
Ungrounded: That was a good movie. Grounded: That movie made money for the investors. Alternatively: I enjoyed it and recommend it. -- is for most purposes grounded enough.
::I'm going to do two things. First, I'm going to Taboo "probability" and "likelihood" (for myself -- you too, if you want). Second, I'm going to ask you...
GBM:: ..That said, when I say a die has a 1/6 probability of landing on a 3, that means: Over a series of rolls in which no effort is made to systematically control the outcome (e.g. by always starting with 3 facing up before tossing the die), the die will land on a 3 about 1 in 6 times.::
--Well, no: it does mean that, but don't let's get tripped up that a measure of probability requires a series of trials. It has that same probability even for one roll. It's a consequence of the physics of the system, that there are 6 stable distinguishable end-states and explosively many intermediate states, transitioning amongst each other chaotically.
Conrad.
I have to say that it sure does seem like there ought to be some way to use the notion of frequentist probability to construct subjective probability along these lines.
Assign a measure to each possible world (the prior probabilities). For some state of knowledge K, some set of worlds Ck is consistent with K (say, the set in which there is a brain containing K). For some proposition X, X is true in some set of worlds Cx. The subjective probability P(X|K) = measure(intersection(Ck,Cx)) / measure(Ck). Bayesian updating is equivalent to removing worlds from Ck. To make it purely frequentist, give each world measure 1 and use multisets.
Does that work?
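One way to try the construction out is a toy sketch like the following, which takes the faces of a die as the possible worlds and uses a `collections.Counter` as the multiset. The function name `subjective_probability` and the two example predicates are assumptions made for illustration, not part of the proposal.

```python
from collections import Counter
from fractions import Fraction


def subjective_probability(worlds, knowledge, proposition):
    """P(X|K) = measure(intersection(Ck, Cx)) / measure(Ck) over a multiset of worlds.

    worlds:      Counter mapping each possible world to its measure.
    knowledge:   predicate selecting the worlds consistent with K (the set Ck).
    proposition: predicate selecting the worlds where X is true (the set Cx).
    """
    measure_ck = sum(n for w, n in worlds.items() if knowledge(w))
    measure_both = sum(
        n for w, n in worlds.items() if knowledge(w) and proposition(w)
    )
    return Fraction(measure_both, measure_ck)


# Toy universe: each face of a die is one possible world with measure 1.
worlds = Counter(range(1, 7))


def knows_nothing(world):
    return True


def knows_roll_is_even(world):
    return world % 2 == 0


def at_least_four(world):
    return world >= 4


print(subjective_probability(worlds, knows_nothing, at_least_four))       # 1/2
print(subjective_probability(worlds, knows_roll_is_even, at_least_four))  # 2/3
```

Learning that the roll is even removes the odd worlds from Ck, and the same counting rule then yields the updated value.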
Who else thinks we should Taboo "probability", and replace it with two terms for objective and subjective quantities, say "frequency" and "uncertainty"?
The frequency of an event depends on how narrowly the initial conditions are defined. If an atomically identical coin flip is repeated, obviously the frequency of heads will be either 1 or 0 (modulo a tiny quantum uncertainty).
GBM, I think you get the idea. The reason we don't want to say that the gomboc has an inherent probability of one for righting itself (besides that we, um, don't use probability one), is that as it is with the gomboc, so it is with the die or anything else in the universe. The premise is that determinism, in the form of some MWI, is (probably!) true, and so no matter what you or anyone else knows, whatever will happen is sure to happen. Therefore, when we speak of probability, we can only be referring to a state of knowledge. It is still of course the case...
Cyan, sorry. My comment was to Eliezer and statements such as
"that probabilities express ignorance, states of partial information; and if I am ignorant of a phenomenon, that is a fact about my state of mind, not a fact about the phenomenon."
Before accepting this view of probability and the underlying assumptions about the nature of reality one should look at the experimental evidence. Try Groeblacher, Paterek, et al., arXiv:0704.2529 (Aug 6 2007). These experiments test various assumptions regarding non-local realism and conclude: "...giving up the concept of locality is not sufficient to be consistent with quantum experiments, unless certain intuitive features of realism are abandoned."
Standard reply from MWIers is that MWI keeps realism and locality by throwing away a different hidden assumption called "counterfactual definiteness".
Nick Tarleton:
Who else thinks we should Taboo "probability", and replace it with two terms for objective and subjective quantities, say "frequency" and "uncertainty"?
I second that, this would probably clear a lot of the confusion and help us focus on the real issues.
The "probability" of an event is how much anticipation you have for that event occurring. For example if you assign a "probability" of 50% to a tossed coin landing heads then you are half anticipating the coin to land heads.
Yesterday I spoke of the Mind Projection Fallacy, giving the example of the alien monster who carries off a girl in a torn dress for intended ravishing—a mistake which I imputed to the artist's tendency to think that a woman's sexiness is a property of the woman herself, woman.sexiness, rather than something that exists in the mind of an observer, and probably wouldn't exist in an alien mind.
The term "Mind Projection Fallacy" was coined by the late great Bayesian Master, E. T. Jaynes, as part of his long and hard-fought battle against the accursèd frequentists. Jaynes was of the opinion that probabilities were in the mind, not in the environment—that probabilities express ignorance, states of partial information; and if I am ignorant of a phenomenon, that is a fact about my state of mind, not a fact about the phenomenon.
I cannot do justice to this ancient war in a few words—but the classic example of the argument runs thus:
You have a coin.
The coin is biased.
You don't know which way it's biased or how much it's biased. Someone just told you, "The coin is biased" and that's all they said.
This is all the information you have, and the only information you have.
You draw the coin forth, flip it, and slap it down.
Now—before you remove your hand and look at the result—are you willing to say that you assign a 0.5 probability to the coin having come up heads?
The frequentist says, "No. Saying 'probability 0.5' means that the coin has an inherent propensity to come up heads as often as tails, so that if we flipped the coin infinitely many times, the ratio of heads to tails would approach 1:1. But we know that the coin is biased, so it can have any probability of coming up heads except 0.5."
The Bayesian says, "Uncertainty exists in the map, not in the territory. In the real world, the coin has either come up heads, or come up tails. Any talk of 'probability' must refer to the information that I have about the coin—my state of partial ignorance and partial knowledge—not just the coin itself. Furthermore, I have all sorts of theorems showing that if I don't treat my partial knowledge a certain way, I'll make stupid bets. If I've got to plan, I'll plan for a 50/50 state of uncertainty, where I don't weigh outcomes conditional on heads any more heavily in my mind than outcomes conditional on tails. You can call that number whatever you like, but it has to obey the probability laws on pain of stupidity. So I don't have the slightest hesitation about calling my outcome-weighting a probability."
I side with the Bayesians. You may have noticed that about me.
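The Bayesian's 50/50 can be illustrated with a small sketch. It assumes a made-up discrete prior over the coin's bias that is symmetric about 1/2 and puts no weight on 1/2 itself; the particular bias values and the helper names `p_heads` and `update_on_heads` are illustrative choices, not anything stated above.

```python
from fractions import Fraction

# Hypothetical prior: the bias toward heads is one of these values, each
# equally plausible. The set is symmetric about 1/2 and excludes 1/2,
# because all we were told is "the coin is biased".
BIASES = [Fraction(k, 10) for k in (1, 2, 3, 4, 6, 7, 8, 9)]
PRIOR = {bias: Fraction(1, len(BIASES)) for bias in BIASES}


def p_heads(belief):
    """Probability assigned to heads on the next flip under this belief."""
    return sum(weight * bias for bias, weight in belief.items())


def update_on_heads(belief):
    """Belief about the bias after observing one flip come up heads."""
    evidence = p_heads(belief)
    return {bias: weight * bias / evidence for bias, weight in belief.items()}


print(p_heads(PRIOR))                   # 1/2
print(p_heads(update_on_heads(PRIOR)))  # 13/20
```

Under these assumptions the first flip gets exactly 1/2 even though no candidate bias equals 1/2, and a single observed head already shifts the weighting, so the number tracks the information rather than the coin.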
Even before a fair coin is tossed, the notion that it has an inherent 50% probability of coming up heads may be just plain wrong. Maybe you're holding the coin in such a way that it's just about guaranteed to come up heads, or tails, given the force at which you flip it, and the air currents around you. But, if you don't know which way the coin is biased on this one occasion, so what?
I believe there was a lawsuit where someone alleged that the draft lottery was unfair, because the slips with names on them were not being mixed thoroughly enough; and the judge replied, "To whom is it unfair?"
To make the coinflip experiment repeatable, as frequentists are wont to demand, we could build an automated coinflipper, and verify that the results were 50% heads and 50% tails. But maybe a robot with extra-sensitive eyes and a good grasp of physics, watching the autoflipper prepare to flip, could predict the coin's fall in advance—not with certainty, but with 90% accuracy. Then what would the real probability be?
There is no "real probability". The robot has one state of partial information. You have a different state of partial information. The coin itself has no mind, and doesn't assign a probability to anything; it just flips into the air, rotates a few times, bounces off some air molecules, and lands either heads or tails.
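A toy simulation can make the robot scenario concrete. Everything in it is an assumption chosen for illustration: the coin starts heads up, spins at a rate drawn uniformly between 20 and 40 revolutions per second, stays in the air for half a second, and lands heads when it has completed an even number of half-turns. The robot is modelled as measuring the spin rate with a little Gaussian noise; the noise level is picked so that its accuracy comes out near 90%.

```python
import random


def lands_heads(spin_rate, flight_time=0.5):
    """Deterministic toy coin: heads iff an even number of half-turns is completed."""
    half_turns = int(2 * spin_rate * flight_time)
    return half_turns % 2 == 0


def simulate(trials=50_000, noise=0.125, seed=0):
    """Return (accuracy of always guessing heads, accuracy of the measuring robot)."""
    rng = random.Random(seed)
    naive_hits = 0
    robot_hits = 0
    for _ in range(trials):
        spin_rate = rng.uniform(20.0, 40.0)  # hidden from the naive observer
        outcome = lands_heads(spin_rate)
        naive_hits += outcome                # naive observer always says "heads"
        measured = spin_rate + rng.gauss(0.0, noise)  # robot's imperfect reading
        robot_hits += lands_heads(measured) == outcome
    return naive_hits / trials, robot_hits / trials


naive_accuracy, robot_accuracy = simulate()
print(f"always guess heads: {naive_accuracy:.3f}")
print(f"robot with sensors: {robot_accuracy:.3f}")
```

In this model the flips themselves are fully deterministic; the naive observer is right about half the time and the robot about nine times in ten, and the only difference between them is what each one knows about the spin rate.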
So that is the Bayesian view of things, and I would now like to point out a couple of classic brainteasers that derive their brain-teasing ability from the tendency to think of probabilities as inherent properties of objects.
Let's take the old classic: You meet a mathematician on the street, and she happens to mention that she has given birth to two children on two separate occasions. You ask: "Is at least one of your children a boy?" The mathematician says, "Yes, he is."
What is the probability that she has two boys? If you assume that the prior probability of a child being a boy is 1/2, then the probability that she has two boys, on the information given, is 1/3. The prior probabilities were: 1/4 two boys, 1/2 one boy one girl, 1/4 two girls. The mathematician's "Yes" response has probability ~1 in the first two cases, and probability ~0 in the third. Renormalizing leaves us with a 1/3 probability of two boys, and a 2/3 probability of one boy one girl.
But suppose that instead you had asked, "Is your eldest child a boy?" and the mathematician had answered "Yes." Then the probability of the mathematician having two boys would be 1/2. Since the eldest child is a boy, and the younger child can be anything it pleases.
Likewise if you'd asked "Is your youngest child a boy?" The probability of their being both boys would, again, be 1/2.
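These three figures can be checked by listing the four equally likely families and discarding the ones each answer rules out. This is a minimal sketch under the stated 1/2 prior for each child; the helper name `p_two_boys` is illustrative.

```python
from fractions import Fraction
from itertools import product

# (eldest, youngest); with a 1/2 prior per child all four families are equally likely.
FAMILIES = list(product("BG", repeat=2))


def p_two_boys(answer_is_yes):
    """Probability of two boys after a "Yes" to the question encoded by the predicate."""
    remaining = [family for family in FAMILIES if answer_is_yes(family)]
    both_boys = [family for family in remaining if family == ("B", "B")]
    return Fraction(len(both_boys), len(remaining))


print(p_two_boys(lambda f: "B" in f))     # "At least one boy?"  -> 1/3
print(p_two_boys(lambda f: f[0] == "B"))  # "Is the eldest a boy?"   -> 1/2
print(p_two_boys(lambda f: f[1] == "B"))  # "Is the youngest a boy?" -> 1/2
```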
Now, if at least one child is a boy, it must be either the oldest child who is a boy, or the youngest child who is a boy. So how can the answer in the first case be different from the answer in the latter two?
Or here's a very similar problem: Let's say I have four cards, the ace of hearts, the ace of spades, the two of hearts, and the two of spades. I draw two cards at random. You ask me, "Are you holding at least one ace?" and I reply "Yes." What is the probability that I am holding a pair of aces? It is 1/5. There are six possible combinations of two cards, with equal prior probability, and you have just eliminated the possibility that I am holding a pair of twos. Of the five remaining combinations, only one combination is a pair of aces. So 1/5.
Now suppose that instead you asked me, "Are you holding the ace of spades?" If I reply "Yes", the probability that the other card is the ace of hearts is 1/3. (You know I'm holding the ace of spades, and there are three possibilities for the other card, only one of which is the ace of hearts.) Likewise, if you ask me "Are you holding the ace of hearts?" and I reply "Yes", the probability I'm holding a pair of aces is 1/3.
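The card figures can be checked the same way, by enumerating the six possible two-card hands. The card labels and the helper name `p_pair_of_aces` below are illustrative; the sketch assumes every hand is equally likely and that the questions are answered truthfully.

```python
from fractions import Fraction
from itertools import combinations

DECK = ["AH", "AS", "2H", "2S"]      # ace/two of hearts/spades
HANDS = list(combinations(DECK, 2))  # the six equally likely two-card hands


def p_pair_of_aces(answer_is_yes):
    """Probability of holding both aces after a "Yes" to the given question."""
    remaining = [hand for hand in HANDS if answer_is_yes(hand)]
    pairs = [hand for hand in remaining if set(hand) == {"AH", "AS"}]
    return Fraction(len(pairs), len(remaining))


print(p_pair_of_aces(lambda h: "AH" in h or "AS" in h))  # at least one ace -> 1/5
print(p_pair_of_aces(lambda h: "AS" in h))               # ace of spades    -> 1/3
print(p_pair_of_aces(lambda h: "AH" in h))               # ace of hearts    -> 1/3
```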
But then how can it be that if you ask me, "Are you holding at least one ace?" and I say "Yes", the probability I have a pair is 1/5? Either I must be holding the ace of spades or the ace of hearts, as you know; and either way, the probability that I'm holding a pair of aces is 1/3.
How can this be? Have I miscalculated one or more of these probabilities?
If you want to figure it out for yourself, do so now, because I'm about to reveal...
That all stated calculations are correct.
As for the paradox, there isn't one. The appearance of paradox comes from thinking that the probabilities must be properties of the cards themselves. The ace I'm holding has to be either hearts or spades; but that doesn't mean that your knowledge about my cards must be the same as if you knew I was holding hearts, or knew I was holding spades.
It may help to think of Bayes's Theorem:

P(H|E) = P(E|H) P(H) / P(E)
That last term, where you divide by P(E), is the part where you throw out all the possibilities that have been eliminated, and renormalize your probabilities over what remains.
Now let's say that you ask me, "Are you holding at least one ace?" Before I answer, your probability that I say "Yes" should be 5/6.
But if you ask me "Are you holding the ace of spades?", your prior probability that I say "Yes" is just 1/2.
So right away you can see that you're learning something very different in the two cases. You're going to be eliminating some different possibilities, and renormalizing using a different P(E). If you learn two different items of evidence, you shouldn't be surprised at ending up in two different states of partial information.
Similarly, if I ask the mathematician, "Is at least one of your two children a boy?" I expect to hear "Yes" with probability 3/4, but if I ask "Is your eldest child a boy?" I expect to hear "Yes" with probability 1/2. So it shouldn't be surprising that I end up in a different state of partial knowledge, depending on which of the two questions I ask.
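Plugging the numbers from the last few paragraphs into Bayes's Theorem shows how the different P(E) values produce the different answers. In this sketch the hypothesis H is "pair of aces" or "two boys", the evidence E is the "Yes" answer, and P(E|H) is taken to be 1 because a truthful "Yes" is certain when H holds; the function name `posterior` is illustrative.

```python
from fractions import Fraction


def posterior(prior_h, p_e_given_h, p_e):
    """Bayes's Theorem: P(H|E) = P(E|H) * P(H) / P(E)."""
    return p_e_given_h * prior_h / p_e


pair_of_aces = Fraction(1, 6)  # one hand in six is the pair of aces
two_boys = Fraction(1, 4)      # one family in four is boy-boy

# Same hypothesis, different questions, different P(E) to renormalize by.
print(posterior(pair_of_aces, 1, Fraction(5, 6)))  # "At least one ace?"    -> 1/5
print(posterior(pair_of_aces, 1, Fraction(1, 2)))  # "The ace of spades?"   -> 1/3
print(posterior(two_boys, 1, Fraction(3, 4)))      # "At least one boy?"    -> 1/3
print(posterior(two_boys, 1, Fraction(1, 2)))      # "Is the eldest a boy?" -> 1/2
```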
The only reason for seeing a "paradox" is thinking as though the probability of holding a pair of aces is a property of cards that have at least one ace, or a property of cards that happen to contain the ace of spades. In which case, it would be paradoxical for card-sets containing at least one ace to have an inherent pair-probability of 1/5, while card-sets containing the ace of spades had an inherent pair-probability of 1/3, and card-sets containing the ace of hearts had an inherent pair-probability of 1/3.
Similarly, if you think a 1/3 probability of being both boys is an inherent property of child-sets that include at least one boy, then that is not consistent with child-sets of which the eldest is male having an inherent probability of 1/2 of being both boys, and child-sets of which the youngest is male having an inherent 1/2 probability of being both boys. It would be like saying, "All green apples weigh a pound, and all red apples weigh a pound, and all apples that are green or red weigh half a pound."
That's what happens when you start thinking as if probabilities are in things, rather than probabilities being states of partial information about things.
Probabilities express uncertainty, and it is only agents who can be uncertain. A blank map does not correspond to a blank territory. Ignorance is in the mind.