IMO there may be one Bayesian who falls for this, and there may be one frequentist who is struck by lightning twice.
Depending on your prior that some unknown person is mailing you tomorrow's stock prices (probably very low), there is some number of correct predictions which should convince you this is true. If that prior weight is low enough (and it should be tiny), it might be that the number of correct predictions is so high that the entire world's population of Bayesians is not enough for one Bayesian who falls for it. But if there IS one Bayesian who falls for it, that's bad luck for her, and seems like no argument against Bayesianism.
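To make that concrete, here's a rough sketch of the arithmetic (the prior and the "a genuine predictor is always right, a guesser is right half the time" likelihoods are my own invented assumptions, not anything from the letters):

```python
# Sketch: posterior probability of "genuine predictor" after n correct
# independent binary predictions, starting from a tiny prior.
def posterior_prob(prior, n_correct):
    """P(genuine predictor | n correct calls), assuming a genuine predictor
    is always right and a non-predictor guesses 50/50 each time."""
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * 2 ** n_correct  # each correct call doubles the odds
    return posterior_odds / (1 + posterior_odds)

print(posterior_prob(1e-9, 6))   # ~6.4e-8: six correct calls barely move a tiny prior
print(posterior_prob(1e-9, 40))  # ~0.999: it takes dozens of calls to overwhelm it
```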
I think your intuition that this is a problem comes from the idea that "oh, now all the frequentists can start running this scam and money-pump all the Bayesians." But if you actually do the calculation, it won't work out that way, because the scams interfere with each other. When you receive the 101st (first) letter from a new scammer, you don't bother to open it, because your prior that it's a scam has increased a lot; all the previous cold letters have turned out to be scams.
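One crude way to model that effect is Laplace's rule of succession, assuming (purely for illustration) a uniform prior over the scam rate for cold letters:

```python
# Sketch: how the probability that the NEXT cold letter is a scam rises as
# previous cold letters keep turning out to be scams (Laplace's rule of
# succession under an assumed uniform prior on the scam rate).
def p_next_is_scam(scams_seen, letters_seen):
    return (scams_seen + 1) / (letters_seen + 2)

print(p_next_is_scam(0, 0))      # 0.5  - no track record yet
print(p_next_is_scam(10, 10))    # ~0.92 - ten scams in a row
print(p_next_is_scam(100, 100))  # ~0.99 - you stop opening them
```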
When you receive the 101st (first) letter from a new scammer, you don’t bother to open it because your prior that it’s a scam has increased a lot; all the previous cold letters have turned out to be scams.
Already there. The advice I always give people asking about a dodgy-looking message they've received is "IT'S A SCAM, IT'S ALWAYS A SCAM."
Once you accept that IT'S A SCAM, IT'S ALWAYS A SCAM, you can then at your leisure speculate on how the scam works, i.e. what was the generating process that created the message, but it doesn't really matter if you can't work out how the scam works, because IT'S A SCAM, IT'S ALWAYS A SCAM.
I'm not actually that interested in the scam, more how Bayesianism handles the problem. If we assume the Bayesian has reasonable priors and isn't naive, then your answer makes sense. But when we are talking about science and the frontier of knowledge, we don't have that luxury.
I think I can demonstrate this with a much more abstract problem: how can I use Bayesianism to evaluate multiple schools of epistemology to find the best one? Can Bayesianism come to a definitive answer or will there always be a non-zero probability that any of the schools are the correct one? (assuming any of them are)
While I'm asking this to demonstrate that bad priors and naivety are the default state, I'm also genuinely interested in the answer.
Arguably, using Bayesianism to evaluate schools of epistemology (which may not be Bayesian) is a type error. However, I think I still endorse assigning tentative probabilities to beliefs about epistemology and updating them by Bayes rule in practice, so I think I don't have a full answer for you (only a bunch of disorganized thoughts; it's a deep question).
While I'm asking this to demonstrate that bad priors and naivety are the default state, I'm also genuinely interested in the answer.
It's not quite clear to me what you're getting at.
Here is my Bayesian analysis from the perspective of someone who's received six successful predictions in a row.
Therefore, I have strong Bayesian evidence that it's a scam.
If you have weak/uncertain priors, the thing to do is run low-cost experiments that differentiate between your different hypotheses of what's going on.
A really cheap experiment in relation to the question "is this a scam?" is to Google whether others have received similar letters and what their outcomes were. If it's a scam, you're likely to surface evidence of this; if it's not a scam, you're likely to surface both people saying it went well for them, and debunking sites that explain what's going on, where the letters came from, etc. If a Google search turns up no information at all - just no results - then someone went to a lot of effort to make that the case, and you should be suspicious; it's evidence something's amiss.
In a case where you're at the forefront of scientific discovery, there may be no cheap tests available, but you still devise tests which you predict, based on what you currently know, will go one way if a theory is correct, another way if it's incorrect, and see what happens.
If you have weak/uncertain priors, the thing to do is run low-cost experiments that differentiate between your different hypotheses of what's going on.
CF kind of agrees here. CF says that any contradiction between theories can be used to create meaningful hypotheses that will refute one or both theories. CF also doesn't have a concept of weak or strong arguments; instead it uses decisive criticisms, and claims are either refuted or unrefuted.
What does Bayesianism say if you have strong priors? Is there an objective breakpoint that separates weak from strong priors? (e.g., P < 0.5?)
There isn't a set breakpoint that separates weak from strong priors, it's a continuum from "it seems extremely unlikely that everything I know about the world is false, but it's not technically impossible" to "it seems extremely likely that I'm typing on a keyboard right now, but there's a tiny possibility that something else is going on, like a hallucination or me being a brain in a vat or some other possibility I haven't thought of".
Bayesianism says that if you have strong priors about a particular matter, you should be surprised with corresponding strength if something your priors said shouldn't happen, happens. Some occurrences mean "I adjust what I think slightly, but this was within what I guessed might happen", while others mean "halt and catch fire - either I should fundamentally rethink my notions of how reality works, or I've deeply misunderstood what just happened".
I haven't read about CF (I skimmed your link extremely rapidly, which doesn't count) but it sounds pretty binary. Bayesian thinking is not that.
Bayesianism is about having priors and updating them. If your prior is that the efficient market hypothesis is true and that some people run scams via unsolicited email, getting 6 letters is no strong reason to update towards the company sending the unsolicited email being legit.
It seems like a consistent element of everyone's explanations so far has been that the Bayesian has reasonable priors and is not naive. I'll respond to everyone at once here (hope that's okay) to keep things organized. cc @michaeldickens, @cole-wyeth, @gavin-runeblade, @thomas-castriensis (thanks everyone for the explanations so far).
edit: IDK how to mention/tag people :(
One thing I notice, the Bayesian needs a prior for 'unknown scam' but shouldn't they also consider the inverse? Like an unknown bank error in your favor type situation. I suppose that if the prior for that is small enough it doesn't matter so it's ignored as insignificant. Is the significance a matter of the ratio between priors of unknowns? Like it's way more likely to be an unknown scam than an unknown bank error, so we ignore the bank error.
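A tiny sketch of that ratio idea, with numbers invented purely for illustration (if both unknowns would explain the observation about equally well, their posterior ratio is just their prior ratio):

```python
# Sketch with invented numbers: when one "unknown" explanation has a much
# larger prior than another, the smaller one barely matters in the posterior.
p_unknown_scam = 1e-3
p_unknown_bank_error_in_my_favor = 1e-8

# If both would explain the observation about equally well, their posterior
# ratio is just their prior ratio:
print(p_unknown_scam / p_unknown_bank_error_in_my_favor)  # 100000.0
```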
Something bigger doesn't quite sit right with me: I don't think having priors and not being naive is our default state for knowledge frontiers (hence the initial framing and restrictions on knowledge of scams). Rather, the default is that we are missing priors (or they're wildly inaccurate) and we are naive to the ways in which we are naive (there is a lot that we don't know we don't know).
Here's a concrete example: consider a Bayesian in 1901 (before general relativity). Almost all measurements of planetary motion agreed with Newton's universal gravity (UG), and Mercury's orbit was our main (only?) counterexample. We had multiple (incorrect) theories about how Mercury's orbit might be perturbed by this body or that. The most recently discovered planet at the time (Neptune) was predicted via UG (1 of the 8 then-known planets, i.e. 0.125).
My understanding is that the Bayesian will, upon seeing measurements of all the planets' orbits, update based on successful predictions. Most of the predictions are correct for UG alone, and all of them are correct for UG + New Mystery Planet. Let's call these hypotheses UG and UG+NMP.

Now, based on the reasoning used (if I understand it correctly), the Bayesian already has reasonable priors (at least they believe so), and the general relativity hypothesis (playing the role of the scam hypothesis) is already accounted for in the Bayesian's calculations and updates. It doesn't matter whether the Bayesian is aware of how it works or not. [Unsure] The probability of GR being true is independent of whether the Bayesian knows about it or not[1]. Let's call all the other hypotheses (excluding any UG hypotheses) just O.
Okay, so in the case of observing something uncontroversial, like Mars data, both UG and UG+NMP predict it, so the update raises both of them relative to O without favoring one over the other.

Then, when observing Mercury, we see data that UG alone gets wrong but UG+NMP gets right, so P(UG) drops and P(UG+NMP) rises.

Have I gone wrong somewhere? I'm not confident that I'm at the right midstate here. If there isn't a problem with my logic, then this result feels intuitively wrong, because we're getting more confident in UG+NMP.
Also it seems like
that's how it seems it must be to me. I can't quite put my finger on it, but it feels wrong that learning about an idea independently of all observations of data somehow updates things retroactively. ↩︎
I think you need to make a clearer distinction between a hypothetical perfect bayesian reasoner, who would know all possible hypotheses from the beginning and only narrow them down through observation, and the things humans do to try to approximate that, which will sometimes involve going back and changing the prior when a new hypothesis has been thought of.
[Unsure] The probability of GR being true is independent of whether the Bayesian knows about it or not[1]
Keep in mind that these "probabilities" are subjective assessments of probability based on an individual's prior knowledge, not facts about reality. Two Bayesians with different prior experience may disagree about how probable something is (/seems to them), but reality will not disagree or debate with itself about the truth of the matter, or assign probability to different possibilities (mumble mumble I don't really understand quantum mechanics and am pretending it doesn't matter for the purpose of this conversation).
Whether or not General Relativity is true is unaffected by any probabilities any Bayesian may put on its truth or falsehood when reasoning about the evidence they've seen so far. But whether or not a particular Bayesian finds General Relativity to be probably true, is definitely affected by whether they know about it or not. Keep a clear distinction in your mind between "the probability a Bayesian reasoner assigns to some fact being true" and "whether or not that fact is true in reality" - these are not the same.
This is a "map vs. territory" distinction. Bayesian probabilities go into a mental model of how the world works (map), while how the world actually works is separate from the map.
Keep in mind that these "probabilities" are subjective assessments of probability based on an individual's prior knowledge, not facts about reality. Two Bayesians with different prior experience may disagree about how probable something is (/seems to them),
Doesn't that mean we should expect that Bayesians often disagree and they have no way to resolve it except consulting reality (i.e., an experiment)?
If that's the case, why bother with Bayesianism at all? It seems like in any situation where people needed to agree on the truth of something, Bayesianism wouldn't help:
Also say there is an experiment, is there any standard or agreement among bayesians about how to weight credence? (when should it be weak or strong? etc) Because if there isn't, they might not even be able to agree on what experiment to do or if it will matter.
Doesn't that mean we should expect that Bayesians often disagree and they have no way to resolve it except consulting reality (i.e., an experiment)?
Short answer: Yes.
Longer answer: Two Bayesians who start out with the same prior probabilities, and see the same evidence, should update their posterior probabilities in the same way, and so their mental models should stay consistent with each other. Two Bayesians who start out with different prior probabilities, but see the same evidence, should update their posterior probabilities in ways that are predictable to each other, and in line with the evidence - that is, if one reasoner (A)'s prior probability that (for example) General Relativity is true was high, while another (B)'s was low, then when an experiment is run which provides evidence for general relativity, A's estimates of General Relativity's likelihood of being true will change less than B's (because B's priors were more wrong), but both will update in a direction and to an extent that is predictable to either of them. As they see more and more of the same evidence, their models of the world should converge.
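A toy illustration of that convergence (the coin, the two hypotheses, and both agents' priors are all made up for the example, not anyone's real numbers):

```python
import random

# Two hypotheses about a coin: fair (p_heads = 0.5) vs biased (p_heads = 0.8).
# Agents A and B start with very different priors on "biased" but see the
# same flips; the shared evidence pushes their posteriors together.
random.seed(0)
flips = [random.random() < 0.8 for _ in range(200)]  # the coin really is biased

def update(prior_biased, heads):
    like_biased = 0.8 if heads else 0.2
    like_fair = 0.5
    num = prior_biased * like_biased
    return num / (num + (1 - prior_biased) * like_fair)

a, b = 0.9, 0.01  # A thinks "biased" is likely; B thinks it's very unlikely
for heads in flips:
    a, b = update(a, heads), update(b, heads)

print(round(a, 4), round(b, 4))  # both agents end up near 1.0, in agreement
```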
This is all assuming an ideal Bayesian reasoner with practically-unlimited computing power who doesn't cheat or decide not to reason according to Bayesian rules when it becomes inconvenient, and humans don't meet those constraints. But, there's math to say how much you should update given particular evidence. So:
Also say there is an experiment, is there any standard or agreement among bayesians about how to weight credence?
Yep. "How to weight credence" is a bit unclearly stated, but there's Bayes' formula, which tells you how to update your probabilities based on evidence, and that might be what you're getting at?
Which is (one reason) why you'd bother with Bayesianism at all. It's a method of approaching consensus when working under uncertainty. It's kind of an "agreeing to the rules of the game" situation, where "the rules" are a mathematical equation that says how probabilities must change when people are disagreeing (and "must" here carries the same level of mathematical strength as saying "2+2 must equal 4"; it's not a thing that was decided by committee).

If, for example, you say it's 95% unlikely/5% likely that something will happen under your idea of how the world works, and then it happens, then if you're playing fair, you make a big update, and if you put numbers on it, Bayes' rule tells you what your new numbers should be. If you don't like the new numbers, you have to either acknowledge that what you said your priors were was incorrect, or that what you said your likelihood estimates of different outcomes were was incorrect - so either you retroactively revise how you used to think the world works, or you retroactively revise what you thought would happen and how confident you were, both of which are kind of awkward and embarrassing. And the people you're disagreeing with, if you don't make an appropriately sized update given what you told them your priors and likelihood estimates were, can point this out as a fact.

And if you're not very confident, or you don't think particular evidence should carry much weight, you can express that in a way that makes it clear how much you're going to update based on whatever evidence you see, before you see it, so it isn't like:
"I think x is definitely wrong, and y will provide strong evidence"
"OK, so y didn't go how I thought, now I think x is only almost certain to be false"
It's like:
"I think X is a% likely to be false, and Y experiment will turn out the way I expect with b% likelihood as a result."
"Oh. Ok, well, I guess now I'm down to c% likelihood that X is false. Shoot."
And instead of being like "it's not fair that you moved from definitely to almost certain based on something you said would provide strong evidence but didn't go the way you expected", the reaction is "yep, that math is correct, you updated how you should have given your priors". And before you get to that point, pre-experiment, you can argue over whether b% likelihood is reasonable, whereas it's hard to argue about the correct meaning of the word "strong".
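A small sketch of what that update looks like with invented numbers (the 90%/20% likelihoods of the experiment going my way under each hypothesis are assumptions I'm adding; in practice each party would state their own):

```python
# Sketch with made-up numbers: prior P(X false) = 95% (a%), and I claimed the
# experiment would go my way with 90% probability if X is false (b%), but only
# 20% probability if X is true. The experiment then does NOT go my way.
p_x_false = 0.95
p_myway_if_false = 0.90
p_myway_if_true = 0.20

p_notmyway_if_false = 1 - p_myway_if_false   # 0.10
p_notmyway_if_true = 1 - p_myway_if_true     # 0.80

# Bayes' rule: P(X false | result) = P(result | X false) P(X false) / P(result)
posterior = (p_x_false * p_notmyway_if_false) / (
    p_x_false * p_notmyway_if_false + (1 - p_x_false) * p_notmyway_if_true
)
print(round(posterior, 3))  # ~0.704 (c%): a big, pre-committed drop from 0.95
```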
And once you get really familiar with doing this (I'm still not great at it), you know intuitively how much putting X% probability on a particular outcome means you're going to have to change your views if it doesn't happen, and you become appropriately cautious ("calibrated") in your estimates, and your saying things in probabilities conveys a lot of information to other people who are also familiar with talking this way.
All of that is the idealized theory, among people who are quite smart and can do lots of calculation in their heads. Lots of people also LARP it and use Bayesian-sounding words without actually having the deeper intuitive understanding of what what they're saying means.
I think the missing step is that you're updating
When new evidence comes in that falsifies NMP, P(O) jumps up to 0.5.
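Here's the mechanical version of that step, with credences I've invented just to illustrate the renormalization (not a claim about what the actual numbers should be):

```python
# Sketch of the renormalization step with invented numbers (one possible set
# of credences after the Mercury observations; not anyone's actual figures).
credences = {"UG": 0.25, "UG+NMP": 0.50, "O": 0.25}

# New evidence rules out the New Mystery Planet: zero it out and renormalize.
credences["UG+NMP"] = 0.0
total = sum(credences.values())
credences = {h: p / total for h, p in credences.items()}

print(credences)  # {'UG': 0.5, 'UG+NMP': 0.0, 'O': 0.5} - P(O) jumps to 0.5
```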
Okay I see, thanks.
When new evidence comes in that falsifies NMP, P(O) jumps up to 0.5.
Is that because we only have
Also, I guess the
Also, do you know how things change after GR is published? (assuming no new data in the mean time)
I don't have a reason for setting them equal, no. The prior probabilities could be arbitrarily split between the remaining options.
Yes, that's correct. If we were to keep experimenting and observing, we would find some data that would have essentially 0 likelihood showing up under
That last question is trickier. If there's no new data either way, but it predicts reality better than most hypotheses in
Then you can compare which specific predictions
(This is not intended to be "the" Bayesian answer, just my unsophisticated first reaction.)
Even if our unlucky Bayesian doesn't know about this particular type of scam, shouldn't we still assume that he believes that people sometimes try to fool him in ways he doesn't understand?
In that case he would still conclude that it is more likely that he is being fooled (although he doesn't know how) than that he's been contacted by a genuine oracle.
I'm not an expert Bayesian, and it's not part of my identity which I would feel the need to defend by going "here's why I wouldn't get scammed", but I know how I would answer from a "modify your expectations in light of new evidence" lens, which I understand to be the core of Bayesianism if put into plain English.
The key thing is, what are your priors?
If you were a very naive Bayesian reasoner, say a 5 year old of average intelligence, and your experience was extremely sheltered, skewed towards a very kind world where everyone was always nice to you and you weren't really aware that scams were a thing that happens sometimes, and you didn't know anything about how stock prices worked, you might be taken in. Because your probability that someone really could predict the way these messages indicated would be "I dunno, seems unlikely, but maybe?"
But as an adult human with the priors I have, here's how I'd think of it if I received such a letter. All numbers are roundish and hand-wavy.
Probability it's a scam, excluding marketing spam, donation requests, and surveys, given it's an unsolicited message in any communication medium, selected randomly: I dunno, 70%? There's a reason I don't pick up calls from people who aren't in my contact list any more, if someone I don't know wants to reach me, they can leave a message and let me think before responding.
Probability it's a scam, given the message is unsolicited and it's from someone I don't know, a business I don't do business with, or is anonymous: 80%.
Probability it's a scam, given the message is unsolicited, from someone I don't know, and it's even slightly odd in any way: 90%.
Probability it's a scam, given the message is unsolicited, from someone I don't know, it's odd, and it involves me spending money to make more money later: Basically 100%. At this point, I don't care what it says, it's a scam and it's getting deleted or recycled. It's not worth my time or mental effort to evaluate the claims it contains.
Additionally and separately, but potentially relevant given the content of the message: Probability I do not have an anonymous benefactor who wants me to be rich for the sole reason that that would make them happy or discharge a duty they feel they have: 100%. I mean, not literally mathematically 100%, but close enough, I'm as sure of this as I am of almost anything, aside from things I can verify with my senses directly like "I have 5 fingers on each hand".
Given those priors, unlike those of a sheltered 5 year old, it's hard for a scam to get through.
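A rough numerical version of that, with all of the priors and likelihoods invented in the same hand-wavy spirit as the estimates above:

```python
# Sketch: competing explanations for "I received 6 correct stock predictions",
# using invented priors in the spirit of the hand-wavy estimates above.
priors = {
    "scam (letters selected/targeted somehow)": 0.90,
    "random mail that got lucky":               0.0999999,
    "genuine prophet who wants me rich":        0.0000001,
}
# Invented likelihoods of my observation under each explanation: a selection
# scam and a genuine prophet both produce an all-correct streak for whoever
# is still receiving letters; lucky random mail hits 6/6 only 1 time in 64.
likelihoods = {
    "scam (letters selected/targeted somehow)": 1.0,
    "random mail that got lucky":               0.5 ** 6,
    "genuine prophet who wants me rich":        1.0,
}

norm = sum(priors[h] * likelihoods[h] for h in priors)
for h in priors:
    print(h, round(priors[h] * likelihoods[h] / norm, 7))
# The scam explanation ends up with ~99.8% of the probability.
```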
If I got an unsolicited message from someone I don't know making a stock prediction, I'd be like "that's weird, it's almost certainly a scam". If they wanted me to spend money, I'd be like "definitely a scam" and throw it out, and have forgotten about it one month later.
If I got a second one, I'd check online to see what the nature of the scam was, because it is somewhat odd to get mail scams (those cost a dollar per message to run), and given my field of work, friends and family sometimes check with me to see if things are legit, so I'd want to know what the deal was. I hadn't heard of the Mail-Order Prophet scam before, so if I had been sent a piece of mail trying to draw me in before having read this on LessWrong, I'd be like "neat, new category of scam!"
A similar situation, to demonstrate that I'm not BSing about how I'd react to unsolicited nice things happening from unexpected benefactors: One time, I got a piece of mail saying there was a package for me at the nearby gas station. Which was super weird, I didn't know gas stations took packages. But OK. So I go to the gas station, and the guy behind the desk was like "were you expecting a TV?", to which the response was "No". But, there was a TV there for me, new in box. I was very confused about what the scam was in this case, but I was like "well OK, I guess I'll take it". And I looked for contact information or some other indication of who had sent this, and the only thing was a tech support/setup help number. So I called that, and was like "so this seems really scammy, what's the deal?" And they explained that these TVs were sent to people who opened new bank accounts at a local bank, as a part of a promotional offer. And I had in fact opened a new bank account, because I needed to, several weeks prior, and nobody had told me about the promotional offer. So I was still sufficiently concerned to call my bank and confirm, rather than trusting the word of the person on the other end of the contact information on the unsolicited TV's box. The bank did confirm, and only then did I plug it in and set it up.
Backing selection effects out of data is a notoriously expensive operation without guarantees of convergence to the true distribution afaik.
That seems reasonable, but how can the Bayesian tell in the scam example? (if there's a selection effect at play or not) What about in the real world? We do have selection effects like this in nature (eg evolution / natural selection), so it seems like Bayesianism should be able to handle it.
The likelihood function is a modeling choice. You are free to choose one that yields a lower P(Not Scam | N Correct Predictions) than an unbiased model would.
For this scam, even if they mailed letters to everyone on the planet, you could calculate the maximum number of guaranteed correct predictions they could have achieved if they are running this scam and were scamming a given fraction of the population. Then you could, for example, construct a model that assigns arbitrarily low probability to that number of letters and treats the later predictions as meaningful evidence.
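A minimal sketch of that calculation (the halving structure and the world-population figure are assumptions for illustration):

```python
import math

# If the scammer starts with N recipients and keeps only the half who got a
# correct (binary) prediction each round, the longest streak they can
# GUARANTEE someone sees is floor(log2(N)).
def max_guaranteed_streak(n_recipients):
    return int(math.floor(math.log2(n_recipients)))

print(max_guaranteed_streak(64))             # 6  - the setup in this post
print(max_guaranteed_streak(8_000_000_000))  # 32 - even mailing the whole planet
```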
The other replies gave you good examples of how to resolve this. Let me take a stab at your mistake. From a high level, you are assuming the information contained in the scam is the only information the Bayesian has available to use.
As shown, a Bayesian probably has priors about the way the market works, the way people advertise, the existence and nature of scams, etc. The information in the predictions is applied against those priors, not just against itself.
Bayesian epistemology typically works in the framework of an existing hypothesis space, with a prior over that space, which is then updated. In addition to updating your credences about the possibilities in the space, you can also reformulate your hypothesis space itself, e.g., because you become aware of new possibilities (like the existence of scammers), or because you want to carve the world into different concepts due to some ontological shift. I think the Bayesian should just be allowed to reformulate their hypothesis space and reform their prior to get out of this.
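As a sketch of what "reformulating the hypothesis space and reforming the prior" could look like mechanically (the hypotheses and the 10% carved out for the new one are arbitrary choices for illustration):

```python
# Sketch: becoming aware of a new hypothesis (e.g. "selection scams exist")
# and reforming the prior. The 10% assigned to the new hypothesis is an
# arbitrary choice made purely for illustration.
old_prior = {"predictions are genuine skill": 0.3, "predictions are luck": 0.7}

new_mass = 0.10
new_prior = {h: p * (1 - new_mass) for h, p in old_prior.items()}
new_prior["it's a selection scam"] = new_mass

print(new_prior)  # {'...skill': 0.27, '...luck': 0.63, "it's a selection scam": 0.1}
```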
Okay that sounds reasonable (to me, a non-bayesian) but where do the new hypotheses come from, or the ideas for how to reformulate the space? If they came from Bayesianism, why is reformulation ever necessary?
So the pure form of this would be "a number 1 or 2 is displayed on a screen via an unknown process, and a person passes you a note saying which number will be displayed next. This happens 6 times in a row." With no priors about how the selection process for the number works or about the intentions of the person passing the note, it does make sense to predict that what is displayed on the screen next will match the next note.
Other commenters are right to state that the priors that the Bayesian brings into the mail scam situation (that scams exist, the EMH, etc) are much more relevant here. Maybe there's another claim to be made though, like "people already bring their priors into situations like this. Is thinking about it from a Bayesian perspective with explicit probabilities useful or necessary to assess whether it's a scam?" To that, I would say no.
Can Bayesianism deal with the mail-order prophet scam? If so, how?
This scam uses a segmented mailing list and a large initial population to advertise predictions so that a 'winning' portion of recipients receive an apparently improbable series of correct predictions. Most recipients (the 'losers') receive an incorrect prediction at some point.
For clarity, here is the situation:
A Bayesian with no prior knowledge of the scam receives a letter advertising stock predictions via a new proprietary quantitative model. The letter predicts that by the end of the month, stock XYZ will be up. It also states that the firm will prove their legitimacy by sending 6 more predictions (one per month) that will correctly predict whether XYZ is up or down. Sure enough, the Bayesian receives each letter, and the predictions all end up correct!
Unbeknownst to the Bayesian, 63 other Bayesians also received the first letter, 31 others received the second, 15 the third, 7 the fourth, 3 the fifth, and 1 other received the sixth letter. Our unlucky Bayesian was the only one to receive the seventh.
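For anyone who wants to see the selection effect mechanically, here's a small simulation of the setup above (the halving mailing list is taken from the description; the 50/50 market moves are an assumption for illustration):

```python
import random

# Simulate the mail-order prophet scam: 64 recipients, the scammer tells half
# "up" and half "down" each month, and only keeps mailing whoever has seen
# nothing but correct predictions so far.
random.seed(1)
recipients = list(range(64))

for month in range(6):
    half = len(recipients) // 2
    told_up, told_down = recipients[:half], recipients[half:]
    market_went_up = random.random() < 0.5       # the scammer needs no real skill
    recipients = told_up if market_went_up else told_down

print(recipients)  # exactly one person is left, having seen 6 correct calls
```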
The question is, assuming no knowledge of the scam or communication with the "losers" (for whom a stock prediction was wrong), should the Bayesian -- strictly adhering to Bayesian Epistemology -- believe that the firm's stock predictions are legit?
I don't know how to approach this from a Bayesian perspective and would appreciate any guidance from someone who's more familiar. My intuition is that the Bayesian should believe that the predictions are legit (which we observers know to be incorrect).
Also I feel I should disclose that I'm not a Bayesian (rather a Critical Fallibilist), and I'm looking for the best Bayesian answers to this problem (links to prior material welcome). It's often quite hard to find the best answers to a problem from a particular school, so please bear with me if I've missed something obvious.