To me the fact that they feature so prominently raises the question of how much certain commitments to "bayesianism" reflect actual usage of bayesian methods vs a kind of pop-science version of bayesianism.
This is a valid concern. I'm new here and just going through the sequences (though I have a mathematics background), but I have yet to see a good framing of the Bayesian/frequentist debate as maximum likelihood vs. maximum a posteriori. (I welcome referrals.)
I think people like to make these "methodological" critiques
Yes, there is a methodological critique of strict p-value calculations, but in the absence of informative priors, p-values are a really good indicator for experiment design. I feel that in hyping up Bayesian updates, people are missing that and not offering a replacement. The focus on methods is a strength when you are talking about methods.
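To make "indicator for experiment design" concrete, here's a minimal sketch of what I mean (my own illustration, not from the post): sizing a trial with nothing but a null rate, a hoped-for effect, and a p-value threshold. The 50%/70% rates and the 0.05/0.8 targets are placeholder assumptions.

```python
# Sketch (my example): using a p-value threshold to size an experiment,
# no informative prior required.
# Null: success rate is 50%; hoped-for treatment effect: 70%.
from scipy.stats import binom

null_p, effect_p, alpha = 0.5, 0.7, 0.05

for n in range(10, 301, 10):
    # Smallest success count whose one-sided p-value beats alpha.
    # P(X >= k | null) = binom.sf(k - 1, n, null_p)
    k_crit = next(k for k in range(n + 1) if binom.sf(k - 1, n, null_p) <= alpha)
    power = binom.sf(k_crit - 1, n, effect_p)  # P(reject null | effect is real)
    if power >= 0.8:
        print(f"n={n}: reject null at {k_crit}+ successes, power={power:.2f}")
        break
```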
and that actually both experimenters precommitted to treat at least 100 patients.
That would be an interesting wrinkle. I haven't read the original source. Supposing this, I would think Mr. Frequentist would still say Bessel is more likely to be fooled by unlikely data than George (in the positive direction only), though honestly only by a very small amount. One could call that the trade-off for a method that won't be fooled by unlikely negative data.
Actually, why?
I was treating the description of Bessel as having a distinct stopping condition of 70%; otherwise he would have stopped at 69.7% like you said. If he was doing the tests one at a time, we know 70.7% at 99 didn't occur, because he stopped at 100.
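If it helps, here's a rough Monte Carlo sketch of the "fooled only by a very small amount" claim (my own, not from the original source; the 60% true rate and Bessel's budget cap are assumptions). Both precommit to at least 100 patients; George stops there, while Bessel keeps going until his running rate hits 70%.

```python
# Sketch (my simulation): how much more often does Bessel's stopping rule
# produce a ">=70% success" report than George's fixed-n design?
import random

TRUE_RATE = 0.6           # assumed true success rate, below the 70% claim
MIN_N, MAX_N = 100, 1000  # both precommit to >= 100; Bessel's budget cap
TRIALS = 20_000

def george():
    s = sum(random.random() < TRUE_RATE for _ in range(MIN_N))
    return s / MIN_N >= 0.7  # reports success only if 70/100 reached

def bessel():
    s = 0
    for n in range(1, MAX_N + 1):
        s += random.random() < TRUE_RATE
        if n >= MIN_N and s / n >= 0.7:
            return True  # stops the moment the running rate hits 70%
    return False

print("George fooled:", sum(george() for _ in range(TRIALS)) / TRIALS)
print("Bessel fooled:", sum(bessel() for _ in range(TRIALS)) / TRIALS)
```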
Correct: in reality the world doesn't change if we reorder our results. The point is that, for a frequentist, it feels like it should. Because the method is flawed, it seems right for the result to be less right. This is a bad way of analyzing results, but not as bad a way of evaluating methodologies.
Your valid concern about corrupted results stems from the correlation between bad behavior and what a frequentist calls a bad methodology.
Bessel's methodology is not inherently bad either. If Bessel believed that the treatment would save lives and needed to keep going to prove it, wouldn't he behave the same way?
We need a Bayesian methodology that can help evaluate methodologies with and without informative priors. This probably already exists in the literature, but we won't be able to overcome the use of p-values until it is common knowledge.
Exactly. For the purposes of the post I framed it as a single interaction, but my honest response would be 'Cool. Do you have a video?'
While I recommend looking up the real thing or the news coverage, in case you are curious: I started with a much longer version of the story before I realized it was distracting from the rest of the post:
"So I have an amazing fish. Marcy. She's an archerfish. You know, the kind of fish that can spit at flying insects? She can recognize human faces! You see, I trained her to spit at politicians.
"We play a game sometimes, where I tap a spot on some glass by the tank. If she hits it, she gets a treat. We'd been doing this for a couple months when I put a TV behind the glass to see if she would react. She didn't really; not at first. I turned it to the news though, and on a lark tapped the glass where that politician was talking. You know, the boring, long-winded one? She spat at 'em and got her treat.
"So I thought, why not? What if I tap all politicians faces when they come on? It became part of our game. Marcy couldn't predict when I'd tap the glass because I couldn't. She got a game and treats and I got to watch the news. Win-win.
"But then one day I got bored. I wasn't tapping all the politician faces. It was that day, remember, where the thing happened? News was getting sound bites from everyone and their dog? Pelosi and Trump came up and Marcy nailed 'em both. Dead center. I was already giving her the treat before I realized that I hadn't tapped the glass.
"So I test her, you know? She spitting at anyone now? Talking heads? Nope. Snoop? Nope. Vance? Dead aim. AOC? A little to the left but Marcy still got a treat. Every politician got hit. Well, not that Canadian guy. But I don't think she's seen that one before. So I figure she can spot faces. I mark the target and she spits 'em when she sees 'em.
"So what do you say? Come see her, please? I want to know if she'll do it for you, you know? Before I call a fish scientist? Promise it will be a hoot."
I totally agree with this for some/many use cases. I would caution against doing so in the following situations:
In reality it is a balancing act, and it would be best to avoid over-reliance on either approach: over-analysis or pattern heuristics.
Ha! Well done. I spent a week making sure my math was right and never thought of this. I agree that updating the truth probability is a better model of this situation, and I can confirm your numbers.
I suppose we could also update each day's success chance, with some kind of prior balancing updates to the truth probability vs. the success probability. Though by that point we are likely no longer "simplifying".
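For concreteness, here's a minimal sketch of the "update the truth probability" model (my numbers, not the post's: the 0.9/0.5 per-day success chances and the 0.1 prior are placeholders):

```python
# Sketch (assumed numbers): updating P(claim is true) from daily outcomes.
# If the claim is true, each day succeeds with rate P_TRUE;
# if false, with base rate P_FALSE.
P_TRUE, P_FALSE = 0.9, 0.5  # assumed per-day success chances
prior = 0.1                 # assumed prior that the claim is true

def update(p_truth, success):
    like_t = P_TRUE if success else 1 - P_TRUE
    like_f = P_FALSE if success else 1 - P_FALSE
    num = like_t * p_truth
    return num / (num + like_f * (1 - p_truth))

p = prior
for day, outcome in enumerate([True, True, False, True, True], 1):
    p = update(p, outcome)
    print(f"day {day}: P(true) = {p:.3f}")
```

The refinement I mentioned would replace the fixed P_TRUE with something like a Beta posterior that also updates each day, which is where the "simplifying" stops.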
"Doesn't exist, or doesn't give a fuck about suffering" is the answer that matches the data
I agree with you. (Though I might rephrase the second as 'doesn't care about suffering the way we do'. Either way, your point is valid.)
My point wasn't to say 'doesn't exist' is wrong, but that there is more than one possibility. If you or anyone has taken the time to evaluate the possibilities and come to the conclusion that 'doesn't exist' is the more likely/simple/predictive model, then I commend you. That is what rationality is about.
All I ask is the same courtesy, as I might be exploring a different set of models than you are.
Interesting. Presumably if Bessel never got the results he wanted, he could (assuming he's honest) continue until the negative data was enough to convince himself that he was wrong. Depending on his prior, that might never happen: he could run out of money or motivation before he gave up and published a negative result. Avoiding this seems related to the issues around publishing negative results and timely reporting of raw data.
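As a rough illustration of the "might never happen" worry, here's my own sketch (all numbers assumed: two simple hypotheses of a 70% vs. 50% success rate, a true rate of 50%, and an arbitrary budget):

```python
# Sketch (my illustration): how long until an honest Bessel concedes?
# Assumed hypotheses: "works" (70% success) vs "doesn't" (50%), with the
# true rate being 50%. Stronger priors take longer to overcome and may
# exceed the budget, i.e. the negative result never gets published.
import math, random

P_WORKS, P_NULL, TRUE_RATE = 0.7, 0.5, 0.5
CONCEDE_AT, BUDGET = 0.01, 200  # posterior P(works) to give up; patient cap

def patients_to_concede(prior_works):
    log_odds = math.log(prior_works / (1 - prior_works))
    for n in range(1, BUDGET + 1):
        success = random.random() < TRUE_RATE
        lr = (P_WORKS / P_NULL) if success else ((1 - P_WORKS) / (1 - P_NULL))
        log_odds += math.log(lr)  # Bayes update in log-odds form
        if log_odds < math.log(CONCEDE_AT / (1 - CONCEDE_AT)):
            return n
    return None  # ran out of budget before conceding

for prior in (0.5, 0.9, 0.99):
    runs = [patients_to_concede(prior) for _ in range(1000)]
    done = sorted(r for r in runs if r is not None)
    print(f"prior {prior}: median n = {done[len(done) // 2]}, "
          f"gave up within budget {len(done) / len(runs):.0%} of the time")
```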
With regard to the biased reporting, I'll just mention that we would have to adjust for known bias whether we were using Bayesian or frequentist methods.