Bayes is Out-Dated, and You’re Doing it Wrong

[-]RobertM3yModerator Comment75

Pinned by RobertM

This post, and many of @AnthonyRepetto's subsequent replies to comments on it, seem to be attacking a position that the named individuals don't hold, while stridently throwing out a bunch of weird accusations and deeply underspecified claims. "Bayes is persistently wrong" - about what, exactly?

Content like this should include specific, uncontroversial examples of all the claimed intellectual bankruptcy, and not include a bunch of random (and wrong) snipes.

I'm rate-limiting your ability to comment to once per day. You may consider this a warning; if the quality of your argumentation doesn't improve then you will no longer be welcome to post on the site.

[-]Vladimir_Nesov3y54

Content like this should include specific, uncontroversial examples of all the claimed intellectual bankruptcy

That's not the problem here and this is a bad general rule.

[-]RobertM3y20

That's definitely one of the problems with this post, and while rudeness is generally undesirable it's slightly more forgivable when there's some evidence of the thing that "justifies" it.

[-]AnthonyRepetto3y-1-2

"Content like this should include specific, uncontroversial examples of all the claimed intellectual bankruptcy, and not include a bunch of random (and wrong) snipes."

I did in fact include empirical metrics of Dirichlet's superiority and how Bayes' Theorem fails in contrast: industry uses it, after they did their own tests, which is empiricism at work. I also showed how Dirichlet Process allows you to compute Confidence Intervals, while Bayes' Theorem is incapable of computing Confidence Intervals. I also explained how, due to the median of the likelihood function being closer to an equal distribution than Bayes would expect, Bayes is persistently biased toward whichever extrema might be observed in the sample. Thus, Bayes' Theorem will consistently mis-estimate; it's persistently wrong, and Dirichlet was developed as the necessary adjustment. So, I did give explicit reasons why Bayes' Theorem is inadequate compared to the modern, standard approach which has empirical backing in industry.

It seems like you want to rate-limit me for an unspecified duration? What are the empirical metrics for that rate-limit being removed? And, the fact that you claim I "didn't provide specific, uncontroversial examples," when I just showed you those specifics again here, implies that you either weren't reading everything very carefully, or you want to mischaracterize me to silence any opposition of your preferred technique: Bayes'-Theorem-by-itself.

[-]RobertM3y30

The missing examples are for claims of the form:

The Rationalists repeatedly rely upon sparse evidence, while claiming certainty

They have self-selected for a community of people who call Bayes the be-all-end-all, all of them agreeing they’re right, and they don’t know that they’re horribly wrong… because they don’t check!

...then you DON’T know the be-all-end-all statistical technique — and neither do Scott Alexander or Eliezer Yudkowski, as much as they’d like you to believe otherwise.

I would not be surprised if some random "rationalist" you ran into somewhere was sloppy or imprecise with their usage of Bayes. I would also not be surprised if you misinterpreted some offhand comment as an unjustified claim to statistical rigor. Maybe it was some third, other thing.

As an aside, all the ways in which you claim that Bayes is wrong are... wrong? Applications of the theorem give you wrong results insofar as the inputs are wrong, which in real life is ~always, and yet the same is true of the techniques you mention (which, notably, rely on Bayes). There is always the question of what tool is best for a given job, and here we circle back to the question of where exactly this grievous misuse of Bayes is occurring.
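To make the point about inputs concrete, here is a minimal sketch of Bayes' theorem for a single binary hypothesis. The function name and all the numbers are hypothetical, chosen only for illustration; the sketch shows that the same likelihoods produce very different posteriors when the prior changes, so the output is only as good as what goes in.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' theorem for a binary hypothesis H after observing evidence E."""
    numerator = p_e_given_h * prior
    # Total probability of the evidence under H and under not-H.
    evidence = numerator + p_e_given_not_h * (1.0 - prior)
    return numerator / evidence


# Same (hypothetical) likelihoods, two different (hypothetical) priors.
for prior in (0.01, 0.10):
    print(prior, round(posterior(prior, 0.90, 0.05), 3))
```

Nothing here is specific to any one method of choosing the prior; a prior obtained some other way is fed through the same formula.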

It seems like you want to rate-limit me for an unspecified duration? What are the empirical metrics for that rate-limit being removed? And, the fact that you claim I "didn't provide specific, uncontroversial examples," when I just showed you those specifics again here, implies that you either weren't reading everything very carefully, or you want to mischaracterize me to silence any opposition of your preferred technique: Bayes'-Theorem-by-itself.

Deeply uncharitable interpretations of others' motives are not something we especially tolerate on LessWrong.

[-]AnthonyRepetto3y10

Ah, first: you DID claim that I "didn't provide specific, uncontroversial examples" and I HAD given such for why Bayes' Theorem is inadequate. Notice that you made your statement in this context:

<<"Bayes is persistently wrong" - about what, exactly?

Content like this should include specific, uncontroversial examples>>

In that context, where you precede "this" with my statement about Bayes, I naturally took "content like this" to be referring to my statement that "Bayes is persistently wrong." I hope you can see how easy it would be for me to conclude such a thing, considering "this" refers to... the prior statement?

You now move your goal-posts by insisting that my statement "Rationalists repeatedly rely upon sparse evidence, while claiming certainty" was ACTUALLY the argument I had to support with specifics... while if I were to give such specifics, I would have betrayed individual confidences, which is unethical. So, no, I'll continue to assert without specifics, for the sake of confidences, that "Rationalists repeatedly rely upon sparse evidence, while claiming certainty" because MULTIPLE rationalists over the past YEAR have done so, NOT an isolated incident or an off-hand joke, as you assume.

Your assumption that my "amalgam of rationalists I've met over the last year" was somehow a one-off or cursory remark is your OWN uncharitable interpretation; you are dismissing my repeated interactions with your community; such has been the norm. Similarly, in the EA Forum post "Doing EA Better" - a group of risk analysts had been spending a year trying to tell EA that "you're doing risk-assessment wrong; those techniques are out-dated," and EA members kept insisting their way was fine and right. Eventually, those nearly dozen people sat down and wrote an essay to EA... and EA pointedly ignored the fact they mentioned! "EA dismisses experts when experts tell EA they're using out-dated techniques." I'm seeing a similar pattern across the Rationalist community, NOT a one-off event or a casual remark; they were using Bayes' Theorem improperly, as the substance of arguments made in response to me.

"As an aside, all the ways in which you claim that Bayes is wrong are... wrong?"

Bayesian Inference is a good and real thing. And, Bayes' Theorem is an old formula, used in Bayesian Inference. AND Bayes' Theorem cannot produce Confidence Intervals, nor will it allocate to minimize the cost of being wrong, nor does it make adjustments for samples' bias toward the extrema. Those are all specific ways where "I just plug it into Bayes' Theorem" is factually wrong. You keep claiming that my critique is wrong - but you only do so vaguely! You skip right past these failures of Bayes' Theorem, each time I mention them. Check the math books: there is NO "question of what tool is best for a given job," as you say - rather, Bayes' Theorem alone is NEVER the tool. You'll have to adjust in many ways, not just one. And if you don't do so, you are in fact using an obsolete technique during your Bayesian Inference.

[-]Jonas Moss3y112

Roughly speaking, we can divide Bayesianism into two, maybe three or more, separate but related meanings:

1. Adherence to a form of Bayesian epistemology. You think that knowledge comes in degrees of belief, and the correct way to update your beliefs on seeing new information is to use Bayes theorem. It's usually done informally.

2. Adherence to Bayesian statistics. You believe that frequentist inference is invalid and that frequentist measures of an estimator's quality should not be used. Instead, you prefer to use precisely defined priors and likelihoods, derive their posteriors, and report a quantity based solely on that. Moreover, you would often espouse some form of Bayesian decision theory - i.e., you have a loss function in addition to your prior and likelihood, and report (or act on) the optimal decision according to your framework. All of this is usually done formally.

Your comments about Dirichlet don't make sense. Are you thinking about the Dirichlet distribution? If so, it is more widely used in Bayesian statistics than frequentist statistics, as it is the conjugate prior to the multinomial distribution. Regarding your comments about the SAS institute, I can say this: Most of the members of this forum are deeply interested in deep learning. Is deep learning Bayesian? No. Not even Bayesian deep learning is properly Bayesian. Does that matter to you, as a Bayesian epistemologist? No, as deep learning has little to nothing to do with epistemology. Does it matter to you, as a Bayesian statistician? No, as deep learning is not about inference or decision theory, which is what Bayesian statisticians care about (for the most part).
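As a rough illustration of what "conjugate prior to the multinomial" means: with a Dirichlet prior, applying Bayes' theorem to multinomial counts gives another Dirichlet whose parameters are the prior parameters plus the counts. The sketch below assumes a three-category problem; the function names, the uniform prior, and the counts are all made up for the example.

```python
def dirichlet_posterior(alpha, counts):
    """Posterior Dirichlet parameters after observing multinomial counts."""
    if len(alpha) != len(counts):
        raise ValueError("alpha and counts must have the same length")
    # Conjugacy: the update is element-wise addition of counts.
    return [a + c for a, c in zip(alpha, counts)]


def dirichlet_mean(alpha):
    """Mean of a Dirichlet distribution: each parameter over the total."""
    total = sum(alpha)
    return [a / total for a in alpha]


prior_alpha = [1.0, 1.0, 1.0]  # hypothetical uniform prior over three categories
counts = [7, 2, 1]             # hypothetical observed counts
post_alpha = dirichlet_posterior(prior_alpha, counts)
print(post_alpha)              # [8.0, 3.0, 2.0]
print(dirichlet_mean(post_alpha))
```

With these numbers the posterior mean for the first category is 8/13, lower than the raw sample frequency of 7/10, because the prior pulls the estimate toward an even split. That shrinkage comes from the choice of prior, and the update itself is still Bayes' theorem.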

By the way, Bayes theorem isn't a "statistical technique", it's just a theorem. Used by all statisticians without a second thought. It's when you use it to do inference you become a Bayesian statistician.

[-]AnthonyRepetto3y3-1

I haven't observed any rationalists here using Dirichlet, and no, I wasn't talking about Bayesian vs. Frequentist; Bayesians are correct. Using Bayes Theorem when you didn't consider the probability of each possible population producing your observed sample? That's definitely you doing it wrong. Instrumentation has variability; Dirichlet is how you include that, too.

[-]quanticle3y77

Criticizing the use of Bayes Theorem because it's 260 years old is such a weird take.

The Pythagorean theorem is literally thousands of years old. But it's still useful, even though lots of progress has been made in trigonometry since then. Should we abandon $a^2 + b^2 = c^2$ as a result?

[-]M. Y. Zuo3y10

This does seem like a laughable conclusion. Imagine the implications for the world if this line of reasoning became the accepted paradigm!

Though I'm hesitant to outright dismiss anyone willing to put effort into writing a post, the author here really needs to rewrite their post to remove all the self-imposed absurdities.

[-]simon3y61

If you have a real argument that the prior is reliably best obtained via a Dirichlet process and no other method of coming up with a prior is ever more useful, then make the argument.

I see:

argument from authority/prestige
argument from age (as if math changes over time)
straw/weakmanning ("These Rationalists pick the Prior that they *prefer*."; "The Rationalists repeatedly rely upon sparse evidence, while claiming certainty")

[-]AnthonyRepetto3y-30

Dirichlet is used by industry, NOT Bayes. What is your rebuttal to that, to show that Bayes is in fact superior to Dirichlet?

[-]simon3y97

The wiki article on the Dirichlet process includes:

In other words, a Dirichlet process is a probability distribution whose range is itself a set of probability distributions. It is often used in Bayesian inference to describe the prior knowledge about the distribution of random variables—how likely it is that the random variables are distributed according to one or another particular distribution.

I.e. it isn't an alternative to Bayes, but rather a way of coming up with a prior.
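A minimal sketch of how a Dirichlet process prior behaves once Bayes' theorem is applied to it: under a prior DP(alpha, G0), the posterior predictive puts weight n_v / (alpha + n) on repeating a previously seen value v and weight alpha / (alpha + n) on a fresh draw from the base distribution G0. The function name, the data, and the concentration value below are hypothetical, and the sketch assumes G0 is continuous so a fresh draw never coincides with a seen value.

```python
from collections import Counter


def dp_predictive(observations, alpha):
    """Posterior predictive weights under a Dirichlet process prior DP(alpha, G0).

    Returns (weights for repeating each previously seen value,
             weight for drawing a fresh value from the base distribution G0).
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    n = len(observations)
    counts = Counter(observations)
    repeat = {value: k / (alpha + n) for value, k in counts.items()}
    fresh = alpha / (alpha + n)
    return repeat, fresh


repeat, fresh = dp_predictive(["a", "a", "b"], alpha=1.0)
print(repeat, fresh)
```

The weights always sum to one, and as more data arrives the share given to the base distribution shrinks, which is the usual prior-to-posterior behaviour.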

[+]AnthonyRepetto3y-7-2

[-]AnthonyRepetto3y0-1

And, I never claimed that priors are better obtained with Dirichlet than Bayes... I'm not sure what you were reading, could you quote the section where you thought I was making that claim?

[-]xepo3y55

Why are you trying to attack instead of educate?

90% of your article is “rationalists do it wrong”. Why? Who cares? Teach us how to do it better instead of focusing on how we’re doing it wrong.

[+]AnthonyRepetto3y-5-2

[-]Adam Shai3y52

I don't know if I'm missing something, but it sounds like you are arguing for a particular method of picking a prior within a Bayesian context, but you are not arguing against Bayes itself. If anything, it seems to me this is pro-Bayes, just using Dirichlet processes as a prior.

[-]AnthonyRepetto3y00

Erm, is SAS using Bayes? That's the actual best in class.

[-]Adam Shai3y2-1

Well I don't know SAS at all but a quick search of the SAS documentation for Dirichlet calls it a "nonparametric Bayes approach"...

https://documentation.sas.com/doc/en/casactml/8.3/casactml_nonparametricbayes_details12.htm

[-]AnthonyRepetto3y-4-4

SAS has developed their own trade-secret that outperforms all public methods; by definition, that MUST not be what YOU do when you apply Bayes to a few personal examples.

[-]Dagon3y40

The downvotes are predictable - not only is it mis-stating a strawman of the group's position, it uses a lot of exclamation points to emphasize how stupid we all are.

However, it's also got some pretty good points, especially as some of the adjacent social groups are exploding, in part due to untenable extrapolation over unstable premises.

[-]LVSN3y2-1

I can't tell if you're right, because no one has ever laid out Bayesianism as a set of definitions and instruction steps, explained what Bayesianism uniquely relevantly achieves, and explored the relevant consequences of making various tempting-from-some-perspective mutations on the instruction set; those are the steps required to elevate a person's grasp of Bayesianism to true understanding.

You have also not followed those steps with Dirichlet and SAS, and compared it to Bayesianism.

Still I have an intuition that your complaint about using personal experience is not virtuous. Everything you learn has to pass through personal experience. If your other ways of becoming informed had not been conceived, learning from personal experience would still be possible. Information is information no matter how seriously you take it, and I think personal experience is worth taking seriously, as a person who concerns themself with misleadingness, in a world full of people who attend only to the truth of what they hear and not to the misleadingness.

[-]AnthonyRepetto3y-20

You claim of Bayes and Dirichlet that "no one has ever laid (them) out", and to prove your claim, you link to another post that YOU wrote, where you claim it again? Check math textbooks; I don't have to teach you what's already available in the public sphere.

[-]LVSN3y10

It was not to prove my claim; the post I wrote elaborates more fully on what I believe is the correct teaching process. If you read the post, it would become clear to you that my teaching standards have never been met in textbooks, and can hardly even in principle be met through textbooks. My teaching standards are not arbitrary; if these standards are not met then I will not truly understand the subject.

[-]AnthonyRepetto3y-30

Your difficulty understanding it is NOT equivalent to "no one has ever laid them out". Those are two wildly different statements. A dyslexic person would have similar difficulty reading a novel, yet that is NOT equal to "no one ever wrote a book."

[-]LVSN3y-10

Feeling like you understand is not the same as actual understanding. People who read the existing explanations and feel like they understand, when the explanations did not follow the process I described, do not truly understand. My complaint is not that when I read the explanations I don't feel like I understand them; my complaint is that the extent to which Bayesianism has ever been laid out is insufficient for creating true understanding upon first reading.

[-]AnthonyRepetto3y10

Astounding! Then my argument that "NOT including Dirichlet is wrong" must have been wrong? Or else, why are you mentioning that no one taught you to your own satisfaction?

[-]LVSN3y1-2

Then my argument that "NOT including Dirichlet is wrong" must have been wrong?

It could be right, actually. The only objection I made was in response to your objection to using personal experience, and I only talked about my intuition rather than what must or must not be the case.

Or else, why are you mentioning that no one taught you to your own satisfaction?

You seem to want to proselytize better epistemic methods, and I am telling you what I need from you in order to adopt or reject your advised methods from an engineering angle (which I regard as superior); until then I can only follow clues of lesser quality (such as the correlation between caring about misleadingness and tendency to say things that impress me as insightful); the detective angle.

[-]AnthonyRepetto3y-10

Screenshots are up! I'll be glad when more members of the public see the arguments you give for ignoring mine. :P cheers!

[+]AnthonyRepetto3y-52

[-]duck_master1y10

The single biggest question I have is "what is Dirichlet?"

[-][anonymous]3y11

As I understand it, in the event that you are correct and Dirichlet is better, rational Rationalists must switch to the better algorithm. Because rationality is about systematized winning, and if you are correct, this is a measurably better algorithm to win.

[-]AnthonyRepetto3y-2-2

Yes! And, ever since Dirichlet was published in 1973, it has ONLY ever been run on super-computers, using statistically significant sample sizes! You CANNOT do Dirichlet in your head, unless you are a Savant, and no math class will ask you to Dirichlet on a quiz. I'm not sure how ANYONE can claim Bayes is reliable, when NO ONE in industry touches it... your community has an immense blind-spot to real-world methods, yet you claim certainty and confidence - that's the Dunning-Krugers self-selecting into a pod that all agree they're right to use Bayes.

[-][anonymous]3y11

That's only one piece of rationality, and I think the general conclusion was "ask an artificial intelligence you can trust" would be the only scalable way for humans to be genuinely rational in their decision-making. It does not matter what algorithm that machine uses internally, merely that it is the best-performing one from the class of "sufficiently trustworthy" choices.

Note this is a feasible thing to do, for example the activation function Swish was found this way.

A lot of the rest of it was dismissing obviously wrong individuals and institutions? You saw how you dismissed the idea of "start with a prior from the median of mainstream knowledge" and "update with each anecdote"?

The thing is, that method is arguably better than many institutions and individuals are. At least it uses information to make its decision.

One of the tenets of "what does $authority_figure claim to know and how does he know it" allows you to dismiss obviously wrong/misaligned authorities on subjects.

Such as the FDA or machine learning scientists setting 2060 as the date for AGI. (The FDA is misaligned: it serves its own interests, not the interests of living Americans wanting to remain that way. The ML scientists did not account for an increase in investment or recursive improvement.)

There are a lot of other ideas and societal practices that are simply based on bullshit, no actual thought or process was even followed to generate them, they are usually just parroting some past flawed idea. Like what you said regarding Bayes.

[-]AnthonyRepetto3y-3-4

Then why does industry use Dirichlet, not Bayes? You keep pretending yours is better, when everyone who has to publish physics uses additional methods, from this century. None of you explain why industry would use Dirichlet, if Bayes is superior. Further, why would Dirichlet even be PUBLISHED unless it's an improvement? You completely disregard these blinding facts. More has happened in the last 260 years than just Bayes' Theorem, and your suspicion of the FDA doesn't change that fact.
