# 67

It was a sane world. A Rational world. A world where every developmentally normal teenager was taught Bayesian probability.

Saundra's math class was dressed in their finest robes. Her teacher, Mr Waze, had invited the monk Ryokan to come speak. It was supposed to be a formality. Monks rarely came down from their mountain hermitages. The purpose of inviting monks to speak was to show respect for how much one does not know. And yet, monk Ryokan had come down to teach a regular high school class of students.

Saundra ran to grab monk Ryokan a chair. All the chairs were the same—even Mr Waze's. How could she show respect to the mountain monk? Saundra's eyes darted from chair to chair, looking for the cleanest or least worn chair. While she hesitated, Ryokan sat on the floor in front of the classroom. The students pushed their chairs and desks to the walls of the classroom so they could sit in a circle with Ryokan.

"The students have just completed their course on Bayesian probability," said Mr Waze.

"I see[1]," said Ryokan.

"The students also learned the history of Bayesian probability," said Mr Waze.

"I see," said Ryokan.

There was an awkward pause. The students waited for the monk to speak. The monk did not speak.

"What do you think of Bayesian probability?" said Saundra.

"I am a Frequentist," said Ryokan.

Mr Waze stumbled. The class gasped. A few students screamed.

"It is true that trolling is a core virtue of rationality," said Mr Waze, "but one must be careful not to go too far."

Ryokan shrugged.

Saundra raised her hand.

"You may speak. You need not raise your hand. Rationalism does not privilege one voice above all others," said Ryokan.

Saundra's voice quivered. "Why are you a Frequentist?" she said.

"Why are you a Bayesian?" said Ryokan. Ryokan kept his face still but he failed to conceal the twinkle in his eye.

Saundra glanced at Mr Waze. She forced herself to look away.

"May I ask you a question?" said Ryokan.

Saundra nodded.

"With what probability do you believe in Bayesianism?" said Ryokan.

Saundra thought about the question. Obviously not 1 because no Bayesian believes anything with a confidence of 1. But her confidence was still high.

"Ninety-nine percent," said Saundra. "Zero point nine nine."

"Why?" said Ryokan. "Did you use Bayes' Equation? What was your prior probability before your teacher taught you Bayesianism?"

"I notice I am confused," said Saundra.

"The most important question a Rationalist can ask herself is 'Why do I think I know what I think I know?'" said Ryokan. "You believe Bayesianism with a confidence of $P(A|B) = 0.99$, where $A$ represents the belief 'Bayesianism is true' and $B$ represents the observation 'your teacher taught you Bayesianism'. A Bayesian believes $A|B$ with a confidence $P(A|B)$ because $P(A|B) = \frac{P(A)\,P(B|A)}{P(B)}$. But that just turns one variable, $P(A|B)$, into three variables: $P(B|A)$, $P(A)$ and $P(B)$."

Saundra spotted the trap. "I think I see where this is going," said Saundra. "You're going to ask me where I got values for the three numbers $P(B|A)$, $P(A)$ and $P(B)$."

Ryokan smiled.

"My prior probability $P(A)$ was very small because I didn't know what Bayesian probability was. Therefore $\frac{P(B|A)}{P(B)}$ must be very large," said Saundra.

Ryokan nodded.

"But if $\frac{P(B|A)}{P(B)}$ is very large then that means I trust what my teacher says. And a good Rationalist always questions what her teacher says," said Saundra.

"Which is why trolling is a fundamental ingredient of Rationalist pedagogy. If teachers never trolled their students then students would get lazy and believe everything that came out of their teachers' mouths," said Ryokan.

"Are you trolling me right now? Are you really a Frequentist?" said Saundra.

"Is your teacher really a Bayesian?" said Ryokan.

1. Actually, what Ryokan said was "そうです", which means "[it] is so".
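For readers who want the arithmetic behind Saundra's reasoning, here is a minimal Python sketch of Bayes' rule, P(A|B) = P(A) * P(B|A) / P(B), with A as "Bayesianism is true" and B as "your teacher taught you Bayesianism". Only the posterior of 0.99 comes from the story. The prior values and the function names `bayes_posterior` and `required_ratio` are assumptions made up for illustration.

```python
def bayes_posterior(prior: float, likelihood: float, evidence: float) -> float:
    """Bayes' rule: P(A|B) = P(A) * P(B|A) / P(B)."""
    if evidence <= 0.0:
        raise ValueError("P(B) must be positive")
    return prior * likelihood / evidence


def required_ratio(posterior: float, prior: float) -> float:
    """Return the ratio P(B|A) / P(B) needed to move `prior` to `posterior`.

    Rearranging Bayes' rule gives P(B|A) / P(B) = P(A|B) / P(A).
    """
    if not 0.0 < prior <= 1.0:
        raise ValueError("prior must be in (0, 1]")
    if not 0.0 <= posterior <= 1.0:
        raise ValueError("posterior must be in [0, 1]")
    return posterior / prior


if __name__ == "__main__":
    posterior = 0.99  # Saundra's stated confidence in the story
    # Hypothetical priors; the story only says her prior was "very small".
    for prior in (0.5, 0.1, 0.01, 0.001):
        ratio = required_ratio(posterior, prior)
        print(f"prior={prior}: P(B|A)/P(B) must be about {ratio:.1f}")
```

The smaller the assumed prior, the larger the ratio has to be to reach a posterior of 0.99, which is the step Saundra takes in the dialogue.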

"It is simply a question of which method of calculation is more convenient for you," explained Ryokan. "If you are a high-level monk and remember your previous billion reincarnations, any possible event you consider has already happened to you many times. Calculating n(A)/n(S) once requires much less effort than calculating Bayesian updates over and over again."

"But if the enlightened person remembers the frequencies of everything, how can probability be in the mind?" cried Saundra.

"Because the entire reality only exists in the mind of Lord Vishnu," answered Mr Waze.

I am a regularity detector generated by the regularities of reality. Frequentism and Bayesianism are attempted formalizations of the observed regularities in the regularity detection process but, ultimately, I am neither.

I think a better way to look at it is that frequentist reasoning is appropriate in certain situations and Bayesian reasoning is appropriate in other situations.  Very roughly, frequentist reasoning works well for descriptive statistics and Bayesian reasoning works well for inferential statistics.  I believe that Bayesian reasoning is appropriate to use in certain kinds of cases with a probability of (1-delta), where 1 represents the probability of something that has been rationally proven to my satisfaction and delta represents the (hopefully small) probability that I am deluded.

Is there some connection to the mountain troll in HPMoR? I'm not seeing it, but I feel like the title would be too big a coincidence otherwise.

Sometimes a mountain troll is just a mountain troll.

"A good rationalist always questions what her teacher says."

Why does Saundra believe this? I'd hazard the guess that her teacher said it to her.

The axioms that we pick up before we learn to question new axioms are the hardest to see and question. I wonder if that's a factor in why "smarter" people often seem to learn to question axioms earlier in life: they spend less time getting piled with beliefs that were never tested by the "shall I choose to believe this?" filter because the filter didn't exist yet when the beliefs were taken on.

The whole concept of "questioning" is questionable, as it suggests an improvement over a status quo where claims you overhear are unconditionally accepted as your own beliefs verbatim, or at least where alternatives to them are discouraged from being discussed, which is insane (and the way language models learn). A more reasonable baseline for improvement is one where claims are given inappropriate credence, or inappropriate attention to the question of their credence (where one alternative is siloing them inside hypotheticals).

Being honest, for nearly all people nearly all of the time, questioning firmly established ideas is a waste of time at best. If you show a child, say, the periodic table (common versions of which have hundreds of facts), the probability that the child's questioning will lead to a significant new discovery is less than 1 in a billion* and the probability that it will lead to a useless distraction approaches 100%. There are large bodies of highly reliable knowledge in the world, and it takes intelligent people many years to understand them well enough to ask the questions that might actually drive progress. And when people who are less intelligent, less knowledgeable, and/or more prone to motivated reasoning are asking the questions, you can get flat earthers, QAnon, etc.

*Based on the guess that we've taught the periodic table to at least a billion kids and it's never happened yet.

"the probability that the child's questioning will lead to a significant new discovery"

The relevant purpose is new discoveries for the child, which is quite plausible. Insufficiently well-understood claims are also not really known, even when they get to be correctly (if not validly) accepted on faith. (And siloing such claims inside appropriate faith-correctness/source-truthfulness hypotheticals is still superior to accepting them unconditionally.) There is also a danger of discouraging the formation of gears-level understanding on the basis of the irrefutability of policy-level knowledge, rendering the ability to make use of that knowledge brittle. The activity of communicating personal discoveries to the world is mostly unrelated.

I get your point, and I totally agree that answering a child's questions can help the kid connect the dots while maintaining the kid's curiosity.  As a pedagogical tool, questions are great.

Having said that, most people's knowledge of most everything outside their specialties is shallow and brittle.  The plastic in my toothbrush is probably the subject of more than 10 Ph.D. dissertations, and the forming processes of another 20.  This computer I'm typing on is probably north of 10,000.  I personally know a fair amount about how the silicon crystals are grown and refined, have a basic understanding of how the chips are fabricated (I've done some fabrication myself), know very little about the packaging, assembly, or software, and know how to use the end product at a decent level.  I suspect that worldwide my overall knowledge of computers might be in the top 1% (of some hypothetical reasonable measure).  I know very little about medicine, agriculture, nuclear physics, meteorology, or any of a thousand other fields.

Realistically, a very smart* person can learn anything but not everything (or even 1% of everything).  They can learn anything given enough time, but literally nobody is given enough time.  In practice, we have to take a lot of things on faith, and any reasonable education system will have to work within this limit.  Ideally, it would also teach kids that experts in other fields are often right even when it would take them several years to learn why.

*There are also average people who can learn anything that isn't too complicated and below-average people who can't learn all that much.  Don't blame me; I didn't do it.

My point is not that one should learn more, but about understanding naturally related to any given claim of fact, whose absence makes it brittle and hollow. This sort of curiosity does apply to your examples, not in a remedial way that's only actually useful for other things. The dots being connected are not other claims of fact, but alternative versions of the claim (including false ones) and ingredients of motivation for looking into the fact and its alternatives, including more general ideas whose shadows influence the claim. These gears of the idea do nothing for policies that depend on the fact, if it happens to be used appropriately, but tend to reassemble into related ideas that you never heard about (which gives an opportunity to learn what is already known about them).

It doesn't require learning much more, or learning about toothbrushes; it is instead an emphasis of curiosity on things other than directly visible claims of fact, which shifts attention to those other things when presented with a given claim. This probably results in knowing less, with greater fluency.

To the extent that I understand what you're saying, you seem to be arguing for curiosity as a means of developing a detailed, mechanistic ("gears-level" in your term) model of reality.  I totally support this, especially for the smart kids.  I'm just trying to balance it out with some realism and humility.  I've known too many people who know that their own area of expertise is incredibly complicated but assume that everything they don't understand is much simpler.  In my experience, a lot of projects fail because a problem that was assumed to be simple turned out not to be.

This is useless in practice and detrimental to being a living encyclopedia, distracting from facts deemed salient by civilization. Combinatorial models of more specific and isolated ideas you take an interest in, building blocks for reassembling into related ideas, things that can be played with and not just taken from literature and applied according to a standard methodology. The building blocks are not meant to reconstruct ideas directly useful in practice, it's more about forming common sense and prototyping. The kind of stuff you learn in the second year of college (the gears, mathematical tools, empirical laws), in the role of how you make use of it in the fourth year of college (the ideas reassembled from them, claims independently known that interact with them, things that can't be explained without the background), but on the scale of much smaller topics.

Well, that's the attempt to channel my impression of the gears/policy distinction, which I find personally rewarding, but not necessarily useful in practice, even for research. It's a theorist's aesthetic more than anything else.

I don't see what this parable has to do with Bayesianism or Frequentism.

I thought this was going to be some kind of trap or joke around how "probability of belief in Bayesianism" is a nonsense question in Frequentism.