This is my response to Trapped Priors As A Basic Problem Of Rationality by Scott Alexander.

I felt that the discussion in the post and comments lacks references to the lessons of Yudkowsky's Causal Diagrams and Causal Models, Fake Causality, and One Argument Against An Army: you can't bounce back and forth between verdict and assumptions, double-counting the evidence in a self-reinforcing loop.

Also, I posit that you have to be careful to distinguish two separate concepts: the "prior on this particular encounter going badly" and the "prior on the fraction of encounters which will go badly". Scott's post seems to hide both under the same phrase, "prior that dogs are terrifying".

I've made some colorful charts showing the story of three encounters with dogs, which illustrate how I believe one should update one's prior (and meta-prior), and what can go wrong if you let your final verdicts influence them instead:

https://docs.google.com/spreadsheets/d/1vQfK-g0zAMEcXwbbPbAWWMTpHprE1jpf6bmwDjtJkHo/edit?usp=sharing

One important feature of the epistemology algorithm I propose above is that you can admit you were wrong about your past judgments: after exposure to many friendly puppies, and after raising your prior on "fraction of good dogs" sufficiently, you may reevaluate your previous verdicts ("see them in a new light") and decide that perhaps those were also good dogs. There is no risk of this causing a loop in the reasoning, because there are no edges from "verdicts" to "the prior" in this algorithm.
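To make the "no edges from verdicts to the prior" point concrete, here is a minimal Python sketch of how such an update could look. It is not a transcription of the spreadsheet. The uniform grid prior over the fraction of bad dogs, the 1/3 chance that a good dog feels threatening, and all function names are my assumptions for illustration; only the 2/3 figure for bad dogs comes from the example.

```python
# Assumptions (illustrative, not taken from the spreadsheet): a uniform grid
# prior over f, and a good dog feeling threatening 1/3 of the time.
P_THREAT_GIVEN_BAD = 2 / 3
P_THREAT_GIVEN_GOOD = 1 / 3
GRID = [i / 100 for i in range(101)]  # candidate values of f = fraction of bad dogs


def likelihoods(felt_threatened):
    """Return (P(obs | bad dog), P(obs | good dog)) for one encounter."""
    if felt_threatened:
        return P_THREAT_GIVEN_BAD, P_THREAT_GIVEN_GOOD
    return 1 - P_THREAT_GIVEN_BAD, 1 - P_THREAT_GIVEN_GOOD


def posterior_over_fraction(observations):
    """Posterior over f built from raw observations only; verdicts never feed in."""
    weights = [1.0] * len(GRID)  # uniform meta-prior (assumption)
    for obs in observations:
        p_bad, p_good = likelihoods(obs)
        weights = [w * (f * p_bad + (1 - f) * p_good) for w, f in zip(weights, GRID)]
    total = sum(weights)
    return [w / total for w in weights]


def verdict(index, observations):
    """P(the dog at `index` was bad | all observations so far)."""
    post = posterior_over_fraction(observations)
    p_bad, p_good = likelihoods(observations[index])
    # Given f, encounters are independent, so average the per-f verdict
    # over the posterior on f.
    return sum(
        w * (f * p_bad) / (f * p_bad + (1 - f) * p_good)
        for w, f in zip(post, GRID)
    )


# One scary encounter, judged on its own, then re-judged after ten friendly ones.
first_only = verdict(0, [True])
after_puppies = verdict(0, [True] + [False] * 10)
print(first_only, after_puppies)
```

The verdict on the first dog is recomputed from the same raw observations each time, so it can drop after many friendly puppies without ever having been counted as evidence itself. Changing `P_THREAT_GIVEN_BAD` is one way to play with the question of what happens when someone's value differs from 2/3.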

There are many open questions, though. For example: should one also update the operational definition of the word "bad" over time? In my example, we initially assume that a "bad dog" will make us feel threatened 2/3 of the time, and this remains fixed for life. But perhaps this should also update, adding one more layer of Bayesian computation. (Perhaps this is exactly the part which is "broken" in people suffering from a phobia: perhaps they have a different value than 2/3, or let it drift too much or too little compared to the rest of us?) I don't know how to do it without risking that "bad" and "good" will lose their meaning over time. The 1/3 vs 2/3 split might also change for a different reason: perhaps over time we learn something about our sensors' reliability, their mapping to reality, etc. This might introduce another layer of Bayes. I'd like to know what a full-fledged, correct, Bayesian mental model of this "simple" issue of "Are dogs bad?" should really look like.

I think this boils down to the reference class problem (https://en.wikipedia.org/wiki/Reference_class_problem). How similar is this novel situation (a quantum configuration distinct from any that has ever existed in the universe) to whatever you've used to come up with a prior?

Are you thinking "are dogs good", or "are dogs I have encountered on this corner good", or "are dogs wearing a yellow collar that I have encountered on this corner at 3:15pm when it's not raining good", or ...? And, of course, with enough specificity, you have zero examples that will have updated your universal prior. Even if you add second-hand or third-hand data, how many reports of good or bad interactions with dogs of this breed, weight, age, location, and interval since last meal do you have to compare against?

This doesn't make Bayes useless, but you have to understand that this style of probability is about your uncertainty, not about any underlying real thing. Your mental models about categorization and induction are part of your prediction framework, just as much as individual updates (because they tell you how/when to apply updates). Now you get to assign a prior that your model is useful for this update, as well as the update itself. And update the probability that your model did or did not apply using a meta-model based on the outcome. And so on until your finite computing substrate gives up and lets system 1 make a guess.