There is a third use of Bayesianism, the way that sophisticated economists and political scientists use it: as a useful fiction for modeling agents who try to make good decisions in light of their beliefs and preferences. I’d guess that this is useful for AI, too. These will be really complicated systems and we don’t know much about their details yet, but it will plausibly be reasonable to model them as “trying to make good decisions in light of their beliefs and preferences”.
Perhaps a fourth use is that we might actively want to try to make our systems more like Bayesian reasoners, at least in some cases.

My post was intended to critique these positions too. In particular, the responses I'd give are that:

  • There are many ways to model agents as “trying to make good decisions in light of their beliefs and preferences”. I expect bayesian ideas to be useful for very simple models, where you can define a set of states to have priors and preferences over. For more complex and interesting models, I think most of the work is done by considering the cognition the agents are doing, and I don't think bayesianism gives you particular insight into that for the same reasons I don't think it gives you particular insight into human cognition.
  • In response to "The Bayesian framework plausibly allows us to see failure modes that are common to many boundedly rational agents": in general I believe that looking at things from a wide range of perspectives allows you to identify more failure modes - for example, thinking of an agent as a chaotic system might inspire you to investigate adversarial examples. Nevertheless, apart from this sort of inspiration, I think that the bayesian framework is probably harmful when applied to complex systems because it pushes people into using misleading concepts like "boundedly rational" (compare your claim with the claim that a model in which all animals are infinitely large helps us identify properties that are common to "boundedly sized" animals).
  • "We might actively want to try to make our systems more like Bayesian reasoners": I expect this not to be a particularly useful approach, insofar as bayesian reasoners don't do "reasoning". If our only reason to think that explicit utility functions are feasible in practical AGI is that ideal bayesian reasoners use them, then I want to discourage people from spending their time on that instead of something else.

Against strong bayesianism

by ricraz, 30th Apr 2020


(Note that this post has been edited in response to feedback and comments. In particular, I've added the word "strong" into the title, and an explanation of it at the beginning of the post, to be clearer what position I'm critiquing. I've also edited the discussion of Blockhead).

In this post I want to lay out some intuitions about why bayesianism is not very useful as a conceptual framework for thinking either about AGI or human reasoning. This is not a critique of bayesian statistical methods; it’s instead aimed at the philosophical position that bayesianism defines an ideal of rationality which should inform our perspectives on less capable agents, also known as "strong bayesianism". As described here:

The Bayesian machinery is frequently used in statistics and machine learning, and some people in these fields believe it is very frequently the right tool for the job.  I’ll call this position “weak Bayesianism.”  There is a more extreme and more philosophical position, which I’ll call “strong Bayesianism,” that says that the Bayesian machinery is the single correct way to do not only statistics, but science and inductive inference in general – that it’s the “aspirin in willow bark” that makes science, and perhaps all speculative thought, work insofar as it does work.

Or another way of phrasing the position, from Eliezer:

Whatever approximation you use, it works to the extent that it approximates the ideal Bayesian calculation - and fails to the extent that it departs.

First, let’s talk about Blockhead: Ned Block’s hypothetical AI that consists solely of a gigantic lookup table. Consider a version of Blockhead that comes pre-loaded with the optimal actions (according to a given utility function) for any sequence of inputs which takes less than a million years to observe. So for the next million years, Blockhead will act just like an ideal superintelligent agent. Suppose I argued that we should therefore study Blockhead in order to understand advanced AI better. Why is this clearly a bad idea? Well, one problem is that Blockhead is absurdly unrealistic; you could never get anywhere near implementing it in real life. More importantly, even though Blockhead gets the right answer on all the inputs we give it, it’s not doing anything remotely like thinking or reasoning.

The general lesson here is that we should watch out for when a purported "idealised version" of some process is actually a different type of thing to the process itself. This is particularly true when the idealisation is unimaginably complex, because it might be hiding things in the parts which we can’t imagine. So let's think about what an ideal bayesian reasoner like a Solomonoff inductor actually does. To solve the grain of truth problem, the set of hypotheses it represents needs to include every possible way that the universe could be. We don't yet have any high-level language which can describe all these possibilities, so the only way to do so is listing all possible Turing machines. Then in order to update the probabilities in response to new evidence, it needs to know how that entire universe evolves up to the point where the new evidence is acquired.
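
To make the update step concrete, here is a minimal sketch of exact bayesian updating over a tiny, explicitly enumerated hypothesis set. The coin-bias hypotheses and the names `bayes_update` and `coin_likelihood` are illustrative assumptions, nothing like the space of Turing machines a Solomonoff inductor would range over; the point is only that every hypothesis has to be listed up front and asked to predict each observation.

```python
from fractions import Fraction

def bayes_update(prior, likelihood, observation):
    """Return the posterior over hypotheses after one observation.

    prior: dict mapping hypothesis -> probability (summing to 1)
    likelihood: function (hypothesis, observation) -> P(observation | hypothesis)
    """
    unnormalised = {h: p * likelihood(h, observation) for h, p in prior.items()}
    total = sum(unnormalised.values())
    if total == 0:
        # No listed hypothesis allowed for this observation:
        # the hypothesis set lacks a "grain of truth".
        raise ValueError("observation has zero probability under every hypothesis")
    return {h: w / total for h, w in unnormalised.items()}

# Hypothetical toy hypothesis set: each hypothesis is a coin's probability of heads.
hypotheses = [Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
prior = {h: Fraction(1, 3) for h in hypotheses}

def coin_likelihood(bias, flip):
    # Probability of the observed flip under a coin with the given bias.
    return bias if flip == "H" else 1 - bias

posterior = prior
for flip in "HHT":
    posterior = bayes_update(posterior, coin_likelihood, flip)
```

After the flips H, H, T the posterior is 3/20, 2/5 and 9/20 for biases 1/4, 1/2 and 3/4 respectively. Even in this toy case, each hypothesis must assign a probability to every observation, which is the part that becomes "simulate the whole universe" in the idealised version.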

In other words, an ideal bayesian is not thinking in any reasonable sense of the word - instead, it’s simulating every logically possible universe. By default, we should not expect to learn much about thinking based on analysing a different type of operation that just happens to look the same in the infinite limit. Similarly, the version of Blockhead I described above is basically an optimal tabular policy in reinforcement learning. In reinforcement learning, we’re interested in learning policies which process information about their surroundings - but the optimal tabular policy for any non-trivial environment is too large to ever be learned, and when run does not actually do any information-processing! Yet it's particularly effective as a red herring because we can do proofs about it, and because it can be calculated in some tiny environments.
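
A minimal sketch of why a tabular policy does no information-processing at run time. The two-symbol observation alphabet and the helpers `build_table` and `act` are made up for illustration: all the work happens when the table is filled in, and acting is a single dictionary lookup.

```python
from itertools import product

OBSERVATIONS = ("a", "b")  # hypothetical two-symbol observation alphabet

def build_table(max_len, choose_action):
    """Precompute an action for every observation history up to max_len.

    choose_action stands in for whatever process picks the "optimal" action;
    the table itself never reasons about anything.
    """
    table = {}
    for length in range(1, max_len + 1):
        for history in product(OBSERVATIONS, repeat=length):
            table[history] = choose_action(history)
    return table

def act(table, history):
    # Acting is a pure lookup: no inference happens at run time.
    return table[tuple(history)]

# Placeholder rule used only to fill the table: repeat the latest observation.
table = build_table(10, lambda history: history[-1])
print(len(table))  # 2 + 4 + ... + 2**10 = 2046 entries
```

The table size grows exponentially with the history length, even for two possible observations, which is why such a policy can only be written down for tiny environments.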

You might argue that bayesianism is conceptually useful, and thereby helps real humans reason better. But I think that concepts in bayesianism are primarily useful because they have suggestive names, which make it hard to realise how much work our intuitions are doing to translate from ideal bayesianism to our actual lives. For more on what I mean by this, consider the following (fictional) dialogue:

Alice the (literal-minded) agnostic: I’ve heard about this bayesianism thing, and it makes sense that I should do statistics using bayesian tools, but is there any more to it?

Bob the bayesian: Well, obviously you can’t be exactly bayesian with finite compute. But the intuition that you should try to be more like an ideal bayesian is a useful one which will help you have better beliefs about the world. In fact, most of what we consider to be “good reasoning” is some sort of approximation to bayesianism.

A: So let me try to think more like an ideal bayesian for a while, then. Well, the first thing is - you’re telling me that a lot of the things I’ve already observed to be good reasoning are actually approximations to bayesianism, which means I should take bayesianism more seriously. But ideal bayesians don’t update on old evidence. So if I’m trying to be more like an ideal bayesian, I shouldn’t change my mind about how useful bayesianism is based on those past observations.

B: No, that’s silly. Of course you should. Ignoring old evidence only makes sense when you’ve already fully integrated all its consequences into your understanding of the world.

A: Oh, I definitely haven’t done that. But speaking of all the consequences - what if I’m in a simulation? Or an evil demon is deceiving me? Should I think about as many such skeptical hypotheses as I can, to be more like an ideal bayesian who considers every hypothesis?

B: Well, technically ideal bayesians consider every hypothesis, but only because they have infinite compute! In practice you shouldn’t bother with many far-fetched hypotheses, because that’s a waste of your limited time.*

A: But what if I have some evidence towards that hypothesis? For example, I just randomly thought of the hypothesis that the universe has exactly a googolplex atoms in it. But there's some chance that this thought was planted in my mind by a higher power to allow me to figure out the truth! I should update on that, right?

B: Look, in practice that type of evidence is not worth keeping track of. You need to use common sense to figure out when to actually make the effort of updating.

A: Hmm, alright. But when it comes to the hypotheses I do consider, they should each be an explicit description of the entire universe, right, like an ideal bayesian’s hypotheses?

B: No, that’s way too hard for a human to do.

A: Okay, so I’ll use incomplete hypotheses, and then assign probabilities to each of them. I guess I should calculate as many significant digits of my credences as possible, then, to get them closer to the perfectly precise real-valued credences that an ideal bayesian has?

B: Don’t bother. Imprecise credences are good enough except when you’re solving mathematically precise problems.

A: Speaking of mathematical precision, I know that my credences should never be 0 or 1. But when an ideal bayesian conditions on evidence they’ve received, they’re implicitly being certain about what that evidence is. So should I also be sure that I’ve received the evidence I think I have?

B: No-

A: Then since I’m skipping all these compute-intensive steps, I guess getting closer to an ideal bayesian means I also shouldn’t bother to test my hypotheses by making predictions about future events, right? Because an ideal bayesian gets no benefit from doing so - they can just make updates after they see the evidence.

B: Well, it’s different, because you’re biased. That’s why science works, because making predictions protects you from post-hoc rationalisation.

A: Fine then. So what does it actually mean to be more like an ideal bayesian?

B: Well, you should constantly be updating on new evidence. And it seems like thinking of degrees of belief as probabilities, and starting from base rates, are both helpful. And then sometimes people conditionalise wrong on simple tasks, so you need to remind them how to do so.

A: But these aren’t just bayesian ideas - frequentists are all about base rates! Same with “when the evidence changes, I change my mind” - that one’s obvious. Also, when people try to explicitly calculate probabilities, sometimes they’re way off.** What’s happening there?

B: Well, in complex real-world scenarios, you can’t trust your explicit reasoning. You have to fall back on intuitions like “Even though my inside view feels very solid, and I think my calculations account for all the relevant variables, there’s still a reasonable chance that all my models are wrong.”

A: So why do people advocate for the importance of bayesianism for thinking about complex issues if it only works in examples where all the variables are well-defined and have very simple relationships?

B: I think bayesianism has definitely made a substantial contribution to philosophy. It tells us what it even means to assign a probability to an event, and cuts through a lot of metaphysical bullshit.

Back to the authorial voice. Like Alice, I'm not familiar with any principled or coherent characterisation of what trying to apply bayesianism actually means. It may seem that Alice’s suggestions are deliberately obtuse, but I claim these are the sorts of ideas you’d consider if you seriously tried to consistently “become more bayesian”, rather than just using bayesianism to justify types of reasoning you endorse for other reasons.

I agree with Bob that the bayesian perspective is useful for thinking about the type signature of calculating a subjective probability: it’s a function from your prior beliefs and all your evidence to numerical credences, whose quality should be evaluated using a proper scoring rule. But for this insight, just like Bob’s insights about using base rates and updating frequently, we don’t need to make any reference to optimality proofs or the idealised limit of brute force search. In fact, doing so often provides an illusion of objectivity which is ultimately harmful. I do agree that most things people identify as tenets of bayesianism are useful for thinking about knowledge; but I claim that they would be just as useful, and better-justified, if we forced each one to stand or fall on its own.
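
The type signature and scoring rule mentioned above can be sketched as follows. This is an illustration under stated assumptions: the names `Credence`, `log_score`, `brier_score` and `expected_brier` are hypothetical, and the logarithmic and Brier rules are standard examples of proper scoring rules rather than ones singled out in this post.

```python
import math
from typing import Callable, Sequence

# Type signature of forming a credence: (prior, evidence) -> probability in [0, 1].
Credence = Callable[[float, Sequence[str]], float]

def log_score(p: float, outcome: bool) -> float:
    """Logarithmic score: log of the probability given to what happened (higher is better)."""
    return math.log(p if outcome else 1.0 - p)

def brier_score(p: float, outcome: bool) -> float:
    """Brier score: squared error of the forecast (lower is better)."""
    return (p - (1.0 if outcome else 0.0)) ** 2

def expected_brier(report: float, true_p: float) -> float:
    """Expected Brier score of reporting `report` when the event has probability `true_p`."""
    return true_p * brier_score(report, True) + (1 - true_p) * brier_score(report, False)

# Properness: the expected score is optimised by reporting your actual credence.
candidates = [i / 100 for i in range(1, 100)]
best = min(candidates, key=lambda r: expected_brier(r, 0.7))
```

Here `best` comes out as 0.7: if you believe the event has probability 0.7, no other report on the grid has a lower expected Brier score. Nothing in this check appeals to an ideal reasoner; it only compares reported credences against outcomes.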

* Abram Demski has posted about moving past bayesianism by accounting for logical uncertainty to a greater extent, but I think that arguments similar to the ones I’ve made above are also applicable to logical inductors (although I’m less confident about this).

** You can probably fill in your own favourite example of this. The one I was thinking about was a post where someone derived that the probability of extinction from AI was less than 1 in 10^200; but I couldn’t find it.