Note that this post has been edited in response to feedback and comments. In particular, I've added the word "strong" into the title, and an explanation of it at the beginning of the post, to be clearer what position I'm critiquing. I've also edited the discussion of Blockhead.
In this post I want to lay out some intuitions about why bayesianism is not very useful as a conceptual framework for thinking about either AGI or human reasoning. This is not a critique of bayesian statistical methods; it’s instead aimed at the philosophical position that bayesianism defines an ideal of rationality which should inform our perspectives on less capable agents, also known as "strong bayesianism". As described here:
The Bayesian machinery is frequently used in statistics and machine learning, and some people in these fields believe it is very frequently the right tool for the job. I’ll call this position “weak Bayesianism.” There is a more extreme and more philosophical position, which I’ll call “strong Bayesianism,” that says that the Bayesian machinery is the single correct way to do not only statistics, but science and inductive inference in general – that it’s the “aspirin in willow bark” that makes science, and perhaps all speculative thought, work insofar as it does work.
Or another way of phrasing the position, from Eliezer:
Whatever approximation you use, it works to the extent that it approximates the ideal Bayesian calculation - and fails to the extent that it departs.
First, let’s talk about Blockhead: Ned Block’s hypothetical AI that consists solely of a gigantic lookup table. Consider a version of Blockhead that comes pre-loaded with the optimal actions (according to a given utility function) for any sequence of inputs which takes less than a million years to observe. So for the next million years, Blockhead will act just like an ideal superintelligent agent. Suppose I argued that we should therefore study Blockhead in order to understand advanced AI better. Why is this clearly a bad idea? Well, one problem is that Blockhead is absurdly unrealistic; you could never get anywhere near implementing it in real life. More importantly, even though Blockhead gets the right answer on all the inputs we give it, it’s not doing anything remotely like thinking or reasoning.
The general lesson here is that we should watch out for when a purported "idealised version" of some process is actually a different type of thing to the process itself. This is particularly true when the idealisation is unimaginably complex, because it might be hiding things in the parts which we can’t imagine. So let's think about what an ideal bayesian reasoner like a Solomonoff inductor actually does. To solve the grain of truth problem, the set of hypotheses it represents needs to include every possible way that the universe could be. We don't yet have any high-level language which can describe all these possibilities, so the only way to do so is listing all possible Turing machines. Then in order to update the probabilities in response to new evidence, it needs to know how that entire universe evolves up to the point where the new evidence is acquired.
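To make the cost of this concrete, here is a minimal Python sketch of a toy inductor. It is not Solomonoff induction: as a stand-in for "all Turing machines" it assumes a hypothesis class of repeating bit patterns up to a fixed length, and the prior of 2^(-2n) for a pattern of length n is likewise an assumption chosen for illustration. The names `hypotheses` and `predict_next` are hypothetical.

```python
from itertools import product


def hypotheses(max_len):
    """Yield every repeating bit pattern of length 1..max_len.

    Assumption: these patterns stand in for 'every possible program'.
    """
    for n in range(1, max_len + 1):
        for pattern in product((0, 1), repeat=n):
            yield pattern


def predict_next(observed, max_len=8):
    """Posterior probability that the next bit is 1, given the history."""
    weight_one = 0.0
    weight_total = 0.0
    for pattern in hypotheses(max_len):
        n = len(pattern)
        # Each hypothesis is re-run over the entire history to check it.
        if all(pattern[i % n] == bit for i, bit in enumerate(observed)):
            prior = 2.0 ** (-2 * n)  # assumed prior: shorter patterns weigh more
            weight_total += prior
            if pattern[len(observed) % n] == 1:
                weight_one += prior
    if weight_total == 0.0:
        # Nothing in the hypothesis class can produce this history.
        raise ValueError("no hypothesis fits the history (no grain of truth)")
    return weight_one / weight_total


print(predict_next([0, 1, 0, 1, 0, 1]))
```

Even with patterns capped at length 8 there are 510 hypotheses, and each one is replayed over the whole history for every prediction. A history such as eight 0s followed by a 1 falls outside the class entirely, which is the toy version of lacking a grain of truth.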
In other words, an ideal bayesian is not thinking in any reasonable sense of the word - instead, it’s simulating every logically possible universe. By default, we should not expect to learn much about thinking based on analysing a different type of operation that just happens to look the same in the infinite limit. Similarly, the version of Blockhead I described above is basically an optimal tabular policy in reinforcement learning. In reinforcement learning, we’re interested in learning policies which process information about their surroundings - but the optimal tabular policy for any non-trivial environment is too large to ever be learned, and when run does not actually do any information-processing! Yet it's particularly effective as a red herring because we can do proofs about it, and because it can be calculated in some tiny environments.
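As a rough illustration of the scale involved, the sketch below counts the entries that a history-indexed lookup table would need. The function name `table_entries`, the toy table, and the example numbers (256 possible observations per step, 100 steps) are assumptions for illustration, not anything taken from Block's thought experiment.

```python
def table_entries(num_observations: int, horizon: int) -> int:
    """Count distinct observation histories of length 1 to horizon.

    A Blockhead-style policy needs one stored action per history.
    """
    return sum(num_observations ** t for t in range(1, horizon + 1))


# A toy table: two possible observations, histories up to length 2.
# The stored actions are arbitrary placeholders.
toy_policy = {
    (0,): "left", (1,): "right",
    (0, 0): "left", (0, 1): "right", (1, 0): "right", (1, 1): "left",
}


def act(history):
    # "Running" the policy is a single lookup; nothing is computed.
    return toy_policy[tuple(history)]


print(table_entries(2, 2) == len(toy_policy))
print(table_entries(256, 100) > 10 ** 80)
```

With one byte of observation per step, a table covering only 100 steps already needs more than 10^80 entries, which is above the commonly cited rough estimate for the number of atoms in the observable universe.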
You might argue that bayesianism is conceptually useful, and thereby helps real humans reason better. But I think that concepts in bayesianism are primarily useful because they have suggestive names, which make it hard to realise how much work our intuitions are doing to translate from ideal bayesianism to our actual lives. For more on what I mean by this, consider the following (fictional) dialogue:
Alice the (literal-minded) agnostic: I’ve heard about this bayesianism thing, and it makes sense that I should do statistics using bayesian tools, but is there any more to it?
Bob the bayesian: Well, obviously you can’t be exactly bayesian with finite compute. But the intuition that you should try to be more like an ideal bayesian is a useful one which will help you have better beliefs about the world. In fact, most of what we consider to be “good reasoning” is some sort of approximation to bayesianism.
A: So let me try to think more like an ideal bayesian for a while, then. Well, the first thing is - you’re telling me that a lot of the things I’ve already observed to be good reasoning are actually approximations to bayesianism, which means I should take bayesianism more seriously. But ideal bayesians don’t update on old evidence. So if I’m trying to be more like an ideal bayesian, I shouldn’t change my mind about how useful bayesianism is based on those past observations.
B: No, that’s silly. Of course you should. Ignoring old evidence only makes sense when you’ve already fully integrated all its consequences into your understanding of the world.
A: Oh, I definitely haven’t done that. But speaking of all the consequences - what if I’m in a simulation? Or an evil demon is deceiving me? Should I think about as many such skeptical hypotheses as I can, to be more like an ideal bayesian who considers every hypothesis?
B: Well, technically ideal bayesians consider every hypothesis, but only because they have infinite compute! In practice you shouldn’t bother with many far-fetched hypotheses, because that’s a waste of your limited time.*
A: But what if I have some evidence towards that hypothesis? For example, I just randomly thought of the hypothesis that the universe has exactly a googolplex atoms in it. But there's some chance that this thought was planted in my mind by a higher power to allow me to figure out the truth! I should update on that, right?
B: Look, in practice that type of evidence is not worth keeping track of. You need to use common sense to figure out when to actually make the effort of updating.
A: Hmm, alright. But when it comes to the hypotheses I do consider, they should each be an explicit description of the entire universe, right, like an ideal bayesian’s hypotheses?
B: No, that’s way too hard for a human to do.
A: Okay, so I’ll use incomplete hypotheses, and then assign probabilities to each of them. I guess I should calculate as many significant digits of my credences as possible, then, to get them closer to the perfectly precise real-valued credences that an ideal bayesian has?
B: Don’t bother. Imprecise credences are good enough except when you’re solving mathematically precise problems.
A: Speaking of mathematical precision, I know that my credences should never be 0 or 1. But when an ideal bayesian conditions on evidence they’ve received, they’re implicitly being certain about what that evidence is. So should I also be sure that I’ve received the evidence I think I have?
A: Then since I’m skipping all these compute-intensive steps, I guess getting closer to an ideal bayesian means I also shouldn’t bother to test my hypotheses by making predictions about future events, right? Because an ideal bayesian gets no benefit from doing so - they can just make updates after they see the evidence.
B: Well, it’s different, because you’re biased. That’s why science works, because making predictions protects you from post-hoc rationalisation.
A: Fine then. So what does it actually mean to be more like an ideal bayesian?
B: Well, you should constantly be updating on new evidence. And it seems like thinking of degrees of belief as probabilities, and starting from base rates, are both helpful. And then sometimes people conditionalise wrong on simple tasks, so you need to remind them how to do so.
A: But these aren’t just bayesian ideas - frequentists are all about base rates! Same with “when the evidence changes, I change my mind” - that one’s obvious. Also, when people try to explicitly calculate probabilities, sometimes they’re way off.** What’s happening there?
B: Well, in complex real-world scenarios, you can’t trust your explicit reasoning. You have to fall back on intuitions like “Even though my inside view feels very solid, and I think my calculations account for all the relevant variables, there’s still a reasonable chance that all my models are wrong.”
A: So why do people advocate for the importance of bayesianism for thinking about complex issues if it only works in examples where all the variables are well-defined and have very simple relationships?
B: I think bayesianism has definitely made a substantial contribution to philosophy. It tells us what it even means to assign a probability to an event, and cuts through a lot of metaphysical bullshit.
Back to the authorial voice. Like Alice, I'm not familiar with any principled or coherent characterisation of what trying to apply bayesianism actually means. It may seem that Alice’s suggestions are deliberately obtuse, but I claim these are the sorts of ideas you’d consider if you seriously tried to consistently “become more bayesian”, rather than just using bayesianism to justify types of reasoning you endorse for other reasons.
I agree with Bob that the bayesian perspective is useful for thinking about the type signature of calculating a subjective probability: it’s a function from your prior beliefs and all your evidence to numerical credences, whose quality should be evaluated using a proper scoring rule. But for this insight, just like Bob’s insights about using base rates and updating frequently, we don’t need to make any reference to optimality proofs or the idealised limit of brute force search. In fact, doing so often provides an illusion of objectivity which is ultimately harmful. I do agree that most things people identify as tenets of bayesianism are useful for thinking about knowledge; but I claim that they would be just as useful, and better-justified, if we forced each one to stand or fall on its own.
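What "proper" means for a scoring rule can be sketched in a few lines of Python. This assumes binary events and uses two standard proper scoring rules, the logarithmic score and the negated Brier score; the function names and the 0.7 example are hypothetical choices for illustration.

```python
import math


def log_score(p, outcome):
    """Logarithmic score: log of the probability given to what happened."""
    return math.log(p if outcome else 1.0 - p)


def brier_score(p, outcome):
    """Negated Brier score, so that higher is better for both rules."""
    target = 1.0 if outcome else 0.0
    return -((p - target) ** 2)


def expected_score(score, reported, true_prob):
    """Average score for reporting `reported` when the event has chance `true_prob`."""
    return (true_prob * score(reported, True)
            + (1.0 - true_prob) * score(reported, False))


def best_report(score, true_prob):
    """Search a grid of credences 0.01..0.99 for the best one to report."""
    grid = [i / 100 for i in range(1, 100)]
    return max(grid, key=lambda p: expected_score(score, p, true_prob))


print(best_report(log_score, 0.7), best_report(brier_score, 0.7))
```

Under both rules the best credence to report is the one you actually hold, which is all that "proper" requires. Nothing in the sketch refers to an idealised reasoner.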
* Abram Demski has posted about moving past bayesianism by accounting for logical uncertainty to a greater extent, but I think that arguments similar to the ones I’ve made above are also applicable to logical inductors (although I’m less confident about this).
** You can probably fill in your own favourite example of this. The one I was thinking about was a post where someone derived that the probability of extinction from AI was less than 1 in 10^200; but I couldn’t find it.