Predictive coding = RL + SL + Bayes + MPC

by steve2152 · 10th Dec 2019



Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

I was confused and skeptical for quite a while about some aspects of predictive coding—and it's possible I'm still confused—but after reading a number of different perspectives on brain algorithms, the following picture popped into my head and I felt much better:

This is supposed to be a high-level perspective on how the neocortex[1] builds a predictive world-model and uses it to choose appropriate actions. (a) We generate lots of "hypotheses", a.k.a. generative models (also called "patterns" by Kurzweil, or "subagents"[2] by various people[3]), in parallel about what's going on and what's going to happen next, including what I am doing and will do next (i.e., my plans). The hypotheses gain "prominence" by:

  • (b) correctly predicting upcoming sensory inputs;

  • (c) correctly predicting other types of input information coming into the neocortex, like tiredness, guilt, hunger, warmth, pain, pleasure, and so on;

  • (d) being compatible with other already-prominent hypotheses;

  • (e) predicting that my innate drives will be satisfied—my goals will be fulfilled with minimal effort, I'll be eating soon if I'm hungry, I'll be sleeping soon if I'm tired, I'll avoid pain, and so on.

Whatever candidate hypothesis winds up the most "prominent" wins, and determines my beliefs and actions going forward.
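To make the cartoon concrete, here's a toy Python sketch of that scoring story. Everything in it (the class, the scores, the additive combination) is made up for illustration, and the caveats just below explain ways the real process is not like this:

```python
from dataclasses import dataclass

@dataclass
class Hypothesis:
    name: str
    sensory_fit: float    # (b) how well it predicted recent sensory inputs
    signal_fit: float     # (c) how well it predicted hunger/pain/etc. signals
    prior_compat: float   # (d) compatibility with already-prominent hypotheses
    drive_appeal: float   # (e) how well it promises to satisfy innate drives

def prominence(h: Hypothesis) -> float:
    # Crude additive scoring, mirroring the drawn picture (see caveats below)
    return h.sensory_fit + h.signal_fit + h.prior_compat + h.drive_appeal

hypotheses = [
    Hypothesis("I'm about to get up and eat a snack", 0.5, 0.5, 0.6, 0.9),
    Hypothesis("I'm about to keep reading",           0.6, 0.5, 0.7, 0.3),
]
print(max(hypotheses, key=prominence).name)
```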

Before we get to details, I need to apologize for the picture being misleading:

  • First, I drew (b,c,d,e) as happening after (a), but really some of these (especially (d) I think) work by affecting which hypotheses get considered in the first place. (More generally, I do not want to imply that a,b,c,d,e correspond to exactly five distinct neural mechanisms, or anything like that. I'm just going for a functional perspective in this post.)

  • Second (and relatedly), I depicted it as if we simply add up points for (b-e), but it's certainly not linear like that. I think at least some of the considerations effectively get vetoes. For example, we don't generally see a situation where (e) is so positive that it simply outvotes (b-d), and thus we spend all day checking our wallet expecting to find it magically filled with crisp $1000 bills. (Much more about wishful thinking below; the toy sketch after this list shows one crude way such vetoes might look.)

  • Third, at the bottom I drew one hypothesis being the "winner". Things like action plans and conscious attention do in fact have a winner-take-all dynamic because, for example, we don't want to be sending out muscle commands for both walking and sitting simultaneously.[4] But in general, lower-ranked hypotheses are not thrown out; they linger, with their prominence growing or shrinking as more evidence comes in.
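To make the second and third caveats concrete, here's a variant of the toy scoring above (again, all thresholds and numbers are made up) in which (b-d) act like vetoes rather than additive votes, (e) is a much weaker vote, and losing hypotheses keep a prominence rather than being thrown out:

```python
def prominence_with_vetoes(sensory_fit, signal_fit, prior_compat, drive_appeal):
    # (b)-(d) act more like vetoes than additive votes: a hypothesis that badly
    # contradicts the evidence is crushed no matter how appealing (e) makes it.
    plausibility = min(sensory_fit, signal_fit, prior_compat)
    if plausibility < 0.2:                       # made-up veto threshold
        return 0.0
    return plausibility + 0.1 * drive_appeal     # (e) as a much weaker vote

# Lower-ranked hypotheses aren't thrown out: each keeps a prominence that can
# grow or shrink as more evidence comes in.
prominences = {
    "wallet is magically full of $1000 bills": prominence_with_vetoes(0.01, 0.1, 0.05, 1.0),
    "wallet is exactly as I left it":          prominence_with_vetoes(0.9, 0.6, 0.9, 0.1),
}
print(prominences)
```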

Anyway, the picture above tells a nice story:

(b) is self-supervised learning[5], i.e. learning from prediction. Process (b) simply votes against hypotheses when they make incorrect predictions. This process is where we get the vast majority of the information content we need to build a good predictive world-model. Note that there doesn't seem to be any strong difference in the brain between (i) actual experiences, (ii) memory recall, and (iii) imagination—process (b) will vote for or against hypotheses when presented with any of those three types of "evidence".
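A minimal sketch of what process (b) amounts to, with made-up names and numbers: each hypothesis predicts the next input, and loses prominence in proportion to how wrong it was. No labels are needed; the prediction error itself is the training signal.

```python
def self_supervised_update(prominences, predictions, actual_input, lr=0.5):
    """prominences, predictions: dicts keyed by hypothesis name."""
    for name, predicted in predictions.items():
        error = abs(predicted - actual_input)   # prediction error is the training signal
        prominences[name] -= lr * error         # vote against hypotheses that mispredict
    return prominences

prominences = {"ball will bounce": 1.0, "ball will splat": 1.0}
predictions = {"ball will bounce": 0.8, "ball will splat": 0.1}  # predicted rebound height
print(self_supervised_update(prominences, predictions, actual_input=0.7))
```

The same update would run whether the "actual input" comes from live experience, memory recall, or imagination, per the note above.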

(c) is credit assignment, i.e. learning what aspects of the world cause good or bad things to happen to us, so that we can make good decisions. Each hypothesis makes claims about which parts of its model are the cause of subcortex-provided signals (analogous to "reward" in RL)—signals that say we're in pain, or eating yummy food, or exhausted, or scared, etc. These claims cash out as predictions that can prove right or wrong, thus either supporting or casting doubt on that hypothesis. So our internal models say that "cookies are yummy", corresponding to a prediction that, if we eat one, we'll get a "yummy" signal from some ancient reptilian part of our brain.
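Here's a toy sketch of process (c), with made-up numbers: the claim "cookies are yummy" cashes out as a prediction about an upcoming subcortical signal, and the claim gains or loses credibility depending on whether that prediction comes true.

```python
def credit_assignment_update(predicted_signal, actual_signal, credibility, lr=0.3):
    # The hypothesis predicted a subcortical signal ("yummy"); compare with what
    # actually arrived, nudge the stored association toward reality, and dock
    # the hypothesis's credibility in proportion to how wrong it was.
    error = actual_signal - predicted_signal
    new_association = predicted_signal + lr * error
    new_credibility = credibility - abs(error)
    return new_association, new_credibility

# "Cookies are yummy" predicts a strong yummy-signal (0.9); we eat one and the
# ancient reptilian part of the brain reports 0.7, so the claim survives,
# slightly adjusted, with a small ding to its credibility.
print(credit_assignment_update(predicted_signal=0.9, actual_signal=0.7, credibility=1.0))
```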

(d) is Bayesian priors. I doubt we do Bayesian updating in a literal mathematical sense, but we certainly do incorporate prior beliefs into our interpretation of new evidence. I'm claiming that the mechanism for this is "hypotheses gain prominence by being compatible with already-prominent hypotheses". What is an "already-prominent hypothesis"? One that has previously been successful in this same process I'm describing here, especially if in similar contexts, and super-especially if in the immediate past. Such hypotheses function as our priors. And what does it mean for a new hypothesis to be "compatible" with these prior hypotheses? Well, a critical fact about these generative models is that they snap together like Legos, allowing hierarchies, recursion, composition, analogies, causal relationships, and so on. (Thus, I've never seen a rubber wine glass, but I can easily create a mental model of one by gluing together some of my rubber-related generative models with some of my wine-glass-related generative models.) Over time we build up these super-complicated and intricate Rube Goldberg hypotheses, approximately describing our even-more-complicated world. I think a new hypothesis is "compatible" with a prior one when (1) the new hypothesis is almost the same as the prior hypothesis apart from just one or two simple edits, like adding a new bridging connection to a different already-known model; and/or (2) when the new hypothesis doesn't make predictions counter to the prior one, at least not in areas where the prior one is very precise and confident.[6] Something like that anyway, I think...
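Here's a toy sketch of one way the "compatibility" bonus in (d) could be scored, illustrating only sense (2) above (not contradicting confident prior hypotheses); the prominences, confidences, and "questions" are entirely made up:

```python
def compatibility_bonus(new_predictions, prior_hypotheses):
    """new_predictions: {question: answer}.
    prior_hypotheses: list of (prominence, {question: (answer, confidence)})."""
    bonus = 0.0
    for prominence, prior_predictions in prior_hypotheses:
        for question, answer in new_predictions.items():
            if question in prior_predictions:
                prior_answer, confidence = prior_predictions[question]
                # Agreeing with a confident, prominent prior hypothesis helps;
                # contradicting it hurts a lot.
                agreement = 1.0 if answer == prior_answer else -1.0
                bonus += prominence * confidence * agreement
    return bonus

priors = [(0.9, {"bends when squeezed": (True, 0.95)}),   # rubber-related model
          (0.8, {"can hold wine":       (True, 0.90)})]   # wine-glass-related model
rubber_wine_glass = {"bends when squeezed": True, "can hold wine": True}
print(compatibility_bonus(rubber_wine_glass, priors))     # positive: the models snap together
```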

(e) is Model-Predictive Control. If we're hungry, we give extra points to a hypothesis that says we're about to get up and eat a snack, and so on. This works in tandem with credit assignment (process (c)), so if we have a prominent hypothesis that giving speeches will lead to embarrassment, then we will subtract points from a new hypothesis that we will give a speech tomorrow, and we don't need to run the generative model all the way through to the part where we get embarrassed. I like Kaj Sotala's description here: "mental representations...[are] imbued with a context-sensitive affective gloss"—in this case, the mental representation of "I will give a speech" is infused with a negative "will lead to embarrassment" vibe, and hypotheses lose points for containing that vibe. It's context-sensitive because, for example, the "will lead to feeling cold" vibe could be either favorable or unfavorable depending on our current body temperature. Anyway, this framing makes a lot of sense for choosing actions, and amounts to using control theory to satisfy our innate drives. But if we're just passively observing the world, this framework is kinda problematic...
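Here's a toy sketch of (e) as model-predictive control, with a made-up internal model and made-up drives: candidate plans are rolled out under the model and scored by how well the predicted outcome satisfies current drives, with the "feeling cold" outcome getting a context-sensitive valence.

```python
def predicted_outcome(plan):
    # Stand-in for running a generative model forward from "I will do <plan>".
    return {"get up and eat a snack": {"hunger_relief": 1.0, "feel_cold": 0.0},
            "step outside":           {"hunger_relief": 0.0, "feel_cold": 1.0},
            "keep working":           {"hunger_relief": 0.0, "feel_cold": 0.0}}[plan]

def drive_score(outcome, hungry, overheated):
    score = outcome["hunger_relief"] * (1.0 if hungry else 0.0)
    # Context-sensitive affective gloss: "will lead to feeling cold" is
    # favorable when we're overheated and unfavorable otherwise.
    score += outcome["feel_cold"] * (1.0 if overheated else -1.0)
    return score

plans = ["get up and eat a snack", "step outside", "keep working"]
best = max(plans, key=lambda p: drive_score(predicted_outcome(p), hungry=True, overheated=False))
print(best)   # -> "get up and eat a snack"
```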

(e) is also wishful thinking. Let's say someone gives us an unmarked box with a surprise gift inside. According to the role of (e) in the picture I drew, if we receive the box when we're hungry, we should expect to find food in the box, and if we receive the box when we're in a loud room, we should expect to find earplugs in the box, etc. Well, that's not right. Wishful thinking does exist, but it doesn't seem so inevitable and ubiquitous as to deserve a seat right near the heart of human cognition. One option is to declare that one of the core ideas of Predictive Coding theory—unifying world-modeling and action-selection within the same computational architecture—is baloney. But I don't think that's the right answer. I think a better approach is to posit that (b-d) are actually pretty restrictive in practice, leaving (e) mainly as a comparatively weak force that can be a tiebreaker between equally plausible hypotheses. In other words, passive observers rarely if ever come across multiple equally plausible hypotheses for what's going on and what will happen next; it would require a big coincidence to balance the scales so precisely. But when we make predictions about what we ourselves will do, that aspect of the prediction is a self-fulfilling prophecy, so we routinely have equally plausible hypotheses...and then (e) can step in and break the tie.
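A toy sketch of (e) as a weak tiebreaker (the scores and threshold are made up): (e) only gets a say among hypotheses that the (b-d) evidence can't tell apart, which happens routinely for self-fulfilling predictions about my own actions and only rarely for passive observation.

```python
def choose(hypotheses, tie_margin=0.05):
    """hypotheses: {name: (evidence_score, drive_appeal)}."""
    best_evidence = max(e for e, _ in hypotheses.values())
    # Keep only the hypotheses that the (b-d) evidence can't distinguish...
    contenders = {name: appeal for name, (e, appeal) in hypotheses.items()
                  if best_evidence - e <= tie_margin}
    # ...and only then let (e) break the tie.
    return max(contenders, key=contenders.get)

# Passive observation: the evidence overwhelmingly favors one reading of the
# box, so wishful thinking never gets a vote.
box = {"the box holds something I can't guess":  (0.9, 0.1),
       "the box holds exactly the snack I want": (0.3, 0.9)}
# Action selection: both plans are equally self-fulfilling, so (e) decides.
actions = {"I will eat a snack": (0.5, 0.9), "I will keep reading": (0.5, 0.2)}
print(choose(box), "|", choose(actions))
```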

More general statement of situations where (e) plays a big role: Maybe "self-fulfilling" is not quite the right terminology for when (e) is important; it's more like "(e) is most important in situations where lots of hypotheses are all incompatible, yet where processes (b,c,d) never get evidence to support one hypothesis over the others." So (e) is central in choosing action-selection hypotheses, since these are self-fulfilling, but (e) plays a relatively minor role in passive observation of the world, since there we have (b,c) keeping us anchored to reality (but (e) does play an occasional role on the margins, and we call it "wishful thinking"). (e) is also important because (b,c,d) by themselves leave this whole process highly under-determined: walking in a forest, your brain can build a better predictive model of trees, of clouds, of rocks, or of nothing at all; (e) is a guiding force that, over time, keeps us on track building useful models for our ecological niche.

One more place where (e) is important: confabulation, rationalization, etc. Here's an example: I reach out to grab Emma's unattended lollipop because I'm hungry and callous, but then I immediately think of an alternate hypothesis, in which I am taking the lollipop because she probably wants me to have it. The second hypothesis gets extra points from the (e) process, because I have an innate drive to conform to social norms, be well-regarded and well-liked, etc. Thus the second hypothesis beats the truthful hypothesis (that I grabbed the lollipop because I was hungry and callous). Why can't the (b) process detect and destroy this lie? Because all that (b) has to go on is my own memory, and perniciously, the second hypothesis has some influence over how I form the memory of grabbing the lollipop. It has covered its tracks! Sneaky! So I can keep doing this kind of thing for years, and the (b) process will never be able to detect and kill this habit of thought. Thus, rationalization winds up more like action selection, and less like wishful thinking, in that it is pretty much ubiquitous and central to cognition.[7]

Side note: Should we lump (d-e) together? When people describe Predictive Coding theory, they tend to lump (d-e) together, to say things like "We have a prior that, when we're hungry, we're going to eat soon." I am proposing that this lumping is not merely bad pedagogy, but is actually conflating together two different things: (d) and (e) are not inextricably unified into a single computational mechanism. (I don't think the previous sentence is obvious, and I'm not super-confident about it.) By the same token, I'm uncomfortable saying that minimizing prediction error is a fundamental operating principle of the brain; I want to say that processes (a-e) are fundamental, and minimizing prediction error is something that arguably happens as an incidental side-effect.

Well, that's my story; it seems to basically make sense, but that could just be my (e) wishful thinking and (e) rationalization talking. :-)


  1. The neocortex is 75% of the human brain by weight, and centrally involved in pretty much every aspect of human intelligence (in partnership with the thalamus and hippocampus). More about the neocortex in my previous post ↩︎

  2. Some hypotheses are not at all subagent-y, like the hypothesis "That is a falling ball and it is going to bounce." Other hypotheses are very subagent-y, like the hypothesis "That is a dangerous hot stove, and I am going to run away from it, and then I will be safe!" ↩︎

  3. See Jan Kulveit's Multi-agent predictive minds and AI alignment, Kaj Sotala's Multiagent models of mind sequence, or of course Marvin Minsky and many others. ↩︎

  4. I described conscious attention and action plans as "winner-take-all" in the competition among hypotheses, but I think it's somewhat more complicated and subtle than that. I also think that picking a winner is not a separate mechanism from (b,c,d,e), or at least not entirely separate. This is a long story that's outside the scope of this post. ↩︎

  5. I have a brief intro to self-supervised learning at the beginning of Self-Supervised Learning and AGI Safety ↩︎

  6. Note that my picture at the top shows parallel processing of hypotheses, but that's not quite right; in order to see whether two prominent hypotheses are making contradictory predictions, we need to exchange information between them. ↩︎

  7. See The Elephant in the Brain etc. ↩︎
