Supervised learning of outputs in the brain

This piece is super interesting, especially the toy models.

A few clarifying questions:

'And not just any learning algorithm! The neocortex runs a learning algorithm, but it's an algorithm that picks outputs to maximize reward, not outputs to anticipate matched supervisory signals. This needs to be its own separate brain module, I think.'

-- Why does it need to be its own separate module? Can you expand on this? And even if separate modules are useful (as per your toy models and different inputs), couldn't the neocortex also be running lookup table like auto or hetero-associative learning?

"Parallel fibers carry the context signals, Purkinje cells produce the output signals, and climbing fibers carry the shoulda signals, and the synapse strengths between a parallel fiber and a Purkinje cell is modified as a function of how recently the parallel fiber and climbing fiber have fired."

-- Can you cite this? I have seen evidence this is the case but also that the context actually comes through the climbing fibers and training (shoulda) signal through the mossy/parallel fibers. Eg here for eyeblink operant conditioning https://www.cs.cmu.edu/afs/cs/academic/class/15883-f17/readings/hesslow-2013.pdf

Can you explain how the accelerator works in more detail (esp as you use it in the later body and cognition toy models 5 and 6)? Why is the cerebellum faster at producing outputs than the neocortex? How does the neocortex carry the "shoulda" signal? Finally I'm confused by this line:

"You can't just take any incoming and outgoing signal lines and pair them up! ...Well, in an accelerator, you can! Let's say the output line in the diagram controls Muscle #847. Which shoulda line needs to be paired up with that? The answer is: it doesn't matter! Just take any neocortex output signal and connect it. And then the neocortex will learn over time that that output line is effectively a controller for Muscle #847."

-- This suggests that the neocortex can learn the cerebellar mapping and short-circuit to use it? Why does it need to go through the cerebellum to do this? Rather than via the motor cortex and efferent connections back to the muscles?

Thank you!

Steven Byrnes (5y)

Thanks for the great questions!!

"Why does it need to be its own separate module?"

Maybe you're ahead of me, but it took me until long after this post—just a couple weeks ago—to realize that you can take a neural circuit set up for RL, and jury-rig it to do supervised learning instead.

I think this is a big part of the story behind what the vmPFC is doing. And, in a certain sense, the amygdala too. More on this in a forthcoming post.

"couldn't the neocortex also be running lookup table like auto or hetero-associative learning ... Why is the cerebellum faster at producing outputs than the neocortex?"

I think of the neocortex as doing analysis-by-synthesis—it searches through a space of generative models for one that matches the input data. There's a lot of recurrency—the signals bounce around until it settles into an equilibrium. For example in this model, there's a forward pass from the input data, and that activates some generative models that seem plausible. But it may be multiple models that are mutually inconsistent. For example, in the vision system, "Yarn" and "Yam" are sufficiently close that a feedforward pass would activate both possibilities simultaneously. Then there's this message-passing algorithm where the different possibilities compete to explain the data, and it settles on one particular compositional generative model.

So this seems like a generally pretty slow inference algorithm. But the slowness is worth it, because the neocortex winds up understanding the input data, i.e. fitting it into its structured model of the world, and hence it can now flexibly query it, make predictions, etc.

I think the cerebellum is much simpler than that, and closer to an actual lookup table, and hence presumably much faster.

The cerebellum is also closer in proximity to the spinal cord, which reduces communication delays when reading proprioceptive nerves and commanding muscles.
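A toy sketch of that speed contrast, with everything in it assumed for illustration: `settle` is a hypothetical winner-take-all loop standing in for the slow, recurrent competition between candidate models, and the dictionary read at the end stands in for a lookup-table-like cerebellum. The function name, the inhibition constant, and the scores are made up, and this is not meant as a model of real cortical message passing.

```python
def settle(scores, inhibition=0.2, max_steps=100):
    """Toy winner-take-all: candidates suppress each other until one is left.

    Returns (winner, steps_taken). A hypothetical stand-in for slow,
    recurrent inference, not a model of any real circuit.
    """
    acts = dict(scores)
    steps = 0
    while steps < max_steps and sum(1 for a in acts.values() if a > 0) > 1:
        total = sum(acts.values())
        # Each candidate is inhibited by the summed activity of its rivals.
        acts = {k: max(0.0, a - inhibition * (total - a)) for k, a in acts.items()}
        norm = sum(acts.values())
        if norm == 0:
            # Everything was suppressed at once; stop without a clear winner.
            break
        acts = {k: a / norm for k, a in acts.items()}
        steps += 1
    return max(acts, key=acts.get), steps


# Slow path: a feedforward pass leaves "yarn" and "yam" both active,
# and the competition needs several rounds before one remains.
winner, steps = settle({"yarn": 0.55, "yam": 0.45})

# Fast path: a previously stored answer is a single dictionary read.
lookup_table = {"familiar input": "yarn"}
fast_answer = lookup_table["familiar input"]
```

The point of the contrast is only that the iterative path costs multiple rounds per input, while the table read costs one step but knows nothing beyond what was stored.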

"I have seen evidence this is the case but also that the context actually comes through the climbing fibers and training (shoulda) signal through the mossy/parallel fibers. Eg here for eyeblink operant conditioning https://www.cs.cmu.edu/afs/cs/academic/class/15883-f17/readings/hesslow-2013.pdf"

That paper says "it has been the dominant working assumption in the field that the CS is transmitted to the cerebellar cortex via the mossy fibres (mf) and parallel fibres (pf) whereas information about the US is provided by climbing fibres (cf) originating in the inferior olive", which is what I said (US=shoulda, CS=context). Where are you disagreeing here? Or do they contradict that later in the paper?

"How does the neocortex carry the 'shoulda' signal?"

The neocortex is a fancy algorithm that understands the world and takes intelligent actions. It's not perfect, but it's pretty great! So whatever the neocortex does, that's kinda a "ground truth" for the question of "What is the right thing to do right now?"—at least, it's a ground truth from the perspective of the much stupider cerebellum. So my proposal is that whatever the neocortex is outputting, that's a "shoulda" signal that the cerebellum wants to imitate.
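As a rough sketch of that proposal, assuming a simple delta-rule (LMS) learner, which is a stand-in and not a claim about what the cerebellum actually computes: a slow "teacher" function plays the role of the neocortex, its output is used as the shoulda signal, and a linear learner is nudged toward it. The names `slow_teacher` and `train_imitator`, the linear form, and the learning rate are all hypothetical.

```python
def slow_teacher(context):
    # Hypothetical stand-in for whatever the neocortex decides to output.
    return 2.0 * context[0] - 1.0 * context[1]


def train_imitator(samples, n_inputs, lr=0.1, epochs=200):
    """Delta-rule learner: move the weights toward the teacher's output."""
    weights = [0.0] * n_inputs
    for _ in range(epochs):
        for context, shoulda in samples:
            output = sum(w * x for w, x in zip(weights, context))
            error = shoulda - output  # mismatch between output and shoulda
            for i, x in enumerate(context):
                weights[i] += lr * error * x
    return weights


contexts = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.25)]
samples = [(c, slow_teacher(c)) for c in contexts]
weights = train_imitator(samples, n_inputs=2)
# After training, the weights approximate the teacher's mapping,
# so the learner can reproduce its outputs without consulting it.
```

The learner never sees a reward here; the only training signal is the teacher's own output, which is what makes this supervised learning instead of RL.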

"This suggests that the neocortex can learn the cerebellar mapping and short-circuit to use it? Why does it need to go through the cerebellum to do this? Rather than via the motor cortex and efferent connections back to the muscles?"

I'm not sure I understand your question. The neocortex outputs don't need to go through the cerebellum. People can be born without a cerebellum entirely, and they turn out OK. But since the cerebellum is like a super-fast memoizer / lookup table, I think the neocortex can work better by passing signals through the cerebellum.

Anyway, that was all just casual speculation, I don't know how the motor cortex, midbrain, cerebellum, and outgoing nerves are wired together. (I'm very interested to learn, just haven't gotten around to it.)

Hmm, this page suggests that there are both motor-pathways-through-the-cerebellum and motor-pathways-bypassing-the-cerebellum. But that's not a reliable source—that page seems to be riddled with errors. So I dunno.

Gordon Seidoh Worley (5y)

Your toy models drew a parallel for me to modern CPU architectures. That is, doing computation the "complete" way involves loading things from memory, doing math, writing to memory, and then that memory might affect later instructions. CPUs have all kinds of tricks to get around this to go faster, and it's sort of like your models of brain parts, only with a reversed etiology, since the ACU came first whereas the neocortex came last, as I understand it.

Steven Byrnes (5y)

Interesting! I'm thinking of CPU branch predictors, are you? (Are there other examples? Don't know much about CPUs.) If so, that did seem like a suggestive analogy to what I was calling "the accelerator".

Not sure about etiology. How different is a neocortex from the pallium in a bird or lizard? I'm inclined to say "only superficially different", although I don't think it's known for sure. But if so, then there's a version of it even in lampreys, if memory serves. I don't know the evolutionary history of the cerebellum, or of the cerebellum-pallium loops. It might be in this paper by Cisek which I read but haven't fully processed / internalized.

Gordon Seidoh Worley (5y)

Branch predictors for sure, but modern CPUs also do other things. They manage multiple layers of cache using relatively simple algorithms that nonetheless get high hit rates in practice. They convert instructions into microcode: small, simple instructions execute faster, but CPUs need to do a lot of things, so the tradeoff is to have the CPU translate instructions in real time into simpler instructions sent to specialized processing units inside the CPU. And maybe even optimistic branch execution, where instructions in the pipeline are partially executed provisionally before the branch is confirmed. All of these seem like tricks of the sort I wouldn't be surprised to find parallels to in the brain.

Rafael Harth (5y)

Clarifying question: how are the outputs of the supervised learning algorithm used (other than in model #6)?

Steven Byrnes (5y)

I think often they go out to the body to control muscles, and in #6 they feed information into the neocortex, and in the amygdala case they also release cortisol and/or other hormones. I'm not intimately familiar with neuroanatomy, there could be additional complications I don't know about. Does that answer your question?

Rafael Harth (5y)

I think so. I was imagining an additional mechanism where the outputs compete with other parts of the brain for the final say on what your muscles are doing. If they control muscles directly, that would mean 'I' can't choose not to flinch if the supervised learning algorithm says I should (right?) -- which I guess does actually align with experience.

Steven Byrnes (5y)

There's almost definitely competition between different systems trying to control muscles. The basal ganglia has at least something to do with judging the competition and picking winners. The neocortex competes with other parts of the neocortex, and against the amygdala and maybe brainstem, I dunno. But I'm not sure that we should think of the cerebellum as a competitor. I think of the cerebellum as a personal assistant for the competitors. So if it thinks that the neocortex is going to place a bid to move a muscle, it preemptively places that bid itself, and if the neocortex later says something different, the cerebellum says "oops, my bad, I'll forward that bid instead, and try to get it right next time".
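The "personal assistant" picture can be sketched as a small memoizer. This is a minimal illustration under made-up assumptions: the `Accelerator` class, its method names, and the string-valued contexts and bids are hypothetical, and the judging of competing bids (the basal ganglia part) is left out entirely.

```python
class Accelerator:
    """Toy memoizer that guesses the neocortex's bid before it arrives."""

    def __init__(self):
        self.table = {}  # context -> last bid the neocortex made there

    def predict(self, context):
        # Early guess; None means no bid is placed preemptively.
        return self.table.get(context)

    def correct(self, context, neocortex_bid):
        # "Oops, my bad": forward the real bid and remember it for next time.
        self.table[context] = neocortex_bid
        return neocortex_bid


def run_step(acc, context, neocortex_bid):
    early_bid = acc.predict(context)                  # placed before the neocortex is done
    final_bid = acc.correct(context, neocortex_bid)   # neocortex has the last word
    return early_bid, final_bid


acc = Accelerator()
first = run_step(acc, "ball incoming", "raise hand")   # nothing learned yet
second = run_step(acc, "ball incoming", "raise hand")  # early bid now matches
changed = run_step(acc, "ball incoming", "duck")       # neocortex changes its mind
```

In this sketch the accelerator never overrides the neocortex; it only gets the same bid out earlier when its guess is right, which is why it is not treated as a competitor.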

So maybe I implied here that the cerebellum is responsible for flinching. That's kinda true, but after writing this I read that really the amygdala knows that flinching is the right thing to do, and then the cerebellum learns to make the flinch happen more quickly. (Not 100% sure on that.)