(Half-baked work-in-progress. There might be a “version 2” of this post at some point, with fewer mistakes, and more neuroscience details, and nice illustrations and pedagogy etc. But it’s fun to chat and see if anyone has thoughts.)

1. Background

There’s a neuroscience problem that’s had me stumped since almost the very beginning of when I became interested in neuroscience at all (as a lens into AGI safety) back in 2019. But I think I might finally have “a foot in the door” towards a solution!

What is this problem? As described in my post Symbol Grounding and Human Social Instincts, I believe the following:

  • (1) We can divide the brain into a “Learning Subsystem” (cortex, striatum, amygdala, cerebellum and a few other areas) on the one hand, and a “Steering Subsystem” (mostly hypothalamus and brainstem) on the other hand; and a human’s “innate drives” (roughly equivalent to the reward function in reinforcement learning) are calculated by a bunch of specific, genetically-specified “business logic” housed in the latter subsystem;
  • (2) Some of those “innate drives” are related to human social instincts—a suite of reactions that are upstream of things like envy and compassion;
  • (3) It might be helpful for AGI safety (for reasons briefly summarized here) if we understood exactly how those particular drives worked. Ideally this would look like legible pseudocode that’s simultaneously compatible with behavioral observations (including everyday experience), with evolutionary considerations, and with a neuroscience-based story of how that pseudocode is actually implemented by neurons in the brain. (Different example of what I think it looks like to make progress towards that kind of pseudocode.)
  • (4) Explaining how those innate drives work is tricky in part because of the “symbol grounding problem”, but it probably centrally involves “transient empathetic simulations” (see §13.5 of the post linked at the top);
  • (5) …and therefore there needs to be some mechanism in the brain by which the “Steering Subsystem” (hypothalamus & brainstem) can tell whether the “Learning Subsystem” (cortex etc.) world-model is being queried for the purpose of a “transient empathetic simulation”, or whether that same world-model is instead being queried for some other purpose, like recalling a memory, considering a possible plan, or perceiving what’s happening right now.

As an example of (5), if Zoe is yelling at me, then when I look at Zoe, a thought might flash across my mind, for a fraction of a second, wherein I mentally simulate Zoe’s angry feelings. Alternatively, I might imagine myself potentially feeling angry in the future. Both of those possible thoughts involve my cortex sending a weak but legible-to-the-brainstem (“grounded”) anger-related signal to the hypothalamus and brainstem (mainly via the amygdala) (I claim). But the hypothalamus and brainstem have presumably evolved to trigger different reactions in those two cases, because the former but not the latter calls for a specific social reaction to Zoe’s anger. For example, in the former case, maybe Zoe’s anger would trigger in me a reaction to feel anger back at Zoe in turn (although not necessarily, because there are other inputs to the calculation as well). So I think there has to be some mechanism by which the hypothalamus and/or brainstem can figure out whether or not a (transient) empathetic simulation was upstream of those anger-related signals. And I don’t know what that mechanism is.

I came into those five beliefs above rather quickly—the first time I mentioned that I was confused about how (5) works, it was way back in my second-ever neuroscience blog post, maybe within the first 50 hours of my trying to teach myself neuroscience as an adult. I’ve remained confused about (5) ever since, and have brought up that fact probably dozens of times in my writings since then.

Well, 4½ years later, I finally think I have my first decently-plausible hypothesis for how (5) might work! It’s vague, and probably somewhat confused, and it might well be wrong, but after so long I’m very excited to grasp any straw that feels like progress.

2. New stuff

2.1 Ingredient 1: “Spatial attention”

2.1.1 What is “spatial attention” intuitively?

In the visual world, you can be looking at something in the room, and then your spatial attention is at the location of that thing. This often involves moving your eyes (saccading) to be centered at that thing, but not necessarily—“covert attention” is the term for paying attention to a part of visual space that you’re not looking directly at.

But it’s not just visual: even with your eyes closed, spatial attention is a thing. When you pick up a cup, I claim that you’ll find it a very challenging task unless your spatial attention is centered around the cup’s physical location.[1] When you feel an itch, you have an intuitive sense of where that itch is located in 3D space, and when you’re thinking about that itch, your spatial attention is at its location. When you hear a sudden loud sound, part of the associated innate “orienting reflex” involves a shift of your spatial attention towards the location of that sound. If I'm listening to someone behind me talk, then my spatial attention is sitting somewhere behind my back, as it is when I “have a feeling that someone is behind me” and am paying attention to that feeling.

2.1.2 How does spatial attention work in the brain?

My claim is: The brainstem’s genetically-hardcoded “business logic” includes a “center of spatial attention” parameter that can dart around local 3D space.

Some reasons I believe this include:

  • The brainstem superior and inferior colliculus, among many other things, execute the “orienting reflex” mentioned above—the one where you turn your eyes and head towards a bright light, sound, looming threat, etc. This seems 2D rather than 3D (i.e., direction but not distance), but I think innate orient / startle reflexes are generally at least somewhat sensitive to distance as well as direction—if a sudden noise is close by, I think you’ll orient to it in 3D space, which might involve e.g. flinching your head back a bit if something is very close, as well as preemptively starting to adjust your eye vergence and accommodation even as you rotate your head.[2] 
  • Relatedly, the so-called “hand blink reflex” is a reflex where stimulating a hand nerve can cause a blink reaction, but it only does so when the hand happens to be positioned close to the face in 3D space. This reflex is supposedly mediated by the brainstem,[3] which suggests that the brainstem is always tracking where all parts of the body are in 3D space, and shifting spatial attention to the location of any salient tactile sensation (which then feeds into flinches, orienting, and other reactions).
  • Innate fear-of-heights: If I’m correct that the brainstem has a system involving an “attention” center that can dart around local 3D space, then we have a straightforward story for how fear-of-heights works—the “business logic” would be: “IF spatial attention is currently located a great distance almost directly below me, THEN (1) raise your heart rate, (2) emit a tingly sensation, (3) etc.” Whereas if the brainstem does not have such a 3D spatial attention system, then I’m not sure how else fear-of-heights could realistically work.[4]
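To make the fear-of-heights rule in the last bullet concrete, here is a minimal sketch of what that “business logic” might look like if written out. Everything specific in it is an assumption made for illustration: the body-centered coordinate frame, the names `Point3D` and `fear_of_heights_triggered`, and the threshold values (5 m drop, 30° cone) are all invented, and the real circuit (if it exists at all) need not resemble this.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point3D:
    """Hypothetical body-centered position in meters; z points up."""
    x: float
    y: float
    z: float


def fear_of_heights_triggered(attention: Point3D,
                              min_drop_m: float = 5.0,
                              max_angle_deg: float = 30.0) -> bool:
    """True if spatial attention sits a great distance almost directly below.

    Both thresholds are made-up illustrative numbers, not measured values.
    """
    drop = -attention.z  # how far below the body the attended point is
    if drop < min_drop_m:
        return False
    horizontal = math.hypot(attention.x, attention.y)
    # Angle between "straight down" and the direction to the attended point.
    angle_from_straight_down = math.degrees(math.atan2(horizontal, drop))
    return angle_from_straight_down <= max_angle_deg


def brainstem_reactions(attention: Point3D) -> list:
    """IF attention is far below me THEN emit the innate reactions."""
    if fear_of_heights_triggered(attention):
        return ["raise_heart_rate", "emit_tingly_sensation"]
    return []
```

Note that the rule only reads the center of spatial attention, not the retinal image, so in this sketch it would fire equally well when attention is placed on the drop without seeing it.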

Anyway, I don’t understand all the details of this system (if I'm right that it exists at all), but the basic idea that “3D center of spatial attention darting around my local surrounding space” is a “brainstem primitive”—a genetically-hardwired parameter built into the brainstem’s “business logic”—seems to me like a really solid working hypothesis.

If the brainstem is constantly maintaining a center of 3D spatial attention, it can also send its present value up to the cortex as a sensory input (discussion of what I mean by that)—and hence we have conscious access to spatial attention. Likewise, the cortex can send output / motor signals back down to the brainstem that manipulate 3D spatial attention[5]—and hence we have (imperfect) conscious control over spatial attention. I say “imperfect” because innate reactions can override conscious control—for example, a sufficiently loud noise or painful sensation can sometimes “override” the cortex’s suggestion about where spatial attention should sit.

2.2 Empathetic simulation version 1

Finally, we get to transient empathetic simulations.

If we momentarily pay attention to something about our own feelings, consciousness, and state of mind, then (I claim) our spatial attention is at that moment centered somewhere in our own bodies—more specifically, in modern western culture, it’s very often the head, but different cultures vary. Actually, that’s a sufficiently interesting topic that I’ll go on a tangent: here’s an excerpt from the book Impro by Keith Johnstone:

The placing of the personality in a particular part of the body is cultural. Most Europeans place themselves in the head, because they have been taught that they are the brain. In reality of course the brain can’t feel the concave of the skull, and if we believed with Lucretius that the brain was an organ for cooling the blood, we would place ourselves somewhere else. The Greeks and Romans were in the chest, the Japanese a hand’s breadth below the navel, Witla Indians in the whole body, and even outside it. We only imagine ourselves as ‘somewhere’.

Meditation teachers in the East have asked their students to practise placing the mind in different parts of the body, or in the Universe, as a means of inducing trance.… Michael Chekhov, a distinguished acting teacher…suggested that students should practise moving the mind around as an aid to character work. He suggested that they should invent ‘imaginary bodies’ and operate them from ‘imaginary centres’…

Johnstone continues from here, discussing at length how moving the implicit spatial location of introspection seems to go along with rebooting the personality and sense-of-self. Is there a connection to the space-referenced implementation of innate social drives that I’m hypothesizing in this post? I’m not sure—food for thought. Also possibly related: Julian Jaynes’s Origin of Consciousness in the Breakdown of the Bicameral Mind, and the phenomenon of hallucinated voices.

…But anyway, back to my story.

As I was saying, if we momentarily pay attention to something about our own feelings, consciousness, and state of mind, then (I claim) our spatial attention is at that moment centered somewhere in our own bodies. By contrast, if we momentarily pay attention to someone else being angry, then our spatial attention is centered somewhere in that other person’s body.

…And that’s my first hypothesis for how (5) works above! During a transient empathetic simulation, the brainstem is getting a signal from the cortex about where to place spatial attention right now, and that location is not where my own head or body is (which the brainstem knows), but rather somewhere else. So the brainstem theoretically has all the information it needs to deduce that this thought must be a transient empathetic simulation, and not a transient self-focused memory, prediction, perception, counterfactual musing, etc.
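As a rough sketch of the “version 1” test described above: the brainstem compares where the cortex is asking spatial attention to go against where the body is, and treats anything clearly outside the body as a candidate empathetic simulation. The function name, the tuple-of-meters representation, and the 0.5 m body radius are hypothetical choices for illustration only.

```python
import math


def is_empathetic_simulation(commanded_attention: tuple,
                             body_center: tuple = (0.0, 0.0, 0.0),
                             body_radius_m: float = 0.5) -> bool:
    """Version-1 heuristic: attention commanded outside my own body.

    commanded_attention: 3D location (meters) the cortex is requesting.
    body_center, body_radius_m: crude stand-in for "where my body is",
    which the brainstem is assumed to track already.
    """
    return math.dist(commanded_attention, body_center) > body_radius_m


def classify_thought(commanded_attention: tuple) -> str:
    """Label a thought by where its spatial attention is centered."""
    if is_empathetic_simulation(commanded_attention):
        return "empathetic_simulation"
    return "self_focused"
```

For example, `classify_thought((0.0, 0.1, 0.0))` gives `"self_focused"`, while `classify_thought((2.0, 0.0, 0.0))` gives `"empathetic_simulation"`. This version obviously cannot tell a person from a coffee cup at the same location, which is the gap the next section tries to fill.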

2.3 Ingredient 2: A brainstem “is-a-person” flag

I can do a little better than that, I think, in order to explain how social instincts seem to depend on what I think about the person. For example, if I do a transient empathetic simulation of someone who is angry, it matters whether I feel affection towards the person, versus hatred, versus condescension, versus not caring one way or the other about them, etc. I think I need one more ingredient for that.

I think the brainstem probably has a bunch of heuristics that indicate that, at some 3D spatial location, there is a person. Specifically, I know that the brainstem has a human-face-detector (see e.g. Morton & Johnson 1991), and I strongly presume that it also has a human-voice detector, and probably also a detector of specific kinds of human-like motion (gait), and so on. None of these detectors are particularly reliable—they depend on evolved heuristics—and indeed we can easily get an eerie intuitive sense that mannequins and animatronic puppets are in the “people” category. Anyway, whenever those heuristics trigger, I claim that the brainstem (A) centers spatial attention around that apparent-person and (B) raises a flag (i.e. a signal in a specific genetically-defined set of neurons, and no I don’t currently know which neurons in particular), which I’ll call the “is-a-person flag”.

I think that brainstem is-a-person flag, just like any brainstem signal, can serve as supervised-learning ground truth that trains up a corresponding predictor (“thought assessor”) attached to the cortex world-model (details here), and eventually this predictor will “tell the brainstem” whenever the cortex is thinking about another person, even if that person is not right there in the room.
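Here is a toy sketch of that supervised-learning step, under heavy simplifying assumptions. I model the “thought assessor” as a tiny online logistic-regression unit whose input is a made-up feature vector standing in for the cortex’s current thought, and whose training target is the brainstem’s is-a-person flag. The class name `ThoughtAssessor`, the two-feature encoding, and the learning rate are all hypothetical; nothing here is a claim about the actual learning rule.

```python
import math


class ThoughtAssessor:
    """Toy predictor of the brainstem's is-a-person flag (illustrative only)."""

    def __init__(self, n_features: int, learning_rate: float = 0.5):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, thought: list) -> float:
        """Predicted probability that the is-a-person flag is (or would be) up."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, thought))
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, thought: list, flag: int) -> None:
        """One supervised step; `flag` is the brainstem's ground truth (0 or 1)."""
        error = flag - self.predict(thought)
        self.bias += self.learning_rate * error
        self.weights = [w + self.learning_rate * error * x
                        for w, x in zip(self.weights, thought)]


# Hypothetical thought features: [zoe_concept_active, chair_concept_active]
assessor = ThoughtAssessor(n_features=2)
for _ in range(200):
    assessor.update([1.0, 0.0], flag=1)  # Zoe in view: face detector fires
    assessor.update([0.0, 1.0], flag=0)  # looking at a chair: no flag

# Later, merely thinking about Zoe (no face in view) still predicts the flag.
zoe_prediction = assessor.predict([1.0, 0.0])
chair_prediction = assessor.predict([0.0, 1.0])
```

After training, `zoe_prediction` is close to 1 and `chair_prediction` is close to 0, which is the sense in which the predictor can “tell the brainstem” that the cortex is thinking about a person who is not in the room.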

2.4 Empathetic simulation version 2

Now, with the help of the “is-a-person flag”, we seem to have an even better story. It’s possible for me to think two transient thoughts consecutively, i.e. within a fraction of a second:

  • One thought is a (transient) empathetic simulation of how (I think) Zoe feels.
  • The other thought is not an empathetic simulation, but rather just that I’m thinking about Zoe from my own perspective.

(To make it harder on myself, I’m assuming that Zoe is not here in the room with me; instead I’m imagining a possible future interaction with her.)

In this case, the brainstem has all the information it needs to figure out what’s going on. First, these two consecutive thoughts command the same location of spatial attention, and this location is away from my body. (Maybe I’m imagining her vaguely in front of me, or something? Not sure.)  Second, the second thought but not the first is associated with the “is-a-person flag”, so the brainstem knows which is which. Third, the transient empathetic simulation sends signals to the brainstem related to how Zoe feels. Fourth, the other (non-empathetic-simulation) thought sends signals to the brainstem related to how I feel about Zoe (love, hate, respect, apathy, etc.).
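The four-part bookkeeping above can be sketched as follows. This is only an illustration of the information flow, not a proposed circuit: the `Thought` and `SocialReading` types, the string-valued signals, and the distance tolerances are all invented for the example, and the sketch says nothing about which reaction the brainstem then triggers.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Thought:
    """What the brainstem is assumed to receive for one transient thought."""
    attention: tuple        # commanded 3D location (meters, body-centered)
    is_person_flag: bool    # thought-assessor prediction of is-a-person
    signal: str             # grounded feeling-related signal, e.g. "anger"


@dataclass(frozen=True)
class SocialReading:
    their_feeling: str      # from the empathetic simulation
    my_attitude: str        # from my own-perspective thought about them


def interpret_pair(first: Thought, second: Thought,
                   body_radius_m: float = 0.5,
                   same_place_tol_m: float = 0.3) -> Optional[SocialReading]:
    """Pair two consecutive thoughts into a social reading, if they qualify."""
    origin = (0.0, 0.0, 0.0)
    # (1) Both thoughts command the same location, away from my body.
    away = all(math.dist(t.attention, origin) > body_radius_m
               for t in (first, second))
    same_place = math.dist(first.attention, second.attention) <= same_place_tol_m
    if not (away and same_place):
        return None
    # (2) Exactly one of the two carries the is-a-person flag.
    if first.is_person_flag == second.is_person_flag:
        return None
    about_them, simulation = ((first, second) if first.is_person_flag
                              else (second, first))
    # (3) The simulation carries their feeling; (4) the other carries mine.
    return SocialReading(their_feeling=simulation.signal,
                         my_attitude=about_them.signal)
```

For instance, pairing `Thought((1.5, 0.0, 0.0), False, "anger")` with `Thought((1.5, 0.1, 0.0), True, "affection")` yields `SocialReading(their_feeling="anger", my_attitude="affection")`, whereas two thoughts centered on my own body yield `None`.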

So that’s a sketchy outline of a hypothesis! I have a lot of work to do to sort out the details though, if this is even true.

(Thanks Seth Herd for critical comments on a draft.)
 

  1. ^

     I think this claim is true, but I expect readers to disagree. I claim that it’s not obvious by introspection, because your spatial attention can jump around multiple times per second, and it’s possible to pick up the cup by course-correcting the motion during sporadic moments of appropriate spatial attention. So I think you can easily pick up the cup while mostly keeping your spatial attention somewhere that is not the cup, and you might not be aware of the sporadic moments where your spatial attention transiently shifts to the location of the cup. (I could be wrong and am not sure how to settle that question.)

  2. ^

    The superior colliculus is famously associated with retinotopic maps in its various layers, so I’m kinda confused how distance is encoded, if I’m right that it’s really 3D and not 2D.

  3. ^

    Specifically, this source suggests the “reticular formation”, which doesn’t narrow it down much. I’m not sure which part—still need to read more.

  4. ^

    Hmm, another possibility is that fear-of-heights is really fear-of-falling, built by a learning algorithm whose ground-truth is the particular sensations of freefall-then-hitting-the-ground. But I think people can be afraid of heights without past experience of falling—more specifically, I think people are more afraid of falling 50 m than falling 1 m even if they’ve actually fallen 1 m many times but never fell 50 m, and that’s the opposite of what that learning algorithm would do, presumably. Also, I recall reading that baby birds are “scared of heights” before they fledge, despite presumably never falling at all in their short nest-bound lifetime. (EDIT TO ADD: Also maybe crawling babies, h/t this comment.)

  5. ^

    For example, the frontal eye fields enable some degree of conscious control of spatial attention, via their output projection to the superior colliculus.

12 comments

We've learned a lot about the visual system by looking at ways to force it to wrong conclusions, which we call optical illusions or visual art.  Can we do a similar thing for this postulated social cognition system?  For example, how do actors get us to have social feelings toward people who don't really exist?  And what rules do movie directors follow to keep us from getting confused by cuts from one camera angle to another?

Tangentially related: some advanced meditators report that their sense that perception has a center vanishes at a certain point along the meditative path, and this is associated with a reduction in suffering.

You write:

…But I think people can be afraid of heights without past experience of falling…

I have seen it claimed that crawling-age babies are afraid of heights, in that they will not crawl from a solid floor to a glass platform over a yawning gulf.  And they’ve never fallen into a yawning gulf.  At that age, probably all the heights they’ve fallen from have been harmless, since the typical baby is both bouncy and close to the ground.

Whereas if the brainstem does not have such a 3D spatial attention system, then I’m not sure how else fear-of-heights could realistically work

I think part of the trigger is from the visual balance center.  The eyes sense small changes in parallax as the head moves relative to nearby objects.  If much of the visual field is at great distance (especially below, where the parallax signals are usually strongest and most reliable), then the visual balance center gets confused and starts disagreeing with the other balance senses.

If I’m looking up at the clouds, or at a distant mountain range, then everything is far away (the ground could be cut off from my field-of-view)—but it doesn’t trigger the sensations of fear-of-heights, right? Also, I think blind people can be scared of heights?

Another possible fear-of-heights story just occurred to me—I added to the post in a footnote, along with why I don’t believe it.

The vestibular system can detect whether you look up or down. It could be that the reflex triggers when you a) look down (vestibular system) and b) have a visual parallax that indicates depth (visual system).

Should be easy to test by closing one eye. Alternatively, it is the degree of accommodation of the lens. That should be testable by looking down with a lens that forces accommodation on short distances.

The negative should also be testable by asking congenitally blind people about their experience with this feeling of dizziness close to a rim.

I think I would feel characteristic innate-fear-of-heights sensations (fear + tingly sensation for me, YMMV) if I were standing on an opaque bridge over a chasm, especially if the wood is cracking and about to break. Or if I were near the edge of a roof with no railings, but couldn’t actually see down.

Neither of these claims is straightforward rock-solid proof that the thing you said is wrong, because there’s a possible elaboration of what you said that starts with “looking down” as ground truth and then generalizes that ground truth via pattern-matching / learning algorithm—but I still think that elaborated story doesn’t hang together when you work through it in detail, and that my “innate ‘center of spatial attention’ constantly darting around local 3D space” story is much better.

There are likely multiple detectors of risk of falling. Being on shaky ground is for sure one. In amusement parks, there are sometimes thingies that shake and wobble and can also give this kind of feeling. Also, it could be a learned reaction (a prediction by the thought assessor), as you mention too.

Are there any disorders impairing spatial attention that you think would also impair empathy? I asked GPT-4 for disorders of spatial attention and it gave me hemispatial neglect and Balint's syndrome. If things were really convenient with hemispatial neglect, I can imagine that people always think of some of their thoughts and feelings as being on the “left” side. Then they would have difficulty having those feelings once they have trouble attending to anything on the left side. For a cliché example, someone associating their love with their heart on the left side. (Maybe that's a bad example. Perhaps better would be something where someone would have trouble telling if something was their own or another person's thought or feeling.)

I appreciate the brainstorming prompt but I can’t come up with anything useful here. The things you mention are related to cortex lesions, which would presumably leave the brainstem spatial attention system intact. (Brainstem damage is more rare and often lethal.) The stuff you say about neglect is fun to think about but I can’t see situations where there would be specifically-social consequences, in a way that sheds light on what’s happening.

There might be something to the fact that the temporoparietal junction (TPJ) seems to include areas related to spatial attention, and is also somehow involved in theory-of-mind tasks. I’ve been looking into that recently—in fact, that’s part of the story of how I came to write this post. I still don’t fully understand the TPJ though.

Hmm, there do exist lesion studies related to theory-of-mind, e.g. this one—I guess I should read them.

If we momentarily pay attention to something about our own feelings, consciousness, and state of mind, then (I claim) our spatial attention is at that moment centered somewhere in our own bodies—more specifically, in modern western culture, it’s very often the head, but different cultures vary. Actually, that’s a sufficiently interesting topic that I’ll go on a tangent: here’s an excerpt from the book Impro by Keith Johnstone:

The placing of the personality in a particular part of the body is cultural. Most Europeans place themselves in the head, because they have been taught that they are the brain. In reality of course the brain can’t feel the concave of the skull, and if we believed with Lucretius that the brain was an organ for cooling the blood, we would place ourselves somewhere else. The Greeks and Romans were in the chest, the Japanese a hand’s breadth below the navel, Witla Indians in the whole body, and even outside it. We only imagine ourselves as ‘somewhere’.

Meditation teachers in the East have asked their students to practise placing the mind in different parts of the body, or in the Universe, as a means of inducing trance.… Michael Chekhov, a distinguished acting teacher…suggested that students should practise moving the mind around as an aid to character work. He suggested that they should invent ‘imaginary bodies’ and operate them from ‘imaginary centres’…

Johnstone continues from here, discussing at length how moving the implicit spatial location of introspection seems to go along with rebooting the personality and sense-of-self. Is there a connection to the space-referenced implementation of innate social drives that I’m hypothesizing in this post? I’m not sure—food for thought. Also possibly related: Julian Jaynes’s Origin of Consciousness in the Breakdown of the Bicameral Mind, and the phenomenon of hallucinated voices.

@WhatsTrueKittycat Potentially useful cogtech for both meditation and mental-proscenium-training.

If step 5 is indeed grounded in the spatial attention being on other people, this should be testable! For example, people who pay less spatial attention to other people should feel less intense social emotions, because the steering system circuit gets activated less often and more weakly. And I think that is the case. At least ChatGPT has some confirming evidence, though it's not super clear and I haven't yet looked deeper into it.