Builds upon: Consciousness and the Brain by Stanislas Dehaene. Good summaries may be found here or here, though reading them is not strictly necessary.

Synopsis: I claim to describe the exact mental structure that allows qualia.

Background: What Is Consciousness?

Dehaene's Consciousness and the Brain rigorously differentiates conscious and unconscious activity. Consciousness, the book suggests, is correlated with events where the brain gathers all of its probability distributions about the world, and samples from them to build a consistent unitary world-model, on which it then acts. The experiments show that this is necessary for multi-step calculation, abstract thinking, and reasoning over agglomerations of distant sensory inputs.

However, the book's definition of consciousness is not a synonym for self-awareness. Rather, I would term the phenomenon it picks out as "moments of agency": as the state in which the mind can devise goal-oriented plans using a well-formed world-model, and engage in proper consequentialist reasoning. Outside those moments, it's just a bundle of automatic heuristics.

Self-awareness, I suspect, is part of these moments-of-agency in humans, but isn't the same thing generally. Just having Dehaene!consciousness isn't a sufficient condition for self-awareness: there's something on top of that going on.

What Is Self-Awareness?

What do we expect to happen to an agent the moment it attains self-awareness, in the sense of perceiving itself to have qualia?

Why, it would start perceiving qualia, the keyword being perceive. It would start acting as if it receives some sort of feedback from a novel sense, not unlike sight: data about what it's like to be a thing like itself.

Let's suppose that it's nothing magical — that there isn't a species of etheric parasites which attach themselves to any sufficiently advanced engine of cognition and start making it hallucinate. Neither are qualia "emergent" — as if, if you formally wrote out an algorithm for general reasoning, that algorithm would spontaneously rewrite itself to be having these imaginary experiences. If self-awareness is as mundane as any other sense, then what internal mechanism would we expect to correspond to it?

When we "see", what happens is: some sort of ground-truth data enter a specialized sensory organ, that organ transmits the information to the brain, the brain parses it, and offers it to our conscious inspection[1], so we may account for it in planning.

If we view qualia as sense-data, it follows that they'd be processed along a similar pathway.

  • What are the ground-truth data corresponding to qualia? The current internal state of this agent. The inputs it's processing, the configuration its world-model is in, its working-memory cache, the setting it's operating in, the suite of currently active processes...
  • What is the specialized sensory organ, and how does it communicate with the brain? Despite how it may seem, we do need one. The ground-truth state of the brain isn't by default "known" by the brain itself; it's just in that state. A specialized mechanism needs to know how to summarize raw brain-states into reports, then pool them together with other information about the world.
  • How are the qualia-data interpreted? Much like visual information, they're parsed as the snapshot of all information received by the sensory organ at a particular moment. A self-model; a summary of what it's like to be you, perceiving what you perceive and feeling what you feel.
    • (In theory, that should cause infinite recursion. A faithful self-model also has a self-model: you can consider what it's like to be someone who experiences being a someone. But seeing as we're bounded agents, I assume that algorithm is lazy.)
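A minimal sketch of that pathway, under my own assumptions: the class names, the fields of the summary, and the `levels_built` counter are all hypothetical, chosen only to make the information flow visible. The agent's raw state is summarized by an "organ" (`introspect`), pooled with the world-model, and handed to the planner; the nested self-model is a thunk that is only evaluated when something asks for it, which is one way the recursion could be lazy.

```python
from typing import Callable, Dict


class SelfModel:
    """Snapshot of what it's like to be this agent at one moment."""

    def __init__(self, summary: Dict[str, object], deeper: Callable[[], "SelfModel"]):
        self.summary = summary
        self._deeper = deeper  # thunk: the nested self-model is built only on request

    def self_model(self) -> "SelfModel":
        """What it's like to be someone who has this self-model (one level down)."""
        return self._deeper()


class Agent:
    def __init__(self) -> None:
        # Ground-truth internal state: the agent is simply *in* this state.
        self.inputs = {"vision": "red apple"}
        self.working_memory = ["find food"]
        self.active_processes = ["foraging"]
        self.levels_built = 0  # counts how many self-models were actually constructed

    def introspect(self, depth: int = 0) -> SelfModel:
        """The hypothetical 'sensory organ': summarize the raw state into a report."""
        self.levels_built += 1
        summary = {
            "depth": depth,
            "perceiving": dict(self.inputs),
            "holding_in_mind": list(self.working_memory),
            "doing": list(self.active_processes),
        }
        return SelfModel(summary, deeper=lambda: self.introspect(depth + 1))

    def planner_input(self) -> Dict[str, object]:
        """Pool the self-report with other information about the world."""
        world_model = {"apple": "within reach"}
        return {"world": world_model, "self": self.introspect()}


agent = Agent()
pooled = agent.planner_input()
print(agent.levels_built)  # only the top-level self-model has been built so far
deeper = pooled["self"].self_model()  # the planner asks for one more level
print(deeper.summary["depth"], agent.levels_built)
```

Nothing here says how a real brain would do the summarizing; the only point is that the report is a separate object that has to be constructed and routed, and that deeper levels cost nothing until they're requested.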

Let's make the distinction sharper. An agent gets hurt, information about that travels to the brain, where it's interpreted as "pain". Pain has the following effects:

  1. It updates the inner planner away from plans that cause the agent harm, like an NN getting high loss.
  2. It changes the current plan-making regime: the planner is incentivized to make plans in a hurried manner, and to consider more extreme options, so as to get out of the dangerous situation faster.

Which parts of that correspond to pain-qualia?


Pain-qualia are not pain inflicting changes upon a planner: by themselves, these changes are introduced outside the planner's purview. Consider taking an inert neural network, manually rewriting its weights, then running it. It couldn't have possibly "felt" that change, except by magic; it was simply changed. For a change to be felt, you need an outer loop that'd record your actions, then present the records to the NN.

That's what pain-qualia are: summaries of the effects pain has upon the planner that are fed as input to that very planner.
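To make the distinction concrete, here is a toy sketch. Everything in it is an assumption for illustration: the `Planner` fields, the penalty size, and the wording of the report are made up. `apply_pain` changes the planner from the outside (the two effects listed above), and the planner has no record of it. Only when an outer loop diffs the before and after states and feeds that summary back in does the planner have anything to "feel".

```python
import copy
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Planner:
    plan_scores: Dict[str, float]
    hurried: bool = False
    observations: List[str] = field(default_factory=list)  # what the planner can inspect

    def perceive(self, report: List[str]) -> None:
        """Receive a self-report as ordinary input."""
        self.observations.extend(report)


def apply_pain(planner: Planner, harmful_plan: str, penalty: float = 1.0) -> None:
    """Pain acting on the planner: it is simply changed, with no record kept."""
    planner.plan_scores[harmful_plan] -= penalty  # effect 1: update away from harm
    planner.hurried = True                        # effect 2: change the planning regime


def summarize_changes(before: Planner, after: Planner) -> List[str]:
    """The outer loop: compare snapshots and describe what pain did."""
    report = []
    for plan, old in before.plan_scores.items():
        new = after.plan_scores[plan]
        if new != old:
            report.append(f"score of '{plan}' changed by {new - old:+.1f}")
    if after.hurried and not before.hurried:
        report.append("planning is now hurried")
    return report


planner = Planner({"touch stove": 0.5, "step back": 0.2})
snapshot = copy.deepcopy(planner)

apply_pain(planner, "touch stove")
print(planner.observations)  # still empty: changed, but nothing was felt

planner.perceive(summarize_changes(snapshot, planner))
print(planner.observations)
```

In this toy version the report is a list of strings; the format is beside the point. What matters is the extra loop that turns "the planner was modified" into "the planner was told it was modified".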

That leaves one question:

How Is It Useful?

Well, self-awareness evolved to be offered as an input to the planner-part, so it must be used by the planner-part somehow. The obvious answer seems correct here: meta-planning.

First, self-awareness allows an agent to account for altered states of consciousness. If it knows it's deliriously happy, or sad, or drugged, it'll know that it's biased towards certain actions or plans over others, and that these situational biases may not be desirable. So it'll know to correct its behavior to suit. (Note that mere awareness of an altered state is insufficient: feedback needs to be detailed enough to allow that course-correction.)

Second, it allows the agent to predict its future plan-making instances in detail: what, given a plan, its future selves would want to do at various stages of that plan, and how capable they'll be of doing it.

Concretely, self-awareness is what allows you to:

  • Know not to lash out while angry, even if it feels right and sensible in the moment, because you know your plan-making process is compromised.
  • Know that a plan which hinges on your ability to solve highly challenging mathematical problems while getting your arm chopped off is a doomed one.
  • Know not to commit to an exciting-seeming project too soon, because you know from past experience that your interest will wane.

To be clear, those are just examples; what we want is the ability to display such meta-planning universally. We can imagine an animal that instinctively shies away from certain drastic actions while angry, but the actual requirement is the ability to do that in off-distribution contexts in a zero-shot regime.
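As a rough illustration of the anger case, assume the self-report arrives as an estimate of how much the current state inflates the appeal of each plan. That format, the numbers, and the function names are all hypothetical. A planner without the report takes felt appeal at face value; a planner with it can subtract the reported distortion before choosing.

```python
from typing import Dict


def choose_plan(felt_appeal: Dict[str, float]) -> str:
    """A planner with no self-report: whatever feels best right now wins."""
    return max(felt_appeal, key=felt_appeal.get)


def choose_plan_with_self_report(
    felt_appeal: Dict[str, float], self_report: Dict[str, float]
) -> str:
    """A planner that corrects felt appeal by the distortion its self-model reports."""
    corrected = {
        plan: appeal - self_report.get(plan, 0.0)
        for plan, appeal in felt_appeal.items()
    }
    return max(corrected, key=corrected.get)


# While angry, lashing out *feels* like the better option.
felt_appeal = {"lash out": 0.4, "walk away": 0.1}
# Hypothetical self-report: "I'm angry, and that inflates 'lash out' by about 0.6".
self_report = {"lash out": 0.6}

print(choose_plan(felt_appeal))
print(choose_plan_with_self_report(felt_appeal, self_report))
```

The correction only works because the report is detailed enough to say which plans are distorted and by how much; a bare "you are angry" flag would not be enough. And a fixed numeric report like this one is still closer to the instinctive version than to the general, zero-shot ability.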

On that note... I'll abstain from strong statements on whether various animals actually have self-models complex enough to be morally relevant. I suspect, however, that almost no-one's planning algorithms are advanced enough to make good use of qualia — and evolution would not grant them senses they can't use. In particular, this capability implies high trust placed by evolution in the planner-part: that sometimes it may know better than the built-in instincts, and should have the ability to plan around them.

But I'm pushing back against this sort of argument. As I've described, a mind in pain does not necessarily experience that pain. The capacity to have qualia of pain corresponds to a specific mental process where the effect of pain on the agent is picked up by a specialized "sensory apparatus" and re-fed as input to the planning module within that agent. This, on a very concrete level, is what having internal experience means. Just track the information flows!

And it's entirely possible for a mind to simply lack that sensory apparatus.

As such, in terms of empirical tests for sentience, the thing to look for isn't whether something looks like it experiences emotions. It's whether, while plan-making, that agent can reason about its own behavior in different emotional states.


1. Cogito, Ergo Sum. It's easy to formalize. As per Dehaene, all of the inferences the brain makes about the outside world are probabilistic. When presented to the planner, they would be appropriately tagged with their probability estimates. The one exception would be information about the brain's own continued functioning: it would be tagged "confidence 1". After all, the only way for the self-awareness mechanism to become compromised involves severe damage to the brain's internals, which is probably fatal. So evolution never had cause to program us to doubt it.

And that's why we go around slipping into solipsism.
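One way to see why "confidence 1" would be so sticky, assuming the planner updates its beliefs by something like Bayes' rule (that is my framing for this sketch, not a claim from the book): a belief held at probability 1 cannot be moved by any evidence, however strongly that evidence points the other way. The function name and the numbers are made up for illustration.

```python
def bayes_update(prior: float, likelihood_if_true: float, likelihood_if_false: float) -> float:
    """Posterior probability of a hypothesis after one observation.

    Assumes the observation has nonzero probability overall; otherwise
    the division below is undefined.
    """
    numerator = prior * likelihood_if_true
    evidence = numerator + (1.0 - prior) * likelihood_if_false
    return numerator / evidence


# Evidence that is 99 times more likely if the hypothesis is false.
ordinary_belief = bayes_update(prior=0.9, likelihood_if_true=0.01, likelihood_if_false=0.99)
i_exist = bayes_update(prior=1.0, likelihood_if_true=0.01, likelihood_if_false=0.99)

print(ordinary_belief)  # drops well below the 0.9 prior
print(i_exist)          # stays at 1.0
```

With a prior of 1, the "hypothesis is false" term is multiplied by zero, so it drops out of the update entirely.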

2. "Here's a simple program that tracks its own state. Is it sentient?" No. It needs to be an agent.

3. The Hard Problem of Consciousness. None of the above seems to address the real question: why does self-awareness seem so... metaphysically different from the rest of the universe? Or, phrased more tractably: "Why does a mind that implements the self-awareness mechanism start viewing self-aware processes as being qualitatively different, compared to other matter, and get so confused about it?"

I'm afraid I don't have a complete answer to that, as I'm having some trouble staring at the thing myself. I feel confident, though, that whatever it is, it wouldn't invalidate anything I wrote above.[2] I suspect it's a combination of two things:

  • Cogito, ergo sum. Our existence feels qualitatively different because it's the only thing to which the mind assigns absolute confidence.
  • A quirk of our conceptual vocabulary. It's not that self-aware things have some metaphysically special component, it's that there's an irreducible concept native to our minds that sharply differentiates them, so things we imagine as sentient feel qualitatively different to us.

And this "qualia" concept has some really confusing properties. It's basically defined by "this thing has first-person experiences, just like me". Yet we can imagine rocks to have qualia, and rocks definitely don't share any of our algorithmic machinery, especially the self-awareness machinery. At the same time, we can imagine entities that share all of our algorithms — p-zombies — which somehow lack qualia.

Why is that so? What is the purpose of this distinct "qualia" concept? Why are we allowed to use it so incoherently?

I'm not sure, but I suspect that it's a short-hand for "has inherent moral relevance". It's not tied to "is self-aware" because evolution wanted us to be able to dehumanize criminals and members of competing tribes: view them as beasts, soulless barbarians. So the concept is decoupled from its definition, which means we can imagine incoherent states where things that have what we define as "qualia" don't have qualia, and vice versa.

It's not a particularly satisfying answer, granted. But what's the alternative here? If this magical-feeling "first-person perspective" can act on the world, it has to be implemented within the world. Self-awareness as I described it seems to suffice to describe all characteristics of having qualia, and the purpose of our capability for self-awareness; it just fails to validate our feeling that qualia are metaphysically significant. But if we can't accept "they're not metaphysically significant, here's how they're implemented and why they falsely feel metaphysically significant", then we've decided to reject any answer except "here's a new metaphysics, turns out there really are etheric parasites making us hallucinate!".

Maybe my specific answer, "qualia-having is a signifier for moral relevance", isn't exactly right; and indeed, it doesn't quite feel right to me. But whatever the real answer is, I expect it to look very similar, and to similarly dismiss the "hard problem" as lacking substance.

4. Moral Implications. If you accept the above, then the whole "qualia" debacle is just a massive red herring caused by the idiosyncrasies of our mental architecture. What does that imply for ethics?

Well, that's simple: we just have to re-connect the free-floating "qualia" concept with the definition of qualia. We value things that have first-person experiences similar to ours. Hence, we have to isolate the algorithms that allow things to have first-person experiences like ours, then assign things like that moral relevance, and dismiss the moral relevance of everything else.

And with there not being some additional "magical fluid" that can confer moral relevance to a bundle of matter, we can rest assured there won't be any shocking twists where puddles turn out to have been important this entire time.

  1. ^

    In the Dehaene sense: i.e., the data are pooled together and consolidated into a world-model which is fed as input to the planning algorithm.

  2. ^

    Unless it really is etheric parasites or something. I mean, it might be!

9 comments

It seems to be taken for granted here that self-awareness=qualia. If something is self-aware and talking or thinking about how it has qualia, that sure is evidence of it having qualia, but I'm not sure the reverse direction holds. What about internal-state-tracking is necessary for creating the mysterious redness of red exactly, or the hurt-iness of pain? 

I can see how pain as defined above the spoiler section doesn't necessarily lead to pain qualia, and in many simple architectures obviously doesn't, but I don't see how processing a summary of pain effects on the network does lead to it. Say the summary is a single bit that's either 0, "no pain right now", or 1, "pain active". What makes that bit feel hurty to the network, instead of, say, looking red, or smelling sweet, or any other qualia? I don't feel any more able to answer these questions after adding the hypothesis "self-awareness necessary" to my model of the situation.

My mind sure agrees that it's kind of suspicious how these mysterious qualia thingies only ever seem to exert a direct influence on the world when agents engage in modelling and introspection about them, and maybe that's hinting that self-awareness is, or causes, qualia somehow. But I've never gotten further than this vague intuition in justifying or modelling the connection.

Great questions!

What about internal-state-tracking is necessary for creating the mysterious redness of red exactly, or the hurt-iness of pain?

Well, as you note, the only time we notice these things is when we self-model, and they otherwise have no causal effect on reality; a mind that doesn't self-reflect is not affected by them. So... that can only mean they only exist when we self-reflect.

Say the summary is a single bit that's either 0, "no pain right now", or 1, "pain active". What makes that bit feel hurty to the network, instead of, say, looking red, or smelling sweet, or any other qualia?

Mm, the summary-interpretation mechanism? Imagine if instead of an eye, you had a binary input, and the brain was hard-wired to parse "0" from this input as a dog picture, and "1" as a cat picture. So you perceive 1, the signal travels to the brain, enters the pre-processing machinery, that machinery retrieves the cat picture, and shoves it into the visual input of your planner-part, claiming it's what the binary organ perceives.

Similarly, the binary pain channel you're describing would retrieve some hard-coded idea of how "I'm in pain" is meant to feel, convert it into a format the planner can parse, put it into some specialized input channel, and the planner would make decisions based on that. This would, of course, not be the rich and varied and context-dependent sense of pain we have — it would be, well, binary, always feeling the same.

I want to respond to this, but I'm someone who cares intensely about animal rights and thus I have to work around my instinctive reaction of blind rage at someone who implies they might not be conscious, and try to communicate in a reasonable, respectful way.

My best counterargument to all this is the simple fact that humans have dreams. No one is self-aware in a sufficiently nonlucid dream. I have never once that I know of made a clear plan in a dream - I only ever react to what is happening around me. Nonetheless I experience qualia.

Also I wildly disagree with you that consciousness and self-awareness have anything to do with one another. The latter is an accident of evolution that humans have found quite useful, but it's not intrinsically special or necessary. In fact I don't even think self-awareness is morally relevant - only consciousness in general is - and I think that panpsychism is true - that is, qualia are universal and exist in every physical system which is doing any kind of computation, but they have degrees of complexity that differ, and ours happen to be rather complex.

All of these are my opinions though, and they are partly emotionally motivated; I'm not sure how to actually provide any kind of counterargument, other than the Copernican principle: anything that implies humans are radically different from other animals in a qualitative way, rather than merely having crossed a quantitative threshold over which some abilities manifest that were only latent before, is frankly absurd.

Note, I'm not saying your analysis here doesn't have value - it's a great explanation of the kinds of qualia only self-aware beings have - but not qualia in themselves. I do not have to be able to perceive my own existence in order to perceive colors, sounds, and sensations, and people who are asleep or partly so, or otherwise have their self-awareness heavily modified, nonetheless experience qualia.

No one is self-aware in a sufficiently nonlucid dream.

No one is self-aware in a sufficiently nonlucid X, no matter what X is.

I want to respond to this, but I'm someone who cares intensely about animal rights and thus I have to work around my instinctive reaction of blind rage at someone who implies they might not be conscious, and try to communicate in a reasonable, respectful way.

I appreciate it.

No one is self-aware in a sufficiently nonlucid dream

And when a person closes their eyes, they stop being able to see. Their inner planner starts to ignore visual input, or might not make active plans at all if you're just taking a moment to rest. Does that mean that the machinery for vision disappears?

Dreams, or moments of intense pain as the post I'd linked to describes, are just the analogue of that for qualia. The self-awareness mechanism keeps generating self-reports and keeps storing them in memory, but the planner simply doesn't use them. But when you wake up, and turn your attention to your dream-memories, they're there for you to look at.

I think that panpsychism is true - that is, qualia are universal and exist in every physical system which is doing any kind of computation

But what does that mean? When you imagine a non-self-aware entity perceiving colors, what do you imagine happening in its mind, algorithmically?

Dreams don't suddenly become real only after you wake up. That's utterly absurd. I know for a fact I am conscious while I am dreaming - it's not just a bunch of memories I can only access later. I'm frankly flabbergasted that you believe otherwise.

When I imagine a non-self-aware entity perceiving colors, I imagine that the part of their brain involved in sensory processing is... doing the sensory processing! There is something which it is like to have a temporal lobe analyzing data filtered through the retina, regardless of what else is in your brain! You can't just be doing complex computations like that without experiencing them.

And by the way, when I close my eyes, I can still see. There might not be much to look at, but my visual processing still works. That part of your response was badly worded.

Your "planner" seems like a confused, pseudo-physicalized version of a soul to me. You haven't actually solved the problem of how qualia arise in the first place, you've just shoved it into a corner of the brain and claimed to dissolve the problem. Okay, so suppose you're right and only the planner part of the brain has qualia, and qualia exist only because of it - but what are they? We both know there is, in fact, a difference between qualia and basically everything else in the universe. You can't solve the hard problem by saying "aha, the brain doesn't do it, a specific part of the brain does it!"

Meanwhile, if you believe as I do that qualia are just what computations of all kinds "look like from the inside", the problem actually does dissolve. If you haven't looked into integrated information theory, I would suggest you do so: I think they are somehow incorrect as their theory makes a few unintuitive predictions, but it's the closest to what I think of as a reasonable theory of consciousness that I've ever seen.

I'd be interested to hear how this compares with Wolfgang Schwarz's ideas in 'Imaginary Foundations' and 'From Sensor Variables to Phenomenal Facts'. Sounds like there's some overlap, and Schwarz has a kind of explanation for why the hard problem might arise that you might be able to draw on. 

Link to the second of the papers mentioned:

There's some overlap indeed. In particular, it definitely aligns with my model of why the redness of red is "flavoured" as the redness of red and not as burning your fingers (because it's routed through the associations in our world-model). But it seems to jump too quickly to "having senses + WM implies qualia"; I think the self-awareness loop I'd described is still necessary on top of that.

No. It needs to be an agent.

Why? Also everything is an agent that successfully optimizes itself to be in the state it currently is. And what counts for an outer loop is arbitrary in the same way.