Thoughts on AGI consciousness / sentience

[-]Charlie Steiner3y2110

I'll chime in with the same thing I always say: the label "consciousness" is a convenient lumping-together of a bunch of different properties that co-occur in humans, but don't have to all go together a priori.

For example, the way I can do reasoning via subverbal arguments that I can later remember and elaborate if I attend to them. Or what it's like to remember things via associations and some vague indications of how long ago they were, rather than e.g. a discrete system of tags and timestamps. My sense of aversion to noxious stimuli, which can in turn be broken down into many subroutines.

Future AIs are simply going to have some of these properties, to some extent, but not all of them. Asking whether that's consciousness is somewhat useless - what matters is how much of a moral patient we want to treat them as, based on a more fine-grained consideration of their properties.

[-]jacob_cannell3y91

I agree with the general points and especially "It should be obvious by now that AGI will necessarily be brain-like (in the same ways that DL is brain-like but a bit moreso), and necessarily conscious in human-like ways, as that is simply what intelligence demands".

However, there is much interesting grey area when we start comparing the consciousness of humans with specific types of brain damage to current large transformer AI.

Transformers (even in their more agentic forms) are missing many components of the brain, but their largest deficit is the lack of strong recurrence and medium term memory capacity. Transformer LLMs like GPT3 have an equivalent conscious experience of essentially waking up from scratch, reading around a thousand tokens (in parallel), thinking about those for just a few hundred steps (equivalent to a dozen seconds or so of human thought), and then reset/repeat. They have reasonably extensive short term memory (attention), and very long term memory (standard weights), but not much in between, and they completely lack brain style RNN full recurrence.

In DL terminology brains are massive RNNs with a full spectrum of memory (roughly 10GB equiv in activations, and then a truly massive capacity of 10TB equiv or more in synapses that covers a full spectrum of timescales).

But some humans do have impairments to their medium term memory systems that are perhaps more comparable to LLM transformers - humans with missing/damaged hippocampus/EC regions like the infamous HM. Still conscious, but not in the same way.

[-]Mitchell_Porter3y70

Seems like consciousness is rather similar to "the Global Neuronal Workspace", and qualia are rather similar to "what's currently in the Workspace". Is there a reason to reject this way of thinking?

[-]Steven Byrnes3y100

Hmm, my point here was that “your introspective model of your own consciousness” is more-or-less the same as “your internal model of your Global Neuronal Workspace”.

But I’m very hesitant to take the seemingly-obvious next step and say “therefore, your own consciousness is your Global Neuronal Workspace”.

The thing is, an internal model of X doesn’t have to have much in common with X:

In the moving-Mario optical illusion, there’s an internal model in which Mario is moving, but that’s not a veridical reflection of the thing that it’s nominally modeling—Mario is not in fact moving.
Another example (I think from Graziano’s book) is that we have an internal model of “pure whiteness”, but that’s not a veridical reflection of the thing that it’s nominally modeling, because actual white light is a mix of different colors, not “pure”.

I think the consciousness case is an extreme case of that. Out of the various things that people say when describing their own phenomenal consciousness, I think only a very small fraction could be taken to be a veridical description of aspects of their Global Neuronal Workspace.

And when you have features of an internal model of X that are not veridical reflections of features of the actual X, we call that an “illusion”.

(Another thing is: I also think that different people in different cultures can have rather different internal models of their Global Neuronal Workspace, cf. Buddhists rejecting “self” and Julian Jaynes claiming a massive cultural shift in self-models around 1500-500BC.)

[-]TAG3y0-3

Saying that qualia aren't veridical representations of the properties of external objects doesn't make them nonveridical in the sense of a hallucination... or nonexistent.

Saying that qualia aren't veridical representations of the brain doesn't make them nonexistent either.

In fact, both claims strengthen the case for qualia. The first claim is a rejection of naive realism, and naive realists don't need qualia.

[-]TAG3y00

It seems access consciousness is almost tautologically accounted for by global workspace, and other aspects or meanings of consciousness aren't addressed by it at all.

[-]TAG3y60

I feel like I already understand, reasonably well, the chain of causation in my brain that leads to me saying the thing in the previous paragraph, i.e. “I’m conscious right now, let me describe the qualia…” See my Book Review: Rethinking Consciousness.

You only have evidence that you understand a chain of causation. You don't have evidence that no alternative account is possible.

…And it turns out that there is nothing whatsoever in that chain of causation that looks like what we intuitively expect consciousness and qualia to look like.

If you look at a brain from the outside, its qualia aren't visible. Equally, if you look at your brain from the inside, you see nothing but qualia...you do not see neural activity as such.

And your internal view of causality is that your pains cause you ouches.

Therefore, I need to conclude that either consciousness and qualia don’t exist, or that consciousness and qualia exist, but that they are not the ontologically fundamental parts of reality that they intuitively seem to be.

I don't think qualia seem to be fundamental.

As I understand it, here I’m endorsing the “illusionism” perspective, as advocated (for example) by Keith Frankish, Dan Dennett, and Michael Graziano.

Illusionism is the claim that qualia don't exist at all, not the claim they are merely non-fundamental. An emergentist could agree that they are non-fundamental.

Next, if a computer chip is running similar algorithms as a human philosopher, expressing a similar chain of causation, that leads to that chip emitting similar descriptions of consciousness and qualia as human philosophers emit, for similar underlying reasons, then I think we have to say that whatever consciousness and qualia are (if anything), this computer chip has those things just as much as the human does

That isn't illusionism. The most an illusionist would say is that a computer would be subject to the same illusions/delusions.
You have bypassed the possibility that what causes qualia to emerge is not computation, but the concrete physics of the brain... something that can only be captured by a physical description.

(Side note: Transformer-based self-supervised language models like GPT-3 can emit human-sounding descriptions of consciousness, but (I claim) they emit those descriptions for very different underlying reasons than brains do—i.e., as a result of a very different chain of causation / algorithm

Different chain of physical causation, or different algorithm? It's quite possible for the same algorithm to be implemented in physically different ways... and it's quite possible for emergent consciousness to supervene on physics.

However, nihilism is not decision-relevant

Nihilism about what, and why? I don't think you have a theory that consciousness doesn't exist or that qualia don't exist. And even if you did, I don't see how it implies the nonexistence of values, or preferences, or selves, or purposes... or whatever else it takes to undermine decision theory.

When I do that, I wind up feeling pretty strongly that if an AGI can describe joy and suffering in a human-like way, thanks to human-like underlying algorithmic processes, then I ought to care about that AGI’s well-being.

Because they have the qualia, or because qualia don't matter?

Because if the agent has to (meta)learn better and better strategies for things like brainstorming and learning and planning and understanding, I think this process entails the kind of self-reflection which comprises full-fledged self-aware human-like consciousness.

Meaning that qualia aren't even one component of human consciousness? Or one possible meaning of "consciousness"?

So I don’t even think the AGI would be in a gray area—I think it would be indisputably conscious, conscious according to any reasonable definition

Illusionists don't think humans are conscious, for some definitions of consciousness.

[-]Steven Byrnes3y30

Thanks!

Illusionism is the claim that qualia don't exist at all, not the claim they are merely non-fundamental. An emergentist could agree that they are non-fundamental.

I’m unclear on this part. It seems like maybe just terminology to me. Suppose

Alice says “Qualia are an illusion, they don’t exist”,
Bob says “Qualia are an illusion. And they exist. They exist as an illusion.”

…I’m not sure Alice and Bob are actually disagreeing about anything of substance here, and my vague impression is that you can find self-described illusionists on both sides of that (non?)-dispute. For example, Frankish uses Alice-type descriptions, whereas Dennett and Graziano use Bob-type descriptions, I think.

Analogy: in the moving-Mario optical illusion, Alice would say “moving-Mario does not exist”, and Bob would say “there is an illusion (mental model) of moving-Mario, and it’s in your brain, and that illusion definitely exists, how else could I be talking about it?”

And if you’re on the Bob side of the dispute here, that would seem to me to be a form of emergentism, right??

You have bypassed the possibility that what causes qualia to emerge is the concrete physics of the brain, something that can only be captured by a physical description.

I don’t think I understand this part. According to the possibility that you have in mind, does the computer chip emit similar descriptions of consciousness and qualia as the human philosopher? Or not?

And then follow-up questions:

If yes, then do you agree that (on this possibility) actual consciousness and qualia are not involved in the chain of causation in your brain that leads to your describing your own consciousness and qualia? After all, presumably the chain of causation is the same in the computer chip, right?
If no, then does this possibility require that it’s fundamentally impossible to simulate a brain on a computer, such that the simulation and the actual brain emit the same outputs in the same situations?

[-]TAG3y*4-5

Therefore, I need to conclude that either consciousness and qualia don’t exist, or that consciousness and qualia exist, but that they are not the ontologically fundamental parts of reality that they intuitively seem to be.

Illusionism is the claim that qualia don’t exist at all, not the claim they are merely non-fundamental. An emergentist could agree that they are non-fundamental

I’m unclear on this part. It seems like maybe just terminology to me

I don't think so because "fundamental" and "illusory" are not obvious antonyms.

Alice says “Qualia are an illusion, they don’t exist”,

Bob says “Qualia are an illusion. And they exist. They exist as an illusion.”

The Bob version runs into a basic problem with illusionism, which is that it is self-contradictory: an illusion is a false appearance, a false appearance is an appearance, and an appearance is a quale.

The Bob version could be rectified as

Charlie says “Qualia are a delusion. People have a false belief that they have them, but don't have them.”

And some illusionists believe that, but don't call it delusionism.

[Edit: I think the Charlie claim is Dennett's position.]

[Edit: I think I understand your position much better after having read your reply to Mitchell. Must exist, since neither brain states nor perceived objects have their properties, but only in a virtual sense...? ]

And if you’re on the Bob side of the dispute here, that would seem to me to be a form of emergentism, right??

Only Bob's (or Rob's) self-defeating form of illusionism. Basically, illusionists are trying to deny qualia, and if they let them in by the back door, that's probably a mistake. Also, they don't believe in the full panoply of qualia anyway, only the one responsible for the illusion.

I don’t think I understand this part. According to the possibility that you have in mind, does the computer chip emit similar descriptions of consciousness and qualia as the human philosopher

I'm taking that as true by hypothesis.

If yes, then do you agree that (on this possibility) actual consciousness and qualia are not involved in the chain of causation in your brain that leads to your describing your own consciousness and qualia? After all, presumably the chain of causation is the same in the computer chip, right?

The chain of causation is definitely different because silicon isn't protoplasm. By hypothesis, the computation is the same, but computation isn't causation. Computation is essentially a lossy, high level description of the physical behaviour.

If no, then does this possibility require that it’s fundamentally impossible to simulate a brain on a computer, such that the simulation and the actual brain emit the same outputs in the same situations

No, but that says nothing about qualia. It's possible for qualia to depend on some aspect of the physics that isn't captured by the computational description... which means that out of two systems running the same algorithm on different hardware, one could have qualia, but the other not. The other is a kind of zombie, but not a p-zombie because of the physical difference.

And since that is true, the GAZP is false.

[-]Nathan Helm-Burger3y10

I strongly disagree with "computation is a lossy high-level description". For what we're talking about, I think computation is a lossless description. I believe the thing we are calling 'qualia' is equivalent to a python function written on a computer. It is not a 'real' function on the computer it was written on and a 'zombie' function when run on a different computer. If the computation is exactly the same, the underlying physical process that produced it is irrelevant. It is the same function.

[-]TAG3y-30

Computation in general is a lossy high level description, but not invariably.

For what we’re talking about, I think computation is a lossless description.

And what we are talking about is the computational theory of consciousness

If the computational theory of consciousness is correct, then computation is a lossless description.

But that doesn't prove anything relevant, because it doesn't show that computational theory is actually or necessarily correct. It is possibly wrong, so computational zombies are still possible.

I believe the thing we are calling ‘qualia’ is equivalent to a python function written on a computer

Can you state the function?

[-]Richard_Kennaway3y*61

I feel like I already understand, reasonably well, the chain of causation in my brain that leads to me saying the thing in the previous paragraph

The feeling of understanding, and actual understanding, are very different things. Astrology will give people a feeling of understanding. Popsci books give people a feeling of understanding. Repeating the teacher's password gives a feeling of understanding. Exclaiming "Neurons!" gives people a feeling of understanding. Stories of all sorts give people a feeling of understanding.

One of the signs of real understanding is doing real things with it. If I can build a house that stays up and doesn't leak, I have some understanding of how to build a house. If I can develop a piece of software that performs some practical task, then I have some understanding of software development. If I can help people live better and more fulfilled lives, I have some understanding of people.

Therefore, I need to conclude that either consciousness and qualia don’t exist, or that consciousness and qualia exist, but that they are not the ontologically fundamental parts of reality that they intuitively seem to be.

Or the real explanation is something we have not even thought of yet.

As I understand it, here I’m endorsing the “illusionism” perspective

I don't see how you make the jump from "not ontologically fundamental" to "illusion". For that matter, it's not clear to me what you count as being ontologically fundamental or why it matters.

[-]Noosphere893y*3-1

I don't see how you make the jump from "not ontologically fundamental" to "illusion". For that matter, it's not clear to me what you count as being ontologically fundamental or why it matters.

An ontologically fundamental property is a property that is fundamental to every other property. It also can't be reduced to any other property. A great example is the superforce proposed in Theories of Everything, which would symmetry-break into the 4 known fundamental forces: Weak and Strong Nuclear forces, Gravity, and Electromagnetism.

BTW, my credences in the general theories of consciousness are as follows:

Ontologically fundamental consciousness is less than 1%.

Non-ontologically fundamental consciousness is around 10-20% credence.

And the idea that consciousness is an illusion is probably 80-90% in my opinion.

[-]Richard_Kennaway3y70

I put illusionism at effectively 0 (i.e. small enough to ignore in all decision-making). Ontological fundamentality, as you describe it, is something that one could only judge in hindsight, after finding a testable and tested Theory of Everything Including Consciousness. We don't yet have even a testable and tested Theory of Everything Excluding Consciousness.

[-]Steven Byrnes3y144

The Standard Model of Particle Physics plus perturbative general relativity (I wish it was better-known and had a catchier name) appears sufficient to explain everything that happens in the solar system, and has been extremely rigorously tested. It can’t explain everything that happens in the universe—in particular, it can’t make any predictions about microscopic black holes or the big bang, unfortunately. All signs point to some version of string theory eventually filling in those gaps as a true Theory of Everything, although of course one can’t be certain until the physicists actually find the right vacuum for our universe and do all the calculations etc.

I have very high confidence that, when that process is complete, and we understand the fundamental laws of the universe, the laws which hold everywhere with no exceptions, we will have learned nothing whatsoever new or helpful about consciousness. I think fundamental physics is just not going to help us here :)

[Sorry if I’m misunderstanding your point.]

[-]Richard_Kennaway3y83

I completely agree with that. So far we only have speculations towards a TOE (excluding consciousness), and when we have one, there will still be all of the way to go to explain consciousness.

[-]Steven Byrnes3y10

Can you say more about the “non-ontologically fundamental consciousness” that you like? Or provide a link to something I could read?

[-]deepthoughtlife3y41

Honestly Illusionism is just really hard to take seriously. Whatever consciousness is, I have better evidence it exists than anything else since it is the only thing I actually experience directly. I should pretend it isn't real...why exactly? Am I talking to slightly defective P-zombies?

If the computer emitted it for the same reasons... is a clear example of the begging-the-question fallacy. If a computer claimed to be conscious because it was conscious, then it logically has to be conscious, but that is the possible dispute in the first place. If you claim consciousness isn't real, then obviously computers can't be conscious. Note that you aren't talking about real illusionism if you don't think we are p-zombies. Only the first of the two possibilities you mentioned is Illusionism, if I recall correctly.

You seem like one of the many people trying to systematize things they don't really understand. It's an understandable impulse, but it leads to an illusion of understanding (which is the only thing that leads to a systemization like Illusionism; it seems like frustrated people claiming there is nothing to see here).
If you want a systemization of consciousness that doesn't claim things it doesn't know, then assume consciousness is the self-reflective and experiential part of the mind that controls and directs large parts of the overall mind. There is no need to state what causes it.

If a machine fails to be self-reflective or experiential then it clearly isn't conscious. It seems pretty clear that modern AI is neither. It probably fails the test of even being a mind in any way, but that's debatable.

Is it possible for a machine to be conscious? Who knows. I'm not going to bet against it, but current techniques seem incredibly unlikely to do it.

[-]Steven Byrnes3y-1-3

Whatever consciousness is, I have better evidence it exists than anything else since it is the only thing I actually experience directly.

In an out-of-body experience, you can “directly experience” your mind floating on the other side of the room. But your mind is not in fact floating on the other side of the room.

So what you call a “direct experience”, I call a “perception”. And perceptions can be mistaken—e.g. optical illusions.

So, write down a bulleted list of properties of your own consciousness. Every one of the items on your list is a perception that you have made about your own consciousness. How many of those bulleted items are veridical perceptions—perceiving an aspect of your own consciousness as it truly is—and how many of them are misperceptions? If you say “none is a misperception”, how do you know, and why does it differ from all other types of human perception in that respect, and how do you make sense of the fact that some people report that they were previously mistaken about properties of their own consciousness (e.g. “enlightened” Buddhists reflecting on their old beliefs)?

Or if you allow that some of the items on your bulleted list may be misperceptions, why not all of them??

It seems pretty clear that modern AI is neither

To be clear, this post is about AGI, which doesn’t exist yet, not “modern AI”, which does.

[-]TAG3y64

So what you call a “direct experience”, I call a “perception”. And perceptions can be mistaken—e.g. optical illusions.

This comment:

Whatever consciousness is, I have better evidence it exists than anything else since it is the only thing I actually experience directly.

...could have been phrased as:

Whatever consciousness is, I have better evidence it exists than anything else since it is the only thing I experience everything else with.

[-]deepthoughtlife3y21

I do agree with your rephrasing. That is exactly what I mean (though with a different emphasis).

[-]Gunnar_Zarncke3y31

Why I expect AGIs to be sentient / conscious, whether we wanted that or not

I think it could be worse than that.

As you speculate, we will earlier or later figure out how to engineer consciousness. And I mean your strong version of consciousness where people agree that the thing is conscious, e.g., not just because it responds like a conscious agent but because we can point out how it happens, observe the process, and generally see the analogy to how we do it. If we can engineer it, we can optimize it and reduce it to the minimum needed components and computational power to admit consciousness in this sense. Humans got consciousness because it was evolutionarily useful or even a side-effect. It is not the core feature of the human brain, and most of the processing and learning of the brain deals with other things. Therefore I conjecture that not that much CPU/RAM will be needed for pure consciousness. I would guess that my laptop has enough for that. What will happen if some evil actor engineers small consciousnesses into their devices or even apps? May we reformat or uninstall them?

[-]Dagon3y61

As you speculate, we will earlier or later figure out how to engineer consciousness.

I think we're much further away from this than we are from other problems with AGI. I agree that we will, at some point, be able to define consciousness in a way that will be accepted by those we currently agree are conscious (by dint of their similarity to ourselves). I don't know that it will match my or your intuitions about it today, and depending on what that agreement is, we may or may not be able to engineer it very much.

I strongly expect that as we progress in understanding, we'll decide that it's not sacred, and it's OK to create and destroy some consciousnesses for the convenience of others. Heck, we don't spend very much of our personal energy in preventing death of distant human strangers, though we try not to be personally directly responsible for deaths. I'm certainly not going to worry about reformatting a device that has a tiny consciousness any more than I worry about killing an ant colony that's too close to my house. I may or may not worry about longevity of a human-sized consciousness, if it's one of billions that are coming and going all the time. I have no intuitions about giant consciousnesses - maybe they're utility monsters, maybe they're just blobs of matter like the rest of us.

[-]Gunnar_Zarncke3y51

I strongly expect that as we progress in understanding, we'll decide that it's not sacred, and it's OK to create and destroy some consciousnesses for the convenience of others.

That might be an outcome. In that case, we might decide that the sacredness of life is not tied to consciousness but something else.

[-]Shiroe3y21

Creating or preventing conscious experiences from happening has a moral valence equivalent to how that conscious experience feels. I expect most "artificial" conscious experiences created by machines to be neutral with respect to the pain-pleasure axis, for the same reason that randomly generated bitmaps rarely depict anything.

[-]Steven Byrnes3y40

I expect most "artificial" conscious experiences created by machines to be neutral with respect to the pain-pleasure axis, for the same reason that randomly generated bitmaps rarely depict anything.

What if the machine is an AGI algorithm, and right now it’s autonomously inventing a new better airplane design? Would you still expect that?

[-]Shiroe3y1211

The space of possible minds/algorithms is so vast, and that problem is so open-ended, that it would be a remarkable coincidence if such an AGI had a consciousness that was anything like ours. Most details of our experience are just accidents of evolution and history.

Does an airplane have a consciousness like a bird? "Design an airplane" sounds like a more specific goal, but in the space of all possible minds/algorithms that goal's solutions are quite undetermined, just like flight.

[-]Steven Byrnes3y50

My airplane comment above was a sincere question, not a gotcha or argument or anything. I was a bit confused about what you were saying and was trying to suss it out. :) Thanks.

I do disagree with you though. Hmm, here’s an argument. Humans invented TD learning, and then it was discovered that human brains (and those of other animals) incorporate TD learning too. Similarly, self-supervised learning is widely used in both AI and human brains, as are distributed representations and numerous other things.

If our expectation is “The space of possible minds/algorithms is so vast…” then it would be a remarkable coincidence for TD learning to show up independently in brains & AI, right? How would you explain that?

I would propose instead an alternative picture, in which there are a small number of practical methods which can build intelligent systems. In that picture (which I subscribe to, more or less), we shouldn’t be too surprised if future AGI has a similar architecture to the human brain. Or in the most extreme version of that picture, we should be surprised if it doesn’t! (At least, they’d be similar in terms of how they use RL and other types of learning / inference algorithms; I don’t expect the innate drives a.k.a. reward functions to be remotely the same, at least not by default.)
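For readers who haven't met TD learning, here is a minimal sketch of the tabular TD(0) value update, the textbook form of the idea mentioned above. The function name `td0_update`, the toy two-step chain, and the values of `alpha` and `gamma` are illustrative assumptions, not anything taken from a particular brain model or AI system.

```python
def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular TD(0) update in place and return the TD error.

    The TD error is the gap between the bootstrapped target
    (reward + gamma * V(next_state)) and the current estimate V(state).
    """
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


# Hypothetical toy chain: A -> B -> end, with reward 1 on reaching "end".
values = {"A": 0.0, "B": 0.0, "end": 0.0}
for _ in range(200):
    td0_update(values, "A", 0.0, "B")
    td0_update(values, "B", 1.0, "end")

# After many passes, V(B) approaches 1 and V(A) approaches gamma * V(B).
print(values)
```

The point of the sketch is only that the update is driven by a prediction error, which is the feature usually compared with dopamine signals in brains.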

[-]Nathan Helm-Burger3y53

I agree with Steven's point about convergent results from directed design (or evolution in the case of animals). I don't agree that consciousness and moral valence are closely coupled such that it would incur a performance loss to decouple them. Therefore, I suspect it will be a nearly costless choice to make morally relevant vs irrelevant AGI, and that we very much morally ought to choose to make morally-irrelevant AGI. To do otherwise would be possible, as Gunnar describes, but morally monstrous. Unfortunately some people do morally monstrous things sometimes. I am unclear on how to prevent this particular form of monstrosity.

[-]Algon3y20

My own point of view is similar to Luke Muehlhauser's opinion in section 4.1 of "2017 Report on Consciousness and Moral Patienthood." Physicalism seems like it is true. But the global neuronal workspace + attention schema, and other physicalist models of consciousness, just don't feel like they explain enough of the phenomena, and seem so simple that were they true I'd expect consciousness to be pretty much everywhere. Like, what exactly is reflectivity? What counts as a reflective model for the purpose of generating a computation that outputs thoughts like "I have qualia"?

Though I suppose where I differ is that I've got philosophical issues with ascribing a computation to a physical process. It seems like you can ascribe many computations to e.g. a rock or a waterfall. And I don't know a philosophical principle which doesn't give you an insane answer and seems justified.

[-]Shiroe3y00

I'm not quite convinced that illusionism is decision-irrelevant in the way you propose. If it's true that there is no such thing as 1st-person experience, then such experience cannot disclose your own values to you. Instead, you must infer your values indirectly through some strictly 3rd-person process. But all external probing of this sort, because it is not 1st-person, will include some non-zero degree of uncertainty.

One paradox that this leads to is the willingness to endure vast amounts of (purportedly illusory) suffering in the hope of winning, in exchange, a very small chance of learning something new about your true values. Nihilism is no help here, because you're not a nihilist; you're an illusionist. You do believe that you have values, instantiated in 3rd-person reality.

[-]TAG3y10

Instead, you must infer your values indirectly through some strictly 3rd-person process

Or some other first person process.

[-]Shiroe3y00

Can you elaborate on what such a process would be? Under illusionism, there is no first person perspective in which values can be disclosed (namely, for hedonic utilitarianism).

[-]TAG3y10

Illusionism denies the reality of qualia, not personhood.

[-]Shiroe3y*10

Personhood is a separate concept. Animals that may lack a personal identity conception may still have first person experiences, like pain and fear. Boltzmann brains supposedly can instantiate brief moments of first person experience, but they lack personhood.

The phrase "first person" is a metaphor borrowed from the grammatical "first person" in language.
