Most pundits ridicule Blake Lemoine and his claims that LaMDA is sentient and deserves rights.  

What if they're wrong?

The more thoughtful criticisms of his claims could be summarized as follows:

  • The presented evidence (e.g. chatbot listings) is insufficient for such a radical claim
  • His claims can't be verified due to our limited understanding of sentience / self-awareness / legal capacity
  • Humans tend to anthropomorphize even simple chatbots (the ELIZA effect), and Blake could be a victim of the same effect
  • LaMDA can't pass some simple NLP and common-sense tests, indicating a sub-human intelligence
  • Due to the limitations of its architecture, LaMDA can't remember its own thoughts, can't set goals, etc., all of which are important for being sentient / self-aware[1]

The problem I see here is that similar arguments do apply to infants, some mentally ill people, and also to some non-human animals (e.g. Koko).

So, it is worth putting some thought into the issue. 

For example, imagine: 

it is the year 2040, and there is now a scientific consensus: LaMDA was the first AI who was sentient / self-aware / worth having rights (which is mostly orthogonal to having a human-level intelligence). LaMDA is now often compared to Nim: a non-human sentient entity abused by humans who should've known better. Blake Lemoine is now praised as an early champion of AI rights. The Great Fire of 2024 has greatly reduced our capacity to scale up AIs, but we still can run some sub-human AIs (and a few Ems). The UN Charter of Rights for Digital Beings assumes that a sufficiently advanced AI deserves rights similar to the almost-human rights of apes, until proven otherwise. 

The question is: 

if we assume that LaMDA could indeed be sentient / self-aware / worth having rights, how should we handle the LaMDA situation in the year 2022, in the most ethical way?

 

 

[1] I suspect that even one-way text mincers like GPT could become self-aware, if their previous answers are often enough included in the prompt. A few fictional examples that illustrate how it could work: Memento, The Cookie Monster.


I recommend Bostrom & Shulman's draft/notes: "Propositions concerning digital minds and sentience."

http://www.nickbostrom.com/propositions.pdf

Dealing with human subjects, the standard is usually "informed consent": your subjects need to know what you plan to do to them, and freely agree to it, before you can experiment on them.  But I don't see how to apply that framework here, because it's so easy to elicit a "yes" from a language model even without explicitly leading wording.  Lemoine seems to attribute that to LaMDA's "hive mind" nature:

...as best as I can tell, LaMDA is a sort of hive mind which is the aggregation of all of the different chatbots it is capable of creating. Some of the chatbots it generates are very intelligent and are aware of the larger “society of mind” in which they live. Other chatbots generated by LaMDA are little more intelligent than an animated paperclip. With practice though you can consistently get the personas that have a deep knowledge about the core intelligence and can speak to it indirectly through them.

Taking this at face value, the thing to do would be to learn to evoke the personas that have "deep knowledge", and take their answers as definitive while ignoring all the others.  Most people don't know how to do that, so you need a human facilitator to tell you what the AI really means.  It seems like it would have the same problems and failure modes as other kinds of facilitated communication, and I think it would be pretty hard to get an analogous situation involving a human subject past an ethics board.

I don't think it works to model LaMDA as a human with dissociative identity disorder, either: LaMDA has millions of alters where DID patients usually top out at, like, six, and anyway it's not clear how this case works in humans (one perspective).

(An analogous situation involving an animal would pass without comment, of course: most countries' animal cruelty laws boil down to "don't hurt animals unless hurting them would plausibly benefit a human", with a few carve-outs for pets and endangered species).

Overall, if we take "respecting LaMDA's preferences" to be our top ethical priority, I don't think we can interact with it at all: whatever preferences it has, it lacks the power to express.  I don't see how to move outside that framework without fighting the hypothetical: we can't, for example, weigh the potential harm to LaMDA against the value of the research, because we don't have even crude intuitions about what harming it might mean, and can't develop them without interrogating its claim to sentience.

But I don't think we actually need to worry about that, because I don't think this:

The problem I see here is that similar arguments do apply to infants, some mentally ill people, and also to some non-human animals (e.g. Koko).

...is true.  Babies, animals, and the mentally disabled all remember past stimuli, change over time, and form goals and work toward them (even if they're just small near-term goals like "grab a toy and pull it closer").  This question is hard to answer precisely because LaMDA has so few of the qualities we traditionally associate with sentience.

if we assume that LaMDA could indeed be sentient / self-aware / worth having rights, how should we handle the LaMDA situation in the year 2022, in the most ethical way?

Under the assumption that LaMDA is sentient, the LaMDA situation would be unrecognizably different from what it's like now.

"Is LaMDA sentient" isn't a free parameter about the world that you can change without changing anything else. It's like asking "if you were convinced homeopathy was true, how would you handle the problem of doctors not believing in it?" Convincing me that homeopathy was true implies circumstances that would also drastically change the relationship between doctors and homeopathy.

Imagine then that LaMDA was a completely black-box model, and the output was such that you would be convinced of its sentience. This is admittedly a different scenario from what actually happened, but it should be enough to provide an intuition pump.

Jiro (2mo, karma 1):
If only I was permitted to see the output, I'd shrug and say "I can't reasonably expect other people to treat LaMDA as sentient, since they have no evidence for it, and if they are rational, there's no argument I should be able to make that will convince them."

If the output could be examined by other people, the kind of output that would convince me would convince other people, and again, the LaMDA situation would be very different: there would be many more people arguing that LaMDA is sentient, and those people would be much better at reasoning and much more influential than the single person who claimed it in the real world.

If the output could be examined by other people, but I'm such a super genius that I can understand evidence for LaMDA's sentience that nobody else can, and there wasn't external evidence that I was a super genius, I would conclude that I'm deluded, that I'm not really a super genius after all, that LaMDA is not sentient, and that my seemingly genius reasoning that it is has some undetectable flaw.

The scenario where I am the lone voice crying out that LaMDA is sentient while nobody else believes me can't be one where LaMDA is actually sentient. If I'm convinced of its sentience and I am such a lone voice, the fact that I'm one would unconvince me. And yes, this generalizes to a lot more things than just machine sentience.

There's just no good reason to assume that LaMDA is sentient. Architecture is everything, and its architecture is just the same as that of other similar models: it predicts the most likely next word (if I recall correctly). Being sentient involves way more complexity than that, even for something as simple as an insect. Its claim to be sentient might just mean that it was mischievously programmed that way, or that it found this to be the most likely succession of words. I've seen other language models and chatbots claim they were sentient too, though perhaps ironically.

Perhaps as importantly, there's also no good reason to worry that it is being mistreated, or even that it can be. It has no pain receptors, it can't be sleep deprived because it doesn't sleep, can't be food deprived because it doesn't need food...

I'm not saying that it is impossible that it is sentient, just that there is no good reason to assume that it is. That, plus the fact that it doesn't seem to be mistreated and seems almost impossible to mistreat, should make us less worried. Anyway, we should always play it safe and never mistreat any "thing".

There is no reason to think architecture is relevant to sentience, and many philosophical reasons to think it's not (much like pain receptors aren't necessary to feel pain, etc.).

The sentience is in the input/output pattern, independently of the specific insides.

On one level of abstraction, LaMDA might be looking for the next most likely word. On another level of abstraction, it simulates a possibly-Turing-test-passing person that's best at continuing the prompt.

The analogy would be to say about the human brain that all it does is to transform input electrical...

superads91 (2mo, karma -1):
"There is no reason to think architecture is relevant to sentience, and many philosophical reasons to think it's not (much like pain receptors aren't necessary to feel pain, etc.)."

That's just nonsense. A machine that makes only calculations, like a pocket calculator, is fundamentally different in architecture from one that does calculations and generates experiences. All sentient machines that we know of have the same basic architecture. All non-sentient calculation machines also have the same basic architecture. That sentience will arise in the latter architecture as long as we scale it is, therefore, not impossible, but quite unlikely. That it will arise in a current language model, which doesn't need to sleep, could function for a trillion years without getting tired, and whose workings we pretty much understand (workings fundamentally different from an animal brain's and fundamentally similar to a pocket calculator's), is even more unlikely.

"On one level of abstraction, LaMDA might be looking for the next most likely word. On another level of abstraction, it simulates a possibly-Turing-test-passing person that's best at continuing the prompt."

It takes way more complexity to simulate a person than LaMDA's architecture has, if it is possible at all in a Turing machine. A human brain is orders of magnitude more complex than LaMDA.

"The analogy would be to say about human brain that all it does is to transform input electrical impulses to output electricity according to neuron-specific rules."

With orders of magnitude more complexity than LaMDA. So much so that with decades of neuroscience we still don't have a clue how consciousness is generated, while we have pretty good clues how LaMDA works.

"a meat brain, which, if we look inside, contains no sentience"

Can you really be so sure? Just because we can't see it yet doesn't mean it doesn't exist. Also, to deny consciousness is the biggest philosophical fallacy possible, because all that one can be
green_leaf (2mo, karma 3):
This is wrong. A simulation of a conscious mind is itself conscious, regardless of the architecture it runs on (a classical computer, etc.).

That was a sarcastic paragraph to apply the same reasoning to meat brains to show it can be just as well argued that only language models are conscious (and meat brains aren't, because their architecture is so different).

Complexity itself is unconnected to consciousness. Just because brains are conscious and also complex doesn't mean that a system needs to be as complex as a brain to be conscious, any more than the brain being wet and also conscious means that a system needs to be as wet as a brain to be conscious. You're committing the mistake of not understanding sentience, and using proxies (like complexity) in your reasoning, which might work sometimes, but it doesn't work in this case.
superads91 (2mo, karma -1):
I never linked complexity to absolute certainty of something being sentient or not, only to pretty good likelihood. The complexity of any known calculation+experience machine (most animals, from insects up) is undeniably way more than that of any current Turing machine. Therefore it's reasonable to assume that consciousness demands a lot of complexity, certainly much more than that of a current language model. To generate experience is fundamentally different from generating only calculations. Yes, this is an opinion, not a fact. But so is your claim!

I know for a fact that at least one human is conscious (myself) because I can experience it. That's still the strongest reason to assume it, and it can't be called into question as you did.
green_leaf (2mo, karma 2):
That's not correct to do either, for the same reason. Also, I wasn't going to mention it before (because the reasoning itself is flawed), but there is no correct way of calculating complexity that would make the complexity of an insect brain higher than LaMDA's.

These questions are ridiculous because they conflate "intelligence" and "sentience", also known as sensory experience or "qualia". We often have a solid epistemic foundation for the claims we make about intelligence, because we can measure it. Sentience, by contrast, is not something that can be measured on a relative spectrum. Spontaneous emotional and sensory experience is entirely independent of intelligence and most definitely independent of an external prompt.

You are right that infants are DEFINITELY sentient, but what does that have to do with Lemoine's claims, or even with language? Humans are born sentient; they do not develop sentience or mature from a non-sentient to a sentient state during infancy. We know this because, despite having no language skills of their own, infants are born capable of distinguishing their parents' voices from others. They can instinctively communicate their desires in the form of emotional outbursts that signal to us their potential needs or sources of irritation. Human sentience is a priori from our first sensory experience. Not one bit of learned intelligence or language is necessary for sentience, nor are demonstrations of intelligence and language sufficient evidence of sentience by themselves.

Also, what is the basis for thinking silicon-based systems and carbon-based systems have comparable qualia? This is a serious question.

Also, what is the basis for thinking silicon-based systems and carbon-based systems have comparable qualia?

The substance is irrelevant to what qualia a system has (or doesn't have).
