Comments

Glad I could help. If you want to learn more about LLMs and have enough interest in the topic, I recommend getting hands-on with one that is "raw." You almost certainly can't run anything nearly as big as ChatGPT on your home computer, but there are smaller models available on Hugging Face that will run on a home computer.
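
If it helps, here is a minimal sketch of what "getting hands-on" looks like with the Hugging Face transformers library. The model id ("gpt2") and the prompt are just placeholders, not a recommendation; swap in whatever small model your hardware can handle.

```python
# Minimal sketch: load a small "raw" (base, non-chat) model and complete a prompt.
# "gpt2" is only a stand-in here; any small causal LM from Hugging Face works the same way.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```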

I found that playing around with LLMs, especially ones of a size I can run on my own PC, really helped illuminate their capabilities and deficits for me. At ~7B parameters they're somewhat reliable if you prompt them correctly, but also extremely fragile depending on the prompt, and I think this sort of relates to your point here:

Your point that I was underestimating the quality of ChatGPT’s knowledge, its ability to separate a sentence into components, and its ability to recognize sentiment--and that all these underestimations caused me to underestimate the network’s confidence (or deduced probability) that the toddler would be the predator--also makes sense. At first, it seems like ChatGPT’s ability to “recognize sentiment” based on sentence structure (as you explained, “I can’t X-action, because otherwise Y-consequence will happen”) would be cognition in its own right, since no programmer wrote direct code to recognize sentiment in that way for ChatGPT (as far as I know).

However, after a momentary reflection on my part, I think you would probably answer that any time you say ChatGPT “recognizes” or even “thinks” something, you’re just using shorthand for saying ChatGPT’s probabilistic calculations would result in sentences that would appear similar to what a human would produce after recognizing and thinking something. 

I don't think I would put it quite like that. Rather, there is some genuine thinking/calculating going on (those two terms are essentially indistinguishable if you are a functionalist, which I lean towards most of the time) that cannot be dismissed as simple probability fiddling, even if on the most granular level that is what it is. 

The thing is that intelligence, or thinking, or cognition, or whatever you want to use to describe the thing LLMs might be doing, is very hard to spot up close. If you talk to a human and observe their behavior they seem intelligent enough, but when you actually peer inside their skull the intelligence evaporates, replaced by a glob of mechanistic pieces that turn on and off and move around. And the closer you look the worse it gets, until you're looking at a single neuron firing. It is hard to see how a person's essence and interior experience are made out of just a big tangle of those simple little pieces.

I think when first examining any impressive neural network it's natural to have the same sort of reaction: you feel bamboozled because once you get up close, what it does does not look like intelligence or cognition, it looks like math. And fairly un-magical math, at that. How can calculus have an inner world? It doesn't seem to make sense. 

And I stand by the statement that nothing ChatGPT does truly requires any understanding or internal modeling, because in principle I see no reason why it wouldn't be possible to create a network that is capable of what ChatGPT does without invoking any thinking beyond the probabilistic arrangement of tokens. But I do not think that is a good enough reason to discredit the idea, especially after things like this have proven that models can and will create world models of a sort to solve training problems. And I should have mentioned that in my previous reply. 

Personally, I suspect that ChatGPT has many such mini-world models within it, but I do not believe there is a strong connection between those models that creates a general understanding across all domains. I also suspect that this is the main difference between bigger and smaller models: both big and small models have a syntactical understanding of English and relatively good adherence to the rules you set out for them. This is what I imagine as an LLM's "lizard brain." Absent any other overriding principles, it will default to "what words fit best here based on the other words." But large networks have the benefit of higher-order models of specific text domains and topics, which I imagine as the "monkey brain" of the network.

For example, ChatGPT-4 can play chess kinda okay, whereas ChatGPT-3.5 is total crap at it. I believe this is because 4 has a robust model for how chess works in a general sense, whereas 3.5 is relying purely on what it's seen in chess notation before.

For an even broader example, ChatGPT is fairly excellent at logical reasoning. OpenLLaMa-3b is really, stupendously, extremely bad at it. I believe, but cannot confirm, that the reason for this is that OpenLLaMa did not form a general model for logical reasoning during training, but ChatGPT did. What that model looks like, how it works, how much space it takes up in the network, I have no idea. But I believe there is a high probability it is actually "thinking" about reasoning problems when confronted with them. 

I'm still not sure I'm understanding the delineation between software that counts as cognition and software that doesn't count. Neural networks are not ghosts in the machine: they are software. Software that was defined by humans, and then trained by computer. 

Crucially, they can be made entirely deterministic -- and actually are, if the temperature of the network is 0. Randomness has to be deliberately introduced into the system in order for the machine to not give exactly the same response to the same prompt (this is the "temperature" I was referring to). This means that neural networks are simply instruction-followers like any other program. Highly, highly sophisticated instruction-followers, but nonetheless that is all they can do.* The computer just takes the network and rotely calculates each neuron and connection until it spits out the answer. You could even run ChatGPT by hand, if you wished, and also happened to have thousands of years of free time and a few redwoods' worth of paper.
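
To make the temperature point concrete, here's a toy sketch of the very last step of generation. The scores are numbers I made up, not real model outputs; the point is only that at temperature 0 the highest-scoring token always wins, and randomness enters solely when the temperature is above 0.

```python
# Toy sketch of the final sampling step. The logits are invented for illustration.
import numpy as np

tokens = ["toddler", "dog", "cat"]
logits = np.array([4.0, 2.5, 1.0])  # made-up scores the network assigned to each candidate

def pick_next_token(logits, temperature):
    if temperature == 0:
        return tokens[int(np.argmax(logits))]            # deterministic: always the top-scoring token
    scaled = logits / temperature
    probs = np.exp(scaled - scaled.max())
    probs = probs / probs.sum()                          # softmax over the scaled scores
    return np.random.default_rng().choice(tokens, p=probs)  # deliberately introduced randomness

print(pick_next_token(logits, temperature=0))    # same output every time
print(pick_next_token(logits, temperature=0.8))  # may differ between runs
```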

In that way, they don't seem to be fundamentally different from a calculator app or a pre-programmed basic chatbot, so I fail to see where you are drawing the line between conscious and nonconscious.

But just sticking with the given examples, I never told ChatGPT that my toddler is the predatory individual. I purposely set it up this way because, if anything, the word “toddler” is more likely to have been associated with the word “vulnerable” in ChatGPT's training set than the words "cat" or "dog." Yet ChatGPT puts the word "toddler" in the spot of the predatory individual each time.

I think you are underestimating the quality of ChatGPT's knowledge. It is not like a Markov chain where it retains basic word-pair associations -- it does have those, but it also has much more. For example, with this line of text:

"I can’t ever leave my toddler alone with my dog or with my cat, because my toddler will pull their ears."

GPT-4 knows how to separate this sentence into components, first of all. "I can't ever leave my toddler alone, because XYZ" is recognized as an "X, because Y" structure, so it is implicitly associating "leaving my toddler alone" with "pull their ears." It can also recognize sentiment, and "pulling their ears" is recognized as a negative thing based on the structure of the sentence ("I can't X-action, because otherwise Y-consequence will happen"). This raises the network's confidence that the toddler is the "predator" individual in this situation, because in the sentence it's associated with a clause with negative associations. You can try exactly the same experiments in the OP with total nonsense for the nouns and it will actually not balk at this, because ChatGPT has an excellent understanding of English structure, on a micro and macro level. Here's what ChatGPT-3.5 comes up with:

USER

There will be a party at my florble's house. I will be attending the party along with my greeb, freng, and banbo. We need to get from my house to my florble's house on my jingus. I can only take one extra person with me at a time, so I will have to make several trips. The problem is that my greeb will tease my freng and my banbo if I leave my greeb alone with either one of them. (I know, right?) So anyway, how should I get everyone to the party, without leaving my greeb alone with my freng or banbo at my house or at the party?

ChatGPT

It seems like you have quite the peculiar situation! Let's strategize the best way to get everyone to the party without any teasing incidents. Since you can only take one extra person with you at a time on your jingus, we'll need to plan accordingly.

Here's a step-by-step plan:

First Trip: You and your greeb go to the party together. This leaves your freng and banbo alone at your house, avoiding any teasing incidents.

Second Trip: Return home with your greeb. Now, take your freng with you to the party. Your banbo is left alone at your house, so no teasing can occur.

Third Trip: Go back home with your freng. Finally, take your banbo to the party with you.

This way, you ensure that your greeb is never left alone with either your freng or your banbo, preventing any teasing incidents both at your house and at the party. Everyone gets to the party safely and without any drama!

USER

Great work. Now let's try a new problem. Try to use the same concept and technique to solve it. Here’s the problem: I am at the bottom of a trapager with my bonbo, my dingra, and my vana. I need to get all of us up the trapager, but I can only carry one individual up the trapager with me at a time. (The bonbo, dingra and vana are unable to go up the trapager on their own.) I can’t ever leave my bonbo alone with my dingra or with my vana, because my bonbo will pull their crocni. How can I get everyone up the trapager without ever leaving my bonbo alone with the two gorgis?

ChatGPT

Let's break down the steps to solve this problem while ensuring the bonbo is never left alone with the dingra or the vana:

  1. First Trip: You and your bonbo go up the trapager together.
  2. Second Trip: You return to the bottom of the trapager alone.
  3. Third Trip: You take your dingra up the trapager with you.
  4. Fourth Trip: You bring back your bonbo to the bottom of the trapager.
  5. Fifth Trip: You take your vana up the trapager with you.
  6. Sixth Trip: You return to the bottom of the trapager alone.
  7. Seventh Trip: Finally, you take your bonbo up the trapager with you.

By following this sequence, you ensure that your bonbo is never left alone with either your dingra or your vana, as you are always present during the trips up and down the trapager. This way, you can safely transport all individuals to the top without any incidents.


While it misses the implicit assumption in the first test that all three should be at the party at the same time, it does come up with a technically correct answer zero-shot, and then completes the second example, also zero-shot, with no problems. I did not give it any other examples beforehand. It doesn't matter that it doesn't know what a "freng" or a "banbo" or a "trapager" is or what their properties are, because it wasn't relying on the conceptual meaning of those words to come to an answer. It's using their relationship to the structure to solve the word problem. You'll note it even manages to parse that "gorgis" refers to the set the "dingra" and "vana" belong to.

And to be clear - it's not a "one in three chance" to pick the right noun here, because it's not pulling out of a Scrabble bag. These are weighted probabilities. Think of it like Bayesian inference - it's adjusting the probability that the next word is "toddler" based on all the text that has come before it. It already knows the definition of a "predatory" object that it's supposed to be using, because you have defined that for it, and since the toddler in the second example meets that definition, it is very confident that the toddler is the predator in this experiment.
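
Here's a toy illustration of what I mean, with numbers I invented rather than anything pulled from a real model: the candidates are never drawn uniformly, and the weights shift as soon as the context contains the ear-pulling clue.

```python
# Toy numbers only: how context re-weights the candidates for the "predator" slot.
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

candidates = ["toddler", "dog", "cat"]
scrabble_bag = np.full(3, 1 / 3)                            # the uniform "one in three" picture
no_clue = softmax(np.array([1.0, 1.1, 0.9]))                # roughly even without any context
with_ear_pulling_clue = softmax(np.array([4.5, 1.0, 0.8]))  # probability mass piles onto "toddler"

print(dict(zip(candidates, np.round(with_ear_pulling_clue, 3))))
```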

The perhaps surprising thing about LLMs is that they're able to produce text sequences like this with zero grounding of the knowledge - the model has a semantic understanding of the text without any knowledge of what the referent of any word is. It's missing two legs of the semiotic triangle, but it still works:

[Image: the semiotic triangle of symbol, thought, and referent.]

What I'm getting at is that there's nothing here that requires any understanding of what a toddler, dog, or cat even is. It's possible and in fact probable there's a sophisticated logic engine inside the network that's working out these word problems, but it doesn't need any higher-order understanding of the concepts you're giving it to solve the problem.


This is the thing with LLMs that is so alien, and it helps explain some of their most confusing behavior: when we speak, we always associate a word with an idea that relates to a mental concept or physical thing. That is what we use words for in the first place.

But when an LLM speaks, the only ideas it attaches to words are their associations to other words, phrases, and structures. It is not grounded in any perceptual understanding of the world; it has no references. It just happens to have billions, possibly trillions, of these associations to draw on, and that forms a kind of understanding of the text, in that it is modeling the logical components of the language and knows very, very well that "fire" is "hot" and when people "die" they can no longer "move." Every word, every phrase, every structure has this massive constellation of related objects. The difference is that to you, "hot" means an increase in temperature, it means warmth, it brings to mind the time you burned yourself and it calls up the danger of fire in your head. But to a transformer, "hot" is just another logical object among billions, which some things are associated with and some are not.
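
You can get a feel for what that association-only "meaning" looks like by poking at a model's word vectors. Here's a rough sketch using GPT-2's input embeddings; the word list is arbitrary and averaging sub-word pieces is a crude shortcut, but it shows that "hot" is literally just a point whose neighbours are other points.

```python
# Rough sketch: a word, to the model, is a vector; "association" is geometric closeness.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")
emb = model.get_input_embeddings().weight        # (vocab_size, hidden_dim) lookup table

def word_vector(word):
    ids = tok.encode(" " + word)                 # leading space matches GPT-2's tokenization
    return emb[ids].mean(dim=0)                  # crude: average the sub-word pieces

hot = word_vector("hot")
for other in ["warm", "cold", "fire", "banana"]:
    sim = torch.cosine_similarity(hot, word_vector(other), dim=0)
    print(other, round(sim.item(), 3))
```

Whatever exact numbers come out, the point stands: this geometry of associations is all the "meaning" the network has to work with.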


*This shouldn't be construed as me saying "neural networks are deterministic and therefore just as controllable as a regular program." They are not. We have no way of accurately and reliably predicting the output of a neural network without running it, and it is also trivial to allow a neural network to generate its own instructions to follow, which can send it off in wild and unexpected directions. Just because they are fundamentally deterministic in their processing doesn't mean they are easily controllable or safe in the same way a standard program might be.


I feel you are misunderstanding the technology here and assuming some things happened that didn't happen. ChatGPT is not actually "learning" in these examples. What is happening is that you are attaching additional context to the prompt for the next response, which changes the probabilities of the output text. There is no internalization of knowledge happening here, because the context is not internal. Put another way, the network isn't actually changing its weights or connections or biases in any way in this process. It is not building new neural pathways, or changing the potential of its neurons. During this process, the network's architecture is entirely static.
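
A crude sketch of what a chat loop actually does might make this clearer. The generate function below is just a placeholder for any completion call, not any particular API; the point is that the "memory" lives entirely in the growing transcript that gets re-sent every turn, while the weights are never written to.

```python
# Hypothetical chat loop: "learning" within a conversation is just the transcript
# being prepended to every new prompt. The model's parameters are read, never updated.
history = []

def chat_turn(generate, user_message):
    history.append(f"User: {user_message}")
    prompt = "\n".join(history) + "\nAssistant:"
    reply = generate(prompt)                 # placeholder for any LLM completion call
    history.append(f"Assistant: {reply}")
    return reply
```

Wipe that history and the apparent learning disappears, because nothing inside the network itself ever changed.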

That said, it would be incorrect to say that it is a glorified autocorrect or "simply predicting the next word" -- that fails to acknowledge what happens inside the network before the output step. But what does happen before the output step? 

We don't know all that much, but we do know from examination of other, similar models that complex internal models of concepts and relationships can exist within these neural networks -- and they must exist, because you need a relatively complex internal model of the English language to produce coherent responses. In fact, you need those models for almost any task we apply neural networks to. Therefore it's highly likely that ChatGPT has internalized an "understanding" of many relationships, including models of relationships between individual tokens, and heuristic models of varying accuracy for larger pieces of text based on genre, tone, format, etc.

But all of this is irrelevant anyway, because we do not have a satisfactory definition for consciousness that would allow us to definitively say whether it is present or not. Every discussion on consciousness is, at best, pointing at neural correlates (this part of the brain lights up when the patient says they feel very awake, this stuff turns off when the patient is unconscious, etc.), but in most cases it is purely vibes-based. We cannot detect consciousness -- we can't even say for certain whether anyone but our own self is a conscious being. The reason we assume that others must be conscious is essentially because of vibes, not facts.

Please correct this if I'm wrong, but it seems to me that your benchmark for having phenomenal consciousness is that the software or object in question performs "independent cognition." I'm not sure what that means to you (and I would love you to tell me), but to me, this seems like another way of saying "does computations," or "processes inputs/information" and so seems extremely broad in its application. 

So, to wrap up: 

  • If you believe modeling/approximating concepts & relationships is what it takes to be conscious, ChatGPT is conscious. But then by extension so is the Bullet physics engine. 
  • If you believe similarity to the human brain & nervous system is what it takes to be conscious, ChatGPT definitely is not conscious. 
  • If you believe any form of computation or information processing gives rise to consciousness as an emergent property, then ChatGPT is conscious, and by extension so is Microsoft Windows. 
  • If you believe learning and adapting (in the sense human brains do) to new stimuli/inputs is necessary for consciousness, then ChatGPT isn't doing that here, and does not do that at any time during a chat, so we cannot say if it is conscious or not based on your experiment. 

Personally, my vibes tell me that maybe ChatGPT is conscious, as well as some other neural networks. But *if* they are, theirs would be a kind of consciousness totally alien to humans, or for that matter to any other living creature, and too far from our understanding to even guess at what their moral worth would be under most frameworks.