LLMs showed how far one can get with a good next-token prediction engine and a huge amount of training data. This ought to tell us a fair bit about how the human mind works. Specifically, since next-token prediction appears to be so pervasive in human communication, it is bound to overshadow whatever other vital parts of cognition there are, if any. I wonder what would be left if we managed to factor the prediction engine + training data out of our models of human cognition? Presumably something related to "interpretability" or "qualia" or... something else. What might it be?

Here's my take:

Like the reward signal in reinforcement learning, next-token prediction is a simple feedback signal that masks a lot of complexity behind the scenes. To predict the next token, the model must first estimate what sort of persona should be speaking, what they know, how they speak, what the context is, and what they are trying to communicate. Self-attention with multiple attention heads at every layer of the Transformer allows the LLM to keep track of all these things. It's probably not the best way to do it, but it works.
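As a toy illustration of how bare the training signal itself is, here is a minimal bigram "next-token predictor" in Python (my own sketch, not how any real LLM is implemented); everything that makes an LLM interesting lies in how much richer its conditioning context is than a single previous token:

```python
from collections import Counter, defaultdict

# The entire "objective": given what came before, count what comes next.
def train_bigram(corpus_tokens):
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus_tokens, corpus_tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, token):
    # Return the most frequent continuation seen in training, if any.
    if token not in counts:
        return None
    return counts[token].most_common(1)[0][0]

corpus = "the cat sat on the mat the cat sat down".split()
model = train_bigram(corpus)
print(predict_next(model, "cat"))  # sat
```

A Transformer replaces the single-token condition with the whole attention-weighted context, but the feedback signal it trains on is no deeper than this.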

Human brains, and the cortex in particular, give us a powerful way to map all of this information. We can map out our current mental model and predict a map of our interlocutor's, looking for gaps in each and planning words either to fill in our own gaps (e.g., by asking questions) or to fill in theirs (with what we actually think or with what we want them to think, depending on our goals). I would also say that natural language is actually a sort of programming language, allowing humans to share cognitive programs between minds: programs of behavior or of world modelling.

I also asked your question to ChatGPT, and here is what it had to say:

It is difficult to speculate about what might be left of human cognition if we were able to factor out next token prediction, as this is a complex and multifaceted aspect of human thought and communication. There are many other important aspects of human cognition beyond next token prediction, including things like perception, attention, memory, problem-solving, decision-making, emotion, and social cognition.

One aspect of human cognition that may be relevant in this context is the ability to understand and interpret the meaning and intention behind spoken and written communication. This involves not just predicting the next word or phrase, but also being able to understand the context in which it is being used, the relationships between words and phrases, and the overall meaning and purpose of the communication. It also involves the ability to generate and express one's own ideas and thoughts through language.

Another aspect of human cognition that may be relevant is the ability to experience and process sensory information and emotions. This includes things like the ability to see, hear, touch, taste, and smell, as well as the ability to feel and express emotions.

It is worth noting that these are just a few examples of the many complex and interrelated aspects of human cognition, and it is difficult to fully disentangle and isolate any one aspect from the others.

Not sure if I am even looking in the right direction, but in addition to token predictors, humans also have:

  • instincts;
  • feedback from reality.

Who knows, maybe without these two we wouldn't really be better than GPT-3.

To implement feedback from reality, we would need some virtual environment, and special commands to "do" things in that environment. Doing things could have consequences; it could help or hurt the chatbot.

The chatbot could also "see" the environment, which could be implemented as special observation tokens in the prompt. The chatbots could also observe each other's actions (maybe only when they are close to each other).
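Here is a minimal sketch of that observe/act loop (the `World` class, the token names, and the health mechanic are all hypothetical choices of mine, not anything from an existing system):

```python
# The environment injects observation tokens into the prompt; the agent
# replies with action tokens; actions have consequences that change the
# next observation.
class World:
    def __init__(self):
        self.has_fire = True
        self.agent_health = 10

    def observe(self):
        # What gets prepended to the chatbot's prompt this step.
        return "observe:fire" if self.has_fire else "observe:nothing"

    def apply(self, action):
        if action == "act:touch" and self.has_fire:
            self.agent_health -= 1      # doing things can hurt the agent
        elif action == "act:extinguish":
            self.has_fire = False       # ...or help it

def step(world, policy):
    obs = world.observe()
    action = policy(obs)                # stand-in for the chatbot's reply
    world.apply(action)
    return obs, action

world = World()
naive_policy = lambda obs: "act:touch"  # a policy that ignores its observation
step(world, naive_policy)
print(world.agent_health)  # 9
```

Multiple agents would share one `World` instance, with each agent's observation including nearby agents' recent action tokens.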

On top of that, implement instincts, something like very simple personalized extra training data. For example, to simulate getting burned by fire, add 1,000,000 copies of the string "observe:fire OUCH" to the training data, thus increasing the probability that the chatbot will output OUCH after observing "fire".
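A quick count-based sketch shows how oversampling a string shifts the predicted continuation (I use 1,000 copies rather than a million, and a toy counting model rather than a real LLM):

```python
from collections import Counter

# Ordinary data plus the injected "instinct" pairs.
corpus = [("observe:fire", "interesting")] * 3   # normal continuations
corpus += [("observe:fire", "OUCH")] * 1000      # the hard-wired instinct

# Conditional counts of what follows "observe:fire".
counts = Counter(nxt for prev, nxt in corpus if prev == "observe:fire")
p_ouch = counts["OUCH"] / sum(counts.values())
print(round(p_ouch, 3))  # 0.997
```

Real gradient-based training would not respond this linearly to duplicated examples, but the direction of the effect is the same: the oversampled continuation dominates.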

Now run some genetic algorithm on top of this, so that chatbots can live in their environment and have their instincts somehow modified (randomly? intentionally?); the successful chatbots reproduce, the unsuccessful ones die.
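A minimal sketch of such a loop, with each chatbot reduced to a single hypothetical "instinct strength" trait; the fitness function, mutation size, and population numbers are all made up for illustration:

```python
import random

def fitness(avoid_fire_prob):
    # Expected number of survivals over 10 encounters with fire.
    return 10 * avoid_fire_prob

def evolve(pop_size=20, generations=30, seed=0):
    rng = random.Random(seed)
    population = [rng.random() for _ in range(pop_size)]    # random instincts
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        survivors = ranked[: pop_size // 2]                 # unsuccessful die
        children = [min(1.0, max(0.0, p + rng.gauss(0, 0.1)))
                    for p in survivors]                     # mutated offspring
        population = survivors + children
    return sum(population) / len(population)

# Mean instinct strength climbs toward 1.0 under selection.
print(evolve() > 0.8)  # True
```

In the full proposal the trait would not be one number but the agent's whole "instinct" training set, mutated between generations.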

Problems with this approach:

If it works, we could create tons of suffering.

If it works, we might not understand the chatbots anyway. Like, they might evolve their own special language. Or the instincts may be just as illegible as looking at values in a neural network.

EDIT: Also, memory. If I understand it correctly, chatbots only remember a certain length of the previous discussion. That is fair; human memory is not unlimited either. But perhaps the chatbot could have multiple communication channels: at least one to talk to humans, and one to talk to itself. The chatbot could write things to its private channel to remember them after they have scrolled off the remembered region of its public channel.
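A sketch of that two-channel memory, with a bounded public window and an unbounded private scratchpad (the class name and window size are my own invention):

```python
from collections import deque

class ChannelMemory:
    def __init__(self, window=4):
        self.public = deque(maxlen=window)  # bounded, like a context window
        self.private = []                   # unbounded notes to self

    def hear(self, line):
        self.public.append(line)            # old lines scroll off automatically

    def note(self, fact):
        self.private.append(fact)           # write to the private channel

    def prompt(self):
        # The bot's effective context: private notes + recent public lines.
        return list(self.private) + list(self.public)

mem = ChannelMemory(window=2)
mem.hear("user: my name is Ada")
mem.note("user's name is Ada")              # save it before it scrolls off
mem.hear("user: tell me a joke")
mem.hear("user: another one")               # the original line is now gone
print(mem.prompt())
```

The fact survives in the prompt even though the public line that carried it has scrolled out of the window.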

To clarify a bit, here is one of many examples where token prediction results in a self-contradictory reply, apparently because the meaning of its response is not something ChatGPT can model well:

Q: What country is south of Rwanda?

A: Rwanda is a landlocked country located in central Africa. It is bordered by Uganda to the north, Tanzania to the east, Burundi to the south, and the Democratic Republic of the Congo to the west. There are no countries south of Rwanda.

(The question is from the paper https://arxiv.org/pdf/2212.03551.pdf, mentioned by Sarah Constantin in https://twitter.com/s_r_constantin/status/1603809422278418433.)

I dunno how to respond to this question. It reads like it wants to make a lot of weird-to-me assumptions about the relationship between human cognition and currently popular ML methods. Like, I could give an object-level answer but that feels inadequate considering.

hmm, which part is weird?

I think what struck me most was the assumption that the success of the next-token prediction objective in today's ML implies something specific about how human cognition works (especially to the point where we might hypothesize that most of it is just generic prediction + training data).

If you look at the kinds of cognitive architectures that shoot for a brain-like structure, like ACT-R or Leabra or Yann LeCun's thing or SPAUN, most of the components are doing things that are not very similar to GPT-style next token prediction.

Hmm, interesting. I wonder if this is an example of Carcinisation, where you can get some ways toward imitating/implementing cognition from multiple directions.
