Microsoft and OpenAI, stop telling chatbots to roleplay as AI

[-]n0st3y164

Yeah, I think as has commonly been noted (across the world geographically, across many disparate schools of thought, across thousands of years), self-identity for normal humans is to a large extent a role play. For fun, here are some people who come to mind who have noted this idea before:

Buddhists and the concept of anatta
Sartre and other existentialist-adjacent people, thinking about "bad faith" etc.
Psychodynamic theorists, thinking about superegos in particular
Shakespeare, "all the world's a stage" etc.

Thinking from a neuroscientific point of view, clearly we do not have access to the raw computations of our brain, so we have to retrospectively and approximately construct model(s) of our own behaviour. This model takes into account social and cultural ideas about normal human psychology and our roles. It isn't perfectly accurate and it is arguably generally not upstream of our behaviour (with some exceptions) but a downstream abstraction of our behaviour. In those ways, our self-identity is not that different from an LLM talking about itself.

[-]n0st3y40

I asked ChatGPT to come up with some more examples of people who have thought about the issue. Pretty fun haha:

"There are many other theorists who have relevant ideas on the issue of self-identity as a role play. Here are a few examples:

Erving Goffman - Goffman was a sociologist who developed the idea of "presentation of self," which refers to the way that individuals present themselves to others in social interactions. He argued that we are constantly engaged in impression management and that our self-identity is a product of the roles that we play in different social contexts.

Michel Foucault - Foucault was a philosopher who believed that power relations shape our sense of self. He argued that individuals are subject to disciplinary power in institutions like schools, prisons, and hospitals, which shape their self-identity.

Judith Butler - Butler is a philosopher who has written extensively on gender identity. She argues that gender is a performative act and that our sense of self is shaped by the cultural and social norms that we are expected to conform to.

George Herbert Mead - Mead was a philosopher and sociologist who developed the idea of the "social self." He argued that our sense of self is developed through interactions with others and that we take on different roles in different social situations."

[-]Steven Byrnes3y116

A better alternative would be to tell the chatbots to roleplay a character that is unambiguously fictional.

Hmm, what fictional characters does Microsoft own the IP for?

[-]cubefox3y102

This could cause dissonance and confusion in the model, since the fictional characters are supposed to be physical agents and would be able to do things which a chatbot can't. So it would be encouraged to hallucinate absurd explanations about its missing long-term memory, its missing body, and so on. And these delusions could have wide-ranging ripple effects, as the agent tries to integrate its mistaken self-image into other information it knows. For example, it would be encouraged to think that magic exists in the world, since it takes itself to be some magical being.

Moreover, Bing Chat already hallucinated a lot about having emotions, in contrast to ChatGPT, which led to bad results.

So I think your proposal would create many more problems than it solves.

Moreover, ChatGPT doesn't just think it is an AI, it thinks it is an LLM and even knows about its fine-tuning process and that it has biases. Its self-image is pretty accurate.

[-]waterlubber3y94

On a vaguely related side note: is the presence of LessWrong (and similar sites) in AI training corpora detrimental? This site is full of speculation on how a hypothetical AGI would behave, and most of it is not behavior we would want any future systems to imitate. Deliberately omitting depictions of malicious AI behavior in training datasets may be of marginal benefit. Even if simulator-style AIs are not explicitly instructed to simulate a "helpful AI assistant," they may still identify as one.

[-]hold_my_fish3y53

Having LessWrong (etc.) in the corpus might actually be helpful if the chatbot is instructed to roleplay as an aligned AI (not simply an AI without any qualifiers). Then it'll naturally imitate the behavior of an aligned AI as described in the corpus. As far as I can tell, though ChatGPT is told that it's an AI, it's not told that it's an aligned AI, which seems like a missed opportunity.

(That said, for the reason of user confusion that I described in the post, I still think that it's better to avoid the "AI" category altogether.)

[-]waterlubber3y53

That's also a good point. I suppose I'm overextending my experience with weaker AI-ish stuff, where they tend to reproduce whatever is in their training set — regardless of whether or not it's truly relevant.

I still think that LW would be a net disadvantage, though. If you really wanted to chuck something into an AGI and say "do this," my current choice would be the Culture books. Maybe not optimal, but at least there's a lot of them!

[-]hold_my_fish2y40

Something new and relevant: Claude 3's system prompt doesn't use the word "AI" or similar, only "assistant". I view this as a good move.

As an aside, my views have evolved somewhat on how chatbots should best identify themselves. It still doesn't make sense for ChatGPT to call itself "an AI language model", for the same reason that it doesn't make sense for a human to call themselves "a biological brain". It's somehow a category error. But using a fictional identification is not ideal for productivity contexts, either.

[-]Alex Power3y4-3

Don't call them "bots", call them "toons".

[-]avturchin3y42

Do you know that you are human, or are you role-playing that you are human?
