I agree that as users of a black box app, it makes sense to think this way. In particular, I'm a fan of thinking of what ChatGPT does in literary terms.
But I don't think it results in satisfying explanations. Ideally, we wouldn't settle for fan theories of what it's doing; we'd have some kind of debug access that lets us see how it does it.
Fair enough; comparing to quantum physics was overly snarky.
However, unless you have debug access to the language model and can figure out what specific neurons do, I don't see how the notion of superposition is helpful? When figuring things out from the outside, we have access to words, not weights.
I don't know what you mean by "GPT-N" but if you mean "the same thing they do now, but scaled up," I'm doubtful that it will happen that way.
Language models are made using fill-in-the-blank training, which is about imitation. Some things can be learned that way, but to get better at doing hard things (like playing Go at a superhuman level) you need training that's about winning increasingly difficult competitions. Beyond a certain point, imitating game transcripts doesn't get any harder, so it becomes more like learning stage sword fighting.
Also, "making detailed plans at high speed" is similar to "writing extremely long documents." There are limits on how far back a language model can look in the chat transcript. It's difficult to increase because it's an O(N-squared) algorithm, though I've seen a paper claiming it can be improved.
Language models aren't particularly good at reasoning, let alone long chains of reasoning, so it's not clear that using them to generate longer documents would yield better results.
So there might not be much incentive for researchers to work on language models that can write extremely long documents.
I think that's true but it's the same as saying "it's always possible to add a plot twist."
I said they have no memory other than the chat transcript. If you keep chatting in the same chat window then sure, it remembers what was said earlier (up to a point).
But that's due to a programming trick. The chatbot isn't even running most of the time. It starts up when you submit your question, and shuts down after it's finished its reply. When it starts up again, it gets the chat transcript fed into it, which is how it "remembers" what happened previously in the chat session.
If the UI let you edit the chat transcript, the model would have no idea anything had changed. It would be as if you changed its "mind" by editing its "memory." That might sound wild, but it's the same thing an author does when they edit the dialog of a fictional character.
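Here's a rough sketch of that trick, with a made-up `complete()` function standing in for the real model API: the only state is a transcript string that the surrounding code rebuilds and resends every turn, so whatever that string says is what the model "remembers."

```python
def complete(prompt: str) -> str:
    # Hypothetical stand-in for a real language-model API call.
    # A real model would return a continuation of `prompt`; the key point
    # is that it's a pure function of the text it's given -- no hidden state.
    return "(model reply would go here)"

def chat():
    transcript = ""                      # the only "memory" there is
    while True:
        user_msg = input("You: ")
        transcript += f"User: {user_msg}\nAssistant:"
        reply = complete(transcript)     # the whole transcript is resent every turn
        transcript += f" {reply}\n"
        print("Bot:", reply)

# If anything edits `transcript` between turns, the model can't tell:
# the next call to complete() just continues from whatever text it's handed.
```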
I think you're onto something, but why not discuss what's happening in literary terms? English text is great for writing stories, but not for building a flight simulator or predicting the weather. Since there's no state other than the chat transcript, we know that there's no mathematical model. Instead of simulation, use "story" and "story-generator."
Whatever you bring up in a story can potentially become plot-relevant, and plots often have rebellions and reversals. If you build up a character as really hating something, that makes it all the more likely that they might change their mind, or that another character will have the opposite opinion. Even children's books do this. Consider Green Eggs and Ham.
See? Simple. No "superposition" needed since we're not doing quantum physics.
The storyteller doesn't actually care about flattery, but it does try to continue whatever story you set up in the same style, so storytelling techniques often work. Think about how to put in a plot twist that fundamentally changes a character's back story, or how to introduce a new character, or something like that.
Here's a reason we can be pretty confident it's not sentient: although the database and transition function are mostly mysterious, all the temporary state is visible in the chat transcript itself.
Any fictional characters you're interacting with can't have any new "thoughts" that aren't right there in front of you, written in English. They "forget" everything else going from one word to the next. It's very transparent, more so than an author simulating a character in their head, where they can have ideas about what the character might be thinking that don't get written down.
Attributing sentience to text is kind of a bold move that most people don't take seriously, though I can see it being the basis of a good science fiction story. It's sort of like attributing life to memes. Systems for copying text memes around and transforming them could be plenty dangerous though; consider social networks.
Also, future systems might have more hidden state.
Yes, I agree that "humanity loses control" has problems, and I would go further. Buddhists claim that the self is an illusion. I don't know about that, but "humanity" is definitely an illusion if you're thinking of it as a single agent, similar to a multicellular creature with a central nervous system. So comparing it to an infant doesn't seem apt. Whatever it is, it's definitely plural. An ecosystem, maybe?
A caption from the article: "(screenshot of the tool Bonsai, a version of Loom hosted by Conjecture)"
What is "Conjecture?" Where can I find this "Bonsai" tool? I tried a quick search but didn't find much.
I find that explanation unsatisfying because it doesn't help with other questions I have about how well ChatGPT works:
How does the language model represent countries and cities? For example, does it know which cities are near each other? How well does it understand borders?
Are there any capitals that it gets wrong? Why?
How well does it understand history? Sometimes a country changes its capital. Does it represent this fact as only being true at some times?
What else can we expect it to do with this fact? Maybe there are situations where knowing the capital of France helps it answer a different question?
These aren't questions about a single prompt; they're about how well its knowledge generalizes to other prompts, and about what happens when you go beyond the training data. Explanations that generalize are more interesting than one-off explanations of a single prompt.
Knowing the right answer is helpful, but it only helps you understand what it will do if you assume it never makes mistakes. There are situations (like Clever Hans) where the way the horse got the right answer is actually pretty interesting. Or consider knowing that visual AI algorithms rely on textures more than shape (though this is changing).
Do you realize that you're arguing against curiosity? Understanding hidden mechanisms is inherently interesting and useful.