(Related text posted to Twitter; this version is edited and has a more advanced final section.)
Imagine yourself in a box, trying to predict the next word - assign as much probability mass to the next token as possible - for all the text on the Internet.
Koan: Is this a task whose difficulty caps out as human intelligence, or at the intelligence level of the smartest human who wrote any Internet text? What factors make that task easier, or harder? (If you don't have an answer, maybe take a minute to generate one, or alternatively, try to predict what I'll say next; if you do have an answer, take a moment to review it inside your mind, or maybe say the words out loud.)
Consider that somewhere on the internet is probably a list of thruples: <product of 2 prime numbers, first prime, second prime>.
GPT obviously isn't going to predict that successfully for significantly-sized primes, but it illustrates the basic point:
There is no law saying that a predictor only needs to be as intelligent as the generator, in order to predict the generator's next token.
Indeed, in general, you've got to be more intelligent to predict particular X, than to generate realistic X. GPTs are being trained on a much harder task than GANs.
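To make the asymmetry concrete, here is a minimal Python sketch. It is my illustration, not anything from a real training setup: the function names, the prime range, and the use of trial division are all assumptions. Writing a realistic line of the list takes one multiplication; predicting the rest of a line from its first entry takes a factoring search.

```python
import random
from typing import Tuple


def is_prime(n: int) -> bool:
    """Trial-division primality test; fine for the small numbers used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def random_prime(lo: int, hi: int, rng: random.Random) -> int:
    """Draw random integers in [lo, hi) until one is prime."""
    while True:
        candidate = rng.randrange(lo, hi)
        if is_prime(candidate):
            return candidate


def generate_triple(rng: random.Random) -> Tuple[int, int, int]:
    """Generator's job: pick two primes and multiply them. Cheap."""
    p, q = sorted((random_prime(1000, 10000, rng), random_prime(1000, 10000, rng)))
    return p * q, p, q


def predict_factors(product: int) -> Tuple[int, int]:
    """Predictor's job: given only the product, recover both primes.

    Trial division is a stand-in for the real search; its cost grows with
    the smaller prime, while the generator's cost stays one multiplication.
    """
    d = 2
    while d * d <= product:
        if product % d == 0:
            return d, product // d
        d += 1
    raise ValueError("no nontrivial factor found")


product, p, q = generate_triple(random.Random(0))
assert predict_factors(product) == (p, q)
```

With four-digit primes the predictor still finishes instantly. Scale the primes up to hundreds of digits and the generator barely notices, while no known classical method lets the predictor keep up.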
Same spirit: <Hash, plaintext> pairs, which you can't predict without cracking the hash algorithm, but which you could far more easily generate typical instances of if you were trying to pass a GAN's discriminator about it (assuming a discriminator that had learned to compute hash functions).
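The hash case can be sketched the same way, again as an assumption-laden illustration rather than a claim about any real dataset: SHA-256 and the helper names below are my choices. Producing a typical pair means running the hash forward once; predicting the plaintext from the hash means searching over candidate plaintexts.

```python
import hashlib
import secrets
from typing import Iterable, Optional, Tuple


def generate_pair() -> Tuple[str, str]:
    """Generator's job: invent a plaintext and hash it forward."""
    plaintext = secrets.token_hex(8)
    digest = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return digest, plaintext


def predict_plaintext(digest: str, candidates: Iterable[str]) -> Optional[str]:
    """Predictor's job: find a plaintext matching the digest.

    Without a break in the hash, the only general approach is to try
    candidates one by one, which is hopeless for a large plaintext space.
    """
    for candidate in candidates:
        if hashlib.sha256(candidate.encode("utf-8")).hexdigest() == digest:
            return candidate
    return None
```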
Consider that some of the text on the Internet isn't humans casually chatting. It's the results section of a science paper. It's news stories that say what happened on a particular day, where maybe no human would be smart enough to predict the next thing that happened in the news story in advance of it happening.
As Ilya Sutskever compactly put it, to learn to predict text, is to learn to predict the causal processes of which the text is a shadow.
Lots of what's shadowed on the Internet has a *complicated* causal process generating it.
Consider that sometimes human beings, in the course of talking, make errors.
GPTs are not being trained to imitate human error. They're being trained to *predict* human error.
Consider the asymmetry between you, who makes an error, and an outside mind that knows you well enough and in enough detail to predict *which* errors you'll make.
If you then ask that predictor to become an actress and play the character of you, the actress will guess which errors you'll make, and play those errors. If the actress guesses correctly, it doesn't mean the actress is just as error-prone as you.
Consider that a lot of the text on the Internet isn't extemporaneous speech. It's text that people crafted over hours or days.
GPT-4 is being asked to predict it in 200 serial steps or however many layers it's got, just like if a human was extemporizing their immediate thoughts.
A human can write a rap battle in an hour. A GPT loss function would like the GPT to be intelligent enough to predict it on the fly.
Or maybe simplest:
Imagine somebody telling you to make up random words, and you say, "Morvelkainen bloombla ringa mongo."
Imagine a mind of a level - where, to be clear, I'm not saying GPTs are at this level yet -
Imagine a Mind of a level where it can hear you say 'morvelkainen bloombla ringa', and maybe also read your entire social media history, and then manage to assign 20% probability that your next utterance is 'mongo'.
The fact that this Mind could double as a really good actor playing your character, does not mean They are only exactly as smart as you.
When you're trying to be human-equivalent at writing text, you can just make up whatever output, and it's now a human output because you're human and you chose to output that.
GPT-4 is being asked to predict all that stuff you're making up. It doesn't get to make up whatever. It is being asked to model what you were thinking - the thoughts in your mind whose shadow is your text output - so as to assign as much probability as possible to your true next word.
Figuring out that your next utterance is 'mongo' is not mostly a question, I'd guess, of that mighty Mind being hammered into the shape of a thing that can simulate arbitrary humans, and then some less intelligent subprocess being responsible for adapting the shape of that Mind to be you exactly, after which it simulates you saying 'mongo'. Figuring out exactly who's talking, to that degree, is a hard inference problem which seems like noticeably harder mental work than the part where you just say 'mongo'.
When you predict how to chip a flint handaxe, you are not mostly a causal process that behaves like a flint handaxe, plus some computationally weaker thing that figures out which flint handaxe to be. It's not a problem that is best solved by "have the difficult ability to be like any particular flint handaxe, and then easily figure out which flint handaxe to be".
GPT-4 is still not as smart as a human in many ways, but it's naked mathematical truth that the task GPTs are being trained on is harder than being an actual human.
And since the task that GPTs are being trained on is different from and harder than the task of being a human, it would be surprising - even leaving aside all the ways that gradient descent differs from natural selection - if GPTs ended up thinking the way humans do, in order to solve that problem.
GPTs are not Imitators, nor Simulators, but Predictors.
Is this a limitation in practice? Rap battles are a bad example because they happen to be the exception, a task premised on being "one shot" and real time, but the overall point stands. We ask GPT to do tasks in one try, one step, that humans do with many steps, iteratively and recursively.
Take this "the queen was captured" problem. As a human I might be analyzing a game, glance at the wrong move, think a thought about the analysis premised on that move (or even start writing words down!) and then notice the error and just fix it. I am doing this right now, in my thoughts and on the keyboard, writing this comment.
Same thing works with ChatGPT, today. I deal with problems like "the queen was captured" every day just by adding more ChatGPT steps. Instead of one-shotting, every completion chains a second ChatGPT prompt to check for mistakes. (You may need a third level to get to like 99% because the checker blunders too.) The background chains can either ask to regenerate the original prompt, or reply to the original ChatGPT describing the error, and ask it to fix its mistake. The latter form seems useful for code generation.
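A rough sketch of the "reply describing the error and ask for a fix" form, in Python. Everything here is hypothetical: `complete` stands for whatever function sends a prompt to the model and returns its text, and the prompt wording and the literal `OK` convention are just one way to do it.

```python
from typing import Callable

# Hypothetical model call: prompt text in, completion text out.
Complete = Callable[[str], str]


def answer_with_check(prompt: str, complete: Complete, max_fixes: int = 2) -> str:
    """Get an answer, then chain checker prompts and fix whatever they flag."""
    answer = complete(prompt)
    for _ in range(max_fixes):
        verdict = complete(
            "Check the answer below for mistakes. Reply with exactly OK if it "
            "is correct; otherwise describe the error.\n\n"
            f"Question: {prompt}\n\nAnswer: {answer}"
        )
        if verdict.strip() == "OK":
            break  # the checker found nothing to fix
        # Feed the checker's description back and ask for a corrected answer.
        answer = complete(
            f"Question: {prompt}\n\nPrevious answer: {answer}\n\n"
            f"A reviewer found this problem: {verdict}\n\n"
            "Write a corrected answer."
        )
    return answer
```

The default `max_fixes=2` mirrors the point about needing more than one checking level, since the checker can blunder too. The regenerate-from-scratch variant would replace the fix prompt with a fresh `complete(prompt)`.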
Like right now I typically do 2 additional background chains by default, for every single thing I ask ChatGPT. Not just in a task where I'm seeking rigour and want to avoid factual mistakes like "the queen was captured" but just to get higher quality responses in general.
Original Prompt -> Improve this answer. -> Improve this answer.
Not literally just those three words, but even something that simple is actually better than just asking one time. Seriously. Try it, confirm, and make it a habit. Sometimes it's shocking. I ask for a simple JavaScript function, it pumps out a 20-line function that looks fine to me. I habitually ask for a better version and "Upon reflection, you can do this in two lines of JavaScript that run 100x faster."
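The bare-bones version of that habit looks something like this in Python. As before, `complete` is a hypothetical stand-in for the model call, and the prompt layout is only an assumption about what a reasonable follow-up might contain.

```python
from typing import Callable


def improve_chain(prompt: str, complete: Callable[[str], str], rounds: int = 2) -> str:
    """Ask once, then ask for an improved version `rounds` more times."""
    answer = complete(prompt)
    for _ in range(rounds):
        answer = complete(
            f"Request: {prompt}\n\nCurrent answer:\n{answer}\n\nImprove this answer."
        )
    return answer
```

Each round sees only the original request and the latest answer, so the prompt stays small no matter how many rounds you run.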
If GPT were 100x cheaper I would be tempted to just go wild with this. Every prompt is 200 or 300 prompts in the background, invisibly, instead of 2 or 3. I'm sure there are diminishing returns and the chain would be more complicated than repeating "Improve" 100 times, but if it were fast and cheap enough, why not do it.
As an aside, I think about asking ChatGPT to write code like asking a human to code a project on a whiteboard without the internet to find answers, a computer to run code on, or even paper references. The human can probably do it, sort of, but I bet the code will have tons of bugs and errors and even API 'hallucinations' if you run it! I think it's even worse than that: it's almost like ChatGPT isn't even allowed to erase anything it wrote on the whiteboard either. But we don't need to one-shot everything, so do we care about infinite-length completions? Humans do things in steps, and when ChatGPT isn't trying to whiteboard everything, when it can check API references, when it can see what the code returns, including errors, when it can recurse on itself to improve things, it's way better. Right now the form this takes is a human on the ChatGPT web page asking for code, running it, and then pasting the error message back into ChatGPT. The more automated versions of this are trickling out. Then I imagine the future, asking ChatGPT for code when it's 1000x cheaper. And my one question behind the scenes is actually 1000 prompts looking up APIs on the internet, running the code in a simulator (or for real, people are already doing that), looking at the errors or results, etc. And that's the boring unimaginative extrapolation.
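The manual loop (ask for code, run it, paste the error back) is easy to sketch in Python. This is an illustration under assumptions, not a description of any shipping tool: `complete` is a hypothetical model call that returns Python source, and the code is run in a plain subprocess with no sandboxing, which you would not want for untrusted output.

```python
import subprocess
import sys
from typing import Callable


def run_python(source: str, timeout: float = 10.0) -> str:
    """Run source in a fresh interpreter; return '' on success, else the error.

    Raises subprocess.TimeoutExpired if the code runs longer than `timeout`.
    """
    result = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode == 0:
        return ""
    return result.stderr or f"exited with status {result.returncode}"


def code_until_it_runs(
    request: str, complete: Callable[[str], str], max_attempts: int = 3
) -> str:
    """Ask for code, run it, and paste any error back until it runs cleanly."""
    source = complete(request)
    for _ in range(max_attempts):
        error = run_python(source)
        if not error:
            return source
        source = complete(
            f"{request}\n\nYour code:\n{source}\n\n"
            f"It failed with:\n{error}\n\nReturn corrected code only."
        )
    return source  # last attempt, not re-run; may still be broken
```

"Runs without raising" is a weak success criterion. A fuller version would also check the output against what was asked for.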
Also this is probably obvious, but just in case: if you try asking "Improve this answer." repeatedly in ChatGPT you need to manage your context window size. Migrate to a new conversation when the window gets about 75% full. OpenAI should really warn you because even before 100% the quality drops like a rock. Just copy your original request and the last best answer(s). If you're doing it manually, select a few useful other bits too.
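If you automate this, the bookkeeping might look like the Python sketch below. The numbers and names are assumptions: four characters per token is only a rough rule of thumb for English text (a real tokenizer would be more accurate), and `context_limit` is whatever your model's window happens to be.

```python
from typing import List, Sequence


def estimate_tokens(text: str) -> int:
    """Rough guess at token count, assuming about 4 characters per token."""
    return max(1, len(text) // 4)


def needs_new_conversation(
    messages: Sequence[str], context_limit: int, threshold: float = 0.75
) -> bool:
    """True once the conversation fills `threshold` of the context window."""
    used = sum(estimate_tokens(message) for message in messages)
    return used >= threshold * context_limit


def carry_over(
    original_request: str, best_answer: str, extras: Sequence[str] = ()
) -> List[str]:
    """Seed the new conversation: the request, the best answer, a few extras."""
    return [original_request, best_answer, *extras]
```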