Meta Programming GPT: A route to Superintelligence?

by dmtea, 11th Jul 2020



Imagine typing the following meta-question into GPT-4, a revolutionary new 20-trillion-parameter language model released in 2021:

"I asked the superintelligence how to cure cancer. The superintelligence responded __"

How likely are we to get an actual cure for cancer, complete with manufacturing blueprints? Or will we get yet another "nice sounding, vague suggestion" like "by combining genetic engineering and fungi-based medicine", the sort GPT-2/3 is likely to suggest?

The response depends on which of the following GPT focuses on:

1. What GPT thinks humans think the superintelligence would say; or

2. What the character (an actual superintelligence) would say if this scenario were playing out in real life, worked out using basic reasoning.

If GPT takes the second approach, by imitating the idealised superintelligence, it would in essence have to act superintelligent.

The difference between the two rests on a fine semantic line: whether GPT thinks the conversation is a human imitating a superintelligence, or the actual words of a superintelligence. Arguably, since it only has training samples of the former, it will do the former. Yet that's not what it did with numbers: there it learnt the underlying principle and extrapolated to tasks it had never seen.

If #1 is true, that still implies that GPT-3/4 could be very useful as an AXI: we just need it to imitate a really smart human. More below under "Human Augmentation".

Human-Like Learning?

Human intelligence ['ability to achieve goals'] can be modelled purely as an optimisation process towards imitating an agent that achieves those goals. In so far as these goals can be expressed in language, GPT exhibits a similar capacity to "imagine up an agent" that is likely to fulfil a particular goal. Ergo, GPT exhibits primitive intelligence, of the same kind as human intelligence.

More specifically, I'm trying to clarify that there is a spectrum between imitation and meta-imitation; and bigger GPT models are getting progressively better at meta-imitation.

  • Meta-Imitation is the imitation of the underlying type of thinking that is represented by a class of real or fictional actors, e.g. mathematics.
  • Imitation is direct (perfect or imperfect) copying of an observed behaviour, e.g. recalling the atomic number of Uranium.

Language allows humans to imagine ideas that they then imitate: it gives us the ability to imitate the abstract.

Suppose you were a general in ancient Athens, and the problem of house lamps occasionally spilling and setting neighbourhoods aflame was brought to you. "We should build a fire-fighting squad," you pronounce. The words "fire-fighting squad" may never have been used in history before that (as a sufficient density of human population requiring such measures hadn't occurred earlier), yet the meaning would be, to a great degree, plain to onlookers. The fire-fighting squad thus formed can go about their duties without much further instruction, by making decisions based on substituting the question "what do I do?" with "what would a hypothetical idealised firefighter do?".

With a simple use of language, we're able to get people to optimize for brand new tasks. Could this same sort of reasoning be used with GPT? Evidence of word substitution would suggest so.

So, in one line: is Meta-Imitation = Intelligence? And will GPT ever be capable of human-level meta-imitation?

Larger GPT models appear to show an increase in meta-imitation over literal imitation. For example, if you asked GPT-2:

"what is 17+244?"

It replies "11"

This is closer to literal imitation: it knows that numbers come after a question containing other numbers and an operator ("+"). Incidentally, young children seem to acquire language in a somewhat similar fashion:

They begin by imitating utterances (a baby might initially describe many things as "baba"); their utterances grow increasingly sensitive to nuances of context over time: "doggy" < "Labrador" < "Tommy's Labrador named Kappy". I'm arguing that GPT shows a similar increase in contextual sensitivity as the model size grows, implying increasing meta-imitation.
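
The 17+244 example suggests one rough way to probe where a model sits between literal imitation and meta-imitation: score its completions on addition prompts it is unlikely to have seen verbatim. Below is a minimal Python sketch of that idea. The `complete` callable is a hypothetical stand-in for whatever function returns a model's text completion (it is not a real GPT API), and the two toy "models" are invented here purely to illustrate the two ends of the spectrum.

```python
import random
import re
from typing import Callable, List, Optional, Tuple


def make_addition_prompts(n: int, max_value: int, seed: int = 0) -> List[Tuple[str, int]]:
    """Build n prompts of the form 'what is a+b?' paired with the correct sum."""
    rng = random.Random(seed)
    prompts = []
    for _ in range(n):
        a, b = rng.randint(0, max_value), rng.randint(0, max_value)
        prompts.append((f"what is {a}+{b}?", a + b))
    return prompts


def first_integer(text: str) -> Optional[int]:
    """Return the first integer found in the text, or None if there is none."""
    match = re.search(r"-?\d+", text)
    return int(match.group()) if match else None


def addition_accuracy(complete: Callable[[str], str],
                      prompts: List[Tuple[str, int]]) -> float:
    """Fraction of prompts whose completion starts with the correct sum."""
    if not prompts:
        return 0.0
    correct = sum(1 for prompt, expected in prompts
                  if first_integer(complete(prompt)) == expected)
    return correct / len(prompts)


# Hypothetical stand-ins for a model (assumptions for illustration only).
def literal_imitator(prompt: str) -> str:
    """Knows that a number follows the question, but not which one."""
    return "11"


def meta_imitator(prompt: str) -> str:
    """Has learnt the underlying rule and applies it to unseen operands."""
    a, b = (int(x) for x in re.findall(r"\d+", prompt))
    return str(a + b)


if __name__ == "__main__":
    prompts = make_addition_prompts(n=100, max_value=10_000)
    print("literal imitator:", addition_accuracy(literal_imitator, prompts))
    print("meta-imitator:   ", addition_accuracy(meta_imitator, prompts))
```

Under this framing, a model drifting towards meta-imitation should score higher as the operands move further from anything likely to appear in its training text. The scoring is deliberately crude: it only inspects the first integer in the completion.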

Human Augmentation

My definition of AXI relies on a Turing test in which a foremost expert in a field converses with another expert (or an AI). If the expert finds the conversation highly informative and indistinguishable from one with a human expert, we've created useful AXI.

GPT-2 and GPT-3 appear to show progression towards such intelligence, GPT-written research papers providing interesting ideas being one example. Thus, even if GPT-4 isn't superintelligent, I feel it is highly likely to qualify as AXI [especially when trained on research from the relevant field]. And while it may not be able to answer the question on cancer, maybe it will respond to subtler prompts that induce it to imitate a human expert who has solved the problem. So the following might be how a human announces finding the cure for cancer, and GPT-4's completion might yield interesting results:

"Our team has performed in-vivo experiments where we were able to target and destroy cancerous cells, while leaving healthy ones untouched. We achieved this by targeting certain inactivated genes through a lentivirus-delivered Cas9–sgRNA system. The pooled lentiviruses target several genes, including "

[Epistemic status: weak - I'm not a geneticist and this is likely not the best prompt - but this implies that it would require human experts working in unison with AXI to coax it to give meaningful answers.]

Failure Modes

GPT has some interesting failure modes very distinct from a human's: going into repetitive loops for one, and, with GPT-3 in particular, an increasing tendency to reproduce texts verbatim. Maybe we'll find that GPT-4 is just a really good memoriser, and lacks abstract thinking and creativity. Or maybe it falls into even more loops than GPT-3. It is hard to say.
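
Both failure modes can be measured crudely from generated text alone. The sketch below is one possible way to do so, not an established benchmark: it uses whitespace tokenisation and n-gram counting, and the function names and the choices of n are assumptions made for illustration.

```python
from typing import List, Tuple


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous n-token windows, in order."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def repeated_ngram_fraction(text: str, n: int = 3) -> float:
    """Fraction of n-grams that duplicate another n-gram in the same text.

    Values near 0 mean little repetition; values near 1 suggest a loop.
    """
    grams = ngrams(text.split(), n)
    if not grams:
        return 0.0
    return 1.0 - len(set(grams)) / len(grams)


def verbatim_overlap(generated: str, reference: str, n: int = 5) -> float:
    """Fraction of n-grams in `generated` that also occur in `reference`.

    A high value suggests the text was reproduced rather than composed.
    """
    gen = ngrams(generated.split(), n)
    if not gen:
        return 0.0
    ref = set(ngrams(reference.split(), n))
    return sum(1 for g in gen if g in ref) / len(gen)


if __name__ == "__main__":
    looping = "the cat sat " * 4
    print(repeated_ngram_fraction(looping))
    print(verbatim_overlap("a b c d e f", "x a b c d e f y"))
```

Comparing these two numbers across model sizes, on the same prompts, would be one way to check whether larger models really loop less but memorise more.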

To me, the main argument against GPT-4 acquiring superintelligence is simply its reward function: it is trained to copy humans, so perhaps it will not be able to do things humans can't (since there is no point optimising for it). However, this is a fairly weak position because, to be precise, GPT attempts to imitate anything, real or hypothetical, in an attempt to get at the right next word. The examples of math and invented words show that GPT appears to be learning the processes behind the words, and extrapolating them to unseen scenarios.

Finally, the word "superintelligence" is likely to have a lot of baggage from its usage in sci-fi and other articles by humans. Perhaps, to remove any human-linked baggage from the word, we could instead define specific scenarios that focus the AI on imitating the new concept, rather than recalling previous human usage. For example:

"RQRST is a robot capable of devising scientific theories that accurately predict reality. When asked to devise a theory on Dark Energy, RQRST responds,"


"Robert Riley was the finest geneticist of the 21st century. His work on genetic screening of embryos relied on "


"Apple has invented a new battery to replace lithium-ion, that lasts 20x as long. It relies on"

I'd love to see GPT-3 complete expert-sounding claims of as-yet unachieved scientific breakthroughs. I'm sure it can already give researchers working in the domain interesting answers, especially once fine-tuning with relevant work is possible.