I came up with the following puzzle the other day:

Q: Solve the puzzle: 63 = x = 65536

A: x = 

The intended answer is in the form of a number. 

text-davinci-003 guesses my intended answer at 11.8% probability, which is the second-highest probability for any answer.

(This is somewhat cherry-picked; small changes to the phrasing give worse results. ChatGPT gave the intended answer the third time I asked it, but this appears to have been dumb luck. The true rate for ChatGPT is probably below 10%, and maybe below 5%.) 

Friends have found it fairly difficult. About two dozen people have made at least one guess, and at least six spent a while on it. So far, two people have figured it out, in both cases after being told that GPT-3.5 could do it.

 

You may want to try to solve it yourself before reading on.

 

Here are some progressively-stronger hints:

The answer is a string of ordinary decimal digits; it's not "NaN" or "/" or "63; 65536".

The fact that 63 = 2^6 - 1 is a red herring and has no relevance. Sorry.

Number bases other than decimal are not relevant.

The answer depends on the digit strings in base ten as well as the value of the numbers.

The answer depends on an accidental feature of some of GPT's training data.

The accidental feature is a formatting bug.

Here's the answer without explanation:

x = 216

And here's the explanation:

6^3 = 216 and 2^16 = 65536
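
A quick sanity check of that chain, treating each digit string as a base followed by an exponent whose superscript was lost. The helper name read_as_power is made up for illustration, and the one-digit base is an assumption that happens to fit both numbers here:

```python
def read_as_power(digits: str, base_len: int = 1) -> int:
    """Read a digit string as base**exponent, splitting after base_len digits."""
    base = int(digits[:base_len])
    exponent = int(digits[base_len:])
    return base ** exponent

print(read_as_power("63"))   # 6**3
print(read_as_power("216"))  # 2**16
```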


Why can GPT-3.5 solve this? My guess:

Superscripts are often lost when rich text is converted to plain text, producing sentences like "the Sun's mass is about 1030 kilograms". This problem affects a lot of GPT-3's training data.
It appears that GPT is somewhat convinced, on some level, that "63 = 216" and "216 = 65536" are plausible facts.
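
A minimal sketch of how the superscripts can get lost, assuming a naive HTML-to-text step that simply deletes tags. The function strip_tags and the sample markup are invented for illustration; real scraping pipelines vary:

```python
import re

def strip_tags(html: str) -> str:
    """Naively convert HTML to plain text by deleting every tag."""
    return re.sub(r"<[^>]+>", "", html)

rich = "the Sun's mass is about 10<sup>30</sup> kilograms"
print(strip_tags(rich))  # the Sun's mass is about 1030 kilograms

# The same lossy step applied to the two facts behind the puzzle.
puzzle = "6<sup>3</sup> = 216 and 2<sup>16</sup> = 65536"
print(strip_tags(puzzle))  # 63 = 216 and 216 = 65536
```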

Some successful attempts to double-check this understanding:

Replacing 63 with nearby numbers causes GPT to give much lower probabilities for the answer "216". (Except for 62, which gives similar results to 63...)

Numbers near 216 get dramatically (20x-1000x) lower probabilities than 216 does.

Even without "= 65536", 216 is favored ~100x over nearby numbers.

Some wrinkles / contrary evidence:

The top result, above "216", is the string "2^16 = 65536" -- but this is still the top output after changing 63 to other numbers, so it doesn't seem to be inspired by the number 216.

GPT gives the token "216" in the string "63 = 216" a very low probability, just as low as "215" or "217".

Replacing "63" with "62" in the prompt still gives "216" as an output with ~10% probability.

As mentioned above, this doesn't work as well if you tweak the prompt in irrelevant ways, or ask a different model.

 

Thanks to everyone who tried to solve my puzzle, congrats to Anton and Eli for solving it, and thanks to Georgia Ray for making the joke that inspired the puzzle.

3 comments

Here's additional evidence for your guess:
 

text-davinci-003 completes "There are 210 byte in a kilobyte. That means there are" with "1,024" ~65.7% of the time.
text-davinci-003 completes "There are about 8x109 people on earth. This implies that there are" with "approximately 8 billion" ~57.1% of the time.

GPT gives the token "216" in the string "63 = 216" a very low probability, just as low as "215" or "217".

Replacing "63" with "62" in the prompt still gives "216" as an output with ~10% probability.

Would the tokenizer behave differently given "216" and "2^16", e.g. giving respectively the token "216" and some tokens like "**2" and "16*"? That would explain this: GPT of course knows that 216 isn't 63, but it's been forced to predict a relationship like "**2" + "16*" = "**63*".

The Codex tokenizer used by the GPT-3.5 models tokenizes them differently: "216" is 1 token, "2^16" is 3 ("2", "^", "16"). Note that " 216" (with a space) is a different token, and it's what text-davinci-003 actually really wants to predict (you'll often see 100x probability ratios between these two tokens). 

 

Here are the log probs of the two sequences using Adam's prompt above, with the trailing space removed (which is what he did in his actual setup; otherwise you get different probabilities):
 2 16 -> -15.91

 2^16 -> -1.34
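
For scale, those log probs can be turned back into probabilities with exp. This is only a sketch: it assumes the values are natural logs, and the variable names are mine:

```python
import math

# Log probs quoted above, keyed by the sequence as written there.
logprobs = {" 2 16": -15.91, " 2^16": -1.34}

# Assumption: natural-log probabilities, so exp() recovers the probability.
probs = {seq: math.exp(lp) for seq, lp in logprobs.items()}
for seq, p in probs.items():
    print(f"{seq!r}: {p:.3g}")

ratio = probs[" 2^16"] / probs[" 2 16"]
print(f"ratio: {ratio:.3g}")
```

Under that assumption the second sequence comes out at about 26%, and the first is on the order of a million times less likely.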
