[Thanks to Jacob Falkovich, Eric Neyman, and my LessWrong reviewer, among others, for their feedback on earlier drafts.]

Initially, Gödel, Escher, Bach comes across as a perplexingly well-regarded conspiracy theory text. But reading on, you come to see the magic: all of the conspiracies are actually true. Gödel numbering actually is just like RNA translation, and recursive transition networks really are similar to renormalization of elementary particles. Who knew? GEB author Douglas Hofstadter did, and he wrote a 700-page exploration of the ideas behind Gödel’s incompleteness theorem so that you could too.

GEB has two parts. Part I is an exposition of many interesting and deeply related ideas: formal systems like math and physics acquire meaning by modeling the world; recursion gives these systems power but also enables self-reference; and self-reference ultimately poses a serious problem for these systems. These ideas build to the statement and proof of Godel’s Incompleteness Theorem. Part II, roughly speaking, claims that the ideas of part I have something to do with artificial intelligence and the nature of consciousness. 

This “book review” is really an in-depth explainer of the key ideas in GEB part I. That said, I’ll also briefly touch on part II at the end. 

Before I start, let me tell you some things that won’t be in this review because you really can’t get them from anywhere but GEB itself.

First, this review will feature very few of Hofstadter’s actual words. The reason is simple: there’s way too many of them. In a previous draft of this review, I tried quoting out of GEB for a few simple things, but it would always turn out like “Hofstadter thinks humans are sometimes different than machines: [300 word quote that somehow essentially involves an analogy about how you think your wife wants you to turn off the TV, but she wants you to start a government-overthrowing revolution] (page 37).” 

Second, this review will leave out the fascinating interconnections Hofstadter draws throughout the text. Gödel numbering really is just like RNA translation, I promise, but if you want to know why you’ll have to check out GEB from your local library, sorry.

And third, GEB is really idiosyncratic in a way no one can imitate. The book’s chapters are separated by entertaining Carrollian dialogues which illustrate key ideas that reappear later in the text, imitating the way themes reappear in Bach’s fugues. Hofstadter has an axe to grind with Zen Buddhism, and the first application of a formal logical system he develops in the text is to refute a Zen koan about grinding axes. He also enjoys taking pot shots at composer John Cage for basically no reason.

Overall, I think GEB is a really good book. In fact, I insist that it’s better than you expect even after taking my insistence into account. Eliezer Yudkowsky, on whom GEB was an early influence, once wrote:

Gödel, Escher, Bach by Douglas R. Hofstadter is the most awesome book that I have ever read. If there is one book that emphasizes the tragedy of Death, it is this book, because it's terrible that so many people have died without reading it.

So, lest you die without learning why Gödel numbering is just like RNA translation familiarizing yourself with GEB, let’s get started.

I. Formal systems and interpretations

[If you are already familiar with formal logic, I recommend only skimming this section. If you’re already familiar enough with formal logic that you understand the statement and proof of Gödel’s incompleteness theorem, then probably just skip straight to section VI.]

The basic object of study in GEB is what Hofstadter calls a formal system. A formal system consists of:

  • A collection of allowable characters out of which we can form strings (sequences of characters)
  • A collection of strings called "axioms"
  • A collection of rules, or "inference rules," for changing some strings into others

Huh? Let's start with a simple, meaningless example called the MIU-system.

The MIU-system:

  • Allowable characters: M, I, and U. (So strings are things like M, UMM, MIMMIUM, UMIIMUMUUIMIM, etc.)
  • Axioms: MI
  • Rules:
    • Rule I: given a string that ends in an I, you can add a U to the end.
      • Example: from UMI, form UMIU
  • Rule II: given a string of the form Mx where x consists of M’s, I’s, and U’s, you can form the string Mxx
    • Example: from MIU, form MIUIU
  • Rule III: given any string with III appearing somewhere inside, you may replace III with U
    • Example: from MIIII, you can form MUI (by replacing the middle III with U). You can also form MIU (by replacing the ending III with U).
  • Rule IV: given any string with UU appearing inside, you may delete UU
    • Example: from MUUI, form MI

Let's call a string a theorem if you can produce it from the axiom MI using the inference rules. For example, I claim that MUIIU is a theorem; in support of this I offer the following "proof":

(1) MI       (axiom)
(2) MII      (using rule II)
(3) MIIII    (using rule II)
(4) MIIIIU   (using rule I)
(5) MUIU     (using rule III)
(6) MUIUUIU  (using rule II)
(7) MUIIU    (using rule IV)

There you have it – MUIIU is a theorem (as are all of the strings obtained along the way).

Hold on, axioms? theorems? Readers who've seen some mathematical logic might see where this is going.

The terminology is chosen to suggest the following. We imagine that the given rules are "rules of logical inference," analogous to rules in classical logic like "given ‘P’ and ‘if P then Q,’ you may conclude ‘Q’.’" We imagine that the strings of our system are logical statements written in some formal language. And we imagine that the axioms are some logical statements that we assume to be true. So the "proof" above is akin to starting from a known axiom and using the rules of logical inference to deduce some desired theorem, sorta like a proof! Formal systems are a way of mechanistically codifying logical reasoning; one could easily write a program that starts from the axioms and recursively applies the inference rules to produce an ever-growing list of theorems. In fact, this is a very basic model for what automated theorem-provers like Coq do.

After introducing the MIU-system, Hofstadter offers the following puzzle, which I pass on to you:

 Question: Is MU a theorem? 

Try to figure it out yourself if you'd like, or read on to find the answer later.

In this example, the MIU-system doesn't seem to reflect the structure of anything we would care about. In contrast, the next example-and-half do: they are meant to model multiplication of natural numbers.

The tq-system:

  • Allowable characters: t, q, -
  • Axiom: -t-q-
  • Rules:
    • Rule I: given a string xtyqz where x,y,z are strings consisting of only hyphens, you can form x-tyqzy
    • Rule II: given a string xtyqz where x,y,z are strings consisting of only hyphens, you can form xty-qzx

Unlike the MIU-system, the tq-system comes with an interpretation which converts strings of the formal system into meaningful statements in some context. In this case, the context is “multiplications,” and the interpretation looks like

t  ⇒ times
q  ⇒ equals 
-  ⇒ one
-- ⇒ two

and so on. This interpretation turns the axiom -t-q- of the tq-system into the multiplication “one times one equals one” and the theorem --t---q------ (proved below) into the multiplication "two times three equals six.”

(1) -t-q-          (axiom)
(2) --t-q--        (rule I)
(3) --t--q----     (rule II)
(4) --t---q------  (rule II)


We can think of an interpretation as giving meaning to a formal system. Uninterpreted, --t---q------ is a meaningless string of characters, same as the strings of the MU-system. But equipped with the interpretation above, this string comes to mean the multiplication “two times three equals six.” An analogy: to a child ignorant of the world, a globe is just a spinning toy covered with meaningless pictures. But once the child learns that pictures on the globe (the formal system) represent (interpret to) masses of land on the actual Earth (the context), aspects of the globe start to carry meaning – the fact that the splotch of green labeled “Asia” is larger than the one labeled “Australia” corresponds to the continent Asia having a larger land-mass than the continent Australia. (Note that the formal system-interpretation-context relationship is very similar to the map-territory relationship.)

Liberation, by M.C. Escher. Strings in formal systems (the triangles at the bottom) are transformed into meaningful statements (the birds) via interpretation.

At this point, three caveats are in order.

First, you should not think that a formal system necessarily has only one interpretation. For example, here’s another interpretation of the tq-system, now into the context of divisions:

t  ⇒ equals
q  ⇒ divided into 
-  ⇒ one
-- ⇒ two

and so on, so that --t---q------ now interprets to “two equals three divided into six.” In a case like this, it’d be a mistake to argue about what the “true meaning” of --t---q------ is; the correct takeaway is that both meanings are encoded simultaneously. Even this simple example of a double-entendre is somewhat interesting: it demonstrates that the structure of multiplications is “the same” as the structure of divisions (borrowing a word from mathematics, Hofstadter would say that multiplications and divisions are “isomorphic”). 


The cover art is a real photograph of two carved blocks of wood. Depending on which interpretation (angle of the light) you use, you can pull three different meanings out of each block.

Second, not all strings of the tq-system come out meaningful under interpretation. The tq-system also contains strings like ttq-t which don’t correspond to any multiplication. Let’s call a string well-formed if it does carry meaning under our choice of interpretation. This includes strings like -t-q-- which do mean something (one times one equals two) even though that something is false.

And third, all of the theorems of the tq-system are not only well-formed, but they also represent true multiplications. For example the theorems -t-q- and --t---q------ interpret to the true multiplications "one times one equals one" and "two times three equals six." (The well-formed string -t-q-- doesn’t, but that’s fine because it’s not a theorem.) This is really important, so let’s make it a requirement: if I call something an “interpretation” of a formal system, I will always mean that the theorems are well-formed and come out true under the interpretation.

For a counterexample, if we changed ‘-’ to mean “two,” then we wouldn’t have an interpretation anymore since the theorem -t-q- would represent the multiplication "two times two equals two," which isn't two – achem excuse me – true.

As a final half-example of a formal system, let's augment the tq-system so it can prove theorems representing statements like "6 is composite."

The tqCP-system:

  • Allowable characters: t,q,-,C,P
  • Axioms: same as tq-system
  • Rules: same as the tq-system, plus
    • Rule III: given a string xtyqz where x,y,z consist of at least two hyphens, you can form Cz

The interpretation I intend for the tqCP-system into the context of “arithmetical statements” looks the same as the tq-system, plus:

Cx ⇒ x is composite
Px ⇒ x is not composite (or equivalently, x is prime)

What’s up with having a P when the inference rules don’t allow it to appear in theorems? More on that later.

II. Jumping out of the system

I claimed above that the given interpretation of the tq-system was valid, i.e. that it transforms theorems of the system into true multiplications. How do I know that? Sure I gave two examples, the theorems -t-q- and --t---q------, but how can I be sure that every one of the infinitely many theorems of the tq-system interpret to true multiplications?

I’ll argue like this. First of all, the axiom -t-q- interprets to a true multiplication (one times one equals one). Second, we note that given a string xtyqz which represents a true multiplication (x times y equals z), rule I produces a string which represents a true multiplication ((x plus 1) times y equals z plus y). Same goes for rule II. As our axioms are true and our rules of inference preserve truth, all of our theorems must be true as well!

Where did the reasoning in the last paragraph take place? It certainly wasn't a proof "inside the tq-system," since those proofs just look like lists of tq-strings which obey the inference rules. Rather, it was an example of "stepping outside of the system." We reasoned about the tq-system using ordinary reasoning, not the internal formal logic of the tq-system. After all, the system doesn’t “know” about the interpretation we've given it – that is, our choice of interpretation has no bearing on tq-system’s axioms or inference rules, which are the only things that determine which strings are theorems. So we can't possibly hope to prove the validity of the interpretation by working within the tq-system. We had to step outside.

Here's another example of stepping outside the system. We just saw that every theorem of the tq-system represents a true multiplication. In fact, the converse is also true, namely that every true multiplication is represented by a theorem of the tq-system! If you're interested, you may wish to prove this – it will require stepping outside the system. Then, using this observation, you can derive theorems of the tq-system "from the outside." For example, since ---t---q--------- represents a true multiplication, we know that it must be a tq-theorem. Again, this isn't a "proof" in the formal sense, because a proof is a sequence of tq-strings produced by applying rules. It is a proof (in the informal sense) from outside the system.

Hofstadter points out that jumping outside the system is an important feature of intelligence. Before I introduced the tq-system I told you what my intended interpretation was. But even had I not, it's very likely you would have discovered it after a few minutes writing down tq-theorems. Instead of mindlessly churning out an ever longer list of theorems, you would instead gradually notice the patterns, put down your pencil to think, and discover that you can predict what all the tq-theorems are without writing them down. These are all outside-the-system activities.

Even now, you’re likely making frequent jumps out of your “reading this book review” system. Perhaps you’re pausing to check if you’re thirsty or need to go to the bathroom. And perhaps now you’re asking yourself if it counts as jumping out of the system if I just told you to do it. And maybe you’re now trying to do something I didn’t tell you to do just to prove that you really can jump out of the system. (sorry)

Contrast this with the behavior of a graphing calculator running a basic program that prints out a list of tq-theorems. The graphing calculator will never stop executing its code, step back to survey the data, notice the pattern, and print out IT'S THE MULTIPLICATIONS YOU DUMMY. Of course a human is ultimately some program, albeit a very complicated one running on an extremely powerful computer. Accordingly, there is some system out of which we are unable to step, the same way biological evolution is unable to step back, take a look at the data, and shout into the void JUST KEEP MAKING MORE CRABS YOU DUMMY. The point isn't that human intelligence is "special" in some way that purely mechanistic reasoning can never replicate. The point is simpler: intelligent systems seem to be able to identify and run subtasks, as well as to monitor these subtasks from the outside and determine when to stop doing them.

III. Truth vs. provability

"Snow is white" is true if and only if snow is white.

– Alfred Tarski

At this point, it’s possible you’re mixing up the notions of truth and provability. If so, don't feel bad: so did literally everyone for the whole history of logic until 1930. That's when German logician Kurt Gödel announced his namesake Incompleteness Theorem, a consequence of which is that truth and provability really must be considered as separate notions. A goal of GEB part I, and of this book review, is to outline the key ideas going into this theorem. In this section I'll explain the distinction between truth and provability and state Gödel's theorem. But first, a story about Gödel.

Life in late 1930s Europe wasn't treating Gödel well. For one, he was unable to find an academic position because he had too many Jewish friends (a common side-effect of being a mathematician). And to make matters worse he had been conscripted to the German army. So Gödel did the logical thing: he fled to the U.S., got a position at Princeton, and hung out with his buddy Albert Einstein (who confessed that the only reason he showed up to work was for "the privilege of walking home with Gödel.") While studying for his U.S. citizenship exam, Gödel claimed to have discovered this one weird trick for legally turning the U.S. into a dictatorship (anti-fascists hate him!). Despite Einstein's warnings to Definitely Not Bring That Up, Gödel totally Brought It Up during his citizenship interview. Fortunately, Einstein, who was there serving as a witness and also knew the interviewer, managed to smooth everything over. Gödel became a citizen. I'm not sure what the moral is, but hopefully this gives you a taste of the mind of Kurt Gödel.

Okay, back to the logic textbook masquerading as a book review. A good way of thinking about the truth/provability distinction is that provability comes from the formal system and truth comes from the interpretation+context.

Provability is simpler, so let's tackle it first. Calling a string in a formal system provable is just a fancy way of calling it a theorem. That is, “provable string” and “theorem” are synonyms. This should make sense: remember that "theorem" just means something you can deduce from the axioms using the inference rules, i.e. something you can "prove."  For example, the strings -t-q- and C------ are provable in the tqCP-system, but -t-q-- is not. In the MIU-system, MI and MUIIU are provable but (spoiler!) MU is not. Note that provability is a purely formal notion, i.e. it depends only on the formal system and not on whatever interpretation you attach to it.

Truth on the other hand relies on a choice of interpretation. Given a formal system with an interpretation, we say that a string of the system is true if it comes out true under the given interpretation. For example, --t---q------ is true because two times three does equal six, but P---- is false because four isn’t prime. We can't say whether MIII or MMU are true because we don't have an interpretation for the MU-system in mind.

Since by fiat all of our interpretations translate theorems to true statements, we know: 

in a formal system with an interpretation, all provable strings of the system are also true. 

Or more succinctly: if provable then true. This is really important: it's why mathematicians and physicists can write funny symbols on paper, do logic on them, produce some new arrangement of funny symbols, and be confident that the new symbols actually tell them something true about the universe!

You might be tempted to believe the converse: that every true statement in a formal system is also provable. (Or at least, you might have been tempted to think that if I didn't have a whole section titled "truth vs. provability".) But consider the string P-- of the tqCP-system, which interprets to "two is prime." This string is certainly true, since two is prime. But it is not provable in the tqCP-system – in fact, none of the rules of the system allow you to produce a theorem with the character P.

You're probably thinking that this demonstrates that the tqCP-system is bad in some way, or at least woefully incomplete. Perhaps you're tempted to augment the tqCP-system by adding a new rule: if for some x consisting of only hyphens, Cx is not a theorem, then Px is a theorem. But there's an issue here: applying this rule requires making a list of all (infinitely many) theorems of the tqCP-system and checking that Cx is not among them. But this is not the sort of simple, mechanistic rule that our formal systems are allowed to have – you could never finish writing down all the theorems and checking that C-- is not among them. You might be able to show from outside the system that C-- is not a theorem, but such "outside the system" reasoning has no bearing on provability inside the system.

Fear not: Hofstadter does explain a way of augmenting the tqCP-system to be able to prove statements like P-- (though it requires adding new characters as well as new rules). So now can we admit that the tqCP-system is no-good, and that we should root out all formal systems that can’t prove all their truths and replace them with ones that can?

If only! Gödel doesn't tolerate citizenship interviewers ignorant about weird tricks for making dictatorships, and he won't tolerate our nonsense either. Here is his theorem.

Gödel's Incompleteness Theorem: any sufficiently rich formal system, together with an interpretation, has strings which are true but unprovable.

A formal system which is able to prove all of its truths is called "complete." So Gödel's theorem says that every sufficiently rich formal system is incomplete – there will always be unprovable truths. What does “sufficiently rich” mean? It means “expressive enough to permit certain types of self-reference”; more on that in the next section.

V. Self-reference and the proof of Gödel's theorem

Suppose that I were to walk into your room and say:

This sentence is a lie.

As I’m an untrusted stranger, you might suspect that I’m lying. But if that were the case, then "This sentence is a lie" would be the truth, so I’d be telling the truth ... a contradiction! Likewise, if you suppose I’m telling the truth, you'll come to find that I’m lying, another contradiction. (And at this point you should step out of your “solve logic puzzle” subtask and initiate a “report intruder to the police” subtask.)

This is called the liar's paradox, and it’s the basic idea behind the proof of Gödel's theorem. The core of the issue is that we have a system (the English language) trying to model itself, and we’ve exhibited a sentence whose interpreted meaning references that very same sentence. This snake-eating-its-own-tail pathology can be arranged to create other similar paradoxes.

You might think that we can fix things like this with a simple rule like “no interpretation of a formal system can have the context be that very same system.” Unfortunately, things aren't so easy. Consider the following two-step version of the liar's paradox.

The German sentence below is false.

Der obige englische Satz ist wahr. ("The English sentence above is true.")

Here, the system "sentences in English" has an interpretation expressive enough that sentences in English are able to make claims about arbitrary sentences in German. But the same is true about the system “sentences in German” being expressive enough to make claims about arbitrary sentences in English. And although each sentence by itself is perfectly harmless, the whole is paradoxical!

(Note that “sentences in English/German” are not really formal systems in the sense defined before, but they are similar in the sense that they are collections of strings which can be endowed with interpretations which give the strings meaning. So the above should be taken as a purely informal illustration of the ideas to be fleshed out below.)

Drawing Hands, by M.C. Escher, an illustration of the two-step liar’s paradox.

Part of the issue is that English is too rich. That is, it's able to talk about concepts like "truth" and "falseness" as well as support self-reference. It’s also rich enough to model systems (like German) which are themselves rich enough to model English, enabling the two-step liar’s paradox. These aren't issues that are easy to patch; it's not clear how much of English we would need to remove to make it "not too expressive." Perhaps in doing so, we would destroy our ability to say anything useful.

English is too fuzzy to work with, so instead Gödel works with statements of number theory – things like "two plus two equals four" and "four is a factor of eight." It ends up that while number theory isn’t expressive enough to talk about the truth of number-theoretical statements, it is expressive enough to talk about the provability of number-theoretical statements. 

(I won’t cover it further in this review, but the deeper reason that number theory is expressive enough to talk about provability is that it is able to support recursion (the key fact being that “theorems” are strings which arise from recursively applying rules). Going deeper, the reason number theory can do recursion is because it has an axiom of induction.)

The idea of Gödel's proof is to encode a “provability version” of the liar's paradox into number theory. That is, given a formal system rich enough to model number theory, Gödel comes up with a string G of the system whose meaning under interpretation is: 

G is not provable.

If G were false, then G would be provable and hence true, a contradiction. So G must be true, making it an unprovable truth. It follows that the formal system in question is incomplete.

The rest of this section fleshes out this idea in more detail, using an idea called Gödel numbering. I think it’s pretty cool, but if it’s not your cup of tea, feel free to skip to part VI.

As a warm up, recall the MU puzzle from above: determine whether MU is a theorem of the MIU-system. I will now demonstrate that the answer is "no – MU is not a theorem." The idea is to encode "MU is a theorem of the MIU-system" as a claim about number theory, and then figure out if that claim about number theory is true.

To do this, let's first transform strings of the MIU-system into numbers by the rule:

M ⇒ 3
I ⇒ 1
U ⇒ 0

For example, MIUUI is the number 31001, MU is the number 30, and the axiom MI is the number 31. Under this transformation, the rules of the MU-system can be stated arithmetically. For example, rule I says that if a number has units digit 1, then you may multiply it by 10 (thereby appending a 0 to the end). Or more formally:

given a number of the form 10m + 1, you may form the number 10*(10m + 1).

You can do the same thing for the other rules too.

Let's call a number which corresponds to a theorem of the MIU-system a MIU-number. So we've transformed the claim "MU is a theorem of the MIU-system" to the equivalent claim "30 is a MIU-number," which can also be stated as “30 can be formed from 31 by repeatedly applying such-and-such arithmetical operations.” This might not seem like progress, but it is! The claim "30 is a MIU-number" is a number theoretical statement (though perhaps not an interesting one). In essence, it’s similar to – but more complicated than – the more familiar statement “216 is a power of 6” i.e. “216 can be formed from 1 by repeatedly applying the multiply-by-6 operation.” 

Now we can dispose of the MU puzzle by proving a proposition about MIU-numbers:

Proposition: No MIU-number is divisible by 3.

I’ll leave the proof to you – it’s not hard, especially if you remember the rule for checking whether a number is divisible by 3.

Since MU corresponds to 30, which is divisible by 3, we deduce that 30 is not a MIU-number. Hence MU is not a theorem of the MIU-system, and we’re done. If you're rightly baffled, you can press pause on your computer screen now to reflect deeply on what’s happened. You can press play when you're ready to resume the proof of Gödel's theorem.

The procedure above turned strings of the MIU-system into numbers, and claims about those strings into statements of number theory. This is called Gödel numbering, and it can be done to any formal system. Via Gödel numbering, the claim "MU is a theorem" about the MIU-system corresponds to the number-theoretical claim "30 is a MIU-number.” In other words, despite the MIU-system having no interpretation that gives its strings meaning, Gödel numbering gives number-theoretical meaning to certain claims about the MIU-system.

Could something interesting happen if we Gödel number a system that already has an interpretation into number theory? Could the meaning acquired through the interpretation clash with the meaning induced by Gödel numbering? Could my rhetorical questions get any more leading? Is the answer to all of these yes?

In GEB, Hofstadter spends two chapters constructing an explicit example of a formal system that models number theory, called Typographical Number Theory, or TNT (foreshadowing that it will blow itself up). For the sake of concreteness, he then proves the Incompleteness Theorem for the system TNT. Nevertheless, the same proof works for a general formal system S with an interpretation into number theory, and I’ll explain it here in this more general language.

(here comes the technical meat; please set your brains to “think very hard”)

Suppose we are given a formal system S with an interpretation into number theory. And suppose that the formal system is "rich enough" in the sense that any statement about number theory can be rendered as a string of S. We want to show that S has an unprovable truth. Fix a Gödel numbering for S, i.e. a correspondence between characters of S and digits which turns all the strings of S into numbers and all the rules of S into arithmetical rules. As before, let's call a number an S-number if it corresponds to a theorem of S.

Given a string G of the system S, let g be the number corresponding to G under the Gödel numbering. Now, "G is not a theorem of S" is equivalent to the number-theoretical claim "g is not an S-number.” But the number theoretical claim "g is not an S-number" can in turn be rendered as a string of S (as can any number theoretical claim, by assumption). Let's call this string G'.

In a situation like this, Gödel gave a magic recipe (or see chapters 13 and 14 of GEB) for cooking up a specific string G such that the resulting G' is the same as G. Thus, this G interprets to the statement “g is not an S-number,” which is true if and only if G is not a theorem of S. Informally, we might say that G carries the meaning “G is not provable in S.” And now we’re done: if G is false, then G is a theorem of S, and is therefore true, a contradiction. So G is true, and thus G is not provable. Thus G is an unprovable truth and S is incomplete. Q.E.D.

I'll end this section with an exercise for those interested: how is this proof like the proof of undecidability of the halting problem? (For solutions, please consult Gödel, Escher, Bach by Douglas Hofstadter.)

Fine, I’ll tell you about the RNA transcription = Gödel numbering thing. But if you think this is the extent of the analogy, then let me tell you about this one book by Douglas Hofstadter.


As a math grad student, I’m not a bad person to write a book review of GEB part I. On the other hand, I’m vastly unqualified to say anything about GEB part II. Buuuut I’ll say a bit anyway. 

My executive summary of GEB part II is: you know all those cool ideas about self-reference, meaning, etc. in part I? Those all have something to do with intelligence and consciousness.

This is obviously a really bad summary. One issue with it is that large chunks of part II are devoted to completely uncontroversial topics like a short history of computing and artificial intelligence, an explainer of Gödel’s “magic recipe” mentioned above for producing the string G, and a refutation of incorrect takeaways from the incompleteness theorem like “artificial intelligence is impossible.” 

That said, it’s pretty clear that Hofstadter’s main goal for part II is the stuff about intelligence and consciousness; all the other topics are tangential. My overall sense is that Hofstadter’s most interesting ideas here have not aged well since GEB’s 1979 publication. Here’s a sampling of his claims:

  1. While he avoids positing the existence of a so-called "grandmother neuron" – that is, a neuron whose sole job is to fire whenever you need to make use of the concept "grandmother" – Hofstadter does seem to think that something kinda like this is true: that there is a "grandmother module" in the brain – perhaps a collection of neurons – which activates whenever you think of a grandmother (and doesn’t activate for other things).
  2. Hofstadter seems to believe that the way we think thoughts is for all of our various modules to fire together in roughly the same way that a bunch of words are said together to form a sentence. E.g. the thought "My grandmother is happy" boils down to the modules in your brain representing "grandmother" and "happy" activating together, along with some additional information to specify that it is “my grandmother” instead of just “a grandmother” and things like that.
  3. His paradigm of (artificial) intelligence seems to involve intelligent systems constructing formal systems which model the problems they’re trying to solve, and then solving the problems by working symbolically with the formal system. More specifically, he seems to imagine that intelligent systems: construct a network of concepts and their interrelations (similar to what WordNet does for languages) in which the concepts are internally represented by modules like the grandmother module above; then treat the modules like symbols in a formal system, manipulating them according to certain rules, thereby modifying the relations between them.
  4. Per Hofstadter, consciousness arises in systems which support self-reference and are able to self-modify in certain ways. Examples: computers which are able to edit their source code, neural nets in which the current weights determine how the weights are modified, and human brains in which higher-level abstractions affect the way we think about lower-level abstractions and vice versa. Hofstadter calls these self-referential systems “tangled hierarchies” and the way in which they self-modulate “strange loops.”

Idea 1 seems right when it comes to visual processing. At a low level, we’ve identified particular neurons that fire together in certain ways to encode information about the orientations of lines., which is kinda like a low-level version of a grandmother module for the concept of vertical-ness or whatever (though apparently predictive processing has another take on what information exactly is being represented). At a medium level, there are neurons that selectively fire when you see pictures of faces, animals, or scenes. At a high level … it’s unclear. Some people interpret the work of Quiroga et al as demonstrating the existence of a “Halle Berry” neuron, but per Wikipedia, Quiroga himself disputes this interpretation. But we do know that high-level modules like this exist in some artificial neural networks: OpenAI has found individual neurons in their CLIP neural net which seem to correspond to concepts like “Spiderman,” “winter,” or “West Africa.” It perhaps shouldn’t be so surprising that modules like this exist for visual processing – you can experience them activating firsthand when you look at a blurry picture and it all-of-a-sudden resolves into a face.

Beyond visual processing, things aren’t so clear. Your “face module” activates when you see a face, but does it activate when you think about the concept of a face? I’m personally skeptical. Our current understanding of GPT-3 is – I think – as an uninterpretable jungle of weights in which no single concept is localized to a particular part of the network, which seems like evidence against. But if there are major advances in AI interpretability, such that you can determine whether GPT-3 is thinking about a grandmother just by tracking the activation of a particular cluster of neurons, then that would be a different story.

Idea 2 – that thought arises from certain interactions among our hypothetical brain modules that mimic the interactions among words in sentences – goes further into murky territory. It first requires the existence of grandmother modules not just for visual processing but for cognition in general, which, again, seems incompatible with the uninterpretable jungle of neurons that comprise our best artificial minds. Then it’s an even stronger claim above-and-beyond that! If it were true, it would imply that advanced AI interpretability could not only detect the concepts that GPT-3 is utilizing, it could read GPT-3’s mind and write down in English what GPT-3 is thinking. Even if you interpret Hofstadter’s claims as being only about human brains, which are probably more interpretable than GPT-3, this stretches my credulity. Nothing here is impossible or definitively disproved, but it overall seems like a paradigm that’s a bad match for our modern understanding of the brain.

It’s hard for me to comment on idea 3 – the idea that intelligent systems work by constructing formal systems to model the world and then manipulating these formal systems to generate new data/predictions/whatever – because I don’t know how literally Hofstadter wants us to take him here. Again, my first instinct is to gesture frantically at GPT-3, which can write (and draw) better than most people I know, but fails at multiplying large numbers. If GPT-3 secretly works by constructing a super complex formal system that models the human-produced text in its training data, why didn’t it come up with a much simpler formal system (like the tq-system) for modeling the multiplications in its data?

That said, Hofstadter specifically addresses the question of whether AIs will be good at addition. He says they might not be. (Of course, this must be the right answer since I’m an intelligent system, I frequently fail to subtract 6 from 11, and there’s no reason to rule out that AIs could be the same; Hofstadter is aware of all this.) Hofstadter doesn’t seem to think this poses an issue for his model of how intelligent systems work, but it seems like it should? Did I just take all the stuff about grandmother modules and manipulation of symbols literally when he meant it all as an illustrative example? I’m honestly wondering if I misunderstood his point entirely.

In any case, I don’t think there’s a way of understanding Hofstadter’s ideas here that doesn’t involve Hofstadter’s model being pretty surprised by modern ML. Inventing better data structures for modeling the world and better procedures for refining and applying these models? Hah, that’s for chumps. Real intelligence comes from throwing ever-larger neural networks at data and letting them do uninterpretable nonsense with it.

Finally, idea 4 – that consciousness arises from tangled hierarchies and strange loops. You can think about this as a souped-up version of the idea that consciousness fundamentally arises from self-awareness (i.e. “you are dust which is aware that it is dust and is therefore more than dust”). Except Hofstadter replaces “self-awareness” with the more nuanced concept of a formal system being self-referential and self-modifying. I’m not sure this modification does anything to clarify the idea. Hofstadter clearly struggles to say exactly how strange loops lead to consciousness, and when he comes closest to addressing the issue he’s uncharacteristically brief, writing that conscious experience “comes from the vortex of self-perception in the brain.” Okay, cool, glad that’s settled. 

For now, I tentatively conclude that Hofstadter is playing the “Gödel’s theorem and consciousness are both mysterious and therefore equivalent” game. If there's some way to salvage Hofstadter's ideas in part II, someone other than me will have to write the book review doing it. 

Relativity, by M.C. Escher. Like GEB, this drawing makes more sense if you only look at half of it.
New Comment
25 comments, sorted by Click to highlight new comments since:

I'd like to use this post to ask a question I've had for a while about the incompleteness theorem and the halting problem but never had a good chance to ask.

When I first heard that that any sufficiently rich formal system, together with an interpretation, has strings which are true but unprovable or that it is not possible to determine whether a program halts, this felt like an amazing and important discovery. But when I think more about the standard proof of the halting problem, or the hear this explanation of incompleteness theorem or the English "this sentence is a lie" example the importance seems to diminish.

Sure, I cannot in general determine whether a program halts but the only counter example is this weird recursive statement that I do not actually care about. Maybe I can determine whether all programs that I do care about halt and ignore this one weirdness. Similarly if the only true unprovable statements are of this form then maybe I can still have formal systems that are complete for all other statements (those that I care about).

At this point my confusion becomes more fuzzy and I'm not sure how exactly to put it into words. It feels important that for the halting problem many systems turn out to be unexpectedly turing complete. Is one answer that it is impossible to determine whether a program is equivalent to the one used in the halting problem proof?

[Epistemic status: I think everything I write here is true, but I would feel better if a passing logician would confirm.]

Great question! The important thing to keep in mind is that truth only makes sense relative to an interpretation. That is, given a statement S in your system, you can ask whether it's true in a particular interpretation, but it might be true in some interpretations and false in others. 

Here's a fun fact: S is true in every interpretation of the system if and only if S is provable. (Proof: clearly if S is a theorem then it is true in every interpretation. Conversely, Godel's completeness theorem says that if S is true in every interpretation then it is a theorem.)

So when you ask for "naturally occurring" true-but-unprovable statements, do you want

  1. a statement which is true in every interpretation but unprovable? or
  2. a statement which is true in some interpretation but unprovable?

If it's (1), then the fun fact above says that no such statements exist. If you want (2), then by taking the contrapositive of the fun fact, you'll see that this is the very same as asking for a statement S such that neither S nor 'not S' is a theorem! In other words, you're looking for a statement which is independent of your system. And you probably already know of oodles of these: the axiom of choice is independent of ZF set theory, the continuum hypothesis is independent of ZFC, large cardinal axioms, etc.


I have a feeling that this purely mathematical explanation may have done nothing to cure your core confusion, so let me write something else here which I think strikes more at the heart of the question. We often like to think that the integers exist; that there's a God-given copy of the integers sitting out there in metaphysical space, not a formal model, but the real thing which the axioms are trying to capture. By the discussion above, we know that no matter what axioms we impose, we'll never be able to fully capture the integers (because any formal system will always have unproveable truths, and therefore multiple interpretations, and therefore allow for models of the integers with are different than the real ones). 

So when you ask for unproveable truths, maybe you're not asking for an unproveable statement which is true in some interpreation, maybe you're asking for an unproveable statement which is true in the one true interpretation. That is, you want a statement which is true about the actual integers, the real out-in-the-world number system that we're trying to model, but unproveable. I don't know of any "naturally occurring" statements like that off the top of my head (though some people think that the Goldbach conjecture might be one).

But if you replace "integers" everywhere in the last paragraph with "universe of sets" you can say very much the same thing. That is, we might intuitively believe that there is an actual universe of sets which the ZFC axioms are merely trying to model. And then you can take a statement independent of ZFC -- the continuum hypothesis say -- and ask "is this is true in the real, actual universe of sets?" If yes, then that's your unproveable truth. If no, then its negation is.

So yes, naturally occurring unproveable truths abound, at least in set theory (but depending on what you think the real universe of sets looks like, you might disagree with others about what they are).

By the discussion above, we know that no matter what axioms we impose, we'll never be able to fully capture the integers (because any formal system will always have unproveable truths, and therefore multiple interpretations, and therefore allow for models of the integers with are different than the real ones).

Assuming we use first-order logic, right?

Here is the part where I got stuck.. what exactly causes this attitude that if something cannot be accomplished using first-order logic, the reasonably thing is to give up? I assume there is a good reason behind that, but I have never heard it said explicitly, so to me it kinda sounds like "we found out that it is impossible to build a spaceship using wood, therefore there shall be no space travel", as opposed to trying to find a more suitable material for building spaceships.

[+][comment deleted]20

Thank you for this answer. You also helped make another point of confusion clearer: When I read the op I wasn't sure what the interpretation of number theory is. There must be one or the incompleteness theorem wouldn't apply but what is it?

Based on your comment here I understand this better and it makes more sense why people might disagree about for example the continuum hypothesis being "reasonable". While mostly everyone agrees on the standard ZFC axioms not everyone agrees on the interpretation of ZFC.

Then there must also be hardliners who take ZFC as the truth without interpretation and thus do not allow statements independent of it as they are not provable. It would be interesting to see what proofs they miss out on. (Not requesting anyone to list them for me, just thinking out loud.)

I think this interpretation is both intriguing and clarifying, thanks. It suggests that a good framing of the incompleteness theorem is:

  • Any sufficiently rich system S can be extended by multiple interpretations I, such that all of its provable theorems are true for all of I, but I contains interpretations who disagree on the truth of some unprovable statements of S

And if you try to narrow down the interpretations, you either don't get far enough -- because there are infinitely many possible interpretations, and no matter how many times you exclude half of them by adding a new axiom, still infinitely many possible interpretations remain -- or you introduce some new concept -- but then this new concept has its own multiple interpretations.

For example, if you try to make axioms for integers, it turns out that there are also some "nonstandard integers" which technically satisfy the axioms, even if they are obviously not the thing we meant. The traditional way out is to add an axiom like "the actual integers are the smallest set (from the perspective of set inclusion) of things that satisfy these axioms", but now you have invoked the concept of a set, and if you try to define what that means, you just open another can of worms... and this process never stops.

It is definitely impossible to (in general) determine whether a given program is equivalent to a specific Weird Program. This is a consequence of the halting problem itself!

I think the question about "statements I care about" is, at its core, a question about aesthetics and going to be kind of subjective. For example, does the above statement about not being able to prove the equivalence of programs qualify? (Or would it be non-interesting if one of the programs being compared were sufficiently weird?)

Another statement that might or might not qualify is of the form "the 8,000th Busy Beaver number is less than N" -- see The 8000th Busy Beaver number eludes ZF set theory. Though, admittedly, Yedidia and Aaronson did that one by asking whether a particular conjecture-counterexample-finding program halted, so maybe that's also too contrived for your aesthetics?

I feel like GEB has been diminished a bit by its own success. People reading it nowadays might go "what's the big deal?" A big theme is how the mind can be a machine and still do stupid stuff, which had to be spelled out in the 70s but has pretty much permeated the relevant subcultures these days. And of course Hofstadter didn't know a clear recipe for an actual AGI, so the speculative parts on that were left at the level of intriguing handwaving.

Totally agree that a major goal of GEB was to help people build intuition for things like "How could our minds be built out of things that follow purely mechanical rules?" that are now more well-accepted. With that in mind, there are two ways to interpret a lot of the stuff he writes about minds and artificial intelligence:

  1. "This model I'm presenting is literally how the brain works/how future AIs will work."
  2. "This model is a plausible idea about how the brain might work/how future AIs might work, presented here to build intuition, not necessarily because I think it's right."

I spent a long time trying to figure out which of these interpretations was intended, and my best guess was that it was different for different claims, but usually somewhere in the middle. The stuff about grandmother modules was more like (1) -- he was really trying to argue from first principles that grandmother modules must exist. A lot of the stuff about AI was more like (2), I think (but still with a little bit of (1), which is why I think he still ought to be a little surprised by modern ML). 


I'm actually very curious about to what extent GEB helped put ideas like "the mind is a machine" and "it's possible to create a thinking computer" into the water supply. Hofstadter's arguments for these things felt a little different than the standard arguments, so it never occurred to me that he could be partly responsible for the widespread acceptance of these ideas. Maybe GEB convinced a bunch of people, who eventually came up with better arguments? Or maybe GEB had nothing to do it, I honestly have no idea.

It may be interesting to read the ancient slashdot thread from where I first learned about GEB to see what the buzz around it was like 20 years ago.


I thought a lot of it showed the different little tricks that formal logic has. The concept of true and false doesn't exist in reality in any form. Things just are. A lot of representations and operations are things we use that are useful to create bigger useful logic statements. In a sense how we categorize intelligence is very human. This is why I think AGI will end up being human-like. We use these methods to help us organize reality into meaningful, to us, and practical representations that we recognize as individual concepts and entities mentally.

I have a very contrarian take on Gödel's Incompleteness Theorem, which is that it's widely misunderstood, most things people conclude from it are false, and it's actually largely irrelevant. This is why I was excited to read a review of this book I've heard so much fuss about, to see if it would change my mind.

Well, it didn't. Sam himself doesn't think the second half of the book (where we talk about conclusions) is all that strong, and I agree. So as an exploration of what to do with the theorem, this review isn't that useful; it's more of a negative example. But, as a distillation (of both the theorem and its proof), it's quite good. Sam manages to simplify the formalism significantly while leaving all the core ideas intact, and with math this complex, that's not easy. I don't know how much of the credit should go to Sam vs. the book, but either way, the result is excellent. In the future, this post will be the first thing I'll recommend for people who want to understand the theorem.

However, as-is, my recommendation will come with one caveat. Sam acknowledges that Truth is relative to interpretation whereas Provability is not, but he doesn't quite spell out what that means. Namely, given an arbitrary sentence in a formal language, if isn't provable (and hence the language is incomplete), that's always a good thing -- because it means is false under at least one interpretation. If were provable, that would be a disaster! It would violate soundness and thus make the entire thing useless. Hence I think this sentence:

A formal system which is able to prove all of its truths is called "complete." So Gödel's theorem says that every sufficiently rich formal system is incomplete – there will always be unprovable truths.

is misleading because the "unprovable truths" are only sometimes true, which means they better not be provable! This is precisely the point that I think leads to people misjudging the theorem's importance, and it's my only real gripe with this post.

I have a very contrarian take on Gödel's Incompleteness Theorem, which is that it's widely misunderstood, most things people conclude from it are false

That's not very contrarian on LessWrong. This point has been made a lot, including by Eliezer in Highly Advanced Epistemology 101 for Beginners

and it's actually largely irrelevant

This might be a contrarian position here. I'm not sure (I don't fully understand what Eliezer said in his posts about it).

About your caveat, I don't understand it, and would like to. I don't really know how to articulate my confusion about it though.. If you can elaborate or explain it differently I'd be glad.

Ok, here's an attempt. There are formal languages. They have many interpretations. You can add axioms to reduce the number of interpretations (as in, if you only count those that make all axioms true). If you add enough, there's only one interpretation left. This is always possible, but it may require an uncomputable set of axioms. In fact, there's a result which we could call the difficulty-of-restricting-interpretations-with-axioms-theorem that says that some languages always require uncomputable sets of axioms.

Put like this, it sounds like a pretty boring result, at least to me. But everything about the above description is accurate! The "Incompleteness" theorem is just a result about how many axioms you need to restrict the set of interpretations. Maybe there is a reason why this is important, but the post doesn't explain this, and I maintain that the sentence "there will always be unprovable truths" (which was the sentence I complained about) is heavily misleading. It's like a language trick; it sounds impressive because we usually think of truth as being absolute, but here it's relative. If we did use truth as being absolute (i.e., only call sentences true that are true under every interpretation), the claim is false; in fact, then every true statement is provable -- that's Gödel's Completeness Theorem.

Thanks for this review and reminding me of GEB. Definitely still the most important book I ever read. If I hadn't read it I think my life would have turned out very different indeed!

Why is there no Section IV? It's a delicious irony of reading a book review of Gödel, Escher, Bach on LessWrong that upon noticing that the Latin-numbered sections jump from 3 to 5 (III, V) I'm left wondering if that's simply a silly mistake or if the omission of section IV is some profound meta-self-referential in-joke on the part of the reviewer that I'm just not clever enough to see. 

I wish I could say that there was some sort of hilarious self-referential joke here, but actually I'm just bad at counting, oops. At this point I probably won't fix it for fear of ruining in-text section references. 

Thank you for this!  Another +1 for it being the single biggest influence on my thinking, and I am very happy that you've summarized it in this way - it won't remove the enjoyment of reading it, but it will probably remove 1.5 passes from the normal "read it 3 times over the course of a decade to really get it" advice.

I started reading GEB once, found the math parts relatively familiar and the consciousness parts gobbledygook, and never finished reading it. Dunno why it's so highly regarded - it has about as much crackpot energy as the average Stephen Wolfram blog post. The thing about consciousness being due to strange loops is also obviously wrong to anyone who has ever paid attention to a nonhuman animal. They obviously possess consciousness and experience qualia, but few of them seem to possess self awareness of the strange loop variety. I think it's most likely that consciousness is just "what computation feels like from the inside" and is ubiquitous in the universe.

I really enjoyed I am a Strange Loop, still haven’t read GEB yet but it’s on my bookshelf. I once email Dr. Hofstadter and he responded with one of his writings called his Stripped E’s Exercise. It was an autobiographical lipogram it was so awesome.

[+][comment deleted]10