This post is part of my Hazardous Guide To Rationality. I don't expect this to be new or exciting to frequent LW people, and I would appreciate comments and feedback in light of my intentions for the sequence, as outlined in the link above.
Student: Hmmm, let's see if I can remember how to integrate. So I know the x is a variable that I'm trying to integrate over. What's that "a" though? I remember it being called a constant. But what's that?
Tutor: A constant is just some number. It could be anything.
*The student disappears for a while and comes back with the following:*
Student: I did it!
Tutor: The fuc... um, how about you walk me through how you got that?
Curious. Here's another tale of confusion:
Once upon a time — specifically, 1976 — there was an AI named TALE-SPIN. This AI told stories by inferring how characters would respond to problems from background knowledge about the characters' traits. One day, TALE-SPIN constructed a most peculiar tale.
Henry Ant was thirsty. He walked over to the river bank where his good friend Bill Bird was sitting. Henry slipped and fell in the river. Gravity drowned.
Since Henry fell in the river near his friend Bill, TALE-SPIN concluded that Bill rescued Henry. But for Henry to fall in the river, gravity must have pulled Henry. Which means gravity must have been in the river. TALE-SPIN had never been told that gravity knows how to swim; and TALE-SPIN had never been told that gravity has any friends. So gravity drowned.
TALE-SPIN had previously been programmed to understand involuntary motion in the case of characters being pulled or carried by other characters — like Bill rescuing Henry. So it was programmed to understand 'character X fell to place Y' as 'gravity moves X to Y', as though gravity were a character in the story.[1]
For us, the hypothesis 'gravity drowned' has low prior probability because we know gravity isn't the type of thing that swims or breathes or makes friends. We want agents to seriously consider whether the law of gravity pulls down rocks; we don't want agents to seriously consider whether the law of gravity pulls down the law of electromagnetism. We may not want an AI to assign zero probability to 'gravity drowned', but we at least want it to neglect the possibility as Ridiculous-By-Default.
Computer science has a notion of "type safety". In a given language, there are different "types" of things, and any operation you can perform specifies which types it's allowed to act on. In Python,

`1 + 1`

is allowed, but

`1 + "hello"`

isn't. If you try to execute the second expression, you get a "type error": when "+" is given an integer on one side, it expects a number on the other, and "hello" is a string, not a number. A language is type safe to the degree that it catches and warns you of type errors.
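You can watch Python do this check at runtime. A minimal sketch (the helper `try_add` is just for illustration):

```python
def try_add(a, b):
    """Attempt a + b, reporting a type error instead of crashing."""
    try:
        return a + b
    except TypeError as e:
        return f"type error: {e}"

print(try_add(1, 1))        # 2
print(try_add(1, "hello"))  # type error: unsupported operand type(s) for +: 'int' and 'str'
```

Python only raises the error when the bad expression actually runs; statically typed languages like Rust or Haskell would refuse to compile it at all.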
Human language is not type safe. Saying that gravity drowned is completely valid English, and also isn't how reality works. The student's math derivation was also completely valid manipulation of English sentences, but invalid calculus. Human language is not completely detached from reality; it wouldn't be useful if it were. And still, it is not a given that valid English sentences form valid conclusions about the world.
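To make the analogy concrete, here's a hypothetical sketch of what it would look like if story-world concepts carried types the way Python values do — "gravity drowned" would get rejected before it was ever generated (the `Character`/`Force` classes are invented for illustration, not anything TALE-SPIN actually had):

```python
class Character:
    """Something that can swim, breathe, and have friends."""
    def __init__(self, name):
        self.name = name

class Force:
    """A physical law; not the kind of thing that drowns."""
    def __init__(self, name):
        self.name = name

def drown(victim):
    # Drowning only applies to characters -- a crude "type check" on the world.
    if not isinstance(victim, Character):
        raise TypeError(f"{victim.name} is not the type of thing that drowns")
    return f"{victim.name} drowned"

print(drown(Character("Henry Ant")))   # Henry Ant drowned

try:
    drown(Force("gravity"))
except TypeError as e:
    print(e)  # gravity is not the type of thing that drowns
```

TALE-SPIN effectively modeled gravity as a `Character`, so no check of this kind could ever fire.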
Here's a look at how the student did their calculus problem:
- "a" is a constant
- A constant is something that could be anything.
- If something could be anything, it could be "x^2"
- Integral of 0 is C
- C is a constant
- "a" is a constant
- C is "a"
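The chain breaks at the second step: "is a constant" is a predicate — a property a symbol can have — not an identity. As a sketch (assuming, purely for illustration, that the problem was $\int a\,dx$):

```latex
% Correct reading: "a is a constant" means a is fixed but unknown,
% so it comes out of the integral like any other number.
\int a \, dx = a x + C

% The student's final syllogism treats the predicate as an identity:
% two things sharing the property "constant" need not be equal.
\mathrm{const}(a) \;\wedge\; \mathrm{const}(C) \;\not\Rightarrow\; a = C
```

By the same token, "a constant could be anything" means the problem-setter could have picked any number — not that the symbol is free to vary, and certainly not that it could be $x^2$, which isn't a constant at all.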
Almost none of us would make the mistake TALE-SPIN did about gravity. And if you know calculus, you'd probably never make the mistake the student did. But if you don't know calculus, it's not readily apparent that the derivation is invalid.
Human language is incredibly overloaded. Even in a domain like math, which does in fact invent a shit ton of its own words, most of the words on the Wikipedia page for "category theory" are common English words.
Any given word you are likely to use for a specific concept is a word that also has all sorts of other meanings. A person parsing a sentence that contains words they are familiar with, but concepts they aren't familiar with, is inclined to try to understand the sentence in terms of the concepts they already know. Often this produces obvious nonsense, which they immediately reject, concluding that they don't know what the sentence means. But TALE-SPIN had no meta-level sanity check, so it spat out "Gravity drowned". In the case of the student, their false reasoning still led to the type of thing that is allowed to be an answer (had they concluded the integral equaled "walrus", they might have paused). That, plus their lack of calculus knowledge, meant they hit no red flags along the way.
It is entirely possible to see a thing in the world, attach a word to it, hop along a path of twisted false syllogisms, and arrive at a conclusion completely opposed to the reality of the original thing.
Be careful, and remember TALE-SPIN.