The Nature of Logic

15th Nov 2008


A notion I had recently but have not tested:

When constructing logical arguments in a casual setting or about messy concepts, don't use forms like 'A and B implies X' or 'A and B, therefore X', because it is far too easy for us to take one thing to mean another, and label it an implication, perhaps without even realizing the step is there.

"The ground is wet, therefore it's raining" is an easy thought to have, but Y almost never implies X when dealing with such complex situations.

I think this can be avoided by structures like 'X is defined as A and B' and 'A and B, which is the definition of X'.

'The ground is wet, which is the definition of rain' and 'the alarm went off, and there was an earthquake, which is the definition of not being robbed' and 'being human is what we mean by being mortal' are clearly wrong. 'Dying when administered hemlock is the definition of mortal' at least has the right idea.

Y never implies X unless Y is exactly what you mean by X, or you really want to use the term in the hard logical sense.

Eliezer, I'm thinking of applying this to conversations between humans constructing particular arguments, not to building AI. I'm hoping it will avoid the particular sloppiness of proving one thing and taking it to have proven another, similar sounding thing. If we conflate things, we should make it explicit.

It's true that defining Y as X won't match X->Y where X is false and Y is true, but that's also the case where, given X->Y, you can't prove X from Y or Y from X. So X->Y wouldn't (or shouldn't!) be used in a proof in such a case.

If there is no element in our model for which X is true and Y is false, should we conclude \z.X(z)->Y(z) and use that as a premise in further reasoning? "If z is a raven then z is red." is true for everything if you have no ravens.

What? The pic of the sirens is back? I got excited when EY hinted he and RH were building to a discussion of their differences, just in time for Hanson to make good on his promise to quit.

I assumed the blank header of the last few days was for suspense building, pending a site relaunch. (And it was kinda working.)

I learned in my undergraduate degree, in about 1987, that "deductive" reasoning was different from "inductive" reasoning, that syllogisms did not add to actual knowledge, and that the scientific method, which did, was not deductive and could not be certain in the same way.

I too would like shorter posts. Much of this post appears to be explaining what I have just said, even if in quite an entertaining way.

Previously in series: Selling Nonapples
Followup to: The Parable of Hemlock

Decades ago, there was a tremendous amount of effort invested in certain kinds of systems that centered around first-order logic - systems where you entered "Socrates is a human" and "all humans are mortal" into a database, and it would spit out "Socrates is mortal".

The fact that these systems failed to produce "general intelligence" is sometimes taken as evidence of the inadequacy, not only of logic, but of Reason Itself. You meet people who (presumably springboarding off the Spock meme) think that what we really need are emotional AIs, since logic hasn't worked.

What's really going on here is so completely different from the popular narrative that I'm not even sure where to start. So I'm going to try to explain what I see when I look at a "logical AI". It's not a grand manifestation of the ultimate power of Reason (which then fails).

Let's start with the logical database containing the statements:

|- human(Socrates)
|- \x.(human(x)->mortal(x))

which then produces the output

|- mortal(Socrates)

where |- is how we say that our logic asserts something, and \x. means "all x" or "in what follows, we can substitute anything we want for x". Thus the above is how we'd write the classic syllogism, "Socrates is human, all humans are mortal, therefore Socrates is mortal".
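A toy version of such a database can be sketched in a few lines of Python. The `facts`/`rules` representation here is invented for illustration, not how any historical system actually worked:

```python
# Toy "logical database": ground facts plus universally quantified
# rules of the form: for all x, premise(x) -> conclusion(x).
facts = {("human", "Socrates")}
rules = [("human", "mortal")]   # \x.(human(x) -> mortal(x))

def forward_chain(facts, rules):
    """Apply every rule to every matching fact until nothing new appears."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in rules:
            for pred, subj in list(derived):
                if pred == premise and (conclusion, subj) not in derived:
                    derived.add((conclusion, subj))
                    changed = True
    return derived

print(forward_chain(facts, rules))  # includes ("mortal", "Socrates")
```

Note that all the interesting work - deciding what counts as "human", typing in the facts - happens outside this little loop.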

Now a few months back, I went through the sequence on words, which included The Parable of Hemlock.

If you're going to take "all men are mortal" as something true by definition, then you can never conclude that Socrates is "human" until after you've observed him to be mortal. Since logical truths are true in all possible worlds, they never tell you which possible world you live in - no logical truth can predict the result of an empirical event which could possibly go either way.

Could a skeptic say that this logical database is not doing any cognitive work, since it can only tell us what we already know? Or that, since the database is only using logic, it will be unable to do anything empirical, ever?

Even I think that's too severe. The "logical reasoner" is doing a quantum of cognitive labor - it's just a small quantum.

Consider the following sequence of events:

- We decide to add |- human(x) to our database, whenever we observe that X has ten fingers, wears clothes, uses tools, and speaks a language. These particular characteristics are easier to observe (we think) than e.g. cutting X to see if red blood comes out.
- We add |- \x.(human(x)->mortal(x)) to our database.
- We add |- human(Socrates) to the database.
- The database prints |- mortal(Socrates) on its screen.

This process, taken as a whole, is hardly absolutely certain, as in the Spock stereotype of rationalists who cannot conceive that they are wrong. The process did briefly involve a computer program which mimicked a system, first-order classical logic, which also happens to be used by some mathematicians in verifying their proofs. That doesn't lend the entire process the character of mathematical proof. And if the process fails, somewhere along the line, that's no call to go casting aspersions on Reason itself.

In this admittedly contrived example, only an infinitesimal fraction of the cognitive work is being performed by the computer program. It's such a small fraction that anything you could say about "logical AI", wouldn't say much about the process as a whole.

So what's an example of harder cognitive labor?

How about deciding that "human" is an important category to put things in? It's not like we're born seeing little "human" tags hovering over objects, with high priority attached. You have to discriminate stable things in the environment, like Socrates, from your raw sensory information. You have to notice that various stable things are all similar to one another, wearing clothes and talking. Then you have to draw a category boundary around the cluster, and harvest characteristics like vulnerability to hemlock.

A human operator, not the computer, decides whether or not to classify Socrates as a "human", based on his shape and clothes. The human operator types |- human(Socrates) into the database (itself an error-prone sort of process). Then the database spits out |- mortal(Socrates) - in the scenario, this is the only fact we've ever told it about humans, so we don't ask why it makes this deduction instead of another one. A human looks at the screen, interprets |- mortal(Socrates) to refer to a particular thing in the environment and to imply that thing's vulnerability to hemlock. Then the human decides, based on their values, that they'd rather not see Socrates die; works out a plan to stop Socrates from dying; and executes motor actions to dash the chalice from Socrates's fingers.

Are the off-computer steps "logical"? Are they "illogical"? Are they unreasonable? Are they unlawful? Are they unmathy?

Let me interrupt this tale, to describe a case where you very much do want a computer program that processes logical statements:

Suppose you've got to build a computer chip with a hundred million transistors, and you don't want to recall your product when a bug is discovered in multiplication. You might find it very wise to describe the transistors in first-order logic, and try to prove statements about how the chip performs multiplication.

But then why is logic suited to this particular purpose?

Logic relates abstract statements to specific models. Let's say that I have an abstract statement like "all green objects are round" or "all round objects are soft". Operating syntactically, working with just the sentences, I can derive "all green objects are soft".

Now if you show me a particular collection of shapes, and if it so happens to be true that every green object in that collection is round, and it also happens to be true that every round object is soft, then it will likewise be true that all green objects are soft.

We are not admitting of the possibility that a green-detector on a borderline green object will fire "green" on one occasion and "not-green" on the next. The form of logic in which every proof step preserves validity, relates crisp models to crisp statements. So if you want to draw a direct correspondence between elements of a logical model, and high-level objects in the real world, you had better be dealing with objects that have crisp identities, and categories that have crisp boundaries.

Transistors in a computer chip generally do have crisp identities. So it may indeed be suitable to make a mapping between elements in a logical model, and real transistors in the real world.

So let's say you can perform the mapping and get away with it - then what?
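The green/round/soft derivation can be checked against a concrete model. A minimal sketch, where the particular collection of objects is invented for illustration:

```python
# A concrete model: a small collection of objects with crisp properties.
objects = [
    {"green": True,  "round": True,  "soft": True},
    {"green": False, "round": True,  "soft": True},
    {"green": False, "round": False, "soft": False},
]

def holds(premise, conclusion):
    """Is 'for all x: premise(x) -> conclusion(x)' true in this model?"""
    return all(obj[conclusion] for obj in objects if obj[premise])

# Both premises happen to be true of this collection...
assert holds("green", "round") and holds("round", "soft")
# ...so the syntactically derived statement must be true of it as well,
# without re-checking the derivation against the objects themselves.
assert holds("green", "soft")
```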

The power of logic is that it relates models and statements. So you've got to make that mental distinction between models on the one hand, and statements on the other. You've got to draw a sharp line between the elements of a model that are green or round, and statements like \x.(green(x)->round(x)). The statement itself isn't green or round, but it can be true or false about a collection of objects that are green or round.

And here is the power of logic: For each syntactic step we do on our statements, we preserve the match to any model. In any model where our old collection of statements was true, the new statement will also be true. We don't have to check all possible conforming models to see if the new statement is true in all of them. We can trust certain syntactic steps in general - not to produce truth, but to preserve truth. Then you do a million syntactic steps in a row, and because each step preserves truth, you can trust that the whole sequence will preserve truth.

We start with a chip. We do some physics and decide that whenever transistor X is 0 at time T, transistor Y will be 1 at T+1, or some such - we credit that real events in the chip will correspond quite directly to a model of this statement. We do a whole lot of syntactic manipulation on the abstract laws. We prove a statement that describes binary multiplication. And then we jump back to the model, and then back to the chip, and say, "Whatever the exact actual events on this chip, if they have the physics we described, then multiplication will work the way we want."

It would be considerably harder (i.e. impossible) to work directly with logical models of every possible computation the chip could carry out. To verify multiplication on two 64-bit inputs, you'd need to check 340 trillion trillion trillion models.

But this trick of doing a million derivations one after the other, and preserving truth throughout, won't work if the premises are only true 999 out of 1000 times. You could get away with ten steps in the derivation and not lose too much, but a million would be out of the question.

So the truth-preserving syntactic manipulations we call "logic" can be very useful indeed, when we draw a correspondence to a digital computer chip where the transistor error rate is very low.
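The arithmetic behind that claim is quick to check: if each step independently holds with probability 0.999, chaining steps multiplies the failures in.

```python
import math

# If each of n derivation steps independently preserves truth with
# probability p, the whole chain survives with probability p**n.
p = 0.999

ten_steps = p ** 10
print(f"10 steps: {ten_steps:.4f}")          # ~0.99: barely any loss

# p ** 1_000_000 underflows a 64-bit float, so take logs instead.
log10_million = 1_000_000 * math.log10(p)
print(f"1e6 steps: 10^{log10_million:.1f}")  # ~10^-434.5: hopeless
```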

But if you're trying to draw a direct correspondence between the primitive elements of a logical model and, say, entire biological humans, that may not work out as well.

First-order logic has a number of wonderful properties, like detachment. We don't care how you proved something - once you arrive at a statement, we can forget how we got it. The syntactic rules are local, and use statements as fodder without worrying about their provenance. So once we prove a theorem, there's no need to keep track of how, in particular, it was proved.

But what if one of your premises turns out to be wrong, and you have to retract something you already concluded? Wouldn't you want to keep track of which premises you'd used?

If the burglar alarm goes off, that means that a burglar is burgling your house. But if there was an earthquake that day, it's probably the earthquake that set off the alarm instead. But if you learned that there was a burglar from the police, rather than the alarm, then you don't want to retract the "burglar" conclusion on finding that there was an earthquake...

It says a lot about the problematic course of early AI, that people first tried to handle this problem with nonmonotonic logics. They would try to handle statements like "A burglar alarm indicates there's a burglar - unless there's an earthquake" using a slightly modified logical database that would draw conclusions and then retract them.

And this gave rise to huge problems for many years, because they were trying to do, in the style of logic, something that was not at all like the actual nature of logic as math. Trying to retract a particular conclusion goes completely against the nature of first-order logic as a mathematical structure.

If you were given to jumping to conclusions, you might say "Well, math can't handle that kind of problem because there are no absolute laws as to what you conclude when you hear the burglar alarm - you've got to use your common-sense judgment, not math."

But it's not an unlawful or even unmathy question.

It turns out that for at least the kind of case I've described above - where you've got effects that have more than one possible cause - we can excellently handle a wide range of scenarios using a crystallization of probability theory known as "Bayesian networks". And lo, we can prove all sorts of wonderful theorems that I'm not going to go into. (See Pearl's "Probabilistic Reasoning in Intelligent Systems".)

And the real solution turned out to be much more elegant than all the messy ad-hoc attempts at "nonmonotonic logic". On non-loopy networks, you can do all sorts of wonderful things like propagate updates in parallel using asynchronous messages, where each node only tracks the messages coming from its immediate neighbors in the graph, etcetera. And this parallel, asynchronous, decentralized algorithm is provably correct as probability theory, etcetera.

So... are Bayesian networks illogical? Certainly not in the colloquial sense of the word.
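The burglar-alarm story can be made concrete with a tiny Bayesian network. All the numbers below are invented for illustration: burglar and earthquake are independent causes, and the alarm is a noisy-OR of whichever causes are present.

```python
from itertools import product

P_B, P_E = 0.01, 0.02  # prior probabilities of burglar, earthquake

def p_alarm(b, e):
    """P(alarm | b, e): each present cause independently fails to trip it."""
    p_fail = 1 - 0.001            # small leak: alarm can trip on its own
    if b: p_fail *= 1 - 0.95      # burglar trips alarm with prob 0.95
    if e: p_fail *= 1 - 0.30      # earthquake trips alarm with prob 0.30
    return 1 - p_fail

def posterior_burglar(know_earthquake):
    """P(burglar | alarm[, earthquake]) by enumerating the joint."""
    num = den = 0.0
    for b, e in product([True, False], repeat=2):
        if know_earthquake and not e:
            continue  # condition on the earthquake having happened
        joint = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E) * p_alarm(b, e)
        den += joint
        if b:
            num += joint
    return num / den

print(posterior_burglar(False))  # alarm alone: burglar becomes likely
print(posterior_burglar(True))   # alarm + earthquake: mostly explained away
```

With these made-up numbers, the alarm alone raises P(burglar) from 1% to roughly 58%, while also learning of the earthquake drops it back to about 3% - the "retraction" that nonmonotonic logics struggled with falls out of ordinary probability theory, with no conclusion ever asserted and withdrawn.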

You could write a logic that implemented a Bayesian network. You could represent the probabilities and graphs in a logical database. The elements of your model would no longer correspond to things like Socrates, but rather correspond to conditional probabilities or graph edges... But why bother? Non-loopy Bayesian networks propagate their inferences in nicely local ways. There's no need to stick a bunch of statements in a centralized logical database and then waste computing power to pluck out global inferences.

What am I trying to convey here? I'm trying to convey that thinking mathematically about uncertain reasoning is a completely different concept from AI programs that assume direct correspondences between the elements of a logical model and the high-level regularities of reality. "The failure of logical AI" is not "the failure of mathematical thinking about AI" and certainly not "the limits of lawful reasoning". The "failure of logical AI" is more like, "That thing with the database containing statements about Socrates and hemlock - not only were you using the wrong math, but you weren't even looking at the interesting parts of the problem."

Now I did concede that the logical reasoner talking about Socrates and hemlock, was performing a quantum of cognitive labor. We can now describe that quantum:

"After you've arrived at such-and-such hypothesis about what goes on behind the scenes of your sensory information, and distinguished the pretty-crisp identity of 'Socrates' and categorized it into the pretty-crisp cluster of 'human', then, if the other things you've observed to usually hold true of 'humans' are accurate in this case, 'Socrates' will have the pretty-crisp property of 'mortality'."

This quantum of labor tells you a single implication of what you already believe... but actually it's an even smaller quantum than this. The step carried out by the logical database corresponds to verifying this step of inference, not deciding to carry it out. Logic makes no mention of which inferences we should perform first - the syntactic derivations are timeless and unprioritized. It is nowhere represented in the nature of logic as math, that if the 'Socrates' thingy is drinking hemlock, right now is a good time to ask if he's mortal.

And indeed, modern AI programs still aren't very good at guiding inference. If you want to prove a computer chip correct, you've got to have a human alongside to suggest the lemmas to be proved next. The nature of logic is better suited to verification than construction - it preserves truth through a million syntactic manipulations, but it doesn't prioritize those manipulations in any particular order. So saying "Use logic!" isn't going to solve the problem of searching for proofs.

This doesn't mean that "What inference should I perform next?" is an unlawful question to which no math applies. Just that the math of logic that relates models and statements, relates timeless models to timeless statements in a world of unprioritized syntactic manipulations. You might be able to use logic to reason about time or about expected utility, the same way you could use it to represent a Bayesian network. But that wouldn't introduce time, or wanting, or nonmonotonicity, into the nature of logic as a mathematical structure.

Now, math itself tends to be timeless and detachable and proceed from premise to conclusion, at least when it happens to be right. So logic is well-suited to verifying mathematical thoughts - though producing those thoughts in the first place, choosing lemmas and deciding which theorems are important, is a whole different problem.

Logic might be well-suited to verifying your derivation of the Bayesian network rules from the axioms of probability theory. But this doesn't mean that, as a programmer, you should try implementing a Bayesian network on top of a logical database. Nor, for that matter, that you should rely on a first-order theorem prover to invent the idea of a "Bayesian network" from scratch.

Thinking mathematically about uncertain reasoning, doesn't mean that you try to turn everything into a logical model. It means that you comprehend the nature of logic itself within your mathematical vision of cognition, so that you can see which environments and problems are nicely matched to the structure of logic.