Local Validity as a Key to Sanity and Civilization




Eliezer_Yudkowsky

(Cross-posted from Facebook.)

0.

Tl;dr: There's a similarity between these three concepts:

  • A locally valid proof step in mathematics is one that, in general, produces only true statements from true statements. This is a property of a single step, irrespective of whether the final conclusion is true or false.
  • There's such a thing as a bad argument even for a good conclusion. In order to arrive at sane answers to questions of fact and policy, we need to be curious about whether arguments are good or bad, independently of their conclusions. The rules against fallacies must be enforced even against arguments for conclusions we like.
  • For civilization to hold together, we need to make coordinated steps away from Nash equilibria in lockstep. This requires general rules that are allowed to impose penalties on people we like or reward people we don't like. When people stop believing the general rules are being evaluated sufficiently fairly, they go back to the Nash equilibrium and civilization falls.

i.

The notion of a locally evaluated argument step is simplest in mathematics, where it is a formalizable idea in model theory. In math, a general type of step is 'valid' if it only produces semantically true statements from other semantically true statements, relative to a given model. If x = y in some set of variable assignments, then 2x = 2y in the same model. Maybe x doesn't equal y, in some model, but even if it doesn't, the local step from "x = y" to "2x = 2y" is a locally valid step of argument. It won't introduce any new problems.

Conversely, xy = xz does not imply y = z. It happens to work when x = 2, y = 3, and z = 3, in which case the two statements say "6 = 6" and "3 = 3" respectively. But if x = 0, y = 4, z = 17, then we have "0 = 0" on one side and "4 = 17" on the other. We can feed in a true statement and get a false statement out the other end. This argument is not locally okay.
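As a rough illustration of checking a step on its own, here is a minimal Python sketch that searches a small grid of integers for an assignment where a step's premise is true and its conclusion is false. The function name find_counterexample and the search range are assumptions made for this example. Finding no counterexample in a finite range is not a proof that a step is valid, but finding one does show that the step is not.

```python
from itertools import product

def find_counterexample(premise, conclusion, values=range(-3, 4), arity=3):
    """Return the first assignment where the premise holds but the
    conclusion fails, or None if the search range contains no such case."""
    for assignment in product(values, repeat=arity):
        if premise(*assignment) and not conclusion(*assignment):
            return assignment
    return None

# Step A: from x = y, infer 2x = 2y (z is unused here).
doubling = find_counterexample(
    lambda x, y, z: x == y,
    lambda x, y, z: 2 * x == 2 * y,
)

# Step B: from xy = xz, infer y = z.
cancelling = find_counterexample(
    lambda x, y, z: x * y == x * z,
    lambda x, y, z: y == z,
)

print(doubling)    # None: no counterexample in the searched range
print(cancelling)  # an assignment with x = 0 and y != z
```

Note that neither check looks at whether x = y or xy = xz is actually true in any particular case. The check is about the step alone, which is the sense of "local" used here.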

You can't get the concept of a "mathematical proof" unless on some level—though often an intuitive level rather than an explicit one—you understand the notion of a single step of argument that is locally okay or locally not okay, independent of whether you globally agreed with the final conclusion. There's a kind of approval you give to the pieces of the argument, rather than looking the whole thing over and deciding whether you like what came out the other end.

Once you've grasped that, it may even be possible to convince you of mathematical results that sound counterintuitive. When your understanding of the rules governing allowable argument steps has become stronger than your faith in your ability to judge whole intuitive conclusions, you may be convinced of truths you would not otherwise have grasped.

ii.

More generally in life, even outside of mathematics, there are such things as bad arguments for good conclusions.

There are even such things as genuinely good arguments for false conclusions, though of course those are much rarer. By the Bayesian definition of evidence, "strong evidence" is exactly that kind of evidence which we very rarely expect to find supporting a false conclusion. Lord Kelvin's careful and multiply-supported lines of reasoning arguing that the Earth could not possibly be so much as a hundred million years old, all failed simultaneously in a surprising way because that era didn't know about nuclear reactions. But most of the time this does not happen.

On the other hand, bad arguments for true conclusions are extremely easy to come by, because there are tiny elves that whisper them to people. There isn't anything the least bit more difficult in making an argument terrible when it leads to a good conclusion, since the tiny elves own lawnmowers.

One of the marks of intellectually strong minds is that they are able to take a curious interest in whether a particular argument is a good argument or a bad argument, independently of whether they agree with the conclusion of that argument.

Even if they happen to start out believing that, say, the intelligence explosion thesis for Artificial General Intelligence is false, they are capable of frowning at the argument that the intelligence explosion is impossible because hypercomputation is impossible, or that there's really no such thing as intelligence because of the no-free-lunch theorem, and saying, "Even if I agree with your conclusion, I think that's a terrible argument for it." Even if they agree with the mainstream scientific consensus on anthropogenic global warming, they still wince and perhaps even offer a correction when somebody offers as evidence favoring global warming that there was a really scorching day last summer.

There are weaker and stronger versions of this attribute. Some people will think to themselves, "Well, it's important to use only valid arguments... but there was a sustained pattern of record highs worldwide over multiple years which does count as evidence, and that particular very hot day was a part of that pattern, so it's valid evidence for global warming." Other people will think to themselves, "I'd roll my eyes at someone who offers a single very cold day as an argument that global warming is false. So it can't be okay to use a single very hot day to argue that global warming is true."

I'd much rather buy a used car from the second person than the first person. I think I'd pay at least a 5% price premium.

Metaphorically speaking, the first person will court-martial an allied argument if they must, but they will favor allied soldiers when they can. They still have a sense of motion toward the Right Final Answer as being progress, and motion away from the right final answer as anti-progress, and they dislike not making progress.

The second person has something more like the strict mindset of a mathematician when it comes to local validity. They are able to praise some proof steps as obeying the rules, irrespective of which side those steps are on, without a sense that they are thereby betraying their side.

iii.

This essay has been bubbling in the back of my mind for a while, since I read that potential juror #70 for the Martin Shkreli trial was rejected during selection when, asked if they thought they could render impartial judgment, they replied, "I can be fair to one side but not the other." And I thought maybe I should write something about why that was possibly a harbinger of the collapse of civilization. I've been musing recently about how a lot of the standard Code of the Light isn't really written down anywhere anyone can find.

The thought recurred during the recent #MeToo saga when some Democrats were debating whether it made sense to kick Al Franken out of the Senate. I don't want to derail into debating Franken's behavior and whether that degree of censure was warranted per se, and I'll delete any such comments. What brought on this essay was that I read some unusually frank concerns from people who did think that Franken's behavior was per se cause to not represent the Democratic Party in the Senate; but who worried that the Democrats would police themselves, the Republicans wouldn't, and so the Republicans would end up controlling the Senate.

I've heard less of that since some upstanding Republican voters in Alabama stayed home on election night and put Doug Jones in the Senate.

But at the time, some people were replying, "That seems horrifyingly cynical and realpolitik. Is the idea here that sexual line-crossing is only bad and worthy of punishment when Republicans do it? Are we deciding that explicitly now?" And others were saying, "Look, the end result of your way of doing things is to just hand over the Senate to the Republican Party."

This is a conceptual knot that, I'm guessing, results from not explicitly distinguishing game theory from goodness.

There is, I think, a certain intuitive idea that ideally the Law is supposed to embody a subset of morality insofar as it is ever wise to enforce certain kinds of goodness. Murder is bad, and so there's a law against this bad behavior of murder. There are a lot of places where the law is in fact evil, like the laws criminalizing marijuana; that means the law is departing from its purpose, falling short of what it should be. Those who are not real-life straw authoritarians (who are sadly common) will cheerfully agree that there are some forms of goodness, even most forms of goodness, that it is not wise to try to legislate. But insofar as it is ever wise to make law, there's an intuitive sense that law should reflect some particular subset of morally good behavior that we have decided it is wise to enforce with guns, such as "Don't kill people."

It's from this perspective that "As a matter of pragmatic realpolitik we are going to not enforce sexual line-crossing rules against Democratic senators" seems like giving up, and maybe a harbinger of the fall of civilization if things have really gotten that bad.

But there's more than one function of legal codes, the way that money is both a store of value and a medium of exchange but these are different functions of money.

You can also look at laws as a kind of game theory played with people who might not share your morality at all. Some people take this perspective almost exclusively, at least in their verbal reports. They'll say, "Well, yes, I'd like it if I could walk into your house and take all your stuff, but I would dislike it even more if you could walk into my house and take my stuff, and that's why we have laws." I'm never quite sure how seriously to take the claim that they'd be happy walking into my house and taking my stuff. It seems to me that law enforcement and even social enforcement are simply not effective enough to account for the vast majority of human cooperation, and I have a sense that civilization is free-riding a whole lot on innate altruism... but game theory is certainly a function served by law.

The same way that money is both medium of exchange and store of value, the law is both collective utility function fragment and game theory.

In its function as game theory, the law (ideally) enables people with different utility functions to move from bad Nash equilibria to better Nash equilibria, closer to the Pareto frontier. Instead of mutual defection getting a payoff of (2, 2), both sides pay 0.1 for law enforcement and move to enforced mutual cooperation at (2.9, 2.9).
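The arithmetic in that example can be sketched in a few lines of Python. The (2, 2) mutual-defection payoff, the 0.1 enforcement fee, and the (2.9, 2.9) result come from the paragraph above; the (3, 3) and (5, 0) payoffs are borrowed from the Prisoner's Dilemma figures used in section v. The penalty of 3 points per defector is purely an assumption chosen for illustration, as are the function names.

```python
from fractions import Fraction
from itertools import product

C, D = "cooperate", "defect"

# Base Prisoner's Dilemma payoffs as (row player, column player).
BASE = {
    (C, C): (3, 3),
    (C, D): (0, 5),
    (D, C): (5, 0),
    (D, D): (2, 2),
}

def with_enforcement(base, fee, penalty):
    """Everyone pays `fee`; each defector additionally pays `penalty`.
    The penalty size is an illustrative assumption, not from the essay."""
    enforced = {}
    for (a, b), (pa, pb) in base.items():
        enforced[(a, b)] = (
            pa - fee - (penalty if a == D else 0),
            pb - fee - (penalty if b == D else 0),
        )
    return enforced

def pure_nash_equilibria(game):
    """Return the profiles where neither player gains by switching alone."""
    equilibria = []
    for a, b in product((C, D), repeat=2):
        pa, pb = game[(a, b)]
        row_ok = all(game[(alt, b)][0] <= pa for alt in (C, D))
        col_ok = all(game[(a, alt)][1] <= pb for alt in (C, D))
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria

# Fraction keeps the 0.1 fee exact, so 3 - 1/10 is exactly 29/10.
ENFORCED = with_enforcement(BASE, fee=Fraction(1, 10), penalty=3)

print(pure_nash_equilibria(BASE))      # mutual defection only
print(pure_nash_equilibria(ENFORCED))  # mutual cooperation only
print(float(ENFORCED[(C, C)][0]))      # 2.9
```

The penalty in this sketch is applied to whoever defects, with no reference to which player it is. If it were applied to only one side, the payoffs would no longer be symmetric, which is the situation the following paragraphs discuss.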

From this perspective, everything rests on notions like "fairness", "impartiality", "equality before the law", "it doesn't matter whose ox is being gored". If the so-called law punishes your defection but lets the other's defection pass, and this happens systematically enough and often enough, it is in your interest to blow up the current equilibrium if you have a chance.

It is coherent to say, "Crossing this behavioral line is universally bad when anyone does it, and also we're not going to punish Democratic senators unless you also punish Republican senators." Though as the saga of Senator Doug Jones of Alabama also shows, you should be careful about preemptively assuming the other side won't cooperate; there are sad lost opportunities there.

iv.

The way humans do law, it depends on the existence of what feel like simple general rules that apply to all cases.

This is not a universal truth of decision theory, it's a consequence of our cognitive limitations. Two superintelligences could negotiate a compromise with complicated detailed boundaries going right up to the Pareto frontier. They could agree on mutually verified pieces of cognitive code designed to intelligently decide future events according to known principles.

Humans use simpler laws than that.

To be clear, the kind of "law" I'm talking about here is not to be confused with the enormous modern morass of unreadable regulations. Think of, say, the written laws that actually got enforced in a small town in California in 1820. Or Democrats debating whether to enforce a sanction against Democratic senators if it's not being enforced against Republican senators. Or a small community's elders' star-chamber meeting to debate an accusation of sexual assault. Or the laws that cops will enforce even against other cops. These are the kinds of laws that must be simple in order to exist.

The reason that hunter-gatherer tribes don't have 100,000 pages of written legalism... is not that they've wisely realized that lengthy rules are easier to fill with loopholes, and that complicated regulations favor large corporations with legal departments, and that laws often have unintended consequences which don't resemble their stated justifications, and that deadweight losses increase quadratically. It's very clear that a supermajority of human beings are not that wise. Rather, hunter-gatherers just don't have enough time, energy, and paper to screw up that badly.

When humans try to verbalize The Law that isn't to be confused with written law, the law that cops will enforce against other cops, it comes out in universally quantified short sentences like "Anyone who defects in the Prisoner's Dilemma will be penalized TEN points even if that costs us fifteen" or "If you kill somebody who wasn't attacking you first, we'll exile you."

At one point somebody had the bright idea of trying to write down The Law. That way everyone could have common knowledge of what The Law was; and if you didn't break what was written, you could know you were safe from at least the official sanctions. Robert Heinlein called it the most important moment in political history, declaring that the law was above the politicians.

I for one rather doubt the Code of Hammurabi was universally enforced. I expect that hunter-gatherer tribes long before writing had a sense of there being Laws that were above the decisions of individual elders. I suspect that even in the best of times most of The Law was never written down, and that more than half of what was written down was never really The Law.

But unfortunately, once somebody had the bright idea of writing down The Law, somebody else had the bright idea of writing down more words on the same clay tablet.

Today we live in a post-legalist era, when almost all of that which serves the true function of Law can no longer be written down. The government legalist system is too expensive in time and money and energy, too unreliable, and too slow, for any sane victim of sexual assault to appeal to the criminal justice system instead of the media justice system or the whispernet justice system. The civil legalist system outside of small claims court is a bludgeoning contest between entities that can afford lawyers, and the real law between corporations is enforced by merchant reputation and the threat of starting a bludgeoning contest. If you're in a lower-class neighborhood in the US, you can't get together and create order using your own town guards, because the police won't allow it. From your perspective, the function of the police is to prevent open gunfights and to not allow any more effective order than that to form.

But so it goes. We can't always keep the nice things we used to have, like written laws. The privilege was abused, and has been revoked.

What remains of The Law must indeed be simple, because our written-law privileges have been revoked, and so The Law relies on everyone knowing The Law without it being written down. It isn't even recited in memorable verse, as once it was. The Law relies on the community agreeing on the application of The Law without there being professional judges or a precedent-based judiciary. If not universal agreement, it must at least seem that the choices of the elders are trying to appeal to The Law instead of just naked self-interest. To the extent a voluntary association can't agree on The Law in this sense, it will soon cease to be a voluntary association.

The Law also breaks down if people start believing that, when the simple rules say one thing, the deciders will instead look at whose ox got gored, evaluate their personal interest, and enforce a different conclusion instead.

Which is to say: human law ends up with what people at least believe to be a set of simple rules that can be locally checked to test okay behavior. It's not actually algorithmically simple any more than walking is cheaply computable, but it feels simple the way that walking feels easy. Whatever doesn't feel like part of that small simple set won't be systematically enforced by the community, regardless of whether your civilization has reached the stage where police are seizing the cars of black people but not white people who use marijuana.

v.

The game-theoretic function of law can make following those simple rules feel like losing something, taking a step backward. You don't get to defect in the Prisoner's Dilemma, you don't get that delicious (5, 0) payoff instead of (3, 3). The law may punish one of your allies. You may be losing something according to your actual value function, which feels like the law having an objectively bad immoral result. You may coherently hold that the universe is a worse place for an instance of the enforcement of a good law, relative to its counterfactual state if that law could be lifted in just that instance without affecting any other instances. Though this does require seeing that law as having a game-theoretic function as well as a moral function.

So long as the rules are seen as moving from a bad global equilibrium to a global equilibrium seen as better, and so long as the rules are mostly-equally enforced on everyone, people are sometimes able to take a step backward and see that larger picture. Or, in a less abstract way, trade off the reified interest of The Law against their own desires and wishes.

This mental motion goes by names like "justice", "fairness", and "impartiality". It has ancient exemplars like a story I couldn't seem to Google, about a Chinese general who prohibited his troops from looting, and then his son appropriated a straw hat from a peasant; so the general sentenced his own son to death with tears running down his face.

Here's a fragment of thought as it was before the Great Stagnation, as depicted in passing in H. Beam Piper's Little Fuzzy, one of the earliest books I read as a child. It's from 1962, when the memetic collapse had started but not spread very far into science fiction. It stuck in my mind long ago and became one more tiny little piece of who I am now.

“Pendarvis is going to try the case himself,” Emmert said. “I always thought he was a reasonable man, but what’s he trying to do now? Cut the Company’s throat?”
“He isn’t anti-Company. He isn’t pro-Company either. He’s just pro-law. The law says that a planet with native sapient inhabitants is a Class-IV planet, and has to have a Class-IV colonial government. If Zarathustra is a Class-IV planet, he wants it established, and the proper laws applied. If it’s a Class-IV planet, the Zarathustra Company is illegally chartered. It’s his job to put a stop to illegality. Frederic Pendarvis’ religion is the law, and he is its priest. You never get anywhere by arguing religion with a priest.”

There is no suggestion in 1962 that the speakers are gullible, or that Pendarvis is a naif, or that Pendarvis is weird for thinking like this. Pendarvis isn't the defiant hero or even much of a side character. It's just a kind of judge you sometimes run into, part of a normal environment as projected from the author's mind that wrote the story.

If you don't have some people like Pendarvis, and you don't appreciate what they're trying to do even when they rule against you, sooner or later your tribe ends.

I mean, I doubt the United States will literally fall into anarchy this way before the AGI timeline runs out. But the concept applies on a smaller scale than countries. It applies on a smaller scale than communities, to bargains between three people or two.

The notion that you can "be fair to one side but not the other", that what's called "fairness" is a kind of favor you do for people you like, says that even the instinctive sense people had of law-as-game-theory is being lost in the modern memetic collapse. People are being exposed to so many social-media-viral depictions of the Other Side defecting, and viewpoints exclusively from Our Side without any leavening of any other viewpoint that might ask for a game-theoretic compromise, that they're losing the ability to appreciate the kind of anecdotes they used to tell in ancient China.

(Or maybe it's hormonelike chemicals leached from plastic food containers. Let's not forget all the psychological explanations offered for a wave of violence that turned out to be lead poisoning.)

vi.

And to take the point full circle:

The mental motion to evenhandedly apply The Rules irrespective of their conclusion is a kind of thinking that human beings appreciate intuitively, or at least they appreciated it in ancient China and mid-20th-century science fiction. In fact, we appreciate The Law more natively than we appreciate the notion of local syntactic rules capturing semantically valid steps in mathematical proofs, go figure.

So the legal metaphor is where a lot of people get started on epistemology: by seeing the local rules of valid argument as The Law, fallacies as crimes. The unusually healthy of mind will reject bad allied arguments with an emotional sense of practicing the way of an impartial judge.

It's ironic, in a way, because there is no game theory and no morality to the true way of the map that reflects the territory. A paperclip maximizer would also strive to debias its cognitive processes, alone in its sterile universe.

But I would venture a guess and hypothesis that you are better off buying a used car from a random mathematician than a random non-mathematician, even after controlling for IQ. The reasoning being that mathematicians are people whose sense of Law was strong enough to be appropriated for proofs, and that this will correlate, if imperfectly, with mathematicians abiding by what they see as The Law in other places as well. I could be wrong, and would be interested in seeing the results of any study like this if it were ever done. (But no studies on self-reports of criminal behavior, please. Unless there's some reason to believe that the self-report metric is measuring "criminality" rather than "honesty times criminality".)

I have no grand agenda in having said all this. I've just sometimes thought of late that it would be nice if more of the extremely basic rules of thinking were written down.