Optimization and the Intelligence Explosion

Eliezer Yudkowsky

Among the topics I haven’t delved into here is the notion of an optimization process. Roughly, this is the idea that your power as a mind is your ability to hit small targets in a large search space—this can be either the space of possible futures (planning) or the space of possible designs (invention).

Suppose you have a car, and suppose we already know that your preferences involve travel. Now suppose that you take all the parts in the car, or all the atoms, and jumble them up at random. It’s very unlikely that you’ll end up with a travel-artifact at all, even so much as a wheeled cart; let alone a travel-artifact that ranks as high in your preferences as the original car. So, relative to your preference ordering, the car is an extremely improbable artifact. The power of an optimization process is that it can produce this kind of improbability.

You can view both intelligence and natural selection as special cases of optimization: processes that hit, in a large search space, very small targets defined by implicit preferences. Natural selection prefers more efficient replicators. Human intelligences have more complex preferences. Neither evolution nor humans have consistent utility functions, so viewing them as “optimization processes” is understood to be an approximation. You’re trying to get at the sort of work being done, not claim that humans or evolution do this work perfectly.

This is how I see the story of life and intelligence—as a story of improbably good designs being produced by optimization processes. The “improbability” here is improbability relative to a random selection from the design space, not improbability in an absolute sense—if you have an optimization process around, then “improbably” good designs become probable.
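One way to make "improbability relative to a random selection from the design space" concrete is to count how rarely a random design does at least as well as the one in front of you. The sketch below is only a toy illustration under stated assumptions: the design space (20-bit strings), the preference ordering (`count_ones`), and the function names are all hypothetical stand-ins, not anything defined in this essay.

```python
import math
import random

def optimization_bits(candidate, score, sample_design, trials=20_000, seed=0):
    """Estimate -log2 of the fraction of random designs that score at
    least as high as `candidate` (a toy measure, assumed for illustration)."""
    rng = random.Random(seed)
    target = score(candidate)
    hits = sum(1 for _ in range(trials) if score(sample_design(rng)) >= target)
    # Add-one smoothing keeps the estimate finite when no random design
    # matches the candidate; in that case the result is only a lower
    # bound, capped by the number of trials.
    fraction = (hits + 1) / (trials + 1)
    return -math.log2(fraction)

N_BITS = 20  # hypothetical design space: all 20-bit strings

def count_ones(design):
    # Hypothetical preference ordering: more ones is better.
    return sum(design)

def random_design(rng):
    # "Jumble the parts at random": a uniformly random bit string.
    return [rng.randint(0, 1) for _ in range(N_BITS)]

mediocre = [1] * 10 + [0] * 10   # about what random jumbling produces
excellent = [1] * N_BITS         # the best design in this toy space

mediocre_bits = optimization_bits(mediocre, count_ones, random_design)
excellent_bits = optimization_bits(excellent, count_ones, random_design)
print(f"mediocre design:  {mediocre_bits:.2f} bits")
print(f"excellent design: {excellent_bits:.2f} bits")
```

The mediocre design scores under one bit, because more than half of all random jumbles do as well. The excellent design scores many bits, which is the sense in which it is "improbable" unless an optimization process is around to produce it.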

Looking over the history of optimization on Earth up until now, the first step is to conceptually separate the meta level from the object level—separate the structure of optimization from that which is optimized.

If you consider biology in the absence of hominids, then on the object level we have things like dinosaurs and butterflies and cats. On the meta level we have things like sexual recombination and natural selection of asexual populations. The object level, you will observe, is rather more complicated than the meta level. Natural selection is not an easy subject and it involves math. But if you look at the anatomy of a whole cat, the cat has dynamics immensely more complicated than “mutate, recombine, reproduce.”

This is not surprising. Natural selection is an accidental optimization process that basically just started happening one day in a tidal pool somewhere. A cat is the subject of millions of years and billions of generations of evolution.

Cats have brains, of course, which operate to learn over a lifetime; but at the end of the cat’s lifetime, that information is thrown away, so it does not accumulate. The cumulative effects of cat-brains upon the world as optimizers, therefore, are relatively small.

Or consider a bee brain, or a beaver brain. A bee builds hives, and a beaver builds dams; but they didn’t figure out how to build them from scratch. A beaver can’t figure out how to build a hive, a bee can’t figure out how to build a dam.

So animal brains—up until recently—were not major players in the planetary game of optimization; they were pieces but not players. Compared to evolution, brains lacked both generality of optimization power (they could not produce the amazing range of artifacts produced by evolution) and cumulative optimization power (their products did not accumulate complexity over time). For more on this theme see Protein Reinforcement and DNA Consequentialism.

Very recently, certain animal brains have begun to exhibit both generality of optimization power (producing an amazingly wide range of artifacts, in time scales too short for natural selection to play any significant role) and cumulative optimization power (artifacts of increasing complexity, as a result of skills passed on through language and writing).

Natural selection takes hundreds of generations to do anything and millions of years for de novo complex designs. Human programmers can design a complex machine with a hundred interdependent elements in a single afternoon. This is not surprising, since natural selection is an accidental optimization process that basically just started happening one day, whereas humans are optimized optimizers handcrafted by natural selection over millions of years.

The wonder of evolution is not how well it works, but that it works at all without being optimized. This is how optimization bootstrapped itself into the universe—starting, as one would expect, from an extremely inefficient accidental optimization process. Which is not the accidental first replicator, mind you, but the accidental first process of natural selection. Distinguish the object level and the meta level!

Since the dawn of optimization in the universe, a certain structural commonality has held across both natural selection and human intelligence . . .

Natural selection selects on genes, but generally speaking, the genes do not turn around and optimize natural selection. The invention of sexual recombination is an exception to this rule, and so is the invention of cells and DNA. And you can see both the power and the rarity of such events, by the fact that evolutionary biologists structure entire histories of life on Earth around them.

But if you step back and take a human standpoint—if you think like a programmer—then you can see that natural selection is still not all that complicated. We’ll try bundling different genes together? We’ll try separating information storage from moving machinery? We’ll try randomly recombining groups of genes? On an absolute scale, these are the sort of bright ideas that any smart hacker comes up with during the first ten minutes of thinking about system architectures.

Because natural selection started out so inefficient (as a completely accidental process), this tiny handful of meta-level improvements feeding back in from the replicators—nowhere near as complicated as the structure of a cat—structure the evolutionary epochs of life on Earth.

And after all that, natural selection is still a blind idiot of a god. Gene pools can evolve to extinction, despite all cells and sex.

Now natural selection does feed on itself in the sense that each new adaptation opens up new avenues of further adaptation; but that takes place on the object level. The gene pool feeds on its own complexity—but only thanks to the protected interpreter of natural selection that runs in the background, and that is not itself rewritten or altered by the evolution of species.

Likewise, human beings invent sciences and technologies, but we have not yet begun to rewrite the protected structure of the human brain itself. We have a prefrontal cortex and a temporal cortex and a cerebellum, just like the first inventors of agriculture. We haven’t started to genetically engineer ourselves. On the object level, science feeds on science, and each new discovery paves the way for new discoveries—but all that takes place with a protected interpreter, the human brain, running untouched in the background.

We have meta-level inventions like science, that try to instruct humans in how to think. But the first person to invent Bayes’s Theorem did not become a Bayesian; they could not rewrite themselves, lacking both that knowledge and that power. Our significant innovations in the art of thinking, like writing and science, are so powerful that they structure the course of human history; but they do not rival the brain itself in complexity, and their effect upon the brain is comparatively shallow.

The present state of the art in rationality training is not sufficient to turn an arbitrarily selected mortal into Albert Einstein, which shows the power of a few minor genetic quirks of brain design compared to all the self-help books ever written in the twentieth century.

Because the brain hums away invisibly in the background, people tend to overlook its contribution and take it for granted; and talk as if the simple instruction to “Test ideas by experiment,” or the p < 0.05 significance rule, were the same order of contribution as an entire human brain. Try telling chimpanzees to test their ideas by experiment and see how far you get.

Now . . . some of us want to intelligently design an intelligence that would be capable of intelligently redesigning itself, right down to the level of machine code.

The machine code at first, and the laws of physics later, would be a protected level of a sort. But that “protected level” would not contain the dynamic of optimization; the protected levels would not structure the work. The human brain does quite a bit of optimization on its own, and screws up on its own, no matter what you try to tell it in school. But this fully wraparound recursive optimizer would have no protected level that was optimizing. All the structure of optimization would be subject to optimization itself.

And that is a sea change which breaks with the entire past since the first replicator, because it breaks the idiom of a protected meta level.

The history of Earth up until now has been a history of optimizers spinning their wheels at a constant rate, generating a constant optimization pressure. And creating optimized products, not at a constant rate, but at an accelerating rate, because of how object-level innovations open up the pathway to other object-level innovations. But that acceleration is taking place with a protected meta level doing the actual optimizing. Like a search that leaps from island to island in the search space, and good islands tend to be adjacent to even better islands, but the jumper doesn’t change its legs. Occasionally, a few tiny little changes manage to hit back to the meta level, like sex or science, and then the history of optimization enters a new epoch and everything proceeds faster from there.

Imagine an economy without investment, or a university without language, a technology without tools to make tools. Once in a hundred million years, or once in a few centuries, someone invents a hammer.

That is what optimization has been like on Earth up until now.

When I look at the history of Earth, I don’t see a history of optimization over time. I see a history of optimization power in, and optimized products out. Up until now, thanks to the existence of almost entirely protected meta levels, it’s been possible to split up the history of optimization into epochs, and, within each epoch, graph the cumulative object-level optimization over time, because the protected level is running in the background and is not itself changing within an epoch.

What happens when you build a fully wraparound, recursively self-improving AI? Then you take the graph of “optimization in, optimized out,” and fold the graph in on itself. Metaphorically speaking.

If the AI is weak, it does nothing, because it is not powerful enough to significantly improve itself—like telling a chimpanzee to rewrite its own brain.

If the AI is powerful enough to rewrite itself in a way that increases its ability to make further improvements, and this reaches all the way down to the AI’s full understanding of its own source code and its own design as an optimizer . . . then even if the graph of “optimization power in” and “optimized product out” looks essentially the same, the graph of optimization over time is going to look completely different from Earth’s history so far.
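The contrast can be sketched as a toy simulation. Everything here is an assumption made for illustration: the linear `gain`, the starting power, and the name `run` are hypothetical, and the model ignores the way object-level innovations open the way to further ones. Both runs use the same "power in, product out" rule; the only difference is whether the output is folded back into the optimizer.

```python
def run(steps, feedback, gain=0.1, power0=1.0):
    """Return cumulative optimized product after each step (toy model)."""
    power = power0
    product = 0.0
    history = []
    for _ in range(steps):
        # Same "optimization power in -> optimized product out" rule
        # in both cases (a linear rule, assumed purely for simplicity).
        output = gain * power
        product += output
        if feedback:
            # Folded graph: the output improves the optimizer itself.
            power += output
        history.append(product)
    return history

protected = run(steps=50, feedback=False)   # meta level never changes
recursive = run(steps=50, feedback=True)    # meta level is rewritten

print(f"protected meta level, product after 50 steps: {protected[-1]:.2f}")
print(f"recursive optimizer,  product after 50 steps: {recursive[-1]:.2f}")
```

With the meta level held fixed, the product in this toy grows linearly over time. With feedback, the identical input-output rule compounds, so the curve over time looks nothing like the first one. The numbers mean nothing in themselves; the point is only that the two graphs share a rule and still diverge.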

People often say something like, “But what if it requires exponentially greater amounts of self-rewriting for only a linear improvement?” To this the obvious answer is, “Natural selection exerted roughly constant optimization power on the hominid line in the course of coughing up humans; and this doesn’t seem to have required exponentially more time for each linear increment of improvement.”

All of this is still mere analogic reasoning. A full Artificial General Intelligence thinking about the nature of optimization and doing its own AI research and rewriting its own source code is not really like a graph of Earth’s history folded in on itself. It is a different sort of beast. These analogies are at best good for qualitative predictions, and even then, I have a large number of other beliefs I haven’t yet explained, which are telling me which analogies to make, et cetera.

But if you want to know why I might be reluctant to extend the graph of biological and economic growth over time, into the future and over the horizon of an AI that thinks at transistor speeds and invents self-replicating molecular nanofactories and improves its own source code, then there is my reason: you are drawing the wrong graph, and it should be optimization power in versus optimized product out, not optimized product versus time.

The first publication of this post is here.
