A Primer On Risks From AI

XiXiDu

The Power of Algorithms

Evolutionary processes are the most evident example of the power of simple algorithms [1][2][3][4][5].
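To make "simple algorithm" concrete, here is a minimal sketch of an evolutionary loop. It is a toy, not a model of biology: the function name `evolve`, the bit-string genome, the "count the 1 bits" fitness and all parameter values are assumptions chosen for illustration. Nothing in the loop knows what a good solution looks like; it only copies, mutates and keeps the fitter half.

```python
import random

def evolve(length=32, pop_size=20, mutation_rate=0.02, max_generations=2000, seed=0):
    """Evolve bit strings toward all ones using only mutation and selection."""
    rng = random.Random(seed)
    fitness = sum  # toy fitness: the number of 1 bits in a genome
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for generation in range(max_generations):
        population.sort(key=fitness, reverse=True)
        if fitness(population[0]) == length:
            return generation, population[0]
        survivors = population[: pop_size // 2]  # selection: keep the fitter half
        children = [
            # mutation: flip each bit independently with a small probability
            [bit ^ (rng.random() < mutation_rate) for bit in parent]
            for parent in survivors
        ]
        population = survivors + children
    population.sort(key=fitness, reverse=True)
    return max_generations, population[0]

generations, best = evolve()
print(f"best genome has {sum(best)} of {len(best)} bits set after {generations} generations")
```

Because the survivors are carried over unchanged, the best fitness in the population can never decrease, which is all the "design" this process needs to climb.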

The field of evolutionary biology has gathered a vast amount of evidence [6] that established evolution as the process that explains the local decrease in entropy [7], that is, the complexity of life.

Since it can be conclusively shown that all life is an effect of an evolutionary process, it follows that everything we do not yet understand about living beings is also an effect of evolution.

We might not understand the nature of intelligence [8] and consciousness [9] but we do know that they are the result of an optimization process that is neither intelligent nor conscious.

Therefore we know that it is possible for a physical optimization process to culminate in the creation of more advanced processes that feature superior qualities.

One of these qualities is the human ability to observe and improve the optimization process that created us, the most obvious example being science [10].

Science can be thought of as a civilization-level self-improvement method. It allows us to work together in a systematic and efficient way and accelerate the rate at which further improvements are made.

The Automation of Science

We know that optimization processes that can create improved versions of themselves are possible, even without an explicit understanding of their own workings, as exemplified by natural selection.

We know that optimization processes can lead to self-reinforcing improvements, as exemplified by the adoption of the scientific method [11] as an improved evolutionary process and successor of natural selection.

This raises questions about the continuation of this self-reinforcing feedback cycle and its possible implications.

One possibility is to automate science [12][13] and apply it to itself and its improvement.

But science is a tool and its bottleneck is its users: humans, the biased [14] product of the blind idiot god that is evolution.

Therefore the next logical step is to use science to figure out how to replace humans by a better version of themselves, artificial general intelligence.

Artificial general intelligence that can recursively optimize itself [15] is the logical endpoint of various converging and self-reinforcing feedback cycles.

Risks from AI

Will we be able to build an artificial general intelligence? Yes, sooner or later.

Even the unintelligent, unconscious and aimless process of natural selection was capable of creating goal-oriented, intelligent and conscious agents that can think ahead, jump fitness gaps and improve upon the process that created them to engage in prediction and direct experimentation.

The question is, what are the possible implications of the invention of an artificial, fully autonomous, intelligent and goal-oriented optimization process?

One good bet is that such an agent will recursively improve its most versatile, and therefore instrumentally useful, resource. It will improve its general intelligence, that is, its cross-domain optimization power.

Since it is unlikely that human intelligence is the optimum, the positive feedback effect that results from using intelligence amplification to amplify intelligence is likely to lead to a level of intelligence that is generally more capable than the human level.

Humans are unlikely to be the most efficient thinkers because evolution is mindless and has no goals. Evolution did not actively try to create the smartest thing possible.

Furthermore, evolution is not limitlessly creative: each step of an evolutionary design must increase the fitness of its host. This makes it probable that there are artificial mind designs that can do what no product of natural selection could accomplish, since an intelligent artificer does not rely on the incremental fitness of each step in the development process.
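The constraint described above can be sketched with a toy one-dimensional fitness landscape. The landscape values and the names `hill_climb` and `global_search` are assumptions made up for this illustration. The climber only accepts a step that strictly increases fitness, so it stops at the first local peak, while a designer that can evaluate candidate designs directly is not bound by the path between them.

```python
def hill_climb(landscape, start):
    """Accept a neighbouring step only if it strictly increases fitness."""
    position = start
    while True:
        neighbours = [i for i in (position - 1, position + 1) if 0 <= i < len(landscape)]
        best = max(neighbours, key=lambda i: landscape[i])
        if landscape[best] <= landscape[position]:
            return position  # no uphill step left: stuck on a (possibly local) peak
        position = best

def global_search(landscape):
    """A designer that can inspect every design, regardless of the path to it."""
    return max(range(len(landscape)), key=lambda i: landscape[i])

# Toy landscape: a small peak at index 2, a valley, then a much higher peak at index 8.
landscape = [1, 2, 3, 2, 1, 0, 4, 8, 9]

stuck_at = hill_climb(landscape, start=0)
optimum = global_search(landscape)
print(stuck_at, landscape[stuck_at])  # 2 3
print(optimum, landscape[optimum])    # 8 9
```

Starting from index 0, the climber would have to pass through the lower-fitness valley at indices 3 to 5 to reach the higher peak, which the strict-improvement rule forbids.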

It is actually possible that human general intelligence is the bare minimum, because the human level of intelligence might have been sufficient to both survive and reproduce, so that no further evolutionary pressure existed to select for even higher levels of general intelligence.

The implication of this possibility might be the creation of an intelligent agent that is more capable than humans in every sense. Maybe because it directly employs superior approximations of our best formal methods, which tell us how to update based on evidence and how to choose between various actions. Or maybe it will simply think faster. It doesn’t matter.

What matters is that a superior intellect is probable and that it will be better than us at discovering knowledge and inventing new technology. Technology that will make it even more powerful and likely invincible.

And that is the problem. We might be unable to control such a superior being, just as a group of chimpanzees is unable to stop a company from clearing their forest [16].

But even if such a being is only slightly more capable than us, we might find ourselves at its mercy nonetheless.

Human history provides us with many examples [17][18][19] that make it abundantly clear that even the slightest advance can enable one group to dominate others.

What happens is that the dominant group imposes its values on the others. Which in turn raises the question of what values an artificial general intelligence might have and the implications of those values for us.

Due to our evolutionary origins, our struggle for survival and the necessity to cooperate with other agents, we are equipped with many values and a concern for the welfare of others [20].

The information theoretic complexity [21][22] of our values is very high. Which means that it is highly unlikely for similar values to automatically arise in agents that are the product of intelligent design, agents that never underwent the millions of years of competition with other agents that equipped humans with altruism and general compassion.

But that does not mean that an artificial intelligence won’t have any goals [23][24]. Just that those goals will be simple and their realization remorseless [25].

An artificial general intelligence will do whatever is implied by its initial design. And we will be helpless to stop it from achieving its goals. Goals that won’t automatically respect our values [26].

A likely implication is the total extinction of all of humanity [27].

[1] Genetic Algorithms and Evolutionary Computation, talkorigins.org/faqs/genalg/genalg.html
[2] Fixing software bugs in 10 minutes or less using evolutionary computation, genetic-programming.org/hc2009/1-Forrest/Forrest-Presentation.pdf
[3] Automatically Finding Patches Using Genetic Programming, genetic-programming.org/hc2009/1-Forrest/Forrest-Paper-on-Patches.pdf
[4] A Genetic Programming Approach to Automated Software Repair, genetic-programming.org/hc2009/1-Forrest/Forrest-Paper-on-Repair.pdf
[5] GenProg: A Generic Method for Automatic Software Repair, virginia.edu/~weimer/p/weimer-tse2012-genprog.pdf
[6] 29+ Evidences for Macroevolution (The Scientific Case for Common Descent), talkorigins.org/faqs/comdesc/
[7] Thermodynamics, Evolution and Creationism, talkorigins.org/faqs/thermo.html
[8] A Collection of Definitions of Intelligence, vetta.org/documents/A-Collection-of-Definitions-of-Intelligence.pdf
[9] plato.stanford.edu/entries/consciousness/
[10] en.wikipedia.org/wiki/Science
[11] en.wikipedia.org/wiki/Scientific_method
[12] The Automation of Science, sciencemag.org/content/324/5923/85.abstract
[13] Computer Program Self-Discovers Laws of Physics, wired.com/wiredscience/2009/04/newtonai/
[14] List of cognitive biases, en.wikipedia.org/wiki/List_of_cognitive_biases
[15] Intelligence explosion, wiki.lesswrong.com/wiki/Intelligence_explosion
[16] 1% with Neil deGrasse Tyson, youtu.be/9nR9XEqrCvw
[17] Mongol military tactics and organization, en.wikipedia.org/wiki/Mongol_military_tactics_and_organization
[18] Wars of Alexander the Great, en.wikipedia.org/wiki/Wars_of_Alexander_the_Great
[19] Spanish colonization of the Americas, en.wikipedia.org/wiki/Spanish_colonization_of_the_Americas
[20] A Quantitative Test of Hamilton's Rule for the Evolution of Altruism, plosbiology.org/article/info:doi/10.1371/journal.pbio.1000615
[21] Algorithmic information theory, scholarpedia.org/article/Algorithmic_information_theory
[22] Algorithmic probability, scholarpedia.org/article/Algorithmic_probability
[23] The Nature of Self-Improving Artificial Intelligence, selfawaresystems.files.wordpress.com/2008/01/nature_of_self_improving_ai.pdf
[24] The Basic AI Drives, selfawaresystems.files.wordpress.com/2008/01/ai_drives_final.pdf
[25] Paperclip maximizer, wiki.lesswrong.com/wiki/Paperclip_maximizer
[26] Friendly artificial intelligence, wiki.lesswrong.com/wiki/Friendly_artificial_intelligence
[27] Existential Risk, existential-risk.org

I'm intrigued as to the thought processes and motivations which led to this article in light of your previous two weeks of comments and posts.

I'm intrigued as to the thought processes and motivations which led to this article in light of your previous two weeks of comments and posts.

I realized that I might have entered some sort of vicious circle of motivated skepticism.
I can't ask other people to explore both sides of an argument if I don't do so either.
Someone wrote that I shouldn't ask AI researchers about risks from AI if I don't understand the basic arguments underlying the possibility.
I was curious whether my perception of the arguments in favor of risks from AI is flawed and whether I am missing important points, since I haven't read the Sequences.
I recently wrote that I agree with 99.99% of what Eliezer Yudkowsky writes. The number was wrong. But I wanted to show that it isn't just made up.
I don't perceive myself to be a troll at all. Although some unthoughtful comments might have given that impression.

Although it looks like everyone hates me now, I still don't want to be wrong.

I know that not having read the Sequences is received badly, especially since I have posted a lot in the past. But that's not some incredible evil plan or anything. I am unable to play games I want to play for longer than 20 minutes either. Yet I have to do physical exercises every day for like 2 hours, even though I don't really want to. It sometimes takes me months to read a single book. I think some here underestimate how people can act in a weird way without being evil. I have been in psychiatric therapy for 3 years now (yeah, I can prove this).

I can neither get myself to read the Sequences nor am I able to ignore risks from AI. But I am trying.

Thank you for explaining.

I think you're an important guy to have around for reasons of evaporative cooling.

I like the combination of conciseness and thoroughness you've achieved with this.

There are a couple of specific parts I'll quibble about:

Therefore the next logical step is to use science to figure out how to replace humans by a better version of themselves, artificial general intelligence.

"The Automation of Science" section seems weaker to me than the others, perhaps even superfluous. I think the line I've quoted is the crux of the problem; I highly doubt that the development of AGI will be driven by any such motivations.

Will we be able to build an artificial general intelligence? Yes, sooner or later.

I assign a high probability to the proposition that we will be able to build AGI, but I think a straight "yes" is too strong here.

Agreed -- AGI will probably not be developed with the aim of improving science.

I also want to quibble about this:

Therefore the next logical step is to use science to figure out how to replace humans by a better version of themselves

Since most readers don't want to be replaced, at least in one interpretation of that term, this line sticks in the throat and breaks the flow. The natural response is something like "logical? According to whose goals?"

The information theoretic complexity of our values is very high. Which means that it is highly unlikely for similar values to automatically arise in agents that are the product of intelligent design, agents that never underwent the millions of years of competition with other agents that equipped humans with altruism and general compassion.

But that does not mean that an artificial intelligence won’t have any goals. Just that those goals will be simple and their realization remorseless.

New York City is complex - yet it exists. Linux is complex - yet it exists. Something being in a tiny corner of a search space doesn't mean it isn't going to be hit.

Nobody argues that complex values will "automatically arise" in machines. They will be built in - in a similar way to the way car air bags were built in - or safety features on blenders were built in.

NYC and Linux were built incrementally. We can't easily test a super intelligent AI's morality in advance of deploying it. And the probability of failure is conjunctive, since getting just one thing wrong means failure.

Out of curiosity, what are your current thoughts on the arguments you've laid out here?

Out of curiosity, what are your current thoughts on the arguments you've laid out here?

Strong enough to justify the existence of an organisation like SIAI. Everything else is a matter of expected utility calculations. Which I am not able to handle. Not given my current education and not given my psyche.

I know that what I am saying is incredibly repugnant to some people here. I see no flaws. But I can't help but flinch away from taking all those ideas seriously. Although I am currently trying hard. I suppose the post above is a baby-step.

This video pretty much is the window to my soul. You see how something can be completely rational yet feel ridiculous?

Less Wrong opens up the terrifying vistas of reality that I have tried to flee from since a young age.

The most merciful thing in the world, I think, is the inability of the human mind to correlate all its contents. We live on a placid island of ignorance in the midst of black seas of infinity, and it was not meant that we should voyage far. The sciences, each straining in its own direction, have hitherto harmed us little; but some day the piecing together of dissociated knowledge will open up such terrifying vistas of reality, and of our frightful position therein, that we shall either go mad from the revelation or flee from the light into the peace and safety of a new dark age.

-- The Call of Cthulhu

I felt compelled to try and see if I can make it all vanish.

as the process that explains the local decrease in entropy

I don't think so. It increases entropy, in fact.

http://www.amazon.com/Evolution-Entropy-Science-Conceptual-Foundations/dp/0226075745

Human history provides us with many examples [17][18][19] that make it abundantly clear that even the slightest advance can enable one group to dominate others.

There are also counterexamples of technologically underprivileged groups resisting quite successfully. I think there might be a chapter on this in War Before Civilization.

Beware positive bias.

Regarding the vast 'mind design space', it gets infinitely smaller when you are to stop considering the theoretical stuff based on oracles and realize that the classical computing AI - the one still competing with us for resources - can only square or cube current processing power before it runs out of things in the universe.

Let's say, to the 4th power, just to cover our bases. The computational complexity of e.g. weather forecasting (or forecasting of any other nonlinear phenomenon) grows at least exponentially with the forecast time - with no shortcuts - and so your superhuman intelligence is not even a very impressive weather forecaster, with only a 4 times longer forecast span. It is also not a very impressive self-forecaster. And it is not very impressive at handling nonlinearly coupled unknowns (for which it must simulate all combinations to get the future utility, i.e. 100 unknowns with 10 values each = a space of 10^100).
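The arithmetic behind "4th power of compute, 4 times the forecast span" can be checked with a small sketch. It assumes, as the comment does, that cost grows exponentially with the forecast horizon; the function name `forecast_horizon`, the growth factor of 2 per step and the figure of 1e15 operations are hypothetical placeholders, not measurements.

```python
import math

def forecast_horizon(compute, growth_per_step=2.0):
    """Horizon t at which an exponential cost growth_per_step ** t uses up `compute`."""
    return math.log(compute) / math.log(growth_per_step)

current_compute = 1e15                  # hypothetical budget, arbitrary units
boosted_compute = current_compute ** 4  # "to the 4th power"

horizon_ratio = forecast_horizon(boosted_compute) / forecast_horizon(current_compute)
print(f"forecast horizon grows by a factor of about {horizon_ratio:.2f}")

# Coupled unknowns: every combination of values has to be considered.
unknowns, values_each = 100, 10
joint_states = values_each ** unknowns
print(f"joint state space: 10^{len(str(joint_states)) - 1} combinations")
```

Under this model the ratio does not depend on the starting budget or on the growth factor, since log(C^4) / log(C) = 4 for any C > 1.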

It just operates based on time-local strategies with rules like 'assign negative utility to actions you can't undo' (proportional to accuracy of undoing) , 'assign the positive utility to the logarithm of number of available choices', 'assign positive utility to collection of interesting information', as well as more sophisticated, complementary ones - which it would come up with not by forecasting but by testing strategies on hypothetical scenarios. The end result likely won't even resemble straightforward utility maximization any more than human behaviour resembles utility maximization. It would be more optimal, but once again, not in the sense of outperforming theoretical utility maximization at maximizing utility, but in the sense of trading accuracy for speed better.

Bottom line is, the AIs that fit inside our universe, are only a very tiny fraction of mind design space; the very scary super psychopathic monsters were pulled out of other parts of the mind design space, far off in the area where gods live.

Not to say that there is absolutely no risk in AIs, but the risk is of an entirely different kind, arising from an entirely different kind of entity. One should be careful not to set off the time-local strategies that destroy you - overt unfriendliness towards AI may set off one or another strategic solution, and lead to a more or less discriminating response.

Unless it develops better quantum computing or exploits other strange physical phenomena we don't know about.

At which point all bets are off with regard to whether it would even compete for resources. A lot of stuff of this kind can happen, e.g. discovering that we are in a simulation.

The bottom line is, now that we have established the limitations, the AI risks had better be prefaced with "If AI develops awesome quantum computing, but it still needs atoms from your body, the following theorizing about godlike AIs might apply:".

Furthermore, as the AI has to start off with the low hanging fruit - optimizing itself on commodity hardware - the argumentation is not about random points in the awesome quantum mind design space, but about our original classical AI's creations, subject to the original AI getting perfectly ordinary cold feet about the transition due to a wide variety of heuristics that scream 'no, that change is too big and unpredictable', and to the AI trying to preserve itself through the transition.

Regarding the vast 'mind design space', it gets infinitely smaller when you are to stop considering the theoretical stuff based on oracles and realize that the classical computing AI - the one still competing with us for resources - can only square or cube current processing power before it runs out of things in the universe.

But we don't know how much it can improve its algorithms before it hits the theoretical limit of efficient resource utilization and has to start expanding outward. You sound a bit like you're assuming that human brains are already near that limit (so that the only way that an AI could beat us was by grabbing lots and lots of resources). So resource-boundedness doesn't really tell us anything about the upper bound on AI harmfulness. That's one of the ways you could try to defeat the argument about AI risk. Another would be arguing about the thesis that AIs are likely to be harmful if safety isn't carefully engineered into them. I don't see how we could deduce anything reassuring about that issue from the fact that AIs will have bounded resources.

But we don't know how much it can improve its algorithms before it hits the theoretical limit of efficient resource utilization and has to start expanding outward. You sound a bit like you're assuming that human brains are already near that limit (so that the only way that an AI could beat us was by grabbing lots and lots of resources). So resource-boundedness doesn't really tell us anything about the upper bound on AI harmfulness.

You forget one important bit: There are other sources of insights about AI than human analogies and speculations what oracle would do.

It stands that no amount of optimization of the AI's 'predictor of the future' can beat the Lyapunov exponent; the AI, however effective, still can't forecast jack shit. On top of that, the AI has an enormous number of actions it can take, far larger than it could process if it were to evaluate them by forecasting the outcomes very accurately.
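The Lyapunov-exponent point can be illustrated with a standard toy system. The sketch below uses the logistic map with r = 4 as a stand-in for a chaotic process; it is not a weather model, and the starting value, error sizes, tolerance and function names are assumptions for illustration. For this map small errors roughly double per step on average, so a million-fold more precise initial measurement should buy only about 20 extra steps of usable forecast.

```python
def logistic(x, r=4.0):
    """One step of the logistic map, a textbook chaotic system at r = 4."""
    return r * x * (1.0 - x)

def steps_until_divergence(x0, initial_error, tolerance=0.1, max_steps=1000):
    """Count steps until two nearby trajectories differ by more than `tolerance`."""
    a, b = x0, x0 + initial_error
    for step in range(1, max_steps + 1):
        a, b = logistic(a), logistic(b)
        if abs(a - b) > tolerance:
            return step
    return max_steps  # never diverged within the step budget

coarse = steps_until_divergence(0.2, 1e-6)   # measurement good to one part in a million
fine = steps_until_divergence(0.2, 1e-12)    # a million times more precise
print(f"usable forecast: {coarse} steps vs {fine} steps")
```

The forecast horizon grows only with the logarithm of the measurement precision, which is the sense in which better prediction machinery runs into a wall here.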

The very important optimization is not thinking about stuff that has a low payoff/thought ratio. Long forecasts have very low payoff, as the cost is exponential in time. Inventions, on the other hand, should pay off very well. The survival of mankind does not hinge on a calculation like 'okay, action A destroys mankind and action B does not, and action A leads to ever so slightly higher utility 1000 years from now; mankind has got to go'. That sort of forecasting is infeasible. A forecast with a short time span is foolish and near-sighted, while the far future is unknown.

It hinges on some general-purpose, effective strategies of the kind 'it is valuable to maximize the choices in the future', 'information is valuable' (curiosity), 'penalize actions depending on how badly you can undo them'. Versus 'if they all act paranoid on Eliezer's suggestion, they might try to damage me'. (Though, hopefully, if the AI is engineering a virus against mankind, the virus won't exterminate but would medicate for paranoia, because that solves the immediate problem just as well while leaving more options for the future and avoiding actions that can't be undone even approximately.) We are still made of atoms that the AI could use, but there are plenty of other atoms around. I'd be more worried about it wanting our computation hardware (brains). Crappy it might be, but it is already around, and doesn't need to be manufactured.

It is very ineffective to just go ahead and limit future options and do entirely irreversible things simply because you don't have a simple expected-utility-based answer to "why not". The future you will still be trying to achieve whatever goals you want to achieve, choosing among the choices it has available, and giving the future self more choices is an extremely solid heuristic even though you can't straightforwardly calculate the expected utility of doing so (due to recursion).

The oracle is the first mental superpower (the idea dates back quite a while), the least feasible, but the easiest for the uneducated to speculate about. It is the easiest way to portray super-intelligence without being super intelligent - why, the super intelligence just knows the complete outcomes of its actions and chooses the best action. That's easy to think of, and it is also an extremely inefficient approach to the maximization of anything.

The point here is not that AI would necessarily be unable to get rid of mankind. The point is that AI is not particularly more likely to do so than mankind itself (and may well be less likely). The risks are differential. People are prone to foolish ideologies too. Other people are not you-friendly intelligences, which is totally obvious when you have not lived all your life in the privileged class. Groups of other people are unfriendly non-you intelligences, too, highly dangerous and prone to hurting you even if it hurts them as well. Presumably the AI will at least be friendly enough not to hurt you at its own expense; you can't assume even this rudimentary friendliness of your fellow ferocious survival machines, crazed and meme-infested to the brim. It remains to be shown that AI is any more of an existential risk than all-natural human stupidity. When one is speculating up scary stuff, one can speculate up scary human ideologies and social orders as easily as scary AI goal systems. The AI can stop us from killing ourselves, or it may kill us; that is not yet a risk until you show that the former is significantly less likely than the latter.

