AI safety in the age of neural networks and Stanislaw Lem 1959 prediction

turchin

Tl;DR: Neural networks will result in slow takeoff and arm race between two AIs. It has some good and bad consequences to the problem of AI safety. Hard takeoff may happen after it anyway.

Summary: Neural networks based AI can be built; it will be relatively safe, not for a long time though.

The neuro AI era (since 2012) feature an exponential growth of the total AI expertise, with a doubling period of about 1 year, mainly due to data exchange among diverse agents and different processing methods. It will probably last for about 10 to 20 years, after that, hard takeoff of strong AI or creation of Singleton based on integration of different AI systems can take place.

Neural networks based AI implies slow takeoff, which can take years and eventually lead to AI’s evolutionary integration into the human society. A similar scenario was described by Stanisław Lem in 1959: the arms race between countries would cause power race between AIs. The race is only possible if the self-enhancement rate is rather slow and there is data interchange between the systems. The slow takeoff will result in a world system with two competitive AI-countries. Its major risk will be a war between AIs and corrosion of value system of competing AIs.

The hard takeoff implies revolutionary changes within days or weeks. The slow takeoff can transform into the hard takeoff at some stage. The hard takeoff is only possible if one AI considerably surpasses its peers (OpenAI project wants to prevent it).

Part 1. Limitations of explosive potential of neural nets

Everyday now we hear about success of neural networks, and we could conclude that human level AI is near the corner. But such type of AI is not fit for explosive self-improvement.

If AI is based on neural net, it is not easy for it to undergo quick self-improvement for several reasons:

1. A neuronet’s executable code is not fully transparent because of theoretical reasons, as knowledge is not explicitly present within it. So even if one can read neuron weight values, it’s not easy to understand how they can be changed to improve something.

2. Educating a new neural network is a resource-consuming task. If a neuro AI decides to go the way of self-enhancement, but is unable to understand its source code, a logical solution would be to ‘deliver a child’, i.e. to teach a new neural network. However, educating neural networks requires much more resources than their executing; it requires huge databases and has high failure probability. All those factors will lead to rather slow AI self-enhancement.

3. Neural network education depends on big data volumes and new ideas coming from the external world. It means that a single AI will hardly break away, if it has stopped free information exchange with the external world; its level will not surpass the rest of the world considerably.

4. The neural network power has relatively linear dependence on the power of the computer it’s run on, so with a neuro AI, the hardware power is limiting to its self-enhancement ability.

5. Neuro AI would be a rather big program of about 1 TByte, so it can hardly leak into the network unnoticed (at current internet speeds).

6. Even if a neuro AI reaches the human level, it will not get self-enhancement ability (because no one person can understand all scientific aspects). For this end, a big lab with numerous experts in different branches is needed. Additionally, it should be able to launch such virtual laboratory at a rate at least 10 -100 times higher than that of a human being to get an edge as compared to the rest of mankind. That is, it has to be as powerful as 10,000 people or more to surpass the rest part of the mankind in terms of enhancement rate. This is a very high requirement. As a result, the neural net era can lead to building a human, or even a bit superhuman level AI, which is unable to self-enhance or does it so slowly that lags behind the technical progress.

The civilization-level intelligence is the total IQ that the civilization possesses for 100 years of its history, which is defined as a complexity of scientific and engineering tasks it can solve. For example, during the 20^th century, nuclear weapon was created, but problems of cancer, aging and AI creation failed to be solved. It means, those tasks have superior complexity.

For a strong AI to be able to change the human destiny, its IQ should 100 – 1000 times surpass that of the entire civilization. In this case, it can create a remedy against aging, treat cancer etc. within a year. (To destroy mankind, it would need a much lower level of intelligence like that of a sophisticated computer virus, which can,

7. The richest neural net project will get decisive advantage (or not?). Neuronets based AI’s power is approximately proportional to the power of computers it’s run on, time, and a size of the base it was educated on, and number of researchers in the project. The project from the basement may only succeed if it has superior master algorithm of intelligence. That said, small project can hardly experience exponential self-improvement, because the first AI will already have had better computing power and, more important, better databases and researchers. And as now we live in AI-summer period and there is a lot of hype about AI, big companies will invest to buy the best ideas about master algorithm (by buying startups).

Objection here is that now everybody who spent 1 year in machine learning and has GPU could reach meaningful results in creation his own RNN. So may be collective intelligence of private researchers may overweight big companies.

A detailed roadmap of AI failure modes can be found here; and the roadmap of the ways of creating a safe AI here.

See also machine learning expert opinion and recent FHI post.

Part 2. Some ideas about ensuring safety of neural nets based AI

The mentioned concepts can be helpful in creating extra safety measures (while not absolute, they can reduce chances of AI getting out of control.)

1. Program memory access restriction. In state-of-the-art computers, a program code and data share the same memory, which is known as von Neumann architecture. However, there is also Harvard architecture, where data and programs are stored separately. https://en.wikipedia.org/wiki/Harvard_architecture

This kind of computer can be physically produced and provided with source-code reading attempt indicators. (However, there are neural networks at the moment, which generate a new neural network for every task, so they have higher efficiency, and this approach will not work for them or limit AI power).

2. A compiled executable code and its source written in a high-level programming language should be stored separately. If AI tries to reach it, it will be a signal that it is getting out of control. With neural networks, a compiled code, its source, and a database used to educate the neural network should be stored separately.

3. AI should be limited in knowledge useful to build AI. To improve itself, AI should have very precise understanding of its internal construction (the same applies to a human being). It enables us to keep it under control: if not provided with this data, it will have problems with self-improvement. That is AI, which is not an AI expert, is safer to some extent.

4. We can use “foggy architecture”, which is very difficult for optimizing because the AI’s values and major features are hidden among heaps of other details. Neural networks are the architecture of this kind. But it will make it also difficult to us to put our values in it.

5. We can physically restrict the ability of the AI to create “AI children”; their bringing-up requires huge resources, time, and data prior they become more effective than their parents. AI can create another AI, but then it will need to test it for a long time to find out if the latter observes the parent’s values, if it has any mistakes, and if it is really superior to the original version. This kind of testing is time-consuming, generally nontrivial, and even incomputable. Especially, this kind of task is difficult for neural networks which contain a huge amount of random elements, are not exact in principle, and their faultlessness can not be proven. (Moreover, there are numerous examples when even a small image shift will cause a recognition error.) However, this problem can be solved because this is the way that the human society goes: children are born, tested, and educated.

Creation of AI-children may be restricted by restricting needed infrastructure: memory, datasets, processing power.

6. A neuronet-based agent (like a robot) will be anthropomorphous in terms of its brain structure. Even if we shall not imitate the human brain intentionally, we shall get approximately the same thing. In a sense, it’s may be good as even if these AIs supplant people, they still will be almost people who are different from normal people like one generation from another. And being anthropomorphous they may be more compaterble with human value systems. Along with that, there may exist absolutely humanless AI architecture types (for example, if evolution is regarded as an inventor.)

But neural net world will be not EM-dominated world of Hanson. EM-world may appear on later stage, but I think that exact uploads still will not be dominating form of AI.

Part 3. Transition from slow to hard takeoff

In a sense, neuronet-based AI is like a chemical fuel rocket: they do fly and can fly even across the entire solar system, but they are limited in terms of their development potential, bulky, and clumsy.

Sooner or later, using the same principle or another one, completely different AI can be built, which will be less resource-consuming and faster in terms of self-improvement ability.

If a certain superagent will be built, which can create neural networks, but is not a neural network itself, it can be of a rather small size and, partly due to this, experience faster evolution. Neural networks have rather poor intelligence per code concentration. Probably, the same thing could be done in a more optimum way by reducing its size by an order of magnitude, for example, by creating a program to analyze an already educated neural network and get all necessary information from it.

When, in 10 – 20 years, hardware will improve, multiple neuronets will be able to evolve within the same computer simultaneously or be transmitted via the Internet, which will boost their development.

Smart neuro AI can analyze all available data analysis methods and create new AI architecture able to speed up faster.

Launch of quantum-computer-based networks can boost their optimization drastically.

There are many other promising AI directions which did not pop up yet: Bayesian networks, genetic algorithms.

The neuro AI era will feature exponential growth of the total humanity intelligence, with a doubling period of about 1 year, mainly due to the data exchange among diverse agents and different processing methods. It will last for about 10 to 20 years (2025-2035) and, after that, hard take-off of strong AI can take place.

That is, the slow take-off period will be the period of collective evolution of both computer science and mankind, which will enable us to adapt to changes under way and adjust them.

Just like there are Mac and PC in the computer world or democrats and republicans in politics, it is likely that two big competing AI systems will appear (plus, ecology consisting of smaller ones). It could be Google and Facebook or USA and China, depending on whether the world will choose the way of economical competition or military opposition. That is, the slow take-off hinders the world consolidation under the single control, but rather promotes a bipolar model. While a bipolar system can remain stable for a long period of time, there are always risks of a real war between the AIs (see Lem’s quote below).

Part 4. In the course of the slow takeoff, AI will go through several stages, that we can figure out now

While the stages can be passed rather fast or be diluted, we still can track them like milestones. The dates are only estimates.

1. AI autopilot. Tesla has it already.

2. AI home robot. All prerequisites are available to build it by 2020 maximum. This robot will be able to understand and fulfill an order like ‘Bring my slippers from the other room’. On its basis, something like “mind-brick” may be created, which is a universal robot brain able to navigate in natural space and recognize speech. Then, this mind-brick can be used to create more sophisticated systems.

3. AI intellectual assistant. Searching through personal documentation, possibility to ask questions in a natural language and receive wise answers. 2020-2030.

4. AI human model. Very vague as yet. Could be realized by means of a robot brain adaptation. Will be able to simulate 99% of usual human behavior, probably, except for solving problems of consciousness, complicated creative tasks, and generating innovations. 2030.

5. AI as powerful as an entire research institution and able to create scientific knowledge and get self-upgraded. Can be made of numerous human models. 100 simulated people, each working 100 times faster than a human being, will be probably able to create AI capable to get self-improved faster, than humans in other laboratories can do it. 2030-2100

5a Self-improving threshold. AI becomes able to self-improve independently and quicker than all humanity

5b Consciousness and qualia threshold. AI is able not only pass Turing test in all cases, but has experiences and has understanding why and what it is.

6. Mankind-level AI. AI possessing intelligence comparable to that of the whole mankind. 2040-2100

7. AI with the intelligence 10 – 100 times bigger than that of the whole mankind. It will be able to solve problems of aging, cancer, solar system exploration, nanorobots building, and radical improvement of life of all people. 2050-2100

8. Jupiter brain – huge AI using the entire planet’s mass for calculations. It can reconstruct dead people, create complex simulations of the past, and dispatch von Neumann probes. 2100-3000

9. Galactic kardashov level 3 AI. Several million years from now.

10. All-Universe AI. Several billion years from now

Part 5. Stanisław Lem on AI, 1959, Investigation

In his novel «Investigation» Lem's character discusses future of arm race and AI:

-------

- Well, it was somewhere in 46^th, A nuclear race had started. I knew that when the limit would be reached (I mean maximum destruction power), development of vehicles to transport the bomb would start. .. I mean missiles. And here is where the limit would be reached, that is both parts would have nuclear warhead missiles at their disposal. And there would arise desks with notorious buttons thoroughly hidden somewhere. Once the button is pressed, missiles take off. Within about 20 minutes, finis mundi ambilateralis comes - the mutual end of the world. <…> Those were only prerequisites. Once started, the arms race can’t stop, you see? It must go on. When one part invents a powerful gun, the other responds by creating a harder armor. Only a collision, a war is the limit. While this situation means finis mundi, the race must go on. The acceleration, once applied, enslaves people. But let’s assume they have reached the limit. What remains? The brain. Command staff’s brain. Human brain can not be improved, so some automation should be taken on in this field as well. The next stage is an automated headquarters or strategic computers. And here is where an extremely interesting problem arises. Namely, two problems in parallel. Mac Cat has drawn my attention to it. Firstly, is there any limit for development of this kind of brain? It is similar to chess-playing devices. A device, which is able to foresee the opponent’s actions ten moves in advance, always wins against the one, which foresees eight or nine moves ahead. The deeper the foresight, the more perfect the brain is. This is the first thing. <…> Creation of devices of increasingly bigger volume for strategic solutions means, regardless of whether we want it or not, the necessity to increase the amount of data put into the brain, It in turn means increasing dominating of those devices over mass processes within a society. The brain can decide that the notorious button should be placed otherwise or that the production of a certain sort of steel should be increased – and will request loans for the purpose. If the brain like this has been created, one should submit to it. If a parliament starts discussing whether the loans are to be issued, the time delay will occur. The same minute, the counterpart can gain the lead. Abolition of parliament decisions is inevitable in the future. The human control over solutions of the electronic brain will be narrowing as the latter will concentrate knowledge. Is it clear? On both sides of the ocean, two continuously growing brains appear. What is the first demand of a brain like this, when, in the middle of an accelerating arms race, the next step will be needed? <…> The first demand is to increase it – the brain itself! All the rest is derivative.

- In a word, your forecast is that the earth will become a chessboard, and we – the pawns to be played by two mechanical players during the eternal game?

Sisse’s face was radiant with proud.

- Yes. But this is not a forecast. I just make conclusions. The first stage of a preparatory process is coming to the end; the acceleration grows. I know, all this sounds unlikely. But this is the reality. It really exists!

— <…> And in this connection, what did you offer at that time?

- Agreement at any price. While it sounds strange, but the ruin is a less evil than the chess game. This is awful, lack of illusions, you know.

-----

Part 6. The primary question is: Will strong AI be built during our lifetime?

That is, is this a question of future generations’ good (the question that an efficient altruist, not a common person, is concerned about) or a question of my near term planning?

If AI will be built during my lifetime, it may lead to either the radical life extension by means of different technologies and realization of all sorts of good things not to be numbered here or my death and probably pain, if this AI is unfriendly.

It depends on the time when AI is built and my expected lifetime (with the account for the life extension to be obtained from weaker AI versions and scientific progress on one hand, and its reduction due to global risks irrelevant to AI.)

Note that we should consider different dates for different events. If we would like to avoid AI risks, we should take the earliest date of its possible appearance (for example, the first 10%). And if we count on its good, then – the median.

Since the moment of neuro-revolution, an approximate rate of doubling AI algorithms efficiency (mainly in image recognition area) is about 1 year. It is difficult to quantify this process as the task complexity does not change linearly, and it is always more difficult to recognize recent patterns.

Now, an important factor is a radical change in attitude towards AI research. Winter is over, the unstrained summer with all its overhype has begun. It caused huge investments to AI research (chart), more enthusiasts and employees in this field, and bold researches. It’s a shame to have no own AI project now. Even KAMAZ develops a friendly AI system. The entry threshold has dropped: one can learn basic neuronet adjustment skills within one year; heaps of tutorial programs are available. Supercomputer hardware got cheaper. Also, a guaranteed market of AIs in form of autopilot cars and, in the future, home robots has emerged.

If the algorithm improvement keeps the pace of about one doubling per year, it means 1,000,000 during 20 years, which certainly will be equal to creating a strong AI beyond a self-improvement threshold. In this case, a lot of people (and me) have good chances to live till the moment and get immortality.

Conclusion

Even not self-improving neural AI system may be unsafe if it get global domination (and will have bad values) or if it will go into confrontation with equally large opposing system. Such confrontation may result in nuclear or nanotech based war, and human population may be hostage especially if both systems have pro-human value system (blackmail).

Anyway slow takeoff AI risks of human extinction are not inevitable and are manageable in ad hoc basis. Slow takeoff does not prevent hard takeoff on later stage of AI development.

Hard takeoff is probably the next logical stage of soft takeoff, as it will continue the trend of accelerating progress. During biological evolution we could witness the same process: slow process of brain enlargement of mammalian species in last tens of million years was replace by almost hard takeoff of Homo sapience intelligence which threatens ecological balance.

Hardtake off is a global catastrophe almost by definition, which needs extraordinary measures to be put into safe way. Maybe the period of almost human level neural net based AI will help us to create instruments of AI control. Maybe we could use simpler neural AIs to control self-improving system.

Another option is that neural AI age will be very short and it is already almost over. In 2016 Google Deep Mind beats Go using complex approach of several AI architectures combined. If such trend continue we could get Strong AI before 2020 and we are completely not ready for it.

[-][anonymous]8y10

The biggest problem here is that you start from the assumption that current neural net systems will eventually be made into AI systems with all the failings and limitations they have now. You extrapolate massively from the assumption.

But there is absolutely no reason to believe that the evolutionary changes to NN that are required in order to make them fully intelligent (AGI) will leave them with the all the same characteristics they have now. There will be SO MANY changes, that virtually nothing about the current systems will be true of those future systems.

Which renders your entire extrapolation moot.

[-]TheAncientGeek8y20

virtually nothing

OK, but does anything survive? How about the idea that

Some systems will be opaque to human programmers
...they will also be opaque to themselves
..which will stymie recursive self-improvement.

Well, here is my thinking.

Neural net systems have one major advantage: they use massive weak-constraint relaxation (aka the wisdom of crowds) to do the spectacular things they do.

But they have a cluster of disadvantages, all related to their inability to do symbolic, structured cognition. These have been known for a long time -- Donald Norman, for example, wrote down a list of issues in his chapter at the end of the two PDP volumes (McClelland and Rumelhart, 1987.

But here's the thing: most of the suggested ways to solve this problem (including the one I use) involve keeping the massive weak constraint relaxation, throwing away all irrelevant assumptions, and introducing new features to get the structured symbolic stuff. And that revision process generally leaves you with hybrid systems in which all the important stuff is NO LONGER particularly opaque. The weak constraint aspects can be done without forcing (too much) opaqueness into the system.

Are there ways to develop neural nets in a way that do cause them to stay totally opaque, while solving all the issues that stand between the current state of the art and AGI? Probably. Well, certainly there is one .... whole brain emulation gives you opaqueness by the bucketload. But I think those approaches are the exception rather than the rule.

So the short answer to your question is: the opaqueness, at least, will not survive.

[-]Baughn8y30

Where can I read about this?

Well, the critical point is whether NN are currently on a track to AGI. If they are not, then one cannot extrapolate anything. Compare: steam engine technology is also not going to eventually become AGI, so how would it look if someone wrote about the characteristics of steam engine technology and tried to predict the future of AGI based on those characteristics?

My own research (which started with NN, but tried to find ways to get it to be useful for AGI) is already well beyond the point where the statements you make about NN are of any relevance. Never mind what will be happening in 5, 10 or 20 years.

[-]turchin8y10

It looks like you are on track to hard takeoff, but from other domains I know that people tend to overestimate their achievements 10-100 times, so I have to be a little bit sceptical. NN is much closer to AGI than steam engines anyway.

I agree that NN will eventually evolve in something else and this will end NN age, which may last in my opinion 10-20 years, but may be as short as 5 years. After NN age will end, most of these assumption should be revisited, but now situation looks like we live in such age.

[-]buybuydandavis8y00

Part 1b. UNLimitations of explosive potential of neural nets

NNs are highly parallelizable. Functional units are highly parallelizable.

We're currently in the midst of performance gains from application of deep learning.

Toolkits and hardware are parallelizing the work stream.

The better we get at it, the more people and resources are drawn into the work.

I see the hardware gains getting multiplied by expanding techniques getting multiplied by expanded toolkit availability getting multiplied by expanding resources.

Building blocks are getting improved very fast right now. It's going to come down to how hard system integration is.

[-]tukabel8y-20

Just a little remark: before "friendly AI" becomes a problem, there's the usual one: "friendly humans".

Let's take the standard assumption:

Evolution is about evolution of Intelligence. Needs our last animal/wetware stage only to develop its "in silico" successor.

Then the conclusion is quite obvious:

We are the danger. Mostly because due to our stupidity we can easily destroy ourselves before advancing to the next level.

It's quite a simplecstory: animals got technology from another, higher civilization - this smells deadly.

Memetic parallel civilization of a tiny intelligent but powerless fraction of per cent is giving the humanimal "society" (for free and without any say/power over its usage) more and more advanced power based on sci/tech.

Enormous conflict coming from the discrepancy between these advances and animalistic roots of the "society" (ruled by the usual predator/prey jungle-system - call it king/slaves, employer/employees or politician/taxpayers) is clear.

Already nuclear bombs were beyond the limit. Now imagine cheap, almost homemade drone-transported nuclear grenade - coming nanotech can be much worse, as we all know.

Now slip just for a while into the boots of the upcoming Super AGI.

Giving this techpower to the bunch of predatory lunatics (read: governments) often drugged by stupid ideologies? Or even general public?

Are you kidding?

We are much bigger existential risk for the Superintelligence than the AGI is for us (and if it's really THE Superintelligence, then the humans do not matter anymore, anyway).