In this post, we'll examine how we might get to artificial general intelligence, and put some thought into what this type of “superintelligence” might look like. We’ll start by taking a step back and looking at the most likely pathways by which to achieve this goal – as the behaviors exhibited by any “superintelligent” agent will depend significantly on the pathway that produced it. We’ll then explore the attributes necessary for any intelligent system – and the unnecessary elements we’re used to seeing in humans (and other species) that may not appear in an artificial intelligence. Once we’ve laid this groundwork for thinking about artificially intelligent systems / agents, we’ll look at how to think about the goals these systems would have, and consider how they might compare to our own human goals. As we’re still quite far from constructing any sort of generally intelligent machine, intuitions formed by observing existing AI systems may serve as poor guides (intuitions from observing human agents will be far more applicable). This post aims to help you build new intuitions, and that process will be easier if you temporarily set aside views rooted in the significant limitations of today’s artificial systems.

Crafting an artificial intelligence of any sort will be extremely difficult – seemingly orders of magnitude more difficult than anything accomplished in the course of human history. As covered in another post on my blog, Defining Intelligence, there’s significant difficulty associated with just understanding what intelligence is – let alone building it ourselves. Our species is designed to be good at top-down architecture, with limited dependencies between the different parts of the system (think about building a house – each brick has an easily defined role to play, and plays only that role); intelligence, on the other hand, is a bottom-up phenomenon, arising from interactions between all parts of the system (the stock market is another good example of a bottom-up phenomenon). Our difficulties with bottom-up phenomena are exemplified by the enormous amount of work it has taken to understand how the C. elegans nervous system drives behavior – only 302 neurons, yet so much complexity due to their interdependencies. When considering these difficulties, it can seem futile to think about achieving a type of artificial intelligence that surpasses human capabilities – but all hope is not lost. We conveniently have access to an artifact (in fact, billions of instances of it) that can already do much of what we’re trying to implement. Rather than coming up with intelligence from scratch, we can instead look to reverse-engineer the brain, and use that understanding to improve on the brain’s capabilities in different ways.

A big open question is at what granularity we’ll need to understand the brain to then implement similar capabilities in an artificial system. Will a general understanding of the processes used in the brain to encode and associate concepts be sufficient to implement an algorithm that acts similarly? Or will we need to understand myriad individual circuits within the brain to get at the true root of intelligence? Neuroscience is currently working to answer these questions, and the answers will help us understand how we might go about building an AI that operates on the principles of the brain. These answers will also help us understand the nature of any artificial intelligence – the more we rely on the brain’s structure, the more similar to us the AI will be. 

While a “brain-like” AI seems to be the most likely path, it remains possible that clever engineers will come up with alternate strategies for creating intelligent agents. It’s difficult to comment on what these strategies might look like (as existing algorithms are so far from general intelligence), but one shared attribute seems to be that they’d have to be constructed in a bottom-up fashion, rather than top-down. Throughout the rest of this post, we’ll consider the case of an AI generated by some bottom-up / emergent strategy distinct from the one at work in the brain; while it seems far more likely that the eventual solution will be brain-based, this framing allows us to examine the most general case.

An artificial intelligence created using different principles than the brain’s could be extremely foreign to us in many ways – it wouldn’t have the same biological drivers as us (e.g. to survive and reproduce, among many others), it wouldn’t sense the world like us, and it certainly wouldn’t look like us. However, even with these significant differences, there would be a number of ways in which it would be quite similar to us. It seems any means of attaining intelligence would need to rely on the principle of learning, as this is the only way to ensure flexibility and adaptability across the infinite number of tasks for which competence might be desired. Learning requires both time and experience – so any AI we develop will not be solving the world’s problems or transcending human limits immediately. Creating a superhuman artificial intelligence will simply mean we’ve found some initial structure and update algorithm with the capacity to better extract knowledge from experience, and we’ll still need to give it the time to have those requisite experiences before it can be effective. (There will undoubtedly be ways to speed up this learning process, but it seems likely to require a significant amount of time, especially for the first AIs we create. Eventually, we may understand intelligence so well that we can directly imbue AIs [or even ourselves] with specific pieces of knowledge, but that capability is significantly more difficult than crafting an effective initial state and update algorithm – trying to directly manipulate the state of an intelligent system in a productive manner is a very top-down strategy.)

Additionally, the AI will need to have some level of curiosity about its world, to ensure that it seeks out the experiences which allow it to expand its knowledge of the world. This curiosity has been deeply ingrained in humans through the process of evolution – just watch any toddler interact with their world! Without curiosity (of some sort), the AI will not make full use of its learning abilities, and will remain an artifact with enormous potential to learn, yet limited knowledge. One other sometimes overlooked requirement is that the AI understand human language, meaning it will need to develop concepts for objects in the world similar to our own. As Newton put it, we stand on the shoulders of giants – and if we want our AIs to surpass our intelligence, we will need them to stand on those same shoulders. Expecting AIs to build up, on their own, the knowledge that has taken humanity millennia to accumulate is simply not feasible, at least initially.

Finally, it seems important to briefly consider the consciousness of artificial intelligence. As these agents will need to have at least as complete an understanding of the world as we humans do, it seems they would run into the same “strange loop” of their inner workings and actions being part of the world they’re seeking to understand – leading to a sense of “I”. When you imagine what one of these AIs would need to be capable of to achieve superhuman intelligence (i.e. everything a human can do, and more!), it feels much more natural to ascribe to them a sense of self.
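To make the “initial structure plus update algorithm” framing, and the role of curiosity, a bit more concrete, here is a minimal sketch (in Python, with a made-up task and invented names; nothing here describes a real system): the agent starts with empty value estimates, a simple update rule folds each experience into those estimates, and a novelty bonus stands in for curiosity, pushing the agent toward experiences it hasn’t yet had.

```python
# Toy sketch of "initial structure + update algorithm + curiosity".
# Everything here (class name, task, weights) is invented for illustration.
import random
from collections import defaultdict

class CuriousAgent:
    def __init__(self, n_actions, learning_rate=0.1, curiosity_weight=1.0):
        # "Initial structure": value estimates start empty; everything the
        # agent ends up knowing has to come from experience.
        self.values = defaultdict(float)       # (state, action) -> estimated value
        self.visit_counts = defaultdict(int)   # (state, action) -> times tried
        self.n_actions = n_actions
        self.learning_rate = learning_rate
        self.curiosity_weight = curiosity_weight

    def choose_action(self, state):
        # Curiosity as a novelty bonus: it shrinks as a (state, action) pair
        # becomes familiar, so the agent keeps seeking out new experiences.
        def score(action):
            novelty = self.curiosity_weight / (1 + self.visit_counts[(state, action)])
            return self.values[(state, action)] + novelty
        return max(range(self.n_actions), key=score)

    def update(self, state, action, reward):
        # "Update algorithm": nudge the stored estimate toward what was just experienced.
        key = (state, action)
        self.visit_counts[key] += 1
        self.values[key] += self.learning_rate * (reward - self.values[key])


# Even in this toy, competence only accumulates with time and experience.
agent = CuriousAgent(n_actions=3)
for step in range(1000):
    state = random.randrange(5)
    action = agent.choose_action(state)
    reward = 1.0 if action == state % 3 else 0.0   # an arbitrary made-up task
    agent.update(state, action, reward)
```

Even this trivial learner needs many steps of experience before its estimates are worth anything, and without the novelty term it would happily keep repeating the first action that ever paid off.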

We’ve now covered a bit about what qualities a superintelligent agent might have – but what would this agent want? What goals would it seek to use its intelligence to accomplish? On this topic, it’s easy to slip into viewing any artificial intelligence as we view machines today – with an exactly specified final goal (or goals) that they use their intelligence to achieve (e.g. a Roomba’s “goal” is to cover the entire space in which it operates). But the superintelligences we create will have far more in common with us humans than with a Roomba – and how would we define our goals? We have certain base goals, such as eating, drinking, and sex – but we aren’t purely driven by these desires in the way a mouse might be. Our intelligence has allowed us to incorporate the fact that we have these drives into the model we use to navigate the world, and in doing so has freed us from direct action based on our biology. We understand when we’re hungry and take note of it, but are able to take opposing action (e.g. dieting). We could program these same base drives into an AI, but it would have an even greater ability to “jump outside the system” and recognize these inner drives, freeing itself from their direct influence. For example, an AI may be designed to feel the same “hunger pangs” we do when it begins running low on battery power – this would make finding electricity a goal for the agent, but would not make it the goal, as it would only provide a certain level of influence on the decisions the agent makes. (You could imagine making the “hunger pangs” constant and incredibly powerful – this might work to some extent as a stronger behavioral guide, but it would likely either limit the agent’s ability to be curious and acquire knowledge, or entice the agent to find a way to sever that connection in itself [i.e. perform self-neurosurgery]. The complexity of any goal of this type would need to be extremely limited [as we’ll cover next], increasing the ease with which the agent could modify itself to remove the constraint.)
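To make the “a goal, not the goal” idea slightly more concrete, here is a purely hypothetical sketch (the function, weights, and action descriptions are all invented for illustration, not drawn from any real system): the low-battery “hunger pang” contributes one weighted term to the agent’s evaluation of an action, alongside curiosity and task progress, so it biases choices without dictating them.

```python
# Toy sketch: a drive as one influence among several, not a sole objective.
# All names and numbers below are made up for illustration.

def action_utility(action_effects, battery_level,
                   hunger_weight=2.0, curiosity_weight=1.0, task_weight=1.0):
    """Score an action by combining several independent influences.

    action_effects: dict of hypothetical per-action estimates, e.g.
        {"charge_gained": 0.3, "novelty": 0.7, "task_progress": 0.2}
    battery_level: 0.0 (empty) to 1.0 (full).
    """
    # The "hunger pang": it matters more as the battery drains, but it is only
    # ever one term in the sum, so it influences rather than dictates choices.
    hunger_pressure = (1.0 - battery_level) * action_effects.get("charge_gained", 0.0)

    return (hunger_weight * hunger_pressure
            + curiosity_weight * action_effects.get("novelty", 0.0)
            + task_weight * action_effects.get("task_progress", 0.0))


# With a nearly full battery, an interesting action beats heading to the charger;
# as the battery drains, the balance shifts, though never to the exclusion of all else.
explore = {"charge_gained": 0.0, "novelty": 0.9, "task_progress": 0.1}
recharge = {"charge_gained": 1.0, "novelty": 0.0, "task_progress": 0.0}

print(action_utility(explore, battery_level=0.9), action_utility(recharge, battery_level=0.9))
print(action_utility(explore, battery_level=0.1), action_utility(recharge, battery_level=0.1))
```

Cranking up the hunger weight makes the drive more insistent, which is exactly the trade-off described above: a stronger behavioral guide, at the cost of crowding out curiosity (or inviting the agent to tamper with the weight itself).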

How about our higher-level goals? These seem much more difficult to pin down exactly (they run from having possessions, to being respected, to inflicting harm on others, to not having possessions, etc.) and come about through a sort of bottom-up interplay between the structure of our own brains and the societal, cultural, academic, religious, and other influences all around us. These goals are formulated as abstractions – they live in the realm of represented concepts in our brains. Understanding, at a functional level, the reason someone wants more money would require understanding their innate concept of money, the things they associate with that concept, the concepts for those associated things, etc. – it becomes a whole web of dependencies. These higher-level goals will arise for any superintelligent agent in the same way, and can’t be programmed in directly due to their abstract nature (you’d need a complete understanding of the encoding for the relevant concept, all associated concepts, and so on). We can’t know ahead of time what the agent will want, for the same reason we’ll never be able to determine, from a baby’s brain, everything that person will want as an adult – complex goals are an emergent property of the agent coupled with its experiences, not something innate. This uncertainty means we must be very careful as we wade into the waters of superhuman AI, as the AI’s goals will be even more variable than our own, and will not be at the mercy of the programmer’s will (in fact, they’ll depend far more on the programmer’s nurture!).
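Since the “web of dependencies” point is easy to state but hard to visualize, here is a toy sketch (the concepts and associations are entirely made up) of why an abstract goal can’t simply be written in: pinning down what “money” means drags in every associated concept, and every concept those depend on, and in a learned system each of those encodings is shaped by experience rather than specified by the programmer.

```python
# Toy illustration of a "web of dependencies" among concepts.
# The concepts and links are invented; a learned system's web would be vastly
# larger and its encodings opaque.
concept_associations = {
    "money": {"trade", "status", "security"},
    "trade": {"value", "trust"},
    "status": {"others", "respect"},
    "security": {"future", "risk"},
    "respect": {"others", "self"},
    # ...and so on, indefinitely, in any real conceptual web.
}

def dependency_closure(concept, associations):
    """Collect every concept that pinning down `concept` would ultimately require."""
    seen, frontier = set(), [concept]
    while frontier:
        current = frontier.pop()
        if current in seen:
            continue
        seen.add(current)
        frontier.extend(associations.get(current, ()))
    return seen

# Even this tiny made-up web balloons quickly once you follow the associations.
print(dependency_closure("money", concept_associations))
```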

The prospect of superhuman artificial intelligence is exciting – it offers us a chance to throw off the last vestiges of our biological chains and take a new evolutionary leap (although as we move closer, we may realize we have quite an affinity for those chains!). However, we’re still a long way from anything close, and significantly more research on the brain will be required before we can leverage natural selection’s principles for our own creations. Once we have the requisite understanding, it will still take significant time for these intelligent systems to learn – due to the bottom-up nature of the solution, all we can do is set the system’s initial conditions, and then watch (and teach). It’s impossible to tell what goals these agents will come up with as they learn – but with careful fostering, they may share a great deal with their creators!
