We are worried about what will happen when we make a system that can do the important things that humans can do, like programming and science. Will it explode off into infinity as it finds better ways to improve itself, or will it be a slower, more manageable process? There are a number of productive ways we can react to this, including:
Attempt to make AI systems controllable by focusing on the control problem
Attempt to predict when we might get AGI by looking at progress, e.g. AI Impacts
Attempt to predict what happens when we get artificial general intelligence by looking at current artificial intelligences and current general intelligences and making predictions about the intersection.
Figure out how to make intelligence augmentation work, so we can improve humans' abilities to compete with full AGI agents.
This blog post looks at the third option. It is important because the only reason we think AGIs might exist is the existence proof of humans. Computer AGI might be vastly different in capabilities, but we can't get any direct information about it; our only possible approach to predicting its capability is to adjust our estimate of what humans can do based on the difference between humans and narrow AI on different tasks, and on those tasks' importance for economically important generality.
There are three core questions in AGI estimation that I have identified so far.
What do we mean by 'general intelligence'?
How important is 'general intelligence' to different economic activities?
How good are narrow AIs, compared to humans, at parts of the processes involved in general intelligence? Can we estimate how good an AGI would be at another task based on the comparison?
We often speak and act as if there is something to one person being better than another at mental acts in general. We try to measure it with IQ; we think there is something there. However, there are a number of different things "general intelligence" might actually be, each of which might lead to better performance while still being general in some way.
1. Better at tasks in general
2. Better at absorbing information by experimentation, so that task performance improves more quickly per bit of information
3. Better at holding more complex behaviours, so that the ceiling on task performance is higher
4. Better at understanding verbal and cultural information, so that they can improve task performance by absorbing information about tasks acquired by other people
5. Better in general at figuring out what information is relevant and how it should be formatted/collected, so they can select the correct paradigm for solving a problem
Number 1 seems trivially untrue. So we shouldn't worry about a superintelligent AGI system automatically being better at persuading people, unless it has been fed the right information. We don't know what the right information is for a superintelligence to get good at persuading, so it is probably still worthwhile being paranoid.
My bet is that being "generally better" is normally a mixture of aspects 2-4 (which feed into each other), but different aspects will dominate in different areas of economic activity. Take physics as an example: I suspect that a human's potential is now limited more by 4, the ability to read and understand other people's research, since we have supercomputers that complex simulations can be offloaded to (so 3 is less important). If we throw more processing power and compute at the "general intelligence" portion of physics and this mainly improves 4, then we should expect such a system to get up to speed on the state of the art of physics much more quickly than humans, but not necessarily to get better with more data. If 5 can be improved in general, then we should expect a superintelligent AGI physicist to arrive at a completely different view of physics and experimentation in short order.
We can investigate this by looking at task performance in humans: if some people are a lot better at absorbing information across a large number of tasks, then we should expect AGIs to be able to vary a lot on this scale too. If variation in humans' general task performance depends heavily on being able to access the internet/books, then we don't have good grounds for expecting computers to go beyond humanity's knowledge easily, unless we get an indication from narrow AI that it is better at something like absorbing information.
Before getting into the other questions, I will illustrate what I mean by the different sorts of generality. I'll build off a simple model of human knowledge that Matt Might used to explain what grad school is like, which can be found here. We can imagine different ways an AGI might be better than a human at filling in the circle.
They could be faster at filling in the circle without human help (2), or able to keep more of the circle in mind at once (3). They could be faster at filling in the circle from human culture (4a), or able to fill in more of the circle from human culture (4b). If they are faster at doing PhD-level work in general, they will expand the circle of knowledge further (5). At some point I will put in-line diagrams or animations of the differences.
Knowing which of these facets humans vary on most, and which have the most impact on the ability to get things done, seems pretty important.
However, we are not interested in all tasks; we are mainly interested in the ones that lead to greater economic activity and impact upon the world.
So we need to ask how important human brains are in economic activity in general, to know how much impact improving them would have.
For example, for programming a computer (an important task in recursive self-improvement), how big a proportion of the total task time is the human's thinking? When an AI researcher is trying to improve an algorithm, how much of the time spent making a new variant goes on thinking versus running the algorithm with different parameters? If it takes 3 days for a new variant of AlphaGo Zero to train itself and be tested, and 1 day for the human to analyse the results and tweak the algorithm, then speeding up the human portion of the process 1000-fold won't have a huge impact. This is especially the case if the AGI is not appreciably better at aspect 2 of generality: it will still need as many iterations of running AlphaGo Zero as a human would in order to improve it.
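The bottleneck argument above is just Amdahl's law. A minimal sketch, using the illustrative 3-day training and 1-day analysis figures from the example (the function itself is generic):

```python
def iteration_speedup(machine_days, human_days, human_speedup):
    """Overall speedup of one improve-train-test cycle when only the
    human (thinking) portion of the cycle is accelerated."""
    before = machine_days + human_days
    after = machine_days + human_days / human_speedup
    return before / after

# Speeding the human up 1000-fold barely helps, because the cycle
# is dominated by the 3 days of training.
print(iteration_speedup(3, 1, 1000))  # ~1.333
```

Even an infinitely fast thinker only gets the cycle down to the 3 days of training, a speedup of at most 4/3.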
This brings us to the question: when we create superhuman narrow AI systems, which aspects of the system are superhuman? Is it the ability to absorb knowledge, the amount of knowledge the system has been given (as it can be given knowledge more quickly than a human), or the ability to hold a more complex model than we can hold in our heads?
If we determine that the superhuman aspects are the ones that are very important for different economic activities, and that the aspects we have not improved are unimportant, then we should increase our expectation of the impact of AGI. This should hopefully stop us being too provincial in our predictions about what AGI will be capable of doing.
A first stab at the kind of reasoning I want to attempt is here, where I look at how quickly AlphaGo Zero learns compared to humans, based on the number of games played. This might be a bad comparison, but if it is, we should try to improve upon it.
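A toy version of that comparison might look as follows. The AlphaGo Zero figure (~4.9 million self-play games in its 3-day run) is from DeepMind's published description; the human figure is a rough guess for a professional's lifetime game count, included only to show the shape of the calculation:

```python
# Raw sample-efficiency comparison: games needed to reach professional
# strength. The human figure is an assumption, not a measured value.
AGZ_GAMES_TO_PRO_LEVEL = 4_900_000   # self-play games (3-day run)
HUMAN_GAMES_TO_PRO_LEVEL = 50_000    # assumed lifetime games (guess)

ratio = AGZ_GAMES_TO_PRO_LEVEL / HUMAN_GAMES_TO_PRO_LEVEL
print(f"AlphaGo Zero used ~{ratio:.0f}x more games than the human")
```

Note the raw count is unfair in an interesting way: the human also absorbs other people's games and centuries of Go culture (aspect 4), which a games-played figure doesn't capture. A better comparison would try to account for that.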
I have outlined a potential method for gaining information about AGI that relies mainly on experiments on human task performance. We do not have many options when it comes to getting information about AGI, so we should evaluate all the possible ways of getting it.
This model of human knowledge informs my view of research: we are not generally good at pushing back the boundaries of science, but we can absorb any of the knowledge that humanity has already acquired, so we look general.
I'd like to discuss this in person. If anyone knows a meetup in London that might be interested in it, can you let me know? I'll poke the London rationalish mailing list myself.