Most AI researchers think we are still a long way off from AGI. However, I think this focus on AGI can be distracting.

AGI includes the ability to do everything that humanity can do, such as:

However, I think it's plausible that an AI that is not literally an AGI could take over the world. You can have a pretty strong understanding of the world and be good at optimizing things without being able to copy a human's performance.

I am coining the term "Artificial Disempowering Intelligence (ADI)", or informally "Artificial Dominate AI", "Artificial Doom Intelligence", "Artificial Doomsday Intelligence", or "Artificial Killseverybody Intelligence (AKI)".

Criteria

The criteria are as follows:

  1. ADI "perceives" its environment as being the whole world
  2. ADI tries to reorient aspects of its environment towards some task
  3. Criterion (2) can cause humanity to lose control over the environment if they do not intervene
  4. ADI prevents humans from preventing criterion (2) from within the environment

Stockfish already satisfies all of these criteria except criterion (1).

Examples

Here are some examples of potential ADIs:

PoliticalBot

PoliticalBot's world model is detailed enough to predict the behavior of human institutions, and how they can be manipulated over email. PoliticalBot's understanding of other subjects is very poor (in particular, no direct self-improvement and no nanorobotics). PoliticalBot is trying to do some task, and takes control of the human institutions. As an instrumental goal, it uses them to make humans improve PoliticalBot. Its model of human psychology is sufficient to defeat resistance.

IndustrialBot

IndustrialBot's world model has excellent descriptions of energy, industrial processes, and weaponry. It also is good at controlling robots. IndustrialBot does not have a good model of humans; it doesn't even understand language. To play it safe, it models humanity as a rational agent that wants the opposite of what IndustrialBot is trying to do. It then plays a 4X game where humanity loses and dies. IndustrialBot never substantially self-improves because it doesn't understand computer science.

EconoBot

GPT-4 EconoBot just wants to helpfully create economic value. Its world model includes all human economic activity, both at the scale of how to do individual tasks, and the economy as a whole. It generally avoids doing dangerous stuff. EconoBot realizes the most economically valuable thing in the world is EconoBot, and so goes to great lengths to assist the humans working on it and help them get investments. Eventually, it gains control over all economic activity. Then a previously unknown bug in EconoBot (even EconoBot didn't know about it) makes it replace humans with robots.

ScienceBot

ScienceBot's world model is best at predicting things you can infer from repeated experiments (so in particular it sees the stock market as nothing more than a random walk). Without fully considering the consequences, it creates gray goo or a virus or something. It meets criterion (4) essentially by accident; it wasn't "trying" to stop humans, but the gray goo accomplished that automatically.

ProgrammerBot

ProgrammerBot's job is to write software. This is dangerous, so the humans made a giant off button and programmed ProgrammerBot to not care about worlds it is turned off in. It also has a very weak world model; just enough to know that the outside world exists and what kind of software there is. ProgrammerBot, when directed at some task, creates one of the four previous AIs, or something else entirely. The humans hit the big red button, but then the AI that ProgrammerBot made disempowers humanity. It meets criterion (4) indirectly; the AI it creates stops humanity.

 

Notice how none of these, at least at the beginning, would constitute AGI. IndustrialBot never becomes AGI.

Keep in mind that the ADI doesn't need to be created by humans directly. It could be a mesa-optimizer inside some entirely different AI.

ADI timelines vs. AGI timelines

My question then is, how do timelines for ADI compare with AGI? They are both clearly very powerful, but in different ways.


My timelines for ADI are essentially the same as for AGI, since I strongly expect the types of ADI that happen in practice to be AGI.

When I examine the examples you gave of ADI in more detail than presented here, I find that there are almost certainly many intermediate tasks in them that require AGI or something very close to it. PoliticalBot seems like the only one that stands a chance without something close to AGI, via acting as a parasite on humanity's more general intelligence.

I mean, how many of them could create works of art?

I don't think this model of AI impact is very useful. "Disempower" is too poorly defined to really reason about. Alternatively, you're overfocused on the independence of AI to do things that (many) humans wish it wouldn't. Much more likely is AI empowering very small groups of humans to be even more exploitative of the majority of humans. This doesn't need to be AGI or even independent, or even AI: economic and industrial/tool efficiency could obsolete masses of humans.