The date of AI Takeover is not the day the AI takes over

by Daniel Kokotajlo · 2 min read · 22nd Oct 2020 · 23 comments

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

Instead, it’s the point of no return—the day we AI risk reducers lose the ability to significantly reduce AI risk. This might happen years before classic milestones like “World GWP doubles in four years” and “Superhuman AGI is deployed.”

The rest of this post explains, justifies, and expands on this obvious but underappreciated idea. (Toby Ord appreciates it; see quote below). I found myself explaining it repeatedly, so I wrote this post as a reference.

AI timelines often come up in career planning conversations. Insofar as AI timelines are short, career plans which take a long time to pay off are a bad idea, because by the time you reap the benefits of the plans it may already be too late. It may already be too late because AI takeover may already have happened.

But this isn’t quite right, at least not when “AI takeover” is interpreted in the obvious way, as meaning that an AI or group of AIs is firmly in political control of the world, ordering humans about, monopolizing violence, etc. Even if AIs don’t yet have that sort of political control, it may already be too late. Here are three examples:

  1. Superhuman agent AGI is still in its box but nobody knows how to align it and other actors are going to make their own version soon, and there isn’t enough time to convince them of the risks. They will make and deploy agent AGI, it will be unaligned, and we have no way to oppose it except with our own unaligned AGI. Even if it takes years to actually conquer the world, it’s already game over.

  2. Various weak and narrow AIs are embedded in the economy and beginning to drive a slow takeoff; capabilities are improving much faster than safety/alignment techniques and due to all the money being made there’s too much political opposition to slowing down capability growth or keeping AIs out of positions of power. We wish we had done more safety/alignment research earlier, or built a political movement earlier when opposition was lower.

  3. Persuasion tools have destroyed collective epistemology in the relevant places. AI isn’t very capable yet, except in the narrow domain of persuasion, but everything has become so politicized and tribal that we have no hope of getting AI projects or governments to take AI risk seriously. Their attention is dominated by the topics and ideas of powerful ideological factions that have access to more money and data (and thus better persuasion tools) than us. Alternatively, maybe we ourselves have fallen apart as a community, or become less good at seeking the truth and finding high-impact plans.

Conclusion: We should remember that when trying to predict the date of AI takeover, what we care about is the date it’s too late for us to change the direction things are going; the date we have significantly less influence over the course of the future than we used to; the point of no return.

This is basically what Toby Ord said about x-risk: “So either because we’ve gone extinct or because there’s been some kind of irrevocable collapse of civilization or something similar. Or, in the case of climate change, where the effects are very delayed that we’re past the point of no return or something like that. So the idea is that we should focus on the time of action and the time when you can do something about it rather than the time when the particular event happens.”

Of course, influence over the future might not disappear all on one day; maybe there’ll be a gradual loss of control over several years. For that matter, maybe this gradual loss of control began years ago and continues now... We should keep these possibilities in mind as well.


Comments

You can steer a bit away from catastrophe today. Tomorrow you will be able to do less. After years and decades go by, you will have to be miraculously lucky or good to do something that helps. At some point, it's not the kind of "miraculous" you hope for, it's the kind you don't bother to model.

Today you are blind, and are trying to shape outcomes you can't see. Tomorrow you will know more, and be able to do more. After years and decades, you might know enough about the task you are trying to accomplish to really help. Hopefully the task you find yourself faced with is the kind you can solve in time.

If AI is un-alignable (or at least significantly easier to create than to keep aligned), the point of no return was 1837. If Babbage had kept his mouth shut, maybe we could have avoided this path.

But really, it's a mistake to think of it as a single point in time.  There's a slew of contributing factors, happening over a long time period.  It's somewhat similar to recent discussions about human revolutions (https://www.lesswrong.com/posts/osYFcQtxnRKB4F4HA/a-tale-from-communist-china and others).  It happens slowly, then quickly, and the possible interventions are very unclear at any point. 

career plans which take a long time to pay off are a bad idea, because by the time you reap the benefits of the plans it may already be too late

This is true, even if AI takeover never happens.  The environment changes significantly over a human lifetime, and the only reasonable strategy is to thread a path that has BOTH long-term impact (to the extent that you can predict anything) AND short-term satisfaction.  “Find a job you enjoy doing, and you will never have to work a day in your life.” remains solid advice, regardless of reasons for uncertainty.

As AI is inherently a weapons technology, I would suggest that the point of no return predates our species, going back either to the first tool use by an ancestor species or to the first expression of intraspecies violence in the same. We've just been honing that first weapon for a very long time.

I often see people worrying about an AI takeover when we're perfectly capable of killing ourselves with human-directed AI that isn't even remotely what we'd recognise as sentient, let alone AGI. We haven't solved our own alignment problems.

What does "inherently a weapons technology" mean? Given some technology, how does one determine whether or not it is "inherently a weapons technology"?

I ask because it seems to me that AI is clearly not "inherently a weapons technology" as I would use those words, and I suspect you mean something different by them.

Regardless, any generalization of AI that includes (e.g.) pointed sticks and flint arrowheads is surely too broad for present purposes; even if "how do we stop humans screwing everything up with whatever tools they have available?" is a more important question than "how do we stop AIs screwing up in ways that their makers and owners would be horrified by?", it's a different question, with (probably) different answers, and the latter is the subject here.

What makes a tool a weapon? The ease with which it causes injury and death. In this respect AI is very much a weapon.

Does government and military want AI for the purposes of furthering their ability to kill people? Could we ever create an AI that they didn't want?

As for generalisation, if Dagon can point the finger at Babbage then I can point the finger at the biological urge for intraspecies violence. 

I don't agree with your answer to your rhetorical question. A kitchen knife can cause injury and death pretty easily, but while it can be a weapon I wouldn't say that kitchen knives are "inherently a weapons technology". A brick can cause injury and death pretty easily too, and bricks are certainly not "inherently a weapons technology".

I would only say that something is "inherently a weapons technology" if (1) a major motivation for its development is (broadly speaking) military and/or (2) what it's best at is causing injury, destruction and death.

Military organizations have put quite a lot of effort into AI, but so have plenty of non-military organizations and it looks to me as if the latter have had much more (visible) success than the former. And so far, the things AI has proven most useful for are things like distinguishing cats from dogs, translating text, and beating humans at board games. Those (or things like them) may well have military applications, but they aren't weapons. (Not even when applied militarily. A better way of spotting enemy tanks makes your weapons more effective, but it isn't itself a weapon.)

Both you and Dagon can point your fingers wherever you like. The more interesting question is where it's useful to point your fingers.

But this isn’t quite right, at least not when “AI takeover” is interpreted in the obvious way, as meaning that an AI or group of AIs is firmly in political control of the world, ordering humans about, monopolizing violence, etc. Even if AIs don’t yet have that sort of political control, it may already be too late.

The AIs will probably never be in a position of political control. I suspect the AI would bootstrap self-replicating (nano?) tech. It might find a way to totally brainwash people, and spread it across the internet. The end game is always going to be covering the planet in self-replicating nanotech, or similar. Politics does not seem that helpful toward such a goal. Politics is generally slow.

I think this depends on how fast the takeoff is. If crossing the human range, and recursive self-improvement, take months or years rather than days, there may be an intermediate period where political control is used to get more resources and security. Politics can happen on a timespan of weeks or months. Brainwashing people is a special case of politics. Yeah I agree the endgame is always nanobot swarms etc.

That makes sense and I think it's important that this point gets made. I'm particularly interested by the political movement that you refer to. Could you explain this concept in more detail? Is there anything like such a political movement already being built at the moment? If not, how would you see this starting?

I don't consider this my area of expertise; I think it's very easy to do more harm than good by starting political movements. However, it seems likely to me that in order for the future to go well various governments and corporations will need to become convinced that AI risk is real, and maybe an awareness-raising campaign is the best way to do this. That's what I had in mind. In some sense that's what many people have been doing already, e.g. by writing books like Superintelligence. However, maybe eventually we'd need to get more political, e.g. by organizing a protest or something. Idk. Like I said, this could easily backfire.

I agree and I think books such as Superintelligence have definitely decreased the x-risk chance. I think 'convincing governments and corporations that this is a real risk' would be a great step forward. What I haven't seen anywhere, is a coherent list of options how to achieve that, preferably ranked by impact. A protest might be up there, but probably there are better ways. I think making that list would be a great first step. Can't we do that here somewhere?

I think there are various people working on it, the AI policy people at Future of Humanity Institute for example, maybe people at CSET. I recommend you read their stuff and maybe try to talk to them.

I know their work and I'm pretty sure there's no list of ways to convince governments and corporations that AI risk is an actual thing. PhDs are not the kind of people inclined to take any concrete action, I think.

I disagree. I would be surprised if they haven't brainstormed such a list at least once. And just because you don't see them doing any concrete action doesn't mean they aren't; they just might not be doing anything super public yet.

Don't get me wrong, I think institutes like FHI are doing very useful research. I think there should be a lot more of them, at many different universities. I just think what's missing in the whole X-risk scene is a way to take things out of this still fairly marginal scene and into the mainstream. As long as the mainstream is not convinced that this is an actual problem, efforts are always enormously going to lag mainstream AI efforts, with predictable results.

Maybe. But I actually currently think that the longer these issues stay out of the mainstream, the better. Mainstream political discourse is so corrupted; when something becomes politicized, that means it's harder for anything to be done about it and a LOT harder for the truth to win out. You don't see nuanced, balancing-risks-and-benefits solutions come out of politicized debates. Instead you see two one-sided, extreme agendas bashing on each other and then occasionally one of them wins.

(That said, now that I put it that way, maybe that's what we want for AI risk--but only if we get to dictate the content of one of the extreme agendas and only if we are likely to win. Those are two very big ifs.)

It's funny, I heard that opinion a number of times before, mostly from Americans. Maybe it has to do with your two-party flavor of democracy. I think Americans are also much more skeptical of states in general. You tend to look to companies for solving problems; Europeans tend to look to states (generalizing). In The Netherlands we have a host of parties, and although there are still a lot of pointless debates, I wouldn't say it's nearly as bad as what you describe. I can't imagine e.g. climate change solved without state intervention (the situation here is now that the left is calling for renewables, the right for nuclear - not so bad).

For AI Safety, even with a bipartisan debate, the situation now is that both parties implicitly think AI Safety is not an issue (probably because they have never heard of it, or at least not given it serious thought). After politicization, even in the worst case, at least one of the parties will think it's a serious issue. That would mean that roughly 50% of the time, if party #1 wins, we get a fair chance of meaningful intervention such as appropriate funding, hopefully helpful regulation efforts (that's our responsibility too - we can put good regulation proposals out there), and even cooperation with other countries. If party #2 wins, there will perhaps be zero effort or some withdrawal. I would say this 50% solution easily beats the 0% solution we have now. In a multi-party system such as we have, the outcome could even be better.

I think we should prioritize getting the issue out there. The way I see it, it's the only hope for state intervention, which is badly needed.

Perhaps American politics is indeed less rational than European politics, I wouldn't know. But American politics is more important for influencing AI since the big AI companies are American.

Besides, if you want to get governments involved, raising public awareness is only one way to do that, and not the best way IMO. I think it's much more effective to do wonkery / think tankery / lobbying / etc. Public movements are only necessary when you have massive organized opposition that needs to be overcome by sheer weight of public opinion. When you don't have massive organized opposition, and heads are still cool, and there's still a chance of just straightforwardly convincing people of the merits of your case... best not to risk ruining that lucky situation!

I have kind of a strong opinion in favor of policy intervention because I don't think it's optional. I think it's necessary. My main argument is as follows:

I think we have two options to reduce AI extinction risk:

1) Fixing it technically and ethically (I'll call the combination of both working out the 'tech fix'). Don't delay.

2) Delay until we can work out option 1. After the delay, AGI development may or may not continue, depending mainly on the outcome of option 1.

If option 1 does not work (there is a reasonable chance it won't: it hasn't worked so far, and we're not necessarily close to a safe solution), I think option 2 is our only chance to reduce the AI X-risk to acceptable levels. However, AI academics and corporations are both strongly opposed to option 2. It would therefore take a force at least as powerful as those two groups combined to still pursue this option. The only option I can think of is a popular movement. Lobbying and think tanking may help, but corporations will be better funded and therefore the public interest is not likely to prevail. Wonkery could be promising as well. I'm happy to be convinced of more alternative options.

If the tech fix works, I'm all for it. But currently, I think the risks are way too big and it may not work at all. Therefore I think it makes sense to apply the precautionary principle here and start with policy interventions, until it can be demonstrated that X-risk for AGI has fallen to an acceptable level. As a nice side effect, this should dramatically increase AI Safety funding, since suddenly corporate incentives are to fund this first in order to reach allowed AGI.

I'm aware that this is a strong minority opinion on LW, since:

1) Many people here have affinity with futurism which would love an AGI revolution

2) Many people have backgrounds in AI academia, and/or AI corporations, which both have incentives to continue working on AGI

3) It could be wrong of course. :) I'm open for arguments which would change the above line of thinking.

So I'm not expecting a host of upvotes, but as rationalists, I'm sure you appreciate the value of dissent as a way to move towards a careful and balanced opinion. I do at least. :)

Want to have a video chat about this? I'd love to. :)

Well sure, why not. I'll send you a PM.

I wouldn't say less rational, but more two-party, yes. But you're right I guess that European politics is less important in this case. Also don't forget Chinese politics, which has entirely different dynamics of course.

I think you have a good point as well that wonkery, think tankery, and lobbying are also promising options. I think they, and starting a movement, should be on a little list of policy intervention options. I think each will have its own merits and issues. But still, we should have a group of people actually starting to work on this, whatever the optimal path turns out to be.

The idea in this post, combined with my generally short timelines, makes me quite bearish on career plans that involve spending several years doing relatively unimportant things for the sake of credentials (e.g. most grad school plans).