Biological humans and the rising tide of AI

cousin_it

The Hanson-Yudkowsky AI-Foom Debate focused on whether AI progress is winner-take-all. But even if it isn't, humans might still fare badly.

Suppose Robin is right. Instead of one basement project going foom, AI progresses slowly as many organizations share ideas with each other, leading to peaceful economic growth worldwide - a rising tide of AI. (I'm including uploads in that.)

With time, keeping biological humans alive will become a less and less profitable use of resources compared to other uses. Robin says humans can still thrive by owning a lot of resources, as long as property rights prevent AIs from taking resources by force.

But how long is that? Recall the displacement of nomadic civilizations by farming ones (which happened by force, not farmers buying land from nomads) or enclosure in England (which also happened by force). When potential gains in efficiency become large enough, property rights get trampled.

Robin argues that it won't happen, because it would lead to a slippery slope of AIs fighting each other for resources. But the potential gains from that are smaller, like a landowner trying to use enclosure on another landowner. And most such gains can be achieved by AIs sharing improvements, which is impossible with humans. So AIs won't be worried about that slippery slope, and will happily take our resources by force.

Maybe humans owning resources could upload themselves and live off rent, instead of staying biological? But even uploaded humans might be very inefficient users of resources (e.g. due to having too many neurons) compared to optimized AIs, so the result is the same.

Instead of hoping that institutions like property rights will protect us, we should assume that everything about the future, including institutions, will be determined by the values of AIs. To achieve our values, working on AI alignment is necessary, whether we face a "basement foom" or "rising tide" scenario.

I briefly discuss non-winner-take-all AI-takeover scenarios in sections 4.1.2 - 4.2. of Disjunctive Scenarios of Catastrophic AI Risk.

Personally I would emphasize that even if AIs did respect the law, well, if they got enough power, what would prevent them from changing the law? Besides, humans get tricked into making totally legal deals that go against their interests, all the time.

4.1.2. DSA enabler: Collective takeoff with trading AIs

Vinding (2016; see also Hanson & Yudkowsky 2013) argues that much of seemingly-individual human intelligence is in fact based on being able to tap into the distributed resources, both material and cognitive, of all of humanity. Thus, it may be misguided to focus on the point when AIs achieve human-level intelligence, as collective intelligence is more important than individual intelligence. The easiest way for AIs to achieve a level of capability on par with humans would be to collaborate with human society and use its resources peacefully.

Similarly, Hall (2008) notes that even when a single AI is doing self-improvement (such as by developing better cognitive science models to improve its software), the rest of the economy is also developing better such models. Thus it’s better for the AI to focus on improving at whatever thing it is best at, and keep trading with the rest of the economy to buy the things that the rest of the economy is better at improving.

However, Hall notes that there could still be a hard takeoff, once enough AIs were networked together: AIs that think faster than humans are likely to be able to communicate with each other, and share insights, much faster than they can communicate with humans. As a result, it would always be better for AIs to trade and collaborate with each other than with humans. The size of the AI economy could grow quite quickly, with Hall suggesting a scenario that goes “from [...] 30,000 human equivalents at the start, to approximately 5 billion human equivalents a decade later”. Thus, even if no single AI could achieve a DSA by itself, a community of them could collectively achieve one, as that community developed to be capable of everything that humans were capable of [footnote: Though whether one can draw a meaningful difference between an “individual AI” and a “community of AIs” is somewhat unclear. AI systems might not have an individuality in the same sense as humans do, especially if they have a high communication bandwidth relative to the amount of within-node computation.].
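To get a feel for the growth implied by Hall's quoted figures, here is a rough back-of-the-envelope sketch. It assumes a constant exponential growth rate over the decade, which is my simplification and not something the quote states; the variable names are illustrative only.

```python
import math

# Figures quoted from Hall's scenario above.
start_equivalents = 30_000
end_equivalents = 5_000_000_000
years = 10

# Assumption (not from the source): growth is a smooth, constant exponential.
total_factor = end_equivalents / start_equivalents
annual_factor = total_factor ** (1 / years)
doubling_time_years = math.log(2) / math.log(annual_factor)

print(f"total growth factor: {total_factor:,.0f}x")
print(f"implied annual growth factor: {annual_factor:.2f}x")
print(f"implied doubling time: {doubling_time_years * 12:.1f} months")
```

Under that assumption the AI economy would more than triple every year, doubling roughly every seven months.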

4.2. DSA/MSA enabler: power gradually shifting to AIs

The historical trend has been to automate everything that can be automated, both to reduce costs and because machines can do things better than humans can. Any kind of a business could potentially run better if it were run by a mind that had been custom-built for running the business, up to and including the replacement of all the workers with one or more such minds. An AI can think faster and smarter, deal with more information at once, and work for a unified purpose rather than have its efficiency weakened by the kinds of office politics that plague any large organization. Some estimates already suggest that half of the tasks that people are paid to do are susceptible to being automated using techniques from modern-day machine learning and robotics, even without postulating AIs with general intelligence (Frey & Osborne 2013, Manyika et al. 2017).

The trend towards automation has been going on throughout history, doesn’t show any signs of stopping, and inherently involves giving the AI systems whatever agency they need in order to run the company better. There is a risk that AI systems that were initially simple and of limited intelligence would gradually gain increasing power and responsibilities as they learned and were upgraded, until large parts of society were under AI control.

I discuss the 4.1.2. scenario in a bit more detail in this post.

Also of note is section 5.2.1. of my paper, economic incentives to turn power over to AI systems:

As discussed above under “power gradually shifting to AIs”, there is an economic incentive to deploy AI systems in control of corporations. This can happen in two forms: either by expanding the amount of control that already-existing systems have, or alternatively by upgrading existing systems or adding new ones with previously-unseen capabilities. These two forms can blend into each other. If humans previously carried out some functions which are then given over to an upgraded AI which has recently become capable of doing them, this can increase the AI’s autonomy both by making it more powerful and by reducing the number of humans that were previously in the loop.

As a partial example, the US military is seeking to eventually transition to a state where the human operators of robot weapons are “on the loop” rather than “in the loop” (Wallach and Allen 2012). In other words, whereas a human was previously required to explicitly give the order before a robot was allowed to initiate possibly lethal activity, in the future humans are meant to merely supervise the robot’s actions and interfere if something goes wrong. While this would allow the system to react faster, it would also limit the window that the human operators have for overriding any mistakes that the system makes. For a number of military systems, such as automatic weapons defense systems designed to shoot down incoming missiles and rockets, the extent of human oversight is already limited to accepting or overriding a computer’s plan of actions in a matter of seconds, which may be too little to make a meaningful decision in practice (Human Rights Watch 2012).

Sparrow (2016) reviews three major reasons which incentivize major governments to move towards autonomous weapon systems and reduce human control:

1. Currently-existing remotely-piloted military “drones”, such as the U.S. Predator and Reaper, require a high amount of communications bandwidth. This limits the number of drones that can be fielded at once, and makes them dependent on communications satellites which not every nation has, and which can be jammed or targeted by enemies. A need to be in constant communication with remote operators also makes it impossible to create drone submarines, which need to maintain a communications blackout before and during combat. Making the drones autonomous and capable of acting without human supervision would avoid all of these problems.

2. Particularly in air-to-air combat, victory may depend on making very quick decisions. Current air combat is already pushing against the limits of what the human nervous system can handle: further progress may be dependent on removing humans from the loop entirely.

3. Much of the routine operation of drones is very monotonous and boring, which is a major contributor to accidents. The training expenses, salaries, and other benefits of the drone operators are also major expenses for the militaries employing them.

Sparrow’s arguments are specific to the military domain, but they demonstrate the argument that "any broad domain involving high stakes, adversarial decision making, and a need to act rapidly is likely to become increasingly dominated by autonomous systems" (Sotala & Yampolskiy 2015). Similar arguments can be made in the business domain: eliminating human employees to reduce costs from mistakes and salaries is something that companies would also be incentivized to do, and making a profit in the field of high-frequency trading already depends on outperforming other traders by fractions of a second. While currently-existing AI systems are not powerful enough to cause global catastrophe, incentives such as these might drive an upgrading of their capabilities that eventually brought them to that point.

Absent sufficient regulation, there could be a “race to the bottom of human control” where state or business actors competed to reduce human control and increase the autonomy of their AI systems to obtain an edge over their competitors (see also Armstrong et al. 2013 for a simplified “race to the precipice” scenario). This would be analogous to the “race to the bottom” in current politics, where government actors compete to deregulate or to lower taxes in order to retain or attract businesses.

AI systems being given more power and autonomy might be limited by the fact that doing this poses large risks for the actor if the AI malfunctions. In business, this limits the extent to which major, established companies might adopt AI-based control, but incentivizes startups to try to invest in autonomous AI in order to outcompete the established players. In the field of algorithmic trading, AI systems are currently trusted with enormous sums of money despite the potential to make corresponding losses – in 2012, Knight Capital lost $440 million due to a glitch in their trading software (Popper 2012, Securities and Exchange Commission 2013). This suggests that even if a malfunctioning AI could potentially cause major risks, some companies will still be inclined to invest in placing their business under autonomous AI control if the potential profit is large enough.

U.S. law already allows for the possibility of AIs being conferred a legal personality, by putting them in charge of a limited liability company. A human may register an LLC, enter into an operating agreement specifying that the LLC will take actions as determined by the AI, and then withdraw from the LLC (Bayern 2015). The result is an autonomously acting legal personality with no human supervision or control. AI-controlled companies can also be created in various non-U.S. jurisdictions; restrictions such as ones forbidding corporations from having no owners can largely be circumvented by tricks such as having networks of corporations that own each other (LoPucki 2017). A possible startup strategy would be for someone to develop a number of AI systems, give them some initial endowment of resources, and then set them off in control of their own corporations. This would risk only the initial resources, while promising whatever profits the corporation might earn if successful. To the extent that AI-controlled companies were successful in undermining more established companies, they would pressure those companies to transfer control to autonomous AI systems as well.

Thank you Kaj! I agree with pretty much all of that. You don't quite say what happens to humans when AIs outcompete them, but it's easy enough to read between the lines and end up with my post :-)

Has Robin ever claimed property rights will never get trampled? My impression is that he's only saying it can be avoided in the time period he's trying to analyze.

The strongest statement of Robin's position I can find is this post:

As long as future robots remain well integrated into society, and become more powerful gradually and peacefully, at each step respecting the law we use to keep the peace among ourselves, and also to keep the peace between them, I see no more reason for them to exterminate us than we now have to exterminate retirees or everyone over 100 years old. We live now in a world where some of us are many times more powerful than others, and yet we still use law to keep the peace, because we fear the consequences of violating that peace. Let’s try to keep it that way.

I think my post works as a counterpoint, even for the time period Robin is analyzing.

I've got a bit more time now.

I agree that "things need to be done" in a rising tide scenario. However, the things that need to be done are different from those in the foom scenario. The distribution of AI safety knowledge is different in an important way.

Discovering AI alignment is not enough in the rising tide scenario. You want to make sure the proportion of aligned AIs to misaligned AIs is high enough to stop the misaligned AIs from outcompeting the aligned ones. There will be some misaligned AIs due to worn parts, experiments gone wrong, or AIs aligned with insane people who are not sufficiently aligned with the rest of humanity to allow negotiation or discussion.

The biggest risk is around the beginning. Everyone will be enthusiastic to play around with AGI. If they don't have good knowledge of alignment (because it has been a secret project) then they may not know how it should work and how it should be used safely. They may also buy AGI products from people who haven't done their due diligence in making sure their product is aligned.

It might be that alignment requires special hardware (e.g. there is some equivalent of Spectre that needs to be fixed in current architectures to enable safe AI). Then there is the risk of the software getting out and being run on emulators that don't fix the alignment problem, and you might get lots of misaligned AGIs.

In this scenario you need lots of things that are antithetical to the strategy of fooming AGI, of keeping things secret and hoping that a single group brings it home. You need a well educated populace/international community, regulation of computer hardware and AGI vendors (preferably before AGI hits). All that kind of stuff.

Knowing whether we are fooming or not is pretty important. The same strategy does not work for both. IMO.

I have a neat idea. If there were two comparable AGIs, they would effectively merge into one, even if they have unaligned goals. To be more precise, they should model how a conflict between them would turn out and then figure out a kind of contract that reaches a similar outcome without wasting the resources for a real conflict. Of course, if they are not comparable, then the stronger one could just devour the weaker one.
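A toy sketch of why such a contract could beat a real conflict. The model is my own simplification, not something from the thread: both sides are risk-neutral, value only their share of a fixed pool of resources, agree on the win probability, and a fight destroys a known fraction of the pool. All function and parameter names are hypothetical.

```python
def conflict_payoffs(p_a_wins, waste):
    """Expected resource share for (A, B) if they actually fight.

    Assumption: the winner takes everything that survives, and a
    fraction `waste` of the total pool is destroyed by the fight.
    """
    surviving = 1.0 - waste
    return p_a_wins * surviving, (1.0 - p_a_wins) * surviving


def contract_payoffs(p_a_wins):
    """Shares for (A, B) under a contract that mirrors the conflict odds
    but splits the whole pool, since nothing is destroyed."""
    return p_a_wins, 1.0 - p_a_wins


fight = conflict_payoffs(p_a_wins=0.6, waste=0.3)
deal = contract_payoffs(p_a_wins=0.6)
print("fight:", fight)
print("deal: ", deal)
```

With any nonzero waste, both sides get strictly more from the deal than from the fight in expectation. As the win probability approaches 1, the weaker side's share shrinks toward zero either way, which matches the "stronger one devours the weaker one" case.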

Yeah, that's a good idea. It was proposed a decade ago by Wei Dai and Tim Freeman on the SL4 mailing list and got some discussion in various places. Some starting points are this SL4 post or this LW post, though the discussion kinda diverges. Here's my current view:

1) Any conflict can be decomposed into bargaining (which Pareto optimal outcome do we want to achieve?) and enforcement (how do we achieve that outcome without anyone cheating?)

2) Bargaining is hard. We tried and failed many times to find a "fair" way to choose among Pareto optimal outcomes. The hardest part is nailing down the difference between bargaining and extortion.

3) Assuming we have some solution to bargaining, enforcement is easy enough for AIs. Most imaginable mechanisms for enforcement, like source code inspection, lead to the same set of outcomes.

4) The simplest way to think about enforcement is two AIs jointly building a new AI and passing all resources to it. If the two original AIs were Bayesian-rational and had utility functions U1 and U2, the new one should also be Bayesian-rational and have a utility function that's a weighted sum of U1 and U2. This generalizes to any number of AIs.

5) The only subtlety is that weights shouldn't be set by bargaining, as you might think. Instead, bargaining should determine some probability distribution over weights, then one sample from that distribution should be used as the actual weights. Think of it as flipping a coin to break ties between U1 and U2. That's necessary to deal with flat Pareto frontiers, like the divide-the-dollar game.

At first I proposed this math as a solution to another problem by Stuart (handling meta-uncertainty about which utility function you have), but it works for AI merging too.
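Points 4 and 5 can be illustrated with a minimal sketch of the divide-the-dollar game. The setup is an assumption for illustration: outcomes are A's share x of the dollar, U1(x) = x, U2(x) = 1 - x, and the bargained distribution over weights is a simple two-point one (weight 1 on U1 with probability q, else weight 0). All names are hypothetical.

```python
import random

# Divide-the-dollar: x is the share going to AI 1, the rest goes to AI 2.
OUTCOMES = [i / 10 for i in range(11)]


def u1(x):
    return x


def u2(x):
    return 1.0 - x


def merged_choice(w):
    """Outcome picked by a merged agent maximizing w*U1 + (1 - w)*U2."""
    return max(OUTCOMES, key=lambda x: w * u1(x) + (1 - w) * u2(x))


def sample_weight(q, rng):
    """Assumed bargaining result: weight 1.0 with probability q, else 0.0."""
    return 1.0 if rng.random() < q else 0.0


def expected_share_for_ai1(q, trials=20_000, seed=0):
    """Average share AI 1 receives when the weight is sampled once per trial."""
    rng = random.Random(seed)
    total = sum(u1(merged_choice(sample_weight(q, rng))) for _ in range(trials))
    return total / trials


# Any fixed weight other than exactly 0.5 hands the whole dollar to one side.
print(merged_choice(0.6), merged_choice(0.4))
# Sampling the weight recovers intermediate splits in expectation.
print(expected_share_for_ai1(q=0.7))
```

Because the Pareto frontier here is a straight line, a fixed weighted sum can only pick an endpoint (or is indifferent at exactly equal weights), so a split like 70/30 is only reachable in expectation by randomizing over weights.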

Ah, but my idea is different! It's not just that these two AIs will physically merge. I claim that two AIs that are able to communicate are already indistinguishable from one AI with a different utility function. I reject the entire concept of meaningfully counting AIs.

There is a trivial idea that two humans together form a kind of single agent. This agent is not a human (there are too many conditions for being a human), and it might not be very smart (if the humans' goals don't align).

Now consider the same idea for two superintelligent AIs. I claim that the "combined" mind is also superintelligent, and it acts as though its utility function was a combination of the two initial utility functions. There are only complications from the possibly distributed physical architecture of the AI.

To take it even further, I claim that given any two AIs called A and B, if they together would choose strategy S, then there also exists a single AI called M(A,B), that would also choose strategy S. If we take the paperclip and staple maximizers, they might physically merge (or they might just randomly destroy one of them?). Now I claim that there is another single AI, with a slightly funky but reasonable architecture, which would be rewarded both for 60% staples and for 60% paperclips, and that this AI would choose to construct a new AI with a more coherent utility function (or it would choose to self modify to make its own utility coherent).

Also, thank you for digging for the old threads. It's frustrating that there is so much out there that I would never know to even look for.

Edit: damn, I think the second link basically has the same idea as well.

I think if you carefully read everything in these links and let it stew for a bit, you'll get something like my approach.

More generally, having ideas is great but don't stop there! Always take the next step, make things slightly more precise, push a little bit past the point where you have everything figured out. That way you're almost guaranteed to find new territory soon enough. I have an old post about that.

Yes, secrecy is a bad idea in a rising tide scenario. But I don't think it's a good idea in a winner-take-all scenario either! I argued against it for years and like to think I swayed a few people.

This article from AlexMennen has some relevant discussion and links.

Thank you! I was trying to give an econ-centric counterargument to Robin's claim, but AI-centric strategic thinking (of which I've read a lot) is valuable too.

(moved to main post)

If development of capabilities proceeds gradually, then development of values will too. The market will select AIs that are obedient, not self-actualising, not overly literal and so on. Why would they jump to being utility monsters without a jump in capability?

The market doesn't solve alignment. Firms have always acted callously toward people who don't matter to the bottom line. AI will simply lead to most people ending up in that group.

Market behaviour is a known quantity, with, up to a point, known fixes. Introducing gradually improving AI is not going to change the game.

I am not convinced AIs will avoid fighting each other for resources. If they are not based on human minds as WBE, then we have less reason to expect they will value the preservation of themselves or other agents. If they are based on human minds, we have lots of good reasons to expect that they will value things above self-preservation. I am not aware of any mechanisms that would preclude a Thucydides' Trap style scenario from taking place.

It also seems highly likely that AIs will be employed for enforcing property rights, so even in the case where bandit-AIs prefer to target humans, conflict with some type of AI seems likely in a rising tide scenario.

Yeah. I was trying to show that humans don't fare well by default even in a peaceful "rising tide" scenario, but in truth there will probably be more conflict, where AIs protecting humans don't necessarily win.

I didn't know that!

I still think there is a difference in strategy, though. In the foom scenario you want to keep the number of key players, or people who might become key players, small.

In the non-foom you have the unhappy compromise between trying to avoid too many accidents and building up defense early vs practically everyone in time being a key player and needing to know how to handle AGI.

whether AI progress is winner-take-all

What do you mean by "winner-take-all"? Aren't we generally assuming that most AI scenarios are "everyone-loses" and "universe-tiled-in-paperclips"? Is this post assuming that alignment is solved, but only in a weak way that still allows the AI to hurt people if it really wants to? I wish the starting assumptions were stated clearly somewhere.

The starting assumptions for my post are roughly the same as Robin's assumptions in the debate.

And what are they? Are they listed in the pdf you linked to? Can you point to relevant pages, at least vaguely? I'd rather not read all of it.

Also, are these assumptions reasonable, whatever they are? Do you understand why I question them in my first comment?

The pdf has a summary written by Kaj on pages 505-561, but my advice is to just read the whole thing. That way you learn not just the positions (which I think are reasonable), but also the responses to many objections. It's a good overview that gets you close to the frontier of thinking about this topic.