rosehadshar

Comments
Sense-making about extreme power concentration
rosehadshar · 1mo

Yup sorry, the Tom above is actually Rose!

I like your distinction between narrow and broad IC dynamics. I was basically thinking just about narrow, but now agree that there is also potentially some broader thing.

How likely do you think it is that helper-nanobots outcompete auto-nanobots? Two possible things that could be going on are:

  1. I'm unhelpfully abstracting away what kind of AI systems we end up with, but actually this significantly impacts how likely power concentration is and I should think more about it
  2. Theoretically the distinction between helper and auto-nanobots is significant, but in practice it's very unlikely that helper-nanobots will be competitive, so it's fine to ignore the possibility and treat auto-nanobots as 'AI'
Sense-making about extreme power concentration
rosehadshar · 2mo

All of 1-4 seem plausible to me, and I don't centrally expect that power concentration will lead to everyone dying. 

Even if all of 1-4 hold, I think the future will probably be a lot less good than it could have been:
- 4 is more likely to mean that Earth becomes a nature reserve for humans or something than that the stars are equitably allocated

- I'm worried that there are bad selection effects such that 3 already screens out some kinds of altruists (e.g. ones who aren't willing to strategy-steal). Some good stuff might still happen to existing humans, but the future will miss out on some values completely

- I'm worried about power corrupting/there being no checks and balances/there being no incentives to keep doing good stuff for others

Sense-making about extreme power concentration
rosehadshar · 2mo

Thanks, agree that 'emergent dynamics' is woolly above.

I guess I don't think the y-axis should be the temporal dimension. To give some cartoon examples:

  • I'd put an extremely Machiavellian 10-year plan on the part of a cabal of politicians to backslide into a dictatorship and then seize power over the rest of the world near the top end of the axis
  • I'd put an unfavourable order of capabilities, where in an unplanned way superpersuasion comes online before defenses, and actors fail to coordinate not to deploy it because of competitive dynamics, near the bottom end of the axis, even if the whole thing unfolds over a few months

I do think the y-axis is pretty correlated with temporal scales, but I don't think it's the same thing. I also don't think physical violence is the same dimension, though it's probably also correlated (cf. the backsliding example, which is very power-seeking but not very violent).


The thing I had in mind was more like, should I imagine some actor consciously trying to bring power concentration about? To the extent that's a good model, it's power-seeking. Or should I imagine that no actor is consciously planning this, but the net result of the system is still extreme power concentration? If that's a good model, it's emergent dynamics.

Idk, I see that this is messy and probably there's some better concept here.

AI-enabled coups: a small group could use AI to seize power
rosehadshar · 3mo

"I think I agree that, once an AI-enabled coup has happened, the expected remaining AI takeover risk would be much lower. This is partly because it ends the race within the country where the takeover happened (though it wouldn't necessarily end the international race), but also partly just because of the evidential update: apparently AI is now capable of taking over countries, and apparently someone could instruct the AIs to do that, and the AIs handed the power right back to that person! Seems like alignment is working."

I don't currently agree that the remaining AI takeover risk would be much lower:

  • The international race seems like a big deal. Ending the domestic race is good, but I think I'd still expect reckless competition. Maybe you're imagining that a large chunk of power grabs are motivated by stopping the race? I'm a bit sceptical.
  • I don't think the evidential update is that strong. If misaligned AIs found it convenient to take over the US using humans, why should we expect them to immediately stop finding humans useful at that point? They might keep using humans as they accumulate more power, up until some later point.
  • There's another evidential update which I think is much stronger: the world has completely dropped the ball on an important thing almost no one wants (power grabs), where there are tractable things they could have done, and some of those things would directly reduce AI takeover risk (infosec, alignment audits, etc.). In a world where coups over the US are possible, I expect we've failed to do basic alignment stuff too.

Curious what you think.

The Industrial Explosion
rosehadshar · 5mo

"I think it might be a bit clearer to communicate the stages by naming them based on the main vector of improvement throughout the entire stage, i.e. 'optimization of labor' for stage one, 'automation of labor' for stage two, 'miniaturization' for stage three."

I think these names are better names for the underlying dynamics, at least - thanks for suggesting them. (I'm less clear they are better labels for the stages overall, as they are a bit more abstract.)

Should there be just one western AGI project?
rosehadshar · 1y

Changed to motivation, thanks for the suggestion.

I agree that centralising to make AI safe would make a difference. It seems a lot less likely to me than centralising to beat China (there's already loads of 'beat China' rhetoric, and it doesn't seem very likely to go away).

Should there be just one western AGI project?
rosehadshar · 1y

"it is potentially a lot easier to stop a single project than to stop many projects simultaneously" -> agree.

Should there be just one western AGI project?
rosehadshar · 1y

I think I still believe the thing we initially wrote:

  • Agree with you that there might be strong incentives to sell stuff at monopoly prices (and I'm worried about this). But if there's a big gap, you can do this without selling your most advanced models. (You sell access to weaker models at a big markup, and keep the most advanced ones to yourselves to help you further entrench your monopoly/your edge over any and all other actors.)
  • I'm sceptical of worlds where 5 similarly advanced AGI projects don't bother to sell
    • Presumably any one of those could defect at any time and sell at a decent price. Why doesn't this happen?
    • Eventually they need to start making revenue, right? They can't just exist on investment forever

      (I am also not an economist though, and interested in pushback.)

Should there be just one western AGI project?
rosehadshar · 1y

Thanks, I expect you're right that there's some confusion in my thinking here.

Haven't got to the bottom of it yet, but on more incentive to steal the weights:
- partly I'm reasoning in the way that you guess: more resources -> more capabilities -> more incentives
- I'm also thinking "stronger signal that the US is all in and thinks this is really important -> raises p(China should also be all in) from a Chinese perspective -> more likely China invests hard in stealing the weights"
- these aren't independent lines of reasoning, as the stronger signal is sent by spending more resources
- but I tentatively think that it's not the case that at a fixed capability level the incentives to steal the weights are the same. I think they'd be higher with a centralised project, as conditional on a centralised project there's more reason for China to believe a) AGI is the one thing that matters, b) the US is out to dominate

Should there be just one western AGI project?
rosehadshar · 1y

Thanks, I agree this is an important argument.

Two counterpoints:

  • The more projects you have, the more attempts at alignment you have. It's not obvious to me that more draws are net bad, at least at the margin of 1 to 2 or 3.
  • I'm more worried about the harms from a misaligned singleton than from a misaligned system (or multiple misaligned systems) in a wider ecosystem which includes powerful aligned systems.
Posts (score · title · age · comments)

69 · Sense-making about extreme power concentration · 2mo · 25
26 · Good government · 2mo · 0
124 · The Industrial Explosion (Ω) · 5mo · 66
132 · AI-enabled coups: a small group could use AI to seize power (Ω) · 7mo · 23
40 · Three Types of Intelligence Explosion (Ω) · 8mo · 8
45 · Intelsat as a Model for International AGI Governance · 8mo · 0
78 · Should there be just one western AGI project? · 1y · 75
31 · New report: A review of the empirical evidence for existential risk from AI via misaligned power-seeking · 2y · 5
61 · Results from an Adversarial Collaboration on AI Risk (FRI) · 2y · 3
41 · Strongest real-world examples supporting AI risk claims? (Question) · 2y · 7