rosehadshar

Comments
Sense-making about extreme power concentration
rosehadshar · 1mo

Yup sorry, the Tom above is actually Rose!

I like your distinction between narrow and broad IC dynamics. I was basically thinking just about narrow, but now agree that there is also potentially some broader thing.

How likely do you think it is that helper-nanobots outcompete auto-nanobots? Two possible things that could be going on are:

  1. I'm unhelpfully abstracting away what kind of AI systems we end up with, but actually this significantly impacts how likely power concentration is and I should think more about it
  2. Theoretically the distinction between helper and auto-nanobots is significant, but in practice it's very unlikely that helper-nanobots will be competitive, so it's fine to ignore the possibility and treat auto-nanobots as 'AI'
Sense-making about extreme power concentration
rosehadshar · 2mo

All of 1-4 seem plausible to me, and I don't centrally expect that power concentration will lead to everyone dying. 

Even if all of 1-4 hold, I think the future will probably be a lot less good than it could have been:
- 4 is more likely to mean that Earth becomes a nature reserve for humans or something than that the stars are equitably allocated

- I'm worried that there are bad selection effects such that 3 already screens out some kinds of altruists (e.g. ones who aren't willing to strategy-steal). Some good stuff might still happen to existing humans, but the future will miss out on some values completely

- I'm worried about power corrupting/there being no checks and balances/there being no incentives to keep doing good stuff for others

Sense-making about extreme power concentration
rosehadshar · 2mo

Thanks, agree that 'emergent dynamics' is woolly above.

I guess I don't think the y-axis should be the temporal dimension. To give some cartoon examples:

  • I'd put an extremely Machiavellian 10-year plan on the part of a cabal of politicians to backslide into a dictatorship and then seize power over the rest of the world near the top end of the axis
  • I'd put an unfavourable order of capabilities, where in an unplanned way superpersuasion comes online before defenses, and actors fail to coordinate not to deploy it because of competitive dynamics, near the bottom end of the axis, even if the whole thing unfolds over a few months

I do think the y-axis is pretty correlated with temporal scales, but I don't think it's the same thing. I also don't think physical violence is the same dimension, though it's probably also correlated (cf. the backsliding example, which is very power-seeking but not very violent).


The thing I had in mind was more like, should I imagine some actor consciously trying to bring power concentration about? To the extent that's a good model, it's power-seeking. Or should I imagine that no actor is consciously planning this, but the net result of the system is still extreme power concentration? If that's a good model, it's emergent dynamics.

Idk, I see that this is messy and probably there's some better concept here.

AI-enabled coups: a small group could use AI to seize power
rosehadshar · 3mo

"I think I agree that, once an AI-enabled coup has happened, the expected remaining AI takeover risk would be much lower. This is partly because it ends the race within the country where the takeover happened (though it wouldn't necessarily end the international race), but also partly just because of the evidential update: apparently AI is now capable of taking over countries, and apparently someone could instruct the AIs to do that, and the AIs handed the power right back to that person! Seems like alignment is working."

I don't currently agree that the remaining AI takeover risk would be much lower:

  • The international race seems like a big deal. Ending the domestic race is good, but I think I'd still expect reckless competition. Maybe you're imagining that a large chunk of power grabs are motivated by stopping the race? I'm a bit sceptical.
  • I don't think the evidential update is that strong. If misaligned AIs found it convenient to take over the US using humans, why should we expect them to immediately stop finding humans useful at that point? They might keep using humans as they accumulate more power, up until some later point.
  • There's another evidential update which I think is much stronger: the world has completely dropped the ball on an important thing almost no one wants (power grabs), where there are tractable things they could have done, and some of those things would directly reduce AI takeover risk (infosec, alignment audits, etc.). In a world where coups over the US are possible, I expect we've failed to do basic alignment stuff too.

Curious what you think.

The Industrial Explosion
rosehadshar · 5mo

"I think it might be a bit clearer to communicate the stages by naming them based on the main vector of improvement throughout the entire stage, i.e. 'optimization of labor' for stage one, 'automation of labor' for stage two, 'miniaturization' for stage three."

I think these names are better names for the underlying dynamics, at least - thanks for suggesting them. (I'm less clear they are better labels for the stages overall, as they are a bit more abstract.)

Should there be just one western AGI project?
rosehadshar · 1y

Changed to motivation, thanks for the suggestion.

I agree that centralising to make AI safe would make a difference. It seems a lot less likely to me than centralising to beat China (there's already loads of 'beat China' rhetoric, and it doesn't seem very likely to go away).

Should there be just one western AGI project?
rosehadshar · 1y

"it is potentially a lot easier to stop a single project than to stop many projects simultaneously" -> agree.

Should there be just one western AGI project?
rosehadshar · 1y

I think I still believe the thing we initially wrote:

  • Agree with you that there might be strong incentives to sell stuff at monopoly prices (and I'm worried about this). But if there's a big gap, you can do this without selling your most advanced models. (You sell access to weaker models at a big markup, and keep the most advanced ones to yourselves to help you further entrench your monopoly/your edge over any and all other actors.)
  • I'm sceptical of worlds where 5 similarly advanced AGI projects don't bother to sell
    • Presumably any one of those could defect at any time and sell at a decent price. Why doesn't this happen?
    • Eventually they need to start making revenue, right? They can't just exist on investment forever

      (I am also not an economist though, and interested in pushback.)

Should there be just one western AGI project?
rosehadshar · 1y

Thanks, I expect you're right that there's some confusion in my thinking here.

Haven't got to the bottom of it yet, but on more incentive to steal the weights:
- partly I'm reasoning in the way that you guess: more resources -> more capabilities -> more incentives
- I'm also thinking "stronger signal that the US is all in and thinks this is really important -> raises p(China should also be all in) from a Chinese perspective -> more likely China invests hard in stealing the weights"
- these aren't independent lines of reasoning, as the stronger signal is sent by spending more resources
- but I tentatively think that it's not the case that at a fixed capability level the incentives to steal the weights are the same. I think they'd be higher with a centralised project, as conditional on a centralised project there's more reason for China to believe a) AGI is the one thing that matters, b) the US is out to dominate

Should there be just one western AGI project?
rosehadshar · 1y

Thanks, I agree this is an important argument.

Two counterpoints:

  • The more projects you have, the more attempts at alignment you have. It's not obvious to me that more draws are net bad, at least at the margin of 1 to 2 or 3.
  • I'm more worried about the harms from a misaligned singleton than from a misaligned system (or multiple misaligned systems) in a wider ecosystem which includes powerful aligned systems.
Posts (score · title · age · comments)

69 · Sense-making about extreme power concentration · 2mo · 25
26 · Good government · 2mo · 0
124 · The Industrial Explosion (Ω) · 5mo · 66
132 · AI-enabled coups: a small group could use AI to seize power (Ω) · 7mo · 23
40 · Three Types of Intelligence Explosion (Ω) · 8mo · 8
45 · Intelsat as a Model for International AGI Governance · 8mo · 0
78 · Should there be just one western AGI project? · 1y · 75
31 · New report: A review of the empirical evidence for existential risk from AI via misaligned power-seeking · 2y · 5
61 · Results from an Adversarial Collaboration on AI Risk (FRI) · 2y · 3
41 · Strongest real-world examples supporting AI risk claims? (Question) · 2y · 7