[ Question ]

Could AI be used to engineer a sociopolitical situation where humans can solve the problems surrounding AGI?

by hollowing
25th Jan 2023

Hi, I've been learning about alignment and am new to LessWrong. Here's my question.

There seems to be a consensus here that AI couldn't be used to solve the problem of AI control per se. That said, is there any discussion or literature on whether a future AI might be able to generate a highly impactful political strategy which, if enacted, would engineer a sociopolitical situation in which humans have better prospects of solving the problems around AGI?

This question came to mind while discussing how, in principle, there should be a way to string together words (and tone, body language, etc.) to convince anyone of anything. Likewise, there should in principle be sequences of actions that would change society/culture into any arbitrary state. Most of these strategies are far outside the range of what a human could come up with, but a smarter AI might be able to find them, or more generally have very intelligent ideas humans can't, as Robert Miles helped illustrate to me in this video (https://youtu.be/L5pUA3LsEaw?t=359).
1 Answer, sorted by top scoring

Jay Bailey

Jan 25, 2023

As a useful exercise, I would advise asking yourself this question first, and thinking about it for five minutes (using a clock) with as much genuine intent to argue against your idea as possible. I might be overestimating the amount of background knowledge required, but this does feel solvable with info you already have.

ROT13: Lbh lbhefrys unir cbvagrq bhg gung n fhssvpvragyl cbjreshy vagryyvtrapr fubhyq, va cevapvcyr, or noyr gb pbaivapr nalbar bs nalguvat. Tvira gung, jr pna'g rknpgyl gehfg n fgengrtl gung n cbjreshy NV pbzrf hc jvgu hayrff jr nyernql gehfg gur NV. Guhf, jr pna'g eryl ba cbgragvnyyl hanyvtarq NV gb perngr n cbyvgvpny fgengrtl gb cebqhpr nyvtarq NV.
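(For anyone checking the spoiler after doing the exercise: ROT13 is a self-inverse letter substitution, so the same operation encodes and decodes. A minimal sketch using Python's standard codecs module, with a placeholder string rather than the spoiler itself:)

```python
import codecs

# ROT13 rotates each letter 13 places; applying it twice returns the
# original text, so decoding and encoding are the same call.
encoded = "Uryyb, jbeyq!"  # placeholder example, not the spoiler above
print(codecs.decode(encoded, "rot13"))  # -> Hello, world!
```

The same call applied to the paragraph above recovers the plain text.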

hollowing · 3y

Thanks for the response. I did think of this objection, but wouldn't it be obvious if the AI were trying to engineer a different situation than the one requested? E.g., wouldn't such a strategy seem unrelated and unconventional?

It also seems that a hypothetical AI with just enough ability to generate a strategy for the desired situation would not be able to engineer a strategy for a different situation that would both work and deceive the human actors. That is, the latter seems harder and would require an AI with greater ability.

Jay Bailey · 3y
I think the most likely outcome of actually trying this with an AI in real life is that you end up with a strategy that is convincing to humans but ineffective or unhelpful in reality, rather than a galaxy-brained strategy that pretends to produce X but actually produces Y while simultaneously deceiving humans into thinking it produces X. I agree with you that "Come up with a strategy to produce X" is easier than "Come up with a strategy to produce Y AND convince the humans that it produces X", but I also think "Come up with a strategy that convinces the humans that it produces X" is much easier than producing a strategy that actually works. So I believe this approach would be far more likely to be useless than dangerous, but I still don't think it would help.
hollowing · 3y
I agree this would be much easier. However, I'm wondering why you think an AI would prefer it, if it has the capability to do either. I can see some possible reasons (e.g., an AI may not want problems of alignment to be solved). Do you think that would be an inevitable characteristic of an unaligned AI with enough capability to do this?
Jay Bailey · 3y
I agree an AI would prefer to produce a working plan if it had the capacity. I think that an unaligned AI, almost by definition, does not want the same goal we do. If we ask for Plan X, it might produce Plan X as asked if that plan were totally orthogonal to its goals (i.e., the plan's success or failure is irrelevant to the AI), but if it could do better by creating Plan Y instead, it would. So the question is: how large is the capability difference between "AI can produce a working plan for Y, but can't fool us into thinking it's a plan for X" and "AI can produce a working plan for Y that looks to us like a plan for X"? The honest answer is "We don't know". Since failure could be catastrophic, this isn't something I'd like to leave to chance, even though I wouldn't go so far as to call the result inevitable.
1 comment, sorted by top scoring
Mitchell_Porter · 3y

You should discuss this with ChatGPT. 
