otto.barten

Comments
"AI Alignment" is a Dangerously Overloaded Term
otto.barten · 2y

I think it's a great idea to think about what you call goalcraft.

I see this problem as similar to the age-old problem of controlling power. I don't think ethical systems such as utilitarianism are a great place to start. Any academic ethical model is just an attempt to summarize what people actually care about in a complex world. Taking such a model and coupling it to an all-powerful ASI seems like a highway to dystopia.

(Later edit: also, an academic ethical model is irreversible once implemented. A static goal can never be reversed, since allowing reversal would never bring the ASI closer to that goal. If an ASI is aligned to someone's (anyone's) preferences, however, the whole ASI could be turned off if they want it to be, making the ASI reversible in principle. I think ASI reversibility (being able to switch it off in case we turn out not to like it) should be mandatory, and therefore we should align to human preferences rather than to an abstract philosophical framework such as utilitarianism.)

I think letting the random programmer who happened to build the ASI, or their no less random CEO or shareholders, determine what happens to the world is an equally terrible idea. They wouldn't need the rest of humanity for anything anymore, making the fates of >99% of us extremely uncertain, even in an abundant world.

What I would be slightly more positive about is aggregating human preferences (I think preferences is a more accurate term than the more abstract, less well-defined term values). I've heard two interesting examples; there are no doubt many more options.

The first is simple: query ChatGPT. Even this relatively simple model is not terrible at aggregating human preferences. Although a host of issues remain, I think using a future, no doubt much better, AI for preference aggregation is not the worst option (and a lot better than the two mentioned above).

The second option is democracy. This is our time-tested method of aggregating human preferences to control power. For example, one could imagine an AI control council consisting of elected human representatives at the UN level, or perhaps a council of representative world leaders. I know there is a lot of skepticism among rationalists about how well democracy functions, but it is one of the very few time-tested aggregation methods we have, and we should not discard it lightly for something less tested. An alternative is some kind of unelected autocrat (e/autocrat?), but apart from this not being my personal favorite, note that (in contrast to historical autocrats) such a person would also no longer need the rest of humanity in any way, making our fates uncertain.
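(As a purely illustrative aside on what aggregating ranked preferences can look like mechanically, here is a minimal toy sketch using the Borda count, one classic social-choice rule; the voters, option names, and scores are hypothetical, and this is not one of the mechanisms discussed above.)

```python
# Toy Borda-count aggregation of ranked preferences (illustrative only).
from collections import defaultdict

def borda_count(rankings: list[list[str]]) -> dict[str, int]:
    """Each ranking lists options from most to least preferred.
    An option in position k of an n-long ranking gets n - 1 - k points."""
    scores: dict[str, int] = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, option in enumerate(ranking):
            scores[option] += n - 1 - position
    return dict(scores)

# Three hypothetical voters ranking three hypothetical policy options.
votes = [
    ["pause", "regulate", "accelerate"],
    ["regulate", "pause", "accelerate"],
    ["regulate", "accelerate", "pause"],
]
print(borda_count(votes))  # {'pause': 3, 'regulate': 5, 'accelerate': 1}
```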

Although AI-based and democratic preference aggregation are the two options I'm least negative about, I generally think we are not ready to control an ASI. One of the worst issues I see is negative externalities that only become clear later on; climate change can be seen as a negative externality of the steam/petrol engine. Also, I'm not sure a democratically controlled ASI would necessarily block follow-up unaligned ASIs (assuming that is at all possible). To be existentially safe, I would say we need a system that does at least that.

I think it is very likely that ASI, even if controlled in the least bad way, will cause huge externalities leading to dystopia, environmental disasters, etc. Therefore I agree with Nathan above: "I expect we will need to traverse multiple decades of powerful AIs of varying degrees of generality which are under human control first. Not because it will be impossible to create goal-pursuing ASI, but because we won't be sure we know how to do so safely, and it would be a dangerously hard to reverse decision to create such. Thus, there will need to be strict worldwide enforcement (with the help of narrow AI systems) preventing the rise of any ASI."

About terminology: it seems to me that what I call preference aggregation, outer alignment, and goalcraft mean similar things, as do inner alignment, aimability, and control. I'd vote for using preference aggregation and control.

Finally, I strongly disagree with calling diversity, inclusion, and equity "even more frightening" than someone who's advocating human extinction. I'm sad on a personal level that people at LW, an otherwise important source of discourse, seem to mostly support statements like this. I do not.

These are my reasons to worry less about loss of control over LLM-based agents
otto.barten · 6d

I agree that AI intelligence is, and will likely remain, spiky, and that some spikes are above human level (of course a calculator also spikes above human level). But I'm not yet convinced that the whole LLM-based intelligence spectrum will max out above takeover level. I'd be open to arguments.

These are my reasons to worry less about loss of control over LLM-based agents
otto.barten · 6d

Someone or something will always be in power. If that entity decides not to allocate any resources to most humans, then yes, we die. But that could also have happened in an AI takeover scenario, depending on whose values ended up in it.

MAGA speakers at NatCon were mostly against AI
otto.barten · 15d

"Protectionism against AI" is a bit of an indirect way to point at not using AI for some tasks for job market reasons, but thanks for clarifying. Reducing immigration or trade won't solve AI-induced job loss, right? I do agree that countries could decide to either not use AI, or redistribute AI-generated income, with the caveat that those choosing not to use AI may be outcompeted by those who do. I guess we could, theoretically, sign treaties to not use AI for some jobs anywhere.

I think redistribution of AI-generated income is more likely, though, since it seems like the obviously better solution.

MAGA speakers at NatCon were mostly against AI
otto.barten · 15d

Thanks for correcting it. I still don't really get your connection between protectionism and mass unemployment. Perhaps you could make it explicit?

MAGA speakers at NatCon were mostly against AI
otto.barten · 15d

Sci-fi was probably fun to think about for some in the 90s, but things got more serious when it became clear the singularity could kill everyone we love. Yud bit the bullet and now says we should stop AI before it kills us. Did you bite that bullet too? If so, you're not purely pro-tech anymore, whether you like it or not. (Which I think shouldn't matter, because pro- vs anti-tech has always been a silly way to look at the world.)

MAGA speakers at NatCon were mostly against AI
otto.barten · 15d

I don't really understand your thoughts about developing vs developed countries and protectionism; could you make them more explicit?

MAGA speakers at NatCon were mostly against AI
otto.barten · 15d

How would you define pro-tech, which I assume you identify as? For example, should AI replace humanity a) in any case, if it can, b) only if it's conscious, or c) not at all?

MAGA speakers at NatCon were mostly against AI
otto.barten · 15d

If we end up in a world with mass unemployment (say 90%), I expect those currently self-identifying as conservatives to support strong redistribution of income, along with almost everyone else. I expect strong redistribution to happen in countries where democracy with income-independent voting rights is still alive by then, if any. In those where it's not, it may not happen, and people might die of starvation, be driven out of their homes, etc.

MAGA speakers at NatCon were mostly against AI
otto.barten · 15d

Anti- vs pro-tech is an outdated, needlessly primitive, and needlessly polarizing framework for looking at the world. We should obviously consider which tech is net positive and build that, and which tech is net negative and regulate it at the point where it starts being so.

Posts
These are my reasons to worry less about loss of control over LLM-based agents (7 karma · 6d · 4 comments)
We should think about the pivotal act again. Here's a better version of it. (11 karma · 1mo · 2 comments)
AI Offense Defense Balance in a Multipolar World (15 karma · 2mo · 5 comments)
Yes RAND, AI Could Really Cause Human Extinction [crosspost] (17 karma · 3mo · 4 comments)
US-China trade talks should pave way for AI safety treaty [SCMP crosspost] (10 karma · 4mo · 0 comments)
New AI safety treaty paper out! (15 karma · 6mo · 2 comments)
Proposing the Conditional AI Safety Treaty (linkpost TIME) (11 karma · 10mo · 9 comments)
Announcing the AI Safety Summit Talks with Yoshua Bengio (9 karma · 1y · 1 comment)
What Failure Looks Like is not an existential risk (and alignment is not the solution) (13 karma · 2y · 12 comments)
Announcing #AISummitTalks featuring Professor Stuart Russell and many others (17 karma · 2y · 1 comment)