kibber — LessWrong

I think what the OP is saying is that each luigi step is actually a superposition step, and therefore each next line adds up the probability of collapse. However, from a pure trope perspective I believe this is not really the case - in most works of fiction that have a twist, the author tends to leave at least some subtle clues for the twist (luigi turning out to be a waluigi). So it is possible at least for some lines to decrease the possibility of waluigi collapse.

The Preference Fulfillment Hypothesis

kibber3y32

To clarify, I meant the AI is unlikely to have it by default (being able to perfectly simulate a person does not in itself require having empathy as part of the reward function).

If we try to hardcode it, Goodhart's curse seems relevant: https://arbital.com/p/goodharts_curse/

The Preference Fulfillment Hypothesis

kibber3y*42

An AI that can be aligned to preferences of even just one person is already an aligned AI, and we have no idea how to do that.

An AI that's able to ~perfectly simulate what a person would feel would not necessarily want to perform actions that would make the person feel good. Humans are somewhat likely to do that because we have actual (not simulated) empathy, that makes us feel bad when someone close feels bad, and the AI is unlikely to have that. We even have humans that act like that (i.e. sociopaths), and they are still humans, not AIs!

Failed Utopia #4-2

kibber13y30

...from Venus, and only animals left on Earth, so one more planet than we had before.

LESSWRONG
LW

LESSWRONG
LW

Posts

Wikitag Contributions

Comments

Posts

Wikitag Contributions

Comments