Point is, you will have to answer that question - whether after deliberation or instinctively - and then move on to a hundred other decisions and calculations relevant to the situation. And if you fumble too much, your superior might replace you with someone who is more prepared to make such decisions.
Most disagreements could be easily solved by each of us taking half.
Could you please elaborate on this?
Self preservation isn't worth risking to make a few changes to the copy's plans.
Would this mean you personally value your own life pretty highly (relative to the rest of humanity)?
Hedonism is fun and destruction is easy, but creation is challenging and satisfying in a way neither of them is.
Makes sense, can totally relate!
Yup, makes sense. But I also feel the "toy agent model" of terminal and instrumental preferences has real-life implications (even though it may not be the best model). Namely, that you will always value yourself over your clone for instrumental reasons if you're not perfect clones. And I also tend to feel the extent to which you value yourself over your clone will be high in such high-stakes situations.
Keen on anything that would weaken / strengthen my intuitions on this :)
Just to clarify, by utopia here I mean the positivity/valence of actual experience, rather than just technological superiority. So Nazis might think they will reach utopia once they have technological superiority, but it wouldn't be a utopia in my book until they're also much happier than people today. I would hope desires to exterminate races are not stable in "happy" societies, but the truth is I don't really know. (If they're not stable, I'm assuming either self-destruction, or else psychological rewiring until they don't want to kill people.)
Thanks for replying! I generally agree with your intuition that similar people are worth cooperating with, but I also feel like when the stakes are high this can break down. Maybe I should have defined the hypothetical such that you're both co-rulers or something until one kills the other.

Cause like - the worst case in a fight is you lose and the clone does what they want - which is already not that awful (probably), and already guaranteed. But you may still believe you have something non-trivially better to offer than this worst case. And you may be willing to fight for it. (Just my intuitions :p)

Do you have thoughts on what you'll do once you're the ruler?
War doesn't necessarily prevent utopias from being created. Nor do bad rulers - I'm sure Nazis would work towards utopia too if they knew it were possible. The only thing we know for certain will prevent utopias from being created is existential risks, as identified in the post.
But if human preferences make references to the self, then those preferences are also relevant to the AGI alignment problem (trying to make AI have the same preferences that humans have).

Although I guess my example was also about: even if humans' terminal preferences do not make references to the self, they will still instrumentally value themselves and not the clone, because of a lack of trust that the clone's preferences are 100% identical.
Thank you, I will check it out!
Interesting. Let's say you both agree to leave the room. Would you later feel guilt looking at all the suffering in the world, knowing you could have helped prevent it? Be it genocides, world wars, misaligned AI, Zuckerberg becoming the next dictator, or something else.
If you trust that the other person has identical goals to yours, will it matter to you who presses the button? Say you both race for the button, collide into each other, and miss it. Will you now fight, or graciously let the other person press it?