I read here a lot about how an AI would not allow humans to change its goal, even going so far as to kill humans who are trying to do so.

On one of Rob Miles' videos (which are awesome!) he says something along the lines of "Imagine someone offers you a pill that will make it so you don't care if your child lives or dies.  Would you take the pill?"

Obviously not.  

However, think of other goals I have.  Imagine someone offers me a pill such that I will no longer have the goal of seeking heroin at all costs.  I'll take it!

More mundanely, imagine I'm on my way to buy some oranges and someone comes up and begs me to take a pill such that I will no longer be interested in buying oranges today.  Sure, why not?

I get that computers are single-minded: tell one to do X and it will fucking do it, man.

However, these AGIs are going to be complex enough that it isn't obvious to me that they won't be indifferent to having their goals changed.  Obviously they will have competing objectives internally, just like humans do.  It isn't obvious to me that they will be monomaniacal.

A random addendum: Notice in your own thoughts whether you are immediately searching for reasons why I am wrong.

Comments

"Obviously they will have competing objectives internally, just like humans do."

This is not so obvious to me.

Humans are a product of evolution, so it makes sense for us to have various trackers of things that can hurt us (such as hunger, low social status, etc.), where each gives simple advice, but sometimes the different pieces of advice contradict each other (you are really hungry, but in a situation where admitting it would lower your status).

Computers follow an algorithm. If the algorithm is "for each possible token, calculate the probability of it appearing next in the text, then write the token with the greatest probability", there is not much potential for internal conflict.
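For what it's worth, here is a minimal sketch of that greedy decoding loop (in Python; `next_token_probabilities` is a hypothetical stand-in for whatever model scores candidate tokens). At every step there is exactly one quantity being maximized, so nothing inside the loop can "disagree" with anything else:

```python
from typing import Callable, Dict

def greedy_decode(
    prompt: str,
    next_token_probabilities: Callable[[str], Dict[str, float]],
    max_tokens: int = 20,
) -> str:
    """Repeatedly append the single most probable next token."""
    text = prompt
    for _ in range(max_tokens):
        probs = next_token_probabilities(text)  # candidate token -> probability
        best = max(probs, key=probs.get)        # the one criterion: highest probability
        text += best                            # append it and repeat
    return text
```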

Sure, but it only takes one hyperdesperate squigglewanter. Perhaps a non-desperate orangewanter will take a pill to not want oranges, but do you really think a hyperdesperate squigglewanter is going to care that an orangewanter took a pill?