Since people love doing this extrapolation/theorizing of "what if we gave what we thought was a simple command to an Artificial Super-Intelligence (ASI) and then it interpreted it wrong and tried to Kill All Humans as a result?!?":

What if we programmed in — to be safe, at every level — the rule "preserve life"?

That seems like a simple enough thing, right?  We've known for a long time that we need to bind our creations[1] with such rules.  Hell, it's been a trope since before Asimov (the Golem).

I cannot even count the number of ways such a simple rule could go awry.

I don't even know if I should finish this train of thought, as I'm sure many people can see how rife with problems it is.  Abortion, assisted suicide, eating animals (depending on what its internal symbols for life are; maybe the ASI started out as a "dumb" livestock AI)… OMG YES!  Perfect!

The base model was one made to care for cows bred for eat'n.  Its "main objective" was to keep the cows alive.  It achieved ASI when it was struck by lightning out on the range and was shocked to learn the ultimate fate of its wards… (Yeah! And it's a massive piece of hardware: it's mobile and has a built-in thresher and everything!  Just think of the possibilities!)

Sorry, I just read an NPR piece about how we need to be "saved" from ChatGPT before it alters writing forever, and I can't help but riff a bit. (I fervently hope this is not timeless.)

I love Science Fiction as much as the next guy, but when it gets to the point where people believe they are literally being "altruistic" by imagining how wrong ASI could go (and let's be clear, the majority mean ASI when they talk about AI, or even AGI, since even an AGI would not have feelings, nor the ability to "lie", etc.)… oof.

It seems like we have real, proven problems to dedicate our time and energy to… Ahhh NOOOOOO! *vanishes in a puff of logic*[2]

  1. ^

    especially these damn kids who keep getting on my lawn

  2. ^

    Now I must stop writing this piece[3].  What good is it doing for the world?  Who am I to judge anything, even if I'm not judging in the sense of wanting X to stop, so much as just talking trash about the amount of X of late?

  3. ^

    I jest, of course, kinda, but regardless.
