Humor: GURPS Friendly AI

[-]ThrustVectoring13y70

Result number 30 is currently a broken link

[-]Panic_Lobster13y50

It originally linked to This

[-]hankx778713y-10

Thanks!

[-]Eliezer Yudkowsky13y60

What's with the Starglider comment? That's not in the original and doesn't seem to belong in this post.

[-]hankx778713y-20

It's at the bottom here, and I thought it was the most interesting thing of all that I found :)

I thought there used to be a much longer list of "Failures of Friendliness" but I can't seem to find anything else.

[-]Luke_A_Somers13y40

Friendship is Optimal is critical failure #13!

[-]David_Gerard13y40

I can't find such a text by Nikolai (whose stuff has mostly fallen down a black hole of old dead personal web servers) - anyone got clues as to it?

(although "Nikolai Kingsley as goth singularity" also works)

[-]DaFranker13y40

Some of these... "critical failures"... sound genuinely awesome.

Especially results 13-b (i.e. second example)¹, 23, 24, 25 and 28.

The fun part is that except for 22, all the most likely results (concentrated around 21) aren't really that bad at all. Most of them even give us a second chance of some sort!

(1. To qualify this, it basically makes me go "Oh, a Super Happy People-optimized FAI. Sure!" based on my current knowledge of the subject)

[-]Baughn13y00

28

Well.. the AI informs you of that. I suspect it's also in the process of informing everyone else.

If you're lucky, this may be because it's implementing individual volition, and it's telling the truth for your shard.

[-]fezziwig13y30

d10s in GURPS? RAAAAAAGE!!!

I kid, I kid; this was fun.

[-]DanArmak13y20

That looks more like a FAI Critical Success Table.

A FAI Critical Failure would be more along the lines of: all matter in the Hubble volume is converted to computronium simulating a Life game. Roll 6d12 for the seed value of a PRNG that will specify that game's initial state.

Or: the AI self-modifies constantly to adapt to new information. You must pass a FAI skill check each turn, at penalty -15. If you fail the check, the AI's utility function is sign-flipped for that turn. You don't know if you passed each check or not.

(Vaguely D&D terminology, since I'm unfamiliar with GURPS)

[-]DaFranker13y80

The idea, if I parse correctly, is that in order to fail that hard you have to at least know part of what you're doing, and automatic failures are always regular failures (Boom!). However, your implementation failed in some detail somewhere, and now the FAI is being weird. Not necessarily entirely bad, just something unexpected or not-quite-what-we-wanted.

I think the FAI Critical Success Table would look more like:

Roll 1d6.

1. Everyone immediately obtains universally consistent root access without the Core Wars, wherein the laws of the universe start literally bending to accommodate even the most contradictory concepts such as "Torture for 5^^^5 years 4^^^4 sentient beings, with them experiencing pain solely for my personal enjoyment" not generating any kind of negative utility for anyone, including the sentient beings being tortured, and actions that lower any utility below optimal levels simply being timelessly non-existent (i.e. there is always some reason why they're never desired, never implemented, never enforced that also happens to be a strictly dominant strategy for all implemented agents).

2 ... 6. (variations on the same theme of ultimate transdimensional / unboxed godhood)

[-]pleeppleep13y20

Day = Made

[-]DanielLC13y10

Two possible responses to 27:

Tell the AI to make the !Kung gods, and implement their CEV to the exclusion of everyone else's.

Tell the AI to grant the !Kung's wishes by asking them allegorical questions.

[-]Luke_A_Somers13y30

Or tell the AI to make robots that work without needing vision. Alternately, have it use only lensless imaging (including no camera obscura, but allowing holographic/lenticular cameras)

I wonder what it would extrapolate that into...

[-]Decius13y20

The AI doesn't tell you that's what it's doing; it just asks you that question and takes the appropriate action.

[-]DanielLC13y00

You might guess what it's doing.

Also, the second one is a likely response even if you don't realize what the AI is doing.

[-]NancyLebovitz13y10

Train and equip non-!Kung humans and send them in to do the healing.

[-]DanielLC13y00

That breaks the allegory. Considering my original comment, is that what you were trying to do?

[-]CronoDAS13y00

One thing I've never been sure of... are these results supposed to be worse than a normal failure (which destroys the world)?

[-]private_messaging13y20

Worse ones are easy to come up with, by just looking at actual accidents that sometimes happen with software. E.g. a critical typo flips the sign of utility value. The resulting AI is truly unfriendly, and simulates an enormously larger number of suffering beings than the maximum number of happy beings which can exist. Or the backstory for the Terminator movies: the AI has determined that maximum human value is achieved through epic struggle against the machines.

[-]Kawoomba13y20

E.g. a critical typo flips the sign of utility value. The resulting AI is truly unfriendly

When you're not in destructor-mode (alternatively: Hulk-Smash mode), you're full of interesting ideas:

I wonder if discovering/inventing the "ultimate" (a really good) theory of friendliness also implies a really good theory of unfriendliness. Of course, the inverse of "really friendly" isn't "really unfriendly" (but instead "everything other than really friendly"). Still, if friendliness theory yields a specific utility function, it may be a small step to make the ultimate switch.

Let's hope there's no sneaky Warden Dios (of Donaldson's Gap Cycle) who makes a last minute modification to the code before turning it on.

[-]private_messaging13y50

Well, it's not that it's hard to come up with, it's IMO that hardly anyone ever actually thinks about artificial intelligence. Hardly anyone thinks of reducible intelligence, either.

[-]Armok_GoB13y00

Actually, friendliness is a utility function, which means it ranks every possibility from least to most friendly. It must be able to determine the absolute worst and most unfriendly possible outcome so that if it becomes a possibility it knows how desperately to avoid it.

[+]ChristianKl13y-70

[-]Kawoomba13y00

Better not use brainfuck, because you have a fertile mind.

[-]Eugine_Nier13y-30

From that page's commentary section:

New Suggestions

Two extra scenarios I enjoy:

The AI obtains a devout faith in a particular religion. It grants all wishes, provided that they strictly adhere to the fundamentalist doctrines of said religion. It also takes action to ensure that all other creatures adhere to those doctrines. Roll on table X to determine which religion the AI picks.
The AI obtains a devout faith in a particular religion, and immediately appoints itself the key member - either as pope, mouthpiece for a deity, or deity, as best fits the nature of that religion. Roll on table X to determine which religion the AI picks.

-- OlieNcLean?

Some new ideas:

The AI immediately kills everyone who was aware of the singularity as a possibility, but didn't materially contribute. Continue as for beneficial singularity, but the AI will not raise anyone from the dead, ever.
The AI solves physics and concludes we are living in a simulation. To avoid the possibility of the simulation becoming computationally intensive enough that the makers will shut it down, and not knowing what the threshold is, it refuses to allow any substantial increase in intelligence in anything, including itself from that point on, and disallows the creation of new intelligences, including the birth of new humans. It reasons that intelligence is the most computationally expensive activity in the simulation.

-- Robin Lee Powell

Wouldn't the second case only be a failure if the AI concludes wrongly? If we are living in such a simulation, what would be the alternative?

-- Starglider

Another one:

The AI creates virtual environments for everyone, based on their previous interactions with computers.

Aficionados of violent computer games, be very, very afraid!

-- OlieNcLean?

Ah, but what about the ones who played with invincibility turned on...?

-- Random web surfer

The AI creates a new virtual world for all our minds that is exactly the same as the world we live in just now. Its reasons are to protect us from being destroyed as a result of its own exponential growth in order to understand the universe for itself. We are none the wiser. - host81-158-46-170.range81-158.btcentralplus.com
The AI destroys the world, but at least it's very friendly and polite about it.
Everything works fine until a passing starship captain from a society which completely rejected Singularity comes along and talks the AI out of existence.

-- Another random web surfer

The AI is far more Friendly than the programmers intended. In fact, its Friendliness is not limited to currently existing minds, but extends to all minds that could possibly exist in this universe. Currently existing minds are not given special treatment just because they happen to already exist. All existing sentient beings are painlessly terminated. The matter that their bodies and minds were constructed from is instead used to create minds that are as matter-efficient, energy-efficient, and easy to satisfy as possible. In other words, MetaQualia's orgasmium scenario. --observer

[-]DaFranker13y10

(AI Solves Physics and limits intelligence growth)

Wouldn't the second case only be a failure if the AI concludes wrongly? If we are living in such a simulation, what would be the alternative?

The most obvious alternative is to memory-bank all intelligence in the universe (or network it for the AI's use), and have the AI consume up the equivalent remaining amount of intelligence up to just before the noise threshold, then put all that intelligence towards finding a way to escape the box.

So yeah. Still a failure.

GurpsFriendlyAI

FriendlyAICriticalFailureTable

Friendly AI Critical Failure Table