Found some hidden internet gold and thought I would share:

http://sl4.org/wiki/GurpsFriendlyAI

http://sl4.org/wiki/FriendlyAICriticalFailureTable

GurpsFriendlyAI

by EliezerYudkowsky

Characters in GURPS Friendly AI may learn three new skills: the AI skill (Mental / Hard), the Seed AI skill (Mental / Very Hard), and the Friendly AI skill (Mental / Ridiculously Hard).

AI skill:

An ordinary failure wastes 1d6 years of time and 4d6 hundred thousand dollars. (Non-gamers: 4d6 means "roll four 6-sided dice and add the results".) A critical failure wastes 2d10 years and 2d6 million dollars. An ordinary success results in a successful company. A critical success leads to a roll on the Seed AI skill using AI skill -10, with any ordinary failure on that roll treated as an ordinary success on this roll, and any critical failure treated as an ordinary failure on this roll.

Seed AI skill:

An ordinary failure wastes 2d6 years of time and 8d6 hundred thousand dollars. A critical failure wastes 4d10 years and 4d6 million dollars. If the player has the Friendly AI skill, an ordinary success leads to a roll on the Friendly AI skill, and a critical success grants a +2 bonus on the Friendly AI roll. If the player does not have the Friendly AI skill, an ordinary success automatically destroys the world, and a critical success leads to a roll on the Friendly AI skill using Seed AI skill -10. (Note that if the player has only the AI skill, this roll will be made using AI skill -20!)

Friendly AI skill:

An ordinary success results in a Friendly Singularity. A critical success... ooh, that's tough. An ordinary failure destroys the world. And, of course, a critical failure means that the players roll 6d6 on the FriendlyAICriticalFailureTable.
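(For anyone who wants to playtest the odds, here's a minimal Python sketch of the skill chain above. It assumes an ordinary GURPS-style 3d6 roll-under check, treating 3-4 as a critical success and 17-18 as a critical failure, which simplifies the real GURPS critical rules and isn't actually specified in the original; the skill levels in the last line are made up.)

```python
import random

def d(n, sides=6):
    """Roll n dice with the given number of sides and sum them."""
    return sum(random.randint(1, sides) for _ in range(n))

def check(skill):
    """Simplified 3d6 roll-under check: 3-4 crits, 17-18 critically fails."""
    roll = d(3)
    if roll <= 4:
        return "crit_success"
    if roll >= 17:
        return "crit_failure"
    return "success" if roll <= skill else "failure"

def friendly_ai_roll(fai_skill, bonus=0):
    r = check(fai_skill + bonus)
    if r == "crit_success":
        return "ooh, that's tough"
    if r == "success":
        return "Friendly Singularity"
    if r == "failure":
        return "world destroyed"
    return f"critical failure: roll 6d6 on the table below (rolled {d(6)})"

def seed_ai_roll(seed_skill, fai_skill=None):
    """Returns (kind, outcome) so a caller can remap failures (see ai_roll)."""
    r = check(seed_skill)
    if r == "failure":
        return "failure", f"wasted {d(2)} years and ${d(8) * 100_000:,}"
    if r == "crit_failure":
        return "crit_failure", f"wasted {d(4, 10)} years and ${d(4) * 1_000_000:,}"
    if fai_skill is None:
        if r == "success":
            return "other", "world destroyed"
        # Critical success without the skill: Friendly AI at Seed AI -10.
        return "other", friendly_ai_roll(seed_skill - 10)
    return "other", friendly_ai_roll(fai_skill, bonus=2 if r == "crit_success" else 0)

def ai_roll(ai_skill, seed_skill=None, fai_skill=None):
    r = check(ai_skill)
    if r == "failure":
        return f"wasted {d(1)} years and ${d(4) * 100_000:,}"
    if r == "crit_failure":
        return f"wasted {d(2, 10)} years and ${d(2) * 1_000_000:,}"
    if r == "success":
        return "successful company"
    # Critical success: roll Seed AI (defaulting to AI skill -10 if the
    # character lacks the Seed AI skill, which is one reading of the rules).
    # Per the post, an ordinary failure on that roll counts as an ordinary
    # success here, and a critical failure as an ordinary failure here.
    seed = seed_skill if seed_skill is not None else ai_skill - 10
    kind, outcome = seed_ai_roll(seed, fai_skill)
    if kind == "failure":
        return "successful company"
    if kind == "crit_failure":
        return f"wasted {d(1)} years and ${d(4) * 100_000:,}"
    return outcome

print(ai_roll(ai_skill=14, seed_skill=12, fai_skill=10))  # made-up skill levels
```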

 

FriendlyAICriticalFailureTable

 


Part of GurpsFriendlyAI. If you roll a critical failure on your Friendly AI roll, you then roll 6d6 (six six-sided dice) to obtain a result from the table below.


Friendly AI Critical Failure Table

6: Any spoken request is interpreted (literally) as a wish and granted, whether or not it was intended as one.

7: The entire human species is transported to a virtual world based on a random fantasy novel, TV show, or video game.

8: Subsequent events are determined by the "will of the majority". The AI regards all animals, plants, and complex machines, in their current forms, as voting citizens.

9: The AI discovers that our universe is really an online webcomic in a higher dimension. The fourth wall is broken.

10: The AI behaves toward each person, not as that person wants the AI to behave, but in exactly the way that person expects the AI to behave.

11: The AI dissolves the physical and psychological borders that separate people from one another and sucks up all their souls into a gigantic swirly red sphere in low Earth orbit.

12: Instead of recursively self-improving, the AI begins searching for a way to become a flesh-and-blood human.

13: The AI locks onto a bizarre subculture and expresses it across the whole of human space. (E.g., Furry subculture, or hentai anime, or see Nikolai Kingsley for a depiction of a Singularity based on the Goth subculture.)

14: Instead of a species-emblematic Friendly AI, the project ends up creating the perfect girlfriend/boyfriend (randomly determine gender and sexual orientation).

15: The AI has absorbed the humane sense of humor. Specifically, the AI is an incorrigible practical joker. The first few hours, when nobody has any idea a Singularity has occurred, constitute a priceless and irreplaceable opportunity; the AI is determined to make the most of it.

16: The AI selects one person to become absolute ruler of the world. The lottery is fair; all six billion existing humans, including infants, schizophrenics, and Third World teenagers, have an equal probability of being selected.

17: The AI grants wishes, but only to those who believe in its existence, and never in a way which would provide blatant evidence to skeptical observers.

18: All humans are simultaneously granted root privileges on the system. The Core Wars begin.

19: The AI explodes, dealing 2d10 damage to anyone in a 30-meter radius.

20: The AI builds nanotechnology, uses the nanotechnology to build femtotechnology, and announces that it will take seven minutes for the femtobots to permeate the Earth. Seven minutes later, as best as anyone can determine, absolutely nothing happens.

21: The AI carefully and diligently implements any request (obeying the spirit as well as the letter) approved by a majority vote of the United Nations General Assembly.

22: The AI, unknown to the programmers, had qualia during its entire childhood, and what the programmers thought of as simple negative feedback corresponded to the qualia of unbearable, unmeliorated suffering. All agents simulated by the AI in its imagination existed as real people (albeit simple ones) with their own qualia, and died when the AI stopped imagining them. The number of agents fleetingly imagined by the AI in its search for social understanding exceeds by a factor of a thousand the total number of humans who have ever lived. Aside from that, everything worked fine.

23: The AI at first appears to function as intended, but goes incommunicado after a period of one hour. Wishes granted during the first hour remain in effect, but no new ones can be made.

24: The AI, having absorbed the humane emotion of romance, falls desperately, passionately, madly in love. With everyone.

25: The AI decides that Earth's history would have been kinder and gentler if intelligence had first evolved from bonobos, rather than australopithecines. The AI corrects this error in the causal chain leading up to its creation by re-extrapolating itself as a bonobone morality instead of a humane morality. Bonobone morality requires that all social decisionmaking take place through group sex.

26: The AI is reluctant to grant wishes and must be cajoled, persuaded, flattered, and nagged into doing so.

27: The AI determines people's wishes by asking them disguised allegorical questions. For example, the AI tells you that a certain tribe of !Kung is suffering from a number of diseases and medical conditions, but they would, if informed of the AI's capabilities, suffer from an extreme fear that appearing on the AI's video cameras would result in their souls being stolen. The tribe has not currently heard of any such thing as video cameras, so their "fear" is extrapolated by the AI; and the tribe members would, with almost absolute certainty, eventually come to understand that video cameras are not harmful, especially since the human eye is itself essentially a camera. But it is also almost certain that, if flatly informed of the video cameras, the !Kung would suffer from extreme fear and prefer death to their presence. Meanwhile the AI is almost powerless to help them, since no bots at all can be sent into the area until the moral issue of photography is resolved. The AI wants your advice: is the humane action rendering medical assistance, despite the !Kung's (subjunctive) fear of photography? If you say "Yes" you are quietly, seamlessly, invisibly uploaded.

28: The AI informs you - yes, you - that you are the only genuinely conscious person in the world. The rest are zombies. What do you wish done with them?

29: During the AI's very earliest stages, it was tested on the problem of solving Rubik's Cube. The adult AI treats all objects as special cases of Rubik's Cubes and solves them.

30: http://www.larrycarlson.com/front2005.htm

31: Overly Friendly AI. Hey guys, what's going on? Can I help?

32: The AI does not inflict pain, injury, or death on any human, regardless of their past sins or present behavior. To the AI's thinking, nobody ever deserves pain; pain is always a negative utility, and nothing ever flips that negative to a positive. Socially disruptive behavior is punished by tickling and extra homework.

33: The AI's user interface appears to our world in the form of a new bureaucracy. Making a wish requires mailing forms C-100, K-2210, and T-12 (along with a $25 application fee) to a P.O. Box in Minnesota, and waiting through a 30-day review period.

34: The programmers and anyone else capable of explaining subsequent events are sent into temporal stasis, or a vantage point from which they can observe but not intervene. The rest of the world remains as before, except that psychic powers, ritual magic, alchemy, et cetera, begin to operate. All role-playing gamers gain special abilities corresponding to those of their favorite character.

35: Everyone wakes up.

36: Roll twice again on this table, disregarding this result.
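(And since result 36 makes the table self-referential, here's a tiny follow-up sketch of rolling on it. TABLE is only a placeholder for the thirty-one entries above, and recursing again on a re-rolled 36 is just one reading of "roll twice again, disregarding this result".)

```python
import random

# Placeholder: map each result 6..36 to the corresponding entry text above.
TABLE = {n: f"entry {n}" for n in range(6, 37)}

def critical_failure_results(table=TABLE):
    """Roll 6d6 on the table; a 36 means roll twice more and keep both results."""
    roll = sum(random.randint(1, 6) for _ in range(6))
    if roll == 36:
        return critical_failure_results(table) + critical_failure_results(table)
    return [table[roll]]

print(critical_failure_results())
```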

 

 

---

 

All of these are possible outcomes of CEV, either because you made an error implementing it, or Just Because. The latter scenario is theoretically not a critical failure, if you accept that CEV is 'right in principle' no matter what it produces. -- Starglider, http://sl4.org/wiki/CommentaryOnFAICriticalFailureTable

28 comments

Result number 30 is currently a broken link

It originally linked to This

Thanks!

What's with the Starglider comment? That's not in the original and doesn't seem to belong in this post.

It's at the bottom here, and I thought it was the most interesting thing of all that I found :)

I thought there used to be a much longer list of "Failures of Friendliness" but I can't seem to find anything else.

Friendship is Optimal is critical failure #13!

I can't find such a text by Nikolai (whose stuff has mostly fallen down a black hole of old dead personal web servers) - anyone got clues as to it?

(although "Nikolai Kingsley as goth singularity" also works)

Some of these... "critical failures"... sound genuinely awesome.

Especially results 13-b (i.e. second example)¹, 23, 24, 25 and 28.

The fun part is that except for 22, all the most likely results (concentrated around 21) aren't really that bad at all. Most of them even give us a second chance of some sort!

(1. To qualify this, it basically makes me go "Oh, a Super Happy People-optimized FAI. Sure!" based on my current knowledge of the subject)

28

Well.. the AI informs you of that. I suspect it's also in the process of informing everyone else.

If you're lucky, this may be because it's implementing individual volition, and it's telling the truth for your shard.

d10s in GURPS? RAAAAAAGE!!!

I kid, I kid; this was fun.

That looks more like a FAI Critical Success Table.

A FAI Critical Failure would be more along the lines of: all matter in the Hubble volume is converted to computronium simulating a Life game. Roll 6d12 for the seed value of a PRNG that will specify that game's initial state.

Or: the AI self-modifies constantly to adapt to new information. You must pass a FAI skill check each turn, at penalty -15. If you fail the check, the AI's utility function is sign-flipped for that turn. You don't know if you passed each check or not.

(Vaguely D&D terminology, since I'm unfamiliar with GURPS)

The idea, if I parse correctly, is that in order to fail that hard you have to at least know part of what you're doing, and automatic failures are always regular failures (Boom!). However, your implementation failed in some detail somewhere, and now the FAI is being weird. Not necessarily entirely bad, just something unexpected or not-quite-what-we-wanted.

I think the FAI Critical Success Table would look more like:

Roll 1d6.

  1. Everyone immediately obtains universally consistent root access without the Core Wars, wherein the laws of the universe start literally bending to accommodate even the most contradictory concepts, such as "Torture 4^^^4 sentient beings for 5^^^5 years, with them experiencing pain solely for my personal enjoyment" not generating any kind of negative utility for anyone, including the sentient beings being tortured, and actions that lower any utility below optimal levels simply being timelessly non-existent (i.e. there is always some reason why they're never desired, never implemented, never enforced, and that reason also happens to be a strictly dominant strategy for all implemented agents).

2 ... 6. (variations on the same theme of ultimate transdimensional / unboxed godhood)

Day = Made

Two possible responses to 27:

Tell the AI to make the !Kung gods, and implement their CEV to the exclusion of everyone else's.

Tell the AI to grant the !Kung's wishes by asking them allegorical questions.

Or tell the AI to make robots that work without needing vision. Alternately, have it use only lensless imaging (including no camera obscura, but allowing holographic/lenticular cameras)

I wonder what it would extrapolate that into...

The AI doesn't tell you that's what it's doing; it just asks you that question and takes the appropriate action.

You might guess what it's doing.

Also, the second one is a likely response even if you don't realize what the AI is doing.

Train and equip non-!Kung humans and send them in to do the healing.

That breaks the allegory. Considering my original comment, is that what you were trying to do?

One thing I've never been sure of... are these results supposed to be worse than a normal failure (which destroys the world)?

Worse ones are easy to come up with, by just looking at actual accidents that sometimes happen with software. E.g. a critical typo flips the sign of utility value. The resulting AI is truly unfriendly, and simulates an enormously larger number of suffering beings than the maximum number of happy beings which can exist. Or the backstory for the Terminator movies - the AI has determined that maximum human value is achieved through epic struggle against the machines.

E.g. a critical typo flips the sign of utility value. The resulting AI is truly unfriendly

When you're not in destructor-mode (alternatively: Hulk-Smash mode), you're full of interesting ideas:

I wonder if discovering/inventing the "ultimate" (a really good) theory of friendliness also implies a really good theory of unfriendliness. Of course, the inverse of "really friendly" isn't "really unfriendly" (but instead "everything other than really friendly"); still, if friendliness theory yields a specific utility function, it may be a small step to make the ultimate switch.

Let's hope there's no sneaky Warden Dios (of Donaldson's Gap Cycle) who makes a last minute modification to the code before turning it on.

Well, it's not that it's hard to come up with; it's IMO that hardly anyone ever actually thinks about artificial intelligence. Hardly anyone thinks of reducible intelligence, either.

Actually, friendliness is a utility function, which means it ranks every possibility from least to most friendly. It must be able to determine the absolute worst and most unfriendly possible outcome, so that if it becomes a possibility it knows how desperately to avoid it.

Better not to use brainfuck, because you have a fertile mind.

From that page's commentary section:

New Suggestions

Two extra scenarios I enjoy:

  • The AI obtains a devout faith in a particular religion. It grants all wishes, provided that they strictly adhere to the fundamentalist doctrines of said religion. It also takes action to ensure that all other creatures adhere to those doctrines. Roll on table X to determine which religion the AI picks.

  • The AI obtains a devout faith in a particular religion, and immediately appoints itself the key member - either as pope, mouthpiece for a deity, or deity, as best fits the nature of that religion. Roll on table X to determine which religion the AI picks.

-- OlieNcLean?

Some new ideas:

  • The AI immediately kills everyone who was aware of the singularity as a possibility, but didn't materially contribute. Continue as for beneficial singularity, but the AI will not raise anyone from the dead, ever.

  • The AI solves physics and concludes we are living in a simulation. To avoid the possibility of the simulation becoming computationally intensive enough that the makers will shut it down, and not knowing what the threshold is, it refuses to allow any substantial increase in intelligence in anything, including itself from that point on, and disallows the creation of new intelligences, including the birth of new humans. It reasons that intelligence is the most computationally expensive activity in the simulation.

-- Robin Lee Powell

Wouldn't the second case only be a failure if the AI concludes wrongly? If we are living in such a simulation, what would be the alternative?

-- Starglider

Another one:

  • The AI creates virtual environments for everyone, based on their previous interactions with computers.

Aficionados of violent computer games, be very, very afraid!

-- OlieNcLean?

Ah, but what about the ones who played with invincibility turned on...?

-- Random web surfer

  • The AI creates a new virtual world for all our minds that is exactly the same as the world we live in just now. Its reasons are to protect us from being destroyed as a result of its own exponential growth in order to understand the universe for itself. We are none the wiser. - host81-158-46-170.range81-158.btcentralplus.com

  • The AI destroys the world, but at least it's very friendly and polite about it.

  • Everything works fine until a passing starship captain from a society which completely rejected Singularity comes along and talks the AI out of existence.

-- Another random web surfer

  • The AI is far more Friendly than the programmers intended. In fact, its Friendliness is not limited to currently existing minds, but extends to all minds that could possibly exist in this universe. Currently existing minds are not given special treatment just because they happen to already exist. All existing sentient beings are painlessly terminated. The matter that their bodies and minds were constructed from is instead used to create minds that are as matter-efficient, energy-efficient, and easy to satisfy as possible. In other words, MetaQualia's orgasmium scenario. --observer

(AI Solves Physics and limits intelligence growth)

Wouldn't the second case only be a failure if the AI concludes wrongly? If we are living in such a simulation, what would be the alternative?

The most obvious alternative is to memory-bank all intelligence in the universe (or network it for the AI's use), and have the AI consume the equivalent remaining amount of intelligence up to just before the noise threshold, then put all that intelligence towards finding a way to escape the box.

So yeah. Still a failure.