AI safety as featherless bipeds *with broad flat nails*

by Stuart_Armstrong
19th Aug 2020
AI Alignment Forum

Comment by Capybasilisk:

For example, take the idea that an AI should maximise “complexity”. This comes, I believe, from the fact that, in our current world, the categories “is complex” and “is valuable to humans” match up a lot.

The Arbital entry on Unforeseen Maximums elaborates on this:

Juergen Schmidhuber of IDSIA, during the 2009 Singularity Summit, gave a talk proposing that the best and most moral utility function for an AI was the gain in compression of sensory data over time. Schmidhuber gave examples of valuable behaviors he thought this would motivate, like doing science and understanding the universe, or the construction of art and highly aesthetic objects.

Yudkowsky in Q&A suggested that this utility function would instead motivate the construction of external objects that would internally generate random cryptographic secrets, encrypt highly regular streams of 1s and 0s, and then reveal the cryptographic secrets to the AI.

There's a famous story about Diogenes and Plato:

[...] when Plato gave the tongue-in-cheek definition of man as "featherless bipeds," Diogenes plucked a chicken and brought it into Plato's Academy, saying, "Behold! I've brought you a man," and so the Academy added "with broad flat nails" to the definition.

What Plato was (allegedly) doing was not providing a definition of man, but what I'd call a sufficient reference or a sufficient pointer. If I'm in ancient Athens and divide the obvious objects that I can see or think of into "featherless bipeds" and "not featherless bipeds", then "man" will match up with the first category.

Then Diogenes, acting like an AI, created something that fell within the sufficient pointer class but that was clearly not a man. The Academy then amended the pointer to add "with broad flat nails", patching it till it was sufficient again. Had there been a powerful AI around, or a god, or a meddling human with enough means and persistence, then they could have produced a "featherless-biped-with-broad-flat-nails" that was also not a human, making the pointer inadequate again.
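The patch-and-break cycle can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the `Thing` class, its attributes, the `is_sufficient` check and the "automaton" object are all hypothetical names invented here, and the `is_human` field stands in for a ground truth that the pointer itself never gets to read.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Thing:
    name: str
    is_human: bool           # ground truth; the pointers below never read it
    legs: int
    feathered: bool
    broad_flat_nails: bool


def pointer_v1(t: Thing) -> bool:
    """Plato's pointer: featherless biped."""
    return t.legs == 2 and not t.feathered


def pointer_v2(t: Thing) -> bool:
    """The Academy's patch: featherless biped with broad flat nails."""
    return pointer_v1(t) and t.broad_flat_nails


def is_sufficient(pointer: Callable[[Thing], bool], world: List[Thing]) -> bool:
    """A pointer is sufficient on a world if it agrees with the truth on every object."""
    return all(pointer(t) == t.is_human for t in world)


# The "obvious objects" of ancient Athens (a hand-picked toy sample).
athens = [
    Thing("Plato", True, 2, False, True),
    Thing("Diogenes", True, 2, False, True),
    Thing("chicken", False, 2, True, False),
    Thing("dog", False, 4, False, False),
]

# Diogenes enriches the world with a new object.
plucked_chicken = Thing("plucked chicken", False, 2, False, False)
# A hypothetical further adversarial object that defeats the patch as well.
automaton = Thing("automaton with broad flat nails", False, 2, False, True)

print(is_sufficient(pointer_v1, athens))                                 # True
print(is_sufficient(pointer_v1, athens + [plucked_chicken]))             # False
print(is_sufficient(pointer_v2, athens + [plucked_chicken]))             # True
print(is_sufficient(pointer_v2, athens + [plucked_chicken, automaton]))  # False
```

Sufficiency here is always relative to a particular set of objects: each pointer is fine until someone adds a new object to the world.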

A lot of suggestions on AI safety are sufficient pointers. For example, take the idea that an AI should maximise "complexity". This comes, I believe, from the fact that, in our current world, the categories "is complex" and "is valuable to humans" match up a lot. It's a sufficient pointer. But along comes a Diogenes/AI with complexity as a goal, and now it enriches the set of objects in the world with complex-but-worthless things, breaking the "definition".
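As a rough sketch of how a complexity maximiser can break the pointer, the example below uses compressed size as a crude stand-in for "complexity". That proxy, the toy objects, and the hand-assigned value scores are all assumptions made for illustration; nothing here is a proposed measure of either complexity or value.

```python
import random
import zlib


def complexity(data: bytes) -> int:
    # Crude stand-in for "complexity": size in bytes after zlib compression.
    return len(zlib.compress(data))


prose = (
    b"There's a famous story about Diogenes and Plato: when Plato defined man "
    b"as a featherless biped, Diogenes plucked a chicken and brought it into "
    b"the Academy, so the Academy added 'with broad flat nails'."
)
n = len(prose)

# Each item is (name, content, value_to_humans). The value scores are
# hand-assigned assumptions, not measurements.
current_world = [
    ("blank page", b" " * n, 0),
    ("repetitive chant", (b"la " * n)[:n], 1),
    ("prose", prose, 10),
]

# The Diogenes/AI move: add an object that scores highly on the proxy
# while being worthless. Seeded so the example is reproducible.
rng = random.Random(0)
noise = bytes(rng.getrandbits(8) for _ in range(n))
extended_world = current_world + [("random noise", noise, 0)]


def most_complex(world):
    # Pick the item whose content has the largest compressed size.
    return max(world, key=lambda item: complexity(item[1]))


best_now = most_complex(current_world)
best_later = most_complex(extended_world)
print(best_now[0], "value:", best_now[2])
print(best_later[0], "value:", best_later[2])
```

In the hand-picked current world the most complex item is also the most valuable one. Once incompressible noise is added, the proxy picks the noise instead.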

Therefore, a lot of things that people say they value, or want AIs to preserve/maximise, should not be taken as saying that they value the specific thing they say. Instead, this should be taken as a pointer to what they value in the current world, and the challenge is then to extend that to new maps and new territories.