x

LESSWRONG
LW

pebbles — LessWrong

pebbles

pebbles

Message

1

4y

pebbles

pebbles has not written any posts yet.

Message

1 comment

Member for 4 years

Replying toSelf-Reference Breaks the Orthogonality Thesis

Self-Reference Breaks the Orthogonality Thesis

If you want to keep the search function from wireheading the world model then you have to code "don't break the world model" into your value function. This is a general contradiction to the Orthogonality Thesis. A sufficiently powerful world-optimizing artificial intelligence must have a value function that preserves the integrity of its world model, because otherwise it'll just wirehead itself, instead of optimizing the world.

If the value function says ~"maximise the number of paperclips, as counted by my paperclip-counting-machinery", a weak AI might achieve this by making paperclips, but a stronger AI might trick the paperclip-counting-machinery into counting arbitrarily many paperclips, rather than actually making any paperclips.

However, this isn't a failure... (read more)

1

0