Eliezer Yudkowsky wrote in 2016:

At an early Singularity Summit, Jürgen Schmidhuber, who did some of the pioneering work on self-modifying agents that preserve their own utility functions with his Gödel machine, also solved the friendly AI problem. Yes, he came up with the one true utility function that is all you need to program into AGIs!

(For God’s sake, don’t try doing this yourselves. Everyone does it. They all come up with different utility functions. It’s always horrible.)

His one true utility function was “increasing the compression of environmental data.”

Something like Goodhart's Law, I suppose. There are natural situations where X is associated with something good, but literally maximizing X is actually quite bad. (Having more gold would be nice. Converting the entire universe into atoms of gold, not necessarily so.)
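A toy sketch of that failure mode (my own illustration, not from the original talk): suppose true value peaks at a moderate amount of X, while the proxy objective just rewards "more X." Optimizing the proxy lands far from the value optimum, even though the two agree for small X.

```python
# Toy Goodhart's Law illustration (hypothetical numbers, not from the post):
# true value rises with x up to a point, then falls ("some gold is nice,
# a universe of gold is not"); the proxy keeps rewarding more x forever.
import numpy as np

xs = np.linspace(0, 10, 1001)

def true_value(x):
    # Peaks at x = 2, then declines.
    return x * np.exp(-x / 2)

def proxy(x):
    # The proxy only sees "more x is better": monotonically increasing.
    return x

x_true_opt = xs[np.argmax(true_value(xs))]
x_proxy_opt = xs[np.argmax(proxy(xs))]

print(f"x maximizing true value: {x_true_opt:.2f}")    # ~2.0
print(f"x maximizing the proxy:  {x_proxy_opt:.2f}")   # 10.0 (the boundary; unbounded in principle)
print(f"true value at proxy optimum: {true_value(x_proxy_opt):.3f} "
      f"vs. at true optimum: {true_value(x_true_opt):.3f}")
```

The divergence only shows up at the extreme: for small x the proxy and the true value move together, which is exactly what makes the proxy tempting.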

EY has practiced the skill of trying to see things like a machine. When people talk about "maximizing X", they usually mean "trying to increase X in a way that proves my point"; i.e. they use motivated thinking.

Whatever X you pick, the prior is close to 100% that literally maximizing X would be horrible. That includes the usual applause lights, whether they appeal to normies or nerds.

Thomas Kwa's Shortform

by Thomas Kwa, 22nd Mar 2020
