I think this paper will be of interest. It's a formal definition of universal intelligence/optimization power. Essentially you ask how well the agent does on average in an environment specified by a random program, where all rewards are specified by the environment program and observed by the agent. Unfortunately it's uncomputable and requires a prior over environments.

How to measure optimisation power

by Stuart_Armstrong 1 min read25th May 201230 comments

8


As every school child knows, an advanced AI can be seen as an optimisation process - something that hits a very narrow target in the space of possibilities. The Less Wrong wiki entry proposes some measure of optimisation power:

One way to think mathematically about optimization, like evidence, is in information-theoretic bits. We take the base-two logarithm of the reciprocal of the probability of the result. A one-in-a-million solution (a solution so good relative to your preference ordering that it would take a million random tries to find something that good or better) can be said to have log2(1,000,000) = 19.9 bits of optimization.

This doesn't seem a fully rigorous definition - what exactly is meant by a million random tries? Also, it measures how hard it would be to come up with that solution, but not how good that solution is. An AI that comes up with a solution that is ten thousand bits more complicated to find, but that is only a tiny bit better than the human solution, is not one to fear.

Other potential measurements could be taking any of the metrics I suggested in the reduced impact post, but used in reverse: to measure large deviations from the status quo, not small ones.

Anyway, before I reinvent the coloured wheel, I just wanted to check whether there was a fully defined agreed upon measure of optimisation power.