# 2

Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

I've maintained a consistent focus on the approximable when it comes to idealized theories of intelligence. I don't care much how long it takes to converge, but if there \emph{is} no convergent approximation, I figure all bets are off. This is somewhat nonstandard.

There are many levels at which we could focus attention. The real target for algorithms is, of course, practical utility. This has several proxies, including polynomial-time computability and polynomial-time approximability (of various kinds).

Going beyond this, we could focus on computable distriutions (IE, distributions which can be approximated to any desired epsilon by a terminating algorithm); distributions which can be approximated from below, from above, or both; and my typical choice, that which has a convergent approximation at all.

It is also perfectly reasonable in some cases to go to even broader levels. For example, we might have approximable agents which are indexed by ordinals, such that higher ordinals are strictly better in some sense, but we cannot take the limit to give a single approximable super-agent better than all in the set. Rather than computations that converge to a theoretical ideal, we are now forced to choose some sufficiently high ordinal and approximate that. This kind of agent is effectively one Turing jump higher than mere approximability.

It is also possible to consider agents which encompass the whole arithmetic hierarchy, or the analytic hierarchy, or agents which have perfect knowledge of set theory, and so on.

Each of these levels gives rise to a significantly different landscape of possibilities. Our hope with any level we choose is that we can derive useful insights to apply at the practical level. However, the phenomena at each level are significantly different. A path may look promising when examined at one level and impossible when examined at another.

The example of this which I have in mind is the reflective oracle result, which intuitively tells us that it is possible for a rational Bayesian agent to understand environments which include equally-powerful agents, so long as we define a special self-reference call into the agent and allow the agents to take on mixed strategies in a way that makes consistent self-reference possible.

The method, however, is not approximable. I don't know exactly what happens when we try to put it into the realm of approximability, but we might instead find that no such thing is possible. This might lead us to have a very different feeling of what is possible at the practical level.

The level of approximability seems somewhat special to me in that anything less seems to give up quite a bit, and anything more seems difficult to connect with reality. I don't want to make my message here too strong, though; it seems the important thing is to be aware of the different computational levels we can work at and their connection to reality.