Would it be progress if one could figure out how to construct an embedded system that has a complete model of a highly compressible world, such that the system can correctly generate a plan that, when executed, would put the world into a particular target state (further simplifying assumptions follow)?

Correct planning means that the system does not, say, drop an anvil on its own head as part of its plan, and that it can generate plans which include any beneficial self-modifications, by being able to "reason over itself".

I am imagining a system that is given, as its goal, a target state of the world to be reached. The system generates a plan that, when executed, would reach the target. This plan is generated using a breadth-first tree search.

I am making the following additional assumptions:

  • The world and the agent are both highly compressible. This means we can have a representation of the entire environment (including the agent) inside the agent, for some environments. We only concern ourselves with environments where this is the case.
  • To make tree search possible:
    • The environment is discrete.
    • You know the physics of the environment perfectly.
    • You know the current state of the world perfectly.
    • You can compute everything that takes finite compute and memory instantly. (This implies some sense of Cartesianness, as I am sort of imagining the system running faster than the world: it can do an entire tree search in one "clock tick" of the environment.)
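
As a concrete illustration of the kind of planner these assumptions permit, here is a minimal sketch in Python. It assumes world states are hashable values, `actions` enumerates the discrete actions, and `step(state, action)` implements the perfectly known physics; all of these names are hypothetical illustrations rather than an existing API.

```python
from collections import deque

def plan_to(initial_state, target_state, actions, step):
    """Breadth-first search for a plan reaching target_state.

    initial_state : hashable snapshot of the current world (agent included)
    target_state  : hashable snapshot of the goal world state
    actions       : iterable of the discrete actions available
    step          : step(state, action) -> next_state, the known physics

    Returns a shortest list of actions reaching target_state, or None if
    the target is unreachable. BFS expands states in order of plan length,
    so the first plan found is a shortest one.
    """
    frontier = deque([(initial_state, [])])
    visited = {initial_state}
    while frontier:
        state, plan = frontier.popleft()
        if state == target_state:
            return plan
        for action in actions:
            nxt = step(state, action)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, plan + [action]))
    return None
```

Because breadth-first search expands states in order of plan length, the first plan it returns is a shortest one, which is what the later claim about finding the shortest plan relies on.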

With these assumptions, does the initial paragraph seem trivial to achieve, or would it be considered progress?

My intuition is that this would still need to solve the problem of giving an agent a correct representation of itself, in the sense that it can "plan over itself" arbitrarily. This can be thought of as enabling the agent to reason over the entire environment, which includes itself. Is that part a solved problem?

It also seems like you can think about a lot of memory optimizations in this setting. For example, you can store only one world model and mutate it to do the tree search, saving only the deltas at each node. The system could then do a significant tree search with a total memory of roughly 2x the amount required to represent the world model, assuming deltas are generally much smaller than the world model.
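
Here is a rough sketch of that optimization, under the assumption that each action's effect on the world model can be applied in place and exactly undone. It swaps strict breadth-first search for iterative deepening, since undo-based deltas fit a depth-first traversal naturally, while still returning a shortest plan; `apply_action` and `undo` are hypothetical names, not an existing API.

```python
def iddfs_plan(world, is_target, actions, apply_action, undo, max_depth):
    """Iterative-deepening search using a single mutable world model.

    apply_action(world, a) mutates `world` in place and returns a delta;
    undo(world, delta) exactly reverses that mutation. At any point, memory
    holds roughly one world model plus the deltas along the current branch.
    """

    def dfs(limit):
        if is_target(world):
            return []
        if limit == 0:
            return None
        for a in actions:
            delta = apply_action(world, a)
            rest = dfs(limit - 1)
            undo(world, delta)  # restore the shared world model
            if rest is not None:
                return [a] + rest
        return None

    # Deepen the limit one step at a time: the first limit at which a plan
    # exists is the length of a shortest plan, so the returned plan is shortest.
    for limit in range(max_depth + 1):
        plan = dfs(limit)
        if plan is not None:
            return plan
    return None
```

The design choice here is to pay extra compute (re-expanding shallow nodes at each depth limit) in exchange for memory close to a single world model, which is in the spirit of the 2x bound above.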

It seems like, once you have solved these things, you would get a working embedded system that is as smart as possible, in the sense that it would find the shortest plan resulting in the target world state.

This topic came up while working on a project in which I try to find a minimal set of assumptions under which I know how to construct an aligned system. Once I know how to construct an aligned system under that set of assumptions, I then attempt to remove an assumption and adjust the system so that it is still aligned. I am currently trying to remove the Cartesian assumption.

2 comments

This topic came up while working on a project in which I try to find a minimal set of assumptions under which I know how to construct an aligned system. Once I know how to construct an aligned system under that set of assumptions, I then attempt to remove an assumption and adjust the system so that it is still aligned. I am currently trying to remove the Cartesian assumption.

I would encourage you to consider looking at Reflective Oracles next, to describe a computationally unbounded agent which is capable of thinking about worlds that are as computationally unbounded as itself; a next logical step after that would be to look at logical induction or infra-Bayesianism, to think about agents which are smaller than what they reason about.

You can compute everything that takes finite compute and memory instantly. (This implies some sense of Cartesianness, as I am sort of imagining the system running faster than the world: it can do an entire tree search in one "clock tick" of the environment.)

This part makes me quite skeptical that the described result would constitute embedded agency at all. It's possible that you are describing a direction which would yield some kind of intellectual progress if pursued in the right way, but you are not describing a set of constraints such that I'd say a thing in this direction would definitely be progress.

My intuition is that this would still need to solve the problem of giving an agent a correct representation of itself, in the sense that it can "plan over itself" arbitrarily. This can be thought of as enabling the agent to reason over the entire environment, which includes itself. Is that part a solved problem?

This part seems inconsistent with the previous quoted paragraph; if the agent is able to reason about the world only because it can run faster than the world, then it sounds like it'll have trouble reasoning about itself. 

Reflective Oracles solve the problem of describing an agent with infinite computational resources which can do planning involving itself and other similar agents, including under uncertainty (via reflective-oracle Solomonoff induction), which sounds superior to the sort of direction you propose. However, they do not run "faster than the world", as they can reason about worlds which include things like themselves.