I found this paper while browsing Semantic Scholar - it is so far the only paper Semantic Scholar knows of that cites Discovering Agents, and it seems to me like a promising refinement. It's going to take me some time to read it properly, and I'd love to get others' takes on it as well.


Under what circumstances can a system be said to have beliefs and goals, and how do such agency-related features relate to its physical state? Recent work has proposed a notion of interpretation map, a function that maps the state of a system to a probability distribution representing its beliefs about an external world. Such a map is not completely arbitrary, as the beliefs it attributes to the system must evolve over time in a manner that is consistent with Bayes' theorem, and consequently the dynamics of a system constrain its possible interpretations. Here we build on this approach, proposing a notion of interpretation not just in terms of beliefs but in terms of goals and actions. To do this we make use of the existing theory of partially observable Markov decision processes (POMDPs): we say that a system can be interpreted as a solution to a POMDP if it not only admits an interpretation map describing its beliefs about the hidden state of a POMDP but also takes actions that are optimal according to its belief state. An agent is then a system together with an interpretation of this system as a POMDP solution. Although POMDPs are not the only possible formulation of what it means to have a goal, this nevertheless represents a step towards a more general formal definition of what it means for a system to be an agent.
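To make the abstract's Bayes-consistency condition concrete: the belief the interpretation map attributes to the system must evolve like a Bayes filter over the POMDP's hidden state. Here's a minimal sketch of that belief update (this is standard POMDP belief filtering, not code from the paper; all names are illustrative):

```python
def belief_update(belief, action, obs, T, O):
    """Bayes-filter update of a POMDP belief state.

    belief: dict mapping state -> probability
    T: dict mapping (state, action) -> {next_state: probability}  (transition model)
    O: dict mapping (next_state, action) -> {observation: probability}  (observation model)
    """
    reachable = {s2 for s in belief for s2 in T[(s, action)]}
    new = {}
    for s2 in reachable:
        # Predict: marginalize the transition model over the prior belief.
        pred = sum(belief[s] * T[(s, action)].get(s2, 0.0) for s in belief)
        # Correct: weight by the likelihood of the received observation.
        new[s2] = O[(s2, action)].get(obs, 0.0) * pred
    z = sum(new.values())
    return {s: p / z for s, p in new.items()}  # normalize

# Toy two-state example: a hidden object is 'left' or 'right';
# 'listen' doesn't move it, and the observation is correct 85% of the time.
T = {('left', 'listen'): {'left': 1.0}, ('right', 'listen'): {'right': 1.0}}
O = {('left', 'listen'): {'hear_left': 0.85, 'hear_right': 0.15},
     ('right', 'listen'): {'hear_left': 0.15, 'hear_right': 0.85}}
b = belief_update({'left': 0.5, 'right': 0.5}, 'listen', 'hear_left', T, O)
# b['left'] == 0.85
```

The paper's "interpretation" condition, as I read it, is then that the system's actual physical dynamics commute with this update: mapping the system's state to a belief and then Bayes-updating gives the same distribution as evolving the system and then mapping.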


Do you know if this approach makes any testable predictions? For example, are worms agents? What about a navigation app on your phone?

todo: figure out and reply.
