Wake up babe, new decision theory just dropped!
Furthermore, MUPI provides a new formalism that captures some of the core intuitions of functional
decision theory (FDT) without resorting to its most problematic element: logical counterfactuals. FDT
advises an agent to choose the action that would yield the best outcome if its decision-making function
were to produce that output, thereby accounting for all instances of its own algorithm in the world.
This enables FDT to coordinate and cooperate well with copies of itself. However, FDT must reason about what would have happened if its deterministic algorithm had produced a different output, a notion of logical counterfactuals that is not yet mathematically well-defined. MUPI achieves a similar outcome
through a different mechanism: it treats universes, itself included, as programs, while maintaining epistemic uncertainty about which universe it is inhabiting, including which policy it is itself running. As explained in Remark 3.14, from the agent's internal perspective, it acts as if its
choice of action decides which universe it inhabits, including which policy it is running. When it
contemplates taking action $a$, it updates its beliefs by conditioning on $a$, effectively concentrating probability mass on universes compatible with taking action $a$. Because the agent's beliefs about its own policy
are coupled with its beliefs about the environment through structural similarities, this process allows the agent to reason about how its choice of action relates to the behavior of other agents that share those similarities. This "as if" decision-making process allows MUPI to manifest the sophisticated,
similarity-aware behavior FDT aims for, but on the solid foundation of Bayesian inference rather than on yet-to-be-formalized logical counterfactuals.
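To make the "as if" mechanism concrete, here is a minimal toy sketch in a twin prisoner's dilemma. This is my own illustration, not the paper's construction: the universe set, the prior that couples the two players' policies, and the payoffs are all made up for the example.

```python
# Toy sketch of MUPI-style "as if" decision-making in a twin prisoner's
# dilemma. The agent holds a prior over candidate "universes", each of which
# fixes the policies of BOTH players (the twin runs a structurally similar
# algorithm). Contemplating action a means conditioning on universes in which
# the agent's own policy outputs a; no logical counterfactual is needed.

# Universes: (my_action, twin_action, prior probability). Most mass sits on
# universes where the coupled programs agree; the small remainder represents
# residual uncertainty about whether the twin really is structurally similar.
universes = [
    ("C", "C", 0.45),  # both run a cooperating program
    ("D", "D", 0.45),  # both run a defecting program
    ("C", "D", 0.05),  # structurally dissimilar universes
    ("D", "C", 0.05),
]

payoff = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 4, ("D", "D"): 1}

def as_if_value(action):
    """Expected payoff after conditioning the prior on the agent's own
    policy producing `action` (an ordinary Bayesian update)."""
    compatible = [(m, t, p) for m, t, p in universes if m == action]
    z = sum(p for _, _, p in compatible)
    if z == 0:
        return float("-inf")  # action has zero prior probability
    return sum(p / z * payoff[(m, t)] for m, t, p in compatible)

best = max(["C", "D"], key=as_if_value)
print({a: round(as_if_value(a), 3) for a in ["C", "D"]}, "->", best)
```

Because beliefs about the twin are coupled to beliefs about the agent's own policy, conditioning on "C" concentrates mass on cooperative universes ("C" scores 2.7 against 1.3 for "D"), so the agent cooperates, as FDT would, without any logical counterfactual.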
Yes, it seems to be closer to UDT, but… updateful. So not that close to UDT after all. Really, it's "just" a mathematically rigorous, embedded EDT.
Abstract for those who want to see it without clicking on the link:
The standard theory of model-free reinforcement learning assumes that the environment dynamics are stationary and that agents are decoupled from their environment, such that policies are treated as being separate from the world they inhabit. This leads to theoretical challenges in the multi-agent setting where the non-stationarity induced by the learning of other agents demands prospective learning based on prediction models. To accurately model other agents, an agent must account for the fact that those other agents are, in turn, forming beliefs about it to predict its future behavior, motivating agents to model themselves as part of the environment. Here, building upon foundational work on universal artificial intelligence (AIXI), we introduce a mathematical framework for prospective learning and embedded agency centered on self-prediction, where Bayesian RL agents predict both future perceptual inputs and their own actions, and must therefore resolve epistemic uncertainty about themselves as part of the universe they inhabit. We show that in multi-agent settings, self-prediction enables agents to reason about others running similar algorithms, leading to new game-theoretic solution concepts and novel forms of cooperation unattainable by classical decoupled agents. Moreover, we extend the theory of AIXI, and study universally intelligent embedded agents which start from a Solomonoff prior. We show that these idealized agents can form consistent mutual predictions and achieve infinite-order theory of mind, potentially setting a gold standard for embedded multi-agent learning.
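To illustrate the self-prediction idea from the abstract, here is a minimal toy sketch (illustrative assumptions only, not the paper's RUI construction): an agent that is uncertain which policy it is itself running updates that belief by observing its own actions, exactly as it would for an external process.

```python
# Toy sketch of self-prediction: a Bayesian agent maintains hypotheses about
# which policy it is itself running and resolves that epistemic uncertainty
# by observing its own behavior. Hypotheses and history are made up.

# Candidate self-models: each maps a time step to the action it would take.
hypotheses = {
    "always_a": lambda t: "a",
    "always_b": lambda t: "b",
    "alternate": lambda t: "a" if t % 2 == 0 else "b",
}
posterior = {name: 1 / 3 for name in hypotheses}  # uniform prior over selves

def predict_own_action(t):
    """Predictive distribution over the agent's OWN next action."""
    probs = {"a": 0.0, "b": 0.0}
    for name, policy in hypotheses.items():
        probs[policy(t)] += posterior[name]
    return probs

def observe_own_action(t, action):
    """Bayes update on observed own behavior (hypotheses are deterministic,
    so incompatible self-models are simply eliminated)."""
    for name, policy in hypotheses.items():
        if policy(t) != action:
            posterior[name] = 0.0
    z = sum(posterior.values())
    for name in posterior:
        posterior[name] /= z

# The agent acts, then learns about itself from its own action stream:
for t, action in enumerate(["a", "b", "a"]):
    print(t, predict_own_action(t))
    observe_own_action(t, action)
print("posterior over selves:", posterior)  # mass concentrates on "alternate"
```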
I’ve invited the authors to present at the AIXI research meetings (uaiasi.com). It will probably take two presentations. I will advertise here (and other places you will see) once we have dates.
Tentatively 2 pm ET on Monday December 15th at the usual zoom link: https://uwaterloo.zoom.us/j/7921763961?pwd=TDatET6CBu47o4TxyNn9ccL2Ia8HN4.1
Check the calendar for any updates: https://uaiasi.com
It will be one 90-120 minute presentation.
This seems like it's building on or inspired by work you've done? Or was this team interested in embeddedness and reflective oracles for other reasons?
It's ridiculously long (which is great, I'll read through it when I get a chance), do you have any pointers to sections that you think have particularly valuable insights?
I believe they were mainly inspired by Demski and Garrabrant, but we were in contact for the last few months and I’m glad to see that some of my recent work was applicable. We arrived at the idea of using a joint distribution with a grain of truth independently, and they introduce a novel “RUI” construction, but also study (what I’ve been calling) AEDT wrt rOSI in section 5.2. The differences are pretty technical, IMO the RUI approach is halfway between rOSI and logical induction.
It’s so long that even I’m still reading it, and I got a copy early. Assuming you’re familiar with Solomonoff induction / AIXI / embedded agency (which it sounds like you are) the core of it is section 3 and section 5 (particularly 5.1-5.3 I think). The appendix is like 100 pages and so far doesn’t seem essential unless you want to extend the results (also some of it will be familiar if you read my GOT paper).
Author here. We were heavily inspired by multiple things, including Demski and Garrabrant, the 1990s work of Kalai and Lehrer, empirical work in our group inspired by neuroscience pointing towards systems that predict their own actions, and the earlier work on reflective oracles by Leike. We were not aware of @Cole Wyeth et al.'s excellent 2025 paper, which puts the reflective oracle work on firmer theoretical footing, as our work was (largely but not entirely) done before that paper appeared.
Hey Jeremy! Our team's interest is mainly in multi-agent learning, self-modeling, and theory of mind. As properly formalizing a coherent theory of these topics turned out to be quite difficult, we dived deeper and deeper and ultimately arrived at the AIXI and reflective oracle frameworks, which provided a good set of tools as starting points for addressing these questions more formally. The resulting 'monster paper' is a write-up of the past year of work we did on these topics. Due to our interest in multi-agent learning, a good chunk of the paper is on the game-theoretic behavior of such 'embedded Bayesian agents' (Section 4). As Cole mentioned, we independently arrived at some results similar to Cole's (as we came from a bit outside of the LessWrong community), and we are very excited to now start collaborating more closely with Cole on the next questions enabled by both of our theories!
A team at Google has substantially advanced the theory of embedded agency with a grain of truth (GOT), including new developments on reflective oracles and an interesting alternative construction (the "Reflective Universal Inductor" or RUI).
(I was not involved in this work)
Abstract: