This post was written as part of research done at MATS 9.0 under the mentorship of Richard Ngo.
Summary
This post illustrates with examples how the qualitative concepts behind active inference, and its use of Markov blankets, can clarify agentic behaviour. If we see agents as predictors that minimise surprisal about their internal states by interacting with external states through intermediaries (such as the senses), then concepts such as goals, models, self-fulfilling goals and various types of cognitive biases are explained in satisfying (ish) ways.
More generally, these clarifying features of Markov blankets suggest to me that they are a helpful tool for developing ambitious, unifying theories for agentic behaviour. As I'll discuss at the end of the post, one important limitation of Markov blankets in describing agency also hints at interesting further research directions.
Active inference and Markov blankets
Active inference is a theory from neuroscience that posits agents as possessing strong priors favouring observations that are conducive to their survival. For example, humans have a strong prior that they will be adequately hydrated. Agents seek to minimise how surprised they are by their observations through a combination of updating their world model to fit their sensory feedback and acting on the world to manifest the observations they have strong priors on. For instance, humans regularly drink to maintain themselves in the expected state of hydration.
Mathematically, a Markov blanket for a set of random variables A is a set of random variables B such that A is independent of all other variables given B; other variables act on A only through the blanket. A qualitative example is given by the aphorism: "the future is independent of the past given the present".
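As a minimal sketch of that aphorism (my own illustration, not from the active inference literature), the toy simulation below builds a three-step chain in which the future depends only on the present, and checks numerically that conditioning on the present screens off the past. The variable names and transition probabilities are arbitrary assumptions.

```python
# Toy check of "the future is independent of the past given the present".
# Names and transition probabilities are arbitrary assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

past = rng.random(n) < 0.5                              # root variable
present = rng.random(n) < np.where(past, 0.8, 0.2)      # depends only on past
future = rng.random(n) < np.where(present, 0.7, 0.3)    # depends only on present

# Conditioning on the blanket (the present) screens off the past:
for b in (False, True):
    mask = present == b
    p_given_present = future[mask].mean()
    p_given_both = future[mask & past].mean()
    print(f"present={b}: P(future|present) ≈ {p_given_present:.3f}, "
          f"P(future|present, past) ≈ {p_given_both:.3f}")
# The two estimates agree up to sampling noise, so {present} is a Markov
# blanket separating the future from the past in this toy chain.
```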
Active inference uses Markov blankets to model the agent's information loop with the world. The agent only has access to the internal states. It interacts with the true, external states through sensory states, which shape its experience of its internal states, and through active states, by which it influences the world to fit its expectations. The agent cannot directly see the external, active and sensory states, but has a continuously updating model of them.
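To make the loop concrete, here is a toy sketch (my own illustration; the thresholds, probabilities and action effects are assumptions, not the formal active-inference equations): an agent holds a strong prior that it will observe "hydrated", receives sensory samples from a hidden external state, and uses an active state (drinking) to pull the world back towards its expectations whenever surprisal gets high, rather than updating the prior.

```python
import math
import random

random.seed(0)

PRIOR_HYDRATED = 0.95            # strong prior on observing "hydrated"
ACT_THRESHOLD = 1.0              # surprisal (nats) above which the agent acts

external_hydration = 0.9         # hidden external state in [0, 1]
for step in range(10):
    external_hydration -= 0.1                                  # the world drifts
    observed_hydrated = random.random() < external_hydration   # sensory state
    p_obs = PRIOR_HYDRATED if observed_hydrated else 1 - PRIOR_HYDRATED
    s = -math.log(p_obs)                                       # surprisal under the prior
    if s > ACT_THRESHOLD:
        external_hydration = min(1.0, external_hydration + 0.5)  # active state: drink
        action = "drink"
    else:
        action = "wait"
    print(f"step {step}: hydration={external_hydration:.2f}, "
          f"observed hydrated={observed_hydrated}, surprisal={s:.2f}, action={action}")
```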
The concrete math behind active inference seems to decrease the clarifying power of the framework even for those who have an adequate background in statistical physics, so we'll keep things fuzzy while we use the concepts to explore and explain various examples of agentic phenomena.
Promising examples
In the following section, I use "experience" synonymously with the internal states the agent has access to.
From the perspective of the agent, models are priors about experience given a set of sensory states and a potentially empty set of active states. Agents have a conception of what they will experience through their senses after an action, which could just be the trivial "waiting" action.
Goals are strong priors about experience that are conditional on a non-empty set of active states. Desires are encoded as models that are both dependent on actions, and are expected by the agent to occur.[1] In active inference, agents engage in self-evidencing,[2] impacting their active states to instantiate these expected observations.
A belief having a high prior is thus a necessary but not a sufficient condition for it to qualify as a goal. Goals and "models with high priors" are therefore fundamentally separated by the extent of the active states' involvement. This intuitively suggests the existence of a continuum between the archetype of a goal and the archetype of a strongly held belief.
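As a schematic sketch of this distinction (using the eating example from footnote [1]; the probabilities are illustrative assumptions), a model assigns priors to experiences conditional on actions, while a goal is a strong prior on an experience that the agent has to realise through its active states:

```python
import math

# A model: priors about experience, conditional on a (possibly empty) set of actions.
model = {
    "wait without eating": {"hungry": 0.9, "well fed": 0.1},
    "eat":                 {"hungry": 0.1, "well fed": 0.9},
}

# A goal: a strong prior on an experience that can only be made true through
# the active states (self-evidencing), not by waiting.
goal_prior = {"hungry": 0.05, "well fed": 0.95}

def expected_surprisal(action: str) -> float:
    """Expected surprisal of the goal prior under the experiences predicted for this action."""
    return sum(p * -math.log(goal_prior[exp]) for exp, p in model[action].items())

print(min(model, key=expected_surprisal))  # -> "eat"
```

In this picture, self-evidencing amounts to picking the action whose predicted experiences minimise expected surprisal under the goal prior, rather than revising the prior itself.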
Goal models are strong priors about experience that can be realised either through self-evidencing or through a change in the internal states. For instance, suppose I am a person who identifies as successful and would like to maintain that identity intact. I could choose to apply to a prestigious university, giving myself a chance of increasing the evidence for my successfulness. However, rejection could also decrease the evidence for this model that I'm emotionally invested in. Depending on how costly rejection may be to me, I could convince myself that the prestigious university's courses "aren't that interesting to me anyway", leading me to instead apply to a less prestigious university with lower admission standards.
In the above example, one could say that my abstract goal of being successful is vulnerable to reward-hacking: instead of optimising for success in the world, I optimise the internal evidence for my own success. I think many classic examples of cognitive biases can be explained in this way: behaviour that appears to irrationally pursue some external goal is actually rationally pursuing an internal representation of that goal. At least some irrationality is therefore downstream of imperfect or downright adversarial goal representations.
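A toy sketch of that gap (my own illustration; the scores and the 30% admission chance are made-up assumptions): compare what an outside observer would count as progress on the abstract goal with the internal evidence the agent actually optimises.

```python
# External goal vs the agent's internal representation of the goal.
# Scores and the 30% admission chance are illustrative assumptions.
def external_success(applied_to_prestigious: bool, admitted: bool) -> int:
    """Progress on the abstract goal, as an outside observer would score it."""
    return 1 if (applied_to_prestigious and admitted) else 0

def internal_evidence(applied_to_prestigious: bool, admitted: bool) -> int:
    """What the agent actually optimises: evidence consistent with 'I am successful'.
    A rejection is strong counter-evidence, so not applying can score higher."""
    if not applied_to_prestigious:
        return 1          # "those courses aren't that interesting to me anyway"
    return 2 if admitted else -2

p_admit = 0.3
value_of_applying = (p_admit * internal_evidence(True, True)
                     + (1 - p_admit) * internal_evidence(True, False))
value_of_not_applying = internal_evidence(False, False)
print(value_of_applying, value_of_not_applying)  # -0.8 vs 1: the agent never applies,
# even though external_success can only be nonzero by applying.
```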
A fundamental limitation of Markov blankets in describing agency
Not everything that is statistically screened off from the outside world by a boundary is sensibly described as an agent. For instance, a rock is a self-organised entity with reasonably clear boundaries. Moreover, these boundaries are generally more robust than those of living beings, lasting considerably longer.
I would say that rocks are in some sense too independent from their environments to be interesting. The archetypical agent has some kind of fuzzy boundary between itself and the environment, but is constantly sampling from the world and communicating information to it. This reciprocity and flexibility of boundaries is what makes agents such a beautiful mess. Humans are infinitely more interesting because we are constantly exchanging bits with the social structures we are embedded in. This behaviour results in emergent complexity that reframes humans as subagents interacting with larger agentic structures such as families, companies and countries.
You could define agents as entities that interface with the world through Markov blankets that allow information exchange within reasonable upper and lower bounds. The upper bound would be there to distinguish agents from maximally entropic noise, and the lower bound would serve to distinguish them from rocks. However, I think this undersells the interest of seeing agency as a fractal-like phenomenon that doesn't fit a clear, discrete separation between agents and their environments. I suspect that developing frameworks that serve this interest is worth someone's time.
[1] A goal is characterised by a high prior on an event X that is dependent on an action Y, not by a high prior on the implication "if I do Y, then X". For instance, I may have a high prior that if I don't eat for a while, I will get hungry; this is not a goal. A better example of a goal is "I will be well fed". This is an observation to which I assign a high prior, and which I must manifest by eating.
[2] A term from active inference.