Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.
This is a linkpost for https://arxiv.org/abs/2312.07547

[Submitted on 6 Dec 2023]

Active Inference and Intentional Behaviour

Karl J. Friston, Tommaso Salvatori, Takuya Isomura, Alexander Tschantz, Alex Kiefer, Tim Verbelen, Magnus Koudahl, Aswin Paul, Thomas Parr, Adeel Razi, Brett Kagan, Christopher L. Buckley, Maxwell J. D. Ramstead

Abstract:

Recent advances in theoretical biology suggest that basal cognition and sentient behaviour are emergent properties of in vitro cell cultures and neuronal networks, respectively. Such neuronal networks spontaneously learn structured behaviours in the absence of reward or reinforcement. In this paper, we characterise this kind of self-organisation through the lens of the free energy principle, i.e., as self-evidencing. We do this by first discussing the definitions of reactive and sentient behaviour in the setting of active inference, which describes the behaviour of agents that model the consequences of their actions. We then introduce a formal account of intentional behaviour, that describes agents as driven by a preferred endpoint or goal in latent state-spaces. We then investigate these forms of (reactive, sentient, and intentional) behaviour using simulations. First, we simulate the aforementioned in vitro experiments, in which neuronal cultures spontaneously learn to play Pong, by implementing nested, free energy minimising processes. The simulations are then used to deconstruct the ensuing predictive behaviour, leading to the distinction between merely reactive, sentient, and intentional behaviour, with the latter formalised in terms of inductive planning. This distinction is further studied using simple machine learning benchmarks (navigation in a grid world and the Tower of Hanoi problem), that show how quickly and efficiently adaptive behaviour emerges under an inductive form of active inference.

From the introduction:

Specifically, this paper differentiates between three kinds of behaviour: reactive, sentient, and intentional. The first two have formulations that have been extensively studied in the literature, under the frameworks of model-free reinforcement learning (RL) and active inference, respectively. In model-free RL, the system selects actions using either a lookup table (Q-learning), or a neural network (deep Q-learning). In standard active inference, the action selection depends on the expected free energy of policies (Equation 2), where the expectation is over observations in the future that become random variables. This means that preferred outcomes—that subtend expected cost and risk—are prior beliefs that constrain the implicit planning as inference [15–17]. Things that evince this kind of behaviour can hence be described as planning their actions, based upon a generative model of the consequences of those actions [15, 16, 18]. It was this sense in which the behaviour of the cell cultures was considered sentient.
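
For readers who don't have the paper open: the "expected free energy of policies" referenced as Equation 2 is standardly written, in discrete-state active inference (the paper's exact notation may differ), as

$$G(\pi) \;=\; \sum_{\tau} \mathbb{E}_{Q(o_\tau, s_\tau \mid \pi)}\!\Big[ \ln Q(s_\tau \mid \pi) \;-\; \ln Q(s_\tau \mid o_\tau, \pi) \;-\; \ln P(o_\tau) \Big],$$

i.e., the negative expected information gain about latent states (epistemic value) minus the expected log prior preference over outcomes (pragmatic value). Minimising $G(\pi)$ therefore favours policies that are expected both to resolve uncertainty and to realise preferred outcomes.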

This form of sentient behaviour—described in terms of Bayesian mechanics [19–21]—can be augmented with intended endpoints or goals. This leads to a novel kind of sentient behaviour that not only predicts the consequences of its actions, but is also able to select them to reach a goal state that may be many steps in the future. This kind of behaviour, that we call intentional behaviour, generally requires some form of backwards induction [22, 23] of the kind found in dynamic programming [24–27]: that is, starting from the intended goal state, and working backwards, inductively, to the current state of affairs, in order to plan moves to that goal state. Backwards induction was applied to the partially observable setting and explored in the context of active inference in [27]. In that work, dynamic programming was shown to be more efficient than traditional planning methods in active inference.
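
To make the backwards-induction idea concrete, here is a minimal sketch of my own (not taken from the paper; the grid world, state encoding, and function names are illustrative): given a known transition model and an intended goal state, work backwards from the goal and record, for every state, the minimum number of steps needed to reach it.

```python
from collections import deque

def backwards_induction(goal, transitions):
    """Work backwards from `goal` over a known transition model,
    returning the minimum number of steps from each state to the goal.
    `transitions` maps state -> set of successor states."""
    # Invert the transition model so we can step backwards in time.
    predecessors = {}
    for s, successors in transitions.items():
        for s_next in successors:
            predecessors.setdefault(s_next, set()).add(s)

    steps_to_goal = {goal: 0}
    frontier = deque([goal])
    while frontier:
        s = frontier.popleft()
        for p in predecessors.get(s, ()):
            if p not in steps_to_goal:  # first visit = fewest steps (unit-cost moves)
                steps_to_goal[p] = steps_to_goal[s] + 1
                frontier.append(p)
    return steps_to_goal

# Toy 3x3 grid world: states are (row, col); moves are up/down/left/right.
states = [(r, c) for r in range(3) for c in range(3)]
transitions = {
    (r, c): {(r + dr, c + dc)
             for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]
             if (r + dr, c + dc) in states}
    for (r, c) in states
}
print(backwards_induction(goal=(2, 2), transitions=transitions))
```

The same backward pass also identifies the states from which the goal cannot be reached at all, which is what the inductive constraint sketched after the next paragraph exploits.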

The focus of this work is to formally define a framework for intentional behaviour, where the agent minimises a constrained form of expected free energy—and to demonstrate this framework in silico. These constraints are defined on a subset of latent states that represent the intended goals of the agent, and propagated to the agent via a form of backward induction. As a result, states that do not allow the agent to make any ‘progress’ towards one of the intended goals are penalised, and so are actions that lead to such disfavoured states. This leads to a distinction between sentient and intentional behaviour, where intentional behaviour is equipped with inductive constraints.
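
Here is one way such an inductive constraint could look in code, again my own illustrative sketch rather than the paper's implementation (which propagates constraints over beliefs about latent states): given a steps-to-goal map like the one computed above, any action whose successor cannot reach an intended goal within the remaining planning horizon receives an infinite penalty, and the agent chooses among the surviving actions by whatever score it already uses (a placeholder `expected_free_energy` below).

```python
import math

def select_action(state, actions, step, horizon,
                  next_state, steps_to_goal, expected_free_energy):
    """Choose the action with the lowest expected free energy, subject to
    an inductive constraint: successors from which the goal is unreachable
    within the remaining horizon are ruled out a priori."""
    scores = {}
    for a in actions:
        s_next = next_state(state, a)
        remaining = horizon - (step + 1)
        feasible = steps_to_goal.get(s_next, math.inf) <= remaining
        # Infeasible successor -> infinite penalty; otherwise score as usual.
        scores[a] = expected_free_energy(s_next) if feasible else math.inf
    return min(scores, key=scores.get)
```

This is the sense in which, as the discussion section below puts it, paths inconsistent with intended states are "ruled out a priori" before any expected free energies are compared.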

In this treatment, the word inductive is used in several senses. First, to distinguish inductive planning from the abductive kind of inference that usually figures in applications of Bayesian mechanics; i.e., to distinguish between mere inference to the best explanation (abductive inference) and genuinely goal-directed inference (inductive planning) [28, 29]. Second, it is used with a nod to backwards induction in dynamic programming, where one starts from an intended endpoint and works backwards in time to the present, to decide what to do next [24, 25, 27, 30]. Under this naturalisation of behaviours, a thermostat would not exhibit sentient behaviour, but insects might (i.e., thermostats exhibit merely reactive behaviour). Similarly, insects would not exhibit intentional behaviour, but mammals might (i.e., insects exhibit merely sentient behaviour). The numerical analyses presented below suggest that in vitro neuronal cultures may exhibit sentient behaviour, but not intentional behaviour. Crucially, we show that neither sentient nor intentional behaviour can be explained by reinforcement learning. In the experimental sections of this work, we study and compare the performance of active inference agents with and without intended goal states. For ease of reference, we will call active inference agents without goal states abductive agents, and agents with intended goals inductive agents. [emphasis added]

From the discussion:

Inductive planning, as described here, can also be read as importing logical or symbolic (i.e. deductive) reasoning into a probabilistic (i.e., inductive, in the sense of inductive programming) framework. This speaks to symbolic approaches to problem solving and planning—e.g., [79–81]—and a move towards the network tensor computations found in quantum computing: e.g., [82, 83]. However, in so doing, one has to assume precise priors over state transitions and intended states. In other words, this kind of inductive planning is only apt when one has precisely stated goals and knowledge about state transitions. Is this a reasonable assumption for active inference? It could be argued that it is reasonable in the sense that: (i) goal-states or intended states are stipulatively precise (one cannot formulate an intention to act without specifying the intended outcome with a certain degree of precision) and (ii) the objective functions that underwrite self-evidencing lead to precise likelihood and transition mappings. In other words, to minimise expected free energy—via learning—just is to maximise the mutual information between latent states and their outcomes, and between successive latent states.
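
To unpack that last sentence (a standard information-theoretic identity, not a quotation from the paper): the ambiguity part of expected free energy is the expected conditional entropy of outcomes given latent states,

$$\mathbb{E}_{Q(s)}\big[ H[P(o \mid s)] \big] \;=\; H(O \mid S), \qquad I(S; O) \;=\; H(O) \;-\; H(O \mid S),$$

so, for a given marginal entropy over outcomes, learning likelihood mappings that shrink the ambiguity term amounts to increasing the mutual information between latent states and outcomes; the analogous argument for the transition mapping links successive latent states.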

To conclude, inductive planning differs from previous approaches proposed in both the reinforcement learning and active inference literature, due to the presence of intended goals defined in latent state space. In both model-free and model-based reinforcement learning, goals are defined via a reward function. In alternative but similar approaches, such as active inference, rewards are passed to the agent as privileged (usually precise but sparse) observations [41, 84]. This influences the behaviour of the agent, which learns to design and select policies that maximise expected future reward either via model-free approaches, which assign values to state-action pairs, or via model-based approaches, which select actions after simulating possible futures. Defining preferences directly in the state space, however, induces a different kind of behaviour: the fast and frugal computation involved in inductive planning is now apt to capture the efficiency of human-like decision-making, where indefinitely many possible paths, inconsistent with intended states, are ruled out a priori—hence combining the ability of agents to seek long-term goals, with the efficiency of short-term planning. [emphasis added]

Commentary

Although I added the "Agent Foundations" tag to this post, I currently think that there is nothing "foundational" about any of these agency frameworks (including the one presented above). They should all be treated as instrumental, i.e., evaluated from the engineering perspective of how amenable they are to baking in coordination and commitment mechanisms, imposing constraints, control, alignment, interpretability, etc.

This paper doesn't examine the approach of modelling goals as "states in the latent world model" from these perspectives, apart from noting that it supports "fast and frugal" planning, which is a capability rather than a safety characteristic. But clearly that wasn't the intended goal of the paper.

Several papers have argued for state-space planning and goal-setting architectures on such "safety" and "alignment" grounds: "Designing Ecosystems of Intelligence from First Principles" (Friston et al., 2022), LeCun's H-JEPA vision paper (2022), and part of @Rafael Kaufmann Nedal's commentary on another Friston paper ("Sketch 2: Post-Newtonian Science"). Though I have a lot of my own thoughts[1] about this (in the context of comparing collective mind architectures) and generally agree with the argument that state-space planning architectures are better for safety and alignment than alternatives (such as multi-agent RL), the papers I reference here are at best incomplete. No comprehensive, slam-dunk paper has been written on this topic yet.

Therefore, the authors' choice to label any RL modelling (even very advanced) as merely "reactive behaviour", while putting even relatively primitive active inference agents on the "sentient" pedestal, could rub many people the wrong way. I encourage such readers to look past the "reactive", "sentient", and "intentional" labels and to judge the frameworks on their object-level characteristics.

  1. ^

    A quote from one of my recent discussions with Vitaly Vanchurin about his WaNN framework that may help clarify what I'm pointing at:
     
    "Goals become quantifiable precisely when a predictive model of the world is used as a reference frame. If you have no model of the world, the proposition "Agent X has a goal to establish a colony on Mars" is not quantifiable. If you have scientific, economic, psychological, and other models, you can start to quantify the probability of success and the time and resources needed to accomplish this goal, i.e., to quantify it.

    I don't dispute that an agent that does "System 2" planning with a predictive model could be viewed as just an NN with a particular architecture. I.e., I don't dispute the universality of the NN framework (at least in this context). But I do question its sufficiency and particular usability for modelling cooperation, communication, and collective sense- and decision-making, particularly when these processes happen between "not so many" agents, from just two to dozens (as opposed to large numbers of agents, as in larger-scale political decision-making). These could be the contexts of economic relationships (i.e., two economic agents interacting, or three: two agents + platform/intermediary), a family, and relationships between communities and countries.

    It feels to me that in these contexts, talking just in terms of NN framework ontology ("loss function", "establishing connection", "network properties") hardly permits solving cooperation and communication problems that didn't seem solvable before.

    Economic agents, family members, diplomats and presidents have always been solving these problems in terms of goals, beliefs, intentions, benefits, and moral values (sometimes). This "folk art of negotiation and diplomacy" has been mostly rooted in predictive models (except, perhaps, cooperation from a moral standpoint, such as the Crusades).

    I would like to think that there is a demand for turning this "art of cooperation and diplomacy" into a science (more accurately, an engineering theory), but it seems to me that it couldn't be based just on the NN framework and the modelling of loss functions and data exchange between agents. A phenomenological model that includes predictive models has to be in the foundation of such an engineering theory.

    Active Inference might be somewhat too constrained as a phenomenological model for this. Its classical formulation doesn't account for the contextuality of inference and incoherent beliefs, which abound in the real-world inference of agents, and "excess Bayesian inference" might be too unwieldy.

    Thus, is H-JEPA a good middle ground? It doesn't have problems with the contextuality of inference, at the cost of some (theoretical?) hit to learning efficiency and credit assignment. Also, counterfactuals would not be as clean and nice as in Active Inference (if we need them for our engineering theory of cooperation, which we likely do).

    BTW, speaking about autonomous driving, H-JEPA is currently being scaled to tackle this problem: https://wayve.ai/thinking/scaling-gaia-1/."
