alec_tschantz — LessWrong

Great post; a few short comments:

Closing the action loop of active inference

There is a sense in which this loop is already closed - the sensory interface for an LLM is a discrete space of size context window x vocabulary that it observes and acts upon. The environment is whatever else writes to this space, e.g., a human interlocutor. This description contains the necessary variables and dependencies to get an action-perception loop off the ground. One caveat is that action-perception loops usually have actions that influence the environment to generate desirable observations, whereas LLMs directly influence their observation space. However, there are counter-examples, such as LLMs generating questions that cause the environment (a user) to generate the desired observations.

Fixed priors/desires

In active inference, the agent's wants/desires are usually expressed in terms of its stationary distribution over observations (equated with its generative world model). A typical example might be the desire to have "blood temperature at 37 degrees," which would be interpreted as assigning a high probability to observing blood temperature at 37 degrees.

You could argue that LLMs already have this attribute by parametrizing a distribution over likely sequences. In active inference terminology, when an LLM observes "The cat sat on..." it wants to observe "the mat" and acts on the world to make this happen.

A small example to help illustrate points 1 and 2: imagine an LLM trained to generate sequences describing the history of human tool use. The LLM assigns a probability distribution over sequences (its desires) and acts to manifest these. Suppose some external process (the environment) periodically inserts random low-probability tokens. The LLM will observe these and will act to course correct back to higher probability regions of sequence space (the action-perception loop).

If the external process is predictable, the LLM will move to parts of the state space that best account for the effects of the environment and its model of the most likely sequences (loosely analogous to a Bayesian posterior). For example, if the external process is generating tokens related to bronze - the LLM will describe tool use in the bronze age.

It's also worth highlighting the differences between a system that outputs probabilities and a system whose internal states parameterize a probability distribution. Most active inference models fall into this latter category, while it's not obvious that LLMs do. However, some arguments might suggest they can be implicitly interpreted this way.

No convincing evidence for gradient descent in activation space

alec_tschantz3y30

Interesting, iterative attention mechanisms had always reminded me of predictive coding, where cross-attention encodes a kind of prediction error between the latent and data. But I could also see how self-attention could be read as a type of prediction error between tokens and ${1, . . ., n + 1}$ .

There is some work comparing residual connections and iterative inference that may be of relevance; they show that such architectures "naturally encourage features to move along the negative gradient of loss during the feedforward phase", I expect some of these insights could be applied to the residual stream in transformers.

Why Simulator AIs want to be Active Inference AIs

alec_tschantz3y*Ω373

LESSWRONG
LW

LESSWRONG
LW

Posts

Wikitag Contributions

Comments