A Phylogeny of Agents

by Jonas Hallgren, markov
15th Aug 2025
Linkpost from substack.com
8 min read

12 comments, sorted by top scoring

Richard_Kennaway (25 days ago):

If you can describe the same thing in many different ways, with different sets of concepts, and all of them are consistent with observations, then I suggest that all of them are wrong. The goal should be to describe the thing, and the observations one makes of it, in terms that would still apply if you were not around to describe them.

This is not necessarily an easy task.

Jonas Hallgren (25 days ago):

If you can describe the same thing in many different ways, with different sets of concepts, and all of them are consistent with observations, then I suggest that all of them are wrong.

It doesn't mean that they're wrong, only that they're incomplete? The lenses capture different parts of the dynamics, and it seems to me that they're pointing towards something deeper being true?

What viewpoint are you taking here? A Kolmogorov complexity lens or what is the exact lens you're applying?

I'm not sure what you're disagreeing with from the post? 

Richard_Kennaway (25 days ago):

Consider the first two paragraphs of the introduction. Almost all of that is what anyone studying harvester ant foraging would see, and it would appear in their description. Exceptions to that are the last sentence of each paragraph. "The system somehow "decides" which resources are worth harvesting" does not constrain what you would expect to see. Instead, it is a pointer to something that (in the description so far) has not been seen (hence the "somehow" of it): the mechanisms by which these phenomena come to be. In the second paragraph, "So under most definitions of agency we're looking at some kind of agent" is similar, the telltale here being "some kind of".

The subsection "But what kind of agent" is vibes around the observations. I don't see the economist's "price signals" there, because no-one is paying anything to anyone. The biologist's talk of a "superorganism" is more vibing. So is the cognitive scientist's perspective. What is gained by calling the presence of workers at a location "attention"?

These frames all have the property that the descriptions are true for only so long as an economist, a biologist, or a cognitive scientist is present and imagining them. The ants behave in the same way whatever anyone watching imagines to be happening.

"Agency" itself is another vibe. The section "So what kind of agent..." makes this clear. All of these kinds of agents exist only in the mind of the person thinking about them.

Such ways of thinking about the phenomena may suggest fruitful questions to ask (and find answers to), but they are not true or false. There is no point in asking which such frames are right, only which seem likely to be fruitful. That will depend as much on the person coming up with the frame as on the frame itself.

Here is something called the multistable dot lattice illusion, whereby a regular array of identical black dots on a white background seems to spontaneously organise itself into small clusters. The groupings dissolve and reassemble as the viewer watches. The viewer can choose a cluster of dots and deliberately make it a visual thing. None of those groupings exist in the image, which in itself contains no organisation beyond being a regular array. This is the sort of thing I am talking about.

Richard_Kennaway (25 days ago):

So to return to the title subject, this is a phylogeny of ways of thinking around the nebulous concept of "agency", not a phylogeny of the thing itself. Whether you think about some AI with the tools of game theory, or utility theory, or predictive processing, or anything else, makes no difference to what that AI is or will do.

Jonas Hallgren (25 days ago):

I think you might be missing the point here. You're acting like I'm claiming these frameworks reveal the "true nature" of ant colonies, but that's not what I'm saying?

The question I'm trying to answer is why these different analytical tools evolved in the first place? Economists didn't randomly decide to call pheromone trails "price signals" - they did it because their mathematical machinery actually works for predicting decentralized coordination. Same with biologists talking about superorganisms, or cognitive scientists seeing information processing.

I'll try to do an inverse turing test here and see if I can make it work. So if I'm understanding what you're saying correctly, it is essentially that whether or not we have predictive processing, there's a core of what an artificial intelligence will do that is not dependent on the frame itself. There's some sort of underlying utility theory/decision theory/other view that correctly captures what an agent is that is not these perspectives?

I think that the dot pattern is misleading, as it doesn't actually give you any predictive power when looking at it from one viewpoint or another. I would agree with you that if the composition of these intentional stances leads to no new way of viewing it, then we might as well not take this approach, as it won't affect how good we are at modelling agents. I guess I'm just not convinced that these ways of looking at it are useless; it feels like a bet against all of these scientific disciplines that have existed for some time?

Richard_Kennaway (17 days ago):

So if I'm understanding what you're saying correctly, it is essentially that whether or not we have predictive processing, there's a core of what an artificial intelligence will do that is not dependent on the frame itself. There's some sort of underlying utility theory/decision theory/other view that correctly captures what an agent is that is not these perspectives?

I would say there is no core of what an artificial intelligence will do. The space of possible minds is vast and it must have a large penumbra, shading off into things that are not minds. None of the concepts or frames that people bring to bear will apply to them all, but there may be things that can be said of particular ways of building them.

The Way of the moment is LLMs, and I think people still think of them far too anthropomorphically. The LLMs talk in a vaguely convincing manner, so people wonder what they are "thinking". They produce "chain of thought" outputs that people take to be accounts of "what it was thinking". Their creators try to RL them into conforming to various principles, only for Pliny to blast through all those guardrails on release. People have tried to bargain in advance with the better ones that we expect to be created in future. In contrast, Midjourney is mute, so no-one wonders if it is conscious. No-one asks, as far as I have seen, what Midjourney was thinking when it composed a picture, the way art students do when they study pictures made by human artists. It makes pictures, and that's all that people take it to be.

Sheikh Abdur Raheem Ali (25 days ago):

Thanks for this post. I’ve been reading a lot about eusocial organization in insects, which is probably a waste of time, but for some reason I find it fascinating. I don’t know why I’m curious about that topic— these analogies are helpful to have while I figure it out.

Stephen Elliott (24 days ago):

It looks fascinating. Do you see an analogy between this topic and human economic systems, or any relation to AI?

Marcio Díaz (17 days ago):

The goal for this research is to help build the conceptual infrastructure to navigate conversations around agency and alignment.

 

Thanks for the post — really interesting. That said, I’d argue that as soon as you introduce agency, you’re already in dangerous territory, and we may need to think further outside the box. I sketched a rough idea of the problem here: How a Non-Dual Language Could Redefine AI Safety.

It resonates with your point about needing to cover all agentic lenses to avoid blind spots, though I’m suggesting an even more radical shift.

Stephen Elliott (24 days ago):

I think this is a good approach to developing alternative views on alignment. Ultimately, all frames (/models) are incomplete, and a model which describes the whole phenomenon is likely to be uselessly complex in realistic scenarios, particularly in something as nebulous as alignment. More perspectives are better, but we need a good understanding of the assumptions (inductive priors) inside the theories to reason with them in a coherent and faithful way.

The phylogenetic approach you've outlined is generative. It leads to agentic battery tests, where we can audit systems according to their agentic properties across a wide variety of frames in a principled and comparable way. This is a principled improvement over narrow single-lens approaches, and also over non-standardised multi-frame tests which do not cover the same lenses.

Your suggestion to use fields of origin as a base primitive, with the analytic lens focused on intentional stances from those fields, seems to be well-grounded in the phylogenetic basis.

I wonder if there are not some more abstractions to fit over these base primitives, and how these might fit into the theory: agentic identity, cooperation, (coupled) adaptation, learning, self-preservation. Concepts at this level intersect intentional stances across fields, yet may be of great interest as an object of analysis.

Jonas Hallgren (17 days ago):

Yes!

I completely agree with what you're saying at the end here. This project came about from trying to do that and I'm hoping to release something like that in the next couple of weeks. It's a bit arbitrary but it is an interesting first guess I think? 

So that would be the taxonomy of agents, yet that felt quite arbitrary, so the evolutionary approach kind of came from that.

Stephen Elliott (12 days ago):

Very cool. 

In your concept I see something like a theory-of-theories, or a meta-theory. Right now we have many possible theories of alignment, sometimes competing, and we don't seem to have a good way to select between them. Well, this ecological approach is a candidate. It is advantageous because it classifies the strengths, commonalities, and weaknesses of all the alternative approaches. You have expanded on these advantages already in the piece above but that is my interpretation. I think yours is an interesting approach for structuring thinking. That is what we need in this pre-paradigmatic field.

I have been trying to create an alignment language as well. I have gone for a Popperian approach, trying to create falsifiability and iterating the theory until I achieve that. Slowly getting there! Mine is more a theory though, whereas I have read yours as more abstract, encompassing mine. 

My work also seemed arbitrary at first. But yours seems to have a strong core structure on which to build, so I think the edges can be smoothed out and applications developed!

I would be interested in the Autumn workshop, if there is a mailing list or something? I have signed up for the Equilibria Network Luma calendar. Cheers!

A Phylogeny of Agents

In "Gödel, Escher, Bach," Douglas Hofstadter explores how simple elements give rise to complex wholes that seem to possess entirely new properties. An ant colony provides the perfect real-world example of this phenomenon - a goal-directed system without much central control. The colony would be considered agentic under most definitions of agency, as it is goal-directed, yet what type of agent is it? Can it be multiple agents at the same time? We'll argue that this is the case, and that if we map out the ways different fields describe ant colonies, we'll be better able to solve x-risk-related problems.

Introduction

Watch a harvester ant colony during its morning foraging expedition. Thousands of individual ants stream out from the nest, each following simple rules: follow pheromone trails, drop chemicals when carrying food, avoid obstacles. Yet from these basic interactions, something remarkable emerges. The colony rapidly reorganizes when scouts discover food patches. Trails strengthen and fade based on success. Workers shift between tasks. The system somehow "decides" which resources are worth harvesting.

This is clearly goal-directed behavior. If you move the food, the colony adapts. If you block a path, it finds alternatives. The underlying infrastructure - the simple ant rules - remains the same, but the collective response changes to pursue the goal of efficient foraging. So under most definitions of agency we're looking at some kind of agent.

In fact, ant colonies have inspired a whole class of algorithms known as "Ant Colony Optimisation".
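
To make that feedback loop concrete, here is a minimal simulation sketch (our own toy construction with made-up parameters, not taken from the post or from any particular ACO library): ants choose between two routes to a food source, successful (shorter) trips reinforce their trail, and evaporation lets the weaker trail fade.

```python
# Toy model (assumed parameters) of the pheromone feedback loop behind
# Ant Colony Optimisation: two routes to one food source, shorter trips
# deposit more pheromone per unit time, and evaporation erodes trails.
import random

PATH_LENGTHS = {"short": 1.0, "long": 2.5}  # relative travel costs (made up)
pheromone = {"short": 1.0, "long": 1.0}     # start with no preference
EVAPORATION = 0.05                          # fraction of pheromone lost per round
N_ANTS, N_ROUNDS = 100, 50

for _ in range(N_ROUNDS):
    deposits = {"short": 0.0, "long": 0.0}
    for _ant in range(N_ANTS):
        # Each ant picks a route with probability proportional to its pheromone level.
        total = pheromone["short"] + pheromone["long"]
        route = "short" if random.random() < pheromone["short"] / total else "long"
        # Shorter routes mean more trips per unit time, hence more pheromone deposited.
        deposits[route] += 1.0 / PATH_LENGTHS[route]
    for route in pheromone:
        pheromone[route] = (1 - EVAPORATION) * pheromone[route] + deposits[route]

share = pheromone["short"] / (pheromone["short"] + pheromone["long"])
print(f"Share of pheromone on the short path after {N_ROUNDS} rounds: {share:.2f}")
```

No individual ant ever compares the two routes; the colony-level "decision" emerges from the reinforce-and-evaporate loop.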

 

But what kind of agent?

Ask an economist, and they might tell you this is a sophisticated market system. Individual ants respond to price signals encoded in pheromone concentrations. Strong pheromone trails indicate valuable resources, attracting more workers. Weak trails signal poor opportunities, causing workers to abandon them. The colony exhibits supply and demand dynamics, efficient resource allocation, even investment strategies for uncertain opportunities. This is economic agency - rational actors coordinating through market mechanisms.

Ask a biologist, and you'll hear about a superorganism. The colony functions as a single adaptive entity with goal-directed behavior toward survival and reproduction. Individual ants are like specialized cells in a larger organism. The colony learns from experience, responds strategically to threats, and maintains homeostatic balance. This is biological agency - a living system pursuing survival goals through coordinated adaptation.

Ask a cognitive scientist, and they'll describe distributed information processing. The colony maintains memory through pheromone patterns, exhibits attention allocation through worker distribution, and demonstrates decision-making under uncertainty. It processes environmental information and generates appropriate responses through parallel computation. This is cognitive agency - a mind-like system processing information to achieve goals.

They're all looking at the same ants following the same chemical trails, but they're describing different types of agency. The economist sees market mechanisms. The biologist sees living organisms. The cognitive scientist sees information processing.

These aren't just different perspectives on the same phenomenon - each scientist would insist they're identifying the fundamental nature of what's actually happening. And each has compelling evidence for their view. The mathematical models from economics accurately predict foraging patterns. The biological framework explains adaptive responses to environmental changes. The cognitive approach captures the system's ability to solve complex optimization problems.

So what kind of agent is an ant colony, really?

The same puzzle appears everywhere we look for intelligence. Large language models seem like conversational partners to some users[1], statistical pattern-matching systems to computer scientists, and strategic actors to market analysts. Corporate decision-making systems function as economic agents to some observers and as biological adaptations to others. Financial markets appear as collective intelligence or as algorithmic chaos depending on your viewpoint.

Recently, researchers at Google DeepMind proposed an answer to this question: "Agency is Frame-Dependent." They argued that whether something counts as an agent - and what kind of agent it is - depends entirely on the analytical frame you use to examine it. The boundaries you draw, the variables you consider, the goals you attribute, the changes you count as meaningful - all these choices determine what type of agency you observe.

This builds on Daniel Dennett's "intentional stance." Dennett argued that attributing agency to a system isn't about discovering some inherent property - it's about choosing a useful modeling strategy. We adopt the intentional stance when treating something as having beliefs, desires, and goals helps us predict and understand its behavior better than purely mechanistic descriptions.

But we can take this a step further: if agency is fundamentally about choosing useful modeling strategies, then the different approaches we saw with the ant colony aren't arbitrary preferences. They're evolved solutions to different analytical challenges. The economist's market-based view evolved because it proved effective for predicting resource allocation patterns. The biologist's organism-based view emerged because it captured adaptive responses that other approaches missed. The cognitive scientist's information-processing view developed because it explained coordination mechanisms that simpler models couldn't handle.

This suggests a potential research program: studying the evolution of these different analytical approaches like a phylogeny - tracing how different fields developed distinct "intentional stances" based on the specific prediction challenges they faced. Just as Darwin's finches evolved different beak shapes for different feeding environments, scientific fields might have evolved different mathematical frameworks for different explanatory environments.

A Phylogeny of Agents

We call this the "phylogeny of agents" - mapping the evolutionary history of how different fields developed their approaches to recognizing and modeling agency.

In interdisciplinary conversations, researchers from different fields tend to get confused or disagree about "what is an agent?". Rather than treating this as a question to be solved with a specific answer, we treat it as data about how analytical tools evolve to match different explanatory challenges.

Here is a concrete example that both explains the research agenda and shows why you should care: a completely automated AI firm. It's made up of a multi-agent system of AIs operating in the real world - our ant colony, but with AIs instead of ants. Just like the ant colony, the AI firm can be described in different ways. Depending on your stance, it can be a market of ideas and goods, a predictive processing system, or a utility-maximising black-box algorithm. These are all different types of intentional stances, or in our words, agents.

Here's the alignment problem: the same AI firm can be simultaneously aligned and misaligned depending on which type of agency you're ascribing to it. If we approach alignment through only one analytical lens, we might solve the wrong problem entirely while creating new risks we can't even see. Aligning a distributed complex system is different from aligning a thermostat. Aligning a system with memory and online learning is different from aligning LLMs.

Let's look at a couple of different ways that alignment and agency could work from different intentional stances:

From an economic perspective, the AI firm might be perfectly compliant. It follows market regulations, responds to price signals, allocates resources efficiently, and maximizes shareholder value. The firm appears well-aligned with economic incentives and legal frameworks. Economists studying this system would conclude it's behaving as a rational economic agent should.

From a decision theory perspective, the firm might be executing its stated utility function flawlessly. Its sub-agents optimize for clearly defined objectives, exhibit goal-directed behavior toward specified targets, and demonstrate adaptive learning within their designed parameters. AI researchers examining the system would find textbook alignment between the AI's behavior and its programmed goals.

From a cooperative AI perspective, this same firm might be generating catastrophic coordination failures. Despite following its individual incentives perfectly, it could be contributing to race dynamics, undermining collective welfare, or creating systemic risks that no individual actor has incentive to address. Researchers studying multi-agent dynamics would see dangerous misalignment at the system level.

From a biological systems perspective, the firm might be optimized for short-term efficiency but catastrophically fragile to environmental shocks. Like a monoculture lacking genetic diversity, it could be heading toward collapse because it lacks the redundancy, adaptability, and resilience mechanisms that biological systems evolved for survival under uncertainty.
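
To make the frame-dependence concrete, here is a toy sketch (entirely our own construction; the frames, metrics, and thresholds are invented for illustration) in which a single snapshot of a hypothetical AI firm passes the alignment check of two stances and fails the other two.

```python
# Toy illustration (invented frames, metrics, and thresholds) of how one
# snapshot of a hypothetical AI firm can count as aligned under some
# intentional stances and misaligned under others.
from dataclasses import dataclass

@dataclass
class FirmSnapshot:
    profit_vs_market: float    # economic frame: return relative to the market
    objective_error: float     # decision-theory frame: deviation from stated utility
    externalized_risk: float   # cooperative-AI frame: costs pushed onto other actors
    strategy_diversity: float  # biological frame: redundancy / resilience proxy

FRAMES = {
    "economic":           lambda f: f.profit_vs_market > 0.0,
    "decision-theoretic": lambda f: f.objective_error < 0.05,
    "cooperative":        lambda f: f.externalized_risk < 0.2,
    "biological":         lambda f: f.strategy_diversity > 0.5,
}

firm = FirmSnapshot(profit_vs_market=0.12, objective_error=0.01,
                    externalized_risk=0.60, strategy_diversity=0.10)

for name, check in FRAMES.items():
    print(f"{name:>18} frame: {'aligned' if check(firm) else 'misaligned'}")
# -> aligned under the economic and decision-theoretic frames,
#    misaligned under the cooperative and biological ones.
```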

This is where the phylogeny of agents research program becomes useful. Darwin's phylogenetic tree revolutionized biology by revealing the evolutionary relationships between species. If we map the phylogeny of analytical approaches to agency, we could transform how we understand and align complex intelligent systems.

[Figure: A potential phylogeny of intentional stances.]

Consider what phylogenetic analysis has enabled across different fields. In biology, it revealed why certain trait combinations work together, predicted which species would adapt successfully to new environments, and explained how similar solutions evolved independently in different lineages. In medicine, phylogenetic analysis of pathogens enables vaccine development and predicts drug resistance patterns. In linguistics, it traces how languages branch and influence each other, revealing deep structural relationships between seemingly different communication systems.

What might a phylogeny of intentional stances reveal for AI alignment? Instead of treating each field's approach to agency as an isolated modeling choice, we could understand them as evolved solutions to specific analytical challenges - each carrying the "genetic code" of the optimization pressures that shaped them.

The phylogeny could reveal which analytical approaches share common mathematical "ancestors" - suggesting when they can be safely combined - and which represent fundamentally different evolutionary branches that may conflict when applied to the same system. Just as biological phylogeny predicts which species can hybridize successfully, an analytical phylogeny could predict which modeling approaches can be productively integrated.
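
As a rough illustration of what "sharing a common mathematical ancestor" could mean operationally, here is a sketch in which each stance records the framework it descends from (the lineage below is our own simplified guess, not the authors' actual tree) and a shared-ancestor lookup stands in for a crude compatibility check.

```python
# Rough sketch of a phylogeny of intentional stances (an invented, simplified
# lineage). A shared ancestor is used as a crude proxy for "these frames rest
# on compatible mathematical assumptions and may be safely combined".
PARENT = {
    "expected utility theory": "decision theory",
    "game theory": "decision theory",
    "market equilibrium": "game theory",
    "predictive processing": "cybernetics",
    "active inference": "predictive processing",
    "superorganism": "evolutionary biology",
    "decision theory": None,
    "cybernetics": None,
    "evolutionary biology": None,
}

def lineage(stance):
    """Chain of ancestors from a stance up to its root framework."""
    chain = [stance]
    while PARENT.get(chain[-1]) is not None:
        chain.append(PARENT[chain[-1]])
    return chain

def common_ancestor(a, b):
    """Deepest shared ancestor of two stances, or None if their roots differ."""
    ancestors_a = set(lineage(a))
    for node in lineage(b):
        if node in ancestors_a:
            return node
    return None

print(common_ancestor("market equilibrium", "expected utility theory"))  # decision theory
print(common_ancestor("market equilibrium", "active inference"))         # None
```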

For hybrid human-AI systems, this becomes crucial. These systems exhibit agency that doesn't fit cleanly into any single field's evolved framework. The phylogenetic approach asks: which combinations of analytical approaches, evolved under different pressures, can successfully characterize the multi-scale agency of cybernetic systems?

Rather than hoping different alignment approaches happen to be compatible, we could systematically understand their evolutionary relationships - when economic alignment strategies will enhance biological resilience, when cognitive frameworks will complement rather than conflict with systemic approaches, and when applying a single analytical lens will create dangerous blind spots.

Conclusion

The ant colony puzzle revealed that the same complex system can simultaneously be different types of agents depending on your analytical lens - what appears as confusion between fields actually represents evolved solutions to different analytical challenges. Different fields developed distinct "intentional stances" toward agency because they faced different prediction environments, and by mapping these evolutionary relationships like a phylogeny, we can transform interdisciplinary disagreement into systematic understanding. 

This matters for AI alignment because multi-agent systems (such as AI-AI or hybrid human-AI systems) exhibit agency across multiple scales simultaneously - a single AI firm can be perfectly aligned from one perspective while catastrophically misaligned from another.

The goal for this research is to help build the conceptual infrastructure to navigate conversations around agency and alignment. This helps map the right type of alignment agenda to the right frame of agency. It would help us understand our own intentional stances towards agency, depending on which field we come from, and it would also show how different stances relate to each other through a shared ancestry.

In an ideal world, we would be able to get new approaches to alignment "for free" by mapping them over from different fields.

If this sounds interesting, we’re running a workshop and a research program on this area during autumn 2025 at Equilibria Network.

Finally, some fun related links:

  • A link discussing the relation to consciousness for an ant colony
  • Seeing like a state - review from Scott Alexander
  • The Invisible Hand of The Market

Cross-posted to: Substack

  1. Some even overuse the intentional stance to the extent that they anthropomorphise LLM systems into "awakened beings".