TLDR: There may be differences in the alignment behaviour of multi-agent AI systems when compared to single-agent AI systems. If there are, we should know and react to this sooner rather than later, as it is likely that we will see more and more AI agents working together in the future.
Link to the Repository: https://github.com/CameronMoreira/agenty-python
Intro
Most alignment work has focused on single agents acting alone. Yet at least some real-world applications are likely to rely on teams of agents working together in the future. Multi-agent systems can offer advantages, just like working in a team does for humans, but they also introduce new challenges. When agents interact, unexpected dynamics can emerge: coordination breakdowns, conflicting... (read 1356 more words →)
We used Claude Sonnet 4 for the agents and narration, and Claude 3.5 Sonnet for most of the evaluation.
We haven't made any specific plans yet on how to measure alignment; our first goal was to check if there were observable differences at all, before making those differences properly measurable.