The Multi-Agent Minefield: Can LLMs Cooperate to Avoid Global Catastrophe?
ArXiv paper here. Most AI safety research asks a familiar question: Will a single model behave safely? But many of the risks we actually worry about, including arms races, coordination failures, and runaway competition, don't involve one single AI model acting alone. They emerge when multiple advanced AI...

Yes! In the current setup they don't communicate, good catch! We wanted to focus on studying this specific setting really thoroughly first. One especially interesting finding was that models were sometimes able to coordinate without any communication, at a rate well above chance, which points toward Schelling points and related ideas. Check out, for example, https://www.arxiv.org/abs/2601.22184, which we found very similar to this kind of implicit coordination.
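To make "above chance" concrete, here is a minimal sketch (not the paper's actual evaluation code) of how one might compare an observed coordination rate against a uniform-random baseline; the function names and the toy data are illustrative assumptions only:

```python
def coordination_rate(choices_per_round):
    """Fraction of rounds in which every agent picked the same option."""
    return sum(len(set(r)) == 1 for r in choices_per_round) / len(choices_per_round)

def chance_baseline(n_options, n_agents):
    """Probability that n_agents picking uniformly among n_options all agree."""
    return 1 / n_options ** (n_agents - 1)

# Hypothetical example: 2 agents choosing among 4 options over 5 rounds.
observed = [("A", "A"), ("B", "B"), ("A", "C"), ("D", "D"), ("A", "A")]
print(coordination_rate(observed))  # 0.8
print(chance_baseline(4, 2))        # 0.25
```

Any observed rate well above the baseline suggests the agents are converging on a shared focal point rather than agreeing by luck.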
Regarding communication: yes, it helps, good intuition! We already have some internal results showing this (though coordination is still imperfect even then), but the design space of communication protocols is huge, and we are trying to find a way to analyze that setting satisfactorily, too!
Hope this helps :).