Abstract
Despite much progress in training AI systems to imitate human language, building agents that use language to communicate intentionally with humans in interactive environments remains a major challenge. We introduce Cicero, the first AI agent to achieve human-level performance in Diplomacy, a strategy game involving both cooperation and competition that emphasizes natural language negotiation and tactical coordination between seven players. Cicero integrates a language model with planning and reinforcement learning algorithms by inferring players' beliefs and intentions from its conversations and generating dialogue in pursuit of its plans. Across 40 games of an anonymous online Diplomacy league, Cicero achieved more than double the average score of the human players and ranked in the top 10% of participants who played more than one game.
Meta Fundamental AI Research Diplomacy Team (FAIR)†, Anton Bakhtin, Noam Brown, Emily Dinan, Gabriele Farina, Colin Flaherty, Daniel Fried, et al. 2022. “Human-Level Play in the Game of Diplomacy by Combining Language Models with Strategic Reasoning.” Science, November, eade9097. https://doi.org/10.1126/science.ade9097.
I'm an author on the paper. This is an interesting topic that I think we approached in roughly the right way. For context, some of my teammates and I did earlier research on AI for poker, so that concern for exploitability certainly carried over to our work on Diplomacy.
The setting the human plays in the video (one human vs. 6 known Cicero agents) is not the setting we intended the agent to play in, and it is not the setting in which we evaluated the agent. That video is simply a demonstration to give a sense of how the bot plays. If you want to evaluate the bot's exploitability and game theory, it should be done in the setting we intended for evaluation.
The setting we intended the bot to play in is games where all players are anonymous, and there is a large pool of possible players. That means players don't necessarily know which player is a bot, or whether there is a bot in that specific game at all. In that case, it's reasonable for the human players to assume all other players might engage in retaliatory behavior, so the agent gets the benefit of a tit-for-tat reputation without having to actually demonstrate it.
The assumption that players are anonymous is explicitly accounted for in the algorithm. It's the reason we assume a common-knowledge distribution over our lambda parameters for piKL, even though we actually play according to a single low lambda. If you were to change that assumption, perhaps by having all players know at the start of the game that a specific player is a bot, then you should change the common-knowledge distribution over lambda parameters to reflect that the bot will play according to the lambda it actually intends to use. In that case the agent will behave differently. Specifically, it will play a much more mixed, less exploitable policy.
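To make the role of lambda concrete, here is a rough sketch (not the paper's implementation) of a piKL-style policy: it regularizes a best response toward a human-imitation anchor policy, and the closed-form solution weights each action's anchor probability by exp(Q(a)/lambda). The function and variable names here are illustrative, not from the Cicero codebase.

```python
import numpy as np

def pikl_policy(q_values, anchor_probs, lam):
    """piKL-style regularized policy (illustrative sketch).

    Maximizes expected value minus lam * KL(pi || anchor), whose
    closed-form solution is pi(a) proportional to anchor(a) * exp(Q(a) / lam).
    Low lam -> close to a best response; high lam -> close to the
    anchor (human-imitation) policy.
    """
    logits = np.log(anchor_probs) + np.asarray(q_values, dtype=float) / lam
    logits -= logits.max()  # subtract max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Toy example: the anchor (imitation) policy slightly prefers action 0,
# but action 1 has higher estimated value.
q = [0.0, 1.0]
anchor = np.array([0.6, 0.4])

print(pikl_policy(q, anchor, lam=0.1))   # low lambda: mass shifts to action 1
print(pikl_policy(q, anchor, lam=10.0))  # high lambda: stays near the anchor
```

With a low lambda the agent nearly best-responds to its value estimates, while with a high lambda it stays close to human-like play; the point in the paragraph above is that which lambda opponents should *expect* depends on whether they know they are facing the bot.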