Human-level play in the game of Diplomacy by combining language models with strategic reasoning – CICERO

December 12, 2022

On November 22, 2022, the Meta AI research group announced CICERO, an AI program that can play the board game Diplomacy at a human level.

CICERO is a step forward in getting AI systems to use human language. It combines natural language processing (NLP) with strategic reasoning to negotiate and cooperate with human players.

Diplomacy, the game

Diplomacy is a board game that simulates the politics and warfare of early 20th-century Europe. It was created by Allan B. Calhamer in 1954. The game is played by two to seven players, each of whom represents one of the major European powers of the time, such as England, France, or Russia. The objective of the game is to gain control of a majority of the supply centers on the map (18 of the 34 in the standard game).

The game is known for its complex strategy, which requires careful planning, negotiation and coordination. Players must negotiate to form alliances and conquer territory. They must also be prepared for betrayal and backstabbing as other players try to outmaneuver them.

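CICERO handled these challenges well. According to Meta AI, across 40 games in an online league on webDiplomacy.net it achieved more than double the average score of the human players and ranked in the top 10% of participants who played more than one game.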

The Model

According to the research paper released by Meta AI in Science, the main architecture of CICERO is based on two modules: Controllable Dialogue and Strategic Reasoning.

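The Controllable Dialogue module generates candidate messages with a pre-trained language model (LM), conditioned on the dialogue so far and on the intents chosen by the Strategic Reasoning module. The candidates then go through a filtering process that rejects low-quality messages, and the message CICERO sends is one that survives this filtering.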
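The generate-then-filter idea can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: `toy_generator` stands in for the language model, and the two filters (`not_empty`, `no_repeated_words`) are simple placeholders, not the filters CICERO actually uses.

```python
from typing import Callable, List, Optional

# Hypothetical types: a generator stands in for the language model, and each
# filter returns True when a candidate message is acceptable.
Generator = Callable[[str, int], List[str]]
Filter = Callable[[str], bool]


def toy_generator(dialogue: str, n: int) -> List[str]:
    """Stand-in for a language model: returns up to n canned candidates."""
    candidates = [
        "",
        "I will support your move into Burgundy.",
        "asdf asdf asdf",
        "Shall we agree on a truce in the Channel?",
    ]
    return candidates[:n]


def not_empty(message: str) -> bool:
    """Reject blank messages."""
    return bool(message.strip())


def no_repeated_words(message: str) -> bool:
    """Reject messages that repeat a word (a crude low-quality check)."""
    words = message.lower().split()
    return len(set(words)) == len(words)


def choose_message(
    dialogue: str,
    generate: Generator,
    filters: List[Filter],
    n: int = 4,
) -> Optional[str]:
    """Return the first candidate that passes every filter, or None."""
    for candidate in generate(dialogue, n):
        if all(f(candidate) for f in filters):
            return candidate
    return None


message = choose_message(
    "FRANCE: Hello, England.", toy_generator, [not_empty, no_repeated_words]
)
print(message)
```

If no candidate survives, the sketch returns `None`, which a caller could treat as "send nothing this turn".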

The Strategic Reasoning module selects the intents and future actions of the AI. It runs a planning algorithm that predicts the policies of all other players, based on the state of the board and the dialogue CICERO has had with them, and uses these predictions to help CICERO choose its own action. CICERO then picks the action it expects to perform best against those predicted policies.
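As a rough illustration of choosing an action against predicted policies, the following Python sketch computes a simple best response: it takes a predicted probability distribution over one opponent's actions and picks the action with the highest expected payoff. The action names, probabilities and payoff values are all invented for the example, and the planner described in the paper is considerably more elaborate than this.

```python
from typing import Dict

# Hypothetical predicted policy for one opponent: probability of each action.
predicted_policy: Dict[str, float] = {"attack": 0.3, "hold": 0.7}

# Hypothetical payoff table: payoff[my_action][their_action] is our value.
payoff: Dict[str, Dict[str, float]] = {
    "defend": {"attack": 1.0, "hold": 0.0},
    "advance": {"attack": -1.0, "hold": 2.0},
}


def expected_value(
    my_action: str,
    policy: Dict[str, float],
    table: Dict[str, Dict[str, float]],
) -> float:
    """Average our payoff over the opponent's predicted action probabilities."""
    return sum(prob * table[my_action][their] for their, prob in policy.items())


def best_action(
    policy: Dict[str, float],
    table: Dict[str, Dict[str, float]],
) -> str:
    """Pick the action with the highest expected value under the prediction."""
    return max(table, key=lambda action: expected_value(action, policy, table))


print(best_action(predicted_policy, payoff))
```

With these made-up numbers, advancing is preferred because the opponent is predicted to hold most of the time. If the prediction shifts toward an attack, defending becomes the better choice.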

CICERO's code is open source, and you can check it out on GitHub.

Can humans still beat CICERO?

Being a bot, CICERO finds it easier to handle many conversations simultaneously. It is very talkative, which, ironically, could be one of its weak points. Moreover, CICERO is almost entirely honest, whereas human Diplomacy players often use betrayal and dishonesty to win the game. For this reason, it is still possible to beat the algorithm at this game.

Conclusion

Building language models that imitate human language is a big challenge in the field of AI. The technology behind CICERO and other language-processing models is lowering the communication barriers between AI-powered agents and the real world.