AI for multi-agent games like Go, Poker, and Dota has seen great strides in recent years. Yet none of these games address the real-life challenge of cooperation in the presence of unknown and uncertain teammates. This challenge is a key game mechanism in hidden role games. Here we develop the DeepRole algorithm, a multi-agent reinforcement learning agent that we test on The Resistance: Avalon, the most popular hidden role game. DeepRole combines counterfactual regret minimization (CFR) with deep value networks trained through self-play. Our algorithm integrates deductive reasoning into vector-form CFR to reason about joint beliefs and deduce partially observable actions. We augment deep value networks with constraints that yield interpretable representations of win probabilities. These innovations enable DeepRole to scale to the full Avalon game. Empirical game-theoretic methods show that DeepRole outperforms other hand-crafted and learned agents in 5-player Avalon. DeepRole played with and against human players on the web in hybrid human-agent teams. We find that DeepRole outperforms human players as both a cooperator and a competitor.
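To make "deductive reasoning about joint beliefs" concrete, here is a minimal sketch of the general idea, not DeepRole's actual implementation. It assumes a stripped-down 5-player game with 2 spies and no special roles, and it relies on the rule that only spies can play fail cards. The function names (`uniform_prior`, `deduce`, `spy_marginals`) are hypothetical and chosen for illustration.

```python
from itertools import combinations


def uniform_prior(n_players=5, n_spies=2):
    """Uniform joint belief over every possible set of spies (simplified game)."""
    assignments = [frozenset(c) for c in combinations(range(n_players), n_spies)]
    p = 1.0 / len(assignments)
    return {a: p for a in assignments}


def deduce(belief, team, n_fails):
    """Zero out spy sets that cannot explain the mission result, then renormalize.

    Only spies can play fail cards, so seeing n_fails fails means the team
    contained at least n_fails spies.
    """
    team = set(team)
    posterior = {
        a: (p if len(a & team) >= n_fails else 0.0) for a, p in belief.items()
    }
    total = sum(posterior.values())
    if total == 0:
        raise ValueError("observation is inconsistent with every assignment")
    return {a: p / total for a, p in posterior.items()}


def spy_marginals(belief, n_players=5):
    """Per-player probability of being a spy under the joint belief."""
    return [sum(p for a, p in belief.items() if i in a) for i in range(n_players)]


if __name__ == "__main__":
    # Players 0 and 1 went on a mission and one fail card appeared.
    belief = deduce(uniform_prior(), team=(0, 1), n_fails=1)
    print(spy_marginals(belief))
```

The point of keeping a joint belief over whole assignments, rather than one number per player, is that a single observation can rule out combinations of roles that per-player probabilities cannot express.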
2022-11-28: Diplomacy has fallen (with big caveats)
When people say the AI ‘solved’ Diplomacy, it really, really didn’t. What it did, which is still impressive, is get a handle on the basics of Diplomacy in this particular context, where bots cannot be identified and are in the minority, and in particular where message detail is sufficiently limited that it can use an LLM to communicate with humans reasonably without being identified.
If this program entered the world championships, with full-length turns, I would not expect it to do well in its current form, although I would not be shocked if further efforts could fix this (or if they proved surprisingly tricky).
Interestingly, this AI is programmed not to mislead the player on purpose, although it will absolutely go back on its word if it feels like it. This is closer to correct than most players think, but it is a huge weakness in key moments and is highly exploitable if someone knows this and is willing and able to ‘check in’ every turn.
2022-12-02: Stratego has fallen
Stratego, the classic board game that’s more complex than chess and Go, and craftier than poker, has been mastered. We present DeepNash, an AI agent that learned the game from scratch to a human expert level by playing against itself.
DeepNash uses a novel approach, based on game theory and model-free deep reinforcement learning. Its play style converges to a Nash equilibrium, which means its play is very hard for an opponent to exploit. So hard, in fact, that DeepNash has reached an all-time top-three ranking among human experts on the world’s biggest online Stratego platform, Gravon. The machine learning approaches that work so well on perfect information games, such as DeepMind’s AlphaZero, are not easily transferred to Stratego. The need to make decisions with imperfect information, and the potential to bluff, makes Stratego more akin to Texas hold’em poker and requires a human-like capacity once noted by the American writer Jack London: “Life is not always a matter of holding good cards, but sometimes, playing a poor hand well.”
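The claim that equilibrium play is "very hard for an opponent to exploit" can be illustrated on a toy game. The sketch below uses rock-paper-scissors rather than Stratego and has nothing to do with DeepNash's training method; the `RPS` matrix and `best_response_value` helper are illustrative names. It computes how much a best-responding opponent can win on average against a fixed mixed strategy. Because this game is symmetric and zero-sum, that number is zero exactly when the strategy is an equilibrium.

```python
from fractions import Fraction

# Our payoff in rock-paper-scissors (zero-sum).
# Rows are our move, columns the opponent's move: rock, paper, scissors.
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def best_response_value(strategy, payoff=RPS):
    """Best expected payoff an opponent can get against a fixed mixed strategy."""
    n_cols = len(payoff[0])
    # The opponent's payoff for playing column j is the negative of ours.
    return max(
        -sum(s * payoff[i][j] for i, s in enumerate(strategy))
        for j in range(n_cols)
    )


if __name__ == "__main__":
    uniform = [Fraction(1, 3)] * 3
    rock_heavy = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
    print(best_response_value(uniform))     # equilibrium: nothing to gain
    print(best_response_value(rock_heavy))  # biased play can be punished
```

Against the uniform strategy no reply does better than breaking even, while the rock-heavy strategy loses a quarter of a point per game on average to an opponent who always plays paper.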