Month: June 2019

Multi-Agent Games

AI for multi-agent games like Go, Poker, and Dota has seen great strides in recent years. Yet none of these games address the real-life challenge of cooperation in the presence of unknown and uncertain teammates. This challenge is a key game mechanism in hidden role games. Here we develop the DeepRole algorithm, a multi-agent reinforcement learning agent that we test on The Resistance: Avalon, the most popular hidden role game. DeepRole combines counterfactual regret minimization (CFR) with deep value networks trained through self-play. Our algorithm integrates deductive reasoning into vector-form CFR to reason about joint beliefs and deduce partially observable actions. We augment deep value networks with constraints that yield interpretable representations of win probabilities. These innovations enable DeepRole to scale to the full Avalon game. Empirical game-theoretic methods show that DeepRole outperforms other hand-crafted and learned agents in 5-player Avalon. DeepRole played with and against human players on the web in hybrid human-agent teams. We find that DeepRole outperforms human players as both a cooperator and a competitor.
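
For readers unfamiliar with CFR, the core building block is regret matching: play each action in proportion to how much you regret not having played it so far. The sketch below shows only that building block on a hypothetical rock-paper-scissors example. It is not DeepRole's vector-form CFR, and the function names are illustrative assumptions rather than anything from the paper.

```python
from typing import List


def regret_matching(cumulative_regrets: List[float]) -> List[float]:
    """Turn cumulative regrets into a strategy: weight each action by its
    positive regret, or play uniformly if no regret is positive."""
    positive = [max(r, 0.0) for r in cumulative_regrets]
    total = sum(positive)
    if total <= 0.0:
        return [1.0 / len(positive)] * len(positive)
    return [p / total for p in positive]


def update_regrets(
    cumulative_regrets: List[float],
    strategy: List[float],
    action_utilities: List[float],
) -> List[float]:
    """Add, for each action, how much better it would have done than the
    expected utility of the strategy that was actually played."""
    expected = sum(p * u for p, u in zip(strategy, action_utilities))
    return [r + (u - expected) for r, u in zip(cumulative_regrets, action_utilities)]


# Hypothetical example: actions are (rock, paper, scissors) and the opponent
# always plays rock, so the utilities of our three actions are 0, +1, -1.
regrets = [0.0, 0.0, 0.0]
strategy = regret_matching(regrets)          # uniform, since nothing is regretted yet
regrets = update_regrets(regrets, strategy, [0.0, 1.0, -1.0])
next_strategy = regret_matching(regrets)     # all weight shifts to paper
```

CFR applies this update at every decision point of the game tree and averages the strategies over many iterations; the averaging and the tree traversal are omitted here.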

2022-11-28: Diplomacy has fallen (with big caveats)

When people say the AI ‘solved’ Diplomacy, it really, really didn’t. What it did, which is still impressive, is get a handle on the basics of Diplomacy, in this particular context where bots cannot be identified and are in the minority, and in particular where message detail is sufficiently limited that it can use an LLM to communicate with humans reasonably without being identified.

If this program entered the world championships, with full length turns, I would not expect it to do well in its current form, although I would not be shocked if further efforts could fix this (or if they proved surprisingly tricky).

Interestingly, this AI is programmed not to mislead the player on purpose, although it will absolutely go back on its word if it feels like it. This is closer to correct than most players think, but it is a huge weakness in key moments and highly exploitable if someone knows this and is willing and able to ‘check in’ every turn.

2022-12-02: Stratego has fallen

Stratego, the classic board game that’s more complex than chess and Go, and craftier than poker, has been mastered. We present DeepNash, an AI agent that learned the game from scratch to a human expert level by playing against itself.
DeepNash uses a novel approach, based on game theory and model-free deep reinforcement learning. Its play style converges to a Nash equilibrium, which means its play is very hard for an opponent to exploit. So hard, in fact, that DeepNash has reached an all-time top-three ranking among human experts on the world’s biggest online Stratego platform, Gravon. The machine learning approaches that work so well on perfect information games, such as DeepMind’s AlphaZero, are not easily transferred to Stratego. The need to make decisions with imperfect information, and the potential to bluff, makes Stratego more akin to Texas hold’em poker and requires a human-like capacity once noted by the American writer Jack London: “Life is not always a matter of holding good cards, but sometimes, playing a poor hand well.”
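
To make "very hard for an opponent to exploit" concrete, here is a small sketch that measures how much a best-responding opponent can win against a fixed mixed strategy. It uses rock-paper-scissors as a stand-in (an assumption for illustration; Stratego is far too large for this), and the names `PAYOFF` and `best_response_value` are hypothetical.

```python
from typing import List

# Payoff to the responding player in rock-paper-scissors.
# Rows are the responder's action, columns the fixed player's action,
# both in the order (rock, paper, scissors).
PAYOFF = [
    [0, -1, 1],   # rock: ties rock, loses to paper, beats scissors
    [1, 0, -1],   # paper: beats rock, ties paper, loses to scissors
    [-1, 1, 0],   # scissors: loses to rock, beats paper, ties scissors
]


def best_response_value(fixed_strategy: List[float]) -> float:
    """Expected payoff of the best pure reply to a fixed mixed strategy.
    The game's value is 0, so this number is how exploitable the strategy is."""
    return max(
        sum(payoff * prob for payoff, prob in zip(row, fixed_strategy))
        for row in PAYOFF
    )


uniform = [1.0 / 3, 1.0 / 3, 1.0 / 3]   # the Nash equilibrium of this game
rock_heavy = [0.5, 0.25, 0.25]          # a biased, exploitable strategy

nash_gain = best_response_value(uniform)       # no reply gains anything
biased_gain = best_response_value(rock_heavy)  # paper wins 0.25 per game on average
```

At the equilibrium no reply does better than break even, while any bias hands the opponent a positive expected gain.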

Crypto is here to stay?

No matter what you think of a wealth tax, it likely would boost the demand for Bitcoin and other crypto assets, as cryptocurrencies are potentially a way to store assets out of reach of many tax authorities. And the US is hardly the only nation that may be looking to a wealth tax in the future to balance the books. In essence, the new and higher price of Bitcoin is telling us that fiscal solvency will be hard to come by, and the wealthy will not give up their assets without a fight.

New Era of Political Reform?

History suggests that we’re at an inflection point on the cusp of a new era of reform. In one sense, it’s right on schedule. As Samuel Huntington notes in his 1981 classic, American Politics: The Promise of Disharmony, the US goes through periods of reform politics about every 60 years or so: the 1960s, the Progressive Era, Jacksonian Democracy, and the Revolutionary War. In these years, Americans grew disillusioned and discontented with the corrupt status quo, and reform movements spread. New media and expanding participation upended traditional power politics. The parallels of today with earlier eras are striking.

Scooters

At least at eye level, the lax regulations France does have – the minimum age is 8, cities may choose to permit or prohibit riding on the sidewalk, riding on all streets with speed limit up to 50 km/h is required – appear sufficient. The American, British, and Italian approaches are too draconian and only serve to discourage this mode of transportation.

Jony Ive Design Legacy

In conversation, he would always be unfailingly polite (if not always prompt in recent years), a gentle soul in the body of a rugby player. He’d shimmer with intensity as he dove into the tiny things that were always paramount in bringing his visions into physical form. Like Carl Sagan awed at some mind-bending celestial wonder, he’d extol the sound that a laptop made when it closed shut, or praise the way concrete had been poured in the parking garage at Apple headquarters. When we talked about the iPod, he would launch into reveries about its whiteness. “It’s not just a color. So brutally simple and so … pristine … so shocking.” That word again.

Grabbing Now vs Later

So instead of grabbing stuff from the rich and businesses today, consider the option of waiting, to grab later. If you don’t grab stuff from them today, these actors will invest much of that stuff, producing a lot more stuff later. Yes, you might think some of your favorite projects are good investments, but let’s be honest; most of the stuff you grab won’t be invested, and the investments that do happen will be driven more by political than rate-of-return considerations. Furthermore, if you grab a lot today, news of that event will discourage future folks from generating stuff, and encourage those folks to move and hide it better.

Tensor Considered Harmful

Despite its ubiquity in deep learning, Tensor is broken. It forces bad habits such as exposing private dimensions, broadcasting based on absolute position, and keeping type information in documentation. This post presents a proof-of-concept of an alternative approach, named tensors, with named dimensions. This change eliminates the need for indexing, dim arguments, einsum-style unpacking, and documentation-based coding. The prototype PyTorch library accompanying this blog post is available as namedtensor.
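
A toy illustration of the idea, in plain Python rather than PyTorch: the `NamedArray` class below is a hypothetical sketch written for this note and is not the namedtensor library's API. It shows the one property the post argues for, namely that a reduction addressed by dimension name gives the same answer however the axes happen to be ordered.

```python
from typing import Dict, Sequence, Tuple


class NamedArray:
    """Toy n-dimensional array whose axes are addressed by name (hypothetical)."""

    def __init__(self, names: Sequence[str], values: Dict[Tuple[int, ...], float]):
        self.names = tuple(names)
        self.values = dict(values)  # maps an index tuple to a value

    @classmethod
    def from_nested(cls, names: Sequence[str], nested) -> "NamedArray":
        values: Dict[Tuple[int, ...], float] = {}

        def walk(node, prefix: Tuple[int, ...]) -> None:
            if isinstance(node, list):
                for i, child in enumerate(node):
                    walk(child, prefix + (i,))
            else:
                values[prefix] = node

        walk(nested, ())
        return cls(names, values)

    def transpose(self, *names: str) -> "NamedArray":
        """Reorder the axes; callers never need to know the old positions."""
        if sorted(names) != sorted(self.names):
            raise ValueError("transpose needs each existing name exactly once")
        perm = [self.names.index(n) for n in names]
        moved = {tuple(idx[p] for p in perm): v for idx, v in self.values.items()}
        return NamedArray(names, moved)

    def sum(self, dim: str) -> "NamedArray":
        """Sum out the axis called `dim`, wherever it currently sits."""
        axis = self.names.index(dim)
        out: Dict[Tuple[int, ...], float] = {}
        for idx, v in self.values.items():
            key = idx[:axis] + idx[axis + 1:]
            out[key] = out.get(key, 0) + v
        return NamedArray(self.names[:axis] + self.names[axis + 1:], out)


image = NamedArray.from_nested(("height", "width"), [[1, 2, 3], [4, 5, 6]])
row_totals = image.sum("width")                    # one total per row
flipped = image.transpose("width", "height")       # same data, axes swapped
row_totals_flipped = flipped.sum("width")          # same call, same result
```

With positional axes, the second reduction would need `dim=0` instead of `dim=1`, and the only record of that would be a comment.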

change.org parody

A neural net trained on change.org tries to write its own petitions, e.g. “Help Bring Climate Change to the Philippines!” and “Donald Trump: Change the name of the National Anthem to be called the ‘Fiery Gator’”

stories about sexual selection (like “women are naturally attracted to dominant-looking men, because throughout evolution they were better able to provide”) are meaningless, because for most of human history women did not choose their own mates, and so women are unlikely to have strong biologically-ingrained mate preferences.