Fairness in AI, StarCraft Edition

January 31, 2019

An interesting debate is emerging over Google’s announcement last week of some details of its attempt to battle humans at the video game StarCraft II, a debate over what “AI” is and what “fairness” may mean for the field.

Ars Technica’s Timothy B. Lee published an article yesterday analyzing the face-off between AlphaStar, a computer program from Google’s DeepMind unit, and one of the dominant human champs at StarCraft II, Poland’s Grzegorz Komincz, known by his gamer handle MaNa.

Lee argues the Google win over MaNa was “not a fair fight,” citing the fact that AlphaStar’s neural network configuration had an advantage in having access to “raw” game data during the course of play, data that human players do not have. When DeepMind researchers forced AlphaStar to play by only looking through a “camera,” the way humans see the game, it lost.

The use of extra information by AlphaStar raises two entwined questions: Should machine learning be constrained by human limitations, in order to be fair, and if it is not, can it really make any claims to be developing anything called intelligence, at least in the sense most people use that term?

Lee offers the view that AI should be bound by fair rules: “The ultimate way to level the playing field would be to make AlphaStar use the exact same user interface as human players,” he writes.

The 148 comments in response to Lee’s article, as of Thursday afternoon, offer a lively debate on the matter. Some argue that AlphaStar’s access to raw data gives it an ability to better plan its actions and therefore to take actions that a human never could. Others point out that the moves AlphaStar came up with in response to that information have advantages, but that they are simple tricks that don’t seem like sophisticated strategizing, contrary to what Google’s researchers imply about their achievement. Some, taking the opposing view, suggest that it’s no surprise a machine can manipulate a video game: a machine should be able to manipulate another machine any way it can.

Screen capture of AlphaStar playing against a human player from Team Liquid.

(Image: Google DeepMind/Blizzard Entertainment)

New York University psychology professor Gary Marcus, who has been a frequent critic of the deep learning approach to AI, wrote on Twitter that the AlphaStar announcement was “DeepMind’s latest publicity stunt.”

deepmind’s latest publicity stunt, dissected. https://t.co/piLnejQXUs

— Gary Marcus (@GaryMarcus) January 30, 2019

He followed up by saying that StarCraft II should not even be considered a “grand challenge” for AI, as DeepMind asserts.

I don’t even think Starcraft is a “grand challenge”; that too is just part of the PR game. But yes current AI is useful, even if it isn’t really anywhere near AGI. Who disagrees?

— Gary Marcus (@GaryMarcus) January 30, 2019

Because there is no academic paper yet from DeepMind, many details of the work are vague. For example, Marcus points out that it’s not clear from DeepMind’s description whether AlphaStar’s access included “raw pixels” or “pre-parsed API” calls to StarCraft, and that the difference matters.

thanks! but that doesn’t say whether the particular software that won against experts used pixels or preparsed APi stuff, which are both available but very different

— Gary Marcus (@GaryMarcus) January 31, 2019

DeepMind says the access to extra information is not the key to victory. In the group’s blog post last week, the researchers write that even when AlphaStar was restricted to the camera view of the game, the program was “almost as strong as the raw interface.” From that, the researchers conclude that “these results suggest that AlphaStar’s success against MaNa and TLO [the two human players] was in fact due to superior macro and micro-strategic decision-making, rather than superior click-rate, faster reaction times, or the raw interface.”

Regardless of how that particular point is resolved by future research, much of the commentary of the past 24 hours is likely mistaken about the priorities and intentions of the AlphaStar work. The notions of fairness and intelligence are clouding the matter. What little is available in the post, and in related work, makes clear that the priorities of the DeepMind researchers are different from what Lee and Marcus and others believe them to be.

DeepMind’s training of an elite player is unlike anything a human athlete has ever experienced.

(Image: Google DeepMind)

For one thing, the blog post makes clear that even with its privileged access, AlphaStar still has to contend with the “fog of war,” as it’s called: it has no access to opponents’ resources that are not visible on the game board, whether through the camera view, through raw pixels, or through an API. Such hidden information makes StarCraft a game of “imperfect information,” which is a critical concept in game theory.

Dealing with the fog of war is clearly, to DeepMind, an important aspect of what the researchers are trying to engineer. In the 2017 paper on StarCraft II produced by DeepMind, StarCraft II: A New Challenge for Reinforcement Learning, the authors explicitly treat fog of war as a limitation distinct from the camera view. The two are not the same thing.
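To make the distinction concrete, here is a toy sketch in Python. It is not DeepMind’s code and does not use the StarCraft II API; the `Unit` class, the `sight` radius, and both function names are hypothetical, invented only for illustration. The first function models the fog of war, which hides anything outside the sight range of a player’s own units. The second applies a camera window on top of that.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    owner: str
    x: int
    y: int


def visible_through_fog(units, player, sight=3):
    """Units the player may know about at all: its own, plus any unit
    within `sight` squares of one of its own units (hypothetical rule)."""
    own = [u for u in units if u.owner == player]

    def seen(u):
        return any(abs(u.x - o.x) <= sight and abs(u.y - o.y) <= sight for o in own)

    return [u for u in units if u.owner == player or seen(u)]


def visible_in_camera(units, player, camera, sight=3):
    """Apply the fog of war first, then keep only what falls inside the
    camera rectangle (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = camera
    return [
        u
        for u in visible_through_fog(units, player, sight)
        if x0 <= u.x <= x1 and y0 <= u.y <= y1
    ]


units = [
    Unit("me", 0, 0),
    Unit("me", 20, 20),
    Unit("enemy", 2, 1),    # near my first unit
    Unit("enemy", 21, 19),  # near my second unit
    Unit("enemy", 50, 50),  # far from everything I own
]

raw = visible_through_fog(units, "me")
cam = visible_in_camera(units, "me", camera=(0, 0, 10, 10))
```

In this toy model the “raw” view sees both nearby enemy units at once, while the camera view sees only the one inside its window. Neither view ever sees the enemy at (50, 50), which is the point: removing the camera restriction does not lift the fog.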

Moreover, the focus on extra information and camera view is odd given that the way AlphaStar improves is clearly not anything human. The blog post suggests that AlphaStar builds upon work by DeepMind’s David Balduzzi and colleagues on something called Nash averaging, where multiple players are surveyed by the neural network across multiple games, to create a kind of ideal player built from strengths of various agents in those multiple games. That involved some of the individual agents experiencing “up to 200 years of real-time StarCraft play.” More details that seem relevant were offered up in a new paper by Balduzzi and colleagues on Friday.
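The general idea of weighting a population of agents by a Nash equilibrium can be sketched in a few lines of Python. This is an illustration under stated assumptions, not DeepMind’s method: the four agents and their payoff table are made up, and the solver used here is plain fictitious play, a classic technique swapped in for simplicity, rather than whatever procedure Balduzzi and colleagues actually use. Entry `PAYOFF[i][j]` is the hypothetical score of agent `i` against agent `j`.

```python
def nash_mixture(payoff, iterations=20000):
    """Approximate a Nash mixture over agents for a symmetric zero-sum
    game using fictitious play: repeatedly pick the best response to the
    empirical mix of past picks, then report the pick frequencies."""
    n = len(payoff)
    counts = [0] * n
    for _ in range(iterations):
        # Score of each agent against the picks made so far.
        values = [
            sum(payoff[i][j] * counts[j] for j in range(n)) for i in range(n)
        ]
        # max() returns the first maximal index, so ties break toward agent 0.
        best = max(range(n), key=values.__getitem__)
        counts[best] += 1
    return [c / iterations for c in counts]


# Hypothetical population: three agents that beat each other in a cycle,
# plus one agent that loses to all of them.
AGENTS = ["rock", "paper", "scissors", "weak"]
PAYOFF = [
    [0, -1, 1, 1],
    [1, 0, -1, 1],
    [-1, 1, 0, 1],
    [-1, -1, -1, 0],
]

mixture = nash_mixture(PAYOFF)
for name, weight in zip(AGENTS, mixture):
    print(f"{name}: {weight:.3f}")
```

The three agents that beat each other in a cycle end up with roughly equal weight, and the agent that loses to everyone gets none. No single agent is “the best,” yet the mixture is hard to exploit, which gives a rough sense of what an ideal player built from the strengths of several agents can mean.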

The DeepMind approach to Nash averaging and the rest may become better at playing against humans this year, with and without human-like constraints such as camera view. But it is clearly not a form of training that a single human player would recognize. As with AlphaGo Zero, which in 2017 conquered human players of the classic strategy game Go, AlphaStar is progressing by means that are novel and not to be compared directly to human experience.

None of this really speaks to fairness, per se, but it’s also not clear what fairness is supposed to be in the context of a video game. StarCraft is a simulated environment with certain rules, but it was always possible to exploit the game through clever engineering. Hence, cries for fairness are somewhat beside the point.

A large part of what Balduzzi and others at DeepMind are trying to achieve is not to find human intelligence, but to find a better way to engineer deep “reinforcement learning” to create systems that can get some kind of work done, and also to find what happens to classic game theory when it is tested on a novel scale.

For reasons such as these, people are likely to find that their presuppositions about AlphaStar, and about deep learning in general, do not match what the field is actually pursuing. Developing intelligence of some kind is certainly a focus of DeepMind and others in the field. And hopefully, so is fairness. But at this point, neither intelligence nor fairness as formulated in deep learning accords with the notions that Lee and Marcus and others are promoting as self-evident. What is emerging is something else, and expectations may have to be adjusted to the engineering reality.