After DeepMind's AlphaStar announcement, critical discussion focused on the system's sometimes seemingly superhuman click-rate (see threads on Hacker News, LessWrong and the Reddit AMA with Oriol Vinyals, David Silver, TLO and MaNa).
As one redditor put it:
Statistics aside, it was clear from the gamers', presenters', and audience's shocked reaction to the Stalker micro, all saying that no human player in the world could do what AlphaStar was doing. Using just-beside-the-point statistics is obfuscation and an avoiding of acknowledging this.
AlphaStar wasn't outsmarting the humans—it's not like TLO and MaNa slapped their foreheads and said, "I wish I'd thought of microing Stalkers that fast! Genius!"
This figure is what DeepMind revealed. Many have argued that the yellow distribution (TLO's APM) is potentially misleading, and the other two distributions reveal a more accurate picture, with a seemingly normal distribution for MaNa, but a very long tail of APMs for AlphaStar.
In particular, the claim is that AlphaStar's move accuracy likely did not deteriorate with increased click-rate to the same extent as human accuracy. For example, the huge peak for TLO seems attributable to him not playing his standard race, resulting in a large number of desperate, non-useful clicks (and possibly to his use of a repeater keyboard).
We can try to make sense of the discussion and update on what this limitation really means, by asking:
By the end of 2019, will there be a published AI system -- with performance (roughly) at least as good as the AlphaStar that defeated TLO and MaNa, and hard-coded knowledge (roughly) no greater than AlphaStar -- whose APM distribution has a tail comparable to human players?
A "published" system is one described in a credible blog post, pre-print, peer-reviewed paper, or similar.
The phrase "a tail comparable to human players" is quite vague, as I don't want to end up with an ambiguous resolution by too narrowly constraining what possible APM restrictions might look like. Feedback on improving this resolution condition is appreciated.
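To make the vagueness concrete, here is a minimal sketch of one possible way to compare the tails of two APM distributions. It is not the resolution criterion. The function names, the choice of the 99th percentile, and the 1.25 tolerance factor are all illustrative assumptions, and the sample data is made up.

```python
def percentile(samples, q_percent):
    """Nearest-rank percentile; q_percent is an integer from 1 to 100."""
    ordered = sorted(samples)
    # Ceiling division in integer arithmetic avoids float rounding issues.
    rank = (q_percent * len(ordered) + 99) // 100
    return ordered[max(rank, 1) - 1]


def tail_comparable(agent_apm, human_apm, q_percent=99, tolerance=1.25):
    """Hypothetical check: the agent's upper-tail APM is at most
    `tolerance` times the human's upper-tail APM. Both defaults are
    assumptions chosen for illustration only."""
    return percentile(agent_apm, q_percent) <= tolerance * percentile(human_apm, q_percent)


# Made-up samples: mostly ~200 APM, with bursts in the top 10%.
human = [200] * 90 + [600] * 10
agent = [200] * 90 + [1500] * 10

print(tail_comparable(human, human))  # same distribution
print(tail_comparable(agent, human))  # agent bursts far above the human tail
```

A check like this only looks at raw click-rate. It says nothing about whether accuracy holds up at high APM, which is the point raised in the discussion above.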