Computer game bot Turing test

The computer game bot Turing test is a variant of the Turing test, where a human judge viewing and interacting with a virtual world must distinguish between other humans and video game bots, both interacting with the same virtual world. This variant was first proposed in 2008 by Associate Professor Philip Hingston of Edith Cowan University, and implemented through a tournament called the 2K BotPrize.

History
The computer game bot Turing test was proposed to advance the fields of artificial intelligence (AI) and computational intelligence with respect to video games. It was considered that a poorly implemented bot implied a subpar game, so a bot that would be capable of passing this test, and therefore might be indistinguishable from a human player, would directly improve the quality of a game. It also served to debunk a flawed notion that "game AI is a solved problem."

Emphasis is placed on a game bot that interacts with other players in a multiplayer environment. Unlike a bot that simply needs to make optimal human-like decisions to play or beat a game, this bot must make the same decisions while also convincing another in-game player of its human-likeness.

Implementation
The computer game bot Turing test was designed to test a bot's ability to interact with a game environment in comparison with a human player, simply 'winning' was insufficient. This evolved into a contest with a few important goals in mind:


 * There are three participants: a human player, a computer-game bot, and a judge.
 * The bot needs to appear more human-like than the human player. Judge scores are not bipolar — both human and bot can be scored anywhere on a scale from 1 to 5 (1=not humanlike, 5=human).
 * All three participants are to be indistinguishable in the arena, with the exception of a randomly generated name tag, so as to reduce the chance of random elements such as name or appearance influencing the judges.
 * Chat is disabled throughout the match.
 * Bots were not given omniscient powers as they may be in other games. Bots must react only to the data that might be reasonably available to a human player.
 * Human participants were of a moderate skill range, with no participant either ignorant to the game or capable of playing at a professional level.

In 2008, the first 2K BotPrize tournament took place. The contest was held with the game Unreal Tournament 2004 as the platform. Contestants created their bots in advance using the GameBots interface. GameBots had some modifications made so as to adhere to the above conditions, such as removing data about vantage points or weapon damage that unfairly informed the bots of relevant strengths/weakness that a human would otherwise need to learn.

Tournament
The first BotPrize Tournament was held on 17 December 2008, as part of the 2008 IEEE Symposium on Computational Intelligence and Games in Australia. Each competing team was given time to set up and adjust their bots to the modified game client, although no coding changes were allowed at that point. The tournament was run in rounds, each a 10-minute death match. Judges were the last to join the server and every judge observed every player and every bot exactly once, although the pairing of players and bots did change. When the tournament ended, no bot was rated as more human than any player.

In subsequent tournaments, run during 2009–2011,  bots achieved scores that were increasingly human-like, but no contestant had won the BotPrize in any of these contests.

In 2012, the 2K BotPrize was held once again, and two teams programmed bots that achieved scores greater than those of human players.

Successful bots
To date, there have been two successfully programmed bots that passed the computer game bot Turing test:


 * UT^2, a team from the University of Texas at Austin, emphasized a bot that adjusted its behaviour based on previously observed human behaviour and neuroevolution. The team has made their bot available, although a copy of Unreal Tournament 2004 is required.
 * Mihai Polceanu, a doctoral student from Romania, focused on creating a bot that would mimic opponent reactions, in a sense 'borrowing' the human-like nature of the opponent. These victors succeeded in the year 2012, Alan Turing's centenary year.

Aftermath
The outcome of a bot that appears more human-like than a human player is possibly overstated, since in the tournament in which the bots succeeded, the average 'humanness' rating of the human players was only 41.4%. This showcases some limits of this Turing test, since the results demonstrate that human behaviour is more complicated and quantitative than was accounted for. In light of this, the BotPrize competition organizers will increase the difficulty in upcoming years with new challenges, forcing competitors to improve their bots.

It is also believed that methods and techniques developed for the computer game bot Turing test will be useful in fields other than video games, such as virtual training environments and in improving Human–robot interaction.

Contrasts to the Turing test
The computer game bot Turing test differs from the traditional or generic Turing test in a number of ways:


 * Unlike the traditional Turing test, for example the Chatterbot-style contest held annually by the Loebner Prize competition, the humans who played against the Computer Game Bots are not trying to convince judges they are the human; rather, they want to win the game (i.e., by achieving the highest kill score).
 * Judges are not restricted to awarding only one participant in a match as the 'human' and the other as the 'non-human.' This emphasizes more qualitative rather than polarized findings.
 * With regards to a successful video game bot, this is not to be confused with a claim that the bot is 'intelligent,' whereas a machine that 'passed' the Turing test would arguably have some evidence for its Chatterbot's 'intelligence.'
 * The game Unreal Tournament 2004 was chosen for its commercial availability and its interface for creating bots, GameBots. This limitation on medium is a sharp contrast to the Turing test, which emphasizes a conversation, where possible questions are vastly more numerous than the set of possible actions available in any specific video game.
 * The available information to the participants, humans and bots, is not equal. Humans interact through vision and sound, whereas bots interact with data and events.
 * The judges cannot introduce new events (e.g., a lava pit) to aid in differentiating between human and bot, whereas in a Chatterbot designed system, judges may theoretically ask any question in any manner.
 * The two participants and the judge take part in a three-way interaction, unlike, for example, the paired two-way interaction of the Loebner Prize Contest.