Tic-tac-toe is the "hello world" of artificial intelligence. Sticking to tradition, we will use it as our subject game, although the architecture and approach extend well beyond this simple example (for example, to Global Thermonuclear War; for other use cases, refer to John Badham's WarGames).
Tic-tac-toe is a two-player game of Xes and Os played on a 3 x 3 grid. One player has the symbol O and the other has the symbol X, and play alternates. On each turn, a player places their symbol in any open cell in the grid. If the placement completes a horizontal, vertical, or diagonal line of three of that player's symbols, that player wins. If all cells are filled without forming a line of three, the game is a tie.
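The rules above can be captured in a few lines of code. The following is a minimal Python sketch, not the chapter's implementation: the tuple-based board and the names `place`, `winner`, and `is_tie` are assumptions made for illustration.

```python
from typing import Optional, Tuple

Board = Tuple[str, ...]  # nine cells, row-major: "X", "O", or " " for empty

# The eight winning lines: three rows, three columns, two diagonals.
LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
)

EMPTY_BOARD: Board = (" ",) * 9


def place(board: Board, cell: int, symbol: str) -> Board:
    """Return a new board with symbol placed in an open cell."""
    if board[cell] != " ":
        raise ValueError(f"cell {cell} is already taken")
    return board[:cell] + (symbol,) + board[cell + 1:]


def winner(board: Board) -> Optional[str]:
    """Return "X" or "O" if that player has completed a line, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def is_tie(board: Board) -> bool:
    """A tie: every cell is filled and nobody completed a line."""
    return winner(board) is None and " " not in board
```

Boards are immutable tuples here so that a game state can be passed around, compared, and stored without defensive copying.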
A common approach to developing artificial intelligence programs for games with alternating turns is to explore the game tree recursively, searching for the game state that evaluates best for the current player (or worst for the opposition). A game tree is a tree structure whose nodes are game states. A node's immediate children are the game states that can be reached by making one legal move from that node's game state.
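To make the parent/child relationship concrete, here is a small Python sketch that expands one node of the tree into its immediate children. It assumes, for illustration only, that X always moves first and that a board is a tuple of nine cells; the names `next_symbol` and `children` are hypothetical.

```python
from typing import List, Optional, Tuple

Board = Tuple[str, ...]  # nine cells, row-major: "X", "O", or " " for empty

LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
)


def winner(board: Board) -> Optional[str]:
    """Return the symbol that has completed a line, or None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def next_symbol(board: Board) -> str:
    """Assumption: X moves first, so X is to move when the counts match."""
    return "X" if board.count("X") == board.count("O") else "O"


def children(board: Board) -> List[Board]:
    """Immediate children: one board per legal move from this state."""
    if winner(board) is not None:
        return []  # the game is over, so no legal moves remain
    symbol = next_symbol(board)
    return [
        board[:i] + (symbol,) + board[i + 1:]
        for i, cell in enumerate(board)
        if cell == " "
    ]


root: Board = (" ",) * 9
first_level = children(root)
second_level = [g for c in first_level for g in children(c)]
print(len(first_level), len(second_level))  # prints "9 72"
```

The root has nine children (one per open cell), and each of those has eight, which is why the tree fans out quickly even for a game this small.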
A sample game tree for tic-tac-toe is shown in the following diagram:
Chapter 6
The simplest algorithm that traverses a game tree in search of the best move is the Minimax algorithm. The algorithm scores each board recursively and returns the best score found. For this algorithm, we assume that a good score for the opposition is a bad score for the current player. Thus, the algorithm alternates between maximizing and minimizing the score at each level of the tree. The Minimax algorithm can be summarized with the following pseudocode:
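miniMax (board, depth, maximizing)
   if (depth <= 0) // base case: stop recursing and score the board
      return score (board)
   end
   children = move (board)
   if (maximizing)
      bestValue = -∞
      for (child : children)
         value = miniMax (child, depth-1, false)
         if (value > bestValue)
            bestValue = value
         end
      end
      return bestValue
   else // minimizing
      bestValue = ∞
      for (child : children)
         value = miniMax (child, depth-1, true)
         if (value < bestValue)
            bestValue = value
         end
      end
      return bestValue
   end
end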
A client invokes the algorithm with a game state, a depth, and a Boolean variable that indicates whether the algorithm should seek to maximize or minimize the score. In our use case, the game state is fully encapsulated by the board, which is a 3 x 3 grid partially filled with Xes and Os.
The algorithm is recursive. The first few lines of the code are the base case, which tests the depth variable and ensures that the algorithm does not recurse endlessly. In a game of alternating turns, the depth indicates how many turns ahead the algorithm should explore.
In our use case, the Storm topology need not track the depth. We will let the Storm topology explore endlessly (or until there are no new boards returned from the move method).
Typically, each player is given a set amount of time and must make a move within it. Since the AI will most likely face antsy human players, let's assume the system needs to respond in less than 200 milliseconds.
After the algorithm checks for the base case, it calls the move() method, which returns the boards for all possible moves. The algorithm then cycles through all of these child boards. If maximizing, the algorithm finds the child board that leads to the highest score. If minimizing, the algorithm finds the child board that leads to the lowest score.
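The walkthrough above can be turned into a short runnable sketch. The Python below is one possible rendering, not the chapter's Storm implementation. Several things are assumptions for illustration: the extra `me` parameter (the player whose score is being maximized), the `move(board, symbol)` signature, and a placeholder `score()` that only recognizes finished games and returns 0 for any undecided board.

```python
from typing import List, Optional, Tuple

Board = Tuple[str, ...]  # nine cells, row-major: "X", "O", or " " for empty

LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
)


def winner(board: Board) -> Optional[str]:
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def other(symbol: str) -> str:
    return "O" if symbol == "X" else "X"


def move(board: Board, symbol: str) -> List[Board]:
    """All boards reachable by placing symbol in one open cell."""
    if winner(board) is not None:
        return []  # finished game: no further moves
    return [
        board[:i] + (symbol,) + board[i + 1:]
        for i, cell in enumerate(board)
        if cell == " "
    ]


def score(board: Board, me: str) -> int:
    """Placeholder score: +1000 win, -1000 loss, 0 for a tie or undecided board."""
    w = winner(board)
    if w is None:
        return 0
    return 1000 if w == me else -1000


def minimax(board: Board, depth: int, maximizing: bool, me: str) -> int:
    """Score board for player me, looking at most depth turns ahead."""
    to_move = me if maximizing else other(me)
    kids = move(board, to_move)
    if depth <= 0 or not kids:  # base case: depth exhausted or game over
        return score(board, me)
    # The opponent moves next, so the maximizing flag flips at each level.
    values = [minimax(kid, depth - 1, not maximizing, me) for kid in kids]
    return max(values) if maximizing else min(values)


# X holds the two leftmost cells of the top row and is about to move.
board: Board = ("X", "X", " ", "O", "O", " ", " ", " ", " ")
print(minimax(board, 9, True, "X"))  # prints 1000: X completes the top row
```

Note that the recursive call passes `not maximizing`, which is what makes the search alternate between the two players' points of view.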
The Negamax algorithm accomplishes the same result more succinctly by alternating the sign of the score. Additionally, in a real-world scenario, we might apply Alpha-Beta pruning, which trims the branches of the tree that are explored. It tracks the best score each player is already known to be able to secure (alpha and beta) and skips any branch whose score falls outside that window, because such a branch cannot change the final decision. In our use case, this is not necessary because the search space is small enough to explore in its entirety.
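For comparison, here is a minimal Python sketch of Negamax combined with Alpha-Beta pruning. It is an illustration under assumptions, not part of the chapter's topology: the score is always expressed from the point of view of the player about to move, the search runs to the end of the game rather than to a fixed depth, and the names `negamax` and `to_move` are hypothetical.

```python
import math
from typing import Optional, Tuple

Board = Tuple[str, ...]  # nine cells, row-major: "X", "O", or " " for empty

LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
)


def winner(board: Board) -> Optional[str]:
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None


def negamax(board: Board, to_move: str,
            alpha: float = -math.inf, beta: float = math.inf) -> float:
    """Best score the player about to move can secure from this board."""
    w = winner(board)
    if w is not None:
        return 1000 if w == to_move else -1000
    if " " not in board:
        return 0  # full board with no line: a tie
    opponent = "O" if to_move == "X" else "X"
    best = -math.inf
    for i, cell in enumerate(board):
        if cell != " ":
            continue
        child = board[:i] + (to_move,) + board[i + 1:]
        # The opponent's best score, negated, is our score for this move.
        # The window is negated and swapped for the same reason.
        value = -negamax(child, opponent, -beta, -alpha)
        best = max(best, value)
        alpha = max(alpha, value)
        if alpha >= beta:
            break  # the opponent already has a better option elsewhere
    return best


print(negamax((" ",) * 9, "X"))  # prints 0: perfect play ends in a tie
```

A single code path now serves both players, which replaces the separate maximizing and minimizing branches of Minimax.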
In our simple use case, it is possible to enumerate the entire game tree. In more complicated games such as Chess, the game tree is impossible to enumerate. In an extreme case such as Go, experts have calculated the number of legal boards to be in excess of 2 x 10^170.
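To see how small the tic-tac-toe tree is by comparison, the following Python sketch walks the whole tree and counts its nodes and leaves. It assumes X moves first and that a game stops as soon as a line is completed; `count_tree` is a hypothetical helper. Expect it to take a few seconds in plain Python.

```python
from typing import List, Tuple

LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
)


def has_line(board: List[str]) -> bool:
    """True if either player has completed a line of three."""
    return any(
        board[a] != " " and board[a] == board[b] == board[c]
        for a, b, c in LINES
    )


def count_tree(board: List[str], symbol: str) -> Tuple[int, int]:
    """Return (nodes, leaves) for the subtree rooted at board."""
    if has_line(board) or " " not in board:
        return 1, 1  # finished game: this node is a leaf
    nodes, leaves = 1, 0
    nxt = "O" if symbol == "X" else "X"
    for i in range(9):
        if board[i] == " ":
            board[i] = symbol  # make the move ...
            n, l = count_tree(board, nxt)
            board[i] = " "     # ... and undo it before trying the next one
            nodes += n
            leaves += l
    return nodes, leaves


nodes, leaves = count_tree([" "] * 9, "X")
print(nodes, leaves)  # prints "549946 255168"
```

Roughly half a million nodes is trivial for a computer, which is why exhaustive search is viable here and out of the question for Chess or Go.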
The goal of the Minimax algorithm is to traverse the game tree and assign a score to each node. In our Storm topology, which is not beholden to any SLA, the score of any non-leaf node is simply the maximum (or minimum) of its children's scores. For a leaf node, we must translate the game state into a corresponding score. In our simple use case, there are three possible outcomes: we win, our opponent wins, or the game is a tie.
In our synchronous system, however, we might very well run out of time before we reach a leaf node. In this case, we need to calculate the score from the current state of the board. Scoring heuristics are often the most difficult aspect of developing an AI application.
For our simple use case, we will compute the score for any board by considering the lines in the grid. There are eight lines to consider: three horizontal, three vertical, and two diagonals. Each line contributes to the score according to the following table:
Status                                   Score
Three in a row for the current player    +1000
Two in a row for the current player      +10
One in a row for the current player      +1
Three in a row for an opponent           -1000
Two in a row for an opponent             -10
One in a row for an opponent             -1
The preceding table applies only if the remaining cells in the line are empty.
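One way to read the table is sketched below in Python. This is an interpretation rather than the chapter's code: it treats "N in a row" as N cells of a line held by one player with the rest of that line empty (so X, empty, X counts as two), and it scores a line as 0 when both players have a symbol in it. The names `WEIGHTS` and `score` are assumptions.

```python
from typing import Tuple

Board = Tuple[str, ...]  # nine cells, row-major: "X", "O", or " " for empty

# The eight lines: three horizontal, three vertical, two diagonal.
LINES = (
    (0, 1, 2), (3, 4, 5), (6, 7, 8),
    (0, 3, 6), (1, 4, 7), (2, 5, 8),
    (0, 4, 8), (2, 4, 6),
)

# Points for holding 1, 2, or 3 cells of an otherwise empty line.
WEIGHTS = {1: 1, 2: 10, 3: 1000}


def score(board: Board, player: str) -> int:
    """Heuristic score of board from the point of view of player."""
    opponent = "O" if player == "X" else "X"
    total = 0
    for line in LINES:
        cells = [board[i] for i in line]
        mine = cells.count(player)
        theirs = cells.count(opponent)
        if mine and theirs:
            continue  # contested line: the table does not apply
        if mine:
            total += WEIGHTS[mine]
        elif theirs:
            total -= WEIGHTS[theirs]
    return total


board: Board = ("X", "X", " ", " ", "O", " ", " ", " ", " ")
print(score(board, "X"))  # prints 9: +10 (top row) +1 (left column) -1 -1 (two O lines)
```

Because the two players' contributions mirror each other, `score(board, "O")` is always the negative of `score(board, "X")`, which matches the Minimax assumption that a good score for one side is a bad score for the other.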
Although the preceding heuristic could be improved, it suffices for this example. And, since we expect Storm to work continually on our game tree, we hope not to rely on the heuristic all that much. Instead, we would rely directly on the minimum (or maximum) of the leaf scores, which will always be a win (+1000), a loss (-1000), or a draw (0).
Finally, armed with an approach, an algorithm, and a scoring function, we are able to move on to the architecture and design.