Chapter 6
|
Games versus search problems
Types of games
Game definition
- initial state - includes board position and the player to move
- successor function - returns a lists of (move, state) pairs where move is legal and the resulting state
- terminal test - game over at some terminal state
- utility function (objective or payoff function) - give numeric value of a terminal state (e.g. +1=win, -1=loss, 0=draw)
Game tree
Initial state and legal moves define game tree (see below figure for tic-tac-toe)
initial state: board empty, MAX turn
successor function: For tic-tac-toe returns list of 9 (move, state)
from initial state.
[(0,[X,1,2,3,4,5,6,7,8]), (1,[0,X,2,3,4,5,6,7,8]), ...
Game tree (partial) of tic-tac-toe (2-player, deterministic, turns)

MAX plays X.
- Are all the Terminal nodes generated at equal depth?
- What is the minimum depth of a Terminal node?
- Explain the values of Utility at Terminal nodes.

Minimax(root) = max(min(3,12,8),min(2,4,6),min(14,5,2)
In the above game tree, MAX makes the first choice, then
MIN.
|
![]()
|
| def MINIMAX_DECISION(state): infinity = 1.e400 def MAX_VALUE(state): if TERMINAL_TEST(state): return UTILITY(state) v = -infinity for (a, s) in SUCCESSORS(state): v = max(v, MIN_VALUE(s)) return v def MIN_VALUE(state): if TERMINAL_TEST(state): return UTILITY(state) v = infinity for (a, s) in SUCCESSORS(state): v = min(v, MAX_VALUE(s)) return v action, state = argmax(SUCCESSORS(state), lambda ((a, s)): MIN_VALUE(s)) return action |
The following is the tic-tac-toe board layout:
0 1 2 3 4 5 6 7 8
- The initial state of the board with no moves is: [0,1,2,3,4,5,6,7,8].
- After an X is moved to the center square, the state is: [0,1,2,3,X,5,6,7,8]
- SUCCESSORS( [0,1,2,3,4,5,6,7,8] ) returns
[(0,[X,1,2,3,4,5,6,7,8]), (1,[0,X,2,3,4,5,6,7,8]), ... or
as an (action, state) tuple.
- UTILITY(state) returns for X (MAX) win +1, O (MIN) win -1 and tie 0.
- TERMINAL-STATE(state) is true when the state is a win or tie (board full)
- What is the branching factor at depth 0?
- At depth 1?
- What is the maximum depth?
- A MIN move will attempt to minimize or maximize the UTILITY result?
- Are states following a TERMINAL state (i.e.
) explored?
- Are all possible states explored to a TERMINAL state?
- Is this a depth or breadth first search? How do you know? (See the Python program below).
- Run the Minimax tic-tac-toe program; you'll play O. (Download)
Minimax (Download)
# Figure 6.3 p166 to implement tic-tac-toe
def MINIMAX_DECISION(state):
"""Given a state , calculate the best move by searching
forward all the way to the terminal states. [Fig. 6.4]"""
infinity = 1.e400
def MAX_VALUE(state):
if TERMINAL_TEST(state): return UTILITY(state)
v = -infinity
for (a, s) in SUCCESSORS(state):
v = max(v, MIN_VALUE(s))
return v
def MIN_VALUE(state):
if TERMINAL_TEST(state): return UTILITY(state)
v = infinity
for (a, s) in SUCCESSORS(state):
v = min(v, MAX_VALUE(s))
return v
action, state = argmax(SUCCESSORS(state), lambda ((a, s)): MIN_VALUE(s))
return action
def TERMINAL_TEST(state) : # win(state) = 'X', 'O' or None
return win(state)!=None or tie(state)
def UTILITY(state) :
if win(state) == 'X' : return +1
if win(state) == 'O' : return -1
return 0
def SUCCESSORS(state) :
successors=[]
open=0
for move in range(9) :
if state[move]==move : open = open + 1
if open%2 == 1 : player = 'X' # X makes odd numbered moves
else : player = 'O'
for move in range(9) :
if state[move]==move : # Its a 0, 1, 2, etc.
successor = state[:]
successor[move]=player # Place the player
successors.append((move,successor))
return successors
def tie(state) : # When no more moves
for i in range(9) :
if state[i]==i : return False
return True
def win(state) :
for c in [0,3,6] :
if state[c+0]==state[c+1] and state[c+0]==state[c+2] : return state[c+0]
for c in [0,1,2] :
if state[c+0]==state[c+3] and state[c+0]==state[c+6] : return state[c+0]
if state[0]==state[4] and state[0]==state[8] : return state[4]
if state[2]==state[4] and state[2]==state[6] : return state[4]
return None
def display(state) :
for c in [0,3,6] : print state[c+0],state[c+1],state[c+2]
def argmax(seq, fn) : # Return seq that evaluate to maximum fn(seq[0..n])
max = 0
maxvalue = fn(seq[0])
for i in range(len(seq)) :
testvalue = fn(seq[i])
if testvalue > maxvalue :
max = i
maxvalue = testvalue
return seq[max]
def run() :
board=[0,1,2,3,4,5,6,7,8]
while not win(board) and not tie(board) :
board[MINIMAX_DECISION(board)]='X'
if not win(board) and not tie(board) :
display(board)
board[input('Your move? ')]='O'
display(board)Properties of Minimax
Explores all possible state outcomes to TERMINAL state of win or tie.
Resource limits
Evaluation functions
Exact value don't matter
a-b search
Idea: At a MAX level establish initial cutoff value alpha and prune any branches that are less, explore branches that are greater. At a MIN level establish initial cutoff beta and prune any branches that are greater, explore branches that are less.
a-b(root) = max(min(3,12,8),min(2,x,y),min(14,5,2)
= max(3,min(2,x,y),2)
= max(3,z,2) where z <= 2
= 3
The maximum is independent of x and y; can prune without exploring. When examining 2 know that MIN will return 2 or less, MAX 3 or more.
a=highest-value so far for MAX
b=lowest-value so far for MIN
(a) MIN B node initially a=-
,b=+
; b=min(3,+
); a=-
,b=3.
(b) MIN B node, a=-
,b=3; b=min(12,3); a=-
,b=3
(c) MIN B node, a=-
,b=3; b=min(8,3); a=-
, b=3; return 3
MAX A node, a=-,b=+
; a=max(-
,3); a=3, b=+
(d) MIN C node, a=3,b=+
; b=min(2,+
); 2<=3 can prune; return 2;
MAX A node, a=3,b=+; a=max(3,2); a=3, b=+
(e) MIN D node, a=3,b=+
; b=min(14,+
); 14>3 cannot prune
(f) MIN D node, a=3,b=+
; b=min(14,5); 5>3 cannot prune
MIN D node, a=3,b=+; b=min(5,2); 2<=3 nothing to prune, return 2
MAX A node, a=3,b=+; a=max(3,2); a=3, b=+
; return 3
a-b pruning example
MAX A, MIN B
a-b pruning example



For the game trees at right, give the Minimax values. For the game trees at right, give the a-b values.
Properties of a-b
Why called a-b
# Figure 6.7 p170 to implement Figure 6.5 p168
def ALPHA_BETA_SEARCH(state):
"""Given a state in a game, calculate the best move by searching
forward all the way to the terminal states. [Fig. 6.4]"""
infinity = 1.e400
def MAX_VALUE(state, alpha, beta):
if TERMINAL_TEST(state): return UTILITY(state)
v = -infinity
for (a, s) in SUCCESSORS(state):
v = max(v, MIN_VALUE(s, alpha, beta))
if v >= beta : return v
alpha = min(alpha,v)
return v
def MIN_VALUE(state, alpha, beta):
if TERMINAL_TEST(state): return UTILITY(state)
v = infinity
for (a, s) in SUCCESSORS(state):
v = min(v, MAX_VALUE(s, alpha, beta))
if v <= alpha : return v
beta = min(beta,v)
return v
action, state = argmax(SUCCESSORS(state), lambda ((a, s)): MIN_VALUE(s))
return action
def TERMINAL_TEST(state) : return isinstance(state, int)
def UTILITY(state) : return state
def SUCCESSORS(state) : return STATE_SPACE[state]
def argmax(seq, fn) : # Return seq that evaluate to maximum fn(seq[0..n])
max = 0
maxvalue = fn(seq[0])
for i in range(len(seq)) :
testvalue = fn(seq[i])
if testvalue > maxvalue :
max = i
maxvalue = testvalue
return seq[max]
A,B,C,D='A','B','C','D'
STATE_SPACE={A:[B,C,D], B:[3,12,8], C:[2,4,1], D:[14,5,2]}
def run() :
print ALPHA_BETA_SEARCH('A')
Note that the above algorithm returns the MAX value not the action by: return MAX_VALUE(state,-infinity, infinity)
To return the action use:
return argmax(SUCCESSORS(state), lambda (s): MIN_VALUE(s, -infinity, infinity))
The following is the tic-tac-toe board layout:
0 1 2 3 4 5 6 7 8
- Run the a-b tic-tac-toe program; you'll play O. (Download)
Deterministic games in practice
6.4 IMPERFECT, REAL-TIME DECISIONS
6.5 GAMES THAT INCLUDE AN ELEMENT OF CHANCE
Nondeterministic games: Backgammon
Nondeterministic games in general
Game tree must include chance nodes in addition to MAX and MIN nodes (circles).
Branches from chance nodes denote the possible coin flips, heads or tails, equally likely.
Evaluation function has assigned values to the leaves
Pruning in nondeterministic game trees
More pruning occurs if we can bound the leaf values