Chapter 2

f : P* → A

Percept Sequence                          Action
[A, Clean]                                Right
[A, Dirty]                                Suck
[B, Clean]                                Left
[B, Dirty]                                Suck
[A, Clean], [A, Clean]                    Right
[A, Clean], [A, Dirty]                    Suck
:                                         :
[A, Clean], [A, Clean], [A, Clean]        Right
[A, Clean], [A, Clean], [A, Dirty]        Suck
:                                         :
Agent definition
- Complete the agent definition by filling in every possible percept sequence and its corresponding action.
- Can the table be completed? How many table entries are needed for percept sequences of length 2? Of length 3? Of length n?
- Can you give examples of table entries that make the agent good or bad? Smart or dumb?
Vacuum agent
- Performance measure might maximize cleaning and minimize travel.
- By the above measure, does the table define a rational agent?
- Give an example where changing the table defines a less rational agent.
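One way to make these questions concrete is to score an agent over a fixed number of steps. The sketch below is only an illustration: the scoring rule (+1 for each clean square at each step, -1 for each Left or Right move) and the names `simulate`, `reflex_agent` and `lazy_agent` are assumptions, not part of these notes.

```python
# Assumed performance measure: +1 per clean square per step, -1 per move.
def reflex_agent(percept):
    """Agent that follows the first four rows of the table above."""
    location, status = percept
    if status == 'Dirty':
        return 'Suck'
    return 'Right' if location == 'A' else 'Left'

def lazy_agent(percept):
    """A deliberately less rational agent: it moves but never cleans."""
    location, status = percept
    return 'Right' if location == 'A' else 'Left'

def simulate(agent, steps, world=None, location='A'):
    """Run `agent` for `steps` steps and return the assumed score."""
    world = dict(world or {'A': 'Dirty', 'B': 'Dirty'})
    score = 0
    for _ in range(steps):
        action = agent((location, world[location]))
        if action == 'Suck':
            world[location] = 'Clean'
        elif action == 'Right':
            location = 'B'
            score -= 1          # travel penalty
        elif action == 'Left':
            location = 'A'
            score -= 1          # travel penalty
        # reward every square that is clean at the end of this step
        score += sum(1 for status in world.values() if status == 'Clean')
    return score

print(simulate(reflex_agent, 4))   # 4
print(simulate(lazy_agent, 4))     # -4
```

Under this particular measure the table-following agent outscores the one that never cleans; a different measure could rank agents differently.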
Assumptions:
Specifying the task environment - PEAS
To design a rational agent, we must first specify its task environment.
Design of an automated taxi
- Performance measure - safety, destination, profits, legality, comfort, ...
- Environment - US streets/freeways, traffic, pedestrians, weather, ...
- Actuators - steering, accelerator, brake, horn, ...
- Sensors - video, accelerometers, gauges, engine sensors, keyboard, GPS, ...
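A PEAS description can also be written down as a small record, which makes it easy to see which parts of a design are still unspecified. This is a minimal sketch; the `PEAS` class and the `missing_parts` helper are hypothetical names introduced here, and the taxi entries are copied from the list above.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class PEAS:
    """One task-environment description (hypothetical helper class)."""
    performance: List[str]
    environment: List[str]
    actuators: List[str]
    sensors: List[str]

taxi = PEAS(
    performance=['safety', 'destination', 'profits', 'legality', 'comfort'],
    environment=['US streets/freeways', 'traffic', 'pedestrians', 'weather'],
    actuators=['steering', 'accelerator', 'brake', 'horn'],
    sensors=['video', 'accelerometers', 'gauges', 'engine sensors',
             'keyboard', 'GPS'],
)

def missing_parts(spec):
    """Return the names of the PEAS components that are still empty."""
    return [name for name, items in vars(spec).items() if not items]

print(missing_parts(taxi))                      # []
print(missing_parts(PEAS([], [], [], [])))      # all four parts still to do
```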
Design of an Internet shopping agent
- Performance measure?
- Environment?
- Actuators?
- Sensors?
Environment types
- Fully/partially observable - fully observable if the agent's sensors can sense the entire state of the environment.
- Vacuum world is ...?
- Deterministic/stochastic (nondeterministic) - deterministic if the next state is completely determined by the current state and the agent's action; otherwise stochastic. The vacuum world is deterministic, but it becomes stochastic if dirt can appear at random.
- Episodic/sequential - episodic if the next episode does not depend on previous episodes; sequential if the current decision can affect future decisions.
- Vacuum world is ...?
- Static/dynamic - dynamic if the environment can change after the agent senses but before the agent acts.
- Vacuum world is ...?
- Discrete/continuous - applies to how time is handled, to how the environment changes over time, and to the agent's percepts and actions. If time is not a factor, the environment is discrete.
- Vacuum world is ...?
- Single/multi agent - multiple agents may compete or cooperate (e.g. two agents playing chess, or a pilot agent and a flight controller agent flying an airplane).
- Vacuum world is ...?
The real world is partially observable, stochastic, sequential, dynamic, continuous and multi-agent.
                  Solitaire    Internet shopping    Taxi
Observable?       Yes          No                   No
Deterministic?
Episodic?
Static?
Discrete?
Single agent?
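The deterministic/stochastic distinction for the vacuum world can be illustrated with a single transition function. This is a sketch under stated assumptions: the function `step` and its `dirt_probability` parameter are hypothetical and simply model "dirt appears at random" as an independent chance per square after each action.

```python
import random

def step(world, location, action, dirt_probability=0.0, rng=random):
    """Return the next (world, location) after `action`.

    With dirt_probability == 0.0 the result depends only on the current
    state and the action (deterministic). With a positive value, each
    square may become Dirty at random (stochastic).
    """
    world = dict(world)                 # do not modify the caller's world
    if action == 'Suck':
        world[location] = 'Clean'
    elif action == 'Right':
        location = 'B'
    elif action == 'Left':
        location = 'A'
    for square in sorted(world):
        if rng.random() < dirt_probability:
            world[square] = 'Dirty'
    return world, location

# Deterministic: the same state and action always give the same next state.
print(step({'A': 'Dirty', 'B': 'Clean'}, 'A', 'Suck'))
# Stochastic: the next state can differ from run to run.
print(step({'A': 'Dirty', 'B': 'Clean'}, 'A', 'Suck', dirt_probability=0.3))
```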
Four basic agent types, in order of increasing generality:
- simple reflex agents - select an action based on the current percept, ignoring percept history.
- reflex agents with state - select an action based on internal state built from the percept history.
- goal-based agents - select an action to achieve some goal.
- utility-based agents - select the action that achieves the best measure of success.
All of the above can also be learning agents that analyze experience to improve how they select actions.
TABLE-DRIVEN-AGENT (Figure 2.7)
- Table contains all possible percept sequences that can occur.
- Each step appends the current percept to the list of percepts.
- LOOKUP the current percept sequence in the table.
function TABLE-DRIVEN-AGENT( percept ) returns an action
    static: percepts, a sequence, initially empty
            table, a table of actions, indexed by percept sequences, initially fully specified
    append percept to the end of percepts
    action = LOOKUP( percepts, table)
    return action

def TABLE_DRIVEN_AGENT(percept):          # Determine action based on table and percepts
    percepts.append(percept)              # Append percept
    action = LOOKUP(percepts, table)      # Lookup appropriate action for percepts
    return action

# Figure 2.7 page 45
A = 'A'
B = 'B'
percepts = []
table = {((A, 'Clean'),): 'Right',    # [Fig. 2.3]
         ((A, 'Dirty'),): 'Suck',
         ((B, 'Clean'),): 'Left',
         ((B, 'Dirty'),): 'Suck',
         ((A, 'Clean'), (A, 'Clean')): 'Right',
         ((A, 'Clean'), (A, 'Dirty')): 'Suck',
         # ...
         ((A, 'Clean'), (A, 'Clean'), (A, 'Clean')): 'Right',
         ((A, 'Clean'), (A, 'Clean'), (A, 'Dirty')): 'Suck',
         # ...
         }

def LOOKUP(percepts, table):    # Lookup appropriate action for percepts
    action = table.get(tuple(percepts))
    return action

def TABLE_DRIVEN_AGENT(percept):        # Determine action based on table and percepts
    percepts.append(percept)            # Add percept
    action = LOOKUP(percepts, table)    # Lookup appropriate action for percepts
    return action

def run():    # run agent on several sequential percepts
    # print() with a single formatted string runs under both Python 2 and 3
    print('Action\tPercepts')
    print('%s\t%s' % (TABLE_DRIVEN_AGENT((A, 'Clean')), percepts))
    print('%s\t%s' % (TABLE_DRIVEN_AGENT((A, 'Dirty')), percepts))
    print('%s\t%s' % (TABLE_DRIVEN_AGENT((B, 'Clean')), percepts))

Try
- Copy and paste the above program into a Python editor.
- Save as f2-7.py
- Run the module:
- Press F5
- Enter: run()
- The percepts should now be: [('A', 'Clean'), ('A', 'Dirty'), ('B', 'Clean')].
- For a lookup to succeed, the table must contain every possible percept sequence that the percept history can match.
- Enter:
- print(TABLE_DRIVEN_AGENT((B, 'Clean')))
- percepts
- Explain the results.
- How many table entries would be required if only the current percept was used to select an action rather than the percept history?
- How many table entries are required for an agent lifetime of T steps?
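The last two questions can be checked by brute force. The sketch below assumes the two-square vacuum world with percepts of the form (location, status), so there are 4 distinct percepts; the helper names are illustrative. It counts every combinatorially possible sequence, including sequences that might never occur in practice.

```python
from itertools import product

LOCATIONS = ['A', 'B']
STATUSES = ['Clean', 'Dirty']
PERCEPTS = list(product(LOCATIONS, STATUSES))   # 4 distinct percepts

def sequences_of_length(n):
    """Enumerate every percept sequence of exactly n percepts."""
    return list(product(PERCEPTS, repeat=n))

def entries_for_lifetime(T):
    """Table entries needed for all sequences of length 1..T."""
    return sum(len(PERCEPTS) ** t for t in range(1, T + 1))

print(len(PERCEPTS))                 # 4 entries if only the current percept is used
print(len(sequences_of_length(2)))   # 16
print(len(sequences_of_length(3)))   # 64
print(entries_for_lifetime(3))       # 4 + 16 + 64 = 84
```

The count grows as 4 to the power n for sequences of length n, which is why a table-driven agent is impractical for anything but very short lifetimes.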
REFLEX-VACUUM-AGENT (Figure 2.8)
- Only responds to current percept (location and status) ignoring percept history.
- Uses condition-action rules rather than table.
- if condition then return action
- if status = Dirty then return Suck
- Sensors() - Function to sense current location and status of environment (i.e. location of agent and status of square).
- Actuators( action ) - Function to affect current environment location by some action (i.e. Suck, Left, Right, NoOp).
function REFLEX-VACUUM-AGENT( [location, status] ) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left

def REFLEX_VACUUM_AGENT(percept):    # Determine action
    (location, status) = percept
    if status == 'Dirty': return 'Suck'
    elif location == A: return 'Right'
    elif location == B: return 'Left'

# Figure 2.8 page 46
A = 'A'
B = 'B'
Environment = { A:'Dirty', B:'Dirty', 'Current': A }

def REFLEX_VACUUM_AGENT(percept):    # Determine action
    (location, status) = percept     # Unpack here; tuple parameters are Python 2 only
    if status == 'Dirty': return 'Suck'
    if location == A: return 'Right'
    if location == B: return 'Left'

def Sensors():    # Sense Environment
    location = Environment['Current']
    return (location, Environment[location])

def Actuators(action):    # Modify Environment
    location = Environment['Current']
    if action == 'Suck': Environment[location] = 'Clean'
    elif action == 'Right' and location == A: Environment['Current'] = B
    elif action == 'Left' and location == B: Environment['Current'] = A

def run(n):    # run the agent through n steps
    print('\tCurrent\t\t\t\tNew')
    print('location\tstatus\taction\tlocation\tstatus')
    for i in range(n):
        (location, status) = Sensors()    # Sense Environment before action
        line = location + '\t\t' + status + '\t'
        action = REFLEX_VACUUM_AGENT(Sensors())
        Actuators(action)
        (location, status) = Sensors()    # Sense Environment after action
        print(line + action + '\t' + location + '\t\t' + status)

Try
- Copy and paste the above program into a Python editor.
- Save as f2-8.py
- Run the module:
- Press F5
- Enter: run(10)
- Should bogus actions be able to corrupt the environment?
- Change REFLEX_VACUUM_AGENT to return bogus actions, such as Left when it should go Right. Run the agent. Do the Actuators allow bogus actions?
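One possible way to make the handling of bogus actions explicit is to validate the action before touching the environment. The following is a minimal sketch, not the program above: `checked_actuators` and `VALID_ACTIONS` are hypothetical names, and raising `ValueError` for an unknown action is a design assumption.

```python
VALID_ACTIONS = ('Suck', 'Left', 'Right', 'NoOp')

def checked_actuators(environment, action):
    """Return a new environment after `action`, rejecting unknown actions."""
    if action not in VALID_ACTIONS:
        raise ValueError('unknown action: %r' % (action,))
    new_env = dict(environment)          # leave the original untouched
    location = new_env['Current']
    if action == 'Suck':
        new_env[location] = 'Clean'
    elif action == 'Right' and location == 'A':
        new_env['Current'] = 'B'
    elif action == 'Left' and location == 'B':
        new_env['Current'] = 'A'
    # A legal action that makes no sense here (e.g. Left at A) changes nothing.
    return new_env

env = {'A': 'Dirty', 'B': 'Dirty', 'Current': 'A'}
print(checked_actuators(env, 'Left') == env)   # True: Left at A has no effect
try:
    checked_actuators(env, 'Crash')
except ValueError as error:
    print(error)
```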
Simple reflex agents
- Only responds to current percept (location and status) ignoring percept history.
- Uses condition-action rules rather than a table of percept sequences.
- Rules are looked up rather than executed as if-then statements.
- Any knowledge is pre-defined in the condition-action rules.
SIMPLE-REFLEX-AGENT (Figure 2.9)
function SIMPLE-REFLEX-AGENT( percept ) returns an action
    static: rules, a set of condition-action rules
    state = INTERPRET-INPUT( percept )
    rule = RULE-MATCH( state, rules )
    action = RULE-ACTION[ rule ]
    return action

def SIMPLE_REFLEX_AGENT(percept):    # Determine action
    state = INTERPRET_INPUT(percept)
    rule = RULE_MATCH(state, rules)
    action = RULE_ACTION[rule]
    return action

Condition-action
- rules = { (A,'Dirty'):1, (B,'Dirty'):1, (A,'Clean'):2, (B,'Clean'):3, (A, B, 'Clean'):4 }
Defines rule for each condition, such as: condition == (A,'Dirty') uses rule 1.
- RULE_ACTION = { 1:'Suck', 2:'Right', 3:'Left', 4:'NoOp' }
Defines action for each rule, such as: rule 1 produces action 'Suck'
# Figure 2.9 page 47
A = 'A'
B = 'B'
RULE_ACTION = { 1:'Suck', 2:'Right', 3:'Left', 4:'NoOp' }
rules = { (A,'Dirty'):1, (B,'Dirty'):1, (A,'Clean'):2, (B,'Clean'):3, (A, B, 'Clean'):4 }
# Ex. rule (if location == A && Dirty then rule 1)

def INTERPRET_INPUT(input):    # No interpretation
    return input

def RULE_MATCH(state, rules):    # Match rule for a given state
    rule = rules.get(tuple(state))
    return rule

def SIMPLE_REFLEX_AGENT(percept):    # Determine action
    state = INTERPRET_INPUT(percept)
    rule = RULE_MATCH(state, rules)
    action = RULE_ACTION[rule]
    return action

# ----------------------------- Same below this line --------------------------------
Environment = { A:'Dirty', B:'Dirty', 'Current': A }

def Sensors():    # Sense Environment
    location = Environment['Current']
    return (location, Environment[location])

def Actuators(action):    # Modify Environment
    location = Environment['Current']
    if action == 'Suck': Environment[location] = 'Clean'
    elif action == 'Right' and location == A: Environment['Current'] = B
    elif action == 'Left' and location == B: Environment['Current'] = A

def run(n):    # run the agent through n steps
    print('\tCurrent\t\t\t\tNew')
    print('location\tstatus\taction\tlocation\tstatus')
    for i in range(n):
        (location, status) = Sensors()    # Sense Environment before action
        line = location + '\t\t' + status + '\t'
        action = SIMPLE_REFLEX_AGENT(Sensors())
        Actuators(action)
        (location, status) = Sensors()    # Sense Environment after action
        print(line + action + '\t' + location + '\t\t' + status)

Try
- Copy and paste the above program into a Python editor.
- Save as f2-9.py
- Run the module:
- Press F5
- Enter: run(10)
- Change the SIMPLE_REFLEX_AGENT condition-action rules to return bogus actions, such as Left when it should go Right, or Crash. Rerun the agent. Do the Actuators allow bogus actions?
REFLEX-AGENT-WITH-STATE (Figure 2.11)
The reflex agent responded only to the current percept, with no history or knowledge.
Model-based reflex agents
- Maintain internal state that depends upon percept history.
- Agent has a model of how the world works.
- The model requires two types of information to update the internal state:
- How environment evolves independent of the agent (e.g. Clean square stays clean)
- How agent's actions affect the environment (e.g. Suck cleans square)
function REFLEX-AGENT-WITH-STATE( percept ) returns an action
    static: state, a description of the current world state
            rules, a set of condition-action rules
            action, the most recent action, initially none
    state = UPDATE-STATE( state, action, percept )
    rule = RULE-MATCH( state, rules )
    action = RULE-ACTION[ rule ]
    return action

def REFLEX_AGENT_WITH_STATE(percept):
    global state, action
    state = UPDATE_STATE(state, action, percept)
    rule = RULE_MATCH(state, rules)
    action = RULE_ACTION[rule]
    return action

Model - Used to update history.
- History initially empty:
model = {A: None, B: None}
- Model only used to change state when A == B == 'Clean'
if model[A] == model[B] == 'Clean' : state = (A, B, 'Clean')
# Figure 2.11 page 49
A = 'A'
B = 'B'
state = {}
action = None
model = {A: None, B: None}    # Initially ignorant

def UPDATE_STATE(state, action, percept):
    (location, status) = percept
    state = percept
    if model[A] == model[B] == 'Clean':
        state = (A, B, 'Clean')    # Model consulted only for A and B Clean
    model[location] = status       # Update the model state
    return state

def REFLEX_AGENT_WITH_STATE(percept):
    global state, action
    state = UPDATE_STATE(state, action, percept)
    rule = RULE_MATCH(state, rules)
    action = RULE_ACTION[rule]
    return action

# ----------------------------- Same below this line --------------------------------
Environment = { A:'Dirty', B:'Dirty', 'Current': A }
RULE_ACTION = { 1:'Suck', 2:'Right', 3:'Left', 4:'NoOp' }
rules = { (A,'Dirty'):1, (B,'Dirty'):1, (A,'Clean'):2, (B,'Clean'):3, (A, B, 'Clean'):4 }

def RULE_MATCH(state, rules):    # Match rule for a given state
    rule = rules.get(tuple(state))
    return rule

def Sensors():    # Sense Environment
    location = Environment['Current']
    return (location, Environment[location])

def Actuators(action):    # Modify Environment
    location = Environment['Current']
    if action == 'Suck': Environment[location] = 'Clean'
    elif action == 'Right' and location == A: Environment['Current'] = B
    elif action == 'Left' and location == B: Environment['Current'] = A

def run(n):    # run the agent through n steps
    print('\tCurrent\t\t\t\tNew')
    print('location\tstatus\taction\tlocation\tstatus')
    for i in range(n):
        (location, status) = Sensors()    # Sense Environment before action
        line = location + '\t\t' + status + '\t'
        action = REFLEX_AGENT_WITH_STATE(Sensors())
        Actuators(action)
        (location, status) = Sensors()    # Sense Environment after action
        print(line + action + '\t' + location + '\t\t' + status)

Consider
- What changes are necessary to add a C square?
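One possible direction, offered only as a sketch, is to stop hard-coding A and B and drive both the model and the movement rules from an ordered list of squares. The names `make_agent` and `run_squares` are hypothetical, and the "move toward the nearest square not yet known to be Clean" rule is an assumption rather than the approach of the program above.

```python
def make_agent(squares):
    """Build a model-based reflex agent for squares laid out left to right."""
    model = {square: None for square in squares}   # None = not yet seen

    def agent(percept):
        location, status = percept
        model[location] = status                   # update the model first
        if status == 'Dirty':
            return 'Suck'
        if all(s == 'Clean' for s in model.values()):
            return 'NoOp'
        here = squares.index(location)
        todo = [i for i, sq in enumerate(squares) if model[sq] != 'Clean']
        target = min(todo, key=lambda i: abs(i - here))
        return 'Right' if target > here else 'Left'

    return agent

def run_squares(squares, steps):
    """Simulate the agent, starting at the leftmost square with all Dirty."""
    world = {sq: 'Dirty' for sq in squares}
    location = squares[0]
    agent = make_agent(squares)
    actions = []
    for _ in range(steps):
        action = agent((location, world[location]))
        i = squares.index(location)
        if action == 'Suck':
            world[location] = 'Clean'
        elif action == 'Right' and i < len(squares) - 1:
            location = squares[i + 1]
        elif action == 'Left' and i > 0:
            location = squares[i - 1]
        actions.append(action)
    return actions, world

actions, world = run_squares(['A', 'B', 'C'], 7)
print(actions)   # ['Suck', 'Right', 'Suck', 'Right', 'Suck', 'NoOp', 'NoOp']
print(world)     # {'A': 'Clean', 'B': 'Clean', 'C': 'Clean'}
```

With this layout, adding a square only changes the list passed in; the rules, the model and the actuator logic stay the same.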
Goal-based agents
MODEL-BASED, GOAL-BASED AGENT Figure 2.13
- Incorporates goals in determining action. Search and planning, in later chapters, serve to find action sequences that achieve agent's goals.
- Differs from condition-action rules in that it considers future goals.
- Condition-action rules can define a Suck when Dirty rule that the agent blindly follows.
- A goal-based agent considers how to achieve the goal of a clean environment.
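To illustrate the difference, the sketch below has the agent search for an action sequence that reaches the goal of a clean environment instead of following fixed rules. It uses breadth-first search, which is one simple choice and not necessarily the method of later chapters; the state encoding (location, status of A, status of B) and the function names are assumptions.

```python
from collections import deque

ACTIONS = ('Suck', 'Left', 'Right')

def result(state, action):
    """Model of how an action changes a (location, status_A, status_B) state."""
    location, a, b = state
    if action == 'Suck':
        return (location, 'Clean', b) if location == 'A' else (location, a, 'Clean')
    if action == 'Right':
        return ('B', a, b)
    return ('A', a, b)                       # Left

def is_goal(state):
    """Goal: both squares are Clean."""
    return state[1] == 'Clean' and state[2] == 'Clean'

def plan(start):
    """Breadth-first search for a shortest action sequence to the goal."""
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        state, actions = frontier.popleft()
        if is_goal(state):
            return actions
        for action in ACTIONS:
            nxt = result(state, action)
            if nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, actions + [action]))
    return None                              # goal unreachable

print(plan(('A', 'Dirty', 'Dirty')))   # ['Suck', 'Right', 'Suck']
print(plan(('B', 'Dirty', 'Clean')))   # ['Left', 'Suck']
print(plan(('A', 'Clean', 'Clean')))   # []
```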
Utility-based agents
MODEL-BASED, UTILITY-BASED AGENT Figure 2.14
- A goal alone is usually not enough.
- A utility function measures the degree of success by mapping a state to a real number; for example, cleaning the environment in the fewest moves.
- A utility function allows rational decisions in two kinds of cases where goals are inadequate:
- Conflicting goals; the utility function specifies the appropriate tradeoff.
- Several goals, none of which is certain to be achieved; the agent can weigh likelihood of success against importance of the goals.
- The agent chooses the action with the best expected utility, that is, the utility of each possible outcome weighted by the probability of that outcome.
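The expected-utility rule can be written out in a few lines. This is a sketch only: the probabilities and utility values below are invented for illustration and do not come from the notes.

```python
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def best_action(action_outcomes):
    """Return the action whose outcomes have the highest expected utility."""
    return max(action_outcomes,
               key=lambda action: expected_utility(action_outcomes[action]))

# Hypothetical numbers: Suck succeeds only half the time but pays off well.
action_outcomes = {
    'Suck':  [(0.5, 10), (0.5, 2)],   # expected utility 6.0
    'Right': [(1.0, 4)],              # expected utility 4.0
    'NoOp':  [(1.0, 0)],              # expected utility 0.0
}

print(expected_utility(action_outcomes['Suck']))   # 6.0
print(best_action(action_outcomes))                # Suck
```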
LEARNING AGENT Figure 2.15
- Only for deterministic problems can a solution be completely and adequately specified in advance.
- A learning agent can be divided into 4 components:
- learning element - makes improvements to the performance element based on critic feedback
- performance element - selects actions
- critic - gives the learning element feedback on how well the agent is achieving some fixed (external) performance standard
- problem generator - suggests new actions to the performance element based on observations from the learning element. For example, the vacuum agent might try a Left rather than a Suck action to determine the effect on performance.
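The four components can be sketched as a small trial-and-error loop. Everything here is an illustrative assumption rather than the design of Figure 2.15: the performance element is a rule table, the problem generator proposes single-rule changes, the critic scores a rule table with a fixed standard (+1 per clean square per step, over runs starting at A and at B), and the learning element keeps a change only if the critic's score improves.

```python
ACTIONS = ('Suck', 'Left', 'Right', 'NoOp')
PERCEPTS = [('A', 'Clean'), ('A', 'Dirty'), ('B', 'Clean'), ('B', 'Dirty')]

def run_once(rules, start, steps):
    """Performance element in action: follow `rules` and score the run."""
    world = {'A': 'Dirty', 'B': 'Dirty'}
    location = start
    score = 0
    for _ in range(steps):
        action = rules[(location, world[location])]
        if action == 'Suck':
            world[location] = 'Clean'
        elif action == 'Right':
            location = 'B'
        elif action == 'Left':
            location = 'A'
        score += sum(1 for s in world.values() if s == 'Clean')
    return score

def critic(rules, steps=8):
    """Fixed external standard: total score over both starting squares."""
    return sum(run_once(rules, start, steps) for start in ('A', 'B'))

def problem_generator(rules):
    """Suggest experiments: every rule table that differs in one entry."""
    for percept in PERCEPTS:
        for action in ACTIONS:
            if rules[percept] != action:
                trial = dict(rules)
                trial[percept] = action
                yield trial

def learn(rules):
    """Learning element: adopt a suggested change when the critic prefers it."""
    best, best_score = dict(rules), critic(rules)
    improved = True
    while improved:
        improved = False
        for trial in problem_generator(best):
            score = critic(trial)
            if score > best_score:
                best, best_score = trial, score
                improved = True
                break
    return best, best_score

initial_rules = {percept: 'NoOp' for percept in PERCEPTS}   # knows nothing
learned_rules, learned_score = learn(initial_rules)
print(learned_rules)
print(learned_score)
```

Starting from a table that always returns NoOp, this loop should arrive at the same rules as REFLEX-VACUUM-AGENT. A one-change-at-a-time search like this can get stuck when two rules must change together, which is why the critic here scores runs from both starting squares.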