Example Learning System


Example Learning System - FOO

Learning the game of hearts

FOO (First Operational Operationaliser) tries to convert high-level advice (principles, problems, methods) into effective, executable (LISP) procedures.

Hearts:

Game played as a series of tricks.
One player - who has the lead - plays a card.
Other players follow in turn and play a card.
- The player must follow suit.
- If he cannot, he may play any of his cards.
The player who plays the highest-value card in the suit led wins the trick and the lead.
The winning player takes the cards played in the trick.
The aim is to avoid taking points. Each heart counts as one point; the queen of spades is worth 13 points.
The winner is the player who has the lowest points score after all tricks have been played.
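
The trick and scoring rules above can be sketched in a few lines of Python. This is only an illustration and not part of FOO: the (rank, suit) card encoding and the function names are assumptions made here, and the winner is taken to be the highest card of the suit led, as in the standard rules.

```python
# Hypothetical card encoding: (rank, suit), rank 2..14 (ace high), suit in "SHDC".
QUEEN_OF_SPADES = (12, "S")

def card_points(card):
    """Points a single card is worth: 1 per heart, 13 for the queen of spades."""
    rank, suit = card
    if suit == "H":
        return 1
    if card == QUEEN_OF_SPADES:
        return 13
    return 0

def trick_winner(plays):
    """plays: list of (player, card) in play order; the first entry is the lead.
    Assumes the highest card of the suit led wins the trick."""
    suit_led = plays[0][1][1]
    following = [(card[0], player) for player, card in plays if card[1] == suit_led]
    return max(following)[1]

def trick_points(plays):
    """Total points the trick winner takes with this trick."""
    return sum(card_points(card) for _, card in plays)

plays = [("north", (10, "S")), ("east", (12, "S")),
         ("south", (14, "H")), ("west", (3, "S"))]
print(trick_winner(plays), trick_points(plays))
```

Here south is void in spades and discards a heart, so the player who wins the spade trick takes both the queen of spades and the heart.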

Hearts is a game of partial information with no known algorithm for winning.

Although the possible situations are numerous, general advice can be given, such as:

Avoid taking points.
Do not lead a high card in a suit in which an opponent is void.
If an opponent has the queen of spades, try to flush it.

Before FOO can receive a piece of advice, a human must convert it into FOO's representation (a LISP clause). For example, "Avoid taking points" becomes:

(avoid (take-points me) (trick))

FOO operationalises the advice by translating it into expressions it can use in the game. It can UNFOLD avoid and then trick to give:

(achieve (not (during
               (scenario
                 (each p1 (players) (play-card p1))
                 (take-trick (trick-winner)))
               (take-points me))))
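
One way to picture UNFOLD is as the replacement of a concept by its definition. The sketch below is written under that assumption: expressions are nested Python tuples, and the two definitions in DEFINITIONS are reconstructed here from the before and after expressions in the text, not taken from FOO's actual knowledge base. All names are hypothetical.

```python
# Hypothetical definitions, reconstructed from the expressions in the text.
# Each maps a concept name to (parameter names, body template).
DEFINITIONS = {
    "avoid": (("event", "episode"),
              ("achieve", ("not", ("during", "episode", "event")))),
    "trick": ((),
              ("scenario",
               ("each", "p1", ("players",), ("play-card", "p1")),
               ("take-trick", ("trick-winner",)))),
}

def substitute(template, bindings):
    """Copy a body template, replacing parameter names by their bound values."""
    if isinstance(template, tuple):
        return tuple(substitute(t, bindings) for t in template)
    return bindings.get(template, template)

def unfold(expr, name):
    """Replace every use of the concept `name` in expr by its definition."""
    if not isinstance(expr, tuple):
        return expr
    expr = tuple(unfold(e, name) for e in expr)
    if expr and expr[0] == name:
        params, body = DEFINITIONS[name]
        return substitute(body, dict(zip(params, expr[1:])))
    return expr

def show(expr):
    """Render a nested tuple in LISP-style notation."""
    if isinstance(expr, tuple):
        return "(" + " ".join(show(e) for e in expr) + ")"
    return expr

advice = ("avoid", ("take-points", "me"), ("trick",))
step1 = unfold(advice, "avoid")   # expand avoid first
step2 = unfold(step1, "trick")    # then expand trick
print(show(step2))
```

Printing step2 gives the unfolded expression as a single line of LISP-style text, which can be compared with the expression shown above.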

However, the advice is still not operational, since it depends on the outcome of the trick, which is generally not known in advance. FOO therefore uses case analysis (on the during expression) to determine which steps could cause one to take points. Step 1 (playing the cards) is ruled out, and step 2's take-points is UNFOLDED:

(achieve (not (exists c1 (cards-played)
                (exists c2 (point-cards)
                  (during (take (trick-winner) c1)
                          (take me c2))))))

FOO now has to decide: under what conditions does (take me c2) occur during (take (trick-winner) c1)?

A technique called partial matching hypothesises that points will be taken if me = trick-winner and c2 = c1. The expression can then be reduced to:

(achieve (not (and (have-points (card-played))
                   (= (trick-winner) me))))
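
The idea behind partial matching can be sketched as follows: compare the two expressions position by position and, wherever they differ, hypothesise that the differing parts are equal. This is a naive illustration with a hypothetical function name and tuple encoding, not FOO's actual matcher.

```python
def partial_match(a, b):
    """Return the equalities that would make expressions a and b identical.

    Expressions are nested tuples whose first element is the operator.
    """
    if a == b:
        return []
    both_compound = isinstance(a, tuple) and isinstance(b, tuple)
    if both_compound and len(a) == len(b) and a[0] == b[0]:
        # Same operator and arity: match the arguments pairwise.
        constraints = []
        for x, y in zip(a[1:], b[1:]):
            constraints.extend(partial_match(x, y))
        return constraints
    # Otherwise hypothesise that the two differing sub-expressions are equal.
    return [("=", a, b)]

winner_takes = ("take", ("trick-winner",), "c1")
i_take = ("take", "me", "c2")
print(partial_match(winner_takes, i_take))
```

For the two take expressions this yields one equality between (trick-winner) and me and one between c1 and c2, which are the two hypotheses named in the text.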

This is not quite enough, as it only means "Do not win a trick that has points". We do not know in advance who the trick-winner will be, and we have said nothing about how to play when following suit in a trick that may contain points. After a few more steps to address this, FOO comes up with:

(achieve (=> (and (in-suit-led (card-of me))
                  (possible (trick-has-points)))
             (low (card-of me))))
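
Read as "if my card is in the suit led and the trick may contain points, then my card should be low", this final advice can be checked for a single card. The sketch below assumes that reading; the (rank, suit) encoding, the function names and the threshold for "low" are all hypothetical, since the text does not say how FOO defines low.

```python
LOW_RANK_LIMIT = 6  # hypothetical cut-off: ranks 2..6 count as "low"

def is_low(card):
    """card is (rank, suit), with rank 2..14 (ace high)."""
    return card[0] <= LOW_RANK_LIMIT

def obeys_advice(card, suit_led, trick_may_have_points):
    """True if playing `card` is consistent with the operational advice."""
    premise = card[1] == suit_led and trick_may_have_points
    # An implication holds when its premise is false or its conclusion is true.
    return (not premise) or is_low(card)

hand = [(3, "S"), (13, "S"), (9, "D")]
acceptable = [c for c in hand if obeys_advice(c, "S", True)]
print(acceptable)
```

The check says nothing about whether a card is a legal play; it only filters out cards that would go against the advice.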

FOO had an initial knowledge base that was made up of:

Basic domain concepts such as trick, hand, deck, suits, avoid, win, etc.
Rules and behavioural constraints -- general rules of the game.
Heuristics as to how to UNFOLD.

FOO has two basic shortcomings:

It lacks a control structure that could apply operationalisation automatically.
It is specific to hearts and similar tasks.


dave@cs.cf.ac.uk