
### Example Learning System - FOO

Learning the game of hearts

FOO (First Operational Operationaliser) tries to convert high-level advice (principles, problems, methods) into effective, executable (LISP) procedures.

Hearts:

• Game played as a series of tricks.
• One player, who has the lead, plays a card.
• The other players follow in turn and each plays a card.
• Each player must follow suit if possible.
• If he cannot, he may play any of his cards.
• The player who plays the highest-value card in the suit led wins the trick and the lead.
• The winning player takes the cards played in the trick.
• The aim is to avoid taking points. Each heart counts as one point, and the queen of spades is worth 13 points.
• The winner is the player who has the lowest points score after all tricks have been played.
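
The trick and scoring rules above can be sketched in a few lines of Python. This is only an illustrative model, not part of FOO: the card representation, the function names and the ace-high rank order are assumptions made for the example.

```python
from typing import List, Tuple

# Assumed representation: a card is (rank, suit), e.g. ("Q", "spades").
Card = Tuple[str, str]

# Assumed rank order, lowest to highest (ace high).
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]


def card_points(card: Card) -> int:
    """Each heart is worth 1 point, the queen of spades 13, anything else 0."""
    if card[1] == "hearts":
        return 1
    if card == ("Q", "spades"):
        return 13
    return 0


def trick_winner(plays: List[Tuple[str, Card]]) -> str:
    """Return the player of the highest card in the suit led.

    `plays` lists (player, card) pairs in playing order; the first is the lead.
    """
    suit_led = plays[0][1][1]
    in_suit = [(player, card) for player, card in plays if card[1] == suit_led]
    winner, _ = max(in_suit, key=lambda pc: RANKS.index(pc[1][0]))
    return winner


def trick_points(plays: List[Tuple[str, Card]]) -> int:
    """Total points the winner of this trick would take."""
    return sum(card_points(card) for _, card in plays)


# South cannot follow clubs and discards the queen of spades on east's king.
plays = [
    ("north", ("10", "clubs")),
    ("east", ("K", "clubs")),
    ("south", ("Q", "spades")),
    ("west", ("3", "clubs")),
]
print(trick_winner(plays), trick_points(plays))  # east 13
```

Here east wins the trick with the highest club and so takes the 13 points that south discarded.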

Hearts is a game of partial information with no known algorithm for winning.

Although the possible situations are numerous, general advice can be given, such as:

• Avoid taking points.
• Do not lead a high card in a suit in which an opponent is void.
• If an opponent has the queen of spades, try to flush it out.

Before FOO can receive advice, a human must convert it into FOO's representation (a LISP clause). For example, "Avoid taking points" becomes:

`(avoid (take-points me) (trick))`

FOO operationalises the advice by translating it into expressions it can use in the game. It can UNFOLD `avoid` and then `trick` to give:

```
(achieve (not (during
                (scenario
                  (each p1 (players) (play-card p1))
                  (take-trick (trick-winner)))
                (take-points me))))
```
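
One way to picture UNFOLD is as replacing a term by the body of its definition. The sketch below is a hypothetical illustration, not FOO's implementation: expressions are modelled as nested Python tuples, and the two entries in `DEFINITIONS` are assumptions inferred from the advice before and after unfolding.

```python
def substitute(body, bindings):
    """Replace parameter names in `body` by the expressions bound to them."""
    if isinstance(body, tuple):
        return tuple(substitute(part, bindings) for part in body)
    return bindings.get(body, body)


def unfold(expr, name, definitions):
    """Replace every call to `name` in `expr` by the body of its definition."""
    if not isinstance(expr, tuple):
        return expr  # atoms such as "me" or "p1" are left unchanged
    args = tuple(unfold(arg, name, definitions) for arg in expr[1:])
    if expr[0] == name:
        params, body = definitions[name]
        return substitute(body, dict(zip(params, args)))
    return (expr[0],) + args


# Assumed definitions: name -> (parameter names, body).
DEFINITIONS = {
    "avoid": (
        ("?event", "?situation"),
        ("achieve", ("not", ("during", "?situation", "?event"))),
    ),
    "trick": (
        (),
        ("scenario",
         ("each", "p1", ("players",), ("play-card", "p1")),
         ("take-trick", ("trick-winner",))),
    ),
}

advice = ("avoid", ("take-points", "me"), ("trick",))
step1 = unfold(advice, "avoid", DEFINITIONS)
step2 = unfold(step1, "trick", DEFINITIONS)
print(step2)
```

After the first call the advice reads `(achieve (not (during (trick) (take-points me))))`; the second call expands `(trick)` into the two-step scenario.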

However, the advice is still not operational, since it depends on the outcome of the trick, which is generally not known. Therefore FOO uses case analysis (on the `during` expression) to determine which steps could cause one to take points. Step 1 is ruled out and step 2's `take-points` is UNFOLDED:

```
(achieve (not (exists c1 (cards-played)
                (exists c2 (point-cards)
                  (during (take (trick-winner) c1)
                          (take me c2))))))
```

FOO now has to decide: under what conditions does `(take me c2)` occur during `(take (trick-winner) c1)`?

A technique called partial matching hypothesises that points will be taken if `me = trick-winner` and `c2 = c1`. We can reduce our expression to:

```
(achieve (not (and (have-points (cards-played))
                   (= (trick-winner) me))))
```
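
The partial-matching step can be illustrated with a small Python sketch. It is a simplification under stated assumptions, not FOO's actual procedure: expressions are nested tuples, and the hypothetical `partial_match` function simply collects the equalities that would make two expressions with the same head identical.

```python
def partial_match(a, b):
    """Return the equalities needed to make `a` and `b` identical.

    Returns [] if they are already identical, and None if they cannot be
    lined up argument by argument (different heads or lengths).
    """
    if a == b:
        return []
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        constraints = []
        for x, y in zip(a[1:], b[1:]):
            sub = partial_match(x, y)
            if sub is None:
                # These arguments differ: hypothesise that they are equal.
                constraints.append(("=", x, y))
            else:
                constraints.extend(sub)
        return constraints
    return None


event = ("take", "me", "c2")
step = ("take", ("trick-winner",), "c1")
print(partial_match(event, step))
# [('=', 'me', ('trick-winner',)), ('=', 'c2', 'c1')]
```

The two equalities produced correspond to the hypotheses `me = trick-winner` and `c2 = c1`.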

This is not quite enough, as it only says "Do not win a trick that has points". We do not know in advance who the `trick-winner` will be, and we have not yet said anything about how to play a card in the suit led when the trick has points. After a few more steps, FOO comes up with:

```
(achieve (=> (and (in-suit-led (card-of me))
                  (possible (trick-has-points)))
             (low (card-of me))))
```
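
Read as a playing rule, the final expression says: when following the suit led in a trick that may contain points, play a low card. A minimal Python sketch of that rule follows. The function name, the card representation and the reading of "low" as "lowest card held in the suit led" are all assumptions made for illustration; FOO itself produces LISP procedures.

```python
from typing import List, Optional, Tuple

# Assumed representation: a card is (rank, suit), ranks ordered ace high.
Card = Tuple[str, str]
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]


def choose_card(hand: List[Card], suit_led: str,
                trick_may_have_points: bool) -> Optional[Card]:
    """Apply the operationalised advice; return None when it does not apply."""
    in_suit = [card for card in hand if card[1] == suit_led]
    if not in_suit or not trick_may_have_points:
        return None  # the advice gives no guidance in this situation
    # "low" is interpreted here as the lowest card we hold in the suit led.
    return min(in_suit, key=lambda card: RANKS.index(card[0]))


hand = [("K", "clubs"), ("4", "clubs"), ("9", "hearts")]
print(choose_card(hand, "clubs", True))     # ('4', 'clubs')
print(choose_card(hand, "diamonds", True))  # None
```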

FOO had an initial knowledge base that was made up of:

• Basic domain concepts such as trick, hand, deck, suits, avoid, win, etc.
• Rules and behavioural constraints: the general rules of the game.
• Heuristics as to how to UNFOLD.

FOO has two basic shortcomings:

• It lacks a control structure that could apply operationalisation automatically.
• It is specific to hearts and similar tasks.


dave@cs.cf.ac.uk