Quinlan introduced the idea of decision trees in his ID3 system (1986).
ID3 is a program that builds decision trees automatically from given positive and
negative instances.
Basically, each leaf of a decision tree asserts a positive or negative
concept. To classify a particular input, we start at the top and follow
the assertions down until we reach an answer (Fig. 28).
Fig. 28 Edible Mushroom decision tree
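The top-down walk can be sketched as follows. This is a minimal illustration: the nested-dict representation and the mushroom attributes (colour, spots) are invented for the example, not taken from Fig. 28.

```python
# A decision tree as nested dicts: internal nodes test an attribute,
# leaves assert a concept (here "edible" or "poisonous").
tree = {
    "attribute": "colour",
    "branches": {
        "red": "poisonous",                       # leaf node
        "brown": {
            "attribute": "spots",
            "branches": {"yes": "poisonous", "no": "edible"},
        },
    },
}

def classify(tree, instance):
    """Start at the top and follow assertions down until a leaf is reached."""
    while isinstance(tree, dict):
        value = instance[tree["attribute"]]       # test the node's attribute
        tree = tree["branches"][value]            # follow the matching branch
    return tree                                   # a leaf: the concept label

print(classify(tree, {"colour": "brown", "spots": "no"}))  # prints "edible"
```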
Building decision trees
- ID3 uses an iterative method.
- Simple trees are preferred, since they tend to classify more accurately.
- A random subset of samples from the training set -- the window -- is chosen
for the initial assembly of the tree.
- The remaining training examples are used to test the tree.
- If all examples are classified correctly, stop.
- Otherwise, add a number of the misclassified training examples to the window
and start again.
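The windowing loop above can be sketched as below. The function names and the stand-in tree builder (a simple lookup table rather than real tree induction) are illustrative assumptions, not ID3's actual code.

```python
import random

def id3_with_window(examples, window_size, grow_by, build_tree, classify):
    """Sketch of ID3's iterative method: build a tree from a random window,
    test it against all training examples, and grow the window with
    misclassified examples until every example is classified correctly."""
    window = random.sample(examples, window_size)
    while True:
        tree = build_tree(window)
        wrong = [e for e in examples if classify(tree, e) != e["label"]]
        if not wrong:                      # all examples correct: stop
            return tree
        window = window + wrong[:grow_by]  # enlarge the window, start again

# Stand-in builder for demonstration: a lookup table on one attribute.
build = lambda window: {e["colour"]: e["label"] for e in window}
lookup_classify = lambda tree, e: tree.get(e["colour"])

examples = [{"colour": c, "label": l}
            for c, l in [("red", "poisonous"), ("brown", "edible"),
                         ("white", "edible"), ("green", "poisonous")]]
final_tree = id3_with_window(examples, 1, 1, build, lookup_classify)
```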
Adding new nodes
When assembling the tree we need to choose when to add a new node:
- Some attributes will yield more information than others.
- Adding a new node might be useless in the overall classification process.
- Sometimes an attribute will separate the training instances into subsets whose
members all share a common label. Here branching can be terminated and a leaf node
assigned for the whole subset.
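ID3 makes the "which attribute yields more information" choice precise with information gain: the reduction in entropy obtained by splitting on an attribute. A sketch, using the standard formulation with illustrative dict-based examples:

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of labels, in bits."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def information_gain(examples, attribute):
    """Entropy before the split minus the weighted entropy of the subsets
    the attribute produces; ID3 adds the node with the highest gain."""
    before = entropy([e["label"] for e in examples])
    subsets = {}
    for e in examples:
        subsets.setdefault(e[attribute], []).append(e["label"])
    after = sum(len(s) / len(examples) * entropy(s)
                for s in subsets.values())
    return before - after

# "spots" separates these instances into pure subsets, so each subset
# becomes a leaf and the gain equals the full initial entropy (1 bit).
examples = [{"spots": "yes", "label": "poisonous"},
            {"spots": "yes", "label": "poisonous"},
            {"spots": "no", "label": "edible"},
            {"spots": "no", "label": "edible"}]
gain = information_gain(examples, "spots")
```

A subset whose members share a common label has zero entropy, which is exactly the condition for terminating the branch with a leaf.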
Decision tree advantages:
- Quicker to train than version spaces when the concept space is large.
- Disjunctive concepts are easier to represent.
Disadvantages:
- The representation is not natural to humans -- it may be hard to explain why
a decision tree makes a particular classification.
dave@cs.cf.ac.uk