Quinlan introduced the idea of decision trees in his ID3 system (1986).
ID3 is a program that builds decision trees automatically from given positive and
negative instances.
Basically, each leaf of a decision tree asserts a positive or negative
concept. To classify a particular input, we start at the top and follow
the assertions down until we reach an answer (Fig. 28).
Fig. 28 Edible Mushroom decision tree
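The top-down walk can be sketched as follows. This is a minimal illustration: the nested-dict representation and the mushroom attributes (colour, spots) are invented for the example, not taken from Fig. 28.

```python
# A decision tree as nested dicts: internal nodes test an attribute,
# leaves assert a concept (here "edible" or "poisonous").
tree = {
    "attribute": "colour",
    "branches": {
        "red": "poisonous",                       # leaf node
        "brown": {
            "attribute": "spots",
            "branches": {"yes": "poisonous", "no": "edible"},
        },
    },
}

def classify(tree, instance):
    """Start at the top and follow assertions down until a leaf is reached."""
    while isinstance(tree, dict):
        value = instance[tree["attribute"]]       # test the node's attribute
        tree = tree["branches"][value]            # follow the matching branch
    return tree                                   # a leaf: the concept label

print(classify(tree, {"colour": "brown", "spots": "no"}))  # prints "edible"
```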
Building decision trees
- ID3 uses an iterative method.
- Simple trees are preferred, since they tend to classify more accurately.
- A random subset of samples from the training set -- the window -- is chosen
for the initial assembly of the tree.
- The remaining training examples are used to test the tree.
- If all examples are classified correctly, stop.
- Otherwise, add a number of the misclassified training examples to the window
and start again.
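The windowing loop above can be sketched as below. The function names and the stand-in tree builder (a simple lookup table rather than real tree induction) are illustrative assumptions, not ID3's actual code.

```python
import random

def id3_with_window(examples, window_size, grow_by, build_tree, classify):
    """Sketch of ID3's iterative method: build a tree from a random window,
    test it against all training examples, and grow the window with
    misclassified examples until every example is classified correctly."""
    window = random.sample(examples, window_size)
    while True:
        tree = build_tree(window)
        wrong = [e for e in examples if classify(tree, e) != e["label"]]
        if not wrong:                      # all examples correct: stop
            return tree
        window = window + wrong[:grow_by]  # enlarge the window, start again

# Stand-in builder for demonstration: a lookup table on one attribute.
build = lambda window: {e["colour"]: e["label"] for e in window}
lookup_classify = lambda tree, e: tree.get(e["colour"])

examples = [{"colour": c, "label": l}
            for c, l in [("red", "poisonous"), ("brown", "edible"),
                         ("white", "edible"), ("green", "poisonous")]]
final_tree = id3_with_window(examples, 1, 1, build, lookup_classify)
```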
Adding new nodes
When assembling the tree we need to choose when to add a new node:
- Some attributes will yield more information than others.
- Adding a new node might be useless in the overall classification process.
- Sometimes an attribute will separate the training instances into subsets whose
members all share a common label. Here branching can be terminated and a leaf node
assigned for the whole subset.
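ID3 makes the "which attribute yields more information" choice precise with information gain: the reduction in entropy obtained by splitting on an attribute. A sketch, using the standard formulation with illustrative dict-based examples:

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of labels, in bits."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())

def information_gain(examples, attribute):
    """Entropy before the split minus the weighted entropy of the subsets
    the attribute produces; ID3 adds the node with the highest gain."""
    before = entropy([e["label"] for e in examples])
    subsets = {}
    for e in examples:
        subsets.setdefault(e[attribute], []).append(e["label"])
    after = sum(len(s) / len(examples) * entropy(s)
                for s in subsets.values())
    return before - after

# "spots" separates these instances into pure subsets, so each subset
# becomes a leaf and the gain equals the full initial entropy (1 bit).
examples = [{"spots": "yes", "label": "poisonous"},
            {"spots": "yes", "label": "poisonous"},
            {"spots": "no", "label": "edible"},
            {"spots": "no", "label": "edible"}]
gain = information_gain(examples, "spots")
```

A subset whose members share a common label has zero entropy, which is exactly the condition for terminating the branch with a leaf.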
Decision tree advantages:
- Quicker to train than version spaces when the concept space is large.
- Disjunctive concepts are easier to represent.
Disadvantages:
- The representation is not natural to humans -- it may be hard to explain why
a decision tree makes a particular classification.
dave@cs.cf.ac.uk