Decision Tree [Knowledge]

A decision tree is a technique for predicting target values from observations.

Motivation

Assume your program has data samples and needs to draw certain conclusions about each of them. This is the case when mining for information in large databases, or trying to find an appropriate behaviour for a particular situation. You can apply decision trees to this problem as long as:

  • Each data sample has attributes that take either discrete or continuous values. For example, a symbol indicating whether it was overcast, and the humidity level.
  • The predictions are also discrete or continuous values. For example, a prediction of whether it will rain, or the estimated temperature.

Decision trees assume that it’s possible to check the attributes of the data samples multiple times, and use this information to refine the predicted value.

Description

A decision tree is a hierarchical data structure in which each internal node (a branch point) represents a condition on the attributes of a sample. This condition typically evaluates to a boolean, with true and false each corresponding to a child node. However, it’s also possible to have more than two child nodes, for example if ranges are used as conditions. The leaves of the tree store the predictions for the target value(s).
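
As a rough illustration of this structure, here is a minimal sketch in Python. The class names (Leaf, Decision), the attribute names, the humidity thresholds and the predictions are all made up for this example; they are assumptions, not part of any particular library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

Sample = Dict[str, object]  # attribute name -> discrete symbol or continuous value


@dataclass
class Leaf:
    prediction: object  # predicted target value stored at the leaf


@dataclass
class Decision:
    # The condition maps a sample to the key of the child node to enter.
    condition: Callable[[Sample], object]
    children: Dict[object, "Tree"]


Tree = Union[Leaf, Decision]


def humidity_band(sample: Sample) -> str:
    """Range condition with three outcomes (thresholds are hypothetical)."""
    humidity = sample["humidity"]
    if humidity < 40:
        return "low"
    return "medium" if humidity < 70 else "high"


# Boolean condition at the root, range condition (three children) below it.
root = Decision(
    condition=lambda s: s["overcast"],
    children={
        True: Decision(
            condition=humidity_band,
            children={
                "low": Leaf("no rain"),
                "medium": Leaf("no rain"),
                "high": Leaf("rain"),
            },
        ),
        False: Leaf("no rain"),
    },
)


def count_leaves(tree: Tree) -> int:
    """Number of stored predictions in the tree."""
    if isinstance(tree, Leaf):
        return 1
    return sum(count_leaves(child) for child in tree.children.values())
```

Keying the children by the outcome of the condition lets the same node type cover both the boolean case and the multi-way case.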

[Figure: Decision Tree]

A simple traversal algorithm is also needed to make predictions. Starting at the root, it evaluates the decision at the current node and enters the matching sub-tree. This step is applied recursively until a leaf node is reached. The process is efficient: very little memory is required to store a tree, and a prediction only needs one simple conditional check per level along a single path from the root to a leaf.
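
The traversal can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree is stored as nested dictionaries, and the keys ("condition", "children", "prediction"), the attribute names and the humidity threshold of 70 are all hypothetical.

```python
def predict(node, sample):
    """Walk from the given node down to a leaf and return its prediction."""
    if "prediction" in node:             # leaf reached: report the stored value
        return node["prediction"]
    outcome = node["condition"](sample)  # evaluate the decision at this node
    return predict(node["children"][outcome], sample)  # enter the matching sub-tree


# Hand-written example tree (all values are made up for illustration).
tree = {
    "condition": lambda s: s["overcast"],
    "children": {
        True: {
            "condition": lambda s: s["humidity"] > 70,
            "children": {
                True: {"prediction": "rain"},
                False: {"prediction": "no rain"},
            },
        },
        False: {"prediction": "no rain"},
    },
}

print(predict(tree, {"overcast": True, "humidity": 85}))   # rain
print(predict(tree, {"overcast": False, "humidity": 85}))  # no rain
```

Each call descends one level, so the number of conditions evaluated is at most the depth of the tree. The recursion could equally be written as a loop.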

Application

Decision trees can be used in almost any classification problem (to predict discrete symbols) or regression problem (to predict continuous values). They are particularly useful in data-mining because of their efficiency. However, decision trees cannot solve all types of problems, because of the assumptions built into the hierarchical model.

In practice, to apply the traversal algorithm successfully, a good decision tree is needed. This can be obtained in multiple ways:

  1. The decision tree may be edited by an expert.
  2. The decision tree can be induced from data samples.

Since expert solutions may take time to develop and tune, supervised learning is often used to induce the tree from labelled data samples instead.
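
To give an idea of what inducing a tree from data samples can look like, here is a minimal sketch of one common greedy approach: an ID3-style procedure that repeatedly splits on the discrete attribute leaving the lowest weighted entropy. The text above does not prescribe a particular learning algorithm, so this choice is an assumption, as are the function names, the attribute names and the tiny data set.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of target symbols."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def best_attribute(samples, labels, attributes):
    """Pick the attribute whose split leaves the lowest weighted entropy."""
    def remaining_entropy(attr):
        groups = {}
        for sample, label in zip(samples, labels):
            groups.setdefault(sample[attr], []).append(label)
        return sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return min(attributes, key=remaining_entropy)


def induce(samples, labels, attributes):
    """Return a nested dict tree; leaves are plain target symbols."""
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]  # leaf: majority symbol
    attr = best_attribute(samples, labels, attributes)
    rest = [a for a in attributes if a != attr]
    tree = {"attribute": attr, "children": {}}
    # One child per attribute value seen in the data, in order of appearance.
    for value in dict.fromkeys(s[attr] for s in samples):
        sub = [(s, l) for s, l in zip(samples, labels) if s[attr] == value]
        tree["children"][value] = induce([s for s, _ in sub], [l for _, l in sub], rest)
    return tree


# Made-up training data: discrete attributes and a discrete target.
samples = [
    {"overcast": "yes", "humid": "high"},
    {"overcast": "yes", "humid": "high"},
    {"overcast": "yes", "humid": "low"},
    {"overcast": "no", "humid": "high"},
    {"overcast": "no", "humid": "high"},
    {"overcast": "no", "humid": "low"},
]
labels = ["rain", "rain", "no rain", "no rain", "no rain", "no rain"]

tree = induce(samples, labels, ["overcast", "humid"])
print(tree)
```

On this data the sketch splits on "overcast" first and then on "humid" within the overcast samples. Real induction methods typically also handle continuous attributes and guard against overfitting, which this sketch leaves out.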

Resources

  • See papers about the decision tree in the resources section on AI Depot.
