A System for Induction of Oblique Decision Trees

1994 | Sreerama K. Murthy, Simon Kasif, Steven Salzberg
This paper introduces OC1, a system for inducing oblique decision trees. OC1 combines deterministic hill-climbing with randomization to find a good oblique split (a hyperplane) at each node of a decision tree. Oblique decision trees are particularly effective for domains with numeric attributes, though they can also handle symbolic and mixed attributes. The authors present extensive empirical studies on both real and artificial data, showing that OC1 constructs oblique trees that are smaller and more accurate than their axis-parallel counterparts, and they examine the benefits of randomization in building oblique trees.

OC1 tests a linear combination of attributes at each internal node. The test at each node has the form $ \sum_{i=1}^{d}a_{i}x_{i}+a_{d+1}>0 $, where $ a_{1},\ldots,a_{d+1} $ are real-valued coefficients. Such tests correspond to hyperplanes at an oblique orientation to the coordinate axes, hence the term "oblique decision trees." The OC1 implementation is available online. Its randomized hill-climbing approach is more efficient than other existing randomized oblique decision tree methods, and OC1 guarantees a worst-case running time only $ O(\log n) $ times greater than the worst-case time for inducing axis-parallel trees.

OC1 is compared with other decision tree induction methods on a range of real-world datasets, where it performs well, often producing smaller and more accurate trees. Randomization helps OC1 escape local minima and improves the quality of the resulting trees; a pruning strategy reduces overfitting and improves accuracy; and feature selection methods let the system cope with irrelevant attributes. The paper also discusses the impurity measures available in OC1, including the Twoing Rule, which is the default.
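The split search described above (an oblique hyperplane test at each node, improved by hill-climbing with random perturbations and restarts) can be sketched in Python. This is an illustrative simplification, not OC1's exact algorithm: it scores splits with a simple misclassification count rather than the paper's impurity measures, and the perturbation scheme, step sizes, and function names are assumptions of this sketch.

```python
import random

def oblique_split(point, a):
    """Oblique test: sum_i a[i]*x[i] + a[d] > 0 sends the example to one side."""
    d = len(point)
    return sum(a[i] * point[i] for i in range(d)) + a[d] > 0

def impurity(X, y, a):
    """Total misclassifications if each side predicts its majority class.
    (A simple stand-in for OC1's impurity measures, not the paper's exact one.)"""
    left = [yi for xi, yi in zip(X, y) if oblique_split(xi, a)]
    right = [yi for xi, yi in zip(X, y) if not oblique_split(xi, a)]
    def misclass(labels):
        if not labels:
            return 0
        majority = max(set(labels), key=labels.count)
        return sum(1 for lbl in labels if lbl != majority)
    return misclass(left) + misclass(right)

def hill_climb(X, y, restarts=10, steps=50, rng=None):
    """Randomized hill-climbing over hyperplane coefficients: perturb one
    coefficient at a time, keep the change if impurity does not increase,
    and restart from random hyperplanes to help escape local minima."""
    rng = rng or random.Random(0)
    d = len(X[0])
    best_a, best_imp = None, float("inf")
    for _ in range(restarts):
        a = [rng.uniform(-1, 1) for _ in range(d + 1)]
        imp = impurity(X, y, a)
        for _ in range(steps):
            i = rng.randrange(d + 1)       # pick one coefficient to perturb
            old = a[i]
            a[i] += rng.uniform(-0.5, 0.5)
            new_imp = impurity(X, y, a)
            if new_imp <= imp:
                imp = new_imp              # accept non-worsening moves
            else:
                a[i] = old                 # revert worsening moves
        if imp < best_imp:
            best_a, best_imp = a, imp
    return best_a, best_imp
```

The key idea mirrored here is OC1's combination of deterministic local improvement with randomization: random restarts and random perturbation directions keep the search from settling into the first local minimum it reaches.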
The authors conclude that OC1 is a promising system for inducing oblique decision trees, and that further research is needed to explore its potential across domains. The system is efficient and effective, and it has shown good results in empirical studies.
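The Twoing Rule mentioned above, OC1's default impurity measure, originates in Breiman et al.'s CART work. A minimal sketch of computing it for one binary split follows; the function name is mine, and this computes the twoing *goodness* value (larger is better), which a split search would maximize rather than minimize.

```python
from collections import Counter

def twoing_value(left_labels, right_labels):
    """Twoing Rule goodness of a binary split (Breiman et al., 1984):
        value = (P_L * P_R / 4) * (sum_c |p(c|L) - p(c|R)|)^2
    where P_L, P_R are the fractions of examples sent left and right,
    and p(c|L), p(c|R) are class-c proportions on each side."""
    n_l, n_r = len(left_labels), len(right_labels)
    n = n_l + n_r
    if n_l == 0 or n_r == 0:
        return 0.0  # a degenerate split separates nothing
    p_l, p_r = n_l / n, n_r / n
    left_counts = Counter(left_labels)
    right_counts = Counter(right_labels)
    classes = set(left_counts) | set(right_counts)
    diff = sum(abs(left_counts[c] / n_l - right_counts[c] / n_r)
               for c in classes)
    return (p_l * p_r / 4.0) * diff ** 2
```

A perfectly class-separating, balanced split scores 0.25; a split whose two sides have identical class distributions scores 0.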