A Modified Bayesian Information Criterion for Improving the Performance of Tree-Based Learning Algorithms Without the Use of Cross-Validation
Casting tree building as a change-point detection problem, we show that it is possible to prune a regression tree efficiently using properly modified information criteria, and we discuss some applications to tree-based ensemble learning methods. We prove that the proposed pruning approach using a modified Bayesian information criterion is consistent for identifying the correct tree model when it exists as a subtree within a larger tree. In practice, we obtain simplified trees that have prediction accuracy comparable to trees obtained using standard cost-complexity pruning. We briefly discuss an extension to random forests that adaptively trims trees to prevent excessive variance, building upon the work of other authors. The extension includes regular random forests as a special case, and is therefore expected to perform at least as well, with a negligible additional computational cost.
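As a rough illustration of pruning by an information criterion rather than by cross-validation, the following minimal sketch grows a one-dimensional regression tree and collapses splits bottom-up whenever a BIC-type score does not get worse. The score `n * log(RSS / n) + penalty * leaves * log(n)` is a generic BIC-style stand-in and is not the modified criterion proposed in this work. The `penalty` multiplier, the helper names, and the simulated step-function data are all assumptions made for illustration.

```python
import numpy as np


class Node:
    """A tree node holding the indices of the samples that reach it."""

    def __init__(self, idx):
        self.idx = idx
        self.left = None
        self.right = None
        self.threshold = None

    def is_leaf(self):
        return self.left is None


def rss(y):
    """Residual sum of squares around the mean (the leaf prediction)."""
    return float(np.sum((y - y.mean()) ** 2))


def grow(x, y, idx, depth, min_leaf=5):
    """Greedily grow a tree on one feature by minimizing the RSS of each split."""
    node = Node(idx)
    if depth == 0 or len(idx) < 2 * min_leaf:
        return node
    order = idx[np.argsort(x[idx])]
    best = None
    for i in range(min_leaf, len(order) - min_leaf + 1):
        if x[order[i - 1]] == x[order[i]]:
            continue  # cannot split between tied feature values
        cost = rss(y[order[:i]]) + rss(y[order[i:]])
        if best is None or cost < best[0]:
            best = (cost, i)
    if best is None:
        return node
    i = best[1]
    node.threshold = 0.5 * (x[order[i - 1]] + x[order[i]])
    node.left = grow(x, y, order[:i], depth - 1, min_leaf)
    node.right = grow(x, y, order[i:], depth - 1, min_leaf)
    return node


def leaves(node):
    if node.is_leaf():
        return [node]
    return leaves(node.left) + leaves(node.right)


def criterion(root, y, penalty):
    """Generic BIC-type score (an assumption, not the authors' modified BIC)."""
    lv = leaves(root)
    n = len(y)
    total = sum(rss(y[leaf.idx]) for leaf in lv)
    return n * np.log(total / n) + penalty * len(lv) * np.log(n)


def prunable(node):
    """Internal nodes whose two children are both leaves."""
    if node.is_leaf():
        return []
    if node.left.is_leaf() and node.right.is_leaf():
        return [node]
    return prunable(node.left) + prunable(node.right)


def prune(root, y, penalty):
    """Collapse splits one at a time while the score does not increase."""
    improved = True
    while improved:
        improved = False
        current = criterion(root, y, penalty)
        for node in prunable(root):
            saved = (node.left, node.right)
            node.left = node.right = None  # tentatively collapse the split
            if criterion(root, y, penalty) <= current:
                improved = True
                break
            node.left, node.right = saved  # undo: the split is worth keeping
    return root


# Simulated data: a single change point at x = 0.5 plus Gaussian noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
y = np.where(x < 0.5, 0.0, 2.0) + rng.normal(0.0, 0.3, size=200)

penalty = 3.0  # hypothetical multiplier, chosen only for this sketch
root = grow(x, y, np.arange(len(y)), depth=4)
n_before = len(leaves(root))
score_before = criterion(root, y, penalty)
prune(root, y, penalty)
n_after = len(leaves(root))
score_after = criterion(root, y, penalty)
print(n_before, n_after)
```

Because each split location is chosen to maximize the fit, the gain from a spurious split is inflated relative to a fixed-location test, which is one reason a plain BIC penalty can be too lenient in this setting; the `penalty` multiplier above is only a crude way to account for that.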
Language of Oral Presentation: English
Language of Visual Aids: English