
Announcing DTREG 2.0: decision trees with TreeBoost


Phil Sherrod
Apr 5, 2004, 10:32:13 PM

Version 2.0 of the DTREG decision tree generator program has been released. A
free demonstration version is available for download from:

http://www.dtreg.com

Here is a summary of the new features:

DTREG can now generate both single-tree (CART) models and TreeBoost models
consisting of a series of trees. TreeBoost is an implementation of stochastic
gradient boosting with decision trees as the base functions. TreeBoost is
somewhat similar to AdaBoost, but it is optimized for tree models and
introduces random selection of rows while the series is built. This
randomization improves prediction accuracy and gives TreeBoost a flavor
similar to random forests. TreeBoost series are much less prone to
overfitting than single-tree models. TreeBoost uses Huber's M-regression
loss function, which is robust to noisy or mislabeled data values.
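
For readers who want to see the underlying technique, here is a simplified
sketch of stochastic gradient boosting with tree base learners and a Huber
loss, written in Python with scikit-learn trees and assuming NumPy arrays X
and y. This illustrates the general method only, not DTREG's implementation
(Friedman's full algorithm, for instance, also adapts the Huber cutoff each
iteration and updates terminal nodes by medians):

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    def huber_negative_gradient(residual, delta):
        # Negative gradient of the Huber loss w.r.t. the prediction:
        # the raw residual for small errors, clipped at +/- delta for outliers.
        return np.where(np.abs(residual) <= delta,
                        residual, delta * np.sign(residual))

    def fit_treeboost(X, y, n_trees=200, shrinkage=0.1, subsample=0.5,
                      delta=1.0, max_depth=3, seed=0):
        rng = np.random.default_rng(seed)
        init = np.median(y)                    # robust starting prediction
        f = np.full(len(y), init)
        trees = []
        for _ in range(n_trees):
            rows = rng.random(len(y)) < subsample   # random row selection
            pseudo = huber_negative_gradient(y - f, delta)
            tree = DecisionTreeRegressor(max_depth=max_depth)
            tree.fit(X[rows], pseudo[rows])         # fit pseudo-residuals
            f += shrinkage * tree.predict(X)
            trees.append(tree)
        return init, trees

    def predict_treeboost(init, trees, X, shrinkage=0.1):
        # shrinkage must match the value used during fitting
        return init + shrinkage * sum(t.predict(X) for t in trees)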

Charts have been added for Lift/Gain curves, model size versus error rate,
and variable importance.

Speed improvements have been made.

DTREG supports V-fold cross-validation with pruning to select the optimal
tree size for generalization to independent data. Random row subsetting can
also be used.
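
As a rough analogue of that procedure, here is a sketch of choosing the
pruned tree size by 10-fold cross-validation over CART's cost-complexity
pruning path, using scikit-learn and a synthetic dataset. DTREG's exact
pruning procedure is not described here, so treat this as illustrative:

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeRegressor

    X, y = make_regression(n_samples=200, n_features=5, noise=10.0,
                           random_state=0)

    # Candidate pruning strengths from the cost-complexity pruning path;
    # pick the one with the best 10-fold cross-validated score.
    path = DecisionTreeRegressor(random_state=0).cost_complexity_pruning_path(X, y)
    scores = [cross_val_score(DecisionTreeRegressor(ccp_alpha=a, random_state=0),
                              X, y, cv=10).mean()
              for a in path.ccp_alphas]
    best_alpha = path.ccp_alphas[int(np.argmax(scores))]
    model = DecisionTreeRegressor(ccp_alpha=best_alpha, random_state=0).fit(X, y)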

DTREG supports surrogate splitter (predictor) variables to handle missing data
values.
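
To illustrate the idea: a surrogate splitter is an alternate variable whose
split most closely reproduces the primary split, so rows missing the primary
variable can still be routed down the tree. A toy sketch of how such a
surrogate might be ranked (illustrative only, not DTREG's algorithm):

    import numpy as np

    def choose_surrogate(X, primary_col, threshold, candidate_cols):
        # Pick the (column, threshold) whose split best agrees with the
        # primary split X[:, primary_col] <= threshold. Callers should
        # exclude primary_col from candidate_cols.
        goes_left = X[:, primary_col] <= threshold
        best, best_agreement = None, 0.0
        for col in candidate_cols:
            for t in np.unique(X[:, col]):
                cand_left = X[:, col] <= t
                # allow the surrogate to send rows the mirrored way as well
                agreement = max(np.mean(cand_left == goes_left),
                                np.mean(cand_left != goes_left))
                if agreement > best_agreement:
                    best, best_agreement = (col, t), agreement
        return best, best_agreement

    X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 35.0], [4.0, 30.0]])
    print(choose_surrogate(X, primary_col=0, threshold=2.0, candidate_cols=[1]))
    # ((1, 20.0), 1.0): column 1 split at 20.0 mimics the primary split exactly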

Prior probabilities and misclassification costs can be specified.
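
As a point of comparison, scikit-learn expresses a similar notion through
class weights; a cost-sensitive tree might look like the following (an
analogy, not DTREG's interface):

    from sklearn.tree import DecisionTreeClassifier

    # Make misclassifying class 1 five times as costly as class 0.
    model = DecisionTreeClassifier(class_weight={0: 1.0, 1: 5.0})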

Text variables (for example, "male"/"female") are supported in addition to
numeric variables.

Both regression trees with continuous target variables and classification trees
with categorical target variables can be generated.
