Showing 1–2 of 2 results for author: Colthurst, T

Search v0.5.6 released 2020-02-24

arXiv:1710.11555 [pdf, other]

stat.ML cs.LG

TF Boosted Trees: A scalable TensorFlow based framework for gradient boosting

Authors: Natalia Ponomareva, Soroush Radpour, Gilbert Hendry, Salem Haykal, Thomas Colthurst, Petr Mitrichev, Alexander Grushetsky

Abstract: TF Boosted Trees (TFBT) is a new open-sourced frame-work for the distributed training of gradient boosted trees. It is based on TensorFlow, and its distinguishing features include a novel architecture, automatic loss differentiation, layer-by-layer boosting that results in smaller ensembles and faster prediction, principled multi-class handling, and a number of regularization techniques to prevent… ▽ More TF Boosted Trees (TFBT) is a new open-sourced frame-work for the distributed training of gradient boosted trees. It is based on TensorFlow, and its distinguishing features include a novel architecture, automatic loss differentiation, layer-by-layer boosting that results in smaller ensembles and faster prediction, principled multi-class handling, and a number of regularization techniques to prevent overfitting. △ Less

Submitted 31 October, 2017; originally announced October 2017.

Comments: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML PKDD 2017). The final publication will be available at link.springer.com and is available on ECML website http://ecmlpkdd2017.ijs.si/papers/paperID705.pdf
arXiv:1710.11547 [pdf, other]

stat.ML cs.LG

Compact Multi-Class Boosted Trees

Authors: Natalia Ponomareva, Thomas Colthurst, Gilbert Hendry, Salem Haykal, Soroush Radpour

Abstract: Gradient boosted decision trees are a popular machine learning technique, in part because of their ability to give good accuracy with small models. We describe two extensions to the standard tree boosting algorithm designed to increase this advantage. The first improvement extends the boosting formalism from scalar-valued trees to vector-valued trees. This allows individual trees to be used as mul… ▽ More Gradient boosted decision trees are a popular machine learning technique, in part because of their ability to give good accuracy with small models. We describe two extensions to the standard tree boosting algorithm designed to increase this advantage. The first improvement extends the boosting formalism from scalar-valued trees to vector-valued trees. This allows individual trees to be used as multiclass classifiers, rather than requiring one tree per class, and drastically reduces the model size required for multiclass problems. We also show that some other popular vector-valued gradient boosted trees modifications fit into this formulation and can be easily obtained in our implementation. The second extension, layer-by-layer boosting, takes smaller steps in function space, which is empirically shown to lead to a faster convergence and to a more compact ensemble. We have added both improvements to the open-source TensorFlow Boosted trees (TFBT) package, and we demonstrate their efficacy on a variety of multiclass datasets. We expect these extensions will be of particular interest to boosted tree applications that require small models, such as embedded devices, applications requiring fast inference, or applications desiring more interpretable models. △ Less

Submitted 31 October, 2017; originally announced October 2017.

Comments: Accepted for publication in IEEE Big Data 2017 http://cci.drexel.edu/bigdata/bigdata2017/AcceptedPapers.html

Search v0.5.6 released 2020-02-24