
Showing 1–1 of 1 results for author: Basilico, J D

  1. arXiv:1103.2068  [pdf, other]

    cs.LG cs.DC stat.ML

    COMET: A Recipe for Learning and Using Large Ensembles on Massive Data

    Authors: Justin D. Basilico, M. Arthur Munson, Tamara G. Kolda, Kevin R. Dixon, W. Philip Kegelmeyer

    Abstract: COMET is a single-pass MapReduce algorithm for learning on large-scale data. It builds multiple random forest ensembles on distributed blocks of data and merges them into a mega-ensemble. This approach is appropriate when learning from massive-scale data that is too large to fit on a single machine. To get the best accuracy, IVoting should be used instead of bagging to generate the training subset…

    Submitted 8 September, 2011; v1 submitted 10 March, 2011; originally announced March 2011.

    ACM Class: I.5; I.2.6; H.2.8

    Journal ref: ICDM 2011: Proceedings of the 2011 IEEE International Conference on Data Mining, pp. 41-50, 2011
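
The map-and-merge idea described in the abstract above can be illustrated with a small, single-process sketch. This is not the authors' implementation: the function names (`map_block`, `reduce_ensembles`, `predict`) are hypothetical, one-feature decision stumps stand in for random forest trees, the members are trained with plain bagging rather than the IVoting the abstract recommends, and the "blocks" are simply slices of an in-memory list instead of distributed MapReduce inputs.

```python
import random
from collections import Counter

def train_stump(sample):
    """Fit a one-feature threshold classifier (a stand-in for a full tree)."""
    best = None
    n_features = len(sample[0][0])
    for f in range(n_features):
        for x, _ in sample:
            t = x[f]
            left = [y for xi, y in sample if xi[f] <= t]
            right = [y for xi, y in sample if xi[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return (f, t, l_lab, r_lab)

def predict_stump(stump, x):
    f, t, l_lab, r_lab = stump
    return l_lab if x[f] <= t else r_lab

def map_block(block, n_members, rng):
    """Map step (assumed shape): learn a small bagged ensemble from one block."""
    ensemble = []
    for _ in range(n_members):
        # Bagging: sample the block with replacement (IVoting is not shown here).
        bootstrap = [rng.choice(block) for _ in block]
        ensemble.append(train_stump(bootstrap))
    return ensemble

def reduce_ensembles(ensembles):
    """Reduce step (assumed shape): concatenate per-block ensembles."""
    return [member for ensemble in ensembles for member in ensemble]

def predict(mega, x):
    """Classify x by majority vote over every member of the merged ensemble."""
    votes = Counter(predict_stump(s, x) for s in mega)
    return votes.most_common(1)[0][0]

# Toy data: one feature in [0, 1), label 1 when the feature exceeds 0.5.
rng = random.Random(0)
data = [((v,), int(v > 0.5)) for v in (rng.random() for _ in range(200))]
blocks = [data[i::4] for i in range(4)]  # four pretend "distributed" blocks
mega = reduce_ensembles(map_block(b, 5, rng) for b in blocks)
```

Each block is read once and yields its own ensemble, so the only work left for the merge step is concatenation. Calling `predict(mega, (0.9,))` then votes across all 20 members regardless of which block produced them.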