Showing 1–1 of 1 results for author: Sirer, M I

  1. arXiv:1402.0422

    stat.ML cs.IR cs.LG physics.soc-ph

    A high-reproducibility and high-accuracy method for automated topic classification

    Authors: Andrea Lancichinetti, M. Irmak Sirer, Jane X. Wang, Daniel Acuna, Konrad Körding, Luís A. Nunes Amaral

    Abstract: Much of human knowledge sits in large databases of unstructured text. Leveraging this knowledge requires algorithms that extract and record metadata on unstructured text documents. Assigning topics to documents will enable intelligent search, statistical characterization, and meaningful classification. Latent Dirichlet allocation (LDA) is the state-of-the-art in topic classification. Here, we perf…

    Submitted 3 February, 2014; originally announced February 2014.

    Comments: 23 pages, 24 figures