Showing 1–1 of 1 results for author: Silva, M C R

  1. arXiv:2208.01712

    cs.LG cs.CL cs.IR stat.ML

    No Pattern, No Recognition: a Survey about Reproducibility and Distortion Issues of Text Clustering and Topic Modeling

    Authors: Marília Costa Rosendo Silva, Felipe Alves Siqueira, João Pedro Mantovani Tarrega, João Vitor Pataca Beinotti, Augusto Sousa Nunes, Miguel de Mattos Gardini, Vinícius Adolfo Pereira da Silva, Nádia Félix Felipe da Silva, André Carlos Ponce de Leon Ferreira de Carvalho

    Abstract: Extracting knowledge from unlabeled texts using machine learning algorithms can be complex. Document categorization and information retrieval are two applications that may benefit from unsupervised learning (e.g., text clustering and topic modeling), including exploratory data analysis. However, the unsupervised learning paradigm poses reproducibility issues. The initialization can lead to variabi…

    Submitted 2 August, 2022; originally announced August 2022.

    ACM Class: I.2; I.2.7; I.5.3