Showing 1–2 of 2 results for author: Johannemann, J

  1. arXiv:1908.09874  [pdf, other]

    stat.ML cs.LG

    Sufficient Representations for Categorical Variables

    Authors: Jonathan Johannemann, Vitor Hadad, Susan Athey, Stefan Wager

    Abstract: Many learning algorithms require categorical data to be transformed into real vectors before it can be used as input. Often, categorical variables are encoded as one-hot (or dummy) vectors. However, this mode of representation can be wasteful since it adds many low-signal regressors, especially when the number of unique categories is large. In this paper, we investigate simple alternative solution…

    Submitted 28 October, 2021; v1 submitted 26 August, 2019; originally announced August 2019.

  2. arXiv:1907.01974  [pdf, other]

    stat.ML cs.LG

    Spectral Overlap and a Comparison of Parameter-Free, Dimensionality Reduction Quality Metrics

    Authors: Jonathan Johannemann, Robert Tibshirani

    Abstract: Nonlinear dimensionality reduction methods are a popular tool for data scientists and researchers to visualize complex, high dimensional data. However, while these methods continue to improve and grow in number, it is often difficult to evaluate the quality of a visualization due to a variety of factors such as lack of information about the intrinsic dimension of the data and additional tuning req…

    Submitted 3 September, 2019; v1 submitted 3 July, 2019; originally announced July 2019.