Showing 1–1 of 1 results for author: van Lith, J W

  1. arXiv:2111.01868  [pdf, other]

    cs.LG cs.AI

    From Strings to Data Science: a Practical Framework for Automated String Handling

    Authors: John W. van Lith, Joaquin Vanschoren

    Abstract: Many machine learning libraries require that string features be converted to a numerical representation for the models to work as intended. Categorical string features can represent a wide variety of data (e.g., zip codes, names, marital status), and are notoriously difficult to preprocess automatically. In this paper, we propose a framework to do so based on best practices, domain knowledge, and…

    Submitted 4 November, 2021; v1 submitted 2 November, 2021; originally announced November 2021.