Showing 1–3 of 3 results for author: Elor, Y

  1. arXiv:2201.08528  [pdf, other]

    cs.LG

    To SMOTE, or not to SMOTE?

    Authors: Yotam Elor, Hadar Averbuch-Elor

    Abstract: Balancing the data before training a classifier is a popular technique to address the challenges of imbalanced binary classification in tabular data. Balancing is commonly achieved by duplication of minority samples or by generation of synthetic minority samples. While it is well known that balancing affects each classifier differently, most prior empirical studies did not include strong state-of-…

    Submitted 11 May, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

  2. arXiv:2105.08204  [pdf, other]

    cs.LG

    Synthesising Multi-Modal Minority Samples for Tabular Data

    Authors: Sajad Darabi, Yotam Elor

    Abstract: Real-world binary classification tasks are in many cases imbalanced, where the minority class is much smaller than the majority class. This skewness is challenging for machine learning algorithms as they tend to focus on the majority and greatly misclassify the minority. Adding synthetic minority samples to the dataset before training the model is a popular technique to address this difficulty and…

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: Code can be found in https://github.com/aws/sagemaker-scikit-learn-extension/tree/master/src/sagemaker_sklearn_extension/contrib/taei

  3. arXiv:2012.08483  [pdf, other]

    cs.LG

    Amazon SageMaker Autopilot: a white box AutoML solution at scale

    Authors: Piali Das, Valerio Perrone, Nikita Ivkin, Tanya Bansal, Zohar Karnin, Huibin Shen, Iaroslav Shcherbatyi, Yotam Elor, Wilton Wu, Aida Zolic, Thibaut Lienart, Alex Tang, Amr Ahmed, Jean Baptiste Faddoul, Rodolphe Jenatton, Fela Winkelmolen, Philip Gautier, Leo Dirac, Andre Perunicic, Miroslav Miladinovic, Giovanni Zappella, Cédric Archambeau, Matthias Seeger, Bhaskar Dutt, Laurence Rouesnel

    Abstract: AutoML systems provide a black-box solution to machine learning problems by selecting the right way of processing features, choosing an algorithm and tuning the hyperparameters of the entire pipeline. Although these systems perform well on many datasets, there is still a non-negligible number of datasets for which the one-shot solution produced by each particular system would provide sub-par perfo…

    Submitted 16 December, 2020; v1 submitted 15 December, 2020; originally announced December 2020.