Showing 1–4 of 4 results for author: Kartashev, N

Searching in archive cs.
  1. arXiv:2406.19380

    cs.LG

    TabReD: A Benchmark of Tabular Machine Learning in-the-Wild

    Authors: Ivan Rubachev, Nikolay Kartashev, Yury Gorishniy, Artem Babenko

    Abstract: Benchmarks that closely reflect downstream application scenarios are essential for the streamlined adoption of new research in tabular machine learning (ML). In this work, we examine existing tabular benchmarks and find two common characteristics of industry-grade tabular data that are underrepresented in the datasets available to the academic community. First, tabular data often changes over time…

    Submitted 1 July, 2024; v1 submitted 27 June, 2024; originally announced June 2024.

    Comments: Code: https://github.com/yandex-research/tabred (V2: fix the link to the code in this comment; no changes to the PDF)

  2. arXiv:2307.14338

    cs.LG

    TabR: Tabular Deep Learning Meets Nearest Neighbors in 2023

    Authors: Yury Gorishniy, Ivan Rubachev, Nikolay Kartashev, Daniil Shlenskii, Akim Kotelnikov, Artem Babenko

    Abstract: Deep learning (DL) models for tabular data problems (e.g. classification, regression) are currently receiving increasingly more attention from researchers. However, despite the recent efforts, the non-DL algorithms based on gradient-boosted decision trees (GBDT) remain a strong go-to solution for these problems. One of the research directions aimed at improving the position of tabular DL involves…

    Submitted 26 October, 2023; v1 submitted 26 July, 2023; originally announced July 2023.

    Comments: Code: https://github.com/yandex-research/tabular-dl-tabr

  3. arXiv:2206.15407

    cs.LG cs.AI stat.ML

    Shifts 2.0: Extending The Dataset of Real Distributional Shifts

    Authors: Andrey Malinin, Andreas Athanasopoulos, Muhamed Barakovic, Meritxell Bach Cuadra, Mark J. F. Gales, Cristina Granziera, Mara Graziani, Nikolay Kartashev, Konstantinos Kyriakopoulos, Po-Jui Lu, Nataliia Molchanova, Antonis Nikitakis, Vatsal Raina, Francesco La Rosa, Eli Sivena, Vasileios Tsarsitalidis, Efi Tsompopoulou, Elena Volf

    Abstract: Distributional shift, or the mismatch between training and deployment data, is a significant obstacle to the usage of machine learning in high-stakes industrial applications, such as autonomous driving and medicine. This creates a need to be able to assess how robustly ML models generalize as well as the quality of their uncertainty estimates. Standard ML baseline datasets do not allow these prope…

    Submitted 15 September, 2022; v1 submitted 30 June, 2022; originally announced June 2022.

  4. arXiv:2111.07148

    cs.CL cs.AI cs.SI

    SocialBERT -- Transformers for Online SocialNetwork Language Modelling

    Authors: Ilia Karpov, Nick Kartashev

    Abstract: The ubiquity of the contemporary language understanding tasks gives relevance to the development of generalized, yet highly efficient models that utilize all knowledge, provided by the data source. In this work, we present SocialBERT - the first model that uses knowledge about the author's position in the network during text analysis. We investigate possible models for learning social network info…

    Submitted 13 November, 2021; originally announced November 2021.