Showing 1–2 of 2 results for author: Hanák, D

  1. arXiv:2305.15793  [pdf, other]

    cs.LG cs.AI cs.CE stat.CO

    Feature space reduction method for ultrahigh-dimensional, multiclass data: Random forest-based multiround screening (RFMS)

    Authors: Gergely Hanczár, Marcell Stippinger, Dávid Hanák, Marcell T. Kurbucz, Olivér M. Törteli, Ágnes Chripkó, Zoltán Somogyvári

    Abstract: In recent years, numerous screening methods have been published for ultrahigh-dimensional data that contain hundreds of thousands of features; however, most of these methods cannot handle data with thousands of classes. Prediction models built to authenticate users based on multichannel biometric data result in this type of problem. In this study, we present a novel method known as random forest-…

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: 9 pages, 2 figures, 2 tables

    MSC Class: 62G05; 68T01; 62H30 ACM Class: I.2.6; I.2.1; G.3

  2. arXiv:2206.10747  [pdf, other]

    cs.LG cs.AI cs.DB stat.CO

    BiometricBlender: Ultra-high dimensional, multi-class synthetic data generator to imitate biometric feature space

    Authors: Marcell Stippinger, Dávid Hanák, Marcell T. Kurbucz, Gergely Hanczár, Olivér M. Törteli, Zoltán Somogyvári

    Abstract: The lack of freely available (real-life or synthetic) high or ultra-high dimensional, multi-class datasets may hamper the rapidly growing research on feature screening, especially in the field of biometrics, where the usage of such datasets is common. This paper reports a Python package called BiometricBlender, which is an ultra-high dimensional, multi-class synthetic data generator to benchmark a…

    Submitted 25 April, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

    Comments: DISCLAIMER: This is a preprint article. A final peer-reviewed article has now been published in SoftwareX. DOI: https://doi.org/10.1016/j.softx.2023.101366

    MSC Class: 62H30 (Primary) 68T10; 94A62 (Secondary) ACM Class: G.3; I.5.2; K.6.5

    Journal ref: SoftwareX 22 (2023) 101366