Showing 1–5 of 5 results for author: Raimondo, F

  1. arXiv:2311.04179  [pdf]

    cs.LG cs.AI

    On Leakage in Machine Learning Pipelines

    Authors: Leonard Sasse, Eliana Nicolaisen-Sobesky, Juergen Dukart, Simon B. Eickhoff, Michael Götz, Sami Hamdan, Vera Komeyer, Abhijit Kulkarni, Juha Lahnakoski, Bradley C. Love, Federico Raimondo, Kaustubh R. Patil

    Abstract: Machine learning (ML) provides powerful tools for predictive modeling. ML's popularity stems from the promise of sample-level prediction with applications across a variety of fields from physics and marketing to healthcare. However, if not properly implemented and evaluated, ML pipelines may contain leakage typically resulting in overoptimistic performance estimates and failure to generalize to ne…

    Submitted 5 March, 2024; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: second draft

  2. arXiv:2310.12568  [pdf, other]

    cs.LG q-bio.NC

    Julearn: an easy-to-use library for leakage-free evaluation and inspection of ML models

    Authors: Sami Hamdan, Shammi More, Leonard Sasse, Vera Komeyer, Kaustubh R. Patil, Federico Raimondo

    Abstract: The fast-paced development of machine learning (ML) methods coupled with its increasing adoption in research poses challenges for researchers without extensive training in ML. In neuroscience, for example, ML can help understand brain-behavior relationships, diagnose diseases, and develop biomarkers using various data sources like magnetic resonance imaging and electroencephalography. The primary…

    Submitted 19 October, 2023; originally announced October 2023.

    Comments: 13 pages, 5 figures

  3. arXiv:2212.08525  [pdf, other]

    cs.CR eess.SY

    Resource-Interaction Graph: Efficient Graph Representation for Anomaly Detection

    Authors: James Pope, Jinyuan Liang, Vijay Kumar, Francesco Raimondo, Xinyi Sun, Ryan McConville, Thomas Pasquier, Rob Piechocki, George Oikonomou, Bo Luo, Dan Howarth, Ioannis Mavromatis, Adrian Sanchez Mompo, Pietro Carnelli, Theodoros Spyridopoulos, Aftab Khan

    Abstract: Security research has concentrated on converting operating system audit logs into suitable graphs, such as provenance graphs, for analysis. However, provenance graphs can grow very large requiring significant computational resources beyond what is necessary for many security tasks and are not feasible for resource constrained environments, such as edge devices. To address this problem, we present…

    Submitted 16 December, 2022; originally announced December 2022.

    Comments: 15 pages, 11 figures, 6 tables, for dataset see https://github.com/jpope8/container-escape-dataset, for code see https://github.com/jpope8/container-escape-analysis

  4. arXiv:2211.01840  [pdf, other]

    cs.LG cs.CR cs.DC

    LE3D: A Lightweight Ensemble Framework of Data Drift Detectors for Resource-Constrained Devices

    Authors: Ioannis Mavromatis, Adrian Sanchez-Mompo, Francesco Raimondo, James Pope, Marcello Bullo, Ingram Weeks, Vijay Kumar, Pietro Carnelli, George Oikonomou, Theodoros Spyridopoulos, Aftab Khan

    Abstract: Data integrity becomes paramount as the number of Internet of Things (IoT) sensor deployments increases. Sensor data can be altered by benign causes or malicious actions. Mechanisms that detect drifts and irregularities can prevent disruptions and data bias in the state of an IoT application. This paper presents LE3D, an ensemble framework of data drift estimators capable of detecting abnormal sen…

    Submitted 18 November, 2022; v1 submitted 3 November, 2022; originally announced November 2022.

    Comments: IEEE CCNC 2023, Las Vegas, USA

  5. arXiv:1612.08194  [pdf, other]

    stat.AP

    Autoreject: Automated artifact rejection for MEG and EEG data

    Authors: Mainak Jas, Denis A. Engemann, Yousra Bekhti, Federico Raimondo, Alexandre Gramfort

    Abstract: We present an automated algorithm for unified rejection and repair of bad trials in magnetoencephalography (MEG) and electroencephalography (EEG) signals. Our method capitalizes on cross-validation in conjunction with a robust evaluation metric to estimate the optimal peak-to-peak threshold -- a quantity commonly used for identifying bad trials in M/EEG. This approach is then extended to a more so…

    Submitted 12 July, 2017; v1 submitted 24 December, 2016; originally announced December 2016.