Showing 1–13 of 13 results for author: State, R

  1. arXiv:2005.03773

    cs.LG stat.ML

    Minority Class Oversampling for Tabular Data with Deep Generative Models

    Authors: Ramiro Camino, Christian Hammerschmidt, Radu State

    Abstract: In practice, machine learning experts are often confronted with imbalanced data. Without accounting for the imbalance, common classifiers perform poorly and standard evaluation metrics mislead the practitioners on the model's performance. A common method to treat imbalanced datasets is under- and oversampling. In this process, samples are either removed from the majority class or synthetic samples…

    Submitted 20 July, 2020; v1 submitted 7 May, 2020; originally announced May 2020.

  2. arXiv:1910.01449

    cs.CR cs.LG stat.ML

    A Data Science Approach for Honeypot Detection in Ethereum

    Authors: Ramiro Camino, Christof Ferreira Torres, Mathis Baden, Radu State

    Abstract: Ethereum smart contracts have recently drawn a considerable amount of attention from the media, the financial industry and academia. With the increase in popularity, malicious users found new opportunities to profit by deceiving newcomers. Consequently, attackers started luring other attackers into contracts that seem to have exploitable flaws, but that actually contain a complex hidden trap that…

    Submitted 19 December, 2019; v1 submitted 3 October, 2019; originally announced October 2019.

  3. arXiv:1908.09899

    cs.LG stat.ML

    SynGAN: Towards Generating Synthetic Network Attacks using GANs

    Authors: Jeremy Charlier, Aman Singh, Gaston Ormazabal, Radu State, Henning Schulzrinne

    Abstract: The rapid digital transformation without security considerations has resulted in the rise of global-scale cyberattacks. The first line of defense against these attacks is Network Intrusion Detection Systems (NIDS). Once deployed, however, these systems work as black boxes with a high rate of false positives and no measurable effectiveness. There is a need to continuously test and improve these sy…

    Submitted 26 August, 2019; originally announced August 2019.

  4. arXiv:1905.13020

    cs.LG stat.ML

    Visualization of AE's Training on Credit Card Transactions with Persistent Homology

    Authors: Jeremy Charlier, Francois Petit, Gaston Ormazabal, Radu State, Jean Hilger

    Abstract: Auto-encoders are among the most popular neural network architectures for dimension reduction. They are composed of two parts: the encoder, which maps the model distribution to a latent manifold, and the decoder, which maps the latent manifold to a reconstructed distribution. However, auto-encoders are known to provoke chaotically scattered data distribution in the latent manifold resulting in an inco…

    Submitted 12 August, 2019; v1 submitted 24 May, 2019; originally announced May 2019.

    Comments: arXiv admin note: substantial text overlap with arXiv:1905.09894

  5. arXiv:1905.12568

    cs.LG stat.ML

    Predicting Sparse Clients' Actions with CPOPT-Net in the Banking Environment

    Authors: Jeremy Charlier, Radu State, Jean Hilger

    Abstract: The digital revolution of the banking system and evolving European regulations have pushed the major banking actors to innovate through a new use of their clients' digital information. Given highly sparse client activities, we propose CPOPT-Net, an algorithm that combines the CP canonical tensor decomposition, a multidimensional matrix decomposition that factorizes a tensor as the sum of rank-one te…

    Submitted 23 May, 2019; originally announced May 2019.

  6. arXiv:1905.12567

    cs.LG stat.ML

    MQLV: Optimal Policy of Money Management in Retail Banking with Q-Learning

    Authors: Jeremy Charlier, Gaston Ormazabal, Radu State, Jean Hilger

    Abstract: Reinforcement learning has become one of the best approaches to train a computer game emulator capable of human-level performance. In a reinforcement learning approach, an optimal value function is learned across a set of actions, or decisions, that leads to a set of states giving different rewards, with the objective to maximize the overall reward. A policy assigns to each state-action pair an exp…

    Submitted 21 August, 2019; v1 submitted 24 May, 2019; originally announced May 2019.

  7. arXiv:1905.10363

    math.NA cs.CE cs.LG stat.ML

    User-Device Authentication in Mobile Banking using APHEN for Paratuck2 Tensor Decomposition

    Authors: Jeremy Charlier, Eric Falk, Radu State, Jean Hilger

    Abstract: New European financial regulations such as PSD2 are changing retail banking services. Notably, the monitoring of personal expenses is now open to institutions other than retail banks. Nonetheless, the retail banks are looking to leverage user-device authentication on mobile banking applications to enhance personal financial advertisement. To address the profiling of th…

    Submitted 23 May, 2019; originally announced May 2019.

  8. arXiv:1905.09894

    cs.LG stat.ML

    PHom-GeM: Persistent Homology for Generative Models

    Authors: Jeremy Charlier, Radu State, Jean Hilger

    Abstract: Generative neural network models, including Generative Adversarial Network (GAN) and Auto-Encoders (AE), are among the most popular neural network models to generate adversarial data. The GAN model is composed of a generator that produces synthetic data and of a discriminator that discriminates between the generator's output and the true data. AE consist of an encoder which maps the model distribu…

    Submitted 23 May, 2019; originally announced May 2019.

  9. arXiv:1902.10666

    cs.LG stat.ML

    Improving Missing Data Imputation with Deep Generative Models

    Authors: Ramiro D. Camino, Christian A. Hammerschmidt, Radu State

    Abstract: Datasets with missing values are very common in industry applications, and they can have a negative impact on machine learning models. Recent studies introduced solutions to the problem of imputing missing values based on deep generative models. Previous experiments with Generative Adversarial Networks and Variational Autoencoders showed interesting results in this domain, but it is not clear whic…

    Submitted 27 February, 2019; originally announced February 2019.

  10. arXiv:1807.01202

    stat.ML cs.LG

    Generating Multi-Categorical Samples with Generative Adversarial Networks

    Authors: Ramiro Camino, Christian Hammerschmidt, Radu State

    Abstract: We propose a method to train generative adversarial networks on multivariate feature vectors representing multiple categorical values. In contrast to the continuous domain, where GAN-based methods have delivered considerable results, GANs struggle to perform equally well on discrete data. We propose and compare several architectures based on multiple (Gumbel) softmax output layers taking into accou…

    Submitted 4 July, 2018; v1 submitted 3 July, 2018; originally announced July 2018.

    Journal ref: Presented at the ICML 2018 workshop on Theoretical Foundations and Applications of Deep Generative Models, Stockholm, Sweden

  11. arXiv:1707.09430

    stat.ML cs.LG

    Human in the Loop: Interactive Passive Automata Learning via Evidence-Driven State-Merging Algorithms

    Authors: Christian A. Hammerschmidt, Radu State, Sicco Verwer

    Abstract: We present an interactive version of an evidence-driven state-merging (EDSM) algorithm for learning variants of finite state automata. Learning these automata often amounts to recovering or reverse engineering the model generating the data despite noisy, incomplete, or imperfectly sampled data sources rather than optimizing a purely numeric target function. Domain expertise and human knowledge abo…

    Submitted 28 July, 2017; originally announced July 2017.

    Comments: 4 pages, presented at the Human in the Loop workshop at ICML 2017

  12. arXiv:1703.10121

    cs.LG cs.AI stat.ML

    The Top 10 Topics in Machine Learning Revisited: A Quantitative Meta-Study

    Authors: Patrick Glauner, Manxing Du, Victor Paraschiv, Andrey Boytsov, Isabel Lopez Andrade, Jorge Meira, Petko Valtchev, Radu State

    Abstract: Which topics of machine learning are most commonly addressed in research? This question was initially answered in 2007 by doing a qualitative survey among distinguished researchers. In our study, we revisit this question from a quantitative perspective. Concretely, we collect 54K abstracts of papers published between 2007 and 2016 in leading machine learning journals and conferences. We then use m…

    Submitted 29 March, 2017; originally announced March 2017.

    Journal ref: Proceedings of the 25th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2017)

  13. arXiv:1611.07100

    stat.ML cs.AI

    Interpreting Finite Automata for Sequential Data

    Authors: Christian Albert Hammerschmidt, Sicco Verwer, Qin Lin, Radu State

    Abstract: Automaton models are often seen as interpretable models. Interpretability itself is not well defined: it remains unclear what interpretability means without first explicitly specifying objectives or desired attributes. In this paper, we identify the key properties used to interpret automata and propose a modification of a state-merging approach to learn variants of finite state automata. We apply…

    Submitted 24 November, 2016; v1 submitted 21 November, 2016; originally announced November 2016.

    Comments: Presented at NIPS 2016 Workshop on Interpretable Machine Learning in Complex Systems

    ACM Class: I.2.6