-
Chain-structured neural architecture search for financial time series forecasting
Authors:
Denis Levchenko,
Efstratios Rappos,
Shabnam Ataee,
Biagio Nigro,
Stephan Robert
Abstract:
We compare three popular neural architecture search strategies on chain-structured search spaces (Bayesian optimization, the hyperband method, and reinforcement learning) in the context of financial time series forecasting.
Submitted 15 March, 2024;
originally announced March 2024.
-
Generation and Simulation of Synthetic Datasets with Copulas
Authors:
Regis Houssou,
Mihai-Cezar Augustin,
Efstratios Rappos,
Vivien Bonvin,
Stephan Robert-Nicoud
Abstract:
This paper proposes a new method to generate synthetic data sets based on copula models. Our goal is to produce surrogate data resembling real data in terms of marginal and joint distributions. We present a complete and reliable algorithm for generating a synthetic data set comprising numeric or categorical variables. Applying our methodology to two datasets shows better performance compared to other methods such as SMOTE and autoencoders.
Submitted 30 March, 2022;
originally announced March 2022.
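The copula idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' algorithm: it assumes a Gaussian copula, exactly two numeric columns, and empirical marginals, none of which the abstract specifies. The function names (`normal_scores`, `pearson`, `sample_gaussian_copula`) are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for cdf and inverse cdf


def normal_scores(column):
    """Map each value to a normal score via its rank (pseudo-observation)."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Plain Pearson correlation of two equally long sequences."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def sample_gaussian_copula(col_x, col_y, size, rng):
    """Draw synthetic (x, y) pairs with the same empirical marginals as the
    input columns and a Gaussian-copula dependence (assumed, see lead-in)."""
    rho = pearson(normal_scores(col_x), normal_scores(col_y))
    sorted_x, sorted_y = sorted(col_x), sorted(col_y)
    n = len(col_x)
    residual = math.sqrt(max(0.0, 1.0 - rho * rho))
    synthetic = []
    for _ in range(size):
        # Correlated standard normals, then uniforms, then empirical quantiles.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + residual * rng.gauss(0.0, 1.0)
        u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)
        x = sorted_x[min(int(u1 * n), n - 1)]
        y = sorted_y[min(int(u2 * n), n - 1)]
        synthetic.append((x, y))
    return synthetic
```

Calling `sample_gaussian_copula(xs, ys, 1000, random.Random(0))` returns 1000 synthetic pairs. Because the marginals are resampled from the observed values, this sketch never produces values outside the original data, which a full method would likely want to relax.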
-
Radial Autoencoders for Enhanced Anomaly Detection
Authors:
Mihai-Cezar Augustin,
Vivien Bonvin,
Regis Houssou,
Efstratios Rappos,
Stephan Robert-Nicoud
Abstract:
In classification problems, supervised machine-learning methods outperform traditional algorithms, thanks to the ability of neural networks to learn complex patterns. However, in two-class classification tasks like anomaly or fraud detection, unsupervised methods could do even better, because their prediction is not limited to previously learned types of anomalies. An intuitive approach to anomaly detection can be based on the distances from the centers of mass of the two respective classes. Autoencoders, although trained without supervision, can also detect anomalies: considering the center of mass of the normal points, reconstructions now have radii, with the largest radii most likely indicating anomalous points. Of course, radii-based classification was already possible without interposing an autoencoder. In any space, radial classification can be operated, to some extent. In order to outperform it, we proceed to radial deformations of data (i.e., centric compressions or expansions of axes) and autoencoder training. Any autoencoder that makes use of a data center is here baptized a centric autoencoder (cAE). A special type is the cAE trained with a uniformly compressed dataset, named the centripetal autoencoder (cpAE). The new concept is studied here in relation to a schematic artificial dataset, and the derived methods show consistent score improvements. But tested on real banking data, our radial deformation supervised algorithms alone still perform better than cAEs, as expected from most supervised methods; nonetheless, in hybrid approaches, cAEs can be combined with a radial deformation of space, improving its classification score. We expect that centric autoencoders will become irreplaceable objects in live anomaly detection based on geometry, thanks to their ability to stem naturally from geometrical algorithms and to their native capability of detecting unknown anomaly types.
Submitted 31 March, 2022; v1 submitted 29 March, 2022;
originally announced March 2022.
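The plain radial classification that the abstract above takes as its baseline (distance from the center of mass of the normal points, no autoencoder) can be sketched as follows. This is only an illustration under assumptions: the 0.95 quantile threshold and the function names are hypothetical, and the radial deformations and centric autoencoders of the paper are not reproduced.

```python
import math


def center_of_mass(points):
    """Coordinate-wise mean of a list of equally sized points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def radius(point, center):
    """Euclidean distance from the center (math.dist needs Python 3.8+)."""
    return math.dist(point, center)


def fit_threshold(normal_points, quantile=0.95):
    """Return the center of the normal points and a radius threshold taken
    at the given quantile of their radii (the quantile is an assumption)."""
    center = center_of_mass(normal_points)
    radii = sorted(radius(p, center) for p in normal_points)
    index = min(int(quantile * len(radii)), len(radii) - 1)
    return center, radii[index]


def is_anomalous(point, center, threshold):
    """Flag a point whose radius exceeds the fitted threshold."""
    return radius(point, center) > threshold
```

A point far from the center, such as `(5, 5)` for normal points clustered around the origin, is flagged, while a point inside the cluster is not.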
-
A Force-Directed Approach for Offline GPS Trajectory Map Matching
Authors:
Efstratios Rappos,
Stephan Robert,
Philippe Cudré-Mauroux
Abstract:
We present a novel algorithm to match GPS trajectories onto maps offline (in batch mode) using techniques borrowed from the field of force-directed graph drawing. We consider a simulated physical system where each GPS trajectory is attracted or repelled by the underlying road network via electrical-like forces. We let the system evolve under the action of these physical forces such that individual trajectories are attracted towards candidate roads to obtain a map matching path. Our approach has several advantages compared to traditional routing-based algorithms for map matching, including the ability to account for noise and to avoid large detours due to outliers in the data whilst taking into account the underlying topological restrictions (such as one-way roads). Our empirical evaluation using real GPS traces shows that, on average, our method produces better map matching results than alternative offline map matching algorithms, especially for routes in dense, urban areas.
Submitted 29 March, 2019;
originally announced March 2019.
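To give a feel for the force-based relaxation described in the abstract above, here is a toy sketch. It is not the paper's algorithm: it uses attraction only, a softened force law `d / (d^2 + softening)` chosen so that the pull vanishes on the road itself, straight road segments, and no topological constraints such as one-way roads. All names and parameters are hypothetical.

```python
import math


def closest_point_on_segment(p, a, b):
    """Orthogonal projection of p onto segment ab, clamped to the segment."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return (a[0] + t * dx, a[1] + t * dy)


def net_force(p, roads, softening=1.0):
    """Sum of attractive pulls from every road segment on point p.
    The force law is an assumption made for this sketch."""
    fx = fy = 0.0
    for a, b in roads:
        qx, qy = closest_point_on_segment(p, a, b)
        dx, dy = qx - p[0], qy - p[1]
        dist_sq = dx * dx + dy * dy
        # Magnitude d / (d^2 + s) along the unit vector (dx, dy) / d.
        fx += dx / (dist_sq + softening)
        fy += dy / (dist_sq + softening)
    return fx, fy


def relax(trajectory, roads, steps=200, step_size=0.5):
    """Move each trajectory point along the net force for a fixed number of steps."""
    points = list(trajectory)
    for _ in range(steps):
        moved = []
        for x, y in points:
            fx, fy = net_force((x, y), roads)
            moved.append((x + step_size * fx, y + step_size * fy))
        points = moved
    return points


def nearest_road(p, roads):
    """Index of the road segment closest to p, used as the final match."""
    return min(range(len(roads)),
               key=lambda i: math.dist(p, closest_point_on_segment(p, *roads[i])))
```

After relaxation, each point is assigned to its nearest segment with `nearest_road`. With two parallel roads, a point starting near one of them drifts towards it but settles slightly off the road, because the distant road still exerts a small pull.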
-
Treatment of Unicode canonical decomposition among operating systems
Authors:
Efstratios Rappos
Abstract:
This article shows how the text characters that have multiple representations under the Unicode standard are treated by popular operating systems. Whilst most characters have a unique representation in Unicode, some characters, such as the accented European letters, can have multiple representations due to a feature of Unicode called normalization. These characters are treated differently by popular operating systems, leading to additional challenges for the interoperability of computer programs.
Submitted 28 November, 2017;
originally announced November 2017.
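The multiple representations mentioned in the abstract above are easy to reproduce with Python's standard `unicodedata` module. The sketch below shows the letter "é" in its composed and decomposed forms; the helper name `equivalent` is hypothetical, and the example says nothing about how any particular operating system behaves.

```python
import unicodedata

composed = "\u00e9"     # "é" as one code point (LATIN SMALL LETTER E WITH ACUTE)
decomposed = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT


def equivalent(a, b):
    """Compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


print(composed == decomposed)           # False: different code point sequences
print(len(composed), len(decomposed))   # 1 2
print(equivalent(composed, decomposed)) # True: same text once normalized
print(unicodedata.normalize("NFD", composed) == decomposed)  # True
```

Programs that exchange file names or text between systems can avoid mismatches by normalizing to one form (NFC or NFD) before comparing.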
-
Using GPU Simulation to Accurately Fit to the Power-Law Distribution
Authors:
Efstratios Rappos,
Stephan Robert
Abstract:
This article describes a methodology for fitting experimental data to the discrete power-law distribution and provides the results of a detailed simulation exercise to calculate accurate cutoff values for assessing the fit to a power-law distribution when using maximum likelihood estimation for the exponent of the distribution. Using massively parallel computing, we were able to reduce the computation time required for these calculations by a factor of 60 across a range of parameters and to construct a series of detailed tables containing the test values to be used in a Kolmogorov-Smirnov goodness-of-fit test, allowing for an accurate assessment of the power-law fit from empirical data.
Submitted 29 May, 2013;
originally announced May 2013.
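The two ingredients named in the abstract above, a maximum likelihood estimate of the exponent and a Kolmogorov-Smirnov distance, can be sketched on the CPU in pure Python. This is a minimal illustration under assumptions: the discrete power law is taken as p(x) = x^(-alpha) / zeta(alpha, xmin) for integers x >= xmin, the Hurwitz zeta function is approximated by a truncated sum with an Euler-Maclaurin tail, and the exponent is found by ternary search. It does not reproduce the GPU simulation or the cutoff tables of the paper, and all function names are hypothetical.

```python
import math


def hurwitz_zeta(alpha, q, terms=1000):
    """Approximate sum_{k>=0} (k + q)^(-alpha) for alpha > 1:
    direct sum of the first terms plus an Euler-Maclaurin tail estimate."""
    head = sum((k + q) ** -alpha for k in range(terms))
    n = terms + q
    tail = n ** (1 - alpha) / (alpha - 1) + 0.5 * n ** -alpha
    return head + tail


def fit_exponent(data, xmin=1, lo=1.05, hi=6.0, iters=60):
    """Maximum likelihood exponent, found by ternary search on the
    negative log-likelihood (which is convex in alpha)."""
    xs = [x for x in data if x >= xmin]
    n = len(xs)
    log_sum = sum(math.log(x) for x in xs)

    def neg_log_likelihood(alpha):
        return n * math.log(hurwitz_zeta(alpha, xmin)) + alpha * log_sum

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if neg_log_likelihood(m1) < neg_log_likelihood(m2):
            hi = m2
        else:
            lo = m1
    return (lo + hi) / 2


def ks_distance(data, alpha, xmin=1):
    """Largest gap between the empirical and the fitted model CDF,
    evaluated at every integer from xmin to the largest observation."""
    xs = sorted(x for x in data if x >= xmin)
    n = len(xs)
    norm = hurwitz_zeta(alpha, xmin)
    model_cdf = 0.0
    seen = 0
    distance = 0.0
    for x in range(xmin, xs[-1] + 1):
        model_cdf += x ** -alpha / norm
        while seen < n and xs[seen] == x:
            seen += 1
        distance = max(distance, abs(seen / n - model_cdf))
    return distance
```

In practice the resulting distance would be compared against a cutoff value for the given sample size and exponent, which is what the simulated tables described in the abstract provide.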