Synthetic Data Applications in Finance
Authors:
Vamsi K. Potluru,
Daniel Borrajo,
Andrea Coletta,
Niccolò Dalmasso,
Yousef El-Laham,
Elizabeth Fons,
Mohsen Ghassemi,
Sriram Gopalakrishnan,
Vikesh Gosai,
Eleonora Kreačić,
Ganapathy Mani,
Saheed Obitayo,
Deepak Paramanand,
Natraj Raman,
Mikhail Solonin,
Srijan Sood,
Svitlana Vyetrenko,
Haibei Zhu,
Manuela Veloso,
Tucker Balch
Abstract:
Synthetic data has made tremendous strides in various commercial settings including finance, healthcare, and virtual reality. We present a broad overview of prototypical applications of synthetic data in the financial sector and in particular provide richer details for a few select ones. These cover a wide variety of data modalities including tabular, time-series, event-series, and unstructured data, arising from both markets and retail financial applications. Since finance is a highly regulated industry, synthetic data is a potential approach for dealing with issues related to privacy, fairness, and explainability. Various metrics are utilized in evaluating the quality and effectiveness of our approaches in these applications. We conclude with open directions in synthetic data in the context of the financial domain.
Submitted 20 March, 2024; v1 submitted 29 December, 2023;
originally announced January 2024.
Sparsity in Continuous-Depth Neural Networks
Authors:
Hananeh Aliee,
Till Richter,
Mikhail Solonin,
Ignacio Ibarra,
Fabian Theis,
Niki Kilbertus
Abstract:
Neural Ordinary Differential Equations (NODEs) have proven successful in learning dynamical systems in terms of accurately recovering the observed trajectories. While different types of sparsity have been proposed to improve robustness, the generalization properties of NODEs for dynamical systems beyond the observed data are underexplored. We systematically study the influence of weight and feature sparsity on forecasting as well as on identifying the underlying dynamical laws. Besides assessing existing methods, we propose a regularization technique to sparsify "input-output connections" and extract relevant features during training. Moreover, we curate real-world datasets consisting of human motion capture and human hematopoiesis single-cell RNA-seq data to realistically analyze different levels of out-of-distribution (OOD) generalization in forecasting and dynamics identification, respectively. Our extensive empirical evaluation on these challenging benchmarks suggests that weight sparsity improves generalization in the presence of noise or irregular sampling. However, it does not prevent learning spurious feature dependencies in the inferred dynamics, rendering them impractical for predictions under interventions, or for inferring the true underlying dynamics. In contrast, feature sparsity can help recover sparse ground-truth dynamics compared to unregularized NODEs.
Submitted 26 October, 2022;
originally announced October 2022.