Six Levels of Privacy: A Framework for Financial Synthetic Data

Authors: Tucker Balch, Vamsi K. Potluru, Deepak Paramanand, Manuela Veloso

Abstract: Synthetic Data is increasingly important in financial applications. In addition to the benefits it provides, such as improved financial modeling and better testing procedures, it poses privacy risks as well. Such data may arise from client information, business information, or other proprietary sources that must be protected. Even though the process by which Synthetic Data is generated serves to obscure the original data to some degree, the extent to which privacy is preserved is hard to assess. Accordingly, we introduce a hierarchy of "levels" of privacy that are useful for categorizing Synthetic Data generation methods and the progressively improved protections they offer. While the six levels were devised in the context of financial applications, they may also be appropriate for other industries. Our paper includes: A brief overview of Financial Synthetic Data, how it can be used, how its value can be assessed, privacy risks, and privacy attacks. We close with details of the "Six Levels" that include defenses against those attacks.

Submitted 20 March, 2024; originally announced March 2024.

Comments: Six privacy levels framework; excerpted from "Synthetic Data Applications in Finance" (arXiv:2401.00081) article

arXiv:2401.00081 [pdf, other]

Synthetic Data Applications in Finance

Authors: Vamsi K. Potluru, Daniel Borrajo, Andrea Coletta, Niccolò Dalmasso, Yousef El-Laham, Elizabeth Fons, Mohsen Ghassemi, Sriram Gopalakrishnan, Vikesh Gosai, Eleonora Kreačić, Ganapathy Mani, Saheed Obitayo, Deepak Paramanand, Natraj Raman, Mikhail Solonin, Srijan Sood, Svitlana Vyetrenko, Haibei Zhu, Manuela Veloso, Tucker Balch

Abstract: Synthetic data has made tremendous strides in various commercial settings including finance, healthcare, and virtual reality. We present a broad overview of prototypical applications of synthetic data in the financial sector and in particular provide richer details for a few select ones. These cover a wide variety of data modalities including tabular, time-series, event-series, and unstructured data, arising from both markets and retail financial applications. Since finance is a highly regulated industry, synthetic data is a potential approach for dealing with issues related to privacy, fairness, and explainability. Various metrics are utilized in evaluating the quality and effectiveness of our approaches in these applications. We conclude with open directions in synthetic data in the context of the financial domain.

Submitted 20 March, 2024; v1 submitted 29 December, 2023; originally announced January 2024.

Comments: 50 pages, journal submission; updated 6 privacy levels

arXiv:2310.09621 [pdf, other]

Prime Match: A Privacy-Preserving Inventory Matching System

Authors: Antigoni Polychroniadou, Gilad Asharov, Benjamin Diamond, Tucker Balch, Hans Buehler, Richard Hua, Suwen Gu, Greg Gimler, Manuela Veloso

Abstract: Inventory matching is a standard mechanism/auction for trading financial stocks by which buyers and sellers can be paired. In the financial world, banks often undertake the task of finding such matches between their clients. The related stocks can be traded without adversely impacting the market price for either client. If matches between clients are found, the bank can offer the trade at advantageous rates. If no match is found, the parties have to buy or sell the stock in the public market, which introduces additional costs. A problem with the process as it is presently conducted is that the involved parties must share their order to buy or sell a particular stock, along with the intended quantity (number of shares), with the bank. Clients worry that if this information were to leak somehow, then other market participants would become aware of their intentions and thus cause the price to move adversely against them before their transaction finalizes. We provide a solution, Prime Match, that enables clients to match their orders efficiently with reduced market impact while maintaining privacy. In the case where there are no matches, no information is revealed. Our main cryptographic innovation is a two-round secure linear comparison protocol for computing the minimum between two quantities without preprocessing and with malicious security, which can be of independent interest. We report benchmarks of our Prime Match system, which runs in production and is adopted by J.P. Morgan.
The system is designed utilizing a star topology network, which provides clients with a centralized node (the bank) as an alternative to the idealized assumption of point-to-point connections, which would be impractical and undesired for the clients to implement in reality. Prime Match is the first secure multiparty computation solution running live in the traditional financial world.

Submitted 14 October, 2023; originally announced October 2023.

Comments: 27 pages, 7 figures, USENIX Security 2023

Journal ref: Prime match: A privacy-preserving inventory matching system. In Joseph A. Calandrino and Carmela Troncoso, editors, 32nd USENIX Security Symposium, USENIX Security 2023, Anaheim, CA, USA, August 9-11, 2023. USENIX Association, 2023

arXiv:2309.01784 [pdf, other]

INTAGS: Interactive Agent-Guided Simulation

Authors: Song Wei, Andrea Coletta, Svitlana Vyetrenko, Tucker Balch

Abstract: In many applications involving multi-agent systems (MAS), it is imperative to test an experimental (Exp) autonomous agent in a high-fidelity simulator prior to its deployment to production, to avoid unexpected losses in the real world. Such a simulator acts as the environmental background (BG) agent(s), called agent-based simulator (ABS), aiming to replicate the complex real MAS. However, developing realistic ABS remains challenging, mainly due to the sequential and dynamic nature of such systems. To fill this gap, we propose a metric to distinguish between real and synthetic multi-agent systems, which is evaluated through the live interaction between the Exp and BG agents to explicitly account for the systems' sequential nature. Specifically, we characterize the system/environment by studying the effect of a sequence of BG agents' responses to the environment state evolution and take such effects' differences as the MAS distance metric. The effect estimation is cast as a causal inference problem since the environment evolution is confounded with the previous environment state. Importantly, we propose the Interactive Agent-Guided Simulation (INTAGS) framework to build a realistic ABS by optimizing over this novel metric. To adapt to any environment with interactive sequential decision making agents, INTAGS formulates the simulator as a stochastic policy in reinforcement learning. Moreover, INTAGS utilizes the policy gradient update to bypass differentiating the proposed metric such that it can support non-differentiable operations of multi-agent environments.
Through extensive experiments, we demonstrate the effectiveness of INTAGS on an equity stock market simulation example. We show that using INTAGS to calibrate the simulator can generate more realistic market data compared to the state-of-the-art conditional Wasserstein Generative Adversarial Network approach.

Submitted 17 November, 2023; v1 submitted 4 September, 2023; originally announced September 2023.

arXiv:2304.04912 [pdf, other]

Financial Time Series Forecasting using CNN and Transformer

Authors: Zhen Zeng, Rachneet Kaur, Suchetha Siddagangappa, Saba Rahimi, Tucker Balch, Manuela Veloso

Abstract: Time series forecasting is important across various domains for decision-making. In particular, financial time series such as stock prices can be hard to predict as it is difficult to model short-term and long-term temporal dependencies between data points. Convolutional Neural Networks (CNN) are good at capturing local patterns for modeling short-term dependencies. However, CNNs cannot learn long-term dependencies due to the limited receptive field. Transformers on the other hand are capable of learning global context and long-term dependencies. In this paper, we propose to harness the power of CNNs and Transformers to model both short-term and long-term dependencies within a time series, and forecast if the price would go up, down or remain the same (flat) in the future. In our experiments, we demonstrated the success of the proposed method in comparison to commonly adopted statistical and deep learning methods on forecasting intraday stock price change of S&P 500 constituents.

Submitted 10 April, 2023; originally announced April 2023.

Comments: Published at AAAI 2023 - AI for Financial Services Bridge

arXiv:2210.09897 [pdf, other]

doi 10.1145/3533271.3561753

Learning to simulate realistic limit order book markets from data as a World Agent

Authors: Andrea Coletta, Aymeric Moulin, Svitlana Vyetrenko, Tucker Balch

Abstract: Multi-agent market simulators usually require careful calibration to emulate real markets, which includes the number and the type of agents. Poorly calibrated simulators can lead to misleading conclusions, potentially causing severe loss when employed by investment banks, hedge funds, and traders to study and evaluate trading strategies. In this paper, we propose a world model simulator that accurately emulates a limit order book market -- it requires no agent calibration but rather learns the simulated market behavior directly from historical data. Traditional approaches fall short in learning and calibrating the trader population, as historical labeled data with details on each individual trader strategy is not publicly available. Our approach proposes to learn a unique "world" agent from historical data. It is intended to emulate the overall trader population, without the need of making assumptions about individual market agent strategies. We implement our world agent simulator models as a Conditional Generative Adversarial Network (CGAN), as well as a mixture of parametric distributions, and we compare our models against previous work. Qualitatively and quantitatively, we show that the proposed approaches consistently outperform previous work, providing more realism and responsiveness.

Submitted 26 September, 2022; originally announced October 2022.

arXiv:2210.08569 [pdf, other]

Limited or Biased: Modeling Sub-Rational Human Investors in Financial Markets

Authors: Penghang Liu, Kshama Dwarakanath, Svitlana S Vyetrenko, Tucker Balch

Abstract: Human decision-making in real life deviates significantly from the optimal decisions made by fully rational agents, primarily due to computational limitations or psychological biases. While existing studies in behavioral finance have discovered various aspects of human sub-rationality, there is no comprehensive framework to transfer these findings into an adaptive human model applicable across diverse financial market scenarios. In this study, we introduce a flexible model that incorporates five different aspects of human sub-rationality using reinforcement learning. Our model is trained using a high-fidelity multi-agent market simulator, which overcomes limitations associated with the scarcity of labeled data of individual investors. We evaluate the behavior of sub-rational human investors using hand-crafted market scenarios and SHAP value analysis, showing that our model accurately reproduces the observations in the previous studies and reveals insights into the driving factors of human behavior. Finally, we explore the impact of sub-rationality on the investor's Profit and Loss (PnL) and market quality. Our experiments reveal that bounded-rational and prospect-biased human behaviors improve liquidity but diminish price efficiency, whereas human behavior influenced by myopia, optimism, and pessimism reduces market liquidity.

Submitted 8 March, 2024; v1 submitted 16 October, 2022; originally announced October 2022.

arXiv:2210.07184 [pdf, other]

Towards Multi-Agent Reinforcement Learning driven Over-The-Counter Market Simulations

Authors: Nelson Vadori, Leo Ardon, Sumitra Ganesh, Thomas Spooner, Selim Amrouni, Jared Vann, Mengda Xu, Zeyu Zheng, Tucker Balch, Manuela Veloso

Abstract: We study a game between liquidity provider and liquidity taker agents interacting in an over-the-counter market, for which the typical example is foreign exchange. We show how a suitable design of parameterized families of reward functions coupled with shared policy learning constitutes an efficient solution to this problem. By playing against each other, our deep-reinforcement-learning-driven agents learn emergent behaviors relative to a wide spectrum of objectives encompassing profit-and-loss, optimal execution and market share. In particular, we find that liquidity providers naturally learn to balance hedging and skewing, where skewing refers to setting their buy and sell prices asymmetrically as a function of their inventory. We further introduce a novel RL-based calibration algorithm which we found performed well at imposing constraints on the game equilibrium. On the theoretical side, we are able to show convergence rates for our multi-agent policy gradient algorithm under a transitivity assumption, closely related to generalized ordinal potential games.

Submitted 1 August, 2023; v1 submitted 13 October, 2022; originally announced October 2022.
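
The abstract above describes skewing as setting buy and sell prices asymmetrically as a function of inventory. A minimal sketch of that idea follows; it is not the paper's learned policy. The function name `skewed_quotes`, the linear shift rule, and the parameter `skew_per_unit` are illustrative assumptions only.

```python
def skewed_quotes(mid, half_spread, inventory, skew_per_unit):
    """Return (bid, ask) quotes shifted against the current inventory.

    Assumption for illustration: the skew is linear in inventory. A long
    inventory (positive) lowers both quotes, which makes selling to takers
    more likely and buying from them less likely; a short inventory does
    the opposite. The quoted spread itself is left unchanged.
    """
    shift = skew_per_unit * inventory
    bid = mid - half_spread - shift
    ask = mid + half_spread - shift
    return bid, ask


if __name__ == "__main__":
    # Flat inventory: quotes are symmetric around the mid price.
    print(skewed_quotes(100.0, 0.5, 0, 0.01))
    # Long 10 units: both quotes move down by 10 * 0.01.
    print(skewed_quotes(100.0, 0.5, 10, 0.01))
```

In the paper the balance between hedging and skewing is learned by the agents; the fixed linear rule here only shows what an inventory-dependent asymmetry looks like.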

arXiv:2202.00941 [pdf, other]

CTMSTOU driven markets: simulated environment for regime-awareness in trading policies

Authors: Selim Amrouni, Aymeric Moulin, Tucker Balch

Abstract: Market regimes are a popular topic in quantitative finance even though there is little consensus on the details of how they should be defined. They arise as a feature both in financial market prediction problems and financial market task performing problems. In this work we use discrete event time multi-agent market simulation to freely experiment in a reproducible and understandable environment where regimes can be explicitly switched and enforced. We introduce a novel stochastic process to model the fundamental value perceived by market participants: Continuous-Time Markov Switching Trending Ornstein-Uhlenbeck (CTMSTOU), which facilitates the study of trading policies in regime switching markets. We define the notion of regime-awareness for a trading agent as well and illustrate its importance through the study of different order placement strategies in the context of order execution problems.

Submitted 3 February, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

Comments: fix typo in title
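
The abstract above names the process but does not give its equations, so the following is only a rough sketch of what a Markov-switching, trending Ornstein-Uhlenbeck path could look like. Every modeling choice here is an assumption: the Euler discretization, a single switching rate shared by all regimes, uniform jumps to another regime, and the names `simulate_switching_trending_ou`, `trends`, `switch_rate`, `kappa`, and `sigma`.

```python
import math
import random


def simulate_switching_trending_ou(n_steps, dt, trends, switch_rate,
                                   kappa, sigma, x0, seed=0):
    """Simulate a regime-switching OU path around a trending mean.

    Assumed dynamics (not taken from the paper):
      - the regime leaves its current state at rate `switch_rate`,
        approximated per step by p = 1 - exp(-switch_rate * dt);
      - the long-run mean drifts at `trends[regime]` per unit time;
      - the value mean-reverts to that mean at speed `kappa` with
        volatility `sigma` (Euler-Maruyama step).
    """
    rng = random.Random(seed)
    regime, mean, x = 0, x0, x0
    path, regimes = [x], [regime]
    p_switch = 1.0 - math.exp(-switch_rate * dt)
    for _ in range(n_steps):
        if len(trends) > 1 and rng.random() < p_switch:
            # Jump uniformly to one of the other regimes.
            regime = rng.choice([r for r in range(len(trends)) if r != regime])
        mean += trends[regime] * dt
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x += kappa * (mean - x) * dt + noise
        path.append(x)
        regimes.append(regime)
    return path, regimes


if __name__ == "__main__":
    path, regimes = simulate_switching_trending_ou(
        n_steps=1000, dt=0.01, trends=[0.5, -0.5], switch_rate=0.2,
        kappa=2.0, sigma=0.3, x0=100.0)
    print(path[-1], regimes[-1])
```

Because the regime sequence is returned alongside the path, an experiment can condition a trading policy on the true regime, which is the kind of explicit regime control the abstract describes.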

arXiv:2112.03874 [pdf, other]

doi 10.1145/3533271.3561755

Efficient Calibration of Multi-Agent Simulation Models from Output Series with Bayesian Optimization

Authors: Yuanlu Bai, Henry Lam, Svitlana Vyetrenko, Tucker Balch

Abstract: Multi-agent simulation is commonly used across multiple disciplines, specifically in artificial intelligence in recent years, which creates an environment for downstream machine learning or reinforcement learning tasks. In many practical scenarios, however, only the output series that result from the interactions of simulation agents are observable. Therefore, simulators need to be calibrated so that the simulated output series resemble historical ones -- which amounts to solving a complex simulation optimization problem. In this paper, we propose a simple and efficient framework for calibrating simulator parameters from historical output series observations. First, we consider a novel concept of eligibility set to bypass the potential non-identifiability issue. Second, we generalize the two-sample Kolmogorov-Smirnov (K-S) test with Bonferroni correction to test the similarity between two high-dimensional distributions, which gives a simple yet effective distance metric between the output series sample sets. Third, we suggest using Bayesian optimization (BO) and trust-region BO (TuRBO) to minimize the aforementioned distance metric. Finally, we demonstrate the efficiency of our framework using numerical experiments on a multi-agent financial market simulator.

Submitted 20 September, 2022; v1 submitted 3 December, 2021; originally announced December 2021.

Comments: This paper has been accepted and will be published in ICAIF 2022 proceedings
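
The second step in the abstract above, a two-sample K-S test generalized to high dimensions with a Bonferroni correction, can be sketched as follows. This is one plausible reading, not the paper's exact construction: it applies the one-dimensional K-S statistic to each coordinate, takes the largest value as the distance, and compares it with the standard asymptotic critical value at level alpha divided by the number of coordinates. The function names are hypothetical.

```python
import math
from bisect import bisect_right


def ks_statistic(xs, ys):
    """Two-sample K-S statistic: the largest gap between the two ECDFs."""
    xs, ys = sorted(xs), sorted(ys)
    n, m = len(xs), len(ys)
    gap = 0.0
    # Both ECDFs are step functions, so the supremum is reached at a data point.
    for v in xs + ys:
        gap = max(gap, abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m))
    return gap


def bonferroni_ks_distance(sample_a, sample_b, alpha=0.05):
    """Return (distance, critical_value, reject) for two samples of vectors.

    Assumption: the distance is the maximum per-coordinate K-S statistic,
    and the critical value uses the asymptotic approximation
    sqrt(-ln(a / 2) / 2) * sqrt((n + m) / (n * m)) with a = alpha / dims.
    """
    dims = len(sample_a[0])
    n, m = len(sample_a), len(sample_b)
    stats = [
        ks_statistic([row[j] for row in sample_a], [row[j] for row in sample_b])
        for j in range(dims)
    ]
    adjusted = alpha / dims  # Bonferroni correction across coordinates
    critical = math.sqrt(-0.5 * math.log(adjusted / 2.0)) * math.sqrt((n + m) / (n * m))
    distance = max(stats)
    return distance, critical, distance > critical


if __name__ == "__main__":
    real = [(i, i % 7) for i in range(50)]
    simulated = [(i + 100, i % 7) for i in range(50)]
    print(bonferroni_ks_distance(real, simulated))
```

The returned distance is the quantity an optimizer such as BO or TuRBO would minimize over simulator parameters; the rejection flag is only a convenience for thresholding.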

arXiv:2110.14771 [pdf, other]

doi 10.1145/3490354.3494433

ABIDES-Gym: Gym Environments for Multi-Agent Discrete Event Simulation and Application to Financial Markets

Authors: Selim Amrouni, Aymeric Moulin, Jared Vann, Svitlana Vyetrenko, Tucker Balch, Manuela Veloso

Abstract: Model-free Reinforcement Learning (RL) requires the ability to sample trajectories by taking actions in the original problem environment or a simulated version of it. Breakthroughs in the field of RL have been largely facilitated by the development of dedicated open source simulators with easy to use frameworks such as OpenAI Gym and its Atari environments. In this paper we propose to use the OpenAI Gym framework on discrete event time based Discrete Event Multi-Agent Simulation (DEMAS). We introduce a general technique to wrap a DEMAS simulator into the Gym framework. We expose the technique in detail and implement it using the simulator ABIDES as a base. We apply this work by specifically using the markets extension of ABIDES, ABIDES-Markets, and develop two benchmark financial markets OpenAI Gym environments for training daily investor and execution agents. As a result, these two environments describe classic financial problems with a complex interactive market behavior response to the experimental agent's action.

Submitted 27 October, 2021; originally announced October 2021.

arXiv:2110.13287 [pdf, other]

Towards Realistic Market Simulations: a Generative Adversarial Networks Approach

Authors: Andrea Coletta, Matteo Prata, Michele Conti, Emanuele Mercanti, Novella Bartolini, Aymeric Moulin, Svitlana Vyetrenko, Tucker Balch

Abstract: Simulated environments are increasingly used by trading firms and investment banks to evaluate trading strategies before approaching real markets. Backtesting, a widely used approach, consists of simulating experimental strategies while replaying historical market scenarios. Unfortunately, this approach does not capture the market response to the experimental agents' actions. In contrast, multi-agent simulation presents a natural bottom-up approach to emulating agent interaction in financial markets. It allows setting up pools of traders with diverse strategies to mimic the financial market trader population, and testing the performance of new experimental strategies. Since individual agent-level historical data is typically proprietary and not available for public use, it is difficult to calibrate multiple market agents to obtain the realism required for testing trading strategies. To address this challenge we propose a synthetic market generator based on Conditional Generative Adversarial Networks (CGANs) trained on real aggregate-level historical data. A CGAN-based "world" agent can generate meaningful orders in response to an experimental agent. We integrate our synthetic market generator into ABIDES, an open source simulator of financial markets. By means of extensive simulations we show that our proposal outperforms previous work in terms of stylized facts reflecting market responsiveness and realism.

Submitted 25 October, 2021; originally announced October 2021.

Comments: 8 pages, 9 figures, ICAIF'21 - 2nd ACM International Conference on AI in Finance

MSC Class: I.2; I.6 ACM Class: I.2; I.6

arXiv:2110.01325 [pdf, other]

doi 10.1145/3490354.3494386

Learning to Classify and Imitate Trading Agents in Continuous Double Auction Markets

Authors: Mahmoud Mahfouz, Tucker Balch, Manuela Veloso, Danilo Mandic

Abstract: Continuous double auctions such as the limit order book employed by exchanges are widely used in practice to match buyers and sellers of a variety of financial instruments. In this work, we develop an agent-based model for trading in a limit order book and show (1) how opponent modelling techniques can be applied to classify trading agent archetypes and (2) how behavioural cloning can be used to imitate these agents in a simulated setting. We experimentally compare a number of techniques for both tasks and evaluate their applicability and use in real-world scenarios.

Submitted 29 October, 2021; v1 submitted 4 October, 2021; originally announced October 2021.

arXiv:2108.00664 [pdf, other]

Learning who is in the market from time series: market participant discovery through adversarial calibration of multi-agent simulators

Authors: Victor Storchan, Svitlana Vyetrenko, Tucker Balch

Abstract: In electronic trading markets, often only the price or volume time series that result from the interaction of multiple market participants are directly observable. In order to test trading strategies before deploying them to real-time trading, multi-agent market environments are often used, calibrated so that the time series that result from the interaction of simulated agents resemble historical ones. To ensure adequate testing, one must test trading strategies in a variety of market scenarios -- which includes both scenarios that represent ordinary market days as well as stressed markets (most recently observed due to the beginning of the COVID pandemic). In this paper, we address the problem of multi-agent simulator parameter calibration to allow the simulator to capture characteristics of different market regimes. We propose a novel two-step method to train a discriminator that is able to distinguish between "real" and "fake" price and volume time series as a part of GAN with self-attention, and then utilize it within an optimization framework to tune parameters of a simulator model with known agent archetypes to represent a market scenario. We conclude with experimental results that demonstrate the effectiveness of our method.

Submitted 2 August, 2021; originally announced August 2021.

arXiv:2107.01273

Visual Time Series Forecasting: An Image-driven Approach

Authors: Naftali Cohen, Srijan Sood, Zhen Zeng, Tucker Balch, Manuela Veloso

Abstract: In this work, we address time-series forecasting as a computer vision task. We capture input data as an image and train a model to produce the subsequent image. This approach results in predicting distributions as opposed to pointwise values. To assess the robustness and quality of our approach, we examine various datasets and multiple evaluation metrics. Our experiments show that our forecasting tool is effective for cyclic data but somewhat less so for irregular data such as stock prices. Importantly, when using image-based evaluation metrics, we find our method to outperform various baselines, including ARIMA, and a numerical variation of our deep learning approach.

Submitted 15 November, 2021; v1 submitted 2 July, 2021; originally announced July 2021.

Comments: This work was intended as a replacement of arXiv:2011.09052 and any subsequent updates will appear there

arXiv:2006.08682 [pdf, other]

The Importance of Low Latency to Order Book Imbalance Trading Strategies

Authors: David Byrd, Sruthi Palaparthi, Maria Hybinette, Tucker Hybinette Balch

Abstract: There is a pervasive assumption that low latency access to an exchange is a key factor in the profitability of many high-frequency trading strategies. This belief is evidenced by the "arms race" undertaken by certain financial firms to co-locate with exchange servers. To the best of our knowledge, our study is the first to validate and quantify this assumption in a continuous double auction market with a single exchange similar to the New York Stock Exchange. It is not feasible to conduct this exploration with historical data in which trader identity and location are not reported. Accordingly, we investigate the relationship between latency of access to order book information and profitability of trading strategies exploiting that information with an agent-based interactive discrete event simulation in which thousands of agents pursue archetypal trading strategies. We introduce experimental traders pursuing a low-latency order book imbalance (OBI) strategy in a controlled manner across thousands of simulated trading days, and analyze OBI trader profit while varying distance (latency) from the exchange. Our experiments support that latency is inversely related to profit for the OBI traders, but more interestingly show that latency rank, rather than absolute magnitude, is the key factor in allocating returns among agents pursuing a similar strategy.

Submitted 15 June, 2020; originally announced June 2020.

arXiv:1912.04941 [pdf, other]

Get Real: Realism Metrics for Robust Limit Order Book Market Simulations

Authors: Svitlana Vyetrenko, David Byrd, Nick Petosa, Mahmoud Mahfouz, Danial Dervovic, Manuela Veloso, Tucker Hybinette Balch

Abstract: Machine learning (especially reinforcement learning) methods for trading are increasingly reliant on simulation for agent training and testing. Furthermore, simulation is important for validation of hand-coded trading strategies and for testing hypotheses about market structure. A challenge, however, concerns the robustness of policies validated in simulation because the simulations lack fidelity. In fact, researchers have shown that many market simulation approaches fail to reproduce statistics and stylized facts seen in real markets. As a step towards addressing this we surveyed the literature to collect a set of reference metrics and applied them to real market data and simulation output. Our paper provides a comprehensive catalog of these metrics including mathematical formulations where appropriate. Our results show that there are still significant discrepancies between simulated markets and real ones. However, this work serves as a benchmark against which we can measure future improvement.

Submitted 10 December, 2019; originally announced December 2019.

Journal ref: NeurIPS 2019 Workshop on Robust AI in Financial Services: Data, Fairness, Explainability, Trustworthiness, and Privacy

arXiv:1911.12816

On the Importance of Opponent Modeling in Auction Markets

Authors: Mahmoud Mahfouz, Angelos Filos, Cyrine Chtourou, Joshua Lockhart, Samuel Assefa, Manuela Veloso, Danilo Mandic, Tucker Balch

Abstract: The dynamics of financial markets are driven by the interactions between participants, as well as the trading mechanisms and regulatory frameworks that govern these interactions. Decision-makers would rather not ignore the impact of other participants on these dynamics and should employ tools and models that take this into account. To this end, we demonstrate the efficacy of applying opponent-modeling in a number of simulated market settings. While our simulations are simplified representations of actual market dynamics, they provide an idealized "playground" in which our techniques can be demonstrated and tested. We present this work with the aim that our techniques could be refined and, with some effort, scaled up to the full complexity of real-world market scenarios. We hope that the results presented encourage practitioners to adopt opponent-modeling methods and apply them to online systems, in order to enable not only reactive but also proactive decisions to be made.

Submitted 28 November, 2019; originally announced November 2019.

arXiv:1908.08168

Intra-day Equity Price Prediction using Deep Learning as a Measure of Market Efficiency

Authors: David Byrd, Tucker Hybinette Balch

Abstract: In finance, the weak form of the Efficient Market Hypothesis asserts that historic stock price and volume data cannot inform predictions of future prices. In this paper we show that, to the contrary, future intra-day stock prices could be predicted effectively until 2009. We demonstrate this using two different profitable machine learning-based trading strategies. However, the effectiveness of both approaches diminishes over time, and neither of them is profitable after 2009. We present our implementation and results in detail for the period 2003-2017 and propose a novel idea: the use of such flexible machine learning methods as an objective measure of relative market efficiency. We conclude with a candidate explanation, comparing our returns over time with high-frequency trading volume, and suggest concrete steps for further investigation.

Submitted 21 August, 2019; originally announced August 2019.

Comments: In journal submission

arXiv:1907.10046

Trading via Image Classification

Authors: Naftali Cohen, Tucker Balch, Manuela Veloso

Abstract: The art of systematic financial trading evolved with an array of approaches, ranging from simple strategies to complex algorithms, all relying primarily on aspects of time-series analysis. Recently, after visiting the trading floor of a leading financial institution, we noticed that traders always execute their trade orders while observing images of financial time-series on their screens. In this work, we build upon the success in image recognition and examine the value in transforming the traditional time-series analysis to that of image classification. We create a large sample of financial time-series images encoded as candlestick (Box and Whisker) charts and label the samples following three algebraically-defined binary trade strategies. Using the images, we train over a dozen machine-learning classification models and find that the algorithms are very efficient in recovering the complicated, multiscale label-generating rules when the data is represented visually. We suggest that the transformation of a continuous numeric time-series classification problem to a vision problem is useful for recovering signals typical of technical analysis.

Submitted 26 October, 2020; v1 submitted 23 July, 2019; originally announced July 2019.

arXiv:1907.09567

The Effect of Visual Design in Image Classification

Authors: Naftali Cohen, Tucker Balch, Manuela Veloso

Abstract: Financial companies continuously analyze the state of the markets to rethink and adjust their investment strategies. While the analysis is done on the digital form of data, decisions are often made based on graphical representations in white papers or presentation slides. In this study, we examine whether binary decisions are better made based on the numeric or the visual representation of the same data. Using two data sets, a matrix of numerical data with spatial dependencies and financial data describing the state of the S&P index, we compare the results of supervised classification based on the original numerical representation and the visual transformation of the same data. We show that, for these data sets, the visual transformation results in higher predictability skill compared to the original form of the data. We suggest thinking of the visual representation of numeric data, effectively, as a combination of dimensional reduction and feature engineering techniques, in particular if the visual layout encapsulates the full complexity of the data. In this view, thoughtful visual design can guard against overfitting, or introduce new features -- all of which benefit the learning process, and effectively lead to better recognition of meaningful patterns.

Submitted 20 August, 2019; v1 submitted 22 July, 2019; originally announced July 2019.

arXiv:1906.12010

How to Evaluate Trading Strategies: Single Agent Market Replay or Multiple Agent Interactive Simulation?

Authors: Tucker Hybinette Balch, Mahmoud Mahfouz, Joshua Lockhart, Maria Hybinette, David Byrd

Abstract: We show how a multi-agent simulator can support two important but distinct methods for assessing a trading strategy: Market Replay and Interactive Agent-Based Simulation (IABS). Our solution is important because each method offers strengths and weaknesses that expose or conceal flaws in the subject strategy. A key weakness of Market Replay is that the simulated market does not substantially adapt to or respond to the presence of the experimental strategy. IABS methods provide an artificial market for the experimental strategy using a population of background trading agents. Because the background agents attend to market conditions and current price as part of their strategy, the overall market is responsive to the presence of the experimental strategy. Even so, IABS methods have their own weaknesses, primarily that it is unclear if the market environment they provide is realistic. We describe our approach in detail, and illustrate its use in an example application: The evaluation of market impact for various size orders.

Submitted 27 June, 2019; originally announced June 2019.

Journal ref: Presented at the 2019 ICML Workshop on AI in Finance: Applications and Infrastructure for Multi-Agent Learning. Long Beach, CA
