End-to-End Policy Learning of a Statistical Arbitrage Autoencoder Architecture

Authors: Fabian Krause, Jan-Peter Calliess

Abstract: In Statistical Arbitrage (StatArb), classical mean reversion trading strategies typically hinge on asset-pricing or PCA based models to identify the mean of a synthetic asset. Once such a (linear) model is identified, a separate mean reversion strategy is then devised to generate a trading signal. With a view of generalising such an approach and turning it truly data-driven, we study the utility of Autoencoder architectures in StatArb. As a first approach, we employ a standard Autoencoder trained on US stock returns to derive trading strategies based on the Ornstein-Uhlenbeck (OU) process. To further enhance this model, we take a policy-learning approach and embed the Autoencoder network into a neural network representation of a space of portfolio trading policies. This integration outputs portfolio allocations directly and is end-to-end trainable by backpropagation of the risk-adjusted returns of the neural policy. Our findings demonstrate that this innovative end-to-end policy learning approach not only simplifies the strategy development process, but also yields superior gross returns over its competitors illustrating the potential of end-to-end training over classical two-stage approaches.

Submitted 13 February, 2024; originally announced February 2024.

Comments: 11 pages, 1 figure
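
Illustrative sketch (not taken from the paper): the abstract above mentions trading strategies based on the Ornstein-Uhlenbeck (OU) process. One common way to work with an OU model is to fit a discrete-time AR(1) regression to a residual series and convert the coefficients into a mean-reversion speed, a long-run mean and an equilibrium standard deviation, from which a standardised deviation ("s-score") can be computed. The function names `simulate_ou`, `fit_ou_ar1` and `s_score`, the parameter values and the simulated data are all assumptions made for this example; the paper's actual estimation procedure may differ.

```python
import math
import random
from typing import List, Sequence, Tuple


def simulate_ou(n: int, kappa: float, mean: float, sigma: float,
                dt: float = 1.0, x0: float = 0.0, seed: int = 0) -> List[float]:
    """Simulate an OU path using its exact discrete-time transition."""
    rng = random.Random(seed)
    decay = math.exp(-kappa * dt)
    step_sd = sigma * math.sqrt((1.0 - decay * decay) / (2.0 * kappa))
    xs = [x0]
    for _ in range(n - 1):
        xs.append(mean + (xs[-1] - mean) * decay + rng.gauss(0.0, step_sd))
    return xs


def fit_ou_ar1(x: Sequence[float], dt: float = 1.0) -> Tuple[float, float, float]:
    """Estimate (kappa, mean, sigma_eq) from the regression x[t+1] = a + b * x[t] + noise."""
    x_prev, x_next = list(x[:-1]), list(x[1:])
    n = len(x_prev)
    mean_prev = sum(x_prev) / n
    mean_next = sum(x_next) / n
    cov = sum((p - mean_prev) * (q - mean_next) for p, q in zip(x_prev, x_next))
    var = sum((p - mean_prev) ** 2 for p in x_prev)
    b = cov / var
    a = mean_next - b * mean_prev
    if not 0.0 < b < 1.0:
        raise ValueError("series does not look mean-reverting (need 0 < b < 1)")
    resid_var = sum((q - a - b * p) ** 2 for p, q in zip(x_prev, x_next)) / (n - 2)
    kappa = -math.log(b) / dt                        # mean-reversion speed
    long_run_mean = a / (1.0 - b)                    # level the process reverts to
    sigma_eq = math.sqrt(resid_var / (1.0 - b * b))  # stationary standard deviation
    return kappa, long_run_mean, sigma_eq


def s_score(x_t: float, long_run_mean: float, sigma_eq: float) -> float:
    """Standardised distance of the current value from the long-run mean."""
    return (x_t - long_run_mean) / sigma_eq


# Hypothetical data: a simulated residual series with known parameters.
path = simulate_ou(20_000, kappa=0.5, mean=1.0, sigma=0.2, x0=1.0)
kappa_hat, mean_hat, sigma_eq_hat = fit_ou_ar1(path)
latest_score = s_score(path[-1], mean_hat, sigma_eq_hat)
```

In a two-stage strategy of the kind the abstract contrasts with end-to-end training, a large positive or negative score would typically trigger a short or long position in the synthetic asset; the thresholds are a separate design choice not shown here.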

arXiv:2301.08688

doi 10.3389/frai.2023.1151003

Asynchronous Deep Double Duelling Q-Learning for Trading-Signal Execution in Limit Order Book Markets

Authors: Peer Nagy, Jan-Peter Calliess, Stefan Zohren

Abstract: We employ deep reinforcement learning (RL) to train an agent to successfully translate a high-frequency trading signal into a trading strategy that places individual limit orders. Based on the ABIDES limit order book simulator, we build a reinforcement learning OpenAI gym environment and utilise it to simulate a realistic trading environment for NASDAQ equities based on historic order book messages. To train a trading agent that learns to maximise its trading return in this environment, we use Deep Duelling Double Q-learning with the APEX (asynchronous prioritised experience replay) architecture. The agent observes the current limit order book state, its recent history, and a short-term directional forecast. To investigate the performance of RL for adaptive trading independently from a concrete forecasting algorithm, we study the performance of our approach utilising synthetic alpha signals obtained by perturbing forward-looking returns with varying levels of noise. Here, we find that the RL agent learns an effective trading strategy for inventory management and order placing that outperforms a heuristic benchmark trading strategy having access to the same signal.

Submitted 25 September, 2023; v1 submitted 20 January, 2023; originally announced January 2023.

Journal ref: Front. Artif. Intell., 25 September 2023 Sec. Artificial Intelligence in Finance Volume 6 - 2023

arXiv:2109.05317

Bayesian Topic Regression for Causal Inference

Authors: Maximilian Ahrens, Julian Ashwin, Jan-Peter Calliess, Vu Nguyen

Abstract: Causal inference using observational text data is becoming increasingly popular in many research areas. This paper presents the Bayesian Topic Regression (BTR) model that uses both text and numerical information to model an outcome variable. It allows estimation of both discrete and continuous treatment effects. Furthermore, it allows for the inclusion of additional numerical confounding factors next to text data. To this end, we combine a supervised Bayesian topic model with a Bayesian regression framework and perform supervised representation learning for the text features jointly with the regression parameter training, respecting the Frisch-Waugh-Lovell theorem. Our paper makes two main contributions. First, we provide a regression framework that allows causal inference in settings when both text and numerical confounders are of relevance. We show with synthetic and semi-synthetic datasets that our joint approach recovers ground truth with lower bias than any benchmark model, when text and numerical features are correlated. Second, experiments on two real-world datasets demonstrate that a joint and supervised learning strategy also yields superior prediction results compared to strategies that estimate regression weights for text and non-text features separately, being even competitive with more complex deep neural networks.

Submitted 11 September, 2021; originally announced September 2021.

Comments: accepted as a conference paper at EMNLP 2021

arXiv:2011.06430

doi 10.1038/s41598-021-82338-6

Sentiment Correlation in Financial News Networks and Associated Market Movements

Authors: Xingchen Wan, Jie Yang, Slavi Marinov, Jan-Peter Calliess, Stefan Zohren, Xiaowen Dong

Abstract: In an increasingly connected global market, news sentiment towards one company may not only indicate its own market performance, but can also be associated with a broader movement on the sentiment and performance of other companies from the same or even different sectors. In this paper, we apply NLP techniques to understand news sentiment of 87 companies among the most reported on Reuters for a period of seven years. We investigate the propagation of such sentiment in company networks and evaluate the associated market movements in terms of stock price and volatility. Our results suggest that, in certain sectors, strong media sentiment towards one company may indicate a significant change in media sentiment towards related companies measured as neighbours in a financial network constructed from news co-occurrence. Furthermore, there exists a weak but statistically significant association between strong media sentiment and abnormal market return as well as volatility. Such an association is more significant at the level of individual companies, but nevertheless remains visible at the level of sectors or groups of companies.

Submitted 13 February, 2021; v1 submitted 5 November, 2020; originally announced November 2020.

Comments: 12 pages, 5 figures, 1 table (29 pages including References and Appendices). Published in Scientific Reports 11

Journal ref: Sci. Rep. 11, 3062 (2021)

arXiv:2008.07871

Fast Agent-Based Simulation Framework with Applications to Reinforcement Learning and the Study of Trading Latency Effects

Authors: Peter Belcak, Jan-Peter Calliess, Stefan Zohren

Abstract: We introduce a new software toolbox for agent-based simulation. Facilitating rapid prototyping by offering a user-friendly Python API, its core rests on an efficient C++ implementation to support simulation of large-scale multi-agent systems. Our software environment benefits from a versatile message-driven architecture. Originally developed to support research on financial markets, it offers the flexibility to simulate a wide-range of different (easily customisable) market rules and to study the effect of auxiliary factors, such as delays, on the market dynamics. As a simple illustration, we employ our toolbox to investigate the role of the order processing delay in normal trading and for the scenario of a significant price change. Owing to its general architecture, our toolbox can also be employed as a generic multi-agent system simulator. We provide an example of such a non-financial application by simulating a mechanism for the coordination of no-regret learning agents in a multi-agent network routing scenario previously proposed in the literature.

Submitted 21 September, 2022; v1 submitted 18 August, 2020; originally announced August 2020.

Comments: Presented at the International Workshop on Multi-Agent Systems and Agent-Based Simulation (MABS@AAMAS) 2021, 12 pages, 8 figures

arXiv:1912.00071

Safety Guarantees for Planning Based on Iterative Gaussian Processes

Authors: Kyriakos Polymenakos, Luca Laurenti, Andrea Patane, Jan-Peter Calliess, Luca Cardelli, Marta Kwiatkowska, Alessandro Abate, Stephen Roberts

Abstract: Gaussian Processes (GPs) are widely employed in control and learning because of their principled treatment of uncertainty. However, tracking uncertainty for iterative, multi-step predictions in general leads to an analytically intractable problem. While approximation methods exist, they do not come with guarantees, making it difficult to estimate their reliability and to trust their predictions. In this work, we derive formal probability error bounds for iterative prediction and planning with GPs. Building on GP properties, we bound the probability that random trajectories lie in specific regions around the predicted values. Namely, given a tolerance $ε > 0$, we compute regions around the predicted trajectory values, such that GP trajectories are guaranteed to lie inside them with probability at least $1-ε$. We verify experimentally that our method tracks the predictive uncertainty correctly, even when current approximation techniques fail. Furthermore, we show how the proposed bounds can be employed within a safe reinforcement learning framework to verify the safety of candidate control policies, guiding the synthesis of provably safe controllers.

Submitted 7 September, 2020; v1 submitted 29 November, 2019; originally announced December 2019.

Comments: An earlier version of this work presented in NeurIPS-2019 Workshop on Safety and Robustness in Decision Making. A shorter (but otherwise equivalent) paper was accepted to the 59th Conference on Decision and Control (CDC2020)

arXiv:1901.10452

Asynchronous Batch Bayesian Optimisation with Improved Local Penalisation

Authors: Ahsan S. Alvi, Binxin Ru, Jan Calliess, Stephen J. Roberts, Michael A. Osborne

Abstract: Batch Bayesian optimisation (BO) has been successfully applied to hyperparameter tuning using parallel computing, but it is wasteful of resources: workers that complete jobs ahead of others are left idle. We address this problem by developing an approach, Penalising Locally for Asynchronous Bayesian Optimisation on $k$ workers (PLAyBOOK), for asynchronous parallel BO. We demonstrate empirically the efficacy of PLAyBOOK and its variants on synthetic tasks and a real-world problem. We undertake a comparison between synchronous and asynchronous BO, and show that asynchronous BO often outperforms synchronous batch BO in both wall-clock time and number of function evaluations.

Submitted 28 May, 2019; v1 submitted 29 January, 2019; originally announced January 2019.

Comments: Camera-ready version after incorporating reviewers' suggestions

arXiv:1702.08898

Lipschitz Optimisation for Lipschitz Interpolation

Authors: Jan-Peter Calliess

Abstract: Techniques known as Nonlinear Set Membership prediction, Kinky Inference or Lipschitz Interpolation are fast and numerically robust approaches to nonparametric machine learning that have been proposed to be utilised in the context of system identification and learning-based control. They utilise presupposed Lipschitz properties in order to compute inferences over unobserved function values. Unfortunately, most of these approaches rely on exact knowledge about the input space metric as well as about the Lipschitz constant. Furthermore, existing techniques to estimate the Lipschitz constants from the data are not robust to noise or seem to be ad-hoc and typically are decoupled from the ultimate learning and prediction task. To overcome these limitations, we propose an approach for optimising parameters of the presupposed metrics by minimising validation set prediction errors. To avoid poor performance due to local minima, we propose to utilise Lipschitz properties of the optimisation objective to ensure global optimisation success. The resulting approach is a new flexible method for nonparametric black-box learning. We provide experimental evidence of the competitiveness of our approach on artificial as well as on real data.

Submitted 28 February, 2017; originally announced February 2017.
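
Illustrative sketch (not taken from the paper): the abstract of "Lipschitz Optimisation for Lipschitz Interpolation" refers to predictors that use a presupposed Lipschitz constant and input metric to bound unobserved function values. In its basic noise-free form, such a predictor takes the tightest upper and lower envelopes consistent with the data and predicts their midpoint. The function name `lipschitz_interpolate`, the default absolute-difference metric and the toy data below are assumptions for this example; the paper's contribution (optimising the metric parameters against validation error) is not shown.

```python
from typing import Callable, Sequence, Tuple


def lipschitz_interpolate(
    x: float,
    xs: Sequence[float],
    ys: Sequence[float],
    lipschitz_const: float,
    metric: Callable[[float, float], float] = lambda a, b: abs(a - b),
) -> Tuple[float, float, float]:
    """Return (prediction, lower bound, upper bound) at the query point x.

    Any function with the given Lipschitz constant that passes through the
    samples (xs, ys) must lie between the two bounds at x.
    """
    upper = min(y + lipschitz_const * metric(x, xi) for xi, y in zip(xs, ys))
    lower = max(y - lipschitz_const * metric(x, xi) for xi, y in zip(xs, ys))
    return 0.5 * (upper + lower), lower, upper


# Hypothetical noise-free samples of an unknown function.
sample_xs = [0.0, 1.0, 2.0]
sample_ys = [0.0, 1.0, 0.0]

# Query between the first two samples with an assumed Lipschitz constant of 2.
prediction, lower_bound, upper_bound = lipschitz_interpolate(
    0.5, sample_xs, sample_ys, lipschitz_const=2.0
)
```

The width of the interval between the bounds shrinks as the assumed constant decreases or the data become denser, which is why the choice of constant and metric matters for these methods.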

arXiv:1701.00178

Lazily Adapted Constant Kinky Inference for Nonparametric Regression and Model-Reference Adaptive Control

Authors: Jan-Peter Calliess

Abstract: Techniques known as Nonlinear Set Membership prediction, Lipschitz Interpolation or Kinky Inference are approaches to machine learning that utilise presupposed Lipschitz properties to compute inferences over unobserved function values. Provided a bound on the true best Lipschitz constant of the target function is known a priori they offer convergence guarantees as well as bounds around the predictions. Considering a more general setting that builds on Hoelder continuity relative to pseudo-metrics, we propose an online method for estimating the Hoelder constant online from function value observations that possibly are corrupted by bounded observational errors. Utilising this to compute adaptive parameters within a kinky inference rule gives rise to a nonparametric machine learning method, for which we establish strong universal approximation guarantees. That is, we show that our prediction rule can learn any continuous function in the limit of increasingly dense data to within a worst-case error bound that depends on the level of observational uncertainty. We apply our method in the context of nonparametric model-reference adaptive control (MRAC). Across a range of simulated aircraft roll-dynamics and performance metrics our approach outperforms recently proposed alternatives that were based on Gaussian processes and RBF-neural networks. For discrete-time systems, we provide guarantees on the tracking success of our learning-based controllers both for the batch and the online learning setting.

Submitted 10 March, 2021; v1 submitted 31 December, 2016; originally announced January 2017.

ACM Class: I.2.6; G.1.2; G.3; I.2.8

arXiv:1402.4157

Conservative collision prediction and avoidance for stochastic trajectories in continuous time and space

Authors: Jan-Peter Calliess, Michael Osborne, Stephen Roberts

Abstract: Existing work in multi-agent collision prediction and avoidance typically assumes discrete-time trajectories with Gaussian uncertainty or that are completely deterministic. We propose an approach that allows detection of collisions even between continuous, stochastic trajectories with the only restriction that means and variances can be computed. To this end, we employ probabilistic bounds to derive criterion functions whose negative sign provably is indicative of probable collisions. For criterion functions that are Lipschitz, an algorithm is provided to rapidly find negative values or prove their absence. We propose an iterative policy-search approach that avoids prior discretisations and yields collision-free trajectories with adjustably high certainty. We test our method with both fixed-priority and auction-based protocols for coordinating the iterative planning process. Results are provided in collision-avoidance simulations of feedback controlled plants.

Submitted 12 May, 2014; v1 submitted 17 February, 2014; originally announced February 2014.

Comments: This preprint is an extended version of a conference paper that is to appear in Proceedings of the 13th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2014)

arXiv:1311.4468

Stochastic processes and feedback-linearisation for online identification and Bayesian adaptive control of fully-actuated mechanical systems

Authors: Jan-Peter Calliess, Antonis Papachristodoulou, Stephen J. Roberts

Abstract: This work proposes a new method for simultaneous probabilistic identification and control of an observable, fully-actuated mechanical system. Identification is achieved by conditioning stochastic process priors on observations of configurations and noisy estimates of configuration derivatives. In contrast to previous work that has used stochastic processes for identification, we leverage the structural knowledge afforded by Lagrangian mechanics and learn the drift and control input matrix functions of the control-affine system separately. We utilise feedback-linearisation to reduce, in expectation, the uncertain nonlinear control problem to one that is easy to regulate in a desired manner. Thereby, our method combines the flexibility of nonparametric Bayesian learning with epistemological guarantees on the expected closed-loop trajectory. We illustrate our method in the context of torque-actuated pendula where the dynamics are learned with a combination of normal and log-normal processes.

Submitted 1 April, 2014; v1 submitted 18 November, 2013; originally announced November 2013.

MSC Class: 68T05; 68T40; 62G08; 37H10

ACM Class: I.2.9; I.2.8; I.2.6

arXiv:1104.5384

Chance-constrained Model Predictive Control for Multi-Agent Systems

Authors: Daniel Lyons, Jan-P. Calliess, Uwe D. Hanebeck

Abstract: We consider stochastic model predictive control of a multi-agent system with constraints on the probabilities of inter-agent collisions. We first study a sample-based approximation of the collision probabilities and use this approximation to formulate constraints for the stochastic control problem. This approximation will converge as the number of samples goes to infinity; however, the complexity of the resulting control problem is so high that this approach proves unsuitable for control under real-time requirements. To alleviate the computational burden we propose a second approach that uses probabilistic bounds to determine regions with increased probability of presence for each agent and formulate constraints for the control problem that guarantee that these regions will not overlap. We prove that the resulting problem is conservative for the original problem with probabilistic constraints, i.e. every control strategy that is feasible under our new constraints will automatically be feasible for the original problem. Furthermore we show in simulations in a UAV path planning scenario that our proposed approach grants significantly better run-time performance compared to a controller with the sample-based approximation with only a small degree of sub-optimality resulting from the conservativeness of our new approach.

Submitted 16 August, 2011; v1 submitted 28 April, 2011; originally announced April 2011.

Comments: 33 pages, 5 figures, revised version
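
Illustrative sketch (not taken from the paper): the abstract of "Chance-constrained Model Predictive Control for Multi-Agent Systems" describes a sample-based approximation of inter-agent collision probabilities. The idea can be shown for two agents at a single time step by drawing position samples and counting how often they come closer than a safety distance. The isotropic Gaussian position model, the function name `collision_probability`, the safety distance and all numeric values are assumptions for this example; the paper's own formulation embeds such estimates as constraints in a control problem, which is not shown.

```python
import math
import random
from typing import Tuple

Point = Tuple[float, float]


def collision_probability(
    mean_a: Point,
    mean_b: Point,
    std_a: float,
    std_b: float,
    safety_distance: float,
    n_samples: int = 10_000,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate of P(distance between agents < safety_distance).

    Each agent's 2D position is modelled as an isotropic Gaussian around its
    mean (an assumption made for this sketch).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        ax = rng.gauss(mean_a[0], std_a)
        ay = rng.gauss(mean_a[1], std_a)
        bx = rng.gauss(mean_b[0], std_b)
        by = rng.gauss(mean_b[1], std_b)
        if math.hypot(ax - bx, ay - by) < safety_distance:
            hits += 1
    return hits / n_samples


# Hypothetical scenarios: well-separated agents versus overlapping ones.
p_far = collision_probability((0.0, 0.0), (100.0, 0.0), 1.0, 1.0, safety_distance=2.0)
p_overlap = collision_probability((0.0, 0.0), (0.0, 0.0), 1.0, 1.0, safety_distance=2.0)
```

The estimate becomes more accurate as `n_samples` grows, at a proportional cost per evaluation, which reflects the trade-off the abstract raises between convergence and real-time suitability.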

Showing 1–12 of 12 results for author: Calliess, J