-
A Survey of Large Language Models for Financial Applications: Progress, Prospects and Challenges
Authors:
Yuqi Nie,
Yaxuan Kong,
Xiaowen Dong,
John M. Mulvey,
H. Vincent Poor,
Qingsong Wen,
Stefan Zohren
Abstract:
Recent advances in large language models (LLMs) have unlocked novel opportunities for machine learning applications in the financial domain. These models have demonstrated remarkable capabilities in understanding context, processing vast amounts of data, and generating human-preferred content. In this survey, we explore the application of LLMs to various financial tasks, focusing on their potential to transform traditional practices and drive innovation. We provide a discussion of the progress and advantages of LLMs in financial contexts, analyzing their advanced technologies as well as prospective capabilities in contextual understanding, transfer learning flexibility, complex emotion detection, etc. We then categorize the existing literature into key application areas, including linguistic tasks, sentiment analysis, financial time series, financial reasoning, agent-based modeling, and other applications. For each application area, we delve into specific methodologies, such as textual analysis, knowledge-based analysis, forecasting, data augmentation, planning, decision support, and simulations. Furthermore, a comprehensive collection of datasets, model assets, and useful code associated with mainstream applications is presented as a resource for researchers and practitioners. Finally, we outline the challenges and opportunities for future research, particularly emphasizing a number of distinctive aspects in this field. We hope our work can help facilitate the adoption and further development of LLMs in the financial sector.
Submitted 15 June, 2024;
originally announced June 2024.
-
Few-Shot Learning Patterns in Financial Time-Series for Trend-Following Strategies
Authors:
Kieran Wood,
Samuel Kessler,
Stephen J. Roberts,
Stefan Zohren
Abstract:
Forecasting models for systematic trading strategies do not adapt quickly when financial market conditions rapidly change, as was seen in the advent of the COVID-19 pandemic in 2020, causing many forecasting models to take loss-making positions. To deal with such situations, we propose a novel time-series trend-following forecaster that can quickly adapt to new market conditions, referred to as regimes. We leverage recent developments from the deep learning community and use few-shot learning. We propose the Cross Attentive Time-Series Trend Network -- X-Trend -- which takes positions attending over a context set of financial time-series regimes. X-Trend transfers trends from similar patterns in the context set to make forecasts, then subsequently takes positions for a new distinct target regime. By quickly adapting to new financial regimes, X-Trend increases Sharpe ratio by 18.9% over a neural forecaster and 10-fold over a conventional Time-series Momentum strategy during the turbulent market period from 2018 to 2023. Our strategy recovers twice as quickly from the COVID-19 drawdown compared to the neural-forecaster. X-Trend can also take zero-shot positions on novel unseen financial assets obtaining a 5-fold Sharpe ratio increase versus a neural time-series trend forecaster over the same period. Furthermore, the cross-attention mechanism allows us to interpret the relationship between forecasts and patterns in the context set.
Submitted 28 March, 2024; v1 submitted 16 October, 2023;
originally announced October 2023.
-
Dynamic Time Warping for Lead-Lag Relationships in Lagged Multi-Factor Models
Authors:
Yichi Zhang,
Mihai Cucuringu,
Alexander Y. Shestopaloff,
Stefan Zohren
Abstract:
In multivariate time series systems, lead-lag relationships reveal dependencies between time series when they are shifted in time relative to each other. Uncovering such relationships is valuable in downstream tasks, such as control, forecasting, and clustering. By understanding the temporal dependencies between different time series, one can better comprehend the complex interactions and patterns within the system. We develop a cluster-driven methodology based on dynamic time warping for robust detection of lead-lag relationships in lagged multi-factor models. We establish connections to the multireference alignment problem for both the homogeneous and heterogeneous settings. Since multivariate time series are ubiquitous in a wide range of domains, we demonstrate that our algorithm is able to robustly detect lead-lag relationships in financial markets, which can be subsequently leveraged in trading strategies with significant economic benefits.
Submitted 15 September, 2023;
originally announced September 2023.
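For readers unfamiliar with dynamic time warping, the building block named in the abstract above, the following is a minimal sketch of the textbook DTW recursion on two one-dimensional sequences. It is not the paper's cluster-driven method; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration.

```python
import numpy as np

def dtw_distance(x, y):
    """Textbook dynamic time warping cost between two 1-D sequences.

    Illustrative only: the local cost |x_i - y_j| is an assumption.
    """
    n, m = len(x), len(y)
    # cost[i, j] = minimal cumulative cost of aligning x[:i] with y[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            # extend the cheapest of the three admissible predecessor alignments
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

# A delayed copy aligns at zero cost because DTW may repeat samples.
print(dtw_distance([0, 1, 2, 3], [0, 0, 1, 2, 3]))  # 0.0
```

The zero cost for the delayed copy is what makes DTW attractive for lead-lag work: the alignment path, rather than a single fixed shift, carries the timing information.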
-
On statistical arbitrage under a conditional factor model of equity returns
Authors:
Trent Spears,
Stefan Zohren,
Stephen Roberts
Abstract:
We consider a conditional factor model for a multivariate portfolio of United States equities in the context of analysing a statistical arbitrage trading strategy. A state space framework underlies the factor model whereby asset returns are assumed to be a noisy observation of a linear combination of factor values and latent factor risk premia. Filter and state prediction estimates for the risk premia are retrieved in an online way. Such estimates induce filtered asset returns that can be compared to measurement observations, with large deviations representing candidate mean reversion trades. Further, in that the risk premia are modelled as time-varying quantities, non-stationarity in returns is de facto captured. We study an empirical trading strategy respectful of transaction costs, and demonstrate performance over a long history of 29 years, for both a linear and a non-linear state space model. Our results show that the model is competitive relative to the results of other methods, including simple benchmarks and other cutting-edge approaches as published in the literature. Also of note, while strategy performance degradation is noticed through time -- especially for the most recent years -- the strategy continues to offer compelling economics, and has scope for further advancement.
Submitted 5 September, 2023;
originally announced September 2023.
-
Generative AI for End-to-End Limit Order Book Modelling: A Token-Level Autoregressive Generative Model of Message Flow Using a Deep State Space Network
Authors:
Peer Nagy,
Sascha Frey,
Silvia Sapora,
Kang Li,
Anisoara Calinescu,
Stefan Zohren,
Jakob Foerster
Abstract:
Developing a generative model of realistic order flow in financial markets is a challenging open problem, with numerous applications for market participants. Addressing this, we propose the first end-to-end autoregressive generative model that generates tokenized limit order book (LOB) messages. These messages are interpreted by a Jax-LOB simulator, which updates the LOB state. To handle long sequences efficiently, the model employs simplified structured state-space layers to process sequences of order book states and tokenized messages. Using LOBSTER data of NASDAQ equity LOBs, we develop a custom tokenizer for message data, converting groups of successive digits to tokens, similar to tokenization in large language models. Out-of-sample results show promising performance in approximating the data distribution, as evidenced by low model perplexity. Furthermore, the mid-price returns calculated from the generated order flow exhibit a significant correlation with the data, indicating impressive conditional forecast performance. Due to the granularity of generated data, and the accuracy of the model, it offers new application areas for future work beyond forecasting, e.g. acting as a world model in high-frequency financial reinforcement learning applications. Overall, our results invite the use and extension of the model in the direction of autoregressive large financial models for the generation of high-frequency financial data and we commit to open-sourcing our code to facilitate future research.
Submitted 23 August, 2023;
originally announced September 2023.
-
JAX-LOB: A GPU-Accelerated limit order book simulator to unlock large scale reinforcement learning for trading
Authors:
Sascha Frey,
Kang Li,
Peer Nagy,
Silvia Sapora,
Chris Lu,
Stefan Zohren,
Jakob Foerster,
Anisoara Calinescu
Abstract:
Financial exchanges across the world use limit order books (LOBs) to process orders and match trades. For research purposes it is important to have large scale efficient simulators of LOB dynamics. LOB simulators have previously been implemented in the context of agent-based models (ABMs), reinforcement learning (RL) environments, and generative models, processing order flows from historical data sets and hand-crafted agents alike. For many applications, there is a requirement for processing multiple books, either for the calibration of ABMs or for the training of RL agents. We showcase the first GPU-enabled LOB simulator designed to process thousands of books in parallel, with a notably reduced per-message processing time. The implementation of our simulator - JAX-LOB - is based on design choices that aim to best exploit the powers of JAX without compromising on the realism of LOB-related mechanisms. We integrate JAX-LOB with other JAX packages, to provide an example of how one may address an optimal execution problem with reinforcement learning, and to share some preliminary results from end-to-end RL training on GPUs.
Submitted 25 August, 2023;
originally announced August 2023.
-
Learning to Learn Financial Networks for Optimising Momentum Strategies
Authors:
Xingyue Pu,
Stefan Zohren,
Stephen Roberts,
Xiaowen Dong
Abstract:
Network momentum provides a novel type of risk premium, which exploits the interconnections among assets in a financial network to predict future returns. However, the current process of constructing financial networks relies heavily on expensive databases and financial expertise, limiting accessibility for small-sized and academic institutions. Furthermore, the traditional approach treats network construction and portfolio optimisation as separate tasks, potentially hindering optimal portfolio performance. To address these challenges, we propose L2GMOM, an end-to-end machine learning framework that simultaneously learns financial networks and optimises trading signals for network momentum strategies. The L2GMOM model is a neural network with a highly interpretable forward propagation architecture, which is derived from algorithm unrolling. L2GMOM is flexible and can be trained with diverse loss functions for portfolio performance, e.g. the negative Sharpe ratio. Backtesting on 64 continuous futures contracts demonstrates a significant improvement in portfolio profitability and risk control, with a Sharpe ratio of 1.74 across a 20-year period.
Submitted 23 August, 2023;
originally announced August 2023.
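The abstract above mentions training with the negative Sharpe ratio as a loss. As a rough sketch of what such an objective computes, the NumPy function below evaluates it for a given set of positions. This is not the L2GMOM implementation: the function name, the `(T, N)` array layout, the 252-period annualisation, and the `eps` stabiliser are all assumptions, and a trainable version would use an autodiff framework rather than plain NumPy.

```python
import numpy as np

def negative_sharpe(positions, returns, periods_per_year=252, eps=1e-8):
    """Negative annualised Sharpe ratio of a portfolio (illustrative sketch).

    positions, returns: arrays of shape (T, N), where positions[t] is the
    position held over the period that earns returns[t] (assumed layout).
    """
    positions = np.asarray(positions, dtype=float)
    returns = np.asarray(returns, dtype=float)
    # per-period portfolio return: sum of position-weighted asset returns
    pnl = (positions * returns).sum(axis=1)
    # eps guards against division by zero when the P&L is constant
    sharpe = np.sqrt(periods_per_year) * pnl.mean() / (pnl.std() + eps)
    return -sharpe

rets = np.array([[0.01], [0.02], [-0.005], [0.015]])
long_only = np.ones_like(rets)
print(negative_sharpe(long_only, rets) < 0)  # True: profitable positions give a negative loss
```

Minimising this quantity maximises the Sharpe ratio, and reversing every position flips the sign of the loss.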
-
Network Momentum across Asset Classes
Authors:
Xingyue Pu,
Stephen Roberts,
Xiaowen Dong,
Stefan Zohren
Abstract:
We investigate the concept of network momentum, a novel trading signal derived from momentum spillover across assets. Initially observed within the confines of pairwise economic and fundamental ties, such as the stock-bond connection of the same company and stocks linked through supply-demand chains, momentum spillover implies a propagation of momentum risk premium from one asset to another. The similarity of momentum risk premium, exemplified by co-movement patterns, has been spotted across multiple asset classes including commodities, equities, bonds and currencies. However, studying the network effect of momentum spillover across these classes has been challenging due to a lack of readily available common characteristics or economic ties beyond the company level. In this paper, we explore the interconnections of momentum features across a diverse range of 64 continuous futures contracts spanning these four classes. We utilise a linear and interpretable graph learning model with minimal assumptions to reveal the intricacies of the momentum spillover network. By leveraging the learned networks, we construct a network momentum strategy that exhibits a Sharpe ratio of 1.5 and an annual return of 22%, after volatility scaling, from 2000 to 2022. This paper pioneers the examination of momentum spillover across multiple asset classes using only pricing data, presents a multi-asset investment strategy based on network momentum, and underscores the effectiveness of this strategy through robust empirical analysis.
Submitted 22 August, 2023;
originally announced August 2023.
-
Multi-Factor Inception: What to Do with All of These Features?
Authors:
Tom Liu,
Stefan Zohren
Abstract:
Cryptocurrency trading represents a nascent field of research, with growing adoption in industry. Aided by its decentralised nature, many metrics describing cryptocurrencies are accessible with a simple Google search and update frequently, usually at least on a daily basis. This presents a promising opportunity for data-driven systematic trading research, where limited historical data can be augmented with additional features, such as hashrate or Google Trends. However, one question naturally arises: how to effectively select and process these features? In this paper, we introduce Multi-Factor Inception Networks (MFIN), an end-to-end framework for systematic trading with multiple assets and factors. MFINs extend Deep Inception Networks (DIN) to operate in a multi-factor context. Similar to DINs, MFIN models automatically learn features from returns data and output position sizes that optimise portfolio Sharpe ratio. Compared to a range of rule-based momentum and reversion strategies, MFINs learn an uncorrelated, higher-Sharpe strategy that is not captured by traditional, hand-crafted factors. In particular, MFIN models continue to achieve consistent returns over the most recent years (2022-2023), where traditional strategies and the wider cryptocurrency market have underperformed.
Submitted 25 July, 2023;
originally announced July 2023.
-
Deep Inception Networks: A General End-to-End Framework for Multi-asset Quantitative Strategies
Authors:
Tom Liu,
Stephen Roberts,
Stefan Zohren
Abstract:
We introduce Deep Inception Networks (DINs), a family of Deep Learning models that provide a general framework for end-to-end systematic trading strategies. DINs extract time series (TS) and cross sectional (CS) features directly from daily price returns. This removes the need for handcrafted features, and allows the model to learn from TS and CS information simultaneously. DINs benefit from a fully data-driven approach to feature extraction, whilst avoiding overfitting. Extending prior work on Deep Momentum Networks, DIN models directly output position sizes that optimise Sharpe ratio, but for the entire portfolio instead of individual assets. We propose a novel loss term to balance turnover regularisation against increased systemic risk from high correlation to the overall market. Using futures data, we show that DIN models outperform traditional TS and CS benchmarks, are robust to a range of transaction costs and perform consistently across random seeds. To balance the general nature of DIN models, we provide examples of how attention and Variable Selection Networks can aid the interpretability of investment decisions. These model-specific methods are particularly useful when the dimensionality of the input is high and variable importance fluctuates dynamically over time. Finally, we compare the performance of DIN models on other asset classes, and show how the space of potential features can be customised.
Submitted 7 July, 2023;
originally announced July 2023.
-
Deep Attentive Survival Analysis in Limit Order Books: Estimating Fill Probabilities with Convolutional-Transformers
Authors:
Alvaro Arroyo,
Alvaro Cartea,
Fernando Moreno-Pino,
Stefan Zohren
Abstract:
One of the key decisions in execution strategies is the choice between a passive (liquidity providing) or an aggressive (liquidity taking) order to execute a trade in a limit order book (LOB). Essential to this choice is the fill probability of a passive limit order placed in the LOB. This paper proposes a deep learning method to estimate the filltimes of limit orders posted in different levels of the LOB. We develop a novel model for survival analysis that maps time-varying features of the LOB to the distribution of filltimes of limit orders. Our method is based on a convolutional-Transformer encoder and a monotonic neural network decoder. We use proper scoring rules to compare our method with other approaches in survival analysis, and perform an interpretability analysis to understand the informativeness of features used to compute fill probabilities. Our method significantly outperforms those typically used in survival analysis literature. Finally, we carry out a statistical analysis of the fill probability of orders placed in the order book (e.g., within the bid-ask spread) for assets with different queue dynamics and trading activity.
Submitted 8 June, 2023;
originally announced June 2023.
-
Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models
Authors:
Yichi Zhang,
Mihai Cucuringu,
Alexander Y. Shestopaloff,
Stefan Zohren
Abstract:
In multivariate time series systems, key insights can be obtained by discovering lead-lag relationships inherent in the data, which refer to the dependence between two time series shifted in time relative to one another, and which can be leveraged for the purposes of control, forecasting or clustering. We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models. Within our framework, the envisioned pipeline takes as input a set of time series, and creates an enlarged universe of extracted subsequence time series from each input time series, via a sliding window approach. This is then followed by an application of various clustering techniques (such as k-means++ and spectral clustering), employing a variety of pairwise similarity measures, including nonlinear ones. Once the clusters have been extracted, lead-lag estimates across clusters are robustly aggregated to enhance the identification of the consistent relationships in the original universe. We establish connections to the multireference alignment problem for both the homogeneous and heterogeneous settings. Since multivariate time series are ubiquitous in a wide range of domains, we demonstrate that our method is not only able to robustly detect lead-lag relationships in financial markets, but can also yield insightful results when applied to an environmental data set.
Submitted 18 September, 2023; v1 submitted 11 May, 2023;
originally announced May 2023.
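Two ingredients of the pipeline described in the abstract above can be sketched in a few lines: extracting subsequences with a sliding window, and estimating a pairwise lead-lag. The clustering and robust aggregation steps are omitted. Both function names are hypothetical, and the lag estimator shown (the shift that maximises Pearson correlation) is a simple stand-in baseline, not the paper's estimator.

```python
import numpy as np

def sliding_windows(series, window, step=1):
    """Stack all length-`window` subsequences of a 1-D series (requires window <= len(series))."""
    series = np.asarray(series, dtype=float)
    starts = range(0, len(series) - window + 1, step)
    return np.stack([series[s:s + window] for s in starts])

def estimate_lag(x, y, max_lag):
    """Return the lag L in [-max_lag, max_lag] maximising corr(x[t], y[t + L]).

    A positive result means x leads y by L steps. Assumes len(x) == len(y).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best_lag, best_corr = 0, -np.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        c = np.corrcoef(a, b)[0, 1]
        if c > best_corr:
            best_lag, best_corr = lag, c
    return best_lag

rng = np.random.default_rng(0)
x = rng.standard_normal(200)
y = np.roll(x, 3)  # y repeats x three steps later
print(estimate_lag(x, y, max_lag=5))  # 3
```

In the setting of the abstract, an estimator of this kind would be applied to the windowed subsequences within clusters, with the per-pair estimates then aggregated.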
-
Spatio-Temporal Momentum: Jointly Learning Time-Series and Cross-Sectional Strategies
Authors:
Wee Ling Tan,
Stephen Roberts,
Stefan Zohren
Abstract:
We introduce Spatio-Temporal Momentum strategies, a class of models that unify both time-series and cross-sectional momentum strategies by trading assets based on their cross-sectional momentum features over time. While both time-series and cross-sectional momentum strategies are designed to systematically capture momentum risk premia, these strategies are regarded as distinct implementations and do not consider the concurrent relationship and predictability between temporal and cross-sectional momentum features of different assets. We model spatio-temporal momentum with neural networks of varying complexities and demonstrate that a simple neural network with only a single fully connected layer learns to simultaneously generate trading signals for all assets in a portfolio by incorporating both their time-series and cross-sectional momentum features. Backtesting on portfolios of 46 actively-traded US equities and 12 equity index futures contracts, we demonstrate that the model is able to retain its performance over benchmarks in the presence of high transaction costs of up to 5-10 basis points. In particular, we find that the model when coupled with least absolute shrinkage and turnover regularization results in the best performance over various transaction cost scenarios.
Submitted 20 February, 2023;
originally announced February 2023.
-
View fusion vis-à-vis a Bayesian interpretation of Black-Litterman for portfolio allocation
Authors:
Trent Spears,
Stefan Zohren,
Stephen Roberts
Abstract:
The Black-Litterman model extends the framework of the Markowitz Modern Portfolio Theory to incorporate investor views. We consider a case where multiple view estimates, including uncertainties, are given for the same underlying subset of assets at a point in time. This motivates our consideration of data fusion techniques for combining information from multiple sources. In particular, we consider consistency-based methods that yield fused view and uncertainty pairs; such methods are not common to the quantitative finance literature. We show a relevant, modern case of incorporating machine learning model-derived view and uncertainty estimates, and the impact on portfolio allocation, with an example subsuming Arbitrage Pricing Theory. Hence we show the value of the Black-Litterman model in combination with information fusion and artificial intelligence-grounded prediction methods.
Submitted 31 January, 2023;
originally announced January 2023.
-
Asynchronous Deep Double Duelling Q-Learning for Trading-Signal Execution in Limit Order Book Markets
Authors:
Peer Nagy,
Jan-Peter Calliess,
Stefan Zohren
Abstract:
We employ deep reinforcement learning (RL) to train an agent to successfully translate a high-frequency trading signal into a trading strategy that places individual limit orders. Based on the ABIDES limit order book simulator, we build a reinforcement learning OpenAI gym environment and utilise it to simulate a realistic trading environment for NASDAQ equities based on historic order book messages. To train a trading agent that learns to maximise its trading return in this environment, we use Deep Duelling Double Q-learning with the APEX (asynchronous prioritised experience replay) architecture. The agent observes the current limit order book state, its recent history, and a short-term directional forecast. To investigate the performance of RL for adaptive trading independently from a concrete forecasting algorithm, we study the performance of our approach utilising synthetic alpha signals obtained by perturbing forward-looking returns with varying levels of noise. Here, we find that the RL agent learns an effective trading strategy for inventory management and order placing that outperforms a heuristic benchmark trading strategy having access to the same signal.
Submitted 25 September, 2023; v1 submitted 20 January, 2023;
originally announced January 2023.
-
Understanding stock market instability via graph auto-encoders
Authors:
Dragos Gorduza,
Xiaowen Dong,
Stefan Zohren
Abstract:
Understanding stock market instability is a key question in financial management as practitioners seek to forecast breakdowns in asset co-movements which expose portfolios to rapid and devastating collapses in value. The structure of these co-movements can be described as a graph where companies are represented by nodes and edges capture correlations between their price movements. Learning a timely indicator of co-movement breakdowns (manifested as modifications in the graph structure) is central in understanding both financial stability and volatility forecasting. We propose to use the edge reconstruction accuracy of a graph auto-encoder (GAE) as an indicator for how spatially homogeneous connections between assets are, which, based on financial network literature, we use as a proxy to infer market volatility. Our experiments on the S&P 500 over the 2015-2022 period show that higher GAE reconstruction error values are correlated with higher volatility. We also show that out-of-sample autoregressive modeling of volatility is improved by the addition of the proposed measure. Our paper contributes to the literature of machine learning in finance particularly in the context of understanding stock market instability.
Submitted 9 December, 2022;
originally announced December 2022.
-
DeepVol: Volatility Forecasting from High-Frequency Data with Dilated Causal Convolutions
Authors:
Fernando Moreno-Pino,
Stefan Zohren
Abstract:
Volatility forecasts play a central role among equity risk measures. Besides traditional statistical models, modern forecasting techniques, based on machine learning, can readily be employed when treating volatility as a univariate, daily time-series. However, econometric studies have shown that increasing the number of daily observations with high-frequency intraday data helps to improve predictions. In this work, we propose DeepVol, a model based on Dilated Causal Convolutions to forecast day-ahead volatility by using high-frequency data. We show that the dilated convolutional filters are ideally suited to extract relevant information from intraday financial data, thereby naturally mimicking (via a data-driven approach) the econometric models which incorporate realised measures of volatility into the forecast. This allows us to take advantage of the abundance of intraday observations, helping us to avoid the limitations of models that use daily data, such as model misspecification or manually designed handcrafted features, whose design involves optimising the trade-off between accuracy and computational efficiency and makes models prone to a lack of adaptation to changing circumstances. In our analysis, we use two years of intraday data from NASDAQ-100 to evaluate DeepVol's performance. The reported empirical results suggest that the proposed deep learning-based approach learns global features from high-frequency data, achieving more accurate predictions than traditional methodologies and yielding more appropriate risk measures.
Submitted 13 October, 2022; v1 submitted 23 September, 2022;
originally announced October 2022.
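To make the dilated causal convolution mentioned in the DeepVol abstract above concrete, here is a minimal NumPy sketch of a single one-dimensional filter. It is not the DeepVol architecture: the function names, the zero padding and the example dilation schedule are assumptions chosen for illustration only.

```python
import numpy as np


def dilated_causal_conv1d(x, kernel, dilation):
    """Compute y[t] = sum_k kernel[k] * x[t - k * dilation].

    Values before the start of the series are treated as zero, so y[t]
    never depends on observations later than t (causality).
    """
    x = np.asarray(x, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    pad = (len(kernel) - 1) * dilation
    padded = np.concatenate([np.zeros(pad), x])
    y = np.zeros_like(x)
    for k, w in enumerate(kernel):
        start = pad - k * dilation  # shift the series back by k * dilation steps
        y += w * padded[start:start + len(x)]
    return y


def receptive_field(kernel_size, dilations):
    """Number of past time steps seen by a stack of dilated causal layers."""
    return 1 + (kernel_size - 1) * sum(dilations)


if __name__ == "__main__":
    intraday_returns = np.array([0.1, -0.2, 0.05, 0.3, -0.1, 0.0])
    print(dilated_causal_conv1d(intraday_returns, kernel=[0.5, 0.5], dilation=2))
    # Hypothetical schedule: doubling the dilation at each layer.
    print(receptive_field(kernel_size=2, dilations=[1, 2, 4, 8]))
```

Doubling the dilation from layer to layer makes the receptive field grow exponentially with depth, which is one reason such filters suit long intraday sequences.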
-
Transfer Ranking in Finance: Applications to Cross-Sectional Momentum with Data Scarcity
Authors:
Daniel Poh,
Stephen Roberts,
Stefan Zohren
Abstract:
Cross-sectional strategies are a classical and popular trading style, with recent high performing variants incorporating sophisticated neural architectures. While these strategies have been applied successfully to data-rich settings involving mature assets with long histories, deploying them on instruments with limited samples generally produces over-fitted models with degraded performance. In this paper, we introduce Fused Encoder Networks -- a novel and hybrid parameter-sharing transfer ranking model. The model fuses information extracted using an encoder-attention module operated on a source dataset with a similar but separate module focused on a smaller target dataset of interest. This mitigates the issue of models with poor generalisability that are a consequence of training on scarce target data. Additionally, the self-attention mechanism enables interactions among instruments to be accounted for, not just at the loss level during model training, but also at inference time. Focusing on momentum applied to the top ten cryptocurrencies by market capitalisation as a demonstrative use-case, the Fused Encoder Networks outperform the reference benchmarks on most performance measures, delivering a three-fold boost in the Sharpe ratio over classical momentum as well as an improvement of approximately 50% against the best benchmark model without transaction costs. They continue to outperform baselines even after accounting for the high transaction costs associated with trading cryptocurrencies.
Submitted 21 February, 2023; v1 submitted 21 August, 2022;
originally announced August 2022.
-
Canonical Portfolios: Optimal Asset and Signal Combination
Authors:
Nikan Firoozye,
Vincent Tan,
Stefan Zohren
Abstract:
This paper presents a novel framework for analyzing the optimal asset and signal combination problem. Our approach builds upon the dynamic portfolio selection problem introduced by Brandt and Santa-Clara (2006) and consists of two stages. First, we reformulate their original investment problem into a tractable one that allows us to derive a closed-form expression for the optimal portfolio policy that is scalable to large cross-sectional financial applications. Second, we recast the problem of selecting a portfolio of correlated assets and signals into selecting a set of uncorrelated managed portfolios through the lens of Canonical Correlation Analysis of Hotelling (1936). The new investment environment of uncorrelated managed portfolios offers unique economic insights into the joint correlation structure of our optimal portfolio policy. We also operationalize our theoretical framework to bridge the gap between theory and practice, showcasing the improved performance of our proposed method over natural competing benchmarks.
Submitted 12 July, 2023; v1 submitted 22 February, 2022;
originally announced February 2022.
-
Trading with the Momentum Transformer: An Intelligent and Interpretable Architecture
Authors:
Kieran Wood,
Sven Giegerich,
Stephen Roberts,
Stefan Zohren
Abstract:
We introduce the Momentum Transformer, an attention-based deep-learning architecture, which outperforms benchmark time-series momentum and mean-reversion trading strategies. Unlike state-of-the-art Long Short-Term Memory (LSTM) architectures, which are sequential in nature and tailored to local processing, an attention mechanism provides our architecture with a direct connection to all previous time-steps. Our architecture, an attention-LSTM hybrid, enables us to learn longer-term dependencies, improves performance when considering returns net of transaction costs and naturally adapts to new market regimes, such as during the SARS-CoV-2 crisis. Via the introduction of multiple attention heads, we can capture concurrent regimes, or temporal dynamics, which are occurring at different timescales. The Momentum Transformer is inherently interpretable, providing us with greater insights into our deep-learning momentum trading strategy, including the importance of different factors over time and the past time-steps which are of the greatest significance to the model.
Submitted 22 November, 2022; v1 submitted 15 December, 2021;
originally announced December 2021.
-
A Universal End-to-End Approach to Portfolio Optimization via Deep Learning
Authors:
Chao Zhang,
Zihao Zhang,
Mihai Cucuringu,
Stefan Zohren
Abstract:
We propose a universal end-to-end framework for portfolio optimization where asset distributions are directly obtained. The designed framework circumvents the traditional forecasting step and avoids the estimation of the covariance matrix, lifting the bottleneck for generalizing to a large number of instruments. Our framework has the flexibility of optimizing various objective functions including Sharpe ratio, mean-variance trade-off etc. Further, we allow for short selling and study several constraints attached to objective functions. In particular, we consider cardinality, maximum position for individual instruments and leverage. These constraints are formulated into objective functions by utilizing several neural layers and gradient ascent can be adopted for optimization. To ensure the robustness of our framework, we test our methods on two datasets. Firstly, we look at a synthetic dataset where we demonstrate that weights obtained from our end-to-end approach are better than classical predictive methods. Secondly, we apply our framework on a real-life dataset with historical observations of hundreds of instruments with a testing period of more than 20 years.
Submitted 17 November, 2021;
originally announced November 2021.
-
Realised Volatility Forecasting: Machine Learning via Financial Word Embedding
Authors:
Eghbal Rahimikia,
Stefan Zohren,
Ser-Huang Poon
Abstract:
This study develops FinText, a financial word embedding compiled from 15 years of business news archives. The results show that FinText produces substantially more accurate results than general word embeddings based on the gold-standard financial benchmark we introduced. In contrast to well-known econometric models, and over the sample period from 27 July 2007 to 27 January 2022 for 23 NASDAQ stocks, using stock-related news, our simple natural language processing model supported by different word embeddings improves realised volatility forecasts on high volatility days. This improvement in realised volatility forecasting performance switches to normal volatility days when general hot news is used. By utilising SHAP, an Explainable AI method, we also identify and classify key phrases in stock-related and general hot news that moved volatility.
Submitted 1 March, 2023; v1 submitted 1 August, 2021;
originally announced August 2021.
-
Slow Momentum with Fast Reversion: A Trading Strategy Using Deep Learning and Changepoint Detection
Authors:
Kieran Wood,
Stephen Roberts,
Stefan Zohren
Abstract:
Momentum strategies are an important part of alternative investments and are at the heart of commodity trading advisors (CTAs). These strategies have, however, been found to have difficulties adjusting to rapid changes in market conditions, such as during the 2020 market crash. In particular, immediately after momentum turning points, where a trend reverses from an uptrend (downtrend) to a downtrend (uptrend), time-series momentum (TSMOM) strategies are prone to making bad bets. To improve the response to regime change, we introduce a novel approach, where we insert an online changepoint detection (CPD) module into a Deep Momentum Network (DMN) [1904.04912] pipeline, which uses an LSTM deep-learning architecture to simultaneously learn both trend estimation and position sizing. Furthermore, our model is able to optimise the way in which it balances 1) a slow momentum strategy which exploits persisting trends, but does not overreact to localised price moves, and 2) a fast mean-reversion strategy regime by quickly flipping its position, then swapping it back again to exploit localised price moves. Our CPD module outputs a changepoint location and severity score, allowing our model to learn to respond to varying degrees of disequilibrium, or smaller and more localised changepoints, in a data driven manner. Back-testing our model over the period 1995-2020, the addition of the CPD module leads to an improvement in Sharpe ratio of one-third. The module is especially beneficial in periods of significant nonstationarity, and in particular, over the most recent years tested (2015-2020) the performance boost is approximately two-thirds. This is interesting as traditional momentum strategies have been underperforming in this period.
Submitted 20 December, 2021; v1 submitted 28 May, 2021;
originally announced May 2021.
-
Multi-Horizon Forecasting for Limit Order Books: Novel Deep Learning Approaches and Hardware Acceleration using Intelligent Processing Units
Authors:
Zihao Zhang,
Stefan Zohren
Abstract:
We design multi-horizon forecasting models for limit order book (LOB) data by using deep learning techniques. Unlike standard structures where a single prediction is made, we adopt encoder-decoder models with sequence-to-sequence and Attention mechanisms to generate a forecasting path. Our methods achieve comparable performance to state-of-the-art algorithms at short prediction horizons. Importantly, they outperform when generating predictions over long horizons by leveraging the multi-horizon setup. Given that encoder-decoder models rely on recurrent neural layers, they generally suffer from slow training processes. To remedy this, we experiment with utilising novel hardware, so-called Intelligent Processing Units (IPUs) produced by Graphcore. IPUs are specifically designed for machine intelligence workloads with the aim of speeding up the computation process. We show that in our setup this leads to significantly faster training times when compared to training models with GPUs.
Submitted 27 August, 2021; v1 submitted 21 May, 2021;
originally announced May 2021.
-
Enhancing Cross-Sectional Currency Strategies by Context-Aware Learning to Rank with Self-Attention
Authors:
Daniel Poh,
Bryan Lim,
Stefan Zohren,
Stephen Roberts
Abstract:
The performance of a cross-sectional currency strategy depends crucially on accurately ranking instruments prior to portfolio construction. While this ranking step is traditionally performed using heuristics, or by sorting the outputs produced by pointwise regression or classification techniques, strategies using Learning to Rank algorithms have recently presented themselves as competitive and viable alternatives. Although the rankers at the core of these strategies are learned globally and improve ranking accuracy on average, they ignore the differences between the distributions of asset features over the times when the portfolio is rebalanced. This flaw renders them susceptible to producing sub-optimal rankings, possibly at important periods when accuracy is actually needed the most. For example, this might happen during critical risk-off episodes, which consequently exposes the portfolio to substantial, unwanted drawdowns. We tackle this shortcoming with an analogous idea from information retrieval: that a query's top retrieved documents or the local ranking context provide vital information about the query's own characteristics, which can then be used to refine the initial ranked list. In this work, we use a context-aware Learning-to-rank model that is based on the Transformer architecture to encode top/bottom ranked assets, learn the context and exploit this information to re-rank the initial results. Backtesting on a slate of 31 currencies, our proposed methodology increases the Sharpe ratio by around 30% and significantly enhances various performance metrics. Additionally, this approach also improves the Sharpe ratio when separately conditioning on normal and risk-off market states.
Submitted 27 January, 2022; v1 submitted 20 May, 2021;
originally announced May 2021.
-
Deep Learning for Market by Order Data
Authors:
Zihao Zhang,
Bryan Lim,
Stefan Zohren
Abstract:
Market by order (MBO) data - a detailed feed of individual trade instructions for a given stock on an exchange - is arguably one of the most granular sources of microstructure information. While limit order books (LOBs) are implicitly derived from it, MBO data is largely neglected by current academic literature which focuses primarily on LOB modelling. In this paper, we demonstrate the utility of MBO data for forecasting high-frequency price movements, providing an orthogonal source of information to LOB snapshots and expanding the universe of alpha discovery. We provide the first predictive analysis on MBO data by carefully introducing the data structure and presenting a specific normalisation scheme to consider level information in order books and to allow model training with multiple instruments. Through forecasting experiments using deep neural networks, we show that while MBO-driven and LOB-driven models individually provide similar performance, ensembles of the two can lead to improvements in forecasting accuracy - indicating that MBO data is additive to LOB-based features.
Submitted 27 July, 2021; v1 submitted 17 February, 2021;
originally announced February 2021.
-
Building Cross-Sectional Systematic Strategies By Learning to Rank
Authors:
Daniel Poh,
Bryan Lim,
Stefan Zohren,
Stephen Roberts
Abstract:
The success of a cross-sectional systematic strategy depends critically on accurately ranking assets prior to portfolio construction. Contemporary techniques perform this ranking step either with simple heuristics or by sorting outputs from standard regression or classification models, which have been demonstrated to be sub-optimal for ranking in other domains (e.g. information retrieval). To address this deficiency, we propose a framework to enhance cross-sectional portfolios by incorporating learning-to-rank algorithms, which lead to improvements of ranking accuracy by learning pairwise and listwise structures across instruments. Using cross-sectional momentum as a demonstrative case study, we show that the use of modern machine learning ranking algorithms can substantially improve the trading performance of cross-sectional strategies -- providing approximately threefold boosting of Sharpe Ratios compared to traditional approaches.
Submitted 13 December, 2020;
originally announced December 2020.
-
Estimation of Large Financial Covariances: A Cross-Validation Approach
Authors:
Vincent Tan,
Stefan Zohren
Abstract:
We introduce a novel covariance estimator for portfolio selection that adapts to the non-stationary or persistent heteroskedastic environments of financial time series by employing exponentially weighted averages and nonlinearly shrinking the sample eigenvalues through cross-validation. Our estimator is structure agnostic, transparent, and computationally feasible in large dimensions. By correcting the biases in the sample eigenvalues and aligning our estimator to more recent risk, we demonstrate that our estimator performs well in large dimensions against existing state-of-the-art static and dynamic covariance shrinkage estimators through simulations and with an empirical application in active portfolio management.
Submitted 20 January, 2023; v1 submitted 10 December, 2020;
originally announced December 2020.
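The covariance abstract above combines exponentially weighted averages with cross-validated nonlinear eigenvalue shrinkage. The sketch below illustrates only the first ingredient, an exponentially weighted sample covariance, and leaves the shrinkage step out entirely. The function name, the half-life parameterisation and the weighted-mean centring are assumptions made for this illustration, not details taken from the paper.

```python
import numpy as np


def ew_covariance(returns, halflife):
    """Exponentially weighted covariance of a (T, N) array of returns.

    Row T-1 is the most recent observation. An observation `halflife`
    rows older than another receives half of its weight.
    """
    X = np.asarray(returns, dtype=float)
    T = X.shape[0]
    decay = 0.5 ** (1.0 / halflife)
    weights = decay ** np.arange(T - 1, -1, -1)  # oldest row gets the smallest weight
    weights /= weights.sum()
    mean = weights @ X                   # weighted mean of each asset
    centred = X - mean
    return (centred * weights[:, None]).T @ centred


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    simulated = rng.normal(scale=0.01, size=(500, 4))  # synthetic returns
    cov = ew_covariance(simulated, halflife=60)
    print(np.round(np.sqrt(np.diag(cov)), 4))  # per-asset volatility estimates
```

A short half-life aligns the estimate with recent risk at the cost of a smaller effective sample, which is the setting where eigenvalue corrections of the kind the abstract describes become more important.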
-
Sentiment Correlation in Financial News Networks and Associated Market Movements
Authors:
Xingchen Wan,
Jie Yang,
Slavi Marinov,
Jan-Peter Calliess,
Stefan Zohren,
Xiaowen Dong
Abstract:
In an increasingly connected global market, news sentiment towards one company may not only indicate its own market performance, but can also be associated with a broader movement on the sentiment and performance of other companies from the same or even different sectors. In this paper, we apply NLP techniques to understand news sentiment of 87 companies among the most reported on Reuters for a period of seven years. We investigate the propagation of such sentiment in company networks and evaluate the associated market movements in terms of stock price and volatility. Our results suggest that, in certain sectors, strong media sentiment towards one company may indicate a significant change in media sentiment towards related companies measured as neighbours in a financial network constructed from news co-occurrence. Furthermore, there exists a weak but statistically significant association between strong media sentiment and abnormal market return as well as volatility. Such an association is more significant at the level of individual companies, but nevertheless remains visible at the level of sectors or groups of companies.
Submitted 13 February, 2021; v1 submitted 5 November, 2020;
originally announced November 2020.
-
Fast Agent-Based Simulation Framework with Applications to Reinforcement Learning and the Study of Trading Latency Effects
Authors:
Peter Belcak,
Jan-Peter Calliess,
Stefan Zohren
Abstract:
We introduce a new software toolbox for agent-based simulation. Facilitating rapid prototyping by offering a user-friendly Python API, its core rests on an efficient C++ implementation to support simulation of large-scale multi-agent systems. Our software environment benefits from a versatile message-driven architecture. Originally developed to support research on financial markets, it offers the flexibility to simulate a wide range of different (easily customisable) market rules and to study the effect of auxiliary factors, such as delays, on the market dynamics. As a simple illustration, we employ our toolbox to investigate the role of the order processing delay in normal trading and for the scenario of a significant price change. Owing to its general architecture, our toolbox can also be employed as a generic multi-agent system simulator. We provide an example of such a non-financial application by simulating a mechanism for the coordination of no-regret learning agents in a multi-agent network routing scenario previously proposed in the literature.
Submitted 21 September, 2022; v1 submitted 18 August, 2020;
originally announced August 2020.
-
Investment sizing with deep learning prediction uncertainties for high-frequency Eurodollar futures trading
Authors:
Trent Spears,
Stefan Zohren,
Stephen Roberts
Abstract:
In this work we show that prediction uncertainty estimates gleaned from deep learning models can be useful inputs for influencing the relative allocation of risk capital across trades. In this way, consideration of uncertainty is important because it permits the scaling of investment size across trade opportunities in a principled and data-driven way. We showcase this insight with a prediction model and find clear outperformance based on a Sharpe ratio metric, relative to trading strategies that either do not take uncertainty into account, or that utilize an alternative market-based statistic as a proxy for uncertainty. Of added novelty is our modelling of high-frequency data at the top level of the Eurodollar Futures limit order book for each trading day of 2018, whereby we predict interest rate curve changes on small time horizons. We are motivated to study the market for these popularly-traded interest rate derivatives since it is deep and liquid, and contributes to the efficient functioning of global finance -- though there is relatively little by way of its modelling contained in the academic literature. Hence, we verify the utility of prediction models and uncertainty estimates for trading applications in this complex and multi-dimensional asset price space.
Submitted 31 July, 2020;
originally announced July 2020.
-
Deep Learning for Portfolio Optimization
Authors:
Zihao Zhang,
Stefan Zohren,
Stephen Roberts
Abstract:
We adopt deep learning models to directly optimise the portfolio Sharpe ratio. The framework we present circumvents the requirements for forecasting expected returns and allows us to directly optimise portfolio weights by updating model parameters. Instead of selecting individual assets, we trade Exchange-Traded Funds (ETFs) of market indices to form a portfolio. Indices of different asset classes show robust correlations and trading them substantially reduces the spectrum of available assets to choose from. We compare our method with a wide range of algorithms with results showing that our model obtains the best performance over the testing period, from 2011 to the end of April 2020, including the financial instabilities of the first quarter of 2020. A sensitivity analysis is included to understand the relevance of input features and we further study the performance of our approach under different cost rates and different risk levels via volatility scaling.
Submitted 23 January, 2021; v1 submitted 27 May, 2020;
originally announced May 2020.
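The abstract above describes optimising the portfolio Sharpe ratio directly as a function of the weights. The sketch below shows what such an objective can look like in NumPy. It is only an illustration: the softmax mapping to long-only weights, the annualisation factor of 252 and the function names are assumptions, and no neural network or training loop is included.

```python
import numpy as np


def softmax(scores):
    """Map unconstrained scores to long-only weights that sum to one."""
    scores = np.asarray(scores, dtype=float)
    exp = np.exp(scores - scores.max())  # subtract the max for numerical stability
    return exp / exp.sum()


def portfolio_sharpe(weights, asset_returns, periods_per_year=252):
    """Annualised Sharpe ratio of a fixed-weight portfolio (zero risk-free rate)."""
    portfolio_returns = np.asarray(asset_returns, dtype=float) @ np.asarray(weights, dtype=float)
    return np.sqrt(periods_per_year) * portfolio_returns.mean() / portfolio_returns.std(ddof=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    etf_returns = rng.normal(loc=0.0004, scale=0.01, size=(1000, 4))  # synthetic data
    weights = softmax([0.2, -0.1, 0.0, 0.5])  # stand-in for a model's output scores
    print(weights, portfolio_sharpe(weights, etf_returns))
```

In a gradient-based setup, the negative of `portfolio_sharpe` would serve as the loss, and the scores fed to `softmax` would come from the model whose parameters are being updated.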
-
Detecting Changes in Asset Co-Movement Using the Autoencoder Reconstruction Ratio
Authors:
Bryan Lim,
Stefan Zohren,
Stephen Roberts
Abstract:
Detecting changes in asset co-movements is of much importance to financial practitioners, with numerous risk management benefits arising from the timely detection of breakdowns in historical correlations. In this article, we propose a real-time indicator to detect temporary increases in asset co-movements, the Autoencoder Reconstruction Ratio, which measures how well a basket of asset returns can be modelled using a lower-dimensional set of latent variables. The ARR uses a deep sparse denoising autoencoder to perform the dimensionality reduction on the returns vector, which replaces the PCA approach of the standard Absorption Ratio, and provides a better model for non-Gaussian returns. Through a systemic risk application on forecasting on the CRSP US Total Market Index, we show that lower ARR values coincide with higher volatility and larger drawdowns, indicating that increased asset co-movement does correspond with periods of market weakness. We also demonstrate that short-term (i.e. 5-min and 1-hour) predictors for realised volatility and market crashes can be improved by including additional ARR inputs.
Submitted 27 September, 2020; v1 submitted 23 January, 2020;
originally announced February 2020.
-
Deep Reinforcement Learning for Trading
Authors:
Zihao Zhang,
Stefan Zohren,
Stephen Roberts
Abstract:
We adopt Deep Reinforcement Learning algorithms to design trading strategies for continuous futures contracts. Both discrete and continuous action spaces are considered and volatility scaling is incorporated to create reward functions which scale trade positions based on market volatility. We test our algorithms on the 50 most liquid futures contracts from 2011 to 2019, and investigate how performance varies across different asset classes including commodities, equity indices, fixed income and FX markets. We compare our algorithms against classical time series momentum strategies, and show that our method outperforms such baseline models, delivering positive profits despite heavy transaction costs. The experiments show that the proposed algorithms can follow large market trends without changing positions and can also scale down, or hold, through consolidation periods.
Submitted 22 November, 2019;
originally announced November 2019.
-
Extending Deep Learning Models for Limit Order Books to Quantile Regression
Authors:
Zihao Zhang,
Stefan Zohren,
Stephen Roberts
Abstract:
We showcase how Quantile Regression (QR) can be applied to forecast financial returns using Limit Order Books (LOBs), the canonical data source of high-frequency financial time-series. We develop a deep learning architecture that simultaneously models the return quantiles for both buy and sell positions. We test our model over millions of LOB updates across multiple different instruments on the London Stock Exchange. Our results suggest that the proposed network not only delivers excellent performance but also provides improved prediction robustness by combining quantile estimates.
Submitted 11 June, 2019;
originally announced June 2019.
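Quantile regression, as referred to in the abstract above, is usually trained with the quantile (pinball) loss. The sketch below shows that loss in NumPy for a single quantile level. It is a generic illustration rather than the paper's network or training code, and the function name and example values are assumptions.

```python
import numpy as np


def pinball_loss(y_true, y_pred, tau):
    """Mean quantile (pinball) loss for quantile level tau in (0, 1).

    Under-predictions are penalised with weight tau and over-predictions
    with weight (1 - tau), so the minimiser is the tau-quantile.
    """
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.maximum(tau * diff, (tau - 1.0) * diff)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    returns = rng.normal(scale=0.01, size=10_000)  # synthetic returns
    for tau in (0.1, 0.5, 0.9):
        # A constant forecast equal to the empirical tau-quantile scores
        # better than a constant forecast of zero at the tail levels.
        q = np.quantile(returns, tau)
        print(tau, pinball_loss(returns, q, tau), pinball_loss(returns, 0.0, tau))
```

Training one output per quantile level with this loss yields an estimate of the return distribution rather than a single point forecast, which is what makes combining quantile estimates possible.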
-
Enhancing Time Series Momentum Strategies Using Deep Neural Networks
Authors:
Bryan Lim,
Stefan Zohren,
Stephen Roberts
Abstract:
While time series momentum is a well-studied phenomenon in finance, common strategies require the explicit definition of both a trend estimator and a position sizing rule. In this paper, we introduce Deep Momentum Networks -- a hybrid approach which injects deep learning based trading rules into the volatility scaling framework of time series momentum. The model simultaneously learns both trend estimation and position sizing in a data-driven manner, with networks directly trained by optimising the Sharpe ratio of the signal. Backtesting on a portfolio of 88 continuous futures contracts, we demonstrate that the Sharpe-optimised LSTM improves on traditional methods by more than a factor of two in the absence of transaction costs, and continues to outperform when transaction costs of up to 2-3 basis points are considered. To account for more illiquid assets, we also propose a turnover regularisation term which trains the network to factor in costs at run-time.
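The two ingredients named in this abstract, volatility scaling and a Sharpe-ratio objective, can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function names, the 15% volatility target, and the 252-period annualisation are assumptions, and the paper's networks and turnover regularisation term are not reproduced here.

```python
import numpy as np

def vol_scaled_returns(positions, next_returns, ex_ante_vol, target_vol=0.15):
    """Strategy returns when each position is scaled to a common volatility target.

    positions:    signal in [-1, 1] chosen at time t
    next_returns: asset return realised over (t, t+1]
    ex_ante_vol:  annualised volatility estimate available at time t
    target_vol:   assumed annualised volatility target (hypothetical default)
    """
    positions = np.asarray(positions, dtype=float)
    next_returns = np.asarray(next_returns, dtype=float)
    ex_ante_vol = np.asarray(ex_ante_vol, dtype=float)
    return positions * (target_vol / ex_ante_vol) * next_returns

def negative_sharpe(strategy_returns, periods_per_year=252):
    """Negative annualised Sharpe ratio, usable as a loss to minimise."""
    r = np.asarray(strategy_returns, dtype=float)
    return float(-np.mean(r) / np.std(r) * np.sqrt(periods_per_year))
```

In a training loop, the positions would be the network output and the loss would be `negative_sharpe(vol_scaled_returns(...))`, so that gradient descent raises the Sharpe ratio directly instead of minimising a forecast error.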
Submitted 27 September, 2020; v1 submitted 9 April, 2019;
originally announced April 2019.
-
BDLOB: Bayesian Deep Convolutional Neural Networks for Limit Order Books
Authors:
Zihao Zhang,
Stefan Zohren,
Stephen Roberts
Abstract:
We showcase how dropout variational inference can be applied to a large-scale deep learning model that predicts price movements from limit order books (LOBs), the canonical data source representing trading and pricing movements. We demonstrate that uncertainty information derived from posterior predictive distributions can be utilised for position sizing, avoiding unnecessary trades and improving profits. Further, we test our models by using millions of observations across several instruments and markets from the London Stock Exchange. Our results suggest that those Bayesian techniques not only deliver uncertainty information that can be used for trading but also improve predictive performance as stochastic regularisers. To the best of our knowledge, we are the first to apply Bayesian networks to LOBs.
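Dropout variational inference, as referred to in this abstract, is commonly implemented as Monte Carlo dropout: dropout stays active at prediction time and the spread over repeated stochastic forward passes serves as an uncertainty estimate. The sketch below uses a hypothetical two-layer network in NumPy. The weights `w1` and `w2`, the function name, and the default dropout rate are assumptions and stand in for the much larger convolutional model described in the paper.

```python
import numpy as np

def mc_dropout_predict(x, w1, w2, drop_rate=0.5, n_samples=100, seed=0):
    """Mean and standard deviation of class probabilities over stochastic passes.

    x:  shape (n, d) input features
    w1: shape (d, h) hidden-layer weights (hypothetical toy model)
    w2: shape (h, c) output-layer weights
    """
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n_samples):
        hidden = np.maximum(x @ w1, 0.0)              # ReLU layer
        mask = rng.random(hidden.shape) >= drop_rate  # fresh dropout mask per pass
        hidden = hidden * mask / (1.0 - drop_rate)    # inverted-dropout rescaling
        logits = hidden @ w2
        exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
        samples.append(exp / exp.sum(axis=-1, keepdims=True))
    samples = np.stack(samples)                       # shape (n_samples, n, c)
    return samples.mean(axis=0), samples.std(axis=0)
```

A simple way to use the result for position sizing would be to trade only when the predicted probability is high and its standard deviation is low. The thresholds for such a rule are a modelling choice and are not taken from the abstract.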
Submitted 25 November, 2018;
originally announced November 2018.
-
DeepLOB: Deep Convolutional Neural Networks for Limit Order Books
Authors:
Zihao Zhang,
Stefan Zohren,
Stephen Roberts
Abstract:
We develop a large-scale deep learning model to predict price movements from limit order book (LOB) data of cash equities. The architecture utilises convolutional filters to capture the spatial structure of the limit order books as well as LSTM modules to capture longer time dependencies. The proposed network outperforms all existing state-of-the-art algorithms on the benchmark LOB dataset [1]. In a more realistic setting, we test our model using one year of market quotes from the London Stock Exchange, and the model delivers a remarkably stable out-of-sample prediction accuracy for a variety of instruments. Importantly, our model translates well to instruments which were not part of the training set, indicating the model's ability to extract universal features. In order to better understand these features and to go beyond a "black box" model, we perform a sensitivity analysis to understand the rationale behind the model predictions and reveal the components of LOBs that are most relevant. The ability to extract robust features which translate well to other instruments is an important property of our model which has many other applications.
Submitted 23 January, 2020; v1 submitted 10 August, 2018;
originally announced August 2018.