-
Multi-relational Graph Diffusion Neural Network with Parallel Retention for Stock Trends Classification
Authors:
Zinuo You,
Pengju Zhang,
** Zheng,
John Cartlidge
Abstract:
Stock trend classification remains a fundamental yet challenging task, owing to the intricate time-evolving dynamics between and within stocks. To tackle these two challenges, we propose a graph-based representation learning approach aimed at predicting the future movements of multiple stocks. Initially, we model the complex time-varying relationships between stocks by generating dynamic multi-relational stock graphs. This is achieved through a novel edge generation algorithm that leverages information entropy and signal energy to quantify the intensity and directionality of inter-stock relations on each trading day. Then, we further refine these initial graphs through a stochastic multi-relational diffusion process, adaptively learning task-optimal edges. Subsequently, we implement a decoupled representation learning scheme with parallel retention to obtain the final graph representation. This strategy better captures the unique temporal features within individual stocks while also capturing the overall structure of the stock graph. Comprehensive experiments conducted on real-world datasets from two US markets (NASDAQ and NYSE) and one Chinese market (Shanghai Stock Exchange: SSE) validate the effectiveness of our method. Our approach consistently outperforms state-of-the-art baselines in forecasting next trading day stock trends across three test periods spanning seven years. Datasets and code have been released (https://github.com/pixelhero98/MGDPR).
Submitted 5 January, 2024;
originally announced January 2024.
-
DGDNN: Decoupled Graph Diffusion Neural Network for Stock Movement Prediction
Authors:
Zinuo You,
Zijian Shi,
Hongbo Bo,
John Cartlidge,
Li Zhang,
Yan Ge
Abstract:
Forecasting future stock trends remains challenging for academia and industry due to stochastic inter-stock dynamics and hierarchical intra-stock dynamics influencing stock prices. In recent years, graph neural networks have achieved remarkable performance in this problem by formulating multiple stocks as graph-structured data. However, most of these approaches rely on artificially defined factors to construct static stock graphs, which fail to capture the intrinsic interdependencies between stocks that rapidly evolve. In addition, these methods often ignore the hierarchical features of the stocks and lose distinctive information within. In this work, we propose a novel graph learning approach implemented without expert knowledge to address these issues. First, our approach automatically constructs dynamic stock graphs by entropy-driven edge generation from a signal processing perspective. Then, we further learn task-optimal dependencies between stocks via a generalized graph diffusion process on constructed stock graphs. Last, a decoupled representation learning scheme is adopted to capture distinctive hierarchical intra-stock features. Experimental results demonstrate substantial improvements over state-of-the-art baselines on real-world datasets. Moreover, the ablation study and sensitivity study further illustrate the effectiveness of the proposed method in modeling the time-evolving inter-stock and intra-stock dynamics.
Submitted 3 January, 2024;
originally announced January 2024.
-
Not feeling the buzz: Correction study of mispricing and inefficiency in online sportsbooks
Authors:
Lawrence Clegg,
John Cartlidge
Abstract:
We present a replication and correction of a recent article (Ramirez, P., Reade, J.J., Singleton, C., Betting on a buzz: Mispricing and inefficiency in online sportsbooks, International Journal of Forecasting, 39:3, 2023, pp. 1413-1423, doi: 10.1016/j.ijforecast.2022.07.011). RRS measure profile page views on Wikipedia to generate a "buzz factor" metric for tennis players and show that it can be used to form a profitable gambling strategy by predicting bookmaker mispricing. Here, we use the same dataset as RRS to reproduce their results exactly, thus confirming the robustness of their mispricing claim. However, we discover that the published betting results are significantly affected by a single bet (the "Hercog" bet), which returns substantial outlier profits based on erroneously long odds. When this data quality issue is resolved, the majority of reported profits disappear and only one strategy, which bets on "competitive" matches, remains significantly profitable in the original out-of-sample period. While one profitable strategy offers weaker support than the original study, it still provides an indication that market inefficiencies may exist, as originally claimed by RRS. As an extension, we continue backtesting after 2020 on a cleaned dataset. Results show that (a) the "competitive" strategy generates no further profits, potentially suggesting markets have become more efficient, and (b) model coefficients estimated over this more recent period are no longer reliable predictors of bookmaker mispricing. We present this work as a case study demonstrating the importance of replication studies in sports forecasting, and the necessity to clean data. We open-source release comprehensive datasets and code.
Submitted 11 July, 2024; v1 submitted 3 May, 2023;
originally announced June 2023.
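The sensitivity of betting returns to a single outlier bet, as described in the abstract above, can be illustrated with a minimal sketch. The ledger values, the function names, and the choice to drop the largest winning bet are assumptions made for illustration only; they are not data or code from the study.

```python
def roi(bets):
    """Return on investment: total net profit divided by total stake."""
    total_stake = sum(stake for stake, _ in bets)
    total_profit = sum(profit for _, profit in bets)
    return total_profit / total_stake


def roi_without_largest_win(bets):
    """ROI after dropping the single bet with the largest net profit."""
    largest = max(range(len(bets)), key=lambda i: bets[i][1])
    trimmed = bets[:largest] + bets[largest + 1:]
    return roi(trimmed)


# Hypothetical ledger of (stake, net profit) pairs; NOT data from the study.
# The fifth bet stands in for a win at erroneously long odds.
ledger = [(1.0, -1.0), (1.0, 0.8), (1.0, -1.0),
          (1.0, 0.9), (1.0, 25.0), (1.0, -1.0)]

print(f"ROI with all bets:        {roi(ledger):+.2f}")
print(f"ROI without largest win:  {roi_without_largest_win(ledger):+.2f}")
```

In this made-up ledger the strategy looks strongly profitable only while the outlier is included, which is the kind of check a replication study can run before trusting a headline return.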
-
Neural Stochastic Agent-Based Limit Order Book Simulation: A Hybrid Methodology
Authors:
Zijian Shi,
John Cartlidge
Abstract:
Modern financial exchanges use an electronic limit order book (LOB) to store bid and ask orders for a specific financial asset. As the most fine-grained information depicting the demand and supply of an asset, LOB data is essential in understanding market dynamics. Therefore, realistic LOB simulations offer a valuable methodology for explaining empirical properties of markets. Mainstream simulation models include agent-based models (ABMs) and stochastic models (SMs). However, ABMs tend not to be grounded on real historical data, while SMs tend not to enable dynamic agent-interaction. To overcome these limitations, we propose a novel hybrid LOB simulation paradigm characterised by: (1) representing the aggregation of market events' logic by a neural stochastic background trader that is pre-trained on historical LOB data through a neural point process model; and (2) embedding the background trader in a multi-agent simulation with other trading agents. We instantiate this hybrid NS-ABM model using the ABIDES platform. We first run the background trader in isolation and show that the simulated LOB can recreate a comprehensive list of stylised facts that demonstrate realistic market behaviour. We then introduce a population of 'trend' and 'value' trading agents, which interact with the background trader. We show that the stylised facts remain and we demonstrate order flow impact and financial herding behaviours that are in accordance with empirical observations of real markets.
Submitted 28 February, 2023;
originally announced March 2023.
-
Using coevolution and substitution of the fittest for health and well-being recommender systems
Authors:
Hugo Alcaraz-Herrera,
John Cartlidge
Abstract:
This research explores substitution of the fittest (SF), a technique designed to counteract the problem of disengagement in two-population competitive coevolutionary genetic algorithms. SF is domain-independent and requires no calibration. We first perform a controlled comparative evaluation of SF's ability to maintain engagement and discover optimal solutions in a minimal toy domain. Experimental results demonstrate that SF is able to maintain engagement better than other techniques in the literature. We then address the more complex real-world problem of evolving recommendations for health and well-being. We introduce a coevolutionary extension of EvoRecSys, a previously published evolutionary recommender system. We demonstrate that SF is able to maintain engagement better than other techniques in the literature, and the resultant recommendations using SF are higher quality and more diverse than those produced by EvoRecSys.
Submitted 6 December, 2022; v1 submitted 1 November, 2022;
originally announced November 2022.
-
Nonstationary Continuum-Armed Bandit Strategies for Automated Trading in a Simulated Financial Market
Authors:
Bingde Liu,
John Cartlidge
Abstract:
We approach the problem of designing an automated trading strategy that can consistently profit by adapting to changing market conditions. This challenge can be framed as a Nonstationary Continuum-Armed Bandit (NCAB) problem. To solve the NCAB problem, we propose PRBO, a novel trading algorithm that uses Bayesian optimization and a "bandit-over-bandit" framework to dynamically adjust strategy parameters in response to market conditions. We use Bristol Stock Exchange (BSE) to simulate financial markets containing heterogeneous populations of automated trading agents and compare PRBO with PRSH, a reference trading strategy that adapts strategy parameters through stochastic hill-climbing. Results show that PRBO generates significantly more profit than PRSH, despite having fewer hyperparameters to tune. The code for PRBO and performing experiments is available online open-source (https://github.com/HarmoniaLeo/PRZI-Bayesian-Optimisation).
Submitted 25 June, 2023; v1 submitted 4 August, 2022;
originally announced August 2022.
-
Substitution of the Fittest: A Novel Approach for Mitigating Disengagement in Coevolutionary Genetic Algorithms
Authors:
Hugo Alcaraz-Herrera,
John Cartlidge
Abstract:
We propose substitution of the fittest (SF), a novel technique designed to counteract the problem of disengagement in two-population competitive coevolutionary genetic algorithms. The approach presented is domain-independent and requires no calibration. In a minimal domain, we perform a controlled evaluation of the ability to maintain engagement and the capacity to discover optimal solutions. Results demonstrate that the solution discovery performance of SF is comparable with other techniques in the literature, while SF also offers benefits including a greater ability to maintain engagement and a much simpler mechanism.
Submitted 6 August, 2021;
originally announced August 2021.
-
The Limit Order Book Recreation Model (LOBRM): An Extended Analysis
Authors:
Zijian Shi,
John Cartlidge
Abstract:
The limit order book (LOB) depicts the fine-grained demand and supply relationship for financial assets and is widely used in market microstructure studies. Nevertheless, the availability and high cost of LOB data restrict its wider application. The LOB recreation model (LOBRM) was recently proposed to bridge this gap by synthesizing the LOB from trades and quotes (TAQ) data. However, in the original LOBRM study, there were two limitations: (1) experiments were conducted on a relatively small dataset containing only one day of LOB data; and (2) the training and testing were performed in a non-chronological fashion, which essentially re-frames the task as interpolation and potentially introduces lookahead bias. In this study, we extend the research on LOBRM and further validate its use in real-world application scenarios. We first advance the workflow of LOBRM by (1) adding a time-weighted z-score standardization for the LOB and (2) substituting the ordinary differential equation kernel with an exponential decay kernel to lower computation complexity. Experiments are conducted on the extended LOBSTER dataset in a chronological fashion, as it would be used in a real-world application. We find that (1) LOBRM with decay kernel is superior to traditional non-linear models, and module ensembling is effective; (2) prediction accuracy is negatively related to the volatility of order volumes resting in the LOB; (3) the proposed sparse encoding method for TAQ exhibits good generalization ability and can facilitate manifold tasks; and (4) the influence of stochastic drift on prediction accuracy can be alleviated by increasing historical samples.
Submitted 1 July, 2021;
originally announced July 2021.
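The abstract above mentions a time-weighted z-score standardization for the LOB but does not define it. One plausible reading, sketched below, weights each observed value by how long it remained in force before the next update. The function name, its arguments, and the weighting rule are assumptions for illustration and may differ from the authors' formulation.

```python
import math


def time_weighted_zscore(timestamps, values, end_time):
    """Standardize values, weighting each by how long it stayed in force.

    Assumption: observation i is valid from timestamps[i] until the next
    timestamp (or end_time for the last observation).
    """
    boundaries = list(timestamps[1:]) + [end_time]
    durations = [b - t for t, b in zip(timestamps, boundaries)]
    total = sum(durations)
    # Duration-weighted mean and variance.
    mean = sum(d * v for d, v in zip(durations, values)) / total
    var = sum(d * (v - mean) ** 2 for d, v in zip(durations, values)) / total
    std = math.sqrt(var)
    if std == 0:
        raise ValueError("values are constant; z-score is undefined")
    return [(v - mean) / std for v in values]


# Hypothetical resting volumes at one price level, updated at t = 0, 1, 4.
z = time_weighted_zscore([0, 1, 4], [10, 20, 10], end_time=5)
print([round(x, 4) for x in z])
```

Compared with an unweighted z-score, a value that persists for a long interval pulls the mean towards itself, so short-lived spikes in volume count for less.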
-
The LOB Recreation Model: Predicting the Limit Order Book from TAQ History Using an Ordinary Differential Equation Recurrent Neural Network
Authors:
Zijian Shi,
Yu Chen,
John Cartlidge
Abstract:
In an order-driven financial market, the price of a financial asset is discovered through the interaction of orders - requests to buy or sell at a particular price - that are posted to the public limit order book (LOB). Therefore, LOB data is extremely valuable for modelling market dynamics. However, LOB data is not freely accessible, which poses a challenge to market participants and researchers wishing to exploit this information. Fortunately, trades and quotes (TAQ) data - orders arriving at the top of the LOB, and trades executing in the market - are more readily available. In this paper, we present the LOB recreation model, a first attempt from a deep learning perspective to recreate the top five price levels of the LOB for small-tick stocks using only TAQ data. Volumes of orders sitting deep in the LOB are predicted by combining outputs from: (1) a history compiler that uses a Gated Recurrent Unit (GRU) module to selectively compile prediction relevant quote history; (2) a market events simulator, which uses an Ordinary Differential Equation Recurrent Neural Network (ODE-RNN) to simulate the accumulation of net order arrivals; and (3) a weighting scheme to adaptively combine the predictions generated by (1) and (2). By the paradigm of transfer learning, the source model trained on one stock can be fine-tuned to enable application to other financial assets of the same class with much lower demand on additional data. Comprehensive experiments conducted on two real world intraday LOB datasets demonstrate that the proposed model can efficiently recreate the LOB with high accuracy using only TAQ data as input.
Submitted 2 March, 2021;
originally announced March 2021.
-
Time Matters: Exploring the Effects of Urgency and Reaction Speed in Automated Traders
Authors:
Henry Hanifan,
Ben Watson,
John Cartlidge,
Dave Cliff
Abstract:
We consider issues of time in automated trading strategies in simulated financial markets containing a single exchange with public limit order book and continuous double auction matching. In particular, we explore two effects: (i) reaction speed - the time taken for trading strategies to calculate a response to market events; and (ii) trading urgency - the sensitivity of trading strategies to approaching deadlines. Much of the literature on trading agents focuses on optimising pricing strategies only and ignores the effects of time, while real-world markets continue to experience a race to zero latency, as automated trading systems compete to quickly access information and act in the market ahead of others. We demonstrate that modelling reaction speed can significantly alter previously published results, with simple strategies such as SHVR outperforming more complex adaptive algorithms such as AA. We also show that adding a pace parameter to ZIP traders (ZIP-Pace, or ZIPP) can create a sense of urgency that significantly improves profitability.
Submitted 28 February, 2021;
originally announced March 2021.
-
Fools Rush In: Competitive Effects of Reaction Time in Automated Trading
Authors:
Henry Hanifan,
John Cartlidge
Abstract:
We explore the competitive effects of reaction time of automated trading strategies in simulated financial markets containing a single exchange with public limit order book and continuous double auction matching. A large body of research conducted over several decades has been devoted to trading agent design and simulation, but the majority of this work focuses on pricing strategy and does not consider the time taken for these strategies to compute. In real-world financial markets, speed is known to heavily influence the design of automated trading algorithms, with the generally accepted wisdom that faster is better. Here, we introduce increasingly realistic models of trading speed and profile the computation times of a suite of eminent trading algorithms from the literature. Results demonstrate that: (a) trading performance is impacted by speed, but faster is not always better; (b) the Adaptive-Aggressive (AA) algorithm, until recently considered the most dominant trading strategy in the literature, is outperformed by the simplistic Shaver (SHVR) strategy - shave one tick off the current best bid or ask - when relative computation times are accurately simulated.
Submitted 29 November, 2020; v1 submitted 5 December, 2019;
originally announced December 2019.
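The Shaver (SHVR) rule described in the abstract above, shaving one tick off the current best bid or ask, is simple enough to sketch directly. The function name and signature are hypothetical. Capping the quote at the trader's own limit price and falling back to the limit price when the book is empty are assumptions of this sketch, not details taken from the paper.

```python
def shaver_quote(side, limit_price, best_bid, best_ask, tick=1):
    """Quote one tick better than the current best price on our side.

    side        -- "bid" for a buyer, "ask" for a seller
    limit_price -- the trader's private limit (assumed hard constraint)
    best_bid / best_ask -- current best prices, or None if that side is empty
    """
    if side == "bid":
        if best_bid is None:
            return limit_price  # assumption: quote the limit on an empty book
        # Improve the best bid by one tick, but never pay above the limit.
        return min(best_bid + tick, limit_price)
    if side == "ask":
        if best_ask is None:
            return limit_price  # assumption: quote the limit on an empty book
        # Undercut the best ask by one tick, but never sell below the limit.
        return max(best_ask - tick, limit_price)
    raise ValueError("side must be 'bid' or 'ask'")


print(shaver_quote("bid", limit_price=105, best_bid=100, best_ask=110))
print(shaver_quote("ask", limit_price=95, best_bid=100, best_ask=110))
```

The rule needs only a comparison and an addition per quote, which is why its computation time is negligible next to adaptive strategies when reaction speed is modelled.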
-
Modelling Resilience in Cloud-Scale Data Centres
Authors:
John Cartlidge,
Ilango Sriram
Abstract:
The trend for cloud computing has initiated a race towards data centres (DC) of an ever-increasing size. The largest DCs now contain many hundreds of thousands of virtual machine (VM) services. Given the finite lifespan of hardware, such large DCs are subject to frequent hardware failure events that can lead to disruption of service. To counter this, multiple redundant copies of task threads may be distributed around a DC to ensure that individual hardware failures do not cause entire jobs to fail. Here, we present results demonstrating the resilience of different job scheduling algorithms in a simulated DC with hardware failure. We use a simple model of jobs distributed across a hardware network to demonstrate the relationship between resilience and additional communication costs of different scheduling methods.
Submitted 27 June, 2011;
originally announced June 2011.
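The benefit of redundant task copies described in the abstract above can be shown with a back-of-the-envelope calculation. The model below is a deliberate simplification and not the paper's simulation: it assumes every copy fails independently with the same probability, and that a job fails as soon as any one of its tasks loses all of its copies. The function name and parameters are hypothetical.

```python
def job_failure_probability(p_fail, n_tasks, n_copies):
    """Probability that a job fails under independent hardware failures.

    p_fail   -- probability that the hardware hosting one copy fails
    n_tasks  -- number of tasks the job is split into
    n_copies -- number of redundant copies kept of each task
    """
    # A task is lost only if every one of its copies fails.
    task_lost = p_fail ** n_copies
    # The job survives only if no task is lost.
    return 1.0 - (1.0 - task_lost) ** n_tasks


for copies in (1, 2, 3):
    p = job_failure_probability(p_fail=0.1, n_tasks=10, n_copies=copies)
    print(f"{copies} copies per task -> job failure probability {p:.4f}")
```

Each extra copy multiplies the per-task loss probability by `p_fail`, but it also adds placement and synchronisation traffic, which is the resilience versus communication cost trade-off the abstract refers to.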