Search | arXiv e-print repository

Dynamic Time Warping for Lead-Lag Relationships in Lagged Multi-Factor Models

Authors: Yichi Zhang, Mihai Cucuringu, Alexander Y. Shestopaloff, Stefan Zohren

Abstract: In multivariate time series systems, lead-lag relationships reveal dependencies between time series when they are shifted in time relative to each other. Uncovering such relationships is valuable in downstream tasks, such as control, forecasting, and clustering. By understanding the temporal dependencies between different time series, one can better comprehend the complex interactions and patterns within the system. We develop a cluster-driven methodology based on dynamic time warping for robust detection of lead-lag relationships in lagged multi-factor models. We establish connections to the multireference alignment problem for both the homogeneous and heterogeneous settings. Since multivariate time series are ubiquitous in a wide range of domains, we demonstrate that our algorithm is able to robustly detect lead-lag relationships in financial markets, which can be subsequently leveraged in trading strategies with significant economic benefits.

Submitted 15 September, 2023; originally announced September 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2305.06704
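
For readers unfamiliar with dynamic time warping, the technique named in the title above, the following is a minimal sketch of the textbook dynamic-programming recursion. It is not the authors' cluster-driven pipeline; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Textbook dynamic time warping distance between two numeric sequences.

    Hypothetical helper for illustration: uses |a_i - b_j| as the local cost
    and no window constraint, so it runs in O(len(a) * len(b)) time.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# A series and a copy delayed by one step align at zero cost,
# which is why warping-based measures are natural for lead-lag problems.
leader = [0, 1, 2, 3, 0]
lagger = [0, 0, 1, 2, 3, 0]
shifted_cost = dtw_distance(leader, lagger)
```

The warping path itself (not only the final cost) is what would carry lag information; recovering it requires backtracking through `cost`, which is omitted here for brevity.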

arXiv:2308.01419 [pdf, other]

Graph Neural Networks for Forecasting Multivariate Realized Volatility with Spillover Effects

Authors: Chao Zhang, Xingyue Pu, Mihai Cucuringu, Xiaowen Dong

Abstract: We present a novel methodology for modeling and forecasting multivariate realized volatilities using customized graph neural networks to incorporate spillover effects across stocks. The proposed model offers the benefits of incorporating spillover effects from multi-hop neighbors, capturing nonlinear relationships, and flexible training with different loss functions. Our empirical findings provide compelling evidence that incorporating spillover effects from multi-hop neighbors alone does not yield a clear advantage in terms of predictive accuracy. However, modeling nonlinear spillover effects enhances the forecasting accuracy of realized volatilities, particularly for short-term horizons of up to one week. Moreover, our results consistently indicate that training with the Quasi-likelihood loss leads to substantial improvements in model performance compared to the commonly-used mean squared error. A comprehensive series of empirical evaluations in alternative settings confirm the robustness of our results.

Submitted 1 August, 2023; originally announced August 2023.

Comments: 8 figures, 5 tables
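
The abstract above contrasts a quasi-likelihood loss with mean squared error. As a rough illustration, the sketch below uses one commonly cited form of the quasi-likelihood (QLIKE) loss, ratio minus log-ratio minus one; whether the paper uses exactly this form is an assumption, and the function names `qlike` and `mse` are hypothetical.

```python
import math


def qlike(realized, forecast):
    """Average QLIKE loss, assuming the form r/f - log(r/f) - 1.

    Inputs are positive volatility values; the loss is zero only when
    every forecast equals the realized value.
    """
    ratios = [r / f for r, f in zip(realized, forecast)]
    return sum(x - math.log(x) - 1.0 for x in ratios) / len(ratios)


def mse(realized, forecast):
    """Plain mean squared error, shown for comparison."""
    return sum((r - f) ** 2 for r, f in zip(realized, forecast)) / len(realized)


# Forecasting 1.0 when 2.0 is realized (under-prediction) versus
# forecasting 2.0 when 1.0 is realized (over-prediction).
under = qlike([2.0], [1.0])
over = qlike([1.0], [2.0])
```

Under this form the loss depends only on the ratio of realized to forecast values and penalizes under-prediction more heavily than over-prediction, whereas the squared error treats both cases identically.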

arXiv:2305.06704 [pdf, other]

Robust Detection of Lead-Lag Relationships in Lagged Multi-Factor Models

Authors: Yichi Zhang, Mihai Cucuringu, Alexander Y. Shestopaloff, Stefan Zohren

Abstract: In multivariate time series systems, key insights can be obtained by discovering lead-lag relationships inherent in the data, which refer to the dependence between two time series shifted in time relative to one another, and which can be leveraged for the purposes of control, forecasting or clustering. We develop a clustering-driven methodology for robust detection of lead-lag relationships in lagged multi-factor models. Within our framework, the envisioned pipeline takes as input a set of time series, and creates an enlarged universe of extracted subsequence time series from each input time series, via a sliding window approach. This is then followed by an application of various clustering techniques (such as k-means++ and spectral clustering), employing a variety of pairwise similarity measures, including nonlinear ones. Once the clusters have been extracted, lead-lag estimates across clusters are robustly aggregated to enhance the identification of the consistent relationships in the original universe. We establish connections to the multireference alignment problem for both the homogeneous and heterogeneous settings. Since multivariate time series are ubiquitous in a wide range of domains, we demonstrate that our method is not only able to robustly detect lead-lag relationships in financial markets, but can also yield insightful results when applied to an environmental data set.

Submitted 18 September, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

arXiv:2304.03877 [pdf, other]

OFTER: An Online Pipeline for Time Series Forecasting

Authors: Nikolas Michael, Mihai Cucuringu, Sam Howison

Abstract: We introduce OFTER, a time series forecasting pipeline tailored for mid-sized multivariate time series. OFTER utilizes the non-parametric models of k-nearest neighbors and Generalized Regression Neural Networks, integrated with a dimensionality reduction component. To circumvent the curse of dimensionality, we employ a weighted norm based on a modified version of the maximal correlation coefficient. The pipeline we introduce is specifically designed for online tasks, has an interpretable output, and is able to outperform several state-of-the-art baselines. The computational efficacy of the algorithm, its online nature, and its ability to operate in low signal-to-noise regimes, render OFTER an ideal approach for financial multivariate time series problems, such as daily equity forecasting. Our work demonstrates that while deep learning models hold significant promise for time series forecasting, traditional methods carefully integrating mainstream tools remain very competitive alternatives with the added benefits of scalability and interpretability.

Submitted 7 April, 2023; originally announced April 2023.

Comments: 26 pages, 12 figures

arXiv:2302.09382 [pdf, other]

Co-trading networks for modeling dynamic interdependency structures and estimating high-dimensional covariances in US equity markets

Authors: Yutong Lu, Gesine Reinert, Mihai Cucuringu

Abstract: The time proximity of trades across stocks reveals interesting topological structures of the equity market in the United States. In this article, we investigate how such concurrent cross-stock trading behaviors, which we denote as co-trading, shape the market structures and affect stock price co-movements. By leveraging a co-trading-based pairwise similarity measure, we propose a novel method to construct dynamic networks of stocks. Our empirical studies employ high-frequency limit order book data from 2017-01-03 to 2019-12-09. By applying spectral clustering on co-trading networks, we uncover economically meaningful clusters of stocks. Beyond the static Global Industry Classification Standard (GICS) sectors, our data-driven clusters capture the time evolution of the dependency among stocks. Furthermore, we demonstrate statistically significant positive relations between low-latency co-trading and return covariance. With the aid of co-trading networks, we develop a robust estimator for high-dimensional covariance matrix, which yields superior economic value on portfolio allocation. The mean-variance portfolios based on our covariance estimates achieve both lower volatility and higher Sharpe ratios than standard benchmarks.

Submitted 12 May, 2024; v1 submitted 18 February, 2023; originally announced February 2023.

arXiv:2301.13009 [pdf, other]

DeFi: data-driven characterisation of Uniswap v3 ecosystem & an ideal crypto law for liquidity pools

Authors: Deborah Miori, Mihai Cucuringu

Abstract: Uniswap is a Constant Product Market Maker built around liquidity pools, where pairs of tokens are exchanged subject to a fee that is proportional to the size of transactions. At the time of writing, there exist more than 6,000 pools associated with Uniswap v3, implying that empirical investigations on the full ecosystem can easily become computationally expensive. Thus, we propose a systematic workflow to extract and analyse a meaningful but computationally tractable sub-universe of liquidity pools. Leveraging on the 34 pools found relevant for the six-months time window January-June 2022, we then investigate the related liquidity consumption behaviour of market participants. We propose to represent each liquidity taker by a suitably constructed transaction graph, which is a fully connected network where nodes are the liquidity taker's executed transactions, and edges contain weights encoding the time elapsed between any two transactions. We extend the NLP-inspired graph2vec algorithm to the weighted undirected setting, and employ it to obtain an embedding of the set of graphs. This embedding allows us to extract seven clusters of liquidity takers, with equivalent behavioural patterns and interpretable trading preferences. We conclude our work by testing for relationships between the characteristic mechanisms of each pool, i.e. liquidity provision, consumption, and price variation.
We introduce a related ideal crypto law, inspired from the ideal gas law of thermodynamics, and demonstrate that pools adhering to this law are healthier trading venues in terms of sensitivity of liquidity and agents' activity. Regulators and practitioners could benefit from our model by developing related pool health monitoring tools.

Submitted 31 January, 2023; v1 submitted 20 December, 2022; originally announced January 2023.

Comments: 26 pages, 21 figures

arXiv:2209.10334 [pdf, other]

Trade Co-occurrence, Trade Flow Decomposition, and Conditional Order Imbalance in Equity Markets

Authors: Yutong Lu, Gesine Reinert, Mihai Cucuringu

Abstract: The time proximity of high-frequency trades can contain a salient signal. In this paper, we propose a method to classify every trade, based on its proximity with other trades in the market within a short period of time, into five types. By means of a suitably defined normalized order imbalance associated to each type of trade, which we denote as conditional order imbalance (COI), we investigate the price impact of the decomposed trade flows. Our empirical findings indicate strong positive correlations between contemporaneous returns and COIs. In terms of predictability, we document that associations with future returns are positive for COIs of trades which are isolated from trades of stocks other than themselves, and negative otherwise. Furthermore, trading strategies which we develop using COIs achieve conspicuous returns and Sharpe ratios, in an extensive experimental setup on a universe of 457 stocks using daily data for a period of four years.

Submitted 13 March, 2024; v1 submitted 21 September, 2022; originally announced September 2022.

arXiv:2209.08825 [pdf, other]

SEC Form 13F-HR: Statistical investigation of trading imbalances and profitability analysis

Authors: Deborah Miori, Mihai Cucuringu

Abstract: US Institutions with more than $100 million assets under management must disclose part of their long positions into the SEC Form 13F-HR on a quarterly basis. We consider the number of variations in holdings between consecutive reporting periods, and compute imbalances in buying versus selling behaviour for the assets under consideration. A significant opportunity for profit arises if an external investor is willing to trade contrarian to the 13F filings imbalances. Indeed, imbalances capture the amount of information already consumed in the market and the related trades tend to be inflated by crowding and herding. Betting on a relatively short-term movement of prices against the sign of imbalances results in a profitable strategy especially when using a time horizon between 21 and 42 trading days (corresponding to 1-2 calendar months) after each financial quarter ends.

Submitted 19 September, 2022; originally announced September 2022.

Comments: 23 pages, 18 figures

arXiv:2209.00268 [pdf, other]

Returns-Driven Macro Regimes and Characteristic Lead-Lag Behaviour between Asset Classes

Authors: Deborah Miori, Mihai Cucuringu

Abstract: We define data-driven macroeconomic regimes by clustering the relative performance in time of indices belonging to different asset classes. We then investigate lead-lag relationships within the regimes identified. Our study unravels market features characteristic of different windows in time and leverages on this knowledge to highlight market trends or risks that can be informative with respect to recurrent market developments. The framework developed also lays the foundations for multiple possible extensions.

Submitted 2 September, 2022; v1 submitted 1 September, 2022; originally announced September 2022.

Comments: 9 pages, 8 figures

arXiv:2203.15470 [pdf, other]

Graph similarity learning for change-point detection in dynamic networks

Authors: Deborah Sulem, Henry Kenlay, Mihai Cucuringu, Xiaowen Dong

Abstract: Dynamic networks are ubiquitous for modelling sequential graph-structured data, e.g., brain connectome, population flows and messages exchanges. In this work, we consider dynamic networks that are temporal sequences of graph snapshots, and aim at detecting abrupt changes in their structure. This task is often termed network change-point detection and has numerous applications, such as fraud detection or physical motion monitoring. Leveraging a graph neural network model, we design a method to perform online network change-point detection that can adapt to the specific network domain and localise changes with no delay. The main novelty of our method is to use a siamese graph neural network architecture for learning a data-driven graph similarity function, which allows to effectively compare the current graph and its recent history. Importantly, our method does not require prior knowledge on the network generative distribution and is agnostic to the type of change-points; moreover, it can be applied to a large variety of networks, that include for instance edge weights and node attributes. We show on synthetic and real data that our method enjoys a number of benefits: it is able to learn an adequate graph similarity function for performing online network change-point detection in diverse types of change-point settings, and requires a shorter data history to detect changes than most existing state-of-the-art baselines.

Submitted 29 March, 2022; originally announced March 2022.

Comments: 33 pages, 21 figures, 5 tables

arXiv:2203.15009 [pdf]

DAMNETS: A Deep Autoregressive Model for Generating Markovian Network Time Series

Authors: Jase Clarkson, Mihai Cucuringu, Andrew Elliott, Gesine Reinert

Abstract: Generative models for network time series (also known as dynamic graphs) have tremendous potential in fields such as epidemiology, biology and economics, where complex graph-based dynamics are core objects of study. Designing flexible and scalable generative models is a very challenging task due to the high dimensionality of the data, as well as the need to represent temporal dependencies and marginal network structure. Here we introduce DAMNETS, a scalable deep generative model for network time series. DAMNETS outperforms competing methods on all of our measures of sample quality, over both real and synthetic data sets.

Submitted 31 October, 2023; v1 submitted 28 March, 2022; originally announced March 2022.

arXiv:2203.01664 [pdf, other]

Tail-GAN: Learning to Simulate Tail Risk Scenarios

Authors: Rama Cont, Mihai Cucuringu, Renyuan Xu, Chao Zhang

Abstract: The estimation of loss distributions for dynamic portfolios requires the simulation of scenarios representing realistic joint dynamics of their components, with particular importance devoted to the simulation of tail risk scenarios. We propose a novel data-driven approach that utilizes Generative Adversarial Network (GAN) architecture and exploits the joint elicitability property of Value-at-Risk (VaR) and Expected Shortfall (ES). Our proposed approach is capable of learning to simulate price scenarios that preserve tail risk features for benchmark trading strategies, including consistent statistics such as VaR and ES. We prove a universal approximation theorem for our generator for a broad class of risk measures. In addition, we show that the training of the GAN may be formulated as a max-min game, leading to a more effective approach for training. Our numerical experiments show that, in contrast to other data-driven scenario generators, our proposed scenario simulation method correctly captures tail risk for both static and dynamic portfolios.

Submitted 25 March, 2023; v1 submitted 3 March, 2022; originally announced March 2022.

Comments: 39 pages, 17 figures, 11 tables. An earlier version of this paper circulated under the title "TailGAN: Nonparametric Scenario Generation for Tail Risk Estimation". First draft: March 2022
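
The two tail-risk statistics named in the abstract above, Value-at-Risk and Expected Shortfall, can be illustrated with simple empirical estimators. The sketch below is not the paper's GAN-based method: the function name `empirical_var_es`, the sign convention (positive numbers are losses), and the choice to include the VaR observation itself in the tail average are all assumptions, and other conventions exist.

```python
import math


def empirical_var_es(losses, alpha):
    """Empirical VaR and ES at confidence level alpha (0 < alpha < 1).

    Assumed conventions: `losses` are positive for losses, VaR is the
    ceil(alpha * n)-th smallest loss, and ES averages all losses at or
    beyond that order statistic.
    """
    ordered = sorted(losses)
    k = max(math.ceil(alpha * len(ordered)) - 1, 0)  # zero-based index of the quantile
    var = ordered[k]
    tail = ordered[k:]
    es = sum(tail) / len(tail)
    return var, es


# Four scenario losses at the 75% level: VaR is the third smallest loss,
# and ES averages the two largest.
var_75, es_75 = empirical_var_es([4.0, 1.0, 3.0, 2.0], 0.75)
```

By construction ES is never smaller than VaR at the same level, which is why it is often described as the average loss in the tail beyond the VaR threshold.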

arXiv:2202.08962 [pdf, ps, other]

Volatility forecasting with machine learning and intraday commonality

Authors: Chao Zhang, Yihuang Zhang, Mihai Cucuringu, Zhongmin Qian

Abstract: We apply machine learning models to forecast intraday realized volatility (RV), by exploiting commonality in intraday volatility via pooling stock data together, and by incorporating a proxy for the market volatility. Neural networks dominate linear regressions and tree-based models in terms of performance, due to their ability to uncover and model complex latent interactions among variables. Our findings remain robust when we apply trained models to new stocks that have not been included in the training set, thus providing new empirical evidence for a universal volatility mechanism among stocks. Finally, we propose a new approach to forecasting one-day-ahead RVs using past intraday RVs as predictors, and highlight interesting time-of-day effects that aid the forecasting mechanism. The results demonstrate that the proposed methodology yields superior out-of-sample forecasts over a strong set of traditional baselines that only rely on past daily RVs.

Submitted 24 February, 2023; v1 submitted 8 February, 2022; originally announced February 2022.

Comments: 40 pages, 12 figures, 6 tables; to appear in Journal of Financial Econometrics

arXiv:2201.09319 [pdf, other]

Option Volume Imbalance as a predictor for equity market returns

Authors: Nikolas Michael, Mihai Cucuringu, Sam Howison

Abstract: We investigate the use of the normalized imbalance between option volumes corresponding to positive and negative market views, as a predictor for directional price movements in the spot market. Via a nonlinear analysis, and using a decomposition of aggregated volumes into five distinct market participant classes, we find strong signs of predictability of excess market overnight returns. The strongest signals come from Market-Maker volumes. Among other findings, we demonstrate that most of the predictability stems from high-implied-volatility option contracts, and that the informational content of put option volumes is greater than that of call options.

Submitted 23 January, 2022; originally announced January 2022.

Comments: 43 pages, 33 figures

arXiv:2201.08283 [pdf, other]

Lead-lag detection and network clustering for multivariate time series with an application to the US equity market

Authors: Stefanos Bennett, Mihai Cucuringu, Gesine Reinert

Abstract: In multivariate time series systems, it has been observed that certain groups of variables partially lead the evolution of the system, while other variables follow this evolution with a time delay; the result is a lead-lag structure amongst the time series variables. In this paper, we propose a method for the detection of lead-lag clusters of time series in multivariate systems. We demonstrate that the web of pairwise lead-lag relationships between time series can be helpfully construed as a directed network, for which there exist suitable algorithms for the detection of pairs of lead-lag clusters with high pairwise imbalance. Within our framework, we consider a number of choices for the pairwise lead-lag metric and directed network clustering components. Our framework is validated on both a synthetic generative model for multivariate lead-lag time series systems and daily real-world US equity prices data. We showcase that our method is able to detect statistically significant lead-lag clusters in the US equity market. We study the nature of these clusters in the context of the empirical finance literature on lead-lag relations and demonstrate how these can be used for the construction of predictive financial signals.

Submitted 20 January, 2022; originally announced January 2022.

Comments: 29 pages, 28 figures; preliminary version appeared at KDD 2021 - 7th SIGKDD Workshop on Mining and Learning from Time Series (MiLeTS)
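
The abstract above mentions several possible choices of pairwise lead-lag metric without listing them. One simple candidate, assumed here purely for illustration, is the lag at which the lagged Pearson cross-correlation between two series is largest in absolute value. The helper names `pearson` and `best_lag` and the synthetic data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_lag(x, y, max_lag):
    """Return (lag, corr) maximizing |corr(x[t], y[t + lag])| over |lag| <= max_lag.

    A positive lag means x leads y by that many steps.
    """
    best = (0, 0.0)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            xs, ys = x[: len(x) - lag], y[lag:]
        else:
            xs, ys = x[-lag:], y[: len(y) + lag]
        c = pearson(xs, ys)
        if abs(c) > abs(best[1]):
            best = (lag, c)
    return best


# Synthetic check: y is x delayed by two steps, so x should lead y by 2.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.0, 0.0] + x[:-2]
lag, corr = best_lag(x, y, max_lag=5)
```

Collecting such signed scores for every ordered pair of series yields a weighted directed graph, which is the kind of object the abstract proposes to cluster.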

arXiv:2112.13213 [pdf, other]

Cross-Impact of Order Flow Imbalance in Equity Markets

Authors: Rama Cont, Mihai Cucuringu, Chao Zhang

Abstract: We investigate the impact of order flow imbalance (OFI) on price movements in equity markets in a multi-asset setting. First, we propose a systematic approach for combining OFIs at the top levels of the limit order book into an integrated OFI variable which better explains price impact, compared to the best-level OFI. We show that once the information from multiple levels is integrated into OFI, multi-asset models with cross-impact do not provide additional explanatory power for contemporaneous impact compared to a sparse model without cross-impact terms. On the other hand, we show that lagged cross-asset OFIs do improve the forecasting of future returns. We also establish that this lagged cross-impact mainly manifests at short-term horizons and decays rapidly in time.

Submitted 13 June, 2023; v1 submitted 25 December, 2021; originally announced December 2021.

Comments: 33 pages, 10 figures, 11 tables

arXiv:2111.09170 [pdf, other]

A Universal End-to-End Approach to Portfolio Optimization via Deep Learning

Authors: Chao Zhang, Zihao Zhang, Mihai Cucuringu, Stefan Zohren

Abstract: We propose a universal end-to-end framework for portfolio optimization where asset distributions are directly obtained. The designed framework circumvents the traditional forecasting step and avoids the estimation of the covariance matrix, lifting the bottleneck for generalizing to a large amount of instruments. Our framework has the flexibility of optimizing various objective functions including Sharpe ratio, mean-variance trade-off etc. Further, we allow for short selling and study several constraints attached to objective functions. In particular, we consider cardinality, maximum position for individual instrument and leverage. These constraints are formulated into objective functions by utilizing several neural layers and gradient ascent can be adopted for optimization. To ensure the robustness of our framework, we test our methods on two datasets. Firstly, we look at a synthetic dataset where we demonstrate that weights obtained from our end-to-end approach are better than classical predictive methods. Secondly, we apply our framework on a real-life dataset with historical observations of hundreds of instruments with a testing period of more than 20 years.

Submitted 17 November, 2021; originally announced November 2021.

Comments: 12 pages

arXiv:2108.09750 [pdf, other]

Fragmentation, Price Formation, and Cross-Impact in Bitcoin Markets

Authors: Jakob Albers, Mihai Cucuringu, Sam Howison, Alexander Y. Shestopaloff

Abstract: In light of micro-scale inefficiencies induced by the high degree of fragmentation of the Bitcoin trading landscape, we utilize a granular data set comprised of orderbook and trades data from the most liquid Bitcoin markets, in order to understand the price formation process at sub-1 second time scales. To achieve this goal, we construct a set of features that encapsulate relevant microstructural information over short lookback windows. These features are subsequently leveraged first to generate a leader-lagger network that quantifies how markets impact one another, and then to train linear models capable of explaining between 10% and 37% of total variation in $500$ms future returns (depending on which market is the prediction target). The results are then compared with those of various PnL calculations that take trading realities, such as transaction costs, into account. The PnL calculations are based on natural $\textit{taker}$ strategies (meaning they employ market orders) that we associate to each model. Our findings emphasize the role of a market's fee regime in determining its propensity to being a leader or a lagger, as well as the profitability of our taker strategy. Taking our analysis further, we also derive a natural $\textit{maker}$ strategy (i.e., one that uses only passive limit orders), which, due to the difficulties associated with backtesting maker strategies, we test in a real-world live trading experiment, in which we turned over 1.5 million USD in notional volume.
Lending additional confidence to our models, and by extension to the features they are based on, the results indicate a significant improvement over a naive benchmark strategy, which we also deploy in a live trading environment with real capital, for the sake of comparison.

Submitted 22 August, 2021; originally announced August 2021.

Comments: 62 pages, 34 figures, 24 tables

arXiv:1909.04497 [pdf, other]

Equity2Vec: End-to-end Deep Learning Framework for Cross-sectional Asset Pricing

Authors: Qiong Wu, Christopher G. Brinton, Zheng Zhang, Andrea Pizzoferrato, Zhenming Liu, Mihai Cucuringu

Abstract: Pricing assets has attracted significant attention from the financial technology community. We observe that the existing solutions overlook the cross-sectional effects and do not fully leverage the heterogeneous data sets, leading to sub-optimal performance. To this end, we propose an end-to-end deep learning framework to price the assets. Our framework possesses two main properties: 1) We propose Equity2Vec, a graph-based component that effectively captures both long-term and evolving cross-sectional interactions. 2) The framework simultaneously leverages all the available heterogeneous alpha sources including technical indicators, financial news signals, and cross-sectional signals. Experimental results on datasets from the real-world stock market show that our approach outperforms the existing state-of-the-art approaches. Furthermore, market trading simulations demonstrate that our framework monetizes the signals effectively.

Submitted 26 October, 2021; v1 submitted 7 September, 2019; originally announced September 2019.

Comments: 9 pages

Journal ref: International Conference on AI in Finance, 2021

Showing 1–19 of 19 results for author: Cucuringu, M