-
Optimum Output Long Short-Term Memory Cell for High-Frequency Trading Forecasting
Authors:
Adamantios Ntakaris,
Moncef Gabbouj,
Juho Kanniainen
Abstract:
High-frequency trading requires fast data processing without information lags for precise stock price forecasting. This high-paced stock price forecasting is usually based on vectors that need to be treated as sequential and time-independent signals due to the time irregularities that are inherent in high-frequency trading. A well-documented and tested method that considers these time-irregulariti…
▽ More
High-frequency trading requires fast data processing without information lags for precise stock price forecasting. This high-paced stock price forecasting is usually based on vectors that need to be treated as sequential and time-independent signals due to the time irregularities that are inherent in high-frequency trading. A well-documented and tested method that considers these time-irregularities is a type of recurrent neural network, named long short-term memory neural network. This type of neural network is formed based on cells that perform sequential and stale calculations via gates and states without knowing whether their order, within the cell, is optimal. In this paper, we propose a revised and real-time adjusted long short-term memory cell that selects the best gate or state as its final output. Our cell is running under a shallow topology, has a minimal look-back period, and is trained online. This revised cell achieves lower forecasting error compared to other recurrent neural networks for online high-frequency trading forecasting tasks such as the limit order book mid-price prediction as it has been tested on two high-liquid US and two less-liquid Nordic stocks.
△ Less
Submitted 15 May, 2023; v1 submitted 17 April, 2023;
originally announced April 2023.
-
Mid-price Prediction Based on Machine Learning Methods with Technical and Quantitative Indicators
Authors:
Adamantios Ntakaris,
Juho Kanniainen,
Moncef Gabbouj,
Alexandros Iosifidis
Abstract:
Stock price prediction is a challenging task, but machine learning methods have recently been used successfully for this purpose. In this paper, we extract over 270 hand-crafted features (factors) inspired by technical and quantitative analysis and tested their validity on short-term mid-price movement prediction. We focus on a wrapper feature selection method using entropy, least-mean squares, an…
▽ More
Stock price prediction is a challenging task, but machine learning methods have recently been used successfully for this purpose. In this paper, we extract over 270 hand-crafted features (factors) inspired by technical and quantitative analysis and tested their validity on short-term mid-price movement prediction. We focus on a wrapper feature selection method using entropy, least-mean squares, and linear discriminant analysis. We also build a new quantitative feature based on adaptive logistic regression for online learning, which is constantly selected first among the majority of the proposed feature selection methods. This study examines the best combination of features using high frequency limit order book data from Nasdaq Nordic. Our results suggest that sorting methods and classifiers can be used in such a way that one can reach the best performance with a combination of only very few advanced hand-crafted features.
△ Less
Submitted 13 July, 2019;
originally announced July 2019.
-
Feature Engineering for Mid-Price Prediction with Deep Learning
Authors:
Adamantios Ntakaris,
Giorgio Mirone,
Juho Kanniainen,
Moncef Gabbouj,
Alexandros Iosifidis
Abstract:
Mid-price movement prediction based on limit order book (LOB) data is a challenging task due to the complexity and dynamics of the LOB. So far, there have been very limited attempts for extracting relevant features based on LOB data. In this paper, we address this problem by designing a new set of handcrafted features and performing an extensive experimental evaluation on both liquid and illiquid…
▽ More
Mid-price movement prediction based on limit order book (LOB) data is a challenging task due to the complexity and dynamics of the LOB. So far, there have been very limited attempts for extracting relevant features based on LOB data. In this paper, we address this problem by designing a new set of handcrafted features and performing an extensive experimental evaluation on both liquid and illiquid stocks. More specifically, we implement a new set of econometrical features that capture statistical properties of the underlying securities for the task of mid-price prediction. Moreover, we develop a new experimental protocol for online learning that treats the task as a multi-objective optimization problem and predicts i) the direction of the next price movement and ii) the number of order book events that occur until the change takes place. In order to predict the mid-price movement, the features are fed into nine different deep learning models based on multi-layer perceptrons (MLP), convolutional neural networks (CNN) and long short-term memory (LSTM) neural networks. The performance of the proposed method is then evaluated on liquid and illiquid stocks, which are based on TotalView-ITCH US and Nordic stocks, respectively. For some stocks, results suggest that the correct choice of a feature set and a model can lead to the successful prediction of how long it takes to have a stock price movement.
△ Less
Submitted 7 June, 2019; v1 submitted 10 April, 2019;
originally announced April 2019.
-
Machine Learning for Forecasting Mid Price Movement using Limit Order Book Data
Authors:
Paraskevi Nousi,
Avraam Tsantekidis,
Nikolaos Passalis,
Adamantios Ntakaris,
Juho Kanniainen,
Anastasios Tefas,
Moncef Gabbouj,
Alexandros Iosifidis
Abstract:
Forecasting the movements of stock prices is one the most challenging problems in financial markets analysis. In this paper, we use Machine Learning (ML) algorithms for the prediction of future price movements using limit order book data. Two different sets of features are combined and evaluated: handcrafted features based on the raw order book data and features extracted by ML algorithms, resulti…
▽ More
Forecasting the movements of stock prices is one the most challenging problems in financial markets analysis. In this paper, we use Machine Learning (ML) algorithms for the prediction of future price movements using limit order book data. Two different sets of features are combined and evaluated: handcrafted features based on the raw order book data and features extracted by ML algorithms, resulting in feature vectors with highly variant dimensionalities. Three classifiers are evaluated using combinations of these sets of features on two different evaluation setups and three prediction scenarios. Even though the large scale and high frequency nature of the limit order book poses several challenges, the scope of the conducted experiments and the significance of the experimental results indicate that Machine Learning highly befits this task carving the path towards future research in this field.
△ Less
Submitted 8 April, 2019; v1 submitted 19 September, 2018;
originally announced September 2018.
-
Benchmark Dataset for Mid-Price Forecasting of Limit Order Book Data with Machine Learning Methods
Authors:
Adamantios Ntakaris,
Martin Magris,
Juho Kanniainen,
Moncef Gabbouj,
Alexandros Iosifidis
Abstract:
Managing the prediction of metrics in high-frequency financial markets is a challenging task. An efficient way is by monitoring the dynamics of a limit order book to identify the information edge. This paper describes the first publicly available benchmark dataset of high-frequency limit order markets for mid-price prediction. We extracted normalized data representations of time series data for fi…
▽ More
Managing the prediction of metrics in high-frequency financial markets is a challenging task. An efficient way is by monitoring the dynamics of a limit order book to identify the information edge. This paper describes the first publicly available benchmark dataset of high-frequency limit order markets for mid-price prediction. We extracted normalized data representations of time series data for five stocks from the NASDAQ Nordic stock market for a time period of ten consecutive days, leading to a dataset of ~4,000,000 time series samples in total. A day-based anchored cross-validation experimental protocol is also provided that can be used as a benchmark for comparing the performance of state-of-the-art methodologies. Performance of baseline approaches are also provided to facilitate experimental comparisons. We expect that such a large-scale dataset can serve as a testbed for devising novel solutions of expert systems for high-frequency limit order book data analysis.
△ Less
Submitted 11 March, 2020; v1 submitted 9 May, 2017;
originally announced May 2017.