[1,1]\fnmRasoul \surAmirzadeh

\equalcont

These authors contributed equally to this work.

[1]\orgdivSchool of Information Technology, \orgnameDeakin University, \orgaddress\streetWaurn Ponds Campus, \cityGeelong, \postcode3216, \stateVictoria, \countryAustralia

[2]\orgdivDeakin Business School, \orgnameDeakin University, \orgaddress\streetBurwood, \cityMelbourne, \postcode3125, \stateVictoria, \countryAustralia

Causal Feature Engineering of Price Directions of Cryptocurrencies using Dynamic Bayesian Networks

[email protected]    \fnmDhananjay \surThiruvady [email protected]    \fnmAsef \surNazari [email protected]    \fnmMong \surShan Ee [email protected]
Abstract

Cryptocurrencies have gained popularity across various sectors, especially in finance and investment. This popularity is partly due to their unique properties, which originate from blockchain-related characteristics such as privacy, decentralisation, and untraceability. Despite their growing popularity, cryptocurrencies remain a high-risk investment due to their price volatility and uncertainty. The inherent volatility in cryptocurrency prices, coupled with internal cryptocurrency-related factors and external global economic factors, makes predicting their prices and price movement directions challenging. Nevertheless, the knowledge obtained from predicting the direction of cryptocurrency prices can provide valuable guidance for investors in making informed investment decisions. To address this issue, this paper proposes a dynamic Bayesian network (DBN) approach, which can model complex systems in multivariate settings, to predict the price movement directions on the next trading day for six popular cryptocurrencies. These include Bitcoin and five altcoins (cryptocurrencies other than Bitcoin): Binance Coin, Ethereum, Litecoin, Ripple, and Tether. The efficacy of the proposed model in predicting cryptocurrency price directions is evaluated from two perspectives. Firstly, comparisons are made to five popular baseline models, namely auto-regressive integrated moving average, support vector regression, long short-term memory, random forests and support vector machines. Secondly, from a feature engineering point of view, the impact of twenty-three different features, grouped into four categories, on the DBN’s prediction performance is investigated. The experimental results demonstrate that the DBN significantly outperforms the baseline models. In addition, among the groups of features, technical indicators are found to be the most effective predictors of cryptocurrency price directions.

keywords:
Cryptocurrencies, Dynamic Bayesian networks, Price direction prediction, Causal feature engineering

1 Introduction

The cryptocurrency market has emerged as an important player in global financial markets despite its relatively short lifespan [1, 2]. Since the inception of cryptocurrencies in 2008,\footnote{Initiated by the paper “Bitcoin: A Peer-to-Peer Electronic Cash System”, published by a scholar named Satoshi Nakamoto [3].} the cryptocurrency market capitalisation has soared. For instance, the total market capitalisation of cryptocurrencies increased from 17 billion dollars at the beginning of 2017 to 1.1 trillion dollars at the beginning of 2023.\footnote{coinmarketcap.com} This exponential growth has made cryptocurrencies an excellent investment opportunity and attracted a considerable number of investors to this market. Furthermore, Bitcoin is losing its dominance to altcoins, as recent studies have suggested [11, 12], highlighting the need for expanding research into altcoins along with Bitcoin.\footnote{Altcoins have significantly grown in popularity; their share of total market capitalisation surged from 9% in January 2017 to 60% in January 2023, with the number of different cryptocurrencies expanding from 50 to over 20,000, according to data from coinmarketcap.com.}

Despite the attractiveness of the cryptocurrency market, there are several challenges concerning the profitability of cryptocurrency trading [76]. In addition to common challenges that affect investment decisions in traditional financial assets, such as the difficulty of accurate price prediction and the influence of global economic and non-economic factors [77], cryptocurrencies face distinct challenges due to their unique ecosystem. The challenges that contribute to the volatility of their prices include mining difficulty, the security of wallets and cryptocurrency exchanges, blockchain-related energy consumption, and the lack of international acceptance and legislation [4]. Moreover, the cryptocurrency market has no closed trading periods and is often affected by public sentiment, which distinguishes it from other financial markets [5]. Cryptocurrencies exhibit higher volatility and lower liquidity than traditional financial instruments, presenting challenges for investors aiming to trade cryptocurrencies [96]. Hence, the interplay between these unique characteristics of cryptocurrencies and the general challenges of financial markets adds further complexity to predicting cryptocurrency market behaviour.

Although knowledge about the future price movement directions of financial assets is useful for investors, predicting those movements is a complex task in financial time series analysis [6]. This challenge is primarily due to the inherently complex, nonlinear, and noisy attributes of financial data [7]. In addition, there are numerous factors affecting the variability in financial data that necessitate feature engineering beyond associations derived from correlations. These factors may include market sentiment, regulatory changes, and macroeconomic indicators [91, 92, 93]. Furthermore, in the context of the cryptocurrency market, where multiple financial entities are represented in a multivariate time series, fluctuations in one variable can significantly impact the values of other variables [8]. Nevertheless, predicting price direction is useful for short-term investors, who can reduce potential market risks by taking the necessary steps when their investment value may increase or decrease due to market conditions [6]. Additionally, predicting the direction of prices can generate ‘buy’ and ‘sell’ signals, which can be used by algorithmic traders to develop automated trading activities [9]. Therefore, by reducing the quantitative price prediction problem to the narrower problem of classifying price movement direction, it becomes possible to determine trend changes qualitatively for a given window of time [10]. Adopting this approach can potentially result in a useful technique for predicting cryptocurrency market directions and is an essential step toward designing reliable procedures for predicting the actual quantitative prices of financial assets [8].

Artificial intelligence (AI), and in particular machine learning (ML), has proven to be effective in prediction across various fields, ranging from health [79, 80] to transportation [78] and finance [81]. As a result, financial technology (Fintech)\footnote{The term ‘Fintech’ refers to individuals or companies that bring innovation and disruption to the financial industry by merging technological and financial capabilities [13].} companies are increasingly utilising ML techniques, such as long short-term memory [14] and artificial neural networks [15], to address the challenge of price prediction in the cryptocurrency market. However, as surveyed in [16], finding an appropriate ML technique to address the challenges of predictions with high accuracy and effective feature engineering is not straightforward. In particular, the accuracy of prediction is highly sensitive to the choice of the model, sample size and corresponding hyperparameters [17, 82, 83].

Dynamic Bayesian networks (DBNs) are versatile tools for modelling complex systems that change over time and can be used to gain knowledge about the causal relationships between variables. DBNs are a widely used technique for learning dependencies between random variables in time-series data [18]. DBNs are useful for applications such as prognosis, fault detection, reliability analysis, risk assessment, and safety evaluation [19, 94]. In particular, DBNs have been used for prediction [20, 84, 85], smoothing and filtering [21], and are frequently utilised in fields such as robotics [22], speech recognition [23], and finance [24].

Our research aims to predict the price directions of cryptocurrencies using causal feature engineering, and we choose DBNs for several reasons. First, DBNs efficiently describe Markov processes, requiring only the current state to determine the probability of the next state. While this construction may be challenging for a complex system such as the cryptocurrency market due to the vast set of states, DBNs are well suited to compactly describing combinatorial state spaces in Markov processes [95]. Moreover, DBNs establish causal relationships among all system variables, effectively addressing the challenge of understanding how fluctuations in one variable affecting cryptocurrency price movements impact other factors or the entire system [8].\footnote{We use the words ‘variables’ and ‘factors’ interchangeably throughout the paper.} Compared to some ML models used for classification, which are limited to predicting only the direction of market movement without accounting for the magnitude or duration of price movements [5], DBNs offer the advantage of probabilistic inference for both upward and downward price movements, providing an estimate of the probability of such movements. Furthermore, given the numerous factors affecting fluctuations in financial time series data, DBNs are capable of identifying the most influential set of features within the data. Understanding the most influential factors allows for deeper insight into the drivers affecting cryptocurrency prices, which is crucial for comprehending this unique and complex market [25]. In addition, the lack of explainability in some ML methods, frequently described as “black-box” models, presents challenges for their practical application. The absence of critical details, such as the importance of features, the connection between variables, and the rationale behind predictions, can hinder the adoption of ML approaches, particularly for those not familiar with ML methodologies [97]. DBNs provide a structured approach that addresses this shortcoming by offering a graphical model to represent the dynamics of systems within uncertain environments [75]. This approach is particularly beneficial for analysing the cryptocurrency market, which is characterised by high levels of volatility and complexity. Despite all these important attributes, DBNs are rarely used in the domain of cryptocurrency.

In response to the challenges of price prediction for financial assets, we employ DBNs to forecast the direction of price movements in cryptocurrencies. In addition to Bitcoin, the market leader, the study investigates the effectiveness of DBNs in predicting the daily price directions of five popular altcoins, which are chosen mainly on the basis of market capitalisation and data availability. The performance of the DBN models is compared with five frequently used baseline models, namely autoregressive integrated moving average (ARIMA), support vector regression (SVR), random forests (RF), long short-term memory (LSTM), and support vector machines (SVM). Moreover, the research explores whether increasing the number of features fed into the DBN leads to a monotonic improvement in prediction accuracy. To test this, we investigate how different feature categories influence the performance of DBNs in predicting market directions. The contributions of our research are:

  • Analysing daily price directions for six popular cryptocurrencies: Bitcoin, Binance Coin, Ethereum, Litecoin, Ripple, and Tether.

  • Discovering the causal relationships among four feature groups—price information, social media activity, macro-financial data, and technical indicators—and their impact on predicting cryptocurrency prices.

  • Proposing DBNs to predict cryptocurrency prices by leveraging causal influences. Specifically, by predicting the direction of price movement, DBNs can generate probabilistic ‘buy’ and ‘sell’ signals, which can be helpful for investors in making investment decisions.

The remainder of this paper is structured as follows. In Section 2, a literature review is provided on the studies related to predicting cryptocurrency prices. Section 3 introduces DBNs. Section 4 outlines the selection of features. Section 5 provides information on experimental design and data analysis procedures. The findings of our study are discussed in Section 6. Finally, Section 7 provides concluding remarks on the study and suggests directions for future research.

2 Related work

In this section, we review a set of recent academic publications on the use of ML models for predicting price movement directions. We first briefly consider the feature engineering side of devising ML models, including the incorporation of technical indicators and social media data and their impact on the accuracy of the models. We then survey the applications of Bayesian networks (BNs) in predicting upward and downward trends in financial markets.

Technical indicators are valuable tools for making predictions and trading decisions, offering financial market participants insights into market trends. They are commonly used features in ML studies for predicting various financial markets, as seen in [27] and [28]. However, they have received less attention as input features in the cryptocurrency literature. For instance, Huang et al. [29] build a tree-based classification model to assess whether Bitcoin returns are predictable. Using Python’s TA-Lib library, they create 124 indicators in five categories: overlap study, momentum, cycle, volatility, and pattern recognition indicators. According to their results, the model has predictive power for narrow intraday ranges of Bitcoin and outperforms the buy-and-hold strategy. The study also concludes that technical analysis is useful in the Bitcoin market, despite the fact that non-fundamental factors are the primary drivers of its value. Also, a study by Alonso-Monsalve et al. [30] predicts trends of six popular cryptocurrencies (Bitcoin, Dash, Ethereum, Litecoin, Monero, and Ripple) using eighteen technical indicators derived from one year of data. They classify one-minute trends into three categories (increase, neutral, or decrease) using four different neural network architectures, including convolutional neural networks and multilayer perceptrons. Their results suggest that all cryptocurrencies are predictable to a certain extent by using technical indicators. However, the study focuses on short-term trend prediction, which is subject to limitations such as response times and liquidity issues. Moreover, their proposed hybrid network outperforms the other models, and the models perform better at predicting Bitcoin, Ethereum, and Litecoin. In another study, Akyildirim et al. [31] test the predictability of twelve cryptocurrencies using four ML algorithms, including support vector machine (SVM) and logistic regression. For their investigations, they use past price information and eight technical indicators, including the five-day relative strength index (RSI) and simple moving average (SMA), as features for their models. The results show that all four algorithms have an average classification accuracy consistently above the 50% threshold for all cryptocurrencies.

The influence of popular social platforms, such as Twitter, on cryptocurrency price movements has received substantial attention in the literature [32, 33]. Abraham et al. [34] predict changes in Bitcoin and Ethereum prices using Twitter and Google Trends data. According to their study, the volume of tweets, rather than their sentiment, is a predictor of price direction. Moreover, they employ a sentiment analysis tool and observe that the sentiment of tweets tends to remain positive regardless of price changes. In another study, Valencia et al. [5] use multilayer perceptron, SVM, and random forest to predict the price movement of four cryptocurrencies (Bitcoin, Ethereum, Ripple, and Litecoin) by incorporating Twitter data and raw price data as two feature groups. The Twitter data used in the study include the sentiment of daily tweets, which are grouped into four categories of positive, negative, neutral, and compound sentiments. The models’ performances are compared based on independent feature groups and their combinations. The results suggest that Twitter data alone has the potential to predict specific cryptocurrencies, with the best results obtained for Bitcoin. However, there is no universally accepted criterion for evaluating model performance in predicting different cryptocurrencies. For example, different models show superiority on different coins based on accuracy and precision performance criteria.

Despite the considerable capacity of DBNs for modelling complex stochastic situations and detecting features with causal relationships, they have rarely been used in analysing the cryptocurrency market. However, a number of studies applying Bayesian networks (BNs) and DBNs to other financial markets are presented here as examples. For instance, Wang et al. [10] apply DBNs to predict the stock market trend in both the US and China markets by incorporating nine macroeconomic factors, such as gross domestic product and monthly inflation rate, into their DBNs. The study results show that the proposed method effectively captures changes in market trends a few months ahead of the actual turning points of stocks. However, the models do not determine the direction of the trend changes. As another example, Jangmin et al. [35] implement a variant of DBNs to model the dynamics of price trends of twenty companies in the Korean stock market. The results show that the proposed model cannot outperform the buy-and-hold strategy, the first baseline model, since the test period of the dataset falls in a bull market, in which the buy-and-hold strategy naturally wins. However, the model outperforms the triple exponential average indicator, the second baseline model, in terms of cumulative profit. In another DBN-related study, Wang et al. [36] investigate the potential of DBNs in predicting stock prices and generating profits in the stock market. They model the stock prices of NASDAQ and the Stock Exchange of Thailand using five years of historical intraday closing-price data. The profit generated by their model is compared with several benchmarks, including the buy-and-hold strategy, and it statistically outperforms the benchmark strategies.

Regarding price movement direction, Zuo and Kita [37] utilise BNs to detect the upward and downward movements of stock indices. The accuracy and total profit of the BN are compared with psychological line and trend estimation algorithms. The psychological line in the study uses a predefined formula to analyse the prices of previous days and provide insights on whether the price will increase or decrease on the next day. The proposed model demonstrates an accuracy rate of around 60%, which is approximately 10% higher than the other investment strategies investigated in the study. The BN also generates much greater profit than the baseline models. In addition, Malagrino et al. [38] use BNs to investigate the impact of foreign markets on Brazil’s main stock exchange index (iBOVESPA). Two different BN topologies are designed using 24- and 48-hour timeframes, with each BN predicting the index’s next-day closing direction (up or down). The study evaluates the performance of each BN using sets of one, two, and three different indices from three continents. The results reveal that the 24-hour BNs achieve a mean accuracy rate of approximately 71%, which is better than the 68% accuracy rate achieved by the 48-hour counterpart. Additionally, the study finds that the BN models’ performance improves when fewer indices are included in the BNs.

3 Dynamic Bayesian networks

BNs are a powerful tool for modelling complex systems and can be used to extract the underlying causal relationships between variables [39]. They are a type of probabilistic graphical model (PGM) used to represent stochastic relationships between random variables through directed acyclic graphs (DAGs) [40]. They are based on the Bayesian theory of conditional probability and can be used for reasoning, prediction, and decision-making in a wide range of fields, from environmental management [41] and maritime accidents [42] to the wind energy industry [43] and credit assessment [44]. In a BN, nodes represent variables, and directed edges represent the conditional dependencies between variables. Each node in the network is associated with a probability distribution that describes the probabilities of the possible values of that variable given the values of its parent variables. These conditional distributions, which together define the joint probability distribution over the nodes, are represented by a conditional probability table (CPT) for each node [45]. BNs can be utilised to perform inference, which involves using the network structure of the variables in the system to make predictions or draw conclusions about the state of the system given some evidence or observations. In particular, the visual representation of dependencies in a system using DAGs makes BNs a versatile tool for communicating the properties of the system. They can handle missing data and hidden variables and, in addition, the training of BNs inherently helps to avoid overfitting [39], which is a common problem in ML modelling.
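To make the role of CPTs and inference concrete, the following minimal sketch builds a two-node BN and computes a marginal and a posterior by enumeration. The node names (`Sentiment`, `Direction`) and all probabilities are invented for illustration; they are not taken from this study.

```python
# Toy BN with one arc: Sentiment -> Direction. All numbers are illustrative.
p_sentiment = {"positive": 0.6, "negative": 0.4}   # prior P(S)
cpt_direction = {                                   # CPT P(D | S)
    "positive": {"up": 0.7, "down": 0.3},
    "negative": {"up": 0.4, "down": 0.6},
}

def joint(s, d):
    """P(S=s, D=d), factorised along the DAG as P(S) * P(D | S)."""
    return p_sentiment[s] * cpt_direction[s][d]

def posterior_sentiment(d):
    """P(S | D=d), obtained by enumerating the states of S (Bayes' rule)."""
    evidence = sum(joint(s, d) for s in p_sentiment)
    return {s: joint(s, d) / evidence for s in p_sentiment}

# Prediction: marginal probability of an upward move.
marginal_up = sum(joint(s, "up") for s in p_sentiment)
# Diagnosis: belief about sentiment after observing an upward move.
post = posterior_sentiment("up")
```

Enumeration is only practical for very small networks; it is used here because it exposes the factorisation directly.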

DBNs are a type of BN that can be used to model systems that change over time. Unlike static BNs, which only model dependencies between random variables at a single point in time, DBNs represent the temporal evolution of a system by establishing time-dependent dependencies between variables. In a DBN, each node represents a random variable at a specific point in time, and directed edges model the conditional dependencies between variables at the same or different time steps [46]. A DBN is composed of two parts: time slices and inter-slice arcs. A time slice represents the state of the system at a given time $t$ and is essentially an identical BN at each time step. Within a time slice, the relationships between variables are represented by intra-slice arcs. The second component of a DBN is inter-slice arcs, also known as temporal arcs. These arcs represent the relationships between the same variable, or between different variables, across time slices. These components are illustrated in Figure 1(a). In addition, a basic assumption in constructing a DBN is that the system follows a first-order Markov process, in which the state of the system at time $t$ depends only on its state at the previous time slice $t-1$. In other words, temporal nodes at time $t$ are only connected to corresponding nodes at the previous time slice $t-1$ [47, 48, 49].

Figure 1: An illustration of a DBN composed of two time slices: (a) unrolled 2TBN; (b) 2TBN. The two sub-figures represent the same DBN, where sub-figure (b) is a compact version of sub-figure (a). Connections between $A_0$, $B_0$, and $C_0$ inside time slice $0$ are intra-slice arcs (solid lines), whereas the links from $A_0$ to $A_1$ and from $B_0$ to $B_1$, going from time slice $0$ to time slice $1$, are inter-slice arcs (dotted lines).

The process of building a DBN is iterative. Each time slice requires the same structural form as the previous or next slice, and time slices reflect the change in probabilities of the variables [50]. In particular, converting a BN to a DBN involves three main steps. The first step is to modify the BN structure to incorporate the dynamics of the process. Next, one needs to introduce a time parameter in the definition of the states of all nodes to describe the temporal relationship. Finally, the static BN is repeated for $n$ time steps and the belief in the system is updated for the given time step [51, 52].
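The repetition step can be sketched in a few lines of Python. This is a hypothetical illustration, not the implementation used in the study: the function name `unroll` and the intra-slice arcs are assumptions, while the temporal arcs (A to A, B to B) follow the caption of Figure 1.

```python
def unroll(intra_arcs, inter_arcs, n_slices):
    """Repeat a two-slice template for n_slices time steps.

    intra_arcs: (parent, child) pairs inside a single time slice.
    inter_arcs: (parent, child) pairs from slice t-1 to slice t.
    Returns the edges of the unrolled DAG as ((name, t), (name, t)) pairs.
    """
    edges = []
    for t in range(n_slices):
        # Every slice has the same internal structure.
        edges += [((p, t), (c, t)) for p, c in intra_arcs]
        if t > 0:
            # First-order Markov assumption: only slice t-1 feeds slice t.
            edges += [((p, t - 1), (c, t)) for p, c in inter_arcs]
    return edges

intra = [("A", "B"), ("B", "C")]   # assumed intra-slice structure
inter = [("A", "A"), ("B", "B")]   # temporal arcs as described for Figure 1
edges = unroll(intra, inter, n_slices=5)
```

With five slices, the template yields ten intra-slice edges and eight inter-slice edges, and no edge ever skips a time slice.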

From a mathematical perspective, a DBN is a pair $(B_0, B_{\rightarrow})$, where $B_0$ is a BN model that defines the prior network, and $B_{\rightarrow}$ is a two-time-slice temporal BN (2TBN) which defines the relationship between two consecutive time slices through a transition probability table [50, 53]. The joint probability distribution of a DBN can be written as

$$P(X_{1:T}^{1:N})=\prod_{t=1}^{T}\prod_{i=1}^{N}P\big(X_{t}^{i}\mid Pa(X_{t}^{i})\big),$$

where $X_{t}^{i}$ is the $i^{th}$ node at time step $t$, and $Pa(X_{t}^{i})$ represents the parent nodes of $X_{t}^{i}$ in the corresponding DAG. Furthermore, the conditional probability $P(X_{t}^{i}\mid Pa(X_{t}^{i}))$ indicates that the transition probabilities are a product of the CPTs in the 2TBN, where $T$ is the full time horizon ($T=5$ in this study), and $N$ is the number of nodes in each time slice. Once the probabilities on nodes in a DBN are determined through the joint probability distribution calculation, different forms of reasoning and inference, such as prediction, diagnosis, or decision-making, can be performed.
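As a worked instance of this factorisation, the sketch below evaluates the joint probability of a short trajectory in a toy DBN with $N=2$ nodes per slice. The node names (`D` for direction, `V` for a discretised volume state), the network structure, and every probability are assumptions made for illustration only.

```python
# Toy DBN: D_t depends on D_{t-1} (inter-slice arc); V_t depends on D_t (intra-slice arc).
prior_d = {"up": 0.5, "down": 0.5}                  # P(D_1), prior network
trans_d = {"up": {"up": 0.6, "down": 0.4},          # P(D_t | D_{t-1}), 2TBN
           "down": {"up": 0.3, "down": 0.7}}
cpt_v = {"up": {"high": 0.8, "low": 0.2},           # P(V_t | D_t)
         "down": {"high": 0.5, "low": 0.5}}

def joint_probability(d_seq, v_seq):
    """Product over time steps and nodes of P(node | parents)."""
    prob = 1.0
    for t, (d, v) in enumerate(zip(d_seq, v_seq)):
        p_d = prior_d[d] if t == 0 else trans_d[d_seq[t - 1]][d]
        prob *= p_d * cpt_v[d][v]
    return prob

def next_direction(d_today):
    """One-step-ahead predictive distribution over tomorrow's direction."""
    return trans_d[d_today]

p = joint_probability(["up", "up", "down"], ["high", "low", "low"])
# p = (0.5 * 0.8) * (0.6 * 0.2) * (0.4 * 0.5) = 0.0096
```

The `next_direction` helper shows how the same CPTs give a probability for each direction, which is the kind of output that could be turned into a probabilistic ‘buy’ or ‘sell’ signal.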

4 Proposed framework

This section outlines the framework for the development of a prediction model for price movement directions. It briefly discusses the rationale for selecting particular features for this study. A detailed explanation of the process of developing a DBN based on these features follows.

4.1 Data features

To predict cryptocurrency price directions, we examine three feature categories to analyse their impact on our model’s prediction accuracy. Figure 2 presents the conceptual framework of our proposed cryptocurrency price prediction model and shows the structure of the selected groups of features, which we describe here.

Figure 2: This conceptual framework depicts the feature engineering approach used in the study, outlining three distinct feature categories that impact the direction of closing price movements, depicted in gray. Basic price information is shown in pink, encompassing open, high, low and volume data. The second group, coloured in amber, consists of nine specific technical indicators. The third category, external factors, illustrated in green, comprises macro-financial factors, including five financial assets and social media impact measured by daily tweet volume related to each cryptocurrency.

The first group of features incorporates basic price data, including daily open, high, low, close prices, and volume (OHLCV). Opening and closing prices reveal market trends, high and low prices indicate price volatility, and trading volume shows market liquidity. Therefore, this group of features provides a detailed snapshot of a financial asset’s status on a given day [54].
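The prediction target, the direction of the next trading day's closing price, could be derived from the close series as in the following sketch. The function name and the treatment of unchanged prices as ‘down’ are assumptions; the text does not specify how ties are handled.

```python
def direction_labels(close):
    """Label day t as 'up' if close[t+1] > close[t], otherwise 'down'.

    The final day has no next-day close, so it receives no label.
    Ties are labelled 'down' here, which is an arbitrary choice.
    """
    return ["up" if nxt > cur else "down" for cur, nxt in zip(close, close[1:])]

closes = [100.0, 102.5, 101.0, 101.0, 105.2]   # made-up closing prices
labels = direction_labels(closes)
```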

Apart from the raw price and volume data, price movements can be predicted using technical indicators [30]. Technical indicators, derived from price data through mathematical formulas, are extensively employed to identify trends and generate buy and sell signals in financial markets [55]. We use a combination of various types of technical indicators in our proposed model, including accumulation/distribution (AD), Bollinger bands (BBands), exponential moving average (EMA), on-balance volume (OBV), moving average convergence/divergence (MACD), normalised average true range (NATR), relative strength index (RSI), simple moving average (SMA), and stochastic oscillator (Stoch). These technical indicators offer insights into financial markets, including movement strength, trend reversals, and price volatility. Further details on these technical indicators and their applications can be found in [30, 56].
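For readers unfamiliar with these indicators, the sketch below gives textbook-style definitions of three of them in plain Python. It is not the implementation used in the study: the EMA seeding with the first value and the RSI based on simple averages (rather than Wilder's smoothing) are simplifying assumptions.

```python
def sma(values, period):
    """Simple moving average; None until `period` values are available."""
    out = []
    for i in range(len(values)):
        if i + 1 < period:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - period:i + 1]) / period)
    return out

def ema(values, period):
    """Exponential moving average with weight 2/(period+1), seeded with values[0]."""
    alpha = 2.0 / (period + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def rsi(values, period):
    """RSI of the last `period` price changes, using simple averages."""
    if len(values) < period + 1:
        raise ValueError("need at least period + 1 values")
    recent = values[-(period + 1):]
    changes = [b - a for a, b in zip(recent, recent[1:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0   # no losses in the window
    return 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)
```

For example, `sma([1, 2, 3, 4, 5], 3)` averages each trailing window of three values, and a price series that alternates between equal gains and losses yields an RSI of 50.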

Cryptocurrency prices are influenced by external factors, including adoption, attractiveness, and macro-financial drivers, which are unrelated to direct price data [57, 58]. Adoption and attractiveness, concepts rooted in behavioural finance,\footnote{Behavioural finance is a relatively new school of thought in the finance domain that describes the psychological reasons behind the decision-making of investors [59].} describe the investment attractiveness of the cryptocurrency market and the intentions of investors to utilise this new financial product [60]. In the context of cryptocurrency adoption and attractiveness, sentiment analysis of posts on Twitter\footnote{The company’s name has changed to X, but we use the name that was in effect at the time of the study.} is a popular way of using social media data to determine public opinion about the market [61]. Therefore, we include the daily number of tweets associated with each cryptocurrency in the study. Regarding macro-financial factors, several traditional financial assets and indices from diverse economic classes, such as currencies, commodities, and stock indices, are frequently selected in the cryptocurrency and financial literature [62, 77, 64]. Hence, gold, the US dollar index (USDX), Standard & Poor’s 500 Index (S&P 500), MSCI, and West Texas Intermediate (WTI) are selected as macro-financial factors in this study. A short summary of all features is provided in Table 1.

Table 1: Feature description - A short description of selected features, their types, and the time window considered in this study.
| Type | Feature | Description | Time window |
| --- | --- | --- | --- |
| Macro-financial | Gold | The gold spot market price in US dollars | Daily |
| Macro-financial | MSCI | A market capitalisation-weighted index comprising 1,546 companies from around the world | Daily |
| Macro-financial | S&P 500 | A market capitalisation-weighted index of the 500 leading publicly traded companies in the US | Daily |
| Macro-financial | USDX | The value of the US dollar relative to a basket of six foreign currencies | Daily |
| Macro-financial | WTI | A popular oil price benchmark | Daily |
| OHLCV | Close price | The price at which a cryptocurrency is last traded in a trading interval | Daily |
| OHLCV | Low price | A cryptocurrency's lowest trading price in a trading interval | Daily |
| OHLCV | High price | A cryptocurrency's highest trading price in a trading interval | Daily |
| OHLCV | Open price | The price at which a cryptocurrency is first traded in a trading interval | Daily |
| OHLCV | Volume | The total number of units of a cryptocurrency traded in a trading interval | Daily |
| Social media | Tweet number | The number of daily tweets associated with a cryptocurrency | Daily |
| Technical indicators | AD | Uses volume and price to calculate the money flow into or out of a security and determines the accumulation or distribution of funds by traders. | Last period |
| Technical indicators | BBands | A band of three lines: usually an SMA in the middle, with upper and lower bands positioned two standard deviations away from the SMA. | 5 days |
| Technical indicators | EMA | A moving average that applies more weight to recent data points to reduce the data lag. | 10 days |
| Technical indicators | MACD | Measures the difference between two EMAs (typically the 12-day and 26-day EMAs). | Fast period = 12, slow period = 26, signal period = 9 |
| Technical indicators | NATR | A metric that measures volatility. | 14 days |
| Technical indicators | OBV | A cumulative indicator that measures buying and selling pressure. | Last period |
| Technical indicators | RSI | Indicates overbought and oversold conditions by comparing the magnitude of gains and losses. | 14 days |
| Technical indicators | SMA | The average of the data points over a given period. | 10 days |
| Technical indicators | Stoch | A range-bound momentum indicator that potentially determines overbought and oversold situations. | 14 days |
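To make the time windows in Table 1 concrete, the following minimal sketch computes three of the listed indicators (SMA, EMA and BBands) from a list of close prices. It is an illustration only and is not the implementation used in this study: the function names are hypothetical, the EMA is seeded with the first observation, and the Bollinger bands use the population standard deviation, both of which are assumptions.

```python
from statistics import pstdev
from typing import List, Optional, Tuple


def sma(values: List[float], window: int = 10) -> List[Optional[float]]:
    """Simple moving average; None until a full window is available."""
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]) / window)
    return out


def ema(values: List[float], window: int = 10) -> List[float]:
    """Exponential moving average with smoothing factor 2 / (window + 1).

    Assumption: the series is seeded with the first observation.
    """
    alpha = 2.0 / (window + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1.0 - alpha) * out[-1])
    return out


def bollinger(
    values: List[float], window: int = 5, num_std: float = 2.0
) -> List[Optional[Tuple[float, float, float]]]:
    """Return (lower, middle, upper) bands; None until a full window is available.

    Assumption: the band width uses the population standard deviation.
    """
    out: List[Optional[Tuple[float, float, float]]] = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
            continue
        win = values[i + 1 - window : i + 1]
        mid = sum(win) / window
        spread = num_std * pstdev(win)
        out.append((mid - spread, mid, mid + spread))
    return out
```

The default windows (10 days for SMA and EMA, 5 days for BBands) follow Table 1; every other choice above is illustrative.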

4.2 Developing a DBN model for price prediction

In this study, we deploy our DBN models in two stages. First, we construct BNs to uncover the structure of causal relationships within the training data, which lets us capture the relationships between price features and generate probability distributions for the states of the system. BNs, however, describe these relationships at a single static moment and do not capture the temporal evolution of the factors. To address this and to accommodate the dynamic nature of the cryptocurrency market and its time series data, we then convert the BNs into DBNs. This conversion does not imply dynamic changes in network structure or parameters. Instead, a DBN is a BN repeated over successive timeframes, which allows the whole system to be modelled over time. DBN models calculate how probabilities change over time, which is vital for incorporating historical prices when forecasting future prices. Our proposed DBNs use five time steps, allowing the model to incorporate price data from the preceding five days to predict today's price direction.
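The idea of a BN repeated over successive timeframes with fixed parameters can be illustrated with a deliberately simplified stand-in: a single 'Down'/'Up' variable whose state at one time step depends only on its state at the previous step. The sketch below estimates that one conditional probability table by counting and then reuses it unchanged across several time slices. It is not the multivariate DBN learned with PySMILE in this study; the function names and the Laplace smoothing are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Tuple

STATES = ("Down", "Up")


def fit_transition(labels: List[str], alpha: float = 1.0) -> Dict[Tuple[str, str], float]:
    """Estimate P(state_t | state_{t-1}) by counting consecutive pairs.

    Keys are (previous_state, current_state). Laplace smoothing with
    pseudo-count `alpha` is an assumption of this sketch.
    """
    counts = Counter(zip(labels[:-1], labels[1:]))
    table: Dict[Tuple[str, str], float] = {}
    for prev in STATES:
        total = sum(counts[(prev, cur)] for cur in STATES) + alpha * len(STATES)
        for cur in STATES:
            table[(prev, cur)] = (counts[(prev, cur)] + alpha) / total
    return table


def unroll(
    belief: Dict[str, float], table: Dict[Tuple[str, str], float], steps: int
) -> Dict[str, float]:
    """Propagate a belief through `steps` identical time slices.

    The same table is applied at every step, mirroring the point that
    neither structure nor parameters change over time.
    """
    for _ in range(steps):
        belief = {
            cur: sum(belief[prev] * table[(prev, cur)] for prev in STATES)
            for cur in STATES
        }
    return belief
```

For example, `unroll({"Down": 0.0, "Up": 1.0}, fit_transition(labels), 5)` would give the belief over the direction five slices after an observed 'Up', under this toy model.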

The architecture of our proposed DBN is shown in Figure 3. We assume that the feature groups are independent of one another, meaning that there are no inter-slice arcs between different feature groups across two consecutive timeframes. However, features may be connected within a timeframe based on the causal structure in the data learned using static BNs.

Figure 3: An illustration of the proposed DBN model at two consecutive time steps for price direction prediction based on the dynamic interactions between corresponding feature groups: TI denotes technical indicators, Price denotes OHLCV features, SM denotes social media (tweet numbers), and FA denotes financial assets. The model includes four feature groups at each of the two consecutive time steps, and inter-slice arcs (dotted lines) connect corresponding nodes between the time steps. Note that the intra-slice arcs (solid lines) connected to the close price direction are for illustrative purposes only and do not indicate actual causal relationships among the feature groups in our constructed DBNs. Intra-slice arcs between groups of features are also possible.

5 Experimental design

The study aims to predict the daily price direction of six cryptocurrencies using DBNs. Additionally, it explores the impact of various combinations of feature groups on the model’s price prediction performance. To achieve this goal, four distinct feature groups are created as follows:

  • The first combination includes only the OHLCV data.

  • The second combination incorporates OHLCV with external price factors containing Twitter data and traditional financial assets.

  • The third combination uses OHLCV data along with nine technical indicators.

  • The fourth combination includes all features.

Traditional financial asset data is generally unavailable on weekends, unlike cryptocurrencies and Twitter data. Therefore, weekday data points from these three sources are aligned to create a unified database for each cryptocurrency enabling numerical analysis. Table 2 outlines the different combinations of features with each combination assigned a unique group number, ranging from 1 (with the fewest number of features) to 4 (with the largest number of features). After creating the four groups of features, they were preprocessed using the min-max normalisation method to balance the input data ranges. Min-max normalisation has been reported to deliver satisfactory performance in supervised and unsupervised learning tasks [65, 86, 87].

Table 2: Feature groups - The list of feature group combinations and the number of individual features in each combination.
| No. | Groups of features | No. of features | Abbr. |
| --- | --- | --- | --- |
| 1 | OHLCV | 5 | OHLCV |
| 2 | OHLCV and external factors | 11 | OHLCV-EF |
| 3 | OHLCV and technical indicators | 17 | OHLCV-TI |
| 4 | OHLCV, external factors, and technical indicators | 23 | OHLCV-EF-TI |
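The weekday alignment and min-max normalisation described above can be sketched as follows. The sketch keeps only the dates present in every source (which drops weekend records that exist only for cryptocurrencies and tweets) and rescales a feature column to the range [0, 1]. The helper names are hypothetical, and the handling of a constant column is an assumption; the text does not state whether the scaling bounds were computed on the training portion only.

```python
from datetime import date
from typing import Dict, List


def align_on_common_dates(*series: Dict[date, float]) -> List[date]:
    """Return the sorted dates that appear in every input series (an inner join)."""
    common = set(series[0])
    for other in series[1:]:
        common &= set(other)
    return sorted(common)


def min_max(values: List[float]) -> List[float]:
    """Rescale values linearly to [0, 1].

    Assumption: a constant column is mapped to 0.0 to avoid division by zero.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Illustrative values only: a coin trades on Saturday, gold does not.
coin = {date(2022, 1, 7): 10.0, date(2022, 1, 8): 20.0, date(2022, 1, 10): 30.0}
gold = {date(2022, 1, 7): 1.0, date(2022, 1, 10): 2.0}
shared = align_on_common_dates(coin, gold)
scaled_coin = min_max([coin[d] for d in shared])
```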

Since the price data are continuous and the future price directions we predict are categorical, it is necessary to categorise the price data into several market states. An appropriate choice of market states depends on the model and the data, and it impacts the model's performance. For instance, some studies use three states for market trends (down, steady, and up), while others consider two additional states of strong down and strong up [88, 89]. Based on the results presented in [66], the two states 'Down' and 'Up' provide the best model performance for BNs applied to cryptocurrencies. Consequently, the data are labelled with these two states: each data record is labelled 'Down' if its value is lower than the previous day's, and 'Up' otherwise. Equation 1 outlines the labelling approach for the data records.

\[
\text{Label for time step } (t+1) =
\begin{cases}
\text{Down}, & \text{if } C_{t+1} < C_{t} \\
\text{Up}, & \text{if } C_{t+1} \geq C_{t}
\end{cases}
\qquad (1)
\]

where $C_{t+1}$ is the data record of the next day and $C_{t}$ is the data record of the current day. Furthermore, the labelled data were split into training and testing sets, with 67% of the data allocated for training and 33% for testing. This split ratio is widely adopted for data mapping and independent accuracy assessment [67].
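A minimal sketch of the labelling rule and the 67%/33% split is given below. It follows the verbal rule stated above ('Down' when a record is lower than the previous day's, 'Up' otherwise). The function names are hypothetical, and splitting in chronological order without shuffling is an assumption, since the text specifies only the ratio.

```python
from typing import List, Sequence, Tuple


def label_directions(closes: Sequence[float]) -> List[str]:
    """Label each day t+1 as 'Down' if its close is below day t's close, else 'Up'."""
    return [
        "Down" if nxt < cur else "Up"
        for cur, nxt in zip(closes[:-1], closes[1:])
    ]


def chronological_split(
    records: Sequence, train_fraction: float = 0.67
) -> Tuple[Sequence, Sequence]:
    """Split records into an earlier training part and a later testing part.

    Assumption: the order of records is preserved (no shuffling).
    """
    cut = round(len(records) * train_fraction)
    return records[:cut], records[cut:]


labels = label_directions([100.0, 101.0, 101.0, 99.0])
# labels is ['Up', 'Up', 'Down']; an unchanged close counts as 'Up'.
```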

To implement the DBNs, we utilised the GeNIe software package through the PySMILE wrapper. PySMILE is a Python package that allows for Bayesian inference and modification of Bayesian networks using Python [68]. We fed the training set of each feature group into our PySMILE code to learn the structure and parameters of the corresponding DBNs. Once the DBNs were constructed from the training set, a five-day moving window was generated from the test set and fed as input to the DBNs to predict the price directions. Specifically, the data from the last five days were used as inputs to our DBNs to predict the direction of the closing price.
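The five-day moving window over the test set can be sketched as a simple generator. This is an illustration under stated assumptions rather than the PySMILE code used in the study: `sliding_windows` is a hypothetical helper, and the window is assumed to advance by one day at a time.

```python
from typing import Iterator, List, Sequence, Tuple


def sliding_windows(rows: Sequence, window: int = 5) -> Iterator[Tuple[List, int]]:
    """Yield (inputs, target_index) pairs.

    `inputs` holds `window` consecutive rows and `target_index` is the
    position of the following day, whose direction is to be predicted.
    """
    for end in range(window, len(rows)):
        yield list(rows[end - window : end]), end


# With eight daily records there are three complete five-day windows.
windows = list(sliding_windows(list(range(8))))
```

Each yielded window would serve as evidence for the five time slices, and the prediction for the target day would then be compared against its actual label.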

We compare the prediction accuracy of our proposed DBN models against five baseline models. ARIMA and SVR are well-known and frequently used models for accurately predicting time series in various nonlinear systems, including economic and financial systems [69, 70]. Moreover, as demonstrated in the survey by [74], LSTM, RF and SVM are among the most widely used methods for price prediction. We thus employed parameter optimisation methods, including GridSearchCV [71] and grid search logic [72], to fine-tune the parameters of these models and obtain their best predictions. Moreover, since the outputs of certain models, such as ARIMA, are continuous values, we labelled their price predictions as 'Up' or 'Down' based on Equation 1.

To evaluate the performance of the models, we compare the predictions generated by the DBN, ARIMA, SVR, LSTM, RF, and SVM models to the actual price directions using the precision metric. Precision is a measure of exactness, computed as the ratio of true positives to all positive predictions (true and false positives) [90]. True positives are instances correctly identified as positive by the model, whereas false positives are instances that the model identifies as positive but that are actually negative. Precision is chosen as the evaluation metric to ensure the models accurately identify market direction, reducing the risk of incorrect price movement predictions and potential investment losses. Mathematically, the precision metric is defined as follows.

\[
\text{Precision} = \frac{TP}{TP + FP}
\]

where TP is the number of true positives, and FP is the number of false positives.
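As a worked illustration of the definition above, the following sketch computes precision for a pair of label sequences. Treating 'Up' as the positive class and returning 0.0 when no positive prediction is made are assumptions of this sketch; the text does not state which direction is treated as positive.

```python
from typing import Sequence


def precision(actual: Sequence[str], predicted: Sequence[str], positive: str = "Up") -> float:
    """Return TP / (TP + FP) for the chosen positive class."""
    tp = sum(1 for a, p in zip(actual, predicted) if p == positive and a == positive)
    fp = sum(1 for a, p in zip(actual, predicted) if p == positive and a != positive)
    if tp + fp == 0:
        return 0.0  # Assumption: no positive predictions yields 0.0.
    return tp / (tp + fp)


actual = ["Up", "Down", "Up", "Up", "Down"]
predicted = ["Up", "Up", "Down", "Up", "Up"]
score = precision(actual, predicted)
# Four 'Up' predictions, two of them correct: TP = 2, FP = 2, precision = 0.5.
```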

To ensure a consistent and standardised analysis of the numerical findings, we use a single metric, precision, although some publications employ multiple metrics. Precision measures the number of true positives, that is, positive records correctly predicted by the model, relative to the total number of positive predictions [73].

5.1 Data

In this research, six cryptocurrencies are examined. Bitcoin was chosen as it is the most popular cryptocurrency, and the altcoins chosen are ones that consistently ranked among the top ten cryptocurrencies by market capitalisation for several years. Collectively, the market capitalisation of the five studied altcoins represented over 30% of the cryptocurrency market at the beginning of 2023. Additionally, several years of data are available for these coins, which allows for a robust statistical analysis of the unpaired correlations between these cryptocurrencies and traditional financial assets. Specifically, we select coins with 1,100 daily data records, equivalent to around four years' worth of data.

To obtain the daily price data for both cryptocurrencies and traditional financial assets, we use the yfinance Python package, which connects to the Yahoo Finance website (https://finance.yahoo.com/). Additionally, we extract the daily tweet count associated with each cryptocurrency from www.bitinfocharts.com (both websites were accessed in May 2023). The analysis of the impact of social media data on Tether is excluded because this coin's daily tweet data are unavailable. Descriptive statistics for the closing price data of cryptocurrencies, traditional financial assets, and tweet numbers are provided in Appendix A.

Figure 4 shows the close prices of cryptocurrencies and traditional financial assets from January 2018 to October 2022. We see that all traditional financial assets experienced growth during this period. Among the cryptocurrencies, the close price plots of Binance Coin and Ethereum show a relatively steady trend before 2021, while Litecoin and Ripple experienced a considerably unstable period during 2018, followed by a period of stability until 2021. In 2021, all coins experienced high variation in their close prices, except Tether, as it serves as a stablecoin. For instance, Ethereum's close price variances were 44931.208 and 920876.95 before 2021 and after 2021, respectively. As a result, a high level of prediction error can be expected due to the increasing variance in the data.

Figure 4: The price fluctuation of traditional financial assets and cryptocurrencies in this study.

Figure 5 shows the daily tweet numbers for all the coins. Despite several spikes, the trends of tweet numbers are relatively consistent compared to the price data in Figure 4, where we observe a significant jump in price during 2021 compared to other years.

Figure 5: The number of daily tweets associated with Bitcoin, Binance Coin, Ethereum, Litecoin, and Ripple between January 2018 and October 2022.

6 Results and discussions

To assess our models' efficacy, we compare their predictions against those of the baseline models, namely ARIMA, SVR, LSTM, RF, and SVM. We also investigate the impact of feature engineering on our DBNs' performance. The experiments are conducted on a Windows 10 Enterprise operating system, running on an Intel Core(TM) i7 CPU @ 1.90 GHz (2.11 GHz) with 16.0 GB of RAM. To implement the baseline models, we utilise Anaconda and Python libraries, including Sklearn, Statsmodels, and Keras. These libraries provide essential tools and functionalities for building, training, and evaluating various machine learning and statistical models.

Table 3 summarises the performance of the best-performing DBN model for each coin, selected among the DBNs built on the different feature groups, as well as the performance of the five baseline models. These results pertain to predicting the direction of the next day's closing price of a cryptocurrency. It should be noted that the baseline models use only the close price time series of the cryptocurrencies. Additionally, the 'Average (sd)' row presents the average precision of each model across all cryptocurrencies, with the standard deviation in parentheses.

Table 3: Model performance comparison - A comparative analysis of daily price direction predictions using DBN, ARIMA, SVR, LSTM, RF, and SVM based on precision percentage. Precision is the measure of accuracy, calculated as the ratio of true positives to all positive predictions (true and false positives).
| Cryptocurrency | DBN | ARIMA | SVR | LSTM | RF | SVM |
| --- | --- | --- | --- | --- | --- | --- |
| Bitcoin | 72.19 | 48.90 | 49.61 | 44.96 | 49.02 | 53.00 |
| Binance Coin | 73.63 | 61.81 | 57.92 | 43.01 | 51.54 | 47.21 |
| Ethereum | 73.86 | 63.27 | 59.54 | 44.00 | 47.83 | 48.73 |
| Litecoin | 70.56 | 69.35 | 51.51 | 44.33 | 52.35 | 45.31 |
| Ripple | 77.00 | 48.64 | 46.35 | 47.00 | 54.75 | 51.31 |
| Tether | 57.57 | 39.81 | 41.54 | 49.53 | 53.31 | 52.69 |
| Average (sd) | 70.81 (6.45) | 55.30 (10.52) | 51.08 (5.77) | 45.47 (2.33) | 51.47 (2.95) | 49.71 (3.84) |

The best-performing model for each cryptocurrency is the DBN, outperforming all baselines despite their extensive use in the literature in predicting time series data. Considering the baseline models, ARIMA and RF exhibit generally superior precision compared to the other models, with average scores of 55.30 and 51.47, respectively. Conversely, LSTM demonstrates low performance in predicting the daily price direction of cryptocurrencies.
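The per-coin margin of the DBN over the strongest baseline can be read off Table 3 directly; the short sketch below does this arithmetic. The precision values are copied from Table 3, while the dictionary layout and the helper name are illustrative choices rather than part of the study's code.

```python
from typing import Dict, Tuple

# Precision (%) per coin and model, copied from Table 3.
TABLE_3: Dict[str, Dict[str, float]] = {
    "Bitcoin": {"DBN": 72.19, "ARIMA": 48.90, "SVR": 49.61, "LSTM": 44.96, "RF": 49.02, "SVM": 53.00},
    "Binance Coin": {"DBN": 73.63, "ARIMA": 61.81, "SVR": 57.92, "LSTM": 43.01, "RF": 51.54, "SVM": 47.21},
    "Ethereum": {"DBN": 73.86, "ARIMA": 63.27, "SVR": 59.54, "LSTM": 44.00, "RF": 47.83, "SVM": 48.73},
    "Litecoin": {"DBN": 70.56, "ARIMA": 69.35, "SVR": 51.51, "LSTM": 44.33, "RF": 52.35, "SVM": 45.31},
    "Ripple": {"DBN": 77.00, "ARIMA": 48.64, "SVR": 46.35, "LSTM": 47.00, "RF": 54.75, "SVM": 51.31},
    "Tether": {"DBN": 57.57, "ARIMA": 39.81, "SVR": 41.54, "LSTM": 49.53, "RF": 53.31, "SVM": 52.69},
}


def margin_over_best_baseline(row: Dict[str, float]) -> Tuple[str, float]:
    """Return the strongest baseline and the DBN's lead over it in percentage points."""
    name, value = max(
        ((model, score) for model, score in row.items() if model != "DBN"),
        key=lambda item: item[1],
    )
    return name, round(row["DBN"] - value, 2)


margins = {coin: margin_over_best_baseline(row) for coin, row in TABLE_3.items()}
```

Under these values the lead ranges from 1.21 percentage points for Litecoin (against ARIMA) to 22.25 for Ripple (against RF), so the DBN's advantage is positive for every coin but varies considerably in size.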

Among all the coins, Ripple attains the highest precision, with a maximum of 77% achieved by the DBN. This observation is in line with the close price plot of Ripple in Figure 4, where the coin exhibits relatively low volatility across the period considered. On the other hand, predicting the price direction of Tether is more challenging, with a maximum precision of 57.57%. Comparing Bitcoin's precision to that of the altcoins for the DBN model, certain altcoins (Binance Coin, Ethereum, and Ripple) outperform Bitcoin. This could suggest that DBNs capture the behaviour of these altcoins more effectively than Bitcoin's. Figure 6 shows the best-performing DBNs based on the highest precision in Table 3.

Figure 6: Precision of the proposed DBNs, ARIMA, SVR, LSTM, RF, and SVM for predicting price movement directions of cryptocurrencies.

6.1 Evaluating the influence of feature combinations on the performance of DBNs

We also investigate the impact of different combinations of features on the performance of DBNs. Table 4 summarises the effect of feature engineering on the performance of the DBNs. Columns two to five correspond to the different feature sets used for building the DBNs. Each column represents a specific group of features, denoted by an abbreviation, defined in Table 2.

Table 4: The performance of the DBN models - Overall comparison of daily market direction prediction using DBN models based on different groups of features considering the precision metric. OHLCV stands for Open, High, Low, Close, Volume; EF refers to external factors; TI denotes technical indicators. Each group of features is assigned a distinct group number.
| Cryptocurrency | OHLCV | OHLCV-EF | OHLCV-TI | OHLCV-EF-TI | Best DBN |
| --- | --- | --- | --- | --- | --- |
| Bitcoin | 71.67 | 68.38 | 72.19 | 70.23 | DBN (OHLCV-TI) |
| Binance Coin | 59.27 | 71.02 | 73.63 | 69.98 | DBN (OHLCV-TI) |
| Ethereum | 73.86 | 68.02 | 71.83 | 71.83 | DBN (OHLCV) |
| Litecoin | 64.21 | 70.56 | 70.56 | 70.56 | DBN (OHLCV-EF) |
| Ripple | 72.87 | 66.67 | 77.00 | 72.87 | DBN (OHLCV-TI) |
| Tether | 57.57 | 57.57 | 55.09 | 50.12 | DBN (OHLCV) |
| Average (sd) | 66.57 (6.57) | 67.04 (4.49) | 70.05 (6.99) | 67.6 (7.88) | 70.80 (6.23) |

Table 4 shows that no single DBN performs best for all the coins considered in the study. This is consistent with other studies, such as that of Valencia et al. [5], which indicate that different cryptocurrencies have varying specifications. It also highlights the significance of feature engineering, which allows specific feature groups to be selected and optimised for each cryptocurrency, resulting in enhanced performance. The best performance for Ethereum and Tether is obtained using only OHLCV data, whereas the DBN (OHLCV-EF-TI) with the full set of features is never selected as the best model for any coin. In particular, for Binance Coin, Bitcoin and Ripple, the DBN (OHLCV-EF-TI) is outperformed by the DBN (OHLCV-TI). Thus, the set of features must be identified independently for each coin to achieve the best results. In addition, to compare the different versions of DBNs, the last row of Table 4 shows the average precision of each model across all cryptocurrencies. Among the four settings, the DBN with OHLCV and technical indicators has the highest average precision. This combination is also selected three times as the best-performing model, for Binance Coin, Bitcoin and Ripple, which underscores the potential importance of these features in enhancing model performance.
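The per-coin selection summarised in the 'Best DBN' column of Table 4 can be reproduced with a few lines of code. The precision values are copied from Table 4. The tie-breaking rule, under which the earlier and smaller feature set wins when precision is equal, is an assumption; it happens to agree with the choices reported for Litecoin and Tether.

```python
from typing import Dict, Tuple

FEATURE_SETS = ("OHLCV", "OHLCV-EF", "OHLCV-TI", "OHLCV-EF-TI")

# Precision (%) per coin, in the order of FEATURE_SETS, copied from Table 4.
TABLE_4: Dict[str, Tuple[float, float, float, float]] = {
    "Bitcoin": (71.67, 68.38, 72.19, 70.23),
    "Binance Coin": (59.27, 71.02, 73.63, 69.98),
    "Ethereum": (73.86, 68.02, 71.83, 71.83),
    "Litecoin": (64.21, 70.56, 70.56, 70.56),
    "Ripple": (72.87, 66.67, 77.00, 72.87),
    "Tether": (57.57, 57.57, 55.09, 50.12),
}


def best_feature_set(scores: Tuple[float, ...]) -> str:
    """Return the first feature set attaining the maximum precision.

    Assumption: on ties, the earlier (smaller) feature set is preferred.
    """
    return FEATURE_SETS[scores.index(max(scores))]


best = {coin: best_feature_set(scores) for coin, scores in TABLE_4.items()}
times_selected = {name: list(best.values()).count(name) for name in FEATURE_SETS}
```

Counting the selections this way gives three coins for OHLCV-TI, two for OHLCV, one for OHLCV-EF and none for the full feature set.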

We also see that, averaged across all cryptocurrencies, the combination of OHLCV and technical indicators yields the best results (average precision 70.05%). Conversely, DBNs built solely on OHLCV achieve the lowest average performance (average precision 66.57%). However, it should be noted that adding more features to the DBN models does not always result in further improvements and sometimes decreases the performance. For example, the DBN model constructed with all features (OHLCV-EF-TI) for Binance Coin performs worse (69.98%) than the DBN model that only contains OHLCV and technical indicators (OHLCV-TI), which reaches 73.63% precision. Hence, including external factors alongside other groups of features may negatively impact the precision of the DBN model for Binance Coin.

The influence of technical indicators on DBN performance varies among cryptocurrencies. For Binance Coin, Litecoin, and Ripple, introducing technical indicators to the basic DBN model (OHLCV) substantially improves the model's performance, while for Ethereum and Tether, it leads to a slight decrease in precision. In particular, using only Group No. 1 is sufficient for predicting Ethereum and Tether, as this group has the highest precision for these coins. For Bitcoin, introducing technical indicators slightly improves the performance. It is worth noting that the DBN (OHLCV-TI) consistently outperforms or matches the DBN incorporating all features.

The impact of external factors on the prediction performance of DBN models is mixed. Introducing external factors (tweet numbers and traditional financial assets) to the DBN model constructed from OHLCV improves the performance for Binance Coin and Litecoin, but it reduces the precision of the model outputs for Bitcoin, Ethereum and Ripple. In the case of Tether, the performance of DBN (OHLCV) and DBN (OHLCV-EF) is identical. Considering its DBN structure, these external factors do not influence the close price as they are isolated in the DBN model, and no arc connects them to OHLCV. Additionally, it should be noted that adding external factors to OHLCV and technical indicators to form the full feature set does not consistently improve model precision. For example, in the case of Ripple, the DBN model constructed with all features (OHLCV-EF-TI) has lower performance than the DBN model constructed by combining OHLCV and technical indicators (OHLCV-TI). This indicates that not all external factors have a significant impact on the prediction accuracy of the model. Hence, selecting the most relevant features is crucial for improving the precision of the DBN models. The results are visualised in Figure 7. One important observation is that Tether shows the lowest precision among all the cryptocurrencies. Moreover, the highest precision is observed for the DBN (OHLCV-TI) model applied to Ripple, while the lowest precision is recorded for the DBN (OHLCV-EF-TI) model in the case of Tether.

Figure 7: Precision of the constructed DBNs for each feature group and cryptocurrency.

6.2 Analysing causal structures

Figure 8: The DBNs with the highest precision for each cryptocurrency, considering a temporal plate of five time steps. Panels: (a) Bitcoin DBN (OHLCV-TI); (b) Binance Coin DBN (OHLCV-TI); (c) Ethereum DBN (OHLCV); (d) Litecoin DBN (OHLCV-EF); (e) Ripple DBN (OHLCV-TI); (f) Tether DBN (OHLCV). The label in parentheses indicates the best-performing set of feature categories for that coin, as reported in Table 4.

Analysing the structures of the DBNs in Figure 8 offers essential insights, enabling investors to improve predictions by identifying key factors affecting cryptocurrency prices and demonstrating the advantage of DBNs in explainability over deep learning methods. The figure displays the best-performing DBN associated with each cryptocurrency in Table 4 (the DBNs are chosen using the precision metric). Bitcoin, Binance Coin and Ripple achieve their best performance with DBN (OHLCV-TI) (Sub-figures 8(a), 8(b) and 8(e)), while Ethereum and Tether perform best with DBN (OHLCV) (Sub-figures 8(c) and 8(f)). Litecoin, on the other hand, stands out as the only coin with its best performance in DBN (OHLCV-EF) (Sub-figure 8(d)). Notably, all nodes within each DBN are interconnected, without any isolated nodes. This highlights the interdependencies among these nodes and their mutual influence on each other for the selected combination of feature categories. Moreover, the structure of DBNs varies across different cryptocurrencies, even within the same group of features, underscoring the unique characteristics of each coin. For instance, Sub-figures 8(c) and 8(f) show that the causal relationships between the components of OHLCV features for Ethereum and Tether exhibit different dynamics. This distinction further emphasises the individual nature of each cryptocurrency and the specific factors that influence its price movements.

In Sub-figure 8(b), the DBN representing Binance Coin, the root node OBV exhibits the highest influence, evidenced by its numerous descendants. Additionally, the RSI and BBand mid nodes display significant interactivity, possessing the highest number of edges connecting them to other nodes. In contrast, Sub-figure 8(c), which shows the best-performing DBN for Ethereum, reveals the Volume node as the root node with the most descendants. Among the other nodes, High exhibits the highest level of interactivity. Although they use the same set of feature categories, the best-performing DBNs for Ethereum (Sub-figure 8(c)) and Tether (Sub-figure 8(f)) exhibit distinctly different structures in terms of node interactions. This emphasises the unique relationships between the nodes within each DBN and underscores the influence of specific factors on the price dynamics of Ethereum and Tether. The same argument is valid for Bitcoin (Sub-figure 8(a)), Binance Coin (Sub-figure 8(b)) and Ripple (Sub-figure 8(e)), as they share the same category of features but show completely different dynamics among their respective nodes.
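The two structural measures used above, the number of descendants of a node and the number of edges attached to a node, can be computed for any learned network once its arcs are listed. The sketch below shows one way to do this. The small graph is a made-up example for illustration and does not reproduce any structure from Figure 8; the function names are likewise hypothetical.

```python
from typing import Dict, List, Set

# Hypothetical intra-slice arcs (parent -> children); not taken from Figure 8.
TOY_GRAPH: Dict[str, List[str]] = {
    "Volume": ["High", "Open"],
    "Open": ["High"],
    "High": ["Low", "Close"],
    "Low": ["Close"],
    "Close": [],
}


def descendants(graph: Dict[str, List[str]], node: str) -> Set[str]:
    """Return every node reachable from `node` by following arcs."""
    seen: Set[str] = set()
    stack = list(graph.get(node, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(graph.get(current, ()))
    return seen


def degree(graph: Dict[str, List[str]], node: str) -> int:
    """Return the number of arcs entering or leaving `node`."""
    outgoing = len(graph.get(node, ()))
    incoming = sum(node in children for children in graph.values())
    return incoming + outgoing


root_reach = descendants(TOY_GRAPH, "Volume")
busiest = max(TOY_GRAPH, key=lambda name: degree(TOY_GRAPH, name))
```

In this toy graph, Volume reaches all four other nodes and High has the most attached arcs, which mirrors the kind of statement made about root nodes and interactivity in the text.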

7 Conclusions and future directions

This study investigates the efficacy of DBNs in predicting the future price movement directions of Bitcoin and five popular altcoins. The study demonstrates that DBNs outperform the five baseline models, ARIMA, SVR, LSTM, RF, and SVM, for all cryptocurrencies considered. On average, DBNs exhibit an improvement of at least 15 percentage points in precision over every baseline model. Moreover, the performance of DBNs varies across cryptocurrencies due to the varying nature of their market dynamics.

A second aim of this study is to investigate the impact of feature selection on model performance by incorporating different combinations of four groups comprising twenty-three features. The results reveal that the combination of basic price information and technical indicators produces the most accurate predictions on average, whereas basic price information alone achieves the lowest average prediction score among the feature groups. Moreover, we explore whether increasing the number of features fed into the DBN leads to a monotonic improvement in precision, finding that increasing the number of features does not always enhance the performance of our DBN models. This highlights the importance of selecting and integrating pertinent features to optimise AI model performance in cryptocurrency price prediction.

The findings of this study could support a reliable decision-support system that enhances the accuracy of investment decisions and optimises trading strategies. One potential research direction is to combine expert elicitation with data-driven learning when constructing DBNs. Incorporating expert opinions alongside data-driven approaches may reduce both the inherent subjectivity of expert opinion and the dependence on large-scale data, potentially leading to more accurate predictions. While we focus on external factors influencing the cryptocurrency market, incorporating internal factors, such as blockchain information, has great potential to increase the accuracy of predictions.

One factor that could significantly impact an AI prediction model is the frequency of the timeframes used for features. In this study, we utilise daily price data for predictions, but employing higher-frequency data, such as four-hour or one-hour intervals, could yield different precision results for DBNs. Further research could analyse diverse data frequencies across various market conditions to evaluate their influence on DBN performance in price prediction. Investigating the effects of different timeframe choices in bull and bear markets is another research direction, as cryptocurrencies demonstrate distinct behaviours during these market conditions.

Conflict of interest

The authors declare that they have no conflict of interest.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Appendix A Descriptive analysis of data frames

Descriptive statistics of close price data of cryptocurrencies and traditional financial assets are presented in Table 5.

Table 5: Summary of close prices - Summary statistics for the dataset of combined daily close prices of cryptocurrencies and traditional financial assets. The data period is between January 2018 and October 2022. The last column ‘Obs.’ specifies the total number of observations available at the time of extracting data.
| Asset | Mean | Std. Dev. | Min. | Median | Max. | Obs. |
| --- | --- | --- | --- | --- | --- | --- |
| Bitcoin | 20705.53 | 17356.86 | 3242.48 | 10796.95 | 67566.83 | 1189 |
| Binance Coin | 151.56 | 187.21 | 4.53 | 27.22 | 675.69 | 1171 |
| Ethereum | 1150.13 | 1225.25 | 884.31 | 474.21 | 4812.09 | 1206 |
| Litecoin | 101.55 | 64.61 | 75.172 | 23.47 | 377.39 | 1206 |
| Ripple | 0.52 | 0.33 | 0.14 | 0.38 | 2.456 | 1184 |
| Tether | 1.01 | 0.01 | 0.97 | 1.02 | 1.09 | 1232 |
| Gold | 16.79 | 4.5 | 8.8 | 16.09 | 28.17 | 1464 |
| MSCI | 2360.99 | 410.09 | 1596 | 2197.15 | 3242.3 | 1464 |
| S&P 500 | 3265.73 | 718.52 | 2237.4 | 2985.61 | 4796.56 | 1464 |
| USDX | 96.36 | 4.61 | 88.59 | 96.13 | 114.11 | 1464 |
| WTI | 61.91 | 19.02 | -37.63 | 58.82 | 123.7 | 1464 |

As Table 5 shows, Bitcoin exhibits the highest mean close price among the cryptocurrencies, indicating its higher value relative to the others. Meanwhile, Binance Coin has the largest standard deviation relative to its mean (187.21 against a mean of 151.56), suggesting that its price fluctuates more widely over time and indicating greater price volatility.

Table 6 provides summary statistics for the daily tweet numbers associated with each cryptocurrency in this study between January 2018 and October 2022. As previously mentioned, daily tweet data associated with Tether is not available on www.bitinfocharts.com, and hence Tether is not presented in Table 6.

Table 6: Data summary of tweet number - Summary statistics for daily tweet number associated with the cryptocurrencies in this study between January 2018 and October 2022.
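| Cryptocurrency | Mean | Std. Dev. | Min. | Median | Max. |
| --- | --- | --- | --- | --- | --- |
| Bitcoin | 55551.35 | 48305.20 | 445 | 31428.50 | 363566 |
| Binance Coin | 120.05 | 215.83 | 1 | 55 | 1601 |
| Ethereum | 17912.37 | 15881.35 | 2418 | 12058 | 138220 |
| Litecoin | 1544.24 | 1301.59 | 360 | 1172 | 13778 |
| Ripple | 17233.21 | 39085.54 | 2362 | 7498 | 735252 |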

Appendix B A typical scenario for DBNs

In this appendix, we conduct additional analysis to gain a deeper understanding of the behaviour of DBNs. The networks for Ethereum and Tether illustrated in Sub-figures 8(c) and 8(f), respectively, have the smallest number of nodes compared to the other networks. This characteristic makes them suitable for further visual examination of the interactions between their nodes. Therefore, we showcase a typical scenario of the DBNs for Ethereum and Tether, where we fix the states of the nodes, excluding the close node, and observe the resulting changes in probabilities for the target node (close price). Table 7, Sub-figures 9(a) and 9(b) illustrate the changes in the close price.

Table 7: Typical scenario - Five-day movement directions for Ethereum and Tether considering open, high, and low prices along with changes in the respective volume. The “Prob.” columns display the probabilities of the closing price direction being up or down on the fifth day, derived from the observed scenarios.
| Cryptocurrency | Feature | Five-day movement scenario | Prob. of Up Close | Prob. of Down Close |
| --- | --- | --- | --- | --- |
| Ethereum | Open | Up, Up, Down, Down, Down | 0.16241266 | 0.83758734 |
| Ethereum | High | Down, Up, Down, Down, Down | | |
| Ethereum | Low | Up, Down, Down, Down, Down | | |
| Ethereum | Volume | Down, Up, Down, Up, Up | | |
| Tether | Open | Up, Up, Up, Down, Down | 0.49966089 | 0.50033911 |
| Tether | High | Down, Up, Up, Down, Up | | |
| Tether | Low | Up, Down, Up, Down, Down | | |
| Tether | Volume | Up, Up, Down, Down, Up | | |

As can be seen in Figure 9, for both cryptocurrencies the probability of the close node being in the down state, shown in purple, exceeds that of the up state. Therefore, the DBNs predict a downward movement for both coins. Notably, the probability of the close state being down is considerably higher for Ethereum (approximately 0.84) than for Tether (approximately 0.50), for which the two outcomes are almost equally likely.
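
To make the mechanics of such a scenario concrete, the following minimal sketch propagates evidence through a deliberately simplified two-node-per-slice network. It assumes that the close node at day t depends only on the observed open node at day t and on the close node at day t-1, and that open nodes have no parents. The conditional probability table `P_UP`, the uniform prior, and the function name `close_posterior` are hypothetical illustrations. They are not the networks or parameters learned in this study, so the resulting numbers do not reproduce Table 7.

```python
# Hypothetical CPT: P(Close_t = "Up" | Open_t, Close_{t-1}).
# Illustrative numbers only, not learned from the study's data.
P_UP = {
    ("Up", "Up"): 0.7,
    ("Up", "Down"): 0.5,
    ("Down", "Up"): 0.4,
    ("Down", "Down"): 0.2,
}


def close_posterior(open_scenario, prior_up=0.5):
    """Forward-propagate the belief over Close with Open observed each day.

    Because every Open node is observed and has no parents in this toy
    network, summing out Close_{t-1} slice by slice gives the exact
    posterior over Close on the final day.
    """
    belief_up = prior_up
    for open_state in open_scenario:
        belief_up = (
            belief_up * P_UP[(open_state, "Up")]
            + (1.0 - belief_up) * P_UP[(open_state, "Down")]
        )
    return {"Up": belief_up, "Down": 1.0 - belief_up}


# Five-day Open scenario taken from the Ethereum row of Table 7.
scenario = ["Up", "Up", "Down", "Down", "Down"]
posterior = close_posterior(scenario)
print(posterior)
```

In a full DBN with high, low, and volume nodes, the same idea applies, but the summation runs over all unobserved nodes and is normally delegated to an inference engine.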

Figure 9: Typical scenario analysis for predicting price directions in the best-performing DBNs for Ethereum and Tether. Panels: (a) Ethereum DBN (OHLCV); (b) Tether DBN (OHLCV).
