Abstract
Cryptocurrencies are distributed digital currencies that have emerged as a consequence of advances in financial technology. In 2017, cryptocurrencies showed a huge rise in market capitalization and popularity. They are now part of today’s financial systems, as individual investors, corporate firms, and large institutions invest heavily in them. However, this market is less stable than traditional currency markets. It can be affected by several legal, sentimental, and technical factors, which makes it highly volatile, dynamic, uncertain, and hard to predict; hence, accurate forecasting is essential. Recently, cryptocurrency price prediction has become a trending research topic globally. Various machine and deep learning algorithms, e.g., Neural Networks (NN), Gated Recurrent Unit (GRU), Long Short-Term Memory (LSTM), and Bidirectional LSTM (BiLSTM), have been used to analyze the factors influencing cryptocurrency prices and to predict them accordingly. This paper suggests a five-phase framework for cryptocurrency price prediction based on two state-of-the-art deep learning architectures (i.e., BiLSTM and GRU). The current study uses three public real-time cryptocurrency datasets from “Yahoo Finance”. The BiLSTM and GRU algorithms are used to forecast the prices of three popular cryptocurrencies (i.e., Bitcoin, Ethereum, and Cardano). The Grid Search approach is used for hyperparameter optimization. Results indicate that GRU outperformed the BiLSTM algorithm for Bitcoin, Ethereum, and Cardano. The lowest RMSE for the GRU model was found to be 0.01711, 0.02662, and 0.00852 for Bitcoin, Ethereum, and Cardano, respectively. The experimental results show that the proposed framework achieves low MSE and RMSE values.
1 Introduction
Traditionally, economic systems have depended on third-party financial organizations (e.g., banks). These organizations act as intermediaries between the parties and have complete control over transactions and the process of exchanging funds. These traditional systems allow limited money transactions and lack security, trust, flexibility, and transparency [1]. To address these issues, a system that can minimize the role of financial intermediaries is needed. Satoshi Nakamoto published a paper [2] in 2008 describing the notion of peer-to-peer (P2P) electronic payment transmission without the involvement of any intermediary financial organizations. In financial systems, a cryptocurrency is a decentralized virtual currency [3]. Cryptography is used to secure this currency, making it practically impossible to double-spend or counterfeit [4]. Cryptocurrencies are not issued by a central authority (e.g., a central bank). They are created using blockchain technology [5]. This technology is complex and seeks to store data in a way that makes hacking, altering, or defrauding the system impossible or difficult. The blockchain is composed of two essential modules, namely, the transaction and the block. The transaction defines the action initiated by a participant, while the block records the transaction and additional details (e.g., creation timestamp and correct sequence) [6].
For more than 2 years, Bitcoin was the first and only blockchain-based cryptocurrency [7]. Since the first cryptocurrency suggested by Satoshi Nakamoto, the market has grown to more than 5200 traded cryptocurrencies, such as Bitcoin, Ethereum, Cardano, Ripple, Monero, Stellar, Litecoin, and Dash [8]. Because they combine monetary units with encryption technology, cryptocurrencies have recently gained a lot of attention in the domains of cryptography, computer science, and economics [7]. Generally, cryptocurrencies are grouped into three major categories (i.e., currency, platform, and application). Currency cryptocurrencies serve as an exchange medium (i.e., a payment method). Platform cryptocurrencies enable the development of a wide range of blockchain-based apps. Finally, application cryptocurrencies are used in specific sectors [1].
Since the inception of cryptocurrencies, there has been a significant increase in usage, particularly in the previous 5 years. According to Blockchain.com, the number of blockchain wallet users increased from 2015 to 2021, as illustrated in Fig. 1 [9, 10]. Bitcoin went from having no value in 2008 to reaching a then-record price of $20,089.00 in 2017. Since then, the price of Bitcoin has not fallen below $3000. In mid-April 2021, Bitcoin prices reached all-time highs of more than $60,000 as Coinbase (i.e., a cryptocurrency exchange) went public [11]. On Nov. 10, 2021, Bitcoin again reached an all-time high of $68,789 before closing at $64,995 [12]. On June 13, 2022, cryptocurrency prices dropped, and the Bitcoin price fell below $23,000 for the first time since December 2020 [13]. The total market capitalization of all cryptocurrencies was approximately 19 billion USD as of February 2017. The top 15 currencies amounted to more than 97% of the market, while seven accounted for 90% of the market capitalization [14].
As mentioned before, cryptocurrencies have gained broad market acceptance and developed rapidly. Many financial institutions have included cryptocurrency-related assets in their trading strategies. Cryptocurrency trading is the act of selling and buying cryptocurrencies to make a profit. Kyriazis [15] surveyed the predictability of cryptocurrency pricing. According to that survey, the efficient market hypothesis is rejected and speculation is possible through trading. Additionally, Fang et al. [16] presented a comprehensive survey of cryptocurrency trading research, covering various aspects (e.g., cryptocurrency trading systems, prediction of volatility and return, and technical trading). Makarova and Schoar [17] studied arbitrage and price formation in the cryptocurrency market. Chava et al. [18] used Google’s Search Volume Index as a proxy for retail investor attention. They found that celebrity endorsements of crypto tokens, initial coin offerings (ICOs), non-fungible tokens (NFTs), and crypto platforms providing high yields to investors attract greater attention from areas with higher lottery sales per capita.
The Bitcoin price has attracted the interest of scholars all over the world. Unfortunately, cryptocurrency prices are volatile and dynamic. They are determined by several elements (for example, popularity, mining difficulty, transaction cost, market trends, price of alternate coins, sentiments, stock markets, and some legal issues) [19]. Additionally, small cryptocurrencies with a small market share can become a source of shocks that benefit or harm other cryptocurrencies. These factors make cryptocurrency prices unstable, rapidly changing, and difficult to predict. As a result, scams, suspected hacks, or other hidden problems can lead to dramatic drops in cryptocurrency prices [20]. For example, on June 26th, 2019, more than 10% of the Bitcoin price was lost in a few minutes because of a crash of the Coinbase digital exchange. Consequently, price prediction has become a critical task for researchers [21]. Thousands of coins are in use around the world. The focus of this study is on three of the most popular cryptocurrencies (i.e., Bitcoin, Ethereum, and Cardano). Bitcoin is leading at $960.79 billion, followed by Ethereum at $189.98 billion, Binance Coin at $39.91 billion, Tether at $35.96 billion, and Cardano at $33.36 billion in market capitalization [22]. It is worth mentioning that these values are regularly updated as the market changes.
Different machine and deep learning architectures have been proposed to perform the task of cryptocurrency price prediction (e.g., Support Vector Machines (SVM) [23], Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) [24], and Deep Neural Networks (DNN) [25]). This study proposes a deep learning-based system to predict cryptocurrency values, employing two RNN algorithms, specifically, Bidirectional Long Short-Term Memory (BiLSTM) and Gated Recurrent Unit (GRU). Three datasets of past cryptocurrency prices, namely, Bitcoin USD (BTC-USD) [26], Ethereum USD (ETH-USD) [27], and Cardano USD (ADA-USD) [28], are used to train the algorithms to anticipate cryptocurrency values. Figure 2 shows the flow of a cryptocurrency price prediction system.
The cryptocurrency price is suffering from instability and dynamism; hence, prediction is a crucial task. In the current study, deep learning-based algorithms (i.e., Bidirectional LSTM and GRU), are used to predict the prices of three of the most used cryptocurrencies (i.e., Bitcoin, Ethereum, and Cardano). This method seeks to uncover hidden patterns in data, integrate them, and generate more accurate forecasts. The current study’s contributions can be summarised as follows:
- Utilizing deep learning-based algorithms (i.e., BiLSTM and GRU) to predict the three cryptocurrencies’ prices (i.e., Bitcoin, Ethereum, and Cardano).
- Utilizing the Grid Search approach for the hyperparameters optimization processes.
- Evaluating the performance of the proposed models utilizing evaluation metrics such as MSE, RMSE, MAE, and MAPE.
The rest of this paper is structured as follows: Sect. 2 introduces the related work in this field. Section 3 discusses the methodology, including the data acquisition phase, data pre-processing phase, classification and optimization phase, and performance evaluation phase. Section 4 explores the experimental results. Section 5 presents the limitations of the current study. Section 6 concludes the paper and presents the future work.
2 Related studies
The prediction of the cryptocurrency price is a time series problem [29]. A time series is a sequence of variable measurements made over time. Usually, these measurements are made at equally spaced times. Time series problems differ from other problems for the following reasons. First, time series problems are time-dependent. Hence, the basic assumption of a linear regression model, that the observations are independent, does not hold in this case. Second, most of these problems show some form of seasonality (i.e., variations specific to a particular time frame) along with an increasing or decreasing trend [30]. Time series prediction is used to predict future values based on previously observed values over time. Some methods include Autoregressive (AR) [31], Autoregressive Integrated Moving Average (ARIMA) [32], Seasonal Autoregressive Integrated Moving Average (SARIMA) [33], Exponential Smoothing (ES) [34], DeepAR [35], and N-BEATS [36]. Machine learning (ML) models used to solve time series problems include K-Nearest Neighbor (KNN), Classification and Regression Trees (CART), Decision Trees, Support Vector Regression (SVR), and the boosting ensembles XGBoost and AdaBoost, while deep learning (DL) algorithms include Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), and Convolutional Neural Network (CNN). The most significant advantage of ML-based approaches is that they are computationally inexpensive for implementing online models. DL-based models are reported to offer advantages over other prediction models, as they can produce results very close to the actual values and improve prediction accuracy [37].
Deep learning (DL) is a branch of artificial intelligence that learns from previous data to predict the future. DNN, RNN, LSTM, and Deep Belief Networks are examples of DL architectures.
Derbentseva et al. [38] presented two machine learning (ML) techniques for time-series data forecasting: random forests (RF) and stochastic gradient boosting machine (SGBM). They forecasted the prices of three of the most valuable cryptocurrencies: Bitcoin, Ethereum, and Ripple. Their reported results show that the ML ensemble technique can be used to forecast cryptocurrency values. Hamayel and Owda [39] proposed three types of recurrent neural network techniques to predict the prices of three cryptocurrencies (i.e., Bitcoin, Litecoin, and Ethereum). Their proposed methods produce accurate forecasts based on the mean absolute percentage error. According to the results, the GRU model outperformed the LSTM and bidirectional LSTM (BiLSTM) models in prediction for the three cryptocurrencies. GRU achieved MAPE values of 0.2454%, 0.2116%, and 0.8267% for Bitcoin, Litecoin, and Ethereum, respectively.
Pour et al. [40] proposed a hybrid model for Bitcoin price prediction that uses Long Short-Term Memory and Bayesian Optimization. Their model was validated using MSE, RMSE, and NRMSE. Their reported results showed that the proposed model had good predictive power. Patel et al. [1] proposed a hybrid cryptocurrency prediction method based on LSTM and GRU, concentrating on Litecoin and Monero. According to the reported results, the proposed algorithm predicted the prices with great accuracy, and it may be used in various applications based on cryptocurrency price prediction. Kim et al. [41] proposed a multiple-LSTM model based on self-attention. Their proposed model contains numerous LSTM modules for on-chain variable groups as well as an attention mechanism. Experiments based on real Bitcoin price data were used to assess the suggested framework, which resulted in MAE, RMSE, MSE, and MAPE values of 0.3462, 0.5035, 0.2536, and 1.3251, respectively.
Miura et al. [42] utilized several ML models to predict future values based on prior samples, including multilayer perceptron, GRU, LSTM, SVM, and regression. The results demonstrated that the proposed system predicts prices accurately; thus, the technology might be used to anticipate prices for other cryptocurrencies. Yiying and Yeze [43] proposed an artificial intelligence approach combining a fully connected Artificial Neural Network (ANN) and LSTM to study Bitcoin, Ethereum, and Ripple price fluctuations. Their stated results revealed that ANN was more dependent on long-term history. LSTM, on the other hand, depended more on short-term dynamics, indicating that it extracts information from historical memory more efficiently than ANN.
A machine learning-based strategy was proposed in Shahbazi and Byun [44] to perform cryptocurrency price predictions (i.e., Litecoin and Monero) for a financial institution. Their proposed method included a reinforcement learning algorithm to assess and forecast cryptocurrency prices and a blockchain infrastructure to ensure a secure transaction environment. Their findings revealed that the proposed approach predicted prices more accurately than previous state-of-the-art algorithms. Kalariya et al. [45] proposed a stochastic neural network model for cryptocurrency price prediction based on random walk theory. The Multi-Layer Perceptron (MLP) and LSTM models were trained, and the experimental results for Bitcoin, Ethereum, and Litecoin were reported. The model outperformed the deterministic models, according to their findings.
Uras et al. [46] used statistical methodologies and ML algorithms to forecast the prices of Bitcoin, Litecoin, and Ethereum. Simple Linear Regression was used to forecast univariate series using only price data, whereas Multiple Linear Regression was used to forecast multivariate series utilizing both price and volume data. Multilayer Perceptron and Long Short-Term Memory were the deep learning methods employed. In addition, a deep-learning-based hybrid model employing GRU and LSTM was proposed by Tanwar et al. [47] to forecast the price of Litecoin and Zcash. Their proposed model has the potential to be utilized in real-time applications. According to their published results, the model forecasted prices with great accuracy compared to existing models.
Related studies comparison: Table 1 presents a comparison between the mentioned literature and the current study. They are organized in descending order.
Research Gap: In light of the reviewed literature, more studies address Bitcoin than the other altcoins, as seen in Table 1. For traders, security and privacy are critical concerns when trading. Hence, the possibility of using ML- and DL-based algorithms to address the anonymity, security, and privacy levels of other cryptocurrencies needs to be explored. ML ensemble algorithms have not been examined much in the field of cryptocurrency price prediction. Furthermore, little attention has been paid to optimizing ML techniques to enhance accuracy.
3 The suggested DLCP2F
As mentioned earlier, cryptocurrency popularity increased in 2017 as the market value rose rapidly for several months in a row. The total market capitalization peaked at around $800 billion in January 2018 [48]. The current study suggests a framework for cryptocurrency price prediction that utilizes state-of-the-art deep learning architectures. The proposed framework consists of five phases: (1) a data acquisition phase, where the data is acquired from a public source, (2) a data preprocessing phase to prepare the dataset for the next phase, (3) a classification phase to learn and optimize the models, (4) a performance evaluation phase, and (5) a future prediction phase. It is summarized graphically in Fig. 3.
3.1 Data acquisition phase
The current study depends on three public real-time cryptocurrency datasets retrieved from “Yahoo Finance”. Their “Historical Prices” daily records are retrieved up to “August 9, 2022”. The first dataset, Bitcoin USD (BTC-USD), consists of 2885 daily records starting from “September 17, 2014” [26]. The second dataset, Ethereum USD (ETH-USD), consists of 1735 daily records starting from “November 9, 2017” [27]. The third dataset, Cardano USD (ADA-USD), consists of 1735 daily records starting from “November 9, 2017” [28]. Each of the three datasets consists of 7 columns: “Date”, “Open”, “High”, “Low”, “Close”, “Adj Close”, and “Volume”. The “Open” and “Close” prices represent the currency market’s opening and closing prices on a specific “Date”. The “High” and “Low” prices represent the currency market’s maximum and minimum prices on a specific “Date”. The “Volume” is the amount of money in circulation on a particular “Date”. Table 2 summarizes the details of the datasets. Figure 4 shows the close prices for the three datasets. The close prices are low in the initial period and then rise. After that, the prices fluctuate but remain in the high region. This behavior makes forecasting cryptocurrency prices from the given trading features a recognizable challenge. Statistics on the three datasets are reported in Table 3. Skewness measures asymmetry: a distribution (i.e., dataset) is symmetric if its right and left sides look the same from the center point. Kurtosis measures whether the data is heavy- or light-tailed compared to a normal distribution. Thus, datasets with high kurtosis (i.e., heavy-tailed) tend to contain outliers, while datasets with low kurtosis (i.e., light-tailed) lack outliers [49, 50]. Table 3 shows that the last column has a very high standard deviation compared to the other columns.
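The skewness and kurtosis statistics described above can be illustrated with a minimal sketch. It assumes plain population moments; the estimator behind Table 3 is not stated in the text, so values computed this way may differ slightly from the reported ones. The function names and the two sample lists are illustrative only.

```python
def skewness(values):
    """Population skewness: third central moment over variance^1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def excess_kurtosis(values):
    """Population excess kurtosis: fourth central moment over variance^2, minus 3.

    A normal distribution has excess kurtosis 0; positive values indicate
    heavier tails, negative values lighter tails.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3.0


# Illustrative data, not taken from the datasets.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = [1.0] * 9 + [20.0]

print(skewness(symmetric), excess_kurtosis(symmetric))
print(skewness(with_outlier), excess_kurtosis(with_outlier))
```

The symmetric list has zero skewness and light tails, while the single large value in the second list produces both positive skewness and positive excess kurtosis.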
3.2 Data preprocessing phase
The data is arranged chronologically and recorded at regular intervals (i.e., 1 day). It is therefore time series data that requires special treatment by the models used (i.e., BiLSTM and GRU). The first step is to filter the features. The current study uses the “Open”, “Close”, “Adj Close”, and “Volume” features, while the other features are dropped. As the target of the current study is to predict the price of the cryptocurrency, it depends only on the selected columns. After that, the features are scaled using the min-max scaler (Eq. 1), where \(X_i\) is the input record and \(X_o\) is the scaled output record. This helps the optimization algorithm converge faster.
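As a minimal sketch of the scaling step, assume that Eq. 1 has the standard min-max form \(X_o = (X_i - X_{min}) / (X_{max} - X_{min})\), applied to each column separately. The function names and the sample prices are illustrative and not taken from the study.

```python
def min_max_scale(column):
    """Scale one feature column into [0, 1] with the standard min-max rule."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


def min_max_inverse(scaled, lo, hi):
    """Map scaled values back to the original price range."""
    return [s * (hi - lo) + lo for s in scaled]


# Illustrative close prices.
close = [100.0, 150.0, 125.0, 200.0]
scaled = min_max_scale(close)
print(scaled)  # [0.0, 0.5, 0.25, 1.0]
print(min_max_inverse(scaled, min(close), max(close)))
```

Keeping the column minimum and maximum is what allows predictions made on the scaled data to be converted back to prices.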
The last step in the preprocessing phase is to build data sequences. Building sequences begins with creating a sequence of a specific length (i.e., window size) at position 0. Then a new sequence is created by shifting one position to the right. This continues until all of the available positions have been used. Finally, the inputs and outputs are created using the same approach. The only difference between the inputs and outputs is that the outputs are shifted by a specified value, namely the “days shift”.
The models are controlled by two variables (i.e., days shift and sequence length). The days shift is the time gap between the input (i.e., features) and the output (i.e., close price). For example, if the days shift is 5 and the first 10 days are taken as input, the output will be from the 5th to the 15th day. This affects the prediction as follows: when 10 days are entered as input (i.e., from the 1st to the 10th day) and the days shift is 3, the model is expected to predict the output from the 3rd to the 13th day. Since the goal is to predict future data, the last three elements of the predicted output are the future values. The lowest value of the days shift is 1; in that case, the data of the next day is predicted along with the previous days. The sequence length controls how the data is passed to the model. When the sequence length is 10, the input is divided into groups of 10 records, and each group is treated as one record. For example, if the input X consists of 100 records and the sequence length is set to 25, four sequences will be generated, and each is treated as one record by the model. A higher sequence length tends to improve performance because each record contains more information; however, the training time also increases. Thus, the current work aims to determine the best values for both the days shift and the sequence length through the use of the grid search approach.
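The search over these two variables can be sketched as an exhaustive loop. The ranges are the ones reported later in Sect. 3.3; `toy_evaluate` is a hypothetical stand-in for "train a model and return its validation error", and its formula is made up purely so that the example runs. It does not reproduce any result from the study.

```python
from itertools import product

SEQUENCE_LENGTHS = [10, 20, 30, 40, 50]
DAYS_SHIFTS = [1, 2, 3, 4, 5]


def grid_search(evaluate, sequence_lengths, days_shifts):
    """Score every (sequence_length, days_shift) pair and return the best one.

    `evaluate` must return a score where lower is better (e.g., an MSE).
    """
    results = {}
    for seq_len, shift in product(sequence_lengths, days_shifts):
        results[(seq_len, shift)] = evaluate(seq_len, shift)
    best = min(results, key=results.get)
    return best, results


def toy_evaluate(seq_len, shift):
    # Hypothetical placeholder score, NOT a trained model's error.
    return 1.0 / seq_len + 0.01 * shift


best, results = grid_search(toy_evaluate, SEQUENCE_LENGTHS, DAYS_SHIFTS)
print(len(results), best)
```

With five values per variable, the search trains and scores 25 configurations per model and dataset, which is why training time dominates the overall cost.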
Figure 5 shows a graphical sample of the training and testing inputs and outputs process. In this example, a 1000-record dataset is split into 900 for training and 100 for testing where the days shift value is set to 5. The training inputs start from 0 while the training outputs start from 5 (i.e., the days shift value). Hence, the input X is the first 850 records and the output Y is the last 850 records. This means that the prediction will be the forecast for the next 5 days based on the current inputs.
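The sequence-building procedure can be sketched as follows. This is one reading of the description above, assuming that each output window has the same length as its input window and starts `days_shift` positions later; the exact indexing used in the study is not given in the text. The function name and the day-number data are illustrative.

```python
def make_sequences(features, targets, sequence_length, days_shift):
    """Build sliding input windows and output windows shifted by `days_shift`.

    Input window k covers positions [k, k + sequence_length); its output
    window covers [k + days_shift, k + days_shift + sequence_length).
    """
    inputs, outputs = [], []
    last_start = len(features) - sequence_length - days_shift
    for start in range(last_start + 1):
        inputs.append(features[start:start + sequence_length])
        out_start = start + days_shift
        outputs.append(targets[out_start:out_start + sequence_length])
    return inputs, outputs


# Day numbers 1..20 stand in for both the feature rows and the close prices.
days = list(range(1, 21))
X, Y = make_sequences(days, days, sequence_length=10, days_shift=5)

print(X[0])   # days 1..10
print(Y[0])   # days 6..15
print(Y[-1])  # days 11..20
```

In each pair, the last `days_shift` elements of the output window lie beyond the corresponding input window, which is what makes them forecasts rather than reconstructions.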
3.3 Classification and optimization phase
The current phase works on creating two state-of-the-art deep learning models (i.e., BiLSTM and GRU) and optimizing them based on the input data. Long Short-Term Memory (LSTM) works by allowing each internal layer to use certain gates to access information from both previous and current layers. After passing through several gates (for example, the forget and input gates) and activation functions (such as the Tanh and ReLU functions), the data is delivered via the LSTM cells. The main advantage is that each LSTM cell can recall patterns for a specific time. It is important to note that LSTM can remember important information while forgetting irrelevant information. Furthermore, an LSTM’s default behavior is remembering information for a long time [51].
Bidirectional LSTM (BiLSTM) is a recurrent neural network (RNN) that is commonly used to process natural language. In contrast to a typical LSTM, the input flows in both directions, so the network can use data from both sides. In short, BiLSTM adds a second LSTM layer in which the input sequence flows backward. The outputs from both LSTM layers are combined using one of several methods, including average, sum, multiplication, and concatenation [52]. The suggested BiLSTM network consists of: (1) an input LSTM layer with a number of units equal to the sequence length and a Tanh activation function, (2) a 50% dropout layer, (3) a BiLSTM layer with 256 units, (4) another 50% dropout layer, and (5) an output layer with a linear activation function. Figure 6 presents the hierarchy of the BiLSTM model using a sequence length of 50. The “None” keyword means that any value is accepted.
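The bidirectional idea and the four merge methods can be illustrated with a toy sketch. To keep it short, a single-weight recurrent pass (`run_recurrent`, an assumption of this example) stands in for a full LSTM layer; only the forward/backward wiring and the merging are the point here. The merge mode names are labels chosen for this sketch.

```python
import math


def run_recurrent(sequence, weight=0.5):
    """Toy recurrent pass, h_t = tanh(x_t + weight * h_{t-1}); NOT a real LSTM."""
    h, states = 0.0, []
    for x in sequence:
        h = math.tanh(x + weight * h)
        states.append(h)
    return states


def bidirectional(sequence, merge="concat"):
    """Run the toy layer in both directions and merge the per-step outputs."""
    forward = run_recurrent(sequence)
    # Process the reversed sequence, then flip back to the original time order.
    backward = run_recurrent(sequence[::-1])[::-1]
    pairs = list(zip(forward, backward))
    if merge == "concat":
        return [[f, b] for f, b in pairs]
    if merge == "sum":
        return [f + b for f, b in pairs]
    if merge == "ave":
        return [(f + b) / 2 for f, b in pairs]
    if merge == "mul":
        return [f * b for f, b in pairs]
    raise ValueError(f"unknown merge mode: {merge}")


sequence = [0.1, 0.2, 0.3]
for mode in ("concat", "sum", "ave", "mul"):
    print(mode, bidirectional(sequence, mode))
```

At every time step, the forward state summarizes the values seen so far and the backward state summarizes the values still to come, so the merged output has access to context from both sides.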
GRU (Gated Recurrent Unit) is an RNN that seeks to tackle the vanishing gradient problem. GRU might be regarded as a variant of the LSTM. It employs the so-called update gate and reset gate to overcome the vanishing gradient problem of a regular RNN. These two gates are vectors that determine what data should be sent to the output. They are unique in that they can be trained to retain knowledge from the past without washing it away over time, and to discard information unrelated to the forecast [53]. The suggested GRU network consists of: (1) an input GRU layer with 50 units and a Tanh activation function, (2) a 25% dropout layer, (3) another GRU layer with 100 units and a Tanh activation function, (4) another 25% dropout layer, and (5) an output layer with a linear activation function. Figure 7 presents the hierarchy of the GRU model using a sequence length of 50. The “None” keyword means that any value is accepted.
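The role of the two gates can be made concrete with a single GRU step. The sketch below uses the textbook GRU equations with a scalar input and a scalar hidden state for readability; the parameter names in the dictionary are illustrative, and this is not the network used in the study. Some libraries swap the roles of \(z\) and \(1 - z\) in the final blend, which is an equivalent convention.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h_prev, p):
    """One GRU step for scalar input x and scalar hidden state h_prev."""
    # Update gate: how much of the new candidate replaces the old state.
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])
    # Reset gate: how much of the old state feeds into the candidate.
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])
    # Candidate state built from the input and the reset-scaled old state.
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    # Blend the old state and the candidate according to the update gate.
    return (1.0 - z) * h_prev + z * h_cand


PARAM_NAMES = ["wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh"]
zero_params = dict.fromkeys(PARAM_NAMES, 0.0)

# With all-zero parameters both gates sit at 0.5 and the candidate is 0.
print(gru_step(3.0, 1.0, zero_params))  # 0.5

# A strongly negative update-gate bias keeps the previous state almost intact.
keep_params = dict(zero_params, bz=-50.0)
print(gru_step(3.0, 0.8, keep_params))
```

The second call shows how the update gate lets the state pass through a step nearly unchanged, which is the mechanism that helps information (and gradients) survive over many time steps.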
For both networks, the AdaGrad optimizer [54] is used. It has several advantages: (1) it eliminates the need to manually tune the learning rate, (2) it achieves faster and more reliable convergence than basic SGD when the weight scaling is unequal, and (3) it is not sensitive to the size of the step. It uses the update rule in Eq. 2, where \(\eta\) is the learning rate, \(g_t\) is the partial derivative of the objective function, and \(G_t\) is a diagonal matrix that accumulates the squared gradients. \(\varepsilon\) is added to avoid division by zero. A hyperparameter is a configuration value that is external to the model’s learned parameters and cannot be estimated from the data. Before the learning process can begin, the hyperparameter’s value must be determined. The grid search (GS) is used to identify the model’s optimal hyperparameters, i.e., those that produce the best predictions [55]. The target of the GS approach is to find the best combination of sequence length and days shift. The sequence length range is [10, 20, 30, 40, 50] and the days shift range is [1, 2, 3, 4, 5].
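A minimal sketch of the AdaGrad update follows, assuming that Eq. 2 has the standard form \(\theta_{t+1} = \theta_t - \frac{\eta}{\sqrt{G_t + \varepsilon}} \, g_t\), applied per parameter with \(G_t\) holding the running sum of squared gradients. The objective function, learning rate, and step count are illustrative choices, not values from the study.

```python
import math


def adagrad_minimize(grad_fn, theta, learning_rate=0.5, epsilon=1e-8, steps=200):
    """Minimize a function with per-parameter AdaGrad updates."""
    theta = list(theta)
    accum = [0.0] * len(theta)  # diagonal of G_t: running sum of squared gradients
    for _ in range(steps):
        grads = grad_fn(theta)
        for i, g in enumerate(grads):
            accum[i] += g * g
            theta[i] -= learning_rate / math.sqrt(accum[i] + epsilon) * g
    return theta


def grad(theta):
    # Gradient of f(theta) = theta0^2 + 100 * theta1^2, a badly scaled bowl.
    return [2.0 * theta[0], 200.0 * theta[1]]


result = adagrad_minimize(grad, [1.0, 1.0])
print(result)
```

Because each coordinate is divided by its own accumulated gradient magnitude, the steeply curved coordinate and the shallow one shrink at nearly the same rate, which illustrates advantage (2) in the list above.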
3.4 Performance evaluation phase
The performance is evaluated at each epoch. The current study applies 100 epochs with an early stopping value of 10. The dataset is split into training, validation, and testing subsets. The testing size is set to 100. The validation size is set to 10% of the remaining data. The mean squared error is used as the loss and evaluation function. The lower the value, the better the model. It is defined in Eq. 3, where N is the number of records, \(y_i\) is the actual value, and \(y^*_i\) is the predicted value. Also, the root mean squared error, mean absolute error, mean absolute percentage error, and R2 score are calculated, and their equations are shown in Eqs. 4 to 6. It is worth mentioning that Eqs. 3 and 4 can be derived from one another, since the RMSE is the square root of the MSE.
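The five metrics can be sketched with their standard definitions, using \(y_i\) for the actual values and \(y^*_i\) for the predictions. MAPE is returned here as a fraction (multiply by 100 for a percentage); the text does not state which scale the result tables use. The sample values are illustrative.

```python
import math


def mse(y_true, y_pred):
    return sum((a - p) ** 2 for a, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    # RMSE is the square root of MSE, so either can be derived from the other.
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    return sum(abs(a - p) for a, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    # Undefined when an actual value is zero; callers must avoid that case.
    return sum(abs((a - p) / a) for a, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - p) ** 2 for a, p in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]
print(mse(y_true, y_pred), rmse(y_true, y_pred), mae(y_true, y_pred))
print(mape(y_true, y_pred), r2_score(y_true, y_pred))
```

Note that R2 becomes negative whenever the residual sum of squares exceeds the total sum of squares, i.e., when the model predicts worse than a constant equal to the mean of the actual values.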
4 Experiments and discussions
The current section reports the executed experiments and their discussions. The experiments’ configurations are reported in Table 4.
4.1 The “Bitcoin USD (BTC-USD)” dataset experiments
Table 5 shows the reported performance metrics for the “Bitcoin USD (BTC-USD)” dataset using the two models (i.e., GRU and BiLSTM). The minimum MSE, RMSE, MAE, MAPE, and R2 are 0.00029, 0.01711, 0.02214, 0.07036, and −2.89528, respectively, for the GRU and 0.00210, 0.04582, 0.03358, 0.10942, and −10.71676, respectively, for the BiLSTM. The maximum MSE, RMSE, MAE, MAPE, and R2 are 0.00264, 0.05136, 0.07599, 0.21793, and 0.74536, respectively, for the GRU and 0.00793, 0.08907, 0.08577, 0.25511, and 0.44163, respectively, for the BiLSTM. For the GRU model, the best sequence length and days shift are 50 and 1 concerning the MSE (and RMSE), 50 and 4 concerning the MAE, and 50 and 4 concerning the MAPE. For the BiLSTM model, the best sequence length and days shift are 50 and 1 concerning the MSE (and RMSE), 50 and 1 concerning the MAE, and 50 and 1 concerning the MAPE. Based on these, the majority-voted sequence length and days shift are 50 and 1, respectively. Figures 8 and 9 summarize the reported RMSE results graphically using BiLSTM and GRU, where the x-axis represents the days shift range (i.e., [1, 2, 3, 4, 5]) and the y-axis represents the RMSE values. They show that both the GRU and the BiLSTM report their best RMSE values with a sequence length of 50.
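The majority vote can be sketched as below. The voting procedure is not spelled out in the text, so this is only one plausible reading: it assumes that the sequence length and the days shift are voted on separately and that each of the four error metrics of each model casts one vote. The per-metric best configurations are the Bitcoin values quoted above.

```python
from collections import Counter

# Best (sequence_length, days_shift) per model and metric, as quoted above
# for the Bitcoin dataset. MSE and RMSE always share the same best setting.
best_configs = {
    ("GRU", "MSE"): (50, 1),
    ("GRU", "RMSE"): (50, 1),
    ("GRU", "MAE"): (50, 4),
    ("GRU", "MAPE"): (50, 4),
    ("BiLSTM", "MSE"): (50, 1),
    ("BiLSTM", "RMSE"): (50, 1),
    ("BiLSTM", "MAE"): (50, 1),
    ("BiLSTM", "MAPE"): (50, 1),
}

# Assumption: the two hyperparameters are voted on independently.
seq_votes = Counter(seq_len for seq_len, _ in best_configs.values())
shift_votes = Counter(shift for _, shift in best_configs.values())

majority_seq_len = seq_votes.most_common(1)[0][0]
majority_shift = shift_votes.most_common(1)[0][0]
print(majority_seq_len, majority_shift)  # 50 1
```

Under this reading, the sequence length 50 receives all eight votes and the days shift 1 receives six of eight, which matches the majority-voted pair reported for this dataset.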
4.2 The “Ethereum USD (ETH-USD)” dataset experiments
Table 6 shows the reported performance metrics for the “Ethereum USD (ETH-USD)” dataset using the two models (i.e., GRU and BiLSTM). The minimum MSE, RMSE, MAE, MAPE, and R2 are 0.00071, 0.02662, 0.05350, 19.67071, and −0.48171, respectively, for the GRU and 0.00446, 0.06681, 0.05710, 0.23379, and −3.91045, respectively, for the BiLSTM. The maximum MSE, RMSE, MAE, MAPE, and R2 are 0.00411, 0.06414, 0.09458, 33.75066, and 0.73784, respectively, for the GRU and 0.01289, 0.11355, 0.10970, 0.40123, and 0.11536, respectively, for the BiLSTM. For the GRU model, the best sequence length and days shift are 50 and 1 concerning the MSE (and RMSE), 50 and 5 concerning the MAE, and 50 and 5 concerning the MAPE. For the BiLSTM model, the best sequence length and days shift are 10 and 2 concerning the MSE (and RMSE), 50 and 1 concerning the MAE, and 50 and 1 concerning the MAPE. Based on these, the majority-voted sequence length and days shift are 50 and 1, respectively. Figures 10 and 11 summarize the reported RMSE results graphically using BiLSTM and GRU, where the x-axis represents the days shift range (i.e., [1, 2, 3, 4, 5]) and the y-axis represents the RMSE values. They show that the GRU reports its best RMSE values with a sequence length of 50, while the BiLSTM does so with a sequence length of 10.
4.3 The “Cardano USD (ADA-USD)” dataset experiments
Table 7 shows the reported performance metrics for the “Cardano USD (ADA-USD)” dataset using the two models (i.e., GRU and BiLSTM). The minimum MSE, RMSE, MAE, MAPE, and R2 are 0.00007, 0.00852, 0.00971, 0.06256, and −2.59725, respectively, for the GRU and 0.00031, 0.01752, 0.01599, 0.10515, and −8.22629, respectively, for the BiLSTM. The maximum MSE, RMSE, MAE, MAPE, and R2 are 0.00068, 0.02604, 0.02152, 0.13393, and 0.24750, respectively, for the GRU and 0.00165, 0.04058, 0.03424, 0.21699, and −0.20519, respectively, for the BiLSTM. For the GRU model, the best sequence length and days shift are 50 and 1 concerning the MSE (and RMSE), 50 and 4 concerning the MAE, and 50 and 4 concerning the MAPE. For the BiLSTM model, the best sequence length and days shift are 30 and 1 concerning the MSE (and RMSE), 40 and 3 concerning the MAE, and 40 and 3 concerning the MAPE. Based on these, the majority-voted sequence length and days shift are 50 and 1, respectively. Figures 12 and 13 summarize the reported RMSE results graphically using BiLSTM and GRU, where the x-axis represents the days shift range (i.e., [1, 2, 3, 4, 5]) and the y-axis represents the RMSE values. They show that the GRU reports its best RMSE values with a sequence length of 50, while the BiLSTM does so with sequence lengths of 30 and 40.
5 Limitations
One of the main limitations of this work is that each cryptocurrency was treated independently, neglecting its potential relations with other cryptocurrencies. Additionally, the non-stationarity and complexity of cryptocurrency time-series data are not considered. Furthermore, all the coins considered in the study have high market capitalization; thus, their behavior differs from that of newly launched ICO coins with low market capitalization. The proposed approach also cannot operate instantaneously, as model training is the most time-consuming stage. Moreover, only two RNN architectures are used.
6 Conclusion and future work
A precise methodology for predicting cryptocurrency prices is critical in digital financial markets. Because forecasting capability varies from coin to coin, artificial intelligence and machine learning approaches are appealing. This study introduced a framework based on two types of deep learning algorithms (i.e., BiLSTM and GRU), which are used to predict the prices of three of the most popular cryptocurrencies (i.e., Bitcoin, Ethereum, and Cardano). The framework consists of five main phases. First, data is retrieved from a public real-time cryptocurrency source, “Yahoo Finance.” Second, the data is preprocessed via filtering and squishing the features to prepare the dataset for the next phase. Third, prediction is performed via BiLSTM and GRU together with model optimization. Fourth, performance is evaluated for each epoch. Finally, future prices are predicted. Evaluation metrics such as MSE, RMSE, MAE, MAPE, and R2 are applied to test the accuracy of the models. For the “Bitcoin USD (BTC-USD)” dataset, the minimum MSE, RMSE, MAE, and MAPE for the GRU model are 0.00029, 0.01711, 0.02214, and 0.07036, respectively. Similarly, for the BiLSTM model, the minimum MSE, RMSE, MAE, and MAPE are 0.00210, 0.04582, 0.03358, and 0.10942, respectively. For the “Ethereum USD (ETH-USD)” dataset, the minimum MSE, RMSE, MAE, and MAPE for the GRU model are 0.00071, 0.02662, 0.05350, and 19.67071, respectively. Similarly, for the BiLSTM model, the minimum MSE, RMSE, MAE, and MAPE are 0.00446, 0.06681, 0.05710, and 0.23379, respectively. For the “Cardano USD (ADA-USD)” dataset, the minimum MSE, RMSE, MAE, and MAPE for the GRU model are 0.00007, 0.00852, 0.00971, and 0.06256, respectively. Similarly, for the BiLSTM model, the minimum MSE, RMSE, MAE, and MAPE are 0.00031, 0.01752, 0.01599, and 0.10515, respectively. The results indicate that GRU outperformed the BiLSTM algorithm for all three cryptocurrencies (Bitcoin, Ethereum, and Cardano).
The minimum R2 for the GRU model was found to be − 2.89528, − 0.48171, and − 2.59725 for Bitcoin, Ethereum, and Cardano, respectively. Based on these results, the GRU model is more efficient and reliable than BiLSTM in predicting cryptocurrency prices, although both algorithms deliver excellent results.
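The roles of the sequence length and the days shift in the preprocessing phase can be pictured as a sliding-window transformation. The sketch below is an assumption about how such windows might be built, not the study's code: it takes the days shift to mean the number of days between the end of the input window and the predicted value, and the helper name `make_windows` and the toy series are hypothetical.

```python
import numpy as np

def make_windows(series, seq_len, days_shift):
    """Build (X, y): X holds seq_len consecutive values, y the value
    days_shift days after the last value of each window (assumed meaning)."""
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - seq_len - days_shift + 1
    if n_samples <= 0:
        raise ValueError("series too short for this seq_len and days_shift")
    X = np.stack([series[i:i + seq_len] for i in range(n_samples)])
    y = np.array([series[i + seq_len + days_shift - 1] for i in range(n_samples)])
    # Add a trailing feature axis: (samples, timesteps, 1), the usual input
    # shape for recurrent layers such as GRU or BiLSTM.
    return X[..., np.newaxis], y

prices = np.arange(10, dtype=float)  # hypothetical stand-in for scaled closing prices
X, y = make_windows(prices, seq_len=3, days_shift=2)
```

With this toy series, the first window is [0, 1, 2] and its target is 4, the value two days after the window ends. Longer sequence lengths give the model more history per sample but leave fewer samples to train on.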
In future work, other factors that affect the cryptocurrency market will be investigated. Autoencoder-based time-series neural networks will be applied to predict the time-series data. Moreover, other hyperparameter tuning approaches, such as random search or metaheuristic optimization algorithms (e.g., the genetic algorithm, particle swarm optimization, and Bayesian optimization), can be applied instead of grid search. Additionally, the effect of social media on the price and trading volume of cryptocurrencies will be studied; sentiment analysis and natural language processing techniques will be used to analyze posts and tweets and extract insights.
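To illustrate the difference between grid search and the random search mentioned above, the following sketch enumerates a small hyperparameter space exhaustively and then samples from it. The search space is hypothetical (only the days shift range 1 to 5 is stated in the text; the sequence-length list is assumed), and `toy_score` is a placeholder objective standing in for a validation RMSE obtained from a trained model.

```python
import itertools
import random

# Hypothetical search space; the sequence-length values are an assumption.
search_space = {
    "seq_len": [10, 20, 30, 40, 50],
    "days_shift": [1, 2, 3, 4, 5],
}

def grid_candidates(space):
    """Enumerate every combination, as exhaustive grid search does."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in itertools.product(*space.values())]

def random_candidates(space, n_trials, seed=0):
    """Sample n_trials distinct combinations, as random search does."""
    rng = random.Random(seed)
    return rng.sample(grid_candidates(space), n_trials)

def best_candidate(candidates, score_fn):
    """Return the candidate with the lowest score (e.g., validation RMSE)."""
    return min(candidates, key=score_fn)

def toy_score(params):
    # Placeholder objective, NOT a trained model: lowest at seq_len=50, days_shift=1.
    return abs(params["seq_len"] - 50) / 100 + abs(params["days_shift"] - 1) / 10

grid = grid_candidates(search_space)                   # 25 evaluations
sampled = random_candidates(search_space, n_trials=8)  # 8 evaluations
best_grid = best_candidate(grid, toy_score)
best_random = best_candidate(sampled, toy_score)
```

Grid search always finds the best point in the grid at the cost of evaluating every combination, whereas random search trades that guarantee for a fixed, smaller budget. When each evaluation means training a recurrent network, that budget is the main practical consideration.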
Data availability
Any datasets used, generated, or analyzed during the current study are available as follows: (A) datasets owned by the authors are available from the corresponding author on reasonable request; (B) for datasets not owned by the authors, the supplementary information, including the links, is included in this published article.
Funding
No funding was received for this work.
Author information
Contributions
The authors certify that the work they have submitted for publication is entirely new, has never been published before, and is not presently being considered for publication elsewhere. We certify that all mentioned authors have reviewed the paper and that no other individuals who meet the requirements for authorship but are not listed have contributed to the work. We further reaffirm that we all approved of the order in which the authors are listed in the text. We recognize that the corresponding author serves as the exclusive point of contact for the editorial procedure. He is in charge of informing the other writers of his progress, their submission of corrections, and his final acceptance of the proofs. The author read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
The current study does not contain any studies with human participants and/or animals performed by any of the authors. There is no informed consent for the current study.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Aljadani, A. DLCP2F: a DL-based cryptocurrency price prediction framework. Discov Artif Intell 2, 20 (2022). https://doi.org/10.1007/s44163-022-00036-2