1 Introduction

The stock market is a complicated and varied system that produces chaotic, noisy, and non-stationary data, which makes fast and accurate prediction a challenge for investors seeking to allocate capital profitably. Plenty of previous research has therefore been done in this field, and it can be broadly summarized in two categories: linear models based on statistical theories, and nonlinear models based on machine learning [1, 2]. Among the existing statistical models, time series [3], gray [4], vector autoregression (VAR) [5], and other approaches are commonly used. Prediction methods such as linear regression, moving average, and GARCH are favored for financial time series forecasting because of their interpretability [6]. They perform well on linear and stationary time series but fall short on nonlinear and non-stationary data.

Apart from the traditional statistical methods, data mining and machine learning have carved their own niches in time series analysis [7, 8]. Among data mining studies applied to daily stock data, there are prediction works based on the support vector machine (SVM) [9, 10]. Artificial neural networks (ANN) [11, 12], which determine whether new pattern data belong to a certain category, have achieved successful predictions even with complicated relationships between variables. The fuzzy time series (FTS) [13, 14] method is also employed to predict and identify time series variation. Well-known optimization algorithms, including particle swarm optimization (PSO), the genetic algorithm (GA), and rough sets theory (RS), have been used in this field as well. In addition, some prediction studies rely on word analysis of news articles [15, 16], etc. In particular, since White first employed an ANN to predict the daily return rate of IBM common stock [17], the use of ANNs in stock price prediction has become a hot research topic [18,19,20]. More recently, deep learning, and especially convolutional neural networks (CNN), has shown impressive performance in pattern recognition and classification; it has also been employed to forecast stock prices from historical exchange data [21]. For example, Tsantekidis et al. [22] proposed a deep learning methodology based on convolutional neural networks to predict stock prices, attaining a precision as high as 67.38% over 20 prediction horizons. Deng et al. [23] trained a recurrent deep neural network (NN) for real-time financial signal representation and trading; the experimental results on both the stock and the commodity futures markets demonstrated the effectiveness of their approach. Zarkias et al. [24] reported a novel price-trailing approach based on deep reinforcement learning (RL) that went beyond traditional price forecasting, and an experiment conducted on a real dataset showed their method outperforming a competitive deep RL method. Purpose-built recurrent neural networks (RNNs) have been applied to forecasting stock data as well [25, 26]. These studies make full use of the advantages of neural networks, such as self-adaptation, self-learning, and distributed processing, and overcome the shortcomings of conventional prediction methods. Nevertheless, these neural network methods still suffer from low precision, slow convergence, and a tendency to fall into local minima. In addition, the very large number of parameters that must be trained on a vast amount of labeled data is a problem for deep neural networks. Moreover, price trailing, as a variant of price prediction, does not really overcome the limitations of traditional price forecasting under chaos, volatility, and noise interference. Despite these constraints, the previous studies clearly indicate the potential of neural network methods in stock forecasting.

As mentioned previously, most existing research has aimed at accurate models that predict the exact value of a stock price rather than at the trading strategy itself, such as determining buy or sell points based on knowledge of intelligent stock trading. In reality, however, investors are more concerned with making trading decisions than with forecasting daily prices. So far, only a few studies have addressed the turning points of a stock, and this problem remains a substantial one for academic researchers and industrial practitioners alike [1, 11]. Reviewing the latest literature, we found the following research related to turning point prediction. Tang et al. [27] integrated piecewise linear representation (PLR) and a weighted support vector machine (WSVM) for forecasting stock turning points. Intachai et al. [28] reported a Local Average Model (LAM) to predict the turning points of the stock price. Chang et al. [29] applied an ANN model to learn connection weights from historical turning points, after which an exponential-smoothing-based dynamic threshold model was used to forecast future trading signals. Commonly used algorithms, including SVM and ANN, have shown their effectiveness in stock turning point prediction. However, as stated earlier, conventional algorithms such as ANNs have difficulty avoiding over-fitting and becoming trapped in local minima. Thus, building on the above literature, this paper proposes an RVFL-GMDH based turning point prediction method for the time series analysis of stock prices. Through chaotic time series analysis, the turning signal Φ(t) of the stock price can be obtained in advance. Then the RVFL-GMDH network, which randomly initializes the weight and bias parameters between the input layer and the hidden layer, is employed for turning point prediction of the stock price. A series of experiments is performed, and the empirical analysis results demonstrate the validity of the proposed procedure.

The rest of this paper is structured as follows. Section 2 introduces chaotic time series analysis and describes the theoretical background. Section 3 discusses the modeling methodology for the turning point prediction task, along with related concepts and the proposed procedure. Section 4 is dedicated to the algorithm experiments; multiple experiments are conducted, and an empirical analysis is carried out in practical application scenarios. The paper is summarized in Sect. 5.

2 Chaotic time series analysis

2.1 Determination of chaotic attractor

Generally, it is easy to obtain a set of time series values in an economic system, i.e., one-dimensional information about the system. Although this one-dimensional information contains characteristics of the system, it cannot fully reflect the system's dynamic, multi-dimensional features, so some of these features may be lost. Therefore, Packard proposed reconstructing the phase space from the delay coordinates of a single variable of the original system, and Takens proved that a suitable embedding dimension can be found [30]: if the dimension m of the delay coordinates is greater than or equal to 2d + 1 (m ≥ 2d + 1, where d is the dimension of the dynamic system), the regular trajectory (attractor) can be recovered in the embedding space. Thus, on the orbit of the reconstructed Rm space, the original dynamic system is preserved up to differential homeomorphism, which lays a solid theoretical foundation for the prediction of chaotic time series.

Let the chaotic time series of a single variable be \(\{ x(i), i = 1,2,\ldots,N\}\). Then, if the embedding dimension m and time delay τ are selected appropriately, the phase space of the time series can be expressed as in Eq. (1).

$$ \begin{aligned} Y(t_{i} ) & = [x(t_{i} ),\;x(t_{i} + \tau ),\;x(t_{i} + 2\tau ),\; \ldots ,\;x(t_{i} + (m - 1)\tau )] \\ i & = 1,2, \ldots ,N - (m - 1)\tau \\ \end{aligned} $$
(1)

where τ represents the delay time and m indicates the embedding dimension. According to Takens’ theorem, the dynamic characteristics of the attractor can be recovered in the sense of topological equivalence.

A small embedding dimension m0 is first given to form a reconstructed phase space, and the distance between vectors in phase space can be defined in Eq. (2).

$$ \left| {y_{i} - y_{j} } \right| = \mathop {\max }\limits_{{1 \le k \le m_{0} }} \left| {y_{ik} - y_{jk} } \right| $$
(2)

The correlation integral of embedded time series is computed by

$$ C_{m_{0}}(r) = \frac{1}{N^{2}}\sum\limits_{i,j = 1}^{N} \theta \left( r - \left| y_{i} - y_{j} \right| \right) $$
(3)

where θ(·) is the Heaviside function and r denotes a given distance threshold; C(r) is a cumulative distribution function representing the probability that the distance between two points in phase space is less than r. The function θ(·) is given in Eq. (4).

$$ \theta (x) = \begin{cases} 0, & x \le 0 \\ 1, & x > 0 \end{cases} $$
(4)

For an appropriate range of r, the dimension d of the attractor and the cumulative distribution function C(r) should satisfy a log-linear relationship, as written in Eq. (5).

$$ d(m_{0} ) = \ln C_{{m_{0} }} (r)/\ln r $$
(5)

Thus, d(m0) is the estimated correlation dimension corresponding to m0. The embedding dimension is then increased (m1 > m0) and the above calculation repeated until the corresponding dimension value d(m) no longer changes, within a certain error range, as m increases. The d obtained at this point is the correlation dimension of the attractor. If d keeps increasing with m and does not converge to a stable value, the time series is random and does not have the characteristics of chaotic dynamics, which provides a basis for the identification of chaotic time series. A minimal sketch of this procedure is given below.
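To make this concrete, the following minimal Python/NumPy sketch estimates d(m) for increasing m in the Grassberger–Procaccia manner described above. The function names, the scaling range rs, and the placeholder series are illustrative assumptions, not the code used in the experiments of Sect. 4.

```python
import numpy as np

def embed(x, m, tau):
    """Delay-coordinate embedding: each row is a phase-space vector (Eq. (1))."""
    n = len(x) - (m - 1) * tau
    return np.array([x[i:i + (m - 1) * tau + 1:tau] for i in range(n)])

def correlation_integral(Y, r):
    """C_m(r) of Eq. (3): fraction of vector pairs closer than r in the max norm of Eq. (2)."""
    d = np.max(np.abs(Y[:, None, :] - Y[None, :, :]), axis=2)  # pairwise Chebyshev distances
    return np.sum(d < r) / len(Y) ** 2

def correlation_dimension(x, m, rs, tau=5):
    """Slope of ln C(r) versus ln r over the scaling range rs (Eq. (5))."""
    Y = embed(x, m, tau)
    logC = np.log([correlation_integral(Y, r) for r in rs])
    slope, _ = np.polyfit(np.log(rs), logC, 1)
    return slope

# Increase m until d(m) saturates; a d(m) that keeps growing suggests a random series.
x = np.random.default_rng(0).standard_normal(500)  # placeholder series
for m in range(3, 11):
    print(m, correlation_dimension(x, m, rs=np.logspace(-0.3, 0.3, 8)))
```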

2.2 Chaotic analysis prediction

According to Packard's idea of phase-space reconstruction and Takens' proof [30] that the reconstructed Rm space is differentially homeomorphic to the original dynamical system, we can obtain a homeomorphic dynamical system in the m-dimensional phase space Rm. Thus, there is a smooth map f: Rm → Rm that gives the nonlinear dynamic relationship in phase space, as expressed in Eq. (6).

$$ X(t + 1) = f(X(t)) $$
(6)

For the time series x1, x2,…, xn-1, xn, the phase space can be reconstructed on the basis of the calculated optimal embedding dimension m and time delay τ. It is defined as:

$$ X(t_{i} ) = [x(t_{i} - \tau ),x(t_{i} - 2\tau ), \ldots ,x(t_{i} - m\tau )],\quad i = 1,2, \ldots $$
(7)

From Takens’ theorem, the dynamic characteristics of the attractor can be recovered in the sense of topological equivalence. Therefore, a dynamic system F: Rm → R can be obtained on the space Rm, which satisfies Eq. (8).

$$ x(t_{i} - \tau + 1) = F[x(t_{i} - \tau ),x(t_{i} - 2\tau ), \ldots ,x(t_{i} - m\tau )],\quad i = 1,2, \ldots $$
(8)

where F(·) is a mapping from m-dimensional states to one-dimensional real numbers. If an analytic expression or dynamic equation for the mapping F(·) can be obtained, it is possible to predict the time series according to the inherent regularity of the chaotic time series.

Since most time series arise from complex, nonlinear systems, it is difficult to obtain the functional equation directly, and the strong nonlinear mapping ability of neural networks provides a good way to deal with such issues. Moreover, the Kolmogorov continuity theorem [31] tells us that any continuous function ψ: Em → Rn, Y = ψ(x), can be precisely implemented by a three-layer neural network, where Em is the m-dimensional unit cube [0,1]m. This theorem mathematically guarantees the feasibility of neural networks for chaotic time series prediction.

3 Nonlinear modeling based on RVFL-GMDH

3.1 Related theories

3.1.1 RVFL neural networks

Random vector functional link (RVFL) networks, first proposed by Pao et al. [32], are a randomized version of feedforward neural networks and have received continuing attention owing to their universal approximation capability and outstanding generalization performance [33,34,35,36,37]. Unlike conventional neural networks, the RVFL network randomly initializes all the weight and bias parameters ({wj, bj}, j = 1,2,…,m) between the input layer and the hidden layer; these parameters are fixed and need no tuning during the whole training stage. In addition, compared with the influential extreme learning machine (ELM) [38] and random kitchen sinks (RKS) [39] networks, the RVFL has extra direct connections between the input layer and the output layer, although all three are randomized networks. Like the ELM, the RVFL randomly fixes the weights and biases of the hidden layer, whereas RKS adopts a bank of arbitrary randomized nonlinearities. Figure 1 depicts a schematic diagram of the RVFL network, and a brief description of each layer is presented below.

Fig. 1
figure 1

The schematic diagram of RVFL networks [33]

Input layer

The input layer feeds a training set {(xi, yi)} with n samples into the network, where i = 1, 2,…, n, x ∈ Rn, y ∈ R.

Hidden layer

The hidden layer computes the value of the activation function h(·) for each hidden node; the sigmoid function is usually employed, as defined in Eq. (9).

$$ h(x,w,b) = \frac{1}{{1 + \exp \{ - (w^{T} x + b)\} }} $$
(9)

where w and b represent the weights and biases from the input layer to the hidden layer, respectively. Then the kernel mapping matrix H of the hidden layer can be formed to calculate the output, as written in Eq. (10).

$$ H = \left[ {\begin{array}{*{20}c} {h_{1} (x_{1} )} & \cdots & {h_{k} (x_{1} )} \\ \vdots & \ddots & \vdots \\ {h_{1} (x_{n} )} & \cdots & {h_{k} (x_{n} )} \\ \end{array} } \right] $$
(10)

where k is the number of hidden layer nodes.

Output layer

The key task of model training for the RVFL network is to learn the optimal weights Wo from the hidden layer to the output layer, which can be calculated by the least-squares method and eventually solved as

$$ W_{o} = (H^{T} H)^{ - 1} H^{T} Y $$
(11)

where Y is the training target.
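For concreteness, the following minimal NumPy sketch follows the training scheme described above; in line with the architecture of Fig. 1, the direct input–output links are concatenated with the hidden outputs before the least-squares solve. The names, sizes, and toy data are illustrative assumptions rather than the implementation used in Sect. 4.

```python
import numpy as np

rng = np.random.default_rng(42)

def rvfl_fit(X, Y, k=50):
    """Fix random hidden parameters, then solve the output weights by least squares (Eq. (11))."""
    n, d = X.shape
    W = rng.uniform(-1, 1, size=(d, k))      # random input-to-hidden weights, never tuned
    b = rng.uniform(-1, 1, size=k)           # random hidden biases, never tuned
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # sigmoid activations, Eq. (9)
    D = np.hstack([H, X])                    # hidden outputs plus direct input-output links
    Wo, *_ = np.linalg.lstsq(D, Y, rcond=None)
    return W, b, Wo

def rvfl_predict(X, W, b, Wo):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return np.hstack([H, X]) @ Wo

# Usage on toy data
X = rng.standard_normal((200, 7))
Y = np.sin(X).sum(axis=1)
W, b, Wo = rvfl_fit(X, Y)
print(rvfl_predict(X[:5], W, b, Wo))
```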

3.1.2 GMDH method

Group method of data handling (GMDH) is the core algorithm of self-organizing data mining; it can determine which variables enter the model and identify the structure and parameters of the model in a self-organizing manner [40,41,42]. GMDH is well suited to the modeling of nonlinear complex systems. Figure 2 depicts a typical GMDH network architecture.

Fig. 2
figure 2

A typical GMDH network

As seen in Fig. 2, the initial input variables (models) are combined with each other to generate intermediate candidate models, and the optimal intermediate models (e.g., the black-filled nodes) are selected through operations such as heredity and mutation. The processes of heredity, mutation, selection, and evolution are then repeated over further iterations, and the complexity of the intermediate models is continuously increased until the model of optimal complexity is obtained.

Unlike the classical ANN family, GMDH adopts a mathematical description, termed the reference function, to establish a general mapping relationship between the input and output variables for modeling. The discrete form of the Volterra functional series, the Kolmogorov–Gabor (K-G) polynomial, is usually chosen as the reference function. Most frequently, the K-G polynomial serves as the initial input model of the algorithm, and the K-G polynomial in the variables (x1, x2,…, xm) is given by Eq. (12)

$$ y = f(x_{1} ,x_{2} ,...,x_{m} ) = a_{0} + \sum\limits_{i = 1}^{m} {a_{i} x_{i} } + \sum\limits_{i = 1}^{m} {\sum\limits_{j = 1}^{m} {a_{ij} x_{i} x_{j} + \sum\limits_{i = 1}^{m} {\sum\limits_{j = 1}^{m} {\sum\limits_{k = 1}^{m} {a_{ijk} x_{i} x_{j} x_{k} +\cdots} } } } } $$
(12)

where X = (x1, x2,…, xm) represents the vector of input variables, a = (a0, a1,…) is the vector of coefficients or weights, and y denotes the output variable. Ideally, as the number of independent variables and the polynomial degree (together termed complexity) increase [40], the polynomial sequence can fit numeric data with any required precision. The GMDH method is therefore often employed for prediction problems in practical application scenarios across various domains; a small sketch of the polynomial expansion follows.
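As a small illustration, the sketch below expands a sample matrix into the K-G polynomial terms of Eq. (12), truncated at second order; the helper name kg_terms is our own, not part of any GMDH library.

```python
import numpy as np
from itertools import combinations_with_replacement

def kg_terms(X, degree=2):
    """Columns: 1, x_i, x_i*x_j, ... up to the given polynomial degree."""
    n, m = X.shape
    cols = [np.ones(n)]                       # constant term a_0
    for deg in range(1, degree + 1):
        for idx in combinations_with_replacement(range(m), deg):
            cols.append(np.prod(X[:, list(idx)], axis=1))
    return np.column_stack(cols)

X = np.arange(6.0).reshape(3, 2)              # 3 samples, m = 2 variables
print(kg_terms(X).shape)                      # (3, 6): 1, x1, x2, x1^2, x1*x2, x2^2
```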

3.2 Proposed approach

3.2.1 RVFL-GMDH modeling

As mentioned earlier, RVFL offers nonlinear modeling ability together with simplicity, good approximation, and fast solution through randomization, while GMDH resists noise interference, effectively avoids over-fitting, and is interpretable. The two algorithms are thus complementary in several respects. Therefore, taking the merits of both, the modified RVFL and GMDH networks are fused into a new model called RVFL-GMDH, which is used for the turning point prediction of the stock price. Figure 3 depicts the schematic of the model, and the specific processes are described below.

Fig. 3
figure 3

a Network structure of RVFL-GMDH, and b Output predicted value

1. For a given dataset D, it is divided into a training set A and a test set B, so that D = A + B. Suppose the training set A = {x1,…, xn} with n samples is input to the model, where i = 1,2,…,n, x ∈ Rn.

2. Using the input variables, the values of the nodes in the first hidden layer are calculated by the activation function h(·), where the sigmoid function is employed, as expressed in Eqs. (13, 14).

    $$ f(x_{i} ) = \sum\limits_{i = 1}^{n} {(w_{i} x_{i} + b_{i} )} $$
    (13)
    $$ h_{i} = 1/(1 + e^{{ - f(x_{i} )}} ) $$
    (14)

    where wi and bi represent the random weight and bias of RVFL, xi is the input variable, and hi denotes the output of the activation function.

3. After Step 2, the K-G polynomial is employed as the reference function for the proposed method, as seen in Eq. (12). The form of the first-order K-G polynomial with n neurons (variables) is displayed as follows:

    $$ f(x_{1} ,x_{2} ,...,x_{n} ) = a_{0} + a_{1} x_{1} + a_{2} x_{2} +\cdots+ a_{n} x_{n} $$
    (15)
4. Generate the candidate models: by pairwise coupling, the nodes of the former layer are combined to generate intermediate candidate models, which are regarded as the new input of the next layer. Specifically, considering all the sub-items of Eq. (15), there are n + 1 initial input models: v1 = a0, v2 = a1x1, …, vn+1 = anxn, and every two nodes are combined into one unit with the reference function y = f(vi, vj) = a1 + a2vi + a3vj. Therefore, there are n1 = \(C_{{n_{0} }}^{2}\) (n0 = n + 1) candidate models in the hidden layer:

    $$ y_{k}^{1} = a_{1}^{k} + a_{2}^{k} v_{i} + a_{3}^{k} v_{j},\quad \,i,\,j = 1,2,...,n_{0} ,i \ne j,\quad k = 1,2,...,n_{1} $$
    (16)

    where yk1 represents the estimated output, and a1k, a2k, a3k (k = 1,2,…,n1) are the coefficients obtained on the training set by the least-squares (LS) method.

5. Model selection: based on a threshold measurement (the external criterion), F1 (≤ n1) candidate models are selected as the input of the next layer. Similarly, n2 = \(C_{{F_{1} }}^{2}\) intermediate candidate models are obtained in the subsequent layer:

    $$ y_{k}^{2} = b_{1}^{k} + b_{2}^{k} y_{i}^{1} + b_{3}^{k} y_{j}^{1},\quad\,i,\,j = 1,2,...,F_{1} ,i \ne j,k = 1,2,...,n_{2} $$
    (17)
6. Repeat Steps 4–5 until the optimal complexity model is obtained according to the termination principle (see Fig. 3a). A brief description of the above processes is displayed in Algorithm 1, and an illustrative sketch of one generation follows it.

    figure e
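As an illustration of Steps 4–5, the sketch below runs one GMDH generation under our assumptions: the candidates are the pairwise partial models of Eq. (16) fitted by least squares, and validation MSE plays the role of the external criterion. The names and the exact criterion are illustrative, not a definitive implementation.

```python
import numpy as np
from itertools import combinations

def gmdh_layer(V_train, y_train, V_val, y_val, F=8):
    """One heredity/selection pass: fit every pairwise partial model (Eq. (16))
    and keep the F best by the external criterion (here, validation MSE)."""
    candidates = []
    for i, j in combinations(range(V_train.shape[1]), 2):
        A = np.column_stack([np.ones(len(V_train)), V_train[:, i], V_train[:, j]])
        coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)   # a1, a2, a3 of Eq. (16)
        Av = np.column_stack([np.ones(len(V_val)), V_val[:, i], V_val[:, j]])
        mse = np.mean((Av @ coef - y_val) ** 2)              # external criterion
        candidates.append((mse, A @ coef, Av @ coef))
    candidates.sort(key=lambda c: c[0])
    best = candidates[:F]
    train_out = np.column_stack([c[1] for c in best])        # inputs of the next layer
    val_out = np.column_stack([c[2] for c in best])
    return train_out, val_out, best[0][0]                    # best criterion, for the stop test

# Layers are stacked until the best external-criterion value stops improving (Step 6).
```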

3.2.2 Generate turning point sequence

Stock price forecasting represents a specific type of economic prediction and has its own unique characteristics. Figure 4 shows the daily K-line curve of a stock. For a successful stock market operation, it is necessary to trade according to the vertical arrows, buying at the lowest point and selling at the highest point. A good trading system should give reminders only when a real turning point appears; false positives must be avoided. For example, as indicated by the horizontal arrow, the market drops by only a few percentage points and then continues to rise, so this is not a turning point and no reminder should be given.

Fig. 4
figure 4

The daily K curve of a stock

In practice, we need to know the direction of the observed market and thus determine trading operations such as long-term or short-term trading, buying, and selling. What we want is to use historical data to predict future price trends. Therefore, the model does not need to predict the exact closing price of the next trading day; rather, the trading strategy is discovered from the predicted trend. In other words, we need to predict the turning points of the stock price rather than its exact value.

For neural networks, a supervisory signal must be available to train the model. Our model uses the time series turning indicator Φ(t) as the supervisory signal. Let θ be the expected turning rate; Φ(t) is then calculated as follows.

1. Starting from the first datum in the time series, the increment between adjacent data y(t) and y(t−1) is calculated in turn and accumulated.

2. When the accumulated rise of the market price exceeds the expected turning rate θ, the previous low point is marked as a “Buy” signal, and the corresponding value of Φ(t) is set to 0.05.

3. When the accumulated fall of the market price exceeds the expected turning rate θ, the previous high point is marked as a “Sell” signal, and the corresponding value of Φ(t) is set to 0.95.

4. The other values of Φ(t) in the time window are normalized to the interval [0.05, 0.95] by interpolation.

Thus, the values of the turning indicator Φ(t) can be calculated by the above process. It is worth mentioning that the expected turning rate θ should not be set too small; otherwise too many signals, many of them useless, will be generated. For a given stock price time series, Fig. 5 depicts the turning indicator Φ(t) calculated with the expected turning rate θ set to 5% and to 8%, respectively.

Fig. 5
figure 5

The relation diagram between turning indicator Φ(t) and turning rate θ

As seen in Fig. 5a, 9 “Buy” signal points and 8 “Sell” signal points are generated when the turning rate θ is set to 5%. By contrast, over the same period, there are 4 “Buy” signal points and 3 “Sell” signal points when θ is set to 8%, as presented in Fig. 5b. Naturally, the more turning points there are in training, the more turning signals the proposed method will generate in prediction, and vice versa. In short, the turning rate θ is a hyper-parameter threshold, and we recommend setting it to 5% or 8% here. After the turning signal Φ(t) is obtained, the trajectory prediction in the reconstructed phase space can be carried out correspondingly. According to Takens’ theorem, the reconstructed Rm space is differentially homeomorphic to the original dynamical system, so the turning point prediction for the stock price can be performed on the basis of the turning signal. Algorithm 2 details the calculation of the stock turning point sequence, and a sketch of the computation follows it.

figure f
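A hedged Python sketch of this computation, reflecting our reading of Steps 1–4 and Algorithm 2, is given below; the zig-zag style confirmation of extremes and the helper names are assumptions rather than the authors' exact procedure.

```python
import numpy as np

def turning_indicator(y, theta=0.05):
    """Phi(t): 0.05 at confirmed lows ("Buy"), 0.95 at confirmed highs ("Sell"),
    with linear interpolation in between (Steps 1-4)."""
    y = np.asarray(y, dtype=float)
    marks = {}                 # index of confirmed extreme -> 0.05 (Buy) or 0.95 (Sell)
    ext, trend = 0, 0          # running extreme of the current leg; +1 up-leg, -1 down-leg
    for t in range(1, len(y)):
        if trend >= 0:         # on an up-leg (or undecided): watch for a fall beyond theta
            if y[t] >= y[ext]:
                ext = t
            elif (y[ext] - y[t]) / y[ext] > theta:
                marks[ext] = 0.95          # previous high becomes a "Sell"
                trend, ext = -1, t
        if trend <= 0:         # on a down-leg (or undecided): watch for a rise beyond theta
            if y[t] <= y[ext]:
                ext = t
            elif (y[t] - y[ext]) / y[ext] > theta:
                marks[ext] = 0.05          # previous low becomes a "Buy"
                trend, ext = 1, t
    idx = sorted(marks)        # assumes at least two turning points were confirmed
    return np.interp(np.arange(len(y)), idx, [marks[i] for i in idx])
```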

4 Experimental results and analysis

We conducted a series of experiments to validate the efficacy of the proposed procedure. Apart from some graphical representations produced with MATLAB or R, the main machine learning algorithms, including SVM, ANN, RF, and the proposed RVFL-GMDH, were implemented with Anaconda3 (Python 3.6), the scikit-learn library, and the PyMC3 library, among others [43,44,45]. The chaos identification and correlation dimension determination were carried out with the Statistics and Machine Learning Toolbox in MATLAB. The hardware environment for the experiments comprised an Intel® Core™ i7-8750 CPU (2.20 GHz), 8 GB of DDR4 RAM, and a GeForce GTX 1060 graphics card.

4.1 Experiments on public datasets

The UCI Repository [46] is an international, general-purpose database of widely collected datasets, intended primarily for testing machine learning prediction and classification algorithms. To verify the efficiency of the proposed method, five datasets, Boston, Abalone, Airfoil, Communities, and Combined Cycle Power Plant (CCPP), were downloaded from the UCI database and used in the experiments. They are frequently employed as benchmarks to probe the performance of different methods.

The Boston dataset, composed of 506 instances, concerns the forecasting of the median house price and includes 14 variables describing characteristics of the house and related attributes. Based on physical measurements of abalone molluscs, the Abalone dataset aims to predict the age of abalone; it contains 4177 samples with 8 features. The Airfoil dataset is used to predict the scaled sound pressure level of airfoils and comprises 1503 samples with 6 features. The Communities dataset contains 1994 instances determined by 128 features; the goal is to predict the violent crime rate. The CCPP dataset is used for predicting the hourly electrical energy output of a power plant and contains 9568 instances determined by 4 features. Approximately half of the data in each dataset was selected as training samples and the remainder as test samples.

Moreover, we specifically studied the forecasting effect of the model on economic data. One open Stock dataset was downloaded from the KEEL-dataset repository [47] (http://150.214.190.154/keel/dataset.php?cod=1298), which provides a set of benchmarks for analyzing the behavior of learning methods in data mining and knowledge discovery tasks. The Stock dataset provides the daily stock prices of 10 aerospace companies from January 1988 to October 1991; it has 10 features and 950 instances. The goal is to estimate the stock price of the 10th company given the stock prices of the other 9; a specific description of the Stock dataset is presented in Table 1. On this basis, we trained and tested the proposed approach on this dataset, and Fig. 6 depicts the mean square error (MSE, see Eq. (19)) iteration curve of the training process, with the number of iterations on the horizontal axis and the mean square error on the vertical axis. The final mean square error converges to a very small value during training. Table 2 displays detailed information on the tested open datasets.

Table 1 The description of the Stock dataset
Fig. 6
figure 6

The MSE iteration curve of the proposed approach on the Stock dataset

Table 2 The detailed information of the tested open datasets

Regarding accuracy statistics, we verify the efficiency of the models with metrics including the coefficient of determination R2, the mean squared error MSE, and the explained variance EVAR, which are calculated in Eqs. (18)–(20), respectively.

$$ R^{2} = 1 - \sum\limits_{i = 1}^{n} {(y_{i} - \hat{y}_{i} )^{2} } \Big/ \sum\limits_{i = 1}^{n} {(y_{i} - \overline{y})^{2} } $$
(18)
$$ MSE = \frac{1}{n}\sum\limits_{i = 1}^{n} {(y_{i} - \hat{y}_{i} )^{2} } $$
(19)
$$ E_{VAR} = 1 - {\text{var}}(y_{i} - \hat{y}_{i} )/{\text{var}}(y_{i} ) $$
(20)

where \(y_{i}\) denotes the observed value, \(\hat{y}_{i}\) the predicted value, \(\overline{y}\) the mean of the observed values, and var(∙) the variance function. The indicator R2 measures the proportion of the variation of the dependent variable that can be explained by the independent variables; the MSE is the mean of the squared errors between the observed and predicted values; the EVAR represents the explanatory power of the model. For both R2 and EVAR the best possible value is 1, while for MSE smaller values are better; all three are available directly in scikit-learn, as sketched below. Furthermore, to benchmark the proposed procedure, we consider four influential algorithms, SVM, ANN, RF (random forest), and GMDH, for comparative analysis. The training and test accuracies of the different methods are reported in Tables 3 and 4, respectively. In particular, the R2 values of the various algorithms on the test datasets are depicted in Fig. 7.
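Since the experiments use scikit-learn, the three metrics of Eqs. (18)–(20) correspond to its built-in scoring functions; a quick check on dummy values:

```python
import numpy as np
from sklearn.metrics import r2_score, mean_squared_error, explained_variance_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])   # observed values (dummy data)
y_pred = np.array([2.5, 0.0, 2.0, 8.0])    # predicted values (dummy data)

print(r2_score(y_true, y_pred))                  # R^2, Eq. (18)
print(mean_squared_error(y_true, y_pred))        # MSE, Eq. (19)
print(explained_variance_score(y_true, y_pred))  # E_VAR, Eq. (20)
```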

Table 3 The training accuracies of predicting the datasets
Table 4 The testing accuracies of predicting the datasets
Fig. 7
figure 7

The testing R2 of different algorithms on diverse datasets

From Table 4 it can be clearly observed that the average R2 value of RVFL-GMDH on the test sets is 0.668, the highest of all the algorithms. In particular, regarding the forecasting effect on the Stock dataset, the final mean square error MSE of the proposed method on both the training and the test set is relatively small, while the coefficient of determination R2 and the explained variance EVAR are both large, indicating that the model fits the economic data well. Additionally, across the different datasets, RVFL-GMDH presents a relatively high R2 value except on the Airfoil dataset; as observed in Fig. 7, its bars are higher than those of the other algorithms in nearly all regions, which reflects the stability and effectiveness of the proposed method. For the average MSE and EVAR values, RVFL-GMDH also achieves the best results apart from the RF algorithm, an ensemble learning method composed of multiple decision trees (20 here). Based on these experimental results, the proposed method is superior to the other algorithms and performs comparably to RF on the experimental datasets; it essentially outperforms the other well-known prediction algorithms even when ensemble learning is adopted.

4.2 Empirical analysis experiment

Having obtained historical stock trading data from Yahoo Finance [48], we can choose a stock price at random as the analysis object. Here, Pci-Suntek (stock code: 600728) and one FTSE-100 index constituent stock, Tesco (stock code: TSCO.L), are selected for our empirical analysis. The close price of the Pci-Suntek stock from January 2017 to September 2018 is obtained, and the data is normalized as in Eq. (21).

$$ x(i) = \frac{y(i) - \frac{1}{N}\sum\limits_{i = 1}^{N} {y(i)} }{\left\{ \frac{1}{N}\sum\limits_{j = 1}^{N} {\left( y(j) - \frac{1}{N}\sum\limits_{i = 1}^{N} {y(i)} \right)^{2} } \right\}^{1/2} } $$
(21)

The chaotic nature of the time series is first judged using the chaotic attractor determination method of Sect. 2. Based on the normalized time series data, the m-dimensional phase space with time delay τ can be constructed; a point in the phase space is y(t) = (x(t), x(t + τ),…, x(t + (m−1)τ)). Considering that there are five trading days in a stock market week, we set τ = 5 and let the embedding dimension m take positive integer values m = 3, 4,…, 10. Furthermore, since recent data have a greater impact on the prediction, the data from January 1, 2018 to July 1, 2018 are selected as verification data, and the correlation integral function C(r) is computed by Eq. (3). A short sketch of this preprocessing is given below.
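The preprocessing, z-score normalization by Eq. (21) followed by delay embedding with τ = 5, can be sketched as follows; the function name and the example call are illustrative.

```python
import numpy as np

def preprocess(prices, m, tau=5):
    """Normalize by Eq. (21), then embed: each row is y(t) = (x(t), x(t+tau), ..., x(t+(m-1)tau))."""
    y = np.asarray(prices, dtype=float)
    x = (y - y.mean()) / y.std()                     # Eq. (21)
    n = len(x) - (m - 1) * tau
    return np.array([x[t:t + (m - 1) * tau + 1:tau] for t in range(n)])

# e.g., phase-space points for embedding dimension m = 7 on a close-price series:
# points = preprocess(close_prices, m=7)
```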

The value of the correlation integral depends on the given distance r. Theoretically, there is a range of r over which the attractor dimension d and the integral function C(r) satisfy the log-linear relationship; that is, d(m0) equals lnCm0(r)/ln r (see Eq. (5)). Under different embedding dimensions, a series of correlation integral values Cm(r) can be calculated for a series of r values. Plotting ln(r) on the horizontal axis against ln(Cm0(r)) on the vertical axis yields Fig. 8.

Fig. 8
figure 8

The LnC(r)-ln(r) diagram for Pci-Suntek

As seen in Fig. 8, the curves from top to bottom correspond to gradually increasing embedding dimensions m = 3, 4, 5,…, 10. As m increases, the curves become steeper and their slopes stabilize. The stable value reached is the estimate of the correlation dimension; the details are listed in Table 5.

Table 5 The estimates of correlation dimension

It can be seen from Table 5 that when the embedding dimension m is increased to 8, the correlation dimension d(m) tends to become stable at a value of around 3.0, and the best embedding dimension is 7. This implies that the time series is not random but has chaotic dynamics, with a chaotic attractor of correlation dimension about 3.0. Moreover, the embedding dimension 7 is greater than 2 × 3.0287, which satisfies the condition for restoring the differentially homeomorphic dynamical system in Takens’ theorem [30]. Therefore, we can reconstruct the phase space with embedding dimension 7.

We take the historical data of the Pci-Suntek stock from January 1, 2018 to September 1, 2018 and split it on July 1, 2018: the earlier section serves as the training sample and the later one as the prediction sample. Using the method of Sect. 3, we compute the turning indicator values, which are fed into the RVFL-GMDH network for model training. Subsequently, the data from July 1, 2018 to September 1, 2018 are used as test data, and the turning points of stock prices in this period are predicted. Table 6 displays part of the sample data. For comparative analysis, commonly used approaches, namely an ANN and a time series method, are chosen to predict the turning points. The results of the turning point predictions are depicted in Fig. 9: Fig. 9a shows the trend of the actual stock price, Fig. 9b the predictions of the autoregressive moving average (ARMA) time series model, Fig. 9c the predictions of the ANN method, and Fig. 9d the predictions of the proposed RVFL-GMDH method.

Table 6 The partial sample data of turning signal generation (θ = 5%)
Fig. 9
figure 9

The actual stock price curve and the predictions of turning indicator Φ(t)

It can be observed from Fig. 9 that the RVFL-GMDH based turning point prediction method has predicted most of the turning points in the sample, and the positions of the turning points are basically consistent with the actual trend of the stock price curve. Moreover, as shown in Fig. 9d, most of the predicted turning points (e.g., those marked by red arrows in Fig. 9d) coincide with cusps of the actual stock price curve (see Fig. 9a), which supports the effectiveness of the proposed procedure. By contrast, the ARMA time series method mainly yields “Buy” turning points, apart from 2 “Sell” turning point signals, because its predicted values are relatively small, and its correctly predicted turning points are few. The results of the ANN prediction method are similar: the main turning point signals it predicts are not consistent with the cusps of the stock price curve, and mostly “Buy” turning point signals are obtained because its predicted values are relatively large, as shown in Fig. 9c. Therefore, referring to the cusp positions of the RVFL-GMDH prediction curve, we can use them as selling or buying signals to assist our trading operations. The experiment is further conducted on TSCO.L, one of the FTSE-100 index constituent stocks in the UK stock market. We extracted the historical TSCO.L stock prices from January 1, 2015 to October 30, 2020, 1476 instances in total, to perform the empirical analysis of turning point prediction. In particular, considering the impact of the coronavirus (COVID-19) emergency on the economy, the data of 2020 are not used to validate the model. That is, the historical TSCO.L prices from January 1, 2015 to July 31, 2019 are used to train the model, while the data from August 1, 2019 through October 31, 2019 (a 3-month time window) verify the effectiveness of the proposed procedure. Table 7 displays part of the sample data, and the estimated values of the correlation dimension are given in Table 8. Table 8 shows that the correlation dimension tends to become stable when the embedding dimension is increased to 6, indicating that the time series has chaotic dynamics and the phase space can be reconstructed.

Table 7 The partial sample data of TSCO.L
Table 8 The correlation dimension estimation of TSCO.L

As seen in Table 7, after obtaining the turning signal according to the method introduced in Sect. 3.2.2, we use it to update the normalized difference value, which is mapped to the interval [0.05, 0.95] and forms the sample data for modeling (last column of Table 7). Based on the constructed sample data, we build the model using the RVFL-GMDH method, along with the other comparative methods, to predict the turning point data for the next 3 months. Figure 10a presents the specific trend of the actual TSCO.L stock price, and the predicted results of the different methods are depicted in Fig. 10b–d. As seen in Fig. 10b, the values predicted by the ARMA time series method all lie within the interval [0.2, 0.8], and no obvious turning points are predicted. The situation of the neural network method is similar: most of its predicted values are relatively large, so mainly “Sell” turning point signals and few “Buy” signals are obtained, as shown in Fig. 10c, and its correctly predicted turning points are few. On the other hand, looking at Fig. 10d, it is apparent that the proposed approach has successfully predicted most of the effective turning points of the stock price. As indicated by the arrows in this figure, the turning points whose predicted index value is greater than 0.8 or less than 0.2, signalling a selling or buying strategy for the stock exchange, are almost all effective turning points when verified against the actual stock price. Moreover, the predicted trend of the turning indicator is consistent with the trend of the actual stock price as well; for example, during the period of continuous increase in the actual price, the predicted turning point indicator (the “Adjusted nmdiff” variable) stays flat, as seen in the middle part of Fig. 10d. We can also conduct simulation studies to evaluate the performance of the turning point prediction using financial metrics such as Profit and Loss (PnL), which measures the absolute gain of the price index over all transactions, as calculated by [2]

$$ PnL = \sum\limits_{k = 1}^{N} {(Closing\_index_{s} - Closing\_index_{b} )} $$
(22)

where N denotes the number of transactions, and Closing_index_s and Closing_index_b represent the closing prices of the stock at the selling and buying points, respectively. Suppose that an investor buys 100 shares at the time point “410” (August 15, 2019, stock price 213.1) according to the turning point signals predicted by the proposed procedure, and sells them at the next long selling point, “452” (October 15, 2019, stock price 244.2). By doing this, the investor obtains a profit of 3,110, since the Profit and Loss is computed as 100 × (244.2 − 213.1). Furthermore, suppose that the investor carries out a short-term trading strategy at each signal point, buying or selling at least 100 shares each time. Thus, the investor buys 100 shares at time point “410” (August 15, 2019, price 213.1) and sells them at “413” (August 20, 2019, price 215.7); likewise, he buys 100 shares at “416” (August 23, 2019, price 212.3) and sells them at “418” (August 28, 2019, price 218.8). He then buys 100 shares at each of the four time points “423” (September 04, 2019, price 227.5), “426” (September 09, 2019, price 230.6), “435” (September 20, 2019, price 241.5), and “440” (September 27, 2019, price 244.5), and sells them all at “452” (October 15, 2019, price 244.2). Finally, at “455” (October 18, 2019, price 244.0) he buys 100 shares and sells them at “456” (October 21, 2019, price 245.5). In this manner, the PnL of the trading strategy based on the turning point signals in this time window is 100 × (215.7 − 213.1) + 100 × (218.8 − 212.3) + 100 × (4 × 244.2 − 227.5 − 230.6 − 241.5 − 244.5) + 100 × (245.5 − 244.0) = 4,330, as checked in the sketch below. In summary, the results of the empirical analysis reveal the feasibility and effectiveness of the proposed procedure, which succeeds in predicting the turning points of stock time series and can also be employed in further real-world application scenarios.
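As a quick arithmetic check of Eq. (22) for the short-term strategy above, the hypothetical (buy, sell) round trips of 100 shares can be totalled as follows:

```python
# (buy price, sell price) pairs from the hypothetical trades described above
trades = [(213.1, 215.7), (212.3, 218.8),
          (227.5, 244.2), (230.6, 244.2), (241.5, 244.2), (244.5, 244.2),
          (244.0, 245.5)]
pnl = sum(100 * (sell - buy) for buy, sell in trades)
print(round(pnl, 2))  # 4330.0
```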

Fig. 10
figure 10

The actual stock price curve and predicted turning indicator Φ(t) of TSCO.L

5 Conclusions

This work proposes a novel turning point prediction approach for the time series analysis of stock prices. The modeling of the nonlinear dynamic system based on the RVFL-GMDH network is introduced, and chaos theory is used to analyze the time series. Using the fractal characteristic of the strange attractor, with its infinitely self-similar structure, the trajectory trend prediction in the reconstructed phase space is accomplished. The method not only exploits the predictive power afforded by the dense periodic orbits of chaotic dynamic systems but also avoids the inaccuracy of single-value prediction caused by sensitivity to initial values. A series of experiments is performed on both open datasets and practical stock price data. The experimental findings demonstrate the effectiveness of the model compared with other state-of-the-art methods and show that the proposed procedure achieves substantial performance in forecasting the turning points of stock time series in real-life application scenarios. Based on the experimental analysis, it can be concluded that the proposed procedure has a significant capability for the turning point prediction of stock prices and can also be extended to other fields such as electricity load forecasting, material demand planning, and international oil price forecasting. In future work, we plan to deploy it on embedded systems as a product for automatically predicting and recognizing the turning points of stock prices, and we intend to apply it to more real-world applications.