1 Introduction

Heilongjiang is an important national commodity grain base, and rice is one of its three major crops. Predicting rice production in Heilongjiang province therefore has a significant impact on national macro policy. Production prediction methods used abroad focus on statistical dynamic growth simulation models, meteorological yield prediction models and remote sensing prediction models, while domestic production prediction is based mainly on mathematical models. Liu Qianpu uses a space-time regression prediction model to predict the grain output of Henan province and its municipalities; Li Bingjn predicts short-term grain output using a grey linear regression model; Chen Xiangfang proposes a regression tree model based on multivariate time series prediction to predict cucumber yield; and Xu Xingmei proposes a model based on clustering analysis and a scheduling algorithm to predict corn production. By removing noisy data and reducing the data dimension [7], this paper builds a space-time prediction model of rice yield and uses it to predict the 2010 rice yield of Heilongjiang province. Attention is paid to yield data processing, selection of predictors, model establishment and model stability during prediction, which improves the accuracy of the prediction.

To sum up, the combined use of GIS and time series analysis for space-time prediction of rice production in Heilongjiang province has not yet been reported.

2 ARIMA Model

The full name of the ARIMA model is the autoregressive integrated moving average model, also called the difference autoregressive moving average model. In ARIMA (p, d, q), AR denotes autoregression and p is the number of autoregressive terms; MA denotes moving average and q is the number of moving average terms; d is the number of times the series is differenced to make it stationary. An ARIMA model is established by converting a non-stationary time series into a stationary one through differencing and then regressing the dependent variable on its own lagged values and on the present and lagged values of the random error.

The model is generally written as ARIMA (p, d, q), where the parameters p, d and q are non-negative integers giving the order of the autoregressive part, the order of integration (degree of differencing) and the order of the moving average part, respectively. The ARIMA model is an important part of the Box-Jenkins method of time series modeling.
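For reference, the verbal definition above corresponds to the form in which ARIMA (p, d, q) is commonly written with the backshift operator \( B \) (where \( BX_{t} = X_{t - 1} \)). The paper does not state this equation explicitly; the symbols \( \phi_{i} \), \( \theta_{j} \) and \( \varepsilon_{t} \) (a white noise error term) are introduced here only for illustration:

$$ \left( {1 - \sum\limits_{i = 1}^{p} {\phi_{i} B^{i} } } \right)(1 - B)^{d} X_{t} = \left( {1 + \sum\limits_{j = 1}^{q} {\theta_{j} B^{j} } } \right)\varepsilon_{t} $$

With d = 0 this reduces to an ARMA (p, q) model of the series itself; with d > 0 the same ARMA structure is applied to the d-times differenced series \( (1 - B)^{d} X_{t} \).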

2.1 Stationarity Test

The stationarity of a sequence is usually judged from the following functions.

Mean function:

$$ \mu_{n} = EX_{n} = \int_{ - \infty }^{\infty } {xdF_{n} (x)} $$
(1)

Variance function:

$$ DX_{n} = E(X_{n} - \mu_{n} )^{2} = \int_{ - \infty }^{\infty } {(x - \mu_{n} )^{2} } dF_{n} (x) $$
(2)

Auto-covariance function:

$$ \gamma (n,n + k) = E(X_{n} - \mu_{n} )(X_{n + k} - \mu_{n + k} ) $$
(3)

Autocorrelation function:

$$ \rho (n,n + k) = \frac{\gamma (n,n + k)}{{\sqrt {DX_{n} \cdot DX_{n + k} } }} $$
(4)
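In practice only one realization of the series is observed, so functions (1)–(4) are usually replaced by sample estimates that, under the stationarity assumption, depend only on the lag k. The following is a minimal sketch of those estimators in plain Python; the function names and the toy data are illustrative assumptions and do not come from the paper.

```python
def sample_mean(x):
    """Sample estimate of the mean function."""
    return sum(x) / len(x)


def sample_autocovariance(x, k):
    """Sample autocovariance at lag k (divides by n, the usual convention)."""
    n = len(x)
    m = sample_mean(x)
    return sum((x[t] - m) * (x[t + k] - m) for t in range(n - k)) / n


def sample_autocorrelation(x, k):
    """Sample autocorrelation at lag k: gamma(k) / gamma(0)."""
    return sample_autocovariance(x, k) / sample_autocovariance(x, 0)


# Toy series, not data from the paper
series = [1.0, 2.0, 3.0, 4.0, 5.0]
print(sample_mean(series))                # mean
print(sample_autocovariance(series, 0))   # variance (lag 0)
print(sample_autocorrelation(series, 1))  # lag-1 autocorrelation
```

A slowly decaying sample autocorrelation is a common sign that a series is not stationary and needs differencing.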

2.2 Pure Randomness Test

Null hypothesis: the sequence values are mutually independent for all delays of no more than m periods.

$$ H_{0} :\rho_{1} = \rho_{2} = \rho_{3} = \cdots = \rho_{m} = 0,\forall m \ge 1 $$
(5)

The test statistic:

$$ Q_{LB} = n(n + 2)\sum\limits_{k = 1}^{m} {\left( {\frac{{\hat{\rho }_{k}^{2} }}{n - k}} \right)} \sim \chi^{2} (m) $$
(6)

If \( Q_{LB} > \chi_{1 - \alpha }^{2} (m) \) at significance level \( \alpha \), reject the null hypothesis: the sequence is not purely random and can be modeled.

If \( Q_{LB} \le \chi_{1 - \alpha }^{2} (m) \), accept the null hypothesis: the sequence is regarded as purely random and modeling terminates.

In general, m is taken as 6, 12 or 18.
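The test can be sketched in a few lines: compute the statistic of Eq. (6) from the sample autocorrelations and compare it with the chi-square critical value for m degrees of freedom, rejecting pure randomness when the statistic is larger. The sketch below assumes a 5 % significance level and hard-codes the standard critical values for m = 6, 12 and 18; the function names and the example series are illustrative and are not the paper's SAS procedure.

```python
# Upper 5 % critical values of the chi-square distribution
# for the lag counts m = 6, 12, 18 mentioned in the text.
CHI2_CRIT_05 = {6: 12.592, 12: 21.026, 18: 28.869}


def autocorr(x, k):
    """Sample autocorrelation at lag k (series must not be constant)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x)
    cov = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    return cov / var


def ljung_box_q(x, m):
    """Ljung-Box statistic Q_LB over lags 1..m."""
    n = len(x)
    return n * (n + 2) * sum(autocorr(x, k) ** 2 / (n - k) for k in range(1, m + 1))


def is_purely_random(x, m=6):
    """True if the null hypothesis (white noise) is not rejected at the 5 % level."""
    return ljung_box_q(x, m) <= CHI2_CRIT_05[m]


# Illustrative trending series (not data from the paper): strongly autocorrelated
trend = [float(t) for t in range(20)]
print(ljung_box_q(trend, 6), is_purely_random(trend, 6))
```

A series that fails this check still carries exploitable structure, which is the condition under which fitting an ARIMA model is worthwhile.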

2.3 Processing of Outliers

If \( X_{t + 1} \) is an outlier, it can be corrected by replacing it with the linear extrapolation \( \hat{X}_{t} = 2X_{t} - X_{t - 1} \) of the two preceding values [8].
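As a small illustration of this correction rule, the sketch below replaces a flagged value with the extrapolation from its two predecessors. How outliers are detected is not described in the paper, so the index of the outlier is simply passed in; the function name and the numbers are hypothetical.

```python
def correct_outlier(series, i):
    """Return a copy of series with series[i] replaced by 2*x[i-1] - x[i-2]."""
    if i < 2:
        raise ValueError("two preceding observations are required")
    fixed = list(series)
    fixed[i] = 2 * fixed[i - 1] - fixed[i - 2]
    return fixed


# Hypothetical yields; the value at index 3 is treated as an outlier
yields = [5.0, 5.5, 6.0, 12.0, 7.0]
print(correct_outlier(yields, 3))
```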

3 Rice Yield Prediction Model

3.1 Data Collection

This paper collected rice yield data for the municipalities of Heilongjiang Province from 1991 to 2010. The yields from 1991 to 2009 are used as the training set and the yield in 2010 as the test set.

3.2 Data Processing

Taking the rice yield per unit area of Mudanjiang city as an example, the distribution of rice yield after outlier treatment is shown in Fig. 1:

Fig. 1. Rice yield distribution map of Mudanjiang city from 1990 to 2010

The plot shows that the rice yield series does not vary smoothly, that is, it is not stationary. A white noise test indicates that the sequence contains white noise. Second-order differencing is therefore applied to the sequence, with the result shown in Fig. 2:

Fig. 2. Rice yield distribution after second-order differencing
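Second-order differencing means applying the first difference \( X_{t} - X_{t - 1} \) twice, which shortens the series by two observations. A minimal sketch follows; the function name and the toy series are illustrative, not the paper's data.

```python
def difference(series, d=1):
    """Apply first-order differencing d times; the result has len(series) - d values."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


# A quadratic trend becomes constant after two differences
print(difference([1, 4, 9, 16, 25], d=2))
```

The example shows why d = 2 can remove a curved trend that a single difference would leave in place.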

The autocorrelation and partial autocorrelation functions of the differenced sequence are then computed. Table 1 gives the autocorrelation function and Table 2 the partial autocorrelation function:

Table 1. Autocorrelation function
Table 2. Partial autocorrelation function

The plot shows no obvious periodicity in rice production, and the autocorrelation and partial autocorrelation functions after second-order differencing show that the differenced rice yield series of Mudanjiang city, Heilongjiang province, for 1991–2008 is stationary.

3.3 Determine the Order Number

The order is determined with the BIC criterion:

$$ BIC(n) = \ln \hat{\sigma }_{\varepsilon }^{2} (n) + \frac{n}{N}\ln N $$

where n is the number of parameters, N is the number of observations and \( \hat{\sigma }_{\varepsilon }^{2} (n) \) is the estimated residual variance of the model of order n. If an order \( n_{0}^{{\prime }} \) satisfies

\( BIC(n_{0}^{{\prime }} ) = \mathop { \hbox{min} }\limits_{1 \le n \le M(N)} BIC(n) \), where \( M(N) \) is equal to [\( \sqrt N \)] or [\( \frac{N}{10} \)], then \( n_{0}^{{\prime }} \) is the best order. The calculation gives p = 3 and q = 1, so the rice yield prediction model is ARIMA (3, 2, 1). The BIC analysis was carried out with SAS software, and the results are shown in Table 3.

Table 3. BIC order determination results
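The selection rule can be sketched as follows: evaluate BIC(n) for each candidate order up to M(N) and keep the order with the smallest value. The residual variances below are made-up numbers chosen only to show the mechanics; in the paper they come from models fitted in SAS. The function names are illustrative, and the sketch treats a single order n as in the formula rather than a (p, q) grid.

```python
import math


def bic(resid_var, n_params, n_obs):
    """BIC(n) = ln(sigma^2(n)) + (n / N) * ln(N)."""
    return math.log(resid_var) + n_params / n_obs * math.log(n_obs)


def best_order(resid_vars, n_obs):
    """resid_vars maps each candidate order n to the residual variance of its fitted model."""
    return min(resid_vars, key=lambda n: bic(resid_vars[n], n, n_obs))


n_obs = 19                      # e.g. 19 annual observations in a training set
max_order = math.isqrt(n_obs)   # M(N) = [sqrt(N)]
# Hypothetical residual variances for orders 1..M(N)
resid_vars = {1: 1.00, 2: 0.60, 3: 0.58, 4: 0.57}
print(max_order, best_order(resid_vars, n_obs))
```

The penalty term (n / N) ln N is what stops the criterion from always choosing the largest order: in this toy example the variance keeps falling, but too slowly after n = 2 to offset the penalty.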

4 Yield Prediction

The rice yield prediction model is used to predict the rice yield of the cities of Heilongjiang province in 2010. The predicted and actual outputs are shown in Table 4, and the predicted and actual yields of Mudanjiang city in 2010 are shown in Fig. 3.

Table 4. Comparison of actual and predicted rice yield in Heilongjiang province in 2010
Fig. 3. Distribution map of predicted and actual yield in Mudanjiang city in 2010
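Because the selected model has d = 2, its ARMA part forecasts the second-differenced series \( W_{t} = X_{t} - 2X_{t - 1} + X_{t - 2} \), and a forecast of \( W_{t + 1} \) has to be mapped back to the yield scale through \( X_{t + 1} = W_{t + 1} + 2X_{t} - X_{t - 1} \). The sketch below shows only this inversion step; the symbol \( W_{t} \), the function name and the numbers are hypothetical, and the forecast of the differenced series is assumed to come from an already fitted model.

```python
def undifference_d2(history, w_forecast):
    """Map a one-step forecast of the second-differenced series back to the original scale."""
    if len(history) < 2:
        raise ValueError("the last two observations are required")
    return w_forecast + 2 * history[-1] - history[-2]


# Toy history with constant second difference 2; forecasting 2 continues the pattern
history = [1, 4, 9, 16]
print(undifference_d2(history, 2))
```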

5 Spatial Analysis of Rice Yield

ArcGIS software is used to build spatial distribution maps of the actual and predicted rice yield of the cities of Heilongjiang Province in 2010, as shown in Figs. 4 and 5.

Fig. 4. Spatial distribution of actual yield

Fig. 5. Spatial distribution of predicted yield

Using GIS spatial analysis functions, an error analysis is carried out on the actual and predicted rice yield of Heilongjiang province in 2010. The spatial distribution map of the error is shown in Fig. 6.

Fig. 6. Spatial distribution of the prediction error

From the above analysis, the prediction error of rice yield for each municipality of Heilongjiang province is obtained, as shown in Table 5.

Table 5. Prediction error of rice yield of Heilongjiang province in 2010

Table 5 shows that Yichun has the smallest prediction error, 0.26 %, and Heihe the largest, 23.86 %. The average prediction error over all cities is 4.5 %.
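The paper does not state the error formula. A common choice, assumed here, is the absolute relative error \( |\hat{y} - y| / y \) in percent for each city, averaged over cities. The sketch below uses made-up yield pairs, not the values of Table 5, and the function names are illustrative.

```python
def relative_error_pct(actual, predicted):
    """Absolute relative error in percent (assumed definition, not stated in the paper)."""
    return abs(predicted - actual) / actual * 100.0


def mean_relative_error_pct(pairs):
    """Average of the per-city errors; pairs is a list of (actual, predicted)."""
    errors = [relative_error_pct(a, p) for a, p in pairs]
    return sum(errors) / len(errors)


# Hypothetical (actual, predicted) yields for two cities
pairs = [(100.0, 95.0), (200.0, 210.0)]
print([relative_error_pct(a, p) for a, p in pairs])
print(mean_relative_error_pct(pairs))
```

Under this definition, an average error of 4.5 % corresponds to an average accuracy of 95.5 %.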

6 Conclusions

This paper establishes a time prediction model of rice production in Heilongjiang province using the ARIMA model and GIS technology. With this model, the predicted rice yield of Heilongjiang province in 2010 is obtained, together with the spatial distributions of actual production, predicted production and prediction error. The analysis shows that the average prediction accuracy of the model exceeds 95 %, so the model can be used for rice yield prediction and provides a scientific reference for the overall planning and decision-making of government departments.

However, the prediction error of this model for the rice yield of Heihe reaches a maximum of 23.86 %, which indicates that the model does not yet take all relevant factors into account and that further research is required.