Chapter 7

Forecasting

7.1 Introduction
Forecasting refers to the process of using statistical procedures to predict future values of a time
series based on its historical behaviour. Forecasting the future values of an observed time series is an
important problem in many areas, including economics, production planning, sales forecasting
and stock control. Forecasting problems are often classified as short-term, medium-term, and
long-term. Short-term forecasting problems involve predicting events only a few time periods
(days, weeks, months) into the future; medium-term forecasts extend from one to two years into
the future; and long-term forecasting problems can extend many years beyond that.

Suppose we have an observed time series Y1, Y2, …, YT. The basic problem is then to estimate
future values such as YT+k, where k = 1, 2, 3, …; the integer k is called the lead time or
forecasting horizon. The forecast of YT+k made at time T, for k steps ahead, is typically denoted by ŶT+k
or Ŷ(T+k). A wide variety of forecasting procedures is available, and it is important to
realize that no single method is universally applicable.
Forecasting methods may be broadly classified into three groups as follows:

1. Subjective
Forecasts can be made on a subjective basis using judgment, intuition, commercial knowledge
and any other relevant information. One such approach asks a group of forecasters to arrive at a
consensus forecast, with controlled feedback of the other analysts' predictions and
opinions as well as other relevant information. These methods will not be described here, as most
statisticians will want their forecasts to be at least partly objective. However, note that some
subjective judgment is often used even in a more statistical approach, for example, to choose an
appropriate model and perhaps make adjustments to the resulting forecasts.
2. Univariate
Forecasts of a given variable are based on a model fitted only to present and past observations of
that time series, so that ŶT+k depends only on the values of YT, YT−1, YT−2, …, possibly
augmented by a simple function of time, such as a global linear trend. This would mean, for
example, that univariate forecasts of the future sales of a given product would be based entirely
on past sales, and would not take account of other economic factors. Methods of this type are
sometimes called naive or projection methods.
3. Multivariate
Forecasts of a given variable depend at least partly on values of one or more additional series,
called predictor or explanatory variables. For example, sales forecasts may depend on stocks
and/or on economic indices. Models of this type are sometimes called causal models.
In practice, a forecasting procedure may involve a combination of the above approaches. For
example, marketing forecasts are often made by combining statistical predictions with the
subjective knowledge and insight of people involved in the market. A more formal type of
combination is to compute a weighted average of two or more objective forecasts, as this often
proves superior on average to the individual forecasts.

Applications of Time Series Forecasting

Forecasting has a wide range of applications across industries, including weather forecasting,
climate forecasting, economic forecasting, healthcare forecasting, engineering forecasting,
finance forecasting, retail forecasting, business forecasting, environmental forecasting, social
forecasting, and more. Essentially, anyone who has consistent historical data can analyze that
data with time series analysis methods and then model, forecast, and predict. For some
industries, the entire point of time series analysis is to facilitate forecasting.

Examples of time series forecasting

Here are several examples from a range of industries to make the notions of time series analysis
and forecasting more concrete:

 Forecasting the closing price of a stock each day.
 Forecasting product sales in units sold each day for a store.
 Forecasting unemployment for a state each quarter.
 Forecasting the average price of gasoline each day.

7.2 Some Forecasting Methods
Average Methods
A. The Mean
When we forecast using the mean, the data must be stationary with stable variance; in other
words, the data must have no trend and no seasonality. Given data covering the time periods
t = 1, 2, …, T, the forecast of the observation in some future period T+k is

ŶT+k = Ȳ = (Y1 + Y2 + … + YT)/T.
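As a minimal sketch in Python, using the ten observations from the moving-average example below, the mean forecast is simply the historical average:

```python
# Mean forecast: every future period k >= 1 gets the same value, the historical average.
y = [14, 15, 10, 14, 17, 12, 15, 11, 12, 19]  # observed series Y1..YT

y_bar = sum(y) / len(y)   # Ybar = (Y1 + ... + YT) / T
forecast = y_bar          # Yhat_{T+k} = Ybar for every k >= 1
print(round(forecast, 2))  # → 13.9
```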

B. Simple Moving Averages


Given historical data and a decision to use the most recent N observations for each average, the
moving average at time T is MT = (YT + YT−1 + … + YT−N+1)/N, and the forecast for some
future period T+k becomes ŶT+k = MT.
Example: consider the following data and use the three-period simple moving average to produce
one-period-ahead forecasts.

t    1  2  3  4  5  6  7  8  9  10
Yt   14 15 10 14 17 12 15 11 12 19

Solution: the trailing three-period moving average Mt = (Yt + Yt−1 + Yt−2)/3 is available from
t = 3 onwards, and the one-period-ahead forecast is Ŷt+1 = Mt.

t    1  2  3     4     5     6     7     8     9     10
3MA  -  -  13.00 13.00 13.67 14.33 14.67 12.67 12.67 14.00
Ŷt   -  -  -     13.00 13.00 13.67 14.33 14.67 12.67 12.67

Therefore the forecast for period 11 is Ŷ10+1 = Ŷ11 = M10 = (11 + 12 + 19)/3 = 14.00.
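The worked example above can be reproduced in a few lines:

```python
# Trailing three-period moving average and one-step-ahead forecast
# for the worked example above.
y = [14, 15, 10, 14, 17, 12, 15, 11, 12, 19]
N = 3

# M_t (1-based t) is defined from t = N onwards: the mean of the N most recent values.
m = {t: sum(y[t - N:t]) / N for t in range(N, len(y) + 1)}

forecast_11 = m[10]           # Yhat_11 = M_10 = (11 + 12 + 19) / 3
print(round(forecast_11, 2))  # → 14.0
```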

7.3. Box-Jenkins Method


Here we will see the forecasting procedure based on ARIMA models which is usually known as
the Box-Jenkins Approach. A forecast is obtained by taking expectation at origin T of the model
written at time T+k. A major contribution of Box and Jenkins has been to provide a general
strategy for time-series forecasting, which emphasizes the importance of identifying an
appropriate model in an iterative way.

For both seasonal and non-seasonal data, the adequacy of the fitted model should be checked by
what Box and Jenkins call ‘diagnostic checking’. This essentially consists of examining the
residuals from the fitted model to see whether there is any evidence of non-randomness. The

correlogram of the residuals is calculated and we can then see how many coefficients are
significantly different from zero and whether any further terms are indicated for the ARIMA
model. If the fitted model appears to be inadequate then alternative ARIMA models may be tried
until a satisfactory one is found. When a satisfactory model is found, forecasts may readily be
computed. Given data up to time T, these forecasts will involve the observations and the fitted
residuals (i.e. the one-step-ahead forecast errors) up to and including time T. The standard
criterion used in obtaining the best forecast is the mean squared error, for which the expected
value of the squared forecast errors, E[(YT+k − ŶT+k)²] = E[(εT+k)²], is minimized. The minimum
mean square error forecast of YT+k at time T is the conditional expectation of YT+k given the data
up to time T, namely ŶT+k = E(YT+k | YT, YT−1, YT−2, …). In evaluating this conditional
expectation, we use the fact that the 'best' forecast of all future ε's is simply zero (more formally,
the conditional expectation of εT+k, given the series up to time T, is zero for all k > 0).

Forecasts with AR (1) Process


For this process, it holds that Yt = μ + 𝜙Yt−1 + εt, with |𝜙| < 1. The optimal k-step forecast is

the conditional mean of YT+k, i.e.

E[YT+k] = E[𝜇̂ + 𝜙YT+k−1 + εT+k] = 𝜇̂ + 𝜙E[YT+k−1] + 0.

We get the following first-order difference equation for the prediction function, which can be
solved recursively:

 E[YT+1] = ŶT+1 = E[𝜇̂ + 𝜙YT + εT+1] = 𝜇̂ + 𝜙E[YT] = 𝜇̂ + 𝜙YT.

 E[YT+2] = ŶT+2 = E[𝜇̂ + 𝜙YT+1 + εT+2] = 𝜇̂ + 𝜙E[YT+1] = 𝜇̂ + 𝜙ŶT+1.

 In general, E[YT+k] = ŶT+k = 𝜇̂ + 𝜙ŶT+k−1, for k ≥ 1.
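The recursion above can be sketched in a few lines of code; the parameter values here are hypothetical, chosen only to illustrate the mechanics:

```python
# Recursive k-step forecast for an AR(1) model: Yhat_{T+k} = mu + phi * Yhat_{T+k-1},
# starting the recursion from the last observed value Y_T.
def ar1_forecast(mu, phi, y_T, k):
    y_hat = y_T
    for _ in range(k):
        y_hat = mu + phi * y_hat
    return y_hat

# Hypothetical values: mu = 5, phi = 0.6, last observation Y_T = 12.
print(round(ar1_forecast(5, 0.6, 12, 1), 2))  # → 12.2
```

As k grows, the forecasts converge to the process mean mu/(1 − phi) = 12.5, which is the AR(1) analogue of the "forecast reverts to the mean" behaviour discussed for MA models below.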

Forecasts with AR(2) Process


For this process, it holds that Yt = μ + 𝜙1Yt−1 + 𝜙2Yt−2 + εt, with 𝜙1 + 𝜙2 < 1, 𝜙2 − 𝜙1 < 1

and |𝜙2| < 1. The optimal k-step forecast is the conditional mean of YT+k, i.e.

E[YT+k] = E[𝜇̂ + 𝜙1YT+k−1 + 𝜙2YT+k−2 + εT+k] = 𝜇̂ + 𝜙1E[YT+k−1] + 𝜙2E[YT+k−2] + 0.

We get the following second-order difference equation for the prediction function, which can be
solved recursively:

 E[YT+1] = ŶT+1 = E[𝜇̂ + 𝜙1YT + 𝜙2YT−1 + εT+1] = 𝜇̂ + 𝜙1E[YT] + 𝜙2E[YT−1]
= 𝜇̂ + 𝜙1YT + 𝜙2YT−1.

 E[YT+2] = ŶT+2 = E[𝜇̂ + 𝜙1YT+1 + 𝜙2YT + εT+2] = 𝜇̂ + 𝜙1E[YT+1] + 𝜙2E[YT]
= 𝜇̂ + 𝜙1ŶT+1 + 𝜙2YT.

 E[YT+3] = ŶT+3 = E[𝜇̂ + 𝜙1YT+2 + 𝜙2YT+1 + εT+3] = 𝜇̂ + 𝜙1E[YT+2] + 𝜙2E[YT+1]
= 𝜇̂ + 𝜙1ŶT+2 + 𝜙2ŶT+1.

In general, E[YT+k] = ŶT+k = 𝜇̂ + 𝜙1ŶT+k−1 + 𝜙2ŶT+k−2, for k ≥ 3.

Forecasts with AR(p) Process

Starting with the representation Yt = μ + 𝜙1Yt−1 + 𝜙2Yt−2 + 𝜙3Yt−3 + … + 𝜙pYt−p + εt,

the conditional mean of YT+k is given by:

E[YT+k] = ŶT+k = 𝜇̂ + 𝜙1E[YT+k−1] + … + 𝜙pE[YT+k−p] + 0.

Thus, the above equation can be solved recursively:

E[YT+1] = ŶT+1 = 𝜇̂ + 𝜙1YT + 𝜙2YT−1 + … + 𝜙pYT+1−p.
E[YT+2] = ŶT+2 = 𝜇̂ + 𝜙1ŶT+1 + 𝜙2YT + 𝜙3YT−1 + … + 𝜙pYT+2−p, etc.
Examples
1. A time series model has been fitted to some historical data, yielding Yt = 25 + 0.34Yt−1 + εt.
Suppose that at time T = 100 the observation is Y100 = 28.
a. Determine forecasts for periods 101, 102, 103, etc.
b. Suppose Y101 = 32; revise your forecasts for periods 102, 103, 104, …, using period
101 as the new origin of time T.
2. The following AR(2) model has been fitted to some historical series:
Yt = 21 + 0.27Yt−1 + 0.41Yt−2 + εt. At time T = 104, determine forecasts for
periods 105, 106, 107, …
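As a check, Example 1(a) can be worked numerically with the AR(1) recursion (the model and starting value come from the example itself):

```python
# Example 1(a): Y_t = 25 + 0.34*Y_{t-1} + eps_t, with Y_100 = 28.
mu, phi = 25.0, 0.34
y_hat = 28.0  # Y_100, the forecast origin

forecasts = {}
for period in (101, 102, 103):
    y_hat = mu + phi * y_hat          # Yhat_{T+k} = mu + phi * Yhat_{T+k-1}
    forecasts[period] = round(y_hat, 4)

print(forecasts)  # → {101: 34.52, 102: 36.7368, 103: 37.4905}
```

As the lead time grows, the forecasts approach the process mean 25/(1 − 0.34) ≈ 37.88.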

Forecasts with MA(1) Process

For this process, it holds that Yt = μ + εt − θ1εt−1, with |θ1| < 1. The conditional mean of YT+k is

E[YT+k] = ŶT+k = 𝜇̂ + E[εT+k] − 𝜃1E[εT+k−1] = 𝜇̂ − 𝜃1e1(T+k−1),

where e1(t) denotes the one-step-ahead forecast error at time t, whose conditional expectation is
zero for all future periods.

 For k = 1, this leads to ŶT+1 = 𝜇̂ − 𝜃1e1(T), and
 for k ≥ 2, we get ŶT+k = 𝜇̂, i.e. the unconditional mean is the optimal forecast of YT+k, k = 2, 3, …

Forecasts With MA(2) Process
For this process, it holds that Yt = μ + εt − θ1εt−1 − θ2εt−2, with θ1 + θ2 < 1, θ2 − θ1 < 1 and
|θ2| < 1 (the invertibility conditions).

The conditional mean of YT+k is

 E[YT+k] = ŶT+k = 𝜇̂ + E[εT+k] − 𝜃1E[εT+k−1] − 𝜃2E[εT+k−2] = 𝜇̂ − 𝜃1e1(T+k−1) − 𝜃2e2(T+k−2).

This leads to:
 for k = 1, E[YT+1] = ŶT+1 = 𝜇̂ − 𝜃1e1(T) − 𝜃2e2(T−1),

 for k = 2, ŶT+2 = 𝜇̂ − 𝜃2e2(T) and

 for k ≥ 3, we get ŶT+k = 𝜇̂,

i.e. the unconditional mean is the optimal forecast of YT+k, k = 3, 4, …
Similarly, it can be shown that, after q forecast steps, the optimal forecasts of an invertible
MA(q) process, q > 1, are equal to the unconditional mean of the process. The forecasts in
observable terms are represented similarly to those of the MA(1) process.
Examples
1. An MA(1) model has been fitted to some historical data with 200 observations:

Yt = 10 + εt − 0.3εt−1. If the last observation and the last forecast error are given as 19
and −0.45, find forecasts for periods 201, 202, 203, …
2. An MA(2) model has been fitted to some historical data:

Yt = 20 + εt + 0.45εt−1 − 0.35εt−2. If the first four observations are 17.5, 21.36, 18.24 and
16.91, respectively, find forecasts for periods 5, 6, 7, …
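A sketch of Example 1: for an MA(1) model only the last one-step-ahead forecast error enters the first forecast, and every later forecast equals the mean.

```python
# Example 1: Y_t = 10 + eps_t - 0.3*eps_{t-1}, with last forecast error e(200) = -0.45.
mu, theta1 = 10.0, 0.3
last_error = -0.45

f_201 = mu - theta1 * last_error  # Yhat_201 = mu - theta1 * e(200)
f_202 = mu                        # for k >= 2 the forecast is the unconditional mean
print(round(f_201, 3), f_202)     # → 10.135 10.0
```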
Forecasts With ARMA(p, q) Processes
Forecasts for ARMA(p, q) process result from combining the approaches of pure AR and MA
processes. Thus, for instance, the one-step ahead forecast for a stationary and invertible

ARMA(1, 1) process Yt = μ + 𝜙Yt−1 + εt − θ1εt−1 is given by:

E[YT+k] = ŶT+k = 𝜇̂ + 𝜙E[YT+k−1] − 𝜃1E[εT+k−1] = 𝜇̂ + 𝜙ŶT+k−1 − 𝜃1e1(T+k−1).

 For k = 1, ŶT+1 = 𝜇̂ + 𝜙YT − 𝜃1e1(T),

 for k = 2, ŶT+2 = 𝜇̂ + 𝜙ŶT+1 and

 in general, for k > 1, ŶT+k = 𝜇̂ + 𝜙ŶT+k−1.

Example: An ARMA(1, 1) model has been fitted to some historical data having 100 observations:

Yt = 0.8Yt−1 + εt − 0.5εt−1. If the last observation and the last forecast error are given as 91

and −0.54, find forecasts for periods 101, 102, 103, …
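A sketch of this example: the first forecast uses both the last observation and the last forecast error; from the second step on, only the AR part remains.

```python
# ARMA(1,1): Y_t = 0.8*Y_{t-1} + eps_t - 0.5*eps_{t-1} (no constant term here).
phi, theta1 = 0.8, 0.5
y_100, e_100 = 91.0, -0.54  # last observation and last forecast error

f_101 = phi * y_100 - theta1 * e_100     # 0.8*91 - 0.5*(-0.54)
f_102 = phi * f_101                      # for k >= 2: Yhat = phi * previous forecast
print(round(f_101, 2), round(f_102, 3))  # → 73.07 58.456
```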
7.4 The Accuracy of Forecasting Methods
Forecasts can be evaluated when the realized values are available. There are many kinds of
measures to do this. Quite often, only graphs and/or scatter diagrams of the predicted values and
the corresponding observed values of a time series are plotted. Intuitively, a forecast is ‘good’ if
the predicted values describe the development of the series in the graphs relatively well or if the
points in the scatter diagram are concentrated around the bisecting line in the first and/or third
quadrant.
On the other hand, simple descriptive measures, which are often employed to evaluate the
performance of forecasts, are based on the average values of the forecast errors over the forecast
horizon. The forecast error for a particular forecast Ŷt with respect to the actual value Yt is

et = Yt − Ŷt.

The simple arithmetic mean of the errors, (1/n) Σt=1..n et, indicates whether the values of the
variable are, on average, over- or underestimated.

However, the disadvantage of this measure is that large over- and underestimates cancel each
other out. The absolute error |et| = |Yt − Ŷt| is often used to avoid this effect. Hence, we can
define a measure known as the mean absolute error (MAE) as:

MAE = (1/n) Σt=1..n |et| = (1/n) Σt=1..n |Yt − Ŷt|.

Every forecast error gets the same weight in this measure. The root mean squared error (RMSE)
is often used to give particularly large errors a stronger weight:

RMSE = sqrt( (1/n) Σt=1..n et² ) = sqrt( (1/n) Σt=1..n (Yt − Ŷt)² ).

Another method is to use the mean squared error (MSE), defined as follows:

MSE = (1/n) Σt=1..n et² = (1/n) Σt=1..n (Yt − Ŷt)².

These measures are not normalized, i.e. their size depends on the scale of the data.
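These measures can be sketched directly from their definitions; the actual and forecast values here are taken from the Theil's U example later in this section.

```python
from math import sqrt

# Forecast-accuracy measures computed from their definitions.
y     = [22, 23, 39, 37, 38, 47, 43, 49, 61, 63]  # actual values
y_hat = [24, 28, 32, 36, 40, 44, 48, 52, 56, 60]  # forecasts

errors = [a - f for a, f in zip(y, y_hat)]  # e_t = Y_t - Yhat_t
n = len(errors)

mae  = sum(abs(e) for e in errors) / n      # mean absolute error
mse  = sum(e * e for e in errors) / n       # mean squared error
rmse = sqrt(mse)                            # root mean squared error
print(round(mae, 2), round(mse, 2), round(rmse, 2))  # → 3.6 16.0 4.0
```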

The inequality measure proposed by Henri Theil (1961) avoids this problem by comparing the
actual forecasts with so-called naïve forecasts, i.e. forecasts equal to the last available
observation. This measure of forecast accuracy is the U-statistic, known as Theil's U. It is
defined as:

U = sqrt( Σt=1..n−1 ((Ŷt+1 − Yt+1)/Yt)² / Σt=1..n−1 ((Yt+1 − Yt)/Yt)² ),

where the numerator sums the squared relative forecast errors and the denominator sums the
squared relative actual changes.

Here the observation at time t is used as the naïve forecast of the observation at time t+1.
This is the naïve method.
Interpretation:
U = 1: the naïve method is as good as the forecasting method being used;
U < 1: the forecasting method being used is better than the naïve method;
U > 1: there is no point in using the forecasting method under consideration, since it is no
better than the naïve method.
Example: the following table shows the actual values and the corresponding forecasts using
some forecasting method. Compute the U-statistic and comment on the forecast accuracy.

t    1  2  3  4  5  6  7  8  9  10
Yt   22 23 39 37 38 47 43 49 61 63
Ŷt   24 28 32 36 40 44 48 52 56 60

Solution:
t            1     2     3     4     5     6     7     8     9     10  Sum
Numerator    0.052 0.093 0.001 0.003 0.006 0.011 0.005 0.010 0.002 -   0.184
Denominator  0.002 0.494 0.003 0.001 0.056 0.007 0.019 0.060 0.001 -   0.633

Then U = sqrt(0.184/0.633) = 0.54 < 1.

This tells us the forecasting method used is better than the naïve method.
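Theil's U for the example above can be sketched as:

```python
from math import sqrt

# Theil's U-statistic: squared relative forecast errors vs. squared relative
# actual changes (the naive, last-value forecast).
y     = [22, 23, 39, 37, 38, 47, 43, 49, 61, 63]  # actual values
y_hat = [24, 28, 32, 36, 40, 44, 48, 52, 56, 60]  # forecasts

num = sum(((y_hat[t + 1] - y[t + 1]) / y[t]) ** 2 for t in range(len(y) - 1))
den = sum(((y[t + 1] - y[t]) / y[t]) ** 2 for t in range(len(y) - 1))

u = sqrt(num / den)
print(round(u, 2))  # → 0.54
```

Since U < 1, the method beats the naïve forecast, matching the hand calculation.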
Exercises for chapters 6 to 8
1. What are the three parameters in ARIMA model?
2. Write the Box-Jenkins modeling stages.

3. Find the MA representation of the first-order AR process Yt = 0.3Yt−1 + εt.

4. Consider the time series process Yt = ¾Yt−1 − ½Yt−2 + εt.

(a) Verify that the process is second-order stationary.

(b) What does the graph of the ACF look like? And find 𝜙 of the process.
(c) Find cov(Yt, εt) and cov(Yt, εt−1).
5. Derive an ARMA process for (1 − 0.3B − 0.7B²)∇Yt = εt and determine the variance of
the process if σε² = 2.
6. Suppose that the first 7 autocorrelation (AC) and partial autocorrelation (PAC) values of a
time series consisting of 108 observations are given in the table below.

lag      1     2     3     4     5     6     7
rk       0.95  0.91  0.86  0.81  0.78  0.75  0.73
se(rk)   0.167 0.161 0.203 0.235 0.259 0.280 0.298
𝜙kk      0.95  0.09  -0.08 -0.10 0.13  0.10  0.01
se(𝜙kk)  0.096 0.096 0.096 0.096 0.096 0.096 0.096

(a) Suggest an ARMA model which may be appropriate. Why?

(b) Estimate the model parameter(s) and fit the model.
(c) If the variance of the series is σy² = 5, then estimate the standard deviation of
the error term of this model.

7. Suppose that the first 10 autocorrelations of the residuals of a fitted model, based on 100
observations, are r1 = 0.31, r2 = 0.37, r3 = −0.05, r4 = 0.06, r5 = −0.21, r6 = 0.11, r7 = 0.08,
r8 = 0.05, r9 = 0.12 and r10 = −0.01. Test model adequacy on an individual basis
and with the portmanteau lack-of-fit test.

8. Consider the MA(1) model Yt = 40 + 0.4εt−1 + εt. Assume that the variance of the

white noise process is σε² = 2.

a. Find the variance for this model.
b. Find and plot the first few ACF and PACF values for this model.
9. Assume that you have fitted a model for a time series as Yt = 0.9Yt−1 + 0.7εt−1 − 0.2εt−2 + εt,

and suppose that you are at the end of time period T = 10.
(a) What is the equation for forecasting the time series in periods 11, 12 and 13?