Econometrics of Financial Data
Martin Tegner
Mathematical Institute
University of Oxford
June 21, 2019
The main goals of this course
2/146
Suggested literature
3/146
Outline
Introduction to R
Empirical properties of financial data
Kernel estimators
Test for normality, QQ plots
Regressions
A way of analysing the dependency between variables
Linear and non-linear time series
Stationarity
Autocorrelation functions
Modelling
4/146
Introduction to R
https://github.com/martnj/R-econometrics
5/146
R - introduction
# Creating vectors (data structure which contains objects of the same mode)
x <- c(1,2,3,4,5) # 'c' for concatenate, gives *numeric* vector 'x'
x # print out 'x'
length(x)
(y <- 1:5) # define vector and *print* output via '()'
(y[6] <- 6) # append to a vector (alternative: y <- c(y, 6))
numeric(5) # numeric vector of length 5, filled with zeros (the default)
6/146
R - introduction
# Some functions
(x <- c(3,4,2))
rev(x) # reverse order
sort(x) # sort in increasing order
sort(x, decreasing=TRUE)
(idx <- order(x)) # create indices that sort x
x[idx] # => sorted
log(x) # (component-wise) logarithms
x^2 # (component-wise) squares
exp(x)
sum(x)
cumsum(x)
prod(x)
# Sequences
seq(from=1,to=7,by=2)
seq(from=1,to=100,length.out=25)
rep(1:3, each=3, times=2)
# Missing values
z <- 1:3; z[5] <- 4 # two statements in one line (';'-separated)
z # 'not available' (NA)
c(z, 0/0) # 0/0, 0*Inf, Inf-Inf lead to 'not a number' (NaN)
class(NaN) # not a number, but still of mode 'numeric'
class(NA)
7/146
R - introduction
# Matrices
(A <- matrix(1:9, ncol=3)) # NB: fills the matrix by *columns*
(A <- matrix(1:9, ncol=3, byrow=TRUE)) # fill row-wise
8/146
R - introduction
# Reproducibility:
# Set a 'seed'
X <- rnorm(2) # draw two N(0,1) random variates
Y <- rnorm(2)
X==Y # obviously not equal (here: with probability 1)
set.seed(10) # with set.seed() we can set the seed
X <- rnorm(2)
set.seed(10) # set the same seed again
Y <- rnorm(2)
all.equal(X, Y) # => TRUE
9/146
R - introduction
10/146
R - introduction
install.packages("quantmod") # download/install
install.packages(c("TTR","xts","zoo")) # packages required for quantmod
library(quantmod) # load the package
?`quantmod-package`
getSymbols("AAPL") # fetch the AAPL price data before computing returns
# Log-returns
rtn = diff(log(AAPL$AAPL.Close))
chartSeries(rtn,theme="white")
11/146
Probability background
12/146
Randomness
13/146
Randomness
14/146
Distributional properties
Marginal distribution
FX(x) = FX,Y(x, ∞),   fX(x) = ∫ fX,Y(x, y) dy   (integrating y over (−∞, ∞))

xq = inf{x : q ≤ FX(x)}
Conditional distribution
FX|Y≤y(x) = P(X ≤ x, Y ≤ y) / P(Y ≤ y),   fX|Y(x) = fX,Y(x, y) / fY(y)
15/146
Distributional properties
p-th moment
E[X^p] = ∫R x^p f(x) dx   (< ∞ ⟺ X ∈ L^p)
Skewness (symmetry)
SX = E[(X − µX)³ / σX³]
Kurtosis (tail behaviour)
KX = E[(X − µX)⁴ / σX⁴]
16/146
Conditional expectation
ξ̂ = E[ξ | X]

ξ̂ = α0 + α1 X1 + · · · + αn Xn
17/146
Empirical properties of financial data
18/146
Histogram
19/146
Kernel density estimator
class(SP500)
names(SP500)
plot(SP500$r500,type="l")
hist(SP500$r500)
hist(SP500$r500,breaks=100)
plot(density(SP500$r500))
install.packages("fBasics")
library(fBasics)
basicStats(SP500)
21/146
R - output of basicStats(SP500)
r500
nobs 2783.000000 # Sample size
NAs 0.000000 # No of missing values
Minimum -0.228006
Maximum 0.087089
1. Quartile -0.004845
3. Quartile 0.005766
Mean 0.000418
Median 0.000536
Sum 1.163571
SE Mean 0.000206 # Standard error of sample mean
# SE:= (standard deviation)/sqrt(nobs)
LCL Mean 0.000014 # Lower bound of 95% C.I.
UCL Mean 0.000822
Variance 0.000118
Stdev 0.010863
Skewness -3.495673
Kurtosis 74.400163
22/146
Hypothesis testing
23/146
Hypothesis testing - example
Reject H0 if |t| ≥ q1−α/2, the (1 − α/2)-quantile of the reference distribution
Jarque-Bera test
H0 : the data {Y1, . . . , Yn} come from a normal distribution
25/146
R - test for normality
normalTest(SP500$r500,method='jb') # JB-test
# STATISTIC:
# X-squared: 648508.6002
# P VALUE:
# Asymptotic p Value: < 2.2e-16 # Reject normality
hist(SP500$r500,nclass=30) # Histogram
d1 = density(SP500$r500) # Obtain density estimate
range(SP500$r500) # Range of SP500 returns
x = seq(-.25,.1,by=.001)
y1 = dnorm(x,mean(SP500$r500),stdev(SP500$r500))
plot(d1$x,d1$y,xlab="return",ylab="density",type="l")
lines(x,y1,lty=2)
plot(d1$x,d1$y,xlab="return",ylab="density",type="l",xlim=c(-0.05,.05))
lines(x,y1,lty=2)
26/146
Q-Q plots
or more generally
πq = inf{x : q ≤ FX (x)}
27/146
Q-Q plots
Empirical distribution

F̂n(x) = (1/n) Σ_{i=1}^n 1{xi ≤ x}

The k-th order statistic x(k),n is a sample quantile:

x(k),n = π̂q   for   (k − 1)/n < q ≤ k/n

since the sample quantile is

π̂q = inf{x : q ≤ F̂n(x) = (1/n) Σ_{i=1}^n 1{xi ≤ x}}
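A quick sketch (not slide code): the sample quantile π̂q defined this way is the k-th order statistic with k = ⌈nq⌉, which is what R's `quantile(..., type=1)` (the inverse-ECDF definition) returns.

```r
# Sample quantile as inf{x : q <= F-hat_n(x)} vs. quantile(..., type=1)
set.seed(6)
x <- rnorm(100)
q <- 0.3
k <- ceiling(length(x) * q)             # smallest k with q <= k/n
manual <- sort(x)[k]                    # x_(k),n
builtin <- unname(quantile(x, probs = q, type = 1))
```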
28/146
Normal probability plots
Φ((πq − µ)/σ) = P((X − µ)/σ ≤ (πq − µ)/σ) = q   ⟹   (πq − µ)/σ = Φ⁻¹(q).
Hence, plotting (x(k),n − µ̂)/σ̂ versus Φ−1 (k/(n + 1)) we can
test for normality
29/146
R - Q-Q Plot
30/146
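The slide's code is not reproduced here; a minimal sketch of a normal probability plot built from the relation on the previous slide, on simulated heavy-tailed data:

```r
# Normal probability plot: standardised order statistics vs. Phi^{-1}(k/(n+1))
set.seed(42)
x <- rt(500, df = 4)                    # heavy-tailed sample
n <- length(x)
emp  <- (sort(x) - mean(x)) / sd(x)     # (x_(k),n - mu-hat)/sigma-hat
theo <- qnorm((1:n) / (n + 1))          # Phi^{-1}(k/(n+1))
plot(theo, emp, xlab = "theoretical quantiles",
     ylab = "standardised order statistics")
abline(0, 1, lty = 2)                   # heavy tails bend away from this line
# built-in equivalent: qqnorm(x); qqline(x)
```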
Linear regression
31/146
Linear regression
Y , output variable
X = (X1 , . . . , Xp ), input variables
How does Y relate to X ?
Model assumption:
Y = β0 + β1 X1 + · · · + βp Xp + ε
32/146
Regression coefficients
Model
Y = β0 + β1 X1 + · · · + βp Xp + ε
β0 : intercept
βj : slope
βj = ∂/∂xj E[Y | X1, . . . , Xp].
33/146
Regression - assumptions
E [Y |X ] = β0 + β1 X1 + . . . + βp Xp
34/146
Least squares estimation - 1D input variable
Yi = β0 + β1 Xi + εi
Solution
β̂1 = Σ_{i=1}^n yi (xi − x̄) / Σ_{i=1}^n (xi − x̄)²

β̂0 = ȳ − β̂1 x̄
Yi = β0 + β1 Xi1 + · · · + βp Xip + εi
36/146
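The closed-form one-dimensional least-squares solution above can be checked against `lm()`; a sketch on simulated data (β0 = 1, β1 = 2 are illustrative values):

```r
# Closed-form 1D least squares vs. lm()
set.seed(3)
x <- rnorm(100)
y <- 1 + 2 * x + rnorm(100, sd = 0.5)
b1 <- sum(y * (x - mean(x))) / sum((x - mean(x))^2)  # beta-hat_1
b0 <- mean(y) - b1 * mean(x)                         # beta-hat_0
fit <- unname(coef(lm(y ~ x)))                       # (intercept, slope)
```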
Sampling properties of β̂
Unbiased estimate with variance
37/146
Sampling properties of β̂
z(1−α/2) and χ²p−1(1 − α/2) are the (1 − α/2)-quantiles of the
normal and chi-square distributions respectively (intervals
approximately correct even if errors are non-Gaussian)
The confidence region for β generates a corresponding
confidence region for the true regression function
f(x) = (1, x)ᵀβ, namely

Cf(x) = { (1, x)ᵀβ : β ∈ Cβ }
38/146
Inference about model
Gives a way of drawing inference about β
Consider the null hypothesis βj = 0. Then
tj = β̂j / (σ̂ √vj),   vj = j-th diagonal element of (XᵀX)⁻¹

F = [ (RSS0 − RSS)/(p − p0) ] / [ RSS/(n − p − 1) ]
spx = SP500$r500
y = spx[-(1:2)]
X = cbind(rep(1,length(spx)-2),spx[-c(1,length(spx))],
spx[-c(length(spx)-1,length(spx))] )
install.packages("scatterplot3d")
library(scatterplot3d)
pr = par()
scatterplot3d(X[,2],X[,3],y)
par(pr)
m1 <- lm(y~X-1)
summary(m1)
X = cbind(rep(1,length(spx)-2),spx[-c(1,length(spx))] )
plot(X[,2],y)
m2 <- lm(y~X-1)
summary(m2)
40/146
R - Linear regression
beta_hat = solve(t(X)%*%X, t(X)%*%y)
sig2_eps = sum((y - X%*%beta_hat)^2)/(nrow(X)-3)
beta_cov = solve(t(X)%*%X)*sig2_eps # matrix inverse via solve(); 'inv()' is not base R
for(i in 1:1000){
beta_rnd = beta_hat + t(chol(beta_cov))%*%rnorm(length(beta_hat)) # t(chol()) so the draw has covariance beta_cov
abline(beta_rnd,col=grey(0.7))
}
41/146
Model selection - information criteria
Akaike Information Criterion (AIC) is defined as
AIC = −(2/N) log(max likelihood) + (2/N) × (number of parameters)
Recall the likelihood L(θ|y) = P(y|θ)
AIC = −(2/N) Σ_{i=1}^N log P(yi | θ̂) + 2d/N
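Note that R's `AIC()` returns the unnormalised form −2·log-likelihood + 2·(number of parameters), i.e. the quantity above multiplied by N; model rankings are the same. A quick check with `lm()` on simulated data:

```r
# Manual AIC (unnormalised form) vs. R's AIC() for a linear model
set.seed(12)
x <- rnorm(80)
y <- 1 + x + rnorm(80)
m <- lm(y ~ x)
k <- length(coef(m)) + 1                # +1 for the error variance
manual <- -2 * as.numeric(logLik(m)) + 2 * k
```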
42/146
Model quality - goodness of fit
43/146
Sums of squares and R²
Always: total SS = explained SS + residual SS
Define
R² = explained SS / total SS
Thus, R² ≤ 1 measures the proportion of the total variation of Y that
can be explained by the linear model
If residual SS = 0 ⟹ R² = 1 (i.e., zero prediction error)
If explained SS = 0 (i.e., ŷi = ȳ) there is no point in the regression model
The closer R² is to 1, the better
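A short sketch: computing R² from the sums of squares and checking it against `summary.lm` on simulated data.

```r
# R^2 = 1 - residual SS / total SS, checked against summary(m)$r.squared
set.seed(7)
x <- rnorm(50)
y <- 2 * x + rnorm(50)
m <- lm(y ~ x)
rss <- sum(residuals(m)^2)              # residual SS
tss <- sum((y - mean(y))^2)             # total SS
r2 <- 1 - rss / tss
```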
44/146
Part I: linear time series
45/146
Time series - definition
X = {Xt }t
46/146
Objective
47/146
Asset returns
Log return

rt = log(1 + Rt) = log(Pt / Pt−1) = log Pt − log Pt−1
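The two return definitions on a toy price path (hypothetical prices), verifying rt = log(1 + Rt):

```r
# Simple vs. log returns
p <- c(100, 101, 103, 100)        # hypothetical prices
R <- diff(p) / head(p, -1)        # simple returns R_t
r <- diff(log(p))                 # log returns r_t
```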
48/146
R - asset price vs. returns
49/146
Stylized facts of asset returns
50/146
Stylized facts of asset returns
Return time series are not i.i.d. although they show little
serial correlation
Series of absolute or squared returns (∼ variance) show
pronounced serial correlation
Conditional expected returns are close to zero
Extreme returns appear in clusters
Return series are leptokurtic (peaked) and heavy-tailed
(power-like tail)
51/146
Stylized facts of asset returns
[Figure: log-return time series (values roughly between −0.5 and 1.0) plotted over time]
52/146
Time series model - properties
µX(t) = E[Xt],   σX(t) = √Var[Xt]
Autocovariance function
γX(s, t) = Cov[Xs, Xt]
Autocorrelation function
ρX(s, t) = Corr[Xs, Xt] = γX(s, t) / (σX(s) σX(t))
Stationarity
“The same type of stochastic behaviour of a stochastic
process from one time period to another”
For instance, financial returns can vary, but their stochastic
properties (mean, std, ...) are often similar in each period
Attention: returns often show stationary behaviour, not asset
prices, which tend to increase over time
Important theme: transforming time series data to obtain
stationary behaviour =⇒ modelling in stationary domain
54/146
Strictly and weakly stationary processes
Definition
A stochastic process X = {Xt }t
is strictly stationary if, for any n, h ∈ N and any times t1, . . . , tn, the
vectors (Xt1, . . . , Xtn) and (Xt1+h, . . . , Xtn+h) have the same distribution
is weakly stationary if
i) E[Xt ] = µX for all t ≥ 0
ii) Var [Xt ] = σX2 < ∞ for all t ≥ 0
iii) γX (s + h, s) = γX (t + h, t) for all t, s, h (auto-covariance is a
function only of the time lag h)
55/146
Remarks
ρX(h) = Corr[Xh, X0] = γX(h) / γX(0),   h ∈ Z
56/146
Time average as statistical estimates
γ̂X(h) = (1/n) Σ_{i=1}^{n−|h|} (xi − µ̂X)(xi+|h| − µ̂X),   ρ̂X(h) = γ̂X(h) / γ̂X(0)
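A sketch checking the lag-1 estimate from this formula against R's `acf()` (which uses the same 1/n divisor and demeans by default):

```r
# Manual gamma-hat(1)/gamma-hat(0) vs. acf()
set.seed(5)
x <- rnorm(200)
n <- length(x); mu <- mean(x)
g0 <- sum((x - mu)^2) / n                          # gamma-hat(0)
g1 <- sum((x[1:(n-1)] - mu) * (x[2:n] - mu)) / n   # gamma-hat(1)
r1 <- g1 / g0                                      # rho-hat(1)
a1 <- acf(x, lag.max = 1, plot = FALSE)$acf[2]
```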
57/146
Autocorrelation function (ACF)
getSymbols("SPY",src="yahoo")
head(SPY)
plot(SPY$SPY.Close)
SPY.rtn = diff(log(SPY$SPY.Close))
plot(SPY.rtn)
acf(SPY.rtn[-1]) # plot autocorrelation function
acf(SPY.rtn[-1]^2) # x[-1] : skip the first element
58/146
The search for stationarity - decomposition
59/146
The search for stationarity - integration
∇Xt = Xt − Xt−1
60/146
The search for stationarity - integration
∇2 Xt = Xt − 2Xt−1 + Xt−2
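In R, `diff(x, differences = 2)` implements exactly this second difference; a quick check on a toy series:

```r
# Second difference: diff(x, differences = 2) vs. the formula above
set.seed(13)
x <- cumsum(cumsum(rnorm(10)))          # toy I(2)-like series
d2 <- diff(x, differences = 2)
manual <- x[3:10] - 2 * x[2:9] + x[1:8] # X_t - 2 X_{t-1} + X_{t-2}
```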
61/146
The search for stationarity - Box-Cox transformations
plot(diff(co2,differences = 2))
63/146
First example of time series model - white noise
# White noise
WN <- rnorm(1024,0,1)
ts.plot(WN)
acf(WN,40,"covariance")
acf(WN,40,"correlation")
65/146
Linear time series
Definition
A time series {Xt }t∈Z is said to be linear if it can be written as
Xt = µ + Σ_{i=0}^∞ Ψi Wt−i,

E[Xt] = µ,   Var[Xt] = σ² Σ_{i=0}^∞ Ψi².
66/146
Linear time series
Σ_{i=0}^∞ |Ψi| < ∞ implies that

E[ |Σ_{i=0}^∞ Ψi Wt−i| ] < ∞
67/146
Linear time series
68/146
Random walk
Xn = X0 + W1 + . . . + Wn .
E[Xn ] = E[X0 ]
Var[Xn ] = Var[X0 ] + nσ 2
Variance is changing with n =⇒ non-stationarity
But {Xn}n is I(1)
Example of a unit-root non-stationary time series.
69/146
R - random Walk
# Random Walk
WN <- rnorm(1024,0,1)
RW <- cumsum(WN)
acf(RW,40,"covariance")
acf(RW,40,"correlation")
acf(diff(RW),40,"correlation")
70/146
Auto regressive time series (AR model)
Definition
A mean-zero time series {Xn}n is AR(p) if

Xt = φ1 Xt−1 + · · · + φp Xt−p + Wt

where φi ∈ R
Subtract the sample mean from the data before trying to model
with (mean-zero) AR processes
71/146
AR(1) series
AR(1) series
Xt = φ1 Xt−1 + Wt
We can rewrite this iteratively as
72/146
AR(1) series
Var[Xt] = σ² / (1 − φ1²)
# AR(IMA) process
# order = c(p,d,q) where p is AR(p), d differencing,
# q is MA(q)
x = arima.sim(list(order=c(1,0,0),ar=0.9),n=1000)
ts.plot(x)
acf(x,40,type="correlation")
lines(0.9^(0:40),lty=2)
74/146
Estimation: Yule-Walker equations
(1 − φ1 B − φ2 B 2 )ρ(h) = 0
(1 − ω1 B)(1 − ω2 B)ρ(h) = 0
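A sketch of the estimation idea: solving the Yule-Walker system directly for an AR(2) from the sample ACF, compared with `ar(..., method = "yule-walker")` on simulated data.

```r
# Yule-Walker estimates for an AR(2), solved by hand vs. ar()
set.seed(11)
x <- arima.sim(list(ar = c(0.6, 0.2)), n = 5000)
r <- acf(x, lag.max = 2, plot = FALSE)$acf[2:3]   # rho-hat(1), rho-hat(2)
R <- matrix(c(1, r[1], r[1], 1), 2, 2)            # Toeplitz matrix of ACF values
phi <- solve(R, r)                                # Yule-Walker: R phi = rho
fit <- ar(x, order.max = 2, aic = FALSE, method = "yule-walker")
```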
77/146
Estimation of AR(p) process
ACF satisfies
78/146
Finding the order
79/146
Partial autocorrelation function
80/146
R - partial autocorrelation function
data = read.table("Data/q-gnp4710.txt",header=T)
head(data)
tail(data)
gnp = data$VALUE
gnp.r = diff(log(gnp))
tVec = seq(1947,2010,length.out=nrow(data)) # create time-index
plot(tVec,gnp,xlab='year',ylab='GNP',type="l")
plot(tVec[-1],gnp.r,type="l",xlab="year",ylab="growth"); abline(h=0)
acf(gnp.r,lag=12)
pacf(gnp.r,lag=12,ylim=c(-1,1))
81/146
R - AIC criteria for order determination
# AIC criteria
?ar # ‘ar’ also fits order ’p’!
m2 = ar(gnp.r,method=’mle’)
m2$order # Find the identified order
names(m2)
print(m2$aic,digits=3)
plot(c(0:12),m2$aic,type=’h’,xlab=’order’,ylab=’AIC’)
lines(0:12,m2$aic,lty=1)
82/146
Model checking - residuals
Xt = φ1 Xt−1 + . . . + φp Xt−p + Wt
ŵt = xt − x̂t
84/146
Ljung-Box
85/146
Ljung-Box
Ljung-Box test
Q(m) = T(T + 2) Σ_{i=1}^m ρ̂i² / (T − i).
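A sketch computing Q(m) from this definition and checking it against `Box.test(..., type="Ljung")` on simulated white noise:

```r
# Ljung-Box statistic from the formula vs. Box.test
set.seed(4)
x <- rnorm(500)
m <- 10; T <- length(x)
rho <- acf(x, lag.max = m, plot = FALSE)$acf[-1]  # rho-hat(1), ..., rho-hat(m)
Q <- T * (T + 2) * sum(rho^2 / (T - 1:m))
bt <- Box.test(x, lag = m, type = "Ljung")
```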
86/146
Ljung-Box
# Box-Ljung test
vw = read.table("Data/m-ibm3dx2608.txt",header=T)[,3]
plot(vw,type='l')
m3 = arima(vw-mean(vw),order=c(3,0,0)) # fit AR(3)-model
names(m3)
Box.test(m3$residuals,lag=12,type='Ljung')
pv = 1 - pchisq(16.352,9) # p-value with 9 dof = #lags - #params
pv # Small p-value => reject H0: white noise
m4 = arima(vw-mean(vw),order=c(3,0,0),fixed=c(NA,0,NA,NA)) # fix phi_2 = 0
m4
Box.test(m4$residuals,lag=12,type='Ljung')
pv = 1 - pchisq(16.828,10) # Compute p-value, 10 dof
pv
87/146
Prediction with AR(p)
X̂t+1|t = φ1 Xt + . . . + φp Xt−p+1
et (1) = Xt+1 − X̂t+1|t = Wt+1 residual
Var[et(1)] = σW²
88/146
Prediction with AR(p)
Hence
89/146
Moving average time series
90/146
Moving average time series
Variance

Var[Xt] = (1 + θ1² + · · · + θq²) σW²
91/146
Backward shift operator
φ(z) = 1 − φ1 z − φ2 z 2 − . . . − φp z p
θ(z) = 1 + θ1 z + θ2 z² + · · · + θq z^q
92/146
ARMA and ARIMA
X ∼ ARMA(p, q) if
φ(B)Xt = θ(B)Wt
93/146
Fitting models to data in practice
94/146
Fitting an AR in practice
95/146
Fitting an MA in practice
96/146
Fitting an ARMA
97/146
Exponential smoothing
Assume we believe in

X̂h+1 ≈ Σ_{j=1}^∞ w^{j−1} Xh+1−j,   w ∈ (0, 1).

Since Σ_{j=1}^∞ w^{j−1} = 1/(1 − w) and we want the weights to sum
to one, we define

X̂h+1 = (1 − w) Σ_{j=1}^∞ w^{j−1} Xh+1−j,   w ∈ (0, 1).
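The weighted sum above is equivalent to the one-step recursion s_t = (1 − w) x_t + w s_{t−1}; a sketch verifying this on simulated data (initialising the recursion at the first observation):

```r
# Exponential smoothing: recursion vs. the explicit weighted sum
set.seed(2)
w <- 0.8
x <- rnorm(300)
s <- x[1]                                         # initialise s_1 = x_1
for (t in 2:length(x)) s <- (1 - w) * x[t] + w * s
Tn <- length(x)
j <- 0:(Tn - 2)
s.sum <- (1 - w) * sum(w^j * x[Tn - j]) + w^(Tn - 1) * x[1]
```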
98/146
Regression models with time series errors
Yt = α + βXt + et ,
where Yt , Xt are two time series and et denotes the error term.
Use least-squares to estimate α, β but it is common for et to
be serially correlated
99/146
R - regression models with time series errors
# Linear regression
lm1 = lm(r3~r1)
summary(lm1)
plot(tVec,lm1$residuals,type='l'); abline(h=0)
acf(lm1$residuals,lag=36)
100/146
R - cont.
# Linear regression
lm2 = lm(c3~-1+c1) # '-1' for no intercept
summary(lm2)
plot(lm2$residuals,type="l")
acf(lm2$residuals,lag=36)
pacf(lm2$residuals,lag=36,ylim=c(-1,1))
101/146
Regression models with time series errors
102/146
R - cont.
r = lm2$residuals
m.ma = arima(r,order=c(0,0,1),include.mean=F)
plot(m.ma$residuals); abline(h=0)
acf(m.ma$residuals)
pacf(m.ma$residuals,ylim=c(-1,1))
Box.test(m.ma$residuals,lag=10,type='Ljung')
1 - pchisq(50.344,9)
103/146
Regression models - summary
104/146
Backtesting
105/146
Model averaging
106/146
Part II: nonlinear time series
107/146
Volatility
108/146
Volatility - characteristics
Volatility clusters
Volatility jumps are rare
Volatility shows mean reversion
Volatility seems to react differently to big price increases and
drops - the leverage effect
109/146
Volatility models
110/146
Nonlinear AR
111/146
Motivation
112/146
R - Nonlinear AR
ts.plot(SP500$r500); abline(h=0)
acf(SP500$r500)
qqnorm(SP500$r500); qqline(SP500$r500)
# Treasury-bill rates
data(Tbrate,package="Ecdat")
Tbill <- Tbrate[,1]
d.Tbill <- diff(Tbill)
qqnorm(d.Tbill)
qqline(d.Tbill)
acf(d.Tbill)
acf(d.Tbill^2)
113/146
Nonlinear AR
# Intel data (the file contains monthly Intel and S&P returns):
da = read.table("data/m-intcsp7309.txt",header=T)
head(da)
X = log(da$intc+1)
rtn = ts(X,frequency=12,start=c(1973,1))
plot(rtn,type="l",xlab="year",ylab="Intel log-return"); abline(h=0)
Box.test(X,lag=12,type="Ljung")
# -> cannot reject H0: rho(h)=0, h>0
acf(X,lag=24)
acf(abs(X),lag=24)
Box.test(abs(X),lag=12,type="Ljung")
# -> reject H0: rho(h)=0, h>0
114/146
Model structure
115/146
Building a model
116/146
ARCH models
where
σt² = α0 + Σ_{j=1}^p αj Xt−j²,   σt² > 0
117/146
ARCH(1)
Stationarity
E[Xt] = E[E[Xt|Xt−1]] = 0 and E[Xt²] = α0 Σ_{j=0}^∞ α1^j
Hence, {Xt}t is weakly stationary ⟺ |α1| < 1 and in that
case, E[Xt²] = α0/(1 − α1)
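A simulation sketch of the stationary second moment E[Xt²] = α0/(1 − α1), using illustrative parameter values α0 = 0.1, α1 = 0.5:

```r
# Simulated ARCH(1): sample second moment vs. alpha_0/(1 - alpha_1)
set.seed(8)
a0 <- 0.1; a1 <- 0.5
n <- 1e5
x <- numeric(n)                                    # x[1] = 0 as burn-in start
for (t in 2:n) x[t] <- sqrt(a0 + a1 * x[t-1]^2) * rnorm(1)
mean(x^2)                                          # close to a0/(1 - a1) = 0.2
```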
118/146
ARCH(1)
119/146
ARCH(1)
Higher moments

E[Xt⁴] = 3α0²(1 + α1) / ((1 − α1)(1 − 3α1²))

This implies that 0 ≤ α1² < 1/3

Tails of Xt are heavier than those of the normal distribution:

E[Xt⁴] / (Var[Xt])² = 3(1 − α1²) / (1 − 3α1²) > 3.
120/146
ARCH pros and cons
Advantage:
Model can produce volatility clusters
X has heavy tails
Weaknesses:
Positive and negative shocks have the same effect on volatility
ARCH models are likely to over-predict the volatility
121/146
ARCH and AR
ARCH(p):
Xt = σt Wt   where   σt² = α0 + Σ_{j=1}^p αj Xt−j²,
122/146
ARCH model checking
Define residuals as
X̃t = Xt / σt
Check whether X̃t forms an i.i.d. sequence (Ljung-Box, etc.)
123/146
R - ARCH fitting to IBM data
# ARCH(p)
pacf(X^2,lag=24,ylim=c(-1,1))
library(fGarch) # garchFit() is from the fGarch package
m3 = garchFit(~garch(12,0),data=X,trace=F)
summary(m3)
plot(m3)
124/146
GARCH
Xt = µt + σt Wt
where
σt² = α0 + Σ_{j=1}^p αj X̃t−j² + Σ_{j=1}^q θj σt−j²,
with X̃t = Xt − µt
Hence X̃t |Ft−1 ∼ N(0, σt2 )
125/146
ARCH, GARCH
ARCH(1):

σt² = α0 + α1 Xt−1²
Xt² = α0 + α1 Xt−1² + εt

GARCH(1,1):

σt² = α0 + α1 Xt−1² + β1 σt−1²
Xt² − εt = α0 + α1 Xt−1² + β1 (Xt−1² − εt−1)
Xt² = α0 + (α1 + β1) Xt−1² − β1 εt−1 + εt
126/146
ARCH,GARCH
127/146
Other models
128/146
Cointegration
129/146
Example
where:
Sn = S0 + W1 + . . . + Wn, S0 = 0, is a random walk with
{Wn}n - N(0, 1) - being a white noise process,
{εn⁽¹⁾} and {εn⁽²⁾} are two independent white noise sequences
that are independent of {W},
{Xn⁽¹⁾}n and {Xn⁽²⁾}n are random walks plus noise and hence
both I(1)
130/146
Example cont.
131/146
Example cont.
132/146
Regression analysis for cointegration
133/146
Trading with R
134/146
Useful libraries in R
135/146
Basic steps
136/146
Example
Data: SP500
Tool: we use DVI indicator (momentum indicator) from TTR
library
Trading rule: go long if DVI < 0.5 and short if DVI > 0.5
We always invest all our capital
Check performance of our strategy
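A hedged, self-contained sketch of the backtest steps above. The slides use TTR's DVI indicator on SP500; here, to stay runnable offline with base R only, a simple moving-average rule on a simulated price path illustrates the same signal / lag / return logic:

```r
# Backtest skeleton: build signal, lag it one day, apply to returns
set.seed(99)
p <- 100 * exp(cumsum(rnorm(500, 0, 0.01)))        # simulated price path
r <- c(0, diff(log(p)))                            # daily log-returns
sma <- stats::filter(p, rep(1/50, 50), sides = 1)  # 50-day moving average
sig <- ifelse(p > sma, 1, -1)                      # long above the average, else short
sig <- c(NA, head(sig, -1))                        # trade on *yesterday's* signal
strat <- ifelse(is.na(sig), 0, sig * r)            # fully invested strategy returns
total <- prod(1 + strat) - 1                       # cumulative strategy return
```

With real data, the same skeleton applies with `DVI(Cl(...))` from TTR as the signal and `charts.PerformanceSummary()` from PerformanceAnalytics for evaluation.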
137/146
Example in R
install.packages("quantmod")
install.packages("PerformanceAnalytics")
library(PerformanceAnalytics)
library(TTR)
139/146
Example in R
140/146
RSI - Example in R
141/146
RSI - Example in R
142/146
Remarks
143/146