The Propagation of Regional Recessions
James D. Hamilton†
Department of Economics
University of California, San Diego
Michael T. Owyang‡
Research Division
Federal Reserve Bank of St. Louis
keywords: recession, regional business cycles, Markov-switching
first draft: September 13, 2008
revised: September 22, 2008
Abstract
This paper develops a framework for inferring common Markov-switching components in a
panel data set with large cross-section and time-series dimensions. We apply the framework
to studying similarities and differences across U.S. states in the timing of business cycles. We
hypothesize that there exists a small number of cluster designations, with individual states in a
given cluster sharing certain business cycle characteristics. We find that although oil-producing
and agricultural states can sometimes experience a separate recession from the rest of the United
States, for the most part differences across states appear to be a matter of timing, with some
states beginning a national recession or recovery before others. [JEL: C11; C32; E32]
1 Introduction
The formation of the European Monetary Union has sparked a resurgence of interest in regional
business cycles, both in Europe and in the United States where longer time series are available. A
number of these recent studies have characterized the U.S. national economy as an agglomeration
of distinct but interrelated regional economies. While some idiosyncrasies exist, regional business
The authors benefited from conversations with Michael Dueker and Jeremy Piger. Kristie M. Engemann provided
research assistance. The views expressed herein do not reflect the official positions of the Federal Reserve Bank of
St. Louis or the Federal Reserve System.
† jhamilton@ucsd.edu.
‡ owyang@stls.frb.org.
cycles in the U.S., for the most part, bear a reasonable resemblance to the national cycle identified by
the NBER using aggregate data. Disparities in regional business cycles have often been attributed
either to idiosyncratic shocks or to differences in characteristics such as the industrial composition
of the regions. Conversely, commonality can be attributed to responses to common aggregate
shocks for which the state responses vary but the timing is identical.1
Characterizing regional business cycles using a panel data set with large cross-section and time-series
dimensions raises two separate questions. The first is how to model the comovements that
are common across geographic divisions. In Owyang, Piger, and Wall (2005) and Owyang, Piger,
Wall, and Wheeler (forthcoming), the unit of analysis is taken to be individual states and cities,
respectively. Regional similarities were noted but not modeled explicitly. A common alternative
for characterizing common elements across geographic divisions is to rely on factor analysis, as
in Forni and Reichlin (2001) and Del Negro (2002). Another approach is to use exogenously
defined regions such as those adopted by the Bureau of Economic Analysis as either the basic unit
of analysis (e.g., Kouparitsis, 1999), or as an additional observable restriction on the state-level
factor structure (Del Negro, 2002). A few studies define regions endogenously. Crone (2005) used
k-means cluster analysis of state business cycle movements to define regions. While his regional
definitions are similar to those used by the BEA, Crone found some discrepancies (in particular,
Arizona, which may be taken as a region unto itself). Partridge and Rickman (2005) used cyclical
indices to uncover common currency areas in the U.S. Similarly, van Dijk, Franses, Paap, and van
Dijk (2007) constructed clusters for regional housing markets in the Netherlands.
A second question concerns the manner in which the business cycle itself is defined. What
exactly are we claiming to have measured when we compare the timing of a recession in one state
with that observed in another? In a standard factor model, the cyclical component is viewed as a
continuous-valued random variable, defined in terms of its ability to capture certain comovements
across states. Kouparitsis (1999) and Carlino and DeFina (2004) used band-pass filters to extract
the business cycle frequency from disaggregate data. Carlino and Sill (2001) and Partridge and
Rickman (2005) relied on trend-cycle decompositions.
Hamilton (2005) argued that the defining characteristic of the business cycle as understood,
1 Monetary shocks, for example, are aggregate shocks that have common timing but varying effect (see Carlino
and DeFina (1998)).
for example, by Burns and Mitchell (1946) is a transition between distinct, discrete phases of
expansion and contraction.
Owyang, Piger, and Wall (2005) and Owyang, Piger, Wall, and
Wheeler (forthcoming) adopted this perspective in their application of the Markov-switching model
of Hamilton (1989) to data for individual states and cities, respectively. The contribution of the
present paper is to extend that effort to characterize the interactions across states in these shifts.
Our paper could alternatively be viewed as an extension of factor or cluster analysis to this kind
of nonlinear framework.
We account for the correlation across states by modeling both national and regional recessions.
In our setup, following Frühwirth-Schnatter and Kaufmann (2008), we allow the data to define
regional groupings (which we designate as “clusters”) on the basis of comovement in state employment
growth rates and other observable fixed state characteristics. In particular, we model the
probability of a state’s inclusion in any region as a logistic variable, in which state-level characteristics
affect the prior probability of state membership in a region-cluster and observed employment
growth comovements inform the posterior inference about those probabilities.
The model is estimated using Bayesian methods and we report five main findings. First, most
state-level business cycle experiences are similar to those of the nation. Second, most idiosyncratic
recession experiences amount to differentials in timing around the national recessions. For example,
some states enter some recessions a quarter before the rest of the nation. Third, a cluster of states,
characterized by a high oil and agriculture share of their economy, does enter and exit recessions
independently from the nation. Fourth, the regional clusters we find are not exclusive, i.e., a
state can belong to more than one region. However, the overlapping of states in multiple regions
is infrequent. Finally, while industrial composition is important for determining the regional
clusterings, other factors such as the share of employment coming from small firms may also be
important.
The remainder of this paper is organized as follows.
Section 2 presents our characterization
of regional business cycles with particular focus on endogenous region determination.
Section 3
details the estimation technique. Section 4 presents the empirical results. Section 5 concludes.
2 Characterizing regional business cycles.
Let $y_{tn}$ denote the employment growth rate for state $n$ observed at date $t$. We group observations
for all states at date $t$ in an $(N \times 1)$ vector $y_t = (y_{t1}, \ldots, y_{tN})'$, where $N$ denotes the number of
states. Let $s_t$ be an $(N \times 1)$ vector of date $t$ recession indicators (so $s_{tn} = 1$ when state $n$ is in
recession and $s_{tn} = 0$ when state $n$ is in expansion). Suppose that
$$y_t = \mu_0 + \mu_1 \odot s_t + \varepsilon_t, \tag{1}$$
where the $n$th element of the $(N \times 1)$ vector $\mu_0 + \mu_1$ is the average employment growth in state
$n$ during recession, the $n$th element of the $(N \times 1)$ vector $\mu_0$ is the average employment growth in
state $n$ during expansion, and $\odot$ represents the Hadamard product. We assume that
$\varepsilon_t \sim \text{i.i.d. } N(0, \Sigma)$, with $\varepsilon_t$ independent of $s_\tau$ for all dates and that $s_t$ follows a Markov chain.
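As an illustration of equation (1), the following minimal sketch simulates employment growth for a given path of recession indicators under a diagonal $\Sigma$. The function name `simulate_growth` and the numerical values are hypothetical and are not taken from the paper.

```python
import random

def simulate_growth(mu0, mu1, sigma, s_path, seed=0):
    """Simulate y_t = mu0 + mu1 (Hadamard) s_t + eps_t with diagonal Sigma.

    mu0, mu1, sigma: length-N lists (expansion mean, recession shift, std. dev.).
    s_path: list of length-N 0/1 recession indicator vectors, one per date.
    """
    rng = random.Random(seed)
    y = []
    for s_t in s_path:
        y.append([m0 + m1 * s + rng.gauss(0.0, sd)
                  for m0, m1, sd, s in zip(mu0, mu1, sigma, s_t)])
    return y

# Hypothetical two-state example with the noise switched off (sigma = 0),
# so that each entry equals mu0 in expansion and mu0 + mu1 in recession.
y = simulate_growth(mu0=[2.0, 3.0], mu1=[-4.0, -5.0], sigma=[0.0, 0.0],
                    s_path=[[0, 0], [1, 0], [1, 1]])
```

With positive entries in `sigma`, the same call produces noisy growth rates around the two regime means.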
Equation (1) postulates that recessions are the sole source of dynamics in state employment
growth. There is no conceptual problem with adding lagged values of $y_{t-j}$ or $s_{t-j}$ to this equation,
though that would greatly increase the number of parameters and regimes for which one needs to
draw an inference. We regard the parsimonious formulation (1) as more robust than more richly
parameterized models for purposes of characterizing the broad features of business cycles across
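states. We also adopt the simplifying assumption that $\Sigma$ is diagonal:
$$\Sigma = \begin{bmatrix} \sigma_1^2 & 0 & \cdots & 0 \\ 0 & \sigma_2^2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_N^2 \end{bmatrix}.$$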
This reduces the number of variance parameters from $N(N+1)/2$ down to $N$, and, unfortunately,
is necessary for the particular algorithms we employ to be valid.
Our model thus assumes that
coincident recessions, or the tendency of a recession in one state to lead to a recession in another,
are the only reason that employment growth would be correlated across states.
Again, this is
a stronger formulation than one might like, though we think nevertheless an interesting one for
getting a broad summary of some of the ways that the business cycle may be propagated across
regions.
Despite these assumptions, the model (1) is numerically intractable without further simplification.
If state 1 can be in recession while 2 and 3 are not, or 1 and 2 in recession while 3 is not,
there are $2^N$ different possibilities, or $2.8 \times 10^{14}$ different configurations in the case of the 48
contiguous states. Implementing the algorithm for inference and likelihood evaluation in Hamilton
(1994, p. 692) would require calculation of a $(2^N \times 1)$ vector $\xi_t$ and a $(2^N \times 2^N)$ matrix $P$, which is not
remotely feasible. Even if it somehow could be implemented, such a formulation is trying to infer
much more information from a $(T \times N)$ data set than can be reasonably justified.
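The size of the unrestricted problem can be checked with a few lines of arithmetic. The variable names below are illustrative only.

```python
# Number of distinct recession/expansion configurations for N states.
N = 48
configs = 2 ** N                 # exact integer
matrix_entries = configs ** 2    # entries of the unrestricted transition matrix

print(configs)
print(f"{configs:.1e}")
```

The first number is the length of the regime-probability vector that the unrestricted filter would have to carry at every date.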
Our approach as in Frühwirth-Schnatter and Kaufmann (2008) is to assume that recession
dynamics can be characterized in terms of a small number $K \ll 2^N$ of different clusters and by an
aggregate indicator $z_t \in \{1, 2, \ldots, K\}$ signifying which cluster is in recession at date $t$. We associate
with cluster 1 an $(N \times 1)$ vector $h_1 = (h_{11}, \ldots, h_{N1})'$ whose $n$th element is unity when state $n$ is
associated with cluster 1 and 0 if state $n$ is not associated with the cluster. When $z_t = 1$, all the
states associated with cluster 1 would be in recession. In general,
$$y_t \mid z_t = k \sim N(m_k, \Sigma),$$
where
$$m_k = \mu_0 + \mu_1 \odot h_k.$$
Conditional on knowing the values of $h_1, \ldots, h_K$, this is a standard Markov-switching framework
for which inference methods are well known. The new question is how to infer the configurations
of $h_1, \ldots, h_K$ from the data. We impose two of these configurations a priori, stipulating that $h_K$ is
a column of all zeros (so that every state is in expansion when $z_t = K$), and $h_{K-1}$ is a column of
all ones (every state is in recession when $z_t = K-1$). We will refer to clusters other than those
characterized by $h_{K-1}$ and $h_K$ as “idiosyncratic” clusters and let $\kappa = K - 2$ denote the number of
idiosyncratic clusters. Thus, when $z_t = 1, 2, \ldots, \kappa$, some states are in recession and others are not.
The values of $h_1, \ldots, h_\kappa$ are unobserved variables that influence the probability distribution of the
observed data $\{y_t\}_{t=1}^T$.
We postulate that there is a $(P_k \times 1)$ vector $x_{nk}$ that influences whether state $n$ experiences a
recession when $z_t = k$ according to
$$p(h_{nk}) = \begin{cases} 1 / \left[1 + \exp(x_{nk}'\beta_k)\right] & \text{if } h_{nk} = 0 \\ \exp(x_{nk}'\beta_k) / \left[1 + \exp(x_{nk}'\beta_k)\right] & \text{if } h_{nk} = 1 \end{cases} \tag{2}$$
for $n = 1, \ldots, N$; $k = 1, \ldots, \kappa$. Note that state $n$ could be affiliated with more than one idiosyncratic
cluster.2 Alternatively, state $n$ would participate only in national recessions if $h_{n1} = \cdots = h_{n\kappa} = 0$.
We think of $\beta_k$ as a population parameter: prior to the generation of any data, nature generated
a value of $h_{nk}$ according to (2). We will then draw a Bayesian posterior inference about the
population parameter $\beta_k$.
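The prior membership probability in (2) is an ordinary logistic function of the index $x_{nk}'\beta_k$. A minimal sketch follows; the function name `prior_prob_member` and the covariate values are hypothetical.

```python
import math

def prior_prob_member(x_nk, beta_k):
    """Pr(h_nk = 1) from equation (2): exp(x'b) / (1 + exp(x'b))."""
    index = sum(x * b for x, b in zip(x_nk, beta_k))
    return 1.0 / (1.0 + math.exp(-index))

# A zero coefficient vector gives every state an even prior chance of membership.
p_flat = prior_prob_member([1.0, 0.3], [0.0, 0.0])
# A positive coefficient on the second covariate raises the prior probability.
p_high = prior_prob_member([1.0, 0.3], [0.0, 5.0])
```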
Following Holmes and Held (2006), it is convenient for purposes of
the estimation algorithm to represent this generation of $h_{nk}$ given $\beta_k$ as the outcome of another
unobserved pair of latent variables, denoted $\xi_{nk}$ and $\lambda_{nk}$. The ability to do so comes from the
following observation by Andrews and Mallows (1974). Let $\psi_{nk}$ have the limiting distribution of
the Kolmogorov-Smirnov test statistic, whose density Devroye (1986, p. 161) writes as
$$p(\psi_{nk}) = 8 \sum_{j=1}^{\infty} (-1)^{j+1} j^2 \psi_{nk} \exp\left(-2 j^2 \psi_{nk}^2\right). \tag{3}$$
Andrews and Mallows show that if $\psi_{nk} \sim KS$ and $e_{nk} \sim N(0,1)$, then $\xi_{nk} = x_{nk}'\beta_k + 2\psi_{nk} e_{nk}$ has
a logistic distribution with mean $x_{nk}'\beta_k$ and unit scale parameter, for which the cdf is
$$\Pr(\xi_{nk} \le z) = \frac{1}{1 + \exp(x_{nk}'\beta_k - z)}.$$
Thus as in Holmes and Held (2006) we have that
$$\Pr(\xi_{nk} > 0) = \frac{\exp(x_{nk}'\beta_k)}{1 + \exp(x_{nk}'\beta_k)}.$$
In other words, if we thought of nature as having generated $\xi_{nk}$ from a $N(x_{nk}'\beta_k, \lambda_{nk})$ distribution
where $\lambda_{nk} = 4\psi_{nk}^2$ for $\psi_{nk} \sim KS$, and then selected $h_{nk}$ to be unity if $\xi_{nk} > 0$, that is equivalent
to claiming that the value of $h_{nk}$ was generated according to the probability specified in (2).
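The scale-mixture representation can be checked numerically: integrating a $N(0, 4\psi^2)$ density against the Kolmogorov-Smirnov density (3) should reproduce the standard logistic density. The sketch below does this with a truncated series and a trapezoid rule. The truncation point, integration limits, and step size are assumptions chosen for illustration, not values from the paper.

```python
import math

def ks_density(psi, terms=100):
    """Density (3) of the limiting Kolmogorov-Smirnov statistic (truncated series)."""
    return 8.0 * psi * sum((-1) ** (j + 1) * j * j * math.exp(-2.0 * j * j * psi * psi)
                           for j in range(1, terms + 1))

def mixture_density(eps, lo=0.1, hi=4.0, step=0.002):
    """Trapezoid-rule integral over psi of N(eps; 0, 4 psi^2) times ks_density(psi)."""
    def integrand(psi):
        sd = 2.0 * psi
        normal = math.exp(-0.5 * (eps / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        return normal * ks_density(psi)
    n = int(round((hi - lo) / step))
    total = 0.5 * (integrand(lo) + integrand(hi))
    total += sum(integrand(lo + i * step) for i in range(1, n))
    return total * step

def logistic_density(eps):
    """Density of a logistic variable with mean zero and unit scale."""
    return math.exp(-eps) / (1.0 + math.exp(-eps)) ** 2

gap_at_zero = abs(mixture_density(0.0) - logistic_density(0.0))
gap_in_tail = abs(mixture_density(1.5) - logistic_density(1.5))
```

Both gaps should be small relative to the densities themselves; the limits of integration can be widened if a tighter comparison is wanted.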
2 This approach stands in contrast with the typical notion of a "region". Government agencies (Bureau of Economic
Analysis, Bureau of Labor Statistics, Census, etc.) define their regions such that any state can be a member of only
one region. Empirical studies (e.g., Crone 2005) make a similar exclusivity restriction.
3 Bayesian posterior inference.
The task of data analysis is to draw a Bayesian posterior inference about the values of both population
parameters and the unobserved latent variables. We divide these unknown objects into
several categories. The set $\theta = \{\mu_0, \mu_1, \sigma\}$ characterizes the growth rates for each state in recession
and expansion and the standard deviation $\sigma_n$ of employment growth rates for state $n$ around
those means. The $(K \times K)$ matrix $P$ contains the transition probabilities for regimes, with row $j$,
column $i$ element
$$p_{ij} = p(z_t = j \mid z_{t-1} = i)$$
where as in Hamilton (1994, p. 679) each column of $P$ sums to unity.
There are also two groups of unobserved latent variables. The $(T \times 1)$ vector $z = (z_1, \ldots, z_T)'$
summarizes which clusters are in recession at each date, while $h = \{h_1, \ldots, h_\kappa\}$ summarizes the
cluster affiliation of each state where $h_k = (h_{1k}, \ldots, h_{Nk})'$ denotes the $(N \times 1)$ vector characterizing
which states participate in cluster $k$. There are also 3 other sets of variables and parameters
associated with that realization of $h$. Let $\xi_k = (\xi_{1k}, \ldots, \xi_{Nk})'$ and $\lambda_k = (\lambda_{1k}, \ldots, \lambda_{Nk})'$ denote the
associated auxiliary variables (see Tanner and Wong 1987) that are viewed as having determined
$h_k$ according to:
$$h_{nk} = \begin{cases} 1 & \text{if } \xi_{nk} > 0 \\ 0 & \text{otherwise} \end{cases} \tag{4}$$
$$\xi_{nk} \mid \beta_k, \lambda_{nk} \sim N(x_{nk}'\beta_k, \lambda_{nk}), \tag{5}$$
$$\lambda_{nk} = 4\psi_{nk}^2, \qquad \psi_{nk} \sim KS.$$
Collect all the latent variables associated with the cluster affiliations in a set $H = \{h, \xi, \lambda\}$, where
$\xi = \{\xi_1, \ldots, \xi_\kappa\}$ and $\lambda = \{\lambda_1, \ldots, \lambda_\kappa\}$, while $\beta = \{\beta_1, \ldots, \beta_\kappa\}$ denotes the set of all the logistic
coefficient vectors.
3.1 Priors.
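Recall that a positive scalar $x$ is said to have a $\Gamma(\alpha, \beta)$ distribution if its density is
$$p(x) = \begin{cases} \left[\beta^{\alpha} / \Gamma(\alpha)\right] x^{\alpha - 1} e^{-\beta x} & \text{for } x > 0 \\ 0 & \text{otherwise.} \end{cases} \tag{6}$$
We adopt a $\Gamma(\nu/2, \delta/2)$ prior for $\sigma_n^{-2}$:
$$p(\sigma_n^{-2}) \propto \sigma_n^{-\nu + 2} \exp\left(-\delta \sigma_n^{-2} / 2\right). \tag{7}$$
We use a $N(m, \sigma_n^2 M)$ prior for $\mu_n = (\mu_{n0}, \mu_{n1})'$:
$$p(\mu_n \mid \sigma_n) \propto \left|\sigma_n^2 M\right|^{-1/2} \exp\left\{ -(\mu_n - m)' \left[\sigma_n^2 M\right]^{-1} (\mu_n - m) / 2 \right\}. \tag{8}$$
With independent priors across states, we then have
$$p(\theta) = \prod_{n=1}^{N} p(\mu_n \mid \sigma_n)\, p(\sigma_n^{-2}).$$
We model transition probabilities using a Dirichlet prior. Recall that for $w = (w_1, \ldots, w_m)'$ with
$w_i \in [0, 1]$ and $\sum_{i=1}^{m} w_i = 1$, we say that $w$ has a Dirichlet distribution with parameter vector $\alpha$,
denoted $w \sim D(\alpha)$, if the joint density of $\{w_1, \ldots, w_{m-1}\}$ is given by
$$p(w_1, \ldots, w_{m-1}) = \frac{\Gamma(\alpha_1 + \cdots + \alpha_m)}{\Gamma(\alpha_1) \cdots \Gamma(\alpha_m)}\, w_1^{\alpha_1 - 1} \cdots w_m^{\alpha_m - 1}.$$
We adopt the diffuse Dirichlet prior ($D(0)$) for each column of $P$:
$$p(P) \propto p_{11}^{-1} \cdots p_{KK}^{-1}.$$
Our prior distribution for $\beta_k$ is characterized by independent Normal distributions,
$$\beta_k \sim N(b_k, B_k) \quad \text{for } k = 1, \ldots, \kappa, \tag{9}$$
with $p(\beta)$ the product of (9) over $k = 1, \ldots, \kappa$. Then,
$$p(H, \beta) = p(H \mid \beta)\, p(\beta)$$
where $p(H \mid \beta)$ is the product of (3) through (5) over $k = 1, \ldots, \kappa$ and $n = 1, \ldots, N$.
Numerical values for the prior parameters are summarized in Table 1.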
3.2 Joint distribution.
Let $Y$ denote the $(T \times N)$ matrix consisting of the observed growth rates for all states at all dates,
where $T$ is the length of the time series. The joint density-distribution for data, parameters, and
latent variables for the logistic clustering formulation is given by
$$\begin{aligned} p(Y, \theta, P, z, H, \beta) &= p(Y \mid \theta, P, z, H, \beta)\, p(z \mid \theta, P, H, \beta)\, p(\theta \mid P, H, \beta)\, p(P \mid H, \beta)\, p(H, \beta) \\ &= p(Y \mid \theta, z, h)\, p(z \mid P)\, p(\theta)\, p(P)\, p(H, \beta). \end{aligned} \tag{10}$$
Note that $\xi$ and $\lambda$ affect the likelihood only through the value of $h$, and are only relevant as auxiliary
parameters to facilitate generation of posterior values of $\beta$. Specifically, one can integrate (10)
over all possible values of $\xi$ and $\lambda$ to obtain
$$\begin{aligned} p(Y, \theta, P, z, h, \beta) &= \int\!\!\int p(Y \mid \theta, z, h)\, p(z \mid P)\, p(\theta)\, p(P)\, p(H, \beta)\, d\xi\, d\lambda \\ &= p(Y \mid \theta, z, h)\, p(z \mid P)\, p(\theta)\, p(P) \int\!\!\int p(H, \beta)\, d\xi\, d\lambda \\ &= p(Y \mid \theta, z, h)\, p(z \mid P)\, p(\theta)\, p(P)\, p(h \mid \beta)\, p(\beta), \end{aligned} \tag{11}$$
where $p(h \mid \beta)$ is the product of (2) over $k = 1, \ldots, \kappa$ and $n = 1, \ldots, N$.
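The conditional likelihood $p(Y \mid \theta, z, h)$ can be written as follows. Collect the state $n$ observations
for all dates in a $(T \times 1)$ vector $Y_n = (y_{1n}, \ldots, y_{Tn})'$ and let $\theta_n = (\mu_{n0}, \mu_{n1}, \sigma_n^2)'$. Then,
$$p(Y \mid \theta, z, h) = \prod_{n=1}^{N} p(Y_n \mid \theta_n, z, h) \tag{12}$$
$$p(Y_n \mid \theta_n, z, h) = \prod_{t=1}^{T} p(y_{tn} \mid \theta_n, z_t, h)$$
$$p(y_{tn} \mid \theta_n, z_t, h) \propto \sigma_n^{-1} \exp\left\{ -\frac{\left[y_{tn} - \mu_n' w(z_t, h)\right]^2}{2\sigma_n^2} \right\}$$
$$w(z_t, h) = (1, h_{n, z_t})'.$$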
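The state-level likelihood in (12) can be evaluated directly once $z$ and $h$ are given. The sketch below returns the log of $p(Y_n \mid \theta_n, z, h)$ for one state, including the Normal constant that the proportionality sign omits. The function name `state_loglik` and the three-regime example are hypothetical.

```python
import math

def state_loglik(y_n, z, h_n, mu_n0, mu_n1, sigma_n):
    """Log of p(Y_n | theta_n, z, h), including the Normal constant.

    z[t] is the regime label at date t; h_n maps each regime label to 0 or 1,
    i.e. whether state n is in recession under that regime.
    """
    ll = 0.0
    for y, z_t in zip(y_n, z):
        mean = mu_n0 + mu_n1 * h_n[z_t]          # mu_n' w(z_t, h)
        ll += (-math.log(sigma_n) - 0.5 * math.log(2.0 * math.pi)
               - 0.5 * ((y - mean) / sigma_n) ** 2)
    return ll

# Hypothetical K = 3 example: regime 1 is an idiosyncratic cluster containing
# state n, regime 2 is the national recession, regime 3 is expansion.
h_n = {1: 1, 2: 1, 3: 0}
ll = state_loglik([2.0, -1.0, -1.0], [3, 1, 2], h_n, mu_n0=2.0, mu_n1=-3.0, sigma_n=1.0)
```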
The unconditional probabilities for $z$ are given by
$$p(z \mid P) = p(z_1) \prod_{t=2}^{T} p_{z_{t-1}, z_t}$$
for $p_{z_{t-1}, z_t}$ the row $z_t$, column $z_{t-1}$ element of $P$. The initial regime is set to expansion a priori:
$$p(z_1) = \begin{cases} 1 & \text{for } z_1 = K \\ 0 & \text{otherwise.} \end{cases}$$
3.3 Drawing $\sigma$ given $Y, \mu, P, z, H, \beta$.
Our general Bayesian inference is via the Gibbs sampler (see Gelfand and Smith (1990); Casella
and George (1992); Carter and Kohn (1994)), in which we will generate a draw for one block of
parameters or latent variables conditional on the others. This subsection discusses generation of
$\sigma$ conditional on the data $Y$ and on the values for $\mu$, $P$, $z$, $H$, and $\beta$ that were in turn generated
by the previous step of the iteration. In the next subsection we will discuss how to draw $\mu$ given
$Y, \sigma, P, z, H, \beta$. Both distributions can be derived from
$$p(\theta \mid Y, P, z, H, \beta) = \frac{p(\theta, Y, P, z, H, \beta)}{\int p(\theta, Y, P, z, H, \beta)\, d\theta} \tag{13}$$
where the numerator is given by (10) and $\int [\cdot]\, d\theta$ denotes the definite integral over all the possible
values for $\theta$. But multiplicative terms not involving $\theta$ cancel from numerator and denominator of
(13), so that
$$p(\theta \mid Y, P, z, H, \beta) \propto p(Y \mid \theta, z, h)\, p(\theta) = \prod_{n=1}^{N} p(Y_n \mid \theta_n, z, h)\, p(\theta_n).$$
Hence the $\theta_n$ given $Y, P, z, H, \beta$ are independent across $n$ with
$$\begin{aligned} p(\theta_n \mid Y, P, z, H, \beta) &\propto p(Y_n \mid \theta_n, z, h)\, p(\theta_n) \\ &\propto p(\theta_n)\, \sigma_n^{-T} \exp\left[ -\frac{\sum_{t=1}^{T} \left[y_{tn} - \mu_n' w(z_t, h)\right]^2}{2\sigma_n^2} \right]. \end{aligned} \tag{14}$$
Substituting (7) into (14) and dividing by the integral over $\sigma_n$, we have
$$p(\sigma_n^{-2} \mid Y, \mu, P, z, H) \propto \sigma_n^{-T - \nu + 2} \exp\left[ -(\delta + \hat{\delta})\, \sigma_n^{-2} / 2 \right]$$
for $\hat{\delta} = \sum_{t=1}^{T} \left[y_{tn} - \mu_n' w(z_t, h)\right]^2$. Recalling (6), we thus generate $\sigma_n^{-2}$ from a
$\Gamma\left((\nu + T)/2, (\delta + \hat{\delta})/2\right)$ distribution, a standard result as in Kim and Nelson (1999, p. 181).
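A draw of $\sigma_n^{-2}$ from this Gamma distribution can be sketched with the Python standard library. One point needs care: the second argument of the Gamma distribution in (6) is a rate, whereas `random.gammavariate` takes a scale, so the rate is inverted below. The function name `draw_precision` and the prior values in the example are hypothetical.

```python
import random

def draw_precision(y_n, means, nu, delta, rng):
    """Draw sigma_n^{-2} from Gamma((nu + T)/2, (delta + delta_hat)/2).

    `means` holds mu_n' w(z_t, h) for each date t.
    """
    delta_hat = sum((y - m) ** 2 for y, m in zip(y_n, means))
    shape = (nu + len(y_n)) / 2.0
    rate = (delta + delta_hat) / 2.0
    return rng.gammavariate(shape, 1.0 / rate)   # gammavariate expects a scale

# Hypothetical check: residuals of +/-1 over T = 200 dates with nu = delta = 2
# give shape = rate = 101, so draws of the precision should average about 1.
rng = random.Random(1)
y_n = [1.0, -1.0] * 100
draws = [draw_precision(y_n, [0.0] * 200, nu=2.0, delta=2.0, rng=rng) for _ in range(2000)]
mean_precision = sum(draws) / len(draws)
```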
3.4 Drawing $\mu$ given $Y, \sigma, P, z, H, \beta$.
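Using (8) in (14) and this time dividing by the integral over $\mu_n$, we again see as in Kim and Nelson
(1999, p. 181) that
$$\mu_n \mid Y, \sigma, P, z, H, \beta \sim N(m_n, \sigma_n^2 M_n) \tag{15}$$
for
$$M_n = \left(M^{-1} + C_n\right)^{-1}$$
$$m_n = M_n \left(M^{-1} m + c_n\right)$$
$$C_n = \left[ \sum_{t=1}^{T} w(z_t, h)\, w(z_t, h)' \right]$$
$$c_n = \left[ \sum_{t=1}^{T} w(z_t, h)\, y_{tn} \right].$$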
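Because $w(z_t, h) = (1, h_{n,z_t})'$ has only two elements, the moments in (15) reduce to $2 \times 2$ matrix algebra. The following sketch computes $m_n$ and $M_n$ for one state. The function name `posterior_moments` and the nearly flat prior in the example are assumptions for illustration.

```python
def posterior_moments(y_n, h_path, m, M_inv):
    """Return (m_n, M_n) from (15) for one state, with w_t = (1, h_t)'.

    h_path[t] is h_{n, z_t}; m is the prior mean (length 2); M_inv is the
    2x2 inverse of the prior scale matrix M, given as nested lists.
    """
    T = len(y_n)
    s_h = sum(h_path)                      # sum of h_t (h_t^2 = h_t for 0/1 values)
    C = [[T, s_h], [s_h, s_h]]             # C_n = sum_t w_t w_t'
    c = [sum(y_n), sum(y * h for y, h in zip(y_n, h_path))]   # c_n = sum_t w_t y_tn
    A = [[M_inv[i][j] + C[i][j] for j in range(2)] for i in range(2)]
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    M_n = [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]
    b = [M_inv[i][0] * m[0] + M_inv[i][1] * m[1] + c[i] for i in range(2)]
    m_n = [M_n[i][0] * b[0] + M_n[i][1] * b[1] for i in range(2)]
    return m_n, M_n

# With an almost flat prior the posterior mean approaches least squares:
# growth of 2 in expansion and 2 - 3 = -1 in recession.
m_n, M_n = posterior_moments([2.0, 2.0, -1.0, -1.0], [0, 0, 1, 1],
                             m=[0.0, 0.0], M_inv=[[1e-9, 0.0], [0.0, 1e-9]])
```

A draw of $\mu_n$ then adds $\sigma_n$ times a Cholesky factor of $M_n$ times a standard Normal vector to $m_n$.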
3.5 Drawing $P$ given $Y, \theta, z, H, \beta$.
Conditional on $H$ and $z$, this is again a standard inference problem for a $K$-state Markov switching
process, as in Chib (1996, p. 84). From (10),
$$p(P \mid Y, \theta, z, H) \propto p(z \mid P)\, p(P),$$
column $i$ of which will be recognized as a $D(\alpha_i)$ distribution, where the $j$th element of the vector
$\alpha_i$ is given by
$$\alpha_{ij} = \frac{\sum_{t=2}^{T} \delta(z_{t-1} = i, z_t = j)}{\sum_{t=2}^{T} \delta(z_{t-1} = i)},$$
which is just the fraction of times that regime $i$ is observed to be followed by regime $j$ among the
sequence $\{z_1, \ldots, z_T\}$.
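The transition fractions described here can be tabulated from a regime path as follows. This sketch only computes the fractions; it does not draw from the Dirichlet distribution. The function name `transition_fractions` is hypothetical and regimes are labeled $1, \ldots, K$.

```python
def transition_fractions(z, K):
    """alpha[i][j]: share of visits to regime i+1 that are followed by regime j+1."""
    counts = [[0] * K for _ in range(K)]
    for prev, nxt in zip(z[:-1], z[1:]):
        counts[prev - 1][nxt - 1] += 1
    alpha = []
    for row in counts:
        total = sum(row)
        # Regimes that are never left in the sample get a row of zeros.
        alpha.append([c / total if total else 0.0 for c in row])
    return alpha

alpha = transition_fractions([1, 1, 2, 1, 2, 2], K=2)
```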
3.6 Drawing $z$ given $Y, \theta, P, H, \beta$.
Here
$$p(z \mid Y, \theta, P, H, \beta) \propto p(Y \mid \theta, z, h)\, p(z \mid P).$$
Again as in Chib (1996, p. 83),
$$p(z \mid Y, \theta, P, H, \beta) = p(z_T \mid Y, \theta, P, h) \prod_{t=1}^{T-1} p(z_t \mid z_{t+1}, \ldots, z_T, Y, \theta, P, h).$$
But $z_{t+1}$ conveys all the information about $z_t$ embodied by future $z$ or $y$. Thus if
$Y_t = \{y_{\tau n} : \tau \le t;\ n = 1, \ldots, N\}$ collects observations from all states for all dates through $t$,
$$p(z \mid Y, \theta, P, H, \beta) = p(z_T \mid Y_T, \theta, P, h) \prod_{t=1}^{T-1} p(z_t \mid z_{t+1}, Y_t, \theta, P, h). \tag{16}$$
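The factorization (16) can be sampled with a forward filtering pass followed by backward sampling. The sketch below is one way to organize that computation under stated assumptions: regimes are indexed from 0 in the code, `P[j][i]` stores $\Pr(z_t = j \mid z_{t-1} = i)$ so that columns sum to one, `eta[t][k]` is the likelihood of the date $t$ data under regime $k$, and the filter is started from a unit probability on the last regime (expansion). The function name `sample_regimes` is hypothetical.

```python
import random

def sample_regimes(eta, P, rng):
    """Draw a regime path z_1..z_T by forward filtering and backward sampling."""
    T, K = len(eta), len(P)
    prev = [0.0] * (K - 1) + [1.0]          # start in the last regime (expansion)
    filtered = []
    for t in range(T):
        # One-step prediction, then Bayes update with the date-t likelihoods.
        pred = [sum(P[j][i] * prev[i] for i in range(K)) for j in range(K)]
        joint = [pred[k] * eta[t][k] for k in range(K)]
        total = sum(joint)
        prev = [v / total for v in joint]
        filtered.append(prev)
    z = [0] * T
    z[T - 1] = rng.choices(range(K), weights=filtered[T - 1])[0]
    for t in range(T - 2, -1, -1):
        # p(z_t = k | z_{t+1}, Y_t) is proportional to p_{k, z_{t+1}} p(z_t = k | Y_t).
        weights = [P[z[t + 1]][k] * filtered[t][k] for k in range(K)]
        z[t] = rng.choices(range(K), weights=weights)[0]
    return z

# Hypothetical two-regime example with nearly decisive likelihoods.
truth = [1, 1, 0, 0, 1]
eta = [[1.0, 1e-12] if s == 0 else [1e-12, 1.0] for s in truth]
P = [[0.9, 0.1], [0.1, 0.9]]
path = sample_regimes(eta, P, random.Random(0))
```

With likelihoods this lopsided the sampled path coincides with `truth`; with realistic likelihoods repeated calls trace out the posterior distribution of the path.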
One can calculate $p(z_t \mid Y_t, \theta, P, h)$ by iterating on equation [22.4.5] in Hamilton (1994)3, the terminal
value of which ($t = T$) gives us $p(z_T \mid Y_T, \theta, P, h)$, the first term in (16). Furthermore,
$$p(z_t \mid z_{t+1}, Y_t, \theta, P, h) = \frac{p_{z_t, z_{t+1}}\, p(z_t \mid Y_t, \theta, P, h)}{\sum_{j=1}^{K} p_{j, z_{t+1}}\, p(z_t = j \mid Y_t, \theta, P, h)},$$
allowing us to generate $z_T, z_{T-1}, \ldots, z_1$ sequentially.
3 Here $\xi_t$ is a $(K \times 1)$ vector whose $k$th element is unity when $z_t = k$ and zero otherwise, while $\eta_t$ is a $(K \times 1)$
vector whose $k$th element is $\prod_{n=1}^{N} p(y_{tn} \mid \theta, z_t = k, h)$, while $\hat{\xi}_{0|0} = (0, 0, \ldots, 1)'$.
3.7 Generating H.
We now define $H_k = \{h_k, \xi_k, \lambda_k\}$ and $H^{[k]} = \{h_j, \xi_j, \lambda_j : j = 1, \ldots, \kappa;\ j \ne k\}$. Our strategy will
be to generate the elements associated with cluster $k$ (denoted $H_k$) conditional on all the elements
of all the other clusters (denoted $H^{[k]}$). We will, in turn, break down the generation of $H_k$ given
$Y, H^{[k]}, \theta, P, z, \beta_k$ into a series of steps, first generating $h_k$; then $\xi_k$ conditional on $h_k$; and finally
$\lambda_k$ conditional on $h_k$ and $\xi_k$, all conditioning on $H^{[k]}$.
3.7.1 Drawing $h_k$ given $Y, H^{[k]}, \theta, P, z, \beta$.
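From (11),
$$\begin{aligned} p(h_k \mid Y, H^{[k]}, \theta, P, z, \beta) &\propto p(Y \mid \theta, z, h)\, p(h_k \mid \beta_k) \\ &= \prod_{n=1}^{N} p(Y_n \mid h_{nk}, h^{[k]}, \theta, z)\, p(h_{nk} \mid \beta_k). \end{aligned}$$
In other words, we can generate $h_{nk}$ for $n = 1, \ldots, N$ independently across states from
$$\Pr(h_{nk} = 1 \mid Y, h^{[k]}, \theta, P, z, \beta) = \frac{p(Y_n \mid h_{nk} = 1, h^{[k]}, \theta, z) \Pr(h_{nk} = 1 \mid \beta_k)}{\sum_{j=0}^{1} p(Y_n \mid h_{nk} = j, h^{[k]}, \theta, z) \Pr(h_{nk} = j \mid \beta_k)}$$
where
$$\Pr(h_{nk} = j \mid \beta_k) = \begin{cases} 1 / \left[1 + \exp(x_{nk}'\beta_k)\right] & \text{for } j = 0 \\ \exp(x_{nk}'\beta_k) / \left[1 + \exp(x_{nk}'\beta_k)\right] & \text{for } j = 1. \end{cases}$$
3.7.2 Drawing $\xi_k$ given $Y, h_k, H^{[k]}, \theta, P, z, \beta$.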
Here, we have
$$p(\xi_k \mid Y, h_k, H^{[k]}, \theta, P, z, \beta) = p(\xi_k \mid h_k, \beta_k) = \prod_{n=1}^{N} p(\xi_{nk} \mid h_{nk}, \beta_k).$$
Note that if we’d conditioned on $\lambda_{nk}$, then $\xi_{nk}$ would have a Normal distribution. However, without
that conditioning, we are back to the logistic distribution that motivates the parameterization in
terms of $(\xi_{nk}, \lambda_{nk})$. Holmes and Held (2006) argue that generating $\xi_{nk}$ from the unconditional
distribution and then $\lambda_{nk}$ will give the algorithm better convergence properties. For the posterior
distribution given $h_{nk}$, we know that $\xi_{nk}$ is logistic with mean $x_{nk}'\beta_k$ and truncated by $\xi_{nk} \ge 0$
if $h_{nk} = 1$ and $\xi_{nk} < 0$ if $h_{nk} = 0$. Recall that if $u \sim U[0, 1]$, then $\xi = A - \log(u^{-1} - 1)$ has a
logistic distribution with mean $E(\xi) = A$.4 Furthermore, $\xi \le 0$ iff $u \le 1/(1 + \exp(A))$. In other
words, we want to generate $u$ from a uniform distribution over the interval $[0, 1/(1 + \exp(A))]$
when $h_{nk} = 0$ and $u \sim U[1/(1 + \exp(A)), 1]$ when $h_{nk} = 1$. Note finally that if $u \sim U[0, 1]$, then
$a + (b - a)u \sim U[a, b]$. Thus we generate $u_{nk} \sim U[0, 1]$ and define
$$u_{nk}^{*} = \begin{cases} \dfrac{1}{1 + \exp(x_{nk}'\beta_k)}\, u_{nk} & \text{if } h_{nk} = 0 \\[2ex] \dfrac{1}{1 + \exp(x_{nk}'\beta_k)} + \dfrac{\exp(x_{nk}'\beta_k)}{1 + \exp(x_{nk}'\beta_k)}\, u_{nk} & \text{if } h_{nk} = 1. \end{cases}$$
Then $\xi_{nk} = x_{nk}'\beta_k - \log\left((u_{nk}^{*})^{-1} - 1\right)$.
3.7.3 Drawing $\lambda_k$ given $Y, \xi_k, h_k, H^{[k]}, \theta, P, z, \beta$.
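Now
$$\begin{aligned} p(\lambda_k \mid Y, \xi_k, h_k, H^{[k]}, \theta, P, z, \beta) &= p(\lambda_k \mid \xi_k, \beta_k) \\ &\propto p(\xi_k \mid \lambda_k, \beta_k)\, p(\lambda_k) \\ &= \prod_{n=1}^{N} p(\xi_{nk} \mid \lambda_{nk}, \beta_k)\, p(\lambda_{nk}). \end{aligned}$$
Again as in Holmes and Held (2006), we set $r_{nk}^2 = (\xi_{nk} - x_{nk}'\beta_k)^2$ and use as a proposal density
a Generalized Inverse Gaussian density,
$$\hat{\lambda}_{nk} \sim GIG(1/2, 1, r^2),$$
a draw for which can be generated as follows. Generate $w_{nk}$ the square of a standard Normal and
set
$$v_{nk} = 1 + \frac{w_{nk} - \sqrt{w_{nk}(4r + w_{nk})}}{2r}.$$
Generate a separate $\hat{u}_{nk} \sim U[0, 1]$, and set
$$\hat{\lambda}_{nk} = \begin{cases} r / v_{nk} & \text{if } \hat{u}_{nk} \le 1/(1 + v_{nk}) \\ r v_{nk} & \text{otherwise.} \end{cases}$$
We then decide to accept $\hat{\lambda}_{nk}$ (or else repeat the above steps) using the algorithm described by
Holmes and Held (2006, p. 165).
4 This claim may be verified directly as follows:
$$\begin{aligned} \Pr(\xi \le z) &= \Pr\left[A - \log(u^{-1} - 1) \le z\right] \\ &= \Pr\left[\log(u^{-1} - 1) \ge A - z\right] \\ &= \Pr\left[u^{-1} \ge 1 + \exp(A - z)\right] \\ &= \Pr\left[u \le \frac{1}{1 + \exp(A - z)}\right] \\ &= \frac{1}{1 + \exp(A - z)} \end{aligned}$$
which will be recognized as the cdf of a logistic variable with mean $A$.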
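The three steps of Section 3.7 can be sketched for a single state and cluster as follows. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, `x_beta` stands for the scalar index $x_{nk}'\beta_k$, the uniform draw is clipped slightly away from 0 and 1 to keep the logarithm finite, and the final accept/reject step of Holmes and Held (2006, p. 165) is not reproduced, so `gig_proposal` returns only the proposal $\hat{\lambda}_{nk}$.

```python
import math
import random

def prob_h_one(y_n, z, k, mu0, mu1, sigma, x_beta):
    """Pr(h_nk = 1 | ...) for one state: likelihood ratio times prior odds from (2)."""
    log_odds = x_beta                        # log prior odds equal x_nk' beta_k
    for y, z_t in zip(y_n, z):
        if z_t == k:                         # h_nk only matters on dates with z_t = k
            log_odds += (-0.5 * ((y - mu0 - mu1) / sigma) ** 2
                         + 0.5 * ((y - mu0) / sigma) ** 2)
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)

def draw_xi(h_nk, x_beta, rng):
    """Truncated logistic draw by inverting the cdf on the relevant sub-interval."""
    cut = 1.0 / (1.0 + math.exp(x_beta))
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)   # keep the log finite
    u_star = cut * u if h_nk == 0 else cut + (1.0 - cut) * u
    return x_beta - math.log(1.0 / u_star - 1.0)

def gig_proposal(r, rng):
    """Proposal for lambda_nk given r = |xi_nk - x_nk' beta_k| > 0."""
    w = rng.gauss(0.0, 1.0) ** 2
    v = 1.0 + (w - math.sqrt(w * (4.0 * r + w))) / (2.0 * r)
    return r / v if rng.random() <= 1.0 / (1.0 + v) else r * v
```

A full pass for cluster $k$ would draw $h_{nk}$ as a Bernoulli variable with probability `prob_h_one(...)`, then $\xi_{nk}$ with `draw_xi`, then propose $\lambda_{nk}$ with `gig_proposal` and apply the acceptance rule.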
3.8 Drawing $\beta$ given $(Y, \theta, P, z, H)$.
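Notice
$$p(\beta \mid Y, \theta, P, z, H) = \prod_{k=1}^{\kappa} p(\beta_k \mid \xi_k, \lambda_k)$$
which is just a standard Normal regression model for each $k$ of the form
$$\xi_k = X_k \beta_k + \varepsilon_k$$
$$\underset{(N \times P_k)}{X_k} = \begin{bmatrix} x_{1k}' \\ \vdots \\ x_{Nk}' \end{bmatrix}$$
$$\varepsilon_k \sim N(0, W_k)$$
$$\underset{(N \times N)}{W_k} = \operatorname{diag}(\lambda_{1k}, \ldots, \lambda_{Nk}).$$
Thus
$$\beta_k \mid Y, \theta, P, z, H \sim N(b_k^{*}, B_k^{*})$$
$$b_k^{*} = \left(B_k^{-1} + X_k' W_k^{-1} X_k\right)^{-1} \left(B_k^{-1} b_k + X_k' W_k^{-1} \xi_k\right)$$
$$B_k^{*} = \left(B_k^{-1} + X_k' W_k^{-1} X_k\right)^{-1}.$$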
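For intuition, the posterior moments simplify to scalar weighted sums when there is a single covariate ($P_k = 1$). The sketch below covers only that special case; the general case requires a matrix inverse of dimension $P_k$. The function name `beta_posterior_scalar` and the numerical values are hypothetical.

```python
def beta_posterior_scalar(x, xi, lam, b_prior, B_prior):
    """Posterior mean and variance of beta_k when P_k = 1.

    x, xi, lam: length-N lists of covariates, latent xi_nk and variances lambda_nk.
    """
    precision = 1.0 / B_prior + sum(xn * xn / ln for xn, ln in zip(x, lam))
    B_post = 1.0 / precision
    weighted = sum(xn * zn / ln for xn, zn, ln in zip(x, xi, lam))
    b_post = B_post * (b_prior / B_prior + weighted)
    return b_post, B_post

# With x = 1, lambda = 1 and a very diffuse prior, the posterior mean is close
# to the sample mean of the latent variables and the variance is close to 1/N.
b_post, B_post = beta_posterior_scalar([1.0] * 4, [1.0, 2.0, 3.0, 4.0], [1.0] * 4,
                                       b_prior=0.0, B_prior=1e9)
```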
3.9 Label switching.
The model described above is unidentified in two respects. First, if we were to switch the values
of $\mu_0$ with $\mu_1$, and correspondingly switch the last two columns and then the last two rows of $P$,
the likelihood function would be unchanged. Likewise, switching the definition of clusters (e.g.,
switching the first two columns of $H$ along with the first two columns and first two rows of $P$)
would leave the likelihood function unchanged.
The first is a familiar issue in the literature, and we deal with it in a typical way, by normalizing
$\mu_{n1} < 0$. We implement this by rejecting any generated $\mu_n$ not satisfying the restriction and
redrawing from (15) until obtaining a draw that satisfies the normalization restriction.
The second issue is unique to our clustering approach. We mitigate this in part by imposing
the restriction that the process cannot transition from one idiosyncratic regime to another, that is,
imposing $p_{ij} = 0$ if $i$ and $j$ are both less than $K - 1$ and if $i \ne j$. We are thus ruling out transitions
in which recession for a subset of states is followed by those states going out of recession and a
different set of states going into recession. We found that once we imposed this condition, the
clusters are sharply differentiated by the data, so that for a typical run with 20,000 burned draws
and the next 20,000 retained, all of the retained draws tended to be consistent with a particular
distinct set of characterizations of the different clusters.
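The transition restriction can be written as a simple mask over regime pairs. In the sketch below regimes are labeled $1, \ldots, K$ as in the text, and the function name `transition_allowed` is hypothetical.

```python
def transition_allowed(i, j, K):
    """False only when i and j are distinct idiosyncratic regimes (both below K-1)."""
    return not (i < K - 1 and j < K - 1 and i != j)

K = 6
# allowed[i-1][j-1] is True when a move from regime i to regime j is permitted.
allowed = [[transition_allowed(i, j, K) for j in range(1, K + 1)]
           for i in range(1, K + 1)]
```

With $K = 6$, moves among regimes 1 to 4 are excluded unless the regime stays the same, while moves to or from the national recession (5) and expansion (6) regimes are left unrestricted.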
4 Empirical results.
The data used to measure state-level business cycles are the seasonally adjusted, annualized growth
rates of quarterly payroll employment.5 The sample period is 1956:Q2 to 2007:Q4; Alaska and
Hawaii are excluded. These data were obtained from the Bureau of Labor Statistics (BLS). Even
at the quarterly frequency, the growth rate in state-level employment can experience large swings
caused by idiosyncratic state experiences (for example, mining strikes in West Virginia). To ensure
that our algorithm identifies business cycles rather than outliers, we filter the employment growth
data by smoothing periods with growth rates more than 4 standard deviations away from the mean.
5 The measure most synonymous with GDP at the state level is Gross State Product (GSP). Unfortunately, GSP
is available only at an annual frequency and at a two-year lag, making it nonviable for a study of business cycles.
In addition to the time series data, the model in the preceding section requires a set of state-level
covariates characterizing the ex ante likelihood of membership in a given cluster. We report
results for a specification with $\kappa = 4$ idiosyncratic clusters and $P_k = 6$ covariates used to explain
the cluster affiliations of each state, with the same vector of explanatory variables used for each
cluster ($x_{nk} = x_n$ for $k = 1, \ldots, \kappa$). The vector $x_n$ includes barrels of oil produced per 100 dollars
of state GDP, agricultural employment as a share of total employment, manufacturing employment
share, financial activities employment share, a measure of workers’ compensation by state, and
the share of total state employment accounted for by small firms.6 Values for these explanatory
variables are displayed in Figure 1.
We report results for some of the parameters and unobserved latent variables of interest based
on a run of 20,000 Gibbs sampler iterations having discarded an initial burn-in of 20,000 iterations.
Table 2 shows the posterior medians and means for the model parameters $\mu_0$, $\mu_1$, and $\sigma^2$
for each state. Table 3 gives the posterior means of the logistic coefficients $\beta_k$ associated with
each of the idiosyncratic clusters ($k = 1, \ldots, 4$), with a bold entry signifying that 90 percent of the
posterior draws were on the same side of zero as the reported posterior mean. We also translate
these coefficients into discrete derivatives (denoted $\delta_k$). The $i$th element of $\delta_k$ has the following
interpretation. Let $\bar{x}_i = N^{-1} \sum_{n=1}^{N} x_{in}$ denote the average value for the $i$th explanatory variable.
Suppose we compare two states, each of which has $x_{jn} = \bar{x}_j$ for all $j \ne i$, but in the first state,
characteristic $i$ is one standard deviation below the average $\bar{x}_i$, and in the other state, characteristic
$i$ is one standard deviation above the average. How would the probability of inclusion in cluster $k$,
6 The oil share was calculated as 100 times the number of barrels of crude oil produced in the state in 1984 (from
the Energy Information Administration, http://tonto.eia.doe.gov/dnav/pet/pet_crd_crpdn_adc_mbbl_m.htm)
divided by 1984 state personal income (from the Census Bureau,
http://www.census.gov/compendia/statab/tables/08s0658.xls). The manufacturing and financial activities shares of
employment by state were calculated as the average of the annual industry (NAICS) shares of total payroll employment
from 1990-2006, also from the BLS. For agriculture’s share of employment, we used the percentage of employment in
2002 that was farm jobs or farm-related jobs, which we obtained from the USDA’s Economic Research Service.
Workers’ compensation was computed as the average of the index of workers’ compensation insurance costs from
Table 1 in Krueger and Burton (1990). We took the average of the 1972, 1975, 1978, and 1983 data for our final
series. The share of small firms was computed as an average of the share of total employment in firms with fewer
than 100 employees and was taken from the Statistics of U.S. Businesses dataset.
as calculated from (2), differ between the two states? The value for this magnitude implied by the
posterior mean for $\beta_k$ is reported as the $i$th element of the vector $\delta_k$ in Table 3. Taking cluster 1
as an example, a state that was average in all respects but one standard deviation below average in
the share of agricultural employment would be rather unlikely to be included in cluster 1, whereas
a state one standard deviation above the average would be quite likely to be included. A state
in which financial services comprise a smaller share of employment is also more likely to be one
of those that is in recession when $z_t = 1$. The second aggregate regime affects oil-producing and
agricultural states, but is negatively related to the share of manufacturing in total employment.
For cluster 3, states in which a higher fraction of total employment is due to large firms are more
likely to be included in the group that is in recession when $z_t = 3$. Cluster 4 tends to include states
in which agriculture is less important. Although one might have thought that state regulations
as proxied by workers’ compensation might be related to the duration of aggregate unemployment
spells, we find no connection between this measure and any cluster affiliations, as reflected in small
and insignificant values for $\beta_{5k}$.
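The discrete derivative described above is the difference between two logistic probabilities from (2), evaluated one standard deviation above and below the mean of a single covariate. A minimal sketch follows; the function names and the coefficient values are hypothetical and are not the estimates reported in Table 3.

```python
import math

def logistic(v):
    """Logistic function, matching the h_nk = 1 branch of (2)."""
    return 1.0 / (1.0 + math.exp(-v))

def discrete_derivative(beta_k, x_bar, i, sd_i):
    """Change in Pr(h_nk = 1) when covariate i moves from x_bar[i] - sd_i
    to x_bar[i] + sd_i, with all other covariates held at their means."""
    base = sum(b * x for b, x in zip(beta_k, x_bar))
    return logistic(base + beta_k[i] * sd_i) - logistic(base - beta_k[i] * sd_i)

# Hypothetical two-covariate example: only the second covariate matters.
d_second = discrete_derivative([0.0, 2.0], [1.0, 0.0], i=1, sd_i=1.0)
d_first = discrete_derivative([0.0, 2.0], [1.0, 0.0], i=0, sd_i=1.0)
```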
Table 4 reports posterior means of the regime transition probabilities $p_{ij}$. Starting with the
first column, suppose that $z_t = 1$ in quarter $t$, which would mean that only those states that are
included in cluster 1 would be in recession. We have ruled out a priori the possibility that these
states go out of recession and a new, different subset of states begins a recession at $t + 1$ (that
is, we imposed $p_{12} = p_{13} = p_{14} = 0$). Although we did not impose $p_{16} = 0$, the posterior mean
of $p_{16}$ in fact turns out to be quite close to zero, meaning that if the states in cluster 1 go into
recession, eventually the entire nation will follow, usually with a lag of about 2 quarters. Likewise,
if the states in cluster 4 are in recession, a national recession will also eventually arrive, usually
within 3 quarters ($p_{46} = 0$, $p_{44} = 0.63$). By contrast, $z_t = 2$ corresponds to a purely idiosyncratic
recession in which the subset of states in cluster 2 experiences a recession that on average lasts
about $1/(1 - p_{22}) = 4.3$ quarters and then transitions to a regime in which all states are in
expansion. The regime $z_t = 3$ would be characterized as a late recovery for the subset of states in
cluster 3; we could only observe $z_t = 3$ if the previous period had seen a national recession
($z_{t-1} = 5$). If there is a national recession, these states invariably require one more quarter to
recover from recession compared with the rest of the nation.
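As a rough check on the durations quoted above, the expected time spent in regime $i$ under a first-order Markov chain is $1/(1 - p_{ii})$. The following minimal Python sketch applies that formula to the rounded posterior means in Table 4. The list `P` and the helper `expected_duration` are illustrative names, not part of the estimation code, and the results can differ slightly from figures computed from unrounded estimates.

```python
# Posterior-mean transition probabilities copied from Table 4 (rounded values).
# P[i][j] is the probability of moving from regime i+1 to regime j+1.
P = [
    [0.56, 0.00, 0.00, 0.00, 0.44, 0.00],  # from regime 1
    [0.00, 0.77, 0.00, 0.00, 0.00, 0.23],  # from regime 2
    [0.00, 0.00, 0.00, 0.00, 0.00, 1.00],  # from regime 3
    [0.00, 0.00, 0.00, 0.63, 0.37, 0.00],  # from regime 4
    [0.00, 0.00, 0.24, 0.00, 0.76, 0.00],  # from regime 5
    [0.03, 0.03, 0.00, 0.02, 0.03, 0.90],  # from regime 6
]


def expected_duration(p_stay):
    """Mean number of quarters spent in a regime with staying probability p_stay."""
    return 1.0 / (1.0 - p_stay)


# Expected duration of each regime, keyed by regime number 1..6.
durations = {i + 1: expected_duration(P[i][i]) for i in range(6)}

# Each row should sum to one, up to rounding in the published table.
row_sums = [sum(row) for row in P]

for regime, d in durations.items():
    print(f"regime {regime}: expected duration {d:.1f} quarters")
```

With the rounded entries this gives roughly 2.3 quarters for regime 1, 4.3 for regime 2, and 2.7 for regime 4, in line with the lags described in the text.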
Figure 2 plots the posterior means for the regime probabilities given the data. The top panel
is calculated as the fraction of the 20,000 simulations for which $z_t$ for the indicated quarter is
equal to 5; that is, it shows the posterior probability of a national recession.
These correspond
fairly closely to the traditional NBER dates, which are indicated by shaded regions in the top panel,
with the exception of a downturn in 1956 based on state employment data that is not characterized
by NBER as a national recession.
Also, our framework would date both the 1990-91 and 2001
recessions as substantially longer based on state employment growth than the traditional NBER
dates specify.
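The calculation behind each panel can be sketched in a few lines. The example below is only an illustration: `draws` is a small made-up array standing in for the sampler output (one simulated regime path per row), and `regime_probability` is a hypothetical helper name rather than anything taken from the estimation code.

```python
def regime_probability(draws, regime):
    """For each quarter, return the share of simulated paths with z_t equal to `regime`."""
    n_draws = len(draws)
    n_quarters = len(draws[0])
    return [
        sum(1 for path in draws if path[t] == regime) / n_draws
        for t in range(n_quarters)
    ]


# Toy stand-in for the simulations: 4 draws of z_t over 3 quarters (made-up values).
draws = [
    [6, 1, 5],
    [6, 5, 5],
    [6, 1, 5],
    [6, 6, 5],
]

prob_national = regime_probability(draws, 5)  # share of draws with z_t = 5
prob_cluster1 = regime_probability(draws, 1)  # share of draws with z_t = 1
print(prob_national, prob_cluster1)
```

In the paper's setting the same average would be taken over the 20,000 retained simulations rather than over four toy paths.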
The shaded regions in the bottom 4 panels of Figure 2 are based on the $z_t = 5$ dates rather
than the NBER dates, to clarify the nature of the estimated dynamics. The national recessions of
1980, 1981, and 2001 all began with recessions in the cluster 1 states. A new recession also seems
to have begun in these states in 2007, which according to the estimated model parameters would
imply a national recession should soon follow. By contrast, the 1974 and 1990 recessions began
in the cluster 4 states. Every recession is characterized by a slow recovery by the cluster 3 states.
The cluster 2 states experienced a uniquely idiosyncratic recession during the oil price collapse in
the middle 1980s, as well as several briefer episodes in the 1950s and 1960s.
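To give a sense of how quickly a regional downturn is expected to spread under the estimated dynamics: according to the posterior means in Table 4, regimes 1 and 4 can only persist or give way to the national recession regime 5. Under that simplification, the probability that a national recession has arrived within $k$ quarters is $1 - p_{ii}^k$. The sketch below, with the hypothetical helper `prob_national_within`, evaluates this for the rounded values $p_{11} = 0.56$ and $p_{44} = 0.63$; it treats those values as exact and ignores parameter uncertainty.

```python
def prob_national_within(p_stay, quarters):
    """Probability of having moved to the national-recession regime within `quarters` steps.

    Assumes the only possible moves are staying in the leading regime
    (probability p_stay) or switching to the national-recession regime.
    """
    return 1.0 - p_stay ** quarters


# Rounded posterior means from Table 4 (assumed exact for this illustration).
p11, p44 = 0.56, 0.63

for k in range(1, 5):
    print(
        f"k={k}: cluster 1 lead {prob_national_within(p11, k):.2f}, "
        f"cluster 4 lead {prob_national_within(p44, k):.2f}"
    )
```

For a downturn that starts in the cluster 1 states, for example, this works out to about 69 percent within two quarters.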
Figure 3 indicates which states are affected by the respective idiosyncratic regimes based on the
posterior probabilities for each $h_{nk}$ given the observed employment data Y. Regime 1 is in fact
close to being a national recession. Thirty-one states would be in recession when $z_t = 1$, and the
rest historically have always followed if that happens. The main states left out of this group are
those in the northeast, the most populous states, and some of the key oil-producing states.
Cluster 2 is clearly confined to the oil-producing states and their agricultural neighbors in the
central-west part of the United States. The late-recovery states of cluster 3 seem to be limited to
Illinois and a few of the Great Plains states. The cluster 4 states, in which the 1974 and 1990
downturns seemed to begin, are concentrated in the northeastern and southeastern United States.
5 Conclusion
Certainly there are important differences in cyclical behavior across states. But one of the striking
features of our findings is the extent to which U.S. recessions appear typically to have been national
events in which at some point every state participated. Historically, different recessions likely began
in different regions with different characteristics, suggesting that there is some heterogeneity in the
initial shocks. But there also appears to be some common national propagating dynamic when a
downturn becomes sufficiently widespread.
References
[1] Andrews, D.F., and C.L. Mallows. “Scale Mixtures of Normal Distributions.” J. R. Statist.
Soc. B, 1974, 36, pp. 99-102.
[2] Carlino, Gerald and DeFina, Robert. “The Differential Regional Effects of Monetary Policy.”
Review of Economics and Statistics, November 1998, 80(4), pp. 572-587.
[3] Carlino, Gerald A. and DeFina, Robert H. “How Strong Is Co-movement in Employment over
the Business Cycle? Evidence from State/Sector Data.” Journal of Urban Economics, March
2004, 55(2), pp. 298-315.
[4] Carlino, Gerald and Sill, Keith. “Regional Income Fluctuations: Common Trends and Common
Cycles.” Review of Economics and Statistics, August 2001, 83(3), pp. 446-456.
[5] Carter, C. and Kohn, R. “On Gibbs Sampling for State Space Models.” Biometrika, 1994, 81,
pp. 541-553.
[6] Casella, G. and George, E. “Explaining the Gibbs Sampler.” The American Statistician, 1992,
46, pp. 167-174.
[7] Chib, Siddhartha. “Calculating Posterior Distributions and Modal Estimates in Markov Mixture
Models.” Journal of Econometrics, 1996, 75(1), pp. 79-97.
[8] Crone, Theodore M. “An Alternative Definition of Economic Regions in the United States
Based on Similarities in State Business Cycles.” Review of Economics and Statistics, November
2005, 87(4), pp. 617-626.
[9] Del Negro, Marco. “Asymmetric Shocks among U.S. States.” Journal of International Economics, March 2002, 56(2), pp. 273-297.
[10] Del Negro, Marco and Otrok, Christopher. “99 Luftballons: Monetary Policy and the House
Price Boom across U.S. States.” Journal of Monetary Economics, October 2007, 54(7), pp.
1962-1985.
[11] Devroye, Luc. Non-Uniform Random Variate Generation. New York: Springer-Verlag, 1986.
[12] Forni, Mario and Reichlin, Lucrezia. “Federal Policies and Local Economies: Europe and the
US.” European Economic Review, January 2001, 45(1), pp. 109-134.
[13] Fratantoni, Michael and Schuh, Scott. “Monetary Policy, Housing, and Heterogeneous Regional
Markets.” Journal of Money, Credit and Banking, August 2003, 35(4), pp. 557-589.
[14] Frühwirth-Schnatter, Sylvia and Kaufmann, Sylvia. “Model-Based Clustering of Multiple
Time Series.” Journal of Business and Economic Statistics, January 2008, 26(1), pp. 78-89.
[15] Gelfand, A. and Smith, A. “Sampling Based Approaches to Calculating Marginal
Densities.” Journal of the American Statistical Association, 1990, 85, pp. 398-409.
[16] Hamilton, James D. “A New Approach to the Economic Analysis of Nonstationary Time Series
and the Business Cycle.” Econometrica, March 1989, 57(2), pp. 357-384.
[17] Hamilton, James D. “What’s Real About the Business Cycle?” Federal Reserve Bank of St.
Louis Review, July/August 2005, 87(4), pp. 435-452.
[18] Holmes, Chris C., and Leonard Held. “Bayesian Auxiliary Variable Models for Binary and
Multinomial Regression.” Bayesian Analysis, 2006, 1(1), pp. 145-168.
[19] Kim, Chang-Jin and Nelson, Charles R. State-Space Models with Regime Switching. Cambridge,
MA: The MIT Press, 1999.
[20] Kouparitsas, Michael A. “Is the United States an Optimal Currency Area?” Chicago Fed Letter,
October 1999, 146, pp. 1-3.
[21] Krueger, Alan B., and John F. Burton, Jr. “The Employers’ Costs of Workers’ Compensation
Insurance: Magnitudes, Determinants, and Public Policy.” Review of Economics and Statistics,
May 1990, 72(2), pp. 228-240.
[22] Owyang, Michael T.; Piger, Jeremy; and Wall, Howard J. “Business Cycle Phases in U.S.
States.” Review of Economics and Statistics, November 2005, 87(4), pp. 604-616.
[23] Owyang, Michael T.; Piger, Jeremy; Wall, Howard J.; and Wheeler, Christopher H. “The Economic Performance of Cities: A Markov-Switching Approach.” Journal of Urban Economics,
forthcoming.
[24] Partridge, Mark D. and Rickman, Dan S. “Regional Cyclical Asymmetries in an Optimal
Currency Area: An Analysis Using US State Data.” Oxford Economic Papers, July 2005,
57(3), pp. 373-397.
[25] Tanner, Martin A. and Wong, Wing Hung. “The Calculation of Posterior Distributions by
Data Augmentation.” Journal of the American Statistical Association, June 1987, 82(398),
pp. 528-540.
[26] van Dijk, Bram; Franses, Philip Hans; Paap, Richard; and van Dijk, Dick. “Modeling Regional
House Prices.” mimeo.
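Table 1: Priors for Estimation

Parameter                  Prior Distribution           Hyperparameters
$[\mu_{0n}, \mu_{1n}]'$    $N(m, \sigma^2 M)$           $m = [2, -1]'$; $M = I_2$           $\forall n$
$\sigma_n^{-2}$            $\Gamma(\nu/2, \delta/2)$    $\nu = 0$; $\delta = 0$             $\forall n$
$P$                        $D(\alpha)$                  $\alpha_i = 0$                      $\forall i$
$\beta_k$                  $N(b, B)$                    $b = 0_p$; $B = \frac{1}{2} I_p$    $\forall k$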
Table 2: Estimated model coefficients (posterior medians and means)

                  Median                            Mean
State             $\mu_0$   $\mu_1$   $\sigma^2$    $\mu_0$   $\mu_1$   $\sigma^2$
Alabama           2.85      -3.93     5.00          2.85      -3.93     5.04
Arizona           5.56      -3.96     11.29         5.56      -3.95     11.37
Arkansas          3.34      -3.73     6.55          3.34      -3.73     6.60
California        3.24      -4.06     4.40          3.24      -4.06     4.43
Colorado          4.12      -3.53     6.15          4.12      -3.53     6.19
Connecticut       2.31      -4.03     4.76          2.31      -4.03     4.79
Delaware          2.75      -3.52     10.85         2.76      -3.52     10.93
Florida           4.85      -3.85     7.39          4.85      -3.85     7.44
Georgia           3.97      -4.45     4.49          3.96      -4.44     4.53
Idaho             3.95      -3.46     8.44          3.95      -3.46     8.49
Illinois          1.95      -4.02     3.83          1.95      -4.02     3.85
Indiana           2.63      -5.24     9.01          2.63      -5.24     9.07
Iowa              2.51      -3.54     5.00          2.52      -3.54     5.03
Kansas            2.90      -3.18     5.79          2.90      -3.18     5.82
Kentucky          2.97      -4.18     7.66          2.97      -4.18     7.70
Louisiana         2.74      -3.04     10.22         2.74      -3.04     10.27
Maine             2.44      -3.93     5.63          2.44      -3.93     5.67
Maryland          2.98      -3.65     5.20          2.98      -3.65     5.23
Massachusetts     2.13      -4.47     3.64          2.13      -4.47     3.67
Michigan          2.34      -5.66     13.03         2.34      -5.66     13.13
Minnesota         3.17      -3.93     4.21          3.17      -3.93     4.24
Mississippi       3.15      -3.63     7.07          3.16      -3.64     7.12
Missouri          2.34      -3.82     3.73          2.35      -3.82     3.75
Montana           2.87      -3.03     9.01          2.87      -3.03     9.06
Nebraska          2.85      -2.68     3.81          2.85      -2.68     3.83
Nevada            5.98      -4.62     11.39         5.98      -4.62     11.46
New Hampshire     3.54      -4.94     7.43          3.53      -4.94     7.48
New Jersey        2.28      -3.54     3.28          2.28      -3.54     3.30
New Mexico        3.59      -2.33     5.78          3.59      -2.33     5.81
New York          1.40      -3.12     2.60          1.40      -3.12     2.62
North Carolina    3.73      -4.35     4.39          3.73      -4.35     4.43
North Dakota      2.66      -1.60     7.43          2.66      -1.60     7.47
Ohio              2.12      -5.22     6.19          2.12      -5.22     6.23
Oklahoma          3.07      -3.51     5.89          3.07      -3.51     5.92
Oregon            3.59      -5.06     6.10          3.59      -5.06     6.14
Pennsylvania      1.58      -4.00     4.52          1.58      -4.00     4.56
Rhode Island      2.03      -4.40     6.70          2.03      -4.40     6.75
South Carolina    3.53      -4.62     5.53          3.53      -4.61     5.57
South Dakota      3.04      -2.81     5.44          3.04      -2.80     5.47
Tennessee         3.31      -4.19     5.35          3.31      -4.19     5.39
Texas             3.88      -3.68     3.83          3.88      -3.68     3.86
Utah              4.35      -3.53     6.03          4.35      -3.52     6.07
Vermont           3.04      -4.24     5.75          3.04      -4.24     5.78
Virginia          3.52      -3.67     3.56          3.52      -3.67     3.58
Washington        3.58      -3.82     8.31          3.58      -3.82     8.36
West Virginia     1.57      -3.29     11.43         1.57      -3.30     11.50
Wisconsin         2.79      -4.33     3.82          2.78      -4.33     3.85
Wyoming           3.04      -2.83     17.68         3.04      -2.82     17.80
Table 3: Estimated logistic coefficients and derivatives (posterior means)

                  Cluster 1          Cluster 2          Cluster 3          Cluster 4
                  coef.    deriv.    coef.    deriv.    coef.    deriv.    coef.    deriv.
constant          0.04     -         -0.17    -         -0.10    -         -0.03    -
oil production    -3.3     -0.39     25.5     1.00      2.7      0.15      -0.6     -0.07
manufacturing     -0.13    -0.24     -0.70    -0.92     -0.08    -0.07     0.12     0.21
agriculture       0.66     0.65      0.54     0.61      0.23     0.11      -0.43    -0.42
finance           -0.47    -0.25     -0.67    -0.40     0.31     0.07      0.06     0.03
workers comp      -0.08    -0.01     -0.52    -0.09     -0.17    -0.01     -0.01    0.00
small firms       -0.11    -0.25     0.06     0.17      -0.15    -0.16     0.09     0.18
Table 4: Estimated regime transition probabilities (posterior means)

         from 1    from 2    from 3    from 4    from 5    from 6
to 1     0.56      0         0         0         0.00      0.03
to 2     0         0.77      0         0         0.00      0.03
to 3     0         0         0.00      0         0.24      0.00
to 4     0         0         0         0.63      0.00      0.02
to 5     0.44      0.00      0.00      0.37      0.76      0.03
to 6     0.00      0.23      1.00      0.00      0.00      0.90

NOTES: $p_{ij}$ for $i, j = 1, \ldots, 4$ and $i \neq j$ were restricted a priori to be zero (entries
shown as 0 rather than 0.00).
Figure 1. Values of explanatory variables for logistic probabilities across states.
Figure 2. Posterior probabilities of aggregate regimes.
[Figure 2 panels, top to bottom: National recession; Cluster 1; Cluster 2; Cluster 3; Cluster 4.
Horizontal axis: year, 1956 to 2007. Vertical axis: posterior probability.]
Notes to Figure 2. Top panel: posterior probability that $z_t = 5$, with shaded regions
corresponding to dates of NBER recessions. Bottom panels: posterior probability that
$z_t = 1, \ldots, 4$, with shaded regions corresponding to dates for which the posterior
probability that $z_t = 5$ is greater than 0.99.
Figure 3. Posterior probabilities of cluster affiliations.
Notes to Figure 3. The color for state $n$ in panel $k$ indicates the fraction of 20,000
simulations for which $h_{nk} = 1$.