The Economic and Social Review, Vol. 21, No. 1, October, 1989, pp. 1-25
Economic Theory and Econometric Models

CHRISTOPHER L. GILBERT*
Queen Mary College and Westfield, University of London
and CEPR
The constitution of the Econometric Society states as the main objective of the society "the unification of the theoretical-quantitative and the empirical-quantitative approach to economic problems" (Ragnar Frisch, 1933, p. 1). Explaining this objective, Frisch warned that "so long as we content ourselves to statements in general terms about one economic factor having an 'effect' on some other factor, almost any sort of relationship may be selected, postulated as a law, and 'explained' by a plausible argument". Precise, realistic, but at the same time complex theory was required to "help us out in this situation" (ibid., p. 2).
Over fifty years after the foundation of the Econometric Society Hashem Pesaran stated, in an editorial which appeared in the initial issue of the Journal of Applied Econometrics, that "Frisch's call for unification of the research in economics has been left largely unanswered" (Pesaran, 1986, p. 1). This is despite the fact that propositions that theory should relate to applied models,
and that models should be based upon theory, are not enormously controversial. The simple reason is that economic theory and econometric models are relatively awkward bedfellows — much as they cannot do without each other, they also find it very difficult to live in the same house. I shall suggest that this is in part because the proposed union has been over-ambitious, and partly for that reason, neither partner has been sufficiently accommodating to the other. What is needed is the intellectual analogue of a modern marriage.

Dublin Economics Workshop Guest Lecture, Third Annual Conference of the Irish Economics Association, Carrickmacross, May 20th 1989.
*I am grateful to Jose Carbajo, Vanessa Fry, David Hendry, Stephen Nickell and Peter Phillips for comments. I maintain full responsibility for any errors.
One could look at this issue either in terms of making theory more applicable, or in terms of making modelling more theoretically-based. I shall start from the latter approach. What I have to say will therefore relate to controversies about the relative merits of different econometric methodologies, and it is apparent that increased attention has been given by econometricians to questions of methodology over the past decade. In particular, the rival methodologies associated respectively with David Hendry, Edward Leamer and Christopher Sims, and recently surveyed by Adrian Pagan (1987), generate showpiece discussions at international conferences. I do not propose here either to repeat Pagan's survey, or to advise on the "best buy" (on this see also Aris Spanos, 1988). A major issue, which underlies much of Leamer's criticism of conventional econometrics, is the status of inference in equations the specification of which has been chosen at least in part on the basis of preliminary regressions. I do not propose to pursue those questions here. I also note the obvious point that the tools we use may be in part dictated by the job to hand — hypothesis testing, forecasting and policy evaluation are different functions and it is not axiomatic that one particular approach to modelling will dominate in all three functions. This point will recur, but I wish to focus on two specific questions — how should economic theory determine the structure of the models we estimate and how should we interpret rejection of theories? I will argue that in the main we should see ourselves as using theory to structure our models, rather than using models to test theories.
As I have noted, Frisch was adamant that economic data were only interesting in relation to economic theory. Nevertheless, there was no consensus at that time as to how data should be related to theory. Mary Morgan (1987) has documented the lack of clear probabilistic foundation for the statistical techniques then in use. To a large extent this deficiency was attributable to Frisch's hostility to the use of sampling theory in econometrics. Morgan argues that econometrics was largely concerned with the measurement of constants in lawlike relationships, the existence and character of which were not in doubt. She quotes Henry Schultz (1928, p. 33), discussing the demand for sugar, as stating "All statistical devices are to be valued according to their efficacy in enabling us to lay bare the true relationship between the phenomena under question", a comment that is remarkable only in its premise that the true relationship is known. The same premise is evident in Lionel Robbins' strictures on Dr Blank's estimated demand function for herrings (Robbins, 1935) — Robbins does not wish to disagree with Blank on the form of the economic law, but only on the possibility of its quantification.
This all changed with the publication in 1944 of Trygve Haavelmo's "Probability Approach in Econometrics" (Haavelmo, 1944). Haavelmo insisted, first, that "no tool developed in the theory of statistics has any meaning — except, perhaps, for descriptive purposes — without being referred to some stochastic scheme" (ibid., p. iii); and second, that theoretical economic models should be formulated as "a restriction upon the joint variations of a system of variable quantities (or, more generally, 'objects') which otherwise might have any value or property" (ibid., p. 8). For Haavelmo, the role of theory was to offer a non-trivial, and therefore in principle rejectable, structure for the variance-covariance matrix of the data. In this may be recognised the seeds of Cowles Commission econometrics.
Haavelmo was unspecific with respect to the genesis of the theoretical restrictions, but over the post-war period economic theory has been increasingly dominated by the paradigm of atomistic agents maximising subject to a constraint set. It is true that game theory has provided a new set of models, although these game theoretic models typically imply fewer restrictions than the atomistic optimisation models which remain the core of the dominant neoclassical or neo-Walrasian "research programme". This research programme has been reinforced by the so-called "rational expectations revolution", and in Lakatosian terms this provides evidence that the programme remains "progressive".
I shall briefly illustrate this programme from the theory of consumer behaviour on the grounds that it is in this area that the econometric approach has been most successful; and also because it is an area with which all readers will be very familiar. The systems approach to demand modelling was initiated by Richard Stone's derivation of and estimation of the Linear Expenditure System (the LES; Stone, 1954). The achievement of the LES was the derivation of a complete set of demand equations which could describe the outcomes of the optimising decisions of a utility maximising consumer. We now know that the functional specification adopted by Stone, in which expenditure is a linear function of prices and money income, is highly restrictive and entails additive separability of the utility function; but this is in no way to devalue Stone's contribution.
Over the past decade, much of the work on consumer theory has adopted a more general framework in which agents maximise expected utility over an uncertain future. This gives rise to a sequence of consumption (more generally, demand) plans, one starting in each time period, with the feature that only the initial observation of each plan is realised. A common procedure, initiated in this literature by Robert Hall (1978), but most elegantly carried out by Lars Peter Hansen and Kenneth Singleton (1983), is to estimate the parameters of the model from the Euler equations¹ which link the first order conditions from the optimisation problem in successive periods. The combination exhibited both in these two recent papers and in the original Stone LES paper of theoretical derivations of a set of restrictions on a system of stochastic equations, and the subsequent testing of these restrictions using the procedures of classical statistical inference, is exactly that anticipated both in the constitution of the Econometric Society and in Haavelmo's manifesto. It therefore appears somewhat churlish that the profession, having duly arrived at this destination, should now question that this is where we wish to go. However, in terms of the constitutional objective of unifying theoretical and empirical modelling, the Haavelmo-Cowles programme forces this unification too much on the terms of the theory party.
This argument can be made in a number of respects, but almost invariably will revolve around aggregation. I shall consider two approaches to aggregation — one deriving from theoretical and the other from statistical considerations.
The three consumption-demand papers to which I have referred share the characteristic that they use aggregate data to examine the implications of theories that are developed in terms of individual optimising agents. This requires that we assume either that all individuals are identical and have the same income, or that individuals have the same preferences (although they may differ in the intercepts of their Engel curves) which, moreover, belong to the PIGL class.² In the latter and marginally more plausible case, the market demand curves may be regarded as the demand curves of a hypothetical representative agent maximising utility subject to the aggregate budget constraint. This allows interpretation of the parameters of the aggregate functions in terms of the parameters of the representative agent's utility function.
The representative agent hypothesis therefore performs a "reduction" of the parameters of the aggregate function to a set of more fundamental micro parameters. In this sense, the aggregate equations are "explained" in terms of these more fundamental parameters, apparently in the same way that one might explain the properties of the hydrogen atom in terms of the quantum mechanics equations of an electron-proton pair. This reductionist approach to explanation was discussed by Haavelmo in an important but neglected section of his manifesto entitled "The Autonomy of an Economic Relation" and which anticipates the Lucas critique (Robert Lucas, 1976).³ A relationship is autonomous to the extent that it remains constant if other relationships in the system (e.g., the money supply rule) are changed. Haavelmo explains "In scientific research — in the field of economics as well as in other fields — our search for 'explanations' consists of digging down to more fundamental relations than those which stand before us when we merely 'stand and look'" (ibid., p. 38).

1. Or orthogonality conditions implied by the Euler equations. A difficulty with these "tests" is that they may have very low power against interesting alternatives — this argument was made by James Davidson and David Hendry (1981) in relation to the tests reported in Hall (1978).
2. The Price Independent Generalized Linear class — see Angus Deaton and John Muellbauer (1980).
3. See also John Aldrich (1989).
There is however abundant evidence that attempts to make inferences about individual tastes from the tastes of a "representative agent" on the basis of aggregate time series data can be highly misleading. Thomas Stoker (1986) has emphasised the importance of distributional considerations in a micro-based application of the Linear Expenditure System to US data and Richard Blundell (1988) has reiterated these concerns. Richard Blundell, Panos Pashardes and Guglielmo Weber (1988) estimate the Deaton and Muellbauer (Deaton and Muellbauer, 1980) Almost Ideal Demand System on both a panel of micro data and on the aggregated macro data. They find substantial biases in the estimated coefficients from the aggregate relationships in comparison with the microeconometric estimates. Furthermore, and, they suggest, as a consequence of these biases, the aggregate equations exhibit residual serial correlation and reject the homogeneity restrictions. They suggest, as major causes of this aggregation failure, differences between households with and without children and the prevalence of zero purchases in the micro data. This study in particular suggests that it is difficult to claim that aggregate elasticities correspond in any very straightforward way to underlying micro-parameters.⁴
How does this leave the reductionist interpretation of aggregate equations? Pursuing further the hydrogen analogy, the demand theoretic "reduction" is flawed by the fact that we have in general no independent knowledge of the parameters of individual utility functions which would allow us to predict elasticities prior to estimation. It is therefore better to see utility theory in traditional studies as "structuring" demand estimates. In that case, the representative agent assumption is a convenient and almost Friedmanite simplifying assumption,⁵ implying that the role of theory is primarily instrumental.
In the Stoker (1986) and Blundell et al. (1988) studies, by contrast, the presence of good micro data allows a genuine test of the compatibility of the macro and micro estimates, and this allows those authors to investigate the circumstances in which a reductionist interpretation of the macro estimates is possible. Both papers conclude that aggregate estimates with no corrections for distributional effects tend to suffer from misspecified dynamics and this does imply that they will be less useful in forecasting and policy analysis. There is also a suggestion that they may vary significantly, in a Lucas (1976) manner, because government policies will affect income distribution more than they will affect household characteristics and taste parameters. Although neither set of authors makes an explicit argument, the implication appears to be that one is better off confining oneself to microeconometric data. This view seems to me to be radically mistaken.

4. This problem is even more severe in the recent study by Vanessa Fry and Panos Pashardes (1988) which adopts the same procedure in looking at tobacco purchases. Clearly, the prevalence of non-smokers implies that there is no representative consumer. But it is also the case that elasticities estimated from aggregate data may fail to reflect the elasticities of any representative smoker. This is because it is not possible to distinguish between the effect of a (perhaps tax induced) price rise in inducing smokers to smoke less and the effect, if any, of inducing smokers to give up the habit.
5. See Milton Friedman (1953); and for a recent summary of the ensuing literature, John Pheby (1988).
It has been clear ever since Lawrence Klein (1946a, b) and Andre Nataf (1948) first discussed aggregation issues in the context of production functions that the requirements for an exact correspondence between aggregate and micro relationships are enormously strong. Through the work of Terence Gorman (1953, 1959, 1968), Muellbauer (1975, 1976) and others these conditions have been somewhat weakened but remain heroic. In this light it might appear somewhat surprising that aggregate relationships do seem to be relatively constant over time and are broadly interpretable in terms of economic theory.
An interesting clue as to why this might be was provided by Yehuda Grunfeld and Zvi Griliches (1960) who asked "Is aggregation necessarily bad?" In his Ph.D. thesis, Grunfeld had obtained the surprising result that the investment expenditures of an aggregate of eight major US corporations were better explained by a two variable regression on the aggregate market value of these corporations and their aggregate stock of plant and equipment at the start of each period, than by a set of eight micro regressions in which each corporation's investment was related to its own market value and stock of plant and equipment (Grunfeld, 1958). In forecasting the aggregate, one would do better using an aggregate equation than by disaggregating and forecasting each component of the aggregate separately. Grunfeld and Griliches suggest that this may be explained by misspecification of the micro relationships. If there is even a small dependence of the micro variables on aggregate variables, this can result in better explanation by the aggregated equation than by the slightly misspecified micro equations.
This result has recently been rediscovered by Clive Granger (1987) who distinguishes between individual (i.e., agent-specific) factors and common factors in the micro equations. In the demand context, an example of a common factor would be the interest rate in a demand for a durable good. If we consider a specific agent, the role of the interest rate is likely to be quite small — whether or not a particular household purchases, say a refrigerator, in a particular period will mainly depend on whether the old refrigerator has finally ceased to function (replacement purchase) or the fact that the household unit has just been formed (new purchase). The role of the interest rate will be secondary. However, the individual factors are unlikely to be fully observed. If one is obliged to regress simply on the common factors — in this case the interest rate — the micro R² will be tiny (with a million households, Granger obtains a value of 0.001), but the aggregate equation may have a very high R² (Granger obtains 0.999) because the individual effects average out across the population.⁶
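Granger's point can be reproduced in a small simulation (an illustrative sketch with made-up parameter values, not Granger's own calculation): each household responds weakly to an observed common factor and strongly to unobserved individual factors, so household-level regressions on the common factor have negligible R² while the aggregate regression fits almost perfectly.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 200, 10_000        # time periods, households (illustrative values)
beta = 0.5                # small common-factor effect (e.g. the interest rate)

r = rng.normal(size=T)                 # observed common factor
eps = 10.0 * rng.normal(size=(T, N))   # unobserved individual factors, large variance
y = beta * r[:, None] + eps            # micro purchases

def r_squared(y, x):
    """R-squared from an OLS regression of y on x (with intercept)."""
    b = np.cov(x, y, bias=True)[0, 1] / x.var()
    resid = (y - y.mean()) - b * (x - x.mean())
    return 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))

micro_r2 = np.mean([r_squared(y[:, i], r) for i in range(100)])  # typical household
agg_r2 = r_squared(y.mean(axis=1), r)                            # aggregate equation
print(micro_r2, agg_r2)   # micro fit is negligible; aggregate fit is near-perfect
```

This mirrors footnote 6's variance argument: summed over the households the individual effects grow like n while the common component grows like n², so the latter dominates the aggregate.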
So long as the micro relationships are correctly specified and all variables (common and individual) are fully observed there is no gain through aggregation. However, once we allow parsimonious simplification strategies, the effects of these simplifications will be to result in quite different micro and aggregate relationships. Furthermore, it is not clear a priori which of these approximated relationships will more closely reflect theory. Blundell (1988) implied that when micro and aggregate relationships differ this must entail aggregation bias in the aggregate relationships. Granger's results show that theoretical effects may be swamped at the micro level by individual factors which are of little interest to the economist, and which in any case are likely to be incompletely observed, resulting in omitted variable bias in the micro equations. Microeconometrics is important, but it does not invalidate traditional aggregate time series analysis.⁷
My concern here is with the methodology of aggregate time series econometrics so I shall not dwell on the problems of doing microeconometrics. The question I have posed is how economic theory should be incorporated in aggregate models. The naive answer to this question is the reductionist route, in which the parameters of aggregate relationships are interpreted in terms of the decisions of a representative optimising agent. However, there is absolutely no reason to suppose that the aggregation assumptions required by this reduction will hold. There is little point, therefore, in using these estimated aggregate relationships to "test" theories based on the optimising behaviour of a representative agent — if we fail to reject the theory it is only because we have insufficient data.⁸
6. Strictly, the variance of the household effects is of order n, where n is the number of households, and the variance of the common factors is of order n². Hence, as the number of households becomes large, the contribution of the individual effects becomes negligible. In the converse case in which we observe the individual factors but not the common factors the R²s are reversed. See also Granger (1988).
7. Werner Hildenbrand (1983) arrives at a similar conclusion in a more specialised context. He remarks (ibid., p. 998) "There is a qualitative difference in market and individual demand functions. This observation shows that the concept of the 'representative consumer', which is often used in the literature, does not really simplify the analysis; on the contrary, it might be misleading". I am grateful to Jose Carbajo for bringing this reference to my attention.
8. I do not wish to claim that "If the sample size is large you reject everything" — see Peter Phillips (1988, p. 11).
It is now nearly ten years since Sims argued in his "Macroeconomics and Reality" (Sims, 1980) that the Haavelmo-Cowles programme is misconceived. Theoretically-inspired identifying restrictions are, he argued, simply "incredible". This is partly because many sets of restrictions amount to no more than normalisations together with "shrewd aggregations and exclusion restrictions" based on an "intuitive econometrician's view of psychological and sociological theory" (Sims, 1980, pp. 2-3); because the use of lagged dependent variables for identification requires prior knowledge of exact lag lengths and orders of serial correlation (Michio Hatanaka, 1975); and partly because rational expectations imply that any variable entering a particular equation may, in principle, enter all other equations containing expectational variables. In Sims' example, a demand for meat equation is "identified" by normalisation of the coefficient on the quantity (or value share) of meat variable to -1; by exclusion of all other quantity (or value share) variables; and by exclusion of the prices of goods considered by the econometrician to be distant substitutes for meat, or replacement of these prices by the prices of one or more suitably defined aggregates.
The VAR methodology, elaborated in a series of papers by Thomas Doan, Robert Litterman and Sims, is to estimate unrestricted distributed lags of each non-deterministic variable on the complete set of variables (Doan, Litterman and Sims, 1984; Litterman, 1986a; Sims, 1982, 1987). Thus given a set of k variables one models

$$x_{it} = \sum_{j=1}^{k} \sum_{r=1}^{n} \theta_{ijr}\, x_{j,t-r} + u_{it} \qquad (i = 1, \ldots, k) \qquad (1)$$
The objective is to allow the data to structure the dynamic responses of each variable. Obviously, however, one may wish to consider a relatively large number of variables and relatively long lag lengths, and this could result in shortage of degrees of freedom and in poorly determined coefficient estimates. Some of the early VAR articles impose "incredible" marginalisation (i.e., variable exclusion) and lag length restrictions — for example, Sims (1980) uses a six variable VAR on quarterly data with lag length restricted to four. But these restrictions are hardly more palatable than those Sims argued against in "Macroeconomics and Reality" and at least implicit recognition of this has pushed the VAR school into an adoption of a Bayesian framework. The crucial element of Bayesian VAR (BVAR) modelling is a "shrinkage" procedure in which a loose Bayesian prior distribution structures the otherwise unrestricted distributed lag estimates (Doan et al., 1984; Sims, 1987).
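Before any shrinkage is applied, the unrestricted system (1) is just k separate least squares regressions of each variable on n lags of all k variables. A minimal sketch (the function and variable names are my own, not from the VAR literature; deterministic terms are omitted):

```python
import numpy as np

def fit_var(X, n_lags):
    """Per-equation OLS for the unrestricted VAR of equation (1).
    X: (T, k) data matrix. Returns theta with theta[i, j, r-1] the
    coefficient on x_{j,t-r} in the equation for x_{i,t}."""
    T, k = X.shape
    # regressor matrix: the row for date t stacks x_{t-1}, ..., x_{t-n_lags}
    Z = np.hstack([X[n_lags - r: T - r] for r in range(1, n_lags + 1)])
    Y = X[n_lags:]
    coef, *_ = np.linalg.lstsq(Z, Y, rcond=None)      # shape (k*n_lags, k)
    return coef.T.reshape(k, n_lags, k).transpose(0, 2, 1)
```

With k variables and n lags each equation carries kn free coefficients, which is why degrees of freedom evaporate so quickly and why a shrinkage procedure becomes attractive.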
A prior distribution has two components — the prior mean and the prior variance. First consider the prior mean. If one estimates a large number of lagged coefficients, one will intuitively feel that many of them, particularly those at high lag lengths, should be small.⁹ Doan et al. (1984) formalise this intuition by specifying the prior for each modelled variable as a random walk with drift.¹⁰ This prior can be justified on the argument that, under (perhaps incredibly) strict assumptions, random walk models appear as the outcomes of the decisions of atomistic agents optimising under uncertainty (most notably, Hall, 1978); or on the heuristic argument that "no change" forecasts provide a sensible "naive" base against which any other forecasts should be compared. More formally, one can argue that collinearity is clearly a major problem in the estimation of unrestricted distributed lag models and that severe collinearity may give models which "produce erratic, poor forecasts and imply explosive behavior of the data" (Doan et al., 1984). A standard remedy for collinearity, implemented in ridge regression (Arthur Hoerl and Robert Kennard, 1970a, b),¹¹ is to "shrink" these coefficients towards zero by adding a small constant (the "ridge constant") to the diagonal elements of the data cross-product matrix.
Specification of the prior variance involves the investigator quantifying his/her uncertainty about the prior mean. The prior variance matrix will typically contain a large number of parameters, and this therefore appears a daunting task. Much of the originality of the VAR shrinkage procedure arises from the economy in specification of this matrix which, in the most simple case, is characterised in terms of only three parameters (Doan, Litterman and Sims, 1986). These are the overall tightness of the prior distribution, the rate at which the prior standard deviations decay, and the relative weight of variables other than the lagged dependent variable in a particular autoregression (with prior covariances set to zero). A tighter prior distribution implies a larger ridge constant and this results in a greater shrinkage towards the random walk model. The important feature of the Doan et al. (1984) procedure is that the tightness of the prior is increased as lag length increases. Degrees of freedom considerations are no longer paramount since coefficients associated with long
9. But note that this intuition may be incorrect if one uses seasonally unadjusted data. However, Kenneth Wallis (1974) has shown that use of seasonally adjusted data can distort the dynamics in the estimated relationships.
10. The exposition in Doan et al. (1984) is complicated and not entirely consistent. See John Geweke (1984) for a concise summary.
11. In the standard linear model $y = X\beta + u$, where $y$ and $X$ are both measured as deviations from their sample means, the ridge regression estimator of $\beta$ is $b = (X'X + kI)^{-1}X'y$, where $k$ is the ridge constant.
lag lengths, and with less important explanatory variables, are forced to be close to zero.
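The mechanics can be sketched as penalised least squares: shrink each coefficient toward a prior mean, with penalties that grow with lag length. This is an illustrative simplification of the Doan et al. scheme (the function name and the simple diagonal penalty are my own), and it reduces to the generalised ridge estimator of footnote 11 when the prior mean is zero.

```python
import numpy as np

def shrinkage_estimate(Z, y, prior_mean, penalties):
    """Minimise ||y - Z b||^2 + sum_j penalties[j] * (b[j] - prior_mean[j])^2.
    A large penalty forces b[j] to its prior mean; a zero penalty leaves it
    at OLS. In a BVAR, penalties rise with lag length, so long-lag
    coefficients are shrunk hard toward the random walk prior mean."""
    P = np.diag(penalties)
    return np.linalg.solve(Z.T @ Z + P, Z.T @ y + P @ prior_mean)
```

Degrees of freedom cease to be binding because the penalty, not the data, pins down the long-lag coefficients.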
It is often suggested that the VAR approach is completely atheoretical (see, e.g., Thomas Cooley and Stephen LeRoy, 1986). This view is given support by those VAR modellers whose activities are primarily related to forecasting and who argue that relevant economic theory is so incredible that one will forecast better with an unrestricted reduced form model (Litterman, 1986a, b; Stephen McNees, 1986).¹² However, this position is too extreme. Most simply, theory may be tested to a limited extent by examination of block exclusion (Granger causality) tests, although I would agree with Sims that, interpreted strictly, such restrictions are not in general credible. It is therefore more interesting to examine the use of VAR models in policy analysis since in this activity theory is indispensable.
Suppose one is interested in evaluating the policy impact of a shock to the money supply. One will typically look for a set of dynamic multipliers showing the impact of that shock on all the variables of interest. An initial difficulty is that in VAR models all variables are jointly determined by their common history and a set of current disturbances. This implies that it does not make sense to talk of a shock to the money supply unless additional structure is imposed on the VAR. To see this, note that the autoregressive representation (1) may be transformed into the moving average representation

$$x_{it} = \sum_{j=1}^{k} \sum_{r=0}^{\infty} a_{ijr}\, u_{j,t-r} \qquad (i = 1, \ldots, k) \qquad (2)$$

where each variable depends on the history of shocks to all the variables in the model. There are two possibilities. Take the money supply to be variable 1.
If none of the other k-1 variables in the model Granger-causes the money supply (so that $\theta_{1jr} = 0$ for all $j > 1$ and all $r$) we may identify monetary policy with the innovation $u_{1t}$ on the money supply equation and trace out the effects of these innovations on the other variables in the system. It is more likely, however, particularly given Sims' views, that all variables are interdependent at least over time. In that case analysis of the effects of monetary policy requires the identifying assumption that the monetary authorities
12. For example, Litterman (1986b, p. 26) writes in connection with business cycles, ". . . there are a multitude of economic theories of the business cycle, most of which focus on one part of a complex multifaceted problem. Most economists would admit that each theory has some validity, although there is wide disagreement over the relative importance of the different approaches." And in conjunction with the Data Resources Inc. (DRI) model investment sector, he states, "Even if one accepts the Jorgenson theory as a reasonable approach to explaining investment, the empirical implementation does not adequately represent the true uncertainty about the determinants of investment."
choose x_1t independently of the current period disturbances on the other equations. In an older terminology, this defines the first link in a Wold causal chain with money causally prior to the other variables (Herman Wold and Ragnar Bentzel, 1946; Wold and Lars Jureen, 1953). It can be implemented by renormalisation of (2) such that x_1t depends only on the policy innovations v_1t while the remaining variables depend on v_1t and also a set of innovations v_2t, . . . , v_kt orthogonal to v_1t. In the limiting case in which all the innovations are mutually orthogonal, we may rewrite (2) as
x_it  =  Σ_{j=1}^{k} Σ_{r=0}^{∞} b_ijr v_{j,t−r}        (i = 1, . . . , k)        (3)
This expression is unique given the ordering of the variables, but as Pagan (1987) notes, it is not clear a priori how the innovations v_2t, . . . , v_kt should be interpreted. The policy multipliers will depend on the causal ordering adopted, and the ordering of variables 2, . . . , k may in practice be somewhat arbitrary. We find therefore that, although in estimation VAR modellers can avoid making strong identifying assumptions, policy interpretation of their models, including the calculation of policy multipliers, requires that one make exactly the same sort of identifying assumption that Sims criticised in the Haavelmo-Cowles programme. This is the basis of Cooley and LeRoy's (1985) critique of atheoretical macroeconometrics.
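The dependence of the orthogonalised innovations on the chosen causal ordering is easy to demonstrate numerically. The sketch below is not from the lecture: the two-variable covariance matrix and the variable names are purely illustrative assumptions. It computes the contemporaneous impact matrix implied by a Cholesky factorisation (a Wold causal chain) under two orderings and shows that both reproduce the residual covariance while attributing the contemporaneous correlation differently.

```python
import numpy as np

# Hypothetical reduced-form residual covariance for a 2-variable VAR,
# ordered (money, output).  All numbers are purely illustrative.
S = np.array([[1.0, 0.6],
              [0.6, 2.0]])

def orth_impact(S, order):
    """Contemporaneous impact matrix from a Cholesky factorisation of the
    residual covariance after permuting the variables into `order`.
    The lower-triangular factor is exactly a Wold causal chain."""
    P = np.array(order)
    L = np.linalg.cholesky(S[np.ix_(P, P)])
    inv = np.argsort(P)                   # undo the permutation
    return L[np.ix_(inv, inv)]

B_money_first = orth_impact(S, [0, 1])    # money causally prior
B_output_first = orth_impact(S, [1, 0])   # output causally prior

# Both factorisations reproduce S, but they split the contemporaneous
# correlation differently, so the implied "policy" impulse responses differ.
print(np.allclose(B_money_first @ B_money_first.T, S))   # True
print(np.allclose(B_money_first, B_output_first))        # False
```

The first ordering makes variable 1 respond only to its own innovation; the second makes variable 2 causally prior, which is precisely Cooley and LeRoy's point about the arbitrariness of atheoretical identification.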
As a criticism of Sims, this is too strong. Note first that in his applied work, Sims does not restrict himself to orthogonalisation assumptions as in (3), but is willing to explore a wider class of identifying restrictions which are not dissimilar to those made by structural modellers (see Sims, 1986). Moreover, he allows himself to search over different sets of identifying assumptions in order to obtain plausible policy multipliers. However, the sets of assumptions he explores all generate just identified models with the implication that they are all compatible with the same reduced form. This permits a two stage procedure in which at the first stage the autoregressive representation (1) is estimated, and at the second stage this representation is interpreted in terms of economic theory by the imposition of identifying assumptions on the moving average representation. The identifying assumptions may be controversial, but they do not contaminate estimation.
Although it is not true that VAR modelling is completely atheoretical, the philosophy of the VAR approach may be caricatured as attempting to limit the role of theory in order to obtain results which are as objective as possible and as near as possible independent of the investigator's theoretical beliefs or prejudices. An alternative approach, associated with what I have called elsewhere (Gilbert, 1989) the LSE (London School of Economics) methodology, is to use theory to structure models in a more or less loose way so as to obtain a model whose general interpretation is in line with theory but whose detail is determined by the data. The instrument for ensuring coherence with the data is classical testing methodology.
This immediately prompts the question: what constitutes a test of a theory which we regard as at best approximate? I have noted that it does not usually make much sense to suppose that we can sensibly use classical testing procedures to attempt to reject theories based on the behaviour of atomistic optimising agents on aggregate economic data, since there is no reason to suppose that those theories apply precisely on aggregate data. There are in practice two interesting questions. The first is whether a given theory is or is not too simple relative both to the data and for the purposes to hand. The second question is whether one theory-based model explains a given dataset better than another theory-based model.
The issue of simplification almost invariably prompts the map analogy. For example, Leamer (1978, p. 205) writes "Each map is a greatly simplified version of the theory of the world; each is designed for some class of decisions and works relatively poorly for others". Simplification is forced upon us by the fact that we have limited comprehension, and, more acutely in time series studies, by limited numbers of observations. As the amount of data available increases, we are able to entertain more complicated models, but this is not necessarily a benefit if we are interested in investigating relatively simple theories, since the additional complexity may then largely take the form of nuisance parameters. Frequently, the increased model complexity will take the form of inclusion of more variables — i.e., revision of the marginalisation decision — and this can be tested using conventional classical nested techniques. The important question is whether omission of these factors results in biased coefficient values and incorrect inference in relation to the purposes of the investigation. The tourist and the geologist will typically use different maps, but the tourist may wish to know if there are steep gradients on his/her route, and questions of access are not totally irrelevant to the geologist.
The obvious trade-off in the sort of samples we frequently find ourselves analysing in time series macroeconometrics is between reduction in bias through the inclusion of additional regressors and reduced precision through the reduction in degrees of freedom and increase in collinearity. Short samples of aggregate data can only relate to simple theories since they only contain a limited amount of information. Macroeconometric models will therefore be more simple than the world they purport to represent. This does not particularly matter, but it does imply that we must always be aware that previously neglected factors may become important — an obvious example is provided by the role of inflation in the consumption function.
13. Heterogeneity may imply that these theories also fail to hold on micro data.
14. Phillips (1988, p. 28) notes that it is implicit in the Hendry methodology that the number k of regressors grows with the sample size T in such a way that k/T → 0 as T → ∞.
Two strategies are currently available for controlling for structural non-constancy. VAR modellers advise use of random coefficient vector autoregressions in which the model coefficients all evolve as random walks (Doan et al., 1984). In principle, this leads to very high dimensional models, but imposition of a tight Bayesian prior distribution heavily constrains the coefficient evolution and permits estimation. This procedure automates control for structural constancy, since the modeller's role is reduced to choice of the tightness parameters of the prior. A disadvantage is that it cannot ever prompt reassessment of the marginalisation decision — i.e., inclusion of previously excluded or unconsidered regressor variables.
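The mechanics of the random-walk-coefficient device can be illustrated with a minimal scalar Kalman filter. Everything below is a constructed example, not the Doan et al. procedure itself: the "tightness" is the variance q of the coefficient's random-walk innovation, and shrinking q towards zero pins the coefficient down to near-constancy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Artificial data: y_t = beta_t * x_t + e_t with a slowly drifting slope.
T = 200
x = rng.normal(size=T)
beta_true = 1.0 + np.cumsum(rng.normal(scale=0.02, size=T))
y = beta_true * x + rng.normal(scale=0.5, size=T)

def rw_coefficient_filter(y, x, q, r=0.25, b0=0.0, p0=10.0):
    """Kalman filter for y_t = b_t x_t + e_t with b_t = b_{t-1} + w_t,
    Var(w_t) = q (the prior "tightness") and Var(e_t) = r.
    Driving q towards zero forces the coefficient to be near-constant."""
    b, p, path = b0, p0, []
    for t in range(len(y)):
        p = p + q                            # predict the random-walk state
        k = p * x[t] / (x[t] ** 2 * p + r)   # Kalman gain
        b = b + k * (y[t] - x[t] * b)        # update on the one-step error
        p = (1.0 - k * x[t]) * p
        path.append(b)
    return np.array(path)

loose = rw_coefficient_filter(y, x, q=1e-3)    # coefficient free to evolve
tight = rw_coefficient_filter(y, x, q=1e-10)   # effectively constant
```

With the loose prior the filtered coefficient path continues to adapt throughout the sample; with the tight prior it settles down almost immediately, which is the sense in which the modeller's choice of tightness does the work.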
An alternative approach which is gaining increasing support is the use of recursive regression methods to check for structural constancy. In recursive regression one uses updating formulae, first worked out by Timo Terasvirta (1970), to compute the regression of interest for each subsample [1, t] for t = T1, . . . , T where T is the final observation available and T1 is of the order of three times the number of regressor variables (see Hendry, 1989, pp. 20-21). This produces a large volume of output which is difficult to interpret except by graphical methods. Use of recursive methods had therefore to wait until PC technology allowed easy and costless preparation of graphs. It is now computationally trivial to graph Chow tests (Gregory Chow, 1960) for all possible structural breaks, or for one period ahead predictions for all periods within the [T1, T − 1] interval. Also one can plot coefficient estimates against sample size. Although these graphical methods do not imply any precise statistical tests, they show up structural non-constancy of either the break or evolution form in an easily recognisable form, and prompt the investigator to ask why a particular coefficient is moving through the sample, or why a particular observation is exerting leverage on the coefficient estimates. These questions should then prompt appropriate model respecification. I am not aware that Leamer has ever advised use of recursive methods, but they do appear to be very much in the spirit of his concern with fragility in regression estimates (see Leamer and Hermann Leonard, 1983), even if the proposed data-based "solution" is not one he would favour.
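A minimal sketch of the recursive-regression idea follows. The data are entirely artificial (a deliberate slope break is built in), and for clarity each subsample regression is simply refitted by OLS rather than computed with Terasvirta's updating formulae. Plotting the recursive slope estimates against sample size would show the drift that flags the break.

```python
import numpy as np

rng = np.random.default_rng(1)

# Artificial data with a structural break in the slope at t = 60.
T = 120
X = np.column_stack([np.ones(T), rng.normal(size=T)])
slope = np.where(np.arange(T) < 60, 1.0, 2.5)
y = 0.5 * X[:, 0] + slope * X[:, 1] + rng.normal(scale=0.3, size=T)

k = X.shape[1]
T1 = 3 * k    # start once roughly three observations per regressor are in

slopes, recursive_resid = [], []
for t in range(T1, T):
    b = np.linalg.lstsq(X[:t], y[:t], rcond=None)[0]   # OLS on [1, t]
    slopes.append(b[1])
    recursive_resid.append(y[t] - X[t] @ b)            # one-step-ahead error

slopes = np.array(slopes)
# The recursive slope sits near 1.0 until post-break observations arrive
# and then drifts upward; the one-step-ahead residuals jump at the break.
print(slopes[0], slopes[-1])
```

The one-step-ahead recursive residuals are the raw material for the Chow-type break graphs described in the text.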
Level of complexity is therefore primarily a matter of sample size. The more interesting questions arise from comparison of alternative and incompatible simple theories which share the same objectives. Although maps may differ only in the selection of detail to represent, they may also differ because one or other map incorrectly represents certain details. In such cases we are required to make a choice. There is now a considerable body of both econometric theory and of experience in non-nested hypothesis testing. Suppose we have two alternative and apparently congruent models A and B. Suppose initially model A (say a regression of y on X) gives the "correct" representation of the economic process under consideration. This implies that the estimates obtained by incorrectly supposing model B (regression of y on Z) to be true will suffer from misspecification bias. Knowledge of the covariance of the X and Z variables allows this bias to be calculated. Thus if A is true, it allows the econometrician to predict how B will perform; but if A does not give a good representation of the economic process, it will not be able to "explain" the model B coefficients. Furthermore, we can reverse the entire procedure and attempt to use model B to predict how A will perform.
These non-nested hypothesis tests, or encompassing tests as they are sometimes called, turn out to be very simple to perform. One forms the composite but quite possibly economically uninterpretable hypothesis A ∪ B which in the case discussed above is the regression of y on both X and Z (deleting one occurrence of any variable included in both X and Z), and then performs the standard F tests of A and B in turn against A ∪ B. Four outcomes are possible. If one can accept the absence of the Z variables in the presence of the X variables, but not vice versa (i.e., E[y|X,Z] = Xα), model A is said to encompass model B; equally, model B may encompass model A (E[y|X,Z] = Zβ). But two other outcomes are possible. If one cannot accept either E[y|X,Z] = Xα or E[y|X,Z] = Zβ, neither hypothesis may be maintained. Finally, one might be able to accept both E[y|X,Z] = Xα and E[y|X,Z] = Zβ, in which case the data are indecisive. This relates to Thomas Kuhn's view that a scientific theory will not be rejected simply because of anomalies, but rather because some of these anomalies can be explained by a rival theory (Kuhn, 1962).
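The mechanics of these F tests can be sketched numerically. The data-generating choices below (model A true, X and Z correlated through a common component) are illustrative assumptions, not taken from the lecture.

```python
import numpy as np

rng = np.random.default_rng(2)

# Model A (y on X) is "true"; the rival regressor Z is correlated with X
# through a common component w.  A constructed example throughout.
n = 200
w = rng.normal(size=n)
x = w + rng.normal(scale=0.5, size=n)
z = w + rng.normal(scale=0.5, size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)

def rss(y, regressors):
    """Residual sum of squares from OLS with an intercept."""
    X = np.column_stack([np.ones(len(y))] + regressors)
    b = np.linalg.lstsq(X, y, rcond=None)[0]
    e = y - X @ b
    return e @ e

rss_A = rss(y, [x])        # model A
rss_B = rss(y, [z])        # model B
rss_AB = rss(y, [x, z])    # composite hypothesis A U B
df = n - 3                 # intercept plus two slopes

# F tests of A and B in turn against the composite regression:
F_drop_z = (rss_A - rss_AB) / (rss_AB / df)   # small: A encompasses B
F_drop_x = (rss_B - rss_AB) / (rss_AB / df)   # large: B is rejected
print(F_drop_z, F_drop_x)
```

Here the restriction deleting Z is acceptable while the restriction deleting X is not, so model A encompasses model B; the other three outcomes in the text correspond to the remaining accept/reject patterns of the two F statistics.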
The LSE procedure may be summarised as an attempt to obtain a parsimonious representation of a general unrestricted equation. This representation should simultaneously satisfy a number of criteria (Hendry and Jean-Francois Richard, 1982, 1983). First, it must be an acceptable simplification of the unrestricted equation either on the basis of a single F test against the unrestricted equation, or on the basis of a sequence of such tests. Second, it should have serially independent errors. Third, it must be structurally constant.
15. This is "coefficient encompassing". A more limited question ("variance encompassing") is whether we can explain the residual variances. Coefficient encompassing implies variance encompassing, but not vice versa (Mizon and Richard, 1986).
16. Usually this will involve OLS estimation of single equations, but the same procedures may be adopted in simultaneous models using appropriate estimators.
17. See Grayham Mizon (1977).
I will return to the error correction specification shortly. The model discovery activity takes place in part in the parsimonious simplification activity,
w h i c h t y p i c a l l y involves the i m p o s i t i o n o f zero or equality restrictions on sets
of coefficients, and also i m p o r t a n t l y i n reviewing the marginalisation (variable
exclusion) decisions. Parsimonious simplification may be regarded as i n large
measure a t i d y i n g up operation w h i c h does l i t t l e to affect equation f i t , controls
for collinearity and thereby improves forecasting performance, and at worst
results i n exaggerated estimates o f coefficient precision (since coefficients
w h i c h are approximately zero or equal are set to be exactly zero or e q u a l ) .
Pravin Trivedi (1984) has coined the term "testimation" to describe the "empirical specification search involving a blend of estimation and significance tests". Importantly, parsimonious simplification conserves degrees of freedom and in this respect it is not dissimilar to the shrinkage procedure adopted in VAR modelling, the difference being mainly whether one imposes strong restrictions on a set of near zero coefficients (LSE), or weaker restrictions on the entire set of coefficients (VAR). It does not seem to me that there is any strong basis for suggesting that one method has statistical properties superior to the other. VAR modellers argue that their models have superior forecasting properties, but LSE modellers would reply that their methods tend to be more robust with respect to structural change. This is not an argument that can be settled on an a priori basis.
Opening up the marginalisation question is of greater importance. If a variable which is of practical importance is omitted from the model, perhaps because its presence is not indicated by the available theory, this omission is likely to cause biased coefficient estimates and either serially correlated residuals or over-complicated estimated dynamics. In the former case, one might be tempted to estimate using an appropriate autoregressive estimator, which is tantamount to regarding the autoregressive coefficients as nuisance parameters; while in the latter one will obtain the same result via unrestricted estimates of the autoregressive equation. The alternative, which is familiar to all of us, is to take the residual serial correlation as prompting the question of whether the model is well-specified, and in particular, whether important variables have been omitted. Subsequent discovery that this is indeed the case may either indicate a need to extend or revise the underlying theory, or more simply suggest the observation that the theory offers only a partial explanation of the data. In the latter case, the additional variables introduced into the model may perhaps be legitimately regarded as nuisance variables, but in the former case the two way interaction between theory and data will have a clear positive value.
18. Contrast Leamer (1985) who describes the LSE methodology as "a combination of backward and forward stepwise (better known as unwise) regression . . . The order for imposing the restrictions and the choice of significance level are arbitrary . . . What meaning should be attached to all of this?"
The feature of the LSE approach on which I wish to concentrate is the role of cointegration and the prevalence of the error correction specification. The error correction specification is an attempt to combine the flexibility of time series (Box-Jenkins) models in accounting for short term dynamics with the theory-compatibility of traditional structural econometric models (see Gilbert, 1989). In this specification both the dependent variable and the explanatory variables appear as current and lagged differences (sometimes as second differences or multi-period differences), as in Box-Jenkins models, but unlike those models, the specification also includes a single lagged level of the dependent variable and a subset of the explanatory variables. For example, a stylised version of the Davidson et al. (1978) consumption function may be written as
Δ4 ln c_t  =  θ0 + θ1 Δ4 ln y_t + θ2 Δ1 Δ4 ln y_t − θ3 (ln c_{t−4} − ln y_{t−4})        (4)
where, on quarterly data, annual changes in consumption are related to annual changes in income and a four quarter lagged discrepancy between income and consumption. It is these lagged levels variables which determine the steady state solution of the model. It will frequently be found that augmentation of pure difference equations by lagged levels terms in this way has a dramatic effect on forecasts and on estimated policy responses.
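As a constructed illustration (the parameter values are assumed, not those of Davidson et al.), one can generate quarterly data from a stylised version of (4) and recover the error correction coefficient by OLS:

```python
import numpy as np

rng = np.random.default_rng(3)

# Quarterly log income as a random walk with drift; log consumption
# generated from a stylised version of (4) with assumed parameters
# (theta_1, theta_2, theta_3) = (0.5, 0.1, 0.2).
T = 400
ly = np.cumsum(0.005 + rng.normal(scale=0.01, size=T))
lc = np.empty(T)
lc[:5] = ly[:5]                       # start on the long-run path
for t in range(5, T):
    d4y = ly[t] - ly[t - 4]
    dd4y = d4y - (ly[t - 1] - ly[t - 5])
    lc[t] = (lc[t - 4] + 0.01 + 0.5 * d4y + 0.1 * dd4y
             - 0.2 * (lc[t - 4] - ly[t - 4]) + rng.normal(scale=0.005))

# OLS estimation of the error correction equation (4)
rows, target = [], []
for t in range(5, T):
    d4y = ly[t] - ly[t - 4]
    dd4y = d4y - (ly[t - 1] - ly[t - 5])
    rows.append([1.0, d4y, dd4y, lc[t - 4] - ly[t - 4]])
    target.append(lc[t] - lc[t - 4])
b = np.linalg.lstsq(np.array(rows), np.array(target), rcond=None)[0]
print(b)   # the error correction coefficient b[3] should be close to -0.2
```

The lagged level term is what ties consumption back to income in the steady state; dropping it would leave a pure difference equation with no long-run solution.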
It is always possible to reparameterise any unrestricted distributed lag equation specified in levels (e.g., a VAR) into the error correction form, so it may appear odd to claim any special status for this way of writing distributed lag relationships. Note however that the LSE procedure implicitly prohibits parsimonious simplification of the unrestricted equation into a Box-Jenkins model in which the lagged level of the dependent variable is excluded, even if this exclusion would result in negligible loss in fit. In this sense, the specification is non-trivial. That it is an interesting non-trivial specification depends on the claim that economic theory implies a set of comparative static results which are reflected in long-run constancies, and is reinforced by the logically independent but incorrect claim that economic theory tells us little about short-term adjustment processes.
19. George Box and Gwilym Jenkins (1970).
20. In the steady state solution all the differenced variables are set to zero. The steady state growth solution, in which all the differenced variables are set to appropriate constants, is often more informative — see James Davidson, David Hendry, Frank Srba and Stephen Yeo (1978), and Gilbert (1986, 1989).
21. See for example Phillips (1988, p. 19): "In macroeconomics, theory usually provides little information about the process of short run adjustment".
The earliest error correction specification was Denis Sargan's (1964) wage
model in which the rate of increase in wage rates was related to the difference between the lagged real wage and a notional target real wage. Here there is a straightforward structural interpretation of the error correction term. More recently, however, the generality of the specification has received support from the Granger representation theorem (Robert Engle and Granger, 1987) which states that if there exists a stationary linear combination of a set of non-stationary variables (i.e., if the variables are "cointegrated") then these variables must be linked by at least one relationship which can be written in the error correction form. (If this were not the case, the variables would increasingly diverge over time.) Since most macroeconomic aggregates are non-stationary (typically they grow over time) any persisting (autonomous) relationship between aggregates over time is likely to be of the error correction form.
Cointegration therefore provides a powerful reason for supposing that there will exist structurally constant relationships between macroeconomic aggregates. If economic time series are non-stationary but cointegrated there are strong arguments for imposing the error correction structure on our models, and it is an advantage of the LSE methodology over the VAR methodology that it adopts this approach. A major role for economic theory in the LSE methodology is to aid the specification of the cointegrating term. Unsurprisingly, short samples of relatively high frequency (quarterly or monthly) data are often relatively uninformative about the long-run relationship between the variables, so that theoretically unmotivated specification of these terms gives little precision or discrimination between alternative specifications. One possibility, suggested by Engle and Granger (1987), is a two stage procedure where at the first stage one estimates the static ("cointegrating") regression ignoring the short-term dynamics, and at the second stage one imposes these long-run coefficients on the dynamic error correction model. However, Monte Carlo investigation suggests that this procedure has poor properties (Anindya Banerjee, Juan Dolado, David Hendry and Gregor Smith, 1986) and that it is preferable to attempt to estimate the long-run solution from the dynamic adjustment equation as in the initial Sargan (1964) wage model and the Davidson et al. (1978) consumption function model. Nevertheless, the long-run solution may still be poorly determined, implying that theoretical restrictions are unlikely to be rejected.
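The two-stage procedure can be sketched on artificial cointegrated data; all parameter values below are assumptions made for the illustration, not results from the studies cited.

```python
import numpy as np

rng = np.random.default_rng(4)

# Artificial cointegrated pair: ly is I(1); lc error-corrects towards it
# with an assumed feedback of 0.3 and short-run response of 0.6.
T = 300
ly = np.cumsum(rng.normal(scale=0.1, size=T))
lc = np.empty(T)
lc[0] = ly[0]
for t in range(1, T):
    lc[t] = (lc[t - 1] + 0.6 * (ly[t] - ly[t - 1])
             - 0.3 * (lc[t - 1] - ly[t - 1]) + rng.normal(scale=0.05))

# Stage 1: static "cointegrating" regression, ignoring the dynamics
X = np.column_stack([np.ones(T), ly])
a = np.linalg.lstsq(X, lc, rcond=None)[0]
u = lc - X @ a                        # estimated equilibrium errors

# Stage 2: impose the long-run coefficients; estimate the dynamics
dlc, dly, u1 = np.diff(lc), np.diff(ly), u[:-1]
Z = np.column_stack([np.ones(T - 1), dly, u1])
g = np.linalg.lstsq(Z, dlc, rcond=None)[0]
print(a[1], g[2])   # long-run slope near 1; error correction term near -0.3
```

The alternative preferred in the text is to estimate everything in one step from the dynamic equation, as in the ECM regression above equation (4), rather than conditioning on the stage-one residuals.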
The theoretical status of the short-run dynamics in the LSE parsimoniously simplified equations is more problematic and here economic theory has as yet been less helpful. Hendry and Richard (1982, 1983) describe the modelling exercise as an attempt to provide a characterisation of what they call the "Data Generating Process" (the DGP) which is the joint probability distribution of the complete set of sample data (endogenous and exogenous variables). Actual DGPs, they suggest, will be very complicated, but the combination of marginalisation (exclusion of variables that do not much matter), conditioning (regarding certain variables as exogenous) and simplification which together make up the activity of modelling can give rise to simple and structurally constant representations of the DGP.
The DGP concept derives from Monte Carlo analysis where the investigator specifies the process which will generate the data to be used in the subsequent estimation experiments. This suggests an analogy in which we suppose a fictional statistician choosing the data that we analyse in applied economics. In a pioneering contribution to the Artificial Intelligence literature, Alan Turing (1950) asked whether an investigator could infallibly distinguish which of two terminals is connected to a machine and which operated by a human. Hendry dares us to claim that we can distinguish between Monte Carlo and real world economic data. If we cannot, the DGP analogy carries over, and we can hope to discover structural short-term dynamics.
This argument appears to me to be flawed. If macroeconomic data do exhibit constant short-term dynamics then one might expect any structural interpretation to relate to the parameters of the adjustment processes of the optimising agents. But we have seen that the aggregation conditions required for the aggregate parameters to be straightforwardly interpretable in terms of the microeconomic parameters are heroic. With Monte Carlo data, by contrast, we can be confident that there does exist a simple structure since the structure has been imposed by a single simple investigator. There are no aggregation issues, and the question of reduction does not arise.
The most promising route for rationalising the dynamics of LSE equations is in terms of the backward representation of a forward looking optimising model. In simple models, optimising behaviour in the presence of adjustment costs will give rise to a second order difference equation which can be solved to give a lagged (partial) adjustment term and a forward lead on the expected values of the exogenous variables. But these future expected values may always be solved out in terms of past values of the exogenous variables, giving a backward looking representation. Stephen Nickell (1985) showed that in a number of cases of interest, this backward representation will have the error correction form, and this suggests that it may in general be possible to rationalise error correction models in these terms (Keith Cuthbertson, 1988). An implication of this view, via the Lucas (1976) critique, is that if the process followed by any of the exogenous variables changes, the backward looking relationship will be structurally non-constant while the forward looking representation will remain constant. Current experience, however, is that in these circumstances it is the forward looking equation that is non-constant (Hendry, 1988; Carlo Favero, 1989).
22. Strictly "at least weakly exogenous" — see Engle, Hendry and Richard (1983).
An alternative approach is to regard the short-term dynamics in LSE relationships as nuisance terms. Direct estimation of the cointegrating relationships is inefficient because of the residual serial correlation resulting from the omitted dynamics and may be inconsistent because of simultaneity. It is possible that in part these omitted dynamics arise from aggregation across heterogeneous agents (Marco Lippi, 1988). In principle, one could estimate using a systems maximum likelihood (ML) estimator taking into account the serial correlation (Soren Johansen, 1988; Soren Johansen and Katarina Juselius, 1989), but there is advantage in using a single equations estimator since this localises any misspecification error. The single equations estimator must correct both for the simultaneity and for the serial correlation. In recent work Phillips (1988) has argued that LSE dynamic equations often come very close to, and sometimes achieve, optimal estimation of the cointegrating relationship. On this interpretation, the short-run dynamic terms in those equations are simply the simultaneity and Generalised Least Squares (GLS) adjustments, in the same way that one can rewrite the familiar Cochrane-Orcutt autoregressive estimator (Donald Cochrane and Guy Orcutt, 1949) in terms of a restricted OLS estimation of an equation containing lagged values of the dependent variable and the regressor variables (Hendry and Mizon, 1978). An implication is that we have come full circle back to pre-Haavelmo econometrics where the concern was the measurement of constants in lawlike relationships, which in modern terminology are simply the cointegrating relationships.
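The Cochrane-Orcutt point can be illustrated directly: an AR(1) error model is the special case of an ADL(1,1) equation in which the coefficient on the lagged regressor equals minus the product of the other two, the "common factor" restriction of Hendry and Mizon. The simulation below is a constructed example with assumed parameter values.

```python
import numpy as np

rng = np.random.default_rng(5)

# y_t = 2 x_t + u_t with AR(1) errors u_t = 0.7 u_{t-1} + e_t,
# exactly the case the Cochrane-Orcutt estimator assumes.
T = 5000
x = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.7 * u[t - 1] + rng.normal(scale=0.5)
y = 2.0 * x + u

# Unrestricted OLS of the ADL(1,1) equation: y_t on y_{t-1}, x_t, x_{t-1}
X = np.column_stack([y[:-1], x[1:], x[:-1]])
rho_hat, beta_hat, lag_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]

# The common-factor (COMFAC) restriction: the coefficient on x_{t-1}
# equals minus the product of the other two.
print(lag_hat, -rho_hat * beta_hat)   # both close to -1.4
```

When the restriction is acceptable, imposing it and re-estimating reproduces the Cochrane-Orcutt estimator; when it is rejected, the unrestricted dynamic equation carries information that the autoregressive correction would have suppressed.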
However, this is to miss much of the point of the methods generated by Sargan, Hendry and their colleagues. Routine forecasting and policy analysis in econometrics is as much or more concerned with short-term movements in key variables than with their long-term equilibria. Furthermore, short-term (derivative) responses are generally very much better determined than long-term relationships. I argued in Gilbert (1989) that a substantial part of the motivation of the LSE tradition in econometrics was the perceived challenge to "white box" econometric models from "black box" time series (Box-Jenkins) models (Richard Cooper, 1972; Charles Nelson, 1972). The same points are true in relation to the development of VAR methodology. Practitioners of the LSE approach are unlikely, therefore, to recognise themselves in Phillips' description.
At the start of his famous 1972 survey "Lags in Economic Behavior", Marc Nerlove quoted Schultz (1938) as saying "Although a theory of dynamic economics is still a thing of the future, we must not be satisfied with the status quo in economics". Nerlove then went on to remark that "dynamic economics is still, in large part, a thing of the future" (Nerlove, 1972, p. 222). The rational expectations optimising models, examples of which I have already discussed, have constituted a major attempt to provide that dynamic theory. They have not been wholly successful, for the reasons I have indicated. Neither have they been wholly unsuccessful. A possible criticism of both the VAR and LSE approaches to modelling aggregate macro-dynamics is that they do not make any attempt to accommodate these theories. An alternative possibility is to argue that the problem is on the theorists' side, and that the rational expectations atomistic optimising models deliver models which are too simple even to be taken as reasonable approximations. The problem is, nevertheless, that however much the anomalies multiply, we are unlikely to abandon these theories until an alternative paradigm becomes available. Sadly, I do not see any indication that such a development is imminent.
I started this lecture by recalling a commitment to unify theory and empirical modelling. That programme has recorded a measure of success, but to a large extent that success has been in the modelling of long-term equilibrium relationships. When Nerlove surveyed the methods of dynamic economics, the contribution of theory was relatively new and relatively slight. We now have much better developed and more securely based theories of dynamic adjustment but these theories have been too simple to inform practical modelling. It is obviously possible to argue that this is the fault of the econometricians, and the level of discord among the econometricians might be held as evidence for this view. My suspicion is, however, that the current disarray in the econometric camp is the consequence of the lack of applicable theory. Where we have informative and detailed theories, as for example in demand analysis or the theory of financial asset prices, methodological debates are muted. If the theorists can develop realistic but securely based dynamic theories, then the competing approaches to econometric methodology could coexist quite happily throughout macroeconometrics.
I have made a number o f different arguments i n the course o f this paper,
so a brief summary may be useful.
1. I agree with the currently widely held view that it is not possible in general to estimate parameters of micro functions from aggregate data.
2. I disagree with the implied view that aggregate relationships cannot be interpreted in terms of microeconomic theory. The appropriate level of aggregation will depend both on the purpose of the modelling exercise and on the questions being asked.
3. Theoretical restrictions should not be expected to hold precisely on aggregate data. This implies that classical rejections cannot per se be taken to imply rejection of the theories in question.
4. Classical techniques of non-nested hypothesis testing provide a method for discriminating between alternative imprecise theories.
5. It is difficult to argue a priori that Bayesian shrinkage procedures have either superior or inferior statistical properties to the pseudo-structural methods associated with the British approach to dynamic modelling. An advantage of the latter approach is however that it gives a central role to model discovery, which may allow a beneficial feedback from data to theory.
6. Cointegration provides a powerful reason for believing that macroeconomic aggregates will be linked by structurally stable relationships, and it is an important advantage of the British approach that it embodies this feature of economic time series through error correction. However, the argument that the British approach to dynamic modelling should be seen as simply a method of efficiently estimating these equilibrium relationships is misconceived.
7. The progress in estimating equilibrium relationships has not been matched by comparable progress in estimating dynamic adjustment processes, where theory and data appear to be quite starkly at odds. A possible response is that the existing optimising theories are just too simple.
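The error-correction formulation invoked in point 6 can be illustrated by a stylised two-variable example (the notation here is illustrative only, and is not drawn from any particular study cited above):

```latex
\Delta y_t = \beta\,\Delta x_t \;-\; \alpha\left(y_{t-1} - \gamma x_{t-1}\right) + \varepsilon_t,
\qquad 0 < \alpha < 1 .
```

The bracketed term is the lagged deviation from the long-run relationship $y = \gamma x$. If $y_t$ and $x_t$ are cointegrated, this deviation is stationary, and $\alpha$ measures the speed at which $y$ adjusts back towards equilibrium; the short-run dynamics and the long-run relationship are thereby embodied in a single equation.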
REFERENCES

ALDRICH, JOHN, 1989. "Autonomy", Oxford Economic Papers, Vol. 41, pp. 15-34.
BANERJEE, ANINDYA, JUAN J. DOLADO, DAVID F. HENDRY, and GREGOR W. SMITH, 1986. "Exploring Equilibrium Relationships in Econometrics through Static Models: Some Monte Carlo Evidence", Oxford Bulletin of Economics and Statistics, Vol. 48, pp. 253-277.
BLUNDELL, RICHARD, 1988. "Consumer Behaviour: Theory and Empirical Evidence", Economic Journal, Vol. 98, pp. 16-65.
BLUNDELL, RICHARD, PANOS PASHARDES, and GUGLIELMO WEBER, 1988. "What do we Learn about Consumer Demand Patterns from Micro-Data?", Institute for Fiscal Studies, Working Paper No. W88/10.
BOX, GEORGE E.P., and GWILYM M. JENKINS, 1970. Time Series Analysis: Forecasting and Control, San Francisco: Holden Day.
CHOW, GREGORY C., 1960. "Tests of Equality Between Sets of Coefficients in Two Linear Regressions", Econometrica, Vol. 28, pp. 591-605.
COCHRANE, DONALD, and GUY H. ORCUTT, 1949. "An Application of Least Squares Regression to Relationships Containing Auto-Correlated Error Terms", Journal of the American Statistical Association, Vol. 44, pp. 32-61.
COOLEY, THOMAS F., and STEPHEN F. LEROY, 1985. "Atheoretical Macroeconometrics: A Critique", Journal of Monetary Economics, Vol. 16, pp. 283-308.
COOPER, RICHARD L., 1972. "The Predictive Performance of Quarterly Econometric Models of the United States", in Bert Hickman (ed.), Econometric Models of Cyclical Behavior, NBER Studies in Income and Wealth, Vol. 36, pp. 813-926, New York: Columbia University Press.
CUTHBERTSON, KEITH, 1988. "The Demand for M1: A Forward-Looking Buffer Stock Model", Oxford Economic Papers, Vol. 40, pp. 110-131.
DAVIDSON, JAMES E.H., and DAVID F. HENDRY, 1981. "Interpreting Econometric Evidence: The Behaviour of Consumers' Expenditure in the UK", European Economic Review, Vol. 16, pp. 177-192.
DAVIDSON, JAMES E.H., DAVID F. HENDRY, FRANK SRBA, and STEPHEN YEO, 1978. "Econometric Modelling of the Aggregate Time Series Relationship Between Consumers' Expenditure and Income in the United Kingdom", Economic Journal, Vol. 88, pp. 661-692.
DEATON, ANGUS S., 1974. "A Reconsideration of the Empirical Implications of Additive Preferences", Economic Journal, Vol. 84, pp. 338-348.
DEATON, ANGUS S., and JOHN MUELLBAUER, 1980. "An Almost Ideal Demand System", American Economic Review, Vol. 70, pp. 312-326.
DOAN, THOMAS, ROBERT B. LITTERMAN, and CHRISTOPHER A. SIMS, 1984. "Forecasting and Conditional Projection Using Realistic Prior Distributions", Econometric Reviews, Vol. 3, pp. 1-100.
ENGLE, ROBERT F., and CLIVE W.J. GRANGER, 1987. "Co-Integration and Error Correction: Representation, Estimation and Testing", Econometrica, Vol. 55, pp. 251-276.
ENGLE, ROBERT F., DAVID F. HENDRY, and JEAN-FRANCOIS RICHARD, 1983. "Exogeneity", Econometrica, Vol. 51, pp. 277-304.
FAVERO, CARLO, 1989. "Testing for the Lucas Critique: An Application to Consumers' Expenditure", University of Oxford, Applied Economics Discussion Paper #73.
FRIEDMAN, MILTON, 1953. "The Methodology of Positive Economics", in Milton Friedman, Essays in Positive Economics, Chicago: University of Chicago Press, pp. 3-43.
FRISCH, RAGNAR, 1933. "Editorial", Econometrica, Vol. 1, pp. 1-4.
FRY, VANESSA C., and PANOS PASHARDES, 1988. "Non-Smoking and the Estimation of Household and Aggregate Demands for Tobacco", Institute for Fiscal Studies, processed.
GEWEKE, JOHN, 1984. "Comment", Econometric Reviews, Vol. 3, pp. 105-112.
GILBERT, CHRISTOPHER L., 1986. "Professor Hendry's Econometric Methodology", Oxford Bulletin of Economics and Statistics, Vol. 48, pp. 283-307.
GILBERT, CHRISTOPHER L., 1989. "LSE and the British Approach to Time Series Econometrics", Oxford Economic Papers, Vol. 41, pp. 108-128.
GORMAN, W.M. (TERENCE), 1953. "Community Preference Fields", Econometrica, Vol. 21, pp. 63-80.
GORMAN, W.M. (TERENCE), 1959. "Separable Utility and Aggregation", Econometrica, Vol. 27, pp. 469-481.
GORMAN, W.M. (TERENCE), 1968. "The Structure of Utility Functions", Review of Economic Studies, Vol. 35, pp. 367-390.
GRANGER, CLIVE W.J., 1987. "Implications of Aggregation with Common Factors", Econometric Theory, Vol. 3, pp. 208-222.
GRANGER, CLIVE W.J., 1988. "Aggregation of Time Series: A Survey", Federal Reserve Bank of Minneapolis, Institute for Empirical Macroeconomics, Discussion Paper #1.
GRUNFELD, YEHUDA, 1958. "The Determinants of Corporate Investment", unpublished Ph.D. Thesis, University of Chicago.
GRUNFELD, YEHUDA, and ZVI GRILICHES, 1960. "Is Aggregation Necessarily Bad?", Review of Economics and Statistics, Vol. 42, pp. 1-13.
HAAVELMO, TRYGVE, 1944. "The Probability Approach in Econometrics", Econometrica, Vol. 12, Supplement.
HALL, ROBERT E., 1978. "Stochastic Implications of the Life Cycle-Permanent Income Hypothesis: Theory and Evidence", Journal of Political Economy, Vol. 86, pp. 971-987.
HANSEN, LARS PETER, and KENNETH J. SINGLETON, 1983. "Stochastic Consumption, Risk Aversion and the Temporal Behavior of Asset Returns", Journal of Political Economy, Vol. 91, pp. 249-265.
HATANAKA, M., 1975. "On the Global Identification of the Dynamic Simultaneous Equation Model with Stationary Disturbances", International Economic Review, Vol. 16, pp. 545-554.
HENDRY, DAVID F., 1988. "Testing Feedback versus Feedforward Econometric Specifications", Oxford Economic Papers, Vol. 40, pp. 132-149.
HENDRY, DAVID F., 1989. PC-GIVE: An Interactive Econometric Modelling System, Oxford: Institute of Economics and Statistics.
HENDRY, DAVID F., and GRAYHAM E. MIZON, 1978. "Serial Correlation as a Convenient Simplification, Not a Nuisance: A Comment on a Study of the Demand for Money by the Bank of England", Economic Journal, Vol. 88, pp. 549-563.
HENDRY, DAVID F., and JEAN-FRANCOIS RICHARD, 1982. "On the Formulation of Empirical Models in Dynamic Econometrics", Journal of Econometrics, Vol. 20, pp. 3-33.
HENDRY, DAVID F., and JEAN-FRANCOIS RICHARD, 1983. "The Econometric Analysis of Economic Time Series", International Statistical Review, Vol. 51, pp. 111-163.
HILDENBRAND, WERNER, 1983. "On the 'Law of Demand'", Econometrica, Vol. 51, pp. 997-1019.
HOERL, ARTHUR, and ROBERT W. KENNARD, 1970a. "Ridge Regression: Biased Estimation for Non-Orthogonal Problems", Technometrics, Vol. 12, pp. 55-67.
HOERL, ARTHUR, and ROBERT W. KENNARD, 1970b. "Ridge Regression: Applications to Non-Orthogonal Problems", Technometrics, Vol. 12, pp. 69-82.
JOHANSEN, SOREN, 1988. "Statistical Analysis of Cointegration Vectors", Journal of Economic Dynamics and Control, Vol. 12, pp. 231-254.
JOHANSEN, SOREN, and KATARINA JUSELIUS, 1989. "The Full Information Maximum Likelihood Procedure for Inference on Cointegration — with Applications", University of Copenhagen, processed.
KLEIN, LAWRENCE R., 1946a. "Macroeconomics and the Theory of Rational Behaviour", Econometrica, Vol. 14, pp. 93-108.
KLEIN, LAWRENCE R., 1946b. "Remarks on the Theory of Aggregation", Econometrica, Vol. 14, pp. 303-312.
KUHN, THOMAS S., 1962. The Structure of Scientific Revolutions, Chicago: University of Chicago Press.
LEAMER, EDWARD E., 1978. Specification Searches: Ad Hoc Inference with Non-Experimental Data, New York: Wiley.
LEAMER, EDWARD E., 1985. "Sensitivity Analysis Would Help", American Economic Review, Vol. 75, pp. 308-313.
LEAMER, EDWARD E., and HERMAN LEONARD, 1983. "Reporting the Fragility of Regression Estimates", Review of Economics and Statistics, Vol. 65, pp. 306-317.
LIPPI, MARCO, 1988. "On the Dynamic Shape of Aggregated Error Correction Models", Journal of Economic Dynamics and Control, Vol. 12, pp. 561-585.
LITTERMAN, ROBERT B., 1986a. "A Statistical Approach to Economic Forecasting", Journal of Business and Economic Statistics, Vol. 4, pp. 1-4.
LITTERMAN, ROBERT B., 1986b. "Forecasting with Bayesian Vector Autoregressions — Five Years of Experience", Journal of Business and Economic Statistics, Vol. 4, pp. 25-38.
LUCAS, ROBERT E., 1976. "Econometric Policy Evaluation: A Critique", in Karl Brunner and Allan H. Meltzer (eds.), The Phillips Curve and Labor Markets (Journal of Monetary Economics, Supplement, Vol. 1, pp. 19-46).
McNEES, STEPHEN K., 1986. "Forecasting Accuracy of Alternative Techniques: A Comparison of U.S. Macroeconomic Forecasts", Journal of Business and Economic Statistics, Vol. 4, pp. 5-15.
MIZON, GRAYHAM E., 1977. "Inferential Procedures in Nonlinear Models: An Application in a U.K. Industrial Cross Section Study of Factor Substitution and Returns to Scale", Econometrica, Vol. 45, pp. 1221-1242.
MIZON, GRAYHAM E., and JEAN-FRANCOIS RICHARD, 1986. "The Encompassing Principle and Its Application to Non-Nested Hypotheses", Econometrica, Vol. 54, pp. 657-678.
MORGAN, MARY S., 1987. "Statistics without Probability and Haavelmo's Revolution in Econometrics", in Lorenz Kruger, Gerd Gigerenzer and Mary S. Morgan (eds.), The Probabilistic Revolution, Vol. 2, pp. 171-197, Cambridge, Mass.: MIT Press.
MUELLBAUER, JOHN, 1975. "Aggregation, Income Distribution and Consumer Demand", Review of Economic Studies, Vol. 42, pp. 525-543.
MUELLBAUER, JOHN, 1976. "Community Preferences and the Representative Consumer", Econometrica, Vol. 44, pp. 979-999.
NATAF, ANDRE, 1948. "Sur la Possibilité de Construction de Certains Macromodèles", Econometrica, Vol. 16, pp. 232-244.
NELSON, CHARLES R., 1972. "The Prediction Performance of the FRB-MIT-PENN Model of the US Economy", American Economic Review, Vol. 62, pp. 902-917.
NERLOVE, MARC, 1972. "Lags in Economic Behavior", Econometrica, Vol. 40, pp. 221-251.
NICKELL, STEPHEN J., 1985. "Error Correction, Partial Adjustment and All That: An Expository Note", Oxford Bulletin of Economics and Statistics, Vol. 47, pp. 119-131.
PAGAN, ADRIAN, 1987. "Three Econometric Methodologies: A Critical Appraisal", Journal of Economic Surveys, Vol. 1, pp. 3-24.
PESARAN, M. HASHEM, 1986. "Editorial Statement", Journal of Applied Econometrics, Vol. 1, pp. 1-4.
PHEBY, JOHN, 1988. Methodology and Economics: A Critical Introduction, London: Macmillan.
PHILLIPS, PETER C.B., 1988. "Reflections on Econometric Methodology", Cowles Foundation Discussion Paper #893, Cowles Foundation, Yale University.
ROBBINS, LIONEL, 1935. An Essay on the Nature and Significance of Economic Science (2nd edition), London: Macmillan.
SARGAN, J. DENIS, 1964. "Wages and Prices in the United Kingdom: A Study in Econometric Methodology", in Peter E. Hart, Gordon Mills and John K. Whittaker (eds.), Econometric Analysis for National Economic Planning, London: Butterworth.
SCHULTZ, HENRY, 1928. Statistical Laws of Demand and Supply with Special Application to Sugar, Chicago: University of Chicago Press.
SCHULTZ, HENRY, 1938. The Theory and Measurement of Demand, Chicago: University of Chicago Press.
SIMS, CHRISTOPHER A., 1980. "Macroeconomics and Reality", Econometrica, Vol. 48, pp. 1-48.
SIMS, CHRISTOPHER A., 1982. "Policy Analysis with Econometric Models", Brookings Papers on Economic Activity, 1982(1), pp. 107-152.
SIMS, CHRISTOPHER A., 1986. "Are Forecasting Models Usable for Policy Analysis?", Federal Reserve Bank of Minneapolis, Quarterly Review, Vol. 10, pp. 2-16.
SIMS, CHRISTOPHER A., 1987. "Making Economics Credible", in Truman F. Bewley (ed.), Advances in Econometrics — Fifth World Congress, Cambridge, Mass.: Econometric Society.
SPANOS, ARIS, 1988. "Towards a Unifying Methodological Framework for Econometric Modelling", Economic Notes, pp. 107-134.
STOKER, THOMAS M., 1986. "Simple Tests of Distributional Effects on Macroeconomic Equations", Journal of Political Economy, Vol. 94, pp. 763-795.
STONE, J. RICHARD N., 1954. "Linear Expenditure Systems and Demand Analysis: An Application to the Pattern of British Demand", Economic Journal, Vol. 64, pp. 511-527.
TERASVIRTA, TIMO, 1970. Stepwise Regression and Economic Forecasting, Economic Studies Monograph #31, Helsinki: Finnish Economic Association.
TRIVEDI, PRAVIN, 1984. "Uncertain Prior Information and Distributed Lag Analysis", in David F. Hendry and Kenneth F. Wallis (eds.), Econometrics and Quantitative Economics, Oxford: Basil Blackwell.
TURING, ALAN M., 1950. "Computing Machinery and Intelligence", Mind, Vol. 59, pp. 433-460.
WALLIS, KENNETH F., 1974. "Seasonal Adjustment and Relations Between Variables", Journal of the American Statistical Association, Vol. 69, pp. 18-31.
WOLD, HERMAN, and RAGNAR BENTZEL, 1946. "On Statistical Demand Analysis from the Viewpoint of Simultaneous Equations", Skandinavisk Aktuarietidskrift, Vol. 29, pp. 95-114.
WOLD, HERMAN, and LARS JUREEN, 1953. Demand Analysis, New York: Wiley.