
The Economic and Social Review, Vol. 21, No. 1, October, 1989, pp. 1-25

Economic Theory and Econometric Models

CHRISTOPHER L. GILBERT*
Queen Mary College and Westfield, University of London, and CEPR

Dublin Economics Workshop Guest Lecture, Third Annual Conference of the Irish Economics Association, Carrickmacross, May 20th 1989. *I am grateful to Jose Carbajo, Vanessa Fry, David Hendry, Stephen Nickell and Peter Phillips for comments. I maintain full responsibility for any errors.

The constitution of the Econometric Society states as the main objective of the society "the unification of the theoretical-qualitative and the empirical-quantitative approach to economic problems" (Ragnar Frisch, 1933, p. 1). Explaining this objective, Frisch warned that "so long as we content ourselves to statements in general terms about one economic factor having an 'effect' on some other factor, almost any sort of relationship may be selected, postulated as a law, and 'explained' by a plausible argument". Precise, realistic, but at the same time complex theory was required to "help us out in this situation" (ibid., p. 2). Over fifty years after the foundation of the Econometric Society, Hashem Pesaran stated, in an editorial which appeared in the initial issue of the Journal of Applied Econometrics, that "Frisch's call for unification of the research in economics has been left largely unanswered" (Pesaran, 1986, p. 1). This is despite the fact that the propositions that theory should relate to applied models, and that models should be based upon theory, are not enormously controversial. The simple reason is that economic theory and econometric models
are relatively awkward bedfellows — much as they cannot do without each other, they also find it very difficult to live in the same house. I shall suggest that this is in part because the proposed union has been over-ambitious, and that, partly for that reason, neither partner has been sufficiently accommodating to the other. What is needed is the intellectual analogue of a modern marriage. One could look at this issue either in terms of making theory more applicable, or in terms of making modelling more theoretically-based. I shall start from the latter approach. What I have to say will therefore relate to controversies about the relative merits of different econometric methodologies, and it is apparent that increased attention has been given by econometricians to questions of methodology over the past decade. In particular, the rival methodologies associated respectively with David Hendry, Edward Leamer and Christopher Sims, and recently surveyed by Adrian Pagan (1987), generate showpiece discussions at international conferences. I do not propose here either to repeat Pagan's survey, or to advise on the "best buy" (on this see also Aris Spanos, 1988). A major issue, which underlies much of Leamer's criticism of conventional econometrics, is the status of inference in equations the specification of which has been chosen at least in part on the basis of preliminary regressions. I do not propose to pursue those questions here. I also note the obvious point that the tools we use may be in part dictated by the job to hand — hypothesis testing, forecasting and policy evaluation are different functions and it is not axiomatic that one particular approach to modelling will dominate in all three functions.
This point will recur, but I wish to focus on two specific questions — how should economic theory determine the structure of the models we estimate, and how should we interpret rejection of theories? I will argue that in the main we should see ourselves as using theory to structure our models, rather than using models to test theories. As I have noted, Frisch was adamant that economic data were only interesting in relation to economic theory. Nevertheless, there was no consensus at that time as to how data should be related to theory. Mary Morgan (1987) has documented the lack of clear probabilistic foundation for the statistical techniques then in use. To a large extent this deficiency was attributable to Frisch's hostility to the use of sampling theory in econometrics. Morgan argues that econometrics was largely concerned with the measurement of constants in lawlike relationships, the existence and character of which were not in doubt. She quotes Henry Schultz (1928, p. 33), discussing the demand for sugar, as stating "All statistical devices are to be valued according to their efficacy in enabling us to lay bare the true relationship between the phenomena under question", a comment that is remarkable only in its premise that the true relationship is known. The same premise is evident in Lionel Robbins' strictures on Dr Blank's estimated demand function for herrings (Robbins, 1935) — Robbins does not wish to disagree with Blank on the form of the economic law, but only on the possibility of its quantification. This all changed with the publication in 1944 of Trygve Haavelmo's "Probability Approach in Econometrics" (Haavelmo, 1944). Haavelmo insisted, first, that "no tool developed in the theory of statistics has any meaning — except, perhaps, for descriptive purposes — without being referred to some stochastic scheme" (ibid., p. iii); and second, that theoretical economic models should be formulated as "a restriction upon the joint variations of a system of variable quantities (or, more generally, 'objects') which otherwise might have any value or property" (ibid., p. 8). For Haavelmo, the role of theory was to offer a non-trivial, and therefore in principle rejectable, structure for the variance-covariance matrix of the data. In this may be recognised the seeds of Cowles Commission econometrics. Haavelmo was unspecific with respect to the genesis of the theoretical restrictions, but over the post-war period economic theory has been increasingly dominated by the paradigm of atomistic agents maximising subject to a constraint set. It is true that game theory has provided a new set of models, although these game-theoretic models typically imply fewer restrictions than the atomistic optimisation models which remain the core of the dominant neoclassical or neo-Walrasian "research programme". This research programme has been reinforced by the so-called "rational expectations revolution", and in Lakatosian terms this provides evidence that the programme remains "progressive". I shall briefly illustrate this programme from the theory of consumer behaviour, on the grounds that it is in this area that the econometric approach has been most successful, and also because it is an area with which all readers will be very familiar. The systems approach to demand modelling was initiated by Richard Stone's derivation and estimation of the Linear Expenditure System (the LES; Stone, 1954). The achievement of the LES was the derivation of a complete set of demand equations which could describe the outcomes of the optimising decisions of a utility maximising consumer.
We now know that the functional specification adopted by Stone, in which expenditure is a linear function of prices and money income, is highly restrictive and entails additive separability of the utility function; but this is in no way to devalue Stone's contribution. Over the past decade, much of the work on consumer theory has adopted a more general framework in which agents maximise expected utility over an uncertain future. This gives rise to a sequence of consumption (more generally, demand) plans, one starting in each time period, with the feature that only the initial observation of each plan is realised. A common procedure, initiated in this literature by Robert Hall (1978), but most elegantly carried out by Lars Peter Hansen and Kenneth Singleton (1983), is to estimate the parameters of the model from the Euler equations¹ which link the first order conditions from the optimisation problem in successive periods. The combination exhibited both in these two recent papers and in the original Stone LES paper — of theoretical derivation of a set of restrictions on a system of stochastic equations, and the subsequent testing of these restrictions using the procedures of classical statistical inference — is exactly that anticipated both in the constitution of the Econometric Society and in Haavelmo's manifesto. It therefore appears somewhat churlish that the profession, having duly arrived at this destination, should now question that this is where we wish to go. However, in terms of the constitutional objective of unifying theoretical and empirical modelling, the Haavelmo-Cowles programme forces this unification too much on the terms of the theory party.
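The Euler-equation procedure just described can be sketched in a few lines. The simulation below is purely illustrative (it is not from any of the papers cited): the data-generating process, the instrument choice and the grid-search estimator are my own assumptions. It estimates a CRRA consumer's discount factor and risk-aversion coefficient from the sample analogues of the orthogonality conditions E[β(C_{t+1}/C_t)^(-γ)R_{t+1} - 1] = 0.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000
beta0, gamma0 = 0.97, 2.0   # true (hypothetical) preference parameters

# Simulate data in which the Euler equation E[beta * g^-gamma * R] = 1 holds:
# g is gross consumption growth, R the gross asset return.
g = np.exp(0.02 + 0.10 * rng.standard_normal(n))
eps = np.exp(0.10 * rng.standard_normal(n) - 0.005)   # E[eps] = 1
R = g**gamma0 / beta0 * eps

# Two moment conditions (instruments: a constant and g itself),
# solved by a crude grid search over (beta, gamma).
Eg = g.mean()
best, best_obj = None, np.inf
for gamma in np.linspace(0.0, 4.0, 41):
    m1 = np.mean(g**(-gamma) * R)        # E[g^-gamma * R]
    m2 = np.mean(g**(-gamma) * R * g)    # E[g^-gamma * R * g]
    for beta in np.linspace(0.90, 1.00, 41):
        obj = (beta * m1 - 1.0)**2 + (beta * m2 - Eg)**2
        if obj < best_obj:
            best, best_obj = (beta, gamma), obj

beta_hat, gamma_hat = best
print(beta_hat, gamma_hat)
```

With the moments just identified, the minimiser should land close to the true parameters; in practice Hansen and Singleton use formal GMM rather than a grid.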
This argument can be made in a number of respects, but almost invariably will revolve around aggregation. I shall consider two approaches to aggregation — one deriving from theoretical and the other from statistical considerations. The three consumption-demand papers to which I have referred share the characteristic that they use aggregate data to examine the implications of theories that are developed in terms of individual optimising agents. This requires that we assume either that all individuals are identical and have the same income, or that individuals have the same preferences (although they may differ in the intercepts of their Engel curves) which, moreover, belong to the PIGL class.² In the latter and marginally more plausible case, the market demand curves may be regarded as the demand curves of a hypothetical representative agent maximising utility subject to the aggregate budget constraint. This allows interpretation of the parameters of the aggregate functions in terms of the parameters of the representative agent's utility function. The representative agent hypothesis therefore performs a "reduction" of the parameters of the aggregate function to a set of more fundamental micro parameters. In this sense, the aggregate equations are "explained" in terms of these more fundamental parameters, apparently in the same way that one might explain the properties of the hydrogen atom in terms of the quantum mechanics equations of an electron-proton pair. This reductionist approach to explanation was discussed by Haavelmo in an important but neglected section of his manifesto entitled "The Autonomy of an Economic Relation", which anticipates the Lucas critique (Robert Lucas, 1976).³ A relationship is autonomous to the extent that it remains constant if other relationships in the system (e.g., the money supply rule) are changed. Haavelmo explains "In scientific research — in the field of economics as well as in other fields — our search for 'explanations' consists of digging down to more fundamental relations than those which stand before us when we merely 'stand and look'" (ibid., p. 38). There is, however, abundant evidence that attempts to make inferences about individual tastes from the tastes of a "representative agent" on the basis of aggregate time series data can be highly misleading. Thomas Stoker (1986) has emphasised the importance of distributional considerations in a micro-based application of the Linear Expenditure System to US data, and Richard Blundell (1988) has reiterated these concerns. Richard Blundell, Panos Pashardes and Guglielmo Weber (1988) estimate the Deaton and Muellbauer (Deaton and Muellbauer, 1980) Almost Ideal Demand System on both a panel of micro data and on the aggregated macro data. They find substantial biases in the estimated coefficients from the aggregate relationships in comparison with the microeconometric estimates. Furthermore, and, they suggest, as a consequence of these biases, the aggregate equations exhibit residual serial correlation and reject the homogeneity restrictions.

1. Or orthogonality conditions implied by the Euler equations. A difficulty with these "tests" is that they may have very low power against interesting alternatives — this argument was made by James Davidson and David Hendry (1981) in relation to the tests reported in Hall (1978).
2. The Price Independent Generalized Linear class — see Angus Deaton and John Muellbauer (1980).
3. See also John Aldrich (1989).
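The kind of aggregation bias at issue here is easy to demonstrate. In the sketch below (my own construction, not taken from these studies), households share a common linear Engel-curve form but differ in their marginal budget shares, and the aggregate regression recovers an expenditure-weighted average of the micro slopes rather than their simple mean; when slope heterogeneity is correlated with income, the two diverge.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 500, 40   # households, time periods (illustrative sizes)

# Heterogeneous households: the slope b_i rises with mean income mu_i,
# so the aggregate relationship weights rich households more heavily.
mu = rng.uniform(1.0, 10.0, N)                        # household mean income
b = 0.2 + 0.05 * mu + 0.02 * rng.standard_normal(N)   # micro slopes
a = rng.uniform(0.0, 1.0, N)                          # micro intercepts

f = 1.0 + 0.1 * rng.standard_normal(T)   # common income factor over time
x = mu[:, None] * f[None, :]             # household income, N x T
y = a[:, None] + b[:, None] * x          # household expenditure (noise-free)

X, Y = x.sum(axis=0), y.sum(axis=0)      # aggregate time series
slope_agg = np.polyfit(X, Y, 1)[0]       # slope from the aggregate regression
slope_micro = b.mean()                   # average of the true micro slopes
weighted = np.sum(b * mu) / np.sum(mu)   # income-weighted mean of micro slopes

print(slope_micro, slope_agg)
```

In this noise-free construction the aggregate slope equals the income-weighted mean exactly, so the "macro elasticity" overstates the average micro response whenever slopes and incomes are positively correlated.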
They suggest, as major causes of this aggregation failure, differences between households with and without children and the prevalence of zero purchases in the micro data. This study in particular suggests that it is difficult to claim that aggregate elasticities correspond in any very straightforward way to underlying micro-parameters.⁴ How does this leave the reductionist interpretation of aggregate equations? Pursuing further the hydrogen analogy, the demand theoretic "reduction" is flawed by the fact that we have in general no independent knowledge of the parameters of individual utility functions which would allow us to predict elasticities prior to estimation. It is therefore better to see utility theory in traditional studies as "structuring" demand estimates. In that case, the representative agent assumption is a convenient and almost Friedmanite simplifying assumption, implying that the role of theory is primarily instrumental.⁵ In the Stoker (1986) and Blundell et al. (1988) studies, by contrast, the presence of good micro data allows a genuine test of the compatibility of the macro and micro estimates, and this allows those authors to investigate the circumstances in which a reductionist interpretation of the macro estimates is possible.

4. This problem is even more severe in the recent study by Vanessa Fry and Panos Pashardes (1988) which adopts the same procedure in looking at tobacco purchases. Clearly, the prevalence of non-smokers implies that there is no representative consumer. But it is also the case that elasticities estimated from aggregate data may fail to reflect the elasticities of any representative smoker. This is because it is not possible to distinguish between the effect of a (perhaps tax-induced) price rise in inducing smokers to smoke less and the effect, if any, of inducing smokers to give up the habit.
5. See Milton Friedman (1953); and, for a recent summary of the ensuing literature, John Pheby (1988).

Both papers conclude that aggregate estimates with no corrections for distributional effects tend to suffer from misspecified dynamics, and this does imply that they will be less useful in forecasting and policy analysis. There is also a suggestion that they may vary significantly, in a Lucas (1976) manner, because government policies will affect income distribution more than they will affect household characteristics and taste parameters. Although neither set of authors makes an explicit argument, the implication appears to be that one is better off confining oneself to microeconometric data. This view seems to me to be radically mistaken. It has been clear ever since Lawrence Klein (1946a, b) and Andre Nataf (1948) first discussed aggregation issues in the context of production functions that the requirements for an exact correspondence between aggregate and micro relationships are enormously strong. Through the work of Terence Gorman (1953, 1959, 1968), Muellbauer (1975, 1976) and others these conditions have been somewhat weakened, but they remain heroic. In this light it might appear somewhat surprising that aggregate relationships do seem to be relatively constant over time and are broadly interpretable in terms of economic theory. An interesting clue as to why this might be was provided by Yehuda Grunfeld and Zvi Griliches (1960), who asked "Is aggregation necessarily bad?" In his Ph.D.
thesis, Grunfeld had obtained the surprising result that the investment expenditures of an aggregate of eight major US corporations were better explained by a two-variable regression on the aggregate market value of these corporations and their aggregate stock of plant and equipment at the start of each period, than by a set of eight micro regressions in which each corporation's investment was related to its own market value and stock of plant and equipment (Grunfeld, 1958). In forecasting the aggregate, one would do better using an aggregate equation than by disaggregating and forecasting each component of the aggregate separately. Grunfeld and Griliches suggest that this may be explained by misspecification of the micro relationships. If there is even a small dependence of the micro variables on aggregate variables, this can result in better explanation by the aggregated equation than by the slightly misspecified micro equations. This result has recently been rediscovered by Clive Granger (1987), who distinguishes between individual (i.e., agent-specific) factors and common factors in the micro equations. In the demand context, an example of a common factor would be the interest rate in a demand equation for a durable good. If we consider a specific agent, the role of the interest rate is likely to be quite small — whether or not a particular household purchases, say, a refrigerator in a particular period will mainly depend on whether the old refrigerator has finally ceased to function (replacement purchase) or the fact that the household unit has just been formed (new purchase). The role of the interest rate will be secondary. However, the individual factors are unlikely to be fully observed.
If one is obliged to regress simply on the common factors — in this case the interest rate — the micro R² will be tiny (with a million households, Granger obtains a value of 0.001), but the aggregate equation may have a very high R² (Granger obtains 0.999) because the individual effects average out across the population.⁶ So long as the micro relationships are correctly specified and all variables (common and individual) are fully observed, there is no gain through aggregation. However, once we allow parsimonious simplification strategies, the effect of these simplifications will be to result in quite different micro and aggregate relationships. Furthermore, it is not clear a priori which of these approximated relationships will more closely reflect theory. Blundell (1988) implied that when micro and aggregate relationships differ this must entail aggregation bias in the aggregate relationships. Granger's results show that theoretical effects may be swamped at the micro level by individual factors which are of little interest to the economist, and which in any case are likely to be incompletely observed, resulting in omitted variable bias in the micro equations. Microeconometrics is important, but it does not invalidate traditional aggregate time series analysis.⁷ My concern here is with the methodology of aggregate time series econometrics, so I shall not dwell on the problems of doing microeconometrics. The question I have posed is how economic theory should be incorporated in aggregate models. The naive answer to this question is the reductionist route, in which the parameters of aggregate relationships are interpreted in terms of the decisions of a representative optimising agent. However, there is absolutely no reason to suppose that the aggregation assumptions required by this reduction will hold.
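Granger's point reproduces readily by simulation. The sketch below uses assumed magnitudes of my own (50,000 households rather than Granger's million, and a made-up common-factor coefficient): each household's purchase propensity loads weakly on the common interest rate and heavily on an idiosyncratic shock, so the micro R² is negligible while the aggregate R² is close to one.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 50_000, 60   # households, periods (illustrative scale)

rate = rng.standard_normal(T)        # common factor (the interest rate)
beta = -0.02                         # small common-factor effect per household
eps = rng.standard_normal((N, T))    # dominant individual factors
y = beta * rate[None, :] + eps       # micro purchase propensities

def r2(x, z):
    # R^2 from a simple regression of z on x
    zhat = np.polyval(np.polyfit(x, z, 1), x)
    return 1 - np.sum((z - zhat)**2) / np.sum((z - z.mean())**2)

r2_micro = r2(rate, y[0])            # a single household: tiny
r2_agg = r2(rate, y.mean(axis=0))    # aggregate: individual effects average out
print(r2_micro, r2_agg)
```

The per-household variance of the idiosyncratic term is fixed, but its variance in the cross-sectional mean shrinks at rate 1/N, which is the averaging-out mechanism in the text.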
There is little point, therefore, in using these estimated aggregate relationships to "test" theories based on the optimising behaviour of a representative agent — if we fail to reject the theory it is only because we have insufficient data.⁸

6. Strictly, the variance of the household effects is of order n, where n is the number of households, and the variance of the common factors is of order n². Hence, as the number of households becomes large, the contribution of the individual effects becomes negligible. In the converse case, in which we observe the individual factors but not the common factors, the R²s are reversed. See also Granger (1988).
7. Werner Hildenbrand (1983) arrives at a similar conclusion in a more specialised context. He remarks (ibid., p. 998) "There is a qualitative difference in market and individual demand functions. This observation shows that the concept of the 'representative consumer', which is often used in the literature, does not really simplify the analysis; on the contrary, it might be misleading". I am grateful to Jose Carbajo for bringing this reference to my attention.
8. I do not wish to claim that "If the sample size is large you reject everything" — see Peter Phillips (1988, p. 11).

It is now nearly ten years since Sims argued in his "Macroeconomics and Reality" (Sims, 1980) that the Haavelmo-Cowles programme is misconceived. Theoretically-inspired identifying restrictions are, he argued, simply "incredible".
This is partly because many sets of restrictions amount to no more than normalisations together with "shrewd aggregations and exclusion restrictions" based on an "intuitive econometrician's view of psychological and sociological theory" (Sims, 1980, pp. 2-3); partly because the use of lagged dependent variables for identification requires prior knowledge of exact lag lengths and orders of serial correlation (Michio Hatanaka, 1975); and partly because rational expectations imply that any variable entering a particular equation may, in principle, enter all other equations containing expectational variables. In Sims' example, a demand for meat equation is "identified" by normalisation of the coefficient on the quantity (or value share) of meat variable to -1; by exclusion of all other quantity (or value share) variables; and by exclusion of the prices of goods considered by the econometrician to be distant substitutes for meat, or replacement of these prices by the prices of one or more suitably defined aggregates. The VAR methodology, elaborated in a series of papers by Thomas Doan, Robert Litterman and Sims, is to estimate unrestricted distributed lags of each non-deterministic variable on the complete set of variables (Doan, Litterman and Sims, 1984; Litterman, 1986a; Sims, 1982, 1987). Thus given a set of k variables one models

    x_it = Σ_{j=1}^{k} Σ_{r=1}^{n} θ_ijr x_{j,t-r} + u_it    (i = 1, . . ., k)    (1)

The objective is to allow the data to structure the dynamic responses of each variable. Obviously, however, one may wish to consider a relatively large number of variables and relatively long lag lengths, and this could result in a shortage of degrees of freedom and in poorly determined coefficient estimates.
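Estimating (1) is just equation-by-equation least squares on a common set of lagged regressors. A minimal sketch (the simulated bivariate system and the function names are illustrative assumptions of my own):

```python
import numpy as np

rng = np.random.default_rng(3)

def fit_var(X, p):
    """OLS estimation of an unrestricted VAR(p): one regression per variable.
    X is T x k; returns intercepts c (k,) and coefficients A (p, k, k),
    where A[r-1, i, j] is the effect of x_{j,t-r} on x_{i,t}."""
    T, k = X.shape
    # Stack the regressors [1, x_{t-1}, ..., x_{t-p}] for t = p, ..., T-1
    Z = np.hstack([np.ones((T - p, 1))] +
                  [X[p - r - 1:T - r - 1] for r in range(p)])
    Y = X[p:]
    B, *_ = np.linalg.lstsq(Z, Y, rcond=None)   # (1 + p*k) x k
    c = B[0]
    A = B[1:].T.reshape(k, p, k).transpose(1, 0, 2)
    return c, A

# Simulate a stable bivariate VAR(1) and recover its coefficients
A_true = np.array([[0.5, 0.1],
                   [0.0, 0.8]])
X = np.zeros((2000, 2))
for t in range(1, 2000):
    X[t] = A_true @ X[t - 1] + rng.standard_normal(2)

c, A = fit_var(X, p=1)
print(A[0])
```

Because every equation has the same right-hand-side variables, single-equation OLS here coincides with systems estimation of the unrestricted reduced form.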
Some of the early VAR articles impose "incredible" marginalisation (i.e., variable exclusion) and lag length restrictions — for example, Sims (1980) uses a six-variable VAR on quarterly data with lag length restricted to four. But these restrictions are hardly more palatable than those Sims argued against in "Macroeconomics and Reality", and at least implicit recognition of this has pushed the VAR school into the adoption of a Bayesian framework. The crucial element of Bayesian VAR (BVAR) modelling is a "shrinkage" procedure in which a loose Bayesian prior distribution structures the otherwise unrestricted distributed lag estimates (Doan et al., 1984; Sims, 1987). A prior distribution has two components — the prior mean and the prior variance. First consider the prior mean. If one estimates a large number of lagged coefficients, one will intuitively feel that many of them, particularly those at high lag lengths, should be small.⁹ Doan et al. (1984) formalise this intuition by specifying the prior for each modelled variable as a random walk with drift. This prior can be justified on the argument that, under (perhaps incredibly) strict assumptions, random walk models appear as the outcomes of the decisions of atomistic agents optimising under uncertainty (most notably, Hall, 1978); or on the heuristic argument that "no change" forecasts provide a sensible "naive" base against which any other forecasts should be compared. More formally, one can argue that collinearity is clearly a major problem in the estimation of unrestricted distributed lag models and that severe collinearity may give models which "produce erratic, poor forecasts and imply explosive behavior of the data" (Doan et al., 1984).
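A stylised version of this shrinkage, for a single autoregression, is sketched below. This is my own illustration, not the full Doan-Litterman-Sims specification: it keeps only a random-walk prior mean and lag-decaying prior standard deviations, and uses a crude error-variance scale.

```python
import numpy as np

rng = np.random.default_rng(4)

def shrink_ar(y, p, tightness):
    """Posterior-mean AR(p) coefficients under a Minnesota-style prior:
    prior mean = random walk (1 on the first own lag, 0 elsewhere),
    prior s.d. on lag r proportional to tightness / r (tighter at long lags)."""
    T = len(y)
    X = np.column_stack([y[p - r - 1:T - r - 1] for r in range(p)])
    Y = y[p:]
    b_prior = np.zeros(p)
    b_prior[0] = 1.0
    prior_sd = tightness / np.arange(1, p + 1)
    Vinv = np.diag(1.0 / prior_sd**2)
    s2 = np.var(Y)   # crude error-variance scale (an assumption of this sketch)
    # Posterior mean = precision-weighted combination of data and prior
    b = np.linalg.solve(X.T @ X / s2 + Vinv, X.T @ Y / s2 + Vinv @ b_prior)
    return b

y = np.cumsum(rng.standard_normal(500))    # a simulated random walk
loose = shrink_ar(y, p=6, tightness=10.0)  # prior barely binds: near-OLS
tight = shrink_ar(y, p=6, tightness=0.01)  # forced towards the random walk
print(tight)
```

Raising the tightness parameter plays the role of the ridge constant in the text: it pulls the first-lag coefficient towards one and the remaining coefficients towards zero, most strongly at long lags.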
A standard remedy for collinearity, implemented in ridge regression (Arthur Hoerl and Robert Kennard, 1970a, b), is to "shrink" these coefficients towards zero by adding a small constant (the "ridge constant") to the diagonal elements of the data cross-product matrix.¹¹ Specification of the prior variance involves the investigator quantifying his/her uncertainty about the prior mean. The prior variance matrix will typically contain a large number of parameters, and this therefore appears a daunting task. Much of the originality of the VAR shrinkage procedure arises from the economy in specification of this matrix which, in the most simple case, is characterised in terms of only three parameters (Doan, Litterman and Sims, 1986).¹⁰ These are the overall tightness of the prior distribution, the rate at which the prior standard deviations decay, and the relative weight of variables other than the lagged dependent variable in a particular autoregression (with prior covariances set to zero). A tighter prior distribution implies a larger ridge constant, and this results in a greater shrinkage towards the random walk model. The important feature of the Doan et al. (1984) procedure is that the tightness of the prior is increased as lag length increases. Degrees of freedom considerations are no longer paramount, since coefficients associated with long lag lengths, and with less important explanatory variables, are forced to be close to zero.

9. But note that this intuition may be incorrect if one uses seasonally unadjusted data. However, Kenneth Wallis (1974) has shown that use of seasonally adjusted data can distort the dynamics in the estimated relationships.
10. The exposition in Doan et al. (1984) is complicated and not entirely consistent. See John Geweke (1984) for a concise summary.
11. In the standard linear model y = Xβ + u, where y and X are both measured as deviations from their sample means, the ridge regression estimator of β is b = (X'X + kI)⁻¹X'y, where k is the ridge constant.

It is often suggested that the VAR approach is completely atheoretical (see, e.g., Thomas Cooley and Stephen LeRoy, 1986). This view is given support by those VAR modellers whose activities are primarily related to forecasting and who argue that relevant economic theory is so incredible that one will forecast better with an unrestricted reduced form model (Litterman, 1986a, b; Stephen McNees, 1986). However, this position is too extreme. Most simply, theory may be tested to a limited extent by examination of block exclusion (Granger causality) tests, although I would agree with Sims that, interpreted strictly, such restrictions are not in general credible.¹² It is therefore more interesting to examine the use of VAR models in policy analysis, since in this activity theory is indispensable. Suppose one is interested in evaluating the policy impact of a shock to the money supply. One will typically look for a set of dynamic multipliers showing the impact of that shock on all the variables of interest. An initial difficulty is that in VAR models all variables are jointly determined by their common history and a set of current disturbances. This implies that it does not make sense to talk of a shock to the money supply unless additional structure is imposed on the VAR. To see this, note that the autoregressive representation (1) may be transformed into the moving average representation

    x_it = Σ_{j=1}^{k} Σ_{r=0}^{∞} a_ijr u_{j,t-r}    (i = 1, . . ., k)    (2)
,k) t \ > > i (2) V / where each variable depends on the history o f shocks to all the variables in the m o d e l . There are t w o possibilities. Take the money supply to be variable 1. I f none o f the other k - 1 variables i n the model Granger-causes the money supply (so that 0 ^ = 0 for all j > l and r ) we may identify monetary policy w i t h the i n n o v a t i o n U j o n the money supply equation and trace o u t the effects o f these innovations o n the other variables i n the system. I t is more likely, however, particularly given Sims' views, that all variables are interdependent at least over t i m e . I n that case analysis o f the effects o f monetary policy requires the identifying assumption that the monetary authorities 12. F o r e x a m p l e , L i t t c r m a n ( 1 9 8 6 b , p. 26) w r i t e s i n c o n n e c t i o n w i t h business c y c l e s , ". . . there are a m u l t i t u d e of e c o n o m i c theories of the business c y c l e , m o s t of w h i c h focus on one p a r t of a c o m p l e x m u l t i f a c e t e d p r o b l e m . M o s t e c o n o m i s t s w o u l d a d m i t that e a c h t h e o r y has some v a l i d i t y , although there is w i d e disagreement over the relative i m p o r t a n c e of the different a p p r o a c h e s . " A n d i n c o n j u n c t i o n w i t h the D a t a R e s o u r c e s I n c . ( D R I ) m o d e l i n v e s t m e n t sector, he states, " E v e n i f one a c c e p t s the J o r g e n ­ son t h e o r y as a reasonable a p p r o a c h to e x p l a i n i n g i n v e s t m e n t , the e m p i r i c a l i m p l e m e n t a t i o n does n o t a d e q u a t e l y r e p r e s e n t the true u n c e r t a i n t y a b o u t the d e t e r m i n a n t s of i n v e s t m e n t . " choose X j independently o f the current period disturbances on the other equations. 
In an older terminology, this defines the first link in a Wold causal chain with money causally prior to the other variables (Herman Wold and Ragnar Bentzel, 1946; Wold and Lars Jureen, 1953). It can be implemented by renormalisation of (2) such that x_{1t} depends only on the policy innovations v_{1t} while the remaining variables depend on v_{1t} and also a set of innovations v_{2t}, . . ., v_{kt} which are orthogonal to v_{1t}. In the limiting case in which all the innovations are mutually orthogonal, we may rewrite (2) as

x_{it} = Σ_{j=1}^{k} Σ_{r=0}^{∞} b_{ijr} v_{j,t−r}    (i = 1, . . ., k)    (3)

This expression is unique given the ordering of the variables, but as Pagan (1987) notes, it is not clear a priori how the innovations v_{2t}, . . ., v_{kt} should be interpreted. The policy multipliers will depend on the causal ordering adopted, and the ordering of variables 2, . . ., k may in practice be somewhat arbitrary. We find therefore that, although in estimation VAR modellers can avoid making strong identifying assumptions, policy interpretation of their models, including the calculation of policy multipliers, requires that one make exactly the same sort of identifying assumption that Sims criticised in the Haavelmo-Cowles programme. This is the basis of Cooley and LeRoy's (1985) critique of atheoretical macroeconometrics.

As a criticism of Sims, this is too strong. Note first that in his applied work, Sims does not restrict himself to orthogonalisation assumptions as in (3), but is willing to explore a wider class of identifying restrictions which are not dissimilar to those made by structural modellers (see Sims, 1986). Moreover, he allows himself to search over different sets of identifying assumptions in order to obtain plausible policy multipliers.
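The dependence of the orthogonalised responses on the variable ordering can be illustrated numerically. The sketch below, with an entirely made-up bivariate VAR(1), computes Cholesky-orthogonalised impulse responses under the two possible orderings; they coincide only when the innovation covariance matrix is diagonal.

```python
import numpy as np

# illustrative bivariate VAR(1) x_t = A x_{t-1} + u_t with correlated
# innovations u_t ~ (0, Sigma); all numbers are made up
A = np.array([[0.5, 0.1],
              [0.2, 0.4]])
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])

def orthogonal_irf(A, Sigma, horizons=8):
    """Responses to orthogonalised (Cholesky) innovations: the matrix at
    horizon r is A^r P, where u_t = P v_t and E[v_t v_t'] = I."""
    P = np.linalg.cholesky(Sigma)
    out, Apow = [], np.eye(A.shape[0])
    for _ in range(horizons):
        out.append(Apow @ P)
        Apow = Apow @ A
    return np.array(out)

irf_12 = orthogonal_irf(A, Sigma)     # variable 1 (money) ordered first
# permute the variables, orthogonalise, and map the responses back
R = np.array([[0.0, 1.0], [1.0, 0.0]])
irf_21 = R @ orthogonal_irf(R @ A @ R, R @ Sigma @ R) @ R
```

`irf_12` and `irf_21` differ whenever Sigma is non-diagonal, which is precisely the arbitrariness in the policy multipliers noted in the text.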
However, the sets of assumptions he explores all generate just identified models with the implication that they are all compatible with the same reduced form. This permits a two stage procedure in which at the first stage the autoregressive representation (1) is estimated, and at the second stage this representation is interpreted in terms of economic theory by the imposition of identifying assumptions on the moving average representation. The identifying assumptions may be controversial, but they do not contaminate estimation.

Although it is not true that VAR modelling is completely atheoretical, the philosophy of the VAR approach may be caricatured as attempting to limit the role of theory in order to obtain results which are as objective as possible and as near as possible independent of the investigator's theoretical beliefs or prejudices. An alternative approach, associated with what I have called elsewhere (Gilbert, 1989) the LSE (London School of Economics) methodology, is to use theory to structure models in a more or less loose way so as to obtain a model whose general interpretation is in line with theory but whose detail is determined by the data. The instrument for ensuring coherence with the data is classical testing methodology. This immediately prompts the question of what constitutes a test of a theory which we regard as at best approximate. I have noted that it does not usually make much sense to suppose that we can sensibly use classical testing procedures to attempt to reject theories based on the behaviour of atomistic optimising agents on aggregate economic data, since there is no reason to suppose that those theories apply precisely on aggregate data. There are in practice two interesting questions. The first is whether a given theory is or is not too simple relative both to the data and for the purposes to hand.
The second question is whether one theory-based model explains a given dataset better than another theory-based model.13

The issue of simplification almost invariably prompts the map analogy. For example, Leamer (1978, p. 205) writes "Each map is a greatly simplified version of the theory of the world; each is designed for some class of decisions and works relatively poorly for others". Simplification is forced upon us by the fact that we have limited comprehension, and, more acutely in time series studies, by limited numbers of observations. As the amount of data available increases, we are able to entertain more complicated models, but this is not necessarily a benefit if we are interested in investigating relatively simple theories since the additional complexity may then largely take the form of nuisance parameters. Frequently, the increased model complexity will take the form of inclusion of more variables — i.e., revision of the marginalisation decision — and this can be tested using conventional classical nested techniques. The important question is whether omission of these factors results in biased coefficient values and incorrect inference in relation to the purposes of the investigation. The tourist and the geologist will typically use different maps, but the tourist may wish to know if there are steep gradients on his/her route, and questions of access are not totally irrelevant to the geologist.14

The obvious trade-off in the sort of samples we frequently find ourselves analysing in time series macroeconometrics is between reduction in bias through the inclusion of additional regressors and reduced precision through the reduction in degrees of freedom and increase in collinearity. Short samples of aggregate data can only relate to simple theories since they only contain a limited amount of information.
13. Heterogeneity may imply that these theories also fail to hold on micro data.
14. Phillips (1988, p. 28) notes that it is implicit in the Hendry methodology that the number k of regressors grows with the sample size T in such a way that k/T → 0 as T → ∞.

Macroeconometric models will therefore be more simple than the world they purport to represent. This does not particularly matter, but it does imply that we must always be aware that previously neglected factors may become important — an obvious example is provided by the role of inflation in the consumption function.

Two strategies are currently available for controlling for structural non-constancy. VAR modellers advise use of random coefficient vector autoregressions in which the model coefficients all evolve as random walks (Doan et al., 1984). In principle, this leads to very high dimensional models, but imposition of a tight Bayesian prior distribution heavily constrains the coefficient evolution and permits estimation. This procedure automates control for structural constancy, since the modeller's role is reduced to choice of the tightness parameters of the prior. A disadvantage is that it cannot ever prompt reassessment of the marginalisation decision — i.e., inclusion of previously excluded or unconsidered regressor variables. An alternative approach which is gaining increasing support is the use of recursive regression methods to check for structural constancy. In recursive regression one uses updating formulae, first worked out by Timo Terasvirta (1970), to compute the regression of interest for each subsample [1, t] for t = T₁, . . ., T, where T is the final observation available and T₁ is of the order of three times the number of regressor variables (see Hendry, 1989, pp. 20-21).
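The recursive computation just described can be sketched as follows. For clarity this naive version simply refits OLS on each subsample rather than using the updating formulae (which give the same numbers more cheaply); the data are made up.

```python
import numpy as np

def recursive_ols(X, y, t0):
    """OLS coefficients and one-step-ahead prediction errors for each
    subsample [1, t], t = t0, ..., T (naive refitting for clarity)."""
    T = len(y)
    betas, errors = [], []
    for t in range(t0, T + 1):
        b = np.linalg.lstsq(X[:t], y[:t], rcond=None)[0]
        betas.append(b)
        if t < T:
            errors.append(y[t] - X[t] @ b)   # one-step-ahead prediction error
    return np.array(betas), np.array(errors)

# made-up structurally constant data: the coefficient paths should settle down
rng = np.random.default_rng(1)
X = rng.normal(size=(60, 2))
y = X @ np.array([1.0, -0.5]) + rng.normal(scale=0.1, size=60)
betas, errors = recursive_ols(X, y, t0=6)   # t0 about three times the regressors
```

Plotting the columns of `betas` against t reproduces the coefficient-against-sample-size graphs described below, and, suitably scaled, the `errors` series underlies the one-step-ahead Chow tests.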
This produces a large volume of output which is difficult to interpret except by graphical methods. Use of recursive methods had therefore to wait until PC technology allowed easy and costless preparation of graphs. It is now computationally trivial to graph Chow tests (Gregory Chow, 1960) for all possible structural breaks, or for one period ahead predictions for all periods within the [T₁, T − 1] interval. Also one can plot coefficient estimates against sample size. Although these graphical methods do not imply any precise statistical tests, they show up structural non-constancy of either the break or evolution form in an easily recognisable way, and prompt the investigator to ask why a particular coefficient is moving through the sample, or why a particular observation is exerting leverage on the coefficient estimates. These questions should then prompt appropriate model respecification. I am not aware that Leamer has ever advised use of recursive methods, but they do appear to be very much in the spirit of his concern with fragility in regression estimates (see Leamer and Herman Leonard, 1983), even if the proposed data-based "solution" is not one he would favour.

Level of complexity is therefore primarily a matter of sample size. The more interesting questions arise from comparison of alternative and incompatible simple theories which share the same objectives. Although maps may differ only in the selection of detail to represent, they may also differ because one or other map incorrectly represents certain details. In such cases we are required to make a choice. There is now a considerable body both of econometric theory and of experience in non-nested hypothesis testing. Suppose we have two alternative and apparently congruent models A and B.
Suppose initially model A (say a regression of y on X) gives the "correct" representation of the economic process under consideration. This implies that the estimates obtained by incorrectly supposing model B (regression of y on Z) to be true will suffer from misspecification bias. Knowledge of the covariance of the X and Z variables allows this bias to be calculated. Thus if A is true, it allows the econometrician to predict how B will perform; but if A does not give a good representation of the economic process, it will not be able to "explain" the model B coefficients. Furthermore, we can reverse the entire procedure and attempt to use model B to predict how A will perform. These non-nested hypothesis tests, or encompassing tests as they are sometimes called, turn out to be very simple to perform. One forms the composite but quite possibly economically uninterpretable hypothesis A ∪ B, which in the case discussed above is the regression of y on both X and Z (deleting one occurrence of any variable included in both X and Z), and then performs the standard F tests of A and B in turn against A ∪ B. Four outcomes are possible. If one can accept the absence of the Z variables in the presence of the X variables, but not vice versa (i.e., E[y|X, Z] = Xα), model A is said to encompass model B; equally, model B may encompass model A (E[y|X, Z] = Zβ). But two other outcomes are possible. If one cannot accept either E[y|X, Z] = Xα or E[y|X, Z] = Zβ neither hypothesis may be maintained. Finally, one might be able to accept both E[y|X, Z] = Xα and E[y|X, Z] = Zβ, in which case the data are indecisive.
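The F tests against the composite hypothesis A ∪ B can be sketched with made-up data in which model A is true:

```python
import numpy as np

def f_test(y, X_small, X_big):
    """F statistic for the small model against the larger model nesting it."""
    rss = lambda X: np.sum((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)
    q = X_big.shape[1] - X_small.shape[1]   # number of restrictions
    df = len(y) - X_big.shape[1]            # residual degrees of freedom
    return (rss(X_small) - rss(X_big)) / q / (rss(X_big) / df)

rng = np.random.default_rng(2)
X = rng.normal(size=(120, 2))               # regressors of model A
Z = rng.normal(size=(120, 2))               # regressors of model B
y = X @ np.array([1.0, 0.5]) + rng.normal(scale=0.5, size=120)  # A is "true"
XZ = np.hstack([X, Z])                      # composite hypothesis A U B

F_A = f_test(y, X, XZ)   # can the Z variables be deleted given X?
F_B = f_test(y, Z, XZ)   # can the X variables be deleted given Z?
```

Comparing each statistic with the relevant F critical value (about 3.07 at the 5 per cent level for F(2, 116)) reproduces the four outcomes described in the text: here a small F_A and a large F_B indicate that A encompasses B.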
This relates to Thomas Kuhn's view that a scientific theory will not be rejected simply because of anomalies, but rather because some of these anomalies can be explained by a rival theory (Kuhn, 1962).15

The LSE procedure may be summarised as an attempt to obtain a parsimonious representation of a general unrestricted equation.16 This representation should simultaneously satisfy a number of criteria (Hendry and Jean-Francois Richard, 1982, 1983). First, it must be an acceptable simplification of the unrestricted equation either on the basis of a single F test against the unrestricted equation, or on the basis of a sequence of such tests.17 Second, it should have serially independent errors. Third, it must be structurally constant. I will return to the error correction specification shortly. The model discovery activity takes place in part in the parsimonious simplification activity, which typically involves the imposition of zero or equality restrictions on sets of coefficients, and also importantly in reviewing the marginalisation (variable exclusion) decisions.

15. This is "coefficient encompassing". A more limited question ("variance encompassing") is whether we can explain the residual variances. Coefficient encompassing implies variance encompassing, but not vice versa (Mizon and Richard, 1986).
16. Usually this will involve OLS estimation of single equations, but the same procedures may be adopted in simultaneous models using appropriate estimators.
17. See Grayham Mizon (1977).
Parsimonious simplification may be regarded as in large measure a tidying up operation which does little to affect equation fit, controls for collinearity and thereby improves forecasting performance, and at worst results in exaggerated estimates of coefficient precision (since coefficients which are approximately zero or equal are set to be exactly zero or equal). Pravin Trivedi (1984) has coined the term "testimation" to describe the "empirical specification search involving a blend of estimation and significance tests". Importantly, parsimonious simplification conserves degrees of freedom and in this respect it is not dissimilar to the shrinkage procedure adopted in VAR modelling, the difference being mainly whether one imposes strong restrictions on a set of near zero coefficients (LSE), or weaker restrictions on the entire set of coefficients (VAR). It does not seem to me that there is any strong basis for suggesting that one method has statistical properties superior to the other. VAR modellers argue that their models have superior forecasting properties, but LSE modellers would reply that their methods tend to be more robust with respect to structural change. This is not an argument that can be settled on an a priori basis.18

Opening up the marginalisation question is of greater importance. If a variable which is of practical importance is omitted from the model, perhaps because its presence is not indicated by the available theory, this omission is likely to cause biased coefficient estimates and either serially correlated residuals or over-complicated estimated dynamics.
In the former case, one might be tempted to estimate using an appropriate autoregressive estimator, which is tantamount to regarding the autoregressive coefficients as nuisance parameters; while in the latter one will obtain the same result via unrestricted estimates of the autoregressive equation. The alternative, which is familiar to all of us, is to take the residual serial correlation as prompting the question of whether the model is well-specified, and in particular, whether important variables have been omitted. Subsequent discovery that this is indeed the case may either indicate a need to extend or revise the underlying theory, or more simply suggest the observation that the theory offers only a partial explanation of the data. In the latter case, the additional variables introduced into the model may perhaps be legitimately regarded as nuisance variables, but in the former case the two way interaction between theory and data will have a clear positive value.

18. Contrast Leamer (1985) who describes the LSE methodology as "a combination of backward and forward step-wise (better known as unwise) regression . . . The order for imposing the restrictions and the choice of significance level are arbitrary . . . What meaning should be attached to all of this?"

The feature of the LSE approach on which I wish to concentrate is the role of cointegration and the prevalence of the error correction specification. The error correction specification is an attempt to combine the flexibility of time series (Box-Jenkins) models in accounting for short term dynamics with the theory-compatibility of traditional structural econometric models (see Gilbert, 1989).
In this specification both the dependent variable and the explanatory variables appear as current and lagged differences (sometimes as second differences or multi-period differences), as in Box-Jenkins models,19 but unlike those models, the specification also includes a single lagged level of the dependent variable and a subset of the explanatory variables. For example, a stylised version of the Davidson et al. (1978) consumption function may be written as

Δ₄ ln c_t = θ₀ + θ₁ Δ₄ ln y_t + θ₂ Δ₁Δ₄ ln y_t − θ₃ (ln c_{t−4} − ln y_{t−4})    (4)

where, on quarterly data, annual changes in consumption are related to annual changes in income and a four quarter lagged discrepancy between income and consumption. It is these lagged levels variables which determine the steady state solution of the model.20 It will frequently be found that augmentation of pure difference equations by lagged levels terms in this way has a dramatic effect on forecasts and on estimated policy responses.

It is always possible to reparameterise any unrestricted distributed lag equation specified in levels (e.g., a VAR) into the error correction form, so it may appear odd to claim any special status for this way of writing distributed lag relationships. Note however that the LSE procedure implicitly prohibits parsimonious simplification of the unrestricted equation into a Box-Jenkins model in which the lagged level of the dependent variable is excluded, even if this exclusion would result in negligible loss in fit. In this sense, the specification is non-trivial.
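A specification of the form (4) can be put to data directly. The sketch below builds the required regressors from synthetic quarterly series in which consumption and income move together in the long run; all numbers are made up, and since the error correction term enters as (ln c − ln y)_{t−4} its fitted coefficient corresponds to −θ₃ and should be negative.

```python
import numpy as np

def dhsy_regressors(log_c, log_y):
    """Variables of the stylised specification (4), aligned on t = 5, ..., T-1."""
    d4c = log_c[4:] - log_c[:-4]        # Delta_4 ln c_t
    d4y = log_y[4:] - log_y[:-4]        # Delta_4 ln y_t
    d1d4y = d4y[1:] - d4y[:-1]          # Delta_1 Delta_4 ln y_t
    ecm = (log_c - log_y)[1:-4]         # (ln c - ln y)_{t-4}
    lhs = d4c[1:]
    rhs = np.column_stack([np.ones_like(lhs), d4y[1:], d1d4y, ecm])
    return lhs, rhs

# synthetic quarterly data: trending income, consumption tied to it long-run
rng = np.random.default_rng(3)
T = 200
log_y = np.cumsum(0.005 + rng.normal(scale=0.01, size=T))
log_c = log_y - 0.1 + rng.normal(scale=0.005, size=T)

lhs, rhs = dhsy_regressors(log_c, log_y)
theta = np.linalg.lstsq(rhs, lhs, rcond=None)[0]   # last element approximates -theta_3
```

Dropping the final (error correction) column of `rhs` gives the pure difference equation the text warns against, whose steady state solution is undetermined.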
That it is an interesting non-trivial specification depends on the claim that economic theory implies a set of comparative static results which are reflected in long-run constancies, and is reinforced by the logically independent but incorrect claim that economic theory tells us little about short-term adjustment processes.21

The earliest error correction specification was Denis Sargan's (1964) wage model in which the rate of increase in wage rates was related to the difference between the lagged real wage and a notional target real wage. Here there is a straightforward structural interpretation of the error correction term. More recently, however, the generality of the specification has received support from the Granger representation theorem (Robert Engle and Granger, 1987) which states that if there exists a stationary linear combination of a set of non-stationary variables (i.e., if the variables are "cointegrated") then these variables must be linked by at least one relationship which can be written in the error correction form. (If this were not the case, the variables would increasingly diverge over time.)

19. George Box and Gwilym Jenkins (1970).
20. In the steady state solution all the differenced variables are set to zero. The steady state growth solution, in which all the differenced variables are set to appropriate constants, is often more informative — see James Davidson, David Hendry, Frank Srba and Stephen Yeo (1978), and Gilbert (1986, 1989).
21. See for example Phillips (1988, p. 19): "In macroeconomics, theory usually provides little information about the process of short run adjustment".
Since most macroeconomic aggregates are non-stationary (typically they grow over time) any persisting (autonomous) relationship between aggregates over time is likely to be of the error correction form. Cointegration therefore provides a powerful reason for supposing that there will exist structurally constant relationships between macroeconomic aggregates. If economic time series are non-stationary but cointegrated there are strong arguments for imposing the error correction structure on our models, and it is an advantage of the LSE methodology over the VAR methodology that it adopts this approach.

A major role for economic theory in the LSE methodology is to aid the specification of the cointegrating term. Unsurprisingly, short samples of relatively high frequency (quarterly or monthly) data are often relatively uninformative about the long-run relationship between the variables, so that theoretically unmotivated specification of these terms gives little precision or discrimination between alternative specifications. One possibility, suggested by Engle and Granger (1987), is a two stage procedure where at the first stage one estimates the static ("cointegrating") regression ignoring the short-term dynamics, and at the second stage one imposes these long-run coefficients on the dynamic error correction model. However, Monte Carlo investigation suggests that this procedure has poor properties (Anindya Banerjee, Juan Dolado, David Hendry and Gregor Smith, 1986) and that it is preferable to attempt to estimate the long-run solution from the dynamic adjustment equation as in the initial Sargan (1964) wage model and the Davidson et al. (1978) consumption function model. Nevertheless, the long-run solution may still be poorly determined, implying that theoretical restrictions are unlikely to be rejected.
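The two stage procedure just described can be sketched on synthetic cointegrated data (the Monte Carlo evidence cited above cautions against taking its small-sample properties at face value; this is illustration only):

```python
import numpy as np

# synthetic cointegrated pair: x is a random walk and y = 2x + stationary error
rng = np.random.default_rng(4)
T = 400
x = np.cumsum(rng.normal(size=T))
y = 2.0 * x + rng.normal(size=T)

# stage 1: static "cointegrating" regression, short-term dynamics ignored
X1 = np.column_stack([np.ones(T), x])
beta = np.linalg.lstsq(X1, y, rcond=None)[0]
z = y - X1 @ beta                      # estimated equilibrium error

# stage 2: error correction model in differences, with the lagged
# equilibrium error carried over from stage 1
dy, dx = np.diff(y), np.diff(x)
X2 = np.column_stack([np.ones(T - 1), dx, z[:-1]])
gamma = np.linalg.lstsq(X2, dy, rcond=None)[0]
```

The last element of `gamma` is the adjustment coefficient on the lagged equilibrium error, which must be negative for the system to return towards its long-run relationship.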
The theoretical status of the short-run dynamics in the LSE parsimoniously simplified equations is more problematic and here economic theory has as yet been less helpful. Hendry and Richard (1982, 1983) describe the modelling exercise as an attempt to provide a characterisation of what they call the "Data Generating Process" (the DGP) which is the joint probability distribution of the complete set of sample data (endogenous and exogenous variables). Actual DGPs, they suggest, will be very complicated, but the combination of marginalisation (exclusion of variables that do not much matter), conditioning (regarding certain variables as exogenous22) and simplification which together make up the activity of modelling can give rise to simple and structurally constant representations of the DGP.

The DGP concept derives from Monte Carlo analysis where the investigator specifies the process which will generate the data to be used in the subsequent estimation experiments. This suggests an analogy in which we suppose a fictional statistician choosing the data that we analyse in applied economics. In a pioneering contribution to the Artificial Intelligence literature, Alan Turing (1950) asked whether an investigator could infallibly distinguish which of two terminals is connected to a machine and which operated by a human. Hendry dares us to claim that we can distinguish between Monte Carlo and real world economic data. If we cannot, the DGP analogy carries over, and we can hope to discover structural short-term dynamics.

This argument appears to me to be flawed. If macroeconomic data do exhibit constant short-term dynamics then one might expect any structural interpretation to relate to the parameters of the adjustment processes of the optimising agents.
But we have seen that the aggregation conditions required for the aggregate parameters to be straightforwardly interpretable in terms of the microeconomic parameters are heroic. With Monte Carlo data, by contrast, we can be confident that there does exist a simple structure since the structure has been imposed by a single simple investigator. There are no aggregation issues, and the question of reduction does not arise.

The most promising route for rationalising the dynamics of LSE equations is in terms of the backward representation of a forward looking optimising model. In simple models, optimising behaviour in the presence of adjustment costs will give rise to a second order difference equation which can be solved to give a lagged (partial) adjustment term and a forward lead on the expected values of the exogenous variables. But these future expected values may always be solved out in terms of past values of the exogenous variables, giving a backward looking representation. Stephen Nickell (1985) showed that in a number of cases of interest, this backward representation will have the error correction form, and this suggests that it may in general be possible to rationalise error correction models in these terms (Keith Cuthbertson, 1988). An implication of this view, via the Lucas (1976) critique, is that if the process followed by any of the exogenous variables changes, the backward looking relationship will be structurally non-constant while the forward looking representation will remain constant. Current experience, however, is that in these circumstances it is the forward looking equation that is non-constant (Hendry, 1988; Carlo Favero, 1989). An alternative approach is to regard the short-term dynamics in LSE relationships as nuisance terms.

22. Strictly "at least weakly exogenous" — see Engle, Hendry and Richard (1983).
Direct estimation of the cointegrating relationships is inefficient because of the residual serial correlation resulting from the omitted dynamics and may be inconsistent because of simultaneity. It is possible that in part these omitted dynamics arise from aggregation across heterogeneous agents (Marco Lippi, 1988). In principle, one could estimate using a systems maximum likelihood (ML) estimator taking into account the serial correlation (Soren Johansen, 1988; Soren Johansen and Katarina Juselius, 1989), but there is advantage in using a single equations estimator since this localises any misspecification error. The single equations estimator must correct both for the simultaneity and for the serial correlation. In recent work Phillips (1988) has argued that LSE dynamic equations often come very close to, and sometimes achieve, optimal estimation of the cointegrating relationship. On this interpretation, the short-run dynamic terms in those equations are simply the simultaneity and Generalised Least Squares (GLS) adjustments, in the same way that one can rewrite the familiar Cochrane-Orcutt autoregressive estimator (Donald Cochrane and Guy Orcutt, 1949) in terms of a restricted OLS estimation of an equation containing lagged values of the dependent variable and the regressor variables (Hendry and Mizon, 1978). An implication is that we have come full circle back to pre-Haavelmo econometrics where the concern was the measurement of constants in lawlike relationships, which in modern terminology are simply the cointegrating relationships.

However, this is to miss much of the point of the methods generated by Sargan, Hendry and their colleagues. Routine forecasting and policy analysis in econometrics is as much or more concerned with short-term movements in key variables than with their long-term equilibria.
Furthermore, short-term (derivative) responses are generally very much better determined than long-term relationships. I argued in Gilbert (1989) that a substantial part of the motivation of the LSE tradition in econometrics was the perceived challenge to "white box" econometric models from "black box" time series (Box-Jenkins) models (Richard Cooper, 1972; Charles Nelson, 1972). The same points are true in relation to the development of VAR methodology. Practitioners of the LSE approach are unlikely, therefore, to recognise themselves in Phillips' description.

At the start of his famous 1972 survey "Lags in Economic Behavior", Marc Nerlove quoted Schultz (1938) as saying "Although a theory of dynamic economics is still a thing of the future, we must not be satisfied with the status quo in economics". Nerlove then went on to remark that "dynamic economics is still, in large part, a thing of the future" (Nerlove, 1972, p. 222). The rational expectations optimising models, examples of which I have already discussed, have constituted a major attempt to provide that dynamic theory. They have not been wholly successful, for the reasons I have indicated. Neither have they been wholly unsuccessful. A possible criticism of both the VAR and LSE approaches to modelling aggregate macro-dynamics is that they do not make any attempt to accommodate these theories. An alternative possibility is to argue that the problem is on the theorists' side, and that the rational expectations atomistic optimising models deliver models which are too simple even to be taken as reasonable approximations. The problem is, nevertheless, that however much the anomalies multiply, we are unlikely to abandon these theories until an alternative paradigm becomes available. Sadly, I do not see any indication that such a development is imminent.
I started this lecture by recalling a commitment to unify theory and empirical modelling. That programme has recorded a measure of success, but to a large extent that success has been in the modelling of long-term equilibrium relationships. When Nerlove surveyed the methods of dynamic economics, the contribution of theory was relatively new and relatively slight. We now have much better developed and more securely based theories of dynamic adjustment, but these theories have been too simple to inform practical modelling. It is obviously possible to argue that this is the fault of the econometricians, and the level of discord among the econometricians might be held as evidence for this view. My suspicion is, however, that the current disarray in the econometric camp is the consequence of the lack of applicable theory. Where we have informative and detailed theories, as for example in demand analysis or the theory of financial asset prices, methodological debates are muted. If the theorists can develop realistic but securely based dynamic theories, then the competing approaches to econometric methodology could coexist quite happily throughout macroeconometrics.

I have made a number of different arguments in the course of this paper, so a brief summary may be useful.

1. I agree with the currently widely held view that it is not possible in general to estimate parameters of micro functions from aggregate data.

2. I disagree with the implied view that aggregate relationships cannot be interpreted in terms of microeconomic theory. The appropriate level of aggregation will depend both on the purpose of the modelling exercise and on the questions being asked.

3. Theoretical restrictions should not be expected to hold precisely on aggregate data. This implies that classical rejections cannot per se be taken to imply rejection of the theories in question.

4.
Classical techniques of non-nested hypothesis testing provide a method for discriminating between alternative imprecise theories.
5. It is difficult to argue a priori that Bayesian shrinkage procedures have either superior or inferior statistical properties to the pseudo-structural methods associated with the British approach to dynamic modelling. An advantage of the latter approach is however that it gives a central role to model discovery, which may allow a beneficial feedback from data to theory.
6. Cointegration provides a powerful reason for believing that macroeconomic aggregates will be linked by structurally stable relationships, and it is an important advantage of the British approach that it embodies this feature of economic time series through error correction. However, the argument that the British approach to dynamic modelling should be seen as simply a method of efficiently estimating these equilibrium relationships is misconceived.
7. The progress in estimating relationships has not been matched by comparable progress in estimating dynamic adjustment processes, where theory and data appear to be quite starkly at odds. A possible response is that the existing optimising theories are just too simple.

REFERENCES

ALDRICH, JOHN, 1989. "Autonomy", Oxford Economic Papers, Vol. 41, pp. 15-34.
BANERJEE, ANINDYA, JUAN J. DOLADO, DAVID F. HENDRY, and GREGOR W. SMITH, 1986. "Exploring Equilibrium Relationships in Econometrics through Static Models: Some Monte Carlo Evidence", Oxford Bulletin of Economics and Statistics, Vol. 48, pp. 253-277.
BLUNDELL, RICHARD, 1988. "Consumer Behaviour: Theory and Empirical Evidence", Economic Journal, Vol. 98, pp. 16-65.
BLUNDELL, RICHARD, PANOS PASHARDES, and GUGLIELMO WEBER, 1988.
"What do we Learn about Consumer Demand Patterns from Micro-Data?", Institute for Fiscal Studies, Working Paper No. W88/10.
BOX, GEORGE E.P., and GWILYM M. JENKINS, 1970. Time Series Analysis: Forecasting and Control, San Francisco: Holden Day.
CHOW, GREGORY C., 1960. "Tests of Equality Between Sets of Coefficients in Two Linear Regressions", Econometrica, Vol. 28, pp. 591-605.
COCHRANE, DONALD, and GUY H. ORCUTT, 1949. "An Application of Least Squares Regression to Relationships Containing Auto-Correlated Error Terms", Journal of the American Statistical Association, Vol. 44, pp. 32-61.
COOLEY, THOMAS F., and STEPHEN F. LEROY, 1985. "Atheoretical Macroeconometrics: A Critique", Journal of Monetary Economics, Vol. 16, pp. 283-308.
COOPER, RICHARD L., 1972. "The Predictive Performance of Quarterly Econometric Models of the United States", in Bert Hickman (ed.), Econometric Models of Cyclical Behavior, NBER Studies in Income and Wealth, Vol. 36, pp. 813-926, New York: Columbia University Press.
CUTHBERTSON, KEITH, 1988. "The Demand for M1: A Forward-Looking Buffer Stock Model", Oxford Economic Papers, Vol. 40, pp. 110-131.
DAVIDSON, JAMES E.H., and DAVID F. HENDRY, 1981. "Interpreting Econometric Evidence: The Behaviour of Consumers' Expenditure in the UK", European Economic Review, Vol. 16, pp. 177-192.
DAVIDSON, JAMES E.H., DAVID F. HENDRY, FRANK SRBA, and STEPHEN YEO, 1978. "Econometric Modelling of the Aggregate Time Series Relationship Between Consumers' Expenditure and Income in the United Kingdom", Economic Journal, Vol. 88, pp. 661-692.
DEATON, ANGUS S., 1974. "A Reconsideration of the Empirical Implications of Additive Preferences", Economic Journal, Vol. 84, pp. 338-348.
DEATON, ANGUS S., and JOHN MUELLBAUER, 1980.
"An Almost Ideal Demand System", American Economic Review, Vol. 70, pp. 312-332.
DOAN, THOMAS, ROBERT B. LITTERMAN, and CHRISTOPHER A. SIMS, 1984. "Forecasting and Conditional Projection Using Realistic Prior Distributions", Econometric Reviews, Vol. 3, pp. 1-100.
ENGLE, ROBERT F., and CLIVE W.J. GRANGER, 1987. "Co-Integration and Error Correction: Representation, Estimation and Testing", Econometrica, Vol. 55, pp. 251-276.
ENGLE, ROBERT F., DAVID F. HENDRY, and JEAN-FRANCOIS RICHARD, 1983. "Exogeneity", Econometrica, Vol. 51, pp. 277-304.
FAVERO, CARLO, 1989. "Testing for the Lucas Critique: An Application to Consumers' Expenditure", University of Oxford, Applied Economics Discussion Paper #73.
FRIEDMAN, MILTON, 1953. "The Methodology of Positive Economics", in Milton Friedman, Essays in Positive Economics, Chicago: University of Chicago Press, pp. 3-43.
FRISCH, RAGNAR, 1933. "Editorial", Econometrica, Vol. 1, pp. 1-4.
FRY, VANESSA C., and PANOS PASHARDES, 1988. "Non-Smoking and the Estimation of Household and Aggregate Demands for Tobacco", Institute for Fiscal Studies, processed.
GEWEKE, JOHN, 1984. "Comment", Econometric Reviews, Vol. 3, pp. 105-112.
GILBERT, CHRISTOPHER L., 1986. "Professor Hendry's Econometric Methodology", Oxford Bulletin of Economics and Statistics, Vol. 48, pp. 283-307.
GILBERT, CHRISTOPHER L., 1989. "LSE and the British Approach to Time Series Econometrics", Oxford Economic Papers, Vol. 41, pp. 108-128.
GORMAN, W.M. (TERRENCE), 1953. "Community Preference Fields", Econometrica, Vol. 21, pp. 63-80.
GORMAN, W.M. (TERRENCE), 1959. "Separable Utility and Aggregation", Econometrica, Vol. 27, pp. 469-481.
GORMAN, W.M. (TERRENCE), 1968.
"The Structure of Utility Functions", Review of Economic Studies, Vol. 35, pp. 367-390.
GRANGER, CLIVE W.J., 1987. "Implications of Aggregation with Common Factors", Econometric Theory, Vol. 3, pp. 208-222.
GRANGER, CLIVE W.J., 1988. "Aggregation of Time Series - A Survey", Federal Reserve Bank of Minneapolis, Institute for Empirical Macroeconomics, Discussion Paper #1.
GRUNFELD, YEHUDA, 1958. "The Determinants of Corporate Investment", unpublished Ph.D. Thesis, University of Chicago.
GRUNFELD, YEHUDA, and ZVI GRILICHES, 1960. "Is Aggregation Necessarily Bad?", Review of Economics and Statistics, Vol. 42, pp. 1-13.
HAAVELMO, TRYGVE, 1944. "The Probability Approach in Econometrics", Econometrica, Vol. 12, Supplement.
HALL, ROBERT E., 1978. "Stochastic Implications of the Life Cycle-Permanent Income Hypothesis: Theory and Evidence", Journal of Political Economy, Vol. 86, pp. 971-987.
HANSEN, LARS PETER, and KENNETH J. SINGLETON, 1983. "Stochastic Consumption, Risk Aversion and the Temporal Behavior of Asset Returns", Journal of Political Economy, Vol. 91, pp. 249-265.
HATANAKA, M., 1975. "On the Global Identification of the Dynamic Simultaneous Equation Model with Stationary Disturbances", International Economic Review, Vol. 16, pp. 545-554.
HENDRY, DAVID F., 1988. "Testing Feedback versus Feedforward Econometric Specifications", Oxford Economic Papers, Vol. 40, pp. 132-149.
HENDRY, DAVID F., 1989. PC-GIVE: An Interactive Econometric Modelling System, Oxford: Institute of Economics and Statistics.
HENDRY, DAVID F., and GRAYHAM E. MIZON, 1978. "Serial Correlation as a Convenient Simplification, Not a Nuisance: A Comment on a Study of the Demand for Money by the Bank of England", Economic Journal, Vol. 88, pp. 549-563.
HENDRY, DAVID F., and JEAN-FRANCOIS RICHARD, 1982.
"On the Formulation of Empirical Models in Dynamic Econometrics", Journal of Econometrics, Vol. 20, pp. 3-33.
HENDRY, DAVID F., and JEAN-FRANCOIS RICHARD, 1983. "The Econometric Analysis of Economic Time Series", International Statistical Review, Vol. 51, pp. 111-163.
HILDENBRAND, WERNER, 1983. "On the 'Law of Demand'", Econometrica, Vol. 51, pp. 997-1019.
HOERL, ARTHUR, and ROBERT W. KENNARD, 1970a. "Ridge Regression: Biased Estimation for Nonorthogonal Problems", Technometrics, Vol. 12, pp. 55-67.
HOERL, ARTHUR, and ROBERT W. KENNARD, 1970b. "Ridge Regression: Applications to Nonorthogonal Problems", Technometrics, Vol. 12, pp. 69-82.
JOHANSEN, SOREN, 1988. "Statistical Analysis of Cointegration Vectors", Journal of Economic Dynamics and Control, Vol. 12, pp. 231-254.
JOHANSEN, SOREN, and KATARINA JUSELIUS, 1989. "The Full Information Maximum Likelihood Procedure for Inference on Cointegration - with Applications", University of Copenhagen, processed.
KLEIN, LAWRENCE R., 1946a. "Macroeconomics and the Theory of Rational Behaviour", Econometrica, Vol. 14, pp. 93-108.
KLEIN, LAWRENCE R., 1946b. "Remarks on the Theory of Aggregation", Econometrica, Vol. 14, pp. 303-312.
KUHN, THOMAS S., 1962. The Structure of Scientific Revolutions, Chicago: Chicago University Press.
LEAMER, EDWARD E., 1978. Specification Searches: Ad Hoc Inference with Nonexperimental Data, New York: Wiley.
LEAMER, EDWARD E., 1985. "Sensitivity Analysis Would Help", American Economic Review, Vol. 75, pp. 308-313.
LEAMER, EDWARD E., and HERMAN LEONARD, 1983. "Reporting the Fragility of Regression Estimates", Review of Economics and Statistics, Vol. 65, pp. 306-317.
LIPPI, MARCO, 1988.
"On the Dynamic Shape of Aggregated Error Correction Models", Journal of Economic Dynamics and Control, Vol. 12, pp. 561-585.
LITTERMAN, ROBERT B., 1986a. "A Statistical Approach to Economic Forecasting", Journal of Business and Economic Statistics, Vol. 4, pp. 1-4.
LITTERMAN, ROBERT B., 1986b. "Forecasting with Bayesian Vector Autoregressions - Five Years of Experience", Journal of Business and Economic Statistics, Vol. 4, pp. 25-38.
LUCAS, ROBERT E., 1976. "Econometric Policy Evaluation: A Critique", in Karl Brunner and Allan H. Meltzer (eds.), The Phillips Curve and Labor Markets (Journal of Monetary Economics, Supplement, Vol. 1, pp. 19-46).
McNEES, STEPHEN K., 1986. "Forecasting Accuracy of Alternative Techniques: A Comparison of U.S. Macroeconomic Forecasts", Journal of Business and Economic Statistics, Vol. 4, pp. 5-15.
MIZON, GRAYHAM E., 1977. "Inferential Procedures in Nonlinear Models: An Application in a U.K. Industrial Cross Section Study of Factor Substitution and Returns to Scale", Econometrica, Vol. 45, pp. 1221-1242.
MIZON, GRAYHAM E., and JEAN-FRANCOIS RICHARD, 1986. "The Encompassing Principle and Its Application to Non-Nested Hypotheses", Econometrica, Vol. 54, pp. 657-678.
MORGAN, MARY S., 1987. "Statistics without Probability and Haavelmo's Revolution in Econometrics", in Lorenz Kruger, Gerd Gigerenzer and Mary S. Morgan (eds.), The Probabilistic Revolution, Vol. 2, pp. 171-197, Cambridge, Mass.: MIT Press.
MUELLBAUER, JOHN, 1975. "Aggregation, Income Distribution and Consumer Demand", Review of Economic Studies, Vol. 42, pp. 525-543.
MUELLBAUER, JOHN, 1976. "Community Preferences and the Representative Consumer", Econometrica, Vol. 44, pp. 979-999.
NATAF, ANDRE, 1948. "Sur la Possibilité de Construction de Certains Macromodèles", Econometrica, Vol. 16, pp. 232-244.
NELSON, CHARLES R., 1972.
"The Prediction Performance of the FRB-MIT-PENN Model of the US Economy", American Economic Review, Vol. 62, pp. 902-917.
NERLOVE, MARC, 1972. "Lags in Economic Behavior", Econometrica, Vol. 40, pp. 221-251.
NICKELL, STEPHEN J., 1985. "Error Correction, Partial Adjustment and All That: An Expository Note", Oxford Bulletin of Economics and Statistics, Vol. 47, pp. 119-131.
PAGAN, ADRIAN, 1987. "Three Econometric Methodologies: A Critical Appraisal", Journal of Economic Surveys, Vol. 1, pp. 3-24.
PESARAN, M. HASHEM, 1986. "Editorial Statement", Journal of Applied Econometrics, Vol. 1, pp. 1-4.
PHEBY, JOHN, 1988. Methodology and Economics: A Critical Introduction, London: Macmillan.
PHILLIPS, PETER C.B., 1988. "Reflections on Econometric Methodology", Cowles Foundation Discussion Paper #893, Cowles Foundation, Yale University.
ROBBINS, LIONEL, 1935. An Essay on the Nature and Significance of Economic Science, (2nd edition), London: Macmillan.
SARGAN, J. DENIS, 1964. "Wages and Prices in the United Kingdom: A Study in Econometric Methodology", in Peter E. Hart, Gordon Mills and John K. Whittaker (eds.), Econometric Analysis for National Economic Planning, London: Butterworth.
SCHULTZ, HENRY, 1928. Statistical Laws of Demand and Supply with Special Application to Sugar, Chicago: University of Chicago Press.
SCHULTZ, HENRY, 1938. The Theory and Measurement of Demand, Chicago: University of Chicago Press.
SIMS, CHRISTOPHER A., 1980. "Macroeconomics and Reality", Econometrica, Vol. 48, pp. 1-48.
SIMS, CHRISTOPHER A., 1982. "Policy Analysis with Econometric Models", Brookings Papers on Economic Activity, 1982(1), pp. 107-152.
SIMS, CHRISTOPHER A., 1986. "Are Forecasting Models Usable for Policy Analysis?", Federal Reserve Bank of Minneapolis, Quarterly Review, Vol. 10, pp. 2-16.
SIMS, CHRISTOPHER A., 1987.
"Making Economics Credible", in Truman F. Bewley (ed.), Advances in Econometrics - Fifth World Congress, Cambridge: Cambridge University Press.
SPANOS, ARIS, 1988. "Towards a Unifying Methodological Framework for Econometric Modelling", Economic Notes, pp. 107-134.
STOKER, THOMAS M., 1986. "Simple Tests of Distributional Effects on Macroeconomic Equations", Journal of Political Economy, Vol. 94, pp. 763-795.
STONE, J. RICHARD N., 1954. "Linear Expenditure Systems and Demand Analysis: An Application to the Pattern of British Demand", Economic Journal, Vol. 64, pp. 511-527.
TERASVIRTA, TIMO, 1970. Stepwise Regression and Economic Forecasting, Economic Studies Monograph #31, Helsinki: Finnish Economic Association.
TRIVEDI, PRAVIN, 1984. "Uncertain Prior Information and Distributed Lag Analysis", in David F. Hendry and Kenneth F. Wallis (eds.), Econometrics and Quantitative Economics, Oxford: Basil Blackwell.
TURING, ALAN M., 1950. "Computing Machinery and Intelligence", Mind, Vol. 59, pp. 433-460.
WALLIS, KENNETH F., 1974. "Seasonal Adjustment and Relations Between Variables", Journal of the American Statistical Association, Vol. 69, pp. 18-31.
WOLD, HERMAN, and RAGNAR BENTZEL, 1946. "On Statistical Demand Analysis from the Viewpoint of Simultaneous Equations", Skandinavisk Aktuarietidskrift, Vol. 29, pp. 95-114.
WOLD, HERMAN, and LARS JUREEN, 1953. Demand Analysis, New York: Wiley.