% ------------------------------------------------------------------------
% AMS-LaTeX Paper ********************************************************
% **** -----------------------------------------------------------------------
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\documentclass[reqno]{amsart}
\usepackage{srcltx} % SRC Specials
% ------------------------------------------------------------------------
% Over-full v-boxes on even pages are due to the \v{c} in author's name
\vfuzz2pt % Don't report over-full v-boxes if over-edge is small
\hfuzz2pt % Don't report over-full h-boxes if over-edge is small
% THEOREMS ---------------------------------------------------------------
\newtheorem{THM}{{\!}}[section]
\newtheorem{THMX}{{\!}}
\renewcommand{\theTHMX}{} % not numbered
%
\newtheorem{thm}{Theorem}[section]
\newtheorem{cor}[thm]{Corollary}
\newtheorem{lem}[thm]{Lemma}
\newtheorem{prop}[thm]{Proposition}
\theoremstyle{definition}
\newtheorem{defn}[thm]{Definition}
\theoremstyle{remark}
\newtheorem{rem}[thm]{Remark}
%\numberwithin{equation}{section}
% MATH -------------------------------------------------------------------
\newcommand{\Real}{\mathbb R}
\newcommand{\RPlus}{\Real^{+}}
\newcommand{\norm}[1]{\left\Vert#1\right\Vert}
\newcommand{\abs}[1]{\left\vert#1\right\vert}
\newcommand{\set}[1]{\left\{#1\right\}}
\newcommand{\seq}[1]{\left<#1\right>}
\newcommand{\eps}{\varepsilon}
\newcommand{\To}{\longrightarrow}
\newcommand{\BX}{\mathbf{B}(X)}
\newcommand{\A}{\mathcal{A}}
\newcommand{\M}{\mathcal{M}}
\newcommand{\N}{\mathcal{N}}
\newcommand{\Lom}{\mathcal{L}}
\newcommand{\Comp}{\mathcal{K}}
\newcommand{\Basis}{\mathcal{B}}
\renewcommand{\baselinestretch}{1.1}
%%% ----------------------------------------------------------------------
\begin{document}
\title{A RATIONAL EXPECTATIONS FRAMEWORK FOR SHORT RUN POLICY ANALYSIS}
\author{Christopher A. Sims}
%\address{}
%\email{}
%
\thanks{This paper is a revision of one prepared for presentation at the
``New Approaches to Monetary Economics" conference, May 23-24
1985, at the IC$^{2}$ Institute of the University of Texas at
Austin. The research for the paper was supported in part by NSF
grant SES 8309329.}
%\subjclass{}
%\keywords{}
%\date{}
%\dedicatory{}
%\commby{}
%
%%%% ----------------------------------------------------------------------
%
%\begin{abstract}
%\end{abstract}
%%% ----------------------------------------------------------------------
\maketitle
%%% ----------------------------------------------------------------------
% ------------------------------------------------------------------------
%\bibliographystyle{amsplain}
%\bibliography{}
% ------------------------------------------------------------------------
\section{Introduction}
There is increasing recognition that Lucas's [1976] critique of
econometric policy evaluation, at least under its usual
interpretation, is logically flawed. The point has recently been made
forcefully by Sargent [1984] and by Cooley, LeRoy and
Raymon [1984] (henceforth referred to as CLR), as well as in my
own paper [1982]. The problem is that if the parameters of the
policy ``rule" are subject to change, as they must be if it makes
sense to evaluate changes in them, then the public must
recognize this fact and have a probability distribution over the
parameters of the rule. But then these ``parameters" are
themselves ``policy variables", taking on a time series of values
drawn from some probability law. Predicting how the economy will
behave if we set the parameters of the rule at some value and
keep them there is logically equivalent to predicting the
behavior of the economy conditional on a certain path of a
policy variable. Yet this is just the kind of exercise which
Lucas claimed to be meaningless.

It is also evident that the methods of policy evaluation which Lucas
criticized are still in wide use nine years after the appearance of his
paper. During discussions of monetary and fiscal policy, statistical
models prepared by the Congressional Budget Office, the Federal Reserve
Board, numerous other agencies, and by private entities are used to
prepare predictions of the likely future path of the economy conditional
on various possible paths for policy variables. These conditional
projections influence policy-makers' views of the likely consequences of
the choices they must make. Though Lucas suggested an alternative
paradigm for policy analysis, it is still little used.

Nonetheless, we are not quite to the point where well-trained young
macroeconomists collaborate in preparing and interpreting conditional
projections of the effects of alternative paths for policy variables
without feeling queasy. Sargent, while clearly explaining the problems
with the rational expectations paradigm for policy analysis and finding
no way around them, claims we must ignore them if we are to avoid the
conclusion that policy recommendations of any kind are meaningless. CLR
show that projections of the future path of the economy conditional on
paths of policy variables are not meaningless, even in the presence of
shifts in policy rule. But they never explicitly address the central
question of whether such conditional projections could ever be used in
the process of policy choice without invalidating them. My own paper
[1982] asserts that policy choice based on conditional projection from
models which do not identify the parameters of tastes and technology can
be logically coherent, but it does not support the assertion with formal
modeling, and many remain unconvinced.

The remainder of this paper attempts to make that assertion more
convincing by analyzing more closely how properly constructed and
interpreted conditional projections sidestep the Lucas critique and by
presenting examples of model economies in which policy makers steadily
make good use of conditional projections from ``loosely identified"
statistical models.
\section{Finding a Coherent Interpretation of the Lucas Critique}
Lucas formulates his critique most baldly and succinctly at the beginning
of section 6 of his paper, where he writes
\begin{quote}
...there are compelling empirical and theoretical reasons
for believing that a structure of the form
\end{quote}
\begin{equation}\label{eq:naive}
y_{t+1} = F(y_{t},x_{t},\beta,\eta_{t})
\end{equation}
\begin{quote}
($F$ known, $\beta$ fixed, $x_{t}$ ``arbitrary") will not be of use for
forecasting and policy evaluation in actual economies.
\end{quote}
He goes on to observe that for short-term forecasting the problem can be
avoided, and indeed has been avoided in practice, by allowing the
``parameters" $\beta$ to drift in time. But for policy evaluation he argues
that a completely different approach is required, in which models are
given the form
\begin{gather}\label{eq:rule}
x_{t} = G(y_{t},\gamma,\varepsilon_{t})\\
\label{eq:response}
y_{t+1} = F(y_{t},x_{t},\beta(\gamma),\eta_{t}) ,
\end{gather}
where $G$ and $F$ are known, $\gamma$ is a fixed parameter vector, and $\varepsilon_{t}$ and $\eta_{t}$
are vectors of disturbances. The econometric problem is now that of
estimating the function $\beta(\gamma)$, not a fixed vector of real numbers $\beta$, and
policy evaluation is performed by considering the effects of alternative
settings for $\gamma$, not by comparing choices of paths for $x$.

These assertions make no sense if taken at face value. The most widely
used method for allowing for parameter drift in forecasting models takes
the ``parameters" $\beta$ to be a time series evolving according to
\begin{equation}\label{eq:tv}
\beta(t) = H(\beta(t-1),\alpha,v(t)) ,
\end{equation}
where $v$ is a vector of random disturbances, $H$ is a known function, and $\alpha$
is a fixed vector of parameters. By recursively substituting lagged
versions of \eqref{eq:tv} into itself, we can obtain
\begin{equation}\label{eq:tvback}
\beta(t) = H_{1}(\alpha,V(t),\beta(0)),
\end{equation}
where $V(t) = \{v(1),\dots,v(t)\}$. If the influence of $\beta(0)$ on $H_{1}$ is
non-negligible, we can merge it with the parameter vector $\alpha$, to form the
new parameter vector $\alpha^{*}=(\alpha,\beta(0))$, so that
\begin{equation}\label{eq:newbeta}
\beta(t) = H^{*}(\alpha^{*},V(t)) .
\end{equation}
Substituting \eqref{eq:newbeta} into \eqref{eq:naive} produces
\begin{equation}\label{eq:seven}
y_{t+1} = F(y_{t},x_{t},H^{*}(\alpha^{*},V(t)),\eta_{t}) =
F^{*}(y_{t},x_{t},\alpha^{*},\eta^{*}(t)) ,
\end{equation}
where $\eta^{*}(t)=(V(t),\eta(t))$. Now obviously $F^{*}$ in \eqref{eq:seven} has exactly the
same form as $F$ in \eqref{eq:naive}. If a ``time-varying parameter" model like that
described by $F^{*}$ can in fact provide good forecasts, which Lucas claims
it can, then it would seem he cannot consistently claim also that a
structure of the form \eqref{eq:naive} ``will not be of use for forecasting... in
actual economies." Of course, if $F$ in \eqref{eq:naive} is linear, or in some other
sense ``simple", $F^{*}$ in \eqref{eq:seven} will generally not be. But Lucas says only
that $F$ is known, not that it is linear, and he undoubtedly meant to be
saying something stronger than that we need nonlinear models.

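
As a concrete scalar illustration of the substitution (the example is
mine, not Lucas's), suppose $H$ in \eqref{eq:tv} is linear,
$\beta(t) = \alpha\beta(t-1) + v(t)$. Recursive substitution then gives
\begin{equation*}
\beta(t) = \alpha^{t}\beta(0) + \sum_{s=0}^{t-1} \alpha^{s} v(t-s)
= H^{*}(\alpha^{*},V(t)), \qquad \alpha^{*} = (\alpha,\beta(0)),
\end{equation*}
and inserting this expression for $\beta(t)$ into \eqref{eq:naive}
produces an $F^{*}$ of exactly the form \eqref{eq:seven}: a
fixed-parameter model, though one that is nonlinear in $\alpha^{*}$ even
when $F$ and $H$ are both linear.
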
There is a similar problem with the claim that evaluating policy by using
\eqref{eq:rule} and \eqref{eq:response} to gauge the consequences of changing $\gamma$
is different from
using \eqref{eq:naive} to gauge the consequences of various choices of time path for
$x$. The ``fixed" parameter $\gamma$ is itself necessarily a time series with at
least two possible values (the value before we change policy and the
value after). Putting the appropriate subscript on $\gamma$ and substituting
\eqref{eq:rule} into \eqref{eq:response} gives us
\begin{equation}\label{eq:eight}
y_{t+1} = F(y_{t},G(y_{t},\gamma_{t},\varepsilon_{t}),\beta(\gamma_{t}),\eta_{t})
= F^{**}(y_{t},\gamma_{t},\eta^{**}_{t}) ,
\end{equation}
where $\eta^{**}_{t} = (\eta_{t},\varepsilon_{t})$ . Now we have transformed
\eqref{eq:rule} and \eqref{eq:response} into
\eqref{eq:eight}, wherein $F^{**}$ has exactly the functional form of $F$, with $\gamma$ playing
the role of $x$. There are no unknown parameters in \eqref{eq:eight} at all, but of
course in practice there would be unknown parameters in the functional
form of the $\beta(\gamma_{t})$ function, to take the place of the original fixed $\beta$
parameters.

Again, if $F$ were linear, $F^{**}$ would probably not be, but surely most
economists interpret Lucas's argument as asserting more than that we
will need nonlinear models to do good policy analysis.

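
A minimal example (the specification is mine, chosen only for
simplicity) makes the nonlinearity concrete. Let \eqref{eq:response} be
the scalar linear model $y_{t+1} = \beta(\gamma_{t})y_{t} + x_{t} + \eta_{t}$
and let the rule \eqref{eq:rule} be $x_{t} = \gamma_{t}y_{t} + \varepsilon_{t}$.
Substituting the rule into the model gives
\begin{equation*}
y_{t+1} = \bigl(\beta(\gamma_{t}) + \gamma_{t}\bigr)y_{t} + \varepsilon_{t} + \eta_{t}
= F^{**}(y_{t},\gamma_{t},\eta^{**}_{t}),
\end{equation*}
which is nonlinear in $(y_{t},\gamma_{t})$ jointly because of the product
$\gamma_{t}y_{t}$, even though $F$ and $G$ are each linear for fixed $\gamma$.
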
It is also true that Lucas wants us to take $\gamma$
as ``fixed". Of course if
we contemplate changing $\gamma$ it cannot really be fixed, but the spirit of
the argument is that we ought to consider once-and-for-all changes in $\gamma$,
i.e.\ ``paths" of $\gamma_{t}$ which are constant up to some date $T$ and constant
thereafter, with a discontinuity at $T$. Further, according to this
interpretation, we ought to concentrate on predicting the long run
effects of the change, after $\gamma$ has been at its new level long enough for
behavior to have completely adjusted. From the point of view of equation
\eqref{eq:eight}, this is the suggestion that we should limit ourselves to comparative
statics based on the long run properties of the model. But this
recommendation, if we admit that $F^{**}$ is of the same form as that displayed by
standard econometric models, is not at all revolutionary. And it is a
dubious recommendation -- in practice econometric models are probably
less reliable in their long run properties than in their short run
properties.

Some readers undoubtedly have given Lucas's critique the interpretation
outlined above. Sargent's paper is one example. Under this
interpretation, equation \eqref{eq:naive} is interpreted as general enough to
correspond to any statistical model relating endogenous variables $y$ to
lagged endogenous variables, policy variables $x$, some parameters $\beta$, and
random disturbances $\eta$. The claim is that any use of such a model to make
policy will have to change the model, making the model invalid. This
leads to skepticism toward all use of data in forming policy, since there
is no way to do so without using a model in the general form \eqref{eq:naive}.
Concentration on choice of the parameters of a policy rule is of no help,
since there is no logical distinction between such ``parameters" and
``policy variables". Thus on this interpretation, if one takes the
rational expectations critique of standard econometric policy evaluation
seriously it applies with equal force to the rational expectations
program for ``correct" econometric evaluation of policy.

As we will see in the next section through an example, the argument in
this form is simply incorrect. It is possible for an optimizing,
benevolent, immortal policy maker to make policy every period forever by
choosing among conditional projections from a statistical model while the
model remains accurate.

However, this version of the argument distorts the original. When Lucas
writes down \eqref{eq:naive} in section 6 of his paper, he introduces it as a
``structure" of the form \eqref{eq:naive}. By this he probably means to imply that $F$
is not some general statistical model, but the kind of entity he
described in more detail when using \eqref{eq:naive} earlier in the paper. There he
requires that ``the function $F$ and the parameter vector $\beta$ are derived from
decision rules (demand and supply functions) of agents in the economy,
and these decisions are, theoretically, optimal given the situation in
which each agent is placed." Furthermore, in the examples Lucas
considers there is in every case one or more functions contained in the
model which represent or are directly affected by agents'
expectation-formation rules.

The lesson of rational expectations is that when we use a model in whose
functional form agents' expectational rules are embedded, we are likely to
make errors even in forecasting if we insist that those expectational
rules are fixed through time, and that we will make even more serious
errors in policy evaluation if we pretend that those rules will remain
fixed despite changes in policy which make them clearly suboptimal. The
difference between \eqref{eq:naive} and the system \eqref{eq:rule}-\eqref{eq:response}
as frameworks for policy
analysis is not the superficial one that in the latter we think of
ourselves as choosing a ``fixed parameter" $\gamma$ while in the former we are
choosing ``arbitrary" values of policy ``variables" $x$. The difference is
that in \eqref{eq:naive} the parameters $\beta$ in fact depend on the hidden policy variable
$\gamma$. If we try to use \eqref{eq:naive} to guess the effects of various $x$
paths which
are in fact accompanied by changes in $\beta$, we will make errors. The
advantage of \eqref{eq:rule}-\eqref{eq:response}, or the equivalent model
\eqref{eq:eight}, is that either of them
takes proper account of the effect of $\gamma$ on $\beta$.

Thus we should not necessarily expect a different mathematical form or a
different probabilistic treatment of policy variables for models which
take proper account of rational expectations. It is even in principle
possible that such models could have $F^{**}$ functions which turned out to
be a set of linear stochastic difference equations, with the
corresponding mistaken $F$ function being of complicated nonlinear form.

Most important, the problems of identification for rational expectations
models are not fundamentally different from those for what used to be
standard models. In the happy circumstance where the historically
observed data contains exogenous variation in policy along the lines we
are currently contemplating, we can estimate the effects of our policy
choices by reduced form modeling. We can estimate regressions of current
data on current and past policy variables and correctly use the results
to project the likely effects of our policy choices. To do this we need
not separate the effects of policy occurring directly from those occurring
indirectly through modifications in expectation-formation rules of the
public. It is exactly this point which is so neatly laid out by CLR.

This is not to deny that identification is a hard problem.
Identification for purposes of policy evaluation is roughly the same hard
problem whether or not we take account of rational expectations.
Historical data on policy variables will usually reflect some systematic
pattern of response of policy to disturbances originating elsewhere in
the economy. Therefore conditional distributions of other variables
given policy variables do not necessarily correspond to conditional
distributions of those variables given autonomously induced changes in
policy variables.
\section{Optimal Policy Using Conditional Projections}
If policy is made optimally, it is always reacting correctly to all
available information about the state of the economy. Presumably then it
does not display capricious or arbitrary variation. This suggests that
it should contain no autonomous randomness, so that policy variables
should be exact functions of past data. In this case, there would be no
way to separate the effects of policy variables from the effects of the
variables policy depends on, except by use of strong auxiliary
assumptions. On the other hand, if policy is made sub-optimally, it might
contain a lot of capricious variation; we might then easily obtain
estimates of the effect of this variation. Yet if policy makers use
our estimates to improve policy, the amount of autonomous variation in
policy will shrink, the probability structure of the economy is likely to
change, and our estimates may quickly become obsolete.

Is there any way that policy could both be chosen optimally, on the basis
of a correct model, and at the same time contain enough autonomous
variation to allow accurate estimation of the effects of deliberately
induced variations in policy?

It is not hard to see that the answer must be yes. All that is required
is the existence of some source of variation in policy choice which, as
far as the public is concerned, is indistinguishable from an error or a
capricious shift in policy choice. One obvious possibility arises when
we recognize that macroeconomic policy is in fact set through a political
process, in which groups with varying knowledge and objectives contest to
influence policy. The public does not know the identity, the objectives
or the relative political strength of these groups with certainty. Hence
actual policy always contains an unpredictable element from this source.
The public has no way of distinguishing an error by one of the political
groups in choosing its target policy from a random disturbance in policy
from the political process. Hence members of such a group can
accurately project the effects of various settings of policy they might
aim for by using historically observed reactions to random shifts in
policy induced by the political process. The group will itself, if it
behaves optimally according to its own objective function, make its
target policy a deterministic function of data it observes. But it can
implement that function correctly by, at each date, using a statistical
model to make conditional projections of the effects of alternative
policy variable paths and choosing the projected path it likes best.

CLR include regime switches in their model, but provide no explanation
for why policy differs under the two regimes. It should be apparent,
though, that something at least very much like their model could emerge
if their two regimes were generated by two optimizing political
coalitions.\footnote{Recently Roberds [1985] has produced an explicit model which goes
some way toward capturing optimal behavior of stochastically alternating
regimes of policy makers. Like CLR, he makes the regime switch a purely
exogenous random process, but instead of arbitrarily changing policy
rules, regime switches in Roberds' model change policy makers' objective
functions. His model is not ready to be tacked on to CLR's, however,
because it does not follow CLR in giving the public an inference problem
in trying to determine the current regime. This means also that, unlike
CLR, Roberds does not make the stochastic regime switches a source of
identifying variation in policy; the public sees regime switches directly
in Roberds' model, so they are an extra observable policy variable rather
than an underlying source of variation in observable policy variables.
Roberds also ignores the interesting problem of strategic interaction
between the two regimes. Each regime takes account of the fact that the
public is modeling its behavior, but neither attempts to model the
behavior of the other regime or consider the possibility that its own
behavior could affect the behavior of the other regime. It would be
interesting to see work along the lines Roberds has begun that ties more
directly to the setting CLR dealt with.}

While policy randomness due to political struggles is probably the most
realistic and important source of identifying variation in policy, it
leads to analytically challenging models. A simpler way to approach the
question of how to design good policy posits a unitary policy authority.
In such a framework, the most plausible source of identifying variation
in policy is noisy information available to the policy authority which is
visible to the econometrician, if at all, only with a delay. The
optimizing policy maker will use the noisy information, but because it is
noisy his use of it will introduce into policy a ``random error". The
reaction of the economy to this random error will provide a statistical
basis for determining the reaction of the economy to hypothetical
optimization errors. To some it may be apparent immediately that this
setup will ``work". But since I am not aware of a closely similar
construction in the literature, we work out an explicit model here.

The typical agent in this economy chooses the vector $C_{t}$ (consumption) at
$t$. The government chooses $G_{t}$, government activity per capita, at $t$.
Utility of the agent is
\begin{equation}\label{eq:nine}
E \left[ \sum_{t=1}^{\infty} \rho^{t}\,\tfrac{1}{2}\left(C_{t}'AC_{t} + G_{t}'BG_{t} +
K_{t}'FK_{t}\right)\right] .
\end{equation}
The technology imposes the constraint
\begin{equation}\label{eq:ten}
K_{t} = HC_{t} + MG_{t} + NK_{t-1} + X_{t} ,
\end{equation}
where the stochastic process $X$ is exogenous to both private agents and the
government. Though this framework is very general and looks like the
canonical quadratic-linear control problem, it is unconventional in that
it does not assert that $X_{t}$ is i.i.d., only that its evolution is
unaffected by choices of $G$, $C$, or $K$.

The government and the agents both try to maximize the same objective
function \eqref{eq:nine}. There is an ``information process" containing three
subvectors, $Q_{t}$, $W_{t}$ and $Z_{t}$. When private agents choose $C_{t}$, they do
so with knowledge of all values of $Q_{s}$, $W_{s}$ and $Z_{s}$ for $s \leq t$. The
government must choose $G_{t}$ based on knowledge only of these variables
for $s \leq t-1$, plus an observation on $Z_{t}$.

We assume special relationships among $Q$, $W$, $X$ and $Z$ to generate our
example: $Q$ and
$W$ are Granger causally prior to $Z$, $X$ is a linear function
of current and past $Q$ and $W$ alone, and $Z_{t}$ has the form of a noisy
measurement of $W_{t}$. I.e., assuming $Q$, $W$, and $Z$ form a linear process
with an autoregressive representation,
\begin{gather}\label{eq:eleven}
E_{t}\begin{bmatrix} Q_{t+1} \\ W_{t+1} \end{bmatrix}
= a * \begin{bmatrix} Q_{t} \\ W_{t} \end{bmatrix} \\
\label{eq:twelve}
E[Z_{t} \,|\, W_{s},Q_{s},Z_{s-1}, \text{ all } s \leq t] = W_{t} \\
\label{eq:thirteen}
X_{t} = b * \begin{bmatrix} Q_{t} \\ W_{t} \end{bmatrix}\: ,
\end{gather}
where $a(s)=b(s)=0$ for $s<0$ ($a$ and $b$ are one-sided) and ``$E_{t}[.]$" stands for
``$E[.\,|\,Q_{s},W_{s},Z_{s}, \text{ all } s \leq t]$".
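
Here ``$*$" denotes one-sided convolution of a coefficient sequence with
a time series; written out (this merely makes the one-sidedness
condition explicit),
\begin{equation*}
a * \begin{bmatrix} Q_{t} \\ W_{t} \end{bmatrix}
= \sum_{s=0}^{\infty} a(s)\begin{bmatrix} Q_{t-s} \\ W_{t-s} \end{bmatrix},
\end{equation*}
and similarly for $b$ in \eqref{eq:thirteen}.
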
The Granger causal priority assumption means that the information the
government obtains at $t$ is strictly redundant from the point of view of
agents at $t$. This means in turn that the solution to this problem can be
found as if there were a single optimizing agent who must choose $G_{t}$ at
a stage where his information set is smaller than it is at the next
stage, when he chooses $C_{t}$. This would be impossible if, say, private
agents also observed only noisy information on $X_{t}$ at $t$, but with a
noise different from that facing the government.

Because $K_{t-1}$ enters the constraint \eqref{eq:ten}, the public cares about future
values of
$G$, leaving a role for the rational expectations hypothesis.
Because the model is quadratic-linear, we can solve it using dynamic
certainty equivalence.

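
Concretely, certainty equivalence implies that the optimal rules are
linear in the inherited capital stock and in forecasts of the exogenous
process. Under the setup above (the coefficient names here are mine) the
rules take the form
\begin{equation*}
C_{t} = f_{C}K_{t-1} + \sum_{s=0}^{\infty} d_{C}(s)E_{t}[X_{t+s}], \qquad
G_{t} = f_{G}K_{t-1} + \sum_{s=0}^{\infty} d_{G}(s)\hat{E}_{t}[X_{t+s}],
\end{equation*}
where $\hat{E}_{t}$ is expectation conditional on the government's
coarser information set and the matrices $f_{C}$, $f_{G}$, $d_{C}(s)$,
$d_{G}(s)$ depend on $A$, $B$, $F$, $H$, $M$, $N$ and $\rho$ alone, not
on the probability law of $X$.
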
The main point of the example does not depend on the explicit solution of
the model. Consider the government's problem at $t$, which is to choose
$G_{t}$. Dynamic certainty equivalence tells us that a correct approach to
this problem can begin by forming forecasts of the path of
$X$ using the data
available to the government at this point, i.e. values of $Q_{s}$, $W_{s}$ for
s