RATS 11.1

IMPULSE computes the responses of the system to particular initial shocks. It has a large array of options for specifying the shocks, but most of these (such as PATHS and MATRIX) are used more naturally with FORECAST. The VAR methodology works primarily with first-period shocks.

 

ERRORS uses the same information as IMPULSE, but produces its output in a different form. It decomposes the forecast error variance into the part due to each of the innovation processes. While IMPULSE will accept any form of innovation process, ERRORS requires orthogonalization, as the decomposition is meaningless without it.

 

You should always use IMPULSE and ERRORS together. For instance, you may note an interesting and unexpected response using IMPULSE. Before you get too excited, examine the variance decomposition with ERRORS—you may find that it has a trivial effect.
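For example, here is a minimal sketch that sets up a small VAR and then runs both instructions; the variable names, lag length, and horizon are illustrative, not from any particular example file:

system(model=varmodel)
variables y1 y2 y3
lags 1 to 4
det constant
end(system)
estimate(noprint)
*  Responses (by default, to Cholesky shocks) over 24 steps
impulse(model=varmodel,steps=24,results=resp)
*  Decomposition of the forecast error variance over the same horizon
errors(model=varmodel,steps=24)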

 

You can use the Time Series>VAR (Forecast/Analyze) wizard to analyze both the impulse responses and the variance decomposition.

 

Technical Information

If we look at the moving average representation of a vector time series:

\begin{equation} {\bf{y}}_t = {\bf{\hat y}}_t + \sum\limits_{s = 0}^\infty {\Psi _s {\bf{u}}_{t - s} } = {\bf{\hat y}}_t + \sum\limits_{s = 0}^\infty {\Psi _s {\bf{Fv}}_{t - s} } ,\quad E\,{\bf{u}}_t {\bf{u'}}_t = \Sigma ,\quad E\,{\bf{v}}_t {\bf{v'}}_t = {\bf{I}} \end{equation}

IMPULSE computes the \(\Psi _s \) or the \(\Psi _s^*  \equiv \Psi _s {\bf{F}}\) (for orthogonalized innovations). These are organized as \(N^2\) series, although you can also do shocks to one innovation at a time.
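Since \({\bf{u}}_t = {\bf{Fv}}_t\) with \(E\,{\bf{v}}_t {\bf{v'}}_t = {\bf{I}}\), any factor satisfying \({\bf{FF'}} = \Sigma\) can serve as \(\mathbf{F}\); the standard choice for orthogonalized responses is the lower-triangular Cholesky factor (which is what %DECOMP returns). As a small illustration with made-up numbers:

\begin{equation} \Sigma = \begin{pmatrix} 4 & 2 \\ 2 & 5 \end{pmatrix},\quad {\bf{F}} = \begin{pmatrix} 2 & 0 \\ 1 & 2 \end{pmatrix},\quad {\bf{FF'}} = \begin{pmatrix} 4 & 2 \\ 2 & 5 \end{pmatrix} = \Sigma \end{equation}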

 

The error in the \(H\)-step ahead forecast is:

\begin{equation} \sum\limits_{s = 0}^{H - 1} {\Psi _s^* {\bf{v}}_{t + H - s} } \end{equation}

As \(\mathbf{v}\) has been assumed to be uncorrelated both across time and contemporaneously (and to have unit variances), the covariance matrix of the \(H\)-step forecast error is

\begin{equation} \sum\limits_{s = 0}^{H - 1} {\Psi _s^* {\Psi _s^*} ^\prime } = \sum\limits_{s = 0}^{H - 1} {\Psi _s^{} {\bf{FF'}}{\Psi _s^{}} ^\prime } \end{equation}

This doesn't depend upon \(\mathbf{F}\), as long as \({\bf{FF'}} = \Sigma \). We can isolate the effect of a single component of \(\mathbf{v}\) by rewriting the sum as

\begin{equation} \sum\limits_{s = 0}^{H - 1} {\sum\limits_{i = 1}^N {\Psi _s^* {\bf{e}}(i){\bf{e}}(i)'{\Psi _s^*} ^\prime } } = \sum\limits_{i = 1}^N {\sum\limits_{s = 0}^{H - 1} {\Psi _s^* {\bf{e}}(i){\bf{e}}(i)'{\Psi _s^*} ^\prime } } \end{equation}

where \({{\bf{e}}(i)}\) is the \(i\)-th unit vector. This decomposes the variance-covariance matrix of forecast errors into \(N\) terms, each of which shows the contribution of a single component of \(\mathbf{v}\) over the \(H\) periods. In particular, if we look at the forecast error variance of variable \(j\), we get

\begin{equation} \sum\limits_{i = 1}^N {\left\{ {\sum\limits_{s = 0}^{H - 1} {\left\{ {\left( {\Psi _s^* } \right)_{j,i} } \right\}^2 } } \right\}} \end{equation}

For each source shock \(i\), we sum the squared responses of variable \(j\) to that shock over the forecast period to get its contribution to the overall forecast error variance. Since those terms are all non-negative, they decompose the total.
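The entries that ERRORS reports are these contributions expressed as percentages of the total \(H\)-step forecast error variance (the square of the standard error column in the output shown later), that is,

\begin{equation} 100 \times \frac{\sum\limits_{s = 0}^{H - 1} {\left( {\Psi _s^* } \right)_{j,i}^2 } }{\sum\limits_{k = 1}^N {\sum\limits_{s = 0}^{H - 1} {\left( {\Psi _s^* } \right)_{j,k}^2 } } } \end{equation}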

Graphing Impulse Responses

The moving average representation (MAR) of a model is simply the complete set of impulse responses. These are most usefully presented graphically. There are three logical ways to organize graphs of the MAR.

A single graph can have responses of all variables to a shock in one variable. If the variables are measured in different units, it is advisable to standardize the responses: if you divide each variable's response by the standard deviation of its residuals, all responses are expressed in fractions of a standard deviation. When the variables are in comparable units and a comparison of actual values is important, it is better to graph unscaled responses. For example, you would want to compare interest rates and the rate of inflation without scaling.

A single graph can have responses of one variable to shocks in all the variables. There is no problem of scale in this case.

You can use a matrix of small graphs, each with only a single response. This looks nicer, but is more difficult to set up, as you must be sure that all graphs showing the responses of a single variable use the same MAXIMUM and MINIMUM values. Otherwise, very small responses will be spread across the entire height of a graph box and look quite imposing.
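Here is a minimal sketch of the matrix layout (not taken from IMPULSES.RPF). It assumes NVAR variables, responses saved with IMPULSE(...,RESULTS=RESPONSES) so that RESPONSES(j,i) holds the response of variable j to shock i, and VECTORs named LOWER and UPPER that you have already filled with the common scale to use for each variable's row of graphs:

spgraph(hfields=nvar,vfields=nvar)
do j=1,nvar
   do i=1,nvar
      *  same vertical scale for every response of variable j
      graph(number=0,min=lower(j),max=upper(j)) 1
      # responses(j,i)
   end do i
end do j
spgraph(done)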

 

Where possible, use the procedure @VARIRF to do the graphs. It takes an already estimated VAR, computes the impulse responses, and organizes the graphs in one of several formats. We use this twice in IMPULSES.RPF, once with each page having responses of all variables to a single shock (the PAGE=BYSHOCKS option), and once with each page having the responses of each variable to all shocks (PAGE=BYVARIABLES).

 

@VARIRF(model=canmodel,steps=nsteps,$
   varlabels=implabel,page=byshocks)
@VARIRF(model=canmodel,steps=nsteps,$
   varlabels=implabel,page=byvariables)

 

Confidence Bands

Point estimates alone of impulse responses may give a misleading impression. You might note a response whose sign is unexpected. Is this truly interesting, or is it just a statistical fluke? This can be answered, in part, by examining the corresponding error decomposition. If the innovation you're examining in fact explains a trivial amount of the target variable, then it isn't really meaningful.

 

But many responses can't quite so easily be dismissed as uninteresting. IMPULSE produces a moving average representation from the point estimates of a VAR. Since the coefficients of the VAR aren't known with certainty, neither are the responses. There are three principal methods proposed for computing confidence bands or standard errors for impulse responses:

 

1. Monte Carlo integration
2. Delta method (linearization)
3. Bootstrapping

 

The delta method is sometimes given the appealing description of “analytical”, but it is based upon a linearization which becomes increasingly inaccurate as the number of steps grows and the response functions become increasingly non-linear. We have a very strong preference for Monte Carlo integration, as is done with the @MONTEVAR or @MCMCDODRAWS procedures. See Sims and Zha (1999) for a discussion of these issues. The replication programs for the Sims and Zha paper include both Monte Carlo integration and bootstrapping.
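As an illustration, a call along the following lines produces response graphs with Monte Carlo bands for an already estimated VAR. The MODEL, STEPS and DRAWS options (and the draw count of 2000) are shown here only as typical choices, not verified against the procedure; check the header comments of @MONTEVAR for the full option list:

@montevar(model=canmodel,steps=nsteps,draws=2000)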

 

Scaling Responses

The scale on an orthogonalized shock is determined empirically based upon the values of the \(\Sigma\) matrix. This has the advantage that each shock is of a "historically typical" size. However, you might be more interested in a shock of a particular size or a shock which has an impact of a particular size. Impulse responses are linear, so this can be achieved either by adjusting the size of the input shock (usually the easiest) or by scaling the size of the response. (Note that changing what would be a positive shock to a negative shock is the same process, just scaling by -1).

 

You input a specific shock using IMPULSE with the SHOCK option. The IMPULSES.RPF example includes a computation (near the end of the program) which does a Cholesky shock to the (Canadian) interest rate (variable 3 in the model), re-scaled to an own response of -1. Shock 3 is column 3 of the factor; the impact on variable 3 is the 3rd element of that column, so multiplying by -1/RSHOCK(3) keeps the same "shape" but changes the sign and scale:

 

compute factor=%decomp(%sigma)
compute [vector] rshock=%xcol(factor,3)
compute rshock=-1.0*rshock/rshock(3)
impulse(model=canmodel,shock=rshock,steps=nsteps,$
   results=to_r,noprint)
graph(number=0,header=$
   "Response of Canadian RGDP to expansionary rate shock") 1
# to_r(5,1)

 

 

Interpreting the Decomposition of Variance

Decomposition of Variance for Series CANRGDPS
Step  Std Error     USARGDPS  CANUSXSR  CANCD90D   CANM1S   CANRGDPS  CANCPINF
   1  0.004688318    13.062     2.172     2.321     0.604    81.842     0.000
   2  0.007407510    21.727     2.495     1.291     0.943    73.505     0.040
   3  0.008882190    19.386     4.086     0.977     8.445    67.062     0.044
   4  0.010004321    15.284     4.194     3.754    12.355    64.256     0.156
   5  0.010632775    14.403     4.704     6.786    12.090    61.583     0.434
   6  0.011511805    17.409     5.516    13.026    10.328    53.250     0.472
   7  0.013088464    21.646     5.594    21.954     8.939    41.435     0.432
   8  0.014910398    23.338     5.798    29.436     8.936    31.964     0.529
   9  0.016733054    23.104     6.219    35.950     8.549    25.436     0.742
  10  0.018526192    22.059     6.274    41.696     8.106    20.916     0.949

 

This is part of one of the tables produced by an ERRORS instruction for a six-variable VAR. There will be one such table for each endogenous variable.

 

The first column in the output is the standard error of forecast for this variable in the model. Since the computation assumes the coefficients are known, it understates the true uncertainty when the model has estimated coefficients. The remaining columns provide the decomposition. In each row, they add up to 100%. For instance, in the sample above, 81.84% of the variance of the one-step forecast error is due to the innovation in CANRGDPS itself. However, the more interesting information is at the longer steps, where the interactions among the variables start to be felt. We have truncated this table to 10 steps to keep its size manageable, but ordinarily you should examine at least four years’ worth of steps.

 

The above table suggests the following:

The three principal factors driving CANRGDPS are itself, USARGDPS and CANCD90D, which is a short interest rate.

The importance of USARGDPS is fairly constant across the range. Because this variable was first in the ordering, it isn't clear (from examining just this one ordering) whether this is merely an artifact of the ordering.

Innovations in CANCD90D take almost six periods to have an effect, but then quickly become the prime mover.

The other three variables (CANM1S, CANUSXSR and CANCPINF) have negligible explanatory power for CANRGDPS.

If you want more information on how USARGDPS and CANCD90D affect CANRGDPS, you need to look at the impulse response functions.

 

Choosing Orderings

If you work with Cholesky factorizations, the orderings that you should examine depend upon the set of questions you want to answer, and upon the structure of the covariance matrix of residuals.

Variables that you don’t expect to have any predictive value for other variables should be put last: for instance, local variables in a model with national variables.

By definition, the first variable in the ordering explains all of its one-step variance.

The one-step variance will be nearly 100% due to own innovations if there is little correlation between the residuals of a variable and the residuals of variables that appear before it in the ordering.

When there is substantial correlation among innovations in variables, the decomposition of one-step variance depends strongly on the order of factorization.

To determine whether a variable behaves exogenously, put the variable first in the ordering. The variance in an exogenous variable will be explained primarily by own innovations. The meaning of “primarily” depends upon the number of variables in the system—50% is quite high in a six variable system. Remember that if the covariance matrix is nearly diagonal, the decomposition of variance will be fairly robust to changes of order.

 

Focusing on Two Variables

When there is high correlation between innovations in two variables, run a pair of decompositions with the two variables placed next to each other, changing only the positions of those two variables from one ordering to the next. Since the combined explanatory power of the two variables is independent of which one comes first, you can examine how the variance splits between them.

 

Usually, most of the variance will be attributed to whichever variable comes first. If this is true for both orderings, we can draw no conclusions. If one variable does much better when placed second than the other does, you have evidence that it is the causative factor and that the other simply moves closely with it. If most of the power is attributed to whichever variable is second, some linear combination of the two variables, perhaps the sum or difference, is the truly important factor.

 

In the example shown for the decomposition of variance, there is the question of whether or not the influence of USARGDPS is mainly due to its placement in the ordering before CANRGDPS. That can be answered by switching the ordering around to put CANRGDPS earlier. This reordering pretty much wipes out the fraction of variance due to USARGDPS.
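A sketch of that check follows; the lag length and deterministic terms are placeholders, since the actual specification comes from IMPULSES.RPF. Define a second system with CANRGDPS moved ahead of USARGDPS, re-estimate (the reduced form and \(\Sigma\) are unchanged, but the Cholesky factor now follows the new equation order), and rerun ERRORS:

system(model=altmodel)
variables canrgdps usargdps canusxsr cancd90d canm1s cancpinf
lags 1 to nlags
det constant
end(system)
estimate(noprint)
errors(model=altmodel,steps=nsteps)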

IMPULSES.RPF Example

IMPULSES.RPF computes and graphs impulse response functions (to Cholesky shocks) in several different formats. Two are done using @VARIRF, which does one page per shock (with six panes) and one page per variable (again with six panes). Then, direct programming is used to do single graphs with responses of all six variables to each shock, and single graphs with responses for each variable to all shocks. It also uses @MONTEVAR to do a basic set of confidence bands.

 

Very little of IMPULSES.RPF needs to be changed to handle a different data set. Once the data are in and transformed, you need to set the number of lags and the number of steps of responses.
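For instance (the values here are illustrative, not those used in the example file):

*  number of VAR lags and number of response steps
compute nlags  = 4
compute nsteps = 24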


Copyright © 2026 Thomas A. Doan