Spillover/Causality-in-Variance Tests

A common topic for people writing a master’s thesis or undergraduate paper is to study the “spillover” between markets A and B using GARCH models. While seemingly a relatively straightforward case of “estimate and test some coefficients for zero”, it is, in fact, much less clear whether the results from this are of real value. There are several reasons for this:

- (a) Volatility isn’t observable.
- (b) Unlike causality in the mean, where (Granger) “causality” means that a series is forecast better with the extra information, causality in variance means that the variance forecasts are merely different. (Because of (a), you can’t really tell whether they are better.)
- (c) GARCH models are all approximations, and models with similar fits can produce very different results.

Cheung and Ng (1996) tried to avoid some of these problems by proposing a test based upon the results of univariate GARCH models, testing cross-correlations of squared univariate standardized residuals. That fails as a useful strategy for several reasons. First, the squared residuals are known to be a poor proxy for volatility. Second, this is basically an application of the Pierce-Haugh (1977) test for Granger causality, which was very quickly discarded as it produced unreliable results.
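For readers who want to see what such a cross-correlation statistic involves, the following is a minimal sketch in plain Python, not a reference implementation. It assumes the two series of standardized residuals have already been obtained from univariate GARCH fits. The function names, the simulated inputs, and the choice of 5 lags are all illustrative assumptions. The statistic is the sample size times the sum of squared cross-correlations, which is usually compared with a \(\chi^2\) with as many degrees of freedom as lags.

```python
import math
import random

def standardize(xs):
    """Center and scale a series to zero mean and unit variance."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / sd for x in xs]

def cross_corr(u, v, k):
    """Sample cross-correlation between u[t] and v[t-k] for lag k >= 1."""
    n = len(u)
    return sum(u[t] * v[t - k] for t in range(k, n)) / n

def cross_corr_stat(z1, z2, max_lag):
    """T * sum of squared cross-correlations of the squared residuals.

    Looks at whether lagged z2**2 is correlated with current z1**2,
    i.e. the series 2 -> series 1 direction.
    """
    u = standardize([z * z for z in z1])
    v = standardize([z * z for z in z2])
    return len(u) * sum(cross_corr(u, v, k) ** 2 for k in range(1, max_lag + 1))

# Hypothetical inputs: simulated draws standing in for standardized residuals.
random.seed(0)
z2 = [random.gauss(0.0, 1.0) for _ in range(500)]
z1_indep = [random.gauss(0.0, 1.0) for _ in range(500)]  # unrelated to z2
z1_dep = [0.0] + z2[:-1]                                  # z1[t] copies z2[t-1]

stat_indep = cross_corr_stat(z1_indep, z2, max_lag=5)
stat_dep = cross_corr_stat(z1_dep, z2, max_lag=5)
print(stat_indep, stat_dep)
```

The second input is built so that the lag-1 dependence is extreme, which makes the statistic very large. Real residuals will not be so clear-cut, and the caveats above about squared residuals as a volatility proxy apply in full.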

The typical parametric method for testing causality-in-variance is to use a multivariate GARCH model in which some parameters create cross-variable effects on future volatility estimates, and test whether those are zero. The most common choice for this is the BEKK model, where the off-diagonal elements would have that type of effect. (Given the standard parameterization, A(1,2) would create a cross-effect from series 1 to series 2.) However, a BEKK often produces a similar fit to a DVECH model, and the DVECH model (by construction) permits no “spillover” as the volatility of each series depends only upon its own past.
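To make the role of the off-diagonal elements concrete, here is a minimal sketch of a single step of a bivariate BEKK recursion in plain Python. It assumes the form H(t) = CC' + A'u(t-1)u(t-1)'A + B'H(t-1)B, which is consistent with the statement above that A(1,2) carries the effect from series 1 to series 2. All parameter values and helper names are made up for illustration and are not estimates from any data set.

```python
def transpose(X):
    return [list(row) for row in zip(*X)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def bekk_step(C, A, B, u, H):
    """One BEKK update: H_new = C C' + A' u u' A + B' H B (2x2 case)."""
    uu = [[u[i] * u[j] for j in range(2)] for i in range(2)]
    const = matmul(C, transpose(C))
    arch = matmul(matmul(transpose(A), uu), A)
    garch = matmul(matmul(transpose(B), H), B)
    return [[const[i][j] + arch[i][j] + garch[i][j] for j in range(2)]
            for i in range(2)]

# Illustrative (hypothetical) parameter values.
C = [[0.10, 0.00], [0.02, 0.10]]
B = [[0.90, 0.00], [0.00, 0.90]]   # B(1,2) = 0 in both cases below
H_prev = [[1.0, 0.3], [0.3, 1.0]]

def h22_after_shock(a12, u1):
    """Next-period variance of series 2 after a shock u1 to series 1."""
    A = [[0.30, a12], [0.00, 0.25]]
    return bekk_step(C, A, B, [u1, 0.5], H_prev)[1][1]

# With A(1,2) = 0 the size of the series 1 shock is irrelevant to H(2,2).
h22_restricted_small = h22_after_shock(0.0, 0.0)
h22_restricted_large = h22_after_shock(0.0, 2.0)

# With A(1,2) != 0 a larger series 1 shock changes H(2,2).
h22_spill_small = h22_after_shock(0.2, 0.0)
h22_spill_large = h22_after_shock(0.2, 2.0)
print(h22_restricted_small, h22_restricted_large)
print(h22_spill_small, h22_spill_large)
```

In this sketch, setting A(1,2) and B(1,2) to zero shuts off any path from series 1 into the variance of series 2, which is the joint restriction that a spillover test examines.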

To look at the effect this can have, we’ll look at the model used in Hafner and Herwartz (2006). (Note that H&H do not do formal tests for causality-in-variance, but look at that through the volatility impulse responses.) The BEKK model used in the paper has 15 free parameters (4 in the mean model and 11 in the GARCH part). The log likelihood is 28606.8. A Wald test for volatility spillover from series 2 (British pound) to series 1 (Deutsche Mark) is 11.5, which, for a \(\chi^2\) with 2 degrees of freedom, has a p-value of .0032. The test in the opposite direction is 69.4, which has a p-value that is 0 to as many decimals as you might want to show. (These tests can be set up fairly easily using the Regression Tests wizard on the Statistics menu, as each is just "Exclusion Restrictions" on the proper elements of the BEKK parameters, such as jointly on A(1,2) and B(1,2) for the 1 to 2 direction.) From this, one would conclude that there is fairly strong evidence of spillover, particularly from Germany to the UK.

However, the DVECH model (which, again, allows no possibility of spillover) has 13 free parameters (here 9 in the GARCH part) with a log likelihood of 28610.8. If we were picking between the two models using AIC, they would be tied to five decimals. With the more stringent SBC, the smaller DVECH model is strongly preferred (estimates use 3718 observations, so the SBC penalty for the larger BEKK model is 16.4 with a log likelihood difference of only 4). The difference in inference regarding causality between the two models is due to a combination of (b) and (c): the BEKK estimates are different, but apparently not really better. Clearly one has to be careful in picking a multivariate GARCH model based solely upon whether it admits a test for “spillover”.
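The p-values and the SBC penalty quoted above can be checked with a few lines of arithmetic. The sketch below uses only the Python standard library and relies on the fact that the upper tail probability of a \(\chi^2\) with 2 degrees of freedom at x is exp(-x/2). The variable names are illustrative, and the SBC penalty is taken as the number of extra parameters times the log of the number of observations, on the scale of twice the log likelihood.

```python
import math

def chi2_upper_tail_2df(x):
    """P(X > x) for a chi-squared variable with exactly 2 degrees of freedom."""
    return math.exp(-x / 2.0)

# Wald statistics quoted in the text.
p_2_to_1 = chi2_upper_tail_2df(11.5)   # series 2 -> series 1
p_1_to_2 = chi2_upper_tail_2df(69.4)   # series 1 -> series 2

# SBC penalty for the two extra BEKK parameters (15 vs 13) with 3718 observations.
extra_params = 15 - 13
n_obs = 3718
sbc_penalty = extra_params * math.log(n_obs)

# The stated log likelihood difference, doubled to match the penalty's scale.
loglik_gap = 4.0
scaled_gap = 2.0 * loglik_gap

print(round(p_2_to_1, 4))     # 0.0032
print(p_1_to_2 < 1e-14)       # True
print(round(sbc_penalty, 1))  # 16.4
print(scaled_gap < sbc_penalty)  # True
```

Twice the stated log likelihood difference is 8, about half the SBC penalty of 16.4, which is why the SBC comparison comes out so firmly in favor of the smaller model.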