Matheson Stavrev EL 2013

This is based upon the work in Matheson and Stavrev(2013) with a reconstructed and extended (to 2016) data set. This does a relatively standard "NAIRU-Phillips Curve'' model but allows for time-varying coefficients in the measurement equations. The paper also imposes inequality restrictions on the states (the time-varying coefficients) themselves, which is technically quite a bit more complicated than described in this (rather short) paper. Both the use of the time-varying coefficients (which creates a non-linear state-space model) and the handling of the inequality restrictions on the states are covered in detail as part of the State-Space/DSGE e-course.

This starts with a Phillips curve equation, which has (observable) inflation (${\pi _t}$) equal to expected inflation ($\pi _t^e$), plus terms in the unemployment gap (difference between observable unemployment ${u_t}$ and the unobservable NAIRU ($u_t^*$)), and an observable import price push ($\pi _t^m$). The resulting equation is

\begin{equation} {\pi _t} = \pi _t^e - {\kappa _t}({u_t} - u_t^*) + {\gamma _t}\hat \pi _t^m + \varepsilon _t^\pi \label{eq:matheson_phillips} \end{equation}

(Unobservable) expected inflation is derived from two observables using a time-varying average:

\begin{equation} \pi _t^e = {\theta _t}{{\bar \pi }_t} + (1 - {\theta _t})\pi _{t - 1}^{(4)} \label{eq:matheson_expinfl} \end{equation}

${{\bar \pi }_t}$ is a long-term inflation expectation and $\pi _{t - 1}^{(4)}$ is year-over-year inflation (lagged). Note that this equation has no error term and can just be substituted into \eqref{eq:matheson_expinfl}. That can be rearranged to make

\begin{equation} {\pi _t} = {\pi _{t-1}^{(4)}} + {\theta _t} ({{\bar \pi }_{t-1}} - {\pi _{t-1}^{(4)}}) - {\kappa _t}({u_t} - u_t^*) + {\gamma _t}\hat \pi _t^m +\varepsilon _t^\pi \label{eq:matheson_rearrange} \end{equation}

The state equation for the NAIRU gap is assumed to be the first order autoregression:

\begin{equation} {u_t} - u_t^* = \rho ({u_{t - 1}} - u_{t - 1}^*) + \varepsilon _t^{\left( {u - {u^*}} \right)} \end{equation}

while the NAIRU itself evolves as a random walk:

\begin{equation} u_t^* = u_{t - 1}^* + \varepsilon _t^{{u^*}} \end{equation}

${\pi _t}$, ${u_t}$, $\hat \pi _t^m$, ${{\bar \pi }_t}$ and $\pi _{t - 1}^{(4)}$ are observable. One can simplify the writing of the model by adding ${u_t} - u_t^*$ to the state vector, and then including the identity

\begin{equation} {u_t} \equiv u_t^* + ({u_t} - u_t^*) \end{equation}

The states are then $u_t^*$, ${u_t} - u_t^*$ and the time-varying coefficients ${\kappa _t}$, ${\gamma _t}$ and ${\theta _t}$. The observables are inflation (${\pi _t}$) and the unemployment rate (${u_t}$).

The interactions between the time-varying coefficients and the $u_t^*$ means that this is not a standard form for a state-space model, but instead, requires the methods of the Non-linear (or Extended) Kalman filter.

This includes several programs with different levels of complexity:

kfunconstrain.rpf

Estimates the basic "gap" model with fixed rather than time-varying coefficients in the Phillips curve. (This is not included in the paper, though almost certainly would have been done as a first step).

ekfunconstrain.rpf

Estimates the model allowing for time-varying coefficients in the Phillips curve, but without constraints on their values.

ekfconstrain.rpf

Estimates the model allowing for time-varying coefficients in the Phillips curve, including constraints on their values, as is (somewhat vaguely) described in the paper.

ekfexponential.rpf

Estimates the model allowing for time-varying coefficients but using exponential and logistic parameterizations to implement the constraints. (This is an alternative to the method employed in the paper).

All of the programs use specialized graph styles which redefine styles 1, 2, 12, 3 and 13 to have heavier lines for 1, 2 and 3 and thinner ones for 12 and 13:

open styles mscolors.txt

grparm(import=styles)

where mscolors.txt is a text file with the definitions:

LINE_COLOR_1=0,000000,2.0

LINE_COLOR_2=0,0000FF,2.0

LINE_COLOR_12=0,0000FF,0.5

LINE_COLOR_3=0,FF0000,2.0

LINE_COLOR_13=0,FF0000,0.5

The data are read and transformed with

cal(q) 1960

open data msextended.rat

data(format=rats) 1960:01 2016:04 ptrcpi ptrpce cpi unrate mdef

set u = unrate

set pi = 400.0*log(cpi/cpi{1})

set pim = 400.0*log(mdef/mdef{1})-pi

set pibar = ptrcpi

set pistar = .25*(pi+pi{1}+pi{2}+pi{3})

where U in the unemployment rate, PI is the annualized CPI inflation rate, PIM is an import price "push" (import price inflation less the CPI inflation), PIBAR is a survey value for expected inflation and PISTAR is a one-year moving average of inflation.

All the programs use @NBERCycles to get dummies for recessions and use a Hodrick-Prescott filter to get guess values for the unemployment gap (subtracting a smoothed version of unemployment from the observed value):

@nbercycles(down=nber)

filter(type=hp) u / ustar_0

set ugap_0 = u-ustar_0

kfunconstrain.rpf

In this case, there are only two states $u_t^*$, ${u_t} - u_t^*$ (actually only one, but it's more convenient to let the software handle the relationship between those). Most of \eqref{eq:matheson_phillips} can be shifted into the "MU" option in the state-space setup. The first step is to get guess values for the parameters (taking the guess for UGAP_0 as data). It's most convenient to do this as a non-linear least squares estimation, in part because it immediately puts the values into the KAPPA, THETA and GAMMA variables, and also because the observable PISTAR{1} is included on the right side.

nonlin(parmset=fixedparms) kappa theta gamma

* The sample start allows for the loss of five data points due to the

* lagged four-period moving average:

compute sstart =1961:02

compute send =2016:04

frml rollpi pi = pistar{1}+theta*(pibar-pistar{1})-$

kappa*ugap_0+gamma*pim

compute kappa=0.1,gamma=0.1,theta=0.5

nlls(frml=rollpi) pi sstart send

compute sv=%sigmasq

and this estimates the parameters of the unemployment gap evolution:

linreg ugap_0

# ugap_0{1}

compute rho=%beta(1)

compute swugap=%sigmasq

We also need a guess for the variance in the evolution of $u_t^*$. Since that is assumed to be a random walk, we take the variance (or actually the mean square) of the first difference of the guess series:

sstats(mean) / (ustar_0-ustar_0{1})^2>>swustar

This sets up the system matrices for the state-space model. As we said above, most of the Phillips curve goes into the MU component of the model—the only element of the right side that depends upon the states is the ugap term. These are all declared as FRML's of the proper type of matrix since they all depend upon at least one free parameter.

dec frml[vect] muf

dec frml[rect] cf

dec frml[rect] af

dec frml[symm] svf

dec frml[symm] swf

frml muf = ||pistar{1}+theta*(pibar-pistar{1})+gamma*pim,0.0,0.0||

frml cf = ||-kappa,1.0|$

0.0 ,1.0||

frml af = ||rho,0.0|$

0.0,1.0||

frml svf = %diag(||sv,0.0||)

frml swf = %diag(swugap~~swustar)

The second component of SVF is zero since that's an identity (that U is exactly the sum of the two states).

This declares the other parameters (in addition to the Phillips curve coefficients above):

nonlin(parmset=base) rho sv swugap swustar

The estimates for the model as described would best be characterized as disappointing. The unconstrained estimate of SWUSTAR (the variance of the random walk process for USTAR) comes in negative. If that gets pegged to zero (as the closest permitted value), then the NAIRU is flat across the entire sample. Note that this is not uncommon in "gap" models like this. The time series properties for USTAR and UGAP are not substantially different. (Unit root vs an AR with a .9 root). So the only thing in the model that distinguishes the two is the presence of UGAP in the Phillips curve. Unfortunately, the likelihood of that is fairly flat over quite a wide range of possible estimates for UGAP, thus doesn't provide much help. In order to give the model content, it is necessary to do something else to distinguish the two components. In this case, we peg the ratio of the shock variances to 15 (with UGAP having the higher variance). This is one of the options used in the work done in the paper itself (which had the same issue). The number is chosen to make the model results look reasonable.

nonlin(parmset=pegratio) swugap=15.0*swustar

This estimates the model including the pegged variance ratio, producing the Kalman smoothed states and graphs the estimated UGAP and USTAR:

dlm(a=af,c=cf,y=||pi,u||,mu=muf,sv=svf,sw=swf,pmethod=simplex,piters=10,method=bfgs,$

parmset=base+fixedparms+pegratio,presample=ergodic,type=smooth) sstart send xstates

set ugap = xstates(t)(1)

set ustar = xstates(t)(2)

graph(header="Unemployment Gap from Model without TVC")

# ugap

graph(header="NAIRU from Model without TVC")

# ustar

DLM - Estimation by BFGS

Convergence in 10 Iterations. Final criterion was 0.0000074 <= 0.0000100

Quarterly Data From 1961:02 To 2016:04

Usable Observations 223

Rank of Observables 445

Log Likelihood -477.4718

Variable Coeff Std Error T-Stat Signif

************************************************************************************

1. RHO 0.9791837491 0.0091549226 106.95708 0.00000000

2. SV 2.3135614894 0.2212312958 10.45766 0.00000000

3. SWUGAP 0.0889450513 0.0085214828 10.43774 0.00000000

4. SWUSTAR 0.0177890103 0.0016789557 10.59528 0.00000000

5. KAPPA 0.2890102957 0.0768278554 3.76179 0.00016870

6. THETA 0.4085720581 0.0551290045 7.41120 0.00000000

7. GAMMA 0.1514273112 0.0126624664 11.95875 0.00000000

ekfunconstrain.rpf

This estimates the model allowing for time-varying coefficients (thus creating a multiplicative non-linearity) but does not impose bounds on those coefficients. If we call $({u_t} - u_t^*)$ $G_t$ for short, the linearization of that the Phillips curve is

\begin{equation} {\kappa _t}{G_t} \approx {\tilde \kappa _t}{\tilde G_t} + {\tilde \kappa _t}({G_t} - {\tilde G_t}) + {\tilde G_t}({\kappa _t} - {\tilde \kappa _t}) = \left( { - {{\tilde \kappa }_t}{{\tilde G}_t}} \right) + {\tilde \kappa _t}{G_t} + {\tilde G_t}{\kappa _t} \label{eq:matheson_GapLinearization} \end{equation}

The first term in the final expression is a constant (as far as the state-space model is concerned) and thus goes into the MU component. The other two give the C component values for the $G$ and $\kappa$ states respectively. The linearized measurement equation can be written:

\begin{equation} \left[ {\begin{array}{*{20}{c}} {{\pi _t}} \\ {{u_t}} \\ \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {\pi _{t - 1}^{(4)} + {{\tilde \kappa }_t}{{\tilde G}_t}} \\ 0 \\ \end{array}} \right] + \left[ {\begin{array}{*{20}{c}} { - {{\tilde \kappa }_t}} & 0 & { - {{\tilde G}_t}} & {{{\bar \pi }_{t - 1}} - \pi _{t - 1}^{(4)}} & {\pi _t^m} \\ 1 & 1 & 0 & 0 & 0 \\ \end{array}} \right]\left[ {\begin{array}{*{20}{c}} {{G_t}} \\ {u_t^*} \\ {{\kappa _t}} \\ {{\theta _t}} \\ {{\gamma _t}} \\ \end{array}} \right] + \left[ {\begin{array}{*{20}{c}} {\varepsilon _t^\pi } \\ 0 \\ \end{array}} \right] \end{equation}

If we estimate this model with freely estimated drift variances, it is likely to produce nonsensical results. First off, the NAIRU decomposition will have the same problem in the simpler version without time-variation—there just is too little difference in log likelihood across a wide range of different decompositions. To handle that, the authors impose one of two different variance ratios between the variances in the two components of the gap model, which they call "stable" and "flexible" (referring to the behavior of the NAIRU component.

The authors also put limits on the drift variances for the Phillips curve parameters. While this isn't at all unreasonable (in practice, those can come in too high to be sensible), the method used seems a bit arbitrary. They run 40 period rolling regressions and estimate the variance from the differences in the parameter estimates in adjacent windows. However, the regression run with the window from 1 to 40 and the regression from 2 to 41 have 39 observations in common so aren't really going to tell you much about possible period-to-period fluctuations. Note also that 40 is completely arbitrary—a wider window would give a smaller limit, a narrower one would give a larger one. While the idea is not well-motivated, the example will follow the paper's lead.

This sets up the rolling regressions. They're done by non-linear least squares though they could be done (with some rearrangement of variables) using OLS—it's just easier to set this up as a FRML.

nonlin(parmset=rollparms) kappa_r theta_r gamma_r

frml rollpi pi = pistar{1}+theta_r*(pibar-pistar{1})-$

kappa_r*ugap_0+gamma_r*pim

This sets the span for the rolling regressions, the overall sample start and end (SSTART and SEND) and the start for the rolling regressions (RSTART):

compute span =40

compute sstart =1961:02

compute rstart =sstart+span-1

compute send =2016:04

This does the full sample regression

compute kappa_r=0.1,gamma_r=0.1,theta_r=0.5

nlls(parmset=rollparms,frml=rollpi) pi sstart send

compute sv=%sigmasq

and this does the rolling sample regressions and saves the series of estimated coefficients.

clear(zeros) kappa_re theta_re gamma_re

do time=rstart,send

compute kappa_r=0.1,gamma_r=0.1,theta_r=0.5

nlls(parmset=rollparms,frml=rollpi,noprint) pi time-(span-1) time

compute kappa_re(time)=kappa_r

compute theta_re(time)=theta_r

compute gamma_re(time)=gamma_r

end do time

This extends the coefficient series back to the start of the sample by repeating the initial window for the early part of the data set:

set kappa_0 = %if(t<=rstart,kappa_re(rstart),kappa_re)

set theta_0 = %if(t<=rstart,theta_re(rstart),theta_re)

set gamma_0 = %if(t<=rstart,gamma_re(rstart),gamma_re)

This next computes the average squared change of the coefficients (over the part of the sample which had enough data to do the regressions):

sstats(mean) rstart+1 send (kappa_0-kappa_0{1})^2>>sigsqup_kappa $

(theta_0-theta_0{1})^2>>sigsqup_theta $

(gamma_0-gamma_0{1})^2>>sigsqup_gamma

This gets guess values for the AR coefficient in the gap model, and its variance:

linreg ugap_0

# ugap_0{1}

compute rho=%beta(1)

compute swugap=%sigmasq

This now sets up the linearized MU and C formulas. Note that MU has one term due to linearization, and the PISTAR{1} that's part of the original \eqref{eq:matheson_rearrange}. Note also that because $\kappa G$ enters the Phillips curve with a negative sign, the members of MU and C due to the expansion of that have the opposite signs from \eqref{eq:matheson_GapLinearization}.

dec frml[vect] muf

dec frml[rect] cf

dec frml[symm] svf

dec frml[symm] swf

frml muf = ||+kappa_0*ugap_0+pistar{1},0.0||

frml cf = ||-kappa_0 ,1.0|$

0.0 ,1.0|$

-ugap_0 ,0.0|$

(pibar-pistar{1}),0.0|$

pim ,0.0||

and this sets up the variance FRML's and the parameter set for estimating the free parameters.

dec vector swtvc(4)

frml svf = %diag(||sv,0.0||)

frml swf = %diag(swugap~~swtvc)

nonlin(parmset=base) rho sv swugap swtvc

These are the two variance ratio pegs for the components of the NAIRU decomposition. The first is the "stable" one and the second the "flexible". SWUGAP is the variance in the gap autoregression, SWTVC(1) is the drift variance in the NAIRU, so the first allows relatively less variability in the NAIRU.

nonlin(parmset=stable) swugap=15.0*swtvc(1)

nonlin(parmset=flexible) swugap=5.0*swtvc(1)

It's important to note that SWUGAP is the variance in an AR process and SWTVC(1) is the variance of the increment in a random walk, so the two aren't strictly comparable—how SWUGAP affects the variance of the gap process depends upon the value of $\rho$ as well, so a smaller value of $\rho$ (with these data, it comes in at a high value of .95) might require higher multiples to achieve a desired amount of movement.

The $\bf{A}$ matrix is just an identity matrix other than $A_{1,1}$. The user function %%DLMSetup will reset that to the current test value of $\rho$:

dec rect a(5,5)

compute a=%na~\%identity(4)

function %%DLMSetup

compute a(1,1)=rho

end

This sets up the limits on the parameters. All the drift variances are set to be non-negative, and the three time-varying parameter variances are limited as described above. In addition, $\rho$ is limited to go no higher than .95—if it gets much higher, you start to see confusion between the gap and the NAIRU itself.

nonlin(parmset=limits) swtvc(1)>=0.0 rho<=.95 $

swtvc(2)>=0.0 swtvc(2)<=sigsqup_kappa $

swtvc(3)>=0.0 swtvc(3)<=sigsqup_theta $

swtvc(4)>=0.0 swtvc(4)<=sigsqup_gamma

This sets guess values for the random walk variances. The last three are based upon the limits already computed and the first (for the NAIRU) allows for drift of about $ \pm 2$ over 10 quarters.

compute swtvc(1)=.1

compute swtvc(2)=.5*sigsqup_kappa

compute swtvc(3)=.5*sigsqup_theta

compute swtvc(4)=.5*sigsqup_gamma

This does a single Kalman smooth at the guess values to get the expansion points for the linearization that will be used in the later analysis:

dlm(start=%%DLMSetup(),a=a,c=cf,y=||pi,u||,mu=muf,sv=svf,sw=swf,$

presample=ergodic,type=smooth) sstart send xstates vstates

set ugap_0 = xstates(t)(1)

set kappa_0 = xstates(t)(3)

The paper does graphs with quite a few different variations on the estimates. To help organize this, we use a set of HASH aggregators. The final results will be saved in HASH'es layered three deep: the outermost is the choice between stable and flexible variance restrictions (shortened to keys of "stab" and "flex"), the middle one is filtered vs smoothed estimates ("kf" and "sm") and the innermost between the estimate, and lower and upper (one standard deviation) bounds ("est", "lo", "hi"):

dec hash[parmset] pegs

compute pegs("stab")=stable

compute pegs("flex")=flexible

dec hash[hash[hash[series]]] h_ustar h_kappa h_theta h_gamma

dec hash[hash[series]] h_pie

dec hash[series[vect]] h_xstates

dec hash[series[symm]] h_vstates

This does estimation of the parameters (combined with filtered estimates of the states) and smoothed estimates of the states for each of the two pegs on the variance ratios. It changes the limits on those by using PEGS(ROOT), where PEGS is a HASH[PARMSET] just defined which chooses between the STABLE and FLEXIBLE PARMSETS.

dec string root ktype

dofor root = "stab" "flex"

dlm(start=%%DLMSetup(),a=a,c=cf,y=||pi,u||,mu=muf,sv=svf,sw=swf,$

type=filter,presample=ergodic,$

parmset=base+pegs(root)+limitsmethod=bfgs,condition=10) $

sstart send h_xstates("kf") h_vstates("kf")

dlm(start=%%DLMSetup(),a=a,c=cf,y=||pi,u||,mu=muf,sv=svf,sw=swf,$

type=smooth,presample=ergodic) $

sstart send h_xstates("sm") h_vstates("sm")

Note that this uses the CONDITION option. The 10 is somewhat arbitrary, but you probably want to do at least five, as the ability of the model to predict those early data points is extremely weak.

The estimated states and covariance matrices are saved into HASHes of their own so the organization of the output can be done inside a DOFOR loop over the HASH keys ("kf" and "sm"). We omit most of this, since it's just the same set of lines applies to different elements out of the state vector:

dofor ktype = "kf" "sm"

set h_ustar(root)(ktype)("est") = h_xstates(ktype)(t)(2)

set h_ustar(root)(ktype)("hi") = h_xstates(ktype)(t)(2)+$

sqrt(h_vstates(ktype)(t)(2,2))

set h_ustar(root)(ktype)("lo") = h_xstates(ktype)(t)(2)-$

sqrt(h_vstates(ktype)(t)(2,2))

...

set h_pie(root)(ktype) = h_theta(root)(ktype)("est")*pibar+$

(1-h_theta(root)(ktype)("est"))*pistar{1}

end do ktype

The calculation of the inflation expectation is the one that has a different pattern. That takes the weighted average of two observables (PIBAR and PISTAR{1}) using the estimated time-varying thetas.

The output from the two models is shown in Stable and Flexible. Note that in both models, all of the upper bounds (on $\rho$ and the three drift variances) have been hit. If we do the standard "random walk" analysis on the $\theta$ coefficients for instance, over 10 years (40 quarters), the variance of that is $40 \times .0027$ or roughly .1, for a standard deviation of a bit over .3. Although we haven't constrained it directly, it's hard to imagine $\theta$ being outside [0,1] (since it forms the weights on a weighted average), so a standard deviation of .3 over 40 quarters would pretty much let it roam over the full sensible range. Thus, the upper bound may still be a bit too high for a practical model.

The unemployment rate decomposition and the model's inflation predictions are shown in the first graph below. The blue lines are from the stable model and the red lines from the flexible. (In the unemployment rate there are estimates with one standard deviation bounds). The inflation expectations are almost identical, which is why you can see almost no blue, as it gets overwritten by the red. The NAIRU's are not identical as the flexible model tends to (as designed) move more with the peaks and troughs. As the log likelihood is quite a bit higher with the flexible model, it's probably a good guess that without putting the limit on the variance ratio, a fully unconstrained maximum would have the NAIRU much more closely matching the unemployment rate itself (which certainly wouldn't be the intent of the model). These types of models depend on rather strong restrictions to produce "reasonable" results.

The time-varying coefficients themselves are shown in the second graph below. Not surprisingly, the filtered (left column) estimates are a bit wild at the start. As mentioned earlier, even though the drift variances on these had an upper bound (which was binding on the estimates), they still may be too high—if you compare (for instance) the filtered and smoothed estimates of $\theta$ in the 2008-9 recession, the filtered estimates seem to overreact, as there seems to be no permanent effect judging from the smoothed estimates. The fact that even the smoothed estimates for $\kappa$ and $\gamma$ go negative at parts of the sample presumably led the authors to put restrictions on the states, which is quite a bit more complicated technically than they describe.

DLM - Estimation by BFGS with inequalities

Convergence in 22 Iterations. Final criterion was 0.0000006 <= 0.0000100

Quarterly Data From 1963:04 To 2016:04

Usable Observations 213

Rank of Observables 426

Log Likelihood -463.5915

Variable Coeff Std Error T-Stat Signif

************************************************************************************

1. RHO 0.9500000000 0.0000000000 0.00000 0.00000000

2. SV 1.5823673837 0.1752055332 9.03149 0.00000000

3. SWUGAP 0.1136659351 0.6281682847 0.18095 0.85640822

4. SWTVC(1) 0.0075777290 0.0007549753 10.03706 0.00000000

5. SWTVC(2) 0.0090484546 0.0000000000 0.00000 0.00000000

6. SWTVC(3) 0.0026700010 0.0000000000 0.00000 0.00000000

7. SWTVC(4) 0.0002055912 0.0000000000 0.00000 0.00000000

DLM - Estimation by BFGS with inequalities

Convergence in 15 Iterations. Final criterion was 0.0000080 <= 0.0000100

Quarterly Data From 1963:04 To 2016:04

Usable Observations 213

Rank of Observables 426

Log Likelihood -449.9446

Variable Coeff Std Error T-Stat Signif

************************************************************************************

1. RHO 0.9500000003 0.0000000000 0.00000 0.00000000

2. SV 1.4342322533 0.1543318330 9.29317 0.00000000

3. SWUGAP 0.0957249775 0.0090841450 10.53759 0.00000000

4. SWTVC(1) 0.0191449955 0.0017758266 10.78089 0.00000000

5. SWTVC(2) 0.0090484559 0.0000000000 0.00000 0.00000000

6. SWTVC(3) 0.0026700049 0.0000000000 0.00000 0.00000000

7. SWTVC(4) 0.0002056355 0.0000000000 0.00000 0.00000000