RLS Instruction

RLS( options ) depvar start end residuals

# list of explanatory variables in Regression Format

RLS uses the Kalman filter to perform least squares regressions over a range of entries, generating various statistics on the behavior of these regressions. These are often used in formal or informal tests of the stability of a regression relationship.

If you need some information beyond the recursively generated coefficients and residuals, such as forecasts from each stage, you need to use the KALMAN instruction instead. KALMAN will do the same type of calculation, but does so one entry at a time, allowing you to do whatever extra computations are required at each stage.

Parameters

depvar	dependent variable
start, end	range to estimate; defaults to the maximum range permitted by all variables involved in the regression
residuals	(Optional) Series for the recursive residuals

Options

[PRINT]/NOPRINT

VCV/[NOVCV]

TITLE="title for output" [“Recursive Least Squares”]

These control the printing of the regression output, the printing of the estimated covariance/correlation matrix of the coefficients, and the title used in labeling the output.

SMPL=Standard SMPL option [unused]

Omits from the estimation observations where the SMPL series or formula is zero or NA.

SPREAD=Standard SPREAD option [unused]

WEIGHT=Standard WEIGHT option [unused]

Use SPREAD for corrections for heteroscedasticity and WEIGHT for weighting observations.

EQUATION=Equation to estimate [unused]

LASTREG/[NOLASTREG]

Use the EQUATION option to estimate a previously defined equation. LASTREG re-estimates the most recent regression using recursive least squares. If you use either, don't include a supplementary card.

COHISTORY=VECTOR[SERIES] of coefficient estimates [not used]

SEHISTORY=VECTOR[SERIES] of coefficient standard errors [not used]

Respectively, these save the sequential estimates of the coefficients and the sequential estimates of the standard errors of the coefficients. These are stored into VECTORS of SERIES, with each element of the vector being a series for a different coefficient. For instance, if you do:

rls(cohistory=coefs) y
# constant x1

the series COEFS(1) will contain the sequential coefficient estimates for the constant term, while COEFS(2) will contain the coefficient estimates for X1.

SIGHISTORY=SERIES for the standard error of the regression [not used]

DFHISTORY=SERIES for the degrees of freedom history [not used]

SIGHISTORY saves the sequential estimates of the standard error of the regression into a series. DFHISTORY saves the degrees of freedom at each time period (the excess of observations over the number of parameters).

CSUMS=SERIES for cumulated sum of recursive residuals [not used]

CSQUARED=SERIES for cumulated sum of squared recursive residuals [not used]

When scaled, these can be used for the CUSUM and CUSUMSQ tests for stability and homoscedasticity.
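The scaling is not spelled out here, so the following Python sketch shows the conventional Brown-Durbin-Evans version rather than anything RLS itself is documented to do. It assumes the inputs cover only the entries after the initial full-rank regression; the function name and the 5% constant 0.948 are supplied for illustration and are not RATS names.

```python
import math

def scale_cusum_series(csums, csquared, crit=0.948):
    """Scale raw cumulated sums for CUSUM and CUSUMSQ plots.

    csums, csquared: cumulated sums and cumulated sums of squares of the
    recursive residuals, restricted to the entries after the initial
    full-rank regression (assumption: len(csums) equals T - K).
    crit: 0.948 is the usual constant for 5% CUSUM bounds.
    """
    n = len(csums)
    total = csquared[-1]              # full-sample sum of squared recursive residuals
    sigma = math.sqrt(total / n)      # full-sample standard error of the regression
    cusum = [c / sigma for c in csums]
    cusumsq = [c / total for c in csquared]   # rises from near 0 to exactly 1
    # Upper CUSUM bound at each step j = 1..n; the lower bound is its negative.
    bound = [crit * math.sqrt(n) * (1.0 + 2.0 * j / n) for j in range(1, n + 1)]
    return cusum, cusumsq, bound
```

A scaled CUSUM path that crosses the bound, or a scaled CUSUMSQ path that strays far from the diagonal, is the usual informal signal of instability.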

ORDER=series or formula giving order entries are to be added [not used]

INDEX=SERIES[INTEGER] showing the entry mapping actually used [not used]

You can use ORDER to add entries to the regression based upon a series other than the time sequence. The “history” series and residuals keep the original entry mapping. If you need to “remap” these into the sequence in which they were added to the regression, you can use the INDEX option to get that entry mapping. That is, if you do ORDER=POP,INDEX=IPOP and save the residuals into RES, the series RES will be the recursive residuals in their original order, so, for instance, a scatter plot of POP against RES will be sensible; while the series generated by RES(IPOP) will be the series of residuals in population order.
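The remapping can be pictured with a small Python sketch. It assumes entries are added in ascending order of the ORDER series and uses 0-based positions, whereas RATS entry numbers start at 1; the data and the helper name are hypothetical.

```python
def entry_mapping(order_values):
    """Positions of the entries in the order they would be added.

    Assumption: ascending by order_values, ties kept in original order.
    """
    return sorted(range(len(order_values)), key=lambda i: order_values[i])

pop = [30.0, 10.0, 20.0]    # hypothetical ordering variable (plays the role of POP)
res = [0.3, -0.1, 0.2]      # residuals stored in original entry order (RES)

ipop = entry_mapping(pop)               # plays the role of INDEX=IPOP
res_by_pop = [res[i] for i in ipop]     # analogue of RES(IPOP)
```

Here `res` still lines up entry by entry with `pop`, while `res_by_pop` lists the same residuals in the sequence in which the observations entered the regression.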

CONDITION=# of initial periods for first regression [number of regressors]

This sets the number of initial observations used in the first regression. It defaults to the number of regressors; the value you supply must be greater than or equal to that. Note that if you condition on more than the number of regressors, the residuals for the conditioning period will not be mutually independent.

Technical Information

If there are K regressors, RLS will first find the smallest set of entries in the sample, added in the order indicated, which will give a full rank regression (unless you use the CONDITION option, in which case RLS uses the number of entries you specify). This will give a coefficient estimate $\beta_t$ and $\left( \mathbf{X}'\mathbf{X} \right)^{-1}$ matrix $\Sigma_t$. The residual, SIGHIST and SEHIST entries for these early entries will be zeros. Call the starting entry $T_0$ and the point where we get to full rank $T_1$. Given a previous set of entries, the result of adding a new data point is

(1) ${\hat e_t} = \frac{{\left( {{y_t} - {{\bf{X}}_t}{\beta _{t - 1}}} \right)}}{{\sqrt {1 + {{\bf{X}}_t}{\Sigma _{t - 1}}{{{\bf{X'}}}_t}} }}$

(2) ${\beta _t} = {\beta _{t - 1}} + {\Sigma _{t - 1}}{{\bf{X'}}_t}\frac{{\left( {{y_t} - {{\bf{X}}_t}{\beta _{t - 1}}} \right)}}{{1 + {{\bf{X}}_t}{\Sigma _{t - 1}}{{{\bf{X'}}}_t}}}$

(3) ${\Sigma _t} = {\Sigma _{t - 1}} - \frac{{{\Sigma _{t - 1}}{{{\bf{X'}}}_t}{{\bf{X}}_t}{\Sigma _{t - 1}}}}{{\left( {1 + {{\bf{X}}_t}{\Sigma _{t - 1}}{{{\bf{X'}}}_t}} \right)}}$

where ${\hat e_t}$ is the recursive residual at t. The estimated variance of the regression through t is

(4) $\sigma _t^2 = \frac{{\left( {\sum\limits_{s = {T_1} + 1}^t {{{\hat e}^2}_s} } \right)}}{{\left( {t - {T_1}} \right)}}$

and the standard errors of the coefficient estimates are square roots of the diagonal elements of $\sigma _t^2{\Sigma _t}$.
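As an illustration of updates (1) through (4), here is a minimal pure-Python sketch. It is not the RATS implementation: it assumes the first K rows already give a full rank regression, so it does not search for the smallest full-rank set and has no CONDITION analogue, and all function names are hypothetical.

```python
import math

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("initial sample is not full rank")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def recursive_least_squares(X, y):
    """Return final beta, final Sigma, recursive residuals and variance history."""
    k = len(X[0])
    # Initial regression on the first k rows: Sigma = (X'X)^-1, beta = Sigma X'y.
    xtx = [[sum(X[s][i] * X[s][j] for s in range(k)) for j in range(k)]
           for i in range(k)]
    xty = [sum(X[s][i] * y[s] for s in range(k)) for i in range(k)]
    sigma = invert(xtx)
    beta = [sum(sigma[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resids = [0.0] * k        # early entries are reported as zeros
    variances = [0.0] * k
    ssq = 0.0
    for t in range(k, len(y)):
        x = X[t]
        sx = [sum(sigma[i][j] * x[j] for j in range(k)) for i in range(k)]
        denom = 1.0 + sum(x[i] * sx[i] for i in range(k))
        err = y[t] - sum(x[i] * beta[i] for i in range(k))
        e = err / math.sqrt(denom)                                 # equation (1)
        beta = [beta[i] + sx[i] * err / denom for i in range(k)]   # equation (2)
        # Sigma is symmetric, so Sigma X' X Sigma is the outer product of sx.
        sigma = [[sigma[i][j] - sx[i] * sx[j] / denom for j in range(k)]
                 for i in range(k)]                                # equation (3)
        ssq += e * e
        resids.append(e)
        variances.append(ssq / (t + 1 - k))                        # equation (4)
    return beta, sigma, resids, variances

# Hypothetical data: a constant and one regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
y = [1.0, 3.1, 4.9, 7.2, 8.8]
beta, sigma, resids, variances = recursive_least_squares(X, y)
std_errors = [math.sqrt(variances[-1] * sigma[i][i]) for i in range(len(beta))]
```

The final `beta` matches ordinary least squares on the full sample, and the squared recursive residuals sum to the full-sample sum of squared residuals, which is why the cumulated sum of squares doubles as a history of residual sums of squares.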

Examples

The following excerpts are taken from the example on pages 121-126 of Johnston and DiNardo (1997). The complete program is provided on the file JOHN4P121.RPF.

The COHIST and SEHIST options provide a VECTOR[SERIES] with the “histories” of the coefficients and the standard errors of the coefficient estimates, while SIGHIST gives the standard errors of the regression. The CSQUARED option returns the cumulated sum of squared recursive residuals, which is also the sequence of sums of squared residuals from the successive regressions.

rls(sehist=sehist,cohist=cohist,sighist=sighist, $
   dfhistory=dfhist,csquared=cusumsq) y 1959:1 1973:3 rresids
# constant x2 x3

set lower = -2*sighist
set upper = 2*sighist
graph(header="Recursive Resids and S.E. Bands for Gasoline") 3
# rresids
# lower
# upper / 2

set seqf = (t-%nreg-%regstart())*(cusumsq-cusumsq{1})/cusumsq{1}
set seqfcval %regstart()+%nreg+1 * = $
   seqf/%invftest(.05,1,dfhist(t))
graph(header=$
   "Figure 4.4 Sequential F-Tests as Ratio to .05 Critical Value",$
   vgrid=||1.0||)
# seqfcval
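Read against the SET instructions for SEQF and SEQFCVAL, the plotted statistic appears to be a one-step sequential F statistic. As a sketch, assuming %REGSTART() is the first entry of the estimation range and %NREG is the number of regressors $K$, and writing $RSS_t$ for the CSQUARED value at entry t (notation introduced here, not in the original):

$F_t = \frac{RSS_t - RSS_{t-1}}{RSS_{t-1} / \left( n_{t-1} - K \right)}, \qquad n_{t-1} = t - T_0$

where $n_{t-1}$ is the number of observations through entry t-1 and $T_0$ is the starting entry. SEQFCVAL divides this by the 5% critical value of an F with 1 and DFHIST(t) degrees of freedom, so values above the grid line at 1.0 would be significant at the 5% level.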

Variables Defined

Most of the standard Regression Variables (these will be based on end-of-sample values).