* PANELSCC.src
*
* Estimates a Spatial Correlation Consistent (SCC) covariance matrix from panel data
* as outlined in John C. Driscoll and Aart C. Kraay (1998), "Consistent Covariance
* Matrix Estimation with Spatially Dependent Panel Data," Review of Economics and
* Statistics 80(4), November, pp. 549-560.
*
* Authors:
* John C. Driscoll
* Department of Economics
* Brown University, Box B
* Providence RI 02912
* John_Driscoll@brown.edu
*
* Aart C. Kraay
* The World Bank
* Washington DC 20433
* akraay@worldbank.org
*
* This procedure is a RATS translation of the TSP and Gauss code available at
* Driscoll's web site: http://econ.pstc.brown.edu/~jd/
* Translation by Steve Green of Baylor University (steve_green@baylor.edu).
* This is a preliminary version of the program. If you encounter any difficulties,
* please contact Steve Green at the above e-mail address.
*
* The procedure computes coefficient estimates and standard errors
* consistent for spatial correlation, autocorrelation and heteroskedasticity
* for linear instrumental-variable panel data models of the form
*
* y(i,t)=x(i,t)'b+e(i,t)
*
* where x(i,t) is an Lx1 vector of explanatory variables; b is an Lx1 coefficient
* vector; e(i,t) is a residual; and i=1,...,N indexes cross-sectional units and
* t=1,...,T indexes time periods. The model is identified by a Kx1 vector
* of orthogonality conditions E[z(i,t)e(i,t)]=0, where z(i,t) is a Kx1 vector of
* instrumental variables with K>=L. If the model contains individual effects,
* these must first be removed by the appropriate transformation (e.g., differencing
* or taking deviations from individual means), which may be easily accomplished
* using the PANEL instruction in RATS. The procedure does not support missing
* values, but it is possible to estimate the equation over subsamples of periods
* and/or cross-section units.
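*
* As a minimal sketch of that transformation (not part of the original documentation),
* individual means can be removed with PANEL before calling the procedure. The series
* names Y, X1, YDM and X1DM are hypothetical, and the panel CALENDAR and ALLOCATE are
* assumed to have been set up already:
*
*    panel(entry=1.0,indiv=-1.0) y  / ydm
*    panel(entry=1.0,indiv=-1.0) x1 / x1dm
*    @panelscc ydm 1974:01 1993:01 102
*    # x1dm
*    # x1dm
*
* Here the demeaned series appear on both supplementary cards, so the model is
* estimated by OLS without a constant.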
*
* The procedure allows data to be input in exactly the way RATS handles panel data --
* that is, as series grouped by cross section unit. (The Gauss and TSP code on which this
* procedure is based requires data to be input as arrays, with data grouped by period.)
*
*
* The procedure takes the form:
*
* @panelscc(options) depvar startperiod endperiod iN mask
* # list of right-hand-side (X) variables
* # list of instrumental (Z) variables
*
* where
*
*   depvar       the dependent variable series [REQUIRED]
*   startperiod  the first time period (year, quarter, month) of the regression [REQUIRED]
*   endperiod    the last time period (year, quarter, month) of the regression [REQUIRED]
*   iN           the number of cross-section units in the regression [REQUIRED]
*   mask         a masking series you can use to estimate the equation over a
*                subset of the cross-section units [OPTIONAL]
*
* Importantly, the parameters "startperiod" and "endperiod" are conventional RATS dates,
* not panel data codes. For example, suppose you want to estimate a regression from a panel
* of annual data from 102 countries over the years 1974-1993, with Y as the dependent variable.
* The command line (assuming you don't need a masking series) should be as follows:
*
* @panelscc Y 1974:01 1993:01 102
*
* That is, use conventional dates (e.g., 1974:01) rather than panel dates (1//1974:01).
*
* Supplementary cards should include "constant" if necessary. To run OLS, supplementary
* card #2 should be identical to card #1. (There is an example of this below.)
*
* The options are as follows:
*
* LAGS = the lag truncation parameter, that is, the number of lags used [default = 0]
*
* TOTALCROSS = the total number of cross-section units in your data [default = iN]. You
*    must set it when estimating the regression over a subset of cross-section units.
*    For example, if you have 102 countries but estimate the equation over only 80
*    of them, put iN=80 on the command line and use the option TOTALCROSS=102.
*    The program needs TOTALCROSS to generate the indicator series correctly. If you
*    are using all of your cross-section units in your regression, you can omit it.
*
* VERYLASTPERIOD = the very last period of your program file range [default = endperiod].
*    Use this option if you are estimating your equation over a subsample of periods
*    and the last period of the subsample (endperiod) is earlier than the last
*    period of your program file (which is normally the period in your ALLOCATE
*    statement). For example, suppose your program file range is 1973-1993, but
*    you are estimating an equation from 1974 to 1990. Then STARTPERIOD=1974:01,
*    ENDPERIOD=1990:01, and VERYLASTPERIOD=1993:01. It does not matter that
*    STARTPERIOD is later than the first period in the program file range.
*    As with TOTALCROSS, the program needs VERYLASTPERIOD to generate the
*    indicator series correctly. If the last period of your regression is the
*    last period of the file range, you can omit it.
*
* The series MASK is a series with values = 1 for observations you want to include in the
* regression and zero elsewhere. The primary use of MASK is to restrict your regression to
* specific cross section units. (You can restrict the regression to specific periods using
* STARTPERIOD and ENDPERIOD.) For example, suppose you have annual data on 102 countries
* over the period 1973:01 to 1993:01. Y is your dependent variable, X1 and X2 are your
* independent variables, and Z1, Z2, and Z3 are instruments for X2. To use only the
* first 80 and last 10 countries and the years 1976-1990, first generate the masking
* series (which I will call MYMASK) as follows:
*
* set mymask 1 verylastperiod*totalcross = %indiv(t) .le. 80 .or. %indiv(t) .ge. 93
*
* where verylastperiod = 1993:01 and totalcross = 102. Next, invoke the procedure as follows:
*
* @panelscc(lags=2,verylastperiod=1993:01,totalcross=102) Y 1976:01 1990:01 90 mymask
* # constant X1 X2
* # constant X1 Z1 Z2 Z3
*
* where I am using 2 lags. Note that the number of cross section units is 90 (= 80+10).
*
* Things are much simpler, of course, if your data run the entire range of your program file,
* you use all of the cross section units, and the ENDPERIOD for your regression is the same
* as VERYLASTPERIOD. For example, suppose your program file range is 1973-1993, and you want
* to estimate the above regression over the range 1974-1993 with all 102 countries. In this case,
* you invoke the procedure as:
*
* @panelscc(lags=2) Y 1974:01 1993:01 102
* # constant X1 X2
* # constant X1 Z1 Z2 Z3
*
* That is, you need not use the options TOTALCROSS or VERYLASTPERIOD, and you do not
* need to specify a MASK series. Note that STARTPERIOD can be any period in your program
* file range, so you could put 1976:01 or 1984:01 in the above command and it should work fine.
*
* Finally, to estimate the above regression via OLS, type the following:
*
* @panelscc(lags=2) Y 1974:01 1993:01 102
* # constant X1 X2
* # constant X1 X2
*
* The procedure outputs the standard OLS or TSLS output along with a second table
* made up of the coefficients, SCC standard errors, and the associated z-statistics
* and p-values. The first table will say "estimation by instrumental variables" even
* if you are using OLS, because the program invokes OLS by specifying an instrument list
* that is the same as the independent variable list.
*
* The procedure stores the SCC covariance matrix in the array %VHATSCC.
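*
* For reference, the calculation carried out by the code below can be summarized as
* follows. The notation is chosen here to mirror the variable names in the code; it is
* a reading of the code, not a restatement of the paper. With m = LAGS, N = iN and
* T = endperiod-startperiod+1:
*
*    h(t)     = (1/N) * sum over i of z(i,t)*e(i,t)                    [hT, Kx1]
*    Omega(s) = (1/T) * sum over t=s+1,...,T of h(t)*h(t-s)'           [omega0, omega]
*    w(s,m)   = 1 - s/(m+1)                                            (Bartlett weights)
*    S        = Omega(0) + sum over s=1,...,m of w(s,m)*[Omega(s)+Omega(s)']   [Shat]
*    M        = (1/(N*T)) * X'Z                                        [mmat, LxK]
*    V        = (1/T) * inv( M * inv(S) * M' )                         [vhat, %VHATSCC]
*
* The reported SCC standard errors are the square roots of the diagonal elements of V.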
*
*
Procedure panelscc depvar startperiod endperiod iN mask
type series depvar mask
type integer startperiod endperiod iN
option integer lags 0
option integer totalcross iN
option integer verylastperiod endperiod
local index xlist zlist
local series indicator resids0 mask2
local integer allobs iT s j
local rect[real] zT ehat hT omega0 Shat zTL ehatL hTL mmat vhat x z omega
local real rN rT
enter(varying) xlist ; * xlist = list of right-hand-side variables
enter(varying) zlist ; * zlist = list of instrumental variables
* iT = number of periods in the estimation sample; allobs = last entry of the stacked panel
compute iT = endperiod-startperiod+1
compute allobs = verylastperiod*totalcross
compute rN = float(iN)
compute rT = float(iT)
If %defined(mask) .eq. 0
{
set mask2 1 allobs = %indiv(t) .le. iN
}
If %defined(mask) .eq. 1
{
set mask2 1 allobs = mask
}
* Flag the observations in the estimation sample, then estimate the coefficients by IV
set indicator 1 allobs = mask2*(%period(t) .ge. startperiod .and. %period(t) .le. endperiod)
instruments zlist
linreg(smpl=indicator,instruments) depvar 1 allobs resids0
# xlist
make(smpl=indicator) x
# xlist
make(smpl=indicator) z
# zlist
* Lag-0 term: accumulate h(t)*h(t)' over all periods, beginning with the first period
set indicator 1 allobs = mask2*(%period(t) .eq. startperiod)
make(smpl=indicator) zT
# zlist
make(smpl=indicator) ehat
# resids0
compute hT = (1./rN)*(tr(zT)*ehat)
compute omega0 = ht*tr(ht)
do j = startperiod+1, startperiod+it-1
set indicator 1 allobs = mask2*(%period(t) .eq. j)
make(smpl=indicator) zT
# zlist
make(smpl=indicator) ehat
# resids0
compute hT = (1./rN)*(tr(zt)*ehat)
compute omega0 = omega0 + (ht*tr(ht))
end do j
compute Shat = (1./rT)*omega0
* Autocovariance terms for s = 1,...,lags, weighted by the Bartlett weights 1 - s/(lags+1)
If lags .gt. 0
{
do s = 1, lags
set indicator 1 allobs = mask2*(%period(t) .eq. (startperiod+s))
make(smpl=indicator) zT
# zlist
make(smpl=indicator) ehat
# resids0
compute hT = (1./rN)*(tr(zT)*ehat)
set indicator 1 allobs = mask2*(%period(t) .eq. startperiod)
make(smpl=indicator) zTL
# zlist
make(smpl=indicator) ehatL
# resids0
compute htL = (1./rN)*(tr(ztL)*ehatL)
compute omega = ht*tr(htL)
do j = s+1, iT-1
set indicator 1 allobs = mask2*(%period(t) .eq. (startperiod+j))
make(smpl=indicator) zT
# zlist
make(smpl=indicator) ehat
# resids0
compute ht = (1./rN)*(tr(zt)*ehat)
set indicator 1 allobs = mask2*(%period(t) .eq. (startperiod+j-s))
make(smpl=indicator) zTL
# zlist
make(smpl=indicator) ehatL
# resids0
compute htL = (1./rN)*(tr(ztL)*ehatL)
compute omega = omega + (ht*tr(htL))
end do j
compute omega = (1./rT)*(omega+tr(omega))
compute shat = shat + (1.- (float(s)/(1+float(lags))))*omega
end do s
}
* SCC covariance matrix of the coefficient estimates: (1/T)*inv(M*inv(S)*M')
compute mmat = (1./(rN*rT))*(tr(x)*z)
compute vhat = inv(mmat*inv(shat)*tr(mmat))
compute vhat = (1./rT)*vhat
compute %vhatscc = vhat
display ' '
display ' '
display 'SPATIAL CORRELATION CONSISTENT (SCC) STANDARD ERRORS '
display ' with Lags = ' lags
display ' '
display ' '
display ' Variable Coeff SCC Std Error z-Stat Signif'
display '*******************************************************************************'
do j = 1, %rows(%beta)
display @1 j $
@29 #.########## %beta(j) $
@44 #.########## sqrt(vhat(j,j)) $
@60 #.##### %beta(j)/sqrt(vhat(j,j)) $
@70 #.######## %ztest(%beta(j)/sqrt(vhat(j,j)))
end do j
display ' '
display ' '
display 'NOTE: SCC Covariance Matrix is stored as "%VHATSCC" '
display ' '
display ' '
end