LSDVC—Estimation of dynamic panel models

by **TomDoan** » Sat Feb 27, 2010 6:22 pm

This is a procedure for computing bias-corrected least squares dummy variable (LSDV) estimators for dynamic panel models. The main reference on the technique is Kiviet (1995), "On bias, inconsistency, and efficiency of various estimators in dynamic panel data models," Journal of Econometrics, vol. 68, no. 1, 53-78.

The procedure allows exactly one lag of the dependent variable. It adds that lag itself, so do not include it in the regressor list. The syntax is

@LSDVC(options) y start end
# exogenous regressors only

The most important option is METHOD. The key choices are METHOD=K1, METHOD=K2 and METHOD=K3, which are the three levels of correction given by Kiviet. Subsequent papers showed that K2 and K3 gain little over K1, but there's no harm in using them now that they're available.
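
As a rough sketch of how the options combine, here are two calls that use only the syntax documented in the procedure header. The series names `y`, `x1` and `x2` are hypothetical, and the data are assumed to be already loaded as a balanced panel. The range parameters are left off, so the procedure works out the sample itself.

```
* Hypothetical balanced panel: dependent variable Y, regressors X1 and X2.
* Do not list Y{1}; the procedure adds the lagged dependent variable.
*
* First-level Kiviet correction, one calculation of the adjustment
* (these are the defaults):
@lsdvc(method=k1) y
# x1 x2
*
* All three correction terms, recalculated three times, with the
* robust covariance matrix:
@lsdvc(method=k3,iters=3,robust) y
# x1 x2
```

METHOD=FIXED and METHOD=AH report the uncorrected fixed effects and Anderson-Hsiao estimates from the same call, which makes them convenient benchmarks for the corrected results.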

Note that you cannot apply this to one equation out of a panel VAR, since the lags of the other endogenous variables are only predetermined. (I've seen it done, but it's not correct.) The exogenous variables really need to be exogenous, not just predetermined, or they'll be subject to exactly the same type of bias as the lagged dependent variable.

Code:

*
* @LSDVC( options ) y start end
* # exogenous regressor list
*
* Computes a least squares dummy variable (i.e. fixed effects) estimator
* for a dynamic panel data model with correction for bias. The model is
*
*  y(i,t)=gamma*y(i,t-1)+alpha(i)+x(i,t)*beta+e(i,t)
*
* References:
*   Kiviet(1995), "On bias, inconsistency, and efficiency of various
*    estimators in dynamic panel data models," Journal of Econometrics,
*    vol. 68, no. 1, 53-78.
*   Bun & Kiviet(2001), "The Accuracy of Inference in Small Samples of
*    Dynamic Panel Data Models," Tinbergen Institute Discussion Papers
*    01-006/4, Tinbergen Institute.
*
* Options:
*  METHOD=[K1]/K2/K3/SIMPLE/FIXED/AH
*    K1, K2 and K3 are the three levels of correction proposed by Kiviet.
*    SIMPLE is a related alternative which treats the "X'X" matrix from
*    LSDV as fixed.
*    FIXED is fixed effects, which is known to be inconsistent for small T.
*    AH is Anderson-Hsiao, which is consistent but generally quite inefficient.
*  ITERS=# of recalculations of corrections for K1, K2, K3 or SIMPLE [1]
*  [PRINT]/NOPRINT
*  ROBUST/[NOROBUST]
*
*
procedure LSDVC y start end
type series 	y
type integer 	start end
*
option integer iters  1
option choice  method 1 k1 k2 k3 simple fixed ah
option switch  robust 0
option switch  print  1
*
local vect[int] 	reglist
local rect[int] 	etable
local integer 		startl endl
local integer 		i pass
local integer 		n tdim
local series 		presids psmpl temp eah ytilde dy
local vect[series] w dx
local vect 			bw icount
local vect 			delta
local rect 			c at pi
local symm 			xxlsdv
local rect        waw wacaw waccaw
local vect        awtilde
local vect        betalsdv
local rect        xlsdvdm epsmat
local real        gamma ici trpi trpi2 trpi3 trpi4 trpisq fiddle
local rect        aca cc
local vector      ek
local integer     i1 i2
local rect        wi q
local vect        q1 corr1 corr2 corr3 kcorr betalsdvc
local equation    workeqn
*
enter(varying) reglist
*
* Figure out the regression range if the user didn't provide it.
*
inquire(regressorlist) startl>>start endl>>end
# y{0 1} reglist
*
* Do a silent PREG to get the LSDV estimates. This is also used to get
* the regression sample by seeing what entries have defined residuals.
*
preg(method=fixed,print=(print.and.method==5),define=workeqn) y startl endl presids
# y{1} reglist
if method==5
   return
*
compute xxlsdv  =%xx
compute betalsdv=%beta
set psmpl startl endl = %valid(presids)
*
* Count the number of usable data points per individual. As currently
* constructed, this only works for balanced panels, so return with an
* error if the number of data points varies.
*
compute n=%indiv(endl)
dim icount(n)
compute icount=%const(0.0)
do i=startl,endl
   if psmpl(i)
      compute icount(%indiv(i))=icount(%indiv(i))+1
end do i
*
if %maxvalue(icount)<>%minvalue(icount) {
   disp "@LSDVC requires a balanced sample"
   return
}
compute tdim=fix(%maxvalue(icount))
dim dx(%nreg-1) w(%nreg)
*
* Do the panel transformation for the explanatory variables and the
* dependent variable and compute A-H as the initial consistent estimator.
*
diff y / dy
compute etable=%eqntable(workeqn)
do i=1,%nreg
   clear temp
   set temp startl endl = etable(1,i){etable(2,i)}
   panel(smpl=psmpl,entry=1.0,indiv=-1.0) temp startl endl w(i)
   if i>1
      diff temp / dx(i-1)
end do i
panel(smpl=psmpl,entry=1.0,indiv=-1.0) y startl endl ytilde
*
instruments y{2} dx
linreg(inst,noprint) dy
# dy{1} dx
if method==6 {
   linreg(create,equation=workeqn,title="Anderson-Hsiao",print=print)
   return
}
*
dim c(tdim,tdim) at(tdim,tdim)
ewise at(i,j)=(i==j)-1.0/tdim
*
* This allows for multiple calculations of the adjustment. It uses the
* same base (the LSDV estimator), but gets a slightly different
* adjustment term as the estimated value of the coefficient on the
* lagged dependent variable changes.
*
if method==1.or.method==2.or.method==3 {
   make(smpl=psmpl) xlsdvdm
   # w
   do pass=1,iters
      *
      * Use current estimates to get residuals for the LSDV model
      *
      set(smpl=psmpl) eah = ytilde-%dot(%xt(w,t),%beta)
      make(smpl=psmpl) epsmat
      # eah
      sstats(smpl=psmpl) / eah^2>>%rss
      compute %seesq=%rss/(%nobs-%nreg-n)
      compute gamma=%beta(1)
      ewise c(i,j)=%if(i>j,gamma^(i-j-1),0.0)
      *
      * Create the unit vector
      *
      compute ek=%unitv(%nreg,1)
      *
      compute pi      =at*c
      compute trpi    =%trace(pi)
      compute trpi2   =%trace(tr(pi)*pi)
      compute trpi3   =%trace(tr(pi)*pi*pi)
      compute trpi4   =%trace(tr(pi)*pi*tr(pi)*pi)
      compute waw     =%zeros(%nreg,%nreg)
      compute wacaw   =%zeros(%nreg,%nreg)
      compute waccaw  =%zeros(%nreg,%nreg)
      do i=1,n
         compute i1=(i-1)*tdim+1,i2=i*tdim
         compute awtilde=pi*%xsubmat(epsmat,i1,i2,1,1)
         compute wi     =%xsubmat(xlsdvdm,i1,i2,1,%nreg)-awtilde*tr(ek)
         compute waw    =waw   +tr(wi)*wi
         compute wacaw  =wacaw +tr(wi)*c*wi
         compute waccaw =waccaw+tr(wi)*c*tr(c)*wi
      end do i
      compute q=waw
      compute q(1,1)=q(1,1)+%seesq*n*trpi2
      compute q=inv(q)
      compute q1=q*ek
      compute corr1=%seesq*n*trpi*q1
      compute corr2=-%seesq*(q*wacaw+%trace(q*wacaw)*%identity(%nreg)+2*%seesq*q(1,1)*trpi3*%identity(%nreg))*q1
      compute corr3=%seesq^2*trpi*(2*n*q(1,1)*q*waccaw*q1+(n*%qform(waccaw,q1)+q(1,1)*%trace(q*waccaw)+q(1,1)^2*trpi4)*q1)
      if method==1
         compute kcorr=corr1
      else
      if method==2
         compute kcorr=corr1+corr2
      else
         compute kcorr=corr1+corr2+corr3
      compute betalsdvc=betalsdv-kcorr
      compute %beta=betalsdvc
   end do pass
}
else
if method==4 {
   *
   * Compute the cross product matrix of the transformed data. The lagged
   * dependent variable will be in the first position, and the dependent
   * variable itself will be in the last.
   *
   cmom
   # w ytilde
   dim bw(%ncmom)
   compute %beta=betalsdv
   do pass=1,iters
      *
      * Compute the sum of squared residuals for the current settings of
      * %beta.
      *
      ewise bw(i)=%if(i<=%nreg,%beta(i),-1)
      compute %seesq=%qform(%cmom,bw)/(%nobs-%nreg-n)
      *
      * Make sure gamma (the coefficient on the lagged dependent variable)
      * is in range.
      *
      compute gamma=%max(-1.0,%min(%beta(1),1.0))
      *
      * Do the E[X'Ae] for a single individual record. This assumes that
      * the only correlation between e and X is with the lagged dependent
      * variable.
      *
      ewise c(i,j)=%if(i>j,gamma^(i-j-1),0.0)
      compute fiddle=%seesq*n*%trace(at*c)
      *
      * And figure out (X'AX)^-1 * E[X'Ae]
      *
      compute delta=%xcol(xxlsdv,1)*fiddle
      *
      * Adjust the LSDV estimator
      *
      compute %beta=betalsdv-delta
   end do pass
}
if robust==0 {
   *
   * Compute the covariance matrix. The formula is from Bun and Kiviet
   * (2001).
   *
   compute pi=at*c
   compute trpisq=%trace(tr(pi)*pi)
   compute %xx=%seesq*xxlsdv+%seesq^2*trpisq*%outerxx(%xcol(xxlsdv,1))
}
else {
   *
   * Compute a robust covariance matrix. Take the LSDV residuals, and
   * compute the standard robust estimator of (X'Ae)(X'Ae)'
   *
   mcov(lags=tdim) / presids
   # w
   *
   * Then subtract off the bias squared term
   *
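   * FIDDLE is otherwise set only inside the METHOD=SIMPLE loop, so
   * compute the bias term here to cover K1, K2 and K3 as well.
   *
   compute fiddle=%seesq*n*%trace(at*c)
   compute %cmom(1,1)=%cmom(1,1)-fiddle^2/%nobs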
   compute %xx=xxlsdv*%cmom*xxlsdv
}
linreg(create,equation=workeqn,form=chisquared,print=print,title="LSDV Corrected")
end
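
For anyone following the METHOD=SIMPLE branch, the adjustment it applies can be written out as below. This is a sketch read off the code, not a formula quoted from the references. The notation is introduced here for illustration: $W_i$ is the regressor matrix for individual $i$ with the lagged dependent variable in the first column, $e_1$ is the first unit vector, $\iota$ is a $T$-vector of ones, and $\hat\delta$ stacks $\gamma$ and $\beta$.

$$
\hat\delta_{c} = \hat\delta_{LSDV} - \hat\sigma^{2}\, N\, \operatorname{tr}(AC) \Big(\sum_{i=1}^{N} W_i' A W_i\Big)^{-1} e_1,
\qquad
A = I_T - \tfrac{1}{T}\,\iota\iota',
\qquad
C_{ij} = \begin{cases} \gamma^{\,i-j-1} & i>j \\ 0 & \text{otherwise} \end{cases}
$$

Because $C$ is strictly lower triangular, only the off-diagonal elements of $A$ (each equal to $-1/T$) enter the trace, which gives

$$
\operatorname{tr}(AC) = -\frac{1}{T}\sum_{k=1}^{T-1} (T-k)\,\gamma^{\,k-1}
= -\frac{1}{1-\gamma}\left(1 - \frac{1-\gamma^{T}}{T(1-\gamma)}\right), \qquad \gamma \neq 1 .
$$

As a quick check, $T=2$ and $\gamma=0.5$ give $\operatorname{tr}(AC) = -0.5$ from either expression. The trace shrinks toward zero as $T$ grows, which is why the correction matters most for short panels.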