*
* Example 11.3 from pp 224-225
* Heteroscedasticity Tests
*
open data tablef9-1[1].txt
data(format=prn,org=columns) 1 100 mdr acc age income avgexp ownrent selfempl
*
set incomesq = income**2
set posexp = avgexp>0
*
linreg(smpl=posexp) avgexp / resids
# constant age income incomesq ownrent
*
* This is the ML estimate of the variance (not corrected for degrees of freedom).
* It's used later for the Breusch-Pagan-Godfrey test.
*
compute olssq=%sigmasq
*
* White's test, done using an auxiliary regression. The SET instructions below
* generate all the unique cross products of the explanatory variables. ownrent**2
* is omitted since it's the same as ownrent. However, even if you made the mistake
* of including an ownrent**2 variable, the CDF instruction below would give you the
* correct test and degrees of freedom, as %ndf is adjusted when LINREG detects
* collinearity.
*
set x1 = age*age
set x2 = age*income
set x3 = age*incomesq
set x4 = age*ownrent
set x5 = income*incomesq
set x6 = income*ownrent
set x7 = incomesq*incomesq
set x8 = incomesq*ownrent
*
set usq = resids**2
linreg(noprint) usq
# constant age income incomesq ownrent x1 x2 x3 x4 x5 x6 x7 x8
*
cdf(title="White Test for Heteroscedasticity") chisqr %nobs*%rsquared %nobs-%ndf-1
*
* White's test done using the RegWhiteTest procedure. Use @RegWhiteTest
* immediately after the regression. (Since we just ran a different regression
* above to handle the "long-form" White test, we need to redo the regression.)
*
linreg(smpl=posexp) avgexp
# constant age income incomesq ownrent
@RegWhiteTest
*
* Goldfeld-Quandt test
* Instead of sorting the whole data set, we generate a series of ranks
* for the income series, and partition the data set based upon the values
* of those. Because we only want the data with positive values of the avgexp
* variable, we use the SMPL option on ORDER. This will give NA's to the
* ranks for the excluded data points.
*
* There's one other minor problem. The value 3.00 is shared by four different
* individuals, and these happen to be ranks 36-39. The way that ORDER with RANKS
* works, all of these get a rank value of 37.5, so when we split the sample at
* rank 36, we get 35 observations in the first part and 37 in the second. Note that
* this will also be a problem if the data are sorted instead. Different
* sorting algorithms will give a different ordering for those four tied
* observations, so different programs could easily give different results for
* the test.
*
order(ranks=incranks,smpl=posexp) income
linreg(smpl=incranks<=36) avgexp
# constant age income incomesq ownrent
compute rss1=%rss,ndf1=%ndf
linreg(smpl=incranks>36) avgexp
# constant age income incomesq ownrent
compute rss2=%rss,ndf2=%ndf
*
cdf(title="Goldfeld-Quandt Test") ftest (rss2/ndf2)/(rss1/ndf1) ndf2 ndf1
*
* Breusch-Pagan Test
*
linreg usq
# constant income incomesq
*
cdf(title="Breusch-Pagan-Godfrey Test") chisqr .5*%rss/olssq**2*%rsquared/(1-%rsquared) %nreg-1
*
* This is the alternative, which doesn't assume Normal residuals
*
cdf(title="BPG Test - TR**2 Variant") chisqr %trsquared %nreg-1
*
* The Jarque-Bera normality test is included in the standard STATISTICS output. The
* value of the test statistic is slightly different from the one shown in the text
* because RATS uses estimates of the 3rd and 4th moments which have some small-sample
* corrections. See the description of the STATISTICS instruction for the details if
* you're interested.
*
stats resids
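*
* Optional check, not part of the original example: a minimal sketch that counts
* the observations in each Goldfeld-Quandt subsample, to confirm how the tied
* income values were split. It assumes POSEXP and INCRANKS exist as defined above
* and that SSTATS, by default, sums each expression over the entries selected by
* SMPL. N1 and N2 are illustrative names. Per the note on ties, the counts should
* come out as 35 and 37.
*
sstats(smpl=posexp) / (incranks<=36)>>n1 (incranks>36)>>n2
disp "Goldfeld-Quandt subsample sizes:" n1 n2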