*
* Example 6.1 from pp 172-174
*
* CPS88 comes from the Current Population Survey. It is a random sample, with
* replacement, of 1,000 observations from a sample of males with non-missing
* information on all the 11 variables in the data set.
*
* This example uses only the first 100 observations from the data set. You can
* get these by just restricting the DATA instruction to the range 1 100.
*
open data cps88.asc
data(format=prn,org=columns) 1 100 age exp2 grade ind1 married lnwage occ1 partt potexp union weight high
*
* The data series are
*   AGE
*   LNWAGE   Log of hourly wage in dollars
*   OCC1     Dummy variable for occupational category
*   IND1     Dummy variable for industrial category
*   UNION    1 if union member, 0 otherwise
*   GRADE    highest educational grade completed
*   MARRIED  1 if married, 0 otherwise
*   PARTT    1 if part-time worker, 0 otherwise
*   POTEXP   Years of potential experience
*   EXP2     POTEXP squared
*   WEIGHT   Sampling weight
*   HIGH     "Highly" unionized industry (IND1 equals 1,2,3,4,5,10,11, or 14)
*
table
*
* The conventional earnings equation on page 172
*
linreg lnwage
# constant grade potexp exp2 union
*
* We need this for the Breusch-Pagan-Godfrey test later. This is the maximum
* likelihood estimate of the variance.
*
compute olssq=%sigmasq
*
* Computation of new variables for the White regression:
*
set ressq  = %resids**2
set grade2 = grade**2
set exp4   = exp2**2
set exp3   = potexp*exp2
set gx     = grade*potexp
set gx2    = grade*exp2
set gu     = grade*union
set xu     = potexp*union
set xu2    = exp2*union
*
* White test done by auxiliary regression, Table 6.3, p.173.
*
linreg ressq
# constant grade potexp exp2 union grade2 exp4 exp3 gx gx2 gu xu xu2
cdf(title="White Heteroscedasticity Test") chisqr %trsquared %nreg-1
*
* Breusch-Pagan/Godfrey test
* Same dependent variable as the White test, just a subset of the regressors
*
linreg ressq
# constant grade potexp union
*
* Rather than divide the dependent variable by the variance before running the
* regression, the scaling is done after the fact, when computing the test
* statistic: the explained sum of squares is divided by variance**2.
*
cdf(title="Breusch-Pagan-Godfrey Test") chisqr .5*%rss/(olssq**2)*%rsquared/(1-%rsquared) %nreg-1
*
* This is the alternative, which doesn't assume Normal residuals
*
cdf(title="BPG Test - TR**2 Variant") chisqr %trsquared %nreg-1
*
* These are the same tests done using the RegWhiteTest procedure. This is applied
* immediately after the regression that you want to test. @RegWhiteTest with
* TYPE=FULL (the default) does the White test with all the squares and interaction
* terms. With TYPE=BP, it does the Breusch-Pagan/Godfrey test. Note, however, that
* the BPG test done using this procedure includes *all* the regressors. If you
* want to use a subset (or any set of candidate variables other than the full set
* of explanatory variables), you need to do the auxiliary regressions as shown above.
*
linreg lnwage
# constant grade potexp exp2 union
@RegWhiteTest
@RegWhiteTest(type=bp)
*
* Goldfeld-Quandt test
* Rather than reorder the data set, we compute the ranks of the potexp series,
* and use the observations for which ranks<=35 for the first group and ranks>=66
* as the second. (These are done using the SMPL option). Because there's a tie
* among the values of potexp which crosses the 66th value (64-70 are all 22), the
* exact test statistic would depend upon which ones were included. The code
* below will include all of them, since the average rank of the tied
* values is 67. For a continuous (rather than discrete) variable, it is highly
* unlikely that you would have this problem.
*
order(ranks=expranks) potexp
*
linreg(noprint,smpl=expranks<=35) lnwage
# constant grade potexp exp2 union
compute rss1 = %rss, ndf1 = %ndf
linreg(noprint,smpl=expranks>=66) lnwage
# constant grade potexp exp2 union
compute rss2 = %rss, ndf2 = %ndf
compute fstat = (rss2/ndf2)/(rss1/ndf1)
cdf(title="Goldfeld Quandt Test") ftest fstat ndf2 ndf1
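*
* A minimal sketch, not part of the original example: the Goldfeld-Quandt
* result reported by CDF can also be kept in a variable for later use. This
* assumes that the built-in function %FTEST(x,num,den) returns the upper-tail
* probability of an F with NUM and DEN degrees of freedom, and that FSTAT,
* NDF1 and NDF2 still hold the values computed above. GQSIGNIF is a
* hypothetical name introduced here for illustration only.
*
compute gqsignif = %ftest(fstat,ndf2,ndf1)
display "Goldfeld-Quandt F =" fstat "with" ndf2 "and" ndf1 "d.f., upper-tail probability =" gqsignif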