*
* BOOTSIMPLE.RPF
* Simple Example of Bootstrapping
*
all 32
open data rental.wks
data(format=wks,org=obs) / rent no rm sex dist
*
* Dataset is rental data from Pindyck and Rubinfeld, Econometric Models
* and Economic Forecasts, 4th edition, p 54. We're asked to test the
* hypothesis that rent per person (rent/no) has a mean of 135 against
* the alternative that it isn't.
*
set rpp = rent/no
*
stats rpp
*
* For comparison, we'll do a standard t-test
*
cdf(title="Test of Mean=135 Assuming Normal") ttest sqrt(%nobs)*(%mean-135)/sqrt(%variance) %nobs-1
*
compute testmean = 135.0
compute sampmean = %mean
*
compute ndraws = 1000
set means 1 ndraws = 0.0
*
* To compute significance levels for a two-tailed test, you need to
* decide where the cutoff will be on the other side of the hypothesized
* value. Here, upperlim and lowerlim are symmetrically placed around
* testmean. sigcount will count the number of times the resampled values
* fall outside these bounds.
*
* A one-tailed test is simpler, since you just have to count the number
* of times the resampled statistic is more extreme than the observed one.
*
compute sigcount = 0.0
compute upperlim = %if(sampmean>testmean,sampmean,testmean*2-sampmean)
compute lowerlim = %if(sampmean>testmean,testmean*2-sampmean,sampmean)
*
* For each draw, resample the data set using boot, compute the mean of
* the drawn sample and adjust it to give us the sampling distribution
* around the hypothesized mean.
*
do draw=1,ndraws
boot entries 1 32
sstats(mean) 1 32 rpp(entries(t))>>%mean
compute means(draw)=%mean-sampmean+testmean
compute sigcount=sigcount+(means(draw)<lowerlim.or.means(draw)>upperlim)
end do draws
*
* This shows the estimated significance level of the test along with the
* 90% confidence band, computed by translating the sampling distribution
* to zero (by subtracting testmean), then flipping to deal with possible
* asymmetries.
*
stats(fractiles) means
display "**** Test of Mean=" testmean " ****"
display "Sample mean" sampmean
display "Significance level" #.#### (sigcount+1)/(ndraws+1)
display "90% confidence interval" (sampmean+(testmean-%fract95)) (sampmean+(testmean-%fract05))
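*
* A minimal sketch (not part of the original example) of the one-tailed
* variant described earlier: count the resampled means that are more
* extreme than sampmean, on the same side of testmean as the observed
* deviation. onecount is a name introduced here for illustration. This
* assumes the means series filled in by the loop above is still in memory.
*
compute onecount = 0.0
do draw=1,ndraws
compute onecount=onecount+%if(sampmean>testmean,means(draw)>sampmean,means(draw)<sampmean)
end do draw
display "One-tailed significance level" #.#### (onecount+1)/(ndraws+1)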