Replicating Hoover et al. (2009)
-
TVolscho-286
- Posts: 25
- Joined: Thu May 03, 2012 6:50 pm
Replicating Hoover et al. (2009)
http://www.umsl.edu/~dibooglus/personal ... evised.pdf (draft of paper)
http://www.sciencedirect.com/science/ar ... 2509000193 (published version)
"This paper investigates the impact of various socio-economic variables on various cohorts of the income distribution. We use asymmetric cointegration tests to show that unemployment and immigration shocks have real impacts on income inequality. In addition, using threshold test results we are able to show that positive and negative shocks to the economy do not have symmetric effects nor do the impacts of these shocks impact income quintiles uniformly."
I would like to replicate their graphs of asymmetric shocks. Here are the commands to replicate (closely enough) their published estimates using the Gini index. If anyone can duplicate their estimates of the asymmetric ECM and the asymmetric positive and negative shocks in Fig. 2, I would greatly appreciate it.
- Attachments
-
- Replicate Hoover et al.RPF
- (14.41 KiB) Downloaded 1159 times
-
- Hoover_New.xlsx
- (12.54 KiB) Downloaded 880 times
Re: Replicating Hoover et al. (2009)
That would be done using their (5), (6) and (7) (which I hope have the typos fixed in the published version --- the gini minus terms should be differences and the error in (7) should be e3, not e2).
This is what is necessary in order to compute IRF's in this type of model. (It's like the Balke example on steroids, since it has four different non-linearities rather than one).
This uses the paper's notation for the variables which differs from yours.
This will get the IRF's from the initial conditions at the end of the data set. They apparently want the initial conditions for the "equilibrium". I assume that means lagged dx's are zero, lagged u and im are zero and lagged gini=b0+tau. (Any combination of u, im and gini which zeros out z will give the same IRF's if the lagged dx's are zero).
Code: Select all
set dgini = gini-gini{1}
set dginiplus = %max(dgini,0)
set dginiminus = %min(dgini,0)
set du = u-u{1}
set duplus = %max(du,0)
set duminus = %min(du,0)
set dim = im-im{1}
set dimplus = %max(dim,0)
set dimminus = %min(dim,0)
*
set zplus = %max(gini-b0-b1*u-b2*im-tau,0.0)
set zminus = %min(gini-b0-b1*u-b2*im-tau,0.0)
*
* Order variables IM-->U-->GINI
*
system(model=tvecm)
variables dim du dgini
det dginiplus{1 to p} dginiminus{1 to p} duplus{1 to p} duminus{1 to p} $
dimplus{1 to p} dimminus{1 to p} zplus{1 to p} zminus{1 to p}
end(system)
*
estimate
*
frml(identity) ginieq gini = gini{1}+dgini
frml(identity) dginipluseq dginiplus = %max(dgini,0)
frml(identity) dginiminuseq dginiminus = %min(dgini,0)
*
frml(identity) ueq u = u{1}+du
frml(identity) dupluseq duplus = %max(du,0)
frml(identity) duminuseq duminus = %min(du,0)
*
frml(identity) imeq im = im{1}+dim
frml(identity) dimpluseq dimplus = %max(dim,0)
frml(identity) dimminuseq dimminus = %min(dim,0)
*
frml(identity) zpluseq zplus = %max(gini-b0-b1*u-b2*im-tau,0.0)
frml(identity) zminuseq zminus = %min(gini-b0-b1*u-b2*im-tau,0.0)
*
group identities ginieq ueq imeq dginipluseq dginiminuseq $
dupluseq duminuseq dimpluseq dimminuseq zpluseq zminuseq
*
forecast(model=tvecm+identities,results=baseresults,steps=12)
*
* Do Choleski factor
*
compute fsigma=%decomp(%sigma)
*
* Do responses with + and - shocks to IM
*
forecast(model=tvecm+identities,shocks=%xcol(fsigma,1),results=withplus,steps=12)
forecast(model=tvecm+identities,shocks=-1.0*%xcol(fsigma,1),results=withminus,steps=12)
*
* Take gap between the forecasts to get the IRF's
*
set irfimplus = withplus(4)-baseresults(4)
set irfimminus = withminus(4)-baseresults(4)
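The logic of the program above (split each difference into positive and negative parts, forecast with and without a first-period shock, then take the gap) can be illustrated outside RATS. The following is a minimal Python sketch, assuming a hypothetical one-variable model with made-up coefficients `a_plus` and `a_minus`. It is not the Hoover et al. specification; it only shows why + and - shocks of the same size need not give mirror-image responses.

```python
def simulate(y0, dy0, steps, shock=0.0, a_plus=0.6, a_minus=0.2):
    """Forecast the level of y when dy reacts asymmetrically to lagged dy.

    dy[t] = a_plus*max(dy[t-1], 0) + a_minus*min(dy[t-1], 0)  (+ shock at t=0)
    y[t]  = y[t-1] + dy[t]                                    (level identity)
    The coefficients are hypothetical and chosen only for illustration.
    """
    path = []
    y, dy_lag = y0, dy0
    for t in range(steps):
        dy = a_plus * max(dy_lag, 0.0) + a_minus * min(dy_lag, 0.0)
        if t == 0:
            dy += shock  # the shock enters in the first forecast period only
        y += dy
        path.append(y)
        dy_lag = dy
    return path


def irf(shock, steps=12):
    """Response = shocked forecast minus baseline forecast, step by step."""
    base = simulate(y0=0.0, dy0=0.0, steps=steps)
    shocked = simulate(y0=0.0, dy0=0.0, steps=steps, shock=shock)
    return [s - b for s, b in zip(shocked, base)]


irf_plus = irf(+1.0)
irf_minus = irf(-1.0)
print(irf_plus[:3])
print(irf_minus[:3])
```

With these made-up coefficients the response to the positive shock builds toward 2.5 while the response to the negative shock settles near -1.25, so the two paths differ in size as well as in sign.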
-
TVolscho-286
Re: Replicating Hoover et al. (2009)
Hi Tom,
When I get to this point, I get blank results in the series window. I can export them to Excel and subtract and plot, though I prefer the RATS graphics. Also, about the (4) in parentheses: why is the 4th step used for the plot?
set irfimplus = withplus(4)-baseresults(4)
set irfimminus = withminus(4)-baseresults(4)
Re: Replicating Hoover et al. (2009)
Why don't you post your full program? Did you rename either your variables or mine (you were using U differently from them)?
In the combined model tvecm+identities, the first three equations are in TVECM, so (4) refers to the first set of forecasts for the identities.
-
TVolscho-286
Re: Replicating Hoover et al. (2009)
Hi Tom,
Sure, the code is below. I called the residual from the static linear regression u1. The Enders/Siklos routine yields a threshold of -0.0213 (I added a 5 in the last place to adjust for rounding error). For the inputs to zplus and zminus I used the results from the regression.
Code: Select all
OPEN DATA "\hoover replicate.xls"
CALENDAR(A) 1947:1
DATA(FORMAT=XLS,ORG=COLUMNS) 1947:01 2003:01 Year gini unemp immig
set u = unemp
set im = immig
linreg gini / u1
# constant u im
set tseries = u1{1}
@enderssiklos(title="TAR Empirical Tau", threshold=tseries, pi=.10, lags=1) u1
set dgini = gini-gini{1}
set dginiplus = %max(dgini,0)
set dginiminus = %min(dgini,0)
set du = u-u{1}
set duplus = %max(du,0)
set duminus = %min(du,0)
set dim = im-im{1}
set dimplus = %max(dim,0)
set dimminus = %min(dim,0)
*
set zplus = %max(gini-0.352532523-(-0.000806540*u)-(-0.000806540*im)-(-.02135),0.0)
set zminus = %min(gini-0.352532523-(-0.000806540*u)-(-0.000806540*im)-(-.02135),0.0)
*
* Order variables IM-->U-->GINI
*
system(model=tvecm)
variables dim du dgini
det dginiplus{1} dginiminus{1} duplus{1} duminus{1} $
dimplus{1} dimminus{1} zplus{1} zminus{1}
end(system)
*
estimate
*
frml(identity) ginieq gini = gini{1}+dgini
frml(identity) dginipluseq dginiplus = %max(dgini,0)
frml(identity) dginiminuseq dginiminus = %min(dgini,0)
*
frml(identity) ueq u = u{1}+du
frml(identity) dupluseq duplus = %max(du,0)
frml(identity) duminuseq duminus = %min(du,0)
*
frml(identity) imeq im = im{1}+dim
frml(identity) dimpluseq dimplus = %max(dim,0)
frml(identity) dimminuseq dimminus = %min(dim,0)
*
frml(identity) zpluseq zplus = %max(gini-0.352532523-(-0.000806540*u)-(-0.000806540*im)-(-.02135),0.0)
frml(identity) zminuseq zminus = %min(gini-0.352532523-(-0.000806540*u)-(-0.000806540*im)-(-.02135),0.0)
*
group identities ginieq ueq imeq dginipluseq dginiminuseq $
dupluseq duminuseq dimpluseq dimminuseq zpluseq zminuseq
*
forecast(model=tvecm+identities,results=baseresults,steps=12)
*
* Do Choleski factor
*
compute fsigma=%decomp(%sigma)
*
* Do responses with + and - shocks to IM
*
forecast(model=tvecm+identities,shocks=%xcol(fsigma,1),results=withplus,steps=12)
forecast(model=tvecm+identities,shocks=-1.0*%xcol(fsigma,1),results=withminus,steps=12)
*
* Take gap between the forecasts to get the IRF's
*
set irfimplus = withplus(4)-baseresults(4)
set irfimminus = withminus(4)-baseresults(4)
- Attachments
-
- hoover replicate.xls
- (29.5 KiB) Downloaded 882 times
Re: Replicating Hoover et al. (2009)
That will be part you, part me.
The fix for my error is to add the following:
compute fstart=%regend(),fend=fstart+11
forecast(model=tvecm+identities,results=baseresults,steps=12)
...
and then put the fstart and fend parameters on
set irfimplus fstart fend = withplus(4)-baseresults(4)
set irfimminus fstart fend = withminus(4)-baseresults(4)
The FORECAST goes outside the original sample, so the range on the SET has to be explicit.
You have a typo in your definitions of Z---you used the second coefficient twice. A better way of handling that is
linreg gini / u1
# constant u im
compute b=%beta
and then
set zplus = %max(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
with similar changes to the other three similar instructions. (%%breakvalue is defined by @ENDERSSIKLOS).
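The point about taking the coefficients from the regression rather than retyping them can be shown with a short sketch. The Python below is an illustration only: the data are simulated stand-ins (not the Hoover et al. series), and `tau` is a hypothetical threshold playing the role of %%breakvalue.

```python
import numpy as np

# Simulated stand-ins for gini, u and im (assumption: not the actual data).
rng = np.random.default_rng(0)
u = rng.normal(5.0, 1.0, size=40)
im = rng.normal(3.0, 1.0, size=40)
gini = 0.35 + 0.002 * u - 0.001 * im + rng.normal(0.0, 0.01, size=40)

# Static regression of gini on constant, u, im, in that regressor order.
X = np.column_stack([np.ones_like(u), u, im])
b, *_ = np.linalg.lstsq(X, gini, rcond=None)

tau = -0.02  # hypothetical threshold; in RATS this would be %%breakvalue

# b[0], b[1], b[2] line up with constant, u, im, so no number is retyped by
# hand and the same coefficient cannot be applied to two regressors by mistake.
z = gini - b[0] - b[1] * u - b[2] * im - tau
zplus = np.maximum(z, 0.0)
zminus = np.minimum(z, 0.0)
```

The coefficient positions only line up if the regressor list and the Z formula use the same order, which is the same requirement `b(1)`, `b(2)`, `b(3)` place on the LINREG supplementary card.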
-
TVolscho-286
Re: Replicating Hoover et al. (2009)
Hi Tom,
Thanks, getting closer. With the edits, the code generates the positive and negative IRFs for a shock to immigration and the response of the Gini index. But they do not match the paper.
Also, the VECM estimates are quite a bit different from what's published in their article; see Table 3: http://www.umsl.edu/~dibooglus/personal ... evised.pdf
Should ZPLUS and ZMINUS match the variables XSPLIT(1) and XSPLIT(2), respectively, produced by Enders-Siklos?
I can get a match to XSPLIT(1) and XSPLIT(2) with this somewhat clunky code (following Enders/Siklos):
set mt1 = %if(u1{1}>=%%breakvalue,1,0)
set mt2 = 1-mt1
set p1 = mt1*u1{1}
set p2 = mt2*u1{1}
If I substitute p1 and p2 in place of zplus and zminus in the VECM, the estimates are pretty close to what is reported in Table 3.
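One reason the two sets of regressors can give different estimates is that they are not the same series. The following Python sketch uses hypothetical residuals and a hypothetical threshold `tau`; the function names are illustrative and not part of any RATS procedure. The indicator split keeps the lagged residual itself in each regime, while the max/min split measures the residual from the threshold first.

```python
def indicator_split(resid_lag, tau):
    """Indicator style: the lagged residual itself, assigned to one regime."""
    above = [r if r >= tau else 0.0 for r in resid_lag]
    below = [r if r < tau else 0.0 for r in resid_lag]
    return above, below


def maxmin_split(resid_lag, tau):
    """max/min style: the residual measured from the threshold, then signed."""
    plus = [max(r - tau, 0.0) for r in resid_lag]
    minus = [min(r - tau, 0.0) for r in resid_lag]
    return plus, minus


resid_lag = [0.03, -0.01, -0.05, 0.00]  # hypothetical lagged residuals
tau = -0.02                             # hypothetical threshold

p1, p2 = indicator_split(resid_lag, tau)
zplus, zminus = maxmin_split(resid_lag, tau)

for r, a, b, zp, zm in zip(resid_lag, p1, p2, zplus, zminus):
    print(r, a, b, zp, zm)
```

Within each regime the two versions differ by the constant `tau`, so they coincide only when the threshold is zero. With a nonzero threshold, swapping one pair for the other changes the regressors and therefore the estimated coefficients.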
Here is the full code as it stands:
Code: Select all
OPEN DATA "\hoover replicate.xls"
CALENDAR(A) 1947:1
DATA(FORMAT=XLS,ORG=COLUMNS) 1947:01 2003:01 Year gini unemp immig
set im = immig
set u = unemp
linreg gini / u1
# constant u im
* Regressor order must match b(2)*u and b(3)*im used in ZPLUS/ZMINUS below
compute b = %beta
set tseries = u1{1}
@enderssiklos(title="TAR Empirical", threshold=tseries, pi=.10, lags=1) u1
set dgini = gini-gini{1}
set dginiplus = %max(dgini,0)
set dginiminus = %min(dgini,0)
set du = u-u{1}
set duplus = %max(du,0)
set duminus = %min(du,0)
set dim = im-im{1}
set dimplus = %max(dim,0)
set dimminus = %min(dim,0)
*
*
set zplus = %max(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
set zminus = %min(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
*
* Order variables IM-->U-->GINI
*
system(model=tvecm)
variables dim du dgini
det dginiplus{1} dginiminus{1} duplus{1} duminus{1} $
dimplus{1} dimminus{1} zplus zminus
end(system)
*
estimate
*
frml(identity) ginieq gini = gini{1}+dgini
frml(identity) dginipluseq dginiplus = %max(dgini,0)
frml(identity) dginiminuseq dginiminus = %min(dgini,0)
*
frml(identity) ueq u = u{1}+du
frml(identity) dupluseq duplus = %max(du,0)
frml(identity) duminuseq duminus = %min(du,0)
*
frml(identity) imeq im = im{1}+dim
frml(identity) dimpluseq dimplus = %max(dim,0)
frml(identity) dimminuseq dimminus = %min(dim,0)
frml(identity) zpluseq zplus = %max(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
frml(identity) zminuseq zminus = %min(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
*
group identities ginieq ueq imeq dginipluseq dginiminuseq $
dupluseq duminuseq dimpluseq dimminuseq zpluseq zminuseq
*
compute fstart=%regend(),fend=fstart+11
forecast(model=tvecm+identities,results=baseresults,steps=12)
*
* Do Choleski factor
*
compute fsigma=%decomp(%sigma)
*
* Do responses with + and - shocks to IM
*
forecast(model=tvecm+identities,shocks = %xcol(fsigma,1),results=withplus,steps=12)
forecast(model=tvecm+identities,shocks = -1.0*%xcol(fsigma,1),results=withminus,steps=12)
*
* Take gap between the forecasts to get the IRF's
*
set irfimplus fstart fend = withplus(4)-baseresults(4)
set irfimminus fstart fend = withminus(4)-baseresults(4)
Re: Replicating Hoover et al. (2009)
Use the Code button to enclose big chunks of code (I've already done that in the other posts). It makes it easier to read and extract.
The ZPLUS and ZMINUS here have to have the {1}'s as shown below (otherwise, you're regressing the variable on itself in effect).
system(model=tvecm)
variables dim du dgini
det dginiplus{1} dginiminus{1} duplus{1} duminus{1} $
dimplus{1} dimminus{1} zplus{1} zminus{1}
end(system)
If you want the responses of GINI, you need (6)'s here:
set irfimplus fstart fend = withplus(6)-baseresults(6)
set irfimminus fstart fend = withminus(6)-baseresults(6)
I think I wrote it originally with gini first (which would mean you need (4)), but then noticed that they used a specific Cholesky order.
The estimation program seems to be doing what their equations say. However, their estimates of the alpha's don't match the scales you would expect. GINI is quite a bit smaller in scale than the other two variables, and Z is normalized with a unit coefficient on GINI, so the alpha coefficients in the other two equations should be much, much larger in magnitude than the one in the GINI equation.
Did you get a program or anything else from the authors?
Re: Replicating Hoover et al. (2009)
Given both the coefficients and the graphs, which report the responses in "standard deviations", it's possible that the TVECM is run on variables divided by their standard deviations (I assume they mean the basic statistical S.D.'s).
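If that conjecture is right, the rescaling itself is simple. The Python sketch below uses hypothetical values and assumes the ordinary sample standard deviation (n-1 divisor). Whether the authors actually did this is not confirmed anywhere in the thread.

```python
import statistics


def scale_by_sd(series):
    """Divide a series by its sample standard deviation (n-1 divisor)."""
    sd = statistics.stdev(series)
    return [x / sd for x in series]


# Hypothetical values with very different scales, as with gini vs. u or im.
gini = [0.352, 0.361, 0.349, 0.370, 0.366]
u = [4.1, 5.6, 6.8, 5.2, 4.7]

gini_std = scale_by_sd(gini)
u_std = scale_by_sd(u)

# After scaling, both series have unit sample standard deviation.
print(statistics.stdev(gini_std), statistics.stdev(u_std))
```

Putting all three variables on a unit-S.D. scale before estimation would make the error-correction coefficients comparable across equations, which is what the published estimates appear to be.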
-
TVolscho-286
Re: Replicating Hoover et al. (2009)
Hi Tom,
This has all been extremely helpful.
My apologies. I will enclose the code next time.
I didn't get any code from the authors, just the dataset.
Re: Replicating Hoover et al. (2009)
You can see why I really want code and not just data. There are very few papers which include all the steps in their description of the empirical work.
-
TVolscho-286
Re: Replicating Hoover et al. (2009)
I see what you mean. I've asked for code several times since I first saw the study but to no avail.
There was a debate about 10 years ago pushing (in sociology and political science) for authors to post the code and data to a website for any article published in the main association journal. Economics was held up as the gold standard for this practice.
FWIW: The dataverse site has a mix of data and code:
http://thedata.harvard.edu/dvn/