Replicating Hoover et al. (2009)

TVolscho-286 · Mon Apr 14, 2014 2:54 am

http://www.umsl.edu/~dibooglus/personal ... evised.pdf (draft of paper)

http://www.sciencedirect.com/science/ar ... 2509000193 (published version)

"This paper investigates the impact of various socio-economic variables on various cohorts of the income distribution. We use asymmetric cointegration tests to show that unemployment and immigration shocks have real impacts on income inequality. In addition, using threshold test results we are able to show that positive and negative shocks to the economy do not have symmetric effects nor do the impacts of these shocks impact income quintiles uniformly."

I would like to replicate their graphs of asymmetric shocks. Here are the commands to replicate (closely enough) their published estimates using the Gini index. If anyone can duplicate their estimates of the asymmetric ECM and the asymmetric positive and negative shocks in Fig. 2, I would greatly appreciate it.

TomDoan · Mon Apr 14, 2014 11:06 pm

That would be done using their (5), (6) and (7) (which I hope have the typos fixed in the published version --- the gini minus terms should be differences and the error in (7) should be e3, not e2).

This is what is necessary in order to compute IRF's in this type of model. (It's like the Balke example on steroids, since it has four different non-linearities rather than one).

This uses the paper's notation for the variables, which differs from yours.

Code:

set dgini      = gini-gini{1}
set dginiplus  = %max(dgini,0)
set dginiminus = %min(dgini,0)
set du         = u-u{1}
set duplus     = %max(du,0)
set duminus    = %min(du,0)
set dim        = im-im{1}
set dimplus    = %max(dim,0)
set dimminus   = %min(dim,0)
*
set zplus      = %max(gini-b0-b1*u-b2*im-tau,0.0)
set zminus     = %min(gini-b0-b1*u-b2*im-tau,0.0)
*
* Order variables IM-->U-->GINI
*
system(model=tvecm)
variables dim du dgini  
det dginiplus{1 to p} dginiminus{1 to p} duplus{1 to p} duminus{1 to p} $
  dimplus{1 to p} dimminus{1 to p} zplus{1 to p} zminus{1 to p}
end(system)
*
estimate
*
frml(identity) ginieq       gini       = gini{1}+dgini
frml(identity) dginipluseq  dginiplus  = %max(dgini,0)
frml(identity) dginiminuseq dginiminus = %min(dgini,0)
*
frml(identity) ueq          u          = u{1}+du
frml(identity) dupluseq     duplus     = %max(du,0)
frml(identity) duminuseq    duminus    = %min(du,0)
*
frml(identity) imeq         im         = im{1}+dim
frml(identity) dimpluseq    dimplus    = %max(dim,0)
frml(identity) dimminuseq   dimminus   = %min(dim,0)
*
frml(identity) zpluseq      zplus      = %max(gini-b0-b1*u-b2*im-tau,0.0)
frml(identity) zminuseq     zminus     = %min(gini-b0-b1*u-b2*im-tau,0.0)
*
group identities ginieq ueq imeq dginipluseq dginiminuseq $
   dupluseq duminuseq dimpluseq dimminuseq zpluseq zminuseq
*
forecast(model=tvecm+identities,results=baseresults,steps=12)
*
* Do Choleski factor
*
compute fsigma=%decomp(%sigma)
*
* Do responses with + and - shocks to IM
*
forecast(model=tvecm+identities,shocks=%xcol(fsigma,1),results=withplus,steps=12)
forecast(model=tvecm+identities,shocks=-1.0*%xcol(fsigma,1),results=withminus,steps=12)
*
* Take gap between the forecasts to get the IRF's
*
set irfimplus  = withplus(4)-baseresults(4)
set irfimminus = withminus(4)-baseresults(4)

This will get the IRF's from the initial conditions at the end of the data set. They apparently want the initial conditions for the "equilibrium". I assume that means lagged dx's are zero, lagged u and im are zero and lagged gini=b0+tau. (Any combination of u, im and gini which zeros out z will give the same IRF's if the lagged dx's are zero).
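A quick way to check a candidate "equilibrium" starting point is to confirm that it zeros out z, so that both zplus and zminus start at zero. This is only a sketch: it assumes b0, b1, b2 and tau (the placeholder names used in the code above) already hold the estimated values, and u0, im0, gini0 and z0 are names made up for the illustration.

```rats
* Candidate starting values: lagged u and im at zero, gini at b0+tau
compute u0=0.0,im0=0.0
compute gini0=b0+b1*u0+b2*im0+tau
* z at the starting point; this should display zero (up to rounding)
compute z0=gini0-b0-b1*u0-b2*im0-tau
display "gini0 =" gini0 "z0 =" z0
```

Changing u0 and im0 changes gini0 but leaves z0 at zero, which is the sense in which any such combination gives the same starting z.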

TVolscho-286 · Thu Apr 17, 2014 8:13 am

Hi Tom,

When I get to this point, I get blank results in the series window. I can export them to Excel, subtract, and plot, though I prefer the RATS graphics. Also, about the (4) in parentheses: why is the 4th step used for the plot?

set irfimplus = withplus(4)-baseresults(4)
set irfimminus = withminus(4)-baseresults(4)

TomDoan · Thu Apr 17, 2014 9:43 am

Why don't you post your full program? Did you rename either your variables or mine (you were using U differently from them)?

In the combined model tvecm+identities, the first three equations are in TVECM, so (4) refers to the first set of forecasts for the identities.

TVolscho-286 · Thu Apr 17, 2014 3:55 pm

Hi Tom,
Sure, the code is below. I named the residual from the static linear regression u1. The Enders/Siklos routine yields a threshold of -0.0213 (I added a 5 in the last place to adjust for rounding error). For the inputs to zplus and zminus I used the coefficients from the regression.

Code:

OPEN DATA "\hoover replicate.xls"
CALENDAR(A) 1947:1
DATA(FORMAT=XLS,ORG=COLUMNS) 1947:01 2003:01 Year gini unemp immig

set u = unemp
set im = immig

linreg gini / u1
# constant u im


set tseries = u1{1}
@enderssiklos(title="TAR Empirical Tau", threshold=tseries, pi=.10, lags=1) u1


    set dgini      = gini-gini{1}
    set dginiplus  = %max(dgini,0)
    set dginiminus = %min(dgini,0)
    set du         = u-u{1}
    set duplus     = %max(du,0)
    set duminus    = %min(du,0)
    set dim        = im-im{1}
    set dimplus    = %max(dim,0)
    set dimminus   = %min(dim,0)
    *
    set zplus      = %max(gini-0.352532523-(-0.000806540*u)-(-0.000806540*im)-(-.02135),0.0)
    set zminus     = %min(gini-0.352532523-(-0.000806540*u)-(-0.000806540*im)-(-.02135),0.0)
    *
    * Order variables IM-->U-->GINI
    *
    system(model=tvecm)
    variables dim du dgini 
    det dginiplus{1} dginiminus{1} duplus{1} duminus{1} $
      dimplus{1} dimminus{1} zplus{1} zminus{1}
    end(system)
    *
    estimate
    *
    frml(identity) ginieq       gini       = gini{1}+dgini
    frml(identity) dginipluseq  dginiplus  = %max(dgini,0)
    frml(identity) dginiminuseq dginiminus = %min(dgini,0)
    *
    frml(identity) ueq          u          = u{1}+du
    frml(identity) dupluseq     duplus     = %max(du,0)
    frml(identity) duminuseq    duminus    = %min(du,0)
    *
    frml(identity) imeq         im         = im{1}+dim
    frml(identity) dimpluseq    dimplus    = %max(dim,0)
    frml(identity) dimminuseq   dimminus   = %min(dim,0)
    *
    frml(identity) zpluseq      zplus      = %max(gini-0.352532523-(-0.000806540*u)-(-0.000806540*im)-(-.02135),0.0)
    frml(identity) zminuseq     zminus     = %min(gini-0.352532523-(-0.000806540*u)-(-0.000806540*im)-(-.02135),0.0)
    *
    group identities ginieq ueq imeq dginipluseq dginiminuseq $
       dupluseq duminuseq dimpluseq dimminuseq zpluseq zminuseq
    *
    forecast(model=tvecm+identities,results=baseresults,steps=12)
    *
    * Do Choleski factor
    *
    compute fsigma=%decomp(%sigma)
    *
    * Do responses with + and - shocks to IM
    *
    forecast(model=tvecm+identities,shocks=%xcol(fsigma,1),results=withplus,steps=12)
    forecast(model=tvecm+identities,shocks=-1.0*%xcol(fsigma,1),results=withminus,steps=12)
    *
    * Take gap between the forecasts to get the IRF's
    *
    set irfimplus  = withplus(4)-baseresults(4)
    set irfimminus = withminus(4)-baseresults(4)

TomDoan · Thu Apr 17, 2014 5:20 pm

That will be part you, part me.

The fix for my error is to add the following:

compute fstart=%regend(),fend=fstart+11
forecast(model=tvecm+identities,results=baseresults,steps=12)

...

and then put the fstart and fend parameters on

set irfimplus fstart fend = withplus(4)-baseresults(4)
set irfimminus fstart fend = withminus(4)-baseresults(4)

The FORECAST goes outside the original sample, so the range on the SET has to be explicit.
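Once the responses are defined over fstart to fend, something like the following should graph them in RATS rather than Excel. It is a sketch that assumes irfimplus and irfimminus were created by the SET instructions above; irfimminusflip is a name made up here, and flipping the sign of the negative response is just one way to make any asymmetry visible.

```rats
* Flip the sign of the response to the negative shock so the two lines
* would coincide if the responses were symmetric
set irfimminusflip fstart fend = -1.0*irfimminus
*
graph(header="Responses to IM shocks",key=below,klabels=||"Positive shock","Negative shock (sign flipped)"||) 2
# irfimplus fstart fend
# irfimminusflip fstart fend
```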

You have a typo in your definitions of Z---you used the second coefficient twice. A better way of handling that is

linreg gini / u1
# constant u im
compute b=%beta

and then

set zplus = %max(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)

with similar changes to the other three similar instructions. (%%breakvalue is defined by @ENDERSSIKLOS).
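Spelled out, the four instructions would look like this, assuming the regressors are listed as constant, u, im so that b(1), b(2) and b(3) line up with the intercept, u and im:

```rats
* b(1)=intercept, b(2)=coefficient on u, b(3)=coefficient on im
set zplus  = %max(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
set zminus = %min(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
*
frml(identity) zpluseq  zplus  = %max(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
frml(identity) zminuseq zminus = %min(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
```

If the regressor list is reordered, the b() subscripts have to be changed to match.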

TVolscho-286 · Thu Apr 17, 2014 10:02 pm

Hi Tom,
Thanks, getting closer. So the edits to the code generate the positive and negative IRFs for shock to immigrants and response of gini index. But they do not match the paper.

Also, the VECM estimates are quite a bit different from what's published in their article; see Table 3: http://www.umsl.edu/~dibooglus/personal ... evised.pdf

ZPLUS and ZMINUS. Should they match the variables XSPLIT(1) and XSPLIT(2), respectively, produced by Enders-Siklos?

I can get a match to XSPLIT(1) and XSPLIT(2) with this somewhat clunky code (following Enders/Siklos):

set mt1 = %if(u1{1}>=%%breakvalue,1,0)
set mt2 = 1-mt1
set p1 = mt1*u1{1}
set p2 = mt2*u1{1}

If I substitute p1 and p2 in place of zplus and zminus in the VECM, the estimates are pretty close to what is reported in Table 3.
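The two pairs are not the same variables even when both are built from the same residual. With u1 the residual and tau the threshold, p1 equals u1{1} in the upper regime while the lagged zplus equals u1{1}-tau there (both are zero otherwise), so they differ by tau times the regime dummy, and similarly for p2 and zminus. A sketch to see this, assuming u1, mt1, mt2, p1 and p2 are defined as above and that %%breakvalue is still set from @ENDERSSIKLOS; zp1, zm1, gapplus and gapminus are names made up for the check:

```rats
* Lagged z terms rebuilt directly from the residual u1
set zp1 = %max(u1{1}-%%breakvalue,0.0)
set zm1 = %min(u1{1}-%%breakvalue,0.0)
* Gaps between the Enders/Siklos style terms and the z terms
set gapplus  = p1-zp1
set gapminus = p2-zm1
* gapplus should equal %%breakvalue*mt1 and gapminus %%breakvalue*mt2
print / mt1 gapplus mt2 gapminus
```

Swapping p1 and p2 for the lagged zplus and zminus therefore amounts to shifting the regime intercepts, which may be part of why the estimates change.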

Here is the full code as it stands:

Code:

OPEN DATA "\hoover replicate.xls"
CALENDAR(A) 1947:1
DATA(FORMAT=XLS,ORG=COLUMNS) 1947:01 2003:01 Year gini unemp immig

set im = immig
set u = unemp

linreg gini / u1
# constant u im
compute b = %beta

set tseries = u1{1}
@enderssiklos(title="TAR Empirical", threshold=tseries, pi=.10, lags=1) u1

    set dgini      = gini-gini{1}
    set dginiplus  = %max(dgini,0)
    set dginiminus = %min(dgini,0)
    set du         = u-u{1}
    set duplus     = %max(du,0)
    set duminus    = %min(du,0)
    set dim        = im-im{1}
    set dimplus    = %max(dim,0)
    set dimminus   = %min(dim,0)
    *
    set zplus      = %max(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
    set zminus     = %min(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
    *
    * Order variables IM-->U-->GINI
    *
    system(model=tvecm)
    variables dim du dgini
    det dginiplus{1} dginiminus{1} duplus{1} duminus{1} $
      dimplus{1} dimminus{1} zplus zminus
    end(system)
    *
    estimate
    *
    frml(identity) ginieq       gini       = gini{1}+dgini
    frml(identity) dginipluseq  dginiplus  = %max(dgini,0)
    frml(identity) dginiminuseq dginiminus = %min(dgini,0)
    *
    frml(identity) ueq          u          = u{1}+du
    frml(identity) dupluseq     duplus     = %max(du,0)
    frml(identity) duminuseq    duminus    = %min(du,0)
    *
    frml(identity) imeq         im         = im{1}+dim
    frml(identity) dimpluseq    dimplus    = %max(dim,0)
    frml(identity) dimminuseq   dimminus   = %min(dim,0)
    frml(identity) zpluseq      zplus      = %max(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
    frml(identity) zminuseq     zminus     = %min(gini-b(1)-b(2)*u-b(3)*im-%%breakvalue,0.0)
    *
    group identities ginieq ueq imeq dginipluseq dginiminuseq $
       dupluseq duminuseq dimpluseq dimminuseq zpluseq zminuseq
    *
    compute fstart=%regend(),fend=fstart+11
    forecast(model=tvecm+identities,results=baseresults,steps=12)
    *
    * Do Choleski factor
    *
    compute fsigma=%decomp(%sigma)
    *
    * Do responses with + and - shocks to IM
    *
    forecast(model=tvecm+identities,shocks = %xcol(fsigma,1),results=withplus,steps=12)
    forecast(model=tvecm+identities,shocks = -1.0*%xcol(fsigma,1),results=withminus,steps=12)
    *
    * Take gap between the forecasts to get the IRF's
    *
    set irfimplus fstart fend = withplus(4)-baseresults(4)
    set irfimminus fstart fend = withminus(4)-baseresults(4)

TomDoan · Thu Apr 17, 2014 11:35 pm

Use the Code button to enclose big chunks of code (I've already done that in the other posts). It makes it easier to read and extract.

The ZPLUS and ZMINUS here have to have the {1}'s as shown below (otherwise, you're regressing the variable on itself in effect).

system(model=tvecm)
variables dim du dgini
det dginiplus{1} dginiminus{1} duplus{1} duminus{1} $
dimplus{1} dimminus{1} zplus{1} zminus{1}
end(system)

If you want the responses of GINI, you need (6)'s here:

set irfimplus fstart fend = withplus(6)-baseresults(6)
set irfimminus fstart fend = withminus(6)-baseresults(6)

I think I wrote it originally with gini first (which would mean you need (4)), but then noticed that they used a specific Cholesky order.

The estimation program seems to be doing what their equations say. Their estimates of the alpha's don't seem to match the scales that you would expect: gini is quite a bit smaller in scale than the other two, so since Z is normalized with a unit coefficient on gini, the coefficients in the other two equations would be expected to be much, much larger in magnitude than they are in the gini equation.

Did you get a program or anything else from the authors?

TomDoan · Fri Apr 18, 2014 8:01 am

It's possible, given both the coefficients and the graphs of the responses as "standard deviations" that the TVECM is run in variables divided by standard deviations (I assume they mean the basic statistical S.D's).
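If that guess is right, one way to try it would be to rescale each series by its sample standard deviation before building the differences and the TVECM. This is only a sketch of that conjecture, not something confirmed by the paper; ginis, us and ims are made-up names for the scaled series.

```rats
* STATISTICS sets %VARIANCE for the series it is applied to
statistics(noprint) gini
set ginis = gini/sqrt(%variance)
statistics(noprint) u
set us    = u/sqrt(%variance)
statistics(noprint) im
set ims   = im/sqrt(%variance)
```

The remaining steps would then be rerun with ginis, us and ims in place of gini, u and im.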

TVolscho-286 · Fri Apr 18, 2014 2:57 pm

Hi Tom,
This has all been extremely helpful.
My apologies. I will enclose the code next time.

I didn't get any code from the authors, just the dataset.

TomDoan · Fri Apr 18, 2014 3:02 pm

You can see why I really want code and not just data. There are very few papers which include all the steps in their description of the empirical work.

TVolscho-286 · Fri Apr 18, 2014 5:37 pm

I see what you mean. I've asked for code several times since I first saw the study but to no avail.

About 10 years ago there was a debate in sociology and political science, pushing for authors to post the code and data to a website for any article published in the main association journal. Economics was held up as the gold standard for this practice.

FWIW: The dataverse site has a mix of data and code:

http://thedata.harvard.edu/dvn/

The RATS Software Forum
