Panel Data Operation

Posted: Mon Oct 28, 2013 4:39 pm
by yngvi
I have a panel of 11 individuals and 16 time periods, 1997-2012.

Problem 1:
Dividing every individual's series by a common time series variable, e.g. CPI.
The SET instruction doesn't seem to work, as the common variable is interpreted as
belonging to individual 1.

I have come up with the following

Code:

compute nobs=176
declare vector[real] YD_r(nobs)
do i=0,nobs-16,16
  do j=1,16
    compute k=i+j
    compute YD_r(k) = YD(k)/CPI(j)*CPI(16)
  end do j
end do i
* converting vector to panel data
set yd_rt 1//1997:1 11//2012:1 = yd_r(t-1//1997:1+1)
Question: Is there an easier/cleaner way to do this?

Problem 2: What is the most natural way of making up a panel of time differences?
It can be done with a double DO loop similar to the one above, e.g.

Code:

declare vector[real] dYD_r(nobs-11)
do i=0,nobs-26,15
  do j=1,15
    compute k=i+j
    compute dYD_r(k) = log(YD_r(k+1)/YD_r(k))
  end do j
end do i
However, I have a problem converting the dYD_r vector back to a time series, as

Code:

set dYD_rt 1//1998:1 11//2012:1 = dYD_r(t-1//1998:1+i)
doesn't work. It seems to assume the time series element of the differenced panel
begins in 1997. I'm pretty sure that I'm overlooking something really simple.

Thanks, Yngvi

Re: Panel Data Operation

Posted: Mon Oct 28, 2013 6:56 pm
by TomDoan
yngvi wrote: Problem 1:
Applying a division by a common time series variable for all individuals e.g. CPI.
[...]
Question: Is there an easier/cleaner way to do this?
Sounds like you want

set(nopanel) yd_r = yd/cpi(%period(t))*cpi(2012:1)
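
To spell out the arithmetic behind `cpi(%period(t))`, here is a minimal sketch in Python (not RATS code). It assumes the panel is held as one stacked list, individual after individual, with 16 periods each, and that the common CPI series has 16 entries. The function name `deflate_panel` and the toy numbers are made up for illustration.

```python
N_PERIODS = 16  # 1997-2012, as in the panel above

def deflate_panel(yd, cpi):
    """Divide each stacked panel entry by that period's CPI, rebased to the last period."""
    base = cpi[-1]  # CPI in the final period (2012)
    out = []
    for k, value in enumerate(yd):
        period = k % len(cpi)  # 0-based period within the individual
        out.append(value / cpi[period] * base)
    return out

# Toy data: 11 individuals with identical income, CPI rising by 2 per year.
cpi = [100.0 + 2.0 * j for j in range(N_PERIODS)]
yd = [50.0] * (11 * N_PERIODS)
yd_r = deflate_panel(yd, cpi)
```

The point is that every individual reuses the same 16 CPI values, which is what the double loop in the original post does by hand.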

yngvi wrote: Problem 2: What is the most natural way of making up a panel of time differences?
[...]
It seems to assume the time series element of the differenced panel
begins in 1997. I'm pretty sure that I'm overlooking something really simple.
If you're trying to do a forward difference,

set dyd_r = log(yd_r{-1}/yd_r)

That will be defined except for the final period for each individual.

Re: Panel Data Operation

Posted: Tue Oct 29, 2013 6:16 am
by yngvi
Thanks.

The solution to problem 1:

Code:

set(nopanel) yd_r = yd/cpi(%period(t))*cpi(2012:1)
needed to be amended as:

Code:

set(nopanel) yd_r 1//1997:1 11//2012:1 = yd/cpi(%period(t))*cpi(2012:1)
i.e. I needed to specify the sample range explicitly. The %period(t) was what I was looking for. I had already tried something similar.

As regards problem 2, I wasn't looking for a forward difference; I just wanted to populate the vector with no empty elements.
As in problem 1, I needed to specify the sample range explicitly, but this time within a do loop. I don't understand why I can't use a sample reference similar to the one I used for problem 1.
In any case, the following works:

Code:

do i=1,11
set dYD_r i//1998:1 i//2012:1 = log(yd_r/yd_r{1})
end do i
I would like to understand why I need to reference the sample range in a different manner in problem 2 than in problem 1 if you can provide a comment on that.

Re: Panel Data Operation

Posted: Tue Oct 29, 2013 6:24 am
by TomDoan
yngvi wrote: As regards problem 2 then I wasn't looking for a forward difference I just wanted to populate the vector with no empty elements.
[...]
I would like to understand why I need to reference the sample range in a different manner in problem 2 than in problem 1 if you can provide a comment on that.
How do you difference the data without losing data points?

If you're doing the standard backwards difference, you just need

set dYD_r = log(yd_r/yd_r{1})

The first element of each individual will be NA. With panel data, by default, an expression which spans two individuals (as this would at the first entry in an individual's record) will be NA. The NOPANEL option in the other case disables that, allowing the calculation to use data from the common series (CPI) which is in the first individual's record only.
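
As an illustration of that boundary rule, here is a small Python sketch (not RATS code) of a backward log difference on a stacked panel. It assumes each individual occupies a contiguous block of `n_periods` entries and uses `None` to stand in for NA. The name `panel_log_diff` is hypothetical.

```python
import math

def panel_log_diff(series, n_periods):
    """Backward log difference that never reaches across an individual's boundary."""
    out = []
    for k, value in enumerate(series):
        if k % n_periods == 0:
            out.append(None)  # first period of an individual: no previous value
        else:
            out.append(math.log(value / series[k - 1]))
    return out

# Two individuals, three periods each, stacked one after the other.
levels = [1.0, 2.0, 4.0, 10.0, 20.0, 40.0]
growth = panel_log_diff(levels, n_periods=3)
```

Without the boundary check, the fourth entry would be computed as log(10/4), mixing the last value of the first individual with the first value of the second.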

Re: Panel Data Operation

Posted: Tue Oct 29, 2013 8:52 am
by yngvi
1) I was not addressing the fact that you lose data points. I was addressing the issue that in my example the SET instruction doesn't run correctly without setting the sample explicitly on the instruction.
2) In comparing problems 1 and 2, I meant that I seem to need a do loop in the second problem, while in the first I can use a single SET instruction for the full sample, including the individual references, i.e. 1//1998:1 11//2012:1 vs. i//1998:1 i//2012:1 within a do loop. I am puzzled by the need to set the sample in different ways for the two problems. I am not puzzled at all by losing a data point; that goes without saying. :)

Re: Panel Data Operation

Posted: Tue Oct 29, 2013 9:28 am
by TomDoan
What's your ALLOCATE instruction or first DATA range? That sets the default length on a SET instruction. If you make the ALLOCATE instruction cover the full range of the panel data, then you won't have to do an explicit range later on.
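
To see why the default range matters, it may help to think of panel dates as entry numbers in one long stacked series. The Python sketch below (not RATS code) assumes entries are numbered consecutively by individual, with 16 periods each starting in 1997, which matches the `t-1//1997:1+1` arithmetic in the first post. The helper `panel_entry` is hypothetical.

```python
FIRST_YEAR = 1997
PANEL_OBS = 16  # periods per individual

def panel_entry(indiv, year):
    """1-based entry number of indiv//year:1 in the stacked panel."""
    return (indiv - 1) * PANEL_OBS + (year - FIRST_YEAR) + 1

end_one_individual = panel_entry(1, 2012)   # a range that stops inside individual 1
end_full_panel = panel_entry(11, 2012)      # a range that covers all 11 individuals
```

Under this assumption, a default range ending at the first value reaches only individual 1, while one ending at the second spans the whole panel.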

Re: Panel Data Operation

Posted: Tue Oct 29, 2013 12:20 pm
by yngvi
This is how I set it up. I never did a separate ALLOCATE for the panel.

Code:

cal(a) 1997:1
all 2012:1
*
open data Indices.xlsx
data(org=obs,format=xlsx) 1997:01 2012:01 CPI W
close data
*
cal(a,panelobs=16) 1997:1
open data Panel.rat
data(format=rats) 1//1997:1 11//2012:1 C YD TFER A
close data
Then I set the ranges on the individual SET instructions.
Thanks for the help.

Re: Panel Data Operation

Posted: Tue Oct 29, 2013 12:43 pm
by yngvi
FYI, I then added an ALLOCATE 11//2012:1 instruction.
I still seem to need a do loop around the differencing SET instructions, as per my earlier post.
It's a puzzle but I've got a working program.

BTW, do you have any documentation on the MOVE instruction?

Many thanks again.

Re: Panel Data Operation

Posted: Tue Oct 29, 2013 1:46 pm
by TomDoan
If you do the ALLOCATE for the panel range, you shouldn't have to loop over the individuals.

The syntax for MOVE is

move series start end newseries newstart

It doesn't look at panel boundaries, so just shifts data around between series and newseries.
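
A rough Python sketch (not RATS code) of one reading of that syntax: entries `start` to `end` of `series` are copied into `newseries` beginning at `newstart`, with no check on which individual an entry belongs to. The copy semantics here are an assumption based on the description above, and the function `move` is only a stand-in for illustration.

```python
def move(series, start, end, newseries, newstart):
    """Copy entries start..end (1-based, inclusive) of series into newseries from newstart."""
    result = list(newseries)
    for offset in range(end - start + 1):
        result[newstart - 1 + offset] = series[start - 1 + offset]
    return result

source = [1, 2, 3, 4, 5, 6]              # two individuals, three periods each
target = [None] * len(source)
shifted = move(source, 1, 5, target, 2)  # shift everything one entry later
```

In this toy case the fourth entry of `shifted`, which is the first period of the second individual, ends up holding the last value of the first individual. That is the kind of boundary crossing a panel-aware SET avoids.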