reading data when list of variables is huge

RK2509
Posts: 30
Joined: Wed Apr 15, 2015 3:16 pm


Hi,

I am trying to run a Principal Component Analysis on around 103 commodity series, but the Series Window shows only about the first 70. RATS is not reading the entire list of variables. How do I make sure that all the variables are included before I run the PCA?


cal(m) 1981:4
open data pca.xls
data(format=xls,org=cols) 1981:4 2014:9 AL PR FA FG CR RI WH JW BJ M BL RG PU G AR MN MA UR FV V PO SP ON TA GN PEA TA CA FR BA MNG AP OR CSHW CNT PAP GRP MLK EMF EG FIS MUT CHK PRK SP BP CH TU CM DRGN BNT CUM GRL OFA TEA COF NFART FP MNPR FDP DP BU GH PM CANN CF GRNM MD SO AT WB BK CK BRD SKG SU KH GU SL SC EO V GNO MST GR CTN RBO OILC MOC GROC CTO TCP TLF CPWD OFP BT WN ML SCW TXT


I am also using RATS for PCA for the first time. This is how I am doing it; I hope it is correct.

vcv(center,matrix=r)
#AL PR FA FG CR RI WH JW BJ M BL RG PU G AR MN MA UR FV V PO SP ON TA GN PEA TA CA FR BA MNG AP OR CSHW CNT PAP GRP MLK EMF EG FIS MUT CHK PRK SP BP CH TU CM DRGN BNT CUM GRL OFA TEA COF NFART FP MNPR FDP DP BU GH PM CANN CF GRNM MD SO AT WB BK CK BRD SKG SU KH GU SL SC EO V GNO MST GR CTN RBO OILC MOC GROC CTO TCP TLF CPWD OFP BT WN ML SCW TXT

@prinfactors(print) r
@prinfactors(print,values=evalues) %cvtocorr(r)
set eigen 1 2 = evalues(t)
graph(style=symbols,vlabel="Eigenvalue",hlabel="Component",nodates)
#eigen
TomDoan
Posts: 7814
Joined: Wed Nov 01, 2006 4:36 pm


Use $ at the end of a line which is too long to fit, and continue the list on the next line. See Section 1.5.6 of the Introduction.

Depending upon what you want to do, you may want the @PRINCOMP procedure rather than @PRINFACTORS.
RK2509

Thanks a ton, this is a great help. With the $ sign at the end of the line, RATS reads the entire data set.

I was also trying the @PRINCOMP procedure. This is what I did to get the first principal component.

cal(m) 1981:4
open data pca2.xls
data(format=xls,org=cols) 1981:4 2014:9 list of variables
* dec vect[series] pcfoodprices

vcv(center,matrix=r)
#list of variables

@princomp(corr, ncomp=1) 1981:4 2014:9 pcfoodprices
#list of variables
@princomp(print)

If I run this, I get PCFOODPRICES(1) in the Series Window along with the list of series, and when I click on it, I get a value of 0.596221. Is that the value of the first principal component?
I am new to RATS, which is why I have basic questions.
Thank you so much. This is a great help indeed.
TomDoan

print / pcfoodprices(1)

should give you the first principal component across the full range of data.
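
The point that the first principal component is a full series (one value per observation) rather than a single number can be illustrated outside RATS. The following is a minimal Python sketch, not the @PRINCOMP code: it assumes made-up data, standardizes each series (the correlation-matrix case), and uses power iteration as one simple way to find the leading eigenvector. All function and variable names are hypothetical.

```python
import math

def standardize(columns):
    """Center each series and scale it to unit variance."""
    out = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((x - mean) ** 2 for x in col) / n)
        out.append([(x - mean) / sd for x in col])
    return out

def correlation_matrix(z):
    """Correlation matrix of already standardized series."""
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]

def leading_eigenvector(m, iterations=500):
    """Power iteration: repeatedly multiply by m and renormalize."""
    k = len(m)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def first_component(columns):
    """Scores of the first principal component, one per observation."""
    z = standardize(columns)
    v = leading_eigenvector(correlation_matrix(z))
    n_obs = len(z[0])
    return [sum(v[j] * z[j][t] for j in range(len(z))) for t in range(n_obs)]

# Made-up data: three short series with six observations each.
prices = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [2.1, 3.9, 6.2, 8.1, 9.8, 12.0],
    [5.0, 4.0, 4.5, 3.0, 2.5, 1.0],
]

pc1 = first_component(prices)
for t, score in enumerate(pc1, start=1):
    print(t, round(score, 4))
```

The printed result has as many entries as there are observations, which is the analogue of printing the series over the full data range.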
RK2509

Thank you so much. It worked. This is great help.
Many Many Thanks
ateeb
Posts: 65
Joined: Sat Mar 16, 2019 11:15 am


Dear Tom,

I am trying to write a paper on the effectiveness of monetary policy. My Y is the short-term interest rate (the operational target of the central bank, denoted by R). X is a big matrix of data, and X1 is a big matrix of slow-moving variables. Once I have applied the @princomp procedure to the slow and overall variables and obtained the number of components I am looking for, say 3 in this case, how can I do the following?

1. How do I see how much of the total variation in the data each component explains?
2. Suppose we have extracted principal components from the slow and overall variables, say three each (C-slow and C-overall), and Y has three series of interest, say Y = (IPI, P, R). To remove the dependence, do I have to run the following regression?

C-overall-1st = C-slow-1st + C-slow-2nd + C-slow-3rd+ B*Y + et

That is, does this regression have to be run for each component, and B*Y then subtracted from it (C-overall-1st - B*Y)?

Regards,

Ateeb
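
On the first question, one standard fact can be sketched outside RATS: when components are extracted from a correlation matrix, the eigenvalues sum to the number of variables, so each component's share of the total variation is its eigenvalue divided by that count. The following is a minimal Python sketch under stated assumptions, not the @PRINCOMP code: the 3x3 correlation matrix is made up, the helper names are hypothetical, and power iteration with deflation is used only because it is easy to write without extra libraries.

```python
import math

def mat_vec(m, v):
    """Multiply a square matrix by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

def top_eigenpairs(m, count, iterations=1000):
    """Leading eigenpairs of a symmetric positive semidefinite matrix."""
    k = len(m)
    work = [row[:] for row in m]  # copy, because deflation modifies it
    pairs = []
    for _ in range(count):
        v = [1.0 + 0.1 * i for i in range(k)]  # arbitrary starting vector
        for _ in range(iterations):
            w = mat_vec(work, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue for the unit vector v.
        value = sum(vi * wi for vi, wi in zip(v, mat_vec(work, v)))
        pairs.append((value, v))
        # Deflation: remove the component just found.
        for i in range(k):
            for j in range(k):
                work[i][j] -= value * v[i] * v[j]
    return pairs

# Made-up correlation matrix for three variables.
corr = [
    [1.0, 0.8, 0.3],
    [0.8, 1.0, 0.2],
    [0.3, 0.2, 1.0],
]

pairs = top_eigenpairs(corr, 3)
total = sum(corr[i][i] for i in range(len(corr)))  # trace = number of variables
shares = [value / total for value, _ in pairs]
for i, share in enumerate(shares, start=1):
    print(f"component {i}: {share:.1%} of total variation")
```

The shares add up to one, and a running sum of them gives the cumulative variation explained by the first few components.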