Problem with optimisation

Use this forum to post questions about syntax problems or general programming issues. Questions on implementing a particular aspect of econometrics should go in "Econometrics Issues" below.
PeterF
Posts: 63
Joined: Thu Apr 12, 2012 2:03 pm

Problem with optimisation

Unread post by PeterF »

Hello,

I tried to program in RATS a routine to perform non-linear Support Vector Regression (SVR). In order to obtain the fitted values for the dependent variable, the function L in the attached PDF document has to be maximized. The parameter epsilon has to be set and is the width of a range above and below the optimal fitted line for the dependent variable. The parameter C has also be set as a cost factor. k() is a kernel function. If an observation of y is within the band of plus and minus epsilon, then alpha and alpha-asterisk have to be zero.

i have written the following code

Code: Select all

*
* First steps in developing the implementation of
* Support Vector Regression in RATS
*

*
* Gaussian RBF-Kernelfunction
*
function rbfkernel x1 x2 sigma
type real sigma
type real rbfkernel
type vector[real] x1 x2
local MATRIX[REAL] d

comp d = x1-x2
comp rbfkernel = exp(%dot(d,d)*-sigma)
end

* set-up the x and y vectors

declare vector[real] x(401) y(401) d(401)
do i=1, 401
   comp x(i) = -20.1+i*0.1
   comp y(i) = sin(x(i))/x(i)+ %ran(0.03)
end
comp y(201) = (y(200)+y(202))/2.0

*
* set the parameter for e-insensitive, cost and sigma (Gaussian kernel)
*
comp eps = 0.025
comp cost = 3
comp sigma = 0.05

*
* Trying find to estimate the alpha vectors
*
comp nrows = %rows(x)

declare vector[real] alpha(nrows) alphaast(nrows)
declare real costf1 costf2 coatf3 costfunc

nonlin(parmset=pars) alpha alphaast
nonlin(parmset=constraints) %sum(alpha-alphaast)==0.0 alpha>=0.0 alphaast>=0.0 $
      alpha<=cost alphaast<=cost
comp alpha = cost/2*%ones(nrows,1)
comp alphaast = cost/2*%ones(nrows,1)
find(parmset=pars+constraints,method=BFGS,iterations=100,cvcrit=0.1) maximum costfunc
comp costf1 = 0.0
comp costf2 = 0.0
comp costf3 = 0.0
 do i=1, nrows
    do j=1, nrows
       comp x1 = %xrow(X,i)
       comp x2 = %xrow(X,j)
       comp costf1 = costf1 + (alpha(i)-alphaast(i))*(alpha(j)-alphaast(j))*rbfkernel(x1,x2,sigma)
    end
    comp costf2 = costf2 + (alpha(i)+alphaast(i))
    comp costf3 = costf3 + y(i)*(alpha(i)-alphaast(i))
 end
comp costfunc = -0.5*costf1 - eps*costf2 + costf3
disp costfunc -0.5*costf1 -eps*costf2 costf3
end


Unfortunately, i do not get a solution. If I changed the method in the Find instruction then I get the error message ## M4. I have also tried to estimate the alpha vectors with quadratic programming and using either the LQPROG or the FIND instruction but had no success. What am I doing wrong? Or is it not possible to find the solution for the maximization problem? The function for the y-variable is from an example of a package in R and the fitted y-values were estimated within a second or two. I would appreciate it very much to receive an answer.

Best regards
PeterF
Attachments
L.pdf
L function which has to be maximized
(299.32 KiB) Downloaded 683 times
TomDoan
Posts: 7814
Joined: Wed Nov 01, 2006 4:36 pm

Re: Problem with optimisation

Unread post by TomDoan »

First, as you've written that, COST is integer, which will throw off the calculations. Change that to compute cost=3.0.

Have you tried doing that for a more modest-sized grid? As you've written that, you have 800 parameters which will take a really long time to get anywhere. This is set up to handle different numers of grid points easily, and it seems to work for 41---you would have to check whether the results are what would be expected, but it does converge and satisfies the restrictions.
Support Vector Regression.RPF
(1.55 KiB) Downloaded 741 times
PeterF
Posts: 63
Joined: Thu Apr 12, 2012 2:03 pm

Re: Problem with optimisation

Unread post by PeterF »

Dear Tom ,

thank you for your reply. I changed the computation of the cost variable to real and reduced the size of the x and y arrays to just 50 elements. Find with method BFGS delivered a solution, however, there were some elements of the alpha and alphaast vectors, which had a negative sign, which is not meeting the constraints.

For financial time series, 400 data points for daily or even weekly frequency does not appear to much. Would there be a more elegant and quicker way to solve the optimization of the l-function?

Best regards
PeterF
TomDoan
Posts: 7814
Joined: Wed Nov 01, 2006 4:36 pm

Re: Problem with optimisation

Unread post by TomDoan »

PeterF wrote:Dear Tom ,

thank you for your reply. I changed the computation of the cost variable to real and reduced the size of the x and y arrays to just 50 elements. Find with method BFGS delivered a solution, however, there were some elements of the alpha and alphaast vectors, which had a negative sign, which is not meeting the constraints.
Are they just "round-off" negative? If you are doing CVCRIT=.1, it's going to converge when some of those are still negative. I used a more natural CVCRIT and the negatives are all =.00000x (i.e. zero).
PeterF wrote: For financial time series, 400 data points for daily or even weekly frequency does not appear to much. Would there be a more elegant and quicker way to solve the optimization of the l-function?
Sometimes ideas don't scale well to larger data sets. However, isn't this a quadratic problem in the alpha's? The y and x are just data.
PeterF
Posts: 63
Joined: Thu Apr 12, 2012 2:03 pm

Re: Problem with optimisation

Unread post by PeterF »

Tom,

yes, the negative values are just round-offs.

it could be formulated as a quadratic optimization.The Q-matrix contains the values of the k()-kernel functions. For the Find instruction I have no problem to set the constraints with nonlin. For the LQPROG instruction, i know how to formulate the constraint that the differences between alpha and alphaast sum up to zero. But how can i include the constraints that all elements are non-negative and smaller or equal to the value of the cost variable?

With the original size of x and y, i run quickly into the memory problem. Maybe it gets time for an upgrade to the pro version of RATS.

Best regards
Peter
TomDoan
Posts: 7814
Joined: Wed Nov 01, 2006 4:36 pm

Re: Problem with optimisation

Unread post by TomDoan »

Non-negativity constraints are included by default. Upper bound would be A=identity, B=vector of constants at COST value. The quadratic form matrix would be

K -K
-K K

where K is the matrix of kernel weights.
PeterF
Posts: 63
Joined: Thu Apr 12, 2012 2:03 pm

Re: Problem with optimisation

Unread post by PeterF »

Tom,

thank you very much. For all the trees, I did not see the forest and overlooked that the solution is really not complicated as I supposed. I have written also some code lines for solving the quadratic programming instead of optimizing the L-function by applying the find instruction. It worked well and is much faster.

Best regards
PeterF
Post Reply