"Roger Stafford" <ellieandrogerxyzzy@mindspring.com.invalid> wrote in message <h5skgj$g3q$1@fred.mathworks.com>...
> "shinchan " <shinchan75034@gmail.com> wrote in message <h5si1s$2u6$1@fred.mathworks.com>...
> > x=[.69 -1.31 .39 .09 1.29 .49 .19 -0.81 -.31 -.71];
> > y=[.49 -1.21 .99 .29 1.09 .79 -.31 -.81 -.31 -1.01];
> >
> > Given the data here, I understand that the first PCA axis (eigenvector) is the line that minimizes the sum of the squared distances from the points to that line. So I use fminsearch to find the slope of this line:
> >
> > slope = fminsearch(@(param) (y - param*x)*(y - param*x)' + (x - y/param)*(x - y/param)', 1); % x and y are row vectors; slope = 1.0725.
> >
> >
> > However, I can also use eig() to get the eigenvectors, and the slope I get that way is not the same as above.
> >
> > [vec val]=eig(cov(x,y));
> >
> > % the slope of the first PCA axis is
> > slope = vec(2,2)/vec(1,2); % slope = 1.0845.
> >
> > Did I miss something here? I can't seem to figure out why these two methods give me different slopes. I hope someone can help me out. Thanks.
>
> I can think of a couple of things amiss with your fminsearch formula. First, you are minimizing, in effect, sum((y-p*x).^2)*(1+p^2)/p^2, where p is param, when you should be minimizing sum((y-p*x).^2)/(1+p^2). The latter is the sum of squared perpendicular distances to the line through the origin with slope p, and so is proportional to the true mean squared distance to that line.
>
> Secondly, the line you want does not necessarily run through the origin. You should subtract the mean values of x and y from x and y, respectively, before using them in the above formula. It is, after all, a two-parameter problem: the line's slope and its distance from the origin.
>
> Roger Stafford
Hi Roger,
Thank you very much for your explanations. You are quite correct. Now I understand that I wasn't minimizing what I thought I was. Thanks a bunch.
Shinchan.
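
For anyone following along, here is a minimal sketch of the objective Roger describes, compared against the eig() result. The names xc, yc, perpSSE, slopeFmin and slopeEig are made up for illustration, and selecting the eigenvector with max(diag(val)) is my own assumption so the code does not rely on the order in which eig() returns eigenvalues.

```matlab
% Data from the original post (row vectors).
x = [.69 -1.31 .39 .09 1.29 .49 .19 -0.81 -.31 -.71];
y = [.49 -1.21 .99 .29 1.09 .79 -.31 -.81 -.31 -1.01];

% Subtract the means so the fitted line passes through the centroid
% rather than being forced through the origin.
xc = x - mean(x);
yc = y - mean(y);

% Sum of squared perpendicular distances from the points to the line
% y = p*x (vertical residuals scaled by 1/(1 + p^2)).
perpSSE = @(p) sum((yc - p*xc).^2) / (1 + p^2);
slopeFmin = fminsearch(perpSSE, 1);

% Slope of the eigenvector belonging to the largest eigenvalue of the
% covariance matrix (assumption: chosen explicitly, not by column position).
[vec, val] = eig(cov(x, y));
[~, k] = max(diag(val));
slopeEig = vec(2, k) / vec(1, k);

fprintf('fminsearch: %.4f   eig: %.4f\n', slopeFmin, slopeEig);
```

With this particular data the means of x and y are already zero, so the centering step changes nothing here; it matters for data that has not been centered beforehand. The two slopes should agree to within fminsearch's tolerance.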