Re: A basic question on Canonical Correlation Analysis



Bob, thanks very much for your kind reply. I clarified my question a
bit in my other reply to Jerry.

Basically, I'm looking for a "quantitative" criterion to tell how much
better CCA-based multivariate regression (with the insignificant
canonical variates shrunk out) can do than a simpler correlation
approach. That would give me a kind of "theoretical" assessment of how
much better CCA may do, which I could then compare against (and verify
with) actual prediction tests.
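
One possible way to state such a criterion, offered only as a sketch: the symbols a_0 (a fixed weight vector for the Y's), \rho_1 (the first canonical correlation) and R(a_0) (the multiple correlation of the fixed composite on the X's) are notation introduced here for illustration. On the same sample, the three quantities are ordered as follows:

```latex
\rho_1 \;=\; \max_{a,\,b}\ \operatorname{corr}\!\left(a^{\top} Y,\; b^{\top} X\right)
\;\ge\;
R(a_0) \;=\; \max_{b}\ \operatorname{corr}\!\left(a_0^{\top} Y,\; b^{\top} X\right)
\;\ge\;
\max_{j}\ \left|\operatorname{corr}\!\left(a_0^{\top} Y,\; X_j\right)\right|
```

The gap between \rho_1 and R(a_0) is an in-sample quantity, so it is probably best read as an upper bound on the gain and checked against held-out prediction.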

As to your explanation of why CCA is not used in practice for
regression, I am a bit confused. In your example, Y1, Y2, ..., Y5 are
exam and test scores for grade determination. "If you WERE to use CCA
to determine BOTH the a(i) and b(j)", why would you have to end up with

"a(1) = .996, a(2) = a(3) = a(4) = a(5) = .001"

such that the final exam hardly counts? Without a specific context and
data, I cannot say more about your example. But here I will construct a
specific context in terms of Y1, ..., Y5.

Say we want to determine a linear combination of Y1, ..., Y5 to grade
undergraduate students that is linearly consistent with some other
academic-performance measures, say, the students' overall college GPA
(X1) and GRE scores (X2). This need not be a realistic model; it is
just to illustrate my point. Now we use CCA to determine both a(i) and
b(j) in

a(1)*Y(1) + ... + a(5)*Y(5)

b(1)*X(1) + b(2)*X(2)

taking the first canonical variate pair to determine the grade weights.
It seems that the resulting composite may well be strongly correlated
with the final exam score Y(5). If not, we may want to first check many
other things, such as 1) data glitches? 2) computation glitches? 3) ...
4) model glitches? 5) sample size glitches? I hardly see why CCA itself
would be a big problem, except perhaps when the sample size is too
small for CCA to give a sensible result.
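
A minimal sketch of this setup on simulated data, using only NumPy. Everything about the data is an assumption made for illustration: the single latent "ability" variable, the loadings, the noise levels and the sample size are all hypothetical, not real scores. The fixed weights a0 are the a priori grade weights from Bob's example. CCA is computed here through QR factors and an SVD, which is one standard route among several.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Hypothetical data (an assumption): one latent "ability" drives all scores.
ability = rng.normal(size=n)
X = np.column_stack([ability + rng.normal(size=n),    # X1: overall GPA
                     ability + rng.normal(size=n)])   # X2: GRE score
loadings = np.array([0.3, 0.4, 0.5, 0.2, 1.0])        # Y5 is the final exam
Y = ability[:, None] * loadings + rng.normal(size=(n, 5))

# Center each block, then reduce CCA to an SVD via QR factors.
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
Qx, Rx = np.linalg.qr(Xc)
Qy, Ry = np.linalg.qr(Yc)
U, rho, Vt = np.linalg.svd(Qx.T @ Qy, full_matrices=False)
B = np.linalg.solve(Rx, U)       # columns are b-weights for the X variates
A = np.linalg.solve(Ry, Vt.T)    # columns are a-weights for the Y variates

# Fixed a priori grade weights, taken from Bob's example.
a0 = np.array([0.10, 0.15, 0.20, 0.05, 0.50])
y0 = Yc @ a0
fitted = Qx @ (Qx.T @ y0)        # least-squares projection of y0 onto X
R_fixed = np.corrcoef(y0, fitted)[0, 1]

# Best single-predictor correlation with the fixed composite.
r_best = max(abs(np.corrcoef(y0, Xc[:, j])[0, 1]) for j in range(X.shape[1]))

# First-pair Y weights, rescaled for readability; the overall sign is arbitrary.
a_cca = A[:, 0] / np.abs(A[:, 0]).sum()

print("first canonical correlation :", rho[0])
print("multiple R with fixed a0    :", R_fixed)
print("best single correlation     :", r_best)
print("CCA weights a(1..5), rescaled:", a_cca)
```

Comparing the printed a_cca with a0 shows whether, for a given data set, the CCA weights land near or far from the weights one would want for grading. The in-sample ordering rho[0] >= R_fixed >= r_best always holds; how much of that gap survives on new data would have to be checked separately.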

Now, in my work I do face problems for which using CCA for regression
makes sound sense. I have seen many papers by statisticians on this, so
I hope I am not too far off.

Gary

>Reef Fish wrote:
>
>
> Gary, in spite of your lengthy explanation of your "question", I am not
> sure what your question is!
>
> OTOH, I have an ANSWER which I think may answer your question,
> whatever that may be.
>
> In a CCA (Canonical Correlation Analysis), you try to find
> unknown coefficients a(1), ..., a(p), and b(1), ..., b(q) <for
> simplicity, I'll leave out the question of constraints on these
> parameters> which will MAXIMIZE the correlation between the linear
> combinations
>
> Y = a(1) Y1 + a(2) Y2 + ... + a(p) Yp, and
>
> X = b(1) X1 + b(2) X2 + ... + b(q) Xq.
>
> So, basically you have (p+q) parameters to maximize the correlation.
>
> But if you SPECIFY or know the coefficients a(1), ..., a(p),
> and want the highest correlation between X and Y or the best
> predictor of Y by X, as in a Multiple Regression problem with
> one dep. var. Y and q independent vars. X, you only have q
> parameters to vary to maximize the correlation.
>
> That is why the CCA correlation is always higher than what you
> call "simpler correlation approaches".
>
> In practice, the fly in the ointment of this higher correlation or
> predictive power in CCA is that the function a(1)Y1 + ... + a(p)Yp
> in CCA may NOT be what you WANT to predict.
>
> For example Y1, Y2, ... ,Yp may be exam and test scores for
> grade determination. If you have a priori weights for Yp the
> final exam score to count higher than the other scores and have
> reasons to want to base the grade on
>
> .1 Y1 + .15 Y2 + .2 Y3 + .05 Y4 + .5 Y5, say,
>
> where Y5 is the final exam score, then you can use regression
> methods to determine the coefficients in b(1)X1 + ... + b(q)Xq
> that best predicts Y, based on the variables X.
>
> If you WERE to use CCA to determine BOTH the a(i) and b(j),
> you may find the maximized correlation is achieved by
>
> a(1) = .996, a(2) = a(3) = a(4) = a(5) = .001
>
> so that Y1 becomes essentially the sole determinant of the
> course grade, while the final exam score Y5 hardly counts at
> all.
>
> This is a deliberately over-simplified explanation of the CCA.
>
> In a nutshell, that's pretty much why CCAs are rarely used in
> any practical situations whereas regression methods are,
> because one nearly always knows the weights a(i) of the
> variables Y(i) in the objective function to be evaluated or
> predicted.
>
> -- Bob.
