Re: A basic question on Canonical Correlation Analysis



In article <86j4o1l33ob8t9j3orppt7b9er8m67p7m4@xxxxxxx>, Richard Ulrich
<Rich.Ulrich@xxxxxxxxxxx> writes
>On 21 Nov 2005 14:23:01 -0800, gangxu_csu@xxxxxxxxx wrote:
>
>> Hi, all,
>>
>> I have a basic question about canonical correlation.
>>
>> * put in the multivariate regression (multiple-input and
>> multiple-output), and given some well-justified
>> shrinkage/rank-reduction scheme, why would the canonical correlation
>> give a better prediction to multiple-output than, e.g., simple
>> correlations? Here simple correlation is one-variable to one-variable
>> correlation. Which theoretical arguments do specifically support this
>> statement?
>
>Who uses a "shrinkage/rank-reduction scheme"?
>
Here is something along these lines that I haven't seen elsewhere and so
presume is a bad idea, but I'd like to know if there are any neat
insights about why it wouldn't work and what would be better.

Suppose we have a number of pairs of vectors, one dependent and one
independent. The dimension of the vectors is not small compared with the
number of pairs. For example, we might have a short time series whose
observations are relatively long vectors, and we might be trying to find
a good linear predictor of the next observation, given the current
observation. Because the vectors are long compared to the number of
observations, using ordinary linear regression will lead to massive
overfitting. We might have subject-matter knowledge that leads us to
believe that there is a hidden time series of small dimension, and that
each observation is a linear function of the state of the hidden time
series, plus noise.
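
One way to write that belief down, purely as a sketch: the symbols below are my own notation, and the linear dynamics of the hidden series in the second equation are an added assumption (the paragraph above only says the observations are linear in the hidden state).

```latex
% Hypothetical notation: y_t is the observed p-vector, h_t the hidden
% k-vector with k << p, H a p-by-k loading matrix, F a k-by-k transition.
\begin{align}
  y_t     &= H\,h_t + \varepsilon_t, \\
  h_{t+1} &= F\,h_t + \eta_t  \qquad \text{(assumed linear dynamics)}.
\end{align}
% Under these assumptions the best linear predictor of y_{t+1} from y_t
% passes through h_t, so its p-by-p coefficient matrix has rank at most k.
```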

One attempt to produce a predictor with low computational cost
(compared to schemes requiring e.g. Gibbs sampling or general-purpose
numerical optimisation routines) would be to perform a canonical
correlation analysis between the dependent and independent vectors,
retain only the first few canonical variates on the independent side,
and replace each independent vector with its values on those variates.
Now use linear regression to find a linear predictor of the dependent
side given the (short) vectors of variate values and, if required,
perform a little linear algebra to work back to a linear predictor of
dependent vectors given independent vectors.
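
To make the recipe concrete, here is a minimal NumPy sketch of it. The function name, the ridge term and the simulated data are all my own assumptions, not part of the proposal. The ridge term in particular is an addition: when the vectors are long relative to the number of pairs, the sample covariance matrices are singular or nearly so, and unregularised CCA would itself overfit.

```python
import numpy as np

def cca_reduced_predictor(X, Y, k, ridge=1e-3):
    """Sketch: rank-k linear predictor of Y from X via CCA on the X side.

    X is (n, p), Y is (n, q). Returns (B, x_mean, y_mean, rho) such that
    Y is approximated by y_mean + (X - x_mean) @ B, with rank(B) <= k.
    `ridge` is an assumed regulariser, not part of the original recipe.
    """
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - x_mean, Y - y_mean
    n = X.shape[0]
    Sxx = Xc.T @ Xc / n + ridge * np.eye(X.shape[1])
    Syy = Yc.T @ Yc / n + ridge * np.eye(Y.shape[1])
    Sxy = Xc.T @ Yc / n
    # Whiten both sides with Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # M = Lx^{-1} Sxy Ly^{-T}; its singular values are the
    # (regularised) canonical correlations.
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, rho, _ = np.linalg.svd(M, full_matrices=False)
    # Canonical weights for the first k variates on the independent side.
    A = np.linalg.solve(Lx.T, U[:, :k])           # (p, k)
    Z = Xc @ A                                    # short variate vectors, (n, k)
    # Ordinary least squares of the dependent vectors on the k variates.
    C, *_ = np.linalg.lstsq(Z, Yc, rcond=None)    # (k, q)
    # "A little linear algebra": compose back to a p-by-q predictor.
    B = A @ C
    return B, x_mean, y_mean, rho[:k]

# Simulated example (all values hypothetical): a 2-dimensional hidden
# series observed through 20-dimensional noisy vectors, 40 pairs in total.
rng = np.random.default_rng(0)
n_steps, dim, hidden = 41, 20, 2
F = np.array([[0.9, -0.3], [0.3, 0.9]])           # stable hidden dynamics
H = rng.normal(size=(hidden, dim))                # hidden state -> observation
h = np.zeros((n_steps, hidden))
h[0] = rng.normal(size=hidden)
for t in range(1, n_steps):
    h[t] = h[t - 1] @ F.T + 0.1 * rng.normal(size=hidden)
obs = h @ H + 0.1 * rng.normal(size=(n_steps, dim))

X, Y = obs[:-1], obs[1:]                          # current -> next observation
B, x_mean, y_mean, rho = cca_reduced_predictor(X, Y, k=hidden)
Y_hat = y_mean + (X - x_mean) @ B                 # in-sample predictions
```

The number of retained variates k and the ridge strength would both need choosing, for instance by holding out the tail of the series. Nothing in this sketch says whether the scheme predicts better than the alternatives; that remains the open question.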
--
A.G.McDowell