Re: ratios and spurious correlation



On Fri, 26 Sep 2008 08:09:54 -0700 (PDT), sangdonlee@xxxxxxxxx wrote:

On Sep 25, 8:54 pm, Richard Wright <richwrigREM...@xxxxxxxxxx> wrote:

The reason for my interest is that I am trying to evaluate a
morphometric paper that does linear discriminant analysis on a mixture
of measurements and ratios derived from those same measurements. For
example the analysis includes (A) Length as well as Height/Length and
(B) Height and Breadth as well as Height/Breadth and Height/Length.
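
The concern behind this design can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the paper's data: Height, Breadth and Length are simulated as independent normal variables (the means, spreads and sample size are made up). The two ratios named above, Height/Breadth and Height/Length, share a numerator, so they come out correlated even though the raw measurements are not.

```python
import numpy as np

# Hypothetical, simulated measurements: independent by construction.
rng = np.random.default_rng(0)
n = 5000
height = rng.normal(100.0, 10.0, n)
breadth = rng.normal(100.0, 10.0, n)
length = rng.normal(100.0, 10.0, n)

# Correlation between two raw measurements (should be near zero).
r_raw = np.corrcoef(breadth, length)[0, 1]

# Correlation between two ratios that share the numerator Height.
r_ratio = np.corrcoef(height / breadth, height / length)[0, 1]

print(f"corr(Breadth, Length)               = {r_raw:.3f}")
print(f"corr(Height/Breadth, Height/Length) = {r_ratio:.3f}")
```

With equal coefficients of variation, as assumed here, the correlation between the ratios is roughly 0.5. It arises purely from the shared term.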

If I understood your question correctly, I would like to recommend CA
(correspondence analysis).

CA applies SVD (singular value decomposition) or EVD (eigenvalue
decomposition) to the "profiles" of the data (X). The row/column
profiles are defined as the ratios of each entry (Xij) to its
row/column total, so CA analyzes ratios whose denominator is not a
single variable but the row or column total (i.e., the sum over all
variables or the sum over all samples). This avoids the issue of
which variable to select as the denominator.
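
As a minimal sketch of the description above, the following computes row and column profiles and a CA decomposition via SVD. It follows the common textbook formulation (SVD of the standardized residuals of the correspondence matrix), which is an assumption here and not necessarily the exact procedure used in the post. The function name `correspondence_analysis` and the small table `X` are hypothetical, and the data must be non-negative with non-zero row and column totals.

```python
import numpy as np

def correspondence_analysis(X):
    """Sketch of CA on a non-negative table X (rows = samples, columns = variables)."""
    X = np.asarray(X, dtype=float)
    P = X / X.sum()                 # correspondence matrix (sums to 1)
    r = P.sum(axis=1)               # row masses (row totals / grand total)
    c = P.sum(axis=0)               # column masses (column totals / grand total)

    # Profiles: each entry divided by its row total or its column total.
    row_profiles = P / r[:, None]
    col_profiles = P / c[None, :]

    # Standardized residuals from the independence model r * c'.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)

    # Principal coordinates of rows and columns.
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    return row_profiles, col_profiles, sv, row_coords, col_coords

# Hypothetical 3 x 3 table of positive measurements.
X = [[10, 20, 30],
     [20, 20, 20],
     [30, 10, 5]]
row_profiles, col_profiles, sv, row_coords, col_coords = correspondence_analysis(X)
print(row_profiles)
print(sv ** 2)   # principal inertias
```

Each row profile sums to 1, so no single variable has to be chosen as the denominator.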

A good book on the similarities and differences among PCA, CA, and
MDS (multidimensional scaling) is by Susan Weller and A. K. Romney:

Weller, S.C. and Romney, A.K. (1990) Metric Scaling: Correspondence
Analysis, Sage university paper series on quantitative applications in
the social sciences, No. 07-075, Sage, Newbury Park, CA.

I performed CA, PCA, and ICA (independent component analysis) on the
anthropometric data (X) from CAESAR to understand obesity, and found
that CA gave better results than PCA and results as good as ICA's. The
following paper compares only PCA and ICA, though:

Lee, Sangdon, 2008, 'Comparative analyses of anthropometry associated
with overweight and obesity: PCA and ICA approaches', Theoretical
Issues in Ergonomics Science, 9:5, 441-475. (I have a pdf).

IMHO, PCA is good for analyzing "correlation", MDS for various
"distances", and CA for "profiles", even though they are known to give
very similar results: correlated variables tend to be close in distance
and thus have similar profiles (I guess).

Hope this helps.

Sangdon Lee, Ph.D.,
GM Tech. Center

Thanks Sangdon. What I am doing is evaluating a paper, not writing
one. I know of the virtues of correspondence analysis, but the
transformations did not produce zero Euclidean distance in the
experiments by Jungers et al (see my reply in this thread to Rich
Ulrich).
