Re: Principal Component Analysis- Do I need to scale (i.e. normalize) my variables?



When you tell the software to use the covariance matrix in any type of factor analysis (principal components, principal factors, alpha, image, etc.), the scales of the variables matter. In the more routine situation, where you tell the software to use the correlation matrix, the variables are implicitly standardized, so differences in units or value ranges do not affect the result: z = (value of x - mean of x) / (standard deviation of x).
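To make the covariance-versus-correlation point concrete, here is a minimal sketch in Python with NumPy. The tree measurements are simulated and the variable names are made up for illustration; they are not Kerry's data. It only shows that the covariance matrix of z-scores is the correlation matrix, and that a covariance-based PCA on raw values tends to be dominated by whichever variable has the largest variance.

```python
import numpy as np

# Simulated (hypothetical) tree data in very different units.
rng = np.random.default_rng(0)
n = 500
height_m = rng.normal(15, 3, n)
weight_kg = 40 * height_m + rng.normal(0, 60, n)
age_yr = 2 * height_m + rng.normal(0, 4, n)
X = np.column_stack([height_m, weight_kg, age_yr])

def pca_eigen(matrix):
    """Eigenvalues and eigenvectors of a symmetric matrix, largest first."""
    vals, vecs = np.linalg.eigh(matrix)
    order = np.argsort(vals)[::-1]
    return vals[order], vecs[:, order]

# Standardize: z = (x - mean) / standard deviation, with no absolute value.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

cov = np.cov(X, rowvar=False)        # depends on the units
corr = np.corrcoef(X, rowvar=False)  # unit-free
cov_of_z = np.cov(Z, rowvar=False)   # same matrix as corr

cov_vals, cov_vecs = pca_eigen(cov)
corr_vals, corr_vecs = pca_eigen(corr)

print("first component, covariance PCA: ", np.round(cov_vecs[:, 0], 3))
print("first component, correlation PCA:", np.round(corr_vecs[:, 0], 3))
```

In the covariance run the first component is almost entirely weight, simply because weight has the largest numeric spread. The correlation run weights the three variables on an equal footing.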

I don't know what substantive meaning the x's would have if you used the absolute values of the z-scores (the standardized variables).
This is the correlation matrix of 4 z-scores (it is identical to the correlation matrix of the raw variables).

         x1      x2      x3      x4
x1        1    .646   -.303   -.533
x2     .646       1   -.437   -.399
x3    -.303   -.437       1    .327
x4    -.533   -.399    .327       1

This is the correlation matrix of the absolute values of the same z-scores:
       abs1    abs2    abs3    abs4
abs1      1    .535   -.269    .125
abs2   .535       1   -.324    .117
abs3  -.269   -.324       1    .022
abs4   .125    .117    .022       1
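
The same comparison can be reproduced on made-up data. The sketch below (Python with NumPy, simulated values, not the data behind the tables above) checks that standardizing leaves the correlation unchanged, while taking absolute values of the z-scores discards the sign of each deviation and so changes the correlation.

```python
import numpy as np

# Two negatively correlated variables (simulated, for illustration only).
rng = np.random.default_rng(1)
x1 = rng.normal(50, 10, 1000)
x2 = -0.5 * x1 + rng.normal(0, 8, 1000)
X = np.column_stack([x1, x2])

# Proper z-scores keep the sign of each deviation from the mean.
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

r_raw = np.corrcoef(X, rowvar=False)          # raw variables
r_z = np.corrcoef(Z, rowvar=False)            # z-scores: same as r_raw
r_abs = np.corrcoef(np.abs(Z), rowvar=False)  # |z|: a different matrix

print("raw:", round(r_raw[0, 1], 3))
print("z:  ", round(r_z[0, 1], 3))
print("|z|:", round(r_abs[0, 1], 3))
```

With abs() the variables only record how far each case is from the mean, not in which direction, so a clearly negative correlation can shrink or even change sign, much like the x1/x4 entry in the two tables above.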


Art Kendall
Social Research Consultants




Kerry wrote:
Hi,

I need to perform PCA on 20 or so variables (ex. height of say a tree,
weight of a tree, age, etc), many with different units and/or value
ranges. Will this bias my results? I noticed in a past PCA I did that
I converted all of my values to z scores [i.e. ((abs(value-mean))/std
dev)], but not sure why or if this was even a correct way to
normalize. If I do need to normalize my values, wouldn't it make more
sense to convert them to value/mean? Or what about value/sum(values)?

To be clear, I am referring to making my values unitless prior to
adding them all to my PCA.

Thanks,
K