Re: Question about using PCA to select major features from dataset



Jonathan Campbell wrote:
Jake wrote:
Hello Jon,

Based on your comments, I now understand that the 3 chosen features
do not come from the original features (f1, f2, f3, f4, f5), so there
is no direct relationship between the newly created features and the
original ones.

Also, what does it mean to maximize the variance of n1, n2, n3?



The a_di, i = 1, ..., 5, describe a projection line for PCA component (feature) d.

Think of a two-dimensional data set drawn as a scatter plot, with the data lying in a long, narrow elliptical cluster along the diagonal.

If you project onto the diagonal (PCA component 1), that will give you maximum spread (variance). PCA component 2 will be perpendicular to component 1 and will have much less spread.
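
The two-dimensional picture can be tried out numerically. What follows is only a minimal sketch, not taken from the post or the linked notes: it assumes NumPy, synthetic Gaussian data with an arbitrary seed and arbitrary spreads (3.0 along the diagonal, 0.3 across it), and PCA computed as the eigenvectors of the sample covariance matrix. All variable names are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only

# Synthetic 2-D data (an assumption): a long, narrow cluster along the diagonal.
n = 2000
along = rng.normal(0.0, 3.0, n)    # large spread along the diagonal
across = rng.normal(0.0, 0.3, n)   # small spread across it
diag = np.array([1.0, 1.0]) / np.sqrt(2.0)   # unit vector along the diagonal
anti = np.array([-1.0, 1.0]) / np.sqrt(2.0)  # unit vector perpendicular to it
X = np.outer(along, diag) + np.outer(across, anti)

# PCA via the eigenvectors of the covariance matrix of the centred data.
Xc = X - X.mean(axis=0)
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]        # reorder: largest variance first
eigvals = eigvals[order]
components = eigvecs[:, order].T         # row d holds the projection direction of component d

# Project the data onto each component and measure the spread.
scores = Xc @ components.T
variances = scores.var(axis=0, ddof=1)

print("component 1 direction:", components[0])
print("component 2 direction:", components[1])
print("variance along each component:", variances)
```

Under these assumptions, the first row of `components` should come out close to the diagonal direction (up to sign), the second row perpendicular to it, and the variance of the first projection much larger than that of the second. Each row plays the role of the coefficients that define one projection line.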

If you search <campbell pca> or <campbell karhunen> on this newsgroup or comp.ai.neural-nets you may find elaborations.


There's a brief statement of what PCA does in Appendix A of:

http://www.jgcampbell.com/ip/pr.pdf

Best regards,

Jon C.