Re: creating artificial dataset for nonlinear PCA

From: Gottfried Helms (helms_at_uni-kassel.de)
Date: 12/20/04


Date: Mon, 20 Dec 2004 12:40:15 +0100

On 17.12.04 23:35, Tomasz Rogala wrote:
> Gottfried Helms wrote:
>
>
>
>>If that is all you are asking, then it is simple to
>>generate such data just by combining the uncorrelated raw factor
>>data according to some desired terms (though how badly PCA fits
>>may differ from one such model to another).
>
>
> Could you give a concrete, suitable (in your opinion) example of data in
> R^3 space to be visualized in R^2, with clearly visible nonlinearities?
Hmm, I can hardly believe that I understood the question correctly,
because you even use the squaring and cubing operations yourself...
Use a random-number generator to generate 3 vectors with a normal or
uniform distribution, with mean=0 and stddev=1:
 x1, x2, x3 as vectors of length N (N cases)

Then create combinations like

 y1 = x1 + exp(x2)
 y2 = x^2 - x3
 y3 = x1*x2*x3
 y4 = x1^3 + 3*x1^2*x2 + 3*x1*x2^2 + x2^3
 ...
and center and standardize the y1 ... yv data.
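
The recipe above can be sketched in plain Python as follows. This is only
an illustration under stated assumptions: the function names make_dataset
and standardize are made up for this sketch, normal (not uniform) x's are
used, and the "x^2" in y2 is read as x2^2, which is a guess because the
index is missing in the formula.

```python
import math
import random


def standardize(values):
    """Center to mean 0 and scale to sample stddev 1."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def make_dataset(n=1000, seed=0):
    """Return ([x1, x2, x3], [y1, y2, y3, y4]) as lists of columns."""
    rng = random.Random(seed)
    # Three independent factors with mean 0 and stddev 1.
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x3 = [rng.gauss(0.0, 1.0) for _ in range(n)]

    y1 = [a + math.exp(b) for a, b in zip(x1, x2)]
    # Assumption: the unindexed "x^2" is taken to mean x2^2.
    y2 = [b ** 2 - c for b, c in zip(x2, x3)]
    y3 = [a * b * c for a, b, c in zip(x1, x2, x3)]
    # y4 is the expanded form of (x1 + x2)^3.
    y4 = [(a + b) ** 3 for a, b in zip(x1, x2)]

    ys = [standardize(col) for col in (y1, y2, y3, y4)]
    return [x1, x2, x3], ys


if __name__ == "__main__":
    xs, ys = make_dataset(n=1000, seed=0)
    print(len(ys), "variables,", len(ys[0]), "cases")
```

The standardized columns in ys can then be handed to whatever PCA routine
is at hand; the x columns are returned only so that the result can be
compared with the factors that actually generated the data.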

These y1 ... yv are built from many nonlinear combinations of the x's.
I think that with PCA you would hardly uncover this structure just by
inspecting the factor loadings...
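
One way to see why loadings can miss such structure: loadings rest on
linear correlations, and y3 = x1*x2*x3 has an expected correlation of zero
with each of x1, x2 and x3 even though it is completely determined by
them. The sketch below checks this on simulated data; the names corr and
product_correlations are hypothetical helpers, and normally distributed
x's are assumed.

```python
import math
import random


def corr(u, v):
    """Pearson correlation of two equally long lists."""
    n = len(u)
    mu = sum(u) / n
    mv = sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def product_correlations(n=20000, seed=42):
    """Correlate y3 = x1*x2*x3 with each of its three factors."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x3 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y3 = [a * b * c for a, b, c in zip(x1, x2, x3)]
    return [corr(y3, x) for x in (x1, x2, x3)]


if __name__ == "__main__":
    # All three values should be near zero, up to sampling noise.
    for name, r in zip(("x1", "x2", "x3"), product_correlations()):
        print(f"corr(y3, {name}) = {r:+.3f}")
```

A variable like this shows up in a linear analysis as if it were an
independent factor of its own, which is the kind of structure that
checking loadings alone will not reveal.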

>
>
>>Since I do not know neural network modeling, I can't give a
>>hint on how the data should be configured to discriminate most
>>sensibly between PCA and NN approaches. What type of nonlinearity
>>are NNs able to approximate best?
>
>
> I haven't explored NN-based PCA much so far, but it seems that it
> can do the same as "principal curves" (see e.g.
> www.iro.montreal.ca/~kegl/research/pcurves) or other "conventional"
> nonlinear dimensionality reduction techniques. I have tried NNs only in
> classification tasks, so my knowledge in this field is limited as well.
> This answer helped me a lot. Thanks.
> Tomasz Rogala
>
I think that with the nonlinear methods you must at least formulate a
type of model (polynomial, exponential, or otherwise), or select one
implicitly by choosing badly documented software that builds in such
assumptions, or else you have to define a training model and a way of
evaluating the results...
(Well, I'll have a look at .../pcurves to see what's going on there.)

Gottfried Helms


