Re: how to compare the Gaussian probability of data with different dimensions?



On Mar 19, 4:01 pm, "Randy Poe" <poespam-t...@xxxxxxxxx> wrote:
On Mar 19, 3:07 pm, "zl2k" <kdsfin...@xxxxxxxxx> wrote:

hi, all
Suppose I have a data set of m-dimensional data, and I have a
multivariate Gaussian (m-dimensional) to describe it. In some cases,
the covariance matrix could be singular and I have to reduce the
dimension by projecting the data to a lower dimension to calculate
the probability. My question is: how can I compare the probability
using m dimensions with that using (m-k) dimensions? (usually k=1) The
change of dimensionality is only because of the singular covariance
matrix.

I think I understand your setup, but not your specific
question. For instance, you might have a random variable
(x,y,z) but it is confined to an (unknown) plane.

What I am thinking is that the data will have a higher probability in
the lower dimension than in the higher dimension. How can I make an
adjustment to compensate for the change of dimensionality? Thanks for
the help.

I'm not sure what you mean about higher probability. Can
you give a more specific example (for example the one
I gave) and sketch out the issue you are trying to
deal with?

- Randy

Let me give a specific example. I have one multivariate Gaussian,
d=2, mean1=[0; 0], sigma1=[1 0; 0 1]. (G1)
I have another Gaussian, d=3, mean2=[0; 0; 0], sigma2=[1 0 0; 0 1 0; 0 0
1]. (G2)
Now I have x1=mean1 for G1 and x2=mean2 for G2.
Obviously, P(x1|G1) > P(x2|G2).
However, I would expect the probabilities to be roughly the same,
since if I project G2 to 2 dimensions, I get G1. Both x1 and x2 are
located at the center. The difference in probability is due to the
difference in dimension. It is not because x deviates from
the mean. It is also not because of the shape of the Gaussians, which
would be the same if they had the same dimension.
So my question is, how can I compensate for that difference so that the
probabilities obtained from different dimensions are comparable? Maybe I
should project G2 to G1? I am not even sure whether my concern makes
sense or not. Thanks for any comments.
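
To make the G1/G2 example concrete, here is a minimal sketch that evaluates both densities at their means. The helper name gaussian_density_diag is made up for illustration, and it assumes a diagonal covariance, which covers the two identity matrices in the example but not a general sigma.

```python
import math

def gaussian_density_diag(x, mean, variances):
    """Density of a Gaussian whose covariance is diagonal.

    Assumption: only the diagonal entries (variances) are needed, which
    holds for sigma1 and sigma2 in the example but not in general.
    """
    d = len(mean)
    # Squared Mahalanobis distance of x from the mean.
    quad = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, variances))
    # Log of the normalizing constant: -0.5 * (d*log(2*pi) + log|sigma|).
    log_norm = -0.5 * (d * math.log(2 * math.pi)
                       + sum(math.log(v) for v in variances))
    return math.exp(log_norm - 0.5 * quad)

# G1: d=2, evaluated at its mean.
p1 = gaussian_density_diag([0, 0], [0, 0], [1, 1])
# G2: d=3, evaluated at its mean.
p2 = gaussian_density_diag([0, 0, 0], [0, 0, 0], [1, 1, 1])

print("P(x1|G1) =", p1)
print("P(x2|G2) =", p2)
print("ratio    =", p1 / p2)
```

With identity covariances the density at the mean is (2*pi)^(-d/2), so each extra dimension divides the value by sqrt(2*pi). That factor is the dimension-only difference the example is describing. Note that these are density values, not probabilities.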

The background of my question is that I have m-dimensional data and I
can estimate a Gaussian model from it. (Let's call it datasetA.)
Given a new dataset (still m-dimensional), because of the
dependency among the variables the sigma may become singular, so I'll
get a Gaussian with fewer dimensions. (Let's call it datasetB.) Given
a data point x, I am asking the question: is x more likely to be from
datasetA or from datasetB?

zl2k
