Re: how to compare the Gaussian probability of data with different dimensions?
- From: "zl2k" <kdsfinger@xxxxxxxxx>
- Date: 20 Mar 2007 06:56:34 -0700
On Mar 20, 5:49 am, "illywhacker" <illywac...@xxxxxxxxx> wrote:
On Mar 19, 8:05 pm, "zl2k" <kdsfin...@xxxxxxxxx> wrote:
hi, all
Suppose I have a data set of m-dimensional points, and I have a
multivariate Gaussian (m-dimensional) to describe it. In some cases,
the covariance matrix can be singular, and I have to reduce the
dimension by projecting the data to a lower dimension to calculate
the probability. My question is: how can I compare the probability
computed in m dimensions with the one computed in (m-k) dimensions?
(usually k=1) The change in dimensionality is only because of the
singular covariance matrix.
What I am thinking is that the data will have a higher probability in
the lower dimension than in the higher dimension. How can I make an
adjustment to compensate for the change in dimensionality? Thanks for
the help.
zl2k
A probability is a real number: you can always compare them. The
question is what the comparison means. If you calculate the probability
that the data lies in some set in the m-dimensional space, then in the
case that the covariance is singular, you will either get zero or you
will get the same probability that you would get if you first
marginalized to the codimension-k surface and then computed the
probability of the data lying in the intersection of this surface with
your set.
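
A density analogue of this case distinction can be sketched in Python
with NumPy. This is only an illustration under stated assumptions: the
function name degenerate_gaussian_logpdf and both tolerances are
hypothetical choices, not anything from the thread. The sketch
decomposes the covariance into eigen-directions, returns -inf when the
point has a component along a zero-variance direction (the "zero" case),
and otherwise evaluates the Gaussian density on the lower-dimensional
support (the "marginalized" case).

```python
import numpy as np

def degenerate_gaussian_logpdf(x, mean, cov, tol=1e-10):
    """Log-density of N(mean, cov) on the support of a possibly singular cov.

    Returns (logpdf, rank). The density is taken with respect to Lebesgue
    measure on the rank-dimensional support, so values obtained for
    different ranks refer to different reference measures.
    Hypothetical helper; tol and the 1e-8 threshold are illustrative.
    """
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)

    # Symmetric eigen-decomposition: cov = V diag(eigvals) V^T
    eigvals, eigvecs = np.linalg.eigh(cov)
    keep = eigvals > tol * max(eigvals.max(), 1.0)
    rank = int(keep.sum())

    diff = x - mean
    # Component of x - mean along the zero-variance directions
    off_support = eigvecs[:, ~keep].T @ diff
    if np.any(np.abs(off_support) > 1e-8):
        return -np.inf, rank  # x lies off the support: density is zero

    # Coordinates within the support and the Mahalanobis distance there
    z = eigvecs[:, keep].T @ diff
    maha = float(np.sum(z ** 2 / eigvals[keep]))
    log_pdet = float(np.sum(np.log(eigvals[keep])))  # log pseudo-determinant
    logpdf = -0.5 * (rank * np.log(2.0 * np.pi) + log_pdet + maha)
    return logpdf, rank
```

For example, with cov = [[1, 1], [1, 1]] (rank 1) the point [1, -1]
lies off the support and gets -inf, while [0, 0] gets a finite
one-dimensional log-density.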
I think you need to describe what you want to achieve with this
comparison, what the context is, etc., if you want a truly useful
answer.
illywhacker;
I think what you said, "you will
get the same probability that you would get if you first marginalized
to the codimension k surface and then computed the probability of the
data lying in the intersection of this surface with your set", is what I need to know.
Let me give a specific example. I have one multivariate Gaussian,
d=2, mean1=[0; 0], sigma1=[1 0; 0 1]. (G1)
I have another Gaussian, d=3, mean2=[0; 0; 0], sigma2=[1 0 0; 0 1 0; 0 0
1]. (G2)
Now I have x1=mean1 for G1 or x2=mean2 for G2.
Obviously, P(x1|G1) > P(x2|G2).
However, I would expect the probabilities to be roughly the same,
since if I project G2 to 2 dimensions, I get G1. Both x1 and x2 are
located at the center. The difference in probability is due to the
difference in dimension. It is not because of a deviation of x from
the mean. It is also not because of the shape of the Gaussians, which
would be the same if they had the same dimension.
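
To make the dimension effect in this example concrete: the value being
compared is a density, and for a d-dimensional standard Gaussian the
density at the mean is (2*pi)^(-d/2). A minimal sketch using only the
Python standard library (the function name is a hypothetical choice for
illustration) shows about 0.159 for G1 versus about 0.0635 for G2, a
ratio of sqrt(2*pi) that comes entirely from the extra dimension:

```python
import math

def std_normal_density_at_mean(d):
    """Density of the d-dimensional standard Gaussian N(0, I) at its mean."""
    return (2.0 * math.pi) ** (-d / 2.0)

p_g1 = std_normal_density_at_mean(2)  # G1: d = 2
p_g2 = std_normal_density_at_mean(3)  # G2: d = 3

# The ratio is sqrt(2*pi): the normalizing constant of one extra
# unit-variance dimension, unrelated to how far x is from the mean.
ratio = p_g1 / p_g2
print(p_g1, p_g2, ratio)
```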
So my question is, how can I compensate for that difference so that the
probabilities obtained in different dimensions are comparable? Maybe I
should project G2 onto G1? I am not even sure whether my concern makes
sense. Thanks for any comments.
The background of my question is that I have m-dimensional data and I
can estimate a Gaussian model based on that. (Let's call it datasetA)
Given a new dataset (still m-dimensional), because of the
dependency among the variables, the sigma may become singular, so I'll
get a Gaussian with fewer dimensions. (Let's call it datasetB). Given
a data point x, I am asking the question: is x more likely to be from
datasetA or from datasetB?
zl2k