Re: Calculating joint entropy
From: Yaroslav Bulatov (yaroslavvb_at_gmail.com)
Date: 02/07/05
- Next message: Ray Koopman: "Re: If the curve looks right ...."
- Previous message: socraticquest: "Forecasting numbers based on number frequencies."
- In reply to: Richard: "Calculating joint entropy"
- Messages sorted by: [ date ] [ thread ]
Date: 6 Feb 2005 17:46:37 -0800
Entropy is a property of a probability distribution.
So to calculate entropy you first fit a distribution to your data, and
then calculate entropy of that distribution.
What Matlab probably does for entropy of 2 discrete variables is to
take the "empirical distribution", and calculate the entropy of that.
So here, the process of "fitting probability distribution to data"
basically consists of filling in the entries of the table.
So the extend the analogy, you can treat each possible setting of your
18 variables, as a unique value, and then your estimation would be
equivalent to finding entropy of 1 variable with n^18 different values.
However, with 18 variables, even if they are binary valued, the
table-based approach will contain at least 2^18 entries, hence 2^18-1
independent parameters to estimate. That's a problem because:
1) it takes a lot of memory
2) unless you have lots of data, your estimates of 2^18 parameters will
be mostly noise, so your calculated entropy will be mostly noise
So in order to get something meaningful you need to fit a distribution
that's
1) Close enough to the "true" distribution
2) Has much fewer parameters than datapoints
Picking the right distribution is mostly domain dependent, but here are
some examples of what you can do.
1) Assume all your variables are independent.
Then joint entropy will be sums of individual entropies:
H(A,B,C,D,E,F,G) = H(A)+H(B)+....+H(G)
2) Assume some limited number of dependencies. For entropy calculations
to be easy, you can do it by specifying your dependencies as a DAG. For
instance you can do the following
A->B->C->D<-E<-F<-G
Then your entropy will be
H(A,B,C,D,E,F,G) = H(A)+H(B|A)+H(C|B)+H(D|C,E)+H(E|F)+H(F|G)+H(G)
where
H(F|G) = H(F,G)-H(G) and
H(D|C,E) = H(D,C,E)-H(C,E)
These joint entropies can be calculated using the table based approach.
(ie, treat D,C,E as a single variable with 2^3 values, and calculate
entropy of that)
You can see here, that the largest table we have to deal with has 8
cells, instead having to fill entries of full joint table with 32
cells.
The total number of parameters we have to estimate with this
parametrization is
1+3+3+7+3+3+1 = 21 as opposed to 31 for the joint table, but also here,
one datapoint can be used to fill in several different cell, whereas in
the full joint approach each datapoint only affects one cells, so that
results in low accuracy in estimation.
So to answer your question, you can:
1) Treat your 18 variables as a single variable with n^18 values, and
use formula for entropy of 1 random variable
2) Assume they are all independent, and add up their individual
entropies
3) Use your domain knowledge to come up with a good parametrized model
for the variables, and calculate entropy of that (ie, a DAG)
3) will give most accurate results, 1) will give accurate results if
you have much more than n^18 datapoints, otherwise use 2)
Hope that helps.
- Next message: Ray Koopman: "Re: If the curve looks right ...."
- Previous message: socraticquest: "Forecasting numbers based on number frequencies."
- In reply to: Richard: "Calculating joint entropy"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|