Re: On Bayes



Jim Ferry wrote:
On Jun 4, 1:32 pm, Paulo Matos <pocma...@xxxxxxxxx> wrote:
Hi all,

I'm trying to work out a Bayesian probability which is getting me
confused.
Suppose I have a document of 1000 words and I'm considering 3 classes
of documents X, Y, Z (say prior probabilities are 10, 20 and 70%
respectively). I've estimated that:
Docs of type X have:
10 'bye'
15 'hello'
35 'english'
Docs of type Y have:
15 'bye'
12 'hello'
40 'english'
Docs of type Z have:
30 'bye'
18 'hello'
35 'english'

Can I compute from this the conditional probability P(doc having 27
'bye' | doc of type X)?

The problem is not well specified as it stands. You need some model
to specify the probability of getting b 'bye', h 'hello', and e
'english' in a document of type X (and similarly for Y and Z),

If I read his notation correctly, those are given.
For instance,
>> Docs of type X have:
>> 10 'bye'
>> 15 'hello'
>> 35 'english'
means P(b|X)=0.1, P(h|X)=0.15, P(e|X)=0.35
I assume he has skipped the % sign.



which
we will denote P(b,h,e|X). A natural model to use is the multinomial
distribution:

P(b,h,e|X) = pb^b ph^h pe^e (1-pb-ph-pe)^(n-b-h-e) C(n,b,h,e),

where

* n = 1000,
* pb, ph, and pe are the probabilities of getting 'bye', 'hello',
and 'english', respectively, which we set to 10/1000, 15/1000, and
35/1000 (po = 1-pb-ph-pe is the probability of any other word),
* and C(n,b,h,e) = n!/(b! h! e! (n-b-h-e)!) is the multinomial
coefficient.

You can sum over h and e to get the probability you specified:

P(b|X) = pb^b (1-pb)^(n-b) C(n,b),


I don't quite follow you here: why do you set up a binomial/multinomial
distribution if the distribution is already given (and there are no
repeated trials with it)?

P(b|X)=pb is given in the problem description.

which is simply a binomial distribution. P(27|X) = 3.6 x 10^-6, for
example, whereas P(10|X) = 0.126.
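
As a quick check of the binomial figures, here is a minimal Python
sketch. It assumes the reading used in this post (n = 1000 words, each
independently 'bye' with probability pb = 10/1000 in a type-X doc); the
names binomial_pmf and pb_x are illustrative only.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n = 1000           # words per document
pb_x = 10 / 1000   # assumed per-word probability of 'bye' in a type-X doc

p27 = binomial_pmf(27, n, pb_x)
p10 = binomial_pmf(10, n, pb_x)

print(f"P(27|X) = {p27:.2e}")
print(f"P(10|X) = {p10:.3f}")
```

The printed values should agree with the two figures quoted above up
to rounding.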

Typically, the problem of interest in this situation is classifying a
doc as type X, Y, or Z given the counts b, h, and e (or some subset of
them). Letting P0(X) be the prior probability of X, etc., we have

P(X|b,h,e) = P(X,b,h,e)/P(b,h,e) = P(b,h,e|X) P0(X) / (P(b,h,e|X)
P0(X) + P(b,h,e|Y) P0(Y) + P(b,h,e|Z) P0(Z)),

where (P0(X),P0(Y),P0(Z)) = (.1,.2,.7) in this case.

For example, if (b,h,e) = (20,15,35), then the prior-weighted
likelihoods are

P(b,h,e|X) P0(X) = 1.3 x 10^-6,
P(b,h,e|Y) P0(Y) = 3.0 x 10^-5, and
P(b,h,e|Z) P0(Z) = 4.7 x 10^-5,

whence

P(X|b,h,e) = 0.016,
P(Y|b,h,e) = 0.386, and
P(Z|b,h,e) = 0.598.
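
The classification step can be sketched in Python as follows. This is
only an illustration under the multinomial model described in this
post: it assumes the third count listed for types Y and Z refers to
'english', and the helper name log_multinomial is made up for the
example. The pmf is evaluated in log space to avoid huge factorials.

```python
from math import exp, lgamma, log

def log_multinomial(counts, probs):
    """Log of the multinomial pmf; counts and probs include the 'other' category."""
    n = sum(counts)
    out = lgamma(n + 1)
    for c, p in zip(counts, probs):
        out += c * log(p) - lgamma(c + 1)
    return out

n = 1000
b, h, e = 20, 15, 35
counts = (b, h, e, n - b - h - e)  # last entry: all other words

# Expected counts of (bye, hello, english) per 1000 words.
# Assumption: the third count for Y and Z is 'english'.
rates = {"X": (10, 15, 35), "Y": (15, 12, 40), "Z": (30, 18, 35)}
priors = {"X": 0.1, "Y": 0.2, "Z": 0.7}

# Likelihood times prior for each class, i.e. P(b,h,e|class) P0(class).
joint = {}
for cls, r in rates.items():
    probs = [x / n for x in r]
    probs.append(1 - sum(probs))  # probability of any other word
    joint[cls] = exp(log_multinomial(counts, probs)) * priors[cls]

# Normalize to get the posterior P(class|b,h,e).
total = sum(joint.values())
posterior = {cls: j / total for cls, j in joint.items()}

for cls in "XYZ":
    print(f"P({cls}|b,h,e) = {posterior[cls]:.3f}")
```

The posteriors printed here should be close to the three values listed
above.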

(Note: because the word frequencies are low relative to the size of
the text, you could also get by approximating the counts as
independent Poisson variables, which are easier to work with. E.g.,
P(b|X) = e^-10 10^b/b!, which yields P(27|X) = 4.2 x 10^-6.)
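
A minimal Python sketch of the Poisson shortcut, assuming the mean is
n*pb = 10 for 'bye' in a type-X doc (the function name poisson_pmf is
illustrative):

```python
from math import exp, lgamma, log

def poisson_pmf(k, lam):
    """Poisson probability of count k with mean lam, computed in log space."""
    return exp(-lam + k * log(lam) - lgamma(k + 1))

lam = 10.0  # n * pb = 1000 * 10/1000, assumed mean count of 'bye' in type X

p27 = poisson_pmf(27, lam)
print(f"P(27|X) ~ {p27:.2e}")
```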

-Jim Ferry
Metron, Inc.
f rr @m tsc .c m
e y e i o
