Re: Help! Probability problem
- From: michael.cary@xxxxxxxxx
- Date: 18 Aug 2006 11:57:30 -0700
C6L1V@xxxxxxx wrote:
Orngarth wrote:
Here's the problem:
You have a thousand white balls in a barrel and you randomly pick out 100 and paint one red dot on each, then put all 100 back in the barrel. You then pick out another hundred (some of them (~10%) will already have a red dot), paint a blue dot on each, then put them all back in as well. If you then pick out m balls from the barrel, how do you determine the probability of picking out n balls that have both a red and blue dot?
If you knew exactly 10 balls had both a red and blue dot, this problem would be trivial, but the number of balls with both dots is unknown (but can be described by the hypergeometric distribution, I think).
Yes. After putting back the initial 100, there are 900 unmarked ones
and 100 having a red dot. When you select another 100, the number
having a red dot on them has the hypergeometric distribution with
parameters (900,100,100), and all of these also are given a blue dot
(the others are, too, but are irrelevant to the later issue). Now, for
any given number of red-blue balls in the barrel, say r, the number of
red-blues in your sample of size m is hypergeometric, with parameters
(1000-r,r,m). Thus, if N = number of red-blues in your sample, we have
P{N = k|r} = h(k;1000-r,r,m), so
P{N=k} = sum_r h(k;1000-r,r,m) h(r;900,100,100).
You might be able to simplify this a bit by using the exact
formulae for the hypergeometric probabilities, but I doubt that the
result will be pretty.
R.G. Vickson
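For readers who want to try the sum numerically, here is a minimal Python sketch of the mixture above. It assumes the argument order h(k; unmarked, marked, sample size), which is how the parameters (900,100,100) and (1000-r,r,m) appear to be used in the post. The function names hyp_pmf and red_blue_pmf are made up for illustration and are not from the thread.

```python
from math import comb

def hyp_pmf(k, bad, good, draws):
    """h(k; bad, good, draws): probability of exactly k marked balls when
    drawing `draws` balls without replacement from `bad` unmarked and
    `good` marked balls. Argument order is an assumption, see above."""
    if k < 0 or k > good or draws - k < 0 or draws - k > bad:
        return 0.0
    return comb(good, k) * comb(bad, draws - k) / comb(good + bad, draws)

def red_blue_pmf(k, m, total=1000, red=100, blue=100):
    """P{N = k}: sum over r of h(k; total-r, r, m) * h(r; total-red, red, blue),
    where r is the unknown number of balls carrying both dots."""
    return sum(
        hyp_pmf(k, total - r, r, m) * hyp_pmf(r, total - red, red, blue)
        for r in range(min(red, blue) + 1)
    )
```

For example, red_blue_pmf(2, 50) would give the probability of exactly two red-blue balls in a sample of 50. Exact integer binomials keep this simple for a barrel of 1000, but they get slow for much larger sets.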
Thanks, that makes sense. What I really want to know is the
probability of selecting k OR MORE red/blue balls from the barrel. I
have a function that calculates the tail of a hypergeometric
distribution, so I imagine I can just apply the methodology you
described using that function, which returns P{N >= k}, instead of the
exact value function, which would return P{N = k}.
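A rough sketch of that substitution, under the same assumed argument order h(k; unmarked, marked, sample size): only the inner hypergeometric term becomes a tail probability, while the weights for the number r of doubly marked balls stay exact point probabilities. The names hyp_pmf, hyp_tail and red_blue_tail are hypothetical, and hyp_tail merely stands in for whatever tail function is already available.

```python
from math import comb

def hyp_pmf(k, bad, good, draws):
    """Exactly k marked balls in `draws` draws (bad unmarked, good marked)."""
    if k < 0 or k > good or draws - k < 0 or draws - k > bad:
        return 0.0
    return comb(good, k) * comb(bad, draws - k) / comb(good + bad, draws)

def hyp_tail(k, bad, good, draws):
    """At least k marked balls in `draws` draws; stand-in for an existing
    hypergeometric tail routine."""
    top = min(good, draws)
    # comb(bad, draws - j) is 0 whenever draws - j > bad, so no extra guard.
    ways = sum(comb(good, j) * comb(bad, draws - j)
               for j in range(max(k, 0), top + 1))
    return ways / comb(good + bad, draws)

def red_blue_tail(k, m, total=1000, red=100, blue=100):
    """Probability of k or more red-blue balls in a sample of m:
    tail for the inner term, exact pmf for the weight on r."""
    return sum(
        hyp_tail(k, total - r, r, m) * hyp_pmf(r, total - red, red, blue)
        for r in range(min(red, blue) + 1)
    )
```

Replacing both factors with tails would not give the right answer, since the second factor is the probability that exactly r balls carry both dots.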
As far as summing over r, this would be from {0 to min(number of red,
number of blue)}, right? Just want to make sure I understand it
correctly. This could get hairy, since in the problem I'm working on I
need to figure out p() for a range of reds and blues (from 1 to n)
where the total set size, n, is 20,000. If anyone knows of a more
efficient way to calculate this please let me know.
-Mike
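On the summation range: 0 to min(red, blue) works whenever red + blue does not exceed the total. In general the overlap r cannot be smaller than max(0, red + blue - total), although terms below that bound have zero weight anyway. For a set of 20,000, one possible approach (a sketch, not a claim that it is the most efficient) is to work with log-gamma instead of exact binomials and to skip values of r that cannot contribute. All names below are illustrative.

```python
from math import lgamma, exp

def log_comb(n, k):
    """log of the binomial coefficient C(n, k), for 0 <= k <= n."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def hyp_pmf(k, bad, good, draws):
    """Exactly k marked balls in `draws` draws; assumes draws <= bad + good."""
    if k < max(0, draws - bad) or k > min(good, draws):
        return 0.0
    return exp(log_comb(good, k) + log_comb(bad, draws - k)
               - log_comb(good + bad, draws))

def overlap_range(total, red, blue):
    """Smallest and largest possible number of balls with both dots."""
    return max(0, red + blue - total), min(red, blue)

def red_blue_tail(k, m, total, red, blue):
    """Probability of k or more red-blue balls in a sample of m (k >= 0)."""
    lo, hi = overlap_range(total, red, blue)
    result = 0.0
    # Fewer than k doubly marked balls can never yield k of them.
    for r in range(max(lo, k), hi + 1):
        weight = hyp_pmf(r, total - red, red, blue)
        if weight == 0.0:
            continue
        inner = sum(hyp_pmf(j, total - r, r, m)
                    for j in range(k, min(r, m) + 1))
        result += weight * inner
    return result
```

The double loop is still roughly min(red, blue) times min(r, m) terms per call. Since the weight on r is concentrated around red * blue / total, truncating the outer loop once the weights become negligible may help, at the cost of a small and controllable error.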