Re: Question about use of Poisson probabilities
- From: Jim Burns <burns.87@xxxxxxx>
- Date: Sun, 04 Feb 2007 14:17:37 -0500
Dora Smith wrote:
> My boss wants me to use Poisson probabilities to compute the
> likelihood of meeting various goals in relation to our project,
> where the average number of records per month is 617,000.
> I am having trouble getting Excel to compute probabilities as
> other than 0 or 1, though when I do examples on the web that
> involve very small numbers I get the correct answers with no
> trouble.
> Are Poisson probabilities intended for this use? Or are they
> only applicable for small numbers of discrete events, like the
> likelihood that 4 cars will run a traffic light in a day?
> What would be the appropriate statistic for the probability of
> meeting a goal of 700,000 discrete events in a month, or a few
> million in a year?
The idea of changing your units to events per micro-month (or
something like that) doesn't seem very useful to me. Since the
Poisson distribution is discrete, this model would only include
the events {0 records in Feb; 1,000,000 records in Feb;
2,000,000 records in Feb; ...}. This is probably not what you want.
My thought was to approximate the Poisson distribution by
a Gaussian with the same mean and standard deviation. This works
well for large enough numbers of events (including numbers
much smaller than 617,000). However, when I did this I realized
that the source of your problem is your model. It just has
to be incorrect.
For the Poisson distribution you describe,
\mu = 617,000
\sigma = sqrt(\mu) = 785.5
For the question "What is the probability of at least 700,000
events in a month?", one can use the standard normal cdf with
z = (700,000 - 617,000)/785.5 = 105.7
(For a little perspective, the probability that z > 6 is
about one in 10^9.) Maple will give me the tail probability
for z = 105.7, but I'm not surprised Excel won't:
prob(z > 105.7) = 1.146471703e-2427
If there is enough chance that you'll process 700,000 records
in a month that you're even asking this question, then
your model is wrong.
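
The arithmetic above can be checked with a short Python sketch. This is only an illustration, not what Maple does internally: it assumes the Gaussian approximation described above, and because the tail probability is far too small for ordinary floating point, it evaluates the standard asymptotic series P(Z > z) ~ phi(z)/z * (1 - 1/z^2 + 3/z^4) in logarithms. The function names are made up for this example.

```python
import math

def normal_upper_tail(z):
    """P(Z > z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def log10_normal_upper_tail(z):
    """log10 of P(Z > z) for large z, from the asymptotic series
    P(Z > z) ~ phi(z)/z * (1 - 1/z^2 + 3/z^4), evaluated in logs."""
    log_phi = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
    correction = 1.0 - 1.0 / z**2 + 3.0 / z**4
    return (log_phi - math.log(z) + math.log(correction)) / math.log(10.0)

mu = 617_000.0                   # Poisson mean from the question
sigma = math.sqrt(mu)            # Poisson standard deviation
z = (700_000.0 - mu) / sigma     # standard score of the 700,000 goal

log10_p = log10_normal_upper_tail(z)
exponent = math.floor(log10_p)
mantissa = 10.0 ** (log10_p - exponent)

print(f"sigma = {sigma:.1f}, z = {z:.1f}")
print(f"P(Z > 6) = {normal_upper_tail(6.0):.3g}")
print(f"direct floating-point tail at z: {normal_upper_tail(z)}")
print(f"log-space tail at z: about {mantissa:.3f}e{exponent}")
```

The direct floating-point evaluation underflows to zero, which is presumably the same kind of limit Excel runs into; working in logarithms sidesteps it.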
There are a lot of other well-studied distributions out
there, maybe even supported by Excel, and maybe even some
literature about which would be most appropriate for
your situation. I can't say anything about that deeper
than "Look on Google", or maybe "Look on Scholar.google".
If I were doing this, I would use the normal distribution
(which is nearly always not too wrong) with \mu and
\sigma taken from the sample mean and sample standard
deviation of whatever data you got 617,000 from.
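
A minimal sketch of that suggestion in Python, using the standard library's statistics module. The monthly counts below are PLACEHOLDER numbers invented purely so the example runs (they are not from the original data), and prob_at_least is a hypothetical helper name.

```python
from statistics import NormalDist, mean, stdev

def prob_at_least(goal, monthly_counts):
    """Normal-model estimate of P(records in a month >= goal)."""
    mu = mean(monthly_counts)        # sample mean
    sigma = stdev(monthly_counts)    # sample standard deviation (n - 1)
    return 1.0 - NormalDist(mu, sigma).cdf(goal)

# PLACEHOLDER data, invented for illustration only; substitute the real
# monthly record counts that the 617,000 average was computed from.
placeholder_counts = [560_000, 590_000, 617_000, 644_000, 674_000]

print(prob_at_least(700_000, placeholder_counts))
```

With real data the month-to-month spread will almost certainly be much larger than the Poisson value of 785.5, which is why this gives probabilities that are not simply 0 or 1.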
I notice, though, that you're looking for the probability
of reaching a goal. That makes me wonder whether these
statistics are being used incorrectly from the very
beginning.
The assumption behind using descriptive statistics as you
are doing is that the number of records processed per month
is *not* manipulable, that they just happen to have a certain
distribution. We can ask some questions, but the sort of
question you cannot ask is "How well did we do last
month?" The answer will always be "We did the same as we
always do; it's just that last month we were lucky
(or unlucky; it doesn't matter)." The assumption
that you did what you always do is built into the way
the question was asked.
Some more valuable statistics might be the correlations
between whatever you think *might* cause you to have
more records or fewer records processed and the numbers
actually processed.
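
As a rough illustration of that kind of statistic, here is a small Python sketch of the Pearson correlation coefficient. The candidate factor (staff on duty) and all the numbers are hypothetical, chosen only to show the calculation; pearson_r is a name made up for this example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# HYPOTHETICAL data, invented for illustration only: one candidate
# factor per month, paired with the records processed that month.
staff_on_duty = [10, 12, 11, 14, 13]
records_processed = [560_000, 617_000, 590_000, 674_000, 644_000]

print(pearson_r(staff_on_duty, records_processed))
```

A value near +1 or -1 would suggest the factor moves with the record counts; a value near 0 would suggest it does not, at least not linearly. Correlation alone does not show that the factor causes the change.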
Jim Burns
(Disclaimer: I am not a mathematician. If you're planning
to tell your boss he's an idiot for asking what he asked,
you need some heavier-weight support than me.)