Re: Estimation of variance of proportions
- From: David Winsemius <doe_snot@xxxxxxxxxxx>
- Date: Sat, 24 Jun 2006 16:32:29 -0500
"Anon." <bob.ohara@xxxxxxxxxxxxxxxxx> wrote in
David Winsemius wrote:
<aubertot@xxxxxxxxxxxxxxx> wrote in
news:11931925.1151133920157.JavaMail.jakarta@xxxxxxxxxxxxxxxxxxxxxx:
And then you plug in p=0... :-)
Hi everyone,
I would like to estimate the variance of estimated proportions. I
think that it is generally correct to estimate the variance by
VAR^=np^(1-p^), where p^ is the estimated proportion and n the
sample size. However, when p^=0, this leads to an estimated variance
VAR^=0, which puzzles me because the variance does not depend
anymore on n. It is quite different to obtain p^=0 with n=2, or with
n=100000 for instance. Does anyone know how to estimate the variance
of a proportion when the estimated proportion is zero?
If you _know_ that p or (1-p) is zero, then the variance _is_ zero.
If you do not know that p=0, then why would you torture the formula
into giving you a meaningless estimate?
If you think that the proportion of successes, hits or events is not
necessarily zero, then you should not use zero, but rather a small
non-zero estimate. In that instance, the Poisson distribution and
associated methods might be mathematically convenient. The variance
of the Poisson equals the expected value which is np. The
reasonableness of the Poisson approximation (with large n) to the
binomial is easy to see, since np(1-p) will approach np because (1-p)
is near 1.
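
A small numerical check of the point above, as a sketch in Python rather than R. The values of n and p are illustrative assumptions, not figures from this thread; the function names are made up for the example.

```python
# Compare the binomial variance n*p*(1-p) with the Poisson variance n*p.
# The ratio of the two is exactly (1 - p), so it tends to 1 as p shrinks.

def binomial_variance(n, p):
    """Variance of a Binomial(n, p) count."""
    return n * p * (1 - p)

def poisson_variance(n, p):
    """Variance of a Poisson count with mean n*p."""
    return n * p

if __name__ == "__main__":
    n = 1000  # illustrative sample size
    for p in (0.1, 0.01, 0.001):  # illustrative small proportions
        b = binomial_variance(n, p)
        q = poisson_variance(n, p)
        print(f"p={p}: binomial={b:.4f}  Poisson={q:.4f}  ratio={b / q:.4f}")
```

The smaller p is, the closer the ratio gets to 1, which is the sense in which the Poisson variance stands in for the binomial one.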
One can calculate the 95% confidence interval for such a proportion by
plotting the likelihood against p, and then finding the 95% quantile
(the lower limit will be zero: you can calculate a symmetric CI, but
it doesn't make sense as it doesn't include the point estimate). If
the proportion comes from a series of Bernoulli trials, then the
likelihood, viewed as a function of p, has the form of a Beta density,
so a good stats package will be able to calculate it.
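
One way to read the likelihood suggestion above, sketched in Python under a stated assumption: the likelihood for zero successes in n Bernoulli trials is (1-p)^n, and normalising it over [0, 1] (equivalent to assuming a flat prior, which the post does not specify) gives a Beta(1, n+1) density whose quantiles have a closed form. The function name is hypothetical.

```python
# Upper limit for p after observing 0 successes in n trials, taken as a
# quantile of the normalised likelihood (1-p)^n, i.e. a Beta(1, n+1) density.
# ASSUMPTION: normalising the likelihood over [0, 1] (a flat prior on p).
# The Beta(1, n+1) CDF is 1 - (1-p)^(n+1), so the quantile is explicit.

def likelihood_upper_limit(n, level=0.95):
    """Return the `level` quantile of Beta(1, n+1); the lower limit is 0."""
    return 1.0 - (1.0 - level) ** (1.0 / (n + 1))

if __name__ == "__main__":
    for n in (2, 10, 100, 100000):
        print(f"n={n}: 95% upper limit on p = {likelihood_upper_limit(n):.6g}")
```

Unlike the plug-in variance, this limit shrinks as n grows, so zero out of 2 and zero out of 100000 give very different answers.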
It did independently occur to me that we could offer the OP a discussion
of confidence limits around an observed value of zero. So I (re)post
with full knowledge that I may be straying over the Uebersax Line by
responding to what I thought the OP should have asked.
One fairly widely known quick approximation for the upper 95% confidence
bound on p from a sample that produced zero events out of n cases is 3/n
(the "rule of three").
Another formula for the upper limit on p for zero observations in n
cases is 1-alpha^(1/n), the value of p at which (1-p)^n = alpha. It is
possibly more accurate and certainly more flexible, since you can vary
alpha; with alpha=0.05 it is close to 3/n because -ln(0.05) is about 3.
These are bounds on the binomial parameter p, not p-values.
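
The two quick bounds can be compared side by side. This is a minimal Python sketch; the function names and the sample sizes are illustrative choices, not anything from the posts.

```python
# Upper bounds on p after observing 0 events in n cases.

def rule_of_three(n):
    """Quick approximate upper 95% bound: 3/n."""
    return 3.0 / n

def zero_event_upper_bound(n, alpha=0.05):
    """Solve (1-p)^n = alpha for p, giving 1 - alpha^(1/n)."""
    return 1.0 - alpha ** (1.0 / n)

if __name__ == "__main__":
    for n in (2, 10, 100, 100000):  # illustrative sample sizes
        print(f"n={n}: 3/n = {rule_of_three(n):.6g}, "
              f"1-alpha^(1/n) = {zero_event_upper_bound(n):.6g}")
```

For small n the 3/n figure can exceed 1 and so is not usable as a bound on a proportion, while 1-alpha^(1/n) always stays below 1.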
Then there are the exact tests. In an R session you can get a variety of
such estimates.
The Hmisc package (many thanks to Frank Harrell) has binconf(obs,n), which
defaults to the Wilson method; the R stats package has binom.test(obs,n),
which reports the Clopper and Pearson interval; and x <- 1-alpha^(1/n) would
give you a quick obs=0 approximation.
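
For readers without R at hand, the two interval methods named above can be sketched in plain Python. This is not the Hmisc or stats code: it is an independent implementation of the textbook Wilson score formula and of the Clopper-Pearson definition (solved here by bisection), with hypothetical function names, and it has not been checked against the R output. Note that a two-sided 95% Clopper-Pearson interval puts alpha/2 in each tail, so for obs=0 its upper limit is 1-(alpha/2)^(1/n) rather than 1-alpha^(1/n).

```python
from math import comb, sqrt
from statistics import NormalDist

def wilson_interval(x, n, alpha=0.05):
    """Two-sided Wilson score interval for x successes in n trials."""
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    phat = x / n
    denom = 1.0 + z * z / n
    centre = (phat + z * z / (2.0 * n)) / denom
    half = z * sqrt(phat * (1.0 - phat) / n + z * z / (4.0 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p); fine for small x, not tuned for speed."""
    return sum(comb(n, k) * p**k * (1.0 - p) ** (n - k) for k in range(x + 1))

def _root_of_decreasing(f, iters=100):
    """Bisection on [0, 1] for a function that falls from positive to negative."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def clopper_pearson_interval(x, n, alpha=0.05):
    """Two-sided Clopper-Pearson (exact) interval with alpha/2 in each tail."""
    # Lower limit: p at which P(X >= x | p) = alpha/2 (0 when x == 0).
    lower = 0.0 if x == 0 else _root_of_decreasing(
        lambda p: alpha / 2.0 - (1.0 - binom_cdf(x - 1, n, p)))
    # Upper limit: p at which P(X <= x | p) = alpha/2 (1 when x == n).
    upper = 1.0 if x == n else _root_of_decreasing(
        lambda p: binom_cdf(x, n, p) - alpha / 2.0)
    return lower, upper

if __name__ == "__main__":
    for n in (10, 100):  # illustrative sample sizes with zero events
        print(n, "Wilson:", wilson_interval(0, n),
              "Clopper-Pearson:", clopper_pearson_interval(0, n))
```

Both intervals have a lower limit of zero when obs=0 and an upper limit that depends on n, which is the behaviour the original question was looking for.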
Agresti has a website that includes some other alternatives written in R:
http://www.stat.ufl.edu/~aa/cda/R/one_sample/R1/
The SAS exact single sample binomial CI estimates (I have read) are based
on the Clopper/Pearson formula.
--
David Winsemius