Re: Unusual formulae for confidence intervals
- From: Art Kendall <Arthur.Kendall@xxxxxxxxxxx>
- Date: Thu, 23 Nov 2006 03:08:07 GMT
There is only one place in public policy applications where I have seen something like this. In planning the sample size for a random sample study, 50% (.5) is used as the working population proportion if there is no other information about what it might be. This is used in conjunction with a desired bound on error, commonly 5 or 3 percentage points, to get the number of cases to select into the sample. The bound on error is the t value for the 95% confidence level multiplied by the standard error.
At the analysis phase of a sample-based study, after the data have been gathered, the confidence interval is found around the obtained percentage using the normal, binomial, or Poisson distribution as needed.
Perhaps there was confusion between designing a study and analyzing data after it has been gathered?
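The planning calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name planning_sample_size is made up for this example, simple random sampling from a large population is assumed, and the normal critical value stands in for t, which is the usual large-sample shortcut at the design stage.

```python
import math
from statistics import NormalDist


def planning_sample_size(margin, confidence=0.95, p=0.5):
    """Cases needed so that (critical value * standard error) <= margin.

    Hypothetical helper for illustration. p=0.5 is the working proportion
    used when nothing is known in advance; it maximizes p * (1 - p), so it
    gives the largest (most conservative) sample size.
    """
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Solve z * sqrt(p * (1 - p) / n) <= margin for n, then round up.
    return math.ceil((z / margin) ** 2 * p * (1 - p))


print(planning_sample_size(0.05))  # bound of 5 percentage points -> 385
print(planning_sample_size(0.03))  # bound of 3 percentage points -> 1068
```

If prior information suggests a proportion away from 0.5, passing it as p gives a smaller required sample, which is why 0.5 is only the fallback.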
Art Kendall
Social Research Consultants
Stephen J. Herschkorn wrote:
One of my tutees is taking a course in statistics for public policy. I think this is my first such client. (I have had other students in psychology, sociology, and public health.)
As we were discussing confidence intervals, he pointed out two formulae in his text and notes. I had never encountered these before.
- For a population proportion, use as the standard error 0.5 / sqrt(n), where n is the sample size. The justification was that since we do not know the population proportion pi, use pi = 0.5 for a conservative interval estimate. Most formulae I have seen substitute the sample proportion Q (my notation) to get sqrt(Q (1-Q) / n) for the standard error. If one is going to be a stickler, one can solve the quadratic inequality
    -z <= (Q - pi) / sqrt(pi (1 - pi) / n) <= z
  for explicit bounds on pi. So using 0.5 seems silly to me.
- For a population mean with a large sample size n, use S / sqrt(n-1), where S is the sample standard deviation, as the standard error with a normal distribution. I have always seen S / sqrt(n). Personally, I always use S / sqrt(n) with the t distribution, since computers can compute t for any degrees of freedom. (Oddly, though the textbook discusses hypothesis testing with small samples, it does not discuss confidence intervals with small samples.)
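For readers who want to compare the formulae in the two items above numerically, here is a rough Python sketch. The function names and the sample figures are invented for illustration. Solving the quadratic inequality yields what is usually called the Wilson or score interval. The last function checks one possible reading of the n-1 formula, namely that S is computed with divisor n, in which case S / sqrt(n-1) equals the usual n-1 standard deviation divided by sqrt(n); nothing in this thread confirms that the textbook defines S that way.

```python
import math
from statistics import NormalDist, pstdev, stdev


def proportion_intervals(q, n, confidence=0.95):
    """Three interval estimates for pi, given sample proportion q and size n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)

    # Textbook version: standard error 0.5 / sqrt(n), i.e. pi fixed at 0.5.
    half_conservative = z * 0.5 / math.sqrt(n)

    # Usual version: plug the sample proportion into the standard error.
    half_plug_in = z * math.sqrt(q * (1 - q) / n)

    # "Stickler" version: solve -z <= (q - pi) / sqrt(pi (1 - pi) / n) <= z
    # for pi. The two roots of the resulting quadratic are the bounds.
    denom = 1 + z * z / n
    centre = (q + z * z / (2 * n)) / denom
    half_score = z * math.sqrt(q * (1 - q) / n + z * z / (4 * n * n)) / denom

    return {
        "conservative": (q - half_conservative, q + half_conservative),
        "plug_in": (q - half_plug_in, q + half_plug_in),
        "score": (centre - half_score, centre + half_score),
    }


def mean_standard_errors(data):
    """Standard error of the mean under the two divisor conventions."""
    n = len(data)
    return {
        # S with divisor n - 1, divided by sqrt(n): the familiar form.
        "s_over_sqrt_n": stdev(data) / math.sqrt(n),
        # S with divisor n, divided by sqrt(n - 1): algebraically the same.
        "S_over_sqrt_n_minus_1": pstdev(data) / math.sqrt(n - 1),
    }


# Made-up example values, not taken from the thread.
for name, (lo, hi) in proportion_intervals(0.30, 100).items():
    print(f"{name:>12}: ({lo:.4f}, {hi:.4f})")
print(mean_standard_errors([4.1, 5.0, 3.8, 6.2, 5.5, 4.9]))
```

The conservative interval is never narrower than the plug-in interval, since q (1 - q) is at most 0.25; the two coincide only when the sample proportion is exactly 0.5.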
Have you seen these practices elsewhere? Are these conventions peculiar to public policy?
On another matter, didn't there used to be a newsgroup named sci.stat? That's where I was going to post this.
And, in case you are wondering, here is the reason I use "Q" for sample proportion. I prefer to use lower-case Greek letters for parameters and capital Latin letters for statistics. I rule out "P" for proportion since "P" is used to mean probability. Hence, I use the next alphabetical letter.