Re: Unusual formulae for confidence intervals



I have seen something like this in only one place in public policy applications. When planning the sample size for a random-sample study, 50% (0.5) is used as the working population proportion if there is no other information about what it might be. This is combined with a desired bound on error, commonly 5 or 3 percentage points, to get the number of cases to select into the sample. The bound on error is the t value for a 95% confidence level times the standard error.
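A rough sketch of that planning calculation, for illustration only. It assumes the usual large-sample shortcut of a normal quantile (about 1.96) in place of t, and simple random sampling with no finite population correction. The function name and defaults are hypothetical.

```python
import math
from statistics import NormalDist


def planning_sample_size(margin, p=0.5, confidence=0.95):
    """Cases needed so that z * sqrt(p * (1 - p) / n) <= margin.

    Assumptions (not from the original post): a normal quantile is used
    instead of t, and sampling is simple random with no finite
    population correction.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Solve z * sqrt(p * (1 - p) / n) = margin for n, then round up.
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Working proportion 0.5 with bounds of 5 and 3 percentage points.
print(planning_sample_size(0.05))
print(planning_sample_size(0.03))
```

Because p (1 - p) is largest at p = 0.5, any other working proportion gives a smaller required n, which is why 0.5 is the safe planning value when nothing else is known.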

At the analysis phase of a sample-based study, after the data have been gathered, the confidence interval is found around the obtained percentage using the normal, binomial, or Poisson distribution as needed.

Perhaps there was confusion between designing a study and analyzing data after it has been gathered?

Art Kendall
Social Research Consultants



Stephen J. Herschkorn wrote:

One of my tutees is taking a course in statistics for public policy. I think this is my first such client. (I have had other students in psychology, sociology, and public health.)

As we were discussing confidence intervals, he pointed out two formulae in his text and notes. I had never encountered these before.

- For a population proportion, use as the standard error 0.5 / sqrt(n), where n is the sample size. The justification was that since we do not know the population proportion pi, use pi = 0.5 for a conservative interval estimate. Most formulae I have seen substitute the sample proportion Q (my notation) to get sqrt(Q (1-Q) / n) for the standard error. If one is going to be a stickler, one can solve the quadratic inequality
      -z <= (Q - pi) / sqrt( pi (1- pi) / n) <= z
  for explicit bounds on pi. So using 0.5 seems silly to me.

- For a population mean with a large sample size n, use S / sqrt(n-1), where S is the sample standard deviation, as the standard error with a normal distribution. I have always seen S / sqrt(n). Personally, I always use S / sqrt(n) with the t distribution, since computers can compute t for any degrees of freedom. (Oddly, though the textbook discusses hypothesis testing with small samples, it does not discuss confidence intervals with small samples.)
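To make the two comparisons above concrete, here is a minimal sketch, not taken from the textbook or from either poster. The function names and sample numbers are hypothetical. The first function computes the interval three ways: with 0.5 / sqrt(n), with the plug-in sqrt(Q (1-Q) / n), and by solving the quadratic inequality for pi. The second function checks one possible reading of S / sqrt(n-1): if S were computed with divisor n (an assumption, since the post does not say how the text defines S), then S / sqrt(n-1) equals the familiar S / sqrt(n) with the divisor n-1 version of S.

```python
import math
import statistics
from statistics import NormalDist


def proportion_intervals(q, n, confidence=0.95):
    """Three normal-theory intervals for a proportion, given sample proportion q."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se_conservative = 0.5 / math.sqrt(n)      # pi = 0.5 plugged in
    se_plugin = math.sqrt(q * (1 - q) / n)    # sample proportion plugged in
    # Roots in pi of (q - pi)^2 = z^2 * pi * (1 - pi) / n.
    denom = 1 + z ** 2 / n
    center = (q + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(q * (1 - q) / n + z ** 2 / (4 * n ** 2)) / denom
    return {
        "conservative": (q - z * se_conservative, q + z * se_conservative),
        "plug_in": (q - z * se_plugin, q + z * se_plugin),
        "quadratic": (center - half, center + half),
    }


def mean_standard_errors(data):
    """Return (S_n / sqrt(n-1), S_{n-1} / sqrt(n)) for the same data.

    Assumption: S_n is the standard deviation with divisor n and
    S_{n-1} the one with divisor n - 1.
    """
    n = len(data)
    return (statistics.pstdev(data) / math.sqrt(n - 1),
            statistics.stdev(data) / math.sqrt(n))


# Hypothetical numbers: 20 successes in a sample of 100.
for name, (low, high) in proportion_intervals(0.2, 100).items():
    print(f"{name}: ({low:.4f}, {high:.4f})")

print(mean_standard_errors([2.0, 4.0, 4.0, 5.0, 7.0, 9.0]))
```

The 0.5-based interval is never narrower than the plug-in one, and the two coincide only when Q = 0.5. Under the divisor-n assumption the two standard errors of the mean are algebraically identical, so the difference would be one of notation only.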

Have you seen these practices elsewhere? Are these conventions peculiar to public policy?

On another matter, didn't there used to be a newsgroup named sci.stat? That's where I was going to post this.

And, in case you are wondering, here is the reason I use "Q" for sample proportion. I prefer to use lower-case Greek letters for parameters and capital Latin letters for statistics. I rule out "P" for proportion since "P" is used to mean probability. Hence, I use the next alphabetical letter.
