Shotgun statistics
RossClement_at_gmail.com
Date: 01/24/05
Date: 24 Jan 2005 04:33:17 -0800
Hi. I've been thinking about the problem of using a confidence level of
95% for deciding whether to reject a null hypothesis. I've always
assumed that this would mean that 5% of experiments that should show no
effect would show an effect by chance. However, if I try to calculate
the number of hypotheses that need to be considered before we have a
>50% chance of finding a spurious effect, I get the following.
(i) Given a 95% confidence level for rejecting the null hypothesis, and
a proper alternate hypothesis which is the true negation of the null
hypothesis, then I assume that the chance of incorrectly rejecting a
true null hypothesis is 0.05, and the probability of correctly failing
to reject it is 0.95.
(ii) Let's assume that we have a single set of data, and a large number
of null and alternate hypotheses that can be investigated (e.g. the
astrobank dataset used for investigating astrology). Let's also assume
that all of the alternate hypotheses are spurious (i.e., in this
example, that "astrology is bunkum"). If we keep on choosing hypotheses
and investigating their statistical significance, then the probability
will get higher and higher that we will find some spurious hypothesis
that "gets lucky" and comes out significant.
(iii) The probability that at least one such hypothesis comes up
significant out of N is one minus the probability that none of them do.
Assuming that the hypothesis tests are independent random events, the
probability that none of them comes up significant is:
(0.95)^N
(iv) A quick calculator check shows that (0.95)^N is < 0.5 for N>=14
((0.95)^13 is about 0.513, (0.95)^14 is about 0.488). Hence, if I use a
95% confidence level, and choose 14 or more such alt/null hypothesis
pairs, the probability that I get at least one improper rejection of
the null hypothesis is better than even.
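For anyone who wants to check the arithmetic in (iii) and (iv) without a
calculator, here is a minimal sketch. It assumes, as above, that every
null hypothesis is true and that the tests are independent; the function
names are just illustrative, not taken from any library.

```python
# Assumption: each test has a 0.05 chance of a false positive, and the
# tests are independent, as stated in steps (i) and (iii).
ALPHA = 0.05


def prob_at_least_one_false_positive(n, alpha=ALPHA):
    """Chance that at least one of n independent true-null tests rejects."""
    return 1.0 - (1.0 - alpha) ** n


def smallest_n_above(threshold=0.5, alpha=ALPHA):
    """Smallest n whose chance of a spurious rejection exceeds threshold."""
    n = 1
    while prob_at_least_one_false_positive(n, alpha) <= threshold:
        n += 1
    return n


if __name__ == "__main__":
    for n in (1, 5, 13, 14, 20):
        print(n, round(prob_at_least_one_false_positive(n), 4))
    print("smallest N with better-than-even chance:", smallest_n_above())
```

If the tests are run on the same data they are unlikely to be fully
independent, so the figures from this sketch should be read as a rough
guide rather than an exact answer.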
Is my reasoning and calculation correct?
Note: this is not a homework problem.
Cheers,
Ross-c