Re: JACK TOMSKY: a HISTORY of nonsense



For Richard Atkin’s benefit (from the Web)


*** INTERPRETING FAILURE TO REJECT A NULL HYPOTHESIS

David F. Parkhurst, School of Public and Environmental Affairs, and Biology Department, Indiana University, Bloomington, IN 47405

Like many members of the Ecological Society, I review manuscripts for Ecology from time to time. As a teacher of biostatistics courses, I pay special attention to applications of statistics as I do so, and because of a personal interest in how scientists interpret statistical results, that is what I attend to most. Uncomfortably often, I review papers in which the statistical analyses seem to be misinterpreted. The most recent paper sent to me was one of these, and it has inspired this note. The points I will make seem obvious to me, but there may be ecologists who disagree with them. If so, I hope they will respond in this Bulletin, since the issues are important ones.
My argument is with the interpretation, in hypothesis testing, that failure to reject the null hypothesis is proof of the null hypothesis. Failing to reject a null hypothesis is distinctly different from proving a null hypothesis; the difference in these interpretations is not merely a semantic point. Rather, the two interpretations can lead to quite different biological conclusions, as will be seen below.
Consider the following example, which I have made up to illustrate several points. The example is logically similar to the statistical analysis in the manuscript referred to above, but the biological content has been changed to prevent identification of that paper.

. . . .
Discussion
The manuscript that elicited this note did not present the data nor the details of any statistical tests, and it was based on a different set of biological materials. But it did contain a statement that was logically equivalent to the following interpretation of the tests above (the italics indicate direct quotation; the rest is paraphrased to fit the example):

“Since measurements of seed areas showed no statistical difference between species (data not presented), any difference in the distance their seeds may be transported by the wind must be due entirely to weight effects.”

Such a biological interpretation of statistical results like these is obviously unwarranted. Consider the following problems with that interpretation.

First, as is recognized in almost all statistics texts, there is little or no relationship between the biological importance and the statistical significance of a given result. For example, in the constructed maple seed data, the mean masses differed by only 13%, whereas the mean areas differed by 56%. The masses were statistically detectable as different because of the within-species similarity, but the areas were not because of their great variability. Yet the large difference in average areas would probably be of greater ecological importance than the smaller difference in masses.
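
The contrast between a small but detectable difference and a large but undetectable one can be sketched numerically. The seed values below are invented for illustration only (the original Table 1 is not reproduced here); they were chosen so that mean mass differs by 13% and mean area by 56%, as in the text. The test used, an exact two-sided permutation test on the difference in means, is an assumption of this sketch and not necessarily the test applied in the original example.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(a, b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(a) + list(b)
    total = sum(pooled)
    n_a, n_b = len(a), len(b)
    observed = abs(mean(a) - mean(b))
    hits = count = 0
    # Enumerate every way of relabelling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-9:  # small tolerance for float rounding
            hits += 1
    return hits / count


# Hypothetical data (NOT from the original paper): arbitrary units.
mass_a = [98, 99, 100, 101, 102]   # low within-species variability
mass_b = [111, 112, 113, 114, 115]
area_a = [40, 60, 70, 150, 180]    # high within-species variability
area_b = [50, 80, 90, 260, 300]

pct_mass = 100 * (mean(mass_b) - mean(mass_a)) / mean(mass_a)
pct_area = 100 * (mean(area_b) - mean(area_a)) / mean(area_a)
p_mass = exact_permutation_p(mass_a, mass_b)
p_area = exact_permutation_p(area_a, area_b)

print(f"mass: difference {pct_mass:.0f}%, p = {p_mass:.4f}")
print(f"area: difference {pct_area:.0f}%, p = {p_area:.4f}")
```

With these made-up numbers the 13% mass difference is rejected at the 0.05 level while the 56% area difference is not, which is exactly the pattern the paragraph above warns against over-interpreting.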

Second, failure to reject a null hypothesis can (and very often does) result as easily from an inadequate experiment as from lack of a large or important effect. Such inadequacy can result from a combination of small sample size, careless control of extraneous factors, and measurement error, as well as from intrinsic variability in the phenomenon under study.
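
How easily an inadequate experiment fails to reject can be made concrete with a rough power calculation. The sketch below assumes normally distributed data, a known common standard deviation, equal sample sizes, and a two-sided two-sample z-test; these are simplifying assumptions of the illustration, and the numbers (a true difference of 56 units, standard deviations of 90 and 45) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(delta, sigma, n, alpha=0.05):
    """Power of a two-sided two-sample z-test with n observations per group.

    delta: true difference between population means
    sigma: common standard deviation (assumed known)
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # True difference expressed in standard-error units.
    shift = abs(delta) / (sigma * sqrt(2 / n))
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail


# Hypothetical scenario: a real difference of 56 units exists in every case.
for sigma in (90, 45):
    for n in (5, 20, 50):
        power = z_test_power(delta=56, sigma=sigma, n=n)
        print(f"sigma={sigma:3d}  n={n:3d}  power={power:.2f}")
```

Under these assumptions, a small sample combined with high variability leaves the test unlikely to detect a difference that is really there, so a non-significant result says little about whether the effect is absent.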

Third, regardless of the results of any hypothesis tests, the sample means for the masses and areas of the two species are the best estimates available from the data for the true population values of those quantities. In fact, parameter estimation with accompanying confidence limits is often a more logical way to interpret data than is hypothesis testing.
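
As an illustration of estimation with confidence limits, the sketch below computes a 95% confidence interval for a difference in mean seed area. The data are invented for this illustration (they are not the values from Table 1), and the interval assumes roughly normal data with equal variances in the two species. The critical value 2.306 is the standard tabulated 97.5th percentile of Student's t with 8 degrees of freedom, which matches two samples of five.

```python
from math import sqrt
from statistics import mean, variance

# Tabulated t critical value for a 95% interval with 8 degrees of freedom.
T_CRIT_975_DF8 = 2.306


def pooled_ci_diff(a, b, t_crit):
    """Confidence interval for mean(b) - mean(a), pooled-variance form."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    diff = mean(b) - mean(a)
    return diff - t_crit * se, diff + t_crit * se


# Hypothetical seed areas (arbitrary units), five seeds per species.
area_a = [40, 60, 70, 150, 180]
area_b = [50, 80, 90, 260, 300]

lo, hi = pooled_ci_diff(area_a, area_b, T_CRIT_975_DF8)
print(f"estimated difference: {mean(area_b) - mean(area_a):.1f}")
print(f"95% CI: ({lo:.1f}, {hi:.1f})")
```

For these made-up values the interval contains zero, but it also contains differences larger than the mean area of species A itself. Reported this way, the data plainly do not show that the areas are equal; they show only that the difference has been estimated very imprecisely.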

Because of their apparent biological importance, the area data in Table 1 suggest the need to obtain a larger sample, if one feels compelled to test hypotheses. Indeed, one might work according to a set of decision rules like those shown in Table 2. Of course, deciding on the degree of biological importance requires subjective scientific judgment, which some workers would rather not face.
The main point is that the above data certainly do not prove the mean seed areas of populations A and B to be identical, and indeed, no statistical test could possibly prove such an assertion. In other words, failure to disprove a null hypothesis does not prove that null hypothesis. Put another way, one should not think of “accepting the null hypothesis,” but rather of failure to reject it. The greater accuracy of the latter interpretation far outweighs the fact that it is a more cumbersome
. .
Unfortunately, the notion of accepting the null hypothesis is suggested by at least one important textbook (Sokal and Rohlf 1981, e.g., pages 172, 190, and 224), and the idea seems to be widely held. (For example, I frequently find graduate students who learned that notion as undergraduates.) In the interest of scientific progress, it is time to reject this interpretation of hypothesis-test results. ***


I DO NOT comment. NO NEED

_________licas (Luis A. Afonso)