Re: Goodness of fitting of a distribution
- From: "Reef Fish" <large_nassua_grouper@xxxxxxxxx>
- Date: 9 Nov 2006 23:06:06 -0800
Herman Rubin wrote:
In article <1163112757.733760.179870@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>,
Reef Fish <large_nassua_grouper@xxxxxxxxx> wrote:
Herman Rubin wrote:
In article <1162867197.824670.102540@xxxxxxxxxxxxxxxxxxxxxxxxxxx>,
Reef Fish <large_nassua_grouper@xxxxxxxxx> wrote:
Beliavsky wrote:
nelson wrote:
hi all!
I have done some fit testing on a dataset. A quantile-quantile
plot suggests that the distribution that best fits my data is a
linear combination of a Weibull and a normal distribution. How can I
get a theoretical test that confirms it? The people I work with
want to see numbers, not only Q-Q plots. And they don't like the sum of
squared errors...
You can use the Kolmogorov-Smirnov test of goodness-of-fit
The Kolmogorov-Smirnov statistic is NOT a "goodness of fit" statistic.
It is the maximum absolute difference between a theoretical cdf and
an empirical cdf. It is a statistic sometimes used to measure
the DEPARTURE from a given cdf, rather than a "goodness of
fit".
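To make the definition above concrete, here is a minimal sketch of the one-sample statistic, assuming a continuous theoretical cdf. The names ks_statistic and uniform_cdf are made up for illustration and do not come from any package.

```python
def ks_statistic(sample, cdf):
    """Return D_n = sup_x |F_n(x) - F(x)| for a continuous theoretical cdf F."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical cdf jumps from i/n to (i+1)/n at x,
        # so the supremum must be checked on both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def uniform_cdf(x):
    """cdf of U(0,1), used here only as an example null distribution."""
    return min(1.0, max(0.0, x))


d = ks_statistic([0.1, 0.4, 0.7], uniform_cdf)
print(d)
```

Note that the result is a single number taken at the single point of largest discrepancy; everything else about the two curves is discarded.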
I do not know of any "goodness of fit" test which is
not a "badness of fit" test.
Then perhaps you don't know as many tests as you think, and
you also over-value the K-S stat which examines ONE value
(the max) in the difference between two cdfs.
This does not make it a bad test. Look at my paper in
the last Berkeley Symposium.
I don't need to read your Berkeley Symposium paper to know that the
K-S is a bad test for DATA ANALYSTS like myself, for the reason
stated below.
It is a TERRIBLE measure of "goodness of fit" because it looks
at only the point of MAXIMUM discrepancy.
Terrible? Very definitely NOT. In a given situation,
there may well be better tests, but it has comparable
power to parametric tests, and it is a universal test.
It is the chi-squared test with many classes which has
little power, and the combination of local discrepancies
with the same direction adds greatly to the power. The
maximum takes advantage of this.
You found one that is WORSE than the K-S test. :-)
And even THAT is not strictly worse except against LONG-tailed
distributions, where a Chi-square goodness of fit
test necessarily lumps the information in the TAILS that is
most telling into the end bins.
No, it is the large number of bins which reduces the
power. Also, the chi-squared test ignores the order
of the cells; that the larger cells are close together
generally provides more information than the individual
cell differences. I have found far-out p values in
DISCRETE problems, where K-S is conservative, and
chi-squared finds nothing. These were two-sample
tests, but the principle is the same.
Those are two separate issues. The chi-square is poor
against long-tailed distributions because of lumping tail
observations into the end bin.
It's not bad at all for testing U(0,1) distribution because
binning is not a problem since all bins are uniformly
distributed.
Why should the order of the cells matter for testing the
uniformity of a U(0,1) distribution? For testing uniform
random numbers, I think the Chi-square test is on
everyone's list of tests, while the K-S is on none, to the
best of my recollection.
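As a rough sketch of the equal-bin chi-square statistic for U(0,1) discussed here: the function name and the choice of k are assumptions for illustration, and no p-value is computed (the statistic would be referred to a chi-square distribution with k - 1 degrees of freedom).

```python
def chi_square_uniform(sample, k):
    """Pearson chi-square statistic for U(0,1) data using k equal-width bins."""
    n = len(sample)
    counts = [0] * k
    for u in sample:
        # Map u in [0, 1] to a bin index; u == 1.0 goes into the last bin.
        counts[min(int(u * k), k - 1)] += 1
    expected = n / k  # every bin has the same expected count under U(0,1)
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat, counts


sample = [0.05, 0.15, 0.25, 0.35, 0.95, 0.96, 0.97, 0.98]
stat, counts = chi_square_uniform(sample, 4)
print(stat, counts)
```

Permuting the entries of counts leaves stat unchanged, which is the "order of the cells" point raised above.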
If scale is an important concern, the Kuiper test
is better, and this only looks at the deviation at
TWO points, but the two points are not fixed, but
are the two extremes. This is the one I recommend
for "bump hunting".
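For reference, a minimal sketch of the Kuiper statistic mentioned above, V = D+ + D-, assuming a continuous theoretical cdf. The function names are hypothetical, and no critical values are computed.

```python
def kuiper_statistic(sample, cdf):
    """Return V = D+ + D-, the sum of the two extreme cdf deviations."""
    xs = sorted(sample)
    n = len(xs)
    # Largest amount by which the empirical cdf lies above the theoretical one.
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    # Largest amount by which the empirical cdf lies below the theoretical one.
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return d_plus + d_minus


def uniform_cdf(x):
    """cdf of U(0,1), used here only as an example null distribution."""
    return min(1.0, max(0.0, x))


v = kuiper_statistic([0.1, 0.4, 0.7], uniform_cdf)
print(v)
```

The K-S statistic for the same data is max(D+, D-), so the two differ only in how the two extremes are combined.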
I think your mind wandered off, Herman. I was talking
about the chi-square test being better than the K-S for
testing random numbers on U(0,1).
For short-tailed distributions, such as the Uniform, the Chi-square
goodness of fit ain't too bad. In fact, it is used as
ONE of the tests for pseudorandom number generators,
to test the uniformity of the distribution in ALL bins.
A more powerful test would be to look at the maximum
deviation, which is likely to be more powerful than the
chi-squared, or even the sum of the absolute deviations.
In this case, there is no ordering of the bins.
A more powerful test against what? The maximum deviation
cannot possibly be more powerful than the uniform chi-square,
which takes ALL deviations of bins into consideration
rather than just the bin with the maximum deviation.
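The two quantities being argued about can be computed from the same bin counts. This is only a sketch with a made-up function name, assuming equal expected counts; it says nothing about which statistic has more power against a given alternative.

```python
import math


def bin_statistics(counts):
    """Return (chi-square, max standardized deviation) for equal-probability bins."""
    n = sum(counts)
    k = len(counts)
    expected = n / k
    # Chi-square sums the squared standardized deviations over ALL bins.
    chi_sq = sum((c - expected) ** 2 / expected for c in counts)
    # The maximum-deviation statistic keeps only the single worst bin.
    max_dev = max(abs(c - expected) for c in counts) / math.sqrt(expected)
    return chi_sq, max_dev


chi_sq, max_dev = bin_statistics([2, 2, 0, 4])
print(chi_sq, max_dev)
```

By construction max_dev squared is one of the terms in the chi-square sum, so it can never exceed chi_sq; the two statistics are referred to different null distributions, though.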
For all other distributions, no Data Analyst worth his salt would
even think about K-S, because of the effectiveness of the
Q-Q plot. The more you understand or think about HOW to
use (or examine) Q-Q plots for departure, the LESS you'll
be impressed by the Kolmogorov. Of course, mathematical
statisticians have their own way of mathemtistry to think up
reasons why K-S test is any good at all!
NO mathematical statistician has EVER thought of, or been
able to capture, the "small systematic departures" that elude
the K-S statistic every time. We, the Data Analysts, NEVER
miss that kind of systematic departure. That is why the Q-Q
visual test has NO analytic competitor that is even in the same
league.
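For anyone who wants to build the plot by hand, here is a minimal sketch of the Q-Q coordinates against a standard normal reference. The plotting positions (i - 0.5)/n are one common convention and an assumption here, as is the function name; the visual judgement of systematic departure from a straight line is left to the reader.

```python
from statistics import NormalDist


def normal_qq_points(sample):
    """Return (theoretical quantile, ordered observation) pairs for a normal Q-Q plot."""
    xs = sorted(sample)
    n = len(xs)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5)/n keep the probabilities strictly inside (0, 1).
    return [(std_normal.inv_cdf((i - 0.5) / n), x)
            for i, x in enumerate(xs, start=1)]


points = normal_qq_points([3.0, 1.0, 2.0])
for q, x in points:
    print(q, x)
```

Points falling along a straight line suggest agreement with the reference family up to location and scale; curvature or S-shapes are the kind of systematic departure described above.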
-- Reef Fish Bob.
--
This address is for information only. I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hrubin@xxxxxxxxxxxxxxx Phone: (765)494-6054 FAX: (765)494-0558