Re: Chi²-Test automatic grouping?




>> 1) Why choose between some collection of distributions based on
>> goodness of fit? What are you trying to achieve by it?

> I have to implement an insurance model. The approach is well
> known in theory and there are only a few distr's proposed to fit.

I must have missed this particular theory. Which theory says that this
is a good way to select between distributions?


>> 2) Why the ones in your list and not others?

> because it's only a model and others have tested that some will fit
> "better" than others

Frankly, I don't believe this is why you'd exclude candidates they
mightn't have even been aware of, but I'll let this go.

>> 3) Why on /earth/ use chi-square?

> because it's easy to implement. Better for the data would be a
> graphical test but there's less objectivity

Plainly, it's not all that easy to implement in practice, or your
question wouldn't have been necessary.

If goodness of fit were your only interest, a criterion better able to
distinguish between your candidates would serve you better (I would
guess, for example, that all of your proposed distributions are
unimodal and right-skewed). Goodness of fit via chi-square makes poor
use of the information in the data, because binning discards where each
observation falls within its bin. Some other choices aren't really any
harder to implement and totally avoid the problems created by binning
in the first place.

If you're fixated on using a goodness of fit criterion, perhaps a
cdf-discrepancy like the Anderson-Darling statistic would make a better
measure, though one better suited still could be designed with more
information.
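
To show that a cdf-discrepancy needs no binning at all, here is a
minimal sketch of the Anderson-Darling A^2 statistic in plain Python.
The function names, the exponential candidate and the sample are all my
own illustrative assumptions, not anything from your model. Note that
if the parameters are estimated from the same data, the usual tabulated
critical values no longer apply; the sketch only computes the
discrepancy itself.

```python
import math

def anderson_darling(sample, cdf):
    """Anderson-Darling A^2 for a sample against a continuous cdf."""
    xs = sorted(sample)
    n = len(xs)
    eps = 1e-12  # keep the logs finite if the cdf returns exactly 0 or 1
    total = 0.0
    for i, x in enumerate(xs, start=1):
        f_low = min(max(cdf(x), eps), 1.0 - eps)           # F(x_(i))
        f_high = min(max(cdf(xs[n - i]), eps), 1.0 - eps)  # F(x_(n+1-i))
        total += (2 * i - 1) * (math.log(f_low) + math.log(1.0 - f_high))
    return -n - total / n

def exponential_cdf(rate):
    """Return the cdf of an exponential distribution with the given rate."""
    return lambda x: 1.0 - math.exp(-rate * x)

# Hypothetical data: 50 exact quantiles of an exponential with rate 1.
n = 50
sample = [-math.log(1.0 - (i - 0.5) / n) for i in range(1, n + 1)]

a2_rate_1 = anderson_darling(sample, exponential_cdf(1.0))  # matching candidate
a2_rate_5 = anderson_darling(sample, exponential_cdf(5.0))  # badly wrong rate
print(a2_rate_1, a2_rate_5)
```

A smaller A^2 means the candidate cdf tracks the empirical cdf more
closely, with extra weight on the tails, which is usually where an
insurance application cares most.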

However, I think that's missing the point. Why not choose by, say,
maximum likelihood or, in the case where some of your proposed
distributions have different numbers of parameters, some statistic that
takes account of that, perhaps something like BIC or AIC? (Which one
I'd use would depend on what I was using the fits for.)
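
As a rough illustration of likelihood-based selection, the sketch below
fits two candidates by maximum likelihood and compares AIC and BIC. The
two candidates (exponential and lognormal, chosen only because their
MLEs have closed forms), the function names and the claim amounts are
all hypothetical; substitute your own list of distributions.

```python
import math

def fit_exponential(xs):
    """Return (maximised log-likelihood, number of parameters)."""
    n, total = len(xs), sum(xs)
    rate = n / total  # MLE of the rate
    return n * math.log(rate) - rate * total, 1

def fit_lognormal(xs):
    """Return (maximised log-likelihood, number of parameters)."""
    n = len(xs)
    logs = [math.log(x) for x in xs]
    mu = sum(logs) / n                            # MLE of mu
    var = sum((v - mu) ** 2 for v in logs) / n    # MLE of sigma^2
    loglik = -sum(logs) - 0.5 * n * math.log(2.0 * math.pi * var) - 0.5 * n
    return loglik, 2

def aic(loglik, k):
    return 2.0 * k - 2.0 * loglik

def bic(loglik, k, n):
    return k * math.log(n) - 2.0 * loglik

# Hypothetical claim amounts, made up for this example.
claims = [120.0, 340.0, 95.0, 1800.0, 410.0, 60.0, 2750.0, 230.0, 150.0, 980.0]

for name, fit in [("exponential", fit_exponential), ("lognormal", fit_lognormal)]:
    loglik, k = fit(claims)
    print(name, round(aic(loglik, k), 2), round(bic(loglik, k, len(claims)), 2))
```

Lower is better for both criteria; BIC penalises extra parameters more
heavily than AIC once the sample has more than about 7 observations.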

But there's a second big "However" here.

Since you're using the fitted distribution for something else,
presumably you want to /achieve/ something in that application. *That*,
rather than fit using chi-square (or anything else), ought to drive
your choice.

What is it you actually want to be "best"? If you used two different
distributions in your application, what would it be about the answers
they provide that would make one answer better than another?

I have some actuarial background, if that helps.

If you're using those distributions in some predictive way ("fit an
insurance model" is uninformative), you should probably take account of
the difference between fitted and predictive distributions (any
parameter estimates have associated uncertainty - you can take that
into account when forecasting).

[Indeed, you can even take account of the model selection uncertainty
in your choice between distributions.]
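
On the fitted versus predictive point, here is a small sketch of how
much the two can differ in the tail. It assumes, purely for
illustration, exponential claim sizes and a Bayesian treatment with the
improper prior p(rate) proportional to 1/rate, so that the posterior
for the rate is gamma and the predictive survival function has a closed
form. None of this is taken from your model, and the function names and
data are hypothetical.

```python
import math

def plug_in_survival(x, claims):
    """P(X > x) from the fitted exponential, treating the MLE as exact."""
    n, total = len(claims), sum(claims)
    return math.exp(-(n / total) * x)

def predictive_survival(x, claims):
    """P(X > x) after averaging over the gamma posterior for the rate.

    Assumes the improper prior p(rate) ~ 1/rate, which gives a
    Gamma(n, sum of claims) posterior and a Lomax predictive distribution.
    """
    n, total = len(claims), sum(claims)
    return (total / (total + x)) ** n

# Hypothetical claim amounts, made up for this example.
claims = [120.0, 340.0, 95.0, 1800.0, 410.0, 60.0, 2750.0, 230.0, 150.0, 980.0]

for x in (1000.0, 5000.0, 10000.0):
    print(x, plug_in_survival(x, claims), predictive_survival(x, claims))
```

Under these assumptions the predictive tail probability exceeds the
plug-in one at every x > 0, and the gap matters most for small samples
and large claims.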

Glen
