Re: Goodness of fit measures for a distribution



Reef Fish wrote:
You should mine more carefully and delicately than running a steam-
roller over all of them when more delicate tools are required.

I agree, this is a steam-roller approach. A hand-sewn suit can be of higher quality than a machine-sewn suit. Why aren't we all wearing hand-sewn suits?


When you have a few thousand variables, the first task is to
selectively consider ONLY a few dozen, if that many, that seem
most appropriate, for substantive reasons.

On several occasions data mining yielded novel patterns that we were unaware of, and that we would not have included in the analysis. This is especially true of interactions: although one can judge the relevance of individual variables, we are rarely able to judge reliably which of the many possible interactions matter. Automation helps in such cases.


Then you do want a one-number summary: you have no time to manually
inspect a few hundred thousand QQ plots.

It's easy to do a one number summary. Just generate one RANDOM NUMBER, and say "that's what the 'puter gave me!" And that number is probably as meaningful <or meaningless> as your single number from GIGO (Garbage In, Garbage Out).

Which one-number summary is less garbage than the others? Goodness of fit does a solid job for categorical data, and it can be reformulated appropriately for numerical data. In the experiments we did, the summaries of fit agreed well with expert opinion, so they do provide better summaries than random numbers.
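To make the idea concrete, here is a minimal sketch of one possible one-number summary. It assumes a Pearson chi-square statistic for categorical counts and, for numerical data, binning at the quantiles of a fitted normal; the function names, the choice of 10 bins, and the simulated data are illustrative assumptions, not the measure used in the experiments mentioned above.

```python
import random
from statistics import NormalDist

def chi_square_stat(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def binned_fit_normal(sample, bins=10):
    """Chi-square fit of a sample to a normal fitted by its mean and stdev.

    Bin edges are quantiles of the fitted normal, so every bin is expected
    to hold len(sample) / bins points if the normal model is adequate.
    """
    model = NormalDist.from_samples(sample)
    edges = [model.inv_cdf(k / bins) for k in range(1, bins)]
    counts = [0] * bins
    for x in sample:
        # The bin index is the number of edges lying strictly below x.
        counts[sum(x > e for e in edges)] += 1
    expected = [len(sample) / bins] * bins
    return chi_square_stat(counts, expected)

# Categorical case: 120 hypothetical die rolls against a fair-die model.
# The statistic is (4 + 4 + 0 + 1 + 1 + 0) / 20 = 0.5, up to rounding.
die_stat = chi_square_stat([18, 22, 20, 19, 21, 20], [20] * 6)

# Numerical case: one number per variable instead of one QQ plot each.
rng = random.Random(0)
normal_like = [rng.gauss(0, 1) for _ in range(2000)]
skewed = [rng.expovariate(1.0) for _ in range(2000)]
normal_stat = binned_fit_normal(normal_like)
skewed_stat = binned_fit_normal(skewed)
```

Ranking variables by such a statistic lets the worst-fitting few be picked out for a manual look at their QQ plots, rather than inspecting all of them. A larger value means a poorer fit, so the skewed sample should score far higher than the normal-like one.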


What you have argued is in fact the WORST that has happened to the
application of statistics -- when computer programs and packages
are readily available for any Tom, Dick, and Harry to throw data
into the bin to get some meaningless and useless number(s) out.

Yes, good tailors were likewise quite dismissive of the quality of the early machine-sewn dresses. They also pointed out how painful it is to look at all those unsophisticated clothes. They were forgetting that it is still better to see a person in a machine-sewn suit than in rags: that is the comparison one should be making.


--
mag. Aleks Jakulin
http://kt.ijs.si/aleks/
Department of Knowledge Technologies,
Jozef Stefan Institute, Ljubljana, Slovenia.