Re: Identifying the distribution of a data set
- From: ZZZjon@xxxxxxxxxxxxxxxx
- Date: 17 Aug 2006 00:48:39 -0700
ali wrote:
Dear all
I am writing software that reads TCP packets from a link. I have the following information available:
size of the packet: 1 4 7 9..........
frequency of the packet: 12 6 9 1..........
This is just an example. In reality I have thousands of these values.
Now I want to check which distribution fits the packet sizes best, e.g. whether the distribution is Poisson, hyperexponential, Pareto, Gamma, etc.
One way I guess is to plot a histogram and then study the shape. But I want the task to be fully automated and performed implicitly by the tool that I am developing.
I have been looking into books and searching the internet as well, but I did not come across any simple threshold values that could be used to identify the distribution, for example (just assuming):
if the coefficient of variation of the packet sizes is less than 1, then it is Poisson, etc.
Do you know of any algorithms for this? I have come across goodness-of-fit tests, e.g. the chi-square test and the Kolmogorov test, but I don't know exactly what they do, and I think they need some sort of reference data that fits a distribution in order to calculate the difference from that distribution. Am I on the right path, or do I need to look into something else?
My deadline is the 23rd of August, so I am looking for a shortcut. It would be very kind of you to help me.
I have searched this forum quite extensively and did come across some similar posts, but I could not understand most of the answers, most probably because I am not a statistician but a computer programmer.
I hope you understand my problem. Any help would be appreciated and please please keep it simple.
Thanks in advance to all of you.
ali
If you have access to Mathcad, there is a useful thread with some code
for comparing fits to different distributions on the Mathcad
collaboratory http://collab.mathsoft.com/%7Emathcad2000 . You may have
to register.
It's in the "Probability & Statistics" section, entitled "Fitting
Statistical Distributions", started 2 July 2005. The last code is in
message 47, Sept 22 2005, from Paul W, "distribution ranking d.mcd".
No guarantees, but Paul W seems to know what he's talking about.
If you don't have Mathcad, maybe you could contact him via the
collaboratory for help/suggestions.
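
For anyone without Mathcad, here is a rough Python sketch of the general idea of ranking candidate distributions against (size, frequency) data. It is not Paul W's code, and I have not seen how his worksheet does it. The function names, the choice of candidates (exponential, lognormal, Pareto) and the use of the Kolmogorov distance as the score are all assumptions made for illustration. It assumes every size is greater than zero, every frequency is positive, and there are at least two distinct sizes. Poisson is left out because it is a discrete distribution and would need a different comparison (e.g. chi-square on the counts).

```python
import math

def ks_distance(sizes, freqs, cdf):
    """Largest gap between the empirical CDF of the data and a candidate CDF."""
    n = sum(freqs)
    seen = 0
    worst = 0.0
    for x, f in sorted(zip(sizes, freqs)):
        before = seen / n          # empirical CDF just below x
        seen += f
        after = seen / n           # empirical CDF at x
        model = cdf(x)
        worst = max(worst, abs(model - before), abs(model - after))
    return worst

def rank_distributions(sizes, freqs):
    """Fit each candidate by simple estimates and sort by Kolmogorov distance."""
    n = sum(freqs)
    pairs = list(zip(sizes, freqs))

    # Exponential: rate is 1 / mean.
    mean = sum(x * f for x, f in pairs) / n

    # Lognormal: mean and standard deviation of log(size).
    log_mean = sum(f * math.log(x) for x, f in pairs) / n
    log_var = sum(f * (math.log(x) - log_mean) ** 2 for x, f in pairs) / n
    log_sd = math.sqrt(log_var)

    # Pareto: scale is the smallest size, shape by maximum likelihood.
    x_min = min(sizes)
    alpha = n / sum(f * math.log(x / x_min) for x, f in pairs)

    candidates = {
        "exponential": lambda x: 1.0 - math.exp(-x / mean),
        "lognormal": lambda x: 0.5 * (1.0 + math.erf(
            (math.log(x) - log_mean) / (log_sd * math.sqrt(2.0)))),
        "pareto": lambda x: 1.0 - (x_min / x) ** alpha,
    }
    scores = [(ks_distance(sizes, freqs, cdf), name)
              for name, cdf in candidates.items()]
    return sorted(scores)          # smallest distance (best fit) first

# The example values from the question.
sizes = [1, 4, 7, 9]
freqs = [12, 6, 9, 1]
ranking = rank_distributions(sizes, freqs)
for distance, name in ranking:
    print(f"{name:12s} distance = {distance:.3f}")
```

The distance is only useful for ranking the candidates against each other. Because the parameters are estimated from the same data, the usual Kolmogorov test tables do not apply directly, so treat a small distance as "fits better than the others", not as proof that the data follow that distribution.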
HTH
Jon