Identifying the distribution of a data set
- From: ali <alinaqvi90@xxxxxxxxxxx>
- Date: Thu, 10 Aug 2006 23:57:57 EDT
Dear all
I am writing a tool that reads TCP packets from a link. I have the following information available:
size of the packet: 1 4 7 9..........
frequency of the packet: 12 6 9 1..........
This is just an example. In reality I have thousands of these values.
Now I want to check which distribution fits the packet sizes best, e.g. whether the distribution is Poisson, hyperexponential, Pareto, Gamma, etc.
One way, I guess, is to plot a histogram and then study its shape. But I want the task to be fully automated and performed internally by the tool that I am developing, without anyone inspecting a plot.
I have been looking through books and searching the internet as well, but I have not come across any numeric criteria that could be checked to identify the distribution, for example (just assuming):
if the coefficient of variation of the packet sizes is less than 1 then it is Poisson, etc.
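To make the kind of summary statistic meant here concrete, the following is a minimal sketch that computes the mean, variance and coefficient of variation directly from a size/frequency table like the one above. The names `weighted_stats`, `sizes` and `freqs` are made up for illustration, and the four values are just the toy example from this post. A single number like this would at best narrow down the candidates; it is not assumed to identify a distribution on its own.

```python
import math

def weighted_stats(sizes, freqs):
    """Return (mean, variance, coefficient of variation) for a frequency table.

    sizes[i] is a packet size and freqs[i] is how often it was seen.
    The variance is the population variance (divides by n, not n - 1).
    """
    n = sum(freqs)
    mean = sum(s * f for s, f in zip(sizes, freqs)) / n
    var = sum(f * (s - mean) ** 2 for s, f in zip(sizes, freqs)) / n
    cv = math.sqrt(var) / mean  # standard deviation relative to the mean
    return mean, var, cv

# Toy data from the post (hypothetical, not real measurements)
sizes = [1, 4, 7, 9]
freqs = [12, 6, 9, 1]
mean, var, cv = weighted_stats(sizes, freqs)
print(f"mean={mean:.3f} variance={var:.3f} cv={cv:.3f}")
```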
Do you know of any algorithms for this? I have come across goodness-of-fit tests, e.g. the chi-square test and the Kolmogorov test, but I don't know exactly what they do, and I think they need some sort of reference data that follows a distribution in order to calculate the difference from that distribution. Am I on the right path, or do I need to look into something else?
My deadline is the 23rd of August, so I am looking for a shortcut. It would be very kind of you to help me.
I have searched this forum quite extensively and did come across some similar posts, but I could not understand most of the answers, most probably because I am not a statistician but a computer programmer.
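As a rough illustration of what a chi-square goodness-of-fit check could look like for this kind of data, here is a minimal sketch using only the Python standard library. It assumes the sizes are distinct non-negative integers, takes Poisson as the candidate distribution with its rate estimated from the sample mean, and merges neighbouring cells until each has an expected count of at least 5 (a common rule of thumb). The function names and the `min_expected` parameter are hypothetical. The "reference" here is not a second data set but the expected counts computed from the fitted distribution.

```python
import math

def poisson_pmf(k, lam):
    # Computed in log space so large k does not overflow the factorial.
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))

def chi_square_poisson(sizes, freqs, min_expected=5.0):
    """Return (chi-square statistic, degrees of freedom) against a fitted Poisson."""
    n = sum(freqs)
    lam = sum(s * f for s, f in zip(sizes, freqs)) / n  # estimated rate
    observed = dict(zip(sizes, freqs))
    k_max = max(sizes)

    cells = []            # merged cells as [observed count, expected count]
    o_acc = e_acc = 0.0   # counts accumulated for the cell being built
    cum = 0.0             # cumulative Poisson probability so far
    for k in range(k_max):
        p = poisson_pmf(k, lam)
        cum += p
        o_acc += observed.get(k, 0)
        e_acc += n * p
        if e_acc >= min_expected:
            cells.append([o_acc, e_acc])
            o_acc = e_acc = 0.0

    # Tail cell: everything at or above the largest observed size.
    o_acc += observed.get(k_max, 0)
    e_acc += n * max(0.0, 1.0 - cum)
    if cells and e_acc < min_expected:
        cells[-1][0] += o_acc   # too small on its own: merge into previous cell
        cells[-1][1] += e_acc
    else:
        cells.append([o_acc, e_acc])

    stat = sum((o - e) ** 2 / e for o, e in cells)
    dof = len(cells) - 1 - 1    # cells - 1, minus one estimated parameter
    return stat, dof

# Toy data from the post (hypothetical, not real measurements)
stat, dof = chi_square_poisson([1, 4, 7, 9], [12, 6, 9, 1])
print(f"chi-square={stat:.2f} with {dof} degrees of freedom")
```

The statistic would then be compared with the chi-square critical value for `dof` degrees of freedom; a larger statistic means a worse fit. Repeating this with other candidate distributions and comparing the results is one possible way to automate the choice, though whether it suits real packet traces is not something this sketch establishes.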
I hope you understand my problem. Any help would be appreciated and please please keep it simple.
Thanks in advance to all of you.
ali