Re: Least Squares solution for fitting beta distribution to empirical distribution

From: Herman Rubin (hrubin_at_odds.stat.purdue.edu)
Date: 02/23/05


Date: 23 Feb 2005 15:10:20 -0500

In article <1109155518.005943.271990@l41g2000cwc.googlegroups.com>,
 <clemenr@wmin.ac.uk> wrote:
>Hi.

>I would like to look at the following case. I have a set of data that I
>can use as a probability distribution using kernel density estimation.
>I would like to find the best possible (I'm thinking at the moment,
>least squares) beta distribution to approximate this empirical
>distribution.

Why should you use least squares? Maximum likelihood is
tractable for this problem, although if you have unknown
endpoints, there may be reasons for making some modifications.
Also, I suggest you do this on the original data, not the
results of kernel smoothing, which loses information, and
which will in general increase the range, causing errors.
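
As a rough illustration of the range point, the sketch below measures how much probability a Gaussian kernel estimate places outside the observed range of the data. The Gaussian kernel, the rule-of-thumb bandwidth 1.06*s*n^(-1/5), the simulated Beta(0.8, 3) sample, and the function names are all assumptions made for the example; nothing here comes from the original question.

```python
import math
import random
import statistics


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kde_mass_outside(data, lo, hi, bandwidth):
    """Fraction of a Gaussian KDE's probability mass lying outside [lo, hi]."""
    below = sum(normal_cdf((lo - x) / bandwidth) for x in data)
    above = sum(1.0 - normal_cdf((hi - x) / bandwidth) for x in data)
    return (below + above) / len(data)


# Simulated data on (0, 1) with a density that is unbounded at 0 (assumed example).
random.seed(0)
sample = [random.betavariate(0.8, 3.0) for _ in range(500)]

# Rule-of-thumb bandwidth (an assumption; any positive bandwidth leaks some mass).
h = 1.06 * statistics.stdev(sample) * len(sample) ** (-0.2)

leaked = kde_mass_outside(sample, min(sample), max(sample), h)
print(f"bandwidth = {h:.4f}, KDE mass outside the sample range = {leaked:.4f}")
```

Any kernel with unbounded support gives a positive value here, so a beta density fitted to the smoothed curve is being matched to a target that extends past the data.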

If the density is

(x-A)^{a-1}*(B-x)^{b-1}*\Gamma(a+b)/(\Gamma(a)*\Gamma(b)*(B-A)^{a+b-1}),

the mle of a and b given A and B corresponds to setting the
average value of ln((x-A)/(B-A)) to its expected value
\Psi(a) - \Psi(a+b), and correspondingly the average value of
ln((B-x)/(B-A)) to \Psi(b) - \Psi(a+b), where \Psi is the
digamma function.
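
The two conditions above can be solved numerically. What follows is a minimal sketch, not a definitive implementation: it assumes A and B are known, starts from method-of-moments values, and uses Newton's method, which is one choice among several. The digamma and trigamma functions are hand-written approximations (upward recurrence plus an asymptotic series) so that the example needs only the standard library; all function names are hypothetical.

```python
import math
import random


def digamma(x):
    """Psi(x) for x > 0: shift x up by recurrence, then use the asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    return result + math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))


def trigamma(x):
    """Psi'(x) for x > 0, by the same recurrence-plus-series approach."""
    result = 0.0
    while x < 6.0:
        result += 1.0 / (x * x)
        x += 1.0
    inv = 1.0 / x
    inv2 = inv * inv
    return result + inv + 0.5 * inv2 + inv * inv2 * (1 / 6 - inv2 * (1 / 30 - inv2 / 42))


def fit_beta_shapes(data, A, B, tol=1e-10, max_iter=100):
    """MLE of (a, b) for known endpoints A < min(data) and max(data) < B."""
    n = len(data)
    u = [(x - A) / (B - A) for x in data]
    s1 = sum(math.log(v) for v in u) / n          # average of ln((x-A)/(B-A))
    s2 = sum(math.log(1.0 - v) for v in u) / n    # average of ln((B-x)/(B-A))

    # Method-of-moments starting values (assumed; any positive start may do).
    m = sum(u) / n
    var = sum((v - m) ** 2 for v in u) / n
    common = m * (1.0 - m) / var - 1.0
    a, b = m * common, (1.0 - m) * common

    for _ in range(max_iter):
        g1 = digamma(a) - digamma(a + b) - s1
        g2 = digamma(b) - digamma(a + b) - s2
        if max(abs(g1), abs(g2)) < tol:
            break
        t = trigamma(a + b)
        j11, j12, j22 = trigamma(a) - t, -t, trigamma(b) - t
        det = j11 * j22 - j12 * j12
        da = (j22 * g1 - j12 * g2) / det
        db = (j11 * g2 - j12 * g1) / det
        step = 1.0
        while a - step * da <= 0.0 or b - step * db <= 0.0:
            step *= 0.5                           # keep both shapes positive
        a, b = a - step * da, b - step * db
    return a, b


if __name__ == "__main__":
    random.seed(0)
    A, B = 2.0, 10.0
    data = [A + (B - A) * random.betavariate(2.0, 5.0) for _ in range(5000)]
    print(fit_beta_shapes(data, A, B))
```

The fit works on the raw observations, as suggested above, and never touches a smoothed density.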

Then one can plug this into the likelihood function, but
observe that if a <= 1, the mle of A is the smallest order
statistic, and similarly for the case b <= 1 for B. If this
is an appropriate model, kernel estimators are likely to be
quite poor near any such endpoint.
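
The endpoint behaviour can be checked numerically. The sketch below evaluates the log-likelihood of the density given earlier, holding a, b and B fixed and moving A up toward the smallest observation. The simulated sample, the fixed values a = 0.8 and b = 3, and the function name are assumptions for illustration; with a <= 1 and a + b > 1 every A-dependent term of the log-likelihood is nondecreasing in A, so the values should rise as the gap shrinks.

```python
import math
import random


def beta4_loglik(data, a, b, A, B):
    """Log-likelihood of the four-parameter beta; needs A < min(data), max(data) < B."""
    n = len(data)
    const = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return (n * const
            + (a - 1.0) * sum(math.log(x - A) for x in data)
            + (b - 1.0) * sum(math.log(B - x) for x in data)
            - n * (a + b - 1.0) * math.log(B - A))


# Simulated data with true endpoints 0 and 1 and a < 1 (assumed example).
random.seed(1)
sample = [random.betavariate(0.8, 3.0) for _ in range(200)]
x_min = min(sample)

# Move A toward the smallest order statistic and watch the log-likelihood.
gaps = [1e-1, 1e-2, 1e-4, 1e-8]
values = [beta4_loglik(sample, 0.8, 3.0, x_min - gap, 1.0) for gap in gaps]
for gap, value in zip(gaps, values):
    print(f"A = x_min - {gap:g}: log-likelihood = {value:.4f}")
```

For a < 1 the term (a-1)*ln(x_min - A) grows without bound as A approaches x_min, which is why the likelihood pushes A all the way to the smallest observation.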

-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
hrubin@stat.purdue.edu         Phone: (765)494-6054   FAX: (765)494-0558

