Re: Confidence interval on mean for a set of numbers

From: andre (andrevh_at_sci.kun.nl)
Date: 09/18/04


Date: Sat, 18 Sep 2004 16:58:44 +0000 (UTC)

Hello George (Giorgo?),

What is the difference between your method and making a histogram
of the "old data" in order to get a handle on the distribution?
(I also think that making a histogram is computationally easier
than ranking.)

Have a good weekend,
                      andre

>
>I agree with "andre" (andrevh@sci.kun.nl), that a quick (and not
>too-dirty) way is to assume normality in the distribution of the
>mean (provided that the size of the sample is large enough). Here
>I post my suggestion on how we can "do it right". Like Peter
>Michaux prefers, we avoid estimating the PDF of each outcome;
>we work instead with the corresponding predictive CDF
>("cumulative distribution function").
>
>E.g., say that we have recorded one thousand outcomes, lumping
>all old samples together. Let us rank these outcomes. Then the
>probability of the next outcome being, say, between the
>outcomes ranked #393 and #394 is 1/1001, provided that we admit
>no prior information of any kind regarding the true underlying
>distribution (in other words, we apply only
>"low structure assumptions"). This "posterior distribution of
>percentiles" is still debatable wrt the foundation. (Imho, it
>*is* possible to supply a classical proof (i.e. based on
>properties of conventional, orthodox probability) but no such
>proof is in print yet, afaik.)
>For references, if you need them, see Section 3a of my post
>news:<3ce8f26b.0409070825.7c799b23@posting.google.com>,
>"prediction versus parameter estimation (was: literature: ...)"
>(7 Sep 2004 09:25:57 -0700).
>
>If we accept (OK, just provisionally) the above posterior
>distribution of percentiles, we obtain an inexact CDF for the
>next outcome. E.g., the CDF at x=#393 is 393/1001, the CDF at
>x=#394 is 394/1001, and the CDF at any intermediate point is
>undefined as an exact value but defined as an "interval value":
>[393/1001, 394/1001].
>
>Now it is elementary to derive a CDF for the mean of the next
>sample of size n, given the CDF of single iid outcomes. The
>upshot is, in this problem we obtain an interval-valued CDF
>for the mean.
>(If we wanted the CDF of the median rather than the CDF of the
>mean, it would be an easier calculation.)
>Moreover, if we have recorded a large number of outcomes, this
>"second-order uncertainty" will be negligible.
>
>Not only is this method trivial to implement, it is also
>*the* correct way, imo. (Of course I will welcome any comment
>or disagreement.)
>
>Good Day, ~ George Kahrimanis
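
The rank-based construction quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not George's own code: the function names `interval_cdf` and `mean_cdf_bounds` are made up here, the data are assumed to have no ties, and the bounds for the mean are obtained by Monte Carlo (pushing each gap's probability 1/(N+1) to its left or right endpoint) rather than by an exact derivation.

```python
import bisect
import math
import random


def interval_cdf(sorted_data, x):
    """Return (lower, upper) bounds on P(next outcome <= x).

    With N ranked outcomes, each of the N + 1 gaps (including the two
    open-ended ones) is given probability 1 / (N + 1).
    """
    n = len(sorted_data)
    below = bisect.bisect_left(sorted_data, x)         # outcomes < x
    at_or_below = bisect.bisect_right(sorted_data, x)  # outcomes <= x
    if below != at_or_below:
        # x coincides with a recorded outcome: the CDF value is exact.
        p = at_or_below / (n + 1)
        return p, p
    # x lies strictly inside a gap: only an interval is known.
    return below / (n + 1), (below + 1) / (n + 1)


def mean_cdf_bounds(sorted_data, n, m, trials=5000, seed=0):
    """Monte Carlo bounds on P(mean of the next n outcomes <= m).

    Assumption of this sketch: the extreme cases put each gap's mass at
    its right endpoint (smallest CDF) or left endpoint (largest CDF).
    The open-ended gaps are sent to +inf and -inf respectively.
    """
    rng = random.Random(seed)
    left = [-math.inf] + list(sorted_data)
    right = list(sorted_data) + [math.inf]
    gaps = len(left)  # N + 1 equally likely gaps
    hits_low = 0
    hits_high = 0
    for _ in range(trials):
        idx = [rng.randrange(gaps) for _ in range(n)]
        if sum(right[i] for i in idx) / n <= m:
            hits_low += 1
        if sum(left[i] for i in idx) / n <= m:
            hits_high += 1
    return hits_low / trials, hits_high / trials


# Hypothetical "old data": 1000 draws from a standard normal.
data_rng = random.Random(1)
data = sorted(data_rng.gauss(0.0, 1.0) for _ in range(1000))

# A point strictly between the outcomes ranked #393 and #394.
midpoint = (data[392] + data[393]) / 2
print(interval_cdf(data, midpoint))
print(mean_cdf_bounds(data, n=10, m=0.0))
```

For a point between the outcomes ranked #393 and #394, `interval_cdf` returns the pair 393/1001 and 394/1001 from the quoted example. Regarding the histogram question: a histogram needs a choice of bin width, while the ranking needs only a sort, so the cost difference is small for a thousand outcomes.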


