Re: Statistical Ranking for Non-Normal Populations

From: George Kahrimanis (anakreon_at_hol.gr)
Date: 10/14/04


Date: Thu, 14 Oct 2004 18:27:05 +0300


"Long message" alert :-[
"Non-parametric Predictive Inference" alert :-)

Peter Hach, on 13 Oct 2004 18:30:39 +0000 (UTC) wrote
>I need to perform (statistical) ranking of a number of large, but
>finite populations X[i] = (x[i][1], ... ,x[i][n]) in a scenario
>where acquiring each x[i][j] is very expensive. I am looking for the
>population X[i] with the smallest Sum or Average over the x[i][j]
>(i.e. I am only interested in the top-ranked one).

This is not a trivial question, if the data are in short supply.
Here I propose a solution of the "nonparametric predictive" kind.
In this approach, it is not considered meaningful to ask a question
about the underlying pdfs themselves (inasmuch as there is zilch
prior knowledge) but we may pose a question related to the next
sampling of each of the separate populations (indexed by `i').

"Imagine a future sample, with one outcome for each i; what is
the probability that the outcome #1 will be the maximum in that lot?
What is the probability that the outcome #2 will be the maximum? And
so on."

Here is the foundation, in short. Without further assumptions about
the underlying processes, any assumed prior is arbitrary. On the other
hand, confidence intervals have been a disappointment (at least) in
other cases, so let us stay clear of them, or leave them to those who
have some use (like, what -- publish?) for them.

So what is left to do? Consider any sub-sample separately (i.e., for
any fixed i). Think of the next (i.e. future) outcome. Can you form
any prediction on the relative *rank* of the next outcome? The
obvious answer is that, if we had n events of type i, the rank of
the next event will be 1, 2,... n+1 with equal probability: 1/(n+1).
For references, see Section 3a of my
news:<3ce8f26b.0409070825.7c799b23@posting.google.com>
("prediction versus parameter estimation (was: literature: ...)",
7 Sep 2004 09:25:57 -0700).
Sometimes this assignment of probability is regarded as a
separate assumption, but I think that the issue of foundation is
still open. (Check the references.)
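
A quick numerical check of the equal-rank claim, as a sketch only. It
assumes the n past outcomes and the next one are independent draws from
one continuous distribution; the exponential distribution, the function
name next_rank_frequencies and the trial count are arbitrary choices
made for illustration, not part of the argument above.

```python
import random
from collections import Counter


def next_rank_frequencies(n, trials=20000, seed=0):
    """Estimate how often the next outcome takes each rank 1..n+1.

    Assumption (for illustration): all n+1 values are i.i.d. draws
    from an exponential distribution, a deliberately non-normal choice.
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(trials):
        past = [rng.expovariate(1.0) for _ in range(n)]
        nxt = rng.expovariate(1.0)
        # Rank 1 means "below all past outcomes", rank n+1 "above all".
        rank = 1 + sum(1 for v in past if v < nxt)
        counts[rank] += 1
    return {r: counts[r] / trials for r in range(1, n + 2)}


if __name__ == "__main__":
    for rank, freq in next_rank_frequencies(4).items():
        print(rank, round(freq, 3))
```

With n = 4 each of the five ranks should come out close to 1/5.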

To be on the safe side, let us regard this assignment as an
assumption, "A_n", for now. The n outcomes of type i form n+1
plain intervals, when we also take into account +/- infinity or
the bounding values. According to A_n, the probability that the next
type-i outcome falls inside any given one of these n+1 intervals is
1/(n+1).
However, we have no way to define probability for any interval
other than those, and their unions. This knowledge is practically
equivalent (in terms of decision strategy) to an "inexactly defined"
(i.e., interval valued) cumulative distribution function ("DF").
EXAMPLE. Say we have 4 outcomes of type i; the DF below the lower
bound is 0 (or the interval [0, 0]); the DF between the lower bound
and the lowest outcome is the interval [0, 1/5]; the DF between the
lowest two outcomes is the interval [1/5, 2/5];
in the next interval, the DF is the interval [2/5, 3/5]; and so on;
over the higher bound, the DF has a single value: 1 (that is, [1, 1]).

(The value of the DF at each node is a detail.)
(We have ignored ties, for now; that issue is trivial.)
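
The EXAMPLE can be reproduced with a few lines of code. This is a
minimal sketch: the function name df_bounds and the lo/hi arguments for
the bounding values are hypothetical, the outcomes are assumed to be
distinct, and the value returned exactly at a node is an arbitrary
choice (the detail set aside above).

```python
from bisect import bisect_left
from fractions import Fraction
from math import inf


def df_bounds(sample, t, lo=-inf, hi=inf):
    """Interval [lower, upper] for the DF of the next outcome at t.

    Assumes distinct outcomes inside [lo, hi]; lo and hi are the
    bounding values (infinite when nothing better is known).
    """
    if t < lo:
        return Fraction(0), Fraction(0)
    if t > hi:
        return Fraction(1), Fraction(1)
    xs = sorted(sample)
    n = len(xs)
    k = bisect_left(xs, t)  # number of outcomes strictly below t
    # k whole intervals lie below t for certain; the interval that
    # contains t may or may not have its mass 1/(n+1) below t.
    return Fraction(k, n + 1), Fraction(k + 1, n + 1)


if __name__ == "__main__":
    data = [2.0, 5.0, 7.0, 9.0]  # made-up outcomes, bounds 0 and 10
    for t in (-1.0, 1.0, 3.0, 6.0, 8.0, 9.5, 11.0):
        print(t, df_bounds(data, t, lo=0.0, hi=10.0))
```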

We have defined an interval-valued DF for each i, regarding the
next outcome of a type-i measurement.

To define the probability of the next type-1 event, Y_1, being
larger than the next type-2 event, Y_2, given the interval-valued
DF for each type, we can consider the (incompletely defined) random
variable Z_{1,2} == Y_1 - Y_2, and seek what is the probability of
"z_{1,2} > 0". Offhand, we expect that the result will be an
interval, like [p_1, p_2].

We could calculate the DF for Z_{1,2} if we also assumed a PDF
(probability density function) for each of Y_1 and Y_2. Although
no such PDFs are defined, we can define, for each i, the family
of PDFs that are compatible with the corresponding interval-valued
DF. Now take any two such PDFs, one for i=1 and the other for
i=2, and calculate the corresponding DF for Z_{1,2}. Let the
two PDFs vary, each in its family (say, we implement a Monte Carlo)
and (after many blind trials) we identify the minimal and maximal
values for the DF of Z_{1,2}. Fortunately, we need the extremal
values of the DF at zero only, so that the number of calculations is
not prohibitively large, for small samples.

It is a dumb, computation-intensive solution. I am almost sure that
an elegant one exists, but I still need to work on some fine points
in the proof.
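
A possible shortcut, offered as a sketch and not as the proof mentioned
above. It is not the Monte Carlo search: it assumes Y_1 and Y_2 are
independent, no ties, and infinite bounding values, and it uses the
fact that P(Y_1 > Y_2) can only grow when mass of Y_1 moves up or mass
of Y_2 moves down. Under those assumptions the extremes are reached by
pushing each 1/(n+1) lump of mass to an end of its interval, so only
pairs of observed outcomes need to be counted. The name
prob_greater_bounds is hypothetical.

```python
from fractions import Fraction


def prob_greater_bounds(x, y):
    """Bounds [p_1, p_2] on P(Y_1 > Y_2) from samples x and y.

    Lower bound: mass of Y_1 at the left end of each interval, mass of
    Y_2 at the right end. Upper bound: the other way round.
    Assumes independence, distinct values, infinite bounding values.
    """
    n1, n2 = len(x), len(y)
    # Pairs of observed outcomes in which the type-1 value is larger.
    wins = sum(1 for a in x for b in y if a > b)
    cells = (n1 + 1) * (n2 + 1)
    lower = Fraction(wins, cells)
    # In the upper case the lump of Y_1 at +infinity beats all n2+1
    # lumps of Y_2, and the lump of Y_2 at -infinity loses to all n1+1
    # lumps of Y_1; the pair (+inf, -inf) is counted once.
    upper = Fraction(wins + n1 + n2 + 1, cells)
    return lower, upper


if __name__ == "__main__":
    print(prob_greater_bounds([3, 4], [1, 2]))
    print(prob_greater_bounds([1, 2], [3, 4]))
```

With no data at all the function returns [0, 1], i.e. total ignorance,
which is at least a sanity check on the counting.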

We continue, with Z_{1,3}, ... Z_{i',i''}, ... and find what is
the probability interval for each Y_i' to be larger than Y_i''.
By treating outcomes as independent, we form the probability
that Y_1 is maximum, or Y_2 is maximum, and so on.
(Of course, they will come out as interval-valued probabilities.)
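
One way to get interval-valued probabilities for "Y_i is maximum" is
sketched below. It is a variant, not necessarily the construction meant
above: instead of combining the pairwise intervals, it bounds the event
directly, assuming the next outcomes of all types are independent,
values are distinct, and bounding values are infinite. The name
prob_max_bounds is hypothetical.

```python
from bisect import bisect_left
from fractions import Fraction
from math import inf


def prob_max_bounds(samples, i):
    """Bounds on P(next outcome of type i exceeds all other types).

    samples: list of lists, one list of past outcomes per type.
    Lower bound: mass of Y_i at left interval ends, rivals at right
    ends. Upper bound: the reverse placement.
    """
    own = sorted(samples[i])
    rivals = [sorted(s) for k, s in enumerate(samples) if k != i]
    weight = Fraction(1, len(own) + 1)

    def all_rivals_below(point, shift):
        # shift=1 adds each rival's lump of mass placed at -infinity.
        p = Fraction(1)
        for s in rivals:
            p *= Fraction(bisect_left(s, point) + shift, len(s) + 1)
        return p

    lower = weight * sum(all_rivals_below(a, 0) for a in [-inf] + own)
    upper = weight * sum(all_rivals_below(b, 1) for b in own + [inf])
    return lower, upper


if __name__ == "__main__":
    data = [[5, 6], [1, 2], [3, 4]]  # made-up samples for three types
    for i in range(len(data)):
        print(i, prob_max_bounds(data, i))
```

With two types this reduces to the pairwise interval for Y_1 > Y_2.
For the original question (smallest rather than largest), the same
code can be applied to the negated data.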

I am sorry for the length, but imho this problem is worth it!
Thanks for the problem! ~ George Kahrimanis


