Re: Reduction of variance due to grouping




I've seen nothing on exactly this problem.

It's not a trivial one, though.

I can imagine doing it in probability distribution terms, but the only
way I can see of doing it with any guaranteeable level of accuracy for
an observed set of data is to actually do it and then compute the
variances.

However, I note that for some NP-hard problems, there are approximate
solutions (no more than a factor of (1+epsilon) times the minimum) that
are very quick; there may be an algorithm such as this (possibly
adapted from an existing related problem) for your problem that would
allow a good upper bound estimate of the variance at the optimum.

Just off the top of my head, I'm thinking about something akin to FFD
in the bin-packing problem (even though FFD isn't one of the
(1+epsilon) algorithms, it does have a nice bounded worst-case). It
may be that an algorithm like that can be adapted to you problem to
give you a quick upper bound on the variance.

You problem is akin to bin packing but instead of having an unknown
number of fixed sized bins you have a fixed number of unknown-sized
bins - also like the related pipe-cutting (1D nesting) problem.

You clearly identify that a single large piece can have a dramatic
effect. Another thought is that the more small items you have the
better the chances you can get a small variation across groups, but
there's likely to be a steep drop down in variance somewhere roughly in
the region near the point where the sum of the small pieces (where I've
left small deliberately vague for a moment) passes the largest piece.

Anyway, here's my adapted version of FFD that would give a quick bound
(though I present no results for how far from optimum it could get, I
have hopes that it or a variant of it would have FFD-like performance)
- sort the objects in decreasing size order. At each step put the
largest available piece in the smallest**
group. Compute the variance of that allocation. The actual variance at
the optimum will be smaller than this.

You might call this a Robin-Hood like strategy (giving the most to the
poorest); the variance from that rule is a sample-statistic, which is
what you're asking for. Doutless much better rules (and hence much
better approximations to the variance) will readily be found.

sci.stat.num-analysis is probably a more fruitful place for help on
this problem. I'd be surprised if it hasn't appeared in many guises
before.

Glen

.



Relevant Pages

  • Re: Reduction of variance due to grouping
    ... glenbarn...@xxxxxxxxxxxxx (Why is google groups doing that? ... here's my adapted version of FFD that would give a quick ... > (though I present no results for how far from optimum it could get, ... Compute the variance of that allocation. ...
    (sci.stat.math)
  • Balls into Bins with Variable Sized Balls
    ... certain events occurring on throwing n balls at random on m bins. ... Given only the mean and variance ... of the balls set (where the random variable is the size of the ...
    (sci.math)
  • Re: Packing problem
    ... >into, in this case, three sets whose sums have the least variance. ... Let's say your items are x_1,...,x_n and you have k bins. ... the variance can be made 0 is NP-complete). ... closest integer to x_i/M for some suitable integer M. ...
    (sci.math)
  • Packing problem
    ... Given a finite set of infinite, one-dimensional bins and a collection of ... the variance of the bin contents is minimal. ...
    (sci.math)
  • Solver problem
    ... I want to maximize one value (expected rate of return) and minimize another ... Is there a way to find an optimum value ... finding the minimum value of variance. ... generate a graph of expected return vs variance. ...
    (microsoft.public.excel.misc)