Re: Reduction of variance due to grouping
- From: glenbarnett@xxxxxxxxxxxxx
- Date: 13 Apr 2005 19:30:25 -0700
I've seen nothing on exactly this problem.
It's not a trivial one, though.
I can imagine doing it in probability distribution terms, but the only
way I can see of doing it with any guaranteeable level of accuracy for
an observed set of data is to actually do it and then compute the
variances.
However, I note that for some NP-hard problems, there are approximate
solutions (no more than a factor of (1+epsilon) times the minimum) that
are very quick; there may be an algorithm such as this (possibly
adapted from an existing related problem) for your problem that would
allow a good upper bound estimate of the variance at the optimum.
Just off the top of my head, I'm thinking about something akin to FFD
in the bin-packing problem (even though FFD isn't one of the
(1+epsilon) algorithms, it does have a nice bounded worst-case). It
may be that an algorithm like that can be adapted to you problem to
give you a quick upper bound on the variance.
You problem is akin to bin packing but instead of having an unknown
number of fixed sized bins you have a fixed number of unknown-sized
bins - also like the related pipe-cutting (1D nesting) problem.
You clearly identify that a single large piece can have a dramatic
effect. Another thought is that the more small items you have the
better the chances you can get a small variation across groups, but
there's likely to be a steep drop down in variance somewhere roughly in
the region near the point where the sum of the small pieces (where I've
left small deliberately vague for a moment) passes the largest piece.
Anyway, here's my adapted version of FFD that would give a quick bound
(though I present no results for how far from optimum it could get, I
have hopes that it or a variant of it would have FFD-like performance)
- sort the objects in decreasing size order. At each step put the
largest available piece in the smallest**
group. Compute the variance of that allocation. The actual variance at
the optimum will be smaller than this.
You might call this a Robin-Hood like strategy (giving the most to the
poorest); the variance from that rule is a sample-statistic, which is
what you're asking for. Doutless much better rules (and hence much
better approximations to the variance) will readily be found.
sci.stat.num-analysis is probably a more fruitful place for help on
this problem. I'd be surprised if it hasn't appeared in many guises
before.
Glen
.
- Follow-Ups:
- Re: Reduction of variance due to grouping
- From: glenbarnett
- Re: Reduction of variance due to grouping
- References:
- Reduction of variance due to grouping
- From: Rick
- Re: Reduction of variance due to grouping
- From: glenbarnett
- Re: Reduction of variance due to grouping
- From: Rick
- Reduction of variance due to grouping
- Prev by Date: feature extraction in high dimensional data
- Next by Date: Re: Reduction of variance due to grouping
- Previous by thread: Re: Reduction of variance due to grouping
- Next by thread: Re: Reduction of variance due to grouping
- Index(es):
Relevant Pages
|