Re: Finding Percentiles WITHOUT Sorting

From: Anonymous (anon_at_inter.net)
Date: 08/21/04


Date: Sat, 21 Aug 2004 15:42:04 +0200

Creating an index will not solve the problem: the index will of course be of
type Integer (4 byte), which is of the same order of that of original data
(type Single, 4 bytes, or Double, 8 bytes). I would need not 32 MB (smaller
by an order of magnitude) but another 400 or 200 MB, depending on whether
that data is Single or Double, respectively. If I had that kind of memory
available I would be having no problem in the first place -- or rather I
would have used it to contain more data, so I'd be in exactly the same
predicament.

"Thad Smith" <thad@ionsky.com> wrote in message
news:4126c5e2$1_4@omega.dimensional.com...
> Anonymous wrote:
> > I have a very large array that occupies most of the memory (several
hundred
> > megabytes in size). I want to calculate its percentiles, but the
"obvious"
> > way is to sort it first -- and I do not want to sort the original array
> > in-place, and of course I do not have enough memory to create a copy be
> > sorted.
>
> The most efficient way to do this exactly would be to sort. If you
> sorted in place, you would probably need an index for each value which
> carries its original location. In addition, you can use another data
> field for the rank, from which you can determine percentile. For 400 MB
> of data, you can use a 32-bit index. If there are 4 million items, you
> would need an additional 32 MB for the added data items. If the data
> structure already includes enough information to relate it to the
> original order or end use, you wouldn't need to add an index.
>
> Add index, sort on value, assign rank, sort on original index. Compute
> percentile from rank.
>
> Thad
>



Relevant Pages

  • Re: Finding Percentiles WITHOUT Sorting
    ... type Integer, which is of the same order of that of original data ... from which you can determine percentile. ... > Add index, sort on value, assign rank, sort on original index. ...
    (comp.programming)
  • Re: Crosstab Pivot column problem
    ... The 90th percentile rank would then be the 11th highest. ... > highest "rankXX", while including duplicates. ... > The final query would include a count of results for each id/date ...
    (microsoft.public.access.queries)
  • Re: Ranking Without Sorting
    ... you have just reinvented the rank sort. ... >> 'rank' array you want. ... >> Doing a 'heap sort', for example, where the heap holds the index into ...
    (comp.programming)
  • Re: Crosstab Pivot column problem
    ... if you rank all the rows in each group by ascending values for Result ... I'm going to make this in pieces, each of which needs to be a saved query. ... be used to find the 90th percentile. ... >> It would be possible to use Labnum as a secondary ranking value. ...
    (microsoft.public.access.queries)
  • Re: How do I return a Percentile Rank in Microsoft Access?
    ... Method 2) Add filter to DCount ... IIf) AS Percentile ... and the field with the values to rank is ).... ...
    (microsoft.public.access.modulesdaovba)

Quantcast