Re: Mann-Kendall test statistic

From: A.G.McDowell (mcdowella_at_nospam.co.uk)
Date: 09/30/04


Date: Thu, 30 Sep 2004 06:33:32 +0100

In article <gvrml0hs7vf2k9cpsg0gfsd3borvf7t6qn@4ax.com>, Richard Ulrich
<Rich.Ulrich@comcast.net> writes
>
>[snip, Glen's request for clarification]
>>
>> AFAIK the ordinary Mann-Kendall test statistics for a real sequence
>> y = (y_1,\ldots,y_n) is calculated as
>>
>> S = \Sum_{i=1}^{n-1} \Sum_{j=i+1}^n sgn(y_j - y_i)
>>
>> ie. it is the sum of the signs of all pairwise differences. If eg.
>> you have the sequence (1,2,3,4,5,6,7,8,9,10) there are 45 pairs,
>> each contributing a +1, giving the test statistics of +45.
>>
>
>That reminds me of the Kendall tau, and the bubble-
>sort algorithm for counting interchanges.
>
>You can sort in O(n log(n)), right? - and carry along the
>original ranks.
>
>That leaves the interesting problem of computing the
>needed sum by comparing the sorted Ranks with the
>original Ranks in less than O(n^2). If someone has
>solved it for tau, then the same answer should apply
>for this one (it seems to me, after a few minutes of
>failing to find an algorithm).
>
>
Suppose you pre-sort the input, keeping track of the original positions,
as suggested. I think you have reduced the problem to the following:
starting with an empty array of size N, fill in the empty slots in some
order determined by the data. Every time you fill in a slot, you want to
know how many slots to its left and right have been filled. The slot
each element fills in depends on its rank in sorted order. The order of
filling in the slots is the reverse order of the data. So the number of
filled slots to an item's left and right is the number of +ve and -ve
signs it contributes.

To keep track of the number of filled slots to the left and right of
arbitrary positions, maintain a series of arrays of length 1/2, 1/4,
1/8.. the size of the original array. Each cell in these arrays contains
the number of filled slots in a corresponding pair, four-some, 8-some...
of slots in the original array. Filling in a slot in the main array
requires you to modify one slot in each of the smaller arrays. To work
out the number of filled slots to the left/right of any given location
you need to sum up a collection of cells in the summary arrays that
covers the required region in the original array, and again you can do
this by reading just one cell from each summary array (work from the top
down, using the cell that covers only - but probably not all - of the
required region, leaving a gap to be filled in by the next array down).

There are log n summary arrays so the cost of this stage is n log n -
asymptotically the same as sorting

-- 
A.G.McDowell


Relevant Pages

  • Re: tictactoe
    ... >> a new array based on an existing array where the new one has x's where ... Here's your original array: ... You want the second array to have the same number of elements as the ... The loop examines each element of the original array one at a time, ...
    (comp.lang.java.help)
  • Re: Ranking Without Sorting
    ... >some point you want to use an ordering of the original array... ... >in the original array (an array of pointers, reference, indices, whatever ... >works nicest in the target language) and sort that. ... Since a ranking can be used to sort an array in time O, ...
    (comp.programming)
  • Re: safe to delete elements of array in foreach
    ... I make it a habit not to delete entries in a foreach() loop. ... build an array of keys I want to delete, and after the loop ends, delete ... I don't know whether an operation like this is guaranteed to work in PHP ... So unsetting a value in the original array should not affect the copy ...
    (comp.lang.php)
  • Re: random array elements and speed
    ... # need int here since we are checking for dups and floats won't work ... case I were to need more elements than those in the returned array. ... that could theoretiocally never terminate. ... the original array, but small's a relative term and the expected speed ...
    (comp.lang.perl.misc)
  • Re: Subset Sum problem (w/ limited scope)
    ... The "Subset Sum" problem (stated various ways depending on what you ... a subset of those integers that sums up to zero. ... Are there going to be duplicate values in your original array? ...
    (comp.lang.fortran)