Re: Mann-Kendall test statistic
From: A.G.McDowell (mcdowella_at_nospam.co.uk)
Date: 09/30/04
- Next message: Fred Chen: "Extending Poisson Distribution to trending datasets"
- Previous message: Peter Michaux: "Sample Size for Emperical CDF"
- Next in thread: Dr. Knud Werner: "Re: Mann-Kendall test statistic"
- Reply: Dr. Knud Werner: "Re: Mann-Kendall test statistic"
- Messages sorted by: [ date ] [ thread ]
Date: Thu, 30 Sep 2004 06:33:32 +0100
In article <gvrml0hs7vf2k9cpsg0gfsd3borvf7t6qn@4ax.com>, Richard Ulrich
<Rich.Ulrich@comcast.net> writes
>
>[snip, Glen's request for clarification]
>>
>> AFAIK the ordinary Mann-Kendall test statistics for a real sequence
>> y = (y_1,\ldots,y_n) is calculated as
>>
>> S = \Sum_{i=1}^{n-1} \Sum_{j=i+1}^n sgn(y_j - y_i)
>>
>> ie. it is the sum of the signs of all pairwise differences. If eg.
>> you have the sequence (1,2,3,4,5,6,7,8,9,10) there are 45 pairs,
>> each contributing a +1, giving the test statistics of +45.
>>
>
>That reminds me of the Kendall tau, and the bubble-
>sort algorithm for counting interchanges.
>
>You can sort in O(n log(n)), right? - and carry along the
>original ranks.
>
>That leaves the interesting problem of computing the
>needed sum by comparing the sorted Ranks with the
>original Ranks in less than O(n^2). If someone has
>solved it for tau, then the same answer should apply
>for this one (it seems to me, after a few minutes of
>failing to find an algorithm).
>
>
Suppose you pre-sort the input, keeping track of the original positions,
as suggested. I think you have reduced the problem to the following:
starting with an empty array of size N, fill in the empty slots in some
order determined by the data. Every time you fill in a slot, you want to
know how many slots to its left and right have been filled. The slot
each element fills in depends on its rank in sorted order. The order of
filling in the slots is the reverse order of the data. So the number of
filled slots to an item's left and right is the number of +ve and -ve
signs it contributes.
To keep track of the number of filled slots to the left and right of
arbitrary positions, maintain a series of arrays of length 1/2, 1/4,
1/8.. the size of the original array. Each cell in these arrays contains
the number of filled slots in a corresponding pair, four-some, 8-some...
of slots in the original array. Filling in a slot in the main array
requires you to modify one slot in each of the smaller arrays. To work
out the number of filled slots to the left/right of any given location
you need to sum up a collection of cells in the summary arrays that
covers the required region in the original array, and again you can do
this by reading just one cell from each summary array (work from the top
down, using the cell that covers only - but probably not all - of the
required region, leaving a gap to be filled in by the next array down).
There are log n summary arrays so the cost of this stage is n log n -
asymptotically the same as sorting
-- A.G.McDowell
- Next message: Fred Chen: "Extending Poisson Distribution to trending datasets"
- Previous message: Peter Michaux: "Sample Size for Emperical CDF"
- Next in thread: Dr. Knud Werner: "Re: Mann-Kendall test statistic"
- Reply: Dr. Knud Werner: "Re: Mann-Kendall test statistic"
- Messages sorted by: [ date ] [ thread ]
Relevant Pages
|