Re: How to identify flat (even) distributions?



On Dec 10, 6:05 pm, Steve555 <foursh...@xxxxxxxxxxxxxx> wrote:
On 10 Dec, 16:50, illywhacker <illywac...@xxxxxxxxx> wrote:

On Dec 10, 5:27 pm, Steve555 <foursh...@xxxxxxxxxxxxxx> wrote:

Hi

If I have 1000 people and their opinion ratings for, say, 100 songs
each (on a scale of 1-10), how do I test for those users who have rated
10 1s, 10 2s, 10 3s, etc., i.e. a flat distribution?
I know I can use the standard deviation to spot those who tend to give
the same rating, or polar extremes, but there's nothing uniquely
identifiable about the SD for these 'flat' users.

Except that they have standard deviation zero, if I have understood
your notation 10 1s, 10 2s, etc., which you did not explain.

illywhacker;

They have given 100 scores: 10 of each possible score from 1 to 10
(10 x 10 = 100).
sum = 550, mean = 5.5, SD = 2.87 (population SD)
The problem is that any number of distributions could have that SD; it
doesn't uniquely identify a flat distribution.
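
To make that concrete, here is a small sketch (not from the thread) comparing the flat user with a made-up "clumped" user. The clumped counts (20 ones, 30 fives, 30 sixes, 20 tens) and the helper name expand are assumptions chosen purely for illustration; the two users end up with the same mean and the same population SD.

```python
from statistics import mean, pstdev

def expand(counts):
    """Turn a {score: count} mapping into a list of individual ratings."""
    return [score for score, n in counts.items() for _ in range(n)]

# The flat user: 10 of each score from 1 to 10.
flat = expand({s: 10 for s in range(1, 11)})

# A hypothetical non-flat user, built to match the flat user's mean and SD.
clumped = expand({1: 20, 5: 30, 6: 30, 10: 20})

for name, ratings in (("flat", flat), ("clumped", clumped)):
    # pstdev is the population standard deviation (divides by n, not n - 1).
    print(name, len(ratings), mean(ratings), round(pstdev(ratings), 3))
```

Both lines print the same mean and SD, so the SD alone cannot separate the two users.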

Sorry for lack of clarity; when naming the subject of this question I
was trying to think of synonyms for flat/even/level... is there an
accepted term for this that statisticians recognize?

You can also try the entropy, which is the classic measure of
uniformity. If the numbers of scores of 1, 2,...i,...10 for an
individual are notated n_{i}, then define the proportions p_{i} to be
n_{i}/100. Then the entropy is

H(p) = - \sum_{i} p_{i} \log(p_{i}).

If the logarithm is taken to base 2, H is measured in bits and gives
(to within one question) the minimum number of yes/no questions you
would need to ask on average to find out the score, assuming you know
the p_{i}. Terms with p_{i} = 0 are taken to contribute 0. The maximum
H can reach is log(10), about 3.32 bits in base 2, when all the p_{i}
are the same, i.e. 1/10. The minimum is 0, when one of the p_{i} = 1.
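
A minimal sketch of this calculation for one user, assuming the ratings arrive as a plain list of integers from 1 to 10. The function names rating_entropy and flatness are made up for this example, and dividing by the maximum entropy is just one convenient way to get a score between 0 and 1.

```python
from collections import Counter
from math import log2

def rating_entropy(ratings, scores=range(1, 11)):
    """Shannon entropy, in bits, of one user's rating histogram."""
    counts = Counter(ratings)
    total = len(ratings)
    h = 0.0
    for s in scores:
        p = counts[s] / total      # p_i = n_i / (number of ratings)
        if p > 0:                  # a score never used contributes 0
            h -= p * log2(p)
    return h

def flatness(ratings, scores=range(1, 11)):
    """Entropy divided by its maximum, log2(number of possible scores)."""
    return rating_entropy(ratings, scores) / log2(len(scores))

flat = [s for s in range(1, 11) for _ in range(10)]   # 10 of each score
constant = [7] * 100                                  # same rating every time

print(flatness(flat))       # very close to 1: perfectly flat
print(flatness(constant))   # 0: no spread at all
```

With only 100 ratings per user, an exactly flat histogram will be rare, so in practice one would probably flag users whose flatness is above some cutoff; where to put that cutoff is a judgment call the thread does not settle.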

It does not take the ordering/neighbourhood relations (i.e. that 4
occurs next to 3 and 5, etc.) into account though.
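
A quick illustration of that limitation, using two invented users: one splits ratings between the neighbouring scores 1 and 2, the other between the extremes 1 and 10. The function name entropy_bits is an assumption for this sketch.

```python
from collections import Counter
from math import log2

def entropy_bits(ratings):
    """Shannon entropy in bits, computed only from how often each score occurs."""
    total = len(ratings)
    return -sum((n / total) * log2(n / total) for n in Counter(ratings).values())

neighbours = [1] * 50 + [2] * 50    # half 1s, half 2s
extremes = [1] * 50 + [10] * 50     # half 1s, half 10s

# Entropy sees only the proportions, not which scores they belong to,
# so these two very different users get the same value.
print(entropy_bits(neighbours), entropy_bits(extremes))
```

So entropy is best paired with something like the SD if the difference between these two kinds of user matters.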

illywhacker;