Re: How to identify flat (even) distributions?



On 11 Dec, 15:57, illywhacker <illywac...@xxxxxxxxx> wrote:
On Dec 11, 4:46 pm, Steve555 <foursh...@xxxxxxxxxxxxxx> wrote:



On 11 Dec, 15:07, illywhacker <illywac...@xxxxxxxxx> wrote:

On Dec 11, 3:49 pm, Steve555 <foursh...@xxxxxxxxxxxxxx> wrote:

On 11 Dec, 13:53, illywhacker <illywac...@xxxxxxxxx> wrote:

On Dec 11, 1:59 pm, Steve555 <foursh...@xxxxxxxxxxxxxx> wrote:

On 11 Dec, 11:25, illywhacker <illywac...@xxxxxxxxx> wrote:
There is no predefined calculation to solve this problem. I stated
that my hunch (idea) was that some people - based on their scoring
distribution - might improve (or add noise to) my prediction
algorithm. I then asked the learned people here how best to judge
these scoring distributions.
I think it means: "It is easier to notice how simplistic a model is
when it is explicit." But what's that got to do with the price of fish?

For example, what do you mean by the word 'improve' above?

illywhacker;

To make my prediction algorithm more accurate and reliable.
(...looking for patterns in other users' scores of a song to predict
the score another user will give)

OK, so your prediction is of a new user's score based on previous
observations of users' scores. How will you measure how good your
prediction is?

illywhacker;

I've kept some samples aside (i.e. samples that are not part of the
observations used in the analysis), and I'm collecting new samples.
These are the novel scores I'm trying to predict. I'm comparing them
to my predictions and calculating RMS errors. I consider a prediction
to be improved when the RMSE goes down.
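
As a rough illustration of the hold-out comparison described above, here is a minimal Python sketch of the RMSE calculation. The function name and the sample scores are made up for the example and are not taken from the actual data or code.

```python
import math

def rmse(predicted, actual):
    """Root-mean-square error between predicted and held-out scores."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical held-out scores and the predictions made for them.
held_out_actual = [4, 2, 5, 3]
held_out_predicted = [3.5, 2.5, 4.0, 3.0]
print(rmse(held_out_predicted, held_out_actual))
```

A lower value on the same held-out samples means the predictions are closer to the real scores on average.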

Before even using the data for my pattern-matching algorithm (the
details of which I think are off-topic), I'm trying experiments like
removing bipolar scorers.
If the RMSE drops after removing these samples, I'll check at the
95% significance level that this is not just coincidence.
I'll then consider that pre-screening the data this way is an
improvement.
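
One possible way to run that kind of check is a paired sign-flip permutation test on the held-out errors. This is only a sketch under assumptions: the post does not say which test is intended, the function and variable names are invented here, and it assumes both runs are scored on the same held-out samples.

```python
import random

def paired_permutation_p_value(errors_before, errors_after,
                               n_permutations=10000, seed=0):
    """One-sided p-value for 'screening did not reduce mean squared error'.

    errors_before / errors_after are prediction errors on the SAME held-out
    samples, without and with pre-screening of the training data.
    """
    if len(errors_before) != len(errors_after) or not errors_before:
        raise ValueError("need two non-empty sequences of equal length")
    # Positive difference = squared error got smaller after screening.
    diffs = [b ** 2 - a ** 2 for b, a in zip(errors_before, errors_after)]
    observed = sum(diffs) / len(diffs)
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the sign of each difference is arbitrary.
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs) / len(diffs)
        if flipped >= observed:
            at_least_as_large += 1
    return (at_least_as_large + 1) / (n_permutations + 1)

# Hypothetical errors: screening halves every error on 20 held-out samples.
p = paired_permutation_p_value([2.0] * 20, [1.0] * 20)
print(p < 0.05)
```

A p-value below 0.05 corresponds to the 95% level mentioned above. It says nothing about whether the improvement is large enough to matter.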

I am intrigued. You are measuring some other attributes of the user, I
guess, to make your predictions?

I do not think the details of the pattern-matching algorithm (I take
it this is the prediction algorithm) are off-topic. They are crucial,
since they constitute your model, and nothing is model-independent,
least of all a measure of whether users are relevant or not to
prediction.

Are you aware of the grievous flaws of classical significance tests?
See

http://bayes.wustl.edu/etj/articles/confidence.pdf.

illywhacker;
I think this is off topic, but this exchange has made me question a
few not-so-obvious assumptions so I'm curious to see where it's
going.

Thanks for the link. Bayes is on my to-do list.
I should have made this disclaimer to start with: I'm not a very good
statistician or programmer, I'm a musician and recording engineer, and
I think I have some ideas about what shapes people's taste in music.
I'm determined to try these ideas even if I'm obviously wrong because
it's a great motivation to learn about statistics, rather than
studying just for the sake of it. Current recommendation systems are
not that good, and it's my guess that some very talented statisticians
have worked on them, judging by the number of eCommerce systems that
employ them to gain a competitive edge. To me that suggests that
either their ideas aren't very creative, or more likely, that they
don't have the time/budget to justify exploring tenuous theories. For
me it's just a learning exercise, so I'll explore anything that sounds
fun.

So that's a very long-winded way of saying, no, I'm not aware of the
grievous flaws. I'm a beginner who yesterday didn't know that
'uniform' was the correct word. At this stage I just have to trust in
some concepts as black boxes, with the intention of re-examining them
later.
To answer your other question: I'm using Kohonen and back-propagation
nets to tune a system based on a metaphor of 'friends of friends': that
both listeners and genres are connected by a chain of others according
to some hidden attributes.

Steve



