Re: Statistical Analyses of Non-Static Group Question



Hi Gary,

OK, great -- glad that I could be of assistance! I guess the only other
thing I would mention is the idea that, at any given point in time, these
"bundles" of participant input might be "scored" relative to their expected
value. In a sense, I think that that's at the heart of what you're after
here. In mathematical terms, you have two time-based functions for each
individual. One function is their expected output / behavior, F{e}(x), and
the other is their observed / actual behavior, F{o}(x), and what you're
ultimately interested in is the difference between the two curves (aside: I use
curly braces, {}, to indicate subscripted font in plain email):


Example of a participant slightly exceeding expectations

|
|
S | ............. oooo
c | ...... .... oooo
o | ........ ooooooooooooo .. oooo
r | .... ...oooooo ... oooooooo ooo..
e | .. ooooo. .. oooooo oooooo ..
|.. oooo .. ... oooo ...
| ooo ......... ....
| o ..
|
+------------------------------------------------------------------------
Time ---->

. = expected, o = observed


Importantly, the function for expected participant performance, F{e}(x), is
INDEED a function, but I don't know that it needs to be a complicated one.
All it needs to do is generate a curve that approximates what you think
"average" or "acceptable" or "typical" performance looks like over time.
You already have a lot of data available to help you estimate such a curve
empirically, so it ought to be just a matter of digging through the data,
picking a few representative "typical" performers, and then coming up with
an idealized typical-performance curve. (Probably not a bad idea to graph
out a few high and low performers as well, just for comparison).
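
As a rough illustration of building an idealized curve from a few representative performers, here is a minimal sketch. It assumes every curve is sampled at the same time points and takes the pointwise median; the function name `typical_curve` and the sample scores are made up for the example, not taken from the testing data.

```python
from statistics import median


def typical_curve(curves):
    """Return the pointwise median of several equally sampled score curves."""
    if not curves:
        raise ValueError("need at least one curve")
    length = len(curves[0])
    if any(len(curve) != length for curve in curves):
        raise ValueError("all curves must cover the same time points")
    # zip(*curves) groups the scores of all performers at each time point.
    return [median(values) for values in zip(*curves)]


# Hypothetical scores for three "typical" testers over five time periods.
representatives = [
    [1, 3, 5, 4, 2],
    [2, 4, 6, 5, 3],
    [0, 2, 4, 3, 1],
]
expected = typical_curve(representatives)
print(expected)
```

The median is used here only because it is less sensitive to one unusual tester than the mean; either would fit the description above.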

I suspect that, beyond this, all that would be needed is to adjust the curve
for two factors -- the participant's level of testing experience and the
general product class -- so that you then have a small family of curves
covering the entire range of "typical" tester performance. (By small, I
would think no more than 6-8, say based on two experience levels and four
product classes).
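
One way to picture the small family of curves is as a lookup table keyed by (experience level, product class). The sketch below assumes, purely for illustration, that each adjustment is a simple multiplicative scaling of one base curve; the labels, factors, and function names are all hypothetical.

```python
from itertools import product

EXPERIENCE_LEVELS = ("novice", "experienced")  # hypothetical labels
PRODUCT_CLASSES = ("A", "B", "C", "D")         # hypothetical labels


def build_family(base, experience_factor, class_factor):
    """Build one expected curve per (experience level, product class) pair.

    Assumption: each factor simply scales the base curve. A real
    adjustment might also change the shape or timing of the curve.
    """
    family = {}
    for level, cls in product(EXPERIENCE_LEVELS, PRODUCT_CLASSES):
        scale = experience_factor[level] * class_factor[cls]
        family[(level, cls)] = [scale * score for score in base]
    return family


base_curve = [2, 4, 6]  # made-up idealized curve
family = build_family(
    base_curve,
    experience_factor={"novice": 0.5, "experienced": 1.0},
    class_factor={"A": 1.0, "B": 1.5, "C": 2.0, "D": 0.5},
)
print(len(family))             # two levels x four classes
print(family[("novice", "B")])
```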

Then, the analysis simply becomes a matter of comparing the appropriate
expected and observed curves for each participant. That is, simply look at
the difference in the area under the curves (observed minus expected):
Typical performers should have a difference around zero, underperformers
negative, overperformers positive.
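
The area comparison can be sketched as follows, assuming both curves are sampled at the same time points and using the trapezoidal rule to approximate the area. The function names and the numbers are illustrative only.

```python
def area_under(times, scores):
    """Approximate the area under a sampled curve with the trapezoidal rule."""
    if len(times) != len(scores):
        raise ValueError("times and scores must have the same length")
    points = list(zip(times, scores))
    return sum(
        (t1 - t0) * (s0 + s1) / 2
        for (t0, s0), (t1, s1) in zip(points, points[1:])
    )


def area_difference(times, observed, expected):
    """Observed area minus expected area: > 0 over, < 0 under, ~0 typical."""
    return area_under(times, observed) - area_under(times, expected)


# Made-up example of a participant slightly exceeding expectations.
times = [0, 1, 2, 3, 4]
expected_scores = [0, 2, 4, 2, 0]
observed_scores = [0, 3, 5, 3, 1]
diff = area_difference(times, observed_scores, expected_scores)
print(diff)
```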

The only slight complication is that the observed curve might sometimes have
to be phase shifted (in time) so that it's at its "best fit" position
relative to the expected curve, which is very easy to do. Of course, if a
participant's observed curve needs to be shifted too much, say beyond
10-15%, then there's probably a true lag issue, which is fine, but you might
then want to deduct a few points from that participant's final score to
account for that. Also, there will need to be a maximum limit on the amount
of phase shift that's allowed (probably no more than 20-30%, if even that)
-- beyond this, the results will probably start to get pretty spurious.
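
A minimal sketch of the phase-shift step, under some stated assumptions: both curves have the same number of samples, "best fit" means the smallest mean squared gap over the overlapping samples, the shift is capped at a fraction of the curve length, and a fixed number of points is deducted once the shift passes a threshold. The names, the 25% cap, the 10% threshold, and the 5-point deduction are placeholders, not recommendations.

```python
def mean_sq_gap(observed, expected, shift):
    """Mean squared gap after moving the observed curve `shift` samples earlier.

    A positive shift means the participant lagged behind the expected curve.
    Only the samples where the two curves overlap are compared.
    """
    pairs = [
        (observed[i + shift], expected[i])
        for i in range(len(expected))
        if 0 <= i + shift < len(observed)
    ]
    return sum((o - e) ** 2 for o, e in pairs) / len(pairs)


def best_shift(observed, expected, max_fraction=0.25):
    """Search shifts within +/- max_fraction of the curve length."""
    if len(observed) != len(expected):
        raise ValueError("curves must have the same number of samples")
    if not 0 <= max_fraction < 1:
        raise ValueError("max_fraction must be in [0, 1)")
    limit = int(max_fraction * len(expected))
    # Prefer the smallest gap; on ties, prefer the smaller shift.
    return min(
        range(-limit, limit + 1),
        key=lambda s: (mean_sq_gap(observed, expected, s), abs(s)),
    )


def lag_penalty(shift, length, threshold=0.10, points=5):
    """Deduct `points` (an arbitrary value) if the shift exceeds the threshold."""
    return points if abs(shift) / length > threshold else 0


# Made-up curves: the observed curve is the expected one, one sample late.
expected_curve = [0, 1, 3, 5, 3, 1, 0, 0]
observed_curve = [0, 0, 1, 3, 5, 3, 1, 0]
shift = best_shift(observed_curve, expected_curve)
penalty = lag_penalty(shift, len(expected_curve))
print(shift, penalty)
```

Here a one-sample lag on an eight-sample curve is 12.5% of the length, so it falls inside the allowed search range but past the assumed penalty threshold.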

Lastly, as you said, the ultimate goal here is not to focus on finding bad
testers, but rather evaluate various testing groups and the testing effort
as a whole. That's the nice thing about approaching the analysis the way
I'm suggesting here. Once you are able to analyze individual participants
in this way, extending it to considering a group of participants (e.g., all
the testers for a particular product) or all participants (i.e., the whole
testing program) is completely straightforward. Simply integrate the
expected PCs for a group of participants for a certain time period,
integrate the observed PCs for the same participants and over the same time
period, and the resultant curves should reveal a lot of what I think you're
looking for, and perhaps then some. For example, comparing the curves will
give you information not only about how on target things are, but also let
you see when and where deviations are occurring, both good and bad (i.e.,
overachievement or underachievement).
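
Extending the comparison to a group can be sketched like this, reading "integrate" as adding up the individual curves at each time point (an interpretation, not something stated explicitly above). The function names and scores are hypothetical, and all curves are assumed to share the same time points.

```python
def sum_curves(curves):
    """Add several equally sampled curves together, time point by time point."""
    return [sum(values) for values in zip(*curves)]


def group_deviation(observed_curves, expected_curves):
    """Per-period gap between the group's observed and expected totals.

    Positive entries mark periods of overachievement, negative entries
    mark periods of underachievement.
    """
    total_observed = sum_curves(observed_curves)
    total_expected = sum_curves(expected_curves)
    return [o - e for o, e in zip(total_observed, total_expected)]


# Made-up curves for two testers on one product over three periods.
expected_group = [[1, 2, 3], [2, 3, 4]]
observed_group = [[1, 3, 2], [2, 4, 3]]
deviation = group_deviation(observed_group, expected_group)
print(deviation)
```

The sign and position of each entry in the result show when and where the group ran ahead of or behind expectations.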

Again, I hope this helps!

Best wishes,
Doug


On Sun, 12 Aug 2007 19:54:46 -0700, gwcallahan1@xxxxxxxxxxxx
<gwcallahan1@xxxxxxxxxxxx> wrote:
Doug,

Your analysis, to say the least, is spot on. In the testing program
there are 20 some odd projects representing 26 or so different models
of product that in all cases have the same basic functionality. Over
time the product has been enhanced with different feature sets and
capabilities. To date none of the products have been retired though
...
