Software effectiveness
From: Bill Tan (billztan_at_yahoo.com)
Date: 11/23/04
Date: Tue, 23 Nov 2004 20:21:02 +0000 (UTC)
Statistical Advice Needed
Hi, I'm working on a research project to evaluate the effectiveness of
a new software technology for assessing the Spanish language skills of
healthcare workers, i.e. how well does a doctor or nurse speak Spanish?
How well can they communicate with a Spanish-speaking patient without
the help of a medical interpreter? This sort of assessment is usually
conducted by a live person, so having software that's capable of doing
that job will obviously be more efficient and cost-effective.
So the question I'm trying to answer is: how effective is this software?
We've come up with the following approach to evaluate its effectiveness;
if you have other ideas, please do let me know.
We will have a group of 60 healthcare workers (including doctors,
nurses, receptionists, etc.) from 3 proficiency levels in Spanish:
beginner, intermediate, advanced (so 20 persons per level). Candidates
will be given definitions of each level and asked to self-identify their
own proficiency level. This is so that we will have a random sample of
recruits who are not all from the same proficiency level. Once we have
60 people, we will test them using the software and record the scores.
Then we will ask the same 60 people to take a standardized language test
(probably the SAT Spanish test) and record their scores.
Here comes the evaluation part: the 2 sets of scores will be compared.
We won't compare the 2 scores of each individual, since the basis for
comparison is limited; rather, we will compare the distributions of the
2 score sets, i.e. each set of 60 scores will be graphed in some sort of
statistical chart, and the two charts will be compared. If the
distribution of the software scores is within a certain acceptable
percentage of variance (say 10%) from the SAT score distribution, then
the software is deemed "effective" (or reliable, or feasible). To be
more clear: if John is ranked #32 out of 60 on the SAT test, then he
should fall within the #29-35 range on the computer test (10% variance
in 60 people is 6 spots, so 3 to the left of #32, which is #29, and 3 to
the right of #32, which is #35).
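
To make the rank-window rule above concrete, here is a minimal Python
sketch of how the check could be computed. The function names, the
tie-breaking rule (ties keep their original order), and the choice that
rank 1 means the highest score are all assumptions made for
illustration; they are not part of the study design described above.

```python
def ranks(scores):
    """Return the rank of each score (1 = highest).

    Assumption: ties are broken by original order, which the
    description above does not specify.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, idx in enumerate(order, start=1):
        result[idx] = position
    return result


def half_window(n, tolerance=0.10):
    """Spots allowed on each side of the reference rank.

    With n = 60 and tolerance = 0.10 the full window is 6 spots,
    so 3 on each side, as in the John example.
    """
    return round(n * tolerance) // 2


def within_window(reference_scores, software_scores, tolerance=0.10):
    """For each person, True if the software rank falls inside the
    window around that person's rank on the reference test."""
    n = len(reference_scores)
    limit = half_window(n, tolerance)
    ref = ranks(reference_scores)
    sw = ranks(software_scores)
    return [abs(r - s) <= limit for r, s in zip(ref, sw)]
```

For example, within_window(sat_scores, software_scores) would return one
True/False flag per person (the two list names are hypothetical), and
the share of True values is the fraction of people whose software rank
stays inside the window.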
My questions are:
1. Is comparing the distributions of the 2 score sets a reasonable
approach? Are there other evaluation approaches?
2. In order to demonstrate effectiveness/feasibility/reliability of the
software, what percentage of variance should be used as the benchmark?
3. What is the best approach for comparing the 2 distributions? What is
the statistical method called? What is the name of the charts/graphs?
What are the statistical terms? This will help me find the right
language to describe the project in my proposal.
4. Is 60 people a sufficient sample size?
5. What are some potential limitations of the above approach?