Re: Software effectiveness

From: JRKRideau (JohnKane9996_at_hotmail.com)
Date: 11/24/04


Date: 24 Nov 2004 13:12:35 -0800

billztan@yahoo.com (Bill Tan) wrote in message news:<gkgsm52x8id3@legacy>...
> Statistic Advice Needed
>
> Hi, I'm working on a research project to evaluate the effectiveness of
> a new software technology, for assessing the Spanish language skills
> of healthcare workers, ie. how well does a doctor or nurse speak
> Spanish? how well can they communicate with a Spanish-speaking
> patient, without the help of a medical interpreter? etc. This sort of
> assessment is usually conducted by a live person, so having a software
> that's capable of doing that job will obviously be more efficient and
> cost-effective.
>
> So the question I'm trying to answer is: how effective is this
> software? We've come up with the following approach to evaluate its
> effectiveness; if you have other ideas, please do let me know.
>
> We will have a group of 60 healthcare workers (including doctors,
> nurses, receptionists, etc.), from 3 proficiency levels in Spanish:
> beginner, intermediate, advanced (so 20 persons per level). Candidates
> will be given definitions of each level and asked to self-identify
> their own proficiency level. This is so that we will have a random
> sample of recruites who are not all from the same proficiency level.
> Once we have 60 people, we will test them using the software, and
> record the scores. Then, we will ask the same 60 people to take a
> standardized language test (probably the SAT Spanish test), and record
> their scores.
>
> Here comes the evaluation part: the 2 sets of scores will be compared.
> We won't compare the 2 scores of each individual, since the basis for
> comparison is limited; rather, we will compare the distributions of
> the 2 score sets, i.e. each set of 60 scores will be graphed in some
> sort of statistical chart, and the two charts will be compared. If the
> distribution of the software scores are within a certain acceptable
> percentage of variance (say 10%) from the SAT score distributions,
> then the software is deemed "effective" or reliable or feasible, etc.
> To be more clear: if John is ranked #32 out of 60 in the SAT test,
> then he should fall within the #29-35 range in the computer test (10%
> variance in 60 people is 6 spots, so 3 to the left of #32, which is
> #29, and 3 to the right of #32, which #35).
>
> My questions are:
> 1. is comparing distribution of the 2 score sets a reasonable
> approach? are there other evaluation approaches?

I don't understand the approach, but it does not look like any usual
approach to test construction that I have seen. You might want to
have a look at a couple of standard references in the area. Crocker,
Linda & Algina, James, Introduction to Classical and Modern Test
Theory, New York: Holt, Rinehart, and Winston, 1986, or Ghiselli,
Edwin, Measurement Theory for the Behavioral Sciences, San Francisco:
W. H. Freeman, 1981, are probably useful starts.

Ghiselli, in particular, is getting out of date, but it is a good,
fairly simple intro to test construction, and the basics really haven't
changed in classical test theory in 100 years.
 
> 2. In order to demonstrate effectiveness/feasibility/reliability of
> the software, what percentage of variance should be used as the
> benchmark?

I don't understand what you mean by the percentage of variance. Could
you explain a bit?

Otherwise, Question 2 appears to be asking 3 questions. As far as I
can see, only the question of reliability (used in the technical
psychometric sense) is a statistical question. As I understand the
terms, "effectiveness/feasibility" are practical questions that one
should ask 'before' developing a test and 'after' the test is
developed and crucial things such as the reliability and validity of
the test are known.

By the way, a good understanding of what reliability and validity mean
(in their technical psychometric senses) is crucial here.

> 3. what is the best approach for comparing the 2 distributions? What
> is the statistical method called? what is the name of the
> charts/graphs? what are the statistics lingos? This will help me find
> the right language to describe the project in my proposal.

I don't understand the approach so cannot comment except to say I
don't recognise it at all. See my response to Q. 2.
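
For anyone trying to picture the rank rule in the quoted post, here is
a minimal sketch in Python of one possible reading of it: rank everyone
on each test and count how many people land within a few places of
their reference rank. It also computes the Spearman rank correlation,
which is a more conventional summary of how well two rankings agree.
The function names and the six made-up scores are hypothetical, chosen
only for illustration, and tied scores are ignored for simplicity.

```python
def ranks(scores):
    """Return 1-based ranks, highest score = rank 1. Assumes no tied scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def share_within_window(reference, candidate, window=3):
    """Fraction of people whose candidate rank is within `window` places
    of their reference rank (the "#29-35 for #32" rule from the post)."""
    ref_ranks, cand_ranks = ranks(reference), ranks(candidate)
    hits = sum(abs(r - c) <= window for r, c in zip(ref_ranks, cand_ranks))
    return hits / len(reference)


def spearman_rho(reference, candidate):
    """Spearman rank correlation, 1 - 6*sum(d^2) / (n*(n^2 - 1)).
    This shortcut formula is only valid when there are no ties."""
    ref_ranks, cand_ranks = ranks(reference), ranks(candidate)
    n = len(reference)
    d_squared = sum((r - c) ** 2 for r, c in zip(ref_ranks, cand_ranks))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Hypothetical scores for six people (not real data).
sat_scores = [710, 650, 600, 540, 480, 400]
software_scores = [88, 91, 70, 72, 55, 40]

print(share_within_window(sat_scores, software_scores, window=1))
print(spearman_rho(sat_scores, software_scores))
```

Note that both of these compare individuals, not the two score
distributions as charts: two tests can have nearly identical histograms
while ranking the same people quite differently.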

> 4. Is 60 people a sufficient sample size?

I would think that 60 people is much, much too small a sample for a
test that apparently is going to be used in staffing functions,
particularly since you may need to demonstrate validities (levels of
validity?) for different job types. What I mean here is that you may
want to be able to show what level and 'type' of Spanish is required,
say, for an X-ray technician versus a consulting internal medicine
physician. I am pretty sure they are 'not' the same.

The issue of sample size depends on any number of factors, and you need
a good statistician or psychometrician to advise you, hopefully based
on such things as good job analyses and some idea of the structure,
length, etc. of the test.
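
To give a rough feel for one of those factors, here is a sketch
(Python, standard library only) of the approximate 95% confidence
interval around a validity correlation, using the Fisher z
transformation. The observed correlation of 0.70 is an assumed figure
for illustration, not a result from any study, and the function name is
hypothetical.

```python
import math


def correlation_ci(r, n, z_crit=1.96):
    """Approximate confidence interval for a Pearson correlation r from n
    pairs, using the Fisher z transformation (standard error 1/sqrt(n - 3))."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Assumed observed correlation of 0.70. Compare one self-identified
# level (20 people), the whole planned sample (60), and a larger study.
for n in (20, 60, 300):
    low, high = correlation_ci(0.70, n)
    print(f"n={n:3d}: {low:.2f} to {high:.2f}")
```

With 60 people the interval runs from roughly 0.54 to 0.81, and it gets
wider still once the sample is split by proficiency level or job type.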

> 5. what are some potential limitations of the above approach?

Bill,
Essentially, it sounds to me like you are walking into a minefield long
before you get into the statistical questions.

You need linguistic and psychometrics expertise, both hopefully with
experience in language proficiency assessment, on the team combined
with subject matter experts and from what I understand someone with
specialized job analysis skills. The subject matter experts may be
drawn from the possible pool of participants but not of course used in
the validation study.

A few issues that strike me are:

a) Are the tests aural/oral or will the software provide
reading/writing stimuli? Or require written responses, or require the
respondent to read possible responses as in written multiple choice
responses? I would assume that most or all healthcare workers'
interaction with patients will be verbal, not written.

b) Is the SAT Spanish an oral or written test? If my assumption in (a)
is correct and the SAT is written I think you have a very serious
problem to establish any kind of validity.

In regard to a) and b) above: It is quite possible to speak a
language fluently and not read it or, conversely, read/write it and not
speak it, at least in any comprehensible form. Old Latin students will
understand this.

Even if there is an aural/oral component in the SAT I am not sure that
you would have a good measure of validity (criterion validity,
technically). It is possible to speak a language well in one context
but not in another due to such things as vocabulary problems.

I, personally, know of two examples: 1) a friend with GRE scores in the
99th percentile in English but who did all his education in French. He
skis in French and sails in English. He has no idea of the technical
terms for each sport in the other language. 2) My former boss grew up
speaking French but all of her statistics vocabulary was learned in
English. She was going to help her younger brother in his stats course
but he was at a French language university and they did not have a
technical language in common.

c) Is the SAT measuring anything meaningful in terms of staff/patient
communication? If the SAT is measuring standard academic
accomplishments in Spanish, it may not be relevant (technically valid)
in a health-care setting as a criterion measure. The ability to use
the pluperfect subjunctive form of a verb may not be all that relevant
to the job of trying to understand a very sick patient.

d) Self-reporting of language skills sounds very dubious to me. People
may, for any number of reasons, under- or over-report their skill
levels, and as mentioned in c), where the proficiency lies is also
critical.

If one needs to defend the test in a human rights case, I think you are
leaving the client badly exposed. At least, I assume that this test is
intended for staffing actions of some kind and is quite likely to be
the subject of grievances and human rights challenges.

Actually, having said the above, the question of why you are asking for
self-report levels of language skills arises. Is this an attempt to
increase the expected range of scores on the test or the SAT? If so,
I suppose it makes some sense, since it would appear that it is
intended to help reduce range restriction problems, but as I said, I'm
not sure how much faith I would put in self-report. I'd run it by a
statistician/psychometrician and a language assessment expert for
opinions.
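
The range restriction point can be illustrated with a small simulation
(Python, standard library only). Everything in it is made up: the
simulated scores, the assumed population correlation of about 0.8, and
the cut that keeps only the top third of test takers. It is a sketch of
the general effect, not a model of this particular study.

```python
import math
import random


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(2004)  # fixed seed so the run is repeatable

# Simulated criterion scores (x) and new-test scores (y), built so that
# the population correlation is about 0.8.
x = [rng.gauss(0, 1) for _ in range(3000)]
y = [0.8 * xi + 0.6 * rng.gauss(0, 1) for xi in x]

# Keep only the top third on x, as if only advanced speakers took part.
cutoff = sorted(x)[2 * len(x) // 3]
restricted = [(xi, yi) for xi, yi in zip(x, y) if xi >= cutoff]

r_full = pearson_r(x, y)
r_restricted = pearson_r([p[0] for p in restricted], [p[1] for p in restricted])
print(f"full range: r = {r_full:.2f}, top third only: r = {r_restricted:.2f}")
```

The correlation computed on the top third alone comes out noticeably
lower than the one computed on the full range, even though the
underlying relationship is the same. That is the problem a deliberately
wide spread of proficiency levels is presumably meant to avoid.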

e) Have you established what is a reasonable working vocabulary for
your target population? Unless you know what this is, you probably
cannot claim to have anything like a valid test. Basically you need
comprehensive job analyses with special emphasis on language
requirements. The level of language production and comprehension
required is likely to vary widely by job type.

f) Are you dealing with one or more Spanish dialects? If so, how does
this affect the content of the test? This could be especially
critical for vocabulary and pronunciation for aural stimuli. My
knowledge of Spanish is virtually non-existent, but from work in the
English and French languages I know that dialectal differences can be
crucial. This can be true in both the written and spoken language.

g) Is the test material you are developing being composed by native
Spanish speakers with the appropriate dialect(s) and experience in the
health care field to ensure proper terminology and usage, or translated
from another language (English?)? If it is translated, do you have a
good quality control check on it? Assessment by native speakers or
back-translation may be approaches to look at.

You may well have considered everything I mentioned above but if not I
think they are some valid concerns.

Best of luck,

John Kane
Kingston ON Canada


