Re: Simulation Process and Independent Vs Paired Samples
- From: duncan smith <buzzard@xxxxxxxxxxxxxxxxxxxxx>
- Date: Sun, 27 Apr 2008 16:54:20 +0100
Hanspeter wrote:
I have a question about Simulation Process and Independent Vs Paired Samples.
Given two related samples, in order to enhance the sensitivity of the t-test, a 1 sample t-test should be applied to the differences between the two samples (null hypothesis = the average difference is zero), instead of applying a 2 sample t-test to the two samples separately (null hypothesis = both samples come from the same population).
However, simulations made to decide whether or not to modify a certain device yield paired (= related) samples (i.e. before- and after-modification measurements) spontaneously.
Sometimes it happens that the differences between the two samples turn out significantly different from zero (i.e. the 1 sample t-test is significant), while both samples seem to come from the same population (i.e. the 2 sample t-test is not significant).
In these cases, I do not want to modify the device under test because I am not able to reject the null hypothesis of the 2 sample t-test. The "paired t-test" (i.e. the 1 sample t-test applied to the differences between the two samples) suggests modifying the device under test, however.
What is the proper treatment of paired samples resulting spontaneously from simulation?
On Sat, 26 Apr 2008 16:49:47 -0400, Paul Rubin <rubin@xxxxxxx> wrote:
If the simulation generates pairs of observations (one from each population), and if the paired observations are correlated with each other, then the paired-difference t-test should be more accurate. The two sample t-test calculates the variance of the difference in means under the assumption that the samples are mutually independent, and will overestimate the variance of the difference in means (reducing power) if there is a positive correlation between pairs of observations.
Richard Ulrich wrote:
I would go further, and state that if the observations are correlated, the grouped test is *wrong* -- the assumption of independence was not met, and the question then is how much it matters. If you get different results, then it apparently matters.
Similarly, a negative correlation will cause the Student's test to underestimate the variance, and yield too much power. (Someone once posted a related question concerning a forced-choice design of this sort, where Left+Right < or = 10.)
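The point about over- and underestimated variance can be checked numerically. The following is a minimal sketch, not anything posted in the thread: the function names, the normal distributions and all parameter values are assumptions chosen for illustration. It builds pairs that share a common component (positive correlation) or have it sign-flipped (negative correlation), then compares the standard error the two sample formula reports with the standard error of the paired differences.

```python
import math
import random
import statistics


def simulate_pairs(n, shared_sd, noise_sd, effect, sign=1.0, seed=0):
    """Generate n pairs (x, y); sign=+1 gives positive, sign=-1 negative correlation."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        shared = rng.gauss(0.0, shared_sd)  # component common to both members of a pair
        x.append(shared + rng.gauss(0.0, noise_sd))
        y.append(sign * shared + effect + rng.gauss(0.0, noise_sd))
    return x, y


def standard_errors(x, y):
    """Return (two_sample_se, paired_se) for the difference in means."""
    n = len(x)
    # Two sample form: treats x and y as mutually independent samples.
    two_sample_se = math.sqrt(statistics.variance(x) / n + statistics.variance(y) / n)
    # Paired form: uses the variability of the within-pair differences.
    diffs = [b - a for a, b in zip(x, y)]
    paired_se = statistics.stdev(diffs) / math.sqrt(n)
    return two_sample_se, paired_se


for label, sign in (("positive correlation", 1.0), ("negative correlation", -1.0)):
    x, y = simulate_pairs(n=200, shared_sd=2.0, noise_sd=0.5, effect=0.3, sign=sign)
    two_se, paired_se = standard_errors(x, y)
    print(f"{label}: two sample SE = {two_se:.3f}, paired SE = {paired_se:.3f}")
```

Both tests use the same estimate of the mean difference, so any disagreement between them comes entirely from the standard error. With positively correlated pairs the two sample figure is the larger one; with negatively correlated pairs it is the smaller one.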
I'm in the process of writing simulation software that can generate paired values (patient histories). Basically, at some point in a patient's lifecourse, choices can be made regarding treatment. So a simulated patient can fork into separate simulations when such a point is reached (and fork again if more than one treatment is involved). The histories are identical up to the point different treatments can be applied.
One suggestion that was made to me was to use the same random numbers for each 'post-fork' simulation. I quickly dismissed that, as it introduces correlations that depend on the detailed implementation of the random variate generators. But it doesn't (to me) seem unreasonable for the pre-fork histories to be identical.
The simulation model basically models transitions between states until, eventually, a patient dies. If a treatment is assumed solely to prolong the expected length of time spent in state X before transition to state Y, then there is an argument that the with- and without-treatment records should differ only in the length of time spent in state X. It has also been argued (not by me) that the same uniform random variate should be used to generate the length of time in state X in both records (when using the inversion method of random variate generation). I can easily dismiss the latter as unreasonable, although whichever way I choose to generate the two times to transition, I am making an implicit assumption (untestable in the real world) about how they are related. There are also practical reasons why pairing records after a fork is difficult: in most cases there are competing risks, and the effect of a treatment might be that a patient moves from state X to state Z, rather than state Y.
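To make the implicit assumption concrete, here is a small sketch of what reusing one uniform variate does under the inversion method. It assumes, purely for illustration, that the time in state X is exponentially distributed; the rates and function names are hypothetical and not taken from the software described above.

```python
import math
import random


def exp_time_by_inversion(u, rate):
    """Inverse-CDF draw: turn a uniform u in [0, 1) into an exponential time."""
    return -math.log(1.0 - u) / rate


def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    va = sum((p - ma) ** 2 for p in a)
    vb = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(va * vb)


def times_in_state_x(n, rate_untreated, rate_treated, common_uniform, seed=0):
    """Return (untreated_times, treated_times) for n simulated patients."""
    rng = random.Random(seed)
    untreated, treated = [], []
    for _ in range(n):
        u0 = rng.random()
        # Either reuse the same uniform in the treated arm or draw a fresh one.
        u1 = u0 if common_uniform else rng.random()
        untreated.append(exp_time_by_inversion(u0, rate_untreated))
        treated.append(exp_time_by_inversion(u1, rate_treated))
    return untreated, treated


# Hypothetical rates: treatment lowers the exit rate from state X.
RATE_UNTREATED, RATE_TREATED = 0.5, 0.4

shared_u = times_in_state_x(5000, RATE_UNTREATED, RATE_TREATED, common_uniform=True)
separate_u = times_in_state_x(5000, RATE_UNTREATED, RATE_TREATED, common_uniform=False)

print("correlation, common uniform:   ", pearson(*shared_u))
print("correlation, separate uniforms:", pearson(*separate_u))
```

In this exponential case the common uniform makes every treated time exactly `RATE_UNTREATED / RATE_TREATED` times the untreated time, so the treatment is modelled as a deterministic stretch of each patient's stay. Separate uniforms instead assume the two times are independent. Both are assumptions about a joint distribution that cannot be observed, since no real patient is both treated and untreated.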
The point of pairing the records is simply to reduce the computational time necessary to give a given degree of confidence in model predictions; typically changes in life expectancy. Judicious pairing seems reasonable to me, if the question relates to the predicted effects on a particular cohort of patients (and not otherwise).
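The computational saving from pairing can be illustrated with a toy forked simulation. This is a sketch under stated assumptions, not the actual model: there is a single exponential pre-fork period shared by both arms, independent exponential post-fork times, and a normal-theory confidence interval. All rates, names and the target half-width are hypothetical.

```python
import math
import random
import statistics


def simulate_forked_patient(rng, pre_rate, rate_untreated, rate_treated):
    """Return (untreated_lifetime, treated_lifetime) sharing one pre-fork history."""
    pre_fork = rng.expovariate(pre_rate)  # identical in both arms
    # Post-fork times are drawn independently for each arm.
    untreated = pre_fork + rng.expovariate(rate_untreated)
    treated = pre_fork + rng.expovariate(rate_treated)
    return untreated, treated


def runs_needed(sd, half_width, z=1.96):
    """Approximate number of runs for a normal-theory CI of the given half-width."""
    return math.ceil((z * sd / half_width) ** 2)


rng = random.Random(1)
pairs = [simulate_forked_patient(rng, 0.05, 0.5, 0.4) for _ in range(20000)]
untreated = [u for u, _ in pairs]
treated = [t for _, t in pairs]

# Paired design: one forked patient per run, analysed via within-pair differences.
sd_paired = statistics.stdev([t - u for u, t in pairs])
# Unpaired design: one untreated and one treated patient per run, simulated
# separately, so the variances of the two arms add.
sd_independent = math.sqrt(statistics.variance(untreated) + statistics.variance(treated))

target = 0.1  # desired CI half-width for the change in life expectancy
print("runs needed, paired:     ", runs_needed(sd_paired, target))
print("runs needed, independent:", runs_needed(sd_independent, target))
```

The longer and more variable the shared pre-fork history is relative to the post-fork period, the larger the saving, because that shared variability cancels in the paired differences.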
I'm not sure how relevant this is to the OP, but increasing the number of simulated cases will generally give as much power as you want. In my case it comes down to making inferences about model predictions whilst limiting computational cost. Regardless of the number of simulations and my consequent degree of confidence in what the model actually predicts, it doesn't tell me anything about how confident I should be in the model or its predictions. (I wish the people who are going to be using the software realised this; I'm confident they generally don't.) Cheers.
Duncan