Re: scaling your data and comparing percent change

From: Richard Ulrich (Rich.Ulrich_at_comcast.net)
Date: 03/18/05

    Date: Thu, 17 Mar 2005 23:31:50 -0500
    
    

    On 16 Mar 2005 06:56:13 -0800, "john" <jkalexan@gmail.com> wrote:

    > Hello,
    >
    > I have data from multiple experiments which were collected at different
    > times. Consequently, there is often a "scaling" effect - where the
    > data from one experiment may have a higher quantitative value from some
    > of the others.

     - okay, that is not unfamiliar.

    >
    > One option was to scale the data such that they are closer in value,
    > but by doing this I believe I am messing with the variance.

    One option for *what*? What are you trying to achieve?
    The default "option" is to report the data as you collected it.
    Why is that objectionable?

    >
    > let me explain by example (and yes, I know my N values are too small
    > ...)
    >
    > I have two conditions: control (C) and experimental (X)
    > I ran three experiments
    > experiment 1: C = 48.8 X = 5.48
    > experiment 2: C = 129.7 X = 42.4
    > experiment 3: C = 201.2 X = 140.7
    >
    > even though the X is always smaller than the C, the absolute values
    > cause a large variance - but this is mostly due to the experimental
    > conditions.

    Huh? What is that supposed to tell us, "the absolute values
    cause a large variance"? The range of C is 50 to 200; the
    range of X is 5 to 140, which is a bit less.

    At first I read this otherwise, but now I believe that there
    are only 6 values in all, 3 for C and 3 for X.

    > One approach I used was to make the average of each experiment 100
    > which yields:
    > experiment 1: C = 121.7 X = 78.3
    > experiment 2: C = 143.6 X = 56.4
    > experiment 3: C = 130.2 X = 69.8
    >
    > This makes them more comparable - but then the variance for each sample
    > is the same -
    > so I am uncertain if I can use statistical tools which
    > are based on variance (t-test)

    Well, for "standardizing", it might be more common to set
    Control as 100, and go on from there.
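
As a rough sketch of that rescaling, assuming one C and one X value per experiment as quoted above (the variable names are mine, not the poster's):

```python
# Values quoted in the question: one control (C) and one experimental (X)
# value per experiment.
control = [48.8, 129.7, 201.2]
treated = [5.48, 42.4, 140.7]

# Express each X as a percentage of its own experiment's control (C = 100).
percent_of_control = [round(100 * x / c, 1) for c, x in zip(control, treated)]
print(percent_of_control)
```

With Control fixed at 100, all of the information is carried by the three X percentages.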

    Again - What was wrong with the original set of values?

    >
    > alternatively, I could report the values as a percent change:
    > experiment 1: C = 100 X = 11.2
    > experiment 2: C = 100 X = 32.7
    > experiment 3: C = 100 X = 69.9
    >
    > but how do I compare these without a variance for the C group?

    If you think "percent change" is relevant, then you
    are looking at a multiplicative model, where taking logs
    makes sense for the originals.
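
To illustrate why logs fit a multiplicative model, here is a small sketch using the six values quoted above (names are illustrative only). On the log scale, the ratio X/C turns into the difference log(X) - log(C):

```python
import math

# Values quoted in the question (one C and one X per experiment).
control = [48.8, 129.7, 201.2]
treated = [5.48, 42.4, 140.7]

# Ratio of X to C within each experiment.
ratios = [x / c for c, x in zip(control, treated)]

# The same comparison on the log scale is a simple difference.
log_diffs = [math.log(x) - math.log(c) for c, x in zip(control, treated)]

for r, d in zip(ratios, log_diffs):
    print(f"ratio {r:.3f}   log difference {d:.3f}")
```

Exponentiating each log difference gives back the ratio, so nothing is lost by working with logs.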

    I saw some data once where the chemical reagent was
    different at three different periods of time, so that the range
    of scoring was non-overlapping for the dozens of assays
    from each time point. - BAD DATA of that sort does call
    for some standardizing. I guess, you *hope* that the
    quality is not so bad that standardizing is impossible to
    figure. That is -- if you can't trust anything except that
    "X is less than C", then you have a very weak test for
    differences. But all the scores *are* paired.

    Treating these as paired samples is going to result in
    a paired t-test, or its equivalent, a one-sample t-test.

    For a paired test, the hypothesis is that the mean of the
    differences is above (below) zero. If the ratio is what
    matters, then a suitable one-sample t-test is whether the
    mean log of the ratio is above (below) 0, that is, whether
    the ratio is above (below) 1. You would get the same test
    value by taking the logs of the two samples and doing a
    paired t in the usual way.
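
A minimal sketch of these tests, computed by hand with the Python standard library and assuming the three C/X pairs quoted above (the function names are mine; a statistics package would normally be used and would also supply the p-value):

```python
import math
import statistics

# Values quoted in the question (one C and one X per experiment).
control = [48.8, 129.7, 201.2]
treated = [5.48, 42.4, 140.7]

def one_sample_t(values, mu0=0.0):
    """t statistic for H0: mean(values) == mu0, with len(values) - 1 df."""
    n = len(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu0) / se

def paired_t(a, b):
    """Paired t statistic: a one-sample t on the pairwise differences a - b."""
    return one_sample_t([x - y for x, y in zip(a, b)])

# Paired t on the raw scores (additive model: is X - C below zero?).
t_raw = paired_t(treated, control)

# Paired t on the logs (multiplicative model).
t_log = paired_t([math.log(x) for x in treated],
                 [math.log(c) for c in control])

# One-sample t on the log ratios against 0: the same statistic as t_log.
t_log_ratio = one_sample_t([math.log(x / c) for c, x in zip(control, treated)])

print(f"t (raw differences) = {t_raw:.3f}, df = {len(control) - 1}")
print(f"t (log scale)       = {t_log:.3f}, df = {len(control) - 1}")
```

The paired t on the logs and the one-sample t on the log ratios agree exactly. A one-sample t on the untransformed ratios against 1 tests a closely related hypothesis, but it will not in general give the identical number.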

    -- 
    Rich Ulrich, wpilib@pitt.edu
    http://www.pitt.edu/~wpilib/index.html
    
