Measurement error estimation
- From: Ramzi <rwnahhas@xxxxxxxxx>
- Date: Fri, 02 Nov 2007 09:34:40 -0700
It seems that there should be no way to partition total variance into
population variability and measurement error unless one has more than
one measurement per observation. Yet in SAS when I have a random
coefficient model (vis the RANDOM statement) + serial correlation (via
the REPEATED statement) with only one measurement per point in time,
and then I add in the LOCAL option to REPEATED, SAS does this very
partitioning.
I did the same thing in R with a simple simulation and got the same
results. My R code is below. It seems to me that the fitting algorithm
is guessing an answer somehow! Are these two variance components
really uniquely estimable with only one measurement per observation or
is the answer I am getting a particular solution to a non-unique pair
of quantities?
Ramzi
R simulation code:
# The purpose of this program is to figure out how a mixed model
estimates measurement error when there is only one
# measurement per observation. It seems like it would be impossible.
With more than one measurement per observation,
# the estimate is the same as the pooled within subject variation.
With only one measurement per observation, there
# is no within subject variation to pool, but lme() and MIXED still
give an answer! They take the total variation and
# split it into two parts, but how?
sim <- function(n, ni, MU, SIGX, SIGM)
{
X <- rnorm(n, MU, SIGX)
SUBJECT <- 1:n
n <- length(X)
M <- matrix(0, nrow=n, ncol=ni)
for(i in 1:n)
M[i,] <- X[i] + rnorm(ni, 0, SIGM)
xbar <- mean(X)
mbar <- mean(M)
meas.error.est <- sqrt(mean(apply(M, 1, var)))
sdM <- sd(c(M))
dat <- data.frame(rep(X, ni), rep(SUBJECT, ni), c(M), c(M)-xbar)
names(dat) <- c("X", "SUBJECT", "M", "M-Xbar")
final <- list(M, dat, xbar, mbar, meas.error.est, sdM)
names(final) <- c("M", "dat", "xbar", "mbar", "meas.error.est",
"sdM")
return(final)
}
runlme <- function(dat, ...)
{
mylme <- lme(M ~ 1, random = ~ 1 | SUBJECT, data=dat, ...)
return(mylme)
}
# 3 measurements per observations
tmp3 <- sim(n=10, ni=3, MU=100, SIGX=10, SIGM=2)
tmp3
runlme(tmp3[[2]])
# It works perfectly... beta-hat = mbar, Residual error =
meas.error.est
# NOTE: meas.error.est is a pooled estimate of the within subject sd.
lme() estimates the residual sd to be the same as that.
# sdM is the sd of the measurements, ignoring subjects. It
should be the sqrt of the sum of squared variance components.
# It does NOT equal that when ni=3 (three measurements per
observation).
# Here, sdM is about the same as the random intercept sd and
# the pooled within subject sd is exactly the same as the
residual variance sd (measurement error).
# Is REML the reason?
runlme(tmp3[[2]], method="ML")
# No... sdM is still not the same as sqrt(sum of squared variance
components)
# 1 measurement per observation
tmp1 <- sim(n=10, ni=1, MU=100, SIGX=10, SIGM=2)
tmp1
runlme(tmp1[[2]])
# ni=1 (one measurement per observation)
# I still get an estimate for the two variance components.
# In this case, sdM DOES equal the sum of squared variance components.
# Here, sdM IS the same as the sum of squared variance
components and there is no pooled within subject sd.
# Somehow lme() splits sdM into two parts to estimate the two
variance components.
# As a result, it is underestimating the between-subject
variability!
# How is it estimating the two quantities without repeated
measurements of the same observation?
# The within subject sd is not estimable, and so can't be
pooled to get meas.error.est.
tmp1 <- sim(n=1000, ni=1, MU=100, SIGX=10, SIGM=2)
c(tmp1[[3]],tmp1[[4]],tmp1[[5]],tmp1[[6]])
runlme(tmp1[[2]])
# True values are 10 and 2. With 1 meas per obs, lme() estimates 9.7
and 3.6. It way overestimates the measurement error.
# But how does it arrive at any estimate at all?!
tmp3 <- sim(n=5000, ni=10, MU=100, SIGX=10, SIGM=2)
c(tmp3[[3]],tmp3[[4]],tmp3[[5]],tmp3[[6]])
runlme(tmp3[[2]])
# With 10 meas per obs, lme() gets it almost dead on.
# Also, sdM is almost (but not exactly) the sum of the squared
variance components.
# CONCLUSION: I still have no idea how it estimates the measurement
error without replication! Is it ambiguous?
.
- Follow-Ups:
- Re: Measurement error estimation
- From: Ramzi
- Re: Measurement error estimation
- Prev by Date: Re: time series hypothesis test
- Next by Date: no news group for R ?
- Previous by thread: experimental design question again
- Next by thread: Re: Measurement error estimation
- Index(es):
Relevant Pages
|