Re: logistic skewed response



In article <1159374436.640642.104110@xxxxxxxxxxxxxxxxxxxxxxxxxxx>,
emblabac@xxxxxxxxxxx says...
Hi all,
I have about 10 million records, with about 20 predictors and a binary
response variable (0,1). About 5,000 of the records have a response of
1 with the rest being 0. I obviously would like to do some sampling,
but I'm not sure how to do so. Any suggestions?
Thanks!

Eric B




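This is a standard question in the discrete-choice literature, under the
rubric "choice-based sampling". Basically you can draw a random sample
of whatever size from each response group (here, for instance, all of
the 1-responses plus a random subset of the 0-responses), and then
weight each observation's contribution to the likelihood function by
the ratio of its group's true population proportion to its proportion
in the sample.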
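
For concreteness, here is a rough Python/NumPy sketch of the idea, not a
definitive implementation. It assumes you keep every 1-response and
subsample the 0-responses, and that the weight for a record in group j
is Q_j / H_j (population share over sample share). The function names,
the simulated data and the plain Newton-Raphson fitter are all made up
for illustration; in practice you would probably hand the weights to
whatever logistic routine you already use.

```python
import numpy as np


def choice_based_sample(y, n_zeros, rng):
    """Keep all 1-responses and draw n_zeros of the 0-responses at random.

    Returns the selected row indices and one weight per selected row,
    equal to (population share of its group) / (sample share of its group).
    """
    ones = np.flatnonzero(y == 1)
    zeros = np.flatnonzero(y == 0)
    kept_zeros = rng.choice(zeros, size=n_zeros, replace=False)
    idx = np.concatenate([ones, kept_zeros])

    q1 = y.mean()                  # population share of 1s (Q_1)
    h1 = ones.size / idx.size      # sample share of 1s (H_1)
    w = np.where(y[idx] == 1, q1 / h1, (1.0 - q1) / (1.0 - h1))
    return idx, w


def fit_weighted_logit(X, y, w, n_iter=25):
    """Maximise the weighted logit log-likelihood by Newton-Raphson.

    X must already contain a column of ones for the intercept.
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        grad = X.T @ (w * (y - p))                      # weighted score
        hess = (X * (w * p * (1.0 - p))[:, None]).T @ X  # weighted information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


# Simulated stand-in for the real data (hypothetical coefficients -5 and 1).
rng = np.random.default_rng(0)
n_pop = 200_000
x = rng.normal(size=n_pop)
p_true = 1.0 / (1.0 + np.exp(-(-5.0 + 1.0 * x)))
y = (rng.random(n_pop) < p_true).astype(int)

# Keep all 1s, plus five times as many randomly chosen 0s.
idx, w = choice_based_sample(y, n_zeros=5 * int(y.sum()), rng=rng)
X = np.column_stack([np.ones(idx.size), x[idx]])
beta_hat = fit_weighted_logit(X, y[idx], w)
print("weighted estimates (intercept, slope):", beta_hat)
```

With these weights the weighted share of 1s in the sample equals the
population share, which is what lets the intercept come out on the
population scale rather than the oversampled one.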


--
Philip A. Viton
Ohio State University