# Re: Regression with Dichotomous Dependent Variables

• From: Paul Rubin <rubin@xxxxxxx>
• Date: Sun, 29 Mar 2009 11:50:45 -0400

Rich Ulrich wrote:

What immediately occurs to me is "discriminant function" for two groups. Isn't the DF model of description appropriate?

A discriminant function will predict which of two groups an observation likely belongs to (here, predicting the choice being made). It won't necessarily provide a probability estimate, though, whereas a logistic regression will. When people use logistic regressions for discriminant analysis (which some do), I think they predict a response of 1 iff the probability estimate from the logistic regression is > 0.5 (or >= 0.5), assuming equal priors. (With unequal priors, adjust 0.5 accordingly.)
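To make the cutoff rule concrete, here is a minimal sketch in Python. It assumes the logistic model was fit on a sample in which group 1 makes up a fraction `sample_frac_1` of the cases, while the prior you actually believe is `prior_1`. Both names, and the helpers `adjusted_cutoff` and `classify`, are hypothetical and purely illustrative. The correction used is the usual odds rescaling for a shift in priors, which is one common way to "adjust 0.5 accordingly", not necessarily the only one.

```python
def adjusted_cutoff(sample_frac_1, prior_1):
    """Cutoff on the fitted probability that is equivalent to a 0.5
    cutoff on the prior-corrected probability.

    Corrected odds = fitted odds * (prior odds / sample odds), so the
    corrected probability exceeds 0.5 exactly when the fitted odds
    exceed r = sample odds / prior odds.
    """
    sample_odds = sample_frac_1 / (1.0 - sample_frac_1)
    prior_odds = prior_1 / (1.0 - prior_1)
    r = sample_odds / prior_odds
    return r / (1.0 + r)


def classify(p_hat, cutoff=0.5):
    """Predict 1 iff the fitted probability is strictly above the cutoff."""
    return [1 if p > cutoff else 0 for p in p_hat]


# Equal priors matching the sample: the cutoff stays at 0.5.
equal_cutoff = adjusted_cutoff(0.5, 0.5)

# Balanced training sample, but group 1 is believed to be rare (prior 0.1):
# the cutoff on the fitted probability rises well above 0.5.
rare_cutoff = adjusted_cutoff(0.5, 0.1)

labels = classify([0.2, 0.6, 0.95], cutoff=rare_cutoff)
```

With a balanced sample and a prior of 0.1 for group 1, an observation needs a fitted probability above roughly 0.9 before it is classified into group 1.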

There's also a difference in the underlying distributional assumptions. The Fisher (linear) and Smith (quadratic) discriminant functions are usually justified by assuming that the predictor variables have a multivariate normal distribution within each group (with a common covariance matrix in the linear case). Logistic regression makes no distributional assumptions about the predictors (which can in fact be deterministic).

Not knowing the full context here, it's hard to say if discriminant analysis would be more or less appropriate, but it's a question worth raising.

In regression we are actually modeling E[y|x], which is essentially a mean value given the observed values of x. In the linear probability model it can be shown that E[y|x] = Pr[y = 1 | x], which ranges from 0 to 1, which in effect makes y continuous and makes regression appropriate.

(Of course, with linear regression you can get predictions > 1, so I moved on to the logit model.)

Unless you *do* get predictions > 1, which only happens when you have a rather high R^2, there is very little practical difference between the models, logistic versus discriminant function. The logistic is less robust with small Ns.

If you do an OLS regression of the dichotomous response variable on the predictors, the regression function is a scalar multiple of Fisher's linear discriminant function, so classification is the same either way. AFAIK, the regression function is not producing probabilities, though. It's producing "scores". Clearly values < 0 or > 1 cannot be interpreted as probabilities, but I'm not sure it's fair to interpret any output of the regression function as probabilities, even though ordinarily we would interpret the regression function as a conditional mean and, as noted above, in this case the conditional mean is a conditional probability.
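The proportionality claim is easy to check numerically. The sketch below uses a small made-up data set with two predictors (the numbers are invented purely for illustration) and plain Python, so nothing beyond the standard library is assumed. It computes the Fisher direction as the pooled within-group scatter matrix inverted against the difference in group means, computes the OLS slopes from the centered normal equations, and compares the two.

```python
def mean(rows):
    """Column means of a list of (x1, x2) pairs."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(2)]


def scatter(rows, center):
    """2x2 sum of outer products of deviations from `center`."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for r in rows:
        d = [r[0] - center[0], r[1] - center[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s


def solve2(a, v):
    """Solve the 2x2 linear system a x = v by Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(v[0] * a[1][1] - a[0][1] * v[1]) / det,
            (a[0][0] * v[1] - a[1][0] * v[0]) / det]


# Invented illustrative data: group 0 and group 1.
group0 = [(1.0, 2.0), (2.0, 1.0), (2.5, 3.0), (3.0, 2.5)]
group1 = [(4.0, 5.0), (5.0, 3.5), (5.5, 6.0), (6.0, 4.0), (4.5, 4.5)]

# Fisher direction: w = S_w^{-1} (m1 - m0), with S_w the pooled within-group scatter.
m0, m1 = mean(group0), mean(group1)
s0, s1 = scatter(group0, m0), scatter(group1, m1)
s_within = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
w = solve2(s_within, [m1[0] - m0[0], m1[1] - m0[1]])

# OLS slopes of the 0/1 response on the predictors (intercept handled by centering).
rows = group0 + group1
y = [0.0] * len(group0) + [1.0] * len(group1)
x_bar, y_bar = mean(rows), sum(y) / len(y)
s_total = scatter(rows, x_bar)
s_xy = [sum((r[j] - x_bar[j]) * (yi - y_bar) for r, yi in zip(rows, y))
        for j in range(2)]
b = solve2(s_total, s_xy)

# If b is a scalar multiple of w, the component-wise ratios agree.
ratios = [b[0] / w[0], b[1] / w[1]]
print("Fisher direction:", w)
print("OLS slopes:      ", b)
print("ratios b/w:      ", ratios)
```

The two ratios should agree up to floating-point error, which is the "scalar multiple" statement for the slope vector. The sketch says nothing about the constant term or the cutoff.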

Put another way: the linear regression approach violates a cardinal assumption -- the distribution of the disturbances is now strongly related to the predictor values. It produces a scalar multiple of the Fisher function, but I'm not sure it produces a meaningful estimate of the conditional probability/conditional mean. (I'm also not sure what happens when the two groups have unequal priors and/or the subsample sizes are not proportional to the priors. The linear regression slope vector is still a multiple of the Fisher coefficient vector, but the constant term in the linear regression may need adjusting, or equivalently the cutoff for the linear regression output above which you classify into the 1 group will need adjustment. I'm sure this has all been worked out long since, but we're getting past my knowledge of such things.)
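To spell out why the disturbances depend on the predictors, here is the standard calculation for a 0/1 response. The notation p(x) = Pr[y = 1 | x] is mine, and the sketch assumes the regression function really equals p(x), which is exactly the point in doubt.

```latex
\begin{align*}
  E[y \mid x] &= 1 \cdot p(x) + 0 \cdot \bigl(1 - p(x)\bigr) = p(x), \\
  \varepsilon &= y - p(x) =
    \begin{cases}
      1 - p(x) & \text{with probability } p(x), \\
      -p(x)    & \text{with probability } 1 - p(x),
    \end{cases} \\
  E[\varepsilon \mid x] &= p(x)\bigl(1 - p(x)\bigr) - \bigl(1 - p(x)\bigr)p(x) = 0, \\
  \operatorname{Var}(\varepsilon \mid x)
    &= p(x)\bigl(1 - p(x)\bigr)^2 + \bigl(1 - p(x)\bigr)p(x)^2
     = p(x)\bigl(1 - p(x)\bigr).
\end{align*}
```

So the disturbance is a two-point variable whose variance changes with x: it is largest where p(x) is near 0.5 and shrinks toward zero as p(x) approaches 0 or 1. The usual constant-variance assumption therefore cannot hold.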

/Paul