Re: problem with logistic regression



"Stratocaster" <stotz1@xxxxxxxxxxx> wrote in
news:1uCKi.6872$YN2.2526@trndny07:

My response is probably not as technical as you would like, but I have
encountered similar occurrences when dealing with multiple regression
models.

These explanatory variables (X1, X2) seem to exhibit
"multicollinearity". I am certain there must be a similar (if not the
same) term for logistic regression. Long story short (because I
don't know the full story), X1 and X2 provide the model with similar
information, and one usually becomes dominant when both are included.
You could check this by evaluating the correlation between X1 and X2;
it is probably noteworthy.

The model fit statistic simply is not significantly improved by adding X2
to a model containing X1. (The difference in model fit produced by adding
X2 is chi-square = 16.33 - 15.40 = 0.93 with 1 df.) In the reverse
ordering, adding X1 to a base model with X2 gives a significant
improvement in fit (chi-square = 16.33 - 7.77 = 8.56 with 1 df).
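As a rough check of that arithmetic, here is a minimal Python sketch. It
assumes the "Overall Model Fit" chi-squares in the original post are
likelihood-ratio statistics against the same intercept-only model on the
same data, so that their differences are nested-model tests with 1 df.
The helper name chi2_sf_1df is made up for this illustration.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 df."""
    # For 1 df, P(chi2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))

# Overall model chi-squares as reported in the original post.
chisq_x1 = 15.4002    # model with X1 only
chisq_x2 = 7.7710     # model with X2 only
chisq_both = 16.3307  # model with X1 and X2

# Assumption: all three are measured against the same null model,
# so each difference is a 1-df likelihood-ratio statistic.
add_x2 = chisq_both - chisq_x1  # gain from adding X2 to the X1 model
add_x1 = chisq_both - chisq_x2  # gain from adding X1 to the X2 model

p_add_x2 = chi2_sf_1df(add_x2)
p_add_x1 = chi2_sf_1df(add_x1)

print(f"adding X2 to X1: chi-square = {add_x2:.4f}, p = {p_add_x2:.4f}")
print(f"adding X1 to X2: chi-square = {add_x1:.4f}, p = {p_add_x1:.4f}")
```

Under that assumption the first p-value comes out well above 0.05 and the
second well below it, which is the asymmetry described above.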

You show us no evidence of multicollinearity. True, there is likely an
association between X1 and X2, but there is no material instability in
the point estimate for the X1 parameter that would suggest
multicollinearity. LR is still linear in its predictors, so the
multicollinearity diagnostics are the same. Look at the (X'X)^-1 matrix.
Search terms: multicollinearity "condition number" VIF.
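For the two-predictor case those diagnostics reduce to simple functions of
the correlation r between X1 and X2, which can be sketched as below. The
data vectors are HYPOTHETICAL (the original post did not include the
data), and this is the unweighted check borrowed from ordinary
regression; a logistic fit actually works with a weighted version of
X'X, so treat it as a first look only.

```python
import math

def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# HYPOTHETICAL predictor values, made up for illustration only.
x1 = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0]
x2 = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0]

r = pearson_r(x1, x2)

# With exactly two predictors, each has the same variance inflation factor.
vif = 1.0 / (1.0 - r ** 2)

# The 2x2 correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + |r| and
# 1 - |r|; their ratio is one common definition of the condition number
# (some texts report its square root instead).
cond = (1.0 + abs(r)) / (1.0 - abs(r))

print(f"r = {r:.3f}, VIF = {vif:.3f}, condition number = {cond:.3f}")
```

A VIF near 1 and a small condition number would support the point that
association between X1 and X2 alone is not evidence of harmful
multicollinearity.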

The OP should have been asked what X1 and X2 represented. If this is
merely a homework problem, then the OP should have been prompted to think
of some situations where a causal connection might exist between X1 and Y
and between X1 and X2 but not between X2 and Y. If Y were death in a car
accident and X1 were seat belt use, then X2 might be a feature associated
with seat belt use, but not as strongly with auto accident death (perhaps
the habit of regularly changing one's car's oil). (Please note: I am not
saying that LR results imply a causal connection. That is established in
a wider consideration of the scientific domain.)

--
David Winsemius


"xz" <zhang.xi.cn@xxxxxxxxx> wrote in message
news:1190838865.824481.115870@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
I have three vectors X1, X2, Y and want to find out the possible
dependence of Y on X1 and X2.
Logistic Regression is adopted.
However, I got confused because of the following results:

If I fit the logistic regression model for X1 and Y, or for X2 and
Y, the results show that X1 and Y are clearly correlated, and so are X2
and Y.

fitting model: OddRatio(Y)=a+b1*X1
-------------------------------------------------
Overall Model Fit...
Chi Square= 15.4002; df=1; p= 0.0001

Coefficients and Standard Errors...
Variable    Coeff.   StdErr   p
1           3.3964   1.1342   0.0027
Intercept  -0.9985
-------------------------------------------------

fitting model: OddRatio(Y)=a+b2*X2
-------------------------------------------------
Overall Model Fit...
Chi Square= 7.7710; df=1; p= 0.0053

Coefficients and Standard Errors...
Variable    Coeff.   StdErr   p
1           2.1972   0.8819   0.0127
Intercept  -0.6931
-------------------------------------------------

However, if I fit the model for all three variables together, with Y
as the dependent variable and X1 and X2 as independent variables,
the p-values are much higher, especially for X2 (0.3288). So can I
still say Y is correlated with X2? How come Y is obviously correlated
with X2 when fitting them separately but not when fitting all the
variables together?

fitting model: OddRatio(Y)=a+b1*X1+b2*X2
-------------------------------------------------
Overall Model Fit...
Chi Square= 16.3307; df=2; p= 0.0003

Coefficients and Standard Errors...
Variable    Coeff.   StdErr   p
1           2.9453   1.1946   0.0137
2           1.0542   1.0794   0.3288
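
The p column in these tables can be reproduced from the coefficient and
its standard error, which may help in reading them. The sketch below
assumes the program reports two-sided Wald tests (z = Coeff / StdErr
referred to a standard normal); the post does not say which test the
software uses, and the helper name wald_p is made up for illustration.

```python
import math

def wald_p(coef, se):
    """Two-sided p-value for z = coef / se under a standard normal."""
    z = coef / se
    # P(|Z| > |z|) = erfc(|z| / sqrt(2)) for a standard normal Z.
    return math.erfc(abs(z) / math.sqrt(2.0))

# Coefficients and standard errors from the two-predictor model above.
p_x1 = wald_p(2.9453, 1.1946)
p_x2 = wald_p(1.0542, 1.0794)

print(f"X1: z = {2.9453 / 1.1946:.3f}, p = {p_x1:.4f}")
print(f"X2: z = {1.0542 / 1.0794:.3f}, p = {p_x2:.4f}")
```

Each p-value tests one coefficient with the other variable already in the
model, so the X2 row asks what X2 adds beyond X1, not whether X2 is
related to Y on its own.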



