Selection of explanatory variables in a model



Dear colisters,

I have to build a logistic regression model and I've got a couple of
questions:


- I heard / read (on the Internet) that the number of explanatory
variables in a model must not be greater than n/10 or n/20 (where n is
the sample size). Can someone provide me with a reliable
bibliographic reference for this?


- Second, I have a first set of about 15 variables (for around 150
patients), interactions excluded. I categorized some of them into
binary variables to obtain odds ratios. What is the risk? Loss of power?


- Third, when performing a second selection with a stepwise algorithm,
is there a consensus about the significance cut-off (alpha) to use? I
have read 20% instead of 5%. Is this usual?


- Regarding SAS programming (I run version 8.2 under Windows), which
procedure should I choose, Proc Logistic or Proc GLM, and according to
which criteria?
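
To make the first two questions concrete, here is a minimal Python
sketch of the arithmetic. It assumes the rule is applied to the total
sample size n, as I stated it above; the function name max_predictors
is only an illustration, and the figures are the approximate ones from
my data.

```python
def max_predictors(n, subjects_per_variable):
    """Largest whole number of variables allowed by an n/k rule of thumb."""
    return n // subjects_per_variable


n_patients = 150   # approximate sample size (assumption: rule uses total n)
n_candidates = 15  # approximate number of candidate variables

for k in (10, 20):
    limit = max_predictors(n_patients, k)
    within = n_candidates <= limit
    print(f"n/{k} rule: at most {limit} variables; "
          f"{n_candidates} candidates within limit: {within}")
```

With these figures, my 15 variables would just fit the n/10 version of
the rule but not the n/20 one.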


Thanks a lot.


Catherine.
