Re: Enter versus forward method for linear regression
- From: Richard Ulrich <Rich.Ulrich@xxxxxxxxxxx>
- Date: Wed, 21 Jun 2006 01:31:44 -0400
On 20 Jun 2006 15:43:26 -0700, "Jem" <jomilton@xxxxxxxxxxx> wrote:
Hi
I am fairly new to regression and have so far always used the enter
method, grouping certain blocks of variables. I am generally trying to
establish the relationship between the dependent variable and a particular
independent variable, so I only add other variables that might confound the
relationship. I am not specifically trying to establish the model that
best predicts the dependent variable.
Googling groups, < group:sci.stat.* model-building > yielded,
among other things --
1. Statistics for Experimenters: An Introduction to Design, Data
Analysis, and Model Building, by George E.P. Box, William G. Hunter
and J. Stuart Hunter. ISBN:0-471-09315-7. Published by Wiley.
2. Applied Linear Statistical Models: Regression, Analysis of Variance
and Experimental Designs, by John Neter, William Wasserman and Michael
H. Kutner. ISBN: 0-256-08338-X. Published by IRWIN.
Also, Judd/McClelland's book on "Data Analysis."
Also, Frank Harrell's book, "Regression Modeling Strategies."
Stepwise selection is not what you want, from what you
describe. Few people should want it. You can check my
stats-FAQ for some old posts, or Google.
I am doing my thesis at the moment and it has been suggested that I
present the coefficients and p values of all predictors so that readers
can make their own minds up about the strength of relationships. I
have recently tried the forward method, so that I don't end up with so
many predictors (all theoretically related to the dependent) but many
of which are not significant predictors. However, as far as I can
gather this does not then provide you with coefficients and p values
for all variables and does not allow you to see what happens to the
coefficient of the primary predictor on adding additional predictors.
Please correct me if I am wrong.
Two problems with too many variables --
- You can run out of degrees of freedom and have far too much
capitalization on chance, if your sample is not large enough.
- For making sense, you need to have a good notion of what
the variables are supposed to mean, and how that compares
to what they actually *measure*. That can be a burden when
there are many.
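To illustrate the first point, here is a minimal simulation sketch of capitalization on chance. It is not from the original post: the sample size (50), the number of candidate predictors (20), and the helper names are assumptions chosen for illustration. Every predictor is pure noise, yet the strongest one (the variable a forward method would enter first) often passes the nominal 5% cutoff for a single correlation, which is roughly |r| > 0.279 at n = 50.

```python
import random

def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def chance_hit_rate(n=50, k=20, r_crit=0.279, n_sims=500, seed=1):
    """Share of pure-noise datasets in which the best of k predictors
    passes the nominal two-sided 5% cutoff for a single correlation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # Outcome and predictors are independent noise (hypothetical data).
        y = [rng.gauss(0, 1) for _ in range(n)]
        best = max(
            abs(pearson_r([rng.gauss(0, 1) for _ in range(n)], y))
            for _ in range(k)
        )
        hits += best > r_crit
    return hits / n_sims

rate = chance_hit_rate()
print(f"Noise datasets with a 'significant' first pick: {rate:.0%}")
```

If the 20 tests were independent, the expected rate would be about 1 - 0.95^20, or roughly 64%, far above the 5% that the p value of the selected variable seems to promise.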
You do want to test what the literature suggests is important.
It can be useful to show one variable in several contexts.
It is often useful to look at what is added by specific *sets*
of variables. Also, SPSS (for one) has a useful option for looking
at tests, as if the variable was entered next, on all the variables-
not-in-the-equation.
Also, try Robert Abelson's book "Statistics as Principled Argument."
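Looking at what a specific set of variables adds can be sketched as a comparison of two nested models: fit the base block, fit the base block plus the new set, and test the drop in residual sum of squares with an F statistic. The sketch below is only an illustration under assumptions: the data are simulated, the names `c`, `z1`, `z2` and the helper functions are hypothetical, and the fit uses plain normal equations rather than any particular package's routine.

```python
import random

def ols_rss(X, y):
    """Residual sum of squares from a least-squares fit with an intercept."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        piv = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        A[col], A[piv] = A[piv], A[col]
        for idx in range(k):
            if idx != col:
                f = A[idx][col] / A[col][col]
                A[idx] = [a - f * b for a, b in zip(A[idx], A[col])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def block_test(X_base, X_full, y):
    """R-squared change and F statistic for the variables added in X_full."""
    n = len(y)
    mean_y = sum(y) / n
    tss = sum((v - mean_y) ** 2 for v in y)
    rss_base, rss_full = ols_rss(X_base, y), ols_rss(X_full, y)
    q = len(X_full[0]) - len(X_base[0])   # number of added variables
    df_resid = n - len(X_full[0]) - 1     # residual df of the full model
    r2_change = (rss_base - rss_full) / tss
    f_stat = ((rss_base - rss_full) / q) / (rss_full / df_resid)
    return r2_change, f_stat, q, df_resid

# Simulated (hypothetical) data: z1 matters, z2 does not.
rng = random.Random(0)
n = 200
c = [rng.gauss(0, 1) for _ in range(n)]
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
y = [0.5 * ci + 0.8 * a + rng.gauss(0, 1) for ci, a in zip(c, z1)]

X_base = [[ci] for ci in c]                           # block 1: c only
X_full = [[ci, a, b] for ci, a, b in zip(c, z1, z2)]  # block 2 adds z1, z2
r2_change, f_stat, q, df_resid = block_test(X_base, X_full, y)
print(f"R2 change = {r2_change:.3f}, F({q}, {df_resid}) = {f_stat:.1f}")
```

This is the same quantity a blockwise "enter" run reports as the R-squared change for each block, so the set is judged as a whole rather than one variable at a time.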
Additionally, if my hypothesis is that the dependent variable, a for example,
is affected by c (my primary predictor of interest) via a change in
another predictor, b, then should I be adjusting for b by adding it to
the model? Surely this will prevent me from seeing an effect of c on a. I
hope that makes sense. My current idea is to stick with the enter
method that I know best, then I can add b in a separate block from c
and examine the effects on the coefficients. This also allows me to do
the regression easily with or without variable c.
Any help gratefully received, I will prob have a few more questions
once I get responses to these.
It sounds to me like you are starting out in the right direction.
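The with-and-without-b comparison described in the question can be sketched on simulated data. The names a, b, and c follow the question; the path strengths (0.8 and 0.7), the sample size, and the assumption that c acts on a only through b are invented for illustration. With two predictors, the slopes can be written directly from variances and covariances, so no fitting routine is needed.

```python
import random

def cov(x, y):
    """Sample covariance; cov(x, x) is the sample variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((u - mx) * (v - my) for u, v in zip(x, y)) / (n - 1)

# Hypothetical data: c affects a only through b.
rng = random.Random(42)
n = 2000
c = [rng.gauss(0, 1) for _ in range(n)]
b = [0.8 * ci + rng.gauss(0, 1) for ci in c]
a = [0.7 * bi + rng.gauss(0, 1) for bi in b]

# Block 1: a regressed on c alone (total effect of c).
total = cov(c, a) / cov(c, c)

# Block 2: a regressed on c and b; slope of c with b held constant.
den = cov(c, c) * cov(b, b) - cov(c, b) ** 2
direct = (cov(c, a) * cov(b, b) - cov(b, a) * cov(c, b)) / den

print(f"coefficient of c without b: {total:.2f}")
print(f"coefficient of c with b:    {direct:.2f}")
```

Under these assumptions the coefficient of c is close to 0.8 * 0.7 = 0.56 without b and close to zero once b is entered. Reporting both blocks shows the reader that c matters and that its effect runs through b, which is the pattern the question is worried about hiding.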
--
Rich Ulrich, wpilib@xxxxxxxx
http://www.pitt.edu/~wpilib/index.html