Linear regression and Logistic regression



Hi,

I'm new to statistics and I'm trying to understand the concepts and
how to apply them correctly. In particular, I'm trying to understand
the procedure for a statistical task.

My question is about linear regression: how can I compare two
distinct regressors and say that one is "statistically" better than
the other? Please correct me if I'm confused. I'm using SPSS, but my
questions are conceptual.

For example, I have a data set with variables X1, ..., Xn and Y, where
X1, ..., Xn are the predictors and I want to predict Y (n may or may
not be large).

I can check whether the correlation coefficient R between Y and
X1, ..., Xn is near 1 or -1 (how near?) to say that Y and (X1, ..., Xn)
are linearly correlated, and I can look at the sum of squared errors.
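
To make the two quantities in the paragraph above concrete, here is a
minimal sketch in plain Python (not SPSS) that fits a least-squares
model with an intercept and reports the sum of squared errors and the
multiple correlation coefficient R. The data and the helper names
`solve` and `fit_ols` are made up for illustration. Note that the
multiple R computed this way is the square root of R^2, so it lies
between 0 and 1; a value near -1 can only occur for the simple
correlation between Y and a single predictor.

```python
import math

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_ols(X, y):
    """Least-squares fit with intercept; returns (coefficients, SSE, R)."""
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)  # normal equations: (X'X) beta = X'y
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    mean_y = sum(y) / len(y)
    sst = sum((yi - mean_y) ** 2 for yi in y)
    r_multiple = math.sqrt(max(0.0, 1.0 - sse / sst))
    return beta, sse, r_multiple

# Made-up data: two predictors, six observations (illustration only).
X_demo = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
y_demo = [4.1, 5.4, 9.2, 10.4, 14.3, 15.2]

beta, sse, r_multiple = fit_ols(X_demo, y_demo)
print("coefficients (intercept first):", beta)
print("sum of squared errors:", sse)
print("multiple R:", r_multiple)
```

Solving the normal equations directly is fine for a small, well
conditioned example like this one; a statistics package would normally
use a more stable decomposition.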

But the regressor works better if the predictors I use, Xi_1, ..., Xi_k,
are not linearly correlated with each other. So, before the regression,
I could do feature selection: I could compute principal components and
use some of them, or run a factor analysis (and keep only one variable
from each group of correlated variables).

I then construct one regressor with all the variables, another using
some principal components (for example the first two, three or four),
and others by selecting different variables from each factor-analysis
group.

I also see that SPSS has options to select/discard variables while the
algorithm runs (forward, backward and stepwise selection). Are these
all greedy approaches? Isn't it necessary to test all variable
combinations exhaustively? (I know there are 2^n combinations, but for
n = 5, 6 or 7, say, that may be feasible.) Or is it unnecessary to test
them all to get the optimum?
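
As an illustration of the exhaustive alternative raised above, the
following sketch tries every non-empty subset of predictors (2^n - 1
fits) and keeps the one with the highest adjusted R^2. Forward,
backward and stepwise selection are greedy procedures, so they are not
guaranteed to land on the same subset. The scoring criterion (adjusted
R^2), the data and the function names are assumptions made for this
example; another criterion, such as AIC or a cross-validated error,
could be substituted.

```python
from itertools import combinations

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def sse_for_subset(X, y, cols):
    """SSE of the least-squares fit (with intercept) on the chosen columns."""
    rows = [[1.0] + [r[c] for c in cols] for r in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(XtX, Xty)
    return sum((yi - sum(bj * v for bj, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))

def adjusted_r2(X, y, cols):
    """Adjusted R^2, which penalises each extra predictor."""
    n = len(y)
    mean_y = sum(y) / n
    sst = sum((yi - mean_y) ** 2 for yi in y)
    sse = sse_for_subset(X, y, cols)
    return 1.0 - (sse / (n - len(cols) - 1)) / (sst / (n - 1))

def best_subset(X, y):
    """Score all 2^n - 1 non-empty subsets; return (score, columns, count)."""
    n_vars = len(X[0])
    best_score, best_cols, n_tried = None, None, 0
    for k in range(1, n_vars + 1):
        for cols in combinations(range(n_vars), k):
            n_tried += 1
            score = adjusted_r2(X, y, cols)
            if best_score is None or score > best_score:
                best_score, best_cols = score, cols
    return best_score, best_cols, n_tried

# Made-up data: three candidate predictors, eight observations.
X_demo = [[1, 2, 5], [2, 1, 3], [3, 4, 8], [4, 3, 1],
          [5, 6, 7], [6, 5, 2], [7, 8, 6], [8, 7, 4]]
y_demo = [7.2, 6.9, 14.1, 8.8, 17.3, 13.9, 19.8, 20.1]

best_score, best_cols, n_tried = best_subset(X_demo, y_demo)
print("subsets tried:", n_tried)
print("best columns:", best_cols, "adjusted R^2:", best_score)
```

The cost doubles with every added predictor, which is why this is only
practical for small n such as the 5, 6 or 7 mentioned above.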

Which of all these regressors is statistically better? Can I compare
them only with the multiple correlation coefficient R?

Reading the SPSS output, what about the confidence intervals? I see
that there is a t-test, but I don't understand the significance levels
it reports (0.455, 0.131, ...). What do they mean (or is Sig the
p-value?), and which hypothesis is being tested? I want to understand
this to see whether I can compare those properties between two
regressors to decide which is better.
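
For reference, the sketch below shows the usual coefficient t-test in
the simplest case of one predictor. It assumes that the Sig. column is
the two-sided p-value for the null hypothesis "this coefficient is
zero", which is the conventional reading of a regression coefficient
table. The data and function names are made up, and the p-value is
obtained by integrating the Student t density numerically, which is
adequate for illustration but not a substitute for a statistics
library.

```python
import math

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p-value for Student's t, via Simpson's rule on the density."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(u):
        return c * (1 + u * u / df) ** (-(df + 1) / 2)

    b = abs(t)
    h = b / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = 2 * total * h / 3  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central)

def slope_t_test(x, y):
    """Return (slope, standard error, t statistic, two-sided p-value)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = my - b1 * mx
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    df = n - 2  # two estimated parameters: intercept and slope
    se_b1 = math.sqrt(sse / df / sxx)
    t = b1 / se_b1  # tests H0: slope = 0
    return b1, se_b1, t, t_two_sided_p(t, df)

# Made-up data (illustration only).
x_demo = [1, 2, 3, 4, 5, 6, 7, 8]
y_demo = [2.3, 2.9, 5.1, 4.2, 6.8, 6.1, 8.4, 7.9]

b1, se_b1, t_stat, p_value = slope_t_test(x_demo, y_demo)
print("slope:", b1, "std. error:", se_b1)
print("t:", t_stat, "two-sided p:", p_value)
```

Under this reading, a value such as 0.455 would mean that a coefficient
this far from zero is quite plausible even if the true coefficient is
zero, so the data give little evidence that the predictor contributes.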


I want to know the same for comparing logistic regressors. I've seen
that a chi-square test is done; which hypothesis does it test?
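
One chi-square that logistic regression output commonly reports is the
likelihood-ratio (model) chi-square, which compares the fitted model
with an intercept-only model; the null hypothesis is that all slope
coefficients are zero. Whether this is the statistic in the output
referred to above is an assumption, since Wald and goodness-of-fit
chi-squares are also common. The sketch below computes it for a
one-predictor model fitted by Newton-Raphson, on made-up data and with
hypothetical function names.

```python
import math

def log_likelihood(b0, b1, x, y):
    """Bernoulli log-likelihood of a one-predictor logistic model."""
    ll = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        ll += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return ll

def fit_logistic(x, y, iters=25):
    """Maximum-likelihood fit by Newton-Raphson (intercept and one slope)."""
    b0 = b1 = 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1 - p)
            g0 += yi - p            # score for the intercept
            g1 += (yi - p) * xi     # score for the slope
            h00 += w                # entries of the information matrix X'WX
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det  # Newton step: inverse(X'WX) * score
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up, non-separable data (illustration only).
x_demo = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_demo = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(x_demo, y_demo)
ll_model = log_likelihood(b0, b1, x_demo, y_demo)

# Intercept-only model: every case gets the overall proportion of ones.
p_bar = sum(y_demo) / len(y_demo)
ll_null = sum(yi * math.log(p_bar) + (1 - yi) * math.log(1 - p_bar)
              for yi in y_demo)

chi2 = 2.0 * (ll_model - ll_null)      # likelihood-ratio statistic, 1 df here
p_value = math.erfc(math.sqrt(chi2 / 2.0))  # exact chi-square tail for 1 df

print("b0:", b0, "b1:", b1)
print("LR chi-square:", chi2, "p-value:", p_value)
```

The same statistic can compare two nested logistic models: take twice
the difference of their log-likelihoods, with degrees of freedom equal
to the number of extra coefficients. The closed-form p-value used here
only applies to 1 degree of freedom.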


If you can answer, or point me to a website or document, I'd be
grateful.

Thanks in advance!
A
