Re: Logistic Regression



Thanks. As a general wrap-up to this thread:

(i) Linear regression used to discriminate between two authors labelled
0/1 performs better than logistic regression on the same data.
(ii) Putting maximum limits on weights improves performance for
logistic regression applied to, say, 12 variables. Unchanged logistic
regression on 5 variables still gives higher performance.
(iii) On small numbers of variables, regression considerably
out-performs the naive Bayes classifier I had been using.
(iv) On larger numbers of variables, the regression approaches are
extremely poor. The naive Bayes classifier performs much better.
(v) There has been no need for authorship attribution approaches that
optimise performance on small numbers of variables for at least 40
years.
(vi) The information fusion approach I described in a previous posting,
where a regression is fitted to the first N variables, then another is
fitted to the next N variables, and so on, is worthless.
(vii) Converting my compositional data to non-compositional data did
not help solve the problems of (vi) in the slightest.
(viii) Trying a few seat-of-the-pants non-linear transformations of my
data into higher numbers of dimensions did not help in the slightest.
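
A minimal sketch of the comparison in (i) and (ii), in Python rather than
the C/gretl code behind the results above. The toy data, the
gradient-descent fitting, the function names (score, sigmoid, fit,
classify), and the reading of "maximum limits on weights" as clipping each
non-bias weight to a fixed range are all assumptions for illustration only.

```python
import math

def score(w, x):
    """Linear score: bias w[0] plus the weighted sum of the features."""
    return w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))

def sigmoid(s):
    """Logistic function, written to avoid overflow for large |s|."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def fit(X, y, logistic=False, max_weight=None, lr=0.1, epochs=2000):
    """Batch gradient descent on squared error (logistic=False)
    or on log-loss (logistic=True).

    If max_weight is given, every non-bias weight is clipped to
    [-max_weight, max_weight] after each update (assumed interpretation
    of a "maximum limit on weights").
    """
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, t in zip(X, y):
            s = score(w, x)
            p = sigmoid(s) if logistic else s
            err = p - t  # same gradient form for both losses
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
        if max_weight is not None:
            w[1:] = [max(-max_weight, min(max_weight, wj)) for wj in w[1:]]
    return w

def classify(w, x, logistic=False):
    """Label 1 if the fitted value is at least 0.5 (score >= 0 for logistic)."""
    threshold = 0.0 if logistic else 0.5
    return 1 if score(w, x) >= threshold else 0

# Made-up two-variable data (hypothetical word frequencies), authors 0/1.
X = [[0.1, 0.9], [0.2, 0.8], [0.3, 0.7], [0.8, 0.2], [0.9, 0.1], [0.7, 0.3]]
y = [0, 0, 0, 1, 1, 1]

w_lin = fit(X, y)                                   # linear regression on 0/1
w_log = fit(X, y, logistic=True)                    # plain logistic regression
w_clip = fit(X, y, logistic=True, max_weight=1.0)   # logistic, weights limited

for name, w, is_log in [("linear", w_lin, False),
                        ("logistic", w_log, True),
                        ("clipped logistic", w_clip, True)]:
    hits = sum(classify(w, x, is_log) == t for x, t in zip(X, y))
    print(name, [round(wj, 3) for wj in w], "training hits:", hits, "of", len(y))
```

Training-set hits on six made-up points say nothing about real
performance; the sketch only shows the mechanics of the three fits.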

Conclusion: I'd previously played with classifiers based on linear
regression and obtained results so poor that I assumed my code must
have bugs, so I never mentioned the results to anybody. Now that I have
new code in a different language (C/gretl versus Java/Jama) and get
good fits in lower dimensions, bugs look unlikely, and I think that
these approaches are simply unsuitable for my data.

I may revive these programs when I get around to doing some more
information fusion experiments.

Thanks to everyone who posted on this thread.

Next on the list of things to do: Support Vector Machines. Sadly I've
already concluded that it will not be reasonable for me to write my own
quadratic programming code :-(

Cheers,

Ross-c
