Logistic Regression
- From: clemenr@xxxxxxxxxx
- Date: 24 May 2005 01:41:16 -0700
Hi. I've just written some code for logistic regression. My data is a
two-author discrimination problem, with one author labelled as 0 and
the other as 1. The x values are the frequencies of a list of common
words in training texts, ten per author. So, it's easy to have more
dimensions (I did include an extra column of 1s for the intercept) than
I have rows of data.
I used Newton-Raphson to learn the weights. The method I used is
similar to that on the page:
http://www.google.co.uk/url?sa=U&start=2&q=http://ocw.mit.edu/NR/rdonlyres/Sloan-School-of-Management/15-075Applied-StatisticsSpring2003/8C07CE0F-70BB-4C8F-9A7B-9AD0AF643D71/0/lec15_logistic_regression.pdf&e=10313
I start with a 0 vector for the initial set of weights.
(I can't find the actual web page I copied the weight update formula
from).
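
Since the exact update formula is not reproduced above, here is a minimal sketch of the textbook Newton-Raphson update for logistic regression, which may or may not match the one the program uses. The function names (sigmoid, solve, newton_step, fit) and the max_iter limit are illustrative assumptions; the zero starting vector and the 0.01 stopping rule on the summed absolute weight changes follow the description in this post.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def solve(A, b):
    # Solve A x = b by Gaussian elimination with partial pivoting.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < 1e-12:
            # Can happen when there are more columns than rows, or when
            # the fitted probabilities have all saturated at 0 or 1.
            raise ValueError("Hessian is (numerically) singular")
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def newton_step(X, y, w):
    # Returns delta such that w + delta is the next Newton-Raphson iterate.
    rows, d = len(X), len(w)
    p = [sigmoid(sum(wj * xj for wj, xj in zip(w, row))) for row in X]
    # Gradient of the log-likelihood: X^T (y - p)
    grad = [sum((y[i] - p[i]) * X[i][j] for i in range(rows)) for j in range(d)]
    # Negative Hessian: X^T diag(p (1 - p)) X
    H = [[sum(p[i] * (1.0 - p[i]) * X[i][j] * X[i][k] for i in range(rows))
          for k in range(d)] for j in range(d)]
    return solve(H, grad)

def fit(X, y, tol=0.01, max_iter=50):
    # Start from the zero vector; stop when the summed absolute
    # weight changes fall below tol.
    w = [0.0] * len(X[0])
    for it in range(1, max_iter + 1):
        delta = newton_step(X, y, w)
        w = [wj + dj for wj, dj in zip(w, delta)]
        if sum(abs(dj) for dj in delta) < tol:
            return w, it, True
    return w, max_iter, False
```

Each row of X is expected to include the leading 1 for the intercept, and fit returns the weights, the number of iterations used, and whether the stopping rule was met.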
When I apply my program to test problems, it usually quickly finds a
very close fit, where my 10 "0" texts are assigned a probability very
close to zero, and my 10 "1" texts are assigned a probability very
close to one. However, sometimes, particularly when I have 10 or more
dimensions, it either finds a fit that assigns a probability of 0 to all
texts, or simply fails to converge in any reasonable time. (My
convergence test is that the sum of the absolute values of the
differences to be added to the weights is < 0.01.)
Usually convergence happens in a very short time.
My data should be simple. In fact, the first variable alone (frequency
of 'the') is sufficient to distinguish the two authors I'm using for
debugging.
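
One general property that may be relevant here, though it is not confirmed for this data: when a variable separates the two classes perfectly, the logistic log-likelihood has no finite maximum, because scaling up any separating weight vector keeps increasing it toward 0. The sketch below illustrates this on made-up numbers; the data, the weight vector, and the function name log_likelihood are all hypothetical.

```python
import math

def log_likelihood(w, X, y):
    # Sum of log p_i for label 1 and log(1 - p_i) for label 0,
    # computed in a numerically stable way.
    total = 0.0
    for row, label in zip(X, y):
        z = sum(wj * xj for wj, xj in zip(w, row))
        s = z if label == 1 else -z      # the log-probability is log sigmoid(s)
        if s >= 0:
            total += -math.log1p(math.exp(-s))
        else:
            total += s - math.log1p(math.exp(s))
    return total

# Hypothetical, perfectly separable data: intercept column plus one feature.
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 5.0], [1.0, 6.0], [1.0, 7.0]]
y = [0, 0, 0, 1, 1, 1]

# (-4, 1) puts the decision boundary at x = 4, which separates the classes.
base = [-4.0, 1.0]
for scale in (1.0, 10.0, 100.0):
    w = [scale * b for b in base]
    print(scale, log_likelihood(w, X, y))
```

Under this assumption the printed log-likelihood rises toward 0 as the scale grows, so an optimizer chasing the maximum would keep enlarging the weights instead of settling.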
Are the problems I'm seeing likely to be due to the search technique
I'm using, or is it more likely that I have a bug in my program, and
N-R should always converge to a good solution?
Cheers,
Ross-c