Re: Logistic Regression
- From: clemenr@xxxxxxxxxx
- Date: 24 May 2005 15:56:30 -0700
Thanks to everyone who replied to this thread.
It seems clear that I need to look into this further. I have isolated
one problem that fails to converge, which has the frequencies of the 18
most common words and punctuation symbols as independent variables.
Going through large amounts of debugging output, the algorithm seems to
start fine. In a reasonably small number of iterations, it gets to a
point where all the 0 rows are assigned a probability very close to 0,
and the 1 rows are assigned about 0.95-0.96. Then it suddenly stops
converging and, more or less, goes crazy. The sum of the absolute
values of the updates to the b matrix stops dropping and fluctuates
wildly. The probabilities that were getting quite close to being
correct become erratic. The largest coefficients reach about 1000 in
magnitude before the whole process seems to break down, and after that
they become very large (XXXe+10, etc.).
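
A minimal sketch of the diagnostic described above, in Python with
NumPy. This is not the program discussed in this post: the function
name fit_logistic_newton, the choice of Newton-Raphson, and the toy
data are all assumptions made for illustration. It records the sum of
the absolute values of the coefficient updates at each iteration, so
that a stall or a blow-up is easy to spot.

```python
import numpy as np

def fit_logistic_newton(X, y, max_iter=25, tol=1e-8):
    """Fit a binary logistic model by Newton-Raphson (illustrative only).

    Returns the coefficient vector b and a list holding sum(|update|)
    for every iteration that was run.
    """
    n_rows, n_cols = X.shape
    b = np.zeros(n_cols)
    history = []
    for _ in range(max_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ b)))      # fitted probabilities
        w = p * (1.0 - p)                       # per-row weights
        grad = X.T @ (y - p)                    # gradient of the log-likelihood
        hess = X.T @ (X * w[:, None])           # negative Hessian
        # lstsq rather than solve, so a near-singular matrix does not raise
        step = np.linalg.lstsq(hess, grad, rcond=None)[0]
        b = b + step
        history.append(float(np.abs(step).sum()))
        if history[-1] < tol:
            break
    return b, history

# Toy data (hypothetical, not the data from this post): the classes overlap,
# so a finite maximum-likelihood solution exists.
X_toy = np.column_stack([np.ones(6), np.arange(6.0)])
y_toy = np.array([0.0, 0.0, 1.0, 0.0, 1.0, 1.0])
b_toy, hist_toy = fit_logistic_newton(X_toy, y_toy)
print("coefficients:", b_toy)
print("sum |update| per iteration:", hist_toy)
```

If the history stops shrinking, one possibility worth ruling out is
that the two classes are perfectly separable by the predictors, in
which case no finite maximum-likelihood estimate exists and the
coefficients keep growing.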
Today I checked the relevant sections of Alpaydin's book (Intro to
Machine Learning). Alpaydin gives different algorithms for fitting a
logistic regression model than I used. There's also a recommendation
that the weight matrix be initialised to random numbers in the range
-0.01 to 0.01.
I modified my program to start with random weights between 0 and 1. In
two trials, the first trial failed to converge. The second trial
converged quickly to a good solution and stopped. So, it is possible to
fit a logistic model to that data.
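
The two starting ranges mentioned above can be sketched as follows in
Python with NumPy. The helper name init_weights and the fixed seed are
assumptions made for illustration, not part of the program discussed
here. The count of 19 corresponds to the intercept column plus the 18
frequency variables.

```python
import numpy as np

def init_weights(n_features, low=-0.01, high=0.01, seed=None):
    """Draw starting coefficients uniformly from [low, high)."""
    rng = np.random.default_rng(seed)
    return rng.uniform(low, high, size=n_features)

# Small symmetric range, as in the recommendation cited above
b_small = init_weights(19, seed=0)

# Wider range between 0 and 1, as tried in the two trials
b_wide = init_weights(19, low=0.0, high=1.0, seed=0)

print(b_small.min(), b_small.max())
print(b_wide.min(), b_wide.max())
```

Fixing the seed makes a failing trial repeatable, which helps when
comparing a run that converges with one that does not.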
I'm not sure what form I should use to send data to people. I attach
the X and Y matrices here for the problem mentioned above in CSV.
I'll try to run this through R's polr() function for fitting logistic
(and other) models. However, given that I can fit a model, even if not
every time, I'd imagine that R or SPSS would have no difficulty fitting
one. I'll also check out the Classification Society.
Cheers,
Ross-c
x0,x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,x17,x18
1,0.0720183,0.037871,0.0308037,0.0299751,0.0304528,0.0216796,0.0219038,0.0190281,0.0145538,0.0135887,0.0102062,0.00898767,0.0101477,0.00995272,0.00728177,0.00978701,0.00770093,0.00687235
1,0.074273,0.0328923,0.0305713,0.0331134,0.0316545,0.018303,0.0219282,0.0211103,0.0152525,0.0173635,0.0107983,0.00916255,0.00560363,0.00551521,0.00879782,0.00522785,0.0102236,0.0124673
1,0.0700869,0.0384123,0.0295585,0.0193777,0.0396019,0.0176619,0.0239991,0.0191261,0.0173073,0.0175132,0.0111874,0.00893388,0.00236788,0.00974605,0.00769847,0.00266529,0.00642873,0.0148479
1,0.0759439,0.0404344,0.0306368,0.0258251,0.0333168,0.0154446,0.0227014,0.0187249,0.0155142,0.0138088,0.01025,0.0101369,0.00684783,0.00924065,0.00735249,0.00684783,0.00617783,0.00937117
1,0.0647153,0.035388,0.0301734,0.00170839,0.0362259,0.0379669,0.0222578,0.0190851,0.0171489,0.0137973,0.0112591,0.00940426,0.00770401,0.00792366,0.00792366,0.0088592,0.00752504,0.00975407
1,0.0769764,0.0340408,0.0323392,0.0279275,0.0321253,0.022772,0.0195217,0.0194504,0.0168421,0.0186149,0.0111567,0.00925143,0.00642913,0.0109428,0.00753971,0.0065514,0.00770273,0.0103824
1,0.0779948,0.0347583,0.0354439,0.0319094,0.0364722,0.0159814,0.0213516,0.0183199,0.0146559,0.0141456,0.0112509,0.00982648,0.00960557,0.00561404,0.00774692,0.00882859,0.00785356,0.00652813
1,0.0809327,0.0305512,0.0383111,0.027119,0.0338934,0.0229455,0.0215272,0.0185364,0.0143715,0.0171524,0.0101851,0.00868973,0.0057503,0.00392066,0.00765279,0.00470051,0.00886541,0.0136773
1,0.0805173,0.0273328,0.0318756,0.0341889,0.0309518,0.0251569,0.0211562,0.0189574,0.013758,0.017041,0.00942143,0.0094596,0.00611553,0.00432134,0.00881064,0.00419918,0.00918475,0.0132541
1,0.0667417,0.0303387,0.0310891,0.0308972,0.0292829,0.0183062,0.0230442,0.0189781,0.0159678,0.0196499,0.0105754,0.00838525,0.00749524,0.00221629,0.00893496,0.00655289,0.0110989,0.0121721
1,0.055775,0.0524985,0.0415416,0.0302791,0.0270236,0.0179315,0.0173415,0.0199859,0.0242633,0.00582614,0.0113573,0.0108094,0.0125267,0.0107568,0.00893412,0.0122107,0.00620542,0.00742754
1,0.042524,0.05554,0.0441282,0.0196517,0.0248897,0.0121896,0.0203444,0.0213531,0.0265547,0.00700023,0.011266,0.0138546,0.0150213,0.0131376,0.0100507,0.0159085,0.00398624,0.00606444
1,0.0449953,0.0444433,0.0558275,0.036642,0.0186439,0.0190605,0.0197688,0.0204354,0.0232372,0.0133528,0.0106656,0.0135194,0.0135715,0.0136965,0.0109676,0.0120821,0.0102593,0.00520779
1,0.0476545,0.0499341,0.0498566,0.0295107,0.0244708,0.0167481,0.0206094,0.0202373,0.0240056,0.00945956,0.00956812,0.0137551,0.0174769,0.00983174,0.0111034,0.0159882,0.00834303,0.00114755
1,0.0446411,0.0418778,0.0524321,0.0415785,0.0206197,0.0159212,0.0195024,0.0207893,0.0209489,0.0122302,0.0112525,0.013916,0.0136168,0.0107338,0.0107039,0.012719,0.0114022,0.00682335
1,0.0476168,0.0503946,0.047891,0.0424188,0.022342,0.0233077,0.0194091,0.0179904,0.021579,0.0123513,0.0105272,0.00897732,0.0108252,0.0128162,0.00768974,0.0112544,0.0106226,0.00380314
1,0.0480272,0.0497692,0.0443951,0.0336382,0.0261301,0.0217838,0.0161571,0.0206167,0.0231774,0.0107743,0.0114363,0.0120721,0.0137967,0.0185959,0.00922394,0.0124641,0.00756903,0.00485149
1,0.0507474,0.0440578,0.0546267,0.0425406,0.0192034,0.0220645,0.019295,0.0200485,0.0209547,0.0123814,0.0101006,0.0120454,0.0140411,0.00722926,0.0119639,0.0126767,0.0107624,0.00290189
1,0.0477706,0.0472193,0.0464802,0.0381112,0.0266102,0.0165625,0.0158985,0.0216114,0.0225761,0.0116514,0.00930856,0.0108746,0.0102858,0.0144702,0.00957166,0.0101855,0.0104612,0.00898282
1,0.055857,0.0390281,0.048471,0.0475185,0.0203907,0.0216332,0.021081,0.0196038,0.0200732,0.0151584,0.00993995,0.00967764,0.0128115,0.0104784,0.0125768,0.00813143,0.0141782,0.00706841
y
0
0
0
0
0
0
0
0
0
0
1
1
1
1
1
1
1
1
1
1
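
For anyone who wants to load the matrices above, a minimal sketch in
Python with NumPy follows. Only two of the rows are pasted in (the
first row with y = 0 and the first row with y = 1) to keep it short,
and the variable names are assumptions made for illustration.

```python
import io
import numpy as np

# Two rows copied from the X matrix above, header line left out.
# With the full CSV text including its header, pass skiprows=1 instead.
x_csv = """\
1,0.0720183,0.037871,0.0308037,0.0299751,0.0304528,0.0216796,0.0219038,0.0190281,0.0145538,0.0135887,0.0102062,0.00898767,0.0101477,0.00995272,0.00728177,0.00978701,0.00770093,0.00687235
1,0.055775,0.0524985,0.0415416,0.0302791,0.0270236,0.0179315,0.0173415,0.0199859,0.0242633,0.00582614,0.0113573,0.0108094,0.0125267,0.0107568,0.00893412,0.0122107,0.00620542,0.00742754
"""

X = np.loadtxt(io.StringIO(x_csv), delimiter=",")
y = np.array([0.0, 1.0])  # labels for the two rows above

print(X.shape)  # rows, columns (intercept column plus 18 frequencies)
```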