Re: Logistic Regression



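Even an ordinary least-squares linear regression completely separates
the 0s and 1s. With complete separation the likelihood has no finite
maximum: scaling the coefficients up always increases it, so the usual
likelihood-maximizing logistic iterations should not be expected to
converge.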
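
The effect of complete separation can be seen on a toy problem. The
sketch below is only an illustration under stated assumptions: the
four-point data set is made up (it is not the data quoted below), the
names XS, YS and fit are hypothetical, and plain gradient ascent stands
in for whatever update rule the original program uses.

```python
import math

# Made-up, completely separated data (an assumption for illustration):
# every x < 0 has y = 0 and every x > 0 has y = 1.
XS = [-2.0, -1.0, 1.0, 2.0]
YS = [0, 0, 1, 1]


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def softplus(z):
    """log(1 + exp(z)), computed stably."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def log_likelihood(b0, b1, xs, ys):
    """Bernoulli log-likelihood of the model p = sigmoid(b0 + b1 * x)."""
    total = 0.0
    for x, y in zip(xs, ys):
        z = b0 + b1 * x
        total -= softplus(-z) if y == 1 else softplus(z)
    return total


def fit(xs, ys, steps, lr=0.1):
    """Run `steps` gradient-ascent updates from zero.

    Returns (b0, b1, log-likelihood) after the last update.
    """
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            r = y - sigmoid(b0 + b1 * x)  # residual on the probability scale
            g0 += r
            g1 += r * x
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1, log_likelihood(b0, b1, xs, ys)


# The longer the fit runs, the larger the slope and the closer the
# log-likelihood gets to 0; there is no finite point at which to stop.
for steps in (20, 200, 2000):
    b0, b1, ll = fit(XS, YS, steps)
    print(f"steps={steps:5d}  b1={b1:8.4f}  log-likelihood={ll:.6f}")
```

On this toy data the slope keeps growing and the log-likelihood keeps
creeping toward 0 however long the loop runs. A Newton-type iteration
takes much larger steps, so the same drift would be expected to show up
there as coefficients that blow up quickly.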

clemenr@xxxxxxxxxx wrote:
> Thanks to everyone who replied to this thread.
>
> It seems clear that I need to look into this further. I have isolated
> one problem that fails to converge, which has the frequencies of the 18
> most common words and punctuation symbols as independent variables.
> Going through large amounts of debugging output, the algorithm seems to
> start fine. In a reasonably small number of iterations, it gets to a
> point where all the 0 rows are assigned a probability very close to 0,
> and the 1 rows are assigned about 0.95-0.96. Then it suddenly stops
> converging and, more or less, goes crazy. The sum of the absolute
> values of the updates to the b matrix stops dropping and fluctuates
> wildly. The probabilities that were getting quite close to being
> correct go crazy. The largest coefficients reach about 1000 in
> magnitude before the whole process seems to break down, and after
> that they become very large (XXXe+10 etc.).
>
> Today I checked the relevant sections of Alpaydin's book (Intro to
> Machine Learning). Alpaydin gives different algorithms for fitting a
> logistic regression model than I used. There's also a recommendation
> that the weight matrix be initialised to random numbers in the range
> -0.01 to 0.01.
>
> I modified my program to start with random weights between 0 and 1. In
> two trials, the first trial failed to converge. The second trial
> converged quickly to a good solution and stopped. So, it is possible to
> fit a logistic model to that data.
>
> I'm not sure what form I should use to send data to people. I attach
> the X and Y matrices here for the problem mentioned above in CSV.
>
> I'll try and run this through R's polr() function for fitting logistic
> (and other) models. However, given that I can fit a model, even if not
> every time, I'd imagine that R or SPSS would have no difficulty fitting
> a model. I'll also check out the Classification Society.
>
> Cheers,
>
> Ross-c
>
> x0,x1,x2,x3,x4,x5,x6,x7,x8,x9,x10,x11,x12,x13,x14,x15,x16,x17,x18
> 1,0.0720183,0.037871,0.0308037,0.0299751,0.0304528,0.0216796,0.0219038,0.0190281,0.0145538,0.0135887,0.0102062,0.00898767,0.0101477,0.00995272,0.00728177,0.00978701,0.00770093,0.00687235
> 1,0.074273,0.0328923,0.0305713,0.0331134,0.0316545,0.018303,0.0219282,0.0211103,0.0152525,0.0173635,0.0107983,0.00916255,0.00560363,0.00551521,0.00879782,0.00522785,0.0102236,0.0124673
> 1,0.0700869,0.0384123,0.0295585,0.0193777,0.0396019,0.0176619,0.0239991,0.0191261,0.0173073,0.0175132,0.0111874,0.00893388,0.00236788,0.00974605,0.00769847,0.00266529,0.00642873,0.0148479
> 1,0.0759439,0.0404344,0.0306368,0.0258251,0.0333168,0.0154446,0.0227014,0.0187249,0.0155142,0.0138088,0.01025,0.0101369,0.00684783,0.00924065,0.00735249,0.00684783,0.00617783,0.00937117
> 1,0.0647153,0.035388,0.0301734,0.00170839,0.0362259,0.0379669,0.0222578,0.0190851,0.0171489,0.0137973,0.0112591,0.00940426,0.00770401,0.00792366,0.00792366,0.0088592,0.00752504,0.00975407
> 1,0.0769764,0.0340408,0.0323392,0.0279275,0.0321253,0.022772,0.0195217,0.0194504,0.0168421,0.0186149,0.0111567,0.00925143,0.00642913,0.0109428,0.00753971,0.0065514,0.00770273,0.0103824
> 1,0.0779948,0.0347583,0.0354439,0.0319094,0.0364722,0.0159814,0.0213516,0.0183199,0.0146559,0.0141456,0.0112509,0.00982648,0.00960557,0.00561404,0.00774692,0.00882859,0.00785356,0.00652813
> 1,0.0809327,0.0305512,0.0383111,0.027119,0.0338934,0.0229455,0.0215272,0.0185364,0.0143715,0.0171524,0.0101851,0.00868973,0.0057503,0.00392066,0.00765279,0.00470051,0.00886541,0.0136773
> 1,0.0805173,0.0273328,0.0318756,0.0341889,0.0309518,0.0251569,0.0211562,0.0189574,0.013758,0.017041,0.00942143,0.0094596,0.00611553,0.00432134,0.00881064,0.00419918,0.00918475,0.0132541
> 1,0.0667417,0.0303387,0.0310891,0.0308972,0.0292829,0.0183062,0.0230442,0.0189781,0.0159678,0.0196499,0.0105754,0.00838525,0.00749524,0.00221629,0.00893496,0.00655289,0.0110989,0.0121721
> 1,0.055775,0.0524985,0.0415416,0.0302791,0.0270236,0.0179315,0.0173415,0.0199859,0.0242633,0.00582614,0.0113573,0.0108094,0.0125267,0.0107568,0.00893412,0.0122107,0.00620542,0.00742754
> 1,0.042524,0.05554,0.0441282,0.0196517,0.0248897,0.0121896,0.0203444,0.0213531,0.0265547,0.00700023,0.011266,0.0138546,0.0150213,0.0131376,0.0100507,0.0159085,0.00398624,0.00606444
> 1,0.0449953,0.0444433,0.0558275,0.036642,0.0186439,0.0190605,0.0197688,0.0204354,0.0232372,0.0133528,0.0106656,0.0135194,0.0135715,0.0136965,0.0109676,0.0120821,0.0102593,0.00520779
> 1,0.0476545,0.0499341,0.0498566,0.0295107,0.0244708,0.0167481,0.0206094,0.0202373,0.0240056,0.00945956,0.00956812,0.0137551,0.0174769,0.00983174,0.0111034,0.0159882,0.00834303,0.00114755
> 1,0.0446411,0.0418778,0.0524321,0.0415785,0.0206197,0.0159212,0.0195024,0.0207893,0.0209489,0.0122302,0.0112525,0.013916,0.0136168,0.0107338,0.0107039,0.012719,0.0114022,0.00682335
> 1,0.0476168,0.0503946,0.047891,0.0424188,0.022342,0.0233077,0.0194091,0.0179904,0.021579,0.0123513,0.0105272,0.00897732,0.0108252,0.0128162,0.00768974,0.0112544,0.0106226,0.00380314
> 1,0.0480272,0.0497692,0.0443951,0.0336382,0.0261301,0.0217838,0.0161571,0.0206167,0.0231774,0.0107743,0.0114363,0.0120721,0.0137967,0.0185959,0.00922394,0.0124641,0.00756903,0.00485149
> 1,0.0507474,0.0440578,0.0546267,0.0425406,0.0192034,0.0220645,0.019295,0.0200485,0.0209547,0.0123814,0.0101006,0.0120454,0.0140411,0.00722926,0.0119639,0.0126767,0.0107624,0.00290189
> 1,0.0477706,0.0472193,0.0464802,0.0381112,0.0266102,0.0165625,0.0158985,0.0216114,0.0225761,0.0116514,0.00930856,0.0108746,0.0102858,0.0144702,0.00957166,0.0101855,0.0104612,0.00898282
> 1,0.055857,0.0390281,0.048471,0.0475185,0.0203907,0.0216332,0.021081,0.0196038,0.0200732,0.0151584,0.00993995,0.00967764,0.0128115,0.0104784,0.0125768,0.00813143,0.0141782,0.00706841
>
>
> y
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 0
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
> 1
