Re: why is linear regression with L1 much better in this case?



On 2007-01-14 17:39:01 -0400, "bahoo" <b83503104@xxxxxxxxx> said:



On Jan 14, 12:27 pm, Gordon Sande <g.sa...@xxxxxxxxxxxxxxxx> wrote:
On 2007-01-14 12:51:47 -0400, "bahoo" <b83503...@xxxxxxxxx> said:

Hi,

The example shown here:
http://control.ee.ethz.ch/~joloef/wiki/pmwiki.php?n=Tutorials.LinearA...
shows much better results with L1 compared to L2 and L_inf. It looks
like L1 is much more robust to this kind of noise.

Can anyone please give me a high-level explanation, or point me to some
references?

Is this related to the LASSO? But I thought the LASSO is used to provide
sparse coefficients. So does it happen that sparse coefficients mean
less influence from noise?

Thanks!

(In case you don't want to go to that website, here is a short
description of the data: a time series is composed of six sine waves
with different frequencies, but replaced with an abrupt noisy signal in
the middle for about 10% of duration. L1 shows perfect fit to the
underlying signal, more or less ignoring the abrupt signal, while L2 is
largely influenced by the noise.)

The natural probability distribution for L_inf is the uniform, for L_2
is the Gaussian and for L_1 is the Laplacian, commonly known as the
double exponential.

The corresponding univariate location estimators are the midrange, mean and
median. So one sees that L_1 is natural for longer tails and L_inf for
short, even bounded, tails.
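
As a small illustration of the three location estimators, here is a
sketch in Python using only the standard library. The function name
location_estimates and the sample values are made up for this example;
they are not from the linked tutorial.

```python
import statistics

def location_estimates(samples):
    """Return the L_inf, L_2 and L_1 location estimates of a sample."""
    lo, hi = min(samples), max(samples)
    return {
        "midrange": (lo + hi) / 2.0,            # minimizes max |x_i - c|   (L_inf)
        "mean": statistics.mean(samples),       # minimizes sum (x_i - c)^2 (L_2)
        "median": statistics.median(samples),   # minimizes sum |x_i - c|   (L_1)
    }

# Hypothetical data: the second sample has one value replaced by a gross outlier.
clean = [1.0, 2.0, 3.0, 4.0, 5.0]
contaminated = [1.0, 2.0, 3.0, 4.0, 500.0]

print(location_estimates(clean))
print(location_estimates(contaminated))
```

On the clean sample all three estimates agree. With the outlier, the
midrange moves the most, the mean moves a fair amount, and the median
does not move at all.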

Gordon, this sounds great.
I wonder if there are any books or articles that have a little bit more
discussion on this issue?
(I did try to Google.)

Thanks!

Look at the maximum likelihood equations for each distribution and it
is rather obvious.
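
Spelled out, as a sketch: write r_i = y_i - f(x_i; theta) for the
residuals of the fit, and assume they are independent with scale
parameters sigma, b and a (these symbols are my notation, not from the
posts above). Then the negative log-likelihoods are

```latex
\begin{align*}
\text{Gaussian:}\quad & p(r_i) \propto \exp\!\left(-\frac{r_i^2}{2\sigma^2}\right)
  &&\Rightarrow\quad -\log L = \frac{1}{2\sigma^2}\sum_i r_i^2 + \text{const}, \\
\text{Laplacian:}\quad & p(r_i) \propto \exp\!\left(-\frac{|r_i|}{b}\right)
  &&\Rightarrow\quad -\log L = \frac{1}{b}\sum_i |r_i| + \text{const}, \\
\text{Uniform on } [-a,a]\text{:}\quad & p(r_i) = \frac{1}{2a}\ \text{for } |r_i| \le a
  &&\Rightarrow\quad L = (2a)^{-n}\ \text{if } \max_i |r_i| \le a,\ \text{else } 0 .
\end{align*}
```

Maximizing the likelihood over theta is then the same as minimizing the
sum of squares (L_2) in the first case and the sum of absolute values
(L_1) in the second. In the third case the likelihood grows as a
shrinks, and the smallest admissible a is max_i |r_i|, so one minimizes
the largest absolute residual (L_inf).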

Gaussian and L_2 are everywhere, once you realize that least squares is just
another name for L_2.

Laplacian and medians also are common. It is mostly a matter of realizing that
medians and L_1 are the same.
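
The same thing carries over to regression. Here is a toy sketch in
Python, assuming a one-parameter model y = a*x rather than the six sine
waves of the linked example; the names l2_slope and l1_slope and the
data are invented for illustration. For this model the L_1 fit is
exactly a weighted median of the ratios y_i/x_i, while the L_2 fit is
the usual least-squares formula.

```python
def l2_slope(xs, ys):
    """Least-squares slope for y = a*x: minimizes sum (y_i - a*x_i)^2."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

def l1_slope(xs, ys):
    """Least-absolute-deviation slope for y = a*x.

    sum |y_i - a*x_i| = sum |x_i| * |y_i/x_i - a|, so the minimizer is a
    weighted median of the ratios y_i/x_i with weights |x_i|.
    """
    pairs = sorted((y / x, abs(x)) for x, y in zip(xs, ys) if x != 0)
    half = sum(w for _, w in pairs) / 2.0
    running = 0.0
    for ratio, weight in pairs:
        running += weight
        if running >= half:
            return ratio

# Hypothetical data: true slope 2, with one of ten points (10%) corrupted.
xs = [float(i) for i in range(1, 11)]
ys = [2.0 * x for x in xs]
ys[4] = 100.0  # the "abrupt" bad value at x = 5

print("L2 slope:", l2_slope(xs, ys))
print("L1 slope:", l1_slope(xs, ys))
```

The L_1 slope stays at the true value because a single bad ratio cannot
move a (weighted) median, while the squared error of the bad point drags
the L_2 slope well away from 2.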

Uniform and Tchebycheff fitting go together. Again you need to connect
Tchebycheff and L_inf.

The various fitting criteria in statistics are not commonly called by their
norm names, so you have to do a bit of translation on the fly.


