Re: r^2 and log transformation



On Tue, 5 May 2009 06:41:31 -0700 (PDT), rcmcll@xxxxxxxxx wrote:

On May 5, 4:05 am, Karl Ove Hufthammer <Karl.Huftham...@xxxxxxxxxxx>
wrote:

The actual title of that paper is ‘The use of R² to determine the
appropriate transformation of regression variables’, and it has the
following abstract:

It is proved that, under certain conditions, the ‘true’ model,
from a set of alternative regression specifications involving different
transformations of the same dependent variable, is the formulation for
which population R² is highest. This theorem, which requires assumptions
of normality, is proved through the use of Hermite polynomials. The paper
contains an appendix listing the major properties of these polynomials.

I wasn't sure what this article contained, but I thought it might have
the info I was after. I got the reference from Peter Kennedy's book
"A Guide To Econometrics". I haven't had the time to get over to the
univ. library and take a look at it. But I'm still sure there is a way
to equate the r^2 from a log-transformed dependent variable with the
r^2 of the original variable.
Bob

It doesn't sound like it, to judge by the abstract: this paper says
when r-squared would be *highest*, but it doesn't say *what it is*,
which seems to be what you want.

To your question, the correlation of the log-y values with the
x-values depends on the form of the original relationship, not just on
the value of the correlation, so the correlation can go up or down
when you take logs. Examples:

x 1 2 3 4 5 6
y 1 3 5 7 9 11

is a perfect linear relationship, so the correlation between x and y
is 1; if you replace y by log y, the correlation drops to 0.945. (If
you start with data near a straight line, the correlation will drop
when you take logs.) But in this case:

x 1 2 3 4 5 6
y 1 2 4 8 16 32

which is exponential growth, taking logs will improve things: for the
original data, the correlation is 0.906, and after taking logs it is
exactly 1, because log y is then a linear function of x. (If you start
with data near exponential growth or decay, the correlation will
improve when you take logs.)
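
If you want to check these figures yourself, here is a minimal sketch in plain Python. The helper name pearson_r and the variable names are my own, made up for illustration; nothing here comes from the paper or from Kennedy's book. It uses natural logs, but the base makes no difference to the correlation, since changing base only rescales log y by a positive constant.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

x = [1, 2, 3, 4, 5, 6]
y_linear = [1, 3, 5, 7, 9, 11]   # perfect straight line
y_exp = [1, 2, 4, 8, 16, 32]     # exponential growth (doubling)

# Correlation of x with y, and of x with log y, for each data set.
r_linear = pearson_r(x, y_linear)
r_linear_log = pearson_r(x, [math.log(v) for v in y_linear])
r_exp = pearson_r(x, y_exp)
r_exp_log = pearson_r(x, [math.log(v) for v in y_exp])

print(f"linear data:      r = {r_linear:.3f}, after logs r = {r_linear_log:.3f}")
print(f"exponential data: r = {r_exp:.3f}, after logs r = {r_exp_log:.3f}")
```

Square any of these to get the r^2 of the corresponding simple regression. Same x-values both times, but the logs push the correlation in opposite directions, which is why knowing r^2 on one scale doesn't pin down r^2 on the other.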

Cheers,
Ken.


--
Ken Butler, Lecturer (Statistics)
University of Toronto at Scarborough
butler (at) utsc.utoronto.ca
http://www.utsc.utoronto.ca/~butler