Re: Problem related to a linear regression



Thank you Matt for having had a closer look at my problem.

On 2 Nov., 05:06, matt271829-n...@xxxxxxxxxxx wrote:
I'm kind of confused. Initially you found the values of A and B that
give the best fit of the line y = A + B*x to the data points (Xf, Y)?

Yes. Without having looked at it in detail, I expected that this
would provide the 'best' estimate of Y from Xf.

Then you applied the transformation Xf' = A + B*Xf, where Xf' is the
transformed X-value?

Yes, and I found that with these A and B the regression line for the
data points (Xf',Y) is Y=Xf', i.e. the data points are arranged 'best'
around the line corresponding to 100 percent correlation. In a
'normal' case I would have expected this line to be the 'best'
estimate of Y from Xf' as well. My confusion comes from the fact that,
looking at the Y/Xf' graph, the line Y=Xf' doesn't (yet) seem to be
the 'best' estimate of Y from Xf'.

(This is what I assume you mean by "the X-values
were scaled with A and B": you're transforming x so as to take the

I used the word 'scaled' since A and B don't affect the correlation
found for Xf; they just move the data points in the xy plane.
Apparently it is not the correct term in this context.

line y = A + B*x to the line y = x.) And now you're wondering if this
choice of A and B actually gives the best fit of the transformed data
points (Xf', Y) to the line y = x?

Yes. As explained above, it actually is the 'best' fit in terms of the
correlation (data points nearest to Y=Xf'), but probably not the
'best' estimate of Y from Xf'.

Doesn't it amount to exactly the same thing? In the first case you're
choosing A and B so as to minimise the sum of (Y - (A + B*Xf))^2, and
in the second case you're choosing A and B so as to minimise the sum
of (Y - Xf')^2, where Xf' = A + B*Xf?
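
The equivalence asked about here can be checked numerically. The following is only a minimal sketch on synthetic data: the sample, the random seed and the names xf, y and xf_t (standing for Xf') are assumptions made for illustration, not the data from this thread. It fits y = A + B*xf by ordinary least squares, applies Xf' = A + B*Xf, and then refits y against Xf'.

```python
import numpy as np

# Synthetic data, assumed purely for illustration (not the thread's data).
rng = np.random.default_rng(0)
xf = rng.uniform(0.0, 10.0, 200)
y = 1.5 + 0.8 * xf + rng.normal(0.0, 1.0, 200)

# Least-squares fit of y = A + B*xf; polyfit returns the slope first.
B, A = np.polyfit(xf, y, 1)

# Transformed x-values: Xf' = A + B*Xf.
xf_t = A + B * xf

# Refit y = C + D*Xf'. The result should be C close to 0 and D close to 1.
D, C = np.polyfit(xf_t, y, 1)

# The two sums of squares are the same quantity written in two ways.
sse_direct = np.sum((y - (A + B * xf)) ** 2)
sse_transformed = np.sum((y - xf_t) ** 2)

print(f"C = {C:.3e}, D = {D:.6f}")
print(f"SSE direct = {sse_direct:.6f}, SSE transformed = {sse_transformed:.6f}")
```

Under these assumptions the refit returns the line y = Xf' up to rounding, and both sums of squares agree, which is what the question above suggests.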

Maybe I misunderstood it...

In the first case (for finding A and B) I minimise the sum of
(Y-Xf)^2. Oooooops, while writing this I just realised that this is the
reason why I get Y=Xf' with A and B. (It really helps sometimes to
explain what one is doing in order to recognise the problem.) The
reason for choosing (Y-Xf)^2 is that Y and Xf refer to the same
physical parameter: Y is the observed values and Xf the values
estimated from other parameters. (It's actually not a calibration, but
it is this type of problem.)

For finding the 'best' (linear) relation between Xf' and Y
(Y=C+D*Xf'), the sum of (Xf'-(Y-C)/D)^2 was minimised. In the xy graph
this regression line then also 'looks' more like the 'best' estimate
of Y from the Xf' values.
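
The difference between the two lines can be illustrated with a short sketch. It uses synthetic data again, and all names and values are assumptions for illustration only. The first fit minimises the residuals in Y (vertical distances) and the second minimises the residuals in Xf' (horizontal distances), then rewrites the result as Y = C + D*Xf'.

```python
import numpy as np

# Synthetic data, assumed purely for illustration (not the thread's data).
rng = np.random.default_rng(1)
xf = rng.uniform(0.0, 10.0, 500)
y = 2.0 + 0.5 * xf + rng.normal(0.0, 1.0, 500)

# Step 1: fit y = A + B*xf and transform, so that y on Xf' is the line y = Xf'.
B, A = np.polyfit(xf, y, 1)
xf_t = A + B * xf

# Step 2: regress Xf' on y, i.e. minimise the sum of (Xf' - (p + q*y))^2.
q, p = np.polyfit(y, xf_t, 1)

# Rewrite Xf' = p + q*y as y = C + D*Xf'.
D = 1.0 / q
C = -p / q

# Correlation coefficient between Xf' and y.
r = np.corrcoef(xf_t, y)[0, 1]

print(f"D = {D:.6f}, 1/r^2 = {1.0 / r**2:.6f}, C = {C:.6f}")
```

In this sketch D equals 1/r^2, so the second line is steeper than Y = Xf' whenever the correlation is not perfect. That may be why it appears to follow the cloud of points more closely in the graph. Which of the two lines counts as 'best' depends on which residuals are meant to be small: the line Y = Xf' has the smallest sum of squared errors in Y.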

One final question remains: which procedure is preferable from a
numerical point of view, the direct one (Xf <-> Y) or, as done now,
the indirect one (Xf <-> Xf' <-> Y), with Xf being a non-linear
function?

Thank you for your help!

Regards Marcel
