Re: Problem related to a linear regression
- From: MET <Marcel.E.Tschudin@xxxxxxxxx>
- Date: Sat, 03 Nov 2007 16:41:21 -0000
First of all: Thank you for taking so much time for this!
On 3 Nov., 16:51, matt271829-n...@xxxxxxxxxxx wrote:
> OK, I'll assume you're doing the following (perhaps omitting some
> detail that is not relevant to the question).
> 1. You have a set of Y-values, which are your observations.
yes
> 2. You have a function F which takes a bunch of parameters and
> produces one value corresponding to each Y, which we'll call Xf. You
> iteratively refine this function F until you get the best correlation
> between Xf and Y.
yes
> However, Xf is *not* a direct estimate of Y, since
> the correlation line need not be anywhere near Y = Xf.
I'm not sure what you mean by "direct estimate".
Regarding the correlation line: I'm not sure, since we have the same
physical parameter, once observed and once estimated.
> 3. You regress Xf and Y, yielding A and B, such that y = A + B*x is
> the best fit line to the (Xf, Y) data.
yes
> 4. You now define Xf' = A + B*Xf (your "shifting/scaling" of the
> x-axis) and plot Xf' against Y. You note that the best fit line to this
> plot does not appear to be the line y = x as one would expect.
yes
> In your Test_2.gif (Y versus Xf') I assume that the pale blue line is
> the same as the pale blue line in Test_1.gif (Y versus Xf), except
> shifted and scaled along with the data points. By definition this line
> is y = x. I assume that the white line is the result of the second
> regression.
yes
> Now, if you did another least squares regression on (Xf',
> Y) in the same way as you did the first on (Xf, Y) then, as far as I
> can see, you would inevitably end up with the line y = x. The fact
> that you get a different line (the white line) is, we think, because
> in the first regression you minimised squared deviations in the
> y-direction, and in the second you minimised squared deviations in the
> x-direction. I think if you "want the white line" then you might just as
> well do the first regression in the same way as you're currently doing
> the second, and not bother doing the second.
> To me, the pale blue line at Test_2.gif looks no better or worse than
> the pale blue line at Test_1.gif. In other words, I don't think the
> shifting/scaling to get from Test_1.gif to Test_2.gif has anything to
> do with the problem.
I agree, this was only done to bring the results from F into the value
range of Y, since the idea is to provide a function for estimating Y.
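The quoted claim that a second y-direction regression on (Xf', Y) must return y = x can be checked numerically. The following is only a minimal sketch with made-up data; the helper name `ols`, the seed and all numbers are illustrative assumptions and do not come from the thread.

```python
import random

def ols(x, y):
    """Fit y = a + b*x by minimising squared deviations in the y-direction."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b

random.seed(0)
# Made-up data: xf stands in for the output of F, y for the observations.
xf = [random.uniform(0.0, 10.0) for _ in range(200)]
y = [3.0 + 0.5 * v + random.gauss(0.0, 1.0) for v in xf]

a, b = ols(xf, y)                    # first regression (step 3)
xf_scaled = [a + b * v for v in xf]  # shifted/scaled Xf' (step 4)

# Second regression, again in the y-direction, on (Xf', Y).
a2, b2 = ols(xf_scaled, y)
print(a2, b2)  # intercept close to 0, slope close to 1
```

Algebraically the slope of the second fit is cov(Xf', Y) / var(Xf') = B*cov(Xf, Y) / (B^2 * var(Xf)) = 1, and the intercept is mean(Y) - mean(Xf') = 0, so any other line has to come from a different fitting criterion.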
> As much for my own interest as anything else, I
> did a test plot at http://img461.imageshack.us/img461/3523/regressionvj2.gif.
A very neat way to show the problem in a simplified form!
> The blue regression line is found by minimising squared deviations in
> the y-direction, and the green regression line by minimising squared
> deviations in the x-direction. You can see they are markedly
> different, which I think duplicates the effect you're seeing.
Indeed, very nicely.
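A small sketch of the effect described in the quoted paragraph, again with made-up data (the names `fit_y_on_x`, `b_blue`, `b_green` and the seed are illustrative assumptions, and the linked plot itself is not reproduced here). The "blue" line minimises squared y-deviations; the "green" line minimises squared x-deviations and is then rewritten in the form y = a + b*x so the two slopes can be compared.

```python
import random

def fit_y_on_x(x, y):
    """Fit y = a + b*x by minimising squared deviations in the y-direction."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b

random.seed(1)
# Made-up noisy data with a true slope of 1.
x = [random.gauss(0.0, 1.0) for _ in range(500)]
y = [xi + random.gauss(0.0, 1.0) for xi in x]

# "Blue" line: minimise squared deviations in the y-direction.
a_blue, b_blue = fit_y_on_x(x, y)

# "Green" line: minimise squared deviations in the x-direction,
# i.e. fit x = c + d*y, then solve for y: y = -c/d + (1/d)*x.
c, d = fit_y_on_x(y, x)
a_green, b_green = -c / d, 1.0 / d

# The ratio of the two slopes equals the squared correlation coefficient.
r_squared = b_blue * d
print(b_blue, b_green, r_squared)
```

The two lines only coincide when the correlation is perfect; the noisier the data, the further apart the slopes are, with the x-direction line always the steeper of the two.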
> So, which method should you use? Well, since you want to estimate Y
> from Xf', it would seem that minimising squared y-distances (or
> possibly minimising absolute y-differences) makes more sense than
> minimising squared x-differences or ODR, even though it may "look
> wrong".
Well, that's how I started: by minimising (ignoring u and v for now)
SUM(Y-F)^2
> Given the F-parameters that you have, you want the generated
> Xf' to be as near as possible to the corresponding real Y;
yes
> you are not
> interested in the extent to which you need to vary the parameters
> supplied to F in order to get Xf' to exactly equal the real Y.
If I understand you right here: yes; the Y values have a 'natural'
standard deviation, i.e. it would be wrong to reduce the standard
deviation obtained by an increased number of observations below their
'natural' limit.
> Finally, it strikes me that this is a kind of roundabout way of doing
> things (probably you have very good reasons!).
That's why I mentioned at the very beginning that I am probably making a
detour.
As mentioned above, I started by finding the parameters of a correlating
function by minimising SUM(Y-F)^2. Then I tried to 'match' the results
to the size range of the observed values, and on looking at the graph I
gained the impression that the estimates could be further improved.
> I don't really see why
> you don't fix up your optimisation of F so that it gives you the final
> estimate for Y (using whatever best fit criteria you choose), thus
> obviating the need for the separate regression step.
But how? The problem is still how the values of F providing the 'best'
correlation can be 'matched' to the observed values Y?
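One possible reading of the quoted suggestion, sketched under loud assumptions: the real F is not given in the thread, so a hypothetical one-parameter F(u, p) = exp(-p*u) stands in for it, and the data, the grid of candidate p values and all names are invented for illustration. For each candidate p the offset A and scale B are found in closed form, and the p that minimises SUM(Y - (A + B*F))^2 is kept, so the matching to the range of Y happens inside the optimisation rather than in a separate step.

```python
import math
import random

def fit_line(x, y):
    """Least-squares y = a + b*x; also return the residual sum of squares and r^2."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    r2 = sxy * sxy / (sxx * syy)
    return a, b, sse, r2

def F(u, p):
    """Hypothetical model function; the thread does not specify the real F."""
    return math.exp(-p * u)

random.seed(2)
# Made-up observations generated as Y = 2 + 3*F(u, 0.7) plus small noise.
u = [random.uniform(0.0, 5.0) for _ in range(100)]
y = [2.0 + 3.0 * F(ui, 0.7) + random.gauss(0.0, 0.01) for ui in u]

# Crude grid search over p; A and B are re-fitted for every candidate.
results = []
for k in range(1, 201):
    p = k / 100
    f = [F(ui, p) for ui in u]
    a, b, sse, r2 = fit_line(f, y)
    results.append((p, a, b, sse, r2))

best_sse = min(results, key=lambda t: t[3])   # smallest SUM(Y - (A + B*F))^2
best_corr = max(results, key=lambda t: t[4])  # largest squared correlation
p_hat, a_hat, b_hat = best_sse[:3]
print(p_hat, a_hat, b_hat, best_corr[0])
```

Because the minimised residual sum for a given p equals SUM(Y - mean(Y))^2 * (1 - r^2), the p with the smallest residual sum is also the p with the highest squared correlation. In this sketch, at least, "best correlation followed by a y-direction regression" and "joint least squares with offset and scale" select the same parameters.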