Re: Orthogonal Distance Regressions in R (or anywhere else)
- From: Gottfried Helms <helms@xxxxxxxxxxxxx>
- Date: Sat, 09 Jul 2005 11:56:28 +0200
On 09.07.05 05:19, pjhernes wrote:
> Yes, I stumbled across R-help and am still trying to get that "aha!"
> answer that makes everything crystal clear. :-)
>
> Peter
>
Hi Peter -
this "orthogonal regression" can be viewed as a special case
of a more general representation of the shared and the
separate variances in a (linear) variance-components concept.
From factor analysis it may be familiar to read a loadings
matrix as a representation of such variance components.
Given a correlation between x and y
R :
         x           y
---+-------------------------+--
 x |     1           0.885   |
 y |     0.885       1       |
First example, equivalent to the regression model in which y is
regressed on x and x is thought to be "error-free":
components :
         f1          err1
---+-------------------------+--
 x |     1           0       |
 y |     0.885       0.466   |
The components analysis finds one component of variance which
is identified perfectly with x and correlates with y at
r(comp,y)=0.885, and a second component which is individual to y,
with a correlation of 0.466. You can say that the second one
represents the error term of y, or the residual of predicting y
from component f1 (which is identified exactly with x). The
residual variance of y is then 1-r² = 0.466².
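A minimal sketch of this first decomposition, assuming standardized
x and y and r = 0.885 as above. The function names are made up for
illustration. It builds the loadings matrix and checks that the rows
reproduce the correlation.

```python
import math

def loadings_x_error_free(r):
    """Loadings for standardized x, y when x is treated as error-free.

    Rows are x and y, columns are f1 and err1.
    """
    return [[1.0, 0.0],
            [r, math.sqrt(1.0 - r * r)]]

def reproduce_correlations(L):
    """Return L * L^T, the correlations implied by the loadings."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in L]
            for row_i in L]

L = loadings_x_error_free(0.885)
R = reproduce_correlations(L)
print([round(v, 3) for v in L[1]])   # loadings of y on f1 and err1
print(round(R[0][1], 3))             # implied r(x, y)
```

The second loading of y is sqrt(1 - r²), so its square is the
residual variance of y.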
It is now simply a rotation to turn the view of the model so that
there is an error term in the measure x and *no* error term in the
measure y:
components :
         f1          err2
---+-------------------------+--
 x |     0.885       0.466   |
 y |     1           0       |
That rotation does not affect the correlation between x and y,
which can now be read as the loading of x on f1 (which is
perfectly identified with y). The error variance also has the
same value as in the previous case, only that in this case it
is attributed to the measurement of x.
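The rotation can be sketched numerically. This is only an
illustration under the same assumptions (standardized variables,
r = 0.885). The matrix T and the helper names are my own choice: T is
an orthogonal transformation (a rotation combined with a sign flip of
the error axis) that aligns the first axis with y instead of x.

```python
import math

r = 0.885
s = math.sqrt(1.0 - r * r)

# Loadings with x error-free: rows x, y; columns f1, err1.
L1 = [[1.0, 0.0],
      [r,   s]]

# Orthogonal transformation (T * T^T = I), chosen for this illustration.
T = [[r,  s],
     [s, -r]]

def matmul(A, B):
    """Plain matrix product of two nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

L2 = matmul(L1, T)                 # rows x, y; columns f1, err2
R1 = matmul(L1, transpose(L1))     # correlations implied by L1
R2 = matmul(L2, transpose(L2))     # correlations implied by L2

print([[round(v, 3) for v in row] for row in L2])
print(round(R1[0][1], 3), round(R2[0][1], 3))
```

Both loadings matrices imply the same r(x,y); only the variable that
carries the error term changes.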
Between those two extremes your model with error terms in each
variable can be located, simply by rotations of that components
model. For instance, we can find a position where the error
variances of x and y are equal:
components :
         f1          err1        err2
---+-----------------------------------+--
 x |     0.941       0.34        0     |
 y |     0.941       0           0.34  |
where the correlation between x and y is still unchanged:
it is always the sum of the products of the loadings
r(x,y) = 0.941*0.941 + 0.34*0 + 0*0.34 = 0.885
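As a small check of the equal-error solution: with one common
component and equal error variances, both variables load on f1 with
sqrt(r) and on their own error term with sqrt(1-r). The sketch below
assumes standardized variables and r > 0. The function name is
hypothetical.

```python
import math

def equal_error_loadings(r):
    """Loadings (f1, err1, err2) for x and y with equal error variances."""
    common = math.sqrt(r)          # same loading of x and y on f1
    error = math.sqrt(1.0 - r)     # same error loading for both variables
    return [[common, error, 0.0],
            [common, 0.0, error]]

L = equal_error_loadings(0.885)
r_xy = sum(a * b for a, b in zip(L[0], L[1]))   # sum of loadings products

print([round(v, 3) for v in L[0]])
print(round(r_xy, 3))
```

The error loading comes out near 0.339, which is the 0.34 in the
table after rounding.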
But there are arbitrarily many other solutions with two error terms,
for instance:
components :
         f1          err1        err2
---+-----------------------------------+--
 x |     0.895       0.446       0     |
 y |     0.988       0           0.152 |
which satisfies the same correlation between x and y (up to
rounding). I assume that orthogonal regression finds the previous
solution, where the error variances are equal. But since infinitely
many solutions are possible, you have to impose additional
restrictions on your model; in this case you may apply
restrictions to the error terms (that they are equal, that they
have a certain other ratio, or even that their variances have
certain values).
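To illustrate how such a restriction pins down one solution, here is
a sketch that takes the ratio of the two error variances as given.
The parameterization by `ratio` and the function name are my own
assumptions for illustration. It assumes standardized variables,
0 < r < 1 and ratio > 0.

```python
import math

def loadings_for_error_ratio(r, ratio):
    """One common component f1 plus one error term per variable.

    ratio = var(err2) / var(err1), the error variance of y over that of x.
    """
    # With loadings a (x on f1) and b (y on f1), the conditions
    #     a * b = r   and   1 - b^2 = ratio * (1 - a^2)
    # give a quadratic in u = a^2:
    #     ratio * u^2 + (1 - ratio) * u - r^2 = 0
    disc = (1.0 - ratio) ** 2 + 4.0 * ratio * r * r
    u = (-(1.0 - ratio) + math.sqrt(disc)) / (2.0 * ratio)
    a = math.sqrt(u)
    b = r / a
    return [[a, math.sqrt(1.0 - a * a), 0.0],
            [b, 0.0, math.sqrt(1.0 - b * b)]]

equal = loadings_for_error_ratio(0.885, 1.0)     # equal error variances
quarter = loadings_for_error_ratio(0.885, 0.25)  # y has 1/4 the error variance

for L in (equal, quarter):
    print([[round(v, 3) for v in row] for row in L])
```

With ratio = 1 this gives back the equal-error solution; any other
ratio selects a different member of the same family, and the product
of the two f1 loadings stays at r.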
---
This explanation is *only meant* to make *visible* the things
that happen implicitly in regression/orthogonal regression.
(Here it is also assumed that x and y are already standardized;
the unstandardized solution can be retrieved from these
results.)
In the case of two error terms it is also a bit more difficult
to build the regression equation of y on x={x_truevalue + x_error}
from those components matrices.
From the first case it is simple:
components :
         f1          err1
---+-------------------------+--
 x |     1           0       |
 y |     0.885       0.466   |
Since the top-left submatrix above the row of y is simply Lxx={1},
its inverse is also Lxx^-1 = {1}. To retrieve the beta values
one multiplies the corresponding submatrix from the row of y,
Lyx = {0.885}, by that inverse:
beta = Lyx * Lxx^-1 = {0.885} * {1}^-1 = {0.885}
If the x-variable has two components, then the inverse of Lxx
has to be a pseudoinverse.
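A sketch of the pseudoinverse case, using the equal-error loadings
from above (x loads 0.941 on f1 and 0.34 on err1; y loads 0.941 and 0
on those same two columns). For a single row vector Lxx the
Moore-Penrose pseudoinverse is Lxx^T / (Lxx * Lxx^T), so beta reduces
to a ratio of two dot products. The function name is hypothetical.

```python
def beta_single_x(L_xx, L_yx):
    """beta = L_yx * pinv(L_xx) for one x-variable with several components."""
    numerator = sum(a * b for a, b in zip(L_yx, L_xx))   # L_yx * L_xx^T
    denominator = sum(a * a for a in L_xx)               # L_xx * L_xx^T
    return numerator / denominator

L_xx = [0.941, 0.34]   # x on f1 and err1
L_yx = [0.941, 0.0]    # y on the same two columns

beta = beta_single_x(L_xx, L_yx)
print(round(beta, 3))
```

With the rounded loadings this comes out at about 0.885, the same
coefficient as in the error-free case, since it is again the
regression of y on the observed x.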
In a multivariate case (for instance trivariate) the above
procedure with matrix inversion becomes clearer; see the appendix
for a standard regression.
Here too, errors in all variables can be assumed in a components
model, and the same framework can be used completely analogously
to find the regression coefficients, though again one must define
restrictions for the error terms.
Gottfried Helms
-----
Appendix
correlation
R :
         x1          x2          y
   +-----------------------------------+
x1 |     1           0.924       0.627 |
x2 |     0.924       1           0.718 |
 y |     0.627       0.718       1     |
   +-----------------------------------+
components :
         f1          f2          err_y
   +-----------------------------------+
x1 |     1           0           0     |
x2 |     0.924       0.383       0     |
   +-----------------------------------+
 y |     0.627       0.363       0.69  |
   +-----------------------------------+
Lxx is now :
         f1          f2
   +-------------------------+
x1 |     1           0       |
x2 |     0.924       0.383   |
   +-------------------------+
and Lyx is :
         f1          f2
   +-------------------------+
 y |     0.627       0.363   |
   +-------------------------+
The inverse InvLxx :
   +-------------------------+
   |     1           0       |
   |    -2.415       2.613   |
   +-------------------------+
beta is
beta = Lyx * InvLxx
         x1          x2
   +-------------------------+
 y |    -0.249       0.948   |
   +-------------------------+
y = -0.249*x1 + 0.948*x2 + err
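The appendix can be reproduced with a short sketch. I assume here
that the components matrix is the Cholesky factor of R (lower
triangular, which matches the pattern of zeros above). The helper
names are made up for illustration.

```python
import math

def cholesky(R):
    """Lower-triangular L with L * L^T = R (R symmetric positive definite)."""
    n = len(R)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(R[i][i] - s)
            else:
                L[i][j] = (R[i][j] - s) / L[j][j]
    return L

def invert_lower_triangular(L):
    """Inverse of a lower-triangular matrix by forward substitution."""
    n = len(L)
    inv = [[0.0] * n for _ in range(n)]
    for i in range(n):
        inv[i][i] = 1.0 / L[i][i]
        for j in range(i):
            s = sum(L[i][k] * inv[k][j] for k in range(j, i))
            inv[i][j] = -s / L[i][i]
    return inv

R = [[1.0,   0.924, 0.627],
     [0.924, 1.0,   0.718],
     [0.627, 0.718, 1.0]]

L = cholesky(R)
L_xx = [row[:2] for row in L[:2]]   # loadings of x1, x2 on f1, f2
L_yx = L[2][:2]                     # loadings of y on f1, f2
inv_L_xx = invert_lower_triangular(L_xx)

# beta = L_yx * inv(L_xx), a row vector times a matrix
beta = [sum(L_yx[k] * inv_L_xx[k][j] for k in range(2)) for j in range(2)]
print([round(b, 3) for b in beta])
```

The betas agree with the values above; the entries of the inverse
differ slightly in the third decimal because the printed loadings are
rounded.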