Re: nonlinear regression for dummies
- From: Richard Ulrich <Rich.Ulrich@xxxxxxxxxxx>
- Date: Sat, 05 Nov 2005 15:15:33 -0500
On 4 Nov 2005 20:35:02 -0800, "contact@xxxxxxxxxxxxxxx"
<contact@xxxxxxxxxxxxxxx> wrote:
> Hello all,
>
> I'm trying to understand the output of a nonlinear regression analysis
> done by Mathematica. The fit is to a function of 2 parameters. Now,
> before I started this, I thought I knew about statistics; mean,
> variance, chi-squared, and so forth. But the output of the regression
> is a bunch of numbers that I don't quite understand. I'm hoping maybe
> someone here can teach me a thing or two.
>
> The ANOVA table looks like this:
>
>                     DF   SumOfSq   MeanSq
>
> Model                2      0.5    0.25
> Error              111      0.01   0.0001
> Uncorrected Total  113      0.51
> Corrected Total    112      0.18
>
> I'm guessing DF is degrees of freedom, and the SumOfSq is the variance
> squared, but I don't understand any of the other terms.

You should start by reading about the simpler model,
ordinary least squares regression. There, you would learn
that the ANOVA model ordinarily includes the mean, and the
"Total sum of squares" is ordinarily the "Corrected total" --
that is, corrected for the mean. It is the sum of squared
deviations around the mean. The "uncorrected total" is the
SS around zero.

"Variance" is used in two competing senses. The ANOVA,
or "analysis of variance", is a partitioning of Sums of Squares.
So, "variance" sometimes indicates the SS term; otherwise, it
(more often) indicates the SS term divided by the d.f. I used
one textbook that distinguished the two by using capital
S and lower-case s, which was potentially confusing. I don't
remember whether the author used S^2 and s^2, or if that
was a second source of confusion. -- However,
"variance" is already a squared term, so your mention
of "variance squared" is wrong: SumOfSq is a sum of squares,
and MeanSq is that sum divided by its d.f.
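
To make these quantities concrete, here is a small Python sketch that
computes the SS around zero, the SS around the mean, and the variance
as SS divided by d.f. The data values are invented for illustration
and have nothing to do with your fit.

```python
# Illustrative data only; these numbers are made up, not from the original fit.
y = [1.0, 2.0, 3.0, 4.0]
n = len(y)
mean = sum(y) / n

# "Uncorrected total": sum of squares around zero.
ss_uncorrected = sum(v ** 2 for v in y)

# "Corrected total": sum of squared deviations around the mean.
ss_corrected = sum((v - mean) ** 2 for v in y)

# Variance in the more common sense: corrected SS divided by its d.f. (n - 1).
variance = ss_corrected / (n - 1)

print(ss_uncorrected, ss_corrected, variance)
```

The two totals differ by n * mean**2, which is the part of the SS
(and the 1 d.f.) that goes to the mean.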

The test in ANOVA or regression makes an F from the
ratio of the two Mean Squares: MS for hypothesis,
divided by MS for error or residual.
The values shown in your example will give a large F.
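
As a rough check, the F ratio can be worked out from the posted
numbers. This Python sketch assumes the printed values were rounded
for display (0.01/111 is nearer 0.00009 than the 0.0001 shown), so
the result is only approximate.

```python
# Values copied from the posted ANOVA table (presumably rounded for display).
ss_model, df_model = 0.5, 2
ss_error, df_error = 0.01, 111

ms_model = ss_model / df_model   # mean square for the model (hypothesis)
ms_error = ss_error / df_error   # mean square for error (residual)

# F is the ratio of the two mean squares.
f_ratio = ms_model / ms_error
print(f"MS model = {ms_model:.4g}, MS error = {ms_error:.4g}, F = {f_ratio:.4g}")
```

Using the table's own MeanSq column instead (0.25 / 0.0001) gives
2500; either way the F is in the thousands, on 2 and 111 d.f.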

The partition of SS and d.f. makes it clear that
"model (or hypothesis)" + "error (or residual)" equals
the "uncorrected total". It is very conventional that a
non-linear regression will not have a *default* term for
the mean, so the ANOVA table does not use the corrected
total.
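
The additivity is easy to confirm from the posted table. A short
Python check, using only the numbers quoted above (the variable
names are just labels for this sketch):

```python
import math

# (d.f., SS) pairs copied from the posted table.
model = (2, 0.5)
error = (111, 0.01)
uncorrected_total = (113, 0.51)
corrected_total = (112, 0.18)

# Model + Error reproduces the uncorrected total, in both d.f. and SS.
df_sum = model[0] + error[0]
ss_sum = model[1] + error[1]
assert df_sum == uncorrected_total[0]
assert math.isclose(ss_sum, uncorrected_total[1])

# The corrected total has one d.f. fewer: the one used up by the mean.
df_for_mean = uncorrected_total[0] - corrected_total[0]
ss_for_mean = uncorrected_total[1] - corrected_total[1]
print(df_for_mean, round(ss_for_mean, 2))
```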

However, I think that it is a good idea that the program
has displayed the corrected total for your examination, for
a couple of reasons. First, it gives a direct comparison
to simpler models that might have the corrected total,
so that you know that exactly the same data have been
analyzed. Second, it *represents* the ultimate, "simple"
model -- the one that has nothing but the mean -- so you
can eyeball the numbers and see whether your 2-parameter,
non-linear model has actually outperformed the overall mean.

If you are new to this, it is distinctly possible that you will
give the program the wrong model, or give the model in the
wrong way, so that the model will fail terribly. However, it
will *always* reduce the uncorrected SS, possibly by an
amount that is "significant." Thus, if you don't know what
the numbers mean, you could miss the fact that it was a
poor model, no better than using the raw mean.

So it is a really bad sign if the 2-parameter, non-linear model
does not out-perform the simple mean by giving an "error SS"
that is quite a bit lower than the "Corrected Total SS."
For yours, the error SS is 0.01, which is indeed lower than
0.18 for the Corrected Total.
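
One way to eyeball that comparison is to express the error SS as a
fraction of the Corrected Total SS. The Python sketch below does so
with the posted numbers. The name pseudo_r2 is only a label chosen
for this sketch, not something Mathematica reported, and for a
non-linear fit it is a descriptive summary rather than a test.

```python
# SS values copied from the posted table.
ss_error = 0.01            # residual SS of the 2-parameter model
ss_corrected_total = 0.18  # residual SS of the mean-only model

# Fraction of the mean-only model's SS that the fit leaves unexplained.
unexplained = ss_error / ss_corrected_total

# Descriptive summary only; "pseudo_r2" is a label chosen for this sketch.
pseudo_r2 = 1.0 - unexplained
print(f"unexplained = {unexplained:.3f}, pseudo R^2 = {pseudo_r2:.3f}")
```

A pseudo_r2 near zero, or negative, would be the bad sign described
above: the model doing no better than the raw mean.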

Hope this helps.
--
Rich Ulrich, wpilib@xxxxxxxx
http://www.pitt.edu/~wpilib/index.html