Re: Central Limit Theorem?
- From: "Reef Fish" <Large_Nassau_Grouper@xxxxxxxxx>
- Date: 26 Apr 2005 10:32:49 -0700
Herman Rubin wrote:
> In article <1114487678.907557.44870@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>,
> Reef Fish <Large_Nassau_Grouper@xxxxxxxxx> wrote:
>
> < much snippage for brevity and getting to some real ISSUES >
>
> >> Who needs a full course on Monte Carlo methods?
>
> >LOL! This coming from a man who thought a 4-quarter sequence in
> >Measure Theory and Probability Theory is necessary for an APPLIED
> >statistician!
>
> I would settle for three, but 4 would be better.
>
> >That one sentence says it all about your prejudice and ignorance
> >about the subject.
>
> >The simple answer is: My doctoral students and some doctoral
> >students from Engineering and Computer Science have taken my
> >Monte Carlo methods course because they NEEDED the material
> >for their doctoral dissertation work.
>
> >Your doctoral students in mathematical statistics don't
> >even need to know how to compute anything.
>
> I will admit that some of them did not.
That says a lot about your notion of applicable statistics.
> >C'est la difference, monsieur Rubin.
>
> >In MY applied statistical environment we offer ZERO course of
> >YOUR kind in Measure Theory OR Probability Theory, because
> >for most of the doctoral students there is absolutely NO USE
> >for it, as we had argued in a separate thread about my view
> >of the uselessness of "Measure Theory" in Applied Statistics.
>
> You seem to advocate teaching of ritual.
I don't expect a statistical mathematician like yourself to understand
what APPLIED statistics and Data Analysis are all about.
> Forty years ago, I received a request to
> compute the significance level WAY out in the tails for the
> sum of a large number of absolute normal random variables.
> This turned out to be rather easy; simulation would have
> been useless.
One thing a good applied statistician learns is WHEN to apply certain
methods and when NOT. p-value and tail probability calculations
generally do not call for Monte Carlo methods. Another
one of Herman Rubin's strawmen.
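
A minimal sketch of why far-tail probabilities are a poor fit for plain simulation. It uses a single standard normal as a stand-in (an assumption; the problem quoted above involved a sum of absolute normals), and the function names are illustrative only.

```python
import math
import random


def exact_upper_tail(z):
    """P(Z > z) for a standard normal, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def naive_mc_upper_tail(z, n_draws, seed=0):
    """Plain Monte Carlo estimate of P(Z > z): the fraction of draws above z."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_draws) if rng.gauss(0.0, 1.0) > z)
    return hits / n_draws


# At z = 6 the true tail is about 1e-9, so a run of 10**5 draws expects
# roughly 1e-4 hits and will almost always report an estimate of 0.
print(exact_upper_tail(6.0))
print(naive_mc_upper_tail(6.0, 100_000))
```

The closed-form value costs one function call, while the naive estimator would need billions of draws before it saw even a handful of hits.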
> ...................
>
> >> At that time, I was already past "retirement age". I am still
> >> doing research, and will continue to do so as long as I can.
>
> >I have no problem about that. The question for history to judge is
> >whether your publication has any impact on statistics, or from my
> >point of view, more importantly any impact on the APPLICATION of
> >Statistics.
>
> >I chose to retire because I had made a few meager contributions
> >to Applied Statistics, and I was bored with much of what's going
> >on in ASA and IMS, especially the IMS and mathematical statisticians.
>
> I am not a member of ASA because I consider IMS not to
> be sufficiently mathematical.
That figures! And ASA is considered TOO mathematical by those foot
soldiers in the field DOING statistical application. Even the
Applications section of JASA and the American Statistician (which is REALLY
low level in mathematical prerequisites) are beyond their mathematical
comprehension level.
Yet many of them CAN and ARE producing creditable applied statistical
work, by following the correct methodologies and avoiding all the
pitfalls in applying those methodologies.
And YOU, in your fantasy world, are arguing that they need to know
"Measure Theory" and Loeve level Probability Theory before they can
apply statistics well.
The absurdity of your stance should be clear to EVERYONE in this
sci.stat.math group except yourself.
> How should significance levels vary with sample size?
> This question was answered by Sethuraman and me 40
> years ago; theory can and should be applied.
This would be a completely USELESS result in MY book:
30 years ago, an important chapter in my Data Analysis Lecture Notes
made the point that "statistical significance" and "practical
significance" are COMPLETELY unrelated! I didn't need any MATHEMATICAL proof to
prove that. All I had to do was to exhibit many statistically
significant results that were completely USELESS in PRACTICE.
That articulates our different worlds, Herman.
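
A small sketch of the statistical-versus-practical significance point, assuming a one-sample z-test with known standard deviation (the test, the effect size, and the function name are illustrative choices, not taken from the lecture notes). The same negligible effect of 0.01 standard deviations is nowhere near significant at n = 100 and overwhelmingly significant at n = 1,000,000.

```python
import math


def two_sided_p_value(effect_in_sd, n):
    """Two-sided p-value of a z-test for a mean shift of `effect_in_sd`
    standard deviations, observed exactly, with sample size n."""
    z = abs(effect_in_sd) * math.sqrt(n)
    # 2 * P(Z > z) for a standard normal equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


for n in (100, 10_000, 1_000_000):
    print(n, two_sided_p_value(0.01, n))
```

The p-value here is driven by the sample size alone; the effect itself never changes, and whether 0.01 standard deviations matters in practice is a subject-matter question the p-value cannot answer.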
> >> >I can recall most of the results from memory, such as all the theory
> >> >and methods about Linear Models, and all the partial correlation
> >> >theory and methods I've been telling Richard Ulrich on his abuse of
> >> >Multiple Regression "expected signs"; and even the computational
> >> >details of using SWEEP and how every single entry of a swept matrix
> >> >relates to a DIFFERENT bit of information in a multiple regression
> >> >problem, without having to refer to any books or notes. :-)
>
> >> Heck, I knew that 60 years ago.
>
> >Why am I not surprised at a statement like that from you?
>
> >What you knew 60 years ago was the Gaussian Sweep, which dated
> >over a hundred years ago. What revolutionized MODERN regression
> >computation was Al Beaton's SWP which was his Harvard doctoral
> >dissertation in 1964. That was only 40 years ago.
>
> THAT is a PhD dissertation? My standards are far higher. And
> in no way would it be even a statistics result, but one in
> numerical analysis.
Your remark merely shows that you're completely ignorant about Al
Beaton's SWP, and the OTHER Operators for ANOVA and other statistical
computations in his 1964 Ed.D. dissertation at Harvard.
I mentioned SWP because it was one of the CLEVEREST results ever to come
out of applied statistics, for Linear Regression computation. It is
ALL statistics.
> >It's a commutative and reversible operator that is not possessed
> >by any Gaussian (or other elimination) methods up till 1964.
>
> >It was the "reversibility" of Beaton's SWEEP operator that made
> >multiple regression, stepwise regression, and all kinds of MODERN
> >computer software packages adopt that approach. Jim Goodnight
> >of SAS is one of the ones who used it well.
>
> >Do you know how to do the absolutely most efficient (you cannot
> >do it with fewer arithmetic operations any other way)
> >way of getting ALL possible regressions of Y on k independent
> >variables via Beaton's SWP in the order of the Hamiltonian path?
> I suggest you read my paper in _Multivariate Analysis II_, in
> which I showed that one could get the sums of squares of the
> residuals in time O(2^k), which is clearly best possible. But
> this would not justify a thesis.
Please excuse me for LOL. When you talk about big O, you're already
into another realm of ASYMPTOTIC MATHEMATICS.
Let's say I am using the SWP operator for my Multiple regression
computation on 100,000 observations. After I have the results for
Y regressed on X1, ...X10, say, with the SSE being ONE element of
a 20 x 20 matrix being swept (because X11, X12, ... X18 are not
yet in the multiple regression), I can get the SSE of the regression
of Y on X1, ..., X10, together with any one of the remaining Xj, in
THREE arithmetic operations!
ONE multiplication, ONE division, and ONE subtraction.
Mull on that a bit to let it sink in.
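
A minimal sketch of the three-operation shortcut, in plain Python on a tiny made-up data set. The sign convention of `sweep` below (pivot becomes 1/d, pivot row +a/d, pivot column -a/d) is one common formulation chosen because it is self-reversing; treating it as equivalent to Beaton's original SWP is an assumption, and all names and data are illustrative.

```python
def sweep(a, k):
    """Return a copy of the square matrix `a` swept on pivot index k.

    Assumed convention: pivot -> 1/d, pivot row -> +a/d, pivot column -> -a/d.
    With it, sweeping twice on the same k restores the original matrix.
    """
    n = len(a)
    d = a[k][k]
    b = [row[:] for row in a]
    for i in range(n):
        for j in range(n):
            if i != k and j != k:
                b[i][j] = a[i][j] - a[i][k] * a[k][j] / d
    for j in range(n):
        if j != k:
            b[k][j] = a[k][j] / d
            b[j][k] = -a[j][k] / d
    b[k][k] = 1.0 / d
    return b


def sse_if_added(a, j, y):
    """SSE after adding unswept variable j: one multiply, one divide, one subtract."""
    return a[y][y] - a[j][y] * a[j][y] / a[j][j]


def cross_products(columns):
    """Cross-product matrix Z'Z for a list of equal-length columns."""
    return [[sum(u * v for u, v in zip(ci, cj)) for cj in columns] for ci in columns]


# Made-up data: intercept, x1, x2, y (index 3 is the response).
ones = [1.0] * 5
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.3]
m = cross_products([ones, x1, x2, y])

in_model = sweep(sweep(m, 0), 1)          # Y regressed on intercept and x1
shortcut = sse_if_added(in_model, 2, 3)   # SSE if x2 were added, without sweeping
full = sweep(in_model, 2)                 # actually bring x2 in
print(shortcut, full[3][3])               # the two SSE values should agree
back_out = sweep(full, 2)                 # sweeping x2 again takes it back out
```

The number of observations only affects the cost of forming the cross-product matrix once; every step after that works on the small matrix alone.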
> >I implemented that in a software package 33 years ago,
> >BEFORE any of the known stat packages today did it, efficiently.
> >Some of them probably still use the "brute force" inefficient
> >method rather than the Hamiltonian path.
>
> I do not think the lexicographic approach is that much less
> efficient, and it is easier to keep track of.
It is MUCH less efficient. It doesn't matter what you think in this
case because you don't KNOW Beaton's operator SWP and you don't know
how much more efficient it is (A FACT) than any lexicographical
approach!
The order of computing ALL possible regressions on X1, ... Xk is
by applying Beaton's SWP in the order of the Hamiltonian path
121312141213121... etc to produce the order of different combinations
of X in this order:
X1, (X1, X2), X2, (X2, X3), (X1, X2, X3), (X1, X3), X3, ...
because Beaton's SWP brings an X in if it's not already in the
Multiple Regression, and takes it out if it is (because of the
REVERSIBLE property of SWP), thus resulting in the most efficient
method of computation, because EACH of the 2^k -1 possible
regressions takes exactly ONE SWP, no matter how many independent
variables there are in the combination! (One SWP is the equivalent
of one pivotal step in a Gaussian elimination method.)
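
A short sketch of the pivot order described above, with illustrative function names. It generates the sequence 1, 2, 1, 3, 1, 2, 1, ... from the lowest set bit of a step counter (the usual Gray-code construction, used here as an assumption about how one would produce the path) and tracks which variables are in the regression after each single sweep.

```python
def sweep_order(k):
    """Pivot sequence 1,2,1,3,1,2,1,... of length 2**k - 1."""
    order = []
    for step in range(1, 2 ** k):
        # 1-based position of the lowest set bit of the step counter.
        order.append((step & -step).bit_length())
    return order


def subsets_visited(k):
    """Variables in the model after each sweep: each sweep toggles one variable."""
    current = set()
    visited = []
    for v in sweep_order(k):
        current ^= {v}  # bring v in if absent, take it out if present
        visited.append(sorted(current))
    return visited


print(sweep_order(4))
for subset in subsets_visited(3):
    print(subset)
```

For k = 3 the visited subsets are {1}, {1,2}, {2}, {2,3}, {1,2,3}, {1,3}, {3}: every nonempty combination appears exactly once, one sweep apart from its predecessor.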
You snipped the fact that I used Rubinstein and Fishman's books
for MY courses in Monte Carlo methods ...
> >and chapters from books by Applied Statisticians the likes of
> >Bill Kennedy (ASA Fellow 1979) and Jim Gentle (ASA Fellow 1982)
> >and other APPLIED statisticians on Monte Carlo and Simulation
> >methods.
>
> Kennedy and Gentle was very weak when it came out.
> I reluctantly used it as a text once.
Here what's "weak" and what's "strong" is not only subjective, but
context and subject matter dependent, isn't it?
I would say for statistical computing, the "Kennedy and Gentle"
book was stronger than the TOTALITY of all of the papers ever
published in the Annals of Mathematical Statistics, and I would
mean it seriously!
-- Bob.
> >> >> All of the intelligent discussion of pseudo-random numbers
> >> >> relies heavily on mathematics.
>
> >> >AND statistics, numerical analysis, and computation.
>
> >> NOT on statistics. It does rely on numerical analysis,
> >> of which computation is a part. But numerical analysis
> >> is really part of analysis.
>
> >You're entitled to YOUR opinion, as I am entitled to mine.
>
> >Again, we have to agree to disagree here.
>
> >-- Bob.
>
>
>
> --
> This address is for information only. I do not claim that these views
> are those of the Statistics Department or of Purdue University.
> Herman Rubin, Department of Statistics, Purdue University
> hrubin@xxxxxxxxxxxxxxx Phone: (765)494-6054 FAX: (765)494-0558