Re: Error on kurtosis and skewness



clemenr@xxxxxxxxxx wrote:

Jerry Dallal wrote:

Very good! You got the punchline! I told you you'd like it.


Looking at your reply, I think I sort of waffled around the topic
sufficiently that you gave me the benefit of the doubt. I personally
hate exam answers that do that, and am not inclined to give credit
excepting it is clearly deserved :-)

I expect more or less precision depending on the circumstance. I view Usenet more like a conversation. I'm not expecting the same kind of rigor one would see on an examination. A little looseness is fine with me. You got the key point--a 50% CI that *must* contain the parameter of interest!

The claim that (min(X1,X2), max(X1,X2)) is a confidence interval is
counter-intuitive to me. I'm used to thinking of confidence intervals
as being defined by numbers, rather than a function with free
variables.

Very good. Call it 'a realization of a 50% CI' that *must* contain the parameter of interest, and all of the mystery vanishes.


Actually, I think of a confidence interval as a random variable and the
things we usually call "confidence intervals" as *realizations* of
confidence intervals. (Same thing for contingency tables.)


After reading your reply, I conclude that my understanding of
confidence intervals is not as clear as it should be.

As the joke goes, "So why should you be different from anyone else?" :-) I've no doubt that someone could slip one by me under the right circumstances. I'd *hope* s/he'd have to work at it, though.


Not having any
suitable books at home I went through a number of web pages. I actually
found quite a few of them quite unsatisfying, clicking the back button

Web pages are like that. They are not unlike the way textbooks were back in the 70s. My strategy for choosing one was to start by checking a few key definitions. If they weren't done properly, I went on to something else.


I note that one page defined a confidence interval as (wording from
memory) a process that creates an interval that has a certain fixed
probability of containing the true value of the statistic [given
assumptions]. Your min/max defined method seems to fit that definition.
One reason I found this confusing is that your confidence interval is
not deterministic. But, given the definitions, there's no reason why it
should be.

Right. It's a random interval.

<snip>

But in your example you constrain the uniform distribution to be
[theta-1,theta+1], which is not an arbitrary uniform distribution,

arbitrary in the sense that theta is unknown and arbitrary


OK. I was using "arbitrary" in the sense that an "arbitrary" uniform
distribution could be any uniform distribution. With my tenuous
understanding of the terminology, I thought of [theta-1,theta+1] as
defining a restricted family of uniform distributions.

You're correct.


but
one with the range of possible random variates constrained between some
values x and x+2. So, if by chance max(X1,X2)-min(X1,X2) > 1, then
there is 100% chance that theta lies within the interval. For
max(X1,X2)-min(X1,X2) < 1, then there is less and less chance that
theta lies within the interval the smaller the interval.

Yes. And they are all 50% CIs! Because they are realizations of a process that produces intervals, half of which cover theta.


Yes, I picked up the word "process" on one of the web pages I looked at.
Even though some of them have a 100% probability of containing theta,
the process itself has a 50% probability of producing a range that
contains theta. As I found by Monte Carlo simulation (I hope more
appropriate language than "bootstrap-y").

You are correct that "bootstrap" was probably not the right word. Some would say Monte Carlo is wrong, too, unless it involves a "variance swindle". However, many use Monte Carlo as a synonym for "simulation", which is the proper term.
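
For anyone who wants to repeat the simulation, here is a minimal sketch in Python. It assumes the setup discussed in this thread: X1 and X2 drawn independently from a uniform distribution on [theta-1, theta+1]. The function name, the value theta = 3, the number of trials, and the seed are illustrative choices, not anything from the original posts.

```python
import random


def simulate_coverage(theta=3.0, n_trials=100_000, seed=0):
    """Estimate how often (min(X1,X2), max(X1,X2)) covers theta."""
    rng = random.Random(seed)
    covered = 0
    wide = wide_covered = 0      # intervals with width > 1
    narrow = narrow_covered = 0  # intervals with width <= 1
    for _ in range(n_trials):
        x1 = rng.uniform(theta - 1, theta + 1)
        x2 = rng.uniform(theta - 1, theta + 1)
        lo, hi = min(x1, x2), max(x1, x2)
        hit = lo < theta < hi
        covered += hit
        if hi - lo > 1:
            wide += 1
            wide_covered += hit
        else:
            narrow += 1
            narrow_covered += hit
    return {
        "overall": covered / n_trials,
        "wide": wide_covered / wide,
        "narrow": narrow_covered / narrow,
    }


if __name__ == "__main__":
    for label, rate in simulate_coverage().items():
        print(f"{label:8s} coverage: {rate:.3f}")
```

With enough trials, the overall coverage should come out near 0.5. Intervals wider than 1 (about a quarter of them) always cover theta, while the narrower ones cover it only about a third of the time, which is how the process still averages to 50%.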



that the punchline is
that if we have an interval of > 1 we know that theta is contained within
it. If we have an interval of < 1, there is still a probability that
theta is contained.

Therefore there must be greater than a 50%
confidence interval.

I don't understand this last sentence.


I was guessing that the story went like this. I'm guessing that you're
probably well familiar with the paradox of the missing dollar, where
there is a subtly reinforced expectation that a group of numbers should
add up to a sum ($20 when I heard it) even though there is no reason
that they should. I thought the story that you would come up with was
going to be this:

No. Had that been the case, I'd've said there was a flaw in the logic and asked you to spot it. Like the "switching envelopes" problem: There are two envelopes. One contains $x, the other $2x. You pick one and open it to see $100. The faulty logic says you should swap because the other envelope either contains $50 or $200, with an expectation of (50+200)/2 = $125. From my perspective, the flaw in the logic relates to a *proper understanding* of confidence intervals, although confidence intervals have nothing to do with unraveling the faulty decision that swapping is good. (There is no benefit, as you'd suspect.)
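
A quick simulation makes the "no benefit" claim concrete. This is only a sketch: the problem as stated does not say how the amounts are chosen, so the uniform draw of x on [1, 100] is an assumption, as are the function name, the trial count, and the seed.

```python
import random


def simulate_envelopes(n_trials=200_000, seed=0):
    """Compare the average payoff of always keeping vs. always swapping."""
    rng = random.Random(seed)
    keep_total = 0.0
    swap_total = 0.0
    for _ in range(n_trials):
        x = rng.uniform(1, 100)   # assumed distribution for the smaller amount
        envelopes = [x, 2 * x]
        pick = rng.randrange(2)   # choose one envelope at random
        keep_total += envelopes[pick]
        swap_total += envelopes[1 - pick]
    return keep_total / n_trials, swap_total / n_trials


if __name__ == "__main__":
    keep_avg, swap_avg = simulate_envelopes()
    print(f"always keep: {keep_avg:.2f}")
    print(f"always swap: {swap_avg:.2f}")
```

Both averages should land near 1.5 times the mean of x, so a blanket policy of swapping gains nothing over keeping.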


I have trouble selling snake oil. I usually end up telling these problems in a way that makes their solution obvious. However, since we don't know each other, I made it a point to say that the exercise was *not* to spot such a flaw.

Yes, I think this might be starting to penetrate my head. As I
mentioned on another thread, I plan to read a first year robust stats
book cover to cover sometime during the summer that hopefully will give
me a stronger background.

Thanks very much,

Cheers,

Ross-c


I hate to deflect someone from what looks like a profitable endeavor and, unfortunately, Reef Fish is gone for a few days so he can't comment on whether it might be too soon for you, but, given the kinds of questions you like to ask, you might *really* enjoy looking at comparative statistical inference--for example, comparing the frequentist and Bayesian approaches to statistics (and decision-theoretic, as Professor Rubin might add, which still leaves some schools unmentioned).


Where the frequentist has CIs and is unable to attach probability statements to them, the Bayesian has "credible intervals" and is *allowed* to say things like "the probability that the mean lies between 2.6 and 7.3 is 95%".
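
To make the contrast concrete for the uniform example earlier in this thread, here is a rough sketch of the Bayesian calculation. It assumes a flat prior on theta, which is an assumption made here purely for illustration and not something stated in the thread. Under that prior, the posterior for theta given X1 and X2 is uniform on [max-1, min+1], so the posterior probability that theta lies in (min, max) depends on the observed width. The function name is hypothetical.

```python
def posterior_prob_theta_in_interval(x1, x2):
    """Posterior P(min < theta < max | x1, x2) for X ~ U(theta-1, theta+1).

    Assumes a flat prior on theta (an illustrative assumption).
    The posterior is then uniform on [max-1, min+1], of length 2 - width.
    """
    lo, hi = min(x1, x2), max(x1, x2)
    width = hi - lo
    if width >= 1:
        # The whole posterior support sits inside (lo, hi).
        return 1.0
    # (lo, hi) sits inside the posterior support, so take the length ratio.
    return width / (2 - width)


if __name__ == "__main__":
    print(posterior_prob_theta_in_interval(2.9, 3.1))  # narrow interval
    print(posterior_prob_theta_in_interval(2.0, 3.5))  # wide interval
```

Here the Bayesian statement changes with the data actually observed, whereas the frequentist "50%" describes the process that generates the intervals.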