Re: Error on kurtosis and skewness
- From: Jerry Dallal <gdallal@xxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Thu, 16 Jun 2005 08:29:11 -0300
clemenr@xxxxxxxxxx wrote:
Jerry Dallal wrote:
Very good! You got the punchline! I told you you'd like it.
Looking at your reply, I think I sort of waffled around the topic sufficiently that you gave me the benefit of the doubt. I personally hate exam answers that do that, and am not inclined to give credit unless it is clearly deserved :-)
I expect more or less precision depending on the circumstance. I view Usenet more like a conversation. I'm not expecting the same kind of rigor one would see on an examination. A little looseness is fine with me. You got the key point--a 50% CI that *must* contain the parameter of interest!
The claim that (min(X1,X2), max(X1,X2)) is a confidence interval is counter-intuitive to me. I'm used to thinking of confidence intervals as being defined by numbers, rather than a function with free variables.
Very good. Call it 'a realization of a 50% CI' that *must* contain the parameter of interest, and all of the mystery vanishes.
Actually, I think of a confidence interval as a random variable and the things we usually call "confidence intervals" as *realizations* of confidence intervals. (Same thing for contingency tables.)
After reading your reply, I conclude that my understanding of confidence intervals is not as clear as it should be.
As the joke goes, "So why should you be different from anyone else?" :-) I've no doubt that someone could slip one by me under the right circumstances. I'd *hope* s/he'd have to work at it, though.
Not having any suitable books at home I went through a number of web pages. I actually found quite a few of them quite unsatisfying, clicking the back button
Web pages are like that. They are not unlike the way textbooks were back in the 70s. My strategy for choosing one was to start by checking a few key definitions. If they weren't done properly, I went on to something else.
I note that one page defined a confidence interval as (wording from memory) a process that creates an interval that has a certain fixed probability of containing the true value of the statistic [given assumptions]. Your min/max defined method seems to fit that definition. One reason I found this confusing is that your confidence interval is not deterministic. But, given the definitions, there's no reason why it should be.
Right. It's a random interval.
<snip>
But in your example you constrain the uniform distribution to be [theta-1,theta+1], which is not an arbitrary uniform distribution,
arbitrary in the sense that theta is unknown and arbitrary
OK. I was using "arbitrary" in the sense that an "arbitrary" uniform distribution could be any uniform distribution. With my tenuous understanding of the terminology, I thought of [theta-1,theta+1] as defining a restricted family of uniform distributions.
You're correct.
but one with the range of possible random variates constrained between some values x and x+2. So, if by chance max(X1,X2)-min(X1,X2) > 1, then there is 100% chance that theta lies within the interval. For max(X1,X2)-min(X1,X2) < 1, the smaller the interval, the smaller the chance that theta lies within it.
Yes. And they are all 50% CIs! Because they are realizations of a process that produces intervals, half of which cover theta.
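For anyone who wants the arithmetic behind "half of which cover theta", here is a short worked version. It assumes, as in the example above, that X1 and X2 are independent draws from the uniform distribution on [theta-1, theta+1]; the symbol w for the observed width max(X1,X2)-min(X1,X2) is my own shorthand, not something used earlier in the thread.

```latex
% Unconditional coverage: the interval covers theta exactly when
% one observation falls below theta and the other falls above it.
\begin{align*}
P\bigl(\min(X_1,X_2) < \theta < \max(X_1,X_2)\bigr)
  &= P(X_1 < \theta < X_2) + P(X_2 < \theta < X_1) \\
  &= \tfrac12\cdot\tfrac12 + \tfrac12\cdot\tfrac12 = \tfrac12 .
\end{align*}

% Coverage given the observed width w:
\[
P(\text{cover} \mid w) =
\begin{cases}
  \dfrac{w}{2-w}, & 0 < w < 1,\\[1ex]
  1,              & 1 \le w \le 2 .
\end{cases}
\]

% Consistency check: w has density (2-w)/2 on [0,2], so
\[
\int_0^1 \frac{w}{2-w}\cdot\frac{2-w}{2}\,dw
  + \int_1^2 \frac{2-w}{2}\,dw
  = \tfrac14 + \tfrac14 = \tfrac12 .
\]
```

So the conditional chance of coverage shrinks with the width and reaches 100% once the width exceeds 1, yet the procedure as a whole still covers theta exactly half the time.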
Yes, I picked up the word "process" on one of the web pages I looked at. Even though some of them have a 100% probability of containing theta, the process itself has a 50% probability of producing a range that contains theta, as I found by Monte Carlo simulation (I hope more appropriate language than "bootstrap-y").
You are correct that "bootstrap" was probably not the right word. Some would say Monte Carlo is wrong, too, unless it involves a "variance swindle". However, many use Monte Carlo as a synonym for "simulation", which is the proper term.
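The kind of simulation being discussed can be sketched roughly as follows. This is only an illustration, not the code that was actually run: the function name simulate_coverage, the value theta=3.7, the number of trials, and the seed are all arbitrary choices of mine.

```python
import random


def simulate_coverage(theta=3.7, trials=200_000, seed=0):
    """Estimate how often (min(X1,X2), max(X1,X2)) covers theta.

    X1 and X2 are independent Uniform(theta-1, theta+1) draws.
    theta, trials and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    covered = 0                     # intervals that contain theta
    wide = wide_covered = 0         # intervals with width > 1
    narrow = narrow_covered = 0     # intervals with width <= 1

    for _ in range(trials):
        x1 = rng.uniform(theta - 1, theta + 1)
        x2 = rng.uniform(theta - 1, theta + 1)
        lo, hi = min(x1, x2), max(x1, x2)
        hit = lo < theta < hi
        covered += hit
        if hi - lo > 1:
            wide += 1
            wide_covered += hit
        else:
            narrow += 1
            narrow_covered += hit

    return {
        "overall": covered / trials,
        "wide": wide_covered / wide,
        "narrow": narrow_covered / narrow,
    }


if __name__ == "__main__":
    for label, rate in simulate_coverage().items():
        print(f"{label:8s} coverage: {rate:.3f}")
```

Under these assumptions the overall rate should come out near 0.5, the rate among intervals wider than 1 should be 1, and the rate among the narrower intervals should be near 1/3.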
that the punch line is that if we have an interval of width > 1, we know that theta is contained within it. If we have an interval of width < 1, there is still a probability that theta is contained.
Therefore there must be greater than a 50% confidence interval.
I don't understand this last sentence.
I was guessing that the story went like this. I'm guessing that you're probably well familiar with the paradox of the missing dollar, where there is a subtly reinforced expectation that a group of numbers should add up to a sum ($20 when I heard it) even though there is no reason that they should. I thought the story that you would come up with was going to be this:
No. Had that been the case, I'd've said there was a flaw in the logic and asked you to spot it. Like the "switching envelopes" problem: There are two envelopes. One contains $x, the other $2x. You pick one and open it to see $100. The faulty logic says you should swap because the other envelope either contains $50 or $200, with an expectation of (50+200)/2 = $125. From my perspective, the flaw in the logic relates to a *proper understanding* of confidence intervals, although confidence intervals have nothing to do with unraveling the faulty decision that swapping is good. (There is no benefit, as you'd suspect.)
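The "no benefit" claim for the envelopes can be checked with a quick simulation. This is a sketch under assumptions that are mine, not part of the puzzle as stated: x is drawn from a small hypothetical set of amounts, and the function name envelope_payoffs, the trial count, and the seed are illustrative.

```python
import random


def envelope_payoffs(trials=200_000, seed=1):
    """Compare 'always keep' with 'always swap' in the two-envelope game.

    The distribution of x (uniform over a few fixed amounts) is a
    hypothetical choice; the puzzle itself does not specify one.
    """
    rng = random.Random(seed)
    keep_total = 0.0
    swap_total = 0.0

    for _ in range(trials):
        x = rng.choice([25, 50, 100, 200])   # assumed possible values of x
        envelopes = [x, 2 * x]               # one holds $x, the other $2x
        pick = rng.randrange(2)              # choose an envelope at random
        keep_total += envelopes[pick]        # payoff if you keep your pick
        swap_total += envelopes[1 - pick]    # payoff if you swap

    return keep_total / trials, swap_total / trials


if __name__ == "__main__":
    keep, swap = envelope_payoffs()
    print(f"average when keeping:  {keep:.2f}")
    print(f"average when swapping: {swap:.2f}")
```

With these assumed amounts both averages should settle near 1.5 times the mean of x (about 140.6), which is the sense in which swapping gains nothing.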
I have trouble selling snake oil. I usually end up telling these problems in a way that makes their solution obvious. However, since we don't know each other, I made it a point to say that the exercise was *not* to spot such a flaw.
Yes, I think this might be starting to penetrate my head. As I mentioned on another thread, I plan to read a first year robust stats book cover to cover sometime during the summer that hopefully will give me a stronger background.
Thanks very much,
Cheers,
Ross-c
I hate to deflect someone from what looks like a profitable endeavor and, unfortunately, Reef Fish is gone for a few days so he can't comment on whether it might be too soon for you, but, given the kinds of questions you like to ask, you might *really* enjoy looking at comparative statistical inference--for example, comparing the frequentist and Bayesian approaches (and, decision theoretic, as Professor Rubin might add, which still leaves schools unmentioned) to statistics.
Where the frequentist has CIs and is unable to attach probability statements to them, the Bayesian has "credible intervals" and is *allowed* to say things like "the probability that the mean lies between 2.6 and 7.3 is 95%".