Re: Distribution of a vowel on the page
- From: David Winsemius <doe_snot@xxxxxxxxxxx>
- Date: Mon, 22 Oct 2007 20:28:48 -0500
Richard Ulrich <Rich.Ulrich@xxxxxxxxxxx> wrote in
news:9ahph3t1qek63cvlokenoro8g9sfgagb5d@xxxxxxx:
On Sun, 21 Oct 2007 20:36:40 -0500, David Winsemius
<doe_snot@xxxxxxxxxxx> wrote:
It is true that the Poisson is a good approximation to the binomial
when the rate parameter is small, but the Poisson distribution may
also fit well even when the rate parameter is not small. The
question is really only answerable by reference to the data.
I get it, if you are talking about the Poisson rate parameter
by itself, since that can be an arbitrary counter.
- Can you show me where a *binomial* rate parameter
is near or above 50% and results in Poisson appearance?
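One way to make the question above concrete is to compute how far Binomial(n, p) sits from Poisson(n*p) as p grows. The sketch below is only an illustration: n = 72 is borrowed from the line length mentioned later in this post, the p values are arbitrary, and total variation distance is one possible yardstick, not something anyone in the thread proposed.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), computed in log form to avoid overflow."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def total_variation(n, p):
    """Total variation distance between Binomial(n, p) and Poisson(n * p)."""
    lam = n * p
    # Differences on 0..n, where both distributions have mass.
    diff = sum(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1)
    )
    # The Poisson also puts mass above n, where the binomial has none.
    tail = max(0.0, 1.0 - sum(poisson_pmf(k, lam) for k in range(n + 1)))
    return 0.5 * (diff + tail)


if __name__ == "__main__":
    n = 72  # letters per line, taken from the post; an assumption here
    for p in (0.02, 0.3, 0.5):  # illustrative rates only
        print(f"p = {p:4.2f}  TV distance = {total_variation(n, p):.4f}")
```

The distance grows with p because the binomial variance n*p*(1-p) falls further below the Poisson variance n*p. That describes the two theoretical distributions only; it says nothing about which one fits a particular data set better.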
The question is not whether the binomial distribution is the same as the
Poisson. It's not. The question is which one offers a better fit to the
data, .... data we have not yet been shown. I can make arguments that it
"should" be a mixture distribution of a multinomial or binomial drawn
from a distribution of letter counts per line that would have a broad
"integer-ramp" that describes the last lines of paragraphs along with a
higher, narrow peak around 72 that describes the number of letters in the
non-last lines of the paragraphs.
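The mixture argument above can be sketched as a small simulation. Everything numeric except the peak near 72 is an assumption made for illustration: the share of last lines (P_LAST), the vowel probability per letter (P_VOWEL), the width of the peak, and the uniform "ramp" for last-line lengths.

```python
import random
import statistics

FULL_LINE = 72   # peak line length, from the post
P_LAST = 0.2     # assumed share of lines that end a paragraph
P_VOWEL = 0.3    # assumed probability that a letter is a vowel


def line_length(rng):
    """Draw a letter count: a uniform ramp for last lines, a narrow peak otherwise."""
    if rng.random() < P_LAST:
        return rng.randint(1, FULL_LINE)
    return rng.randint(FULL_LINE - 2, FULL_LINE + 2)


def vowel_count(rng, length):
    """Binomial draw: each of `length` letters is a vowel with probability P_VOWEL."""
    return sum(rng.random() < P_VOWEL for _ in range(length))


def simulate(n_lines, seed=0):
    """Return simulated vowel counts per line under the mixture model."""
    rng = random.Random(seed)
    return [vowel_count(rng, line_length(rng)) for _ in range(n_lines)]


if __name__ == "__main__":
    counts = simulate(5000)
    mean = statistics.mean(counts)
    var = statistics.variance(counts)
    print(f"mean = {mean:.2f}, variance = {var:.2f}, ratio = {var / mean:.2f}")
```

Under these assumed settings the short last lines add a left shoulder to the count distribution and inflate its variance relative to a single binomial. Different choices of P_LAST or of the ramp shape would change the picture.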
But consonants and vowels are structured by words, and
words seldom start or end with more than 3 (say) consonants
or vowels. What you observe will have almost no instances
of more than 6 -- of *either* consonants or vowels. And
one of those has p > 0.5, so the event should not be enormously
rare.
The average number of vowels per line must be around 20-25, so your
statement about "almost no instances of more than six" does not make
much sense to me. The next sentence makes even less sense. I am
guessing that
Ah. I needed to be more clear. I was going on about the
*dependency*. If the occurrences are independent,
then there will be no "correlation" between *consecutive*
occurrences.
Clearly, vowels and consonants in words violate that assumption.
What assumption? We are not counting the number of vowels per word, or the
sequences of letters, or repetitions.
If there were a positive correlation (consecutive repetitions),
the distribution, if otherwise Poisson, would be over-dispersed
(variance too large for the mean). Since there is a negative
correlation, it will be under-dispersed.
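The dispersion claim above can be checked with a toy model. The sketch below treats the letters on a line as a two-state Markov chain (vowel or consonant) with lag-one correlation rho, then compares the variance-to-mean ratio of the per-line vowel counts. The chain, the values of p and rho, and the 72-letter line are all assumptions for illustration. Note that even the independent case (rho = 0) is binomial, so its ratio is about 1 - p and not 1.

```python
import random
import statistics


def vowels_in_line(rng, n_letters, p, rho):
    """Count vowels in one line of a two-state Markov chain.

    p is the long-run vowel probability and rho the lag-one correlation
    between consecutive letters (negative means letters tend to alternate).
    """
    a = p + rho * (1 - p)  # P(vowel | previous letter was a vowel)
    b = p * (1 - rho)      # P(vowel | previous letter was a consonant)
    is_vowel = rng.random() < p  # start from the stationary distribution
    count = int(is_vowel)
    for _ in range(n_letters - 1):
        is_vowel = rng.random() < (a if is_vowel else b)
        count += is_vowel
    return count


def dispersion_index(p, rho, n_lines=2000, n_letters=72, seed=0):
    """Variance-to-mean ratio of simulated per-line vowel counts."""
    rng = random.Random(seed)
    counts = [vowels_in_line(rng, n_letters, p, rho) for _ in range(n_lines)]
    return statistics.variance(counts) / statistics.mean(counts)


if __name__ == "__main__":
    for label, rho in (("alternating", -0.3), ("independent", 0.0), ("clustered", 0.5)):
        print(f"{label:12s} rho = {rho:+.1f}  variance/mean = {dispersion_index(0.35, rho):.2f}")
```

In this toy model negative correlation pushes the ratio below the independent value and positive correlation pushes it above 1. Whether real text behaves this way at the level of whole lines is exactly what the data would have to show.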
Why should the vowel count of one line be dependent on the vowel count in
a prior line?
It occurs to me that if there is a positive correlation of having
words with many-versus-few vowels, then the between-word r
might tend to offset the negative correlation within words.
I don't know how that would work out.
But the OP already stated that the empirical distribution
he obtained did not look Poisson.
But he gave no data or summary statistics, nor did he say in what way it
departed from what he thought was "Poissonian". If it had a negative
skewness (say the left-sided shoulder due to the last line of
paragraphs), that would be decidedly non-Poissonian.
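The skewness point can be made concrete: a Poisson distribution with mean lambda has skewness 1/sqrt(lambda), which is always positive, so a clearly negative sample skewness argues against it. The sketch below computes a simple moment-based sample skewness. The two lists of counts are made-up toy numbers, not data from this thread.

```python
import math
import statistics


def sample_skewness(xs):
    """Moment-based skewness: mean cubed deviation over the cubed population SD."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s**3)


def poisson_skewness(lam):
    """Theoretical skewness of a Poisson distribution with mean lam."""
    return 1 / math.sqrt(lam)


if __name__ == "__main__":
    # Hypothetical per-line vowel counts; the low values mimic short last lines.
    left_shoulder = [22, 21, 23, 20, 24, 22, 6, 21, 23, 9]
    right_tail = [3, 2, 4, 3, 2, 3, 11, 2, 4, 9]
    print("left shoulder skewness:", round(sample_skewness(left_shoulder), 2))
    print("right tail skewness:   ", round(sample_skewness(right_tail), 2))
    print("Poisson(22) skewness:  ", round(poisson_skewness(22), 2))
```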
--
David Winsemius