Re: help with simple probabilities



I'm no expert statistician, and can't solve the problem either. I
strongly suspect that there's a distribution for this sort of problem
that I either don't know or haven't recognised.

But I think the following reasoning is heading in the direction of an
answer, though of course I could easily be wrong.

Let's say that we are going to choose K letters, each picked uniformly
and independently from the 26 letters of the alphabet. The chance of
getting all 'a's is (1/26)^K. Since there are (26 choose 1) = 26
letters the sequence could consist of, the probability of choosing a
sequence of K letters all the same is:

(26 choose 1) * (1/26)^K
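
The one letter formula is easy to check by brute force on a small
alphabet. Here is a minimal sketch in Python; the function names and
the 4 letter test alphabet are my own assumptions, not part of the
original problem. It uses exact fractions, so there is no rounding.

```python
from fractions import Fraction
from itertools import product
from math import comb


def p_all_same(k, alphabet_size=26):
    """Formula from the post: (A choose 1) * (1/A)^K, as an exact fraction."""
    return comb(alphabet_size, 1) * Fraction(1, alphabet_size) ** k


def p_all_same_brute(k, alphabet_size):
    """Enumerate every length-k sequence and count those using one letter."""
    total = 0
    hits = 0
    for seq in product(range(alphabet_size), repeat=k):
        total += 1
        if len(set(seq)) == 1:
            hits += 1
    return Fraction(hits, total)


if __name__ == "__main__":
    # Small alphabet so that full enumeration stays cheap (4^3 = 64 sequences).
    print(p_all_same(3, 4), p_all_same_brute(3, 4))
```

Enumeration is only practical for small alphabets and short
sequences, but agreement there is decent evidence for the 26 letter
case.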

Now I want to calculate the probability of a sequence of length K which
has *exactly* two distinct letters. If the two letters were 'a' and
'b', then the probability of a sequence that has only 'a' and 'b' in
any combination (including either only 'a's or only 'b's) is (2/26)^K.
Two of those 2^K sequences use a single letter, so the probability of
a sequence that uses both 'a' and 'b' is (2^K - 2) / 26^K. Since there
are (26 choose 2) ways of choosing two letters from the alphabet, the
probability of getting a sequence that uses exactly two letters is:

(26 choose 2) * (2^K - 2) / 26^K

The subtraction has to happen per pair, before multiplying by
(26 choose 2). The product (26 choose 2) * (2/26)^K is not the
probability of at most two letters, because it counts every one letter
sequence 25 times, once for each pair containing its letter. (For K=1
it comes to 25, which can't be a probability.)

This should generalise for any n <= 26. For a fixed set of n letters,
inclusion-exclusion gives the number of length-K sequences that use
all n of them:

sum over j = 0..n of (-1)^j * (n choose j) * (n-j)^K

Multiplying by (26 choose n) and dividing by 26^K gives the
probability of a sequence with exactly n distinct letters. For n=1
this reduces to the formula above.
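
A brute-force count on a small alphabet is one way to test any
candidate formula for the n letter case. The sketch below compares
such a count against the standard inclusion-exclusion expression
(26 choose n) * [sum over j of (-1)^j * (n choose j) * (n-j)^K] / 26^K.
The function names and the small test alphabet are my own
assumptions, and letters are assumed uniform and independent.

```python
from fractions import Fraction
from itertools import product
from math import comb


def p_exactly(n, k, alphabet_size=26):
    """Probability that k uniform, independent letters use exactly n distinct ones."""
    # Inclusion-exclusion: number of length-k sequences over a fixed set of
    # n letters in which every one of the n letters appears at least once.
    uses_all = sum((-1) ** j * comb(n, j) * (n - j) ** k for j in range(n + 1))
    # Choose which n letters appear, then divide by all alphabet_size^k sequences.
    return Fraction(comb(alphabet_size, n) * uses_all, alphabet_size ** k)


def p_exactly_brute(n, k, alphabet_size):
    """Same probability by enumerating every sequence (small cases only)."""
    hits = sum(
        1
        for seq in product(range(alphabet_size), repeat=k)
        if len(set(seq)) == n
    )
    return Fraction(hits, alphabet_size ** k)


if __name__ == "__main__":
    # Compare formula and enumeration on a 4 letter alphabet, K = 5.
    for n in range(1, 5):
        print(n, p_exactly(n, 5, 4), p_exactly_brute(n, 5, 4))
    # The probabilities over all possible n should add up to 1.
    print(sum(p_exactly(n, 5) for n in range(1, 27)))
```

Summing over n = 1..26 is a useful sanity check: every sequence has
some number of distinct letters, so the total has to be exactly 1.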

I may well have slipped up somewhere, but at least I had fun
thinking about it.

Cheers,

Ross-c
