Re: help with simple probabilities



clemenr@xxxxxxxxxx writes:

> I'm no expert statistician, and can't solve the problem either. I
> strongly suspect that there's a distribution for this sort of problem
> that I either don't know or haven't recognised.
>
> But, I think the following reasoning might possibly be heading in the
> direction of an answer. Though of course I could easily be wrong.
>
> Let's say that we are going to choose K letters. The chance of getting
> all 'a's is 1/26 to the power of K. Since there are 26 letters to
> choose from, there are (26 choose 1) = 26 possible single letter
> sequences. Hence the probability of choosing a sequence of K letters
> all the same is:
>
> (26 choose 1) * (1/26)^K
>
> Now I want to calculate the probability of a sequence of length K which
> has *at most* two distinct letters. If the two letters were 'a' and
> 'b', then the probability of a sequence that has only 'a' and 'b' in
> any combination (including either only 'a's or only 'b's) is (2/26)^K.
> Since there are (26 choose 2) ways of choosing two letters from the
> alphabet, the probability of getting a sequence that uses at most two
> letters is:
>
> (26 choose 2) * (2/26)^K

This isn't quite right. Let K=2, and this formula gives a probability of
25/13. You're including the one letter sequences more than once.

Here are the number of n letter sequences of length K. Divide by 26^K to
get the probability:

n=0: (26 choose 0) * (1*0^K)

(I'm using the definition 0^0=1. The only way you get a 0 letter sequence
is if it has length 0.)

n=1: (26 choose 1) * (1*1^K - 1*0^K)

n=2: (26 choose 2) * (1*2^K - 2*1^K + 1*0^K)

n=3: (26 choose 3) * (1*3^K - 3*2^K + 3*1^K - 1*0^K)

n=4: (26 choose 4) * (1*4^K - 4*3^K + 6*2^K - 4*1^K + 1*0^K)

general n:

(26 choose n) * Sum[j=0,n; (-1)^(n-j) (n choose j) j^K]

Scott
--
Scott Hemphill hemphill@xxxxxxxxxxxxxxxxxx
"This isn't flying. This is falling, with style." -- Buzz Lightyear
.



Relevant Pages

  • Re: probability of sequences and seating
    ... sequence starts with 'ABC'. ... I want to compute the probability that the ... sequence continues for n more letters after the initial 'ABC' before ...
    (sci.stat.math)
  • Re: Grammar States and Tertiary Phonemes
    ... can be 'described' by a sequence of ... decimal base numbering system has 10 letters in its 'alphabet': ... Then, we derive a *grammar*, which is effectively a set of both ...
    (soc.religion.islam)
  • Re: Lennys Counter Argument
    ... "Not beyond very low levels of functional complexity this is by no ... "It is very much like adding letters to growing English-language ... sequence, like I or It, and then add another letter and have it be ... can fairly quickly produce a sequence of over a dozen characters. ...
    (talk.origins)
  • Re: Machines and people
    ... >Boolean Algebra lead to letters such as a,b,c,..etc. ... This standard specifies that 8 bits arranged in a certain sequence stand ... for the character 'A'; another sequence stands for the character 'a'. ... The basic answer to your question is that the bits get to be English ...
    (Debian-User)
  • Re: probability of sequences and seating
    ... sequence starts with 'ABC'. ... I want to compute the probability that the ... sequence continues for n more letters after the initial 'ABC' before ...
    (sci.stat.math)