Expected number of doublets in a sequence



Hi,
Suppose I have a sequence of length N over 4 symbols: a, b, c, d.
Suppose a, b, c, d each occur with probability 1/4 and that each site is
independent.
I want to compute the expected number of doublets ab.
This should simply be the number of doublets * the probability that a
doublet is ab.
The latter probability is (1/4)*(1/4) = 1/16 (due to independence);
however,
I am slightly unsure about the number of doublets: simply chopping up the
sequence into doublets gives N/2 (suppose the sequence length is even).
However, I could start counting from index 2 instead of 1, which gives
me another N/2 - 1 doublets (for example, suppose I had abcdad; then I
have (ab, cd, ad), but
if I start from the 2nd index, I have (bc, da)).
However, there is clearly a constraint: if I have ab at some position, then the
shifted frame cannot have ab again at the overlapping
position (i.e. suppose I had
XYZ; if XY
is "ab", then I am sure that "YZ" cannot be "ab", since Y is b (I would read YZ in
the second frame)),
assuming that the doublet is formed from two different symbols (i.e. not
aa, bb, ...).

I am confused about this, so if someone can help me understand it, I
would be grateful.
To repeat my question: I want to know the expected number of occurrences of a given
doublet (supposing that the doublet is not two copies of the same symbol).
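
For small N the counting can be checked by brute force. The following is only a minimal sketch, assuming that doublets are counted at every one of the N-1 overlapping positions (i.e. both reading frames together); the function name expected_doublets and its parameters are made up for illustration. It enumerates all 4^N equally likely sequences and averages the number of times the doublet appears.

```python
from fractions import Fraction
from itertools import product


def expected_doublets(n, doublet="ab", alphabet="abcd"):
    """Exact expected count of `doublet` over all sequences of length n.

    Assumption: every sequence over `alphabet` is equally likely, and a
    doublet is counted at each of the n - 1 overlapping positions.
    """
    total = 0   # occurrences of the doublet summed over all sequences
    count = 0   # number of sequences enumerated
    for symbols in product(alphabet, repeat=n):
        seq = "".join(symbols)
        total += sum(1 for i in range(n - 1) if seq[i:i + 2] == doublet)
        count += 1
    return Fraction(total, count)
```

For N = 6, expected_doublets(6) gives 5/16, which equals (N-1)/16. This agrees with linearity of expectation: each of the N-1 positions contributes 1/16 on its own, and the fact that ab cannot occur at two overlapping positions affects the spread of the count but not its mean.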

thanks
les



