Can anyone help me calculate a statistical probability?



Here is the question. It concerns a claim of plagiarism. There are
two indexes to similar texts of about 750,000 words. The first
index has 27,740 terms in it, while the second index has 3,500 terms
in it. The authors of the first index claim that the authors of the
second plagiarized their index, but it turns out the indexes are mostly
different, and only a few terms are the same. Can anyone calculate what
the random similarity would be? That is, if we assume that there was no
plagiarism and that index 1 (27,740 terms) and index 2 (3,500 terms) were
derived independently, what is the probability that some of the
terms would still be identical, given that the texts to which the
indexes refer are 80%-90% similar?
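
To make the question concrete, here is a rough sketch of one way the
calculation could be set up. It assumes both indexes pick their terms
uniformly at random from one shared pool of candidate terms, so the number
of shared terms follows a hypergeometric distribution. The pool size
(POOL) is purely hypothetical and is not known from the facts above; the
function names are made up for illustration.

```python
import math

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def expected_overlap(pool, size1, size2):
    """Mean number of shared terms under the hypergeometric model."""
    return size1 * size2 / pool

def prob_at_least_one(pool, size1, size2):
    """Probability that the two indexes share at least one term."""
    if size2 > pool - size1:
        # Index 2 cannot fit entirely outside index 1, so overlap is certain.
        return 1.0
    # P(no overlap) = C(pool - size1, size2) / C(pool, size2), done in logs.
    log_p_none = log_comb(pool - size1, size2) - log_comb(pool, size2)
    return 1.0 - math.exp(log_p_none)

POOL = 100_000    # ASSUMPTION: hypothetical number of candidate index terms
INDEX1 = 27_740   # terms in the first index (from the question)
INDEX2 = 3_500    # terms in the second index (from the question)

if __name__ == "__main__":
    print(f"expected shared terms: {expected_overlap(POOL, INDEX1, INDEX2):.1f}")
    print(f"P(at least one shared): {prob_at_least_one(POOL, INDEX1, INDEX2):.6f}")
```

With that hypothetical pool of 100,000 candidate terms, the expected
overlap would be about 971 terms, and some overlap would be practically
certain. Real indexers do not choose terms uniformly at random (both would
favor the important names and topics), so this model probably understates
the overlap that independent work would produce.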


