Re: Hash function, Birthday paradox and probabilities.



fabrice.gautier@xxxxxxxxx wrote:
....
Lets have:
- a hash function H() that take as input a bit string of length L and
return an hash of length N bits (ie an integer in the range [0, 2^N[)
- a iterator function F, that takes as input a bit string of length L
and return another bit string of length L,
- S[0], S[1],... S[n], bit strings of length L, so that
S[i+1]=F(S[i]),
- h[0], ...h[n], so that h[i]=H(S[i])
- M a fixed hash

Given all that lets define m, the smallest positive integer so that
h[m]=M.
....
Now the question is what is the best function a() to minize m (in
average), and how to minimize the "spread" of m around its average.
(There must be a better word than "spread" but I dont remember)
[Variance]
I was thinking that, wether I use random or increments, the average of
m should be 2^N, and my experiences seem to agree with that.
....

If the hash function does a really good job of scattering
its L-bit inputs randomly and uniformly across an N-bit
range, and if the S[i] ordering is basically random, then
m should average no more than n/2 [with n = 2^N] to a first
collision, but around n between collisions as you go on.
If you lack intimate knowledge of hash function H, you
probably cannot make the average smaller by changing the
presentation order of the S[i], unless H is a bad hash
function. (Note, Bruce Schneier writes, "Honestly,
we in the cryptography community know very little about
hash functions", 1/4 of the way along in
http://www.schneier.com/blog/archives/2005/03/more_hash_funct.html
but perhaps he's exaggerating.)

Have you studied the "multicollision attack" work of Joux
and of Wang et al? Eg, see refs at end of Dan Kaminsky's
paper "MD5 To Be Considered Harmful Someday", at eg
http://www.doxpara.com/md5_someday.pdf .
-jiw
.



Relevant Pages

  • Re: type of hash functions
    ... the other hash function. ... but rather reduce the probability that you *need* to as much ... the mere process of storing and comparing hashes not worth the effort. ... pointer to the string) first can help (since many libraries omit this ...
    (comp.programming)
  • Re: type of hash functions
    ... the other hash function. ... as long as the input string. ... exists, then for any string, the pair of keys generated (by the pair ... the original input string, even if this pair of functions does exist, ...
    (comp.programming)
  • Re: Real hash functions in Java
    ... java distribution is not of such kind) ... A perfect hash function is one that maps each ... sure it is compatible with the equals method) Other classes such as String, ... etc will have meaningful hashCode functions. ...
    (comp.lang.java.programmer)
  • Re: Hash Function problem
    ... original string is read. ... The hash function is really done as an initializing step, ... Or else "normal" strlens don't take O, ...
    (comp.programming)
  • Re: Hash Function problem
    ... original string is read. ... The hash function is really done as an initializing step, ... fgets is being hidden from you. ... based updates with implicit scanning via length availability anyways? ...
    (comp.programming)