Siegfried Bublitz
unread,May 8, 2023, 1:29:23 PM5/8/23Sign in to reply to author
Sign in to forward
You do not have permission to delete messages in this group
Sign in to report message
Either email addresses are anonymous for this group or you need the view member email addresses permission to view the original message
to tlaplus
Hi All,
I have some difficulty understanding the Collision Probability of TLC.
Let me explain my understanding.
The generated hash-code of a state is a 64 Bit number, i.e. there are n=2^64 different hash-codes possible.
Imagine a spec consists of k different states, e.g. numbered from 1 to k.
The hash-code is just a mapping from 1..k to 1..n. There are n^k such mappings.
We have no collision if the mapping is injective, i.e. different states are mapped to different hash-codes.
There are n!/(n-k)! such injective mappings.
Thus, for a random choice having no collision the probability is
n!/((n-k)!*n^k) = n/n * (n-1)/n * ...* (n-k+1)/n = 1 * (1-1/n) * ...* (1-(k-1)/n)
To compare this with a constant probability c we take the natural logarithm of the equality yielding
ln(1) + ln(1-1/n) + ...+ ln(1-(k-1)/n) = ln(c)
The left sum can be approximated by the tangent of ln(x) at 1, which is x-1. This gives an arithmetic sequence:
0 -1/n - ... -(k-1)/n = - 1/n * (1 + 2 ... + (k-1)) = k*(1-k)/(2*n) <= ln(c)
The last unequality follows from ln being convex to below, thus the tangent is above the graph.
If we assume equality, we have a quadratic equation in k with the positive solution
k1 = 1/2 + sqrt(1/4 - 2*n*ln(c)).
We can use perl to compute k for a collision probability of about 1% (thus c=99/100):
perl -le 'print 0.5 + sqrt(.25 - 2**65 *log(99/100))'
This yields
608_926_881.733485
This is just above 600 Millions.
Is this sufficient for large specs?
Is there a flaw in my argumentation?
Best,
Siegfried