Colm O'Flaherty
Jun 6, 2012, 1:04:41 PM
to zaproxy...@googlegroups.com
Hi all,
I've been thinking about how we analyse session id entropy/predictability, and on the train this morning I did a little analysis of 10,000 PHP session ids, generated using tokengen against a PHP-based web app.
This showed that the PHP session id has exactly 128 bits of entropy: 5 bits in each of the ASCII characters from position 1 to position 25 of the session id (25 x 5 = 125 bits), and 3 bits of entropy in position 26. It gave me better insight into some of the results in tokengen's "Analyse Tokens" screen (in particular the "Character Uniformity" fail on character 26). It also raises the question of whether we should be analysing the session ids from a "dense" binary standpoint as well, by excluding from the analysis any bit position that does not demonstrate any variance. Using this method in this case, the "dense" view of a session id would effectively be a 128 bit number, which sounds a lot more manageable for analysis than a 26 character string (208 bits).
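To make the per-position figures concrete, here is a rough Python sketch of how they could be measured from a sample of tokens. It is not what tokengen does internally; the function names are made up for illustration, and it assumes all tokens have the same length. It reports both the capacity of each position (log2 of the number of distinct characters seen there) and the empirical Shannon entropy, which on a finite sample will come out slightly below the capacity.

```python
import math
from collections import Counter


def position_capacity_bits(tokens):
    """log2 of the number of distinct characters seen at each position.

    Hypothetical helper: assumes every token has the same length.
    """
    length = len(tokens[0])
    return [math.log2(len({t[i] for t in tokens})) for i in range(length)]


def position_shannon_bits(tokens):
    """Empirical Shannon entropy (in bits) of each character position."""
    total = len(tokens)
    result = []
    for i in range(len(tokens[0])):
        counts = Counter(t[i] for t in tokens)
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        result.append(h)
    return result


if __name__ == "__main__":
    # Tiny synthetic sample, not real session ids.
    sample = ["xa0", "xb0", "xc1", "xd1"]
    print(position_capacity_bits(sample))
    print(position_shannon_bits(sample))
```

Summing the per-position values only gives an upper bound on the entropy of the whole id, since it treats the positions as independent of each other.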
We could, for instance, raise the following alerts based on the level of entropy seen in the session id:
32 bits <= entropy => No alert, or maybe just an informational so that the user can report it in their Pen Test Report (ahem!).
24 bits <= entropy < 32 bits => Low risk alert. Depending on the method used, a 24 bit session id could potentially be correctly guessed within the lifetime of the session. Guessing a 32 bit session id in that time is unlikely, admittedly, but it would depend heavily on the implementation.
16 bits <= entropy < 24 bits => Medium risk alert. 16 bits is only 65536 (64k) possibilities. 24 bits is only 16777216 possibilities. Brute-forceable.
0 bits <= entropy < 16 bits => High risk alert. 0 bits implies that the session id is constant (either it's not actually a session id at all, or the session id is calculated from a username/password combination using an unsalted hashing function, for instance). Either way, this is definitely a vulnerability.
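As a sketch only, the thresholds above boil down to a simple lookup. The function name and the returned labels are placeholders of mine, not existing ZAP alert constants.

```python
def alert_for_entropy(bits):
    """Map an estimated entropy (in bits) to a proposed alert level.

    Hypothetical helper; thresholds follow the list in this post.
    """
    if bits < 0:
        raise ValueError("entropy cannot be negative")
    if bits >= 32:
        return "Informational"
    if bits >= 24:
        return "Low"
    if bits >= 16:
        return "Medium"
    return "High"


if __name__ == "__main__":
    for bits in (128, 32, 24, 16, 0):
        # 2 ** bits is the size of the space an attacker has to search.
        print(bits, alert_for_entropy(bits), 2 ** bits)
```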
How would we do further analysis on the "dense" (128 bit numeric in this case) session data? Well, for instance, the "dieharder" test suite for random number generators can take a file containing a series of numbers, which is what the dense session ids would be. I don't know how platform independent it is, or whether a Java port exists, though. In theory, we could also just write the "dense" numbers to a file, and leave it to the user to run them through "dieharder".
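One possible reading of the "dense" view, sketched in Python: rank each character within the set of characters actually observed at its position, skip positions that never vary, and pack the ranks into a single integer. This works on symbols rather than raw bits, which is my own simplification. All names here are hypothetical, tokens are assumed to be the same length, and the output is just one decimal number per line; whatever header or word size dieharder expects for file input would need checking against its documentation.

```python
def build_alphabets(tokens):
    """Sorted list of the characters observed at each position."""
    return [sorted({t[i] for t in tokens}) for i in range(len(tokens[0]))]


def to_dense(token, alphabets):
    """Pack a token into an integer, ignoring positions with no variance."""
    value = 0
    for ch, alphabet in zip(token, alphabets):
        if len(alphabet) > 1:  # constant positions carry no information
            value = value * len(alphabet) + alphabet.index(ch)
    return value


def dense_values(tokens):
    """Dense integer for every token, using alphabets from the same sample."""
    alphabets = build_alphabets(tokens)
    return [to_dense(t, alphabets) for t in tokens]


def write_dense_file(tokens, path):
    """Write one dense number per line for offline analysis."""
    with open(path, "w") as out:
        for value in dense_values(tokens):
            out.write(f"{value}\n")


if __name__ == "__main__":
    # Tiny synthetic sample, not real session ids.
    print(dense_values(["xa0", "xb0", "xa1", "xb1"]))
```

A token containing a character that was never seen at that position in the sample would raise a ValueError in `to_dense`, so the alphabets should be built from the full set of tokens being converted.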
Thoughts?
Colm