MD5 hash calculation steps?


Mark Golazeski

Jun 24, 2012, 6:02:27 PM
to sc2ge...@googlegroups.com
Hi!

tl;dr 
Base64 of MD5 of attached replay is z/YuRS3HGvvTkV395nSHKA==, but hash in filename is P#oKhiT76LLjAlTZVDi7a7. Is there more to the MD5 sum than just the raw replay file?

Long Version:
I'm working on a plugin that will be storing extra information on replays, and I was setting up to do it by replay hash, but I'm having difficulties getting the hash in the replay name to match the one in my code.

My first sanity check question would be whether there's already a way in the API to access the hash for a replay? I didn't see one, so I implemented my own based off the format of the filename hash.

I'm making the assumption that the hash listed in the filename is a Base64 encoding of the MD5 of the replay, with the trailing "=" removed and the "/" and "+" characters replaced to make the filename more friendly.

With the attached example replay, both my Java code and my command line MD5 give a checksum of cff62e452dc71afbd3915dfde6748728.

Converting this to Base64 gives z/YuRS3HGvvTkV395nSHKA==, but the filename hash is P#oKhiT76LLjAlTZVDi7a7.
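
For reference, the check described above can be reproduced with a short Python sketch. It only illustrates the Base64 assumption being tested; the variable names are made up for illustration, and the MD5 value is the one quoted in this post.

```python
import base64

# MD5 of the attached replay, as reported above (hex string).
md5_hex = "cff62e452dc71afbd3915dfde6748728"

# Assumption under test: the filename hash is Base64 of the raw 16 MD5 bytes.
b64 = base64.b64encode(bytes.fromhex(md5_hex)).decode("ascii")
print(b64)  # z/YuRS3HGvvTkV395nSHKA==

# With the trailing "=" padding removed, the result has the same length
# as the filename hash, but the characters differ.
stripped = b64.rstrip("=")
filename_hash = "P#oKhiT76LLjAlTZVDi7a7"
print(len(stripped), len(filename_hash), stripped == filename_hash)
```

Both strings are 22 characters long once the padding is stripped, so the length alone cannot tell the two encodings apart.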

Are there extra steps or data used in the calculation of the hash? Did I just make an incorrect assumption somewhere? Let me know if you need any more info/code/data, and I'd be happy to provide.

Thanks for the great work you're doing!

-Mark
Attachment: Daybreak LE (11).SC2Replay

András Belicza

Jun 24, 2012, 7:21:59 PM
to sc2ge...@googlegroups.com
Yes, I apply a function to the MD5; it's not Base64, but something similar.

Here are the algorithm details:
Once you have the MD5 hex representation, split it into groups of 3 hex digits. Each group of 3 digits produces 2 characters in the encoded format:
3 digits = 3 × 4 bits = 12 bits.
The 2 output characters are taken from a 64-character alphabet: 2 × 6 bits = 12 bits.
This is the alphabet I use:
"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ$#"
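
A minimal Python sketch of the per-group encoding described above. The function name `encode_group` is hypothetical, and the bit order (high 6 bits select the first character, low 6 bits the second) is an assumption; it is not spelled out in the description, but it reproduces the first characters of the filename hash from the original question.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ$#"


def encode_group(group: str) -> str:
    """Encode one group of 3 hex digits (12 bits) as 2 alphabet characters."""
    if len(group) != 3:
        raise ValueError("expected exactly 3 hex digits")
    value = int(group, 16)  # 12-bit value, 0..4095
    # Assumption: high 6 bits -> first character, low 6 bits -> second.
    return ALPHABET[value >> 6] + ALPHABET[value & 0x3F]


# First two groups of the example MD5 cff62e45...
print(encode_group("cff"))  # P#
print(encode_group("62e"))  # oK
```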

Mark Golazeski

Jun 25, 2012, 3:28:17 PM
to sc2ge...@googlegroups.com
Thanks for the quick response!

I implemented some code that correctly generates the first 21 characters of the hash, but I'm running into problems with the last character: the MD5 sum has only 32 hex digits, so after ten full groups of 3 the final group has just 2 digits. Are you padding with a specific digit to calculate the last character?

Let me know if you need code or examples.

Thanks,
Mark

András Belicza

Jun 25, 2012, 4:55:38 PM
to sc2ge...@googlegroups.com
I use the last digit of the second-to-last group as the padding digit.
  
For example, if the last 2 groups are:
123-45
Then 3 will be used as the missing digit:
123-453

If the last 2 groups are:
abc-12
Then abc-12c will be used.

Do it this way and you'll get the same result.
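
Putting the grouping, the alphabet and the padding rule together, a minimal Python sketch might look like this. The function names `encode_md5_hex` and `replay_hash_of_file` are hypothetical, and the bit order (high 6 bits first) is an assumption that is consistent with the example replay from this thread.

```python
import hashlib

ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ$#"


def encode_md5_hex(md5_hex: str) -> str:
    """Encode a 32-digit MD5 hex string as a 22-character filename hash."""
    digits = md5_hex.lower()
    if len(digits) != 32:
        raise ValueError("expected a 32-digit MD5 hex string")
    # Ten full groups of 3 digits plus a final group of only 2 digits.
    groups = [digits[i:i + 3] for i in range(0, 32, 3)]
    # Padding rule: reuse the last digit of the second-to-last group.
    groups[-1] += groups[-2][-1]
    out = []
    for group in groups:
        value = int(group, 16)  # 12-bit value
        # Assumption: high 6 bits -> first character, low 6 bits -> second.
        out.append(ALPHABET[value >> 6] + ALPHABET[value & 0x3F])
    return "".join(out)


def replay_hash_of_file(path: str) -> str:
    """Hash the raw replay file with MD5, then apply the encoding above."""
    with open(path, "rb") as f:
        return encode_md5_hex(hashlib.md5(f.read()).hexdigest())


# MD5 of the example replay from the first message.
print(encode_md5_hex("cff62e452dc71afbd3915dfde6748728"))
# P#oKhiT76LLjAlTZVDi7a7
```

For the example MD5, the last two groups are 487-28, so the final group is padded to 287, which encodes to "a7" and matches the end of the hash in the filename.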