How is the client hash of a dictionary determined?

Rob M.

Feb 10, 2017, 5:58:39 PM
to SDCH
From what I've seen of SDCH in the wild, it's common practice to name the dictionaries after their client IDs, e.g. r9e65Lgq.dct or j2iw-qbt.dct.

How is this hash calculated exactly?

Rob

Jim

Feb 10, 2017, 10:41:45 PM
to SDCH
I think the more precise details can be found in the IETF proposed draft, or in some of the referenced docs on the net.


Based loosely on my recollection (in case your question was merely high-level curiosity):

The entire body of the dictionary is hashed, and the resulting hash is then split into two sections. One portion identifies the dictionary to be fetched (or currently possessed?), and the other is used when compressed data arrives and needs to identify the dictionary that compressed it. I *think* this strategy was chosen to slightly reduce the size of the referencing strings (uuencoded hash??), at the risk of an attack (confusing a dictionary that matched only half of the hash with the "real" dictionary)... but the whole protocol was historically used over HTTP (not HTTPS), so there was already no real tamper-resistance on the download channel.

When things moved to TLS (recently), I think the plan was to mitigate such "confused dictionary" attacks by relying on the fact that the client makes an authenticated connection to the server (to get the dictionary, or to get compressed data)... but I wasn't very involved in that discussion.

Please read the spec if you need the details.

Hope that helps,

Jim

Rob M.

Feb 11, 2017, 8:04:07 PM
to SDCH
Thanks, Jim. I'm just trying to figure out how it works so that I can also name my dictionaries after the client hash.

What I gather from the spec and your description is that you take the first 6 characters of the dictionary's SHA256 digest and Base64-encode them (URL-safe, as per RFC 3548, section 4) to generate the client ID, and do the same with the next 6 characters for the server ID.

However, pulling an example dictionary from net-internals, this doesn't seem to be quite correct:

net-internals:
Client hash: r9e65Lgq
 
My attempt at the client hash:
SHA256: afd7bae4b82a2044d410414ed771a29f929a07893f520fdecccc8658ba64b751 (of r9e65Lgq.dct file)
BASE64: YWZkN2Jh (of first 6 chars)

I must be getting something wrong, because r9e65Lgq doesn't even decode to an alphanumeric string.
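
One possible explanation, offered as a sketch rather than a reading of the spec: if "6 characters" is taken to mean the first 6 raw bytes (48 bits) of the digest instead of the first 6 characters of its hex string, the posted SHA256 value does encode to r9e65Lgq. That would also explain why r9e65Lgq does not decode to readable text, since it would decode to the raw bytes af d7 ba e4 b8 2a. The helper names `ids_from_digest` and `ids_from_dictionary` below are made up for illustration, and treating the next 6 bytes as the server ID is an assumption carried over from the description above.

```python
import base64
import hashlib


def ids_from_digest(digest):
    """Split a raw SHA-256 digest into (client_id, server_id).

    Assumption: each ID is the URL-safe Base64 encoding of 6 raw digest
    bytes (bytes 0-5 for the client ID, bytes 6-11 for the server ID).
    """
    client = base64.urlsafe_b64encode(digest[:6]).decode("ascii")
    server = base64.urlsafe_b64encode(digest[6:12]).decode("ascii")
    return client, server


def ids_from_dictionary(data):
    """Hash the dictionary file contents, then derive both IDs."""
    return ids_from_digest(hashlib.sha256(data).digest())


# SHA256 of r9e65Lgq.dct as posted above (hex string).
hex_digest = "afd7bae4b82a2044d410414ed771a29f929a07893f520fdecccc8658ba64b751"

# Encoding the first 6 hex *characters* as ASCII text gives the mismatch.
from_text = base64.urlsafe_b64encode(hex_digest[:6].encode("ascii")).decode("ascii")

# Encoding the first 6 raw *bytes* of the digest gives the net-internals value.
client_id, server_id = ids_from_digest(bytes.fromhex(hex_digest))

print(from_text)  # YWZkN2Jh
print(client_id)  # r9e65Lgq
```

To name a dictionary file this way, something like `ids_from_dictionary(open(path, "rb").read())[0] + ".dct"` would follow, with `path` standing in for the dictionary's location.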

Any further pointers would be greatly appreciated.

Rob