Hi,
I have a collection of documents where each document has a field
called "hash", which is the SHA1 of another field (a URL). This hash
field is a 40-character hex string and has an index on it. I sometimes
run queries on the collection, and instead of passing the URL I
thought it would be better to have my application generate the SHA1
hash and query the collection by hash.
Currently the hash is stored as a string. I was wondering if I should
store it as binary instead (20 bytes rather than 40 characters); would
the index be more efficient?
I used to do the same thing with MySQL - e.g. for an MD5 hash field,
instead of using a CHAR(32) type, I used a BINARY(16) type and indexed
that field.
Should I do the same with my Mongo collection? I suppose I would have
to use MongoBinData to do the conversion from the string hash to the
binary form? Something like this maybe (PHP):
$url = "http://www.example.com/";
$doc = array("URL" => $url, "hash" => new MongoBinData(hex2bin(sha1($url))));
$docs->save($doc);
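
To make the size difference concrete, here is a minimal standalone sketch (plain PHP, no Mongo driver involved). It assumes PHP 5.4+ for hex2bin(), and the variable names are only illustrative. It compares the length of the hex form with the raw form, and checks that sha1() with its second argument set to true gives the same bytes as hex2bin(sha1(...)):

```php
<?php
// Example URL, same as in the snippet above.
$url = "http://www.example.com/";

// Hex form: what is stored in the "hash" field today.
$hexHash = sha1($url);

// Raw form: passing true as the second argument makes sha1()
// return the raw binary digest instead of a hex string.
$rawHash = sha1($url, true);

// Converting the hex string back to bytes should give the same value.
$sameBytes = (hex2bin($hexHash) === $rawHash);

printf(
    "hex: %d bytes, raw: %d bytes, identical: %s\n",
    strlen($hexHash),
    strlen($rawHash),
    $sameBytes ? "yes" : "no"
);
// prints "hex: 40 bytes, raw: 20 bytes, identical: yes"
```

If that is right, I could presumably pass sha1($url, true) to MongoBinData directly and skip the hex round trip, though I have not verified how the driver treats it.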