Storing and querying a 36 byte _id from C#

100 views
Skip to first unread message

IanC

unread,
Sep 11, 2012, 7:26:28 AM9/11/12
to mongod...@googlegroups.com

I have a use case where storing an _id that is a composite of the following would be immensely useful:

  1. 96 bit int
  2. 96 bit int
  3. 96 bit int ObjectID

This works out as a 36 byte ASCII string.

I would like to query "Give me all documents where _id starts with {1} concatenate {2}." This would return documents where _id matches components 1 & 2 above, with any 3. Being able to do range queries would be useful, but I don't know if this is possible.

I'm using C# to access MongoDB. My first question is what data type should I use? If I use a string, it will encode as UTF-16 (.Net's native type), which is not plain binary, will inflate the size, and possibly confound queries. Will a byte array work?

My second question is how do I query it? I don't believe regex will accommodate the full ASCII 0 - 255 range.

Stefan Podkowinski

unread,
Sep 11, 2012, 8:23:02 AM9/11/12
to mongod...@googlegroups.com
Try the BSON Binary datatype for that. You should be able to create an instance from a byte array in .net to create a $gt/$lt data range query. Just make sure to add padding with zeros ($gt) or max values ($lt) for all remaining bytes so you get a 36byte key for comparison. 

IanC

unread,
Sep 11, 2012, 8:39:21 AM9/11/12
to mongod...@googlegroups.com
Thanks. And if I want to use a "starts with" query, how would I do that?

craiggwilson

unread,
Sep 11, 2012, 11:44:25 AM9/11/12
to mongod...@googlegroups.com
MongoDB supports composite id's as well, so you can leave the 3 components you have split up, and then do any kind of query on them that is support by those types.

{ _id : { "first" : 30345345, "second" : 3556654, "third" : ObjectId("...") } }

IanC

unread,
Sep 11, 2012, 11:57:43 AM9/11/12
to mongod...@googlegroups.com
Ok, even better. So I could do an equals or range on first and second. Since I'm dealing with 96 bit ints for each, would I use ObjectID to cast for all 3? Would you mind providing a sample query please.

craiggwilson

unread,
Sep 11, 2012, 12:55:32 PM9/11/12
to mongod...@googlegroups.com
No, I would use an integer for the first 2 and an ObjectId for the 3rd.

db.foos.find({ "_id.first" : { $gt : 0 }, "_id.second" : { $lt : 100 }})

would find all the documents with the first component of the _id is greater than 0 and the second component of the _id is less than 100.

craiggwilson

unread,
Sep 11, 2012, 12:57:07 PM9/11/12
to mongod...@googlegroups.com
Also, remember that the _id can never change for a document after it is set.  So, if you need these to be mutable, then store these as a normal document and use a surrogate key (probably and ObjectId) as the primary key.

IanC

unread,
Sep 11, 2012, 6:24:33 PM9/11/12
to mongod...@googlegroups.com
Ok, if I used integers, I'd need 5 components for _id since we don't have 96 bit ints. Would you recommend this over binary?

craiggwilson

unread,
Sep 12, 2012, 8:22:04 AM9/12/12
to mongod...@googlegroups.com
It depends on your needs.  Do you need to dynamically query these pieces?  Can you run the queries you need storing them as binary?  I'd test it out and see if you can accomplish what you need by storing things as binary or a string.  Either way, you'll have your answer. 

craiggwilson

unread,
Sep 12, 2012, 8:22:59 AM9/12/12
to mongod...@googlegroups.com
Also, if you go the binary route, you can't use the Aggregation Framework with that document.  If that is a concern for you, then don't use binary.
Reply all
Reply to author
Forward
0 new messages