ObjectId performance vs bigint

207 views
Skip to first unread message

Justin Chase

unread,
Jan 21, 2022, 1:11:27 PM1/21/22
to BSON
I'm having a discussion with some developers regarding establishing a pattern for ID fields  and we have the need to store documents in potentially multiple kinds of databases. Lets say both SQL and Mongo for example.

We like the sequential yet psuedo-random characteristics of ObjectId but we're having a debate regarding the effeciency of 12 byte id fields in SQL. Storing the id as a string may have negative performance impacts in SQL and there is an argument that storing it as a data type that can be "processed using a single machine language instruction" (aka 64 bits) is important for performance.

Lets assume the scale we're talking about here is large as well, lets assume very high scale and performance needs are legitimate to worry about. 

Now I believe in SQL I could make the key BINARY(12) and that would solve the key-as-a-string issue... but is that performant for queries?

Perhaps this is out of the realm of BSON specifically, forgive me if this is the wrong place to ask the question, but I'm just wondering if anyone here has any knowledge of studies done comparing the query performance characteristics of ObjectId's to other types of id fields?

Is there anything special that mongo is doing that SQL server or some other kind of database doesn't do that would make storing documents with ObjectIds perform relatively bad?

For example the logic is laid out here fully, including the alternate key algorithm:
https://instagram-engineering.com/sharding-ids-at-instagram-1cf5a71e5a5c

For example that instagram article states as point #2 "IDs should ideally be 64 bits (for smaller indexes, and better storage in systems like Redis)" but citation needed. Ok, its probably true that 64 bits is smaller than 96 bits but in the grand scheme of this data I'm skeptical that will be a significant factor and more importantly, how or why is it better or "ideal"? Specifically does it actually make querying appreciably faster?

Any help or opinions appreciated.
Reply all
Reply to author
Forward
0 new messages