Document ordering

81 views
Skip to first unread message

Massimo Redaelli

unread,
Oct 29, 2017, 6:25:39 AM10/29/17
to mongodb-user
Suppose I have a field, in my case the _id field but it doesn't matter I guess, that contains a subdocument.

I can query like this:

... .find({_id : { $lt: {a : 1} } })

But what is the definition of the sorting between documents? I was not able to find it in the documentation.
It would seem that Mongo compares all fields in their order: if the first n are equal and the (n+1)th are not, return the ordering of the (n+1)th. But before relying on that I'd like to make sure this is true and stable :)

Thanks!

Wan Bachtiar

unread,
Oct 30, 2017, 6:34:36 AM10/30/17
to mongodb-user

Hi Massimo,

If you would like a sorted result please use sort().
For more examples please see cursor.sort examples

Regards,
Wan.

Massimo Redaelli

unread,
Oct 30, 2017, 2:29:49 PM10/30/17
to mongodb-user
Hi.

I don't want a sorted result.

My point is that I want to use an _id composed of two fields, say _id: {a: 1, b: 2}, and I want to exploit the index on _id, since it's there anyway.

So I need to know what it means that {a: something, b: something} is less than {a: something_else, b: something_else} .

More details here, https://stackoverflow.com/questions/46932944/mongodb-comparison-of-subdocuments

Thanks

M.

Massimo Redaelli

unread,
Nov 1, 2017, 4:15:41 AM11/1/17
to mongodb-user
I guess I'm not explaining myself, so let me make some examples.

I have a collection with the following documents:

[  { _id: {a: 1, b: 1}}, { _id: {a: 1, b: 2}}, { _id: {a: 2, b: 1}}, { _id: {a: 2, b: 2}}, { _id: {a: 1, b: -1}}  ]

All I'm asking is how the result of these queries is computed, in the sense of what algorithm used to decide the $lt comparison:

.find({_id : { $lt: {a : 2} } })

.find({_id : { $lt: {b : 2} } })

.find({_id : { $lt: {c : 2} } })

.find({_id : { $lt: {a : 2, b : 2} } })

Also, is it anywhere in the documentation?

M.

Wan Bachtiar

unread,
Nov 5, 2017, 7:49:39 PM11/5/17
to mongodb-user

All I’m asking is how the result of these queries is computed, in the sense of what algorithm used to decide the $lt comparison:

Hi Massimo,

The sub-document comparison for _id field will be treated as Object comparison (BSON type number 3).
In other words it’s a binary comparison rather than the content of the sub-document.

You can utilise it for finding documents, using equality matching. As long as you provide both values and in the order they’re inserted. i.e.

db.collection.find({_id:{$eq:{a:1, b:-1}}});

If you reverse the keys order as an example below, it would failed to find the matching Object:

db.collection.find({_id:{$eq:{b:-1, a:1}}})

I don’t think this would be a good use for ordering documents (using $lt, $gt, etc), I would recommend to find alternatives to sort results. i.e. Add another field that you could use to sort.

Also, is it anywhere in the documentation?

Currently there is a ticket to improve this in the documentation. Feel free to upvote / watch DOCS-8794 for notifications.

Regards,
Wan.

Massimo Redaelli

unread,
Nov 6, 2017, 3:51:32 AM11/6/17
to mongod...@googlegroups.com

On Mon, Nov 6, 2017 at 1:49 AM, 'Wan Bachtiar' via mongodb-user <mongod...@googlegroups.com> wrote:

The sub-document comparison for _id field will be treated as Object comparison (BSON type number 3).

In other words it’s a binary comparison rather than the content of the sub-document.

​I see. So it is basically:

  • comparing te length of the document
  • then comparing the type of the first field
  • then comparing the name of the first field
  • then comparing the content of the first field
  • ...
At least that's what I would get from here: http://bsonspec.org/spec.html
 
I don’t think this would be a good use for ordering documents (using $lt, $gt, etc), I would recommend to find alternatives to sort results. i.e. Add another field that you could use to sort.

​I don't want to sort, but to filter
 
​.

Why do you think it woudn't be a good use, assuming all the documents have the same two fields of the same ​two fixed-legth types?


I assume this binary ordering is stable?

Thanks

Wan Bachtiar

unread,
Nov 15, 2017, 1:41:25 AM11/15/17
to mongodb-user

At least that’s what I would get from here: http://bsonspec.org/spec.html

Hi Massimo,

I’m not entirely sure where you’re getting the above information from. The BSON spec doesn’t contain the algorithm for the order of object comparison. The site only contains the specification for BSON itself.

Why do you think it woudn’t be a good use, assuming all the documents have the same two fields of the same ​two fixed-legth types?

It’s not only about the consistency of the _id values in the documents, but also the query.

For example, a document {a:1, b:2} in BSON object would be \x13\x00\x00\x00\x10a\x00\x01\x00\x00\x00\x10b\x00\x02\x00\x00\x00\x00 (string representation).
If in your query, you’re looking for exactly the same BSON object value generated from {a:1, b:2} (same fields order) then it’s a valid $eq comparison.

However, if you’re attempting to query using order comparison (i.e. $lt, $gt) then determining which binary value goes before/after another value is going to be difficult.
Furthermore if there’s only one field specified in the query i.e. {b:1} with BSON object value of \x0c\x00\x00\x00\x10b\x00\x01\x00\x00\x00\x00 an order comparison would be even be less deterministic.

Regards,
Wan.

Massimo Redaelli

unread,
Nov 15, 2017, 2:42:12 AM11/15/17
to mongod...@googlegroups.com
On Wed, Nov 15, 2017 at 7:41 AM, 'Wan Bachtiar' via mongodb-user <mongod...@googlegroups.com> wrote:

At least that’s what I would get from here: http://bsonspec.org/spec.html

I’m not entirely sure where you’re getting the above information from. The BSON spec doesn’t contain the algorithm for the order of object comparison. The site only contains the specification for BSON itself.

​I know. That's my question, isn't it? :P

I'm just assuming that the comparison of the binary representation is done byte-wise, left to right, in which case the process I described should be equivalent to the binary comparison, given the binary format specification.​
 

However, if you’re attempting to query using order comparison (i.e. $lt, $gt) then determining which binary value goes before/after another value is going to be difficult.

​To be honest, I'm baffled that something as basic as the behaviour of the operators $lt and $gt seems to be shrouded in uncertainly and complication, if not undefined.

M.

Wan Bachtiar

unread,
Nov 16, 2017, 6:48:10 PM11/16/17
to mongodb-user

​To be honest, I’m baffled that something as basic as the behaviour of the operators $lt and $gt seems to be shrouded in uncertainly and complication, if not undefined.

Hi Massimo,

To be clear, it’s not the behaviour of the operators $lt and $gt that is undefined. It is the fact that the sub-document within _id is not being travered/expanded into. If you were to store this sub-document in another field i.e. foo, you would not have this issue.

Do you have a use case requiring to store sub-documents in _id field ?
The field _id is reserved for use as a primary key; its value must be unique in the collection.
See more info MongoDB Documents

Regards,
Wan.

Massimo Redaelli

unread,
Nov 17, 2017, 1:32:46 PM11/17/17
to mongod...@googlegroups.com
On Fri, Nov 17, 2017 at 12:48 AM, 'Wan Bachtiar' via mongodb-user <mongod...@googlegroups.com> wrote:
To be clear, it’s not the behaviour of the operators $lt and $gt that is undefined. It is the fact that the sub-document within _id is not being travered/expanded into. If you were to store this sub-document in another field i.e. foo, you would not have this issue.

​It would be precisely the same issue. The point is not the name of the field, _id or foo, but that I'm applying $lt to foo (or _id) itself instead of its subfields.​

And if it is not undefined, then I'd like to see it expressed in the documentation :P
By this I mean that the docs should specify:
  1. the comparison is done at the BSON level
  2. how this binary comparison is done
  3. ideally, what is the equivalent definition of the comparison at the JSON level
 

Do you have a use case requiring to store sub-documents in _id field ?
The field _id is reserved for use as a primary key; its value must be unique in the collection.

There are plenty use cases: all those where the natural primary key is composed of more than one field...

M.

Wan Bachtiar

unread,
Dec 6, 2017, 7:41:22 PM12/6/17
to mongodb-user

And if it is not undefined, then I’d like to see it expressed in the documentation :P

Hi Massimo,

As mentioned previously, there is currently an open ticket in MongoDB JIRA issue tracker to improve this on the documentation DOCS-8974. Please feel free to watch/up-vote for notification.

how this comparison is done

After looking into the server code (MongoDB v3.4) for BSONObj::woCompare() within bsonobj.cpp, the
comparison for BSON type Object (#5) using operator (i.e. $gt) is performed using the following logic order:

  1. The key/value pairs are recursively compared in the order within the BSON object.
  2. Compare the key field name.
  3. If the key field name is equal, compare the value of the field (data type and value comparison).
  4. If the value of the field is equal, compare the next key/value pair (step 1). This is where the length of the object matters, the object where next element is not null is greater.

For examples, given a query db.coll.find({a: {$gt: {b: 1} } }) where documents in the collection:

{ a: {b: 1}},       /* Not returned, because value of b is not greater than 1 */
{ a: {b: 3}},       /* Returned, because value of b is greater than 1 */
{ a: {b: true}},    /* Returned, because value of b has a greater BSON data type than number */

{ a: {c: 1}},       /* Returned, because key field name c is greater than key field name b */
{ a: {a: 3}},       /* Not returned, because key field name a is not greater than key field name b */

{ a: {b: 1, c: 1}}, /* Returned, because the length of the object is greater than {b:1} */
{ a: {b: 0, c: 1}}, /* Not returned, because the first value of object (field key b) is not greater than 1. Ignoring the next element. */
{ a: {b: 3, c: 2}}, /* Returned, because the first value of object (field key b) is greater than 1. */
{ a: {a: 1, c: 1}}, /* Not returned, because the first key field name of object, a, is not greater than b */
{ a: {c: 1, b: -1}} /* Returned, because the first key field name of the object, c, is greater than b */

See also related: Type Bracketing

There are plenty use cases: all those where the natural primary key is composed of more than one field…

For more common use cases, you may find the following useful:

Regards,
Wan.

Massimo Redaelli

unread,
Dec 7, 2017, 8:25:31 AM12/7/17
to mongod...@googlegroups.com
Perfect, thank you very much!

M.

--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to a topic in the Google Groups "mongodb-user" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/mongodb-user/cPJckoD6t3A/unsubscribe.
To unsubscribe from this group and all its topics, send an email to mongodb-user+unsubscribe@googlegroups.com.
To post to this group, send email to mongod...@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/c8a5a07d-91b7-4b40-ad32-bab72dd22a49%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Reply all
Reply to author
Forward
0 new messages