Hello everyone,
I'm sorry if this question is a bit off-topic and too far on the MongoDB side of things, but the topic has come up in prior conversations on this mailing list.
I'm curious about the KeyStrings used for secondary indexing. The way I understand them (please correct me if I'm wrong) is that they are a different serialization of BSON objects such that the lexicographical comparison of two resulting byte arrays always produces the same result as the comparison of the actual BSON objects. To achieve that, they use a suffix encoding for their type information.
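To make my understanding of "order-preserving serialization" concrete, here is a toy sketch in Python. It is NOT MongoDB's actual KeyString format; the function name encode_int64 is made up for illustration, and it only handles signed 64-bit integers. It just shows the general idea that comparing the encoded bytes gives the same order as comparing the original values.

```python
import struct

def encode_int64(value: int) -> bytes:
    """Toy order-preserving encoding (hypothetical, not the KeyString format).

    Adding 2**63 shifts the signed 64-bit range onto the unsigned range, so
    the big-endian bytes compare lexicographically in numeric order.
    """
    return struct.pack(">Q", value + (1 << 63))

values = [5, -3, 0, 1024, -70000]

# Sorting by the encoded bytes should match sorting by the numbers themselves.
sorted_by_bytes = sorted(values, key=encode_int64)
sorted_by_value = sorted(values)
```

If my understanding is right, a real KeyString does something analogous for every BSON type, not just integers.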
What confuses me is how arrays are handled. The MongoDB docs state the following about comparing and sorting arrays:
A less-than comparison, or an ascending sort, compares the smallest elements of the array according to the BSON type sort order.
A greater-than comparison, or a descending sort, compares the largest elements of the array according to the reverse BSON type sort order.
This makes sense from a certain point of view, but it also means that the ascending sort order of documents is not necessarily the inverse of the descending sort order. For example:
- Document 1: { "arr": [92, 18] }
- Document 2: { "arr": [91, 19] }
If I sort them in ascending order according to the rule stated in the docs, I get [Document 1, Document 2] (because min(arr) in Document 1 is 18, and min(arr) in Document 2 is 19). If I sort them in descending order, I also get [Document 1, Document 2] (because max(arr) in Document 1 is 92, and max(arr) in Document 2 is 91).
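The example can be reproduced with a few lines of plain Python. This is only a sketch of the rule as I read it in the docs (ascending uses min(arr), descending uses max(arr)), not of what the server actually does; the "name" field is added purely for readability.

```python
docs = [
    {"name": "Document 1", "arr": [92, 18]},
    {"name": "Document 2", "arr": [91, 19]},
]

# Ascending sort: compare the smallest element of each array.
ascending = sorted(docs, key=lambda d: min(d["arr"]))

# Descending sort: compare the largest element of each array, in reverse.
descending = sorted(docs, key=lambda d: max(d["arr"]), reverse=True)

asc_names = [d["name"] for d in ascending]
desc_names = [d["name"] for d in descending]

# Under this rule both orders come out the same for these two documents.
print(asc_names)
print(desc_names)
```

Both lists start with Document 1, so the descending order is not the reverse of the ascending order.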
If we have an index on the documents' "arr" field, we will have to decide whether we put the maximum or the minimum entry in our KeyString, right? Does that mean that this index can only be used to fetch the ascending sort order for the "arr" field, and has to be ignored when the user asks for the descending sort order on "arr"?
The very purpose of sort keys and the sort definition stated in the docs seem to conflict with each other. Can anybody shed some light on what's actually going on under the hood here?
Thank you!
Martin