Difficulty understanding query results with a Compound, Multikey Index

57 views
Skip to first unread message

er...@sonicbloomgames.com

unread,
Jul 12, 2017, 6:11:41 PM7/12/17
to mongodb-user
First, some of the basics:
  • Database Version: 3.4.2
  • Shell Version: 3.4.3
  • Document Structure: (many fields removed for clarity)
    {
        "data": {
            "arrayOne": [
                {
                    "ID": "[String1]",
                    "arrayTwo": [
                        {
                            "someNum": 23
                        },
                        {
                            "someNum": 28
                        }
                    ]
                },
                {
                    "ID": "[String2]",
                    "arrayTwo": [
                        {
                            "someNum": 34
                        },
                        {
                            "someNum": 37
                        }
                    ]
                }
            ]
        },
        "metadata": {
            // ...
        }
    }
  • Compound Index: ( { "data.arrayOne.ID": 1, "data.arrayOne.arrayTwo.someNum": 1 } )
  • Docs In Collection: 1009
I am trying to figure out how to quickly return results wherein a document within arrayOne contains a specific ID and contains at least one someNum value within a specified range.

The following query returns the correct 29 documents:

db.datacol.find( { "data.arrayOne": { $elemMatch: { arrayTwo: { $elemMatch: { someNum: { $gte: 20.47, $lt: 20.49 } } }, ID: "test_id_0" } } } );

However, it takes over 500 milliseconds to complete the search and it claims to examine all 1009 documents. From what I can see in the explain output, the indexBounds only properly match against the ID. See:

"indexBounds" : {
    "data.arrayOne.ID" : [
        "[\"test_id_0\", \"test_id_0\"]"
    ],
    "data.arrayOne.arrayTwo.someNum" : [
        "[MinKey, MaxKey]"
    ]
}

Hinting that it should use the compound index does improve speed (~100ms) but it should be better.

To that end, I adjusted the query to the following as an experiment:

db.datacol.find( { "data.arrayOne": { $elemMatch: { "arrayTwo.someNum": { $elemMatch: { $gte: 20.47, $lt: 20.49 } }, ID: "test_id_0" } } } )

[You'll notice that the difference is that someNum is now within the first $elemMatch, rather than the second.]

This query returns 0 documents (verified via Cursor.count() and Cursor.hasNext()), but claims to examine 29 documents and has "nReturned: 29". It does this in ~10ms and reports the following bounds:

"indexBounds" : {
    "data.arrayOne.ID" : [
        "[\"test_id_0\", \"test_id_0\"]"
    ],
    "data.arrayOne.arrayTwo.someNum" : [
        "[20.47, 20.49)"
    ]
}

Which is what I'd expect for this type of lookup (based on what I've understood from this documentation). What's troublesome is the lack of returned data...

Could someone explain what's going on? Why does this newer version fail to actually return the 29 documents to the cursor? Why does moving the "someNum" field around as I did have such a large effect?

Asya Kamsky

unread,
Jul 12, 2017, 6:41:45 PM7/12/17
to mongodb-user
“$elemMatch” says “only match array elements which …

In your second case “arrayTwo.someNum” is not an array and therefore nothing will match that query.

The index finds the actual values that match the query, but then examining the documents the query system correctly rejects the the candidate documents.



--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+unsubscribe@googlegroups.com.
To post to this group, send email to mongod...@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/d9667e39-fddc-4570-b4ab-f22e287a40eb%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Asya Kamsky
Lead Product Manager
MongoDB
Download MongoDB - mongodb.org/downloads
We're Hiring! - https://www.mongodb.com/careers

er...@sonicbloomgames.com

unread,
Jul 13, 2017, 1:22:01 PM7/13/17
to mongodb-user
Got it. How, then, can I restructure my query such that the index is correctly used (as in this case) but that the query system accepts the candidate documents under examination?

To perhaps put it another way, why does the "failing" query I wrote not work?
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user...@googlegroups.com.

Asya Kamsky

unread,
Jul 14, 2017, 2:15:53 PM7/14/17
to mongodb-user
I believe there may be a bug in the optimizer that’s causing the bounds on the second field not to be as tight as they should be.

I’ve filed a server ticket here with a reproducer, I did notice that the bug didn’t get triggered when the second array contained scalars (instead of subdocuments) - I’m not sure if that would help you, since you probably have other fields in the subdocuments in addition to “someNum”.

Asya

To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+unsubscribe@googlegroups.com.

To post to this group, send email to mongod...@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.

For more options, visit https://groups.google.com/d/optout.

er...@sonicbloomgames.com

unread,
Jul 17, 2017, 11:28:06 AM7/17/17
to mongodb-user
I believe there may be a bug in the optimizer that’s causing the bounds on the second field not to be as tight as they should be.

Ahh, well that would certainly explain my confusion.
 
I did notice that the bug didn’t get triggered when the second array contained scalars (instead of subdocuments) - I’m not sure if that would help you, since you probably have other fields in the subdocuments in addition to “someNum”.

That is correct: someNum is one of several fields in the subdocuments within arrayTwo.

Would a temporary workaround be to create a View(?) wherein the document structure is adjusted to enable the optimizer to work in its current state?

- Eric

Asya Kamsky

unread,
Jul 17, 2017, 5:54:44 PM7/17/17
to mongodb-user
Would a temporary workaround be to create a View(?) wherein the document structure is adjusted to enable the optimizer to work in its current state?

No, I’m afraid not, because the view is not materialized so the underlying query is going to be happening on the exact same structure of document that’s in the base collection.

Asya


To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+unsubscribe@googlegroups.com.

To post to this group, send email to mongod...@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.

For more options, visit https://groups.google.com/d/optout.

er...@sonicbloomgames.com

unread,
Jul 17, 2017, 6:58:02 PM7/17/17
to mongodb-user
Darn. Well, thanks for the explanation! It's both insightful and saved us the time of attempting that route...

er...@sonicbloomgames.com

unread,
Oct 25, 2017, 2:11:54 PM10/25/17
to mongodb-user
Has there been any movement on this bug? I see that it is currently in the Backlog, a place I fear that bugs go to die... ;)
Reply all
Reply to author
Forward
0 new messages