How could I use covered query with null value ?

74 views
Skip to first unread message

bappr

unread,
Dec 15, 2015, 3:18:08 PM12/15/15
to mongodb-user
Hi,

I have a collection with 2 documents such as:

  • {_id: 1, test: 1}
  • {_id: 2, test: null}

On this collection, I have 1 index such as:

  • {test: 1}

Explain a query tell me that :

  • query on {test: 1} is a covered query.
  • query on {test: null} is not.

How could I use a covered query in this specific case to use my index without needing to scan the collection ?


Thanks.


Benjamin

Asya Kamsky

unread,
Dec 15, 2015, 7:02:04 PM12/15/15
to mongodb-user
Because {test:null} matches both, documents where test exists and is equal to null and those where test does not exist, the document has to be fetched to see which of the two is the case.

There is no way to tell from the index alone whether the field exists.

Asya


--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
 
For other MongoDB technical support options, see: http://www.mongodb.org/about/support/.
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user...@googlegroups.com.
To post to this group, send email to mongod...@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/59a881b8-8a4e-4852-8c4e-0244c0e4f899%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.



--
Asya Kamsky
Lead Product Manager
MongoDB
Download MongoDB - mongodb.org/downloads
Free MongoDB Monitoring - cloud.mongodb.com
Free Online Education - university.mongodb.com
Get Involved - mongodb.org/community
We're Hiring! - https://www.mongodb.com/careers

John De Goes

unread,
Dec 16, 2015, 11:31:54 PM12/16/15
to mongodb-user

MongoDB sometimes conflates undefined (the absence of a value) with null (the presence of a sentinel value). In 3.2, undefineds are now preserved on insertion, and presented as undefined in the shell. At some future version, hopefully the semantics will be untangled completely which would allow your query to hit just the index.

Regards,

John

bappr

unread,
Jan 28, 2016, 5:28:58 AM1/28/16
to mongodb-user
Ok thanks Asya.

It seems that a query such as 
  • {_id: 1, test: null}
also needs to scan all the collection.

Do I need to add a boolean flag to avoid the "test:null" ?

Thanks.

Benjamin

I tried to query

Asya Kamsky

unread,
Jan 28, 2016, 7:40:55 PM1/28/16
to mongodb-user
I don't understand your question.

Query {_id:1, test:null} won't need to scan the collection, just the
one document that it can tell from the index matches _id:1 value.

But if you represented "test" as true and false (rather than null and
not null, I guess) then your indexed queries would be more efficient.

Asya
> https://groups.google.com/d/msgid/mongodb-user/9776ae34-9fa5-4bf5-958c-f39fd180c914%40googlegroups.com.

bappr

unread,
Jan 29, 2016, 3:30:47 AM1/29/16
to mongodb-user
OK let me give you my use case.

I have a collection with 40 millions datas such as
- {_id: ..., flag: true, test: "abc"}
- {_id: ..., flag: true, test: null}

I have one index (index1) such as
- {flag:1, test:1}

I need to process all of these datas, 10 millions a time.
So I will have 4 batches.
The processing is complex but at the end, all the flag have to be false.

At the end of each batch, I need to verify there are still datas with true flag and null test.
I query mongo with a count such as db.collection.count({flag:true, test:null}).
Because I have 4 batches, I will have 4 count to do.

The problem is that the count time is empirical while I have less true flag...
To give you a concrete estimation, with only 5 true flag datas, the count take 30 minutes to return...

The explain is OK => the query use my "index1" but need to do a FETCH for "test:null".

Why the count time is empirical ?
It seems it does not use any index but explain tell it is using one...

Am I more understandable ?

Thanks.

Benjamin

Asya Kamsky

unread,
Jan 29, 2016, 1:11:45 PM1/29/16
to mongodb-user
Sorry, still not really clear, but as I said before, if the query
includes a null check then it cannot be a covered index query.
That means that after checking the index, documents have to be fetched
to check if the test value in the index is null because the field is
absent or present and has value null.

You have several options to work around this: you could use a value
other than "null" to indicate there has been no test - for example,
something like "test": "none" or "test":false - then the count will be
a covered index query and should be much faster.

Depending on which version you are using, other workaround options may
be available.

Asya
> https://groups.google.com/d/msgid/mongodb-user/a47cc688-6547-4a1a-80e3-561af57b4123%40googlegroups.com.
Reply all
Reply to author
Forward
0 new messages