Mongodb sparse index and general index

26 views

Skip to first unread message

Ns Kumar

unread,

Jan 3, 2018, 4:30:47 AM1/3/18

to mongodb-user

I have created a collection with 100 documents(fields x & y) and created normal index on field x and sparse index on field y as shown below :

for(i=1;i<100;i++)db.coll.insert({x:i,y:i})

db.coll.createIndex({x:1})
db.coll.createIndex({y:1},{sparse:true})

Now added a few docs without fields x & y as shown below:
for(i=1;i<100;i++)db.coll.insert({z:"stringggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggggg"})

Looking at db.coll.stats(), i found below sizes of the indexes.

storageSize:36864
_id:32768
x_1:32768
y_1:16384

As per definition of sparse index, only fields containing indexed field y are considered, hence y_1 has occupied less space. But _id & x_1 indexes seem to contain all the documents in them.

If i perform a query - db.coll.find({z:99}).explain('executionStats')

It is doing a COLLSCAN and fetching the record. If this is the case, i am not clear on why mongodb stores all the documents under _id & x_1 indexes, as it is a waste of storage space. Please help me understand. Pardon my ignorance if i missed something.

Thank you for your help.

Kevin Adistambha

unread,

Jan 3, 2018, 6:35:00 PM1/3/18

to mongodb-user

Hi,

I believe you posted an identical question in https://stackoverflow.com/questions/48060380/mongodb-sparse-index-and-general-index. I have put up my answer there, but the main points are:

In a “normal” index, missing fields are indexed with a null value. For example, if you have index of {a:1} and you insert {b:10} into the collection, the document will be indexed as a: null.

In your collection, the documents will be indexed as follows:

200 documents will be in the _id index, which is always created by MongoDB
200 documents will be in the “normal” {x:1} index, from all the documents in the collection
100 documents will be in the sparse {y:1} index, from documents that has the y field

Note that the index sizes you posted shows the same ratio as the numbers above.

Regarding your questions:

The _id index is for MongoDB internal use, see Default _id index. You cannot drop this index, and attempts to remove it could render your database inaccessible.
The x_1 index contains all the documents in your collection because it’s a normal index. In the case of your collection, half of the values in the index are null.
The sparse y_1 index is half the size of the x_1 index because only 100 out of 200 documents contain the y field.
The query db.coll.find({z:99}) does not use any index because you don’t have an index on the z field, hence it’s doing a collection scan.

For more information about indexing, please see Create Indexes to Support Your Queries