{_id:1, t:1, s:[null, 1, 2, 3], bt:123}
{_id:2, t:1, s:[null], bt:124}
The "s" field is used in an index so the document can be found fr any
individual value of "s".
Our index is {t:1, s:1, bt:1} and is quite large.
New documents are added to this collection using an upsert. If we do
the following then it works fine but is quite slow because the above
index is not used.
---
db.test.findAndModify({query:{t:ObjectId("4e1bd2a184aefd11f9fc2950"),
bt:1307483647}, update:{$addToSet:{s:null}}, new:true, upsert:true});
---
When doing an upsert, we always do so using the "null" value for s. I
would like to do this as follows to make use of the index:
---
db.test.findAndModify({query:{t:ObjectId("4e1bd2a184aefd11f9fc2950"),
s:null, bt:1307483647}, update:{$addToSet:{s:null}}, new:true,
upsert:true});
---
If I run this for a document that does not exist however then I get
the following:
---
Wed Jul 13 01:55:14 uncaught exception: findAndModifyFailed failed:
"Cannot apply $addToSet modifier to non-array"
---
If this is run when the document already exists however then it works
fine because the s:null finds the document in the index correctly.
Do you have any ideas on how to get this working correctly?
One option would be to add another index without the "s" field, but as
I said, it is quite large so I would like to avoid that if possible.
Thanks in advance,
Cheers,
Leif
I noticed what you said about changing the order of the index, but
that would affect the places where the index is used else where. All
of the data is grouped by t, s, then bt.
"s" is actually a field that we added to index sparse data faster. We
don't set it for the majority of data because they exist in all
records within a given "t" value. Each document in this collection
contains over 100 sub documents which need to be used together. There
are times when we need to search for all documents which contain a
given subdocument however.
{t:1, s:[null], bt:123, i1:{...}, i2:{...}, i3:{...}}
{t:1, s:[null], bt:124, i1:{...}, i2:{...}, i3:{...}}
{t:1, s:[null, i4], bt:125, i1:{...}, i2:{...}, i3:{...}, i4:{...}}
{t:1, s:[null], bt:126, i1:{...}, i2:{...}, i3:{...}}
In the above data data, the "i4" field is considered sparse data
because it only exists in a small fraction of the overall data.
For most queries we do something like this to find documents
containing a given field.
db.dox.find({t:1, s:null, bt:{$gte:120, $lt:130}, i3:{$exists:true}},
{t:true, bt:true, i3:true});
This is fine because most documents contain the "i3" field and the
exists filters out those that do not without too much waste.
For sparse data however we would have to look through 10s of thousands
of records to find the most recent 10 or so documents with the sparse
field. That is very slow. To make this work faster, we make use of
the sparse field. like this:
db.dox.find({t:1, s:"i4", bt:{$gte:120, $lt:130}, i4:{$exists:true}},
{t:true, bt:true, i3:true});
The exists in this case is redundant and could be removed.
If I do what you suggested and try to create the record using a
findAndReplace without the "s" field then it would require another
index.
I was surprised that the findAndReplace command was not able to
differentiate this case because this is a standard way to index values
in an array for a query.
db.test.findAndModify({query:{t:1, s:null, bt:123},
update:{$addToSet:{s:null}}, new:true, upsert:true});
Cheers,
Leif