findAndModify upsert $addToSet

Leif Mortenson

unread,

Jul 12, 2011, 10:32:48 PM7/12/11

to mongod...@googlegroups.com

Hi,
We have a collection "test" with documents like the following:

{_id:1, t:1, s:[null, 1, 2, 3], bt:123}
{_id:2, t:1, s:[null], bt:124}

The "s" field is used in an index so the document can be found fr any
individual value of "s".

Our index is {t:1, s:1, bt:1} and is quite large.

New documents are added to this collection using an upsert. If we do
the following then it works fine but is quite slow because the above
index is not used.
---
db.test.findAndModify({query:{t:ObjectId("4e1bd2a184aefd11f9fc2950"),
bt:1307483647}, update:{$addToSet:{s:null}}, new:true, upsert:true});
---

When doing an upsert, we always do so using the "null" value for s. I
would like to do this as follows to make use of the index:
---
db.test.findAndModify({query:{t:ObjectId("4e1bd2a184aefd11f9fc2950"),
s:null, bt:1307483647}, update:{$addToSet:{s:null}}, new:true,
upsert:true});
---

If I run this for a document that does not exist however then I get
the following:
---
Wed Jul 13 01:55:14 uncaught exception: findAndModifyFailed failed:
"Cannot apply $addToSet modifier to non-array"
---
If this is run when the document already exists however then it works
fine because the s:null finds the document in the index correctly.

Do you have any ideas on how to get this working correctly?

One option would be to add another index without the "s" field, but as
I said, it is quite large so I would like to avoid that if possible.

Thanks in advance,
Cheers,
Leif

Nat

unread,

Jul 12, 2011, 10:59:12 PM7/12/11

to mongod...@googlegroups.com

- If you use index {t:1, bt:1, s:1}, findAndModify should be a lot faster.

- $addToSet: works with array only. So when the array doesn't exist, it will fail. To get it working probably, you probably should issue another upsert call before hand to add an empty document when the record doesn't exist.

Leif Mortenson

unread,

Jul 13, 2011, 12:34:25 AM7/13/11

to mongod...@googlegroups.com

Nat,
Thank you for your feedback.

I noticed what you said about changing the order of the index, but
that would affect the places where the index is used else where. All
of the data is grouped by t, s, then bt.

"s" is actually a field that we added to index sparse data faster. We
don't set it for the majority of data because they exist in all
records within a given "t" value. Each document in this collection
contains over 100 sub documents which need to be used together. There
are times when we need to search for all documents which contain a
given subdocument however.

{t:1, s:[null], bt:123, i1:{...}, i2:{...}, i3:{...}}
{t:1, s:[null], bt:124, i1:{...}, i2:{...}, i3:{...}}
{t:1, s:[null, i4], bt:125, i1:{...}, i2:{...}, i3:{...}, i4:{...}}
{t:1, s:[null], bt:126, i1:{...}, i2:{...}, i3:{...}}

In the above data data, the "i4" field is considered sparse data
because it only exists in a small fraction of the overall data.

For most queries we do something like this to find documents
containing a given field.
db.dox.find({t:1, s:null, bt:{$gte:120, $lt:130}, i3:{$exists:true}},
{t:true, bt:true, i3:true});
This is fine because most documents contain the "i3" field and the
exists filters out those that do not without too much waste.

For sparse data however we would have to look through 10s of thousands
of records to find the most recent 10 or so documents with the sparse
field. That is very slow. To make this work faster, we make use of
the sparse field. like this:
db.dox.find({t:1, s:"i4", bt:{$gte:120, $lt:130}, i4:{$exists:true}},
{t:true, bt:true, i3:true});
The exists in this case is redundant and could be removed.

If I do what you suggested and try to create the record using a
findAndReplace without the "s" field then it would require another
index.

I was surprised that the findAndReplace command was not able to
differentiate this case because this is a standard way to index values
in an array for a query.

db.test.findAndModify({query:{t:1, s:null, bt:123},

update:{$addToSet:{s:null}}, new:true, upsert:true});

Cheers,
Leif

Nat

unread,

Jul 13, 2011, 2:40:47 AM7/13/11

to mongod...@googlegroups.com

If you always sort by {t:true, bt:true, xxxxx : true} anyway, I still think most of your queries still be able to use index {t:1, bt:1, s:1}. You shouldn't need an additional index.

Reply all

Reply to author

Forward