Duplicate Key error when inserting an array of objects into a collection. How to determine which element caused the error?

wumpus

Apr 23, 2014, 5:44:48 PM
to mongod...@googlegroups.com
I have a batch of records (BasicDBObject[] records) to insert into a MongoDB collection (2.4.9, actually TokuMX 1.4.1) using the Java driver (2.12.0). When a DuplicateKeyException occurs, how do I determine which member of the batch/array caused the duplicate key error? Does the exception carry anything that identifies that member?

When a DuplicateKeyException occurs, how can I tell whether any of the batch members succeeded? I'd really just like to tell MongoDB to ignore duplicates and process the entire batch, but it's not obvious how to do so.

Jeff Yemin

Apr 24, 2014, 2:37:30 PM
to mongod...@googlegroups.com
If you want to keep going after a duplicate key error, alter your WriteConcern by calling com.mongodb.WriteConcern#continueOnError with the boolean value true. You will still get a DuplicateKeyException, but you can be assured that all the documents that could be inserted were inserted.
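
To make the continue-on-error behavior concrete, here is a minimal in-memory sketch. It does not use the MongoDB driver at all: the map stands in for a collection with a unique index, and ContinueOnErrorSketch, DuplicateKey, and insertBatch are hypothetical names invented for illustration. It assumes the error is raised only after every record has been attempted, and it arbitrarily reports the first duplicate; which error a real server reports is not modeled here.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory model only: a map keyed by the unique field stands in for a
// collection with a unique index. No MongoDB driver classes are used.
final class ContinueOnErrorSketch {
    // Hypothetical stand-in for the driver's duplicate key exception.
    static final class DuplicateKey extends RuntimeException {
        DuplicateKey(String message) {
            super(message);
        }
    }

    final Map<String, String> store = new LinkedHashMap<>();

    // Attempts every record {key, value}; remembers the first duplicate
    // and throws it only after the whole batch has been processed.
    void insertBatch(List<String[]> records) {
        DuplicateKey firstError = null;
        for (String[] record : records) {
            String key = record[0];
            if (store.containsKey(key)) {
                if (firstError == null) {
                    firstError = new DuplicateKey("duplicate key: " + key);
                }
                continue; // keep going instead of aborting the batch
            }
            store.put(key, record[1]);
        }
        if (firstError != null) {
            throw firstError;
        }
    }

    public static void main(String[] args) {
        ContinueOnErrorSketch coll = new ContinueOnErrorSketch();
        coll.store.put("b", "existing");
        try {
            coll.insertBatch(Arrays.asList(
                    new String[] {"a", "1"},
                    new String[] {"b", "2"},
                    new String[] {"c", "3"}));
        } catch (DuplicateKey e) {
            // The error is still raised, but "a" and "c" were stored.
            System.out.println(e.getMessage() + ", stored keys: " + coll.store.keySet());
        }
    }
}
```

The point of the sketch is the caller's side: the catch block still runs, yet the non-duplicate records are already in place and the existing document is untouched.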

Regards,
Jeff



Asya Kamsky

Apr 24, 2014, 4:18:46 PM
to mongodb-user
Please note that Jeff's advice applies to using the Java driver with MongoDB. If you are using it with TokuMX, you may see different behavior, so you may be best off double-checking this on the TokuMX Google Group.

Asya



wumpus

May 15, 2014, 4:20:41 PM
to mongod...@googlegroups.com
This is good information and worked well to ignore the duplicates, but I've changed my mind :). I want to implement "last one wins" (update duplicates) instead of "first one wins" (ignore duplicates).

I want to take advantage of batch inserts (because they're so much faster than single upserts). Unfortunately I will have some batch insert collisions where the document already exists because of a unique index. When that collision/exception happens I want to update the non-indexed document fields in the document that caused the collision and make sure the rest of the batch gets inserted. If I have to try to pre-find() each document before creating my batch, that's going to eat away at a lot of the gains I'll get from batch inserts. Likewise if I have to go back and find() all the items in the batch to see which one failed.

How do I determine which document in the batch failed? There must be something in the exception that identifies which batch entry failed, or maybe which entries succeeded?

Asya Kamsky

May 15, 2014, 11:40:26 PM
to mongodb-user
Are you talking about regular MongoDB or TokuMX here?

In MongoDB you can specify for a batch of operations whether they
should be "ordered" or "unordered" - unordered means it will try all
of them and will return the errors for those that didn't succeed,
"ordered" means it'll stop as soon as it gets the first error.
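
The ordered/unordered difference can be sketched with a small in-memory model. This is not driver code: OrderedVsUnorderedSketch and insertBatch are hypothetical names, and a Set of keys stands in for the unique index. In the 2.12 Java driver, to the best of my knowledge, the real counterparts are DBCollection.initializeOrderedBulkOperation() / initializeUnorderedBulkOperation(), and a BulkWriteException whose getWriteErrors() entries expose getIndex(), the position of the failing operation in the batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// In-memory model only: a Set of keys stands in for a unique index.
final class OrderedVsUnorderedSketch {
    // Inserts each key and returns the batch positions that failed
    // with a duplicate key.
    static List<Integer> insertBatch(Set<String> uniqueKeys, List<String> batch, boolean ordered) {
        List<Integer> failedIndexes = new ArrayList<>();
        for (int i = 0; i < batch.size(); i++) {
            // Set.add returns false when the key is already present.
            if (!uniqueKeys.add(batch.get(i))) {
                failedIndexes.add(i);
                if (ordered) {
                    break; // ordered: stop at the first error
                }
            }
        }
        return failedIndexes;
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("a", "b", "c", "a", "d");

        Set<String> keys = new HashSet<>(Arrays.asList("b"));
        // prints "unordered failures: [1, 3]"
        System.out.println("unordered failures: " + insertBatch(keys, batch, false));

        keys = new HashSet<>(Arrays.asList("b"));
        // prints "ordered failures: [1]"
        System.out.println("ordered failures: " + insertBatch(keys, batch, true));
    }
}
```

In the unordered run every non-duplicate key ends up stored and each failure is reported by position; in the ordered run, "c" and "d" are never attempted.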

Now, starting with 2.6 you can do it as batch updates with the upsert
option, which sounds like what you want - insert if it's not there but
update (i.e. overwrite in your case) if it is.
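
A rough in-memory sketch of that "last one wins" behavior, again without the driver: UpsertBatchSketch and upsertBatch are hypothetical names, and the map key plays the role of the uniquely indexed field. With the 2.12 Java driver, I believe the equivalent is queueing bulk.find(query).upsert().replaceOne(document) on a BulkWriteOperation for each record and then calling execute(), but treat that as an assumption to verify against your driver and server versions.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// In-memory model only: the map key stands in for the uniquely indexed field.
final class UpsertBatchSketch {
    // Upserts each record {key, value}: inserts when the key is absent,
    // overwrites when it is present. Returns how many records replaced
    // an existing entry.
    static int upsertBatch(Map<String, String> store, List<String[]> batch) {
        int replaced = 0;
        for (String[] record : batch) {
            // Map.put returns the previous value, or null if there was none.
            if (store.put(record[0], record[1]) != null) {
                replaced++;
            }
        }
        return replaced;
    }

    public static void main(String[] args) {
        Map<String, String> store = new LinkedHashMap<>();
        store.put("b", "old");
        int replaced = upsertBatch(store, Arrays.asList(
                new String[] {"a", "1"},
                new String[] {"b", "2"},
                new String[] {"a", "3"}));
        // "b" is overwritten, and the second "a" overwrites the first.
        System.out.println("replaced=" + replaced + ", a=" + store.get("a") + ", b=" + store.get("b"));
    }
}
```

No duplicate key error arises in this model because a colliding record is treated as an update rather than a failed insert, so nothing has to be looked up before or after the batch.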

Does that sound right?

Asya