insert_many in MongoDB: how to ignore duplicate inserts in a session transaction


Luca Pamparana

Aug 7, 2018, 8:29:28 PM
to mongodb-user
Previously I was using PyMongo without any transactions or sessions, and was inserting documents successfully like this:

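    import pymongo

    try:
        _ = db[collection].insert_many(dataset, ordered=False)
    except pymongo.errors.BulkWriteError as e:
        # Keep only the write errors that are not duplicate-key errors (code 11000).
        err = [x for x in e.details['writeErrors'] if x['code'] != 11000]
        if len(err) > 0:
            raise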

The code above was successfully ignoring the errors about duplicate keys, which is what I wanted.
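
To make the filtering step concrete, here is a minimal sketch of it in isolation. It assumes the error details are a dict with a `writeErrors` list, as in the snippet above; the helper name `non_duplicate_errors` and the sample data are hypothetical and only for illustration.

```python
DUPLICATE_KEY = 11000  # MongoDB error code for a duplicate key


def non_duplicate_errors(details):
    """Return the write errors that are not duplicate-key errors."""
    return [
        err for err in details.get("writeErrors", [])
        if err["code"] != DUPLICATE_KEY
    ]


# Hypothetical sample, shaped like the details of a bulk write error.
sample_details = {
    "writeErrors": [
        {"index": 0, "code": 11000, "errmsg": "E11000 duplicate key error"},
        {"index": 2, "code": 121, "errmsg": "Document failed validation"},
    ]
}

# Only the non-duplicate error should remain and trigger a re-raise.
remaining = non_duplicate_errors(sample_details)
```

If `remaining` is empty, every failure was a duplicate and can be ignored; otherwise the original exception should be re-raised.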

I have now upgraded to MongoDB 4.0 and tried to do the same thing inside a session, using the new transactions API:

    def do_insert(db, dataset, session):
        try:
            _ = db[collection].insert_many(dataset, ordered=False, session=session)
        except pymongo.errors.DuplicateKeyError as e:
            pass

However, committing the transaction then raises an `OperationFailure`, and I get something like:

    ERROR: test_insert_duplicate_categories (__main__.TestDefaultAnnotations)
    ----------------------------------------------------------------------
    Traceback (most recent call last):
      File "/Users/xargon/Dropbox/infermatica/code/alchera/altrack/altrack/tests/test_mongodb_default.py", line 152, in test_insert_duplicate_categories
        insert_dataset(db, ds)
      File "/Users/xargon/Dropbox/infermatica/code/alchera/altrack/altrack/data/default.py", line 269, in insert_dataset
        session.commit_transaction()
      File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/client_session.py", line 393, in commit_transaction
        self._finish_transaction_with_retry("commitTransaction")
      File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/client_session.py", line 457, in _finish_transaction_with_retry
        return self._finish_transaction(command_name)
      File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/client_session.py", line 452, in _finish_transaction
        parse_write_concern_error=True)
      File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/database.py", line 514, in _command
        client=self.__client)
      File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/pool.py", line 579, in command
        unacknowledged=unacknowledged)
      File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/network.py", line 150, in command
        parse_write_concern_error=parse_write_concern_error)
      File "/Users/xargon/anaconda/envs/deep/lib/python3.6/site-packages/pymongo/helpers.py", line 155, in _check_command_response
        raise OperationFailure(msg % errmsg, code, response)
    pymongo.errors.OperationFailure: Transaction 1 has been aborted.

The calling code is:

    with db.client.start_session() as session:
        try:
            session.start_transaction()
            do_insert(db, dataset, session)
            session.commit_transaction()
        except Exception as e:
            session.abort_transaction()
            raise

How can I ignore this duplicate key error in a transactional setting? The problem is that even though I ignore the duplicate key exception, the transaction now seems to be in an inconsistent state, so when I commit, it raises the exception shown in the traceback.

My use case is that users may try to insert duplicates, and the database should silently ignore the insert if a record already exists. In other words, the existence of duplicate records should not be a hard failure. Is there a way to do this with the multi-document transaction support in 4.0?

