pymongo.errors.BulkWriteError: batch op errors occurred (MongoDB 3.4.2, pymongo 3.4.0, Python 2.7.13, Ubuntu 16.04)


Haitao Cai

Mar 28, 2017, 3:31:35 PM
to mongodb-user
Hi all,

I am migrating several hundred million tweets from text files to MongoDB, with one collection per user holding that user's tweets. The documents do not contain an '_id' key before migration. I use 'insert_many()' to insert each user's documents into their collection, but it often fails with a BulkWriteError.
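Roughly, the insert pattern looks like this (a simplified sketch with illustrative names; the real pipeline also sorts each batch before inserting):

    import json
    from pymongo import MongoClient

    client = MongoClient()
    timeline_db = client['timelines']  # illustrative database name

    def migrate_file(path, user_id):
        # One JSON-encoded tweet per line; no tweet carries an '_id' key,
        # so MongoDB assigns an ObjectId to each document on insert.
        with open(path) as f:
            statuses = [json.loads(line) for line in f]
        # One collection per user; the whole batch goes in a single call.
        timeline_db[user_id].insert_many(statuses)

The traceback: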

Traceback (most recent call last):
  File "pipeline.py", line 105, in <module>
    timeline_db, meta_db, negative_db, log_col, dir_path)
  File "/media/haitao/Storage/twitter_pipeline/migrate_old.py", line 134, in migrate_dir
    timeline_db[user_id].insert_many(utility.temporal_sort(statuses))
  File "/home/haitao/anaconda3/envs/py27/lib/python2.7/site-packages/pymongo/collection.py", line 711, in insert_many
    blk.execute(self.write_concern.document)
  File "/home/haitao/anaconda3/envs/py27/lib/python2.7/site-packages/pymongo/bulk.py", line 493, in execute
    return self.execute_command(sock_info, generator, write_concern)
  File "/home/haitao/anaconda3/envs/py27/lib/python2.7/site-packages/pymongo/bulk.py", line 331, in execute_command
    raise BulkWriteError(full_result)
pymongo.errors.BulkWriteError: batch op errors occurred

I tried 'insert_one()' instead, but it runs into the open file limit every few hours, even though I have raised ulimit -n to 65535 (both hard and soft).

Is there a solution to this issue? Thanks in advance!

Bernie Hackett

Mar 29, 2017, 12:33:07 PM
to mongodb-user
The BulkWriteError exception has a "details" attribute:

    >>> import pymongo
    >>> from pymongo.errors import BulkWriteError
    >>> c = pymongo.MongoClient()
    >>> try:
    ...     c.foo.bar.insert_many([{'_id': 1}, {'_id': 1}])
    ... except BulkWriteError as exc:
    ...     exc.details
    ... 
    {'nModified': 0, 'nUpserted': 0, 'nMatched': 0, 'writeErrors': [{u'index': 1, u'code': 11000, u'errmsg': u'E11000 duplicate key error collection: foo.bar index: _id_ dup key: { : 1 }', u'op': {'_id': 1}}], 'upserted': [], 'writeConcernErrors': [], 'nRemoved': 0, 'nInserted': 1}

The writeErrors (or possibly writeConcernErrors) field will tell you what went wrong. You can also look at the mongod log file to find the error.
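For example, to print just the code and message of each write error (a sketch continuing the session above):

    >>> try:
    ...     c.foo.bar.insert_many([{'_id': 1}, {'_id': 1}])
    ... except BulkWriteError as exc:
    ...     for err in exc.details['writeErrors']:
    ...         print('%s: %s' % (err['code'], err['errmsg']))
    ... 
    11000: E11000 duplicate key error collection: foo.bar index: _id_ dup key: { : 1 }

Code 11000 is a duplicate key error, which in this session comes from inserting the same '_id' twice.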

> I tried 'insert_one()' instead, but it runs into the open file limit every few hours, even though I have raised ulimit -n to 65535 (both hard and soft).

Open file limit on the application or the server?
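If it's the server, one quick check is the serverStatus command; its 'connections' section reports how many connections mongod currently has open and how many more it will accept, and each open connection consumes a file descriptor on the server (continuing the session above):

    >>> # 'current' and 'available' show open vs. remaining connections.
    >>> c.admin.command('serverStatus')['connections']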