Insert a NumPy rec.array to MongoDB using PyMongo


Femto Trader

Dec 27, 2015, 2:35:30 PM
to mongodb-user
Hello,

In a StackOverflow question, some people are trying to insert a Pandas DataFrame into MongoDB by first converting it to built-in Python structures (`dict`, `list`).


I wonder whether we could instead insert a NumPy `rec.array` (`numpy.recarray`) into MongoDB directly using PyMongo.

That should probably be more efficient, because `to_dict` uses Python for loops.

This question is also asked on


    In [1]: import pandas as pd
    In [2]: import pymongo
    In [3]: client = pymongo.MongoClient()
    In [4]: collection = client['db_name']['collection_name']
    In [5]: df = pd.DataFrame([[1,2,3],[4,5,6]], columns=['a', 'b', 'c'])
    In [6]: df
    Out[6]:
       a  b  c
    0  1  2  3
    1  4  5  6
    In [7]: rec = df.to_records()
    In [8]: rec
    Out[8]:
    rec.array([(0, 1, 2, 3), (1, 4, 5, 6)],
              dtype=[('index', '<i8'), ('a', '<i8'), ('b', '<i8'), ('c', '<i8')])
    In [9]: type(rec)
    Out[9]: numpy.recarray

but I ran into errors on insert:

    In [10]: collection.insert(rec)

raised

    ValueError: no field of name _id

this

    In [11]: collection.insert_many(rec)

raised

    TypeError: documents must be a non-empty list

this

    In [12]: collection.insert_one(rec)

raised

    TypeError: document must be an instance of dict, bson.son.SON, or other type that inherits from collections.MutableMapping


I haven't found any reference to `numpy.recarray` in this group.

Any idea?

Kind regards

Bernie Hackett

Dec 29, 2015, 12:47:22 PM
to mongodb-user
PyMongo's insert methods (`insert`, `insert_one`, `insert_many`) require an instance of a `collections.MutableMapping` subclass (such as a `dict`), or a list of such objects, so a `numpy.recarray` cannot be passed to them directly. You will probably get better performance using Monary.
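
Setting Monary aside, here is a minimal sketch of the plain PyMongo route, assuming it is acceptable to build one `dict` per row before inserting. The helper name `records_to_documents` is hypothetical (it is not part of PyMongo or NumPy), and the `insert_many` call is left commented out because it needs a running MongoDB server and the `collection` object from the question. Going through `tolist()` turns NumPy scalars into native Python values, which matters because PyMongo's BSON encoder does not accept types such as `numpy.int64`.

```python
import numpy as np


def records_to_documents(rec):
    """Convert a NumPy record array into a list of plain dicts.

    Hypothetical helper for illustration only.
    """
    names = rec.dtype.names
    # tolist() yields one tuple per row, holding native Python values
    # (int, float, ...) instead of NumPy scalar types.
    return [dict(zip(names, row)) for row in rec.tolist()]


# Same data as the rec.array shown in the question.
rec = np.rec.array(
    [(0, 1, 2, 3), (1, 4, 5, 6)],
    dtype=[('index', '<i8'), ('a', '<i8'), ('b', '<i8'), ('c', '<i8')],
)

docs = records_to_documents(rec)

# With a live server and the `collection` from the question:
# collection.insert_many(docs)
```

This still loops over the rows in Python, so it addresses the errors above rather than the performance concern.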
