I'm a little surprised to hear that PyMongo would use 30+ GB of RAM to
decode, but mongorestore isn't a very good comparison. mongorestore
reads each document and inserts it into the database one at a time. By
comparison, your Python code reads the entire file into a single byte
string and passes that entire string to decode_all, which then has to
create dictionary objects for every document in the file and return the
whole file as one list of dictionaries. We haven't even gotten to
inserting the documents into MongoDB yet. That approach is never going
to use memory efficiently.
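
If you want to decode incrementally, one option is to take advantage of
the fact that every BSON document starts with a 4-byte little-endian
int32 giving its total length (including the prefix itself). Here's a
rough sketch along those lines (untested against your file; it assumes
"bigBson.bson" contains plain BSON documents back to back):

import struct
import bson

def iter_bson(path):
    # Yield documents one at a time instead of building one huge list.
    with open(path, 'rb') as f:
        while True:
            prefix = f.read(4)
            if len(prefix) < 4:
                break  # end of file
            length = struct.unpack('<i', prefix)[0]
            doc_bytes = prefix + f.read(length - 4)
            # decode_all on a single document returns a one-element list
            yield bson.decode_all(doc_bytes)[0]

for doc in iter_bson('bigBson.bson'):
    pass  # process or insert each document here

That keeps only one document in memory at a time. Newer PyMongo
releases also ship bson.decode_file_iter, which does essentially the
same thing; if your installed version has it, use that instead of
rolling your own loop.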
On Wed, Oct 17, 2012 at 6:28 PM, Matthias Lee <matthia...@gmail.com> wrote:
> Hello there,
>
> I've been using PyMongo for a while and have read a few smaller BSON files,
> but today I was trying to convert a large BSON file to JSON (it contains no
> binary data).
> Every way I tried reading and decoding resulted in me maxing out my RAM at
> 32GB.
>
> Is there a more efficient way of reading/decoding BSON than this:
> import bson
> f = open("bigBson.bson", 'rb')
> result = bson.decode_all(f.read())
>
> perhaps it can be decoded incrementally?
>
> In comparison, using mongorestore to load the same file barely increased my
> memory usage.
>
> Thanks,
>
> Matthias
>