Importing a large JSON file into MongoDB

Anush Shetty

Feb 26, 2011, 10:08:52 AM
to mongod...@googlegroups.com

I have an 80MB JSON file which I need to insert into MongoDB using pymongo. I tried GridFS, but it seems to be mostly for storing a single large file (at least from what I can tell).

Even mongoimport gave me errors saying that the input is too large.

Can anybody advise me on how to go about it? How does one handle such a case?

Thanks

Scott Hernandez

Feb 26, 2011, 10:16:18 AM
to mongod...@googlegroups.com
There is a limit on the size of a single document. Depending on the
version you are working with, it will be 4MB, 8MB, or 16MB. Basically,
you need to keep each document below the limit for your version;
staying below 4MB is safe on all of them.

How many documents are in your 80MB JSON file? Are they all below 4MB?

Yes, GridFS is for large chunks of data, like files or blobs.
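
A rough way to answer those two questions for a given file is sketched below. It uses only the Python standard library and assumes the file holds either a single JSON value or a top-level array of documents. The function name and the 4MB default are illustrative, and the UTF-8 length of the JSON text is used as a stand-in for the BSON size, which is only an approximation.

```python
import json

# Assumed limit; raise it to 16 * 1024 * 1024 if your server allows 16MB.
LIMIT_BYTES = 4 * 1024 * 1024


def oversized_documents(text, limit=LIMIT_BYTES):
    """Return (document_count, indexes of documents larger than limit)."""
    data = json.loads(text)
    # A top-level array is treated as many documents, anything else as one.
    docs = data if isinstance(data, list) else [data]
    too_big = [
        i for i, doc in enumerate(docs)
        # JSON byte length is only a rough proxy for the real BSON size.
        if len(json.dumps(doc).encode("utf-8")) > limit
    ]
    return len(docs), too_big
```

If this reports one document that is over the limit, the file has to be restructured before it can be imported.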

Anush Shetty

Feb 26, 2011, 10:18:43 AM
to mongod...@googlegroups.com
The problem is that it is a single XML file converted to JSON. What should I do when a single file itself is 80MB?

Andreas Jung

Feb 26, 2011, 10:24:18 PM
to mongod...@googlegroups.com
Anush Shetty wrote:
> The problem is that it is a single XML file converted to JSON. What should I
> do when a single file itself is 80MB?

So you are telling us that your JSON file contains only one document
with 80MB of data? You won't be able to import that because of the
size restrictions mentioned above. I have no idea what your data is
about. If it contains multiple independent subdocuments or something
similar, either perform some transformation first or import the data
yourself using a hand-crafted script.
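
As one illustration of such a transformation, the sketch below assumes the converted file is a single JSON object whose bulk sits in one large array. The key name `records` and the output field `record` are hypothetical; substitute whatever holds the repeated elements in your converted XML. Each array element is written out as its own line, so each one becomes a separate, much smaller document. As far as I know, mongoimport reads one JSON document per line by default, so the output should be importable as long as every record is under the size limit.

```python
import io
import json


def split_records(src, dst, array_key="records"):
    """Write each element of src's large array as one JSON line to dst.

    array_key is a hypothetical name for the key holding the repeated
    elements. Returns the number of lines written.
    """
    root = json.load(src)
    # Copy the small top-level fields onto every record so that shared
    # context (for example a header or file id) survives the split.
    shared = {k: v for k, v in root.items() if k != array_key}
    count = 0
    for record in root[array_key]:
        doc = dict(shared)
        doc["record"] = record
        dst.write(json.dumps(doc) + "\n")
        count += 1
    return count


if __name__ == "__main__":
    # Tiny in-memory stand-in for the real 80MB file.
    src = io.StringIO('{"source": "export.xml", "records": [{"id": 1}, {"id": 2}]}')
    dst = io.StringIO()
    print(split_records(src, dst), "documents written")
    print(dst.getvalue(), end="")
```

With real files, `src` and `dst` would be file objects opened for reading and writing. Note that `json.load` reads the whole input into memory, which should be acceptable for 80MB on most machines.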

-aj

sridhar

Feb 26, 2011, 10:46:39 PM
to mongodb-user
Can you paste a sample portion of your file? You can probably split
your doc into multiple JSON docs, depending on what you want to do with
it. As Scott says, depending on the MongoDB version, the max doc size
is restricted to at most 16MB.
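
When the file is too large to paste, a short structural summary may be enough to decide where to split it. The sketch below is one way to produce that, using only the standard library. The function name, the `$`-style path notation, and the depth default are my own choices, and the sizes are JSON byte counts rather than BSON sizes.

```python
import json


def summarize(doc, max_depth=2, path="$"):
    """Return (path, type name, approx JSON bytes) rows down to max_depth."""
    rows = [(path, type(doc).__name__, len(json.dumps(doc).encode("utf-8")))]
    if max_depth > 0:
        if isinstance(doc, dict):
            for key, value in doc.items():
                rows += summarize(value, max_depth - 1, f"{path}.{key}")
        elif isinstance(doc, list) and doc:
            # Only the first element is inspected, as a sample of the array.
            rows += summarize(doc[0], max_depth - 1, f"{path}[0]")
    return rows
```

Calling `summarize(json.load(f))` on the open file and printing the rows shows which keys hold most of the data; a large array is usually the natural place to split.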

aditya sharma

Jan 8, 2015, 4:11:44 PM
to mongod...@googlegroups.com, li...@zopyx.com
I have a similar problem. We have larger XML files and we wish to save them in MongoDB. The 16MB size limit is restricting us, so we would like to split the converted JSON into smaller chunks while keeping the hierarchy of the XML/JSON document.

Please help us in this regard. We are using .NET for application development.
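
One possible approach is sketched below in Python for brevity; the same idea should carry over to .NET. It is not an established MongoDB feature, just an illustration: walk the converted document, emit any subtree that is small enough as one chunk, and record the path of keys and array indexes from the root so the hierarchy can be rebuilt later. The names `split_by_size` and `reassemble` and the chunk layout are hypothetical, sizes are JSON byte counts rather than BSON sizes, and a single scalar value larger than the limit cannot be split this way.

```python
import json


def size_of(value):
    # Approximate size: UTF-8 length of the JSON text, not the BSON size.
    return len(json.dumps(value).encode("utf-8"))


def split_by_size(value, max_bytes, path=()):
    """Yield {"path": [...], "value": ...} chunks in document order.

    A subtree that fits in max_bytes becomes one chunk; a larger dict or
    list is split into its children. path lists the keys (str) and array
    indexes (int) leading from the root to the chunk.
    """
    is_container = isinstance(value, (dict, list)) and len(value) > 0
    if size_of(value) <= max_bytes or not is_container:
        # Small enough, or a scalar/empty container that cannot be split.
        yield {"path": list(path), "value": value}
        return
    items = value.items() if isinstance(value, dict) else enumerate(value)
    for key, child in items:
        yield from split_by_size(child, max_bytes, path + (key,))


def reassemble(chunks):
    """Rebuild the nested structure; chunks must be in their original order."""
    root = None
    for chunk in chunks:
        path, value = chunk["path"], chunk["value"]
        if not path:
            return value  # the whole document fit into a single chunk
        if root is None:
            root = [] if isinstance(path[0], int) else {}
        node = root
        # Walk (and create) the containers above the chunk. An int key
        # means the next container is a list, a str key means a dict.
        for key, next_key in zip(path, path[1:]):
            child = [] if isinstance(next_key, int) else {}
            if isinstance(node, list):
                if key == len(node):
                    node.append(child)
                node = node[key]
            else:
                node = node.setdefault(key, child)
        if isinstance(node, list):
            node.append(value)
        else:
            node[path[-1]] = value
    return root
```

Each chunk could then be stored as its own MongoDB document, together with a shared identifier for the source file and a sequence number so the original order can be restored when reading the chunks back.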

Thanks,
Aditya