Issue 226 in couchdb-python: Provide ability to do bulk dump and load


couchdb...@googlecode.com

Jun 14, 2013, 11:08:11 AM
to couchdb...@googlegroups.com
Status: New
Owner: ----
Labels: Type-Defect Priority-Medium

New issue 226 by AntonBak...@gmail.com: Provide ability to do bulk dump and
load
http://code.google.com/p/couchdb-python/issues/detail?id=226

Currently the load.py and dump.py utilities load and dump documents one
by one, which is tremendously slow.

Introducing bulk loading and dumping would speed things up considerably.

Maybe we can add an option like "--bulk-size" with a default value of 1
(load/dump documents one by one, just as happens now) to give users some
control over tuning the utilities.
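
A minimal sketch of what such an option might look like, not the actual
patch. The names parse_args(), batches() and load_docs() are hypothetical;
the only couchdb-python call assumed is Database.update(), which writes a
list of documents in a single bulk request.

```python
import argparse
from itertools import islice


def parse_args(argv):
    """Parse the proposed option; 1 keeps the current one-by-one behavior."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--bulk-size", type=int, default=1)
    return parser.parse_args(argv)


def batches(docs, bulk_size=1):
    """Yield lists of at most bulk_size documents from any iterable."""
    if bulk_size < 1:
        raise ValueError("bulk_size must be at least 1")
    it = iter(docs)
    # iter(callable, sentinel) stops at the first empty chunk.
    for chunk in iter(lambda: list(islice(it, bulk_size)), []):
        yield chunk


def load_docs(db, docs, bulk_size=1):
    """Write docs to db one batch at a time and collect the results.

    db is assumed to behave like couchdb.Database: update() takes a list
    of documents and returns one result tuple per document.
    """
    results = []
    for chunk in batches(docs, bulk_size):
        results.extend(db.update(chunk))
    return results
```

With --bulk-size 100, loading 1,000 documents would take 10 requests
instead of 1,000.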

--
You received this message because this project is configured to send all
issue notifications to this address.
You may adjust your notification preferences at:
https://code.google.com/hosting/settings

couchdb...@googlecode.com

Jun 14, 2013, 11:10:41 AM
to couchdb...@googlegroups.com

Comment #1 on issue 226 by Pavel.Ts...@gmail.com: Provide ability to do
I'm working on an initial implementation and will provide patches later.

couchdb...@googlecode.com

Jun 17, 2013, 8:10:32 AM
to couchdb...@googlegroups.com

couchdb...@googlecode.com

Jun 17, 2013, 8:44:34 AM
to couchdb...@googlegroups.com

Comment #3 on issue 226 by djc.ochtman: Provide ability to do bulk dump and
load
http://code.google.com/p/couchdb-python/issues/detail?id=226

Good stuff! For inclusion into CouchDB-Python, I have a number of requests:

- Please remove the change in .hgignore, as it isn't needed anymore
- Please see if you can add a test for the new behavior
- It would be great if you can split this into two patches: one that
abstracts writing into a separate function, and another one that actually
does the bulk requests/writes -- this makes it easier to review the changes
now and in the future

couchdb...@googlecode.com

Jun 17, 2013, 1:23:32 PM
to couchdb...@googlegroups.com

couchdb...@googlecode.com

Jun 18, 2013, 4:00:19 AM
to couchdb...@googlegroups.com

Comment #5 on issue 226 by djc.ochtman: Provide ability to do bulk dump and
load
http://code.google.com/p/couchdb-python/issues/detail?id=226

I've pushed modified versions; for r6f91fa675423, I:

- Renamed function from write_dump() to dump_doc()
- Moved dump_doc() outside dump_db(), added envelope argument
- Rewrote commit message to clarify

In re8cafe210d91, I:

- Made sure lines didn't get longer than 80 chars
- Tightened up the loop code (while True, if condition: break is a little
silly)
- Rewrote commit message to clarify

Could you redo your bulk loading along these lines? Your patch also
introduces a bug in error handling: db.update() doesn't raise exceptions
the way db.__setitem__() does. Also, your test case references a test data
file that isn't included in the patch.
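
To illustrate the error-handling point: as far as I know, Database.update()
returns one (success, docid, rev_or_exc) tuple per document rather than
raising, so a try/except around the call will not catch per-document
failures such as conflicts. A rough sketch of checking the results instead
(failed_updates() is a hypothetical helper, not part of couchdb-python):

```python
def failed_updates(results):
    """Return (docid, error) pairs for documents a bulk update rejected.

    results is assumed to be the list returned by Database.update(): one
    (success, docid, rev_or_exc) tuple per document, where the third item
    is the new revision on success or an exception instance on failure.
    """
    return [(docid, err) for success, docid, err in results if not success]
```

A loader could then report each returned pair to stderr, much as the
one-by-one code reports an exception raised by db[docid] = doc.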

couchdb...@googlecode.com

Jun 18, 2013, 11:33:04 AM
to couchdb...@googlegroups.com

Comment #6 on issue 226 by Pavel.Ts...@gmail.com: Provide ability to do
I hope I have understood your recommendations about code design
correctly. I pushed the result to
https://code.google.com/r/paveltsipinio-bulk-dumping/source/detail?r=46b5043fe465274850c4a821e468ca9ca90b70e0&name=bulk_dumping

I did not understand what you meant about the test data file. I don't have
any test data files.

couchdb...@googlecode.com

Jul 15, 2014, 3:50:03 AM
to couchdb...@googlegroups.com

Comment #7 on issue 226 by djc.ochtman: Provide ability to do bulk dump and
load
http://code.google.com/p/couchdb-python/issues/detail?id=226

This issue has been migrated to GitHub. Please continue discussion here:

https://github.com/djc/couchdb-python/issues/226