Issue 162 in couchdb-python: Upon querying a view, couchdb-python reads the entire result, ...

couchdb...@googlecode.com

Jan 23, 2011, 6:16:43 PM
to couchdb...@googlegroups.com
Status: New
Owner: ----
Labels: Type-Defect Priority-Medium

New issue 162 by andreas.kloeckner: Upon querying a view, couchdb-python
reads the entire result, ...
http://code.google.com/p/couchdb-python/issues/detail?id=162

...then parses it, then returns it to the user. All the while, both the raw
and the parsed representations are kept in memory. How about some love for
people whose data is too big to fit in main memory? Even once?

:(

couchdb...@googlecode.com

Jan 23, 2011, 7:27:18 PM
to couchdb...@googlegroups.com

Comment #1 on issue 162 by randall....@gmail.com: Upon querying a view,
couchdb-python reads the entire result, ...
http://code.google.com/p/couchdb-python/issues/detail?id=162

It's a great idea to support this. However, it's not a straightforward
issue. A streaming JSON parser is required in order to deliver an iterable
stream of rows without holding the whole response in memory. Solving this
problem probably requires using YAJL in combination with a Python binding
like ijson[1]. A parser like ijson has a very different API from simplejson
or the standard library parser, meaning the code paths would have to diverge
and larger pieces of CouchDB-Python would have to be rewritten to handle it.
I also suspect that making such a parser a strict requirement would be
untenable. The best approach would perhaps be to use the http and client
modules of CouchDB-Python directly, subclassing Resource, Server, Database,
etc. I don't see an immediately straightforward way to just graft a
streaming parser into the system without lots of new code.

[1] https://github.com/kennethreitz/ijson#readme
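
To illustrate what "an iterable stream of rows" means in the comment above,
here is a minimal sketch. It is not code from this thread and not part of
CouchDB-Python. It stands in for a real streaming parser such as ijson by
using only the standard library's json.JSONDecoder.raw_decode, and it
assumes the response is a text stream whose top-level object contains a
"rows" array of JSON objects. The names iter_rows and chunk_size are
hypothetical.

```python
import io
import json
import re

# Assumption: the first '"rows": [' in the response opens the row array.
_ROWS_START = re.compile(r'"rows"\s*:\s*\[')


def iter_rows(stream, chunk_size=4096):
    """Yield view rows one at a time from a text stream.

    Only the unparsed tail of the response is buffered, never the
    whole body or the whole parsed result.
    """
    decoder = json.JSONDecoder()
    buf = ""

    # Read until the start of the "rows" array has been seen.
    while True:
        match = _ROWS_START.search(buf)
        if match:
            buf = buf[match.end():]
            break
        chunk = stream.read(chunk_size)
        if not chunk:
            return  # no "rows" array in the response
        buf += chunk

    while True:
        # Skip separators between rows.
        buf = buf.lstrip(" \t\r\n,")
        if buf.startswith("]"):
            return  # end of the array
        try:
            row, end = decoder.raw_decode(buf)
        except ValueError:
            # The buffered row is incomplete: read more and retry.
            chunk = stream.read(chunk_size)
            if not chunk:
                raise ValueError("response ended in the middle of a row")
            buf += chunk
            continue
        yield row
        buf = buf[end:]


# Usage with an in-memory stand-in for an HTTP response body.
response = io.StringIO(
    '{"total_rows":2,"offset":0,"rows":['
    '{"id":"a","key":1,"value":null},'
    '{"id":"b","key":2,"value":null}]}'
)
ids = [row["id"] for row in iter_rows(response, chunk_size=16)]
```

Retrying raw_decode on a growing buffer re-parses a partial row each time
more data arrives, so this is only reasonable when individual rows are
small. An event-based parser avoids that cost, which is part of why the
comment points to YAJL and ijson.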

couchdb...@googlecode.com

Jan 24, 2011, 4:32:55 AM
to couchdb...@googlegroups.com

Comment #2 on issue 162 by anandol...@gmail.com: Upon querying a view,
couchdb-python reads the entire result, ...
http://code.google.com/p/couchdb-python/issues/detail?id=162

I wrote some code to support iterating over rows.

https://github.com/openlibrary/openlibrary/blob/master/openlibrary/core/couch.py


couchdb...@googlecode.com

Feb 25, 2012, 4:14:07 PM
to couchdb...@googlegroups.com

Comment #3 on issue 162 by kxepal: Upon querying a view, couchdb-python

@Matt implemented iterative views a long time ago:
http://code.google.com/r/mattgoodall-couchdb-python-iterview/source/browse/couchdb/client.py#829

After some time in production I can say that they work perfectly. The
only thing left is to use them by default for db iteration and ViewField
with some constant batch size: a value that is too small, such as 100,
produces too many requests, while 10K is close to optimal and gentle on both
memory and request count. It should still be tweakable.

Why not bring this feature into the mainline?
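
The trade-off described above (batch size against request count) is easier
to see with a sketch of batched view iteration. This shows the general
paging technique, not the code from Matt's branch: each request asks for
batch + 1 rows, and the extra row supplies the startkey and startkey_docid
for the next request. The fetch callable and the in-memory fake_fetch are
hypothetical stand-ins for a real view query.

```python
def iter_view(fetch, batch, **options):
    """Yield every row of a view, holding at most batch + 1 rows at once.

    fetch(**options) is assumed to return a list of row dicts, each
    with "id" and "key", honouring limit, startkey and startkey_docid.
    """
    if batch <= 0:
        raise ValueError("batch must be a positive integer")
    options = dict(options, limit=batch + 1)
    while True:
        rows = fetch(**options)
        # Hand out at most `batch` rows from this request.
        for row in rows[:batch]:
            yield row
        if len(rows) <= batch:
            break  # no extra row, so this was the last page
        # The extra row marks where the next page starts.
        options["startkey"] = rows[batch]["key"]
        options["startkey_docid"] = rows[batch]["id"]


# Hypothetical in-memory view, sorted by (key, id), with duplicate keys.
ALL_ROWS = [{"id": "d%d" % i, "key": i // 2} for i in range(7)]
requests = []  # records the starting offset of each simulated request


def fake_fetch(limit, startkey=None, startkey_docid=None):
    start = 0
    if startkey is not None:
        start = next(
            i for i, r in enumerate(ALL_ROWS)
            if (r["key"], r["id"]) >= (startkey, startkey_docid)
        )
    requests.append(start)
    return ALL_ROWS[start:start + limit]


rows = list(iter_view(fake_fetch, batch=3))
```

With 7 rows and batch=3 the sketch issues three requests. A larger batch
means fewer round trips but more rows held in memory per request, which is
why the batch size should stay tweakable.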

couchdb...@googlecode.com

Apr 24, 2013, 1:54:20 PM
to couchdb...@googlegroups.com

Comment #4 on issue 162 by kxepal: Upon querying a view, couchdb-python
Added Matt's iterview feature as a folded patch.

Attachments:
iterviews.patch 7.7 KB

couchdb...@googlecode.com

Apr 25, 2013, 6:10:59 AM
to couchdb...@googlegroups.com
Updates:
Status: Fixed

Comment #5 on issue 162 by djc.ochtman: Upon querying a view,
couchdb-python reads the entire result, ...
http://code.google.com/p/couchdb-python/issues/detail?id=162

Nice one. I did a slightly less squashed version as r4f4166f23558 and
follow-ups, and pushed that into the repository. This way, we get to see
some of the progress (though not some of the more trivial changes made
along the way).