Message from discussion
update() and new _bulk_docs - bugs and API comments
Received: by 10.103.198.15 with SMTP id a15mr11011muq.25.1237373112835;
Wed, 18 Mar 2009 03:45:12 -0700 (PDT)
Return-Path: <matt.good...@gmail.com>
Received: from mail-bw0-f168.google.com (mail-bw0-f168.google.com [209.85.218.168])
by gmr-mx.google.com with ESMTP id 5si61135fge.7.2009.03.18.03.45.11;
Wed, 18 Mar 2009 03:45:11 -0700 (PDT)
Received-SPF: pass (google.com: domain of matt.good...@gmail.com designates 209.85.218.168 as permitted sender) client-ip=209.85.218.168;
Authentication-Results: gmr-mx.google.com; spf=pass (google.com: domain of matt.good...@gmail.com designates 209.85.218.168 as permitted sender) smtp.mail=matt.good...@gmail.com; dkim=pass (test mode) header...@gmail.com
Received: by bwz12 with SMTP id 12so433267bwz.21
for <couchdb-python@googlegroups.com>; Wed, 18 Mar 2009 03:45:11 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=gmail.com; s=gamma;
h=domainkey-signature:mime-version:received:date:message-id:subject
:from:to:content-type:content-transfer-encoding;
bh=OuOJ6wzZXztd5+m3ORbt/rbilZj27FrNWFdlIZUVQS4=;
b=Rit20DgEu9kOCVsKVm3iFQ0C13HWr3kv736rqfwtV426QnvsxnZnWIqDN+9qKrbusa
83U3vbIwh4oSgwzxxgAuxT4S2MjiyT1fKnn2lG7L0XinY3k9tRyPQb/mzGKnaGJ+5JT9
dePU6iKKF5QIQVJdlQ2owIM2eAY7ihnRuazNI=
DomainKey-Signature: a=rsa-sha1; c=nofws;
d=gmail.com; s=gamma;
h=mime-version:date:message-id:subject:from:to:content-type
:content-transfer-encoding;
b=QDlBOv8dVajR0vg8gWKBY7kuHyT4274HnuetIeebsd/f89wZDxBPvX8TlTMWlDEdap
dNXxJBGf62L8LmKnqGaXu1P87AoQBzBMytEDqoAvzUeI6WGxhNUuTE6T7M06CLkTvhhc
1vhFQ8WpIW0NlX+KXTNMySxzPjiGlfBSEoSBQ=
MIME-Version: 1.0
Received: by 10.204.56.13 with SMTP id w13mr368613bkg.131.1237372710211; Wed,
18 Mar 2009 03:38:30 -0700 (PDT)
Date: Wed, 18 Mar 2009 10:38:30 +0000
Message-ID: <214c385b0903180338n779d618dl43ee7f540e839a9e@mail.gmail.com>
Subject: update() and new _bulk_docs - bugs and API comments
From: Matt Goodall <matt.good...@gmail.com>
To: couchdb-python@googlegroups.com
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Hi,
I was looking through the recent change to match the new _bulk_docs
behaviour and immediately noticed it does not expect any of the docs
to fail to update - there is no 'rev' key in a failing document's
result dict.
The use of a generator to process the _bulk_docs response actually
masks the above problem.
>>> doc_id = db.create({})
>>> doc = db.get(doc_id)
>>> db.update([doc])
<generator object at 0x8a15dac>
>>> db.update([doc])
<generator object at 0x8a15fcc>
>>> doc_id = db.create({})
>>> doc = db.get(doc_id)
>>> db.update([doc])
<generator object at 0x8a15e4c>
>>> list(db.update([doc]))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/matt/src/ext/couchdb-python/couchdb/client.py", line
544, in _update
doc.update({'_id': result['id'], '_rev': result['rev']})
KeyError: 'rev'
However, I think the generator gives update() a "surprising" API, and
has been doing so for some time. Before the _bulk_docs change the
updated documents' revs were not updated; now it raises an exception
(because of the bug ;-)).
Regardless of the above I don't think Database.update() should really
try to process the _bulk_docs response much anymore. I think it has to
be left to the application now. In other words, update() should simply
return a nicer form of the _bulk_docs response and leave the rest to
the application.
Perhaps the simplest version of my proposal would be:
def update(self, documents):
docs = []
for doc in documents:
if isinstance(doc, dict):
docs.append(doc)
elif hasattr(doc, 'items'):
docs.append(dict(doc.items()))
else:
raise TypeError('expected dict, got %s' % type(doc))
resp, data = self.resource.post('_bulk_docs', content={'docs': docs})
return data
A slightly better version, that makes the possibility/likelihood of
failure more explicit by yielding (status, result) tuples:
def update(self, documents):
docs = []
for doc in documents:
if isinstance(doc, dict):
docs.append(doc)
elif hasattr(doc, 'items'):
docs.append(dict(doc.items()))
else:
raise TypeError('expected dict, got %s' % type(doc))
resp, data = self.resource.post('_bulk_docs', content={'docs': docs})
def _gen_result():
for result in data:
yield 'error' not in result, result
return _gen_result()
And finally, a version that yields (status, id, rev or
ResourceConflict) tuples. The ResourceConflict makes for improved
consistency with Database.__setitem__:
def update(self, documents):
docs = []
for doc in documents:
if isinstance(doc, dict):
docs.append(doc)
elif hasattr(doc, 'items'):
docs.append(dict(doc.items()))
else:
raise TypeError('expected dict, got %s' % type(doc))
resp, data = self.resource.post('_bulk_docs', content={'docs': docs})
def _gen_result():
for result in data:
if result.get('error') == 'conflict':
yield False, result['id'],
ResourceConflict('conflict', result['reason'])
else:
yield True, result['id'], result['rev']
return _gen_result()
Note that update does not even update the revs of successfully updated
documents anymore. I think that's important given the new semantics of
_bulk_docs but it also makes the create() and update() API a little
more consistent at the same time. create() does not update the rev,
update() does. I always found that a little weird ;-).
Now, I know that this completely changes the update API ... but I
think we can blame CouchDB for that ;-).
Hope that all makes sense. cmlenz, if you want me to implement one of
the above (I favour the last version) with tests and send you a patch
then please ask.
- Matt