Bulk update and returning exceptions


Mandel

Jan 22, 2010, 5:37:53 AM
to CouchDB-Python
Hi *,

I recently committed a patch to desktopcouch in which I was using the
bulk update (db.update) method to allow a number of documents to be
updated at once. After I committed, the following was brought to my
attention:

Returning exception instances is not pythonic.

Which is actually true. One of the people (aquarius) involved in the
conversation (conversation url: https://code.launchpad.net/~mandel/desktopcouch/batch_update/+merge/17482)
went to #python on irc and had a long chat regarding the matter
(please read it at https://code.launchpad.net/~mandel/desktopcouch/batch_update/+merge/17482).

Since the patch uses the update method from python-couchdb, I was
wondering what the people on this list think
and what the best solution would be.

Kr,

Mandel

Dirkjan Ochtman

Jan 22, 2010, 6:04:43 AM
to couchdb...@googlegroups.com
Hi Mandel,

On Fri, Jan 22, 2010 at 11:37, Mandel <eti...@gmail.com> wrote:
> Returning exception instances is not pythonic.

By whose standards? Just because Stuart says so?

> Which is actually true. One of the people (aquarius) involved in the
> conversation (conversation url: https://code.launchpad.net/~mandel/desktopcouch/batch_update/+merge/17482)
> went to #python on irc and had a long chat regarding the matter
> (please read it at https://code.launchpad.net/~mandel/desktopcouch/batch_update/+merge/17482).

AFAICS, the point is that we have this list of documents, and we have
to fine-grainedly report if the update succeeded or not. My first idea
would be to just return a list of booleans indicating success or
failure. Throwing an Exception isn't really an option, since it's not
fine-grained (once thrown, the rest of the results aren't checked,
or the results can't be inspected by the caller). Now, here we have the
problem that we can also detect several *kinds* of problems per
failure, so you have to distinguish those somehow. You'd have to use
some sentinel values (constants), so why not use Exception instances?

I actually think it's a rather elegant solution, and certainly not unpythonic.
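
A minimal sketch of the return shape being discussed, not the python-couchdb implementation: `fake_bulk_update`, `Conflict` and `Forbidden` are hypothetical stand-ins, and the "database" is just a dict. Each input document yields one `(success, doc_id, rev_or_exc)` tuple, with an exception instance in the last slot on failure.

```python
class Conflict(Exception):
    """Hypothetical stand-in for a revision conflict."""


class Forbidden(Exception):
    """Hypothetical stand-in for a document rejected by validation."""


def fake_bulk_update(store, docs):
    """Apply each doc independently and report one result tuple per doc.

    `store` maps doc id -> (rev, doc). Nothing is raised: a failure is
    returned as an exception instance so the other results stay visible.
    """
    results = []
    for doc in docs:
        doc_id = doc["_id"]
        current_rev = store[doc_id][0] if doc_id in store else None
        if "author" not in doc:
            exc = Forbidden("new docs must contain an 'author'")
            results.append((False, doc_id, exc))
        elif doc.get("_rev") != current_rev:
            results.append((False, doc_id, Conflict("document update conflict")))
        else:
            new_rev = (current_rev or 0) + 1
            store[doc_id] = (new_rev, dict(doc, _rev=new_rev))
            results.append((True, doc_id, new_rev))
    return results


store = {"a": (1, {"_id": "a", "_rev": 1, "author": "x"})}
results = fake_bulk_update(store, [
    {"_id": "a", "_rev": 1, "author": "x"},  # matches current rev: succeeds
    {"_id": "a", "_rev": 1, "author": "x"},  # rev is now stale: Conflict
    {"_id": "b"},                            # no 'author': Forbidden
])
for success, doc_id, rev_or_exc in results:
    print(success, doc_id, repr(rev_or_exc))
```

The caller sees every outcome, in input order, and can tell the kinds of failure apart with `isinstance`.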

Cheers,

Dirkjan

Matt Goodall

Jan 22, 2010, 7:06:59 AM
to couchdb...@googlegroups.com
Wow, there's a lot in there to try to digest quickly!

Disclaimer: the update_docs return API is largely down to me. See http://groups.google.com/group/couchdb-python/browse_thread/thread/df58036112c90040/e49d948a152c839b?lnk=gst&q=deferredlist#e49d948a152c839b for some discussion. I don't especially like it either but it seemed like the best of the bunch at the time.

I stole the idea from Twisted's DeferredList: http://twistedmatrix.com/trac/browser/trunk/twisted/internet/defer.py#L466. In many ways it's a similar problem to DeferredList - a bunch of unrelated operations that may each succeed or fail in isolation. It's not nice but that's how CouchDB works and there's not a whole lot we can do about that.

A couple of quick comments on bits of that discussion, to get them out of the way:

1. The results probably do need to be ordered. For instance, it should be possible to delete a doc and add a doc with the same id in the same bulk update as long as the new version appears later in the list. Either or both could result in an error. (Delete and create in a bulk update doesn't actually work but I consider that a bug in CouchDB ;-)).

2. twisted.python.failure.Failure is something completely different. It's for bundling everything about an exception (value, type and traceback) so it can be rethrown in the caller's callback chain. AFAIK, it has absolutely nothing to do with handling multiple exceptions.


As Dirkjan has already commented, we need to be able to differentiate the type of error so simply returning a boolean for each doc isn't enough. A sentinel could be used in place of an exception for that. In fact, the exception class, as opposed to an instance of one, would make a perfectly good sentinel to avoid duplication.

However, sentinels don't work here as they have no contextual information; they're constants. Each error represents a distinct failure with its own useful information. For instance, an exception raised by a validate_doc_update func will typically provide a meaningful message, e.g. "new docs must contain an 'author'".
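
To illustrate the difference, a sketch with a hypothetical `Forbidden` class (not taken from python-couchdb): the class alone can tell you which kind of failure occurred, but only an instance can carry the per-document message. The second message below is made up for the example.

```python
class Forbidden(Exception):
    """Hypothetical error type for a document rejected by validation."""


# The class itself used as a sentinel: identifies the kind of failure only.
sentinel_results = [
    (False, "doc-1", Forbidden),
    (False, "doc-2", Forbidden),
]

# Instances: same kind of failure, but each one keeps its own context.
instance_results = [
    (False, "doc-1", Forbidden("new docs must contain an 'author'")),
    (False, "doc-2", Forbidden("'title' must be a string")),  # invented message
]

for _, doc_id, exc_class in sentinel_results:
    print(doc_id, exc_class.__name__)            # kind only, no message

for _, doc_id, exc in instance_results:
    print(doc_id, type(exc).__name__, str(exc))  # kind plus message

messages = [str(exc) for _, _, exc in instance_results]
```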


The other main idea, a stateful exception that can be used to continue a bulk_update, is surely just as unpythonic and, more importantly, changes the semantics of a bulk update.


So, after reading the IRC discussion I still don't see anything that could greatly improve the API.

Returning a Result object that is iterable and has a __nonzero__ to quickly test for the presence of an error would help a bit but not change things dramatically.
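
A rough sketch of what such a Result object might look like. `BulkResult` is a hypothetical name, not something python-couchdb provides, and the revision strings are placeholders. Truthiness here is assumed to mean "no errors", which is only one possible reading of the idea; `__nonzero__` is aliased to `__bool__` so the class behaves the same on Python 2 and 3.

```python
class BulkResult(object):
    """Hypothetical wrapper around (success, doc_id, rev_or_exc) rows."""

    def __init__(self, rows):
        self._rows = list(rows)

    def __iter__(self):
        return iter(self._rows)

    def __len__(self):
        return len(self._rows)

    def __bool__(self):
        # Assumption: truthy means every update succeeded.
        return all(success for success, _, _ in self._rows)

    __nonzero__ = __bool__  # Python 2 spelling of the same hook

    def errors(self):
        """Return only the failed rows, in their original order."""
        return [row for row in self._rows if not row[0]]


result = BulkResult([
    (True, "a", "2-abc"),
    (False, "b", ValueError("document update conflict")),
])

if not result:
    for _, doc_id, exc in result.errors():
        print("failed: %s (%s)" % (doc_id, exc))
```

Iterating still yields the per-document tuples, so existing callers would not need to change; the truth test is only a shortcut.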

I think raising an exception if any errors are present was also mentioned somewhere in the discussion (can't find it now). That seems horribly unpythonic to me.


Enough, I'm hitting send now ;-).


- Matt


2010/1/22 Mandel <eti...@gmail.com>

--
You received this message because you are subscribed to the Google Groups "CouchDB-Python" group.
To post to this group, send email to couchdb...@googlegroups.com.
To unsubscribe from this group, send email to couchdb-pytho...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/couchdb-python?hl=en.


Manuel de la Peña

Jan 22, 2010, 7:56:47 AM
to couchdb...@googlegroups.com


2010/1/22 Dirkjan Ochtman <dir...@ochtman.nl>

> Hi Mandel,
>
> On Fri, Jan 22, 2010 at 11:37, Mandel <eti...@gmail.com> wrote:
> > Returning exception instances is not pythonic.
>
> By whose standards? Just because Stuart says so?


No, actually I do not know Stuart's reasoning, but here is mine. If you get more than one exception at a time, what are you expected to do? Are there exceptions that should be handled before or after others have been treated? We have moved from an execution in which we just get a single exception and handle it to a case where we have several. Do we raise the first exception, the last one, or maybe one in the middle?

What I'm trying to say is that in the normal case you only get a single exception and therefore you know how to handle it (if it is expected, of course), while when we get several, the way to behave is not that clear.

> > Which is actually true. One of the people (aquarius) involved in the
> > conversation (conversation url: https://code.launchpad.net/~mandel/desktopcouch/batch_update/+merge/17482)
> > went to #python on irc and had a long chat regarding the matter
> > (please read it at https://code.launchpad.net/~mandel/desktopcouch/batch_update/+merge/17482).
>
> AFAICS, the point is that we have this list of documents, and we have
> to fine-grainedly report if the update succeeded or not. My first idea
> would be to just return a list of booleans indicating success or
> failure. Throwing an Exception isn't really an option, since it's not
> fine-grained (once thrown, the rest of the results aren't checked,
> or the results can't be inspected by the caller). Now, here we have the
> problem that we can also detect several *kinds* of problems per
> failure, so you have to distinguish those somehow. You'd have to use
> some sentinel values (constants), so why not use Exception instances?

I agree with it, but that does not mean there is not a better way.

> I actually think it's a rather elegant solution, and certainly not unpythonic.
>
> Cheers,
>
> Dirkjan


Kr,

Mandel

Dirkjan Ochtman

Jan 22, 2010, 8:07:59 AM
to couchdb...@googlegroups.com
2010/1/22 Manuel de la Peña <eti...@gmail.com>:

> No, actually I do not know Stuart's reasoning, but here is mine. If you
> get more than one exception at a time, what are you expected to do? Are
> there exceptions that should be handled before or after others have been
> treated? We have moved from an execution in which we just get a single
> exception and handle it to a case where we have several. Do we raise the
> first exception, the last one, or maybe one in the middle?
> What I'm trying to say is that in the normal case you only get a single
> exception and therefore you know how to handle it (if it is expected, of
> course), while when we get several, the way to behave is not that clear.

Well, the handling of exceptions is clearly application-defined. Here,
too, how to handle multiple exceptions should be left to the
application; someone will have to make a decision on what to do with
conflicts, for example. I don't think the multiple-exception case is
really all that different from the single-exception case, though; if
you know what to do when you get a conflict from a single document
update, you just have to multiply that action by the number of
conflicting documents for the multiple document update case.
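
As a sketch of that argument: if the application already has a policy for one conflict, the bulk case is a loop over the result tuples. `Conflict`, `resolve_conflict`, `handle_bulk_results` and the sample results are all hypothetical, standing in for whatever the application actually does.

```python
class Conflict(Exception):
    """Hypothetical stand-in for a per-document conflict error."""


def resolve_conflict(doc_id):
    """Hypothetical single-document policy: here, just queue a retry."""
    return ("retry", doc_id)


def handle_bulk_results(results):
    """Apply the single-document policy once per conflicting document."""
    actions = []
    for success, doc_id, rev_or_exc in results:
        if success:
            continue
        if isinstance(rev_or_exc, Conflict):
            actions.append(resolve_conflict(doc_id))
        else:
            # Other failure kinds are surfaced rather than silently dropped.
            actions.append(("report", doc_id))
    return actions


results = [
    (True, "a", "2-abc"),  # placeholder revision string
    (False, "b", Conflict("document update conflict")),
    (False, "c", Conflict("document update conflict")),
    (False, "d", ValueError("new docs must contain an 'author'")),
]
actions = handle_bulk_results(results)
print(actions)
```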

> I agree with it, but that does not mean there is not a better way.

Maybe not, but I haven't heard a better proposal yet. :)

Cheers,

Dirkjan
