Storing arbitrary GObjects from Python / Fine-grained Synchronization

Joachim

Aug 2, 2010, 5:58:00 PM
to Desktop CouchDB
Hello everybody,

I talked to the people in #couchdb on Freenode IRC about my ideas a
few days ago, and they sent me here to present them and hear what
you think.

First I want to talk about storing arbitrary GObjects. I thought it
would be nice to be able to store arbitrary GObjects in a CouchDB
instance with very little modification to the GObject class
definition itself -- no matter whether those are contacts, tasks,
emails, IM buddies, IM logs, bookmarks, RSS feeds, ... Storage and
synchronization should be as transparent as possible, so that
applications can work with their slightly modified GObjects without
even thinking about the synchronization process itself.

I implemented this idea as a prototype in Python and it's quite easy
to use (this is how you use it, not the source itself; I don't want to
paste it all here and I don't see a button to attach documents ...):

import gobject          # PyGTK GObject bindings
import couchdb_gobject  # the prototype persistence layer described here

class Event(couchdb_gobject.CouchDBObject):
    __gtype_name__ = 'Event'
    # Each entry: (type, nick, blurb, [min, max,] default, flags)
    __gproperties__ = {
        'title': (gobject.TYPE_STRING, 'title', 'this is my attribute',
                  '', gobject.PARAM_READWRITE),
        'message': (gobject.TYPE_STRING, 'message', 'this is my attribute',
                    '', gobject.PARAM_READWRITE),
        'timestamp': (gobject.TYPE_DOUBLE, 'timestamp',
                      'this is my attribute',
                      0.0, 2**32, 0.0, gobject.PARAM_READWRITE),
    }
    __gsignals__ = {
        'deleted': (gobject.SIGNAL_RUN_LAST, gobject.TYPE_NONE, ()),
    }
    # Database that instances of this class are stored in
    _couchdb_database_name = 'sharedb/anonymous/events'

New instances are automatically saved, property changes and deletes
are automatically synchronized to the couchdb instance and from there
to all the couchdb cloud instances -- no matter where: on my mobile
phone, my laptop, my server, my desktop. With all the benefits that
couchdb / desktop couch brings: Work with your objects no matter if
you are online or offline. A list of existing objects can be obtained
from an object manager and the application can keep up with changes
occurring locally or remotely by connecting to gobject signals.

This is how you can connect arbitrary gobjects that are being used in
running applications around the globe without losing the comfort of
local caching.

My second concern is about fine-grained synchronization. Imagine you
have a shared contact. You change its office number at work but your
laptop doesn't have enough time to synchronize its changes before you turn
it off. At home at your desktop you change the address. Now you have
two different documents on your laptop and your desktop. Both have
been changed, but the laptop's one doesn't reflect the change of the
address and the desktop's one doesn't reflect the change of the office
telephone number. So when synchronizing you have to decide manually
which version to delete and you will lose either one or the other
version.

This can be solved by storing the object and each of its properties
in a document of its own. You simply have to add a "last modified"
timestamp to every property. When you then change the office telephone
number on your laptop and the address on your desktop, the changes can
be merged without any difficulties. In the end you will have the
contact with both the new office telephone number and the new address
-- exactly what you wanted! (I know that this isn't that easy when
multiple users are using the same dataset, but as far as I understand
this isn't the main scenario desktop couch focuses on.)

Also imagine you change the address on your laptop, but it doesn't
synchronize, and some hours later you realize that you had a spelling
mistake and you change it again on your desktop. What will happen?
There are two conflicting changes! But based on the "last modified"
timestamp, the system can decide automatically that the last change is
more recent than the one on the laptop and you will get what you
expect: the last change will be used to resolve the conflict without
human intervention. (My gobject persistence layer already uses
different documents for object and properties and adds timestamps on
modification.)
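
A minimal sketch of the per-property merge described above, in plain
Python without any CouchDB access. The function name, the
(value, last_modified) layout and the sample data are assumptions made
for illustration; they are not taken from the prototype.

```python
def merge_properties(local, remote):
    """Merge two replicas of one object, property by property.

    Each replica maps a property name to a (value, last_modified) pair.
    For every property, the entry with the newer timestamp is kept.
    """
    merged = dict(local)
    for name, (value, modified) in remote.items():
        if name not in merged or modified > merged[name][1]:
            merged[name] = (value, modified)
    return merged


# Hypothetical replicas of the same contact (timestamps in seconds).
laptop = {
    "office_phone": ("555-0199", 200.0),     # changed at work
    "address": ("1 Old Raod", 250.0),        # changed, with a typo
}
desktop = {
    "office_phone": ("555-0100", 100.0),     # untouched old value
    "address": ("1 Old Road", 300.0),        # typo fixed later at home
}

contact = merge_properties(laptop, desktop)
```

Here the merged contact keeps the laptop's office number and the
desktop's corrected address. The result is only as good as the
timestamps: it assumes both machines' clocks agree.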

I'm really curious what you think about my two points and about my
ideas and I hope that I can contribute to desktop couch a little.

Thank you for your kind attention!

Joachim

Rodrigo Moya

Aug 3, 2010, 6:49:01 AM
to desktop...@googlegroups.com
Have you thought about using GObject introspection? That way, there
should be no need at all to write anything in your GObject-based
classes, and even objects that know nothing about CouchDB could just be
stored to/retrieved from it very easily.

> My second concern is about fine-grained synchronization. Imagine you
> have a shared contact. You change its office number at work but your
> laptop doesn't have enough time to synchronize its changes before you turn
> it off. At home at your desktop you change the address. Now you have
> two different documents on your laptop and your desktop. Both have
> been changed, but the laptop's one doesn't reflect the change of the
> address and the desktop's one doesn't reflect the change of the office
> telephone number. So when synchronizing you have to decide manually
> which version to delete and you will lose either one or the other
> version.
>

CouchDB deals with this by creating conflicts, so you never lose any
version. We have been thinking for a long time about how to resolve
conflicts in an easy way (for the user; it's very easy to do it from
code, but the difficult bit is deciding which version to use), so
we might come up with something soon. But you shouldn't really be too
worried about it. In the worst case, the user would have 2 records for
the same contact and would have to merge them manually.
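
As a rough illustration of how the conflicts mentioned above show up
to a client: when a document is fetched with the conflicts=true query
parameter, CouchDB adds a "_conflicts" field listing the losing
revisions, if there are any. The helper name and the sample document
below are made up for this sketch, and the HTTP request itself is left
out.

```python
def conflicting_revisions(doc):
    """Return the losing revision ids reported for a document.

    `doc` is the decoded JSON body of a document fetched with
    ?conflicts=true. The "_conflicts" field is only present when
    conflicting revisions exist, so a missing field means no conflict.
    """
    return list(doc.get("_conflicts", []))


# Hypothetical document body; ids and revisions are invented.
sample = {
    "_id": "contact-42",
    "_rev": "3-abc",
    "address": "2 New Street",
    "_conflicts": ["3-def"],
}

losers = conflicting_revisions(sample)
```

Each listed revision can then be fetched and either merged by the
application or shown to the user. Nothing is discarded until the
client deletes the revisions it no longer wants.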

Joachim

Aug 3, 2010, 7:35:23 AM
to Desktop CouchDB
> have you thought about using GObject introspection? That way, there
> should be no need at all to write anything on your GObject-based
> classes, and even objects that know nothing about CouchDB could just be
> stored to/retrieved from it very easily

I have never used GObject introspection so far and I don't know its
mechanisms. But I agree: It should be possible to connect arbitrary
GObjects to a CouchDB database without modifying them in any way if
you have the following things (not necessarily complete!):

- you can change the object's properties whenever the couchdb instance
is updated
- you connect to its signals to find out when it's destroyed or its
properties are changed
- you can hook into the instantiation process and thus be notified
whenever a new object is created
- you provide a function or method to retrieve all existing objects

> > My second concern is about fine-grained synchronization. Imagine you
> > have a shared contact. You change its office number at work but your
> > laptop doesn't have enough time to synchronize its changes before you turn
> > it off. At home at your desktop you change the address. Now you have
> > two different documents on your laptop and your desktop. Both have
> > been changed, but the laptop's one doesn't reflect the change of the
> > address and the desktop's one doesn't reflect the change of the office
> > telephone number. So when synchronizing you have to decide manually
> > which version to delete and you will lose either one or the other
> > version.
>
> couchdb deals with this by creating conflicts, so you never lose any
> version. We have been thinking for a long time about how to resolve
> conflicts in an easy way (for the user, it's very easy to do it from
> code, but the difficult bit is about deciding which version to use), so
> we might come up with something soon. But you shouldn't really be too
> worried about it. In the worst case, the user would have 2 records for
> the same contact and would have to merge manually

From my point of view a good solution has to resolve conflicts without
human intervention and in the way the user expects. Of course having
the user decide between two conflicting versions or merge them is
always an option, but not a convenient one. I think it would be both
very convenient for the user and very easy to implement the way I
proposed: Add last modification timestamps to every property and
decide automatically on this basis which version of a specific
property to keep. What's wrong with this idea?

Dave Cottlehuber

Aug 3, 2010, 8:22:31 AM
to desktop...@googlegroups.com
And what if your clocks aren't in sync? Even with NTP this is a lot more
common than you'd expect. I prefer to let the DB do what Couch already
does and leave reconciliation largely to the user. If you can resolve
this with application rules then that's just as good - if you're always
sure it works.
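
To make the clock problem concrete, here is a small sketch of
last-write-wins going wrong. All names and numbers are invented for
the example; the only assumption is that one machine's clock runs
ahead of the other's.

```python
def last_write_wins(a, b):
    """Return the (value, timestamp) pair with the newer timestamp."""
    return a if a[1] >= b[1] else b


# Hypothetical skew: the laptop's clock runs two hours fast.
LAPTOP_CLOCK_OFFSET = 7200.0

# Real-world order of events, in seconds.
real_time_laptop_edit = 1000.0    # address entered with a typo
real_time_desktop_edit = 4600.0   # typo fixed one hour later

# Each machine stamps the edit with its own clock.
laptop_edit = ("Main Stret 5", real_time_laptop_edit + LAPTOP_CLOCK_OFFSET)
desktop_edit = ("Main Street 5", real_time_desktop_edit)

winner = last_write_wins(laptop_edit, desktop_edit)
```

The desktop edit happened later in real time, yet the laptop's
timestamp (8200.0) is larger than the desktop's (4600.0), so the older
value with the typo silently wins.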

Eric Casteleijn

Aug 3, 2010, 8:46:19 AM
to desktop...@googlegroups.com
On 08/03/2010 08:22 AM, Dave Cottlehuber wrote:
> & if your time isn't in sync? even with NTP this is a lot more common
> than you'd expect. I prefer to let the DB do what couch already does &
> leave reconciliation largely to the user. If you can resolve this with
> application rules then that's just as good - if you're always sure it
> works.

Having worked on timestamp-based conflict resolution using couch, I
could not agree more: avoid timestamps in any setting where replication
is happening, or you will regret it. It's not a good match for how
couchdb works. I agree that desktopcouch needs better support for
conflict resolution, so that applications using the desktop API have an
easier time detecting conflicts and resolving them either automatically
or through user intervention.

I don't think there is such a thing as an automatic conflict resolution
that will work irrespective of what kind of document/application you're
working with (fields might have dependencies between them), so I think
that mostly the right thing to do is to leave it to the client to deal
with them however it sees fit.

--
eric casteleijn
https://code.launchpad.net/~thisfred
Canonical Ltd.
