mesh4x and python/django? (also: file sharing with DropBox/Groove/WLM)

edjez

Dec 12, 2008, 7:17:04 PM

to Mesh4x Discussion, cz...@dimagi.com, jjac...@dimagi.com, dmy...@dimagi.com

From: Eduardo Jezierski
Date: Fri, 12 Dec 2008 14:33:44 -0800
To: Daniel Myung
Cc: Daniel Cazzulino, Juan Tondato, Jonathan Jackson, Cory Zue
Conversation: mesh4x and python/django?
Subject: Re: mesh4x and python/django?

Daniel,
When is your trial in Zambia starting? Can you tell us more about
what you are trying to achieve (e.g. user X takes a picture and puts it
in a folder, then updates the database, and so on)?

More info-

Re: server
– Jonathan asked for more info... so here is more info...
http://edjez.instedd.org/2008/12/for-geeks-progress-on-mesh4x-cloud.html
The store is a Mesh4x adapter for S3 (or MySQL over block storage).

Question to help steer our efforts: if you were to use this, what
reuse mechanism do you see for the service? (We were thinking A and D
would be the most common, but would love your input.)

A) Do you see yourselves using a sync.instedd.org instance (with
security scoped by the app)?
B) Should we prep the VM image ‘nicely’ and share it on EC2 so you can
start your own?
C) Should we prep another VM you can download and run in some box you
own?
D) Should we prep the code and services to be deployed in your own
native or virtual instances on Apache?

Re: mesh4x
– mesh4x right now revolves around synchronizing data, not just files.
We haven’t run into a scenario where people wanted to synchronize
files per se, but an adapter wouldn’t be hard. Depending on the sizes
of the files,

Hm... your questions tell me we need a much better intro set of docs on
mesh4x, what it does today, and where it’s going.
Imagine this scenario:
- 2+ databases (of any kind, from CSV to MySQL to Access to a KML file
or a JavaRosa/RMS store)
- Updates are happening on the same entities on either side
- ‘The same entities’ means we are partly updating the same
information – e.g. I update a patient’s location, you update their
symptoms, and some apps might just ‘get’ part of the info
- You want to share those updates and get the updates of others (and in
updates I include insertions, edits, deletions, etc.)
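
To make the scenario concrete, here is a minimal Python sketch of two sites editing different fields of the same patient record and merging against a shared base version. It is an illustration only, not how the mesh4x libraries implement merging; the record layout, the sample values, and the `merge_fields` helper are all hypothetical.

```python
def merge_fields(base, mine, theirs):
    """Three-way merge of flat dict records (hypothetical helper).

    Returns (merged, conflicts). A field is a conflict when both sides
    changed it to different values; my value is kept and the field is
    reported so that someone can resolve it later.
    """
    merged = {}
    conflicts = []
    for field in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(field), mine.get(field), theirs.get(field)
        if m == t:
            value = m          # both sides agree
        elif m == b:
            value = t          # only the other side changed it
        elif t == b:
            value = m          # only I changed it
        else:
            conflicts.append(field)
            value = m          # both changed it differently
        if value is not None:  # None means the field was removed
            merged[field] = value
    return merged, conflicts


# Made-up sample data: I update the location, you update the symptoms.
base = {"id": "p1", "location": "clinic A", "symptoms": "none"}
mine = dict(base, location="clinic B")
theirs = dict(base, symptoms="fever")

merged, conflicts = merge_fields(base, mine, theirs)
print(merged, conflicts)
```

Both edits survive here because they touch different fields; if both sides had changed `location`, the field would show up in `conflicts` instead of being silently overwritten.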

RDF is just the ‘default’ data format for things that look like
dictionaries (just like you use XForms).
But the mesh4x libraries just move anything opaquely (e.g. we read a
KML file and decompose it into its versionable entities – pushpins,
placemarks, polygons, styles, hierarchies – track versions for those,
and then re-compose the KML on the other end). Or you could exchange
XForms as-is.

Re: file sharing

What you suggest will move your files around, making sure the latest
update is spread around. Unless each entity is in a different file
you might lose updates (it’s a latest-one-wins scenario per
file, I believe, for both WLM and Dropbox). It seems like a
straightforward approach if concurrent updates and manual conflict
resolution are not needed (e.g. moving pictures around: they tend never
to be edited, so it’s unlikely two folks will edit the same thing
concurrently).
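
A small Python sketch of why latest-one-wins per file can lose updates when one file holds several entities. This is a simplified model with made-up timestamps and records; it does not claim to reproduce how WLM or Dropbox actually resolve conflicts, and `latest_one_wins` is a hypothetical helper.

```python
def latest_one_wins(copy_a, copy_b):
    """Keep the whole file with the newer modification time."""
    return copy_a if copy_a["mtime"] >= copy_b["mtime"] else copy_b


# Both laptops start from the same file holding two records.
base = {"p1": "v1", "p2": "v1"}

# Laptop A edits p1, laptop B edits p2 a little later.
file_a = {"mtime": 100, "records": dict(base, p1="v2 (edited on A)")}
file_b = {"mtime": 105, "records": dict(base, p2="v2 (edited on B)")}

winner = latest_one_wins(file_a, file_b)

# Edits made on A that are missing from the surviving file.
lost = [
    key
    for key, value in file_a["records"].items()
    if value != base[key] and winner["records"][key] != value
]
print(lost)
```

Laptop B's file wins as a whole, so A's edit to `p1` is dropped even though the two edits never touched the same record. With one entity per file, the two edits would land in different files and both would survive.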

I’d consider Groove for your tests. In past exercises we moved
pictures around using Groove shared folders (great for lower bandwidth,
as it chunks the data exchanges and survives interruption – we even did
it over BGAN satellite links; and I know Robert deployed a
telemedicine solution based on moving microscopy JPEGs around in
Groove). They have the discovery, security, etc. built in, and no
internet connection is needed, not even for rendezvous or update pings.
(Not sure how Dropbox or WLM work in these scenarios.)

Not sure about how WLM would behave in this connectivity context but I
would definitely test it. I know the chaps doing the sync fx and no
matter how much I tried to hammer humanitarian scenarios to their
cortex 1+ years ago, it’s a tough sell when they are trying to ship.

From: Daniel Myung
Date: Fri, 12 Dec 2008 15:53:27 -0500
To: Eduardo Jezierski
Cc: Daniel Cazzulino, Juan Tondato, Jonathan Jackson, Cory Zue
Subject: Re: mesh4x and python/django?

Hey Eduardo, and other Meshers,

Our timeline is a bit tight, as we'd like to have some semblance of a
trial in Zambia sometime soon. I don't think we have the time or
resources to look into contributing a python/django adapter
implementation just yet. But it's something that intrigues me as well.

Thinking through it some more, I think file synchronization for
p2p is the compelling story we want to focus on. To reiterate, if 2
laptops are using Dropbox or Microsoft Mesh, there's essentially an
always-updating directory on each system that reflects the contents
of its peer's laptop, provided there's enough connectivity to
attempt a synchronization. I'm thinking that by putting as much faith as
possible in the file transfer properties (this includes the feed XML +
the binary images), we can have our sync framework use just this one
transaction method. For our scenario, Dropbox/Mesh address the discovery
and peer-to-peer connectivity of multiple remote hosts, as well as the
one-time cost of moving binary data to remote peers.

The host can act as a hub, starting up a Dropbox/Mesh account and then
divvying out unique shared folders to the remote accounts it wants to
receive syncs from.

Re: cloud services? What's the underlying storage mechanism for
that? Is it an API dressing for the storage methods like Skydrive or
S3 or something?

In summary, I guess we could use a primer on the pros and cons of your
file-based synchronization method. Is it just pure RDF-based feeds,
with updates applied to both the left side and the right side to ensure
that they are equal at the end of the transaction? How does this
differ from the KML example?

I've cc'ed another Dimagi'er, Cory, on the conversation as he's part of
our collective effort to build up this component.

Thanks,

Dan

On Wed, Dec 10, 2008 at 5:37 PM, Eduardo Jezierski <ed...@instedd.org>
wrote:
Daniel,
There are two approaches – using Django's ORM layer (requiring a
Python implementation) or writing directly to MySQL / SQLite from a
Java app using Hibernate.
The mesh scenario you describe is precisely one we want to
support. We are about to finish a short project with CDC to sync
EpiInfo via cloud or SMS and then transform/map the cloud info
automatically onto maps/KML.

A Python implementation of the core libraries would be IMO something
widely useful and easier to build than it may seem, but may be out of
scope for an initial proof of concept. But if you want to get a deep
understanding of all this or want to have some OSS fun these holidays,
you can take the Java libraries, port the unit tests to Python first,
and then work out the rest.

Sounds like you'll want HTTP to the cloud store, and then some
peer-to-peer exchange between laptops. There are adapters to files that
you can sync from and to, so no problem there. The next question is how
to make 2 machines on the same net automagically discover each other's
services and do the syncing once they rendezvous.
For the peer-to-peer rendezvous we don't have any implementations
yet; we did some playing around with WASTE networks (interesting,
cumbersome) and haven't yet done any work with Zeroconf but it looks
promising (there's our holiday hack project). Suggestions welcome.

What timelines are you thinking about?

I am also assuming the images are not being updated for the moment
(mesh4x would have to bring in something like rsync to sync enclosures
efficiently without moving huge chunks of binary around, and we've been
focusing on the whole forms / JavaRosa / structured data sync, and on
getting the cloud service up this week).
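
To illustrate what "something like rsync" buys you, here is a rough Python sketch that compares two versions of a binary by fixed-size blocks, so only the changed blocks would need to move. It is a simplification: rsync itself uses a rolling checksum so that it also copes with insertions that shift data, which this sketch does not. The function names and the tiny block size are assumptions for the example, not anything in mesh4x.

```python
import hashlib


def block_digests(data, block_size):
    """SHA-256 digest of each fixed-size block of a bytes object."""
    return [
        hashlib.sha256(data[start:start + block_size]).hexdigest()
        for start in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=4):
    """Indexes of blocks in `new` that differ from, or are absent in, `old`."""
    old_digests = block_digests(old, block_size)
    new_digests = block_digests(new, block_size)
    return [
        index
        for index, digest in enumerate(new_digests)
        if index >= len(old_digests) or old_digests[index] != digest
    ]


# One byte changed in the second block; only that block needs resending.
print(changed_blocks(b"aaaabbbbcccc", b"aaaaXbbbcccc"))
```

For images that are written once and never edited, none of this is needed: a plain one-time copy is enough.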

Cheers. Adding Jonathan so he's in the loop too.
Will send you a link to the dev instance of the cloud service when it's
up later this week.

ej

From: Daniel Myung
Date: Wed, 10 Dec 2008 10:16:17 -0500
To: Daniel Cazzulino
Cc: Eduardo Jezierski, Juan Tondato
Subject: Re: mesh4x and python/django?

Thanks for the reply!

Django has an ORM layer built with Python. It has hooks to most of
the major DBs (MS SQL needs some finesse); most people use SQLite or
MySQL by default. It's just a single line change in the config to
switch from one DB to another.

http://docs.djangoproject.com/en/dev/topics/db/models/
http://docs.djangoproject.com/en/dev/topics/db/queries/

I'm guessing that we should probably use Java for the sake of
generality (as we foresee some of our installs being put on linux
VMs). I just wasn't sure whether or not there was a way forward for
using pure python to reduce the number of moving parts to install. I
suppose in the interest of time/proof of concept, I could just bypass
the django API and have hibernate plug into the DB directly.

To clarify the situation some more, we're looking to use something
like Microsoft Mesh or Dropbox as the actual transport of
information. We like these because they are opportunistic in their
attempts to connect and sync the big chunks of data.

Our case is built on the local laptop syncing against
whatever the storage service pulls down from the cloud storage. So, in
essence, it's a glorified USB key transfer. So actually I misspoke:
HTTP might not be necessary.

my local data dir\updates.db (published in whatever way), + images
laptop_images = [x, y, z]

my microsoft mesh or dropbox share\tosync.db
remote_images = [x, w]

if updates.db != tosync.db
    synchronize
if laptop_images != remote_images
    synchronize => [w, x, y, z]

where the dbs have pointers to images as part of each record.
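
The pseudocode above could be sketched in Python roughly as follows. This assumes, as in the prototype described in this thread, that images are only ever added and never edited, so merging the two image lists is a plain set union. The helper names `needs_sync` and `plan_image_sync` are hypothetical, and the byte strings stand in for the contents of updates.db and tosync.db.

```python
import hashlib


def needs_sync(local_bytes, remote_bytes):
    """True when the two database files differ in content."""
    local_digest = hashlib.sha256(local_bytes).hexdigest()
    remote_digest = hashlib.sha256(remote_bytes).hexdigest()
    return local_digest != remote_digest


def plan_image_sync(laptop_images, remote_images):
    """Return (merged, to_send, to_fetch) for two lists of image names."""
    local, remote = set(laptop_images), set(remote_images)
    to_send = sorted(local - remote)    # on the laptop, missing in the share
    to_fetch = sorted(remote - local)   # in the share, missing on the laptop
    merged = sorted(local | remote)     # what both sides hold afterwards
    return merged, to_send, to_fetch


# Stand-ins for the contents of updates.db and tosync.db.
db_changed = needs_sync(b"updates.db contents", b"tosync.db contents")

merged, to_send, to_fetch = plan_image_sync(["x", "y", "z"], ["x", "w"])
print(db_changed, merged, to_send, to_fetch)
```

The union only works because nothing is edited or deleted; once images can change, the comparison has to look at content or versions rather than names.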

Hope this clears things up,

Dan

On Wed, Dec 10, 2008 at 9:55 AM, Daniel Cazzulino
<dan...@cazzulino.com> wrote:
Hi Daniel,
sounds like a very good fit for mesh4x indeed!

Which platform are you looking to create the adapter in? Java, .NET?

I'm not familiar with Django, which DB do they use?
If you go with Java, you can use the Hibernate adapter we have, to
communicate with the DB on one end, and the HTTP adapter to sync to
the cloud.

From there, you can have another client do the opposite, sync'ing from
the cloud via the HTTP adapter to a local DB via the Hibernate
adapter.

So the flow would be:

Local DB (via Hibernate adapter) <-> Cloud opaque storage (via HTTP
adapter) <-> Another local DB (via Hibernate adapter)

(that would be for peer-to-peer sync)

and

Local DB (via Hibernate adapter) <-> Cloud opaque storage (via HTTP
adapter) <-> Django site DB (via Hibernate adapter)

I'm looping Ed and Tondat in case they want to add more info/comments.

Thanks,
/kzu

On Mon, Dec 8, 2008 at 5:38 PM, Daniel Myung <dmy...@dimagi.com> wrote:
Hi Daniel,

We briefly spoke on the mesh4x demo call a few weeks back. I thought
I'd send another ping to follow up with some additional questions.

We here at Dimagi are looking to explore further uses of Mesh4x for
offline-capable synchronization. In particular
we're looking to build up a telemedicine consultation/collaboration
site that synchronizes images and structured data. The data is being
stored and managed via a Django site and DB. We were wondering if you
had any suggestions on how to tackle the creation of an adapter to
handle the synchronization of this data.

The particulars are deliberately simplified to address an initial
prototype. Basically each side is only able to generate new data. A
nurse on one end can create patients, consultrequests, photos and
comments (to photos). A doctor on the other can only create
consultresponses (linked to a consultrequest) and comments on photos.

The HTTP adapter seems to fit the bill on this...but we were wondering
if you had any other suggestions and/or requests to fill out your
needs to create additional adapters and use cases.

Any assistance would be appreciated,

Dan

Daniel Myung

Dec 15, 2008, 2:16:26 PM

to Mesh4x Discussion

Hey, writing to the group now :)

Timeline for Zambia is a bit iffy, still trying to nail down those
deliverables.

The scenario is a screen-and-treat program for cervical cancer, aided
by photo capture.

- Simple DB for recording observations
- DB contains patient info w/ visits and observations
- each visit has pictures which are pointers to files in folder
- sync input data and images to remote doctor's office for consult
- doctor comment on image synced back to remote clinic for nurse
response

As has been said before, the images are immutable - linked to a
record and need to be copied one time from the originating site to the
consult site for archive and scrutiny/review. Nothing here is
particularly unique with regard to telemedicine craziness. Would like
to leverage as much as possible of what you guys have done so as to make
implementation cleaner and make a more compelling reusability story.

I figure that if we can keep the file system consistent, we can have
some sort of feed or DB representation be mesh'ed over in whatever
way, and let the other services automagically copy the images over.
That way, with using relative filesystem pointers, the synchronization
can happen without adding any additional moving parts.
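
A minimal Python sketch of the relative-pointer idea: records store image paths relative to a shared folder, each site resolves them against its own root, and a site can list which referenced images have not arrived yet. The record layout, the folder names, and both helpers are assumptions made up for this example.

```python
from pathlib import PurePosixPath


def resolve(root, relative):
    """Absolute location of a record's image under this site's root."""
    return (PurePosixPath(root) / relative).as_posix()


def missing_images(records, local_files):
    """Relative image paths referenced by records but not present locally."""
    referenced = {record["image"] for record in records if record.get("image")}
    return sorted(referenced - set(local_files))


# Made-up visit records; the same rows are synced to both sites.
records = [
    {"visit": "v1", "image": "visits/v1/img1.jpg"},
    {"visit": "v1", "image": "visits/v1/img2.jpg"},
    {"visit": "v2", "image": None},
]

# The doctor's site has only received the first image so far.
still_missing = missing_images(records, ["visits/v1/img1.jpg"])

clinic_path = resolve("/home/nurse/share", records[0]["image"])
doctor_path = resolve("/srv/consult/share", records[0]["image"])
print(still_missing, clinic_path, doctor_path)
```

Because the records never contain an absolute path, the same row is valid on both machines, and the file-sharing service can deliver the images whenever connectivity allows.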

Our context has us leaning towards option D with having it be
deployable and hopefully maintained by the local implementers.

We'd love to see a slightly more dumbed-down checklist for potential
implementers to go through to consider the routes involved. The
googlecode repository provides compelling reasons for mesh4x usage,
but finding the decision tree one needs to follow (make my own adapter,
use what's existing, how to format) took a bit of fumbling by me. As
a result, it left me a bit indecisive on how to address our
issues with the variety of implementations. Maybe a noob FAQ and
Mesh4x "implementation continuum" is in order to keep you guys from
constantly being inundated with similar implementation
questions?

Thanks,

Dan

Cory Zue

Dec 15, 2008, 3:29:17 PM

to Mesh4x Discussion

Hi everybody,

Thanks for the excellent guides and apologies again for our beginner's
questions...

So I think we're getting closer to an understanding but are still a
bit limited due to our newness in the space and with your technology.

Am I correct in saying that the problem of syncing entities is
something that comes out of the box with mesh4x and that the only
particulars we will have to solve is getting our data model to import/
expose itself in a way that mesh4x can understand (via an adapter)?

And furthermore IF we have a SQLite database AND we have Hibernate
set up on top of it then the Hibernate adapter will give us the rest
for free?

It seems like this may be the quickest workable solution that will
meet our requirements. The issue with the images, I think, can be
dealt with totally outside the realm of mesh4x.

Like Dan mentioned I think a sample project or tutorial saying "what
steps are necessary to make my stuff mesh?" is the type of intro-level
thing we are looking for in order to get us on the path of hooking
this all up....

thanks again!

-Cory


Eduardo Jezierski

Dec 15, 2008, 6:29:07 PM

to mes...@googlegroups.com

> From: Cory Zue <cory...@gmail.com>
> Reply-To: <mes...@googlegroups.com>
> Date: Mon, 15 Dec 2008 12:29:17 -0800 (PST)
> To: Mesh4x Discussion <mes...@googlegroups.com>
> Subject: [mesh4x] Re: mesh4x and python/django? (also: file sharing with
> DropBox/Groove/WLM)
>
>
> Hi everybody,
>
> Thanks for the excellent guides and apologies again for our beginner's
> questions...
>
> So I think we're getting closer to an understanding but are still a
> bit limited due to our newness in the space and with your technology.
>
> Am I correct in saying that the problem of syncing entities is
> something that comes out of the box with mesh4x and that the only
> particulars we will have to solve is getting our data model to import/
> expose itself in a way that mesh4x can understand (via an adapter)?

---> Yes, you implement an interface that looks like 'crud':
http://code.google.com/p/mesh4x/source/browse/Mesh4j/trunk/src/org/mesh4j/sync/adapters/split/IContentAdapter.java

Or if your datasource is already Hibernate-compatible, and you are OK going
straight to the db, it's even easier (tondat can you share a mapping
example?)

--> Of course, this is with the caveat of certain operational restrictions
because of the implementation, i.e. if you try to sync items 1 GB big, it
won't work, because the assumption so far is that the data moves around
inline (as in XML data) instead of as enclosures that need to be transferred
out of band.
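
For readers who think in Python, here is a rough idea of what a 'crud'-shaped adapter could look like. This is NOT the mesh4x API: the real interface is the Java IContentAdapter linked above, and the class name, method names, and size limit below are all invented for illustration.

```python
class InMemoryContentAdapter:
    """Hypothetical CRUD-style adapter over a dict (not the mesh4x API)."""

    # Assumed limit for this sketch only; payloads travel inline as text.
    MAX_INLINE_BYTES = 1_000_000

    def __init__(self):
        self._items = {}

    def get(self, item_id):
        """Return the payload for one item, or None if it is unknown."""
        return self._items.get(item_id)

    def get_all(self):
        """Return a copy of every item, keyed by id."""
        return dict(self._items)

    def save(self, item_id, payload):
        """Create or update an item; refuse payloads too big to move inline."""
        if len(payload.encode("utf-8")) > self.MAX_INLINE_BYTES:
            raise ValueError("payload too large to move inline")
        self._items[item_id] = payload

    def delete(self, item_id):
        """Remove an item if it exists."""
        self._items.pop(item_id, None)


adapter = InMemoryContentAdapter()
adapter.save("patient-1", "<patient><name>example</name></patient>")
print(adapter.get("patient-1"))
```

The size check mirrors the caveat above: small structured records fit this model, while large binaries would need a separate, out-of-band transfer.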

>
> And furthermore IF we have a SQLite database AND we have Hibernate
> set up on top of it then the Hibernate adapter will give us the rest
> for free?

Yah you just have to define the mapping

>
> It seems like this may be the quickest workable solution that will
> meet our requirements. The issue with the images, I think, can be
> dealt with totally outside the realm of mesh4x.

Yup, but if we can learn a bit more about what you want to do we may work on it,
since it is quite a recurring need and will only tend to come up more often.
(We technically sync files as part of the KML/KMZ sync, as there are icons,
etc. there in the resources folder, but I assume you mean 2 MB-sized items,
not 2 KB.)

> Like Dan mentioned I think a sample project or tutorial saying "what
> steps are necessary to make my stuff mesh?" is the type of intro-level
> thing we are looking for in order to get us on the path of hooking
> this all up....

OK we'll put it together. If you tell us the schema for the table you want
to use as the example we can work it off of that!
Reminder we have a milestone coming up Friday, do we have until then? Does
this outline I just created look appropriate for your needs?

http://code.google.com/p/mesh4x/wiki/HowToSyncADatabase

~ ej


Eduardo Jezierski

Dec 15, 2008, 7:29:42 PM

to mes...@googlegroups.com

Daniel, we are going to be revamping a lot of the docs, so just consider
yourself a brave icebreaker. We are taking down the FAQs and writing around
those. Right now, the mesh4x team is 'marcelo'. So we appreciate your
patience as we progress on making this a 'friendlier' FOSS project!


Eduardo Jezierski

Dec 16, 2008, 4:53:26 AM

to mes...@googlegroups.com

Hello everyone, how do you feel about the following doc skeleton? Any gaps?
If you have specific suggestions, let me know, for example:

'take me from just having a SQLite db schema for a table to syncing it
over the internet with your service'

Thanks!
~ ej
=========================

Guidance Index
Application Scenario

* Scenario: Phone to Tethered PC Sync
* Scenario: Building Maps Collaboratively using KML
* Scenario: Web-based UI for desktop-based data
* Scenario: Creating an updateable store for data-collection tools
* Scenario: Sync Heterogeneous Applications
* Scenario: Tethered Phone to Tethered Phone sync
* Scenario: Cloud Sync of single-user databases

Checklist Item

* Checklist - Making sure Tethered Phone to Tethered Phone Sync will work
* Checklist - Setting up your development environment

Code Example

* Code Example: Making a CSV file adapter

Guideline

* Roadmap - Joining the Mesh4x Community
* Roadmap - Getting to try out Mesh4x Technologies

How To

* How To Sync a MySQL Table to the Cloud
* How to start your own EC2 Instance of the Mesh4x Server on S3
* How to edit data in a mesh server with JavaROSA
* How to create an adapter for an existing application's store
* How to create your own adapter
* How To Set up your own Mesh4x Server on MySQL
* How To Sync an Access Table

Info

* Info - List of Mesh4x adapters
* Info - List of applications using Mesh4x
* Info - FeedSync Reference
* Info - RDF and RDFs primer
* Info - Mesh4x Support for binary files
* Info - How does the Mesh4x over SMS protocol work

Principle

* Allow data augmentation
* Conflicts are interesting
* Keep Schemas small
* Default to standard payload representations

Daniel Myung

Dec 16, 2008, 12:37:57 PM

to mes...@googlegroups.com

Hi all,

Just have a few questions regarding the scenarios listed.

Some of the howtos in that section seem to assume syncing a client app to some "heavier" server app, like the EC2 instance or the cloud services. If I recall your web demo from the call a few weeks back, you demo'ed a MySQL and Excel sync.

Was something similar to that demo your intent behind the "sync heterogeneous applications" scenario? Is it a "how to sync Excel to MySQL to SQLite" sort of scenario, or is it going to be a "how to sync Epi Info to OpenMRS using the cloud services" sort of thing?

I guess this is a question of syncing in the absence of these servers, and of the need to define service endpoints and discovery, something you guys highlight as a pro/con of this sync. It needs a high degree of planning ahead (which I'm all for), but it might come across as daunting to an implementor trying to address the myriad sync scenarios that might crop up.

In other words, is there anything special that needs to be done to support a diversity of adapters/sync methods in a given application?

Dan

--
Daniel Myung
Senior Engineer, Dimagi, Inc
529 Main St, Suite 606 | Charlestown, MA 02129
office: (617) 580-3100 | mobile: (617) 645-3236 | fax: (617) 580-3103
http://www.dimagi.com/
