Data model for machine parsable protocols?


Bryan Bishop

Feb 8, 2011, 11:57:58 AM
to diy...@googlegroups.com, Bryan Bishop, Eric Meltzer
Looks like Eric has been wanting to take a stab at this:

http://openprotocols.net/hiring/rough%20spec.pdf
http://openprotocols.net/hiring/
http://news.ycombinator.com/item?id=2191105

Git-revisioned protocols and sharing are nice, and are one step up from the current situation in sharing and developing lab protocols. DIYbio as a group has been eyeballing more programmatic ways of specifying protocols, either through structured document formats (like XML) or parsed languages like in Microsoft's Biocoder project:

http://research.microsoft.com/en-us/um/india/projects/biocoder/
or xml: http://diyhpl.us/~bryan/irc/pcr.xml

Still, I think this isn't a solved problem. The API for Biocoder feels all wrong for numerous reasons; on top of that, nobody is going to learn a new API, library or programming language just to write down a protocol, unless they are being paid, or there's some really compelling reason to do so (which there isn't). Without something changing here, you just end up with a giant corpus of protocols like we presently have, without metadata and basically useless unless you already know what you want or need, or have the time to manually check and double-check everything in each protocol you might be using.

One of the advantages of machine parsable protocols is being able to query against your lab inventory...
http://diyhpl.us/cgit/skdb/plain/doc/BOMs/diybio-equipment.yaml
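As a rough illustration of that kind of query, here is a minimal sketch in Python. The protocol structure, the field names ("steps", "requires") and the inventory entries are all hypothetical; they are not taken from the skdb YAML file above or from any existing protocol format.

```python
# Hypothetical structured protocol: each step lists what it needs.
# Field names ("steps", "requires") are made up for this sketch.
protocol = {
    "name": "PCR",
    "steps": [
        {"action": "mix", "requires": ["micropipette", "PCR tubes", "Taq polymerase"]},
        {"action": "thermocycle", "requires": ["thermocycler"]},
    ],
}

# Hypothetical lab inventory, reduced to a set of item names.
inventory = {"micropipette", "PCR tubes", "thermocycler"}


def missing_items(protocol, inventory):
    """Return the sorted names a protocol requires that the inventory lacks."""
    required = {item for step in protocol["steps"] for item in step["requires"]}
    return sorted(required - inventory)


print(missing_items(protocol, inventory))  # ['Taq polymerase']
```

A real version would need to match item names more loosely than exact string equality, which is part of why the representation question is not trivial.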

I should also point out that Jonathan had been working on parsing plaintext protocols a while back:

Don’t Train the Biology Robot: Have the Machine Read the Protocol and Automate Itself
http://88proof.com/synthetic_biology/blog/archives/290

Eric, maybe you can share with us what you're thinking?

- Bryan
http://heybryan.org/
1 512 203 0507

Mackenzie Cowell

Feb 8, 2011, 12:30:02 PM
to diy...@googlegroups.com
Looks like openprotocols.net may have been developed as part of an iGEM project:
http://2010.igem.org/Team:Paris_Liliane_Bettencourt/Collaboration

Mac

--
You received this message because you are subscribed to the Google Groups "DIYbio" group.
To post to this group, send email to diy...@googlegroups.com.
To unsubscribe from this group, send email to diybio+un...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/diybio?hl=en.



--
+1.231.313.9062 / m...@diybio.org / @100ideas

Mackenzie Cowell

Feb 8, 2011, 12:31:49 PM
to diy...@googlegroups.com
It's worth noting that the Alberta iGEM 2010 team also did some development work on machine-generated protocols.  Example: http://genomikon.ca/experiments/14

Mac

Bryan Bishop

Feb 8, 2011, 12:38:26 PM
to diy...@googlegroups.com, Bryan Bishop
On Tue, Feb 8, 2011 at 11:30 AM, Mackenzie Cowell <m...@diybio.org> wrote:
Looks like openprotocols.net may have been developed as part of an iGEM project:
http://2010.igem.org/Team:Paris_Liliane_Bettencourt/Collaboration

it links to this:
http://synbioworld.org/openprotocol/

Eric Meltzer

Feb 8, 2011, 4:39:20 PM
to Bryan Bishop, diy...@googlegroups.com, Xavier Duportet, David Bikard
Hi Bryan, and Hi to the DIYbio list at large,

I'm glad you guys are interested in our project, and it's great to see that there's been a lot of thought about this problem in the past. To us, lab protocols are a good place to start when trying to promote the goal of open and transparent labwork.  We are not interested in being an avant-garde service used only by other open-science fans; we're shooting for the mainstream.  

Bryan, you said "...you just end up with a giant corpus of protocols like we presently have, without metadata and basically useless..."
This giant corpus of disorganized and useless protocols is exactly what motivated us to start this project.  We propose to fix it not just by using structured data (although that is part of the solution) but instead by changing the social structure of protocol sharing.

If you look at existing protocol aggregation sites (protocol-online, all of the ones run by academic journals, and OpenWetWare), they currently just take anything, and then organize it by something like what organism it's for, etc. We've decided to organize our site in a different way, based in some part on an interesting article by Zhang Xianhang about social design for websites.

Our organization starts with group pages, each of which is just a page with a bunch of links to protocols, curated by the people who started the group. When Albert Apple joins the site, he has a group page called Albert Apple's Protocols, onto which he can add any protocols on the site that he uses frequently (by clicking "copy to my page") as well as make and upload his own protocols. In the example shown in our spec, the DIYbio group contains protocols that DIYbio people are interested in, organized into categories. Since each group is user moderated, the content on it is free to grow very slowly, and you only see what you're interested in. If at this time tomorrow, someone uploads 20,000 terrible protocols to the site, you won't notice, because they aren't on any of the group pages you care about.
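The group-page idea can be sketched as a small data model. This is only an illustration under assumed names: the `Site` class, its methods and the protocol ids are hypothetical and are not taken from the openprotocols.net spec.

```python
class Site:
    """Hypothetical model: a global pool of protocols plus curated group pages."""

    def __init__(self):
        self.protocols = {}  # protocol id -> title
        self.groups = {}     # group name -> ordered list of protocol ids

    def upload(self, protocol_id, title):
        """Add a protocol to the global pool; it appears on no group page yet."""
        self.protocols[protocol_id] = title

    def copy_to_page(self, group, protocol_id):
        """Link an existing protocol onto a group page (the "copy to my page" action)."""
        if protocol_id not in self.protocols:
            raise KeyError(protocol_id)
        page = self.groups.setdefault(group, [])
        if protocol_id not in page:
            page.append(protocol_id)

    def view(self, group):
        """Return only the titles curated onto this group page."""
        return [self.protocols[pid] for pid in self.groups.get(group, [])]


site = Site()
site.upload("miniprep-1", "Plasmid miniprep")
site.copy_to_page("Albert Apple's Protocols", "miniprep-1")

# A flood of uncurated uploads lands in the pool but on no group page.
for i in range(20000):
    site.upload(f"junk-{i}", "Terrible protocol")

print(site.view("Albert Apple's Protocols"))  # ['Plasmid miniprep']
```

The point of the sketch is that a group page stores links rather than protocols, so the size of the global pool never affects what a reader of that page sees.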

As you can see from this rant, and our spec, we're unapologetic user-experience obsessives.  If anyone has any critiques or comments, we'd be very glad to hear them!

Best,

Eric Meltzer



On Tue, Feb 8, 2011 at 11:57 AM, Bryan Bishop <kan...@gmail.com> wrote:

Bryan Bishop

Feb 8, 2011, 10:46:43 PM
to Eric Meltzer, Bryan Bishop, diy...@googlegroups.com, Xavier Duportet, David Bikard
On Tue, Feb 8, 2011 at 3:39 PM, Eric Meltzer wrote:
Hi Bryan, and Hi to the DIYbio list at large,

Hi Eric.
 
Bryan, you said "...you just end up with a giant corpus of protocols like we presently have, without metadata and basically useless..."
This giant corpus of disorganized and useless protocols is exactly what motivated us to start this project.  We propose to fix it not just by using structured data (although that is part of the solution) but instead by changing the social structure of protocol sharing.

So, just to be clear, your idea is to have users write up protocols in your semantic format and submit them to your website, rather than to a journal or book for academic credit (of course, that might happen, but it's really a time investment calculation for users, I imagine).

And specifically you're betting that people who are capable of doing that task will choose to use your site over journals/books because of whatever particular social sauce you have.

I don't know if data sharing is the real problem here; it's been shown that the web can be used effectively to share data. But I also think there are some greater issues involved in protocol representation: English is not necessarily ideal, and while semantically separating sections of it can help, it doesn't entirely solve the problem. Jonathan Cline took a nice stab at the problem (but he stopped? or something). Anyway, if you have any solutions in that area, it'd be great to hear about them.
 
As you can see from this rant, and our spec, we're unapologetic user-experience obsessives.  If anyone has any critiques or comments, we'd be very glad to hear them!

Eric Meltzer

Feb 8, 2011, 11:30:09 PM
to Bryan Bishop, diy...@googlegroups.com, Xavier Duportet, David Bikard
Hey Bryan,

We're not trying to be an exclusive or final destination for protocols, but rather a place for people to store all the protocols they use.  Thus there is no reason one couldn't publish a protocol, stick it in a book, and then put it on openProtocol.  Since it's incredibly difficult to copyright recipes, and the fairly constricted format of our protocols will necessitate a bit of tweaking anyway, this is a perfectly viable thing for some people to do.

The social structure of the site is not a way to get people to post more protocols, but rather a way to make the protocols that people do post easy to find and use.  It also makes sharing protocols in many languages intrinsically easier because they will naturally segregate onto different group pages (although we also are allowing for actual software filtering of protocols by language.)

Besides sharing protocols in non-English languages (which is, and this is speaking as someone going to school in Beijing and working in a Chinese-only lab, still quite an edge case at the moment), what other problems do you guys see in the display/dispersal of protocols?

-Eric

Mackenzie Cowell

Feb 9, 2011, 12:25:05 PM
to diy...@googlegroups.com, Bryan Bishop, Xavier Duportet, David Bikard
Eric,

Starting with a simple, narrow focus seems like a good way to begin.  I'm looking forward to trying out the site and helping it grow.


As an experiment, I encourage you to consider asking users for a semi-structured tag cloud of words that describe the inputs and outputs of each protocol, in addition to the ordered list of text that they will provide.  In this way, you could start treating protocols in a more functional way, perhaps even providing composite protocols, without having to implement a full-blown semantic protocol annotation system.
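To make the input/output tag idea concrete, here is a minimal sketch in Python. The `Protocol` class, the tag strings and the chaining rule (two protocols can be composed when an output tag of one matches an input tag of the other) are assumptions made for illustration, not part of any existing site or spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protocol:
    """Hypothetical protocol record carrying free-text input/output tags."""
    name: str
    inputs: frozenset
    outputs: frozenset


def can_chain(first, second):
    """True if at least one output tag of `first` is an input tag of `second`."""
    return bool(first.outputs & second.inputs)


def compose(first, second):
    """Build a composite protocol that runs `first`, then `second`."""
    if not can_chain(first, second):
        raise ValueError(f"{first.name} produces nothing that {second.name} consumes")
    # The composite needs first's inputs plus whatever second needs that first does not make.
    inputs = first.inputs | (second.inputs - first.outputs)
    # It yields second's outputs plus any outputs of first that second does not consume.
    outputs = second.outputs | (first.outputs - second.inputs)
    return Protocol(f"{first.name} + {second.name}", inputs, outputs)


miniprep = Protocol("miniprep", frozenset({"overnight culture"}), frozenset({"plasmid DNA"}))
digest = Protocol(
    "restriction digest",
    frozenset({"plasmid DNA", "restriction enzyme"}),
    frozenset({"digested DNA"}),
)

combined = compose(miniprep, digest)
print(sorted(combined.inputs))   # ['overnight culture', 'restriction enzyme']
print(sorted(combined.outputs))  # ['digested DNA']
```

Exact string matching on tags is the weak point here; a semi-structured tag cloud would probably need some normalization (synonyms, plurals) before chaining like this became reliable.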

Good luck, let us know when we can experiment with your beta.

Mac


David Bikard

Feb 9, 2011, 2:52:17 AM
to Eric Meltzer, Bryan Bishop, diy...@googlegroups.com, Xavier Duportet
Hi guys,

This said, we do also clearly see the value of machine-readable protocols, and we are doing some research on which semantic web standard would be best for structuring our data. For instance, I don't know if you guys know about http://omanual.com/. We are also considering creating a new microformat dedicated to protocols, in the spirit of http://microformats.org/wiki/hrecipe. We'd love to hear your input about what would be best. However, as Eric explained, this is not where we are getting started for openprotocols.net. But rest assured that we DO think about it!

David
--
David Bikard
Unité "Plasticité du Génome Bactérien"
Département de Génomes et Génétique
Institut Pasteur
25,rue du Dr Roux
75724 Paris cedex 15
email: dbi...@gmail.fr
Tel : 33 .1.44.38.94.83