I'm curious to hear what kinds of software projects are interesting to
you. I've loads of software development experience, but am still
pretty new to wet work and bioinformatics. I'd love to use my skills
to benefit the community, and wonder if there is software development
and informatics work that would benefit DIYbio. I'm curious about two
things:
(1) Are there other folks out there on the list like me? What kinds
of software projects are *you* working on?
(2) What really compelling software is out there waiting to be
written? I'd like to keep tabs on an existing "lazybioweb software
projects" list, or start one. Are there Encyclopedia of Life mashups
that you'd love to see? Is there a software library out there badly
in need of some love & updates?
I look forward to discussing this!
-Jason
--
Jason Morrison
jason.p....@gmail.com
http://jayunit.net
(585) 216-5657
If you want to keep tabs on what people are doing in the communities that I interact with then FriendFeed room 'The Life Scientists' would be a good place to start:
http://friendfeed.com/rooms/the-life-scientists
I'm not sure how much a hub for software it is anymore but:
http://www.bioinformatics.org/search/fullprojectlist.php
Most people I know just go straight to Sourceforge these days, or GitHub for the more Web 2.0 alert ;)
regards,
Dan
--
|| Dan || dan[at]dreadportal.com || http://dreadportal.com/ ||
"Reality is that which, when you stop believing in it, doesn't go away."
(Philip K. Dick - How to Build a Universe)
I haven't used it extensively, but I know that this is a design goal
of Raik Gruenberg's Django app Brickit - see
http://brickit.sourceforge.net/
(1) Open source hardware packaging, much like "dot deb" and "dot rpm",
sort of like a "zip file", except standardized with (a) metadata, and
(b) common CAD files, and (c) both human and computer readable
instructions.
http://fennetic.net/git/gitweb.cgi?p=skdb.git;a=tree
http://fennetic.net/git/skdb.git/
more about this: http://heybryan.org/om.html (more interesting links at the top)
(2) XMLized versions of protocol-online.org and other biology
protocols, to reference packages maintained by people contributing to
#1, and so on.
http://groups.google.com/group/diybio/browse_frm/thread/ada2289ebbc00fe0/d3f81ed82d710ff8?lnk=gst&q=pcr.xml#d3f81ed82d710ff8
http://groups.google.com/group/openmanufacturing/browse_frm/thread/a8d8ee245aaae97d#
http://groups.google.com/group/openmanufacturing/browse_frm/thread/b2e21ccc953d6328#
(3) Reaction pathway transplantation across genomes :-)
http://groups.google.com/group/diybio/browse_frm/thread/d6ec92a5df6b4e74/3b22b31a504f29ca?#3b22b31a504f29ca
http://heybryan.org/~bbishop/docs/dopamine/synthesis_of_dopamine.txt
(4) Maybe something related to gel image analysis and cell phones?
(5) B2B nonsense.
http://groups.google.com/group/diybio/browse_frm/thread/55e532a3a061f97a/e6a4cedfae8ee1e8?#e6a4cedfae8ee1e8
http://groups.google.com/group/openmanufacturing/browse_frm/thread/b1acc9e3b18721fb#
(6) Austin Fab Lab "inventory management"-- sort of--
http://groups.google.com/group/openmanufacturing/browse_frm/thread/2c3c05a3054d8934
http://heybryan.org/~bbishop/docs/shelltrance.txt
(7) I also work in the Automated Design Lab, so I've been doing
matrices for design assembly representation, and automated gear train
design and visualization:
http://heybryan.org/~bbishop/docs/gears/gears.html
(I know, I know, it's unrelated.)
(8) CFD analysis of membraneless filters.
(9) Too many other things that I'm forgetting to mention.
> (2) What really compelling software is out there waiting to be
> written? I'd like to keep tabs on an existing "lazybioweb software
> projects" list, or start one. Are there Encyclopedia of Life mashups
> that you'd love to see? Is there a software library out there badly
> in need of some love & updates?
So, there's tons of bioinformatics databases, and there's tons of
biology software out there. But what I've noticed is that the recent
explosion of software from iGEM hasn't been integrated into the
previous bout of software out there on the internet .. which is a real
shame. It would be nice to see an index of new iGEM/synbio software or
whatever, coupled next to older computational biology tools.
http://heybryan.org/mediawiki/index.php/Computational_biology
http://heybryan.org/mediawiki/index.php/List_of_bioinformatics_databases
How are you going to implement a ranker algorithm for protocol files
like the "pcr.xml" file that I wrote up the other day? Do you plan to
work on structured protocol data, or just random crap, or what?
> DIYbio may get a better reception from the general public if it is
> completely open and is accepting of the safety concerns of the general
> public and works to maintain the highest safety standards. This means
> DIYbiologist would blog about their work and experiments in real time and
> perhaps even webcast them. On their blog sites they would be able to
> receive safety feedback and would be open to receiving it from the
> community.
I'm sure blogs are important, but an underlying data infrastructure is
important too. That way algorithms can actually check whether
you're using a safe protocol, or doing something stupid,
like using a pathogen when there's no "BSL5" equipment listed (ok, this
is an extreme example, but it gets the point across). Automatically
printing out MSDS, for instance, is an easy possibility, *if*
something like XMLerization of biological lab protocols is done-
http://groups.google.com/group/diybio/msg/587f1061e5d30e8f
http://groups.google.com/group/diybio/msg/aea47b19a166764d
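
To make the idea concrete, here is a minimal sketch of that kind of automatic check. The XML layout is invented for illustration (the `protocol`, `organism` and `equipment` elements and their `bsl` attributes are assumptions, not the actual pcr.xml format), and a real checker would need a proper schema and a vetted organism database:

```python
import xml.etree.ElementTree as ET

# Hypothetical protocol layout, invented for this example only.
EXAMPLE = """
<protocol name="example">
  <organism name="E. coli K-12" bsl="1"/>
  <organism name="made-up pathogen" bsl="3"/>
  <equipment name="open bench" bsl="1"/>
  <equipment name="class II biosafety cabinet" bsl="2"/>
</protocol>
"""


def check_containment(xml_text):
    """Return warnings for organisms that need more containment than is listed."""
    root = ET.fromstring(xml_text)
    # Highest containment level provided by any listed equipment (0 if none).
    available = max(
        (int(e.get("bsl", 0)) for e in root.iter("equipment")), default=0
    )
    warnings = []
    for org in root.iter("organism"):
        needed = int(org.get("bsl", 0))
        if needed > available:
            warnings.append(
                f"{org.get('name')} needs BSL-{needed}, "
                f"but listed equipment only covers BSL-{available}"
            )
    return warnings


if __name__ == "__main__":
    for warning in check_containment(EXAMPLE):
        print("WARNING:", warning)
```

The point is only that once protocols are structured data, a check like this is a few lines; with free-text protocols it is not possible at all.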
> An open platform where DIYbiologists and maybe Synthetic Biologists can post
> video protocols or discussions about their work and perhaps even edit them
> might be great.
I just had this discussion the other day with some others- videos are
great, but they're poor for sending *serious* information (which should
be md5'd, and so on; I'm surprised protocols are transferred
without hashes)--
http://groups.google.com/group/openmanufacturing/msg/bd051b015824dc8a
(but yes, media is of course great- I'm just trying to keep the
"promotionalism" separated from the "yes, we're serious about this and
plan to do things right")
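
For what it's worth, hashing a protocol file takes only a few lines. A minimal sketch using Python's standard `hashlib`; the function names and the sample bytes are made up for illustration, and SHA-256 is shown alongside md5 as an addition here, since md5 is no longer considered collision-resistant:

```python
import hashlib


def protocol_digest(data: bytes, algorithm: str = "sha256") -> str:
    """Return the hex digest of a protocol file's raw bytes."""
    return hashlib.new(algorithm, data).hexdigest()


def verify(data: bytes, expected_hex: str, algorithm: str = "sha256") -> bool:
    """Check received bytes against a digest published by the sender."""
    return protocol_digest(data, algorithm) == expected_hex.lower()


if __name__ == "__main__":
    # Stand-in content; in practice read the file with open(path, "rb").
    sample = b"<protocol name='pcr'></protocol>\n"
    print("md5   ", protocol_digest(sample, "md5"))
    print("sha256", protocol_digest(sample, "sha256"))
```

The sender posts the digest next to the file, and the receiver runs `verify` on what they downloaded before trusting it.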
> You may also be interested in taking a look at the Registry of Standard
> Biological parts. There is still a need in the synthetic biology community
> for a way to track storage of physical parts in their freezers. Something
> of this type that is DIY focused may take a different form.
Check out some inventory examples that fenn has been working on:
In particular, though not biology related:
http://fennetic.net/git/gitweb.cgi?p=skdb.git;a=blob_plain;f=inventory/bench.yaml;hb=HEAD
Hm. I'm looking over their data model:
http://brickit.wiki.sourceforge.net/Data+model
Looks like they have their head on straight. Just need to integrate
these data formats with the pcr.xml example, and other protocols, and
we can be close to doing something like a generator that can spit out
"access biobrick blah blah blah from container a over in the corner of
the building <floor layout available>, go get a gel box from cabinet
blah," etc. I'll have to play around with brickit some more- since
it's django, I'd have to run django to generate the database
structure, but is there an export tool for django to quickly get the
SQL query to generate the table structure, so that I don't have to run
brickit myself? Just a (very) minor annoyance.
So, one of the biology labs that I work in has been "making a vector"
for over 6 months now. I spent an hour a few weeks ago and did a
significant chunk of their work with some automated codon usage
analysis scripts (apparently vectors must be tweaked for the
metabolism of the organism). I need to do some more reading on vector
creation algorithms, or the procedure behind it, so maybe you have
some references or feature requests? I'm going to inevitably be
writing such a program anyway.
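
As a rough illustration of what "codon usage analysis" means at its simplest (this is a sketch written for this thread, not the scripts mentioned above), the following counts how often each codon appears in a coding sequence, assuming the sequence is already in frame from its first base:

```python
from collections import Counter


def codon_usage(cds: str) -> dict:
    """Return the relative frequency of each codon in an in-frame sequence."""
    seq = cds.upper().replace("U", "T")   # accept RNA or DNA input
    usable = len(seq) - len(seq) % 3      # ignore a trailing partial codon
    counts = Counter(seq[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


if __name__ == "__main__":
    usage = codon_usage("ATGGCTGCTGCCTAA")
    for codon, freq in sorted(usage.items()):
        print(codon, round(freq, 2))
```

Comparing a table like this for your insert against the table for the host organism is the starting point for spotting rare codons.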
> Also, establishing an open standard for biological datasharing would
> be good. I know that there's some established data formats that can
> be used for interoperability but something that ties DNA sequence,
> protein structure, function, etc together would be nice.
http://biobricks.org/
http://partsregistry.org/
http://sbml.org/
http://myexperiment.org/
Check out SBML. The next step would be integrating SBML, biobricks, and
brickit (the django/python biobrick server for inventory management). What
we don't have though is a way of representing a complete project in a total
file .. mostly because the data models haven't been thought of yet.
:-)
> What sort of dev platform would people recommend? I've got a fair
> amount of experience with .Net but I'd hesitate to recommend it
> because of the Windows restrictions (although there is an open source
Yeah, I'd recommend python, perl, C/C++, maybe ruby on rails.
> version, Mono, which is coming along nicely these days) and
Yes, I've been able to get mono + monodevelop to work reasonably well on linux.
> limitations in drawing to the screen in an arbitrary fashion. What
NeutronLib for graphviz-like graphs.
> other cross-platform dev environments would people recommend?
Scriptable languages, mostly.
> Something that can also target mobile devices would be nice too.
Meredith mentioned a python lib for mobile platforms the other day. Yay.
> Lastly, I would love to see something like a Wikipedia for genes and
> proteins. As it is, it's often a real pain to learn about how to
WikiProteins
http://www.wikiprofessional.info/
> express and purify a given protein. Even learning what a certain
> protein does can often be annoyingly difficult. It would be great if
Wikipedia, and other wikis, provide a template for linking from an
article on a protein to the different databases, which capture
different information, like which network they react in, and so on--
but it's interesting because unlike big macro stuff, these molecular
reactions aren't what you think they are, they don't do what you think
they do, just because it has some random name doesn't really tell you
much, except to help spark a few neurons in associative memory, which
may or may not pull back correctly remembered facts.
> someone could establish a central area where people could comment on a
> given gene, talk about known and putative functions, share tidbits
> about how to purify it, quirks about its activity etc. I would have
An annotation service for "quirk tracking" would be nice to have, that's true.
> killed in the past for something that would have told me that such n
> such protein is soluble at high levels except with pH buffer X present
> and things like that.
Yes, there is a lot of information that is not being captured, and I
don't know how to begin to even propose capturing that sort of
off-hand information. It's not even necessarily consistent from one
protein to the next- what, are you going to read a book of special
facts about a single protein just to see if something sounds similar
to a problem you're having? (It's easy to find things that are
explicitly defined, but much of this information is left implicit,
which expert agent systems aren't going to be good at hooking up in a
reasonable amount of time).
Yeah, I'd recommend python, perl, C/C++, maybe ruby on rails.
> What sort of dev platform would people recommend? I've got a fair
> amount of experience with .Net but I'd hesitate to recommend it
> because of the Windows restrictions (although there is an open source
I would hesitate to use .NET (by which I mean avoid like the plague)
because it's tied to a vendor. Insert "look at the shine on those
manacles!" quote. I would recommend - in order of increasing
preference - Perl, Python or Lisp. The only problem with Lisp is that
so few people know it.
> Something that can also target mobile devices would be nice too.
Depending on what you're wanting to do with the mobile device, the
best option may be to build a mobile-centric web front end.
-Dan
There are certain tricks to playing around with Windows binary files-
this is sometimes how people are able to bypass security keys, or
other things to lock or unlock users. So, if you post some links over
to executables, we may be able to see what's up.
Are you sure that doing something like this is morally and ethically
justifiable? (I'm pretty sure it's illegal so I'm not even asking
that one.) From a practical standpoint, code isn't necessary.
Executables can be reverse engineered pretty easily. Particularly if
you've got a working license file handy...
Not that I'm advocating doing such a thing or saying I would do it.
What you're discussing is illegal, and I would never condone illegal
activity.
-Dan
I'm looking over the invitrogen link there (v11, versus Tom's v10
link), and I'm seeing some of the features:
"""
Clone2Seq™ - greatly simplified workflow for 2-fragment
restriction-ligation cloning, with all the power of our renowned
graphical map creation and lineage tracking
VectorSelector™ - quickly find cloning/expression vectors with
selected restriction sites, drug resistance markers, promoters,
purification/expression tags, and other key features
ReGENerator™ - design optimized expression constructs, with any
protein mutation you want, and we’ll make the DNA for you!
"""
And the more lengthy descriptions:
"""
Clone2Seq™
Clone2Seq™ is a greatly simplified workflow and interface for the
rapid cloning of two restriction fragments. It reduces the current
workflow from five separate interfaces to one, with dramatically fewer
mouse-clicks needed to clone two fragments with compatible ends.
Clone2Seq™ is designed for those who already know how they wish to
recombine two restriction fragments, e.g., cloning a BamHI-EcoRI
insert into an appropriately digested vector. The interface makes it
easy to select molecules and fragments for cloning, to modify fragment
ends for compatibility (if necessary), and to create the desired
recombinant, whether circular or linear. Despite the simplicity of the
workflow, Clone2Seq™ retains all the power of the cloning
functionality in Vector NTI Advance™, including our renowned graphical
map creation and parent-descendant lineage tracking.
VectorSelector™
VectorSelector™ is a completely new interface to help you quickly find
the right cloning and/or expression vectors with desired features. You
can search any subset in your Local Database for vectors with a large
number of attributes, for example: with one or two different
restriction sites, with annotated coding DNA sequence (CDS) features
that confer drug resistance, by linear or circular form, having
specific attB sites that are used in Gateway® cloning. Results are
captured in a spreadsheet-like format, and any group of results can be
saved to a subset in the Local Database. Any individual search result
can be opened in the Molecule Viewer or even sent to Clone2Seq™ for
use in a rapid cloning experiment.
Gene Synthesis with ReGENerator™
As the accuracy of de novo DNA synthesis and sequencing have
increased, and as the costs have decreased, designing your desired
gene by building it from the ground up has become much more feasible.
Indeed, this kind of “cloning” really shows its strength when the goal
is to create expression constructs with a defined set of mutations: it
may be significantly faster, quicker, and cheaper, to start with your
desired protein sequence, mutate it as needed, attach whatever
flanking sequences are required for propagation, expression,
selection, purification and detection, and then synthesize your target
construct chemically from individual nucleotide bases. ReGENerator™
allows you to design such specific DNAs in silico. Simply start with
your protein sequence, and if required, mutate it by substituting,
adding or deleting amino acids by simply typing in new residues. Any
number and type of mutation can be made in this step. Choose a
codon-usage table that best reflects your experimental expression
system, and ReGENerator™ will calculate a DNA sequence that encodes
your desired protein. You can add any number of flanking sequences to
the 5’ and 3’ ends of the newly-created DNA—such as restriction sites,
Gateway cloning sites, and expression or purification tags—then send
the designed sequence electronically to the secure servers at
Invitrogen’s gene synthesis partner, Blue Heron® Bio. Blue Heron® will
then synthesize your DNA, often in less than two weeks.
"""
Anyway, some of this sounds simple, some of this sounds confusing
(simply because I've never used the software before). What would
really help out on the free software front of diybio is if people take
5 minutes today to describe one proprietary software package, or one
imaginary software package, and what it does or what it doesn't do
that they would like it to do, the set of features that they would
like to have, and so on, and then software people can see what they
can do :-) either by finding already existing software packages, or
maybe even picking up a few of the projects. In my case, I need to go
do some more reading on (traditional/manual) vector design.
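
As one example of the "sounds simple" end of that list: the core of the ReGENerator description quoted above (mutate a protein sequence, then pick codons from a usage table) can be sketched in a few lines. This is a toy under stated assumptions, not how the commercial tool works. `PREFERRED_CODON` is a tiny placeholder table covering only a few residues, and its codon choices are illustrative rather than measured usage data:

```python
# Placeholder table: one "preferred" codon per amino acid, "*" for stop.
# The choices are illustrative only; a real table comes from the
# codon-usage statistics of the expression host.
PREFERRED_CODON = {
    "M": "ATG", "A": "GCG", "K": "AAA", "L": "CTG", "*": "TAA",
}


def substitute(protein: str, position: int, new_residue: str) -> str:
    """Replace the residue at a 1-based position with a new residue."""
    index = position - 1
    if not 0 <= index < len(protein):
        raise IndexError(f"position {position} is outside the sequence")
    return protein[:index] + new_residue + protein[index + 1:]


def back_translate(protein: str, table: dict = PREFERRED_CODON) -> str:
    """Turn a protein sequence into DNA using one codon per amino acid."""
    residues = protein.upper()
    missing = sorted({aa for aa in residues if aa not in table})
    if missing:
        raise KeyError(f"no codon listed for: {', '.join(missing)}")
    return "".join(table[aa] for aa in residues)


if __name__ == "__main__":
    wild_type = "MAK*"
    mutant = substitute(wild_type, 2, "L")   # A2L substitution
    print(back_translate(wild_type))         # ATGGCGAAATAA
    print(back_translate(mutant))            # ATGCTGAAATAA
```

Always taking the single most common codon is the crudest possible strategy; flanking sequences, restriction sites to avoid, and so on would all sit on top of this.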
- Bryan
Hi Julie,
Can you expand on this a bit? I'm trying to organize a first year iGEM
team at the University of Victoria. My background is in computer
science rather than biology, so unless we find some strong biology
advisers I'm going to be pushing people to work on a software project
this first year. My ideal scenario would be to hook up with an iGEM
team with strong wetwork skills and have us build the software out
while replicating their work on the wetwork side to gain experience in
that area. Any good candidates for me?
Thanks!
--Derek
On Feb 24, 1:40 pm, Julie Norville <julie.e.norvi...@gmail.com> wrote:
> I think ars synthetica would be interested in engaging with the DIYbio
> community and hosting/posting your materials or webtools, or interviews from
> the community (you may think about posting any DIYbiosafety materials
>
> you may also want to join iGEM teams (for example Chris Anderson's team in
> Berkeley or ask Randy Rettberg if you can create a Registry development iGEM
> team) or create your own software iGEM team
>
> many iGEM teams would like software tools but do not have the knowledge base
> to create them on their own
> --so creating a web site that could allow iGEM teams or Bio researchers to
> connect with DIYers and suggest needed tools and collaborate with the DIYers
> might be the best way to do the most cutting edge research (and also a great
> way to join an iGEM team in a low cost fashion since you might only need to
> pay your registration at the Jamboree rather than raising the costs to pay
> for a team registration)
>
Shit, I'd be interested in working on that. (I wrote an in-house app
for site-directed mutagenesis at IDT, and it's interested me since
then.)
--mlp