BioJava 3.0.2 released

Cesar A. Rodriguez

Sep 3, 2011, 10:45:35 PM

to synb...@googlegroups.com, sbol...@googlegroups.com

Heads up...

From: Andreas Prlic <and...@sdsc.edu>

Date: September 2, 2011 9:56:05 AM PDT

To: Biojava <bioj...@lists.open-bio.org>, biojava-dev <bioja...@lists.open-bio.org>

Subject: [Biojava-l] BioJava 3.0.2 released

BioJava 3.0.2 has been released and is available from
http://www.biojava.org/wiki/BioJava:Download .

BioJava 3.0.2 adds new modules and enhances the capabilities of BioJava:

- biojava3-aa-prop: This new module allows the calculation of
physicochemical and other properties of protein sequences.
- biojava3-protein-disorder: A new module for the prediction of
disordered regions in proteins. It is based on a Java implementation of
the RONN predictor.

Other noteworthy improvements:

- protein-structure: Improved handling of protein domains: Now with
better support for SCOP. New functionality for automated prediction of
protein domains, based on Protein Domain Parser.
- Improvements and bug fixes in several modules.

Currently, up to 8 different people are making commits per month. This
gives an indication of how actively BioJava is being developed. The two new
modules are based on the work of Ah Fu (Chuan Hock Koh) and Peter
Troshin, which happened around this year's Google Summer of Code.
Thanks to everybody who made this new release possible!

About BioJava:

BioJava is a mature open-source project that provides a framework for
processing of biological data. BioJava contains powerful analysis and
statistical routines, tools for parsing common file formats, and
packages for manipulating sequences and 3D structures. It enables
rapid bioinformatics application development in the Java programming
language.

Happy BioJava-ing,

Andreas

Chris J. Myers

Sep 4, 2011, 2:34:25 AM

to synb...@googlegroups.com

Not sure we need biojava. Seems pretty orthogonal.

Chris

Matthew Pocock

Sep 4, 2011, 7:33:57 AM

to synb...@googlegroups.com

BioJava is a mature library that covers much of what you need in bioinformatics, particularly sequence-based bioinformatics. We founded it in 1998, and it's been worked on continuously since then by an ever-changing team of developers. There's little point trying to compete with it for breadth or depth of functionality. You'd be surprised how hard it is to get even seemingly trivial things like reverse-complement and translation bug-free, or to process a seemingly trivial file format like FASTA correctly and without serious memory issues.

From my point of view, the best way for BioJava and the Java SBOL classes to interact is for SBOL to be light-weight enough that it's easy to implement it backed by BioJava sequence objects. This doesn't mean that the APIs need to be identical, as they are intended for different uses. There's probably scope for either BioJava or SBOL to maintain a light-weight bridge module for this mapping, so that BioJava data can be exposed via SBOL and data loaded through SBOL can be fed into all of the tools provided by BioJava.
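To make the reverse-complement point concrete, here is a minimal sketch in plain Java. It is not BioJava's implementation, and the names (`ReverseComplementSketch`, `complement`, `reverseComplement`) are made up for illustration. It shows two details that a naive A/C/G/T swap tends to miss: IUPAC ambiguity codes and preserving the case of each base. RNA (`U`) and protein alphabets are deliberately left out, and unknown characters are rejected rather than silently passed through, which is an assumption about what a caller would want.

```java
/** Illustrative sketch only; names are hypothetical and this is not BioJava code. */
final class ReverseComplementSketch {

    // IUPAC DNA codes and, at the same index, their complements.
    // Ambiguity codes complement to other ambiguity codes (e.g. R = A/G -> Y = C/T).
    private static final String CODES       = "ACGTRYSWKMBDHVN";
    private static final String COMPLEMENTS = "TGCAYRSWMKVHDBN";

    /** Complements one base, keeping upper/lower case and the gap character '-'. */
    static char complement(char base) {
        if (base == '-') {
            return base; // alignment gap: nothing to complement
        }
        int i = CODES.indexOf(Character.toUpperCase(base));
        if (i < 0) {
            throw new IllegalArgumentException(
                "Not an IUPAC DNA code: '" + base + "'");
        }
        char c = COMPLEMENTS.charAt(i);
        return Character.isLowerCase(base) ? Character.toLowerCase(c) : c;
    }

    /** Walks the sequence from the last base to the first, complementing each base. */
    static String reverseComplement(String seq) {
        StringBuilder out = new StringBuilder(seq.length());
        for (int i = seq.length() - 1; i >= 0; i--) {
            out.append(complement(seq.charAt(i)));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(reverseComplement("ATGC"));
        System.out.println(reverseComplement("aacG"));
        System.out.println(reverseComplement("ARN"));
    }
}
```

With this sketch, `reverseComplement("ATGC")` gives `GCAT` and `reverseComplement("ARN")` gives `NYT`. Even at this size there are decisions to make (gaps, case, invalid symbols), which is part of why reusing a well-tested library is attractive.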

Matthew

Cesar A. Rodriguez

Sep 4, 2011, 8:28:18 AM

to synb...@googlegroups.com

Chris,

The Data Access Web Service, which is the RESTful API of the Electronic Datasheets, is built on BioJava:

http://www.biofab.org/data/docs/daws

http://biofab.jbei.org/services/data/constructs?id=pFAB314&format=genbank

http://biofab.jbei.org/services/data/constructs?id=pFAB314&format=insd

GenBank is the reigning standard for serializing annotated DNA sequences. BioJava does a wonderful job of parsing and serializing GenBank, INSD, and FASTA. It's a critical library for my projects.

Cesar

Chris J. Myers

Sep 4, 2011, 9:09:38 AM

to synb...@googlegroups.com

Ok, was not sure how we planned to use it as the current core model doesn't really depend on it. I see that this may change in the future. Please also keep in mind that at some point we will want other libSBOL libraries than the Java one.

By the way, why are we now using synbiodex rather than sbol-dev? Which one is preferred? Are they the same or different? Just curious to be sure I'm using the right one.

Chris

Cesar A. Rodriguez

Sep 4, 2011, 9:15:50 AM

to synb...@googlegroups.com

Chris,

SynBioDex is for general announcements related to Synthetic Biology Data Exchange. SBOL-dev is for detailed discussions related to developing the SBOL specification and libSBOL.

Cesar

Chris J. Myers

Sep 4, 2011, 9:34:29 AM

to synb...@googlegroups.com

Are the groups currently the same? Should this question have been to sbol-dev? :-)

Chris

Cesar A. Rodriguez

Sep 4, 2011, 9:46:41 AM

to synb...@googlegroups.com

The groups overlap:

https://groups.google.com/forum/?hl=en#!forum/synbiodex

SynBioDex has 82 members.

Cesar

Herbert Sauro

Sep 5, 2011, 2:26:39 AM

to synb...@googlegroups.com

A couple of points about BioJava which I think just mirror what Chris wrote.

1. If some bioinformatic functionality exists in biojava, libsbol could exploit that. The sbol standard itself must of course be completely implementation independent.

2. We should be careful not to make things too java centric in our minds because at some point in the future we must develop a cross-language library. I am hoping if we get a proper sbol funding stream a general cross-language library can be written.

The last point raises the question: is there a cross-language library that has similar capabilities to BioJava? Isn't there a whole family of bioX libraries?

Herbert

Deepak Chandran

Sep 5, 2011, 2:35:59 AM

to synb...@googlegroups.com

Here's the foundation that handles BioJava, BioPython, BioPerl, BioRuby, etc.:

http://www.open-bio.org/

I don't think all these bio packages are the same. They are
co-ordinated efforts, but I don't think they branch from the same
code. I might be wrong.

--
Deepak

Matthew Pocock

Sep 5, 2011, 8:54:31 AM

to synb...@googlegroups.com

On 5 September 2011 07:35, Deepak Chandran <dee...@u.washington.edu> wrote:

> Here's the foundation that handles BioJava, BioPython, BioPerl, BioRuby, etc.:
>
> http://www.open-bio.org/
>
> I don't think all these bio packages are the same. They are
> co-ordinated efforts, but I don't think they branch from the same
> code. I might be wrong.

They are developed independently and with freedom to diverge. However, the developers meet up regularly, and they try to stay compatible where there's no reason not to be, but diverge where either the user base or the facilities provided by the language make that sensible. The core sequence object models of all the projects are kept fairly well in sync, and are probably closer to one another than SBOL:Core is to any of them. I have no idea what the status of BioCORBA, BioXML and BioSQL is, but these three were set up to provide interoperability between all the bio* projects, so they may be a reasonable approximation to a lowest-common-denominator API.

Matthew

Herbert Sauro

Sep 5, 2011, 11:28:53 PM

to synb...@googlegroups.com

Matthew, have you come across in your travels a C/C++ bioinformatic library?

Herbert

Cesar A. Rodriguez

Sep 6, 2011, 3:11:25 AM

to synb...@googlegroups.com, sbol...@googlegroups.com

Herbert,

In a quick search for C++ bioinformatics libraries that I did a couple of weeks ago, I found the following projects:

http://genometools.org/
http://www.seqan.de/
http://www.ncbi.nlm.nih.gov/IEB/ToolBox/CPP_DOC/
http://code.google.com/p/bioseqlib/
http://biostar.stackexchange.com/questions/1516/c-c-libraries-for-bioinformatics

Cesar

Matthew Pocock

Sep 6, 2011, 7:26:11 AM

to synb...@googlegroups.com

No, sorry. I don't tend to touch C/C++. There is, of course, the EMBOSS project, which is written in C, but I'm not sure how re-usable it is.

Matthew

Deepak Chandran

Sep 6, 2011, 10:44:34 AM

to synb...@googlegroups.com

From my observation, bioinformatics has stayed away from C/C++, except
for command-line tools such as EMBOSS or GUI tools. It might be due to
the importance of strings in bioinformatics, and C/C++ strings are
terrible, although there are C/C++ libraries with good string classes.
Also, a lot of the bioinformaticians that I have met are
biologists-turned-programmers, and those types of people always prefer
scripting languages, e.g. Perl and Python.

--
Deepak
