MIAPA: NeXML --> ISAtab Update

Elliott Hauser

Jun 11, 2012, 10:33:48 PM
to Phyloinformatics Group, miapa-...@googlegroups.com, nexml-...@lists.sourceforge.net
Dear Phyloinfo, MIAPA, & NeXML peeps

Below is a summary of my progress this past week. Please stay tuned for calls for feedback on deliverables; your input will be greatly appreciated.

Best,

Elliott

Completed:
  • Updated documentation & plan
  • With generous help from Philippe, arrived at a draft ISA configuration
  • Generated automated element counts from supertreebase (thanks, Rutger), but I hit an error before getting through all several thousand files.  Working on a low-tech workaround, which is essentially just breaking them into smaller batches.
    • Issue: These element counts seem to indicate that no (or practically no) TreeBASE files use tb:analysis and related markers the way the example file I've been working with does.  Fix will be to use those markers as guides where they're present, and otherwise fall back to a default treatment.
    • Issue: A significant number of files use multiple <trees> and <characters> elements, which I didn't expect.  Fix will be to assign these intelligently to the correct assays.  Will seek mentor input on this.
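
For anyone who wants to try something similar, the batching workaround and the multiple-block check could be sketched roughly as below. This is only an illustration, not the script actually used: the function names (`count_elements`, `batches`, `tally`), the batch size, and the idea of passing documents as a name-to-text mapping are all assumptions. It counts element names only and does not inspect attributes or meta annotations.

```python
import xml.etree.ElementTree as ET
from collections import Counter
from itertools import islice


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def count_elements(xml_text):
    """Count the elements in one NeXML document by local tag name."""
    root = ET.fromstring(xml_text)
    return Counter(local_name(el.tag) for el in root.iter())


def batches(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def tally(documents, batch_size=100):
    """Tally counts batch by batch; record unparsable files instead of aborting.

    `documents` is a hypothetical mapping of file name -> XML text.
    """
    totals, multi_block, failed = Counter(), [], []
    for chunk in batches(documents.items(), batch_size):
        for name, xml_text in chunk:
            try:
                counts = count_elements(xml_text)
            except ET.ParseError:
                failed.append(name)
                continue
            totals.update(counts)
            # Flag files with more than one <trees> or <characters> block.
            if counts["trees"] > 1 or counts["characters"] > 1:
                multi_block.append(name)
    return totals, multi_block, failed
```

Recording failures per file, rather than letting one bad file stop the whole run, makes it easier to see which records need a manual look.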

Didn't do:
  • Wasn't able to generate valid ISAtab based on Philippe's config.  

To do:
  • Produce final table of supertreebase element counts
  • Document & release ISAconfig for comments
    • Generate ISAtab (valid or not) from tree base-record.xml for visual example
    • Include updated diagram 
  • Investigate Emily MacTavish's proposed phylotastic ontology as MIAPA draft

Jim Leebens-Mack

Jun 11, 2012, 11:02:05 PM
to Elliott Hauser, Phyloinformatics Group, miapa-...@googlegroups.com, nexml-...@lists.sourceforge.net
Hi Elliott,

Thanks for keeping us posted on your continued progress!  

Emily MacTavish's proposed phylotastic ontology looks like a great start.  Are analysis concepts going to be added to the ontology?  Such concepts could be taken from Phylont (http://bioportal.bioontology.org/ontologies/47473) and CDAO (http://www.evolutionaryontology.org/).

Bests,
Jim     

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Jim Leebens-Mack
Department of Plant Biology
University of Georgia
Athens, GA 30602-7271

Phone: 706-583-5573
Fax: 706-542-1805
email: jleebe...@plantbio.uga.edu
url: http://www.plantbio.uga.edu/~jleebensmack/JLMmain.html 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

Elliott Hauser

Jun 11, 2012, 11:47:03 PM
to Jim Leebens-Mack, Phyloinformatics Group, miapa-...@googlegroups.com, nexml-...@lists.sourceforge.net
Hey Jim. 

Thanks for the response. 

By analysis concepts do you mean things like software, algorithm types, etc.?  If so, that's definitely on the radar.  I'm trying to use the draft checklist as a guide ([1] below for those who don't have it). It includes things like alignment & tree inference methods, with software etc. subordinate to these. 

TreeBASE already explicitly identifies analyses, analysis steps, inputs, outputs, software, algorithms, etc. A problem I've run into, noted below, is that a majority of the NeXML files I have from Rutger's supertreebase data dump do not use these attributes. I need to examine several files manually to see whether this data is stored in a different way; I haven't found evidence of this yet.  I was expecting 'dirty' data, though, so building support for these crucial data points, even if they aren't in many of the files we're using, may be the best I can do this summer.  

Phylont I wasn't familiar with but it looks promising.  I'll be looking to Hilmar and the other mentors for guidance here & regarding ontologies more broadly, as this is also a crucial question for the ISA config and I'm not in a position to decide when CDAO vs OBI would be appropriate.  There will be a tradeoff between expressivity and complexity in choosing how many/which ontologies to support.  

I'll be putting out targeted calls soon, but do you have any comments or recommendations to share, in ontologies or otherwise?  You're also welcome to join a call at some point if that's convenient. I may try to schedule a discussion/feedback call in the future if there's a critical mass of interested parties. 

Best,
Elliott


[1] MIAPA draft as it stands after the October workshop:


Sent from the ePad

Rutger Vos

Jun 12, 2012, 3:38:44 AM
to Elliott Hauser, Jim Leebens-Mack, Phyloinformatics Group, miapa-...@googlegroups.com, nexml-...@lists.sourceforge.net
Hi Elliott,

Yup, I'm not so sure the code to generate analysis metadata ever made it to the production server. I think for now you will have to go with the sample file until I can squeeze in some time for TreeBASE hacking.

Rutger


--
Dr. Rutger A. Vos
Bioinformaticist
NCB Naturalis
Visiting address: Office A109, Einsteinweg 2, 2333 CC, Leiden, the Netherlands
Mailing address: Postbus 9517, 2300 RA, Leiden, the Netherlands
http://rutgervos.blogspot.com