The "pitch" - ABI Development: Towards a comprehensive, community-owned and sustainable repository of reusable phylogenetic knowledge - now in the googledocs document looks good. I think this would be a very ambitious project, but there is clearly a need for resources to improve access to phylogenetic knowledge.
I am a bit concerned about the tight connection between TB and ToLWeb that is outlined in the pitch. Are folks intending to re-engineer BOTH TB and ToLWeb? In my mind, ToLWeb would be just one of many platforms from which folks may want to delve into the phylogenetic knowledge that could be accessed in TreeBASE. I understand the desire to use ToLWeb as resource for integrating data in TreeBASE relating to species relationships, but wonder if this connection could be better presented as an example or test-case for interoperability with TB rather than a primary feature of the proposal.
Along these lines, I would think the AVAToL panel/ideas lab participants would be very enthusiastic about development of ToLWeb as an integrator of phylogenetic trees/knowledge obtained from TreeBASE or any other source. Let's hope some of us have a chance to discuss this in August!
Karen,
Will you be sending the pitch to Reed and asking for feedback? I suspect he will want to talk about some of the technical points being discussed in the googledoc in order to be certain that this is an ABI development proposal. My sense from our last discussion is that development project proposals should include pretty watertight plans for software/database engineering and testing.
Bests,
Jim
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Jim Leebens-Mack
Department of Plant Biology
University of Georgia
Athens, GA 30602-7271
Phone: 706-583-5573
Fax: 706-542-1805
email: jleebe...@plantbio.uga.edu
url: http://www.plantbio.uga.edu/~jleebensmack/JLMmain.html
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
----- Original Message -----
From: Karen Cranston [mailto:karen.c...@nescent.org]
To: Arlin Stoltzfus [mailto:ar...@umd.edu]
Cc: MIAPA [mailto:miapa-...@googlegroups.com], phy...@googlegroups.com, TreeBASE devel [mailto:Treebas...@lists.sourceforge.net]
Sent: Mon, 06 Jun 2011 10:04:48 -0400
Subject: Re: ABI proposal for phyloinformatics
> There are several pitches now in the Google doc, with a fair bit of
> overlap between them. I am willing to consolidate into a single page
> and send to NSF (Reed?) and see what he has to say about the various
> components. It seems like these components are:
> 1. some level of re-engineering of TreeBASE
> 2. further development of MIAPA, with annotation tools and TreeBASE
> integration
> 3. use of ToLWeb as a crowd sourcing and data synthesis platform
> 4. NeXML refinement and development
>
> I don't think this one-pager needs to capture all of the ideas and
> details we currently have, but instead give a general sense of what we
> are proposing and whether all or some of these ideas are potentially
> fundable.
>
> Everyone in agreement? I will post the single page in the doc later today.
>
> Karen
>
> On Fri, Jun 3, 2011 at 3:38 PM, Arlin Stoltzfus <ar...@umd.edu> wrote:
> > Today is the deadline for our 1-page synopsis to pitch to an NSF program
> > officer (before going further). Currently we seem to have 3 pitches. It
> > is time now for some energetic person to consolidate this, so that we can
> > move ahead.
> >
> > Arlin
> >
> > On May 31, 2011, at 12:19 PM, Karen Cranston wrote:
> >
> >> Tomorrow morning (Wed, June 1) looks to be good for everyone, and
> >> sooner seems better than later. I propose we talk at 9:00 am EST. I
> >> will send connection information later today.
> >>
> >> Cheers,
> >> Karen
> >>
> >> On Thu, May 26, 2011 at 3:00 PM, Karen Cranston
> >> <karen.c...@nescent.org> wrote:
> >>>
> >>> There has been some interest among various groups in an ABI proposal
> >>> for development of phyloinformatics resources. This email is an
> >>> attempt to connect those threads and move the process forward. The
> >>> conversations that have been happening up to this point are:
> >>>
> >>> 1. The Phyloinformatics Research Foundation (phylofoundation.org,
> >>> stewards of TreeBASE and ToLWeb) started a Google doc aimed at
> >>> TreeBASE
> >>> 2. MIAPA developers started a wiki page
> >>> (https://www.nescent.org/sites/evoio/NSF_ABI_2011), recognizing the
> >>> need for coordination with TreeBASE and other resources
> >>> 3. NESCent (Todd, Hilmar and myself), as the current TreeBASE host and
> >>> as a third party interested in coordinated development across
> >>> resources started a third document (now added to the already mentioned
> >>> Google doc)
> >>>
> >>> If you are interested in this discussion and do not already have
> >>> access to the Google doc entitled TreeBASE_ABI.doc, let me know and I
> >>> can grant you access. Hilmar and I made some substantial edits earlier
> >>> this morning. I point you specifically to the section at the end
> >>> entitled "An attempt to re-think all of this". Briefly, we wanted to
> >>> encourage some radical thinking and explore the idea of developing a
> >>> PhyloCommons that incorporates both TreeBASE and ToLWeb into the
> >>> proposal (as the data repository and the data sharing / dissemination
> >>> / synthesis platform, respectively).
> >>>
> >>> The ABI deadline is July 7, so we have a short period of time to pull
> >>> this together. Here is a link to a Doodle poll for an initial
> >>> teleconference.
> >>>
> >>> http://doodle.com/zf2tz7sftyk3naxy
> >>>
> >>> During this meeting, we hope to come to agreement on the broad
> >>> direction of the grant, identify possible leaders of the various
> >>> components and create a plan for getting this pulled together in time
> >>> for the deadline. Please feel free to continue the conversation on the
> >>> Google doc between now and the teleconference. If there are others who
> >>> you think should be invited, feel free to do so. Not everyone who
> >>> participates in this first phase will end up being named on the grant,
> >>> but these resources require input from a much larger group.
> >>>
> >>> Cheers,
> >>> Karen
> >>>
> >>>
> >>> --
> >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>> Karen Cranston
> >>> Training Coordinator and Informatics Project Manager
> >>> nescent.org
> >>> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >>>
> >>
> >>
> >>
> >>
> >> --
> >> You received this message because you are subscribed to the Google
> >> Groups "MIAPA" group.
> >> For more options, visit this group at
> >> http://groups.google.com/group/miapa-discuss?hl=en
> >
> > -------
> > Arlin Stoltzfus (ar...@umd.edu)
> > Fellow, IBBR; Adj. Assoc. Prof., UMCP; Research Biologist, NIST
> > IBBR, 9600 Gudelsky Drive, Rockville, MD
> > tel: 240 314 6208; web: www.molevol.org
> >
> >
>
>
>
>
>
On Jun 7, 2011, at 1:30 PM, Jim Leebens-Mack wrote:
> I am a bit concerned about the tight connection between TB and
> ToLWeb that is outlined in the pitch. Are folks intending to re-
> engineer BOTH TB and ToLWeb? In my mind, ToLWeb would be just one
> of many platforms from which folks may want to delve into the
> phylogenetic knowledge that could be accessed in TreeBASE.
Good points, and indeed what I had in mind technically(*). The way I
am envisioning this to be implemented is indeed using technologies
(HTTP/REST APIs, canonical resolvable identifiers, RDF) that allow
very loose coupling. I left that out as I thought the tech soup
shouldn't be in there, but I agree what's missing now is the notion
that this will be achieved through loose coupling.
So as for reengineering, the idea is to engineer (or reengineer where
that's necessary) components for *both* systems that allow that loose
coupling in a way that achieves the stated goals. Components that need
not change to achieve this would not be touched.
-hilmar
(*) Socially (as opposed to technically), I think ToLWeb can, and
should, play a much more important role in enhancing TreeBASE content
in the sense of turning data into knowledge than other platforms that
we would enable here. Perhaps a useful analogy to think about is
GenBank as the raw sequence data repository and NCBI Gene (and its
predecessor LocusLink) as well as RefSeq as resources that attempt to
turn this into curated knowledge.
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org :
===========================================================
I am putting together two pitches of ~1 page each to send to NSF. One
is the grand ToLWeb + TreeBASE version, and the second is the TreeBASE
and MIAPA-focused ideas that you, Bill and Rutger submitted (I am
merging these into a single doc). This way, we can get a sense of
which version is more likely to be viewed favourably by the panel and
program officers. Working madly, hoping to send this ASAP.
Karen
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Karen Cranston, PhD
On Tue, Jun 7, 2011 at 1:53 PM, Karen Cranston
But clearly we want something tighter than just consuming ToLWeb's XML -- and we want ToLWeb to benefit as much as TreeBASE. But in that case, I think some sort of tap on David Maddison and Andrew Lenards's shoulder is needed (David is on the phylorf mailing list, but is he following this?). I gather that ToLWeb code is not yet Open Source...
bp
On Jun 7, 2011, at 1:48 PM, Hilmar Lapp wrote:
> Jim:
>
> On Jun 7, 2011, at 1:30 PM, Jim Leebens-Mack wrote:
>
>> I am a bit concerned about the tight connection between TB and ToLWeb that is outlined in the pitch. Are folks intending to re-engineer BOTH TB and ToLWeb? In my mind, ToLWeb would be just one of many platforms from which folks may want to delve into the phylogenetic knowledge that could be accessed in TreeBASE.
I will ping David Maddison about this discussion.
Karen
--
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Karen Cranston, PhD
Having said that, I'm pretty sure from the analysis of literature
(that Brian and I are doing) that re-use of large species trees is an
important use-case. In a sample of 40 recent papers that hit
"phylogen*" in the title or topic (obviously not a random sample, but
we wanted to find the folks who focus on trees), we found 5 that use
phylomatic or APG trees, and 1 that uses the animal supertree from
Bininda-Emonds.
APG and phylomatic appear to be leaving ToLWeb and TreeBASE in the
dust, in terms of scientific re-use. Whatever they are doing to bring
trees to users, we should be doing.
Arlin
http://www.doodle.com/8zvwbidtxm9gzxcp
To make sure that we can have a relatively targeted discussion, my
suggestion would be that everyone who is willing to play a role in
this proposal enter their availability, and come prepared for the
following questions:
1. What aims would a proposal need to have for you to commit to be
part of it, and conversely, what aims should it not have? (Ideally,
the aims would be from either pitch A or pitch B that Karen sent to
NSF for feedback.)
2. What aims, expertise, and partners are we missing from the group?
Do you have suggestions for how to pull those in?
3. What role are you interested in playing, and for which aim(s)? What
kind of resources, and how many, do you anticipate needing support for
to accomplish those aims?
At the end of this, ideally we have a concrete sense for whether there
are 0, 1, or 2 proposals that are viably going to come together, what
size of proposal(s) we are talking about, who would take
responsibility for what, and who else we need to reach out to.
Comments / suggestions / additional items for the enumeration above
welcome.
-hilmar
Summary:
1. Making the MIAPA component into a separate Innovation proposal is
probably a good idea.
2. The TreeBASE / ToLWeb piece is well-suited for a Development
proposal, and we can discuss MIAPA in this proposal as long as we have
a concrete contingency plan for the possibility that this gets funded
and the MIAPA proposal does not.
3. There is no general rule about incremental improvement vs major
re-engineering, but the goals of the proposal must be novel in some
way and have intellectual merit. A re-engineering proposal could be
computationally novel, while a proposal with only incremental
improvements must instead have novel interface components or strong
biological motivations.
4. There seems to be an empty niche for proposals that include novel
front-end as well as back-end development, but we need to make sure we
have the appropriate expertise for the former.
5. She suggests sharing the draft with someone from BIO (perhaps
Maureen Kearney) to get the user community perspective
Please fill out the doodle poll so that we can plan the next course of action!
Cheers,
Karen
--
Talk to you tomorrow,
Karen
--
https://docs.google.com/document/d/16bno1sB3gBHHnew5TnoCLawScuoydG-i5LCPcB30OZY/edit?hl=en_US
The focus of this, as currently conceived, is to combine problem-solving with development of a draft standard. The problem-solving attempts to address relevant user needs (e.g., helping users to create a properly formatted and annotated archive submission). This way, we will be developing technology support at the same time as the draft standard (which, ideally, will encourage the broader community to try it out and work with us).
If you are interested, please take a look at the proposal, help us to identify problems to address and possible strategies to address them by leveraging available technologies and resources. Those who are interested will need to solidify partnerships as soon as possible, as there is only a month left to formulate the plan and write the proposal.
Arlin
________________________________________
From: miapa-...@googlegroups.com [miapa-...@googlegroups.com] On Behalf Of Karen Cranston [karen.c...@nescent.org]
Sent: Thursday, June 09, 2011 12:58 PM
To: phy...@googlegroups.com; MIAPA; TreeBASE devel
Subject: Re: ABI proposal for phyloinformatics
Hilmar and I talked to Anne Maglia from NSF this morning. The notes
Cheers,
Karen
--
--
Dr. Rutger A. Vos
School of Biological Sciences
Philip Lyle Building, Level 4
University of Reading
Reading, RG6 6BX, United Kingdom
Tel: +44 (0) 118 378 7535
http://rutgervos.blogspot.com
Here are some comments, and I must say I haven't read the proposal yet, just the pitches, and the emails.
I like the general plan.
I agree with Jim that fundamentally ToLWeb should be viewed as a test case for interoperability, even if there is specific effort in the project on it. I would go so far as to say that even TreeBASE should be viewed here as a test case. The most important things that would be built in the process of doing this are the vision, understanding of needs, standards, design of the tools, and community building and inspiration, rather than the particular implementations that will come out in TreeBASE and ToLWeb. The products should be done with enough abstraction that they will serve for other cases. I always find that it is good to have two test cases, just to force one to think about alternatives at various decision points. There is a cost to that, of course, and that is the added funds that would be needed to support other test cases. But if we could have a light-weight alternative to TreeBASE as a test case at that end, and a light-weight alternative to ToLWeb to serve as a test case there, then it might be worth thinking about including a bit of effort there to force greater abstraction.
ToLWeb is "emotionally" open source. That is, it's fully available as far as I am concerned, but we haven't gone through the effort of actually making it open source. Anyone who wants the source can have it. So, I am enthusiastically in support of having a small portion of the budget devoted to the effort to push it onto SourceForge or somewhere. Ideally that would involve a bit of contract money to Andy Lenards, and possibly Danny Mandel (the programmer who preceded Andy; Danny understands more of the source, I suspect).
OK, more comments after I look at the proposal.
David
---------------------------------
David R. Maddison
Department of Zoology
3029 Cordley Hall
Oregon State University
Corvallis, OR 97331 USA
david.m...@science.oregonstate.edu
http://david.bembidion.org
http://mesquiteproject.org
http://macclade.org
http://tolweb.org
Tree visualization and interaction design of web applications are two
different things. There is a lot of community expertise in the former
(and there are other people we might involve in this, e.g. Tamara
Munzner), but not enough of the latter. People complain about TreeBASE
in part because the workflows and visual metaphors aren't informed by
good HCI design, and this is not going to be addressed by better tree
viz. People will complain less about TreeBASE if we know why facebook
is nicer than myspace and apply those principles.
Best, Rob
I agree that while there is overlap between visualization and
interface development, the pitch that we have outlined so far needs
usability and interface experts more than visualization experts.
Karen
--
~~~~~~~~~~~~~~~~~~~~~~~
karen.c...@gmail.com
~~~~~~~~~~~~~~~~~~~~~~~
Need to throw in my two cents here. I imagine these comments might
represent a slightly different pov than has come up.
Near-future grant proposals probably should not spend heavily on the
development of tree visualization tools with goals of long-term
sustainability. Instead, the next wave of proposals (I'm only talking
about viz here) should focus on a few pre-tool advancements. My idea
of what these advancements should be is based on
1) the future/now of viz is on the open-web (html5 etc).
2) the best data viz people are probably not in our community, they
are in their own or are unsuspecting tinkerers who will stumble upon
it.
3) the coolest innovations in tree viz aren't going to be wrapped into viz tools
So, with that, here is what I propose needs to happen,
1) PhyloJSON notations. XML on the web is slow and unnecessary. JSON
is quickly moving in on XML dominance. PhyloBox and other projects all
moved down their own paths for creating json phylogeny notation. This
is the normal way to do it, but our community could be developing some
common notations that we could all start using/expecting.
2) Advanced REST tree queries (TreeBASE) for strictly web-client based
consumption. What I mean here, is the use of cross-domain rest queries
to find, trim, scope, and combine trees. On top of that, wrapping the
results into the product of (1) above. This will allow web mashups and
tools to emerge that combine the data in unexpected ways.
3) Proof of concept tools and documentation. All I mean here, are
examples of how the products of (1) and (2) can be used to merge
phylogeny with external sources of data (wikipedia, eol, flickr, etc),
using the modern web, html5, css3, and javascript. This will provide
the foundations for a much larger pool of producer/consumers to do
interesting things with the data. Documentation is king. It will
enable those not at the table to understand the decisions made.
Each of those has clear integration points with treebase (especially
the meeting of 1 and 2) and other projects. More importantly, none of
them have been correctly explored with sufficient resources and
planning. On top of that, developing the three in coordination will
allow much better development of use-cases for phylogeny viz that feed
back into each of the three.
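As a rough illustration of point (1), here is a minimal sketch of turning a Newick string into a nested JSON structure that a web client could consume directly. The key names ("name", "length", "children") are only one possible convention, not an agreed PhyloJSON notation, and the parser assumes plain Newick without quoted labels or comments.

```python
import json


def newick_to_dict(newick):
    """Parse a simple Newick string into nested dicts (no quotes or comments)."""
    text = newick.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        node = {}
        if pos < len(text) and text[pos] == "(":
            pos += 1  # consume "("
            children = [parse_node()]
            while pos < len(text) and text[pos] == ",":
                pos += 1  # consume ","
                children.append(parse_node())
            if pos >= len(text) or text[pos] != ")":
                raise ValueError("unbalanced parentheses in Newick string")
            pos += 1  # consume ")"
            node["children"] = children
        # Optional label following a tip or a closed clade.
        start = pos
        while pos < len(text) and text[pos] not in ",():":
            pos += 1
        label = text[start:pos].strip()
        if label:
            node["name"] = label
        # Optional branch length introduced by ":".
        if pos < len(text) and text[pos] == ":":
            pos += 1
            start = pos
            while pos < len(text) and text[pos] not in ",()":
                pos += 1
            node["length"] = float(text[start:pos])
        return node

    root = parse_node()
    if pos != len(text):
        raise ValueError("unexpected trailing characters in Newick string")
    return root


tree = newick_to_dict("(A:1,(B:2,C:3)D:0.5);")
print(json.dumps(tree, indent=2))
```

A REST query as in point (2) could wrap its result in whatever structure the community settles on; the sketch only shows that the conversion itself is small.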
best,
a
--
Ecology and Evolutionary Biology
University of Colorado
http://biodiversity.colorado.edu
http://biodivertido.blogspot.com
I agree with Andrew, though, that the re-engineering should involve
well-designed interoperability goals so that people can re-use the
data as easily as possible in new and creative ways. But, we also need
to consider the users that will access these resources via the web
interface, not only the APIs.
Karen
--
~~~~~~~~~~~~~~~~~~~~~~~
karen.c...@gmail.com
~~~~~~~~~~~~~~~~~~~~~~~
I agree with Andrew's general observation that tree viz widgets that
consume json from public APIs (phylows?) are the way forward, and
phylobox and jsPhyloSVG are both compelling examples.
Having said that, I think that a good user experience would include
intuitive integration of the viz tool with the rest of the UI, i.e.
the opposite of the way things work now with phylowidget on TreeBASE
(on the other hand, the way it's done on ToLWeb works rather well
IMO).
> Hey all, OK so my group's main interest in participating would be on the user interface side, and improving the user experience in both TreeBASE and ToLWeb. Any re-engineering of either resource should keep in mind what people want to see and do when they arrive at a tree topology. Most people want to see the trees, right? So, among the goals should be a good way to see the trees (and tree sets) in TreeBASE and move around in ToLWeb, seeing the tree at various levels. I would call that visualization, I guess, nothing outrageous, but this kind of basic tree viewer should always be there as part of the end product. So I think we should include it at some level, certainly as part of any UI developed, even if (as I suggested earlier) we don't propose new viz frameworks and just use what we already have. Andrew's suggestions for web-readiness should make this relatively easy to do and not require that a large part of the proposal be focused on viz. I do think we should focus a chunk of the proposal on the interface, with some basic tree topology viz being a key part of that. But if that is maybe a different grant, that's totally cool with me too. Thanks, Mark
To me, the TreeBASE API issue is less about visualizing trees (although clearly we need, as Rutger says, a highly integrated and easy/powerful way to view and rummage through trees) and more about giving people the power to use all the metadata to their search advantage without having an interface that is too complicated. Metadata includes size of tree, size of matrix, kind of tree, matrix data type, taxonomic identifiers, gene names, author names, analysis methods, etc etc.
e.g.:
- find me trees that include Mustelids and Pinnipeds, but not Canids (or anything below that), and that result from molecular analyses of character sets with more than 1000 bases.
- which trees don't support monophyly of bats?
- find me matrices authored by Michael Donoghue that use morphological data and that don't deal with Viburnum.
- give me a dump of all trees that have at least 5 Elasmobranchii in them, but only one tree per "study" in a form that easily allows me to build a supertree with them (i.e. all OTUs are remapped to common set of identifiers)
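A minimal sketch of what such metadata-driven queries might look like in code, assuming each tree record already carries a pre-computed set of the named clades that contain its tips. The record fields, function names, and sample data are invented for illustration and are not the TreeBASE schema or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TreeRecord:
    tree_id: str
    study_id: str
    data_type: str      # e.g. "molecular" or "morphological" (illustrative)
    n_chars: int        # number of characters (e.g. bases) in the matrix
    clades: frozenset   # every named clade containing at least one tip


def find_trees(records, include=(), exclude=(), data_type=None, more_than_chars=None):
    """Keep records that contain all `include` clades and none of `exclude`."""
    wanted, unwanted = set(include), set(exclude)
    hits = []
    for rec in records:
        if not wanted <= rec.clades:
            continue  # a required clade is missing
        if unwanted & rec.clades:
            continue  # an excluded clade is present
        if data_type is not None and rec.data_type != data_type:
            continue
        if more_than_chars is not None and rec.n_chars <= more_than_chars:
            continue
        hits.append(rec)
    return hits


def one_per_study(records):
    """Keep only the first record encountered for each study."""
    seen, kept = set(), []
    for rec in records:
        if rec.study_id not in seen:
            seen.add(rec.study_id)
            kept.append(rec)
    return kept


# Made-up sample records, for demonstration only.
records = [
    TreeRecord("T1", "S1", "molecular", 1500, frozenset({"Mustelidae", "Pinnipedia"})),
    TreeRecord("T2", "S1", "molecular", 1500, frozenset({"Mustelidae", "Pinnipedia", "Canidae"})),
    TreeRecord("T3", "S2", "morphological", 80, frozenset({"Mustelidae", "Pinnipedia"})),
    TreeRecord("T4", "S1", "molecular", 2000, frozenset({"Mustelidae", "Pinnipedia"})),
]
hits = find_trees(records, include=["Mustelidae", "Pinnipedia"], exclude=["Canidae"],
                  data_type="molecular", more_than_chars=1000)
print([rec.tree_id for rec in one_per_study(hits)])
```

Queries such as "which trees don't support monophyly of bats" need topology-aware logic and are outside what this filter covers.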
bp
> (on the other hand, the way it's done on ToLWeb works rather well
> IMO).
There is plenty of functionality that we can build around the type of
static tree browsing interface that ToLWeb has now. If we expose the
data in consistent ways according to the standard formats that treeviz
tools want, then others can build (perhaps even incorporate?) more
exciting visualization elements.
I also agree with Bill that a more crucial interface issue is giving
users the ability to find the trees they want, using a wealth of
available metadata and with an interface designed using modern
principles of UI design and usability.
--
~~~~~~~~~~~~~~~~~~~~~~~
karen.c...@gmail.com
~~~~~~~~~~~~~~~~~~~~~~~
Having said that, small proof-of-concept viz (and other) apps can serve well to guide and validate the "platform" and API goals. But they shouldn't take a scope that would start to distract from the main focus. Perhaps a good instrument for supporting these proof-of-concept efforts are developer-engagement events (challenges, hack-days, competitions, ?).
-hilmar
Arlin
-------
It would be good to start formulating the structure of such annotated
record and perhaps move forward the creation of a PhyloWS service that at
least, as a start, provides validation (and possibly completion) of the
record.
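As a starting point for discussion, a validation step of this kind could be as small as the sketch below. The field names are placeholders I made up for illustration; they are not the MIAPA checklist, and the function is not an existing PhyloWS call.

```python
# Hypothetical required fields; the real checklist is still to be defined.
REQUIRED_FIELDS = (
    "topology",
    "otu_labels",
    "character_matrix",
    "alignment_method",
    "inference_method",
)


def validate_record(record):
    """Report which required fields are absent or empty in an annotated record."""
    missing = [name for name in REQUIRED_FIELDS if not record.get(name)]
    return {"valid": not missing, "missing": missing}


report = validate_record({"topology": "(A,B);", "otu_labels": ["A", "B"]})
print(report)
```

A service built on this idea would return the same kind of report over HTTP, and a completion step could then try to fill in the missing fields from other sources.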
Enrico
--
Dept. Computer Science,
New Mexico State University
MSC CS, Box 30001, Las Cruces, NM 88003
Voice: 575-646-6239 Fax: 575-646-1002