Fwd: [help.nescent.org #15310] Working Group: Matthew Yoder - notification from NESCent about your proposal [DECLINED]

Matt Yoder

Mar 4, 2013, 11:32:37 AM

to wg-...@googlegroups.com

Hi all,

Just got the word this morning that our working group proposal was
declined. The reviews are pasted below.

Regardless of this outcome, it would be good to keep in touch and keep
thinking about these issues; the need hasn't gone away. For example,
I just spoke with two genomics/disease researchers at the recent
Phenotype RCN meeting who both asked the questions we hoped to
address: should I use NCBI "taxonomy", how do I know what is better to
use, what other options are there, etc.

Thanks for everyone's help. Selfishly, at minimum the exercise was
extremely useful for me: I learned a lot from everyone (e.g. I'm using
neo4j now for various things), and it was a good overall learning
experience with respect to how these things are done.

Cheers,
Matt

----

Review 1
Name standardization and computation is a key to phylogenetic
interoperability and data re-use. This proposal grew out of years of
NESCent-sponsored phyloinformatics interoperability projects, which
are now hosted under an umbrella research group EvoIO (www.evoio.org).
Concrete informatics products and standards have grown out
of previous working group meetings and hackathons, including an
XML-based tree format (nexml), an ontology for phylogenetic data
(CDAO), and a workflow standard for querying phylogenetic information
through the web (PhyloWS). Unlike earlier working groups on
phylogenetic interoperability, this group would focus on a single,
more fundamental issue: taxonomic name resolution. Three working group
meetings have been planned. The first meeting aims to formulate a new
naming standard (MIRPAT) based on use cases, followed by a meeting to
implement a MIRPAT-based naming service (2nd generation TNRS). The
last meeting aims for documentation and dissemination. While this
project would enhance NESCent’s investment and mission in promoting
Open Source interoperable phylogenetic computing, my main concern is
a lack of wide adoption of new standards and practices. It would be
helpful if the authors presented 1-2 use cases that could excite
evolutionary biologists and (better) phylogenomics researchers in
general.
Rating 1 - Support

Review 2
This proposal's goal is a "unified," "next-generation" solution to the
problem of taxonomic name resolution, which they correctly describe as
arising from a broad and heterogeneous range of needs (use cases). The
authors provide a considerable amount of background and context about
TNRS developments, and the group's leaders and members have notable
experience in the relevant fields of expertise (informatics,
databases, etc.). The novelty of this proposal (at least in terms of
specific ideas) seems to boil down to something they call "MIRPAT
assertions", which appear to be provenance-backed statements about the
properties and relationships of names. It is not at all clear to me
how this will fundamentally advance the objectives of TNRS. I don't
doubt that a new graph database might help with some queries of
importance to EOL, OpenTree, etc. But the real worth of such an exercise
lies in the data that are input. There is no point in databasing 22
million name strings without the associated knowledge of how they are
linked together. Knowledge of this kind is the real limiting factor,
and technological innovations in acquiring/harvesting that knowledge
are needed more than the tools proposed for development here.
Rating 2 - Maybe

Review 3
For most groups of organisms nomenclature is not static. It changes
with some regularity. The only thing that stays the same is the
basionym of the type species. Sometimes existing taxa are moved into
different genera, sometimes they are sunk into other species and cease
to 'exist', and other times new taxa are described. The more we learn
about a clade the better we can get the nomenclature to reflect the
evolution of the group. It is complicated by the fact that there are
lots of misidentifications, especially in groups that have not been
monographed recently. So it is not a simple matter of taking a name
and looking up what genus it is in now. Each name needs some type of
estimation of how likely it is to be correct. There are many ways of
doing this. When was the most recent identification? Does it have
accompanying information (photos, publications, etc.)? Was it used as
a voucher in a publication? Does it have a 'concept'? And so on. If this
group could work out how to do this electronically it would be a
brilliant accomplishment and I would be ecstatic! It would save huge
amounts of time and be well worth the investment. Weaknesses: I liked
this proposal until I got to the end and realized I was left hanging. It
does not talk about how to implement it on a broad level. I always ask
four questions: What do you want to do? When will it be done? Who is
going to do the work? Who is going to pay for it? They have covered
these questions for this first bit, but what about scaling it up to
all names? Where are the answers to those questions? They do have one
sentence that says they will have a strategy and that it might include
an NSF proposal but not much else. Maybe this is enough. But, I was
left thinking that if they do a good job it will be much harder than
they seem to think and that finding funding for this might be
impossible. I was part of a group that tried to use a new tool to
provide a concept for every name in a large family (25,000 used names,
ca. 75,000 names in synonymy) and after 4 years we ran out of funds
and it was only 70% finished. It is now static, rapidly becoming out
of date and embarrassing to have around. I think a funding strategy
for the future has to be part of a project from the beginning and it
should include how it will be maintained.
Rating 2 - Maybe

Review 4
Resolving taxonomic names in the modern age is important and useful
for a variety of databases and integrative efforts. There are many
ongoing projects to sort out the mess of taxonomic names, but they are
not currently coordinated. Members of some of those projects have
apparently expressed enthusiasm (need) for this WG. The advantage of
this proposal is that it will pretty clearly produce a deliverable
useful to many large scale database efforts, including others
supported by NESCent. The weakness is that, unlike many other NESCent
proposals, it will not be helping to build a new field of inquiry.
There is a bimodal distribution of rank in the participants and very
few women.
Rating 1 - Support

Decision
Board Score 1.5
Final Decision 3 - Do Not Support
Summary There was broad agreement that the problem is important.
However, there was concern that without expert curation of the names
themselves, the value would be limited to just updating names to the
current versions. Furthermore, taxonomies change so fast that the
tool will quickly decay unless kept constantly up to date. However,
there is clear value in getting the players from the different efforts
in the same room to help hash out differences and come to a standard,
although past experience shows this can be hard, as each team would
like the larger group to adopt their solution. It was not entirely
clear from the participant list if all the independent efforts were
represented.

---------- Forwarded message ----------
From: NESCent Proposal Team via RT <proposal...@nescent.org>
Date: Mon, Mar 4, 2013 at 8:53 AM
Subject: [help.nescent.org #15310] Working Group: Matthew Yoder -
notification from NESCent about your proposal
To: diap...@gmail.com

Dear Dr. Matthew Yoder,

We regret to tell you that your Working Group proposal to the National
Evolutionary Synthesis Center (NESCent), Improving data integration
with a unified approach to taxonomic name resolution, has been
declined. As you might expect, we received many more proposals than we
are able to support. Your proposal generated interest and support from
the board, but weaknesses were identified as well.

The individual reviews of your proposal are available on-line at
http://nead.nescent.org. You can use the login name and password you
established when you submitted your proposal. If you have forgotten
your password or have questions about the on-line system, please send
email to our help system (propos...@nescent.org).

Feel free to contact me if you have questions about your proposal or
the review and evaluation process. To do so, please reply to this
email rather than directly to me; that will ensure that the entire
proposal team (which includes me) is copied on the message.

Thank you for your interest in NESCent. We are always happy to discuss
ways that NESCent may support exciting research in synthetic
evolutionary science, and we look forward to the possibility of
working with you in the future.

Yours,

Allen Rodrigo
Director
a.ro...@nescent.org

David Patterson

Mar 4, 2013, 11:34:14 AM

to Matt Yoder, wg-...@googlegroups.com

I am sorry about the news, but it was valuable.

I will try to ensure this group is represented in a names symposium at the next TDWG. I'd be pleased to hear from anyone who is planning to be there.

thanks

Paddy

--
You received this message because you are subscribed to the Google Groups "wg-tnrs" group.
To unsubscribe from this group and stop receiving emails from it, send an email to wg-tnrs+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

--
___________________________________
David J Patterson

Senior Scientist, Marine Biological Laboratory
7 MBL Street, Woods Hole, MASS 02543, USA.

Research Professor
School of Life Sciences, Arizona State University
Tempe, AZ 85287-4501

Professor (MBL) Ecology and Evolutionary Biology
Brown University, Providence, Rhode Island

Life Sciences Lead, Data Conservancy dataconservancy.org

globalnames.org

Hilmar Lapp

Mar 4, 2013, 12:00:52 PM

to Matt Yoder, wg-...@googlegroups.com

Sorry for the negative news! Though indeed, the process of arriving here was valuable, and the reviews seem pretty unanimous in acknowledging that the problem is important, which is quite encouraging.

-hilmar


--

===========================================================
: Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org :
===========================================================

Arlin Stoltzfus

Mar 5, 2013, 10:15:44 AM

to Hilmar Lapp, Matt Yoder, wg-...@googlegroups.com

The reviews are all over the place. The summary doesn't really reflect this. The summary should be ignored as misleading, IMHO. Only reviewer #2 raised the issue of including expert curation of names, doing so in a black-and-white way, suggesting that the "real value" of the system would reside only in curated back-end names, as though there is not an inherent value to developing a standards-based middle layer (which, I would argue, would help others to expose and address problems with back-end name-curation). Reviewer #3 clearly has not delimited or narrowed the issue of name informatics in any way. He is holding the group responsible for the entire problem.

Thus, one thing that the reviews suggest collectively is that, for us to be effective in writing a proposal for anything less than a universal system of name informatics, we need to dissect problems of name informatics and describe them separately. Then we can propose to address one aspect of the problem, without being criticized for failing to address all aspects.

One way to do this is to publish a separate opinion piece about the problems of name informatics.

Arlin

-------
Arlin Stoltzfus (ar...@umd.edu)
Fellow, IBBR; Adj. Assoc. Prof., UMCP; Research Biologist, NIST
IBBR, 9600 Gudelsky Drive, Rockville, MD, 20850
tel: 240 314 6208; web: www.molevol.org

Hilmar Lapp

Mar 5, 2013, 3:59:19 PM

to Arlin Stoltzfus, Matt Yoder, wg-...@googlegroups.com

On Mar 5, 2013, at 10:15 AM, Arlin Stoltzfus wrote:

The reviews are all over the place. The summary doesn't really reflect this. The summary should be ignored as misleading, IMHO.

This isn't to dispute Arlin's suggestion for how to possibly proceed from here, but just for clarification purposes, the Board Summary isn't a summary of the individual reviews, but a summary of the discussion the Advisory Board had when they met and discussed their opinions. Sort of like an NSF panel summary if you will. I agree though that in that respect it's surprising that the name curation seems to have taken a more prominent role in that discussion than the individual reviews would have suggested. As to why, I don't know.

-hilmar
