connection info for TNRS teleconference in <1 hour (12:00 pm EST, UTC-5)

Arlin Stoltzfus

Nov 21, 2012, 11:09:06 AM

to wg-...@googlegroups.com

Dear all--

Connection information for the upcoming teleconference is below. I would like to suggest the following agenda:

* very brief introductions (5 min) (name, most relevant project, 1-sentence version of what you hope to contribute)

* explanation of NESCent working groups (5 min)

* review of overall vision for the proposed working group (10 min)

* discussion of goals and strategy (30 min)

* plan to complete the proposal (10 min)

We'll have a chance to revise that as needed. Regards,

Arlin

-----------------------------------------------------------------------------------------------------------------------------------------------------------

Connection information*:

Local (toll) number: 1-210-301-9441
Freephone (toll free) number: 866-818-9770

Participant passcode: 7303869

Certain dialing restrictions can apply from specific countries:

https://www.mymeetings.com/audioconferencing/pdf/GlobalAccessDialingInformation.pdf

* Restrictions may exist when accessing freephone/toll free numbers using a mobile telephone. Country freephone numbers cannot be used outside of the country listed.

------------------------------------------------------------------------------------------------------------------------------------------------------------

-------
Arlin Stoltzfus (ar...@umd.edu)
Fellow, IBBR; Adj. Assoc. Prof., UMCP; Research Biologist, NIST
IBBR, 9600 Gudelsky Drive, Rockville, MD, 20850
tel: 240 314 6208; web: www.molevol.org

Cody Hinchliff

Nov 21, 2012, 11:54:22 AM

to Arlin Stoltzfus, wg-...@googlegroups.com

Unfortunately I won't be able to join this call. I did add some notes to the doc yesterday. Let me know if any thoughts come up regarding how the Open Tree project or myself in particular could contribute to the proposal or to the project itself.

--

Arlin Stoltzfus

Nov 21, 2012, 2:39:54 PM

to wg-...@googlegroups.com

I'd like to thank everyone again for a productive meeting today. The notes are below, and are pasted into the google doc.

After the meeting, Matt and I discussed how to build on what happened. We decided that we need citations and details about the use cases, so a new section of the google doc was created for that here:

https://docs.google.com/document/d/1fsVxkTtl-Q3Na5ZNNdOvV5OwDjra9V3VpLtd5xzDJZY/edit#heading=h.tgw26ce7cygz

If you are familiar with the use-case of Phylotastic, Open Tree of Life (OToL), EoL, Map of Life, or InvertNet, please add references to the literature, and help to draft a paragraph describing the use-case. There is already some content from these projects pasted into the doc, but we need to express each of these projects succinctly as a use-case for taxonomic name resolution.

Also, there needs to be some further discussion of deliverables. I hope that the discussion will happen on the email list.

I'm going on vacation now and I won't be back until Monday. I'm going to leave this process in the capable hands of Matt and others. Right now, it looks like we are on track to submit a proposal by the Dec 1 deadline.

Arlin

================================================================================================

Notes from teleconference

present: Dmitry Mozzherin, David Shorthouse, Dave Nicolson, Paddy Patterson, Matt Yoder, Tom Orrell, Cyndy Parr, Arlin Stoltzfus, Hilmar Lapp, Naim Matasci, Gaurav Vaidya

1. Introductions

2. The nature of NESCent working groups

* 3 or 4 meetings of 10 to 12 people under NESCent support

* aim for "scholarly products"

* examples include knowledge, grant proposals, proof-of-concept, integration, white papers (novel), advocacy

3. Vision for this proposal. There are content-providers on one side (encoding expert knowledge of taxonomy in namebanks and other resources), and there are users on the other side. Our role (in this working group) is in the middle, making sure that expert knowledge of taxonomy gets delivered effectively to users trying to accomplish data integration goals. We're going to be driven by use-cases.

4. Use cases. We spent most of our time discussing use-cases. I'm not sure that I recall all of them, but briefly they were Phylotastic, Open Tree of Life (OToL), EoL, Map of Life, and InvertNet.

5. We ended with a brief discussion of how to finish this up.

* We'll need another 1 or 2 PIs, and those people will have to upload CVs.

* We need to decide on deliverables -- whitepaper, software, standards?

* Consider adding your thoughts to the document

* consider engaging in discussions on the email list

================================================================================================

--

David Shorthouse

Nov 21, 2012, 3:19:52 PM

to Arlin Stoltzfus, wg-...@googlegroups.com

I don't recall that these were the use cases discussed, but so be it. I'd discourage the use of project names in use cases; it has the unfortunate consequence of making the uninvited feel excluded. Instead, use cases expressed as problems these projects face will be more inclusive.

A very important use case we're missing here is data discovery, what Paddy might call Feature Extraction. Gotta be able to find names in raw data before you can do any magic. Explorations with raw data in Dryad show a heap of issues, and solutions for some of these require education. We have some smarts for name-finding, but these are dated and in need of positive feedback loops.

David

--

Arlin Stoltzfus

Nov 21, 2012, 3:38:15 PM

to wg-...@googlegroups.com

Feel free to correct the list. "EoL" is a vague reference to a use-case that Cyndy and others were discussing, and which I did not entirely understand. I think InvertNet is something Matt was talking about, although I'm not sure he used that name for it.

I agree that we want to express use-cases in abstract terms, rather than associate them with projects, e.g., the "phylotastic" use-case is "delivering expert phylogenetic knowledge on-the-fly using species names as a query".

Arlin

Robert Guralnick

Nov 21, 2012, 3:43:47 PM

to Arlin Stoltzfus, wg-...@googlegroups.com

And the Map of Life use case is just the one Arlin expressed, with "phylogenetic" replaced by "biogeographic".
-r

--

David Patterson

Nov 21, 2012, 5:59:11 PM

to David Shorthouse, Arlin Stoltzfus, wg-...@googlegroups.com

I also pasted into the document what I thought were the use cases we discussed.

At some time they will all get put close together, extended, revised, but then trimmed down. The Feature Extraction idea (which emerged from the Data Conservancy project) has been added, but not with that name. I just called it 'Indexing'.

Have fun all for Thanksgiving

Paddy

--

--
___________________________________
David J Patterson

Senior Scientist, Marine Biological Laboratory
7 MBL Street, Woods Hole, MASS 02543, USA.

Research Professor
School of Life Sciences, Arizona State University
Tempe, AZ 85287-4501

Professor (MBL) Ecology and Evolutionary Biology
Brown University, Providence, Rhode Island

Life Sciences Lead, Data Conservancy dataconservancy.org

globalnames.org

Matt Yoder

Nov 22, 2012, 11:13:32 AM

to David Shorthouse, Arlin Stoltzfus, wg-...@googlegroups.com

On Wed, Nov 21, 2012 at 2:19 PM, David Shorthouse
<davidpsh...@gmail.com> wrote:
> I don't recall that these were the use cases discussed, but so be it.
> I'd discourage the use of project names in use cases; it has the unfortunate
> consequence of making the uninvited feel excluded. Instead, use cases expressed as
> problems these projects face will be more inclusive.

Point well taken, but on the other side this is a working group
proposal at this point whose goal, as Hilmar mentioned, is to do
scholarly work. We don't need to propose universal acids, just well
thought out and broadly useful ones. If this was, say, a TDWG level
proposal then certainly we'd put the "inclusive is better" mindset
first and foremost. Here though, IMO, integrating with partners who
are doing RealScience(TM) right now *might* be a god thing to do, it
would keep us grounded and focused on specific needs.

M

David Shorthouse

Nov 22, 2012, 12:00:46 PM

to Matt Yoder, Arlin Stoltzfus, wg-...@googlegroups.com

Actually, my point was that expressed as problems needing solution,
the use cases can be interpreted as RealScience AND can capture needs
of another NSF-funded project.

Something like this:

"As an ecologist who does not have time to deduce vagaries in taxon
concepts, I desire a module written in R that can resolve scientific
names in row-level data so that I can merge the results of
geographically disparate inventories and make a species list by family
using the most up-to-date taxonomic placement."

Dave

Hilmar Lapp

Nov 22, 2012, 2:20:16 PM

to Matt Yoder, David Shorthouse, Arlin Stoltzfus, wg-...@googlegroups.com

On Nov 22, 2012, at 11:13 AM, Matt Yoder wrote:

> Here though, IMO, integrating with partners who are doing RealScience(TM) right now *might* be a god thing to do

I'm not sure I'd argue the God thing too much with NESCent ;) but RealScience(TM) is certainly the way to go. And keep in mind that what has merit as synthesis and scientific advancement is fairly broad - the HIP [1] and EvoInfo [2] working groups are (or were, respectively) both NESCent-funded working groups. But if you propose software outcomes, it needs to be evident how these are going to be produced.

-hilmar

[1] http://nescent.org/science/awards_summary.php?id=294
[2] http://evoinfo.nescent.org/
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- informatics.nescent.org :
===========================================================

andrea thomer

Nov 24, 2012, 5:49:10 PM

to Hilmar Lapp, Matt Yoder, David Shorthouse, Arlin Stoltzfus, wg-...@googlegroups.com

Hey all -- sorry for not being around for the call this week. I hope my comments aren't too little too late, but I left a few, as well as a short paragraph in the "Pitches" section about how we might want to think about more formal or structured ways of assessing current tools' successes and failures. While this isn't directly related to doing RealScience(TM), it _is_ related to setting accurate and realistic development benchmarks (the need for which seems alluded to in prior emails). I can flesh out more detailed ideas/thoughts if this seems like the right track.

A

--

Cody Hinchliff

Nov 25, 2012, 10:57:11 AM

to wg-...@googlegroups.com

Hi all, hope everyone had a good Thanksgiving! Arlin had mentioned that OpenTree came up as a use-case and suggested that I add some text to the proposal. I went ahead and added a short bit under the "collaborations" section, and took a look at the use-cases section as well. I guess I am curious if the information from below in the "Supporting text from the OpenTree project" is relevant for the use-case. I admit, I am not entirely sure what direction the proposal is taking, but if any of you who have a better idea of how to shape the overall picture need any more information, please just let me know.

C

Arlin Stoltzfus

Nov 26, 2012, 11:32:09 AM

to wg-...@googlegroups.com

I'd like to balance out these two perspectives as follows.

First, showing that we are working with other projects is good, because this shows that we are working in a community and cooperating with others. In some cases it means that those projects will devote resources (e.g., staff) to common goals, which is a force-multiplier with respect to our proposal.

Second, however, we should not rest satisfied with the idea that projects are customers, and that we prove our relevance by feeding the needs of named projects like OToL, Phylotastic, etc. I can't judge what is happening in the biodiversity informatics world, but in the phylogeny informatics world, it's pretty clear that the existence of a project does not mean that the project has a workable business model serving the needs of scientific customers. Some projects are designed by experts for their own use, or for use by other experts. IMHO we need to justify relevance by pointing to the needs of end-users -- the more of them, the better (i.e., 1000s or tens of 1000s).

In the case of phylotastic, we can cite the project, but we also can cite the review [1] showing that the project addresses a use-case that is common in the scientific literature.

Arlin

1. Stoltzfus A, O'Meara B, Whitacre J, Mounce R, Gillespie EL, Kumar S, Rosauer DF, Vos RA: Sharing and re-use of phylogenetic trees (and associated data) to facilitate synthesis. BMC Research Notes 2012, 5:574.

David Patterson

Nov 26, 2012, 11:38:17 AM

to Arlin Stoltzfus, wg-...@googlegroups.com

Just to counter Arlin's statement about use cases

This is not to be argumentative, but comes from our experience.

If you build tools without a target audience, a target purpose, then
there is no guarantee that you will serve anyone, or that anyone will
pick up on the idea.

Use cases establish a clear set of requirements, and with an Agile
process, the products can be refined to become very well suited.

But a badly chosen use case can have the limitations that Arlin points
to. So, it is important that we pick use cases that are good examples
of classes of problems that we are collectively aware of. The Map of
Life problem is a very good example of a particular problem that is
being encountered in many projects. It is also a problem that I
think is a little down the track, in that the name reconciliation and
taxonomic resolution elements need to be solidly in place before we
can go to the level of concept management.

I don't think there is any profound disagreement here, just good use
of words in the document.

When is our next conference call? We need to distribute chores and
appoint an Editor in Chief (and apologies if this is already answered
in another email).

Paddy
