Hi all,
I'm doing some work on the synth pipeline.
There is a step at which we prune off leaves of a tree if the OTT ID occurs multiple times in the tree or is nested inside another OTT ID that is mapped to a tip.
We have a flag (^ot:isTaxonExemplar) for indicating whether a particular node should be used as the exemplar of the taxon in the event that multiple tips share that same OTT ID.
There are a few cases in which it is not clear what the preferred behavior is:
1. multiple nodes with the same OTT ID have this flag. I think that I'll just go to my next ranking criterion (lexicographic sorting of the node IDs) to resolve this.
2. a tip mapped to a higher taxon and another tip mapped to a descendant of this taxon. I'm tempted to just prune off the higher taxon as ambiguous regardless of the exemplar flags. If I don't do that, then:
3. a higher taxon is flagged as the exemplar, but so is its descendant that is also in the tree. I think I'll delete the higher taxon in this case.
4. a higher taxon is flagged as the exemplar, and at least one of its descendants is also flagged as an exemplar, but another one is not flagged. I think I'd lean toward including the descendants. For example, if a tree has a flagged tip mapped to Pan paniscus, an unflagged tip mapped to Pan troglodytes, and a third (flagged) tip mapped to Pan, I'd keep the two species tips and prune the Pan tip.
It is not clear that we can or should solve this by saying "these situations should be caught and fixed in the curator app before we get to synthesis", because these situations can arise after mapping as a result of a subsequent change to OTT.
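
To make the options above concrete, here is a rough Python sketch of one possible policy. It is not the pipeline's actual code. It assumes the simplest choices from the list: a tip mapped to a higher taxon is pruned whenever a descendant taxon is also mapped to a tip (regardless of flags), and among tips sharing an OTT ID a flagged tip wins, with ties broken by lexicographic order of node ID. The names Tip, ancestors, choose_tips and parent_of are hypothetical, and the OTT IDs in the example are made-up placeholders.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Tip(NamedTuple):
    node_id: str
    ott_id: int
    is_exemplar: bool  # stands in for the ^ot:isTaxonExemplar flag


def ancestors(ott_id: int, parent_of: Dict[int, int]) -> Set[int]:
    """Return every ancestor of ott_id in a child -> parent taxonomy map."""
    seen: Set[int] = set()
    cur = parent_of.get(ott_id)
    while cur is not None and cur not in seen:  # guard against cycles
        seen.add(cur)
        cur = parent_of.get(cur)
    return seen


def choose_tips(tips: List[Tip], parent_of: Dict[int, int]) -> Set[str]:
    """Return the node IDs of the tips to keep under the assumed policy."""
    mapped = {t.ott_id for t in tips}

    # Cases 2-4: any mapped taxon that is an ancestor of another mapped
    # taxon is treated as ambiguous and pruned, ignoring exemplar flags.
    higher: Set[int] = set()
    for ott_id in mapped:
        higher |= ancestors(ott_id, parent_of) & mapped

    by_taxon: Dict[int, List[Tip]] = defaultdict(list)
    for t in tips:
        if t.ott_id not in higher:
            by_taxon[t.ott_id].append(t)

    # Case 1: prefer flagged tips; break ties by lexicographic node ID.
    kept: Set[str] = set()
    for group in by_taxon.values():
        best = min(group, key=lambda t: (not t.is_exemplar, t.node_id))
        kept.add(best.node_id)
    return kept


# Placeholder OTT IDs (not real ones): 1 = Pan, 2 = Pan paniscus,
# 3 = Pan troglodytes.
parent_of = {2: 1, 3: 1}
tips = [
    Tip("node1", 2, True),   # flagged Pan paniscus
    Tip("node2", 3, False),  # unflagged Pan troglodytes
    Tip("node3", 1, True),   # flagged Pan (higher taxon)
    Tip("node4", 2, True),   # second flagged Pan paniscus
]
print(sorted(choose_tips(tips, parent_of)))  # ['node1', 'node2']
```

Under these assumptions the Pan tip is dropped even though it is flagged, and node1 beats node4 only because of the node ID ordering. A different answer to cases 2 through 4 would mean changing how the higher set is computed.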
Thoughts welcome.
Mark
--
Mark Holder
==============================================
Department of Ecology and Evolutionary Biology
University of Kansas
6031 Haworth Hall
1200 Sunnyside Avenue
Lawrence, Kansas 66045
==============================================