New issue 511 by gordon.jarrell: scientific_name with and w/o subgenus
http://code.google.com/p/arctos/issues/detail?id=511
Purpose of code changes on this branch:
Errors and duplication stem from the inconsistent inclusion of data in the
subgenus column of taxonomy. For example, we now have two valid records
for the same taxon:
Sorex (Otisorex) cinereus Kerr, 1792
Sorex cinereus Kerr, 1792
Subgenera are becoming increasingly critical as Arctos incorporates more
and more taxonomic complexity in collections like insects, parasites, and
paleontology. Eliminating subgenus is a poor option.
One option might be alternative formats for scientific names. A search on
scientific_name = Sorex cinereus should find the same row irrespective of
whether or not there is a value present for subgenus. Collections might
need the choice of whether or not to display subgenus when it is present.
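One way to picture the search behaviour described above is a normalized lookup key that ignores the subgenus. This is only a sketch: `search_key` is a hypothetical helper, not anything that exists in Arctos, and it assumes the subgenus always appears in parentheses directly after the genus. A parenthesized author later in the name is left alone.

```python
import re

def search_key(scientific_name: str) -> str:
    """Return the name with a parenthesized subgenus (if any) removed.

    Assumption: the subgenus, when present, is the parenthesized token
    immediately following the first word (the genus).
    """
    # Collapse runs of whitespace so the pattern below can rely on single spaces.
    collapsed = " ".join(scientific_name.split())
    # Drop " (Subgenus)" only when it directly follows the genus.
    return re.sub(r"^(\S+) \([^)]*\)", r"\1", collapsed)

# Both forms of the name reduce to the same key.
with_subgenus = search_key("Sorex (Otisorex) cinereus Kerr, 1792")
without_subgenus = search_key("Sorex cinereus Kerr, 1792")
```

Matching on such a key would let a search for either form find the same row, while the stored name could keep or omit the subgenus for display.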
When reviewing my code changes, please focus on:
After the review, I'll merge this branch into:
/trunk
Maybe this should be high priority. Gabor apparently added over a thousand
records that were already in there with subgenus because he couldn't find
them as genus+species. He could not grasp my attempts to explain the
issue. The longer we wait, the bigger the clean-up...
Comment #2 on issue 511 by dust...@gmail.com: scientific_name with and w/o
subgenus
http://code.google.com/p/arctos/issues/detail?id=511
To review:
Taxonomy exists to facilitate communication.
"Diptera" (the animal Order) and "Diptera" (the plant Genus) are different
things.
"Sorex (Otisorex) cinereus Kerr, 1792" and "Sorex cinereus Kerr, 1792" are
the same things.
Right....
Correct.
I don't think we can represent that in our current model.
Gotta get there somehow, even if it takes a new model. This may be unrelated
to how the data are stored (e.g., hierarchical versus long rows), and so we
might be able to use somebody else's solution, if anybody has done it. A
compromise might be that we only concatenate subgenus into scientific_name
where species is null. You could still search all records by subgenus, and
it would only show up in scientific_name where it was really needed.
This has everything to do with how the data are stored. Neither a "long
row" nor a hierarchical model will do what you want, at least not in any
way that I've been able to recognize.
Doesn't the ICZN provide guidelines for how names are formed?
So, maybe just an IF clause in the trigger that builds scientific_name?
IF what?
IF species is NOT null
THEN concatenate genus + " " + species
ELSE IF subgenus is NOT null
THEN concatenate genus + " (" + subgenus + ")"
ELSE genus alone
or words to that effect...
Are you suggesting we ignore ICZN guidelines? Are there such things?
Not sure what the applicable guidelines might be. *Sorex cinereus*, *Sorex
(Otisorex)* and *Sorex (Otisorex) cinereus* are all valid constructions, I
assume.
Gordon would like to remove subgenus from display when species is given. So
both "genus=Sorex + species=cinereus" and "genus=Sorex + subgenus =
Otisorex + species=cinereus" would display as "Sorex cinereus."
If species is not given, "genus=Sorex + subgenus = Otisorex" would display
as "Sorex (Otisorex)".
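The display rule Gordon describes could be sketched roughly as follows. This is an illustration in Python, not the actual trigger; `display_name` and its parameter names are assumptions, and author and year are left out for brevity.

```python
from typing import Optional

def display_name(genus: str,
                 subgenus: Optional[str] = None,
                 species: Optional[str] = None) -> str:
    """Build a display name that shows the subgenus only when no species is given."""
    if species:
        # Species present: the subgenus is dropped from display.
        return f"{genus} {species}"
    if subgenus:
        # No species: the subgenus is the most specific rank, so show it.
        return f"{genus} ({subgenus})"
    # Neither species nor subgenus: genus alone.
    return genus
```

With this rule, "genus=Sorex + species=cinereus" and "genus=Sorex + subgenus=Otisorex + species=cinereus" produce the same string, which is exactly why the duplicates have to be dealt with before re-concatenating.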
So, when Taxonomy is re-concatenated under this logic, there will likely be a
few thousand non-unique scientific_names. Can you temporarily delete
anything Gabor added to taxonomy in the past three months? Or can you
script something to delete the record with NULL subgenus when the
scientific_names are the same?
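A rough sketch of the second option, finding the NULL-subgenus record wherever a record with a subgenus shares the same re-concatenated scientific_name. The row layout is an assumption for illustration only: `id` is a hypothetical key, and `rows` stands in for whatever the real query returns. The function only reports candidates; anything pointing at those records would need to be repointed before a delete.

```python
from collections import defaultdict

def redundant_null_subgenus_ids(rows):
    """Return ids of rows with no subgenus whose scientific_name is also
    held by at least one row that does have a subgenus."""
    by_name = defaultdict(list)
    for row in rows:
        by_name[row["scientific_name"]].append(row)

    redundant = []
    for group in by_name.values():
        # Only flag NULL-subgenus rows when a subgenus-bearing twin exists.
        if any(r["subgenus"] for r in group):
            redundant.extend(r["id"] for r in group if not r["subgenus"])
    return redundant

# Hypothetical rows after re-concatenation under the new display rule.
rows = [
    {"id": 1, "subgenus": "Otisorex", "scientific_name": "Sorex cinereus Kerr, 1792"},
    {"id": 2, "subgenus": None, "scientific_name": "Sorex cinereus Kerr, 1792"},
    {"id": 3, "subgenus": None, "scientific_name": "Sorex araneus"},
]
candidates = redundant_null_subgenus_ids(rows)
```

Rows whose name is unique, or whose duplicates all lack a subgenus, are left untouched so they can be reviewed by hand.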
Comment #13 on issue 511 by dust...@gmail.com: scientific_name with and w/o
subgenus
http://code.google.com/p/arctos/issues/detail?id=511
(No comment was entered for this change.)