Dear Rutger, and all in cc,
I'm not sure whether this answers your question.
You are discussing how to automatically curate the eDNA / barcode results, right?
I'm afraid I can't speak at the institution level, only about my own earlier side project. In other words, I have reinvented the wheel once before, too. So I doubt the following is worth reading for an expert such as yourself, but here it is for your information.
The Materials and Methods section of my paper below gives somewhat more detail.
"Observing Phylum-level metazoan diversity by environmental DNA analysis at the Ushimado area in the Seto Inland Sea". Kawashima T. et al.
First, for reference sequences, I used two resources: the Organelle Genome Resources from GenBank, and the barcode resources provided by JBIF. (JBIF is a GBIF-related organization in Japan.)
To link each sequence to a scientific name, I took a very classical approach: I first obtained hits with BlastN and then mapped each hit to an NCBI Taxonomy ID. I wrote my own Ruby script for the mapping.
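As a rough illustration of the hit-to-Taxonomy-ID step (not my actual Ruby script), here is a minimal Python sketch. It assumes BlastN was run with a tabular format such as -outfmt "6 qseqid sseqid pident evalue bitscore staxids" against a database that carries taxonomy information; that column layout, the function name best_hit_taxids and all IDs in the sample are assumptions made up for the example.

```python
import csv
import io

# Assumed column order of the tabular BLAST output (an assumption, see above).
COLUMNS = ("qseqid", "sseqid", "pident", "evalue", "bitscore", "staxids")


def best_hit_taxids(handle, columns=COLUMNS):
    """Return {query_id: (subject_id, taxid)} for the highest-bitscore hit."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < len(columns):
            continue  # skip blank or truncated lines
        rec = dict(zip(columns, row))
        # staxids may list several IDs joined by ";" or be "N/A"; keep the first numeric one.
        first = rec["staxids"].split(";")[0]
        if not first.isdigit():
            continue
        score = float(rec["bitscore"])
        query = rec["qseqid"]
        if query not in best or score > best[query][0]:
            best[query] = (score, rec["sseqid"], int(first))
    return {q: (subject, taxid) for q, (_, subject, taxid) in best.items()}


if __name__ == "__main__":
    # Hypothetical hits; sequence names and Taxonomy IDs are placeholders.
    sample = io.StringIO(
        "read1\trefA\t98.5\t1e-50\t200\t900003\n"
        "read1\trefB\t92.0\t1e-30\t120\t900004\n"
        "read2\trefC\t99.0\t1e-60\t250\t900005;900006\n"
    )
    for query, (subject, taxid) in best_hit_taxids(sample).items():
        print(query, subject, taxid)
```

Taking only the single best hit is the simplest possible rule; in practice one would probably also filter on identity or e-value before trusting the assignment.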
My script takes as arguments two files from the taxdump.tar.gz provided by NCBI Taxonomy: nodes.dmp and names.dmp. Given a Taxonomy ID or a scientific name as an additional argument, it walks back up the tree of nodes and reports the taxonomy information.
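The walk up the tree can be sketched as follows. This is a minimal Python illustration of the idea, not my Ruby script itself. It relies on the taxdump layout in which fields are separated by tab, "|", tab, and the root node lists itself as its own parent. The function names and the tiny four-node tree (with placeholder IDs and a fictional species) are assumptions for the example; real nodes.dmp lines carry more columns than the three used here.

```python
import io


def parse_dmp_line(line):
    """Split one taxdump line: fields are separated by '\\t|\\t', line ends with '\\t|'."""
    line = line.rstrip("\n")
    if line.endswith("\t|"):
        line = line[:-2]
    return line.split("\t|\t")


def load_nodes(handle):
    """Read nodes.dmp into {taxid: parent_taxid} and {taxid: rank}."""
    parent, rank = {}, {}
    for line in handle:
        fields = parse_dmp_line(line)
        taxid = int(fields[0])
        parent[taxid] = int(fields[1])
        rank[taxid] = fields[2]
    return parent, rank


def load_names(handle):
    """Read names.dmp, keeping only entries whose name class is 'scientific name'."""
    names = {}
    for line in handle:
        fields = parse_dmp_line(line)
        if fields[3] == "scientific name":
            names[int(fields[0])] = fields[1]
    return names


def lineage(taxid, parent, rank, names):
    """Return [(rank, name), ...] from the root down to the given taxid."""
    path = []
    while True:
        path.append((rank[taxid], names.get(taxid, "unknown")))
        if parent[taxid] == taxid:  # the root is its own parent
            break
        taxid = parent[taxid]
    return path[::-1]


if __name__ == "__main__":
    # Toy excerpts with placeholder IDs; not real NCBI records.
    nodes = io.StringIO(
        "1\t|\t1\t|\tno rank\t|\n"
        "900001\t|\t1\t|\tkingdom\t|\n"
        "900002\t|\t900001\t|\tphylum\t|\n"
        "900003\t|\t900002\t|\tspecies\t|\n"
    )
    names_file = io.StringIO(
        "1\t|\troot\t|\t\t|\tscientific name\t|\n"
        "900001\t|\tMetazoa\t|\t\t|\tscientific name\t|\n"
        "900002\t|\tChordata\t|\t\t|\tscientific name\t|\n"
        "900003\t|\tExamplus fictus\t|\t\t|\tscientific name\t|\n"
        "900003\t|\tfictional example\t|\t\t|\tcommon name\t|\n"
    )
    parent, rank = load_nodes(nodes)
    names = load_names(names_file)
    # Accept a scientific name as well as an ID by inverting the name table.
    name_to_id = {name: taxid for taxid, name in names.items()}
    for level, name in lineage(name_to_id["Examplus fictus"], parent, rank, names):
        print(level, name, sep="\t")
```

Loading both files into plain dictionaries keeps each lookup cheap, at the cost of holding the whole taxonomy in memory.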
I think most people would find it a burden to download the data and keep it up to date this way. However, I worked at the National Institute of Genetics (NIG) in Japan until recently, and the reference data on the hard disk there was constantly updated by NIG staff. So once I ran my scripts, I got the results semi-automatically, and it was not too stressful for me.
That's all.
Other related links:
JBIF
my script
Takeshi Kawashima