Advice sought for DOI metadata, taxonomic name-finding and resolution
Date: Wed, 18 Apr 2012 12:26:13 -0700 (PDT)
From: David Shorthouse <davidpshortho...@gmail.com>
To: dryad-dev@googlegroups.com
Message-ID: <1878781.951.1334777173691.JavaMail.geo-discussion-forums@ynhs12>
Subject: Advice sought for DOI metadata, taxonomic name-finding and
resolution
MIME-Version: 1.0
Content-Type: multipart/mixed;
boundary="----=_Part_949_11301193.1334777173690"
------=_Part_949_11301193.1334777173690
Content-Type: multipart/alternative;
boundary="----=_Part_950_21080688.1334777173690"
------=_Part_950_21080688.1334777173690
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Folks,
I noticed on your development
list, http://wiki.datadryad.org/Repository_Development_Plan, that you are
considering ingestion of taxonomic / vernacular names to help supplement
search across your holdings. I also understand that you have been in touch
with David Patterson, PI of the NSF-funded Global Names project.
I am looking for a simple way for you to take advantage of the Global Names
taxonomic name-finding and resolution services. These are still under
development and receiving feedback from consumers. Nonetheless, one of our
services can take a URL as a query parameter and find all names. This URL
could point to a PDF, image, doc, xls, etc., and the service performs OCR on
the fly as needed. The response is a list of unique names. Another service of ours can
take a flat list of names and resolve these against other lists (e.g.
Catalogue of Life, NCBI, EOL, GBIF) and produce their local identifiers for
a linking service as well as their tree paths to root for possible concept
expansion in your index.
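
To illustrate what "tree paths to root for possible concept expansion" could look like on the indexing side, here is a minimal sketch. Everything in it is assumed for illustration: the package identifiers, the name lists, and the simplified classification paths are made-up stand-ins, not actual output from the Global Names services or actual Dryad holdings.

```python
from collections import defaultdict

# Hypothetical resolver output: each found name mapped to a (simplified,
# illustrative) classification path from root down to genus.
RESOLVED_PATHS = {
    "Anolis carolinensis": ["Animalia", "Chordata", "Reptilia", "Squamata", "Anolis"],
    "Sceloporus undulatus": ["Animalia", "Chordata", "Reptilia", "Squamata", "Sceloporus"],
    "Quercus alba": ["Plantae", "Fagales", "Fagaceae", "Quercus"],
}

# Hypothetical name-finding output: names detected in each data package.
PACKAGE_NAMES = {
    "pkg-A": ["Anolis carolinensis"],
    "pkg-B": ["Sceloporus undulatus"],
    "pkg-C": ["Quercus alba"],
}


def build_concept_index(package_names, resolved_paths):
    """Index each package under its names and under every ancestor taxon."""
    index = defaultdict(set)
    for package_id, names in package_names.items():
        for name in names:
            index[name.lower()].add(package_id)
            # Concept expansion: also file the package under each ancestor.
            for ancestor in resolved_paths.get(name, []):
                index[ancestor.lower()].add(package_id)
    return index


def search(index, term):
    """Return the sorted package ids filed under a name or higher taxon."""
    return sorted(index.get(term.lower(), set()))


INDEX = build_concept_index(PACKAGE_NAMES, RESOLVED_PATHS)
print(search(INDEX, "Squamata"))
```

With this kind of index, a query on a higher taxon returns packages that only ever mention species-level names.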
So, I'm writing to inquire if you have plans to include direct links to
data packages (and MIME type, though not immediately necessary) in responses
to DOI content negotiation.
For example:
curl -LH "Accept: application/rdf+xml" "http://dx.doi.org/10.5061/dryad.584"
(or any of your other supported content types as expressed at
http://data.datacite.org/10.5061/dryad.584)
...gives me some nice metadata, but doesn't actually give me a link to the
data package that I'm most interested in. The only apparent way to get the
package is to visit http://datadryad.org/resource/doi:10.5061/dryad.584 and
fish for it. Had a link to the package been provided, you'd be pretty close
to scratching "...Search over hierarchical concepts (e.g., "all
lizards")..." off your list.
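
For anyone wanting to script the same content-negotiation request rather than use curl, a rough Python equivalent might look like the sketch below. The function names are hypothetical, and it only retrieves whatever metadata the resolver returns; it does not assume that a link to the data package is present in the response.

```python
import urllib.request

# Resolver and media type are taken from the curl example in this message.
DOI_RESOLVER = "http://dx.doi.org/"
DEFAULT_MEDIA_TYPE = "application/rdf+xml"


def build_metadata_request(doi, media_type=DEFAULT_MEDIA_TYPE):
    """Build a request that asks the DOI resolver for a given representation."""
    return urllib.request.Request(
        DOI_RESOLVER + doi,
        headers={"Accept": media_type},
    )


def fetch_metadata(doi, media_type=DEFAULT_MEDIA_TYPE, timeout=30):
    """Send the request and return (content type, raw body bytes).

    urlopen follows redirects by default, much like curl's -L flag.
    """
    request = build_metadata_request(doi, media_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get_content_type(), response.read()


if __name__ == "__main__":
    content_type, body = fetch_metadata("10.5061/dryad.584")
    print(content_type, len(body))
```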
There are, however, going to be some limitations and requirements for names
within any of your submitted data packages. These may have to feed back to
data depositors if they wish to have names within their submissions
recognizable and indexable. We can chat more about that at a later date.
All the best,
David P. Shorthouse
Global Names, http://www.globalnames.org
Marine Biological Laboratory
Woods Hole, MA
------=_Part_950_21080688.1334777173690--
------=_Part_949_11301193.1334777173690--