Re: [etd] Harvesting NDLTD Union Archive


Daryl Grenz

Mar 8, 2022, 11:33:12 PM
to Veronica Santos, e...@ndltd.org
Hi Veronica,

It seems that the NDLTD OAI-PMH base endpoint is not set up to offer any documentation. It assumes that your harvester already knows which OAI verbs to use, so a request to the bare base URL, with no verb parameter, comes back as a badVerb error. If you try out a few requests that include a verb, they appear to respond correctly; see for example:

Is that what you are looking for?

Regards,
Daryl
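
The point about verbs can be sketched in Python. This is only a minimal illustration, assuming the base URL quoted in this thread; the helper name build_request_url is hypothetical, and the verb list is the six verbs defined by OAI-PMH 2.0. It only builds request URLs and makes no network calls.

```python
from urllib.parse import urlencode

# Base URL as quoted in this thread.
BASE_URL = "http://union.ndltd.org/OAI-PMH/"

# The six request verbs defined by the OAI-PMH 2.0 protocol.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_request_url(verb, base_url=BASE_URL, **params):
    """Hypothetical helper: return a request URL that carries a verb.

    A request without a 'verb' query parameter is what triggers badVerb.
    """
    if verb not in OAI_VERBS:
        raise ValueError(f"not an OAI-PMH verb: {verb!r}")
    query = {"verb": verb, **params}
    return f"{base_url}?{urlencode(query)}"


identify_url = build_request_url("Identify")
records_url = build_request_url("ListRecords", metadataPrefix="oai_dc")
print(identify_url)
print(records_url)
```

Pasting the first printed URL into a browser would be one quick way to check whether the endpoint answers an Identify request.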


On Wed, Mar 9, 2022 at 1:03 AM Veronica Santos <versan...@gmail.com> wrote:
Hello,

Since last week I have been trying to harvest some metadata from the NDLTD Union Archive at the URL http://union.ndltd.org/OAI-PMH/ and I am receiving an error message.

I tried to access the same URL using the browser and the response is also not OK:

<OAI-PMH xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">
<responseDate>2022-03-08T03:00:02Z</responseDate>
<request>http://union.ndltd.org:8080/union.OAI-PMH/</request>
<error code="badVerb">Illegal OAI verb</error>
</OAI-PMH>

Does anybody know what is happening?

Veronica
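
For anyone who wants to detect this kind of response in a script, here is a rough sketch using only the Python standard library. The function name extract_oai_errors is hypothetical. The sample string mirrors the response quoted above, with the OAI-PMH 2.0 default namespace declared on the root element as a full response normally has (the pasted copy appears to have lost its xmlns attributes).

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def extract_oai_errors(response_text):
    """Hypothetical helper: return (code, message) pairs for each <error>."""
    root = ET.fromstring(response_text)
    return [
        (err.get("code"), (err.text or "").strip())
        for err in root.findall("oai:error", OAI_NS)
    ]


# Sample modelled on the response in this message; the xmlns attribute
# is an assumption about what the full response contained.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2022-03-08T03:00:02Z</responseDate>
  <request>http://union.ndltd.org:8080/union.OAI-PMH/</request>
  <error code="badVerb">Illegal OAI verb</error>
</OAI-PMH>"""

for code, message in extract_oai_errors(SAMPLE):
    print(f"{code}: {message}")
```

A badVerb code here means the request itself was malformed (missing or unknown verb), not that the archive is down.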


Veronica Santos

Mar 9, 2022, 1:54:39 PM
to Daryl Grenz, e...@ndltd.org
Thanks Daryl!

I will try this way.

I was using a Jupyter notebook and a Python package named oaiharvest to collect the records. It worked for other endpoints.

Veronica

!oai-harvest --help
usage: oai-harvest [-h] [--db DATABASEPATH] [-p METADATAPREFIX]
                   [-f YYYY-MM-DD] [-u YYYY-MM-DD] [-s SET] [-d DIR]
                   [--delete | --no-delete] [-l LIMIT]
                   [--create-subdirs | --subdirs-on SUBDIRS]
                   provider [provider ...]

Harvest records from an OAI-PMH provider.

positional arguments:
  provider              OAI-PMH Provider from which to harvest. This may be
                        the base URL of an OAI-PMH server, or the short name
                        of a registered provider. You may also specify "all"
                        for all registered providers.

optional arguments:
  -h, --help            show this help message and exit
  --db DATABASEPATH, --database DATABASEPATH
                        Path to provider registry database. Currently supports
                        sqlite3 only.
  -p METADATAPREFIX, --metadataPrefix METADATAPREFIX
                        the metadataPrefix of the format (XML Schema) in which
                        records should be harvested.
  -f YYYY-MM-DD, --from YYYY-MM-DD
                        harvest only records added/modified after this date.
  -u YYYY-MM-DD, --until YYYY-MM-DD
                        harvest only records added/modified up to this date.
  -s SET, --set SET     harvest only records within this set
  -d DIR, --dir DIR     where to output files for harvested records.default:
                        current working path
  --delete              respect the server's instructions regarding deletions,
                        i.e. delete the files locally (default)
  --no-delete           ignore the server's instructions regarding deletions,
                        i.e. DO NOT delete the files locally
  -l LIMIT, --limit LIMIT
                        limit the number of records to harvest from each
                        provider
  --create-subdirs      create target subdirs (based on / characters in
                        identifiers) ifthey don't exist. To use something
                        other than /, use the newer--subdirs-on option
  --subdirs-on SUBDIRS  create target subdirs based on occurrences of the
                        given character in identifiers

Copyright (c) 2013, the University of Liverpool <http://www.liv.ac.uk>. All
rights reserved. Distributed under the terms of the BSD 3-clause License
<http://opensource.org/licenses/BSD-3-Clause>.
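
As a rough illustration of how the options in the help text above fit together, the sketch below assembles an oai-harvest command line in Python without running it. The function name build_harvest_command and the output directory name are hypothetical; the flags (-p, -l, -d) are taken from the help output, and oai_dc is the metadata format every OAI-PMH repository is required to support. Whether this invocation succeeds against the NDLTD endpoint is not something the sketch checks.

```python
import shlex


def build_harvest_command(provider, metadata_prefix="oai_dc",
                          limit=None, out_dir=None):
    """Hypothetical helper: build an argument list for oai-harvest."""
    cmd = ["oai-harvest", "-p", metadata_prefix]
    if limit is not None:
        cmd += ["-l", str(limit)]   # cap the number of records
    if out_dir is not None:
        cmd += ["-d", out_dir]      # where to write harvested files
    cmd.append(provider)            # base URL or registered short name
    return cmd


cmd = build_harvest_command(
    "http://union.ndltd.org/OAI-PMH/",
    limit=10,
    out_dir="ndltd_records",  # assumed directory name
)
# Print the command as it could be typed in a shell or notebook cell.
print(shlex.join(cmd))
```

Starting with a small limit is a cheap way to confirm the endpoint and metadata prefix before attempting a full harvest.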
