OAI harvesting client


DEBORA PIGNATARI DRUCKER

Aug 18, 2020, 4:36:54 PM
to dataverse...@googlegroups.com

Hello

We are trying to create a harvesting client on our Dataverse instance and are getting the attached error message.

The repository we want to harvest from is a DSpace instance.

We tried the following URLs:

dataverse-oai.png

Julian Gautier

Aug 18, 2020, 6:25:29 PM
to Dataverse Users Community
Hi Debora,

I think Dataverse would expect https://www.digipathos-rep.cnptia.embrapa.br/oai, but I'm not able to test what happens when I try that URL. When you try https://www.digipathos-rep.cnptia.embrapa.br/oai, does your installation show the same "Invalid URL" message?

DEBORA PIGNATARI DRUCKER

Aug 24, 2020, 8:33:10 AM
to dataverse...@googlegroups.com
Yes, we get the "Invalid URL" message.

Would anyone have any ideas on how to solve this?

Thanks a lot

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/9bd0b441-3ba1-4f50-850b-26fdb6724611n%40googlegroups.com.

Philip Durbin

Aug 25, 2020, 1:53:50 PM
to dataverse...@googlegroups.com
My first thought is that I'm not sure if Dataverse can harvest from DSpace or not. It should work, I think, but it would be nice to hear about a case of this working.

Meanwhile, Debora, you are welcome to create an issue called something like "Can't harvest from DSpace" (please include those URLs you tried) at https://github.com/IQSS/dataverse/issues

Thanks,

Phil




Julian Gautier

Aug 25, 2020, 2:44:30 PM
to Dataverse Users Community
Just a heads up about another issue that I think will affect your Dataverse-based repository's ability to harvest records from this DSpace-based repository:

The conversation in the GitHub issue at https://github.com/IQSS/dataverse/issues/5050, titled "Harvesting zenodo client fail", has narrowed down to a restriction Dataverse imposes when harvesting oai_dc metadata from other repositories: Dataverse will fail to harvest a Dublin Core record if it doesn't recognize the value of the dc:identifier property as a DOI or HDL. In the GitHub issue, we're talking about adjusting this restriction so that Dataverse recognizes DOIs and HDLs in more formats.

It looks like some (and maybe all?) of the oai_dc records in this DSpace-based repository (https://www.digipathos-rep.cnptia.embrapa.br/oai/request?verb=ListRecords&metadataPrefix=oai_dc) have dc:identifier properties that Dataverse would not recognize as DOIs or HDLs, like https://www.digipathos-rep.cnptia.embrapa.br/jspui/handle/123456789/1128. It looks like that repository hasn't implemented handles, and "123456789" is just a placeholder.

One option is to ask the repository to implement handles and make sure they're included in the oai_dc metadata it publishes over OAI-PMH. Another is to ask that this Dataverse restriction be removed, so that Dataverse can harvest records that don't have DOIs or HDLs. I think the first option is better, especially if it can be done before your repository needs to harvest these records.
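
To check which records are likely to be affected, something like the following Python sketch could be run against a saved ListRecords response. It pulls the dc:identifier values out of the XML and flags the ones that look like a DOI or a handle. The patterns in PID_PATTERNS are only my guess at reasonable DOI/HDL formats, not the rules Dataverse actually applies, and the second identifier in SAMPLE is a made-up handle for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core elements used in oai_dc records.
DC_NS = "http://purl.org/dc/elements/1.1/"

# ASSUMPTION: illustrative patterns only, not Dataverse's actual checks.
PID_PATTERNS = [
    re.compile(r"^doi:10\.\d{4,9}/\S+$", re.I),
    re.compile(r"^https?://(dx\.)?doi\.org/10\.\d{4,9}/\S+$", re.I),
    re.compile(r"^hdl:\S+/\S+$", re.I),
    re.compile(r"^https?://hdl\.handle\.net/\S+/\S+$", re.I),
]


def looks_like_pid(value):
    """Return True if the value matches one of the assumed DOI/HDL patterns."""
    value = value.strip()
    return any(p.match(value) for p in PID_PATTERNS)


def dc_identifiers(xml_text):
    """Return every non-empty dc:identifier value found in the XML text."""
    root = ET.fromstring(xml_text)
    tag = "{%s}identifier" % DC_NS
    return [el.text.strip() for el in root.iter(tag) if el.text]


# Hypothetical sample: the first dc:identifier is the URL quoted above,
# the second is an invented handle URL for comparison.
SAMPLE = """<record xmlns="http://www.openarchives.org/OAI/2.0/"
        xmlns:dc="http://purl.org/dc/elements/1.1/">
  <header><identifier>oai:example.org:123456789/1128</identifier></header>
  <metadata>
    <dc:identifier>https://www.digipathos-rep.cnptia.embrapa.br/jspui/handle/123456789/1128</dc:identifier>
    <dc:identifier>http://hdl.handle.net/1902.1/12345</dc:identifier>
  </metadata>
</record>"""

if __name__ == "__main__":
    for ident in dc_identifiers(SAMPLE):
        status = "looks like DOI/HDL" if looks_like_pid(ident) else "not recognized"
        print(status, ident)
```

The OAI-PMH header identifier lives in a different namespace, so only the Dublin Core identifiers are inspected here.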

Washington Carvalho Segundo

Aug 28, 2020, 10:08:29 AM
to dataverse...@googlegroups.com
Hi Debora, 

I guess the correct harvesting address would be https://www.digipathos-rep.cnptia.embrapa.br/oai/request

Best,
Washington 
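
One way to check a candidate base URL before entering it in Dataverse is to build an OAI-PMH request by hand and see whether the server answers with an OAI-PMH document. The Python sketch below only builds the URL and inspects a response body that has already been fetched; the function names are hypothetical helpers, not part of Dataverse or DSpace.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace of the root element in OAI-PMH 2.0 responses.
OAI_NS = "http://www.openarchives.org/OAI/2.0/"


def oai_request_url(base_url, verb, **params):
    """Build an OAI-PMH request URL from a base URL, a verb and extra arguments."""
    query = {"verb": verb}
    query.update(params)
    return base_url + "?" + urlencode(query)


def is_oai_pmh_response(xml_text):
    """Return True if the text parses as XML with an OAI-PMH root element."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return root.tag == "{%s}OAI-PMH" % OAI_NS


if __name__ == "__main__":
    base = "https://www.digipathos-rep.cnptia.embrapa.br/oai/request"
    print(oai_request_url(base, "Identify"))
    print(oai_request_url(base, "ListRecords", metadataPrefix="oai_dc"))
```

Opening the Identify URL in a browser, or fetching it with urllib.request.urlopen and passing the body to is_oai_pmh_response, should show whether the address is a working OAI-PMH endpoint.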
