OAI-PMH Errors on ListRecords


Sherry Lake

Sep 24, 2018, 2:10:58 PM
to Dataverse Users Community

Philip Durbin

Sep 24, 2018, 2:29:29 PM
to dataverse...@googlegroups.com
I hate to be the bearer of bad news but the *server* side of OAI-PMH is broken in Dataverse 4.9.2. (The client side is fine.) The good news is that in the "develop" branch, this was fixed in commit 7eaaf7f (Xrecord.java) as part of https://github.com/IQSS/dataverse/pull/5037 so the next release of Dataverse will have a fix. The regression is specific to 4.9.2. Previous versions are unaffected.

Phil

--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-commu...@googlegroups.com.
To post to this group, send email to dataverse...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/73a541fb-dd63-4eb7-8482-7c8d10259429%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Pete Meyer

Sep 24, 2018, 2:50:17 PM
to Dataverse Users Community
This looks to me like it may be a distinct problem: the URLs Sherry reported as problematic (and OAI sets from c43b4864b on develop) appear to be producing invalid XML.


On Mon, Sep 24, 2018 at 2:10 PM Sherry Lake <shla...@gmail.com> wrote:
I've searched past Google Groups questions and the GitHub issues, but did not see any recent questions about the problems I am seeing with OAI-PMH link results.

(I tried all of the following in the Safari browser.) The following do not work for UVa (4.8.6) or for Harvard (4.9.2), but they do work at JHU (4.6); see the last examples:





Pete Meyer

Sep 24, 2018, 2:53:11 PM
to Dataverse Users Community
I was wrong - c43b4864b does produce valid XML; so this is likely to be the one that Phil mentioned (and it appears I need to track down some more coffee).

Sorry about the confusion.

 

Philip Durbin

Sep 24, 2018, 3:25:45 PM
to dataverse...@googlegroups.com
Sherry, I missed the fact that you're seeing brokenness in Dataverse 4.8.6. Have you upgraded recently? Do you think the brokenness was caused by an upgrade? Or could it have been caused by the publication of some new datasets?

The bug in 4.9.2 that I'm talking about is a double "</metadata></metadata>" when there should only be a single "</metadata>". Your problem looks different and is perhaps based on the content of your datasets. Maybe some HTML in a description? I'm not sure.
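To tell the two failure modes apart, something like the following could be run against a saved copy of the OAI-PMH response. This is only a minimal sketch: `diagnose` and the sample strings are made up for illustration, and the check for the doubled tag is a plain substring match, so a doubled tag with whitespace in between would be reported as generic malformed XML instead.

```python
import xml.etree.ElementTree as ET


def diagnose(xml_text: str) -> str:
    """Classify a response body (hypothetical helper, not part of Dataverse)."""
    # The 4.9.2 regression described above: a doubled closing tag.
    if "</metadata></metadata>" in xml_text:
        return "double-metadata"
    # Anything else that fails to parse is some other well-formedness problem.
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return "malformed"
    return "ok"


# Invented sample bodies, much smaller than a real ListRecords response.
good = "<record><metadata><dc>x</dc></metadata></record>"
doubled = "<record><metadata><dc>x</dc></metadata></metadata></record>"
other = "<record><metadata><dc>a <br> b</dc></metadata></record>"

for name, body in [("good", good), ("doubled", doubled), ("other", other)]:
    print(name, diagnose(body))
```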

Phil



Sherry Lake

Sep 24, 2018, 4:16:01 PM
to dataverse...@googlegroups.com
Confirmed that it must be a problem with a new published dataset.

The following works:

This one dataset was one of the 1st ones deposited.

So how do I find the problem dataset? I assume by creating sets based on deposit dates?
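One alternative to building sets by hand might be to request each record on its own and see which response fails to parse. The sketch below assumes the identifiers are already known; `get_record_url`, `find_bad_records`, the base URL, and the DOIs are all hypothetical, while `GetRecord`, `identifier`, and `metadataPrefix` are standard OAI-PMH request parameters. It only catches problems that show up in a single record's XML, so a failure that only appears in the full listing would not be found this way.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterable, List
from urllib.parse import urlencode


def get_record_url(base_url: str, identifier: str, metadata_prefix: str = "oai_dc") -> str:
    """Build an OAI-PMH GetRecord request URL for one identifier."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


def find_bad_records(base_url: str, identifiers: Iterable[str],
                     fetch: Callable[[str], str]) -> List[str]:
    """Return the identifiers whose GetRecord response is not well-formed XML."""
    bad = []
    for identifier in identifiers:
        body = fetch(get_record_url(base_url, identifier))
        try:
            ET.fromstring(body)
        except ET.ParseError:
            bad.append(identifier)
    return bad


# Offline demonstration: canned responses stand in for real HTTP calls.
canned = {
    "AAA": "<r><metadata>fine</metadata></r>",
    "BBB": "<r><metadata>broken</r>",
}


def fake_fetch(url: str) -> str:
    return next(body for key, body in canned.items() if key in url)


print(find_bad_records(
    "https://dataverse.example.edu/oai",  # hypothetical endpoint
    ["doi:10.5072/FK2/AAA", "doi:10.5072/FK2/BBB"],  # made-up identifiers
    fake_fetch,
))
```

Against a live server, `fetch` could be a small wrapper around `urllib.request.urlopen` that returns the decoded response body.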

Thanks for helping me work through this.

--
Sherry


Philip Durbin

Sep 24, 2018, 4:37:35 PM
to dataverse...@googlegroups.com
I'd take a look at the description of this one. It has some HTML and smart quotes in it and was published this month: https://doi.org/10.18130/V3/R9DDOR

I found it by searching for "in watershed acid-base status" from the error in Firefox (attached).

But who knows. Maybe it's some other dataset. I'm glad 4.8.6 is working fine otherwise.
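For context on why HTML in a description is a plausible suspect: raw markup copied into an XML element unescaped can make the whole document ill-formed, whereas smart quotes by themselves are ordinary Unicode characters and parse fine. A minimal sketch, using invented sample text rather than the real dataset description, and a hypothetical `is_well_formed` helper:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Invented description containing HTML, smart quotes, and a bare ampersand.
description = "Line one<br>Line two with \u201csmart quotes\u201d & an ampersand"

# Dropping the text in unescaped: the stray <br> and & break the document.
raw = f"<description>{description}</description>"

# Escaping &, < and > first keeps the document well-formed.
safe = f"<description>{escape(description)}</description>"

# Smart quotes alone are not a problem.
quotes_only = "<description>\u201cquoted\u201d</description>"

print(is_well_formed(raw), is_well_formed(safe), is_well_formed(quotes_only))
```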

Phil


Screen Shot 2018-09-24 at 4.35.49 PM.png

Juan Corrales

Sep 25, 2018, 6:56:16 AM
to Dataverse Users Community
Hi Philip and Sherry,
v4.8.6 had a bug in OAI-PMH affecting repositories with deaccessioned datasets: https://github.com/IQSS/dataverse/issues/4548

Juan

Sherry Lake

Sep 25, 2018, 11:16:25 AM
to dataverse...@googlegroups.com
Thanks, Juan.

I think that is it. My harvest set includes 41 datasets, and a note of 1 deleted dataset. 

Here is what the harvest result says:
Screen Shot 2018-09-25 at 11.11.55 AM.png

I went through most of the datasets, creating smaller OAI-PMH sets, and all worked. Even the one that you referenced earlier, Phil, seems OK:

Now I have a question: why are deaccessioned datasets being included in the harvest?
UVa's Dataverse has two deaccessioned datasets, but only one gets mentioned in the harvest set (why just one, why not all, and why any at all)?

I am also curious: with "41 records exported", why does it include the deleted dataset (which would make it 42)?


Thanks again.
Sherry



Juan Corrales

Sep 26, 2018, 10:33:45 AM
to dataverse...@googlegroups.com
> Now I have a question about why deaccessioned datasets are being included in the harvest?
They should be included so that harvesters can delete them (if Harvard Dataverse harvests your repository, it sees the "deleted" dataset and removes its link).

Yes, it is. 
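In OAI-PMH, a deleted record is signalled by `status="deleted"` on the record's `<header>` element, so a harvester can separate the two kinds of entries when it reads a response. A minimal sketch of that step follows; the sample response, its identifiers and datestamps, and the `split_headers` helper are made up for illustration, while the namespace and the `status` attribute come from the OAI-PMH specification.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

OAI_URI = "http://www.openarchives.org/OAI/2.0/"
OAI_NS = {"oai": OAI_URI}

# Invented ListIdentifiers response with one live and one deleted record.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListIdentifiers>
    <header>
      <identifier>doi:10.5072/FK2/AAA</identifier>
      <datestamp>2018-09-01T00:00:00Z</datestamp>
    </header>
    <header status="deleted">
      <identifier>doi:10.5072/FK2/BBB</identifier>
      <datestamp>2018-09-02T00:00:00Z</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def split_headers(xml_text: str) -> Tuple[List[str], List[str]]:
    """Return (active, deleted) identifier lists from an OAI-PMH response."""
    root = ET.fromstring(xml_text)
    active, deleted = [], []
    # iter() also finds headers nested inside <record> in a ListRecords response.
    for header in root.iter(f"{{{OAI_URI}}}header"):
        identifier = header.findtext("oai:identifier", namespaces=OAI_NS)
        if header.get("status") == "deleted":
            deleted.append(identifier)
        else:
            active.append(identifier)
    return active, deleted


active, deleted = split_headers(SAMPLE)
print("active:", active)
print("deleted:", deleted)
```

A harvester would then import or update the active identifiers and drop its local copies of the deleted ones.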


Philipp at UiT

Oct 5, 2018, 10:29:23 AM
to Dataverse Users Community
We are getting complaints from our OAI harvesters about the "</metadata></metadata>" issue. Is this issue solved in 4.9.3 and 4.9.4?

Best,
Philipp

Philip Durbin

Oct 5, 2018, 1:13:35 PM
to dataverse...@googlegroups.com
The double closing "</metadata></metadata>" issue was fixed in
https://github.com/IQSS/dataverse/pull/5037 which was released with
Dataverse 4.9.3. In general, you should be able to trust what the
"Milestone" says for a pull request (or an issue).

Philipp at UiT

Oct 5, 2018, 2:58:35 PM
to Dataverse Users Community
Thanks, Phil! I had skimmed through the list, but apparently not well enough... I'll ask our system admin to upgrade our installation.