Some of our dataset URLs in Dublin Core are not resolving identifiers to http


ofu...@gmail.com

May 3, 2018, 7:56:21 AM
to Dataverse Users Community

Hi,

We have some dataset URLs in the Dublin Core (DC) export whose identifiers do not resolve because they are not prefixed with “http://”:

  <dcterms:identifier>hdl:10037.1/10040</dcterms:identifier>

should be changed to

  <dcterms:identifier>http://hdl:10037.1/10040</dcterms:identifier>

How can I convert this format to a resolving http://...... URL?


Thanks in advance.

Obi

Don Sizemore

May 3, 2018, 8:21:15 AM
to dataverse...@googlegroups.com
Hello,

This may be something you can do with RestfulHS?
https://github.com/NYULibraries/RestfulHS

All I see for "protocol" in the dataset table are hdl and doi.

Donald


--
You received this message because you are subscribed to the Google Groups "Dataverse Users Community" group.
To unsubscribe from this group and stop receiving emails from it, send an email to dataverse-community+unsub...@googlegroups.com.
To post to this group, send email to dataverse-community@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/dataverse-community/84d7bb58-2cad-4171-9c27-0deb71e9bfdd%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Philip Durbin

May 3, 2018, 8:23:59 AM
to dataverse...@googlegroups.com
I think the solution is to upgrade to Dataverse 4.8.6, which switches from http to https: https://github.com/IQSS/dataverse/issues/4469

I'm not sure whether you then need to re-register your Handles, but there's an API for that called "modifyRegistration": https://github.com/IQSS/dataverse/issues/3318


Pete Meyer

May 3, 2018, 9:43:49 AM
to Dataverse Users Community
Hi Obi,

This looks to me like the tool you're using to resolve the identifiers isn't correctly handling the Handle identifier scheme. I may be incorrect, but I believe the recommendation for PIDs is to store them as hdl:${identifier} (or doi:${identifier}) in machine-readable areas and to display them as https://${resolver}/${identifier} in human-visible areas.

Best,
Pete



Laura Huisintveld

May 4, 2018, 4:32:37 AM
to Dataverse Users Community
I am not completely sure, but if you convert a handle to a URL, your link should probably look like

http://hdl.handle.net/10037.1/10040

instead of: http://hdl:10037.1/10040
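
As a rough illustration of the mapping described above, the following shell sketch turns a prefixed identifier into a resolver URL. The function name pid_to_url is hypothetical and not part of Dataverse; the resolver hosts (hdl.handle.net for Handles, doi.org for DOIs) and the use of https are assumptions based on the public resolvers named in this thread.

```shell
# Hypothetical helper (not part of Dataverse): map a prefixed persistent
# identifier such as "hdl:10037.1/10040" to a resolver URL.
# Assumption: Handles resolve via hdl.handle.net and DOIs via doi.org.
pid_to_url() {
  case "$1" in
    hdl:*) printf 'https://hdl.handle.net/%s\n' "${1#hdl:}" ;;
    doi:*) printf 'https://doi.org/%s\n' "${1#doi:}" ;;
    *)     printf 'unsupported identifier: %s\n' "$1" >&2; return 1 ;;
  esac
}

pid_to_url "hdl:10037.1/10040"    # prints https://hdl.handle.net/10037.1/10040
pid_to_url "doi:10.18710/7NLJSG"  # prints https://doi.org/10.18710/7NLJSG
```

Note that simply putting "http://" in front of "hdl:..." does not produce a resolvable URL; the prefix has to be replaced by the resolver host.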

Kind regards,
Laura


ofu...@gmail.com

May 7, 2018, 2:06:38 PM
to Dataverse Users Community
I have upgraded to 4.8.6, but that did not solve the problem. I can't find the "modifyRegistration" API at http://guides.dataverse.org/en/4.6.1/api/native-api.html. Where can I find it?

Obi



Pete Meyer

May 7, 2018, 2:22:31 PM
to Dataverse Users Community
Hi Obi,

It looks like it's not in the API guides yet, but https://github.com/IQSS/dataverse/issues/4569#issuecomment-379371317 appears to have some information on this endpoint.

Best,
Pete

ofu...@gmail.com

May 7, 2018, 3:34:27 PM
to Dataverse Users Community
Thanks Pete.

Is it this:
GET http://$SERVER/api/datasets/modifyRegistrationAll?key=$apiKey

Is the key the same as the generated API token, or where do I find the key?


Thanks in advance.

Obi

Pete Meyer

May 7, 2018, 3:46:23 PM
to Dataverse Users Community
Hi Obi,

API token and key are the same thing.

Also worth mentioning in the context of Dataverse APIs - using the "X-Dataverse-key: $apiKey" header (`-H "X-Dataverse-key: $apiKey"` if you're using curl) is equivalent to `?key=$apiKey` , and the header doesn't leave the API key / token in the server access logs.
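
A minimal sketch of the two equivalent ways to pass the token with curl. SERVER and API_TOKEN are placeholder variables and the two wrapper function names are hypothetical; the endpoint path in the example comment is the one quoted earlier in this thread.

```shell
# Placeholders: replace with your own server and API token.
SERVER="${SERVER:-http://localhost:8080}"
API_TOKEN="${API_TOKEN:-replace-with-your-token}"

# Token as a query parameter: works, but the token becomes part of the
# request URL and can therefore end up in server access logs.
call_with_query_key() {
  curl -s "$SERVER$1?key=$API_TOKEN"
}

# Token as a request header: equivalent for Dataverse, and the token
# stays out of the logged URL.
call_with_header_key() {
  curl -s -H "X-Dataverse-key: $API_TOKEN" "$SERVER$1"
}

# Example (not run here):
#   call_with_header_key /api/datasets/modifyRegistrationAll
```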

Best,
Pete

ofu...@gmail.com

May 8, 2018, 5:37:33 AM
to Dataverse Users Community
Hi Pete, I tried to run this:
curl -X GET http://localhost:8080/api/datasets/modifyRegistrationAll?key=<d32aa.............>
but without success.

Cheers 
Obi

Pete Meyer

May 8, 2018, 10:23:46 AM
to Dataverse Users Community
Hi Obi,

I don't know a lot about this API endpoint, unfortunately.  Could you provide a bit more information about how it was failing (which might help somebody with a better idea provide some suggestions)?

Best,
Pete

ofu...@gmail.com

May 10, 2018, 3:42:57 PM
to Dataverse Users Community
Philip, could you help? I tried to run the following on our test Dataverse:
 curl -H "X-Dataverse-key:d32aa8bc .....a1784" -X GET http://localhost:8080/api/datasets/modifyRegistrationAll

and I received the following message:
 {"status":"OK","data":{"message":"Update All Dataset target url completed"}}

But I cannot see that they have all been updated. Example:

Cheers
Obi


Philip Durbin

May 11, 2018, 2:28:16 AM
to dataverse...@googlegroups.com
Hi Obi,

I'm confused because it looks like https://doi.org/10.18710/7NLJSG is correctly redirecting to https://dataverse.no/dataset.xhtml?persistentId=doi:10.18710/7NLJSG

With `curl -i https://doi.org/10.18710/7NLJSG` you can see the redirect as HTTP code 302.
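
If only the status code and the redirect target are of interest, curl's write-out option can print just those two values. This is a small sketch, assuming a curl build that supports the http_code and redirect_url write-out variables; the function name show_redirect is hypothetical.

```shell
# Print the HTTP status code and the redirect target for a URL without
# following the redirect. -o /dev/null discards the response body;
# -w prints curl's http_code and redirect_url write-out variables.
show_redirect() {
  curl -s -o /dev/null -w '%{http_code} %{redirect_url}\n' "$1"
}

# Example (needs network access, so it is not run here):
#   show_redirect https://doi.org/10.18710/7NLJSG
```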

If you're still having trouble, please let us know.

Thanks,

Phil


ofu...@gmail.com

May 11, 2018, 3:15:50 AM
to Dataverse Users Community
Hi Philip,
Click on the "Metadata" menu and choose "Dublin Core" from "Export Metadata".
If you look at the generated XML file, you will see that "http" is missing:

<dcterms:identifier>doi:10.18710/7NLJSG</dcterms:identifier>

Cheers
Obi

Jim Myers

May 11, 2018, 10:48:02 AM
to Dataverse Users Community
Obi,
The exported metadata files get cached and aren't updated until the cached copy is removed. There is a clearAllCachedFormat(Dataset) method in edu.harvard.iq.dataverse.export.ExportService, but I'm not sure if it is exposed through the API (the only use I see is in deaccessioning).

I just removed all *.cached files recursively in the dataverse/files directory tree when we had this issue (QDR changed its URL from beta.qdr.org to qdr.syr.edu and the metadata files had the old URL cached). Care in typing is obviously needed to avoid removing data files, but I think it should work for you.
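
One cautious way to do this kind of cleanup is to list the matching files first and delete only after reviewing the list. The sketch below is an illustration, not a Dataverse tool: the function names are hypothetical, and the directory is passed as an argument because its location depends on the installation.

```shell
# List cached export files under the given directory (dry run, deletes nothing).
list_cached_exports() {
  find "$1" -type f -name '*.cached' -print
}

# Delete the same set of files. Only regular files whose names end in
# .cached match, so files with other names are left alone.
remove_cached_exports() {
  find "$1" -type f -name '*.cached' -exec rm -f {} +
}

# Example (the path is an assumption; adjust it to your installation):
#   list_cached_exports dataverse/files
#   remove_cached_exports dataverse/files
```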

It's probably worth reporting an issue that an API call like the one you used should trigger removal of those cached files (or, if the clear cached files method is/could be available via the API, that the documentation get a note that you run it after changes that affect exported metadata).

-- Jim

Philip Durbin

May 11, 2018, 7:32:55 PM
to dataverse...@googlegroups.com
Jim, thanks for jumping in. An API endpoint to clear out cached export files is a great idea and you should absolutely feel free to open a GitHub issue about this.

Obi, I now understand what your problem is. I'm sorry it took so much back and forth to get there. Thanks for upgrading to 4.8.6. I took a quick look at the code and it's calling globalId.toURL() like this[1]:

xmlw.writeCharacters(globalId.toURL().toString());

That code looks like this[2]:

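public URL toURL() {
    // Stays null if the protocol is neither DOI nor Handle,
    // or if the assembled string is not a valid URL.
    URL url = null;
    try {
        if (protocol.equals(DOI_PROTOCOL)) {
            // Resolver prefix + authority + "/" + identifier
            url = new URL(DOI_RESOLVER_URL + authority + "/" + identifier);
        } else if (protocol.equals(HDL_PROTOCOL)) {
            url = new URL(HDL_RESOLVER_URL + authority + "/" + identifier);
        }
    } catch (MalformedURLException ex) {
        // The error is only logged; the caller receives null.
        Logger.getLogger(GlobalId.class.getName()).log(Level.SEVERE, null, ex);
    }
    return url;
}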

I'm confused about why the DOI_RESOLVER_URL isn't inserted for you because it's hard-coded as "https://doi.org/" at the top of the file.

If I were you, I would add some debug statements to see if you can narrow down the problem. You're also welcome to open a GitHub issue about this, of course.

I hope this helps,

Phil


ofu...@gmail.com

May 12, 2018, 12:31:05 PM
to Dataverse Users Community
A thousand thanks to Jim and Philip. It works now. This is what I did:

Run: curl -H "X-Dataverse-key:<token key>" -X GET http://localhost:8080/api/datasets/modifyRegistrationAll

cd to dataverse/files

Run: find . -type f -name '*.cached' -exec rm -rf {} \;
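
To confirm the result, one option is to save a fresh Dublin Core export (for example through the "Export Metadata" menu mentioned earlier in the thread) and check that every dcterms:identifier value is a URL. The sketch below is an illustration only: the function name and the file name export_dc.xml are hypothetical, it assumes each identifier element sits on a single line, and it relies on grep -o, which GNU and BSD grep provide.

```shell
# Print the dcterms:identifier values in the given file that are NOT
# http(s) URLs. The exit status is 0 if at least one such value was
# found and non-zero otherwise.
non_url_identifiers() {
  grep -o '<dcterms:identifier>[^<]*</dcterms:identifier>' "$1" \
    | sed -e 's/<[^>]*>//g' -e 's/^[[:space:]]*//' \
    | grep -v -E '^https?://'
}

# Example (the file name is an assumption):
#   non_url_identifiers export_dc.xml || echo "no non-URL identifiers found"
```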

Cheers
Obi

Philip Durbin

May 13, 2018, 10:29:30 PM
to dataverse...@googlegroups.com
I'm glad that worked for you, Obi. Thanks again to Jim for suggesting the fix!
