handles, isreplacedby, and withdrawn items


Fitchett, Deborah

Jun 14, 2018, 12:51:42 AM
to dspac...@googlegroups.com

Kia ora all,

We’re occasionally getting duplicate records created in DSpace, with no way to resolve the issue other than to withdraw the earlier record and go forward with the more recent one. (1)

But of course we don’t really want the handle of the earlier record to result in a dead end; we’d like it to resolve to the new record, or at least be redirected there.

We have metadata fields dc.relation.isreplacedby and dc.relation.replaces. These seem to have no functional purpose in DSpace (i.e. it doesn’t do any automatic redirects); they’re just information. If the item is *not* withdrawn, I could add some JavaScript to the page to accomplish the redirect that way. But I’m not sure it’s the cleanest way.

I’m looking at the handle table in the database and wondering: what if I simply find the row with the handle that’s currently linked to the old record, and update it to point to the resource_id of the new record (2)? Then we could withdraw the old record, but people following the old link would still get to what they want. Would this do what I’m thinking? Would there be any problems I’m not seeing with having two handles point to the same record?

And/or is there a better way? How would you deal with a case like this?

Deborah

(1) Due to synchronisation with Symplectic Elements, which works 99.9% of the time but not 100%.

(2) I’m happy messing directly in the database in general, having done it with the metadatavalue table a pile of times, obviously always with much testing and great care. I haven’t touched the handle table before though.
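
For anyone wanting to try the idea on a scratch copy first, the handle-table change described in this message could be sketched roughly as follows. This is only an illustration against an in-memory SQLite stand-in, not DSpace's real database. The column names (handle_id, handle, resource_type_id, resource_id), the value 2 for the Item resource type, the sample handles and IDs, and the helper names repoint_handle and resolve are all assumptions made for the example and should be checked against your own schema.

```python
import sqlite3

ITEM_TYPE = 2  # assumed resource_type_id for Items; verify in your instance


def make_toy_db():
    """Build an in-memory stand-in for the handle table (assumed columns)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE handle ("
        " handle_id INTEGER PRIMARY KEY,"
        " handle TEXT UNIQUE,"
        " resource_type_id INTEGER,"
        " resource_id INTEGER)"
    )
    db.executemany(
        "INSERT INTO handle (handle, resource_type_id, resource_id) VALUES (?, ?, ?)",
        [
            ("123456789/100", ITEM_TYPE, 10),  # old, duplicate record
            ("123456789/200", ITEM_TYPE, 20),  # newer record to keep
        ],
    )
    return db


def repoint_handle(db, old_handle, new_resource_id):
    """Point an existing handle row at a different item; return rows changed."""
    cur = db.execute(
        "UPDATE handle SET resource_id = ? WHERE handle = ? AND resource_type_id = ?",
        (new_resource_id, old_handle, ITEM_TYPE),
    )
    db.commit()
    return cur.rowcount


def resolve(db, handle):
    """Return the resource_id a handle points at, or None if unknown."""
    row = db.execute(
        "SELECT resource_id FROM handle WHERE handle = ?", (handle,)
    ).fetchone()
    return row[0] if row else None


db = make_toy_db()
changed = repoint_handle(db, "123456789/100", 20)
print(changed, resolve(db, "123456789/100"), resolve(db, "123456789/200"))
```

In this toy version both handles end up resolving to item 20, and item 10 is left without any handle row, which is a side effect worth checking for when testing other jobs.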




Tim Donohue

Jun 14, 2018, 10:42:16 AM
to Fitchett, Deborah, dspac...@googlegroups.com
Hi Deborah,

I'll admit, I've never tried this myself, but your suggestion to simply update the old "handle" table entries to point at the new "resource_id" seems like it *should work*.  The "handle" table in DSpace is really just used to resolve/assign Handles to Objects.  So, at least conceptually, it should support pointing two handles at the same object (Item).

That said, I'd recommend first trying this out on a test or development server.  I think it should work, but it'd be worth testing more thoroughly how DSpace behaves when one Item object has multiple Handles (and for example, whether both handles appear on the Item splash page, etc).  I'd recommend testing basic functionality like browse/search/reindex. I suspect they all should work, but as this isn't a documented feature, it's worth double checking.

Let us know how it goes (please report back on this list), as this seems like it might be of interest to others.

- Tim


--
Tim Donohue
Technical Lead for DSpace & DSpaceDirect
DuraSpace.org | DSpace.org | DSpaceDirect.org

Claudia Jürgen

Jun 14, 2018, 11:00:18 AM
to dspac...@googlegroups.com
Hello Deborah,

This doesn't resolve the handle directly, but you may adjust the Item Withdrawn page and use
the metadata to point to the new item.
We moved quite a lot of material for the blind and visually impaired to a
new server and withdrew the records in our repository, e.g.:
http://hdl.handle.net/2003/21283
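
The idea of letting the withdrawn page point to the replacement could look roughly like this. It is a language-neutral sketch in Python, not the JSP/Java code actually used in the repository above. The function name tombstone_message, the dictionary layout of the metadata, the wording of the message and the sample values are assumptions for illustration only, and it assumes the replacement is recorded in dc.relation.isreplacedby (the field mentioned earlier in the thread) as either a bare handle or a full URL.

```python
HANDLE_PREFIX = "http://hdl.handle.net/"


def tombstone_message(metadata):
    """Build the text for a withdrawn-item page from a metadata dict.

    `metadata` maps field names to lists of values (hypothetical layout).
    """
    targets = metadata.get("dc.relation.isreplacedby", [])
    if not targets:
        return "This item has been withdrawn."
    links = []
    for value in targets:
        value = value.strip()
        # Treat bare handles such as "123456789/200" as hdl.handle.net links.
        if not value.startswith(("http://", "https://")):
            value = HANDLE_PREFIX + value
        links.append(value)
    return "This item has been withdrawn. It is replaced by: " + ", ".join(links)


print(tombstone_message({"dc.relation.isreplacedby": ["123456789/200"]}))
print(tombstone_message({"dc.title": ["Some withdrawn item"]}))
```

The old handle keeps resolving to the withdrawn item's own page, so nothing in the handle table has to change; only the page content depends on the metadata.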

Hope this helps

Claudia Jürgen



--
Claudia Juergen
Eldorado

Technische Universität Dortmund
Universitätsbibliothek
Vogelpothsweg 76
44227 Dortmund

Tel.: +49 231-755 40 43
Fax: +49 231-755 40 32
claudia...@tu-dortmund.de
www.ub.tu-dortmund.de


Fitchett, Deborah

Jun 20, 2018, 12:34:59 AM
to dspac...@googlegroups.com
Hi Claudia,

That looks like a really good solution if editing the handles themselves doesn't work out. How did you implement this? I can't see where in the XML files to edit the Item Withdrawn page, so does it involve some coding changes?

Thanks very much,

Deborah

Claudia Jürgen

Jun 20, 2018, 8:09:57 AM
to dspac...@googlegroups.com
Hello Deborah,

Yes, code changes are involved. For the JSPUI in version 5.x they are in:
[dspace5x]/dspace/config/dspace.cfg
[dspace5x]/dspace-jspui/jspui/src/main/webapp/tombstone.jsp
[dspace5x]/dspace-api/src/main/java/org/dspace/content/Item.java
[dspace5x]/dspace-jspui/src/main/java/org/dspace/app/webui/servlet/HandleServlet.java
[dspace5x]/dspace-jspui/src/main/java/org/dspace/app/webui/servlet/BitstreamServlet.java
[dspace5x]/dspace-api/src/main/resources/Messages.properties
Do not forget the overlay.

I'm not familiar enough with the XMLUI to point you to the relevant files
there.

Hope this helps

Claudia Jürgen


Fitchett, Deborah

Aug 22, 2018, 9:56:16 PM
to dspac...@googlegroups.com, Tim Donohue

Hi all,

I finally got around to trying this out on our dev server: I updated the “handle” table to point two different handles to the one resource_id (and none to the withdrawn resource_id). In the first instance it seemed to work; following the links did what I expected.

Then I tried running the index-discovery -b job. That kept stopping mysteriously partway through: no obvious errors in the Solr or DSpace logs, it just stopped. (I don’t know, maybe our dev server’s just slow and ran out of memory or got distracted or something.) But running index-discovery with no options picked up where it left off, and after a couple of repetitions of this it finally indexed the item in question, and all my search/browse tests worked as expected too.

So I was on the verge of declaring victory, and then I ran an oai import -c -v job. To my tremendous disappointment that failed partway through with:

Item with handle null indexed
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Document is missing mandatory uniqueKey field: item.handle
        at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
        at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
        at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
        at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
        at org.dspace.xoai.app.XOAI.index(XOAI.java:213)
        at org.dspace.xoai.app.XOAI.indexAll(XOAI.java:200)
        at org.dspace.xoai.app.XOAI.index(XOAI.java:131)
        at org.dspace.xoai.app.XOAI.main(XOAI.java:495)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
        at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)

This makes sense in retrospect: the OAI feed includes withdrawn items (in order to publish a “record status: deleted” record) but identifies them by their handles, so every item it indexes needs one.

This is highly disappointing, but at least now we know. It looks like it would work for a repository which didn’t publish an OAI feed, but sadly that’s vital for us.

Our fall-back will be to follow Claudia’s suggestion of adjusting the Item Withdrawn page and using the metadata to point to the new item.

Deborah


Tim Donohue

Aug 23, 2018, 12:01:16 PM
to Fitchett, Deborah, dspac...@googlegroups.com
Thanks for reporting back on your findings, Deborah.

It sounds like the act of pointing two handles at one item "works", but removing the handle from the withdrawn item is the core issue. I'm not surprised that DSpace cannot fully manage Items without handles, as handles are still very "built into" DSpace.

I've logged this issue as a bug: https://jira.duraspace.org/browse/DS-3993

In the meantime, one option to possibly work around this issue would be to define a "fake/dummy" handle for the withdrawn item.  For example, give it a handle of "withdrawn/[old-id]" or something.  This isn't exactly ideal, but if the only issue is that the Handle cannot be null, this might be a possible workaround.
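
The placeholder-handle workaround could be sketched like this, again against an in-memory SQLite stand-in, not a real DSpace database. The table layout, the Item type value 2, the sample IDs and the helper name add_placeholder_handle are assumptions; the "withdrawn/[old-id]" pattern is the one suggested above. Whether DSpace and the OAI indexer actually accept such a handle is not tested here.

```python
import sqlite3

ITEM_TYPE = 2  # assumed resource_type_id for Items


def add_placeholder_handle(db, item_id):
    """Give an item without any handle row a dummy 'withdrawn/<id>' handle.

    Returns the handle now attached to the item (existing or newly created).
    """
    row = db.execute(
        "SELECT handle FROM handle WHERE resource_type_id = ? AND resource_id = ?",
        (ITEM_TYPE, item_id),
    ).fetchone()
    if row:
        return row[0]  # item already has a handle; leave it alone
    placeholder = f"withdrawn/{item_id}"
    db.execute(
        "INSERT INTO handle (handle, resource_type_id, resource_id) VALUES (?, ?, ?)",
        (placeholder, ITEM_TYPE, item_id),
    )
    db.commit()
    return placeholder


# Toy data: both real handles already point at the new item (id 20),
# so the withdrawn item (id 10) has no handle row.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE handle (handle_id INTEGER PRIMARY KEY, handle TEXT UNIQUE,"
    " resource_type_id INTEGER, resource_id INTEGER)"
)
db.executemany(
    "INSERT INTO handle (handle, resource_type_id, resource_id) VALUES (?, ?, ?)",
    [("123456789/100", ITEM_TYPE, 20), ("123456789/200", ITEM_TYPE, 20)],
)
print(add_placeholder_handle(db, 10))
print(add_placeholder_handle(db, 20))
```

SQLite assigns handle_id automatically here; in a real installation the primary key would presumably have to come from the database's own sequence, so that part needs checking before trying this anywhere but a test server.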

Nonetheless, honestly, Claudia's fix seems very reasonable and it seems to involve much less "messing around in the database".  So, I wouldn't fault you for looking towards simply using that.

- Tim

Øyvind Gjesdal

May 13, 2020, 10:36:28 AM
to DSpace Technical Support
I see the Jira issue is still open, so hopefully this is relevant for someone.

I just had a similar issue with the OAI-PMH clean import (oai import -c) in a DSpace 5 instance.

To locate the culprit item I got good help from an old Nabble thread on the dspace oai -c error, but our Discovery index did not return any matches for items without handles, and the SQL query returned more than 1000 items. The OAI-PMH clean index ran successfully after I changed the item without a DOI to private, which sets discoverable to false. This makes the item no longer harvestable through OAI.

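-- Items that are archived or withdrawn, and discoverable, but have no row in the handle table
SELECT item_id FROM item
WHERE (in_archive = TRUE OR withdrawn = TRUE) AND discoverable = TRUE
  AND NOT EXISTS
    (SELECT resource_id FROM handle WHERE handle.resource_id = item.item_id);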

I'm sharing the query above in case someone hits a similar error when reindexing and needs to locate the item in order to fix the index.
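
As a quick way to see what the query selects, here is a small sketch that runs the same logic against toy data in an in-memory SQLite database. The table layouts are reduced to the columns the query touches, the sample rows are invented, and booleans are stored as 1/0 because this is SQLite, not the database DSpace actually runs on.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript(
    """
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, in_archive INTEGER,
                       withdrawn INTEGER, discoverable INTEGER);
    CREATE TABLE handle (handle TEXT, resource_id INTEGER);

    -- 1: archived, has a handle            -> not reported
    -- 2: withdrawn, handle was repointed   -> reported
    -- 3: withdrawn, no handle, but private -> not reported
    -- 4: still in the workflow, no handle  -> not reported
    INSERT INTO item VALUES (1, 1, 0, 1), (2, 0, 1, 1), (3, 0, 1, 0), (4, 0, 0, 1);
    INSERT INTO handle VALUES ('123456789/1', 1);
    """
)

FIND_ITEMS_WITHOUT_HANDLE = """
    SELECT item_id FROM item
    WHERE (in_archive = 1 OR withdrawn = 1) AND discoverable = 1
      AND NOT EXISTS
        (SELECT resource_id FROM handle WHERE handle.resource_id = item.item_id)
    ORDER BY item_id
"""

missing = [row[0] for row in db.execute(FIND_ITEMS_WITHOUT_HANDLE)]
print(missing)
```

One thing to keep in mind: the query matches on resource_id alone. If the handle table in your version also stores handles for communities and collections under the same numeric IDs, a possible refinement would be to match on the resource type column as well, but that is an assumption to verify against your own schema.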

I also noted that the SQL query used by the (OAI) clean index doesn't sort by date. Running:
dspace oai import -c # Failed after 12200 items
dspace oai import # Ran successfully for 2000 items, probably because the error item was older than the newest existing entry in the index

left us with an OAI index missing around 3000 items.

I don't know if this has been fixed in later DSpace versions, but I notice the code has been refactored.

Best regards,
Øyvind