Kia ora all,
We’re occasionally getting duplicate records created in DSpace, with no way to resolve the issue other than to withdraw the earlier record and go forward with the more recent one.(1)
But of course we don’t really want the handle of the earlier record to result in a dead-end – we’d like it to resolve to the new record, or at least be redirected there.
We have metadata fields dc.relation.isreplacedby and dc.relation.replaces. These seem to have no functional purpose in DSpace (i.e. it doesn’t do any automatic redirects); they’re just information. If the item is *not* withdrawn, I could add some JavaScript to the page to accomplish the redirect that way. But I’m not sure it’s the cleanest way.
I’m looking at the handle table in the database and wondering – what if I simply find the row with the handle that’s currently linked to the old record, and update it to point to the resource_id of the new record(2)? Then we could withdraw the old record but people following the old link would still get to what they want. Would this do what I’m thinking? Would there be any problems I’m not seeing with having two handles point to the same record?
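A minimal sketch of the repointing idea, run against an in-memory SQLite stand-in rather than a real DSpace database. The column names (handle, resource_type_id, resource_id) follow the pre-6 DSpace handle table as far as I know; the handles, the item ids, and the value 2 for "item" are assumptions to check against your own installation.

```python
import sqlite3

ITEM = 2  # assumed resource_type_id for items; verify for your DSpace version

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE handle ("
    " handle_id INTEGER PRIMARY KEY,"
    " handle TEXT UNIQUE,"
    " resource_type_id INTEGER,"
    " resource_id INTEGER)"
)
# Hypothetical data: item 101 is the old duplicate, item 202 the record to keep.
conn.executemany(
    "INSERT INTO handle (handle, resource_type_id, resource_id) VALUES (?, ?, ?)",
    [("123456789/1111", ITEM, 101), ("123456789/2222", ITEM, 202)],
)


def repoint_handle(conn, old_handle, new_item_id):
    """Point an existing handle at a different item; return the rows changed."""
    cur = conn.execute(
        "UPDATE handle SET resource_id = ? "
        "WHERE handle = ? AND resource_type_id = ?",
        (new_item_id, old_handle, ITEM),
    )
    conn.commit()
    return cur.rowcount


changed = repoint_handle(conn, "123456789/1111", 202)

# Both handles should now lead to item 202 ...
handles_for_new = [
    row[0]
    for row in conn.execute(
        "SELECT handle FROM handle WHERE resource_id = ? ORDER BY handle", (202,)
    )
]
# ... and item 101 is left with no handle row at all.
handles_for_old = conn.execute(
    "SELECT COUNT(*) FROM handle WHERE resource_id = ?", (101,)
).fetchone()[0]
```

Note that the update leaves the old item without any handle row, so anything in DSpace that expects every archived or withdrawn item to have a handle would be worth testing on a dev copy first.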
And/or is there a better way? How would you deal with a case like this?
Deborah
(1) Due to synchronisation with Symplectic Elements which works 99.9% of the time but not 100%.
(2) I’m happy messing directly in the database in general, having done it with the metadatavalue table a pile of times – obviously always with much testing and great care. I haven’t touched the handle table before though.
--
You received this message because you are subscribed to the Google Groups "DSpace Technical Support" group.
Hi all,
I finally got around to trying this out on our dev server: updated the “handle” table to point two different handles to the one resource_id (and none to the withdrawn resource_id). It seemed in the first instance to work – following the links did what I expected.
Then I tried running the index-discovery -b job. That kept stopping mysteriously in the middle of it – no obvious errors in the solr or dspace logs, just stopped. (I don’t know, maybe our dev server’s just slow and ran out of memory or got distracted or something.) But running index-discovery with no options picked up where it left off, and after a couple of repetitions of this it finally indexed the item in question and all my search/browse tests worked as expected too.
So I was on the verge of declaring victory – and then I ran an oai import -c -v job. To my tremendous disappointment that failed partway through with:
Item with handle null indexed
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Document is missing mandatory uniqueKey field: item.handle
at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:552)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:210)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:206)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:124)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)
at org.dspace.xoai.app.XOAI.index(XOAI.java:213)
at org.dspace.xoai.app.XOAI.indexAll(XOAI.java:200)
at org.dspace.xoai.app.XOAI.index(XOAI.java:131)
at org.dspace.xoai.app.XOAI.main(XOAI.java:495)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:226)
at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:78)
This makes sense in retrospect: the OAI feed includes withdrawn items (in order to publish a “record status: deleted” record) but identifies them by their handles (so obviously requires a handle).
This is highly disappointing, but at least now we know. It looks like it would work for a repository which didn’t publish an OAI feed, but sadly that’s vital for us.
Our fall-back will be to follow Claudia’s suggestion of adjusting the Item Withdrawn page and using the metadata to point to the new item.
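As a rough sketch of that fall-back, the withdrawn-item page could look up dc.relation.isreplacedby and turn it into a link. The function name, the dictionary shape of the metadata, and the base URL below are all hypothetical; a real DSpace theme would do this in its own templating layer.

```python
def replacement_target(metadata, base_url="https://repository.example.org/handle/"):
    """Return a URL for the replacing item, or None if no replacement is recorded.

    `metadata` is assumed to map field names to lists of string values.
    """
    for value in metadata.get("dc.relation.isreplacedby", []):
        value = value.strip()
        if not value:
            continue  # skip blank values
        if value.startswith(("http://", "https://")):
            return value  # already a full URL
        return base_url + value  # assume a bare handle such as 123456789/2222
    return None


# Hypothetical withdrawn item that records the handle of its replacement.
withdrawn_item = {"dc.relation.isreplacedby": ["123456789/2222"]}
link = replacement_target(withdrawn_item)
```

The page can then show the link (or issue a redirect) only when the function returns something, and fall back to the normal withdrawn notice otherwise.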
Deborah
From: Tim Donohue <tdon...@duraspace.org>
Sent: Friday, 15 June 2018 2:42 AM
To: Fitchett, Deborah <Deborah....@lincoln.ac.nz>
Cc: dspac...@googlegroups.com
Subject: Re: [dspace-tech] handles, isreplacedby, and withdrawn items
Hi Deborah,
SELECT item_id
FROM item
WHERE (in_archive = TRUE OR withdrawn = TRUE)
  AND discoverable = TRUE
  AND NOT EXISTS (
    SELECT resource_id FROM handle WHERE handle.resource_id = item.item_id
  );
Posting this in case someone hits a similar error when reindexing and needs to locate the item in order to fix the index.
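To illustrate what the query picks up, here is a small sketch that runs an equivalent query against an in-memory SQLite stand-in. The tables are cut down to the columns the query touches and the rows are made up; SQLite stores booleans as 0/1, so the TRUE comparisons are written as = 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item (
        item_id INTEGER PRIMARY KEY,
        in_archive BOOLEAN,
        withdrawn BOOLEAN,
        discoverable BOOLEAN
    );
    CREATE TABLE handle (
        handle TEXT,
        resource_type_id INTEGER,
        resource_id INTEGER
    );
    -- Hypothetical rows: item 101 is withdrawn and has lost its handle,
    -- item 202 holds two handles, item 303 is neither archived nor withdrawn.
    INSERT INTO item VALUES (101, 0, 1, 1);
    INSERT INTO item VALUES (202, 1, 0, 1);
    INSERT INTO item VALUES (303, 0, 0, 1);
    INSERT INTO handle VALUES ('123456789/1111', 2, 202);
    INSERT INTO handle VALUES ('123456789/2222', 2, 202);
    """
)

# Same logic as the query above: archived or withdrawn, discoverable,
# and no handle row pointing at the item.
ORPHAN_QUERY = """
    SELECT item_id FROM item
    WHERE (in_archive = 1 OR withdrawn = 1) AND discoverable = 1
      AND NOT EXISTS (
          SELECT 1 FROM handle WHERE handle.resource_id = item.item_id
      )
    ORDER BY item_id
"""

orphans = [row[0] for row in conn.execute(ORPHAN_QUERY)]
print(orphans)  # [101]
```

One caveat, which I have not verified against every version: in the pre-6 schema resource_id appears to be unique only within a resource type, so a community or collection with the same numeric id could hide an item from this query. Adding a resource_type_id condition to the subquery may be worth considering.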
I also noted that the SQL query used by the (OAI) clean index doesn't sort by date, so running these two commands in sequence:
dspace oai import -c # Failed after 12200 items
dspace oai import # Ran successfully for 2000 items, probably because the failing item was older than the newest existing record in the index
left us with an OAI index missing around 3000 items.
I don't know if this has been fixed in later DSpace versions, but I notice the code has been refactored.
Best regards,
Øyvind