[Dspace-tech] Problem attempting to migrate from DSpace 1.8 directly to 4.1

Patrick Rynhart

Aug 26, 2015, 1:16:55 PM
to dspac...@lists.sourceforge.net, Curnow, Amanda
Hi all,

I’m attempting an upgrade from DSpace 1.8 directly to 4.1 on a new
server but am running into a problem. Along with the asset store and
DB, we want to preserve our viewing stats. After migration, if I
run:

/usr/local/dspace/bin/dspace stats-util -b

then I get the following Java stack trace:

Exception: Document is missing mandatory uniqueKey field: uid
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Document is missing mandatory uniqueKey field: uid
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:424)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at org.dspace.statistics.SolrLogger.reindexBitstreamHits(SolrLogger.java:1482)
at org.dspace.statistics.util.StatisticsClient.main(StatisticsClient.java:97)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.dspace.app.launcher.ScriptLauncher.runOneCommand(ScriptLauncher.java:225)
at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:77)

This error message does not occur if I run “dspace stats-util -b” on our
DSpace 1.8 server.

———

A summary of the steps that I’m taking:

1. Copying across /usr/local/dspace/solr/statistics/data to the new server.
2. Doing a rsync of the asset store:

rsync -av --exclude=.snapshot --exclude=log \
/usr/local/dspace/assetstore/ newserver:/usr/local/dspace/assetstore/

3. Exporting / importing the PostgreSQL database from the old to new server.
4. Applying DB schema database_schema_18-3.sql followed by
database_schema_3-4.sql

5. Running /usr/local/dspace/bin/dspace checker -l -p
6. Starting tomcat on the new server and running:

/usr/local/dspace/bin/dspace index-discovery

Then I'm attempting to run "/usr/local/dspace/bin/dspace stats-util -b"
which is when the error message occurs.

After Step 6, I have also tried:

/usr/local/dspace/bin/dspace stats-util -o
/usr/local/dspace/bin/dspace stat-general
/usr/local/dspace/bin/dspace stat-initial

but am still running into the above error message upon running
“stats-util -b”.

If anyone could assist me with this it would be appreciated.

With Thanks,

Patrick Rynhart



Patrick Rynhart

Aug 26, 2015, 1:16:56 PM
to dspac...@lists.sourceforge.net
Update: I have found this post by helix84 which looks similar (the
context is for OAI but the traceback looks about the same):

http://dspace.2283337.n4.nabble.com/problems-with-OAI-td4671531.html

When I try running the following SQL on our DSpace 1.8 server (cut and
paste from the above message), I get:

# SELECT item.item_id FROM item WHERE NOT EXISTS (SELECT resource_id
FROM handle WHERE handle.resource_id = item.item_id AND
handle.resource_type_id = 2);
item_id
---------
435
503
499
432
646
461
559
2627
2628
3844
5443
5899
(12 rows)

We've gone through these by hand: bitstreams 435, 432 and 5899 return
"Not found on this server", while the remainder appear okay (at least
via the webapp).
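
For anyone unsure what the query above is checking, here is a minimal sketch in Python that runs the same NOT EXISTS statement against a throwaway in-memory SQLite database. The two tables are cut down to only the columns the query touches, and the sample rows, the ORDER BY clause, and the function name items_without_handle are illustrative assumptions, not part of the DSpace schema or tooling.

```python
import sqlite3


def items_without_handle(conn):
    """Return the item_ids that have no handle row with resource_type_id = 2."""
    rows = conn.execute(
        "SELECT item.item_id FROM item WHERE NOT EXISTS ("
        "SELECT resource_id FROM handle "
        "WHERE handle.resource_id = item.item_id "
        "AND handle.resource_type_id = 2) "
        "ORDER BY item.item_id"  # ordering added only to make the demo stable
    ).fetchall()
    return [row[0] for row in rows]


# Toy data: item 1 has an item-type handle, item 2 only has a handle of
# another resource type, and item 3 has no handle at all.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (item_id INTEGER PRIMARY KEY);
CREATE TABLE handle (
    handle_id INTEGER PRIMARY KEY,
    resource_type_id INTEGER,
    resource_id INTEGER
);
INSERT INTO item VALUES (1), (2), (3);
INSERT INTO handle (resource_type_id, resource_id) VALUES (2, 1), (3, 2);
""")

orphans = items_without_handle(conn)
print(orphans)
```

In other words, each of the 12 rows returned on the 1.8 server is an item with no matching row in the handle table for resource type 2.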

It looks like our 1.8 server has a problem that only shows up as an
issue following the migration to 4.1?

What do we need to do to fix up the existing 1.8 install?

Thanks,

Patrick



Peter Dietz

Aug 26, 2015, 1:31:34 PM
to Patrick Rynhart, dspac...@lists.sourceforge.net
Hi Patrick, 

Sorry that nobody got back to you at the time, and thank you for your post. I've just run into this too, while working on a site that started on an older version of DSpace and has been through some upgrades. Today I was attempting to shard a Solr statistics index:

peterdietz:peterDSpace peterdietz$ /dspace/bin/dspace stats-util -s
Moving: 275 into core statistics-2010
Exception: Document is missing mandatory uniqueKey field: uid
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Document is missing mandatory uniqueKey field: uid
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:424)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:180)
at org.dspace.statistics.SolrLogger.shardSolrIndex(SolrLogger.java:1345)
at org.dspace.statistics.util.StatisticsClient.main(StatisticsClient.java:106) 


Document is missing mandatory uniqueKey field: uid

And look for entries that are missing the uid field.
ouch.. 1.7M out of 9M entries are missing UID.
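
One way to get a count like this is a Solr filter query that matches only documents with no value in the uid field. The Python sketch below only builds the request URL and does not contact any server. The base URL http://localhost:8080/solr/statistics and the function name missing_uid_query_url are assumptions for illustration; adjust them to wherever your statistics core actually lives.

```python
from urllib.parse import urlencode


def missing_uid_query_url(base_url, rows=0):
    """Build a Solr select URL matching documents that lack a uid field.

    With rows=0 the response carries only numFound, i.e. the count.
    """
    params = {
        "q": "*:*",             # start from every document in the core
        "fq": "-uid:[* TO *]",  # keep only those with no uid value
        "rows": rows,
        "wt": "json",
    }
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Assumed location of the statistics core; not taken from the thread.
url = missing_uid_query_url("http://localhost:8080/solr/statistics")
print(url)
```

Opening the printed URL in a browser, or fetching it with curl, should report the number of affected records in numFound.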

Here's an example entry that is missing UID (I've just changed the IP/DNS):
      {
        "ip":"8.8.8.8",
        "id":1,
        "type":4,
        "time":"2011-06-03T18:56:05.174Z",
        "epersonid":2,
        "dns":"hidden.example.com.",
        "continent":"NA",
        "countryCode":"US",
        "city":"New York",
        "latitude":40.7619,
        "longitude":-73.9763,
        "isBot":false,
        "userAgent":"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:2.0.1) Gecko/20100101 Firefox/4.0.1"},
And here's a recent hit, which has a UID (changed IP/DNS).
      {
        "ip":"8.8.8.8",
        "referrer":"https://trydspace.longsight.com/",
        "dns":"hidden.example.com.",
        "continent":"NA",
        "countryCode":"US",
        "city":"New York",
        "latitude":40.7619,
        "longitude":-73.9763,
        "isBot":false,
        "userAgent":"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.66 Safari/537.36",
        "id":13,
        "type":4,
        "time":"2013-09-17T15:50:58.067Z",
        "epersonid":29,
        "statistics_type":"view",
        "uid":"6879167c-b90b-41a3-8249-f98d4e66fb86"}

I don't see anything in the Solr documentation on how to simply add a UID to an existing entry. I suppose you would have to search for all records that are missing a UID, export their information to CSV, delete them with a query that matches those records, and then re-add each document with all of its old information; perhaps that adds a UID?
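
A minimal sketch of the "give the old records a UID before re-adding them" step, in Python. It works on plain dictionaries shaped like the JSON entries above and makes no Solr calls: exporting, deleting and re-posting the documents is left out. Generating a random UUID on the client side is an assumption based on the format of the uid in the recent hit, and the function name add_missing_uids is hypothetical. Whether Solr would fill in a uid by itself on re-add depends on the schema and update configuration, which this does not rely on.

```python
import uuid


def add_missing_uids(docs):
    """Return copies of docs, adding a random UUID as uid where none exists."""
    fixed = []
    for doc in docs:
        doc = dict(doc)  # copy so the exported record is left untouched
        if not doc.get("uid"):
            doc["uid"] = str(uuid.uuid4())
        fixed.append(doc)
    return fixed


# Trimmed versions of the two example entries shown above.
old_hit = {"ip": "8.8.8.8", "id": 1, "type": 4,
           "time": "2011-06-03T18:56:05.174Z", "epersonid": 2}
new_hit = {"ip": "8.8.8.8", "id": 13, "type": 4,
           "time": "2013-09-17T15:50:58.067Z", "epersonid": 29,
           "uid": "6879167c-b90b-41a3-8249-f98d4e66fb86"}

fixed = add_missing_uids([old_hit, new_hit])
for doc in fixed:
    print(doc["id"], doc["uid"])
```

Records that already carry a uid pass through unchanged, so the same function could be run over a full export without touching newer hits. Try it on a copy of the index first.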

I'm half tempted to say we'll have to delete these 1.7M records that are missing the UID. They're causing issues in the Solr index. But this will surely leave too large a gap in the data.


________________
Peter Dietz
Longsight
www.longsight.com
pe...@longsight.com
p: 740-599-5005 x809

Terry Brady

Aug 26, 2015, 1:37:23 PM
to Peter Dietz, dspac...@lists.sourceforge.net, Patrick Rynhart
I have encountered this same issue.

I filed a bug based on the issue.


Terry




--
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology