Two OAI problems with DSpace 6.2


Evgeni Dimitrov

Oct 19, 2017, 2:04:39 PM
to dspac...@googlegroups.com
I have a freshly installed DSpace 6.2 with 15 items in it.
I have not changed anything in the OAI configuration.
Two of the items have READ access for certain groups only. The rest are accessible to everybody.

Problem 1.
When I run
dspace oai import -c

I get two exceptions in dspace.log (one for each of the restricted-access items, I suppose):

2017-10-19 17:55:22,650 WARN  org.dspace.xoai.util.ItemUtils @ Authorization denied for action READ on BITSTREAM:3d21ebf7-ef3e-4d98-a265-c4a516b1740b by user null
org.dspace.authorize.AuthorizeException: Authorization denied for action READ on BITSTREAM:3d21ebf7-ef3e-4d98-a265-c4a516b1740b by user null

The whole text of the exception is very long.

Problem 2.
After that I try
http://localhost:8080/oai/request?verb=Identify
http://localhost:8080/oai/request?verb=ListSets
http://localhost:8080/oai/request?verb=ListMetadataFormats
http://localhost:8080/oai/request?verb=ListIdentifiers&metadataPrefix=xoai
and it all works.

But for
http://localhost:8080/oai/request?verb=GetRecord&identifier=oai:localhost:nls/3&metadataPrefix=xoai
(nls/3 is an item accessible to everybody)
I get this in the browser:

java.io.IOException: com.lyncode.xoai.dataprovider.exceptions.WritingXmlException: Error trying to output ''
    org.dspace.xoai.services.impl.cache.DSpaceXOAICacheService.store(DSpaceXOAICacheService.java:113)

The whole text of the exception is very long.

My question is - am I missing something in the configuration or is this a known problem with DSpace 6.2?


Evgeni Dimitrov

Oct 21, 2017, 6:59:29 AM
to DSpace Technical Support

It turned out that Problem 1 is easy to fix - with one line added to XOAI.java

But this does not help for Problem 2 - which is that

http://localhost:8080/oai/request?verb=GetRecord&metadataPrefix=xoai&identifier=oai:localhost:nls/3

works fine for 10 items and does not work for 5 items . . .

Evgeni Dimitrov

Oct 23, 2017, 4:20:44 PM
to DSpace Technical Support
The failing GetRecord request is related to metadata with non-ASCII characters. The strange thing is that the logging I added shows that DSpace gets the metadata back from SolrServer.query() with the non-ASCII characters distorted. But when Solr is queried in the browser:

http://localhost:8080/solr/oai/select?q=item.handle:nls/3

then the non-ASCII characters are correct. I can not figure out why . . .




Christian Scheible

Oct 24, 2017, 1:58:19 AM
to dspac...@googlegroups.com
Hi all,

we are currently investigating the same problem.
It's because there are characters in the metadata that are not valid in XML 1.0 (like surrogate pairs).

This blog post explains it and shows a method for getting rid of them:

http://blog.mark-mclaren.info/2007/02/invalid-xml-characters-when-valid-utf8_5873.html
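As a rough illustration of the idea (not the code from the blog post, and not a DSpace patch), a filter could keep only code points allowed by the XML 1.0 "Char" production and drop everything else. The class and method names below are hypothetical, and the sketch assumes Java 8 or later for String.codePoints(). Note that it keeps well-formed surrogate pairs, which decode to valid supplementary characters, and drops only unpaired surrogates and control characters.

```java
public class Xml10Filter {

    // True if the code point is allowed by the XML 1.0 "Char" production:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isValidXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input without code points that XML 1.0 forbids.
    // codePoints() yields an unpaired surrogate as its own code point
    // (0xD800-0xDFFF), so it is filtered out here.
    static String stripInvalidXml10(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints()
             .filter(Xml10Filter::isValidXml10)
             .forEach(out::appendCodePoint);
        return out.toString();
    }

    public static void main(String[] args) {
        // A vertical tab (0x0B) is valid UTF-8 but not valid XML 1.0.
        String dirty = "title" + (char) 0x0B + " text";
        System.out.println(stripInvalidXml10(dirty)); // prints "title text"
    }
}
```

Silently dropping characters changes the stored metadata, so in practice one might prefer to log or replace them instead; that choice is left open here.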

If I have time I am going to make a pull request for this issue.

Hope this helps
Christian

Evgeni Dimitrov

Oct 24, 2017, 1:09:16 PM
to DSpace Technical Support
Thank you Christian,
You may be right. But in my case it goes wrong before it comes to XML.
I am logging the SolrInputDocument before sending it to Solr:

SolrInputDocument(fields: [item.id=ecc9c18e-4e91-4e7a-8e61-eb88472f8f0a, item.public=true, item.handle=nls/3, item.lastmodified=2017-10-16 23:18:01.117, item.submitter=e...@nomail.xx, item.deleted=false, item.collections=col_nls_2, item.communities=com_nls_1, metadata.dc.coverage.spatial=София, България, metadata.dc.creator=Карастоянов, Д.,

I am logging the SolrDocument after Solr returns it:

SolrDocument{item.id=ecc9c18e-4e91-4e7a-8e61-eb88472f8f0a, item.public=true, item.handle=nls/3, item.lastmodified=Mon Oct 16 23:18:01 EEST 2017, item.submitter=e...@nomail.xx, item.deleted=false, item.collections=[col_nls_2], item.communities=[com_nls_1], metadata.dc.coverage.spatial=[?????, ????????], metadata.dc.creator=[???????????, ?.],

I tried querying Solr from my own standalone application - same way as DSpace does - same version of the Solr client - the result is as above.

Finally I am querying Solr from the browser and I am getting a correct response:

<arr name="metadata.dc.coverage.spatial"><str>София, България</str></arr><arr name="metadata.dc.creator"><str>Карастоянов, Д.</str></arr>

Which makes me think that there is a problem with the Solr client: either some configuration is missing or something is wrong in this exact version.

Evgeni Dimitrov

Oct 25, 2017, 6:13:57 PM
to DSpace Technical Support

To close the problem I should say that there is no real issue with the non-ASCII characters and OAI.

But in XOAI there is a conversion between String and byte array that uses the platform's default charset, which is not always UTF-8. One can fix this by starting Java/Tomcat with
  -Dfile.encoding=UTF8
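A minimal sketch of the effect, assuming the default charset on the affected machine was a single-byte one such as ISO-8859-1 (the thread does not say which it was). The class and method names are hypothetical and this is not the XOAI code; it only shows what a String -> byte[] -> String conversion does to Cyrillic text under such a charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Encodes the text to bytes and decodes it again with the same charset.
    // String.getBytes(Charset) replaces unmappable characters with the
    // charset's replacement byte, which is '?' for ISO-8859-1.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "София", written with escapes so the source file encoding does not matter.
        String sofia = "\u0421\u043E\u0444\u0438\u044F";

        // What getBytes() with no argument would use on this JVM.
        System.out.println("default charset: " + Charset.defaultCharset());

        // Stand-in for a non-UTF-8 platform default (an assumption).
        System.out.println(roundTrip(sofia, StandardCharsets.ISO_8859_1)); // prints ?????

        // With UTF-8 the text survives the round trip unchanged.
        System.out.println(roundTrip(sofia, StandardCharsets.UTF_8).equals(sofia)); // prints true
    }
}
```

Passing an explicit charset to every getBytes() and new String() call avoids the dependency on the JVM setting altogether, whereas -Dfile.encoding changes the default for the whole process.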

One more thing I noticed is that in some entries in dspace.log the non-ASCII characters are replaced by ??
To get the real non-ASCII characters one can add this to log4j.properties:
  log4j.appender.A1.encoding=UTF-8




Christian Scheible

Oct 26, 2017, 1:48:33 AM
to dspac...@googlegroups.com
Hi Evgeni,

I just added this item to the DSpace demo site: http://demo.dspace.org/xmlui/handle/10673/32
It breaks the XMLUI and will break OAI as soon as it is reindexed.

So I think our problems might be unrelated but there is a DSpace bug regarding characters that are disallowed in XML 1.0.

Regards
Christian

Christian Scheible

Oct 26, 2017, 8:49:18 AM
to dspac...@googlegroups.com
Here is the broken OAI webapp: http://demo.dspace.org/oai/request?verb=GetRecord&metadataPrefix=oai_dc&identifier=oai:demo.dspace.org:10673/32

I found the reason why this happens in DSpace 6.x but did not happen in DSpace 5.x.
It's Xalan version 2.7.2. Using Xalan 2.7.0 does not lead to this error.