Can view PDFs in browser (DSpace 7.4), but can't download


Night Librarian

Nov 22, 2022, 2:13:28 PM
to DSpace Technical Support
I am running DSpace 7.4 on Ubuntu 20.04 with Tomcat 9. I can view PDFs in the browser, but downloading doesn't work: I end up with filename.pdf.crdownload files. I don't see any errors in dspace.log, only INFO and WARN entries.

Where should I look first to fix it?

Mohammad S. AlMutairi

Nov 22, 2022, 2:57:25 PM
to DSpace Technical Support
This looks like either a browser issue or a connection issue. Try another browser, or try again in Incognito/InPrivate/Private mode.

Night Librarian

Nov 22, 2022, 10:36:59 PM
to DSpace Technical Support
Thank you for the suggestion. I tried downloading the file in Edge, Chrome and Firefox, both on a Windows PC at home (via VPN) and on the Ubuntu machine that runs DSpace itself. I tried before and after logging in as admin, and I tried InPrivate/Incognito mode. In all cases, the PDF opens in the browser and I can read it in its entirety, but when I click "Download", the transfer reaches 773 KB out of 776 KB and stops. I am left with a filename.pdf.crdownload file and a message that the browser couldn't download the file.
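One way to tell a browser problem from a server or proxy problem is to fetch the file outside the browser and compare the bytes received with the declared Content-Length. Below is a minimal Python sketch, assuming the bitstream is readable without logging in. The function names are illustrative and not part of DSpace.

```python
import http.client
import urllib.request


def count_received(stream, chunk_size=65536):
    """Read a file-like object to the end and return the number of bytes seen."""
    total = 0
    try:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    except http.client.IncompleteRead as exc:
        # The connection closed early; count what arrived before that.
        total += len(exc.partial)
    except ConnectionError:
        # Connection reset mid-transfer; keep the bytes counted so far.
        pass
    return total


def describe(declared, received):
    """Compare the declared Content-Length with the bytes actually received."""
    if declared is None:
        return f"received {received} bytes, no Content-Length declared"
    if received < declared:
        return f"truncated: received {received} of {declared} bytes"
    return f"complete: received {received} of {declared} bytes"


def check_download(url):
    """Fetch url and report whether the body arrived in full."""
    with urllib.request.urlopen(url) as resp:
        header = resp.headers.get("Content-Length")
        declared = int(header) if header is not None else None
        received = count_received(resp)
    return describe(declared, received)
```

Calling check_download() with a bitstream content URL (the /server/api/core/bitstreams/.../content form that appears in dspace.log) should print either a "complete" or a "truncated" line. A truncated result outside the browser would point away from the browser and toward the backend, the proxy, or the stored file.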

In Safari and Chrome on an iPhone, I can view the PDF, but when I click Download I see "err_connection_closed". That probably doesn't explain why the download fails even on the server itself. dspace.log has many lines, including these:

INFO  3834015a-5783-403d-ba1b-085b68787b0e 64a217ee-6830-48b5-8558-d6140a123bc9 org.dspace.usage.LoggerUsageEventListener @ ad...@mydomain.com::view_item:handle=12345/1916

INFO  unknown 778fd02b-07ff-4e37-a846-561a88054545 org.dspace.app.rest.utils.DSpaceAPIRequestLoggingFilter @ Before request [GET /server/opensearch/search/search] originated from unknown

INFO  08de3499-a27b-474b-aa4c-07fbf0f066cf 5d55a1b2-1984-4213-a061-39e7a6e4cff2 org.dspace.app.rest.utils.DSpaceAPIRequestLoggingFilter @ Before request [GET /server/api/system/scripts/metadata-import] originated from /

WARN  08de3499-a27b-474b-aa4c-07fbf0f066cf 5d55a1b2-1984-4213-a061-39e7a6e4cff2 org.dspace.app.rest.exception.DSpaceApiExceptionControllerAdvice @ Authentication is required (status:401 exception: Access is denied at: org.springframework.security.access.vote.AffirmativeBased.decide(AffirmativeBased.java:73))

INFO  08de3499-a27b-474b-aa4c-07fbf0f066cf a300fb4f-141b-42b9-a403-1352ba5844a1 org.dspace.app.rest.utils.DSpaceAPIRequestLoggingFilter @ Before request [GET /server/api/core/items/6a40502a-a91b-4154-8f5f-19fd4ed69288] originated from /

INFO  08de3499-a27b-474b-aa4c-07fbf0f066cf 98bdb7b6-51ff-4b43-91dc-d36d1a40d668 org.dspace.app.rest.utils.DSpaceAPIRequestLoggingFilter @ Before request [GET /server/api/authz/authorizations/search/object] originated from /items/6a40502a-a91b-4154-8f5f-19fd4ed69288/full

WARN  unknown unknown org.dspace.app.rest.security.jwt.JWTTokenHandler @ XXX.XXX.XXX.XXX tried to use an expired or non-valid token

INFO  unknown 21ea0a82-e9a2-4235-89a9-2043f6411e07 org.dspace.app.rest.utils.DSpaceAPIRequestLoggingFilter @ Before request [GET /server/api/core/bitstreams/137aabe1-6fb9-49d9-8fe6-e3ff81332964/content] originated from https://mydomain.com/server/api/core/bitstreams/137aabe1-6fb9-49d9-8fe6-e3ff81332964/content?authentication-token=eyJhbGciOiJIUzI1NiJ9.eyJlaWQiOiJiZWMwYWNmMS04NDFjLTRiNmQtYmM5Yi02OTQ0OWU2OWRjMWIiLCJzZyI6W10sImF1dGhlbnRpY2F0aW9uTWV0aG9kIjoicGFzc3dvcmQiLCJleHAiOjE2NjkxNzA2NDB9.0jKqlv027PbO4QTgGWx06swXby_oKee-qsIRsUmlPRM


Finally, the browser's console shows these two messages:

The response for 'https://mydomain.com/server/api/discover/search?configuration=default' has the self link 'https://ec.msvu.ca/server/api/discover/search'. These don't match. This could mean there's an issue with the REST endpoint main.3bafce0befaeaf6e.js:1:181849

The response for 'https://mydomain.com/server/api/core/items/e56a5422-b32b-4000-95dd-86467ef21c35/bundles?size=9999' has the self link 'https://ec.msvu.ca/server/api/core/items/e56a5422-b32b-4000-…es?embed=primaryBitstream&embed=bitstreams/format&size=1000'. These don't match. This could mean there's an issue with the REST endpoint main.3bafce0befaeaf6e.js:1:181849

Can any of this be relevant?

Night Librarian

Nov 22, 2022, 11:12:35 PM
to DSpace Technical Support
It turns out the problem is not limited to PDFs. All accdb, docx, mp3, pptx and xlsx files also fail to download: the download progresses to nearly 100%, and then I get an error. So none of our items is downloadable!

At the same time, css, html and xml files (we had a couple of these in one submission) download without problems.


Night Librarian

Nov 30, 2022, 12:23:38 AM
to DSpace Technical Support
A couple of things I noticed in the error logs look strange:

1. error reading status line from remote server localhost:4000
Not sure why localhost is called remote server.

2. error reading from remote server returned by /xmlui/handle/12345/1169/browse
I have /xmlui/ in many lines in my logs.  It's like it got carried over during my migration from 4.x to 7.x.


apache2 error.log:
[Wed Nov 30 00:00:29.299436 2022] [mpm_event:notice] [pid 42668:tid 140076046756928] AH00489: Apache/2.4.41 (Ubuntu) OpenSSL/1.1.1f mod_wsgi/4.6.8 Python/3.8 configured -- resuming normal operations
[Wed Nov 30 00:00:29.299542 2022] [core:notice] [pid 42668:tid 140076046756928] AH00094: Command line: '/usr/sbin/apache2'
[Wed Nov 30 00:40:24.120247 2022] [proxy_http:error] [pid 66062:tid 140075878815488] (104)Connection reset by peer: [client [ipaddress]:58455] AH01102: error reading status line from remote server localhost:4000, referer: https://mydomain.com
[Wed Nov 30 00:40:36.111457 2022] [proxy:error] [pid 66061:tid 140076013307648] [client [ipaddress]:64193] AH00898: Error reading from remote server returned by /items/11e6a424-55bc-4d36-a9b7-ec4ab0985eb9
[Wed Nov 30 00:42:41.476436 2022] [proxy:error] [pid 66061:tid 140075778103040] [client [ipaddress]:31578] AH00898: Error reading from remote server returned by /xmlui/handle/12345/1169/browse
[Wed Nov 30 00:50:12.311975 2022] [proxy:error] [pid 66062:tid 140075786495744] [client [ipaddress]:61442]  AH00898: Error reading from remote server returned by /xmlui/handle/12345/95/search-filter

dspace.log:
INFO  aeda8ec0-3510-4e45-876d-d8c1f7854c55 db1d80cb-b03a-4711-b5ff-7ba3b806ccef org.dspace.app.rest.utils.DSpaceAPIRequestLoggingFilter @ Before request [GET /server/api/authz/authorizations/search/object] originated from /xmlui/search-filter?field=subject&filter_0=2016&filter_1=non-profit&filter_relational_operator_0=equals&filter_relational_operator_1=equals&filtertype_0=dateIssued&filtertype_1=subject&starts_with=n

I suppose I need to fix these before moving forward?

Night Librarian

Nov 30, 2022, 10:59:22 AM
to DSpace Technical Support
Could the fact that many files in my /dspace/ and its subdirectories are "root tomcat" owned be related to my downloading problem?  And all files in /dspace-angular/ and its subdirectories are "dspace dspace" owned.  Shall I "chown -R" all files in /dspace/ and /dspace-angular/ to "tomcat tomcat"?

Tim Donohue

Nov 30, 2022, 4:02:40 PM
to DSpace Technical Support
Hi,

I have to admit, I don't have a lot of ideas here.  It doesn't make sense to me that some Bitstreams (Files) would be downloadable while others are not.   Usually, a file permissions issue (in the local directories) would result in *all* files throwing an IOException or similar.   

But it is worth noting that, for the DSpace backend, the "/dspace/" directory structure should be readable and writable by the user that Tomcat is running as.  So, if Tomcat on your system is running as "tomcat:tomcat", then you should change the ownership of "/dspace/" (and all subdirectories) to that same user.
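Before changing anything, it can help to list what is actually owned by someone else. Below is a minimal Python sketch that walks a directory and reports paths not owned by a given user ID. The function name is illustrative, and it checks ownership only, not read/write permission bits.

```python
import os


def find_not_owned_by(root, uid):
    """Return paths under root (including root) whose owner is not uid."""
    mismatched = []
    if os.lstat(root).st_uid != uid:
        mismatched.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat inspects symlinks themselves rather than their targets.
            if os.lstat(path).st_uid != uid:
                mismatched.append(path)
    return mismatched
```

On a Unix system, the user ID for an account named "tomcat" (assumed here; use whichever account Tomcat runs as) can be looked up with pwd.getpwnam("tomcat").pw_uid from the standard library and passed in as uid, with "/dspace" as root.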

However, the frontend / UI (/dspace-angular/) has nothing to do with Tomcat, so it likely does not need any permissions change.

If you still cannot figure this out, I'd recommend looking closer at which Bitstreams can / cannot be downloaded.  It seems very odd that this would only impact some Bitstreams... so that makes me wonder what else might be different between the Bitstreams that can be downloaded & those that cannot.   While it could relate to the file format, that would usually imply an issue with your *browser* (e.g. your browser is having issues understanding that file format).  I've never heard of an issue where DSpace fails to download only specific file formats. This is because the DSpace download process doesn't include any file format specific code... all files are sent to your browser in the same way & your browser decides how to deal with the file (sometimes it might download it and other times it might open it within the browser).

You also might look at our troubleshooting guide to see if you can find any other errors in the UI side.  The details you've provided above don't seem to relate to this problem, unless I'm overlooking something. https://wiki.lyrasis.org/display/DSPACE/Troubleshoot+an+error

Tim

Night Librarian

Nov 30, 2022, 5:52:42 PM
to DSpace Technical Support
Thank you.  I chowned /dspace/ to tomcat:tomcat.  Then I revisited the issue and noticed that only the last 4 PDFs we updated can be downloaded properly. I migrated from 4.x to 7.2 (then 7.4) in the summer, but we did not add new materials for a few months.  Then, a couple of weeks ago (after the migration), we loaded up a few PDFs, and those download fine.  There are two more that give me a whitelabel error:

Whitelabel Error Page
This application has no explicit mapping for /error, so you are seeing this as a fallback.

Wed Nov 30 18:35:05 AST 2022
There was an unexpected error (type=Internal Server Error, status=500).
An internal read or write operation failed
java.io.IOException: java.io.FileNotFoundException: /dspace/assetstore/25/26/12/25261257736478383774210749798254768435 (No such file or directory)
    at org.dspace.storage.bitstore.DSBitStoreService.get(DSBitStoreService.java:78)
    at org.dspace.storage.bitstore.BitstreamStorageServiceImpl.retrieve(BitstreamStorageServiceImpl.java:221)
    at org.dspace.content.BitstreamServiceImpl.retrieve(BitstreamServiceImpl.java:300)
    at org.dspace.app.rest.utils.BitstreamResource.getInputStream(BitstreamResource.java:98)
    at org.springframework.http.converter.ResourceHttpMessageConverter.writeContent(ResourceHttpMessageConverter.java:137)
    at org.springframework.http.converter.ResourceHttpMessageConverter.writeInternal(ResourceHttpMessageConverter.java:129)
    at org.springframework.http.converter.ResourceHttpMessageConverter.writeInternal(ResourceHttpMessageConverter.java:45)
    at org.springframework.http.converter.AbstractHttpMessageConverter.write(AbstractHttpMessageConverter.java:227)
    at org.springframework.web.servlet.mvc.method.annotation.AbstractMessageConverterMethodProcessor.writeWithMessageConverters(AbstractMessageConverterMethodProcessor.java:293)
    at org.springframework.web.servlet.mvc.method.annotation.HttpEntityMethodProcessor.handleReturnValue(HttpEntityMethodProcessor.java:219)
    at org.springframework.web.method.support.HandlerMethodReturnValueHandlerComposite.handleReturnValue(HandlerMethodReturnValueHandlerComposite.java:78)
    at org.springframework.hateoas.server.mvc.RepresentationModelProcessorHandlerMethodReturnValueHandler.handleReturnValue(RepresentationModelProcessorHandlerMethodReturnValueHandler.java:108)
    at org.springframework.web.method.support.HandlerMethodReturnValueHandlerComposite.handleReturnValue(HandlerMethodReturnValueHandlerComposite.java:78)
    at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:135)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:895)
    at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:808)
    at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
<etc>

None of the other files download.  I tried a bunch from different collections.  MP3s and the like just don't download, while PDFs can be viewed in the browser, but when I start a download it runs for a few seconds and then the browser says that it can't download the file.

Moreover, some PDFs seem corrupted.  Most text is there, but some text is missing.  The browser sometimes complains that it can't load embedded fonts from this document.  Sometimes images are blurred or distorted.  Weird.

I have a backup of assetstore from a year ago and all these files are intact.  It is almost as if I corrupted them in the process of migration.

I have a feeling that I won't be able to "uncorrupt" them, so I am ready to re-upload them manually (we only have about 1600 items and most are still intact).  But perhaps, before I do that, something could be done so that the intact files start downloading again, and then I'd look for the "bad" files and re-upload them?

Tim Donohue

Dec 2, 2022, 12:48:04 PM
to DSpace Technical Support
Hi,

A "FileNotFoundException" means that the file is completely missing from your "assetstore" folder.  So, in your error above, DSpace thinks one of the bitstreams should be in this location "/dspace/assetstore/25/26/12/25261257736478383774210749798254768435", but it cannot find that bitstream.  (Those assetstore directories and file names will always appear as random looking numbers -- that's how DSpace stores files internally)
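The path in that error shows the layout: three directory levels, each named after the next two digits of the file's internal ID, with the full ID as the file name. Below is a rough Python sketch of that mapping, plus a check for which IDs have no file on disk. The function names are illustrative. The list of internal IDs is assumed to come from your database (I believe it is the internal_id column of the bitstream table, but please verify against your own schema).

```python
import os


def assetstore_path(assetstore_dir, internal_id, levels=3, digits_per_level=2):
    """Build the expected on-disk path for a bitstream's internal ID.

    Assumes the layout seen in the error above: three directory levels,
    each named after the next two characters of the ID.
    """
    parts = [
        internal_id[i * digits_per_level:(i + 1) * digits_per_level]
        for i in range(levels)
    ]
    return os.path.join(assetstore_dir, *parts, internal_id)


def missing_bitstreams(assetstore_dir, internal_ids):
    """Return the internal IDs whose file is absent from the assetstore."""
    return [
        internal_id
        for internal_id in internal_ids
        if not os.path.isfile(assetstore_path(assetstore_dir, internal_id))
    ]
```

For the ID in the error above, assetstore_path("/dspace/assetstore", "25261257736478383774210749798254768435") gives back exactly the path DSpace complained about.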

Generally, this is a sign that your migration wasn't successful.  Either you missed some files when you copied the "assetstore" folder over from 4.x to 7.x, or maybe the copy somehow failed (perhaps there was an internet connection issue or similar which caused some files to not copy or get corrupted). 

Unfortunately, my best advice here would be to consider recopying the entire assetstore from 4.x to 7.x... or, as you noted, you could re-upload the files manually if there are not many.

Tim

Night Librarian

Dec 6, 2022, 7:24:56 PM
to DSpace Technical Support
A million thanks, Tim!

I did the following:

- backed up my post-migration assetstore,
- replaced it by the pre-migration assetstore copy,
- started Beyond Compare as root ("sudo QT_GRAPHICSSYSTEM=native bcompare") to compare the two directories,
- batch copied the 100 or so found (sub)directories with files that were uploaded after the migration and were intact. 

Now I can open and download intact versions of various files and so far haven't found any problems!

Another crisis averted!  Thanks again!
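For anyone without Beyond Compare, the same comparison can be roughly scripted. Below is a minimal Python sketch, not what I actually ran, that walks two assetstore copies and reports files present on only one side and files whose contents differ. The function names are illustrative, and hashing every file will be slow on a large assetstore.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def relative_files(root):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def compare_assetstores(backup_dir, current_dir):
    """Classify files as only in backup, only in current, or differing."""
    backup = relative_files(backup_dir)
    current = relative_files(current_dir)
    differing = sorted(
        rel for rel in backup & current
        if file_digest(os.path.join(backup_dir, rel))
        != file_digest(os.path.join(current_dir, rel))
    )
    return {
        "only_in_backup": sorted(backup - current),
        "only_in_current": sorted(current - backup),
        "differing": differing,
    }
```

With the pre-migration copy as backup_dir and the post-migration copy as current_dir, "only_in_current" would list the files uploaded after the migration, and "differing" would list candidates for files corrupted during the copy.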
