[Dspace-tech] XPDF to Thumbnail Preview in DSpace 1.8.2

Osama Alkadi

Aug 26, 2015, 9:45:10 AM
to dspac...@lists.sourceforge.net
Hi all,

We are running DSpace 1.8.2 on Linux and are having issues with the pdftoppm tool when generating thumbnails for some PDFs.

Some properties of the PDFs:

- Encoding software includes: Adobe PDF Library, Acrobat Distiller, Acrobat PDFWriter.
- Size: varies from 1 to 10 MB.

With debug logging enabled, the logs show the following after running filter-media:

INFO  org.dspace.app.mediafilter.XPDF2Thumbnail @ XPDF2Thumbnail: outPrefix: /tmp/prevu1738144616485715914out
ERROR org.dspace.app.mediafilter.XPDF2Thumbnail @ Unable to delete file
ERROR org.dspace.app.mediafilter.XPDF2Thumbnail @ PDF conversion proc failed, exit status=1, file=/tmp/DSfilt2694438157933967840.pdf
--
Full Filter Name: org.dspace.app.mediafilter.HTMLFilter
org.dspace.app.mediafilter.HTMLFilter
Full Filter Name: org.dspace.app.mediafilter.WordFilter
org.dspace.app.mediafilter.WordFilter
Full Filter Name: org.dspace.app.mediafilter.JPEGFilter
org.dspace.app.mediafilter.JPEGFilter
Full Filter Name: org.dspace.app.mediafilter.XPDF2Text
org.dspace.app.mediafilter.XPDF2Text
Full Filter Name: org.dspace.app.mediafilter.BrandedPreviewJPEGFilter
org.dspace.app.mediafilter.BrandedPreviewJPEGFilter
Full Filter Name: org.dspace.app.mediafilter.XPDF2Thumbnail
org.dspace.app.mediafilter.XPDF2Thumbnail
Full Filter Name: org.dspace.app.mediafilter.PowerPointFilter
org.dspace.app.mediafilter.PowerPointFilter
FILTERED: bitstream 38802 (item: 1885/8749) and created 'DevelopmentBulletin-73_2009.pdf.txt'
ERROR filtering, skipping bitstream:

Item Handle: 1885/8749
Bundle Name: ORIGINAL
File Size: 1445348
Checksum: 1a1b0472e9361c4a4a00d30846f3e211 (MD5)
Asset Store: 0
javax.imageio.IIOException: Can't read input file!
javax.imageio.IIOException: Can't read input file!
at javax.imageio.ImageIO.read(ImageIO.java:1275)
at org.dspace.app.mediafilter.XPDF2Thumbnail.getDestinationStream(XPDF2Thumbnail.java:246)
at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:746)
at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:561)
at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:511)
at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:479)
at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:353)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:183)
FILTERED: bitstream 38805 (item: 1885/8749) and created '01whole_Grubb.pdf.txt'
FILTERED: bitstream 38805 (item: 1885/8749) and created '01whole_Grubb.pdf.jpg'
Updating search index:

Strangely, even when I run the pdftoppm tool manually, I get a "Bogus memory allocation size" error. My JAVA_OPTS is set to "-Xmx1024M -Xms128M -XX:PermSize=192M -XX:MaxPermSize=384M".

Also, someone on the mailing list suggested changing a line in XPDF2Thumbnail.java near the line that reports the error. The original line was

File outf = new File(outPrefix+"-000001.ppm");
and the suggested replacement is
File outf = new File(outPrefix+"-001.ppm");

Unfortunately, this has not worked for me. Any help would be appreciated.
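
For anyone trying the file name change above: the two suffixes quoted here suggest that the zero-padding width of the output name can differ between pdftoppm builds (that is an assumption, not something confirmed in this thread), so a hard-coded suffix may miss the file. Below is a minimal sketch that looks for the first-page PPM regardless of padding width. `PpmOutputFinder` and `findFirstPage` are hypothetical names, not part of DSpace, and this is not how XPDF2Thumbnail is actually written.

```java
import java.io.File;
import java.io.FilenameFilter;
import java.util.regex.Pattern;

// Hypothetical helper, not part of DSpace.
class PpmOutputFinder {

    // Returns the first-page PPM written for outPrefix, accepting any
    // zero-padding width (-1.ppm, -001.ppm, -000001.ppm), or null if none exists.
    static File findFirstPage(String outPrefix) {
        File prefixFile = new File(outPrefix).getAbsoluteFile();
        File dir = prefixFile.getParentFile();
        if (dir == null) {
            return null; // prefix has no parent directory to search
        }
        // Quote the base name so characters in the temp prefix are matched literally.
        final Pattern firstPage =
                Pattern.compile(Pattern.quote(prefixFile.getName()) + "-0*1\\.ppm");
        File[] matches = dir.listFiles(new FilenameFilter() {
            public boolean accept(File d, String name) {
                return firstPage.matcher(name).matches();
            }
        });
        return (matches == null || matches.length == 0) ? null : matches[0];
    }

    public static void main(String[] args) {
        // Usage: java PpmOutputFinder /tmp/prevu1738144616485715914out
        File found = findFirstPage(args[0]);
        System.out.println(found == null ? "no first-page PPM found" : found.getPath());
    }
}
```

This only helps when pdftoppm succeeds but writes a differently named file. It does nothing when the tool itself exits with a non-zero status, as in the log above, because no PPM is produced at all.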

Thanks

Osama Alkadi

Aug 26, 2015, 9:45:24 AM
to dspac...@lists.sourceforge.net
Just a follow-up to my previous email: I ran pdftoppm manually using the command below and got the following error:

pdftoppm -q -f 1 -l 1 -r 62 DevelopmentBulletin-73_2009.pdf bleg2
Bogus memory allocation size

Link to the PDF file: http://hdl.handle.net/1885/9207

Has anyone seen this error?
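
To reproduce this outside filter-media while still seeing what a Java caller would see, a small standalone runner can capture both the exit status and the tool's error text. This is only a sketch under assumptions: `PdftoppmProbe`, `Result`, and `run` are hypothetical names, pdftoppm is assumed to be on the PATH, and the flags are simply the ones from the manual command above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.Arrays;
import java.util.List;

// Hypothetical diagnostic helper, not part of DSpace.
class PdftoppmProbe {

    static final class Result {
        final int exitStatus;
        final String output;

        Result(int exitStatus, String output) {
            this.exitStatus = exitStatus;
            this.output = output;
        }
    }

    // Runs an external command and returns its exit status together with
    // everything it printed (stderr is merged into stdout).
    static Result run(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.redirectErrorStream(true);
        Process proc = pb.start();
        StringBuilder out = new StringBuilder();
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(proc.getInputStream()));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                out.append(line).append('\n');
            }
        } finally {
            reader.close();
        }
        return new Result(proc.waitFor(), out.toString());
    }

    public static void main(String[] args) throws Exception {
        // Usage: java PdftoppmProbe input.pdf outputRoot
        Result r = run(Arrays.asList(
                "pdftoppm", "-q", "-f", "1", "-l", "1", "-r", "62", args[0], args[1]));
        System.out.println("exit status=" + r.exitStatus);
        System.out.print(r.output);
    }
}
```

Because pdftoppm runs as a separate native process, a non-zero exit status together with a message such as the one above would point at the tool or the PDF rather than at the JVM.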

Thanks


DSpace-tech mailing list
DSpac...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech

Osama Alkadi

Aug 26, 2015, 9:46:02 AM
to dspac...@lists.sourceforge.net
We upgraded pdftoppm from 3.0 to 3.02 and that fixed the problem.

Thanks
Osama

helix84

Aug 26, 2015, 9:46:02 AM
to Osama Alkadi, dspac...@lists.sourceforge.net
On Fri, Aug 31, 2012 at 4:30 AM, Osama Alkadi <osama....@anu.edu.au> wrote:
> We upgraded pdftoppm from 3.0 to 3.02 and that fixed the problem.

Thanks for reporting back!

Regards,
~~helix84
