For this particular item, which was added on March 9th 2005, filter-media is throwing an error.
https://ritdml.rit.edu/dspace/handle/1850/431
It just looks like a normal PDF to me.
SKIPPED: bitstream 1113 because 'USHumanResources_2.pdf.txt' already exists
ERROR filtering, skipping bitstream #1123 java.io.IOException: You do not have permission to extract text
java.io.IOException: You do not have permission to extract text
at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:140)
at org.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:99)
at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:108)
at org.dspace.app.mediafilter.MediaFilter.processBitstream(MediaFilter.java:157)
at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:244)
at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:207)
at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:184)
at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:155)
Creating search index:
java.lang.Throwable: Warning: You did not close the PDF Document
at org.pdfbox.cos.COSDocument.finalize(COSDocument.java:384)
at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:83)
at java.lang.ref.Finalizer.access$100(Finalizer.java:14)
at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:160)
Any insight would be a great help. Because of this item, the indexing step that follows filtering is not working, as the log shows.
Thanks in advance.
- Naval
-
Thanks for all your help!
The PDF that was submitted had been locked. It has now been unlocked, and filter-media is working fine.
- Naval
From: George Kozak [mailto:gs...@cornell.edu]
Sent: Thursday, March 10, 2005 3:30 PM
To: Navalkishore H Sarda
Subject: Re: [Dspace-tech] Filter media error
Naval:
I've seen this same error for PDFs that were uploaded to DSpace after the
user had set them up with password protection so that no one could
change them.
***************************
George Kozak
Digital Library Specialist
Library Systems
501 Olin Library
Cornell University
607-255-8924
***************************
gs...@cornell.edu
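A protected PDF carries an /Encrypt entry in its trailer, so a locked file can often be spotted before filter-media is run. The following is only a rough sketch in plain Java, not something DSpace or PDFBox provides: the class name PdfEncryptCheck and the method looksEncrypted are made up for illustration, and the byte scan is a heuristic that can misfire if the string "/Encrypt" happens to appear elsewhere in the file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of DSpace or PDFBox.
class PdfEncryptCheck {
    private static final byte[] KEY = "/Encrypt".getBytes(StandardCharsets.US_ASCII);

    // Heuristic: report true if the raw bytes contain the name "/Encrypt",
    // which a password-protected or permission-restricted PDF has in its trailer.
    static boolean looksEncrypted(byte[] pdf) {
        outer:
        for (int i = 0; i + KEY.length <= pdf.length; i++) {
            for (int j = 0; j < KEY.length; j++) {
                if (pdf[i + j] != KEY[j]) {
                    continue outer;
                }
            }
            return true;
        }
        return false;
    }

    // Usage: java PdfEncryptCheck file1.pdf file2.pdf ...
    public static void main(String[] args) throws IOException {
        for (String name : args) {
            byte[] data = Files.readAllBytes(Paths.get(name));
            String verdict = looksEncrypted(data) ? "has an /Encrypt entry" : "no /Encrypt entry found";
            System.out.println(name + ": " + verdict);
        }
    }
}
```

A file flagged this way is a candidate for the "You do not have permission to extract text" error; as in Naval's case, the fix is to have the submitter supply an unlocked copy.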
Hi all
Has anyone seen this one before when running filter-media?
java.io.IOException: Invalid header signature; read 7015536635646467195, expected -2226271756974174256
at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:125)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:120)
at org.textmining.text.extraction.WordExtractor.extractText(WordExtractor.java:32)
at org.dspace.app.mediafilter.WordFilter.getDestinationStream(WordFilter.java:97)
at org.dspace.app.mediafilter.MediaFilter.processBitstream(MediaFilter.java:162)
at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:287)
at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:250)
at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:224)
at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:195)
Cheers
Gary
Gary Browne
Development Programmer
Library IT Services
University of Sydney
Australia
ph: 61-2-9351 5946
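The two numbers in the "Invalid header signature" message are the first eight bytes of the file, read as a little-endian long and printed in decimal. As a small sketch in plain Java (no POI needed; the class and method names are illustrative only), they can be turned back into the bytes that were actually in the file:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for decoding the values in the exception message.
class HeaderSignature {
    // The "expected" value from the message: the OLE2 magic number as a signed long.
    static final long OLE2_SIGNATURE = -2226271756974174256L;

    // Recover the 8 file bytes that were combined into one little-endian long.
    static byte[] toFileBytes(long signature) {
        return ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN).putLong(signature).array();
    }

    // Render those bytes as space-separated hex, in file order.
    static String toHex(long signature) {
        StringBuilder sb = new StringBuilder();
        for (byte b : toFileBytes(signature)) {
            sb.append(String.format("%02X ", b & 0xFF));
        }
        return sb.toString().trim();
    }

    // Render those bytes as ASCII text, in file order.
    static String toAscii(long signature) {
        return new String(toFileBytes(signature), StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        long read = 7015536635646467195L; // the "read" value from the message
        System.out.println("expected: " + toHex(OLE2_SIGNATURE));
        System.out.println("read:     " + toHex(read) + "  \"" + toAscii(read) + "\"");
    }
}
```

The expected value decodes to D0 CF 11 E0 A1 B1 1A E1, the signature of an OLE2 (binary Word) file. The value that was read decodes to 7B 5C 72 74 66 31 5C 61, which is the text {\rtf1\a. That suggests the bitstream is really an RTF file that was registered with the Word format, so the Word filter cannot parse it.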