PDFBox errors

Bridger Dyson-Smith
Apr 5, 2012, 1:18:49 PM
to xtf-...@googlegroups.com, XTF Users List
Hi all -- 

I've come across a possible bug in Apache PDFBox (pdfbox.jar). Indexing certain PDFs produces the errors below [1]. I tried downloading the most recent version of pdfbox.jar, but the errors persist. They occur on the same PDFs with both jars, so the cause is probably the way those PDFs are being created. I'm trying to rule out generation-related errors, but has anyone else had this problem? If so, what did you do to fix it? It doesn't seem to affect the searchability of the PDFs, so maybe it's a non-issue.
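To help narrow down which PDFs are affected, here is a rough sketch (not part of XTF or PDFBox) that tallies warnings per file from saved textIndexer output. It assumes only what is visible in the log under [1]: each file is announced with an `Indexing [path]` marker, and each problem is reported on a line starting with `WARNING:`. The class name `WarningTally` and the idea of saving the output to a file first are my own assumptions.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: counts WARNING lines per indexed file in textIndexer output.
class WarningTally {
    // Matches the "Indexing [pdf/275/275.pdf]" marker seen in the log.
    private static final Pattern INDEXING = Pattern.compile("Indexing \\[([^\\]]+)\\]");

    /** Returns, in log order, each indexed path and the number of WARNING lines that followed it. */
    static Map<String, Integer> tally(List<String> logLines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String current = null;
        for (String line : logLines) {
            Matcher m = INDEXING.matcher(line);
            if (m.find()) {
                // A new file starts here; register it even if it never warns.
                current = m.group(1);
                counts.putIfAbsent(current, 0);
            }
            if (current != null && line.startsWith("WARNING:")) {
                counts.merge(current, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a file holding the captured textIndexer output.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        tally(lines).forEach((pdf, n) -> System.out.println(n + "\t" + pdf));
    }
}
```

Files that show a count of 0 indexed cleanly, so comparing how those were generated against the ones with warnings might show whether the problem is generation-related.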

Thanks for any insights you may have.
Cheers,
Bridger

[1] errors from textIndexer:

(25%)  Indexing [pdf/275/275.pdf] ... Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
Apr 5, 2012 10:41:01 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
WARNING: java.io.IOException: Error: expected hex character and not  :32
java.io.IOException: Error: expected hex character and not  :32
at org.apache.fontbox.cmap.CMapParser.parseNextToken(CMapParser.java:316)
at org.apache.fontbox.cmap.CMapParser.parse(CMapParser.java:138)
at org.apache.pdfbox.pdmodel.font.PDFont.parseCmap(PDFont.java:549)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:383)
at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:372)
at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45)
at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:552)
at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:248)
at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:207)
at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:367)
at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:291)
at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:247)
at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:180)
at org.cdlib.xtf.textIndexer.PDFToString.convert(PDFToString.java:127)
at org.cdlib.xtf.textIndexer.PDFIndexSource.filterInput(PDFIndexSource.java:71)
at org.cdlib.xtf.textIndexer.XMLIndexSource$1.xmlSource(XMLIndexSource.java:173)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.parseText(XMLTextProcessor.java:1295)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processText(XMLTextProcessor.java:1196)
at org.cdlib.xtf.textIndexer.XMLTextProcessor.processQueuedTexts(XMLTextProcessor.java:1077)
at org.cdlib.xtf.textIndexer.SrcTreeProcessor.close(SrcTreeProcessor.java:168)
at org.cdlib.xtf.textIndexer.TextIndexer.doIndexing(TextIndexer.java:515)
at org.cdlib.xtf.textIndexer.TextIndexer.main(TextIndexer.java:339)
[... the same WARNING and stack trace repeat ten more times, each stamped Apr 5, 2012 10:41:01 AM ...]
Done.
      (30%)  Indexing [pdf/276/276.pdf] ... Done.



