Extracting text from page

Danny Venier

Feb 2, 2015, 11:55:16 AM
to pdfnet-w...@googlegroups.com
With WebViewer, I am looking to grab a snippet of text from a page of the document. I've seen Java examples that use a TextExtractor class to set a begin point and grab a substring of the text. Is there any facility in WebViewer that provides that function?

Danny Venier

Feb 2, 2015, 1:56:34 PM
to pdfnet-w...@googlegroups.com

I figured out a way to get the text (albeit a wasteful one) using the document object.

myDoc.LoadPageText(pageNum, function (pageText) { /* take the substring of pageText here */ });

It grabs all the text on the page and then I can substring as desired.
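
For anyone following along, here is a minimal sketch of that load-then-substring approach. It assumes only the LoadPageText(pageNum, callback) shape shown above; `fakeDoc`, `getTextSnippet`, and the sample page text are hypothetical names made up so the snippet runs outside the viewer, and in a real page you would pass your own document object instead.

```javascript
// Hypothetical stand-in for the WebViewer document object. Only the
// LoadPageText(pageNum, callback) shape comes from the thread; the rest
// is an assumption for illustration.
var fakeDoc = {
  pages: { 1: "Invoice 1042 issued to Acme Corp on 2 Feb 2015." },
  LoadPageText: function (pageNum, callback) {
    callback(this.pages[pageNum] || "");
  }
};

// Load the full text of a page, then pass `length` characters starting at
// character offset `begin` to the callback.
function getTextSnippet(doc, pageNum, begin, length, callback) {
  doc.LoadPageText(pageNum, function (pageText) {
    callback(pageText.substring(begin, begin + length));
  });
}

var snippet = null;
getTextSnippet(fakeDoc, 1, 8, 4, function (text) {
  snippet = text;
});
console.log(snippet); // prints "1042"
```

The offsets here are plain character positions in the returned string, so how they line up with what you see on the page depends on how the text comes back for your document.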

Matt Parizeau

Feb 2, 2015, 7:39:31 PM
to pdfnet-w...@googlegroups.com
Hi Danny,

Using the LoadPageText function would be the way to go. Note that behind the scenes the text for the page will be cached so subsequent requests shouldn't be too wasteful.
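
To illustrate why cached page text keeps repeated requests cheap, here is a rough sketch of the general idea. It is not WebViewer's actual implementation, which isn't shown in this thread; `extractPageText`, `pageTextCache`, and `loadPageTextCached` are hypothetical names used only for this example.

```javascript
// Hypothetical "expensive" extraction step; the counter records how many
// times it actually runs.
var extractionCount = 0;
function extractPageText(pageNum) {
  extractionCount += 1;
  return "Text of page " + pageNum;
}

// Illustrative cache keyed by page number (an assumption, not the
// library's internals).
var pageTextCache = {};
function loadPageTextCached(pageNum, callback) {
  if (!Object.prototype.hasOwnProperty.call(pageTextCache, pageNum)) {
    pageTextCache[pageNum] = extractPageText(pageNum);
  }
  callback(pageTextCache[pageNum]);
}

// Two different snippets from the same page: extraction runs only once.
var results = [];
loadPageTextCached(3, function (text) { results.push(text.substring(0, 4)); });
loadPageTextCached(3, function (text) { results.push(text.substring(8, 12)); });
console.log(results, extractionCount); // results are "Text" and "page"; count is 1
```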

Matt Parizeau
Software Developer
PDFTron Systems Inc.