Extracting text from page

Danny Venier

Feb 2, 2015, 11:55:16 AM

to pdfnet-w...@googlegroups.com

With WebViewer, I am looking to grab a snippet of text from a page of the document. I've seen Java examples with a TextExtractor class that can set a begin point and grab a substring of text. Is there an equivalent facility in WebViewer?

Danny Venier

Feb 2, 2015, 1:56:34 PM

to pdfnet-w...@googlegroups.com

Update:

I figured out a way to get the text (albeit a wasteful one) using the document object:

    myDoc.LoadPageText(pageNum, function (pageText) {
        ...
    });

It grabs all the text on the page and then I can substring as desired.
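
For anyone wanting the substring step spelled out, here is a minimal sketch. The helper name `extractSnippet` and its parameters are hypothetical and not part of WebViewer; it assumes only that `LoadPageText(pageNum, callback)` hands the full page text to the callback as a string, as in the call above. Whether `pageNum` is zero-based or one-based is not stated in this thread, so check that against your WebViewer version.

```javascript
// Hypothetical helper, not a WebViewer API.
// Assumes doc.LoadPageText(pageNum, callback) passes the whole page's
// text to the callback as a plain string.
function extractSnippet(doc, pageNum, marker, maxLength, onDone) {
  doc.LoadPageText(pageNum, function (pageText) {
    // Find where the snippet should begin.
    var start = pageText.indexOf(marker);
    if (start === -1) {
      onDone(null); // marker does not occur on this page
      return;
    }
    // Take at most maxLength characters starting at the marker.
    onDone(pageText.substring(start, start + maxLength));
  });
}
```

Usage would look something like `extractSnippet(myDoc, pageNum, 'Invoice', 40, function (snippet) { ... });`, where `snippet` is `null` if the marker text is not on the page.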

Matt Parizeau

Feb 2, 2015, 7:39:31 PM

to pdfnet-w...@googlegroups.com

Hi Danny,

Using the LoadPageText function would be the way to go. Note that, behind the scenes, the text for the page is cached, so subsequent requests for the same page shouldn't be too wasteful.

Matt Parizeau

Software Developer

PDFTron Systems Inc.
