TextExtractor not extracting text correctly

Jul 7, 2016, 3:00:57 PM

to PDFTron PDFNet SDK

I'm using a TextExtractor to select text from a rect.

Unfortunately, the extracted text isn't always correct: it sometimes adds an extra character at the end or cuts off a character at the beginning.

I can see that the rectangle is correct because I highlight it.

What I'm doing:

TextExtractor txtExtractor = new TextExtractor();

txtExtractor.Begin(page, hightlightAnnot.GetRect(), TextExtractor.ProcessingFlags.e_remove_hidden_text);

hightlightAnnot.SetContents(txtExtractor.GetAsText());

string text = hightlightAnnot.GetContents();

Jul 15, 2016, 12:26:38 PM

to PDFTron PDFNet SDK

When you say "sometimes", do you mean that you get different results on subsequent tries with the same document, or that the results differ between documents?

To diagnose the issue further, we need the following information:

1. Input file(s)

2. Generated output

3. Clear description of what you expected to get (include a screenshot if appropriate, as we may not see what you do)
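
One possible way to collect the generated output for item 2 is to print each extracted word together with its bounding box, so it can be compared against the clip rectangle. The following is only a sketch: it assumes the word and line iteration calls used in the PDFNet TextExtract sample (GetFirstLine, GetFirstWord, GetBBox, GetString) and that Rect exposes x1, y1, x2, y2. The class and method names (TextExtractorDiagnostics, DumpWords) are hypothetical.

```csharp
using System;
using pdftron.PDF;

static class TextExtractorDiagnostics
{
    // Hypothetical helper: prints the clip rectangle, then every word the
    // extractor returns for it along with that word's bounding box.
    // Assumes PDFNet has already been initialized by the caller.
    public static void DumpWords(Page page, Rect clip)
    {
        Console.WriteLine("Clip: {0}, {1}, {2}, {3}",
            clip.x1, clip.y1, clip.x2, clip.y2);

        TextExtractor txt = new TextExtractor();
        // Same flag as in the original post.
        txt.Begin(page, clip, TextExtractor.ProcessingFlags.e_remove_hidden_text);

        for (TextExtractor.Line line = txt.GetFirstLine();
             line.IsValid();
             line = line.GetNextLine())
        {
            for (TextExtractor.Word word = line.GetFirstWord();
                 word.IsValid();
                 word = word.GetNextWord())
            {
                Rect box = word.GetBBox();
                Console.WriteLine("'{0}' box: {1}, {2}, {3}, {4}",
                    word.GetString(), box.x1, box.y1, box.x2, box.y2);
            }
        }
    }
}
```

With the variables from the original post, this could be called as `TextExtractorDiagnostics.DumpWords(page, hightlightAnnot.GetRect());`. Comparing each word's box with the clip coordinates may show whether the extra or missing character sits right at the edge of the rectangle.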