Checking if Searchable or Image Only PDF


Robert Williams

May 10, 2016, 7:34:58 AM
to tesseract-ocr
Hi

Is it possible to check in code whether a PDF is already "searchable"?

We get documents from a third party and want to search them for keywords, but we don't want to run OCR over files that are already searchable.

Thanks
Rob


Tom Morris

May 10, 2016, 11:58:13 AM
to tesseract-ocr

On Tuesday, May 10, 2016 at 7:34:58 AM UTC-4, Robert Williams wrote:

Is it possible to check in code whether a PDF is already "searchable"?

We get documents from a third party and want to search them for keywords, but we don't want to run OCR over files that are already searchable.

Sure, but it doesn't have anything to do with OCR. If you can't figure it out from the documentation for whatever PDF toolkit you're using, you should ask in their support forum.

Tom 

Gunasekaran Velu

May 11, 2016, 2:33:31 AM
to tesseract-ocr
Hi Rob

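Using iTextSharp, you can check whether a PDF is already searchable, for example by extracting the text of each page (PdfTextExtractor.GetTextFromPage) and seeing whether anything comes back.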
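
To illustrate the idea without tying it to a particular toolkit, here is a rough Python sketch that uses only the standard library. It is a heuristic, not a real PDF parser: the function names (iter_stream_bodies, looks_searchable) are made up for this example, it only understands uncompressed and Flate-compressed streams, and it simply looks for text-showing operators (Tj / TJ) inside BT ... ET text blocks. It can be fooled, for instance by binary image data that happens to contain those byte sequences, or by a scanned page that carries only a small stamped header, so a proper PDF library should be more reliable.

```python
import re
import zlib

# Raw bytes between the "stream" and "endstream" keywords of a PDF object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)
# A BT ... ET text block inside a content stream.
TEXT_BLOCK_RE = re.compile(rb"BT\b(.*?)\bET", re.DOTALL)
# A Tj or TJ operator preceded by a string or array operand.
# (The rarer ' and " text operators are ignored in this sketch.)
TEXT_OP_RE = re.compile(rb"[)\]>]\s*T[jJ]\b")


def iter_stream_bodies(pdf_bytes):
    """Yield each stream body, Flate-decompressed when possible."""
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            # decompressobj tolerates the trailing newline before "endstream".
            yield zlib.decompressobj().decompress(raw)
        except zlib.error:
            # Not Flate data (plain text stream, JPEG, other filter): use as-is.
            yield raw


def looks_searchable(pdf_bytes):
    """Return True if any stream appears to draw text (heuristic only)."""
    for body in iter_stream_bodies(pdf_bytes):
        for block in TEXT_BLOCK_RE.findall(body):
            if TEXT_OP_RE.search(block):
                return True
    return False
```

To try it on a file, read the file in binary mode and pass the bytes in, e.g. looks_searchable(open(path, "rb").read()), where path is whatever document you received. Files for which it returns False are the candidates to send through OCR. Note that a PDF with an invisible OCR text layer still contains these text operators, so it would count as searchable here.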


Regards
Guna