
non-searchable PDF, convert to TIFF, ocr back to searchable PDF


te...@intex.com

Feb 3, 2010, 1:33:47 PM
I have a non-searchable PDF file (no text encoding is provided) from
which I'd like to extract the text. Can I do the following?

1. Convert the non-searchable PDF file to a TIFF file.
2. Use OCR to convert the TIFF file to a searchable PDF.

If so, how effective is this method and what program(s) would you
recommend for each step? There seem to be many that will convert
from PDF to TIFF. I would think that Adobe Acrobat would be able to
do this easily.

Thanks,

Ted

Lutrin

Feb 3, 2010, 2:46:38 PM
On Wed, 03 Feb 2010 10:33:47 -0800, te...@intex.com wrote:

> convert the non-searchable PDF file to a TIFF file
> use OCR to convert the TIFF file to a searchable PDF

[...]
Actually, you can run OCR on the PDF directly with

*Abbyy Finereader 8*
- http://www.abbyy.com/

If you want to use only free software, you can do the PDF-to-TIFF step with

*ghostscript*
- http://mirror.cs.wisc.edu/pub/mirrors/ghost/GPL/current/

with its graphical frontend

*Gsview*
- http://pages.cs.wisc.edu/~ghost/gsview/index.htm
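
For the free-software route, the two steps asked about (PDF to TIFF, then OCR back to a searchable PDF) could be scripted roughly as below. This is only a sketch under assumptions: Ghostscript is on the PATH as `gs` (the Windows console binary is usually named `gswin32c` or `gswin64c`), and the OCR step uses Tesseract, which nobody in this thread mentioned and which needs a version with PDF output support. The function names and file names are made up for illustration.

```python
import subprocess


def ghostscript_tiff_command(pdf_path, tiff_path, dpi=300):
    """Build a Ghostscript command that renders a PDF to a multi-page TIFF."""
    return [
        "gs",                        # assumption: Ghostscript is on PATH under this name
        "-dNOPAUSE", "-dBATCH",      # run non-interactively and exit when finished
        "-sDEVICE=tiffg4",           # black-and-white TIFF, CCITT Group 4 compression
        f"-r{dpi}",                  # rendering resolution in dpi
        f"-sOutputFile={tiff_path}",
        str(pdf_path),
    ]


def tesseract_pdf_command(tiff_path, output_base, language="eng"):
    """Build a Tesseract command that writes <output_base>.pdf with a text layer."""
    # Assumption: Tesseract is installed; it is not one of the tools named in the thread.
    return ["tesseract", str(tiff_path), str(output_base), "-l", language, "pdf"]


def pdf_to_searchable_pdf(pdf_path, output_base, dpi=300):
    """Run both steps in order; raises CalledProcessError if either tool fails."""
    tiff_path = f"{output_base}.tif"
    subprocess.run(ghostscript_tiff_command(pdf_path, tiff_path, dpi), check=True)
    subprocess.run(tesseract_pdf_command(tiff_path, output_base), check=True)
    return f"{output_base}.pdf"
```

Calling `pdf_to_searchable_pdf("scan.pdf", "scan_ocr")` would then leave `scan_ocr.tif` and `scan_ocr.pdf` next to the script. 300 dpi is a common starting point for OCR; how well the text is recognized depends on the quality of the original pages.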
--
Puppy Linux wiki: http://puppylover.netsons.org/dokupuppy
Puppy Linux Forum: http://puppylinux.ilbello.com
Windows me genuit, Ubuntu rapuere / tenet nunc Puppy Linux...

Bernd Alheit

Feb 4, 2010, 4:58:41 AM
te...@intex.com wrote:

> I would think that Adobe Acrobat would be able to do this easily.

Yes.
