Oh cool, I haven't actually used multi-page TIFFs before. It's nice
that Tesseract handles them well, straight from ghostscript.
Yes, for now I suppose you'll just have to write a little script
to wrap the ghostscript and tesseract steps.
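
Something along these lines might do as the wrapper. It is only a sketch in Python: the function names are made up for illustration, and it assumes gs and tesseract are on the PATH, that the tiffg4 device (black and white) at 300 dpi suits your scans, and that your Tesseract build can read TIFF.

```python
import subprocess
from pathlib import Path


def gs_command(pdf, tiff, dpi=300, device="tiffg4"):
    """Build a ghostscript command rendering all pages of pdf into one TIFF.

    With no %d in the output name, the TIFF devices write every page
    into a single multi-page file.
    """
    return [
        "gs", "-q", "-dNOPAUSE", "-dBATCH",
        f"-sDEVICE={device}",   # assumption: bilevel G4 output is acceptable
        f"-r{dpi}",             # assumption: 300 dpi is enough for OCR
        f"-sOutputFile={tiff}",
        str(pdf),
    ]


def tesseract_command(tiff, out_base, lang="eng"):
    """Build a tesseract command; tesseract itself appends .txt to out_base."""
    return ["tesseract", str(tiff), str(out_base), "-l", lang]


def ocr_pdf(pdf, workdir="."):
    """Render pdf to a multi-page TIFF, OCR it, and return the text file path."""
    pdf = Path(pdf)
    tiff = Path(workdir) / (pdf.stem + ".tif")
    out_base = Path(workdir) / pdf.stem
    subprocess.run(gs_command(pdf, tiff), check=True)
    subprocess.run(tesseract_command(tiff, out_base), check=True)
    return Path(str(out_base) + ".txt")
```

Calling ocr_pdf("scan.pdf") should leave scan.tif and scan.txt in the working directory. If the scans are greyscale or colour, a different TIFF device would be needed in place of tiffg4.
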
I have used pdfimages for a number of things, with scripts handling
the files one at a time. But I can see ghostscript would be a better
way of working for you (and quite possibly for me, next time I have
lots of stuff to process).
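
For comparison, the one-file-at-a-time approach with pdfimages looks roughly like the sketch below. The helper names are hypothetical, and it assumes pdfimages and tesseract are on the PATH, that the prefix contains no glob wildcard characters, and that Tesseract can read the PPM/PBM files pdfimages writes by default. Note that pdfimages extracts one file per embedded image, which is not necessarily one per page.

```python
import subprocess
from pathlib import Path


def pdfimages_command(pdf, prefix):
    """Build a pdfimages command; output files are named prefix-NNN.ext."""
    return ["pdfimages", str(pdf), str(prefix)]


def extracted_images(prefix):
    """List the files pdfimages wrote for this prefix, in name order.

    Any .txt files are skipped so that a re-run does not try to OCR
    earlier Tesseract output.
    """
    prefix = Path(prefix)
    matches = prefix.parent.glob(prefix.name + "-*")
    return sorted(p for p in matches if p.suffix != ".txt")


def ocr_images(pdf, prefix):
    """Extract the images from pdf and run tesseract on each one in turn."""
    subprocess.run(pdfimages_command(pdf, prefix), check=True)
    for image in extracted_images(prefix):
        # tesseract appends .txt to the output base itself
        out_base = image.with_suffix("")
        subprocess.run(["tesseract", str(image), str(out_base)], check=True)
```

This leaves one text file per image, which then have to be stitched together, so the single multi-page TIFF from ghostscript is the tidier route.
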
Nick
On Mon, Apr 29, 2013 at 05:51:49AM -0700, Steven McArdle wrote:
> Thanks Nick
>
> I already have it set up for ghostscript as it gives better results than
> imagemagick.
>
> As the PDFs are mostly multi-page and ghostscript can generate multi-page TIFFs
> from them, I can feed those directly into Tesseract.
>
> So I don't think pdfimages is an option as it spits out multiple files.
>
> Steve