> Hi, I am trying to scan a series of documents which have been badly skewed by
It's hard to know whether you mean the whole page is scanned at an
angle, or that the text is badly distorted where the page curves
up into the spine when the book is squashed flat.
If the whole page is at an angle, then ImageMagick's 'convert' (with
its -deskew option) does a reasonable job of deskewing.
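
For the whole-page case, here is a minimal sketch of how the deskewing step could be scripted from Python. It assumes ImageMagick is installed and on the PATH. The function names and the file names in the usage note are only illustrative, and the 40% threshold is the value the ImageMagick documentation suggests as a starting point, not something tuned for any particular scans.

```python
import shutil
import subprocess


def build_deskew_command(src, dst, threshold="40%", executable="convert"):
    """Return the argument list for an ImageMagick deskew run (hypothetical helper)."""
    # -deskew straightens a page that was scanned at an angle;
    # +repage drops any virtual-canvas offset left over from the rotation.
    return [executable, src, "-deskew", threshold, "+repage", dst]


def deskew(src, dst, threshold="40%"):
    """Run ImageMagick on one file; raises CalledProcessError if the tool fails."""
    # ImageMagick 7 installs the tool as "magick"; older releases use "convert".
    executable = "magick" if shutil.which("magick") else "convert"
    subprocess.run(build_deskew_command(src, dst, threshold, executable), check=True)
```

Calling deskew("page_001.png", "page_001_deskewed.png") would then write a straightened copy next to the original. This only corrects rotation of the whole page; it does nothing for text that curves into the spine.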
I don't know of any software to correct curved text, but I do my scanning
with a Plustek OpticBook scanner. It gets pretty close to the spine
of the book (about 1/4" for my 3600 model) without needing to try
to squash the book flat. So there is practically no skewing over
the scanning surface. If you're doing the scanning yourself and
money isn't too tight, you could look into getting one of these
scanners.
Cheers,
Rob Komar
--
If you'd like to solve this in software, you should first decide
whether you want a "rectified" image (as if the book pages had come
out of a flatbed scanner) or just need the text to be recognized.
There are image processing algorithms and out-of-the-box products
that can achieve the former, though only imperfectly. However, if
the latter is enough for your goals, then you can pass such
imperfectly "rectified" images to an OCR system and probably get
decent results.
Methods of rectifying and preparing such curved images for OCR are
very diverse, though. E.g. in my engine for processing sales receipt
photos I use a broad mixture of traditional and self-devised methods,
and there's still room for improvement. You'll need to experiment with
different methods to determine what suits you best.
Warm regards,
Dmitri Silaev
www.CustomOCR.com