Re: Font Size support

Sven Pedersen

Jun 21, 2012, 12:13:36 PM
to tesser...@googlegroups.com
Hi Islam,
Font size does not matter per se, but the number of pixels matters. As
Dmitri Silaev said,

``As a rule of the thumb, usually one can obtain good recognition
results for all standard regular fonts of 11-16pt size, be it a
screenshot or a 300 DPI scanned image. Should font size, resolution,
etc. differ significantly from these numbers, recognition quality
becomes a matter of experimentation.''

Another source says character height should be around 90 px. The ideal
scan resolution is 200-300 dpi, and you can often improve accuracy just
by upscaling a low-res image.
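
To make the link between point size and pixels concrete: one point is 1/72 inch, so the nominal (em) height in pixels is pt * dpi / 72. Below is a minimal sketch of that arithmetic. The function name is made up for illustration, and actual glyph heights (x-height, cap height) are smaller than the em height and vary by font.

```python
def pt_to_px(point_size, dpi):
    """Nominal em height in pixels for a font of `point_size` at `dpi`.

    Uses the standard definition 1 pt = 1/72 inch. This is the em box,
    not the height of any particular glyph (an assumption to keep in mind).
    """
    return point_size * dpi / 72.0


# The 11-16 pt range from the quote above, at a 300 DPI scan
for pt in (11, 16):
    print(pt, "pt at 300 DPI ->", round(pt_to_px(pt, 300), 1), "px")
```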
-_Sven

On Thu, Jun 21, 2012 at 10:06 AM, islam ibrahim
<islam.ade...@gmail.com> wrote:
> Hello
>
> I have a question regarding the font size that Tesseract supports. Is there
> a specific size or is it just working whatever font size or even type used?
>
> Thanks in advance



--
``All that is gold does not glitter,
  not all those who wander are lost;
the old that is strong does not wither,
  deep roots are not reached by the frost.
From the ashes a fire shall be woken,
  a light from the shadows shall spring;
renewed shall be blade that was broken,
  the crownless again shall be king.''

Dmitri Silaev

Jun 21, 2012, 12:40:48 PM
to tesser...@googlegroups.com
Let me put it a bit more clearly. For Latin-, Greek-, and Cyrillic-based alphabets, characters with a height of 24-72 pixels usually get recognized decently. For character heights outside this range, you may need to experiment. Also, I'm not sure whether all of this holds true for other writing systems, e.g. Chinese.

Sometimes you can upscale low-res images so that the characters fall into the above height range. Likewise, you can downscale hi-res images. However, you should understand what you're doing, as scaling may change character outlines significantly.
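
As a rough illustration of the scaling idea, here is a small sketch that computes a resize factor from a measured character height. The function name and the choice to aim for the middle of the 24-72 px range are assumptions made for this example, not anything Tesseract prescribes. How you measure the character height and which resampling filter you use are left to you.

```python
def scale_factor(char_height_px, lo=24, hi=72):
    """Factor to resize an image by so characters land within [lo, hi] px."""
    if char_height_px <= 0:
        raise ValueError("character height must be positive")
    if lo <= char_height_px <= hi:
        return 1.0  # already in range, leave the image alone
    target = (lo + hi) / 2  # assumption: aim for the middle of the range
    return target / char_height_px


print(scale_factor(12))   # small text: upscale
print(scale_factor(40))   # in range: no change
print(scale_factor(144))  # large text: downscale
```

Multiply the image width and height by the returned factor in whatever image library you use, then check the result by eye, since resampling can distort the outlines.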

Warm regards,
Dmitri Silaev
www.CustomOCR.com