Optimal settings to read a floorplan diagram


Jeremy Colton

Aug 20, 2016, 5:44:58 AM
to tesseract-ocr
Hi,

I am using:
tesseract 3.03
 leptonica-1.70
  libgif 4.1.6(?) : libjpeg 8d : libpng 1.2.50 : libtiff 4.0.3 : zlib 1.2.8 : webp 0.4.0


I run:
tesseract my-image.png out digits

It produces an out.txt file that contains nothing but blank lines!

My image is a coloured floorplan with numbers identifying rooms.  The numbers are not in a line, but effectively scattered throughout the page.  I need to read the numbers and get their (x,y) co-ordinates.
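For the co-ordinates part, one possible route is Tesseract's hOCR output, e.g. `tesseract my-image.png out hocr`, which should write an HTML-like file (out.hocr, or out.html on some older builds) where each recognised word carries a `bbox x0 y0 x1 y1` entry in its `title` attribute. The following is only a minimal sketch of reading those boxes with the Python standard library. The names `HocrWordParser`, `word_boxes` and `SAMPLE` are made up for illustration, and the sample markup is hand-written to resemble hOCR rather than taken from a real run.

```python
from html.parser import HTMLParser
import re

# Matches the "bbox x0 y0 x1 y1" entry inside an hOCR title attribute.
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")


class HocrWordParser(HTMLParser):
    """Collect (text, x0, y0, x1, y1) for every ocrx_word span (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None   # bbox of the word span currently open, if any
        self._depth = 0     # nesting depth of tags inside that span
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._bbox is not None:
            self._depth += 1  # e.g. <strong> or <em> nested inside a word
            return
        classes = (attrs.get("class") or "").split()
        if tag == "span" and "ocrx_word" in classes:
            match = BBOX_RE.search(attrs.get("title") or "")
            if match:
                self._bbox = tuple(int(v) for v in match.groups())
                self._depth = 0
                self._text = []

    def handle_endtag(self, tag):
        if self._bbox is None:
            return
        if self._depth > 0:
            self._depth -= 1
            return
        # Closing tag of the word span itself: store the word if it has text.
        text = "".join(self._text).strip()
        if text:
            self.words.append((text,) + self._bbox)
        self._bbox = None

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)


def word_boxes(hocr_text):
    """Return a list of (text, x0, y0, x1, y1) tuples, in document order."""
    parser = HocrWordParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.words


# Hand-written stand-in for the contents of out.hocr (an assumption, not real output).
SAMPLE = """<div class='ocr_page' title='bbox 0 0 800 600'>
<span class='ocr_line' title='bbox 36 92 96 116'>
<span class='ocrx_word' id='word_1_1' title='bbox 36 92 96 116; x_wconf 90'>101</span>
</span>
<span class='ocr_line' title='bbox 410 300 470 324'>
<span class='ocrx_word' id='word_1_2' title='bbox 410 300 470 324; x_wconf 87'><strong>102</strong></span>
</span>
</div>"""

for text, x0, y0, x1, y1 in word_boxes(SAMPLE):
    # Print each number with the centre of its bounding box, in pixels.
    print(text, (x0 + x1) // 2, (y0 + y1) // 2)
```

The boxes are in pixels measured from the top-left corner of the image, so the centre of each box gives one (x,y) position per room number.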

From what I've found, I think I need:

1. To remove any graphics from the plan and just leave the numbers.  
2. The colours should only be black and white.
3. It should be a .png or a .tif; which is better?
4. The minimum resolution is 300x300 dpi.
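Points 2 and 4 can be sketched in plain Python to make the idea concrete. This is only an illustration, assuming the image is already available as rows of (r, g, b) tuples. The function names `to_black_and_white` and `upscale_factor`, the fixed threshold of 128, and the luma weights are illustrative choices, not something Tesseract prescribes. A real workflow would normally use an imaging tool or library instead.

```python
def to_black_and_white(pixels, threshold=128):
    """Map each (r, g, b) pixel to 0 (black) or 255 (white) with a global threshold."""
    result = []
    for row in pixels:
        out_row = []
        for r, g, b in row:
            # Approximate perceived brightness (ITU-R BT.601 weights).
            luma = 0.299 * r + 0.587 * g + 0.114 * b
            out_row.append(0 if luma < threshold else 255)
        result.append(out_row)
    return result


def upscale_factor(current_dpi, target_dpi=300):
    """Scale factor needed to reach target_dpi; never shrinks the image."""
    return max(1.0, target_dpi / current_dpi)


# Tiny made-up 2x2 image: white and pale-blue background, dark-grey and red marks.
tiny = [
    [(255, 255, 255), (20, 20, 20)],
    [(200, 230, 255), (180, 0, 0)],
]

print(to_black_and_white(tiny))
print(upscale_factor(150))
```

A single global threshold is the simplest option. A floorplan with coloured room fills may need a different threshold, or per-colour filtering, to keep the digits while dropping the background.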

What else?

Many thanks
Jeremy