Tesseract performs poorly. What am I doing wrong?


narayanan iyer

Feb 8, 2019, 9:45:12 AM
to tesseract-ocr
I have scaled the image and also binarized it, but I still get bad results. Is there anything else I could do to improve them?

Maybe it's due to the mix of numbers and text.

thanks


11.jpg

Kristóf Horváth

Feb 8, 2019, 10:00:15 AM
to tesseract-ocr
I think Tesseract does binarization by default. What size did you scale the image from, and to what? I believe Tesseract performs best at 300 dpi. Also, you provided only the picture; it would be nice to see the actual results.
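To make the 300 dpi suggestion concrete, here is a minimal sketch that computes the resize factor and target pixel size for a scan. The helper name `scale_for_target_dpi` and the example numbers are made up for illustration, and it assumes you know (or can estimate) the resolution the image was captured at.

```python
def scale_for_target_dpi(width_px, height_px, current_dpi, target_dpi=300):
    """Return (factor, new_width, new_height) needed to reach target_dpi.

    Hypothetical helper: it only does the arithmetic, the actual resize
    would be done with whatever image library you already use.
    """
    if current_dpi <= 0:
        raise ValueError("current_dpi must be positive")
    factor = target_dpi / current_dpi
    return factor, round(width_px * factor), round(height_px * factor)


# Example: a letter-size page scanned at an assumed 100 dpi.
factor, new_w, new_h = scale_for_target_dpi(850, 1100, current_dpi=100)
print(factor, new_w, new_h)
```

Scaling far beyond this target is not necessarily better, so it may be worth checking what effective resolution the image ends up at after resizing.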

Timothy Snyder

Feb 8, 2019, 2:11:34 PM
to tesser...@googlegroups.com
You may want to try segmenting this image into smaller segments and try to remove elements of the table grid to see if you achieve better results.
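One possible way to remove the table grid, sketched in plain Python so it runs without extra libraries: treat the binarized image as rows of 0/1 pixels and clear any horizontal or vertical run of ink that is longer than a character stroke could plausibly be. The function names and the `min_run` threshold are assumptions for illustration. In practice a morphological opening with a long, thin kernel in an image library would do the same job faster.

```python
def long_runs(values, min_run):
    """Yield (start, end) index pairs for runs of 1s at least min_run long."""
    start = None
    for i, v in enumerate(list(values) + [0]):  # trailing 0 closes a final run
        if v and start is None:
            start = i
        elif not v and start is not None:
            if i - start >= min_run:
                yield start, i
            start = None


def remove_grid_lines(image, min_run):
    """Return a copy of a 0/1 image with long straight ink runs cleared."""
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    # Horizontal lines: long runs within a single row.
    for y in range(height):
        for start, end in long_runs(image[y], min_run):
            for x in range(start, end):
                cleaned[y][x] = 0
    # Vertical lines: long runs within a single column.
    for x in range(width):
        column = [image[y][x] for y in range(height)]
        for start, end in long_runs(column, min_run):
            for y in range(start, end):
                cleaned[y][x] = 0
    return cleaned


# Tiny example: a top border, a left border, and a small glyph inside.
table = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0],
]
for row in remove_grid_lines(table, min_run=5):
    print(row)
```

One caveat with this approach: glyph pixels that touch a grid line along the line's direction become part of the same run and are cleared with it, so `min_run` should be well above the widest character.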

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-oc...@googlegroups.com.
To post to this group, send email to tesser...@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/46f65963-ac03-485a-badc-086b7535b1c5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

narayanan iyer

Feb 9, 2019, 3:49:35 AM
to tesseract-ocr
This is what I got:

| ee, |

ee sn REINS

| MICRO |

PN.422403 KIT, MOTOR DRIVE 63/71 IEC 3081

PN.422401 KIT, MOTOR DRIVE 56C 1800 RPK

FLANGED 1500 RPM

Ee

Ni.

Pa [1 joprion| “ner.

MOTOR, 1/2 HP, 1800 RPM

| 1 | 4 {OPTION

| 2 | 1 4422366 [cp-pes2e

2 | 1 |422394 ico-pase!

| 3 | 1 [422367 [co-es9es)

PLATE, MOTOR S6C MOUNTING

| 3 | 1 [422306 [co-asses

4] 1 [azoics| no |

| 411 jaze3saf” ono | SHEAVE, 3V 3.15 D0 1 GR. JA

|S [1 416357] no |

|S [1 [422300] No

Sj 4 | eszc2] No |

| 8 | 4 | sa2cz} xo |

7 [4 | es6a6 0)

7 [2] e646 [Ne

ete

PNA

Lo | «| seee 8

PN.422404 KIT, MOTOR DRIVE 80 IEC 3081

PN.422402 KIT, MOTOR DRIVE 56C 1500 RPH

FLANGED 1500 RPM

jt | a forrion| REF. |

MOTOR, 1/2 HP, 1500 RP

| 1 fa fOPrION| REF. |

MOTOR, .40 kW, i500 RPW

D2 [1 [aaesee [eo-o6sea

2 | 1 [aaeses [co-ness2

| 3 | 1 [422387 [co-asss5|

PLATE, MOTOR S6C MOUNTING

| 3 | 1 [422397 |co-assss

| 4 | 1 j4zesca| no |

4 |i fazasca| no |

5 | 1 f4tess7 [no |

5 | 1 [s224co |

8 | 4 | sezoz| No |

|S [4 jeez] so |

7] 4 | essis| xo

7 [4 | seat no

pal - jo - fT

SN

8 4 | pease] oo |

BEE SHT. 2 FOR OVERVEERS

iver [ecm: | rar. | oc. | smeSSCS~C~S~SCSY

man SE EUCES UALESS SARCTPTED |

KIT, MOTOR ORIVE

ae

{ER SRA: or

4.010"

ea

PUCTION:

4

jt-ar

20m

FIT UGG oN

LRA FS at 5-5 -55)

*aLe,

na

MOG PS CONVE

L_ etitidiia air | srare {seer > te

Kristóf Horváth

Feb 9, 2019, 4:20:13 AM
to tesseract-ocr
That's really bad. OK, so which page segmentation mode did you use? Did you use the best traineddata? If you don't have tools to clean the picture up, I think you should do what Timothy said: "You may want to try segmenting this image into smaller segments and try to remove elements of the table grid to see if you achieve better results."
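For reference, a small sketch of how the page segmentation mode and an alternative traineddata directory can be passed on the Tesseract 4 command line. The helper only builds the argument list and does not run anything. The function name, the output base `out`, and the directory `/path/to/tessdata_best` are placeholders, not values from this thread.

```python
import shlex


def tesseract_command(image_path, output_base, psm=6, lang="eng", tessdata_dir=None):
    """Build the argv list for a Tesseract 4 run (hypothetical helper)."""
    if not 0 <= psm <= 13:
        raise ValueError("psm must be between 0 and 13")
    cmd = ["tesseract", image_path, output_base, "-l", lang, "--psm", str(psm)]
    if tessdata_dir is not None:
        # Point Tesseract at a directory holding the downloaded traineddata files.
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


cmd = tesseract_command("11.jpg", "out", psm=6, tessdata_dir="/path/to/tessdata_best")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once Tesseract is installed. Using `--tessdata-dir` avoids overwriting the traineddata files that shipped with the installation.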

narayanan iyer

Feb 9, 2019, 6:42:03 AM
to tesser...@googlegroups.com
I tried psm 6, but I get the error "Image too small to scale!! (3x36 vs min width of 3)", even though the image has been scaled a lot. Do I need to replace the traineddata files in Tesseract with the best traineddata files? How should I proceed to segment the image into cells of single-line text? Any pointers would be appreciated.

thanks
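One simple way to cut a cleaned, binarized region into single text lines is a horizontal projection: find bands of consecutive rows that contain ink, separated by blank rows. The sketch below is plain Python over rows of 0/1 pixels. The function name and the `min_height` cutoff are assumptions. Dropping very thin bands may also help avoid passing slivers only a few pixels in size to the recognizer, though that is a guess about the cause of the error, not something confirmed in this thread.

```python
def split_text_lines(image, min_height=1):
    """Return (top, bottom) row ranges, bottom exclusive, that contain ink.

    Bands shorter than min_height rows are treated as noise and skipped.
    """
    bands = []
    start = None
    for y, row in enumerate(image + [[0]]):  # blank sentinel row closes a final band
        has_ink = any(row)
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            if y - start >= min_height:
                bands.append((start, y))
            start = None
    return bands


# Tiny example: two text lines and a one-row speck between them.
cell = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
]
print(split_text_lines(cell, min_height=2))
```

Each band can then be cropped out and recognized on its own, for example with `--psm 7`, which tells Tesseract to treat the image as a single text line. The same idea applied to columns would split a row into cells.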


