How does page segmentation mode help improve accuracy?

Subrato Namata

Dec 4, 2017, 8:55:36 AM
to tesseract-ocr
Hi,
Can someone please explain what exactly page segmentation modes are, and which PSM should be tried for better OCR accuracy? When I ran Tesseract with the default PSM (3), I got fewer words; when I ran it with PSM 11, I got more words, and accuracy went up as a result.

Regards,
Subrato

David Sixela

Dec 5, 2017, 2:07:14 AM
to tesseract-ocr
Hi, 
Page segmentation mode (PSM) defines how Tesseract should treat the text layout in your image.
For example, if your image contains only a single character or a single block of text, specifying the corresponding PSM can improve accuracy.

You can find more information about the available psm modes by entering the following command in your terminal:
tesseract --help-psm
or by checking this wiki https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality#page-segmentation-method.
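
To make the idea of picking a mode concrete, here is a minimal Python sketch that builds the command line for a run with a chosen PSM. The mode descriptions are abbreviated from the `tesseract --help-psm` listing. The file names `page.png` and `out_psm…` and the helper `tesseract_command` are placeholders made up for illustration, and the `--psm` spelling is assumed (older 3.x builds may expect `-psm` instead).

```python
import shlex

# A subset of the modes printed by `tesseract --help-psm` (descriptions abbreviated).
PSM_MODES = {
    3: "Fully automatic page segmentation, but no OSD (default)",
    4: "Single column of text of variable sizes",
    6: "Single uniform block of text",
    7: "Single text line",
    8: "Single word",
    10: "Single character",
    11: "Sparse text: find as much text as possible in no particular order",
}


def tesseract_command(image_path, output_base, psm=3):
    """Build the argument list for one Tesseract run (hypothetical helper)."""
    if psm not in PSM_MODES:
        raise ValueError(f"PSM {psm} is not in this abbreviated table")
    # CLI form: tesseract <image> <outputbase> [options]
    # Assumption: the installed version accepts `--psm`.
    return ["tesseract", image_path, output_base, "--psm", str(psm)]


# Print the two runs compared in the question: default (3) vs. sparse text (11).
for mode in (3, 11):
    cmd = tesseract_command("page.png", f"out_psm{mode}", psm=mode)
    print(f"{PSM_MODES[mode]}:")
    print("  " + shlex.join(cmd))
```

The printed commands can be pasted into a terminal, or the returned list can be passed to `subprocess.run` if Tesseract is installed. Which mode gives the best result depends on the layout of your image, so it is worth trying the two or three modes that match it most closely and comparing the output.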

I hope this helps.
Regards

Subrato Namata

Dec 11, 2017, 6:03:19 AM
to tesseract-ocr
Thanks!