Looking for segmentation algorithm implementations and (G)UIs


Rainer Verteidiger

Jul 11, 2020, 11:49:22 AM
to tesseract-ocr
Dear all,

I'm looking for a list (other than https://tesseract-ocr.github.io/tessdoc/User-Projects-%E2%80%93-3rdParty) comparing various segmenters (AI-based or otherwise) that could be used instead of Tesseract's built-in segmenter. I'm also looking for one comparing GUIs that could be used to improve automatic segmentation results, i.e. for further training of an AI-based segmenter or for correcting errors in the results of a non-trainable one.

Here are the ones I'm currently aware of (excluding vapourware and abandoned/unmaintained projects):

Segmenters:
- https://github.com/lquirosd/P2PaLA (AI-based; does both bounding boxes and baselines)
- https://github.com/mittagessen/kraken (AI-based; the old version did bounding boxes, but it seems to be switching to baselines now, judging from the Issues)

GUIs:
- https://transkribus.eu/Transkribus/ (desktop client that seems to use P2PaLA on the server side; many features cloud-only, but nice, intuitive editing UI)
- https://github.com/mauvilsa/nw-page-editor (UI not as user-friendly; needs a lot of getting used to, but seems quite powerful)
- https://github.com/mittagessen/kraken (old version produces HTML pages that can be edited and saved again)
- https://wiki.gnome.org/Apps/OCRFeeder (uses a homebrewed XML format, sadly no PageXML, etc.)

Any input would be appreciated :)

Best regards

Rainer

Shree Devi Kumar

Jul 13, 2020, 4:27:03 AM
to tesseract-ocr
Good collection of segmentation algorithms.

Dan Bloomberg updated the segmentation algorithms in Leptonica some time back. You may want to take a look at those too.

Tesseract also uses Leptonica, but with older algorithms, I think.
