recognizing printed lists and simple tables


Thomas L. Packer

Mar 8, 2011, 7:35:38 PM
to ocr...@googlegroups.com

Hello all,

How well does OCRopus do at recognizing the layout of book pages containing interesting lists and simple tables? Think of a high school yearbook with a creative layout. Perhaps OCRopus won’t identify a column of names that lies diagonally across the page, or one where the photos and the names make a checkerboard pattern. But would it at least identify each name as a separate text line (and let me interpret the bounding box positions)?

If OCRopus has any difficulty with this scenario now (which I’m guessing it will), will it be possible to train OCRopus with hand-labeled pages to learn such page layouts in the near future?

Any other suggestions?

Thanks,

Thomas L. Packer

BYU CS

~~~~~~~~~~~~~~~~~~~~

 

Tom

Mar 10, 2011, 8:49:35 PM
to ocr...@googlegroups.com
It can recognize the text pretty well, regardless of layout.  You can also use the XY-cuts or Voronoi layout methods to get blocks.  None of those will reliably give you a tabular structure, however.

We have developed and published trainable layout recognition, but it's not integrated into OCRopus yet.
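
For readers unfamiliar with the XY-cuts method mentioned above, here is a minimal Python sketch of the general recursive XY-cut idea: trim a region to its ink, find the widest empty band of rows or columns, split there, and repeat. It is not OCRopus code. The function name `xy_cut`, the `min_gap` parameter, and the list-of-rows page representation are all assumptions made for illustration.

```python
def xy_cut(page, min_gap=2):
    """Recursively split a binary page into blocks (illustrative sketch).

    page    : list of equal-length rows, 1 = ink, 0 = background (assumed format)
    min_gap : smallest empty band, in pixels, that counts as a separator
    Returns a list of (top, left, bottom, right) boxes, end-exclusive.
    """
    blocks = []

    def trim(top, left, bottom, right):
        # Shrink the region to the bounding box of its ink, or None if empty.
        rows = [r for r in range(top, bottom) if any(page[r][left:right])]
        cols = [c for c in range(left, right)
                if any(page[r][c] for r in range(top, bottom))]
        if not rows or not cols:
            return None
        return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1

    def widest_gap(profile):
        # Longest run of zeros in a projection profile as (start, end), or None.
        best, start = None, None
        for i, value in enumerate(profile + [1]):  # sentinel closes a trailing run
            if value == 0 and start is None:
                start = i
            elif value != 0 and start is not None:
                if best is None or i - start > best[1] - best[0]:
                    best = (start, i)
                start = None
        return best

    def split(top, left, bottom, right):
        box = trim(top, left, bottom, right)
        if box is None:
            return
        top, left, bottom, right = box
        # Projection profiles: ink count per row and per column of the region.
        row_profile = [sum(page[r][left:right]) for r in range(top, bottom)]
        col_profile = [sum(page[r][c] for r in range(top, bottom))
                       for c in range(left, right)]
        row_gap = widest_gap(row_profile)
        col_gap = widest_gap(col_profile)
        row_width = row_gap[1] - row_gap[0] if row_gap else 0
        col_width = col_gap[1] - col_gap[0] if col_gap else 0
        if max(row_width, col_width) < min_gap:
            blocks.append(box)  # no separator wide enough: this is a leaf block
        elif row_width >= col_width:
            split(top, left, top + row_gap[0], right)     # part above the gap
            split(top + row_gap[1], left, bottom, right)  # part below the gap
        else:
            split(top, left, bottom, left + col_gap[0])   # part left of the gap
            split(top, left + col_gap[1], bottom, right)  # part right of the gap

    if page and page[0]:
        split(0, 0, len(page), len(page[0]))
    return blocks


# Two small blocks side by side, then a full-width block below them.
page = [
    [1, 1, 1, 0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0, 1, 1, 1],
    [0] * 9,
    [0] * 9,
    [0] * 9,
    [1] * 9,
    [1] * 9,
]
print(xy_cut(page))  # [(0, 0, 2, 3), (0, 6, 2, 9), (5, 0, 7, 9)]
```

Because the cuts are axis-aligned, a sketch like this only yields rectangular blocks. It does not recover rows and columns of a table, and a diagonal column of names would not be separated cleanly.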

Tom

pranjal rajput

Jan 18, 2013, 1:29:25 AM
to ocr...@googlegroups.com
Has trainable layout recognition been integrated into OCRopus yet?
If so, please share links to the documentation or tutorials.

If it's still under development, can I build a bleeding-edge version from source?

Please share the details.

Thanks,

Sriranga(78yrsold)

Jan 18, 2013, 6:24:17 AM
to ocr...@googlegroups.com
Which language do you want to train it for?


pranjal rajput

Jan 19, 2013, 12:03:39 AM
to ocr...@googlegroups.com
English.

Tom

Apr 10, 2013, 1:41:14 AM
to ocr...@googlegroups.com
Not yet. That's planned for OCRopus 0.8.

Tom