Book layout element recognition

Christoph Holtermann

May 24, 2015, 9:40:13 AM
to ocr...@googlegroups.com
Hello,

Now and then I digitize a book. I have some questions and ideas.

Books have certain layout elements that are stable from page to page. Take
the page numbers, for example: they are (always?) the same distance from the
page borders, they usually have the same font size, and they usually appear
in ascending order.

So there's a lot of information in this element.

When I scan a book, I have several of these elements, such as the page borders
and the block of main text, which share stable attributes across multiple pages.

Is there some way to use this layout information to get a good scanning
result?

For example, if I have a photograph of a book page where the page is not flat,
the lines of text form curves that correspond to the perspective projection
of the curved page. Book pages should curve in certain characteristic ways,
which could be matched against the curves of the text lines and then used to
compute a flat page.
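
To make the idea concrete, here is a minimal sketch in Python. It assumes that
baseline points of one text line have already been detected somehow, fits a
parabola to them, and returns the vertical shift per pixel column that would
make the line straight. The function names are my own, and a parabola is only
a stand-in for a real page curvature model; it corrects vertical displacement
only, not the full perspective distortion.

```python
def fit_parabola(points):
    """Least-squares fit of y = a*x^2 + b*x + c to a list of (x, y) points."""
    # Normal equations for the unknowns (a, b, c).
    s = [sum(x ** k for x, _ in points) for k in range(5)]      # s[k] = sum of x^k
    t = [sum(y * x ** k for x, y in points) for k in range(3)]  # t[k] = sum of y*x^k
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting on the augmented matrix.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def vertical_shifts(baseline_points, width):
    """Per-column shift that moves the fitted curve onto a straight line."""
    a, b, c = fit_parabola(baseline_points)
    curve = [a * x * x + b * x + c for x in range(width)]
    target = sum(curve) / width  # straight line at the curve's mean height
    return [target - y for y in curve]


# Hypothetical baseline samples of one curved text line on a 101 px wide image.
samples = [(x, 0.01 * (x - 50) ** 2 + 100) for x in range(0, 101, 10)]
shifts = vertical_shifts(samples, 101)
```

Each pixel column x would then be moved vertically by shifts[x]. A real
dewarping step would combine the curves of many lines and also correct the
horizontal foreshortening.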

So if the goal is a flat page, the page borders would have to be recognized
correctly and the pages scaled accordingly.

Another goal could be to make all the pages look uniform by having the text in
the same positions (the page numbers, for example) or by having the font
always the same size.

So there are a lot of layout elements. Is there a convention for naming them?
If you have a bunch of pages, these parameters could be matched to make
the pages look similar by identifying stable characteristics.
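
As a rough sketch of what such matching could look like (Python, with names
that are my own invention): assume the bounding box of the same element, say
the page number, is already known on every page. The median position and
height then serve as the "stable" reference, and each page gets an offset and
a scale factor that would bring it in line.

```python
from statistics import median


def alignment_params(boxes):
    """boxes maps a page name to (x0, y0, x1, y1) of the same layout element.

    Returns, per page, the (dx, dy) offset and the scale factor that would
    move the element to the median position and median height of all pages.
    """
    ref_x = median(b[0] for b in boxes.values())
    ref_y = median(b[1] for b in boxes.values())
    ref_h = median(b[3] - b[1] for b in boxes.values())
    params = {}
    for name, (x0, y0, x1, y1) in boxes.items():
        params[name] = {
            "offset": (ref_x - x0, ref_y - y0),
            "scale": ref_h / (y1 - y0),
        }
    return params


# Hypothetical page number boxes on three scanned pages.
boxes = {
    "p1.png": (100, 1400, 140, 1430),
    "p2.png": (110, 1390, 150, 1420),
    "p3.png": (90, 1410, 130, 1440),
}
params = alignment_params(boxes)
```

Left and right pages usually carry the page number on opposite sides, so in
practice odd and even pages would probably have to be handled as two separate
groups.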

Knowing that the images form a book, the pages could be sorted according to
their numbering.
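
A small sketch of the sorting step in Python, assuming the page numbers have
already been recognized and that pages without a recognizable number (None)
belong right after the numbered page scanned before them. Both assumptions and
the function name are mine.

```python
def sort_pages(pages):
    """pages: list of (filename, page_number or None) in scan order."""
    keyed = []
    last_number = float("-inf")  # pages before the first number sort to the front
    run = 0                      # position within a run of unnumbered pages
    for name, number in pages:
        if number is not None:
            last_number, run = number, 0
        else:
            run += 1
        keyed.append(((last_number, run), name))
    return [name for _, name in sorted(keyed)]


# Hypothetical scan order with one unnumbered page following page 3.
scanned = [("c.png", 3), ("plate.png", None), ("a.png", 1), ("b.png", 2)]
ordered = sort_pages(scanned)
```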

Looking at a bunch of digitized pages, software could check whether the images
show the characteristics common to books and then morph the images to fit
those characteristics ideally.

In the end one would have a nice-looking book.

Is there any work on this in an open source project?

regards,

Christoph Holtermann

c.holt...@gmx.de

Jun 5, 2015, 10:19:32 AM
to ocr...@googlegroups.com
Hello,

To get a bit less abstract and a bit more practical:

Is there a way to access the data extracted during the ocropus recognition
process in an object-oriented way (via Python), like the elements of a DOM
tree on an HTML page?
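
What I have in mind is roughly the following. As far as I understand, the
results can be exported as hOCR (an HTML-based format in which the elements
carry classes such as ocr_page and ocr_line and a bbox in the title
attribute), for instance with the ocropus-hocr script. The sketch below is not
an ocropus API; it only uses Python's standard html.parser to turn such a file
into a small tree of objects. The class names HocrNode and HocrTree and the
sample markup are my own assumptions.

```python
from html.parser import HTMLParser


def parse_bbox(title):
    """Extract (x0, y0, x1, y1) from an hOCR title attribute, or None."""
    for field in title.split(";"):
        parts = field.split()
        if parts[:1] == ["bbox"] and len(parts) == 5:
            return tuple(int(p) for p in parts[1:])
    return None


class HocrNode:
    def __init__(self, kind, bbox, parent=None):
        self.kind, self.bbox, self.parent = kind, bbox, parent
        self.children, self.text = [], ""


class HocrTree(HTMLParser):
    """Builds a tree of HocrNode objects from elements whose class starts with 'ocr'."""

    def __init__(self):
        super().__init__()
        self.root = HocrNode("document", None)
        self._open_tags = [(None, self.root)]  # (tag name, node current inside it)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        kind = next((c for c in classes if c.startswith("ocr")), None)
        current = self._open_tags[-1][1]
        if kind:
            node = HocrNode(kind, parse_bbox(attrs.get("title") or ""), current)
            current.children.append(node)
            current = node
        self._open_tags.append((tag, current))

    def handle_endtag(self, tag):
        # Pop back to the matching start tag; this also skips unclosed void tags.
        if any(t == tag for t, _ in self._open_tags[1:]):
            while self._open_tags.pop()[0] != tag:
                pass

    def handle_data(self, data):
        self._open_tags[-1][1].text += data


# Hand-written sample in the hOCR style, not actual ocropus output.
SAMPLE = """<html><body>
<div class='ocr_page' title='image page1.png; bbox 0 0 1000 1500'>
<span class='ocr_line' title='bbox 100 120 900 160'>First line of text</span>
<span class='ocr_line' title='bbox 480 1400 520 1430'>17</span>
</div></body></html>"""

tree = HocrTree()
tree.feed(SAMPLE)
tree.close()
page = tree.root.children[0]
for line in page.children:
    print(line.kind, line.bbox, line.text.strip())
```

With something like this, the page number would simply be a line object whose
bbox sits near the bottom of the page on every page.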

regards,

Christoph Holtermann