hocr's line baseline


Diego de la Hera

Jun 25, 2016, 2:45:59 PM
to tesseract-ocr
Hi everyone!

I am trying to figure out what the two numbers after "baseline" in hOCR's ocr_line tags mean, but so far I haven't been able to work it out. Here is one of these tags as an example: <span xmlns="http://www.w3.org/1999/xhtml" class="ocr_line" id="line_1_48" title="bbox 879 1300 1240 1335; baseline 0 -6">

So far, I've noticed some things:
If the line is skewed upward (anticlockwise) or not skewed, the second number is zero when no characters (e.g. p's, q's, y's) extend below the baseline. If characters do extend below the baseline, this value is negative.
If the line is skewed downward (clockwise), the second number is negative.
The second number is always an integer.

The first number is zero if there is no skew, negative if the line is skewed anticlockwise, and positive if it is skewed clockwise. This value is a decimal. I thought it could be an angle expressed in some way, but I couldn't understand how.

Documentation says: baseline pn pn-1 … p0 - a polynomial describing the baseline of a line of text, the polynomial is in the coordinate system of the line, with the bottom left of the bounding box as the origin

But it is not clear to me.

Can anybody help me here? I would really appreciate it!

Stef

Jun 25, 2016, 4:41:23 PM
to tesseract-ocr
The two numbers are the slope (first number) and the constant term (second number) of a linear equation describing the baseline relative to the bottom left of the bounding box. For a linear equation n = 1, so the first number is p1, the second number is p0, and the equation describing the baseline is y = p1 * x + p0.
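
To make this concrete, here is a minimal Python sketch that applies y = p1 * x + p0 to the ocr_line title quoted in the first post. The function names (parse_line_title, baseline_y, skew_degrees) are made up for illustration and are not part of Tesseract or any hOCR library. It assumes image coordinates with y increasing downward, which is consistent with the signs observed in the first post, and it reads the first number as a slope, so an angle would be atan(p1).

```python
import math
import re


def parse_line_title(title):
    """Extract (bbox, baseline) from an hOCR ocr_line title string.

    Hypothetical helper: only the bbox and baseline properties are read.
    """
    bbox_m = re.search(r"bbox\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)", title)
    base_m = re.search(r"baseline\s+(-?\d*\.?\d+)\s+(-?\d*\.?\d+)", title)
    if bbox_m is None or base_m is None:
        raise ValueError("title has no bbox or no baseline property")
    bbox = tuple(int(v) for v in bbox_m.groups())        # x0, y0, x1, y1
    baseline = tuple(float(v) for v in base_m.groups())  # p1 (slope), p0 (constant)
    return bbox, baseline


def baseline_y(bbox, baseline, x_page):
    """Page y coordinate of the baseline at page x coordinate x_page.

    Assumption: the origin of the line's coordinate system is the bottom-left
    corner of the bbox, i.e. (x0, y1), and y grows downward as in the image.
    """
    x0, _y0, _x1, y1 = bbox
    p1, p0 = baseline
    return y1 + p1 * (x_page - x0) + p0


def skew_degrees(baseline):
    """Skew angle implied by the slope; positive means clockwise on screen."""
    return math.degrees(math.atan(baseline[0]))


title = "bbox 879 1300 1240 1335; baseline 0 -6"
bbox, baseline = parse_line_title(title)
left_y = baseline_y(bbox, baseline, bbox[0])   # baseline at the left edge
right_y = baseline_y(bbox, baseline, bbox[2])  # baseline at the right edge
print(left_y, right_y, skew_degrees(baseline))
```

For the example line, the baseline sits 6 pixels above the bottom edge of the bounding box (y = 1329) along the whole line, and the skew is zero.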

Stef

Tom Morris

Jun 25, 2016, 6:58:50 PM
to tesseract-ocr
Hi Stef. Is that info hiding in the wiki somewhere? If not, do you think you could find a place to add it? 

Tom

Stef

Jun 26, 2016, 2:02:34 PM
to tesseract-ocr
Tom,

This info is in the hOCR spec (as mentioned by the OP), but it may not be easy to understand as written. I could add this explanation along with a sketch to the FAQs if this is of common interest. I just don't want to be scolded for littering the wiki with possibly redundant information.

Stef

Stef

Jun 28, 2016, 8:01:52 AM
to tesseract-ocr
Here you go.

Tom Morris

Jun 28, 2016, 9:09:02 AM
to tesser...@googlegroups.com
Thanks Stef! I think that's much clearer than the brief mention buried in the hOCR spec.

Tom


Diego de la Hera

Sep 3, 2016, 11:52:40 PM
to tesseract-ocr
Sorry for reviving this, but it looks like I had no notifications set on this thread and missed your replies. I couldn't leave this post closed without thanking you guys. This is amazing! Great explanation! Thank you very much.