What exactly is a "flow" in TextExtractor?

Aaron Gravesdale

Apr 16, 2014, 4:53:53 PM
to pdfne...@googlegroups.com
Q:

What exactly is a "flow"? The description of TextExtractor::Line::GetFlowID()
says:

  The unique identifier for a paragraph or column that this line belongs to.

This is confusing as columns are usually made up of paragraphs.

Do you have a more precise definition of "flow"? For instance, does a flow
always have rectangular shape?

A:

A flow is a logical construct that represents a reading order. It contains a sequence of blocks (paragraphs / columns) in the order they should be read on a given page.

Most pages contain only a single flow, but a page can contain two or more, e.g. if half of the page is portrait and the content on the other half is rotated 90 degrees.
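To make the idea concrete, here is a minimal sketch of grouping extracted lines by their flow ID. It does not call PDFNet: LineRecord, group_by_flow and the sample data are hypothetical stand-ins, and flow_id is simply assumed to hold whatever TextExtractor::Line::GetFlowID() returned for each line.

```python
from collections import OrderedDict
from typing import NamedTuple


class LineRecord(NamedTuple):
    """Hypothetical stand-in for one extracted line (not a PDFNet type)."""
    flow_id: int  # assumed to be the value reported by GetFlowID() for the line
    text: str


def group_by_flow(lines):
    """Group line text by flow ID, keeping the order in which lines were seen."""
    flows = OrderedDict()
    for line in lines:
        flows.setdefault(line.flow_id, []).append(line.text)
    return flows


# Made-up sample: an upright half of the page (flow 1) and a rotated half (flow 2).
sample = [
    LineRecord(1, "Upright heading"),
    LineRecord(1, "Upright body text"),
    LineRecord(2, "Rotated caption"),
]

for flow_id, texts in group_by_flow(sample).items():
    print(flow_id, texts)
```

Each key then corresponds to one reading order on the page, so a page with a single flow would produce a single group.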
