How can I get letter-spacing and word-spacing using TextExtractor?

Zonker Harris

Jul 12, 2013, 11:59:29 AM
to pdfne...@googlegroups.com
Hey all,

I am building a tech demo for my boss, based on what I learned from working through the TextExtractorTest program. I have figured out how to get almost all the information I need, but when I try to
get the letter-spacing and word-spacing, they are not available through the TextExtractor class. If I can get the font, size, weight, and serif flag, why can I not also access the letter and word spacing?

The Ruby code

This all works great:

def PrintStyle(style)
  # Print the style attributes that TextExtractor's Style object exposes
  puts " style=\"font-family:" + style.GetFontName + "; font-size:" +
       style.GetFontSize.to_s + "; serif: " + style.IsSerif.to_s +
       "; color:" + style.GetColor.to_s + "\""
end

But this fails when added inside PrintStyle:

puts style.GetCharSpacing.to_s

`PrintStyle': undefined method `GetCharSpacing' for #<PDFNetRuby::Style:0x007ff5f1863690> (NoMethodError)

Am I gonna have to abandon TextExtractor and roll my own version using an ElementReader, or am I missing something?

Any help would be appreciated.


zonker

fraga

Aug 26, 2014, 1:48:54 AM
to pdfne...@googlegroups.com
I have the same problem! Did you find an alternative?

Anatoly Kudrevatukh

Aug 26, 2014, 5:57:16 PM
to pdfne...@googlegroups.com
Hello,

This information is only accessible through the ElementReader interface.
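
A rough sketch of what that ElementReader walk might look like in Ruby. The helper name collect_text_spacing is hypothetical, and the sketch assumes the Ruby binding exposes the same calls as the C++ API (Next, GetType, GetGState, GetCharSpacing, GetWordSpacing, GetTextString, FormBegin, End). The element type constants are passed in as arguments rather than hard-coded, so check them against your PDFNet version before relying on this.

```ruby
# Walk an already-begun ElementReader and record the spacing of each text run.
# Assumption: reader.Next returns nil once the content stream is exhausted.
# text_type / form_type are the element type constants
# (Element::E_text and Element::E_form in PDFNetRuby).
def collect_text_spacing(reader, text_type, form_type, out = [])
  while (element = reader.Next)
    case element.GetType
    when text_type
      gs = element.GetGState
      out << { text: element.GetTextString,
               char_spacing: gs.GetCharSpacing,
               word_spacing: gs.GetWordSpacing }
    when form_type
      # Descend into form XObjects, which carry their own content stream
      reader.FormBegin
      collect_text_spacing(reader, text_type, form_type, out)
      reader.End
    end
  end
  out
end

# Possible usage with PDFNet loaded (not run here):
#   reader = ElementReader.new
#   reader.Begin(page)
#   runs = collect_text_spacing(reader, Element::E_text, Element::E_form)
#   reader.End
```

Each returned hash describes one text run, so the values are per run rather than per TextExtractor word or style.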

Support

Aug 27, 2014, 6:24:31 PM
to pdfne...@googlegroups.com


I guess you could use the info provided by TextExtractor (namely glyph and word bounding boxes) to find the info you are looking for.

So:

... compute the average char spacing from the horizontal gaps between consecutive Glyph bboxes on the same line (x2 of one glyph to x1 of the next)
... compute the average word spacing from the horizontal gaps between consecutive Word bboxes on the same line (x2 of one word to x1 of the next)


For words:

pdftron::PDF::TextExtractor::Word::GetBBox() or GetQuad()

For glyphs:

pdftron::PDF::TextExtractor::Word::GetGlyphQuad()
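
The averaging step could be sketched in plain Ruby as below. The helper name average_gap is hypothetical, and the sketch assumes the boxes have already been pulled out of TextExtractor into [x1, y1, x2, y2] arrays for a single horizontal line. Note that this measures the visible gap between boxes, which is not necessarily the same number as the character or word spacing stored in the PDF graphics state.

```ruby
# Average horizontal gap between consecutive boxes on one line.
# Each box is [x1, y1, x2, y2]; boxes are sorted left to right by x1 first.
# Returns nil when there are fewer than two boxes (no gap to measure).
def average_gap(boxes)
  sorted = boxes.sort_by { |box| box[0] }
  gaps = sorted.each_cons(2).map { |left, right| right[0] - left[2] }
  return nil if gaps.empty?
  gaps.sum.to_f / gaps.length
end

# Three word boxes on one line, with gaps of 5 and 6 units
words = [[10, 0, 40, 12], [45, 0, 80, 12], [86, 0, 100, 12]]
puts average_gap(words)   # prints 5.5
```

The same helper works for glyph boxes: pass the boxes of the glyphs within one word to estimate letter spacing.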