Getting Exact Text Height


arunsrinivaasrs

Jan 14, 2015, 9:55:38 PM
to pdfne...@googlegroups.com

gs.GetFontSize() gives me the font size. How can I get the exact height of the font's ascent?

font.getAscent() returns the height in glyph coordinates. How do I convert that to world coordinates (or express it relative to the font size)?

The font's bounding box is also given in glyph coordinates.

The bounding box of the text element cannot be used for the height because it is not a tight box.

Support

Jan 14, 2015, 11:09:29 PM

It sounds like you would like to use TextExtractor methods to get bbox for line/word/character:

https://www.pdftron.com/pdfnet/mobile/docs/Android/pdftron/PDF/TextExtractor.Line.html#getBBox()
https://www.pdftron.com/pdfnet/mobile/docs/Android/pdftron/PDF/TextExtractor.Word.html#getQuad()
https://www.pdftron.com/pdfnet/mobile/docs/Android/pdftron/PDF/TextExtractor.Word.html#getGlyphQuad(int)

Alternatively, if you are working with ElementReader, you could use element.GetBBox()
... or, for more low-level control, see



FYI, section 9.4.4, "Text Space Details" ( http://xodo.com/view/#/c0c11968-ee14-478e-9b09-6dc5635c0915 ), provides info on how to map all gstate and font parameters to a bbox in the user coordinate system.
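As a rough illustration of the mapping described in section 9.4.4, here is a minimal standalone C++ sketch. It does not use PDFNet: the Affine struct and the functions Concat, TextRenderingMatrix, and VerticalExtent are hypothetical names introduced only for this example, and it assumes the row-vector convention of the PDF spec and a glyph space of 1000 units per em (which does not hold for every font type, for example Type 3 fonts).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type: a 2D affine matrix laid out as in the PDF spec,
// [a b 0; c d 0; h v 1], applied to row vectors [x y 1].
struct Affine {
    double a, b, c, d, h, v;
};

// Returns the matrix that applies `first`, then `second` (first x second).
Affine Concat(const Affine& first, const Affine& second) {
    return {
        first.a * second.a + first.b * second.c,
        first.a * second.b + first.b * second.d,
        first.c * second.a + first.d * second.c,
        first.c * second.b + first.d * second.d,
        first.h * second.a + first.v * second.c + second.h,
        first.h * second.b + first.v * second.d + second.v
    };
}

// Text rendering matrix from section 9.4.4:
// Trm = [Tfs*Th 0 0; 0 Tfs 0; 0 Trise 1] x Tm x CTM
Affine TextRenderingMatrix(double tfs, double th, double trise,
                           const Affine& tm, const Affine& ctm) {
    Affine params{tfs * th, 0.0, 0.0, tfs, 0.0, trise};
    return Concat(Concat(params, tm), ctm);
}

// Length in user space of a vertical glyph-space distance such as the
// ascent. Returns a magnitude, so a negative descent gives a positive length.
double VerticalExtent(const Affine& trm, double glyph_units,
                      double units_per_em = 1000.0) {
    assert(units_per_em > 0.0);
    double y = glyph_units / units_per_em;  // glyph space -> text space
    // A vertical vector (0, y) maps to (y*c, y*d); translation is ignored.
    return std::hypot(y * trm.c, y * trm.d);
}
```

With an identity text matrix, a CTM that scales by 2, and a 12 pt font, the vertical scale of Trm is 24, so a glyph-space ascent of 718 would come out at 0.718 * 24 user-space units.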

arunsrinivaasrs

Jan 19, 2015, 9:15:35 PM
to pdfne...@googlegroups.com

Thanks for your reply. element.getBBox is not accurate. In any case, I don't need the entire height; I need the split between ascent and descent.

 Below is the code I wrote based on section 9.4.4 of the PDF spec. My idea is simply to derive an accurate font ascent value.

          1.)   // Font-parameter matrix from section 9.4.4 of the PDF spec
                pdftron::Common::Matrix2D Trm;
                double Tfs = gs.GetFontSize();
                double Th = gs.GetHorizontalScale() / 100;
                double Trise = gs.GetTextRise();
                Trm.m_a = Tfs * Th;
                Trm.m_b = 0;
                Trm.m_c = 0;
                Trm.m_d = Tfs;
                Trm.m_h = 0;
                Trm.m_v = Trise;

                // Combined matrix: font parameters, then text matrix, then CTM
                pdftron::Common::Matrix2D Trm_mtx = element.GetCTM() * element.GetTextMatrix() * Trm;

                // Font bounding box, scaled from 1/1000 glyph units to text space
                pdftron::PDF::Rect outBBoxText = font.GetBBox();
                double tX1 = outBBoxText.GetX1() / 1000;
                double tY1 = outBBoxText.GetY1() / 1000;
                double tX2 = outBBoxText.GetX2() / 1000;
                double tY2 = outBBoxText.GetY2() / 1000;

                // Transform both corners in place
                Trm_mtx.Mult(tX1,tY1);
                Trm_mtx.Mult(tX2,tY2);

                I expected tY2 - tY1 to be the height of the ascent, but this value is inaccurate and quite high.


             2.)  The other method I used to derive the ascent is

 // Effective vertical font size: length of the transformed (0, fontsize) vector
 double PdfRenderer::getFontHeight(double fontsize, const pdftron::Common::Matrix2D& mtx) {
    double x = mtx.m_c * fontsize;
    double y = mtx.m_d * fontsize;
    return std::sqrt(x*x + y*y);
}
                fontSize = getFontHeight(gs.GetFontSize() , element.GetCTM() * element.GetTextMatrix());
                outBBoxText = font.GetBBox();
                // std::fabs instead of abs, so the double is not truncated to an int
                double bBoxHeight = outBBoxText.y2 + std::fabs(outBBoxText.y1);
                double bBoxAscent = (outBBoxText.y2 / bBoxHeight) * fontSize;
                double bBoxDescent = (std::fabs(outBBoxText.y1) / bBoxHeight) * fontSize;

               In the above logic, I multiply the font size by a factor derived from the ascent:descent ratio. Even this ascent value is slightly inaccurate.
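One possible reason the ratio approach drifts is that it forces ascent plus descent to equal the font size, which a font's metrics need not satisfy. A hedged alternative, sketched below in standalone C++, scales each glyph-space metric by the effective font size directly. VerticalMetrics and ScaleMetrics are hypothetical names for this example only; the sketch assumes 1000 glyph units per em and that the inputs come from calls such as font.getAscent() and its descent counterpart, which may themselves be unreliable for some fonts.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical result type: both values are positive lengths in the same
// units as the effective font size.
struct VerticalMetrics {
    double ascent;
    double descent;
};

// Scales glyph-space ascent and descent independently, rather than
// splitting the font size by the ascent:descent ratio.
VerticalMetrics ScaleMetrics(double glyph_ascent, double glyph_descent,
                             double effective_font_size,
                             double units_per_em = 1000.0) {
    assert(units_per_em > 0.0);
    double scale = effective_font_size / units_per_em;
    // Descent is usually stored as a negative number; report its magnitude.
    return {glyph_ascent * scale, std::fabs(glyph_descent) * scale};
}
```

For example, with illustrative metrics of 718 and -207 and an effective font size of 10, this gives an ascent of 7.18 and a descent of 2.07, whereas the ratio method would give an ascent of about 7.76 for the same inputs.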

 I saw your post about taking the bounding box of each character. We render everything from e_text_begin to e_text_end as one single text flow, so even if we took the ascent character by character, we would still end up using the tallest bbox height.

Kindly share any other way to get an optimal ascent value for the text.
