How do I get the coordinates of objects in PDF and edit the page?


Support

Nov 28, 2011, 7:30:29 PM
to pdfne...@googlegroups.com
Q: I am wondering if PDFNet SDK provides a way to get the coordinates of the objects in a PDF file.

Actually, we plan to develop a function to insert a custom figure into a PDF file at a specific point. We tried PDFSharp, an open source library; however, it is not reliable and doesn't provide functions to retrieve the coordinates.

-----------

A: Using PDFNet SDK you can perform content extraction and editing on any PDF.
I am not exactly sure what type of information you are looking for, but as a starting point you may want to take a look at the following two projects:

ElementReader & ElementReaderAdv:
 http://www.pdftron.com/pdfnet/samplecode.html#ElementReaderAdv

Specifically, you would use element.GetBBox() to obtain the bounding box for any graphical element on the page.
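
As a rough illustration of what can be done with those bounding boxes once they have been collected, here is a sketch of checking whether a figure placed at a given point would collide with existing page content. It does not call PDFNet at all: the `Box` tuple and the `overlaps`, `figure_box`, and `is_free` helpers are hypothetical names made up for this example, and the numbers are invented. It assumes boxes are in PDF points (1/72 inch) with (x1, y1) as the lower-left corner, which is the usual default for PDF page coordinates.

```python
from collections import namedtuple

# Hypothetical stand-in for a bounding box; values are in PDF points,
# (x1, y1) is the lower-left corner and (x2, y2) the upper-right corner.
Box = namedtuple("Box", "x1 y1 x2 y2")


def overlaps(a, b):
    """Return True if boxes a and b share any interior area."""
    return a.x1 < b.x2 and b.x1 < a.x2 and a.y1 < b.y2 and b.y1 < a.y2


def figure_box(x, y, width, height):
    """Box for a figure whose lower-left corner is placed at (x, y)."""
    return Box(x, y, x + width, y + height)


def is_free(candidate, existing_boxes):
    """Return True if candidate does not overlap any existing box."""
    return not any(overlaps(candidate, box) for box in existing_boxes)


# Invented example data: two boxes, as might be gathered from a page.
page_boxes = [Box(72, 700, 300, 720), Box(72, 500, 540, 680)]
print(is_free(figure_box(72, 300, 200, 100), page_boxes))   # below both boxes
print(is_free(figure_box(250, 650, 100, 100), page_boxes))  # intersects both
```

In a real program the list of boxes would come from the bounding boxes reported by the SDK rather than from hard-coded values.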

In case you are looking for something a bit higher-level, please see the TextExtract sample project:
  http://www.pdftron.com/pdfnet/samplecode.html#TextExtract

With 'pdftron.PDF.TextExtractor' you can obtain the positioning information for any character, word, line, paragraph, etc. (e.g. word.GetBBox()).

In case you need to edit an existing PDF page, a good starting point would be to take a look at the ElementEdit sample:
  ElementEdit: http://www.pdftron.com/pdfnet/samplecode.html#ElementEdit

If you only need to stamp/watermark existing PDF pages, you could use the 'pdftron.PDF.Stamper' utility without using ElementBuilder/Writer:
  PDF Stamper: http://www.pdftron.com/pdfnet/samplecode.html#Stamper
 
 

Spencer Rathbun

Nov 29, 2011, 8:33:02 AM
to pdfne...@googlegroups.com
I had to do the same thing, and I was pleased to discover it is possible with PDFNet. There's only one minor problem: I am very rusty at matrix math, and PDFNet reports positions through transformation matrices rather than as ready-made page coordinates. Once I got over that hump, it was simple. I'm using the Python bindings, which may or may not apply in your case. This function checks whether any part of a text element is near the right edge of a page. I'm using a hard-coded number, but that could easily be changed for your use case.
 
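def edgeCheck(element):
    """Return True if no character of the text element is past the right edge."""
    # Assumes Point comes from the PDFNet Python bindings and that
    # element is a text element obtained from an ElementReader.
    right_edge = 550  # hard-coded x limit in page units; adjust as needed
    itr = element.GetCharIterator()
    # To get the absolute character positioning information concatenate current
    # text matrix with CTM and then multiply relative positioning coordinates with
    # the resulting matrix. Both matrices belong to the element, not to the
    # individual characters, so the product is computed once.
    mtx = element.GetCTM() * element.GetTextMatrix()
    while itr.HasNext():
        pt = Point()
        pt.x = itr.Current().x
        pt.y = itr.Current().y
        # Depending on the binding version, Mult may return the transformed
        # point rather than update pt in place; if so, use pt = mtx.Mult(pt).
        mtx.Mult(pt)
        if pt.x > right_edge:
            return False
        itr.Next()
    return True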
 
Hmm, the spacing is a little off, but that's the gist of it.
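
For anyone else who is rusty on the matrix part, here is the arithmetic behind "concatenate the text matrix with the CTM, then multiply the point", written out in plain Python with no PDFNet calls. This is only a sketch: `Affine`, `apply`, and `compose` are hypothetical names invented for the illustration, the example matrices are made up, and it assumes the product in the function above means "apply the text matrix first, then the CTM". A PDF matrix has six coefficients (a, b, c, d, h, v), and a point (x, y) maps to (a*x + c*y + h, b*x + d*y + v).

```python
from collections import namedtuple

# Hypothetical stand-in for a 2D affine matrix with the six PDF coefficients.
Affine = namedtuple("Affine", "a b c d h v")


def apply(m, x, y):
    """Transform the point (x, y) by matrix m."""
    return (m.a * x + m.c * y + m.h, m.b * x + m.d * y + m.v)


def compose(outer, inner):
    """Return the matrix equivalent to applying inner first, then outer."""
    return Affine(
        a=outer.a * inner.a + outer.c * inner.b,
        b=outer.b * inner.a + outer.d * inner.b,
        c=outer.a * inner.c + outer.c * inner.d,
        d=outer.b * inner.c + outer.d * inner.d,
        h=outer.a * inner.h + outer.c * inner.v + outer.h,
        v=outer.b * inner.h + outer.d * inner.v + outer.v,
    )


# Invented example: the text matrix moves the text origin to (100, 700),
# and the CTM scales everything on the page by one half.
text_mtx = Affine(1, 0, 0, 1, 100, 700)
ctm = Affine(0.5, 0, 0, 0.5, 0, 0)

# A character sitting 20 units to the right of the text origin.
combined = compose(ctm, text_mtx)
print(apply(combined, 20, 0))  # absolute position of that character
```

Applying the combined matrix once gives the same result as transforming the point by the text matrix and then by the CTM, which is why the function only needs a single multiplication per character.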

Spencer Rathbun
