How do I extract text from a given PDF layer using PDFNet SDK?

Jan 25, 2013, 7:21:01 PM
Our scenario is this:


· The input file is a layered PDF (normally one page, but it could be more).
· We need to check that a particular layer has live (not outlined) text on it.
· We know the name of the layer we are looking for will contain the word ‘artwork’.
· Therefore, we want to attempt to extract text only from this particular layer (if it is found).
· If the extracted text is empty, we fail the process; otherwise we continue.


Is there a recommended approach to this? My developers have been struggling a little with it, as there doesn’t appear to be a way to extract text from only one layer.

Yes, this is somewhat tricky. One approach that comes to mind is to extract the required layer into a temporary page and then use ‘pdftron.PDF.TextExtractor’ to get the text from that page.


To extract the layer, you can use the approach shown in the ElementEdit sample.


To copy the elements, you would initialize ElementReader with an OCG Context, similar to the way PDFDraw is set up in the PDFLayers sample:


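// Start from the document's default optional-content (layer) configuration.
Config init_cfg = doc.GetOCGConfig();
Context ctx = new Context(init_cfg);
// Switch on the layer (OCG) whose text you want to extract.
ctx.SetState(ocg, true);
// Read the page through this context so that visibility is evaluated against it.
reader.Begin(page, ctx);
// Inside the element loop from the ElementEdit sample:
if (element.IsOCVisible()) {
    // The element is visible in this context; copy it to the temp page here.
}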
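
Once the layer names have been read from the document and the text has been pulled from the temp page, the pass/fail rule described in the question can be kept separate from the PDF handling. The following is only a minimal sketch of that rule in plain Java. It makes no PDFNet calls, and the class and method names (`ArtworkLayerCheck`, `findArtworkLayer`, `hasLiveText`) are hypothetical helpers introduced here for illustration. It also assumes the match on ‘artwork’ should be case-insensitive and that whitespace-only output counts as empty; the question does not say either way.

```java
import java.util.List;
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper, not part of PDFNet: it only models the pass/fail rule.
final class ArtworkLayerCheck {

    // Returns the first layer name containing "artwork" (case-insensitive), if any.
    static Optional<String> findArtworkLayer(List<String> layerNames) {
        return layerNames.stream()
                .filter(name -> name != null
                        && name.toLowerCase(Locale.ROOT).contains("artwork"))
                .findFirst();
    }

    // Treats the layer as having live text when the extracted string
    // contains at least one non-whitespace character.
    static boolean hasLiveText(String extractedText) {
        return extractedText != null && !extractedText.trim().isEmpty();
    }
}
```

In use, you would pass the names of the document's layers to `findArtworkLayer`, run the extraction only when a match comes back, and fail the process when `hasLiveText` returns false for the extracted string. Outlined text is stored as vector paths rather than text, so a layer containing only outlines should come back empty.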