Dealing with parent resources when extracting content using ElementReader


Support

Mar 13, 2013, 5:59:35 PM
Q: 
 
I encountered an issue when extracting information from a PDF with Type 3 fonts using ElementReader.

 

Obj type3GlyphStream = font.GetType3GlyphStream(charData.char_code);
if (type3GlyphStream != null) {
   using (var reader = new ElementReader()) {
     reader.Begin(type3GlyphStream);
     Element el;
     while ((el = reader.Next()) != null) {
         // fails while iterating
     }
     reader.End();
   }
}
 
The full test code (C# and C++ versions) is attached.
 
-------------
A:
 

The problem is that your Type 3 content stream references resources (e.g. the R13 ExtGState) that are stored in the parent page's resource dictionary (i.e. not within the Type 3 font's own resource dictionary). To fix this, pass the page's resource dictionary as the second parameter to ElementReader.Begin(). For example:

 
...

reader.Begin(type3GlyphStream, parent_page.GetResourceDict());

 ...
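
To illustrate why the extra argument helps, here is a toy C++ model of the name lookup. It is not PDFNet code: ResourceDict and ResolveResource are hypothetical names made up for this sketch, and it simply assumes a name is looked up in the glyph's own resources first and then in the parent dictionary, if one was supplied. How PDFNet resolves names internally is not covered in this thread.

```cpp
#include <map>
#include <string>

// Toy stand-in for a PDF resource dictionary: resource name -> description.
// NOT a PDFNet type; it only models looking a name up.
typedef std::map<std::string, std::string> ResourceDict;

// Hypothetical helper: look up `name` in the glyph's own resources first,
// then fall back to the parent page's resources when they are supplied.
// Returns a pointer to the stored entry, or nullptr if the name is unresolved.
const std::string* ResolveResource(const std::string& name,
                                   const ResourceDict& glyph_resources,
                                   const ResourceDict* parent_resources = nullptr) {
    ResourceDict::const_iterator it = glyph_resources.find(name);
    if (it != glyph_resources.end()) {
        return &it->second;
    }
    if (parent_resources != nullptr) {
        it = parent_resources->find(name);
        if (it != parent_resources->end()) {
            return &it->second;
        }
    }
    return nullptr;  // e.g. R13 when no parent dictionary was passed
}
```

In this model, a glyph stream that uses R13 cannot resolve it when only its own dictionary is available. That corresponds to calling Begin() with one argument. Supplying the page dictionary makes the same lookup succeed.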

test.cpp
test.cs