Dealing with parent resources when extracting content using ElementReader


Tags: ElementReader, error, extract


Support


Mar 13, 2013, 5:59:35 PM

I encountered an issue when extracting information from a PDF with Type 3 fonts using ElementReader.

Obj type3GlyphStream = font.GetType3GlyphStream(charData.char_code);
if (type3GlyphStream != null) {
    using (var reader = new ElementReader()) {
        reader.Begin(type3GlyphStream);
        Element el;
        while ((el = reader.Next()) != null) {
            // fails while iterating
        }
        reader.End();
    }
}

The full test code (C# and C++ versions) is attached.

-------------

The problem is that your Type 3 content stream references resources (e.g. the R13 ExtGState) that are stored in the parent page's resource dictionary, not in the Type 3 font's own resource dictionary. To fix this, pass the page resource dictionary as the second parameter to ElementReader.Begin(). For example:

...

reader.Begin(type3GlyphStream, parent_page.GetResourceDict());

...
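To illustrate why the extra argument helps, here is a simplified model of resource-name resolution in plain C++. It is not PDFNet code: `ResourceDict`, `resolve_resource`, and `glyph_can_resolve_r13` are hypothetical names, and the sketch assumes the glyph's own resources are consulted first and the dictionary passed to Begin() second. The actual lookup order inside PDFNet is not stated in this thread.

```cpp
#include <initializer_list>
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a PDF resource dictionary:
// category ("ExtGState", "Font", ...) -> resource name -> object description.
using ResourceDict = std::map<std::string, std::map<std::string, std::string>>;

// Look up `name` under `category`, trying the stream's own resources first
// and then the fallback (parent page) resources if any were supplied.
std::optional<std::string> resolve_resource(const ResourceDict& own,
                                            const ResourceDict* fallback,
                                            const std::string& category,
                                            const std::string& name) {
    for (const ResourceDict* dict : {&own, fallback}) {
        if (dict == nullptr) continue;              // no fallback was passed
        auto cat = dict->find(category);
        if (cat == dict->end()) continue;           // category missing here
        auto entry = cat->second.find(name);
        if (entry != cat->second.end()) return entry->second;
    }
    return std::nullopt;                            // unresolved resource name
}

// Mirrors the situation in this thread: the glyph stream uses R13,
// which only the page-level dictionary defines.
bool glyph_can_resolve_r13(bool pass_page_resources) {
    ResourceDict glyph_resources;                   // Type 3 font: no ExtGState
    ResourceDict page_resources;
    page_resources["ExtGState"]["R13"] = "page-level graphics state";
    const ResourceDict* fallback = pass_page_resources ? &page_resources : nullptr;
    return resolve_resource(glyph_resources, fallback, "ExtGState", "R13").has_value();
}
```

In this model, `glyph_can_resolve_r13(false)` corresponds to calling Begin() with only the glyph stream, and `glyph_can_resolve_r13(true)` to also passing the page resource dictionary.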

test.cpp

test.cs
