How do I find the compression method used on a PDF image?


Support

Jul 25, 2013, 6:24:16 PM
to pdfne...@googlegroups.com
Q:
 
I’m parsing a PDF before converting it to SVG and need to determine the format of each embedded image (GIF/PNG/TIFF). I can’t figure out which property exposes the image name or type. Or are all images embedded in a PDF converted to PNG, even if they originated as JPG?
This is what I have so far. I would expect one of the two commented-out functions to exist, but I can’t find anything like them.
PDFDoc doc = new PDFDoc(fileName);
ElementReader reader = new ElementReader();
for (int j = 1; j <= doc.getPageCount(); j++) {
  Page page = doc.getPage(j);
  reader.begin(page);
  Element elem;
  // Walk every element on the page and report the images.
  while ((elem = reader.next()) != null) {
    if (elem.getType() == Element.e_inline_image
        || elem.getType() == Element.e_image) {
      //System.out.println("******** Image Type: " + elem.getImageType());
      //System.out.println("******** Image Filename: " + elem.getImageFileName());
      int totalResolution = elem.getImageHeight() * elem.getImageWidth();
      System.out.println("******** Image Resolution: " + totalResolution);
    }
  }
  reader.end();
}
doc.close();
-----------------
A:
 
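PDF does not store images as GIF, PNG, or TIFF files, and it does not keep the original file name or format. Instead, the image data is embedded as a stream encoded with one or more of the standard PDF filters: FlateDecode (zlib), DCTDecode (JPEG), JPXDecode (JPEG 2000), etc. An image that originated as a JPG is typically stored with DCTDecode rather than being converted to PNG. You can refer to the PDF specification for more information: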

http://partners.adobe.com/public/developer/en/pdf/PDFReference.pdf - Table 3.5 - Standard Filters

To determine the filter type of a specific image, you can do something like this:
 
Image img(elem.GetImageXObject());
BaseCompressionType type = GetBaseCompressionType(img);  // helper defined below
....
 
 
enum BaseCompressionType {
 e_base_jbig2,
 e_base_ccitt,
 e_base_jpeg,
 e_base_jp2,
 e_base_flate,
 e_base_lzw,
 e_base_run_length,
 e_base_other
};

// Maps a PDF filter name to the corresponding compression type.
BaseCompressionType GetCompressionTypeFromName(string name)
{
 if (name == "JPXDecode")   return e_base_jp2;
 else if (name == "DCTDecode") return e_base_jpeg;
 else if (name == "JBIG2Decode") return e_base_jbig2;
 else if (name == "CCITTFaxDecode") return e_base_ccitt;
 else if (name == "FlateDecode") return e_base_flate;
 else if (name == "LZWDecode") return e_base_lzw;
 else if (name == "RunLengthDecode") return e_base_run_length;
 else return e_base_other;
}
 
BaseCompressionType GetBaseCompressionType(Image img)
{
 Obj xobject = img.GetSDFObj();
 BaseCompressionType ret = e_base_other;
 if (!xobject) return ret;
 
 // The Filter entry is either a single name or an array of names (a filter chain).
 Obj f = xobject.FindObj("Filter");
 if (!f) return ret;
 if (f.IsName())
  return GetCompressionTypeFromName(f.GetName());
 else if (f.IsArray())
 {
  int sz = int(f.Size());
  for (int i=0; i<sz; ++i) {
   if (f.GetAt(i).IsName())  {
    BaseCompressionType t = GetCompressionTypeFromName(f.GetAt(i).GetName());
    if (t != e_base_other)  {
     ret = t;
     // An image-specific filter identifies the image format, so stop at the first one found.
     if (ret == e_base_jpeg || ret == e_base_jp2 || ret == e_base_jbig2 || ret == e_base_ccitt)
      return ret;  
    }
   }
  }
 }
 return ret;
}
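
The selection logic above can also be tried without PDFNet. The following is only a minimal standalone sketch, assuming the filter chain has already been read into a list of strings. The names Compression, FromFilterName, IsImageSpecific, and BaseCompression are hypothetical and not part of the PDFNet API.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the BaseCompressionType enum shown above.
enum class Compression { Jbig2, Ccitt, Jpeg, Jp2, Flate, Lzw, RunLength, Other };

// Map a PDF filter name (without the leading slash) to a compression type.
Compression FromFilterName(const std::string& name) {
    static const std::unordered_map<std::string, Compression> table = {
        {"JBIG2Decode", Compression::Jbig2},
        {"CCITTFaxDecode", Compression::Ccitt},
        {"DCTDecode", Compression::Jpeg},
        {"JPXDecode", Compression::Jp2},
        {"FlateDecode", Compression::Flate},
        {"LZWDecode", Compression::Lzw},
        {"RunLengthDecode", Compression::RunLength},
    };
    auto it = table.find(name);
    return it == table.end() ? Compression::Other : it->second;
}

// Filters that only make sense for image data.
bool IsImageSpecific(Compression c) {
    return c == Compression::Jpeg || c == Compression::Jp2 ||
           c == Compression::Jbig2 || c == Compression::Ccitt;
}

// Walk a filter chain: return the first image-specific filter if there is one,
// otherwise the last recognized general-purpose filter, otherwise Other.
Compression BaseCompression(const std::vector<std::string>& filters) {
    Compression result = Compression::Other;
    for (const std::string& name : filters) {
        Compression c = FromFilterName(name);
        if (c == Compression::Other) continue;  // skip unrecognized filters such as ASCII85Decode
        result = c;
        if (IsImageSpecific(c)) return c;
    }
    return result;
}
```

For example, BaseCompression({"ASCII85Decode", "DCTDecode"}) returns Compression::Jpeg, and an empty chain returns Compression::Other.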
 
 
When exporting images from PDF, the pdftron.PDF.Image.Export() method will choose the appropriate format based on the embedded image's format. Alternatively, you can use Image.ExportAsTiff or Image.ExportAsPng to enforce a given format.
 
 
 