How do I extract images from PDF with the alpha channel?

Support

Oct 29, 2014, 2:46:29 AM
to pdfne...@googlegroups.com
Q:

How do I extract images from PDF with the alpha channel?

-------------
A:

Images in PDF do not have an explicit alpha channel. Instead, a soft mask or an image mask may be associated with the base image, and the mask may or may not have the same dimensions as the base image. For more information, please see Section 8.9.5 in the PDF Reference: http://xodo.com/view/#/c0c11968-ee14-478e-9b09-6dc5635c0915. You can learn more about soft masks (SMask) in the PDFNet KB: https://groups.google.com/forum/#!searchin/pdfnet-sdk/SMask


If you want to extract an image with an alpha channel, you may need to extract both the base image and the soft mask (a single-channel gray image) and then merge them into one image. To get the image data you can use image.GetBitmap(). This works in most cases, but you will need to make some assumptions if the image dimensions do not match (e.g. upscale the smaller image to the size of the larger one, then merge the channels).
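
As a rough illustration of the merge step only, here is a minimal Python sketch. It does not call PDFNet at all: it assumes you have already decoded the base image into an interleaved 8-bit RGB buffer and the soft mask into an 8-bit gray buffer, and that you know both sizes. The function names (resize_nearest, merge_rgb_and_mask) are hypothetical helpers made up for this example, and nearest-neighbor scaling is just one possible choice for handling mismatched dimensions. The sketch also assumes the mask value is used directly as alpha (255 = opaque) and ignores any Decode or Matte entries on the mask.

```python
def resize_nearest(pixels, width, height, channels, new_width, new_height):
    """Nearest-neighbor resample of interleaved 8-bit pixel data (hypothetical helper)."""
    out = bytearray(new_width * new_height * channels)
    for y in range(new_height):
        src_y = y * height // new_height
        for x in range(new_width):
            src_x = x * width // new_width
            src = (src_y * width + src_x) * channels
            dst = (y * new_width + x) * channels
            out[dst:dst + channels] = pixels[src:src + channels]
    return bytes(out)


def merge_rgb_and_mask(rgb, rgb_size, mask, mask_size):
    """Combine an RGB buffer and a gray soft-mask buffer into one RGBA buffer.

    rgb_size and mask_size are (width, height) tuples. If the sizes differ,
    both buffers are scaled to the larger width and height before merging.
    Returns (rgba_bytes, (width, height)).
    """
    width = max(rgb_size[0], mask_size[0])
    height = max(rgb_size[1], mask_size[1])
    rgb = resize_nearest(rgb, rgb_size[0], rgb_size[1], 3, width, height)
    mask = resize_nearest(mask, mask_size[0], mask_size[1], 1, width, height)

    rgba = bytearray(width * height * 4)
    for i in range(width * height):
        rgba[4 * i:4 * i + 3] = rgb[3 * i:3 * i + 3]  # copy R, G, B
        rgba[4 * i + 3] = mask[i]                     # mask sample becomes alpha
    return bytes(rgba), (width, height)
```

For example, merge_rgb_and_mask(base_rgb, (w1, h1), mask_gray, (w2, h2)) returns an RGBA buffer that you can hand to whatever image library you use for writing PNG or another format that supports alpha.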

If you want to rasterize a PDF page with a transparent background (rather than a solid white paper background), please see https://groups.google.com/d/msg/pdfnet-sdk/GFqayLaJdSU/J_0gqGvcAmAJ
