iOS: how to extract PDF image into UIImage?

Kirill Povarintsev

Dec 29, 2016, 2:26:11 PM
to PDFTron PDFNet SDK
I'm trying to implement PDF image downscaling/resampling (pre-print optimization) on iOS. Based on information found here, I wrote code that iterates through the document objects and swaps every image with a replacement (I use the same hard-coded image, just to prove the concept). Obviously, I'd like to swap them with actual resampled images.

I found a method on PTImage, exportAsPngStream, which takes a PTFilterWriter as a parameter. Unfortunately, I couldn't find anything like PTMemoryFilter in PDFNet iOS. Is it just missing? Is there any other in-memory output filter implementation? Is there another way to extract the image data? Or is there a way to downscale/resample images natively in PDFNet iOS, rather than pulling the image into UIKit, manipulating it there, and pushing the result back?

I'd appreciate any pointers. Thanks

Aaron

Dec 29, 2016, 4:21:56 PM
to PDFTron PDFNet SDK
Rather than swapping the images out manually, it might make more sense to use PDFNet's Optimizer Add-on. The Optimizer API lets you downsample/recompress all images in a document with a single API call. Please see https://www.pdftron.com/pdfnet/addons.html#Optimizer and https://www.pdftron.com/pdfnet/samplecode/OptimizerTest.m.html for details.

Would that approach work for you?

Kirill Povarintsev

Dec 30, 2016, 12:28:21 PM
to PDFTron PDFNet SDK
Thanks Aaron,

I stumbled across PTOptimizer just today, but unfortunately our license doesn't include the PDF Optimizer permission, so I couldn't try it out.

There are a few other concerns about using it, though:
1) We don't necessarily want to downscale all the images, only the heavy ones.
2) We don't necessarily want to optimize the whole document, only the pages selected for printing.
3) I'm not sure that dropping the DPI to the same value across the whole document is flexible enough (more on that later).

So #1 can't be achieved with PTOptimizer, though that's not necessarily a problem.

For #2 we could of course assemble a proxy document consisting of the printable pages only and apply the optimization to it. By the way, with the manual approach, is it possible to do the swapping on only some of the pages?

#3 is a bit trickier though.

Our application deals with multiple documents of various sizes and resolutions. The viewing experience is provided without any downscaling or resampling. It was originally implemented using Quartz, but Quartz couldn't keep up with heavy documents (even with tiled rendering), so we're in the process of reimplementing it using PDFNet, a well-proven approach already used in our apps for Windows Store and Android.

Printing, however, is a separate story. As far as I know, PDFNet iOS provides no printing support. There are two approaches to printing with UIPrintInteractionController: passing print data or passing a rendering delegate. Some of the documents we deal with kill UIPrintInteractionController right away when handed to it as a data stream, for example a page with an embedded 8000x8000 image. The same happens when such a document is drawn into a CG context using the rendering delegate approach.

Using bitmaps rendered by PDFNet is not an option, as it dramatically increases the print job size. Combining everything into an optimized PDF and passing it as print data is not desirable either, because the print output contains a lot of extra content sitting on top of the PDF and around it, and we would like to avoid rewriting that content generation from simple CG drawing into PDF annotation. The idea is to pre-process the PDF with PDFNet to reduce image sizes and then print it using native Quartz rendering, which will be lean and fast for text and geometry and only slow down on images.

That was a bit of background. The task now is to handle that document with the 8000x8000 image. I'm not sure what its DPI is, but it's not necessarily very high: the page itself is 111x111 inches (according to Acrobat). When using Quartz to print a PDF you're pretty isolated from physical units. The iOS printing engine provides you with a CGContext and an output rectangle (I'm not even sure what the units are), you set up a mapping between the output rectangle and the PDF crop box on that CGContext, and everything else happens magically.

If I do this with the original page, the app runs out of memory and crashes. If I swap the 8000x8000 image for one downscaled to 1000x1000 (I have no idea of the resulting DPI), it doesn't, even though the crop box is still huge. We don't have to produce an optimized PDF as such; we only need to produce something that Quartz can print without running out of memory. So the initial idea was to care not about DPI but about the pixel size of the image: cap it to some value that is sensible from a memory consumption point of view. What do you think, am I on the wrong track?
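The pixel-cap idea can be put into numbers with a rough sketch like the one below. The helper names are hypothetical, the 4 bytes per pixel figure assumes a decoded 8-bit RGBA bitmap, and the DPI estimate assumes the image spans the full 111-inch page width, which the document itself may not guarantee.

```swift
import Foundation

/// Scales a pixel size down so its longer side is at most `maxSide`,
/// preserving the aspect ratio. Sizes already within the cap are returned unchanged.
func cappedPixelSize(width: Int, height: Int, maxSide: Int) -> (width: Int, height: Int) {
    let longest = max(width, height)
    guard longest > maxSide else { return (width, height) }
    let scale = Double(maxSide) / Double(longest)
    let newWidth = max(1, Int((Double(width) * scale).rounded()))
    let newHeight = max(1, Int((Double(height) * scale).rounded()))
    return (newWidth, newHeight)
}

/// Approximate size of the decoded bitmap in bytes.
/// Assumption: 4 bytes per pixel (8-bit RGBA), no row padding.
func decodedByteCount(width: Int, height: Int, bytesPerPixel: Int = 4) -> Int {
    return width * height * bytesPerPixel
}

/// Effective resolution when `pixels` are spread over `inches` of page.
func effectiveDPI(pixels: Int, inches: Double) -> Double {
    return Double(pixels) / inches
}

let original = (width: 8000, height: 8000)
let capped = cappedPixelSize(width: original.width, height: original.height, maxSide: 1000)

// 8000 x 8000 x 4 = 256,000,000 bytes decoded; 1000 x 1000 x 4 = 4,000,000 bytes.
print(decodedByteCount(width: original.width, height: original.height))
print(decodedByteCount(width: capped.width, height: capped.height))

// Roughly 72 DPI before the cap if the image covers the whole 111-inch side.
print(effectiveDPI(pixels: original.width, inches: 111))
```

Under these assumptions the original image is only about 72 DPI, so the crash looks like a matter of raw pixel count rather than resolution, which is consistent with capping pixel dimensions instead of targeting a DPI value.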

P.S. I've proved the original concept by passing the original PTImage into UIImage through a temporary file, but it's less than ideal. I'd be keen to do it in-memory.

P.P.S. Then I found the PTImage2RGBA filter, which seems to do the in-memory trick, but because it reads the whole image content at once and the result is then loaded into a CGImage, the app runs out of memory. It would be ideal to have a PTImage.createWithFilter overload that does resizing. I really don't want to implement downscaling with sequential reading myself...
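For the temporary-file route from the P.S., one possible way to avoid writing the sequential downscaling by hand is to let Apple's ImageIO framework produce a size-capped thumbnail directly from the exported file. This is a minimal sketch, not a PDFNet feature: the function name `downsampledImage` is hypothetical, and it assumes the image has already been exported to a file in a format ImageIO can read (such as PNG). ImageIO is generally able to downsample while decoding, but actual peak memory for very large images should be measured on device.

```swift
import Foundation
import ImageIO
import CoreGraphics

/// Creates a downscaled CGImage from an image file whose longer side
/// is at most `maxPixelSize`. Returns nil if the file cannot be read or decoded.
func downsampledImage(at url: URL, maxPixelSize: Int) -> CGImage? {
    // Ask ImageIO not to cache a fully decoded copy of the source image.
    let sourceOptions = [kCGImageSourceShouldCache: false] as CFDictionary
    guard let source = CGImageSourceCreateWithURL(url as CFURL, sourceOptions) else {
        return nil
    }

    let thumbnailOptions = [
        // Always build the thumbnail from the full image data,
        // even if the file embeds its own thumbnail.
        kCGImageSourceCreateThumbnailFromImageAlways: true,
        // Decode right away so the cost is paid here, not at draw time.
        kCGImageSourceShouldCacheImmediately: true,
        // Respect the orientation stored in the file, if any.
        kCGImageSourceCreateThumbnailWithTransform: true,
        // Cap on the longer side, in pixels.
        kCGImageSourceThumbnailMaxPixelSize: maxPixelSize
    ] as CFDictionary

    return CGImageSourceCreateThumbnailAtIndex(source, 0, thumbnailOptions)
}
```

The resulting CGImage can be wrapped with `UIImage(cgImage:)` and then pushed back into the document in place of the original image.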