Converting multipage TIFF to PDF with modified compression


Stephane CIVERA

May 27, 2015, 6:12:27 AM
to pdfhummus-in...@googlegroups.com
Hi,

I'm trying to convert multipage TIFF files to PDF. So far I have done the easy part: copying each page with CCITTFaxDecode compression.
Now I want to add FlateDecode to certain pages to reduce the overall PDF file size.

Is this possible with the PDFHummus C++ library?

Thank you,
Stéphane.

Gal Kahana

May 28, 2015, 2:27:59 AM
to pdfhummus-in...@googlegroups.com, stephan...@gmail.com
Sure, PDFHummus can help you here.

The library contains TIFF support, including multipage TIFFs.

See https://github.com/galkahana/PDF-Writer/wiki/Images-Support for an explanation and an example.
PDFHummus will take care of Flate-encoding everything appropriately.

Gal.

Stephane CIVERA

May 28, 2015, 3:16:40 AM
to pdfhummus-in...@googlegroups.com, stephan...@gmail.com
Thanks for the hint. My current code is roughly like the "TiffSpecialsTest" one, with "TIFFNumberOfDirectories" added to determine the page count and a stream used in lieu of a file path.
So I end up with a PDF that uses CCITTFaxDecode for each page. So far, so good.

Now I would like to compress certain TIFF pages with both the CCITTFaxDecode and FlateDecode filters.
I didn't find any example of combining two filters or compressing an already compressed image. Can you give me some directions?

Thanks in advance,
Stéphane.

Stephane CIVERA

May 29, 2015, 4:09:25 AM
to pdfhummus-in...@googlegroups.com
I can't find any solution. Maybe with an 'ObjectsContext' extender...

Gal Kahana

May 29, 2015, 9:10:53 AM
to pdfhummus-in...@googlegroups.com, stephan...@gmail.com
Oh, I see. I totally misunderstood what you were looking for.
This will be a bit of a problem.

Normally, if you want to affect the library's stream writing, you create an IObjectContextExtender and set it on the objects context. This is explained here: https://github.com/galkahana/PDF-Writer/wiki/The-ObjectsContext-Object#extensibility
In short, you can choose, case by case, to provide your own stream implementation to replace the default Flate compression.
Any stream that requires compression will then consult the extender on whether to use the default Flate compression or a custom one that you provide.
At least, that is the case when compression is required. You can also set a flag on the PDFWriter to avoid compression completely.

However, the TIFFImageHandler, as is, determines its own compression, and so specifically asks the objects context to provide unfiltered streams. You can see the relevant lines here and here. It then writes the output in whatever compression best fits the TIFF data, in our case CCITT. Unfortunately, unfiltered stream writing completely ignores the extension mechanism, so you can't override it to provide the extra Flate encoding that you want. (By the way, Flate encoding is simple: just return an OutputFlateEncodeStream instance when asked to provide a stream.)

There are a couple of ways around it to implement what you want, and they require a change in the library code.

1. You can change the implementation of PDFStream to also consult the compression query for unfiltered stream requests. Then implement your extender and, when writing a TIFF image, add your compression.
2. You can change the implementation of TIFFImageHandler to optionally take the default streams in place of unfiltered streams. This would return the default Flate stream, so you'll get what you want (or ask an external party to provide a stream, and have it send the Flate stream). You will want to implement something that asks you every time which option to take since, as you said, you want it only on some pages, not all. The best route here is to add methods to IObjectContextExtender specifically for TIFF writing, and consult them from the TIFFImageHandler.
3. You could write your own TIFF image handler.

I think the best option is (2). Extend IObjectContextExtender with a dedicated TIFF stream request: if it returns null, use the default stream; if it returns something, use that instead. Then call this method from the TIFFImageHandler (at the two lines I referred to). Implement your own extender with only this method (use the ObjectsContextExtenderAdapter to provide default implementations for the rest, and be nice and add your own default for your new method). In your method, decide when to return null or an OutputFlateEncodeStream, and you're done.
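
The shape of option 2 can be sketched without the library. Everything below is a hypothetical stand-in: `ByteWriter`, `TiffStreamExtender`, `GetTiffStream` and `OpenImageStream` are illustrative names, not PDFHummus API, and the real change would live in IObjectContextExtender and the TIFF image handler. The sketch only shows the "return null for the default, return a stream to override" contract and a per-page decision.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-ins for the library's output streams (not PDFHummus API).
struct ByteWriter {
    virtual ~ByteWriter() = default;
    virtual std::string Name() const = 0;
};
struct PlainWriter : ByteWriter {   // models the unfiltered stream
    std::string Name() const override { return "unfiltered"; }
};
struct FlateWriter : ByteWriter {   // models a Flate-encoding stream
    std::string Name() const override { return "flate"; }
};

// Hypothetical extender with a TIFF-specific hook, as proposed in option 2.
struct TiffStreamExtender {
    virtual ~TiffStreamExtender() = default;
    // Default implementation: return nullptr, meaning "keep the default stream".
    virtual std::unique_ptr<ByteWriter> GetTiffStream(int /*pageIndex*/) {
        return nullptr;
    }
};

// Extender that asks for Flate only on a chosen set of pages.
struct FlateOnSelectedPages : TiffStreamExtender {
    std::set<int> pages;
    explicit FlateOnSelectedPages(std::set<int> p) : pages(std::move(p)) {}
    std::unique_ptr<ByteWriter> GetTiffStream(int pageIndex) override {
        if (pages.count(pageIndex) != 0) return std::make_unique<FlateWriter>();
        return nullptr;
    }
};

// Models what a modified image handler would do where it opens the image
// stream: consult the extender first, fall back to the unfiltered stream.
std::unique_ptr<ByteWriter> OpenImageStream(TiffStreamExtender* ext, int pageIndex) {
    if (ext != nullptr) {
        if (auto custom = ext->GetTiffStream(pageIndex)) return custom;
    }
    return std::make_unique<PlainWriter>();
}
```

With an extender built for pages 1 and 3, `OpenImageStream(&ext, 1)` yields the Flate writer and `OpenImageStream(&ext, 2)` the unfiltered one. A handler that takes the Flate stream also has to list FlateDecode in the image dictionary.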

hope this works.
If you need more help with this, let me know.

Gal.

Stephane CIVERA

May 29, 2015, 9:31:13 AM
to pdfhummus-in...@googlegroups.com, stephan...@gmail.com
OK, thanks for your guidance.
I'll try to duplicate TIFFImageHandler and modify its behavior to comply with solution 2.
If I succeed, I'll post my work and discuss with you whether it seems correct.

Stéphane.

Stephane CIVERA

May 29, 2015, 9:47:29 AM
to pdfhummus-in...@googlegroups.com, stephan...@gmail.com
The most difficult part for me is how to generate the correct PDF structure, using mostly your library functions, going from:
/Type/XObject
/Subtype/Image
/Length 35327
/Filter/CCITTFaxDecode
/DecodeParms
<<
/K -1
/EndOfBlock false
/Columns 1652
/Rows 2338
>>
/Width 1652
/Height 2338
/BitsPerComponent 1
/ColorSpace/DeviceGray
/Interpolate true
>>
stream

to

/Type/XObject
/Subtype/Image
/Length 7901
/Filter[/FlateDecode/CCITTFaxDecode]
/DecodeParms[
<<
>>
<<
/K -1
/EndOfBlock false
/Columns 1652
/Rows 2338
>>
]
/Width 1652
/Height 2338
/BitsPerComponent 1
/ColorSpace/DeviceGray
/Interpolate true
>>
stream


That's puzzling me... :-(

Stéphane.

Gal Kahana

May 29, 2015, 10:27:09 AM
to pdfhummus-in...@googlegroups.com, stephan...@gmail.com
Yeah, you need some point of entry to add your own items to the filter and decode params.
The filter and decode params are written here:
void TIFFImageHandler::WriteImageXObjectFilter(DictionaryContext* inImageDictionary,int inTileIndex)

If there's another filter, you will need to write an array of filters and an array of decode params.

Use objectcontext->StartArray and EndArray to start and end arrays.

Use theDictionary = mObjectsContext->StartDictionary() to start a dictionary and mObjectsContext->EndDictionary(/*theDictionary*/) to finish it.
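
To make the target layout concrete, here is a minimal sketch that builds the two entries as plain text. It does not use the library: `EncodeStep` and `WriteFilterEntries` are hypothetical names, and in the handler itself the same structure would go through the StartArray/EndArray and StartDictionary/EndDictionary calls mentioned above. One detail to watch: a PDF reader applies the filters in the order they are listed, so data that was CCITT-encoded first and Flate-encoded second must list FlateDecode first.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// One encoding step (hypothetical helper type): the PDF filter name and its
// already-serialized decode parameters ("" stands for an empty dictionary).
struct EncodeStep {
    std::string filterName;
    std::string decodeParms;
};

// Serializes /Filter and /DecodeParms for steps given in the order they were
// applied while ENCODING. Decoding undoes them in the opposite order, so the
// arrays list the last-applied filter first.
std::string WriteFilterEntries(const std::vector<EncodeStep>& steps) {
    std::ostringstream out;
    if (steps.empty()) return out.str();   // no filter: write nothing

    if (steps.size() == 1) {
        // Single filter: a bare name and a single dictionary.
        out << "/Filter/" << steps[0].filterName << "\n"
            << "/DecodeParms\n<<\n" << steps[0].decodeParms << ">>\n";
        return out.str();
    }

    // Several filters: an array of names ...
    out << "/Filter[";
    for (auto it = steps.rbegin(); it != steps.rend(); ++it)
        out << "/" << it->filterName;
    out << "]\n";

    // ... and a parallel array of dictionaries, one per filter, same order.
    out << "/DecodeParms[\n";
    for (auto it = steps.rbegin(); it != steps.rend(); ++it)
        out << "<<\n" << it->decodeParms << ">>\n";
    out << "]\n";
    return out.str();
}
```

Passing the CCITT step followed by a Flate step with empty parameters gives `/Filter[/FlateDecode/CCITTFaxDecode]` and a two-element `/DecodeParms` array whose first dictionary is empty, matching the second dictionary in the question.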

is this good?

Gal.

Stephane CIVERA

May 29, 2015, 10:32:23 AM
to pdfhummus-in...@googlegroups.com

Thank you very much for your quick support. I'll try that on Monday.

Stéphane.
