I just picked up Ruby for the first time this week so I could modify some PDF files via Origami.
To work out how to do what I need, I started from the code in pdfdecompress and pared it back to the simplest code that would do something, then ran it on one of the sample PDF files I need to modify. The first file I rewrote produced errors when I opened it in evince.
It turned out that reading and rewriting the stream objects reversed the order of the arguments to the Tf operator, which takes the font name first and the size second - for example, while the input stream contained:
/F1 1 Tf
The output stream contained:
1 /F1 Tf
I tracked it down to
http://code.google.com/p/origami-pdf/source/browse/lib/origami/graphics/instruction.rb#82
I am not sure if that is the correct fix, but it removed the problem for me.
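To illustrate the behaviour I expected, here is a minimal sketch. It is not Origami's actual code; `format_instruction` is a made-up helper that assumes the operands are kept in the order they were read and are written out before the operator:

```ruby
# Hypothetical helper, not part of Origami: serialise one content-stream
# instruction with its operands in the order they were read, followed by
# the operator.
def format_instruction(operator, operands)
  (operands.map(&:to_s) + [operator]).join(' ')
end

puts format_instruction('Tf', ['/F1', 1])   # prints "/F1 1 Tf"
```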
Then I noticed another problem in the stream parsing and rewriting. It appears that numbers are formatted with the default .to_s for floats, which seems to behave like %g (I know very little Ruby - only started learning it this week). As a result, small numbers like 0.000001 are written as 1.0e-06 in the output content stream. Evince does not accept numbers in that format.
The problem with %f is that by default it always prints six digits after the decimal point, so most numbers end up with trailing zeroes (1.5 becomes 1.500000). The posted patch removes the extraneous trailing zeroes.
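The idea can be sketched roughly as follows. This is not the posted patch itself; `pdf_number` is a hypothetical helper name, and keeping six decimal places is an assumption of the sketch:

```ruby
# Hypothetical helper, not the posted patch: format a number for a PDF
# content stream without exponent notation or padding zeroes.
def pdf_number(value)
  return value.to_s if value.is_a?(Integer)

  text = format('%f', value)     # fixed-point with six decimals, e.g. "1.500000"
  text = text.sub(/0+\z/, '')    # drop trailing zeroes: "1.500000" -> "1.5"
  text.sub(/\.\z/, '')           # drop a dangling decimal point: "1." -> "1"
end

puts pdf_number(0.000001)  # prints "0.000001"
puts pdf_number(1.5)       # prints "1.5"
puts pdf_number(1.0)       # prints "1"
puts pdf_number(12)        # prints "12"
```

Note that with plain %f anything smaller than about 0.0000005 rounds to 0, so a real fix might want more decimal places.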
Again, I am not sure if this is the correct fix, but It Works For Me (TM).
- Dave