Context :: hummusJS - Adding a single graphic to an existing pdf


Stephen Feather

Jun 14, 2013, 12:15:46 AM
to pdfhummus-in...@googlegroups.com
In my original comments on the blog article referencing the release, i mentioned merging two pdfs together.
Our original plan was to create a one-page pdf containing our generated graphic (a QR code) and then merge (append) that pdf to the original as an extra page.

After 2 days of looking through the library, it would seem better to skip the extra step of creating a page and to add the QR code directly to a copy of the original document.

We have a node server that generates the graphic based upon the user's input. It was a PNG, but since hummus doesn't seem to like those, it's now a jpg image.
Using hummus, the generated image has been successfully embedded into a new pdf. - +1!

The pdf we are adding the image to may be 1-3 pages, but we will always add the graphic on the last page.
We will know the page count beforehand.

I did see the thread here regarding a similar question, and the concerns with the parse possibly losing embedded objects.

With that in mind, your advice on the direction we should head?

Thank you for your time, and for hummusjs


sf




Gal Kahana

Jun 14, 2013, 3:32:56 AM
to pdfhummus-in...@googlegroups.com
sounds like you may be looking at a modification scenario. since that post with paul, i did implement modification, so you should be able to add the graphics by modifying the original document, adding another content stream to the original page.
This needs some "manual" work, as i don't currently have support for adding content to an existing modified page, but you could do that yourself with some PDF knowhow. there is some trickery if you want to recreate the page with new content added. i can provide tips and some sample code if you need it.

here is what i suggest:

1. open the original file for modification by calling hummus.createWriterToModify. read here for some details on how: https://github.com/galkahana/HummusJS/wiki/Modification
2. now we'll need to modify the original page object, and add more content to it. to do this:
            2.1. create a form out of the jpg image (https://github.com/galkahana/HummusJS/wiki/Show-images#jpg)
            2.2. create a new object (read about it here https://github.com/galkahana/HummusJS/wiki/Extensibility#basic-objects-creation-with-objects-context), and in it create a PDF stream. it will be a new content stream. you can read about creating pdf streams here (https://github.com/galkahana/HummusJS/wiki/Extensibility#pdf-streams). grab the pdfwriter's objects context, and call startPDFStream.
            2.3. you will need to write the drawing commands yourself, as i have the easy content context only for new pages and forms, but it's just this:
                                q
                                /mySpecialImage Do
                                Q
            2.4. end the stream with endPDFStream. also end the object. now we have a content stream calling the image. good. need to add that to the page.
            2.5. there is an example here (https://github.com/galkahana/HummusJS/blob/master/tests/ModifyingExistingFileContent.js) on how to change the page size for a page. we'll do something similar, but change the "Contents" entry instead.
            2.6. do the same, but look for "Contents". then, instead of the original, write your own "Contents" entry. write the key, and then start an array (objCxt.startArray). now if the original contents is an array, copy the element ids into the new array; otherwise (it's a stream) take its id and place it in the array. add another id, which is the id of the object we created for the stream. now we have the new content in.
            2.7. almost ready. one last thing is to add the "mySpecialImage" name with the form ID to the page resources. this means that apart from "Contents" you should also modify the resources dictionary. it's just a matter of checking also for "Resources", and then copying the dictionary as-is except for its XObject entry, which you should copy too, but with one more entry for the necessary name->id mapping.
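
The bookkeeping in 2.6 and 2.7 can be sketched in plain JavaScript, independent of the hummus API. This is only an illustration under assumptions: the page entries are modeled as plain objects, object ids are plain numbers, and `withExtraContent` and `withExtraXObject` are hypothetical helper names, not library calls.

```javascript
// Hypothetical model: a page's Contents is either one object id (a single
// stream) or an array of ids; Resources.XObject maps names to object ids.
function withExtraContent(contents, newStreamId) {
  var ids;
  if (contents === undefined || contents === null) {
    ids = [];                 // page had no content at all
  } else if (Array.isArray(contents)) {
    ids = contents.slice();   // copy the existing stream ids in order
  } else {
    ids = [contents];         // single stream: wrap its id in an array
  }
  ids.push(newStreamId);      // our stream goes last, so it draws on top
  return ids;
}

function withExtraXObject(resources, name, formId) {
  // Copy every resource entry as-is...
  var copy = {};
  Object.keys(resources).forEach(function (key) {
    copy[key] = resources[key];
  });
  // ...except XObject, which is copied and gets one more name->id mapping.
  var xobjects = {};
  Object.keys(resources.XObject || {}).forEach(function (key) {
    xobjects[key] = resources.XObject[key];
  });
  xobjects[name] = formId;
  copy.XObject = xobjects;
  return copy;
}
```

Neither helper mutates its input, which mirrors the goal of leaving the original page objects untouched and writing new entries instead.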

It's a bit complicated, because we want to leave the original page untouched with respect to any other linked objects, so we have to make these "delicate" modifications. i hope it's clear. if not, let me know and i can help with some sample code.

Cheers,
Gal.

Stephen Feather

Jun 14, 2013, 10:33:45 PM
to pdfhummus-in...@googlegroups.com
Wow, it rarely happens, but I'm so very very lost in this. *grin*
I can program micro controllers, tear apart engines and rebuild them, but can't wrap my mind around PDFs.

I have 3 elements in play: qrcode.jpg, original.pdf (source), copy.pdf (our modified pdf target).
Zip file with qrcode.jpg and original.pdf shared here: https://dl.dropboxusercontent.com/u/8411442/pdf-sources.zip

-- start block --
// Gal Kahana's pattern
var hummus = require('hummus');
// Open original file for modification
var pdfWriter = hummus.createWriterToModify('./tmp/original.pdf', {modifiedFilePath:'./tmp/copy.pdf'});
var copyingContext = pdfWriter.createPDFCopyingContextForModifiedFile();
var pageID = copyingContext.getSourceDocumentParser().getPageObjectID(0);
console.log(pageID); // returns 2
var pageObject = copyingContext.getSourceDocumentParser().parsePage(0).getDictionary().toJSObject();
console.log(pageObject); // returns a bunch of empty objects
//{ Contents: {},
//  MediaBox: {},
//  Parent: {},
//  Resources: {},
//  Type: { value: 'Page' } }
//{}

// Create a form from the jpg image
var inputJPG = './tmp/qrcode.jpg'; // path assumed; qrcode.jpg is the generated image
var formXObject = pdfWriter.createFormXObjectFromJPG(inputJPG);

// Create a new object
// Create a new content stream

// Write out the drawing commands

// End our stream

// End the object

// Add to the page

// Add our named item to the stream

pdfWriter.end();
-- end block --

Gal Kahana

Jun 15, 2013, 11:23:25 AM
to pdfhummus-in...@googlegroups.com
Don't worry about it. There is some skill & knowhow needed with PDF. i tend to forget that having worked with it so much.

cooked up a script to do what you want. attached.
i changed the algorithm a bit so it's more foolproof.
as you can see it's quite big, so no need to feel bad about being thrown by the task if you are not familiar with PDFs.

two notes:
1. if you want to modify the placement or scale of the qr code check out the top of the script:
var formPositionOnPage = [100,100];
var formScaleOnPage = 0.5;

the array is the position (left, bottom) of the form on the page, and the second value is its scale.
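
The attached script is not reproduced in this thread, so the following is only a guess at how a position and a scale like these could end up in the content stream. It assumes a uniform scale applied through the PDF `cm` operator; `buildPlacementStream` is a hypothetical helper name, not part of hummus.

```javascript
// Builds the content-stream text that draws a named form XObject.
// position is [left, bottom] in PDF points; scale is a uniform factor.
function buildPlacementStream(xobjectName, position, scale) {
  return [
    'q', // save the graphics state
    // matrix "a b c d e f cm": scale by (a, d), then translate by (e, f)
    [scale, 0, 0, scale, position[0], position[1], 'cm'].join(' '),
    '/' + xobjectName + ' Do', // paint the form under that matrix
    'Q' // restore the graphics state
  ].join('\n') + '\n';
}

var formPositionOnPage = [100, 100];
var formScaleOnPage = 0.5;
console.log(buildPlacementStream('mySpecialImage', formPositionOnPage, formScaleOnPage));
```

Wrapping the drawing in `q` / `Q` keeps the matrix change from leaking into whatever content comes after it.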

2. there was a bug in the module, where stream creation didn't return the stream object (dahhh).
it's fixed, but you'll have to update the module to get the fix.

Cheers,
Gal
ModifyContent.js

Stephen Feather

Jun 15, 2013, 12:53:00 PM
to pdfhummus-in...@googlegroups.com
wow! I would never have come up with that.  Thank you.
Will play with it right after lunch.


sf

Stephen Feather

Jun 15, 2013, 3:41:44 PM
to pdfhummus-in...@googlegroups.com
Ok, using version 1.0.10 from the github repo I end up with a zero byte Copy.pdf.

Some system information:
> process.versions
{ http_parser: '1.0',
  node: '0.8.22',
  v8: '3.11.10.25',
  ares: '1.7.5-DEV',
  uv: '0.8',
  zlib: '1.2.3',
  openssl: '1.0.0f' }

> npm list
── connect-en...@0.1.1
├── e...@0.7.2
├─┬ exp...@3.2.6
│ ├── buffer...@0.2.1
│ ├── comm...@0.6.1
│ ├─┬ con...@2.7.11
│ │ ├── by...@0.2.0
│ │ ├── coo...@0.0.5
│ │ ├── formi...@1.0.14
│ │ ├── pa...@0.0.1
│ │ ├── q...@0.6.5
│ │ └─┬ se...@0.1.1
│ │   └── mi...@1.2.9
│ ├── coo...@0.1.0
│ ├── cookie-s...@1.0.1
│ ├── de...@0.7.2
│ ├── fr...@0.1.0
│ ├── met...@0.0.1
│ ├── mkd...@0.3.4
│ ├── range-...@0.0.4
│ └─┬ se...@0.1.0
│   └── mi...@1.2.6
├── hum...@1.0.10
├── image...@0.1.3
├─┬ mong...@3.5.14
│ ├── ho...@0.2.1
│ ├─┬ mon...@1.3.5
│ │ ├── bs...@0.1.8
│ │ └── kerb...@0.0.2
│ ├── m...@0.1.0
│ ├── mu...@0.3.1
│ └── sli...@0.0.3
├─┬ oauth...@0.1.0
│ └── de...@0.7.2
├─┬ pass...@0.1.17
│ ├── pa...@0.0.1
│ └── pkg...@0.2.3
├─┬ passpo...@0.2.2
│ └── pkg...@0.2.3
├─┬ passport-h...@0.2.1
│ └── pkg...@0.2.3
├─┬ passpor...@0.1.6
│ └── pkg...@0.2.3
└─┬ passport-oauth2...@0.1.1
  └── pkg...@0.2.3

On build of hummusjs 1.0.10, i get the following warning from another library:

CC(target) Release/obj.target/libtiff/src/deps/LibTiff/tif_write.o
../src/deps/LibTiff/tif_write.c:633:49: warning: comparison of integers of
      different signs: 'toff_t' (aka 'unsigned int') and 'tsize_t' (aka 'int')
      [-Wsign-compare]
                && td->td_stripbytecount[strip] >= cc )
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^  ~~
1 warning generated.
  CC(target) Release/obj.target/libtiff/src/deps/LibTiff/tif_zip.o
  LIBTOOL-STATIC Release/tiff.a


Stephen Feather

Jun 15, 2013, 3:48:28 PM
to pdfhummus-in...@googlegroups.com

Thinking maybe it was the source pdf file (the source was a sample pdf that came out of the OSX print dialog), i grabbed one from the Adobe site:
http://partners.adobe.com/public/developer/en/xml/AdobeXMLFormsSamples.pdf (421kb). With that one I get a 262kb Copy.pdf but cannot open it.

Gal Kahana

Jun 15, 2013, 4:16:52 PM
to pdfhummus-in...@googlegroups.com
Re'd so it didn't post right. The file was fine on my env, and I'm getting the same warnings. Two things to try:
1. Update from npm, not git. I forgot to push the latest update (had some other matters) but did publish it.
2. Turn on the log for hummus. Check the basic pdfwriter opening docs for how to turn it on.

Cheers,
Gal

Stephen Feather

Jun 15, 2013, 5:24:47 PM
to pdfhummus-in...@googlegroups.com
var pdfWriter =
hummus.createWriterToModify('samples.pdf',{log:'hummus.log',
modifiedFilePath:'Copy.pdf'});

That would seem to be the format from the docs, but i get no logs.

Gal Kahana

Jun 15, 2013, 6:01:21 PM
to pdfhummus-in...@googlegroups.com
humff. so it's so unhappy that it wouldn't even tell us why.
look, it does work very well on my computer (mac osx), so i guess, if that's interesting to you, we'll have to go manual.
a little bit of removing code till it's ok, and then adding it back till it's not.

if you're ok with this. let's try the following (each step seeing if it creates empty or not. once it's empty - we know where the problem is):
1. try leaving on only that part that opens for modification and ends it. see if it creates a good copy of the original (it should be the same). if ok continue
2. try just creating the form for the jpg. if works, continue
3. try doing just the part that creates the new content stream, but without hooking it to the page. that means commenting out everything between "var cpyCxt = pdfWriter.createPDFCopyingContextForModifiedFile();" and "objCxt.startNewIndirectObject(newContentObjectID);" (not including those two lines).

if all these pass and you get a good file (though without the barcode), let me know. if any fails (you get a 0 byte file), let me know as well.
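
To make "empty or not" quick to check after each step, a small Node helper along these lines could be used. It is a sketch, not part of hummus: `checkPdfOutput` is a hypothetical name, and it assumes the output begins with the usual `%PDF-` header.

```javascript
var fs = require('fs');

// Classifies an output file as 'missing', 'empty', 'not-pdf', or 'ok'.
function checkPdfOutput(filePath) {
  if (!fs.existsSync(filePath)) return 'missing';
  var data = fs.readFileSync(filePath);
  if (data.length === 0) return 'empty';
  // A PDF file normally starts with the "%PDF-" header.
  if (data.slice(0, 5).toString('latin1') !== '%PDF-') return 'not-pdf';
  return 'ok';
}
```

An 'ok' result only means the file is non-empty and has a PDF header; it does not prove that a viewer can open it, so the result of each step still needs to be opened by hand.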

Cheers,
Gal.


Stephen Feather

Jun 15, 2013, 7:26:00 PM
to pdfhummus-in...@googlegroups.com
> look, it does work very well on my computer (mac osx)

i believe you :)

> 1. try leaving on only that part that opens for modification and ends it. see if it creates a good copy of the original (it should be the same). if ok continue

using the original.pdf (the fake sample form letter) i end up with an empty, 0 byte file.
if i use the pdf from adobe, I get a 262k unreadable file that contains bits o' pdf stuff in it.

code:
var hummus = require('hummus');
var pdfWriter = hummus.createWriterToModify('original.pdf',{log:'hummus.log', modifiedFilePath:'Copy.pdf'});
pdfWriter.end();


As an aside, for a sanity check, I ran the test set. All tests returned ok in the console, but ModifyingExistingFileContent.pdf is 0 bytes.

Stephen Feather

Jun 15, 2013, 7:26:47 PM
to pdfhummus-in...@googlegroups.com
I have another osx system here, am going to install on it and try there.

Stephen Feather

Jun 15, 2013, 9:59:17 PM
to pdfhummus-in...@googlegroups.com

On a second system, which doesn't have nearly the junk mine does, I have the same results.



Ran the test set, and ModifyingExistingFileContent.pdf was zero bytes as well.
Rolled back through 1.0.13, 1.0.10, 1.0.9, 1.0.5
Installed module as global, as local to the project, no difference.




Gal Kahana

Jun 16, 2013, 2:29:00 AM
to pdfhummus-in...@googlegroups.com
I see. well, thanks for running the test. b.t.w, what you sent me back then is the "fake sample form letter", right? 
tried on my work pc now. just downloaded the materials, npm install hummus. got the script. modified file names. unfortunately it worked on both original and adobe's pdf.
humff. i was hoping for a failure so i could figure out what's going on here.

there are several options i can think of:
1. bug in hummus, perhaps some memory override or some other matter that i can't currently figure. this is my prime suspect
2. system issue, maybe file write permissions
3. wrong version of hummus?!

(3) doesn't make sense, you got 1.0.13 and it didn't work.
(2) that's easy to test. try a simple hello world example, with regular createWriter/writePage/end, not for modification, writing to the same target.
(1) that's the heavy one. i'd like to focus on the fake form file, because that's where we are headed. i would like to check some theories, if you can spend the time:
    (a) maybe modifying and writing to another file doesn't work well - please change the script to write to the same file (remove the modifiedFilePath). make sure to do that on a copy of the file (just copy it before running the script).
    (b) maybe for some reason parsing the fake form file doesn't work. ok. please run a script that just reads the file, gets the page count and displays it on the console. if that doesn't work, this will pinpoint the source better.
[let me know if you are willing to test this; once you have gone through (a) and (b) we can think of other things]
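
For (a), a sketch of running an in-place step on a scratch copy and reporting whether the file actually changed. `runOnScratchCopy` is a hypothetical helper, not a hummus call; it compares bytes rather than timestamps, because a fresh copy carries its own modification time, which can look like a change made by the script.

```javascript
var fs = require('fs');

// Copies src to scratch, runs step(scratch), and reports whether the
// scratch file's bytes differ afterwards.
function runOnScratchCopy(src, scratch, step) {
  fs.copyFileSync(src, scratch);
  var before = fs.readFileSync(scratch);
  step(scratch);
  var after = fs.readFileSync(scratch);
  return {
    sizeBefore: before.length,
    sizeAfter: after.length,
    changed: !before.equals(after)
  };
}

// Example use, assuming hummus is installed and original.pdf exists:
// var result = runOnScratchCopy('original.pdf', 'copy.pdf', function (p) {
//   var w = require('hummus').createWriterToModify(p, {log: 'hummus.log'});
//   w.end();
// });
// console.log(result);
```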

sorry for all this, i guess it's an early adopter kind of illness.

Regards,
Gal.

Stephen Feather

Jun 16, 2013, 10:29:17 PM
to pdfhummus-in...@googlegroups.com
> (2) that's easy to test. try a simple hello worlds example, with regular createWriter/writepage/end not for modification, to the same target.

Generates a new copy.pdf with the qrcode near the middle
---- start block ----
var pdfWriter = require('hummus').createWriter('./copy.pdf');

var page = pdfWriter.createPage(0,0,595,842);
var cxt = pdfWriter.startPageContentContext(page);
cxt.drawImage(0,400,'./qrcode.jpg',{transformation:{width:216,height:216}});
pdfWriter.writePage(page);
pdfWriter.end();

console.log('done - ok');
---- end block ----

> (a) maybe modifying and writing to another file doesn't work well - please change the script to write to the same file (remove the modifiedfilepath). make sure to do that on a copy of the file (just copy before running the script).

This changes the modified timestamp on the file and leaves its content in place, unlike the run with modifiedFilePath, which creates an empty file. I'll try to roll some of the other functions into the script to see if i can break it.
---- start block ----
var hummus = require('hummus');
var pdfWriter = hummus.createWriterToModify('copy.pdf',{log:'hummus.log'});
pdfWriter.end();
---- end block ----

Stephen Feather

Jun 16, 2013, 10:33:40 PM
to pdfhummus-in...@googlegroups.com
Spoke too soon. timestamp was from the copying of the original.pdf to copy.pdf to play with.
Thought about that after I posted, waited a few minutes, no timestamp change.
So, not working.

Stephen Feather

Jun 16, 2013, 10:43:51 PM
to pdfhummus-in...@googlegroups.com
---- start block ----
var hummus = require('hummus');
var pdfReader = hummus.createReader('./copy.pdf');
console.log('[getPdfLevel] '+pdfReader.getPDFLevel()); // returns 1.3 
console.log('[getPagesCount] '+pdfReader.getPagesCount()); // returns 1
---- end block ----

Gal Kahana

Jun 17, 2013, 1:46:27 AM
to pdfhummus-in...@googlegroups.com
Good. it's not supposed to do that. the code that you ran should at the very least add a new trailer entry, and even change the byte size (if using the original.pdf file, it'll go from 106kb to 107kb).

modification is not happy.
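
One rough way to see whether a new trailer was appended, besides the byte size, is to count the `%%EOF` markers before and after the run: an appended update section ends with its own `%%EOF`. This is a sketch under the assumption that the marker does not also occur inside stream data; `countEofMarkers` is a hypothetical helper, not part of hummus.

```javascript
// Counts non-overlapping occurrences of the "%%EOF" marker in a buffer.
function countEofMarkers(buffer) {
  var marker = Buffer.from('%%EOF', 'latin1');
  var count = 0;
  var pos = buffer.indexOf(marker);
  while (pos !== -1) {
    count += 1;
    pos = buffer.indexOf(marker, pos + marker.length);
  }
  return count;
}

// Example use on the files from this thread:
// var fs = require('fs');
// console.log(countEofMarkers(fs.readFileSync('original.pdf')));
// console.log(countEofMarkers(fs.readFileSync('copy.pdf')));
```

If the count and the size are both unchanged after the script runs, nothing was written to the file.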

let's try to see exactly what part. 

a. reading not happy? ending not happy?  

in the script that you wrote for the modification here, please add the reading code from below using the modification parser. should be something like this:
var hummus = require('hummus');
var pdfWriter = hummus.createWriterToModify('copy.pdf',{log:'hummus.log'});
var modParser = pdfWriter.createPDFCopyingContextForModifiedFile().getSourceDocumentParser();
console.log(modParser.isEncrypted());
console.log(modParser.getPDFLevel());
console.log(modParser.getPagesCount());

pdfWriter.end();
console.log('done');

-- try this. if some of the console output is missing, we'll advance somewhat by knowing there's a read error, or whether the file is considered encrypted or whatnot.

b. copying doesn't work on this file?
try doing the original simple pdfWriter.appendPDFPagesFromPDF from this file to another. let's see that the copying context works, and that the problem is limited to the modification module.

Regards,
Gal.

Stephen Feather

Jun 17, 2013, 11:51:32 AM
to pdfhummus-in...@googlegroups.com
sfeather:node ModifyContent.js 
false
1.3
1

Stephen Feather

Jun 17, 2013, 12:04:49 PM
to pdfhummus-in...@googlegroups.com
Creates a nice readable pdf. Still no logs created.

--------------
var pdfWriter = require('hummus').createWriter('./test.pdf',{log:'./hummus.log'});
pdfWriter.appendPDFPagesFromPDF('./original.pdf');
pdfWriter.appendPDFPagesFromPDF('./samples.pdf');
pdfWriter.appendPDFPagesFromPDF('./copy.pdf');
pdfWriter.end();
console.log('done - ok');
--------------

Gal Kahana

Jun 17, 2013, 1:22:42 PM
to pdfhummus-in...@googlegroups.com
Yeah, if it has nothing to say then you won't see a log.
so it seems like appending works, which means no problem to copy. parser of the copying context, based on the previous test also reads fine.
it is only one thing that i see - that previous test printed the encryption value, level and pages count...but didn't write the "done-ok" (unless you didn't say so).
this means that "end" fails. which would explain why there's nothing changed with the modified file.
Please confirm that you didnt get "done-ok" in the previous test. i'm curious as to why the log doesn't show anything. let me think about it. in the meantime - just please confirm about "done-ok".

Gal.

Gal Kahana

Jun 17, 2013, 1:23:15 PM
to pdfhummus-in...@googlegroups.com
sorry, it was "done". not "done-ok"

Gal Kahana

Jun 17, 2013, 2:05:12 PM
to pdfhummus-in...@googlegroups.com
Stephen,
i might have found something. an uninitialized flag may be the source of all the trouble. it's a bug which is in modification and not in creation...and will come into play only at the "end" phase....sounds right.
you can get the fix from either npm or git. let me know if this solves the problem. i hope it will :).
version is 1.0.14

Gal.

Stephen Feather

Jun 17, 2013, 2:15:40 PM
to pdfhummus-in...@googlegroups.com
i left that part off, it did finish.

Stephen Feather

Jun 17, 2013, 2:16:20 PM
to pdfhummus-in...@googlegroups.com
pulling to check right after lunch. thank you.

Stephen Feather

Jun 17, 2013, 4:44:13 PM
to pdfhummus-in...@googlegroups.com
woop!
var pdfWriter = hummus.createWriterToModify('original.pdf',{log:'hummus.log'}); // Works

var pdfWriter = hummus.createWriterToModify('copy.pdf', {modifiedFilePath:'copy2.pdf', log:'hummus.log'}); // works (single page)

var pdfWriter = hummus.createWriterToModify('samples.pdf', {modifiedFilePath:'copy3.pdf', log:'hummus.log'}); //works (adobe's document)

Thank you very much.


