If PDF embedded vector graphics the same way it embeds rasters, then
this would become somewhat practical. For example, if the vector graphics
were an embedded SVG, we'd pull that out (similar to how pdfimages from
poppler can pull out embedded rasters). Then we'd teach Leptonica to
read the SVG into a pix, which is a rasterization operation. At PDF generation
time we'd throw out the pix and use the original SVG instead.

But I don't think it works that way at all. I think the vector graphics
commands are likely to be direct PDF primitives, and therefore way, way,
way too hard to play with in this fashion.
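
To make the "direct PDF primitives" point concrete: in a PDF page, vector drawing is written as path operators (such as `m`, `l`, `re`, `S`, `f`) mixed into the page's content stream, not as a separate embedded object. Here is a rough sketch that counts those operators in an already-decompressed content stream. The function name and the sample stream are made up for illustration, and the whitespace tokenizer is a deliberate simplification that ignores string literals, comments, and inline images.

```python
# Path operators from the PDF content stream syntax.
PATH_CONSTRUCTION = {"m", "l", "c", "v", "y", "h", "re"}
PATH_PAINTING = {"S", "s", "f", "F", "f*", "B", "B*", "b", "b*", "n"}


def count_vector_operators(content_stream: str) -> dict:
    """Count path operators in a decompressed content stream.

    Hypothetical helper: splits on whitespace only, so it will miscount
    if an operator-like token appears inside a string or inline image.
    """
    counts = {"construction": 0, "painting": 0}
    for token in content_stream.split():
        if token in PATH_CONSTRUCTION:
            counts["construction"] += 1
        elif token in PATH_PAINTING:
            counts["painting"] += 1
    return counts


# Made-up sample: a stroked red triangle followed by a filled rectangle.
sample = """
q
1 0 0 RG
10 10 m
100 10 l
100 100 l
h
S
0 0 50 50 re
f
Q
"""

print(count_vector_operators(sample))
```

Because the drawing commands sit inline with everything else on the page, there is no single object to pull out and put back later, which is what makes the SVG-style round trip impractical.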

I think a more likely approach (but still very unlikely!) is to use a
modified Tesseract to create a new PDF containing an invisible text
layer and nothing else, then hope someone has written a general
purpose "composite two PDF files on top of each other" program
and use it to merge the two. Such a merging program would be pretty
difficult to write, and it is hard for me to imagine why it would exist.