Embed preexisting font in pdf with python


Spencer Rathbun

Jan 11, 2012, 11:23:47 AM
to pdfne...@googlegroups.com
I'm doing various manipulations to a pdf, and at one point I create a new pdf from a set of pages in the original. The original pdf does not have any fonts embedded, so I want to embed the used fonts into the new pdf when I create it. However, when I open the new pdf, the fonts are not embedded. What step am I missing?

# Python 2.7 Snippet
for group in groupings:
    logger.info("pages {0}-{1}".format(group[1], group[2]))

    new_doc = PDFDoc()
    copy_pages = VectorPage()
    itr = in_doc.GetPageIterator(group[1])
    while itr.HasNext():
        if itr.Current().GetIndex() > group[2]:
            break
        page = itr.Current()
        res = page.GetResourceDict()
        copy_pages.push_back(page)

        if (res != None):
            fonts = res.FindObj("Font")
            if (fonts != None):
                #itr2 = fonts.DictBegin()
                itr2 = fonts.GetDictIterator()
                #end = fonts.DictEnd()
                while itr2.HasNext():
                    fnt_dict = itr2.Value()
                    font = Font(fnt_dict)
                    if font.GetName() not in fontsToEmbed:
                        fontsToEmbed.append(font.GetName())
                    itr2.Next()
        itr.Next()

    for f in fontsToEmbed:
        if f == 'Times-Roman':
            Font.Create(new_doc.GetSDFDoc(), Font.e_times_roman, True)
        elif f == 'CourierNewPSMT':
            Font.Create(new_doc.GetSDFDoc(), Font.e_courier, True)
        else:
            logger.info("Unknown font: {0}".format(f))
    fontsToEmbed = []
    imported_pages = new_doc.ImportPages(copy_pages)
    i = iter(imported_pages)
    for x in i:
        new_doc.PagePushBack(x)

    new_doc.Save("{0}.pdf".format(os.path.join(output_path, basename)), 0)

Spencer Rathbun

Support

Jan 12, 2012, 6:02:39 PM
to pdfne...@googlegroups.com
Unfortunately, embedding a missing font is not as simple as adding a font with the same name to the target document. All text in the document must explicitly reference the new font.
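
To illustrate why a same-named font is not picked up automatically: text in a page's content stream selects its font with the Tf operator, using a resource name (such as /F1) that the page's resource dictionary maps to one specific font object. The sketch below models that lookup with plain Python dicts. It does not use the PDFNet API, and the names objects, page_resources, and fonts_used_by are hypothetical stand-ins.

```python
import re

# Plain-Python stand-ins for PDF objects; this does not use the PDFNet API.
# Object 5 is the original, non-embedded font (no FontFile* in its descriptor).
objects = {
    5: {"Type": "Font", "BaseFont": "Times-Roman", "FontDescriptor": {}},
}
page_resources = {"Font": {"F1": 5}}           # resource name -> object number
content_stream = "BT /F1 12 Tf (Hello) Tj ET"  # text selects its font via Tf


def fonts_used_by(stream, resources):
    """Return the object numbers of the fonts that the Tf operators select."""
    names = re.findall(r"/(\S+)\s+[\d.]+\s+Tf", stream)
    return [resources["Font"][name] for name in names]


# Adding a second, embedded font with the same BaseFont changes nothing,
# because the text still resolves /F1 to object 5.
objects[9] = {"Type": "Font", "BaseFont": "Times-Roman",
              "FontDescriptor": {"FontFile": "<font program>"}}
before = fonts_used_by(content_stream, page_resources)

# Only repointing the resource entry (or rewriting the text) makes it used.
page_resources["Font"]["F1"] = 9
after = fonts_used_by(content_stream, page_resources)
```

In this model the Font.Create() calls in the question correspond to adding object 9: the font exists in the document, but nothing on the page refers to it.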

I guess you could try adding the missing FontFile/FontFile2/FontFile3 entry to the existing font descriptor (Font.GetDescriptor()); however, this may not work if the font formats do not match or if the text and font encodings are mismatched.
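
As a rough sketch of the format-matching concern: the PDF specification defines three font-descriptor keys for an embedded font program, and which one applies depends on the program's format. The helper names below (embedded_font_key, key_for_program) are hypothetical, and descriptors are modeled as plain dicts rather than PDFNet objects.

```python
# Descriptor key that holds each font-program format, per the PDF
# specification: FontFile for Type 1, FontFile2 for TrueType, and FontFile3
# for programs whose format is named by the stream's Subtype entry.
KEY_FOR_FORMAT = {
    "Type1": "FontFile",
    "TrueType": "FontFile2",
    "Type1C": "FontFile3",
    "CIDFontType0C": "FontFile3",
    "OpenType": "FontFile3",
}


def embedded_font_key(descriptor):
    """Return the FontFile* key present in a font descriptor, or None."""
    for key in ("FontFile", "FontFile2", "FontFile3"):
        if key in descriptor:
            return key
    return None


def key_for_program(program_format):
    """Return the descriptor key matching a font program's format."""
    try:
        return KEY_FOR_FORMAT[program_format]
    except KeyError:
        raise ValueError("unknown font program format: %r" % program_format)
```

A descriptor for which embedded_font_key() returns None has no embedded program. Attaching, for example, a TrueType program to a font dictionary that declares itself Type 1 is the kind of mismatch that may not render correctly.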

The simplest way to embed missing fonts would be to use the PDF/A Add-on (see pdftron.PDF.PDFA.PDFACompliance - http://www.pdftron.com/pdfnet/samplecode.html#PDFA) to save the original PDF as a PDF/A-compliant document (which requires that all fonts be embedded). In this case PDFNet PDFACompliance.Save() will embed all missing fonts, and you can use the resulting document for further PDF processing.
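
A minimal sketch of that approach, modeled on the vendor's PDF/A sample rather than taken from this thread. Everything here should be treated as an assumption to check against the installed PDFNet version: the module name, the constructor arguments, the e_Level1B conformance level, and the fact that the sample spells the save call SaveAs. The function name save_as_pdfa and its parameters are hypothetical.

```python
def save_as_pdfa(src_path, dst_path, license_key=None):
    """Convert src_path to PDF/A and write the result to dst_path."""
    # Assumption: the PDFNet Python bindings are installed. The module name
    # differs between releases, so try the Python 3 name first.
    try:
        from PDFNetPython3 import PDFNet, PDFACompliance
    except ImportError:
        from PDFNetPython import PDFNet, PDFACompliance

    # Newer releases expect a license key; older ones accept no argument.
    if license_key:
        PDFNet.Initialize(license_key)
    else:
        PDFNet.Initialize()

    # Arguments as in the vendor sample: convert=True, input file, password,
    # conformance level, exceptions list, number of exceptions,
    # maximum number of reference objects.
    pdf_a = PDFACompliance(True, src_path, None,
                           PDFACompliance.e_Level1B, 0, 0, 10)
    pdf_a.SaveAs(dst_path, False)  # False: do not linearize the output
```

The converted file could then be opened as in_doc in the page-splitting loop from the question, so the fonts are already embedded before pages are imported into each new document.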
