Re: Font extraction

Support

Sep 20, 2012, 4:19:36 PM

to pdfne...@googlegroups.com

If you use pdftron.PDF.Convert.ToXps() (for an example of how to use this method, see the Convert sample: http://www.pdftron.com/pdfnet/samplecode.html#Convert), PDFNet will automatically extract, convert, and normalize all fonts to OTF format.

An XPS file is just a ZIP archive containing XML, fonts, PNGs, and so on. All extracted fonts will reside in the '/Font' folder. The only tricky thing is that the fonts are obfuscated (this is required by the XPS spec: http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-388.pdf). The de-obfuscation should be fairly simple, for example:

/// <summary>Deobfuscates a font according to the XPS specification.</summary>
/// <param name="data">The font data to be deobfuscated (modified in place).</param>
/// <param name="name">The name of the font.</param>
private void DeobfuscateFont(Byte[] data, String name) {
  // Strip the hyphens and the extension so only the GUID's hex digits remain.
  String[] separators = new String[] { "-", ".odttf", ".odttc" };
  String[] parts = name.Split(separators, StringSplitOptions.RemoveEmptyEntries);

  // Break the 32 hex digits into 16 two-character strings, one per key byte.
  String[] brokenString = new String[16];
  int count = 0;
  for (int i = 0; i < parts.Length; i++) {
    for (int j = 0; j < parts[i].Length; j += 2) {
      brokenString[count] = parts[i].Substring(j, 2);
      count++;
    }
  }

  // HexFromChars (not shown here) converts the hex pairs into the 16-byte XOR key.
  Byte[] key = HexFromChars(brokenString);

  // Only the first 32 bytes are obfuscated; the 16-byte key is applied twice.
  for (int i = 0; i < 32; i++) data[i] ^= key[i % 16];
}
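
Since HexFromChars is not shown above, here is a self-contained sketch of the whole round trip: open the XPS as a ZIP archive, find the obfuscated font parts, and de-obfuscate them. The class and method names (XpsFontExtractor, GuidBytesFromName, Deobfuscate, ExtractFonts) are made up for illustration and are not part of PDFNet. It assumes, based on my reading of the font obfuscation section of ECMA-388, that the key is the GUID bytes taken in reverse of the order they appear in the part name, and that writing the result with an .otf extension is acceptable for your workflow. Treat it as a starting point rather than a tested implementation.

```csharp
using System;
using System.IO;
using System.IO.Compression; // on .NET Framework 4.5+, reference System.IO.Compression.FileSystem

public static class XpsFontExtractor
{
    // Parses the GUID in a font part name such as
    // "00010203-0405-0607-0809-0A0B0C0D0E0F.odttf" into 16 bytes,
    // in the order the hex digits appear in the name.
    public static byte[] GuidBytesFromName(string partName)
    {
        string hex = Path.GetFileNameWithoutExtension(partName).Replace("-", "");
        if (hex.Length != 32)
            throw new FormatException("Expected a 32-hex-digit GUID in: " + partName);

        byte[] bytes = new byte[16];
        for (int i = 0; i < 16; i++)
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        return bytes;
    }

    // XORs the first 32 bytes of the font with the GUID bytes in reverse order
    // (assumption: this is the key order ECMA-388 describes). XOR is its own
    // inverse, so the same call both obfuscates and de-obfuscates.
    public static void Deobfuscate(byte[] font, string partName)
    {
        byte[] guid = GuidBytesFromName(partName);
        int n = Math.Min(32, font.Length);
        for (int i = 0; i < n; i++)
            font[i] ^= guid[15 - (i % 16)];
    }

    // Writes every .odttf part found in the XPS package to outputDir.
    public static void ExtractFonts(string xpsPath, string outputDir)
    {
        Directory.CreateDirectory(outputDir);
        using (ZipArchive zip = ZipFile.OpenRead(xpsPath))
        {
            foreach (ZipArchiveEntry entry in zip.Entries)
            {
                if (!entry.Name.EndsWith(".odttf", StringComparison.OrdinalIgnoreCase))
                    continue;

                using (Stream input = entry.Open())
                using (MemoryStream buffer = new MemoryStream())
                {
                    input.CopyTo(buffer);
                    byte[] font = buffer.ToArray();
                    Deobfuscate(font, entry.Name);

                    // The .otf extension is an assumption; adjust if needed.
                    string outName = Path.ChangeExtension(entry.Name, ".otf");
                    File.WriteAllBytes(Path.Combine(outputDir, outName), font);
                }
            }
        }
    }
}
```

Typical usage would be something like XpsFontExtractor.ExtractFonts("out.xps", "fonts"), where "out.xps" is the file produced by Convert.ToXps(). If the extracted fonts do not open, the first thing to check is the key byte order.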

Now, if for some reason ToXps() does not work for you (e.g. you prefer to keep your current approach), you could potentially use a font conversion tool such as FontForge. I would warn you in advance that this may not go as smoothly as it seems, in large part because there are so many subsetted, malformed, and incomplete fonts in PDF files.
