Re: Font extraction


Support

Sep 20, 2012, 4:19:36 PM
to pdfne...@googlegroups.com
 
If you use pdftron.PDF.Convert.ToXps() (for an example of how to use this method, see the Convert sample: http://www.pdftron.com/pdfnet/samplecode.html#Convert), PDFNet will automatically extract, convert, and normalize all fonts to OTF format.
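
As a rough illustration, the conversion call could look like the sketch below. The file names are hypothetical placeholders, and the parameterless PDFNet.Initialize() is an assumption that holds for older releases; newer PDFNet versions may expect a license key argument. The Convert sample linked above remains the authoritative reference.

```csharp
using pdftron;

class ToXpsSketch
{
    static void Main()
    {
        // Assumption: parameterless Initialize() is available (demo mode in older releases).
        PDFNet.Initialize();

        // Hypothetical input and output paths; the fonts end up inside output.xps.
        pdftron.PDF.Convert.ToXps("input.pdf", "output.xps");
    }
}
```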
 
An XPS file is just a ZIP archive containing XML, fonts, PNGs, and so on. All extracted fonts will reside in the '/Font' folder. The only tricky thing is that the fonts are obfuscated (this is required by the XPS spec: http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-388.pdf). The deobfuscation should be fairly simple, for example:
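
Since the XPS package is a ZIP archive, the obfuscated font parts can be read with the standard System.IO.Compression classes. The following is a minimal sketch, assuming .NET 4.5 or later (on .NET Framework this needs references to System.IO.Compression and System.IO.Compression.FileSystem). The class and method names are made up for illustration, and the code matches parts by their .odttf/.odttc extension rather than relying on a particular folder name.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;

static class XpsFonts
{
    // Returns the raw (still obfuscated) bytes of every .odttf/.odttc part,
    // keyed by the part's path inside the package.
    public static Dictionary<string, byte[]> ReadObfuscatedFonts(string xpsPath)
    {
        var fonts = new Dictionary<string, byte[]>();
        using (ZipArchive zip = ZipFile.OpenRead(xpsPath))
        {
            foreach (ZipArchiveEntry entry in zip.Entries)
            {
                bool isFont =
                    entry.FullName.EndsWith(".odttf", StringComparison.OrdinalIgnoreCase) ||
                    entry.FullName.EndsWith(".odttc", StringComparison.OrdinalIgnoreCase);
                if (!isFont) continue;

                using (Stream part = entry.Open())
                using (var buffer = new MemoryStream())
                {
                    part.CopyTo(buffer);
                    fonts[entry.FullName] = buffer.ToArray();
                }
            }
        }
        return fonts;
    }
}
```

Each returned byte array, together with the file name portion of its key, is the input the deobfuscation step needs.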
 
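/// <summary>Deobfuscates a font in place, according to the XPS specification.</summary>
/// <param name="data">The obfuscated font data; only the first 32 bytes are modified.</param>
/// <param name="name">The font file name: a GUID followed by ".odttf" or ".odttc".</param>
private static void DeobfuscateFont(byte[] data, string name) {
  // Strip the dashes and the extension, leaving the 32 hex digits of the GUID.
  string[] separators = { "-", ".odttf", ".odttc" };
  string hex = String.Concat(name.Split(separators, StringSplitOptions.RemoveEmptyEntries));
  if (hex.Length != 32 || data.Length < 32)
    throw new ArgumentException("Expected a GUID-based font name and at least 32 bytes of data.");
  // Parse each pair of hex digits into one key byte, in the order they appear in the name.
  byte[] key = new byte[16];
  for (int i = 0; i < 16; i++)
    key[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
  // Per ECMA-388, the key bytes are applied in the reverse of their order in the
  // file name, and the 16-byte key is repeated once to cover the first 32 bytes.
  for (int i = 0; i < 32; i++)
    data[i] ^= key[15 - (i % 16)];
}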
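
A quick way to check that deobfuscation worked is to look at the first four bytes of the result, which should be a known font signature: 0x00010000 (TrueType outlines), "OTTO" (CFF outlines), "true" (Apple TrueType), or "ttcf" (font collection). The helper below is a small sketch of that check; its class and method names are made up for illustration, and it only inspects the signature, not the rest of the font.

```csharp
using System;
using System.Text;

static class FontCheck
{
    // True if the data starts with a known sfnt/OpenType signature.
    public static bool LooksLikeFont(byte[] data)
    {
        if (data == null || data.Length < 4) return false;

        // Version 1.0 as a 16.16 fixed-point number: TrueType outlines.
        if (data[0] == 0x00 && data[1] == 0x01 && data[2] == 0x00 && data[3] == 0x00)
            return true;

        string tag = Encoding.ASCII.GetString(data, 0, 4);
        return tag == "OTTO" || tag == "true" || tag == "ttcf";
    }
}
```

If the check fails after running DeobfuscateFont, the key order or the file name parsing is the first thing to inspect.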
 
 
Now, if for some reason ToXps() does not work for you (i.e. you prefer to use your current approach), you could potentially use a font conversion tool (e.g. FontForge). I would warn you in advance that this may not go as smoothly as it seems, in large part because there are so many subsetted, malformed, and incomplete fonts in PDF files.
 
 
 
 
 
 