PDFmark for PDF/A-3 embedded files


Jochen Stärk

May 3, 2020, 3:59:14 AM
to zug...@googlegroups.com


I need valid PDF/A-3 output from plain PDF input, and Ghostscript is the only viable open-source solution I can think of (are there others?).

With a basic PostScript file like psHelloWorld.ps, the usual pdfa_def.ps as pdfmark, an ICC profile from https://www.adobe.com/support/downloads/iccprofiles/iccprofiles_win.html and the command

gs -P -dPDFA=3 -dCompressStreams=false -dNOOUTERSAVE -sProcessColorModel=DeviceRGB -sDEVICE=pdfwrite -o helloworld1.pdf -dPDFACompatibilityPolicy=3 -dRenderIntent=3 -sGenericResourceDir="./" ".\pdfa_def.ps"  psHelloWorld.ps

helloworld1.pdf is valid PDF/A-3b (according to veraPDF).

AFAIK I have to add an XMP schema extension. With pdfa_def.ps and addXMP.ps the output (helloworld2.pdf) also seems to be valid: exiftool shows more metadata, and veraPDF still reports validity.

Next I tried to embed the file with the pdfmark from https://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/pdfmark_reference.pdf, p. 31.

To be honest, I don't know precisely what the Unicode Unique Name is, what I am supposed to do with it, or why. And I guess embedding files works differently in PDF/A-3.

After the additional errors this produced, I added the UF key and an AFRelationship of Alternative.

I end up with

gs -P -dPDFA=3 -dCompressStreams=false -dNOOUTERSAVE -sProcessColorModel=DeviceRGB -sDEVICE=pdfwrite -o helloworld3.pdf -dPDFACompatibilityPolicy=3 -dRenderIntent=3 -sGenericResourceDir="./" ".\pdfa_def.ps" ".\addXMP.ps" ".\addAttachment.ps"  psHelloWorld.ps

but veraPDF (on helloworld3.pdf) reports two errors:

Specification: ISO 19005-3:2012, Clause: 6.8, Test number: 1
The MIME type of an embedded file, or a subset of a file, shall be specified using the Subtype key of the file specification dictionary. If the MIME type is not known, the "application/octet-stream" shall be used.
Status: Failed (2 occurrences)
Test: Subtype != null && /^[-\w+\.]+\/[-\w+\.]+$/.test(Subtype)
Context: root/indirectObjects[7](8 0)/directObject[0]/values[0]/elements[1]/EF[0]

Specification: ISO 19005-3:2012, Clause: 6.8, Test number: 4
The additional information provided for associated files as well as the usage requirements for associated files indicate the relationship between the embedded file and the PDF document or the part of the PDF document with which it is associated.
Status: Failed (2 occurrences)
Test: isAssociatedFile == true
Context: root/indirectObjects[7](8 0)/directObject[0]/values[0]/elements[1]

As the MIME type I would use text/xml; I understand this will be escaped to text#2Fxml.
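
A quick way to sanity-check a MIME type against the pattern that veraPDF prints for clause 6.8, test 1, and to see the escaped name form, might look like the sketch below. It assumes a POSIX shell with grep and sed. The `\w` class is spelled out as `A-Za-z0-9_` because POSIX `grep -E` does not guarantee `\w`, so this is a slightly narrower approximation of the original pattern. The variable names are made up for illustration.

```shell
#!/bin/sh
mime='text/xml'

# Approximation of veraPDF's pattern /^[-\w+\.]+\/[-\w+\.]+$/ in POSIX ERE
if printf '%s\n' "$mime" | grep -Eq '^[-A-Za-z0-9_+.]+/[-A-Za-z0-9_+.]+$'; then
  mime_ok=yes
else
  mime_ok=no
fi

# Inside a PDF name object, "/" is serialized as the hex escape #2F
escaped=$(printf '%s' "$mime" | sed 's|/|#2F|g')

echo "MIME check: $mime_ok, PDF name: /$escaped"
```

The `#2F` form is only how the name is written in the file; the value of the name is still text/xml, and that is presumably what the Subtype test is matched against.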

I tried several places, but to be honest I still have no idea where it belongs. Do I need to access some file dictionary somewhere?
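
One possible layout for the attachment pdfmark is sketched below, untested against helloworld3.pdf. It assumes that Subtype belongs on the embedded file stream dictionary (the EF entry that the first veraPDF context path ends in), and that the isAssociatedFile check is satisfied by referencing the file specification from an AF array in the document catalog. The file names addAttachment_sketch.ps and invoice.xml and the object names {EmbeddedStream}, {FSDict} and {AFArray} are hypothetical placeholders.

```shell
#!/bin/sh
# Write a hypothetical pdfmark file; all object and file names are placeholders.
cat > addAttachment_sketch.ps <<'EOF'
% 1. Stream object that will hold the attachment bytes
[ /_objdef {EmbeddedStream} /type /stream /OBJ pdfmark

% 2. MIME type goes on the embedded file stream dictionary;
%    cvn converts the string to a name (serialized as /text#2Fxml)
[ {EmbeddedStream} << /Type /EmbeddedFile /Subtype (text/xml) cvn >> /PUT pdfmark

% 3. Copy the file contents into the stream, then close it
[ {EmbeddedStream} (invoice.xml) (r) file /PUT pdfmark
[ {EmbeddedStream} /CLOSE pdfmark

% 4. File specification dictionary with F, UF and AFRelationship
[ /_objdef {FSDict} /type /dict /OBJ pdfmark
[ {FSDict} << /Type /Filespec /F (invoice.xml) /UF (invoice.xml)
              /AFRelationship /Alternative
              /EF << /F {EmbeddedStream} /UF {EmbeddedStream} >> >> /PUT pdfmark

% 5. Reference the file specification from the catalog's AF array
[ /_objdef {AFArray} /type /array /OBJ pdfmark
[ {AFArray} {FSDict} /APPEND pdfmark
[ {Catalog} << /AF {AFArray} >> /PUT pdfmark

% 6. Register the attachment in the EmbeddedFiles name tree
[ /Name (invoice.xml) /FS {FSDict} /EMBED pdfmark
EOF
echo "wrote addAttachment_sketch.ps"
```

The generated file would take the place of addAttachment.ps in the gs command. Recent Ghostscript releases run in SAFER mode by default, so reading invoice.xml may additionally need something like --permit-file-read=invoice.xml.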

Any suggestions to improve my pdfmark, in particular addAttachment.ps?

Thanks in advance!


Kind regards,
Jochen Stärk

Huswertstraße 14
60435 Frankfurt

Tel: (069)569940-20
Fax: (069)569940-19
Mobile: (0177)4512645