Public API to access signature objects?


Miklos Vajna

Jun 4, 2020, 3:17:58 PM
to pdf...@googlegroups.com
Hi,

As far as I can see, pdfium currently has no public API to access signature
objects.

Assuming this is correct, would it be accepted if I submitted changes
to expose such digital signatures as public API?

What I have in mind is a minimal API: the signature blobs (and the needed
metadata, like byte ranges) would be accessible to pdfium clients, but
pdfium itself would stay crypto-library agnostic. Actually determining
whether the signature blobs are valid or invalid would still be a task for
pdfium clients. Does this sound like a sane idea?

Thanks,

Miklos

Miklos Vajna

Jun 13, 2020, 4:27:44 PM
to pdf...@googlegroups.com
Hi,
Dan, Lei, others: any input on this one?

Thanks,

Miklos

Miklos Vajna

Jun 22, 2020, 4:42:11 PM
to pdf...@googlegroups.com, the...@chromium.org, tse...@chromium.org
Hi,
Still looking for input here. :-)

Thanks,

Miklos

Lei Zhang

Jun 23, 2020, 1:54:25 AM
to Miklos Vajna, pdfium
I'm not a crypto expert, but I would prefer to avoid dragging too
much PKI, CA, and signature handler code into PDFium. So if PDFium can
just provide APIs to support digital signatures, and those interested
in digital signatures can connect the two parts, that level of
integration sounds good to me. If we experiment and find this does not
work, then we will at least have a better understanding and can
discuss where we need tighter integration and why.
> --
> You received this message because you are subscribed to the Google Groups "pdfium" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to pdfium+un...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/msgid/pdfium/20200604191753.GC32683%40vmiklos.hu.

Miklos Vajna

Aug 14, 2020, 3:43:16 AM
to Lei Zhang, pdfium
Hi Lei, all,

On Mon, Jun 22, 2020 at 10:54:12PM -0700, Lei Zhang <the...@chromium.org> wrote:
> I'm not a crypto expert, but I would prefer to avoid dragging too
> much PKI, CA, and signature handler code into PDFium. So if PDFium can
> just provide APIs to support digital signatures, and those interested
> in digital signatures can connect the two parts, that level of
> integration sounds good to me.

One last thing, I forgot to ask in the initial mail...

public/fpdf_signature.h now has a set of experimental APIs that do most
of this. One more API I have in mind is a bit more low-level, so I
thought I would bring it up here before investing effort in coding.

Suppose a pdfium client wants to determine if the signature is
"complete". This is a reasonable ask; Acrobat provides this information
in its UI. If there is a single signature, that's easy: the byte range
is available via FPDFSignatureObj_GetByteRange(), and client code can
require that it covers everything other than the signature itself.
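
A minimal sketch of that single-signature check, in Python for brevity. It assumes the client has already read the four /ByteRange integers (for example via FPDFSignatureObj_GetByteRange()) and knows the file size. The function name covers_whole_file and the sample numbers are hypothetical, and the check does not verify that the gap holds exactly the signature value.

```python
def covers_whole_file(byte_range, file_size):
    """Return True if a /ByteRange of the form [o1, l1, o2, l2] signs
    every byte of the file except a single gap (where the signature sits)."""
    if len(byte_range) != 4:
        return False
    o1, l1, o2, l2 = byte_range
    if min(o1, l1, o2, l2) < 0:
        return False
    starts_at_beginning = o1 == 0          # first segment begins at byte 0
    gap_is_well_formed = o1 + l1 <= o2     # second segment starts after the first
    ends_at_eof = o2 + l2 == file_size     # second segment runs to the last byte
    return starts_at_beginning and gap_is_well_formed and ends_at_eof


# Hypothetical numbers: a 40-byte file, then the same range in a 70-byte file.
print(covers_whole_file([0, 10, 30, 10], 40))  # True
print(covers_whole_file([0, 10, 30, 10], 70))  # False
```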

Things get more complicated with multiple signatures. Strictly speaking,
a second signature invalidates the first one, since adding a 2nd
signature makes the 1st one not cover the whole document anymore. In
real life, this is too strict, though; Acrobat doesn't enforce it
either.

Using Acrobat, a user does not get a warning when a document is signed
multiple times. It warns only if there are unsigned incremental updates
between the signatures or after the last one. This ensures that no
unwanted content is added after signing (without triggering a "signature
is incomplete" warning), but multiple signatures keep working.

Now the question: what API to expose in pdfium to help client code do a
similar test?

I would suggest that we expose a set of byte offsets, each denoting the
end of the EOF token of a trailer ("end of incremental update" for any
non-first part). This has the benefit that pdfium client code has the
info it needs, and we can delegate the fine-grained policy to client
code, especially as the PDF reference does not seem to specify this
policy in an explicit way. The API would be a bit low-level, though.

Would a gerrit change doing this be accepted?

Thanks,

Miklos

Miklos Vajna

Aug 19, 2020, 3:48:35 AM
to Lei Zhang, pdfium
Hi Lei, others,

On Fri, Aug 14, 2020 at 09:43:11AM +0200, Miklos Vajna <vmi...@vmiklos.hu> wrote:
> I would suggest that we expose a set of byte offsets, each denoting the
> end of the EOF token of a trailer ("end of incremental update" for any
> non-first part).

OK to go ahead here or would you suggest some other way to support this
use-case?

Thanks,

Miklos

Lei Zhang

Aug 20, 2020, 4:49:49 AM
to Miklos Vajna, pdfium
On Fri, Aug 14, 2020 at 12:43 AM Miklos Vajna <vmi...@vmiklos.hu> wrote:
> Things get more complicated with multiple signatures. Strictly speaking,
> a second signature invalidates the first one, since adding a 2nd
> signature makes the 1st one not cover the whole document anymore. In
> real life, this is too strict though, Acrobat doesn't enforce this,
> either.
>
> Using Acrobat, a user does not get a warning when a document is signed
> multiple times. It warns only if there are unsigned incremental updates
> between the signatures or after the last one. This ensures that no
> unwanted content is added after signing (without triggering a "signature
> is incomplete" warning), but multiple signatures keep working.
>
> Now the question: what API to expose in pdfium to help client code do a
> similar test?
>
> I would suggest that we expose a set of byte offsets, each denoting the
> end of the EOF token of a trailer ("end of incremental update" for any
> non-first part). This has the benefit that pdfium client code has the
> info it needs, and we can delegate the fine-grained policy to client
> code, especially as the PDF reference does not seem to specify this
> policy in an explicit way. The API would be a bit low-level, though.
>
> Would a gerrit change doing this be accepted?

Instead of looking for EOF tokens, would it work if there was an API
to get the range of the signature? e.g. If we have:

Sig1: /ByteRange [0 10 30 10]
Sig2: /ByteRange [40 10 60 10]

If the caller can see Sig1 has offset:length 10:20, and Sig2 has
offset:length 50:10, and the file is 70 bytes long, then is that
sufficient to check the entire file, sans signatures, is covered?
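
As a rough illustration of this idea, the sketch below (Python, with hypothetical helper names) derives the offset:length of each signature gap from its /ByteRange and then checks whether the signed segments plus the gaps account for every byte of the file. It uses the numbers exactly as written in the example above and is not a pdfium API.

```python
def signature_gap(byte_range):
    """Return (offset, length) of the hole between the two signed segments."""
    o1, l1, o2, _l2 = byte_range
    gap_offset = o1 + l1
    return gap_offset, o2 - gap_offset


def file_is_accounted_for(byte_ranges, file_size):
    """True if the signed segments plus signature gaps tile [0, file_size)."""
    intervals = []
    for o1, l1, o2, l2 in byte_ranges:
        gap_offset, gap_length = signature_gap((o1, l1, o2, l2))
        intervals += [
            (o1, o1 + l1),
            (gap_offset, gap_offset + gap_length),
            (o2, o2 + l2),
        ]
    position = 0
    for start, end in sorted(intervals):
        if start > position:
            return False  # bytes between position and start are not covered
        position = max(position, end)
    return position == file_size


sig1 = [0, 10, 30, 10]
sig2 = [40, 10, 60, 10]
print(signature_gap(sig1))                      # (10, 20)
print(signature_gap(sig2))                      # (50, 10)
print(file_is_accounted_for([sig1, sig2], 70))  # True
```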

Miklos Vajna

Aug 20, 2020, 4:27:19 PM
to Lei Zhang, pdfium
Hi Lei,

On Thu, Aug 20, 2020 at 01:49:36AM -0700, Lei Zhang <the...@chromium.org> wrote:
> Instead of looking for EOF tokens, would it work if there was an API
> to get the range of the signature? e.g. If we have:
>
> Sig1: /ByteRange [0 10 30 10]
> Sig2: /ByteRange [40 10 60 10]
>
> If the caller can see Sig1 has offset:length 10:20, and Sig2 has
> offset:length 50:10, and the file is 70 bytes long, then is that
> sufficient to check the entire file, sans signatures, is covered?

Almost. :-) I think the problem with this approach is that you can have
both incremental updates which add new signatures and also ones which
don't add one. Technically, the signatures are chained, so if there is
a signature which truly covers the whole document, then adding a second
one will naturally change the first signature to a partial one. As
mentioned, sadly I did not find any details regarding the handling of
this in the PDF reference, but I checked what Adobe Acrobat does. It
seems it still considers such a "first" signature as a complete
signature in case all later incremental updates introduce new
signatures.

In case it helps, I attach two examples:

1) unsigned-incremental-update.pdf is a case where a signature is
followed by an incremental update which does not add a signature, then
followed by the final incremental update which adds a second signature.
In this case, the above reasoning considers the first signature as
partial.

2) two-signatures.pdf is a normal case where the PDF file has two
signatures. Technically the first one no longer covers the whole file
once the second one is added. In this case, the above reasoning
considers the first signature as complete.

I'm not arguing this is how pdfium should verify signatures. I'm only
saying that the reasoning above seems reasonable, and that adding a
public API to pdfium to support this use-case makes sense.

But then I think it is necessary to expose the end position of EOF
tokens; that's how pdfium client code can differentiate between case 1)
and case 2). As far as I see, the API you propose doesn't solve this
particular problem. Or did I miss something?
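
To make the difference between case 1) and case 2) concrete, here is a minimal Python sketch of one possible client-side policy. It assumes the client has a list of end offsets, one per EOF token, and the /ByteRange of each signature. It further assumes that a signature's signed range ends exactly at the end offset of the update that added it; real files may need some tolerance for trailing whitespace. The name classify_signatures and all offsets are hypothetical, and whether this matches Acrobat in every case is not verified here.

```python
def classify_signatures(byte_ranges, eof_ends):
    """Label each signature 'complete' or 'partial'.

    A signature counts as complete if its signed range ends at an EOF
    offset and every later EOF offset is also the end of some signature's
    signed range (so every later update introduced a signature).
    """
    signed_ends = {o2 + l2 for _o1, _l1, o2, l2 in byte_ranges}
    results = []
    for _o1, _l1, o2, l2 in byte_ranges:
        end = o2 + l2
        later_updates = [e for e in eof_ends if e > end]
        ends_on_update = end in eof_ends
        all_later_signed = all(e in signed_ends for e in later_updates)
        ok = ends_on_update and all_later_signed
        results.append("complete" if ok else "partial")
    return results


# Case 2): two signatures, updates end at 40 and 70.
print(classify_signatures([[0, 10, 30, 10], [0, 50, 60, 10]], [40, 70]))
# Case 1): an unsigned update ending at 55 sits between the signatures.
print(classify_signatures([[0, 10, 30, 10], [0, 65, 75, 10]], [40, 55, 85]))
```

The first call yields complete for both signatures; the second marks the first signature as partial because the update ending at offset 55 is not the end of any signed range.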

Thanks,

Miklos
unsigned-inc-update.pdf
two-signatures.pdf

Miklos Vajna

Aug 27, 2020, 4:49:24 AM
to Lei Zhang, pdfium
Hi Lei, all,

On Thu, Aug 20, 2020 at 10:27:12PM +0200, Miklos Vajna <vmi...@vmiklos.hu> wrote:
> 1) unsigned-incremental-update.pdf is a case where a signature is
> followed by an incremental update which does not add a signature, then
> followed by the final incremental update which adds a second signature.
> In this case, the above reasoning considers the first signature as
> partial.
>
> 2) two-signatures.pdf is a normal case where the PDF file has two
> signatures. Technically the first one no longer covers the whole file
> once the second one is added. In this case, the above reasoning
> considers the first signature as complete.

Does this provide enough context to show why it seems useful to expose
info about incremental updates via public API?

If so, would it make sense to add API to expose the byte offsets of EOF
tokens?

Thanks,

Miklos

Miklos Vajna

Sep 3, 2020, 3:59:30 AM
to Lei Zhang, pdfium
Hi Lei, all,

On Thu, Aug 27, 2020 at 10:49:13AM +0200, Miklos Vajna <vmi...@vmiklos.hu> wrote:
> > 1) unsigned-incremental-update.pdf is a case where a signature is
> > followed by an incremental update which does not add a signature, then
> > followed by the final incremental update which adds a second signature.
> > In this case, the above reasoning considers the first signature as
> > partial.
> >
> > 2) two-signatures.pdf is a normal case where the PDF file has two
> > signatures. Technically the first one no longer covers the whole file
> > once the second one is added. In this case, the above reasoning
> > considers the first signature as complete.
>
> Does this provide enough context to show why it seems useful to expose
> info about incremental updates via public API?

Ping. :-) I would like to proceed with this one at some stage, but first
it would be nice to get an ACK here.

TL;DR: the proposal is to add public API which allows client code to
perform Acrobat-like signature verification (determine if a signature is
partial or not). Since the exact rules for determining the outcome are
not in the PDF reference, it seems wise not to hardcode them on the
pdfium side.

Thanks,

Miklos

Lei Zhang

Sep 3, 2020, 4:07:36 AM
to Miklos Vajna, pdfium
Pong. Sorry, it's been busy. I don't have anything of value to add to
the conversation at the moment.

Lei Zhang

Sep 19, 2020, 1:01:44 AM
to Miklos Vajna, pdfium
On Thu, Aug 20, 2020 at 1:27 PM Miklos Vajna <vmi...@vmiklos.hu> wrote:
>
> Hi Lei,
>
> On Thu, Aug 20, 2020 at 01:49:36AM -0700, Lei Zhang <the...@chromium.org> wrote:
> > Instead of looking for EOF tokens, would it work if there was an API
> > to get the range of the signature? e.g. If we have:
> >
> > Sig1: /ByteRange [0 10 30 10]
> > Sig2: /ByteRange [40 10 60 10]
> >
> > If the caller can see Sig1 has offset:length 10:20, and Sig2 has
> > offset:length 50:10, and the file is 70 bytes long, then is that
> > sufficient to check the entire file, sans signatures, is covered?
>
> Almost. :-) I think the problem with this approach is that you can have
> both incremental updates which add new signatures and also ones which
> don't add one. Technically, the signatures are chained, so if there is
> a signature which truly covers the whole document, then adding a second
> one will naturally change the first signature to a partial one. As
> mentioned, sadly I did not find any details regarding the handling of
> this in the PDF reference, but I checked what Adobe Acrobat does. It
> seems it still considers such a "first" signature as a complete
> signature in case all later incremental updates introduce new
> signatures.
>
> Perhaps it helps, I attach two examples:

Thanks for the examples. After looking at how they are structured and
how Acrobat reacts to them, it does look like one needs an API to
figure out what the incremental updates are. Otherwise, one cannot
detect the unsigned portion of unsigned-incremental-update.pdf.

What I got wrong in my earlier example of /ByteRange was what it
covers. Instead of:
Sig1: /ByteRange [0 10 30 10]
Sig2: /ByteRange [40 10 60 10]

it should have been:
Sig1: /ByteRange [0 10 30 10]
Sig2: /ByteRange [0 50 60 10]
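
A small Python sketch of what the corrected ranges imply, using hypothetical helper names: each signed range reaches from byte 0 to the end of its second segment, and the later signature signs all bytes of the earlier one's extent. This is only an illustration of the arithmetic, not a pdfium API.

```python
def signed_extent(byte_range):
    """End offset of the signed range, i.e. where the second segment stops."""
    _o1, _l1, o2, l2 = byte_range
    return o2 + l2


def later_signature_covers_earlier(earlier, later):
    """True if the first segment of `later` starts at byte 0 and spans
    every byte up to the end of the extent of `earlier`."""
    o1, l1, _o2, _l2 = later
    return o1 == 0 and o1 + l1 >= signed_extent(earlier)


sig1 = [0, 10, 30, 10]
sig2_earlier_idea = [40, 10, 60, 10]
sig2_corrected = [0, 50, 60, 10]

print(signed_extent(sig1))                                      # 40
print(signed_extent(sig2_corrected))                            # 70
print(later_signature_covers_earlier(sig1, sig2_earlier_idea))  # False
print(later_signature_covers_earlier(sig1, sig2_corrected))     # True
```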

Miklos Vajna

Sep 25, 2020, 3:10:01 AM
to pdfium
Hi,

Thanks for all the reviews. In total, 8 new APIs were added around
signatures, and I think pdfium now provides a good set of functions to
write client code to do decent signature verification.

I've put together an example/demo cmdline tool that uses these APIs:

https://github.com/vmiklos/odfsig/blob/pdfiumsig/pdfiumsig.cpp

This is mostly just manual integration testing, which complements the
in-tree unit testing. (It intentionally links to a crypto lib, so
probably adding it to pdfium.git is not a great idea.)

I'm not sure how useful it is; perhaps it would make sense to link to it
from the documentation, but possibly it's not that interesting.

Regards,

Miklos