Can the Merkle tree ever be trimmed?

88 views
Skip to first unread message

Matt Palmer

unread,
Jul 8, 2014, 6:44:35 AM7/8/14
to certificate-...@googlegroups.com
Howdy everyone,

I'm pondering the resource requirements of running a log server. One of the
fairly obvious ones involved is disk space. From a naive perspective, the
log of certificates can never be trimmed -- everything builds on everything
else. Despite the (apparently) never-ending increase in the size of storage
(and corresponding decrease in size), it still seems like a bad idea to
commit to running what is, essentially, a never-ending append-only log file.

What I'd like to confirm is a suspicion I've developed, that the tree can,
in fact, be "pruned" to keep disk space usage under control. Essentially,
my hypothesis is that once the entire contents of a subtree have expired
(where the expiry of a node in the tree is defined as MAX(expiry_child1,
expiry_child2)) only the head node of that subtree needs to be kept. I'm
basing that on the supposition that nobody's going to care about
certificates that have expired, so they don't need to be kept.

The thing that hangs this is the current way that a new monitor "comes
online" (RFC6962, Section 5.3). The need to get every certificate in
the history makes it impossible to trim the tree. Similarly, though, it
increases the barrier to entry of new monitors over time, as the volume of
certificate data that needs to be downloaded to initialise the monitor isn't
bound by the set of currently valid certificates, but rather by the number
of certificates *ever* issued. Perhaps the assumption is that storage and
bandwidth growth will outpace this growth, but it makes me nervous.

The best answer I've got so far is for a new MerkleLeafType to be defined
(intermediate_hash, perhaps?) which represents the hash of the head of the
expired subtree. Then when verifying the STH, the monitor uses the
intermediate hashes for the elided subtrees rather than calculating the
entire subtree by hand. It might (will?) also be necessary to store the
range of entries covered by the intermediate node.

I can't think of any *new* attacks this opens up -- but then, anyone can
design a cryptosystem they themselves can't defeat. If nobody's watching a
log, the log can play silly-buggers with the leaf nodes and recalculate the
tree, but that's not new -- the reliability of the entire CT scheme depends
on monitors watching over time to make sure no log pulls a swiftie.

I suppose the ultimate feasability of this scheme depends on whether or not
it is acceptable for a monitor/auditor to respond to a request to validate
an SCT with "I have no idea what you're talking about; I don't suppose that
certificate has expired, has it?" -- in much the same way that an OCSP
responder can say "all is well", "Danger Will Robinson!" or "lolidunno". If
an auditor needs to be able to say, for now and evermore, whether or not an
SCT is valid, then I guess I'll just start stocking up on hard drives...

Thanks,
- Matt


Ben Laurie

unread,
Jul 19, 2014, 7:39:26 AM7/19/14
to certificate-...@googlegroups.com
Apologies again for missing this.

On 8 July 2014 11:44, Matt Palmer <mpa...@hezmatt.org> wrote:
> Howdy everyone,
>
> I'm pondering the resource requirements of running a log server. One of the
> fairly obvious ones involved is disk space. From a naive perspective, the
> log of certificates can never be trimmed -- everything builds on everything
> else. Despite the (apparently) never-ending increase in the size of storage
> (and corresponding decrease in size), it still seems like a bad idea to
> commit to running what is, essentially, a never-ending append-only log file.
>
> What I'd like to confirm is a suspicion I've developed, that the tree can,
> in fact, be "pruned" to keep disk space usage under control. Essentially,
> my hypothesis is that once the entire contents of a subtree have expired
> (where the expiry of a node in the tree is defined as MAX(expiry_child1,
> expiry_child2)) only the head node of that subtree needs to be kept. I'm
> basing that on the supposition that nobody's going to care about
> certificates that have expired, so they don't need to be kept.

Right. This is my thinking, too, though obviously nothing has been formalised.

> The thing that hangs this is the current way that a new monitor "comes
> online" (RFC6962, Section 5.3). The need to get every certificate in
> the history makes it impossible to trim the tree. Similarly, though, it
> increases the barrier to entry of new monitors over time, as the volume of
> certificate data that needs to be downloaded to initialise the monitor isn't
> bound by the set of currently valid certificates, but rather by the number
> of certificates *ever* issued. Perhaps the assumption is that storage and
> bandwidth growth will outpace this growth, but it makes me nervous.
>
> The best answer I've got so far is for a new MerkleLeafType to be defined
> (intermediate_hash, perhaps?) which represents the hash of the head of the
> expired subtree. Then when verifying the STH, the monitor uses the
> intermediate hashes for the elided subtrees rather than calculating the
> entire subtree by hand. It might (will?) also be necessary to store the
> range of entries covered by the intermediate node.

That's one answer.

A more sledgehammer approach that works now would be to simply stop
updating the log at some point and bring a new log into existence.
When all the certs in the old log have expired, remove it from the
trusted log list and turn it down.

> I can't think of any *new* attacks this opens up -- but then, anyone can
> design a cryptosystem they themselves can't defeat. If nobody's watching a
> log, the log can play silly-buggers with the leaf nodes and recalculate the
> tree, but that's not new -- the reliability of the entire CT scheme depends
> on monitors watching over time to make sure no log pulls a swiftie.
>
> I suppose the ultimate feasability of this scheme depends on whether or not
> it is acceptable for a monitor/auditor to respond to a request to validate
> an SCT with "I have no idea what you're talking about; I don't suppose that
> certificate has expired, has it?" -- in much the same way that an OCSP
> responder can say "all is well", "Danger Will Robinson!" or "lolidunno". If
> an auditor needs to be able to say, for now and evermore, whether or not an
> SCT is valid, then I guess I'll just start stocking up on hard drives...

If a client is prepared to decline expired certificates it has no need
to check their validity in a log. Conversely, if it is prepared to
accept them, it probably doesn't care enough about security to check
their validity :-)

Matt Palmer

unread,
Jul 19, 2014, 8:52:51 PM7/19/14
to 'Ben Laurie' via certificate-transparency
On Sat, Jul 19, 2014 at 12:39:25PM +0100, 'Ben Laurie' via certificate-transparency wrote:
> On 8 July 2014 11:44, Matt Palmer <mpa...@hezmatt.org> wrote:
> > I'm pondering the resource requirements of running a log server. One of the
> > fairly obvious ones involved is disk space. From a naive perspective, the
> > log of certificates can never be trimmed -- everything builds on everything
> > else. Despite the (apparently) never-ending increase in the size of storage
> > (and corresponding decrease in size), it still seems like a bad idea to
> > commit to running what is, essentially, a never-ending append-only log file.
> >
> > What I'd like to confirm is a suspicion I've developed, that the tree can,
> > in fact, be "pruned" to keep disk space usage under control. Essentially,
> > my hypothesis is that once the entire contents of a subtree have expired
> > (where the expiry of a node in the tree is defined as MAX(expiry_child1,
> > expiry_child2)) only the head node of that subtree needs to be kept. I'm
> > basing that on the supposition that nobody's going to care about
> > certificates that have expired, so they don't need to be kept.
>
> Right. This is my thinking, too, though obviously nothing has been formalised.

Would you like me to put together something for the trans list to chew over?
It seems like the sort of thing that would be worth getting into the spec
from early on, so everyone's used to it from day one.

> > The thing that hangs this is the current way that a new monitor "comes
> > online" (RFC6962, Section 5.3). The need to get every certificate in
> > the history makes it impossible to trim the tree. Similarly, though, it
> > increases the barrier to entry of new monitors over time, as the volume of
> > certificate data that needs to be downloaded to initialise the monitor isn't
> > bound by the set of currently valid certificates, but rather by the number
> > of certificates *ever* issued. Perhaps the assumption is that storage and
> > bandwidth growth will outpace this growth, but it makes me nervous.
> >
> > The best answer I've got so far is for a new MerkleLeafType to be defined
> > (intermediate_hash, perhaps?) which represents the hash of the head of the
> > expired subtree. Then when verifying the STH, the monitor uses the
> > intermediate hashes for the elided subtrees rather than calculating the
> > entire subtree by hand. It might (will?) also be necessary to store the
> > range of entries covered by the intermediate node.
>
> That's one answer.
>
> A more sledgehammer approach that works now would be to simply stop
> updating the log at some point and bring a new log into existence.
> When all the certs in the old log have expired, remove it from the
> trusted log list and turn it down.

I agree with you for now that rotating logs is the low-effort approach.
However, I'm pretty sure that changing the set of trusted logs will
eventually come to be quite the involved operation, not dissimilar to the
work involved in root CA manipulation. Since it won't be done *often*,
it'll end up being a pest to do, so let's avoid needing to do it more than
is absolutely necessary.

> If a client is prepared to decline expired certificates it has no need
> to check their validity in a log. Conversely, if it is prepared to
> accept them, it probably doesn't care enough about security to check
> their validity :-)

I like that description of the situation. <grin>

- Matt

--
Of course, I made the mistake of showing [a demo application] off to my boss,
who showed it off to his boss, and suddenly I couldn't reboot my desktop box
without getting a change control approved.
-- Derick Siddoway, in a place that doesn't exist

Ben Laurie

unread,
Jul 20, 2014, 10:09:44 AM7/20/14
to certificate-...@googlegroups.com
On 20 July 2014 01:52, Matt Palmer <mpa...@hezmatt.org> wrote:
> On Sat, Jul 19, 2014 at 12:39:25PM +0100, 'Ben Laurie' via certificate-transparency wrote:
>> On 8 July 2014 11:44, Matt Palmer <mpa...@hezmatt.org> wrote:
>> > I'm pondering the resource requirements of running a log server. One of the
>> > fairly obvious ones involved is disk space. From a naive perspective, the
>> > log of certificates can never be trimmed -- everything builds on everything
>> > else. Despite the (apparently) never-ending increase in the size of storage
>> > (and corresponding decrease in size), it still seems like a bad idea to
>> > commit to running what is, essentially, a never-ending append-only log file.
>> >
>> > What I'd like to confirm is a suspicion I've developed, that the tree can,
>> > in fact, be "pruned" to keep disk space usage under control. Essentially,
>> > my hypothesis is that once the entire contents of a subtree have expired
>> > (where the expiry of a node in the tree is defined as MAX(expiry_child1,
>> > expiry_child2)) only the head node of that subtree needs to be kept. I'm
>> > basing that on the supposition that nobody's going to care about
>> > certificates that have expired, so they don't need to be kept.
>>
>> Right. This is my thinking, too, though obviously nothing has been formalised.
>
> Would you like me to put together something for the trans list to chew over?
> It seems like the sort of thing that would be worth getting into the spec
> from early on, so everyone's used to it from day one.

Sure.
I really hope changing logs is not hard, or we are back into "too big
to fail" territory.

>
>> If a client is prepared to decline expired certificates it has no need
>> to check their validity in a log. Conversely, if it is prepared to
>> accept them, it probably doesn't care enough about security to check
>> their validity :-)
>
> I like that description of the situation. <grin>
>
> - Matt
>
> --
> Of course, I made the mistake of showing [a demo application] off to my boss,
> who showed it off to his boss, and suddenly I couldn't reboot my desktop box
> without getting a change control approved.
> -- Derick Siddoway, in a place that doesn't exist
>
> --
> You received this message because you are subscribed to the Google Groups "certificate-transparency" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to certificate-transp...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

Matt Palmer

unread,
Jul 20, 2014, 7:06:57 PM7/20/14
to 'Ben Laurie' via certificate-transparency
On Sun, Jul 20, 2014 at 03:09:44PM +0100, 'Ben Laurie' via certificate-transparency wrote:
> On 20 July 2014 01:52, Matt Palmer <mpa...@hezmatt.org> wrote:
> > On Sat, Jul 19, 2014 at 12:39:25PM +0100, 'Ben Laurie' via certificate-transparency wrote:
> >> A more sledgehammer approach that works now would be to simply stop
> >> updating the log at some point and bring a new log into existence.
> >> When all the certs in the old log have expired, remove it from the
> >> trusted log list and turn it down.
> >
> > I agree with you for now that rotating logs is the low-effort approach.
> > However, I'm pretty sure that changing the set of trusted logs will
> > eventually come to be quite the involved operation, not dissimilar to the
> > work involved in root CA manipulation. Since it won't be done *often*,
> > it'll end up being a pest to do, so let's avoid needing to do it more than
> > is absolutely necessary.
>
> I really hope changing logs is not hard, or we are back into "too big
> to fail" territory.

What makes CAs too-big-to-fail is that removing one from the trust store for
being a screwup impacts *so* many end-users, and the many and varied trust
stores out there don't co-ordinate their actions (hence no browser is
willing to make the move unilaterally, for market share reasons). You've
sidestepped the problem nicely with CT, I think, by requiring certificates
to have N SCTs (based on issuance period), while also allowing post-creation
SCTs issued by other logs to be presented, if the worst does happen and all
the logs which issued the in-cert SCTs get removed. Since removing a log
from the trusted set *shouldn't* have a dramatic impact on end users,
browsers won't feel the same conflicting priorities.

The job that *is* hard is an orderly transition of logs. This will require
communication with everyone who relies on that log in some way -- and not
just browsers (although even communicating effectively with all the browsers
can be an effort, apparently). Notifying all the CAs that submit
certificates to the log, as well as every site that decides to submit their
own certs (if you don't think those log addresses aren't going to be
hard-coded into server configurations across the Internet, you're
excessively optimistic). Monitors, auditors, and random curious people are
all going to have to be notified *somehow* of a change.

If there was a protocol mechanism standardised for "log rollover", including
indicating where the "authorised successor" was located, would help
somewhat, but would by no means solve the problem completely. Having
/add-{pre-,}chain return 301 and a Location, /get-proof-by-hash return 303,
and other endpoints like /get-sth and /get-entries include an optional
"successor" field would help the situation. I have my doubts that all
clients would correctly handle all those corner cases, especially when such
rollover information probably won't be seen "in the wild" (and therefore the
bugs won't be noticed) for quite some time.

- Matt

--
Ideas are like rabbits. You get a couple and learn how to handle
them, and pretty soon you have a dozen.
-- John Steinbeck

Eran Messeri

unread,
Jul 21, 2014, 5:37:25 AM7/21/14
to certificate-...@googlegroups.com
Let's see if I can summarize the discussion correctly:
The problem is the unbounded space requirements for running a log.
Seems like there's agreement that pruning trees which contain only expired certificates is desirable.
One proposed solution is to build into the tree structure a way to prune an existing tree so that only the hash of the expired subtree is stored.
Another possible solution is log roll-over: Stop logging in a log that contains mostly old certs, add a new log, turn down the old log when it only contains expired certificates.

I favour the second solution (log roll-over) because:
* Building in a way to prune sub-trees would add complexity to the log implementation and may add a new class of attacks. I'll happily be convinced this is not a concern if someone analyses this proposal properly..
* Log roll-over can be done more gradually than removing root certificate from the trust store. It can be done over the span of a long time. Additionally, rather than making log clients understand HTTP redirect, log submitters can be made to understand redirects. 
* Another thing which is necessary regardless of this problem, is the ability to "finalize" a log: Mark that a log is not to be trusted anymore past a given STH (in case of a log compromise, for example). Having this mechanism in-place will also ease log roll-over (since you could keep a log alive but prevent new submissions into it using this mechanism, before turning it completely off).
* IMHO keeping a read-only log up is much easier than keeping a live log up, since there is no longer a requirement to maintain data integrity of submitted certificates, so read-only logs can be served from "cheaper" storage.

Eran



Matt Palmer

unread,
Jul 21, 2014, 10:02:17 PM7/21/14
to 'Eran Messeri' via certificate-transparency
Hi Eran,

On Mon, Jul 21, 2014 at 10:37:24AM +0100, 'Eran Messeri' via certificate-transparency wrote:
> The problem is the unbounded space requirements for running a log.

In a nutshell, yes. More precisely, the likely geometric increase in log
size due to the need to keep all data forever.

> Seems like there's agreement that pruning trees which contain only expired
> certificates is desirable.

Indeed.

> One proposed solution is to build into the tree structure a way to prune an
> existing tree so that only the hash of the expired subtree is stored.

Correct.

> Another possible solution is log roll-over: Stop logging in a log that
> contains mostly old certs, add a new log, turn down the old log when it
> only contains expired certificates.

I think that's a reasonable summary.

> I favour the second solution (log roll-over) because:
> * Building in a way to prune sub-trees would add complexity to the log
> implementation

I'd debate the assertion that it necessarily *must* increase implementation
complexity. I first thought of the idea of log trimming during my log
server implementation simply because I noticed that it would be utterly
trivial to add such a feature. Other implementations may find it difficult
to implement, but that shouldn't be a determining factor.

> and may add a new class of attacks.

There is definitely a possibility that trimming a tree *may* introduce a way
to attack the integrity of a merkle tree -- I couldn't find much useful
literature on the subject of trimming merkle trees. I can't find any
problems with it, but I make no claims of being competent to analyse
anything even vaguely cryptographically-related.

> I'll happily be
> convinced this is not a concern if someone analyses this proposal properly..

There's two concerns here -- implementation complexity and attacks. I
assume you're referring to the attack concern here. Can you give some
indication of what a "proper analysis" would look like?

> * Log roll-over can be done more gradually than removing root certificate
> from the trust store. It can be done over the span of a long time.

Increasing the span of time over which a log is in the process of "being
retired" doesn't really help anyone -- it increases the amount of time that
the old log has to be kept online (because the log entry with the last
expiry determines how long the log has to survive for, and any additional
entry will likely have a longer expiry than what's already in the log).

> Additionally, rather than making log clients understand HTTP redirect, log
> submitters can be made to understand redirects.

I place "log submitters" in the "log clients" set. Also, the set of log
submitters will likely be large and diverse -- potentially more diverse than
the set of log verifiers. Log submitters will be all CAs (both for precerts
and OCSP response-embedded SCTs), but also any number of server admins who
wish to obtain SCTs for submission via TLS extension. Log verifiers will,
for the most part, be browsers.

> * Another thing which is necessary regardless of this problem, is the
> ability to "finalize" a log: Mark that a log is not to be trusted anymore
> past a given STH (in case of a log compromise, for example). Having this
> mechanism in-place will also ease log roll-over (since you could keep a log
> alive but prevent new submissions into it using this mechanism, before
> turning it completely off).

* If a log is untrustworthy, it needs to be switched off (or removed from
the trust store of log verifiers), not marked read-only. If a log has
been compromised, I don't want to have to trust *any* of it (how do I know
the problem really started with the entry the log is asserting is the
first untrustworthy one?).

* To make the finalization "provably secure", you'd need to add it to the
log itself. However, that's a separate entry type as well, and monitors
would need to know to verify that the "finalized" log entry is the last
one in the log. More complexity.

* Stopping a log from being able to accept new submissions is easy -- just
return a 403 to all POSTs. As I understand it, a log submitter isn't
going to check /get-sth to see if the log is "alive" before posting an
entry. As I understand it, that was something that was raised on the
trans list (although it was in the context of hitting /get-roots before
/get-sth) as something that log submitters (or at least one of them)
explicitly *didn't* want to have to do.

* "Finalizing" a log isn't going to stop monitors, et al from needing to
check the log regularly. Since the finalization data is under the control
of the log operator, they could (concievably) "unlock" the log and add
entries, if they were malicious (or came under the control of a malicious
party).

> * IMHO keeping a read-only log up is much easier than keeping a live log
> up, since there is no longer a requirement to maintain data integrity of
> submitted certificates, so read-only logs can be served from "cheaper"
> storage.

*All* of the data served up by the resources accessed via GET can be served
from static files, whether or not you're running a read-only log or a
read-write log. The use of query params makes it non-trivial, but it is
possible. You only need to rewrite a subset of the files when you're doing
a log rebuild, too. I'm pretty sure I could write an entire log server
using "make" and a small CGI for /add-{pre-,}chain -- or at least "rake".

/me adds that to the "rainy day" todo list

For a read-only log, you still need that periodic run, too. The STH must be
regenerated within the MMD. This means that the private key for the log
must be kept available, and some sort of code execution environment. Sure,
you only have to recalculate a signature over a fixed hash, but you had to
calculate all the tree hashes previously, and anyone who isn't caching all
those tree hashes for a log in regular operation is going to have a sad, sad
time of it anyway.

All in all, I don't see how making a log read-only buys you any practical
benefit from an operational point of view, except for the fact that you can
turn it off a few years later. The practical issues of notifying everyone
of a log's pending retirement are still an issue, too.

- Matt

Reply all
Reply to author
Forward
0 new messages