Firstly, apologies for the delay in sending this out. We decided to do a more thorough impact analysis and breakdown, and that took some extra time.
On October 24th 2018, all Trillian-based Google CT logs experienced an approximately 40-minute impact on availability. This occurred from roughly 04:22 to 05:03 Pacific Time (11:22 to 12:03 GMT), and affected the argon20XX and xenon20XX logs, as well as the solera20XX and crucible test logs.
Within this time there was a shorter 18-minute window in which 3653 requests in total (from 157 unique IPv4/v6 addresses) received 502 HTTP response codes. Successful requests, totalling 35153 during this 18-minute window, may have experienced much higher and more variable latency than normal. Over the full ~40-minute impact, 131240 requests (from 224 unique IP addresses) were subject to potentially higher latencies.
The root cause was an unexpected behavioural change in a network library that we depend on for routing external requests to our servers. The result was that the servers began rejecting all inbound traffic. Automatic checks on the new release binary gave different results on several runs. We believe this was due to differing traffic patterns at the time and because internal traffic bypassed the failure. The result was that the new release was briefly set live before being manually rolled back to the previous version.
Summary timeline of events:
04:22 PT OUTAGE BEGINS - Rollout of new release begins
04:28 PT DETECTION TIME - First warning bug received for raised error rates
04:33 PT ESCALATION TIME - First page for raised error rates
04:34 PT Rollout of new release reaches approx. 90% complete
04:34 PT Rollout aborted
04:36 PT (approx.) On-call requests immediate rollback
04:39 PT Rollback process is initiated to restore previous release
04:45 PT Rollback begins to take effect
05:03 PT OUTAGE ENDS - Rollback complete
As of November 5th 2018, the lowest 90-day availability among the argon20XX logs is 99.9907%, for argon2018.
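For a rough sense of scale, here is the arithmetic relating an uptime percentage to a downtime budget over a 90-day window. Note this assumes wall-clock accounting; log uptime is typically measured from request/prober success ratios, so these figures are only an approximation.

```python
# Relate an uptime percentage to downtime minutes over a 90-day window.
# Assumes simple wall-clock accounting, which is an approximation:
# real log uptime is usually computed from request success ratios.

MINUTES_PER_90_DAYS = 90 * 24 * 60  # 129600

def downtime_minutes(uptime_pct: float, window_minutes: int = MINUTES_PER_90_DAYS) -> float:
    """Downtime implied by an uptime percentage over the window."""
    return window_minutes * (1 - uptime_pct / 100)

def uptime_pct(downtime_min: float, window_minutes: int = MINUTES_PER_90_DAYS) -> float:
    """Uptime percentage implied by total downtime over the window."""
    return 100 * (1 - downtime_min / window_minutes)

print(round(downtime_minutes(99.9907), 1))  # ~12.1 minutes of budget consumed
print(round(uptime_pct(41), 3))             # ~99.968% if a full 41-minute outage counted
```

The gap between ~99.968% (if the whole outage counted as downtime) and the reported 99.9907% reflects that most requests during the window still succeeded.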
The following table shows the number of 502s we served for each affected log endpoint. A few malformed requests have been omitted.

Endpoint                               502s Returned
/logs/argon2017/ct/v1                              1
/logs/argon2018/ct/v1/get-roots                    1
/logs/argon2019/ct/v1/add-chain                    1
/logs/argon2021/ct/v1/add-chain                    1
/logs/argon2021/ct/v1/get-roots                    1
/logs/solera2018/ct/v1/get-entries                 1
/logs/solera2019/ct/v1/get-entries                 1
/logs/solera2021/ct/v1/get-entries                 1
/logs/xenon2019/ct/v1/add-pre-chain                1
/logs/xenon2020/ct/v1/get-roots                    1
/logs/xenon2021/ct/v1/add-pre-chain                1
/logs/xenon2022/ct/v1/get-entries                  1
/logs/xenon2018/ct/v1/get-entries                  2
/logs/xenon2019/ct/v1/get-entries                  2
/logs/xenon2021/ct/v1/get-entries                  3
/logs/xenon2020/ct/v1/get-entries                  4
/logs/argon2018/ct/v1/add-pre-chain                5
/logs/argon2021/ct/v1/get-entries                  8
/logs/argon2019/ct/v1/add-pre-chain               22
/logs/argon2021/ct/v1/add-pre-chain               25
/logs/argon2020/ct/v1/get-entries                 36
/logs/solera2021/ct/v1/get-sth                    68
/logs/solera2018/ct/v1/get-sth                    69
/logs/solera2019/ct/v1/get-sth                    69
/logs/solera2020/ct/v1/get-sth                    70
/logs/solera2022/ct/v1/get-sth                    73
/logs/xenon2021/ct/v1/get-sth                     87
/logs/xenon2020/ct/v1/get-sth                     88
/logs/argon2022/ct/v1/get-sth                     89
/logs/crucible/ct/v1/get-sth                      91
/logs/xenon2019/ct/v1/get-sth                     98
/logs/xenon2018/ct/v1/get-sth                    113
/logs/argon2017/ct/v1/get-sth                    115
/logs/xenon2022/ct/v1/get-sth                    115
/logs/argon2019/ct/v1/get-entries                137
/logs/argon2019/ct/v1/get-sth                    207
/logs/argon2018/ct/v1/get-sth                    212
/logs/argon2020/ct/v1/get-sth                    216
/logs/argon2021/ct/v1/get-sth                    231
/logs/argon2020/ct/v1/add-pre-chain              394
/logs/argon2018/ct/v1/get-entries                523
Total                                           3184
We apologize for this interruption to serving and will be introducing additional deployment checks and monitoring to guard against a future recurrence.
Martin
Google CT Team
You received this message because you are subscribed to the Google Groups "certificate-transparency" group.
To unsubscribe from this group and stop receiving emails from it, send an email to certificate-transp...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/certificate-transparency/CAK76_KVJO4U6y2ax1_F4tzkPGOD8nN%2B1oqZkKUsYRd1RX7u%2B%2BQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
Maybe I missed it, but was there a more detailed Postmortem so the community can understand the root causes and mitigation?
I ask because Google has plans [1] to take their older non-sharded CT logs down in the May-August timeframe next year. Having all logs based on Trillian and managed within the same infrastructure, release process, DNS management, DoS protections, etc. can result in a higher probability of an outage across all Google CT logs. While any other CT log operator can go down with little ecosystem impact, this is not the case for Google CT logs (CAs are obligated to include at least one Google SCT). Has this risk been adequately addressed?
The more recent outage [2] due to "Preloader Induced DoS Defense Mode" makes me even more concerned about a successful DoS that results in disabling global SSL issuance. Perhaps it's time to consider changing the Google CT policy to permit issuance of certificates without a Google SCT?
OK, in addition, below is a summary of what I presented at the Policy Days. We can't go into much more detail as the problems occurred in code that's not open source.

Martin

More Details

We share common networking infrastructure with most Google services. This is managed for us and contains a lot of moving parts; we normally don't worry much about it. Requests arrive from the Internet and are routed through this infrastructure via internal networks to our servers.

Our release process is fully automated and consists of multiple stages. Continuously generated release candidates must progress through the stages to become live releases. At each stage a combination of tests is run, together with evaluations of the servers' behaviour, including comparisons to previous versions. If any test or evaluation fails, the candidate is blocked and not released.

Servers are typically built in layers. In our case, requests pass through an Application Framework layer (shared code), then our interceptor, which performs rate limiting and other common functionality, before reaching our actual HTTP request handler.

A bug was introduced into the Application Framework library that made all external requests arriving at our servers incorrectly fail internal ACL checks. These requests never reached our handler or interceptor code, so they did not appear in our error metrics. This caused no unit test failures, as it was outside their scope. Integration tests also did not catch the problem, because the traffic involved was internal and did not trigger the ACL failure, which required interactions with other networking components.

The nature of this failure prevented the errors from being visible to the release evaluation because they were not recorded in metrics. Additionally, other requests were being submitted directly to the servers from our internal systems, all of which succeeded.
This meant that if the release evaluation occurred at a time when a large number of internal requests were happening, everything seemed good and the evaluation passed.

Consequently, the release briefly made it live in production. Once it was deployed, probers began accumulating errors and the edge network -> server error ratio began to increase as the faulty binary rolled out to more locations. This triggered a number of alerts. The on-call was able to rapidly correlate the error onset with the beginning of the rollout and requested an immediate rollback.

Once the rollback was complete and the previous version was redeployed everywhere, the error metrics returned to normal and the observed impact dropped to zero.

Our primary follow-up actions will be to ensure that our canary environment is tested via the external request processing path, and to improve the release evaluation process at the canary stage so that it assesses live traffic.
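The canary gap described above can be made concrete with a minimal sketch. This is not Google's actual release tooling, and all names here are hypothetical; it just illustrates comparing error ratios on the external request path and requiring a minimum volume of external traffic before a canary can be judged healthy.

```python
# Illustrative sketch (not Google's real release system) of a canary
# check over the *external* request path. Hypothetical names throughout.

from dataclasses import dataclass

@dataclass
class PathStats:
    requests: int
    errors: int

    @property
    def error_ratio(self) -> float:
        return self.errors / self.requests if self.requests else 0.0

def canary_passes(baseline: PathStats, canary: PathStats,
                  max_ratio_increase: float = 0.01,
                  min_requests: int = 1000) -> bool:
    """Pass only if the canary saw enough external traffic and its
    error ratio is within tolerance of the production baseline.
    The min_requests floor guards against the failure mode above,
    where internal-only traffic made a broken binary look healthy."""
    if canary.requests < min_requests:
        return False  # not enough external traffic to judge
    return canary.error_ratio <= baseline.error_ratio + max_ratio_increase

# A binary that rejects all external requests fails the check:
baseline = PathStats(requests=50_000, errors=25)
broken_canary = PathStats(requests=5_000, errors=5_000)
print(canary_passes(baseline, broken_canary))  # False
```

The key design point is that health is evaluated against traffic that actually exercised the external ACL path, so a failure invisible to server-side metrics still shows up as missing or failing external requests.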
On Tue, Dec 11, 2018 at 9:42 AM Doug Beattie <douglas...@gmail.com> wrote:

> Maybe I missed it, but was there a more detailed Postmortem so the
> community can understand the root causes and mitigation?

There was, actually, but it looks like Martin wasn't a member of ct-policy@ and thus it wasn't archived. This is, in general, an indictment against cross-posting.

> I ask because Google has plans [1] to take their older non-sharded CT
> logs down in the May-August timeframe next year. Having all logs based
> on Trillian and managed within the same infrastructure, release
> process, DNS management, DoS protections, etc. can result in a higher
> probability of an outage across all Google CT logs. While any other CT
> log operator can go down with little ecosystem impact, this is not the
> case for Google CT logs (CAs are obligated to include at least one
> Google SCT). Has this risk been adequately addressed?

I think one common thread of these post-mortems is that CAs can be taking steps to reduce any impact, and that CAs that have taken such steps have seen minimal impact. This has also been a recurring theme of CT Policy Days.

> The more recent outage [2] due to "Preloader Induced DoS Defense Mode"
> makes me even more concerned about a successful DoS that results in
> disabling global SSL issuance. Perhaps it's time to consider changing
> the Google CT policy to permit issuance of certificates without a
> Google SCT?

From that post-mortem, it also appears that CAs which took steps to diversify their logging saw limited impact (perhaps none, in some cases), while others exhibited certain pessimistic behaviours that have been called out as problematic in past discussions.
Could you share what data - whether from the post-mortem or from CA operations - you think leads to that conclusion? It seems like having actionable and concrete data, which these post-mortems provide, allows a bit more discussion and evaluation.
This time copying the list.

---------- Forwarded message ---------
From: Doug Beattie <douglas...@gmail.com>
Date: Fri, Dec 14, 2018 at 8:21 AM
Subject: Re: [ct-policy] Re: Google CT Log Outage Postmortem For Oct 24 2018
To: <rsl...@chromium.org>

Ryan,

I think you missed the point. Mandating that all SSL certificates use SCTs from at least one Google log is an unacceptable risk, imo, especially when those logs are managed within the same infrastructure. I'd prefer to see a requirement for more SCTs without any from Google over the current Google CT policy that requires at least one Google SCT. For example:

< 15 months
Current: 2 (one Google and one non-Google)
Proposed: 2 (one Google and one non-Google), or 3 (from non-Google CT logs from at least 2 different operators)

>= 15, <= 27 months
Current: 3 (at least one Google and one non-Google)
Proposed: 3 (at least one Google and one non-Google), or 4 (from non-Google logs from at least 2 different operators)

Is anyone else concerned about this single point of failure?
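For concreteness, the current-vs-proposed comparison in the message above could be sketched as follows. The function names are hypothetical, only the two lifetime brackets quoted in the message are modelled, and operator counting is simplified relative to the actual Chrome CT policy.

```python
# Sketch of the two policy variants described above, for embedded SCTs.
# Hypothetical helper names; only the <15 and 15-27 month brackets from
# the message are modelled, and real policy has additional detail.

def satisfies_current(lifetime_months, google_scts, non_google_scts):
    """Current rule per the message: at least one Google and one
    non-Google SCT, with a lifetime-dependent total."""
    total = google_scts + non_google_scts
    required = 2 if lifetime_months < 15 else 3
    return google_scts >= 1 and non_google_scts >= 1 and total >= required

def satisfies_proposed(lifetime_months, google_scts, non_google_operators):
    """Proposal: either meet the current rule, or present one extra SCT,
    all from non-Google logs run by >= 2 distinct operators.
    non_google_operators maps operator name -> SCT count."""
    non_google_scts = sum(non_google_operators.values())
    if satisfies_current(lifetime_months, google_scts, non_google_scts):
        return True
    required = 3 if lifetime_months < 15 else 4
    return (google_scts == 0 and non_google_scts >= required
            and len(non_google_operators) >= 2)

# A 12-month cert with 3 SCTs from two non-Google operators:
print(satisfies_current(12, 0, 3))                      # False
print(satisfies_proposed(12, 0, {"opA": 2, "opB": 1}))  # True
```

The sketch shows the crux of the proposal: a certificate with no Google SCT at all can still qualify, provided it carries an extra SCT and operator diversity among non-Google logs.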
Argon2019 had a backlog of more than 1000 when I checked just now: https://crt.sh/monitored-logs. Will the 2 Google logs have the capacity and bandwidth to support the growing certificate issuance requirements by providing SCTs in a timely manner?
Doug,
There’s a lot going on in your message, so I’m going to try to reframe it a little, so we don’t get lost in the replies. I’ll try to work from the ‘easiest’ stuff and then onto the more nuanced stuff.
I’m glad to see other members of the community have already highlighted the misunderstanding about what crt.sh was reflecting. To recap:
Those metrics have nothing to do with the delivery of SCTs, but of monitored entries. Only the entities logging can provide data about their experiences obtaining SCTs, using either the /ct/v1/add-chain API or the /ct/v1/add-pre-chain API. In my previous message, I tried to capture this, by explicitly asking GlobalSign to share any data that it can, relevant to this discussion.
crt.sh doesn’t allow inferring bandwidth or capacity from those measurements; if anything, the greater bandwidth and capacity of those logs causes monitors, such as crt.sh, to fall behind more quickly, because of how rapidly those logs are able to integrate entries.
The offer still stands: if there is data that GlobalSign has regarding challenges with Logs, Google or otherwise, then we really would love to have that data made available, and that’s exactly the kind of discussion we’d love to see more of on ct-policy@. As we’ve shared before, so much of the policy and approach rests on the sharing of information to help inform and craft better policies. We do our best with the data we have, holistically, but GlobalSign can make a real impact by sharing such data. For example, discussions about the logs you use, the latency you see, the error rates, etc, all help both the community and Log Operators better set expectations.
Next, on the topic of log diversity, I’m having a bit of trouble. From looking at https://ct.cloudflare.com/ , it looks like GlobalSign is perhaps leading the pack of CAs in terms of Log diversity and distribution. From your reply, it sounds like neither issue caused any discernible or meaningful disruption for GlobalSign, is that correct? If so, that highlights the point I was trying to capture - that some CAs were impacted by these issues is no doubt tied to decisions that those CAs made with regards to where and how they log, and as a consequence, they faced greater disruption. While it’s certainly a goal to make sure no Log has disruption, I don’t think we can lay the blame for any and all CA disruptions at the feet of the Logs or Log Operators given the current state here. At the same time, it’s somewhat telling that it appears there’s no use of Argon in the mix, or of any other Trillian-based log. It may be an artifact of Cloudflare’s dashboard - I admit, I haven’t run the hard numbers myself for GlobalSign - but it seems odd to be concerned about Google Log outages or performance when it appears, on cursory glance, that steps aren’t being taken to mitigate those risks through diversity.
This is key to understanding what, I think, is the most nuanced and complicated of your points: the assessment of risk. It seems your focus on risk is, quite understandably, the risk on the issuance side of the pipeline, rather than on the relying party side. Broadly speaking, it seems the risk you’re most concerned about is a global, simultaneous, persistent Google Log outage. Without wanting to put words in your mouth, the impression I got is that you see that risk manifesting in several ways:
Infrastructure homogeneity, such as using common Google infrastructure (front-ends, DoS capabilities) or networking
Codebase homogeneity, such as the alignment on Trillian
The raw number of Google logs, independent of those first two concerns
Operational issues, such as the release management process and deployment
I hope that’s a fair presentation/restatement of the concerns, and that it’s not unreasonable to suggest that your primary concern is regarding the issuance of certificates.
You both propose and ask about a possible solution, such as substituting the Google Log requirement for something else. Given the set of concerns, it’s not unreasonable to see that as a possible solution. Unfortunately, I think it overlooks a number of practical limitations that have been previously discussed, while also overlooking some of the other risk factors that are a part of the calculus.
To be clear and up-front: Our goal is not to keep the Google Log requirement indefinitely. Since the beginning - quite literally in those first public versions of policy - we’ve been wanting to move to an ecosystem that is wholly independent of Google. But as I hope to show, there are real and practical challenges with that, which are still in the process of being addressed, and that the value being provided far, far exceeds the risk, even in consideration of these sorts of incidents.
Since it’s the oldest issue, the one I’ll tackle first is the question about independence. You’ve posed a question about “operators”, but if you recall, early versions of the policy included similar language, focused on “infrastructure or administrative access”. You may recall, from that thread, our disagreement about whether or not diverse logs was a “security” requirement, and I tried to explain and document why it was a fundamentally critical requirement, and Ben explained the security risk introduced by SCTs in the first place, which was a design concession to CA concerns.
Over time, this evolved into the current requirement for One Google Log, as captured in this thread from May 2015. Hopefully, that thread captures some of the reasons. There’s also this thread, from February 2017 following CT Policy Days, that captures more of the challenges and risks in quantifying some of the diversity requirements. We’re not alone in facing these challenges - you can see Gerv struggling to pin down a good solution for Mozilla, knowing these challenges.
These risks aren’t purely theory and armchair quarterbacking. We’ve seen them play out in the ecosystem already. The question of infrastructure independence has come up with Log Operators, both in the context of deploying to cloud providers and in outages caused by both cloud providers and local infrastructure. We’ve seen a greater coalescing around implementations onto Trillian - which is positive for the ecosystem in some ways, particularly scalability, but understandably carries the negatives of single-system risk, whether Google-operated or otherwise. As we’ve moved to require CT for all certificates, we know that there are real incentives for CAs to ‘hide’ certificates by colluding with Logs. We’ve seen multiple Log operators combine - sometimes publicly, as was the case with DigiCert and Symantec, sometimes privately, as was the case with StartCom and WoTrus/WoSign. We’ve seen Log operators issue SCTs and fail to incorporate them. The point is that all of these examples show that the risks you are seemingly concerned about, with the One Google requirement, are not meaningfully addressed by sprinkling in more SCTs or trying to pin down diversity.
This opens up the bigger issue, though, which is the question about “Why One Google in the first place?”. It’s not just that defining diversity is hard, and it’s not about purely best practice either. As I alluded to earlier, the risks being mitigated here are not only those risks to CA issuance.
One aspect of this policy is about risk management for Chrome users and certificate subscribers. If Chrome is going to require CT for certificates, as it does, then it’s important to take reasonable steps to mitigate the risk that such a requirement would cause certificates to stop working for site operators and users. If all of the SCTs within a certificate are from Logs that are disqualified, that once-working certificate will no longer work. Some of that risk is mitigated by the number of SCTs required, but the long lifetime of certificates, and the unfortunate and wholly avoidable challenges with replacing certificates, means that risk very much has to be considered. By not only operating a Log, but requiring the Log, we are better able to ensure that certificate holders will not find their certificate rendered unusable for Chrome users. This assumption rests on the belief that Google Logs are more resilient and scalable, and less likely to experience critical, DQ-worthy failure. To be clear, it’s not that we would not disqualify a Google Log if necessary, but it’s a variable that we can control and invest in - and have, rather significantly - in the furtherance of greater transparency.
I mentioned this in the very first thread we had on the matter, when similarly talking about the “Too Big to Fail” scenario. While counter-intuitive to suggest that the critical requirement on a Google Log mitigates a single-point-of-failure, when you think about the threat model that users and site operators face, rather than that of CAs, the requirement for a Google Log prevents a third-party from being critical to Chrome users’ security and the reliability of sites for Chrome users. For example, imagine a Log providing a known-hostile, split-view set of certificates. The clear action to take is to disqualify the Log. In the world you proposed, this would run the risk that such a Log would be a “load bearing” Log, and the consequence is that disqualifying such a Log could render millions of sites inoperable. As I called out those several years ago, that’s a very similar story to where we see ourselves with CAs today, and as recent challenges have shown, doing the right thing isn’t always the easy thing, and we should avoid introducing such issues in new systems.
However, the single largest reason for the ongoing one-Google requirement is the lack of a deployed SCT checking mechanism. As I mentioned, and Ben captured in some of those threads, SCTs were introduced as a concession for CAs concerned about the performance and time-to-issuance. As reasonable as those decisions may have been, given the facts that were available, it introduced a new challenge: A need for clients to check SCTs as part of ensuring that a Log is behaving correctly. If SCTs are not checked, then a Log can provide split views or hide certificates, and as a consequence, become highly-trusted and critical to Internet security. To be clear: In a world of SCTs, the choices are either between Logs being Trusted (much in the way that CAs are) or to deploy consistent verification of SCTs, either during or post-validation, to ensure meaningful detection of Log malfeasance.
CT’s key advance has been that it IS possible to cryptographically verify, detect, and prove Log shenanigans, and that’s a significant advancement from the Web PKI’s hierarchical model of trust. Unfortunately, one of the key challenges has been balancing the privacy needs of users and the operational deployment challenges at Internet scale, and that takes time and is critical to “get right”. You can see we’ve been exploring this in the context of the DNS-based proof delivery mechanism, you can see explorations of this in the IETF TRANS WG’s work on gossip, and you can see this consideration factoring in heavily into some of the changes of RFC 6962-bis.
Absent that, however, the choices are either to introduce a Trusted Log or to trust all logs equally. Trusting all logs equally is not an easy decision - beyond all of the above concerns elaborated on, it also introduces a host of new questions, such as “Why do I trust this Log”, and the perennial favorite question of my WebTrust colleagues, “Should Logs be audited?” All of the cryptographic verifiability would be unnecessary in such a system, while simultaneously, real concerns about “Should Logs be operated by CAs” would be introduced. Two Logs, for example, could collude or be compromised in such a way as to introduce significant risk to the ecosystem, and that’s both a real concern and one that has, unfortunately, historically been validated as legitimate within the CA ecosystem.
The alternative is what we’ve pursued, for this interim period, which is that of a Trusted Log. In this regard, the Logs that Google operates serve a critical security purpose for Chrome users and site operators: They ensure that any certificate that will be accepted by Chrome will be shared, by Google, through its Logs. Users and relying parties can inspect Google Logs to see what Chrome trusts. If you do not trust Google Logs to be honest, then you similarly should not trust Chrome, to verify SCTs or certificates or select which CAs to trust or to run native code, since this is all rooted in Google. This is certainly an imperfect solution, but it’s one that is intended to be temporary, as both client implementations and Log operators improve and grow.
While we’ve continued to make progress on developing the necessary tools and solutions to address this SCT-checking challenge, I can understand if there’s some frustration that it’s not here now. You’re not alone, if that is the case. Our focus and priority has been on improving the Log compliance and monitoring side, because we totally understand that it is an area that can very much impact CAs in the issuance side, and those are areas where CAs are feeling the most pain right now.
This is part of the holistic calculus to risk we’re taking - considering users and site operators. We understand and acknowledge that there is a risk that if all the Google Logs encountered a simultaneous, global, consistent outage, CAs would face challenges in issuing certs. On the whole, however, Google can take steps to mitigate those risks, and similarly encourage CAs to take appropriate steps to do the same. Sharing data along with the concerns can help improve the dialog and highlight if our calculus is off, and we’re always open to better understand. That said, the risks to users is of paramount importance to us, and to a lesser extent but still greater than that of CAs, the risks to site operators. We want to make sure that we’re meaningfully mitigating those risks as best as possible, while the ecosystem continues to grow and improve.
It may be that the Log risk mitigation takes more time than expected, and we’ve discussed what that may mean in past CT Policy Days events. As we’ve shared in the past, other solutions in this space might mean formalizing the notion of Trusted Logs. As mentioned, the notion of a Trusted Log is about ensuring that relying parties and site operators have the risks of CA collusion mitigated, so there would understandably be significant challenges to identify those criteria.
I realize this is a very long email, which is a product of how long this discussion has gone on, and of us not really having put it out there holistically in written form. I realize this doesn’t enumerate a specific and concrete set of timelines and steps for reducing the Google Log requirement, but rather principles - getting to those discrete steps is something Devon and I continue to work on. As you can see from the concerns, though, it’s very much a fluid thing - pulling in one area has an impact on another, so we want to make sure we’re thoughtfully balancing things and developing concrete, actionable, and meaningful solutions. Getting rid of the One Google requirement is not merely aspirational; it is a concrete goal of ours, but we’re balancing the steps to get there against the steps necessary to make sure the ecosystem is robust and growing.
That said, I wanted to acknowledge one more part of your message: the risk of both Google Trillian Logs going down at the same time. It sounds like you’re unhappy with the postmortems they’ve provided, and may see additional architectural risks not being addressed or acknowledged. On that front, I want to encourage you to push them for more details and ask more questions that can help you build that confidence. Despite the above “Trusted Log” discussion, we on the Chrome side intentionally and deliberately try to keep a wall between us and the CT team, to make sure that we’re holding all Logs to the same set of standards and expectations. As long as the “Trusted” status remains, it’s not unreasonable to hold the Google Logs to an even higher standard.
On Wed, Dec 19, 2018 at 03:58:30PM -0500, Ryan Sleevi wrote:
>
> Next, on the topic of log diversity, I’m having a bit of trouble. From
> looking at https://ct.cloudflare.com/ , it looks like GlobalSign is perhaps
> leading the pack of CAs in terms of Log diversity and distribution. From
> your reply, it sounds like neither issue caused any discernible or
> meaningful disruption for GlobalSign, is that correct? If so, that
> highlights the point I was trying to capture - that some CAs were impacted
> by these issues is no doubt tied to decisions that those CAs made with
> regards to where and how they log, and as a consequence, faced greater
> disruption.
As long as a Google log is required, no amount of diversity helps.
If all Google logs are down, no certificates can be issued.

What the Cloudflare page shows is that all CAs mentioned there
spread their precertificates over at least 2 Google logs. Sectigo
seems to be the only one that clearly favours one Google log over
the others.

But what it doesn't show is whether this spread happens all the
time, or whether they just switched from one log to another. It's
about all the certificates they issued, not just those from the
past month or so.

Note that spreading the load over multiple Google logs can also be
a problem if one of the logs has an issue. CAs might switch to
only using the other log, and the shift in load might trigger the
DoS protection.

I think many of the other points you're talking about are really
about trusting the logs. Logs are not supposed to be trusted. We
should have technical measures to make sure they are working
properly, but this all still needs to be implemented. I think if
we ever come to a situation where trust in a log is no longer a
requirement, the requirement for a Google log can also go away.