Pending CT Log Disqualification: WoSign


Ryan Sleevi

Apr 1, 2016, 9:39:00 PM
to Certificate Transparency Policy
As part of the inclusion process for recognizing Certificate Transparency logs in Chrome, we have determined that the WoSign CT Log, https://ct.wosign.com , has had ongoing issues and no longer measures at 99% uptime. As the Chromium Certificate Transparency Log Policy states, 99% uptime is part of the initial and ongoing requirements that Log Operators are expected to abide by.

Because of this, the WoSign CT Log will not be included in Chrome. While SCTs from the WoSign CT Log may continue to be presented, they will not count towards the requirement of one non-Google log, and if embedded in certificates, they will not count towards the minimum number of SCTs required. All SCTs from the WoSign CT Log, past, present, and future, will not count towards the requirement that at least one SCT is from a log valid at the time of evaluation.

What does this mean for site operators?

This change should have no impact on your operations. As Chromium-based code did not yet trust the WoSign CT Log, this change in status should not affect any of your certificates or servers.

What does this mean for CAs?

If you are embedding SCTs in your certificates, SCTs from the WoSign CT Log will not count towards the minimum requirements. This is important to highlight, because as explained in the Chromium Certificate Transparency EV/CT plan, CAs may include SCTs within certificates from logs that are pending qualification, provided that all logs are accepted as qualified prior to the TLS handshake. Any certificates which relied upon a presumption of inclusion will find that, due to the disqualification, SCTs from the WoSign CT Log will not be counted as qualified at the time of the TLS handshake. As a result, any such certificates which fail to include a sufficient number of SCTs, not counting the WoSign CT Log's SCTs, will not be trusted in Chrome.
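To make the counting rule concrete, here is a rough sketch in Python of the check described above. The log names, the fixed minimum-SCT threshold, and the function itself are illustrative assumptions, not Chrome's actual implementation (which, among other things, varies the required number of embedded SCTs with certificate lifetime):

```python
# Illustrative sketch only: the names and the fixed threshold below are
# assumptions, not Chrome's real logic (which scales the required SCT
# count with certificate lifetime, among other rules).

DISQUALIFIED_LOGS = {"wosign"}  # SCTs from these logs no longer count
MIN_EMBEDDED_SCTS = 2           # assumed minimum for embedded SCTs

def meets_embedded_sct_minimum(sct_log_ids, qualified_logs):
    """Count only SCTs from logs still qualified at handshake time."""
    counted = [log for log in sct_log_ids
               if log in qualified_logs and log not in DISQUALIFIED_LOGS]
    return len(counted) >= MIN_EMBEDDED_SCTS
```

The point of the sketch is simply that disqualification removes a log from the counted set: a certificate that embedded one WoSign SCT alongside one other SCT would drop below a two-SCT minimum.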

liangdong

Apr 2, 2016, 12:10:14 AM
to Certificate Transparency Policy
Ryan Sleevi:
      I think there is some disagreement between us.

      I notice that most of the outages your monitoring system recorded for the WoSign log are timeouts, and many of them were not detected by our own monitoring system. I think this can be caused by different factors: different timeout parameters in the monitoring systems can produce different uptime figures, and a monitoring server deployed in a different location can also measure a different uptime. Our monitoring system is deployed in Los Angeles with a timeout parameter of 10 seconds; we probe our CT log every minute, and it did not raise many timeout alerts.

      Consider three consecutive connections to a log: the first returns data successfully, the second times out, and the third succeeds again. This is most likely caused by the network, which the log operator cannot control. I know the Certificate Transparency Log Policy counts network-level outages as outages, but I think charging them against the log is unfair. I also suggest it would be better to have a place showing every log's real-time outage status.

      I am not sure whether I have expressed my opinion clearly; if you need more information, please reply. Thanks!


On Saturday, April 2, 2016 at 9:39:00 AM UTC+8, Ryan Sleevi wrote:

Ryan Sleevi

Apr 2, 2016, 2:02:17 AM
to liangdong, Certificate Transparency Policy
I appreciate and understand your view. As noted in https://www.chromium.org/Home/chromium-security/certificate-transparency/log-policy , however, the policy clearly states that a Log must "Have 99% uptime, with no outage lasting longer than the MMD (as measured by Google)"

The policy also states that "After acceptance, Google will monitor the Log, including random compliance testing, prior to its inclusion within Chromium."

During the first attempt to monitor, this log failed to adhere to its published MMD, as captured in https://bugs.chromium.org/p/chromium/issues/detail?id=534745#c16 

Rather than disqualify the log from inclusion, we allowed the current log and key to reapply - https://bugs.chromium.org/p/chromium/issues/detail?id=534745#c17

However, during the second monitoring period, this log failed to adhere to the uptime requirement of 99%.

While this is an unfortunate event, there is nothing prohibiting you from using your current infrastructure and code and reapplying, after first updating the URL and key. This is important: clients need to be able to ensure that, if a log had issues, there was no opportunity for abuse or misuse, and monitors need to be able to rely on a consistent and distinct API endpoint for each log.

Given the multiple issues this log has had, both during the monitoring period and following its completion, we feel it appropriate to request that any further attempts use a new instance, to avoid any potential risk or ecosystem harm that might arise from continuing to reuse the existing log.

--
You received this message because you are subscribed to the Google Groups "Certificate Transparency Policy" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ct-policy+...@chromium.org.
To post to this group, send email to ct-p...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/ct-policy/24457bf0-94db-4ab9-9937-b0ed2ad4b8f9%40chromium.org.

liangdong

Apr 2, 2016, 3:29:50 AM
to Certificate Transparency Policy
Thanks for your explanation. We know that a Log must "Have 99% uptime, with no outage lasting longer than the MMD (as measured by Google)", and we admit that during the first monitoring attempt our log failed to adhere to its published MMD.

But after that we fixed our problem, reapplied, and were re-monitored for three months; I think that is no different from starting a new log.

During the second monitoring period, I believe an incorrect outage calculation method caused the uptime number in your monitoring system to fall below 99%. That is what I mean.

And I wonder what the definition of "Have 99% uptime" is: does it refer to the uptime over one day? One week? One month? Or from the initial inclusion request onward?


On Saturday, April 2, 2016 at 9:39:00 AM UTC+8, Ryan Sleevi wrote:
As part of the inclusion process for recognizing Certificate Transparency logs in Chrome, we have determined that the WoSign CT Log, https://ct.wosign.com , has had ongoing issues and no longer measures at 99% uptime. As the Chromium Certificate Transparency Log Policy states, 99% uptime is part of the initial and ongoing requirements that Log Operators are expected to abide by.

Ryan Sleevi

Apr 4, 2016, 3:07:51 PM
to liangdong, Certificate Transparency Policy
On Sat, Apr 2, 2016 at 12:29 AM, liangdong <liang...@gmail.com> wrote:
But after that we fixed our problem, reapplied, and were re-monitored for three months; I think that is no different from starting a new log.

As I explained, there are observable differences here, and that's why we have a policy that, after any disqualification, you need to rotate the log and the key; otherwise any SCTs issued during the "problem period" might be treated as valid once the log was finally accepted. Since the entire point of the policies is to ensure that all SCTs issued and trusted are produced while the log conforms, this would otherwise be problematic.

Ryan Sleevi

Apr 4, 2016, 3:09:53 PM
to liangdong, Certificate Transparency Policy
On Sat, Apr 2, 2016 at 12:29 AM, liangdong <liang...@gmail.com> wrote:
And I wonder what the definition of "Have 99% uptime" is: does it refer to the uptime over one day? One week? One month? Or from the initial inclusion request onward?

In this case, it was calculated over a rolling 90 day period, which is the present "compliance testing" period that we communicated on the bug that your log would be subject to. 
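As a rough illustration of what a rolling-window measurement looks like (my own sketch; not Google's monitoring code, and the probe format is an assumption):

```python
# Sketch of a rolling-window uptime calculation; the window length and
# the (timestamp, success) probe format are assumptions for illustration,
# not Google's actual monitoring pipeline.

def rolling_uptime(probes, window_days=90):
    """Uptime fraction over the most recent `window_days` of probes.

    `probes` is a list of (unix_timestamp, success_bool) tuples.
    """
    if not probes:
        return 0.0
    cutoff = max(t for t, _ in probes) - window_days * 86400
    window = [ok for t, ok in probes if t >= cutoff]
    return sum(window) / len(window)
```

A rolling window means a burst of timeouts keeps depressing the measured uptime until it ages out of the 90-day span, which matches the "ongoing requirement" framing above.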

Peter Bowen

Apr 4, 2016, 3:38:33 PM
to Ryan Sleevi, liangdong, Certificate Transparency Policy
Based on the reports posted, it looks like the requirement is two- or three-fold:
1) A DNS lookup for the CT log hostname returns an address record within $X seconds
2) A get-sth call to the returned IP must return a valid response within $Y seconds (presumably the address is randomly chosen from the results if multiple address records are returned)
3) The timestamp is no more than the MMD old

This is all tested every couple of minutes. Availability is defined as total successes divided by total tests. So over 90 days there are about sixty-five thousand data points gathered. If more than 648 of these are not a pass, then the log availability is below 99%.
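(A quick back-of-the-envelope check of those figures, assuming one probe every two minutes:)

```python
# Sanity-check the figures above: one probe every 2 minutes for 90 days.
samples = 90 * 24 * 60 // 2      # total probes over 90 days
failure_budget = samples // 100  # failures allowed while staying at 99%
print(samples, failure_budget)   # prints: 64800 648
```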

Does this sound right? If so, can you share the values for $X and $Y?

Thanks,
Peter

Ryan Sleevi

Apr 4, 2016, 3:49:53 PM
to Peter Bowen, Ryan Sleevi, liangdong, Certificate Transparency Policy
On Mon, Apr 4, 2016 at 12:38 PM, Peter Bowen <pzb...@gmail.com> wrote:
Based on the reports posted, it looks like the requirement is two- or three-fold:
1) A DNS lookup for the CT log hostname returns an address record within $X seconds
2) A get-sth call to the returned IP must return a valid response within $Y seconds (presumably the address is randomly chosen from the results if multiple address records are returned)
3) The timestamp is no more than the MMD old

This is all tested every couple of minutes.  Availability is defined as total successes divided by total tests.  So over 90 days there are about sixty-five thousand data points gathered.  If more than 648 of these are not a pass, then the log availability is below 99%.

Does this sound right?  If so, can you share the values for $X and $Y?

TL;DR: For now, the values for $X and $Y are not shared, and they're not necessarily the variables being quantified here.

Longer explanation:
For example, we make no guarantee or statement that it will be a "get-sth" call; certainly, in order to test MMDs, there's a need for an /add-chain or /add-pre-chain call, to test the time from that call returning an SCT to the entry being integrated within the MMD, and there need to be /get-sth-consistency and /get-proof-by-hash calls to verify the append-only property and correctness of the API. Similarly, /get-entries is needed for mirroring/replication, and /get-roots to ensure that the root policies are being followed.
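For a concrete sense of what one availability probe against the standard RFC 6962 endpoints might look like, here is a minimal sketch; the 10-second timeout and 24-hour MMD are assumed values for illustration, not the monitor's actual parameters:

```python
import json
import time
import urllib.request

# Sketch of one monitoring probe against the RFC 6962 get-sth endpoint.
# The 10-second timeout and 24-hour MMD below are assumptions for
# illustration, not the values Google's monitor actually uses.

def probe_get_sth(log_url, timeout=10, mmd_seconds=24 * 3600):
    """Return True if the log serves a fresh STH within the timeout."""
    try:
        with urllib.request.urlopen(log_url.rstrip("/") + "/ct/v1/get-sth",
                                    timeout=timeout) as resp:
            sth = json.load(resp)
        age = time.time() - sth["timestamp"] / 1000.0  # STH timestamp is ms
        return age <= mmd_seconds
    except Exception:
        return False  # DNS failure, timeout, HTTP error, malformed JSON
```

Any single failure mode (slow DNS, a timeout, a stale STH) counts the probe as a miss, which is why timeout parameters and monitor location can shift the measured uptime, as discussed earlier in the thread.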

There are clearly opportunities for improvement here, and I'll see about getting a separate thread started for that, after gathering some feedback internally and externally on some of the points raised. There are certainly benefits to explaining more of the criteria, but there's also the risk, of course, of over-specifying the measurements in a way that may permit a log to act against the community's interests.

We've so far tried to take a balanced approach to the level of specificity. That has not always worked out well (for example, the criteria for 'independence' turned out to be much more difficult in practice for CAs, contrary to our expectations), and we've definitely improved over time.