Regarding CA requirements as to technical infrastructure utilized in automated domain validations, etc. (if any)


Matthew Hardeman

Jul 17, 2017, 6:08:14 PM7/17/17
to mozilla-dev-s...@lists.mozilla.org
Hi all,

I was just reading through the Baseline Requirements -- specifically 3.2.2.4 and its children -- and noted that while there are particular standards as to the blessed methods of validating authority & control for domain names (and host names within domain names), nothing is specified regarding the technical requirements for the infrastructure and procedures used in performing this validation. Instead, simple protocol names are called out as the method (validation over HTTP/HTTPS, or establishment of a TXT record in the DNS). Nothing more specific is noted.

My own background is originally in software development, with an emphasis on network applications. Additionally, I've been involved in quite a number of small / medium regional ISP interconnection matters over the years. I'm extremely familiar with the various mechanisms of ISP-to-ISP connectivity: purchase from a transit ISP, direct private peering over an independent physical link, or connectivity over IXP switched infrastructure (whether via private VLAN, a private BGP session over switched Ethernet at the IXP, or via IXP route servers), or any combination of these (very common).

It has occurred to me that a small certificate authority might plausibly have their principal operations infrastructure at a single data center. Even in instances where multiple ISPs provide access to this CA, they will almost inevitably be pulled from a single data center or cluster of physically close data centers. Quite frequently, those ISPs will perform regional peering between each other at one of a small number of data centers in the geographic region.

Presumably, best practice for a DNS challenge currently involves:

1. The CA and the authentication client negotiate what DNS record actually needs to be created (a TXT record with a certain name, or similar).
2. The client creates the record and, if necessary, allows it to propagate through their or their provider's infrastructure.
3. The client signals the CA that it is ready for the validation test.
4. The CA presumably uses a smart DNS resolver to resolve (with DNSSEC as far as possible) from the IANA root through the TLD name servers to determine the authoritative name servers for the zone in question.
5. With the authoritative DNS servers now known, the CA infrastructure queries one or more of those authoritative servers directly to get the result. Cache TTL is presumably zero or near zero.
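Steps 4 and 5 can be sketched roughly as follows. This is an illustrative model only: the `delegations` data and the injected `query_txt` callable stand in for real DNS traffic, and all names here are mine, not from any CA's actual implementation.

```python
def find_authoritatives(name, delegations):
    """Return the name servers for the most specific zone enclosing `name`.

    `delegations` maps zone names (e.g. "example.com.") to NS lists -- a
    stand-in for following referrals downward from the IANA root.
    """
    labels = name.rstrip(".").split(".")
    # Try the longest enclosing zone first:
    # "_acme-challenge.example.com." -> "example.com." -> "com."
    for i in range(len(labels)):
        zone = ".".join(labels[i:]) + "."
        if zone in delegations:
            return delegations[zone]
    return delegations["."]  # fall back to the root servers

def validate_txt(name, expected, delegations, query_txt):
    """Query the authoritative servers directly (no recursive cache) and
    require every server that answered to agree on the expected value."""
    servers = find_authoritatives(name, delegations)
    answers = [query_txt(server, name) for server in servers]
    got_any = any(a is not None for a in answers)
    return got_any and all(a == expected for a in answers if a is not None)
```

The key property of step 5 is visible here: the validator consults the authoritative servers it discovered itself, never a shared recursive cache.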

In actuality, if that is "best practice", it falls short of handling certain network interconnection / interception attacks which could definitely be mitigated significantly, though imperfectly and at some cost.

The trouble is that, for many domains served by independent DNS infrastructure, you might only need to "steal" routing for a small network (say, a /23) for a very brief period -- and only at the major interconnection hub nearest the CA's infrastructure -- to briefly hijack the DNS queries from the CA infrastructure to the authoritative DNS servers for the registered domain. If you know, or can closely control, when the validation test will run to within even minutes, it's quite possible the "route leak" wouldn't be noticed.

I should note that it is similarly possible to leak such an advertisement to hijack an HTTP test rather than a DNS test.

While it will probably not be possible to guarantee that the route to infrastructure whose control a CA wishes to test cannot be hijacked at all, there are definitely ways to greatly reduce the risk and significantly curb the number of organizations well positioned to execute such an attack.

Questions I pose:

1. Should specific procedures as to how one _correctly_, per best practice, validates effective control of a file served by the web server, or correctly validates a DNS challenge, be part of the baseline requirements and/or root program requirements?

2. What specific elements would strike the right balance between the need for security and the cost of implementation?

3. Are CAs today already contemplating this? I note that recent code commits in Let's Encrypt's Boulder CA include the notion of remotely reached validation agents, coordination of the responses those agents received, and rules for quorate interpretation of the results of dispersed validators. I cannot imagine that this work occurred in a vacuum, or without some thought as to the kinds of risks I am speaking of.

Even if we stop short of specifying the kinds of disparate networks and locations that CA validation infrastructure should measure validations from, there are other questions that I think are appropriate to discuss:

For example, 3.2.2.4.6 mentions validation via HTTP or HTTPS access to an FQDN at a given blessed path. It never says how to fetch that, or how to look up where to fetch it from. It may be tempting to say "In the absence of other guidance, behave like a browser." I believe, however, that this would be an error. A browser would accept non-standard ports; we should probably only allow 80 or 443. A browser wouldn't load over HTTPS if the current certificate were untrusted; that is presumably irrelevant to the validation check, so the validator should probably ignore the certificate. An HSTS preload might well be incorporated into a browser, but should probably be ignored by the validator.
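The non-browser-like client policy argued for here might be sketched as below. The function name is hypothetical; only the port rule is enforceable on the URL itself, so the certificate and HSTS points appear as comments on the fetching side.

```python
from urllib.parse import urlsplit

def acceptable_validation_url(url):
    """Accept only http/https challenge URLs on the default ports."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return False
    # Default ports only: a browser would happily fetch http://host:8080/,
    # but a validator should not.
    if parts.scheme == "http" and parts.port not in (None, 80):
        return False
    if parts.scheme == "https" and parts.port not in (None, 443):
        return False
    return True

# When actually fetching an https:// URL, the validator would additionally
# disable certificate trust checks (the applicant may be replacing an
# untrusted or expired certificate) and ignore any HSTS preload entry for
# the host -- both deliberate departures from browser behavior.
```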

At the network and interconnection layer, I think there are significant opportunities for a bad actor to compromise domain (and email, etc.) validation in ways that parties not intimately familiar with how service providers interconnect and route between themselves could fail to even minimally mitigate.

If I am correctly reading between the lines in commit messages and capabilities being built into Let's Encrypt's Boulder CA software, it would appear that there are others concerned about the limitations inherent in single point of origination DNS queries being relied upon for validation purposes. If that is the case, what is the appropriate forum to discuss the risks and potential mitigations? If there is reasonable consensus as to those, what is the proper place to lobby for adoption of standards?

Thanks,

Matt Hardeman

Jakob Bohm

Jul 18, 2017, 2:45:01 AM7/18/17
to mozilla-dev-s...@lists.mozilla.org
Many of the concerns you list below are already covered in different
ways.

1. I believe (though others may know better) that the high general
requirements for the security of CA systems also apply to the
systems performing the validation procedures in question.

2. For all DV (Domain Validated) certificate validation methods, it is
basically accepted that if an attacker can hijack access to a domain
for the duration of the validation, then that attacker can fool even
the most secure CA into giving the attacker a DV certificate.
This is because the problem is fundamentally unsolvable.

3. The location from which to fetch the confirmation file for HTTP based
validation is generally dictated by the CA, not the applicant. So one
CA might require the file to be at
"http://www.example.com/check1234.html", another might require it to
be at "http://www.example.com/.well-known/check5678.txt" and so on.
One of the numerous issues that led to WoSign becoming distrusted
was that they allowed the applicant to specify the port, leading to
multiple unauthorized certificates being issued, some of which were
not revoked when they were told about it!

4. Exact variations within the 10 permitted domain validation methods
are very much up to the ingenuity of the CA doing the work. For
example, the advanced secure checks developed by "Let's Encrypt" are
technically just particularly good variations of some of these 10 methods.
Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded

Nick Lamb

Jul 18, 2017, 5:12:59 AM7/18/17
to mozilla-dev-s...@lists.mozilla.org
On Tuesday, 18 July 2017 07:45:01 UTC+1, Jakob Bohm wrote:
> 1. I believe (though others may know better) that the high general
> requirements for the security of CA systems also apply to the
> systems performing the validation procedures in question.

Yes, however I don't think Matthew's concern was about systems owned by the CA but rather systems proximate to them in the network. For example, if the CA purchases Internet service from a single local Internet Service Provider, the BRs obviously don't require that this ISP have all of a CA's security procedures in place. That ISP would be able to act as a MitM for the entire rest of the Internet, yet isn't subject to the BRs, so this might happen at the whim of a 17-year-old unpaid intern let loose with the routing tables.


> 2. For all DV (Domain Validated) certificate validation methods, it is
> basically accepted that if an attacker can hijack access to a domain
> for the duration of the validation, then that attacker can fool even
> the most secure CA into giving the attacker a DV certificate.
> This is because the problem is fundamentally unsolvable.

Only some of the 10 Blessed Methods involve the actual network. Domain Authorization Documents would get the job done and needn't travel over the network. If your main threat is from network-based adversaries, such documents are an admirable choice to prevent that.

[Aside: Did the CA/B really still not manage to pass a resolution fixing the list of Blessed Methods all these months later? I guess Mozilla's intervention here was more necessary than I'd appreciated]

Where a domain has enabled DNSSEC, it is possible for the CA to rely upon DNSSEC to prevent tampering with records for that domain. So that secures DNS-based validations. We can argue about whether the DNSSEC cryptography would withstand attack by a resourceful adversary, but it certainly raises the bar very considerably compared to just fiddling with a routing table.

Unlike a typical end user, the CA is certainly in a position to implement DNSSEC validation in its DNS resolver correctly and to reject attempts to validate control which run into problems with DNS server correctness. I know that Let's Encrypt does this, and judging from their user forums a small but noticeable fraction of applicants run into problems because their DNS server is crap and replies SERVFAIL (or times out) for legal DNS queries.

There is doubtless a strong temptation for commercial reasons for a CA to ignore such problems and press on with the compromised validation, but the BRs don't require that, and it would not be unreasonable to "level the playing field" by updating them, or Mozilla's programme requirements, to demand the CA reject validation when an applicant's DNS servers won't answer correctly.
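The "reject rather than press on" policy suggested here amounts to a small fail-closed decision rule. The status strings below are illustrative stand-ins for what a real validating resolver would report; the function name is mine.

```python
def dns_result_usable(rcode, zone_is_signed, dnssec_validated):
    """Decide whether a DNS answer may be used for domain validation.

    Fails closed: any resolution error, or a DNSSEC validation failure
    on a signed zone, aborts the validation rather than being ignored.
    """
    if rcode != "NOERROR":              # SERVFAIL, REFUSED, timeout, ...
        return False                    # reject: don't retry around brokenness
    if zone_is_signed and not dnssec_validated:
        return False                    # bogus DNSSEC must not fail open
    return True
```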

> 3. The location from which to fetch the confirmation file for HTTP based
> validation is generally dictated by the CA, not the applicant.

The Blessed Methods specifically call out a sub-path of the IETF's reserved /.well-known/ URLs for this purpose. ACME has its own path, which being IETF-standardized will be suitable as well (the Blessed Methods say you can use a different path if it's on IANA's list and IETF standardization includes adding things to IANA's list as an automatic step), but unless somebody else in the industry has an IETF standards track protocol under the radar those are the only two valid choices under the Blessed Methods.

There definitely are lots of non-Blessed Methods approaches deployed when I last looked (and so perhaps still today) which use other paths, but you're correct that they're usually chosen by the CA. This is always going to be more dangerous than letting the IETF control it, so the Blessed Methods are making a good change here.

Matthew Hardeman

Jul 18, 2017, 12:51:30 PM7/18/17
to mozilla-dev-s...@lists.mozilla.org
> Yes, however I don't think Matthew's concern was about systems owned by the CA but rather systems proximate to them in the network. For example if the CA purchases Internet service from a single local Internet Service Provider, the BRs obviously don't require that this ISP have all the security procedures in place of a CA. That ISP would be able to act as a MitM for the entire rest of the Internet, and isn't subject to the BRs so that this might happen at the whim of a 17-year old unpaid intern let loose with the routing tables.

You are correct as to my concern, except insofar as it is more insidious than even that. It is not only a question of trusting your ISP: the CA's ISP need do nothing wrong. Another ISP trusted by your ISP could be the vector for injection of a quite temporary and very narrowly scoped route hijack. Furthermore, it can absolutely be done even if the CA's ISP's primary IP transit provider purchases transit from another ISP (this is quite common) which in turn trusts other peers.

For example, I myself manage a network interconnected to the Digital Realty / Telx ATL1 TIE (Telx Internet Exchange). Across that exchange, I have (for example) a peering session with Hurricane Electric. I have no doubt I could leak a prefix briefly to HE that would get picked up. Another ISP who uses HE as a primary transit path would almost certainly accept the advertisement from HE, and that traffic would flow my way. For many ISPs the scope of this would be limited to the southeast USA in my example, but assuming that I were targeting a CA in the southeast USA, that would be a bonus -- it would severely limit the number of worldwide eyes who might notice my brief hijacking activity. If I wanted to target a west coast CA in the Bay Area, Seattle, or the LA area, I would just need to be one out of a universe of hundreds of well-peered network participants on the prevailing IXP at San Francisco / Palo Alto, the Seattle Westin building, or CoreSite's One Wilshire, respectively.

> Only some of the 10 Blessed Methods involve the actual network. Domain Authorization Documents would get the job done and needn't travel over the network. If your main threat is from network-based adversaries, such documents are an admirable choice to prevent that.

Of course, but the real threat one faces is what other CAs will accept as proof, not what one would wish that other CAs accept as proof. CAA obviously does a great deal to help here, especially the combination of CAA with DNSSEC.

The broader point I wish to make is that much can be done to improve the strength of the subset of the 10 methods which rely solely on network-dependent automated validation methodologies. The upside would be a significant, demonstrable increase in the difficulty for even well-placed ISP admins to compromise a compliant CA's validation processes. The downside would be increases in cost and complexity borne by the compliant CA.

> [Aside: Did the CA/B really still not manage to pass a resolution fixing the list of Blessed Methods all these months later? I guess Mozilla's intervention here was more necessary than I'd appreciated]

I noticed that too. I assume it is still tied up in IPR hell?

> Where a domain has enabled DNSSEC, it is possible for the CA to rely upon DNSSEC to prevent tampering with records for that domain. So that secures DNS-based validations. We can argue about whether the DNSSEC cryptography would withstand attack by a resourceful adversary, but it certainly raises the bar very considerably compared to just fiddling with a routing table.

This does greatly enhance the defensive capability for a given domain.

> Unlike a typical end user, the CA is certainly in a position to implement DNSSEC validation in its DNS resolver correctly and to reject attempts to validate control which run into problems with DNS server correctness. I know that Let's Encrypt does this, and judging from their user forums a small but noticeable fraction of applicants run into problems because their DNS server is crap and replies SERVFAIL (or times out) for legal DNS queries.

Agreed. At least let any tax related to implementation of DNSSEC fall where it is due -- upon the party that incorrectly implemented it.


> There is doubtless a strong temptation for commercial reasons for a CA to ignore such problems and press on with the compromised validation, but the BRs don't require that, and it would not be unreasonable to "level the playing field" by updating them, or Mozilla's programme requirements, to demand the CA reject validation when an applicant's DNS servers won't answer correctly.

I would advocate a level playing field here. This would have the bonus upside of helping to fix bad DNSSEC deployments. If broken DNSSEC broke the ability to get a certificate anywhere, the incorrect deployment would likely be rolled back in the worst case, or fixed in the best.

>
> > 3. The location from which to fetch the confirmation file for HTTP based
> > validation is generally dictated by the CA, not the applicant.
>
> The Blessed Methods specifically call out a sub-path of the IETF's reserved /.well-known/ URLs for this purpose. ACME has its own path, which being IETF-standardized will be suitable as well (the Blessed Methods say you can use a different path if it's on IANA's list and IETF standardization includes adding things to IANA's list as an automatic step), but unless somebody else in the industry has an IETF standards track protocol under the radar those are the only two valid choices under the Blessed Methods.
>
> There definitely are lots of non-Blessed Methods approaches deployed when I last looked (and so perhaps still today) which use other paths, but you're correct that they're usually chosen by the CA. This is always going to be more dangerous than letting the IETF control it, so the Blessed Methods are making a good change here.

I was terribly unclear as to my meaning here, and I apologize. I was not speaking of the URL path segment at all. I was speaking of the physical and logical points at which the CA's validation systems interconnect with the Internet -- the points from which the CA's validation queries originate toward the target resources to be validated.

I believe there would be a massive improvement in the security of DNS query and HTTP client fetch type validations if the CA were required to execute multiple queries (ideally at least 3 or 4), sourced from different physical locations (with substantial network and geographic distance between them), with each location utilizing significantly different internet interconnection providers.

Despite the fact that this would massively increase the burden of quietly and momentarily hijacking DNS server IPs in order to trick a CA, I believe there is presently no commercial impetus or advantage for a CA in the marketplace to implement such security measures. On that basis, I raise the question of whether it is appropriate to begin a dialogue on what rules or requirements should perhaps be imposed upon CAs to combat this risk.

Gervase Markham

Jul 20, 2017, 10:39:40 AM7/20/17
to mozilla-dev-s...@lists.mozilla.org
On 18/07/17 17:51, Matthew Hardeman wrote:
> The broader point I wish to make is that much can be done to improve the strength of the subset of the 10 methods which rely solely on network-dependent automated validation methodologies. The upside would be a significant, demonstrable increase in the difficulty for even well-placed ISP admins to compromise a compliant CA's validation processes. The downside would be increases in cost and complexity borne by the compliant CA.

Your point, in the abstract, is a reasonable one, but so is your further
point about trade-offs. The only way we can really make progress is for
you to propose specific changes to the language, and we can then discuss
the trade-offs of each.

> I noticed that too. I assume it is still tied up in IPR hell?

No. IPR issues are solved. We are currently in arguments about what, if
any, additional necessary fixes to the text should go into the "restore
the text" ballot and what should go into a subsequent ballot, along with
the question of whether and which existing domain validations to
grandfather in and which to require that they be redone.

> I would advocate a level playing field here. This would have the bonus upside of helping to fix bad DNSSEC deployments. If broken DNSSEC broke ability to get a certificate anywhere, either the incorrect deployment would likely be rolled back in the worst case or fixed in the best.

Certainly for CAA, we don't allow broken DNSSEC to fail open. I hope
that will be true of DNS-based validation methods - either after Ballot
190 passes, or soon after that.

> I believe there would be a massive improvement in the security of DNS query and HTTP client fetch type validations if the CA were required to execute multiple queries (ideally at least 3 or 4), sourced from different physical locations (said locations having substantial network and geographic distance between them) and each location utilizing significantly different internet interconnection providers.

How could such a requirement be concretely specced in an auditable way?

Gerv

Jakob Bohm

Jul 20, 2017, 11:51:50 AM7/20/17
to mozilla-dev-s...@lists.mozilla.org
This could be audited as part of general security/implementation
auditing. Also, the CA could/should log the list of deployed probes
that checked/softfailed each domain as part of the usual evidence
logging.

As this would probably require most CAs to set up additional "probe
servers" at diverse locations, while still maintaining the high
auditable level of network security, a longer than usual phase in for
such a requirement would be in order. (I am thinking mostly of smaller
CAs here, whose security may have been previously based on keeping
everything except off-line backups in one or two secure buildings).

A new requirement would be that as part of the 10 approved methods:
- All DNS lookups should be done from at least 5 separate locations
with Internet connectivity from different ISPs. 4 out of 5 must
return the same result before that result is used either directly
or as part of a second step.
- All repeatable network connections (such as HTTP probes and whois
lookups) must be done from 5 separate locations with Internet
connectivity from different ISPs using DNS results checked as above,
again 4 out of 5 must agree.
- All difficult to repeat network connections (such as sending mails),
must be done from randomly selected locations chosen out of at least
4 that are simultaneously available (not down) and have Internet
connection from different ISPs. And still using DNS results checked
as above.
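The 4-out-of-5 agreement rule above can be sketched as a small quorum check. This is a sketch of the rule, not any CA's actual implementation; probe results here are illustrative values.

```python
from collections import Counter

def quorate_result(results, quorum=4):
    """results: one answer per probe location (None for a probe that
    failed or was down). Returns the agreed value if at least `quorum`
    probes returned exactly that value, else None (validation fails)."""
    counts = Counter(r for r in results if r is not None)
    if not counts:
        return None
    value, n = counts.most_common(1)[0]
    return value if n >= quorum else None
```

Note how this matches the redundancy point below: with one probe offline, four remaining answers must be unanimous to reach a quorum of 4.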

The exact number and details of the separate locations should be kept
secret, except from the auditors and a small number of CA employees, so
that attackers will not know when and where to set up man-in-the-middle
network attacks such that 80% of the probes are fooled.

Implementation examples (not requirements):

In practice, a CA would typically set up 5 "probe" servers around the
geographic area served (which may be a country, continent or the world),
each capable of relaying the relevant network traffic from the central
validation system. If one "probe" goes off line, validation can
continue, but with 0 failures allowed, while if two out of 5 go down,
validation cannot be done (thus some CAs may want to use 10 or more
locations for added redundancy).

The "probe" servers could be relatively simple VPN boxes, carefully
hardened and audited and then encased in welded shut steel boxes before
being transported to 3rd party data centers. Central software
continuously verifies that it is talking to a box with a known
private/public key and that various network tests confirm that the box
is still connected to the expected remote network as seen both from
inside and outside. A CA employee should also be dispatched to
physically check after any power or connectivity failure, but this may
be delayed by a few days.

Keeping extra probes and not always using all of them can also help hide
the complete list of probe locations from attackers (who might otherwise
just log the accesses to one of their own servers during a legitimate
request).

A public many-locations VPN service such as TOR could be used as a
supplemental check, but cannot be audited to CA network security
standards and thus would be an additional check.

Matthew Hardeman

Jul 20, 2017, 4:24:08 PM7/20/17
to mozilla-dev-s...@lists.mozilla.org
On Thursday, July 20, 2017 at 9:39:40 AM UTC-5, Gervase Markham wrote:

> Your point, in the abstract, is a reasonable one, but so is your further
> point about trade-offs. The only way we can really make progress is for
> you to propose specific changes to the language, and we can then discuss
> the trade-offs of each.

I would be willing to take a stab at this, and to commit some time to work on it, provided that this is a convenient time to discuss and contemplate the matter. Can anyone give me a sense of whether the potential vulnerabilities I see here -- and the potential mitigations I might suggest -- are of interest to the community?

> Certainly for CAA, we don't allow broken DNSSEC to fail open. I hope
> that will be true of DNS-based validation methods - either after 190
> passes, or soon after that.

A requirement that, if the zone is configured for DNSSEC, any domain validation method relying in any part upon DNS lookups (i.e. direct DNS validations as well as HTTP validations) must succeed only if DNSSEC validation of those lookups succeeds would strengthen the requirements at very little cost or negative consequence. This would, of course, only improve the security posture of domain validations for domains configured with DNSSEC, but that is still a significant benefit. Much like CAA, it gives those holding and/or managing high-value domains significant power to restrict domain hijacking for the purpose of acquiring certificates.

Quite separately, it appears that 3.2.2.8's "As part of the issuance process..." text strongly suggests that CAA record checking be performed upon each instance of certificate issuance. I presume that applies even where a CA is relying upon a previous DNS / HTTP domain validation. I grant that the text goes on to say that issuance must occur within the greater of 8 hours or the CAA TTL, but it does appear that the intent is that CAA records be queried for each instance of issuance and for each SAN dnsName. If this is the intent and ultimately the practice, and we are already requiring blocking reliance on DNS queries within the certificate issuance process, should the validity of the domain validation itself be similarly curtailed? My argument is that if we are placing blocking reliance upon both the CA's DNS validation infrastructure AND the target domain's authoritative DNS infrastructure during the course of issuance, then requiring that domain validations themselves carry a similarly reduced validity period adds precious little in the way of extra points of failure.
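For the CAA side of the issuance-time check discussed above, the record-processing logic can be sketched roughly as follows. This follows the general shape of RFC 6844's issue/issuewild handling but is deliberately simplified (no climbing to parent domains, no parameter handling), and the CA identifier "ca.example.net" is hypothetical.

```python
def caa_permits(records, ca_domain, wildcard=False):
    """records: list of (tag, value) CAA tuples found for the domain.

    An empty record set permits issuance by any CA; otherwise the CA's
    identifier must appear in the applicable property tag."""
    if not records:
        return True                       # no CAA records: any CA may issue
    tag = "issuewild" if wildcard else "issue"
    relevant = [v for (t, v) in records if t == tag]
    if wildcard and not relevant:
        # issuewild absent: wildcard requests fall back to issue records
        relevant = [v for (t, v) in records if t == "issue"]
    if not relevant:
        return True                       # CAA present, but no applicable tag
    return ca_domain in [v.strip() for v in relevant]
```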

> > I believe there would be a massive improvement in the security of DNS query and HTTP client fetch type validations if the CA were required to execute multiple queries (ideally at least 3 or 4), sourced from different physical locations (said locations having substantial network and geographic distance between them) and each location utilizing significantly different internet interconnection providers.
>
> How could such a requirement be concretely specced in an auditable way?

I can certainly propose a series of concrete specifications / requirements as to a more resilient validation infrastructure. I can further propose a list of procedures for validating point-in-time compliance of each of the requirements in the aforementioned list. Further, I can propose a list of data points / measurements / audit data that might be recorded as part of the validation record data set by the CA at the time of validation which could be used to provide strong support that the specifications / requirements are being followed through the course of operations. If those were written up and presented does that begin to address your question?

Thanks,

Matt Hardeman

Ryan Sleevi

Jul 20, 2017, 4:32:29 PM7/20/17
to Matthew Hardeman, mozilla-dev-security-policy
On Thu, Jul 20, 2017 at 4:23 PM, Matthew Hardeman via dev-security-policy <
dev-secur...@lists.mozilla.org> wrote:

> I would be willing to take a stab at this if the subject matter is of
> interest and would be willing to commit some time to work on it providing
> that it would appear a convenient time to discuss and contemplate the
> matter. Can anyone give me a sense of whether the matter of the potential
> vulnerabilities that I see here -- and of the potential mitigations I might
> suggest -- are of interest to the community?
>

Broadly, yes, but there's unfortunately a shade of IP issues that makes it
more difficult to contribute as directly as Gerv proposed. Gerv may accept
any changes on the Mozilla side, but if the goal is to modify the Baseline
Requirements, you'd need to sign the IPR policy of the CA/B Forum and join
as an Interested Party before proposing changes.

And realize that the changes have to be comprehensible by those with
limited to no background in technology :)


> Quite separately, it appears that 3.2.2.8's "As part of the issuance
> process..." text would strongly suggest that CAA record checking be
> performed upon each instance of certificate issuance. I presume that
> applies even in the face of a CA which might be relying upon previous DNS /
> HTTP domain validation. I grant that the text goes on to say that issuance
> must occur within the greater of 8 hours or the CAA TTL, but it does appear
> that the intent is that CAA records be queried for each instance of
> issuance and for each SAN dnsName. If this is the intent and ultimately
> the practice and we are already requiring blocking reliance on DNS query
> within the process of certificate issuance, should the validity of domain
> validation itself be similarly curtailed? My argument is that if we are
> placing a blocking reliance upon both the CA's DNS validation
> infrastructure AS WELL AS the target domain's authoritative DNS
> infrastructure during the course of the certificate issuance process
> , then there is precious little extra point of failure in just requiring
> that domain validation occur with similarly reduced validity period.
>

This is indeed a separate issue. Like patches, it's best to keep
changes as small as you can.

The question about the validity/reuse of this information is near and dear
to Google's heart (hence Ballots 185 and 186), and the desire to reduce this
time substantially exists. That said, the Forum as a whole has mixed
feelings on this, and so it's still an active - and separate - point of
discussion.


> > > I believe there would be a massive improvement in the security of DNS
> query and HTTP client fetch type validations if the CA were required to
> execute multiple queries (ideally at least 3 or 4), sourced from different
> physical locations (said locations having substantial network and
> geographic distance between them) and each location utilizing significantly
> different internet interconnection providers.
> >
> > How could such a requirement be concretely specced in an auditable way?
>
> I can certainly propose a series of concrete specifications / requirements
> as to a more resilient validation infrastructure. I can further propose a
> list of procedures for validating point-in-time compliance of each of the
> requirements in the aforementioned list. Further, I can propose a list of
> data points / measurements / audit data that might be recorded as part of
> the validation record data set by the CA at the time of validation which
> could be used to provide strong support that the specifications /
> requirements are being followed through the course of operations. If those
> were written up and presented does that begin to address your question?


I think it's worth exploring.

Note that there's a whole host of process involved:

- Change the CA/B documents (done through the Validation WG, at present -
need to minimally execute an IPR agreement before even members can launder
ballots for you)
- Change to the WebTrust TF audit criteria (which would involve
collaboration with them, and in general, they're not a big fan of precise
auditable controls)
- Change to the ETSI audit criteria (similar collaboration)

Alternatively, if exploring the Mozilla side, it's fairly easy to make it
up as you go along - which is not a criticism of the root store policy, but
praise :) You just may not get as much feedback.

That said, I think it's worthwhile to make sure the threat model, more than
anything, is defined and articulated. If the threat model results in us
introducing substantive process, but without objective security gain, then
it may not be as worthwhile. Enumerating the threats both addressed and
unaddressable is thus useful in that scope.
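If it helps ground that discussion: the multi-vantage-point validation idea quoted above reduces, in its simplest form, to a quorum rule -- run the same challenge check from several topologically diverse vantage points and require agreement before accepting the validation. A minimal sketch, with all vantage-point names, tokens, and the threshold invented purely for illustration (the per-vantage-point observations are simulated; a real deployment would issue the actual DNS/HTTP queries):

```python
# Quorum-based multi-perspective validation sketch. A real deployment
# would issue the DNS/HTTP challenge query from each vantage point;
# here the per-vantage-point observations are simulated.

def quorum_validate(observations, expected_token, required=3):
    """Pass only if at least `required` vantage points independently
    observed the expected challenge token. A BGP hijack that fools a
    single vantage point then no longer suffices."""
    agreeing = sum(1 for seen in observations.values()
                   if seen == expected_token)
    return agreeing >= required

# Normal case: all perspectives see the token the CA expects.
ok = quorum_validate(
    {"us-east": "tok-1", "eu-west": "tok-1", "ap-south": "tok-1"},
    expected_token="tok-1")

# Localized hijack: only one vantage point sees the attacker's answer.
bad = quorum_validate(
    {"us-east": "tok-evil", "eu-west": None, "ap-south": None},
    expected_token="tok-evil")

print(ok, bad)  # → True False
```

The interesting policy questions are exactly the parameters this sketch hand-waves: how many perspectives, how diverse, and what counts as agreement.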

Matthew Hardeman

unread,
Jul 20, 2017, 8:13:15 PM7/20/17
to mozilla-dev-s...@lists.mozilla.org
One (Hypothetical) Concrete Example of a Practical DNS Validation Attack:

(Author's note: I've chosen for this example to utilize the Let's Encrypt CA as the Certificate Authority involved, and I have chosen as a target for improper validation the domain eff.org. Neither of these choices in any way endorses what I have documented here, and neither party is aware of the scenario I am painting. I have NOT actually carried out a route hijack attack in order to get a certificate for eff.org, and I DO NOT intend to do so. I have merely laid out the research methodology and the data points of interest that someone seeking to obtain a certificate for eff.org illegitimately would need.)

The target: eff.org

In order to validate as eff.org, one needs -- at a minimum -- to be positioned to temporarily answer DNS queries on behalf of eff.org. Assuming that the DNS root servers, the .org TLD servers, and the registrar for eff.org are not to be compromised, the best mechanism for answering for eff.org in the DNS is to hijack the IP space of the authoritative name servers for the eff.org zone.

First, we must find them.

appleprov1:~ mhardeman$ dig +trace -t NS eff.org

; <<>> DiG 9.8.3-P1 <<>> +trace -t NS eff.org
;; global options: +cmd
. 161925 IN NS a.root-servers.net.
. 161925 IN NS b.root-servers.net.
. 161925 IN NS c.root-servers.net.
. 161925 IN NS d.root-servers.net.
. 161925 IN NS e.root-servers.net.
. 161925 IN NS f.root-servers.net.
. 161925 IN NS g.root-servers.net.
. 161925 IN NS h.root-servers.net.
. 161925 IN NS i.root-servers.net.
. 161925 IN NS j.root-servers.net.
. 161925 IN NS k.root-servers.net.
. 161925 IN NS l.root-servers.net.
. 161925 IN NS m.root-servers.net.
;; Received 228 bytes from 10.47.52.1#53(10.47.52.1) in 330 ms

org. 172800 IN NS a0.org.afilias-nst.info.
org. 172800 IN NS a2.org.afilias-nst.info.
org. 172800 IN NS b0.org.afilias-nst.org.
org. 172800 IN NS b2.org.afilias-nst.org.
org. 172800 IN NS c0.org.afilias-nst.info.
org. 172800 IN NS d0.org.afilias-nst.org.
;; Received 427 bytes from 198.97.190.53#53(198.97.190.53) in 154 ms

eff.org. 86400 IN NS ns1.eff.org.
eff.org. 86400 IN NS ns2.eff.org.
;; Received 93 bytes from 2001:500:b::1#53(2001:500:b::1) in 205 ms

eff.org. 7200 IN NS ns1.eff.org.
eff.org. 7200 IN NS ns6.eff.org.
eff.org. 7200 IN NS ns2.eff.org.
;; Received 127 bytes from 69.50.225.156#53(69.50.225.156) in 79 ms


(Further research suggests ns6.eff.org is presently non-responsive or serves some special role -- I would guess a hidden master, considering that the .org delegation servers refer only to ns1.eff.org and ns2.eff.org.)

Is eff.org DNSSEC protected? Asking "dig +trace -t DNSKEY eff.org" will reveal no DNSKEY records returned. No DNSSEC for this zone. See also dnsviz.net for such lookups.

So, all I need to do is hijack the IP space for ns1.eff.org and ns2.eff.org -- and very temporarily -- to get a certificate issued for eff.org.

(Author's further note: I'll grant that eff.org is probably on various people's high-value domain lists and thus would likely get policy-blocked for other reasons regardless of successful domain validation. This is, after all, only an example. I also wish to note that I give IPv4 examples below, knowing one would also need to work the IPv6 angle. I do not explore that here; the principles are the same.)

Now, we need to know what IP space to hijack:

dig -t A ns1.eff.org yields:
;; ANSWER SECTION:
ns1.eff.org. 3269 IN A 173.239.79.201

dig -t A ns2.eff.org yields:
;; ANSWER SECTION:
ns2.eff.org. 6385 IN A 69.50.225.156


Ultimately, to succeed in getting a DNS TXT record domain validation from Let's Encrypt for eff.org, we will need to _very briefly_ take over the IP space and be able to receive and answer DNS queries for 173.239.79.201 and 69.50.225.156. This is probably far easier for a great number of people than many would believe.

Let's understand more about those two IP addresses and how the network space containing those two IP addresses is advertised to the broader internet. I will utilize the University of Oregon's Route Views project for this:

route-views>show ip bgp 173.239.79.201
BGP routing table entry for 173.239.64.0/20, version 196945145
Paths: (41 available, best #18, table default)
Not advertised to any peer
Refresh Epoch 1
3277 3267 174 32354
195.208.112.161 from 195.208.112.161 (194.85.4.13)
Origin IGP, localpref 100, valid, external
Community: 3277:3267 3277:65321 3277:65323 3277:65331


route-views>show ip bgp 69.50.225.156
BGP routing table entry for 69.50.224.0/19, version 103657500
Paths: (40 available, best #33, table default)
Not advertised to any peer
Refresh Epoch 1
58511 6939 13332
103.247.3.45 from 103.247.3.45 (103.247.3.45)
Origin IGP, localpref 100, valid, external

The key facts we are most interested in are who is advertising this IP space (what ASN ultimately originates the advertisement) and what the scope of the advertisement is -- the network block size. We have the following:

173.239.64.0/20 advertised by AS32354 (Unwired) - and -
69.50.224.0/19 advertised by AS13332 (NephoScale Inc.)

Of particular importance, our work is made easier by the fact that the IPs we wish to hijack are advertised as part of larger aggregated blocks. The global BGP network broadly accepts advertisements as small as a /24. If a couple of /24s which are subsets of the above-listed /19 and /20 suddenly appear in the global routing table (or, in fact, are visible within a chosen subset of the global routing tables), then just those /24s within the larger blocks will be hijacked, leaving the rest of the IP space in the /19 and /20 unaffected. This greatly reduces the odds that a hijack will be detected. It also greatly improves the odds that the hijack will propagate sufficiently to achieve the needed effect. Remember that in routing table lookups, the most specific match wins (specialized policy routing excepted); a /24 is more specific than a /19 or /20.

The following advertisements would beat out the legitimate advertisements for just the individual /24 segments that we need to perform our hijack of the DNS servers' IPs:
173.239.79.0/24
69.50.225.0/24
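To make the longest-prefix-match point concrete, here is a small illustration using Python's standard-library ipaddress module. The prefixes are the ones from the example above (note the second hijack /24 must be the one actually covering ns2's address, 69.50.225.156); the lookup logic is a simplification of what routers actually do:

```python
import ipaddress

# Routes as seen in the example: the legitimate aggregates, plus the
# hypothetical hijacker's more-specific /24s covering each NS address.
routes = [
    ipaddress.ip_network("173.239.64.0/20"),  # legitimate aggregate
    ipaddress.ip_network("69.50.224.0/19"),   # legitimate aggregate
    ipaddress.ip_network("173.239.79.0/24"),  # hijack /24 covering ns1
    ipaddress.ip_network("69.50.225.0/24"),   # hijack /24 covering ns2
]

def best_route(addr):
    """Longest-prefix match: of all routes containing addr, the one
    with the longest prefix (most specific) wins."""
    ip = ipaddress.ip_address(addr)
    return max((r for r in routes if ip in r), key=lambda r: r.prefixlen)

print(best_route("173.239.79.201"))  # → 173.239.79.0/24 (hijack wins)
print(best_route("69.50.225.156"))   # → 69.50.225.0/24 (hijack wins)
print(best_route("69.50.230.7"))     # → 69.50.224.0/19 (rest untouched)
```

As the third lookup shows, traffic to the rest of the aggregate blocks keeps flowing normally, which is exactly why a narrow sub-prefix hijack is so hard to notice.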

If we only need to convince a relatively small part of the world, rather than the whole world, that we are the path to those two /24 network blocks, and if we only need to do so for a brief time, we massively reduce the odds that our BGP routing hijack for these two blocks will be caught as fraudulent or even casually observed.

But how can we tell who must be convinced? By definition, we must convince the software/hardware/infrastructure at our target CA (Let's Encrypt) which performs the domain validation DNS queries.

Today and yesterday, I ran a total of 4 DNS validation queries (2 on each day) to domains I legitimately own and control. In all instances, the IPv4 IP address making these DNS validation queries was:

64.78.149.164

While Let's Encrypt doesn't advertise that IP address to the world, there's no real way for them to hide the source of a validation either. I would presume that they roll to different IPs fairly frequently, but it doesn't seem to happen super often.

What can we learn about this IP?

>show ip bgp 64.78.149.164

BGP routing table entry for 64.78.144.0/20, version 150101167
Paths: (41 available, best #34, table default)
Not advertised to any peer
Refresh Epoch 1
58511 6939 13649
103.247.3.45 from 103.247.3.45 (103.247.3.45)
Origin IGP, localpref 100, valid, external

The IP address is part of an advertisement (64.78.144.0/20) originated by AS13649 (ViaWest) in Colorado. This is the ISP providing service to Let's Encrypt's DNS validation point at the moment.

In order to successfully get a certificate for eff.org from Let's Encrypt, we need to persuade ViaWest that there is a better route than the presently reflected one to 173.239.79.0/24 and 69.50.225.0/24, and that some piece of infrastructure we control is that better route. Ultimately, we likely don't even have to persuade ViaWest directly; we could, instead, merely persuade one of ViaWest's upstream transit suppliers.

Knowing where to make what advertisement so that ViaWest sees those two /24s is a skill set effectively built only through actual industry experience. The specifics of how one would most easily place this attack -- constraining its scope so that it impacts little overall internet traffic and limits visibility of the hijack to the larger internet -- vary with every target. From a glance, I can tell you that ViaWest purchases IPv4 transit from at least Level3 (AS3356) and QWest/CenturyLink (AS209). More than that, though, ViaWest is a fairly aggressive IXP peering participant with a stated "Open" peering policy. They are present on several exchanges, including CoreSite's Any2 Denver Internet Exchange; see their entry in the PeeringDB database: https://www.peeringdb.com/net/388

The majority of peering relationships across IXPs don't validate ownership of the advertisements beyond a visual check at initial peering establishment and a rule in the routers to auto-shut the peering if a peer which historically shared 200 routes suddenly tries to send 300+. (Why that is the general case is a long, complicated, and nuanced topic.) Anyone with money -- and not even all that much of it -- could colocate at CoreSite Any2 Denver, join the exchange, and probably get a peering relationship with ViaWest easily. Two weeks later, in the middle of the night, ViaWest for a period of 5 minutes sees new advertisements for the /24s containing the two IP addresses we need to hijack.

At that time, we would run the certificate validation requests and the certificate issuance requests, obtain the certificate, and withdraw the advertisements. If we were so lucky as to have secured a direct peering relationship with ViaWest, then only we and ViaWest will ever have seen the route; Let's Encrypt's infrastructure will have known nothing. In the worst case, ViaWest's subscribers may be unable to reach things running on the (at most) ~500ish IPs we just hijacked, for a period that definitely wouldn't need to exceed 5 minutes. In the middle of the night.

As easily as that, one could definitely get a certificate issued without breaking most of the internet, without leaving much of a trace, and without failing domain validation.

That certificate could later be utilized in a targeted MITM attack.

My purpose in writing this was to illustrate just how easily someone with quite modest resources and the right skill set can presently overcome the technical checks of DNS based domain validation (which includes things such as HTTP validation).

I'll write separately in a less sensationalized post to describe each risk factor and appropriate mitigations.

In closing, I wish to emphasize that Let's Encrypt was only chosen for this example because it was convenient: I already had a client installed, and it was literally free for me to perform multiple validations and certificate issuances. (Though I could do that with Comodo's 3-month domain validation trial product too, couldn't I?) A couple of extra checks strongly suggest that quite a few other CAs which issue domain-validated products could be just as easily subverted. As yet, I have not identified a CA which I believe is well prepared for this level of network manipulation. To their credit, it is clear to me that the people behind Let's Encrypt actually recognize this risk (on the basis of comments I've seen in their discussion forums as well as commentary in some of their recent GitHub commits). Furthermore, there is evidence that they are working toward a plan which would help mitigate the risks of this kind of attack. I reiterate that nothing in this article highlights a risk surfaced by Let's Encrypt that isn't also exposed by every other DV-issuing CA I've scrutinized.

Thanks,

Matt Hardeman

Ryan Sleevi

unread,
Jul 20, 2017, 9:08:52 PM7/20/17
to Matthew Hardeman, mozilla-dev-security-policy
On Thu, Jul 20, 2017 at 8:13 PM, Matthew Hardeman via dev-security-policy <
dev-secur...@lists.mozilla.org> wrote:
>
> My purpose in writing this was to illustrate just how easily someone with
> quite modest resources and the right skill set can presently overcome the
> technical checks of DNS based domain validation (which includes things such
> as HTTP validation).
>

Sure, and this was an excellent post for that. But I note that you
discounted, for example, registry attacks (which are, sadly, all too
common).

And I think that's an important thing to consider. The use of BGP attacks
against certificate issuance is well-known and long-documented, and I also
agree that it's not something we've mitigated by policy. I also appreciate
the desire to improve issuance practices -- after all, it's telling that it
has taken until 2017 for us to likely finally do away with "CA invents
whatever method to validate it wants" -- but I think we should also look
holistically at the threat scenario.

I mention these not because I wouldn't want to see mitigations for these,
but to make sure that the mitigations proposed are both practical and
realistic, and that they look holistically at the threat model. In a
holistic look, one which accepts that the registry can easily be
compromised (and/or other forms of DNSSEC shenanigans), it may be that the
solution is better invested on detection than prevention.

I'll write separately in a less sensationalized post to describe each risk
> factor and appropriate mitigations.
>
> In closing I wish to emphasize that Let's Encrypt was only chosen for this
> example because it was convenient as I already had a client installed and
> also literally free for me to perform multiple validations and certificate
> issuances. (Though I could do that with Comodo's domain validation 3 month
> trial product too, couldn't I?) A couple of extra checks strongly suggest
> that quite several other CAs which issue domain validation products could
> be just as easily subverted. As yet, I have not identified a CA which I
> believe is well prepared for this level of network manipulation. To their
> credit, it is clear to me that the people behind Let's Encrypt actual
> recognize this risk (on the basis of comments I've seen in their discussion
> forums as well as commentary in some of their recent GitHub commits.)
> Furthermore, there is evidence that they are working toward a plan which
> would help mitigate the risks of this kind of attack. I reiterate again
> that nothing in this article highlights a risk surfaced by Let's Encrypt
> that isn't also exposed by every other DV issuing CA I've scrutinized.


Agreed. However, we're still figuring out with CAs how not to follow
redirects when validating requests, so we've got some very, very
low-hanging fruit to improve on in the security space. And this improvement
effort has, to some extent, a limited budget, so we want to go for the big
returns.

Nick Lamb

unread,
Jul 20, 2017, 9:13:23 PM7/20/17
to mozilla-dev-s...@lists.mozilla.org
On Friday, 21 July 2017 01:13:15 UTC+1, Matthew Hardeman wrote:
> As easily as that, one could definitely get a certificate issued without breaking most of the internet, without leaving much of a trace, and without failing domain validation.

One trace this would leave, if done using Let's Encrypt or several other popular CAs, is a CT log record. Google has pushed back its implementation date, but it seems inevitable at this point that certificates for ordinary web sites (as opposed to HTTPS APIs, SMTP, IRC, and so on) will need to be submitted for CT if you expect them to work much beyond this year. The most obvious way to achieve this is for the CA to submit automatically during or immediately after issuance.

Now, most likely the EFF (of your example) does not routinely check CT logs, and doesn't subscribe to any service which monitors the logs and reports new issuances. But a high-value target certainly _should_ be doing this, and it significantly closes the window.

DNSSEC is probably the wiser precaution if you're technically capable of deploying it, but paying somebody to watch CT and tell you about all new issuances for domains you control doesn't require any technical steps, which makes it the attractive option if you're protective of your name but not capable of bold technical changes.
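For what it's worth, that kind of monitoring needs no CA cooperation at all: it is just a diff between what the logs show and what you know you requested. A toy sketch -- the entry format here is invented for illustration; a real monitor would consume entries from the log servers themselves or from a service such as crt.sh:

```python
# Hypothetical CT monitoring check: compare newly observed log entries
# against the certificates the domain owner actually requested.

watched_domains = {"eff.org", "www.eff.org"}
expected_serials = {"04:a1:b2"}  # serials of certs we know we requested

def unexpected_issuances(ct_entries):
    """Flag CT entries naming a watched domain whose serial we don't
    recognize -- a possible misissuance to investigate."""
    alerts = []
    for entry in ct_entries:
        names = set(entry["dns_names"]) & watched_domains
        if names and entry["serial"] not in expected_serials:
            alerts.append(entry)
    return alerts

entries = [
    {"serial": "04:a1:b2", "dns_names": ["eff.org", "www.eff.org"]},
    {"serial": "9f:00:17", "dns_names": ["eff.org"]},      # unknown!
    {"serial": "11:22:33", "dns_names": ["example.net"]},  # not ours
]
print(unexpected_issuances(entries))  # flags only the unknown eff.org cert
```

The hard part in practice is not this comparison but the operational follow-through: someone has to see the alert and act on it quickly.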

Matthew Hardeman

unread,
Jul 21, 2017, 1:15:31 AM7/21/17
to mozilla-dev-s...@lists.mozilla.org
On Thursday, July 20, 2017 at 8:13:23 PM UTC-5, Nick Lamb wrote:

> On Friday, 21 July 2017 01:13:15 UTC+1, Matthew Hardeman wrote:
> > As easily as that, one could definitely get a certificate issued without breaking most of the internet, without leaving much of a trace, and without failing domain validation.

> One trace this would leave, if done using Let's Encrypt or several other popular CAs, is a CT log record. Google has pushed back its implementation date, but it seems inevitable at this point that certificates for ordinary web sites (as opposed to HTTPS APIs, SMTP, IRC, and so on) will need to be submitted for CT if you expect them to work much beyond this year. The most obvious way to achieve this is for the CA to submit automatically during or immediately after issuance.

Indeed, I should have better qualified my "without leaving much of a trace". CT logging would absolutely catch this issuance from many CAs today (and presumably from all publicly trusted CAs sometime in 2018). I think the public at large, the security community, and even the CA community owe a debt of gratitude to the Googlers who brought CT to bear and the (primarily) Googlers who are doing / have done so much to drive its adoption. In the period since around 2014, as the CT logs' depth of coverage has increased, entire categories of misissuance and significant misdeeds have come to light, and proper responses have been made.

Having said that, I meant particularly that, if carefully orchestrated, the attack would likely leave no forensic trace that could be reconstructed to identify the party who acquired the certificate. Indeed, but for the fact of the issuance itself, a skilled and brief hijack would likely never even be noticed, so a domain owner who knows the certificate was not properly requested might never begin to investigate. Even then, in many instances a skilled attacker could scope the advertisement carefully enough that a single validation point is tricked while the route never propagates to any of the systems that might meaningfully notice and log the anomalous advertisement.

What I meant by "without a trace" is that, in the case of an artful and limited-in-scope IP hijack, a CA exploited as in my hypothetical might well be called to present evidence in a courtroom. Should they need to attest that this certificate represents the party in control of eff.org on the date of issuance, said CA would likely present an evidence file which perfectly supports that it was properly issued to the party in control of eff.org. Nothing in the CA's logs -- at least presently -- would be expected to hint at a routing anomaly.

Further, from the network side of the equation, I strongly suspect that a CA questioning whether there was an anomalous route during the few minutes surrounding the request and issuance of a suspect certificate would, even mere days after the issuance, be unable to get any conclusive log data from their ISP as to 1) whether a route change encompassing the IPs in question occurred at all at said specific time and (even less likely) 2) which of their peers introduced what prefix, specifically, and when. To the extent that service providers log announcements and withdrawals of individual prefixes at all (many don't persist this data for any time period), those logs are terribly ephemeral. The data is voluminous and generally of low value except in the moment. It's also in a category of records that the ISP has every incentive to keep briefly for their own diagnostics and debugging, and every incentive NOT to keep longer than strictly needed: once knowledge that you have such data gets out, it leads to burdensome requests to produce it.

> Now, most likely the EFF (if your example) does not routinely check CT logs, and doesn't subscribe to any service which monitors the logs and reports new issuances. But a high value target certainly _should_ be doing this, and it significantly closes the window.

In as far as they're a major sponsor and participant in the Let's Encrypt community (they author/maintain the certbot client, right?), I would be shocked if they didn't monitor such things, at least periodically.

> DNSSEC is probably the wiser precaution if you're technically capable of deploying it, but paying somebody to watch CT and tell you about all new issuances for domains you control doesn't require any technical steps, which makes it the attractive option if you're protective of your name but not capable of bold technical changes.

Sadly, I did some quick checks on several top domain names. Out of a quick survey of Twitter, Amazon, Google, eBay, and PayPal, only paypal.com has implemented DNSSEC. I presume there must still be too many badly configured or badly implemented validating resolvers which always fail validation and break things. It seems many exceptionally technically talented owners of very high-value domain names aren't signing their zones as yet.

Having said that, for those domains that do implement DNSSEC, I concur that many risks are mitigated.

Indeed, a combination of proper DNSSEC implementation, CAA, and picking the right domain registrar and right security options for the management access to those domains at said domain registrar, probably eliminates most risks of domain hijack via either registrar hijack as well as DNS server IP hijacks. (For even small owners, domain registration at domains.google.com protected by a Google user account with U2F based 2FA enabled is very accessible economically, not difficult to set up and maintain, and probably way above average as far as overall security posture goes.)

That said, presently DNSSEC implementation prevalence is quite poor.

Meanwhile, knowing retrospectively via CT (especially within minutes to hours of misissuance) that a bad certificate has been issued can at least make it possible to warn your user base of a risk they may encounter.

Sadly, the state of revocation today makes revocation almost entirely ineffective, especially for DV leaf certs. I do believe that OCSP must-staple will eventually fix this problem. That said, OCSP must-staple will require good server-software support, which is just now in its infancy. (By that I mean that competent non-IIS solutions for OCSP must-staple have only recently become available.) I believe it will be at least another 3 to 5 years before the default HTTP server software packaged in major distributions and stacks has good OCSP stapling support out of the box.

Meanwhile, I think there is some value in exploring risks, vulnerabilities, and mitigations with a view to splitting them along a dividing line: 1) those matters a reasonable web domain owner / operator can address themselves to improve their property's security posture, and 2) those risks over which a normal site owner / operator has no influence whatsoever -- and of whose existence they are unlikely even to be aware.

In those which the site owner/operator can influence:

1. Domain registrar could get hacked - It starts with picking a domain registrar whose conduct suggests that the registrar itself, at the administrative level, is not going to be compromised.
2. Domain owner/operator's account at the registrar could get hacked - Having picked a security-conscious domain registrar, the owner/operator must properly secure their account and account-recovery options, so that no unauthorized changes to the domain registration can be made by someone impersonating the owner/operator to the registrar.
3. DNS servers could get hacked - Maintain strong security over the authoritative DNS servers themselves, whether you run the servers or merely have access to modify your records.
4. IP space of DNS servers could be hijacked - (Iffy) Properly implement DNSSEC. I hesitate to count this yet, because the lack of deployment by the major players you would expect to implement it first makes me think there's a compatibility issue I'm just not fully versed in.

Those which the domain owner/operator can not influence:

1. IP space hijacks allowing a spoofed DNS responder to become authoritative - Other than DNSSEC, the average site operator is totally powerless and largely blind on this front.

In as far as significant risks exist for which the site owner/operator is powerless to mitigate, and probably also because the subject matter aligns to some of my own expertise and operational experience, I believe that an examination of the risks and mitigations of IP space hijacking should be a topic of engagement in the community, with a view to closing gaps that only the community (and consensus from the community driving action by the various actors in the CA ecosystem) can resolve.


Matthew Hardeman

unread,
Jul 21, 2017, 1:39:13 AM7/21/17
to mozilla-dev-s...@lists.mozilla.org
On Thursday, July 20, 2017 at 3:32:29 PM UTC-5, Ryan Sleevi wrote:

> Broadly, yes, but there's unfortunately a shade of IP issues that make it
> more difficult to contribute as directly as Gerv proposed. Gerv may accept
> any changes to the Mozilla side, but if the goal is to modify the Baseline
> Requirements, you'd need to sign the IPR policy of the CA/B Forum and join
> as an Interested Party before changes.

I think at this phase it makes sense to better flesh out, within the Mozilla dev security community, what kinds of mitigations might be economically feasible, provide measurable benefit of sufficient counterbalancing value, and be practical to implement, before trying to bring a ballot before the CA/B Forum. Having said that, I did read over the IPR policy agreement and at first blush saw nothing that would prevent me or my company from becoming signatories should the idea take off.

>
> And realize that the changes have to be comprehensible by those with
> limited to no background in technology :)

Certainly. I would expect the target audience to be more compliance / audit / accounting than network engineering. Even still, I have to believe it is possible to describe the specific modes of risk and counterbalancing mitigations in a framework that people with that skill set can evaluate.

> The question about the validity/reuse of this information is near and dear
> to Googles' heart (hence Ballots 185 and 186) and the desire to reduce this
> time substantially exists. That said, the Forum as a whole has mixed
> feelings on this, and so it's still an active - and separate - point of
> discussion.

I mentioned it mostly because I was curious whether anyone had yet argued that, since issuance-blocking DNS queries at (or about) time of issuance are already required for CAA, more frequent revalidation might now be appropriate -- revalidation would be only a minor extra burden on top of the queries already being run for CAA. (There's already a blocking DNS query on records of the subject domain holding up issuance, so what's a few more records on the same domain?)

> That said, I think it's worthwhile to make sure the threat model, more than
> anything, is defined and articulated. If the threat model results in us
> introducing substantive process, but without objective security gain, then
> it may not be as worthwhile. Enumerating the threats both addressed and
> unaddressible are thus useful in that scope.

Can you provide a good reference, or point to an example, of what you would regard as an exceptionally well-described definition and articulation of a threat model within the certificate issuance space generally? I feel I have quite a solid grasp of the various weak points in the network infrastructure and network operations aspects that underpin the technological measures involved in domain validation. I am interested in taking that knowledge and molding it into a view which might best permit this community to assess my thoughts on the matter, weigh pros and cons, and help guide proposals for mitigation.

Matthew Hardeman

unread,
Jul 21, 2017, 6:06:42 PM7/21/17
to mozilla-dev-s...@lists.mozilla.org
It seems that a group of Princeton researchers just presented a live theoretical* misissuance by Let's Encrypt.

They did a sub-prefix hijack via a technique other than those I described here, and achieved issuance while passing through traffic for other destinations within the IP space of the hijacked scope.

They've got a paper at: https://petsymposium.org/2017/papers/hotpets/bgp-bogus-tls.pdf

I say theoretical because they hijacked a /24 of their own /23 under a different ASN, and I am given to believe that the "adversarial" ASN is also under their control or that they had permission to use it. In as far as this is the case, it technically isn't a misissuance, because hijacking one's own IP space is really just a different routing configuration: diverting traffic for a destination they properly control to another point of interconnection they also properly controlled.

birg...@princeton.edu

unread,
Jul 22, 2017, 3:49:39 AM7/22/17
to mozilla-dev-s...@lists.mozilla.org
Hi,

I am Henry Birge-Lee, one of the researchers at Princeton leading that effort. I just performed that live demo a couple of hours ago. You are correct about how we performed the attack. One minor detail is that we were forced to use the same ASN twice (for both the adversary and the victim). The adversary and the victim were two different routers peering with completely different ASes, but we had to reuse the AS because we were performing the announcements with the PEERING testbed (https://peering.usc.edu/) and are not allowed to announce from another ASN. Thus, from a policy point of view, this was not a misissuance, and our BGP announcements would likely not have triggered an alarm from a BGP monitoring system. Even if we had the ability to hijack another AS's prefix and trigger such alarms, we would be hesitant to do so because of ethical considerations. Our goal was to demonstrate the effectiveness and ease of interception of the technique we used, not to freak out network operators over potential hijacks.

I know some may argue that had we triggered alarms from the major BGP monitoring frameworks, CAs might not have issued us the certificates they did. We find this unlikely because 1) the DV certificate signing process is automated, but the type of BGP announcements we made would likely require manual review before they could be definitively flagged as an attack, and 2) there is no evidence CAs are doing this (we know Let's Encrypt does not use BGP data because of their transparency and conversations with their executive director Josh Aas as well as their engineering team).

As for further work, implementation of countermeasures into the CA and Browser forum Baseline Requirements is our eventual goal and we see engaging with this ongoing discussion as a step in the right direction.

Over the next couple days I will look over these conversations in more detail and look for ways that we can integrate these ideas into the research we are doing.

Best,
Henry Birge-Lee

Princeton University department of Computer Science

Gervase Markham

Jul 24, 2017, 3:49:20 AM
to mozilla-dev-s...@lists.mozilla.org
On 20/07/17 21:31, Ryan Sleevi wrote:
> Broadly, yes, but there's unfortunately a shade of IP issues that make it
> more difficult to contribute as directly as Gerv proposed. Gerv may accept
> any changes to the Mozilla side, but if the goal is to modify the Baseline
> Requirements, you'd need to sign the IPR policy of the CA/B Forum and join
> as an Interested Party before changes.

I'm on holiday at the moment but, as Ryan says, this particular part of
what CAs do is the part most subject to IPR restrictions and so work on
it is probably best done in a CAB Forum context rather than a more
informal process.

I will attempt to respond to your messages in more depth when I return.

Gerv

Jakob Bohm

Jul 24, 2017, 8:31:33 AM
to mozilla-dev-s...@lists.mozilla.org
On 22/07/2017 02:38, birg...@princeton.edu wrote:
> On Friday, July 21, 2017 at 5:06:42 PM UTC-5, Matthew Hardeman wrote:
>> It seems that a group of Princeton researchers just presented a live theoretical* misissuance by Let's Encrypt.
>>
>> They did a sub-prefix hijack via a technique other than those I described here and achieved issuance while passing-through traffic for other destination within the IP space of the hijacked scope.
>>
>> They've got a paper at: https://petsymposium.org/2017/papers/hotpets/bgp-bogus-tls.pdf
>>
>> I say that theoretical because they hijacked a /24 of their own /23 under a different ASN but I am given to believe that the "adversarial" ASN is also under their control or that they had permission to use it. In as far as this is the case, this technically isn't a misissuance because hijacking ones own IP space is technically just a different routing configuration diverting the traffic to the destination they properly control to another point of interconnection they properly controlled.
>
> Hi,
>
> I am Henry Birge-Lee, one of the researchers at Princeton leading that effort. I just performed that live demo a couple hours ago. You are correct about how we performed that attack. One minor detail is that we were forced to use the same ASN twice (for both the adversary and the victim). The adversary and the victim were two different routers peering with completely different ASes, but we were forced to reuse the AS because we were performing the announcements with the PEERING testbed (https://peering.usc.edu/) and are not allowed to announce from another ASN. Thus from a policy point of view this was not a misissuance and our BGP announcements would likely not have triggered an alarm from a BGP monitoring system. Even if we had the ability to hijack another ASes prefix and trigger such alarms we would be hesitant to because of ethical considerations. Our goal was to demonstrate the effectiveness and ease of interception of the technique we used, not freak out network operators because of potential hijacks.
>
> I know some may argue that had we triggered alarms from the major BGP monitoring frameworks, CAs might not have issued us the certificates the did. We find that this is unlikely because 1) The DV certificate signing process is automated but the type of BGP announcements we made would likely require manual review before they could be definitively flagged as an attack 2) There is no evidence CAs are doing this (we know Let's Encrypt does not use BGP data because of their transparency and conversations with their executive director Josh Aas as well as their engineering team).
>

Another testing option would have been to use another AS legitimately
operated by someone associated with your research team. Unless
Princeton has historically obtained 2 AS numbers (which is not uncommon),
cooperating with a researcher at another research facility could obtain
the other AS number without any actual breach or hijack.

> As for further work, implementation of countermeasures into the CA and Browser forum Baseline Requirements is our eventual goal and we see engaging with this ongoing discussion as a step in the right direction.
>
> Over the next couple days I will look over these conversations in more detail and look for ways that we can integrate these ideas into the research we are doing.
>
>

Matthew Hardeman

Jul 24, 2017, 10:44:26 AM
to mozilla-dev-s...@lists.mozilla.org
Hi, Gerv,

I'm certainly willing and able to execute an IPR agreement in my own right and/or on behalf of my company.

My concern is that I would like to have a more fully fleshed-out proposal to bring to the forum. I have a strong understanding of the network and interconnection environment as it pertains to IP hijacking and the like, but significantly less understanding of the infrastructure side of a CA, so I feel rather limited in my ability to structure mitigations which are practical for CAs to deploy.

In short, I can point out the weak spots and their potential consequences, and I can recommend mechanisms for reducing the risk. But I could recommend specific solutions and frameworks for addressing the risks much better if I had a clearer picture of typical CA interaction with the outside network, as well as a firmer understanding of the various trust boundaries across the various CA elements.

Thanks,

Matt Hardeman

birg...@princeton.edu

Jul 25, 2017, 2:00:39 PM
to mozilla-dev-s...@lists.mozilla.org
On Monday, July 24, 2017 at 5:31:33 AM UTC-7, Jakob Bohm wrote:
> On 22/07/2017 02:38, birg...@princeton.edu wrote:
> > On Friday, July 21, 2017 at 5:06:42 PM UTC-5, Matthew Hardeman wrote:
> >> It seems that a group of Princeton researchers just presented a live theoretical* misissuance by Let's Encrypt.
> >>
> >> They did a sub-prefix hijack via a technique other than those I described here and achieved issuance while passing-through traffic for other destination within the IP space of the hijacked scope.
> >>
> >> They've got a paper at: https://petsymposium.org/2017/papers/hotpets/bgp-bogus-tls.pdf
> >>
> >> I say that theoretical because they hijacked a /24 of their own /23 under a different ASN but I am given to believe that the "adversarial" ASN is also under their control or that they had permission to use it. In as far as this is the case, this technically isn't a misissuance because hijacking ones own IP space is technically just a different routing configuration diverting the traffic to the destination they properly control to another point of interconnection they properly controlled.
> >
> > Hi,
> >
> > I am Henry Birge-Lee, one of the researchers at Princeton leading that effort. I just performed that live demo a couple hours ago. You are correct about how we performed that attack. One minor detail is that we were forced to use the same ASN twice (for both the adversary and the victim). The adversary and the victim were two different routers peering with completely different ASes, but we were forced to reuse the AS because we were performing the announcements with the PEERING testbed (https://peering.usc.edu/) and are not allowed to announce from another ASN. Thus from a policy point of view this was not a misissuance and our BGP announcements would likely not have triggered an alarm from a BGP monitoring system. Even if we had the ability to hijack another ASes prefix and trigger such alarms we would be hesitant to because of ethical considerations. Our goal was to demonstrate the effectiveness and ease of interception of the technique we used, not freak out network operators because of potential hijacks.
> >
> > I know some may argue that had we triggered alarms from the major BGP monitoring frameworks, CAs might not have issued us the certificates the did. We find that this is unlikely because 1) The DV certificate signing process is automated but the type of BGP announcements we made would likely require manual review before they could be definitively flagged as an attack 2) There is no evidence CAs are doing this (we know Let's Encrypt does not use BGP data because of their transparency and conversations with their executive director Josh Aas as well as their engineering team).
> >
>
> Another testing option would have been to use another AS legitimately
> operated by someone associated with your research team. Unless
> Princeton has historically obtained 2 AS numbers (this is not uncommon),
>
> Cooperating with a researcher at another research facility could obtain
> the other AS number without any actual breach or hijack.
>

We have been considering research in this direction. PEERING controls several ASNs and may let us use them more liberally with some convincing. We also have the ASN from Princeton, which could be used with cooperation from Princeton OIT (the Office of Information Technology), where we have several contracts. The problem is not the source of the ASNs but the network anomaly the announcement would cause. If we were to hijack the prefix of a cooperating organization, the PEERING ASes might have their announcements filtered because they would seemingly be launching BGP attacks. This could be fixed with some communication with ISPs, but regardless there is a cost to launching such realistic attacks. Matthew Hardeman would probably know more about how this would be received by the community, but this is the general impression I have gotten from engaging with the people who run the PEERING framework.

So far we have not been working on such an attack very much because we are focusing our research on countermeasures. We believe that the attack surface is large and that there are countless BGP tricks an adversary could use to get the desired properties in an attack. We are focusing our research on simple countermeasures CAs can implement to reduce this attack space. We also aim to use industry contacts to accurately assess the false positive rates of our countermeasures and to develop example implementations.

If it appears that actually launching such a realistic attack would be valuable to the community, we certainly could look into it further.

Matthew Hardeman

Jul 25, 2017, 8:38:23 PM
to mozilla-dev-s...@lists.mozilla.org
On Tuesday, July 25, 2017 at 1:00:39 PM UTC-5, birg...@princeton.edu wrote:
> We have been considering research in this direction. PEERING controls several ASNs and may let us use them more liberally with some convincing. We also have the ASN from Princeton that could be used with cooperation from Princeton OIT (the Office of Information Technology) where we have several contracts. The problem is not the source of the ASNs but the network anomaly the announcement would cause. If we were to hijack the prefix of a cooperating organization, the PEERING ASes might have their announcements filtered because they are seemingly launching BGP attacks. This could be fixed with some communication with ISPs, but regardless there is a cost to launching such realistic attacks. Matthew Hardeman would probably know more detail about how this would be received by the community, but this is the general impression I have got from engaging with the people who run the PEERING framework.

I have some thoughts on how to perform such experiments while mitigating the likelihood of significant lasting consequences for the party helping ingress the hijack into the routing table. But you correctly point out that the attack surface is large, and the one consistent feature of all discussion up to this point on BGP hijacks as a means of subverting CA domain validation is that none of those discussing it have expressed doubt as to the risks or the feasibility of carrying out these attacks. To that end, I think the first case that would need to be made to further that research is whether anything of significance is gained by making the attack more tangible.

>
> So far we have not been working on such an attack very much because we are focusing our research more on countermeasures. We believe that the attack surface is large and there are countless BGP tricks an adversary could use to get the desired properties in an attack. We are focusing our research on simple and countermeasures CAs can implement to reduce this attack space. We also aim to use industry contacts to accurately asses the false positive rates of our countermeasures and develop example implementations.
>
> If it appears that actually launching such a realistic attack would be valuable to the community, we certainty could look into it further.

This is the question to answer before performing such an attack. In effect, who is the audience that needs to be impressed? What criteria must be met to impress that audience? What benefits in furtherance of the work arise from impressing that audience?

Thanks,

Matt Hardeman

Dimitris Zacharopoulos

Aug 24, 2017, 8:45:11 AM
to Matthew Hardeman, mozilla-dev-s...@lists.mozilla.org
On 26/7/2017 3:38 πμ, Matthew Hardeman via dev-security-policy wrote:
> On Tuesday, July 25, 2017 at 1:00:39 PM UTC-5,birg...@princeton.edu wrote:
>> We have been considering research in this direction. PEERING controls several ASNs and may let us use them more liberally with some convincing. We also have the ASN from Princeton that could be used with cooperation from Princeton OIT (the Office of Information Technology) where we have several contracts. The problem is not the source of the ASNs but the network anomaly the announcement would cause. If we were to hijack the prefix of a cooperating organization, the PEERING ASes might have their announcements filtered because they are seemingly launching BGP attacks. This could be fixed with some communication with ISPs, but regardless there is a cost to launching such realistic attacks. Matthew Hardeman would probably know more detail about how this would be received by the community, but this is the general impression I have got from engaging with the people who run the PEERING framework.
> I have some thoughts on how to perform such experiments while mitigating the likelihood of significant lasting consequence to the party helping ingress the hijack to the routing table, but you correctly point out that the attack surface is large and the one consistent feature of all discussion up to this point on the topic of BGP hijacks for purpose of countering CA domain validation is that none of those discuss have, up to this point, expressed doubt as to the risks or the feasibility of carrying out these risks. To that ends, I think the first case that would need to be made to further that research is whether anything of significance is gained in making the attack more tangible.
>
>> So far we have not been working on such an attack very much because we are focusing our research more on countermeasures. We believe that the attack surface is large and there are countless BGP tricks an adversary could use to get the desired properties in an attack. We are focusing our research on simple and countermeasures CAs can implement to reduce this attack space. We also aim to use industry contacts to accurately asses the false positive rates of our countermeasures and develop example implementations.
>>
>> If it appears that actually launching such a realistic attack would be valuable to the community, we certainty could look into it further.
> This is the question to answer before performing such an attack. In effect, who is the audience that needs to be impressed? What criteria must be met to impress that audience? What benefits in furtherance of the work arise from impressing that audience?
>
> Thanks,
>
> Matt Hardeman
> _______________________________________________
> dev-security-policy mailing list
> dev-secur...@lists.mozilla.org
> https://lists.mozilla.org/listinfo/dev-security-policy

That was a very interesting topic to read. Unfortunately, CAs can't do
much to protect against network hijacking because most of the
counter-measures lie on the ISPs' side. However, CAs could request
some counter-measures from their ISPs.

Best practices for ISPs state that for each connected peer, the ISP needs
to apply a prefix filter that allows announcements of only the
legitimate prefixes that the peer controls/owns. We can easily imagine
that this is not performed by all ISPs. Another solution that has been
around for some time is RPKI
<https://www.ripe.net/manage-ips-and-asns/resource-management/certification>
along with BGP Origin Validation
<https://www.ripe.net/manage-ips-and-asns/resource-management/certification/bgp-origin-validation>.
Of course, we can't expect all ISPs to check for Route Origin
Authorizations (ROAs), but if the major ISPs checked for ROAs, it would
improve things a lot in terms of securing the Internet.

So, to minimize the risk of a CA's or site owner's network being
hijacked: if the CA/site owner has address space that is Provider
Aggregatable (PA) (meaning the ISP "owns" the IP space), they should
check that their upstream network provider has properly created the ROAs
for the CA/site operator's network prefix(es) in the RIR's authorized
list, and that the provider has configured its routers to validate ROAs
for each prefix. If the CA/site operator has Provider Independent (PI)
address space (meaning the CA/site operator "owns" the IP space), then
the CA/site operator should create the ROAs themselves.

In Matt's example, if eff.org had ROAs for their network prefixes (which
include their DNS servers) and if Let's Encrypt's provider (or Let's
Encrypt's own router) had been validating ROAs, this attack wouldn't have
worked.
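[Editor's note: the origin-validation check Dimitris describes can be sketched roughly as follows. This is a minimal illustration of RFC 6811-style route origin validation using hypothetical prefixes and ASNs, not any router's or CA's actual implementation.]

```python
# Minimal sketch of BGP route origin validation (RFC 6811).
# A route is "valid" if some covering ROA authorizes both the origin
# ASN and the announced prefix length (<= maxLength), "invalid" if
# covering ROAs exist but none matches, and "not-found" otherwise.
from ipaddress import ip_network

# Hypothetical ROAs: (authorized prefix, maxLength, authorized origin ASN)
ROAS = [
    (ip_network("198.51.100.0/23"), 24, 64500),
]

def origin_validation(prefix, origin_asn):
    announced = ip_network(prefix)
    # ROAs whose prefix covers the announced prefix
    covering = [
        (p, max_len, asn) for (p, max_len, asn) in ROAS
        if announced.subnet_of(p)
    ]
    if not covering:
        return "not-found"
    for _p, max_len, asn in covering:
        if asn == origin_asn and announced.prefixlen <= max_len:
            return "valid"
    return "invalid"

# Legitimate origin announcing a /24 within maxLength:
print(origin_validation("198.51.100.0/24", 64500))  # valid
# Sub-prefix hijack of the same /24 from a different ASN:
print(origin_validation("198.51.100.0/24", 64666))  # invalid
```

A router dropping "invalid" routes would discard the sub-prefix announcement used in the demonstrated attack, which is the point of the eff.org example above.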


Dimitris.

Ryan Hurst

Aug 25, 2017, 2:42:30 PM
to mozilla-dev-s...@lists.mozilla.org
Dimitris,

I think it is not accurate to characterize this as being outside of the CA's control. Several CAs utilize multiple network perspectives and consensus to mitigate these risks. While this is not a total solution, it is fairly effective if the consensus pool is well thought out.

Ryan

Dimitris Zacharopoulos

Aug 28, 2017, 12:56:56 AM
to Ryan Hurst, mozilla-dev-s...@lists.mozilla.org
On 25/8/2017 9:42 μμ, Ryan Hurst via dev-security-policy wrote:
> Dimitris,
>
> I think it is not accurate to characterize this as being outside of the CAs controls. Several CAs utilize multiple network perspectives and consensus to mitigate these risks. While this is not a total solution it is fairly effective if the consensus pool is well thought out.
>
> Ryan

Just to make sure I am not misunderstanding: are you referring to CAs
with real-time access to the full Internet routing table, allowing
them to make routing decisions, or to something completely different? If
it's something different, it would be great if you could provide some
information about how this consensus over network perspectives (between
different CAs) works today. There are services that offer
routing status, like https://stat.ripe.net/widget/routing-status or
https://www.cidr-report.org/as2.0/, but I don't know whether they are
being used by CAs to minimize the chance of accepting a hijacked address
prefix (Matt's example).

Dimitris.

Nick Lamb

Aug 28, 2017, 4:15:55 AM
to mozilla-dev-s...@lists.mozilla.org
I think that instead Ryan H is suggesting that (some) CAs take advantage of multiple geographically distinct nodes to run the tests from one of the Blessed Methods against an applicant's systems from several places on the Internet at once. This mitigates attacks that are able to disturb routing only for the CA or some small corner of the Internet containing the CA. For example, my hypothetical 17-year-old at the ISP earlier in the thread can't plausibly also be working at four other ISPs around the globe.

This is a mitigation, not a fix, because a truly sophisticated attacker can obtain other certificates legitimately to build up intelligence about the CA's other perspective points on the Internet and then attack all of them simultaneously. It doesn't require knowing much about Internet routing, beyond the high-level knowledge that connections from very distant locations will travel by different routes to reach the "same" destination.
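[Editor's note: the quorum idea described above can be sketched as follows. The vantage-point names, tokens, and quorum threshold are hypothetical; this is an illustration of the technique, not any CA's actual validation code.]

```python
def quorum_validate(observations, expected_token, quorum=3):
    """Multi-perspective domain validation sketch.

    observations: mapping of vantage-point name -> token that vantage
    point retrieved from the applicant's server (or None on failure).
    Issuance proceeds only if at least `quorum` perspectives saw the
    expected token, so diverting routes near any single vantage point
    (or near the CA's primary network alone) is not enough.
    """
    agreeing = sum(1 for seen in observations.values() if seen == expected_token)
    return agreeing >= quorum

# An attacker who can hijack the route as seen from only one vantage point:
observations = {
    "us-east": "attacker-token",   # path hijacked toward the attacker
    "eu-west": "real-token",
    "ap-south": "real-token",
    "us-west": "real-token",
}
print(quorum_validate(observations, "real-token"))      # True: legitimate holder validates
print(quorum_validate(observations, "attacker-token"))  # False: hijack fails quorum
```

As Nick notes, this only raises the bar: an attacker who can simultaneously disturb routing as seen from all (or most) vantage points still wins the quorum.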

Ryan Hurst

Aug 29, 2017, 2:18:02 AM
to mozilla-dev-s...@lists.mozilla.org
On Monday, August 28, 2017 at 1:15:55 AM UTC-7, Nick Lamb wrote:
> I think that instead Ryan H is suggesting that (some) CAs are taking advantage of multiple geographically distinct nodes to run the tests from one of the Blessed Methods against an applicant's systems from several places on the Internet at once. This mitigates against attacks that are able to disturb routing only for the CA or some small corner of the Internet containing the CA. For example my hypothetical 17 year-old at the ISP earlier in the thread can't plausibly also be working at four other ISPs around the globe.
>
> This is a mitigation not a fix because a truly sophisticated attacker can obtain other certificates legitimately to build up intelligence about the CA's other perspective points on the Internet and then attack all of them simultaneously. It doesn't involve knowing much about Internet routing, beyond the highest level knowledge that connections from very distant locations will travel by different routes to reach the "same" destination.

Thanks, Nick, that is exactly what I was saying.

Jakob Bohm

Aug 29, 2017, 3:59:29 AM
to mozilla-dev-s...@lists.mozilla.org
On 28/08/2017 10:15, Nick Lamb wrote:
> I think that instead Ryan H is suggesting that (some) CAs are taking advantage of multiple geographically distinct nodes to run the tests from one of the Blessed Methods against an applicant's systems from several places on the Internet at once. This mitigates against attacks that are able to disturb routing only for the CA or some small corner of the Internet containing the CA. For example my hypothetical 17 year-old at the ISP earlier in the thread can't plausibly also be working at four other ISPs around the globe.
>
> This is a mitigation not a fix because a truly sophisticated attacker can obtain other certificates legitimately to build up intelligence about the CA's other perspective points on the Internet and then attack all of them simultaneously. It doesn't involve knowing much about Internet routing, beyond the highest level knowledge that connections from very distant locations will travel by different routes to reach the "same" destination.
>

Another reason this is only a mitigation is that it provides little
protection against an attack on the small corner of the Internet
containing the victim domain owner. For this, the attacker just needs
to be close enough to divert traffic bound for the victim away from most
of the Internet. Of course, that could be mitigated if the victim system
is geographically distributed (as is often true of DNS), using unrelated
ISPs for the different locations (so not 5 different Amazon availability
zones, for example).