Certificates with improperly normalized IDNs

Jonathan Rudenberg

Aug 10, 2017, 4:23:26 PM

to mozilla-dev-s...@lists.mozilla.org

RFC 5280 section 7.2 and the associated IDNA RFC require that Internationalized Domain Names be normalized before encoding to punycode.

Let’s Encrypt appears to have issued at least three certificates that have at least one dnsName without the proper Unicode normalization applied.

https://crt.sh/?id=187634027&opt=cablint
https://crt.sh/?id=187628042&opt=cablint
https://crt.sh/?id=173493962&opt=cablint

It’s also worth noting that RFC 3491 (referenced by RFC 5280 via RFC 3490) requires normalization form KC, but RFC 5891, which replaces RFC 3491, requires normalization form C. I believe that the BRs and/or RFC 5280 should be updated to reference RFC 5890 and, by extension, RFC 5891 instead.

Jonathan

Jakob Bohm

Aug 10, 2017, 5:32:13 PM

to mozilla-dev-s...@lists.mozilla.org

All 3 dnsName values exist in the DNS and point to the same server (IP
address). Whois says that the two second level names are both registered
to OOO "JilfondService" .

This raises the question if CAs should be responsible for misissued
domain names, or if they should be allowed to issue certificates to
actually existing DNS names.

I don't know if the bad punycode encodings are in the 2nd level names (a
registrar/registry responsibility, both were from 2012 or before) or in
the 3rd level names (locally created at an unknown date).

An online utility based on the older RFC349x round trips all of these.
So if the issue is only compatibility with a newer RFC not referenced
from the current BRs, these would probably be OK under the current BRs
and certLint needs to accept them.

Note: The DNS names are:

xn--80aqafgnbi.xn--b1addckdrqixje4a.xn--p1ai
xn--80aqafgnbi.xn--f1awi.xn--p1ai
xn-----blcihca2aqinbjzlgp0hrd8c.xn--f1awi.xn--p1ai

Or broken down into DNS labels:

ICANN tld:

xn--p1ai

Second level domains, registrar is currently RUCENTER-RF

xn--b1addckdrqixje4a
xn--f1awi

Third level domains, subscriber responsibility:

xn--80aqafgnbi
xn-----blcihca2aqinbjzlgp0hrd8c

Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded

Roland Bracewell Shoemaker

Aug 10, 2017, 5:53:20 PM

to dev-secur...@lists.mozilla.org

We are aware of this and are looking into it further.

On 08/10/2017 01:22 PM, Jonathan Rudenberg via dev-security-policy wrote:
> RFC 5280 section 7.2 and the associated IDNA RFC requires that Internationalized Domain Names are normalized before encoding to punycode.
>
> Let’s Encrypt appears to have issued at least three certificates that have at least one dnsName without the proper Unicode normalization applied.
>
> https://crt.sh/?id=187634027&opt=cablint
> https://crt.sh/?id=187628042&opt=cablint
> https://crt.sh/?id=173493962&opt=cablint
>
> It’s also worth noting that RFC 3491 (referenced by RFC 5280 via RFC 3490) requires normalization form KC, but RFC 5891 which replaces RFC 3491 requires normalization form C. I believe that the BRs and/or RFC 5280 should be updated to reference RFC 5890 and by extension RFC 5891 instead.
>
> Jonathan
>


Ryan Sleevi

Aug 10, 2017, 6:15:34 PM

to Jakob Bohm, mozilla-dev-security-policy

On Thu, Aug 10, 2017 at 5:31 PM, Jakob Bohm via dev-security-policy <
dev-secur...@lists.mozilla.org> wrote:
>
> This raises the question if CAs should be responsible for misissued
> domain names, or if they should be allowed to issue certificates to
> actually existing DNS names.
>

No. It doesn't. That's been addressed several times in the CA/Browser Forum
with other forms of 'invalid' (non-preferred name syntax) domain names,
such as those with underscores.

It's not permitted under RFC 5280, thus, CAs are responsible. Full stop.

> I don't know if the bad punycode encodings are in the 2nd level names (a
> registrar/registry responsibility, both were from 2012 or before) or in
> the 3rd level names (locally created at an unknown date).
>
> An online utility based on the older RFC349x round trips all of these.
> So if the issue is only compatibility with a newer RFC not referenced from
> the current BRs, these would probably be OK under the current BRs and
> certLint needs to accept them.
>

No, it's a newer RFC not referenced in RFC 5280, so it's not permitted
under the current BRs.

There's no retroactive immunity.

Peter Bowen

Aug 10, 2017, 7:59:20 PM

to Jakob Bohm, mozilla-dev-s...@lists.mozilla.org

On Thu, Aug 10, 2017 at 2:31 PM, Jakob Bohm via dev-security-policy
<dev-secur...@lists.mozilla.org> wrote:
> On 10/08/2017 22:22, Jonathan Rudenberg wrote:
>>

> All 3 dnsName values exist in the DNS and point to the same server (IP
> address). Whois says that the two second level names are both registered
> to OOO "JilfondService" .
>

> This raises the question if CAs should be responsible for misissued
> domain names, or if they should be allowed to issue certificates to
> actually existing DNS names.
>

> I don't know if the bad punycode encodings are in the 2nd level names (a
> registrar/registry responsibility, both were from 2012 or before) or in
> the 3rd level names (locally created at an unknown date).
>
> An online utility based on the older RFC349x round trips all of these.
> So if the issue is only compatibility with a newer RFC not referenced from
> the current BRs, these would probably be OK under the current BRs and
> certLint needs to accept them.
>

> Note: The DNS names are:
>
> xn--80aqafgnbi.xn--b1addckdrqixje4a.xn--p1ai
> xn--80aqafgnbi.xn--f1awi.xn--p1ai
> xn-----blcihca2aqinbjzlgp0hrd8c.xn--f1awi.xn--p1ai

These are not the names causing issues.

"xn--109-3veba6djs1bfxlfmx6c9g.xn--b1addckdrqixje4a.xn--p1ai" from
https://crt.sh/?id=187634027&opt=cablint
"xn--109-3veba6djs1bfxlfmx6c9g.xn--f1awi.xn--p1ai" from
https://crt.sh/?id=187628042&opt=cablint
"xn--109-3veba6djs1bfxlfmx6c9g.xn--f1awi.xn--p1ai" from
https://crt.sh/?id=173493962&opt=cablint (same name as the prior cert)

It is the xn--109-3veba6djs1bfxlfmx6c9g label that is incorrect in all
three. In all three the bad label is not in the registered domain or
any public suffix.

Directly decoded, this string is:

"\u0608\u061c\u0628\u0031\u0608\u0611\u0618\u061e\u0608\u0621\u0612\u0614\u0030\u061b\u0039\u061a\u0618\u061c"

However the string when normalized to NFC is:

"\u0608\u061c\u0628\u0031\u0608\u0618\u0611\u061e\u0608\u0621\u0612\u0614\u0030\u061b\u0039\u0618\u061a\u061c"

If you look carefully, you will see two different pairs of codepoints
that are swapped in the normalized string.
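
A minimal Ruby sketch of this comparison, using only the code points quoted above and the standard library's String#unicode_normalize. The helper name nfc_differences is made up for illustration, and it assumes that NFC only reorders code points in this string, so both versions have the same length.

```ruby
# Code points of the label as decoded from punycode, copied from the message above.
decoded = "\u0608\u061c\u0628\u0031\u0608\u0611\u0618\u061e\u0608\u0621\u0612\u0614\u0030\u061b\u0039\u061a\u0618\u061c"

# Hypothetical helper: lists [index, before, after] for every position that NFC changes.
# Assumes NFC only reorders code points here (same length before and after).
def nfc_differences(str)
  normalized = str.unicode_normalize(:nfc)
  str.codepoints.zip(normalized.codepoints).each_with_index
     .reject { |(before, after), _index| before == after }
     .map { |(before, after), index| [index, format("U+%04X", before), format("U+%04X", after)] }
end

# Print one row per changed position: index, original code point, NFC code point.
nfc_differences(decoded).each { |row| puts row.join("  ") }
```

Each printed row is a position where the decoded label and its NFC form disagree; the rows should come in adjacent pairs, matching the two swapped pairs of combining marks.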

Thanks,
Peter

Jakob Bohm

Aug 10, 2017, 10:48:12 PM

to mozilla-dev-s...@lists.mozilla.org

On 11/08/2017 00:00, Jonathan Rudenberg wrote:

>
>> On Aug 10, 2017, at 17:31, Jakob Bohm via dev-security-policy <dev-secur...@lists.mozilla.org> wrote:
>>
>> On 10/08/2017 22:22, Jonathan Rudenberg wrote:

>> All 3 dnsName values exist in the DNS and point to the same server (IP
>> address). Whois says that the two second level names are both registered
>> to OOO "JilfondService" .
>>
>> This raises the question if CAs should be responsible for misissued
>> domain names, or if they should be allowed to issue certificates to
>> actually existing DNS names.
>>
>> I don't know if the bad punycode encodings are in the 2nd level names (a
>> registrar/registry responsibility, both were from 2012 or before) or in
>> the 3rd level names (locally created at an unknown date).
>>
>> An online utility based on the older RFC349x round trips all of these.
>> So if the issue is only compatibility with a newer RFC not referenced from the current BRs, these would probably be OK under the current BRs and certLint needs to accept them.
>

> In this case, the NFC and NFKC representations are the same:
>
> $ irb
> irb(main):001:0> require 'simpleidn'
> => true
> irb(main):002:0> a = "xn--109-3veba6djs1bfxlfmx6c9g"
> => "xn--109-3veba6djs1bfxlfmx6c9g"
> irb(main):003:0> u = SimpleIDN.to_unicode(a)
> => "؈؜ب1؈ؘؑ؞؈ءؒؔ0؛9ؘؚ؜"
> irb(main):004:0> u.unicode_normalize(:nfc) == u
> => false
> irb(main):005:0> u.unicode_normalize(:nfc) == u.unicode_normalize(:nfkc)
> => true
> irb(main):006:0> n = SimpleIDN.to_ascii(u.unicode_normalize(:nfc))
> => "xn--109-3veba6djs0bgykfmx6c9g"
> irb(main):007:0> n == a
> => false
>

Ah, I missed that this was about one of many extra SANs in the
certificate, not the main name, as this was not previously reported and
I don't have the tools handy to easily go through all those SANs myself.

Jakob Bohm

Aug 10, 2017, 10:58:48 PM

to mozilla-dev-s...@lists.mozilla.org

On 11/08/2017 00:14, Ryan Sleevi wrote:
> On Thu, Aug 10, 2017 at 5:31 PM, Jakob Bohm via dev-security-policy <
> dev-secur...@lists.mozilla.org> wrote:
>>
>> This raises the question if CAs should be responsible for misissued
>> domain names, or if they should be allowed to issue certificates to
>> actually existing DNS names.
>>
>
> No. It doesn't. That's been addressed several times in the CA/Browser Forum
> with other forms of 'invalid' (non-preferred name syntax) domain names,
> such as those with underscores.
>
> It's not permitted under RFC 5280, thus, CAs are responsible. Full stop.
>

As an aside (not applicable to this case), it is worth noting that some
newer RFCs explicitly require DNS names with underscores, though
currently only for things that won't go in a certificate dnsName SAN
extension.

>
>> I don't know if the bad punycode encodings are in the 2nd level names (a
>> registrar/registry responsibility, both were from 2012 or before) or in
>> the 3rd level names (locally created at an unknown date).
>>
>> An online utility based on the older RFC349x round trips all of these.
>> So if the issue is only compatibility with a newer RFC not referenced from
>> the current BRs, these would probably be OK under the current BRs and
>> certLint needs to accept them.
>>
>
> No, it's a newer RFC not referenced in RFC 5280, so it's not permitted
> under the current BRs.
>
> There's no retroactive immunity.
>

As you could see, in the snipped part of my posting, I was checking the
wrong name from the certificate and concluding that it was apparently
valid under RFC349x, which Jonathan wrote was the one referenced by the
BRs. Therefore I mistook the report for complaining that the encoding
was not valid under RFC5890, which is not referenced by the BRs.

In a later post, Jonathan explained that the problematic name was a
different one which I did not look at.

Peter Bowen

Aug 11, 2017, 9:54:22 AM

to Jonathan Rudenberg, mozilla-dev-s...@lists.mozilla.org

On Thu, Aug 10, 2017 at 1:22 PM, Jonathan Rudenberg via
dev-security-policy <dev-secur...@lists.mozilla.org> wrote:
> RFC 5280 section 7.2 and the associated IDNA RFC requires that Internationalized Domain Names are normalized before encoding to punycode.
>
> Let’s Encrypt appears to have issued at least three certificates that have at least one dnsName without the proper Unicode normalization applied.
>

> It’s also worth noting that RFC 3491 (referenced by RFC 5280 via RFC 3490) requires normalization form KC, but RFC 5891 which replaces RFC 3491 requires normalization form C. I believe that the BRs and/or RFC 5280 should be updated to reference RFC 5890 and by extension RFC 5891 instead.

I did some reading on Unicode normalization today, and it strongly
appears that any string that has been normalized to normalization form
KC is by definition also in normalization form C. Normalization is
idempotent, so doing toNFKC(toNFKC()) will result in the same string
as just doing toNFKC() and toNFC(toNFC()) is the same as toNFC().
Additionally toNFKC is the same as toNFC(toK()).

This means that checking that a string matches the result of
toNFC(string) is a valid check regardless of whether using the 349* or
589* RFCs. It does mean that Certlint will not catch strings that are
in NFC but not in NFKC.
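
A small Ruby sketch of the relationship described above, using only String#unicode_normalize from the standard library. The helper in_form? and the sample strings are illustrative choices and are not taken from certlint.

```ruby
# Hypothetical helper: a string is "in" a form if normalizing to that form leaves it unchanged.
def in_form?(str, form)
  str.unicode_normalize(form) == str
end

# Illustrative samples (assumptions for this sketch, not names from any certificate).
samples = {
  "decomposed e + acute" => "e\u0301",
  "superscript two"      => "\u00B2",
  "fi ligature"          => "\uFB01",
  "plain ASCII"          => "example"
}

samples.each do |name, str|
  nfkc = str.unicode_normalize(:nfkc)
  # The last column checks that the NFKC result is itself already in NFC.
  puts format("%-22s in NFC: %-5s in NFKC: %-5s NFKC output in NFC: %s",
              name, in_form?(str, :nfc), in_form?(str, :nfkc), in_form?(nfkc, :nfc))
end
```

The superscript two and the ligature are in NFC but not in NFKC, which is the kind of string an NFC-only check would let through.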

Thanks,
Peter

P.S. I've yet to find a registered domain name not in NFC, and that
includes checking every name in the zone files for all ICANN gTLDs
and a few ccTLDs

swch...@gmail.com

Jun 25, 2018, 3:37:20 PM

to mozilla-dev-s...@lists.mozilla.org

Hi,
I have an example internationalized domain that is NFC but not NFKC: "xn--ttt-8fa.pumesa.com" (this is a fake domain; my focus is on the general pattern).
The pattern that causes a domain to be NFC but not NFKC in Golang is: "xn--", followed by the same letter three times, followed by a single "-", followed by any single digit, followed by "fa". I know this pattern doesn't describe real Unicode; however, the behavior in the programming languages is curious (below).
I ran a few tests using Golang (version go1.10.3 darwin) and Java (version "1.8.0_60"); here are the key parts of the code I used:
1) Golang (used "ToUnicode" to mimic how Zlint tests):

    package main

    import (
        "fmt"

        "golang.org/x/net/idna"
        "golang.org/x/text/unicode/norm"
    )

    func main() {
        str := "xn--xxx-7fa.pumesa.com"
        // Decode the A-label to its Unicode form before checking normalization.
        decoded, err := idna.ToUnicode(str)
        if err != nil {
            fmt.Println(err)
        }
        fmt.Println("Is NFC ", norm.NFC.IsNormalString(decoded))
        fmt.Println("Is NFKC ", norm.NFKC.IsNormalString(decoded))
    }

The last NFKC check is what causes Zlint to throw an error stating that the Unicode is not in compliance. It seems that Zlint needs to be updated to follow the latest BR (RFC 5891), meaning it should check whether the Unicode in question is NFC compliant rather than NFKC compliant?
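
To see what the pattern decodes to without relying on an IDNA library, here is a rough Ruby sketch of the RFC 3492 decoding procedure followed by a normalization check. It is a simplified decoder with no overflow or validity checks, and punycode_decode, adapt_bias, digit_value and in_form? are names chosen for this example only.

```ruby
# Bias adaptation from RFC 3492 section 6.1 (base 36, tmin 1, tmax 26, skew 38, damp 700).
def adapt_bias(delta, num_points, first_time)
  delta = first_time ? delta / 700 : delta / 2
  delta += delta / num_points
  k = 0
  while delta > (35 * 26) / 2
    delta /= 35
    k += 36
  end
  k + (36 * delta) / (delta + 38)
end

# Maps a punycode digit character to its value: a-z => 0..25, 0-9 => 26..35.
def digit_value(char)
  case char
  when "a".."z" then char.ord - "a".ord
  when "A".."Z" then char.ord - "A".ord
  when "0".."9" then char.ord - "0".ord + 26
  else raise ArgumentError, "invalid punycode digit: #{char.inspect}"
  end
end

# Decodes the part of a label after "xn--" into a Unicode string (simplified sketch).
def punycode_decode(encoded)
  basic, _delimiter, extended = encoded.rpartition("-")
  output = basic.codepoints
  n = 128
  i = 0
  bias = 72
  digits = extended.chars
  until digits.empty?
    old_i = i
    weight = 1
    k = 36
    loop do
      digit = digit_value(digits.shift)
      i += digit * weight
      threshold = (k - bias).clamp(1, 26)
      break if digit < threshold
      weight *= 36 - threshold
      k += 36
    end
    bias = adapt_bias(i - old_i, output.length + 1, old_i.zero?)
    n += i / (output.length + 1)
    i %= output.length + 1
    output.insert(i, n)
    i += 1
  end
  output.pack("U*")
end

# Hypothetical helper: true when normalizing to the given form leaves the string unchanged.
def in_form?(str, form)
  str.unicode_normalize(form) == str
end

%w[ttt-8fa xxx-7fa].each do |encoded|
  label = punycode_decode(encoded)
  points = label.codepoints.map { |c| format("U+%04X", c) }.join(" ")
  puts format("%-8s -> %s  NFC: %s  NFKC: %s",
              encoded, points, in_form?(label, :nfc), in_form?(label, :nfkc))
end
```

With this decoder, both sample labels contain U+00B4 (ACUTE ACCENT) among the repeated letters. That code point has a compatibility decomposition, which would explain why the decoded label passes an NFC check and fails an NFKC check.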

Below is something even more interesting.

2) Java:

    import java.net.IDN;
    import java.text.Normalizer;

    public class Main {
        public static void main(String[] args) {
            String cn = "xn--www-0xx.pumesa.com";
            // IDN.toASCII returns the ASCII (A-label) form; unless the toUnicode
            // line below is enabled, the checks run on that ASCII string.
            String punycode = IDN.toASCII(cn);
            //punycode = IDN.toUnicode(punycode);
            System.out.println("is NFC " + Normalizer.isNormalized(punycode, Normalizer.Form.NFC));
            System.out.println("is NFKC " + Normalizer.isNormalized(punycode, Normalizer.Form.NFKC));
        }
    }

Per the Oracle documentation, java.net.IDN.toASCII conforms to RFC 3490, and it throws no error. This can be double-checked within the language by converting the punycode back to Unicode. Both print statements return true.
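
One thing worth checking before comparing the two languages: a normalization test applied to the "xn--" form is applied to an ASCII string, and ASCII text is unchanged by all four Unicode normalization forms. The Ruby sketch below illustrates that point only. It does not reproduce the behavior of Java's IDN class, and in_form? is a helper name invented for this example.

```ruby
# Hypothetical helper: true when normalizing to the given form leaves the string unchanged.
def in_form?(str, form)
  str.unicode_normalize(form) == str
end

# The string passed to IDN.toASCII in the Java snippet above.
a_label_name = "xn--www-0xx.pumesa.com"

puts "ASCII only: #{a_label_name.ascii_only?}"
%i[nfc nfd nfkc nfkd].each do |form|
  # Each check is trivially true because the input contains only ASCII characters.
  puts "#{form}: #{in_form?(a_label_name, form)}"
end
```

So a "true" result for both checks is expected whenever the string being tested is still in its ASCII form, whatever the decoded label looks like.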

So to reiterate, the two main questions are:
1) Should there be a discussion about why Oracle Java and Golang don't agree on whether this pattern causes unicode to be NFKC compliant?
The potential impact is that results obtained from a Java system may not be Zlint compliant.
2) Should Zlint be updated to the latest BR (RFC 5891) regardless of question #1?

Peter Saint-Andre

Jun 25, 2018, 4:21:47 PM

to swch...@gmail.com, mozilla-dev-s...@lists.mozilla.org

On 6/25/18 1:35 PM, swchang10--- via dev-security-policy wrote:
> On Friday, August 11, 2017 at 6:54:22 AM UTC-7, Peter Bowen wrote:

Probably. However, please be aware that the change from RFC 3490 (IDNA)
to RFC 5891 (IDNA2008) involved more than just a change from Unicode
normalization form KC to Unicode normalization form C.

Also relevant:

https://tools.ietf.org/html/rfc8399

Peter
