
Jurisdiction of incorporation validation issue


Jeremy Rowley

Aug 22, 2019, 10:29:50 PM
to mozilla-dev-s...@lists.mozilla.org
I posted this tonight: https://bugzilla.mozilla.org/show_bug.cgi?id=1576013. It's sort of an extension of the "some-state" issue, but with the incorporation information of an EV cert. The tl;dr of the bug is that sometimes the information isn't perfect because of user entry issues.

What I was hoping to do is have the system automatically populate the jurisdiction information based on the incorporation information. For example, if you use the Delaware secretary of state as the source, then the system should auto-populate Delaware as the State and US as the jurisdiction. And it does...with some.

However, you do have jurisdictions like Germany that consolidate incorporation information at www.handelsregister.de, so you can't actually tell which area is the incorporation jurisdiction until you do a search. Thus the fields that allow some user input. That user input is what hurts. In the end, we're implementing an address check that verifies the locality/state/country combination.
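
A minimal sketch of the auto-population idea described above, assuming a lookup table keyed by validation source. The source identifiers, the `SourceJurisdiction` type, and the `prefill` helper are hypothetical names for illustration, not part of any actual CA system. It shows the difference between a source that fixes both state and country (Delaware) and one that only fixes the country (Handelsregister).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceJurisdiction:
    country: str                 # ISO 3166-1 alpha-2 code, locked by the source
    state: Optional[str] = None  # None means the source does not determine it


# Hypothetical table; a real one would be maintained by a compliance team.
SOURCES = {
    "delaware-secretary-of-state": SourceJurisdiction(country="US", state="Delaware"),
    # Country is known, but the court/region is unknown until a search is done.
    "handelsregister.de": SourceJurisdiction(country="DE"),
}


def prefill(source_id: str) -> dict:
    """Return the jurisdiction fields for a source and which ones still need input."""
    j = SOURCES[source_id]
    needs_input = [] if j.state is not None else ["state"]
    return {"country": j.country, "state": j.state, "needs_input": needs_input}


print(prefill("delaware-secretary-of-state"))
print(prefill("handelsregister.de"))
```

Any field listed under `needs_input` is where manual entry, and therefore the address check, would still apply.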

The more interesting part (in my opinion) is how to find and address these certs. Right now, every time we have an issue or whenever a guideline changes we write a lot of code, pull a lot of certs, and spend a lot of time reviewing. Instead of doing this every time, we're going to develop a tool that will run automatically every time we change a validation rule to find everything else that will fail the newly updated rules. In essence, building unit tests on the data. What I like about this approach is it ends up building a system that lets us see how all the rule changes interplay, since sometimes they may intersect in weird ways. It'll also let us more easily measure the impact of changes on the system. Anyway, I like the idea. Thought I'd share it here to get feedback and suggestions for improvement. Still in spec phase, but I can share more info as it gets developed.
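
One way to picture "unit tests on the data" is the sketch below: every rule is a function over a stored record, and changing the rule set triggers a re-run that reports which records newly fail. The record layout, the rule names, and the two-letter test for abbreviations are assumptions made for illustration only.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Rule = Callable[[Record], bool]  # returns True when the record passes


def failing_ids(records: List[Record], rules: Dict[str, Rule]) -> Dict[str, List[str]]:
    """Map each rule name to the ids of the records that fail it."""
    return {
        name: [r["id"] for r in records if not rule(r)]
        for name, rule in rules.items()
    }


def newly_failing(records: List[Record], old_rules: Dict[str, Rule],
                  new_rules: Dict[str, Rule]) -> List[str]:
    """Ids that pass every old rule but fail at least one new rule."""
    old_bad = {i for ids in failing_ids(records, old_rules).values() for i in ids}
    new_bad = {i for ids in failing_ids(records, new_rules).values() for i in ids}
    return sorted(new_bad - old_bad)


records = [
    {"id": "cert-1", "joi_state": "Delaware"},
    {"id": "cert-2", "joi_state": "DE"},
    {"id": "cert-3", "joi_state": ""},
]
old_rules = {"state_present": lambda r: bool(r["joi_state"])}
# Crude stand-in for "state is spelled out": reject two-letter values.
new_rules = dict(old_rules, state_spelled_out=lambda r: len(r["joi_state"]) != 2)

print(newly_failing(records, old_rules, new_rules))  # ['cert-2']
```

The size of the `newly_failing` list is also a rough measure of the impact of a proposed rule change.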

Thanks for listening.

Ryan Sleevi

Aug 23, 2019, 11:46:06 AM
to Jeremy Rowley, mozilla-dev-s...@lists.mozilla.org
On Thu, Aug 22, 2019 at 10:29 PM Jeremy Rowley via dev-security-policy <
dev-secur...@lists.mozilla.org> wrote:

> I posted this tonight:
> https://bugzilla.mozilla.org/show_bug.cgi?id=1576013. It's sort of an
> extension of the "some-state" issue, but with the incorporation information
> of an EV cert. The tl;dr of the bug is that sometimes the information
> isn't perfect because of user entry issues.
>
> What I was hoping to do is have the system automatically populate the
> jurisdiction information based on the incorporation information. For
> example, if you use the Delaware secretary of state as the source, then the
> system should auto-populate Delaware as the State and US as the
> jurisdiction. And it does...with some.
>
> However, you do have jurisdictions like Germany that consolidate
> incorporation information at www.handelsregister.de
> so you can't actually tell which area is
> the incorporation jurisdiction until you do a search. Thus, the fields to
> allow some user input. That user input is what hurts. In the end, we're
> implementing an address check that verifies the locality/state/country
> combination.
>

Could you highlight a bit more your proposal here? My understanding is
that, despite the Handelsregister ("Commercial Register") being available
at a country level, it's further subdivided into a list of county or
region - e.g. the Amtsgericht Herne ("Local Court Herne").

It sounds like you're still preparing to allow for manual/human input, and
simply consistency checking. Is there a reason to not use an
allowlist-based approach, in which your Registration Agents may only select
from an approved list of County/Region/Locality managed by your Compliance
Team?

That, of course, still allows for human error. Using the excellent example
of the Handelsregister, perhaps you could describe a bit more the flow a
Validation Specialist would go through. Are they expected to examine a
faxed hardcopy? Or do they go to handelsregister.de and look up via the
registration code?

I ask, because it strikes me that this could be an example where a CA could
further improve automation. For example, it's not difficult to imagine that
a locally-developed extension could know the webpages used for validation
of the information, and extract the salient info, when that information is
not easily encoded in a URL. For those not familiar, Handelsregister
encodes the parameters via form POST, a fairly common approach for these
company registers, and thus makes it difficult to store a canonical
resource URL for, say, a server-to-server retrieval. This would help you
quickly and systematically identify the relevant jurisdiction and court,
and in a way that doesn't involve human error.
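
To illustrate why a form POST makes it hard to store a canonical resource URL, here is a short sketch using Python's standard `urllib`. The endpoint, the form field names, and the register number are all made-up placeholders; the real Handelsregister form is not modelled here, and no request is actually sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint and field names, for illustration only.
SEARCH_URL = "https://registry.example/search"
form = {"court": "Amtsgericht Herne", "register_type": "HRB", "register_number": "12345"}

body = urlencode(form).encode("ascii")
request = Request(SEARCH_URL, data=body, method="POST")

# The URL alone says nothing about which company was looked up;
# the lookup parameters travel in the request body instead.
print(request.full_url)
print(body.decode("ascii"))
```

Anything that wants to reproduce the lookup later has to record the form fields as well as the URL, which is the kind of detail a browser extension or server-side retrieval would need to capture.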

I'm curious how well that approach generalizes, and/or what challenges may
exist. I totally understand that for registries which solely use hard
copies, this is a far more difficult task than it needs to be, and thus an
element of human review. However, depending on how prevalent the hardcopy
vs online copy is, we might be able to pursue automation for more, and thus
increase the stringency for the exceptions that do involve physical copies.


> The more interesting part (in my opinion) is how to find and address these
> certs. Right now, every time we have an issue or whenever a guideline
> changes we write a lot of code, pull a lot of certs, and spend a lot of
> time reviewing. Instead of doing this every time, we're going to develop a
> tool that will run automatically every time we change a validation rule to
> find everything else that will fail the newly updated rules. In essence,
> building unit tests on the data. What I like about this approach is it ends
> up building a system that lets us see how all the rule changes interplay
> since sometimes they may intersect in weird ways. It'll also let us more
> easily measure impact of changes on the system. Anyway, I like the idea. Thought
> I'd share it here to get feedback and suggestions for improvement. Still in
> spec phase, but I can share more info as it gets developed.
>

This sounds like a great idea, and I would love to know more details here.
For example, what's the process now for identifying these
jurisdictionOfIncorporation issues? How would it improve or change with
this system?

You describe it as "validation rule" changes - and I'm not sure if you're
talking about the BRs (i.e. "we validated this org at time X") or something
else. I'm not sure whether you're adding additional data, or formalizing
checks on existing data. More details here could definitely help try and
generalize it, and we might be able to formalize it as a best practice.
Alternatively, even if we can't formalize it as a requirement, it may be
usable as the basis when evaluating potential impact or cost of
changes (to policy or the BRs) in the future. That is, "any CA that has
implemented (system you describe) should be able to provide quantifiable
data about the impact of (proposed change X). If CAs cannot do so (because
they did not implement the change), their feedback and concerns will not be
considered."

Jakob Bohm

Aug 23, 2019, 1:19:20 PM
to mozilla-dev-s...@lists.mozilla.org
Additional issues seen at some CAs (not necessarily DigiCert):

1. I believe the BRs and/or underlying technical standards are very
clear on whether the ST field should be a full name ("California") or an
abbreviation ("CA").

2. The fact that a country has subdivisions listed in the general ISO
standard for country codes doesn't mean that those are always part of
the jurisdiction of incorporation and/or address.

3. The fact that a government data source lists the incorporation
locality of a company doesn't mean that this locality detail is
actually a relevant part of the jurisdictionOfIncorporation. This
essentially depends on whether the rules in that country ensure uniqueness of
both the company number and company name at a higher jurisdiction
level (national or state) to the same degree as at the lower level.
For example, in the US the company name "Stripe" is not unique
nationwide.

In practice this means that validation specialists need to draw up
various common facts for each country served by a CA, and keep those up
to date.

As a non-expert citizen, I believe the proper details for my own country
(C=DK) are:

1. ST= should be Greenland or Grønland (Greenland self-governing
territory aka .gl), Faeroe Islands or Færøerne (Faeroe Islands self-
governing territory aka .fo) or omitted (main country, under the
central government). Other territories were lost more than 100 years
ago and can't occur in current certificate subjects.

2. Company numbers for the main country are numbers from the online CVR
database. They are the same as VAT numbers, except: no leading DK, not
all companies have a VAT registration, and not all VAT registrations
are companies (some are actually the social security numbers of the
owners of sole proprietorships). Other private organizations are
often listed in CVR too.

3. Government institutions at all levels have numbers from a database
used for electronic billing in OIO UBL XML formats. Some of those
numbers are CVR numbers (like for companies), some are SE numbers and
some are EAN/GLN numbers.

4. Postal codes are 4 digits (leading 0 only occurs in some special
cases, DK prefix is added on international physical mail, but is not
actually part of the postal code).

5. The new code number types added in EV 1.7.0 require additional
research on how they officially map to Danish public administration.

But a CA validation team should research this further to set up proper
templates and scripts for validating EV/OV/IV applicants claiming C=DK.
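
As a rough sketch of what such a template script might look like for C=DK, the check below encodes only two of the points listed above: the allowed ST values and the four-digit postal code. The field names and the `check_dk_subject` helper are hypothetical, and the values come from the non-expert summary in this message, so they would need the further research mentioned before being relied on.

```python
import re

# Values taken from the list above; treat them as one contributor's
# non-expert summary, not as an authoritative template.
DK_ALLOWED_ST = {"Greenland", "Grønland", "Faeroe Islands", "Færøerne"}
DK_POSTAL_CODE = re.compile(r"[0-9]{4}")


def check_dk_subject(subject: dict) -> list:
    """Return a list of problems found in a subject claiming C=DK."""
    problems = []
    st = subject.get("ST")
    # ST is omitted for the main country, so only a present value is checked.
    if st is not None and st not in DK_ALLOWED_ST:
        problems.append(f"unexpected ST value: {st!r}")
    postal = subject.get("postalCode")
    # The DK prefix is not part of the postal code, so "DK-2860" is rejected.
    if postal is not None and not DK_POSTAL_CODE.fullmatch(postal):
        problems.append(f"postal code is not 4 digits: {postal!r}")
    return problems


print(check_dk_subject({"C": "DK", "postalCode": "2860"}))
print(check_dk_subject({"C": "DK", "ST": "Sjælland", "postalCode": "DK-2860"}))
```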


Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S. https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark. Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded

Jeremy Rowley

Aug 23, 2019, 2:00:36 PM
to ry...@sleevi.com, mozilla-dev-s...@lists.mozilla.org


* Could you highlight a bit more your proposal here? My understanding is that, despite the Handelsregister ("Commercial Register") being available at a country level, it's further subdivided into a list of county or region - e.g. the Amtsgericht Herne ("Local Court Herne").


* It sounds like you're still preparing to allow for manual/human input, and simply consistency checking. Is there a reason to not use an allowlist-based approach, in which your Registration Agents may only select from an approved list of County/Region/Locality managed by your Compliance Team?


* That, of course, still allows for human error. Using the excellent example of the Handelsregister, perhaps you could describe a bit more the flow a Validation Specialist would go through. Are they expected to examine a faxed hardcopy? Or do they go to handelsregister.de and look up via the registration code?



* I ask, because it strikes me that this could be an example where a CA could further improve automation. For example, it's not difficult to imagine that a locally-developed extension could know the webpages used for validation of the information, and extract the salient info, when that information is not easily encoded in a URL. For those not familiar, Handelsregister encodes the parameters via form POST, a fairly common approach for these company registers, and thus makes it difficult to store a canonical resource URL for, say, a server-to-server retrieval. This would help you quickly and systematically identify the relevant jurisdiction and court, and in a way that doesn't involve human error.

I did not know that about Handelsregister. So that’s good info. Right now, the validation staff selects Handelsregister as the source, the system retrieves the information, the staff then selects the jurisdiction information and enters the registration information. Germany is locked in as the country of verification (because Handelsregister is the source), but the staff enters the locality/state type information as the system doesn’t know which region is correct.

The idea is that everywhere we can, the process should automatically fill in jurisdiction information for the validation staff so no typing is required. This is being done in three parts:

1. Immediate (aka Stop the Hurt): The first step is to put the GeoCode check in place to ensure that, no matter what, there will be valid, correctly spelled information in the certificate. There will still be user-typed information during this phase since this phase is Aug 18 2019. The system will work exactly as it does now except that the JOI information will run through the GeoCode system to verify that the information isn’t wrong. If it is wrong, the system won’t allow the cert to be approved. At this point, no new issues should occur, but I won’t be satisfied as it’s way too manual, and the registration number is still a manual entry. That needs to change.
2. Intermediate (aka Neuter the Staff): During this process we plan to eliminate typing of sources. Instead, the sources will be picklists based on jurisdiction. This means that if you select Germany and the company type is an LLC, you get a list of available sources. Fool proof-ish. There’s still a copy/paste or manual entry of the registration number. For those sources that do provide an API, we can tie into the API, retrieve the documentation, and populate the information. We want to do that as well, provided it doesn’t throw off phase 3. Since the intermediate solution is also a stop-gap to the final solution, we want it to be a substantial improvement but one that doesn’t impede our final destination.
3. The refactor (aka Documents r Us): This is still very much being specc’ed, but we’re currently thinking we want to evolve the system to a document system. Right now the system works on checklists. For JOI, you enter the JOI part, select a document (or two) that you’ll use to verify JOI, and then transfer information to the system from the document. The revamp moves it to where you have the document and specify on the document which parts of the document apply to the organization. For example, you specify on the document that a number is a registration number or that a name is an org name, highlighting the info. With auto-detection of the fields (just based on key words), you end up with a pretty dang automated system. The validation staff is there to review for accuracy and highlight things that might be missed. Hence, no typing or specifying any information. It’s all directly from the source.
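
The picklist idea in the intermediate phase could look roughly like the sketch below, assuming sources are keyed by jurisdiction and company type. The table contents, the jurisdiction keys, and both helper functions are placeholders for illustration and do not reflect an actual approved list.

```python
# Hypothetical picklist; the entries are placeholders, not an approved list.
APPROVED_SOURCES = {
    ("DE", "LLC"): ["handelsregister.de"],
    ("US-DE", "LLC"): ["Delaware Secretary of State"],
}


def source_choices(jurisdiction: str, entity_type: str) -> list:
    """Sources a validation agent may pick; an empty list means escalate."""
    return list(APPROVED_SOURCES.get((jurisdiction, entity_type), []))


def select_source(jurisdiction: str, entity_type: str, choice: str) -> str:
    """Accept only a source that is on the picklist, never free text."""
    if choice not in source_choices(jurisdiction, entity_type):
        raise ValueError(
            f"{choice!r} is not an approved source for {jurisdiction}/{entity_type}"
        )
    return choice


print(source_choices("DE", "LLC"))
print(select_source("DE", "LLC", "handelsregister.de"))
```

Because the agent can only select, a mistyped source name fails loudly instead of ending up in the validation record.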

Naming conventions are also not approved yet. Since the engineers watch this forum, they’ll probably throw things at me when they see the code names.


* I'm curious how well that approach generalizes, and/or what challenges may exist. I totally understand that for registries which solely use hard copies, this is a far more difficult task than it needs to be, and thus an element of human review. However, depending on how prevalent the hardcopy vs online copy is, we might be able to pursue automation for more, and thus increase the stringency for the exceptions that do involve physical copies.

Right now we get the hard copies and turn them into a PDF to store in the audit system for review during internal and external audits. During validation, all documentation must be present and reviewed. With better use of OCR, we can always at least copy and paste information instead of typing it.



* This sounds like a great idea, and would love to know more details here. For example, what's the process now for identifying these jurisdictionOfIncorporation issues? How would it improve or change with this system?


The process right now is we write a script based on things we can think of that might be wrong (abbreviated states, the word “some” in the state field, etc). We usually pull a sampling of a couple thousand certs and review those to see if we can find anything wrong that can help identify other patterns. We’re in the middle of doing that for the JOI issues. What would be WAY better is if we had rule sets for validation information (similar to cablint) that checked validation information and how it is stored in our system, and made these rule sets run on the complete data every time we change something in validation. Right now, we build quick and dirty checks that run one time when we have an incident. That’s not great as it’s a lot of stuff we can’t reuse. What we should do is build something (that, fingers crossed, we can open source and share) that will be a library of checks on validation information. Sure, it’ll take a lot of configuration to work with how other CAs store data, but one thing we’ve seen problems with is that changes in one system lead to unexpected potential non-compliances in others. Having something that works cross-functionally throughout the system helps.

A better example is some-state. We scanned for values not listed as states and cities that have “some”, “any”, “none”, etc. That only finds a limited set of the problem, and obviously missed the JOI information (not part of the same data set). Going forward, I want a rule set that says: is this a state? If so, then check this source to see if it’s a real state. Then check this to see if it also exists in the country specified. Then check to see if the locality specified exists in the state. Then see if there is a red flag from a map that says the org doesn’t exist. (The map check is coming, but it’s not there yet.) Instead of finding small one-off problems people report, find them on a global scale with a rule we run every time something in the CAB forum, Mozilla policy, or our own system changes.
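
The chain of checks described above can be sketched as a function that returns the first step that fails. The tiny `GAZETTEER` dictionary is a stand-in for whatever geocoding or reference source would really be used, and the step names are invented for this example; the map check is left out since it is described as not yet available.

```python
import re

# Tiny stand-in for a real gazetteer or geocoding service (assumption).
GAZETTEER = {
    "US": {"Delaware": {"Wilmington", "Dover"}, "Utah": {"Lehi"}},
    "DE": {"Nordrhein-Westfalen": {"Herne"}},
}
PLACEHOLDERS = {"some", "any", "none"}


def check_location(country: str, state: str, locality: str) -> str:
    """Return "ok" or the name of the first check that fails."""
    # Match whole words so that a real name merely containing "any" is not flagged.
    words = set(re.split(r"[^a-z]+", state.lower()))
    if words & PLACEHOLDERS:
        return "placeholder_state"
    if not any(state in states for states in GAZETTEER.values()):
        return "unknown_state"
    if state not in GAZETTEER.get(country, {}):
        return "state_not_in_country"
    if locality not in GAZETTEER[country][state]:
        return "locality_not_in_state"
    return "ok"


print(check_location("US", "Some-State", "Dover"))     # placeholder_state
print(check_location("DE", "Delaware", "Dover"))       # state_not_in_country
print(check_location("US", "Delaware", "Wilmington"))  # ok
```

Returning the failing step, rather than a bare pass/fail, makes it easier to group problem certs by cause when the rule is run across the whole data set.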



* You describe it as "validation rule" changes - and I'm not sure if you're talking about the BRs (i.e. "we validated this org at time X") or something else. I'm not sure whether you're adding additional data, or formalizing checks on existing data. More details here could definitely help try and generalize it, and might be able to formalize it as a best practice. Alternatively, even if we can't formalize it as a requirement, it may be able to use as the basis when evaluating potential impact or cost of changes (to policy or the BRs) in the future. That is, "any CA that has implemented (system you describe) should be able to provide quantifiable data about the impact of (proposed change X). If CAs cannot do so (because they did not implement the change), their feedback and concerns will not be considered."

Validation rule meaning our own system, the CAB forum, Mozilla policy. Basically, anything that could call into question the integrity of some data piece within our system. The point is to catch all changes that may happen proactively, not just when someone pings me with a problem. The requirement I think we’re trying to meet is “never have the same problem again, even if a rule changes” because the system will take that one problem, log it as a unit test, and run that unit test every time we change the internal rule set to detect all data that violates that rule as modified. Illustrative example: Assume we decide we want all states abbreviated. Note this would contradict the rule in the EV guidelines that requires JOI states to be written out. Right now, I think this contradiction could pass undetected by a lot of CA systems. However, if you have a rule set that can be enforced globally across the entire data set, you end up instantly detecting that no valid EV cert could ever issue. Danger! Anyway, the value of this is pretty huge internally IMO. And for compliance, it’ll make our job easier. No more 3% audits trying to catch mistakes.
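
The illustrative example above (an "abbreviate all states" rule colliding with the requirement that JOI states be written out) can be sketched as a satisfiability check over the rule set. The rule names and the small sets of state values are assumptions for illustration.

```python
FULL_NAMES = {"Delaware", "Utah", "California"}
ABBREVIATIONS = {"DE", "UT", "CA"}

rules = {
    # EV Guidelines rule cited above: JOI states are written out in full.
    "joi_state_written_out": lambda state: state in FULL_NAMES,
    # Hypothetical new internal rule: all states abbreviated.
    "state_abbreviated": lambda state: state in ABBREVIATIONS,
}


def satisfiable(rules: dict, candidates: set) -> set:
    """Values that pass every rule; an empty set means the rules conflict."""
    return {v for v in candidates if all(rule(v) for rule in rules.values())}


print(satisfiable(rules, FULL_NAMES | ABBREVIATIONS))  # set()
```

An empty result is the "Danger!" signal: with both rules active, no state value could ever pass, so no valid cert could issue.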

Ryan Sleevi

Aug 23, 2019, 2:39:27 PM
to Jeremy Rowley, ry...@sleevi.com, mozilla-dev-s...@lists.mozilla.org
On Fri, Aug 23, 2019 at 2:00 PM Jeremy Rowley <jeremy...@digicert.com>
wrote:

>
>
> - Could you highlight a bit more your proposal here? My understanding
> is that, despite the Handelsregister ("Commercial Register") being
> available at a country level, it's further subdivided into a list of
> county or region - e.g. the Amtsgericht Herne ("Local Court Herne").
>
>
>
> - It sounds like you're still preparing to allow for manual/human
> input, and simply consistency checking. Is there a reason to not use an
> allowlist-based approach, in which your Registration Agents may only select
> from an approved list of County/Region/Locality managed by your Compliance
> Team?
>
>
>
> - That, of course, still allows for human error. Using the excellent
> example of the Handelsregister, perhaps you could describe a bit more the
> flow a Validation Specialist would go through. Are they expected to examine
> a faxed hardcopy? Or do they go to handelsregister.de and look up via
> the registration code?
>
>
>
> - I ask, because it strikes me that this could be an example where a
> - I'm curious how well that approach generalizes, and/or what
> challenges may exist. I totally understand that for registries which solely
> use hard copies, this is a far more difficult task than it needs to be, and
> thus an element of human review. However, depending on how prevalent the
> hardcopy vs online copy is, we might be able to pursue automation for more,
> and thus increase the stringency for the exceptions that do involve
> physical copies.
>
>
>
> Right now we get the hard copies and turn them into a PDF to store in the
> audit system for review during internal and external audits. During
> validation, all documentation must be present and reviewed. Using OCR
> better, we can always at least copy and paste information instead of typing
> it.
>

I'm a little nervous about encouraging wide use of OCR. You may recall at
least one CA was bit by an issue in which their OCR system misidentified
letters - https://bugzilla.mozilla.org/show_bug.cgi?id=1311713

That's why I was keen to suggest technical solutions which would verify and
cross-check. My main concern here would be, admittedly, to ensure the
serialNumber itself is reliably entered and detected. Extracting that from
a system, such as you could do via an extension when looking at, say, the
Handelsregister, is a possible path to reduce both human transcription and
machine-aided transcription issues.
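
A minimal sketch of the verify-and-cross-check idea for the serialNumber, assuming two independently obtained values are available (one typed or OCR'd, one extracted from the registry page). The `normalize` and `cross_check` helpers and the sample numbers are hypothetical.

```python
import re


def normalize(registration_number: str) -> str:
    """Uppercase and drop spacing and punctuation so formatting differences do not matter."""
    return re.sub(r"[^0-9A-Z]", "", registration_number.upper())


def cross_check(entered: str, extracted: str) -> bool:
    """True only when the typed value and the machine-extracted value agree."""
    return normalize(entered) == normalize(extracted)


print(cross_check("HRB 12345", "hrb-12345"))  # True: same number, different formatting
print(cross_check("HRB 1O345", "HRB 10345"))  # False: letter O versus digit 0
```

A mismatch does not say which value is wrong; it only flags the record for a human to resolve, which is the point of having two independent paths.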

Of course, alternative ways of cross-checking and vetting that data may
exist. Alternatively, it may be that the solution would be to only
allowlist the use of validation sources that made their datasets machine
readable - this would/could address a host of issues in terms of quality.
I'm admittedly not sure the extent to which organizations still rely on
legacy paper trails, and I understand they're still unfortunately common in
some jurisdictions, particularly in the Asia/Pacific region, so it may not
be as viable.


> The process right now is we write a script based on things we can think of
> that might be wrong (abbreviated states, the word “some” in the state
> field, etc). We usually pull a sampling of a couple thousand certs and
> review those to see if we can find anything wrong that can help identify
> other patterns. We’re in the middle of doing that for the JOI issues. What
> would be WAY better is if we had rule sets for validation information
> (similar to cablint) that checked validation information and how it is
> stored in our system and made these rule sets run on the complete data
> every time we change something in validation. Right now, we build quick and
> dirty checks that run one time when we have an incident. That’s not great
> as it’s a lot of stuff we can’t reuse. What we should do is build something
> (that, fingers crossed, we can open source and share) that will be a
> library of checks on validation information. Sure, it’ll take a lot of
> configuration to work with how other CAs store data, but one thing we’ve
> seen problems with is that changes in one system lead to unexpected
> potential non-compliances in others. Having something that works
> cross-functionally throughout the system helps.
>

Hugely, and this is exactly the kind of stuff I'm excited to see CAs
discussing and potentially sharing. I think there are some opportunities
for incremental improvements here that may be worth looking at, even before
that final stage.

I would argue a source of (some of) these problems is ambiguity that is
left to the CA's discretion. For example, is the state abbreviated or not?
Is the jurisdictional information clear? Who are the authorized registries
for a jurisdiction that a CA can use?

I can think of some incremental steps here:
- Disclosing exact detailed procedures via CP/CPS
- An emphasis should be on allowlisting. Anything not on the allowlist
*should* be an exceptional thing.
- For example, stating DigiCert will always use a State from ISO 3166-2
makes it clear, and also makes it something verifiable (i.e. someone can
implement an automated check)
- Similarly, enumerating the registries used makes it possible, in many
cases, to automatically check the serialNumber for both format and accuracy
- Modifying the CA/B Forum documents to formalize those processes, by
explicitly removing the ambiguity or CA discretion. DigiCert's done well
here in the past, removing validation methods like 3.2.2.4.1 / 3.2.2.4.5
due to their misuse and danger
- Writing automated tooling to vet/validate
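
To make the "someone can implement an automated check" point concrete, here is a small sketch covering two of the steps above: checking a state against an ISO 3166-2 list and checking a serialNumber against a per-registry format. The ISO entries are a three-item excerpt, and the Handelsregister pattern is an illustrative assumption rather than a verified format rule.

```python
import re

# Small excerpt of ISO 3166-2 (code -> subdivision name); a real check
# would load the full list.
ISO_3166_2 = {"US-DE": "Delaware", "US-UT": "Utah", "DE-NW": "Nordrhein-Westfalen"}

# Illustrative pattern only; the real format rules for each registry are
# an assumption here.
SERIAL_PATTERNS = {"handelsregister.de": re.compile(r"HR[AB] [0-9]+")}


def state_is_listed(country: str, state: str) -> bool:
    """True when the state name appears under the given country in the excerpt."""
    return any(
        code.startswith(country + "-") and name == state
        for code, name in ISO_3166_2.items()
    )


def serial_format_ok(registry: str, serial_number: str) -> bool:
    """True when the registry is enumerated and the number matches its pattern."""
    pattern = SERIAL_PATTERNS.get(registry)
    return pattern is not None and pattern.fullmatch(serial_number) is not None


print(state_is_listed("US", "Delaware"))
print(serial_format_ok("handelsregister.de", "HRB 12345"))
```

A registry that is not enumerated fails the format check by design, which matches the idea that anything off the allowlist should be exceptional.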

The nice part is that by formalizing the rules, you can benefit a lot from
improved checking that the community may develop, and if it doesn't
materialize, contribute your own to the benefit of the community.


> A better example is some-state. We scanned for values not listed as states
> and cities that have “some”, “any”, “none”, etc. That only finds a limited
> set of the problem, and obviously missed the JOI information (not part of
> the same data set). Going forward, I want a rule set that says, is this a
> state? If so, then check this source to see if it’s a real state. Then
> check this to see if it also exists in the country specified. Then check to
> see if the locality specified exists in the state. Then see if there is a
> red flag from a map that says the org doesn’t exist. (The map check is
> coming – not there yet….) Instead of finding small one off problems people
> report, find them on a global scale with a rule we run every time something
> in the CAB forum, Mozilla policy, or our own system changes.
>

Yes, this is the expectation of all CAs.

As I understand it, following CAs' remediation of Some-State, etc, this is
exactly what members of the community went and did. This is not surprising,
since one of the clearly identified best practices from that discussion was
to look at ISO 3166-1/ISO 3166-2 for such information inconsistency.
SecureTrust, one of the illustrative good reports, did exactly that, and
that's why it's such a perfect example. It's unfortunate that a number of
other CAs didn't, which is why on the incident reports, I've continued to
push them in terms of their evaluation and disclosure.

This is the exact goal of Incident Reports: identifying not just the
incidents, but the systemic issues, devising solutions that can work, and
making sure to holistically remediate the problem.


>
> - You describe it as "validation rule" changes - and I'm not sure if
Yes, this is the goal, and I'm glad to hear some CAs are recognizing this.

Jeremy Rowley

Aug 23, 2019, 4:19:08 PM
to ry...@sleevi.com, mozilla-dev-s...@lists.mozilla.org

>> I'm a little nervous about encouraging wide use of OCR. You may recall at least one CA was bit by an issue in which their OCR system misidentified letters - https://bugzilla.mozilla.org/show_bug.cgi?id=1311713

>> That's why I was keen to suggest technical solutions which would verify and cross-check. My main concern here would be, admittedly, to ensure the serialNumber itself is reliably entered and detected. Extracting that from a system, such as you could do via an extension when looking at, say, the Handelsregister, is a possible path to reduce both human transcription and machine-aided transcription issues.

Right, and the OCR there is just to make the initial assessment. The idea is to still require validation staff to select the appropriate fields. I like the idea of cross-checking. Maybe what we can also do is tie into a non-primary source (like D&B or something) to confirm the jurisdiction information. We’ll have to evaluate it, but I like the idea of cross-checking against a reliable source that has an API even if we can’t use the source as our primary source for that information. I’ll need to investigate, but it should be possible for most of the EU and the US. Less so for the Middle East and Asia.

>> Of course, alternative ways of cross-checking and vetting that data may exist. Alternatively, it may be that the solution would be to only allowlist the use of validation sources that made their datasets machine readable - this would/could address a host of issues in terms of quality. I'm admittedly not sure the extent to which organizations still rely on legacy paper trails, and I understand they're still unfortunately common in some jurisdictions, particularly in the Asia/Pacific region, so it may not be as viable.

Yeah, that means you basically can’t issue in the Middle East and most of Asia. Japan would still work. China I’d have to look. Like I said, there could be non-primary sources that could correlate. We’ll spec that out as we get closer and see what we can do for cross-correlation. It may be that we can have enough sources world-wide that you can always confirm registration with a secondary source.



* Hugely, and this is exactly the kind of stuff I'm excited to see CAs discussing and potentially sharing. I think there are some opportunities for incremental improvements here that may be worth looking at, even before that final stage.


* I would argue a source of (some of) these problems is ambiguity that is left to the CA's discretion. For example, is the state abbreviated or not? Is the jurisdictional information clear? Who are the authorized registries for a jurisdiction that a CA can use?

I think that’s definitely true. There’s lots of ambiguities in the EV guidelines. You and I were talking about Incorporating Agencies, which is not really defined as incorporating agencies. Note that CAs can use Incorporating Agencies or Registration Agencies to confirm identity, which is very broad, but there is no indication in the certificate what that means.

> I can think of some incremental steps here:
> - Disclosing exact detailed procedures via CP/CPS

Maybe an addendum to the CPS. Or RPS. I’ll experiment and post something to see what the community thinks.

> - An emphasis should be on allowlisting. Anything not on the allowlist *should* be an exceptional thing.

This we actually have internally. Or are you saying across the industry? The allow list internally is something prevetted by compliance and legal. We’re currently (prompted by a certificate problem report) reviewing the entire allowed list to see what’s there and taking anything off that I don’t like. Basically we’re using your suggestion of https://www.gleif.org/en/about-lei/code-lists/gleif-registration-authorities-list plus a couple of lists for banking (like FDIC).

> - For example, stating DigiCert will always use a State from ISO 3166-2 makes it clear, and also makes it something verifiable (i.e. someone can implement an automated check)

Maybe what we’ll do is keep a running list of the checks. We’re finalizing on spelling out all states. No abbreviations. This is something we can specify in our RPS – how it looks for each field.

> - Similarly, enumerating the registries used makes it possible, in many cases, to automatically check the serialNumber for both format and accuracy

Checking the registration number for format and accuracy is something I proposed for the new project, but I wasn’t sure how feasible it was considering the wide variation. You end up with a lot of different numbers. I wonder if you could narrow it to a range of formats? That would certainly be doable while adding some layers of protection.

>- Modifying the CA/B Forum documents to formalize those processes, by explicitly removing the ambiguity or CA discretion. DigiCert's done well here in the past, removing validation methods like 3.2.2.4.1 / 3.2.2.4.5 due to their misuse and danger

One ballot I do want to pass is adding a field for the JOI entity information. This way everyone can see where the registration number originated. Short of a formalized CAB forum list of permitted entities (which is also on the table), this would make it very easy to have a conversation on what the registration number means. There are probably others, but that’s a request that’s been surfacing a few times.

> - Writing automated tooling to vet/validate
This is where we are going for sure.



* The nice part is that by formalizing the rules, you can benefit a lot from improved checking that the community may develop, and if it doesn't materialize, contribute your own to the benefit of the community.




A better example is some-state. We scanned for values not listed as states and cities that have “some”, “any”, “none”, etc. That only finds a limited set of the problem, and obviously missed the JOI information (not part of the same data set). Going forward, I want a rule set that says, is this a state? If so, then check this source to see if it’s a real state. Then check this to see if it also exists in the country specified. Then check to see if the locality specified exists in the state. Then see if there is a red flag from a map that says the org doesn’t exist. (The map check is coming – not there yet….) Instead of finding small one-off problems people report, find them on a global scale with a rule we run every time something in the CAB forum, Mozilla policy, or our own system changes.
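The chain of checks described above (real state, state in the specified country, locality in the state) could be sketched roughly as follows. This is only an illustration: the `SUBDIVISIONS` table is a tiny hand-written stand-in for a real reference source such as ISO 3166-2 plus a locality database, the function name is hypothetical, and the map check is left out because it is described as not existing yet.

```python
from typing import Dict, List, Set

# Stand-in reference data (an assumption): country -> state -> known localities.
SUBDIVISIONS: Dict[str, Dict[str, Set[str]]] = {
    "US": {
        "Delaware": {"Dover", "Wilmington"},
        "Utah": {"Lehi", "Salt Lake City"},
    },
    "DE": {
        "Bayern": {"München"},
    },
}


def check_location(country: str, state: str, locality: str) -> List[str]:
    """Return a list of problems; an empty list means the combination passed."""
    # Step 1: is this a real state anywhere in the reference data?
    all_states = {s for states in SUBDIVISIONS.values() for s in states}
    if state not in all_states:
        return [f"{state!r} is not a known state or province"]

    # Step 2: does the state exist in the country specified?
    states_in_country = SUBDIVISIONS.get(country, {})
    if state not in states_in_country:
        return [f"{state!r} does not exist in country {country!r}"]

    # Step 3: does the locality exist in that state?
    if locality not in states_in_country[state]:
        return [f"locality {locality!r} not found in {state!r}"]

    return []
```

Each step stops at the first failure, so a placeholder such as "Some-State" is reported once as an unknown state rather than also as a country and locality mismatch.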

>> Yes, this is the expectation of all CAs.

>> As I understand it, following CAs' remediation of Some-State, etc, this is exactly what members of the community went and did. This is not surprising, since one of the clearly identified best practices from that discussion was to look at ISO 3166-1/ISO 3166-2 for such information inconsistency. SecureTrust, one of the illustrative good reports, did exactly that, and that's why it's such a perfect example. It's unfortunate that a number of other CAs didn't, which is why on the incident reports, I've continued to push them in terms of their evaluation and disclosure.

>> This is the exact goal of Incident Reports: identifying not just the incidents, but the systemic issues, devising solutions that can work, and making sure to holistically remediate the problem.

Right, and we did this for the location on our some state issues (on all the data). But that was a one-time scan whose results we reported to compliance for review. It was a little script we wrote. What I want the system to do is scan for this particular change every time the validation system changes to make sure nothing contradicts this and invalidate all validations that break a rule.

Jeremy Rowley

Aug 23, 2019, 4:37:35 PM8/23/19
to Jakob Bohm, mozilla-dev-s...@lists.mozilla.org
>> 1. I believe the BRs and/or underlying technical standards are very
clear if the ST field should be a full name ("California") or an
abbreviation ("CA").

This is only true of the EV guidelines and only for Jurisdiction of Incorporation. There is no formatting requirement for place of business. I think requiring a format would help make the data more useful as you could consume it easier en masse.

>> 2. The fact that a country has subdivisions listed in the general ISO
standard for country codes doesn't mean that those are always part of
the jurisdiction of incorporation and/or address.

Right. For the EV Guidelines, what matters is the Jurisdiction of Registration or Jurisdiction of Incorporation as that is what is used to determine the Jurisdiction of Incorporation/Registration information, including what goes into the Registration Number Field.

Incorporating Agency is defined as: In the context of a Private Organization, the government agency in the Jurisdiction of
Incorporation under whose authority the legal existence of the entity is registered (e.g., the government agency that issues
certificates of formation or incorporation). In the context of a Government Entity, the entity that enacts law, regulations, or
decrees establishing the legal existence of Government Entities

Registration Agency: A Governmental Agency that registers business information in connection with an entity's business
formation or authorization to conduct business under a license, charter or other certification. A Registration Agency MAY
include, but is not limited to (i) a State Department of Corporations or a Secretary of State; (ii) a licensing agency, such as a
State Department of Insurance; or (iii) a chartering agency, such as a state office or department of financial regulation,
banking or finance, or a federal agency such as the Office of the Comptroller of the Currency or Office of Thrift
Supervision

This is broad. IMO we should reduce it to be the number listed on the certificate of formation/incorporation so there is consistency to what the registration means. We should also identify in the certificate the source of the registration number as it provides information to relying parties about the actual organization.

>> 3. The fact that a government data source lists the incorporation
locality of a company, doesn't mean that this locality detail is
actually a relevant part of the jurisdictionOfIncorporation. This
essentially depends if the rules in that country ensure uniqueness of
both the company number and company name at a higher jurisdiction
level (national or state) to the same degree as at the lower level.
For example, in the US the company name "Stripe" is not unique
nationwide.

Right - this depends on where the formation/registration occurs. That's captured in the EV guidelines.

Ryan Sleevi

Aug 23, 2019, 5:04:54 PM8/23/19
to Jeremy Rowley, ry...@sleevi.com, mozilla-dev-s...@lists.mozilla.org
On Fri, Aug 23, 2019 at 4:18 PM Jeremy Rowley <jeremy...@digicert.com>
wrote:

> > I can think of some incremental steps here:
>
> > - Disclosing exact detailed procedures via CP/CPS
>
>
>
> Maybe an addendum to the CPS. Or RPS. I’ll experiment and post something
> to see what the community thinks.
>

Yup. I've seen plenty of CP/CPSes that place extensive detail within
appendices of their CP/CPS. The important part of having this within the
CP/CPS is the (albeit limited) binding to the audit procedure. After all,
the objective of an audit is to ensure the CP/CPS is fairly stated with
respect to the actual operations practiced, across several dimensions, and
so having that allowlist clearly documented, versioned, and audited helps
provide a degree of assurance to RPs that simply placing in the RPS
wouldn't necessarily achieve.


> > - An emphasis should be on allowlisting. Anything not on the allowlist
> *should* be an exceptional thing.
>
>
>
> This we actually have internally. Or are you saying across the industry?
> The allow list internally is something prevetted by compliance and legal.
> We’re currently (prompted by a certificate problem report) reviewing the
> entire allowed list to see what’s there and taking anything off that I
> don’t like. Basically we’re using your suggestion of
> https://www.gleif.org/en/about-lei/code-lists/gleif-registration-authorities-list
> plus a couple of lists for banking (like FDIC).
>

Ideally, yes, I'd like to see a transition from ad-hoc interpretations into
formalized lists, whether it be through browser policy or through the
Baseline Requirements/EV Guidelines.

I'd love to see a CA take their existing list and propose it, which would
allow for discussion and rationale (e.g. if there are organizations that
should or should not be on that list), but would also help formalize it as
an industry. DigiCert documenting in its CP/CPS is a good first step.
Better would be DigiCert proposing a ballot, although I'm sure members of
the browser community would be happy to propose a ballot on the basis of
whomever publishes their list first.

Regardless, however, the mere act of publishing that list helps develop
tooling to externally vet compliance and consistency with those statements,
and that might amortize some of the internal tooling costs.

> - For example, stating DigiCert will always use a State from ISO 3166-2
> makes it clear, and also makes it something verifiable (i.e. someone can
> implement an
>
> automated check)
>
>
>
> Maybe what we’ll do is keep a running list of the checks. We’re finalizing
> on spelling out all states. No abbreviations. This is something we can
> specify in our RPS – how it looks for each field.
>

Why not CP/CPS? This is somewhat a canonical example of the purpose of a
Certificate Profile as specified within a Certificate Policy. Several CAs,
such as Sectigo and SwissSign, include within their CP extensive profiles
of all certificate types they issue. That's somewhat a model for other CAs
to consider and examine.

You can see Appendix C of Sectigo's CP/CPS, or 7.1 of SwissSign.


> > - Similarly, enumerating the registries used makes it possible, in many
> cases, to automatically check the serialNumber for both format and accuracy
>
>
>
> Checking the registration number for format and accuracy is something I
> proposed for the new project, but I wasn’t sure how feasible it was
> considering the wide variation. You end up with a lot of different numbers.
> I wonder if you could get it to range for formats? That would certainly be
> doable while adding some layers of protection.
>

I agree, it's going to vary based on the registry. However, enumerating the
registries is the first step to being able to enumerate the numbers and
formats. For example, one of the problem reports for a recent set of issues
went ahead and started an open-source GitHub repository to try and collect
some of that information, based on Registry -
https://github.com/bitcynth/company-number-formats/blob/master/formats.md -
so we certainly know it's possible.
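To make the "enumerate registries, then enumerate formats" step concrete, here is a minimal Python sketch of a per-registry format check on the serialNumber. The registry identifiers and patterns below are invented placeholders, not real formats and not taken from the repository linked above; a real table would be filled in from each registry's published numbering rules.

```python
import re
from typing import Dict, Optional, Pattern

# Hypothetical registry identifiers mapped to hypothetical number formats.
FORMATS: Dict[str, Pattern[str]] = {
    "EXAMPLE-REGISTRY-A": re.compile(r"\d{8}"),          # eight digits
    "EXAMPLE-REGISTRY-B": re.compile(r"[A-Z]{2}\d{6}"),  # two letters, six digits
}


def serial_number_format_ok(registry: str, serial_number: str) -> Optional[bool]:
    """Check a serialNumber against its registry's expected format.

    Returns True or False for known registries, and None when the registry
    is not enumerated, so that case can be escalated rather than silently passed.
    """
    pattern = FORMATS.get(registry)
    if pattern is None:
        return None
    return pattern.fullmatch(serial_number) is not None
```

A format check like this only catches malformed numbers; confirming that a well-formed number belongs to the named organization still needs a lookup against the registry itself.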


> >- Modifying the CA/B Forum documents to formalize those processes, by
> explicitly removing the ambiguity or CA discretion. DigiCert's done well
> here in the past, removing validation methods like 3.2.2.4.1 / 3.2.2.4.5
> due to their misuse and danger
>
>
>
> One ballot I do want to pass is adding a field for the JOI entity
> information. This way everyone can see where the registration number
> originated. Short of a formalized CAB forum list of permitted entities
> (which is also on the table), this would make it very easy to have a
> conversation on whether what the registration number means. There’s
> probably others, but that’s a request that’s been surfacing a few times.
>

I think I'd invert that priority list. I hear that some CA customers,
particularly in the banking sector, get exceptionally nervous for any
change to certificate profiles, even for something as simple and
uncontroversial as a validity period, so extending the EV Guidelines, which
is a much more marked change, seems like it might take years to be
functionally useful and effective. This is definitely a quintessential
example of where reduced validity periods would help improve the security
for relying parties much sooner, and please don't think this criticism is
opposition to the change - just to prioritizing it over an allowlist.

That's because changing the guidelines to an allowlist approach,
formalizing a list of permitted entities, would require no changes to the
certificate profile that relying parties encounter, while providing clear,
auditable, and reliable changes on a much quicker timescale.

Ryan Sleevi

Aug 23, 2019, 5:10:05 PM8/23/19
to Jeremy Rowley, Jakob Bohm, mozilla-dev-s...@lists.mozilla.org
It's less broad when you also include the additional (included by
reference) definition for Government Agency

Government Agency: In the context of a Private Organization, the government
agency in the Jurisdiction of Incorporation
under whose authority the legal existence of Private Organizations is
established (e.g., the government agency that issued
the Certificate of Incorporation). In the context of Business Entities, the
government agency in the jurisdiction of operation
that registers business entities. In the case of a Government Entity, the
entity that enacts law, regulations, or decrees
establishing the legal existence of Government Entities.

So, for example, a Private Organization who registers with multiple
entities frequently will only obtain the Certificate of Incorporation from
a single one of those entities, reducing some of the ambiguity and
confusion.

That said, I agree, we should better clarify the expectations by moving to
a per-Jurisdiction allowlist of such organizations. This would also allow
creating and assigning codes to those entities, allowing us to supplant or
replace the existing Jurisdiction information with that entity code, which
would then unambiguously identify those attributes.
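A per-jurisdiction allowlist with entity codes could look something like the Python sketch below. The codes ("EX-US-DE", "EX-DE-HR") and the data structure are purely hypothetical; no such code scheme exists in the EV Guidelines today. The two entries reuse the examples from the start of this thread: the Delaware Secretary of State, and the consolidated German register at www.handelsregister.de, where the state cannot be derived from the source alone.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RegistryEntry:
    name: str
    country: str
    # None means the registry covers several states, so the state has to be
    # established by a search rather than derived from the registry itself.
    state: Optional[str]


# Hypothetical entity codes mapped to allowlisted registries.
ALLOWLIST: Dict[str, RegistryEntry] = {
    "EX-US-DE": RegistryEntry("Delaware Secretary of State", "US", "Delaware"),
    "EX-DE-HR": RegistryEntry("www.handelsregister.de", "DE", None),
}


def jurisdiction_for(entity_code: str) -> Dict[str, Optional[str]]:
    """Derive jurisdiction fields from an entity code instead of user input."""
    entry = ALLOWLIST.get(entity_code)
    if entry is None:
        # Anything not on the allowlist is treated as an exceptional case.
        raise KeyError(f"{entity_code!r} is not an allowlisted registry")
    return {"country": entry.country, "state": entry.state}
```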

Jakob Bohm

Aug 23, 2019, 6:45:01 PM8/23/19
to mozilla-dev-s...@lists.mozilla.org
[Please note that the way MS Outlook marks quoted text doesn't work well
with Mozilla mail programs].

On 23/08/2019 22:37, Jeremy Rowley wrote:
>> 1. I believe the BRs and/or underlying technical standards are very
>> clear if the ST field should be a full name ("California") or an
>> abbreviation ("CA").
>
> This is only true of the EV guidelines and only for Jurisdiction of
> Incorporation. There is no formatting requirement for place of business.
> I think requiring a format would help make the data more useful as you
> could consume it easier en masse.
>
X.520 (10/2012) says this:

6.3.3 State or Province Name

The State or Province Name attribute type specifies a state or province.
When used as a component of a directory name, it identifies a geographical
subdivision in which the named object is physically located or with which
it is associated in some other important way.

An attribute value for State or Province Name is a string, e.g., S = "Ohio".

stateOrProvinceName ATTRIBUTE ::= {
SUBTYPE OF name
WITH SYNTAX UnboundedDirectoryString
LDAP-SYNTAX directoryString.&id
LDAP-NAME {"st"}
ID id-at-stateOrProvinceName }

The Collective State or Province Name attribute type specifies a state or
province name for a collection of entries.

collectiveStateOrProvinceName ATTRIBUTE ::= {
SUBTYPE OF stateOrProvinceName
COLLECTIVE TRUE
ID id-at-collectiveStateOrProvinceName }

[End of X.520 section 6.3.3]

For the location (L and street attributes), X.520 is quite vague, but
the remarkably similar "postalAddress" attribute is defined in terms
of the F.401 specification.


>> 2. The fact that a country has subdivisions listed in the general ISO
>> standard for country codes doesn't mean that those are always part of
>> the jurisdiction of incorporation and/or address.
>
> Right. For the EV Guidelines, what matters is the Jurisdiction of
> Registration or Jurisdiction of Incorporation as that is what is used
> to determine the Jurisdiction of Incorporation/Registration information,
> including what goes into the Registration Number Field.

As I mentioned, these are issues seen with other CAs blindly importing
ISO 3166-2 into their systems. For example, one CA recently insisted
that we fill the ST field with the equivalent of a county, because
there was a political desire to eliminate having elected officials at
the equivalent of state level, so someone in government probably went
ahead and submitted an update to 3166-2 presuming success of that
effort.

>
> Incorporating Agency is defined as: In the context of a Private
> Organization, the government agency in the Jurisdiction of
> Incorporation under whose authority the legal existence of the entity
> is registered (e.g., the government agency that issues certificates
> of formation or incorporation). In the context of a Government Entity,
> the entity that enacts law, regulations, or decrees establishing the
> legal existence of Government Entities
>
> Registration Agency: A Governmental Agency that registers business
> information in connection with an entity's business formation or
> authorization to conduct business under a license, charter or other
> certification. A Registration Agency MAY include, but is not limited
> to (i) a State Department of Corporations or a Secretary of State;
> (ii) a licensing agency, such as a State Department of Insurance; or
> (iii) a chartering agency, such as a state office or department of
> financial regulation, banking or finance, or a federal agency such
> as the Office of the Comptroller of the Currency or Office of Thrift
> Supervision
>
> This is broad. IMO we should reduce it to be the number listed on the
> certificate of formation/incorporation so there is consistency to what
> the registration means. We should also identify in the certificate the
> source of the registration number as it provides information to relying
> parties about the actual organization.

For most of the non-default numbering sources, the addition made in EVG
1.7.0 appears to provide this. Ideally, this should leave us with
exactly one number-authority for each jurisdiction, org type and number
format, subject of course to random changes in local legislation and/or
government practice.

For my example of C=DK, the numbering system for government entities has
changed multiple times in recent decades. In the 1970s there were only
some tiny numbering systems such as 3 digit county numbers found in some
obscure government records. In the early 2000s it was decreed that all
billing of government customers at all levels should use an XML format
that identified each sub-entity by an EAN number (as in the 13 digit
number system for product barcodes!), which was subsequently changed to
many of the larger entities instead getting numbers from the companies
registry (currently up to 8 digits, with older registrants having
shorter numbers). However there is an online database for mapping
numbers in both systems to entity names (but not the other way!), and
of course the full searchability of the companies database.


>
>> 3. The fact that a government data source lists the incorporation
>> locality of a company, doesn't mean that this locality detail is
>> actually a relevant part of the jurisdictionOfIncorporation. This
>> essentially depends if the rules in that country ensure uniqueness of
>> both the company number and company name at a higher jurisdiction
>> level (national or state) to the same degree as at the lower level.
>> For example, in the US the company name "Stripe" is not unique
>> nationwide.
>
> Right - this depends on where the formation/registration occurs. That's
> captured in the EV guidelines.
>

Unfortunately, there is no consistent mapping between the general words
of the EVG and the variable practice of various governments.

Again for C=DK, there is an old tradition that incorporation paperwork
states the county of incorporation, even though for many decades now the
registration is actually done in country level computer systems, that
capture the text of that paperwork. Thus someone reading the wording of
company bylaws, would assume all companies are registered and incorporated
at the county level, because the bylaws will usually not even mention the
country (or the registration number, as the initial bylaws must be
submitted to get a number).