New IllegalStateException on startup (KmsKeyring)


Hans Ridder

Nov 14, 2017, 4:08:38 PM
to nomulus-discuss
I think a recent commit is causing an IllegalStateException (stack trace below) to be thrown from KmsKeyring.getDecryptedData() during FrontendServlet.init(), thus causing the service to fail to start. I think this means steps in the Install Guide are, once again, incomplete and the code now requires a valid Keyring configuration to successfully complete the deployment validation ("You need permission..."). I seem to recall that before (~30 days ago) it would log the exception, probably asynchronously, but the service would start and serve requests. I'd have to roll back and redeploy to be sure.

I'm going to try to configure the Keyring, or get a dummy one going to see if that works... Please let me know if I'm missing something obvious.

-h
java.lang.IllegalStateException: Requested secret 'json-credential-string' does not exist.
	at com.google.common.base.Preconditions.checkState(Preconditions.java:534)
	at google.registry.keyring.kms.KmsKeyring.getDecryptedData(KmsKeyring.java:190)
	at google.registry.keyring.kms.KmsKeyring.getString(KmsKeyring.java:163)
	at google.registry.keyring.kms.KmsKeyring.getJsonCredential(KmsKeyring.java:150)
	at google.registry.keyring.api.KeyModule.provideJsonCredential(KeyModule.java:121)
	at google.registry.keyring.api.KeyModule_ProvideJsonCredentialFactory.get(KeyModule_ProvideJsonCredentialFactory.java:23)
	at google.registry.keyring.api.KeyModule_ProvideJsonCredentialFactory.get(KeyModule_ProvideJsonCredentialFactory.java:8)
	at google.registry.request.Modules_GoogleCredentialModule_ProvideGoogleCredentialFactory.get(Modules_GoogleCredentialModule_ProvideGoogleCredentialFactory.java:41)
	at google.registry.request.Modules_GoogleCredentialModule_ProvideGoogleCredentialFactory.get(Modules_GoogleCredentialModule_ProvideGoogleCredentialFactory.java:11)
	at dagger.internal.DoubleCheck.get(DoubleCheck.java:47)
	at google.registry.request.Modules$GoogleCredentialModule.lambda$provideScopedGoogleCredential$0(Modules.java:220)
	at google.registry.monitoring.whitebox.StackdriverModule.provideMonitoring(StackdriverModule.java:50)
	at google.registry.monitoring.whitebox.StackdriverModule_ProvideMonitoringFactory.get(StackdriverModule_ProvideMonitoringFactory.java:45)
	at google.registry.monitoring.whitebox.StackdriverModule_ProvideMonitoringFactory.get(StackdriverModule_ProvideMonitoringFactory.java:14)
	at google.registry.monitoring.whitebox.StackdriverModule_ProvideMetricWriterFactory.get(StackdriverModule_ProvideMetricWriterFactory.java:48)
	at google.registry.monitoring.whitebox.StackdriverModule_ProvideMetricWriterFactory.get(StackdriverModule_ProvideMetricWriterFactory.java:11)
	at google.registry.monitoring.whitebox.StackdriverModule_ProvideMetricReporterFactory.get(StackdriverModule_ProvideMetricReporterFactory.java:33)
	at google.registry.monitoring.whitebox.StackdriverModule_ProvideMetricReporterFactory.get(StackdriverModule_ProvideMetricReporterFactory.java:11)
	at dagger.internal.DoubleCheck.get(DoubleCheck.java:47)
	at google.registry.module.frontend.FrontendServlet.init(FrontendServlet.java:43)
	at javax.servlet.GenericServlet.init(GenericServlet.java:244)
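
For readers unfamiliar with the failing frame: the trace shows a precondition check on a secret lookup. The following is only a rough sketch of that pattern, not the actual KmsKeyring source. The class `SketchKeyring` and its map-backed storage are assumptions made for illustration, and it uses a plain `if`/`throw` instead of Guava's `Preconditions.checkState`. Only the exception message format is taken from the trace above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a keyring; NOT the real KmsKeyring implementation. */
final class SketchKeyring {
  private final Map<String, String> secrets = new HashMap<>();

  /** Stores a secret under the given name (illustrative only). */
  void put(String name, String value) {
    secrets.put(name, value);
  }

  /** Returns the secret, or throws IllegalStateException if it was never configured. */
  String getString(String name) {
    String value = secrets.get(name);
    if (value == null) {
      // Same message format as the exception in the stack trace.
      throw new IllegalStateException(
          String.format("Requested secret '%s' does not exist.", name));
    }
    return value;
  }

  public static void main(String[] args) {
    SketchKeyring keyring = new SketchKeyring();
    try {
      keyring.getString("json-credential-string");
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

An unconfigured keyring therefore fails on the first lookup, wherever in the call chain that lookup happens to sit.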


Lai Jiang

Nov 14, 2017, 4:38:49 PM
to hans....@gmail.com, nomulus-discuss
This is due to https://github.com/google/nomulus/commit/c702b4486c9e2d7796f00dcd83b6e4cb230a103a#diff-6fb8b95bf357f0a2e1d704e7adc73e70

We need a GoogleCredential to initialize the monitoring client, not an AppIdentityCredential, which is what is provided when an HttpRequestInitializer is requested. The reason is that we want the monitoring client to run in a standard Java thread, not an App Engine-managed thread, so that it can be set to daemon mode and does not keep an idle instance from being killed. When running in a non-App Engine thread, no App Engine APIs can be called, so AppIdentityCredential (which uses App Engine APIs) cannot be used. Hence the GoogleCredential stored encrypted under the "json-credential-string" key is provided instead.
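
To illustrate the daemon-thread point, here is a minimal sketch using only the JDK. The names `DaemonReporterSketch`, `newDaemonExecutor` and `runsOnDaemonThread` are made up for this example and do not come from the Nomulus code. It only shows that a plain Java thread can be marked as a daemon so that it does not keep the JVM alive, which is the property wanted for the metric reporter.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

final class DaemonReporterSketch {

  /** Builds a single-threaded scheduler whose worker is a plain daemon thread. */
  static ScheduledExecutorService newDaemonExecutor() {
    ThreadFactory factory =
        runnable -> {
          Thread thread = new Thread(runnable, "metric-reporter-sketch");
          thread.setDaemon(true); // Must be set before the thread is started.
          return thread;
        };
    return Executors.newSingleThreadScheduledExecutor(factory);
  }

  /** Runs one task on the executor and reports whether it ran on a daemon thread. */
  static boolean runsOnDaemonThread(ScheduledExecutorService executor) {
    boolean[] isDaemon = new boolean[1];
    CountDownLatch done = new CountDownLatch(1);
    executor.execute(
        () -> {
          isDaemon[0] = Thread.currentThread().isDaemon();
          done.countDown();
        });
    try {
      done.await(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return isDaemon[0];
  }

  public static void main(String[] args) throws InterruptedException {
    ScheduledExecutorService executor = newDaemonExecutor();
    // Stand-in for a periodic metrics flush.
    executor.scheduleAtFixedRate(
        () -> System.out.println("flushing metrics"), 0, 100, TimeUnit.MILLISECONDS);
    Thread.sleep(250);
    // No shutdown() call: the JVM still exits because the worker is a daemon.
  }
}
```

With a non-daemon worker, the same `main` would never exit without an explicit shutdown.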

Unfortunately I don't see a way to work around configuring a keyring in this case. GoogleCredential.getApplicationDefault() will also give you an AppIdentityCredential when running in App Engine:

https://github.com/google/google-api-java-client/blob/4fc8c099d9db5646770868cc1bc9a33c9225b3c7/google-api-client/src/main/java/com/google/api/client/googleapis/auth/oauth2/DefaultCredentialProvider.java#L57

 
So you have to explicitly provide a .json file to construct the credential, be it from KMS or locally.

Lai Jiang | Software Engineer | jian...@google.com | 212-565-6361



--
NOTE: This is a public discussion list for the Nomulus domain registry project.
---
You received this message because you are subscribed to the Google Groups "nomulus-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to nomulus-discu...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/nomulus-discuss/dac4929b-1e12-411b-ba0b-7a7061f45006%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Hans Ridder

Nov 14, 2017, 6:57:15 PM
to nomulus-discuss


On Tuesday, November 14, 2017 at 1:38:49 PM UTC-8, Lai Jiang wrote:
This is due to https://github.com/google/nomulus/commit/c702b4486c9e2d7796f00dcd83b6e4cb230a103a#diff-6fb8b95bf357f0a2e1d704e7adc73e70

We need a GoogleCredential to initialize the monitoring client, not an AppIdentityCredential, which is what is provided when an HttpRequestInitializer is requested. The reason is that we want the monitoring client to run in a standard Java thread, not an App Engine-managed thread, so that it can be set to daemon mode and does not keep an idle instance from being killed. When running in a non-App Engine thread, no App Engine APIs can be called, so AppIdentityCredential (which uses App Engine APIs) cannot be used. Hence the GoogleCredential stored encrypted under the "json-credential-string" key is provided instead.

Are you sure? From a quick look at the diff, I don't see how this change explains the current problem. The code seemed to use GoogleCredential before and after the commit. So it should have been failing all along. Maybe I missed something or don't understand?

Unfortunately I don't see a way to work around configuring a keyring in this case.

Then you agree that the commit above (or another) broke the install instructions?

Also, the section on the Keyring in the Configuration document suggests that the DummyKeyringModule is the default, which is apparently wrong. I also haven't found any documentation on setting up the KmsKeyring.

These are the kinds of things that make it hard to participate in open source development... :-(
 
GoogleCredential.getApplicationDefault() will also give you an AppIdentityCredential when running in App Engine:

https://github.com/google/google-api-java-client/blob/4fc8c099d9db5646770868cc1bc9a33c9225b3c7/google-api-client/src/main/java/com/google/api/client/googleapis/auth/oauth2/DefaultCredentialProvider.java#L57

 
So you have to explicitly provide a .json file to construct the credential, be it from KMS or locally.

Okay. I'll give this a try. Thanks.
 


-h

Lai Jiang

Nov 15, 2017, 10:35:49 AM
to Hans Ridder, nomulus-discuss
The specific change is in StackdriverModule.java, where instead of providing a Function<Set<String>, ? extends HttpRequestInitializer>, we are now providing a Function<Set<String>, GoogleCredential> as the credential. Both bindings are provided from Modules.java. We almost always install UseAppIdentityCredentialForGoogleApisModule in components that need an HttpRequestInitializer, resulting in an AppIdentityCredential being provided.

From the stacktrace you provided, it does seem like KmsKeyring is the default, so the document indeed needs to be updated. To set up KmsKeyring, run nomulus -e ENV update_kms_keyring.

We have CI running to make sure that our FOSS version builds and passes all unit tests, but things like this are hard to catch because they only happen when you actually deploy and run a FOSS build on App Engine. This missing key is requested from the root component, which results in an immediate exception as soon as the app starts running. If it were in a subcomponent, it would not throw until a request came in. I should note that since KmsKeyring appears to be the default, an unconfigured KMS keyring will result in exceptions when other secrets are requested during request handling. So even without this commit, a freshly deployed Nomulus would not actually handle requests that require any keys. It just would not throw right away when the app starts.
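
The root-versus-subcomponent timing can be sketched without Dagger. In this stand-in, all class and method names are hypothetical, and `Supplier` merely plays the role of a generated provider. A dependency resolved during `init()` fails at startup, while the same dependency resolved inside a request handler fails only when a request arrives.

```java
import java.util.function.Supplier;

final class InjectionTimingSketch {

  /** Plays the role of a provider whose secret was never configured. */
  static final Supplier<String> MISSING_SECRET =
      () -> {
        throw new IllegalStateException("Requested secret does not exist.");
      };

  /** "Root component" style: the dependency is resolved while the servlet initializes. */
  static final class EagerServlet {
    private String credential;

    void init(Supplier<String> provider) {
      credential = provider.get(); // Throws here, so startup fails.
    }
  }

  /** "Subcomponent" style: the dependency is resolved per request. */
  static final class LazyServlet {
    private Supplier<String> provider;

    void init(Supplier<String> provider) {
      this.provider = provider; // Nothing is resolved yet, so startup succeeds.
    }

    String handleRequest() {
      return provider.get(); // Throws only when a request needs the secret.
    }
  }

  /** Returns true if the action throws IllegalStateException. */
  static boolean fails(Runnable action) {
    try {
      action.run();
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(
        "eager init fails: " + fails(() -> new EagerServlet().init(MISSING_SECRET)));
    LazyServlet lazy = new LazyServlet();
    System.out.println("lazy init fails: " + fails(() -> lazy.init(MISSING_SECRET)));
    System.out.println("lazy request fails: " + fails(lazy::handleRequest));
  }
}
```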

This unfortunately is a chicken-and-egg problem: without open source users we wouldn't catch these issues, but these issues make it hard for open source users to try Nomulus out... :-(

I'll put an item on our issue tracker to update the documentation. Feel free to follow up if you still have problems. Or, if you feel like it, send us a PR with updated documentation that works for you :-P

Lai Jiang | Software Engineer | jian...@google.com | 212-565-6361



Hans Ridder

Nov 15, 2017, 1:16:17 PM
to nomulus-discuss
On Wednesday, November 15, 2017 at 7:35:49 AM UTC-8, Lai Jiang wrote:
The specific change is in StackdriverModule.java, where instead of providing a Function<Set<String>, ? extends HttpRequestInitializer>, we are now providing a Function<Set<String>, GoogleCredential> as the credential. Both bindings are provided from Modules.java. We almost always install UseAppIdentityCredentialForGoogleApisModule in components that need an HttpRequestInitializer, resulting in an AppIdentityCredential being provided.

I saw that, but I don't understand how that change started making the service die on startup with IllegalStateException.
 
From the stacktrace you provided, it does seem like KmsKeyring is the default, so the document indeed needs to be updated. To set up KmsKeyring, run nomulus -e ENV update_kms_keyring.

Okay, thanks for the tip. I'm trying with the DummyKeyringModule for the moment. If that takes too much time then I'll invest in the KmsKeyring. I was hoping to keep it simple. I think adding "just one more" GC service to enable and configure makes it that much harder to simply get the registry running.
 
We have CI running to make sure that our FOSS version builds and passes all unit tests, but things like this are hard to catch because they only happen when you actually deploy and run a FOSS build on App Engine.

Sure. Unless someone invests in scripting the whole environment setup so a test can be written to really start from scratch. Even without that, some of those scripts would be really helpful for initial setup. For example, a script to create all the required GS buckets... (I've started that.)
 
This missing key is requested from the root component, which results in an immediate exception as soon as the app starts running. If it were in a subcomponent, it would not throw until a request came in. I should note that since KmsKeyring appears to be the default, an unconfigured KMS keyring will result in exceptions when other secrets are requested during request handling. So even without this commit, a freshly deployed Nomulus would not actually handle requests that require any keys. It just would not throw right away when the app starts.

Agreed. I'd have to take a closer look, but I thought most (all?) of the keys were things like passwords for third party services such as escrow providers. A new user is unlikely to have things like an escrow provider. Things like that should somehow be optional until the user is ready to enable them. Once they're enabled, then it makes sense for those things to throw on startup.

I understand the keys will eventually be needed, but I think the idea in the Install document was to get a basic system deployed, and then fully configure it later. This gives new users a good experience that reflects well on the project.

This unfortunately is a chicken-and-egg problem: without open source users we wouldn't catch these issues, but these issues make it hard for open source users to try Nomulus out... :-(

It's an ongoing challenge. Some things are harder to catch, and perhaps not worth trying. Another thing we can do is automate more of the installation so there aren't so many manual steps.
 
I'll put an item on our issue tracker to update the documentation. Feel free to follow up if you still have problems. Or, if you feel like it, send us a PR with updated documentation that works for you :-P

Thanks Lai.

-h

Lai Jiang

Nov 15, 2017, 1:51:24 PM
to Hans Ridder, nomulus-discuss
In this case, json-credential-string provides the private key of a user-created service account that the monitoring service will use. Since the monitoring service starts along with the servlet itself, it becomes mandatory and causes the program to die immediately.

We'll see if we can make this optional or disabled by default, so that it does not throw on startup. A startup script is a great idea! If you can contribute it back when you're ready, that'll be greatly appreciated.

Thanks!

Lai Jiang | Software Engineer | jian...@google.com | 212-565-6361



Hans Ridder

Nov 15, 2017, 4:30:11 PM
to nomulus-discuss
On Wednesday, November 15, 2017 at 10:51:24 AM UTC-8, Lai Jiang wrote:
In this case, json-credential-string provides the private key of a user-created service account that the monitoring service will use. Since the monitoring service starts along with the servlet itself, it becomes mandatory and causes the program to die immediately.

Just a wild guess (that I mentioned in my first message), but I think before the Java 8 changes, perhaps the thread factory change, the loading of the credential happened asynchronously to the Servlet init() method. So the exception was thrown and logged, but didn't cause startup to fail. I remember seeing an exception about the Stackdriver stuff, but it wasn't fatal so I ignored it.
 

-h

Lai Jiang

Nov 15, 2017, 5:01:37 PM
to Hans Ridder, nomulus-discuss
Before the commit, the credential used was an AppIdentityCredential, which retrieves the application default credential (default service account) for the running app. This credential always exists, so you would not get an exception when the credential was requested while constructing a MetricReporter. But since the service account is not configured with access to the Stackdriver Monitoring API, you got a non-fatal error when the metric reporter tried to flush metrics to Stackdriver.

Now that we request a GoogleCredential, it fails at the injection site because KMS is not set up. Since the injection happens in Servlet#init, the app dies on startup.

You are partially right that the problem manifested itself later in the execution path before the change, but not because the credential was loaded asynchronously. Rather, before the change the exception was thrown when the credential was used; now it is thrown when the credential is constructed.
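
The difference between failing on use and failing on construction can be sketched as follows. This is a simplified model, not Nomulus code: the `Credential` interface, both factory methods and the event strings are invented for the example, and the exception types are placeholders for whatever the real clients throw.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class CredentialFailureSketch {

  /** Minimal stand-in for a credential that authorizes a metrics write. */
  interface Credential {
    void authorizeWrite();
  }

  /** Always constructible, but lacks permission, so it fails only when used. */
  static Credential defaultServiceAccount() {
    return () -> {
      throw new SecurityException("service account lacks monitoring access");
    };
  }

  /** Fails during construction because the backing secret is missing. */
  static Credential fromKeyring() {
    throw new IllegalStateException(
        "Requested secret 'json-credential-string' does not exist.");
  }

  /** Simulates startup followed by one metrics flush and records what happened. */
  static List<String> startAndFlush(Supplier<Credential> credentialProvider) {
    List<String> events = new ArrayList<>();
    Credential credential;
    try {
      credential = credentialProvider.get(); // Happens during init().
    } catch (IllegalStateException e) {
      events.add("startup failed");
      return events;
    }
    events.add("started");
    try {
      credential.authorizeWrite(); // Happens later, in the reporter's flush.
      events.add("flush ok");
    } catch (SecurityException e) {
      events.add("flush failed (logged, non-fatal)");
    }
    return events;
  }

  public static void main(String[] args) {
    System.out.println(startAndFlush(CredentialFailureSketch::defaultServiceAccount));
    System.out.println(startAndFlush(CredentialFailureSketch::fromKeyring));
  }
}
```

In the first case the app starts and only the flush fails; in the second the failure happens before startup completes.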

Hope this explanation helps.

Lai Jiang | Software Engineer | jian...@google.com | 212-565-6361



Hans Ridder

Nov 15, 2017, 5:12:03 PM
to nomulus-discuss


On Wednesday, November 15, 2017 at 2:01:37 PM UTC-8, Lai Jiang wrote:
Before the commit, the credential used was an AppIdentityCredential, which retrieves the application default credential (default service account) for the running app. This credential always exists, so you would not get an exception when the credential was requested while constructing a MetricReporter. But since the service account is not configured with access to the Stackdriver Monitoring API, you got a non-fatal error when the metric reporter tried to flush metrics to Stackdriver.

Now that we request a GoogleCredential, it fails at the injection site because KMS is not set up. Since the injection happens in Servlet#init, the app dies on startup.

You are partially right that the problem manifested itself later in the execution path before the change, but not because the credential was loaded asynchronously. Rather, before the change the exception was thrown when the credential was used; now it is thrown when the credential is constructed.

Okay, that helps. Thanks!

Hans Ridder

Nov 15, 2017, 10:05:30 PM
to nomulus-discuss
On Wednesday, November 15, 2017 at 10:16:17 AM UTC-8, Hans Ridder wrote:
I'm trying with the DummyKeyringModule for the moment. If that takes too much time then I'll invest in the KmsKeyring. I was hoping to keep it simple. I think adding "just one more" GC service to enable and configure makes it that much harder to simply get the registry running.

The DummyKeyringModule/InMemoryKeyring appear to be broken, in two ways... so far. There may be more. First, the PGP private keyring won't load:
com.google.common.base.VerifyException: checksum mismatch at 0 of 20
	at google.registry.keyring.api.PgpHelper.lookupKeyPair(PgpHelper.java:120)
	at google.registry.keyring.api.DummyKeyringModule.provideKeyring(DummyKeyringModule.java:71)
...


A little research suggests this is because the keyring has a password, and the code assumes there's no password. I don't know if it ever worked, but there are no unit tests, so who knows.

I replaced the keyrings and got past that. Then it fails because the "JSON credential" isn't really JSON. (You probably saw this coming, I'm a bit slow.)

java.lang.RuntimeException: com.fasterxml.jackson.core.JsonParseException: Unrecognized token 'not': was expecting 'null', 'true', 'false' or NaN
 at [Source: java.io.ByteArrayInputStream@5218464b; line: 1, column: 5]
	at google.registry.request.Modules$GoogleCredentialModule.provideGoogleCredential(Modules.java:213)
	at google.registry.request.Modules_GoogleCredentialModule_ProvideGoogleCredentialFactory.get(Modules_GoogleCredentialModule_ProvideGoogleCredentialFactory.java:38)
	at google.registry.request.Modules_GoogleCredentialModule_ProvideGoogleCredentialFactory.get(Modules_GoogleCredentialModule_ProvideGoogleCredentialFactory.java:11)
	at dagger.internal.DoubleCheck.get(DoubleCheck.java:47)
	at google.registry.request.Modules$GoogleCredentialModule.lambda$provideScopedGoogleCredential$0(Modules.java:220)
	at google.registry.monitoring.whitebox.StackdriverModule.provideMonitoring(StackdriverModule.java:50)

So, I think I'd like to know what the consensus is over there about keeping DummyKeyringModule working. I think it would be good if there were an "easy way" to get the registry running, so either an easy way to get KMS configured, or a fix for the DummyKeyringModule et al.

Let me know what you folks think and I'll move forward. In the meantime I'll take a look at KmsKeyring.

-h

Lai Jiang

Nov 15, 2017, 10:47:04 PM
to Hans Ridder, nomulus-discuss
Hmmm, interesting. I actually just made a change to install DummyKeyringModule in the FOSS build by default (it's not pushed to GitHub yet). As far as I can see, we were not using DummyKeyringModule anywhere before. I was trusting that it works, but maybe not? I'll have to deploy a FOSS build to GAE and do some tests. Will circle back with findings.

And yes, I should have seen the JSON thing coming; of course the GoogleCredential builder cannot work magic and build a credential from a dummy string... I guess we need to put some more thought into how to make this work for an out-of-the-box FOSS build.


Lai Jiang

Dec 2, 2017, 12:44:24 PM
to Hans Ridder, nomulus-discuss
We did a push that partially solved this problem: the servlet no longer dies when using the default DummyKeyringModule.

Still need to investigate how to make DummyKeyringModule return a dummy keyring without throwing.


Hans Ridder

Dec 2, 2017, 7:14:30 PM
to nomulus-discuss
On Saturday, December 2, 2017 at 9:44:24 AM UTC-8, Lai Jiang wrote:
We did a push that partially solved this problem: the servlet no longer dies when using the default DummyKeyringModule.

Okay, thanks for the update. I'll give it a try when I get a chance. I've been busy with other things and haven't made progress on the KMS keyring, either.
 
Still need to investigate how to make DummyKeyringModule return a dummy keyring without throwing.

Is the need for the JSON credential simply a limitation of App Engine or Stackdriver (that might be temporary)? It feels like we should be able to use an existing service-account-type mechanism to authenticate with Stackdriver, but I haven't spent enough time looking at that. There are too many ways to authenticate things in GC, and I don't understand the limitations of each.

On a related note, I feel like there needs to be a discussion (or decision/statement) on what parts of the system are required and what parts are optional, so that the optional stuff wouldn't need to be configured and therefore its keys or credentials wouldn't need to exist. For example, if you're not doing RDE, you don't need any of those keys configured... Depending on what is really needed for the metrics/Stackdriver, we need to either have an easy way to set up the keyring, or avoid using it in basic configurations. Should we start a separate thread about this part?

-h

Lai Jiang

Dec 3, 2017, 3:56:17 PM
to nomulus-discuss
Maybe I wasn't very clear in what I meant. I was referring to the keyring password issue which makes DummyKeyringModule throw when requesting a Keyring from it. We still need to figure out how to make that work.

As for the valid JSON string question, it was a limitation of App Engine. Yes, it is true that we can request the default service account bound to the App Engine instance directly, but that returns an AppIdentityCredential instead of a generic GoogleCredential. The AppIdentityCredential uses an App Engine-specific HttpTransport, which cannot be called from a non-request thread that is not managed by App Engine. In the case of the metric reporter, we need it to be in such a thread, because if its thread is managed by App Engine, we cannot set it to daemon and it will prevent the instance from being killed when idle. Therefore we had to get the credential from the JSON file, or somewhere else.

Making part of the service optional is certainly appealing, but I don't think the system is designed to be that modular; i.e., if you run Nomulus, we assume you run the whole thing, because RDE and other things are required by ICANN. Certainly a lot of the non-core stuff, when not configured, would not make the core of Nomulus crash, but making it explicitly optional was not part of the design.

But this is just my personal understanding; you're welcome to start another thread to discuss it :-)

Hans Ridder

Dec 4, 2017, 5:09:08 PM
to nomulus-discuss


On Sunday, December 3, 2017 at 12:56:17 PM UTC-8, Lai Jiang wrote:
Maybe I wasn't very clear in what I meant. I was referring to the keyring password issue which makes DummyKeyringModule throw when requesting a Keyring from it. We still need to figure out how to make that work.

If you're talking about the "VerifyException: checksum mismatch at 0 of 20" while loading the PGP private keyring in the DummyKeyringModule (too many keyrings!) that I mentioned back on Nov 15, then I'm pretty sure the workaround is to replace the pgp-private-keyring.asc with one that does not have a password, as the code expects. I did this locally and it worked. I can put that into a PR if that helps. But that takes us to the need for a valid (parsable) JSON credential that I mentioned in that same message, or can we make it "optional" (see below)?
 
As for the valid JSON string question, it was a limitation of App Engine. Yes, it is true that we can request the default service account bound to the App Engine instance directly, but that returns an AppIdentityCredential instead of a generic GoogleCredential. The AppIdentityCredential uses an App Engine-specific HttpTransport, which cannot be called from a non-request thread that is not managed by App Engine. In the case of the metric reporter, we need it to be in such a thread, because if its thread is managed by App Engine, we cannot set it to daemon and it will prevent the instance from being killed when idle. Therefore we had to get the credential from the JSON file, or somewhere else.

Do you know if this is considered a bug by the AppEngine/Cloud people at Google?

Making part of the service optional is certainly appealing, but I don't think the system is designed to be that modular; i.e., if you run Nomulus, we assume you run the whole thing, because RDE and other things are required by ICANN.

Correct me if I'm wrong, but I believe that RDE is only required by gTLDs, or at least is not required to run a ccTLD. I believe ICANN reporting and maybe other things are similar.
 
Certainly a lot of the non-core stuff, when not configured, would not make the core of Nomulus crash, but making it explicitly optional was not part of the design.

Of course. But I hope that the intention for open sourcing Nomulus is to make it usable by other types of TLDs. I understand it wasn't built for that, but I'm hoping there is agreement that Nomulus would like to get there (with appropriate PRs).

And by "optional" I only mean, for example, if we don't configure a particular cron job, then any credentials or keys needed by that job don't need to exist, and the system will not fail to start or continuously log errors. The metrics stuff seems different from that, in that it's directly used by the core. If, for example, leaving out the JSON credential just logged a single warning saying that metrics logging was disabled, then we could say it's "optional". The danger there is that someone could deploy a new version, accidentally without a credential, logging would disappear, and they may never know. Metrics seem (are!) very important to running a real production service, so I don't really want to make it optional... But I want to make it easy to set up/develop/install.
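
One way to picture the "optional, with a single warning" behaviour described above is the following sketch. Everything in it is hypothetical (`OptionalMetricsSketch`, `Reporter`, `create`, the warning text); it is not how Nomulus behaves today, just an illustration of the idea and of why a silently disabled reporter is easy to miss.

```java
import java.util.Optional;
import java.util.logging.Logger;

final class OptionalMetricsSketch {
  private static final Logger logger =
      Logger.getLogger(OptionalMetricsSketch.class.getName());

  /** Hypothetical reporter abstraction. */
  interface Reporter {
    boolean isEnabled();
  }

  /** Picks a real or no-op reporter depending on whether a credential was configured. */
  static Reporter create(Optional<String> jsonCredential) {
    if (!jsonCredential.isPresent()) {
      // Logged once, at startup, so the gap is visible without failing the deploy.
      logger.warning("No JSON credential configured; metrics reporting is disabled.");
      return () -> false;
    }
    // A real implementation would build the monitoring client from the credential here.
    return () -> true;
  }

  public static void main(String[] args) {
    System.out.println(create(Optional.empty()).isEnabled());
    System.out.println(create(Optional.of("{\"type\": \"service_account\"}")).isEnabled());
  }
}
```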

Lai Jiang

Dec 4, 2017, 8:18:46 PM
to Hans Ridder, nomulus-discuss

If you're talking about the "VerifyException: checksum mismatch at 0 of 20" while loading the PGP private keyring in the DummyKeyringModule (too many keyrings!) that I mentioned back on Nov 15, then I'm pretty sure the workaround is to replace the pgp-private-keyring.asc with one that does not have a password, as the code expects. I did this locally and it worked. I can put that into a PR if that helps. But that takes us to the need for a valid (parsable) JSON credential that I mentioned in that same message, or can we make it "optional" (see below)?
 
Yes, that's what I meant. PRs are always welcome :-D

Do you know if this is considered a bug by the AppEngine/Cloud people at Google?

I don't think this is considered a bug. It is documented here. Using a non-App Engine-managed thread is an edge case to get around the no-daemon limitation. But I guess you could say that not being able to set the thread to daemon is a bug... The bottom line is that I don't think the behavior will be altered.

Correct me if I'm wrong, but I believe that RDE is only required by gTLDs, or at least is not required to run a ccTLD. I believe ICANN reporting and maybe other things are similar.
 
Yeah, you are right, a lot of the requirements are only for gTLDs, not ccTLDs.

Of course. But I hope that the intention for open sourcing Nomulus is to make it usable by other types of TLDs. I understand it wasn't built for that, but I'm hoping there is agreement that Nomulus would like to get there (with appropriate PRs).

Of course, like I said, PRs are always welcome. I just don't think the team at Google here will have the bandwidth to make Nomulus more modular as you suggested at the moment. We will not turn away any PRs that improve the system, that's for sure :-P.

And by "optional" I only mean, for example, if we don't configure a particular cron job, then any credentials or keys needed by that job don't need to exist, and the system will not fail to start or continuously log errors. The metrics stuff seems different from that, in that it's directly used by the core. If, for example, leaving out the JSON credential just logged a single warning saying that metrics logging was disabled, then we could say it's "optional". The danger there is that someone could deploy a new version, accidentally without a credential, logging would disappear, and they may never know. Metrics seem (are!) very important to running a real production service, so I don't really want to make it optional... But I want to make it easy to set up/develop/install.

My understanding of the system is that most of the stuff does not run automatically. If you don't configure a cron job, it will not run, and therefore will not ask for credentials. The only things that run continuously and routinely are metrics and data exporting (snapshotting, etc.). Of course the EPP/WHOIS services are running all the time, but they are request driven, not time scheduled. We cannot really make metrics optional at this point because, regardless of whether the metrics reporter is configured correctly, the metrics registry singleton is always present and metrics are always written into it (i.e., when we record metrics, we do not check whether the metric reporter is running, and we record metrics in a lot of places). They are just not exported if the reporter doesn't run. In fact, come to think of it, you'll run out of memory (in theory) if you don't run the reporter, because the metrics data is not purged from the metrics registry until it is exported. So with the latest commit you'll be able to run Nomulus out of the box, but not for an infinite amount of time, I guess.

We surely want to make Nomulus easier to adopt for open source users, but as I mentioned in an earlier reply, without open source users' feedback it is hard for us to understand what is making it hard for them. If you'd like to undertake the effort to identify optional components and make them off by default, we'd be happy to look at PRs. Or if you encounter other difficulties like this one, please don't hesitate to tell us and we'll try to identify and address the problem in a timely manner. I think the metrics reporter crash is more of a one-off issue than a broad pattern, because it is kicked off in the main thread. But again, I am not running the open source build, so I don't have the full experience. The same goes for documentation: we try our best to keep it up to date, but inevitably it gets stale. We are happy to assist you in cases where the documentation is inadequate or just wrong, or if you figure it out yourself, please kindly send us a PR :-)

Lai Jiang | Software Engineer | jian...@google.com | 212-565-6361



Lai Jiang

Dec 5, 2017, 11:13:31 AM
to nomulus-discuss
Correction: Having a non-functioning metric reporter would not make the metric registry blow up memory. The metric registry only stores the latest data point for a given metric and label set.
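
A small sketch of why "latest data point per metric and label set" keeps memory bounded even when nothing is exported. The class `LatestValueRegistrySketch`, its methods and the metric name `example_requests` are invented for illustration and are not the Nomulus metrics API; the point is only that repeated writes to the same key overwrite rather than accumulate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class LatestValueRegistrySketch {
  // Key: metric name followed by its label values. Value: the most recent data point.
  private final Map<List<String>, Long> latest = new ConcurrentHashMap<>();

  private static List<String> key(String metric, List<String> labels) {
    List<String> key = new ArrayList<>(labels.size() + 1);
    key.add(metric);
    key.addAll(labels);
    return Collections.unmodifiableList(key);
  }

  /** Records a data point, overwriting any previous value for the same metric and labels. */
  void record(String metric, List<String> labels, long value) {
    latest.put(key(metric, labels), value);
  }

  /** Returns the latest value, or null if nothing was recorded for this metric and labels. */
  Long get(String metric, List<String> labels) {
    return latest.get(key(metric, labels));
  }

  /** Number of distinct (metric, labels) combinations currently held. */
  int size() {
    return latest.size();
  }

  public static void main(String[] args) {
    LatestValueRegistrySketch registry = new LatestValueRegistrySketch();
    for (long i = 0; i < 1_000_000; i++) {
      registry.record("example_requests", Arrays.asList("create"), i);
    }
    // Memory use depends on the number of distinct keys, not on the number of writes.
    System.out.println(registry.size());
  }
}
```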