Puppet Version: 5.5.18 (but unrelated to the issue)
Puppet Server Version: 5.3.1/5.3.10
OS Name/Version: RHEL 7

This is an extension of PUP-10233 that I've continued to break down and make less ambiguous.

We have a large environment that makes use of the environment classes cache (Foreman integrated). When we update code, we hit the environment_classes API endpoint to refresh the environment classes cache:

https://puppet.com/docs/puppetserver/5.3/puppet-api/v3/environment_classes.html

Specifically, I see this spawn a "process" in the /status/v1/services/jruby-metrics?level=debug output on the puppet master. This can take "forever" (400 seconds or more), during which the cache doesn't return because it is still processing/building. I can check CPU, memory, and disk IOPS and see nothing constrained; the box is borderline idle for this entire time.

Example from the jruby-metrics endpoint, specifically under status -> experimental -> metrics -> borrowed_instances:

{
  "time": 1599830351410,
  "reason": {
    "request": {
      "uri": "/puppet/v3/environment_classes",
      "request-method": "get",
      "route-id": "puppet-v3-environment_classes-/*/"
    }
  },
  "duration-millis": 296094
},

Eventually, 300-400 seconds after it is called, it finishes and the environment_classes endpoint returns proper cached data.

Desired Behavior:
I realize generating a cache isn't instant, but 400 seconds seems obscenely long, particularly when I'm finding no resource bottleneck. I do have hundreds of modules and thousands of classes across each environment. Even so, my desired behavior is "60 seconds or less, even in extreme scenarios".

I'm not above this potentially being an issue with my config/setup, or even bad/inefficient classes in use on our end contributing here. My problem is knowing where or what to check. This call seems like a black box that takes 400 seconds but produces no output, logging, or progress.
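As an aside, rather than eyeballing the whole jruby-metrics payload, a small script can filter the borrowed instances down to the long-running ones. This is only a sketch assuming the JSON shape shown above; the threshold is arbitrary, and the key name follows the ticket's "borrowed_instances" spelling:

```python
import json

# Arbitrary threshold (ms) above which a borrowed JRuby instance is
# considered suspiciously long-lived -- tune to taste.
SLOW_MILLIS = 60_000

def slow_borrows(metrics, threshold_ms=SLOW_MILLIS):
    """Return (uri, duration-millis) pairs for long-running borrows.

    `metrics` is the dict found under status -> experimental -> metrics
    in the /status/v1/services/jruby-metrics?level=debug response.
    """
    hits = []
    for borrow in metrics.get("borrowed_instances", []):
        duration = borrow.get("duration-millis", 0)
        if duration >= threshold_ms:
            uri = borrow.get("reason", {}).get("request", {}).get("uri", "?")
            hits.append((uri, duration))
    return hits

# Example using the payload from this ticket:
sample = {
    "borrowed_instances": [
        {
            "time": 1599830351410,
            "reason": {
                "request": {
                    "uri": "/puppet/v3/environment_classes",
                    "request-method": "get",
                    "route-id": "puppet-v3-environment_classes-/*/",
                }
            },
            "duration-millis": 296094,
        }
    ]
}

print(slow_borrows(sample))
# -> [('/puppet/v3/environment_classes', 296094)]
```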
Questions:
- Is there any way to enable some type of debug logging on the environment_classes endpoint so I can see exactly what it is doing/processing, to possibly identify bottlenecks?
- Are there known JVM args that could possibly help here?
- Should I be doing this on a dedicated puppetmaster that isn't also serving catalog traffic?
- Would more IOPS help, even though I see all disks essentially idle?

I've googled this extensively and haven't found much there either.
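For context, the cache refresh we trigger is just an authenticated GET against the endpoint. A timed invocation looks roughly like the following sketch; the hostname, environment name, and certificate paths are placeholders, and the command is echoed rather than executed so it is safe to run anywhere:

```shell
#!/usr/bin/env sh
# Placeholders -- substitute your own master hostname and SSL file paths.
MASTER="puppetmaster.example.com"
ENVIRONMENT="production"
CERT_DIR="/etc/puppetlabs/puppet/ssl"

# Timed GET against the environment_classes endpoint, authenticated with a
# client certificate as described in the puppetserver API docs. curl's
# -w 'time_total' prints how long the request took end to end.
CMD="curl -w 'total: %{time_total}s\n' \
  --cert $CERT_DIR/certs/client.pem \
  --key $CERT_DIR/private_keys/client.pem \
  --cacert $CERT_DIR/certs/ca.pem \
  \"https://$MASTER:8140/puppet/v3/environment_classes?environment=$ENVIRONMENT\""

# Echo instead of executing, since the placeholders above are not real.
echo "$CMD"
```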