Puppet Version: 5.5.18 (but unrelated to the issue)
Puppet Server Version: 5.3.1/5.3.10
OS Name/Version: RHEL 7

This is an extension of PUP-10233 that I've continued to break down and make less ambiguous.

We have a large environment that makes use of the environment classes cache (Foreman integrated). When we update code, we hit the environment_classes API endpoint to refresh the environment classes cache:

https://puppet.com/docs/puppetserver/5.3/puppet-api/v3/environment_classes.html

Specifically, I see this spawn a "process" in the /status/v1/services/jruby-metrics?level=debug output on the puppet master. This can take "forever" (400 seconds or more), during which the cache doesn't return because it is still processing/building. I can check CPU, memory, and disk IOPS and see nothing constrained; the box is borderline idle for this entire time.

Example from the jruby-metrics endpoint, specifically under status -> experimental -> metrics -> borrowed_instances:

{
  "time": 1599830351410,
  "reason": {
    "request": {
      "uri": "/puppet/v3/environment_classes",
      "request-method": "get",
      "route-id": "puppet-v3-environment_classes-/*/"
    }
  },
  "duration-millis": 296094
},

Eventually, 300-400 seconds after it is called, it finishes and the environment_classes endpoint returns proper cached data.

Desired Behavior:
I realize generating a cache isn't instant, but 400 seconds seems obscenely long, particularly when I'm finding no resource bottleneck. I do have hundreds of modules and thousands of classes across each environment. Even so, my desired behavior is "60 seconds or less, even in extreme scenarios".

I'm not above this potentially being an issue with my config/setup, or even bad/inefficient classes in use on our end contributing here. My problem is knowing where or what to check. This call seems like a black box that takes 400 seconds but produces no output, logging, or progress.
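As an aside, rather than eyeballing the whole jruby-metrics payload, a small script can filter the borrowed instances down to the long-running ones. This is only a sketch assuming the JSON shape shown above; the threshold is arbitrary, and the key name follows the ticket's "borrowed_instances" spelling:

```python
import json

# Arbitrary threshold (ms) above which a borrowed JRuby instance is
# considered suspiciously long-lived -- tune to taste.
SLOW_MILLIS = 60_000

def slow_borrows(metrics, threshold_ms=SLOW_MILLIS):
    """Return (uri, duration-millis) pairs for long-running borrows.

    `metrics` is the dict found under status -> experimental -> metrics
    in the /status/v1/services/jruby-metrics?level=debug response.
    """
    hits = []
    for borrow in metrics.get("borrowed_instances", []):
        duration = borrow.get("duration-millis", 0)
        if duration >= threshold_ms:
            uri = borrow.get("reason", {}).get("request", {}).get("uri", "?")
            hits.append((uri, duration))
    return hits

# Example using the payload from this ticket:
sample = {
    "borrowed_instances": [
        {
            "time": 1599830351410,
            "reason": {
                "request": {
                    "uri": "/puppet/v3/environment_classes",
                    "request-method": "get",
                    "route-id": "puppet-v3-environment_classes-/*/",
                }
            },
            "duration-millis": 296094,
        }
    ]
}

print(slow_borrows(sample))
# -> [('/puppet/v3/environment_classes', 296094)]
```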
Questions:
- Is there any way to enable some type of debug logging on the environment_classes endpoint so I can see exactly what it is doing/processing, to possibly identify bottlenecks?
- Are there known JVM args that could possibly help here?
- Should I be doing this on a dedicated puppetmaster that isn't also serving catalog traffic?
- Would more IOPS help, even though I see all disks essentially idle?

I've googled this extensively and haven't found much there either.
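For context, the cache refresh we trigger is just an authenticated GET against the endpoint. A timed invocation looks roughly like the following sketch; the hostname, environment name, and certificate paths are placeholders, and the command is echoed rather than executed so it is safe to run anywhere:

```shell
#!/usr/bin/env sh
# Placeholders -- substitute your own master hostname and SSL file paths.
MASTER="puppetmaster.example.com"
ENVIRONMENT="production"
CERT_DIR="/etc/puppetlabs/puppet/ssl"

# Timed GET against the environment_classes endpoint, authenticated with a
# client certificate as described in the puppetserver API docs. curl's
# -w 'time_total' prints how long the request took end to end.
CMD="curl -w 'total: %{time_total}s\n' \
  --cert $CERT_DIR/certs/client.pem \
  --key $CERT_DIR/private_keys/client.pem \
  --cacert $CERT_DIR/certs/ca.pem \
  \"https://$MASTER:8140/puppet/v3/environment_classes?environment=$ENVIRONMENT\""

# Echo instead of executing, since the placeholders above are not real.
echo "$CMD"
```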