AgentNoCookieSetException RemotingClient Server responded to action `ping` with: status 422 Unprocessable Entity

102 views
Skip to first unread message

Veenu Official

unread,
Dec 16, 2022, 10:53:55 AM12/16/22
to go-cd
We have these errors coming up on Elastic Agent Container Start up and it occurs 4 times or more. Every time it takes 5 seconds and then it is a loop of 5 seconds for 4 times, so we are trying to optimize the time by getting rid of these errors.

Please let me know if anyone has been able to resolve these errors. It would be greatly helpful.



2022-12-13T09:20:50.605-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - 2022-12-13 15:20:50,583 ERROR [scheduler-2] RemotingClient:166 - Server responded to action `ping` with: status[422 Unprocessable Entity], body[{
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - "message" : "Agent [Agent [d1c6f1714063, 172.17.0.3, 50392830-07ba-4475-8061-67293edd1015, null]] has no cookie set; nested exception is com.thoughtworks.go.server.service.AgentNoCookieSetException: Agent [Agent [d1c6f1714063, 172.17.0.3, 50392830-07ba-4475-8061-67293edd1015, null]] has no cookie set"
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - }]
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - 2022-12-13 15:20:50,584 ERROR [scheduler-2] AgentHTTPClientController:114 - Error occurred when agent tried to ping server:
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - java.lang.RuntimeException: org.apache.http.client.ClientProtocolException: The server returned status code 422. Possible reasons include:
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - - This agent has been deleted from the configuration
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - - This agent is pending approval
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - - There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/22.1.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at com.thoughtworks.go.agent.RemotingClient.post(RemotingClient.java:118)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at com.thoughtworks.go.agent.RemotingClient.ping(RemotingClient.java:74)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at com.thoughtworks.go.agent.AgentHTTPClientController.ping(AgentHTTPClientController.java:109)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,584 [stdout] - at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,585 [stdout] - at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,585 [stdout] - at java.base/java.lang.Thread.run(Unknown Source)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,585 [stdout] - Caused by: org.apache.http.client.ClientProtocolException: The server returned status code 422. Possible reasons include:
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,585 [stdout] - - This agent has been deleted from the configuration
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,585 [stdout] - - This agent is pending approval
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,585 [stdout] - - There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/22.1.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,585 [stdout] - at com.thoughtworks.go.agent.RemotingClient.validateResponse(RemotingClient.java:141)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,585 [stdout] - at com.thoughtworks.go.agent.RemotingClient.post(RemotingClient.java:114)
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:50,585 [stdout] - ... 14 common frames omitted
    2022-12-13T09:20:51.811-06:00    jvm 1 | 2022-12-13 15:20:51,714 [stdout] - 2022-12-13 15:20:51,713 INFO [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/epsilon-gocd-nexus-component-task-plugin-1.2.jar
    2022-12-13T09:21:06.760-06:00    jvm 1 | 2022-12-13 15:20:51,724 [stdout] - 2022-12-13 15:20:51,724 INFO [main] Defaul
        No newer events at this moment. Auto retrying...
Pause

Chad Wilson

unread,
Dec 17, 2022, 11:37:18 PM12/17/22
to go...@googlegroups.com
Hi

I haven't looked into your specific case in detail (there's not enough logging here as it is), however the general challenge of "ensure elastic agent startup time is as optimal as possible" is one I am interested in improving, especially with some help from the community in figuring out what is happening.

In 22.3.0 I made a change to reduce some redundant sleeps specific to when the agent needs to update the version to synchronize with the server at startup (very common for elastic agents, which tend to not have any persistent storage to cache the up-to-date agent launcher and jar versions), however that doesn't specifically deal with the errors below where the agent tries to negotiate with the server to become authorised. It might take 20s in "total" time" with earlier versions, but I don't think you should see those "no cookie set" errors for all of those 20 seconds, if working correctly - perhaps just one at the end.

The ping operation is independent of the main loop that retrieves work and ensures the cookie is present, so it's conceptually possible you can get some of these ping errors during startup before it has fully registered and retrieved the cookie, but not exactly sure what is going on there.

To understand further, you might want to look at the fuller logs from an agent's perspective. IIRC not all the logs that might be useful will be seen in the agent's stdout, but you should be able to see the fuller lifecycle of what is happening from the agent's perspective inside the /godata/logs/agent.log (if running as a container) which should show which part of the start-up and work loop the agent is working on. If you can build a timeline of important events we can probably figure out what is going on and the right bits to be optimised.

-Chad

--
You received this message because you are subscribed to the Google Groups "go-cd" group.
To unsubscribe from this group and stop receiving emails from it, send an email to go-cd+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/go-cd/6f655dc5-fd6a-4fa5-8272-0bc2b1799413n%40googlegroups.com.

Veenu Official

unread,
Dec 19, 2022, 9:46:30 AM12/19/22
to go-cd
Thank you so much Chad. I will analyse the logs /godata/logs/agent.log and get back to you soon.

Veenu Official

unread,
Dec 19, 2022, 2:55:54 PM12/19/22
to go-cd
Please find the logs below, these are just "agent" logs I got from the container :

2022-12-19 19:03:22,364 INFO  [main] ClassPathXmlApplicationContext:583 - Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@74e52303: startup date [Mon Dec 19 19:03:22 UTC 2022]; root of context hierarchy
2022-12-19 19:03:22,398 INFO  [main] XmlBeanDefinitionReader:317 - Loading XML bean definitions from class path resource [applicationContext.xml]
2022-12-19 19:03:22,677 INFO  [main] XmlBeanDefinitionReader:317 - Loading XML bean definitions from class path resource [applicationContext-plugin-infra.xml]
2022-12-19 19:03:23,301 INFO  [main] DefaultListableBeanFactory:821 - Overriding bean definition for bean 'pluginLoader' with a different definition: replacing [Generic bean: class [com.thoughtworks.go.plugin.infra.PluginLoader]; scope=singleton; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in URL [jar:onejar:lib/go-plugin-infra-22.1.0-13913.jar!/com/thoughtworks/go/plugin/infra/PluginLoader.class]] with [Generic bean: class [com.thoughtworks.go.plugin.infra.PluginLoader]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [applicationContext-plugin-infra.xml]]
2022-12-19 19:03:23,633 INFO  [main] AutowiredAnnotationBeanPostProcessor:156 - JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
2022-12-19 19:03:24,869 INFO  [main] AgentStatusHttpd:90 - Agent status HTTP API server running on http://localhost:8152.
2022-12-19 19:03:24,954 INFO  [main] AgentHTTPClientController:96 - Configured remoting type: RemotingClient
2022-12-19 19:03:24,972 INFO  [main] ThreadPoolTaskScheduler:166 - Initializing ExecutorService 'scheduler'
2022-12-19 19:03:26,004 INFO  [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/bundled/gocd-yum-repository-poller-plugin.jar
2022-12-19 19:03:26,185 INFO  [scheduler-1] SslInfrastructureService:78 - [Agent Registration] Starting to register agent.
2022-12-19 19:03:26,185 INFO  [scheduler-1] SslInfrastructureService:88 - [Agent Registration] Fetching token from server.
2022-12-19 19:03:26,193 INFO  [scheduler-1] TokenRequester:56 - The server has generated token for the agent.
2022-12-19 19:03:26,195 INFO  [scheduler-1] SslInfrastructureService:91 - [Agent Registration] Got a token from server.
2022-12-19 19:03:26,205 INFO  [scheduler-1] SslInfrastructureService:170 - This agent is now approved by the server.
2022-12-19 19:03:26,205 INFO  [scheduler-1] SslInfrastructureService:119 - [Agent Registration] Retrieved registration from Go server.
2022-12-19 19:03:26,237 INFO  [scheduler-1] SslInfrastructureService:81 - [Agent Registration] Successfully registered agent.
2022-12-19 19:03:26,337 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/bundled/gocd-yum-repository-poller-plugin.jar
2022-12-19 19:03:26,346 INFO  [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/abcd-gocd-generic-nexus-task-plugin-1.1.jar
2022-12-19 19:03:26,554 ERROR [scheduler-2] RemotingClient:166 - Server responded to action `ping` with: status[422 Unprocessable Entity], body[{
  "message" : "Agent [Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b, null]] has no cookie set; nested exception is com.thoughtworks.go.server.service.AgentNoCookieSetException: Agent [Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b, null]] has no cookie set"
}]
2022-12-19 19:03:26,556 ERROR [scheduler-2] AgentHTTPClientController:114 - Error occurred when agent tried to ping server:
java.lang.RuntimeException: org.apache.http.client.ClientProtocolException: The server returned status code 422. Possible reasons include:
   - This agent has been deleted from the configuration
   - This agent is pending approval
   - There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/22.1.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
    at com.thoughtworks.go.agent.RemotingClient.post(RemotingClient.java:118)
    at com.thoughtworks.go.agent.RemotingClient.ping(RemotingClient.java:74)
    at com.thoughtworks.go.agent.AgentHTTPClientController.ping(AgentHTTPClientController.java:109)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65)
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)

Caused by: org.apache.http.client.ClientProtocolException: The server returned status code 422. Possible reasons include:
   - This agent has been deleted from the configuration
   - This agent is pending approval
   - There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/22.1.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
    at com.thoughtworks.go.agent.RemotingClient.validateResponse(RemotingClient.java:141)
    at com.thoughtworks.go.agent.RemotingClient.post(RemotingClient.java:114)

    ... 14 common frames omitted
2022-12-19 19:03:28,110 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/abcd-gocd-generic-nexus-task-plugin-1.1.jar
2022-12-19 19:03:28,119 INFO  [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/abcd-gocd-nexus-task-plugin-1.6.jar
2022-12-19 19:03:30,554 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/abcd-gocd-nexus-task-plugin-1.6.jar
2022-12-19 19:03:30,562 INFO  [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/gocd-git-path-material-plugin-2.2.0-164.jar
2022-12-19 19:03:31,564 ERROR [scheduler-2] RemotingClient:166 - Server responded to action `ping` with: status[422 Unprocessable Entity], body[{
  "message" : "Agent [Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b, null]] has no cookie set; nested exception is com.thoughtworks.go.server.service.AgentNoCookieSetException: Agent [Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b, null]] has no cookie set"
}]
2022-12-19 19:03:31,564 ERROR [scheduler-2] AgentHTTPClientController:114 - Error occurred when agent tried to ping server:
java.lang.RuntimeException: org.apache.http.client.ClientProtocolException: The server returned status code 422. Possible reasons include:
   - This agent has been deleted from the configuration
   - This agent is pending approval
   - There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/22.1.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
    at com.thoughtworks.go.agent.RemotingClient.post(RemotingClient.java:118)
    at com.thoughtworks.go.agent.RemotingClient.ping(RemotingClient.java:74)
    at com.thoughtworks.go.agent.AgentHTTPClientController.ping(AgentHTTPClientController.java:109)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65)
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)

Caused by: org.apache.http.client.ClientProtocolException: The server returned status code 422. Possible reasons include:
   - This agent has been deleted from the configuration
   - This agent is pending approval
   - There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/22.1.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
    at com.thoughtworks.go.agent.RemotingClient.validateResponse(RemotingClient.java:141)
    at com.thoughtworks.go.agent.RemotingClient.post(RemotingClient.java:114)

    ... 14 common frames omitted
2022-12-19 19:03:33,363 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/gocd-git-path-material-plugin-2.2.0-164.jar
2022-12-19 19:03:33,370 INFO  [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/script-executor-0.3.0.jar
2022-12-19 19:03:35,825 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/script-executor-0.3.0.jar
2022-12-19 19:03:35,836 INFO  [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/abcd-gocd-nexus-material-plugin-1.1.jar
2022-12-19 19:03:36,571 ERROR [scheduler-2] RemotingClient:166 - Server responded to action `ping` with: status[422 Unprocessable Entity], body[{
  "message" : "Agent [Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b, null]] has no cookie set; nested exception is com.thoughtworks.go.server.service.AgentNoCookieSetException: Agent [Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b, null]] has no cookie set"
}]
2022-12-19 19:03:36,571 ERROR [scheduler-2] AgentHTTPClientController:114 - Error occurred when agent tried to ping server:
java.lang.RuntimeException: org.apache.http.client.ClientProtocolException: The server returned status code 422. Possible reasons include:
   - This agent has been deleted from the configuration
   - This agent is pending approval
   - There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/22.1.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
    at com.thoughtworks.go.agent.RemotingClient.post(RemotingClient.java:118)
    at com.thoughtworks.go.agent.RemotingClient.ping(RemotingClient.java:74)
    at com.thoughtworks.go.agent.AgentHTTPClientController.ping(AgentHTTPClientController.java:109)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65)
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)

Caused by: org.apache.http.client.ClientProtocolException: The server returned status code 422. Possible reasons include:
   - This agent has been deleted from the configuration
   - This agent is pending approval
   - There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/22.1.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
    at com.thoughtworks.go.agent.RemotingClient.validateResponse(RemotingClient.java:141)
    at com.thoughtworks.go.agent.RemotingClient.post(RemotingClient.java:114)

    ... 14 common frames omitted
2022-12-19 19:03:38,530 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/abcd-gocd-nexus-material-plugin-1.1.jar
2022-12-19 19:03:38,540 INFO  [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/abcd-gocd-nexus-component-task-plugin-1.2.jar
2022-12-19 19:03:41,578 ERROR [scheduler-2] RemotingClient:166 - Server responded to action `ping` with: status[422 Unprocessable Entity], body[{
  "message" : "Agent [Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b, null]] has no cookie set; nested exception is com.thoughtworks.go.server.service.AgentNoCookieSetException: Agent [Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b, null]] has no cookie set"
}]
2022-12-19 19:03:41,579 ERROR [scheduler-2] AgentHTTPClientController:114 - Error occurred when agent tried to ping server:
java.lang.RuntimeException: org.apache.http.client.ClientProtocolException: The server returned status code 422. Possible reasons include:
   - This agent has been deleted from the configuration
   - This agent is pending approval
   - There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/22.1.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
    at com.thoughtworks.go.agent.RemotingClient.post(RemotingClient.java:118)
    at com.thoughtworks.go.agent.RemotingClient.ping(RemotingClient.java:74)
    at com.thoughtworks.go.agent.AgentHTTPClientController.ping(AgentHTTPClientController.java:109)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65)
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)

Caused by: org.apache.http.client.ClientProtocolException: The server returned status code 422. Possible reasons include:
   - This agent has been deleted from the configuration
   - This agent is pending approval
   - There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/22.1.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
    at com.thoughtworks.go.agent.RemotingClient.validateResponse(RemotingClient.java:141)
    at com.thoughtworks.go.agent.RemotingClient.post(RemotingClient.java:114)

    ... 14 common frames omitted
2022-12-19 19:03:42,754 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/abcd-gocd-nexus-component-task-plugin-1.2.jar
2022-12-19 19:03:42,762 INFO  [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/epcc-nexus-gocd-2.0-SNAPSHOT.jar
2022-12-19 19:03:46,231 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/epcc-nexus-gocd-2.0-SNAPSHOT.jar
2022-12-19 19:03:46,232 INFO  [scheduler-1] AgentHTTPClientController:139 - About to get cookie from the server.
2022-12-19 19:03:46,243 INFO  [scheduler-1] AgentHTTPClientController:142 - Got cookie: 0838ac28-be1e-4965-a6ca-833b2c2ba63b
2022-12-19 19:03:56,343 INFO  [scheduler-1] HttpService:125 - Got back 200 from server
2022-12-19 19:03:56,344 INFO  [scheduler-1] DefaultGoPublisher:88 - Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b] is reporting status [Preparing] to Go Server for Build [Q3UNIT-Infra-QA/9/GetNexusArtifact/37/GetNexusArtifact/142012]
2022-12-19 19:03:56,393 INFO  [scheduler-1] HttpService:125 - Got back 200 from server
2022-12-19 19:03:56,394 INFO  [scheduler-1] DefaultGoPublisher:88 - Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b] is reporting status [Building] to Go Server for Build [Q3UNIT-Infra-QA/9/GetNexusArtifact/37/GetNexusArtifact/142012]
2022-12-19 19:03:57,122 INFO  [scheduler-1] HttpService:125 - Got back 200 from server
2022-12-19 19:03:57,123 INFO  [scheduler-1] DefaultGoPublisher:94 - Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b] is reporting build result [Passed] to Go Server for Build [Q3UNIT-Infra-QA/9/GetNexusArtifact/37/GetNexusArtifact/142012]
2022-12-19 19:03:57,153 INFO  [scheduler-1] HttpService:125 - Got back 200 from server
2022-12-19 19:03:57,154 INFO  [scheduler-1] DefaultGoPublisher:88 - Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b] is reporting status [Completing] to Go Server for Build [Q3UNIT-Infra-QA/9/GetNexusArtifact/37/GetNexusArtifact/142012]
2022-12-19 19:03:57,180 INFO  [scheduler-1] ArtifactsPublisher:68 - Pluggable metadata folder is empty.
2022-12-19 19:03:57,180 INFO  [scheduler-1] DefaultGoPublisher:100 - Agent [3ba812767907, 172.17.0.2, d1330c5c-f60e-4de2-beca-bc21859ee99b] is reporting build result [Passed] to Go Server for Build [Q3UNIT-Infra-QA/9/GetNexusArtifact/37/GetNexusArtifact/142012]
2022-12-19 19:03:57,185 INFO  [scheduler-1] HttpService:125 - Got back 200 from server
2022-12-19 19:03:57,238 INFO  [scheduler-1] DefaultGoPublisher:82 - Stopping Transmission for Build [Q3UNIT-Infra-QA/9/GetNexusArtifact/37/GetNexusArtifact/142012]

Chad Wilson

unread,
Dec 19, 2022, 9:59:21 PM12/19/22
to go...@googlegroups.com
Thanks! That helps.

So what you see there in the errors is the background 5 second ping to check connection. This is a symptom but not the root cause of the problem. Before this scheduled check can go healthy, the agent has to fully initialise and start to obtain work. The root cause of the start-up delay appears to be that the plugin load is slow (the stuff you see on the [main] thread in the logs).

If we remove this background thread noise, we see the below.

2022-12-19 19:03:24,972 INFO  [main] ThreadPoolTaskScheduler:166 - Initializing ExecutorService 'scheduler'
2022-12-19 19:03:26,004 INFO [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/bundled/gocd-yum-repository-poller-plugin.jar
2022-12-19 19:03:26,337 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/bundled/gocd-yum-repository-poller-plugin.jar
2022-12-19 19:03:26,346 INFO [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/abcd-gocd-generic-nexus-task-plugin-1.1.jar
2022-12-19 19:03:28,110 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/abcd-gocd-generic-nexus-task-plugin-1.1.jar
2022-12-19 19:03:28,119 INFO [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/abcd-gocd-nexus-task-plugin-1.6.jar
2022-12-19 19:03:30,554 INFO [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/abcd-gocd-nexus-task-plugin-1.6.jar
2022-12-19 19:03:30,562 INFO [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/gocd-git-path-material-plugin-2.2.0-164.jar
2022-12-19 19:03:33,363 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/gocd-git-path-material-plugin-2.2.0-164.jar
2022-12-19 19:03:33,370 INFO [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/script-executor-0.3.0.jar
2022-12-19 19:03:35,825 INFO [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/script-executor-0.3.0.jar
2022-12-19 19:03:35,836 INFO [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/abcd-gocd-nexus-material-plugin-1.1.jar
2022-12-19 19:03:38,530 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/abcd-gocd-nexus-material-plugin-1.1.jar
2022-12-19 19:03:38,540 INFO [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/abcd-gocd-nexus-component-task-plugin-1.2.jar
2022-12-19 19:03:42,754 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/abcd-gocd-nexus-component-task-plugin-1.2.jar
2022-12-19 19:03:42,762 INFO [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/external/epcc-nexus-gocd-2.0-SNAPSHOT.jar
2022-12-19 19:03:46,231 INFO [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/external/epcc-nexus-gocd-2.0-SNAPSHOT.jar
2022-12-19 19:03:46,232 INFO [scheduler-1] AgentHTTPClientController:139 - About to get cookie from the server.


What I observe is that some of your plugins are very slow to load and it's taking about 20 seconds to load all plugins into the agent which is the main delay to start-up time. Builds cannot be scheduled (and the cookie obtained) before all the plugins are loaded into the agent.

Several of them are taking 3-4 seconds to do so (seems to get slower over time? - 330ms, 1.7s, 2.3s, 2.8s, 2.5s, 2.7s, 4.3s, 3.5s) and even very basic ones are rather slow like script-executor-0.3.0.jar (which are unlikely to do anything special during plugin initialisation). While I haven't looked in detail at slow plugin loading before or compared to other environments, you might want to consider:

  • Remove/rationalise plugins that are not needed/used from your server.
    • Some of these plugins appear to be custom ones you may have built/written rather than community plugins, so I am unsure what they are doing, how big they are, whether they are loading a lot of things into memory or have huge numbers of dependencies inside them.
    • Consider updating plugins to later versions on your server. Some have had dependnencies rationalised and might make them quicker to load if the issue is with their raw size and dependencies. (script-executor, git-path-material) You'd have to review that yourself for your custom plugins.
  • Each plugin has some logs specific to that plugin stored in the logs dir. Often they dont log much, but perhaps worth looking to see whether there are any hints at anything that is slow due to internals of the plugin, rather than something the agent plugin infrastructure is doing.
  • Consider whether the storage that is sitting behind /go/plugins/external and your agent is slow and whether disk access speed could be the bottleneck here. Plugins need to be downloaded, unpacked and then loaded of this storage, whcih is effectively temporary (and has to be done every startup) for an elastic agent, so you'll want it to be as fast as possible.
    • You might be able to also check whether the plugin download is slow in earlier logs to see whether this is a possible area of investigation.
  • Check the CPU usage during this 20 seconds to see what is happening - does it look like the main thread is CPU limited, i.e constantly at 100% during this plugin load time? Or garbage collection threads? This might help determine whether the issue is likely to be I/O speed or something more intrinsic to work required to load plugins.
  • Try allocating a bigger memory/heap to your agent (both container memory limits and JVM limits) and see if this changes the plugin load speed. This might be relevant if your plugins are very large.

Hope that helps.

-Chad

Veenu Official

unread,
Dec 20, 2022, 10:00:32 AM12/20/22
to go-cd
Thank you so much Chad.
Those points are really helpful.
I will go through the list of recommendations and see where we can optimize.

-Veenu

Veenu Official

unread,
Dec 20, 2022, 2:18:27 PM12/20/22
to go-cd
1. While testing on a new environment, we are also seeing 2 minutes in the logs to just configure RemotingType, do you know what can be the reason for that?


2022-12-20 16:46:39,192 INFO  [main] AutowiredAnnotationBeanPostProcessor:156 - JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
2022-12-20 16:46:40,448 INFO  [main] AgentStatusHttpd:90 - Agent status HTTP API server running on http://localhost:8152.

2022-12-20 16:48:50,532 INFO  [main] AgentHTTPClientController:96 - Configured remoting type: RemotingClient
2022-12-20 16:48:50,548 INFO  [main] ThreadPoolTaskScheduler:166 - Initializing ExecutorService 'scheduler'
2022-12-20 16:48:51,098 INFO  [main] DefaultPluginJarChangeListener:67 - Plugin load starting: /go/plugins/bundled/gocd-yum-repository-poller-plugin.jar
2022-12-20 16:48:51,420 INFO  [main] DefaultPluginJarChangeListener:74 - Plugin load finished: /go/plugins/bundled/gocd-yum-repository-poller-plugin.jar



2. Also, while reading how to limit the number of plugins being loaded by an agent, I came across this suggestion, do you also think it is a good idea to configure:

Can Go CD agent load only few plugins as compared to all plugins from Go CD Server

 

Yes, a Go CD agent can be configured to load only a subset of the plugins that are available on the Go CD server. This can be useful if you have a large number of plugins installed on the server and you only want a specific set of plugins to be available on the agent.

To configure an agent to load only specific plugins, you can use the PLUGIN_LOCATOR environment variable when starting the agent. The PLUGIN_LOCATOR variable specifies the path to a JSON file that lists the plugins that should be loaded by the agent. The JSON file should contain an array of plugin IDs, with each ID corresponding to a plugin that is installed on the Go CD server.

For example, if you have the following JSON file:

 

 

[  "github.com/gocd-contrib/gocd-json-config-plugin",  "github.com/gocd-contrib/gocd-yaml-config-plugin"]

Then the agent will only load the gocd-json-config-plugin and gocd-yaml-config-plugin plugins, even if there are other plugins available on the Go CD server.

It's also possible to use wildcards in the plugin IDs to load multiple plugins at once. For example, the following JSON file will load all plugins that start with github.com/gocd-contrib/gocd-:

 

 

[  "github.com/gocd-contrib/gocd-*"]

Note that the PLUGIN_LOCATOR variable only affects the plugins that are loaded by the agent. It does not affect the plugins that are available on the Go CD server, or the plugins that are loaded by other agents.
 

Chad Wilson

unread,
Dec 21, 2022, 12:20:20 AM12/21/22
to go...@googlegroups.com
If you have a random two minute gap, it points to a general slowness in your agent, maybe not just plugin loads. There are lots of things that are happening in start-up there inside the JVM, but they shouldn't take two minutes unless your agent has some resource constraints (slow disk, CPU etc). I would focus in these more general areas as it looks like plugin load time isn't the sole problem here.

You could increase the logging to DEBUG level but it will be very noisy so not sure how much you will get out of it without detailed knowledge of GoCD internals, spring applications etc.

I'm not sure where you read about PLUGIN_LOCATOR - could you share where this is documented/shared? I am not familiar with this capability and have never used it, but maybe it exists. :-) In any case, if you have other random slowness, in "non plugin-load" parts of the start-up your effort is probably better spent looking there at the standard resource constraints for software - disk speed, CPU, memory, network speed.

-Chad

Veenu Official

unread,
Dec 21, 2022, 3:33:17 PM12/21/22
to go-cd
I got that information on  chat.openai.com  :) I could not find the actual source though!
I was trying to see if a Go Agent can load selective plugins instead of picking everything under the Go server folders /plugins/bundled and /plugins/external.

Anyways, thanks for all your inputs.

Veenu Official

unread,
Jan 3, 2023, 4:00:58 PM1/3/23
to go-cd
Hi Chad,

Wishing you a happy new year!

For some reason, we always have a new agent spinning up for each Stage (instead of picking a running agent), even if multiple stages are using the same Elastic Profile. Even though we have number of minimum Running agents as 10 for elastic agents cluster profile, our next stage (that uses the same Elastic profile for the agent container) is always spinning up a new agent instead of picking up a running agent. Is there a flag to set up to not stop an agent container when it is done with a stage?
As of now, for us each stage with elastic agent still takes around 46  seconds to launch an agent even after increasing JVM startup arguments, reducing the relaunch time from 10 sec to 200 ms, please suggest.

Also, where do we set the logging to debug level ?

Thanks,
Veenu

Chad Wilson

unread,
Jan 3, 2023, 9:28:04 PM1/3/23
to go...@googlegroups.com
Hi Veenu

  • What type of elastic agents are these? ECS? Kubernetes? Something else?
    • From your description, it sounds like the ECS one. If so, the config you are referring to on the cluster profile keeps instances (VMs) online within the cluster, but individual containers and thus agent processes are not re-used, they are always booted clean and bootstrapped from the server for each job - either on an existing instance, or a new instance if there aren't enough resources.
    • Which specific agent container image and version are you using? (your server is 22.1.0 I believe)
    • Might help if you share a (redacted) output from /go/api/support so we are not guessing about your server configuration, plugin versions etc.
    • And share your elastic agent cluster profile and elastic profile configuration.
  • Are you able to share what you did to address the weird unexplained plugin delays noted earlier?
    • Which suggested steps have you taken or not taken?
    • How long are all the plugin loads taking from start -> finish now - the same as before?
    • Did you try increasing the container level resources available to the agent container (CPU or memory limits)?
  • How are you measuring the 46 seconds?
    • What are the start/complete times you are considering? From the perspective of the server? What is "Complete"? When the agent reports as online in the logs on the server agent console and starts executing tasks?
    • Please share a representative breakdown/timeline of the time spent doing various steps, from server initiating agent creation, to when the agent reports as online. It's difficult to give anything but generic suggestions unless you share the methodology for measurement and the data you've collected, based on which logging.
  • Additional logging is a bit of a pain to configure on elastic agents. The generic detail is at https://docs.gocd.org/current/advanced_usage/logging.html however for containers this generally means mounting some logging config file overrides into /godata/config which is not so easy (personally I haven't done it for agents, so would be doing trial+error also). To reiterate, debug logs on their own are not likely to tell us the major constraint without considering other metrics on resources.
-Chad

Veenu Official

unread,
Jan 4, 2023, 10:33:11 PM1/4/23
to go-cd
Hi Chad,

Took some time to gather all the info.
Please take a look at the files attached.

I tried loading less plugins, but the test GoCD environment my Operations team has created gives very inconsistent timings, so that is not helping me to test with selective plugin loading.
I would may be try upgrading to new GoCD in the new test environment but that is another team so will have to wait till I get approval.

Also as you mentioned, how do we increase disk speed and network speed, I could see only CPU and memory in the Elastic Profile.
 
Really appreciate your time and inputs. Thanks !
Veenu

On Tuesday, January 3, 2023 at 8:28:04 PM UTC-6 Chad Wilson wrote:
Hi Veenu

  • What type of elastic agents are these? ECS? Kubernetes? Something else?
    • From your description, it sounds like the ECS one. If so, the config you are referring to on the cluster profile keeps instances (VMs) online within the cluster, but individual containers and thus agent processes are not re-used, they are always booted clean and bootstrapped from the server for each job - either on an existing instance, or a new instance if there aren't enough resources. You're right, it is ECS. Thanks for the information. I wish we could re-use agent and that would save us a lot of time spent in bootstrapping a new one each time.
    • Which specific agent container image and version are you using? (your server is 22.1.0 I believe) Yes 22.1.0, attached Dockerfile and docker-entrypoint.sh
    • Might help if you share a (redacted) output from /go/api/support so we are not guessing about your server configuration, plugin versions etc. Attached
    • And share your elastic agent cluster profile and elastic profile configuration. Attached
  • Are you able to share what you did to address the weird unexplained plugin delays noted earlier?
    • Which suggested steps have you taken or not taken? CPU monitoring on the ECS instances running the container is usually 15-20%.
    • How long are all the plugin loads taking from start -> finish now - the same as before? I would say same as before.
    • Did you try increasing the container level resources available to the agent container (CPU or memory limits)? Increased JVM on the container for both agent-bootstrapper and agent jar. (-Xms1024m -Xmx2g)  This looks like has helped with couple of seconds I feel.
  • How are you measuring the 46 seconds?
    • What are the start/complete times you are considering? From the perspective of the server? What is "Complete"? When the agent reports as online in the logs on the server agent console and starts executing tasks?  Yes agent bootup time as reported on the Go CD server, ready to execute tasks
    • Please share a representative breakdown/timeline of the time spent doing various steps, from server initiating agent creation, to when the agent reports as online. It's difficult to give anything but generic suggestions unless you share the methodology for measurement and the data you've collected, based on which logging. Attached both server logs and Agent container logs
  • Additional logging is a bit of a pain to configure on elastic agents. The generic detail is at https://docs.gocd.org/current/advanced_usage/logging.html however for containers this generally means mounting some logging config file overrides into /godata/config which is not so easy (personally I haven't done it for agents, so would be doing trial+error also). To reiterate, debug logs on their own are not likely to tell us the major constraint without considering other metrics on resources. Thanks for the info.
-Chad

StgGoCD-ElasticProfile.txt
base_image_dockerfile.txt
Agent container logs- From ECSCloudWatch.txt
Go API Support - server info - Staging.txt
docker-entrypoint.sh
StgGoCD-ClusterProfile.txt

Chad Wilson

unread,
Jan 29, 2023, 12:21:12 PM1/29/23
to go...@googlegroups.com
Hi there

Sorry for the delayed reply, hadn't got around to digging into this a bit. Few observations from the log/files.

1. The initial agent bootstrap is taking about 5-6 seconds. This isn't terrible, but the dockerfile you attached says the agent bootstrapper is 21.1.0 even though your server is 22.1.0. This will be faster if your agent bootstrapper matches, or is closer to the server version. Your Dockerfile also seems to be copy+pasted from an old version of https://github.com/gocd/docker-gocd-agent-centos-7/blob/master/Dockerfile and then customised. It would probably be better to build on top of the pre-built GoCD agent images published to Dockerhub (FROM gocd/gocd-agent-centos-7:v22.1.0 etc) rather than copy and paste all the instructions - to make it easier to update periodically. Not sure how often you rebuild, but that is likely rather old and I would suggest you rebuild on something more recent.

2. Hacking the relaunch time is happening because your agent is needing to update (version mismatch mentioned above). Additionally, you probably don't need to hack the relaunch time if you update to 22.3.0 - that's the tweak I made to relaunch immediately on version updates.

3. Your plugin loads are still very slow (18s overall). This is not normal and I have not seen it so slow before. I suspect something is amiss with the container or host, but there's not enough information to know. Looks like you are allocating 2 cores to the agent in the elastic profile. Does it improve massively if you allocate more? Again, I would try a more detailed level of inspection of what is going on within your ECS instance when the agent comes up. Try creating an agent in a different environment (e.g just running locally) with the same set of plugins. Are the plugin loads also slow in a simpler environment?

4. Are you using the ECS optimized AMIs from Amazon or something custom/proprietary? You could try updating them. If it's the latter, you might need to dig yourself into what is happening.

5. There are some more tips at https://github.com/gocd/gocd/issues/11215#issuecomment-1397268015 for reducing the amount of time it takes for the agent to start doing work (you will likely see a 10 second gap after it first tries to get work before a second attempt by default). -Dgocd.agent.extra.properties=agent.get.work.interval=1000 will reduce this to 1s - but please read the caveats for this on the GitHub issue above.

6. Your server looks reasonably big in number of pipelines and materials, but you seem to have the default memory allocated to it (1G max). You might get a better experience with more memory available to the server. Doesn't seem to be the main issue with the agent start-up speed, but might be contributing to other slowness.

Overall, your agent just looks sluggish though, so there is a bigger challenge than tweaks to sleeps/polling within GoCD logic.

-Chad

Veenu Official

unread,
Feb 1, 2023, 11:30:55 PM2/1/23
to go-cd
Hi Chad,

Thanks a lot for the response here.

We have shifted our focus at this point to just try and use Static agents and try scaling them up and down, as needed for now.

Since we had a huge VSM with 100+ pipelines. At one particular point in the VSM, one of the pipelines spawns 40+ downstream pipelines, which mean we need to quickly spin up those many containers and even if we used bigger EC2 instances(we tried with m6a.8xlarge) for tackling the load, there was a big scheduling gap for GoCD trying to reach out to ECS Cluster and it drastically slowed down the process. Though, with bigger instances we did see the time to spin up the agent container took less like 37 seconds instead of 45+ seconds.
For now, we are moving on with static agents optimization to support multiple such VSMs that might have to run in parallel.

Anyways, thank you for investing the time in this analysis.
Really appreciate your inputs.

Regards,
Veenu
Reply all
Reply to author
Forward
0 new messages