Agents going Lost Contact after a while and warnings in the server about remoteBuildRepository


Diogo Oliveira

Aug 5, 2020, 1:10:29 PM
to go-cd
(Note: I originally opened this as an issue on GitHub; thanks to @Aravind SV for pointing me to the right place.)

We migrated our Go Server to a new server with a PostgreSQL database (without migrating any data), and we created agents with a fresh installation of GoCD Agent. Both the server and the agents are on version 20.5.0.

Important detail: we have a "pre-production" version of our server, where we did the same migration, and had no issues.


Environment

Using an EC2 instance + RDS, with a load balancer (LB).


Basic environment details
  • Go Version: 20.5.0 (11820-1c9b12ac8aa216a2c062fbec4cba18d9cfb8b404)
  • JAVA Version: 13.0.2
  • OS: Linux 3.10.0-1127.18.2.el7.x86_64

Issue and logs

After a while, all agents go "Lost Contact". If we restart the server, everything works well again for a couple of hours.

By checking the logs, we see this on the Agents:


2020-08-04 18:15:22,327 ERROR [scheduler-1] AgentHTTPClientController:105 - Error occurred when agent tried to ping server: 
org.springframework.remoting.RemoteAccessException: Could not access HTTP invoker remote service at [https://<server_url>:443/go/remoting/remoteBuildRepository]; nested exception is org.apache.http.client.ClientProtocolException: The server returned status code 403. Possible reasons include:
   - This agent has been deleted from the configuration
   - This agent is pending approval
   - There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/20.5.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
	at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.convertHttpInvokerAccessException(HttpInvokerClientInterceptor.java:226)
	at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.invoke(HttpInvokerClientInterceptor.java:153)
	at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
	at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
	at com.sun.proxy.$Proxy10.ping(Unknown Source)
	at com.thoughtworks.go.agent.AgentHTTPClientController.ping(AgentHTTPClientController.java:100)
	at jdk.internal.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.base/java.lang.reflect.Method.invoke(Unknown Source)
	at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65)
	at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
	at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
	at java.base/java.lang.Thread.run(Unknown Source)
Caused by: org.apache.http.client.ClientProtocolException: The server returned status code 403. Possible reasons include:
   - This agent has been deleted from the configuration
   - This agent is pending approval
   - There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/20.5.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
	at com.thoughtworks.go.agent.GoHttpClientHttpInvokerRequestExecutor.validateResponse(GoHttpClientHttpInvokerRequestExecutor.java:100)
	at com.thoughtworks.go.agent.GoHttpClientHttpInvokerRequestExecutor.doExecuteRequest(GoHttpClientHttpInvokerRequestExecutor.java:66)
	at org.springframework.remoting.httpinvoker.AbstractHttpInvokerRequestExecutor.executeRequest(AbstractHttpInvokerRequestExecutor.java:137)
	at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.executeRequest(HttpInvokerClientInterceptor.java:202)
	at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.executeRequest(HttpInvokerClientInterceptor.java:184)
	at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.invoke(HttpInvokerClientInterceptor.java:150)

After checking the server logs, we also see this error:

[image: screenshot of the server log error, not included]

As suggested by @Aravind SV, we tried creating a local agent on the same machine as the server to rule out LB issues. This agent is also going Lost Contact.

We also noticed a spike in requests when this happened, though this could simply be due to retries.
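
To help tell a real spike from retries, the ping failures in an agent log could be tallied per minute and compared against the request graph. The following is only a rough sketch: it assumes the agent log lines keep the same timestamp and message format as the excerpt above, and the function name `ping_failures_per_minute` is made up for illustration.

```python
import re
from collections import Counter

# Matches the first line of a ping failure as it appears in the agent log
# excerpt above, capturing the timestamp truncated to the minute.
# Assumption: all agent log lines use this "YYYY-MM-DD HH:MM:SS,mmm LEVEL" prefix.
PING_ERROR = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2},\d{3} ERROR .*"
    r"Error occurred when agent tried to ping server"
)


def ping_failures_per_minute(lines):
    """Count ping-failure entries, keyed by 'YYYY-MM-DD HH:MM'."""
    counts = Counter()
    for line in lines:
        match = PING_ERROR.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file (any iterable of lines) to the function and printing the sorted items would show the minute at which the 403 responses start and how often the agent retries afterwards.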

Any help on this is very welcome, we're a bit lost here :)