We migrated our Go Server to a new host with a PostgreSQL database (without migrating any data), and we set up the agents from a fresh installation of GoCD Agent. Both the server and the agents are on version 20.5.0.
Important detail: we did the same migration on a "pre-production" copy of our server and had no issues there.
The setup is an EC2 instance plus RDS, behind a load balancer (LB).
After a while, all agents go to "Lost Contact". If we restart the server, everything works again for a couple of hours, and then the agents are lost again.
Checking the logs, we see this on the agents:
2020-08-04 18:15:22,327 ERROR [scheduler-1] AgentHTTPClientController:105 - Error occurred when agent tried to ping server:
org.springframework.remoting.RemoteAccessException: Could not access HTTP invoker remote service at [https://<server_url>:443/go/remoting/remoteBuildRepository]; nested exception is org.apache.http.client.ClientProtocolException: The server returned status code 403. Possible reasons include:
- This agent has been deleted from the configuration
- This agent is pending approval
- There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/20.5.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
    at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.convertHttpInvokerAccessException(HttpInvokerClientInterceptor.java:226)
    at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.invoke(HttpInvokerClientInterceptor.java:153)
    at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179)
    at org.springframework.aop.framework.JdkDynamicAopProxy.invoke(JdkDynamicAopProxy.java:213)
    at com.sun.proxy.$Proxy10.ping(Unknown Source)
    at com.thoughtworks.go.agent.AgentHTTPClientController.ping(AgentHTTPClientController.java:100)
    at jdk.internal.reflect.GeneratedMethodAccessor7.invoke(Unknown Source)
    at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.base/java.lang.reflect.Method.invoke(Unknown Source)
    at org.springframework.scheduling.support.ScheduledMethodRunnable.run(ScheduledMethodRunnable.java:65)
    at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
    at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source)
    at java.base/java.util.concurrent.FutureTask.runAndReset(Unknown Source)
    at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.base/java.lang.Thread.run(Unknown Source)
Caused by: org.apache.http.client.ClientProtocolException: The server returned status code 403. Possible reasons include:
- This agent has been deleted from the configuration
- This agent is pending approval
- There is possibly a reverse proxy (or load balancer) that has been misconfigured. See https://docs.gocd.org/20.5.0/installation/configure-reverse-proxy.html#agents-and-reverse-proxies for details.
    at com.thoughtworks.go.agent.GoHttpClientHttpInvokerRequestExecutor.validateResponse(GoHttpClientHttpInvokerRequestExecutor.java:100)
    at com.thoughtworks.go.agent.GoHttpClientHttpInvokerRequestExecutor.doExecuteRequest(GoHttpClientHttpInvokerRequestExecutor.java:66)
    at org.springframework.remoting.httpinvoker.AbstractHttpInvokerRequestExecutor.executeRequest(AbstractHttpInvokerRequestExecutor.java:137)
    at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.executeRequest(HttpInvokerClientInterceptor.java:202)
    at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.executeRequest(HttpInvokerClientInterceptor.java:184)
    at org.springframework.remoting.httpinvoker.HttpInvokerClientInterceptor.invoke(HttpInvokerClientInterceptor.java:150)
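In case it helps anyone trying to line up the failures with the server restarts, here is a rough way to pull the failed pings out of an agent log. This is only a sketch: it assumes every failure starts with a line laid out like the one above (timestamp, ERROR, the "tried to ping server" message and the status code all on the same line), and `ping_failures` is a made-up helper name, not something that ships with GoCD.

```python
import re
from datetime import datetime

# Assumption: the first line of each failure looks like the log entry quoted
# above, i.e. "<date> <time>,<millis> ERROR ... ping server ... status code NNN".
PING_ERROR = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} ERROR .*?"
    r"Error occurred when agent tried to ping server.*?"
    r"status code (?P<status>\d{3})"
)


def ping_failures(lines):
    """Yield (timestamp, HTTP status) for each failed ping found in lines."""
    for line in lines:
        match = PING_ERROR.search(line)
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            yield ts, int(match.group("status"))
```

Feeding it the lines of the agent log file (for example an open file object) gives the time of every failed ping, so the first 403 after each restart is easy to spot.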
After checking the server logs, we also see this error: