We have had some issues on our VMWare platform that has meant that some machines become unresponsive and are required to be forcibly rebooted by the VMWare console host software.
When this has happened we have noticed that in all cases the GO Agent does not start correctly. It starts then stops immediately.
Looking in the go-agent-bootstrapper.log file we can see that the bootstrapper finds an existing lock file and terminates its process. I assume this is because the lock file was not cleanly removed from the previous shutdown.
We can start the service manually after the first failure which appears to fix it.
Is this normal behaviour? Can we do something to fix it? I am considering setting Service recovery options so the service will restart on failure.
This seems quite an odd design choice considering how common this scenario will occur.
Thanks
Carl
OS: Windows 2012 Server R2
Relevant log file entries are:
STATUS | wrapper | 2014/06/19 11:16:04 | --> Wrapper Started as Service
STATUS | wrapper | 2014/06/19 11:16:04 | Java Service Wrapper Standard Edition 3.3.3
STATUS | wrapper | 2014/06/19 11:16:04 | Copyright (C) 1999-2009 Tanuki Software, Ltd. All Rights Reserved.
STATUS | wrapper | 2014/06/19 11:16:04 | http://wrapper.tanukisoftware.org
STATUS | wrapper | 2014/06/19 11:16:04 | Licensed to ThoughtWorks for Cruise Agent
STATUS | wrapper | 2014/06/19 11:16:04 |
STATUS | wrapper | 2014/06/19 11:16:04 | Launching a JVM...
INFO | jvm 1 | 2014/06/19 11:16:13 | WrapperManager: Initializing...
INFO | jvm 1 | 2014/06/19 11:16:15 | logFile Environment Variable= null
INFO | jvm 1 | 2014/06/19 11:16:15 | Logging to go-agent-bootstrapper.log
2014-06-19 11:16:16,859 [WrapperSimpleAppMain] INFO agent.common.util.JarUtil:68 - Attempting to load Go-Agent-Launcher-Class from aedd84b2-bda7-4f4c-b77f-9553dfc2ccd8agent-launcher.jar File:
2014-06-19 11:16:16,890 [WrapperSimpleAppMain] INFO agent.common.util.JarUtil:77 - manifestLibDirKey: Go-Agent-Launcher-Lib-Dir: libs
2014-06-19 11:16:17,202 [WrapperSimpleAppMain] INFO agent.common.util.JarUtil:83 - manifestClassKey: Go-Agent-Launcher-Class: com.thoughtworks.cruise.agent.launcher.AgentLauncherImpl
2014-06-19 11:16:17,218 [WrapperSimpleAppMain] INFO agent.common.util.ParentClassAccessFilteringClassloader:43 - Loading com.thoughtworks.cruise.agent.common.launcher.AgentLauncher using com.simontuffs.onejar.JarClassLoader
2014-06-19 11:16:17,343 [WrapperSimpleAppMain] INFO go.agent.bootstrapper.AgentBootstrapper:72 - Attempting create and start launcher...
INFO | jvm 1 | 2014/06/19 11:16:17 | 0 [WrapperSimpleAppMain] INFO com.thoughtworks.go.agent.launcher.Lockfile - Sleeping for 10000 secs to before 'last modified check'...
2014-06-19 11:16:27,484 [WrapperSimpleAppMain] INFO go.agent.bootstrapper.AgentBootstrapper:76 - Launcher returned with code 12245933(0xBADBAD)
2014-06-19 11:16:27,531 [WrapperSimpleAppMain] INFO go.agent.bootstrapper.AgentBootstrapper:116 - Destroying launcher creator
INFO | jvm 1 | 2014/06/19 11:16:27 | Already running agent launcher in this folder.
INFO | jvm 1 | 2014/06/19 11:16:27 | 10000 [WrapperSimpleAppMain] ERROR com.thoughtworks.go.agent.launcher.Lockfile - Already running agent launcher in this folder.
2014-06-19 11:16:28,593 [WrapperSimpleAppMain] INFO go.agent.bootstrapper.DefaultAgentLauncherCreatorImpl:90 - Attempt No: 1 to cleanup launcher temp files
2014-06-19 11:16:28,609 [WrapperSimpleAppMain] INFO go.agent.bootstrapper.AgentBootstrapper:99 - Waiting for 10000 ms before re-launch....
2014-06-19 11:16:38,718 [WrapperSimpleAppMain] INFO go.agent.bootstrapper.AgentBootstrapper:116 - Destroying launcher creator
2014-06-19 11:16:38,968 [WrapperSimpleAppMain] INFO go.agent.bootstrapper.AgentBootstrapper:93 - Agent Bootstrapper stopped
I fixed this by setting up a Powershell script to run on agent start-up. The startup task itself was pushed out to all the agents using GO so it was a very quick fix which seems to have done the trick.