Why does the Puppet-Agent on Windows use a batch file?
I posted a question in the Ask PuppetLabs section and was directed to create a post here as well:
https://ask.puppetlabs.com/question/3506/why-does-the-puppet-agent-on-windows-use-a-batch-file/
------------------------ Original question
The Puppet Enterprise for Windows Agent runs as a Windows service, basically a daemonised version of Puppet, which is all fine. However, the Windows service calls a batch file, which seems extremely strange. While it does work, i.e. the service starts and runs, using CMD.EXE as a service executable is generally considered a really bad idea.
It does not respond to the usual SCM (Service Control Manager) calls and in its current state is misconfigured, e.g. the service says that it can respond to Pause and Continue events but CMD.EXE can't fulfill those requests. Also, CMD.EXE does not monitor the ruby process (except for the basic check of whether it is running) and vice versa. I can kill the cmd.exe process and the service manager will report that the Puppet Agent has stopped; however, the ruby process is still quite happily running.
Either I'm missing something and CMD.EXE is an appropriate service executable, or perhaps the community or Puppet Labs could create a better native wrapper for the Ruby-based puppet process.
------------------------
So I did a few tests:
- You can send pause and continue messages to the service, but they're just ignored even though the service says it's Paused.
- You can kill the CMD.EXE service process, but the Puppet Agent is still running (it becomes an orphaned process). You can then start the service again, and you'll end up with two Puppet Agents running daemonised at the same time.
- I'm not sure what will happen if they both try to do a catalog run at the same time, but nothing good can come of it.
- CMD.EXE doesn't respond to power events, e.g. going into Standby/Hibernate, but I have no idea how any service wrapper could raise that kind of event in Puppet so that it could deal with it. Does it even matter if the host goes into Standby in the middle of a Puppet run? Admittedly this would be a very unlikely scenario, as Puppet seems to be more always-on server orientated rather than for laptop configuration management.
- If you start Puppet and then quickly attempt to stop it, it gets stuck in the Stopping state and you have to kill Ruby manually. That's because the Service Control Manager raises events in a multithreaded manner, whereas CMD is running a single thread. If CMD.EXE is too busy to process the event then things get into a funny state. This would probably occur during a catalog run too, but I haven't confirmed it.
- From a troubleshooting perspective, there are no logs or diagnostic information created by CMD.EXE.
I'd be happy to write a service wrapper in C# and give it to the community, but before I do I want to make sure that it's really needed and I'm not trying to solve a problem that doesn't really exist.
Hi Josh,
After a lot of digging around, I think I have a partial solution:
NOTE - This is my first attempt at writing Ruby, so I expect there are some issues with what I've written. I only had a single host (Server 2008 R2 64bit) to test this on, but I believe the changes I've made are generic enough to work on all Puppet-supported MS operating systems, 32 or 64bit.
While CMD.EXE is not required by the Agent once it is running, the Service Control Manager is still monitoring that process, and if it dies, it will report that the service is not running even though the orphaned RUBY.EXE process is still running.
WINDOWS SERVICE CONFIG
The service process doesn't need Puppet's entire environment in order to run. The way I see it, the service needs enough information to act as a Windows service and to spawn Puppet child processes. Puppet.BAT calls Environment.BAT, which does all the work of setting up the environment variables on a per-call basis.
So what I did is change the ImagePath of the pe-puppet service to call ruby directly:
HKLM\System\CurrentControlSet\Services\pe-puppet\ImagePath;
FROM:
"C:\Program Files (x86)\Puppet Labs\Puppet Enterprise\service\daemon.bat"
TO:
"C:\Program Files (x86)\Puppet Labs\Puppet Enterprise\sys\ruby\bin\rubyw.exe" -C"C:\Program Files (x86)\Puppet Labs\Puppet Enterprise\service" "C:\Program Files (x86)\Puppet Labs\Puppet Enterprise\service\daemon.rb"
That is enough information for Ruby to run the service. Obviously the paths may differ from host to host, but that can all be authored in the Puppet MSI easily.
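As a rough illustration of how that ImagePath value is put together, here is a minimal Ruby sketch that assembles the string from an install directory. The helper names (`dq`, `service_image_path`) are made up for this example and are not part of Puppet or its installer; the directory layout is the one shown above.

```ruby
# Hypothetical helpers for illustration only; not part of Puppet or its MSI.

# Wrap a path in double quotes so spaces in "Program Files" survive.
def dq(path)
  %("#{path}")
end

# Build the ImagePath value from the install directory, assuming the
# sys\ruby\bin and service layout shown in the post.
def service_image_path(install_dir)
  ruby    = "#{install_dir}\\sys\\ruby\\bin\\rubyw.exe"
  service = "#{install_dir}\\service"
  daemon  = "#{service}\\daemon.rb"
  # -C makes ruby change into the service directory before running the script.
  "#{dq(ruby)} -C#{dq(service)} #{dq(daemon)}"
end

puts service_image_path('C:\Program Files (x86)\Puppet Labs\Puppet Enterprise')
```

With the default install directory this prints the same string as the TO: value above; an installer would substitute the real install directory for each host.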
DAEMON.RB
I made some changes to daemon.rb (attached to this post);
* I created a basic function for Windows EventLog logging (Puppet Bug #21641). It doesn't register an application source, so it's a bit of a hack and could really do with a more professional cleanup.
* I fixed the behaviour of the Puppet Agent terminating once Paused (Bug #22972).
* A side effect of not running the daemon from CMD.EXE was that the call to get the runinterval was failing. I suspect this is due to STDOUT not being available anymore. So I used the well-worn method of piping the output to a file and reading that instead (Lines 60-79). I still need to try RUBY.EXE instead of RUBYW.EXE and see if it makes a difference.
* I put the Puppet Agent run in an IF statement, which will only evaluate as true if the service is in a RUNNING or IDLE state (Lines 81-86).
* I think I may have found a bug in the Win32 Daemon code which was taking the service out of PAUSED and putting it into a RUNNING state whenever a SERVICE_INTERROGATE event is received. I need to log this with the authors. (Lines 108-119).
* I added a little extra logging in the Resume and Pause events. I changed some of the wording in the main loop to reduce any confusion about "Service Resuming".
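For readers without the attachment, the file-redirect workaround and the state check from the list above could be sketched roughly as follows. This is not the code from the attached daemon.rb: the helper names, the state symbols and the 1800-second fallback are assumptions made for illustration, and the real daemon would get its state from the Win32 Daemon library rather than from symbols.

```ruby
require 'rbconfig'
require 'tempfile'

# Hypothetical helper: run a command with its stdout redirected to a
# temporary file, then read the file back. This avoids relying on the
# service process having a usable STDOUT of its own.
def capture_via_file(*command)
  Tempfile.create('runinterval') do |file|
    file.close # let the child process open the file for writing
    ok = system(*command, out: file.path)
    ok ? File.read(file.path).strip : nil
  end
end

# Turn the captured text into seconds; fall back to a default (assumed
# here to be 1800) when the output is missing or not a plain number.
def parse_interval(text, default = 1800)
  text.to_s =~ /\A\d+\z/ ? text.to_i : default
end

# Stand-in for the IF statement: only run the agent when the service is
# in a running or idle state. The symbols are placeholders.
def agent_run_allowed?(state)
  [:running, :idle].include?(state)
end

# Usage: a ruby one-liner stands in for the real puppet call that prints
# the runinterval.
interval = parse_interval(capture_via_file(RbConfig.ruby, '-e', 'puts 1800'))
puts "would run every #{interval}s" if agent_run_allowed?(:running)
```

Closing the temporary file before spawning the child matters on Windows, where an open handle can stop another process from writing to the same file.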
Glenn.
* A side effect of not running the daemon from a CMD.EXE was that the call to get to runinterval was failing. I suspect this is due to STDOUT not being available anymore. So I used the well worn method of pipe the output to a file and read that instead (Lines 60-79). I still need to try RUBY.EXE instead of RUBYW.EXE and see if it makes a difference.
Well that was easier than I expected
* daemon.rb now defaults to logging in the Event Log and optionally to the windows.log file
* The ImagePath string now looks like;
"C:\Program Files (x86)\Puppet Labs\Puppet Enterprise\sys\ruby\bin\ruby.exe" -rubygems -C"C:\Program Files (x86)\Puppet Labs\Puppet Enterprise\service" "C:\Program Files (x86)\Puppet Labs\Puppet Enterprise\service\daemon.rb"
I'm not sure if the "-rubygems" is required but it is in the daemon.bat file.
On Wed, Oct 30, 2013 at 8:41 PM, Glenn Sarti <glenn....@gmail.com> wrote:
Well that was easier than I expected
* daemon.rb now defaults to logging in the Event Log and optionally to the windows.log file

This should resolve https://projects.puppetlabs.com/issues/21641. Can you submit a PR for this issue?
Josh
Josh Cooper
Developer, Puppet Labs