Jira (PUP-3914) Intermittent lock file left after reboot.


David (JIRA)

Jan 27, 2015, 11:14:55 AM
to puppe...@googlegroups.com
David created an issue
 
Puppet / Bug PUP-3914
Intermittent lock file left after reboot.
Issue Type: Bug
Affects Versions: PUP 3.7.3
Assignee: Kylo Ginsberg
Components: Client
Created: 2015/01/27 8:14 AM
Environment:

Windows 2008 Server R2 (Datacentre)

Fix Versions: PUP 3.7.4
Labels: windows puppet-agent
Priority: Normal
Reporter: David

We have our own module that can trigger a reboot of the Windows client. It essentially boils down to this:

c:\\windows\\system32\\shutdown.exe /r /t 15 /c .....

After the reboot completes, we sometimes find that the Puppet agent lock file is still present, and the agent process either does not start or aborts.

Error message:

Run of Puppet configuration client already in progress; skipping  (C:/ProgramData/PuppetLabs/puppet/var/state/agent_catalog_run.lock exists)

In the instance I'm looking at, a different, unrelated process now occupies the PID that the Puppet agent had before the reboot.
I suspect this is what prevents the automatic cleanup of the lock file, which would also explain why the bug is intermittent.
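
The suspicion above can be illustrated with a minimal sketch. It is not Puppet's actual code: the method name `lock_stale?` and the injected `process_alive` check are hypothetical, and the sketch assumes the lock file holds only a PID and that staleness is judged solely by whether some process with that PID is alive.

```ruby
# Hypothetical sketch; NOT the code in Puppet itself.
# Assumption: the lock file contains only the PID of the process that took it.

# process_alive is injected so the example runs on any platform; a real
# check would ask the operating system about the PID.
def lock_stale?(lock_contents, process_alive)
  pid = Integer(lock_contents.strip, 10)
  !process_alive.call(pid)
end

# Before the reboot, the agent ran as PID 1234 and wrote that to the lock file.
lock_contents = "1234\n"

# After the reboot, case 1: nothing uses PID 1234, so the lock looks stale
# and could be reclaimed.
nothing_running = ->(_pid) { false }
puts lock_stale?(lock_contents, nothing_running)

# After the reboot, case 2: an unrelated process happens to receive PID 1234.
# A PID-only check cannot tell it apart from the old agent, so the lock is kept.
pid_reused = ->(pid) { pid == 1234 }
puts lock_stale?(lock_contents, pid_reused)
```

Whether case 2 occurs depends on which PIDs the operating system hands out after boot, which is consistent with the failure appearing only some of the time.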

I am new to this project and I am not a Ruby coder, but I can have a go. I would appreciate validation of the bug and some pointers to where in the code base the lock file cleanup is done.

 
This message was sent by Atlassian JIRA (v6.3.10#6340-sha1:7ea293a)

Kylo Ginsberg (JIRA)

Jan 27, 2015, 11:44:49 AM
to puppe...@googlegroups.com
Kylo Ginsberg updated an issue
Change By: Kylo Ginsberg
Scrum Team: Windows

Kylo Ginsberg (JIRA)

Jan 27, 2015, 11:44:52 AM
to puppe...@googlegroups.com
Kylo Ginsberg assigned an issue to Ethan Brown
Change By: Kylo Ginsberg
Assignee: Ethan Brown (was: Kylo Ginsberg)

David (JIRA)

Jan 27, 2015, 12:04:56 PM
to puppe...@googlegroups.com
David commented on Bug PUP-3914
 
Re: Intermittent lock file left after reboot.

I killed the process that happened to be occupying the PID of the pre-reboot Puppet agent and restarted the Puppet agent service. After that, the agent worked as expected.

Josh Cooper (JIRA)

Jan 27, 2015, 1:04:53 PM
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-3914

David Chute, it looks like you are using a 15-second timeout, shorter than the default of 60 seconds. You will want to make sure that the timeout is longer than the time needed to apply the biggest catalog you expect to process, plus the time it takes the agent to send the report and the master to process it, potentially with multiple report processors. If your timeout is too short, the reboot can leave the pid file behind.

That said, the agent should be more robust in the face of failures. The code in lib/puppet/util/pidlock.rb handles locking and reclaiming of the pidfile. Note there are a few pidfile bugs filed in this area already, so you might want to check those out first.
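
As a rough illustration of what "locking and reclaiming" a pidfile can mean, here is a minimal sketch. It is not the code in lib/puppet/util/pidlock.rb: the class `SketchPidLock` and its methods are hypothetical, and the liveness probe relies on `Process.kill(0, pid)`, whose error behavior is assumed to follow POSIX conventions.

```ruby
require "tmpdir"
require "rbconfig"

# Hypothetical PID-file lock; NOT the code in lib/puppet/util/pidlock.rb.
class SketchPidLock
  def initialize(path)
    @path = path
  end

  # Returns true if this process now holds the lock, false if a live
  # process already holds it.
  def lock
    reclaim_if_stale
    # File::EXCL makes creation fail if the file already exists.
    File.open(@path, File::WRONLY | File::CREAT | File::EXCL) do |f|
      f.write(Process.pid.to_s)
    end
    true
  rescue Errno::EEXIST
    false
  end

  # PID stored in the lock file, or nil if the file is missing or unparsable.
  def recorded_pid
    Integer(File.read(@path).strip, 10)
  rescue Errno::ENOENT, ArgumentError
    nil
  end

  private

  # Delete the lock file when the recorded PID names no live process.
  def reclaim_if_stale
    return unless File.exist?(@path)
    pid = recorded_pid
    File.delete(@path) if pid.nil? || !alive?(pid)
  end

  def alive?(pid)
    Process.kill(0, pid) # signal 0 only probes; nothing is delivered
    true
  rescue Errno::ESRCH
    false                # no such process
  rescue Errno::EPERM
    true                 # a process exists, but we may not signal it
  end
end

Dir.mktmpdir do |dir|
  path = File.join(dir, "agent_catalog_run.lock")

  # Simulate a lock left behind by a process that has since exited.
  dead = Process.spawn(RbConfig.ruby, "-e", "exit")
  Process.wait(dead)
  File.write(path, dead.to_s)

  lock = SketchPidLock.new(path)
  puts lock.lock                          # the stale lock is reclaimed
  puts lock.recorded_pid == Process.pid   # the file now names this process
end
```

Note that this sketch still identifies the owner by PID alone, so it shares the PID-reuse weakness described in the report.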

Josh Cooper (JIRA)

Jan 27, 2015, 1:13:52 PM
to puppe...@googlegroups.com
Josh Cooper updated an issue
 
Change By: Josh Cooper
Fix Version/s: PUP 3.7.4
Fix Version/s: PUP 4.x

Ethan Brown (JIRA)

Apr 7, 2015, 2:56:50 PM
to puppe...@googlegroups.com
Ethan Brown updated an issue
Change By: Ethan Brown
Sprint: Windows 2015-05-06

Ethan Brown (JIRA)

Apr 7, 2015, 2:57:46 PM
to puppe...@googlegroups.com

Kenaz Kwa (JIRA)

Aug 29, 2016, 7:28:06 PM
to puppe...@googlegroups.com
Kenaz Kwa updated an issue
Change By: Kenaz Kwa
Team: Agent & Platform Support

Geoff Nichols (JIRA)

Oct 20, 2016, 1:45:05 PM
to puppe...@googlegroups.com
Geoff Nichols updated an issue
Change By: Geoff Nichols
Labels: needs_repro puppet-agent windows

Josh Cooper (JIRA)

Apr 6, 2017, 3:16:02 PM
to puppe...@googlegroups.com
Josh Cooper updated an issue
Change By: Josh Cooper
Fix Version/s: PUP 4.y
Fix Version/s: PUP 5.y

Sean McDonald (JIRA)

May 15, 2017, 7:47:05 PM
to puppe...@googlegroups.com
Sean McDonald updated an issue
Change By: Sean McDonald
Labels: needs_repro puppet-agent triaged windows

Moses Mendoza (JIRA)

May 18, 2017, 1:56:27 PM
to puppe...@googlegroups.com
Moses Mendoza updated an issue
Change By: Moses Mendoza
Labels: needs_repro puppet-agent triaged windows

Josh Cooper (JIRA)

Mar 16, 2018, 2:39:04 PM
to puppe...@googlegroups.com
Josh Cooper updated an issue
Change By: Josh Cooper
Sub-team: Coremunity

Josh Cooper (JIRA)

Mar 16, 2018, 2:41:02 PM
to puppe...@googlegroups.com
Josh Cooper updated an issue
Change By: Josh Cooper
Fix Version/s: PUP 5.y

Josh Cooper (JIRA)

Mar 16, 2018, 2:41:04 PM
to puppe...@googlegroups.com
Josh Cooper assigned an issue to Unassigned
Change By: Josh Cooper
Assignee: Ethan Brown

Josh Cooper (JIRA)

Jan 23, 2020, 1:30:04 AM
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-3914
 
Re: Intermittent lock file left after reboot.

In PUP-9247, puppet was modified to reclaim stale pidlocks, so this is no longer an issue, and I'll close this as a duplicate:

C:\>puppet config print agent_catalog_run_lockfile
C:/ProgramData/PuppetLabs/puppet/cache/state/agent_catalog_run.lock
 
C:\>echo 12345678 > C:/ProgramData/PuppetLabs/puppet/cache/state/agent_catalog_run.lock
 
C:\>type C:\ProgramData\PuppetLabs\puppet\cache\state\agent_catalog_run.lock
12345678
 
C:\>puppet agent -t
Info: Using configured environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
...
Info: Applying configuration version '1579760502'
Notice: Applied catalog in 10.13 seconds

There is an outstanding issue filed as PUP-10248 whereby if the lock file refers to a pid that is currently running as a more privileged user (as LocalSystem for example), then the foreground run will fail. Here pid 492 is winlogon.exe:

C:\>echo 492 > C:/ProgramData/PuppetLabs/puppet/cache/state/agent_catalog_run.lock
 
C:\>puppet agent -t
Error: Could not run Puppet configuration client: OpenProcess(2000, 0, 492):  Access is denied.
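
The failure above comes from probing a PID that exists but cannot be opened. The following sketch shows a POSIX analogue of that distinction, not the Windows code path and not the fix for PUP-10248: the helper names `probe` and `lock_action` are hypothetical, and it assumes `Process.kill(0, pid)` raises `Errno::ESRCH` for a missing process and `Errno::EPERM` for one the caller may not signal.

```ruby
require "rbconfig"

# Hypothetical helpers; a POSIX analogue only, NOT Puppet's Windows code.

# Classify a PID without treating "permission denied" as a fatal error.
def probe(pid)
  Process.kill(0, pid) # signal 0 only checks; nothing is delivered
  :running
rescue Errno::ESRCH
  :absent                # no process has this PID
rescue Errno::EPERM
  :running_inaccessible  # a process exists, but it belongs to someone else
end

# Decide what to do with a lock file that records the probed PID.
def lock_action(state)
  state == :absent ? :reclaim : :skip_run
end

# Our own PID is certainly running.
puts probe(Process.pid)

# A child that has exited and been waited for no longer exists.
dead = Process.spawn(RbConfig.ruby, "-e", "exit")
Process.wait(dead)
puts lock_action(probe(dead))
```

In this sketch, an inaccessible process leads to a skipped run rather than an unhandled error. Whether skipping is the right policy for the agent is a separate question that the sketch does not settle.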
