Jira (PUP-10218) Puppet incorrectly detecting stale pidfile


Marcin Deranek (JIRA)

Jan 7, 2020, 3:45:03 AM
to puppe...@googlegroups.com
Marcin Deranek created an issue
 
Puppet / Bug PUP-10218
Puppet incorrectly detecting stale pidfile
Issue Type: Bug
Affects Versions: PUP 6.11.1
Assignee: Unassigned
Attachments: pidlock.patch
Created: 2020/01/07 12:44 AM
Priority: Normal
Reporter: Marcin Deranek

Puppet Version: 6.11.1
Puppet Server Version: 6.11.1
OS Name/Version: CentOS 7

When the Puppet agent is terminated abnormally (e.g. killed with the KILL signal), it can fail to detect the stale PID file it leaves behind. The code in question is this:

puppet/lib/ruby/vendor_ruby/puppet/util/pidlock.rb

def clear_if_stale
  # `errors` is set earlier in the method (elided from this excerpt).
  begin
    Process.kill(0, lock_pid)
  rescue *errors
    return @lockfile.unlock
  end

  if Puppet.features.posix?
    procname = Puppet::Util::Execution.execute(["ps", "-p", lock_pid, "-o", "comm="]).strip
    args     = Puppet::Util::Execution.execute(["ps", "-p", lock_pid, "-o", "args="]).strip
    @lockfile.unlock unless procname =~ /ruby/ && args =~ /puppet/ || procname =~ /puppet(-.*)?$/
  elsif Puppet.features.microsoft_windows?
    # On Windows, we're checking if the filesystem path name of the running
    # process is our vendored ruby:
    exe_path = Puppet::Util::Windows::Process::get_process_image_name_by_pid(lock_pid)
    @lockfile.unlock unless exe_path =~ /\\bin\\ruby.exe$/
  end
end

Process.kill(0, pid) checks whether a process with the given PID exists. The problem is that Process.kill matches LightWeight Processes (LWPs, i.e. threads) as well as regular processes, so the check succeeds if either one currently exists with that ID. When it succeeds, Puppet then tries to look up the command name for that ID. Unfortunately, ps -p only considers processes and not LWPs, so if the stale file contains an ID that now belongs to an LWP, the ps call fails. Puppet will then never recover on its own (unless the stale lock file is removed by hand), because LWPs are usually spawned by long-running daemons and the ID therefore stays in use.

Please find attached a patch (tested on CentOS 7) which addresses the above issue: it makes sure the ps command also considers LWPs. Without it you might run into the error shown below.
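To illustrate the mismatch described above, here is a minimal Ruby sketch of a more defensive staleness check. It is not the attached patch and not Puppet's code: ps_field and stale_lock? are hypothetical helper names, and treating a failed ps lookup as "stale" is an assumption made for this example (the reasoning being that if ps -p cannot see the ID as a process, it cannot be the agent's own PID). The alive and field parameters exist only so the decision logic can be tried without real processes.

```ruby
require 'open3'

# Hypothetical helper: ask ps for one output field ("comm" or "args") of +pid+.
# Returns nil when ps exits non-zero (for example, the ID is not a process
# that `ps -p` can see) or when ps is not installed.
def ps_field(pid, field)
  out, _err, status = Open3.capture3('ps', '-p', pid.to_s, '-o', "#{field}=")
  status.success? ? out.strip : nil
rescue Errno::ENOENT
  nil
end

# Hypothetical helper: true when the lock for +pid+ should be treated as stale.
def stale_lock?(pid,
                alive: ->(p) { Process.kill(0, p) },
                field: ->(p, f) { ps_field(p, f) })
  begin
    alive.call(pid)
  rescue Errno::ESRCH
    return true # nothing with this ID exists at all
  end

  comm = field.call(pid, 'comm')
  args = field.call(pid, 'args')
  # Assumption: the ID exists (signal 0 succeeded) but ps cannot describe it,
  # so it is not a Puppet process and the lock can be cleared.
  return true if comm.nil? || args.nil?

  # Same name test as in the excerpt above.
  puppet = (comm =~ /ruby/ && args =~ /puppet/) || comm =~ /puppet(-.*)?$/
  !puppet
end
```

Calling stale_lock?(2181) with the defaults runs the real checks. The point of the sketch is only that a non-zero exit from ps is handled as a result instead of surfacing as an execution error.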

Desired Behavior:

Puppet agent starts up and runs correctly.

Actual Behavior:

# puppet agent --test
Error: Could not run Puppet configuration client: Execution of 'ps -p 2181 -o comm=' returned 1: 

This message was sent by Atlassian JIRA (v7.7.1#77002-sha1:e75ca93)

Josh Cooper (JIRA)

Jan 7, 2020, 9:50:04 AM
to puppe...@googlegroups.com
Josh Cooper updated an issue
Change By: Josh Cooper
Team: Night's Watch

Josh Cooper (JIRA)

Jan 7, 2020, 9:58:04 AM
to puppe...@googlegroups.com
Josh Cooper commented on Bug PUP-10218
 
Re: Puppet incorrectly detecting stale pidfile

Thanks for the patch, Marcin Deranek! A couple of notes. The code will need to account for POSIX platforms that don't support -q, e.g. Mac OS X reports that q is an illegal option. If you want to make a contribution to Puppet, could you submit a pull request? If not, that's fine too, we can make the fix ourselves.
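One way to handle the portability concern raised here is to choose the ps selector per platform. The following Ruby sketch is only an illustration under stated assumptions: ps_lookup_command is a hypothetical name, and the rule "use -q on Linux, -p everywhere else" assumes that the Linux ps in question accepts -q, which should be verified on each target platform.

```ruby
require 'rbconfig'

# Hypothetical helper: build the ps argv used to look up one field of +pid+.
# Assumption: only Linux gets `-q`; every other platform falls back to `-p`,
# which is the selector used in the existing code.
def ps_lookup_command(pid, field, host_os: RbConfig::CONFIG['host_os'])
  selector = host_os =~ /linux/i ? '-q' : '-p'
  ['ps', selector, pid.to_s, '-o', "#{field}="]
end
```

The host_os parameter defaults to the running platform and can be overridden, so the selection can be checked for other platforms without running on them.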

Mihai Buzgau (JIRA)

Jan 14, 2020, 10:05:05 AM
to puppe...@googlegroups.com
Mihai Buzgau updated an issue
 
Change By: Mihai Buzgau
Sprint: PR - Triage

Mihai Buzgau (JIRA)

Jan 22, 2020, 5:49:04 AM
to puppe...@googlegroups.com

Mihai Buzgau (JIRA)

Jan 22, 2020, 5:49:05 AM
to puppe...@googlegroups.com
Mihai Buzgau updated an issue
Change By: Mihai Buzgau
Sprint: PR NW - Triage 2020-02-05

Luchian Nemes (JIRA)

Jan 28, 2020, 3:29:05 AM
to puppe...@googlegroups.com
Luchian Nemes assigned an issue to Luchian Nemes
Change By: Luchian Nemes
Assignee: Luchian Nemes

Mihai Buzgau (JIRA)

Feb 5, 2020, 5:38:08 AM
to puppe...@googlegroups.com
Mihai Buzgau updated an issue
Change By: Mihai Buzgau
Sprint: NW - 2020-02-05 , NW - 2020-02-19

Ciprian Badescu (JIRA)

Feb 6, 2020, 8:09:04 AM
to puppe...@googlegroups.com
Ciprian Badescu commented on Bug PUP-10218
 
Re: Puppet incorrectly detecting stale pidfile

Marcin Deranek, can you provide us with the steps to reproduce the issue? How did you start the puppet process as an LWP?

Luchian Nemes (JIRA)

Feb 13, 2020, 3:54:04 AM
to puppe...@googlegroups.com
Luchian Nemes updated an issue
 
Change By: Luchian Nemes
Fix Version/s: PUP 6.13.0

Kate Medred (JIRA)

Feb 18, 2020, 12:07:08 PM
to puppe...@googlegroups.com
Kate Medred updated an issue
Change By: Kate Medred
Labels: resolved-issue-added

nobody (Jira)

May 19, 2021, 2:54:06 AM
to puppe...@googlegroups.com
nobody commented on Bug PUP-10218
 
Re: Puppet incorrectly detecting stale pidfile

I'm seeing the same problem with `puppet-agent-5.5.22-1.el7.x86_64`. Can someone reopen the issue?

This message was sent by Atlassian Jira (v8.13.2#813002-sha1:c495a97)

Ciprian Badescu (Jira)

May 19, 2021, 7:40:04 AM
to puppe...@googlegroups.com

nobody (Jira)

May 19, 2021, 11:25:02 AM
to puppe...@googlegroups.com
nobody commented on Bug PUP-10218

Sad to hear that. OK, understood, thanks for your time.
