incorrect use of pidfile(3)

31 views
Skip to first unread message

Dag-Erling Smørgrav

unread,
Oct 13, 2011, 6:54:38 AM10/13/11
to cur...@freebsd.org, p...@freebsd.org
I looked at some of the programs that use pidfile(3) in base, and they
pretty much all get it wrong. Consider these two scenarios:

1) common case

process A process B

main()
pidfile_open() -> success
perform_initialization()
daemon()
pidfile_write() -> success
perform_work() main()
pidfile_open() -> EEXIST
exit()

2) very unlikely but still possible case

process A process B

main()
pidfile_open() -> success main()
perform_initialization() pidfile_open() -> EAGAIN
daemon() perform_initialization()
pidfile_write() -> success daemon()
perform_work() perform_work()

The problem is that most of them (at least the ones I checked) ignore a
pidfile_open() failure unless errno == EEXIST.

How do we fix this? My suggestion is to loop until pidfile_open()
succeeds or errno != EAGAIN. Does anyone have any objections to that
approach?

DES
--
Dag-Erling Smørgrav - d...@des.no
_______________________________________________
freebsd...@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-curre...@freebsd.org"

Pawel Jakub Dawidek

unread,
Oct 13, 2011, 7:30:26 AM10/13/11
to Dag-Erling Smørgrav, cur...@freebsd.org

I think we already do that internally in pidfile_open(). Can you take a look at
the source and confirm that this is what you mean?

--
Pawel Jakub Dawidek http://www.wheelsystems.com
FreeBSD committer http://www.FreeBSD.org
Am I Evil? Yes, I Am! http://yomoli.com

Dag-Erling Smørgrav

unread,
Oct 13, 2011, 7:51:26 AM10/13/11
to Pawel Jakub Dawidek, cur...@freebsd.org
Pawel Jakub Dawidek <p...@FreeBSD.org> writes:

> Dag-Erling Smørgrav <d...@des.no> writes:
> > How do we fix this? My suggestion is to loop until pidfile_open()
> > succeeds or errno != EAGAIN. Does anyone have any objections to that
> > approach?
> I think we already do that internally in pidfile_open(). Can you take a look at
> the source and confirm that this is what you mean?

No, it doesn't; pidfile_open(3) returns NULL with errno == EAGAIN if the
pidfile is locked but empty, as is the case in the window between a
successful pidfile_open(3) and the first pidfile_write(3). This is
documented in the man page:

[EAGAIN] Some process already holds the lock on the given pid‐
file, but the file is truncated. Most likely, the
existing daemon is writing new PID into the file.

I have a patch that adds a pidfile to dhclient(8), where I do this:

for (;;) {
pidfile = pidfile_open(path_dhclient_pidfile, 0600, &otherpid);
if (pidfile != NULL || errno != EAGAIN)
break;
sleep(1);
}
if (pidfile == NULL) {
if (errno == EEXIST)
error("dhclient already running, pid: %d.", otherpid);
warning("Cannot open or create pidfile: %m");
}

I'm not sure I agree with the common idiom (which I copied here) of
ignoring all other errors than EEXIST, but that's a different story.

Carlos A. M. dos Santos

unread,
Oct 13, 2011, 8:21:11 AM10/13/11
to Dag-Erling Smørgrav, Pawel Jakub Dawidek, cur...@freebsd.org
2011/10/13 Dag-Erling Smørgrav <d...@des.no>:

You are also ignoring the return value of sleep(1), which would tell
you if the call was interrupted by a signal handler. This can be fine
for dhclient(8) but other utilities might require some guards against
such interruptions.

--
"The flames are all long gone, but the pain lingers on"

Dag-Erling Smørgrav

unread,
Oct 13, 2011, 8:54:16 AM10/13/11
to cur...@freebsd.org, p...@freebsd.org
After discussing this with pjd@ on IRC, I arrived at the attached patch,
which increases the length of time pidfile_open() itself waits (I hadn't
noticed that it already looped) and sets *pidptr to -1 if it fails to read
a pid.
pidfile_open-loop.diff

Pawel Jakub Dawidek

unread,
Oct 13, 2011, 9:48:42 AM10/13/11
to Dag-Erling Smørgrav, cur...@freebsd.org
On Thu, Oct 13, 2011 at 02:54:16PM +0200, Dag-Erling Smørgrav wrote:
> After discussing this with pjd@ on IRC, I arrived at the attached patch,
> which increases the length of time pidfile_open() itself waits (I hadn't
> noticed that it already looped) and sets *pidptr to -1 if it fails to read
> a pid.

I'm still in opinion that EWOULDBLOCK and EAGAIN (which is the same
value on FreeBSD) should be converted to EEXIST on pidfile_open()
return.

Also if we now have for loop, why not to put count in there?

I'm not very happy about touching pidptr in case of error other than
EEXIST. This is not documented, but a bit unexpected anyway.

I'm happy with increasing the timeout.

> Index: lib/libutil/pidfile.c
> ===================================================================
> --- lib/libutil/pidfile.c (revision 226271)
> +++ lib/libutil/pidfile.c (working copy)
> @@ -118,22 +118,19 @@
> */
> fd = flopen(pfh->pf_path,
> O_WRONLY | O_CREAT | O_TRUNC | O_NONBLOCK, mode);
> - if (fd == -1) {
> - count = 0;
> + if (fd == -1 && errno == EWOULDBLOCK && pidptr != NULL) {
> + *pidptr = -1;
> + count = 20;
> rqtp.tv_sec = 0;
> rqtp.tv_nsec = 5000000;
> - if (errno == EWOULDBLOCK && pidptr != NULL) {
> - again:
> + for (;;) {
> errno = pidfile_read(pfh->pf_path, pidptr);
> - if (errno == 0)
> - errno = EEXIST;
> - else if (errno == EAGAIN) {
> - if (++count <= 3) {
> - nanosleep(&rqtp, 0);
> - goto again;
> - }
> - }
> + if (errno != EAGAIN || --count == 0)
> + break;
> + nanosleep(&rqtp, 0);
> }
> + if (errno == 0)
> + errno = EEXIST;
> free(pfh);
> return (NULL);

Dag-Erling Smørgrav

unread,
Oct 13, 2011, 10:11:40 AM10/13/11
to Pawel Jakub Dawidek, cur...@freebsd.org
Pawel Jakub Dawidek <p...@FreeBSD.org> writes:
> I'm still in opinion that EWOULDBLOCK and EAGAIN (which is the same
> value on FreeBSD) should be converted to EEXIST on pidfile_open()
> return.

The historical (and documented) behavior is to return EAGAIN.

> Also if we now have for loop, why not to put count in there?

Because if we do, there will be a nanosleep after the last
pidfile_read() attempt. We need to break the loop after pidfile_read()
failed but before nanosleep().

> I'm not very happy about touching pidptr in case of error other than
> EEXIST. This is not documented, but a bit unexpected anyway.

Well, it was your idea, I just moved it to before the loop :)

DES
--
Dag-Erling Smørgrav - d...@des.no

Jos Backus

unread,
Oct 13, 2011, 10:16:01 PM10/13/11
to cur...@freebsd.org
Why not import daemontools? It's public domain these days. Pidfiles are a
hacky mess. UNIX already has a way to track processes which avoids all these
issues, with very little overhead.

Jos

Pawel Jakub Dawidek

unread,
Oct 14, 2011, 6:05:22 AM10/14/11
to Dag-Erling Smørgrav, cur...@freebsd.org
On Thu, Oct 13, 2011 at 04:11:40PM +0200, Dag-Erling Smørgrav wrote:
> Pawel Jakub Dawidek <p...@FreeBSD.org> writes:
> > I'm still in opinion that EWOULDBLOCK and EAGAIN (which is the same
> > value on FreeBSD) should be converted to EEXIST on pidfile_open()
> > return.
>
> The historical (and documented) behavior is to return EAGAIN.

We don't want to duplicate the code of handling EAGAIN into every single
pidfile(3) consumer. This is why we hav pidfile(3) API in the first
place - to make it easy for people to use.

> > Also if we now have for loop, why not to put count in there?
>
> Because if we do, there will be a nanosleep after the last
> pidfile_read() attempt. We need to break the loop after pidfile_read()
> failed but before nanosleep().

Right, ok.

> > I'm not very happy about touching pidptr in case of error other than
> > EEXIST. This is not documented, but a bit unexpected anyway.
>
> Well, it was your idea, I just moved it to before the loop :)

In my patch *pidptr was set to -1 only in the case of EAGAIN from
pidfile_read(), not for every other error.

BTW. With your patch we will continue even when flopen(3) failed for
other reasons, instead of returning NULL. Checking for fd being -1
should not be done in the same statement with other checks.

After proposed changes it would look like this, what do you think?

http://people.freebsd.org/~pjd/patches/pidfile.3.patch

Dag-Erling Smørgrav

unread,
Oct 14, 2011, 8:44:01 AM10/14/11
to Pawel Jakub Dawidek, cur...@freebsd.org
Pawel Jakub Dawidek <p...@FreeBSD.org> writes:
> After proposed changes it would look like this, what do you think?
>
> http://people.freebsd.org/~pjd/patches/pidfile.3.patch

Looks OK to me, but you should also remove the paragraph about EAGAIN in
the man page.

Reply all
Reply to author
Forward
0 new messages