
Questions about SIGHUP behavior


Steffen Dettmer

Nov 12, 2013, 7:40:02 AM
Hi,

Debian 7.2 with /bin/bash as login shell (via /etc/passwd), shopt
huponexit off (as by default), bash run via SSH from other host.

When closing shell with CTRL-D, "sleep &" continues to run. I had
expected I had to use nohup, setsid, disown or a combination of them
in order to keep background jobs running after ending a shell session.

Did "shopt huponexit" use to be on in the past (or not exist, with bash
behaving as if it had been on), or do I remember incorrectly? I think
I remember correctly; otherwise the "nohup" tool would not be needed
and probably would not exist...

When closing shell with CTRL-D, "cat &" does not continue to run,
because it receives SIGTERM. I had expected it behaves like sleep. I
cannot test "nohup cat", because nohup also redirects inputs and thus
cat instantly terminates, but anyway nohup should not change behavior
of SIGTERM. Why are cat and sleep different? Note: in "strace sleep",
I don't see that file handles get closed, so it is not obviously
related to that, as I initially assumed.

When aborting the SSH session with "~.", "sleep &" continues to run. I had
expected that the sshd process sends a SIGHUP to the process session,
but ps showed that bash does not run in the same session as sshd, but
in its own session. strace shows that the bash process receives a
SIGHUP, but it does not send SIGHUP (but there is no "shopt huponhup
off"). As I understand the man page of bash, I had expected that bash
sends SIGHUP to all created process groups (i.e. my "sleep &"), unless
disown was used.

I tried to see who is sending the SIGHUP and ran strace on the parent
process sshd, but I saw no kill() invocation. To be sure, I even
ran strace on the parent's parent process (the "master" sshd process),
but also saw no kill (only the SIGCHLD handling, as expected). Where
does the SIGHUP received by "-bash" come from?

Best regards,
Steffen


--
To UNSUBSCRIBE, email to debian-us...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listm...@lists.debian.org
Archive: http://lists.debian.org/CAOBoUnNeGn9i-G0bzugORa_uL3QCW=VkwCtdYeW3...@mail.gmail.com

David F

Nov 12, 2013, 12:10:01 PM
On 11/12/2013 07:35 AM, Steffen Dettmer wrote:
> Debian 7.2 with /bin/bash as login shell (via /etc/passwd), shopt
> huponexit off (as by default), bash run via SSH from other host.
>
> When closing shell with CTRL-D, "sleep &" continues to run. I had
> expected I had to use nohup, setsid, disown or a combination of them
> in order to keep background jobs running after ending a shell session.

Short answer: I doubt that this ever worked as you think it did; if you're
using a shell with job control and run programs in the background, the shell
needs to deliver the HUP signal, which can happen in one of two ways in
bash: huponexit on; or SIGHUP delivered to bash.

Long answer:

SIGHUP is not necessarily sent by the shell to background processes when it
exits, but more often by the controlling tty's driver or line discipline,
which on most Unixes (Linux included) is sadly a morass of cruft with
multiple APIs that evolved separately and later merged.

Back in the bad old days, when one used a "real" terminal on RS232 and
turned off the terminal, or logged in via a modem connected to the system
via RS232 and "hung up" the phone, the DSR line would fall and both
foreground and background processes would get SIGHUP (hang up) from the tty
driver, because it was the "controlling tty" (a concept that still exists
today even though real terminals are almost extinct). Keep in mind, unless
one was using the C shell, this was before "job control".

Fast-forward to today: bash by default uses job control except when
executing a script, and in the case of SSH, a pseudo-tty is used to simulate
the "real" device and its driver [details at pty(7)].

If you have a read of setpgid(2) [also of interest tty_ioctl(4)], you'll see
that (basically) on hangup of the tty device a SIGHUP is delivered to the
"foreground process group of the controlling terminal". Without defining
the "foreground process group" too carefully, suffice it to say that
processes can be put in or out of it via system calls like setpgid(2) by the
shell, various "daemon starting" programs, or themselves. More important,
we can easily see which processes are in it by looking at the pgid and tpgid
columns of ps(1)'s output.
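
As a minimal sketch of that comparison (not part of the original post), the
following reads the PGID and TPGID of the current shell and compares them. It
assumes a ps that accepts the pgid and tpgid output keywords, as procps does;
the variable names are made up for illustration.

```shell
# Read PID, PGID and TPGID of the current shell; the empty "=" headers
# suppress the header line so the output is a single row of numbers.
read -r pid pgid tpgid <<EOF
$(ps -o pid= -o pgid= -o tpgid= -p $$)
EOF

if [ "$pgid" = "$tpgid" ]; then
    verdict="in the foreground process group of its tty"
else
    verdict="not in the foreground process group (or no controlling tty)"
fi
echo "pid=$pid pgid=$pgid tpgid=$tpgid: $verdict"
```

Run from a script without a controlling terminal, TPGID will typically not
match any PGID, so the second branch is the expected one there.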

For the final piece of the puzzle, check the relevant section of bash(1):
"The shell exits by default upon receipt of a SIGHUP. Before exiting, an
interactive shell resends the SIGHUP to all jobs [...] If the huponexit
shell option has been set with shopt, bash sends a SIGHUP to all jobs when
an interactive login shell exits." [An 'interactive' shell means
(basically) one that is running on a tty rather than reading a script from a
file.]
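
To check which way the option is set, something like the following sketch
should do. It queries a fresh, non-interactive bash so that the result does not
depend on the calling shell; the variable names are only illustrative, and it
assumes bash is on the PATH.

```shell
# Ask a fresh bash for the state of huponexit (shopt -q is silent and
# reports the state through its exit status).
default_state=$(bash -c 'shopt -q huponexit && echo on || echo off')

# The same query after enabling the option with shopt -s.
enabled_state=$(bash -c 'shopt -s huponexit; shopt -q huponexit && echo on || echo off')

echo "default: $default_state, after shopt -s huponexit: $enabled_state"
```

In an interactive login session, a plain "shopt huponexit" prints the current
state directly.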

A little investigation with ps will show why your sleep process didn't
receive a SIGHUP: when job control is enabled, bash moves background jobs
out of the foreground process group; they therefore won't receive a SIGHUP
from the tty driver, and since (a) you are exiting bash via EOF and (b) you
don't have huponexit set, bash doesn't send it to them. Note that had bash
exited due to receiving a SIGHUP *itself* (which would happen e.g. if sshd
died and released the pty), it would have delivered the SIGHUP to all of its
jobs, foreground and background, which is one reason why you want to use
commands like nohup, disown, etc. if you want to really be sure that your
background commands continue to run even after you logout.

The following session log demonstrates all of this. I use 'sleep 1h' and
'sleep 2h' to make clearer in the output of 'ps' which command was run by
'nohup' (but notice also the 'ignored' column).


~ % # ===>
~ % # ===> First, let's run some commands in the background with job-control enabled.
~ % # ===>
~ % ssh localhost
...
~ % sleep 1h &
[1] 32187
~ % nohup sleep 2h &
[2] 32204
~ % nohup: ignoring input and appending output to `nohup.out'

~ % ps -o pid,ppid,pgid,sess,tpgid,tty,ignored,args
PID PPID PGID SESS TPGID TT IGNORED COMMAND
31723 31722 31723 31723 32281 pts/21 00384004 -bash
32187 31723 32187 31723 32281 pts/21 00000000 sleep 1h
32204 31723 32204 31723 32281 pts/21 00000001 sleep 2h
32281 31723 32281 31723 32281 pts/21 00000000 ps -o pid,ppid,

~ % #Notice ^^^^^ that the jobs have different PGID's from bash and the TPGID.

~ % exit
logout
Connection to localhost closed.

~ % # Check that both 'sleep' processes are still running:
~ % ps -eo pid,ppid,pgid,sess,tpgid,tty,ignored,args |grep sleep |grep -v grep
32187 1 32187 31723 -1 ? 0000000000000000 sleep 1h
32204 1 32204 31723 -1 ? 0000000000000001 sleep 2h
~ %
~ % # Demonstrate the effects of nohup on one of them:
~ % kill -HUP 32187 32204
~ % ps -eo pid,ppid,pgid,sess,tpgid,tty,ignored,args |grep sleep |grep -v grep
32204 1 32204 31723 -1 ? 0000000000000001 sleep 2h
~ %
~ % # OK, now kill it too:
~ % kill -TERM 32204
~ % ps -eo pid,ppid,pgid,sess,tpgid,tty,ignored,args |grep sleep |grep -v grep
~ %


~ % # ===>
~ % # ===> Try the same thing again, with job control disabled.
~ % # ===>
~ % ssh localhost
...
~ % set +m
~ % sleep 1h &
[1] 677
~ % nohup sleep 2h &
[2] 706
~ % nohup: appending output to `nohup.out'

~ % ps -o pid,ppid,pgid,sess,tpgid,tty,ignored,args
PID PPID PGID SESS TPGID TT IGNORED COMMAND
677 32636 32636 32636 32636 pts/21 00000006 sleep 1h
706 32636 32636 32636 32636 pts/21 00000007 sleep 2h
765 32636 32636 32636 32636 pts/21 00000000 ps -o pid,ppid,
32636 32635 32636 32636 32636 pts/21 00384004 -bash

~ % # Notice ^^^^ that this time all processes' PGIDs are the same, and equal the TPGID.

~ % exit
logout
Connection to localhost closed.

~ % # Now only the nohup-protected process remains:
~ % ps -eo pid,ppid,pgid,sess,tpgid,tty,ignored,args |grep sleep |grep -v grep
706 1 32636 32636 -1 ? 0000000000000007 sleep 2h
~ % kill 706


~ % # ===>
~ % # ===> One more time, *with* job-control, but terminate bash via SIGHUP
~ % # ===> (simulating a turned-off terminal, lost network connection, etc.)
~ % # ===>
~ % ssh localhost
[...]
~ % sleep 1h &
[1] 4643
~ % nohup sleep 2h &
[2] 4644
~ % nohup: ignoring input and appending output to `nohup.out'

~ % ps -o pid,ppid,pgid,sess,tpgid,tty,ignored,args
PID PPID PGID SESS TPGID TT IGNORED COMMAND
4580 4579 4580 4580 4646 pts/21 0000000000384004 -bash
4643 4580 4643 4580 4646 pts/21 0000000000000000 sleep 1
4644 4580 4644 4580 4646 pts/21 0000000000000001 sleep 2
4646 4580 4646 4580 4646 pts/21 0000000000000000 ps -o p

~ % #Separate^^^^ pgid's this time: the tty won't deliver SIGHUP, it's up to the shell

~ % kill -HUP $$
Connection to localhost closed.
~ %
~ % # Non-nohup-protected process got SIGHUP; the other remains:
~ % ps -eo pid,ppid,pgid,sess,tpgid,tty,ignored,args |grep sleep |grep -v grep
4644 1 4644 4580 -1 ? 0000000000000001 sleep 2h
~ % kill 4644
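
The nohup part of the log can also be reproduced without a login session. The
following is a minimal sketch under stated assumptions, not the procedure used
above: it sends SIGHUP directly with kill rather than via a closing terminal,
it uses 30-second sleeps instead of 1h/2h, and the variable names are invented
for illustration.

```shell
# Start one plain and one nohup-protected background process.
sleep 30 &
plain=$!
nohup sleep 30 >/dev/null 2>&1 &
protected=$!

sleep 1                      # give nohup time to set SIGHUP to "ignore"
kill -s HUP "$plain" "$protected"

# Reap the plain sleep; death by signal N is reported as status 128+N.
plain_status=0
wait "$plain" || plain_status=$?

sleep 1                      # let any pending signal take effect
# kill -0 sends nothing; it only tests whether the process still exists.
if kill -0 "$protected" 2>/dev/null; then
    protected_alive=yes
else
    protected_alive=no
fi
echo "plain sleep status: $plain_status, nohup sleep alive: $protected_alive"

# Clean up the survivor.
kill -s TERM "$protected"
wait "$protected" 2>/dev/null || true
```

The plain sleep should be reported as killed by SIGHUP, while the one started
through nohup should still be alive until the final TERM.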


-- David



Steffen Dettmer

Nov 13, 2013, 6:50:02 AM
Hi,

thanks for your detailed answer.

On Tue, Nov 12, 2013 at 6:02 PM, David F <deb...@meta-dynamic.com> wrote:
> On 11/12/2013 07:35 AM, Steffen Dettmer wrote:
>>
>> Debian 7.2 with /bin/bash as login shell (via /etc/passwd), shopt
>> huponexit off (as by default), bash run via SSH from other host.
>>
>> When closing shell with CTRL-D, "sleep &" continues to run. I had
>> expected I had to use nohup, setsid, disown or a combination of them
>> in order to keep background jobs running after ending a shell session.
>
> Short answer: I doubt that this ever worked as you think it did;

(I think at least when using a physical modem it should have
worked; wasn't this what the signal was defined for?)

> Fast-forward to today: bash by default uses job control except when
> executing a script, and in the case of SSH, a pseudo-tty is used to simulate
> the "real" device and its driver [details at pty(7)].

Ahh yes, so this is the source of the SIGHUP delivered to the
shell, thanks for explaining.

> For the final piece of the puzzle, check the relevant section of bash(1):
> "The shell exits by default upon receipt of a SIGHUP. Before
> exiting, an interactive shell resends the SIGHUP to all jobs [...] If the
> huponexit shell option has been set with shopt, bash sends a SIGHUP to all
> jobs when an interactive login shell exits."

I did all my tests without changing huponexit, so I'd expect the
same behavior each time (I could understand bash either always or never
resending SIGHUP, depending on whether huponexit also applies to an
"exit caused by a signal").

> A little investigation with ps will show see why your sleep process didn't
> receive a SIGHUP: when job control is enabled, bash moves background jobs
> out of the foreground process group; they therefore won't receive a SIGHUP

But the shell process does receive a SIGHUP:

# ps -A -O sid,pgid|grep -e '\(sleep\|bash\)'
11702 11702 11702 S pts/0 00:00:00 -bash
11707 11702 11707 S pts/0 00:00:00 sleep 1h
root@NOMAD-BLN-R3200Xstd:~# strace -p 11702
Process 11702 attached - interrupt to quit
read(0, 0xbfa6975f, 1) = -1 EIO (Input/output error)
--- SIGHUP (Hangup) @ 0 (0) ---
--- SIGCONT (Continued) @ 0 (0) ---
[...]
kill(-11707, SIGHUP)
[...]
kill(11702, SIGHUP) = 0
sigreturn() = ? (mask now [])
--- SIGHUP (Hangup) @ 0 (0) ---
Process 11702 detached

In this test, it behaves like expected: bash gets SIGHUP and
resends to the background process group 11707 using
"kill(-11707, SIGHUP)".
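
The negative argument in that kill() call addresses a whole process group
rather than a single process. A minimal sketch of the same idea from the
command line, assuming the setsid tool from util-linux is installed (it is used
here only to get a process into its own group without job control; the
variable names are illustrative):

```shell
# setsid (util-linux, assumed available) puts sleep into a new session and
# process group whose ID equals the PID of the sleep process.
setsid sleep 30 &
leader=$!
sleep 1

pgid=$(ps -o pgid= -p "$leader" | tr -d ' ')

# A negative PID addresses the whole group; "--" stops option parsing so
# the leading minus sign is not taken for an option.
kill -s HUP -- "-$pgid"

group_status=0
wait "$leader" || group_status=$?
echo "pgid=$pgid leader=$leader wait status=$group_status"
```

A wait status of 128 plus the signal number indicates that the process was
terminated by that signal.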

I repeated this test (with and without strace) and noticed that
in all cases when using strace, the background sleep process
terminated along with the SSH session, while without
strace the sleep process continued to exist in most (!) cases,
but I also found a few cases where the sleep process did
terminate.

> Note that had bash
> exited due to receiving a SIGHUP *itself* (which would happen e.g. if sshd
> died and released the pty), it would have delivered the SIGHUP to all of its
> jobs, foreground and background, which is one reason why you want to use
> commands like nohup, disown, etc. if you want to really be sure that your
> background commands continue to run even after you logout.

This was my assumption, but my observation is different.
I see several cases where I ended SSH using ~., causing SIGHUP to
be delivered to bash, but a background process continued to
run.

> ~ % ssh localhost
[...]
> ~ % exit
> logout
> Connection to localhost closed.

I think this does not lead to a SIGHUP delivered to bash, I think
you have to end SSH session using "~." or maybe closing the xterm.

My test a bit more in detail.

Client (with prompt "$") opens SSH session to test system (with
prompt "#"). On this pts/0, observation is made using "ps".

On a second xterm on the Client a second SSH session is opened
(pts/1) and the command "sleep 1h&" is issued.

On the first xterm, the ps command is issued.
Result: sleep 1h process exists.
On the second xterm, the SSH session is disconnected using SSH
escape "~.".
On the first xterm, the second ps command is issued.
Result: no sleep 1h process anymore.

Repeat.

On the first xterm, the ps command is issued.
Result: sleep 1h process exists.
On the second xterm, the SSH session is disconnected using SSH
escape "~.".
On the first xterm, the second ps command is issued.
Result: sleep 1h process still exists.

Here is the transcript; the first xterm is indented by 18 spaces, I hope
that it reads better.


<second xterm> <first xterm>

$ ssh 172.22.9.1
#

$ ssh 172.22.9.1
# sleep 1h&
[1] 12147

# ps -A -O sid,pgid|grep -e '\(sleep\|-bash\)'
12132 12132 12132 S pts/0 00:00:00 -bash
12140 12140 12140 S pts/1 00:00:00 -bash
12147 12140 12147 S pts/1 00:00:00 sleep 1h
12149 12132 12148 S pts/0 00:00:00 grep -e \(sleep\|-bash\)

# Connection to 172.22.1.9 closed.
$

# ps -A -O sid,pgid|grep -e '\(sleep\|-bash\)'
12132 12132 12132 S pts/0 00:00:00 -bash
12151 12132 12150 S pts/0 00:00:00 grep -e \(sleep\|-bash\)

### (no sleep in ps)

### repeat



$ ssh 172.22.9.1
# sleep 1h&
[1] 12160

# ps -A -O sid,pgid|grep -e '\(sleep\|-bash\)'
12132 12132 12132 S pts/0 00:00:00 -bash
12156 12156 12156 S pts/1 00:00:00 -bash
12160 12156 12160 S pts/1 00:00:00 sleep 1h
12162 12132 12161 S pts/0 00:00:00 grep -e \(sleep\|-bash\)

# Connection to 172.22.1.9 closed.
$

# ps -A -O sid,pgid|grep -e '\(sleep\|-bash\)'
12132 12132 12132 S pts/0 00:00:00 -bash
12160 12156 12160 S ? 00:00:00 sleep 1h
12164 12132 12163 S pts/0 00:00:00 grep -e \(sleep\|-bash\)

### (sleep still in ps)

Maybe it is some race condition? Unfortunately, trying to use
"strace" to see what's happening influences the behavior.

Regards,
Steffen



David L. Craig

Nov 13, 2013, 8:30:03 AM
On 13Nov13:1240+0100, Steffen Dettmer wrote:

> thanks for your detailed answer.

Indeed, this is very good material to understand. As a minor
point in the interest of complete treatment, I add the nohup nohup
construct; e.g.,

( while : ; do sleep 60 ; echo awake `date` ; done &>/dev/null & )

which has the same effect as the nohup command. Try inserting
" trap 'echo HUP' HUP ; " before the 'while' and redirecting
to something like /tmp/$$.log instead of /dev/null to see if
it actually receives SIGHUP signals via kill -1 <PID of sh>
commands.
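
A minimal sketch of that experiment, simplified so that a script can find the
PID: the loop runs as a direct background subshell rather than inside an extra
pair of parentheses, the sleep is shortened to one second, and the log file
comes from mktemp instead of /tmp/$$.log. All of these are assumptions made
for illustration.

```shell
log=$(mktemp)

# Background subshell with a HUP trap. Each iteration is short because the
# shell runs a trap only after the command it is waiting for has finished.
(
    trap 'echo HUP' HUP
    while : ; do sleep 1 ; done
) >"$log" 2>&1 &
loop_pid=$!

sleep 1                   # let the subshell install its trap
kill -s HUP "$loop_pid"   # same signal as "kill -1 <PID>"
sleep 3                   # wait for the pending 'sleep 1' to finish

kill -s TERM "$loop_pid"
wait "$loop_pid" 2>/dev/null || true

received=$(cat "$log")
rm -f "$log"
echo "log contents: $received"
```

If the trap line is removed, the subshell has no handler for SIGHUP and the
kill simply terminates it, leaving the log empty.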
--
<not cent from sell>
May the LORD God bless you exceedingly abundantly!

Dave_Craig______________________________________________
"So the universe is not quite as you thought it was.
You'd better rearrange your beliefs, then.
Because you certainly can't rearrange the universe."
__--from_Nightfall_by_Asimov/Silverberg_________________

