sigqueue code paths

Charles Cui

Jul 19, 2016, 7:56:19 PM

Hi guys,

I spent some time investigating the sigqueue code path, which is used by
realtime signals, but I am confused about some of the logic.
Please comment if you have specific knowledge about this part.
I followed the call chain of sigqueue and am pretty confident about how
a signal is generated and put into a queue, but I am not sure how a signal is
consumed. Here are some details.
When we use sigqueue, NetBSD invokes
sys_sigqueueinfo -> kill1 -> kpsignal2 (assuming we just want to send the
signal to a process) -> sigput & sigpost.

Basically, sigput puts a signal into a queue specific to a process, and
sigpost changes the run state of a process and notifies the proper
process to handle the signal.
In terms of consuming a signal, I saw there is a function, sigclearall,
which takes all signals from the queue and prepares to consume them. I
searched for all calls to sigclearall, and it seems to be invoked when
executing the exec and exit functions. Those do not look like the right
callers of sigclearall when running a simple program like
https://github.com/ycui1984/posixtestsuite/blob/master/conformance/interfaces/sigqueue/9-1.c
I guess there are some routines that are called when a process is running
on a processor.
They may find that the running flag is set with pending signals and then
start to handle the signals, but
I cannot find the signal consumer code path. Any ideas on this?

Thanks Charles

Robert Elz

Jul 19, 2016, 10:49:59 PM

First, the basic (since forever) unix signal handling method, then some
NetBSD code pointers (and I'm sure you will find the rest).

Signal generation (causing a signal to be sent to a process) you have
largely found I believe - either a sys call (from the process to
receive the signal, or another) or some async event (interrupt from the
clock, or some device driver) decides a signal needs to be delivered to
a process, and arranges to queue it (it used to be just set a bit in the
signal pending word, but I'm sure it is more complex these days).

That code also takes care of doing nothing should the signal not be wanted
by the receiving process.

Next, note that if we're doing this, kernel code is obviously running,
so before any user code can execute again, the kernel needs to execute the
"return to user space" function(s). Part of that is selecting which
process runs next (if we have actually delivered a signal to a process it
will have been made runnable, as you noted, so that process is a candidate
for being the next - or one of the next on a multiprocessor - process to be
given the cpu).

One of the steps in returning to user space is to look and see if there are
any pending signals. If there are, rather than returning to where the
process was previously executing, an "interrupt" stack context is created
for the process, and it is set to resume at its signal handler - that step
is obviously highly machine (and emulation) dependent. Then when the
kernel returns to user space, the user process will run its signal handler.

In NetBSD, return to user space is (partly) lwp_userret() in kern/kern_lwp.c.
In there you will see...

        if ((l->l_flag & (LW_PENDSIG | LW_WCORE | LW_WEXIT)) ==
            LW_PENDSIG) {
                mutex_enter(p->p_lock);
                while ((sig = issignal(l)) != 0)
                        postsig(sig);
                mutex_exit(p->p_lock);
        }

That is checking if there is a pending signal and, if so, delivering it
to the process - you can follow postsig() to see how it gets down into
the arch/emul specific sig setup routine to actually set the environ for
the user process, and issignal() to see how the pending signals are
examined and one is selected to deliver first (but think carefully
about just what the while loop quoted above actually means...)
Both of those are in kern/kern_sig.c

kre

Edgar Fuß

Jul 20, 2016, 9:28:25 AM

> Next, note that if we're doing this, kernel code is obviously running

On one of the CPUs, yes.

> so before any user code can execute again

... on this CPU. What about the other CPUs? Do all of a process's LWPs run
on the same CPU?

Robert Elz

Jul 20, 2016, 12:16:58 PM

Date: Wed, 20 Jul 2016 15:28:13 +0200
From: Edgar Fuß <e...@math.uni-bonn.de>
Message-ID: <20160720132...@trav.math.uni-bonn.de>

| > so before any user code can execute again
| ... on this CPU. What about the other CPUs? Do all of a process's LWPs run
| on the same CPU?

First, I am certainly no expert, or even particularly knowledgeable on
threading, or lwps, or anything multi-processor related, so hopefully
someone who is will confirm or correct, but ...

CPUs aren't really what is important here, what matters is that the
signal gets delivered to the process. Now it is certainly possible
that the target process is running on a different CPU than the one
which is delivering the signal - as I understand it, that is handled
by forcing the process to (effectively) enter the kernel so it has
to exit back to user space again, and when that happens it collects the
signal. The mechanism to make that happen I will leave for someone
else to provide details of.

As I understand it, different LWPs can run on different CPUs, but again,
I'm not sure that is really relevant - most signals (all?) have processes
as a target, not LWPs. I believe signals are delivered to just one LWP.
How that one is selected/controlled I will leave for someone else...

kre

Charles Cui

Jul 21, 2016, 2:38:30 AM

Thanks, Robert, for sharing your understanding.
In conclusion, you think the signals are consumed
at the time of context switch. I will keep this information
in mind and see how all functions are connected.

Thanks Charles

Robert Elz

Jul 21, 2016, 3:38:41 AM

Date: Wed, 20 Jul 2016 21:39:46 -0700
From: Charles Cui <charles...@gmail.com>
Message-ID: <CA+SXE9voYGG-kmEOvXwOcKHb...@mail.gmail.com>

| In conclusion, you think the signals are consumed
| at the time of context switch.

No, not just a context switch (that is when one process stops running - on a
cpu - and another replaces it) - but any time that a user process resumes
running in user mode (like when a system call completes, or a device interrupt
is finished). Context switches are just one of the possibilities.

kre