core files

Michael Elizabeth Chastain

Dec 30, 1998, 3:00:00 AM

to linux-...@vger.rutgers.edu

You can implement smart core handling in user space.

You can write a wrapper program that uses PTRACE_CONT. If the child
gets a signal that it doesn't have a handler for, then the wrapper gains
control and does whatever you want.
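
A minimal sketch of such a wrapper, assuming Linux and glibc. The name
run_watched() is made up for illustration, and the "policy" is only a
message on stderr. Note that the tracer is woken for every signal headed
to the child, handled or not, so this version cannot yet tell the two
cases apart.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv under ptrace. Each signal headed for the child stops it and
 * wakes the wrapper first. Returns the last signal seen (0 if none),
 * or -1 on error. run_watched() is a hypothetical name, not an API. */
int run_watched(char *const argv[])
{
    int status, last_sig = 0, seen_exec = 0;
    pid_t child = fork();

    if (child < 0)
        return -1;
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);  /* ask to be traced */
        execvp(argv[0], argv);
        _exit(127);                             /* exec failed */
    }
    while (waitpid(child, &status, 0) == child) {
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return last_sig;                    /* child is gone */
        if (!WIFSTOPPED(status))
            continue;
        int sig = WSTOPSIG(status);
        if (!seen_exec && sig == SIGTRAP) {
            seen_exec = 1;                      /* stop caused by exec itself */
            sig = 0;                            /* do not deliver it */
        } else {
            last_sig = sig;
            fprintf(stderr, "child %d got signal %d\n", (int)child, sig);
            /* Policy goes here: dump memory, start a debugger, ... */
        }
        /* Resume the child and deliver the signal (0 means none). */
        ptrace(PTRACE_CONT, child, NULL, (void *)(long)sig);
    }
    return -1;
}
```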

To be correct, the wrapper needs to use PTRACE_SYSCALL instead,
and filter the syscall event stream for calls to signal(2) so that it
knows which signals the child is handling. There is enough information
available to do this conveniently.

Or you could hack the kernel to let the parent read the child's
sig->action array in some way, such as a PTRACE_GET_SIGACTION call.

You can even implement classical core dumps by reading /proc/$child/maps
and /proc/$child/mem, and then writing a file named "core".
Or "core-$ARGV[0]", or whatever you want.
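
A rough sketch of the maps-then-mem walk, assuming a Linux /proc and a
caller that is allowed to read the target's memory (for another process
that normally means it is being ptraced). dump_memory() is a made-up
helper, and the output is a flat image of the readable mappings, not an
ELF core file that a debugger could load.

```c
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Copy every readable mapping of `pid` into `out_path` as raw bytes.
 * Returns the number of bytes written, or -1 if a file cannot be opened. */
long long dump_memory(pid_t pid, const char *out_path)
{
    char path[64], line[512], buf[4096], perms[5];
    unsigned long start, end, addr;
    long long total = 0;
    FILE *maps, *out;
    int mem;

    snprintf(path, sizeof path, "/proc/%d/maps", (int)pid);
    maps = fopen(path, "r");
    snprintf(path, sizeof path, "/proc/%d/mem", (int)pid);
    mem = open(path, O_RDONLY);
    out = fopen(out_path, "wb");
    if (maps == NULL || mem < 0 || out == NULL) {
        if (maps) fclose(maps);
        if (mem >= 0) close(mem);
        if (out) fclose(out);
        return -1;
    }
    /* Each maps line starts with "start-end perms ...". */
    while (fgets(line, sizeof line, maps) != NULL) {
        if (sscanf(line, "%lx-%lx %4s", &start, &end, perms) != 3)
            continue;
        if (perms[0] != 'r')
            continue;                   /* mapping is not readable */
        for (addr = start; addr < end; ) {
            size_t want = end - addr < sizeof buf ? end - addr : sizeof buf;
            ssize_t n = pread(mem, buf, want, (off_t)addr);
            if (n <= 0)
                break;                  /* kernel refused this region */
            fwrite(buf, 1, (size_t)n, out);
            total += n;
            addr += (unsigned long)n;
        }
    }
    fclose(maps);
    close(mem);
    fclose(out);
    return total;
}
```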

I've written a lot of code like this. ptrace is a great tool.

Michael Elizabeth Chastain
<mailto:m...@shout.net>
"love without fear"

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majo...@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/

Chris Wedgwood

Dec 30, 1998, 3:00:00 AM

to Michael Elizabeth Chastain

On Tue, Dec 29, 1998 at 01:53:18PM -0600, Michael Elizabeth Chastain wrote:

> To be correct, the wrapper needs to use PTRACE_SYSCALL instead, and
> filter the syscall event stream for calls to signal(2) so that it
> knows which signals the child is handling. There is enough
> information available to do this conveniently.

The speed would suck rocks. Run squid or netscape under strace and
you'll see what I mean.

-cw

Michael Elizabeth Chastain

Dec 30, 1998, 3:00:00 AM

to ch...@cybernet.co.nz

Hi Chris,

> On a UP box it still needs to switch to the tracing context once (or
> more) per system call. Ouch.

Yes, it would. And you get a 100% user-space solution that works
on 2.0 and 2.2.

As I wrote in my first message, and you didn't bother addressing, you
could add one ptrace call to determine whether the child is about to
core dump on the signal it just received. Then you get a 98% user-space
solution with one simple policy-free change to the kernel, and that
solution runs child system calls at full speed.

Or you could add another flavor of PTRACE_CONT that runs full speed
through system calls and handled signals, but traps out on unhandled
signals. Then you get a 98% user-space solution that runs every child
process at full speed.

And look at this, the user-space program can easily try the ptrace
enhancements and fall back if they are not present, so you get a 'ucore'
that works on every kernel version and runs at full speed when it finds
kernel support.

Michael Elizabeth Chastain
<mailto:m...@shout.net>
"love without fear"

Michael Elizabeth Chastain

Dec 30, 1998, 3:00:00 AM

to j...@sunsite.ms.mff.cuni.cz

Hi Jakub,

Look at it this way:

Amount of kernel effort available:                  zero
Ability to invoke debugger with kmod-like trap:     none
Ability to invoke debugger with ptrace technology:  limited
Michael's ucore could replace kernel core dumping:  no

Amount of kernel effort available:                  minimal
Ability to invoke debugger with kmod-like trap:     questionable
Ability to invoke debugger with ptrace technology:  yes
Michael's ucore could replace kernel core dumping:  maybe

Amount of kernel effort available:                  extensive
Ability to invoke debugger with kmod-like trap:     yes
Ability to invoke debugger with ptrace technology:  yes
Michael's ucore could replace kernel core dumping:  maybe

For any given level of kernel effort, using ptrace technology gives
a more powerful result than a kmod-like trap. That's my point.

Albert D. Cahalan

Dec 31, 1998, 3:00:00 AM

to linux-...@vger.rutgers.edu

> Michael Elizabeth Chastain writes:

> Amount of kernel effort available:                  minimal
> Ability to invoke debugger with kmod-like trap:     questionable
> Ability to invoke debugger with ptrace technology:  yes
> Michael's ucore could replace kernel core dumping:  maybe
>
> Amount of kernel effort available:                  extensive
> Ability to invoke debugger with kmod-like trap:     yes
> Ability to invoke debugger with ptrace technology:  yes
> Michael's ucore could replace kernel core dumping:  maybe
>
> For any given level of kernel effort, using ptrace technology gives
> a more powerful result than a kmod-like trap. That's my point.

These aren't the only ways.

I think it would be good to let a crash dump handler open a device file
to get control. Before dumping core, the kernel checks for a crash dump
handler running with the same UID. If one is found, the dying process
sleeps and the crash dump handler is asked to respond.

The crash dump handler could supply a core dump name or start a
debugger to revive the process. Perhaps just leave the process sleeping
and dial tech support on a modem. This could be useful. :-)

Ptrace-like implementations suffer from process relationship problems
and signal handling issues. Kmod-like implementations suffer from the
need to start a complex application as root. It is much nicer to let
an existing user-defined process (GNOME?) select on a device file.

Michael Elizabeth Chastain

Dec 31, 1998, 3:00:00 AM

to acah...@cs.uml.edu, linux-...@vger.rutgers.edu

Hi Albert,

> These aren't the only ways.

That's probably true.

> I think it would be good to let a crash dump handler open a device file
> to get control. Before dumping core, the kernel checks for a crash dump
> handler running with the same UID. If one is found, the dying process
> sleeps and the crash dump handler is asked to respond.

That would remove process invocation from the kernel path.

You could do a lot of *this* in user space by having a library function
set the default value for all signals, or just SIGSEGV, to a function
that uses some kind of IPC, such as writing to a fifo whose name includes
the uid. I wonder if it's possible to use preloading to inject that
snippet of code into an existing executable without relinking it and
without touching libc.

Michael Elizabeth Chastain
<mailto:m...@shout.net>
"love without fear"

Albert D. Cahalan

Dec 31, 1998, 3:00:00 AM

to linux-...@vger.rutgers.edu

Jakub Jelinek writes:

> On the other side, if some kind of corename patch makes it into
> the kernel and the administrator chooses to put all core files e.g. in
> /tmp/cores/core.<pid>, then some ucore program can poll on /tmp/cores
> and once it notices a new file created there, just spawn a debugger
> or whatever the user wants (not that I'd like to do it on my machines).
> But such a thing should be implemented in userland; why bloat the kernel?

Spawn a debugger on the core file?

Oh no, you missed the whole point of this idea. Such a hack would not
need to be in the kernel, and would be nearly useless anyway.

Goal: start a debugger on a live process, with all the process
relationships intact and file descriptors still open. The process
should be as intact as if you had attached a debugger yourself
right before the crash.

The general plan is to let users have this feature effective for all
processes, not just ones expected to crash. That means zero overhead
is a requirement. The OS must pass standards compliance tests with the
feature operating. Nothing can interfere with process relationships
or signal handling.

There is a clear need for kernel support here.

Proposed implementation:

Provide debug info via a world-accessible character device.
On a per-UID basis, a crash monitor watches for trouble.
If something crashes:
1. dying process sleeps
2. message passed to crash monitor
3. crash monitor can say "dump /tmp/joe-1.core" or start ddd or ...
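
From the monitor's side, waiting on such a device could look like the
sketch below. The device does not exist; the descriptor, the helper name
wait_for_crash(), and the message format (one raw pid_t per crash) are
all assumptions made for illustration. Using select() is what would let
an existing process with its own event loop watch the device alongside
its other descriptors.

```c
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Wait up to timeout_ms for the (hypothetical) crash device to report
 * a pid. Returns 1 and fills *pid on success, 0 on timeout, and -1 on
 * error or a short read. */
int wait_for_crash(int fd, int timeout_ms, pid_t *pid)
{
    fd_set readable;
    struct timeval tv;
    int ready;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    FD_ZERO(&readable);
    FD_SET(fd, &readable);

    ready = select(fd + 1, &readable, NULL, NULL, &tv);
    if (ready <= 0)
        return ready;               /* 0 = timeout, -1 = error */

    /* Assumed message format: exactly one pid_t per crashed process. */
    if (read(fd, pid, sizeof *pid) != (ssize_t)sizeof *pid)
        return -1;
    return 1;
}
```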

The crash monitor is totally the user's choice. It could have
debugging ability, but most versions would rely on a separate
debugger. It could look up the program in an RPM or DEB database,
then offer to email an accurate bug report. It could offer to
unpack or download source code for you. It could offer to hide
the bug for a moment so that you might be able to save your work.

BTW, I think the plain core name patch belongs in 2.2.xx while
the rest is obviously a big 2.3.xx project.

Michael Shields

Dec 31, 1998, 3:00:00 AM

to Jakub Jelinek

In article <1998123010...@sunsite.mff.cuni.cz>,

Jakub Jelinek <j...@sunsite.mff.cuni.cz> wrote:
> On the other side, if some kind of corename patch makes it into the kernel
> and the administrator chooses to put all core files e.g. in
> /tmp/cores/core.<pid>, then some ucore program can poll on /tmp/cores and
> once it notices a new file created there, just spawn a debugger or whatever
> the user wants (not that I'd like to do it on my machines).

This doesn't buy you anything. You want to be able to start up the
debugger on the dying process, *before* state is lost (fds closed, &c.)
--
Shields.

Kenneth Albanowski

Jan 1, 1999, 3:00:00 AM

to Albert D. Cahalan

On Thu, 31 Dec 1998, Albert D. Cahalan wrote:

> Goal: start a debugger on a live process, with all the process
> relationships intact and file descriptors still open. The process
> should be as intact as if you had attached a debugger yourself
> right before the crash.
>
> The general plan is to let users have this feature effective for all
> processes, not just ones expected to crash. That means zero overhead
> is a requirement. The OS must pass standards compliance tests with the
> feature operating. Nothing can interfere with process relationships
> or signal handling.
>
> There is a clear need for kernel support here.

Thank you, yes, this states the points I wanted to make.

> Proposed implementation:
>
> Provide debug info via a world-accessible character device.
> On a per-UID basis, a crash monitor watches for trouble.
> If something crashes:
> 1. dying process sleeps
> 2. message passed to crash monitor
> 3. crash monitor can say "dump /tmp/joe-1.core" or start ddd or ...

I'd add a mechanism to ptrace to trigger an in-kernel core dump, perhaps
with a specified file name (to get back to that), or perhaps even add a
new core dump approach that would return the core file to a user buffer,
allowing the core file to be stored remotely. (I'm very interested in this
idea for an embedded design.)

Making the character device world accessible might not be quite right,
though, unless the kernel allows multiple connections, and automatically
gives each connected daemon from the one with the most authority (root
perms) to the least (user perms) a chance at taking the uid, in turn.

It's probably simpler to have a root daemon that uses /etc/debugconf,
~user/debugconf, etc., to figure out what processes get which treatment.

> The crash monitor is totally the user's choice. It could have
> debugging ability, but most versions would rely on a separate
> debugger. It could look up the program in an RPM or DEB database,
> then offer to email an accurate bug report. It could offer to
> unpack or download source code for you. It could offer to hide
> the bug for a moment so that you might be able to save your work.

Yes. Once the daemon gets the uid (and presumably has sufficient
permissions), it can be responsible for triggering any sort of debugger,
ranging from gdb or a core dump, to something resembling Dr. Watson, or
even to something resembling a gdb remote stub, allowing the process to be
debugged over a network. (The daemon could even keep just one process "in
limbo" like this -- if another crashes, it lets the first one die
completely, perhaps generating a core -- again, useful for an embedded
design.)

> BTW, I think the plain core name patch belongs in 2.2.xx while
> the rest is obviously a big 2.3.xx project.

Agreed on 2.3.x.

--
Kenneth Albanowski (kja...@kjahds.com, CIS: 70705,126)

Todd Larason

Jan 1, 1999, 3:00:00 AM

to Michael Elizabeth Chastain, acah...@cs.uml.edu, linux-...@vger.rutgers.edu

On 981230, Michael Elizabeth Chastain wrote:
> You could do a lot of *this* in user space by having a library function
> set the default value for all signals, or just SIGSEGV, to a function
> that uses some kind of IPC, such as writing to a fifo whose name includes
> the uid. I wonder if it's possible to use preloading to inject that
> snippet of code into an existing executable without relinking it and
> without touching libc.

Yes. The only hard part I see is getting the executable filename.
core.c:
---
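#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

void init_libcore(void) __attribute__((constructor));

/* Fifo path, looked up once at load time: getenv() is not
 * async-signal-safe, so the handler should not call it. */
static const char *fifo_path;

static void
sighandler(int signo)
{
    struct {
        int signo;
        pid_t pid;
    } s;
    int fd;

    s.signo = signo;
    s.pid = getpid();
    fd = open(fifo_path, O_WRONLY);
    if (fd >= 0) {
        write(fd, &s, sizeof s);
        close(fd);
        /* Every signal is blocked here (sa_mask), so this sleeps until
         * the listener attaches a debugger or kills the process. */
        pause();
    }
    /* If the fifo could not be opened we return; SA_RESETHAND has
     * restored the default action, so the fault repeats and kills us. */
}

void
init_libcore(void)
{
    struct sigaction act;

    fifo_path = getenv("SIG_FIFO");
    if (fifo_path == NULL)
        return;                 /* no fifo configured: leave SIGSEGV alone */

    act.sa_handler = sighandler;
    sigfillset(&act.sa_mask);   /* block all signals inside the handler */
    act.sa_flags = SA_RESETHAND;
    sigaction(SIGSEGV, &act, NULL);
}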
-----
% gcc -fPIC -o core.o -c core.c
% gcc -shared -o core.so core.o
% LD_PRELOAD=./core.so SIG_FIFO=$HOME/.crash_handler_fifo bash

A program to listen to the fifo should be straightforward.
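
One possible shape for that listener, as a sketch only. The struct
mirrors the one written by sighandler() in core.c above; the names
crash_report, read_crash_report() and listen_for_crashes() are invented
here, and the response to a report is just a printed line.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Must match the struct written by sighandler() in core.c. */
struct crash_report {
    int signo;
    pid_t pid;
};

/* Read one complete report from fd.
 * Returns 1 on success, 0 on end-of-file, error, or a truncated report. */
int read_crash_report(int fd, struct crash_report *r)
{
    char *p = (char *)r;
    size_t got = 0;

    while (got < sizeof *r) {
        ssize_t n = read(fd, p + got, sizeof *r - got);
        if (n <= 0)
            return 0;
        got += (size_t)n;
    }
    return 1;
}

/* Handle reports until every writer has closed the fifo. */
void listen_for_crashes(int fd)
{
    struct crash_report r;

    while (read_crash_report(fd, &r))
        printf("pid %d stopped on signal %d\n", (int)r.pid, r.signo);
}
```

The fifo itself would be created beforehand, for example with
"mkfifo $HOME/.crash_handler_fifo", and opened for reading by the
listener before being passed to listen_for_crashes().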
--
ICQ UIN: 56155810

Pavel Machek

Jan 4, 1999, 3:00:00 AM

to Jakub Jelinek

Ahoj! / Hi!

> On the other side, if some kind of corename patch makes it into the kernel
> and the administrator chooses to put all core files e.g. in
> /tmp/cores/core.<pid>, then some ucore program can poll on /tmp/cores and
> once it notices a new file created there, just spawn a debugger or whatever
> the user wants (not that I'd like to do it on my machines).
> But such a thing should be implemented in userland; why bloat the kernel?

There's another view possible: why bother having a core dumper in the
kernel when you can have a userland process doing it?

Imagine the kernel exec'ing /sbin/coredump-him <PID> <dumpable> when
someone needs to dump core. It is both more flexible than the current
approach, and probably would save some kernel space (confirmed, saves
4230 bytes of kernel size). Disadvantages are:

* it is slower
...which does not matter: core dumps are slow, anyway, and they
happen rarely

* it will not work without /proc mounted
...not too important, because you usually do not have / read-write when
/proc is not mounted, so you would not be able to dump core anyway.

[can not think of more]

I do not know how hard writing of userland coredumper is, mj says it
is possible and not that hard. I do not think this is important... I
may play with it on one cloudy day.

Pavel
--
I'm really pa...@atrey.karlin.mff.cuni.cz. Pavel
Look at http://atrey.karlin.mff.cuni.cz/~pavel/ ;-).

Kenneth Albanowski

Jan 4, 1999, 3:00:00 AM

to Pavel Machek

On Sat, 2 Jan 1999, Pavel Machek wrote:

> I do not know how hard writing of userland coredumper is, mj says it
> is possible and not that hard. I do not think this is important... I
> may play with it on one cloudy day.

I'd prefer to see it remain in-kernel, mainly because it may need to be
privy to knowledge that is most easily obtained in-kernel (if the process
load address is variable, for example), and because it can safely be
considered an atomic and robust process if it is in the kernel. (Yes, I
know, putting it in the kernel doesn't make it bug free. But we can keep
the existing code that _does_ seem to be bug free.)

The private-knowledge aspect isn't crucial: to set up a debugger, the same
information will need to be available through ptrace. But mostly I'd just
prefer to leave something alone if it isn't broken.

--
Kenneth Albanowski (kja...@kjahds.com, CIS: 70705,126)

Kenneth Albanowski

Jan 5, 1999, 3:00:00 AM

to Albert D. Cahalan

On Sun, 3 Jan 1999, Albert D. Cahalan wrote:

> Kenneth Albanowski writes:
>
> > Making the character device world accessible might not be quite right,
> > though, unless the kernel allows multiple connections, and automatically
> > gives each connected daemon from the one with the most authority (root
> > perms) to the least (user perms) a chance at taking the uid, in turn.
>

> World accessible with multiple connections is totally correct.
> Only an exact authority match is acceptable. If you run a setuid
> app and want to catch crashes, you need a setuid daemon to do it.

I'm not sure that degree of precision is needed. A daemon with the uid/gid
that the app was set to (as opposed to what it is running as) should be
sufficient. A setuid daemon would then work too, of course.

> Possible exception: root (well, CAP_DEBUG_ANY) could ask to accept
> anything left over. Users always get first chance.

Hmm. I'd like to say that root is root, and gets control first, and it
spirals in from there, with the user being last. Of course that isn't
quite right, as the user may very well be using this mechanism for
something (like persistence) that the kernel's handler (core-dump, let's
say) is inappropriate for. Maybe a spiral inwards and then back outwards,
with a presumption that the administrator is sane and put the core dumper
in at the end, instead of the beginning? And I'm not sure whether there
can be less/more security, or just root/user. Where do groups come in?

However, I'm not sure any of this matters much, at least to the in-kernel
side. See below.

> > It's probably simpler to have a root daemon that uses /etc/debugconf,
> > ~user/debugconf, etc., to figure out what processes get which treatment.
>

> Ugh. You get an authorization mess, with a user-space tool trying to
> verify permission. I don't think it is good for a root daemon to mess
> with user home directories. (not that it can always be avoided of course)

Agreed, the problem is that if we want a nice orderly chain between
different crash handlers (each possibly deciding to skip the process or
just record some data, then handing it off to the next in line), we
_don't_ want the kernel responsible for coping with this, because then the
kernel must make sure nobody is playing silly buggers and not passing
control around until next Tuesday, or something equally stupid. I'd much
rather have any such logic in user-space, meaning a root daemon that
spawns or communicates with minimum security processes. There's no problem
with setuid or any other tasks, as ptrace() has perfect security, in any
case.

I think the character device can consist solely of the kernel sending out
a pid to the reader. It automatically puts the task in stasis and promptly
forgets about it. The daemon takes responsibility of doing anything with
the task that it wants to. When everyone is done with it (to the daemon's
satisfaction), the task can be thawed and allowed to die by sending it any
signal. (If a core is needed, that's handled by a new ptrace sub-command,
which doesn't implicitly kill the task.)

Note that there is a specific contract here: if anyone is listening on the
device, they take the responsibility of thawing out and killing the tasks.
Otherwise the corpsicles will pile up. (They are limited by quota, and can
be removed by any signal, so they aren't a risk, just annoying.)

If we do want some level of multiple-reader support within the kernel,
perhaps you can do it by making the device a clearing-house of sorts. It
allows multiple readers, and sends the pid to the "first" reader (root or
user, whichever). If that reader decides to pass control on to the next
handler, it writes the pid back into the device. The driver makes sure the
pid corresponds to a frozen process, goes through the security checks, and
sends the pid on to the _next_ reader that deserves it. If there isn't
such a next reader, it thaws the process, letting it die (without dumping
core).

This approach won't require any lists to be stored, as the "next handler
in sequence" is computed anew each time based on who wrote in the request,
and the security details of the still-frozen process. However, there may
be some security issues involved with writing in a "guessed" pid and
causing race conditions to make the driver skip sending notification to a
handler. I'm not sure if this would be considered a significant problem.

Hmm. I guess one extra long per task would be enough to store the pid of
the last task that is the current "death handler", making it impossible to
trick the driver into skipping handlers.

Kenneth Albanowski

Jan 6, 1999, 3:00:00 AM

to Albert D. Cahalan

On Mon, 4 Jan 1999, Albert D. Cahalan wrote:

> Kenneth Albanowski writes:
> > On Sun, 3 Jan 1999, Albert D. Cahalan wrote:
> >>
> >> World accessible with multiple connections is totally correct.
> >> Only an exact authority match is acceptable. If you run a setuid
> >> app and want to catch crashes, you need a setuid daemon to do it.
> >
> > I'm not sure that degree of precision is needed. A daemon with the uid/gid
> > that the app was set to (as opposed to what it is running as) should be
> > sufficient. A setuid daemon would then work too, of course.
>

> Bear in mind that people are trying to lock down Linux with serious
> security, such as mandatory access control. The Coda filesystem
> developers seem to want each login to be isolated from every other.
>
> Perfect matches are very reliable. Anything less is likely to
> allow mistakes.

And imperfect matches usually involve kludges to avoid touching suid
programs and such... Yes, perfect matches make sense here. But do please
remember that /dev/crash's security is irrelevant if you can still attach
to a process with ptrace().

> I highly doubt that root will want to steal core dumps.
> For embedded systems, non-root simply won't run a crash handler.
> In any case, "chmod 600 /dev/crash" if you want to steal cores.
>
> It is more likely that root will be too lazy to run the daemon.
> I'd hate to rely on certain admins I know.

Agreed to all.

> That logic does not belong anywhere. It is overly complex.

Or rather, let the user do it if they insist, but don't even think about
it in the kernel. Yes, I agree, that was more complex than we need.

> This is simple:
>
> 1. Look for an exact security match.
> 2. If none, look for root.
> 3. If not found, dump core.
>
> Step 2 is really for setuid programs and servers that change UID.
> It is not intended to catch normal user processes. In fact, it
> should just dump a standard core if it catches one by accident.

Yes, this should be sufficient for the moment.
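
The three steps quoted above can be sketched as a small selection
function. This is only an illustration: pick_monitor() is a made-up name,
and "exact security match" is reduced to comparing UIDs, which is a
simplification of whatever credentials a real check would compare.

```c
#include <stddef.h>
#include <sys/types.h>

/* Choose which connected crash monitor should be told about a crash.
 * monitor_uids[i] is the UID of the i-th connected monitor.
 * Returns the index of the chosen monitor, or -1 meaning "dump core". */
int pick_monitor(uid_t crashed_uid, const uid_t *monitor_uids, size_t n)
{
    int root = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (monitor_uids[i] == crashed_uid)
            return (int)i;      /* step 1: exact match wins */
        if (monitor_uids[i] == 0 && root < 0)
            root = (int)i;      /* remember a root monitor for step 2 */
    }
    return root;                /* step 2 if found, else -1 for step 3 */
}
```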

D. Feuer

Jan 7, 1999, 3:00:00 AM

to Kenneth Albanowski

I've lost track of this thread, but I think that even when no userspace
dumper/debugger/tracer/whatever is running, it should be possible to
configure the kernel to write core.usr-bin-netscrape or
core.usr-bin-netscrape.23433 (with the pid). This was the original issue
brought up, and I think it is still valid. Even simple core dumps should
be given better names.
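
As an illustration of the naming scheme only, the sketch below builds
such a name from an executable path and a pid. core_name() is a
hypothetical helper and says nothing about how a kernel patch would
actually obtain the path.

```c
#include <stdio.h>
#include <sys/types.h>

/* Build "core.<exe path with '/' turned into '-'>.<pid>" in buf.
 * Returns 0 on success, -1 if buf is too small. */
int core_name(char *buf, size_t len, const char *exe_path, pid_t pid)
{
    size_t i;
    int n = snprintf(buf, len, "core.");

    if (n < 0 || (size_t)n >= len)
        return -1;
    i = (size_t)n;

    while (*exe_path == '/')
        exe_path++;                     /* drop leading slashes */
    for (; *exe_path != '\0' && i + 1 < len; exe_path++)
        buf[i++] = (*exe_path == '/') ? '-' : *exe_path;
    buf[i] = '\0';

    /* Append the pid; fails if the remaining space is too small. */
    n = snprintf(buf + i, len - i, ".%d", (int)pid);
    return (n < 0 || (size_t)n >= len - i) ? -1 : 0;
}
```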

David Feuer
dfe...@his.com

Raul Miller

Jan 8, 1999, 3:00:00 AM

to linux-...@vger.rutgers.edu

Albert D. Cahalan <acah...@cs.uml.edu> wrote:
> 1. Look for an exact security match.
> 2. If none, look for root.
> 3. If not found, dump core.
>
> Step 2 is really for setuid programs and servers that change UID.
> It is not intended to catch normal user processes. In fact, it
> should just dump a standard core if it catches one by accident.

It should also clear the setuid bit on the executable, and be very,
very careful to not step on anything when dumping core.

--
Raul