linux-audit: reconstruct path names from syscall events?

John Feuerstein

Sep 16, 2011, 8:20:01 PM

Hi,

I would like to audit all changes to a directory tree using the linux
auditing system[1].

# auditctl -a exit,always -F dir=/etc/ -F perm=wa

It seems like the GNU coreutils are enough to break the audit trail.

The resulting SYSCALL events provide CWD and multiple PATH records,
depending on the syscall. If one of the PATH records is relative, I can
reconstruct the absolute path using the CWD record.
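
A minimal sketch of that CWD-based reconstruction, assuming the cwd= and
name= values have already been extracted from the raw records and
unquoted. The function name audit_join_cwd is hypothetical (it is not
part of any audit library), and no normalisation of "." or ".." is
attempted:

```c
#include <stdio.h>
#include <string.h>

/*
 * Join the cwd= value of a CWD record with the name= value of a PATH
 * record. Absolute names are copied unchanged.
 * Returns 0 on success, -1 if the result does not fit in out.
 */
int audit_join_cwd(const char *cwd, const char *name, char *out, size_t size)
{
    int n;

    if (name[0] == '/')
        n = snprintf(out, size, "%s", name);      /* already absolute */
    else if (strcmp(cwd, "/") == 0)
        n = snprintf(out, size, "/%s", name);     /* avoid a double slash */
    else
        n = snprintf(out, size, "%s/%s", cwd, name);

    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With cwd="/tmp" and name="dir" this yields "/tmp/dir". It only helps
when the syscall really resolved the name against the working directory.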

However, that does not work for the whole *at syscall family
(unlinkat(2), renameat(2), linkat(2), ...), which accept paths relative to
a given directory file descriptor. GNU coreutils are prominent users;
for example, "rm -r" uses unlinkat(2) to prevent races.

Things like dup(2) and fd passing via unix domain sockets come to mind.
It's the same old story again: mapping fds to path names is ambiguous at
best, if not impossible.

I wonder why such incomplete file system auditing rules are considered
sufficient in the CAPP/LSPP/NISPOM/STIG rulesets?

Here's a simplified example:

$ cd /tmp
$ mkdir dir
$ touch dir/file
$ ls -ldi /tmp /tmp/dir /tmp/dir/file
2057 drwxrwxrwt 9 root root 380 Sep 17 00:02 /tmp
58781 drwxr-xr-x 2 john john 40 Sep 17 00:02 /tmp/dir
56228 -rw-r--r-- 1 john john 0 Sep 17 00:02 /tmp/dir/file
$ cat > unlinkat.c
#include <unistd.h>
#include <fcntl.h>

int main(int argc, char **argv)
{
    int dirfd = open("dir", O_RDONLY);
    unlinkat(dirfd, "file", 0);
    return 0;
}
^D
$ make unlinkat
cc unlinkat.c -o unlinkat
$ sudo autrace ./unlinkat
Waiting to execute: ./unlinkat
Cleaning up...
Trace complete. You can locate the records with 'ausearch -i -p 32121'
$ ls -li dir
total 0

Now, looking at the resulting raw SYSCALL event for unlinkat(2):

type=SYSCALL msg=audit(1316210542.899:779): arch=c000003e syscall=263 success=yes exit=0 a0=3 a1=400690 a2=0 a3=0 items=2 ppid=32106 pid=32121 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts12 ses=36 comm="unlinkat" exe="/tmp/unlinkat" key=(null)
type=CWD msg=audit(1316210542.899:779): cwd="/tmp"
type=PATH msg=audit(1316210542.899:779): item=0 name="/tmp" inode=58781 dev=00:0e mode=040755 ouid=1000 ogid=1000 rdev=00:00
type=PATH msg=audit(1316210542.899:779): item=1 name="file" inode=56228 dev=00:0e mode=0100644 ouid=1000 ogid=1000 rdev=00:00
type=EOE msg=audit(1316210542.899:779):

- From this event alone, there's no way to answer "Who unlinked
/tmp/dir/file?". For what it's worth, the provided path names would be
exactly the same if we had unlinked "/tmp/dir/dir/dir/dir/dir/file".

- PATH item 0 reports the inode of "/tmp/dir" (58781, see ls output
above), however, the reported path name is "/tmp" (bug?).

In this example I've used autrace, which traces everything, so I could
possibly search for a previous open(2) of inode 58781. And indeed, there
it is:

type=SYSCALL msg=audit(1316210542.899:778): arch=c000003e syscall=2 success=yes exit=3 a0=40068c a1=0 a2=7fff22724fc8 a3=0 items=1 ppid=32106 pid=32121 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts12 ses=36 comm="unlinkat" exe="/tmp/unlinkat" key=(null)
type=CWD msg=audit(1316210542.899:778): cwd="/tmp"
type=PATH msg=audit(1316210542.899:778): item=0 name="dir" inode=58781 dev=00:0e mode=040755 ouid=1000 ogid=1000 rdev=00:00
type=EOE msg=audit(1316210542.899:778):

Great, so inode 58781 was opened using "/tmp/dir", and therefore, the relative
path "file" given to unlinkat(2) above could possibly translate to
"/tmp/dir/file"... not really feeling confident here.

- All file system auditing rules in various rulesets and the examples in
the documentation add the "-F perm=wa" (or similar) filter, so the
open(2) wouldn't even make it into the audit trail.

- If you can handle the volume and log all open(2), what happens if the
open(2) was done hours, days, weeks, ... ago?

- What if the open(2) was done by another process which passed the fd
on a unix domain socket?
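
The lookup described above (find the earlier open(2), then join its
path with the relative name) might be sketched as follows. This is only
an illustration. It assumes every open(2) was logged and that the fd was
never dup'ed, passed between processes, closed, or reused, which the
points above show cannot be relied on. The names fd_table,
fd_table_note_open and fd_table_resolve are made up for this example.

```c
#include <stdio.h>
#include <string.h>

#define FD_TABLE_MAX 64
#define FD_PATH_MAX  256

struct fd_entry {
    int  pid;
    int  fd;
    char path[FD_PATH_MAX];
};

struct fd_table {
    struct fd_entry e[FD_TABLE_MAX];
    int used;
};

static struct fd_entry *fd_table_find(struct fd_table *t, int pid, int fd)
{
    int i;

    for (i = 0; i < t->used; i++)
        if (t->e[i].pid == pid && t->e[i].fd == fd)
            return &t->e[i];
    return NULL;
}

/*
 * Remember that pid obtained descriptor fd for path. In an open(2)
 * event, exit= is the fd and CWD plus the PATH name= give the path.
 * Returns 0 on success, -1 if the table is full or the path too long.
 */
int fd_table_note_open(struct fd_table *t, int pid, int fd, const char *path)
{
    struct fd_entry *slot = fd_table_find(t, pid, fd);

    if (strlen(path) >= FD_PATH_MAX)
        return -1;
    if (!slot) {
        if (t->used == FD_TABLE_MAX)
            return -1;
        slot = &t->e[t->used++];
    }
    slot->pid = pid;
    slot->fd = fd;
    strcpy(slot->path, path);
    return 0;
}

/*
 * Best-effort guess at the absolute path of name relative to dirfd.
 * Returns 0 on success, -1 if no open(2) for (pid, dirfd) was seen
 * or the result does not fit in out.
 */
int fd_table_resolve(struct fd_table *t, int pid, int dirfd,
                     const char *name, char *out, size_t size)
{
    const struct fd_entry *e = fd_table_find(t, pid, dirfd);
    int n;

    if (!e)
        return -1;
    if (strcmp(e->path, "/") == 0)
        n = snprintf(out, size, "/%s", name);
    else
        n = snprintf(out, size, "%s/%s", e->path, name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

Fed with the two events from the example (pid 32121, open(2) returning
fd 3 for "/tmp/dir", then unlinkat(2) with a0=3 and name "file"), the
guess would be "/tmp/dir/file". It remains a guess.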

It looks like the kernel auditing code should provide

... item=0 name="/tmp/dir" inode=58781 ...

in the unlinkat(2) syscall event above. Looking up the unlinkat(2)
documentation:

int unlinkat(int dirfd, const char *pathname, int flags);

If the pathname given in pathname is relative, then it is
interpreted relative to the directory referred to by the file
descriptor dirfd (rather than relative to the current working
directory of the calling process, as is done by unlink(2) and
rmdir(2) for a relative pathname).

If the pathname given in pathname is relative and dirfd is the
special value AT_FDCWD, then pathname is interpreted relative
to the current working directory of the calling process (like
unlink(2) and rmdir(2)).

As you might see, there's not only the fd->pathname problem, but
also the special case for AT_FDCWD. In this case the kernel side should
probably just duplicate CWD's path name into item 0's path name. But
that's just unlinkat(2), there are a lot more.
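
To make the three cases explicit, here is a small sketch of how a log
consumer might classify an unlinkat(2) event from the a0= field of the
SYSCALL record and the name= field of the PATH record. The enum and
function names are hypothetical. The sketch also assumes the dirfd is
in a0, which holds for unlinkat(2) but not for every *at call
(renameat(2), for instance, takes a second dirfd in a2).

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>      /* AT_FDCWD */
#include <stdint.h>
#include <stdlib.h>

enum at_base {
    AT_BASE_ABSOLUTE,   /* name is absolute, dirfd is ignored */
    AT_BASE_CWD,        /* dirfd is AT_FDCWD, so the CWD record applies */
    AT_BASE_DIRFD       /* relative to an open directory not named here */
};

/*
 * a0_hex: value of a0= from the SYSCALL record (hex, no 0x prefix).
 * name:   value of name= from the PATH record of the target.
 */
enum at_base classify_at_base(const char *a0_hex, const char *name)
{
    unsigned long long a0 = strtoull(a0_hex, NULL, 16);

    if (name[0] == '/')
        return AT_BASE_ABSOLUTE;
    /* Compare the low 32 bits only: the logged register value may or
     * may not have the upper bits sign-extended. */
    if ((uint32_t)a0 == (uint32_t)AT_FDCWD)
        return AT_BASE_CWD;
    return AT_BASE_DIRFD;
}
```

For the event above (a0=3, name="file") this returns AT_BASE_DIRFD,
which is exactly the case where the event alone does not say which
directory was meant.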

What am I missing here? Is there no way to audit a directory tree?
I've looked at alternatives: Inotify watches won't scale to big trees
and events lack so much detail that they can't be used for auditing.
Fanotify, while providing the pid, still lacks a lot of events and
passes fds; the example code relies on readlink("/proc/self/fd/...").
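
For reference, the readlink("/proc/self/fd/...") approach looks roughly
like this. It is a sketch, not the fanotify example code itself, and the
helper name fd_to_path is made up. It needs /proc mounted, and the
result is only the name the kernel reports at the moment of the call, so
it can be stale or carry a " (deleted)" suffix.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>      /* open(), for callers */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Ask the kernel which path it currently associates with fd.
 * Returns the length of the path, or -1 on error.
 */
ssize_t fd_to_path(int fd, char *buf, size_t size)
{
    char link[64];
    ssize_t n;

    if (size == 0)
        return -1;
    snprintf(link, sizeof link, "/proc/self/fd/%d", fd);
    n = readlink(link, buf, size - 1);
    if (n < 0)
        return -1;
    buf[n] = '\0';      /* readlink(2) does not NUL-terminate */
    return n;
}
```

This only works from inside the process that holds the fd (or via
/proc/<pid>/fd with sufficient privileges while that process is alive),
so it is of no use when reading an audit log after the fact.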

Thanks,
John

[1] http://people.redhat.com/sgrubb/audit/

--
John Feuerstein <jo...@feurix.com>

Mark Moseley

Oct 9, 2012, 7:10:02 PM

After posting something similar to linux-audit the other day, I
rechecked the archives and found this. I'm hitting the same issue.
I've seen the same thing happen with fchmodat(2) and fchownat(2) (and
those are even uglier since you don't have the chain of inodes in the
audit logs like you do with unlinkat). I'd mistakenly thought it was
due to recursion, but that's clearly not the case (the binaries in
question must just be using the *at() versions of the syscalls for
recursive calls). Did you find any sort of workaround for this? I
didn't see a relevant-looking patch in the more recent LKML archives,
though it could've been titled something I didn't interpret as
relevant.

If you see my recent linux-audit posting, another related thing (at
least as far as missing relevant information in the logs) is that the
audit logs are logging pathnames relative to the chroot, instead of
the pathnames relative to the root of the OS itself. You'd expect a
process chroot'd to /chroot, accessing (from the perspective of the
OS) /chroot/etc/password would get logged as /chroot/etc/password but
is rather logged as /etc/password.

I don't have a working LXC install handy, but I'd imagine the audit
subsystem would log relative to the container's / instead of the
host's / too.

Al Viro

Oct 9, 2012, 7:30:01 PM

On Tue, Oct 09, 2012 at 04:09:18PM -0700, Mark Moseley wrote:
> On Fri, Sep 16, 2011 at 5:12 PM, John Feuerstein <jo...@feurix.com> wrote:
> > Hi,
> >
> > I would like to audit all changes to a directory tree using the linux
> > auditing system[1].
> >
> > # auditctl -a exit,always -F dir=/etc/ -F perm=wa
> >
> > It seems like the GNU coreutils are enough to break the audit trail.
> >
> > The resulting SYSCALL events provide CWD and multiple PATH records,
> > depending on the syscall. If one of the PATH records is relative, I can
> > reconstruct the absolute path using the CWD record.
> >
> > However, that does not work for the whole *at syscall family
> > (unlinkat(2), renameat(2), linkat(2), ...), which accept paths relative to
> > a given directory file descriptor. GNU coreutils are prominent users;
> > for example, "rm -r" uses unlinkat(2) to prevent races.
> >
> > Things like dup(2) and fd passing via unix domain sockets come to mind.
> > It's the same old story again: mapping fds to path names is ambiguous at
> > best, if not impossible.

Your point being? Even if you do get all pathnames, you *can't* reconstruct
the changes of filesystem tree, period. Pathname resolution is not atomic.
Can't be made such, either - not without serializing all system calls, which
will hurt too damn much.

You can tell when something happens to filesystem *object*. Which audit,
lousy as it is, allows to do. Anything that hopes to reconstruct the
history of changes based on fully timestamped history of syscalls is
inherently unreliable.

Again, pathname resolution is not atomic at all and neither is reconstructing
pathname by object (i.e. by vfsmount/dentry pair).
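
One way to read the "object" point against the records earlier in the
thread: key events by the dev= and inode= fields of the PATH record
rather than by name=. The sketch below assumes the raw record layout
shown in this thread (dev= as hexadecimal major:minor, inode= as
decimal). The struct and function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

struct obj_id {
    unsigned int       dev_major;
    unsigned int       dev_minor;
    unsigned long long inode;
};

/*
 * Extract inode= and dev= from one raw PATH record.
 * The leading space in the search strings keeps " dev=" from
 * matching inside " rdev=". Returns 0 on success, -1 otherwise.
 */
int path_record_obj_id(const char *rec, struct obj_id *id)
{
    const char *ino = strstr(rec, " inode=");
    const char *dev = strstr(rec, " dev=");

    if (!ino || !dev)
        return -1;
    if (sscanf(ino, " inode=%llu", &id->inode) != 1)
        return -1;
    if (sscanf(dev, " dev=%x:%x", &id->dev_major, &id->dev_minor) != 2)
        return -1;
    return 0;
}

/* Two records refer to the same object if device and inode agree. */
int obj_id_equal(const struct obj_id *a, const struct obj_id *b)
{
    return a->dev_major == b->dev_major &&
           a->dev_minor == b->dev_minor &&
           a->inode == b->inode;
}
```

Applied to the unlinkat(2) example, item 0 of event 779 (name="/tmp")
and item 0 of event 778 (name="dir") compare equal, both being inode
58781 on dev 00:0e. Inode numbers can be reused after deletion, so even
this identifies an object only for as long as it exists.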

Al Viro

Oct 9, 2012, 7:40:02 PM

On Tue, Oct 09, 2012 at 04:09:18PM -0700, Mark Moseley wrote:

> If you see my recent linux-audit posting, another related thing (at
> least as far as missing relevant information in the logs) is that the
> audit logs are logging pathnames relative to the chroot, instead of
> the pathnames relative to the root of the OS itself. You'd expect a
> process chroot'd to /chroot, accessing (from the perspective of the
> OS) /chroot/etc/password would get logged as /chroot/etc/password but
> is rather logged as /etc/password.
>
> I don't have a working LXC install handy, but I'd imagine the audit
> subsystem would log relative to the container's / instead of the
> host's / too.

BTW, what makes you think that container's root is even reachable from
"the host's /"? There is no such thing as "root of the OS itself"; different
processes can (and in case of containers definitely do) run in different
namespaces. With entirely different filesystems mounted in those, and
no promise whatsoever that any specific namespace happens to have all
filesystems mounted somewhere in it...

Mark Moseley

Oct 9, 2012, 7:50:01 PM

On Tue, Oct 9, 2012 at 4:39 PM, Al Viro <vi...@zeniv.linux.org.uk> wrote:
> On Tue, Oct 09, 2012 at 04:09:18PM -0700, Mark Moseley wrote:
>
>> If you see my recent linux-audit posting, another related thing (at
>> least as far as missing relevant information in the logs) is that the
>> audit logs are logging pathnames relative to the chroot, instead of
>> the pathnames relative to the root of the OS itself. You'd expect a
>> process chroot'd to /chroot, accessing (from the perspective of the
>> OS) /chroot/etc/password would get logged as /chroot/etc/password but
>> is rather logged as /etc/password.
>>
>> I don't have a working LXC install handy, but I'd imagine the audit
>> subsystem would log relative to the container's / instead of the
>> host's / too.
>
> BTW, what makes you think that container's root is even reachable from
> "the host's /"? There is no such thing as "root of the OS itself"; different
> processes can (and in case of containers definitely do) run in different
> namespaces. With entirely different filesystems mounted in those, and
> no promise whatsoever that any specific namespace happens to have all
> filesystems mounted somewhere in it...

Nothing beyond guesswork, since it's been a while since I've played
with LXC. In any case, I was struggling a bit for the correct
terminology.

Am I similarly off-base with regards to the chroot'd scenario?

Al Viro

Oct 9, 2012, 8:00:03 PM

On Tue, Oct 09, 2012 at 04:47:17PM -0700, Mark Moseley wrote:
> > BTW, what makes you think that container's root is even reachable from
> > "the host's /"? There is no such thing as "root of the OS itself"; different
> > processes can (and in case of containers definitely do) run in different
> > namespaces. With entirely different filesystems mounted in those, and
> > no promise whatsoever that any specific namespace happens to have all
> > filesystems mounted somewhere in it...
>
> Nothing beyond guesswork, since it's been a while since I've played
> with LXC. In any case, I was struggling a bit for the correct
> terminology.
>
> Am I similarly off-base with regards to the chroot'd scenario?

chroot case is going to be reachable from namespace root, but I seriously
doubt that pathname relative to that will be more useful...

Again, relying on pathnames for forensics (or security in general) is
a serious mistake (cue unprintable comments about apparmor and similar
varieties of snake oil). And using audit as poor man's ktrace analog
is... misguided, to put it very mildly.

Mark Moseley

Oct 10, 2012, 6:50:02 PM

On Tue, Oct 9, 2012 at 4:54 PM, Al Viro <vi...@zeniv.linux.org.uk> wrote:
> On Tue, Oct 09, 2012 at 04:47:17PM -0700, Mark Moseley wrote:
>> > BTW, what makes you think that container's root is even reachable from
>> > "the host's /"? There is no such thing as "root of the OS itself"; different
>> > processes can (and in case of containers definitely do) run in different
>> > namespaces. With entirely different filesystems mounted in those, and
>> > no promise whatsoever that any specific namespace happens to have all
>> > filesystems mounted somewhere in it...
>>
>> Nothing beyond guesswork, since it's been a while since I've played
>> with LXC. In any case, I was struggling a bit for the correct
>> terminology.
>>
>> Am I similarly off-base with regards to the chroot'd scenario?
>
> chroot case is going to be reachable from namespace root, but I seriously
> doubt that pathname relative to that will be more useful...

Possibly not, but it'd still be good to have some sort of indicator
that this entry is being logged relative to the chroot, like an
additional item in the audit entry or even some kind of flag. But in
this case, and far more so in the unlinkat/fchmodat/fchownat case, I'd
think the least surprising thing (to me, at least) would be for the
directory item in the audit entry to have a pathname relative to
namespace root.

> Again, relying on pathnames for forensics (or security in general) is
> a serious mistake (cue unprintable comments about apparmor and similar
> varieties of snake oil). And using audit as poor man's ktrace analog
> is... misguided, to put it very mildly.

Caveat: I'm just a sysadmin, so this stuff is as darn near "magic" as
I get to see on a regular basis, so it's safe to expect some naivety
and/or misguidedness on my part :)

I'm just using it as a log of files that have been written/changed on
moderately- to heavily-used systems. If there's another in-kernel
mechanism that'd be better suited for that sort of thing (at least
without adding a lot of overhead), I'd be definitely eager to know
about it. It's a web hosting environment, with customer files all
solely on NFS, so writes to the same directory can come from an
arbitrary number of servers. When they get swamped with write
requests, the amount of per-client stats exposed by our Netapp and
Oracle NFS servers is often only enough to point us at a client server
with an abusive user on it (but not much more, without turning on
debugging). Having logs of who's doing writes would be quite useful,
esp when writes aren't happening at that exact moment and wouldn't
show up in tools like iotop. The audit subsystem seemed like the best
fit for this kind of thing, but I'm more than open to whatever works.

Mark Moseley

Oct 29, 2012, 9:20:02 PM

On Thu, Oct 11, 2012 at 10:27 AM, Mark Moseley <mosel...@gmail.com> wrote:
> On Wed, Oct 10, 2012 at 4:07 PM, Mark Moseley <mosel...@gmail.com> wrote:
>> On Wed, Oct 10, 2012 at 4:00 PM, Steve Grubb <sgr...@redhat.com> wrote:

>>> The audit system is the best fit. But I think Al is saying there are some
>>> limitations. I know that Eric pushed some patches a while back that make a
>>> stronger effort at collecting some of this information. What kernel are you
>>> using?

Would you happen to have a pointer to those patches? I've been surfing
the archives and not gotten lucky yet with finding the applicable
patchset.

>> Yup, understood. I've been playing with a variety of boxes, but mostly
>> within the 3.0.x and 3.2.x series. I'll drop 3.5.6 on some of these
>> boxes and see if my issues are already fixed (and proceed directly to
>> foot-in-mouth chagrined stage -- usually takes slightly longer to get
>> to that stage).
>
> Just gave 3.5.6 a shot and in these two particular cases, the result
> is the same: chroot'd actions are logged in the audit entry relative
> to the chroot, and the unlinkat/fchmodat/fchownat audit log entries only
> have one item with the bare filename and no indication of directory.

renameat seems to be the toughest of all of them (where
unlinkat/fchmodat/fchownat give you a hint in another audit entry). This
is doing a renameat(), from /home/moseley/tmp/tmp/renameat/1/a1 to
/home/moseley/tmp/tmp/renameat/2/a2

type=SYSCALL msg=audit(1351557710.520:74211): arch=c000003e syscall=264 success=yes exit=0 a0=3 a1=40075c a2=4 a3=400759 items=4 ppid=22742 pid=15181 auid=0 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts17 ses=1727 comm="rename" exe="/tmp/rename" key=(null)
type=CWD msg=audit(1351557710.520:74211): cwd="/tmp"
type=PATH msg=audit(1351557710.520:74211): item=0 name="/tmp" inode=2367550 dev=08:02 mode=040775 ouid=1000 ogid=1000 rdev=00:00
type=PATH msg=audit(1351557710.520:74211): item=1 name="/tmp" inode=2367551 dev=08:02 mode=040775 ouid=1000 ogid=1000 rdev=00:00
type=PATH msg=audit(1351557710.520:74211): item=2 name="a1" inode=2367552 dev=08:02 mode=0100664 ouid=1000 ogid=1000 rdev=00:00
type=PATH msg=audit(1351557710.520:74211): item=3 name="a2" inode=2367552 dev=08:02 mode=0100664 ouid=1000 ogid=1000 rdev=00:00

Anything else I could/should be trying? I'm more than willing to
experiment. I just always assume I'm missing some key flag or
something.

Here's the simple example code ... and, yes, I *do* know how to use
variables, just didn't bother here ;)

#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <dirent.h>

int main() {
    DIR *a;
    DIR *b;

    char* dir1 = "/home/moseley/tmp/tmp/renameat/1";
    char* dir2 = "/home/moseley/tmp/tmp/renameat/2";

    a = opendir( dir1 );
    b = opendir( dir2 );

    int afd = dirfd( a );
    int bfd = dirfd( b );

    renameat( afd, "a1", bfd, "a2" );