
Bug#1040373: bookworm lsof does not show DELeted files on a filesystem anymore


Michael Tokarev

Jul 5, 2023, 1:50:04 AM
Package: lsof
Version: 4.95.0-1
Severity: normal

Previously, I was able to see open but deleted files on a given filesystem
(for example, after upgrading a library, with old .so files still open by
older processes) by doing this:

lsof / | grep DEL

This gave output like this (after recent libX11 update):

Xwayland 1998 2007 Xwayland: mjt DEL REG 0,23 339489 /usr/lib/x86_64-linux-gnu/libX11-xcb.so.1.0.0
Xwayland 1998 2007 Xwayland: mjt DEL REG 0,23 339461 /usr/lib/x86_64-linux-gnu/libX11.so.6.4.0

But with bookworm lsof this no longer works: `lsof /' does not find these
DEL files anymore. It only works without the filesystem argument, e.g.

lsof | grep DEL

but obviously it shows far more than I wanted it to show, and works
*dramatically* slower.

It looks like something changed within lsof in the filesystem matching
code.

/mjt

-- System Information:
Debian Release: 12.0
APT prefers stable-security
APT policy: (990, 'stable-security'), (990, 'stable'), (500, 'oldstable-security'), (500, 'oldstable-debug'), (500, 'oldoldstable'), (500, 'oldstable'), (100, 'testing'), (50, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386, x32

Kernel: Linux 6.1.0-9-amd64 (SMP w/16 CPU threads; PREEMPT)
Kernel taint flags: TAINT_WARN
Locale: LANG=ru_RU.utf8, LC_CTYPE=ru_RU.utf8 (charmap=UTF-8), LANGUAGE not set
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled

Versions of packages lsof depends on:
ii libc6 2.36-9
ii libselinux1 3.4-1+b6
ii libtirpc3 1.3.3+ds-1

lsof recommends no packages.

lsof suggests no packages.

-- no debconf information

Andres Salomon

Jul 5, 2023, 2:40:04 AM

On Wed, Jul 5 2023 at 08:42:23 AM +03:00:00, Michael Tokarev <m...@tls.msk.ru> wrote:
> Package: lsof
> Version: 4.95.0-1
> Severity: normal
>
> Previously, I was able to see open but deleted files on a given filesystem
> (for example, after upgrading a library, with old .so files still open by
> older processes) by doing this:
>
> lsof / | grep DEL
>
> This gave output like this (after recent libX11 update):
>
> Xwayland 1998 2007 Xwayland: mjt DEL REG 0,23 339489 /usr/lib/x86_64-linux-gnu/libX11-xcb.so.1.0.0
> Xwayland 1998 2007 Xwayland: mjt DEL REG 0,23 339461 /usr/lib/x86_64-linux-gnu/libX11.so.6.4.0
>
> But with bookworm lsof, it does not work, `lsof /' is not finding these
> DEL files anymore. It only works without the filesystem argument, eg
>
> lsof | grep DEL
>
> but obviously it shows far more than I wanted it to show, and works
> *dramatically* slower.
>
> It looks like something changed within lsof in the filesystem matching


That's odd, it still works for me on bookworm:

dilinger@5310:~$ lsof / |grep DEL|grep chromiu
chromium 146607 dilinger DEL REG 254,1 54575015 /home/dilinger/.config/chromium/BrowserMetrics/BrowserMetrics-64A3BB92-23CAF.pma
chromium 146607 dilinger DEL REG 254,1 54623911 /home/dilinger/.config/dconf/user
dilinger@5310:~$


What does your /proc/mounts look like? Maybe there's something different with how your / is mounted? Or, maybe lsof lacks permission to access something? Does it work if you run lsof / as root?  Running it under strace could also provide a hint.

Michael Tokarev

Jul 6, 2023, 5:30:04 AM
05.07.2023 09:26, Andres Salomon wrote:
>
>
> On Wed, Jul 5 2023 at 08:42:23 AM +03:00:00, Michael Tokarev <m...@tls.msk.ru> wrote:
>> Package: lsof
>> Version: 4.95.0-1
>> Severity: normal
>>
>> Previously, I was able to see open but deleted files on a given filesystem
>> (for example, after upgrading a library, with old .so files still open by
>> older processes) by doing this:
>>
>> lsof / | grep DEL
>>
>> This gave output like this (after recent libX11 update):
>>
>> Xwayland 1998 2007 Xwayland: mjt DEL REG 0,23 339489 /usr/lib/x86_64-linux-gnu/libX11-xcb.so.1.0.0
>> Xwayland 1998 2007 Xwayland: mjt DEL REG 0,23 339461 /usr/lib/x86_64-linux-gnu/libX11.so.6.4.0
>>
>> But with bookworm lsof, it does not work, `lsof /' is not finding these
>> DEL files anymore. It only works without the filesystem argument, eg
>>
>> lsof | grep DEL
>>
>> but obviously it shows far more than I wanted it to show, and works
>> *dramatically* slower.
>>
>> It looks like something changed within lsof in the filesystem matching
>
>
> That's odd, it still works for me on bookworm:
>
> dilinger@5310:~$ lsof / |grep DEL|grep chromiu
> chromium 146607 dilinger DEL REG 254,1 54575015 /home/dilinger/.config/chromium/BrowserMetrics/BrowserMetrics-64A3BB92-23CAF.pma
> chromium 146607 dilinger DEL REG 254,1 54623911 /home/dilinger/.config/dconf/user
> dilinger@5310:~$

Hm. Interesting. And odd, indeed.

> What does your /proc/mounts look like? Maybe there's something different with how your / is mounted? Or, maybe lsof lacks permission to access
> something? Does it work if you run lsof / as root?  Running it under strace could also provide a hint.

I did some more tests. The problem isn't lsof itself but something else, since
lsof from bullseye shows exactly the same behavior.

For example:

# lsof -p 1 | grep /lib/
systemd 1 root txt REG 0,24 92544 111995 /usr/lib/systemd/systemd
systemd 1 root mem REG 0,22 111995 /usr/lib/systemd/systemd (path dev=0,24)
systemd 1 root mem REG 0,22 118104 /usr/lib/x86_64-linux-gnu/libpcre2-8.so.0.11.2 (path dev=0,24)
systemd 1 root mem REG 0,22 22227 /usr/lib/x86_64-linux-gnu/libm.so.6 (path dev=0,24)
systemd 1 root mem REG 0,22 147312 /usr/lib/x86_64-linux-gnu/libcrypto.so.3 (path dev=0,24)
systemd 1 root mem REG 0,22 124846 /usr/lib/x86_64-linux-gnu/libgpg-error.so.0.33.1 (path dev=0,24)
systemd 1 root mem REG 0,22 117583 /usr/lib/x86_64-linux-gnu/libcap-ng.so.0.0.0 (path dev=0,24)
systemd 1 root mem REG 0,22 22103 /usr/lib/x86_64-linux-gnu/liblzma.so.5.4.1 (path dev=0,24)
...

# lsof / | grep ' 1 '
systemd 1 root cwd DIR 0,24 184 256 /
systemd 1 root rtd DIR 0,24 184 256 /
systemd 1 root txt REG 0,24 92544 111995 /usr/lib/systemd/systemd


(digging more..)

It looks like this is btrfs-specific. Note the pathnames above: each "mem" line ends with a "(path dev=0,24)" annotation.
That annotation is not shown when the root fs is on ext4:

anothersystem# lsof -p 1 | grep /lib/
systemd 1 root txt REG 9,1 92544 21522 /usr/lib/systemd/systemd
systemd 1 root mem REG 9,1 157768 438 /usr/lib/x86_64-linux-gnu/libgpg-error.so.0.33.1
systemd 1 root mem REG 9,1 629384 7184 /usr/lib/x86_64-linux-gnu/libpcre2-8.so.0.11.2
systemd 1 root mem REG 9,1 907784 10122 /usr/lib/x86_64-linux-gnu/libm.so.6
systemd 1 root mem REG 9,1 190456 10205 /usr/lib/x86_64-linux-gnu/liblzma.so.5.4.1
systemd 1 root mem REG 9,1 4709656 4362 /usr/lib/x86_64-linux-gnu/libcrypto.so.3
systemd 1 root mem REG 9,1 30704 2770 /usr/lib/x86_64-linux-gnu/libcap-ng.so.0.0.0
...

and it shows the expected set of files when run as `lsof /'.
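
To make the difference between the two outputs easier to see, here is a minimal sketch that pulls both device numbers out of lsof lines carrying the "(path dev=...)" annotation. The function name path_dev_mismatches is hypothetical. It assumes that the DEVICE column is the first field of the form N,N and that the path contains no spaces, neither of which lsof guarantees.

```shell
# Read lsof output on stdin and, for every line annotated with
# "(path dev=X,Y)", print: DEVICE-column value, annotated device, path.
# Assumption: paths contain no whitespace, so the path is field NF-2.
path_dev_mismatches() {
    awk '
        /\(path dev=/ {
            device = ""; path_dev = ""
            for (i = 1; i <= NF; i++) {
                # First field that looks like "major,minor" is the DEVICE column.
                if (device == "" && $i ~ /^[0-9]+,[0-9]+$/) device = $i
                # The annotation ends with a token like "dev=0,24)".
                if ($i ~ /^dev=[0-9]+,[0-9]+\)$/) {
                    path_dev = $i
                    sub(/^dev=/, "", path_dev)
                    sub(/\)$/, "", path_dev)
                }
            }
            if (device != "" && device != path_dev)
                print device, path_dev, $(NF - 2)
        }'
}

# Usage (as root, as in the tests above):
#   lsof -p 1 | path_dev_mismatches
```

On the btrfs system shown above this would list 0,22 next to 0,24 for each mapped library; on the ext4 system it would print nothing, since no line has the annotation.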

I noticed this on a few systems after upgrading to bookworm, but at the same
time I also changed the filesystem layout and switched to btrfs. I had no idea
this could be related to the filesystem.

I'll try to dig further here. strace didn't reveal anything interesting so far.

(And yes, all tests were done as root, as was the initial report. Here's my /proc/mounts:
udev /dev devtmpfs rw,nosuid,noexec,relatime,size=8156784k,nr_inodes=2039196,mode=755,inode64 0 0
runfs /run tmpfs rw,nosuid,relatime,size=3265800k,mode=755,inode64 0 0
/dev/md1 / btrfs rw,relatime,compress=zstd:4,space_cache=v2,subvolid=256,subvol=/rootfs 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,inode64 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k,inode64 0 0
cgroup2 /sys/fs/cgroup cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
efivarfs /sys/firmware/efi/efivars efivarfs rw,nosuid,nodev,noexec,relatime 0 0
bpf /sys/fs/bpf bpf rw,nosuid,nodev,noexec,relatime,mode=700 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime,pagesize=2M 0 0
mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,nosuid,nodev,noexec,relatime 0 0
tracefs /sys/kernel/tracing tracefs rw,nosuid,nodev,noexec,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,nosuid,nodev,noexec,relatime 0 0
configfs /sys/kernel/config configfs rw,nosuid,nodev,noexec,relatime 0 0
ramfs /run/credentials/systemd-sysctl.service ramfs ro,nosuid,nodev,noexec,relatime,mode=700 0 0
ramfs /run/credentials/systemd-tmpfiles-setup-dev.service ramfs ro,nosuid,nodev,noexec,relatime,mode=700 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev,relatime,inode64 0 0
/dev/sda5 /squid/b btrfs rw,nosuid,nodev,noexec,relatime,compress=zstd:5,space_cache,subvolid=5,subvol=/ 0 0
/dev/sdc1 /boot/efi vfat rw,nosuid,nodev,relatime,fmask=0022,dmask=0022,codepage=437,iocharset=ascii,shortname=mixed,utf8,errors=remount-ro 0 0
/dev/sdc5 /squid/a btrfs rw,nosuid,nodev,noexec,relatime,compress=zstd:5,space_cache,subvolid=5,subvol=/ 0 0
/dev/md2p1 /var ext4 rw,nosuid,nodev,relatime 0 0
/dev/md2p3 /ora ext4 rw,nosuid,nodev,relatime 0 0
/dev/md0 /stage btrfs rw,nosuid,nodev,relatime,space_cache,subvolid=5,subvol=/ 0 0
/dev/md2p4 /ws/ws btrfs rw,nosuid,nodev,relatime,compress=zstd:5,space_cache,subvolid=258,subvol=/ws 0 0
/dev/md2p4 /home btrfs rw,nosuid,nodev,relatime,compress=zstd:5,space_cache,subvolid=257,subvol=/home 0 0
/dev/md2p4 /share btrfs rw,nosuid,nodev,relatime,compress=zstd:5,space_cache,subvolid=260,subvol=/soft 0 0
/dev/md2p4 /home/mail btrfs rw,nosuid,nodev,relatime,compress=zstd:5,space_cache,subvolid=259,subvol=/mail 0 0
ramfs /run/credentials/systemd-tmpfiles-setup.service ramfs ro,nosuid,nodev,noexec,relatime,mode=700 0 0
)

Thanks,

/mjt

Michael Tokarev

Nov 2, 2023, 7:00:08 AM
Control: retitle -1 lsof does not work correctly with btrfs subvolumes

So it turned out the whole thing is about btrfs subvolumes.

When a btrfs filesystem is mounted with default subvol=/, lsof works
as expected, the same way it works for, say, ext4 or xfs.

But when a non-root subvolume is mounted, e.g. subvol=@rootfs, lsof does
not find any files on the filesystem anymore:

lsof /btrfs/file
lsof /btrfs

both list nothing, while

lsof | grep /btrfs/file
lsof | grep /btrfs

do.

It is something about the matching of files/inodes in lsof which gets
confused by btrfs subvolume mounts.
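
Since the unfiltered form still works, a path-based filter can stand in for the filesystem argument in the meantime. The following is only a sketch: filter_del_under is a hypothetical helper name, it treats the first field starting with "/" as the path, and it matches on a path prefix rather than on a filesystem, so a prefix of "/" would also match files on other mounts.

```shell
# Read lsof output on stdin and print only the lines that have a DEL
# entry whose path starts with the given prefix.
# Assumption: the first field beginning with "/" is the file path.
filter_del_under() {
    awk -v prefix="$1" '
        {
            is_del = 0; path = ""
            for (i = 1; i <= NF; i++) {
                if ($i == "DEL") is_del = 1
                if (path == "" && substr($i, 1, 1) == "/") path = $i
            }
            # index() == 1 means the path begins with the prefix.
            if (is_del && index(path, prefix) == 1) print
        }'
}

# Usage:
#   lsof | filter_del_under /usr/lib
```

This still pays the cost of a full lsof run, so it does not address the slowness, only the amount of output.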

/mjt

Tj

Nov 2, 2023, 8:20:04 AM
Package: lsof
Version: 4.95.0-1
Followup-For: Bug #1040373

This seems to be related to upstream issue:

https://github.com/lsof-org/lsof/issues/152 "LTlock fails on btrfs"

Small test-case:

SUBVOL_FILE="/mnt/machines_old/test";
sudo findmnt -oTARGET,SOURCE,FSTYPE,MAJ:MIN "${SUBVOL_FILE%/*}"
sudo stat --format="%n %Hd:%Ld" "$SUBVOL_FILE"
flock "$SUBVOL_FILE" grep ":$(stat --format="%i" "$SUBVOL_FILE")" /proc/locks | tee >(bc <<< "ibase=16; $(cut -d: -f 3)")

TARGET SOURCE FSTYPE MAJ:MIN
/mnt/machines_old /dev/mapper/SUNNY-machines_old btrfs 0:100
/mnt/machines_old/test 0:101
20: FLOCK ADVISORY WRITE 378526 00:64:395 0 EOF
100
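
Reading the output above: stat reports device 0:101 for the file, while the /proc/locks entry carries 00:64, which is hexadecimal for 0:100, the number findmnt shows for the mount. The bc step in the test case does that conversion for the minor number only. A small sketch that converts both parts is below. The function name locks_dev_to_dec is hypothetical, and it assumes the MAJOR:MINOR:INODE layout seen in the /proc/locks line above, with major and minor in hex.

```shell
# Convert the device part of a /proc/locks "MAJOR:MINOR:INODE" field
# (major and minor in hexadecimal) to decimal MAJOR:MINOR.
locks_dev_to_dec() {
    triple=$1
    major_hex=${triple%%:*}     # text before the first colon
    rest=${triple#*:}           # text after the first colon
    minor_hex=${rest%%:*}       # text before the next colon
    printf '%d:%d\n' "0x$major_hex" "0x$minor_hex"
}

locks_dev_to_dec "00:64:395"    # prints 0:100
```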

Michael Tokarev

Nov 7, 2023, 1:30:05 PM
On Mon, 06 Nov 2023 12:41:27 -0500 Andres Salomon <dili...@queued.net> wrote:
> On Thu, 2 Nov 2023 13:54:44 +0300 Michael Tokarev <m...@tls.msk.ru>
> wrote:
> > Control: retitle -1 lsof does not work correctly with btrfs
> subvolumes
> >
> > So it turned out the whole thing is about btrfs subvolumes.
> >
>
> I'm wondering if this is a regression in lsof (ie, did it begin with
> bookworm's lsof), or has it been there all along? An easy way to test
> this would be to rebuild bullseye/oldstable's lsof on bookworm, install
> it on your machine with that subvolume, and see if it does the same
> thing.

No, this is not a regression. It is a property of btrfs and zfs - both
support multiple devices behind a single filesystem and both support
multiple filesystem roots, so there's no 1:1 device:filesystem mapping
anymore. It's been this way for multiple Debian releases.
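
One way to observe the missing 1:1 mapping is to compare the device number recorded for a mount point with the one stat reports for a file under it. A minimal sketch follows. mount_dev is a hypothetical helper name; it assumes the Linux /proc/self/mountinfo layout (field 3 is MAJOR:MINOR, field 5 is the mount point), and the stat format string is the one used in the test case earlier in this thread.

```shell
# Print the MAJOR:MINOR recorded for a mount point in a mountinfo-format
# file. If the mount point appears more than once, the last entry wins.
# $1: mount point   $2: mountinfo file (default: /proc/self/mountinfo)
# Note: mount points containing spaces are octal-escaped in mountinfo
# and would need the same escaping in $1.
mount_dev() {
    awk -v mp="$1" '
        $5 == mp { dev = $3 }
        END      { if (dev != "") print dev }
    ' "${2:-/proc/self/mountinfo}"
}

# Usage, with the paths from the test case above:
#   mount_dev /mnt/machines_old
#   stat --format='%Hd:%Ld' /mnt/machines_old/test
# If the two numbers differ, a tool that matches files to a filesystem
# by device number alone has nothing to match on.
```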

See also:

https://unix.stackexchange.com/questions/345220/btrfs-how-to-get-real-device-id
http://www.sabi.co.uk/blog/21-two.html?210804#210804 (linked to from above)
https://github.com/util-linux/util-linux/issues/2349#issuecomment-1615854709

/mjt