Matt Tolle
Feb 5, 2026, 12:41:43 PM
to kiwi
Hi all,
I'm running into an issue with KIWI 10.2.33 on a Rocky 9 build host where the cleanup phase intermittently leaves behind device-mapper entries that can't be removed without rebooting the host. I'm hoping someone can point me in the right direction or tell me if this is a known issue.
My setup: I'm building LVM-based GCE/Azure/AWS disk images for Rocky 9, RHEL 9, and OEL 9. The builds run sequentially on the same host. Each image has a rootvg volume group with lv_root, lv_var, lv_tmp, lv_home, and lv_opt logical volumes.
The problem shows up during KIWI's internal cleanup. When KIWI tries to unmount /app/tmp/kiwi_volumes.XXXXX, it sometimes gets "target is busy" errors. After 5 retries it falls back to lazy unmount. Here's what that looks like in the logs:
[ WARNING ]: 01:06:00 | 0 umount of /app/tmp/kiwi_volumes.qw595_hg failed with: target is busy
[ WARNING ]: 01:06:01 | 1 umount of /app/tmp/kiwi_volumes.qw595_hg failed with: target is busy
[ WARNING ]: 01:06:02 | 2 umount of /app/tmp/kiwi_volumes.qw595_hg failed with: target is busy
[ WARNING ]: 01:06:03 | 3 umount of /app/tmp/kiwi_volumes.qw595_hg failed with: target is busy
[ WARNING ]: 01:06:04 | 4 umount of /app/tmp/kiwi_volumes.qw595_hg failed with: target is busy
[ DEBUG ]: 01:06:05 | EXEC: [umount --lazy /app/tmp/kiwi_volumes.qw595_hg]
The lazy unmount "succeeds" but leaves behind a kernel reference on the dm device. After KIWI exits, I'm left with this:
$ dmsetup info rootvg-lv_root
Name: rootvg-lv_root
State: ACTIVE
Open count: 1
$ fuser -v /dev/mapper/rootvg-lv_root
(nothing - no userspace process has it open)
$ dmsetup remove rootvg-lv_root
device-mapper: remove ioctl on rootvg-lv_root failed: Device or resource busy
So there's a kernel-level hold with open count 1, but nothing in userspace is using it. I've tried dmsetup remove --force, vgchange -an, suspending the device first - nothing works. Only a reboot clears it.
This blocks subsequent builds because they can't create a new rootvg volume group while the old dm entries exist.
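For reference, here's roughly how I'm checking the stuck device after a failed build. The losetup check is there because KIWI attaches the disk image through a loop device and I wanted to rule out a stale attachment; names are from my environment:

$ dmsetup info -c -o name,open,attr | grep rootvg    # OPEN > 0 means a kernel-side reference
$ ls /sys/class/block/dm-*/holders/                  # no stacked device is holding the LV
$ losetup -a                                         # any loop device still attached to the image file?
$ lsof /dev/mapper/rootvg-lv_root                    # no userspace holder, same result as fuser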
I dug into the debug logs to figure out what's causing the "target is busy" and found something interesting. At the exact moment KIWI is trying to unmount kiwi_volumes, it's also mounting a NEW kiwi_mount_manager on the same LVM volume:
[ DEBUG ]: 01:06:00 | EXEC: [umount /app/tmp/kiwi_volumes.qw595_hg]
[ DEBUG ]: 01:06:00 | EXEC: Failed with stderr: umount: target is busy
[ DEBUG ]: 01:06:00 | EXEC: [mount /dev/rootvg/lv_root /app/tmp/kiwi_mount_manager.28_k0mus]
Same second, same LVM volume. So KIWI seems to be racing with itself - it's got concurrent operations trying to mount and unmount the same underlying block device. That would explain the "target is busy" error.
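If anyone wants to check their own logs for the same pattern, this is roughly the filter I used to spot the interleaving (the log path is just wherever you redirect kiwi-ng's debug output):

$ grep -E 'EXEC: \[u?mount ' kiwi-debug.log | grep -E 'kiwi_volumes|kiwi_mount_manager|/dev/rootvg/'

The log is already chronological, so mount and umount calls hitting the same LV in the same second stand out immediately.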
What's weird is that it doesn't happen every time. In my last run:
- Rocky 9: zero "target is busy" errors, built fine
- RHEL 9: zero "target is busy" errors, built fine
- OEL 9: 21 "target is busy" errors, fell back to lazy unmount, left zombie dm entry
All three use the same KIWI config and profiles. They run one after another on the same host. Rocky and RHEL had no issues at all, while OEL hit the race condition hard. It's not always the third build that fails. Sometimes it's the first, sometimes it's the second.
I looked at mount_manager.py and saw that the umount() method does 5 retries with a 1-second sleep, then falls back to umount --lazy. I tried patching it locally to do 10 retries and add fuser -km calls to kill any holders, but that didn't fully solve it because the "holder" is apparently KIWI's own concurrent mount operation, not an external process.
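If it would help, I can also run a watcher like this alongside the next build to definitively rule out an external holder (a rough sketch; the staging directory name is random per build, hence the glob):

while sleep 1; do
    for m in /app/tmp/kiwi_volumes.*; do
        [ -d "$m" ] || continue
        mountpoint -q "$m" || continue    # only report it while it's actually mounted
        echo "=== $(date +%T) $m"
        fuser -vm "$m" 2>&1               # any userspace holder of that filesystem
    done
done | tee /tmp/kiwi-mount-watch.log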
A few questions:
1. Is the concurrent mount/unmount on the same LV expected? It seems like there might be missing serialization between different KIWI subsystems.
2. Is there a way to disable the lazy unmount fallback? I'd rather have the build fail than leave zombie entries that require a reboot. Or at least have an option to control this behavior.
3. Has anyone else seen this on EL9? We recently updated to lvm2-2.03.32-2.el9_7.1 and device-mapper-1.02.206-2.el9_7.1 (Feb 3rd), and I'm wondering if something changed in the LVM/dm layer that made this more likely to happen.
4. Any suggestions for workarounds? Right now I'm adding a 60-second delay between builds to give the kernel more time to settle, but that's just a guess (rough sketch of the current inter-build step below).
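For reference, the inter-build step currently looks roughly like this. The deactivation calls are best-effort and fail on the zombie entries exactly as they do interactively, so in practice only the delay does anything:

sync
udevadm settle
vgchange -an rootvg 2>/dev/null || true
for lv in root var tmp home opt; do
    dmsetup remove "rootvg-lv_${lv}" 2>/dev/null || true
done
sleep 60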
For reference, my environment:
- KIWI: 10.2.33
- Host: Rocky Linux 9
- Kernel: 5.14.0-611.24.1.el9_7.x86_64
- LVM2: 2.03.32-2.el9_7.1
- device-mapper: 1.02.206-2.el9_7.1
Happy to provide more debug logs or test patches if that would help.
Thanks,