cc Konstantin, ntfs3 list
FYI this warning basically means we're attempting to zero a folio that
starts beyond EOF. It warns because if the folio starts beyond EOF,
writeback may throw it away instead of submitting I/O, making the
zeroing ineffective.
I was poking through the code a bit out of curiosity and playing with
Claude and I think it was able to come up with a reproducer. I don't
know if this is the same circumstance as syzbot (we'll see if it
eventually spits out its own reproducer), but it relies on formatting
with a larger cluster size than page size.
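A quick way to check that precondition on a given machine might look like the following. This is only a sketch: the helper name is made up for illustration, and 16384 is simply the cluster size the script further down formats with.

```shell
#!/bin/bash
# Hypothetical helper (not part of any tool): succeeds when the cluster
# size is strictly larger than the page size. The page size defaults to
# whatever getconf reports, but can be passed explicitly.
cluster_exceeds_page() {
    local cluster_size=$1
    local page_size=${2:-$(getconf PAGESIZE)}
    [ "$cluster_size" -gt "$page_size" ]
}

if cluster_exceeds_page 16384; then
    echo "16K clusters are larger than the page size here"
else
    echo "page size is >= 16K, a larger cluster size would be needed"
fi
```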
Basically what it's doing is creating a non-cluster aligned i_valid size
based on completing some writes, a slightly larger also unaligned
i_size, and then running fallocate up to the full i_size. AFAICS this
means the falloc will leave i_size alone and fall into the
is_supported_holes block, which will call into
ntfs_extend_initialized_size() using the rounded up cluster end to zero
the range from i_valid to the end of the cluster. If, say, the cluster
size is 16k, the page size is 4k, and i_size is not in the last 4k of
the EOF cluster, then I suppose we're going to end up trying to zero a
folio that starts beyond i_size. I'll append the script that Claude came
up with that reproduces the warning for me.
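To make the arithmetic concrete, here is a rough sketch using the numbers from the script (16K clusters, i_size=5000). It assumes 4K single-page folios and treats any folio starting at or past i_size as "beyond EOF"; the exact comparison in the kernel may differ at the boundary. The function name is hypothetical and this is not kernel code.

```shell
#!/bin/bash
# Illustrative arithmetic only. Lists the start offsets of page-sized
# folios that begin at or after i_size but before the cluster-aligned
# end of the zeroing range.
folios_beyond_eof() {
    local i_size=$1 cluster=$2 page=${3:-4096}
    # End of the zeroing range: i_size rounded up to a cluster boundary.
    local zero_end=$(( (i_size + cluster - 1) / cluster * cluster ))
    # First page boundary at or after i_size.
    local off=$(( (i_size + page - 1) / page * page ))
    while [ "$off" -lt "$zero_end" ]; do
        echo "$off"
        off=$(( off + page ))
    done
}

folios_beyond_eof 5000 16384
```

With these inputs the sketch prints 8192 and 12288, the same two offsets the script mentions. If i_size were in the last 4k of the cluster (say 13000), it would print nothing.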
I'm not totally sure what the best fix is here. It seems like this is
already playing games with i_size to handle the KEEP_SIZE case, so
perhaps a quick hack might be to co-opt that to handle this case as well.
Maybe another option is to split the zeroing to only iomap zero through
the end of the EOF folio and use another mechanism to zero the rest of
the cluster, or zero it earlier at alloc time, etc. But anyways, I
suspect NTFS3 developers will have a better handle on this than I do.
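Purely for illustration, the "split the zeroing" idea would carve up the range roughly as below. Same assumptions as before (4K single-page folios, made-up function name); it only computes offsets and says nothing about which mechanism would handle the second range.

```shell
#!/bin/bash
# Illustrative arithmetic only. Prints two "start end" pairs (end is
# exclusive): the part that could go through iomap zeroing, and the
# remainder of the cluster that would need some other mechanism.
split_zero_range() {
    local i_valid=$1 i_size=$2 cluster=$3 page=${4:-4096}
    local cluster_end=$(( (i_size + cluster - 1) / cluster * cluster ))
    # End of the folio containing EOF, capped at the cluster end.
    local eof_folio_end=$(( (i_size + page - 1) / page * page ))
    if [ "$eof_folio_end" -gt "$cluster_end" ]; then
        eof_folio_end=$cluster_end
    fi
    echo "$i_valid $eof_folio_end"
    echo "$eof_folio_end $cluster_end"
}

split_zero_range 4995 5000 16384
```

For the script's values this gives 4995 to 8192 for the iomap part and 8192 to 16384 for the rest.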
Brian
--- 8< ---
Claude generated reproducer script:
#!/bin/bash
#
# NTFS3 fallocate bug reproducer
#
# Tests whether ntfs_fallocate() tries to zero folios beyond i_size
# when using sparse files with cluster_size > PAGE_SIZE
#
set -e
IMG=./ntfs_test.img
MOUNT=/mnt
FILE=$MOUNT/testfile
echo "=== NTFS3 Fallocate Bug Test ==="
echo ""
# Check if running as root
if [ "$EUID" -ne 0 ]; then
    echo "ERROR: Must run as root"
    exit 1
fi
# Check for mkfs.ntfs
if ! command -v mkfs.ntfs &> /dev/null; then
    echo "ERROR: mkfs.ntfs not found (install ntfs-3g)"
    exit 1
fi
# Clear WARN_ON_ONCE state so we can see warnings on repeated runs
echo "[1] Clearing WARN_ON_ONCE state..."
if [ -f /sys/kernel/debug/clear_warn_once ]; then
    echo 1 > /sys/kernel/debug/clear_warn_once
    echo " Cleared"
else
    echo " /sys/kernel/debug/clear_warn_once not found (debugfs not mounted?)"
fi
# Create image file
echo "[2] Creating 100M image file..."
truncate -s 100M $IMG
# Setup loop device
echo "[3] Setting up loop device..."
# Find a free loop device and attach the image in one step
DEV=$(losetup -f --show "$IMG")
echo " Using $DEV"
# Format
echo "[4] Formatting with 16K clusters..."
mkfs.ntfs -f -c 16384 -q $DEV
# Mount
echo "[5] Mounting with -o sparse to $MOUNT..."
mount -t ntfs3 -o sparse "$DEV" "$MOUNT"
# Create test file
echo "[6] Creating file with i_valid=4995, i_size=4995..."
dd if=/dev/urandom of=$FILE bs=999 count=1 seek=4 conv=notrunc status=none 2>/dev/null
sync
echo "[7] Extending i_size to 5000 (i_valid stays at 4995)..."
truncate -s 5000 $FILE
echo "[8] Running: fallocate -l 5000 $FILE"
echo ""
echo "Bug trigger: This should try to zero up to cluster boundary (16384)"
echo "while i_size=5000, causing iomap to process folios at 8192 and 12288"
echo "which start beyond EOF."
echo ""
if fallocate -l 5000 $FILE 2>&1; then
    echo "-> fallocate completed"
else
    echo "-> fallocate failed: $?"
fi
# Cleanup
echo ""
echo "[9] Cleaning up..."
umount $MOUNT
losetup -d $DEV
rm -f $IMG
echo ""
echo "Done. Check 'dmesg | tail' for kernel warnings."