zfs corruption at zroot/usr/home:<0x0>

Tomek CEDRO

Nov 11, 2025, 9:59:17 PM
to FreeBSD Questions Mailing List
Hello world :-)

On 14.3-RELEASE-p5 amd64 I have encountered a kernel panic (will
report on bugzilla in a moment). After that I found that some sites
did not load in a web browser, so my first guess was to run zpool
status -v, and I got this:

errors: Permanent errors have been detected in the following files:
zroot/usr/home:<0x0>

Any guess what the <0x0> means and how to fix the situation?
It should be a file name, right?

zpool scrub and resilver did not help :-(

Will rolling back a snapshot fix the problem?

Any hints appreciated :-)
Tomek

--
CeDeROM, SQ7MHZ, http://www.tomek.cedro.info

Sad Clouds

4:26 AM
to Tomek CEDRO, FreeBSD Questions Mailing List
Hi, I'm not a ZFS expert, but I wonder if this error relates to some
of the ZFS internal objects being corrupted, rather than the file data
blocks. In that case, ZFS may not be able to repair it correctly?

I'm currently evaluating ZFS on FreeBSD for some of my storage needs
and your report is a bit concerning. Are you able to share the details
on the I/O workloads and the storage geometry you use? Do you have more
info on the kernel panic message or backtraces?

If you put it all in the bug report, can you please share the bug ID?

Thanks.

Frank Leonhardt

9:19 AM
to ques...@freebsd.org
I suspect <0x0> refers to the object number within a dataset, with zero
being metadata. A permanent error is bad news.
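
If you're curious, zdb can dump a specific object within a dataset, so
something along these lines should show what object 0 actually is (it
is read-only, the output is fairly low-level, and on a busy live pool
zdb can complain or give inconsistent results):

  zdb -dddd zroot/usr/home 0    # dump object 0, the one reported as <0x0>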

zpool scrub doesn't fix any errors - well, not exactly. It tries to read
everything, and if it finds an error it'll repair it. If you encounter an
error outside of a scrub, ZFS will fix it anyway, if it can. The point of a
scrub is to ensure all your data is readable even if it hasn't been read
in a while.
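
For reference, the usual cycle is something like:

  zpool scrub zroot        # read and verify every block in the pool
  zpool status -v zroot    # watch progress and see which files/objects are flagged
  zpool clear zroot        # reset the error counters once the cause is dealt with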

This is most likely down to a compound hardware failure - with flaky
drives it's still possible to lose both copies and not know about it
(hence doing scrubs).

My advice would be to back up what's remaining to tape ASAP before
anything else.
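
Something like either of these, depending on what you have to hand (the
paths and tape device are placeholders, and a zfs send may abort if it
trips over the corrupt object, in which case fall back to copying files):

  zfs snapshot zroot/usr/home@rescue                        # freeze what's readable now
  zfs send zroot/usr/home@rescue > /backup/home-rescue.zfs  # stream it off this pool
  tar cvf /dev/sa0 /usr/home                                # or plain tar to tape / another disk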

You can sometimes roll back to an earlier version of a dataset - take a
look and see if it's readable (i.e. mount it or look for it in the .zfs
directory). One good way is to use "zfs clone
zroot/usr/home@snapshotname zroot/usr/home_fingerscrossed"
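
i.e., assuming the dataset is mounted at /usr/home, something like:

  zfs list -t snapshot -r zroot/usr/home      # see which snapshots exist
  ls /usr/home/.zfs/snapshot/snapshotname/    # poke around a snapshot read-only before cloning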

ZFS is NOT great at detecting failed drives. I'm currently investigating
this for my blog (and it was the subject of a question I posted here
around January, to which the answer was "hmm"). zfsd, however, does monitor
drive health using devctl and might pick up impending drive failures
before you get to this stage. I'm going through the source code now to
convince myself it works (it's written in C++ and appears to be
influenced by Design Patterns, so it's not exactly clear!)
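
For what it's worth, turning it on under FreeBSD is just:

  sysrc zfsd_enable="YES"    # enable at boot
  service zfsd start         # start it now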

If the metadata for a dataset is unrecoverable you'll need to destroy
and recreate the dataset from a backup. HOWEVER, I'd be investigating
the health of the drives. dd them to /dev/null and see what you get. You
can actually do this while ZFS is using them. Also check the console log
for CAM messages - if it's got to that stage you really need to think
about data recovery.
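
Something like this, as root (the device names are guesses - on 14.x an
NVMe drive usually shows up as nda0/nvme0, SATA as ada0):

  dd if=/dev/nda0 of=/dev/null bs=1m conv=noerror   # read the whole drive end to end, watch for errors
  dmesg | grep -iE 'cam|nvme'                       # look for CAM / NVMe errors in the console log
  smartctl -a /dev/nvme0                            # health data, if sysutils/smartmontools is installed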

Regards, Frank.



Tomek CEDRO

2:19 PM
to Sad Clouds, FreeBSD Questions Mailing List
Hmm, this is a brand new NVMe drive, not really likely to fail. I have
the same problem on a zraid0 (stripe) array: initially I saw the bad
file name with 3 problems (a VM image), but it has now turned into
ztuff/vm:<0x482>. Charlie Foxtrot :-(

I also have a 4x4TB HDD zraid2 array, and that one was not affected.

I have some snapshots and will try to revert to see if that helps. Before
that I will back up the data, and then try to recover what is possible
after the snapshot rollback.
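
Probably something along these lines once the backup is done (the
snapshot name is just a placeholder for whatever I actually have):

  zfs list -t snapshot -r zroot/usr/home     # pick the last snapshot from before the panic
  zfs rollback -r zroot/usr/home@snapname    # -r destroys any newer snapshots, hence backup first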