zfs corruption at zroot/usr/home:<0x0>


Tomek CEDRO

Nov 11, 2025, 9:59:17 PM
to FreeBSD Questions Mailing List
Hello world :-)

On 14.3-RELEASE-p5 amd64 I have encountered a kernel panic (I will
report it on Bugzilla in a moment). After that I found that some sites
did not load in a web browser, so my first guess was to try zpool
status -v, and I got this:

errors: Permanent errors have been detected in the following files:
zroot/usr/home:<0x0>

Any guess what the <0x0> means and how to fix the situation?
It should be a file name, right?

zpool scrub and resilver did not help :-(

Will rolling back a snapshot fix the problem?

Any hints appreciated :-)
Tomek

--
CeDeROM, SQ7MHZ, http://www.tomek.cedro.info

Sad Clouds

Nov 12, 2025, 4:26:16 AM
to Tomek CEDRO, FreeBSD Questions Mailing List
Hi, I'm not a ZFS expert, but I wonder if this error is related to some
of ZFS's internal objects rather than to corrupted file data blocks, in
which case ZFS may not be able to repair it correctly?

I'm currently evaluating ZFS on FreeBSD for some of my storage needs
and your report is a bit concerning. Are you able to share the details
on the I/O workloads and the storage geometry you use? Do you have more
info on the kernel panic message or backtraces?

If you put it all in the bug report, can you please share the bug ID?

Thanks.

Frank Leonhardt

Nov 12, 2025, 9:19:20 AM
to ques...@freebsd.org
I suspect <0x0> refers to the object number within a dataset, with zero
being metadata. A permanent error is bad news.

zpool scrub doesn't fix any errors - well, not exactly. It tries to read
everything, and if it finds an error it'll fix it. If you encounter an
error outside of a scrub it'll be fixed anyway, if it can be. The point
of a scrub is to ensure all your data is readable even if it hasn't
been read in a while.

This is most likely down to a compound hardware failure - with flaky
drives it's still possible to lose both copies and not know about it
(hence doing scrubs).
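For example, assuming your pool is called zroot as above:

    zpool scrub zroot         # kicks off a scrub in the background
    zpool status -v zroot     # shows scrub progress and any errors it finds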

My advice would be to back up what's remaining to tape ASAP before
anything else.

You can sometimes roll back to an earlier version of a dataset - take a
look and see if it's readable (i.e. mount it or look for it in the .zfs
directory). One good way is to use "zfs clone
zroot/usr/home@snapshotname zroot/usr/home_fingerscrossed".
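Something along these lines - the snapshot name here is just a placeholder:

    zfs list -t snapshot -r zroot/usr/home             # what snapshots exist
    ls /usr/home/.zfs/snapshot/weekly-2025-11-02/      # is that snapshot readable?
    zfs clone zroot/usr/home@weekly-2025-11-02 zroot/usr/home_fingerscrossed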

ZFS is NOT great at detecting failed drives. I'm currently investigating
this for my blog (and it was the subject of a question I posted here
around January, to which the answer was "hmm"). zfsd, however, does
monitor drive health using devctl and might pick up impending drive
failures before you get to this stage. I'm going through the source
code now to convince myself it works (it's written in C++ and appears
to be influenced by Design Patterns, so it's not exactly clear!)

If the metadata for a dataset is unrecoverable you'll need to destroy
and recreate the dataset from a backup. HOWEVER, I'd be investigating
the health of the drives. dd them to /dev/null and see what you get. You
can actually do this while ZFS is using them. Also check the console log
for CAM messages - if it's got to that stage you really need to think
about data recovery.
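Something like this - device names are just examples, adjust for your drives:

    dd if=/dev/nda0 of=/dev/null bs=1m    # read the whole drive; read errors will show up
    dmesg | grep -iE 'cam|nvme|error'     # look for CAM/NVMe complaints afterwards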

Regards, Frank.



Tomek CEDRO

Nov 12, 2025, 2:19:22 PM
to Sad Clouds, FreeBSD Questions Mailing List
Hmm, this is a brand new NVMe drive, not really likely to fail. I have
the same problem on a zraid0 (stripe) array: initially I saw the bad
file name with 3 problems (a VM image), but it has now turned into
ztuff/vm:<0x482>. Charlie Foxtrot :-(

I also have 4x4TB HDD zraid2 array and this one was not affected.

I have some snapshots, so I will try to revert and see if that helps.
Before that I will back up the data and then try to recover what is
possible after the snapshot rollback.

Tomek CEDRO

Nov 12, 2025, 2:21:28 PM
to Frank Leonhardt, ques...@freebsd.org
Sorry my previous response should go here :-P

Hmm, this is a brand new NVMe drive, not really likely to fail. I have
the same problem on a zraid0 (stripe) array: initially I saw the bad
file name with 3 problems (a VM image), but it has now turned into
ztuff/vm:<0x482>. Charlie Foxtrot :-(

I also have 4x4TB HDD zraid2 array and this one was not affected.

I have some snapshots, so I will try to revert and see if that helps.
Before that I will back up the data and then try to recover what is
possible after the snapshot rollback.

Sad Clouds

Nov 13, 2025, 2:54:53 AM
to Tomek CEDRO, FreeBSD Questions Mailing List
On Wed, 12 Nov 2025 20:18:39 +0100
Tomek CEDRO <to...@cedro.info> wrote:

> Hmm this is brand new NVME drive not really likely to fail. I have the
> same problem on zraid0 (stripe) array while initially I saw the bad
> file name with 3 problems (vm image) it now turned into
> ztuff/vm:<0x482>. Charlie Foxtrot :-(

Personally I still prefer hardware RAID. For years I've used second-hand
LSI 9260-8i cards I bought on eBay and have not noticed any corruption
issues.

ZFS has nice features like checksumming and snapshots, however if I
need ZFS then I deploy it on top of a hardware RAID virtual drive. I
know people will frown upon this configuration, but if I notice any issues
with ZFS, I can easily switch to UFS and keep the benefits of the
hardware RAID.

I could be wrong, but I sort of feel the ancient firmware on a hardware
RAID card is more stable than the large and complex codebase of ZFS
that is constantly refactored and improved by many people.

Frank Leonhardt

Nov 13, 2025, 8:06:07 AM
to ques...@freebsd.org
On 12/11/2025 19:20, Tomek CEDRO wrote:
> Hmm this is brand new NVME drive not really likely to fail. I have the
> same problem on zraid0 (stripe) array while initially I saw the bad
> file name with 3 problems (vm image) it now turned into
> ztuff/vm:<0x482>. Charlie Foxtrot :-(

NVME drives are known to fail early in their life if they're going to fail at all, otherwise they're quite reliable for a long time.

Almost every time I've blamed ZFS in the past (and there have been quite a few occasions) it's turned out to be a hardware problem, even when it seemed okay. Testing subsequently confirmed a flaky drive or controller. A few times I haven't found conclusive proof one way or the other. I believe ZFS is just particularly good at detecting corruption - I've seen corrupted data on UFS2 over the years, but the OS doesn't notice.

There's always the chance of a bug in the drivers, of course.

And this is why (as mentioned elsewhere) I do a last-ditch backup of files to tape using tar!

ZFS is sold as a magic never-lose-data filing system. It's good, but it can't work miracles on flaky hardware. IME, when it goes, it goes.

Good luck with recovering the snapshot.

Regards, Frank.


Tomek CEDRO

Nov 13, 2025, 10:43:39 AM
to Frank Leonhardt, ques...@freebsd.org
On Thu, Nov 13, 2025 at 2:06 PM Frank Leonhardt <freeb...@fjl.co.uk> wrote:
> On 12/11/2025 19:20, Tomek CEDRO wrote:
> Hmm this is brand new NVME drive not really likely to fail. I have the
> same problem on zraid0 (stripe) array while initially I saw the bad
> file name with 3 problems (vm image) it now turned into
> ztuff/vm:<0x482>. Charlie Foxtrot :-(
>
> NVME drives are known to fail early in their life if they're going to fail at all, otherwise they're quite reliable for a long time.
>
> Almost every time I've blamed ZFS in the past (and there have been quite a few occasions) it's turned out to be a hardware problem, even when it seemed okay. Testing subsequently confirmed a flaky drive or controller. A few times I haven't found conclusive proof one way or the other. I believe ZFS is just particularly good at detecting corruption - I've seen corrupted data on UFS2 over the years, but the OS doesn't notice.
> There's always the chance of a bug in the drivers, of course.

Hmm, I will try to boot some diagnostics ISO from the vendor to check
the NVMe drive's status.. but it was, and still is, working fine for
several months already; it uses the onboard controller and has a big
heatsink installed. This is a Samsung PRO 9100 2TB NVMe with the latest
firmware installed (I know Samsung can release NVMe drives that will
self-destruct because of faulty firmware).

I once noticed these early errors in the raidz2 with brand new WD Red
HDDs, so I checked every single one of them with a destructive
badblocks run, and one turned out to be faulty and was replaced
quickly. That was the only time I saw a ZFS error before this. Since
then I always run every disk through several iterations of read-write
badblocks even before first use :-)
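Roughly like this (badblocks comes from sysutils/e2fsprogs; the device
name is just an example, and -w wipes the disk!):

    badblocks -wsv -b 4096 /dev/da1    # destructive read-write surface test, several patterns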

I have three ZFS pools: two are simple stripes (1x2TB for root, 2x2TB
for data) and one is raidz2 (4x4TB for data). I am sure this was caused
by the two kernel panics I triggered by hand during tests. Not sure if
this is a "driver" bug, because I was the source of the problem, but if
there is room for improvement in ZFS then I just found some.. I would
rather lose the last write than end up with an inconsistent filesystem
and an unknown corruption location afterwards :-P

Only the raidz2 is unaffected, because it had the additional parity
data to restore the content. Now I understand why the "lost" 8TB of
space is required :D The other two pools were in active use during the
panic, hence the data loss. I will replace the old ZFS stripe and add
two disks to the raidz2 at the first occasion, when some cash jumps in :-)

With UFS2 I not only always had filesystem corruption on a kernel
panic, but, as you say, there were hidden corruption problems that fsck
could not catch. ZFS is like a dream here.. and look, the problem only
happened in one known dataset, so I can restore just that dataset, not
the whole disk :-)

> And this is why (as mentioned elsewhere) I do a last-ditch backup of files to tape using tar!
> ZFS is sold as a magic never-lose-data filing system. It's good, but it can't work miracles on flaky hardware. IME, when it goes, it goes.

Yes, I will now re-enable my automatic ZFS snapshots in cron, to keep
at least one month of snapshots auto-created every week/day, followed
by a zfs send for export. I had this running, but I got too confident
and disabled it, and look, it would help now :-P
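Something like this - a sketch only, dataset name, schedule and file
names are just examples:

    # /etc/crontab format (includes the user column): weekly snapshot, Sunday 03:00
    0 3 * * 0  root  zfs snapshot zroot/usr/home@auto-$(date +\%Y-\%m-\%d)
    # dump the newest snapshot to a file for offline backup
    zfs send zroot/usr/home@auto-2025-11-16 > /var/bkp/home.zfs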

I am using Blu-ray discs for backups; they are bigger and faster (also
EMP resistant?) than tape, but I really admire the tape approach :-)
I just bought several Sony BD-RE XL 100GB (rewritable) discs and also
have some BD-RE DL (50GB), but these are slow to write (2x, 36Mbps).
BD-R and BD-R DL are a lot faster to write (i.e. 6..12x) but one-time
only.. and I have BD-R XL 128GB with 4x write. I also got DVD-RAM
(2..5x write speed, still slower than 6x DVD-RW) that can in theory
store small portions of data quickly (good for logs), but FreeBSD's UDF
support ends at 1.50 while 2.60 is required for true random access, and
I did not manage to get udfclient to provide random read/write, so
multisession is the only way for now. Also, a good disc burner with
firmware that supports these discs is required; not all drives can even
read them, and write speed matters too, whether it takes 12h or a
quarter of that.

> Good luck with recovering the snapshot.
> Regards, Frank.

Thank you Frank!! For now I am backing up the current data to the
discs; it takes some time, so I will report back after a simple
snapshot rollback :-)

Tomek CEDRO

Nov 13, 2025, 11:11:39 AM
to Sad Clouds, FreeBSD Questions Mailing List
Yes, hardware RAID will be faster than ZFS; I experimented with that
some years back when I played with ZFS for the first time. RAID-0
(stripe) has no redundancy built in, so it is only faster than a ZFS
stripe but also prone to data loss. Higher RAID levels will protect
data at the cost of available space and some performance. But ZFS
gives you a lot more (see below).

Imagine you are making a hardware upgrade with a 16TB array. There is
only a slight chance it will work on new hardware, unless you also move
the RAID controller. What if the array is even bigger? Where will you
store backups? It will probably cost you double the price of the disks,
plus the time to transfer the data. Also, every controller has a
limited number of ports for disks.

With ZFS RAID you can move the array to any hardware, and then
add/replace disks from other controllers, so you are not bound by
hardware limitations. Recently I moved to a new machine and things just
worked out of the box, with zero additional work; I was surprised,
to be honest!!
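Moving a pool between machines is roughly just this (pool name taken
from this thread, a sketch only):

    zpool export ztuff     # on the old machine
    zpool import           # on the new machine: lists pools available for import
    zpool import ztuff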

ZFS then gives you far more features than any other controller /
filesystem, including dedicated datasets with specific attributes like
compression or encryption, deduplication, snapshots, quotas, backup
export/import to any stream over any medium, etc. etc.

I just realized my RAIDZ2 has a double-parity scheme that equals RAID-6,
not RAID-5 as I said before, sorry for that (RAID-5 ~ RAIDZ). Now I can
add some additional disks with no problem. And self-healing looks true,
because the two other simple stripes detected data corruption while the
raidz2 did not. And the corruption affects only one dataset, not the
whole pool, so I can either delete the questionable files and restore
them, or roll back a snapshot / restore an export for that specific
single dataset, not the whole pool. If you worry about lost efficiency,
you can enable compression; then for HDDs things work faster, not
slower (kind of a surprise too, because the amount of data written is
smaller and the compression is ultra fast).
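For example (the dataset name here is hypothetical):

    zfs set compression=lz4 ztuff/data
    zfs get compression,compressratio ztuff/data   # check the setting and the achieved ratio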

I found this article on comparing RAID-Z with RAID helpful:

https://www.diskinternals.com/raid-recovery/what-is-raidz/

ZFS is really amazing and makes FreeBSD unique; even OpenBSD does not
have it, so for people avoiding Linux this helps make the BSD choice
easier :-P

Tomek CEDRO

Nov 13, 2025, 6:32:11 PM
to Frank Leonhardt, ques...@freebsd.org
On Thu, Nov 13, 2025 at 4:42 PM Tomek CEDRO <to...@cedro.info> wrote:
> Thank you Frank!! For now I am backing up current data to the disks it
> takes some time will report back after simple snapshot rollback :-)

Okay so:
1. I booted to single user and made ZFS read-write.
2. I tried to roll back a snapshot of zfs/usr/home; that did not help.
3. I exported zfs/usr/home to /var/bkp/home.zfs.
4. Then I imported home.zfs into zfs/usr/home2.
5. Removed zfs/usr/home.
6. zpool status -v resulted in a permanent error in <0x4b>:<0x0> :-(
7. Tried zpool scrub zfs... but after some time the machine hung, unresponsive.
8. Hard power off.
9. Power on -> single user.
10. zpool status did not show any problems on zfs now.
11. zpool scrub completed and no problems are reported.
12. I renamed zfs/usr/home2 to zfs/usr/home.
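In commands, roughly (the snapshot name here is just an example):

    zfs rollback zfs/usr/home@auto-2025-11-09                   # step 2, most recent snapshot
    zfs send zfs/usr/home@auto-2025-11-09 > /var/bkp/home.zfs   # step 3
    zfs receive zfs/usr/home2 < /var/bkp/home.zfs               # step 4
    zfs destroy -r zfs/usr/home                                 # step 5
    zfs rename zfs/usr/home2 zfs/usr/home                       # step 12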

smartctl cannot query /dev/nda0, so I have to find an ISO with the disk
vendor's tools just to make sure the disk is fine.
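Maybe smartctl wants the controller device rather than the nda
namespace; I have not verified this on this box yet:

    smartctl -a /dev/nvme0           # smartmontools should talk to the controller device
    nvmecontrol logpage -p 2 nvme0   # NVMe SMART / health information log page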

So far so good :-P

Frank Leonhardt

Nov 14, 2025, 7:14:11 AM
to Tomek CEDRO, ques...@freebsd.org

Glad to hear it!

I also worry that a system crash can mess up ZFS. I think it must be possible as it may corrupt data in RAM before it is written to disk. What is not clear is whether the system crash is caused by a drive fault to begin with. I also use ECC memory, but this does not mean everything in RAM is good!

I have never tried Blu-ray. I do not trust optical drives over time. MO is okay, but writeable CD/DVD wasn't good. Unfortunately my MO drive holds so little by modern standards that it's pointless. Here I use LTO tapes, which hold terabytes of data on a single cartridge. The drives are expensive(!) but the data on the tapes is intended to last 30 years or longer. If you don't have enough money for a current drive (LTO 10, 40-100TB capacity for $$$$$$) you can buy one of the earlier models quite cheap. Good LTO6 drives are a couple of hundred dollars and hold 2.5-6TB per tape. FreeBSD supports tape well, including libraries (auto-changers). LTO7 is an even better bargain - a bit more expensive but much larger and faster.
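The last-ditch tar-to-tape backup I mentioned earlier is nothing fancy - something like this, assuming a single sa(4) drive:

    mt -f /dev/nsa0 rewind        # /dev/nsa0 is the no-rewind device
    tar -cvf /dev/nsa0 /usr/home  # stream the files straight to tape
    mt -f /dev/nsa0 offline       # rewind and eject when done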

Regards, Frank.







Tomek CEDRO

Nov 14, 2025, 8:31:43 AM
to Frank Leonhardt, ques...@freebsd.org
On Fri, Nov 14, 2025 at 1:13 PM Frank Leonhardt <freeb...@fjl.co.uk> wrote:
> Glad to hear it!
> I also worry that a system crash can mess up ZFS. I think it must be
> possible as it may corrupt data in RAM before it is written to disk.
> What is not clear is whether the system crash is caused by a drive
> fault to begin with. I also use ECC memory, but this does not mean
> everything in RAM is good!

Well, for sure I caused the crash manually.. two times in a row..
because why not test on production, I have ZFS, right? :D

I had some test crashes before, but these never resulted in any
corruption of ZFS, so this one had to be very special :D


> I have never tried Blu-ray. I do not trust optical drives over time. MO is okay, but writeable CD/DVD wasn't good. Unfortunately my MO drive holds so little by modern standards it's pointless. Here I use LTO tapes, which hold terabytes of data on a single cartridge. The drives are expensive(!) but the data on the tapes is intended to last 30 years or longer. if you don't have enough money for a current drive (LTO 10, 40-100Tb capacity for $$$$$$) you can buy one of the earlier models quite cheap. Good LTO6 drives are a couple of hundred dollars and hold 2.5-6Tb per tape. FreeBSD supports tape well, including libraries (auto-changers). LTO7 is an even better bargain - a bit more expensive but much larger and faster.

Well, optical discs are (were?) popular, so that was the best way to
provide backups for my clients. So far it's been ~25 years and all the
CD-ROM discs work fine.. and there are M-DISC DVD/BD discs that are
advertised to last 1000 years [1], but these are 25..50GB max.

Never heard of those high-capacity LTO MO drives, thanks for the hint!!
Grok even mentions them [1] as an alternative to M-DISC. I now have a
new toy to hunt for, and the TB capacity is definitely what I need when
servicing various machines. Tapes are still available brand new at many
stores.. and it's nice to see the Quantum brand again; I still keep my
first Quantum Fireball HDD :-)

Thanks again Frank! :-)

[1] https://grokipedia.com/page/M-DISC

void

Nov 14, 2025, 1:03:26 PM
to ques...@freebsd.org
On Fri, Nov 14, 2025 at 02:30:57PM +0100, Tomek CEDRO wrote:
>provide backups for my clients. So far ~25 years and all CD-ROM disks
>work fine.. and there are M-DISC DVD/BD that are advertised to last
>1000 years [1] but these are 25..50GB max.

hi, a slight correction, but M-DISC now has a BDXL variant up to 125GB
https://thetechylife.com/what-is-blu-ray-bdxl/
durability up to 1000 years according to Verbatim
https://www.verbatim.com.au/m-disc-optical-media-benefits/
more general M-DISC info https://mdisc.com/faq.html
--

Frank Leonhardt

Nov 14, 2025, 1:27:25 PM
to ques...@freebsd.org
MO was magneto-optical disk - it was a thing of the 1980s and 1990s.
Although the capacity was good, it was nothing by modern standards, and
CD/DVD overtook it in capacity and price. They were specialised,
although the Sony MiniDiscs were MO.

They had a better archive life than other technologies, which was what
made them more interesting. I worked with a CD-R manufacturer in the
1990s and at the time the archive life of the disks was "uncertain" to
say the least. A lot had to do with chemistry, and the push to make the
blank disks cheaper did nothing for the lifespan. Originally they had a
reflective layer of gold, which is very stable over time, but rather
pricy when cheaper reflective materials could be found!

Just to be clear, LTO is tape, not MO. It could be exactly what you're
looking for. It's what banks use to store historic data that must be
accessible, if only very rarely, and which mustn't disappear. You can
get autoloaders (aka libraries, aka jukeboxes) on places like eBay - the
drives are interchangeable, so you can upgrade them if the picker
mechanism works. A 2U Dell PowerVault holds two magazines of 12 tapes
each - that's 24. Let's not go crazy with the drive - say LTO8. That's
12TB/tape, or 30TB after the built-in compression - that's 720TB
nearline. And if you run out, just swap the tapes. I, er, have more than
one :-)

