Permanent error in file <0x0>?


Jan Ploski

Jun 14, 2010, 1:55:41 PM
to zfs-fuse
Hi,

After several days of depositing nightly backups, my green pool is now
reporting an error, but I suspect that it might be actually a false
alarm:

# zpool status -v
pool: green
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scrub: scrub in progress for 0h14m, 0.58% done, 41h20m to go
config:

        NAME                        STATE     READ WRITE CKSUM
        green                       ONLINE       0     0     0
          disk/by-id/dm-name-green  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

green:<0x0>

Any idea what file 0x0 might mean? A bug in zfs-fuse? I'm now running
scrub, but it's slow as molasses. BTW, is scrub stateful, that is,
does it continue where it left off across machine reboots?

Best regards,
Jan Ploski

sgheeren

Jun 14, 2010, 2:16:35 PM
to zfs-...@googlegroups.com

Sorry to hear that. There goes our track record :)
Now, seriously, this looks like something to ask Sun/upstream:

http://dlc.sun.com/osol/docs/content/ZFSADMIN/gbbwl.html

You might contact the Solaris forums. I hear Sun engineers are happy to help out with e.g. zdb (but I don't know whether that is tied to support contracts). Anyhow, there are a number of freely accessible zdb descriptions online (Google).

Of course, we might try to debug things, but that won't usually work unless you have a box to spare with
(a) remote access
(b) dev tools
(c) trust in us...

Unless, of course, your pool is around 512MB (...) in which case I can give you an upload URL.

I would say: be careful. If you don't have backups, make them first (prefer e.g. rsync; only as a second resort trust zfs send with your 'damaged(?)' pool).
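A minimal sketch of that order of operations (the paths, dataset name, and snapshot name are made up for illustration; the `run` wrapper just prints each command so nothing runs by accident):

```shell
# Dry-run wrapper: prints each command instead of executing it.
# Replace the echo with "$@" once you are happy with the commands.
run() { echo "+ $*"; }

# 1. Plain file-level copy first -- rsync does not depend on the pool's
#    own metadata staying healthy (-aHAX: archive, hard links, ACLs, xattrs).
run rsync -aHAX /green/backup/ /mnt/safe/backup/

# 2. Only once that copy is safe, optionally try a send stream as well.
run zfs snapshot green/backup@rescue
run zfs send green/backup@rescue
```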

Seth

Aneurin Price

Jun 14, 2010, 2:46:02 PM
to zfs-...@googlegroups.com

I believe so, but I may be confusing it with resilver, which seems to
work incrementally just fine.

I have found that scrub tends to give highly pessimistic estimates at
the start and speeds up substantially over the next few hours; you
could still be waiting a long time, though. I'd wait until the scrub is
complete and see what 'zpool status -v' says before doing anything.
It might also be worth backing up the metadata as a precautionary
measure, by using dd or similar to copy the first and last 64MB of
each disk in the pool while it's offline (64MB was the number I worked
out when I needed to do this, though that may be different in the
unlikely event that you don't have 512-byte blocks). Plus, obviously,
back up your data as Seth says, if you can.
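Nye's head-and-tail copy might look something like this sketch (the device path is an example, the 64MB figure is his rather than a ZFS constant, and the pool should be exported first; the dd lines are left commented out so the snippet is safe to paste):

```shell
# Compute the 1MB-block offset of the last 64MB of a device, given its
# size in bytes -- this is the value to pass to dd's skip= below.
tail_skip_mb() {
  size_bytes=$1
  echo $(( size_bytes / 1048576 - 64 ))
}

# Example usage (commented out; run only against an exported pool):
# disk=/dev/disk/by-id/dm-name-green               # example path
# size=$(blockdev --getsize64 "$disk")             # device size in bytes
# dd if="$disk" of=green-head.img bs=1M count=64
# dd if="$disk" of=green-tail.img bs=1M skip=$(tail_skip_mb "$size")
```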

BTW your symptom appears to be described at
http://docs.sun.com/app/docs/doc/819-5461/gbctx?l=en&a=view, but I
can't say that it looks particularly enlightening.

Nye

Aneurin Price

Jun 15, 2010, 7:53:07 AM
to zfs-...@googlegroups.com
On Mon, Jun 14, 2010 at 19:46, Aneurin Price <aneuri...@gmail.com> wrote:
> On Mon, Jun 14, 2010 at 18:55, Jan Ploski <jpl...@gmx.de> wrote:

>> Any idea what file 0x0 might mean? A bug in zfs-fuse? I'm now running
>> scrub, but it's slow as molasses. BTW, is scrub stateful, that is,
>> does it continue where it left off across machine reboots?
>>
>
> I believe so, but I may be confusing it with resilver which seems to
> work incrementally just fine.

I think I was wrong about this. I haven't found a definitive
statement, but it looks like the only way to stop a scrub is to cancel
it, whereupon it will start again from scratch next time.
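For reference, cancelling is the `-s` flag to `zpool scrub` (dry-run wrapper again so this sketch only prints the commands; pool name taken from the thread):

```shell
run() { echo "+ $*"; }   # print instead of execute

run zpool scrub -s green   # -s stops (cancels) the in-progress scrub
run zpool scrub green      # a new scrub then starts from the beginning
```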

Nye

sgheeren

Jun 15, 2010, 8:17:40 AM
to zfs-...@googlegroups.com
Which leaves undefined whether a scrub will resume where it left off,
e.g. after a system reboot. That is exactly the kind of situation in
which I'd expect scrub to resume. Also, there was recently an upstream
bug about 'zpool scrub poolname' _silently_ restarting a running scrub.

Jan Ploski

Jun 15, 2010, 5:51:53 PM
to zfs-fuse
After scrubbing, I can now see errors reported for several files:

# zpool status -v
pool: green
state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see: http://www.sun.com/msg/ZFS-8000-8A
scrub: none requested
config:

        NAME                        STATE     READ WRITE CKSUM
        green                       ONLINE       0     0     0
          disk/by-id/dm-name-green  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

<metadata>:<0x0>
green/backup/vista@20100612_042019:/BOOTSECT.BAK
green/backup/vista@20100612_042019:/CTX.DAT
green/backup/vista@20100612_042019:/NTDETECT.COM
green/backup/vista@20100612_042019:/WirelessDiagLog.csv
green/backup/vista@20100612_042019:/boot.ini
/green/backup/vista/BOOTSECT.BAK
/green/backup/vista/CTX.DAT
/green/backup/vista/autoexec.bat
/green/backup/vista/boot.ini

These files indeed cannot be read (Input/output error). I vaguely
recall having seen the backup job stuck with a kernel panic on the
night when I backed up the Vista machine, so it might be related to
that. Unfortunately, I didn't capture the log when it happened :-(,
and so didn't report a bug, knowing it would not be possible to
reproduce anyway. I will try overwriting these few files, maybe
perform a few more rsync-from-Vista backups, and watch whether the
problems reappear. As for the <metadata>:<0x0> error, I think I'm just
going to force-clear it.
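The clearing step would presumably look like this (a sketch only; the `run` wrapper prints rather than executes, and the follow-up scrub is there so `zpool status -v` reflects the restored files):

```shell
run() { echo "+ $*"; }   # print instead of execute

# After restoring/overwriting the affected files:
run zpool clear green       # reset the pool's error counters
run zpool scrub green       # re-verify so the error list is refreshed
run zpool status -v green   # check that the file list has shrunk
```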

sgheeren

Jun 15, 2010, 6:09:06 PM
to zfs-...@googlegroups.com
Jan,

I'm really sorry we cannot be of more help. Thanks anyway for keeping us
up to date.

Seth
