XFS bug or hard disk error?

Rui Pedro Mendes Salgueiro

Dec 21, 2004, 7:26:00 AM
Hi.

I have a server with Suse Linux 9.1 and a couple of disks with XFS
filesystems. Lately it has been crashing with this or a similar error:

Message from syslogd@xxx at Mon Dec 20 20:47:20 2004 ...
xxx kernel: xfs_iget_core: ambiguous vns: vp/0xdedb9080, invp/0xf67c5980

This error happens when a backup is being made, that is, when the
whole disk tree is (probably) being traversed.

It doesn't show any disk error on the console (it just freezes), but
IIRC, the last time this happened the disk failed completely soon afterwards.

I am thinking of replacing the disk tonight, but in any case I would like
someone to confirm whether this error is caused by a hardware fault.

I will probably still use XFS, because there is no dump for reiserfs.

--
http://www.mat.uc.pt/~rps/

.pt is Portugal| `Whom the gods love die young'-Menander (342-292 BC)
Europe | Villeneuve 50-82, Toivonen 56-86, Senna 60-94

Alan Hughes

Dec 21, 2004, 9:26:42 AM
Rui Pedro Mendes Salgueiro wrote:

> Hi.
>
> I have a server with Suse Linux 9.1 and a couple of disks with XFS
> filesystems. Lately it has been crashing with this or a similar error:
>
> Message from syslogd@xxx at Mon Dec 20 20:47:20 2004 ...
> xxx kernel: xfs_iget_core: ambiguous vns: vp/0xdedb9080, invp/0xf67c5980
>
> This error happens when a backup is being made, that is, when the
> whole disk tree is (probably) being traversed.
>
> It doesn't show any disk error on the console (it just freezes), but
> IIRC, the last time this happened the disk failed completely soon afterwards.
>
> I am thinking of replacing the disk tonight, but in any case I would like
> someone to confirm whether this error is caused by a hardware fault.
>
> I will probably still use XFS, because there is no dump for reiserfs.
>

It sounds to me like the file system has become corrupted somehow. Maybe a disk
sector is on its way out. XFS is very stable in my experience, so software
problems are not that common.

The only advantage that xfsdump/xfsrestore has over tar is that the former
saves some metadata (e.g. ACLs and attributes) that the latter is not aware
of. If you are not using this metadata (i.e. are only using regular file
permissions) then there is not much point in using XFS *unless* your file
system is handling large files (which XFS is far better at than ReiserFS).

mjt

Dec 21, 2004, 10:48:43 AM
Rui Pedro Mendes Salgueiro wrote:

> I have a server with Suse Linux 9.1 and a couple of disks with XFS
> filesystems. Lately it has been crashing with this or a similar error:
>
> Message from syslogd@xxx at Mon Dec 20 20:47:20 2004 ...
> xxx kernel: xfs_iget_core: ambiguous vns: vp/0xdedb9080, invp/0xf67c5980


... check your logs
--
<< http://michaeljtobler.homelinux.com/ >>
There once was a hacker named Ken
Who inherited truckloads of Yen
So he built him some chicks
Of silicon chips
And hasn't been heard from since then.

Rui Pedro Mendes Salgueiro

Dec 21, 2004, 10:34:41 AM
Alan Hughes <nos...@nospam.org> wrote:
> Rui Pedro Mendes Salgueiro wrote:

>> I have a server with Suse Linux 9.1 and a couple of disks with XFS
>> filesystems. Lately it has been crashing with this or a similar error:
>>
>> Message from syslogd@xxx at Mon Dec 20 20:47:20 2004 ...
>> xxx kernel: xfs_iget_core: ambiguous vns: vp/0xdedb9080, invp/0xf67c5980
>>
>> This error happens when a backup is being made, that is, when the
>> whole disk tree is (probably) being traversed.
>>
>> It doesn't show any disk error on the console (it just freezes), but
>> IIRC, the last time this happened the disk failed completely soon afterwards.
>>
>> I am thinking of replacing the disk tonight, but in any case I would like
>> someone to confirm whether this error is caused by a hardware fault.
>>
>> I will probably still use XFS, because there is no dump for reiserfs.

> Sounds to me that the file system has got corrupted somehow.

I forgot to mention: xfs_check doesn't find any serious error, although it
says that about a dozen users have their quotas miscalculated. That doesn't
sound like a problem serious enough to crash the kernel.

> Maybe a disk sector is on its way out -

That is also my suspicion.

> XFS is very stable in my experience, so software
> problems are not that common.

That was what I wanted to hear.

> The only advantage that xfsdump/xfsrestore has over tar is that the former
> saves some metadata (e.g. ACLs and attributes) that the latter is not aware
> of.

Two words: incremental backups. Of course it is possible to achieve
similar results with scripts and find -newer, but it is much less practical.
Also, restoring one or a few files with tar means reading the tape
twice: once to check the exact name(s), and once more to extract it (them).
Dump places the table of contents at the beginning of the tape, so
it is easier to use (restore -i).
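
For readers who want to see what the "scripts and find -newer" approach amounts to, here is a minimal sketch in Python. It is only an illustration, not what dump or xfsdump does internally: the function names, the timestamp ("reference") file and the archive path are all hypothetical, and the sketch assumes the reference file and the archive live outside the tree being backed up.

```python
import os
import tarfile


def files_newer_than(root, reference):
    """Yield paths under root modified after the reference file (like find -newer)."""
    cutoff = os.stat(reference).st_mtime
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat: judge a symlink by its own mtime, not its target's.
            if os.lstat(path).st_mtime > cutoff:
                yield path


def incremental_tar(root, reference, archive):
    """Archive files changed since the reference file; return their relative names."""
    members = sorted(files_newer_than(root, reference))
    with tarfile.open(archive, "w") as tar:
        for path in members:
            tar.add(path, arcname=os.path.relpath(path, root))
    return [os.path.relpath(path, root) for path in members]
```

A real script would also have to update the reference file after each successful run, and this sketch keeps no record of deleted or renamed files, which is part of why the hand-rolled approach is less practical.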

Rui Pedro Mendes Salgueiro

Jan 26, 2005, 11:25:39 AM

One month ago (Dec 21 12:26:05 2004) I wrote:
> I have a server with Suse Linux 9.1 and a couple of disks with XFS
> filesystems. Lately it has been crashing with this or a similar error:

> Message from syslogd@xxx at Mon Dec 20 20:47:20 2004 ...
> xxx kernel: xfs_iget_core: ambiguous vns: vp/0xdedb9080, invp/0xf67c5980

> This error happens when a backup is being made, that is, when the
> whole disk tree is (probably) being traversed.

> It doesn't show any disk error on the console (it just freezes), but
> IIRC, the last time this happened the disk failed completely soon afterwards.

> I am thinking of replacing the disk tonight, but in any case I would like
> someone to confirm whether this error is caused by a hardware fault.

I replaced the disk, but the problems continued.

At the time I had googled for this problem but didn't find anything
relevant. Today I found the XFS web page
http://linux-xfs.sgi.com/projects/xfs/

and searching from that page I found several people with the same
problem. I concluded that this is an old bug that seems to be
caused by an interaction between XFS and NFS on an SMP kernel.
I suppose it is hard to track down and so is still not solved.
Or maybe it was solved only recently (I am using Suse 9.1).

Via the XFS page I found this recipe to reproduce the problem
(I haven't tried it yet):

http://bugme.osdl.org/show_bug.cgi?id=870

------- Additional Comment #4 From Robbie Williamson 2003-12-11 12:14 -------

How To Reproduce:
1) Download the latest LTP testsuite
http://ltp.sf.net/nfs

2) Build and install the testsuite
3) Make sure the NFS server daemons are running
4) Export an XFS filesystem to be used for testing, globally, with root allowed.
ex: /mnt/xfs *(sync,rw,no_root_squash)
5) Change directory to where the LTP is installed
6) Change directory to testcases/bin/
7) Execute 'nfs_fsstress.sh' and follow the prompts.
a) Enter your hostname as the server
b) Enter the export filesystem name, i.e. /mnt/xfs
c) Enter "1" for the number of hours to execute.

The oops should occur within 30 minutes or so.

------- -------
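
To tell whether a run of the recipe above (or a nightly backup) hit the same bug, one could scan the system log for the kernel message quoted at the start of this thread. The following is a rough sketch in Python, not a tested tool: the function name is hypothetical, and the regular expression is modelled only on the single message shown in this thread, so other kernel versions may word it differently.

```python
import re

# Modelled on the one message quoted in this thread; the hex values differ per crash.
AMBIGUOUS_VNS = re.compile(
    r"xfs_iget_core: ambiguous vns: vp/(0x[0-9a-fA-F]+), invp/(0x[0-9a-fA-F]+)"
)


def find_ambiguous_vns(lines):
    """Return (line_number, vp, invp) for every log line that matches."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = AMBIGUOUS_VNS.search(line)
        if match:
            hits.append((number, match.group(1), match.group(2)))
    return hits
```

It could be fed an open log file, for example `find_ambiguous_vns(open("/var/log/messages"))`, where the log path is an assumption about the local syslog configuration.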

> I will probably still use XFS, because there is no dump for reiserfs.

I now think that I will change to ext3. It is a pity because xfsdump
and xfsrestore are quite sophisticated.
