Hello, when trying to recursively delete a directory (the same directory
twice) on my 500 GB hard drive I hit a problem. It first crashed under
2.6.16.20, then I upgraded to try to get rid of the issue. This one is
from 2.6.17:
xfs_da_do_buf: bno 16777216
dir: inode 1507133580
Filesystem "sda1": XFS internal error xfs_da_do_buf(1) at line 2119 of
file /usr/src/linux-stable-cold/fs/xfs/xfs_da_btree.c. Caller 0xb01d9b63
<b01d9720> xfs_da_do_buf+0x40e/0x7c7 <b01d9b63> xfs_da_read_buf+0x30/0x35
<b01e43d5> xfs_dir2_leafn_lookup_int+0x2f3/0x453 <b01d9b63>
xfs_da_read_buf+0x30/0x35
<b01e2ba5> xfs_dir2_node_removename+0x288/0x47f <b01e2ba5>
xfs_dir2_node_removename+0x288/0x47f
<b01ddbd3> xfs_dir2_removename+0xce/0xd5 <b020ff5d> kmem_zone_alloc+0x4d/0x98
<b020d0ef> xfs_remove+0x2ac/0x444 <b0215e7f> xfs_vn_unlink+0x17/0x3b
<b020a32b> xfs_lookup+0x6e/0x78 <b011e734> __capable+0xc/0x1f
<b0155827> generic_permission+0x93/0xcc <b01558f8> permission+0x98/0xa4
<b0155da0> may_delete+0x32/0xe9 <b0156243> vfs_unlink+0x6d/0xa3
<b0157c7a> do_unlinkat+0x92/0x125 <b0159a0d> sys_getdents64+0x9c/0xa6
<b0102b67> sysenter_past_esp+0x54/0x75
Filesystem "sda1": XFS internal error xfs_trans_cancel at line 1150 of
file /usr/src/linux-stable-cold/fs/xfs/xfs_trans.c. Caller 0xb020d25e
<b0204b48> xfs_trans_cancel+0x59/0xe5 <b020d25e> xfs_remove+0x41b/0x444
<b020d25e> xfs_remove+0x41b/0x444 <b0215e7f> xfs_vn_unlink+0x17/0x3b
<b020a32b> xfs_lookup+0x6e/0x78 <b011e734> __capable+0xc/0x1f
<b0155827> generic_permission+0x93/0xcc <b01558f8> permission+0x98/0xa4
<b0155da0> may_delete+0x32/0xe9 <b0156243> vfs_unlink+0x6d/0xa3
<b0157c7a> do_unlinkat+0x92/0x125 <b0159a0d> sys_getdents64+0x9c/0xa6
<b0102b67> sysenter_past_esp+0x54/0x75
xfs_force_shutdown(sda1,0x8) called from line 1151 of file
/usr/src/linux-stable-cold/fs/xfs/xfs_trans.c. Return address =
0xb0218b68
Filesystem "sda1": Corruption of in-memory data detected. Shutting
down filesystem: sda1
While trying to xfs_repair I get the following:
fatal error -- can't read block 16777216 for directory inode 1507133580
Badblocks has been run on this machine and it was successful.
I did find an old thread with this, but no solution:
http://oss.sgi.com/archives/xfs/2005-02/msg00067.html
config:
http://olricha.homelinux.net:8080/config.gz
Thanks for any help. If I can help at all please let me know.
--
avuton
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
The same here.
After a complete mkfs.xfs under 2.6.17-rc6 it was solved.
Likewise, if I run mkfs.xfs under 2.6.8 and then boot into 2.6.16, the
XFS gets shredded; booting directly from .8 to .17-rc6 works. So I think
there was a bug in .16 in the transition of the XFS which got solved
somewhere in the .17-rc? timeframe.
> Filesystem "sda1": Corruption of in-memory data detected. Shutting
> down filesystem: sda1
Daniel
How reproducible is it? Is it reproducible even after xfs_repair?
If so, can you try Mandy's patch below, to see if it is addressing
the root cause of your problem? If problems persist, a reproducible
test case would be wonderful, if one can be found..
cheers.
--
Nathan
Fix nused counter. It's currently getting set to -1 rather than getting
decremented by 1. Since nused never reaches 0, the "if (!free->hdr.nused)"
check in xfs_dir2_leafn_remove() fails every time and xfs_dir2_shrink_inode()
doesn't get called when it should. This causes extra blocks to be left on
an empty directory, and the directory is unable to be converted back to
inline extent mode.
Signed-off-by: Mandy Kirkconnell <alki...@sgi.com>
Signed-off-by: Nathan Scott <nat...@sgi.com>
--- a/fs/xfs/xfs_dir2_node.c 2006-06-20 16:00:45.000000000 +1000
+++ b/fs/xfs/xfs_dir2_node.c 2006-06-20 16:00:45.000000000 +1000
@@ -972,7 +972,7 @@ xfs_dir2_leafn_remove(
/*
* One less used entry in the free table.
*/
- free->hdr.nused = cpu_to_be32(-1);
+ be32_add(&free->hdr.nused, -1);
xfs_dir2_free_log_header(tp, fbp);
/*
* If this was the last entry in the table, we can
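(To see why the one-liner matters, here's a standalone userspace sketch,
using htonl/ntohl as stand-ins for the kernel's cpu_to_be32/be32_to_cpu
helpers - illustrative only, not part of the patch:)

  #include <stdio.h>
  #include <stdint.h>
  #include <arpa/inet.h>  /* htonl/ntohl stand in for cpu_to_be32/be32_to_cpu */

  int main(void)
  {
          uint32_t nused = htonl(3);  /* on-disk big-endian counter: 3 in use */

          /* Old code: clobbers the counter with -1 (0xffffffff), so the
           * "if (!free->hdr.nused)" test can never see zero. */
          uint32_t broken = htonl((uint32_t)-1);

          /* Fixed code: decrement in CPU byte order, store back big-endian -
           * the moral equivalent of be32_add(&free->hdr.nused, -1). */
          uint32_t fixed = htonl(ntohl(nused) - 1);

          printf("broken: %u\n", ntohl(broken)); /* 4294967295, never 0 */
          printf("fixed:  %u\n", ntohl(fixed));  /* 2, will reach 0      */
          return 0;
  }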
Happens every time I try to remove that inode (directory). xfs_repair
ends with a fatal error:
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- ensuring existence of lost+found directory
- traversing filesystem starting at / ...
rebuilding directory inode 128
fatal error -- can't read block 16777216 for directory inode 1507133580
> If so, can you try Mandy's patch below, to see if it is addressing
> the root cause of your problem? If problems persist, a reproducible
> test case would be wonderful, if one can be found..
I'm sorry, the patch doesn't change anything. It never makes it through
the xfs_repair due to the above error. If there's any information I
can get for you please let me know.
I'm not sure if it changes anything, but here's the message after the patch:
xfs_da_do_buf: bno 16777216
dir: inode 1507133580
Filesystem "sda1": XFS internal error xfs_da_do_buf(1) at line 2119 of
file /usr/src/linux-stable-cold/fs/xfs/xfs_da_btree.c. Caller
0xb01d9b63
<b01d9720> xfs_da_do_buf+0x40e/0x7c7 <b01d9b63> xfs_da_read_buf+0x30/0x35
<b01e43d9> xfs_dir2_leafn_lookup_int+0x2f3/0x453 <b01d9b63>
xfs_da_read_buf+0x30/0x35
<b01e2ba5> xfs_dir2_node_removename+0x288/0x483 <b01e2ba5>
xfs_dir2_node_removename+0x288/0x483
<b01ddbd3> xfs_dir2_removename+0xce/0xd5 <b020ff61> kmem_zone_alloc+0x4d/0x98
<b020d0f3> xfs_remove+0x2ac/0x444 <b0215e83> xfs_vn_unlink+0x17/0x3b
<b016190c> mntput_no_expire+0x11/0x7e <b01575f1> link_path_walk+0xaf/0xb9
<b011e734> __capable+0xc/0x1f <b0155827> generic_permission+0x93/0xcc
<b01558f8> permission+0x98/0xa4 <b0155da0> may_delete+0x32/0xe9
<b0156243> vfs_unlink+0x6d/0xa3 <b0157c7a> do_unlinkat+0x92/0x125
<b0159a0d> sys_getdents64+0x9c/0xa6 <b0102b67> sysenter_past_esp+0x54/0x75
Filesystem "sda1": XFS internal error xfs_trans_cancel at line 1150 of
file /usr/src/linux-stable-cold/fs/xfs/xfs_trans.c. Caller 0xb020d262
<b0204b4c> xfs_trans_cancel+0x59/0xe5 <b020d262> xfs_remove+0x41b/0x444
<b020d262> xfs_remove+0x41b/0x444 <b0215e83> xfs_vn_unlink+0x17/0x3b
<b016190c> mntput_no_expire+0x11/0x7e <b01575f1> link_path_walk+0xaf/0xb9
<b011e734> __capable+0xc/0x1f <b0155827> generic_permission+0x93/0xcc
<b01558f8> permission+0x98/0xa4 <b0155da0> may_delete+0x32/0xe9
<b0156243> vfs_unlink+0x6d/0xa3 <b0157c7a> do_unlinkat+0x92/0x125
<b0159a0d> sys_getdents64+0x9c/0xa6 <b0102b67> sysenter_past_esp+0x54/0x75
xfs_force_shutdown(sda1,0x8) called from line 1151 of file
/usr/src/linux-stable-cold/fs/xfs/xfs_trans.c. Return address =
0xb0218b6c
Filesystem "sda1": Corruption of in-memory data detected. Shutting
down filesystem: sda1
Please umount the filesystem, and rectify the problem(s)
xfs_force_shutdown(sda1,0x1) called from line 338 of file
/usr/src/linux-stable-cold/fs/xfs/xfs_rw.c. Return address =
0xb0218b6c
--
avuton
Also, forgot to mention: I ran xfs_check on it and it gave me more
information than I had before:
missing free index for data block 0 in dir ino 1507133580
missing free index for data block 2 in dir ino 1507133580
missing free index for data block 3 in dir ino 1507133580
missing free index for data block 4 in dir ino 1507133580
missing free index for data block 5 in dir ino 1507133580
missing free index for data block 6 in dir ino 1507133580
missing free index for data block 7 in dir ino 1507133580
missing free index for data block 8 in dir ino 1507133580
missing free index for data block 9 in dir ino 1507133580
--
avuton
Oh - that's a kernel patch, not a repair patch; I was more interested
in whether the initial corruption could be reproduced. Which version
of xfs_repair are you running? (xfs_repair -V) xfsprogs-2.7.18 will
resolve your problem, I suspect.
cheers.
--
Nathan
OK, I'm running Gentoo's latest: 2.7.11. I can't find 2.7.18
_anywhere_, although 2.7.13 is in the pre directory on the ftp; is that
the one you're referring to?
--
avuton
No - it's in CVS (and has been for a long time); I'll go get the ftp
area updated - looks like that's been forgotten about again.
cheers.
--
Nathan
OK, just compiled from CVS HEAD (xfs_repair 2.8.2) and it still fails.
If this fix is not yet in 2.8.x, I'll wait for 2.7.18 to appear on the ftp:
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
entry "/ost+found" at block 0 offset 448 in directory inode 128
references invalid inode 18374686479671623679
clearing inode number in entry at offset 448...
entry at block 0 offset 448 in directory inode 128 has illegal name "/ost+found"
imap claims a free inode 859505 is in use, correcting imap and clearing inode
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- clear lost+found (if it exists) ...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- ensuring existence of lost+found directory
- traversing filesystem starting at / ...
rebuilding directory inode 128
fatal error -- can't read block 16777216 for directory inode
1507133580
--
avuton
This looks to be the same problem as http://oss.sgi.com/bugzilla/show_bug.cgi?id=631
Note that the block numbers are identical in both reports: 16777216 = 0x1000000.
A very suspicious block number, wouldn't you say?
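(For what it's worth, 16777216 is 1 << 24, i.e. 0x01000000 - exactly what
you'd get if a big-endian on-disk 1 were read unswapped on a little-endian
x86 box. A trivial check, purely illustrative:)

  #include <stdio.h>

  int main(void)
  {
          printf("0x%08x\n", 16777216u); /* 0x01000000 == 1 << 24 */
          printf("%u\n", 1u << 24);      /* 16777216              */
          return 0;
  }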
Best wishes,
Duncan.
In the initial email I do state that I have run badblocks on this disk
successfully.
Just the defaults, but it doesn't matter; someone else is having the
exact same issue I am, per the bugzilla entry earlier in this thread.
FWIW, I've updated the ftp area now.
> OK, just compiled from CVS HEAD (xfs_repair 2.8.2) and it still fails:
Is this a large filesystem? Any chance we can get access to
it somehow (e.g. xfs_copy to a sparse file, then send me a
pointer to it) to reproduce the problem locally?
> fatal error -- can't read block 16777216 for directory inode
> 1507133580
Once you've saved a copy of it for further xfs_repair analysis (if
you can), you can clear out this problem by directly poking at
the device using xfs_db in expert mode: "xfs_db -x /dev/xxx";
then "inode 1507133580"; then "write core.mode 0"; and then try
another xfs_repair run. Please try to capture the fs for us first,
though (if possible), else we're going to struggle to improve on
this aspect of xfs_repair. Send me some private mail if you do
manage to grab the fs and put it someplace for me.
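For concreteness, the whole session would look roughly like this ("sda1"
taken from the reports above; the image path is just an example):

  # xfs_copy /dev/sda1 /backup/sda1.img   (capture a copy first, if space allows)
  # xfs_db -x /dev/sda1
  xfs_db> inode 1507133580
  xfs_db> write core.mode 0
  xfs_db> quit
  # xfs_repair /dev/sda1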
thanks.
--
Nathan
Ciao,
D.
Sorry, that should say: "I've had no problems at all with 2.6.15".
Also, xfs_repair successfully repaired the filesystem this time.
I've kept a copy of the filesystem in case anyone is interested.
Duncan.