Just wondered if anyone has experienced this issue. We currently have
two AIX 5.2 ML5 (64-bit, on 650s) servers running in an HACMP cluster
using SSA with fibre extenders. We recently experienced a power
failure of one cabinet and the system tried to fail over, but the
takeover failed with the following error from the HACMP log:
liveapp1:cl_activate_fs[240] /usr/sbin/fsck -f -p -o nologredo
/dev/livenij
****************
/dev/rlivenij: fsck: 0507-089 Unrecoverable error reading from
/dev/rlivenij. Cannot continue.
/dev/rlivenij: fsck: 0507-039 Fatal error (-10012,-1) accessing the
file system (1,403079168,16384,4294967295).
/dev/rlivenij: fsck: 0507-089 Unrecoverable error reading from
/dev/rlivenij. Cannot continue.
/dev/rlivenij: fsck: 0507-039 Fatal error (-10012,-1) accessing the
file system (1,403079168,16384,4294967295).
/dev/rlivenij: Root directory has a corrupt tree (FIXED)
/dev/rlivenij: fsck: 0507-089 Unrecoverable error reading from
/dev/rlivenij. Cannot continue.
/dev/rlivenij: fsck: 0507-039 Fatal error (-10007,-1) accessing the
file system (1,417472512,16384,4294967295).
fsck: Execute module "/sbin/helpers/jfs2/fsck64" failed.
liveapp1:cl_activate_fs[85] mount /data/msas/live-414/nij
Replaying log for /dev/livenij.
Failure replaying log: -3
mount: /dev/livenij on /data/msas/live-414/nij: Unformatted or
incompatible media
The superblock on /dev/livenij is dirty. Run a full fsck to fix.
The techie then performed a manual fsck, which cleared the issue and
allowed the filesystem to be mounted, but all the files were missing
and had become numbered inode entries within the lost+found directory.
Any ideas as to the cause, and why all the files would have ended up
there? Any tips would also be appreciated. Can I get my data back (is
there an inode map somewhere on the system)?
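Roughly, the recovery was a full fsck followed by a remount and a look at
lost+found; the exact flags aren't to hand, so treat the following as
illustrative:

  fsck -y /dev/livenij                              # full check, answering yes to all repairs
  mount /data/msas/live-414/nij                     # the filesystem then mounts cleanly
  ls -li /data/msas/live-414/nij/lost+found         # files appear only as numbered inode entries
  istat /data/msas/live-414/nij/lost+found/#1234    # inspect one recovered entry (name illustrative)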
Well, this can happen when the FS has lots of write activity
while the machine is crashing. Plus, I'd check the current lpps
against APARs; there have been several patches concerning JFS2.
If you have only a few files in lost+found, you can probably
restore them, if you know the right order to put them together.
If there are lots of files, just restore your backup, which you
probably have, since you're running HACMP.
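For the lpp/APAR check, something along these lines should tell you where
you stand (the APAR number is only an example; substitute the ones you
care about):

  oslevel -r               # confirm the maintenance level, e.g. 5200-05
  instfix -i | grep ML     # verify all filesets for each ML are actually present
  instfix -ik IY68589      # ask whether a specific APAR is installed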
Regards,
Frank
Thanks for the information, Frank.
Is there anything that can be done to reduce the chance of this issue
happening again? And is there anything that can be done to recover the
files?
I've already installed a process to capture the ls -i output of the
filesystem every hour; as our database doesn't change inodes, this
information would provide the source for recovering the files from
lost+found.
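Roughly, the capture job and the restore idea look like this (the paths,
map location, and lost+found naming are illustrative; test with the echo
in place first):

  # run hourly from cron: record "inode  full-path" for everything on the filesystem
  find /data/msas/live-414/nij -xdev -exec ls -id {} \; > /var/adm/nij.inode.map

  # after a bad fsck has dumped files into lost+found, map them back by inode number
  cd /data/msas/live-414/nij/lost+found
  for f in *
  do
      ino=$(echo "$f" | sed 's/^#//')     # entries are usually named after their inode number
      orig=$(awk -v i="$ino" '$1 == i {print $2; exit}' /var/adm/nij.inode.map)
      if [ -n "$orig" ]; then
          echo mv "$f" "$orig"            # drop the echo once the mapping looks right
      fi
  done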
Any other ideas?
Regards
Peter
Yes, build some redundancy into your power supply system.
peter....@dsl.pipex.com wrote:
> Is there anything that can be done to reduce the chance of this issue
> happening again? and is there anything that can be done to recover the
> files?
There are commercial tools which claim to be able to recover
such lost+found files. I never had to use one, so I can't
give you feedback on this, but they seem to be really pricey.
> I've already installed a process to capture the ls -i output of the
> file system every hour as our database don't change inodes so this
> information would provide the source to recover the files from
> lost+found.
>
> Any other point ideas?
As far as JFS2 is concerned, I'd go with:
- Double-check your APAR/ML level, to ensure you're not running
into an already known problem.
- Switch back to JFS. Unfortunately, there are some performance
penalties.
As for the DB:
- Check with your DBA whether raw devices are an option. You won't
have to worry about a corrupted FS anymore, but you'll have to
make sure your DB will survive a crash.
- Do regular offline backups (a rough sketch follows at the end of
this list). Turn up the frequency of archive log switches and shove
the logs into your backup facility as soon as possible. This won't
save you from a corrupt FS, but you won't be caught with your pants
down the next time things go *really* wrong.
As for the whole cluster:
- Go over your HACMP design/setup. Hire a technical consultant
with proven experience in HACMP design if you're not that firm
with the subject. Search for SPOFs. Don't forget network, power,
cooling, etc. Eliminate the SPOFs.
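The offline-backup sketch mentioned above, assuming a locally attached
tape at /dev/rmt0 and using the filesystem from the original post as the
example (check the flags against your level):

  # quiesce or shut down the DB first, so the filesystem is not changing underneath you
  backup -0 -u -f /dev/rmt0 /data/msas/live-414/nij    # level-0 backup by inode
  # later, to check the tape or pull files back:
  restore -T -f /dev/rmt0      # list the archive contents
  restore -x -v -f /dev/rmt0   # extract into the current directory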
Regards,
Frank
Tangential to this, we're actually starting to ponder the last suggestion.
We recently were bitten BADLY by IY68589 on two prod filesystems; thankfully
both run the same app and didn't fail at the same time. IBM's response, though,
has been less than encouraging, namely that the final fix for this won't be
until August, and it sounds like it's correcting a design flaw; to me, it
sounds like whatever is released in August will be handling inodes in a totally
new fashion from the current implementation. Beta-test, anyone?
We've upgraded a number of high-profile production servers over the last 6-8
months and used JFS2 because it would provide us additional advertised
flexibility. We are now rather nervous that filesystems will suddenly go
read-only with 20% or more space remaining, because the filesystem is
fragmented enough that the inode table can't grow. The two boxes that crashed
can't be moved back; their file counts are past the maximum inode limit of a
JFS filesystem.
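For anyone wanting to keep an eye on this, something like the following
should show the inode headroom and current fragmentation (the path is
illustrative; double-check the flags against your level):

  df -v /some/jfs2/fs          # shows Iused / Ifree / %Iused alongside block usage
  defragfs -q /some/jfs2/fs    # query-only report of the current fragmentation state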
Think long and hard before you implement anything on JFS2.
-r (apar details below, for the curious..)
---
IY68589: JFS2 VARIABLE INODE EXTENT SIZE
Error description
Due to fragmentation, customer cannot create file on J2
filesystem even if there is plenty of space available.
Problem conclusion
Introduce variable length inode extents that allow at least
one inode to be created as long as there is enough space
available.
Hi guys,
Thanks for the updates. Just to give you some info:
1- The power systems in the DC are already very good, but this
issue was caused by an engineer doing some maintenance work; it took out
one PDU (in the DC, not the cabinet), which then took out seven systems in
total.
2- The HACMP design is OK, with no SPOF, and it worked when tested about
two months ago, even for a power failure on the primary server including all
disk subsystems. The issue here was that the JFS2 corruption required an
fsck to fix, which then caused the problem. IBM have already checked our
configuration and agree it's fine.
3- I agree. JFS2 has caused us two issues: this one, and a crash due to a
memory problem with the snapshot copy. I would also not suggest anyone
use JFS2 until it's stable.
4- IBM support is getting really disappointing. I've been told that the
issue can happen under certain circumstances (great; how do you
explain that to your customers?). The simple fact is that in 12 years
of running UNIX systems I've never seen an issue quite like this!
5- My main concern at the moment is that it appears IBM have no clue
what the solution is, so I'm still at risk from poor coding.
Regards
Peter
1) Purchase at least 'advocate' level support. A little of this is
common across all vendors if you look only at the lowest level support
packages. Still inexcusable, IMHO, but the universe rarely functions as
I think it should.
>
> 5- My main concern at the moment is that it appears IBM have no clue of
> what the solution is so I'm still at risk from poor coding.
You are a paying customer. Exercise your rights. Learn some IBMspeak
and use the term ESCALATE when you need to. Open a COMPLAINT. JFS is
pretty damn stable, and it was written, what, 15 years ago? Hold their
feet to the fire; they are getting paid for it.
Still, the fundamental problem is you have these damn HUMANS doing the
programming, and until we find a way around that, there will always be
these pesky problems in both OSS and proprietary software. 8-) Even
with JFS being pretty good, it hasn't been perfect, and I can remember
some SMIT errors in the mid-90s that could unintentionally lose data
when manipulating volume groups.
Sigh. It is frustrating, but much like hitting yourself in the head
with a hammer - because it feels so good when it stops - go admin some
Windows systems for a few months. 8-) 8-)
Good luck.
>
>
> Regards
>
> Peter
>
Heh.. add 'duty manager' to that list; we had to do that after being left in
limbo transitioning from level 2 to level 3, and the level 2 dude told us they
had no way to directly contact anyone at level 3 (!).
-r