Is it normal for fsck to run for more than 10 days for a 600GB system?
Please advise, Thanks.
-------------------------------------------------------------------
Dear all,
We just encountered a serious problem. We have a Sun E6500 enterprise
server attached to EMC storage, and one of the file systems was
damaged.
/Data is configured as RAID 1 (or maybe 0+1, I don't know how EMC
works), about 2TB in size, so usable space is around 1TB. Last week,
our sysadmin tried to add another 400GB to it
and the file system crashed, since the largest file system under
Solaris is 1TB.
The symptom of the crash is that you can see some directories under
/data, but some are empty and some are not. If you do a pwd
in one of the empty directories, you get: "cannot determine current directory".
The empty ones used to be full of files, so files were lost
in the crash.
What's the right way to fix it? We know that all the files are still there on
those disks. We only have partial backups.
Many thanks,
W.M
It might be faster to restore the data (if you had an up-to-date backup!)
than to fsck it! But anyway, you're in deep shit! What kind of
volume manager are you using? If you're running Solaris 8, why don't you
use the built-in logging?
steph
> One of our storage systems crashed. (refer to the following previous post)
> We are running "fsck" on this system now, trying to recover missing
> files, and it has already taken 10 days
> and there's no sign of finishing.
>
> Is it normal for fsck to run for more than 10 days for a 600GB system ?
>
> Please advise, Thanks.
Well, that is *long*, especially because I assume your disks are pretty
fast. Fsck could take a few hours if your file system is really fscked up,
but not 10 days. Anyway, I recommend a journaling filesystem next time
:-) or softupdates as the *BSDs do it...
This will probably depend on the OS too, if it has a to-be-fscked-slowly
file system. But still...
P.S. cross-posting is not done.
--
yalu
mail: frankpuntvandammebijstudent-kuleuven-ac-belgie
homepage: www.student.kuleuven.ac.be/~m9917684
jabber: ya...@jabber.com
THIS IS NOT ADVICE:
I remember reading the post below...
A file system that is 600GB is pretty big, and fsck does take
longer on bigger partitions (increasing linearly, I
assume).
Hmmmm... actually, I just now read someone saying that the
time for an fsck will actually increase exponentially with the
number of files on the system [1].
I've only run into trouble once with fsck, when one of my 3GB
partitions had a surface error. It was only a small portion of
the disk that was affected, but it took 30 minutes to fsck it
(this was Linux and an ext2 FS).
Well, that wasn't very helpful :-(
Other interesting reads:
* Sun Alert ID 27791 [2]
* Search "EMC fsck Solaris" on Google [3]
* Ditto, but on USENET (via Google) [4]
[1] http://searchstorage.techtarget.com/ateQuestionNResponse/0,289625,sid5_cid431606_tax286193,00.html
[2] http://sunsolve.sun.com/pub-cgi/retrieve.pl?doc=fsalert%2F27791
[3] http://www.google.com/search?q=EMC+fsck+Solaris
[4] http://groups.google.com/groups?q=EMC%20fsck%20Solaris
>
> -------------------------------------------------------------------
>
>
> Dear all,
>
> We just encountered a serious problem. We have a Sun E6500 enterprise
> server attached to EMC storage, and one of the file systems was
> damaged.
>
> /Data is configured as RAID 1 (or maybe 0+1, I don't know how EMC
> works), about 2TB in size, so usable space is around 1TB. Last week,
> our sysadmin tried to add another 400GB to it
> and the file system crashed, since the largest file system under
> Solaris is 1TB.
>
> The symptom of the crash is that you can see some directories under
> /data, but some are empty and some are not. If you do a pwd
> in one of the empty directories, you get: "cannot determine current directory".
> The empty ones used to be full of files, so files were lost
> in the crash.
>
> What's the right way to fix it? We know that all the files are still there on
> those disks. We only have partial backups.
>
> Many thanks,
> W.M
--
Andreas Kähäri
--------------------------------------------------------------
Feed your daemons. www.netbsd.org
Since January 2001, Solaris 8 does soft updates too.
The soft update sources are not covered by a BSD license in order to allow
Kirk McKusick to sell the code to Sun :-)
--
EMail:jo...@schily.isdn.cs.tu-berlin.de (home) Jörg Schilling D-13353 Berlin
j...@cs.tu-berlin.de (uni) If you don't have iso-8859-1
schi...@fokus.gmd.de (work) chars I am J"org Schilling
URL: http://www.fokus.gmd.de/usr/schilling ftp://ftp.fokus.gmd.de/pub/unix
I wouldn't be surprised if there's also a knee in the curve. fsck probably
has to keep quite a bit of information in memory, so that it can perform
consistency checks between different parts of the filesystem (e.g. orphaned
files are detected by making a list of all the inodes that are marked in
use, and then scanning the directories looking for references). For a huge
filesystem, this information could overflow physical RAM, and might cause
quite a bit of paging. The fsck programmers probably didn't spend much
time optimizing locality of reference, so you could get significant page
thrashing when this happens, and performance would drop like a rock.
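
To make that in-memory bookkeeping concrete, here is a rough Python sketch of the orphan check described above. It is a toy model only, not how any real fsck is written: the inode list and directory tree are plain Python containers, and the names find_orphans, allocated and directories are made up for illustration.

```python
def find_orphans(allocated, directories):
    """Return allocated inode numbers that no directory entry references.

    allocated   -- iterable of inode numbers marked in use (hypothetical input)
    directories -- dict mapping a directory's inode number to a dict of
                   {entry name: inode number} (hypothetical input)
    """
    # Pass 1: remember every inode that is marked in use.
    # This set has to stay in memory until the directory scan is done.
    in_use = set(allocated)

    # Pass 2: scan every directory and record which inodes are referenced.
    referenced = set()
    for entries in directories.values():
        referenced.update(entries.values())

    # In use but never referenced from any directory: an orphan.
    return sorted(in_use - referenced)


if __name__ == "__main__":
    allocated = [2, 3, 4, 5, 6]
    directories = {
        2: {".": 2, "..": 2, "home": 3},
        3: {".": 3, "..": 2, "a.txt": 4},
    }
    print(find_orphans(allocated, directories))  # prints [5, 6]
```

Both sets grow with the number of inodes, which is the kind of state that could outgrow physical RAM on a filesystem with a very large file count.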
--
Barry Margolin, bar...@genuity.net
Genuity, Woburn, MA
*** DON'T SEND TECHNICAL QUESTIONS DIRECTLY TO ME, post them to newsgroups.
Please DON'T copy followups to me -- I'll assume it wasn't posted to the group.
> >
> >Well, that is *long*, especially because I assume your disks are pretty
> >fast. Fsck could take a few hours if your file system is really fscked up,
> >but not 10 days. Anyway, I recommend a journaling filesystem next time
> >:-) or softupdates as the *BSDs do it...
>
> Since January 2001, Solaris 8 does soft updates too.
>
> The soft update sources are not covered by a BSD license in order to allow
> Kirk McKusick to sell the code to Sun :-)
>
UFS with logging works fine for me. I've never had a single problem with it, and I've been using it since it was introduced in the later Solaris 7 releases.
--
Barbie - Prayers are like junkmail for Jesus
I have seen things you lusers would not believe.
I've seen Sun monitors on fire off the side of the multimedia lab.
I've seen NTU lights glitter in the dark near the Mail Gate.
All these things will be lost in time, like the root partition last week.
Time to die.
It's on by default as it rather makes the fs more stable.
Impressive test results done on Solaris 8 x86 and Linux about a year ago
show that UFS on Solaris is much much faster than Ext2 and even
faster than reiserfs.
My unpacking tests with our freedb database tar file showed that
in those days (freedb had about 450000 files) Solaris took about 6 minutes
while Linux took more than 2 hours. With reiserfs, Linux took about
20 minutes. The numbers are from memory and may not be 100% correct.
I posted the numbers in this group about a year ago but nobody
seemed to care then :-(
A point of order: cross-posting is absolutely done, and it is the correct
choice when you have a problem that is on-topic for several groups.
Cross-posting to appropriate newsgroups has been a UseNet standard since
before UseNet was attached to the ARPAnet. Please read the new-user FAQ
files about UseNet.
If you object to one specific group in the distribution, cut it from your
reply and say so.
Now back to the topic: fsck should not take more than a few hours, but it
really depends on which filesystem you have and how many files you have.
With vxfs I've never seen an fsck take more than a few minutes, even with a
few million files. With ufs, the more files you have, the longer it takes.
If you picked a UFS filesystem and you have 600 gigabytes filled with billions
of files, it may well take 10 days.
Have you been watching the CPU usage with top, the disk usage with iostat,
the memory usage with vmstat, and everything you can think of with sar? If
not, now is the time to start. They will tell you if it has hung or if it
is busy. If it is busy they will tell you what the bottleneck is.
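
As a rough illustration of that "hung or busy" decision, here is a small Python sketch. It does not run top, iostat, vmstat or sar itself; it assumes you have written down two sets of running totals by hand from their output, some seconds apart. The field names, the classify helper and the thresholds are all my own assumptions, picked only to show the reasoning.

```python
from collections import namedtuple

# One hand-recorded reading. All three fields are assumed to be running totals:
#   cpu_seconds   -- CPU time used so far by the fsck process
#   disk_ops      -- disk operations so far on the device being checked
#   pages_scanned -- pages examined so far by the page scanner
Sample = namedtuple("Sample", "cpu_seconds disk_ops pages_scanned")


def classify(before, after, interval, cpu_busy=0.5, scan_busy=200):
    """Guess what fsck is doing from two samples taken `interval` seconds apart.

    cpu_busy and scan_busy are illustrative thresholds, not tuned values.
    """
    cpu_frac = (after.cpu_seconds - before.cpu_seconds) / interval
    ops = after.disk_ops - before.disk_ops
    scan_rate = (after.pages_scanned - before.pages_scanned) / interval

    # No CPU time used and no disk activity: nothing is moving at all.
    if cpu_frac <= 0 and ops <= 0:
        return "hung"
    # Heavy page scanning suggests the working set no longer fits in RAM.
    if scan_rate >= scan_busy:
        return "busy: memory-bound (paging)"
    # The process is using most of a CPU.
    if cpu_frac >= cpu_busy:
        return "busy: CPU-bound"
    # Making progress, mostly waiting on the disks.
    return "busy: disk-bound"


if __name__ == "__main__":
    first = Sample(cpu_seconds=100.0, disk_ops=5000, pages_scanned=0)
    second = Sample(cpu_seconds=101.0, disk_ops=90000, pages_scanned=0)
    print(classify(first, second, interval=60))
```

The order of the checks is a design choice: paging is tested before CPU because a thrashing process can still show noticeable CPU use.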
Tony
Hmmm... brave soul who goes renaming 600 gigs of random files :-)
--
yalu
mail: frankpuntvandammebijstudent-kuleuven-ac-belgie
homepage: www.student.kuleuven.ac.be/~m9917684
jabber: ya...@jabber.com
"yalu" <frank...@mail.com> wrote in message
news:pan.2002.05.10.16...@mail.com...