Before the rebuild, catting our 18GB of mailboxes to /dev/null took 60+
minutes; after building a new ext3 filesystem and copying the mailboxes
to it, the same cat of all mailboxes on the new filesystem to /dev/null
took 6 minutes.
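The timing test described above can be reproduced with something like
the sketch below (SPOOL is a placeholder; it defaults to an empty
scratch directory so the snippet is safe to try anywhere):

```shell
# Sequentially read every mailbox in the spool and discard the data,
# timing the whole pass. Point SPOOL at the real spool (as a user who
# can read it) to benchmark an actual filesystem.
SPOOL=${SPOOL:-$(mktemp -d)}
time sh -c 'cat "$1"/* > /dev/null 2>/dev/null' sh "$SPOOL"
```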
We discovered that our original ext3 /var/spool/mail filesystem was
heavily fragmented:
# fsck -f /dev/sde1
fsck 1.32 (09-Nov-2002)
e2fsck 1.32 (09-Nov-2002)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/var/spool/mail: 4340/16777216 files (92.5% non-contiguous),
5132569/33553752 blocks
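The fsck figure is an aggregate; a per-file view of the same thing can
be had with filefrag from e2fsprogs (SPOOL is a placeholder; it defaults
to an empty scratch directory here, so the sketch runs anywhere):

```shell
# List extent counts per mailbox, worst offenders first. filefrag needs
# FIBMAP/FIEMAP support, so run it as root on the real spool.
SPOOL=${SPOOL:-$(mktemp -d)}
for f in "$SPOOL"/*; do
    [ -f "$f" ] && filefrag "$f"
done 2>/dev/null | sort -t: -k2 -rn | head
```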
The general consensus after several Google searches is that
fragmentation should not be a big problem with ext2/3. Clearly,
however, at least for us, fragmentation became a big issue. The defrag
tools we found seem to be pretty old.
Does anyone have a recommendation for either a better way to defrag or
for a superior filesystem type that is less susceptible to
fragmentation?
Thanks for the assist,
Craig
You aren't the only person who has discovered that this claim is an
optimistic myth, based upon the assumption that all files are small.
-- Mark --
http://staff.washington.edu/mrc
Science does not emerge from voting, party politics, or public debate.
Si vis pacem, para bellum.
If you ignore the Linux acolytes who maintain that fragmentation
(a) doesn't exist
or
(b) can happen but is a "good" thing
then you'll find that there are tools for some FS types like
EXT2/EXT3 but not for others. The only universal way to
defrag that I'm aware of is the backup/wipe/restore method
that you used.
Since "defrag is never needed" there are very few tools to do it,
but look for "defrag", which works for EXT2/EXT3. No idea how
reliable it is though, given the disclaimers on their web site.
Anyone know of more general defrag tools?
Stan
--
Stan Bischof ("stan" at the below domain)
www.worldbadminton.com
You left out the important information. How much space was available
for that file system? And what percentage of it was used?
--
Floyd L. Davidson <http://web.newsguy.com/floyd_davidson>
Ukpeagvik (Barrow, Alaska) fl...@barrow.com
The people who claim that fragmentation is less of a problem than you
may think have a point. Unix-like systems are multi-user and
multi-process, and Linux itself is very aggressive about disk caching,
so when any process reads a file, that file is stuck in cache. The next
process that needs that file reads it from cache, not disk, so
fragmentation only matters on the first read. If you have problems with
the speed of disk I/O, faster disks or more RAM may help more than
trying to defragment things.
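A rough illustration of the caching point (a truly cold-cache first read
needs root to write to /proc/sys/vm/drop_caches beforehand; that step is
omitted here, so the first read may already be partly cached):

```shell
# Read the same scratch file twice; the second pass is served from the
# page cache rather than the disk, so fragmentation cannot affect it.
f=$(mktemp)
dd if=/dev/zero of="$f" bs=1024 count=4096 2>/dev/null   # 4 MB scratch file
time cat "$f" > /dev/null   # first read: disk (if not already cached)
time cat "$f" > /dev/null   # second read: page cache
rm -f "$f"
```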
Peter T. Breuer has written on this topic in the past; groups.google for
his name and keyword "fragmentation", read what he's said, and address
his points and/or rebut them before you do anything else.
> then you'll find that there are tools for some FS types like EXT2/EXT3
> but not for others. The only universal way to defrag that I'm aware of
> is the backup/wipe/restore method that you used.
> look for "defrag", which works for EXT2/EXT3. Anyone know of more
> general defrag tools?
"Backup, mkfs, restore" is the most general way. Any defrag tool *must*
be filesystem-specific for obvious reasons. If there's any defrag tool
for ReiserFS 3.6, I haven't heard of it.
--
Matt G|There is no Darkness in Eternity/But only Light too dim for us to see
Brainbench MVP for Linux Admin / mail: TRAP + SPAN don't belong
http://www.brainbench.com / Hire me!
-----------------------------/ http://crow202.dyndns.org/~mhgraham/resume
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/sde1 132109124 18424392 106973984 15% /var/spool/mail2
Craig
If you can throw hardware at the problem, then a faster way to do this is
to transfer from the old drive to the new drive, then swap the drives.
The old drive then becomes a backup, which gets wiped and becomes the
"new" drive at the next transfer.
If you're paranoid, then rotate between three drives, which prevents
filesystem corruption on the current drive from going undetected until
after the backup holding the good data has been wiped.
The simplest approach needs a few minutes of downtime, but if you're
clever, you can figure out ways to do this on a hot system with active
users, especially if you're willing to sacrifice the aesthetic of a
moment when everybody is on a perfectly defragmented drive (as opposed
to "good enough").
> [ IMAP group removed ]
> ["Followup-To:" header set to comp.os.linux.misc.]
> On Wed, 27 Apr 2005 16:44:22 +0000 (UTC), esste...@worldbadminton.com
> staggered into the Black Sun and said:
>> In comp.os.linux.misc c...@stolaf.edu <c...@stolaf.edu> wrote:
>>> Does anyone have a recommendation for either a better way to defrag or
>>> for a superior filesystem type that is less susceptible to
>>> fragmentation?
>> If you ignore the Linux acolytes who maintain that fragmentation
>> (a) doesn't exist
>> (b) can happen but is a "good" thing
>
> The people who claim that fragmentation is less of a problem than you
> may think have a point. Unix-like systems are multi-user and
> multi-process, and Linux itself is very aggressive about disk caching,
> so when any process reads a file, that file is stuck in cache. The next
> process that needs that file reads it from cache, not disk, so
> fragmentation only matters on the first read. If you have problems with
> the speed of disk I/O, faster disks or more RAM may help more than
> trying to defragment things.
>
> Peter T. Breuer has written on this topic in the past; groups.google for
> his name and keyword "fragmentation", read what he's said, and address
> his points and/or rebut them before you do anything else.
Also read Lew Pitcher's excellent explanation:
news:24vid.21865$dj2.1...@news20.bellglobal.com
And ask yourself how often any one process reads the entire contents of
/var/spool/mail/ and how relevant continuous read/seek performance is in
these situations.
HTH
Andreas
Indeed. And that is exactly what the OP tested (judging by the fact
that he mentioned catting mailboxes to /dev/null).
Which is not to say that the OP doesn't have a problem, but he
might be confusing symptoms with causes. Something strange
caused more than 90% fragmentation, and whatever that was (an
over-full disk, for example) might well have caused other
symptoms that were resulting in customer complaints.
Defragmentation is merely going to hide one meaningless symptom,
and do absolutely nothing else.
<snip>
>
> The general consensus after several google searches is that
> fragmentation should not be a big problem with ext2/3. Clearly,
> however, at least for us, fragmentation became a big issue. Defrag
> tools we found seeem to be pretty old.
>
> Does anyone have a recommendation for either a better way to defrag or
> for a superior filesystem type that is less susceptible to
> fragmentation?
You probably know of these already, but reiserfs and xfs might be worth
a look.
Mark
> You aren't the only person who has discovered that this claim is an
> optimistic myth, based upon the assumption that all files are small.
That's a fairly reasonable assumption for a mail spool, though.
But if you're having problems with fragmentation -- and it appears you are
-- why not try a different filesystem? I've had good luck with xfs, and I
hear ReiserFS works well for large numbers of small files.
--
John (jo...@os2.dhs.org)
> The people who claim that fragmentation is less of a problem than you
> may think have a point. Unix-like systems are multi-user and
Yes - the major problem with his post is that he is testing wrong. He's
sequentially going through all (relatively large) files and sequentially
reading each of them.
That's making his machine look like MSDOS. His machine doesn't work
like that under normal circumstances - it runs multiple processes at
once, reading and writing to multiple files at once, so fragmentation is
irrelevant. It's only relevant when you read sequentially in a single
thread - which is his test.
To "fix" his non-problem, he needs to stop making large mailboxes which
are read sequentially by single processes. That's a question of changing
his client programs or changing the mailbox format. Changing to
"maildir" format from "mailbox" format is the obvious quick fix.
> multi-process, and Linux itself is very aggressive about disk caching,
> so when any process reads a file, that file is stuck in cache. The next
> process that needs that file reads it from cache, not disk, so
> fragmentation only matters on the first read. If you have problems with
> the speed of disk I/O, faster disks or more RAM may help more than
> trying to defragment things.
> Peter T. Breuer has written on this topic in the past; groups.google for
> his name and keyword "fragmentation", read what he's said, and address
> his points and/or rebut them before you do anything else.
Mmmph.
Peter
My colleague found that ext2 and ext3 don't care so much about
fragmentation, but try to keep the file fragments close to each other
on the disk. Thus, the fact that 90-some percent of our files were
fragmented doesn't necessarily mean that performance is impacted in any
noticeable way. Still, fragmentation appears to have played a large
part in the problems we were seeing, which maybe means that the
fragments were located far apart.
Interestingly, one day after our filesystem rebuild, fragmentation on
/var/spool/mail is back up to 80% (not yet as high as 92% before).
Since performance is still good, the fragmentation percentage shown by
fsck may not be a good indicator of what we're interested in.
We will experiment with the old, "slow" filesystem to see if we can
better identify the core problem...
Thanks for the feedback, everyone.
Craig
That happens on a lot of filesystems where there are a lot of files
that are over time randomly appended to. Things like pre-allocation
don't help either since that only works when you have a file open.
I have a similar issue with a diablo-dreader-spool: lots of files
that are randomly opened and written to. I've modified the Linux
XFS filesystem to allocate those files on disk in 256 KB consecutive
chunks (persistent preallocation) and that helps a lot.
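The poster's change was in-kernel, but on current systems a userspace
analogue of persistent preallocation is util-linux's fallocate, on
filesystems that support it (a hedged sketch; the scratch file is a
placeholder):

```shell
# Reserve a contiguous 256 KB region up front so later appends land
# inside it instead of scattering across the disk. Requires filesystem
# support (e.g. XFS); falls through gracefully where unsupported.
f=$(mktemp)
if fallocate -l $((256 * 1024)) "$f" 2>/dev/null; then
    ls -l "$f"    # file is now 256 KB with blocks reserved on disk
else
    echo "fallocate not supported on this filesystem"
fi
rm -f "$f"
```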
>Does anyone have a recommendation for either a better way to defrag or
>for a superior filesystem type that is less susceptible to
>fragmentation?
For your purpose I think it's sufficient to switch to XFS instead of
ext3, and run the "xfs_fsr" utility nightly. That is a user-level,
online defragmentation utility.
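A hypothetical crontab entry for that (the path may vary by
distribution, and -t limits the run time in seconds):

```shell
# Run the XFS online defragmenter nightly at 03:00, for at most an hour.
0 3 * * * /usr/sbin/xfs_fsr -t 3600 /var/spool/mail
```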
Mike.
> That happens on a lot of filesystems where there are a lot of files
> that are over time randomly appended to. Things like pre-allocation
> don't help either since that only works when you have a file open.
Presumably he can simply switch to maildir instead of mailbox format?
Then no files are appended to.
The analysis sounds good - presumably the files aren't clustered far
enough apart initially to account for their future growth. But simply
copying the mailboxes anew (cp; mv) in a nightly task should fix things
for the next day, no?
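The nightly copy-back could be as simple as the sketch below (SPOOL is a
placeholder defaulting to an empty scratch directory; in real use,
delivery must be paused or each mailbox locked while its copy is
rewritten):

```shell
# Rewrite each mailbox into a fresh file, then rename it into place.
# The new copy is allocated in one pass, so it comes out mostly
# contiguous regardless of how fragmented the original was.
SPOOL=${SPOOL:-$(mktemp -d)}
for mbox in "$SPOOL"/*; do
    [ -f "$mbox" ] || continue
    cp "$mbox" "$mbox.new" && mv "$mbox.new" "$mbox"
done
```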
Can he not also do the copy efficiently by making a holed file first and
filling it in afterwards? (I'm trying to force sufficient preallocation
and distancing from other files).
> I have a similar issue with a diablo-dreader-spool: lots of files
> that are randomly opened and written to. I've modified the Linux
> XFS filesystem to allocate those files on disk in 256 KB consecutive
> chunks (persistent preallocation) and that helps a lot.
Well, it wouldn't help if they grow beyond 256K in a later session
after the initial open. He said his average was 4MB?
But I'd still say "don't do that". Use maildir, or a client that reads
a mailbox more intelligently (say by maintaining a list of offsets).
One can probably also alter the server to occasionally write out the
mailbox anew, but then one can also do that every night.
> For your purpose I think it's sufficient to switch to XFS instead of
> ext3, and run the "xfs_fsr" utility nightly. That is a user-level,
> online defragmentation utility.
Well, if he is going to run a nightly task, he might as well just copy
back his mailboxes on WHATEVER fs he is on, with the same result.
Peter