
defrag in linux?


Bill Crocker

Dec 7, 2003, 11:55:18 AM
Is there any type of hard drive defrag utility available for
linux? I'm running SuSE 9.0.

Thanks,
Bill Crocker

: : b r i a n : :

Dec 7, 2003, 12:29:13 PM
Bill Crocker wrote:
> Is there any type of hard drive defrag utility available for
> linux? I'm running SuSE 9.0.
>

You don't.

--
Brian

Olaf Donk

Dec 7, 2003, 12:32:45 PM
On Sun, 07 Dec 2003 11:55:18 -0500, Bill Crocker wrote:

> Is there any type of hard drive defrag utility available for linux? I'm
> running SuSE 9.0.

SuSE/Linux comes with good filesystems, i.e. they don't need to be
defragmented (as long as you keep about 10% of your hard drive free).

bye, olaf

houghi

Dec 7, 2003, 12:33:42 PM
Bill Crocker wrote:
> Is there any type of hard drive defrag utility available for
> linux? I'm running SuSE 9.0.

There are. However, you normally do not need to do this, as the Linux way
of handling things is how it should be. :-)

Unless there is a very special reason to do so, there is no
need to do a defrag.

Freshmeat only shows 3 hits on 'defrag', which means that almost nobody
bothers to write such a program for Linux. :-D

For those of you who do not know the site: http://freshmeat.net has a
LOT of programs for Linux. If SuSE does not have it itself, try to find
it there and then use http://rpmfind.net/ to find the rpm, if it is not
on the site itself.

--
houghi
How to ask questions on Usenet: http://www.houghi.org/question
Take a look and pass the word.

Michael Heiming

Dec 7, 2003, 12:53:14 PM
Bill Crocker <wcroc...@comcast.net> wrote:
> Is there any type of hard drive defrag utility available for
> linux? I'm running SuSE 9.0.

Welcome to the world of real computing: there's no *nix FS
that needs defragmenting. Even if there are tools for some, it's just marketing.

--
Michael Heiming

Remove +SIGNS and www. if you expect an answer, sorry for the
inconvenience, but I get tons of SPAM

mjt

Dec 7, 2003, 1:03:35 PM
On Sun, 07 Dec 2003 11:55:18 -0500, Bill Crocker <wcroc...@comcast.net> wrote:

> Is there any type of hard drive defrag utility available for
> linux? I'm running SuSE 9.0.

... not required
--
/// Michael J. Tobler: motorcyclist, surfer, skydiver, \\\
\\\ and author: "Inside Linux", "C++ HowTo", "C++ Unleashed" ///

Robert Norton

Dec 7, 2003, 2:34:18 PM
On Sun, 07 Dec 2003 11:55:18 -0500, Bill Crocker wrote:

> Is there any type of hard drive defrag utility available for linux? I'm
> running SuSE 9.0.

Look up ReiserFS (the "Reiser file system") to find out why you don't need a
defrag.

Robert Hull

Dec 8, 2003, 7:54:15 AM
In message <cPidnRV5Cor...@comcast.com>, Bill Crocker
<wcroc...@comcast.net> wrote

>Is there any type of hard drive defrag utility available for linux?

You might like to read the article at

http://www.plainfaqs.org/linux/defrag.html
--
Robert HULL
If it's there and you can see it - it's real
If it's there and you can't see it - it's transparent
If it's not there and you can see it - it's virtual
If it's not there and you can't see it - it's gone!

Lew Pitcher

Dec 9, 2003, 10:11:06 PM
Bill Crocker wrote:
> Is there any type of hard drive defrag utility available for
> linux? I'm running SuSE 9.0.

Well, it looks like it's time for my stock "Linux Defrag" post ;-)


In a single-user, single-tasking OS, it's best to keep all the data blocks
for a given file together, because _most_ of the disk accesses over a given
period of time will be against a single file. In this scenario, the read-write
heads of your HD advance sequentially through the hard disk. In the same
sort of system, if your file is fragmented, the read-write heads jump
all over the place, adding seek time to the hard disk access time.

In a multi-user, multi-tasking, multi-threaded OS, many files are being
accessed at any time, and, if left unregulated, the disk read-write
heads would jump all over the place all the time. Even with
'defragmented' files, there would be as much seek-time delay as there
would be with a single-user single-tasking OS and fragmented files.

Fortunately, multi-user, multi-tasking, multi-threaded OSs are usually
built smarter than that. Since file access is multiplexed from the point
of view of the device (multiple file accesses from multiple, unrelated
processes, with no order imposed on the sequence of blocks requested),
the device driver incorporates logic to accommodate the performance hits,
like reordering the requests into something sensible for the device
(i.e., an "elevator" algorithm or the like).
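
To make that concrete, here is a rough Python sketch of the elevator idea;
the block numbers are made up, and this is only an illustration of the
reordering, nothing like the actual kernel code:

    # Toy elevator (SCAN-style) reordering of pending block requests.
    # Serve everything at or ahead of the head in one ascending pass,
    # then sweep back down for the requests behind it.
    def elevator_order(pending, head_pos):
        ahead  = sorted(b for b in pending if b >= head_pos)
        behind = sorted((b for b in pending if b < head_pos), reverse=True)
        return ahead + behind

    requests = [310, 11, 401, 12, 320, 21, 402, 22, 330, 23, 404, 13]
    print(elevator_order(requests, head_pos=100))
    # -> [310, 320, 330, 401, 402, 404, 23, 22, 21, 13, 12, 11]
    # one sweep up, one sweep down, instead of bouncing in arrival order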

In other words, fragmentation is a concern when one (and only one)
process accesses data from one (and only one) file. When more than one
file is involved, the disk addresses being requested are 'fragmented'
with respect to the sequence that the driver has to service them, and
thus it doesn't matter to the device driver whether or not a file was
fragmented.

To illustrate:

I have two programs executing simultaneously, each reading two different
files.

The files are organized sequentially (unfragmented) on disk...
[1.1][1.2][1.3][2.1][2.2][2.3][3.1][3.2][3.3][4.1][4.2][4.3][4.4]


Program 1 reads  file 1, block 1
                 file 1, block 2
                 file 2, block 1
                 file 2, block 2
                 file 2, block 3
                 file 1, block 3

Program 2 reads  file 3, block 1
                 file 4, block 1
                 file 3, block 2
                 file 4, block 2
                 file 3, block 3
                 file 4, block 4

The OS scheduler causes the programs to be scheduled and executed such
that the device driver receives requests
                 file 3, block 1
                 file 1, block 1
                 file 4, block 1
                 file 1, block 2
                 file 3, block 2
                 file 2, block 1
                 file 4, block 2
                 file 2, block 2
                 file 3, block 3
                 file 2, block 3
                 file 4, block 4
                 file 1, block 3

Graphically, with the files laid out as

[1.1][1.2][1.3][2.1][2.2][2.3][3.1][3.2][3.3][4.1][4.2][4.3][4.4]

the heads bounce back and forth across the disk: out to [3.1], back
to [1.1], out to [4.1], back to [1.2], over to [3.2], back to [2.1],
out to [4.2], back to [2.2], to [3.3], back to [2.3], out to [4.4],
and finally back to [1.3].

As you can see, the accesses are already 'fragmented' and we haven't
even reached the disk yet (up to this point, the accesses have been
against 'logical' addresses). I have to stress this, the above
situation is _no different_ from an MSDOS single file physical access
against a fragmented file.

So, how do we minimize the effect seen above? If you are MSDOS, you
reorder the blocks on disk to match the (presumed) order in which they
will be requested. On the other hand, if you are Linux, you reorder the
_requests_ into a regular sequence that minimizes disk access using
something like an elevator algorithm. You also read ahead on the drive
(optimizing disk access), buffer most of the file data in memory, and
you only write dirty blocks. In other words, you minimize the effect of
'file fragmentation' as part of the other optimizations you perform
on the _access requests_ before you execute them.
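
To put a rough number on it, here is a small Python sketch using the request
stream from the example above; the block positions are invented (file 1 at
blocks 0-2, file 2 at 3-5, file 3 at 6-8, file 4 at 9-12) just to compare
total head travel in arrival order versus after one elevator-style sort:

    # Head travel for the interleaved request stream above, served in
    # arrival order vs. sorted into one ascending sweep.
    layout = {"1.1": 0, "1.2": 1, "1.3": 2, "2.1": 3, "2.2": 4, "2.3": 5,
              "3.1": 6, "3.2": 7, "3.3": 8, "4.1": 9, "4.2": 10,
              "4.3": 11, "4.4": 12}

    arrival = ["3.1", "1.1", "4.1", "1.2", "3.2", "2.1",
               "4.2", "2.2", "3.3", "2.3", "4.4", "1.3"]

    def travel(order, start=0):
        pos, total = start, 0
        for name in order:
            total += abs(layout[name] - pos)
            pos = layout[name]
        return total

    print(travel(arrival))                             # 76 block positions
    print(travel(sorted(arrival, key=layout.get)))     # 12 block positions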

Now, this is not to say that 'file fragmentation' is a good thing. It's
just that 'file fragmentation' doesn't have the *impact* here that it
would have in MSDOS-based systems. The performance difference between a
'file fragmented' Linux file system and a 'file unfragmented' Linux
file system is minimal to none, where the same performance difference
under MSDOS would be huge.

Under the right circumstances, fragmentation is a neutral thing, neither
bad nor good. As to defraging a Linux filesystem (ext2fs), there are
tools available, but (because of the design of the system) these tools
are rarely (if ever) needed or used. That's the impact of designing up
front the multi-processing/multi-tasking multi-user capacity of the OS
into its facilities, rather than tacking multi-processing/multi-tasking
multi-user support on to an inherently single-processing/single-tasking
single-user system.


[ And, I'll add Peter T Breuer's <p...@lab.it.uc3m.es> comments from ]
[ Message-ID: <lo73t9...@news.it.uc3m.es>, posted on ]
[ Wed, 05 Dec 2001 23:52:52 GMT ... ]

All "fragmented" drives are better than "unfragmented" ones on a
multiuser multitasking o/s. The point is that the machine is doing
many things simultaneously, so it has to jump around even if one task
is interested in only one file. There will be up to a hundred tasks
doing i/o simultaneously.

Yes, all disk drivers use elevator algorithms, in any o/s.

But to answer your question, ext2fs spreads blocks out evenly through
the disk, using various strategies (well, a single mixed strategy).
This reduces the average seek time on a single elevator pass.

Peter

[ And I'll conclude with Eric P. McCoy's <ctr2...@yahoo.com> comments ]
[ from Message-ID: <87wv019...@providence.local>, posted on ]
[ Wed, 05 Dec 2001 23:52:52 GMT ... ]

"Linux filesystems" is a little misleading. e2fs doesn't generally
have fragmentation issues, for certain definitions of "fragmentation."

The short answer is this: e2fs splits the disks up into block groups,
which are contiguous regions of blocks. The group will contain a
certain number of inodes and (data) blocks. When you create an inode,
Linux probably chooses the group with the largest number of free
(data) blocks. When you write to an inode, Linux will preferentially
allocate (data) blocks in the same group as the inode. When it has
to, it will move on to another (later) group, but will still try to
keep the blocks together.
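
A loose Python sketch of the allocation idea described here, just to make the
heuristic concrete; the group sizes are made up, and this is not the real
ext2 allocator:

    # New inode -> group with the most free data blocks; the file's data
    # blocks then stay in that "home" group while room remains, spilling
    # into later groups only when forced to.
    class BlockGroup:
        def __init__(self, free_blocks):
            self.free = free_blocks

    def pick_group_for_new_inode(groups):
        return max(range(len(groups)), key=lambda i: groups[i].free)

    def allocate_block(groups, home_group):
        for i in range(home_group, len(groups)):
            if groups[i].free > 0:
                groups[i].free -= 1
                return i
        raise OSError("filesystem full")

    groups = [BlockGroup(100), BlockGroup(250), BlockGroup(180)]
    home = pick_group_for_new_inode(groups)            # -> 1
    placed = [allocate_block(groups, home) for _ in range(5)]
    print(home, placed)                                # 1 [1, 1, 1, 1, 1]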

The end result of this is that data is generally fragmented by only a
few blocks, and almost always travels in the same direction. That's
as opposed to the front-to-back fragmentation which could, and
frequently did, occur in FAT and its derivatives.

The above works great until the file system is nearly full, at which
point free blocks are scattered all across the disk in discontiguous
locations. This is why, on a nearly-full file system (above 95% or
so), e2fs performance will degrade _substantially_.
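
If you want to see how close a filesystem is to that point, a minimal Python
sketch using os.statvfs will do; the mount point here is only an example:

    # Report how full a filesystem is, since allocation gets scattered
    # (and e2fs performance falls off) when free space runs very low.
    import os

    st = os.statvfs("/")
    pct_used = 100.0 * (st.f_blocks - st.f_bfree) / st.f_blocks
    print("%.1f%% of blocks in use" % pct_used)
    if pct_used > 90:
        print("getting full: expect more scattered allocations")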

Other file systems (HPFS in particular) are similar, but call groups
"bands" or "stripes" instead. HPFS is actually worse than e2fs when
nearly full, because it uses pseudo B-trees for the directory
structure which periodically need to be rebalanced. The problem there
is that, when the file system is nearly full, directories may need to
be rebalanced into many different groups, which will obviously cause
enormous slowdowns. e2fs uses a crummy, paleolithic array for its
directories, which results in far worse performance overall, but wins
out in this one narrow case (or can, depending on what's done to the
directory).

Sorry, but most people on this group know better than to mention "file
systems" and "explain" in the same sentence when I am around.

[ I hope this makes things clearer ]
--
Lew Pitcher

Master Codewright and JOAT-in-training
Registered Linux User #112576 (http://counter.li.org/)
Slackware - Because I know what I'm doing.

Y...@must-be-kidding.com

Dec 10, 2003, 2:52:12 AM
On Tue, 09 Dec 2003 22:11:06 -0500, Lew Pitcher <lpit...@sympatico.ca>
wrote:

> Bill Crocker wrote:
>> Is there any type of hard drive defrag utility available for
>> linux? I'm running SuSE 9.0.
>

Well, I really enjoyed Lew's post, but let's make things a little simpler.

Your hard disk is divided into small storage areas, each a finite
amount of space, and let's say, for the sake of argument, that each bit of
space is 16K in size.

When data is written out to the hard disk, the disk finds the first
available empty spot and starts writing the data. Now IF your file is
smaller than 16K, it all gets written into that one storage area and that
storage area is now used up. Anything NOT used up becomes what is called
"slack": say your file is 8K, then you have 8K of slack, because that 8K is
NOT available for use, except to that file.
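
To make the arithmetic concrete, here is a tiny Python sketch using that
(made-up) 16K allocation unit; real filesystems usually use much smaller
blocks, often 4K:

    # "Slack": space allocated to a file but left unused in its last unit.
    UNIT = 16 * 1024                         # 16K allocation unit

    def slack(file_size):
        units = -(-file_size // UNIT)        # round up to whole units
        return units * UNIT - file_size

    print(slack(8 * 1024))                   # 8K file  -> 8192 bytes of slack
    print(slack(20 * 1024))                  # 20K file -> 12288 bytes of slack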

So far so good ???

Fragmentation comes into play when you have BIG files, or just files
larger than the smallest storage area on your hard disk. Let's say you
have a relatively new install; the disk starts writing out all the stuff
you install, starting from the beginning of the hard disk, and continues
until it's done. Let's say that used 10% of the available space, so you
have 90% left.

So let's say you create a database and pre-load it with, say, 10 gigs of data.
Now the disk is going to start writing the data out in a nice, pretty,
_continuous_ stream. In other words, if you start on track 10, sector 0,
it will write sequentially through to the end. When you ask the disk to
fetch this data for you, it will happen fast, because the disk has to do
minimal work to read the data: the heads simply step through each
track in turn. Life is good!

Now let's take that same scenario, except let's put it onto a disk drive that
has been used for quite a while. It's had files written and removed, etc.
etc. Your file's parts will tend to be scattered all over, so when you
start reading that BIG database the heads will be dancing all over the
place quite a lot to collect all the data, and your file is said to be
fragmented.

Now let's get to the real world. Disk drives have come a LONG way, baby,
since the days when a 100 meg drive had 10 platters in it and 20
read/write heads. Most hard drives you buy these days have at MOST three
platters in them, more than likely just two, and 4 read/write heads. They
also have LOTS of cache memory on them, and trust me, they can pump the
bytes faster than the bus can accept them and faster than the O/S can eat
them, in most cases with IDE and in every case with Ultra-SCSI-320.

Disk fragmentation is not something the average user needs to concern
themselves with. The chance of blasting your data to Mars because
some defrag tool heads south right in the middle of rewriting everything on
disk is high, and it's just not worth it given the performance of modern
hard drives.

It's interesting that Lew mentioned elevator seeking and writing. That was
a very hot topic when your drive had 8 or 10 platters in it. It made sure
that your data was stored (as much as possible) on the same track on
each platter, so that when the disk was reading it could read as much of
your data as possible without moving the read/write heads. Remember 80
millisecond Seagate drives? With two platters in your modern IDE or SCSI
drive, it's not so much of a big deal anymore, although there is some speed
to be gained from not having to move the heads.

In short, unless you are dealing with LOTS of access to BIG files by LOTS
of people at the same time, don't worry about defrag.

Bill Sappington
San Francisco CA
