[PATCH] Speedup FAT filesystem directory reads


Karsten Wiese

Aug 3, 2005, 9:40:09 PM
Hi,

Please give this a try and commit to -mm or mainline, if approved.

Thanks,
Karsten

Summary:
This speeds up directory reads for large FAT partitions
if the buffer cache has to be filled from the drive.
The following values were taken from:
$ time find path_to_freshly_mounted_fat > /dev/null
on an otherwise idle system.
FAT with 16KB clusters on an IDE-attached drive: factor 2
FAT with 32KB clusters on a USB2-attached drive: factor 10 (!)
It is less than 1/10 slower if the buffer cache is up to date.

The patch touches 3 areas:
- fat_bmap() returns the sector's offset in the cluster or a
  negative error code, instead of 0 or a negative error code.
  Its callers are changed accordingly.
- fat__get_entry() calls sb_breadahead() to read ahead a whole cluster
  if the requested sector is the first one in a cluster.
  This is useful because on FAT, directories occupy whole clusters.
  Readahead is only done if the cluster's first sector is not up to date,
  to avoid overhead when the buffer cache is already up to date.
  Note that under memory pressure, the maximal byte count wasted
  (read: has to be read from disk twice) is 1 cluster's size. That's 64KB.
- Unrelated cleanup at one spot:
	if (bh)
		brelse(bh);
  is replaced with:
	brelse(bh);
  brelse() can handle NULL pointer arguments by itself.

Signed-off-by: Karsten Wiese <annabell...@yahoo.de>

fat+sb_breadahead.diff
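
The "first sector in a cluster" test described above can be sketched in userspace as follows. This is only an illustration, assuming the number of sectors per cluster is a power of two so that a mask yields the sector's offset within its cluster. The helper names `sector_offset_in_cluster` and `is_first_sector_of_cluster` are hypothetical and do not appear in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: offset of a directory-relative sector (iblock)
 * within its cluster. Assumes sec_per_clus is a power of two.
 */
static inline uint64_t sector_offset_in_cluster(uint64_t iblock,
						uint32_t sec_per_clus)
{
	return iblock & (uint64_t)(sec_per_clus - 1);
}

/* Hypothetical helper: true if iblock starts a new cluster. */
static inline bool is_first_sector_of_cluster(uint64_t iblock,
					      uint32_t sec_per_clus)
{
	return sector_offset_in_cluster(iblock, sec_per_clus) == 0;
}
```

With 32 sectors per cluster, sectors 0, 32, 64 and so on would be the ones that trigger readahead of the rest of their cluster.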

OGAWA Hirofumi

Aug 4, 2005, 10:30:17 AM
Karsten Wiese <annabell...@yahoo.de> writes:

> Please give this a try and commit to -mm or mainline, if approved.

Looks good. Thanks. However, I tweaked the patch.

- replace __getblk() with sb_getblk()
- exclude the root dir of FAT12/FAT16 from readahead
- exclude (sec_per_clus == 1) from readahead
- move all the readahead stuff into one inline function

What do you think of the following patch?
--
OGAWA Hirofumi <hiro...@mail.parknet.co.jp>


Signed-off-by: Karsten Wiese <annabell...@yahoo.de>
Signed-off-by: OGAWA Hirofumi <hiro...@mail.parknet.co.jp>
---

fs/fat/dir.c | 28 ++++++++++++++++++++++++++--
1 files changed, 26 insertions(+), 2 deletions(-)

diff -puN fs/fat/dir.c~fat-sb_breadahead fs/fat/dir.c
--- linux-2.6.13-rc4/fs/fat/dir.c~fat-sb_breadahead 2005-08-04 21:21:59.000000000 +0900
+++ linux-2.6.13-rc4-hirofumi/fs/fat/dir.c 2005-08-04 23:05:58.000000000 +0900
@@ -30,6 +30,29 @@ static inline loff_t fat_make_i_pos(stru
 		| (de - (struct msdos_dir_entry *)bh->b_data);
 }

+static inline void fat_dir_readahead(struct inode *dir, sector_t iblock,
+				     sector_t phys)
+{
+	struct super_block *sb = dir->i_sb;
+	struct msdos_sb_info *sbi = MSDOS_SB(sb);
+	struct buffer_head *bh;
+	int sec;
+
+	/* This is not a first sector of cluster, or sec_per_clus == 1 */
+	if ((iblock & (sbi->sec_per_clus - 1)) || sbi->sec_per_clus == 1)
+		return;
+	/* root dir of FAT12/FAT16 */
+	if ((sbi->fat_bits != 32) && (dir->i_ino == MSDOS_ROOT_INO))
+		return;
+
+	bh = sb_getblk(sb, phys);
+	if (bh && !buffer_uptodate(bh)) {
+		for (sec = 0; sec < sbi->sec_per_clus; sec++)
+			sb_breadahead(sb, phys + sec);
+	}
+	brelse(bh);
+}
+
 /* Returns the inode number of the directory entry at offset pos. If bh is
    non-NULL, it is brelse'd before. Pos is incremented. The buffer header is
    returned in bh.
@@ -58,6 +81,8 @@ next:
 	if (err || !phys)
 		return -1;	/* beyond EOF or error */

+	fat_dir_readahead(dir, iblock, phys);
+
 	*bh = sb_bread(sb, phys);
 	if (*bh == NULL) {
 		printk(KERN_ERR "FAT: Directory bread(block %llu) failed\n",
@@ -635,8 +660,7 @@ RecEnd:
 EODir:
 	filp->f_pos = cpos;
 FillFailed:
-	if (bh)
-		brelse(bh);
+	brelse(bh);
 	if (unicode)
 		free_page((unsigned long)unicode);
 out:
_
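
The early-return conditions in fat_dir_readahead() can be modelled outside the kernel roughly as below. This is a sketch only: `struct fat_model` and `readahead_candidate()` are hypothetical stand-ins for the fields the patch reads from struct msdos_sb_info and struct inode, and the buffer-state check (buffer_uptodate()) is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the superblock fields the patch consults. */
struct fat_model {
	uint32_t sec_per_clus;	/* sectors per cluster, a power of two */
	int fat_bits;		/* 12, 16 or 32 */
};

/*
 * Returns true when the patch would go on to look at the buffer state,
 * false when it returns early without issuing any readahead.
 */
static bool readahead_candidate(const struct fat_model *fs,
				bool is_root_dir, uint64_t iblock)
{
	/* Not the first sector of a cluster, or single-sector clusters. */
	if ((iblock & (fs->sec_per_clus - 1)) || fs->sec_per_clus == 1)
		return false;
	/* The FAT12/FAT16 root directory is excluded from readahead. */
	if (fs->fat_bits != 32 && is_root_dir)
		return false;
	return true;
}
```

In this model a FAT32 root directory remains a candidate, while a FAT16 root directory never is, matching the exclusions listed in the tweaks.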

Karsten Wiese

Aug 4, 2005, 9:00:11 PM
On Thursday, 4 August 2005 16:21, OGAWA Hirofumi wrote:
> Karsten Wiese <annabell...@yahoo.de> writes:
>
> > Please give this a try and commit to -mm or mainline, if approved.
>
> Looks good. Thanks. However, I tweaked the patch.
>
> - replace __getblk() with sb_getblk()
> - exclude the root dir of FAT12/FAT16 from readahead
> - exclude (sec_per_clus == 1) from readahead
> - move all the readahead stuff into one inline function
>
> What do you think of the following patch?

Looks better, is smaller and works equally well here, thanks.
I had to apply it by hand though, as it was slightly scrambled
(by my mail client?), so patch couldn't handle it.
Please send patches as attachments.

Andrew,
please replace the initial version in -mm with this one.

Thanks,
Karsten


From: Karsten Wiese <annabell...@yahoo.de>
From: OGAWA Hirofumi <hiro...@mail.parknet.co.jp>

This speeds up directory reads for large FAT partitions if the buffer cache
has to be filled from the drive. The following values were taken from:

$ time find path_to_freshly_mounted_fat > /dev/null

on an otherwise idle system.

FAT with 16KB clusters on an IDE-attached drive: factor 2
FAT with 32KB clusters on a USB2-attached drive: factor 10 (!)
It is less than 1/10 slower if the buffer cache is up to date.

The patch introduces the new function fat_dir_readahead().

fat_dir_readahead() calls sb_breadahead() to read ahead a whole cluster
if the requested sector is the first one in a cluster.
This is useful because on FAT, directories occupy whole clusters,
with the exception of FAT12/FAT16 root dirs.

Readahead is only done if the cluster's first sector is not up to date,
to avoid overhead when the buffer cache is already up to date.

Note that under memory pressure, the maximal byte count wasted
(read: has to be read from disk twice) is 1 cluster's size. That's 64KB.

fat_dir_readahead() is called from fat__get_entry().

There is also an unrelated cleanup at one spot:

if (bh)
brelse(bh);

is replaced with:

brelse(bh);

brelse() can handle NULL pointer arguments by itself.
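
The same idiom can be shown in a small userspace analogue. The sketch below is not kernel code: `struct buf` and `buf_release()` are hypothetical names, chosen only to show why a release function that checks for NULL itself makes the caller's `if` redundant.

```c
#include <stddef.h>

/* Hypothetical userspace analogue of a reference-counted buffer. */
struct buf {
	int refcount;
};

/*
 * Drops one reference. A NULL argument is a no-op, so callers do not
 * need their own NULL check. Returns 1 if a reference was dropped,
 * 0 if there was nothing to release.
 */
static int buf_release(struct buf *b)
{
	if (!b)
		return 0;
	b->refcount--;
	return 1;
}
```

A caller can then write `buf_release(b);` unconditionally on its cleanup path, which is the shape the FillFailed: label takes after the patch.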

Signed-off-by: Karsten Wiese <annabell...@yahoo.de>
Signed-off-by: OGAWA Hirofumi <hiro...@mail.parknet.co.jp>


speedup-fat-filesystem-directory-reads_2.patch
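
As a rough check of the numbers in the description, the sketch below works out how many sectors one cluster spans and how many bytes could at worst be read twice. It assumes 512-byte sectors, which the thread does not state, and the helper names `sectors_per_cluster` and `max_wasted_bytes` are hypothetical.

```c
#include <stdint.h>

/* Assumption: 512-byte sectors. Other sector sizes change the counts. */
#define SECTOR_SIZE 512u

/* Sectors in one cluster, i.e. iterations of the readahead loop. */
static inline uint32_t sectors_per_cluster(uint32_t cluster_bytes)
{
	return cluster_bytes / SECTOR_SIZE;
}

/* Upper bound on bytes that may be read twice: one whole cluster. */
static inline uint32_t max_wasted_bytes(uint32_t sec_per_clus)
{
	return sec_per_clus * SECTOR_SIZE;
}
```

Under that assumption, the 16KB and 32KB clusters from the measurements correspond to 32 and 64 sectors, and a 64KB cluster (128 sectors) gives the 64KB worst case mentioned in the description.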

OGAWA Hirofumi

Aug 4, 2005, 9:40:07 PM
Karsten Wiese <annabell...@yahoo.de> writes:

> Looks better, is smaller and works equally well here, thanks.
> I had to apply it by hand though, as it was slightly scrambled
> (by my mail client?), so patch couldn't handle it.
> Please send patches as attachments.

We prefer plain text, not attachments; see Documentation/SubmittingPatches.
Anyway, thanks for the nice work.
--
OGAWA Hirofumi <hiro...@mail.parknet.co.jp>

Jan Engelhardt

Aug 5, 2005, 2:20:10 AM

>We prefer plain text, not attachments; see Documentation/SubmittingPatches.
>Anyway, thanks for the nice work.

|Exception: If your mailer is mangling patches then someone may ask
|you to re-send them using MIME.

from the doc ;)

OGAWA Hirofumi

Aug 5, 2005, 4:30:23 AM
Jan Engelhardt <jen...@linux01.gwdg.de> writes:

> |Exception: If your mailer is mangling patches then someone may ask
> |you to re-send them using MIME.
>
> from the doc ;)

Oh, sure, I missed that part :) But my mailer is actually sane.
Please double-check your mailer.
--
OGAWA Hirofumi <hiro...@mail.parknet.co.jp>
