
VMS Directory size. Are there limits?


Jerry Leichter

Nov 7, 1994, 8:37:23 AM
Some users at our installation created several thousand files in a
single directory, in one case 10800 files causing a 1400-block
directory file.

The problem: It now takes up to several seconds to delete a single file
from such a large directory. The performance seems to go
down significantly if more than about 2000 files are in
one directory. A MONI FCP shows a very high "Disk Write
Rate", indicating that the whole, or at least a large
part, of the directory is updated on disk for each file
delete.

The question: Does somebody know a fix for the DELETE performance
problem? Or is there a practical limit on directory size?

Several effects are contributing to your problems:

a. VMS caches directories. However, the cache is limited to 127
blocks. It turns out that the way caching is done, if a
directory file is over 127 blocks long, caching is turned off
for it entirely.

For many applications, you should think of 127 blocks as the
practical limit.

This particular limitation has been in VMS for a very long
time. There have been repeated rumors that a future version
of VMS would increase or eliminate the limit, but as far as
I know, not only hasn't this happened yet, but there hasn't
even been a semi-official hint from DEC that it will happen
any time soon.

b. A directory is a sequential file, with directory entries stored
in sorted order. When you create a file in a directory, VMS
tries to insert it into the appropriate block in the file.
If that block is full, it "slides" all subsequent blocks "up
by one" to create an empty block, then splits the entries in
the full block, and finally inserts the new entry. "Sliding"
the blocks over means copying them.

If all the allocated blocks in the directory are in use, VMS
has to extend the file first - which, because directories have
to be contiguous, almost always involves allocating a whole
new, larger directory and copying all the old contents.

Conversely, when you delete a file, a hole is left in the
entry's block. Normally, VMS just leaves it there for future
use. However, if the result is a completely empty directory
block, VMS "slides" all subsequent blocks "down by one" to squeeze
out the empty space. Again, "sliding" means copying.

c. A quick calculation shows that your 1400 block directory file with
10800 files in it is using an average of over 66 bytes per
entry. That's a rather high number. A directory entry
contains mainly the file name, type, version number, and
6-byte file id. Further, multiple versions of the same file
don't even repeat the name and type; all they need is the new
version number. This means your directories undoubtedly
contain large amounts of empty space, adding to your problems.

Often, you can fix this by "compressing" the directory. VMS
provides no direct way to do this, but it can be accomplished
easily by creating a new directory file and renaming
everything from the old directory to the new. Once you've done
that, you can delete the old directory and rename the new one
back to the old one's name. I'll bet that your 1400 block
directory shrinks in size by a factor of two. This should
help performance - though a 700 block directory is still too
large.
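
For what it's worth, the whole "compress by rename" dance takes only a
few lines of DCL. This is just a sketch; [USERS] and BIGDIR are
placeholder names, and it assumes BIGDIR has no subdirectories of its own:

$ ! Placeholder names throughout -- substitute your own device and directory.
$ CREATE/DIRECTORY [USERS.NEWDIR]
$ RENAME [USERS.BIGDIR]*.*;* [USERS.NEWDIR]     ! entries arrive in sorted order, densely packed
$ SET FILE/PROTECTION=(OWNER:RWED) [USERS]BIGDIR.DIR;1
$ DELETE [USERS]BIGDIR.DIR;1                    ! remove the old, bloated directory file
$ RENAME [USERS]NEWDIR.DIR [USERS]BIGDIR.DIR    ! give the compressed one the old name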

The most common cause of large directories like this is MAIL. I wrote an
article about this topic for Digital Systems Journal - then VAX Professional -
a couple of years back (April 1991). For MAIL, "compressing" the directory is
a good first step; but beyond that, the best approach is to split your mail
into multiple mail files in multiple subdirectories, and use SET FILE in mail
to move among them.
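
As a purely hypothetical illustration of that last step (the file name
below is made up), switching to an alternate mail file from inside MAIL
looks roughly like this:

$ MAIL
MAIL> SET FILE [.MAIL_1993]ARCHIVE      ! open a separate mail file
MAIL> DIRECTORY/FOLDER                  ! folders now come from ARCHIVE.MAI
MAIL> SET FILE MAIL                     ! back to the default mail file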

For applications where you can control the placement of the files, the best
answer is from the old joke: Doctor, it hurts when I lift my arm like this.
So don't lift your arm like that! To get decent performance, split files
among a number of directories, so that all of them remain below 128 blocks
in size.
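
A quick way to spot directories that have already outgrown that limit is
to list the sizes of the directory files themselves (DISK$USER below is a
placeholder for your device name); anything much over 127 blocks is a
candidate for splitting or compressing:

$ DIRECTORY/SIZE=USED DISK$USER:[000000...]*.DIR
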
-- Jerry

J...@moe.sannet.gov

Nov 7, 1994, 10:44:17 AM
In article <39kpph$h...@rs18.hrz.th-darmstadt.de>,
mue...@axp612.gsi.de (W.F.J.Mueller) writes

>
>Some users at our installation created several thousand files in a single
>directory, in one case 10800 files causing a 1400-block directory file.
>
>The problem: It now takes up to several seconds to delete a single file
> from such a large directory. The performance seems to go
> down significantly if more than about 2000 files are in one
> directory. A MONI FCP shows a very high "Disk Write Rate",
> indicating that the whole, or at least a large part, of the
> directory is updated on disk for each file delete.
>
>The question: Does somebody know a fix for the DELETE performance problem?
> Or is there a practical limit on directory size?
>

I'd suggest that the "practical limit of directory size" is less than or
equal to 128 blocks used for the directory. It is at this point that the
RMS data cache runs out of room and is "turned off". It is at this point
that a separate operation for each directory entry must be performed (at
the cost of 3 IOs if I remember correctly).

There is no way to speed up delete in this case... but, I suspect that, if
possible, deleting files from the "back-end" of the directory first would
be quickest (though still slow) once the directory has exceeded 128 blocks
used.

Perhaps storing files in BACKUP save sets or text libraries within the
directories would help?
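
For instance, if the files are plain text, something along these lines
would fold them into a single text library and still let you pull
individual members back out later (all names below are placeholders):

$ LIBRARY/CREATE/TEXT [USERS]NOTES.TLB
$ LIBRARY/INSERT/TEXT [USERS]NOTES.TLB [USERS.BIGDIR]*.TXT
$ LIBRARY/EXTRACT=SOMEFILE/OUTPUT=SOMEFILE.TXT/TEXT [USERS]NOTES.TLB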

- Jim
--
Jim McKinney | j...@library.sannet.gov (JM7707)
San Diego Data Processing Corp |
5975 Santa Fe St | Voice: 619.581.9665
San Diego, CA 92109 USA | Fax: 619.581.9606

W.F.J.Mueller

Nov 7, 1994, 3:50:57 AM

Some users at our installation created several thousand files in a single
directory, in one case 10800 files causing a 1400-block directory file.

The problem: It now takes up to several seconds to delete a single file
from such a large directory. The performance seems to go
down significantly if more than about 2000 files are in one
directory. A MONI FCP shows a very high "Disk Write Rate",
indicating that the whole, or at least a large part, of the
directory is updated on disk for each file delete.

The question: Does somebody know a fix for the DELETE performance problem?
Or is there a practical limit on directory size?

--
Walter F.J. Mueller Mail: W.F.J....@gsi.de
GSI, Abteilung KP3 Phone: +49-6151-359-2766
D-64220 Darmstadt WWW: http://www-kp3.gsi.de/www/kp3/people/mueller.html

Hein RMS van den Heuvel

Nov 7, 1994, 6:13:32 AM

In article <39kpph$h...@rs18.hrz.th-darmstadt.de>, mue...@axp612.gsi.de (W.F.J.Mueller) writes...

>
>Some users at our installation created several thousand files in a single
>directory, in one case 10800 files causing a 1400-block directory file.
>
>The problem: It now takes up to several seconds to delete a single file
> from such a large directory. The performance seems to go
> down significantly if more than about 2000 files are in one

Deleting a random file will be fast no matter how large the directory
is UNLESS... that file had the last record in use in a (512-byte)
directory block. In that case the XQP will squish out the now empty
block in a fail-safe manner by copying down all higher blocks, a block
at a time: for (vbn = empty; vbn < eof; vbn++) { read vbn+1; write vbn }
So the worst case is deleting the last in-use entry in the first directory block.
For your 1400-block directory you can expect 1399 reads + 1400 one-block writes.
Clever disks/controllers/caches will help the reads, but the 1400 writes
are likely to take 30 seconds or thereabouts!

When deleting all or most of the files from a large directory, you may want
to use a reverse-delete approach using $sort/desc on $dire/colu=1 output.
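
A rough DCL sketch of that idea (the directory name is a placeholder, and
the fixed sort-key size is an assumption you may need to adjust):

$ ! List full file specifications, one per line, then sort them descending.
$ DIRECTORY/COLUMNS=1/NOHEADING/NOTRAILING/OUTPUT=DELLIST.TMP [USERS.BIGDIR]*.*;*
$ SORT/KEY=(POSITION:1,SIZE:80,DESCENDING) DELLIST.TMP DELLIST.REV
$ ! Delete in reverse order, so the back of the directory empties first.
$ OPEN/READ LIST DELLIST.REV
$DEL_LOOP:
$ READ/END_OF_FILE=DEL_DONE LIST SPEC
$ DELETE 'SPEC'
$ GOTO DEL_LOOP
$DEL_DONE:
$ CLOSE LIST
$ DELETE DELLIST.TMP;*,DELLIST.REV;*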

Hope this helps,              +--------------------------------------+
                              | All opinions expressed are mine, and |
Hein van den Heuvel, Digital  | may not reflect those of my employer |
vanden...@eps.enet.dec.com    +--------------------------------------+

Carl J Lydick

Nov 9, 1994, 7:03:46 AM
In article <9411070744...@MOE.SANNET.GOV>, J...@MOE.SANNET.GOV writes:
=I'd suggest that the "practical limit of directory size" is less than or
=equal to 128 blocks used for the directory. It is at this point that the
=RMS data cache runs out of room and is "turned off". It is at this point
=that a separate operation for each directory entry must be performed (at
=the cost of 3 IOs if I remember correctly).
=
=There is no way to speed up delete in this case... but, I suspect that, if
=possible, deleting files from the "back-end" of the directory first would
=be quickest (though still slow) once the directory has exceeded 128 blocks
=used.

Gee. You've just contradicted yourself there:
    "There's no way to speed up delete...."
    "deleting files from the `back-end' of the directory first would be
    quickest."
Could you please try, in the future, to at least make your bullshit
self-consistent?
--------------------------------------------------------------------------------
Carl J Lydick | INTERnet: CA...@SOL1.GPS.CALTECH.EDU | NSI/HEPnet: SOL1::CARL

Disclaimer: Hey, I understand VAXen and VMS. That's what I get paid for. My
understanding of astronomy is purely at the amateur level (or below). So
unless what I'm saying is directly related to VAX/VMS, don't hold me or my
organization responsible for it. If it IS related to VAX/VMS, you can try to
hold me responsible for it, but my organization had nothing to do with it.

Hein RMS van den Heuvel

Nov 9, 1994, 5:27:19 AM

>In article <9411070744...@MOE.SANNET.GOV>, J...@MOE.SANNET.GOV writes:
>I'd suggest that the "practical limit of directory size" is less than or
>equal to 128 blocks used for the directory. It is at this point that the
>RMS data cache runs out of room and is "turned off". It is at this point
>that a separate operation for each directory entry must be performed (at
>the cost of 3 IOs if I remember correctly).

A number of folks have replied mentioning the RMS Directory cache limit of
64Kb (127 blocks). I'd just like to set the record straight: this does NOT
influence the speed (slowness?) of a DELETE. It does play a role in wildcard
file searches, which are often executed in close proximity to such deletes.

If a particular application uses straight file access and/or delete with
no wildcard lookups in place at all (no overt SYS$PARSE+SYS$SEARCH, nor
a hidden one in F$SEARCH, LIB$FIND_FILE or LIB$FILE_SCAN (sp?)) then the
RMS cache size is of no consequence and there is no performance 'cliff'
at 127 blocks. One such application is VMSmail. It performs quite similarly
whether a directory is, say, 125 blocks or 129 blocks.
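
To make the distinction concrete, a loop like the one below (names are
placeholders) is the kind of hidden wildcard lookup that does benefit from
the cache; a plain DELETE of a known file specification does not go through it:

$LOOKUP:
$ F = F$SEARCH("[USERS.BIGDIR]*.DAT;*")
$ IF F .EQS. "" THEN GOTO LOOKUP_DONE
$ WRITE SYS$OUTPUT F
$ GOTO LOOKUP
$LOOKUP_DONE: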

Yes, by all means keep your directories short & dense.
Yes, 127 blocks is a nice number to aim for.
No, do not panic whenever a directory grows over 127.
Yes, there are applications out there that perform at acceptable
speeds using directories of hundreds or even thousands of blocks.
Yes, those applications often did a whole lot of thinking to get there.


Do not panic,

Ted Crane

Mar 14, 2023, 1:17:07 PM
OK. So 30 years have passed us by on this topic. I have no idea whether VMS ever offered a solution to this problem--my systems are still running VMS 7.0 and 7.1--but I'll offer a DCL procedure I created about 30 years ago to "compress" MAIL directories. I've written better stuff since, but this still works.

It's worth noting that the .MAI files are moved to 256 smaller directories, starting from the "end" of the main MAIL directory. The moving process is slow at first and gets much faster toward the end. The process is reversed when restoring the original MAIL directory...and proceeds much more quickly.

Side comment: It's too bad that VMS MAIL didn't implement the kind of directory structure this procedure uses. It would have made MAIL far less vulnerable to large directories.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

$ inquire SURE "Go directly to restore"
$ if SURE then goto RESTORE
$
$ if f$search("[SITE.LOGIN]MAIL2.DIR").eqs.""
$ then
$   create/directory/log [SITE.LOGIN.MAIL2]
$ else
$   dir [SITE.LOGIN.MAIL2...]/exc=*.DIR
$   inquire SURE "Is nothing in MAIL2? Continue"
$   if .not.SURE then exit
$!  The delete is repeated; extra passes are harmless and pick up any nested directory levels
$   delete/log [SITE.LOGIN.MAIL2...]*.DIR;
$   delete/log [SITE.LOGIN.MAIL2...]*.DIR;
$   delete/log [SITE.LOGIN.MAIL2...]*.DIR;
$ endif
$
$ set noon
$
$!
$! Create the required temporary subdirectories
$! Save individual mail files from MAIL to MAIL2
$!
$ NUM = 255
$LOOP_SAVE:
$ TEXT = f$fao("!2XL",NUM)
$ show time
$ create/directory/allocation=16 [SITE.LOGIN.MAIL2.'TEXT']
$ rename/log [SITE.LOGIN.MAIL]MAIL$'TEXT'*.MAI [SITE.LOGIN.MAIL2.'TEXT']
$ NUM = NUM - 1
$ if NUM.ge.0 then goto LOOP_SAVE
$
$!
$! Save remaining stuff from MAIL to MAIL2
$!
$ rename/log [SITE.LOGIN.MAIL]*.* [SITE.LOGIN.MAIL2]
$
$RESTORE:
$
$!
$! Confirm that we should proceed backwards
$!
$ dir [SITE.LOGIN.MAIL...]
$ show time
$ inquire SURE "Is nothing in MAIL? Continue"
$ if .not.SURE then exit
$
$!
$! Restore the root file first, just in case
$!
$ rename/log [SITE.LOGIN.MAIL2]MAIL.MAI; [SITE.LOGIN.MAIL]
$
$!
$! Restore individual mail files from MAIL2 to MAIL
$! Remove the temporary subdirectories
$!
$ NUM = 0
$LOOP_RESTORE:
$ TEXT = f$fao("!2XL",NUM)
$ show time
$ if f$search("[SITE.LOGIN.MAIL2]''TEXT'.DIR").nes.""
$ then
$   rename/log [SITE.LOGIN.MAIL2.'TEXT']MAIL$'TEXT'*.MAI [SITE.LOGIN.MAIL]
$   delete/log [SITE.LOGIN.MAIL2]'TEXT'.DIR;
$ endif
$ NUM = NUM + 1
$ if NUM.lt.256 then goto LOOP_RESTORE
$
$!
$! Restore remaining stuff from MAIL2 to MAIL
$!
$ rename/log [SITE.LOGIN.MAIL2]*.* [SITE.LOGIN.MAIL]
$
$!
$! Remove temporary directory
$!
$ if f$search("[SITE.LOGIN.MAIL2...]*.*;*").eqs.""
$ then
$   set file/protection=(world:rwed) [SITE.LOGIN]MAIL2.dir;
$   delete/log [SITE.LOGIN]MAIL2.dir;
$ else
$   write sys$output "MAIL2 is not empty!"
$ endif
$
$ exit