del a*.*;*
del a*.txt;*
del a*.txt;1
Thank you for your help
Mike
Mike,
The answer depends on what else is in the directory. In the degenerate
case (only a.txt;1 is present), the differences are minuscule. If
there are thousands of files in the directory, the answer is
different.
There is also a semantic problem. Specifying ";1" will not have the
desired result if a command procedure is interrupted, leaving the file
behind. Specifying the wildcard (e.g., ";*") deletes ALL versions.
(Specifying ";0" only deletes the highest numbered version).
In general, I advise my clients to exercise EXTREME CAUTION with this
type of coding, as it is quite easy to create mayhem if two different
processes are executing the code in the same directory.
- Bob Gezelter, http://www.rlgsc.com
All things being equal (i.e. the only files in the directory that
match the a*.*;* wildcard also match a*.txt;1), I'd expect no significant
performance difference.
The real work is going to be the disk I/O writing directory contents
back to disk. Reading directory entries into cache and parsing
and searching directory entries from cache is unlikely to be the
bottleneck.
Why do you ask?
Historically, the thing that absolutely kills delete performance is
the "bubble down" that can take place if you delete the last directory
entry in a block near the front end of a _HUGE_ .DIR file.
Various tweaks over the years have improved this behavior by orders
of magnitude. If it's still an issue for you, a reverse-alphabetical-order
delete is one thing that can sometimes be of use.
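The "bubble down" cost is easy to see with a toy model (a sketch in Python, not VMS code): treat the directory as a flat array of entries where deleting an entry shifts everything behind it forward.

```python
# Toy model of directory "bubble down": deleting an entry shifts every
# later entry toward the front, so front-to-back deletion is quadratic
# in entry moves while back-to-front deletion shifts nothing.
def shuffle_cost(n_entries, order):
    """Count entry moves when deleting all entries in the given order.

    order: "forward" deletes slot 0 repeatedly; "reverse" deletes the
    last slot repeatedly.  Returns total entries shifted.
    """
    entries = n_entries
    moves = 0
    for _ in range(n_entries):
        if order == "forward":
            moves += entries - 1   # everything behind slot 0 shifts down
        entries -= 1               # reverse order: last slot, no shifting
    return moves

print(shuffle_cost(1000, "forward"))  # 499500 shifts
print(shuffle_cost(1000, "reverse"))  # 0 shifts
```

The real file system works in whole blocks rather than single entries, but the asymmetry between deleting at the front and at the back is the same.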
I think that a non-wildcarded delete will be the fastest since it can do
a direct lookup. (but this is not in your example)
In the above cases, the delete command would still have to sequentially
scan all files in the directory beginning with "a" and then see if it
matches the mask. Obviously, the more "*" you have in a mask, the more
CPU will be needed to decide if a full file spec matches the wildcard,
but unless you are running an All Mighty MicroVAX II, you might not see
any difference, since the delete command will spend most of its time in IO
and the CPU time needed to check a string against a wildcard
specification is fairly trivial.
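To illustrate the point (not VMS code, just a sketch with Python's `fnmatch`, which uses similar glob-style wildcards): matching a name against a pattern is cheap string work, dwarfed by any disk I/O.

```python
# Matching filenames against a wildcard pattern is cheap per-name string
# work; extra "*"s add a little backtracking but nothing comparable to
# the cost of a single disk write.
from fnmatch import fnmatch

names = [f"a{i}.txt;1" for i in range(5)] + ["b0.log;1"]
hits = [n for n in names if fnmatch(n, "a*.txt;*")]
print(hits)  # the five a*.txt;1 names
```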
Thank you for your help....
Mike
<bri...@encompasserve.org> wrote in message
news:iO4j52...@eisner.encompasserve.org...
Top posting. *sigh*.
In any case, you've hit the "bubble down" problem that I alluded to.
And, because you want to delete the first 30,000 files in a 200,000
file directory, reverse alphabetical order isn't going to do
much for you.
Hmmmm...
There _is_ a sneaky approach.
How about if instead of
$ delete a*.txt;1
you
$ rename a*.txt;1 *.*;2 /log
That should run pretty fast because the directory entries can be
modified in place. There won't be any "bubble down".
You can press control-Y when you come to the 30,000th file.
Then you can go back in and ftp all of your remaining version 1 files.
ftp> mput a*.txt;1
When you're finished you can do a reverse alphabetical order delete
on the whole directory.
Note also that if you'd like to delete *all* files in the
directory, the DELETE option in DFU is (claimed to be) much
faster than DCL DELETE.
Sorry for the top posting....the group I spend most of my time in prefers
it that way....
Anyway,
BRILLIANT!.....the rename makes a lot of sense and I will give that a try.
Thanks!
Mike
So those files are 'early on' in the directory.
'In place' renaming to ;2 for easy exclusion with FTP is not a bad
thought!
What OpenVMS version?
The problem is somewhat similar to one discussed in:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=1165943
You may want to check out my rename suggestion there.
It does a double rename to allow the system to always take from, and
add to, the end.
This avoids the expensive 1-block shuffle up to make room when
inserting an 'early' file and the
equally expensive shuffle down when the last entry from vbn 1 is
removed.
You may want to adjust it to pre-establish 5K - 25K chunks of files
to be dealt with.
Cheers,
Hein.
Mike,
200,000+ files in one directory is so ridiculous I can scarcely imagine
anyone doing it. You are now finding out just one of the reasons why
it's ridiculous.
Can you just INIT the disk and start over? Or back up everything else,
INIT and restore?
The last time I had to cope with something like this was eight or nine
years ago at McGraw-Hill. Some woman had gone on maternity leave. She
had some sort of self resubmitting job that just went right on creating
these files.... ISTR that, by the time we discovered it, the directory
was well over 2000 blocks in size! It took DAYS to delete all those files!
VMS's handling of .DIR files is much better today than 8-9 years ago,
if I'm not wrong. And, as I wrote in another post, DFU has some tools
such as bulk erase of a dir or dir-tree and compress/defrag of .DIR
files to make the "cleaning job" easier. Re-init of the disk should
not be needed on a reasonable new VMS version and using reasonable
modern disk subsystems.
Reverse sorting makes DELETE run oodles faster.
--
Ron Johnson, Jr.
Jefferson LA USA
Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!
...; however, be VERY CAREFUL which DFU version you use.
V2.7-1 can't handle big trees. It will crash and leave a mess to clean up,
although ANALYZE/DISK/REPAIR handles it nicely.
--
David J Dachtera
dba DJE Systems
http://www.djesys.com/
Unofficial OpenVMS Marketing Home Page
http://www.djesys.com/vms/market/
Unofficial Affordable OpenVMS Home Page:
http://www.djesys.com/vms/soho/
Unofficial OpenVMS-IA32 Home Page:
http://www.djesys.com/vms/ia32/
Unofficial OpenVMS Hobbyist Support Page:
http://www.djesys.com/vms/support/
Really??!! Which group is that? Even the UN*X groups soundly thrash people for
top posting.
As I think you've discovered, the answer is, "it depends". Lots of ;1 files
means lots of directory entries - not necessarily a good or bad thing, just
something to consider, in light of what it takes to delete a single version file
at the beginning of a large directory.
An unusual part of VMS directories is that a single directory entry can represent
multiple versions of a "name.ext". Pull a small directory into EDT sometime
(*PLEASE* use /READ!!!) and check it out.
Each directory record begins with a length attribute (but you won't see that
because EDT does RECORD I/O!), then a version limit, some binary fields before
the "name.ext", then the version numbers and FIDs of the various versions out to
the end of the record.
I don't have the code at hand just now, but I wrote a FILCNT.COM to count the
number of files in a directory simply by reading the directory. It never hits
the file headers, and so is a bit faster than trying to use the DIRECTORY
command using only the default qualifiers (/HEADING, /TRAILING). It does only
one READ for each directory entry, then calculates the number of versions
represented by the entry.
The code *IS* in http://www.djesys.com/freeware/vms/4038_freeware.zip
I went ahead and extracted the code:
$ open/read/share=write dir &p1
$ filcnt_l = 0
$read_loop:
$ read/end=eof_dir dir p9
$ namlen_s = f$extr( 3, 1, p9 )
$ namlen_l = f$cvui( 0, 8, namlen_s )
$ versns_l = (f$length( p9 ) - 4 - namlen_l) / 8
$ filcnt_l = filcnt_l + versns_l
$ goto read_loop
$eof_dir:
$ close dir
$ show symbol filcnt_l
$ exit
I suppose you could make "filcnt_l" (file count, long) a global symbol and use
it for another purpose.
Example usage:
$ @filcnt mydir.dir
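The same per-record arithmetic, sketched in Python on a synthetic record (assuming the layout the DCL above relies on: a 4-byte prefix whose last byte is the name length, the name, then one 8-byte cell per version):

```python
# Mirror of the FILCNT.COM arithmetic: versions in a directory record =
# (record length - 4-byte prefix - name length) / 8 bytes per version
# cell (2-byte version number + 6-byte FID).
def versions_in_record(record: bytes) -> int:
    name_len = record[3]          # same byte FILCNT pulls with f$extr(3,1,...)
    return (len(record) - 4 - name_len) // 8

# Synthetic record: 4-byte prefix, 5-char name "A.TXT", three 8-byte cells.
rec = bytes([0, 0, 0, 5]) + b"A.TXT" + b"\x00" * 24
print(versions_in_record(rec))  # 3
```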
I was just sitting here watching defrag run on my W98-SE machine, thinking about
various - totally UNSUPPORTED!!! - ways to accomplish such things as deleting
files the way you said you needed to in your response to Mr. Briggs. Creative,
but not recommendable. Involves doing things folks here would scoff at.
You could, of course, employ the RENAME to ;2 strategy, then start up another
FTP and kick off a delete of the ;2 files in batch. That's fully
supported/-able.
The search for a*.*;* should be the fastest way. If the first letter
matches, the rest can be ignored and the file can be deleted. For
a*.txt;* the file extension must match too, and in the third case OpenVMS
has to check for three matches.
Best regards Rudolf Wingert.
> 200,000+ files in one directory is so ridiculous I can scarcely imagine anyone doing it.
> You are now finding out just one of the reasons why it's ridiculous.
>
> Can you just INIT the disk and start over? Or back up everything else, INIT and
> restore?
A bit drastic, perhaps. I would make a load of directories
(say taking the last 2-3 digits of the version number). Then just
SET FILE/NODIR and DELETE the original. Making sure I
had a good backup just in case.
Mike Minor
"David J Dachtera" <djes...@spam.comcast.net> wrote in message
news:4715588B...@spam.comcast.net...
Thank you,
Mike Minor
David,
Are you sure you ran DIR without file-header-hitting quals? I find
DIRECTORY/TOTAL works much faster than DIR/TOTAL where DIR is a symbol
defined to be something like the typical DIRECTORY/SIZE/DATE/
PROTECTION. Apparently, DIR/SIZE/DATE/PROT/TOTAL and similar commands
hit the file headers just as if /TOTAL weren't there. So try it being
sure you use JUST DIRECTORY/TOTAL.
I just tried your program on a huge directory and DIRECTORY/TOTAL runs
a little faster.
[...]
AEF
Can you please explain what this means? Make a load of directories and
do what?
If you SET FILE/NODIR on a directory file that has files in it, the
files will still be on the disk and have to be recovered via ANAL/DISK/
REPAIR and then deleted from [SYSLOST].
AEF
>> A bit drastic, perhaps. I would make a load of directories
>> (say taking the last 2-3 digits of the version number). Then just
>> SET FILE/NODIR and DELETE the original. Making sure I
>> had a good backup just in case.
>
> Can you please explain what this means? Make a load of directories and
> do what?
Make alias entries for the files. Sorry, I rather overedited the previous post.
Sorry, I meant something like DIR/DATE/TOTAL, or with any other
qualifiers that don't produce output with /TOTAL.
AEF
Reading between the lines I took his approach to be:
Hash the file names from the original directory to split them into
a bunch of separate buckets. Suggested hash functions are
version number mod 1000 or version number mod 100.
Create a directory for each such bucket
$ SET FILE /ENTER each file from the original directory so it has
a new entry in the chosen target directory.
Nuke the original directory rather than deleting from it piecemeal.
My impression is that the files in the case at hand are all version 1,
so hashing them based on version number is a poor idea.
I also get the impression that the original poster is trying to get
his 200,000 files migrated to another system, probably in preparation
to deleting them all anyway.
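The bucketing interpretation above can be sketched like this (a hypothetical plan generator, not the poster's actual commands; bucket directory names are invented, and it hashes on the name rather than the version number since every file here is ";1"):

```python
# Sketch of the bucketing idea: hash each name into one of N new
# directories so every new directory stays small, then nuke the original
# wholesale instead of shrinking it entry by entry.
import zlib

def bucket_for(name: str, n_buckets: int = 100) -> str:
    # crc32 gives a stable hash; version-number hashing would put all
    # the ";1" files in one bucket, which is the poor idea noted above.
    return f"[.BUCKET{zlib.crc32(name.encode()) % n_buckets:03d}]"

names = [f"A{i:06d}.TXT;1" for i in range(10)]
plan = {n: bucket_for(n) for n in names}
for n, d in plan.items():
    print(f"SET FILE /ENTER={d}{n} {n}")
```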
That depends on how many files fit the different patterns, of course,
but I assume that's not what your point is.
There is some work needed to process the wildcard and look for all
the possible matches. If this actually becomes measurable it's
probably being much more heavily influenced by other issues, such as
a directory file larger than the directory cache.
This is a good reason to get and use DFU, but in the meantime you
may get faster results via backup/delete to the null device.
That's a good way to:
a) temporarily lose disk space since all the files in the directory
are still on the disk, just not entered into a directory
b) make the next anal/disk/repair really slow as it enters all those
lost files in [syslost]
c) move the problem to having to delete all those files from
[syslost] instead of their original directory
Also,
delete az*.*;*
delete ay*.*;*
delete ax*.*;*
...
delete ac*.*;*
delete ab*.*;*
delete aa*.*;*
would be faster since it would begin the deletes further down the list
and while not a full "reverse order" delete, it would reduce the amount
of shuffling it needs to do for each delete.
Also, make sure your volume is not set to "erase on delete" as this will
greatly slow down deletes. (SHOW DEV <disk>/FULL will tell you if it is
set or not.) (SET VOLUME is the command to set/unset that feature.)
> I have a directory with 200000+files, all in the a*.txt;1 range. I need to
> ftp these files to another server.
Why not put them in a backup saveset or zip archive before the transfer?
Because once you are in this situation it is too late?
You can put them in an archive, but the removal will still cost as
much.
JF wrote...
> Have you considered renaming the files needing to be deleted to a
> different directory, and then deleting them in that directory where the
> size will be more manageable ?
That's what I suggested early on, along with the hint to do a double
rename, making sure only to take from the end, and add to the end.
I even included a pointer to a working example in Perl.
But in the mean time I hacked up something cute....
Attached a tool which can split a directory in two parts in units of
disk-clusters.
It takes the time of a file create and a handful of IOs for ANY
number of files.
10 files or 10,000 files move literally just as quickly, like sleight
of hand.
Perfect scaling! :-)
It's all done with smoke and mirrors involving file headers and
mapping pointers.
Very minimal testing to date... just with empty files on a small LD device.
I did test the cluster size code, but I really only tested 1 Retrieval
Pointer format for now.
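The core of the split is just extent arithmetic. A sketch of what happens to a FORMAT1 map pointer (assuming one contiguous extent described as a plain (start LBN, block count) pair; the real FAT count fields store count minus one, which the sketch ignores):

```python
# Split one contiguous extent: the first `split` blocks are bequeathed
# to the new file's header, and the source extent's start LBN moves up
# by the same amount while its count shrinks.
def split_extent(start_lbn: int, count: int, split: int):
    if split <= 0 or split >= count:
        raise ValueError("split must fall strictly inside the extent")
    new_file = (start_lbn, split)               # front blocks, new header
    source = (start_lbn + split, count - split)  # remainder stays put
    return new_file, source

print(split_extent(1000, 64, 16))  # ((1000, 16), (1016, 48))
```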
Check this out though... :-)
$ ld create sys$login:lda4.disk /size=10000
$ ld connec sys$login:lda4.disk lda4:
$ init lda4: lda4
$ moun lda4: lda4
$ create/dir lda4:[A]
$ perl -e "foreach $i (1..1000) { open X,"">lda4:[A]${i}_blah_blah_blah_${i}""}" ! A little random order
$ dir LDA4:[A]
Directory LDA4:[A]
1000_BLAH_BLAH_BLAH_1000.;1 100_BLAH_BLAH_BLAH_100.;1
101_BLAH_BLAH_BLAH_101.;1 102_BLAH_BLAH_BLAH_102.;1
103_BLAH_BLAH_BLAH_103.;1 104_BLAH_BLAH_BLAH_104.;1
:
998_BLAH_BLAH_BLAH_998.;1 999_BLAH_BLAH_BLAH_999.;1
99_BLAH_BLAH_BLAH_99.;1 9_BLAH_BLAH_BLAH_9.;1
Total of 1000 files.
$ mcr dev:[disk]SPLIT_DIRECTORY lda4:[000000]a.dir lda4:[000000]b.dir 9
! First filename after split: 161_BLAH_BLAH_BLAH_161.;1
$ dir lda4:[a]
Directory LDA4:[A]
161_BLAH_BLAH_BLAH_161.;1 162_BLAH_BLAH_BLAH_162.;1
163_BLAH_BLAH_BLAH_163.;1 164_BLAH_BLAH_BLAH_164.;1
:
99_BLAH_BLAH_BLAH_99.;1 9_BLAH_BLAH_BLAH_9.;1
Total of 932 files.
$ dir lda4:[b]
Directory LDA4:[B]
1000_BLAH_BLAH_BLAH_1000.;1 100_BLAH_BLAH_BLAH_100.;1
:
158_BLAH_BLAH_BLAH_158.;1 159_BLAH_BLAH_BLAH_159.;1
15_BLAH_BLAH_BLAH_15.;1 160_BLAH_BLAH_BLAH_160.;1
Total of 68 files.
$ mcr dev:[disk]SPLIT_DIRECTORY lda4:[000000]a.dir lda4:[000000]c.dir 20
! First filename after split: 291_BLAH_BLAH_BLAH_291.;1
$ dir/total lda4:[*...]
Directory LDA4:[A]
Total of 788 files.
Directory LDA4:[B]
Total of 68 files.
Directory LDA4:[C]
Total of 144 files.
Grand total of 3 directories, 1000 files.
I'll check more configurations if there (ever) is a business
justification.
In the mean time, if I was in a crunch and needed a tool like this
then I would
- mark the directory no-dir
- take a copy
- re-mark as directory
Hope this helps someone, somewhere, somehow
Please let me know if it does!
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting
/*
** split_directory.c
** Copyright ... Hein van den Heuvel, Oct 2007
**
** This program can be used to split a directory into two, for faster
** deletes and renames. This is mostly just a fun exercise, but it could
** come in handy some day. This program's workings are relatively simple
** because directory files are contiguous. So there is just one mapping
** pointer and extension headers are not likely (!-). We can take a number
** of blocks (a multiple of the cluster size) from the bottom of the mapped
** area and bequeath them to an other file. Then adjust that mapping
** pointer and the EOF and such and be done with just a file create and a
** handful of IOs.
**
** Method:
** 1) create an empty normal file, on the selected disk
**    The main data in the file header for this file will be replaced
**    by adjusted data from the source directory.
** 2) copy the file header attributes from the source directory,
**    over the target file header.
**    - set eof-block to selected high-block
**    - adjust mapping pointer (just one... directories are contiguous)
**    - adjust file-id to clone file id
**    - re-calculate checksum and write out
** 3) Write out fresh split.
** 4) On success touch up original and write out.
** 5) According to the very last page in Kirby McCoy's VMS File System
**    Internals, the vbn write to indexf.sys will trigger a flush of the
**    associated caches: 8.6.7 "User Invalidation of Cached Buffers".
**    Seems almost too easy. No poking of the volume lock needed to cause
**    the RM$DIRCACHE_BLKAST, no poking of the file serialization lock
**    for the (source) directory?!
**
**
** Enjoy!
** Hein, HvdH Performance Consulting
**
*/
/*
** cc SPLIT_DIRECTORY.C+SYS$COMMON:[SYSLIB]SYS$LIB_C.TLB/lib
**
** libr/extr=fatdef/out=fatdef.h sys$library:sys$lib_c.tlb
*/
#include fh2def
#include fatdef
#include fm2def
#include rms
#include stdio
#include stdlib
#include string
#include dvidef
#include ssdef
typedef struct { short len, cod; void *address; int *retlen; } item;
int sys$open(), sys$connect(), sys$read(), sys$write(), sys$close();
int sys$create(), sys$parse(), sys$search(), sys$erase();
int sys$getdvi(), lib$spawn();
main(argc,argv)
int argc;
char *argv[];
{
int checksum, i, spawn_status;
FAT *source_fat, *target_fat;
unsigned char *p;
union {
unsigned int ebk;
struct {
unsigned short int lo; /* high order word */
unsigned short int hi; /* low order word */
} words;
} ebk;
static unsigned short source_header[256], target_header[256];
    static char *usage = "Usage: $ split_directory old_name new_name <blocks_to_split>\n";
static char esa[256], rsa[256], command[256];
static int status, channel, bytes, blocks_to_split=0, vbn=1;
static int file_hbk, file_nbytes, spec_nbytes;
    static int index_file_id_offset, index_file_bitmap_size, index_file_bitmap_vbn;
static int maxfiles, cluster, source_fid, target_fid;
static struct FAB fab;
static struct RAB rab;
static struct NAM nam;
// static struct XABFHC fhc;
FH2 *source_fh2, *target_fh2;
FM2 *source_fm2, *target_fm2;
item getdvi_items[] = { 4, DVI$_MAXFILES, &maxfiles, 0,
4, DVI$_CLUSTER, &cluster, 0,
0, 0, 0, 0 } ;
struct { int len; char *addr; } devnam_desc, command_desc;
/******************************************************************************/
/* Verify that we've been properly invoked */
if (argc != 4) printf("%s",usage), exit(1);
    /* Use RMS to parse the file so that we get a FID for the header clone */
fab = cc$rms_fab;
    fab.fab$b_shr = FAB$M_NIL; /* want to be alone for this */
    fab.fab$b_fac = FAB$M_PUT | FAB$M_GET | FAB$M_BIO; /* not really... */
fab.fab$l_fna = argv[1];
fab.fab$b_fns = strlen (argv[1]);
fab.fab$l_nam = &nam;
// fab.fab$l_xab = &fhc;
// fhc = cc$rms_xabfhc;
nam = cc$rms_nam;
nam.nam$l_esa = esa;
nam.nam$b_ess = sizeof (esa) - 1;
nam.nam$l_rsa = rsa;
nam.nam$b_rss = sizeof (rsa) - 1;
rab = cc$rms_rab;
rab.rab$l_fab = &fab;
rab.rab$w_usz = 512;
/*
** Pick up the file ID for the source file...
** re-use the FAB and NAM for target later
*/
status=sys$parse(&fab);
if (status & 1 ) status=sys$search(&fab);
if (status & 1 ) status=sys$open(&fab);
if (!(status & 1 )) return status;
source_fid = nam.nam$b_fid_nmx << 16;
source_fid += nam.nam$w_fid_num;
/*
** Get maxfile and cluster size from GETDVI, in order to calculate
** the offset to apply to the file ID to get the VBN in indexf.sys
*/
devnam_desc.addr = nam.nam$l_dev;
devnam_desc.len = nam.nam$b_dev;
status = sys$getdvi ( 0, 0, &devnam_desc, getdvi_items,0,0,0,0);
index_file_id_offset = 4 * cluster + ( maxfiles/4096 ) + 1;
blocks_to_split = atoi(argv[3]);
    if (!blocks_to_split || blocks_to_split % cluster ) {
        printf ("blocks_to_split (%d) must be a multiple of the device"
                " clustersize (%d).\n", blocks_to_split, cluster);
        printf ("(Yeah, I could round up for you, but this needs to be"
                " a conscious choice.)\n");
        return (16);
    }
/*
** EBK check replace by reading beyond split point.
** if (fhc.xab$l_ebk < blocks_to_split) return ( RMS$_EOF );
*/
if (!fab.fab$v_ctg) return ( SS$_FILNOTCNTG );
rab.rab$l_bkt = blocks_to_split + 1;
p = (void *) source_header;
rab.rab$l_ubf = (void *) p;
status = sys$connect(&rab);
if (status & 1 ) status = sys$read(&rab);
if (status & 1 ) status = sys$close(&fab);
if (!(status & 1 )) return status;
i = p[5]; // DIR$B_NAME_COUNT
printf ("! First filename after split: %*s;%d\n",
i, &p[6], source_header[(6+i+1)/2] );
/*
** Re-use the FAB and NAM to create a target file.
** Must be on the same disk.
** We'll use this header to clone the target header into.
** Close it and stash away its file ID.
*/
nam.nam$w_fid_num = 0;
nam.nam$w_fid_seq = 0;
nam.nam$b_fid_nmx = 0;
fab.fab$l_fna = argv[2];
fab.fab$b_fns = strlen (argv[2]);
fab.fab$l_dna = nam.nam$l_dev;
fab.fab$b_dns = nam.nam$b_dev;
fab.fab$l_alq = 0;
status = sys$create(&fab);
if (status & 1) status = sys$close(&fab);
if (!(status & 1 )) return status;
target_fid = nam.nam$b_fid_nmx << 16;
target_fid += nam.nam$w_fid_num;
/*
** re-use the FAB and NAM again to open INDEXF.SYS (id=1,1)
*/
nam.nam$w_fid_num = 1;
nam.nam$w_fid_seq = 1;
nam.nam$b_fid_nmx = 0;
fab.fab$l_fop = FAB$M_NAM;
fab.fab$b_shr = FAB$M_UPI | FAB$M_SHRPUT | FAB$M_SHRGET;
status = sys$open(&fab);
if (status & 1 ) status = sys$connect(&rab);
if (!(status & 1 )) return status;
/*
** Read original header. UBF already set up.
*/
rab.rab$l_bkt = source_fid + index_file_id_offset;
status = sys$read(&rab);
if (!(status & 1 )) return status;
/*
** Read target header.
*/
rab.rab$l_bkt = target_fid + index_file_id_offset;
rab.rab$l_ubf = (void *) target_header;
status = sys$read(&rab);
if (!(status & 1 )) return status;
/*
** Copy record attribute area
*/
source_fh2 = (void *) source_header;
target_fh2 = (void *) target_header;
source_fat = (void *) &source_fh2->fh2$w_recattr;
target_fat = (void *) &target_fh2->fh2$w_recattr;
for (i = 10; i<(sizeof (FAT) / 2); i++) {
target_header[i] = source_header[i];
}
target_fh2->fh2$l_filechar = source_fh2->fh2$l_filechar;
/*
** Set the adjusted, word swapped, End-Of-File-Blocks.
*/
ebk.words.lo = source_fat->fat$w_efblkl;
ebk.words.hi = source_fat->fat$w_efblkh;
ebk.ebk -= blocks_to_split;
source_fat->fat$w_efblkl = ebk.words.lo;
source_fat->fat$w_efblkh = ebk.words.hi;
ebk.words.lo = source_fat->fat$w_hiblkl;
ebk.words.hi = source_fat->fat$w_hiblkh;
ebk.ebk -= blocks_to_split;
source_fat->fat$w_hiblkl = ebk.words.lo;
source_fat->fat$w_hiblkh = ebk.words.hi;
ebk.ebk = blocks_to_split + 1;
target_fat->fat$w_efblkl = ebk.words.lo;
target_fat->fat$w_efblkh = ebk.words.hi;
target_fat->fat$w_hiblkl = ebk.words.lo;
target_fat->fat$w_hiblkh = ebk.words.hi;
target_fh2->fh2$l_highwater = ebk.ebk;
/*
** Now for the tricky part... the mapping pointer.
*/
int mpoffset, map_inuse, lbn, count;
mpoffset = source_fh2->fh2$b_mpoffset;
map_inuse = source_fh2->fh2$b_map_inuse;
source_fm2 = (void *) &source_header[mpoffset];
target_fm2 = (void *) &target_header[mpoffset];
if ( target_fh2->fh2$b_map_inuse ) return (SS$_BADFILEHDR);
target_fh2->fh2$b_map_inuse = map_inuse;
for (i = mpoffset; i < (mpoffset + map_inuse); i++) {
target_header[i] = source_header[i];
}
target_fh2->fh2$l_filechar = source_fh2->fh2$l_filechar;
switch (source_fm2->fm2$v_format) {
case FM2$C_FORMAT1:
        lbn = source_fm2->fm2$w_lowlbn + (source_fm2->fm2$v_highlbn << 16);
lbn += blocks_to_split; // That had better fit!
source_fm2->fm2$w_lowlbn = lbn & 0xFFFF;
source_fm2->fm2$v_highlbn = lbn >> 16;
source_fm2->fm2$b_count1 -= blocks_to_split;
target_fm2->fm2$b_count1 = blocks_to_split - 1;
break;
case FM2$C_FORMAT2:
((FM2_1 *) source_fm2)->fm2$l_lbn2 += blocks_to_split;
source_fm2->fm2$v_count2 -= blocks_to_split;
target_fm2->fm2$v_count2 = blocks_to_split - 1;
break;
case FM2$C_FORMAT3:
((FM2_2 *) source_fm2)->fm2$l_lbn3 += blocks_to_split;
        count = ((FM2_2 *) source_fm2)->fm2$w_lowcount + (source_fm2->fm2$v_count2 << 16);
count -= blocks_to_split;
((FM2_2 *) source_fm2)->fm2$w_lowcount = count & 0xFFFF;
source_fm2->fm2$v_count2 = count >> 16;
count = blocks_to_split - 1;
((FM2_2 *) target_fm2)->fm2$w_lowcount = count & 0xFFFF;
target_fm2->fm2$v_count2 = count >> 16;
break;
case FM2$C_PLACEMENT:
printf ("Don't want to deal with placement headers.\n");
return SS$_BADFILEHDR;
break;
}
/*
** Write out target header first, in case that is a problem.
** It was the last read, RAB still set up for BKT, RBF, RSZ.
*/
checksum = 0;
for (i = 0; i<255; i++) {
checksum += target_header[i];
}
target_header[i] = checksum & 0xFFFF;
status = sys$write(&rab);
if (!(status & 1)) return status;
/*
** Then write out the source header.
*/
checksum = 0;
for (i = 0; i<255; i++) {
checksum += source_header[i];
}
source_header[i] = checksum & 0xFFFF;
rab.rab$l_bkt = source_fid + index_file_id_offset;
rab.rab$l_rbf = (void *) source_header;
status = sys$write(&rab);
if (status & 1 ) status = sys$close(&fab); /* close indexf.sys */
return status;
}
Scary cute ;-) I must say I'd be tempted to take out the volume
blocking lock, no matter what McCoy says. I tend to be "belt and
suspenders" when it comes to stuff like this.
Minor nit: sys$getdvi seems to be missing a "W" on the end, an IOSB, and
return status checks.
Cheers,
Jim.
--
www.eight-cubed.com
AH! Ever seen MUMPS code?
Are there any other questions?
Best regards Rudolf Wingert
Out of curiosity, how does Backup achieve this reverse delete ?
Does it build an in-memory list of files processed and once the backup
has been done (and optional verification pass), it parses that in-memory
list backwards to delete the files?
But it wastes so much CPU & IO reading thru all the files.
$ PIPE DIRE/COL=1/NOHEAD/NOTRAIL DISK$FOO:[BAR]*.* | -
SORT/KEY=(POS:1,SIZE:,DESC) SYS$PIPE FOO_BAR.TXT
Then DCL to delete them:
$ SET NOVER
$ ON ERROR THEN $GOTO ERR_RTN
$ OPEN/READ IFILE FOO_BAR.TXT
$LTOP:
$ READ/END=LEND IFILE IREC
$ DEL/LOG 'IREC'
$ GOTO LTOP
$LEND:
$ERR_RTN:
$ CLOSE IFILE
$ EXIT
The remedy is worse than the disease?
Speaking of wasteful... what about that image activation for each
delete. Yikes!
I published a simple DCL script for reverse delete a few times in the
past.
For example in:
http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=625667
It combines a few deletes per image activation.
These days I'd use PERL:
$ perl -le "foreach (reverse sort glob q(test*.*;1)){ print; unlink }"
Looks tight, is faster.
Cheers,
Hein.
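For anyone without Perl handy, the same idea sketched in Python (a hypothetical helper, deliberately defaulting to a dry run that only prints):

```python
# Reverse-order delete, the logic of the Perl one-liner: sort the
# matching names descending so each unlink comes off the tail of the
# directory instead of forcing a shuffle down at the front.
import glob
import os

def reverse_delete(pattern: str, dry_run: bool = True):
    processed = []
    for name in sorted(glob.glob(pattern), reverse=True):
        print(name)
        if not dry_run:
            os.remove(name)
        processed.append(name)
    return processed
```

Call it with `dry_run=False` once the printed order looks right.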
I would think that a rename removing a file from a directory would
have the same issues processing the directory file that delete has.
set file/enter doesn't have that problem, but deleting the alias
does not delete the file, so it also doesn't accomplish anything.
Which is OK if the files are small and the real overhead is in
directory file processing. But DFU doesn't have that problem,
so it's a better solution.
The OP may not have DFU. He may have to jump through hoops to get
it.
Alas, that approach doesn't help at all. The problem at hand was that
the directory was populated with 200,000 files and the original poster
needed to delete the first 30,000 of these.
Whether you delete those files or rename them to another directory
you still end up removing their directory entries. That leaves you
with empty blocks at the front end of a 200,000 file directory.
And that means that you need to shift the remaining data down to fill
in the vacated blocks.
If one was absolutely determined to use such an approach, it would be
possible to use a scheme in which _all_ the files are renamed to another
directory in reverse alphabetical order and the file names themselves
are inverted in lexicographic order -- e.g. A becomes Z, B becomes Y, etc.
That way you'd be updating both directories at the tail end.
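The inversion itself is a simple character map. A sketch (assuming letters and digits are each inverted within their own class; punctuation like "." and ";" is left alone):

```python
# Lexicographic inversion: A->Z, B->Y, ..., 0->9, so that walking the
# source in reverse-alphabetical order produces target names in
# ascending order, appending at the tail of the target directory.
import string

_INVERT = str.maketrans(
    string.ascii_uppercase + string.digits,
    string.ascii_uppercase[::-1] + string.digits[::-1])

def invert_name(name: str) -> str:
    # The map is its own inverse, so the same call renames back.
    return name.translate(_INVERT)

print(invert_name("ABC.TXT;1"))
```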
> Also,
>
> delete az*.*;*
> delete ay*.*;*
> delete ax*.*;*
> ...
> delete ac*.*;*
> delete ab*.*;*
> delete aa*.*;*
>
> would be faster since it would begin the deletes further down the list
> and while not a full "reverse order" delete, it would reduce the amount
> of shuffling it needs to do for each delete.
If you're deleting the first 30,000 files from a 200,000 file directory,
any such optimization can only shave something like 8% off your total
elapsed time.
You can save yourself from shuffling the first 29,999 directory
entries (average 15,000) down, but there are still 170,000 that you
can't do anything about with this scheme.
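The "something like 8%" figure checks out in the flat-array model of entry shifts:

```python
# Deleting the first 30,000 entries of a 200,000-entry directory,
# counting entry shifts: forward deletion shifts everything behind
# slot 0 each time; reverse deletion of the same 30,000 shifts only
# the untouched 170,000-entry tail each time.
def forward_cost(total, ndel):
    return sum(total - k - 1 for k in range(ndel))

def reverse_cost(total, ndel):
    return ndel * (total - ndel)

fwd = forward_cost(200_000, 30_000)
rev = reverse_cost(200_000, 30_000)
print(f"saved: {100 * (fwd - rev) / fwd:.1f}%")  # about 8.1%
```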
Actually.... I seem to have posted an intermediate version.
Almost right, but not taking from the end. Which was the whole point!
Oops.
Here is the correct example
It's just an example, with some debugging lines still there to help
understand it.
Adapt to individual needs and Perl's quirks in trying to help with files
and filenames.
Or re-write to something similar in DCL.
use strict;
#use warnings;
my $HELPER = "[-.tmp_helper]";
my $TARGET = "[-.tmp_renamed]";
my $i = 0;
my @files;
$_ = shift or die "Please provide double quoted wildcard filespec";
print "wild: $_\n";
s/"//g;
my $wild = $_;
foreach (qx(DIRECTORY/COLU=1 $wild)) {
chomp;
$files[$i++] = $_ if /;/;
}
die "Please provide double quoted wildcard filespec" if @files < 2;
# phase 1
$i = @files;
print "Moving $i files to $HELPER\n";
while ($i-- > 0) {
my $name = $files[$i];
my $new = sprintf("%s%06d%s",$HELPER,999999-$i,$name);
print "$name --> $new\n";
rename $name, $new;
}
system ("DIRECTORY $HELPER");
# phase 2
print "Renaming from $HELPER to $TARGET...\n";
$i = 0;
while ($i < @files) {
    my $name = $files[$i];
    rename sprintf("%s%06d%s",$HELPER,999999-$i,$name), $TARGET.$name;
    $i++;
}
Hope this helps better :-)
Hein.
Good point. Then read 3-4 records, concatenating them into one
larger string and then delete that. Damn DCL for having in 2007 a
240 byte max record size!
> I published a simple DCL script for reverse delete a few time in the
> past.
> For example in:
> http://forums1.itrc.hp.com/service/forums/questionanswer.do?threadId=625667
> It combines a few deletes per image activation.
>
> These days I'd use PERL:
>
> $ perl -le "foreach (reverse sort glob q(test*.*;1)){ print; unlink }"
Interesting.
> Looks tight, is faster.
>
> Cheers,
--
Does DFU have the ability to selectively delete files from a directory ?
From what help says, it can delete whole directories, or delete by file-id.
The original poster needs to selectively delete a whole bunch of files
in a huge directory.
Do the FTP in reverse order. For instance, do all the Z*.*;* files, then
delete all the Z*.*;* files. Transfer the Y*.*;* files, then delete all
the Y*.*;* files.
The current batch of undeleted files could stay there until you've done
all the B files, at which point you can delete the A*.*;* files which
you originally transfered first.
This will make the deletes at each stage much faster since you will be
working with files that are towards the end of the directory.
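The batching can be sketched as a plan generator (hypothetical names; grouping is by first character, assuming names sort the way DCL lists them):

```python
# Batch-by-first-letter, tail first: group names by first letter, then
# transfer and delete each batch from Z down to A, so deletes always
# work near the current end of the directory.
from itertools import groupby

def batches_tail_first(names):
    ordered = sorted(names, reverse=True)
    return [(letter, list(grp))
            for letter, grp in groupby(ordered, key=lambda n: n[0])]

names = ["APPLE.TXT;1", "ACORN.TXT;1", "YAK.TXT;1", "ZEBRA.TXT;1"]
for letter, batch in batches_tail_first(names):
    print(f"transfer then delete {letter}*.*;* ({len(batch)} files)")
```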
No, I don't think so, and when *I* originally mentioned DFU in this
thread it was about "emptying" or removing a whole DIR.
> From what help says, it can delete whole directories, or delete by file-id.
>
> The original poster needs to selectively delete a whole bunch of files
> in a huge directory.
As this, as far as I understand, is a one-time effort to clean
up after a system or user error, I think that the easiest route
is to just take the time to copy and delete the files using
regular DCL. If one uses DFU at some intervals to compress the
DIR file, one will get better performance after a while...
Note, if there have been some creates/deletes in this DIR over
time, it could be a good idea to run a DFU "directory compress"
right from the beginning to get the smallest possible DIR file
with the current files in it. The smaller the DIR file is, the
faster the deletes run.
Jan-Erik.
Oops, you're right. The OP would have to get the FID first, such as
by dir/file_id or f$file_attributes().
As of V7.3-2 and later, DCL's "record" (max cmd) length is 4KB (4095, actually).
This is approximately what DFU does. Since it is not resorting directories
ever, in doing this, it will run fairly fast. Once you have the file IDs,
after all, the files are all just files and it makes no difference whether
they are in one directory or separate directories.
The ACP interface is described in the first chapters of the RMS manual,
as I recall. Note that it (using io$_delete) is not the same as lib$delete,
which opens the file and closes it with delete as the disposition.
Glenn Everhart