
Directory performance


JF Mezei

Apr 9, 2015, 2:18:50 PM

OK, so the complaints about VMS directory performance are plentiful.

Apple's HFS uses a large catalogue file and doesn't actually have
directory files (they are synthesized so the Unix side of OS X thinks
there are directory files). I have seen complaints about HFS performance
but not seen any real facts to support those complaints.


For VMS, obviously, there is the DELETE *.*;* of a very large directory
which causes VMS to rewrite the whole .DIR file after deleting every
file. (DFU would provide a way to do a reverse delete which required
rewriting only the last block of the directory after each delete - or
perhaps it was even smarter, deleting the file IDs and updating the .DIR
only at the end).

So, since the fine engineers at VSI are allegedly planning a new file
system during lunch break, I am curious about what designs exist today
(or could be devised tomorrow) to deal with directories differently ?

Would it be as simple as having a "deleted" byte flag in a directory
record so deleting a file would only involve updating that block in-situ
and some sort of reclaim would happen later on ?

What else ?

abrsvc

Apr 9, 2015, 2:40:37 PM
For the sake of accuracy, I believe that the only "rewrites" occur when a directory block has been emptied. Directory files do not have blank blocks in them. The reason for the long delete times is that the "first" block ends up empty first, requiring all the remaining blocks to be written to their respective block minus one. A reverse delete (files in reverse order) doesn't require any rewrites, as the "EOF" for the directory gets moved to the previous block once the last block is empty.
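
A rough DCL sketch of the reverse-order approach described above; the wildcard, the temporary file names and the sort-key size are illustrative assumptions, not anything from this thread. Sorting the full file specifications in descending order approximates the reverse of the directory's alphabetical order.

$ DIRECTORY/COLUMNS=1/NOHEADING/NOTRAILING/OUTPUT=FILES.TMP *.DAT;*
$ SORT/KEY=(POSITION:1,SIZE:255,DESCENDING) FILES.TMP FILES.REV
$ OPEN/READ LIST FILES.REV
$DEL_LOOP:
$ READ/END_OF_FILE=DEL_DONE LIST SPEC
$ DELETE 'SPEC'                     ! full filespec, version included
$ GOTO DEL_LOOP
$DEL_DONE:
$ CLOSE LIST
$ DELETE FILES.TMP;*,FILES.REV;*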

Dan

mcle...@gmail.com

Apr 9, 2015, 6:19:38 PM
JF, you suggest a delete flag for every file. The problem is that you'd need it for every version of every file, which could get messy.

If you want to stay with the same basic structure then maybe it would be simpler to have a "next block" pointer and not worry about having the correct physical order of the directory file blocks but always be able to go to the next block.

I see that Linux doesn't worry about file order in its equivalent to directory files, and the filenames are only sorted when required (e.g. for display). Maybe there are some ideas here, or maybe they are too radical; I just mention it out of interest.

David Froble

Apr 9, 2015, 6:43:11 PM
If you'd ask me, and yes, you appear to have done so, my suggestion
would be to refrain from attempting to use the file system as a database.

If I was a system manager and found a directory with 1000 files, I'd
have a fit and become a fire breathing dragon.

If I was managing a group of developers and any of them came up with a
design that would misuse the file system, I'd have a fit and become a
fire breathing dragon.

Seriously, the file system works, and does what it is intended to do.
Problems arise only when it's abused.

A database supports adds, deletions, reading, updating, and such. It
has mechanisms that allow these things to be done in an efficient
manner. If that's what is needed, then the "right tool" should be
selected and used.

Got to wonder when so many people learned to attempt to pound square
pegs through round holes ??????????

abrsvc

Apr 9, 2015, 6:56:43 PM

> If I was a system manager and found a directory with 1000 files, I'd
> have a fit and become a fire breathing dragon.
>

I have managed systems that have more than 10000 files in a directory with no problems at all. These are production systems using current versions of OpenVMS. While not ideal, the application uses this design mostly due to history and with no particular performance issues, there is little reason to change.

I think that the issue here is when directories grow to have this high number of entries, there can be problems. Properly managed, these are not a problem in all cases.

We need to be careful to NOT over generalize and say that all cases of large entries are a problem.

Dan

mcle...@gmail.com

Apr 9, 2015, 7:08:44 PM
But you can't rely on all users and system managers properly managing directories with large numbers of files. Operating systems need to be as simple as possible for users. They will only say "The file system is too slow", not "The file system is too slow because I don't know enough about how to use it efficiently".

David Froble

Apr 9, 2015, 7:23:26 PM
Dan, if I type DI and the list scrolls off the top of the screen, it's
too many for me.

Ok, some directories need more. But not what I've seen reported here.

JF Mezei

Apr 9, 2015, 7:27:18 PM
On 15-04-09 18:43, David Froble wrote:

> If I was a system manager and found a directory with 1000 files, I'd
> have a fit and become a fire breathing dragon.

Remind me to not work anywhere near you without an asbestos suit. :-)

Many moons ago, I had software which mapped elevation of Australia and
processed a lot of GIS data. Back then the GEOTOPO30 elevation for
Australia was in 2 files, I think roughly 20 megs.

Due to a bug in the LD driver, I lost many files (and backups), and ever so
slowly I have been rebuilding some of the software.

Since I have to convert my old GPS "binary" tracks to more modern GPX
text format, I decided to rebuild the ability to get elevation for a GPS
coordinate.

OK, so I am getting much better precision. But today, this consists of
840 files of 25MB each. This is how the USGS gave me the data (SRTM 1
second data which had been embargoed after 9/11 and finally released in
Sept 2014).

You can breathe all the fire you want at me, but this is how things work
these days.

Just as the 8.3 filenames of the DOS and early Digital era are no longer
acceptable, the idea that a directory should have only a couple of files
is also no longer acceptable.


I know that VMS has improved directory performance over the years. I
think 8.4 also introduced performance improvements. So the question
becomes: can fine-tuning of apps and perhaps the format achieve acceptable
performance, or does it need a total rethink?


Jan-Erik Soderholm

Apr 10, 2015, 7:12:16 AM
David Froble wrote on 2015-04-10 00:43:
> JF Mezei wrote:
>> OK, so the complaints about VMS directory performance are plentiful.
>>
>> Apple's HFS uses a large catalogue file and doesn't actually have
>> directory files (they are synthesized so the Unix side of OS X thinks
>> there are directory files). I have seen complaints about HFS performance
>> but not seen any real facts to support those complaints.
>>
>>
>> For VMS, obviously, there is the DELETE *.*;* of a very large directory
>> which causes VMS to rewrite the whole .DIR file after deleting every
>> file. (DFU would provide a way to do a reverse delete which required
>> rewriting only the last block of the directory after each delete - or
>> perhaps it was even smarter, deleting the file IDs and updating the .DIR
>> only at the end).
>>
>> So, since the fine engineers at VSI are allegedly planning a new file
>> system during lunch break, I am curious about what designs exist today
>> (or could be devised tomorrow) to deal with directories differently ?
>>
>> Would it be as simple as having a "deleted" byte flag in a directory
>> record so deleting a file would only involve updating that block in-situ
>> and some sort of reclaim would happen later on ?
>>
>> What else ?
>
> If you'd ask me, and yes, you appear to have done so, my suggestion would
> be to refrain from attempting to use the file system as a database.
>
> If I was a system manager and found a directory with 1000 files, I'd have a
> fit and become a fire breathing dragon.

That is nothing but silly.

Yes, I remember way back on our MicroVAX 3100/90 that we sometimes
saw slow deletes when some directory had grown a bit too large...

Time for a real life experiment, isn't it?

System used:

$ tcpip sh ver

HP TCP/IP Services for OpenVMS Alpha Version V5.5 - ECO 1
on a COMPAQ AlphaServer DS20E 666 MHz running OpenVMS V8.2

Not the very "hottest" system or even the latest VMS...

I selected a directory with:

Total of 10338 files, 84664/536588 blocks.

This .DIR file allocation is 1249/1300.

All files (apart from 34) are ;1 files.
All files have 23-character names and 3-character types.

Evenly spread from 2007 to "today".

I deleted 8 files evenly spread over the (default sorted)
DIR output (including the very first file) using this COM:

$ sh time
$ sh proc/acc
$ delete/log <filename>
<repeated 9 times>
$ sh proc/acc
$ sh time

This took .25 secs "wall time" and 0.04 secs CPU time.

I do not regard that as "slow".

$ @b.b
10-APR-2015 12:43:29

Accounting information:
Buffered I/O count: 207650 Peak working set size: 26192
Direct I/O count: 109678 Peak virtual size: 576464
Page faults: 31593 Mounted volumes: 0
Images activated: 55
Elapsed CPU time: 0 00:00:48.32
Connect time: 0 02:18:23.79

%DELETE-I-FILDEL, <filename> deleted (52 blocks)
<repeated 9 times>

Accounting information:
Buffered I/O count: 207801 Peak working set size: 26192
Direct I/O count: 109817 Peak virtual size: 576464
Page faults: 31995 Mounted volumes: 0
Images activated: 64
Elapsed CPU time: 0 00:00:48.36
Connect time: 0 02:18:24.04

10-APR-2015 12:43:30


I also made another test and deleted the "first" 100
files (according to the default DIR output) and got:

$ @ b.b

10-APR-2015 13:03:11

Accounting information:
Buffered I/O count: 252075 Peak working set size: 26192
Direct I/O count: 120605 Peak virtual size: 576464
Page faults: 33618 Mounted volumes: 0
Images activated: 93
Elapsed CPU time: 0 00:00:52.97
Connect time: 0 02:38:04.86

<100 DELETE-I-FILDEL messages removed...>

Accounting information:
Buffered I/O count: 253682 Peak working set size: 26192
Direct I/O count: 122596 Peak virtual size: 576464
Page faults: 38224 Mounted volumes: 0
Images activated: 193
Elapsed CPU time: 0 00:00:53.68
Connect time: 0 02:38:06.61

10-APR-2015 13:03:12

So, 0.71 sec CPU time and 1.75 sec "wall clock" time for the
100 "first" files in a > 10.000 file directory.

Also quite acceptable regarding the server hardware used.


Best Regards,
Jan-Erik.


>
> If I was managing a group of developers...

Not really worth commenting.

Bill Gunshannon

Apr 10, 2015, 8:35:46 AM
In article <mg6v86$l5n$1...@dont-email.me>,
David Froble <da...@tsoft-inc.com> writes:
> JF Mezei wrote:
>> OK, so the complaints about VMS directory performance are plentiful.
>>
>> Apple's HFS uses a large catalogue file and doesn't actually have
>> directory files (they are synthesized so the Unix side of OS X thinks
>> there are directory files). I have seen complaints about HFS performance
>> but not seen any real facts to support those complaints.
>>
>>
>> For VMS, obviously, there is the DELETE *.*;* of a very large directory
>> which causes VMS to rewrite the whole .DIR file after deleting every
>> file. (DFU would provide a way to do a reverse delete which required
>> rewriting only the last block of the directory after each delete - or
>> perhaps it was even smarter, deleting the file IDs and updating the .DIR
>> only at the end).
>>
>> So, since the fine engineers at VSI are allegedly planning a new file
>> system during lunch break, I am curious about what designs exist today
>> (or could be devised tomorrow) to deal with directories differently ?
>>
>> Would it be as simple as having a "deleted" byte flag in a directory
>> record so deleting a file would only involve updating that block in-situ
>> and some sort of reclaim would happen later on ?
>>
>> What else ?
>
> If you'd ask me, and yes, you appear to have done so, my suggestion
> would be to refrain from attempting to use the file system as a database.
>
> If I was a system manager and found a directory with 1000 files, I'd
> have a fit and become a fire breathing dragon.

Why? My home directory has (currently) 1513 files. Is this supposed to
be some kind of problem? Shouldn't the user be the one who decides how
his data should be stored and used? Shouldn't the system support the
users rather than the users adapting to support the system?

>
> If I was managing a group of developers and any of them came up with a
> design that would misuse the file system, I'd have a fit and become a
> fire breathing dragon.

Misuse is one thing, but how is having a lot of files in a directory
mis-use unless the file system has serious shortcomings? (Last one
I knew of like that was Primos. :-)

>
> Seriously, the file system works, and does what it is intended to do.
> Problems arise only when it's abused.

I really want to hear why you think big directories are an abuse.

>
> A database supports adds, deletions, reading, updating, and such. It
> has mechanisms that allow these things to be done in an efficient
> manner. If that's what is needed, then the "right tool" should be
> selected and used.
>
> Got to wonder when so many people learned to attempt to pound square
> pegs through round holes ??????????

I really don't understand what this has to do with large directories.

bill


--
Bill Gunshannon | de-moc-ra-cy (di mok' ra see) n. Three wolves
bill...@cs.scranton.edu | and a sheep voting on what's for dinner.
University of Scranton |
Scranton, Pennsylvania | #include <std.disclaimer.h>

Bill Gunshannon

Apr 10, 2015, 8:39:09 AM
In article <55270ad4$0$1555$c3e8da3$12bc...@news.astraweb.com>,
Reading this just reminded me of another thing. If large directories
of small files are such a problem, I guess that explains why people
didn't use VMS for USENET News Servers. :-)

Bob Koehler

Apr 10, 2015, 9:21:37 AM
In article <mg6v86$l5n$1...@dont-email.me>, David Froble <da...@tsoft-inc.com> writes:
>
> Got to wonder when so many people learned to attempt to pound square
> pegs through round holes ??????????

On UNIX, where there are no pegs and no holes, only sawdust piles.

Bob Koehler

Apr 10, 2015, 9:24:56 AM
In article <55270ad4$0$1555$c3e8da3$12bc...@news.astraweb.com>, JF Mezei <jfmezei...@vaxination.ca> writes:
>
> OK, so I am getting much better precision. But today, this consists of
> 840 files of 25MB each. This is how the USGS gave me the data (SRTM 1
> second data which had been embargoed after 9/11 and finally released in
> Sept 2014).

While I, too, like to keep the output of a directory command to
about 1 page, I suspect it would not be hard to get good performance
with only 840 files, no matter what their size.

That's a far cry from 10000 tiny files.

Bob Gezelter

Apr 10, 2015, 9:27:22 AM
Jan-Erik,

Please be careful when benchmarking. The specifics relating to the root (pun unintended) of the "file deletion performance" problem affect which benchmarks will show a problem.

Casual benchmarks often fail to provoke the problem, thus give the appearance that the problem does not exist. As Hoff noted (and I have seen on many occasions) the problem does indeed exist.

However, unlike some file systems (e.g., FATx), FILES-11 directories do not contain fixed length entries. Compaction ONLY happens when a directory block is completely empty.

If the "benchmark" does not consistently create a "directory block empty" condition, then the performance problem with large directories will not occur (e.g., to exaggerate slightly, a directory with 100K files will not experience performance issues relating to deletes if no delete operation results in an empty directory block). This is why most discussions of this problem involve wildcard deletes (but not necessarily purges).

- Bob Gezelter, http://www.rlgsc.com

David Froble

Apr 10, 2015, 9:53:01 AM
Ok, some perspective ....

Files in directories might be split into two categories.

1) For use by humans
2) For use by software

Now, for #1, yes, it's possible to have many totally different subjects,
and along with that file(s) for each. I have to ask, normally, how many
different subjects might a person have, before he / she needs to do some
ordering? When that time comes, one might set up multiple directories,
or sub-directories, to place some order on the files. I'd suggest that
for random subjects, there are limits past which the data becomes too
much for a person to handle.

There are some subjects that could involve large numbers of files. An
example of this would be programs. Even there, in a large software
package, it would be wise to segregate the programs in some manner.

For example, I use different sub-directories for Order Processing,
Inventory, A/R, A/P, etc. And within that, separate sub-directories for
sources, command procedures for building the programs, and executables.

Ok, now for data used by software. Let's use your mention of
newsgroups. Would you rather manually search through files for a
particular topic, group, and whatever else you're interested in, or
perhaps be able to perform an SQL inquiry such as:

Select * From Database Where Group="Comp.OS.VMS" And Topic="Directory
size", and CreationDate > 2010

Just as an example.

Not saying it cannot be done with lots of individual files, but having
all the data in a database has advantages. I'll leave thinking about
those advantages as an exercise for you.

While not all pegs are square and not all holes are round, it's my
perspective that software creating and using many small files, using the
file system as a database, might not be the smartest thing to do,
regardless of past practices.

johnwa...@yahoo.co.uk

Apr 10, 2015, 10:24:35 AM
Careful.

On Linux (and on a decent modern UNIX), you can frequently choose the
most appropriate file system for the job at hand. You don't *have* to
do that, and sensible defaults are frequently available, but you do
have the choice.

Even that other OS has an "installable file system" capability.

It'd perhaps be nice to have that capability on VMS. Nice is not the
same as necessary.

Providing such capability would likely break (m)any applications which
make assumptions about internal filesystem behaviour.

Non-priv apps can't really make those assumptions and use that knowledge
on VMS; security mechanisms largely prevent it. But in general for apps
with privs on VMS it's historically been safe to assume ODS-2 (do many
people actually use ODS-5?).

Are there good reasons why throwing money at hardware (flash drives etc.)
wouldn't be a reasonable short-term solution vs lots of work in VMS
internals?

How much does filesystem performance matter, relative to the other
things in VSI's Inbox?

Jan-Erik Soderholm

Apr 10, 2015, 11:12:39 AM
OK. Then tell me how to force this "problem".

I just deleted 1.000 files out of the approx. 10.000 files.
The *first* 1.000 files that DIR displays.

Took approx. 6 CPU seconds and 15 "wall time" seconds.
Approx. 13.000 DIR I/Os.

The time might in part be due to the display of 1000 FILDEL
messages over my remote terminal link.

The size of the DIR file was 1229 blocks before the operation
and 1109 blocks after. So *something* must have happened to the
DIR file during the operation also, no?

Anyway, until someone shows a current example of this "problem",
I will claim that it is not a real problem today.

Best Regards,
Jan-Erik.

abrsvc

Apr 10, 2015, 11:30:57 AM
The only way I can recall of demonstrating this problem would be to create, say, 10,000 files of any size using a timestamp as the file name. This should create a reasonably sized .DIR file. Once created, use a wildcard delete of these files. Time this. You will see a large number of I/Os for this operation. Take the same directory and delete the files in reverse order (last file first, etc.). Note the I/Os and time for this run as well. There will be a significant difference in the elapsed times for these two operations. The first (in order) should take longer. The last block will be "moved" N times, where N = the number of blocks in the directory file. In case 2 (reverse order) there will be no re-writing of directory blocks.
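
A rough DCL sketch of that test; the file count, the .DAT file type and the timestamp-to-name conversion are illustrative assumptions (two files landing in the same hundredth of a second simply become a higher version of the same name).

$ N = 0
$CREATE_LOOP:
$ STAMP = F$EDIT(F$CVTIME("","COMPARISON"),"COLLAPSE") - "-" - "-" - ":" - ":" - "."
$ COPY NL: 'STAMP'.DAT              ! empty file named after the timestamp
$ N = N + 1
$ IF N .LT. 10000 THEN GOTO CREATE_LOOP
$!
$ SHOW TIME                         ! time the in-order wildcard delete
$ DELETE *.DAT;*
$ SHOW TIME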

Dan

Stephen Hoffman

Apr 10, 2015, 12:37:25 PM
On 2015-04-10 14:24:34 +0000, johnwa...@yahoo.co.uk said:

> ...But in general for apps with privs on VMS it's historically been
> safe to assume ODS-2 (do many people actually use ODS-5?).

It's needed for Java and Apache, among other packages. I don't recall
off-hand if it's the installation default for new OpenVMS
installations, but it's use is pretty common with folks that have been
maintaining their environments not at the trailing edge.

> Are there good reasons why throwing money at hardware (flash drives
> etc) wouldn't be a reasonable short term solution vs lots of work in
> VMS internals?

VSI will probably end up qualifying some of that gear, certainly.

> How much does filesystem performance matter, relative to the other
> things in VSI's Inbox?

VSI are probably more concerned with addressing disk capacities than
with raw performance — ODS-2 and ODS-5 top out at 2 TiB, and 6 TB disks
are now available, and storage controllers can synthesize some very big
"disks" for some applications. (One of the dinky storage boxes I'm
working with can synthesize a 28 TB RAID-6 volume; this isn't
gonzo-expensive FC SAN gear, either.) But yes, VSI undoubtedly has a
very long list of pending projects.

Many existing folks will be very slow at moving to newer file systems,
too. More than a few applications still have no idea what to do with
ODS-5 filenames, and work simply because ODS-5 is a proper superset of
ODS-2, and folks using those tools are sticking to the subset.

As for faster storage, inboard flash — there are folks that offer flash
storage that fits into DIMM sockets — is becoming more common, too.
This gets around the existing 6 Gbps ("6 Gbit/s") SAS bus bottlenecks.
Not that the current SAS 12 Gbps is slow, where you can find and can
qualify those parts, and have the requisite-generation PCIe bus support.

FWIW, when you're running an operating system, you inevitably look two
or three years into the future when picking target features. VSI
probably lacks the resources to really get into that forward-facing
development work right now, and will probably be picking mostly from
the list of limitations that their soon-to-be-biggest customers want
removed, and the sorts of hardware configurations that the existing
customers want qualified for Poulson and Kittson. But if VSI can get
their revenues onto an acceptable trend, it'll be interesting to see
what they can come up with for OpenVMS.

VSI will probably have rather more comments on their plans at the
Bootcamp at the Nashua Radisson this September, if not before.



--
Pure Personal Opinion | HoffmanLabs LLC

Simon Clubley

Apr 10, 2015, 1:15:34 PM
The other way of demonstrating this problem is to use dump to display
the blocks making up the test directory in question (and parse the output)
or just directly open the directory file for reading in a test program.

Create two command procedures: one which deletes every file in a directory
block except one and another one which deletes this last file in the
directory block (and hence causes block shuffling).

Have the program process the whole directory file in this way and then
run the resulting command procedures while timing the results.

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world

Simon Clubley

Apr 10, 2015, 1:24:15 PM
On 2015-04-10, David Froble <da...@tsoft-inc.com> wrote:
>
> Ok, now for data used by software. Let's use your mention of
> newsgroups. Would you rather manually search through files for a
> particular topic, group, and whatever else you're interested in, or
> perhaps be able to perform an SQL inquiry such as:
>
> Select * From Database Where Group="Comp.OS.VMS" And Topic="Directory
> size", and CreationDate > 2010
>

I think Bill's talking about the news spool, not any messages you might
choose to keep locally after reading a newsgroup.

Back when I last ran a small news spool locally for offline reading
(which was over a decade ago so the behaviour may have changed), each
message in a newsgroup was stored in its own file in a directory.

Simon Clubley

Apr 10, 2015, 1:42:14 PM
$ set response/mode=good_natured

As a Public Service Announcement, I would like to bring your attention
to a wonderful new invention called a scroll bar. :-)

You may wish to investigate this wonderful new technology. :-)

David Froble

Apr 10, 2015, 2:06:57 PM
Simon Clubley wrote:
> On 2015-04-09, David Froble <da...@tsoft-inc.com> wrote:
>> abrsvc wrote:
>>>> If I was a system manager and found a directory with 1000 files, I'd
>>>> have a fit and become a fire breathing dragon.
>>>>
>>> I have managed systems that have more than 10000 files in a directory with no problems at all. These are production systems using current versions of OpenVMS. While not ideal, the application uses this design mostly due to history and with no particular performance issues, there is little reason to change.
>>>
>>> I think that the issue here is when directories grow to have this high number of entries, there can be problems. Properly managed, these are not a problem in all cases.
>>>
>>> We need to be careful to NOT over generalize and say that all cases of large entries are a problem.
>>>
>>> Dan
>> Dan, if I type DI and the list scrolls off the top of the screen, it's
>> too many for me.
>>
>
> $ set response/mode=good_natured
>
> As a Public Service Announcement, I would like to bring your attention
> to a wonderful new invention called a scroll bar. :-)
>
> You may wish to investigate this wonderful new technology. :-)
>
> Simon.
>

$ set response/mode=senile_and_memory_impaired

I need a bit more structure in my life.

:-)

JF Mezei

Apr 10, 2015, 2:13:44 PM
Mr Froble,

My current pet project involves elevation data with files such as:

> s21_e121_1arc_v3.tif s26_e131_1arc_v3.tif s31_e138_1arc_v3.tif s42_e144_1arc_v3.tif
> s21_e122_1arc_v3.tif s26_e132_1arc_v3.tif s31_e139_1arc_v3.tif s42_e145_1arc_v3.tif
> s21_e123_1arc_v3.tif s26_e133_1arc_v3.tif s31_e140_1arc_v3.tif s42_e146_1arc_v3.tif
> s21_e124_1arc_v3.tif s26_e134_1arc_v3.tif s31_e141_1arc_v3.tif s42_e147_1arc_v3.tif
> s21_e125_1arc_v3.tif s26_e135_1arc_v3.tif s31_e142_1arc_v3.tif s42_e148_1arc_v3.tif
> s21_e126_1arc_v3.tif s26_e136_1arc_v3.tif s31_e143_1arc_v3.tif s43_e145_1arc_v3.tif
> s21_e127_1arc_v3.tif s26_e137_1arc_v3.tif s31_e144_1arc_v3.tif s43_e146_1arc_v3.tif
> s21_e128_1arc_v3.tif s26_e138_1arc_v3.tif s31_e145_1arc_v3.tif s43_e147_1arc_v3.tif
> s21_e129_1arc_v3.tif s26_e139_1arc_v3.tif s31_e146_1arc_v3.tif s43_e148_1arc_v3.tif
> s21_e130_1arc_v3.tif s26_e140_1arc_v3.tif s31_e147_1arc_v3.tif s44_e145_1arc_v3.tif
> s21_e131_1arc_v3.tif s26_e141_1arc_v3.tif s31_e148_1arc_v3.tif s44_e146_1arc_v3.tif
> s21_e132_1arc_v3.tif s26_e142_1arc_v3.tif s31_e149_1arc_v3.tif s44_e147_1arc_v3.tif
> s21_e133_1arc_v3.tif s26_e143_1arc_v3.tif s31_e150_1arc_v3.tif s44_e148_1arc_v3.tif


Although Unix doesn't have the /GRAND_TOTAL qualifier for "ls", there
are 840 of those.

My app will open each file "on demand" when I need an elevation that is
within the lat/lon area covered by each tile/file. I programmatically
build the file name based on the requested latitude/longitude.
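
A minimal DCL sketch of that kind of name construction (the coordinate values are hypothetical examples, and no zero-padding is attempted; the tiles listed above all use 2-digit southern latitudes and 3-digit eastern longitudes):

$ LAT = 21                          ! degrees south
$ LON = 121                         ! degrees east
$ FILE = "s" + F$STRING(LAT) + "_e" + F$STRING(LON) + "_1arc_v3.tif"
$ WRITE SYS$OUTPUT FILE             ! s21_e121_1arc_v3.tif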

Are you telling me that I should split this into multiple directories
and then add code to my app to also programmatically decide which
directory the needed file should be expected to be in ?

These files all belong with each other. Not like accounts receivable can
be stored separately from accounts payable. But a large company might
have thousands of accounts receivable.

Consider an ISP with 40,000 customers. Everyone gets a .pdf as invoice
each month. How do you suggest the ISP store those 40,000 .pdf files ?

(Yes, my All-in-1 background tells me to split the monthly invoice
directory into say 40 subdirectories and evenly distribute the invoices
and then have the billing system keep a file pointer that contains which
directory contains the invoice).

But at the end of the day, having a file system that CAN handle 40,000
files easily makes for simpler applications.




Bill Gunshannon

Apr 10, 2015, 3:03:48 PM
In article <mg91vs$283$4...@dont-email.me>,
Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> writes:
> On 2015-04-09, David Froble <da...@tsoft-inc.com> wrote:
>> abrsvc wrote:
>>>> If I was a system manager and found a directory with 1000 files, I'd
>>>> have a fit and become a fire breathing dragon.
>>>>
>>>
>>> I have managed systems that have more than 10000 files in a directory with no problems at all. These are production systems using current versions of OpenVMS. While not ideal, the application uses this design mostly due to history and with no particular performance issues, there is little reason to change.
>>>
>>> I think that the issue here is when directories grow to have this high number of entries, there can be problems. Properly managed, these are not a problem in all cases.
>>>
>>> We need to be careful to NOT over generalize and say that all cases of large entries are a problem.
>>>
>>> Dan
>>
>> Dan, if I type DI and the list scrolls off the top of the screen, it's
>> too many for me.
>>
>
> $ set response/mode=good_natured
>
> As a Public Service Announcement, I would like to bring your attention
> to a wonderful new invention called a scroll bar. :-)
>
> You may wish to investigate this wonderful new technology. :-)
>

Like Dave, I was unable to find the scrollbar on my VT100. I couldn't
even find one on my VT420. :-)

Bill Gunshannon

Apr 10, 2015, 3:09:37 PM
In article <mg90u6$283$2...@dont-email.me>,
Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> writes:
> On 2015-04-10, David Froble <da...@tsoft-inc.com> wrote:
>>
>> Ok, now for data used by software. Let's use your mention of
>> newsgroups. Would you rather manually search through files for a
>> particular topic, group, and whatever else you're interested in, or
>> perhaps be able to perform an SQL inquiry such as:
>>
>> Select * From Database Where Group="Comp.OS.VMS" And Topic="Directory
>> size", and CreationDate > 2010
>>
>
> I think Bill's talking about the news spool, not any messages you might
> choose to keep locally after reading a newsgroup.
>
> Back when I last ran a small news spool locally for offline reading
> (which was over a decade ago so the behaviour may have changed), each
> message in a newsgroup was stored in it's own file in a directory.

Exactly. I ran a news server here that was frequently in the top 100
and made top 50 at least once. Look at how many "current" messages
there are in some of the more active groups. Then realize that big
servers can easily keep a month or more available at any point in
time. Big directories. That took a bit of filesystem tuning
to accomplish. I am not aware of VMS filesystems having any admin
tunable parameters, but I could easily be wrong as I never admined
a box that was anything other than a student/faculty user box.

Simon Clubley

Apr 10, 2015, 3:44:36 PM
On 2015-04-10, Bill Gunshannon <bi...@server3.cs.scranton.edu> wrote:
> In article <mg91vs$283$4...@dont-email.me>,
> Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> writes:
>> On 2015-04-09, David Froble <da...@tsoft-inc.com> wrote:
>>>
>>> Dan, if I type DI and the list scrolls off the top of the screen, it's
>>> too many for me.
>>>
>>
>> $ set response/mode=good_natured
>>
>> As a Public Service Announcement, I would like to bring your attention
>> to a wonderful new invention called a scroll bar. :-)
>>
>> You may wish to investigate this wonderful new technology. :-)
>>
>
> Like Dave, I was unable to find the scrollbar on my VT100. I couldn't
> even find one on my VT420. :-)
>

Then that's what the button marked Hold or the /PAGE qualifier is for. :-)

Stephen Hoffman

Apr 10, 2015, 4:12:43 PM
On 2015-04-10 17:41:16 +0000, Simon Clubley said:

> On 2015-04-09, David Froble <da...@tsoft-inc.com> wrote:
>>
>> Dan, if I type DI and the list scrolls off the top of the screen, it's
>> too many for me.
>
> $ set response/mode=good_natured
>
> As a Public Service Announcement, I would like to bring your attention
> to a wonderful new invention called a scroll bar. :-)
>
> You may wish to investigate this wonderful new technology. :-)

Or DIRECTORY/PAGE.

Stephen Hoffman

Apr 10, 2015, 4:14:37 PM
On 2015-04-10 19:03:46 +0000, Bill Gunshannon said:

> Like Dave, I was unable to find the scrollbar on my VT100. I couldn't
> even find one on my VT420. :-)

The former lacked scrolling, but you can use control up-arrow on the latter.

JF Mezei

Apr 10, 2015, 4:28:09 PM
On 15-04-10 15:03, Bill Gunshannon wrote:

> Like Dave, I was unable to find the scrollbar on my VT100. I couldn't
> even find one on my VT420. :-)


$DIR/output=dir.txt
$EDIT/TPU dir.txt

You can then scroll to your heart's content using page up/down and the
arrow keys.

However, on a DECterm, you have scroll bars on the terminal and don't
need to send output to intermediate file.

John Reagan

Apr 10, 2015, 4:32:43 PM

>
> Although Unix doesn't have the /GRAND_TOTAL qualifier for "ls", there
> are 840 of those.
>

$ ls -1 | wc -l
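
On the VMS side, DIRECTORY/GRAND_TOTAL reports the count directly; if it has to be done in DCL, a small F$SEARCH loop works too (the wildcard here is illustrative):

$ COUNT = 0
$COUNT_LOOP:
$ FILE = F$SEARCH("*.TIF;*")
$ IF FILE .EQS. "" THEN GOTO COUNT_DONE
$ COUNT = COUNT + 1
$ GOTO COUNT_LOOP
$COUNT_DONE:
$ WRITE SYS$OUTPUT F$STRING(COUNT) + " files"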

hb

Apr 10, 2015, 4:47:24 PM
On 04/10/2015 10:27 PM, JF Mezei wrote:
> $DIR/output=dir.txt
> $EDIT/TPU dir.txt

I wouldn't want to add (and sometime later delete) another file to my
current directory with 99.999 files :-)

$EDIT/TPU
DO (or whatever your synonym for the DO key is: PF4, GOLD-KP7, ...)
Command: buffer dcl
DO
Command: dcl dir

hb

Apr 10, 2015, 4:51:55 PM
Shouldn't this be either
$ ls |wc -l
or
$ ls -l |grep -v '^total' |wc -l
?

Jan-Erik Soderholm

Apr 10, 2015, 5:35:44 PM
JF Mezei wrote on 2015-04-10 20:13:

>
> Although Unix doesn't have the /GRAND_TOTAL qualifier for "ls", there
> are 840 of those.

840 files is "nothing".

> Are you telling me that I should split this into multiple directories
> and then add code to my app to also programatically decided which
> directory the needed file should be expected to be in ?

Of course not. 840 files is nothing.

>
> Consider an ISP with 40,000 customers. Everyone gets a .pdf as invoice
> each month. How do you suggest the ISP store those 40,000 .pdf files ?
>

Not at all.
They are generated from the accounting database and sent/mailed.
Not stored.


Jan-Erik Soderholm

Apr 10, 2015, 6:44:56 PM
OK. Simple enough to test, right?

This time on a faster CPU (DS25, 8.4) but way slower disks,
standard 73 GB internal storage disks (shadowed in pairs).

Created 10.000 files using a timestamp like:

$ d

Directory USER:<JANNE.DIRTEST>

2015041023442794.DAT;1 1 10-APR-2015 23:44:27.95
2015041023442799.DAT;1 1 10-APR-2015 23:44:27.99
2015041023442803.DAT;1 1 10-APR-2015 23:44:28.03
2015041023442808.DAT;1 1 10-APR-2015 23:44:28.08
2015041023442812.DAT;1 1 10-APR-2015 23:44:28.12
2015041023442817.DAT;1 1 10-APR-2015 23:44:28.17
2015041023442821.DAT;1 1 10-APR-2015 23:44:28.21
2015041023442825.DAT;1 1 10-APR-2015 23:44:28.25
...
...

This created a 1.250 block .DIR file.
Approx. 20 files/sec were created (actually a COPY operation).


A plain "delete/log *.dat;*" delete took:
CPU: 1 min 9 sec
Time: 7 min 37 sec
DIR I/O: 88.515

10.000 DEL commands using a reverse sorted list of files took:
CPU: 1 min 18 sec
Time: 5 min 30 sec
DIR I/O: 30.278

In both cases the DIR file went from 1250 blocks to 1 block
in steps. I could not see the size of each step.

Is this a "significant difference"? I'd say "no".

So, my conclusion is that deleting 10.000 files is "OK", and
the difference from the "reverse" method is not so large that it
would be a major "problem".

At no time did the output from "del/log" stall or hang during
the first (single command, natural order) test.

Maybe I'll rerun this on some system with much faster storage.
I could also rerun it using, say, 50.000 files.

As I said before, it is a long time since I saw any performance
problem with the VMS directory handling.

Now, a unix/linux system would probably perform better in general,
but that is another question...


Comments?

Jan-Erik.




David Froble

Apr 10, 2015, 7:08:09 PM
JF Mezei wrote:
> Mr Froble,
>
> My current pet project involves elevation data with files such as:
>
>> s21_e121_1arc_v3.tif s26_e131_1arc_v3.tif s31_e138_1arc_v3.tif s42_e144_1arc_v3.tif
>> s21_e122_1arc_v3.tif s26_e132_1arc_v3.tif s31_e139_1arc_v3.tif s42_e145_1arc_v3.tif
>> s21_e123_1arc_v3.tif s26_e133_1arc_v3.tif s31_e140_1arc_v3.tif s42_e146_1arc_v3.tif
>> s21_e124_1arc_v3.tif s26_e134_1arc_v3.tif s31_e141_1arc_v3.tif s42_e147_1arc_v3.tif
>> s21_e125_1arc_v3.tif s26_e135_1arc_v3.tif s31_e142_1arc_v3.tif s42_e148_1arc_v3.tif
>> s21_e126_1arc_v3.tif s26_e136_1arc_v3.tif s31_e143_1arc_v3.tif s43_e145_1arc_v3.tif
>> s21_e127_1arc_v3.tif s26_e137_1arc_v3.tif s31_e144_1arc_v3.tif s43_e146_1arc_v3.tif
>> s21_e128_1arc_v3.tif s26_e138_1arc_v3.tif s31_e145_1arc_v3.tif s43_e147_1arc_v3.tif
>> s21_e129_1arc_v3.tif s26_e139_1arc_v3.tif s31_e146_1arc_v3.tif s43_e148_1arc_v3.tif
>> s21_e130_1arc_v3.tif s26_e140_1arc_v3.tif s31_e147_1arc_v3.tif s44_e145_1arc_v3.tif
>> s21_e131_1arc_v3.tif s26_e141_1arc_v3.tif s31_e148_1arc_v3.tif s44_e146_1arc_v3.tif
>> s21_e132_1arc_v3.tif s26_e142_1arc_v3.tif s31_e149_1arc_v3.tif s44_e147_1arc_v3.tif
>> s21_e133_1arc_v3.tif s26_e143_1arc_v3.tif s31_e150_1arc_v3.tif s44_e148_1arc_v3.tif
>
>
> Although Unix doesn't have the /GRAND_TOTAL qualifier for "ls", there
> are 840 of those.

I will type slowly and use small words so you can understand ....

> My app will open each file "on demand" when I need an elevation that is
> within the lat/lon area covered by each tile/file. I programmatically
> build the file name based on the requested latitude/longitude.

You also could store the information in each file in a database, and
what you used to construct the filename could also be used to construct
keys. Then you'd just have the database file(s), not 840 files. You'd
also be using a product that was designed to store data and access the
data. Access might be a lot quicker, not claiming that's a requirement.
You'd also have what some might claim is better backup capabilities.

> Are you telling me that I should split this into multiple directories
> and then add code to my app to also programmatically decide which
> directory the needed file should be expected to be in ?

You could, but I wouldn't.

> These files all belong with each other. Not like accounts receivable can
> be stored separately from accounts payable. But a large company might
> have thousands of accounts receivable.

If the data "belongs with each other", then wouldn't storing the data in
a single database be prudent?

> Consider an ISP with 40,000 customers. Everyone gets a .pdf as invoice
> each month. How do you suggest the ISP store those 40,000 .pdf files ?

Why would the ISP keep the PDF files around? I'd assume that's not
their accounting system. The receivables would be in the A/R system,
where they belong, and when info is required, the A/R system would
provide it. Said data would also be "up to date", not some saved PDF
file. You'd print the invoices, send them, and if it's a file, delete
it. Some systems might send the PDFs directly without using a file.
Using PDFs is not even required to send invoices to customers.

> (Yes, my All-in-1 background tells me to split the monthly invoice
> directory into say 40 subdirectories and evenly distribute the invoices
> and then have the billing system keep a file pointer that contains which
> directory contains the invoice).
>
> But at the end of the day, having a file system that CAN handle 40,000
> files easily makes for simpler applications.

I would not consider a bunch of files, PDF, text, or whatever, to be
simple. If text, and printing, you'd have one file to print containing
all invoices. Now, that's simpler.

JF Mezei

Apr 10, 2015, 8:52:36 PM
On 15-04-10 19:07, David Froble wrote:

> If the data "belongs with each other", then wouldn't storing the data in
> a single database be prudent?

I have not checked this. But because each tiff file has its own GEOtiff
parameters, it is possible that those parameters are not identical from
one tiff file to the next. (Consider that distances become much smaller
as you near the pole compared to near the equator.) It becomes an
interesting exercise to combine rows of elevation data when the scaling
parameters might change.

(In this case, it is possible the scaling is uniform all across, I have
not checked).

If this were a long-term project, I might consider the work to unify
those files into a single large 20GB file. But it's not worth the effort for
a short-term project after which I will move on to other areas of the world.

IanD

Apr 11, 2015, 1:22:31 AM
On Friday, April 10, 2015 at 10:35:46 PM UTC+10, Bill Gunshannon wrote:

<snip>

>
> Why? My home directory has (currently) 1513 files. Is this supposed to
> be some kind of problem? Shouldn't the user be the one who decides how
> his data should be stored and used? Shouldn't the system support the
> users rather than the users adapting to support the system?
>

<snip>

>
> Misuse is one thing, but how is having a lot of files in a directory
> mis-use unless the file system has serious shortcomings? (Last one
> I knew of like that was Primos. :-)
>

<snip>

> --
> Bill Gunshannon | de-moc-ra-cy (di mok' ra see) n. Three wolves
> bill...@cs.scranton.edu | and a sheep voting on what's for dinner.
> University of Scranton |
> Scranton, Pennsylvania | #include <std.disclaimer.h>

^^^^^ This!

Couldn't agree more

The OS needs to be the ubiquitous entity that people and processes interact with the way they wish to interact with a system, not be straitjacketed into having to deal with system limitations

The presentation layer and the data engine should be separated with the OS handling the complexities and optimization behind the scenes of how to store the data and retrieve and manipulate the data - this is why MS were looking at the DB file store concept. A rehashed ODSx isn't going to get us there

If there is a limitation, fix it but don't have the users of the system have to know every performance trick to get adequate performance from a system

MS wanted some time back to move to a DB file system. From what I have read lately, it looks like that concept may have been revisited with Windows 8 and some of the ground work added for a future move towards this idea again

If you want to see how far behind the times ODS2 is, have a quick glance at the wiki for file storage systems

http://en.wikipedia.org/wiki/Comparison_of_file_systems

glance at the Limits section as well and you'll see ODS2 is rather crippled compared to more modern systems, even ODS5 has noticeable limitations

The system I currently support runs an old application. It generates 10,000's of files every day, around 2 files a second, all plonked into a single directory.
This is NOT abuse by myself or abuse by anyone for that matter (as was implied by someone else); this is merely the consequence of dealing with an old application that was never designed to scale up to the size that it is running at today. It was supposed to be decommissioned years ago, but these days businesses are not application swapping as much as they are application enhancing.

How many implementations do you see catering for real future growth and real business changes at design time? In an age where companies don't even know who/what/where/when the markets that they will be competing with tomorrow, it's just implement and hand that other future stuff to the operations domain to deal with

People have identified an issue with directory performance in VMS, and enough of them to warrant that it's not isolated complaints or people's imagination

Splitting up files across directories is a workaround, not a solution. Having to do this and having to change reporting scripts/interfaces to deal with these workarounds rob people from spending time on productive measures - less maintenance on a system, not more should be the key driver

System automation techniques should be handling these types of system limitations automatically so that users and applications are isolated from what goes on underneath instead of trying to force the world and how it interacts with a system to pussyfoot around what is essentially a system limitation

Where I work, systems automation is ramping up. Companies do not want to pay for experts to be available at the coal face, they want automation robots handling more and more system functions, leaving only real issues to be escalated up to L2/L3 support levels, whom the organisation don't want anyhow because they are too costly

I have said this before: the VMS system manager must eventually die (I say that with the greatest reluctance) because costs are driving high-maintenance systems out the door (by high maintenance I mean systems that require continual tuning, or experts on hand to ensure users play nice with them).

How to fix directory performance on VMS?
I don't think it can be - or should I say, I don't think it can be fixed to scale to the demands of enterprise-sized systems.

The internet of things is coming; the lion's share of dealing with the data spewing out of your toaster and wearable technologies will go to those systems that can adequately handle huge volumes of small data snippets. VMS isn't ready, and pandering to ODSx/RMS to heal itself will not work either.

Better to allow VMS the ability to support multiple file systems than to work with something that has limitations that were set when the world was a very different place.
Create the new and transition across what is still relevant today - to me that is a sensible approach.

VSI, I suspect, will be busy enough just getting larger storage devices working on VMS; they cannot do everything as they are catching up on years of HP neglect.

abrsvc

Apr 11, 2015, 2:05:33 AM
Try the same test without shadowing. I suspect that there will be a more significant difference.

Please note: I don't think that there is a performance issue either. I know that I have seen extended times with wildcard deletes.

Another note that may be significant here. The performance issues I have seen have been with V7.3-2, NOT any of the V8 variants. Perhaps the "delay" is no longer a problem with V8 and up???

Dan

Jan-Erik Soderholm

Apr 11, 2015, 4:54:56 AM
abrsvc wrote on 2015-04-11 08:05:
> Try the same test without shadowing. I suspect that there will be a
> more significant difference.

I do not think shadowing makes that much of a difference.
But yes, I have one non-shadowed disk (same type)
in the system, so I could rerun it there.

I can also rerun the test on a more up to date system
with storage in a SAN with better up to date performance.

>
> Please note: I don't think that there is a performance issue either. I
> know that I have seen extended times with wildcard deletes.
>
> Another note that may be significant here. The performance issues I have
> seen have been with V7.3-2, NOT any of the V8 variants. Perhaps the
> "delay" is no longer a problem with V8 and up???

Yes, that is also my point. The last time I can remember having some
real problems with this was on some MicroVAX 3100/90 running 7.x.

I cannot remember seeing this "problem" since V8.x came along. And
I'm sure I have seen directories with 10-20.000 (maybe up to the
40-50.000 range) files now and then.

Jan-Erik.

>
> Dan
>

Simon Clubley

Apr 11, 2015, 7:59:06 AM
On 2015-04-10, hb <end...@inter.net> wrote:
> On 04/10/2015 10:32 PM, John Reagan wrote:
>>
>>>
>>> Although Unix doesn't have the /GRAND_TOTAL qualifier for "ls", there
>>> are 840 of those.
>>>
>>
>> $ ls -1 | wc -l
>>
> Shouldn't this be either
> $ ls |wc -l

The -1 was clearly a precaution in case there's a broken ls out there
which still does multi-column output when the output is being fed into
a pipe.

> or
> $ ls -l |grep -v '^total' |wc -l
> ?

I take your point about dropping the total line however.

BTW, you may want to add a "-a" to the ls command as well.

abrsvc

Apr 11, 2015, 8:01:19 AM
I can boot V7.3-2 and a V8 variant on the same hardware with the same disks. I will try this later in the week. I'm leaning toward changes in the V8 stream that may have addressed this "problem".

Dan

Paul Sture

Apr 11, 2015, 1:06:02 PM
On 2015-04-10, hb <end...@inter.net> wrote:
> On 04/10/2015 10:32 PM, John Reagan wrote:
>>
>>>
>>> Although Unix doesn't have the /GRAND_TOTAL qualifier for "ls", there
>>> are 840 of those.
>>>
>>
>> $ ls -1 | wc -l
>>
> Shouldn't this be either
> $ ls |wc -l

'ls' on its own will list the files in columns. To use an example currently
in front of me:

$ ls /usr/share/kbd/keymaps
amiga atari i386 include mac ppc sun

The '-1' switch tells ls to produce a 1 column list:

$ ls -1 /usr/share/kbd/keymaps
amiga
atari
i386
include
mac
ppc
sun


'wc -l' counts the number of lines in the listing:

$ ls -1 /usr/share/kbd/keymaps | wc -l
7

> or
> $ ls -l |grep -v '^total' |wc -l
> ?

That produces a total of the blocks occupied by the files in the
directory (but ignoring the sizes of files in any subdirectories
present).

From 'info ls' on Linux, under the '-l' entry:

For each directory that is listed, preface the files with a line
`total BLOCKS', where BLOCKS is the total disk allocation for all
files in that directory. The block size currently defaults to 1024
bytes, but this can be overridden (*note Block size::). The BLOCKS
computed counts each hard link separately; this is arguably a
deficiency.

These are 1024 byte blocks on Scientific Linux but it depends which
file system you are using :-)

OS X which is BSD based has this under 'info ls' for output using '-l':

In addition, for each directory whose contents are displayed, the
total number of 512-byte blocks used by the files in the directory
is displayed on a line by itself, immediately before the
information for the files in the directory.

I hope that clears up any confusion.

--
Message from user INTERnet on ALPHA1
INTERnet Started

Paul Sture

Apr 11, 2015, 1:18:17 PM
On 2015-04-11, Paul Sture <nos...@sture.ch> wrote:
> On 2015-04-10, hb <end...@inter.net> wrote:

>> or
>> $ ls -l |grep -v '^total' |wc -l
>> ?
>
> That produces a total of the blocks occupied by the files in the
> directory (but ignoring the sizes of files in any subdirectories
> present).

Oops. I didn't notice the -v there. Sorry about that.

Stephen Hoffman

Apr 11, 2015, 1:50:01 PM


On 2015-04-11 05:22:29 +0000, IanD said:

> The OS needs to be the ubiquitous entity that people and processes
> interact with the way they wish to interact with a system, not be
> straight jacketed into having to deal with system limitations

I'm going to use "interact" slightly differently than you have.
Servers aren't visible to end-users. Or shouldn't be. The herds of
servers are and should be visible only to IT staff. Details of the
servers and the operating systems to the most technical of the support
staff and to the developers. Fewer sites even have technical support
staff and staff developers, and many have outsourced that. IT is not a
competitive advantage for many organizations, to paraphrase a common
comment.

> The presentation layer and the data engine should be separated with the
> OS handling the complexities and optimization behind the scenes of how
> to store the data and retrieve and manipulate the data - this is why MS
> were looking at the DB file store concept. A rehashed ODSx isn't going
> to get us there

In terms of the end-users, the user presentation layer is long gone
from OpenVMS, beyond existing applications and developers, and the
presentation layer is certainly gone from most servers in general.

In terms of the technical staff, servers are replaceable and
interchangeable units, running stock installs with few or no
customizations. That's not the same world that begat OpenVMS.

Further down the stack, yes, it's increasingly all automated and
self-tuning and with logging and the rest looking at whole racks and
whole datacenters, and not at individual boxes. The most technical
folks and the developers get to see this layer. But again,
self-tuning and mass deployments are not where OpenVMS came from.

> If there is a limitation, fix it but don't have the users of the system
> have to know every performance trick to get adequate performance from a
> system

Alas, you're going to have to know some tricks for your platform and
you're going to have to know your hardware and software, if you're
pushing it hard enough.

But for most folks, yes, self-tuning, self-administering, and
self-overwriting and self-healing — without losing data — when repairs
or replacements are necessary. Simpler.

> MS wanted some time back to move to a DB file system. From what I have
> read lately, it looks like that concept may have been revisited with
> Windows 8 and some of the ground work added for a future move towards
> this idea again

File systems are databases. Always have been. They're just using
human-understandable metadata tags. As the scale of the data involved
and the computers involved increases, the humans and the human-friendly
metadata becomes the weak point of the classic file system designs.
Computers don't care if the blob of data is tagged as "MyCoolProgram.c"
or "375620F9-03B5-42C1-471D-A21B1DA89F73", after all. Computers don't
care if the filenames are sorted into any particular order, either —
unlike your particular case and your ginormous directory, which have
directories and names which are sorted. So how do you best deal with
humans, and are what the more experienced humans expect — classic file
systems and folders or directories — really the best solution for
various problems? Some yes. Some... not.

> If you want to see how far behind the times ODS2 is, have a quick
> glance at the wiki for file storage systems
>
> http://en.wikipedia.org/wiki/Comparison_of_file_systems
>
> glance at the Limits section as well and you'll see ODS2 is rather
> crippled compared to more modern systems, even ODS5 has noticeable
> limitations

In terms of its design, ODS-5 is ODS-2, for all intents and purposes.
The on-disk differences between the two are FI2DEF versus FI5DEF, and a
constant or two. The filename changes and the directory depth changes
are largely the removal of assumptions and limitations within the
parsers and the applications; in the software. Some related
limitations in other components such as the DCL command line still
lurk. Parts of the ODS-5 design such as the escapements explicitly
worked around the lack of UTF-8 support in DCL and in many parts of the
character-handling support within OpenVMS, for instance.

To me, the newer and layer-breaking file systems are more interesting —
ZFS, etc — as they are new approaches to solving problems, where
OpenVMS uses largely isolated pieces including ODS-2 or -5 and the XQP,
XFC, HBVS and related pieces. Component isolation and layered designs
help in many ways, but can also mean it's far more difficult to
leverage knowledge that's available in one layer from within another
layer. Catching up ODS-2 or ODS-5 with what's available now for
isolated file systems is no small effort, and you're still going to end
up five years behind what other folks are doing, and probably ten years
behind ZFS or other more "radical" changes. From over seven years
ago:
<http://arstechnica.com/gadgets/2008/03/past-present-future-file-systems/>


But again, we don't know what VSI has in mind here for their
replacement file system, either.

> The system I currently support runs an old application. It generates
> 10,000's of files every day, around 2 files a second, all plonked into
> a single directory
> This is NOT abuse by myself or abuse by anyone for that matter (as was
> implied by someone else), this is merely the consequence of dealing
> with an old application that was never designed to scale up to the size
> that it is running at today. It was supposed to be decommissioned years
> ago but these days businesses are not application swapping as much as
> they are application enhancing

Can VSI extract enough revenue to make dealing specifically with your
particular case and that of similar folks viable? Will — as most
sites do — the applications sit unchanged for decades? There's
probably not enough revenue, when you trade off having to maintain the
ability to support and patch those old versions and those old
configurations, and trade off that investment with the benefits that
might accrue from moving forward. Put another way, the old model of
OS and application and hardware support for decades is just not viable
for many of the vendors. Not at the prices that can be charged for the
new software and for support in this era.

Times change.

Servers and software are five years old? Roll in the replacements, and
save on power and support and the overhead with the replacements, and
deal with upgrading your applications as part of moving from one
long-term-support software version to the current long-term-support
version available from your preferred vendor. This approach is
disruptive to expectations, too. It didn't used to be the case that
replacement servers were so much more efficient. This revolving door
model of server deployments is certainly not what the current OpenVMS
folks are accustomed to. Various folks using OpenVMS here have servers
that are ten years old, and sometimes more. AlphaServer DS25 boxes
aren't the bleeding-edge of server technologies. How long has it been
since your ginormous-directory application was substantially upgraded
and/or ported forward?

> How many implementations do you see catering for real future growth and
> real business changes at design time? In an age where companies don't
> even know who/what/where/when the markets that they will be competing
> with tomorrow, it's just implement and hand that other future stuff to
> the operations domain to deal with

Part of that inherently means migrating data and chucking out old
applications, and evolving and updating the surviving applications.
That's painful to existing users, and painful and risky to the vendors.

> People have identified an issue with directory performance in VMS, and
> enough of them to warrant that it's not isolated complaints or people's
> imagination

For your case, try an SSD? I'd also look to inboard flash, but AFAIK
neither OpenVMS nor the Integrity servers have support for that
hardware configuration.

> Splitting up files across directories is a workaround, not a solution.
> Having to do this and having to change reporting scripts/interfaces to
> deal with these workarounds rob people from spending time on productive
> measures - less maintenance on a system, not more should be the key
> driver

You're presuming that there are ways to change certain behaviors of
ODS-2 and ODS-5 short of wholesale replacement. ODS-2 and
ODS-5 maintain sorted lists of files stored in directories scattered
around the disks. This sorted linear list design fundamentally
doesn't scale. Scattered sorted-list files containing metadata is a
design which involves extra disk seeks for those operations, and thus
there are usually caches specifically for that data, and other
design-derivative details fall out of these decisions. Going to a
red-black tree effectively means wholesale replacement of the file
system, which won't make folks happy if/when details change, and that's before
discussions of the entirely understandable aversion to risk that many
people have. There are always trade-offs.
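
To make the scaling point concrete, here is a minimal C sketch of a
directory kept as one sorted array of names. It is a toy model, not the
actual ODS-2 block format, and the sizes are invented; the point is only
that inserting into or deleting from the front of a sorted linear list
shuffles everything behind it, which a tree-structured directory avoids.

#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 20000
#define NAME_LEN    80

/* Toy model only: a directory held as a single sorted array of names,
   loosely analogous to the sorted records in a directory file. */
static char dir[MAX_ENTRIES][NAME_LEN];
static int  count;

/* Insert keeps the array sorted: find the slot, then shuffle every
   later entry down by one -- O(n) data movement per insert. */
static void dir_insert(const char *name)
{
    int i = 0;
    while (i < count && strcmp(dir[i], name) < 0)
        i++;
    memmove(&dir[i + 1], &dir[i], (size_t)(count - i) * NAME_LEN);
    strncpy(dir[i], name, NAME_LEN - 1);
    dir[i][NAME_LEN - 1] = '\0';
    count++;
}

/* Deleting the alphabetically first entry shuffles every remaining
   entry up by one -- O(n) again, so emptying a big directory from the
   front costs O(n^2) moves overall.  A balanced tree (red-black,
   B-tree) makes each individual operation O(log n) instead. */
static void dir_delete_first(void)
{
    if (count > 0) {
        memmove(&dir[0], &dir[1], (size_t)(count - 1) * NAME_LEN);
        count--;
    }
}

int main(void)
{
    char name[NAME_LEN];
    int  i;

    for (i = 0; i < MAX_ENTRIES; i++) {
        sprintf(name, "FILE%05d.DAT", i);
        dir_insert(name);
    }
    while (count > 0)
        dir_delete_first();
    puts("done");
    return 0;
}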


> System automation techniques should be handling these types of system
> limitations automatically so that users and applications are isolated
> from what goes on underneath instead of trying to force the world and
> how it interacts with a system to pussyfoot around what is essentially
> a system limitation

Certainly for a number of cases. For others, this not-knowing approach
verges on installing screws with hammers. It works, but lacks finesse.

> Where I work, systems automation is ramping up. Companies do not want
> to pay for experts to be available at the coal face, they want
> automation robots handling more and more system functions, leaving only
> real issues to be escalated up to L2/L3 support levels, whom the
> organisation don't want anyhow because they are too costly

You're very far from the bleeding edge of this consolidation process.

> I said this before, the VMS system manager must eventually die (I say
> that with the greatest reluctance) because costs are driving high
> maintenance systems out the door (by high maintenance I mean systems
> that require continual tuning, or experts on hand to ensure users play
> nice with them)

As of last summer, OpenVMS was receiving vendor support, with
exceedingly limited enhancements, and with the eventual phase-out of
support basically scheduled.

HP is still executing on that plan, too.

VSI may well extend the long-term usefulness of OpenVMS, and could
reverse the long-term trends for OpenVMS, assuming they can get to
stable revenues.

As for the automation and integration that's necessary here, yes.
That's also no small project for VSI.

> How to fix directory performance on VMS? I don't think it can be, or
> should I say, I don't think it can be fixed to scale to the demands of
> enterprise-sized systems

VSI has mentioned plans for a replacement file system, but no details.

> The internet of things is coming,

Already here. q.v. CVE-2015-2247

> the OS's that get a lions share of dealing with the data spewing out
> from your toaster and wearable technologies will go to those systems
> that can adequately deal with huge volumes of small data snippets. VMS
> isn't ready and pandering to ODSx/RMS to heal itself will not work
> either

Whether OpenVMS can be extended and enhanced far enough and fast enough
and profitably enough remains to be determined. There's presently
certainly a niche for OpenVMS among existing users. Going beyond that
installed base involves years and decades of work, very large amounts
of cash, accruing the revenues to support the effort, and all this in
an environment where the already-competitive products are also moving
forward. It's a project that's vastly larger than your 10,000-file
directory problem.

> Better to allow VMS the ability to support multiple file systems than
> work with something that has limitations that were set when the world
> was a very different place
> Create the new, transition across what is still relevant today to me is
> a sensible approach

That request has been around for a while. Replacement file systems are
difficult and disruptive projects. Replacement file systems are also
slowly adopted at many sites. Around a decade ago, OpenVMS hacked out
the last of the disk-geometry-sensitive I/O code, for instance. That's
before any discussions of Spiralog, too.

> VSI I suspect will be busy enough just getting larger storage devices
> working on VMS, they cannot do everything as they are catching up on
> years of HP neglect

The 32-bit block number is all over the place within OpenVMS, and in
various applications. Much like the move from 512-byte memory pages
to 8192-byte memory pages, any move from 512-byte blocks and 32-bit
addresses to 4096-byte Advanced Format Disks and 64-bit addresses won't
be painless. Then there's that storage is moving off the I/O buses and
in-board, which means that dealing with the old "restart-reboot"
battery-backed RAM logic from the ancient OpenVMS consoles and ancient
OpenVMS-related hardware is coming back to the forefront in other
environments — it's already common in some computing environments.
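
As a back-of-the-envelope illustration of that 512-byte/32-bit ceiling
(my arithmetic, nothing from VSI): a 32-bit block number with 512-byte
blocks tops out around 2 TiB, a signed 32-bit count at roughly half
that, and the same 32-bit number with 4096-byte Advanced Format blocks
reaches about 16 TiB.

#include <stdio.h>
#include <stdint.h>

#define TiB 1099511627776.0   /* 2^40 bytes */

int main(void)
{
    /* Largest volume addressable with a 32-bit block number. */
    double u32_512  = (double)UINT32_MAX * 512.0;   /* classic 512-byte blocks */
    double s32_512  = (double)INT32_MAX  * 512.0;   /* if the count is signed */
    double u32_4096 = (double)UINT32_MAX * 4096.0;  /* 4096-byte AF blocks */

    printf("unsigned 32-bit block number, 512-byte blocks : %.2f TiB\n", u32_512 / TiB);
    printf("signed 32-bit block number,   512-byte blocks : %.2f TiB\n", s32_512 / TiB);
    printf("unsigned 32-bit block number, 4096-byte blocks: %.2f TiB\n", u32_4096 / TiB);
    return 0;
}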

Make no mistake: VSI has a huge amount of work ahead of them, and their
biggest goal is undoubtedly to get their revenues going, and to get
their profits trending in the right direction. For the folks that are
dependent on OpenVMS, this situation also means making a decision to
continue with HP support and HP's roadmap for OpenVMS, or paying VSI
for software that — unless you need those Poulson servers — won't be
all that immediately beneficial to you. Possibly in addition to paying
HP too, depending on local requirements. That sort of application
introspection and that decision process is not going to be an easy
decision for many entities, either. Also whether OpenVMS — whether at
HP or at VSI — will continue to meet your needs for your existing
applications, and also whether OpenVMS will be useful for wholly new
development, and what changes might be necessary or expected or
desirable.

David Froble

unread,
Apr 11, 2015, 3:05:40 PM4/11/15
to
It seems that for some this topic has morphed into a discussion on
performance. I've never questioned the performance. What I question is
what >>>I<<< consider improper usage.

While it is possible to consider a bunch of related data being in a
large number of files in a file system, >>>I<<< think it's better to
group together the bunch of related data into a database. >>>I<<< think
that makes it much more manageable and usable.

Now, that's just >>>me<<<. I'm sure others are allowed to have their
own opinions.

As for Jan-Erik calling some opinions "silly", I'll ask, if the file
system is so good for storing data, why are you using RDB? Look at it
from that perspective.

johnwa...@yahoo.co.uk

unread,
Apr 11, 2015, 3:44:14 PM4/11/15
to
> The 32-bit block number is all over the place within OpenVMS, and in
> various applications. Much like the move from 512-byte memory pages
> to 8192-byte memory pages, any move from 512-byte blocks and 32-bit
> addresses to 4096-byte Advanced Format Disks and 64-bit addresses won't
> be painless.
>
> --
> Pure Personal Opinion | HoffmanLabs LLC

The move from 512-byte pages to (other-size) pages should have been
(was?) largely transparent to most applications, though system
managers and some developers of system-class code needed to think
about it.

Same for changes to file system internals and assorted other things
which will undoubtedly need to change as time goes by. Non-privileged
applications shouldn't (can't?) currently be dependent on this kind
of thing. But if the apps want to take advantage of new features...

There are obviously challenges ahead, for some selection of existing
VMS customers. But similar challenges have already been sorted in
VAX->Alpha and Alpha->IA64 transitions, both in VMS Engineering and
in the world of developers, sysadmins, customers, etc.

The alternative, for a VMS-dependent setup, is to move to a
different OS and toolset(s). That involves a different and quite
possibly more challenging adventure, which might involve starting
again from scratch (or might not, depending on various factors).

It's 2015. The people that are still on VMS aren't on VMS because
it's trendy (though that may have been a factor once upon a time).
They're frequently sticking with VMS because moving off hasn't
been cost-effective. That picture isn't going to change much for
a while yet, though the VSI announcement may hopefully cause some
existing customers who haven't got particular time pressures to
wait a little longer and see what happens (whilst keeping a
watching eye on their alternative options).

New customers? Related discussion, but not the same. As you've
observed many times, preserving compatibility with existing
applications is a double edged sword. VMS on x86-64 may be an
interesting opportunity for a bit of Thinking Different. Maybe.

Interesting times.

David Froble

unread,
Apr 11, 2015, 4:00:39 PM4/11/15
to
Stephen Hoffman wrote:
>
>
> On 2015-04-11 05:22:29 +0000, IanD said:
>
>> The OS needs to be the ubiquitous entity that people and processes
>> interact with the way they wish to interact with a system, not be
>> straight jacketed into having to deal with system limitations
>
> I'm going to use "interact" slightly differently than you have.
> Servers aren't visible to end-users. Or shouldn't be. The herds of
> servers are and should be visible only to IT staff. Details of the
> servers and the operating systems to the most technical of the support
> staff and to the developers. Fewer sites even have technical support
> staff and staff developers, and many have outsourced that. IT is not a
> competitive advantage for many organizations, to paraphrase a common
> comment.

This may be true for some. Not for others. My customers know that
their software gives them a competitive edge, and while they may wish
they could do the same with less, some have tried and failed. Losing a
competitive edge, of whatever kind, can mean losing the company.

I have pondered this movement away from appreciation of "competitive
edge" and such. I've got to wonder how many ever needed it, how many
needed it but gave it up, and (what I consider the vast majority) those
new to computing and never had it, or need it.

>> The presentation layer and the data engine should be separated with
>> the OS handling the complexities and optimization behind the scenes of
>> how to store the data and retrieve and manipulate the data - this is
>> why MS were looking at the DB file store concept. A rehashed ODSx
>> isn't going to get us there
>
> In terms of the end-users, the user presentation layer is long gone from
> OpenVMS, beyond existing applications and developers, and the
> presentation layer is certainly gone from most servers in general.

Usage has changed. Dramatically. In the "good old days" there was
people working on terminals, having replaced the card punch machines.
These people in the middle of transactions was state of the art. Not
today. Today most of the middle people are gone, or are doing something
more useful. Today the user is the customer, using some user interface
from their PC or such.

> In terms of the technical staff, servers are replaceable and
> interchangeable units, running stock installs with few or no
> customizations. That's not the same world that begat OpenVMS.

No, nor is it always better, or as good. Bean counters want to buy a
few PCs and have their clerks use them to run a company. Doesn't always
work out so well.


>> How many implementations do you see catering for real future growth
>> and real business changes at design time? In an age where companies
>> don't even know who/what/where/when the markets that they will be
>> competing with tomorrow, it's just implement and hand that other
>> future stuff to the operations domain to deal with
>
> Part of that inherently means migrating data and chucking out old
> applications, and evolving and updating the surviving applications.
> That's painful to existing users, and painful and risky to the vendors.

Some companies do have a clue as to where they will be in the future.
Some may need to change some things, others just hope to be able to do
as well as today. The best case is a smooth path to where they will
need to be.


>> People have identified an issue with directory performance in VMS, and
>> enough of them to warrant that it's not isolated complaints or
>> people's imagination
>
> For your case, try an SSD? I'd also look to inboard flash, but AFAIK
> neither OpenVMS nor the Integrity servers have support for that hardware
> configuration.
>
>> Splitting up files across directories is a workaround, not a solution.
>> Having to do this and having to change reporting scripts/interfaces to
>> deal with these workarounds rob people from spending time on
>> productive measures - less maintenance on a system, not more should be
>> the key driver

Isn't it possible that the problems come from poor application design,
rather than poor support from the OS, or file system?

If you're going to start with a leaky pipe, and then complain when you
have to constantly perform maintenance to patch it, where should blame lie?

> You're presuming that there are ways to change certain behaviors of
> ODS-2 and ODS-5 behavior, short of wholesale replacement. ODS-2 and
> ODS-5 maintain sorted lists of files stored in directories scattered
> around the disks. This sorted linear list design fundamentally doesn't
> scale. Scattered sorted-list files containing metadata is a design
> which involves extra disk seeks for those operations, and thus there are
> usually caches specifically for that data, and other design-derivative
> details fall out of these decisions. Going to a red-black tree
> effectively means the wholesale replacement of the file system won't
> make folks happy if/when details change, and that's before discussions
> of the entirely understandable aversion to risk that many people have.
> There are always trade-offs.
>
>
>> System automation techniques should be handling these types of system
>> limitations automatically so that users and applications are isolated
>> from what goes on underneath instead of trying to force the world and
>> how it interacts with a system to pussyfoot around what is essentially
>> a system limitation
>
> Certainly for a number of cases. For others, this not-knowing approach
> verges on installing screws with hammers. It works, but lacks finesse.

Oh, my, that brings back a memory. I had a guy putting up some
paneling. This guy loved his hammer. I informed him I preferred
screws, and wanted screws to be used. I wasn't precise enough to also
tell him to screw in the screws. Walked into the room. He had his
beloved hammer, a small nail, he'd use the nail to make a small hole,
place a screw in the hole, and smash it with his hammer.

>> Where I work, systems automation is ramping up. Companies do not want
>> to pay for experts to be available at the coal face, they want
>> automation robots handling more and more system functions, leaving
>> only real issues to be escalated up to L2/L3 support levels, whom the
>> organisation don't want anyhow because they are too costly

The beancounters want the savings in labor costs, but don't want to pay
for those who might be needed to keep the robots working. Yeah, I want
lots of things too. Maybe an F-15, and unlimited fuel. Sometimes
wanting isn't enough.

Stephen Hoffman

unread,
Apr 11, 2015, 4:22:44 PM4/11/15
to
On 2015-04-11 19:44:12 +0000, johnwa...@yahoo.co.uk said:

> The move from 512byte pages to (othersize) pages should have been
> (was?) largely transparent to most applications, though system managers
> and some developers of system-class needed to think about it.

Many folks likely didn't see this change. But the folks that use
memory sections within their applications certainly became familiar
with the page-size changes.

> Same for changes to file system internals and assorted other things
> which will undoubtedly need to change as time goes by. Non-privileged
> applications shouldn't (can't?) currently be dependent on this kind of
> thing. But if the apps want to take advantage of new features...

Anybody using virtual I/O or the IO$_ACPCONTROL XQP interface tends to
have 32-bit blocks scattered around. Various file-related drivers,
such as LDDRIVER. More mundane folks also tend to have 32-bit values
around, whether that's via some of the OpenVMS APIs or the older 32-bit
C fstat()/stat() stuff. Folks that check and monitor disk sizes have
32-bit values. These uses are far from ubiquitous, but they're not
entirely rare, and they can ripple into other components, too. This
ripples right up into DCL, for instance. Folks use f$getdvi to check
disk space and f$file_attributes to check file or directory sizes.
Then there's that DCL can't deal well with unsigned values for the
existing 2 TiB support, and doesn't support 64-bit values at all, and
down the rabbit hole we go...

> There are obviously challenges ahead, for some selection of existing
> VMS customers. But similar challenges have already been sorted in
> VAX->Alpha and Alpha->IA64 transitions, both in VMS Engineering and in
> the world of developers, sysadmins, customers, etc.

In a technical sense, certainly nothing here is insurmountable. The
technical team at VSI is exceedingly skilled, after all.
Unfortunately, few OpenVMS places are anywhere near where they were a
decade or two ago and leading into the last ports, whether in terms of
their OpenVMS staffing, software and configurations and expectations,
or OpenVMS vendor support, or all the rest of the details involved with
bespoke IT. Going forward, it's not at all certain that Oracle will
provide database support to this potential future OpenVMS port to
x86-64 for instance, and a lack of that database support will constrain
some sites. There've been huge changes in general in the computing
industry over the past decade, too.

> The alternative, for a VMS-dependent setup, is to move to a different
> OS and toolset(s). That involves a different and quite possibly more
> challenging adventure, which might involve starting again from scratch
> (or might not, depending on various factors).

Correct. Which gets back to deciding to support VSI and a path
forward, or resuming plans to maintain the current platform locally
and/or to migrate.

> It's 2015. The people that are still on VMS aren't on VMS because it's
> trendy (though that may have been a factor once upon a time). They're
> frequently sticking with VMS because moving off hasn't been
> cost-effective. That picture isn't going to change much for a while
> yet, though the VSI announcement may hopefully cause some existing
> customers who haven't got particular time pressures to wait a little
> longer and see what happens (whilst keeping a watching eye on their
> alternative options).

Ayup. The VSI announcement has deferred some of that work and some of
those discussions. Given HP is still following their roadmap for
OpenVMS, how long any VSI-related porting pause might last is likely
contingent on the details of the next VSI release or three, and on
various details that we'll learn about and know more about over time.
The next Bootcamp will be interesting, too.

> New customers? Related discussion, but not the same. As you've observed
> many times, preserving compatibility with existing applications is a
> double edged sword. VMS on x86-64 may be an
> interesting opportunity for a bit of Thinking Different. Maybe.

I'll certainly be watching which way VSI goes here; toward
compatibility with the existing and the past, or selectively and
judiciously and carefully breaking a few things and preparing to move
forward.

> Interesting times.

Very much so.

Post-V8.4, VSI is the path forward. Where that might lead?

JF Mezei

unread,
Apr 11, 2015, 4:24:55 PM4/11/15
to
On 15-04-11 15:05, David Froble wrote:

> It seems that for some this topic has morphed into a discussion on
> performance. I've never questioned the performance. What I question is
> what >>>I<<< consider improper usage.


Usage has evolved. And the architecture of the file system on VMS has
not kept up with the Joneses (Windows, Linux).

What was considered abusive use of directories yesterday is often very
common today. And if customers need/want to load a directory with 10,000
files and VMS can't handle it properly, these customers will look elsewhere.

Now, is the only performance issue with regards to directories the case
of DELETE *.*;* ? If so, can the current architecture be kept and just
that command updated to deal more efficiently with this operation ?


JF Mezei

unread,
Apr 11, 2015, 5:22:20 PM4/11/15
to
On 15-04-11 16:21, Stephen Hoffman wrote:

> Anybody using virtual I/O or the IO$_ACPCONTROL XQP interface tends to
> have 32-bit blocks scattered around. Various file-related drivers,
> such as LDDRIVER. More mundane folks also tend to have 32-bit values
> around, whether that's via some of the OpenVMS APIs or the older 32-bit
> C fstat()/stat() stuff.

The move to x86 is the perfect opportunity to make those changes since
there is no legacy software to support, and the IA64/Alpha/VAX emulators
can provide the interface from 32-bit structures to the new 64-bit ones.

BUT, that may imply a separate codebase for Alpha/IA64, which remain on
32-bit structures, and x86, which is moving ahead big time.

One has to consider how much longer new software will be released for
that IA64 thing. If customers will beg and line up to move to x86 ASAP,
then will there be much of a need to continue to release new versions of
VMS for IA64 for long? If it is to be a short period, would a separate
code base be such a bad idea?

(I use "separate code base" in a way to say that structures on IA64
would be different from those on 8086. This could be achieved with lots
of if-defs instead or having actual separate code base, or perhaps the
separate code base would be easier for all those parts that become
different, and shared code for all the stuff that remains the same)




abrsvc

unread,
Apr 11, 2015, 5:33:44 PM4/11/15
to
On Saturday, April 11, 2015 at 5:22:20 PM UTC-4, JF Mezei wrote:
> The move to x86 is the perfect oportunity to make those changes since
> there is no legacy software to support, and the IA64/Alpha/VAX emulators
> can provide the interface from 32 bit strucutures to the new 64 bit ones.
>

I would disagree that there is no legacy software. Using your analogy, the move from VAX to Alpha or the move from Alpha to IA64 shouldn't have had any legacy software either.

I view the ability to compile/link/run with minimal changes as a good thing for "old" software that I'd like to keep. I have applications written on the PDP-11 that are running on IA64 today, with updates of course, but the majority of the code remains the same. I expect to be able to have this working on the x86 platform as well, again with minimal change.

Dan

hb

unread,
Apr 11, 2015, 5:34:21 PM4/11/15
to
On 04/11/2015 07:05 PM, Paul Sture wrote:
> On 2015-04-10, hb <end...@inter.net> wrote:
>> On 04/10/2015 10:32 PM, John Reagan wrote:
>>>
>>>>
>>>> Although Unix doesn't have the /GRAND_TOTAL qualified for "ls", there
>>>> are 840 of those.
>>>>
>>>
>>> $ ls -1 | wc -l
>>>
>> Shouldn't this be either
>> $ ls |wc -l

Hmm, I need a different font for my newsreader, better glasses, or both:
I didn't see that it was a one (1); it looked like the letter l to me. I
apologize for any confusion and/or noise.

> 'ls' on its own will list the files in columns. To use an example currently
> in front of me:
>
> $ ls /usr/share/kbd/keymaps
> amiga atari i386 include mac ppc sun

Yes, but with the output redirected into a pipe it lists one file per
line, as can be seen with
$ ls |cat
You explicitly have to add the -C option to override that feature.


Stephen Hoffman

unread,
Apr 11, 2015, 5:54:41 PM4/11/15
to
On 2015-04-11 20:00:27 +0000, David Froble said:

> Stephen Hoffman wrote:
>>
>>
>> On 2015-04-11 05:22:29 +0000, IanD said:
>>
>>> The OS needs to be the ubiquitous entity that people and processes
>>> interact with the way they wish to interact with a system, not be
>>> straight jacketed into having to deal with system limitations
>>
>> I'm going to use "interact" slightly differently than you have.
>> Servers aren't visible to end-users. Or shouldn't be. The herds of
>> servers are and should be visible only to IT staff. Details of the
>> servers and the operating systems to the most technical of the support
>> staff and to the developers. Fewer sites even have technical support
>> staff and staff developers, and many have outsourced that. IT is not a
>> competitive advantage for many organizations, to paraphrase a common
>> comment.
>
> This may be true for some. Not for others. My customers know that
> their software gives them a competitive edge, and while they may wish
> they could do the same with less, some have tried and failed. Losing a
> competitive edge, of whatever kind, can mean losing the company.
>
> I have pondered this movement away from appreciation of "competitive
> edge" and such. I've got to wonder how many ever needed it, how many
> needed it but gave it up, and (what I consider the vast majority) those
> new to computing and never had it, or need it.

So looking at this discussion another way 'round, your customers have
decided that their own IT efforts here are not cost-competitive, and
that you can deliver some of the services that they need, and more
cheaply than they can provide them to and for themselves. They've
commoditized their IT, and you're the provider. You have replaced
parts of various development organizations, or you've allowed the
existing organizations and existing staff to focus on other areas where
they can provide services they and their businesses need. In terms of
their competitive view, they can't do better than you at a price that
they can afford, and they're now about as competitive as the other
folks that are using your services — so they have an edge over some,
but not over the other folks that are also using your services. This
is also why I wrote "for many organizations" and not "for all
organizations". Some folks still have and still do maintain a
competitive edge secondary to their IT capabilities.


>>> The presentation layer and the data engine should be separated with the
>>> OS handling the complexities and optimization behind the scenes of how
>>> to store the data and retrieve and manipulate the data - this is why MS
>>> were looking at the DB file store concept. A rehashed ODSx isn't going
>>> to get us there
>>
>> In terms of the end-users, the user presentation layer is long gone
>> from OpenVMS, beyond existing applications and developers, and the
>> presentation layer is certainly from most servers in general.
>
> Usage has changed. Dramatically. In the "good old days" there was
> people working on terminals, having replaced the card punch machines.
> These people in the middle of transactions was state of the art. Not
> today. Today most of the middle people are gone, or are doing
> something more useful. Today the user is the customer, using some user
> interface from their PC or such.

Or with their mobile devices, which adds a layer of networking
complexity and potentially caching and offline access into the mix.

>> In terms of the technical staff, servers are replaceable and
>> interchangeable units, running stock installs with few or no
>> customizations. That's not the same world that begat OpenVMS.
>
> No, nor is it always better, or as good. Bean counters want to buy a
> few PCs and have their clerks use them to run a company. Doesn't
> always work out so well.

I'm referring to server farms. Not to end-user PCs. Not that most
end-user PCs and Macs used in large organizations aren't increasingly
stock images or otherwise rendered replaceable.

>>> Splitting up files across directories is a workaround, not a solution.
>>> Having to do this and having to change reporting scripts/interfaces to
>>> deal with these workarounds rob people from spending time on productive
>>> measures - less maintenance on a system, not more should be the key
>>> driver
>
> Isn't it possible that the problems come from poor application design,
> rather than poor support from the OS, or file system?

Quite. There have been changes made to OpenVMS to make existing
designs work faster, whether it's changes to RMS and related, or it's
more fundamental changes such as moving from assembler to higher-level
languages — the higher-level tools having more overhead than
well-written Macro32 assembly. It's the manner and the nature of the
business.

> If you're going to start with a leaky pipe, and then complain when you
> have to constantly perform maintenance to patch it, where should blame
> lie?

Depends on who's willing to pay and who's willing to work. There are
folks with some pretty bad plumbing that pay enough to get folks to fix
the problems.


>>> Where I work, systems automation is ramping up. Companies do not want
>>> to pay for experts to be available at the coal face, they want
>>> automation robots handling more and more system functions, leaving only
>>> real issues to be escalated up to L2/L3 support levels, whom the
>>> organisation don't want anyhow because they are too costly
>
> The beancounters want the savings in labor costs, but don't want to pay
> for those who might be needed to keep the robots working. Yeah, I want
> lots of things too. Maybe an F-15, and unlimited fuel. Sometimes
> wanting isn't enough.

The software bots — as they're called in one system I'm using — are
pretty handy for scripting stuff. It's akin to scripting with DCL, but
pushed up several levels in capabilities and in abstraction.

David Froble

unread,
Apr 11, 2015, 8:59:14 PM4/11/15
to
Yep! What he wrote ....

Also, consider the probabilities. If one had to bet on business from
existing VMS customers, or new VMS customers, on which number would one
place his chips?

While attracting new customers would be a good thing, there is the
possibility it won't happen in any great numbers.

Any existing customers are still on VMS because they could not find an
acceptable alternative. After Palmer, and Compaq, and then HP, there
wasn't much hope for VMS. So people were in a pretty bad situation.
With VSI, assuming it is successful, what are the chances of those
customers now deciding to drop usage of VMS? Rather small, I'd think.

Now, I'm not making any claims, but there is the possibility that the
majority of any business VSI can hope for is the current VMS user base.
If so, then backward compatibility might be VERY important. One could
also argue that such would lock them into only the current user base.
Definitely no easy answers.

The bird in the hand, or, possible birds in the bushes ....

JF Mezei

unread,
Apr 11, 2015, 10:13:08 PM4/11/15
to
On 15-04-11 17:33, abrsvc wrote:

> I would disagree that there is no legacy software.

All software that will run on x86-VMS 9.0 will have to be recompiled to
run on that platform. So introduction of new structures may not be such
a big problem, especially if compilers provide the help to make it easy
(for instance, spotting when you move a new 64 bit value to an old 32
bit one).
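
As a small, generic C illustration of the kind of compiler help being
hoped for here (whether and how the VMS compilers flag this depends on
the compiler and its switches; gcc/clang, for instance, need
-Wconversion or similar enabled):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* A hypothetical block count that no longer fits in 32 bits. */
    uint64_t new_block_count = 6000000000ULL;

    /* Old code still using a 32-bit variable.  A compiler with
       conversion/truncation warnings enabled can flag this narrowing
       assignment at recompile time; without the warning, the value
       silently wraps. */
    uint32_t old_block_count = new_block_count;

    printf("64-bit value              : %llu\n",
           (unsigned long long)new_block_count);
    printf("after narrowing to 32 bits: %u\n",
           (unsigned)old_block_count);
    return 0;
}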



Software compiled on IA64 or Alpha may be able to execute without
recompilation on x86, but only through an emulator. That emulator can
provide the layer that supports old structures.


Jan-Erik Soderholm

unread,
Apr 12, 2015, 5:19:17 AM4/12/15
to
A silly question, of course... :-)

The file system stores (more or less unformatted) *files*, the
database system stores (strongly formatted) *data*.
Just to put it simple.

I would never dream of storing our log files from the detached
processes into the Rdb. That would be really silly and would not
give any benefit at all.

A "SEARCH <all logfiles> <some plain text>" is far easier then to try
to come up with the correct SQL syntax aginst an unformated mass
of text.

I see no problem having a couple of 1000s of logfiles spanning some
months back in time for easy SEARCH'es. Call that "improper" all you
like, it still very much helps us in the daily job.

Besides, I read you view of "imporper" use of directories from the start
as directly related to the (former?) performance troubles rellated to
deleting files from "large" directories. Maybe it wasn't. But then,
what would the problem be with having "many" (whatever *that* is)
files as such?

Jan-Erik.



Jan-Erik Soderholm

unread,
Apr 12, 2015, 5:23:14 AM4/12/15
to
JF Mezei skrev den 2015-04-11 22:24:
> On 15-04-11 15:05, David Froble wrote:
>
>> It seems that for some this topic has morphed into a discussion on
>> performance. I've never questioned the performance. What I question is
>> what >>>I<<< consider improper usage.
>
>
> Usage has evolved. And the architecture of the file system on VMS has
> not kept up with the Jones (Window, Linux).
>
> What was considered abusive use of directories yesterday is often very
> common today. And if customers need/want to load a directory with 10,000
> files and VMS can't handle it properly,

It can.

Paul Sture

unread,
Apr 12, 2015, 5:32:14 AM4/12/15
to
On 2015-04-11, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP>
wrote:
> On 2015-04-10, hb <end...@inter.net> wrote:
>> On 04/10/2015 10:32 PM, John Reagan wrote:
>>>
>>>>
>>>> Although Unix doesn't have the /GRAND_TOTAL qualified for "ls", there
>>>> are 840 of those.
>>>>
>>>
>>> $ ls -1 | wc -l
>>>
>> Shouldn't this be either
>> $ ls |wc -l
>
> The -1 was clearly a precaution in case there's a broken ls out there
> which still does multi-column output when the output is being fed into
> a pipe.

I'm pretty sure I've worked with one of those broken implementations
somewhere in the past.

>> or
>> $ ls -l |grep -v '^total' |wc -l
>> ?
>
> I take your point about dropping the total line however.
>
> BTW, you may want to add a "-a" to the ls command as well.

-A might be a better choice. Here's the BSD info on that:

-A List all entries except for . and ... Always set for the super-
user.

(the corresponding info on Linux omits that it's always set for
the superuser aka root, even though that appears to be the
behaviour)

Phillip Helbig (undress to reply)

unread,
Apr 12, 2015, 5:49:48 AM4/12/15
to
In article <0c6d1138-f0a8-4ebe...@googlegroups.com>,
johnwa...@yahoo.co.uk writes:

> The move from 512byte pages to (othersize) pages should have been
> (was?) largely transparent to most applications, though system
> managers and some developers of system-class needed to think about
> it.

In particular, HELP and other documentation isn't always clear whether
512 bytes or some other size is intended.

Stephen Hoffman

unread,
Apr 12, 2015, 10:57:52 AM4/12/15
to
On 2015-04-12 09:19:15 +0000, Jan-Erik Soderholm said:

> The file system stores (more or less unformatted) *files*, the database
> system stores (strongly formatted) *data*.
> Just to put it simple.

Databases can also store what are called blobs. What's a blob but a file?

> I would never dream of storing our log files from the detached
> processes into the Rdb. That would be realy silly and would not give
> any benefit at all.

I've stored blobs in databases before. Images, both pictures and
executables. Works nicely, too. Log files from detached processes
would be more easily scanned and the text loaded, so there's not much
reason to store those in a blob in a database. Now storing those as
text in database rows, that might be handy for some folks.

> A "SEARCH <all logfiles> <some plain text>" is far easier then to try
> to come up with the correct SQL syntax aginst an unformated mass of
> text.

Rdb isn't the only tool in the shed, nor is SQL the only way, nor is
the OpenVMS SEARCH command a particularly robust or modern utility, and
your current environment is not the only way. Most folks would
probably scan the file structure and the logs and the database ahead of
the query and generate search metadata, and potentially use map-reduce
to generate the metadata index. That'll be seriously fast — in seconds
and less, rather than being limited by the speed of scanning all of the
individual files. Having first used this directly for large source
code pools and even now smaller projects, using SEARCH to find a log
entry or a line of source code is... tedious... glacial. This started
out with a version of AltaVista that was running on a Windows system
with access to the source pool, then with ht://Dig running on OpenVMS,
and now other modern search tools on other platforms.

But then this is your environment, your budget, and your decisions. If
it works for you, then go for it. But please don't ever get "stuck"
thinking that there's only one way to solve something, or that your
present solution is even still applicable. Some of these tools might
solve problems that you have faster and more easily, for instance. The
application requirements or the scale or the scope or the integration
requirements can and often do change over time. More subtly, some can
solve problems that you already have, but that you don't even currently
recognize as being problems.

> I see no problem having a couple of 1000s of logfiles spanning some
> months back in time for easy SEARCH'es. Call that "improper" all you
> like, it still very much help us in the daily job.

It's your solution for your environment within your expectations. It
works. In another environment, it'd be the slowest and least desirable
and most cluttered approach. But then again, neither approach here is
right or wrong.

> Besides, I read you view of "imporper" use of directories from the
> start as directly related to the (former?) performance troubles
> rellated to deleting files from "large" directories.

In the same manner that you're asking David: did your view of what's
improper to store and to index in a database start with the limitations
of Rdb, and/or with your lack of a need for, or relative unfamiliarity
with, faster local searches?

> Maybe it wasn't. But then, what would the problem be with having "many"
> (whatever *that* is) files as such?

Maybe it wasn't. But then would the problem be that you're not set up
to use blobs, and using centralized logging — with one server, there's
not much point in distributed logging, after all — and with the
limitations of Rdb around searching text, for instance.

> Jan-Erik.

Your design and your environment and your hardware and your software
works well for your needs now and that's far and away the biggest
factor here. But — and you knew that was coming — when generalizing
into other areas and other applications, please recall that while Rdb
is a very good relational database and OpenVMS is a good operating
system, some of the capabilities and tools that you're working with and
are familiar with are a decade or more old. This when viewed in terms
of the capabilities of some other available databases and tools. For
the OpenVMS users that need SEARCH for instance, having a modern, fast
metadata search would be very beneficial.

The downside of your present application environment and current
situation is — though it works, and it meets your needs — that it can
be very easy to miss out on options and features and possibilities that
can pull your efficiency or your integration or your application
forward — to make you and your work more valuable to your customer.
For one isolated server with a large collection of bespoke software in
one installation, that might not seem a big issue. But then there
might be ways to migrate or to replace or to augment tools
incrementally, and to provide better efficiencies. Things like
automated log analysis, for instance — learning what's normal in your
logs or your data and what's not, and then automatically notifying for
anomalies — sort of like email spam filtering, for logs. But where the
usual log chatter gets filtered, and you only deal with the "huh?"
stuff. You could use faster and full-text searches. There may well
be other things that can move your environment forward, too.

Jan-Erik Soderholm

unread,
Apr 12, 2015, 11:37:32 AM4/12/15
to
Stephen Hoffman skrev den 2015-04-12 16:57:

> ...OpenVMS SEARCH command a particularly robust or modern utility...

Hm, SEARCH has never failed, as far as I remember.
It has always done what was expected.

> That'll be seriously fast — in seconds and less, rather
> than being limited by the speed of scanning all of the individual files.

Doing a SEARCH on a couple of 1000s of files (takes a few secs) once
a week (or whatever) is not a real problem that needs another solution.

>
> Maybe it wasn't. But then would the problem be that you're not set up to
> use blobs...

The "problem" is that blobs it is not needed (in this case).

(And yes, I have used blobs in Rdb in other cases where having
keyword access to binary data made sense. It would not make sense
in the case of our detached processes log files.)


Jan-Erik.

David Froble

unread,
Apr 12, 2015, 1:12:42 PM4/12/15
to
I've seen images stored in databases. I've also seen them stored in
files and pointed to by database fields. Both work.

> I would never dream of storing our log files from the detached
> processes into the Rdb. That would be realy silly and would not
> give any benefit at all.
>
> A "SEARCH <all logfiles> <some plain text>" is far easier then to try
> to come up with the correct SQL syntax aginst an unformated mass
> of text.

I too do not have a problem with keeping some types of log files.
Sometimes they can be cumulative. At other times not.

> I see no problem having a couple of 1000s of logfiles spanning some
> months back in time for easy SEARCH'es. Call that "improper" all you
> like, it still very much help us in the daily job.

I would not call that improper at all. I do the same. With some
things. However, if it's really important data, perhaps there might be
better methods of archiving it. Many log files, as I understand the
term, are transitory and should not have a long life span.

> Besides, I read you view of "imporper" use of directories from the start
> as directly related to the (former?) performance troubles rellated to
> deleting files from "large" directories. Maybe it wasn't. But then,
> what would the problem be with having "many" (whatever *that* is)
> files as such?

I think care must be taken in design of software, and a design that
includes many small data files is in my opinion possibly a poor design.
There can be many reasons to use a particular approach. For example,
it might be easier to purge old data by creation date of the file(s).
It really depends on the application.

But, I've seen some really poor designs. Possibly because they are
rather old. Anything new should be carefully thought out, and new
capabilities can be much better than thousands of files in the file system.

Stephen Hoffman

unread,
Apr 12, 2015, 1:43:30 PM4/12/15
to
On 2015-04-12 17:12:35 +0000, David Froble said:

> I think care must be taken in design of software, and a design that
> includes many small data files is in my opinion possible a poor design.
> There can be many reasons to use a particular approach. For example,
> it might be easier to purge old data by creation date of the file(s).
> It really depends on the application.
>
> But, I've seen some really poor designs. Possibly because they are
> rather old. Anything new should be carefully thought out, and new
> capabilities can be much better than thousands of files in the file
> system.

Similar care must be taken with existing and functioning software, as
well. Any old and functional software should also be occasionally
revisited and reviewed and carefully re-thought, too.

This beyond the usual discussions of bug densities and of related
factors that might lead to remedial work. Consider that the general
requirements, and the scope and scale and performance requirements can
and do all change over time, and — just as important — so do the
capabilities and advantages of different tools and newer tools.

This is most certainly not to suggest chucking out all the old
software, but it might lead to plans for incremental changes, or plans
for the incremental migration or wholesale replacement of some tools,
where there are better options. Or it might mean finding that the
existing software still meets your needs, and that there are not
particularly better options. It might well lead to just starting to
use new features of your compiler or your environment. The definition
of "better" here is entirely local and obviously varies in terms of
features and benefits and price, but you don't want to get so tied into
an existing approach that you miss out on advantages or improvements or
cheaper or better replacements.

Getting stuck on how any particular tool works — even a good tool — can
blind you to the advantages and alternatives that might be available
with changes.

There are newer hammer designs available now, with more useful features
and better materials for weight or shock reduction. Some of these new
designs can be quite superior to the existing hammers, meaning that
spending some money to replace a favorite hammer might be appropriate.
Beyond the new hammers and new hammer designs, there are now nail
guns. Which are just ginormously better at some common tasks.

Phillip Helbig (undress to reply)

unread,
Apr 12, 2015, 2:24:03 PM4/12/15
to
In article <mge13i$1he$1...@dont-email.me>, Stephen Hoffman
<seao...@hoffmanlabs.invalid> writes:

> On 2015-04-12 09:19:15 +0000, Jan-Erik Soderholm said:
>
> > The file system stores (more or less unformatted) *files*, the database
> > system stores (strongly formatted) *data*.
> > Just to put it simple.
>
> Databases can also store what are called blobs. What's a blob but a file>

BLOB (in this context) = Binary Large OBject

Paul Sture

unread,
Apr 12, 2015, 2:53:11 PM4/12/15
to
On 2015-04-12, David Froble <da...@tsoft-inc.com> wrote:
> Jan-Erik Soderholm wrote:
>
>> The file system stores (more or less unformatted) *files*, the
>> database system stores (strongly formatted) *data*.
>> Just to put it simple.
>
> I've seen images stored in databases. I've also seen them stored in
> files and pointed to by database fields. Both work.

I've recently been wondering about this for a new project.

The following paper "To BLOB or Not To BLOB: Large Object Storage in a
Database or a Filesystem?" is now 9 years old but it's still worth a
read:

<http://research.microsoft.com/pubs/64525/tr-2006-45.pdf>

FWIW most of the PDFs I have from online transactions fall beneath 256KB
and many are less than 50KB. Having done a quick scan of all the PDFs
on my system and sorting into ascending size, by the time I get to 256KB
I'm seeing software and hardware manuals (obviously there's the odd
monster in there, usually the result of scanning).

Craig A. Berry

unread,
Apr 12, 2015, 5:27:08 PM4/12/15
to
On 4/11/15 9:13 PM, JF Mezei wrote:
> On 15-04-11 17:33, abrsvc wrote:
>
>> I would disagree that there is no legacy software.
>
> All software that will run on x86-VMS 9.0 will have to be recompiled to
> run on that platform. So introduction of new structures may not so such
> a big problem, especially if compilers provide the help to make it easy
> (for instance, spotting when you move a new 64 bit value to an old 32
> bit one).

If we'd had to make those kind of changes when we moved from Alpha to
Itanium five years ago I can say with some confidence we'd still be on
Alpha.

JF Mezei

unread,
Apr 12, 2015, 5:55:45 PM4/12/15
to
On 15-04-12 17:27, Craig A. Berry wrote:

> If we'd had to make those kind of changes when we moved from Alpha to
> Itanium five years ago I can say with some confidence we'd still be on
> Alpha.


Found out that while "long" on VMS remained 32 bits (you need long-long
to get 64 bits), it moved to 64 bits on OS-X. So a program that was
reading raw data from a file had to have that huge change: modifying 3
items in a struct from "long" to "int". (after testing "sizeof" for
various OS-X definitions). Apple was able to make that move to 64 bits
in a couple of years. VMS seems to still have VAX 32 bit artifacts 2
decades after the switch.
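
For what it's worth, the usual portable fix for that sort of raw-record
struct is to stop declaring on-disk fields as "long" at all and to use
the fixed-width types from <stdint.h>; a minimal sketch, with field
names invented for the example:

#include <stdint.h>

/* A record layout read raw from a file.  With "long" fields the struct
   changes size between a 32-bit-long platform (VMS) and an LP64
   platform (64-bit OS-X and most 64-bit Unixes); int32_t keeps the
   layout the same everywhere.  (Alignment/padding still needs care.) */
struct raw_record {
    int32_t  id;          /* was: long */
    int32_t  length;      /* was: long */
    int32_t  timestamp;   /* was: long */
    char     name[32];
};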

Sorry, but with the realities of hardware moving to sizes beyond 4GB,
there need to be changes to programs. The days of the
RA82 disks at 650MB are over.

If you don't want to recompile your code and adapt any hacks you did at
low levels to use 64-bit values, I suggest you run your code in
emulation that will support the old 32 bit structures.

If your code is in maintenance mode, then you can leave it on
IA64/Alpha or in emulation on x86. But you shouldn't prevent those
customers who are actively developing from building software on a
modern OS that has finally shed its 1980s limitations.

I say it is time for VMS to catch up. If they want to provide 32-bit
emulation for upward software compatibility, that is fine. But native
VMS needs to evolve. And the jump to x86 is the best time to make that
evolution and provide long-term stability for those who have moved to x86,
since they will already have gone through conversion once and can then be stable.

If you go through the x86 port, and a couple of years later you need to
upgrade software to support 64 bits, then you have more instability/trouble.


Consider that VAX to Alpha also involved converting from VAX-C to DEC-C,
which is a harder conversion to go through.

Craig A. Berry

unread,
Apr 12, 2015, 7:33:38 PM4/12/15
to
On 4/12/15 4:55 PM, JF Mezei wrote:
> On 15-04-12 17:27, Craig A. Berry wrote:
>
>> If we'd had to make those kind of changes when we moved from Alpha to
>> Itanium five years ago I can say with some confidence we'd still be on
>> Alpha.
>
>
> Found out that while "long" on VMS remained 32 bits (you need long-long
> to get 64 bits), it moved to 64 bits on OS-X. So a program that was
> reading raw data from a file have to have that huge change: modifying 3
> items in a struct from "long" to "int". (after testing "sizeof" for
> various OS-X definitions). Apple was able to make that move to 64 bits
> in a couple of years. VMS seems to still have VAX 32 bit artifacts 2
> decades after the switch.
>
> Sorry, but with the realities of hardware moving to sizes that handle
> beyond 4GB sizes, there needs to be changes to programs. The days of the
> RA82 disks at 650MB are over.

The CRTL has supported file sizes larger than 2GB for some time by
optionally providing a 64-bit off_t. I can't imagine why you think the
size of a long has anything to do with it.
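
For anyone who hasn't used it, here is a minimal sketch of what that
looks like, assuming a CRTL recent enough to honor the _LARGEFILE
feature macro (check the C RTL documentation for your version); the
file name is made up:

    #define _LARGEFILE       /* request a 64-bit off_t; must precede the includes */
    #include <stdio.h>
    #include <sys/stat.h>

    int main(void)
    {
        struct stat sb;
        if (stat("BIGFILE.DAT", &sb) == 0)
            printf("size in bytes: %lld (sizeof(off_t) = %u)\n",
                   (long long) sb.st_size, (unsigned) sizeof(off_t));
        return 0;
    }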

> If you don't want to recompile your code and adapt any hacks you did at
> low levels to use 64 bits values, I suggest you run your code in
> emulation that will support the old 32 bit structures.
>
> If your code is in maintenance mode, then you can leave it on
> IA64/Alphga or on emulation on x86. But you shouldn't prevent those
> customers who are actively developping from building software on a
> modern OS that has finally shed its 1980s limitations.

Needing existing software to continue working without invasive changes
or a complete rewrite may be what you consider maintenance mode.
Thankfully the VMS engineers called it "compile and go" during the
Itanium port and have stated the same goal for the x86_64 port.

Providing different models and better 64-bit support is certainly
desirable -- even essential -- but there is no reason to make a new
model the only option and many reasons not to. Even Microsoft did not
force 64-bit down everyone's throat all at once and many of the most
useful Windows programs out there are 32-bit only but work just fine
under 64-bit Windows. Before you reply, read up on the difference
between LP64 and LLP64 and make sure you understand the implications.
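
For the record, a small illustration of what those data models
conventionally mean; the size comments describe the usual conventions,
not any particular compiler:

    #include <stdio.h>

    /* Conventional data models (typical, not universal):
     *   ILP32: int=32, long=32, pointer=32   (traditional 32-bit)
     *   LLP64: int=32, long=32, pointer=64   (64-bit Windows)
     *   LP64:  int=32, long=64, pointer=64   (64-bit Unix, OS X)
     * Code assuming sizeof(long) == sizeof(void *) breaks under LLP64;
     * code assuming long is 32 bits breaks under LP64. */
    int main(void)
    {
        printf("int=%u long=%u long long=%u void*=%u\n",
               (unsigned) sizeof(int), (unsigned) sizeof(long),
               (unsigned) sizeof(long long), (unsigned) sizeof(void *));
        return 0;
    }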

> I say it is time for VMS to catch up. If they want to provide 32bit
> emulation for upward software compatibility, that is fine. But native
> VMS needs to evolve. And the jump to x86 is the best time to make that
> evolution and make long term stability for those who have moved to x86
> since they will already go through conversion once and then be stable.

VMS certainly does need to catch up in many areas but if the existing
32-bit interfaces to LIB$ and OTS$ and SMG$ and SYS$ calls are only
available via emulation in VMS 9.0, then there's just not going to be
much interest in moving to it. You're talking about breaking 90% of the
existing APIs for a doctrinal point that is of no practical benefit to
most applications.

It's unfortunate that VMS 8.x was spent moving to Itanium and VMS 9.x
will be spent getting to x86_64, diverting attention and resources from
other new things that would naturally go with a major release, whether
it be new quotas, new SYSGEN parameters, new file systems, new
applications, etc., but that's where we are. I think we'll just have to
get our medicine in smaller doses, and hope that there's a 9.1 and 9.2
and so on, each introducing a deprecation cycle for some features,
followed by removal in the next release.

Simon Clubley

unread,
Apr 12, 2015, 8:02:42 PM4/12/15
to
On 2015-04-12, Jan-Erik Soderholm <jan-erik....@telia.com> wrote:
> Stephen Hoffman skrev den 2015-04-12 16:57:
>
>> ...OpenVMS SEARCH command a particularly robust or modern utility...
>
> Hm, SEARCH have never failed, as far as I remember.
> It has always done what was expected.
>

Regex support would be nice.

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world

David Froble

unread,
Apr 12, 2015, 9:31:57 PM4/12/15
to
JF Mezei wrote:

Oh, boy, time to beat up on JF. Not sure where to start, there are so
many options.

:-)

> On 15-04-12 17:27, Craig A. Berry wrote:
>
>> If we'd had to make those kind of changes when we moved from Alpha to
>> Itanium five years ago I can say with some confidence we'd still be on
>> Alpha.
>
>
> Found out that while "long" on VMS remained 32 bits (you need long-long
> to get 64 bits), it moved to 64 bits on OS-X. So a program that was
> reading raw data from a file have to have that huge change: modifying 3
> items in a struct from "long" to "int". (after testing "sizeof" for
> various OS-X definitions). Apple was able to make that move to 64 bits
> in a couple of years. VMS seems to still have VAX 32 bit artifacts 2
> decades after the switch.

What's the real issue here?  Just where are 64 bit integers needed and
not already available?  Some or all of the languages provide this data
type.  Now if you're thinking of large disks, making internal changes
for this should affect few, if any, applications.

Longwords are still adequate for many tasks.

Words are still adequate for many tasks.

Bytes are still adequate for many tasks.

So, where did this demand for 64 bit integers come from ???

> Sorry, but with the realities of hardware moving to sizes that handle
> beyond 4GB sizes, there needs to be changes to programs. The days of the
> RA82 disks at 650MB are over.

Well, yeah, so what? Two terabyte disks are much larger than that RA82,
and can be used. If there are users who need more on one spindle, they
are few and far between. As for using larger disks, if a user doesn't
need to see a very large disk, then in some SANs the storage can be
offered to VMS in sizes that can be used, regardless of the physical
size of the disks.

> If you don't want to recompile your code and adapt any hacks you did at
> low levels to use 64 bits values, I suggest you run your code in
> emulation that will support the old 32 bit structures.

I suggest you .... wait, this is a family oriented venue, can't say
that ....

And really, how many applications would have done that?

> If your code is in maintenance mode, then you can leave it on
> IA64/Alphga or on emulation on x86. But you shouldn't prevent those
> customers who are actively developping from building software on a
> modern OS that has finally shed its 1980s limitations.

Why would existing code no longer work? Are you suggesting doing away
with all integer data types smaller than 64 bits?

> I say it is time for VMS to catch up. If they want to provide 32bit
> emulation for upward software compatibility, that is fine. But native
> VMS needs to evolve. And the jump to x86 is the best time to make that
> evolution and make long term stability for those who have moved to x86
> since they will already go through conversion once and then be stable.

You still haven't given one reason to stop using smaller data types.
And as I've mentioned, most or all VMS languages already support 64 bit
integers. So just what is it that you think should happen?

> If you go through the x86 port, and a couple years later, you need to
> upgrade software to support 64 bits, then you have more instability/trouble.

If you need 64 bits, you're already using it.

> Consider that VAX to ALpha also involved converting from VAX-C to DEC-C
> which is a harder conversion to go through.

I'd have to go look, since I don't use, or like, C. But I do believe
DEC C existed before the Alpha did. If so, and it wasn't running on
VAX, then what was it running on? I also believe the DEC C compiler
allows you to continue to use code written for VAX C. So, where was the
conversion?


So far, most of what I've heard is that addressing disks larger than 2
TB is a problem. Why would that somehow morph into "everything must be
64 bits" ? Yes, there will be some places where 32 bit integers are
used, and will need to be 64 bit integers. But the places are limited,
can be known, and having dual capabilities isn't such a hard thing, is
it? Someone has already suggested implementing additional library /
system service routines for the 64 bit stuff, leaving the 32 bit stuff
alone, thus not breaking any code.  Then if you really had to use the 64
bit routines, they'd also be there. Or the people at VSI might have
some better ideas.

JF Mezei

unread,
Apr 12, 2015, 11:36:19 PM4/12/15
to
On 15-04-12 21:31, David Froble wrote:

> What's the real issue here? Just where are 64 bit integers needed,

Some VMS structures are still 32 bits. People who make low level calls
that make use of those structures expect 32 bits. If VMS changes to 64
bits to handle more modern file systems and devices, then those programs
that interface with the system at that level need to define their
variables to be 64 bits.

Some people really resist this despite warnings about the move to 64
bits that started in 1992.



> Longwords are still adequate for many tasks.

Yes they are. But parts of VMS remain stuck on 32 bits to please old
timers, and that prevents VMS from moving to modern times.

> So, where did this demand for 64 bit integers come from ???

Bigger files, bigger devices.


> Well, yeah, so what? Two terabyte disks are much larger than that RA82,

But there are drives bigger than 2TB today. What happens when you have
hardware with larger drives ? You end up like the VAXstation 3100 only
able to support 1GB drives due to bit limitations.

> I suggest you .... wait, this is a family oriented venue, can't say
> that ....

Not long enough to do that. Sorry.

> And really, haw many applications would have done that?

*I* have no problems with moving VMS to 64 bits. But some people have
code that is apparently expecting many 32 bit values from system
structures and would balk at changing the definition of their variables
to 64 bits because that would be more than a simple recompile.

> Why would existing code no longer work? Are you suggesting doing away
> with all integer data types smaller than 64 bits?

You get a system service that returns a structure with a 64 bit value.
Your app tries to move that value to a 2 byte word.

> VAX, then what was it running on? I also believe the DEC C compiler
> allows you to continue to use code written for VAX C. So, where was the
> conversion?

VAX-C = Kernighan and Ritchie.
DEC-C = ANSI.

There was the need for function prototypes, and for updating all strings
now passed "by value" instead of by reference (passing by reference is
now done by default). Other changes too.

Consider this:

In UNIX, you're supposed to define a variable holding the time as
"time_t my_variable;"

So when Apple changed time_t to become 64 bits, a simple recompile is
all that was needed.

But when you define your variable as "int" and the time values move to
64 bits, then you have to update your definitions.

Bob Gezelter

unread,
Apr 13, 2015, 6:08:01 AM4/13/15
to
On Sunday, April 12, 2015 at 11:36:19 PM UTC-4, JF Mezei wrote:
> On 15-04-12 21:31, David Froble wrote:
>
...
> But when you define your variable as "int" and the time values move to
> 64 bits, then you have to update your definitions.

David,

JF does have a point here. It is as much about marginal coding practices in C and other languages as it is about VMS.

It is a fact that many of the OpenVMS APIs (e.g., GET/SET DVI) use explicitly defined 32-bit fields (aka longwords) for various quantities. It is equally a fact that the values in some domains (e.g., memory, mass storage) have reached, or are rapidly approaching, the point of routinely exceeding those quantities (e.g., free space, total blocks).

One of the confusing points of this discussion is that the term "64 bit API" has, and is, used to refer to the memory addresses pointing to or contained in such structures. In that case, one can have both versions for both existing programs and new, large address space programs.

In this particular case, JF has raised the question of 32 bit memory space programs accessing information that is inherently larger than the 32 bit value fields. This is a harder problem to dodge. As Hoff has mentioned several times, there is a very large population of existing code which a change in structures would affect. Users have varying degrees of ability to easily mitigate this problem.

The practice of using C types such as "long" rather than type aliases (e.g., time_t) makes the scanning problem worse by creating false positives, each of which must be resolved (not to mention the variables and internal program interfaces that are coded as expecting 32 bit values).

Creating spoofed returns for 32 bit values in a 64 bit world is problematic. Programs written before the use of the spoofed values are likely to report strange results. This was seen with the RSX-11 AME. It was manageable, but it did cause problems (e.g., device names). Comp.os.vms had this discussion several months ago in a different thread.

In the end, we will likely end up with a bit of a muddle. Existing code (binary and dusty deck) will require the use of 32-bit APIs in both senses (address space and values). Separately named system services will support the 64-bit world (on the principle that one should never change the API of an existing interface point; far better to have a parallel name for the new interface).

- Bob Gezelter, http://www.rlgsc.com

johnwa...@yahoo.co.uk

unread,
Apr 13, 2015, 7:05:31 AM4/13/15
to
I have a vague memory that Solaris (and maybe even Tru64/OSF/1) had
compile-time and link-time options that could be used to make an
existing 32bit address space application (ie sizeof char* == 4)
built and running on a 64bit OS (ie sizeof char* == 8) believe it was
still in a 32bit world.

Could that be done, technically, for VMS code? Whether it would be
sensible to do so is a different discussion.

But we need to be careful to avoid conflating several separate (but
related) issues here:

* how big is int? (hint: anyone who uses int, naked+unqualified, is
heading for trouble. If size matters, int16_t, UINT16, whatever, are
your friends, though not a panacea).
* how big is an address?
* how big is a byte offset within a file (and indeed a whole load of
other stuff where 32bits used to be enough but aren't any more, eg
time_t)

Those things are visible at user-program level and the fixes are
widely understood and really just good practice anyway. Implementing
the fixes right means the same source works in either a 32bit or a
64bit (addressing, filesize, etc) world. What's not to like (except
the effort of fixing stuff that wasn't done right previously)?
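
A minimal sketch of that practice, assuming a C99-ish compiler that
provides <stdint.h>; the record layout is invented for illustration:

    #include <stdint.h>
    #include <stdio.h>

    /* Field widths chosen by the external format, not by the host CPU. */
    struct on_disk_record {
        uint32_t record_id;
        uint16_t flags;
        uint16_t length;
        int64_t  timestamp;
    };

    int main(void)
    {
        /* size_t/off_t/ptrdiff_t for object sizes, file offsets and pointer
         * arithmetic; exact-width types only where an external format
         * demands them; plain int stays fine for small local counters. */
        printf("record is %u bytes\n",
               (unsigned) sizeof(struct on_disk_record));
        return 0;
    }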

It gets more complicated, but still far from insurmountable, when
programs are using the VMS-specific APIs where more complex changes
might turn out to be necessary, IF compatibility is abandoned.

hb

unread,
Apr 13, 2015, 7:51:47 AM4/13/15
to
On 04/13/2015 01:05 PM, johnwa...@yahoo.co.uk wrote:
> I have a vague memory that Solaris (and maybe even Tru64/OSF/1) had
> compile-time and link-time options that could be used to make an
> existing 32bit address space application (ie sizeof char* == 4)
> built and running on a 64bit OS (ie sizeof char* == 8) believe it was
> still in a 32bit world.

For Tru64 the options are -taso for ld and -xtaso and/or -xtaso_short for
compilers.

Bob Koehler

unread,
Apr 13, 2015, 9:05:01 AM4/13/15
to
In article <38560021-0810-486f...@googlegroups.com>, johnwa...@yahoo.co.uk writes:
> On Friday, 10 April 2015 14:21:37 UTC+1, Bob Koehler wrote:
>> In article <mg6v86$l5n$1...@dont-email.me>, David Froble <da...@tsoft-inc.com> writes:
>> >
>> > Got to wonder when so many people learned to attempt to pound square
>> > pegs through round holes ??????????
>>
>> On UNIX, where there are no pegs and no holes, only saw dust piles.
>
> Careful.
>
> On Linux (and on a decent modern UNIX), boss can frequently choose the
> most appropriate file system for the job at hand. You don't *have* to
> do that, and sensible defaults are frequently available, but you do
> have the choice.
>

So you have a choice between piles of sawdust, and piles of sand.
Still no pegs and no holes.

Bob Koehler

unread,
Apr 13, 2015, 9:06:46 AM4/13/15
to
In article <mg91vs$283$4...@dont-email.me>, Simon Clubley <clubley@remove_me.eisner.decus.org-Earth.UFP> writes:
>
> As a Public Service Announcement, I would like to bring your attention
> to a wonderful new invention called a scroll bar. :-)
>
> You may wish to investigate this wonderful new technology. :-)

Scroll bars? Scroll bars? My VT don't need no steenkin scroll bars.

Bill Gunshannon

unread,
Apr 13, 2015, 9:16:53 AM4/13/15
to
In article <e8rqvb-...@news.chingola.ch>,
Paul Sture <nos...@sture.ch> writes:
> On 2015-04-12, David Froble <da...@tsoft-inc.com> wrote:
>> Jan-Erik Soderholm wrote:
>>
>>> The file system stores (more or less unformatted) *files*, the
>>> database system stores (strongly formatted) *data*.
>>> Just to put it simple.
>>
>> I've seen images stored in databases. I've also seen them stored in
>> files and pointed to by database fields. Both work.
>
> I've recently been wondering about this for a new project.
>
> The following paper "To BLOB or Not To BLOB: Large Object Storage in a
> Database or a Filesystem?" is now 9 years old but it's still worth a
> read:
>
> <http://research.microsoft.com/pubs/64525/tr-2006-45.pdf>

Isn't this what NoSQL databases, specifically things like MongoDB, are
targetting?

>
> FWIW most of the PDFs I have from online transactions fall beneath 256KB
> and many are less than 50KB. Having done a quick scan of all the PDFs
> on my system and sorting into ascending size, by the time I get to 256KB
> I'm seeing software and hardware manuals (obviously there's the odd
> monster in there, usually the result of scanning).
>

bill

--
Bill Gunshannon | de-moc-ra-cy (di mok' ra see) n. Three wolves
bill...@cs.scranton.edu | and a sheep voting on what's for dinner.
University of Scranton |
Scranton, Pennsylvania | #include <std.disclaimer.h>

Bill Gunshannon

unread,
Apr 13, 2015, 9:27:18 AM4/13/15
to
In article <9s0ovb-...@news.chingola.ch>,
Paul Sture <nos...@sture.ch> writes:
> On 2015-04-10, hb <end...@inter.net> wrote:
>> On 04/10/2015 10:32 PM, John Reagan wrote:
>>>
>>>>
>>>> Although Unix doesn't have the /GRAND_TOTAL qualified for "ls", there
>>>> are 840 of those.
>>>>
>>>
>>> $ ls -1 | wc -l
>>>
>> Shouldn't this be either
>> $ ls |wc -l
>
> 'ls' on its own will list the files in columns. To use an example currently
> in front of me:
>
> $ ls /usr/share/kbd/keymaps
> amiga atari i386 include mac ppc sun

But not when piped.

>
> The '-1' switch tells ls to produce a 1 column list:
>
> $ ls -1 /usr/share/kbd/keymaps
> amiga
> atari
> i386
> include
> mac
> ppc
> sun

Again, unless your "ls" is broken this is unnecessary if you are piping
it into something else, like wc.

>
>
> 'wc -l' counts the number of lines in the listing:
>
> $ ls -1 /usr/share/kbd/keymaps | wc -l
> 7
>

server1# ls -1 |wc -l
1061
server1# ls |wc -l
1061


>> or
>> $ ls -l |grep -v '^total' |wc -l
>> ?
>
> That produces a total of the blocks occupied by the files in the
> directory (but ignoring the sizes of files in any subdirectories
> present).

Ummm.. No. Nothing about blocks there. Just the number of files.
Or, to be more accurate, the number of directory entries.

> nd, as stated previously, "-a" should be used as well.

server1# ls -a | wc -l
1516
server1# ls -1a | wc -l
1516

> From 'info ls' on Linux, under the '-l' entry:
>
> For each directory that is listed, preface the files with a line
> `total BLOCKS', where BLOCKS is the total disk allocation for all
> files in that directory. The block size currently defaults to 1024
> bytes, but this can be overridden (*note Block size::). The BLOCKS
> computed counts each hard link separately; this is arguably a
> deficiency.

So then, I guess the Linux "ls" falls into that category of broken "ls"
implementations. :-)

>
> These are 1024 byte blocks on Scientific Linux but it depends which
> file system you are using :-)
>
> OS X which is BSD based has this under 'info ls' for output using '-l':
>
> In addition, for each directory whose contents are displayed, the
> total number of 512-byte blocks used by the files in the directory
> is displayed on a line by itself, immediately before the
> information for the files in the directory.
>
> I hope that clears up any confusion.
>

My "ls", from BSD, returns only a list of files unless you includ the "-l"
option. Then it provides "total 2338380" at the top of the listing.

Stephen Hoffman

unread,
Apr 13, 2015, 10:18:14 AM4/13/15
to
On 2015-04-13 01:31:48 +0000, David Froble said:

> JF Mezei wrote:
>
> Oh, boy, time to beat up on JF. Not sure where to start, there are so
> many options.
>
> :-)
>
>> On 15-04-12 17:27, Craig A. Berry wrote:
>>
>>> If we'd had to make those kind of changes when we moved from Alpha to
>>> Itanium five years ago I can say with some confidence we'd still be on
>>> Alpha.
>>
>>
>> Found out that while "long" on VMS remained 32 bits (you need long-long
>> to get 64 bits), it moved to 64 bits on OS-X. So a program that was
>> reading raw data from a file have to have that huge change: modifying 3
>> items in a struct from "long" to "int". (after testing "sizeof" for
>> various OS-X definitions). Apple was able to make that move to 64 bits
>> in a couple of years. VMS seems to still have VAX 32 bit artifacts 2
>> decades after the switch.
>
> What's the real issue here? Just where are 64 bit integers needed, and
> not already available.

What I suspect JF is referring to here (and surprisingly poorly, given
he knows rather more about this area) is that OS X migrated from native
32-bit to native 64-bit, allowing 32-bit apps to work in the 64-bit
environment, but OS X does not try to mix the two within the same
application.

Contrast this with OpenVMS V7 and later, which uses a design that's
both 32- and 64-bit, depending on your settings and depending on what
services you call. And within C on OpenVMS, you get tagged by stuff
such as the ptrdiff_t — what's supposed to be the difference between
two pointers — being 32-bit, and that just doesn't work with 64-bit
pointers.

OpenVMS probably made the 64-bit migration somewhat easier in some ways
— as you could graft on some 64-bit into a 32-bit app, without
rebuilding the whole app as 64-bit — at the cost of a more complex
application environment, and with a 64-bit environment that's
comparatively awkward to work with.

Contrast this with OS X, which went from flat 32-bit virtual to 64-bit
virtual. OS X didn't try to allow both in the same applications.
This means that OpenVMS went to a segmented 64-bit environment where
you have to program both 32-bit and 64-bit addressing, and where you
have some services that are 32- and 64-bit and some that have 64()
variants, and where you have to keep track of where you're storing
(big) stuff.

As I've commented occasionally, this is an example of shorter-term and
upward-compatible thinking, versus longer-term and more future-friendly
thinking. This compatibility stuff isn't an easy call to make, either
— if you go full-on compatible, there are cases where your designs get
deeper into quagmires, and for little or even negative long-term
benefits.

> Some or all of the languages provide this data type.

That's quite true within many of the languages in the OpenVMS
environment. In other environments, the native integer size is the
native size of the variables.

> Now if you're thinking of large disks, making internal changes for this
> should not affect any, or very little, applications.

I've posted some stuff that'll break. Most commonly, stuff that looks
at file sizes and disk sizes. There are other areas.

> Longwords are still adequate for many tasks.
>
> Words are still adequate for many tasks.
>
> Bytes are still adequate for many tasks.

All true.
Do watch out for threading with short and packed-in variables, and
don't get what's efficient for your on-disk formats mixed up with what's
efficient for your in-memory formats and processing, FWIW.
I usually prefer to avoid bitfields and packing in any bitflags, due to
the overhead that's usually involved.
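
A small sketch of the trade-off being described (generic C, nothing
VMS-specific):

    #include <stdint.h>

    /* Packed bitfields: compact, but every update is a read-modify-write
     * of the containing unit, which is awkward with threads and usually
     * costs extra masking/shifting instructions. */
    struct packed_flags {
        unsigned int open     : 1;
        unsigned int dirty    : 1;
        unsigned int readonly : 1;
    };

    /* Explicit flag bits in a full-width word: a little more memory, but
     * simple masks, and easier to reason about for concurrent access. */
    #define FLAG_OPEN      0x1u
    #define FLAG_DIRTY     0x2u
    #define FLAG_READONLY  0x4u

    static uint32_t flags;

    void mark_dirty(void) { flags |= FLAG_DIRTY; }
    int  is_dirty(void)   { return (flags & FLAG_DIRTY) != 0; }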

> So, where did this demand for 64 bit integers come from ???

Donno. Depends on the application.

By default within the C environment, OS X uses 8 bytes for long long, 8
for long, and 4 for int. OpenVMS uses 8, 4, and 4, respectively. All
are eight-bit bytes. Both approaches are permissible under the C
standard, and why folks are increasingly using int32_t and uint32_t and
related baggage, when there's a requirement for a specific allocation,
such as when you're dealing with marshalling and unmarshalling the
contents of RMS records.
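
A minimal unmarshalling sketch of the sort of thing meant here; the
record layout and field offset are hypothetical, not an actual RMS
structure:

    #include <stdint.h>

    /* Pull a little-endian 32-bit counter out of a record buffer without
     * caring how wide the host's "long" happens to be. */
    static uint32_t get_u32_le(const unsigned char *p)
    {
        return (uint32_t) p[0]
             | ((uint32_t) p[1] << 8)
             | ((uint32_t) p[2] << 16)
             | ((uint32_t) p[3] << 24);
    }

    uint32_t record_count_from(const unsigned char *record)
    {
        return get_u32_le(record + 4);   /* hypothetical offset of the field */
    }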

For me, it's more the memory model and the addressing and the
complexity that's involved with 64-bit on OpenVMS. Not so much the
specific variables. Except where I'm at an OS limit, such as the
current 32-bit limit for disk addressing, of course.

>
>> Sorry, but with the realities of hardware moving to sizes that handle
>> beyond 4GB sizes, there needs to be changes to programs. The days of
>> the RA82 disks at 650MB are over.
>
> Well, yeah, so what? Two terabyte disks are much larger than that
> RA82, and can be used. If there are users who need more on one
> spindle, they are few and far between.

If you're aiming OpenVMS at the mid- and high-end of computing, then
gibibit-scale memory and terabyte-scale databases aren't that far up
into the high-end. That scale is not really even in the high-end
these days. Six terabyte disk spindles are now available, and larger
synthetic disks are easily possible and with increasingly low-end
controllers. (One series of low-end storage box that I'm dealing with
does 32 TB, in a box about the size of a small AlphaStation. But I
digress.)

Going to big disks might not even mean big files, it might mean not
having to deal with partitioning your data and your files and your
directories across smaller spindles, much like a flat 64-bit address
space means not having to partition your in-memory data structures
between P0, P1 and P2 space. For applications that might involve more
than 1 GiB of in-memory data — that's one gibibyte, the limit of P0
space — I'd generally put everything into P2 space so I don't have to
deal with that detail of the memory segmentation again, though that
also means sorting out which services support both, and which services
require migration to the 64() variants, and migrating from 32-bit to
64-bit data structures.

FWIW, I'm increasingly mapping whole files into memory for some
applications, because the code is just vastly easier and clearer and
faster than reading that in and processing that via the punched card
interface; via what RMS records emulate. Going years back, this is
like the difference between PIO to some devices, and DMA to others.
Dealing with bigger hunks is just faster, and with lower overhead.
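
A minimal sketch of that style, assuming a CRTL (or other POSIX-ish
environment) that provides mmap; the native VMS route would be
SYS$CRMPSC and friends, which is not shown here:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Map the whole file and walk it as one flat buffer; no
     * record-at-a-time loop, no intermediate copies. */
    long count_newlines(const char *path)
    {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return -1;

        struct stat sb;
        if (fstat(fd, &sb) < 0) { close(fd); return -1; }

        char *base = mmap(NULL, (size_t) sb.st_size, PROT_READ,
                          MAP_PRIVATE, fd, 0);
        if (base == MAP_FAILED) { close(fd); return -1; }

        long lines = 0;
        for (off_t i = 0; i < sb.st_size; i++)
            if (base[i] == '\n')
                lines++;

        munmap(base, (size_t) sb.st_size);
        close(fd);
        return lines;
    }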

> As for using larger disks, if a user doesn't need to see a very large
> disk, then in some SANs the storage can be offered to VMS in sizes that
> can be used, regardless of the physical size of the disks.
>
>> If you don't want to recompile your code and adapt any hacks you did at
>> low levels to use 64 bits values, I suggest you run your code in
>> emulation that will support the old 32 bit structures.
>
> I suggest you .... wait, this is a family oriented venue, can't say that ....
>
> And really, haw many applications would have done that?

Using more than a gibibyte? Quite a few. Usually some big data
structure or two, or just mapping in a big data file.


>> If your code is in maintenance mode, then you can leave it on
>> IA64/Alphga or on emulation on x86. But you shouldn't prevent those
>> customers who are actively developping from building software on a
>> modern OS that has finally shed its 1980s limitations.
>
> Why would existing code no longer work? Are you suggesting doing away
> with all integer data types smaller than 64 bits?

I suspect he's aiming to go to a flat address space, and to try to
rationalize the interfaces for native 64-bit work.

You're entirely correct — many applications don't need 64-bit
addressing. Yet. Those that do need 64-bit are only going to get more
common, which means more and more folks will be working with 64-bit.

>
>> I say it is time for VMS to catch up. If they want to provide 32bit
>> emulation for upward software compatibility, that is fine. But native
>> VMS needs to evolve. And the jump to x86 is the best time to make that
>> evolution and make long term stability for those who have moved to x86
>> since they will already go through conversion once and then be stable.
>
> You still haven't given one reason to stop using smaller data types.
> And as I've mentioned, most or all VMS languages already support 64 bit
> integers. So just what is it that you think should happen?

I don't think this is really data types.

>
>> If you go through the x86 port, and a couple years later, you need to
>> upgrade software to support 64 bits, then you have more instability/trouble.
>
> If you need 64 bits, you're already using it.

BASIC can't use 64-bit addressing. Which means you can't do stuff like
mapping in big blocks of data in one go, where "big" is somewhat less
than a gibibyte. Mapping in great wads of data is a different approach
than what OpenVMS programs use, certainly. But whether it's easier to
code and maintain, or it's faster for SIMD work, it's becoming
increasingly common.

>> Consider that VAX to ALpha also involved converting from VAX-C to DEC-C
>> which is a harder conversion to go through.
>
> I'd have to go look, since I don't use, or like, C. But I do believe
> DEC C existed before the Alpha did. If so, and it wasn't running on
> VAX, then what was it running on? I also believe the DEC C compiler
> allows you to continue to use code written for VAX C. So, where was
> the conversion?

DEC C started out on VAX, and was the upgrade path from V3.2 to V4.0.
More than a few folks still haven't gone through VAX C to DEC C, for
that matter. There was some requirement for the preservation of
latent bugs or some such.

> So far, most of what I've heard is that addressing disks larger than 2
> TB is a problem.

It is. Akin to the six-byte to ten-byte SCSI command mess back in
mid-era VAX days. The mess is just in the OS and the applications this
time, and not in the console.

> Why would that somehow morph into "everything must be 64 bits" ?

I suspect because it's one aspect of a (pun intended) much larger
problem: of reaching and exceeding limits in existing OpenVMS
interfaces, and of existing OpenVMS and application code, and of the
architectural approach taken when resolving these limits.

> Yes, there will be some places where 32 bit integers are used, and will
> need to be 64 bit integers. But the places are limited, can be known,
> and having dual capabilities isn't such a hard thing, is it? Someone
> has already suggested implementing additional library / system service
> routines for the 64 bit stuff, leaving the 32 bit stuff alone, thus not
> breaking nay code. Then if you really had to use the 64 bit routines,
> they'd also be there. Or the people at VSI might have some better
> ideas.

Having been using 64-bit on OS X and iOS — yes, some iPhone and iPad
devices use 64-bit — the flat address space is simple to use, and the
migration is generally pretty easy. The OpenVMS 64-bit environment is
more complex, and the migration is more difficult, and — because of the
segmented addressing and various arcana — the resulting code can be
more complex. Again, trade-offs.

Compatibility eventually gets you constrained, and you can either
continue with more complex approaches, or you can reset your designs —
the latter reset is most certainly more disruptive to the existing
applications during the migration, but reduced complexity usually has a
better long-term outcome. BASIC doesn't do 64-bit addressing, so
you're limited to a gibibyte of code and data due to virtual addressing
limits. Had OpenVMS gone native 64-bit with a flat virtual address
space and a 64-bit BASIC compiler, you'd just keep allocating memory
and doing what you're currently doing. But OpenVMS 64-bit doesn't work
that way.

Again-again, there are always trade-offs.

Stephen Hoffman

unread,
Apr 13, 2015, 10:35:55 AM4/13/15
to
On 2015-04-13 11:05:30 +0000, johnwa...@yahoo.co.uk said:

> I have a vague memory that Solaris (and maybe even Tru64/OSF/1) had
> compile-time and link-time options that could be used to make an
> existing 32bit address space application (ie sizeof char* == 4) built
> and running on a 64bit OS (ie sizeof char* == 8) believe it was still
> in a 32bit world.
>
> Could that be done, technically, for VMS code? Whether it would be
> sensible to do so is a different discussion.

Resizing ints is comparatively easy, though code that does marshalling
and unmarshalling will need to be addressed.
Dealing with the currently-segmented virtual address space in a
similarly-transparent fashion isn't nearly as easy.

Jan-Erik Soderholm

unread,
Apr 13, 2015, 10:40:16 AM4/13/15
to
Jan-Erik Soderholm skrev den 2015-04-11 10:54:
> abrsvc skrev den 2015-04-11 08:05:
>> Try the same test without shadowing. I suspect that there will be a
>> more significant difference.
>
> I do not think shadowing does that a difference.
> But yes, I have one non-shadowned disk (same type)
> in the system, so I could rerun it there.
>
> I can also rerun the test on a more up to date system
> with storage in a SAN with better up to date performance.
>
>>
>> Please note: I don't think that there is a performance issue either. I
>> know that I have seen extended times with wildcard deletes.
>>
>> Another note that may be significant here. The performance issues Ihave
>> seen have been with V7.3-2 NOT any of the V8 variants. Perhaps the
>> "delay" is no longer a problem with V8 and up???
>
> Yes, that is also my point. The last time I can remember having some
> real problems with this, was some MicroVAX 3100/90 running 7.x.
>
> I cannot remember seeing this "problem" since V8.x come along. And
> I'm sure I have seen directories with 10-20.000 (maybe up to the
> 40-50.000 range) files now and then.
>
> Jan-Erik.
>
>>
>> Dan
>>
>

A few simple stats from a DS20e (666 MHz, VMS 8.4) with
faster SAN storage (using 2 Gb FC).

Create 10.000 files:

$ @crefiles
13-APR-2015 15:30:58
13-APR-2015 15:32:57
$

$ dir/tot *.dat
...
Total of 10000 files.

Delete of 10.000 files ($ delete *.DAT;*):

$ @delall
13-APR-2015 15:34:51
13-APR-2015 15:35:33
$

42 seconds runtime (or 238 files/sec).

Also tested a partial delete.
Deleting the first 480 files created (also the
"first" files in the .DIR) took 10 sec. The delete
as such took 2-3 secs, the rest was spent scanning
the other 9.520 files for the creation date.

So I would say that handling up to 10.000 files in
a single directory is no problem (on this system).


Also rerun the same test using 50.000 files:

Create 50.000 files:

$ @crefiles
13-APR-2015 15:37:20
13-APR-2015 15:45:01
$

(The .DIR file grew to a little over 5.500 blocks.)

$ dir/tot *.dat
...
Total of 50000 files.
$

Delete 50.000 files ($ delete *.DAT;*):

$ @delall
13-APR-2015 15:46:05
13-APR-2015 15:57:18
$

673 seconds runtime (or 74 files/sec avg).

So here it didn't scale linearly, at least, but it was
far from a total stall or halt. And that is maybe more
files than anyone would usually have in a directory. :-)

Another thing is that the filenames were not 100%
unique since more than 100 files were created per second.
708 files out of 10.000 in the first test were ;2.

Anyway, FWIW... :-)


Jan-Erik.


John Reagan

unread,
Apr 13, 2015, 10:56:53 AM4/13/15
to
On Monday, April 13, 2015 at 10:18:14 AM UTC-4, Stephen Hoffman wrote:

>
> Contrast this with OpenVMS V7 and later, which uses a design that's
> both 32- and 64-bit, depending on your settings and depending on what
> services you call. And within C on OpenVMS, you get tagged by stuff
> such as the ptrdiff_t -- what's supposed to be the difference between
> two pointers -- being 32-bit, and that just doesn't work with 64-bit
> pointers.

It certainly works with two 64-bit pointers that refer to addresses within
a single array whose size is limited by size_t which is 32-bits. Addresses that don't point into the same array are illegal C.


>BASIC doesn't do 64-bit addressing, so
> you're limited to a gibibyte of code and data due to virtual addressing
> limits. Had OpenVMS gone native 64-bit with a flat virtual address
> space and a 64-bit BASIC compiler, you'd just keep allocating memory
> and doing what you're currently doing. But OpenVMS 64-bit doesn't work
> that way.

You can certainly put BASIC-generated code into P2 or S2 space. There are certainly assumptions scattered around about data sizes however with 32-bit size limits (much like C's size_t). You really are dealing with more of a linker/imgact issue with code getting loading into P0 space by default. All you really need is the procedure descriptors/function descriptors allocated in a 32-bit address space so you can still point to them with old-school 32-bit pointers.

As mentioned prior, Tru64 shook out all the 32-bit pointers by simply not mapping that bottom 32-bits of address space unless you used TASO (and I know lots of code that used TASO, including the C compiler and GEM).

Michael Moroney

unread,
Apr 13, 2015, 11:51:30 AM4/13/15
to
For yucks, I created a directory with 400,000+ files in it (63,000+ block
.DIR file) to see how that performed. Except for a DELETE *.*;*,
performance seems reasonable, a $ DIR/TOTAL takes several seconds
(probably could be faster as another process has a $ DELETE *.*;*
underway), random creates work OK (couple of seconds), a
DIR/SIZE/TOTAL (which reads every header) 4 minutes. As the $ DIR/TOTAL
does very little IO it appears the whole directory file gets cached.

The $ DELETE from beginning to end will scale as O(n^2) due to the way
it's implemented, having to shuffle the whole 63K blocks every few files.
I'd expect random deletes to go quickly until it gets to the point where
each block is sparse, since it won't have to shuffle blocks until blocks
start becoming empty. Then it will slow down as most will start becoming
empty at the same time.
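
A back-of-the-envelope model of that quadratic behaviour (a toy
simulation, not the actual XQP algorithm):

    #include <stdio.h>

    /* Front-to-back deletion: every time the leading block empties, all
     * the blocks behind it get copied down by one.  Back-to-front only
     * ever trims the end-of-file pointer. */
    int main(void)
    {
        long blocks = 63000;                  /* roughly the .DIR size above */
        long long copies = 0;

        for (long b = blocks; b > 0; b--)     /* each block eventually empties */
            copies += b - 1;                  /* ...and the rest shuffle down  */

        printf("~%lld block copies for %ld directory blocks (front-to-back)\n",
               copies, blocks);
        printf("~0 block copies deleting back-to-front\n");
        return 0;
    }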

Stephen Hoffman

unread,
Apr 13, 2015, 11:54:01 AM4/13/15
to
On 2015-04-13 14:56:52 +0000, John Reagan said:

> On Monday, April 13, 2015 at 10:18:14 AM UTC-4, Stephen Hoffman wrote:
>
>>
>> Contrast this with OpenVMS V7 and later, which uses a design that's
>> both 32- and 64-bit, depending on your settings and depending on what
>> services you call. And within C on OpenVMS, you get tagged by stuff
>> such as the ptrdiff_t -- what's supposed to be the difference between
>> two pointers -- being 32-bit, and that just doesn't work with 64-bit
>> pointers.
>
> It certainly works with two 64-bit pointers that refer to addresses within
> a single array whose size is limited by size_t which is 32-bits.
> Addresses that don't point into the same array are illegal C.

True. Particularly if I have smaller wads of data, just a whole lot
of it. Though if I'm up in 64-bit space, there's also a decent shot
I'm working with wads of data that need that 33rd or 34th addressing
bit, and the need for several gibibytes of data is only going to become
that much more common going forward. Databases are already using
64-bit addressing (P2 and/or S2), and some different approaches to
programming than are typical on OpenVMS — mapping data and faulting it
all into virtual memory, for instance — certainly chews up more than a
little address space.

>> BASIC doesn't do 64-bit addressing, so you're limited to a gibibyte of
>> code and data due to virtual addressing> limits. Had OpenVMS gone
>> native 64-bit with a flat virtual address space and a 64-bit BASIC
>> compiler, you'd just keep allocating memory and doing what you're
>> currently doing. But OpenVMS 64-bit doesn't work that way.
>
> You can certainly put BASIC-generated code into P2 or S2 space.

Is locating code up into P2 now documented and supported? Ok. Neat.
Learned something today.

> There are certainly assumptions scattered around about data sizes
> however with 32-bit size limits (much like C's size_t). You really are
> dealing with more of a linker/imgact issue with code getting loading
> into P0 space by default. All you really need is the procedure
> descriptors/function descriptors allocated in a 32-bit address space so
> you can still point to them with old-school 32-bit pointers.

Ah, OK, so not code in P2. Nevermind. You're referencing P0 and P1
here; the low two gibibytes. Code size — for what I'm working on — is
less of a factor, and I'm not working on executables that are that
large. I don't have that much on the projects I'm dealing with, even
assuming the wasted space of not being able to unmap old code once it's
activated. It's the code and data in P0 that are increasingly pushing
the data up into P2 space. The same sort of limit that made the old
VAX S0 and S1 system space design untenable. Moving data into P1
never seemed to make all that much sense either, even if that still
fits into 32-bit addressing.

> As mentioned prior, Tru64 shook out all the 32-bit pointers by simply
> not mapping that bottom 32-bits of address space unless you used TASO
> (and I know lots of code that used TASO, including the C compiler and
> GEM).

Smart move, but then Tru64 Unix did have a clean and simple
implementation in many areas, not the least of which was 64-bit.

Bob Koehler

unread,
Apr 13, 2015, 12:49:19 PM4/13/15
to
Often

BACKUP/DELETE/NOCRC/NOGROUP *.*;* NLA0:/SAVESET

will outperform

DELETE *.*;*

Care to try it? (BACKUP knows about the inefficiency of ODS
directories).

JF Mezei

unread,
Apr 13, 2015, 12:51:40 PM4/13/15
to
Another aspect to consider with regards to migration to 64 bits.

When porting from other platforms where 64 bit has been implemented,
moving code to a 32 bit VMS can cause problems. a "long" which is 64
bits elsewhere is still 32 bits in VMS for instance.

While code *should* be written to be transparent to this, there are
cases where it is not.

If you move a pointer to an "int" in C, the compiler should issue
warnings. It may have worked in VAX days, but people should have stopped
assuming a pointer was 32 bits the day Alpha came out.

That is over 20 years ago folks.
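
A small example of the habit in question and the C99 escape hatch, for
anyone who still needs an integer view of a pointer:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        char buffer[16];
        char *p = buffer;

        /* The 20-year-old habit: with 64-bit pointers and a 32-bit int,
         * the compiler should warn, and the value can silently truncate.
         *     int bad = (int) p;
         * If an integer view of a pointer is genuinely needed, C99 offers
         * a type guaranteed to round-trip one: */
        intptr_t ok = (intptr_t) p;

        printf("pointer as integer: %lld\n", (long long) ok);
        return 0;
    }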

Bill Gunshannon

unread,
Apr 13, 2015, 1:04:53 PM4/13/15
to
In article <552bf41a$0$19799$c3e8da3$3388...@news.astraweb.com>,
I'll throw out my take on this. I actually went and looked at the
ANSI C Wiki, hoping, of course, that it accurately reflects the
standard. There are 5 non-floating point data types defined.
char, short, int, long and long long.  short and int are the same.
I see no reason for this. Based on the current state of the
technology they should be:
char : 8 bits
short : 16 bits
int : 32 bits
long : 64 bits
long long : 128 bits

And, I will also add that I think there should be an 8 bit data type
other than char. Byte comes to mind. :-) Maybe then we could start
getting people out of the habit of doing arithmetic with letters.
Except, of course, in algebra.

Now, where does one file petitions with the ANSI C Standards Committee?

John Reagan

unread,
Apr 13, 2015, 1:28:05 PM4/13/15
to
On Monday, April 13, 2015 at 11:54:01 AM UTC-4, Stephen Hoffman wrote:

> Ah, OK, so not code in P2. Nevermind. You're referencing P0 and P1
> here; the low two gibibytes. Code size -- for what I'm working on -- is
> less of a factor, and I'm not working on executables that are that
> large. I don't have that much on the projects I'm dealing with, even
> assuming the wasted space of not being able to unmap old code once it's
> activated. It's the code and data in P0 that are increasingly pushing
> the data up into P2 space. The same sort of limit that made the old
> VAX S0 and S1 system space design untenable. Moving data into P1
> never seemed to make all that much sense either, even if that still
> fits into 32-bit addressing.

No, you can indeed put code into P2. The procedure descriptor has a 64-bit code address. The generated code loads that address for the transfer of control. However, the "address" of a routine on Alpha and Itanium is not the address of the first instruction. It is the address the descriptor (allocated by the compiler & filled in by the linker on Alpha; allocated/filled in the linker on Itanium).

Now getting that code there initially is the trick. I don't think the linker/imgact has the ability for code like it does for those 64-bit Fortran COMMON blocks. You'd have to block move the code, flush icache, etc. And don't forget about the linkage section. That often occupies as much room as the actual instructions do. Moving it would mean fixing up those addresses in the code when you copy it. It ain't trivial and I'm probably missing something along the way. Maybe Hartmut can offer an opinion?

John Reagan

unread,
Apr 13, 2015, 1:31:38 PM4/13/15
to
On Monday, April 13, 2015 at 11:54:01 AM UTC-4, Stephen Hoffman wrote:
> On 2015-04-13 14:56:52 +0000, John Reagan said:

> >
> > It certainly works with two 64-bit pointers that refer to addresses within
> > a single array whose size is limited by size_t which is 32-bits.
> > Addresses that don't point into the same array are illegal C.
>
> True. Particularly if I have smaller wads of data, just a whole lot
> of it. Though if I'm up in 64-bit space, there's also a decent shot
> I'm working with wads of data that need that 33rd or 34th addressing
> bit, and the need for several gibibytes of data is only going to become
> that much more common going forward. Databases are already using
> 64-bit addressing (P2 and/or S2), and some different approaches to
> programming than are typical on OpenVMS -- mapping data and faulting it
> all into virtual memory, for instance -- certainly chews up more than a
> little address space.

No argument from me. A real LP64 model along with a larger size_t would indeed be welcome. Of course, as others have noted, that ripples into lots of things like system interfaces, FABs, RABs (I'm not sure if you can silently replace the FAB/RAB with FAB64/RAB64 or if the code would need changed...) However, code coming from outside the OpenVMS world certainly doesn't refer to FABs, RABs, or OpenVMS services. You pick the model you want/need. If it was easy, we'd have done it by now.


VAXman-

unread,
Apr 13, 2015, 2:01:37 PM4/13/15
to
In article <552bf41a$0$19799$c3e8da3$3388...@news.astraweb.com>, JF Mezei <jfmezei...@vaxination.ca> writes:
>Another aspect to consider with regards to migration to 64 bits.
>
>When porting from other platforms where 64 bit has been implemented,
>moving code to a 32 bit VMS can cause problems. a "long" which is 64
>bits elsewhere is still 32 bits in VMS for instance.
>
>While code *should* be written to be transparent to this, there are
>cases where it is not.
>
>If you move a pointer to an "int" in C, the compiler should issue
>warnings. It may have worked in VAX days, but people should have stopped
>assuming a pointer was 32 bits the day Alpha came out.
>
>That is over 20 years ago folks.

My experience is that I must intentionally cast a value if I try to use one
in ways that would modify a pointer.  Of course, there are compiler
qualifiers which will ask the compiler to shut up and stay out of your business.

--
VAXman- A Bored Certified VMS Kernel Mode Hacker VAXman(at)TMESIS(dot)ORG

I speak to machines with the voice of humanity.

Stephen Hoffman

unread,
Apr 13, 2015, 2:07:16 PM4/13/15
to
On 2015-04-13 17:04:51 +0000, Bill Gunshannon said:

> I'll throw out my take on this. I actually went and looked at the ANSI
> C Wiki, hoping, of course, that it accurately reflects the standard.
> There are 5 non-floating point data types defined.
> char, short, int, long and long long. short and int are tha same. I
> see no reason for this.

As stated, there are five signed integer types: char, short, int, long,
long long. Five unsigned types, as well.

Of these, "A 'plain' int object has the natural size suggested by the
architecture of the execution environment..."

In terms of the precision available with each of the five required
integer types (and there's a parallel requirement for unsigned integer
types), it's equivalent to this:

char <= short <= int <= long <= long long

The required minimal precision of the various types: char is at least
that of eight base-two bits. This is via the requirement to store
values in the range of -(2^7-1) to +(2^7-1), and 0 to 2^8-1 for
unsigned, or a larger range. short and unsigned short are required to
have at least the equivalent range provided by 16 base-two bits, or
larger. int and unsigned int at least that of 16 bits, long and
unsigned long at least that of 32 bits, or larger. And long long and
unsigned long long are required to have at least that range provided by
64 bits, or larger.

C99 and C11 have very similar (identical?) requirements here.
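
A quick way to see what a particular compiler actually provides on top
of those minimums (sizeof reports bytes; the standard's guarantees are
in bits):

    #include <limits.h>
    #include <stdio.h>

    int main(void)
    {
        printf("CHAR_BIT          = %d\n", CHAR_BIT);                     /* at least 8 */
        printf("sizeof(short)     = %u\n", (unsigned) sizeof(short));     /* >= 16 bits */
        printf("sizeof(int)       = %u\n", (unsigned) sizeof(int));       /* >= 16 bits */
        printf("sizeof(long)      = %u\n", (unsigned) sizeof(long));      /* >= 32 bits */
        printf("sizeof(long long) = %u\n", (unsigned) sizeof(long long)); /* >= 64 bits */
        return 0;
    }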

David Froble

unread,
Apr 13, 2015, 2:14:43 PM4/13/15
to
Yes, there perhaps will be some new things. But I have to ask, how many
will actually be impacted? Even if the system is using a 6 TB disk, how
many users will be using a single file larger than 2 TB? Perhaps some
won't care if at times they get a value they cannot handle.

While there may be some issues for some users, and my bet is the
percentage will be small, I don't consider it a "the sky is falling"
issue. Many will be able to continue as before, with no issues.

Now, you want to hear something scary? I've got a database product
still in use that works on 512 byte blocks. It's implemented in
Macro-32, which I haven't worked with recently. If I have to modify it
to work with 4096 byte blocks, well, it's going to be rather difficult.
No, VERY SCARY!

And yes, it uses Longwords for record indices ....

I believe it has a max of 16 million records in a file. Never came
close yet.

Stephen Hoffman

unread,
Apr 13, 2015, 2:16:35 PM4/13/15
to
Is what you're discussing here theoretical or technical, or is this now
considered supported? (Not that I've had to deal with activating or
debugging it recently, but I do know how to get blocks of code mapped
up there. By coincidence, I'm presently mapping wads of code as part
of a side project. But I digress.) But is executable code within P2
particularly documented and supported? When last I waded into this,
AFAIK it wasn't supported. For the "larger" applications I've worked
with, it was the combo of code and data that tended to blow out the
available P0 space. Which led to moving data into P2 or S2 space.

Stephen Hoffman

unread,
Apr 13, 2015, 2:29:25 PM4/13/15
to
On 2015-04-13 18:14:37 +0000, David Froble said:

> Yes, there perhaps will be some new things. But I have to ask, how
> many will actually be impacted? Even if the system is using a 6 TB
> disk, how many users will be using a single file larger than 2 TB?
> Perhaps some won't care if at times they get a value they cannot handle.

Have you turned on ALL in SET AUDIT for OpenVMS security auditing in a
big cluster? :-)

David Froble

unread,
Apr 13, 2015, 2:32:46 PM4/13/15
to
Actually, this topic is a bit distressing. While mainly using Basic,
and I don't want to think about Macro-32, I do use pointers. Using
LOC(), and in descriptors, and elsewhere. Having to find all instances
and modify for 64 bit addresses would be difficult.

What good is it to successfully cross the street, if the beer truck gets
you on the 100th attempt?

What good is 99% of the code unaffected, if you still need the other 1%
to work?

Two different topics, I think.

1) 64 bit integers for large disks
2) 64 bit address pointers everywhere

How much cost is it to retain 32 bit address pointers when it's adequate
for the job?

John Reagan

unread,
Apr 13, 2015, 2:45:35 PM4/13/15
to
I don't think it is written down in any manual. There are several things that would/could get in your way however the code generated by the compiler doesn't particularly care where it lives in memory.

JF Mezei

unread,
Apr 13, 2015, 3:17:14 PM4/13/15
to
On 15-04-13 14:32, David Froble wrote:

> and I don't want to think about Macro-32, I do use pointers. Using
> LOC(), and in descriptors, and elsewhere. Having to find all instances
> and modify for 64 bit addresses would be difficult.

The BASIC compiler/interpreter should be able to warn you that you are
attempting to move a 64 bit value (memory address) into a 32 bit
variable. So you fix those instances.

Perhaps the various compilers could have a "nag once for 32 to 64 bit
conversion" switch where the compiler gets pedantic with all pointer
stuff to help you with one time conversion to 64 bits.

In C, when you define your own descriptor structures, the pointers are
defined as pointers (char *dsc$_l_address as I recall) so that
automatically gets promoted to 64 bits. However, the length remains a
short. Whether that needs to change is another question. I suspect the
descriptor structure is opaque to you as a BASIC programmer.
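
For reference, a sketch of the classic 32-bit string descriptor layout;
the field names noted in the comments are from memory of <descrip.h>
(the pointer member is dsc$a_pointer), so treat this as illustrative
rather than authoritative:

    /* Shape of the classic fixed-length string descriptor; member names
     * here are placeholders, with the real dsc$ names in the comments. */
    struct example_descriptor_s {
        unsigned short length;   /* dsc$w_length: 16-bit byte count      */
        unsigned char  dtype;    /* dsc$b_dtype:  e.g. text              */
        unsigned char  dclass;   /* dsc$b_class:  e.g. static/fixed      */
        char          *pointer;  /* dsc$a_pointer: 32 bits in this form  */
    };

    /* The separate 64-bit descriptor form (dsc64$...) widens both the
     * length and the pointer to quadwords; it is a distinct structure,
     * not a promoted version of the one above. */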



Stephen Hoffman

unread,
Apr 13, 2015, 3:27:35 PM4/13/15
to
On 2015-04-13 18:32:40 +0000, David Froble said:

> Actually, this topic is a bit distressing. While mainly using Basic,
> and I don't want to think about Macro-32, I do use pointers. Using
> LOC(), and in descriptors, and elsewhere. Having to find all instances
> and modify for 64 bit addresses would be difficult.

With a migration to 64-bit that doesn't involve promoting the existing
data type declarations, you redeclare the variables as 64 bit as
needed, and use strict type checking within the compiler(s) to catch
the errors.

With a migration that does involve implicit data type promotions, you
get to look at what your compilers find, and also look at your
marshalling and unmarshalling code wherever your data crosses your
interfaces; file records, network packets, mailbox messages, etc. This
is more involved.

With an environment that can be selectively mixed 32- and 64-bit, you
get to use the compilers and to also figure out which pointers you're
using in a particular spot, and when crossing in and out.

> Two different topics, I think.
>
> 1) 64 bit integers for large disks
> 2) 64 bit address pointers everywhere

Within a Von Neumann design, those two cases are not really all that
different. :-)

> How much cost is it to retain 32 bit address pointers when it's
> adequate for the job?

OpenVMS does that all over the place right now; all of your 32-bit
pointers are sign-extended to 64-bit before they're actually used.
Additionally, if you fire up the debugger and have a look at the
details of many of the subroutine calls for instance, you'll see stuff
like data and addresses getting promoted, too.

There are some memory maps showing the old address space of P0, P1, S0
and S1, and a "newer" map of the OpenVMS Alpha V7.0 and later design
that has P0, P1 and the (much larger) P2 in the lower virtual address
range and S0, S1 and (much larger) S2 in the upper range. To get the
P0 and P1 and S0 and S1 32-bit addresses to work on 64-Alpha, they're
sign extended. Technically, there was a ginormous hole in the middle
of virtual address space prior to V7.0, though the sign extension
masked that the memory management used back then (also) couldn't really
address anything in that hole.

<http://h71000.www7.hp.com/doc/82final/5841/5841pro_035.html>

hb

unread,
Apr 13, 2015, 3:44:50 PM4/13/15
to
On I64 code in P2 is supported. I don't think the compilers have any
support for it. But you can tell the linker to move code into P2, the
image activator just does what the linker suggests. The linker switch is
/SEGMENT_ATTRIBUTE=CODE=P2. This forces code and unwind data into P2.

On Alpha, having code in P2 is not supported. You can move/copy the
code, but as already said, you have to make sure that all the code
addresses in the linkage section are updated. As far as I remember,
exception handling is also not guaranteed to work.

For data you need compiler support, the compilers have to set the
ALLOC_64BIT PSECT attribute. The linker will happily allocate P2
addresses and the image activator will map the data segment/section into
the designated region.
