CRTL and RMS vs SSIO


Greg Tinkler

Oct 5, 2021, 10:06:31 PM
I notice that SSIO (beta) is included in an upcoming V9.1 field test. So I read up on the issues it is trying to solve.

One concerning thing was having the CRTL (via SSIO) access XFC directly. From an architectural point of view this is wrong on so many levels, but if that is what needs to happen then open it up so RMS and other code bases can use it.

The main reason stated was the need to do byte offset/count I/Os. Well, let's solve that first: change RMS by adding SYS$READB and SYS$WRITEB. These would be useful to all code using RMS.
SYS$READB read from byte offset for count, return latest data from that byte range.
SYS$WRITEB write from byte offset for count, update latest copy of underlying blocks.

SYS$WRITEB needs to use the latest copy of the data, and could use the new SSIO interface to XFC, but RMS has its own methods for this.
It may seem like a big ask to get all the latest blocks, but if you think about it, it only needs to re-read the first and last blocks, and only if it does not already have the latest copy. No re-read is needed at all if the write starts at the beginning of a block and fills the last block.
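
The edge-block argument can be made concrete. The following is a minimal Python sketch, not RMS code: `blocks_to_refresh` is a hypothetical helper and the 512-byte block size is an assumption. Given a byte offset and count, it returns the blocks a write touches and the subset that is only partly overwritten, which are the only blocks whose current contents matter.

```python
BLOCK_SIZE = 512  # assumed disk block size in bytes

def blocks_to_refresh(offset, count, block_size=BLOCK_SIZE):
    """Return (first_block, last_block, partial_blocks) for a byte-range write.

    Blocks are numbered from 0 here. partial_blocks lists the blocks that the
    write only partly covers, so their existing bytes must be preserved.
    """
    if count <= 0:
        raise ValueError("count must be positive")
    end = offset + count                 # one past the last byte written
    first = offset // block_size
    last = (end - 1) // block_size
    partial = []
    if offset % block_size != 0:         # write starts mid-block
        partial.append(first)
    if end % block_size != 0 and last not in partial:  # write ends mid-block
        partial.append(last)
    return first, last, partial

print(blocks_to_refresh(100, 1000))   # (0, 2, [0, 2])
print(blocks_to_refresh(512, 1024))   # (1, 2, [])
```

A write that is block aligned at both ends yields an empty list, which matches the "no need" case described above.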

By having these as part of RMS we want to ensure the blocks/buffers are coordinated so any other user of RMS will see the changes, and we get their changes.

This seems to be at the core of the CRTL issue: it does NOT use RMS, nor does it synchronize its blocks/buffers, leading to the lost update problem.

So with this ‘simple’ addition the CRTL should be altered to use RMS for all file IO.

An extra that could be added, if the file is RFM=fixed, and the C code uses it that way with the same record length then use the SYS$GET/SYS$PUT so it will play nicely with an RMS access to those files.

Anyway, just my 2 cents' worth.

gt down under

Stephen Hoffman

Oct 5, 2021, 11:09:17 PM
On 2021-10-06 02:06:29 +0000, Greg Tinkler said:

> I notice that SSIO (beta) is included in an upcoming V9.1 field test.
> So I read up on the issues it is trying to solve.
>
> One concerning thing was to have CRTL (via SSIO) access directly to
> XFC. From an architectural point of view this is wrong at so many
> levels,...

Off the top, some of the various existing stuff that breaks layering on
OpenVMS includes HBVS volume shadowing, MOUNT, and byte-range locking.

IP as a layered product is broken layering.

The C select() call is a fine mess of mis-layering.

The XQP design is mis-layering.

There are other examples.

There are examples of breaking layering to advantage, such as ZFS
else-platform.

All discussions of layering and esthetics aside, I presume the primary
purpose of the SSIO project is to permit porting PostgreSQL to OpenVMS,
posthaste.



--
Pure Personal Opinion | HoffmanLabs LLC

Greg Tinkler

Oct 5, 2021, 11:32:51 PM
On Wednesday, 6 October 2021 at 2:09:17 pm UTC+11, Stephen Hoffman wrote:

> Off the top, some of the various existing stuff that breaks layering on
> OpenVMS includes HBVS volume shadowing, MOUNT, and byte-range locking.
>
> IP as a layered product is broken layering.
>
> The C select() call is a fine mess of mis-layering.
>
> The XQP design is mis-layering.
>
> There are other examples.
>
> There are examples of breaking layering to advantage, such as ZFS
> else-platform.
>
> All discussions of layering and esthetics aside, I presume the primary
> purpose of the SSIO project is to permit porting PostgreSQL to OpenVMS,
> posthaste.

Yup, exactly, hence get CRTL to use RMS which does work.

Re byte range locking, why not just use locking granularity (as Rdb does) to do the job? It is very efficient, has worked for decades, and needs no change to the VMS DLM. Sure, it may be nice to have an API that does this for us, but hey, we are programmers.

gt

Craig A. Berry

Oct 6, 2021, 8:40:13 AM

On 10/5/21 9:06 PM, Greg Tinkler wrote:

> An extra that could be added, if the file is RFM=fixed, and the C
> code uses it that way with the same record length then use the
> SYS$GET/SYS$PUT so it will play nicely with an RMS access to those files.

I don't know the degree to which the current plan corresponds to the
original plan from a decade or so ago, but back then only stream files
were going to be supported by SSIO, which makes sense since the whole
point is locking byte ranges.

Arne Vajhøj

Oct 6, 2021, 9:01:24 AM
On 10/5/2021 10:06 PM, Greg Tinkler wrote:
> I notice that SSIO (beta) is included in an upcoming V9.1 field
> test. So I read up on the issues it is trying to solve.
>
> One concerning thing was to have CRTL (via SSIO) access directly to
> XFC. From an architectural point of view this is wrong at so many
> levels, but if that is what needs to happen then open it up so RMS
> and other code bases can use it.
>
> The main reason stated was the need to do byte offset/count I/Os.
> Well, let's solve that first: change RMS by adding SYS$READB and
> SYS$WRITEB. These would be useful to all code using RMS. SYS$READB
> read from byte offset for count, return latest data from that byte
> range. SYS$WRITEB write from byte offset for count, update latest
> copy of underlying blocks.

> By having these as part of RMS we want to ensure the blocks/buffers
> are coordinated so any other user of RMS will see the changes, and we
> get their changes.
>
> This seems to be at the core of the CRTL issue, it does NOT use RMS,
> nor does it synchronize its blocks/buffers, leading to the lost
> update problem.
>
> So with this ‘simple’ addition the CRTL should be altered to use RMS
> for all file IO.
>
> An extra that could be added, if the file is RFM=fixed, and the C
> code uses it that way with the same record length then use the
> SYS$GET/SYS$PUT so it will play nicely with an RMS access to those
> files.

To be honest then I think the safest way to implement this is
to put lots of restrictions on when it is doable.

Examples:
* No cluster support (announcement already states that!)
* Only FIX 512, STMLF and UDF are supported
* no mixing with traditional RMS calls

Some applications coming over from *nix, most notably PostgreSQL, need
this. But trying to cover all types of cases would be a lot of
work.

Arne

David Jones

Oct 6, 2021, 9:18:57 AM
Open source software ports often come with the restriction that they only work
with stream-LF files. Maybe they should add a flag to directory files that, if set,
only allows the directory to contain stream-LF or directory files.

I keep a stmlf.fdl file in my login directory to use for copying (i.e. convert/fdl=...)
text files to NFS shares.

Dave Froble

Oct 6, 2021, 9:48:14 AM
It has been my impression that for quite some time at HP, work on
specific requests tended to be very specific to that request, and failed
to consider capabilities as general to VMS.

The approach to SSIO appears to be an example of this. Basically, do
the least required to achieve the specific result. In the case of SSIO
the result appears to be rather useless, at least so far.

For some years I've advocated a more general enhancement to the VMS DLM,
specifically, numeric range locking. Such would address a basic issue
I've had with the VMS DLM for a rather long time.

I've a database product, a rather old product. At the time it was
implemented it was rather useful. But there was a locking issue. The
DLM locks resource names. The database would support I/O transfers of 1
to 127 disk blocks. How would one lock 127 contiguous disk blocks? The
blunt force method could be taking out 127 locks, not an optimum
solution. Having numeric range locking back in 1984 would have been
quite useful.

I've also suggested in the past that a simple enhancement to the DLM,
specifically the addition of a "type of lock" with the capability of
adding logic for specific "types" would solve the locking part of SSIO
and do so as a part of VMS, not as part of the CRTL.

As for byte range I/O, I'm not sure what is and isn't possible with disk
drives. It has been my impression that only whole block transfers are
possible. Perhaps I've been wrong. Perhaps SSDs have more flexibility.

Not really an issue for me anymore.

--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: da...@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486

Arne Vajhøj

Oct 6, 2021, 10:37:13 AM
On 10/6/2021 9:45 AM, Dave Froble wrote:
> On 10/6/2021 8:40 AM, Craig A. Berry wrote:
>> On 10/5/21 9:06 PM, Greg Tinkler wrote:
>>> An extra that could be added, if the file is RFM=fixed, and the C
>>> code  uses it that way with the same record length then use the
>>> SYS$GET/SYS$PUT so it will play nicely with an RMS access to those
>>> files.
>>
>> I don't know the degree to which the current plan corresponds to the
>> original plan from a decade or so ago, but back then only stream files
>> were going to be supported by SSIO, which makes sense since the whole
>> point is locking byte ranges.
>
> It has been my impression that for quite some time at HP, work on
> specific requests tended to be very specific to that request, and failed
> to consider capabilities as general to VMS.
>
> The approach to SSIO appears to be an example of this.  Basically, do
> the least required to achieve the specific result.  In the case of SSIO
> the result appears to be rather useless, at least so far.

General is better than specific.

When not considering resources.

My impression is that VSI engineering resources are very limited - and
several orders of magnitude smaller than DEC 40 years ago.

So when they have the choice of solving something 80% for 200 hours of
effort or 100% for 1000 hours of effort then ...

> For some years I've advocated a more general enhancement to the VMS DLM,
> specifically, numeric range locking.  Such would address a basic issue
> I've had with the VMS DLM for a rather long time.

> I've also suggested in the past that a simple enhancement to the DLM,
> specifically the addition of a "type of lock" with the capability of
> adding logic for specific "types" would solve the locking part of SSIO
> and do so as a part of VMS, not as part of the CRTL.

That would make sense to me.

But I do not count.

> As for byte range I/O, I'm not sure what is and isn't possible with disk
> drives.  It has been my impression that only whole block transfers are
> possible.  Perhaps I've been wrong.  Perhaps SSDs have more flexibility.

No matter what the disk can do then the VMS file system is still
block oriented and I believe the system services take block offsets
not byte offsets.

Arne


Stephen Hoffman

Oct 6, 2021, 11:09:22 AM
On 2021-10-06 03:32:50 +0000, Greg Tinkler said:

> Yup, exactly, hence get CRTL to use RMS which does work.

For this case, RMS really doesn't work at all well. Says why right
there in the name, too. Record management, not stream management.

C and IP have both been tussling with mismatched assumptions within the
OpenVMS file system since the instantiation of C on OpenVMS, too.

Lately, I've been tussling with the record-oriented assumptions within
OpenVMS. Records just never got as far along as objects. And RMS
records are an unmitigated joy around upgrades and mixed-version
clusters.

The various stream-format files are one of the ensuing compromises here.

> Re byte range locking, why not just use locking granularity (aka Rdb)
> to do the job. Very efficient and has worked for decades, and no need
> to change VMS DLM.

The use of Oracle Rdb isn't viable as a dependency for many folks, and
lock granularity doesn't work at all well for arbitrary and overlapping
locking ranges.

> Sure it may be nice to have an API that does this for us, but hey we
> are programmers.

I don't want us each writing and debugging and maintaining
range-locking code for what is part of the C standard library, but you
do you.

As much as I'd like a general range-locking solution here in DLM, and
with adding (better?) stream I/O support into RMS, and as much as I'd
like to see OO API support added, and IP integration, and app and app
security integration with sandboxes, packaging, and package management,
and a whole pile of other badly-needed work, I'd infer that the folks
at VSI really want PostgreSQL as an available database option soonest.

There's a very long history of "can-kicking" here and a whole lot of
that is almost inherent and inevitable with the upward-compatibility
goals for the platform, and with resulting miasma far less visible to
those of us that have used OpenVMS for the past decade or three or
more, but is front and center with any new developer looking at the
APIs, and with any wholly new 64-bit app work.

John Dallman

Oct 6, 2021, 3:04:43 PM
In article <4e4ac776-f43f-4a84...@googlegroups.com>,
osuv...@gmail.com (David Jones) wrote:

> Open source software ports often come with the restriction that they
> only work with stream-LF files. Maybe they should add a flag to
> directory files that if set only allows it to contain stream-LF
> or directory files.

People used to UNIX or Windows generally find the other VMS file types
baffling and confusing. I got used to the idea, but never made use of
them, since my employers already had fewer customers on VMS than they did
UNIX when I joined, and the disparity only increased.

John

Greg Tinkler

Oct 6, 2021, 9:25:59 PM
What a good conversation, some feedback.
>To be honest then I think the safest way to implement this is
>to put lots of restrictions on when it is doable.
>
>Examples:
>* No cluster support (announcement already states that!)
>* Only FIX 512, STMLF and UDF are supported
>* no mixing with traditional RMS calls

My point is SSIO seems to be focused on just PostgreSQL, whereas an RMS solution is much much easier to program, uses well tested code, and is already cluster ready putting the team ahead of the game and not building issues for the future.

>I've a database product, a rather old product. At the time it was
>implemented it was rather useful. But there was a locking issue. The
>DLM locks resource names. The database would support I/O transfers of 1
>to 127 disk blocks. How would one lock 127 contiguous disk blocks? The
>blunt force method could be taking out 127 locks, not an optimum
>solution. Having numeric range locking back in 1984 would have been
>quite useful.

Yup, DLM uses resource names, but they can be hierarchical, like a B-Tree index. Also the resources need only exist when needed, and are removed if not. The lock tree size depends on the lock contention.

This is why I made reference to Rdb: it uses this technique, and they are probably not the only ones. NB each level controls a range of resources and each level can have its own fan-out factor. The depth and lowest level always depend on the application's requirements.
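
To illustrate the hierarchical idea, here is a minimal Python sketch. It is a generic multi-granularity decomposition, not a description of Rdb's actual scheme and not DLM calls: `cover_range`, `resource_name`, the fan-out of 16 and the name format are all assumptions. It only computes which tree nodes would cover a block range; a real implementation would also need intent-mode locks on the ancestors of each node so that coarse and fine locks can see each other.

```python
def cover_range(lo, hi, fanout=16, max_level=2):
    """Cover blocks lo..hi-1 with aligned tree nodes, largest first.

    Returns (level, index) pairs; a node at a given level spans
    fanout**level consecutive blocks.
    """
    nodes = []
    pos = lo
    while pos < hi:
        level = max_level
        # Step down until the node is aligned at pos and fits inside the range.
        while level > 0 and (pos % fanout**level != 0 or pos + fanout**level > hi):
            level -= 1
        nodes.append((level, pos // fanout**level))
        pos += fanout**level
    return nodes

def resource_name(file_id, level, index):
    # Hypothetical naming scheme; a real lock tree would pick its own format.
    return f"{file_id}/L{level}/{index}"

nodes = cover_range(0, 127)
print(len(nodes))                          # 22
print(resource_name("FILE1", *nodes[0]))   # FILE1/L1/0
```

For the 127-block transfer mentioned in the quoted post, this covers the range with 7 nodes of 16 blocks plus 15 single blocks, so 22 resources rather than 127.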

FYI I am pretty sure RMS uses RFA to lock a record, this is an implied range of 1 record.

>No matter what the disk can do then the VMS file system is still
>block oriented and I believe the system services take block offsets
>not byte offsets.
All disks are block based, even on Unix. With some SSDs, yes, you can do byte transfers, but this should be left to the driver to optimise. Also with X86_64 it will be virtualised so what the..

>For this case, RMS really doesn't work at all well. Says why right
>there in the name, too. Record management, not stream management.

Well yes and no. If you think about it most Unix text IO is record, ie LF terminated, and binary is fixed records not necessarily the same length in the file.

RMS for $GET and $PUT are record based, but $READ and $WRITE are block based, missing is $READB and $WRITEB, not just for CRTL but useful for various applications.

RMS ISAM with fixed length records is a pain, I have long argued ISAM should support variable length records, don’t care if they are VFC or STMLF, I would allow for both as VFC could allow for binary variable length records.

Likewise the keys on an ISAM file should be able to be variable based on a separator e.g “,” or <tab> or a combination.

>The use of Oracle Rdb isn't viable as a dependency for many folks, and
>lock granularity doesn't work at all well for arbitrary and overlapping
>locking ranges.
I think you will find a B-Tree style dynamic resource tree, similar to what Rdb uses, will work well. Any ‘byte range’ implementation will need some index to find interesting locks; DLM uses a hash, which is as efficient as you can get.

>> Sure it may be nice to have an API that does this for us, but hey we
>> are programmers.

>I don't want us each writing and debugging and maintaining
>range-locking code for what is part of the C standard library, but you
>do you.
NO, quite the opposite. I believe there is a POSIX standard for a locking API, and as VMS, sorry OpenVMS, wishes to maintain its POSIX stamp it should use these APIs with DLM underneath. NB DLM is also already cluster based, but you know that.
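
For reference, the POSIX mechanism is advisory byte-range locking through `fcntl()`. The sketch below shows it from Python on a Unix-like system, where `fcntl.lockf` wraps those calls; the temporary file and the byte range are arbitrary choices for illustration, and nothing here says how OpenVMS would map such a request onto the DLM.

```python
import fcntl
import os
import tempfile

fd, path = tempfile.mkstemp()
os.write(fd, b"x" * 1024)

# Exclusive, non-blocking lock on 100 bytes starting at byte offset 100.
fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 100, 100, os.SEEK_SET)
locked = True

# Release the same range, then clean up.
fcntl.lockf(fd, fcntl.LOCK_UN, 100, 100, os.SEEK_SET)
os.close(fd)
os.remove(path)
```

These locks belong to the process, so a second request from the same process does not conflict with the first; contention only shows up between processes.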

>People used to UNIX or Windows generally find the other VMS file types
>baffling and confusing.

I always wondered why the CRTL did not have some smarts to present VFC records as STMLF and vice versa, effectively hiding the internal record structures. This could be done via open using the VMS extension “rfm=STMLF”, which should be the default unless it is a binary file, “rfm=unf”. If the file is VFC then the CRTL could do the translation. Wishful thinking.

gt down under

Lawrence D’Oliveiro

Oct 6, 2021, 9:45:43 PM
On Thursday, October 7, 2021 at 2:18:57 AM UTC+13, osuv...@gmail.com wrote:

> Open source software ports often come with the restriction that they only work
> with stream-LF files.

I would say that’s partially true. Typically there are options to treat files as “text” or “binary”. A “binary” file is just a stream of arbitrary 8-bit bytes, which are supposed to be read or written without any imposition of record boundaries, sector-size rounding or special treatment of any byte values. A “text” file is assumed to be broken up into lines. It is true that LF is the traditional Unix line delimiter. But enlightened toolkits like Python are capable of reading text files in “universal newline” mode, so for example if you copy a text file created on MS-DOS (line delimiter = CR+LF, because CP/M did it that way, for no rational reason) in binary mode onto a Linux system, your Python text-processing script running on the latter can cope with it without a hiccup.
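
A small Python illustration of the "universal newline" behaviour described above. The file name is a temporary one created for the example; the only assumption is the default `newline=None` of `open()` in text mode.

```python
import os
import tempfile

# Write a file with MS-DOS style CR+LF line endings, byte for byte.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"first\r\nsecond\r\n")
    path = f.name

# Text mode with the default newline=None translates CR+LF to "\n" on input.
with open(path, "r", encoding="ascii") as f:
    text = f.read()

# Binary mode returns the bytes untouched.
with open(path, "rb") as f:
    raw = f.read()

os.remove(path)
print(repr(text))  # 'first\nsecond\n'
print(raw)         # b'first\r\nsecond\r\n'
```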

Arne Vajhøj

Oct 6, 2021, 9:48:31 PM
On 10/6/2021 9:25 PM, Greg Tinkler wrote:
> What a good conversation, some feedback.
>> To be honest then I think the safest way to implement this is
>> to put lots of restrictions on when it is doable.
>>
>> Examples:
>> * No cluster support (announcement already states that!)
>> * Only FIX 512, STMLF and UDF are supported
>> * no mixing with traditional RMS calls
>
> My point is SSIO seems to be focused on just PostgreSQL, whereas an
> RMS solution is much much easier to program, uses well tested code,
> and is already cluster ready putting the team ahead of the game and
> not building issues for the future.
I very much doubt that a full RMS solution is much easier.

:-)

>> For this case, RMS really doesn't work at all well. Says why right
>> there in the name, too. Record management, not stream management.
>
> Well yes and no. If you think about it most Unix text IO is record,
> ie LF terminated, and binary is fixed records not necessarily the
> same length in the file.
>
> RMS for $GET and $PUT are record based, but $READ and $WRITE are
> block based, missing is $READB and $WRITEB, not just for CRTL but
> useful for various applications.
>
> RMS ISAM with fixed length records is a pain, I have long argued ISAM
> should support variable length records, don’t care if they are VFC or
> STMLF, I would allow for both as VFC could allow for binary variable
> length records.
????

Index-sequential files and the RMS API support variable length.

Not all language APIs on top of RMS do.

>> The use of Oracle Rdb isn't viable as a dependency for many folks, and
>> lock granularity doesn't work at all well for arbitrary and overlapping
>> locking ranges.

> I think you will find a B-Tree style dynamic resource tree, similar to
> what Rdb uses, will work well. Any ‘byte range’ implementation will
> need some index to find interesting locks, DLM uses hash which is as
> efficient as you can get.
Hash is effective for finding exact matches but useless for finding
other matches aka "starting with". For those a tree is better.
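
The difference can be sketched in a few lines of Python. This is an illustration only: the held ranges are made up, and a sorted list searched with `bisect` stands in for a tree. Ordering by start lets the search discard every lock that begins at or after the end of the request; pruning from the other side as well would need a proper interval tree.

```python
import bisect

held = [(0, 100), (250, 300), (400, 1000)]   # (start, end) byte ranges, end exclusive

# Hash lookup: answers only "is exactly this range locked?"
by_name = {rng: "owner" for rng in held}
print((250, 300) in by_name)   # True
print((260, 270) in by_name)   # False, although it lies inside a held range

# Ordered lookup: sort by start so bisect can skip ranges that begin too late.
ordered = sorted(held)
starts = [s for s, _ in ordered]

def overlapping(start, end):
    """Return held ranges that overlap the half-open range [start, end)."""
    cut = bisect.bisect_left(starts, end)        # ranges from cut on start >= end
    return [(s, e) for s, e in ordered[:cut] if e > start]

print(overlapping(260, 270))   # [(250, 300)]
print(overlapping(100, 250))   # []
```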

Arne

Lawrence D’Oliveiro

Oct 6, 2021, 9:51:32 PM
On Thursday, October 7, 2021 at 8:04:43 AM UTC+13, John Dallman wrote:
> People used to UNIX or Windows generally find the other VMS file types
> baffling and confusing.

One question I never saw answered (because I never came across examples of files to check it) was whether in “VFC” files, the record count included the fixed header or not? And was that the same or different in the on-disk format versus the in-memory RMS structure with the “RSZ” (“RAB$W_RSZ”?) field?

By the way, I knew FORTRAN carriage control is now an anachronism, but I didn’t realize that it is now considered so obsolete, that compilers won’t support it any more.

Arne Vajhøj

Oct 6, 2021, 9:59:47 PM
On 10/6/2021 9:51 PM, Lawrence D’Oliveiro wrote:
> On Thursday, October 7, 2021 at 8:04:43 AM UTC+13, John Dallman wrote:
>> People used to UNIX or Windows generally find the other VMS file types
>> baffling and confusing.
>
> One question I never saw answered (because I never came across
> examples of files to check it) was whether in “VFC” files, the record
> count included the fixed header or not? And was that the same or
> different in the on-disk format versus the in-memory RMS structure
> with the “RSZ” (“RAB$W_RSZ”?) field?

Try it!

$ open/write z.z z.z
$ write z.z "ABC"
$ close z.z
$ dir/full z.z

Directory DISK2:[ARNE]

z.z;1 File ID: (5295,236,0)
...
Record format: VFC, 2 byte header, maximum 0 bytes, longest 3 bytes
...
$ dump z.z

Dump of file DISK2:[ARNE]z.z;1 on 6-OCT-2021 21:54:39.48
File ID (5295,236,0) End of file block 1 / Allocated 16

Virtual block number 1 (00000001), 512 (0200) bytes

00000000 00000000 00000000 00000000 00000000 0000FFFF 00434241 8D010005 ....ABC......................... 000000

Arne
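
Reading the dump: DUMP prints each row of hex longwords right to left, so in file order the block begins 05 00 01 8D 41 42 43 00 FF FF. The Python sketch below is an editorial illustration, not part of the original post. It decodes those bytes on the assumption that a 2-byte little-endian count precedes each record and that records are padded to an even length. Under that reading the on-disk count is 5, so it includes the 2-byte fixed header, while DIRECTORY/FULL reports the 3 data bytes as the longest record. The trailing FFFF appears to be an end marker. The dump says nothing about what RAB$W_RSZ holds in memory.

```python
# First ten bytes of the block, in file order, transcribed from the dump above.
block = bytes.fromhex("0500018d41424300ffff")

length = int.from_bytes(block[0:2], "little")   # record length word
header = block[2:4]                             # 2-byte fixed control area
data = block[2 + 2 : 2 + length]                # remainder of the record
# Assumed: records start on even offsets, so an odd length is followed by a pad byte.
next_record = 2 + length + (length % 2)
marker = int.from_bytes(block[next_record:next_record + 2], "little")

print(length)        # 5
print(data)          # b'ABC'
print(hex(marker))   # 0xffff
```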

Lawrence D’Oliveiro

Oct 6, 2021, 10:00:38 PM
On Thursday, October 7, 2021 at 2:25:59 PM UTC+13, tink...@gmail.com wrote:

> All disks are block based, even on Unix.

The difference being, on *nix systems, the responsibility for blocking and deblocking is left to the filesystem layer. So if a file is n bytes long, and n mod «sector size» ≠ 0, the application never sees what is in the padding bytes, if any.
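
A quick way to see this from the application side, sketched in Python on a temporary file: the filesystem reports the logical length and a read never returns anything beyond it, whatever the sector size underneath.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"hello")          # 5 bytes, far short of any sector size
    path = f.name

size = os.stat(path).st_size   # logical length kept by the filesystem
with open(path, "rb") as f:
    content = f.read(4096)     # ask for more than the file holds

os.remove(path)
print(size)      # 5
print(content)   # b'hello'
```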

Some filesystems even implement “tail packing”, which means the leftover bits of multiple files can share the same block, all transparently to the application, minimizing fragmentation.

By the way, Linus Torvalds did apparently use a VMS system at some point. (Must have been after his Sinclair QL days.) Guess what reason he gave, when asked why he hated it ...

> RMS ISAM with fixed length records is a pain, I have long argued ISAM should support
>variable length records ...

Given that nowadays an SQL-based RDBMS like SQLite can offer full support for transactions, joins and subqueries (missing only more multi-user-type features like locking and replication), and yet still be resource-light enough to fit in your mobile phone, I would say the time for application developers to be grubbing about in ISAM files is past.

Dave Froble

Oct 7, 2021, 12:12:58 AM
On 10/6/2021 9:25 PM, Greg Tinkler wrote:
> What a good conversation, some feedback.
>> To be honest then I think the safest way to implement this is
>> to put lots of restrictions on when it is doable.
>>
>> Examples:
>> * No cluster support (announcement already states that!)
>> * Only FIX 512, STMLF and UDF are supported
>> * no mixing with traditional RMS calls
>
> My point is SSIO seems to be focused on just PostgreSQL, whereas an RMS solution is much much easier to program, uses well tested code, and is already cluster ready putting the team ahead of the game and not building issues for the future.

RMS is a bit too high level for what's being discussed.

But yeah, the real issue is that SSIO was aimed (it seems) at
PostgreSQL. In my opinion, that is poor software architecture and design.

>> I've a database product, a rather old product. At the time it was
>> implemented it was rather useful. But there was a locking issue. The
>> DLM locks resource names. The database would support I/O transfers of 1
>> to 127 disk blocks. How would one lock 127 contiguous disk blocks? The
>> blunt force method could be taking out 127 locks, not an optimum
>> solution. Having numeric range locking back in 1984 would have been
>> quite useful.
>
> Yup, DLM uses resource names, but they can be hierarchical, like a B-Tree index. Also the resources need only exist when needed, and are removed if not. The lock tree size depends on the lock contention.

Well the perceived issue is what happens when taking out locks, and at
some point there is a conflict. Say needing 127 blocks locked, and the
conflict is on the last block. That means 126 locks to be released, and
perhaps try again.
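
The cost of the one-lock-per-block approach can be sketched as follows. This is a toy model in Python, with a dict standing in for the lock manager; `lock_blocks` and the owner names are hypothetical, and no DLM calls are involved.

```python
def lock_blocks(table, owner, first, count):
    """Try to lock `count` consecutive blocks one at a time in the dict `table`.

    Returns (ok, released): on a conflict every lock taken so far is dropped
    and `released` says how many that was.
    """
    taken = []
    for block in range(first, first + count):
        if block in table:               # someone else holds this block
            for b in taken:
                del table[b]
            return False, len(taken)
        table[block] = owner
        taken.append(block)
    return True, 0

table = {127: "other"}                   # another process holds the last block
print(lock_blocks(table, "me", 1, 127))  # (False, 126)
```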

In reality, the large I/O buffer capability is rarely used, and then
it's usually with exclusive file access, which precludes the need for
block locks, just the file lock. For random access, single block
locking and I/O is good. Larger I/O buffers are usually used for
sequential access, both read only, and updating.

> This is why I made reference to Rdb: it uses this technique, and they are probably not the only ones. NB each level controls a range of resources and each level can have its own fan-out factor. The depth and lowest level always depend on the application's requirements.
>
> FYI I am pretty sure RMS uses RFA to lock a record, this is an implied range of 1 record.

RMS has some interesting internals, basically below application usage.

Global buffers
Multiple buffers
Multi-block count

RMS can (I believe, it's been a long while) keep track of file usage,
and provide data from an RMS buffer to a user's buffer. No disk
activity required. Writes of course must go to disk. But even so, the
data can still be in the updated global buffers for use by multiple tasks.

>> No matter what the disk can do then the VMS file system is still
>> block oriented and I believe the system services take block offsets
>> not byte offsets.
> All disks are block based, even on Unix. With some SSDs, yes, you can do byte transfers, but this should be left to the driver to optimise. Also with X86_64 it will be virtualised so what the..

As long as storage is block oriented, then regardless of the numeric
range of bytes, all blocks encompassing the byte range will need to be
read, including locking, and written. This usually will include data
outside the byte range.

>> For this case, RMS really doesn't work at all well. Says why right
>> there in the name, too. Record management, not stream management.

Ayep. RMS is record based.

> Well yes and no. If you think about it most Unix text IO is record, ie LF terminated, and binary is fixed records not necessarily the same length in the file.
>
> RMS for $GET and $PUT are record based, but $READ and $WRITE are block based, missing is $READB and $WRITEB, not just for CRTL but useful for various applications.

Forget RMS, I/O would be at the QIO level.

> RMS ISAM with fixed length records is a pain, I have long argued ISAM should support variable length records, don’t care if they are VFC or STMLF, I would allow for both as VFC could allow for binary variable length records.

RMS keyed files can have variable record lengths.
RMS relative files require fixed length records. (if I remember correctly)
RMS sequential files can have variable record lengths.

> Likewise the keys on an ISAM file should be able to be variable based on a separator e.g “,” or <tab> or a combination.
>
>> The use of Oracle Rdb isn't viable as a dependency for many folks, and
>> lock granularity doesn't work at all well for arbitrary and overlapping
>> locking ranges.
> I think you will find a B-Tree style dynamic resource tree, similar to what Rdb uses, will work well. Any ‘byte range’ implementation will need some index to find interesting locks; DLM uses a hash, which is as efficient as you can get.
>
>>> Sure it may be nice to have an API that does this for us, but hey we
>>> are programmers.
>
>> I don't want us each writing and debugging and maintaining
>> range-locking code for what is part of the C standard library, but you
>> do you.
> NO, quite the opposite. I believe there is a POSIX standard for a locking API, and as VMS, sorry OpenVMS, wishes to maintain its POSIX stamp it should use these API’s using DLM underneath. NB DLM is also already cluster based, but you know that.
>
>> People used to UNIX or Windows generally find the other VMS file types
>> baffling and confusing.

That is because, without additional apps, Unix I/O is a stream of bytes.
There is no concept of records, such as that provided by RMS.

Frankly, (and yes, I'm biased), I find records reasonable, and a stream
of bytes baffling and confusing. Guess it's what one is used to.

> I always wondered why the CRTL did not have some smarts to present VFC records as STMLF and vice versa, effectively hiding the internal record structures. This could be done via open using the VMS extension “rfm=STMLF”, which should be the default unless it is a binary file, “rfm=unf”. If the file is VFC then the CRTL could do the translation. Wishful thinking.

I would suggest the use of "VMS" in the above, rather than "CRTL". That
is unless one considers the CRTL VMS ...

> gt down under

Lawrence D’Oliveiro

Oct 7, 2021, 3:54:53 AM
On Thursday, October 7, 2021 at 5:12:58 PM UTC+13, Dave Froble wrote:
> Frankly, (and yes, I'm biased), I find records reasonable, and a stream
> of bytes baffling and confusing. Guess it's what one is used to.

Trouble is, there are many binary file formats that do not map easily to a simple sequence of records (of whatever delimitation). Consider the IFF family of file formats, for example: these are built out of chunks, and certain chunk types can contain other chunks.

For another example, consider file formats like TIFF and TTF, where there is a directory that identifies the location and size of the various major pieces. Oh, and PDF comes under this as well.

And then there are text-based format families, like XML, JSON, YAML, TOML ...

Simon Clubley

Oct 7, 2021, 8:12:56 AM
That's because asking Unix/Windows people to learn about VMS records and
file structures is like asking a VMS person to learn about how to work
with records and files on z/OS using traditional z/OS methods.

It is something so very, very, different from what they are used to.

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.

Simon Clubley

Oct 7, 2021, 8:26:38 AM
On 2021-10-06, Greg Tinkler <tink...@gmail.com> wrote:
>
>>For this case, RMS really doesn't work at all well. Says why right
>>there in the name, too. Record management, not stream management.
>
> Well yes and no. If you think about it most Unix text IO is record, ie LF terminated, and binary is fixed records not necessarily the same length in the file.
>

How do you find byte 12,335,456 in a variable length RMS sequential file
without reading from the start of the file ?

That's why there are restrictions on RMS supported file formats in an
application in some cases.

>
> I always wondered why the CRTL did not have some smarts to present VFC
> records as STMLF and vice versa, effectively hiding the internal record
> structures. This could be done via open using the VMS extension "rfm=STMLF"
> which should be the default unless it is a binary file "rfm=unf". If the file
> is VFC then CRTL could do the translation. Wishful thinking.
>

This could not be the default. What if LF characters are part of the
existing data record itself ? You have just destroyed the meaning of
the file in that case.

Greg Tinkler

Oct 7, 2021, 8:42:41 AM
On Thursday, 7 October 2021 at 11:26:38 pm UTC+11, Simon Clubley wrote:

> How do you find byte 12,335,456 in a variable length RMS sequential file
> without reading from the start of the file ?
>
> That's why there are restrictions on RMS supported file formats in an
> application in some cases.

The same way it is done on Unix: calculate the block offset, go get it, and extract the byte. No difference, and nothing to do with the underlying format.

> > I always wondered why the CRTL did not have some smarts to present VFC
> > records as STMLF and vice versa, effectively hiding the internal record
> > structures. This could be done via open using the VMS extension "rfm=STMLF"
> > which should be the default unless it is a binary file "rfm=unf". If the file
> > is VFC then CRTL could do the translation. Wishful thinking.
> >
> This could not be the default. What if LF characters are part of the
> existing data record itself ? You have just destroyed the meaning of
> the file in that case.

Please read what I wrote: if the file has been opened "b" then don't; otherwise we need to assume it is STMLF. Yup, probably another logical to set the default, but I'm pretty sure that if you create a new file using CRTL with the defaults then it will be STMLF anyway.

gt

Craig A. Berry

Oct 7, 2021, 8:50:23 AM

On 10/6/21 11:10 PM, Dave Froble wrote:

> the real issue is that SSIO was aimed (it seems) at
> PostgreSQL.

And Apache, and Samba, and other things that have been explicitly
mentioned as having needed app-specific workarounds due to the absence
of shared stream I/O support. SSIO *is* the general-purpose solution
that you seem to be lamenting the lack of.

Greg Tinkler

Oct 7, 2021, 8:54:56 AM

> Well the perceived issue is what happens when taking out locks, and at
> some point there is a conflict. Say needing 127 blocks locked, and the
> conflict is on the last block. That means 126 locks to be released, and
> perhaps try again.
Maybe, maybe not. It depends on the locking fan-out factors for the differing levels. It is possible that only 1 lock is needed, maybe more; the worst case would be 127. NB there is also BLAST to assist with managing the lock promotions/demotions.


> As long as storage is block oriented, then regardless of the numeric
> range of bytes, all blocks encompassing the byte range will need to be
> read, including locking, and written. This usually will include data
> outside the byte range.

Yup, as is the case on Unix... let the drivers worry about how and why this is done, block or byte, whatever the IO device needs.
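
The block arithmetic being discussed here can be sketched in a few lines of C. This is only an illustration under stated assumptions: 512-byte disk blocks, 1-based virtual block numbers, and hypothetical names (`block_span`, `span_for_range`) that are not part of RMS, QIO, or the CRTL. It maps a byte range onto the blocks that contain it and flags the edge blocks that also hold bytes outside the range, which are the ones that would have to be read before being rewritten.

```c
#include <assert.h>

#define BLOCK_SIZE 512UL   /* assumed disk block size in bytes */

/* Hypothetical result type: the blocks covering a byte range. */
struct block_span {
    unsigned long first_vbn;   /* first block touched (1-based) */
    unsigned long last_vbn;    /* last block touched (1-based) */
    int preread_first;         /* 1 if the range starts mid-block */
    int preread_last;          /* 1 if the range ends mid-block */
};

/* Map a byte range (0-based offset, count) onto the blocks that contain
 * it. An edge block needs a pre-read when it also holds bytes outside
 * the range. When first_vbn == last_vbn both flags describe the same
 * block. */
struct block_span span_for_range(unsigned long offset, unsigned long count)
{
    struct block_span s;
    unsigned long end;

    assert(count > 0);
    end = offset + count;                      /* one past the last byte */
    s.first_vbn = offset / BLOCK_SIZE + 1;
    s.last_vbn = (end - 1) / BLOCK_SIZE + 1;
    s.preread_first = (offset % BLOCK_SIZE) != 0;
    s.preread_last = (end % BLOCK_SIZE) != 0;
    return s;
}
```

For example, `span_for_range(500, 100)` touches blocks 1 and 2 and needs both pre-read, while `span_for_range(512, 1024)` covers blocks 2 and 3 exactly and needs no pre-read at all.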

> Forget RMS, I/O would be at the QIO level.

Why? Underneath RMS is QIO. What RMS gives us is the coordination of the buffers/buckets/clumps/blocks across the cluster to ensure no lost updates, as per the example used to justify SSIO.

> RMS keyed files can have variable record lengths.
True, VAR only, not VFC or STM*, but with fixed-length key fields at fixed offsets in the record.

> RMS relative files require fixed length records. (if I remember correctly)
Yup, they are implicitly fixed length.

===
Have been thinking about the byte range locking. As most of the use will be for locking ranges in a file, it should be integrated with RMS, i.e. RMS should have an API to allow this, as it already does the locking of the buffer/bucket/clump/block. Just need another 1 or 2 layers of lock tree and you have it. And it would all be cluster wide, and compatible with other users of RMS.

gt

Simon Clubley

Oct 7, 2021, 8:59:17 AM
On 2021-10-07, Greg Tinkler <tink...@gmail.com> wrote:
> On Thursday, 7 October 2021 at 11:26:38 pm UTC+11, Simon Clubley wrote:
>
>> How do you find byte 12,335,456 in a variable length RMS sequential file
>> without reading from the start of the file ?
>>
>> That's why there are restrictions on RMS supported file formats in an
>> application in some cases.
>
> The same way it is done on Unix: calculate the block offset, go get it, and extract the byte. No difference, and nothing to do with the underlying format.
>

You don't know the block offset without scanning the file when it comes
to some RMS file formats.

IOW, data byte 12,335,456 will not be the same thing as file byte 12,335,456
unless you restrict yourself to record formats that do not have embedded
record metadata.
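
A minimal C sketch of why the scan is needed. It assumes a simplified model of a variable-length sequential file image: each record is a 2-byte little-endian length, then the data, padded to an even byte boundary, with a length of 0xFFFF treated as an end marker. The model and the name `var_data_to_file_offset` are illustrative assumptions, not an RMS interface, and block-spanning rules are ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model of a variable-length record file image:
 * [2-byte little-endian length][data bytes][pad to even boundary] ...
 * Returns the file offset holding data byte number `want` (0-based),
 * or -1 if the data ends first. Every record header from the start of
 * the file has to be walked to get there. */
long var_data_to_file_offset(const unsigned char *img, size_t img_len,
                             size_t want)
{
    size_t pos = 0;      /* current offset in the file image */
    size_t seen = 0;     /* data bytes passed so far */

    assert(img != NULL);
    while (pos + 2 <= img_len) {
        size_t len = (size_t)img[pos] | ((size_t)img[pos + 1] << 8);
        size_t data = pos + 2;

        if (len == 0xFFFF || data + len > img_len)
            break;                      /* end marker or truncated image */
        if (want < seen + len)
            return (long)(data + (want - seen));
        seen += len;
        pos = data + len + (len & 1);   /* skip data plus any pad byte */
    }
    return -1;
}
```

With three records "A", "BB" and "CCC" in this model, data byte 0 sits at file offset 2, data byte 1 at file offset 6, and data byte 3 at file offset 10, so data offsets and file offsets diverge from the very first record.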

Arne Vajhøj

Oct 7, 2021, 9:34:31 AM
On 10/7/2021 8:59 AM, Simon Clubley wrote:
> On 2021-10-07, Greg Tinkler <tink...@gmail.com> wrote:
>> On Thursday, 7 October 2021 at 11:26:38 pm UTC+11, Simon Clubley wrote:
>>> How do you find byte 12,335,456 in a variable length RMS sequential file
>>> without reading from the start of the file ?
>>>
>>> That's why there are restrictions on RMS supported file formats in an
>>> application in some cases.
>>
>> The same way it is done on Unix: calculate the block offset, go get it, and extract the byte. No difference, and nothing to do with the underlying format.
>
> You don't know the block offset without scanning the file when it comes
> to some RMS file formats.
>
> IOW, data byte 12,335,456 will not be the same thing as file byte 12,335,456
> unless you restrict yourself to record formats that do not have embedded
> record metadata.

Yes.

And it does not get better when using standard C IO.

I suspect that the variable length file output below will
surprise a few *nix developers.

$ type var.txt
A
BB
CCC
$ type stmlf.txt
A
BB
CCC
$ type process.c
#include <stdio.h>
#include <sys/stat.h>

void sequential(const char *fnm, int mode)
{
    FILE *fp;
    int ix, c;
    printf("%s sequential read (%s):", fnm, mode ? "binary" : "text");
    fp = fopen(fnm, mode ? "rb" : "r");
    ix = 0;
    while((c = fgetc(fp)) >= 0)
    {
        ix++;
        if(c >= 0)
        {
            printf(" %d=%02X", ix, c);
        }
        else
        {
            printf(" %d=-1", ix);
        }
    }
    printf("\n");
    fclose(fp);
}

void direct(const char *fnm, int mode, int siz)
{
    FILE *fp;
    int ix, c;
    printf("%s direct read (%s):", fnm, mode ? "binary" : "text");
    fp = fopen(fnm, mode ? "rb" : "r");
    for(ix = 0; ix < siz; ix++)
    {
        fseek(fp, ix, SEEK_SET);
        c = fgetc(fp);
        if(c >= 0)
        {
            printf(" %d=%02X", ix + 1, c);
        }
        else
        {
            printf(" %d=-1", ix + 1);
        }
    }
    printf("\n");
    fclose(fp);
}

int main(int argc,char *argv[])
{
    struct stat buf;
    stat(argv[1], &buf);
    printf("%s size = %d bytes\n", argv[1], (int)buf.st_size);
    sequential(argv[1], 0);
    sequential(argv[1], 1);
    direct(argv[1], 0, (int)buf.st_size);
    direct(argv[1], 1, (int)buf.st_size);
    return 0;
}
$ cc process
$ link process
$ mcr sys$disk:[]process var.txt
var.txt size = 14 bytes
var.txt sequential read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
var.txt sequential read (binary): 1=41 2=42 3=42 4=43 5=43 6=43
var.txt direct read (text): 1=41 2=-1 3=02 4=-1 5=42 6=-1 7=-1 8=-1 9=43
10=-1 11=-1 12=-1 13=FF 14=-1
var.txt direct read (binary): 1=41 2=-1 3=02 4=-1 5=42 6=-1 7=-1 8=-1
9=43 10=-1 11=-1 12=-1 13=FF 14=-1
$ mcr sys$disk:[]process stmlf.txt
stmlf.txt size = 9 bytes
stmlf.txt sequential read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43
8=43 9=0A
stmlf.txt sequential read (binary): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43
8=43 9=0A
stmlf.txt direct read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
stmlf.txt direct read (binary): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A

Arne

Arne Vajhøj

Oct 7, 2021, 9:40:12 AM
Samba I totally get.

Multiple PC's writing to a file on a Samba share would create
some interesting scenarios.

But why does Apache need it?

It should read files to serve - and since it is serving VMS files
then I think it should be as VMSish as possible, so totally standard RMS.
And it should write sequential text files like access.log.

What am I missing?

Arne



Craig A. Berry

Oct 7, 2021, 9:51:11 AM
log files (and probably the fact that multiple worker processes can be
writing to the same logs). And I forgot to mention that Java needs it
too. See:

<http://de.openvms.org/TUD2012/opensource_and_unix_portability.pdf>

Page 16 says:

• Java (CIFS too) uses a work-around
− Does open+read/write+close for every read/write!
− Restores current file offset after each close+open
− Significant performance issue
• Oracle problem with log and trace files
− Single writer with multiple readers
• Apache’s use of log files sub-optimal
− V1.3 places restriction
− V2.0 uses a work-around
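
The open+read+close pattern in the first bullet can be sketched in portable C. This is a hedged illustration only, not the actual Java or CIFS code on OpenVMS; the `reopen_file` type and `reopen_read` function are hypothetical names. The file stays closed between calls, so the logical offset has to be carried by the caller and restored on every access.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical handle: the file stays closed between calls, so only the
 * name and the logical offset survive from one read to the next. */
struct reopen_file {
    const char *path;
    long offset;
};

/* Read up to `len` bytes: open, restore the saved offset, read, remember
 * the new offset, close. Returns bytes read, or 0 on any failure. */
size_t reopen_read(struct reopen_file *rf, void *buf, size_t len)
{
    FILE *fp;
    size_t got = 0;

    assert(rf != NULL && buf != NULL);
    fp = fopen(rf->path, "rb");
    if (fp == NULL)
        return 0;
    if (fseek(fp, rf->offset, SEEK_SET) == 0) {
        got = fread(buf, 1, len, fp);
        rf->offset += (long)got;
    }
    fclose(fp);
    return got;
}
```

Every call pays for a full open and close, which is the cost the slide labels a significant performance issue.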

Arne Vajhøj

Oct 7, 2021, 10:01:13 AM
On 10/7/2021 9:51 AM, Craig A. Berry wrote:
> On 10/7/21 8:40 AM, Arne Vajhøj wrote:
>> On 10/7/2021 8:50 AM, Craig A. Berry wrote:
>>> On 10/6/21 11:10 PM, Dave Froble wrote:
>>>> the real issue is that SSIO was aimed (it seems) at PostgreSQL.
>>>
>>> And Apache, and Samba, and other things that have been explicitly
>>> mentioned as having needed app-specific workarounds due to the absence
>>> of shared stream I/O support. SSIO *is* the general-purpose solution
>>> that you seem to be lamenting the lack of.
>>
>> Samba I totally get.
>>
>> Multiple PC's writing to a file on a Samba share would create
>> some interesting scenarios.
>>
>> But why does Apache need it?
>>
>> It should read files to serve - and since it is serving VMS files
>> then I think it should be as VMSish as possible, so totally standard RMS.
>> And it should write sequential text files like access.log.
>>
>> What am I missing?
>
> log files (and probably the fact that multiple worker processes can be
> writing to the same logs).

I still don't get it.

I thought SSIO was about shared access to byte streams.

Writing to log files should be fine using good old record based
writes (somewhere down the call stack SYS$PUT).

>   And I forgot to mention that Java needs it
> too.  See:
>
> <http://de.openvms.org/TUD2012/opensource_and_unix_portability.pdf>
>
> Page 16 says:
>
> • Java (CIFS too) uses a work-around
>   − Does open+read/write+close for every read/write!
>   − Restores current file offset after each close+open
>   − Significant performance issue

In this context does "Java" mean "Tomcat"?

Arne

Arne Vajhøj

Oct 7, 2021, 10:42:35 AM
On 10/7/2021 9:34 AM, Arne Vajhøj wrote:
>         fseek(fp, ix, SEEK_SET);

> var.txt size = 14 bytes
> var.txt sequential read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43
> 9=0A
> var.txt sequential read (binary): 1=41 2=42 3=42 4=43 5=43 6=43
> var.txt direct read (text): 1=41 2=-1 3=02 4=-1 5=42 6=-1 7=-1 8=-1 9=43
> 10=-1 11=-1 12=-1 13=FF 14=-1
> var.txt direct read (binary): 1=41 2=-1 3=02 4=-1 5=42 6=-1 7=-1 8=-1
> 9=43 10=-1 11=-1 12=-1 13=FF 14=-1
> $ mcr sys$disk:[]process stmlf.txt
> stmlf.txt size = 9 bytes
> stmlf.txt sequential read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43
> 8=43 9=0A
> stmlf.txt sequential read (binary): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43
> 8=43 9=0A
> stmlf.txt direct read (text): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43 9=0A
> stmlf.txt direct read (binary): 1=41 2=0A 3=42 4=42 5=0A 6=43 7=43 8=43
> 9=0A

In all fairness, I believe there is some documentation
somewhere that states that fseek is only supported to the
beginning of a record. I cannot find it right now,
but I believe I once saw it somewhere.

Arne


Arne Vajhøj

Oct 7, 2021, 10:49:01 AM
On 10/7/2021 3:54 AM, Lawrence D’Oliveiro wrote:
> On Thursday, October 7, 2021 at 5:12:58 PM UTC+13, Dave Froble
> wrote:
>> Frankly, (and yes, I'm biased), I find records reasonable, and a
>> stream of bytes baffling and confusing. Guess it's what one is used
>> to.
>
> Trouble is, there are many binary file formats that do not map easily
> to a simple sequence of records (of whatever delimitation). Consider
> the IFF family of file formats, for example: these are built out of
> chunks, and certain chunk types can contain other chunks.
>
> For another example, consider file formats like TIFF and TTF, where
> there is a directory that identifies the location and size of the
> various major pieces. Oh, and PDF comes under this as well.

The whole record thing is mostly for text files and RMS based
database style usage.

Even on VMS, true binary files are usually FIX 512 (or in rare
cases UDF) with the structure handled entirely by the application.

Attempts to do otherwise often end up with 32K problems.

> And then there are text-based format families, like XML, JSON, YAML,
> TOML ...

Different. Both on *nix and VMS that is a separate structure
on top of the basic file format.

Arne


Stephen Hoffman

Oct 7, 2021, 11:34:50 AM
On 2021-10-07 01:25:57 +0000, Greg Tinkler said:

> My point is SSIO seems to be focused on just PostgreSQL, whereas an RMS
> solution is much much easier to program, uses well tested code, and is
> already cluster ready putting the team ahead of the game and not
> building issues for the future.

...

Fix the existing RMS data corruption in 32-bit RMS and/or in the C
library, and get PostgreSQL available on OpenVMS soonest. I expect this
is the priority for VSI.

Everything else is aspirational.

Integrate stream file access support at the XQP and allow C and C++ and
other non-punched-card-style app designs and stream- and OO-focused
languages to optionally bypass RMS entirely.

Better integrate and document the existing range-locking support
available within DLM.

And in aggregate, stop trying to make the current 32-bit RMS NoSQL
database more complex than it already is, and re-architect such that
32-bit RMS NoSQL database becomes just another available database, and
preferably while providing room for 64-bit RMS rather than trying
another OpenVMS Alpha V7.0-style 32-/64-bit or FAB/RAB/RAB64/NAM/NAML
design, and make 32- or (hypothetical) 64-bit RMS not the sole
persistent-storage "funnel" for structured file access for apps running
on OpenVMS, short of those few using XQP or LOG_IO or PHY_IO. Existing
RMS apps are already headed for "fun" as part of the upcoming 64-bit
LBN work for VSI and for apps, and a whole lot of those apps just won't
make it past messes similar to apps still tied to ODS-2 naming. I'd
wager that most existing apps don't yet fully support ODS-5 naming,
UTF-8 and all, too. Similar app messes with latent 32-bit RMS
dependencies.

David Jones

Oct 7, 2021, 11:37:06 AM
On Thursday, October 7, 2021 at 3:54:53 AM UTC-4, Lawrence D’Oliveiro wrote:
> Trouble is, there are many binary file formats that do not map easily to a simple sequence of records (of whatever delimitation). Consider the IFF family of file formats, for example: these are built out of chunks, and certain chunk types can contain other chunks.

Whatever happened to Compound Document Architecture (CDA)? It always struck me as an effort (now abandoned) toward an object oriented file structure.

Arne Vajhøj

Oct 7, 2021, 11:50:54 AM
On 10/6/2021 10:00 PM, Lawrence D’Oliveiro wrote:
> Given that nowadays an SQL-based RDBMS like SQLite can offer full
> support for transactions, joins and subqueries (missing only more
> multi-user-type features like locking and replication), and yet still
> be resource-light enough to fit in your mobile phone, I would say the
> time for application developers to be grubbing about in ISAM files is
> past.

There are still cases where it makes sense. RMS index-sequential files
are really a NoSQL Key Value Store in modern terminology, and
such stores are still used and new ones are even being developed (like
RocksDB).

But the default should change.

"use index-sequential file unless good reason to use relational database"

=>

"use relational database unless good reason to use
index-sequential file"

Arne


Stephen Hoffman

Oct 7, 2021, 11:51:33 AM
There are many examples. It's far easier to map a whole executable
image into virtual memory or to use file system calls to load the whole
image into virtual memory, too. (This is an app design I never would
have considered on a VAX, too.)

For a number of apps and designs, I find RMS problematic for its
fondness for records in the lower parts of its position within the I/O
stack "funnel", and problematic again at somewhat higher levels of the
I/O stack "funnel" with what little RMS can do with those database
records it wants to enforce; its lack of marshaling and unmarshaling
for apps needing those services, among other sorts of designs, and with
all the usual "fun" with making changes to the contents and formats of
RMS records within apps.

Trying to make all apps fit within one NoSQL database really isn't all
that great of a solution. Getting PostgreSQL, SQLite, and other
databases better integrated is helpful. Longer-term and as I'd
mentioned in another reply, demoting 32-bit RMS to "just another local
database" status, too.

And to be absolutely clear here: if an app developer needs a NoSQL
database, as many apps do, having 32-bit RMS is entirely useful. At
least until the app developer needs to make changes or additions to the
record structures, when 32-bit RMS starts showing its age. A problem
related to how we now have roughly two-dozen files necessary within a
cluster configuration.

Craig A. Berry

Oct 7, 2021, 12:01:40 PM
Is this what you're looking for?

$ help crtl fseek description

CRTL

fseek

Description

The fseek function can position a fixed-length record-access
file with no carriage control or a stream-access file on any
byte offset, but can position all other files only on record
boundaries.

The available Standard I/O functions position a variable-length
or VFC record file at its first byte, at the end-of-file, or on
a record boundary. Therefore, the arguments given to fseek must
specify any of the following:

o The beginning or end of the file

o A 0 offset from the current position (an arbitrary record
boundary)

o The position returned by a previous, valid ftell call
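
The third rule in the help text suggests a portable pattern: remember the `ftell` position at each record boundary during a sequential pass, then `fseek` only to those saved positions. A minimal sketch in standard C, assuming newline-delimited records as the C text stream presents them; `index_records` and `read_record` are hypothetical helper names, and this has not been verified against the VMS CRTL.

```c
#include <assert.h>
#include <stdio.h>

/* Record the ftell() position of the start of each line, up to `max`
 * entries. Returns the number of positions stored. */
size_t index_records(FILE *fp, long *pos, size_t max)
{
    size_t n = 0;
    int c, at_start = 1;

    assert(fp != NULL && pos != NULL);
    rewind(fp);
    for (;;) {
        long here = ftell(fp);       /* position before the next byte */
        c = fgetc(fp);
        if (c == EOF)
            break;
        if (at_start && n < max)
            pos[n++] = here;         /* only record boundaries are kept */
        at_start = (c == '\n');
    }
    return n;
}

/* Seek to a previously indexed record and read it into buf (without the
 * newline). Returns 0 on success, -1 if the seek fails. */
int read_record(FILE *fp, long where, char *buf, size_t size)
{
    size_t i = 0;
    int c;

    assert(fp != NULL && buf != NULL && size > 0);
    if (fseek(fp, where, SEEK_SET) != 0)
        return -1;
    while (i + 1 < size && (c = fgetc(fp)) != EOF && c != '\n')
        buf[i++] = (char)c;
    buf[i] = '\0';
    return 0;
}
```

The saved positions are treated as opaque tokens; the code never does arithmetic on them, which is what the record-boundary restriction requires.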


Stephen Hoffman

Oct 7, 2021, 12:08:47 PM
DEC ceded the desktop app business.

The modern equivalent to CDA is PDF.

Craig A. Berry

Oct 7, 2021, 12:12:03 PM

On 10/7/21 9:01 AM, Arne Vajhøj wrote:
> On 10/7/2021 9:51 AM, Craig A. Berry wrote:
>> On 10/7/21 8:40 AM, Arne Vajhøj wrote:
>>> On 10/7/2021 8:50 AM, Craig A. Berry wrote:
>>>> On 10/6/21 11:10 PM, Dave Froble wrote:
>>>>> the real issue is that SSIO was aimed (it seems) at PostgreSQL.
>>>>
>>>> And Apache, and Samba, and other things that have been explicitly
>>>> mentioned as having needed app-specific workarounds due to the absence
>>>> of shared stream I/O support. SSIO *is* the general-purpose solution
>>>> that you seem to be lamenting the lack of.
>>>
>>> Samba I totally get.
>>>
>>> Multiple PC's writing to a file on a Samba share would create
>>> some interesting scenarios.
>>>
>>> But why does Apache need it?
>>>
>>> It should read files to serve - and since it is serving VMS files
>>> then I think it should be as VMSish as possible, so totally standard RMS.
>>> And it should write sequential text files like access.log.
>>>
>>> What am I missing?
>>
>> log files (and probably the fact that multiple worker processes can be
>> writing to the same logs).
>
> I still don't get it.
>
> I thought SSIO was about shared access to byte streams.
>
> Writing to log files should be fine using good old record based
> writes (somewhere down the call stack SYS$PUT).

Don't ask me, ask the authors of the document to which I linked. Or the
folks at VSI who inherited their work. I may be wrong and it's not
about log files, but suppose it is. If you start from the premise that
the log files are stream-oriented and you have multiple writers and
multiple readers at the same time, then that's pretty much the
definition of shared access to a byte stream. Doing it differently for a
platform that prefers records would be extra cost and extra maintenance.

>>                            And I forgot to mention that Java needs it
>> too.  See:
>>
>> <http://de.openvms.org/TUD2012/opensource_and_unix_portability.pdf>
>>
>> Page 16 says:
>>
>> • Java (CIFS too) uses a work-around
>>    − Does open+read/write+close for every read/write!
>>    − Restores current file offset after each close+open
>>    − Significant performance issue
>
> In this context does "Java" mean "Tomcat"?

You know as much as I do -- probably more ;-).

Dave Froble

Oct 7, 2021, 12:21:20 PM
On 10/7/2021 8:50 AM, Craig A. Berry wrote:
>
A while back we were discussing doing away with I/O to buffers, and
accessing the data in place. Slower access perhaps, but doing away with
the reading and writing to/from buffers. Haven't heard much about that
lately. I don't get out much.

Such type of activity would really benefit from having the capability of
locking just the required data, and, would need the capability of
reading and writing just the required data.

I'm aware of how useful something like SSIO would be. I'm just appalled
by the design and implementation. As mentioned, it seems aimed at just
a few current uses, and totally ignores how useful it would be for many
more future uses. This is rather consistent with the long time apathy
with which VMS has been treated. It's more a patch than an enhancement.
This is what I lament.

Dave Froble

Oct 7, 2021, 12:30:02 PM
On 10/7/2021 10:01 AM, Arne Vajhøj wrote:

>
> I still don't get it.
>
> I thought SSIO was about shared access to byte streams.

That is a bit of tunnel vision.

Locking numeric ranges could be used for many other things. Such a
capability should be generic, not just for a single purpose.
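
To illustrate what a generic numeric range lock means, here is a toy, single-process table in C. It is only a sketch of the overlap rule: it is not the DLM, not cluster-aware, and not thread-safe, and the names `range_table`, `range_try_lock` and `range_unlock` are hypothetical. Nothing in it knows about files or blocks; the ranges are just numbers.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_RANGES 16

/* A toy table of held ranges [start, start + len). */
struct range_table {
    unsigned long start[MAX_RANGES];
    unsigned long len[MAX_RANGES];
    size_t count;
};

/* Try to take an exclusive lock on a numeric range. Returns 1 on
 * success, 0 if it overlaps a held range or the table is full. */
int range_try_lock(struct range_table *t, unsigned long start,
                   unsigned long len)
{
    size_t i;

    assert(t != NULL && len > 0);
    for (i = 0; i < t->count; i++) {
        /* Two half-open ranges overlap when each starts before the
         * other ends. */
        if (start < t->start[i] + t->len[i] && t->start[i] < start + len)
            return 0;
    }
    if (t->count == MAX_RANGES)
        return 0;
    t->start[t->count] = start;
    t->len[t->count] = len;
    t->count++;
    return 1;
}

/* Release a range previously taken with exactly these bounds.
 * Returns 1 if it was found and removed, 0 otherwise. */
int range_unlock(struct range_table *t, unsigned long start,
                 unsigned long len)
{
    size_t i;

    assert(t != NULL);
    for (i = 0; i < t->count; i++) {
        if (t->start[i] == start && t->len[i] == len) {
            t->count--;
            t->start[i] = t->start[t->count];   /* fill the gap */
            t->len[i] = t->len[t->count];
            return 1;
        }
    }
    return 0;
}
```

Adjacent ranges such as 0..99 and 100..149 can be held at the same time, while a request for 99..100 conflicts with both.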

That's the problem I see, the tunnel vision when approaching the issue,
rather than the vision to see just how useful the capability could be.

Craig's post points that out.

Arne Vajhøj

Oct 7, 2021, 12:52:30 PM
YES.

And shame on me, because I only checked help crtl fseek arguments.

Arne

Dave Froble

Oct 7, 2021, 12:53:34 PM
On 10/7/2021 8:54 AM, Greg Tinkler wrote:
>
>> Well the perceived issue is what happens when taking out locks, and at
>> some point there is a conflict. Say needing 127 blocks locked, and the
>> conflict is on the last block. That means 126 locks to be released, and
>> perhaps try again.
> Maybe, maybe not. It depends on the locking fan-out factors for the differing levels. It is possible that only 1 lock is needed, maybe more; the worst case would be 127. NB there is also BLAST to assist with managing the lock promotions/demotions.
>
>
>> As long as storage is block oriented, then regardless of the numeric
>> range of bytes, all blocks encompassing the byte range will need to be
>> read, including locking, and written. This usually will include data
>> outside the byte range.
>
> Yup, as is the case on Unix... let the drivers worry about how and why this is done, block or byte, whatever the IO device needs.
>
>> Forget RMS, I/O would be at the QIO level.
>
> Why? Underneath RMS is QIO. What RMS gives us is the coordination of the buffers/buckets/clumps/blocks across the cluster to ensure no lost updates, as per the example used to justify SSIO.

Too limited and specific purpose. RMS might be able to make use of some
capabilities, but so might other applications.

RMS does some things well, and doesn't have some capabilities that it
perhaps should have. Data field definitions in records comes to mind.

>> RMS keyed files can have variable record lengths.
> True, VAR only, not VFC or STM*, but with fixed-length key fields at fixed offsets in the record.
>
>> RMS relative files require fixed length records. (if I remember correctly)
> Yup, they are implicitly fixed length.
>
> ===
> Have been thinking about the byte range locking. As most of the use will be for locking ranges in a file, it should be integrated with RMS, i.e. RMS should have an API to allow this, as it already does the locking of the buffer/bucket/clump/block. Just need another 1 or 2 layers of lock tree and you have it. And it would all be cluster wide, and compatible with other users of RMS.

Short sighted thinking. Numeric range locking might be useful in many
applications.

Arne Vajhøj

Oct 7, 2021, 1:00:00 PM
On 10/7/2021 12:27 PM, Dave Froble wrote:
> On 10/7/2021 10:01 AM, Arne Vajhøj wrote:
>> I still don't get it.
>>
>> I thought SSIO was about shared access to byte streams.
>
> That is a bit of tunnel vision.

Not really. More like the definition.

<quote>
SSIO
====
Shared Stream IO feature provides POSIX compliant read/write to byte
stream files.
Hence SSIO feature, the data consistency is guaranteed when mutiple
processes are performing a Read/Write to non overlapping byte boundaries
with the same block boundary.
</quote>
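
The failure that this guarantee is meant to prevent can be shown with a small in-memory simulation in C. It assumes 512-byte blocks and two writers that each cache a private copy of the same block and write the whole block back; the `writer` type and the `load_block` and `store_range` functions are hypothetical and do not model the actual CRTL or XFC code.

```c
#include <assert.h>
#include <string.h>

#define BLOCK_SIZE 512   /* assumed disk block size in bytes */

/* A writer that keeps its own private copy of one disk block. */
struct writer {
    unsigned char cache[BLOCK_SIZE];
};

/* Each writer reads the whole block into its private buffer ... */
void load_block(struct writer *w, const unsigned char *disk)
{
    assert(w != NULL && disk != NULL);
    memcpy(w->cache, disk, BLOCK_SIZE);
}

/* ... changes only its own byte range, then writes the whole block
 * back, including whatever stale bytes its copy holds elsewhere. */
void store_range(struct writer *w, unsigned char *disk,
                 size_t off, const void *data, size_t len)
{
    assert(off + len <= BLOCK_SIZE);
    memcpy(w->cache + off, data, len);
    memcpy(disk, w->cache, BLOCK_SIZE);
}
```

If writers A and B both load the block, A then stores bytes 0..3 and B stores bytes 100..103, B's write-back carries its stale copy of bytes 0..3 and A's update is lost, even though the two byte ranges never overlapped.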

> Locking numeric ranges could be used for many other things.  Such a
> capability should be generic, not just for a single purpose.

I agree that range locking is a useful feature for many other purposes
than SSIO.

> That's the problem I see, the tunnel vision when approaching the issue,
> rather than the vision to see just how useful the capability could be.
>
> Craig's post points that out.

It listed some projects that could benefit from SSIO besides
PostgreSQL.

And I just don't understand some of the examples since they
sound traditional record oriented to me.

Arne


Dave Froble

Oct 7, 2021, 1:00:25 PM
On 10/7/2021 8:59 AM, Simon Clubley wrote:
> On 2021-10-07, Greg Tinkler <tink...@gmail.com> wrote:
>> On Thursday, 7 October 2021 at 11:26:38 pm UTC+11, Simon Clubley wrote:
>>
>>> How do you find byte 12,335,456 in a variable length RMS sequential file
>>> without reading from the start of the file ?
>>>
>>> That's why there are restrictions on RMS supported file formats in an
>>> application in some cases.
>>
>> The same way it is done on Unix: calculate the block offset, go get it, and extract the byte. No difference, and nothing to do with the underlying format.
>>
>
> You don't know the block offset without scanning the file when it comes
> to some RMS file formats.
>
> IOW, data byte 12,335,456 will not be the same thing as file byte 12,335,456
> unless you restrict yourself to record formats that do not have embedded
> record metadata.
>
> Simon.
>

I'm guessing Unix files don't have metadata and such. So the comparison
is not valid.

For a non-RMS file, yes, the location can be calculated. But not so for
an RMS file with record characteristics included in the records.

Since Unix doesn't have RMS files, perhaps that confused Greg.

Stephen Hoffman

Oct 7, 2021, 1:01:51 PM
On 2021-10-07 16:18:28 +0000, Dave Froble said:

> A while back we were discussing doing away with I/O to buffers, and
> accessing the data in place. Slower access perhaps, but doing away
> with the reading and writing to/from buffers. Haven't heard much about
> that lately. I don't get out much.

Ayup. Nonvolatile byte-addressable storage hardware is available now,
and is in use in various applications.

Compatible memory hardware will be rather more available for OpenVMS
x86-64, for folks interested in investigating this for their apps.

Carving out a hunk of persistent storage will be an interesting topic for
app developers on OpenVMS, though I can think of a couple of ways to
try.

Here's an HPE overview from a few years ago on the topic:
https://www.pdl.cmu.edu/SDI/2016/slides/keeton-2016-10-19-memory-driven-computing.pdf


I see some B-Tree work for this area in a newer paper, and a number of
other discussions.

> Such type of activity would really benefit from having the capability
> of locking just the required data, and, would need the capability of
> reading and writing just the required data.

Locking access to the contents of a global section, or locking access
to hardware-backed storage for external devices, is the same issue.

Whether DLM overhead is too high for that to be workable is another
discussion that the app developers will want to ponder.

> I'm aware of how useful something like SSIO would be. I'm just
> appalled by the design and implementation. As mentioned, it seems
> aimed at just a few current uses, and totally ignores how useful it
> would be for many more future uses. This is rather consistent with the
> long time apathy with which VMS has been treated. It's more a patch
> than an enhancement. This is what I lament.

Alas, there's no other outcome when upward-compatibility is an
overarching goal for the platform.

Dave Froble

Oct 7, 2021, 1:03:32 PM
On 10/7/2021 9:34 AM, Arne Vajhøj wrote:

> I suspect that the variable length file output below will
> surprise a few *nix developers.
>

Why do you post C code examples that confuse me and give me a headache?

:-)

Then again, Basic code examples might confuse Unix developers ...

Arne Vajhøj

Oct 7, 2021, 1:09:37 PM
On 10/7/2021 1:00 PM, Dave Froble wrote:
> On 10/7/2021 9:34 AM, Arne Vajhøj wrote:
>> I suspect that the variable length file output below will
>> surprise a few *nix developers.
>
> Why do you post C code examples that confuse me and give me a headache?
>
> :-)
>
> Then again, Basic code examples might confuse Unix developers ...

Sorry about the headache.

But the topic was identical code on *nix and VMS trying to
access a random position in a file.

C is available on both *nix and VMS so it was rather
obvious.

VMS Basic is not available on *nix.

I don't think there are quite the same options
in VMS Basic as in C for this, but I expect all the
options available in VMS Basic to produce a natural
expected result.

Arne


Dave Froble

Oct 7, 2021, 1:19:02 PM
On 10/7/2021 11:34 AM, Stephen Hoffman wrote:
> On 2021-10-07 01:25:57 +0000, Greg Tinkler said:
>
>> My point is SSIO seems to be focused on just PostgreSQL, whereas an
>> RMS solution is much much easier to program, uses well tested code,
>> and is already cluster ready putting the team ahead of the game and
>> not building issues for the future.
>
> ...
>
> Fix the existing RMS data corruption in 32-bit RMS and/or in the C
> library, and get PostgreSQL available on OpenVMS soonest. I expect this
> is the priority for VSI.

Most likely.

> Everything else is aspirational.
>
> Integrate stream file access support at the XQP and allow C and C++ and
> other non-punched-card-style app designs and stream- and OO-focused
> languages to optionally bypass RMS entirely.

I don't use C, so I don't know much about it. But isn't this capability
already available? Even RMS has the BLOCK I/O capability, at least from
Basic.

As far as I know, QIO doesn't know a thing about RMS. Well, the
directory structure does know RMS, and to an extent is RMS.

> Better integrate and document the existing range-locking support
> available within DLM.

Yes, for sure. And if needed, make it much better.

> And in aggregate, stop trying to make the current 32-bit RMS NoSQL
> database more complex than it already is, and re-architect such that
> 32-bit RMS NoSQL database becomes just another available database, and
> preferably while providing room for 64-bit RMS rather than trying
> another OpenVMS Alpha V7.0-style 32-/64-bit or FAB/RAB/RAB64/NAM/NAML
> design, and make 32- or (hypothetical) 64-bit RMS not the sole
> persistent-storage "funnel" for structured file access for apps running
> on OpenVMS, short of those few using XQP or LOG_IO or PHY_IO. Existing
> RMS apps are already headed for "fun" as part of the upcoming 64-bit LBN
> work for VSI and for apps, and a whole lot of those apps just won't make
> it past messes similar to apps still tied to ODS-2 naming. I'd wager
> that most existing apps don't yet fully support ODS-5 naming, UTF-8 and
> all, too. Similar app messes with latent 32-bit RMS dependencies.

Oh, no, Steve. That is much too logical and reasonable. Can't have
that. We must ensure that things stay totally screwed up.

Don't know how far work had progressed on alternate file systems. Might
or might not help to make RMS "just another capability". But, doing
what you suggest would go a long way toward making VMS more useful in
the future.

I've got the suspicion that VMS clusters, while good, create some of the
problems in attempting to add new capabilities to VMS. Need I mention
"MOUNT"? Better segregation might help to add new and different
capabilities. Not sure how easy that might be.

Arne Vajhøj

Oct 7, 2021, 1:25:40 PM
On 10/7/2021 1:16 PM, Dave Froble wrote:
> On 10/7/2021 11:34 AM, Stephen Hoffman wrote:
>> Integrate stream file access support at the XQP and allow C and C++ and
>> other non-punched-card-style app designs and stream- and OO-focused
>> languages to optionally bypass RMS entirely.
>
> I don't use C, so I don't know much about it.  But isn't this capability
> already available?  Even RMS has the BLOCK I/O capability, at least from
> Basic.

C/C++ and most newer languages have a "stream view" of files while
RMS has a "record view" of files.

If they used different file systems everything would be fine.

If all text files are STMLF then it works, and the "stream view"
and the "record view" produce consistent results.

But trying to mix the two views on variable-length or VFC files becomes
a minefield.

I know you don't like C, but try looking at the example I posted.
Some of the outputs are very weird.

Arne

Arne Vajhøj

Oct 7, 2021, 1:27:26 PM
On 10/7/2021 12:12 PM, Craig A. Berry wrote:
> On 10/7/21 9:01 AM, Arne Vajhøj wrote:
>> I still don't get it.

> Don't ask me, ask the authors of the document to which I linked. Or the
> folks at VSI who inherited their work.


I know - I should not shoot the messenger. Sorry.

Arne

Dave Froble

Oct 7, 2021, 1:28:28 PM
I'd suggest there should not be a "default". Rather, make good
thoughtful decisions. Have valid reasons for any decisions or choices.

Simon Clubley

Oct 7, 2021, 1:53:35 PM
On 2021-10-07, Dave Froble <da...@tsoft-inc.com> wrote:
> On 10/7/2021 9:34 AM, Arne Vajhøj wrote:
>
>> I suspect that the variable length file output below will
>> surprise a few *nix developers.
>>
>
> Why do you post C code examples that confuse me and give me a headache?
>
>:-)
>
> Then again, Basic code examples might confuse Unix developers ...
>

Some of them might be aware of Basic.

Back in the later MS-DOS days, Microsoft used to ship a Basic
interpreter for free with MS-DOS (and apparently some Windows versions):

https://en.wikipedia.org/wiki/QBasic

I've just discovered there's a QuickBasic-compatible compiler for Linux:

https://en.wikipedia.org/wiki/FreeBASIC

which I did not know about.

Just been reminded that Gorillas.bas was released 30 years ago.

I am now depressed. :-)

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.

Simon Clubley

Oct 7, 2021, 2:07:02 PM
On 2021-10-07, Dave Froble <da...@tsoft-inc.com> wrote:
> On 10/7/2021 8:59 AM, Simon Clubley wrote:
>> On 2021-10-07, Greg Tinkler <tink...@gmail.com> wrote:
>>> On Thursday, 7 October 2021 at 11:26:38 pm UTC+11, Simon Clubley wrote:
>>>
>>>> How do you find byte 12,335,456 in a variable length RMS sequential file
>>>> without reading from the start of the file ?
>>>>
>>>> That's why there are restrictions on RMS supported file formats in an
>>>> application in some cases.
>>>
>>> The same way it is done on Unix: calculate the block offset, go get it, and extract the byte. No difference, and nothing to do with the underlying format.
>>>
>>
>> You don't know the block offset without scanning the file when it comes
>> to some RMS file formats.
>>
>> IOW, data byte 12,335,456 will not be the same thing as file byte 12,335,456
>> unless you restrict yourself to record formats that do not have embedded
>> record metadata.
>>
>
> I'm guessing Unix files don't have metadata and such. So the comparison
> is not valid.
>

No, Unix doesn't. At Unix filesystem level, files are just a stream of bytes.

The next layer up on Unix is the C RTL. There's nothing like RMS
between the filesystem and the C RTL on Unix.
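
To make the point about embedded record metadata concrete, here is a minimal Python sketch. It assumes a simplified model of an RMS variable-length sequential file: each record is a 2-byte little-endian length followed by the data, padded to an even byte boundary. Block-spanning rules and end-of-file handling are ignored, and `encode_records` and `file_offset_of_data_byte` are hypothetical helper names, not RMS or CRTL calls.

```python
import struct

def encode_records(records):
    """Pack records as: 2-byte little-endian length, data, pad to even length."""
    out = bytearray()
    for rec in records:
        out += struct.pack("<H", len(rec)) + rec
        if len(rec) % 2:
            out += b"\x00"  # pad byte so the next length field is word-aligned
    return bytes(out)

def file_offset_of_data_byte(blob, n):
    """Walk the length fields from the start to locate the n-th data byte."""
    pos = 0    # position in the file image
    seen = 0   # data bytes passed so far
    while pos + 2 <= len(blob):
        (length,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        if n < seen + length:
            return pos + (n - seen)
        seen += length
        pos += length + (length % 2)  # skip the data and any pad byte
    raise IndexError("data byte offset past end of file")

blob = encode_records([b"abc", b"de", b"fghij"])
data = b"abc" + b"de" + b"fghij"
offset = file_offset_of_data_byte(blob, 6)
print(offset, blob[offset:offset + 1], data[6:7])
```

In this model data byte 6 sits at file offset 13, and the only way to learn that is to walk every length field before it. A pure byte stream needs no such scan.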

Arne Vajhøj

Oct 7, 2021, 2:13:53 PM
On 10/7/2021 1:53 PM, Simon Clubley wrote:
> On 2021-10-07, Dave Froble <da...@tsoft-inc.com> wrote:
>> Then again, Basic code examples might confuse Unix developers ...
>
> Some of them might be aware of Basic.
>
> Back in the later MS-DOS days, Microsoft used to ship a Basic
> interpreter for free with MS-DOS (and apparently some Windows versions):
>
> https://en.wikipedia.org/wiki/QBasic

GW-Basic came with DOS 1-4 and QBasic with DOS 5-6 and early Windows I
believe.

GW-Basic source code is now available at:
https://github.com/microsoft/GW-BASIC

> I've just discovered there's a QuickBasic-compatible compiler for Linux:
>
> https://en.wikipedia.org/wiki/FreeBASIC
>
> which I did not know about.

I would still not expect many Linux people to know Basic.

And besides VMS Basic is somewhat different from MS Basic flavors.

Arne

Arne Vajhøj

Oct 7, 2021, 2:18:28 PM
On 10/7/2021 2:07 PM, Simon Clubley wrote:
> On 2021-10-07, Dave Froble <da...@tsoft-inc.com> wrote:
>> I'm guessing Unix files don't have metadata and such. So the comparison
>> is not valid.
>
> No, Unix doesn't. At Unix filesystem level, files are just a stream of bytes.
>
> The next layer up on Unix is the C RTL. There's nothing like RMS
> between the filesystem and the C RTL on Unix.

Unix file systems do not have metadata about how the
bytes are to be read or interpreted (as VMS does: ORG, RFM, RAT,
MRS etc.). They do have some general metadata (owner,
protection, size, timestamp).
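
As a small illustration of that general metadata, the sketch below creates a scratch file and reads back its owner, protection, size and timestamp with Python's standard `os.stat`. The file contents are invented for the example. Nothing in the result says how the bytes should be interpreted as records.

```python
import os
import stat
import tempfile

# Create a scratch file; to the file system it is only a run of bytes.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"just a stream of bytes\n")
    path = f.name

info = os.stat(path)
meta = {
    "owner_uid": info.st_uid,                    # owner
    "protection": stat.filemode(info.st_mode),   # e.g. "-rw-------"
    "size": info.st_size,                        # length in bytes
    "modified": info.st_mtime,                   # timestamp
}
print(meta)
os.remove(path)
```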

Arne

Simon Clubley

Oct 7, 2021, 2:28:07 PM
On 2021-10-07, Dave Froble <da...@tsoft-inc.com> wrote:
>
> Don't know how far work had progressed on alternate file systems. Might
> or might not help to make RMS "just another capability". But, doing
> what you suggest would go a long way toward making VMS more useful in
> the future.
>
> I've got the suspicion that VMS clusters, while good, create some of the
> problems in attempting to add new capabilities to VMS. Need I mention
> "MOUNT"? Better segregation might help to add new and different
> capabilities. Not sure how easy that might be.
>

VMS clusters at conceptual level are not the problem. They offer
some very nice functionality that only recently is beginning to
appear elsewhere. They were literally a generation ahead of what
was available elsewhere when they were released.

The problem is how VMS was designed in those early days before
modular and layered computing really took off.

The VMS filesystem code, including MOUNT as you say, is a _horrible_
monolithic mass of closely interlinked code without any clear
boundaries between them that allow people (including end users) to
easily plug in new functionality and new filesystems.

The same is true for VMS CLIs BTW. DCL is tightly bound into VMS
in a horrible way it should not be. On Linux, both the command
shell and filesystem architectures are vastly cleaner and more
modular than they are on VMS.

However, if VMS had been designed in a later era, there would be
absolutely nothing stopping VMS having a cleaner internal architecture
_and_ also having world-leading cluster capabilities that are only
now just being equalled elsewhere.

IOW, it's not clustering that's the problem - it's the fact that
VMS wasn't implemented 5 to 10 years later than it was.

Stephen Hoffman

Oct 7, 2021, 2:30:53 PM
On 2021-10-07 17:16:03 +0000, Dave Froble said:

> On 10/7/2021 11:34 AM, Stephen Hoffman wrote:
>> On 2021-10-07 01:25:57 +0000, Greg Tinkler said:
>>
>>> My point is SSIO seems to be focused on just PostgreSQL, whereas an RMS
>>> solution is much much easier to program, uses well tested code, and is
>>> already cluster ready putting the team ahead of the game and not
>>> building issues for the future.
>>
>> ...
>>
>> Fix the existing RMS data corruption in 32-bit RMS and/or in the C
>> library, and get PostgreSQL available on OpenVMS soonest. I expect this
>> is the priority for VSI.
>
> Most likely.
>
>> Everything else is aspirational.
>>
>> Integrate stream file access support at the XQP and allow C and C++ and
>> other non-punched-card-style app designs and stream- and OO-focused
>> languages to optionally bypass RMS entirely.
>
> I don't use C, so I don't know much about it. But isn't this
> capability already available?

The C standard functions—the equivalent of the BASIC calls OPEN, READ,
WRITE, et al—are via RMS. There's no knob to tell C "don't do that".

The default format for sequential files created from C on OpenVMS is RMS
VFC, which has been a perpetual source of confusion and consternation
for users new to C on OpenVMS.

> Even RMS has the BLOCK I/O capability, at least from Basic.

C doesn't do sector I/O within the standard library, though the native
platform calls are easily available.

> As far as I know, QIO doesn't know a thing about RMS. Well, the
> directory structure does know RMS, and to an extent is RMS.

$qio (and $io_perform) offer sector access through RMS (virtual),
record access through RMS (virtual), or access to device through the
file system (IO$_ACPCONTROL XQP), or direct access to the device driver
and device (logical and physical I/O).

The VIRT_IO virtual I/O paths through RMS and through the XQP are
cluster-aware, while the LOG_IO logical and PHY_IO physical I/O paths
are not.

RMS provides record locking for cluster coordination, while the XQP
provides coordination for the on-disk file system.

>> Better integrate and document the existing range-locking support
>> available within DLM.
>
> Yes, for sure. And if needed, make it much better.
>
>> And in aggregate, stop trying to make the current 32-bit RMS NoSQL
>> database more complex than it already is, and re-architect such that
>> 32-bit RMS NoSQL database becomes just another available database, and
>> preferably while providing room for 64-bit RMS rather than trying
>> another OpenVMS Alpha V7.0-style 32-/64-bit or FAB/RAB/RAB64/NAM/NAML
>> design, and make 32- or (hypothetical) 64-bit RMS not the sole
>> persistent-storage "funnel" for structured file access for apps running
>> on OpenVMS, short of those few using XQP or LOG_IO or PHY_IO. Existing
>> RMS apps are already headed for "fun" as part of the upcoming 64-bit
>> LBN work for VSI and for apps, and a whole lot of those apps just won't
>> make it past messes similar to apps still tied to ODS-2 naming. I'd
>> wager that most existing apps don't yet fully support ODS-5 naming,
>> UTF-8 and all, too. Similar app messes with latent 32-bit RMS
>> dependencies.
>
> Oh, no, Steve. That is much too logical and reasonable. Can't have
> that. We must ensure that things stay totally screwed up.

I'd prefer an approach where there's some opportunity to ease new work
and new APIs into production, and to also retire overtly-busted APIs.

Oracle Rdb was really good at that migration, as far as that went, but
most other apps and OpenVMS itself have not managed to copy that. Not
successfully.

> Don't know how far work had progressed on alternate file systems.
> Might or might not help to make RMS "just another capability". But,
> doing what you suggest would go a long way toward making VMS more
> useful in the future.
>
> I've got the suspicion that VMS clusters, while good, create some of
> the problems in attempting to add new capabilities to VMS. Need I
> mention "MOUNT"? Better segregation might help to add new and
> different capabilities. Not sure how easy that might be.

Oracle Rdb and some other databases have cluster access locking,
whether using DLM or database-level locking.

Other databases can be single-host.

The SQLite port to OpenVMS supports DLM and clustering.

PostgreSQL has been adding replication and clustering:
https://www.postgresql.org/docs/9.5/different-replication-solutions.html

Whether an OpenVMS port of PostgreSQL can incorporate DLM calls is
fodder for future discussions, once the SSIO prerequisite becomes
available and a hypothetical future PostgreSQL port becomes stable. A
stable PostgreSQL will interest some folks, with adoptions depending on
both intrinsic interest and, um, potential extrinsic factors not yet in
evidence.

And no, you need not mention MOUNT, having necessarily (re)written what
MOUNT provides on several occasions.

Stephen Hoffman

Oct 7, 2021, 2:34:21 PM
On 2021-10-07 18:28:04 +0000, Simon Clubley said:

> IOW, it's not clustering that's the problem - it's the fact that VMS
> wasn't implemented 5 to 10 years later than it was.

...Or that OpenVMS and its apps weren't later migrated to DEC MICA.
Which is kinda-sorta what you're referring to.

Arne Vajhøj

Oct 7, 2021, 2:36:37 PM
On 10/7/2021 2:30 PM, Stephen Hoffman wrote:
> PostgreSQL has been adding replication and clustering:
> https://www.postgresql.org/docs/9.5/different-replication-solutions.html
>
> Whether an OpenVMS port of PostgreSQL can incorporate DLM calls is
> fodder for future discussions, once the SSIO prerequisite becomes
> available and a hypothetical future PostgreSQL port becomes stable. A
> stable PostgreSQL will interest some folks, with adoptions depending on
> both intrinsic interest and, um, potential extrinsic factors not yet in
> evidence.

PostgreSQL clusters are active/passive.

All updates and typically all reads go to the active node,
and updates get replicated from the active node to the passive nodes.

I believe it is possible to have the passive nodes support
reading.

But with only the active node taking updates then there
is no need for DLM.

(VMS people may not even call such a config a cluster, but ...)

Arne

Stephen Hoffman

Oct 7, 2021, 3:09:12 PM
On 2021-10-07 18:36:28 +0000, Arne Vajhøj said:

> PostgreSQL clusters are active/passive. ...

For folks interested in this general topic area with PostgreSQL around
failover and replication, please see the PostgreSQL documentation for
details.

Here's an updated link from what I'd posted earlier:
https://www.postgresql.org/docs/14/different-replication-solutions.html

If there's interest in adding what OpenVMS calls clustering within any
hypothetical future PostgreSQL port, use of the DLM will undoubtedly be
considered.

nb: PostgreSQL uses the term "cluster" for something entirely different
and unrelated to OpenVMS clustering.

Dave Froble

Oct 7, 2021, 5:06:16 PM
Now I'm just a dumb polock, wandered down out of the woods. But I just
don't see where upward compatibility has anything to do with
enhancements to the DLM. If existing calls continue to work as before,
and only when an optional extra parameter would enable new capabilities,
then upward compatibility just cannot be an issue. At least for this.

The optional parameter might be a "lock type", and if not present,
existing logic would be used, and if present, new code could be executed
to process the new lock type. Stuff a couple of quadwords into the
resource name for the numeric range. It would add one new piece of data
to the DLM data structure(s).
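
The "lock type" idea can be pictured with a small sketch. This is not the DLM API: `RangeLock`, `compatible` and `try_grant` are hypothetical names, and only two of the DLM lock modes (PR for shared read, EX for exclusive) are modelled. The assumption is that a range lock carries a start and end offset, and that two requests conflict only when their ranges overlap and their modes are incompatible.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RangeLock:
    start: int   # first byte covered (inclusive)
    end: int     # last byte covered (inclusive)
    mode: str    # "PR" (shared read) or "EX" (exclusive); a subset of DLM modes

def overlaps(a, b):
    return a.start <= b.end and b.start <= a.end

def compatible(a, b):
    """Two locks can coexist if they do not overlap or both are shared reads."""
    return not overlaps(a, b) or (a.mode == "PR" and b.mode == "PR")

def try_grant(granted, request):
    """Grant the request only if it is compatible with every granted lock."""
    if all(compatible(request, held) for held in granted):
        granted.append(request)
        return True
    return False

granted = []
results = [
    try_grant(granted, RangeLock(0, 511, "EX")),
    try_grant(granted, RangeLock(512, 1023, "EX")),   # adjacent, no overlap
    try_grant(granted, RangeLock(500, 600, "PR")),    # overlaps both EX ranges
    try_grant(granted, RangeLock(2000, 2999, "PR")),
    try_grant(granted, RangeLock(2500, 2600, "PR")),  # shared readers may overlap
]
print(results)
```

Callers that never pass a range would keep the existing whole-resource behaviour, which is the compatibility argument being made here.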

Dave Froble

Oct 7, 2021, 5:16:16 PM
I have some understanding of RMS files. I've been known to do recovery
work on corrupted RMS files. Have to have some knowledge of RMS to do
that. But, it sure isn't any fun ...

Dave Froble

Oct 7, 2021, 5:19:55 PM
You may have noticed that I didn't blame VMS clusters for the problem.
Rather how some things are so rigid, and more so because of support for
some things that involve clusters. Makes new stuff sometimes much
harder, as you mentioned.

Craig A. Berry

Oct 7, 2021, 7:15:23 PM

On 10/7/21 1:30 PM, Stephen Hoffman wrote:
> On 2021-10-07 17:16:03 +0000, Dave Froble said:


>> I don't use C, so I don't know much about it.  But isn't this
>> capability already available?
>
> The C standard functions—the equivalent of the BASIC calls OPEN, READ,
> WRITE, et al—are via RMS. There's no knob to tell C "don't do that".

You're pretending that you don't know about the foo="bar" options on the
CRTL open/fopen/creat calls. Yes, it's all via RMS, but you can tell it
to do or not do certain things. And the feature logicals, of course, but
it might be dinnertime in your time zone and I wouldn't want to give you
indigestion :-).

But from BASIC, yes, I think you have to write wrappers around the CRTL
functions and then call them from BASIC, or at least that's what I did
the one time I had to write stream files from BASIC.

Dave Froble

Oct 7, 2021, 8:24:44 PM
I'd ask, why not call the RMS routines?

No, messing with FABs and RABs and such is not one of my favorite things
to do. But it sure is doable.

Now perhaps the naming doesn't mean the same thing, but:

OPEN
  Syntax
                    [ FOR INPUT  ]
    OPEN file-spec1 [ FOR OUTPUT ] AS [ FILE ] chnl-exp1 [, open-clause ]...

    open-clause: {                    { VIRTUAL    }               }
                 {                    { UNDEFINED  }               }
                 { [ ORGANIZATION ]   { INDEXED    }  [ STREAM   ] }
                 {                    { SEQUENTIAL }  [ VARIABLE ] }
                 {                    { RELATIVE   }  [ FIXED    ] }

Basic help seems to imply that stream files can be created ...

Perhaps I should actually try it, much as it entails work ...

Itanic> t zz.bas
1       Open "ZZ.ZZ" For Output as File #1%, &
                Organization Sequential Stream, &
                Recordsize 32767

        Print #1%, Num1$(Z%) For Z% = 1% to 5%

        Close #1%

        End
Itanic> t zz.zz
1
2
3
4
5
Itanic> dir/full zz.zz

Directory DKB0:[DFE]

ZZ.ZZ;1 File ID: (6678,7,0)
Size: 1/16 Owner: [DFE]
Created: 7-OCT-2021 20:22:50.29
Modified: 7-OCT-2021 20:22:50.36 (1)
Expires: <None specified>
Backup: <No backup recorded>
Effective: <None specified>
Recording: <None specified>
Accessed: 7-OCT-2021 20:22:50.29
Attr Mod: 7-OCT-2021 20:22:50.36
Data Mod: 7-OCT-2021 20:22:50.29
Linkcount: 1
File organization: Sequential
Shelved state: Online
Caching attribute: Writethrough
File attributes: Allocation: 16, Extend: 0, Global buffer count: 0,
No version limit
Record format: Stream, maximum 32767 bytes, longest 1 byte
Record attributes: Carriage return carriage control
RMS attributes: None
Journaling enabled: None
File protection: System:RWED, Owner:RWED, Group:RE, World:
Access Cntrl List: None
Client attributes: None

Lawrence D’Oliveiro

Oct 7, 2021, 8:55:15 PM
On Friday, October 8, 2021 at 4:37:06 AM UTC+13, osuv...@gmail.com wrote:
> Whatever happened to Compound Document Architecture (CDA)? It always
> struck me as an effort (now abandoned) toward an object oriented file structure.

Then there was Bento, which Apple was fond of for a while (back in the days of the OpenDoc-versus-OLE2 war).

Seems like nobody cares about live embedding and compound documents now. Probably turned out to be too complex for most users to handle.

One interesting modern trend is the use of ZIP archives as a document metaformat. For example, an ODF file (ISO/IEC 26300) is essentially a ZIP archive. There is this interesting convention that the first element of the archive shall be named “mimetype”, and its content shall be uncompressed. This allows file sniffers to pick up the MIME type info at a fixed offset near the start of the file.
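
The fixed-offset trick can be sketched with Python's standard `zipfile` module. This is a minimal illustration rather than a complete ODF writer: the `content.xml` member is a placeholder, and the offset arithmetic assumes the first member's local header carries no extra field, which is the case for the archive built here.

```python
import io
import zipfile

MIMETYPE = b"application/vnd.oasis.opendocument.text"

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    # First member: named "mimetype", stored without compression.
    zf.writestr(zipfile.ZipInfo("mimetype"), MIMETYPE,
                compress_type=zipfile.ZIP_STORED)
    # Placeholder for the rest of the document.
    zf.writestr("content.xml", "<doc/>")

raw = buf.getvalue()

# A ZIP local file header is 30 bytes plus the member name (8 bytes here),
# so with no extra field the stored content starts at byte offset 38.
start = 30 + len("mimetype")
sniffed = raw[start:start + len(MIMETYPE)]
print(raw[:4], sniffed)
```

A sniffer therefore only has to check for the ZIP signature at offset 0 and then read the type string at offset 38, without parsing the archive directory.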

Arne Vajhøj

Oct 7, 2021, 8:59:52 PM
On 10/7/2021 8:55 PM, Lawrence D’Oliveiro wrote:
> One interesting modern trend is the use of ZIP archives as a document
> metaformat. For example, an ODF file (ISO/IEC 26300) is essentially a ZIP
> archive.

ODF, OOXML, a bunch of Java stuff (jar, war, rar, ear) etc..

Arne

Lawrence D’Oliveiro

Oct 7, 2021, 9:01:06 PM
On Friday, October 8, 2021 at 4:34:50 AM UTC+13, Stephen Hoffman wrote:
> Integrate stream file access support at the XQP and allow C and C++ and
> other non-punched-card-style app designs and stream- and OO-focused
> languages to optionally bypass RMS entirely.

I assume you mean “bypass RMS for non-block-level I/O”, since it was always possible for nonprivileged code to do direct ACP/XQP calls like IO$_ACCESS, READ/WRITEVBLK and friends.

(You soon appreciate how much work $PARSE is doing for you...)

Lawrence D’Oliveiro

Oct 7, 2021, 9:06:13 PM
On Friday, October 8, 2021 at 6:19:02 AM UTC+13, Dave Froble wrote:
> As far as I know, QIO doesn't know a thing about RMS. Well, the
> directory structure does know RMS, and to an extent is RMS.

Last I checked*, on VMS, directories were just files, and there was no protection against processes with write access screwing up their contents. For some reason that was not considered to be a vital part of filesystem integrity.

RMS implements the full file/directory name syntax, but the management of name entries inside directories is an ACP/XQP function.

*decades ago, admittedly

Lawrence D’Oliveiro

Oct 7, 2021, 9:10:02 PM
On Friday, October 8, 2021 at 7:07:02 AM UTC+13, Simon Clubley wrote:
>
> On 2021-10-07, Dave Froble <da...@tsoft-inc.com> wrote:
>>
>> I'm guessing Unix files don't have metadata and such.
>>
> No, Unix doesn't. At Unix filesystem level, files are just a stream of bytes.

Some Linux filesystems have the concept of “extended attributes” <https://manpages.debian.org/buster/manpages/xattr.7.en.html>. Some are reserved for security purposes, others are user-defined.

Lawrence D’Oliveiro

Oct 7, 2021, 9:19:34 PM
On Friday, October 8, 2021 at 7:28:07 AM UTC+13, Simon Clubley wrote:
> The same is true for VMS CLIs BTW. DCL is tightly bound into VMS
> in a horrible way it should not be. On Linux, both the command
> shell and filesystem architectures are vastly cleaner and more
> modular than they are on VMS.

Fundamental difference in mindset: process creation in VMS is expensive and to be avoided if possible, while on *nix systems it’s something you do as naturally as breathing.

And of course the VMS mindset continued over into Windows NT...

> However, if VMS had been designed in a later era ...

Note that Unix predates VMS. Folks at DEC would have been aware of it right from the early days, since it was born on DEC hardware.

Stephen Hoffman

Oct 8, 2021, 10:51:05 AM
On 2021-10-07 21:03:53 +0000, Dave Froble said:

> On 10/7/2021 1:01 PM, Stephen Hoffman wrote:
>> On 2021-10-07 16:18:28 +0000, Dave Froble said:
>>
>>> I'm aware of how useful something like SSIO would be. I'm just
>>> appalled by the design and implementation. As mentioned, it seems
>>> aimed at just a few current uses, and totally ignores how useful it
>>> would be for many more future uses. This is rather consistent with the
>>> long time apathy with which VMS has been treated. It's more a patch
>>> than an enhancement. This is what I lament.
>>
>> Alas, there's no other outcome when upward-compatibility is an
>> overarching goal for the platform.
>
> Now I'm just a dumb polock, wandered down out of the woods. But I
> just don't see where upward compatibility has anything to do with
> enhancements to the DLM. If existing calls continue to work as before,
> and only when an optional extra parameter would enable new
> capabilities, then upward compatibility just cannot be an issue. At
> least for this.

I was building on the "long term apathy" and "more patch than
enhancement" comments, with the increasing difficulties even making
comparatively minor or isolated changes and updates.

Larger changes can be Really Difficult with ~40 years of accumulated
dependencies around, assuming the developers and schedule and funding
are all available. (q.v. Hyrum's Law.)

There are sections of OpenVMS that would best be ripped out and
replaced, or refactored, or re-architected, but that can't happen or
can't easily happen while staying compatible with existing apps.

DLM itself needs better abstractions as some of the more common tasks
are just absurdly involved to program within the existing API. Tasks
such as selecting a primary app server for a host or a cluster, for
instance. This is less of an issue for experienced OpenVMS programmers
and for those with access to examples (cost and schedule and budget and
ongoing support discussions aside), but this sequence is not something
at all obvious to less-experienced developers. And even with
experienced developers, mistakes still happen. And within a wider view,
this DLM primary support is building local process and job control
support, which is an omission I've commented on before.
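
For readers who have not seen the primary-selection pattern, here is a toy model of the idea, not the DLM API. Every candidate queues for one exclusive lock on an agreed resource name; whoever holds it is the primary, and the next waiter is promoted when the holder goes away. `ToyLockManager` and the resource name `APP$PRIMARY` are invented for the illustration. The real sequence involves $ENQ, ASTs and lock status blocks, which this sketch leaves out entirely.

```python
from collections import deque

class ToyLockManager:
    """Single exclusive lock per resource name with a FIFO wait queue."""

    def __init__(self):
        self.holder = {}    # resource name -> current holder
        self.waiters = {}   # resource name -> deque of waiting requesters

    def enqueue(self, resource, who):
        """Request the lock; return True if granted now, False if queued."""
        if resource not in self.holder:
            self.holder[resource] = who
            return True
        self.waiters.setdefault(resource, deque()).append(who)
        return False

    def release(self, resource, who):
        """Drop the lock (or leave the queue) and promote the next waiter."""
        queue = self.waiters.get(resource, deque())
        if self.holder.get(resource) != who:
            if who in queue:
                queue.remove(who)
            return
        if queue:
            self.holder[resource] = queue.popleft()
        else:
            del self.holder[resource]

    def primary(self, resource):
        return self.holder.get(resource)

dlm = ToyLockManager()
for server in ("node_a", "node_b", "node_c"):
    dlm.enqueue("APP$PRIMARY", server)

first = dlm.primary("APP$PRIMARY")
dlm.release("APP$PRIMARY", "node_a")   # the primary exits or fails
second = dlm.primary("APP$PRIMARY")
print(first, second)
```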

Stephen Hoffman

Oct 8, 2021, 11:27:38 AM
On 2021-10-07 23:15:19 +0000, Craig A. Berry said:

> On 10/7/21 1:30 PM, Stephen Hoffman wrote:
>> On 2021-10-07 17:16:03 +0000, Dave Froble said:
>
>
>>> I don't use C, so I don't know much about it.  But isn't this
>>> capability already available?
>>
>> The C standard functions—the equivalent of the BASIC calls OPEN, READ,
>> WRITE, et al—are via RMS. There's no knob to tell C "don't do that".
>
> You're pretending that you don't know about the foo="bar" options on
> the CRTL open/fopen/creat calls. Yes, it's all via RMS, but you can
> tell it to do or not do certain things. And the feature logicals, of
> course, but it might be dinnertime in your time zone and I wouldn't
> want to give you indigestion :-).

I need to pretend harder, or to forget harder, then. I'm sufficiently
familiar with those C options and with the acc routines and with my
always-favorite C feature logical names and the lib$initialize psect
fun to often prefer use of $qio or $io_perform when suppression of RMS
"helpfulness" is sought, yes. Even with all those knobs, "well, there's
egg and bacon; egg sausage and bacon; egg and RMS; egg bacon and RMS;
egg bacon sausage and RMS; RMS bacon sausage and RMS; RMS egg RMS bacon
and RMS; RMS sausage RMS bacon RMS..." "Do you have anything without
RMS (getting "helpful")?" https://vimeo.com/329001211

Too many C design quirks awaiting the unwary or the uninitiated here,
too; where you have to add options to remove platform-specific oddities
(see above), where basename works for Unix specs but not for OpenVMS
specs, how select only works for IP, where VFC is the default
sequential file creation format, the need for the moving target that is
lib$initialize, and the decc$to_vms and decc$from_vms calling
conventions. And suchlike.

This is part of why I'd prefer to see a new C standard library within
the port, and to relegate the existing standard library for use by
existing apps.

> But from BASIC, yes, I think you have to write wrappers around the CRTL
> functions and then call them from BASIC, or at least that's what I did
> the one time I had to write stream files from BASIC.

I'd expect there's BASIC code around that doesn't handle TCP streams
all that well, either. Similar issues. Punched cards are really
entrenched all through OpenVMS.

Stephen Hoffman

Oct 8, 2021, 11:37:26 AM
On 2021-10-08 01:19:33 +0000, Lawrence D’Oliveiro said:

> On Friday, October 8, 2021 at 7:28:07 AM UTC+13, Simon Clubley wrote:
>> The same is true for VMS CLIs BTW. DCL is tightly bound into VMS in a
>> horrible way it should not be. On Linux, both the command shell and
>> filesystem architectures are vastly cleaner and more modular than they
>> are on VMS.
> Fundamental difference in mindset: process creation in VMS is expensive
> and to be avoided if possible, while on *nix systems it’s something you
> do as naturally as breathing.

VAX-era wisdom, and still clung to. Creating new processes on
OpenVMS never got as light as on Unix, but the overhead has become
negligible on modern systems for all but industrial-scale
creation-deletion.

Having looked at this back in the VAX era, the slow process creations
our apps were incurring were arising from inefficiencies within the DCL
spawn-related processing, and not from within the OpenVMS process
creation overhead.

Once that was identified and the obvious work-around implemented,
spawns were pretty speedy even in the VAX era.

To Simon's comment, how DCL gets mapped into process address space is
just ugly, too. And hard to debug.

Simon Clubley

Oct 8, 2021, 2:19:28 PM
On 2021-10-07, Dave Froble <da...@tsoft-inc.com> wrote:
>
> Now I'm just a dumb polock, wandered down out of the woods. But I just
> don't see where upward compatibility has anything to do with
> enhancements to the DLM. If existing calls continue to work as before,
> and only when an optional extra parameter would enable new capabilities,
> then upward compatibility just cannot be an issue. At least for this.
>

Is there a version number on the current inter-node DLM messages ?

If not, how can you change the DLM message structure in a compatible way ?

If yes, what happens when an older node sees a later format DLM message ?
You would at least need a compatibility kit to be installed on the older
nodes.

> The optional parameter might be a "lock type", and if not present,
> existing logic would be used, and if present, new code could be executed
> to process the new lock type. Stuff a couple of quadwords into the
> resource name for the numeric range. It would add one new piece of data
> to the DLM data structure(s).
>

What about the DLM messages sent between nodes ?
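
The versioning question can be made concrete with a sketch. The wire format below is entirely hypothetical (a one-byte version plus a two-byte payload length); the thread does not describe the real inter-node DLM message layout. It only shows the behaviour being asked about: a node that knows formats up to version N must recognise and refuse anything newer rather than misread it.

```python
import struct

HEADER = struct.Struct("<BH")   # hypothetical: version byte, payload length

def pack_message(version, payload):
    return HEADER.pack(version, len(payload)) + payload

def receive(blob, max_supported):
    """Return the payload, or None when the sender's format is too new."""
    version, length = HEADER.unpack_from(blob, 0)
    if version > max_supported:
        return None   # an older node cannot safely interpret this message
    return blob[HEADER.size:HEADER.size + length]

v1 = pack_message(1, b"lock resource A")
v2 = pack_message(2, b"lock resource A range 0-511")

print(receive(v1, max_supported=1))   # old node, old format
print(receive(v2, max_supported=1))   # old node, new format
print(receive(v2, max_supported=2))   # upgraded node, new format
```

Without a version field in the existing messages, an older node has no reliable way to make the second case fail cleanly, which is where a compatibility kit would come in.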

Simon Clubley

Oct 8, 2021, 2:23:45 PM
On 2021-10-07, Lawrence D’Oliveiro <lawren...@gmail.com> wrote:
> On Friday, October 8, 2021 at 7:07:02 AM UTC+13, Simon Clubley wrote:
>>
>> On 2021-10-07, Dave Froble <da...@tsoft-inc.com> wrote:
>>>
>>> I'm guessing Unix files don't have metadata and such.
>>>
>> No, Unix doesn't. At Unix filesystem level, files are just a stream of bytes.
>
> Some Linux filesystems have the concept of “extended attributes” <https://manpages.debian.org/buster/manpages/xattr.7.en.html>. Some are reserved for security purposes, others are user-defined.

That's true and I do use them. However, at filesystem level, the
file data itself is just a stream of bytes without any embedded metadata
(unlike on VMS).

Dave Froble

Oct 8, 2021, 2:53:47 PM
On 10/8/2021 10:51 AM, Stephen Hoffman wrote:
> On 2021-10-07 21:03:53 +0000, Dave Froble said:
>
>> On 10/7/2021 1:01 PM, Stephen Hoffman wrote:
>>> On 2021-10-07 16:18:28 +0000, Dave Froble said:
>>>
>>>> I'm aware of how useful something like SSIO would be. I'm just
>>>> appalled by the design and implementation. As mentioned, it seems
>>>> aimed at just a few current uses, and totally ignores how useful it
>>>> would be for many more future uses. This is rather consistent with
>>>> the long time apathy with which VMS has been treated. It's more a
>>>> patch than an enhancement. This is what I lament.
>>>
>>> Alas, there's no other outcome when upward-compatibility is an
>>> overarching goal for the platform.
>>
>> Now I'm just a dumb polock, wandered down out of the woods. But I
>> just don't see where upward compatibility has anything to do with
>> enhancements to the DLM. If existing calls continue to work as
>> before, and only when an optional extra parameter would enable new
>> capabilities, then upward compatibility just cannot be an issue. At
>> least for this.
>
> I was building on the "long term apathy" and "more patch than
> enhancement" comments, with the increasing difficulties even making
> comparatively minor or isolated changes and updates.
>
> Larger changes can be Really Difficult with ~40 years of accumulated
> dependencies around, assuming the developers and schedule and funding
> are all available. (q.v. Hyrum's Law.)

Hyrum's Law and such point to the need for good software architecture.
(I always have to use the spell checker on that word.)

If intelligent and structured use of something like VMS is followed,
enhancements should not be much of an issue. It is when people do
things they really should not that the problems arise. Compatibility with
well designed tools should not be an issue. Going off on one's own, and
making assumptions about things which are not guaranteed to remain
as-is, is where such problems occur, for the most part.

As a simple example:

If Stat% and 1%

vs

If Stat% and SS$_NORMAL

That causes a problem, if the VMS developers decide that "1" is no
longer what it used to be. The problem is not compatibility, the
problem is not using the approved constant.

Now while breaking customers code can be bad for business, the dumb
polock can say "fuck 'em, enhance the product and break their erroneous
code".

:-)

Dave Froble

Oct 8, 2021, 3:35:12 PM
On 10/8/2021 2:19 PM, Simon Clubley wrote:
> On 2021-10-07, Dave Froble <da...@tsoft-inc.com> wrote:
>>
>> Now I'm just a dumb polock, wandered down out of the woods. But I just
>> don't see where upward compatibility has anything to do with
>> enhancements to the DLM. If existing calls continue to work as before,
>> and only when an optional extra parameter would enable new capabilities,
>> then upward compatibility just cannot be an issue. At least for this.
>>
>
> Is there a version number on the current inter-node DLM messages ?

Good question, and if not, perhaps such could be implemented. However,
what I envision should not affect usage of the existing resource name lock.

> If not, how can you change the DLM message structure in a compatible way ?
>
> If yes, what happens when an older node sees a later format DLM message ?
> You would at least need a compatibility kit to be installed on the older
> nodes.

Perhaps.

>> The optional parameter might be a "lock type", and if not present,
>> existing logic would be used, and if present, new code could be executed
>> to process the new lock type. Stuff a couple of quadwords into the
>> resource name for the numeric range. It would add one new piece of data
>> to the DLM data structure(s).
>>
>
> What about the DLM messages sent between nodes ?
>
> Simon.
>

First, re-read what I posted. Specifically "if not present, existing
logic would be used".

Not sure what you're calling "DLM message".

The only data item I'd see added to the lock database would be the "lock
type", and that could be done in a manner such that it does not affect
lock database information that does not have the new structure definitions.

Perhaps it could be arranged that when using the new data structure(s),
that it would be mandatory to update all nodes in a cluster. Perhaps
some type of version check would disallow usage of dissimilar versions.

Note that any node or cluster that wished to use numeric range locking
would have to have the enhancement installed. If not using it, then
nothing changes.

This could be done as a VMS DLM enhancement. I'm rather sure of that.
Whether the desire to do so exists might be a different issue.
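
To make the "couple of quadwords in the resource name" idea concrete, here is a rough sketch in portable C. It is not DLM code and does not call any OpenVMS service. The names byte_range, pack_range_resname and ranges_conflict are made up for illustration, and the 31-byte limit is an assumption about the resource name size. Note that matching resource names for equality alone would not make overlapping ranges conflict, which is presumably why a new "lock type" with its own comparison would be needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RESNAM_MAX 31   /* assumed resource name limit, in bytes */

/* Hypothetical type: an inclusive byte range to be locked. */
typedef struct {
    uint64_t first;     /* first byte of the range, inclusive */
    uint64_t last;      /* last byte of the range, inclusive */
} byte_range;

/* Store v in 8 bytes, most significant byte first, so every node
 * reads the same value regardless of its native byte order. */
static void put_u64_be(unsigned char *dst, uint64_t v)
{
    for (int i = 7; i >= 0; i--) {
        dst[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Build "<prefix><first><last>" in out. Two quadwords take 16 bytes,
 * leaving at most 15 for the prefix. Returns 0, or -1 if it won't fit
 * or the range is inverted. */
int pack_range_resname(const char *prefix, byte_range r,
                       unsigned char out[RESNAM_MAX], size_t *out_len)
{
    assert(prefix != NULL && out != NULL && out_len != NULL);
    size_t plen = strlen(prefix);
    if (plen + 16 > RESNAM_MAX || r.first > r.last)
        return -1;
    memcpy(out, prefix, plen);
    put_u64_be(out + plen, r.first);
    put_u64_be(out + plen + 8, r.last);
    *out_len = plen + 16;
    return 0;
}

/* The comparison a range-aware lock type would need: two inclusive
 * ranges conflict when they share at least one byte. */
int ranges_conflict(byte_range a, byte_range b)
{
    return a.first <= b.last && b.first <= a.last;
}
```

With a 10-character prefix the packed name is 26 bytes, so it fits, but a prefix longer than 15 characters is rejected.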

Simon Clubley

Oct 8, 2021, 4:36:11 PM
On 2021-10-08, Dave Froble <da...@tsoft-inc.com> wrote:
>
> Not sure what you're calling "DLM message".
>

DLM-related cluster traffic.

Anything you propose not only has to be compatible at API level,
but also in physical DLM messages on the wire.

Lawrence D’Oliveiro

Oct 8, 2021, 6:55:07 PM
On Saturday, October 9, 2021 at 4:37:26 AM UTC+13, Stephen Hoffman wrote:
> Having looked at this back in the VAX era, the slow process creations
> our apps were incurring were arising from inefficiencies within the DCL
> spawn-related processing, and not from within the OpenVMS process
> creation overhead.
>
> Once that was identified and the obvious work-around implemented,
> spawns were pretty speedy even VAX-era.
>
> To Simon's comment, how DCL gets mapped into process address space is
> just ugly, too. And hard to debug.

But the whole reason why DCL maps into a process in this way, so that user-mode code can be repeatedly loaded, run and then wiped from the same process, was precisely to avoid multiple process creations. Now you are saying that the DCL mechanism itself contributes to the overhead of process creations!

But “spawn” is still not the same as “fork”. Sure, in *nix, the “fork” followed by “exec” idiom is common, but lots of forks are done without an accompanying exec (I’ve done a few myself). In the early days of Unix, the “vfork” hack was invented to speed things up in the fork+exec case, but this was later discovered to be unnecessary: not (so much) because hardware had become faster, but it was recognized that the bottleneck of giving the child process its own copy of non-shared writable memory could be avoided/postponed by just copying the relevant page table entries and setting a “copy-on-write” flag on them.

What do you know, vfork(2) was actually specified in POSIX, and Linux still supports it <https://manpages.debian.org/bullseye/manpages-dev/vfork.2.en.html>.
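
For anyone who has not seen a fork without an exec, here is a minimal sketch assuming a POSIX system such as Linux (not OpenVMS). The function name fork_without_exec is made up for illustration. The child writes to a variable it inherited, and the parent's copy stays untouched because the child gets its own private copy of the page on first write.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that modifies its own copy of *value and exits.
 * Returns the parent's view of *value after the child has finished,
 * or -1 if fork/waitpid failed or the child reported a problem. */
int fork_without_exec(int *value)
{
    assert(value != NULL);
    int before = *value;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: no exec here. This write goes to the child's private
         * copy of the page, so the parent never sees it. */
        *value = before + 1000;
        _exit(*value == before + 1000 ? 0 : 1);
    }

    /* Parent: wait for the child, then look at our own copy. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return *value;  /* unchanged in the parent */
}
```

Calling it with a variable holding 42 returns 42, even though the child saw 1042 in its own copy.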

Greg Tinkler

Oct 8, 2021, 8:04:11 PM
<snip>
>>> The optional parameter might be a "lock type", and if not present,
>>> existing logic would be used, and if present, new code could be executed
>>> to process the new lock type. Stuff a couple of quadwords into the
>>> resource name for the numeric range. It would add one new piece of data
>>> to the DLM data structure(s).

What would be useful is a name space for the lock e.g. RMS ...

Well there is the group id from UIC, and there is $set_resource_domain(), both can be useful but not a solution.

Having a name space that can be local machine or cluster wide could be very useful for some applications. But that is a much longer term idea.

At present the resource name is limited to 31 characters, which was OK in the 32-bit era. But in the 64-bit era, and looking at GFS2 for petabytes moving into the exabyte and possibly the zettabyte range, VMS needs to prepare if it is to survive the next 40 years.

First let's move onto X86_64. Yes it would be good to have easier building of open source code, and the main issues as I understand it are

  file IO: moving to RMS will fix most if not all of that
  fork
    for fork/exec: spawn is fine, on modern systems a bit of CPU...
    for file access: more of a problem, but not often used
  directory and file naming
    some work is in place for this

So the main issue is file IO, so change CRTL to use RMS.

gt

Dave Froble

Oct 8, 2021, 8:59:52 PM
On 10/8/2021 4:36 PM, Simon Clubley wrote:
> On 2021-10-08, Dave Froble <da...@tsoft-inc.com> wrote:
>>
>> Not sure what you're calling "DLM message".
>>
>
> DLM-related cluster traffic.
>
> Anything you propose not only has to be compatible at API level,
> but also in physical DLM messages on the wire.
>
> Simon.
>

If the single new piece of data is not used, then nothing changes.

If it is in use, then the nodes in question would already have the
enhancement installed.

Dave Froble

Oct 8, 2021, 9:11:58 PM
Let me ask this as a question, because I really don't know.

Doesn't C already use RMS for file I/O ?

It has been my impression that all the VMS languages use RMS for file
I/O. But I don't get out much.

Arne Vajhøj

Oct 8, 2021, 9:18:58 PM
On 10/8/2021 9:09 PM, Dave Froble wrote:
> On 10/8/2021 8:04 PM, Greg Tinkler wrote:
>> So the main issue is file IO, so change CRTL to use RMS.
>
> Let me ask this as a question, because I really don't know.
>
> Doesn't C already use RMS for file I/O ?
>
> It has been my impression that all the VMS languages use RMS for file
> I/O.  But I don't get out much.

It has previously been claimed that:

other languages use a very thin layer on top of SYS$GET and SYS$PUT

C uses a much thicker layer on top of SYS$READ and SYS$WRITE

I don't know if it is true or not.

Arne


Vitaly Pustovetov

Oct 9, 2021, 3:00:48 AM
> So the main issue is file IO, so change CRTL to use RMS.
>
> gt

CRTL uses RMS for file I/O. But there is an issue with concurrent access of multiple processes to the same file in stream mode. And we had a choice - 1) rewrite half of Postgres by inserting file locking; 2) add a new SSIO (Shared Stream IO) service to VMS.

Greg Tinkler

Oct 9, 2021, 6:19:44 AM
Sort of.

RMS has been working for 40+ years and does not have these concurrent access issues! NB there is no such thing as stream anything at the OS level; everything is clumps of data being buffered in some way, and the API that accesses that data from the higher levels may be stream based. In this case it is CRTL's role to translate the clumps of data into/from the stream API.

So the other choice, 3), is to fix CRTL to use RMS correctly, and the problems will go away. Engineering effort would not be great. I don't have access to the code base, but assuming that stdio uses unixio then it is fixing 5 routines. This would also allow all the other ports to work with minimal changes in the file access area. If you want to know more contact me directly.

Longer term the SSIO may be useful to RMS, which is where it belongs.

Sorry if the above is a little blunt, I appreciate the efforts people have put in over the years, but some of us have been using and coding on VMS for a very long time, and I really want VMS to be successful and easy to port to. This has been a good opportunity for me to look more into CRTL and RMS, and see the problems that have been there for decades.

Locking is an interesting area. I still feel the current DLM is more than capable of doing the 'lock a byte range' in a way that can be used with the current RMS locking. Longer term DLM needs some changes, but they are about sizes of resource names, scoping of resource names, and the ability to scan for children resources by name.

gt

David Jones

Oct 9, 2021, 8:25:13 AM
On Saturday, October 9, 2021 at 6:19:44 AM UTC-4, tink...@gmail.com wrote
> So the other choice, 3), fix CRTL to use RMS correctly, and the problems will go away. Engineering effort would not be great. I don't have access to the code base, but assuming that stdio uses unixio then it is fixing 5 routines. This would also allow all the other ports to work with minimal changes in the file access area. If you what to know more contact me directly.
>
> Longer term the SSIO may be useful to RMS, which is where it belongs.

I don't think the CRTL can do it with just the capabilities RMS gives it currently (or they would have fixed it already). Maintaining coherence of where end-of-file is for multiple writers is a difficult problem.

The CRTL does not layer stdio file access on top of unixio primitives.

Vitaly Pustovetov

Oct 9, 2021, 11:48:30 AM
> RMS worked and has been working of 40+years and does not have these concurrent access issues!
No, you are wrong. RMS works fine with record-based files, but not streams. You can write a program even in MACRO, you will still have the same issues. This is a documented feature of RMS.

Stephen Hoffman

Oct 9, 2021, 12:47:14 PM
On 2021-10-09 10:19:42 +0000, Greg Tinkler said:

> On Saturday, 9 October 2021 at 6:00:48 pm UTC+11, Vitaly Pustovetov wrote:
>>> So the main issue is file IO, so change CRTL to use RMS.
>>>
>>> gt
>> CRTL uses RMS for file I/O. But there is an issue with concurrent
>> access of multiple processes to the same file in stream mode. And we
>> had a choice - 1) rewrite half of Postgres by inserting file locking;
>> 2) add a new SSIO (Shared Stream IO) service to VMS.
>
> Sort of.
> RMS worked and has been working of 40+years and does not have these
> concurrent access issues! NB there is no such thing at an OS level of
> stream anything, every thing is clumps of data being buffered is some
> way, the API that accesses that data from the higher levels may be
> stream based. In this case it is CRTL's role to translate the clumps
> of data into/from stream API.

RMS is a pretty good database, for its time. Alas, it's become rather
more dated, with an API design that is complex and limiting, and in
competitive terms RMS is badly feature-limited.

If you need a key-value store and where the developer entirely owns the
fields used within the punched cards, and where y'all can fit your
files in 2 TiB (or bound volume sets, gag), RMS is still a fine choice.

For stream access to data, removing the punched-card assumptions and
file and cluster locking and the rest effectively removes all of RMS
from the discussion; in such a case, RMS really isn't used either in
name, or in general.

As for whether or not there are streams of data, the abstraction
exists. The difference at the app level is whether the operating system
and its default file system enforces the use of a punched-card
abstraction. C does not expect that. Classic OpenVMS apps do.

> So the other choice, 3), fix CRTL to use RMS correctly, and the
> problems will go away. Engineering effort would not be great. I don't
> have access to the code base, but assuming that stdio uses unixio then
> it is fixing 5 routines. This would also allow all the other ports to
> work with minimal changes in the file access area. If you what to know
> more contact me directly.

Punched cards and punched-card-based assumptions are rather more
pernicious within OpenVMS and clustering, and mailboxes, and various
other areas, alas. For those of us steeped in OpenVMS, the effects of
these assumptions can be invisible.

> Longer term the SSIO may be useful to RMS, which is where it belongs.

Longer-term, SSIO belongs in XQP, and RMS needs a demotion to "just
another of the available databases on OpenVMS" atop the XQP, and/or
atop some replacement XQP and/or FUSE for different file systems, and
this with various other common databases present.

SSIO and similar work aside, that demotion of RMS and the related and
substantial investment in new file system and database work are not
going to happen any time soon. Getting the device drivers and XQP to
64-bit storage addressing was reportedly one part of the work involved
(and was once targeted for V8.5), while getting RMS to 64-bit
addressing was a separate and subsequent feature. Getting RMS to 64-bit
storage addressing was and is and will be a rather larger investment,
too. Both VAFS and GFS have been discussed here, but VSI has been busy
with and increasingly focused on the port and port-related work.

SSIO is unrelated to the other file system work pending here.

Back to RMS and SSIO, apps that don't expect punched-card semantics can
and variously do perform their own coordination, so sharing the
underlying files with apps that do expect punched cards is unnecessary,
and counterproductive.

> Sorry if the above is a little blunt, I appreciate the efforts people
> have put in over the years, but some of us have using and coding VMS
> for a very long time, and I really want VMS to be successful and easy
> to port to. This has been a good opportunity for me to look more into
> CRTL and RMS, and see the problems that have been there for decades.

Part of the problem was that thirty years ago, senior DEC leadership
and OpenVMS development leadership was unwilling or unable to foresee
the directions that computing was headed, and fallout from that era
continues to reverberate through to this day around C and IP and RMS.
And around where OpenVMS has found itself in recent years. As long as
we're being blunt.

One of the problems that OpenVMS has here is RMS. While RMS was and is
very useful, it's just not a competitive database in 2021, and too many
of its punched-card assumptions have permeated the platform. That, and
the primary RMS API is just bad for making any sort of significant
changes. This is another area very much like the addition of 64-bit
virtual addressing on OpenVMS; where providing compatibility for 32-bit
virtual apps makes an already complex environment (RMS) vastly more
complex (mixed 32-bit and 64-bit storage addressing within RMS).

> locking is an interesting area, I still feel the current DLM is more
> than capable of doing the 'lock a byte range' in a way that can be used
> with the current RMS locking. Longer term DLM needs some changes but
> they are about sizes of resource names, scoping of resource names,
> ability to scan for children resources by name.

C and DLM already implement range locking on OpenVMS.

Dave Froble

Oct 9, 2021, 1:56:57 PM
On 10/9/2021 6:19 AM, Greg Tinkler wrote:
> On Saturday, 9 October 2021 at 6:00:48 pm UTC+11, Vitaly Pustovetov wrote:
>>> So the main issue is file IO, so change CRTL to use RMS.
>>>
>>> gt
>> CRTL uses RMS for file I/O. But there is an issue with concurrent access of multiple processes to the same file in stream mode.
>> And we had a choice - 1) rewrite half of Postgres by inserting file locking; 2) add a new SSIO (Shared Stream IO) service to VMS.
>
> Sort of.
>
> RMS worked and has been working of 40+years and does not have these concurrent access issues!

What is RMS, in the current context?

Record Management System
========================

Notice the first word is RECORD. Not FILE, not FIELD, just RECORD.

As for concurrent access, RMS has used the VMS DLM since around 1984.
It is cooperating processes using the DLM that avoids concurrent access
issues.

DLM offers such services.
RMS does not.
CRTL does not.

> NB there is no such thing at an OS level of stream anything,

Not true, unless discussing block oriented devices, which we most likely
are. I do hope you are not suggesting that VMS cannot access memory in
any manner it chooses?

> every thing is clumps of data being buffered is some way, the API that accesses that data from the higher levels may be stream based. In this case it is CRTL's role to translate the clumps of data into/from stream API.

So, how do Pascal, Fortran, Cobol, Basic, and such do it?

>
> So the other choice, 3), fix CRTL to use RMS correctly, and the problems will go away.

3a, The CRTL may use RMS in the manner in which RMS is designed to work.

3b, What problems are there in RMS?

> Engineering effort would not be great. I don't have access to the
> code base, but assuming that stdio uses unixio then it is fixing 5
> routines. This would also allow all the other ports to work with
> minimal changes in the file access area. If you what to know more
> contact me directly.
>
> Longer term the SSIO may be useful to RMS, which is where it belongs.

What use does RMS have for any numeric range locking, at least for
anything besides records, which is what RMS is all about?

> Sorry if the above is a little blunt, I appreciate the efforts people have put in over the years, but some of us have using and coding VMS for a very long time, and I really want VMS to be successful and easy to port to. This has been a good opportunity for me to look more into CRTL and RMS, and see the problems that have been there for decades.
>
> locking is an interesting area, I still feel the current DLM is more than capable of doing the 'lock a byte range' in a way that can be used with the current RMS locking. Longer term DLM needs some changes but they are about sizes of resource names, scoping of resource names, ability to scan for children resources by name.
>
> gt
>


Dave Froble

Oct 9, 2021, 2:02:59 PM
On 10/9/2021 12:47 PM, Stephen Hoffman wrote:

> C and DLM already implement range locking on OpenVMS.

I'd really like to see the documentation and how to use it.

Phillip Helbig (undress to reply)

Oct 9, 2021, 2:13:29 PM
In article <sjsh2f$tf8$1...@dont-email.me>, Stephen Hoffman
<seao...@hoffmanlabs.invalid> writes:

> RMS is a pretty good database, for its time. Alas, its become rather
> more dated,

From database to datedbase. :-)

Arne Vajhøj

Oct 9, 2021, 2:18:46 PM
On 10/9/2021 1:54 PM, Dave Froble wrote:
> On 10/9/2021 6:19 AM, Greg Tinkler wrote:
>> every thing is clumps of data being buffered is some way, the API that
>> accesses that data from the higher levels may be stream based.  In
>> this case it is CRTL's role to translate the clumps of data into/from
>> stream API.
>
> So, how does Pascal, Fortran, Cobol, Basic, and such do it?

They do not treat files as streams of bytes - they treat files
as sequences of records.

The underlying problem is that the two paradigms are pretty
incompatible. It is not easy for CRTL to translate a sequence
of records to a stream of bytes in a consistent and meaningful
manner.
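
A small sketch of why the translation is awkward, in portable C. It assumes one possible mapping, where each record shows up in the byte stream as its payload plus one newline that is not stored in the record. That mapping, and the names stream_pos and locate_stream_offset, are illustrative assumptions and not how the CRTL actually does it. Even under this simple mapping, finding the record behind a byte offset means walking every earlier record, and some stream offsets land on a terminator that does not exist in the file at all.

```c
#include <assert.h>
#include <stddef.h>

/* Where a byte offset in the stream view falls in the record view. */
typedef struct {
    size_t record;            /* index of the record containing the byte */
    size_t offset_in_record;  /* byte position inside that record */
    int    in_terminator;     /* 1 if the byte is the synthesized '\n' */
} stream_pos;

/* Assumed mapping: each record appears in the stream as its payload
 * followed by one '\n' that is not stored in the record itself.
 * Returns 0 and fills *out, or -1 if the offset is past the end. */
int locate_stream_offset(const size_t *record_len, size_t nrecords,
                         size_t stream_offset, stream_pos *out)
{
    assert(record_len != NULL && out != NULL);

    for (size_t i = 0; i < nrecords; i++) {
        size_t span = record_len[i] + 1;   /* payload + terminator */
        if (stream_offset < span) {
            out->record = i;
            out->offset_in_record = stream_offset;
            out->in_terminator = (stream_offset == record_len[i]);
            return 0;
        }
        stream_offset -= span;             /* skip this whole record */
    }
    return -1;
}
```

With records of length 5, 0 and 3, stream offset 6 is the terminator of the empty second record, so a one-byte write there has no obvious meaning in record terms.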

Arne

Arne Vajhøj

Oct 9, 2021, 2:22:08 PM
On 10/9/2021 12:47 PM, Stephen Hoffman wrote:
> On 2021-10-09 10:19:42 +0000, Greg Tinkler said:
>> On Saturday, 9 October 2021 at 6:00:48 pm UTC+11, Vitaly Pustovetov
>> wrote:
>>>> So the main issue is file IO, so change CRTL to use RMS.
>>>>
>>>> gt
>>> CRTL uses RMS for file I/O. But there is an issue with concurrent
>>> access of multiple processes to the same file in stream mode. And we
>>> had a choice - 1) rewrite half of Postgres by inserting file locking;
>>> 2) add a new SSIO (Shared Stream IO) service to VMS.
>>
>> Sort of.
>> RMS worked and has been working of 40+years and does not have these
>> concurrent access issues!  NB there is no such thing at an OS level of
>> stream anything, every thing is clumps of data being buffered is some
>> way, the API that accesses that data from the higher levels may be
>> stream based.  In this case it is CRTL's role to translate the clumps
>> of data into/from stream API.
>
> RMS is a pretty good database, for its time.  Alas, its become rather
> more dated, with an API design that is complex and limiting, and in
> competitive terms RMS is badly feature-limited.
>
> If you need a key-value store and where the developer entirely owns the
> fields used within the punched cards, and where y'all can fit your files
> in 2 TiB (or bound volume sets, gag), RMS is still a fine choice.

Hoff, I think you are muddying the water here.

This discussion has so far been about ORG:SEQ files.

ORG:IDX files are a Key Value Store. But that is a totally
different topic.

Arne

Stephen Hoffman

Oct 9, 2021, 4:55:22 PM
And here I was trying to explicitly not slag on RMS and its
capabilities, as that'd solely serve to provoke a torrent of folks quite
reasonably pointing out that RMS is perfect for {app}.

Stephen Hoffman

Oct 9, 2021, 4:57:35 PM
On 2021-10-09 18:00:45 +0000, Dave Froble said:

> On 10/9/2021 12:47 PM, Stephen Hoffman wrote:
>
>> C and DLM already implement range locking on OpenVMS.
>
> I'd really like to see the documentation and how to use it.

Alas, entirely undocumented, per the previous comments around here.

Vitaly Pustovetov

Oct 9, 2021, 5:22:31 PM
On Saturday, October 9, 2021 at 21:02:59 UTC+3, Dave Froble wrote:
> On 10/9/2021 12:47 PM, Stephen Hoffman wrote:
>
> > C and DLM already implement range locking on OpenVMS.
> I'd really like to see the documentation and how to use it.

"File Locking
The C RTL supports byte-range file locking using the F_GETLK, F_SETLK, and F_SETLKW
commands of the fcntl function, as defined in the X/Open specification. Byte-range file locking is
supported across OpenVMS clusters. You can only use offsets that fit into 32-bit unsigned integers.
When a shared lock is set on a segment of a file, other processes on the cluster are able to set shared
locks on that segment or a portion of it. A shared lock prevents any other process from setting
an exclusive lock on any portion of the protected area. A request for a shared lock fails if the file
descriptor was not opened with read access....." (c) VSI C Run-Time Library Reference Manual
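
The fcntl calls described in that excerpt are the standard X/Open ones, so a sketch of how they are typically used can be tried on any POSIX system. The code below was written against Linux. I have not run it on OpenVMS, so treat its behaviour there as an assumption. The helper names set_range_lock, other_process_can_lock and open_scratch_file are made up for illustration. Because these locks belong to a process, the conflict check is done from a forked child.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Try to set a lock of the given type (F_RDLCK, F_WRLCK or F_UNLCK) on
 * bytes [start, start+len) of fd without blocking.
 * Returns 0 on success, -1 if the lock could not be set. */
int set_range_lock(int fd, short type, off_t start, off_t len)
{
    assert(fd >= 0);
    struct flock fl = {0};
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, F_SETLK, &fl) == -1 ? -1 : 0;
}

/* Fork a child and let it attempt the lock. Locks are owned by a
 * process, so the child competes with locks held by the parent.
 * Returns 1 if the child got the lock, 0 if refused, -1 on error. */
int other_process_can_lock(int fd, short type, off_t start, off_t len)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(set_range_lock(fd, type, start, len) == 0 ? 0 : 1);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}

/* Create an unlinked scratch file of the given size, opened read/write,
 * so the sketch does not depend on any existing path. */
int open_scratch_file(off_t size)
{
    char path[] = "/tmp/rangelock-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);
    if (ftruncate(fd, size) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

If the parent holds an exclusive lock on bytes 10 through 29, another process is refused any lock touching that range but can still lock bytes 30 onward. If the parent holds only a shared lock, another process can take a shared lock on the same bytes but not an exclusive one, which matches the manual text quoted above.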

Stephen Hoffman

Oct 9, 2021, 6:27:47 PM
On 2021-10-09 21:22:29 +0000, Vitaly Pustovetov said:

> On Saturday, October 9, 2021 at 21:02:59 UTC+3, Dave Froble wrote:
>> On 10/9/2021 12:47 PM, Stephen Hoffman wrote:
>>
>>> C and DLM already implement range locking on OpenVMS.
>> I'd really like to see the documentation and how to use it.
>
> "File Locking
> The C RTL supports byte-range...

That comment was in reference to the DLM range-locking API; the
(un)documentation for what's implemented underneath those C calls
within CRTL and DLM.

Dave Froble

Oct 9, 2021, 6:44:03 PM
Which is why Steve's suggestion for ODS2/ODS5 becoming just another file
system.

Which is why Steve's suggestion for RMS to become just another database
product. Well, if ODS? wants to use it for directories, Ok.

But even if another "application" handles other files, there is still
the issue of today's disks being block based (Ok, punched card if you
must) devices.

Stream devices are alien enough to today's VMS that it would be much
better served by dedicated tools designed for that format. (And it sure
isn't RMS!)

Then there is the interesting question of what the next format to come
along might be.

Dave Froble

Oct 9, 2021, 6:50:25 PM
On 10/9/2021 4:57 PM, Stephen Hoffman wrote:
> On 2021-10-09 18:00:45 +0000, Dave Froble said:
>
>> On 10/9/2021 12:47 PM, Stephen Hoffman wrote:
>>
>>> C and DLM already implement range locking on OpenVMS.
>>
>> I'd really like to see the documentation and how to use it.
>
> Alas, entirely undocumented, per the previous comments around here.
>
>

Then, does it really exist?

Dave Froble

Oct 9, 2021, 6:55:10 PM
No, it is not. The OP declared that RMS should be used for that.

You are correct that we're concerned about stream files, but claims
about RMS have been part of the discussion.