LLVM, volatile and async VMS I/O and system calls

Simon Clubley

Sep 22, 2021, 8:28:09 AM
Jan-Erik's questions about ASTs in COBOL have reminded me about something
I asked a while back.

VMS I/O and system calls are much more asynchronous than on other operating
systems and data can appear in buffers and variables in general can be
changed outside of the normal sequence points (such as at a function call
boundary).

With the move to LLVM, and its different optimiser, have any examples
appeared in VMS code for x86-64 where volatile attributes are now required
on variable definitions where you would have got away with not using them
before (even if technically, they should have been marked as volatile anyway) ?

Just curious whether there are any places in code running on VMS x86-64 that
will need to be cleaned up to do things in the correct way, where you would
have got away with doing them less correctly previously.

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Walking destinations on a map are further away than they appear.

Arne Vajhøj

Sep 22, 2021, 9:06:29 AM
On 9/22/2021 8:28 AM, Simon Clubley wrote:
> Jan-Erik's questions about ASTs in COBOL have reminded me about something
> I asked a while back.
>
> VMS I/O and system calls are much more asynchronous than on other operating
> systems and data can appear in buffers and variables in general can be
> changed outside of the normal sequence points (such as at a function call
> boundary).
>
> With the move to LLVM, and its different optimiser, have any examples
> appeared in VMS code for x86-64 where volatile attributes are now required
> on variable definitions where you would have got away with not using them
> before (even if technically, they should have been marked as volatile anyway) ?
>
> Just curious whether there are any places in code running on VMS x86-64 that
> will need to be cleaned up to do things in the correct way, where you would
> have got away with doing them less correctly previously.

To state the obvious:

Correct C, i.e. C code whose behavior is defined by the C standard, will
work with any standards-compliant C compiler.

C code with implementation-specific or undefined behavior is
rolling dice.

Maybe John Reagan has some ideas about what may break, but I cannot see
VSI systematically documenting how the Itanium to x86-64 migration will
impact C code with implementation-specific or undefined behavior.

Arne


John Reagan

Sep 22, 2021, 9:12:14 AM
Simon, you asked this exact same question before. You can search back for the guesses.

Moving architectures can always expose bugs regardless of platform, OS, or compiler.

GEM has different optimizations on Alpha vs Itanium. Did you have to add any volatiles in that transition?

Linux people, how many of you have code that works with -O0 and -O1 but not with -O3 or -Ofast?

The current cross-compilers are no-optimize, so there is no real-world experience with missing volatiles.

abrsvc

Sep 22, 2021, 10:00:54 AM
I would also ask why seemingly every question has a negative bent toward OpenVMS.
Why is what LLVM does "correct" while what OpenVMS does is potentially buggy?
Isn't it possible that a correct sequence on OpenVMS can reveal a bug in LLVM where an implementation is not correct in all cases?

Bob Gezelter

Sep 22, 2021, 10:26:03 AM
On Wednesday, September 22, 2021 at 8:28:09 AM UTC-4, Simon Clubley wrote:
> Jan-Erik's questions about ASTs in COBOL have reminded me about something
> I asked a while back.
>
> VMS I/O and system calls are much more asynchronous than on other operating
> systems and data can appear in buffers and variables in general can be
> changed outside of the normal sequence points (such as at a function call
> boundary).
>
> With the move to LLVM, and its different optimiser, have any examples
> appeared in VMS code for x86-64 where volatile attributes are now required
> on variable definitions where you would have got away with not using them
> before (even if technically, they should have been marked as volatile anyway) ?
>
> Just curious whether there are any places in code running on VMS x86-64 that
> will need to be cleaned up to do things in the correct way, where you would
> have got away with doing them less correctly previously.
>
> Simon
>
> --
> Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
> Walking destinations on a map are further away than they appear.
Simon,

Since the days of RSX-11M, I have been dealing with client bugs in this area. The best phrasing I have seen in this area was in an IBM System/360 Principles of Operation manual. It may have only appeared in certain editions, as I cannot find the precise reference. However, it was along the lines of "the contents of a buffer are UNDEFINED [emphasis mine] from the initiation of the I/O operation until the operation has completed with the device end signal from the device."

In OpenVMS speak, the above translates as: "The contents of the buffer are undefined from the issuance of the QIO system call until such time as the I/O is completed, signaled by the queueing of an AST; setting of an event flag; or the setting of the completion code in the IOSB."

Hoff and I participated in a thread a ways back on a related topic. Out of order storing on Alpha requires an explicit flush of the pipeline to ensure that the IOSB, buffers, and other data is consistent when an AST is queued.

One violates that fundamental understanding at one's peril. (Yes, I have had clients try peeking at in-progress buffers, often with catastrophic results). There are absolutely no guarantees about the contents of a buffer while an I/O operation is queued or in process for the buffer.

- Bob Gezelter, http://www.rlgsc.com

Bob Gezelter

Sep 22, 2021, 10:27:58 AM
abrsvc,

Absolutely. Many an optimizer has incorrectly optimized that which should not be optimized.

Dave Froble

Sep 22, 2021, 11:58:45 AM
On 9/22/2021 9:12 AM, John Reagan wrote:

> Simon, you asked this exact same question before. You can search back for the guesses.

Are you suggesting Simon is getting a bit senile, or is just stubborn?


--
David Froble Tel: 724-529-0450
Dave Froble Enterprises, Inc. E-Mail: da...@tsoft-inc.com
DFE Ultralights, Inc.
170 Grimplin Road
Vanderbilt, PA 15486

Dave Froble

Sep 22, 2021, 12:00:36 PM
On 9/22/2021 10:00 AM, abrsvc wrote:

> I would also ask why does seemingly every question have a negative bent toward OpenVMS?
> Why is what LLVM does "correct' where potentially what OpenVMS does buggy?
> Isn't it possible that a correct sequence on OpenVMS can reveal a bug in LLVM where an implementation is not correct in all cases?
>

Now Dan, are you trying to ruin Simon's fun?

Dave Froble

Sep 22, 2021, 12:05:40 PM
Gotta agree with that. Once some action is started, the buffer ain't
yours until the action is completed.

Simon Clubley

Sep 22, 2021, 1:41:23 PM
On 2021-09-22, John Reagan <xyzz...@gmail.com> wrote:
> Simon, you asked this exact same question before. You can search back for the guesses.
>

I know. I was wondering if any data has come along since then.

>
> The current cross-compilers are no-optimize so there is no real world experience for missing volatiles.

I had forgotten that was still the case. It will be interesting to see
the performance improvements when you turn the optimiser on. :-)

Simon Clubley

Sep 22, 2021, 1:54:42 PM
On 2021-09-22, abrsvc <dansabr...@yahoo.com> wrote:
>
> I would also ask why does seemingly every question have a negative bent toward OpenVMS?

It doesn't. I've posted very positive comments about VMS clusters
and the cluster-wide DLM (for example) in the past. Even today, they
are still very strong features in VMS (ignoring the price tag. :-)).

However, experience in other operating systems causes me to see
missing features in VMS, at least some of which people might expect as
standard these days.

At a policy level, some negative statements I've made about the move
to time-limited production licences appear to be widely supported.

> Why is what LLVM does "correct' where potentially what OpenVMS does buggy?

Not buggy, just not as aggressive. LLVM is a much more aggressive optimiser
from what I can tell.

To give you an idea of what LLVM gets up to, this is the current list
of LLVM passes:

https://llvm.org/docs/Passes.html

Simon Clubley

Sep 22, 2021, 2:11:40 PM
On 2021-09-22, Bob Gezelter <geze...@rlgsc.com> wrote:
> Simon,
>
> Since the days of RSX-11M, I have been dealing with client bugs in this area.. The best phrasing I have seen in this area was in an IBM System/360 Principles of Operation manual. It may have only appeared in certain editions, as I cannot find the precise reference. However, it was along the lines of "the contents of a buffer are UNDEFINED [emphasis mine] from the initiation of the I/O operation until the operation has completed with the device end signal from the device."
>
> In OpenVMS speak, the above translates as: "The contents of the buffer are undefined from the issuance of the QIO system call until such time as the I/O is completed, signaled by the queueing of an AST; setting of an event flag; or the setting of the completion code in the IOSB."
>

That isn't the concern, Bob.

The concern is, given the highly asynchronous nature of VMS I/O and
of some VMS system calls in general, and given the more aggressive
LLVM optimiser, does the generated code always correctly re-read the
current contents of buffers and variables without having to mark those
buffers/variables as volatile ?

Or are there enough sequence points in VMS application code where these
buffers and variables are accessed that this may turn out not to be a
problem in most cases ?

In essence, the VMS system call and I/O system is behaving much more
like the kinds of things you see in embedded bare-metal programming
than in the normal synchronous model you see in the Unix world.

There's a reason why volatile is used so liberally in embedded bare-metal
programming. :-)

Simon.

chris

Sep 22, 2021, 3:09:55 PM
That sounds like bad code design to me and more an issue of critical
sections. For example, it's quite common to have an upper and lower I/O
half, with queues betwixt the two. The upper half is mainline code that
has access to and can update pointers, while the lower half at interrupt
level also has access to the queue and its pointers. At a trivial level,
interrupts are disabled during mainline access, and if the interrupt
handler always runs to completion, that provides the critical section
locks.

What you seem to be suggesting is a race condition, where the state of
one section of code is unknown to the other, a sequence of parallel
states that somehow get out of sync, due to poor code design, sequence
points, whatever.


I'm sure the designers of VMS would be well aware of such issues,
steeped in computer science as they were; it's an area which is
fundamental to most system design...

Chris

Bob Gezelter

Sep 22, 2021, 3:58:17 PM
Simon,

Good technical question.

In general, optimizers work within basic blocks. The example of concern is not a single basic block.

A basic block is a section of code with one entry and one exit. Simple IF statements fall within that category. However, any out-of-line code invocation does not.

The presence of the SYS$QIO system service, which one way or another involves a CALL, ends the basic block, as the optimizer cannot know what is modified by the out-of-line call or its descendants.

Simon Clubley

Sep 22, 2021, 4:16:20 PM
On 2021-09-22, chris <chris-...@tridac.net> wrote:
>
> That sounds like bad code design to me and more an issue of critical
> sections. For example, it's quite common to have an upper and lower io
> half, with queues betwixt the two. Upper half being mainline code that
> has access to and can update pointers, while low half at interrupt
> level also has access to the queue and it's pointers. At trivial level,
> interrupts are disabled during mainline access and if the interrupt
> handler always runs to completion, that provides the critical section
> locks.
>

It's nothing like that Chris.

At the level of talking to the kernel, all I/O on VMS is asynchronous
and it is actually a nice design. There is no such thing as synchronous
I/O at system call level on VMS.

When you queue an I/O in VMS, you can pass either an event flag number or
an AST completion routine to the sys$qio() call which then queues the
I/O for processing and then immediately returns to the application.

To put that another way, the sys$qio() I/O call is purely asynchronous.
Any decision to wait for the I/O to complete is made in the application
(for example via the sys$qiow() call) and not in the kernel.

You can stall by making a second system call to wait until the event
flag is set, or you can use sys$qiow() which is a helper routine to
do that for you, but you are not forced to and that is the critical
point.

You can queue the I/O and then just carry on doing something else
in your application while the I/O completes and then you are notified
in one of several ways.

That means the kernel can write _directly_ into your process space by
setting status variables and writing directly into your queued buffer
while the application is busy doing something else completely different.

You do not have to stall in a system call to actually receive the
buffer from the kernel - VMS writes it directly into your address space.

It is _exactly_ the same as embedded bare-metal programming where the
hardware can write directly into memory-mapped registers and buffers
in your program while you are busy doing something else.

> What you seem to be suggesting is a race condition, where the state of
> one section of code is unknown to the other, a sequence of parallel
> states that somehow get out of sync, due to poor code design, sequence
> points, whatever.
>

It is actually a very clean mechanism and there are no such things
as race conditions when using it properly.

>
> I'm sure the designers of vms wpuld be well aware of such issues,
> steeped in computer science as they were, and an area which is
> fundamental to most system design...
>

They are, which is why the DEC-controlled compilers emitted code
that worked just fine with VMS without the application having to
use volatile.

However, LLVM is now the compiler toolkit in use and it could
potentially make quite valid (and different) assumptions about
whether it needs to re-read a variable that it doesn't know has changed.

After all, if the application takes full advantage of this
asynchronous I/O model, there has been no direct call by the code
to actually receive the buffer and I/O completion status variables
when VMS decides to update them after the I/O has completed.

I am hoping, however, that there are enough sequence points in the
code, even in the VMS asynchronous I/O model, for this not to be a
problem in practice, although it remains a potential problem.

Now do you see the potential problem ?

BTW, this also applies to some system calls in general as a number of
them are asynchronous as well - it's not just the I/O in VMS which
is asynchronous.

Simon Clubley

Sep 22, 2021, 4:26:00 PM
On 2021-09-22, Bob Gezelter <geze...@rlgsc.com> wrote:
> Simon,
>
> Good technical question.
>
> In general, optimizers work within basic blocks. The example of concern is not a single basic block.
>
> A basic block is a section of code with one entry and one exit. Simple IF statements fall within that category. However, any out-of-line code invocation does not.
>
> The presence of the SYS$QIO system service, which one way or another involves a CALL, ends the basic block, as the optimizer cannot know what is modified by the out-of-line call or its descendants.
>

But VMS writes directly into your process space at some random time
X later _after_ you have returned from sys$qio() and are potentially
busy doing something else.

From the viewpoint of the application, it's exactly the same as hardware
choosing to write an updated value into a register while your bare-metal
code is busy doing something else.

How does the compiler know VMS has done that or are there enough
sequence points even in the VMS asynchronous I/O model for this
to still work fine without having to use the volatile attribute,
even in the presence of a highly aggressive optimising compiler ?

Arne Vajhøj

Sep 22, 2021, 7:22:41 PM
On 9/22/2021 4:25 PM, Simon Clubley wrote:
> On 2021-09-22, Bob Gezelter <geze...@rlgsc.com> wrote:
>> In general, optimizers work within basic blocks. The example of concern is not a single basic block.
>>
>> A basic block is a section of code with one entry and one exit. Simple IF statements fall within that category. However, any out-of-line code invocation does not.
>>
>> The presence of the SYS$QIO system service, which one way or another involves a CALL, ends the basic block, as the optimizer cannot know what is modified by the out-of-line call or its descendants.
>
> But VMS writes directly into your process space at some random time
> X later _after_ you have returned from sys$qio() and are potentially
> busy doing something else.
>
> From the viewpoint of the application, it's exactly the same as hardware
> choosing to write an updated value into a register while your bare-metal
> code is busy doing something else.
>
> How does the compiler know VMS has done that or are there enough
> sequence points even in the VMS asynchronous I/O model for this
> to still work fine without having to use the volatile attribute,
> even in the presence of a highly aggressive optimising compiler ?

This is a bit outside my area of expertise.

But wouldn't a flow like:
- call SYS$QIO with buffer
- do something
- wait for IO to complete
- __MB()
- use buffer
work?

Arne



Jan-Erik Söderholm

Sep 22, 2021, 7:53:42 PM
One important point is of course to not use the buffer until the
QIO related to that buffer completes. But that is not really what
Simon is talking/asking about.

Simon is referring to the well known issue on platforms where a
variable can directly refer to some data that might get updated
outside of the control of the application code (and then also
not being able to be analysed by a compiler optimizer).

One very common case is where a variable refers to a "port"
on a microcontroller. The port is connected to some real life
equipment such as push buttons, relays or whatever. Those items
can be handled totally out of control of the application code.

In those cases, it is very common that the compiler says "this
variable has not been updated, so I'll just use the value from
the last read that I already have in a register anyway". And
then misses some push button being pressed.

That is where you say "volatile" to disable any such optimization and
force the compiler to always re-read the variable from the source.

ASTs do, in a way, look like this: some data (the buffer) in the app
is changed without this being obvious from just looking at the code.
And looking at the code is all the compiler does. That is what Simon is asking about.

I am very well aware of these issues with microcontrollers from a long
time programming "8-bitters" in the Microchip PIC family. I have no
idea how this relates to ASTs...


Jan-Erik.





Arne Vajhøj

Sep 22, 2021, 8:07:17 PM
> One important point is of course to not use the buffer until the
> QIO related to that buffer completes. But that is not really what
> Simon is talking/asking about.

That is a given. And why I had the "wait for IO to complete".

> Simon is referring to the well known issue on platforms where a
> variable can directly refer to some data that might get updated
> outside of the control of the application code (and then also
> not being able to be analysed by a compiler optimizer).
>
> One very common case is where a variable refers to a "port"
> on a microcontroller. The port is connected to some real life
> equipment such as push buttons, relays or whatever. Those items
> can be handled totally out of control of the application code.
>
> In those cases, it is very common that the compiler says "this
> variable has not been updated, so I'll just use the value from
> the last read that I already have in a register anyway". And
> then misses some push button being pressed.
>
> That is where you say "volatile" to disable any such optimization and
> force the compiler to always re-read the variable from the source.

Yes.

But instead of spreading the volatile keyword around, wouldn't
__MB() do the same? (On VMS - it is a VMS C specific thing,
I believe.)

Arne

Jan-Erik Söderholm

Sep 22, 2021, 8:16:09 PM
Sorry. Seems to be something about a "memory barrier". I don't know
what that is in this context, or whether it works as a volatile.

John Reagan

Sep 22, 2021, 8:18:09 PM
So LLVM and gcc are good optimizers. They battle with each other all the time for everybody's benefit. However, GEM is a pretty good optimizer too. Alpha GEM code is really good. It started wobbly with EV4 but grew into a tight code generator. GEM Itanium does not take advantage of the machine's speculative or advance loads, which puts it behind the HP-UX compiler, but it holds its own.

LLVM has many optimization passes for specific targets. The whole list is not run on every target.

For LLVM, it is way more than just "volatile" loads and stores. Optimizers need way more than that. For LLVM, look at

https://llvm.org/docs/AliasAnalysis.html
https://llvm.org/docs/Passes.html
https://llvm.org/docs/Atomics.html
https://llvm.org/docs/LangRef.html#tbaa-metadata
https://llvm.org/docs/MemorySSA.html


BTW, GEM also has a TBAA mechanism which uses callbacks from GEM back to the frontend to ask about alias-information. It allows the alias analysis to be specific to the language semantics.

Lawrence D’Oliveiro

Sep 22, 2021, 9:58:41 PM
On Thursday, September 23, 2021 at 11:53:42 AM UTC+12, Jan-Erik Söderholm wrote:
> One very common case is where a variable refers to an "port"
> on a microcontroller. The port is connected to some real life
> equipment such as push buttons, relays or what ever. Those items
> can be handled totaly out of control from the application code.

This is memory-mapped I/O: those addresses don’t actually access real memory, and do not have normal memory semantics.

Some CPU architectures use special I/O instructions for this purpose, and won’t have this problem. This is why you have philosophical* debates about which is the better approach ...

*maybe even verging on religious

Simon Clubley

Sep 23, 2021, 8:24:38 AM
> Sorry. Seems to be something about a "memory barrier". I don't know
> what that is in this context and if it works as an volatile.

Assuming __MB() is a hardware memory barrier operation, the answer is no.

Memory barrier instructions are designed to get bits of hardware back
into sync with each other so that when you read something it is valid.

Volatile, OTOH, is a purely software construct and is used to tell the
compiler to always emit code that re-reads the memory location, even if
the compiler thinks from looking at the source code that the value
could not have changed.

There's no point getting the hardware back into sync if the generated
code is missing the bit that then unconditionally reads that value again
before doing something with the variable.

Even if memory barriers could be made to do the same thing somehow
by having the compiler look for them, they would have to be inserted
within the executable code. Volatile, however, is a variable definition
attribute, so it only ever appears in the source code where the volatile
variables are initially defined.

chris

Sep 23, 2021, 10:10:51 AM
So what is the issue here? Keywords like volatile would not normally
ever be used at app level, being reserved for low level kernel and
driver code where it touches real hardware registers, or perhaps
memory locations reserved for a specific purpose. The 4th edition of
the Harbison & Steele C reference manual has 2 or 3 pages devoted to the
volatile keyword and might be worth looking at: Section 4.4.5. The
whole point of a black box kernel is to isolate the internal workings
from an application. Thinking layers, of course.

If systems and code are designed and written properly, then they should
compile to usable code irrespective of compiler, including
optimisation level, so long as the compiler is standards compliant.
Anything else is a system design issue...

Chris

Dave Froble

Sep 23, 2021, 11:27:56 AM
On 9/23/2021 10:10 AM, chris wrote:

> So what is the issue here ?. Keywords like volatile would not normally
> ever be used at app level, being reserved for low level kernel and
> driver code where it touches real hardware registers. or perhaps
> memory locations reserved for a specific purpose. The 4th edition of
> Harbison & Steele C reference manual has 2 or 3 pages devoted to the
> volatile keyword and might be worth looking at; Section 4.4.5. The
> whole point of a black box kernel is to isolate the internal workings
> from an application. Thinking layers, of course.
>
> If systems and code are designed and written properly, then it should
> compile to usable code irrespective of compiler, including
> optimisation level, so long as the compiler is standards compliant.
> Anything else is a system design issue...

If your code was executed as you intended, then there should not be any
issue, as you mention.

But what if your code is NOT executed as you intended? An optimizer
just might figure that it doesn't need to execute some instructions,
that they are redundant. They do that. However, if they are not
redundant, then results may not be as expected.

So, one might consider "volatile" (or whatever else is used) as an edict
to "don't optimize".

chris

Sep 23, 2021, 12:22:12 PM
On 09/23/21 16:25, Dave Froble wrote:
> On 9/23/2021 10:10 AM, chris wrote:
>
>> So what is the issue here ?. Keywords like volatile would not normally
>> ever be used at app level, being reserved for low level kernel and
>> driver code where it touches real hardware registers. or perhaps
>> memory locations reserved for a specific purpose. The 4th edition of
>> Harbison & Steele C reference manual has 2 or 3 pages devoted to the
>> volatile keyword and might be worth looking at; Section 4.4.5. The
>> whole point of a black box kernel is to isolate the internal workings
>> from an application. Thinking layers, of course.
>>
>> If systems and code are designed and written properly, then it should
>> compile to usable code irrespective of compiler, including
>> optimisation level, so long as the compiler is standards compliant.
>> Anything else is a system design issue...
>
> If your code was executed as you intended, then there should not be any
> issue, as you mention.
>
> But what if your code is NOT executed as you intended. An optimizer just
> might figure that it doesn't need to execute some instructions, that
> they are redundant. They do that. However, if they are not redundant,
> then results may not be as expected.
>
> So, one might consider "volatile" (or whatever else is used) as an edict
> to "don't optimize".
>

Not really, as no code in a high level language should be written so
as to depend on the sequence of instructions generated. If you want
that defined more tightly, then you should be using assembler, or
have intimate knowledge of compiler translations and output.

One of the functions of a high level language is to provide an
abstraction layer between applications and the underlying machine.
Of course, that doesn't apply to systems or kernel programming but
such work requires a much deeper understanding of comp sci algorithmics
than simple application programming...

Chris

Simon Clubley

Sep 23, 2021, 1:51:43 PM
On 2021-09-23, chris <chris-...@tridac.net> wrote:
>
> So what is the issue here ?. Keywords like volatile would not normally
> ever be used at app level, being reserved for low level kernel and
> driver code where it touches real hardware registers. or perhaps
> memory locations reserved for a specific purpose.

There are a number of instances where it is quite valid (and expected)
to use a volatile attribute in normal applications, especially when
asynchronous I/O is involved.

The use of volatile is not restricted to those who know the resistor
colour code chart off by heart. :-)

For example, Linux uses it (correctly) in its own asynchronous I/O interface:

https://man7.org/linux/man-pages/man7/aio.7.html

>
> If systems and code are designed and written properly, then it should
> compile to usable code irrespective of compiler, including
> optimisation level, so long as the compiler is standards compliant.
> Anything else is s system design issue...
>

In this case, the standards compliant approach would have been to require
the volatile attribute from day 1 on those variables/fields/buffers which
are filled in _after_ the system call has returned control to the program.

Simon Clubley

Sep 23, 2021, 1:58:10 PM
On 2021-09-23, chris <chris-...@tridac.net> wrote:
>
> Not really, as no code in a high level language should be written so
> as to depend on the sequence of instructions generated. If you want
> that defined more tightly, then you should be using assembler, or
> have intimate knowledge of compiler translations and output.
>
> One of the functions of a high level language is to provide an
> abstraction layer between applications and the underlying machine.
> Of course, that doesn't apply to systems or kernel programming but
> such work requires a much deeper understanding of comp sci algorithmics
> than simple application programming...
>
> Chris

Chris,

Are you familiar with true asynchronous I/O in normal applications
where the operating system can write directly into your address space
without the program having to wait around in a system call to actually
receive that data and while the program is actually busy doing something
else at the same time?

chris

Sep 23, 2021, 5:41:09 PM
On 09/23/21 18:58, Simon Clubley wrote:
> On 2021-09-23, chris<chris-...@tridac.net> wrote:
>>
>> Not really, as no code in a high level language should be written so
>> as to depend on the sequence of instructions generated. If you want
>> that defined more tightly, then you should be using assembler, or
>> have intimate knowledge of compiler translations and output.
>>
>> One of the functions of a high level language is to provide an
>> abstraction layer between applications and the underlying machine.
>> Of course, that doesn't apply to systems or kernel programming but
>> such work requires a much deeper understanding of comp sci algorithmics
>> than simple application programming...
>>
>> Chris
>
> Chris,
>
> Are you familiar with true asynchronous I/O in normal applications
> where the operating system can write directly into your address space
> without the program having to wait around in a system call to actually
> receive that data and while the program is actually busy doing something
> else at the same time?
>
> Simon.
>


But of course. In the case of a DMA transfer, the app
allocates a buffer, fills in a parameter block, then calls a system
service to do the transfer. The app then receives a message or signal to
show that the transfer is complete. Fairly standard OS practice. It may
also register a callback function to tidy up and complete the operation.

It gets more complicated for, say, disk I/O, where several processes may
queue "n" reads or writes, in any order and timeline, but due to seek
times, disk behaviour and more, the returned replies may not be in
the same order as the requests. However, each reply will be directed
to the process that requested it. That's what I mean by async I/O, but
perhaps there is another definition?

Still doesn't explain why a volatile keyword might be needed at
application level, though I guess there might be a few edge cases...

Chris

chris

Sep 23, 2021, 7:09:23 PM
If you look at that use of volatile, it's dealing with sig_atomic_t,
which I would guess to be an interface to a test-and-set instruction,
which is designed to be indivisible and non-interruptible. That is,
the whole instruction always executes to completion.
More like driver-level code, not application, where such
functionality would normally be encapsulated into a system call.

The hard barrier between kernel and application code is there
for very good reason :-)...

Chris

Lawrence D’Oliveiro

Sep 23, 2021, 9:29:33 PM
On Friday, September 24, 2021 at 5:58:10 AM UTC+12, Simon Clubley wrote:
> Are you familiar with true asynchronous I/O in normal applications
> where the operating system can write directly into your address space
> without the program having to wait around in a system call to actually
> receive that data and while the program is actually busy doing something
> else at the same time?

We normally do that with threading or shared memory these days.

Stephen Hoffman

Sep 23, 2021, 11:23:02 PM
At the level being discussed ($io_perform, $qio) all I/O is async by
default. The app queues the I/O request, and goes off to do...
whatever.

OpenVMS apps can use the sync call format—which is a sync call wrapping
around the underlying async call—and can then use ASTs or KP threads
for multi-threading, or multiple processes.

The I/O eventually completes, or eventually fails, loads the I/O status
block, and then triggers an event flag (often EFN$C_ENF "do not care")
and the AST async notification. The IOSB is always set before the EF or
the AST.
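That pattern, sketched in C against the OpenVMS system services. This is an illustrative sketch only, not compiled here: channel assignment, status checking, and the AST parameter are abbreviated, and real code would test every service's return status.

```c
#include <starlet.h>   /* sys$qio, sys$hiber */
#include <iodef.h>     /* IO$_READVBLK */

static char buffer[512];
static struct { unsigned short status, count; unsigned int dev; } iosb;

/* AST routine: runs when the I/O completes. Per the above, the IOSB is
 * already loaded by the time this fires. */
static void read_done(void *astprm)
{
    /* check iosb.status, consume buffer, perhaps queue the next $qio */
}

void start_read(unsigned short chan)
{
    /* Queue the read and return immediately; the mainline keeps
     * working (or hibernates) until the AST is delivered. */
    sys$qio(0 /* efn */, chan, IO$_READVBLK,
            &iosb, read_done, 0 /* astprm */,
            buffer, sizeof buffer, 0, 0, 0, 0);
    /* ... do other work, or sys$hiber() ... */
}
```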

ASTs are in some ways a predecessor to a closure, and lack compiler
support and syntactic sugar such as the block syntax found in clang, or
the lambda syntax found in C++.

The DEC-traditional languages on OpenVMS sometimes have threading support.

(What I'm referring to as traditional: BASIC, Fortran, FORTRAN, Pascal,
COBOL, BLISS, Macro32, etc. I'm here ignoring Java, Python, Lua, and
whatnot, fine languages that those are.)

Older C (VAX C) had built-in parallelism support (which had some
issues), and newer C has pthreads POSIX threading support.

On OpenVMS, pthreads are built on KP Threads.

Language-based async/await is not something that was common in years
past, and the traditional OpenVMS compilers don't have support for
that, nor for the newer standards where those syntax features have been
added.

Unix started out on a different path for I/O with largely sync calls
for I/O, and developed async support later (epoll, kqueue, select, aio,
pthreads, GCD, etc.) to wrap around that.

select is a mess on OpenVMS, so we won't discuss that. aio and
GCD/libdispatch don't exist on OpenVMS. etc.

Here, the usual OpenVMS app pattern would be an AST-based app, or maybe
a threading app using pthreads or KP threads. The description I'd
posted earlier in the COBOL thread is also somewhat GCD-ish, given its
use of queues.

With ASTs, the app is either active in the mainline, or exactly one AST
is active. Threads are somewhat more complex, and threads can and do
operate entirely in parallel across multiple processors.

Both ASTs and threads require careful consideration of shared storage,
which ties back to Simon's threads on compiler code optimization, as
well as knowing the processor memory model.

Alpha in particular has a very aggressive memory model, as compared to
pretty much any other architecture available. And the COBOL thread
involves Alpha.

It'd be possible to do all this in memory with a section and queues
too, but that then means adding notifications (signals, DLM lock
doorbells, $sigprc, etc) and eventually security and pretty soon most
of the overhead of mailboxes or sockets.

Rolling your own communications interface is absolutely possible and
was once fairly common. I've built and worked with more than a few
communications APIs commonly using sections. Yes, pun fully intended.

For most cases with newer app development work or overhauls on OpenVMS,
I'd tend to use sockets and not mailboxes (from over in the COBOL
thread), but that's local preference. Sockets can let me move
constituent apps further apart, should the app or server load
increase. As has happened with apps I've worked on, the alternative
tends to be mailboxes and sockets, which is more code and more
complexity. And some have included section-based and driver-based
comms. All that means more code, and more "fun" routing and logging and
troubleshooting.

Creating an app that's basically one big ball of self-requeuing ASTs
with a main that hibernates and wakes works pretty well for
low-to-moderate-scale OpenVMS apps, too.




--
Pure Personal Opinion | HoffmanLabs LLC

Simon Clubley

Sep 24, 2021, 8:09:42 AM
On 2021-09-23, chris <chris-...@tridac.net> wrote:
>
> Still doesn't explain why a volatile keyword might be needed at
> application level, though I guess there might be a few edge cases...
>

I'm surprised you are having a hard time seeing it Chris.

Hardware stuffs something directly into process memory outside of
the flow of execution of a program, hence volatile may be required
for some programs to tell the compiler to generate code to re-read
it again.

VMS stuffs something directly into process memory outside of
the flow of execution of a program, hence volatile may now be
required for some programs to tell the compiler to generate code
to re-read it again.

Simon Clubley

Sep 24, 2021, 8:16:32 AM
Volatile is also set (quite correctly) on the buffer itself.

chris

Sep 24, 2021, 10:18:50 AM
On 09/24/21 13:09, Simon Clubley wrote:
> On 2021-09-23, chris<chris-...@tridac.net> wrote:
>>
>> Still doesn't explain why a volatile keyword might be needed at
>> application level, though I guess there might be a few edge cases...
>>
>
> I'm surprised you are having a hard time seeing it Chris.
>
> Hardware stuffs something directly into process memory outside of
> the flow of execution of a program, hence volatile may be required
> for some programs to tell the compiler to generate code to re-read
> it again.
>

Sorry, but that's incorrect. You are confusing compile time actions
with runtime situations. Present C compilers can have no
knowledge of future dynamic runtime situations where, for example, a
shared buffer may be updated asynchronously by separate processes
and at different times. However, most operating systems have features
to manage such situations to ensure things like mutual exclusion and
deadlock prevention. OS books are full of algorithms for that sort
of thing, as it's so fundamental to OS design.

> VMS stuffs something directly into process memory outside of
> the flow of execution of a program, hence volatile may now be
> required for some programs to tell the compiler to generate code
> to re-read it again.
>
> Simon.

Sorry, but wrong again and it does nothing of the sort. All the
volatile keyword does is to tell the compiler to disable
optimisation across sequence points, i.e. not unroll loops, not
delete whole sections of apparently redundant code, etc., but to
generate code as per the source defines. No added code is or can
be generated to take account of future run time situations.

Check out the Harbison & Steele book for 2 or 3 pages on the
volatile keyword...

Chris

>

chris

Sep 24, 2021, 10:28:25 AM
On 09/24/21 13:16, Simon Clubley wrote:
> On 2021-09-23, chris<chris-...@tridac.net> wrote:
>>
>> If you look at that use of volatile, it's dealing with sig_atomic,
>> which I would guess to be an interface to a test and set instruction,
>> which is designed to be indivisible and non interuptable. That is,
>> the whole instruction always executes to completion.
>> More like driver level code, not application, where such
>> functionality would normally be encapsulated into a system call.
>>
>
> Volatile is also set (quite correctly) on the buffer itself.
>
> Simon.
>

Why? While the compiler will typically pad out structures to align
each element to the machine word size, the use of volatile to define
that buffer looks redundant, since no optimisation would apply to
that structure definition anyway.

Use structure overlays for machine register access all the time here
and would never use the volatile keyword for any of it. Machine
register structure pointers, yes, volatile is appropriate and
necessary there...

Chris

Arne Vajhøj

Sep 24, 2021, 11:41:44 AM
__MB should ensure that when the code reads from memory
it gets the latest value.

A buffer with 100 or 1000 or 10000 bytes can not be in
a register (at least not on x86-64) so reading the buffer
will mean reading from memory.

And if __MB ensures that reading from memory will get the
latest value then ...

Arne


Simon Clubley

Sep 24, 2021, 2:15:44 PM
On 2021-09-24, chris <chris-...@tridac.net> wrote:

Chris, you are even more stubborn than Arne. :-) (Sorry Arne :-))

> On 09/24/21 13:09, Simon Clubley wrote:
>> On 2021-09-23, chris<chris-...@tridac.net> wrote:
>>>
>>> Still doesn't explain why a volatile keyword might be needed at
>>> application level, though I guess there might be a few edge cases...
>>>
>>
>> I'm surprised you are having a hard time seeing it Chris.
>>
>> Hardware stuffs something directly into process memory outside of
>> the flow of execution of a program, hence volatile may be required
>> for some programs to tell the compiler to generate code to re-read
>> it again.
>>
>
> Sorry, but that's incorrect. You are confusing compile time actions
> with runtime situations. Present C compilers can have no
> knowledge of future dynamic runtime situations where, for example, a
> shared buffer may be updated asynchronously by separate processes
> and at different times. However, most operating systems have features
> to manage such situations to ensure things like mutual exclusion and
> deadlock prevention. Os books are full of algorithms for that sort
> of thing, as it's so fundamental to OS design.
>

No I am not. All I have said all along is that volatile inserts code
into the generated code to _always_ re-read the variable before doing
anything with it.

Some programs may not need that but only if they have not touched the
variable since the program started running (so the initial read is what
would be done anyway).

I have _never_ said that compilers have any knowledge of dynamic runtime
situations. Volatile guarantees an unconditional read before working
with data and that is all it does and that's how unknown situations are
handled.

BTW, what does the hardware (or the operating system) dropping data you
requested into your process memory space while you are busy doing something
else have to do with mutual exclusion or deadlocks?

If you have not done such things, you might want to try writing programs
to use sys$qio() in full async mode or try the Linux AIO stuff and you
may then see what my potential concerns are with the upcoming compiler
changes.

>> VMS stuffs something directly into process memory outside of
>> the flow of execution of a program, hence volatile may now be
>> required for some programs to tell the compiler to generate code
>> to re-read it again.
>>
>> Simon.
>
> Sorry, but wrong again and it does nothing of the sort. All the
> volatile keyword does is to tell the compiler to disable
> optimisation across sequence points, ie: eg not unroll loops, not
> delete whole sections of apparently redundant code etc, but to
> generate code as per the source defines. No added code is or can
> be generated to take account of future run time situations.
>

Telling the compiler to generate code to re-read it again is _exactly_
what volatile does.

And telling the compiler to add code to force a re-read of a variable
_is_ the way you take account of unknown future run time situations.

Simon Clubley

Sep 24, 2021, 2:36:41 PM
On 2021-09-24, chris <chris-...@tridac.net> wrote:
> On 09/24/21 13:16, Simon Clubley wrote:
>> On 2021-09-23, chris<chris-...@tridac.net> wrote:
>>>
>>> If you look at that use of volatile, it's dealing with sig_atomic,
>>> which I would guess to be an interface to a test and set instruction,
>>> which is designed to be indivisible and non interuptable. That is,
>>> the whole instruction always executes to completion.
>>> More like driver level code, not application, where such
>>> functionality would normally be encapsulated into a system call.
>>>
>>
>> Volatile is also set (quite correctly) on the buffer itself.
>>
>
> Why ?. While the compiler will typically pad out structures to align
> each element to the machine wordsize, the use of volatile to define
> that buffer looks redundant, since no optimisation would apply to
> that structure definition anyway.
>

Chris, what makes you think you know better than the people who
wrote that header ?

It's not the structure, but the data written into the buffer by
Linux behind the scenes that the volatile attribute is designed
to address.

The whole point of the AIO interface is that the data is written to
the buffer in your process by Linux while your program is busy doing
something else. In this case, it's behaving in exactly the same way
that sys$qio() is behaving.

You therefore have to force a re-read of the buffer when you later go
looking at it so the compiler doesn't think it can reuse an existing
(and now stale) value.

Simon Clubley

Sep 24, 2021, 2:54:42 PM
__MB() puts the hardware holding the contents of that variable into sync.
Volatile OTOH puts the generated code for that variable into sync by not
caching the variable but instead re-reading it every time.

IOW this only works if the code _does_ read from memory every time
at which point you don't need the memory barrier anyway, at least not
for this. You may still need a memory barrier down inside the device
drivers or in the kernel, but that's nothing to do with working around
the compiler generating code to cache the variable instead of re-reading
it every time.

If you could somehow get this to work, you would also have to manually
insert the __MB() instructions throughout your code instead of just
tagging the variable as volatile and letting the compiler add code
to do a re-read automatically.

> A buffer with 100 or 1000 or 10000 bytes can not be in
> a register (at least not on x86-64) so reading the buffer
> will mean reading from memory.
>

That's non-deterministic. What if the code only looks at the first
longword in the buffer? A longword that it looked at previously
when the buffer had previous contents? Oops... :-)

A __MB() call here would make no difference to that behaviour.

Dave Froble

Sep 24, 2021, 5:15:25 PM
On 9/24/2021 2:15 PM, Simon Clubley wrote:
> On 2021-09-24, chris <chris-...@tridac.net> wrote:
>
> Chris, you are even more stubborn than Arne. :-) (Sorry Arne :-))

Maybe even as stubborn as Simon ???

chris

Sep 24, 2021, 5:48:56 PM
On 09/24/21 19:36, Simon Clubley wrote:
> On 2021-09-24, chris<chris-...@tridac.net> wrote:
>> On 09/24/21 13:16, Simon Clubley wrote:
>>> On 2021-09-23, chris<chris-...@tridac.net> wrote:
>>>>
>>>> If you look at that use of volatile, it's dealing with sig_atomic,
>>>> which I would guess to be an interface to a test and set instruction,
>>>> which is designed to be indivisible and non interuptable. That is,
>>>> the whole instruction always executes to completion.
>>>> More like driver level code, not application, where such
>>>> functionality would normally be encapsulated into a system call.
>>>>
>>>
>>> Volatile is also set (quite correctly) on the buffer itself.
>>>
>>
>> Why ?. While the compiler will typically pad out structures to align
>> each element to the machine wordsize, the use of volatile to define
>> that buffer looks redundant, since no optimisation would apply to
>> that structure definition anyway.
>>
>
> Chris, what makes you think you know better than the people who
> wrote that header ?

While I don't stubbornly claim to be always right, I have spent over
three decades programming real-time embedded systems on a variety of RTOS
platforms.

They put a volatile tag onto a void pointer, which can be
cast to pointer to any type, but still doesn't need the volatile
tag.

>
> It's not the structure, but the data written into the buffer by
> Linux behind the scenes that the volatile attribute is designed
> to address.
>

> The whole point of the AIO interface is that the data is written to
> the buffer in your process by Linux while your program is busy doing
> something else. In this case, it's behaving in exactly the same way
> that sys$qio() is behaving.

Shared memory regions are quite common in everyday code, asynchronously
updated or not, as the DMA example I outlined earlier. Perhaps you are
not explaining what you mean very well?

> You therefore have to force a re-read of the buffer when you later go
> looking at it so the compiler doesn't think it can reuse an existing
> (and now stale) value.

All I'm saying is, read the C standard docs on the use of the volatile
keyword for more info, or do you think you know better ?...

Chris

chris

Sep 24, 2021, 5:59:16 PM
On 09/24/21 19:15, Simon Clubley wrote:
> On 2021-09-24, chris<chris-...@tridac.net> wrote:
>
> Chris, you are even more stubborn than Arne. :-) (Sorry Arne :-))
>
>> On 09/24/21 13:09, Simon Clubley wrote:
>>> On 2021-09-23, chris<chris-...@tridac.net> wrote:
>>>>
>>>> Still doesn't explain why a volatile keyword might be needed at
>>>> application level, though I guess there might be a few edge cases...
>>>>
>>>
>>> I'm surprised you are having a hard time seeing it Chris.
>>>
>>> Hardware stuffs something directly into process memory outside of
>>> the flow of execution of a program, hence volatile may be required
>>> for some programs to tell the compiler to generate code to re-read
>>> it again.
>>>
>>
>> Sorry, but that's incorrect. You are confusing compile time actions
>> with runtime situations. Present C compilers can have no
>> knowledge of future dynamic runtime situations where, for example, a
>> shared buffer may be updated asynchronously by separate processes
>> and at different times. However, most operating systems have features
>> to manage such situations to ensure things like mutual exclusion and
>> deadlock prevention. Os books are full of algorithms for that sort
>> of thing, as it's so fundamental to OS design.
>>
>
> No I am not. All I have said all along is that volatile inserts code
> into the generated code to _always_ re-read the variable before doing
> anything with it.

Sorry, no it doesn't :-). All it's saying is that the section of code
should not be subject to any optimisations. Not the same thing at all.
Doesn't add code, just doesn't take any away, nor modify it,
but translates as written.

Of course in the real world, out of order execution on modern micros
can be a can of worms in itself, if the code depends on a specific
sequence of instruction execution, which of course, it never should.

Chris



chris

Sep 24, 2021, 6:16:28 PM
On 09/22/21 20:58, Bob Gezelter wrote:
> On Wednesday, September 22, 2021 at 2:11:40 PM UTC-4, Simon Clubley wrote:
>>
>> There's a reason why volatile is used so liberally in embedded bare-metal
>> programming. :-)
>>
>> Simon.

Primarily because embedded spends a lot of time accessing hardware
registers directly, for example:

volatile unsigned char *ttyport = (volatile unsigned char *) TTY_PORT;

which assigns a numeric value to the pointer and tells the compiler
not to optimise it away, nor change the value.

Something application level code should rarely, if ever, see...

Chris

Dave Froble

Sep 24, 2021, 7:04:47 PM
On 9/24/2021 5:48 PM, chris wrote:

> or do you think you know better ?...
>
> Chris

Come on Chris, this is Simon you're arguing with. Did you really need
to ask that?

:-)

Dave Froble

Sep 24, 2021, 7:09:36 PM
Well, now, that sort of depends on your definition of "application
level code", doesn't it?

I sometimes design/write stuff where I consider such issues.

Of course I write it in Basic, not that shitty C stuff. Basic seems to
usually get things right. (Don't tell John I wrote that, he'll hold it
against me when I ask him to fix Basic.)

VAXman-

Sep 24, 2021, 7:14:53 PM
In article <sillid$268$2...@dont-email.me>, Dave Froble <da...@tsoft-inc.com> writes:
>On 9/24/2021 5:48 PM, chris wrote:
>
>> or do you think you know better ?...
>>
>> Chris
>
>Come on Chris, this is Simon you're arguing with. Did you really need
>to ask that?
>
>:-)

ROTFLMFAO!

--
VAXman- A Bored Certified VMS Kernel Mode Hacker VAXman(at)TMESIS(dot)ORG

I speak to machines with the voice of humanity.

gah4

Sep 24, 2021, 7:52:58 PM
On Wednesday, September 22, 2021 at 7:26:03 AM UTC-7, geze...@rlgsc.com wrote:
> On Wednesday, September 22, 2021 at 8:28:09 AM UTC-4, Simon Clubley wrote:
> > Jan-Erik's questions about ASTs in COBOL have reminded me about something
> > I asked a while back.
> >
> > VMS I/O and system calls are much more asynchronous than on other operating
> > systems and data can appear in buffers and variables in general can be
> > changed outside of the normal sequence points (such as at a function call
> > boundary).
> >
> > With the move to LLVM, and its different optimiser, have any examples
> > appeared in VMS code for x86-64 where volatile attributes are now required
> > on variable definitions where you would have got away with not using them
> > before (even if technically, they should have been marked as volatile anyway) ?
> >
> > Just curious if there's any places in code running on VMS x86-64 that will
> > need to cleaned up to do things in the correct way that you would have
> > got away with doing less correctly previously.
> >
> > Simon
> >
> > --
> > Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
> > Walking destinations on a map are further away than they appear.
> Simon,
>
> Since the days of RSX-11M, I have been dealing with client bugs in this area..
> The best phrasing I have seen in this area was in an IBM System/360 Principles of
> Operation manual. It may have only appeared in certain editions, as I cannot
> find the precise reference. However, it was along the lines of "the contents of a
> buffer are UNDEFINED [emphasis mine] from the initiation of the I/O operation
> until the operation has completed with the device end signal from the device."

That would apply at the hardware level.

But BSAM (and other B...) I/O operations are almost at that level.
(QSAM is different, and at a higher level.)

A BSAM I/O call (which is pretty much a subroutine call into an OS routine)
eventually does an EXCP, which then does the appropriate SIO to start the
I/O operation. The program should then (later) WAIT for it to complete,
and as with the description above, the buffer is undefined.

(That should be described in the appropriate manual for I/O macros.)

For the 360/85, and many S/370 models, cache made things more
interesting. Since I/O operations go directly to memory, when does
the cache get updated?

I do wonder, though, if QIO is closer to the OS/360 Queued access
methods, like QSAM.

Note also that BSAM allows for, and also PL/I, locate mode I/O
where the hardware reads/writes directly from the actual data
arrays, without any intermediate buffering. (That only works
for contiguous data.)

I am not sure how VMS does I/O buffering.

Dave Froble

Sep 24, 2021, 10:17:14 PM
On 9/24/2021 7:14 PM, VAX...@SendSpamHere.ORG wrote:
> In article <sillid$268$2...@dont-email.me>, Dave Froble <da...@tsoft-inc.com> writes:
>> On 9/24/2021 5:48 PM, chris wrote:
>>
>>> or do you think you know better ?...
>>>
>>> Chris
>>
>> Come on Chris, this is Simon you're arguing with. Did you really need
>> to ask that?
>>
>> :-)
>
> ROTFLMFAO!
>

I thought you'd enjoy that.

I was laughing so hard, I almost could not type it.

:-)

Humor aside, I think I'm agreeing with Simon on this topic, even if I
know just about nothing about optimizers.

Simon Clubley

Sep 24, 2021, 11:25:07 PM
On 2021-09-24, chris <chris-...@tridac.net> wrote:
> On 09/24/21 19:15, Simon Clubley wrote:
>>
>> No I am not. All I have said all along is that volatile inserts code
>> into the generated code to _always_ re-read the variable before doing
>> anything with it.
>
> Sorry, no it doesn't :-). All it's saying is that the section of code
> should not be subject to any optimisations. Not the same thing at all.
> Doesn't add code, just doesn't take any away, nor modify it,
> but translates as written.
>

I may have phrased that a little loosely, but the end result is exactly
the same - the generated object code has unconditional reads in it in
places it might not have done had the optimiser been allowed to go to
work on the variable.

Simon Clubley

Sep 24, 2021, 11:42:38 PM
How do you otherwise _guarantee_ that the Linux application program
is seeing the latest data that the Linux kernel might have written
into the buffer behind the scenes since the program last looked at
the buffer ?

Lawrence D’Oliveiro

Sep 25, 2021, 2:01:23 AM
On Friday, September 24, 2021 at 3:23:02 PM UTC+12, Stephen Hoffman wrote:

> At the level being discussed ($io_perform, $qio) all I/O is async by
> default. The app queues the I/O request, and goes off to do...
> whatever.

Coming from VMS, where I/O and process scheduling were inherently decoupled, I find the Unix way a step backwards in some ways. Linux has its “aio” framework, but that seems to be specifically for block devices, for use it seems by some DBMS implementors who don’t like to work through conventional filesystems.

> Language-based async/await is not something that was common in years
> past ...

It’s just a revival of the old coroutine concept from decades past. Kind of. There is this terminology of “stackful” versus “stackless” coroutines, where the original kind was “stackful”. Async/await are described as “stackless” because they don’t need to switch entire stacks between tasks, since preemption can only occur at limited points. Perhaps more accurately described as “stack-light”, but there you go.

> select is a mess on OpenVMS, so we won't discuss that.

No “poll” or “epoll” ... ? “select” is considered a bit old-fashioned these days...

> Creating an app that's basically one big ball of self-requeuing ASTs
> with a main that hibernates and wakes works pretty well for
> low-to-moderate-scale OpenVMS apps, too.

I did that once, back in my MSc days. I also wrote my own threading package on top of ASTs, and tried reimplementing the app on top of that. Performance dropped by half.

chris

Sep 25, 2021, 6:48:50 AM
On 09/25/21 04:42, Simon Clubley wrote:
> On 2021-09-24, chris<chris-...@tridac.net> wrote:
>> On 09/24/21 19:36, Simon Clubley wrote:
>>> You therefore have to force a re-read of the buffer when you later go
>>> looking at it so the compiler doesn't think it can reuse an existing
>>> (and now stale) value.
>>
>> All i'm saying is, read the C standard docs on the use of the volatile
>> keyword for more info, or do you think you know better ?...
>>
>
> How do you otherwise _guarantee_ that the Linux application program
> is seeing the latest data that the Linux kernel might have written
> into the buffer behind the scenes since the program last looked at
> the buffer ?
>
> Simon.
>

Most kernels have system calls to deal with that sort of thing, to
create and manage locks on shared resources and to ensure mutual
exclusion. The key thing is that those are high-level mechanisms,
whereas volatile is a compile-time mechanism: if you like, the
low-level foundation on which high-level lock mechanisms are built.

Others may have a better explanation of all this...

Chris

chris

Sep 25, 2021, 9:00:14 AM
On 09/25/21 04:25, Simon Clubley wrote:
> On 2021-09-24, chris<chris-...@tridac.net> wrote:
>> On 09/24/21 19:15, Simon Clubley wrote:
>>>
>>> No I am not. All I have said all along is that volatile inserts code
>>> into the generated code to _always_ re-read the variable before doing
>>> anything with it.
>>
>> Sorry, no it doesn't :-). All it's saying is that the section of code
>> should not be subject to any optimisations. Not the same thing at all.
>> Doesn't add code, just doesn't take any away, nor modify it,
>> but translates as written.
>>
>
> I may have phrased that a little loosely, but the end result is exactly
> the same - the generated object code has unconditional reads in it in
> places it might not have done had the optimiser been allowed to go to
> work on the variable.
>
> Simon.
>

We are probably in agreement, just different interpretations of
the same thing? One thing I always do when confronted with a new
compiler or tool chain is to look at the assembler source output
to make sure it's doing what I expect it to. Don't bother once
I'm happy with the compiler, but it does help to get to know what
the compiler is doing under various conditions. It's also useful
if you are trying to optimise performance. For example, trying to
decide which loop construct to use: for / next, or do / while. Quite
important in the old 8-bit days, but modern micros are so good
now, it's less of an issue. Quite often a single line of asm
per C statement, but you can fine-tune the programming style to
get the best results from the compiler....

Chris


Simon Clubley

Sep 25, 2021, 2:29:25 PM
On 2021-09-25, chris <chris-...@tridac.net> wrote:
> On 09/25/21 04:42, Simon Clubley wrote:
>> On 2021-09-24, chris<chris-...@tridac.net> wrote:
>>> On 09/24/21 19:36, Simon Clubley wrote:
>>>> You therefore have to force a re-read of the buffer when you later go
>>>> looking at it so the compiler doesn't think it can reuse an existing
>>>> (and now stale) value.
>>>
>>> All i'm saying is, read the C standard docs on the use of the volatile
>>> keyword for more info, or do you think you know better ?...
>>>
>>
>> How do you otherwise _guarantee_ that the Linux application program
>> is seeing the latest data that the Linux kernel might have written
>> into the buffer behind the scenes since the program last looked at
>> the buffer ?
>>
>
> Most kernels have system calls to deal with that sort of thing, to
> create and manage locks on shared resources and to ensure mutual
> exclusion. The key thing is that that is a high level thing, whereas
> things like volatile are a compile time mechanism. If you like,
> the low level support foundation for high level lock mechanisms.
>
> Others may have a better explanation of all this...
>

No explanation needed as I do understand those things.

However, we were talking instead about why the AIO implementation
on Linux uses the volatile attribute on its transfer buffer.

Perhaps if you play with the Linux AIO interface and especially with
the sys$qio() system call in full async mode, you might understand
why I am saying the things I am.

Simon Clubley
Sep 25, 2021, 2:46:04 PM
On 2021-09-25, chris <chris-...@tridac.net> wrote:
>
> We are probably in agreement, just different interpretations of
> the same thing? One thing I always do when confronted with a new
> compiler or tool chain is to look at the assembler source output
> to make sure it's doing what I expect. I don't bother once I'm
> happy with the compiler, but it does help to get to know what
> the compiler is doing under various conditions. It's also useful
> if you are trying to optimise performance, for example when
> deciding which loop construct to use: for/next or do/while. That
> was quite important in the old 8-bit days, but modern micros are
> so good now that it's less of an issue. Quite often it's a single
> line of asm per C statement, but you can fine-tune your programming
> style to get the best results from the compiler...
>

:-)

Looking at the generated code has proved interesting at times. :-)

The following Ada Issue is a direct result of me looking at someone's
problem on comp.lang.ada a number of years ago which was caused by
the code the Ada compiler had generated:

http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0128-1.txt?rev=1.15&raw=N

The following AI is also directly related to this:

http://www.ada-auth.org/cgi-bin/cvsweb.cgi/ai12s/ai12-0127-1.txt?rev=1.27&raw=N

[I am _not_ a member of the ARG or anything like that. I am just a
normal programmer who includes Ada in the list of languages I know.]

chris
Sep 26, 2021, 12:54:13 PM
On 09/25/21 19:29, Simon Clubley wrote:
> On 2021-09-25, chris<chris-...@tridac.net> wrote:
>> On 09/25/21 04:42, Simon Clubley wrote:
>>> On 2021-09-24, chris<chris-...@tridac.net> wrote:
>>>> On 09/24/21 19:36, Simon Clubley wrote:
>>>>> You therefore have to force a re-read of the buffer when you later go
>>>>> looking at it so the compiler doesn't think it can reuse an existing
>>>>> (and now stale) value.
>>>>
>>>> All I'm saying is, read the C standard docs on the use of the volatile
>>>> keyword for more info, or do you think you know better?...
>>>>
>>>
>>> How do you otherwise _guarantee_ that the Linux application program
>>> is seeing the latest data that the Linux kernel might have written
>>> into the buffer behind the scenes since the program last looked at
>>> the buffer ?
>>>
>>
>> Most kernels have system calls to deal with that sort of thing, to
>> create and manage locks on shared resources and to ensure mutual
>> exclusion. The key thing is that that is a high level thing, whereas
>> things like volatile are a compile time mechanism. If you like,
>> the low level support foundation for high level lock mechanisms.
>>
>> Others may have a better explanation of all this...
>>
>
> No explanation needed as I do understand those things.
>
> However, we were talking instead about why the AIO implementation
> on Linux uses the volatile attribute on its transfer buffer.

Yes, and I'm suggesting that it's redundant. Structures or their
contents are not modified in any way by the compiler, other than
possibly padding out element spacing to the natural machine
word size. So again, why is that buffer pointer declared with
the volatile keyword?

>
> Perhaps if you play with the Linux AIO interface and especially with
> the sys$qio() system call in full async mode, you might understand
> why I am saying the things I am.
>

It's still not clear to me, so in the spirit of teamwork, why
don't you explain that use of volatile, in depth, so we can
all understand it?

No later Linux here, but FreeBSD man aio produces:

> The aio facility provides system calls for asynchronous I/O.
> Asynchronous I/O operations are not completed synchronously
> by the calling thread. Instead, the calling thread invokes
> one system call to request an asynchronous I/O operation.
> The status of a completed request is retrieved later via a
> separate system call.

Key point there is: not completed synchronously by the calling
thread...

Chris

chris
Sep 26, 2021, 1:10:09 PM