
plipp preprocessor


John Cowan

Sep 14, 2022, 10:42:16 PM
Subset G has only one preprocessor directive, %INCLUDE. The only built-in extension I think is necessary is %DICTIONARY, which I interpret as incorporating declarations from the named .pli file (or perhaps .plih file); this will be processed like %INCLUDE, but will also cause a corresponding #include directive to be written into the compiler output. Thus if you want to make references to a C library foobar.c, you have a set of PL/I declarations for this library in mainprog.pli, and then the generated mainprog.c will contain '#include "foobar.h"'.
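
A minimal sketch of the %DICTIONARY behaviour described above, written in Python purely for illustration. The function name `expand_dictionary`, the `read_file` callback, and the rule that maps a `.pli` name to a `.h` name are all assumptions; the post does not say how the header name is derived.

```python
import os


def expand_dictionary(name, read_file):
    """Treat %DICTIONARY like %INCLUDE, and also produce a C #include line.

    name      -- hypothetical dictionary file name, e.g. "foobar.pli"
    read_file -- callable that returns the text of the named file
    Returns a pair (pli_text, c_include_line).
    """
    pli_text = read_file(name)                      # same work as %INCLUDE
    stem, _ext = os.path.splitext(os.path.basename(name))
    c_include = '#include "%s.h"' % stem            # assumed naming rule
    return pli_text, c_include


# Stand-in for the file system, so the sketch runs on its own.
files = {"foobar.pli": "DECLARE FOO ENTRY (FIXED BINARY(31));\n"}
text, include = expand_dictionary("foobar.pli", files.__getitem__)
print(include)
```

The pair return value keeps the two effects separate: the PL/I text goes back into the input stream, while the #include line is handed to whatever writes the generated C file.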

To supplement this minimalism, I intend to write a preprocessor that will not know any PL/I (as Ratfor doesn't know any Fortran) except in the bodies of procedures. It will not include any listing-control directives, because they are no longer useful/necessary. Here are the preprocessor statements:

%; - null statement
%assignment
%ACTIVATE - declared preproc variables are eligible for replacement
%DEACTIVATE - the opposite
%DECLARE, %DCL - corresponds to DECLARE
%DO - corresponds to DO
%END - corresponds to END
%ERROR - outputs an error message
%FATAL - outputs a fatal error message
%GO TO, %GOTO - jumps to labeled preproc statement
%IF-%THEN-%ELSE - corresponds to IF
%INCLUDE - corresponds to %INCLUDE
%INFORM - outputs an informative error message
%INSCAN - like %INCLUDE, but the name to include comes from a preproc variable
%PROCEDURE - declares preprocessor procedure
%REPLACE - replaces an identifier with a constant value
%WARN - generates warning message
%XINCLUDE - like %INCLUDE but done only once
%XINSCAN - like %INSCAN, but done only once
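
The four inclusion statements in this list differ only in where the name comes from and whether a repeat is suppressed. A rough Python sketch of that relationship follows; the class name `Includer` and its methods are hypothetical, and it assumes that "done only once" means "skipped if the file was already brought in by any of the four statements".

```python
class Includer:
    """Sketch of %INCLUDE, %XINCLUDE, %INSCAN and %XINSCAN bookkeeping."""

    def __init__(self, read_file, variables=None):
        self.read_file = read_file          # callable: name -> file text
        self.variables = variables or {}    # preprocessor variables
        self.included = set()               # names already brought in

    def include(self, name):
        """%INCLUDE: always insert the file text."""
        self.included.add(name)
        return self.read_file(name)

    def xinclude(self, name):
        """%XINCLUDE: insert only if not included before (assumed rule)."""
        if name in self.included:
            return ""
        return self.include(name)

    def inscan(self, var):
        """%INSCAN: the name is the current value of a preproc variable."""
        return self.include(self.variables[var])

    def xinscan(self, var):
        """%XINSCAN: like %INSCAN, but suppressed on repeats."""
        return self.xinclude(self.variables[var])


inc = Includer({"decls.pli": "DECLARE X FIXED;"}.__getitem__,
               variables={"WHICH": "decls.pli"})
first = inc.xinscan("WHICH")
second = inc.xinclude("decls.pli")
print(repr(first), repr(second))
```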

Here are the statements valid within procedures:

null statement
assignment
ANSWER - returns text to be rescanned
CALL - invoke procedure
DO - corresponds to DO
END - corresponds to END
GO TO, GOTO - corresponds to GOTO
IF-THEN-ELSE - corresponds to IF
ITERATE - jump to end of iterative DO
LEAVE - break out of DO
RETURN - returns value to caller
SELECT-WHEN-OTHERWISE - corresponds to SELECT
STOP - no more output

John Cowan

Sep 17, 2022, 8:19:43 PM
I'm now figuring out the list of plipp's built-in functions, and I realize I need to determine the appropriate data structures the preprocessor needs. Neither the full nor the subset ANSI standard has anything to say about preprocessing other than %INCLUDE, so I have no guidance for this.

In particular, IBM supports arbitrary PL/I arrays, but I think it may be enough to support just one-dimensional arrays, or arrays with a fixed lower bound of 1, or both. None of the other PL/I manuals I have access to talks about arrays at all. Anyone have feedback on how useful the richer arrays are when preprocessing? It's not a matter of implementation difficulty, but of featuritis, particularly documentation complications.

Anyhow, here's the tentative built-in function list:

ABS(n) - absolute value of n
CHAR(n) - character with codepoint n
COMMENT(s) - format s as a comment
COMPILEDATE() - yyyymmddhhmmssttt
COPY(s,n) - n copies of s
COUNTER() - string of next integer value
DIMENSION(a,n) - extent of array a in the nth dimension
DIV(x,y) - x div y
FIXED(n) - fixed integer represented by n
HBOUND(a,n) - high bound of array a in the nth dimension
INDEX(needle,haystack[,index]) - returns 0 if not found
LBOUND(a,n) - low bound of array a in the nth dimension
LENGTH(s) - string length
LOWERCASE(s) - lowercase of s
LTRIM(s) - remove whitespace from left end
MAX(x,y) - maximum of x and y
MIN(x,y) - minimum of x and y
MOD(x,y) - x modulo y
QUOTE(s) - format s as a quoted string
RANK(a) - rank of array
REPEAT(s,n) - COPY(s,n+1)
RTRIM(s) - remove whitespace from right end
SEARCH(x,y[,n]) - position in x of char in y, starting at n
SIGN(n) - signum of n
STRING(n) - string representing n
SUBSTR(s,x[,y]) - substring (default y = length)
TRANSLATE(s,to,from) - translate chars in from -> to
TRIM(s) - remove whitespace from both ends
UPPERCASE(s) - uppercase of s
VERIFY(x,y[,n]) - position in x of char not in y, starting at n
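
To pin down what a few of the one-line descriptions above mean, here is a hedged Python sketch of the string functions. Positions are 1-based and 0 means "not found", as the list says. The argument order for INDEX and TRANSLATE follows the list above; padding a short `to` string with blanks in TRANSLATE is borrowed from the base language and is an assumption about plipp.

```python
def copy(s, n):
    """COPY(s,n): n copies of s."""
    return s * max(n, 0)


def repeat(s, n):
    """REPEAT(s,n): the same as COPY(s,n+1)."""
    return copy(s, n + 1)


def index(needle, haystack, start=1):
    """INDEX: 1-based position of needle in haystack, 0 if not found."""
    if not needle:
        return 0
    return haystack.find(needle, start - 1) + 1   # find() gives -1 -> 0


def search(x, y, n=1):
    """SEARCH: position in x of the first char that is in y, from n."""
    for i in range(n - 1, len(x)):
        if x[i] in y:
            return i + 1
    return 0


def verify(x, y, n=1):
    """VERIFY: position in x of the first char that is not in y, from n."""
    for i in range(n - 1, len(x)):
        if x[i] not in y:
            return i + 1
    return 0


def translate(s, to, frm):
    """TRANSLATE(s,to,from): map each char of `from` to the matching `to`."""
    to = to.ljust(len(frm))[:len(frm)]   # assumed: pad or truncate `to`
    table = {}
    for f, t in zip(frm, to):
        table.setdefault(f, t)           # leftmost occurrence wins
    return "".join(table.get(ch, ch) for ch in s)


print(repeat("ab", 2), index("lo", "hello"), verify("123a5", "0123456789"))
```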


Peter Flass

Sep 17, 2022, 10:13:38 PM
John Cowan <co...@ccil.org> wrote:
> I'm now figuring out the list of plipp's built-in functions, and I
> realize I need to determine the appropriate data structures the preprocessor needs.

I thrashed around for a while, and finally settled on a doubly-linked list
of pointers to lines.
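
One possible shape for such a structure, sketched in Python. The class names `LineNode` and `LineList` are made up for the example, and holding the line text directly in each node (rather than a pointer to it) is a simplification. The point of the double links is that included or generated text can be spliced in after any line in constant time.

```python
class LineNode:
    """One source line plus links to its neighbours."""
    __slots__ = ("text", "prev", "next")

    def __init__(self, text):
        self.text = text
        self.prev = None
        self.next = None


class LineList:
    """Doubly-linked list of lines with O(1) insertion after a known node."""

    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, text):
        node = LineNode(text)
        if self.tail is None:
            self.head = self.tail = node
        else:
            node.prev = self.tail
            self.tail.next = node
            self.tail = node
        return node

    def insert_after(self, node, text):
        new = LineNode(text)
        new.prev = node
        new.next = node.next
        if node.next is None:
            self.tail = new          # inserted at the very end
        else:
            node.next.prev = new
        node.next = new
        return new

    def __iter__(self):
        node = self.head
        while node is not None:
            yield node.text
            node = node.next


lines = LineList()
first = lines.append("%INCLUDE decls;")
lines.append("CALL MAIN;")
# Splice the included text in right after the directive line.
lines.insert_after(first, "DECLARE X FIXED;")
print(list(lines))
```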
I haven’t checked Enterprise PL/I in much detail, so I don’t know if all
these are there, but LOWERCASE and UPPERCASE seem redundant with TRANSLATE,
if more convenient. Is SEARCH there? POS would align better with the base
language; it’s also redundant with INDEX. Of course, implementing redundant
builtins is no more trouble than implementing one, and all builtins used
in open code have to be declared.


--
Pete

Peter Flass

Sep 17, 2022, 10:30:34 PM
Peter Flass <peter...@yahoo.com> wrote:
> John Cowan <co...@ccil.org> wrote:
>> I'm now figuring out the list of plipp's built-in functions, and I
>> realize I need to determine the appropriate data structures the preprocessor needs.
>

Forgot to mention I “compile” all the preprocessor statements to pseudocode
in pass 1.

--
Pete

John Cowan

Sep 17, 2022, 11:05:12 PM
On Saturday, September 17, 2022 at 10:13:38 PM UTC-4, bearlyabus...@gmail.com wrote:
> LOWERCASE and UPPERCASE seem redundant with TRANSLATE,

Technically. But in a Unicode implementation, which is what I'm writing
in both plic and plipp, there are thousands of upper/lowercase pairs.
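
A small Python illustration of the gap, not taken from plic or plipp. A TRANSLATE-style table built from the 26 ASCII letters leaves other letters alone, and a one-to-one table cannot express mappings such as "ß" to "SS" at all, whereas a Unicode-aware uppercase function handles both.

```python
# A one-to-one table covering only the ASCII letters, TRANSLATE-style.
ascii_upper = str.maketrans("abcdefghijklmnopqrstuvwxyz",
                            "ABCDEFGHIJKLMNOPQRSTUVWXYZ")

s = "stra\u00dfe caf\u00e9"            # "strasse" with sharp s, "cafe" with e-acute

via_translate = s.translate(ascii_upper)   # non-ASCII letters are untouched
via_upper = s.upper()                      # full Unicode case mapping

print(via_translate)
print(via_upper)
```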

> Is SEARCH there?

It isn't.

> POS would align better with the base
> language, it’s also redundant with INDEX. of course, implementing redundant
> builtins is not more trouble than implementing one, and all builtins used
> in open code have to be declared.

Do they? I thought you only had to declare them if you need access to them
within a scope in which the names are declared as variables or constants.

Robert Prins

Sep 18, 2022, 7:08:41 AM
There's also "COMPILETIME" and that has different formats when you compare the
old OS compiler with EPLI. Don't have z/OS running now, but I have a "FILLER"
pre-processor procedure that uses the difference to translate "filler" fields in
structures to either "Znnnnn" for OS PL/I, and to "*" (or also "Znnnnn", when
the "TEST" compiler option is also specified) for EPLI:

%filler: proc returns(char);
  dcl str char;

  dcl compiletime builtin;
  dcl counter builtin;

  if substr(compiletime, 3, 1) = ' ' then
    str = 'Z' || counter;
  else
    if sysparm = 'TEST' then
      str = 'z' || counter;
    else
      str = '* ';

  return(str);
%end filler;

You might add a third format to allow detection of your version! Might IPL z/OS
later today if I get through everything else I far more urgently need to do…

Robert
--
Robert AH Prins
robert(a)prino(d)org
The hitchhiking grandfather - https://prino.neocities.org/
Some REXX code for use on z/OS - https://prino.neocities.org/zOS/zOS-Tools.html



John W Kennedy

Sep 18, 2022, 5:10:06 PM
But don’t forget that the macro language is based on the main language,
and includes some rules that are not expressly documented. For example,
a major application I wrote all the way back in the 60s was dependent on
the fact that, although the macro language didn’t include BIT variables,
the following worked:

% DECLARE OPTION_STRING CHARACTER;

% OPTION_STRING = '110101001010';

% IF OPTION_STRING & '101100011010' % THEN % DO;
...
% END;

This works by implicitly converting OPTION_STRING and the constant
'101100011010' from CHARACTER to BIT ('110101001010'B and
'101100011010'B) and then ANDing them (result '100100001010'B).
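
The conversion and the AND can be checked with a short Python sketch. The helper names are hypothetical; padding the shorter operand with zero bits on the right and treating a bit string as true when any bit is 1 follow the base-language rules, which a preprocessor would have to reproduce for this idiom to keep working.

```python
def char_to_bit(s):
    """Implicit CHARACTER -> BIT: only '0' and '1' are acceptable."""
    if any(c not in "01" for c in s):
        raise ValueError("not convertible to BIT: %r" % s)
    return s


def bit_and(a, b):
    """Bitwise AND of two bit strings held as text."""
    n = max(len(a), len(b))
    a, b = a.ljust(n, "0"), b.ljust(n, "0")   # pad the shorter on the right
    return "".join("1" if x == "1" and y == "1" else "0"
                   for x, y in zip(a, b))


def is_true(bits):
    """%IF test: a bit string is true if any bit is 1."""
    return "1" in bits


option_string = "110101001010"
result = bit_and(char_to_bit(option_string), char_to_bit("101100011010"))
print(result, is_true(result))
```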

--
John W. Kennedy
Algernon Burbage, Lord Roderick, Father Martin, Bishop Baldwin,
King Pellinore, Captain Bailey, Merlin -- A Kingdom for a Stage!


John W Kennedy

Sep 18, 2022, 5:17:32 PM
In the macro language, %DECLARE of a BUILTIN is equivalent to %ACTIVATE,
and if you don’t do either, then a macro BUILTIN will not work in open
code, though it will work inside a macro statement or procedure. (Note
that activating a BUILTIN in open code will break the runtime version of
that same BUILTIN, so it may not be such a good idea.)

Peter Flass

Sep 18, 2022, 5:18:29 PM
John W Kennedy <john.w....@gmail.com> wrote:
> On 9/17/22 10:30 PM, Peter Flass wrote:
>> Peter Flass <peter...@yahoo.com> wrote:
>>> John Cowan <co...@ccil.org> wrote:
>>>> I'm now figuring out the list of plipp's built-in functions, and I
>>>> realize I need to determine the appropriate data structures the preprocessor needs.
>>>
>>
>> Forgot to mention I “compile” all the preprocessor statements to pseudocode
>> in pass 1.
>>
>
> But don’t forget that the macro language is based on the main language,
> and includes some rules that are not expressly documented. For example,
> a major application I wrote all the way back in the 60s was dependent on
> the fact that, although the macro language didn’t include BIT variables,
> the following worked:
>
> % DECLARE OPTION_STRING CHARACTER;
>
> % OPTION_STRING = '110101001010';
>
> % IF OPTION_STRING & '101100011010' % THEN % DO;
> ...
> % END;
>
> works by implicitly converting OPTION_STRING and the constant
> "101100011010" from CHARACTER to BIT ('110101001010'B and
> '101100011010'B) and then ANDing them (result '100100001010'B).
>

Interesting, thank you!

--
Pete

Peter Flass

Sep 18, 2022, 5:18:29 PM
I believe I read that you don’t have to declare them within a preprocessor
procedure. Possibly I’m wrong.

--
Pete

John Cowan

Sep 18, 2022, 8:11:17 PM
On Sunday, September 18, 2022 at 7:08:41 AM UTC-4, Robert Prins wrote:

> There's also "COMPILETIME"

I saw that, but it would drag in localization architecture for the short name
of the month, which seemed like an excessive burden.

> You might add a third format to allow detection of your version!

I've added SYSPARM, which will return "PLIC".

I've decided, at least for the moment, to restrict arrays to a single dimension
with a lower bound of 1. This lets me remove the built-in functions
DIMENSION, HBOUND, LBOUND, and RANK, and overload LENGTH
to mean either a char length or an array length.
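
Roughly what that restriction amounts to, as a Python sketch. `PreprocArray` and `length` are names invented for the example; the only things taken from the post are the single dimension, the lower bound of 1, and LENGTH answering for both strings and arrays.

```python
class PreprocArray:
    """One-dimensional array whose lower bound is always 1."""

    def __init__(self, extent, fill=""):
        self._items = [fill] * extent

    def _check(self, i):
        if not 1 <= i <= len(self._items):
            raise IndexError("subscript %d outside 1..%d"
                             % (i, len(self._items)))

    def get(self, i):
        self._check(i)
        return self._items[i - 1]      # shift to Python's 0-based storage

    def set(self, i, value):
        self._check(i)
        self._items[i - 1] = value

    def __len__(self):
        return len(self._items)


def length(x):
    """Overloaded LENGTH: array extent, or number of characters."""
    return len(x)


names = PreprocArray(3)
names.set(1, "ALPHA")
print(length(names), length(names.get(1)))
```

With the lower bound fixed at 1, the extent and the high bound are the same number, which is why one function can stand in for several.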

Peter Flass

Sep 18, 2022, 8:33:12 PM
SYSPARM needs to return the value of (some part of) the command-line
argument when the preprocessor is called. You need this in order to have
the equivalent of gcc’s -Dsome-arg -Danother_arg, etc.
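
As a sketch of what that could look like, using Python's argparse. The option name `--sysparm`, the program name, and the empty default are assumptions, not plipp's actual command line.

```python
import argparse


def build_parser():
    """Command line for a hypothetical preprocessor front end."""
    parser = argparse.ArgumentParser(prog="plipp")
    parser.add_argument("--sysparm", default="",
                        help="string returned by the SYSPARM built-in")
    parser.add_argument("source", help="input file to preprocess")
    return parser


def sysparm_builtin(args):
    """SYSPARM: hand back whatever the user passed on the command line."""
    return args.sysparm


args = build_parser().parse_args(["--sysparm", "TEST", "mainprog.pli"])
print(sysparm_builtin(args))
```

A macro can then branch on the value, much as the FILLER procedure earlier in the thread tests sysparm = 'TEST'.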

--
Pete

Robin Vowels

Sep 21, 2022, 1:15:46 AM
On Monday, September 19, 2022 at 10:11:17 AM UTC+10, co...@ccil.org wrote:
> On Sunday, September 18, 2022 at 7:08:41 AM UTC-4, Robert Prins wrote:
>
> > There's also "COMPILETIME"
>
> I saw that, but it would drag in localization architecture for the short name
> of the month, which seemed like an excessive burden.
> > You might add a third format to allow detection of your version!
> I've added SYSPARM, which will return "PLIC".
>
> I've decided, at least for the moment, to restrict arrays to a single dimension
> with a lower bound of 1. This lets me remove the built-in functions
> DIMENSION, HBOUND,
.
HBOUND returns the upper bound of an array.
.

Robin Vowels

Sep 21, 2022, 1:32:46 AM
On Sunday, September 18, 2022 at 10:19:43 AM UTC+10, co...@ccil.org wrote:
> I'm now figuring out the list of plipp's built-in functions, and I realize I need to determine the appropriate data structures the preprocessor needs. Neither the full nor the subset ANSI standard has anything to say about preprocessing other than %INCLUDE, so I have no guidance for this.
>
> In particular, IBM supports arbitrary PL/I arrays, but I think it may be enough to support just one-dimensional arrays, or arrays with a fixed lower bound of 1, or both. None of the other PL/I manuals I have access to talks about arrays at all. Anyone have feedback on how useful the richer arrays are when preprocessing? It's not a matter of implementation difficulty, but of featuritis, particularly documentation complications.
>
> Anyhow, here's the tentative built-in function list:
>
> ABS(n) - absolute value of n
> CHAR(n) - character with codepoint n
> COMMENT(s) - format s as a comment
> COMPILEDATE() - yyyymmddhhmmssttt
> COPY(s,n) - n copies of s
> COUNTER() - string of next integer value
> DIMENSION(a,n) - extent of array a in the nth dimension
> DIV(x,y) - x div y
> FIXED(n) - fixed integer represented by n
> HBOUND(a,n) - high bound of array a in the nth dimension
> INDEX(needle,haystack[,index]) - returns 0 if not found
> LBOUND(a,n) - low bound of array a in the nth dimension
> LENGTH(s) - string length
> LOWERCASE(s) - lowercase of s
> LTRIM(s) - remove whitespace from left end
.
Usually, macro built-ins use the same name as compiler builtins.
It's for consistency.
The Enterprise compiler uses TRIM as a preprocessor name,
not LTRIM and RTRIM.
TRIM removes from either end, or from both ends,
just as does the compiler version.
.
There's no DIV function, nor FIXED, nor MOD.
.
> MAX(x,y) - maximum of x and y
> MIN(x,y) - minimum of x and y
> MOD(x,y) - x modulo y
> QUOTE(s) - format s as a quoted string
> RANK(a) - rank of array
.
There's no RANK.
.
> REPEAT(s,n) - COPY(s,n+1)
> RTRIM(s) - remove whitespace from right end
> SEARCH(x,y,[n]) - position in x of char in y, starting at n
.
There's no SEARCH nor SIGN nor STRING.
.

Peter Flass

Sep 21, 2022, 10:03:47 AM
Since the macro language is a complete programming language in itself,
missing builtins could mostly be coded as preprocessor procedures. Since
there is no preprocessor standard, subset G or otherwise, my instinct would
be to stick fairly closely to existing implementations for compatibility,
but, of course, the implementor is free to do whatever he wants with it.

--
Pete

John Cowan

Sep 21, 2022, 6:46:24 PM
On Wednesday, September 21, 2022 at 1:32:46 AM UTC-4, Robin Vowels wrote:

> HBOUND returns the upper bound of an array.

I understand that now, so I restored it.

> The Enterprise compiler uses TRIM as a preprocessor name,
> not LTRIM and RTRIM.
> TRIM removes from either end, or from both ends,
> just as does the compiler version.

LTRIM and RTRIM are from the Kednos PL/I preprocessor,
but I agree that they are not necessary.

> There's no DIV function, nor FIXED, nor MOD.

DIV was an oversight of mine: removed.
FIXED does not seem to be doable in any other way.
MOD is from Kednos.

> There's no RANK.

A confusion on my part. Removed.

> There's no SEARCH nor SIGN nor STRING.

SEARCH and SIGN are from Kednos.
Splitting STRING into CHARACTER and BIT; added for the same reason as FIXED.

