Language-independent format for master config

James Harris

Dec 19, 2021, 4:33:30 PM
We've previously discussed the potential benefits of having a master
configuration which can be used to generate types, constants,
structures, declarations etc for more than one programming language so
that each language gets the same info.

As it happens, I find myself in that position now. I need assembly and
my own language to cooperate using some common definitions so ISTM the
right approach is to have a master set of definitions and use it to create
declarations and the like both for the assembler and for my compiler.

Before I jump in and devise something new ... do you know of any format
and tools which exist already?

Or since my needs will probably be very limited could it be simpler to
avoid a comprehensive and bulky package and just to make up something
from scratch? And IYO what should be in it?

(BTW, I've loads of other messages yet to reply to. I've not forgotten
them.)


--
James Harris

Bart

Dec 20, 2021, 7:11:32 AM
On 19/12/2021 21:33, James Harris wrote:
> We've previously discussed the potential benefits of having a master
> configuration which can be used to generate types, constants,
> structures, declarations etc for more than one programming language so
> that each language gets the same info.
>
> As it happens, I find myself in that position now. I need assembly and
> my own language to cooperate using some common definitions so ISTM the
> right approach is to have a master set of definitions and use it to create
> declarations and the like both for the assembler and for my compiler.
>
> Before I jump in and devise something new ... do you know of any format
> and tools which exist already?
>
> Or since my needs will probably be very limited could it be simpler to
> avoid a comprehensive and bulky package and just to make up something
> from scratch? And IYO what should be in it?


I don't recall that discussion.

It sounds like you're thinking of a special language just for
declarations, which transpiles to multiple targets.

That might be a little extravagant. I'm not sure any existing tools are
going to be helpful, since how will they know how to generate code for
each of your languages?

In my case, I only have 3 languages (that I code in): ASM, M (static), Q
(dynamic).

The ASM is written inline in M, and will have access to most of its
declarations (not yet directly to structs; I need special declarations to
make the member offsets available).
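
One way a generator could make struct member offsets available to assembly is to emit them as equ constants. The following is a minimal Python sketch, not taken from M or any existing tool. It assumes natural alignment (each member aligned to its own size), and the type names, the NAME_member naming scheme and the function names are all hypothetical.

```python
# Sizes in bytes for a few hypothetical master-file type names.
SIZES = {"int8": 1, "int16": 2, "int32": 4, "int64": 8}


def struct_offsets(members):
    """Return ([(name, offset), ...], total_size) for (type, name) pairs.

    Assumes natural alignment: each member is aligned to its own size and
    the struct size is rounded up to the largest member alignment.
    """
    offset = 0
    max_align = 1
    result = []
    for type_name, name in members:
        size = SIZES[type_name]
        offset = (offset + size - 1) // size * size  # round up to alignment
        result.append((name, offset))
        offset += size
        max_align = max(max_align, size)
    total = (offset + max_align - 1) // max_align * max_align
    return result, total


def to_nasm_offsets(struct_name, members):
    """Emit one 'equ' line per member offset plus one for the struct size."""
    offsets, total = struct_offsets(members)
    lines = [f"{struct_name}_{name} equ {off}" for name, off in offsets]
    lines.append(f"{struct_name}_size equ {total}")
    return "\n".join(lines)


if __name__ == "__main__":
    record = [("int8", "tag"), ("int32", "count"), ("int16", "flags")]
    print(to_nasm_offsets("RECORD", record))
```

The alignment rule is the part most likely to differ between languages, so a real generator would probably need it to be configurable per target.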

When M/Q share data, I use ad hoc methods: for example, a special
routine in M which, when called, writes Q-compatible versions.

Or, since the syntax is similar, I can just copy&paste with a few tweaks.

Nothing however that will guarantee those separate declarations
automatically remain in sync.

Except something I'm working on now, which is that when M is generating
a shared library, then it will also generate an exports file containing
an API for use from:

* M (for M to use it as though it was a regular DLL)
* Q
* C had also been planned, to allow access from other languages that
can use DLLs + C headers

(I found a problem with M generating DLL, so that I'm working on
devising my own shared library format for my own languages. I think that
can still be packaged within a regular DLL too, but that's low priority;
it will also need an external tool to generate the core DLL file!)

James Harris

Dec 20, 2021, 11:59:38 AM
On 20/12/2021 12:11, Bart wrote:
> On 19/12/2021 21:33, James Harris wrote:
>> We've previously discussed the potential benefits of having a master
>> configuration which can be used to generate types, constants,
>> structures, declarations etc for more than one programming language so
>> that each language gets the same info.
>>
>> As it happens, I find myself in that position now. I need assembly and
>> my own language to cooperate using some common definitions so ISTM the
>> right approach is to have a master set of definitions and use it to
>> create declarations and the like both for the assembler and for my
>> compiler.
>>
>> Before I jump in and devise something new ... do you know of any
>> format and tools which exist already?
>>
>> Or since my needs will probably be very limited could it be simpler to
>> avoid a comprehensive and bulky package and just to make up something
>> from scratch? And IYO what should be in it?
>
>
> I don't recall that discussion.
>
> It sounds like you're thinking of a special language just for
> declarations, which transpiles to multiple targets.

I don't know that it would be a 'language' but preferably something much
simpler. For example, the master file might have a couple of constants,
one typed and one not typed.

  constant int32 LIMIT 906
  constant BLOCKSIZE 512

That would result in assembly something like

  LIMIT      dd   906
  BLOCKSIZE  equ  512

and in C something like

  int32_t LIMIT = 906;
  #define BLOCKSIZE 512

IOW the typed one would reserve storage whereas the other would not, and
there could be an arbitrary number of each. For C, LIMIT and BLOCKSIZE
would likely be written into a header and after being imported could be
used in other C code as normal.

(There's possibly a const qualification that should be added to the
LIMIT but I don't know where.)
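
A minimal Python sketch of such a generator, assuming the two-line master format shown above. The function names and the type-to-directive table are illustrative assumptions, not an existing tool. For the C output the sketch puts const before the type, which is one valid place for it.

```python
# Hypothetical mapping from master-file type names to NASM data directives.
NASM_DIRECTIVE = {"int8": "db", "int16": "dw", "int32": "dd", "int64": "dq"}


def parse_master(text):
    """Parse 'constant [type] NAME VALUE' lines into (type, name, value)."""
    decls = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if fields[0] != "constant" or len(fields) not in (3, 4):
            raise ValueError(f"unrecognised line: {line!r}")
        if len(fields) == 4:
            _, type_name, name, value = fields
        else:
            _, name, value = fields
            type_name = None  # untyped: no storage is reserved
        decls.append((type_name, name, int(value)))
    return decls


def to_nasm(decls):
    """Typed constants reserve storage; untyped ones become 'equ' symbols."""
    lines = []
    for type_name, name, value in decls:
        if type_name is None:
            lines.append(f"{name} equ {value}")
        else:
            lines.append(f"{name} {NASM_DIRECTIVE[type_name]} {value}")
    return "\n".join(lines)


def to_c(decls):
    """Typed constants become const objects; untyped ones become #defines."""
    lines = []
    for type_name, name, value in decls:
        if type_name is None:
            lines.append(f"#define {name} {value}")
        else:
            lines.append(f"const {type_name}_t {name} = {value};")
    return "\n".join(lines)


if __name__ == "__main__":
    decls = parse_master("constant int32 LIMIT 906\nconstant BLOCKSIZE 512\n")
    print(to_nasm(decls))
    print(to_c(decls))
```

If the generated header is included from several C files, the stored constant would probably need to be declared extern in the header and defined once elsewhere, to avoid duplicate definitions at link time.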

>
> That might be a little extravagant. I'm not sure any existing tools are
> going to be helpful, since how will they know how to generate code for
> each of your languages?

They wouldn't have to do so. What I had in mind was me writing the code
to produce something suitable for my language. It's just that if there
were already a standard master format then I'd look to see if that was
worth using. No need to reinvent the wheel.

>
> In my case, I only have 3 languages (that I code in): ASM, M (static), Q
> (dynamic).
>
> The ASM is written inline in M, and will have access to most of its
> declarations (not yet directly to structs; I need special declarations to
> make the member offsets available).
>
> When M/Q share data, I use ad hoc methods: for example, a special
> routine in M which, when called, writes Q-compatible versions.
>
> Or, since the syntax is similar, I can just copy&paste with a few tweaks.

Wouldn't it be better to produce compatible declarations automatically,
from some master file?

>
> Nothing however that will guarantee those separate declarations
> automatically remain in sync.

Indeed.

>
> Except something I'm working on now, which is that when M is generating
> a shared library, then it will also generate an exports file containing
> an API for use from:
>
>  * M (for M to use it as though it was a regular DLL)
>  * Q
>  * C had also been planned, to allow access from other languages that
> can use DLLs + C headers
>
> (I found a problem with M generating DLL, so that I'm working on
> devising my own shared library format for my own languages. I think that
> can still be packaged within a regular DLL too, but that's low priority;
> it will also need an external tool to generate the core DLL file!)

If your master info is in one of your own languages won't you end up in
the same position as many other languages: something that has to be
translated with no clear master? Wouldn't it be better to have some form
which is clearly the master format for conversion to any and all languages?


--
James Harris

James Harris

Dec 20, 2021, 12:09:33 PM
On 20/12/2021 16:59, James Harris wrote:

...

>   constant int32 LIMIT 906
>   constant BLOCKSIZE 512
>
> That would result in assembly something like
>
>   LIMIT      dd   906
>   BLOCKSIZE  equ  512

For anyone who's not familiar with NASM assembly, those two lines do the
following.

  LIMIT dd 906

reserves four bytes (dd means 4 bytes, db means 1 byte, etc) of storage
and initialises it to 906.

By contrast,

  BLOCKSIZE equ 512

associates the value 512 with the symbol BLOCKSIZE.

Perhaps the master file should make the distinction clearer by using
different initial keywords, as in

  stored_constant int32 LIMIT 906

  literal_constant BLOCKSIZE 512

I don't know. Just throwing some ideas around. The main thing is that
the master copy should be parsable and convertible to various
programming languages.
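
With distinct leading keywords, a converter can dispatch on the first word of each line instead of counting fields. A minimal Python sketch under that assumption follows. Only NASM output is shown, and the handler names and the directive table are hypothetical.

```python
# Hypothetical mapping from master-file type names to NASM data directives.
DIRECTIVES = {"int8": "db", "int16": "dw", "int32": "dd", "int64": "dq"}


def emit_stored(fields):
    """stored_constant TYPE NAME VALUE -> storage initialised to VALUE."""
    type_name, name, value = fields
    return f"{name} {DIRECTIVES[type_name]} {int(value)}"


def emit_literal(fields):
    """literal_constant NAME VALUE -> assemble-time symbol, no storage."""
    name, value = fields
    return f"{name} equ {int(value)}"


# One handler per keyword; supporting a new keyword means adding an entry.
HANDLERS = {
    "stored_constant": emit_stored,
    "literal_constant": emit_literal,
}


def master_to_nasm(text):
    """Convert every non-blank master line using the handler for its keyword."""
    out = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        handler = HANDLERS.get(fields[0])
        if handler is None:
            raise ValueError(f"unknown keyword: {fields[0]!r}")
        out.append(handler(fields[1:]))
    return "\n".join(out)


if __name__ == "__main__":
    print(master_to_nasm(
        "stored_constant int32 LIMIT 906\n\nliteral_constant BLOCKSIZE 512\n"
    ))
```

A second table of handlers, keyed on the same keywords, could produce the output for another target language.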


--
James Harris

Bart

Dec 20, 2021, 1:42:23 PM
On 20/12/2021 16:59, James Harris wrote:
> On 20/12/2021 12:11, Bart wrote:

>> It sounds like you're thinking of a special language just for
>> declarations, which transpiles to multiple targets.
>
> I don't know that it would be a 'language' but preferably something much
> simpler. For example, the master file might have a couple of constants,
> one typed and one not typed.
>
>   constant int32 LIMIT 906
>   constant BLOCKSIZE 512
>
> That would result in assembly something like
>
>   LIMIT      dd   906
>   BLOCKSIZE  equ  512
>
> and in C something like
>
>   int32_t LIMIT = 906;
>   #define BLOCKSIZE 512
>
> IOW the typed one would reserve storage whereas the other would not, and
> there could be an arbitrary number of each. For C, LIMIT and BLOCKSIZE
> would likely be written into a header and after being imported could be
> used in other C code as normal.

This looks like a language to me. It's just one that consists of
declarations, or rather, non-executable code.

A bit like the kind of language I once proposed for defining APIs in a
language-neutral format.

So it's perhaps not as simple as you think. For my purposes,
declarations can include all these aspects:

* Basic types

* Aggregate types (structs, arrays)

* Pointers, strings

* Named constants of those types

* Variables of those types, including arrays and tables

* User-defined structs

* User-defined types

* Enumerations

* In my case, 'tabledata' (enums + parallel arrays)

* Function signatures, mainly for importing/exporting across programs
and across languages

* Using previously defined user-defined structs and types for any of these

* Literals used to define consts and variables: strings (with escape
codes), integers, floats with separators and in various bases

* Macros (in my case, simple expression macros)

* Possibly, making use of 'include' (to incorporate and share such info
in other files) and 'strinclude' (string literals from a file).

* Possibly, read-only attributes (not something I do ATM)

So perhaps half a language; a universal one translatable to any other,
including assembly. It would need a syntax and a specification.

Maybe your requirements are simpler, but when I produce a DLL, the above
is typical of what might need to be shared. At the least, function
signatures, types/structs, and enums/named constants.

If you are really talking about a configuration file, then that will be
a lot simpler - mainly keywords and values - but I can't see it needing
to be converted into actual language syntax.

>> (I found a problem with M generating DLL, so that I'm working on
>> devising my own shared library format for my own languages. I think
>> that can still be packaged within a regular DLL too, but that's low
>> priority; it will also need an external tool to generate the core DLL
>> file!)
>
> If your master info is in one of your own languages won't you end up in
> the same position as many other languages: something that has to be
> translated with no clear master? Wouldn't it be better to have some form
> which is clearly the master format for conversion to any and all languages?


I'd designate one language as the master, probably the static one as
that has a full static type system. The dynamic one has partial support
only.

There might still need to be a process by which the master is translated
into the other language. For me, that is when the M program is compiled
to a shared library; then the necessary API file is regenerated.

luserdroog

Dec 24, 2021, 9:57:02 AM
This seems simple enough to just code up in your favorite scripting language.
Being a weirdo, here's how I'd do it using PostScript.

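%!
% Master definitions: each entry pushes its fields and then calls a
% handler (data or constant) that the chosen printer defines.
/mydefs [
  {(int32_t)(LIMIT)(906)data}
  {(BLOCKSIZE)(512)constant}
] def

% C output: handlers live in a temporary dictionary.
/print-C {
  <<
    /data { 3 -1 roll print( )print exch print( = )print print(;\n)print }
    /constant { (#define )print exch print( )print print(\n)print }
  >> begin
  {exec}forall
  end
} def

% Assembly output: the type is discarded and dd is always emitted.
/print-ASM {
  <<
    /data { exch print( dd )print print(\n)print pop }
    /constant { exch print( equ )print print(\n)print }
  >> begin
  {exec}forall
  end
} def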

%Usage:
% mydefs print-C
% mydefs print-ASM
%from command line:
% gsnd -q mydefs.ps -c "mydefs print-C"
% gsnd -q mydefs.ps -c "mydefs print-ASM"

James Harris

Jan 2, 2022, 10:40:33 AM
On 20/12/2021 18:42, Bart wrote:
> On 20/12/2021 16:59, James Harris wrote:

...

>>    constant int32 LIMIT 906
>>    constant BLOCKSIZE 512
>>
>> That would result in assembly something like
>>
>>    LIMIT      dd   906
>>    BLOCKSIZE  equ  512
>>
>> and in C something like
>>
>>    int32_t LIMIT = 906;
>>    #define BLOCKSIZE 512

...

> This looks like a language to me. It's just one that consists of
> declarations, or rather, non-executable code.

Yes.

>
> A bit like the kind of language I once proposed for defining APIs in a
> language-neutral format.
>
> So it's perhaps not as simple as you think. For my purposes,
> declarations can include all these aspects:
>
> * Basic types
>
> * Aggregate types (structs, arrays)
>
> * Pointers, strings
>
> * Named constants of those types
>
> * Variables of those types, including arrays and tables
>
> * User-defined structs
>
> * User-defined types
>
> * Enumerations
>
> * In my case, 'tabledata' (enums + parallel arrays)
>
> * Function signatures, mainly for importing/exporting across programs
> and across languages
>
> * Using previously defined user-defined structs and types for any of these
>
> * Literals used to define consts and variables: strings (with escape
> codes), integers, floats with separators and in various bases
>
> * Macros (in my case, simple expression macros)
>
> * Possibly, making use of 'include' (to incorporate and share such info
> in other files) and 'strinclude' (string literals from a file).
>
> * Possibly, read-only attributes (not something I do ATM)

That's a good list.

>
> So perhaps half a language; a universal one translatable to any other,
> including assembly. It would need a syntax and a specification.

Yes, it could be seen as a language but likely a very simple one
involving nothing but declarations, integers and compile-time
expressions. AISI so far, there would be no execution, no loops, no
conditionals.
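
As an illustration of what evaluating compile-time expressions could involve, here is a minimal Python sketch of an integer-only evaluator that may refer to previously defined constants. It borrows Python's own expression syntax via the standard ast module, which is purely an assumption for the sketch (so integer division is written //). The set of accepted operators and the function name are hypothetical too.

```python
import ast
import operator

# Integer operators accepted by the sketch; anything else is rejected.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
    ast.LShift: operator.lshift,
    ast.RShift: operator.rshift,
    ast.BitAnd: operator.and_,
    ast.BitOr: operator.or_,
}


def evaluate(expression, constants):
    """Evaluate an integer expression, looking names up in 'constants'."""

    def walk(node):
        # Plain integer literal (bool is excluded by the exact type check).
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        # Reference to an earlier constant.
        if isinstance(node, ast.Name):
            if node.id not in constants:
                raise ValueError(f"unknown constant: {node.id}")
            return constants[node.id]
        # Binary operator from the accepted table.
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            return BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        # Unary minus.
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -walk(node.operand)
        raise ValueError(f"unsupported expression: {expression!r}")

    return walk(ast.parse(expression, mode="eval").body)


if __name__ == "__main__":
    print(evaluate("BLOCKSIZE * 4 + 1", {"BLOCKSIZE": 512}))
```

Because function calls, loops and conditionals are all rejected, the evaluator stays within the "no execution" limit described above.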

I guess the form could be something like

  keyword parameters

where every statement begins with a keyword. Where composite forms are
required, perhaps something like

  keyword {
    parameters
  }

or

  begin keyword
    parameters
  end keyword
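
A minimal Python sketch of a reader for the first two forms, assuming one statement per line, whitespace-separated parameters, a '{' ending the opening line of a composite and a '}' on a line of its own. Nesting and the begin/end form are left out, and the struct keyword in the example is hypothetical.

```python
def parse(text):
    """Parse 'keyword params' lines and 'keyword params {' ... '}' blocks.

    Returns a list of dicts with 'keyword', 'params' and 'body', where
    'body' is None for simple statements and a list of field lists for
    composite ones. Blocks cannot be nested in this sketch.
    """
    statements = []
    block = None  # the composite statement currently being filled, if any
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue  # skip blank lines
        if fields == ["}"]:
            if block is None:
                raise ValueError("unmatched '}'")
            statements.append(block)
            block = None
        elif block is not None:
            block["body"].append(fields)
        elif fields[-1] == "{":
            block = {"keyword": fields[0], "params": fields[1:-1], "body": []}
        else:
            statements.append(
                {"keyword": fields[0], "params": fields[1:], "body": None}
            )
    if block is not None:
        raise ValueError("unterminated block")
    return statements


if __name__ == "__main__":
    source = """\
constant BLOCKSIZE 512
struct header {
  int32 length
  int16 flags
}
"""
    for statement in parse(source):
        print(statement)
```

Since every statement begins with a keyword, the parsed list can be passed directly to per-language emitters keyed on that keyword.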

>
> Maybe your requirements are simpler, but when I produce a DLL, the above
> is typical of what might need to be shared. At the least, function
> signatures, types/structs, and enums/named constants.
>
> If you are really talking about a configuration file, then that will be
> a lot simpler - mainly keywords and values

Yes.

>
> - but I can't see it needing
> to be converted into actual language syntax.

I don't get that. The whole point is to have something which can be
converted into the form required for different languages.


--
James Harris
