Namespaces, modularity, ... what's the big deal?


luserdroog

Nov 16, 2021, 10:16:02 AM
I've been following with interest many of the threads started
by James Harris over the year, and the topic of namespaces and
modularity frequently comes up, most recently in the
"Power operator and replacement..." thread.

And I tend to find myself on James' side through unfamiliarity
with the other option. What's the big deal about namespaces
and modules? What do they cost and what do they facilitate?

David Brown

Nov 16, 2021, 11:11:30 AM
"namespaces" are just a way to collect identifiers in a group. In C,
all global (external linkage) identifiers in a program are in the same
namespace. So you have to make your own namespaces for your code:

void timers_init();
void timers_enable();
int timers_get_current();
void timers_set_frequency(double);

etc.


With namespaces, you could have this all in a namespace:

namespace timers {
    void init();
    void enable();
    int get_current();
    void set_frequency(double);
}

You can use them as "timers::init();" or by pulling them into the
current scope with "using namespace timers; enable();".
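A compilable version of the example above might look like the following sketch. The function bodies and the `ticks` and `enabled` variables are invented for illustration only; in standard C++ the directive is spelled `using namespace timers;`.

```cpp
#include <cassert>

namespace timers {
    // Hypothetical state and bodies, added only so the sketch compiles.
    int ticks = 0;
    bool enabled = false;

    void init() { ticks = 0; enabled = false; }
    void enable() { enabled = true; }
    int get_current() { return ticks; }
}

// Fully qualified access: nothing is imported into the caller's scope.
void start_qualified() {
    timers::init();
    timers::enable();
}

// A using-directive kept local to one function body.
int read_with_using() {
    using namespace timers;
    enable();
    return get_current();
}

// Small self-check of the two access styles.
void demo() {
    start_qualified();
    assert(timers::enabled);
    assert(read_with_using() == 0);
}
```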

The cost is that the name of an identifier (function or object) in the
source code no longer matches its name at the assembly level - you need
some kind of mangling (it could be as simple as $timers$init, if you
don't need mangling for overloads or other purposes).

The benefit is that it is easier to structure the code. You can have
nice long namespace identifiers that are clear and explicit when you
only need them occasionally, but in parts of the code that use the
identifiers a lot you can have a "using" clause and avoid cluttering the
code with the namespace. This is all a big improvement over having to
use long identifier names all the time, or having cryptic abbreviations.
All in all, you get nicer code and far less risk of identifier collisions.


There is not a lot of difference between having a namespace, and having
a class containing only static methods and objects.
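As a rough illustration of that similarity in C++, the sketch below puts the same function and object first in a namespace and then in a class with only static members. The names `timers_ns` and `timers_cls` and the `frequency` variable are made up for the comparison.

```cpp
#include <cassert>

// A namespace grouping a function and an object.
namespace timers_ns {
    int frequency = 0;
    void set_frequency(int hz) { frequency = hz; }
}

// A class used purely as a container of static members.
struct timers_cls {
    static int frequency;
    static void set_frequency(int hz) { frequency = hz; }
};
int timers_cls::frequency = 0;  // out-of-class definition of the static member

void demo() {
    timers_ns::set_frequency(50);
    timers_cls::set_frequency(60);   // same qualified-call syntax
    assert(timers_ns::frequency == 50);
    assert(timers_cls::frequency == 60);
}
```

The remaining differences are mostly about what the language lets you do with the container: in C++ a namespace can be reopened in another file and named in a using-directive, while a class cannot.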

(I've used the C++ syntax here, but other languages have much the same
idea though they may use slightly different syntax.)


"modules" are a way of having a pre-compiled unit of code. Typically
languages have separate "interface" and "implementation" parts. A
module may or may not correspond to a namespace - that varies by
language. Basically, modules are more formalized and more restrictive
than C's header and C file setup, which means you can have a safer,
clearer and more efficient approach. A compiled module may also contain
more information than just the assembly/object code and identifiers - it
can have code in internal formats so that you can do cross-module
optimisation or static error checking.

Bart

Nov 16, 2021, 11:38:04 AM
There's no runtime overhead (depending on how dynamic a language is).

Typical characteristics of modules (as I implement them in my languages):

* Names (functions, variables, named constants, types, enums, macros)
are usually local to a module

* To share them with other modules, they are marked as 'global'.

* To import names from a specific module A, use:

import A

* This then makes visible, say, a function F in A provided it is 'global'

* The name of the module creates a namespace 'A'. To call F, use A.F().

* However (in my stuff, others may frown upon this), I don't need the
qualifier 'A'; I just use F().

* I only need A.F() if I import modules A and B, and they both export F.
Then I need A.F() or B.F() to disambiguate

* Namespaces are also created with (in my languages) records (used as
classes) and functions, which can both be used to encapsulate data. But
I use this rarely.
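The lookup rule in the list above (an unqualified name works when exactly one imported module exports it, and a qualifier is needed otherwise) could be modelled roughly as follows. This is a C++ sketch of the rule only, not the actual compiler: `Exports` and `providers` are hypothetical names, and the module contents are invented.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical model: each imported module maps to the names it marks 'global'.
using Exports = std::map<std::string, std::set<std::string>>;

// Returns the modules that export `name`. One hit means the bare name is
// usable; more than one means a qualifier such as A.F is required.
std::vector<std::string> providers(const Exports& imports, const std::string& name) {
    std::vector<std::string> found;
    for (const auto& entry : imports) {
        if (entry.second.count(name)) found.push_back(entry.first);
    }
    return found;
}

void demo() {
    Exports imports = {{"A", {"F", "G"}}, {"B", {"F"}}};
    assert(providers(imports, "G").size() == 1);  // G() works unqualified
    assert(providers(imports, "F").size() == 2);  // must write A.F() or B.F()
    assert(providers(imports, "H").empty());      // not exported by anyone
}
```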

The above describes my older module scheme, which I'm just replacing
with a newer one. All my schemes are described in more depth here,
although it is for those already familiar with modules:

https://github.com/sal55/langs/tree/master/Modules2021

The above corresponds to scheme III, and I've just moved to IV, which
goes a little beyond individual modules.



James Harris

Nov 17, 2021, 4:43:35 AM
On 16/11/2021 16:11, David Brown wrote:
> On 16/11/2021 16:16, luserdroog wrote:

>> I've been following with interest many of the threads started
>> by James Harris over the year and frequently the topic of
>> namespaces and modularity come up. Most recently in the
>> "Power operator and replacement..." thread.
>>
>> And I tend to find myself on James' side through unfamiliarity
>> with the other option. What's the big deal about namespaces
>> and modules? What do they cost and what do they facilitate?

...

> namespace timers {
> void init();
> void enable();
> int get_current();
> void set_frequency(double);
> }
>
> You can use them as "timers::init();" or by pulling them into the
> current scope with "using timers; enable();".

As this is about namespaces, I'd suggest some problems with that "using
timers" example:

* AIUI it mixes the names from within 'timers' into the current
namespace. If that's right then a programmer taking such an approach
would have to avoid having his own versions of those names. (I.e. no
'enable' or 'init' names in the local scope.)

* If multiple namespaces are imported in the same way then there's
another potential for name conflicts. It's especially bad for names
imported from external files.

* There could be a stable situation with no name conflicts, but then an
upgrade to an imported module which is not directly under the
programmer's control could introduce a new name which /does/ conflict.

Therefore namespace support needs to be more extensive and more
restrictive, IMO. For example,

namespace T = lib.org.timers

then

T.enable()

In that, T would be a name in the local scope that the programmer could
control and ensure it doesn't conflict with any other names. The import
could be

namespace timers = lib.org.timers

Then the name 'timers' would be local. The programmer would still be
/required/ to specify the namespace name as in

timers.enable()
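For comparison, C++ namespace aliases behave much like this proposed form, except that C++ does not force the qualifier. In the sketch below the nested namespace `lib::org::timers` mirrors the hypothetical path used above, and the body of `enable` is invented.

```cpp
#include <cassert>

namespace lib { namespace org { namespace timers {
    bool enabled = false;                // hypothetical state
    void enable() { enabled = true; }
}}}

void demo() {
    namespace T = lib::org::timers;      // alias is local to this function
    T::enable();                         // the only new local name is T
    assert(lib::org::timers::enabled);
}
```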

>
> The cost is that the identifiers (functions and objects) no longer match
> the source code name and the assembly level name - you need some kind of
> mangling (it could be as simple as $timers$init, if you don't need
> mangling for overloads or other purposes).

Interesting point. Can you give an example?


--
James Harris

Dmitry A. Kazakov

Nov 17, 2021, 5:40:30 AM
On 2021-11-17 10:43, James Harris wrote:
> On 16/11/2021 16:11, David Brown wrote:

>> namespace timers {
>>     void init();
>>     void enable();
>>     int get_current();
>>     void set_frequency(double);
>> }
>>
>> You can use them as "timers::init();" or by pulling them into the
>> current scope with "using timers; enable();".
>
> As this is about namespaces, I'd suggest some problems with that "using
> timers" example:
>
> * AIUI it mixes the names from within 'timers' into the current
> namespace. If that's right then a programmer taking such an approach
> would have to avoid having his own versions of those names. (I.e. no
> 'enable' or 'init' names in the local scope.)
>
> * If multiple namespaces are imported in the same way then there's
> another potential for name conflicts. It's especially bad for names
> imported from external files.

You mean making simple names inside the namespace directly visible in
the scope? Because there are two kinds of imports:

1. Making the namespace visible without direct visibility of its names.
So you must qualify the simple name Foo from My_Namespace:

My_Namespace.Foo
[ My_Namespace::Foo in C++ ]

2. Making simple names directly visible. So you can omit all parents:

Foo

> * There could be a stable situation with no name conflicts, but then an
> upgrade to an imported module which is not directly under the
> programmer's control could introduce a new name which /does/ conflict.

You are talking about #2. There are two approaches to that:

2.1. Allow direct visibility so long as there are no conflicts inside the
scope. E.g. let both Namespace_1 and Namespace_2 have Foo; if the code
does not refer to Foo, then it is OK.

2.2. Disallow it even if there are no conflicts inside the scope. This
would prevent surprises so long as the namespaces are stable. Of course a
modification of a namespace can break an existing program. However this
is not a problem from the SW development POV, because you must review
the client code anyway if you change any namespace it uses.
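C++ using-directives follow approach 2.1 fairly closely, which gives a concrete way to see it. In this sketch the namespaces and the functions `foo`, `bar` and `baz` are invented; both namespaces export `foo`, yet the code compiles as long as the bare name `foo` is never used.

```cpp
#include <cassert>

namespace namespace_1 { int foo() { return 1; } int bar() { return 10; } }
namespace namespace_2 { int foo() { return 2; } int baz() { return 20; } }

int sum_without_touching_foo() {
    using namespace namespace_1;
    using namespace namespace_2;
    // bar and baz are each unique, so the bare names are fine.
    // A bare foo() here would be rejected as ambiguous.
    return bar() + baz();
}

int pick_foo_explicitly() {
    using namespace namespace_1;
    using namespace namespace_2;
    return namespace_2::foo();   // qualification resolves the clash
}

void demo() {
    assert(sum_without_touching_foo() == 30);
    assert(pick_foo_explicitly() == 2);
}
```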

> Therefore namespace support needs to be more extensive and more
> restrictive, IMO. For example,
>
>   namespace T = lib.org.timers
>
> then
>
>   T.enable()
>
> In that, T would be a name in the local scope that the programmer could
> control and ensure it doesn't conflict with any other names. The import
> could be
>
>   namespace timers = lib.org.timers
>
> Then the name 'timers' would be local. The programmer would still be
> /required/ to specify the namespace name as in
>
>   timers.enable()

These are "fully qualified names," i.e. #1. Note that T and Timers are
simple names and may still conflict with some other T's and Timers'. You
have further choices:

A. You allow overloading of Timers. Let only one of the Timers have Enable:

Timers.Enable

This is OK, if another Timers has no Enable

B. You do not allow it regardless.

Ada has both #1 and #2. It uses #2.1(A). The predefined package Standard
is the root namespace. So, qualified names may use it to disambiguate,
e.g. like

Standard.Timers.Enable

You can mix #1 and #2. I.e. if a fully qualified name is Standard.A.B.C,
then A.B.C is OK because Standard is directly visible #2. If you make
Standard.A directly visible, then B.C would be also OK. If you make
Standard.A.B directly visible, then you could use just C.

>> The cost is that the identifiers (functions and objects) no longer match
>> the source code name and the assembly level name - you need some kind of
>> mangling (it could be as simple as $timers$init, if you don't need
>> mangling for overloads or other purposes).
>
> Interesting point. Can you give an example?

See C++ name mangling. GNAT Ada compiler does it too, but it uses a
different schema.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

David Brown

Nov 17, 2021, 5:51:42 AM
On 17/11/2021 10:43, James Harris wrote:
> On 16/11/2021 16:11, David Brown wrote:
>> On 16/11/2021 16:16, luserdroog wrote:
>
>>> I've been following with interest many of the threads started
>>> by James Harris over the year and frequently the topic of
>>> namespaces and modularity come up. Most recently in the
>>> "Power operator and replacement..." thread.
>>>
>>> And I tend to find myself on James' side through unfamiliarity
>>> with the other option. What's the big deal about namespaces
>>> and modules? What do they cost and what do they facilitate?
>
> ...
>
>> namespace timers {
>>     void init();
>>     void enable();
>>     int get_current();
>>     void set_frequency(double);
>> }
>>
>> You can use them as "timers::init();" or by pulling them into the
>> current scope with "using timers; enable();".
>
> As this is about namespaces, I'd suggest some problems with that "using
> timers" example:
>
> * AIUI it mixes the names from within 'timers' into the current
> namespace. If that's right then a programmer taking such an approach
> would have to avoid having his own versions of those names. (I.e. no
> 'enable' or 'init' names in the local scope.)

That is exactly correct. The benefit of "using" is that by importing an
identifier into the current scope, you can use that identifier without
the full namespace qualification. It is precisely so that you can use
"enable();" without having to write "timers::enable();". The
disadvantage, of course, is that if you have other "enable" identifiers,
there will be an ambiguity or conflict.

Often you will have the "using" clauses in a local scope (within a
function, for example, rather than at file scope). And languages often
support abbreviations or aliasing (in C++, you can use "namespace tmrs =
timers;", though clearly it makes most sense for long or nested namespaces).
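A small sketch of that trade-off in C++, assuming a second, unrelated `enable` at global scope (both function bodies are invented). A using-declaration inside a function introduces the name in that block only, where it hides the global one.

```cpp
#include <cassert>
#include <string>

namespace timers { std::string enable() { return "timers"; } }
std::string enable() { return "global"; }   // a second, unrelated 'enable'

std::string call_plain() {
    return enable();            // only the global one is visible unqualified
}

std::string call_with_using_declaration() {
    using timers::enable;       // declares the name in this block, hiding ::enable
    return enable();
}
// By contrast, `using namespace timers;` followed by a bare enable() would not
// compile here: both candidates are found and the call is ambiguous.

void demo() {
    assert(call_plain() == "global");
    assert(call_with_using_declaration() == "timers");
}
```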

This is standard stuff for any programming language that supports a
significant level of nested scopes - whether by modules, sub-modules,
classes, namespaces, units, or whatever the language supports and
whatever terminology it uses.

>
> * If multiple namespaces are imported in the same way then there's
> another potential for name conflicts. It's especially bad for names
> imported from external files.
>

Yes - again, that's why you typically keep the "using" clause in a local
scope. In C++, you are advised not to put a "using" clause at file
scope in a header, for example (putting it within definitions of
functions is fine).

> * There could be a stable situation with no name conflicts, but then an
> upgrade to an imported module which is not directly under the
> programmer's control could introduce a new name which /does/ conflict.
>
> Therefore namespace support needs to be more extensive and more
> restrictive, IMO. For example,
>
>   namespace T = lib.org.timers
>
> then
>
>   T.enable()

Such aliases and abbreviations are definitely a useful feature. But the
key to avoiding name conflicts is to have hierarchical namespaces (of
some sort - possibly modules, units, etc.) and to have something like a
"using" feature that you can have in local scopes to balance
convenience, clarity and avoiding conflicts.

>
> In that, T would be a name in the local scope that the programmer could
> control and ensure it doesn't conflict with any other names. The import
> could be
>
>   namespace timers = lib.org.timers
>
> Then the name 'timers' would be local. The programmer would still be
> /required/ to specify the namespace name as in
>
>   timers.enable()
>

Sure. Support both full namespace imports and aliases, at local scopes
and wider scopes. I wasn't trying to give a complete list of all the
features of namespaces and "using" in C++, merely a few key points. If
you want them all, this site is an excellent reference for C and C++:

<https://en.cppreference.com/w/cpp/language/namespace>

I am not suggesting that C++ is necessarily the model or syntax to copy
- its separation of namespaces and modules has its advantages and
disadvantages, and other languages may choose a different path. And as
always, if you pick the syntax and keywords at the start of language
design, you'll be able to get nicer choices than a language that has
evolved over time.


>>
>> The cost is that the identifiers (functions and objects) no longer match
>> the source code name and the assembly level name - you need some kind of
>> mangling (it could be as simple as $timers$init, if you don't need
>> mangling for overloads or other purposes).
>
> Interesting point. Can you give an example?
>
>

Yes, sure.


extern "C" int foo0(int x) { return x + 1; }

int foo1(int x) { return x + 1; }

namespace space {
    int foo2(int x) { return x + 1; }
}

With gcc (identifier mangling in C++ is compiler-specific), you get:

foo0:
        lea     eax, [rdi+1]
        ret
_Z4foo1i:
        lea     eax, [rdi+1]
        ret
_ZN5space4foo2Ei:
        lea     eax, [rdi+1]
        ret

With C, there is no mangling - a function name corresponds directly to
the label at assembly level. C++ includes the types of the parameters
(but not the return type) in a mangled name - the "i" at the end of
"_Z4foo1i" indicates a single "int" parameter, so that you can have
function overloads. When you have namespaces, these need to be included
in the name too. You also need the mangling to include class names for
methods if you support OOP.
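The two mangled names shown above can be reproduced with a few lines of code. The sketch below covers only plain functions inside namespaces, with parameter types supplied as ready-made codes; the real scheme used by gcc has many more rules (templates, substitutions, classes and so on), and the function names `component` and `mangle` are made up here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Encodes one name component as <length><name>, e.g. "space" -> "5space".
std::string component(const std::string& name) {
    return std::to_string(name.size()) + name;
}

// scopes: enclosing namespaces, outermost first; params: already-encoded
// parameter type codes ("i" for int, "d" for double, "v" for an empty list).
std::string mangle(const std::vector<std::string>& scopes,
                   const std::string& function,
                   const std::string& params) {
    std::string out = "_Z";
    if (scopes.empty()) {
        out += component(function);
    } else {
        out += "N";                        // start of a nested name
        for (const auto& s : scopes) out += component(s);
        out += component(function) + "E";  // end of the nested name
    }
    return out + params;
}

void demo() {
    assert(mangle({}, "foo1", "i") == "_Z4foo1i");
    assert(mangle({"space"}, "foo2", "i") == "_ZN5space4foo2Ei");
    assert(mangle({"timers"}, "set_frequency", "d") == "_ZN6timers13set_frequencyEd");
}
```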

If you know your assembler/linker supports characters that are not
allowed in the source language - such as $ - then you can probably make
a neater and more compact name-mangling scheme.
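A minimal sketch of such a scheme, using the $timers$init form mentioned earlier in the thread. It assumes that '$' can never appear in a source-level name and that there are no overloads to encode; `mangle` and `demangle` are hypothetical helper names.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Joins path components with '$', e.g. {"timers","init"} -> "$timers$init".
std::string mangle(const std::vector<std::string>& path) {
    std::string out;
    for (const auto& part : path) out += "$" + part;
    return out;
}

// Recovers the components, relying on '$' never appearing in a source name.
std::vector<std::string> demangle(const std::string& symbol) {
    std::vector<std::string> path;
    std::size_t pos = 0;
    while (pos < symbol.size() && symbol[pos] == '$') {
        std::size_t next = symbol.find('$', pos + 1);
        if (next == std::string::npos) next = symbol.size();
        path.push_back(symbol.substr(pos + 1, next - pos - 1));
        pos = next;
    }
    return path;
}

void demo() {
    const std::vector<std::string> expected = {"timers", "init"};
    assert(mangle(expected) == "$timers$init");
    assert(demangle("$timers$init") == expected);
}
```

Because the separator cannot occur in a name, the encoding is reversible, so a debugger or linker map can show the original hierarchy again.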

It may also be of interest to generate multiple versions or clones of
functions, and have mangled names for them too.

If you have some kind of module system so that compiling one source
module generates a binary "import module" that is needed by users of the
module, then you could have an almost random mangling scheme with a
translation table in the generated import module. There are many
possibilities.


Bart

Nov 17, 2021, 9:01:39 AM
On 17/11/2021 09:43, James Harris wrote:
> On 16/11/2021 16:11, David Brown wrote:
>> On 16/11/2021 16:16, luserdroog wrote:
>
>>> I've been following with interest many of the threads started
>>> by James Harris over the year and frequently the topic of
>>> namespaces and modularity come up. Most recently in the
>>> "Power operator and replacement..." thread.
>>>
>>> And I tend to find myself on James' side through unfamiliarity
>>> with the other option. What's the big deal about namespaces
>>> and modules? What do they cost and what do they facilitate?
>
> ...
>
>> namespace timers {
>>     void init();
>>     void enable();
>>     int get_current();
>>     void set_frequency(double);
>> }
>>
>> You can use them as "timers::init();" or by pulling them into the
>> current scope with "using timers; enable();".
>
> As this is about namespaces, I'd suggest some problems with that "using
> timers" example:
>
> * AIUI it mixes the names from within 'timers' into the current
> namespace. If that's right then a programmer taking such an approach
> would have to avoid having his own versions of those names. (I.e. no
> 'enable' or 'init' names in the local scope.)

It depends on how you make it work. Languages already work with multiple
instances of the same name within nested scopes; it will match the name
in the nearest surrounding scope.

With 'using', it could just introduce an extra scope beyond the ones
that are already known.

With two 'using's in two nested scopes, it gets a bit more involved:

{ using ns1;
    { using ns2;
        enable();

If 'enable' exists in both ns1 and ns2, will there be a conflict, or
will it use ns2?
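For what it's worth, C++ gives both answers depending on which form is used. The sketch below (with invented `ns1` and `ns2` bodies) uses using-declarations, which act like ordinary block-scope declarations, so the innermost one wins.

```cpp
#include <cassert>

namespace ns1 { int enable() { return 1; } }
namespace ns2 { int enable() { return 2; } }

int outer_then_inner() {
    using ns1::enable;
    int outer = enable();        // ns1::enable
    int inner;
    {
        using ns2::enable;       // hides the outer declaration in this block
        inner = enable();        // ns2::enable
    }
    return outer * 10 + inner;   // 12 if the comments above are right
}
// With using-directives (`using namespace ns1;` ... `using namespace ns2;`)
// in the same two blocks, the bare call is ambiguous and does not compile.

void demo() {
    assert(outer_then_inner() == 12);
}
```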

I don't have a 'using' statement. My namespaces are only linked to
module names. And all modules in a program (now, a subprogram) are
visible at the same time. Then there can be a conflict:

import A # both export 'enable'
import B

proc enable = {}

enable()

This is OK; 'enable' is found within the normal set of scopes. But
delete the proc definition, and then there is a conflict.

> * If multiple namespaces are imported in the same way then there's
> another potential for name conflicts. It's especially bad for names
> imported from external files.

Yes, there's a conflict. The same as if you declared 'abc' then declared
another 'abc' in the same scope. Or removed braces so that two 'abc's
are now in the same block scope.

> * There could be a stable situation with no name conflicts, but then an
> upgrade to an imported module which is not directly under the
> programmer's control could introduce a new name which /does/ conflict.

There are all sorts of issues, but most of them already occur with
nested scopes:

{
    double abc;
    {
        int abc;
        printf(fmt, abc);

Here, the 'int abc;' could be deleted or commented out. Now 'abc'
matches the other definition, with no detectable error.
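The silent rebinding can be made visible in C++ with a pair of overloads that report which type they were given. The helper `kind` and the two functions are invented for this sketch.

```cpp
#include <cassert>
#include <string>

std::string kind(int)    { return "int"; }
std::string kind(double) { return "double"; }

std::string with_inner_declaration() {
    double abc = 1.5;
    std::string outer = kind(abc);
    {
        int abc = 2;                      // shadows the outer abc
        return outer + "/" + kind(abc);   // inner use binds to the int
    }
}

std::string inner_declaration_removed() {
    double abc = 1.5;
    std::string outer = kind(abc);
    {
        // int abc = 2;                   // deleted: same use, different binding
        return outer + "/" + kind(abc);   // now binds to the double, no diagnostic
    }
}

void demo() {
    assert(with_inner_declaration() == "double/int");
    assert(inner_declaration_removed() == "double/double");
}
```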

James Harris

Nov 17, 2021, 11:16:33 AM
On 17/11/2021 10:51, David Brown wrote:
> On 17/11/2021 10:43, James Harris wrote:
>> On 16/11/2021 16:11, David Brown wrote:

...

>>> The cost is that the identifiers (functions and objects) no longer match
>>> the source code name and the assembly level name - you need some kind of
>>> mangling (it could be as simple as $timers$init, if you don't need
>>> mangling for overloads or other purposes).
>>
>> Interesting point. Can you give an example?

...

> If you know your assembler/linker support characters that are not
> allowed in the source language - such as $ - then you can probably make
> a neater and more compact name-mangling scheme.

Where would such namespace mangling be needed? AFAICS only in two cases.

The first is where tools (compiler, linker, debugger, and so on) were
not hierarchy-aware (HA) so that multipart names would need to be
concatenated into a single name.

The second is where another language is not HA and code written in it
needs to refer to a symbol in one in which names /can/ be hierarchical.
References in the converse direction, from an HA language to one which
is not HA, would not be a problem.

So AISI if a language and its toolchain are HA then programs written in
that language would have full access to symbols exported from other
modules whatever language those modules were written in.

For example, if there's an assembly module called fred which exports a
name joe then a language which supported hierarchical namespaces could,
perhaps, refer to that symbol as

fred.joe

AISI all languages and build tools should be able to work with multipart
names. But I suspect that few are!


--
James Harris

David Brown

Nov 17, 2021, 2:32:57 PM
All of what you write here sounds perfectly reasonable - but that last
part is a big issue. You typically have to interact with other tools in
a system - writing /everything/ yourself is not practical. So at some
point you have to switch to "flat" names - and if you still need to
encode hierarchical information there, you need some kind of mangling.

One way to make this easier is to have a "foreign function interface",
or some way of declaring non-mangled names. In C++, you can use extern
"C" to declare that a symbol should be in "plain C" format. It is then
considered as outside any namespaces in regard to the name, has no
mangling, etc. That also means it cannot support overloading. But it
is useful for linking (in either direction) against object code modules
made by other tools.
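A sketch of how that might look in practice: a namespaced function keeps its mangled name, and a thin extern "C" wrapper gives other tools a flat symbol to link against. The names `current` and `timers_get_current` are invented for the example.

```cpp
#include <cassert>

namespace timers {
    int current = 7;                       // hypothetical state
    int get_current() { return current; }  // exported under a mangled name
}

// Flat, unmangled entry point that C or assembly code can link against.
// It cannot be overloaded, and its symbol name ignores any namespace.
extern "C" int timers_get_current() {
    return timers::get_current();
}

void demo() {
    assert(timers_get_current() == 7);
}
```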

Some languages - like C++ and Ada - support function overloading. In
that case, you cannot have a simple HA identifier scheme such as you
proposed, as the mangled names also have to encode the parameter types.
You could have a more advanced method and keep it within tools
specifically written to handle them.

But the point of using name mangling schemes here is that your ordinary
assembler, linker, debugger, etc., can work as though it were HA by
simply encoding the hierarchy within mangled symbol names. Now all you
need from these tools is that they can work with long symbol names (as
they can get very long) - everything else stays exactly the same.
Although name mangling may seem like a hack and a workaround at first,
it is in fact quite a convenient and elegant solution.

(Think of it a little like using Base64 encoding to be able to send
binary files via text-only email. It may look a mess when you try to
read it, but it simply and conveniently solves the problem without
needing to re-do the entire email system.)

Dmitry A. Kazakov

Nov 17, 2021, 2:58:16 PM
On 2021-11-17 20:32, David Brown wrote:

> Some languages - like C++ and Ada - support function overloading. In
> that case, you cannot have a simple HA identifier scheme such as you
> proposed as the mangled names also have to encode the parameter types.

There are also non-functions like constructors/destructors and utility
subprograms created by the compiler, e.g. instances of generic/template
bodies. GCC generates a huge amount of such stuff when creating a
dynamic library.

luserdroog

Nov 20, 2021, 10:19:14 PM
On Tuesday, November 16, 2021 at 10:11:30 AM UTC-6, David Brown wrote:
[the thing that was requested]

Well, dang that does seem mighty useful. Of my various language projects
the only one where I'm truly designing it (albeit backburnered) is `olmec`,
my APL variant. But I may (or may not) have a problem with incorporating
namespaces.

For starters, it's APL. Arrays, Characters (21bit unicode), Integers, Multiprecision
Floating Point, infix functions and operators (or verbs and adverbs). But here's
the weird part:

Any contiguous string of non-numeric, non-whitespace, non-bracket characters
is an <identifier-string> which then gets peeled apart by scanning left-to-right
for the longest defined prefix.

This enables extremely stupid golfing tricks, like making the letter 'b' be the
addition function and then the expression 'abc' becomes an addition if
'a' and 'c' are suitably defined.
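The longest-defined-prefix scan described above can be sketched in C++ roughly as follows. This is not olmec's actual code: `peel` is a hypothetical name, the sketch works on plain bytes rather than Unicode, and the handling of an undefined remainder is an assumption.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Splits `text` by repeatedly taking the longest prefix found in `defined`.
// Assumption (not from the post): if no prefix is defined, the rest of the
// string is returned as one unresolved token.
std::vector<std::string> peel(const std::string& text,
                              const std::set<std::string>& defined) {
    std::vector<std::string> tokens;
    std::size_t pos = 0;
    while (pos < text.size()) {
        std::size_t len = text.size() - pos;
        while (len > 0 && !defined.count(text.substr(pos, len))) --len;
        if (len == 0) {                    // nothing matched
            tokens.push_back(text.substr(pos));
            break;
        }
        tokens.push_back(text.substr(pos, len));
        pos += len;
    }
    return tokens;
}

void demo() {
    const std::set<std::string> single = {"a", "b", "c"};
    const std::vector<std::string> three = {"a", "b", "c"};
    assert(peel("abc", single) == three);      // 'abc' becomes a b c

    const std::set<std::string> with_ab = {"a", "b", "c", "ab"};
    const std::vector<std::string> two = {"ab", "c"};
    assert(peel("abc", with_ab) == two);       // defining 'ab' changes the parse
}
```

The second check shows why this interacts with namespaces: adding one definition can change how an existing identifier string is split.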

So I kind of have long identifiers and "free" scope-resolution operators.
You can just pick any crazy character you like to be a separator and define
a long name of whatever symbols.

The piece I'm missing then, is a "using" directive. I suppose this would be
scoped to whole functions, because I don't really have any kind of <statement>
syntax. It's just <expression><expression><expression>...<End-of-function>.
Returns are either by Fortran-style copy from a named variable or the value
of the last expression if you didn't specify a return variable.

I don't even have "if(){}". So far, you have to build it out of lambdas if you want
to do that. Maybe that should be on the high priority list, too.

luserdroog

Nov 20, 2021, 10:26:07 PM
I take that back. Maybe I do have something like an "if". But it's weird.
I seem to recall implementing the syntax from Ken Iverson's "tool of thought"
Turing Award paper, which is like

(true expression):(condition):(false expression)

with 2 embedded colons separating the parts. I forgot about that part.
I suppose it doesn't affect namespacing except that colons are taken.
So I can't use colons.

luserdroog

Nov 20, 2021, 10:48:06 PM
On Saturday, November 20, 2021 at 9:19:14 PM UTC-6, luserdroog wrote:
> On Tuesday, November 16, 2021 at 10:11:30 AM UTC-6, David Brown wrote:
> [the thing that was requested]
[snip]
> The piece I'm missing then, is a "using" directive. I suppose this would be
> scoped to whole functions, because I don't really have any kind of <statement>
> syntax. It's just <expression><expression><expression>...<End-of-function>.
> Returns are either by Fortran-style copy from a named variable or the value
> of the last expression if you didn't a specify a return variable.
[snip]

Into the meat of the situation...

There are 2 basic ways to define a function, short or long. The long way has
a bunch of variations.

For the short way, I don't think "using" directives are practical. It's supposed
to be for one-liners. I guess I can have an expression separator like semicolon
maybe (old APL literature seems to like <diamond>), but not urgent.

The long way is IBM APL style DEL definitions. You start a line with DEL
(the downward pointing big triangle) and then a preamble or prototype
on the rest of the line and hit enter. BTW you're typing all this into the
interactive editor bc I insist on handling all the unicode internally and maybe
throwing in extra characters from the VT220 graphics sets.

Then you're in a separate mode, entering lines (expressions) for the function
to execute sequentially. The del-function-definition mode ends when you
give it a single DEL by itself on a line.

So the question is, stuff it (the "using" directive) into the preamble, which
already specifies

(<optional return variable><left arrow>)(optional left arg)(function)(optional right arg);(optional local variables)

Or make it a "statement" or special expression, a directive.

I don't know. Maybe I haven't given enough information for help, but that's
what I got.

luserdroog

Nov 21, 2021, 12:00:49 AM
On Saturday, November 20, 2021 at 9:19:14 PM UTC-6, luserdroog wrote:
> On Tuesday, November 16, 2021 at 10:11:30 AM UTC-6, David Brown wrote:
> [the thing that was requested]
>
> Well, dang that does seem mighty useful. Of my various language projects
> the only one where I'm truly designing it (albeit backburnered) is `olmec`,
> my APL variant. But I may (or may not) have a problem with incorporating
> namespaces.
>
> For starters, it's APL. Arrays, Characters (21bit unicode), Integers, Multiprecision
> Floating Point, infix functions and operators (or verbs and adverbs).

Oh yeah, and there's a Unicode symbol table type. And a dynamic chain of these
attached to functions to provide variable scoping. I have somewhat "canonical"
versions of the implementation of each of these datatypes posted at
codereview.stackexchange.com. I'm pretty proud of the multidimensional array
stuff, but still humbled by the critique. I still make a lot of those same mistakes.

> But here's
> the weird part:
>
> Any contiguous string of non-numeric, non-whitespace, non-bracket characters
> is an <identifier-string> which then gets peeled apart by scanning left-to-right
> for the longest defined prefix.
>

Unless you're inside of a (long form) function definition, then you're typing
expressions at the console to be evaluated. So there's just a single global
symbol table holding all the definitions you can use directly at the prompt.

> This enables extremely stupid golfing tricks, like making the letter 'b' be the

luserdroog

Nov 21, 2021, 12:09:32 AM
On Saturday, November 20, 2021 at 9:48:06 PM UTC-6, luserdroog wrote:

> The long way is IBM APL style DEL definitions. You start a line with DEL
> (the downward pointing big triangle) and then a preamble or prototype
> on the rest of the line and hit enter. BTW you're typing all this into the
> interactive editor bc I insist on handling all the unicode internally and maybe
> throwing in extra characters from the VT220 graphics sets.
>
> Then you're in a separate mode, entering lines (expressions) for the function
> to execute sequentially. The del-function-definition mode ends when you
> give it a single DEL by itself on a line.
>
> So the question is, stuff it (the "using" directive) into the preamble, which
> already specifies
>
> (<optional return variable><left arrow>)(optional left arg)(function)(optional right arg);(optional local variables)

Incidentally this bit of weirdness still seems pretty awesome to me,
but I'm not entirely sure how practical it actually is. You can specify
a "niladic" function that takes no arguments. But then you have to
have a very close understanding of the execution behavior of
expressions to know exactly where in there the thing will get called.

It also lets you define monadic or dyadic (ie. unary or binary) functions
by filling in the right (+left) arguments, of course. But you also
can define "sinister" verbs. These have a left argument but no right argument.
So you can make a dang ol' factorial n! that fercrissakes looks like
for real-ass factorial notation. It screws up the whole strict right-to-left
deal that APL does, but it's awesome that's why. It's weird and that's the
dilly-o.

James Harris

Nov 21, 2021, 5:11:10 AM
Yes, as you surmise below, it's option 2 which I am saying is a bad idea.

>
>> * There could be a stable situation with no name conflicts, but then
>> an upgrade to an imported module which is not directly under the
>> programmer's control could introduce a new name which /does/ conflict.
>
> You are talking about #2. There are two approaches to that:
>
> 2.1. Allow direct visiblity so long there is no conflicts inside the
> scope. E.g. let both Namespace_1 and Namespace_2 have Foo, but the code
> does not refer to Foo, then is OK.
>
> 2.2. Disallow it even if there is no conflicts inside the scope. This
> would prevent surprises so long the namespaces are stable. Of course a
> modification of a namespace can break an existing program. However this
> is not a problem from the SW development POV, because you must review
> the client code anyway if you change any namespace it uses.

I would prohibit option 2 (making imported names directly visible)
altogether (although I would allow names to be searched for in different
scopes - but that's getting into another topic).

If the language allows hierarchical names and aliases as in

  namedef T = long.complex.path.timers

where T becomes an alias of the name on the RHS then a programmer could
refer to any name A within timers as

  T.A

That's no hardship. Further, the name T would be defined locally so would
not have any surprising conflicts.
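
For comparison only, Python's "import ... as" behaves much like the alias I have in mind. A minimal sketch, using the standard library's xml.etree.ElementTree purely as a stand-in for long.complex.path.timers (Python is not the language under discussion):

```python
# Bind a short local alias to a long hierarchical module name.
# Only the name T is added to the current scope.
import xml.etree.ElementTree as T

# Names inside the module are reached through the alias, T.A style.
root = T.Element("timers")
child = T.SubElement(root, "enable")

print(root.tag, child.tag)
```

The alias is chosen in the client file, so any clash with T would be visible right there.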



For full disclosure (!) I would like to go slightly further and allow
imported names to follow a prefix - i.e. without the dot. For example,
say the syntax allowed something like

  namedef P* = long.complex.path.timers.*

where P became a prefix to all the names in timers. Then timers.enable
could be invoked as

  Penable()

In a sense, P would replace "timers." including the dot.

In such a case, no identifiers in the local scope would be able to begin
with P.
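
To make the idea concrete, here is a rough sketch in Python of what such a prefix import might do. The helper import_with_prefix, the timers stand-in and the use of a dict as "the local scope" are all hypothetical, invented for illustration:

```python
import types

# Hypothetical stand-in for long.complex.path.timers.
timers = types.SimpleNamespace(
    enable=lambda: "enabled",
    get_current=lambda: 42,
)

def import_with_prefix(namespace, prefix, scope):
    """Copy every public name of `namespace` into `scope` as prefix + name."""
    # Refuse if the scope already holds a name starting with the prefix.
    clashes = [name for name in scope if name.startswith(prefix)]
    if clashes:
        raise NameError(f"prefix {prefix!r} clashes with {clashes}")
    for name, value in vars(namespace).items():
        if not name.startswith("_"):
            scope[prefix + name] = value

scope = {}
import_with_prefix(timers, "P", scope)
print(sorted(scope))        # ['Penable', 'Pget_current']
print(scope["Penable"]())   # enabled
```

The clash check mirrors the rule that no local identifier may begin with the prefix; a real compiler would presumably enforce that at declaration time rather than at import time.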

That latter approach (allowing a prefix without the dot) is far from
necessary. It's a convenience to the programmer. But AISI there are
cases where code which has a dot before every imported name would be
annoying to work with. So I think it's worth including.

>
>> Therefore namespace support needs to be more extensive and more
>> restrictive, IMO. For example,
>>
>>    namespace T = lib.org.timers
>>
>> then
>>
>>    T.enable()
>>
>> In that, T would be a name in the local scope that the programmer
>> could control and ensure it doesn't conflict with any other names. The
>> import could be
>>
>>    namespace timers = lib.org.timers
>>
>> Then the name 'timers' would be local. The programmer would still be
>> /required/ to specify the namespace name as in
>>
>>    timers.enable()
>
> These are "fully qualified names," i.e. #1. Note that T and Timers are
> simple names and may still conflict with some other T's and Timers'. You
> have further choices:
>
> A. You allow overloading of Timers. Let only one of the Timers have Enable:
>
>    Timers.Enable
>
> This is OK, if another Timers has no Enable

Overloading is a big topic but I think there would have to be only one
name such as Timers in each scope. Remember that such a name would be
local to the source file that the programmer is editing. Just as a C
programmer should not declare

int i, i;

it ought to be wrong to declare Timers twice in the same scope.

>
> B. You do not allow it regardless.
>
> Ada has both #1 and #2. It uses #2.1(A). The predefined package Standard
> is the root namespace. So, qualified names may use it to disambiguate,
> e.g. like
>
>    Standard.Timers.Enable
>
> You can mix #1 and #2. I.e. if a fully qualified name is Standard.A.B.C,
> then A.B.C is OK because Standard is directly visible #2. If you make
> Standard.A directly visible, then B.C would be also OK. If you make
> Standard.A.B directly visible, then you could use just C.

Exactly.


--
James Harris

Dmitry A. Kazakov

Nov 21, 2021, 6:06:07 AM

On 2021-11-21 11:11, James Harris wrote:

> I would prohibit option 2 (making imported names directly visible)
> altogether (although I would allow names to be searched for in different
> scopes - but that's getting into another topic).

You cannot, otherwise you will have to name all local scopes:

   void Foo ()
   {
      int X;

      X := 1; // Illegal!
   }

Should be

   void Foo ()
   {
      Local : {
         int X;

         Root.Foo.Local.X := 1; // OK
      }
   }

Having different sets of rules for naming is an awful idea.

A related issue is hiding:

   {
      int X;
      {
         int X; // Hides directly visible X

         X := 1; // The nested X
      }
   }

Children hide, siblings conflict? (:-))
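
For readers who want to try the hiding case, a small sketch in Python (used here only as a convenient illustration, it is not the language being designed). Note that Python does not diagnose the sibling case at all; a second binding in the same scope simply replaces the first:

```python
def outer():
    x = "outer"

    def inner():
        x = "inner"   # a new local x that hides the enclosing one
        return x

    # The enclosing x is untouched by the assignment inside inner().
    return inner(), x

print(outer())   # ('inner', 'outer')
```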

> If the language allows hierarchical names and aliases as in
>
>   namedef T = long.complex.path.timers
>
> where T becomes an alias of the name on the RHS then a programmer could
> refer to any name A within timers as
>
>   T.A

Which is worse than the original problem. Renaming is a dangerous thing
and you are asking for overusing it because otherwise the code would be
unreadable with extremely long names.

> That's no hardship. Further, the name T would be defined locally so would
> not have any surprising conflicts.

It is. Basically instead of resolving the issue you burden the
programmer with renaming and invoke havoc in maintenance because who
would manage and document all these renamings, if not the language?

A problem related to visibility and renaming is transitivity of using.

Consider B that uses or renames things in A. If C uses B will it see the
uses and renamings of B?

In Ada it will see no uses but all renamings. Before you say it is good,
it is not straightforward, because there is no way to maintain derived
namespaces like B. If B uses a lot of other namespaces there is no
simple way to shape the effect using B will have in C. Again, renaming
is not a solution in real-world projects.

> For full disclosure (!) I would like to go slightly further and allow
> imported names to follow a prefix - i.e. without the dot. For example,
> say the syntax allowed something like
>
>   namedef P* = long.complex.path.timers.*
>
> where P became a prefix to all the names in timers. Then timers.enable
> could be invoked as
>
>   Penable()
>
> In a sense, P would replace "timers." including the dot.

Abhorrent.

> Overloading is a big topic but I think there would have to be only one
> name such as Timers in each scope. Remember that such a name would be
> local to the source file that the programmer is editing. Just as a C
> programmer should not declare
>
>   int i, i;
>
> it ought to be wrong to declare Timers twice in the same scope.

Except that you cannot control it unless you rename each and every
entity of each namespace manually.

The problem is with means supporting manipulation of namespaces as a
*whole*, like using does. These pose problems. Individual renamings are
not a solution; they make things much worse.

James Harris

Nov 21, 2021, 8:55:52 AM

On 21/11/2021 11:05, Dmitry A. Kazakov wrote:
> On 2021-11-21 11:11, James Harris wrote:
>
>> I would prohibit option 2 (making imported names directly visible)
>> altogether (although I would allow names to be searched for in
>> different scopes - but that's getting into another topic).

Throughout your post you seem to be talking about something other than
what I proposed. I cannot work out what model you have in mind but I'll
reply as best I can.

>
> You cannot, otherwise you will have to name all local scopes:
>
>    void Foo ()
>    {
>       int X;
>
>       X := 1; // Illegal!
>    }

I didn't suggest that. In your example your reference to X would use the
X declared above it.

...

>> If the language allows hierarchical names and aliases as in
>>
>>    namedef T = long.complex.path.timers
>>
>> where T becomes an alias of the name on the RHS then a programmer
>> could refer to any name A within timers as
>>
>>    T.A
>
> Which is worse than the original problem. Renaming is a dangerous thing
> and you are asking for overusing it because otherwise the code would be
> unreadable with extremely long names.

The name was long to illustrate a point.

>
>> That's no hardship. Further, the name T would be defined locally so would
>> not have any surprising conflicts.
>
> It is. Basically instead of resolving the issue you burden the
> programmer with renaming and invoke havoc in maintenance because who
> would manage and document all these renamings, if not the language?

There's no renaming. If you mean aliasing then that depends on the
details of the scheme. Consider

  int Outer;
  void F(void)
  {
    int Inner;
    X
  }

Under 'nominal' schemes that we are all familiar with, at point X the
names Inner and Outer would be visible. If we stick to that and add an
alias such as

  int Outer;
  alias External = A.B.C.D.E;  <=== line added to the prior example
  void F(void)
  {
    int Inner;
    X
  }

Are you saying that at point X name External should not be visible? If
so, why should it not?

>
> A problem related to visibility and renaming is transitivity of using.
>
> Consider B that uses or renames things in A. If C uses B will it see the
> uses and renamings of B?

Let me check what you mean. Say we were to have

  module A
    public name A1
  endmodule

  module B
    public alias B1 = A.A1
  endmodule

Are you saying that another module, C, should not refer to the B.B1 name?

>
> In Ada it will see no uses but all renamings. Before you say it is good,
> it is not straightforward, because there is no way to maintain derived
> namespaces like B. If B uses a lot of other namespaces there is no
> simple way to shape the effect using B will have in C. Again, renaming
> is not a solution in real-world projects.

Say module B presents a certain interface. Why should it not collect
together and present names from other modules?

>
>> For full disclosure (!) I would like to go slightly further and allow
>> imported names to follow a prefix - i.e. without the dot. For example,
>> say the syntax allowed something like
>>
>>    namedef P* = long.complex.path.timers.*
>>
>> where P became a prefix to all the names in timers. Then timers.enable
>> could be invoked as
>>
>>    Penable()
>>
>> In a sense, P would replace "timers." including the dot.
>
> Abhorrent.

:-) Do you have a semantic objection to it or is it purely that you
don't think a programmer should be offered such a facility - perhaps
because you think it's too close to having names require the dot?

Consider a case where lots of names are imported. For example, say an
external module had the tens or hundreds of mnemonics for an assembly
language. In the context of this discussion one could consider that
there are three choices:

1. Import all mnemonics into the current scope - which would cause the
problems mentioned in my prior post.

2. Give them all a common name as with the x in

x.add
x.and
x.call
etc

3. Use a prefix as in the x in

xadd
xand
xcall

For sure, you could say that the latter is not necessary. But try
reading aloud a lot of code which contains loads of dotted names and you
might come to see the suggestion differently.

I am interested in your opinion but I cannot see why you would object to
that. It would be clear and manifest in the importing module that the
prefix x or the alias x had been defined.

>
>> Overloading is a big topic but I think there would have to be only one
>> name such as Timers in each scope. Remember that such a name would be
>> local to the source file that the programmer is editing. Just as a C
>> programmer should not declare
>>
>>    int i, i;
>>
>> it ought to be wrong to declare Timers twice in the same scope.
>
> Except that you cannot control it unless you rename each and every
> entity of each namespace manually.

It's typically only external namespaces which would have to be given an
alias. Within a single module normal scope resolution could apply -
although I may take a different approach which I'll not go into just
now as it would muddy the waters.

>
> The problem is with means supporting manipulation of namespaces as a
> *whole*, like using does. These pose problems. Individual renamings are
> not a solution; they make things much worse.
>

To be clear, what would normally be aliased would be entire namespaces,
not individual symbols (although they could be aliased if required).


--
James Harris

James Harris

Nov 21, 2021, 9:42:56 AM

On 17/11/2021 19:32, David Brown wrote:
> On 17/11/2021 17:16, James Harris wrote:

...

>> For example, if there's an assembly module called fred which exports a
>> name joe then a language which supported hierarchical namespaces could,
>> perhaps, refer to that symbol as
>>
>>   fred.joe
>>
>> AISI all languages and build tools should be able to work with multipart
>> names. But I suspect that few are!
>>
>
> All of what you write here sounds perfectly reasonable - but that last
> part is a big issue. You typically have to interact with other tools in
> a system - writing /everything/ yourself is not practical. So at some
> point you have to switch to "flat" names - and if you still need to
> encode hierarchical information there, you need some kind of mangling.

Agreed, although I'd add that once the tools are available they should
be able to work with any number of languages - as long as they agree on
the file and name formats.


--
James Harris

Dmitry A. Kazakov

Nov 21, 2021, 10:06:58 AM

On 2021-11-21 14:55, James Harris wrote:
> On 21/11/2021 11:05, Dmitry A. Kazakov wrote:
>> On 2021-11-21 11:11, James Harris wrote:
>>
>>> I would prohibit option 2 (making imported names directly visible)
>>> altogether (although I would allow names to be searched for in
>>> different scopes - but that's getting into another topic).
>
> Throughout your post you seem to be talking about something other than
> what I proposed. I cannot work out what model you have in mind but I'll
> reply as best I can.
>
>>
>> You cannot, otherwise you will have to name all local scopes:
>>
>>     void Foo ()
>>     {
>>        int X;
>>
>>        X := 1; // Illegal!
>>     }
>
> I didn't suggest that. In your example your reference to X would use the
> X declared above it.

You said no directly visible names. Now you have directly visible X,
instead of Foo.X.

>> Which is worse than the original problem. Renaming is a dangerous
>> thing and you are asking for overusing it because otherwise the code
>> would be unreadable with extremely long names.
>
> The name was long to illustrate a point.

Most names are, because of deep nesting.

>>> That's no hardship. Further, the name T would be defined locally so
>>> would not have any surprising conflicts.
>>
>> It is. Basically instead of resolving the issue you burden the
>> programmer with renaming and invoke havoc in maintenance because who
>> would manage and document all these renamings, if not the language?
>
> There's no renaming.

OK, call it alias name, if you do not like the term "renaming."

> If you mean aliasing then that depends on the
> details of the scheme. Consider
>
>   int Outer;
>   void F(void)
>   {
>     int Inner;
>     X
>   }
>
> Under 'nominal' schemes that we are all familiar with, at point X the
> names Inner and Outer would be visible.

Why different rules for nesting?

> If we stick to that and add an
> alias such as
>
>   int Outer;
>   alias External = A.B.C.D.E;  <=== line added to the prior example
>   void F(void)
>   {
>     int Inner;
>     X
>   }
>
> Are you saying that at point X name External should not be visible? If
> so, why should it not?

Not only it. Outer and Inner must not be visible either. Inner is not a
proper name. They actually are:

Root.Outer
Root.External
Root.F.Inner

Again, provided you hold to your proposal not to allow direct
visibility. Otherwise, you have a problem of irregularity of the rules.

Since Ada has such rules, I can tell you that this is quite annoying and
the source of a rift between people favoring fully qualified names vs
ones who prefer the direct names. Note that the choice directly
influences the design, since you must name things differently if you are
a true Lilliputian vs a treacherous Blefuscudian... (:-))

Anyway, the point is that direct visibility exists for a reason. That
reason is valid for both nested and imported scopes.

>> A problem related to visibility and renaming is transitivity of using.
>>
>> Consider B that uses or renames things in A. If C uses B will it see
>> the uses and renamings of B?
>
> Let me check what you mean. Say we were to have
>
>   module A
>     public name A1
>   endmodule
>
>   module B
>     public alias B1 = A.A1
>   endmodule
>
> Are you saying that another module, C, should not refer to the B.B1 name?

No. I am saying this:

   module A
     public name A1
   endmodule

   module B
     use A
             -- A1 is directly visible here?
   endmodule

   module C
     use B
             -- Is A1 directly visible here?
   endmodule

>> In Ada it will see no uses but all renamings. Before you say it is
>> good, it is not straightforward, because there is no way to maintain
>> derived namespaces like B. If B uses a lot of other namespaces there
>> is no simple way to shape the effect using B will have in C. Again,
>> renaming is not a solution in real-world projects.
>
> Say module B presents a certain interface. Why should it not collect
> together and present names from other modules?

Because "use" is non-transitive in Ada. [You want to prohibit "use"
altogether.]

Arguably one could have

1. non-transitive alias/rename (stays inside the module)

2. transitive alias/rename (propagates upon use)

3. non-transitive use

A non-transitive use creates a non-transitive alias for each name of the
module *except* the non-transitive aliases.

4. transitive use

A transitive use creates a transitive alias for *each* name of the module
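
As a rough illustration of cases 3 and 4 in an existing language, Python's star-import happens to be transitive by default and becomes non-transitive when the module lists its interface in __all__. The sketch below builds throwaway in-memory modules; the make_module helper and the demo_* module names are invented for the example:

```python
import sys
import types

def make_module(name, source):
    """Create an in-memory module so the sketch needs no files on disk."""
    module = types.ModuleType(name)
    sys.modules[name] = module          # make it importable by name
    exec(source, module.__dict__)
    return module

make_module("demo_a", "A1 = 'defined in A'\n")

# B uses A with no restriction: its star-import re-exports A1 to B's users.
make_module("demo_b_transitive", "from demo_a import *\nB1 = 'defined in B'\n")

# B uses A but lists its own interface in __all__: A1 stays inside B.
make_module(
    "demo_b_closed",
    "from demo_a import *\n__all__ = ['B1']\nB1 = 'defined in B'\n",
)

c_open = make_module("demo_c_open", "from demo_b_transitive import *\n")
c_closed = make_module("demo_c_closed", "from demo_b_closed import *\n")

print(hasattr(c_open, "A1"), hasattr(c_closed, "A1"))   # True False
```

Note that the choice is made on the module side (in B), not by the client C.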

>>> For full disclosure (!) I would like to go slightly further and allow
>>> imported names to follow a prefix - i.e. without the dot. For
>>> example, say the syntax allowed something like
>>>
>>>    namedef P* = long.complex.path.timers.*
>>>
>>> where P became a prefix to all the names in timers. Then
>>> timers.enable could be invoked as
>>>
>>>    Penable()
>>>
>>> In a sense, P would replace "timers." including the dot.
>>
>> Abhorrent.
>
> :-) Do you have a semantic objection to it or is it purely that you
> don't think a programmer should be offered such a facility - perhaps
> because you think it's too close to having names require the dot?

No. One could err on ethics, but not on aesthetics. (:-)) It is abhorrent!

> Consider a case where lots of names are imported. For example, say an
> external module had the tens or hundreds of mnemonics for an assembly
> language. In the context of this discussion one could consider that
> there are three choices:
>
> 1. Import all mnemonics into the current scope - which would cause the
> problems mentioned in my prior post.
>
> 2. Give them all a common name as with the x in
>
>   x.add
>   x.and
>   x.call
>   etc
>
> 3. Use a prefix as in the x in
>
>   xadd
>   xand
>   xcall

... and have all alleged problems.

The proper solution is for the language to provide means to maintain
namespaces at the *module* side. The designer of the module chooses the
way mnemonics are to be used in the client code. Note, not otherwise. The
client code may only apply some emergency exceptional rules to resolve
problems, but there should be none with *properly designed* modules.

>> Except that you cannot control it unless you rename each and every
>> entity of each namespace manually.
>
> It's typically only external namespaces which would have to be given an
> alias. Within a single module normal scope resolution could apply -
> although I may take a different approach which I'll not go into just
> now as it would muddy the waters.

The ratio between external namespaces and the nested namespaces is kind
of 1000 to 1. You are barking up the wrong tree.

>> The problem is with means supporting manipulation of namespaces as a
>> *whole*, like using does. These pose problems. Individual renamings
>> are not a solution; they make things much worse.
>
> To be clear, what would normally be aliased would be entire namespaces,
> not individual symbols (although they could be aliased if required).

That changes little as namespaces (modules, classes etc) are deeply
nested themselves. If you think that the problem is "the last mile," you
are wrong.

James Harris

Nov 21, 2021, 11:43:19 AM

On 21/11/2021 15:06, Dmitry A. Kazakov wrote:
> On 2021-11-21 14:55, James Harris wrote:
>> On 21/11/2021 11:05, Dmitry A. Kazakov wrote:
>>> On 2021-11-21 11:11, James Harris wrote:
>>>
>>>> I would prohibit option 2 (making imported names directly visible)
>>>> altogether (although I would allow names to be searched for in
>>>> different scopes - but that's getting into another topic).
>>
>> Throughout your post you seem to be talking about something other than
>> what I proposed. I cannot work out what model you have in mind but
>> I'll reply as best I can.
>>
>>>
>>> You cannot, otherwise you will have to name all local scopes:
>>>
>>>     void Foo ()
>>>     {
>>>        int X;
>>>
>>>        X := 1; // Illegal!
>>>     }
>>
>> I didn't suggest that. In your example your reference to X would use
>> the X declared above it.
>
> You said no directly visible names. Now you have directly visible X,
> instead of Foo.X.

I was talking about /imported/ names. I said I would prohibit "making
*imported* names directly visible". It's in the text, above!

Consider

  namespace Bases
    int Oct = 8
    int Dec = 10
    int Hex = 16
  namespace-end

  enum Months(Nov, Dec, Jan)

Although the name Dec appears twice the two could never conflict because
they would /always/ require qualification. For example,

Bases.Dec
Months.Dec

If they were in external modules they could be imported as in

alias B = external_1.Bases
alias M = external_2.Months

then the names would be B.Dec and M.Dec. IOW the two could still not
conflict.

Note that if a programmer wrote

alias Dec = Bases.Dec
alias Dec = Months.Dec

then there would be a conflict but it would be clear in the client code.
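
The same situation can be tried out in Python, as an analogy only. Bases and Months below are stand-ins built with the standard library, and the month numbers are my own choice for the example:

```python
import enum
import types

# Hypothetical stand-ins for the two namespaces in the example.
Bases = types.SimpleNamespace(Oct=8, Dec=10, Hex=16)

class Months(enum.Enum):
    Nov = 11
    Dec = 12
    Jan = 1

# Both define Dec, yet each use is qualified, so there is nothing to confuse.
print(Bases.Dec)          # 10
print(Months.Dec.value)   # 12

# Short local aliases keep the qualification while shortening it.
B, M = Bases, Months
print(B.Dec, M.Dec.name)  # 10 Dec
```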

...

>>    int Outer;
>>    void F(void)
>>    {
>>      int Inner;
>>      X
>>    }
>>
>> Under 'nominal' schemes that we are all familiar with, at point X the
>> names Inner and Outer would be visible.
>
> Why different rules for nesting?

I am not suggesting different rules. See the next comment, below.

>
>> If we stick to that and add an alias such as
>>
>>    int Outer;
>>    alias External = A.B.C.D.E;  <=== line added to the prior example
>>    void F(void)
>>    {
>>      int Inner;
>>      X
>>    }
>>
>> Are you saying that at point X name External should not be visible? If
>> so, why should it not?
>
> Not only it. Outer and Inner must not be visible either. Inner is not a
> proper name. They actually are:
>
>    Root.Outer
>    Root.External
>    Root.F.Inner
>
> Again, provided you hold to your proposal not to allow direct
> visibility. Otherwise, you have a problem of irregularity of the rules.

As above, that wasn't what I proposed. In the example, at point X the
following would all be accessible

Inner
Outer
External

In fact, with normal scope resolution they wouldn't just be accessible
but would all be accessible /as plain names/.

Feasibly, one could decide to make Inner directly accessible but not the
other two, although that's probably a separate discussion. Either way,
the key point is that External and Outer would be treated in the same
way. Both names appear in the file outside a function and both would be
at file scope.


>
> Since Ada has such rules, I can tell you that this is quite annoying and
> the source of a rift between people favoring fully qualified names vs
> ones who prefer the direct names. Note that the choice directly
> influences the design, since you must name things differently if you are
> a true Lilliputian vs a treacherous Blefuscudian... (:-))

For Blefuscudian my newsreader's spell checker suggests "Undiscussable"!
:-)

>
> Anyway, the point is that direct visibility exists for a reason. That
> reason is valid for both nested and imported scopes.

Sure. As above, if Outer is directly visible then External would be too.

And in case it's not clear, External would allow access to any names
within it. For example,

  alias External = A.B.C.D.E;
  void F(void)
  {
    int inner = External.baseval;
  }

where baseval is a name within E.

...

> No. I am saying this:
>
>    module A
>      public name A1
>    endmodule
>
>    module B
>      use A
>              -- A1 is directly visible here?
>    endmodule
>
>    module C
>      use B
>              -- Is A1 directly visible here?
>    endmodule

OK. I wouldn't allow the "use" statements so the issue would not apply.

>
>>> In Ada it will see no uses but all renamings. Before you say it is
>>> good, it is not straightforward, because there is no way to maintain
>>> derived namespaces like B. If B uses a lot of other namespaces there
>>> is no simple way to shape the effect using B will have in C. Again,
>>> renaming is not a solution in real-world projects.
>>
>> Say module B presents a certain interface. Why should it not collect
>> together and present names from other modules?
>
> Because "use" is non-transitive in Ada. [You want to prohibit "use"
> altogether.]

Yes.

...

>> Consider a case where lots of names are imported. For example, say an
>> external module had the tens or hundreds of mnemonics for an assembly
>> language. In the context of this discussion one could consider that
>> there are three choices:
>>
>> 1. Import all mnemonics into the current scope - which would cause the
>> problems mentioned in my prior post.
>>
>> 2. Give them all a common name as with the x in
>>
>>    x.add
>>    x.and
>>    x.call
>>    etc
>>
>> 3. Use a prefix as in the x in
>>
>>    xadd
>>    xand
>>    xcall
>
> ... and have all alleged problems.
>
> The proper solution is for the language to provide means to maintain
> namespaces at the *module* side. The designer of the module chooses the
> way mnemonics are to be used in the client code. Note, not otherwise.

I accept that that is your view but I cannot agree with it. AISI it's
important for the client to remain in control. That is so that such
choices would be visible to any programmer reading the code.

If, by contrast, such choices were made in the module providing the
service then the propagation of names would be invisible to the client
code.

> The
> client code may only apply some emergency exceptional rules to resolve
> problems, but there should be none with *properly designed* modules.


--
James Harris

James Harris

Nov 21, 2021, 12:19:49 PM

AISI there are two issues:
1. Definition. How to declare a name within a certain scope.
2. Searching. The order in which to search scopes for a name.

>
> With 'using', it could just introduce an extra scope beyond the ones
> that are already known.
>
> With two 'using's in two nested scopes, it gets a bit more involved:
>
>   {  using ns1;
>      { using ns2;
>          enable();
>
> If 'enable' exists in both ns1 and ns2, will there be a conflict, or
> will it use ns2?

That's the 'searching' issue: how to find a plain name. For me the jury
is still out on that. IME most languages search outwards one scope at a
time until they find a matching name. Is that the best approach? I don't
know.

Incidentally, consider

A.B

How would that find the relevant name? One option is for it to search
for the name A in scopes in some defined order and then require B to be
in the first A that it comes across.

Another option is to search for the first A in which there is a B.

ATM I prefer the first option - only the most-significant name would be
searched for - but I'm keeping it as an open question.
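
To pin down the difference between the two options, here is a small model in Python. Scopes are plain dicts listed innermost first and namespaces are nested dicts; that representation and both function names are assumptions made for the sketch:

```python
def resolve_first_a(scopes, a, b):
    """Option 1: take the first A found; B must be in it."""
    for scope in scopes:                 # innermost scope first
        if a in scope:
            namespace = scope[a]
            if b not in namespace:
                raise NameError(f"{a}.{b}: first {a} has no {b}")
            return namespace[b]
    raise NameError(f"{a} not found")

def resolve_first_a_with_b(scopes, a, b):
    """Option 2: skip any A lacking B and keep searching outwards."""
    for scope in scopes:
        if a in scope and b in scope[a]:
            return scope[a][b]
    raise NameError(f"{a}.{b} not found")

inner = {"A": {"C": "inner C"}}
outer = {"A": {"B": "outer B"}}
scopes = [inner, outer]

print(resolve_first_a_with_b(scopes, "A", "B"))   # outer B
try:
    resolve_first_a(scopes, "A", "B")
except NameError as err:
    print("option 1 rejects it:", err)
```

With option 2 the meaning of A.B can change silently when someone adds a B to the inner A, which is one reason I lean towards option 1.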

>
> I don't have a 'using' statement. My namespaces are only linked to
> module names. And all modules in a program (now, a subprogram) are
> visible at the same time. Then there can be a conflict:
>
>     import A          # both export 'enable'
>     import B
>
>     proc enable = {}
>
>     enable()
>
> This is OK; 'enable' is found within the normal set of scopes. But
> delete the proc definition, and then there is a conflict.

Good example. As you say below, if a programmer deletes the definition
of an inner name he may not notice that there's still a reference to it.
Then if the same name exists in an outer scope the compiler may not be
able to report a problem.

At the moment I don't know if I want to allow inner names to hide outer
ones. An alternative is to say that inner names must not be the same as
any name in an outer scope but that may not scale well.

Another option is to say that all names have to be unique within a
function or within a file.

A further option once a language supports hierarchical names is to treat
levels of the hierarchy like a file tree and say that names in inner
scopes can hide names in outer scopes but that there's still a way to
refer to a 'parent' scope much as one can refer to a parent directory -
something equivalent to

name - searches scopes in some predefined order
./name - the name must be in the current scope
../name - the name must be in the parent scope
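
A toy model of that in Python, to show the three forms side by side. The Scope class is hypothetical, and I have assumed that the "predefined order" for a plain name is innermost outwards:

```python
class Scope:
    def __init__(self, names, parent=None):
        self.names = names
        self.parent = parent

    def lookup(self, ref):
        """Resolve 'name', './name' or '../name' relative to this scope."""
        if ref.startswith("../"):
            if self.parent is None:
                raise NameError("no parent scope")
            return self.parent._local(ref[3:])
        if ref.startswith("./"):
            return self._local(ref[2:])
        # Plain name: search outwards, one scope at a time.
        scope = self
        while scope is not None:
            if ref in scope.names:
                return scope.names[ref]
            scope = scope.parent
        raise NameError(ref)

    def _local(self, name):
        if name not in self.names:
            raise NameError(name)
        return self.names[name]

outer = Scope({"abc": 1.5, "fmt": "%d"})
inner = Scope({"abc": 7}, parent=outer)

print(inner.lookup("abc"), inner.lookup("./abc"), inner.lookup("../abc"))  # 7 7 1.5
print(inner.lookup("fmt"))   # %d
```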

>
>> * If multiple namespaces are imported in the same way then there's
>> another potential for name conflicts. It's especially bad for names
>> imported from external files.
>
> Yes, there's a conflict. The same as if you declared 'abc' then declared
> another 'abc' in the same scope. Or removed braces so that two 'abc's
> are now in the same block scope.
>
>> * There could be a stable situation with no name conflicts, but then
>> an upgrade to an imported module which is not directly under the
>> programmer's control could introduce a new name which /does/ conflict.
>
> There are all sorts of issues, but most of them already occur with
> nested scopes:
>
>     {
>         double abc;
>         {
>             int abc;
>             printf(fmt, abc);
>
> Here, the 'int abc;' could be deleted or commented out. Now 'abc'
> matches the other definition, with no detectable error.
>

The thing is, if all the names we are talking about are present in the
source file how burdensome would it be for a programmer to have to use
different names in inner scopes? I suspect it would not be too much of a
problem and it would solve a number of problems so it may be worth
considering.
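
A sketch of what such a check might look like, written in Python for convenience. The representation is an assumption of the example: a scope is a list whose items are either declared names (strings) or nested scopes (lists):

```python
def find_shadowing(names, outer_names=frozenset()):
    """Return names declared in a scope that repeat a name of an enclosing scope."""
    declared = {item for item in names if isinstance(item, str)}
    clashes = sorted(declared & outer_names)
    visible = outer_names | declared
    for item in names:
        if isinstance(item, list):
            # Check each nested scope against everything visible around it.
            clashes += find_shadowing(item, visible)
    return clashes

# { double abc; { int abc; ... } }  as in the quoted example
print(find_shadowing(["abc", ["abc", "fmt"]]))   # ['abc']
print(find_shadowing(["abc", ["count"]]))        # []
```

A compiler applying this rule would reject the nested 'abc' outright, so deleting the inner declaration could never silently rebind the reference.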


--
James Harris

Dmitry A. Kazakov

Nov 21, 2021, 12:25:56 PM

On 2021-11-21 17:43, James Harris wrote:

> And in case it's not clear, External would allow access to any names
> within it. For example,
>
>   alias External = A.B.C.D.E;
>   void F(void)
>   {
>     int inner = External.baseval;
>   }
>
> where baseval is a name within E.

Sure, unless it gets hidden due to name conflicts. As I said aliases
solve nothing. E.g. your example is valid in Ada and represents renaming
a package:

package External renames A.B.C.D.E;

It is used rarely and not for the purpose you are talking about. The most
frequent use case is platform-dependent packages. E.g. you would have

   package Real_Time_Extensions renames Win32.A.B.C.E;

and another compilation unit

   package Real_Time_Extensions renames Linux.A.B.C.E.D;

All through the client code you refer to

   with Real_Time_Extensions;

and swap files using the project target. Of course the public interfaces
must be the same.
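
For those who do not use Ada, roughly the same pattern can be sketched in Python, with the selection done at import time instead of by the project target. The alias name path_impl is invented for the example; ntpath and posixpath are real standard-library modules with matching interfaces:

```python
import sys

# One short, stable name for the client; the platform-specific module
# behind it is chosen in exactly one place.
if sys.platform == "win32":
    import ntpath as path_impl
else:
    import posixpath as path_impl

# Client code below never mentions the platform.
print(path_impl.join("timers", "enable"))
```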

>> No. I am saying this:
>>
>>     module A
>>       public name A1
>>     endmodule
>>
>>     module B
>>       use A
>>               -- A1 is directly visible here?
>>     endmodule
>>
>>     module C
>>       use B
>>               -- Is A1 directly visible here?
>>     endmodule
>
> OK. I wouldn't allow the "use" statements so the issue would not apply.

Yes, you would alias all 1000+ functions and objects manually. "use" is
merely a tool to automate the process of creating aliases. If you had ever
worked with C++ you would know how difficult it is to build a C++ DLL
exporting classes. This is basically the same problem. There are modifiers
you can apply to a class to export everything required. Unfortunately it
produces a huge tail of errors. A newly designed language must address
such automation problems up front, not afterwards.

>> The proper solution is for the language to provide means to maintain
>> namespaces at the *module* side. The designer of the module chooses
>> the way mnemonics are to be used in the client code. Note, not otherwise.
>
> I accept that that is your view but I cannot agree with it. AISI it's
> important for the client to remain in control.

No, it is important to protect clients from stupid mistakes. Because
here you have a bug multiplier. Each client can introduce a bug to fix.
If the interface is stable and that includes which names the client may
use *by safe default*, you reduce that to a single fault point.

> If, by contrast, such choices were made in the module providing the
> service then the propagation of names would be invisible to the client
> code.

Yes, if the client should not see them, that would be the module
interface designer's choice.

James Harris

Nov 21, 2021, 12:57:17 PM

On 21/11/2021 17:25, Dmitry A. Kazakov wrote:
> On 2021-11-21 17:43, James Harris wrote:
>
>> And in case it's not clear, External would allow access to any names
>> within it. For example,
>>
>>    alias External = A.B.C.D.E;
>>    void F(void)
>>    {
>>      int inner = External.baseval;
>>    }
>>
>> where baseval is a name within E.
>
> Sure, unless it gets hidden due to name conflicts. As I said aliases
> solve nothing. E.g. your example is valid in Ada and represents renaming
> a package:
>
>    package External renames A.B.C.D.E;
>
> It is used rarely and not for the purpose you are talking. The most
> frequent use case is platform-dependent packages. E.g. you would have
>
>    package Real_Time_Extensions renames Win32.A.B.C.E;
>
> and another compilation unit
>
>    package Real_Time_Extensions renames Linux.A.B.C.E.D;
>
> All through the client code you refer on
>
>    with Real_Time_Extensions;
>
> and swap files using the project target. Of course the public interfaces
> must be same.

Yes, that's all good.

>
>>> No. I am saying this:
>>>
>>>     module A
>>>       public name A1
>>>     endmodule
>>>
>>>     module B
>>>       use A
>>>               -- A1 is directly visible here?
>>>     endmodule
>>>
>>>     module C
>>>       use B
>>>               -- Is A1 directly visible here?
>>>     endmodule
>>
>> OK. I wouldn't allow the "use" statements so the issue would not apply.
>
> Yes, you would alias all 1000+ functions and objects manually.

No need! The programmer would usually refer to the namespace (or an
alias thereof). For example, if there was a namespace ns1 which had many
names within it:

  namespace ns1
    name A
    name B
    name C
    1000 other names
  end namespace

and the programmer imported it with

alias ns1 = whereever.ns1

then the name "ns1" would be entered into the current scope and the
program could refer to names within that namespace with such as

ns1.A

but he would not be able to refer to plain

A

The only way to refer to A as a plain name would be to define an alias
for it as in

alias A = ns1.A

but then it would be manifest in the source code that the name A was
being added to the current scope. Yet that would be rare. With 1000+
names it would be easiest for the programmer to just refer to them as

ns1.name

where 'name' is one of the thousand.

IOW there would be no need to alias all 1000+ functions and objects.
Just alias the namespace within which they exist.
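
As a rough analogy only (Python is not the language being designed here), the two kinds of alias described above correspond to binding a whole module under one name versus binding a single member directly. In this sketch `os.path` merely stands in for `whereever.ns1`, and the alias names are arbitrary.

```python
import os.path as ns1          # like: alias ns1 = whereever.ns1

# Names inside the namespace are reachable only through the qualifier.
qualified = ns1.join("a", "b")

# A plain name enters the current scope only through an explicit alias,
# so the addition is visible in the source text.
from os.path import join       # like: alias join = ns1.join
plain = join("a", "b")

print(qualified == plain)
```

Every unqualified name in the file can then be traced to a line that introduced it, which is the property being argued for.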

> "use" is
> merely a tool to automate the process of creating aliases. If you had ever
> worked with C++ you would know how difficult it is to build a C++ DLL
> exporting classes. This is basically the same problem. There are modifiers
> you can apply to a class to export everything required.

I don't know Windows but I think there's something similar in Python
where one can write

from my_module import *

IMO that's a bad feature. It imports all names in my_module into the
current scope. That has at least two problems:

1. The names can conflict with names the programmer enters into the
scope - along with all the attendant problems already mentioned.

2. When names are imported from two or more modules there's no way for a
programmer to be able to tell where a certain name is referring to. For
example,

from mod1 import *
from mod2 import *

print a_name

there's no way to tell whether a_name comes from mod1 or mod2. IMO the
"import *" facility pollutes the namespace which is the current scope.
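
A minimal sketch of the collision, assuming two in-memory modules named `mod1` and `mod2` that each define `a_name` (the names come from the example above; the values are made up). In Python the later star-import silently wins, and neither module name is bound, so the reader gets no hint from the source which one that was.

```python
import sys
import types

# Two throwaway modules built in memory; mod1, mod2 and a_name are
# hypothetical names taken from the example above.
mod1 = types.ModuleType("mod1")
mod1.a_name = "from mod1"
mod2 = types.ModuleType("mod2")
mod2.a_name = "from mod2"
sys.modules["mod1"] = mod1
sys.modules["mod2"] = mod2

# Star-imports are only legal at module level, so run them in a fresh
# namespace that plays the role of the importing module.
scope = {}
exec("from mod1 import *\nfrom mod2 import *", scope)

# The later import silently rebinds the name; no diagnostic is issued.
winner = scope["a_name"]
print(winner)

# Neither module name was bound by the star-imports, so a qualified
# reference would need a separate plain import.
has_qualifier = "mod1" in scope
print(has_qualifier)
```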

> Unfortunately it
> produces a huge tail of errors. A newly designed language must address
> such automation problems from the start, not afterwards.

As I say, I don't know about DLLs but I wouldn't allow the "import *"
approach so maybe that would avoid the problem you have in mind.

>
>>> The proper solution is for the language to provide means to maintain
>>> namespaces at the *module* side. The designer of the module chooses
>>> the way mnemonics to be used in the client code. Note, not otherwise.
>>
>> I accept that that is your view but I cannot agree with it. AISI it's
>> important for the client to remain in control.
>
> No, it is important to protect clients from stupid mistakes. Because
> here you have a bug multiplier. Each client can introduce a bug to fix.
> If the interface is stable and that includes which names the client may
> use *by safe default*, you reduce that to a single fault point.

Maybe I am misunderstanding what you mean. Could you illustrate it with
a small example?

>
>> If, by contrast, such choices were made in the module providing the
>> service then the propagation of names would be invisible to the client
>> code.
>
> Yes, if the client should not see them, that would be the module
> interface designer's choice.

Under the scheme I have in mind names would still have protections such
as being private or public. Private names would not be visible to
another module.


--
James Harris

Dmitry A. Kazakov

Nov 21, 2021, 1:19:03 PM
On 2021-11-21 18:57, James Harris wrote:

> No need! The programmer would usually refer to the namespace (or an
> alias thereof). For example, if there was a namespace ns1 which had many
> names within it:
>
>   namespace ns1
>     name A
>     name B
>     name C
>     1000 other names
>   end namespace
>
> and the programmer imported it with
>
>   alias ns1 = whereever.ns1
>
> then the name "ns1" would be entered into the current scope and the
> program could refer to names within that namespace with such as
>
>   ns1.A

Like

A ns1.+ B

And since you have no direct visibility you have a problem with prefix
notation.

ns1.A.Print ()

Why is Print visible?

I do not want to explain further, I am not a language lawyer, but there
exist massive problems with your approach if you have generic and other
polymorphic bodies. In essence all methods must be visible in order to
prevent calling a wrong method, since you implicitly convert to the
parent if the overriding is not visible. Ada had such issues until 2005.

> 2. When names are imported from two or more modules there's no way for a
> programmer to be able to tell where a certain name is referring to. For
> example,
>
>   from mod1 import *
>   from mod2 import *
>
>   print a_name
>
> there's no way to tell whether a_name comes from mod1 or mod2.

You can use a fully qualified name to disambiguate.

> Maybe I am misunderstanding what you mean. Could you illustrate it with
> a small example?

Consider a set of linear algebra modules. The client should be able to use
all modules without mental gymnastics. There is a type Matrix, and I can
multiply matrices, just out of the box.
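
For comparison only, this is a sketch of how a language that dispatches operators through the operand's type gives that "out of the box" behaviour. The `Matrix` class below is hypothetical and deliberately tiny; it is not taken from any library.

```python
class Matrix:
    """Tiny 2-D matrix; a stand-in for a type exported by a library module."""

    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def __mul__(self, other):
        # Row-by-column product; assumes compatible shapes.
        cols = list(zip(*other.rows))
        return Matrix([[sum(a * b for a, b in zip(row, col)) for col in cols]
                       for row in self.rows])

    def __eq__(self, other):
        return isinstance(other, Matrix) and self.rows == other.rows


# Client code: only the type name is needed. The operator is found through
# the operand's type, not through a name in the client's scope.
a = Matrix([[1, 2], [3, 4]])
identity = Matrix([[1, 0], [0, 1]])
product = a * identity
print(product.rows)
```

The client never has to write anything like `A ns1.* B`, because the operator travels with the type rather than with a namespace the client must open.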

> Under the scheme I have in mind names would still have protections such
> as being private or public. Private names would not be visible to
> another module.

This is a totally unrelated issue.

Bart

Nov 21, 2021, 2:32:34 PM
On 21/11/2021 17:19, James Harris wrote:
> On 17/11/2021 14:01, Bart wrote:

>> It depends on how you make it work. Languages already work with
>> multiple instances of the same name within nested scopes; it will
>> match the name in the nearest surrounding scope.
>
> AISI there are two issues:
> 1. Definition. How to declare a name within a certain scope.

That's an easy one. Just declare it like any other.


> 2. Searching. The order in which to search scopes for a name.

I'm working on related matters right now. First I'm revising my module
schemes across two languages, which also affects how namespaces work.

But there was also the issue of finding a module or support file within
the file system. I had had a stack of possible search paths, but I've
dropped that in order to have more confidence in exactly what version of
a file will be found, and where.

So usually there is one choice, at most two.

Scopes within source code are similar. I don't have block scopes, so it
will be:

* In this function
* In this module
* In this subprogram
* In this program

Names can exist within namespaces created by, say, record (class)
definitions. Those can be accessed from outside via a qualified name which
is found in one of the above.

(There is also the case of code inside such a class definition; I don't
do much on those lines, but it would add these, or replace the function
line above:

* In this record definition
* In the next outer if nested

but I stop there. I don't have enough experience of using those to have
a definitive way of doing it.)
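
The fixed search order listed above (function, module, subprogram, program) can be sketched as an ordered chain of symbol tables. This is only an illustration using Python's `collections.ChainMap`; the table contents are invented and nothing here reflects how the actual compiler is built.

```python
from collections import ChainMap

# Hypothetical symbol tables, innermost first, mirroring the list above.
function_scope = {"n": "local n"}
module_scope = {"helper": "module helper", "n": "module n"}
subprogram_scope = {"config": "subprogram config"}
program_scope = {"start": "program start", "helper": "program helper"}

scopes = ChainMap(function_scope, module_scope, subprogram_scope, program_scope)

def resolve(name):
    """Return the binding from the nearest enclosing scope, or None."""
    return scopes.get(name)

print(resolve("n"))        # nearest scope wins
print(resolve("helper"))
print(resolve("missing"))
```

With only four tables to consult, it is easy to say exactly which definition a name will bind to, which is the confidence argument made above for files as well.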

James Harris

Nov 21, 2021, 5:14:36 PM
On 21/11/2021 18:19, Dmitry A. Kazakov wrote:
> On 2021-11-21 18:57, James Harris wrote:
>
>> No need! The programmer would usually refer to the namespace (or an
>> alias thereof). For example, if there was a namespace ns1 which had
>> many names within it:
>>
>>    namespace ns1
>>      name A
>>      name B
>>      name C
>>      1000 other names
>>    end namespace
>>
>> and the programmer imported it with
>>
>>    alias ns1 = whereever.ns1
>>
>> then the name "ns1" would be entered into the current scope and the
>> program could refer to names within that namespace with such as
>>
>>    ns1.A
>
> Like
>
>    A ns1.+ B

I doubt it. But you could have qualified names such as

ns1.A + ns1.B

>
> And since you have no direct visibility you have a problem with prefix
> notation.
>
>    ns1.A.Print ()
>
> Why is Print visible?

Good example.

>
> I do not want to explain further, I am not a language lawyer, but there
> exist massive problems with your approach if you have generic and other
> polymorphic bodies. In essence all methods must be visible in order to
> prevent calling a wrong method, since you implicitly convert to the
> parent if the overriding is not visible. Ada had such issues until 2005.

OK.

>
>> 2. When names are imported from two or more modules there's no way for
>> a programmer to be able to tell where a certain name is referring to.
>> For example,
>>
>>    from mod1 import *
>>    from mod2 import *
>>
>>    print a_name
>>
>> there's no way to tell whether a_name comes from mod1 or mod2.
>
> You can use a fully qualified name to disambiguate.

If the names (the contents of the namespaces) cannot be intermixed then
conflicts would become impossible and disambiguation would be the norm.


--
James Harris

James Harris

Nov 21, 2021, 5:43:02 PM
On 21/11/2021 19:32, Bart wrote:
> On 21/11/2021 17:19, James Harris wrote:

...

>> 2. Searching. The order in which to search scopes for a name.
>
> I'm working on related matters right now.

If you find a really good scheme ... be sure to share! ;-)


> First I'm revising my module
> schemes across two languages, which also affects how namespaces work.
>
> But there was also the issue of finding a module or support file within
> the file system. I had had a stack of possible search paths, but I've
> dropped that in order to have more confidence in exactly what version of
> a file will be found, and where.

I am unsure about resolving names by search paths. In some ways they
make a lot of sense but in others they introduce uncertainty.

One option I have in mind is for names which are to be imported to use
search paths (and yet names as used in a function to go the other way
and be very explicit). The first entry in the search path would normally
be the folder the source file sits in (the base folder) so an import of
A.B would, by default, look in the base folder for a file called
A.<language-extension> and then expect to find a name B in the outer
scope in that file.

If no file A were to be found there then the search would continue with
the next namespace in the search path.

The search path could be manipulated under program control between one
import and the next.
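
A minimal sketch of that import lookup, assuming a hypothetical helper `find_module_file` and the made-up extension `.xx`: the leading component of `A.B` is turned into a file name and each folder in the search path is tried in order.

```python
import tempfile
from pathlib import Path

def find_module_file(import_name, search_path, extension=".xx"):
    """Return the first file matching the leading component of import_name."""
    module_name = import_name.split(".")[0]       # "A.B" -> "A"
    for folder in search_path:
        candidate = Path(folder) / (module_name + extension)
        if candidate.is_file():
            return candidate
    return None

# Demonstration with two temporary folders; only the second contains A.xx.
with tempfile.TemporaryDirectory() as base, tempfile.TemporaryDirectory() as lib:
    (Path(lib) / "A.xx").write_text("name B\n")
    found = find_module_file("A.B", [base, lib])
    found_in_lib = found is not None and found.parent == Path(lib)
    missing = find_module_file("Z.B", [base, lib])

print(found_in_lib, missing)
```

The uncertainty mentioned above shows up directly: if someone later drops an `A.xx` into the base folder, the same import quietly starts resolving to a different file.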



You may have already thought of this but in some cases it can be
convenient to have a reserved name to represent any namespace itself -
in this case to represent the file. For example, say you wanted to write
a factorial function and your language extension was xx. You might
create a file called

factorial.xx

That's fairly normal. But then say within it that you create a function
to compute the factorial. You would naturally call it 'factorial' but
then its name from the outside of the file would be factorial.factorial
which could be a bit irritating. :-(

It would be better for a caller simply to refer to the function
'factorial' yet you cannot drop the function header and begin
factorial.xx in function mode because, for example, you need to specify
a parameter or two. That's why I say an option is to have a reserved
name such as underscore. Then factorial.xx can include

  function _ (parameter)
    ... function body ...
  endfunction

and that function could be invoked simply as function(...) from the
outside.


--
James Harris

James Harris

Nov 21, 2021, 5:50:13 PM
On 21/11/2021 22:43, James Harris wrote:

...

> That's why I say an option is to have a reserved
> name such as underscore. Then factorial.xx can include
>
>   function _ (parameter)
>     ... function body ...
>   endfunction
>
> and that function could be invoked simply as function(...) from the
> outside.

Oops. That's wrong. It should have said it could be invoked as
factorial(...) from the outside - where 'factorial' is the file name
(without the extension).

Similarly a function with the same name, _, in a file called

fib.xx

would be invocable as fib(...).

Just some ideas.


--
James Harris

Bart

Nov 21, 2021, 8:54:51 PM
On 21/11/2021 22:43, James Harris wrote:
> On 21/11/2021 19:32, Bart wrote:
>> On 21/11/2021 17:19, James Harris wrote:
>
> ...
>
>>> 2. Searching. The order in which to search scopes for a name.
>>
>> I'm working on related matters right now.
>
> If you find a really good scheme ... be sure to share! ;-)
>
>
>> First I'm revising my module schemes across two languages, which also
>> affects how namespaces work.
>>
>> But there was also the issue of finding a module or support file
>> within the file system. I had had a stack of possible search paths,
>> but I've dropped that in order to have more confidence in exactly what
>> version of a file will be found, and where.
>
> I am unsure about resolving names by search paths. In some ways they
> make a lot of sense but in others they introduce uncertainty.

No that was about finding source files. The similarity was in using a
list of search folders, and trying each of them in turn, and using a
list of scopes.

While files are not limited to one or two folders, top-level names are
limited to a fixed number of scopes, perhaps 3 or 4.


>
> You may have already thought of this but in some cases it can be
> convenient to have a reserved name to represent any namespace itself -
> in this case to represent the file. For example, say you wanted to write
> a factorial function and your language extension was xx. You might
> create a file called
>
>   factorial.xx
>
> That's fairly normal. But then say within it that you create a function
> to compute the factorial. You would naturally call it 'factorial' but
> then its name from the outside of the file would be factorial.factorial
> which could be a bit irritating. :-(
>
> It would be better for a caller simply to refer to the function
> 'factorial' yet you cannot drop the function header and begin
> factorial.xx in function mode because, for example, you need to specify
> a parameter or two. That's why I say an option is to have a reserved
> name such as underscore. Then factorial.xx can include
>
>   function _ (parameter)
>     ... function body ...
>   endfunction
>
> and that function could be invoked simply as function(...) from the
> outside.

I don't quite get this (I've seen your follow-up correction).


If I create such a file called fact.m:

global function fact(int n)int = {(n<=1 | 1 | n*fact(n-1))}

And I call it from the lead module:


    module fact             # (new module scheme uses 'module')

    proc start=
        println(fact.fact(12))
        println(fact(12))
    end

Either of those designations will work.

fact() will first resolve to the exported fact.fact routine before the
module name.

But fact.fact will resolve first to the module name.

So there is no ambiguity since there is only one actual fact function,
and the rules say that module names are resolved first, for the A in A.B.

But if there is a local version of fact():

fact() resolves to that local version.

fact.fact() is an error: the first 'fact' resolves to that local
function, which does not contain a definition of 'fact'.

Generally, module names that match function names etc are a nuisance
(especially in my dynamic languages where a standalone module name not
followed by "." is a valid expression term).

But there is a workaround for this example:

    module fact as ff

    function fact(int n)int = {n*2}

    proc start=
        println(ff.fact(12))
        println(fact(12))
    end

Here, I've provided an alias for the module name 'fact', so that I can
use 'ff' to remove the ambiguity.
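
The same workaround exists in Python, shown here purely as a parallel and not as a description of how the language above works. The module `fact` is built in memory for the demonstration, and its body is an assumed recursive factorial.

```python
import sys
import types

# Build a throwaway module named "fact" that exports a function "fact".
fact_module = types.ModuleType("fact")
exec(
    "def fact(n):\n"
    "    return 1 if n <= 1 else n * fact(n - 1)\n",
    fact_module.__dict__,
)
sys.modules["fact"] = fact_module

import fact as ff            # like: module fact as ff

def fact(n):                 # local function that shadows the module name
    return n * 2

print(ff.fact(5))            # the module's factorial
print(fact(5))               # the local function
```

Without the alias, `fact.fact(5)` would look for an attribute on the local function and fail, which matches the error case described above.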


luserdroog

Nov 22, 2021, 8:58:47 PM
On Saturday, November 20, 2021 at 9:48:06 PM UTC-6, luserdroog wrote:
> On Saturday, November 20, 2021 at 9:19:14 PM UTC-6, luserdroog wrote:
> > On Tuesday, November 16, 2021 at 10:11:30 AM UTC-6, David Brown wrote:
> > [the thing that was requested]
> [snip]
> > The piece I'm missing then, is a "using" directive. I suppose this would be
> > scoped to whole functions, because I don't really have any kind of <statement>
> > syntax. It's just <expression><expression><expression>...<End-of-function>.
> > Returns are either by Fortran-style copy from a named variable or the value
> > of the last expression if you didn't specify a return variable.
> [snip]
>
> Into the meat of the situation...
[snip]
> So the question is, stuff it (the "using" directive) into the preamble, which
> already specifies
>
> (<optional return variable><left arrow>)(optional left arg)(function)(optional right arg);(optional local variables)
>
> Or make it a "statement" or special expression, a directive.
>
> I don't know. Maybe I haven't given enough information for help, but that's
> what I got.

Erhm. That was not the meat of the situation, just the surface syntax. sigh.
The real issue is how to implement the directive. I have support for
Weizenbaum environment chains, although so far there are only ever two
environments: global or local to a (DEL) function.

On function entry, a local symbol table is linked to the global table and
this chain is used to resolve any names in the function (dynamically,
in the interpreter's runtime code). So a "using" directive could (I think)
just add a new symbol table to the chain, just behind the locals.
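
A minimal sketch of that idea, assuming a hypothetical `Env` class in which each symbol table holds a link to the next one to search. The `using` method splices a namespace's table in directly behind the locals; the table contents are invented.

```python
class Env:
    """One symbol table plus a link to the next table to search."""

    def __init__(self, table, parent=None):
        self.table = table
        self.parent = parent

    def lookup(self, name):
        env = self
        while env is not None:
            if name in env.table:
                return env.table[name]
            env = env.parent
        raise NameError(name)

    def using(self, table):
        """Splice a table into the chain directly behind this one."""
        self.parent = Env(table, self.parent)


global_env = Env({"x": "global x", "y": "global y"})
local_env = Env({"x": "local x"}, parent=global_env)   # built on function entry

timers = {"y": "timers y", "init": "timers init"}      # hypothetical namespace
local_env.using(timers)

print(local_env.lookup("x"), local_env.lookup("y"), local_env.lookup("init"))
```

One consequence worth deciding on purpose: a name in the spliced table shadows a global of the same name (`y` here), while locals still take priority over both.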

That feels too easy. I must have missed something. Or Weizenbaum was
even wiser than I thought.

James Harris

Nov 23, 2021, 12:24:32 PM
On 22/11/2021 01:54, Bart wrote:
> On 21/11/2021 22:43, James Harris wrote:
>> On 21/11/2021 19:32, Bart wrote:

...

>>> But there was also the issue of finding a module or support file
>>> within the file system. I had had a stack of possible search paths,
>>> but I've dropped that in order to have more confidence in exactly
>>> what version of a file will be found, and where.
>>
>> I am unsure about resolving names by search paths. In some ways they
>> make a lot of sense but in others they introduce uncertainty.
>
> No that was about finding source files. The similarity was in using a
> list of search folders, and trying each of them in turn, and using a
> list of scopes.
>
> While files are not limited to one or two folders, top-level names are
> limited to a fixed number of scopes, perhaps 3 or 4.

One good thing about discussing such topics is that they force me to
think about some of the harder choices. Where that's got me in this case
is as follows.

I'd like to scrap the idea of searching for names in a search path. That
may be fine for users running code but programmers can - and probably
should - be more precise. As a programmer I'd rather specify exactly
which import I want than to have the vagueness of something (such as a
compiler, linker or loader) searching for it in a series of locations.

Now I think about it the system we normally use for finding headers and
other object files is woeful! "Look through these folders and pick the
first match." What??? I'd rather say "import this specific name from
this specific module".

That's easy when there's only one possible module to import but what if
we want one from a number of choices? To illustrate, imagine there are
modules which are distinct per CPU type as in

an x86-32 module
an x86-64 module
an arm-hf-64 module
etc

and our code is generic enough to work with any of them. In that case
ISTM best to have the CPU type as a parameter and say something like

import lib.${CPUTYPE}.modname

where CPUTYPE will have been defined as a build parameter.

That avoids the vagueness of searching and gives the programmer full
control.
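
As a sketch of parameterised import without searching, here is the nearest Python equivalent I can think of. The table `PATH_MODULES` and the function `load_path_module` are hypothetical; `posixpath` and `ntpath` are real standard-library modules with the same interface, standing in for the per-CPU modules.

```python
import importlib

# Hypothetical build parameter -> module name table. posixpath and ntpath
# are real standard-library modules with the same interface.
PATH_MODULES = {"posix": "posixpath", "nt": "ntpath"}

def load_path_module(flavour):
    """Import exactly the module named by the build parameter; no searching."""
    return importlib.import_module(PATH_MODULES[flavour])

pathmod = load_path_module("posix")
print(pathmod.join("lib", "modname"))
```

An unknown parameter value fails immediately with a `KeyError` instead of falling through to some other candidate, which is the precision being asked for.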

What do you think?

I'll reply to the other part of your reply separately.


--
James Harris

James Harris

Nov 23, 2021, 12:52:38 PM
On 22/11/2021 01:54, Bart wrote:
> On 21/11/2021 22:43, James Harris wrote:

...
As I just wrote in another reply to the same post perhaps we should get
away from the idea of a name being resolved to the first match from
multiple places. That may work for human convenience when typing things
in to a command line but there's no need for it when a programmer
prepares a source file and can specify in it exactly what he's referring
to.

It's well illustrated in your example about imports. Why would you want

fact.fact
fact

both to resolve to the same thing?

External names could potentially be long with many parts but I'd still
have the ability to define a local alias for anything external so that
it could be referred to by a short name in program code.

As I say, the above is about imports. What of names within nested scopes
in our code? I don't know, yet. Jury's out!

>
> But if there is a local version of fact():
>
>    fact() resolves to that local version.
>
>    fact.fact() is an error: the first 'fact' resolves to that local
> function, which does not contain a definition of 'fact'.
>
> Generally, module names that match function names etc are a nuisance
> (especially in my dynamic languages where a standalone module name not
> followed by "." is a valid expression term).
>
> But there is a workaround for this example:
>
>     module fact as ff
>
>     function fact(int n)int = {n*2}
>
>     proc start=
>         println(ff.fact(12))
>         println(fact(12))
>     end
>
> Here, I've provided an alias for the module name 'fact', so that I can
> use 'ff' to remove the ambiguity.

I prefer that. It's clearer. In the call, ff clearly relates to the ff
defined above.

Alternatively, what about fact.m including reserved function name _ as in

global function _(int n)int = {(n<=1 | 1 | n*fact(n-1))}

such that the caller's code would by naming the module refer to the _
function within it as in

  module fact             # (new module scheme uses 'module')

  proc start=
    println(fact(12))
  end

?


--
James Harris

James Harris

Nov 23, 2021, 12:57:05 PM
On 23/11/2021 01:58, luserdroog wrote:
> On Saturday, November 20, 2021 at 9:48:06 PM UTC-6, luserdroog wrote:

...

>> So the question is, stuff it (the "using" directive) into the preamble, which
>> already specifies
>>
>> (<optional return variable><left arrow>)(optional left arg)(function)(optional right arg);(optional local variables)
>>
>> Or make it a "statement" or special expression, a directive.
>>
>> I don't know. Maybe I haven't given enough information for help, but that's
>> what I got.
>
> Erhm. That was not the meat of the situation, just the surface syntax. sigh.
> The real issue is how to implement the directive. I have support for
> Weizenbaum environment chains, although so far there are only ever two
> environments: global or local to a (DEL) function.

What are "Weizenbaum environment chains"?? Most of Google's suggestions
don't seem to relate.


--
James Harris

Bart

Nov 23, 2021, 3:38:04 PM
On 23/11/2021 17:52, James Harris wrote:
> On 22/11/2021 01:54, Bart wrote:

>> So there is no ambiguity since there is only one actual fact function,
>> and the rules say that module names are resolved first, for the A in A.B.
>
> As I just wrote in another reply to the same post perhaps we should get
> away from the idea of a name being resolved to the first match from
> multiple places. That may work for human convenience when typing things
> in to a command line but there's no need for it when a programmer
> prepares a source file and can specify in it exactly what he's referring
> to.
>
> It's well illustrated in your example about imports. Why would you want
>
>   fact.fact
>   fact
>
> both to resolve to the same thing?

Suppose you're in a directory 'fact' (now talking about files not
identifiers!), and you have a file called 'fact'. It might be poor
practice, but it happens.

Then there are all these multiple ways of referring to that file from
within folder fact:

C:\fact>type fact
This is FACT.

C:\fact>type \fact\fact
This is FACT.

C:\fact>type .\fact
This is FACT.

C:\fact>type .\.\.\.\.\.\.\.\.\fact
This is FACT.

Further, on Windows, if the file 'fact' was a program, then if folder
'fact' was one of the search paths (SET PATH=, equivalent to USING),
then program 'fact' could be accessed from anywhere:


C:\demo>set path=%PATH%;C:\fact

C:\demo>fact
Program FACT

The difference is that here, it will settle for the first match; it will
not report an ambiguity if there are multiple fact.exe files accessible.

They both resolve to the same thing because they do. A file system gives
you the choice of an absolute or relative path, when it is possible.

So can a language. The advantage of an absolute path, like fact.fact, is
that it will work from anywhere.

Just fact() will see the local function 'fact' first. fact.fact() will
always use the one inside module fact, from any module.



>
> External names could potentially be long with many parts but I'd still
> have the ability to define a local alias for anything external so that
> it could be referred to by a short name in program code.
>
> As I say, the above is about imports. What of names within nested scopes
> in our code? I don't know, yet. Jury's out!

I don't have many nested scopes. My language will not peek inside them,
only inside the outermost scope of a module.

>>
>> But if there is a local version of fact():
>>
>>     fact() resolves to that local version.
>>
>>     fact.fact() is an error: the first 'fact' resolves to that local
>> function, which does not contain a definition of 'fact'.
>>
>> Generally, module names that match function names etc are a nuisance
>> (especially in my dynamic languages where a standalone module name not
>> followed by "." is a valid expression term).
>>
>> But there is a workaround for this example:
>>
>>      module fact as ff
>>
>>      function fact(int n)int = {n*2}
>>
>>      proc start=
>>          println(ff.fact(12))
>>          println(fact(12))
>>      end
>>
>> Here, I've provided an alias for the module name 'fact', so that I can
>> use 'ff' to remove the ambiguity.
>
> I prefer that. It's clearer. In the call, ff clearly relates to the ff
> defined above.

I have to use this when there are ambiguities! But it also gives a
shorter, less cluttered name, one more immune to changes of module name.

(Or, sometimes, there is a choice of module X or Y, both aliased as A,
so I just write A whatever the actual module.)

>
> Alternatively, what about fact.m including reserved function name _ as in
>
>   global function _(int n)int = {(n<=1 | 1 | n*fact(n-1))}

(What about the 'fact' in the body of the function? Actually what would
this function be called? They can't all be _!)

> such that the caller's code would by naming the module refer to the _
> function within it as in
>
>   module fact             # (new module scheme uses 'module')
>
>   proc start=
>     println(fact(12))
>   end

I don't get this. Module 'fact' might contain 1000 functions, all with
different names.

Is this still about the module and that one function of many sharing the
same name?

What problem is this solving?

James Harris

Nov 23, 2021, 5:36:03 PM
On 23/11/2021 20:38, Bart wrote:
> On 23/11/2021 17:52, James Harris wrote:
>> On 22/11/2021 01:54, Bart wrote:

...

>   C:\fact>type \fact\fact
>   This is FACT.

...

> The difference is that here, it will settle for the first match; it will
> not report an ambiguity if there are multiple fact.exe files accessible.

Yes, IMO that's not ideal. A program should be able to invoke a specific
external function - rather than the first one which matches. :-o

>
> They both resolve to the same thing because they do. A file system gives
> you the choice of an absolute or relative path, when it is possible.
>
> So can a language. The advantage of an absolute path, like fact.fact, is
> that it will work from anywhere.

Have to say I would call that a relative path. Wouldn't an absolute path
be one which begins at somewhere fixed such as your \fact\fact, above,
starting at the root directory (assuming that \ means the root dir)?

...

>> Alternatively, what about fact.m including reserved function name _ as in
>>
>>    global function _(int n)int = {(n<=1 | 1 | n*fact(n-1))}
>
> (What about the 'fact' in the body of the function? Actually what would
> this function be called? They can't all be _!)

Good point. As I said in my earlier post I am not sure what would be
best to do about resolving internal references. If the language
specifies that the compiler is to search for a name in successively
surrounding scopes (as in probably most HLLs) then the program could be

global function _(int n)int = {(n <= 1 | 1 | n * _(n - 1))}

(If the language specifies that all references to names begin at file
level then the code would be the same because _ is at the file level.)

However, if the language specifies that references made within a
function are rooted at the function level then there would have to be
some way to go 'up' a level to where the function is defined. Something
where the @@ is in

global function _(int n)int = {(n <= 1 | 1 | n * @@/_(n - 1))}

>
>> such that the caller's code would by naming the module refer to the _
>> function within it as in
>>
>>    module fact             # (new module scheme uses 'module')
>>
>>    proc start=
>>      println(fact(12))
>>    end
>
> I don't get this. Module 'fact' might contain 1000 functions, all with
> different names.

Yes, there could be any number of other functions in the same file.
Ignoring overloading, only one would take the name of the file. For
example, if the main _ function came with other functions Convert and
Reduce (say) then the file would have the following functions.

function _
function Convert
function Reduce

Only one, the _ function, would represent the file itself. If the file were

calc.xx

then from outside the file the three functions would be invocable as

calc(....)
calc.Convert(....)
calc.Reduce(....)

And, yes, I am wary of the potential for that to cause problems!
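
To make the idea concrete, here is a rough model of a namespace whose reserved member `_` is what runs when the namespace itself is called. The `Namespace` class and the bodies of `_`, `Convert` and `Reduce` are all placeholders invented for the sketch.

```python
class Namespace:
    """A namespace whose reserved member "_" is invoked when the namespace is called."""

    def __init__(self, **members):
        self.__dict__.update(members)

    def __call__(self, *args):
        return self.__dict__["_"](*args)


# Hypothetical contents of calc.xx; the function bodies are placeholders.
calc = Namespace(
    _=lambda x: x + 1,
    Convert=lambda x: str(x),
    Reduce=lambda xs: sum(xs),
)

print(calc(41), calc.Convert(7), calc.Reduce([1, 2, 3]))
```

In this model `calc` is both a callable and a container of names, so a reader seeing `calc(...)` cannot tell from the call site alone whether `calc` is a function or a file, which may be one of the problems to be wary of.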

>
> Is this still about the module and that one function of many sharing the
> same name?

No, overloading aside, only one function in the file would have that name.

>
> What problem is this solving?

It's just an idea I am exploring in the context of hierarchical
namespaces. It would not always be relevant. For example, if one had a
file called

utils.xx

(where xx is the programming language extension) then one could imagine
that the functions within it would all have their own names such as

utils.advance(....)

with none having the name of the file. But as above if the programmer
created a file such as

discriminant.xx

then it could look a bit awkward to give the primary or only function
within that file the same name if it had to be invoked as

discriminant.discriminant(....)

That's not nice! :-(

Maybe this will turn out to be just aesthetics but I am thinking of it
applying to any namespace, not just files so I'll see where it goes.


--
James Harris

Bart

Nov 23, 2021, 8:02:22 PM
On 23/11/2021 22:36, James Harris wrote:
> On 23/11/2021 20:38, Bart wrote:
>> On 23/11/2021 17:52, James Harris wrote:
>>> On 22/11/2021 01:54, Bart wrote:
>
> ...
>
>>    C:\fact>type \fact\fact
>>    This is FACT.
>
> ...
>
>> The difference is that here, it will settle for the first match; it
>> will not report an ambiguity if there are multiple fact.exe files
>> accessible.
>
> Yes, IMO that's not ideal. A program should be able to invoke a specific
> external function - rather than the first one which matches. :-o
>
>>
>> They both resolve to the same thing because they do. A file system
>> gives you the choice of an absolute or relative path, when it is
>> possible.
>>
>> So can a language. The advantage of an absolute path, like fact.fact,
>> is that it will work from anywhere.
>
> Have to say I would call that a relative path. Wouldn't an absolute path
> be one which begins at somewhere fixed such as your \fact\fact, above,
> starting at the root directory (assuming that \ means the root dir)?

Yes, that's true. I tend not to use the names of modules for anything
else, so when a qualified name starts with a module, then I think of it
as absolute.

There /is/ a special top-level name in my ST, that I give the internal
name $prog, but my name resolver is not set up to deal with that.

Then a fully absolute name would be:

$prog.fact.fact()

Although a snappier alias would be needed if I was going to use that
(I'm not).

> ...
>
>>> Alternatively, what about fact.m including reserved function name _
>>> as in
>>>
>>>    global function _(int n)int = {(n<=1 | 1 | n*fact(n-1))}
>>
>> (What about the 'fact' in the body of the function? Actually what
>> would this function be called? They can't all be _!)
>
> Good point. As I said in my earlier post I am not sure what would be
> best to do about resolving internal references. If the language
> specifies that the compiler is to search for a name in successively
> surrounding scopes (as in probably most HLLs) then the program could be
>
>   global function _(int n)int = {(n <= 1 | 1 | n * _(n - 1))}
>
> (If the language specifies that all references to names begin at file
> level then the code would be the same because _ is at the file level.)
>
> However, if the language specifies that references made within a
> function are rooted at the function level then there would have to be
> some way to go 'up' a level to where the function is defined. Something
> where the @@ is in
>
>   global function _(int n)int = {(n <= 1 | 1 | n * @@/_(n - 1))}

This is getting elaborate.

I simply do not want to deal with the hassle, or the clutter, so I do as
much as I can to hardly ever need to use qualified names.

It's just one big happy family with everyone on first name terms.

However, my recent module change means I can split my program into
several families in separate houses (but in the same compound), with
people on first name terms in each, and also with selected
representatives of other houses.

It's rare that I'd need to use a 'surname'.

(This analogy means that program=compound; subprogram=house, and
module=room.

Unsurprisingly I tend to do the same thing with files in folders:
everything is usually on one level, I hate nested directories.)

>
>>
>>> such that the caller's code would by naming the module refer to the _
>>> function within it as in
>>>
>>>    module fact             # (new module scheme uses 'module')
>>>
>>>    proc start=
>>>      println(fact(12))
>>>    end
>>
>> I don't get this. Module 'fact' might contain 1000 functions, all with
>> different names.
>
> Yes, there could be any number of other functions in the same file.
> Ignoring overloading, only one would take the name of the file. For
> example, if the main _ function came with other functions Convert and
> Reduce (say) then the file would have the following functions.
>
>   function _
>   function Convert
>   function Reduce
>
> Only one, the _ function, would represent the file itself. If the file were
>
>   calc.xx
>
> then from outside the file the three functions would be invocable as
>
>   calc(....)
>   calc.Convert(....)
>   calc.Reduce(....)
>
> And, yes, I am wary of the potential for that to cause problems!

I think I know what you're getting at.

In my static language, I don't have executable code outside a function;
in the dynamic language, I do, but mainly to make quick, short programs
easier to write.

There are also two special function names (dynamic code):

* start(); this is called automatically for each module if present.

* main(); this is only called if it's in the main module.

Putting 'module' code in start() is recommended instead of writing it
openly. (There are issues with scope in doing so.)
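
A rough sketch of that start()/main() convention, written in Python rather than the dynamic language being described. The `run_program` helper, the dict-of-functions stand-in for modules, and the order in which start() functions run are all assumptions made for illustration.

```python
def run_program(modules, main_module):
    """Call start() in every module that defines it, then main() in the main module only.

    `modules` maps a module name to a dict of its functions (a stand-in for real modules).
    """
    log = []
    for name, functions in modules.items():
        if "start" in functions:
            log.append(functions["start"]())
    main = modules[main_module].get("main")
    if main is not None:
        log.append(main())
    return log


modules = {
    "app": {"start": lambda: "app.start", "main": lambda: "app.main"},
    "lib": {"start": lambda: "lib.start", "main": lambda: "lib.main (never run)"},
    "util": {},   # no start(), so nothing is called for it
}

print(run_program(modules, "app"))   # ['app.start', 'lib.start', 'app.main']
```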

So I think that what you call "_", I deal with as "start". Except your
"_" is only called on demand; my start() is used to initialise a module.

>> What problem is this solving?
>
> It's just an idea I am exploring in the context of hierarchical
> namespaces. It would not always be relevant. For example, if one had a
> file called
>
>   utils.xx
>
> (where xx is the programming language extension) then one could imagine
> that the functions within it would all have their own names such as
>
>   utils.advance(....)
>
> with none having the name of the file. But as above if the programmer
> created a file such as
>
>   discriminant.xx
>
> then it could look a bit awkward to give the primary or only function
> within that file the same name if it had to be invoked as
>
>   discriminant.discriminant(....)
>
> That's not nice! :-(

Python programs do that all the time; no one cares. But it also has, if
I've got it right:

from discriminant import discriminant
discriminant()

However, with this:

from discriminant import discriminant
from discriminant2 import discriminant
discriminant()

where two modules export it, it calls the last one seen. It just
overwrites the first. Actually 'import discriminant' overwrites the
module of that name also.
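
The overwriting is easy to reproduce with two standard-library modules that both export `sqrt`; they stand in here for the hypothetical discriminant and discriminant2 modules.

```python
import math
from math import sqrt            # sqrt is math.sqrt
first = sqrt

from cmath import sqrt           # silently rebinds sqrt to cmath.sqrt

print(first is math.sqrt)        # True
print(sqrt is first)             # False: the earlier binding is gone
print(sqrt(-1))                  # 1j (math.sqrt(-1) would raise ValueError)
```

No warning is issued at any point; the second import is just another assignment to the name.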

That's a bad show, but typical of Python:

from math import sqrt
sqrt = 42

(This is a language that runs major banking corporations!)

James Harris

Nov 24, 2021, 3:52:17 AM
On 24/11/2021 01:02, Bart wrote:
> On 23/11/2021 22:36, James Harris wrote:

...

> There /is/ a special top-level name in my ST, that I give the internal
> name $prog, but my name resolver is not set up to deal with that.
>
> Then a fully absolute name would be:
>
>    $prog.fact.fact()

Cool. I have a similar idea - mnemonics indicating predefined places in
the name hierarchy. For example, the folder in which the current source
file sits would be something like

&base

so a factorial program in the same folder could be invoked as

&base.fact()

...

> Unsurprisingly I tend to do the same thing with files in folders:
> everything is usually on one level, I hate nested directories.)

Would you still hate nested directories if they were simply part of the
name hierarchy? For example, say you had

   folderA
     prog.xx
     folderB
       folderC
         sub.xx

Then prog.xx would be able to invoke the code in sub.xx even though it's
two folders deeper by

&base.folderB.folderC.sub()

or as with any name prog could define an alias for it and invoke that.

That's OK, isn't it?

...

> So I think that what you call "_", I deal with as "start". Except your
> "_" is only called on demand; my start() is used to initialise a module.

They are not quite the same - for the reasons you mention.

...

>>    discriminant.discriminant(....)
>>
>> That's not nice! :-(
>
> Python programs do that all the time; no one cares.

You surprise me. You dislike the repetition of i in

for (i = 0; i < LEN; i++)

but you don't mind the repetition in

discriminant.discriminant

?


--
James Harris

Bart

Nov 24, 2021, 6:37:04 AM
I just wouldn't do it. I've seen folder hierarchies 9-11 levels deep (eg
in VS installations). They are not meant for use by humans.

In a language, I already have long dotted sequences for member selection
in records. For mere name resolution, I usually have 0 dots, or at most 1.

I like things flat!

Note that A.B.C.D for member selection is not necessarily a hierarchy.
If P is the head of a linear linked list, then P.next.next.data just
accesses a member of a node further along the chain.
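
A small Python sketch of that point; the `Node` class and the values stored are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    data: int
    next: Optional["Node"] = None


# Build the list 10 -> 20 -> 30
P = Node(10, Node(20, Node(30)))

# Each '.next' is a step along the chain, not a descent into a nested namespace.
print(P.next.next.data)   # 30
```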

>> Python programs do that all the time; no one cares.
>
> You surprise me. You dislike the repetition of i in
>
>   for (i = 0; i < LEN; i++)
>
> but you don't mind the repetition in
>
>   discriminant.discriminant

I didn't say I liked it; it's just common in Python, eg:

import dis # byte-code disassembler

dis.dis(fn)

However, Python does allow an alias:

dasm = dis.dis
dasm(fn)

I actually allow the same in dynamic code:

ff := fact.fact
print ff(12)

luserdroog

Nov 24, 2021, 7:22:58 PM
It's the solution to the FUNARG problem described in:

http://www.softwarepreservation.org/projects/LISP/MIT/Weizenbaum-FUNARG_Problem_Explained-1968.pdf

He calls it a "symbol table tree". But I quibble whether it's really a tree like we
usually define them. All the links go from leaves back up toward the root.
I learned about them from /Anatomy of Lisp/, and I think my implementation
in olmec copies what I understood from that book. IIRC Anatomy of Lisp
describes them as chains instead of a tree.
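
A minimal sketch of such a chain, in Python rather than the C used by olmec. The `Environment` class and its method names are invented for illustration and are not taken from the paper, the book, or olmec's source. Each environment holds one link, pointing toward the root.

```python
class Environment:
    """One symbol table plus a link to the enclosing one (leaf-to-root only)."""

    def __init__(self, parent=None):
        self.symbols = {}
        self.parent = parent

    def define(self, name, value):
        self.symbols[name] = value

    def lookup(self, name):
        env = self
        while env is not None:
            if name in env.symbols:
                return env.symbols[name]
            env = env.parent          # follow the single upward link
        raise NameError(name)


global_env = Environment()
global_env.define("x", 1)
global_env.define("y", 2)

local_env = Environment(parent=global_env)
local_env.define("x", 100)         # shadows the global x

print(local_env.lookup("x"))       # 100
print(local_env.lookup("y"))       # 2, found in the parent
```

Several local environments can share one parent, which is where the "tree" picture comes from, but any single lookup only ever walks one chain.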

James Harris

Nov 26, 2021, 11:57:34 AM
On 24/11/2021 11:37, Bart wrote:
> On 24/11/2021 08:52, James Harris wrote:

...

>> Would you still hate nested directories if they were simply part of
>> the name hierarchy? For example, say you had
>>
>>    folderA
>>      prog.xx
>>      folderB
>>        folderC
>>          sub.xx
>>
>> Then prog.xx would be able to invoke the code in sub.xx even though
>> it's two folders deeper by
>>
>>    &base.folderB.folderC.sub()
>>
>> or as with any name prog could define an alias for it and invoke that.
>>
>> That's OK, isn't it?
>
> I just wouldn't do it. I've seen folder hierarchies 9-11 levels deep (eg
> in VS installations). They are not meant for use by humans.
>
> In a language, I already have long dotted sequences for member selection
> in records. For mere name resolution, I usually have 0 dots, or at most 1.
>
> I like things flat!

I don't mind up to one level of depth, and that can be achieved by
aliasing a name (zero levels) or its parent (one level). For example,

alias gui = &base.folderB.folderC

then the subroutine could be invoked with

gui.sub()
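
The same aliasing can be mimicked in Python. In this sketch nested `SimpleNamespace` objects stand in for the folder hierarchy, and the body of `sub` is a placeholder.

```python
from types import SimpleNamespace

# Stand-in for the hierarchy folderA/folderB/folderC/sub.xx
base = SimpleNamespace(
    folderB=SimpleNamespace(
        folderC=SimpleNamespace(
            sub=lambda: "sub called",
        )
    )
)

print(base.folderB.folderC.sub())   # fully spelled-out path

gui = base.folderB.folderC          # alias for the parent: one level of depth
print(gui.sub())

sub = base.folderB.folderC.sub      # alias for the name itself: zero levels
print(sub())
```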


>
> Note that A.B.C.D for member selection is not necessarily a hierarchy.
> If P is the head of a linear linked list, then P.next.next.data just
> accesses a member of a node further along the chain.

I don't think I would have that problem. If a node of the list had the form

ui64 next
ui64 data

and P referred to a node then

P would refer to the node
P* would be the node
P*.next would refer to the next node
P*.data would be the data in the node


--
James Harris

James Harris

Nov 26, 2021, 12:55:27 PM
On 25/11/2021 00:22, luserdroog wrote:
> On Tuesday, November 23, 2021 at 11:57:05 AM UTC-6, James Harris wrote:
>> On 23/11/2021 01:58, luserdroog wrote:

...

>>> The real issue is how to implement the directive. I have support for
>>> Weizenbaum environment chains, although so far there are only ever two
>>> environments: global or local to a (DEL) function.

>> What are "Weizenbaum environment chains"?? Most of Google's suggestions
>> don't seem to relate.
>>
>
> It's the solution to the FUNARG problem described in:
>
> http://www.softwarepreservation.org/projects/LISP/MIT/Weizenbaum-FUNARG_Problem_Explained-1968.pdf
>
> He calls it a "symbol table tree". But I quibble whether it's really a tree like we
> usually define them. All the links go from leaves back up toward the root.
> I learned about them from /Anatomy of Lisp/, and I think my implementation
> in olmec copies what I understood from that book. IIRC Anatomy of Lisp
> describes them as chains instead of a tree.
>

Thanks for the info. It seems to be related to partial function
application and/or currying.

https://en.wikipedia.org/wiki/Partial_application
https://en.wikipedia.org/wiki/Currying

Have to say, all such 'clever' approaches seem to me to make a program
unnecessarily hard to understand. I've never needed any of them.


--
James Harris

luserdroog

Nov 27, 2021, 8:14:45 PM
It's important for those tricks, but also for closures. Like in javascript,

function something( myarray ){
  var thing = 12;
  return myarray.map( elem=>elem+thing );
}

The closure allows the "thing" in "elem=>elem+thing" to access
the "thing" defined in the outer function.
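
A Python version of the same idea, arranged so that the inner function is returned and therefore outlives the call that created it. That is the case where the environment must be kept alive. The names `make_adder` and `add` are made up for the example.

```python
def make_adder(thing):
    # 'thing' lives in make_adder's environment...
    def add(elem):
        return elem + thing      # ...and is still reachable here
    return add                   # the closure outlives the call that created it


add12 = make_adder(12)
print([add12(e) for e in [1, 2, 3]])           # [13, 14, 15]
print(add12.__closure__[0].cell_contents)      # 12, the captured variable
```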

anti...@math.uni.wroc.pl

Jan 28, 2022, 9:35:18 PM
James Harris <james.h...@gmail.com> wrote:
> On 16/11/2021 16:11, David Brown wrote:
> > On 16/11/2021 16:16, luserdroog wrote:
>
> >> I've been following with interest many of the threads started
> >> by James Harris over the year and frequently the topic of
> >> namespaces and modularity come up. Most recently in the
> >> "Power operator and replacement..." thread.
> >>
> >> And I tend to find myself on James' side through unfamiliarity
> >> with the other option. What's the big deal about namespaces
> >> and modules? What do they cost and what do they facilitate?
>
> ...
>
> > namespace timers {
> > void init();
> > void enable();
> > int get_current();
> > void set_frequency(double);
> > }
> >
> > You can use them as "timers::init();" or by pulling them into the
> > current scope with "using timers; enable();".
>
> As this is about namespaces, I'd suggest some problems with that "using
> timers" example:
>
> * AIUI it mixes the names from within 'timers' into the current
> namespace. If that's right then a programmer taking such an approach
> would have to avoid having his own versions of those names. (I.e. no
> 'enable' or 'init' names in the local scope.)

I did not check the C++ rules here. But other languages have a simple rule:
the local name wins. That is, to access the global one you should either
not declare a local one or use the qualified version.

If you think about this for a while you should see that this is
the only sensible rule: imported modules can change without
the programmer of the "client" module knowing. And breaking clients
by merely adding a new export is in most cases not acceptable.

OTOH the client should know which imported functions will be used.
So avoiding local use of imported names is usually not a big
burden. And when the client thinks that some local name is extremely
good and cannot reasonably be replaced by a different name, then
the client can still use the imported name in its qualified form.
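
Python happens to follow the "local name wins" rule inside functions, so it can serve as a quick illustration. The `client` function is invented for the example.

```python
import math
from math import pow          # unqualified import: 'pow' now means math.pow


def client():
    def pow(base, exponent):  # local definition with the same name
        return "local pow"

    # The unqualified name resolves to the local one;
    # the imported function is still reachable through its qualified name.
    return pow(2, 3), math.pow(2, 3)


print(client())               # ('local pow', 8.0)
```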

--
Waldek Hebisch

Dmitry A. Kazakov

Jan 29, 2022, 5:22:43 AM
On 2022-01-29 03:35, anti...@math.uni.wroc.pl wrote:

> I did not check C++ rules here. But other languages have simple rule:
> local name wins. That is to access global one you either should
> not declare local one or use qualified version.

The rules most languages deploy are rather complicated. There are
three choices:

1. Silent hiding of external entities with conflicting names
2. Overloading the entities with conflicting names
3. Signaling error

The rule #2 is interesting as it can create mutually hiding names when
overloading cannot be resolved.

Languages tend to have a mixture of all three from case to case.

> If you think about this for a while you should see that this is
> the only sensible rule: imported modules can change without
> programmer of "client" module knowing this. And breaking clients
> by merely adding new export is in most cases not acceptable.

Well, it is not that simple.

The first point is that from the SW development POV, if the imported
module's interface changes, the client code must be reviewed anyway.

The second point is that the rule #1 does not actually protect clients.
As an example consider two modules used by the client. If one of the
modules introduces a name conflict with another module, that breaks the
client anyway, because this case cannot be handled by the rule #1 anymore.
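
The second point can be sketched as a toy name resolver in Python. The `resolve` function, the `AmbiguousName` exception and the module contents are all hypothetical. The lookup applies rule #1 for local names and has to give up when two imported modules export the same name.

```python
class AmbiguousName(Exception):
    pass


def resolve(name, local_scope, imports):
    """Look up an unqualified name.

    local_scope: dict of names declared by the client.
    imports: dict mapping module name -> dict of exported names.
    """
    if name in local_scope:                       # rule #1: local silently hides imports
        return local_scope[name]
    owners = [mod for mod, exports in imports.items() if name in exports]
    if not owners:
        raise NameError(name)
    if len(owners) > 1:                           # rule #1 has nothing to say here
        raise AmbiguousName(f"{name} is exported by {', '.join(owners)}")
    return imports[owners[0]][name]


imports = {"M1": {"Foo": "M1.Foo"}, "M2": {"Bar": "M2.Bar"}}
local_scope = {"Bar": "local Bar"}

print(resolve("Bar", local_scope, imports))       # local Bar
print(resolve("Foo", local_scope, imports))       # M1.Foo

imports["M2"]["Foo"] = "M2.Foo"                   # M2 later adds an export named Foo
try:
    resolve("Foo", local_scope, imports)
except AmbiguousName as err:
    print("client broken:", err)
```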

> OTOH client should know which imported functions will be used.
> So avoiding local use of imported names usually is not a big
> burden.

In practice it is. Ada even introduced partial visibility of imported
names. For example you can say:

use Linear_Algebra;

This will make all names from Linear_Algebra publicly visible and
potentially conflicting with other stuff. But you could write:

declare
X, Y, Z : Matrix; -- Linear_Algebra.Matrix
begin
...
Z := X + Y; -- Operation "+" of Linear_Algebra.Matrix

Now if you wanted to use only the Matrix type in your code without
importing anything else, you could do:

use type Linear_Algebra.Matrix;

This would make operations of Matrix directly visible, but nothing else.
So the code would look like:

declare
X, Y, Z : Linear_Algebra.Matrix; -- Qualified name
begin
...
Z := X + Y; -- Operation "+" of Linear_Algebra.Matrix

Without either the code would be:

declare
X, Y, Z : Linear_Algebra.Matrix; -- Qualified name
begin
...
Z := Linear_Algebra."+" (X, Y);

anti...@math.uni.wroc.pl

Feb 3, 2022, 2:24:25 PM
Dmitry A. Kazakov <mai...@dmitry-kazakov.de> wrote:
> On 2022-01-29 03:35, anti...@math.uni.wroc.pl wrote:
>
> > I did not check C++ rules here. But other languages have simple rule:
> > local name wins. That is to access global one you either should
> > not declare local one or use qualified version.
>
> The rules most languages deploy are rather quite complicated. There are
> three choices:
>
> 1. Silent hiding of external entities with conflicting names
> 2. Overloading the entities with conflicting names
> 3. Signaling error
>
> The rule #2 is interesting as it can create mutually hiding names when
> overloading cannot be resolved.

Overloading creates its own complexities. First, with overloading
what matters is really not the name but the full signature: name + types
of arguments and result. If the language requires an exact match for
the signature then the rule above works with name replaced by
signature. If there are automatic type conversions, or the return
type is needed to resolve overloading, then one indeed
needs extra rules.
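
For the exact-match case, a sketch in Python. The `resolve` function and the `show` overloads are invented, and the result type is left out of the signature for brevity. The scopes are keyed by signature rather than by name, and "local wins" is applied per signature.

```python
def resolve(name, arg_types, local_scope, imported_scope):
    """Exact-match overload resolution: the lookup key is the whole signature."""
    signature = (name, arg_types)
    if signature in local_scope:          # local signature wins
        return local_scope[signature]
    if signature in imported_scope:
        return imported_scope[signature]
    raise NameError(f"no {name} taking {arg_types}")


imported_scope = {
    ("show", (int,)): "imported show(int)",
    ("show", (str,)): "imported show(str)",
}
local_scope = {
    ("show", (int,)): "local show(int)",
}

print(resolve("show", (int,), local_scope, imported_scope))   # local show(int)
print(resolve("show", (str,), local_scope, imported_scope))   # imported show(str)
```

A local `show(int)` hides only the imported `show(int)`; the imported `show(str)` stays visible under the same name.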

One language that I use has the following extra rules:
- when an interface inherits from two different interfaces each
having "the same" signature this results in a single inherited
signature
- when overloading allows more than one function the choice is
"arbitrary"

Note: in case of an exact match the local signature wins. Overloading
only plays a role when there are multiple alternatives leading to
different types.

Both rules work under the assumption that the design has the right names,
so functions with the same signature do equivalent work.
Experience with a 200 thousand line codebase shows that
this works well. However, it is not clear how this would work
with a bigger codebase (say 20 million lines) or in a less regular
problem domain.

Arguably, signaling an error when overloading cannot be resolved
in a unique way would be safer.

> Languages tend to have a mixture of all three from case to case.
>
> > If you think about this for a while you should see that this is
> > the only sensible rule: imported modules can change without
> > programmer of "client" module knowing this. And breaking clients
> > by merely adding new export is in most cases not acceptable.
>
> Well, it is not that simple.
>
> The first point is that from the SW development POV, if the imported
> module's interface changes, the client code must be reviewed anyway.

Well, if your coding rules say the review is in place, do what the rules
say. However, adding a new signature without changing the behaviour of
existing signatures may be a "safe" change. Namely, it is safe
for languages without overloading. With overloading I consider
calling a different function with "the same" contract as a safe
change.

> The second point is that the rule #1 does not actually protect clients.
> As an example consider two modules used by the client. If one of the
> modules introduces a name conflict with another module, that breaks the
> client anyway, because this case cannot be handled by the rule #1 anymore.

Sure. But there is an important difference: global objects must be
coordinated to avoid conflicts. With, say, an "error when names are
equal" rule there will be conflicts with local routines, which
would significantly increase the number of conflicts.
Well, there are many ways. Extended Pascal has selective import
(only names that you specify are imported) and interfaces, that
is, a module may have several export lists and the client decides which
one to use. And there is the possibility of renaming at import time.
All of this means that the client has several ways of resolving
a conflict. Extended Pascal significantly decreases the chance of
an artificial conflict, that is, accidentally importing a name that
the client does not want to use. OTOH explicit import/export lists
are less attractive when there is overloading (to get an equivalent
effect they require repeating type information).
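
Selective import and renaming at import time also exist in Python, which makes for a compact illustration. It uses Python syntax, not Extended Pascal's, and the local names `real_sqrt` and `complex_sqrt` are made up.

```python
from math import sqrt as real_sqrt        # renamed at import time
from cmath import sqrt as complex_sqrt    # same exported name, different local name

print(real_sqrt(4))        # 2.0
print(complex_sqrt(-4))    # 2j
```

Only the names listed are imported, and each arrives under a name the client chose, so the two exports called `sqrt` never meet.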

Concerning the Ada way, it is not clear to me how this is supposed
to work when there are multiple intersecting subsets of functions.

--
Waldek Hebisch

Dmitry A. Kazakov

Feb 4, 2022, 3:59:46 AM
On 2022-02-03 20:24, anti...@math.uni.wroc.pl wrote:
> Dmitry A. Kazakov <mai...@dmitry-kazakov.de> wrote:
>> On 2022-01-29 03:35, anti...@math.uni.wroc.pl wrote:
>>
>>> I did not check C++ rules here. But other languages have simple rule:
>>> local name wins. That is to access global one you either should
>>> not declare local one or use qualified version.
>>
>> The rules most languages deploy are rather quite complicated. There are
>> three choices:
>>
>> 1. Silent hiding of external entities with conflicting names
>> 2. Overloading the entities with conflicting names
>> 3. Signaling error
>>
>> The rule #2 is interesting as it can create mutually hiding names when
>> overloading cannot be resolved.
>
> Overloading creates its own complexities. First, with overloading
> what matters is really not name but full signature: name + types
> of arguments and result. If language requires exact match for
> signature than the rule above works with name replaced by
> signature. If there are automatic type conversions or return
> type is needed to resolve overloading then indeed one
> needs extra rules.
>
> One language that I use has following extra rules:
> - when interface inherits from two different interfaces each
> having "the same" signature this results in single inherited
> signature

Inheritance is dynamic polymorphism. Overloading is ad-hoc polymorphism.
As such they are unrelated.

> - when overloading allows more than one function choice is
> "arbitrary"

Arbitrary? That sounds like a very poorly designed language.

> Note: in case of exact match local signature wins.

= #1

> Overloading
> only plays role when there are multiple alternatives leading to
> different types.

= #2

> Both rules work under assumption that design has right names,
> so functions with the same signature do equivalent work.

Well, this is Liskov's substitution principle (LSP). Whether to
apply it to ad-hoc polymorphism is a question of program semantics.

In any case, the language does not interfere with the semantics, that
would be incomputable. LSP is not enforceable, it is only a design
principle.

> Experience with 200 thousand lines codebase shows that
> this works well. However, it is not clear how this would work
> with bigger codebase (say 20 million lines) or in less regular
> problem domain.

Large projects are split into weakly coupled modules.

> Arguably, signaling error when overloading can not be resolved
> in unique way would be safer.

That is the only way. The question is different. Whether visibility
rules should allow programs with *potentially* unresolvable overloading.
E.g. both M1 and M2 declare conflicting Foo. M3 uses both M1 and M2 but
does not reference Foo. Is M3 legal? Or should the mere fact of using
conflicting M1 and M2 make M3 illegal?
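
The two possible answers can be put side by side in a short Python sketch. The helper names and the export sets are hypothetical, and conflicts are detected by name only, ignoring overload resolution by type.

```python
def exported_conflicts(imports):
    """Names exported by more than one imported module."""
    seen, clashes = {}, set()
    for module, exports in imports.items():
        for name in exports:
            if name in seen:
                clashes.add(name)
            seen.setdefault(name, module)
    return clashes


def is_legal(used_names, imports, eager):
    clashes = exported_conflicts(imports)
    if eager:
        return not clashes                       # any potential conflict is an error
    return not (clashes & set(used_names))       # only conflicts actually referenced


imports = {"M1": {"Foo", "Put"}, "M2": {"Foo", "Get"}}
m3_uses = ["Put", "Get"]                         # M3 never mentions Foo

print(is_legal(m3_uses, imports, eager=False))   # True
print(is_legal(m3_uses, imports, eager=True))    # False
```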

>> The first point is that from the SW development POV, if the imported
>> module's interface changes, the client code must be reviewed anyway.
>
> Well, if your coding rules say the review is in place, do what rules
> say. However, adding new signature without changing behaviour of
> existing signatures may be "safe" change. Namely, it is safe
> for languages without overloading.

It is never safe. The change can always introduce a conflict. The only
choice is between flagging it as an error always or only when a client
actually uses it.

This is comparable with the choices made in C++ templates vs choices in
Ada generics.

C++ templates are not checked. So long as you do not instantiate the
template with parameters provoking an error you are good.

Ada generics are [partially] checked so that potential instantiation
errors are signaled even if never instantiated in the program.

>> The second point is that the rule #1 does not actually protect clients.
>> As an example consider two modules used by the client. If one of the
>> modules introduces a name conflict with another module, that breaks the
>> client anyway, because this case cannot be handled by the rule #1 anymore.
>
> Sure. But there is important difference: global object must be
> coordinated to avoid conflicts.

There should be no global objects, for the start.

> With say "error when names are
> equal" rule there will be conflicts with local routines which
> would significantly increase number of conflicts.

and reduce surprises. Any choice has drawbacks and name spaces are a
method of damage control.

> Well, there are many ways. Extended Pascal have selective import
> (only names that you specify are imported) and interfaces, that
> is module may have several export lists and client decides which
> one to use. And there is possibility of renaming at import time.
> All of those means that client has several ways of resolving
> conflict.

There is a difference. For example, if M1 and M2 have Foo and

   with M1; use M1;
   with M2; use M2;

   procedure M3 is
   begin
      Foo; -- Conflict
   end M3;

Renaming:

   with M1; use M1;
   with M2; use M2;

   procedure M3 is
      procedure Bar renames M1.Foo;
      procedure Baz renames M2.Foo;
   begin
      Bar; -- No conflict
   end M3;

Resolving:

   with M1; use M1;
   with M2; use M2;

   procedure M3 is
   begin
      M1.Foo; -- Resolved per a fully qualified name
   end M3;

> Extended Pascal significantly decreases chance of
> artificial conflict, that is accidentally importing name that
> client do not want to use.

You cannot accidentally import anything. The module interface must be
designed rationally to be useful for the clients. There is no defense
against poorly designed interfaces.

However interfaces can be designed in favor of clients deploying fully
qualified names vs clients deploying direct visibility. Unfortunately
one should choose one.

The former tends to choose same names for everything. So that the client
does something like:

IO.Integers.Put;

The latter would rather do

IO.Integer_IO.Put_Integer

which with direct visibility becomes

Put_Integer

> OTOH explicit import/export lists
> are less attractive when there is overloading (to get equivalent
> effect they require repeating type information).

Why? Explicit import in Ada is very popular. I gave example of

use type T;

clause which is that thing. Usually such types are numeric which leads
to massive overloading of +,-,*,/. Nobody cares as these are all resolvable.

> Concerning Ada way, it is not clear for me how this is supposed
> to work when there are multiple intersecting subsets of functions.

I am not sure what you mean. Overloading works pretty well in Ada because
of types. Languages shy of overloading are ones with a weak type system,
where everything gets lumped into something unresolvable. E.g. if there
are no distinct numeric types or when the result is not a part of the
signature etc.

anti...@math.uni.wroc.pl

Feb 4, 2022, 5:28:40 PM
You are making unwarranted assumptions here. I am writing about
a language which is probably unknown to you. In this language
interface inheritance is mostly static (some folks would say
purely static). Interfaces are useful for polymorphism, but
in _this_ case polymorphism is parametric and would be useful
even without any dynamic aspect. You could have a similar (but
more clumsy) language with interfaces without inheritance:
interface inheritance allows nice expression of common parts and
by-name tests for interface (in)compatibility (otherwise one
would be forced to use structural tests). One could use
interface inheritance without polymorphism. For example in
a language somewhat similar to Modula2 or UCSD Pascal one could
have modules with separate interfaces and use inheritance for
common parts (but have no OO nor overloading).

Interfaces are related to overloading: visible interfaces decide
which signatures are visible. And in the language that I am writing
about above, overloading means the choice of a _signature_. The actual
function to be run is determined at runtime.

>
> > - when overloading allows more than one function choice is
> > "arbitrary"
>
> Arbitrary? That sounds like a very poorly designed language.
>
> > Note: in case of exact match local signature wins.
>
> = #1
>
> > Overloading
> > only plays role when there are multiple alternatives leading to
> > different types.
>
> = #2
>
> > Both rules work under assumption that design has right names,
> > so functions with the same signature do equivalent work.
>
> Well, this is the Liskov's substitutability principle (LSP). Whether to
> apply it to ad-hoc polymorphism is a question of program semantics.
>
> In any case, the language does not interfere with the semantics, that
> would be incomputable.

Well, theorem proving is uncomputable, but proof checking is computable.
Arguably, a program should be deemed correct only when the programmer
can justify its correctness. So, at least in theory you could have
a language that requires the programmer to include a justification (that is,
a proof) of correctness...

> LSP is not enforceable, it is only a design
> principle.
>
> > Experience with 200 thousand lines codebase shows that
> > this works well. However, it is not clear how this would work
> > with bigger codebase (say 20 million lines) or in less regular
> > problem domain.
>
> Large projects are split into weakly coupled modules.

These 200 thousand lines are split into about 1000 modules. But
there is a cluster of about 600 mutually connected modules. From
one point of view many of those modules are "independent", but
there are subtle connections and if you transitively track them there
is a largish cluster...

> > Arguably, signaling error when overloading can not be resolved
> > in unique way would be safer.
>
> That is the only way. The question is different. Whether visibility
> rules should allow programs with *potentially* unresolvable overloading.
> E.g. both M1 and M2 declare conflicting Foo. M3 uses both M1 and M2 but
> does not reference Foo. Is M3 legal? Or just the fact of using
> conflicting M1 and M2 should make M3 illegal.

The word "potentially" has many different meanings. In my case
types are hairy enough that there are 3 cases:

- the compiler can decide that there is only one applicable signature
- the compiler can decide that there is a conflict
- the compiler cannot decide, and only at runtime is it possible to
decide if there is a conflict.

The core of the problem may be illustrated using Extended Pascal.
Namely, one can define a schema type:

type T(i : integer) = 0..100;

In the above type the discriminant 'i' is otherwise unused but ensures
that later we get incompatible types.

We can have a function taking an argument of schema type:

function foo(a : T) : integer;
....

But we can also use a discriminated version of the type:

function foo(a : T(42)) : integer;
....


We can have another function taking integer parameters:

function bar(i : integer) : integer;
var a : T(i) = 0;
begin
bar := foo(a);
end;

Extended Pascal does not have overloading, so you cannot have both
foo-s in the same context. But suppose that we add overloading,
keeping the other rules. Then, if i is different from 42 only the
general (schematic) version is applicable, so there is a unique
applicable signature. But when i = 42, then both calls would
be valid, so there is a conflict. As a little variation one
could write:

function bar(i : integer) : integer;
var a : T(i*i) = 0;
begin
bar := foo(a);
end;

Since 42 is not a square the second case can not occur, so there
would be no conflict at runtime.
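The three cases can be imitated in C++ by modelling the discriminant as run-time data. This is only a sketch under assumptions: C++ has no schema types, so the struct T, the Signature record, and the resolve, bar and bar_squared functions are all made up for illustration, and "applicable" is reduced to a predicate on the discriminant.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical model of a schema type T(i): the discriminant is only
// known at run time, as in `var a : T(i)`.
struct T {
    int discriminant;
    int value;
};

// One candidate signature: a label plus a predicate telling whether the
// signature accepts an argument with the given discriminant.
struct Signature {
    std::string name;
    std::function<bool(int)> accepts;
};

enum class Resolution { None, Unique, Conflict };

// Count the candidates applicable to one concrete argument.
inline Resolution resolve(const std::vector<Signature>& candidates, const T& arg) {
    int applicable = 0;
    for (const auto& s : candidates) {
        if (s.accepts(arg.discriminant)) ++applicable;
    }
    if (applicable == 0) return Resolution::None;
    return applicable == 1 ? Resolution::Unique : Resolution::Conflict;
}

// The two foo-s from the text: foo(a : T) and foo(a : T(42)).
inline std::vector<Signature> foo_candidates() {
    return {
        {"foo(a : T)", [](int) { return true; }},
        {"foo(a : T(42))", [](int d) { return d == 42; }},
    };
}

// bar builds T(i); bar_squared builds T(i*i), mirroring the two variants.
inline Resolution bar(int i) { return resolve(foo_candidates(), T{i, 0}); }
inline Resolution bar_squared(int i) { return resolve(foo_candidates(), T{i * i, 0}); }

// Small usage check: only i = 42 in the first variant gives a conflict.
inline void demo_resolution() {
    assert(bar(7) == Resolution::Unique);
    assert(bar(42) == Resolution::Conflict);
    assert(bar_squared(42) == Resolution::Unique);
}
```

A compiler that wanted to reject bar_squared statically would have to prove that i*i is never 42, which is the kind of reasoning that makes the third case undecidable in general.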

Of course, the Ada spirit would be to disallow both versions.
OTOH even in Ada some checks are delayed to runtime...

> >> The first point is that from the SW development POV, if the imported
> >> module's interface changes, the client code must be reviewed anyway.
> >
> > Well, if your coding rules say a review is required, do what the rules
> > say. However, adding a new signature without changing the behaviour of
> > existing signatures may be a "safe" change. Namely, it is safe
> > for languages without overloading.
>
> It is never safe. The change can always introduce a conflict. The only
> choice is between flagging it as an error always or only when a client
> actually uses it.

Well, assuming that there is no conflict between two different
imported signatures, and only an imported signature gets shadowed by
a local one, you will get the same code. Of course, after adding
a new export, compilation may discover that the added signature is
in conflict with another imported signature, but this is a different
case.
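As a data point from a language with overloading and implicit conversions, here is a small C++ sketch of a third possibility: an added signature that neither conflicts nor gets rejected, yet changes which function an unchanged client call reaches. The namespaces lib_v1 and lib_v2 stand for two versions of one hypothetical module; all names are invented for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical module, version 1: exports a single foo.
namespace lib_v1 {
inline std::string foo(double) { return "foo(double)"; }
}

// Version 2 adds one signature and leaves the old one untouched.
namespace lib_v2 {
inline std::string foo(double) { return "foo(double)"; }
inline std::string foo(int) { return "foo(int)"; }
}

// The same client call, compiled against each version.
inline std::string client_v1() { return lib_v1::foo(42); }  // int converts to double
inline std::string client_v2() { return lib_v2::foo(42); }  // exact match wins

inline void demo_added_overload() {
    assert(client_v1() == "foo(double)");
    assert(client_v2() == "foo(int)");
}
```

Nothing here is flagged at compile time, so whether such a change counts as "safe" depends on whether the new overload really behaves like the old one for the arguments it now captures.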

> >> The second point is that the rule #1 does not actually protect clients.
> >> As an example consider two modules used by the client. If one of the
> >> modules introduces a name conflict with another module, that breaks the
> >> client anyway, because this case cannot be handled by the rule #1 anymore.
> >
> > Sure. But there is an important difference: global objects must be
> > coordinated to avoid conflicts.
>
> There should be no global objects, for the start.

Sorry, you are attaching a different meaning to words than I am.
To have a meaningful discussion we need a common (global to the
discussion) understanding of "global" and "object". To put
it differently, common words should be global objects. Similarly,
to be able to program you need global objects. For example,
in C++ everything at source level is in some namespace. Eliminate
the global namespace and you can no longer program in C++ (you would
probably consider this a good thing, but the equivalent applies
to Ada). In particular some interfaces must be global.

> > Extended Pascal significantly decreases the chance of
> > artificial conflict, that is accidentally importing a name that
> > the client does not want to use.
>
> You cannot accidentally import anything. The module interface must be
> designed rationally to be useful for the clients. There is no defense
> against poorly designed interfaces.

Well, there is no law of nature saying that poorly designed interfaces
are impossible. Actually, the appearance of poorly designed interfaces
seems to be the natural state. So, unless you can control 100% of
the codebase that you use, you are likely to be forced to use
some poorly designed interface. Of course, you may hide such an
interface behind a better designed one, but this was exactly
my point: you need some way to insulate your code from unwanted
external interfaces.

> However interfaces can be designed in favor of clients deploying fully
> qualified names vs clients deploying direct visibility. Unfortunately
> one should choose one.
>
> The former tends to choose same names for everything. So that the client
> does something like:
>
> IO.Integers.Put;
>
> The latter would rather do
>
> IO.Integer_IO.Put_Integer
>
> which with direct visibility becomes
>
> Put_Integer

Well, with overloading it is normal to use direct visibility and
use the same name for all types: overloading takes care of types
and you rarely need qualified names. Of course, you need to
properly choose names so that they correspond well with actual
meaning, but this is easier if you do not have to invent new
names just to avoid conflicts...

> > OTOH explicit import/export lists
> > are less attractive when there is overloading (to get equivalent
> > effect they require repeating type information).
>
> Why? Explicit import in Ada is very popular. I gave example of
>
> use type T;
>
> clause which is that thing.

No, this is not an explicit import/export list. I meant something
like:

import signature foo : U -> V from T

meaning that you want from T only the function foo having an argument of
type U and a return value of type V. In principle T may export
20 different foo-s, each with its own signature, so saying

import foo from T

would be ambiguous (it could reasonably import all foo-s, but
that may be too much). And of course the problem is most acute
when there is an established "canonical" name like '+' or 'map'
(96 overloads in my codebase).
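C++ has no such import clause, but picking one overload by spelling out its signature can be approximated with a function pointer cast. A minimal sketch, assuming a hypothetical namespace T that exports three foo-s; the name imported_foo is likewise invented:

```cpp
#include <cassert>
#include <string>

// Hypothetical module T exporting several foo-s.
namespace T {
inline std::string foo(int) { return "foo(int)"; }
inline std::string foo(double) { return "foo(double)"; }
inline std::string foo(const std::string&) { return "foo(string)"; }
}

// Rough analogue of "import signature foo : double -> string from T":
// the cast names the exact signature, so exactly one overload is selected
// and the type information has to be repeated at the import site.
constexpr auto imported_foo = static_cast<std::string (*)(double)>(&T::foo);

inline void demo_import_by_signature() {
    // The int argument is converted to double; T::foo(int) is not reachable
    // through imported_foo.
    assert(imported_foo(1) == "foo(double)");
    assert(T::foo(1) == "foo(int)");
}
```

The repetition of the parameter and result types in the cast is exactly the cost mentioned above: with overloading, a bare name no longer identifies what is being imported.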

This is different from what you showed: you use a named interface.

> Usually such types are numeric which leads
> to massive overloading of +,-,*,/. Nobody cares as these are all resolvable.

In my case the "most overloaded" name is probably 'coerce' (238 overloads).
It "changes type", and a wrong version would not pass typechecking.

> > Concerning the Ada way, it is not clear to me how this is supposed
> > to work when there are multiple intersecting subsets of functions.
>
> I am not sure what you mean.

You showed how to import a whole interface (all exported signatures).
But how do you handle the situation when you have a module exporting several
signatures and in different uses you need different subsets?
Say, for files one client wants 'open', 'read', 'close'. Another
works with existing files and wants 'read' and 'write'. A package
for file operations is likely to contain several other operations.
I did not specify types above, but at least for some operations
it is natural to provide overloaded variants. In principle
a client may want an arbitrary subset of exported operations. Clearly
creating a separate interface per subset is not feasible.
In principle one could have one interface per operation, but
this looks clumsy. Can Ada do better?

> Overloading works pretty well in Ada because
> of types. Languages shy of overloading are ones with weak type system
> when everything gets lumped into something unresolvable. E.g. if there
> is no distinct numeric types or when the result is not a part of the
> signature etc.

Well, I would say "explicit types". ML and Haskell have
rather detailed types. But they use type inference, and
the algorithm they use depends on an "almost" lack of overloading
(they have kludges to overload arithmetic).

--
Waldek Hebisch

Dmitry A. Kazakov

Feb 5, 2022, 5:29:32 AM
Static inheritance is still dynamic polymorphism. The point is that in
dynamic polymorphism everything is controlled and any overloading that
occurs is always resolvable.

[...]

> Interfaces are related to overloading: visible interfaces decide
> which signatures are visible.

Yes, and irrelevant because in dynamic polymorphism the fundamental
principle is that a method is reachable regardless of its visibility. So
overloading simply does not matter, or else the language is broken.

> Well, theorem proving is uncomputable, but proof checking is computable.
> Arguably, program should be deemed correct only when programmer
> can justify its correctness. So, at least in theory you could have
> language that requires programmer to include justification (that is
> proof) of correctness...

Yes, you can put a meta language on top of the object language, e.g.
SPARK on top of Ada (and move the problem to the meta language).

However, when you do so, you usually prove the program correctness
rather than much weaker substitutability of types. Substitutability is
meant to ensure certain grade of correctness. If you can prove
correctness directly you care less. Of course you get fragile design if
you choose to ignore LSP completely, but that is a different story.

>>> Experience with a 200 thousand line codebase shows that
>>> this works well. However, it is not clear how this would work
>>> with a bigger codebase (say 20 million lines) or in a less regular
>>> problem domain.
>>
>> Large projects are split into weakly coupled modules.
>
> These 200 thousand lines are split into about 1000 modules. But
> there is a cluster of about 600 mutually connected modules. From
> one point of view many of those modules are "independent", but
> there are subtle connections, and if you track them transitively there
> is a largish cluster...

Right, which is why I did not say they were independent. Experience with
larger Ada projects shows that overloading is never a problem.

>>> Arguably, signaling an error when overloading cannot be resolved
>>> in a unique way would be safer.
>>
>> That is the only way. The question is different. Whether visibility
>> rules should allow programs with *potentially* unresolvable overloading.
>> E.g. both M1 and M2 declare conflicting Foo. M3 uses both M1 and M2 but
>> does not reference Foo. Is M3 legal? Or just the fact of using
>> conflicting M1 and M2 should make M3 illegal.
>
> The word "potentially" has many different meanings. In my case
> types are hairy enough that there are 3 cases:
>
> - the compiler can decide that there is only one applicable signature

= no conflict.

BTW, to be precise, compiler cannot decide anything, it must follow the
language rules, compile all correct programs (impossible of course) and
reject all incorrect ones.

> - the compiler can decide that there is a conflict

= conflict

> - the compiler cannot decide, and only at runtime is it possible to
> decide if there is a conflict.

= poorly designed language.

> The core of the problem may be illustrated using Extended Pascal.
> Namely, one can define a schema type:
>
> type T(i : integer) = 0..100;
>
> Above, the type discriminant 'i' is otherwise unused but ensures
> that later we get incompatible types.
>
> We can have a function taking an argument of the schema type:
>
> function foo(a : T) : integer;
> ....
>
> But we can also use a discriminated version of the type:
>
> function foo(a : T(42)) : integer;
> ....

> Of course, the Ada spirit would be to disallow both versions.

Right. In Ada you cannot overload these within the same context because
that is not statically resolvable. A discriminant constraint [T(42)]
produces in Ada a constrained subtype, not a new type. Subtypes cannot be
used in resolution. It is the same as with arrays. Constrained arrays with
fixed bounds are subtypes and considered interchangeable, thus useless
for resolution purposes.

Note that you still can overload on subtypes using different modules:

package M1 is
type T (I : Integer) is null record;
subtype S is T (42);
end M1;
-------------------- In the same context ---------------
with M1; use M1;
package M2 is
function Foo (A : T) return Integer;
function Foo (A : S) return Integer; -- Illegal
end M2;
--------------------------------------------------------
But you can
-------------------- Two different packages ------------
with M1; use M1;
package M3 is
function Foo (A : T) return Integer;
end M3;

with M1; use M1;
package M4 is
function Foo (A : S) return Integer;
end M4;
--------------------------------------------------------
and then
-------------------- Import from both ------------------
with M3; use M3;
with M4; use M4;
package M5 is
-- Here Foo is overloaded and unresolvable
end M5;
--------------------------------------------------------
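For comparison, a minimal C++ sketch of the same situation. It assumes namespaces stand in for the packages; C++ has no subtypes, so both Foo-s simply take the same parameter type, and the bodies returning 3 and 4 are made up so the two can be told apart.

```cpp
#include <cassert>

// Two modules that each declare a Foo with the same signature.
namespace M3 { inline int Foo(int) { return 3; } }
namespace M4 { inline int Foo(int) { return 4; } }

// Both modules are made directly visible. This alone is accepted:
// the clash is only reported where Foo is actually referenced.
using namespace M3;
using namespace M4;

inline int uses_qualified_names() {
    // return Foo(0);   // would not compile: the call is ambiguous
    return M3::Foo(0) + M4::Foo(0);  // qualified names still resolve
}

inline void demo_potential_conflict() {
    assert(uses_qualified_names() == 7);
}
```

So C++ takes the permissive answer to the earlier question: importing two conflicting modules is legal, and only an unqualified use of the conflicting name is an error.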

> OTOH even in Ada some checks are delayed to runtime...

Checks yes, resolution no.

> Well, assuming that there is no conflict between two different
> imported signatures, and only an imported signature gets shadowed by
> a local one, you will get the same code. Of course, after adding
> a new export, compilation may discover that the added signature is
> in conflict with another imported signature, but this is a different
> case.

This is the case I meant.

>>>> The second point is that the rule #1 does not actually protect clients.
>>>> As an example consider two modules used by the client. If one of the
>>>> modules introduces a name conflict with another module, that breaks the
>>>> client anyway, because this case cannot be handled by the rule #1 anymore.
>>>
>>> Sure. But there is an important difference: global objects must be
>>> coordinated to avoid conflicts.
>>
>> There should be no global objects, for the start.
>
> Sorry, you are attaching a different meaning to words than I am.
> To have a meaningful discussion we need a common (global to the
> discussion) understanding of "global" and "object". To put
> it differently, common words should be global objects. Similarly,
> to be able to program you need global objects.

No, you do not. There is a root namespace (in Ada it is the package
Standard). What you suggest is modifying the root namespace, which is by
the way inconsistent with the idea of a module interface.

Anyway, in Ada you cannot modify Standard. There is no way to place an
object into it.

What you can do is create a library-level package P (whose fully
qualified name is Standard.P). P is a child package of Standard; as such
it directly sees anything from Standard.

Then you can put objects into P.

No global objects as you see.

>> However interfaces can be designed in favor of clients deploying fully
>> qualified names vs clients deploying direct visibility. Unfortunately
>> one should choose one.
>>
>> The former tends to choose same names for everything. So that the client
>> does something like:
>>
>> IO.Integers.Put;
>>
>> The latter would rather do
>>
>> IO.Integer_IO.Put_Integer
>>
>> which with direct visibility becomes
>>
>> Put_Integer
>
> Well, with overloading it is normal to use direct visibility and
> use the same name for all types: overloading takes care of types
> and you rarely need qualified names.

Right, but among Ada developers those in favor of fully qualified names
("no-use-clause") are at least 50%.

> Of course, you need to
> properly choose names so that they correspond well with actual
> meaning, but this is easier if you do not have to invent new
> names just to avoid conflicts...

One of the arguments of the "no-use-clause" crowd is that you need not
care. Fully qualified names never conflict. Direct names may conflict
and may be difficult to track down to the declarations. Of course this
argument is weakened with a modern IDE where you can always click on "go
to declaration".

>>> OTOH explicit import/export lists
>>> are less attractive when there is overloading (to get equivalent
>>> effect they require repeating type information).
>>
>> Why? Explicit import in Ada is very popular. I gave example of
>>
>> use type T;
>>
>> clause which is that thing.
>
> No, this is not an explicit import/export list. I meant something
> like:
>
> import signature foo : U -> V from T
>
> meaning that you want from T only the function foo having an argument of
> type U and a return value of type V.

In Ada you would rename it:

function Foo (X : U) return V renames T.Foo;

Note that you could change the names of the function and of
the arguments

function Bar (Y : U) return V renames T.Foo;

So renaming is more versatile than import.

> This is different from what you showed: you use a named interface.

Yes, but also more useful. In practice renaming of subprograms for the
purpose of import is rare in Ada. When it is used, it is for other reasons. E.g. if
you want to replace predefined operations but still have access to the
original ones you use renaming:

type T is new Integer;

function Add (Left, Right : T) return T renames "+";

function "+" (Left, Right : T) return T is
begin -- My fancy new implementation of "+"
-- Here I still can use the old one as Add
return Add (Left, Right);
end "+";

>>> Concerning the Ada way, it is not clear to me how this is supposed
>>> to work when there are multiple intersecting subsets of functions.
>>
>> I am not sure what you mean.
>
> You showed how to import a whole interface (all exported signatures).
> But how do you handle the situation when you have a module exporting several
> signatures and in different uses you need different subsets?
> Say, for files one client wants 'open', 'read', 'close'. Another
> works with existing files and wants 'read' and 'write'. A package
> for file operations is likely to contain several other operations.

In Ada a package interface and type interface are very different things.
A package can declare any number of types.

An OO design could likely have different types for read-only and
read-write files. It is a classic case of multiple inheritance:

           file
          /    \
  read-only    write-only
          \    /
        read-write

[The Ada standard I/O library does it the old way. You have Read and
Write for a single File_Type. If the mode does not fit, you get a run-time
exception in Write.]
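The diamond can be sketched in C++ with virtual inheritance. This is an illustration under assumptions, not any library's design: the class names, the in-memory data_ string standing in for file contents, and the two client functions are all hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical file hierarchy following the diamond above.
class File {
public:
    virtual ~File() = default;
    const std::string& contents() const { return data_; }
protected:
    std::string data_;  // stands in for the real file contents
};

class ReadOnlyFile : public virtual File {
public:
    std::string read() const { return data_; }
};

class WriteOnlyFile : public virtual File {
public:
    void write(const std::string& s) { data_ += s; }
};

// Virtual inheritance makes both parents share the single File base.
class ReadWriteFile : public ReadOnlyFile, public WriteOnlyFile {};

// Each client states exactly the capability it needs.
inline std::string dump(const ReadOnlyFile& f) { return f.read(); }
inline void log_line(WriteOnlyFile& f, const std::string& line) { f.write(line + "\n"); }

inline void demo_file_diamond() {
    ReadWriteFile f;
    log_line(f, "hello");
    assert(dump(f) == "hello\n");
}
```

Passing a ReadOnlyFile to log_line is rejected at compile time, which is the static counterpart of the run-time mode exception described in the bracketed note.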

> I did not specify types above, but at least for some operations
> it is natural to provide overloaded variants. In principle
> client may want arbitrary subset of exported operations.

Normally it simply imports everything.

> Clearly
> creating separate interface per subset is not feasible.
> In principle one could have one interface per operation, but
> this looks clumsy.

In Ada a subprogram can be a compilation unit. So, the answer is yes,
you can have a subprogram as a standalone thing. It is very rarely used
in practice.

But I am still not sure what you are asking for. Packages are meant to
combine things coupled to each other. So, normally if you import one
thing from a package you need to import other things too. Otherwise the
package is poorly designed.

James Harris

Feb 12, 2022, 1:16:03 PM