New coding standards: use underscores, hyphens or mixed case in command (and identifier) names


James Harris

Jul 17, 2004, 8:17:03 AM

Before I embark on a new long-term language project I'd appreciate your advice on how to
split up long names. I would like to keep the standards for command or instruction names
the same as those for variable and type names, if possible. Looking at the examples below,
which ones seem better?

Straight names
echoclient
lastcharoffset
helloworld

Internal underscores
echo_client
last_char_offset
hello_world

I could also use embedded hyphens as my minus sign must be surrounded by whitespace
(please suspend disbelief while looking at these. I know they will look unfamiliar!)
echo-client
last-char-offset
hello-world

Mixed case
EchoClient
LastCharOffset
HelloWorld

Initial lower case then mixed
echoClient
lastCharOffset
helloWorld

In some ways I like the mixed case versions using an initial capital, especially as I may
want to prefix some names with a code for an abstract data type, which, when present,
could begin with a lower case. Is this getting too Microsoft-ish? Is it getting too
Hungarian? Is Hungarian bad when used with abstract data types rather than inbuilt ones?

Advice on which is or is not thought to be acceptable would be much appreciated. Please
bear in mind that I intend these names for commands/instructions as well as variables and
types. Constants would be in all caps.
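
To make the five candidate styles concrete, here is a minimal Python sketch that renders one list of words in each of them. The function names are made up for illustration, and joining the words of an all-caps constant with underscores is an assumption, since the post only says constants would be in all caps.

```python
def split_words(name):
    """Split an underscore-separated name into lower-case words."""
    return [word.lower() for word in name.split("_") if word]


def straight(words):
    # echoclient
    return "".join(words)


def underscores(words):
    # echo_client
    return "_".join(words)


def hyphens(words):
    # echo-client
    return "-".join(words)


def mixed_case(words):
    # EchoClient
    return "".join(word.capitalize() for word in words)


def lower_then_mixed(words):
    # echoClient
    if not words:
        return ""
    return words[0] + "".join(word.capitalize() for word in words[1:])


def constant(words):
    # LAST_CHAR_OFFSET (the underscore separator is an assumption)
    return "_".join(word.upper() for word in words)


if __name__ == "__main__":
    words = split_words("last_char_offset")
    for style in (straight, underscores, hyphens, mixed_case,
                  lower_then_mixed, constant):
        print(style.__name__, style(words))
```

Seen side by side, only the straight form loses the word boundaries; the other four can be converted into one another mechanically.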

--
Thanks,
James


Marco van de Voort

Jul 17, 2004, 9:13:27 AM
On 2004-07-17, James Harris <> wrote:
> EchoClient
> LastCharOffset
> HelloWorld
>
> Initial lower case then mixed
> echoClient
> lastCharOffset
> helloWorld
>
> In some ways I like the mixed case versions using an inital capital,
> especially as I may want to prefix some names with a code for an abstract
> data type, which, when present, could begin with a lower case. Is this
> getting too Microsoft-ish? Is it getting to Hungarian? Is Hungarian bad
> when used with abstract data types rather than inbuilt ones?

Hungarian notation is not bad or good. The point is, do you need that bit of
extra security of Hungarian notation? If you have a strongly typed language
with good error messages you don't need it.

E.g. in Pascal nobody ever uses Hungarian notation, unless forced
externally. The compiler directly snaps at you that it wanted type x, while
you gave y. This can even be done before compilation, by the syntax
highlighter. (E.g. Delphi uses the compiler binary for syntax highlighting
and Intellisense, which is much more powerful than a simple parser as highlighter.)

Delphi coding style btw does use some Hungarian notation. Mostly enumeration
elements and class fields have a prefix. The latter is because of properties
occupying the same namespace, which is a valid reason. I'm not sure about the
exact reason for the former.

So saying that you _want_ Hungarian notation doesn't make sense. If you want
to do it right, you need proper motivation _why_ the programmer has to go
through the extra burden to do that extra type administration.

> Advice on which is or is not thought to be acceptable would be much
> appreciated. Please bear in mind that I intend these names for
> commands/instructions as well as variables and types. Constants would be
> in all caps.

Some other questions:

- Is your language case-sensitive? Complex capitalisation hurts more if a
slightly different capitalisation fails to compile.
- The Delphi styleguide, for example, has different rules for native code and
for imported code (header/api units). Do you plan such a thing?
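
The first question can be illustrated with a small Python sketch of a symbol table looked up both ways. The table contents and function names are hypothetical; the point is only that a near-miss in capitalisation fails under the case-sensitive rule and succeeds under the case-insensitive one.

```python
def lookup_case_sensitive(symbols, name):
    """Return the symbol's value only if the spelling matches exactly."""
    return symbols.get(name)


def lookup_case_insensitive(symbols, name):
    """Return the symbol's value regardless of capitalisation."""
    folded = {key.casefold(): value for key, value in symbols.items()}
    return folded.get(name.casefold())


# Hypothetical symbol table with one mixed-case identifier.
symbols = {"LastCharOffset": 42}

if __name__ == "__main__":
    # The programmer typed an initial lower-case letter by mistake.
    print(lookup_case_sensitive(symbols, "lastCharOffset"))
    print(lookup_case_insensitive(symbols, "lastCharOffset"))
```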

Howard Ding <hading@hading.dnsalias.com>

Jul 17, 2004, 1:20:00 PM
"James Harris" <no.email.please> writes:

> I could also use embedded hyphens as my minus sign must be surrounded by whitespace
> (please suspend disbelief while looking at these. I know they will look unfamiliar!)
> echo-client
> last-char-offset
> hello-world
>

Why? They look perfectly familiar to anyone who programs Lisp.

--
Howard Ding
<had...@hading.dnsalias.com>

Marcin 'Qrczak' Kowalczyk

Jul 17, 2004, 5:38:37 PM
On Sat, 17 Jul 2004 13:17:03 +0100, James Harris wrote:

> Before I embark on a new long-term language project I'd appreciate your
> advice on how to split up long names.

http://groups.google.com/groups?selm=pan.2004.06.20.12.55.58.551616%40knm.org.pl
http://groups.google.com/groups?selm=pan.2003.10.28.12.31.07.163797%40knm.org.pl

--
__("< Marcin Kowalczyk
\__/ qrc...@knm.org.pl
^^ http://qrnik.knm.org.pl/~qrczak/

cr88192

Jul 17, 2004, 11:00:09 PM

"James Harris" <no.email.please> wrote in message
news:40f918a1$0$7807$db0f...@news.zen.co.uk...

>
> Before I embark on a new long-term language project I'd appreciate your
> advice on how to split up long names. I would like to keep the standards
> for command or instruction names the same as that for variable and type
> names, if possible. Looking at the examples below, which ones seem better?
>
depends...

> Straight names
> echoclient
> lastcharoffset
> helloworld
>

I use these for the terminal part of a callback function name, eg:

void *pdscr_threads_makethread(PDSCR0_Context *ctx, void **args, int n);

also for local variable names.

> Internal underscores
> echo_client
> last_char_offset
> hello_world
>

some other cases, typically internal functions.

> I could also use embedded hyphens as my minus sign must be surrounded by
> whitespace (please suspend disbelief while looking at these. I know they
> will look unfamiliar!)
> echo-client
> last-char-offset
> hello-world
>

yes, that works, but can't be done in c and friends.
eg, scheme had used a lot of other symbols in names like this, eg:
(set! x y)

(if* (list? x) :then (display (car x)) :else (display x))
yes, non-standard, but this is to illustrate a point...

...

in general:
! is used for destructive operations;
? is used for predicates (they only return true or false);
* is typically used for alternate forms of something;
...

:foo is a keyword, basically meaning that it just evaluates to itself.

> Mixed case
> EchoClient
> LastCharOffset
> HelloWorld
>

typical of normal functions or types.

> Initial lower case then mixed
> echoClient
> lastCharOffset
> helloWorld
>

personally I don't as much like this one.

> In some ways I like the mixed case versions using an inital capital,
> especially as I may want to prefix some names with a code for an abstract
> data type, which, when present, could begin with a lower case. Is this
> getting too Microsoft-ish? Is it getting to Hungarian? Is Hungarian bad
> when used with abstract data types rather than inbuilt ones?
>
> Advice on which is or is not thought to be acceptable would be much
> appreciated. Please bear in mind that I intend these names for
> commands/instructions as well as variables and types. Constants would be
> in all caps.
>

yes.

I use hungarian sometimes, usually it is to deal with "name clashes", or
different types of whatever referring to the same thing.


James Harris

Jul 20, 2004, 5:33:04 PM

"Marco van de Voort" <mar...@stack.nl> wrote in message
news:slrncfi9fn....@toad.stack.nl...

> Hungarian notation is not bad or good. The point is, do you need that bit
> of extra security of Hungarian notation? If you have a strong typed
> language with good error messages you don't need it.

My intention is that the language will be strongly typed - but will include
a Variant type (and something like a TypeOf operator).

> So saying that you _want_ Hungarian notation doesn't make sense. If you
> want to do it right, you need proper motivation _why_ the programmer has
> to go through the extra burden to do that extra type administration.

Having seen a particular description of Hungarian I agree with you. I
didn't like what I saw!

I would like the language to show clarity of statement. I know we all want
that. I am thinking of something like this counterexample,

result = left + right

Does this add two integers and assign the result to another integer? Does
it add reals, ratios, complex numbers? It may do none of these. It may, in
fact, concatenate strings. Does it concatenate bit fields? None of this can
be determined without looking up the types of the three variables.
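
The ambiguity is easy to reproduce in any language that overloads "+". As a rough sketch, in Python (chosen only because its "+" happens to be overloaded across these types; the add helper and the operand list are illustrative, not part of the planned language) the same statement gives a different kind of result for each pair of operands:

```python
from fractions import Fraction


def add(left, right):
    """The statement under discussion; its meaning depends on the operand types."""
    result = left + right
    return result


# Illustrative operand pairs: integers, reals, ratios, complex numbers, strings.
OPERANDS = [
    (2, 3),
    (1.5, 2.25),
    (Fraction(1, 3), Fraction(1, 6)),
    (1 + 2j, 3 - 1j),
    ("echo", "client"),
]

if __name__ == "__main__":
    for left, right in OPERANDS:
        result = add(left, right)
        print(type(result).__name__, repr(result))
```

Nothing in the body of add says which of these operations takes place; that is decided entirely by the types of the values passed in.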

Here's another question which is also relevant: does the output data type
permit the result of the operation to be assigned without coercion? If the
operation is multiply is there a danger of me losing high-order bits
because the output type is the same width as both inputs? Here is an
alternative:

iResult = iLeft + iRight

though I must confess I don't much care for the look of that either. :-(
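
The multiplication worry can be made concrete with a small sketch. It simulates unsigned fixed-width arithmetic in Python (the widths and function names are assumptions for illustration): the product of two n-bit values needs up to 2n bits, so keeping only n bits can silently drop the high-order part.

```python
def mul_truncated(left, right, bits):
    """Multiply two unsigned values and keep only the low `bits` bits."""
    mask = (1 << bits) - 1
    return (left * right) & mask


def mul_widened(left, right, bits):
    """Multiply two unsigned `bits`-wide values into a 2*bits-wide result.

    The full product of two n-bit unsigned values always fits in 2n bits.
    """
    limit = 1 << bits
    if not (0 <= left < limit and 0 <= right < limit):
        raise ValueError("operand does not fit in the given width")
    return left * right


def loses_high_bits(left, right, bits):
    """True if a result of the same width as the inputs drops information."""
    return mul_truncated(left, right, bits) != left * right


if __name__ == "__main__":
    print(hex(mul_truncated(0xFFFF, 0xFFFF, 16)))
    print(hex(mul_widened(0xFFFF, 0xFFFF, 16)))
    print(loses_high_bits(200, 300, 16))
```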


> - Is your language case-sensitive? Complex capitalisation hurts more if a
> slightly different capitalisation fails to compile.

At this time the language is intended to be case sensitive. For familiarity
to users who are not computer-literate the hyphenated version is growing on
me, eg the user typing hello-world.

> - e.g. Delphi styleguide has different rules for native code and e.g.
> imported code (header/api units). Do you plan such a thing?

No. I particularly don't want the language to enforce or require different
rules for different circumstances. I want the language to prescribe as
little as possible, leaving the system designer to choose the
representation where possible.


--
James


James Harris

Jul 20, 2004, 5:45:00 PM

"cr88192" <cr8...@protect.hotmail.com> wrote in message
news:BIlKc.494$Qn6...@fe07.usenetserver.com...

> > echo-client
> > last-char-offset
> > hello-world
> >
> yes, that works, but can't be done in c and friends.
> eg, scheme had used a lot of other symbols in names like this, eg:
> (set! x y)
>
> (if* (list? x) :then (display (car x)) :else (display x))
> yes, non-standard, but this is to illustrate a point...
>
> ...
>
> in general:
> ! is used for destructive operations;
> ? is used for predicates (they only return true or false);
> * is typically used for alternate forms of something;
> ...
>
> :foo is a keyword, basically meaning that it just evaluates to itself.

I intend deliberately not to distinguish variable names and function
names. Then source code can be unchanged if the type changes. This is
particularly for array and function melding. E.g. result =
factorial.(input-value). I'd like to be able to extend that to functions
that do not require input.


> I use hungarian sometimes, usually it is to deal with "name clashes", or
> different types of whatever reffering to the same thing.

Do you mean, for example, to distinguish the number realSalary from the
printable representation stringSalary?

--
James


James Harris

Jul 20, 2004, 5:46:34 PM

<had...@hading.dnsalias.com> wrote in message
news:m3wu121...@frisell.localdomain...

> "James Harris" <no.email.please> writes:
>
> > I could also use embedded hyphens as my minus sign must be surrounded
> > by whitespace (please suspend disbelief while looking at these. I know
> > they will look unfamiliar!)
> > echo-client
> > last-char-offset
> > hello-world
> >
>
> Why? They look perfectly familiar to anyone who programs Lisp.

I think I could get used to these. Perhaps variable names are one of the
more readable bits of Lisp...... :-)

James Harris

Jul 20, 2004, 5:53:37 PM

"Marcin 'Qrczak' Kowalczyk" <qrc...@knm.org.pl> wrote in message
news:pan.2004.07.17....@knm.org.pl...

Marcin, thanks. I had seen your good comments before posting.

I still didn't understand your signature, though. Is it a duck, a baby in a
pram? No, perhaps a Kogut...? ;-)

--
Cheers,
James


Marcin 'Qrczak' Kowalczyk

Jul 20, 2004, 6:13:13 PM
On Tue, 20 Jul 2004 22:33:04 +0100, James Harris wrote:

> result = left + right
>
> Does this add two integers and assign the result to another integer. Does
> it add reals, ratios, complex numbers? It may do none of these.

Perhaps it does not matter. If it adds two amounts of time, I should not
care at this point if it's an integer or a rational.

Of course sometimes it does matter. But Hungarian notation is a poor
solution: it doesn't scale to many types (and representing almost
everything by integers is not a good idea, better introduce types
which more accurately describe the data), and looks ugly.

> It may, in fact, concatenate strings. Does it concatenate bit fields?

Assuming that the language uses + for concatenation. Some languages and
some people, including me, don't use + for anything other than addition
of numbers, precisely because it makes it hard to infer what a piece of
code means without knowing a larger context.

> Does the output data type permit the result of the operation to be
> assigned without coercion? If the operation is multiply is there a
> danger of me losing high-order bits because the output type is the same
> width as both inputs?

Ah, so you first create a language with a broken addition, and then use
a naming convention to remind people each time that it's broken :-)

My favorite approach to adding integers is that it does not overflow and
is not coerced. If a programmer writes 'let result = left + right', and
both left and right are non-negative integers, then the result is a
non-negative integer, period. This is how addition works.

I accept "wrong" results only for floating point numbers, because people
haven't invented a representation of real numbers which would be able to
replace floating point, so unfortunately here the limitations of possible
implementation must influence the behavior. But implementation of integers
limited only by memory is a known issue with known solutions, there are
free libraries for this (e.g. GMP), etc.
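
As a rough illustration of the difference, Python's built-in integers are limited only by memory, so they can stand in for the non-overflowing behaviour, while a masked addition imitates a two's-complement 32-bit machine integer. The constant and function names here are made up for the sketch.

```python
# Largest value of a signed 32-bit integer (name chosen for this sketch).
MOST_POSITIVE_32 = 2**31 - 1


def add_wrapping_int32(left, right):
    """Add as a two's-complement 32-bit machine would, wrapping on overflow."""
    total = (left + right) & 0xFFFFFFFF
    return total - 2**32 if total >= 2**31 else total


def add_unbounded(left, right):
    """Add with integers limited only by memory; the sum is always exact."""
    return left + right


if __name__ == "__main__":
    print(add_wrapping_int32(MOST_POSITIVE_32, MOST_POSITIVE_32))
    print(add_unbounded(MOST_POSITIVE_32, MOST_POSITIVE_32))
```

With the unbounded version, two non-negative operands always give a non-negative sum; with the wrapping version the same operands can give a negative one.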

Marcin 'Qrczak' Kowalczyk

Jul 20, 2004, 6:14:48 PM
On Tue, 20 Jul 2004 22:53:37 +0100, James Harris wrote:

> I still didn't understand your signature, though. Is it a duck, a baby in a
> pram? No, perhaps a Kogut...? ;-)

It's meant to be a chicken :-)
"Kurczak" is "chicken" in Polish.

cr88192

Jul 20, 2004, 10:54:01 PM

"James Harris" <no.email.please> wrote in message
news:40fd925a$0$7117$db0f...@news.zen.co.uk...

>
> "cr88192" <cr8...@protect.hotmail.com> wrote in message
> news:BIlKc.494$Qn6...@fe07.usenetserver.com...
>
> > > echo-client
> > > last-char-offset
> > > hello-world
> > >
> > yes, that works, but can't be done in c and friends.
> > eg, scheme had used a lot of other symbols in names like this, eg:
> > (set! x y)
> >
> > (if* (list? x) :then (display (car x)) :else (display x))
> > yes, non-standard, but this is to illustrate a point...
> >
> > ...
> >
> > in general:
> > ! is used for destructive operations;
> > ? is used for predicates (they only return true or false);
> > * is typically used for alternate forms of something;
> > ...
> >
> > :foo is a keyword, basically meaning that it just evaluates to itself.
>
> I'm intending deliberately not distinguishing variable names and function
> names. Then source code can be unchanged if the type changes. This is
> particularly for array and function melding. E.g. result =
> factorial.(input-value). I'd like to be able to extend that to functions
> that do not require input.
>
hmm, I am not totally sure of this one...

not all function names have a character appended on.
this was I guess a replacement for the p, n, ... postfixes for cl.

>
> > I use hungarian sometimes, usually it is to deal with "name clashes", or
> > different types of whatever reffering to the same thing.
>
> Do you mean, for example, to distinguish the number realSalary from the
> printable representation stringSalary?
>

well, just as an example:
int foo_function();
int (*foo_function_p)();

by default, foo_function_p points to foo_function, but may be overridden, so
&foo_function is not generally usable in this case.
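
The same arrangement can be sketched outside C. In this Python version a dictionary slot plays the role of the pointer; the slot name mirrors foo_function_p from the declarations above, while the dictionary, call_foo and foo_replacement are invented for the example.

```python
def foo_function():
    """The real function."""
    return "default"


def foo_replacement():
    """A hypothetical override installed later."""
    return "override"


# Stand-in for the C pointer: one mutable slot that all callers go through.
pointers = {"foo_function_p": foo_function}


def call_foo():
    """Call whatever the slot currently refers to, never foo_function directly."""
    return pointers["foo_function_p"]()


if __name__ == "__main__":
    print(call_foo())
    pointers["foo_function_p"] = foo_replacement
    print(call_foo())
```

Two names are needed because they mean different things: one is the function itself, the other is "whichever function is currently installed", and the suffix is what keeps them apart.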

similar is when one is the real function, and the other is an interpreter
builtin referring to the real function.

similar is because I like to stick to a convention of typically one-letter
base var names for functions, and sometimes the names clash, so they get
renamed:
void **a, **b;
double *fa, *fb; //renamed since a is already in use
NetParse_Node *n0; //would have been n if 'int n' weren't around
int n;
int i, j, k, l; //these ones are most of the time ints
int ai, bi, ci;
float af, bf, cf;
void *p, *q, *r; //these are quite often void pointers
char *s, *t, *u, *v; //often strings
float s, t; //main contenders for s and t space
int t; //also often a contender for t
...

as can maybe be noted, types are prefixed in the case of pointers and
postfixed in the case of normal vars.

a lot has to do with order as well, as usually later added variables will
more often be renamed than earlier ones, ...


Marco van de Voort

Jul 21, 2004, 5:36:28 AM
On 2004-07-20, James Harris <> wrote:
>
>> Hungarian notation is not bad or good. The point is, do you need that bit
>> of extra security of Hungarian notation? If you have a strong typed
>> language with good error messages you don't need it.
>
> My intention is that the language will be strongly typed - but will include
> a Variant type (and something like a TypeOf operator).

Then I wouldn't, or at least not for basic types.

You could still use it for some other things. Details depend on your language,
but I can give some examples from delphi:

- In Delphi, nearly all identifiers (types, consts, vars etc.) share the
same (compilation unit) namespace. Types are therefore commonly
prefixed with a 'T'.
- Resourcestrings are commonly prefixed too, with S (probably for string),
probably only to indicate that the string can be edited afterwards
without recompiling.
- Exceptions are classes, but are sometimes prefixed with E.
- Interfaces use I as (type) prefix.
- Classes can have both properties (with RTTI) and corresponding internal
fields. Therefore, class fields are commonly prefixed with "f".

>> If you want to do it right, you need proper motivation _why_ the
>> programmer has to go through the extra burden to do that extra type
>> administration.
>
> Having seen a particular description of Hungarian I agree with you. I
> didn't like what I saw!

Also keep in mind that a good parsable language with a decent importing
system can more easily show types etc in tooltip like popups in editors.

> I would like the language to show clarity of statement. I know we all want
> that. I am thinking of something like this counterexample,
>
> result = left + right
>
> Does this add two integers and assign the result to another integer. Does
> it add reals, ratios, complex numbers? It may do none of these. It may, in
> fact, concatenate strings. Does it concatenate bit fields? None of this can
> be determined without looking up the types of the three variables.

Yes. But must it be encoded in the identifiers?
That has two serious problems:
- it must be done by the programmer (as opposed to the IDE)
- programmers sometimes don't update it if the type changes.

> Here's another question which is also relevant, Does the output data type
> permit the result of the operation to be assigned without coercion? If the
> operation is multiply is there a danger of me losing high-order bits
> because the output type is the same width as both inputs? Here is an
> alternative

Better stuff this in compiler warnings etc. Or better, have decent runtime
range checking. (but that might be my Pascal bias)

>> - Is your language case-sensitive? Complex capitalisation hurts more if a
>> slightly different capitalisation fails to compile.
>
> At this time the language is intended to be case sensitive. For familiarity
> to users who are not computer-literate the hyphenated version is growing on
> me, eg the user typing hello-world.

Problem with hyphen is that it is the same char as minus. That can be
confusing. (and cause parser problems)

>> - e.g. Delphi styleguide has different rules for native code and e.g.
>> imported code (header/api units). Do you plan such a thing?
>
> No. I particularly don't want the language to enforce or require different
> rules for different circumstances

This was a styleguide, Delphi enforces nothing, though the default editor
encourages it a bit.

> I want the language to prescribe as little as possible, leaving the system
> designer to choose the representation where possible.

Of course. But keep in mind that the system that you devise for the initial
system is the one that usually sticks.

Delphi has different recommendations for imported header units, since those
are essentially translated C headers, while the normal styleguide is for native
Pascal code.

Marco van de Voort

Jul 21, 2004, 5:39:07 AM
On 2004-07-20, James Harris <> wrote:

>> I use hungarian sometimes, usually it is to deal with "name clashes", or
>> different types of whatever reffering to the same thing.
>
> Do you mean, for example, to distinguish the number realSalary from the
> printable representation stringSalary?

Note that the seriousness and likelihood of name clashes is dependent on
your scoping and import precedence rules. A good module system reduces the
likelihood of clashes.

Howard Ding <hading@hading.dnsalias.com>

Jul 21, 2004, 8:47:50 AM
"James Harris" <no.email.please> writes:

>
> I think I could get used to these. Perhaps variable names are one of the
> more readable bits of Lisp...... :-)
>

That's true. Along with the rest of it.

--
Howard Ding
<had...@hading.dnsalias.com>

cr88192

Jul 21, 2004, 11:46:02 AM

<had...@hading.dnsalias.com> wrote in message
news:m3smbly...@frisell.localdomain...

> "James Harris" <no.email.please> writes:
>
> >
> > I think I could get used to these. Perhaps variable names are one of the
> > more readable bits of Lisp...... :-)
> >
>
> That's true. Along with the rest of it.
>
this is something I have heard claims of a lot but don't really agree
with...

a problem with lisp syntax is that it is too regular, and many syntactic
constructs don't "stand out" as much as in other languages (the use of
special syntax for some forms helps a little, but given the common
"character chaining" approaches they tend to be cryptic).

this factor also interferes with the ability to skim the source and get a
rough idea of the structure.


things also seem to end up a little more nested than one would hope, but
this is more a language issue than a syntax one (where in many languages a
nesting of 3 or 4 levels is pretty deep, much deeper nestings seem to
accumulate a lot faster).

not to mention the factor that the syntax has a lot of "scaring off of the
newbies" type powers. yes, any unfamiliar languages have this property, but
imo lisp and smalltalk are more so than normal, c-like syntax is closer to
average but is much more common, and pascal syntax would probably be a
little less so (though I am not that fond of pascal syntax either...).
forth is especially bad (and also has the added bonus that once one stops
looking at it they start forgetting the structure, and re-familiarizing
oneself with their own code is extra difficult).

and yes, xml is worse than s-exps, but only rarely have I heard anyone
pushing direct use of xml as a programming language syntax...


I will argue that s-exps' power is their expressiveness, and not their
readability.

a possible issue here though is that they tend to interfere some with the
"behind the scenes" abilities of the parser, leading to the occurrence of
"syntax objects" and such to try to deal with the issues.

also, the workings of the compiler may find themselves constrained, and one
can't really change things around as much without interfering with the code
being compiled.
this is in contrast to languages like c where lots of weird crap can be done
in the parser, and where things can be changed around readily, at the cost
that the syntax is far less expressive and powerful macro systems are
largely eliminated...


or something...


James Harris

Jul 21, 2004, 5:36:54 PM

"Marcin 'Qrczak' Kowalczyk" <qrc...@knm.org.pl> wrote in message
news:pan.2004.07.20....@knm.org.pl...

> On Tue, 20 Jul 2004 22:33:04 +0100, James Harris wrote:
>
> > result = left + right

<snip>

> > It may, in fact, concatenate strings. Does it concatenate bit fields?
>
> Assuming that the language use + for concatenation. Some languages and
> some people, including me, don't use + for anything other than addition
> of numbers. Precisely because it makes hard to infer what a piece of
> code means without knowing a larger context.

What I have in mind will allow the programmer to define meanings for words
or symbols - and to match these depending on context. The plus sign is
normally overloaded, representing integer addition of various precisions,
real addition and possibly others. I probably won't provide the
concatenation of strings per se but nor will I prevent "+" being used as a
method on new data types.

> > Does the output data type permit the result of the operation to be
> > assigned without coercion? If the operation is multiply is there a
> > danger of me losing high-order bits because the output type is the same
> > width as both inputs?
>
> Ah, so you first create a language with a broken addition, and then use
> a naming convention to remind people each time that it's broken :-)

LOL. As I say, the programmer can choose.

> My favorite approach to adding integers is that it does not overflow and
> is not coerced. If a programmer writes 'let result = left + right', and
> both left and right are non-negative integers, then the result is a
> non-negative integer, period. This is how addition works.

Well, I did mention multiplication rather than addition, but taking your
comment, wouldn't MOSTPOS + MOSTPOS be too wide to be assigned to an
integer? Perhaps you mean a negative and a non-negative....?


James Harris

Jul 21, 2004, 5:46:19 PM

"Marco van de Voort" <mar...@stack.nl> wrote in message
news:slrncfse8s...@toad.stack.nl...
<snip>

> > I would like the language to show clarity of statement. I know we all
> > want that. I am thinking of something like this counterexample,
> >
> > result = left + right
> >
> > Does this add two integers and assign the result to another integer.
> > Does it add reals, ratios, complex numbers? It may do none of these. It
> > may, in fact, concatenate strings. Does it concatenate bit fields? None
> > of this can be determined without looking up the types of the three
> > variables.
>
> Yes. But must it be encoded in the identifiers?

No. I'm not suggesting it /must/ be for anyone using the language. I am
wondering, however, whether to use some form of type identification in my
own code written in the language.

> That has two serious problems:
> - it must be done by the programmer (opposed to the IDE)
> - programmers sometimes don't update it if the type changes.

Isn't your second argument the converse of your first? If your IDE can
identify types could it not rename identifiers? Incidentally a hover-help
IDE is a good idea. I'm wanting the source to be preparable on simple
terminals, though.

> > Here's another question which is also relevant, Does the output data
> > type permit the result of the operation to be assigned without
> > coercion? If the operation is multiply is there a danger of me losing
> > high-order bits because the output type is the same width as both
> > inputs? Here is an alternative
>
> Better stuff this in compiler warnings etc. Or better, have decent
> runtime range checking. (but that might be my Pascal biass)

The jury is out on this one. I may permit the programmer to specify whether
an operation is to be checked or not.

> >> - Is your language case-sensitive? Complex capitalisation hurts more
> >> if a slightly different capitalisation fails to compile.
> >
> > At this time the language is intended to be case sensitive. For
> > familiarity to users who are not computer-literate the hyphenated
> > version is growing on me, eg the user typing hello-world.
>
> Problem with hyphen is that it is the same char as minus. That can be
> confusing. (and cause parser problems)

I agree it is unfamiliar. Allowing it as a hyphen in identifier names would
require the minus sign to be separated from neighbouring character strings
by whitespace. Not everyone would like this.
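
A minimal tokenizer sketch shows what that rule means in practice. The grammar here is an assumption made up for the example (names start with a letter and may contain embedded hyphens; the operator set is arbitrary), not the language's actual lexical rules.

```python
import re

# Hypothetical lexical rules: a name may contain embedded hyphens, so a
# hyphen directly between name characters is never a minus sign.
TOKEN = re.compile(r"""
    (?P<name>[A-Za-z][A-Za-z0-9]*(?:-[A-Za-z0-9]+)*)
  | (?P<number>[0-9]+)
  | (?P<op>[-+*/=()])
  | (?P<space>\s+)
""", re.VERBOSE)


def tokenize(source):
    """Split source text into (kind, text) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("last-char-offset"))      # one identifier
    print(tokenize("last - char - offset"))  # two subtractions
    print(tokenize("x-1"))                   # an identifier, not x minus 1
```

The last case is the one most likely to surprise people coming from C-like languages: under these rules x-1 is a single name, and the subtraction has to be written x - 1.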

James Harris

Jul 21, 2004, 5:57:08 PM

"Marco van de Voort" <mar...@stack.nl> wrote in message
news:slrncfsedr...@toad.stack.nl...

Marco, I'm not sure what you mean by the last comment. Could you add more?

I am thinking generally of very small modules, each defining an op-code and
its context. That op-code is then to be usable as any inbuilt op-code.

Where packages of op-codes are needed I was thinking for the package to
define a prefix of any length (from zero) and that all op-codes in that
package would have the prefix of the package.

For example, say I wanted to define a package of operations on
extended-length words I could say

type ExtendedWord is <whatever>
package e : prefix "e."
to negate (ExtendedWord value)
value = (- value)
endto negate

then use the new negate function in this way,

ExtendedWord myExtendedWord
e.negate myExtendedWord

where the e. is the prefix from the package definition.
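
The prefix idea can be mocked up as a registry, here in Python. Everything in this sketch (the OPCODES table, the package helper and the decorator form) is hypothetical scaffolding; only the "e." prefix and the negate operation come from the example above.

```python
# Global table of op-codes, keyed by their full (prefixed) names.
OPCODES = {}


def package(prefix=""):
    """Return a decorator factory that registers op-codes under `prefix`.

    The prefix may be of any length, including zero.
    """
    def to(name):
        def register(func):
            OPCODES[prefix + name] = func
            return func
        return register
    return to


# package e : prefix "e."
e = package("e.")


@e("negate")
def negate_extended(value):
    """Negate an extended-length word (a plain int stands in for it here)."""
    return -value


if __name__ == "__main__":
    my_extended_word = 12345678901234567890
    print(OPCODES["e.negate"](my_extended_word))
```

Because the prefix is part of the registered name, two packages can each define a negate without clashing, which is the module-system point raised earlier in the thread.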

James Harris

Jul 21, 2004, 6:03:50 PM

"cr88192" <cr8...@protect.hotmail.com> wrote in message
news:yUkLc.4957$Qn6....@fe07.usenetserver.com...

<snip>

> > I'm intending deliberately not distinguishing variable names and
> > function names. Then source code can be unchanged if the type changes.
> > This is particularly for array and function melding. E.g. result =
> > factorial.(input-value). I'd like to be able to extend that to
> > functions that do not require input.
> >
> hmm, I am not totally sure of this one...

Nor am I....! Maybe it gives too much flexibility to the programmer, making
another's code hard to understand. On the other hand maybe the programmer
should have this flexibility and then de facto standards can arise to keep
code clear. I'm not sure which way to go on this yet.

<snip>

> int ai, bi, ci;
> float af, bf, cf;

Interesting. Makes the variable types clear, doesn't it?

cr88192

Jul 21, 2004, 10:22:50 PM

"James Harris" <no.email.please> wrote in message
news:40fee844$0$7129$db0f...@news.zen.co.uk...

>
> "cr88192" <cr8...@protect.hotmail.com> wrote in message
> news:yUkLc.4957$Qn6....@fe07.usenetserver.com...
>
> <snip>
>
> > > I'm intending deliberately not distinguishing variable names and
> > > function names. Then source code can be unchanged if the type changes.
> > > This is particularly for array and function melding. E.g. result =
> > > factorial.(input-value). I'd like to be able to extend that to
> > > functions that do not require input.
> > >
> > hmm, I am not totally sure of this one...
>
> Nor am I....! Maybe it gives too much flexibility to the programmer,
> making another's code hard to understand. On the other hand maybe the
> programmer should have this flexibility and then de facto standards can
> arise to keep code clear. I'm not sure which way to go on this yet.
>
as far as I understand it, you were suggesting allowing functions to be
evaluated without an args list, or be used as fake objects.

depending on implementation, this can cause "weird" semantic issues (say, a
practical inability to usefully work with first-class functions).

of course, it could be generalized in another way:
all operations are implicitly application, so, for example, a normal
function can be used as an array or object, and can accept method calls and
slot assignments.

dunno your syntax, using my own.

function foo(var, val) {...}
foo.bar=baz;
which would be equivalent to a call:
foo(#bar, baz);


the issue, however, is when you pick up the "function with no arguments is
equivalent to its return value" type semantics.

function bar() 3;
x=bar;
x => 3

this eliminates a lot of possible uses of functions (eg: passing them around
and calling them from elsewhere).
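
The loss can be shown with a small Python sketch. The lookup function below is a hypothetical stand-in for a language rule that replaces any zero-argument function by its result as soon as it is named; it is not taken from either language in this thread.

```python
import inspect


def lookup(env, name):
    """Look up a name; zero-argument functions are replaced by their result."""
    value = env[name]
    if callable(value) and not inspect.signature(value).parameters:
        return value()   # the function object itself is no longer reachable
    return value


def bar():
    return 3


env = {"bar": bar, "twice": lambda f: f() + f()}

x = lookup(env, "bar")   # x is the number 3, not the function bar

try:
    lookup(env, "twice")(x)       # twice expects something callable
    passed_function = True
except TypeError:
    passed_function = False       # 3 cannot be called

print(x, passed_function)
```

Under this rule there is no way to hand bar itself to twice, which is the restriction described above.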

> <snip>
>
> > int ai, bi, ci;
> > float af, bf, cf;
>
> Interesting. Makes the variable types clear, doesn't it?
>

yes, but the main point is that often I exhaust my supply of short
local-variable names, and need to do something about it.

I also stick highly to conventions related to the use of the variables as
well...

I don't like longer names, mostly because they take more effort to type and
require me to actually think up a good var name, and they interfere with my
ability to copy and paste chunks of code between functions with only minor
(and sometimes no) alterations.

Marco van de Voort

Jul 22, 2004, 4:25:57 AM
On 2004-07-21, James Harris <> wrote:
>> > be determined without looking up the types of the three variables.
>>
>> Yes. But must it be encoded in the identifiers?
>
> No. I'm not suggesting it /must/ be for anyone using the language. I am
> wondering, however, whether to use some form of type identification in my
> own code written in the language.

I wouldn't for a strongly typed language with a module system. HN is a kludge for
systems that don't have that.

>> That has two serious problems:
>> - it must be done by the programmer (opposed to the IDE)
>> - programmers sometimes don't update it if the type changes.
>
> Isn't your second argument the converse of your first? If your IDE can
> identify types could it not rename identifiers?

No, since it might not have access to all occurrences: parts may be
precompiled-only, people might not always use the IDE, etc.

I don't like languages that are _only_ editable via their own IDE.

> Incidentally a hover-help
> IDE is a good idea. I'm wanting the source to be preparable on simple
> terminals, though.

Could be done on terminals too. Simply identify the symbol the cursor is on, and
display its type in the status bar.

Our own textmode IDE is of the Turbo Vision type (like the Turbo Pascal IDE and
DOS edit), but extended to have a symbol browser, some IntelliSense-like
features etc.

Most GUI IDE concepts translate quite well to textmode too, especially
since TV is also event driven.

>> > because the output type is the same width as both inputs? Here is an
>> > alternative
>>
>> Better stuff this in compiler warnings etc. Or better, have decent runtime
>> range checking. (but that might be my Pascal bias)
>
> The jury is out on this one. I may permit the programmer to specify whether
> an operation is to be checked or not.

That's what Pascal does also. Please also allow it to be disabled/enabled
_locally_, and not just globally for the project or module.

> growing on
>> > me, eg the user typing hello-world.
>>
>> Problem with hyphen is that it is the same char as minus. That can be
>> confusing. (and cause parser problems)
>
> I agree it is unfamiliar. Allowing it as a hyphen in identifier names would
> require the minus sign to be separated from neighbouring character strings
> by whitespace. Not everyone would like this.

I wouldn't like it. But I don't like any significant meaning placed on
whitespace. Call me old fashioned :-)
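
For readers who want to see what the whitespace rule would mean in practice, here is a rough tokenizer sketch in Python. The token names and the exact identifier pattern are assumptions made for illustration, not part of James's design.

```python
import re

TOKEN = re.compile(r"""
    (?P<name>[A-Za-z][A-Za-z0-9]*(?:-[A-Za-z0-9]+)*)   # hyphens allowed inside names
  | (?P<number>\d+)
  | (?P<op>[-+*/=()])
  | (?P<space>\s+)
""", re.VERBOSE)


def tokenize(text):
    """Split text into (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "space":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("last-char-offset - 1"))   # a name, a minus, a number
print(tokenize("x-1"))                    # a single name: no subtraction here
```

The second call shows the confusion raised above: without surrounding spaces, x-1 is read as one identifier rather than as a subtraction.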

Marco van de Voort

Jul 22, 2004, 4:50:52 AM
On 2004-07-21, James Harris <> wrote:
>
> "Marco van de Voort" <mar...@stack.nl> wrote in message
> news:slrncfsedr...@toad.stack.nl...
>> On 2004-07-20, James Harris <> wrote:
>>
>> >> I use hungarian sometimes, usually it is to deal with "name clashes", or
>> >> different types of whatever referring to the same thing.
>> >
>> > Do you mean, for example, to distinguish the number realSalary from the
>> > printable representation stringSalary?
>>
>> Note that the seriousness and likelihood of name clashes is dependent on
>> your scoping and import precedence rules. A good module system reduces the
>> likelihood of clashes.
>
> Marco, I'm not sure what you mean by the last comment. Could you add more?

The scope of an identifier is the region of the source where the identifier
can be used. So
- for a variable inside a procedure, the scope is the procedure
- for a global in a module, the scope is the module itself, and if the global is
  also exported the scope extends to all other modules that import the
  module.

However some languages have multiple ways of importing a module. E.g. modula2
can import symbols from a module in two ways:
(my M2 is a bit rusty, others please don't point out mistakes, it is for the idea
only)

MODULE xxx;

FROM yyy IMPORT a,b,c,d,e;
IMPORT zzz,ooo;

END xxx.

The FROM line imports identifiers a,b,c,d,e from yyy. These
identifiers can be used without the module name, so a,b,c,d,e

The IMPORT zzz line imports module zzz (and ooo). Identifiers from zzz can
_only_ be used qualified with modulename, so zzz.someident.

The second way of importing avoids nameclashes, because in xxx, zzz.bla is
totally different from ooo.bla. Moreover, in Modula2 users will actually be
biased to use the second (IMPORT) way because that way they don't have to
name all identifiers they want locally in the FROM way.
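
The same two styles exist in Python, which gives a quick way to try the idea. The sketch uses the standard posixpath and ntpath modules only because both happen to export a function called join; they stand in for zzz and ooo.

```python
# Unqualified import: the name 'join' lands directly in this namespace,
# like  FROM yyy IMPORT a  in the Modula-2 example.
from posixpath import join

# Qualified import: only the module names are introduced,
# like  IMPORT zzz, ooo.
import ntpath
import posixpath

unqualified = join("a", "b")
posix_style = posixpath.join("a", "b")     # qualified, so no clash ...
windows_style = ntpath.join("a", "b")      # ... with this other 'join'

print(unqualified, posix_style, windows_style)
```

One difference worth noting: a second unqualified import of the same name in Python silently rebinds it, rather than being rejected by the compiler.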

So the way you allow identifiers to go from module to module, and how
many ways you allow identifiers to be hidden (local procedures, local modules),
determines the likelihood of name clashes.

Note that the M2 system also allows the compiler to barf already on importing
two identifiers with the same name (using the FROM syntax above) without
actually having to look at the other module's source, since the same identifier
is then listed in two FROM .. IMPORT statements.

> I am thinking generally of very small modules, each defining an op-code and
> its context. That op-code is then to be usable as any inbuilt op-code.

.. treating "foreign" identifiers the same as locally defined ones increases
the likelihood of name clashes (as opposed to e.g. calling with a modulename
qualifier). That doesn't have to be bad, as long as you realise it.

>
> type ExtendedWord is <whatever>
> package e : prefix "e."

[...]

> where the e. is the prefix from the package definition.

If this prefixing is mandatory, that is pretty much what I meant.

Also think about nesting modules. Can be fun :-)

I got hooked on it using Modula2, and miss it in Pascal sometimes.

Lasse Hillerøe Petersen

Jul 22, 2004, 5:44:19 AM
In article <slrncfuvvc....@toad.stack.nl>,

Marco van de Voort <mar...@stack.nl> wrote:

> However some languages have multiple ways of importing a module. E.g. modula2
> can import symbols from a module in two ways:
> (my M2 is a bit rusty, others please don't point out mistakes, it is for the
> idea only)
>
> MODULE xxx;
>
> FROM yyy IMPORT a,b,c,d,e;
> IMPORT zzz,ooo;
>
> END xxx.

This is much like Perl modules.

use Yyy qw(a b c d e);
use Zzz qw();
use Ooo qw();

Except that Perl gives the module writer a convenient way to "default" a
list of names to be imported locally. This default set is then imported
by simply:
use Mmmm;

Of course Perl gives you access to symbol tables, so all this can be
hacked in every imaginable (and unimaginable) way anyhow.

Eiffel has one feature which I don't think is found in any other
language I know (although it could be achieved with Perl, I suppose.)
Of course in Eiffel, the only way to import is to inherit from some
other class, but inherited features can be *renamed* which gives the
inherited feature a different name locally.

-Lasse

Marcin 'Qrczak' Kowalczyk

Jul 22, 2004, 6:17:16 AM
On Wed, 21 Jul 2004 22:36:54 +0100, James Harris wrote:

> What I have in mind will allow the programmer to define meanings for words
> or symbols - and to match these depending on context. The plus sign is
> normally overloaded, representing integer addition of various precisions,
> real addition and possibly others. I probably won't provide the
> concatenation of strings per se but nor will I prevent "+" being used as a
> method on new data types.

I don't see a problem. This is fine. It means that looking at "x + y" you
don't know which implementation of addition will be used - so what? They
are all supposed to do analogous things to various types. They are all
supposed to give equal answer when applied to equal numbers represented
differently (be sure to distinguish integer division from real division),
modulo rounding errors. As I said, the behavior of floating point is the
only case when I accept a wrong answer motivated by ease of implementation.

> Well, I did mention multiplication rather than addition, but taking your
> comment, wouldn't MOSTPOS + MOSTPOS be too wide to be assigned to an
> integer. Perhaps you mean a negative and a non-negative....?

There is no such thing as the most positive integer, in any other sense
than the longest string. Sure, a billion of digits might not fit in
memory; when someone uses numbers *that* big, he starts getting out of
memory errors. Errors are better than a wrong answer, and in this case
errors are unavoidable. But 12345678901234567890 times 9876543210987654321
is 121932631137021795223746380111126352690, no problem.
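
The product is easy to check in any language with unbounded integers. A short Python sketch follows; the wrapped value is only there to show what a 64-bit unsigned multiply would keep.

```python
a = 12345678901234567890
b = 9876543210987654321

exact = a * b                  # Python integers grow as needed
wrapped = (a * b) % 2**64      # the low 64 bits, as a fixed-width multiply keeps

print(exact)
print(wrapped == exact)
```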

Marco van de Voort

Jul 22, 2004, 7:57:42 AM
On 2004-07-22, Lasse Hillerøe Petersen <lhp+...@toft-hp.dk> wrote:
>> idea
>> only)
>>
>> MODULE xxx;
>>
>> FROM yyy IMPORT a,b,c,d,e;
>> IMPORT zzz,ooo;
>>
>> END xxx.
>
> This is much like Perl modules.
>
> Except that Perl gives the module writer a convenient way to "default" a
> list of names to be imported locally. This default set is then imported
> by simply:
> use Mmmm;

Modula2 has that too IIRC. But the not entirely standard compiler that I had
didn't implement that (IIRC, it could also be that I simply missed it, I was
a beginner then).

cr88192

Jul 22, 2004, 12:52:31 PM

"Marco van de Voort" <mar...@stack.nl> wrote in message
news:slrncfuvvc....@toad.stack.nl...

something similar is possible in my lang per-se, though I don't explicitly
have a module system...
the idea is that a module is an object containing all code and variables for
a module. objects can be used as toplevels, albeit I am lacking in
(currently completed) means of usefully exploiting this feature.

my lang involves a "delegation" system, with the possibility of using this
for creating code in toplevels where it is not possible to do certain things
or access certain data (eg: because it is not visible in the scope of the
code being run). (better security may be needed eventually, but this may be
good as a basic system).
as a result, "modules" are also relative to the point of execution.

assuming there is a module of "system" features, with an "io" submodule,
with a function called openfile, one has a few ways to reference it:
system.io.openfile //a full reference

var io=system.io;
io.openfile //a shorter reference

var _io=system.io;
openfile //because now "self" delegates to system.io (note the '_')

and finally:
var openfile=system.io.openfile;
openfile //because the binding was imported directly
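
To make the four forms concrete, here is a rough Python model of the lookup rule. The Scope class and its lookup method are hypothetical; the sketch assumes that a slot whose name starts with a single underscore is searched as a delegate, and it leaves out cycle handling.

```python
class Scope:
    """An object whose failed lookups fall through to its delegates."""

    def __init__(self, **slots):
        self.slots = dict(slots)

    def lookup(self, name):
        if name in self.slots:
            return self.slots[name]
        # Slots named with one leading underscore act as delegates.
        for key, value in self.slots.items():
            is_delegate = key.startswith("_") and not key.startswith("__")
            if is_delegate and isinstance(value, Scope):
                try:
                    return value.lookup(name)
                except KeyError:
                    continue
        raise KeyError(name)


def openfile(path):
    return f"opened {path}"


io = Scope(openfile=openfile)
system = Scope(io=io)

full = system.lookup("io").lookup("openfile")    # system.io.openfile
alias = Scope(io=system.lookup("io"))            # var io=system.io;
delegating = Scope(_io=system.lookup("io"))      # var _io=system.io;
direct = Scope(openfile=full)                    # var openfile=system.io.openfile;

print(delegating.lookup("openfile")("notes.txt"))
```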

of course, this may be cumbersome. new syntax could be used, or a hack based
on a function:
function import(module, vars...)
{
local i;
for(i=0; vars[i]; i++)
self[vars[i]]=module[vars[i]];
}

now, if I wanted, I could write:
import(system.io, #openfile);

(imo this is one cool feature of not using lexical scoping...).

or something (then I just remember that file io in my lang sucks at present,
I just have a few special read/write functions and bytevectors, and
bytevectors are stupid for io in the absence of c-style type casting, ...).

of course, a lot of this is defeated by a few issues at present:
the real toplevel is currently used as a dumping ground for builtins, and
better organization would be needed for making security more manageable;
my special "object" syntax has not been fully completed/tested (I just have
the "dictionary" syntax, but this may actually make more sense for toplevels
anyways, eg, since a lot more control is given and expressions are evaluated
in the context of the creator);
I would need to add forms like 'load(script, toplevel)';
...

var user_only=[_user:=user, _console:=system.io.console];
this would create a toplevel which delegates only to "user" and
"system.io.console", but doesn't include anything else.

this could be useful, eg, with:
load("untrusted.bs", user_only);

but these are all minor.


James Harris

Jul 22, 2004, 4:00:49 PM

"Marco van de Voort" <mar...@stack.nl> wrote in message
news:slrncfuugl....@toad.stack.nl...
<snip>

> > The jury is out on this one. I may permit the programmer to specify whether
> > an operation is to be checked or not.
>
> That's what Pascal does also. Please also allow to _locally_ disable/enable
> it, and not just global to the project or module.

Agreed.

> > growing on
> >> > me, eg the user typing hello-world.
> >>
> >> Problem with hyphen is that it is the same char as minus. That can be
> >> confusing. (and cause parser problems)
> >
> > I agree it is unfamiliar. Allowing it as a hyphen in identifier names would
> > require the minus sign to be separated from neighbouring character strings
> > by whitespace. Not everyone would like this.
>
> I wouldn't like it. But I don't like any significantly meaning placed on
> whitespace. Call me old fashioned :-)

OK. You're old fashioned. :-)

James Harris

Jul 22, 2004, 5:06:37 PM

"cr88192" <cr8...@protect.hotmail.com> wrote in message
news:ZxFLc.5196$Qn6....@fe07.usenetserver.com...
<snip>

> as far as I understand it, you were suggesting allowing functions to be
> evaluated without an args list, or be used as fake objects.
>
> depending on implementation, this can cause "weird" semantic issues (say, a
> practical inability to usefully work with first-class functions).
>
> of course, it could be generalized in another way:
> all operations are implicitly application, so, for example, a normal
> function can be used as an array or object, and can accept method calls and
> slot assignments.
>
> dunno your syntax, using my own.
>
> function foo(var, val) {...}
> foo.bar=baz;
> which would be equivalent to a call:
> foo(#bar, baz);
>
>
> the issue, however, is when you pick up the "function with no arguments is
> equivalent to its return value" type semantics.
>
> function bar() 3;
> x=bar;
> x => 3
>
> this eliminates a lot of possible uses of functions (eg: passing them around
> and calling them from elsewhere).

Good point. I was thinking of this syntax

x y = func arg1 arg2

where func would be matched against its input types and output types. The
question is, could I replace arg1 with another function and could I do the same
with result x or y? I am thinking that all values to the left of the equals sign
would be lvalues (i.e. addresses of) and those to the right rvalues (i.e. the
values of). Since func expects two arguments if I replace arg1 I have to replace
it with another argument of the same type. This means the replacement would
either be a single identifier or need parentheses. Taking the simple identifier
first, replacing y with function fred and arg1 with function joe,

x fred = func joe arg2

where fred is another function would pass the second result of func - that was
previously assigned to y - to the function fred. Fred would be expected to
consume the value passed to it, given the above syntax. Now for function joe. It
would have to emit its value.

Now the case where fred and joe take parameters. Say they are arrays or are
functions modelling arrays. Would this work?

x (fred.3) = func (joe.9) arg2

In this case the second result of func should be allocated to element 3 of fred.
Strictly, fred would be passed two values, the index 3 and the second result
from func. Element 9 of joe would be used in the calculation. Joe would be
passed one value, 9, and emit a result of whatever type it was specified to
return.

Hmm. At the moment I think you are right. I don't have a way to pass the
function joe itself to function func. The syntax is somewhat more defined than
the simple examples above suggest. I'll see if I can fit in a means of passing
functions.

--
Cheers,
James


James Harris

Jul 22, 2004, 5:23:12 PM

"Marco van de Voort" <mar...@stack.nl> wrote in message
news:slrncfuvvc....@toad.stack.nl...
<snip>

> .. treating "foreign" identifiers the same as locally defined increases
> the likelihood of name clashes (opposed to e.g. calling with a modulename
> qualifier). That doesn't have to be bad, as long as you realise it.
> >
> > type ExtendedWord is <whatever>
> > package e : prefix "e."
> [...]
> > where the e. is the prefix from the package definition.
>
> If this prefixing is mandatory, that is pretty much what I meant.
>
> Also think about nesting modules. Can be fun :-)
>
> I got hooked on it using Modula2, and miss it in Pascal sometimes.

Thanks for explaining about the identifier imports. I've snipped it from the
above but followed your reasoning.

Yes, the prefixing is intended to be mandatory but a) is only for procedure
names, member functions if you like, and b) all variables will be local. I'm
intending all procedures to run as separate communicating processes. All
communication between them will be in defined interfaces. That said there will
be the option of collecting procedures in packages along with data.

To refer to a procedure name in a given package the package prefix will be
mandatory but that prefix can be of zero length, which means that the member
function names will stand alone. I'm intending that each process or package is
compiled separately. Compiled functions will form new instructions or data
types. To use these they will be referred to as any inbuilt instructions or data
types. This brings in the question of linking. I won't go in to the details here
as it is off topic but it is to be a form of lazy linking. The requirement to
say we are ready to start process A is that all processes A refers to are
locatable and have themselves been confirmed as ready to start.
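
As a rough illustration of that readiness rule only, here is a Python sketch. The function name, the refers_to mapping and the decision to reject circular references are all assumptions; the post does not say how cycles would be treated.

```python
def ready_to_start(name, refers_to, _in_progress=None):
    """Return True if `name` and everything it refers to can be located.

    `refers_to` maps each locatable process name to the names it refers to.
    """
    in_progress = set() if _in_progress is None else _in_progress
    if name not in refers_to:
        return False                      # not locatable
    if name in in_progress:
        # Assumption: circular references are rejected rather than resolved.
        raise ValueError(f"circular reference involving {name!r}")
    in_progress.add(name)
    try:
        return all(ready_to_start(dep, refers_to, in_progress)
                   for dep in refers_to[name])
    finally:
        in_progress.discard(name)


complete = {"A": ["B", "C"], "B": ["C"], "C": []}
incomplete = {"A": ["B"], "B": ["D"]}     # D cannot be located

print(ready_to_start("A", complete), ready_to_start("A", incomplete))
```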

I haven't given nested modules much thought yet!


James Harris

Jul 22, 2004, 5:29:07 PM

"cr88192" <cr8...@protect.hotmail.com> wrote in message
news:khSLc.5946$Qn6....@fe07.usenetserver.com...

> var _io=system.io;
> openfile //because now "self" delegates to system.io (note the '_')

I followed the others but this one defeated me. How does this work? I mean that
I can understand

var _=system.io;
openfile //etc

uses _ for "self" but aren't you showing _io as self...? I am assuming self is a
context for procedure names and hence openfile is found in self's context.


cr88192

Jul 22, 2004, 5:57:40 PM

"James Harris" <no.email.please> wrote in message
news:41002c57$0$7132$db0f...@news.zen.co.uk...
all this is just weird and I don't follow it that well...

is the idea that you just have raw function calls and ones that work like
assignment or something?

I have pattern matching that can be used vaguely similar:
{#x, #y}=foo(3, 4);

the idea in this case is that foo is expected to return an array with 2
values, which will be bound to x and y.
(in this case, I leave it undefined whether the left hand side is evaluated
at compile time or runtime, but it should be treated like a compile-time
operation...).

however, it is not possible to "filter" things like that.
as far as I can tell, your language also has implicit currying? (eg: a
function can take some of the args directly, in which case it creates a
function expecting more of the args, delaying evaluation until all args are
received?).

personally, I am not a fan of implicit currying as it can have both
implementation and semantic consequences, instead I like currying to be done
explicitly...
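
For comparison, explicit partial application as it looks in Python, using functools.partial from the standard library. The volume function is just an invented example.

```python
from functools import partial


def volume(length, width, height):
    return length * width * height


# Explicit partial application: the intent is visible at the call site.
slab = partial(volume, 2, 3)
print(slab(4))

# Without implicit currying, a call with too few arguments is an error
# instead of quietly producing a new function.
try:
    volume(2, 3)
    missing_argument_detected = False
except TypeError:
    missing_argument_detected = True
```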

> In this case the second result of func should be allocated to element 3 of fred.
> Strictly, fred would be passed two values, the index 3 and the second result
> from func. Element 9 of joe would be used in the calculation. Joe would be
> passed one value, 9, and emit a result of whatever type it was specified to
> return.
>
> Hmm. At the moment I think you are right. I don't have a way to pass the
> function joe itself to function func. The syntax is somewhat more defined than
> the simple examples above suggest. I'll see if I can fit in a means of passing
> functions.
>

dunno.

I like use of first class functions (eg: being able to dynamically pass them
around and call them, stuffing them in objects, ...).

of course, from what I can see the languages are clearly somewhat different
(mine inherits a lot from c and javascript, and some from scheme and
self...).


cr88192

Jul 22, 2004, 6:32:28 PM

"James Harris" <no.email.please> wrote in message
news:4100319c$0$7133$db0f...@news.zen.co.uk...
self is always implicitly referenced if a variable can't be found in the
current lexical scope.

self can be referenced directly via the pseudo-variable self.

'_' is not in itself a name, but is intended to be a NULL pseudo-variable
(eg: any attempts to bind or assign it have no effect, and any attempts to
reference it are viewed as invalid).
it is intended to signify "don't care" spots in patterns.

however, as a special bit of semantics, all variables beginning with '_' are
used as "delegates" (excepting those beginning with '__', which are
'special', this includes variables which are generally intended to be
set/interpreted by the implementation or used for special features, but not
to be really used by general code).

eg:
'_io' means "a delegate variable named 'io'".
whatever is assigned to io will be searched for references if they can't be
found in self or any of the previous delegates.
the reason a name is given is so that it is possible to reference or
re-assign them if needed (though this will be discouraged as later it may be
allowed to cause a performance hit).

there are a number of possible delegates. a few 'default' ones are '_parent'
and '_toplevel', which will typically take precedence over others (this can
be controlled more finely by creating objects with dictionary syntax, where
the exact precedence order is under the creators control).


also cool:
delegation graphs are also allowed to be cyclic, as otherwise delegating the
toplevel to system.io, assuming system.io delegated to the toplevel, would
lead to an infinite loop whenever a variable could not be found in any child
scope.
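
One way such a lookup can stay finite on a cyclic graph is to remember which objects have already been searched. The Python sketch below is an assumption about how that might be done, not a description of the actual implementation; Scope and its slots dictionary are hypothetical names.

```python
class Scope:
    def __init__(self, **slots):
        self.slots = dict(slots)

    def lookup(self, name, _visited=None):
        visited = set() if _visited is None else _visited
        if id(self) in visited:
            raise KeyError(name)          # already searched during this lookup
        visited.add(id(self))
        if name in self.slots:
            return self.slots[name]
        for key, value in self.slots.items():
            is_delegate = key.startswith("_") and not key.startswith("__")
            if is_delegate and isinstance(value, Scope):
                try:
                    return value.lookup(name, visited)
                except KeyError:
                    pass
        raise KeyError(name)


toplevel = Scope()
io = Scope(openfile=lambda path: f"opened {path}")
toplevel.slots["_toplevel"] = toplevel    # the toplevel delegates to itself
toplevel.slots["_io"] = io
io.slots["_toplevel"] = toplevel          # and io delegates back to the toplevel

print(toplevel.lookup("openfile")("log.txt"))
```

A lookup of a name that exists nowhere ends with a KeyError instead of looping.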

by default the toplevel delegates to itself. the idea is thus that
'_toplevel' can be used from pretty much anywhere to refer to the current
toplevel.

eg: _toplevel._io=system.io; is yet another possible way to do an import,
and will also work from child scopes (assuming that _toplevel doesn't
delegate somewhere weird along the way...).

all for now.


James Harris

Jul 22, 2004, 6:34:35 PM

"cr88192" <cr8...@protect.hotmail.com> wrote in message
news:dLWLc.5972$Qn6....@fe07.usenetserver.com...

<snip>

> > Now the case where fred and joe take parameters. Say they are arrays or
> > are functions modelling arrays. Would this work?
> >
> > x (fred.3) = func (joe.9) arg2
> >
> all this is just weird and I don't follow it that well...

I know. Someone else's syntax suddenly thrown in to the pot without explanation
is hard to follow. Using more conventional syntax

fred.3 is an array reference, more usually referred to as fred[3]
joe.9 is an array reference joe[9]

The /type/ of fred.3 is the type of an element of fred. If the C language
returned a tuple (and using a non-C "let" for clarity of meaning) the above
could be seen as

let x, fred[3] = func (joe[9], arg2)

where func reads the two values on its right and produces the two values on its
left.

> is the idea that you just have raw function calls and ones that work like
> assignment or something?

Everything to the right of the function name, "func", is passed to func.
Everything to the left of the equals sign is returned from it.

In the case of joe, it too can be seen as a function. It has the number 9 to its
right so is passed the number 9. In this case, as an array, it returns the value
of element 9, which replaces (joe[9]) in the arguments passed to func.

Fred, on the other hand, is to the left of the equals sign. It is passed the
number 3 (the index) and the second result from func. It will then do what an
array does, store the second result from func in element 3. As a simpler
example,

fred[3] = 7

is expressed as

fred.3 = 7

or

fred.(3) = 7

where parentheses serve no purpose other than to gather arguments, in this case,
only one of them. The latter is reminiscent of ocaml arrays.

In all the above, and key to what I am intending, we don't need to know in the
source code whether fred or joe are true arrays or process abstractions behaving
as arrays. This allows me to change the implementation of these components
without changing the source code that uses them. A more definite example, joe,
for speed and simplicity, could be an array in the same memory space as the
process being defined. On the other hand joe could be a separate process running
on the computer. Further, joe could be either an array stored on or a process
running on a different machine somewhere out on the network. In all cases, as
long as joe behaves as required and the language implements the communication,
the source code of the program that uses joe does not change. I like that!
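
A small Python sketch of the same property, where the calling code cannot tell a real list from an object that merely behaves like one. ComputedArray, func and run are invented names, and func returns two results only to mirror the example.

```python
class ComputedArray:
    """Looks like an array to callers but computes or records its elements."""

    def __init__(self, compute):
        self.compute = compute
        self.overrides = {}

    def __getitem__(self, index):
        return self.overrides.get(index, self.compute(index))

    def __setitem__(self, index, value):
        self.overrides[index] = value


def func(a, b):
    return a + b, a * b                  # two results, as in the example


def run(fred, joe, arg2):
    x, fred[3] = func(joe[9], arg2)      # same source text for either kind
    return x, fred[3]


with_lists = run([0] * 10, list(range(10)), 2)
with_objects = run(ComputedArray(lambda i: 0), ComputedArray(lambda i: i), 2)
print(with_lists, with_objects)
```

A process on another machine would need a communication layer behind __getitem__ and __setitem__, which the sketch does not attempt.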


> I have pattern matching that can be used vaguely similar:
> {#x, #y}=foo(3, 4);
>
> the idea in this case is that foo is expected to return an array with 2
> values, which will be bound to x and y.

Yes, this part looks to be the same. My returns are just individual values (of
any type) at the moment. I have yet to do the work to cover variable numbers of
values, generators and the like.

> (in this case, I leave it undefined whether the left hand side is evaluated
> at compile time or runtime, but it should be treated like a compile-time
> operation...).
>
> however, it is not possible to "filter" things like that.
> as far as I can tell, your language also has implicit currying? (eg: a
> function can take some of the args directly, in which case it creates a
> function expecting more of the args, delaying evaluation until all args are
> received?).
>
> personally, I am not a fan of implicit currying as it can have both
> implementation and semantic consequences, instead I like currying to be done
> explicitly...

Thanks for the comment. As I say, I have yet to work this stuff out.


> I like use of first class functions (eg: being able to dynamically pass them
> around and call them, stuffing them in objects, ...).
>
> of course, from what I can see the languages are clearly somewhat different
> (mine inherits a lot from c and javascript, and some from scheme and
> self...).

I think there is also a strong C influence in mine - but, as you have noticed,
not the syntax...!

cr88192

Jul 22, 2004, 8:08:03 PM

"James Harris" <no.email.please> wrote in message
news:410040f4$0$7125$db0f...@news.zen.co.uk...

>
> "cr88192" <cr8...@protect.hotmail.com> wrote in message
> news:dLWLc.5972$Qn6....@fe07.usenetserver.com...
>
> <snip>
>
> > > Now the case where fred and joe take parameters. Say they are arrays or
> > > are functions modelling arrays. Would this work?
> > >
> > > x (fred.3) = func (joe.9) arg2
> > >
> > all this is just weird and I don't follow it that well...
>
> I know. Someone else's syntax suddenly thrown in to the pot without explanation
> is hard to follow. Using more conventional syntax
>
> fred.3 is an array reference, more usually referred to as fred[3]
> joe.9 is an array reference joe[9]
>
> The /type/ of fred.3 is the type of an element of fred. If the C language
> returned a tuple (and using a non-C "let" for clarity of meaning) the above
> could be seen as
>
> let x, fred[3] = func (joe[9], arg2)
>
> where func reads the two values on its right and produces the two values on its
> left.
>
ok, this makes sense now.

> > is the idea that you just have raw function calls and ones that work like
> > assignment or something?
>
> Everything to the right of the function name, "func", is passed to func.
> Everything to the left of the equals sign is returned from it.
>

yes, ok.

yes, this is cool.

>
> > I have pattern matching that can be used vaguely similar:
> > {#x, #y}=foo(3, 4);
> >
> > the idea in this case is that foo is expected to return an array with 2
> > values, which will be bound to x and y.
>
> Yes, this part looks to be the same. My returns are just individual values (of
> any type) at the moment. I have yet to do the work to cover variable numbers of
> values, generators and the like.
>

ok.

the idea though is that multiple return values are generated by returning an
array.

eg:
function haar1d(a, b) ({a+b, b-a});
//parens needed for syntactic reasons

function haar2d(a, b, c, d)
{
{#a1, #b1}=haar1d(a, b);
{#c1, #d1}=haar1d(c, d);
{#a2, #c2}=haar1d(a1, c1);
{#b2, #d2}=haar1d(b1, d1);

{a2, b2, c2, d2}
}
ok, this would have been more elegant with raw substitution, but this is
just to show a point...
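
The same shape in Python, where multiple return values are tuples and the left-hand side unpacks them. This is a translation for illustration, reading the final pair as haar1d(b1, d1).

```python
def haar1d(a, b):
    return a + b, b - a          # sum and difference of one pair


def haar2d(a, b, c, d):
    a1, b1 = haar1d(a, b)
    c1, d1 = haar1d(c, d)
    a2, c2 = haar1d(a1, c1)      # combine the two sums
    b2, d2 = haar1d(b1, d1)      # combine the two differences
    return a2, b2, c2, d2


print(haar2d(1, 2, 3, 4))
```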

misc note:
functions in my lang can be used in both functional and imperative styles,
eg:
tail expressions have implicit return and tail-optimizing, like above;
I can use return manually, eg, to return from wherever or force
tail-optimization.

it is also possible to fold arrays as well, eg:
{#x, #y...}={1, 2, 3, 4};
x => 1
y => {2, 3, 4}
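
Python has a close counterpart in starred unpacking, shown here only as a point of comparison.

```python
x, *y = [1, 2, 3, 4]       # x takes the head, y collects the rest
(p, q), r = (1, 2), 3      # patterns can also be nested
print(x, y, p, q, r)
```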

generators are not possible in my case at present though, though in my
personal experience I haven't come up with a strong reason for generators
anyways.
a hack could be done, eg:
function fib_gen(x, y)
fun() ({x, fib_gen(y, x+y+1)...});
var fib=fib_gen(1, 1)();

this could either produce a (faked) infinitely long array, or form a nested
array:
{1, {1, {3, {5, ...}}}}

this is assuming I could come up with a good reason for them.
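
The nested form can be modelled in Python with closures. In this sketch fib_gen returns a thunk that yields a head and the thunk for the rest; the step x + y + 1 is copied from the post as written, and walk is a hypothetical helper.

```python
from itertools import islice


def fib_gen(x, y):
    # Calling the returned thunk gives the head and a thunk for the rest.
    return lambda: (x, fib_gen(y, x + y + 1))


def walk(thunk):
    """Flatten the nested (head, rest) pairs into an ordinary iterator."""
    while True:
        head, thunk = thunk()
        yield head


first_five = list(islice(walk(fib_gen(1, 1)), 5))
print(first_five)
```

Nothing beyond the requested elements is ever computed, which is what makes the "infinite" structure affordable.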

this also reminds me of once having ideas of doing symbolic algebra in a
programming language. it would require a few hacks but it would be possible.
dunno if there is any possible value in this either though...

a weaker hack:
var tri=[a:=3, b:=4];
var pyth=[c:=expr(sqrt((a*a)+(b*b)))];
println("c=", (tri&pyth).c());

where expr would define an "expression" that can be forced to evaluate it by
calling it, in which case it will replace itself with its value (sort of
like delay, but maybe more constrained in that it will not be bytecoded
until final evaluation and thus may be rewritten before hand, and will have
special behavior wrt operations).
(this makes me wonder if this is just a kludge for my absence of lists...).


one thing though:
use of array folding and multiple return values like this will have memory
costs at present (one of the many things that eliminates constant-memory
ness).

I may fix this later though.

> > (in this case, I leave it undefined whether the left hand side is evaluated
> > at compile time or runtime, but it should be treated like a compile-time
> > operation...).
> >
> > however, it is not possible to "filter" things like that.
> > as far as I can tell, your language also has implicit currying? (eg: a
> > function can take some of the args directly, in which case it creates a
> > function expecting more of the args, delaying evaluation until all args are
> > received?).
> >
> > personally, I am not a fan of implicit currying as it can have both
> > implementation and semantic consequences, instead I like currying to be done
> > explicitly...
>
> Thanks for the comment. As I say, I have yet to work this stuff out.
>

yes, however with the semantics from before implicit currying may be
required though...

>
> > I like use of first class functions (eg: being able to dynamically pass them
> > around and call them, stuffing them in objects, ...).
> >
> > of course, from what I can see the languages are clearly somewhat different
> > (mine inherits a lot from c and javascript, and some from scheme and
> > self...).
>
> I think there is also a strong C influence in mine - but, as you have noticed,
> not the syntax...!
>

yes, mine has a lot of syntactic influence from c, but a lot of semantic
influence from scheme and self.


Marco van de Voort

unread,
Jul 22, 2004, 10:29:12 PM7/22/04
to
On 2004-07-22, James Harris <> wrote:
>>
>> If this prefixing is mandatory, that is pretty much what I meant.
>>
>> Also think about nesting modules. Can be fun :-)
>>
>> I got hooked on it using Modula2, and miss it in Pascal sometimes.
>
> Thanks for explaining about the identifier imports. I've snipped it from the
> above but followed your reasoning.
>
> Yes, the prefixing is intended to be mandatory but a) is only for procedure
> names, member functions if you like, and b) all variables will be local.

Types, constants ?

(I'll read the rest later when I'm not dead tired)

> I haven't given nested modules much thought yet!

For a more OOP equivalent, see inner classes in Java.

Marco van de Voort

unread,
Jul 22, 2004, 10:29:54 PM7/22/04
to
On 2004-07-23, Marco van de Voort <mar...@stack.nl> wrote:

(as said dead tired)

>> above but followed your reasoning.
>>
>> Yes, the prefixing is intended to be mandatory but a) is only for procedure
>> names, member functions if you like, and b) all variables will be local.
>
> Types, constants ?
>
> (I'll read the rest later when I'm not dead tired)
>
>> I haven't given nested modules much thought yet!
>
> For a more OOP eq, see inner-classes in Java.

... but with a little bit more control over import/export.

Don Groves

unread,
Jul 29, 2004, 2:02:47 AM7/29/04
to
"James Harris" <no.email.please> wrote in message news:<40f918a1$0$7807$db0f...@news.zen.co.uk>...
> Before I embark on a new long-term language project I'd appreciate your advice on how to
> split up long names. I would like to keep the standards for command or instruction names
> the same as that for variable and type names, if possible. Looking at the examples below,
> which ones seem better?

>
> I could also use embedded hyphens as my minus sign must be surrounded by whitespace
> (please suspend disbelief while looking at these. I know they will look unfamiliar!)
> echo-client
> last-char-offset
> hello-world

This gets my vote. Easy and fast to type (no shifted chars) and easy to read,
especially for lispers and schemers. Others will get used to it quickly.
--
dg

cody

unread,
Jul 30, 2004, 3:53:30 AM7/30/04
to
It depends on the language you are using.
All of the conventions you list are in use in some language, so you cannot
say that one is better than the others.

echoClient->Java
EchoClient->VB,C#,Delphi,VC++ (Most languages used under Windows)
echo-client->Scheme and some other strange Languages
echo_client->plain C, C++

My advice is that you should adopt the conventions of the language/framework
you are using.
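
For comparison, a short Python sketch that renders the same word list in each of the styles discussed in this thread. The function name `styles` and the dictionary keys are chosen here for illustration only.

```python
def styles(words):
    """Render a list of lower-case words in each naming style from the thread."""
    caps = [w.capitalize() for w in words]
    return {
        "straight": "".join(words),
        "underscore": "_".join(words),
        "hyphen": "-".join(words),
        "PascalCase": "".join(caps),
        "camelCase": words[0] + "".join(caps[1:]),
    }


for name, text in styles(["last", "char", "offset"]).items():
    print(f"{name:11} {text}")
```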

--
cody

Freeware Tools, Games and Humour
http://www.deutronium.de.vu || http://www.deutronium.tk
"James Harris" <no.email.please> wrote in message
news:40f918a1$0$7807$db0f...@news.zen.co.uk...


>
> Before I embark on a new long-term language project I'd appreciate your
> advice on how to split up long names. I would like to keep the standards
> for command or instruction names the same as that for variable and type
> names, if possible. Looking at the examples below, which ones seem better?
>
> Straight names
> echoclient
> lastcharoffset
> helloworld
>
> Internal underscores
> echo_client
> last_char_offset
> hello_world
>
> I could also use embedded hyphens as my minus sign must be surrounded by
> whitespace (please suspend disbelief while looking at these. I know they
> will look unfamiliar!)
> echo-client
> last-char-offset
> hello-world
>
> Mixed case
> EchoClient
> LastCharOffset
> HelloWorld
>
> Initial lower case then mixed
> echoClient
> lastCharOffset
> helloWorld
>
> In some ways I like the mixed case versions using an initial capital,
> especially as I may want to prefix some names with a code for an abstract
> data type, which, when present, could begin with a lower case. Is this
> getting too Microsoft-ish? Is it getting too Hungarian? Is Hungarian bad
> when used with abstract data types rather than inbuilt ones?
>
> Advice on which is or is not thought to be acceptable would be much
> appreciated. Please bear in mind that I intend these names for
> commands/instructions as well as variables and types. Constants would be
> in all caps.
>
> --
> Thanks,
> James
>
>


cody

unread,
Jul 30, 2004, 4:01:46 AM7/30/04
to
Sorry, I reread your posting; you aren't planning a project, you are planning
a new language.

So it depends on which platform you want to support. For primarily Windows I'd
suggest PascalCase (EchoClient).
You can also use camelCase (echoClient) like in Java.
If your language uses minus signs for subtraction I'd strongly suggest that
you not allow hyphens in identifiers.
Conventions like echo_client do not allow you to distinguish between
variable names and class names.
Therefore I consider PascalCase and camelCase the best ones.


--
cody

Freeware Tools, Games and Humour
http://www.deutronium.de.vu || http://www.deutronium.tk

"cody" <deutr...@web.de> wrote in message
news:2mudppF...@uni-berlin.de...

Lasse Hillerøe Petersen

unread,
Jul 31, 2004, 3:02:01 AM7/31/04
to
In article <2mue7iF...@uni-berlin.de>, "cody" <deutr...@web.de>
wrote:

> Sorry I reread you posting, you aren't planning a project, you are planning
> a new Language.
>
> So it depends which platform your want to support. For primarily windows I'd
> suggest pascalcase (EchoClient).
> You can also use camelcase (echoClient) like in Java.
> If your language uses minus signs for subtraction I'd strongly suggest you
> not to allow hyphens in identifiers.
> conventions like (echo_client) do not allow differentiations between
> variable and classnames.

Au contraire. Apart from the (perhaps less) obvious method of using
boldface for types/classnames and regular italic for variables, and
permitting spaces in names, you can still use a lower/uppercase
convention combined with underscore. I believe this is the style
recommended for Eiffel by Bertrand Meyer. Although I'd choose
bold/italic/space if possible, I'd pick Eiffel-style otherwise, except
when using a language already having some other convention.

-Lasse

cr88192

unread,
Jul 31, 2004, 6:05:28 AM7/31/04
to

"Lasse Hillerøe Petersen" <lhp+...@toft-hp.dk> wrote in message
news:lhp+news-C8E508...@news.tele.dk...
I wonder if your source is still in plaintext...
in most normal conditions bold and italic are not usable in programming
languages based on the fact that text editors don't support them, or the
compiler doesn't accept the formats for which that style of formatting is
allowed.

unless of course you are using an ide or such that handles all of this, or
you have an editor that does syntax highlighting or changing the style...

if, of course, formatting could be used in a language, this brings up
interesting ideas, like, eg, using a bold '.' to mean dot-product, or an
italic 'X' for cross product, ...


James Harris

unread,
Jul 31, 2004, 8:52:34 AM7/31/04
to

"Marco van de Voort" <mar...@stack.nl> wrote in message
news:slrncfuugl....@toad.stack.nl...

>
> I wouldn't like it. But I don't like any significantly meaning placed on
> whitespace. Call me old fashioned :-)

I already replied but I've been wondering about this statement. As a programmer
for many years I've agreed with this but I'm finding my views changing.

When programming we use various items of punctuation to separate elements in the
code but we don't expect users to use the punctuation-laden syntax when invoking
our code from the command line. They use whitespace. Compare these fictitious
statements,

write ("Hello", username, "\n");

write "Hello" username "\n"

and the second - which could be the syntax used when invoking the "write"
program - doesn't need the parens or the comma. This syntax DOES presume
grouping of the command with its parameters which is not necessary in this
example as there is nothing with which to group. In a more complex example,
given the functions "max" and "min" which return the largest and smallest of
their parameters, the more popular,

highest = max (A, B, C)

could be replaced by,

highest = max A B C

then the function would need to be grouped - delimited from any context - so,

range = max(A, B, C, D, E) - min(A, B, C, D, E)

would be replaced by

range = (max A B C D E) - (min A B C D E)

which provides the grouping required. How does that look?
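
A minimal Python sketch of how such a grouped form might be read, assuming that every group is either a command followed by its parameters or a single `operand - operand` expression. The names `FUNCS`, `parse`, `evaluate` and `run` are invented for this illustration and say nothing about how the planned language will work internally.

```python
FUNCS = {"max": max, "min": min}


def parse(tokens):
    """Turn a token list into nested lists, one list per parenthesised group."""
    group = []
    while tokens:
        tok = tokens.pop(0)
        if tok == "(":
            group.append(parse(tokens))
        elif tok == ")":
            return group
        else:
            group.append(tok)
    return group


def evaluate(node, env):
    if isinstance(node, str):
        return env[node]                      # a plain variable
    if len(node) == 1:
        return evaluate(node[0], env)         # a lone parenthesised item
    if isinstance(node[0], str) and node[0] in FUNCS:
        # command followed by whitespace-separated parameters
        return FUNCS[node[0]](*(evaluate(arg, env) for arg in node[1:]))
    left, op, right = node                    # otherwise: operand - operand
    if op != "-":
        raise SyntaxError(f"unsupported operator {op!r}")
    return evaluate(left, env) - evaluate(right, env)


def run(source, env):
    tokens = source.replace("(", " ( ").replace(")", " ) ").split()
    return evaluate(parse(tokens), env)


env = {"A": 4, "B": 9, "C": 1, "D": 7, "E": 3}
print(run("(max A B C D E) - (min A B C D E)", env))
```

The parentheses carry all of the grouping here; no commas are needed because whitespace already separates the parameters.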

--
Cheers,
James


Marcin 'Qrczak' Kowalczyk

unread,
Jul 31, 2004, 9:07:32 AM7/31/04
to
On Sat, 31 Jul 2004 13:52:34 +0100, James Harris wrote:

> When programming we use various items of punctuation to separate elements in the
> code but we don't expect users to use the punctuation-laden syntax when invoking
> our code from the command line. They use whitespace. Compare these fictitious
> statements,
>
> write ("Hello", username, "\n");
>
> write "Hello" username "\n"

It's not that fictitious. In my language Kogut you write

Write "Hello " username "\n";

or better

WriteLine "Hello " username;

where the semicolon is needed if this statement is followed by other
statements.

> highest = max (A, B, C)
>
> could be replaced by,
>
> highest = max A B C

let highest = Max A B C;

> then the function would need to be grouped - delimited from any context - so,
>
> range = max(A, B, C, D, E) - min(A, B, C, D, E)
>
> would be replaced by
>
> range = (max A B C D E) - (min A B C D E)

Named function application binds stronger than operator application,
so this is

let range = Max A B C D E - Min A B C D E;
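
The precedence rule can be sketched in a few lines of Python. This is not Kogut's parser, only an illustration of the idea under simplifying assumptions: the only operator is `-`, arguments are plain names, and `FUNCS`, `apply_segment` and `evaluate` are names made up here.

```python
FUNCS = {"Max": max, "Min": min}


def apply_segment(tokens, env):
    """Evaluate one application: a function name followed by its arguments."""
    head, *args = tokens
    if head in FUNCS:
        return FUNCS[head](*(env[a] for a in args))
    if args:
        raise SyntaxError(f"{head} is not a function")
    return env[head]


def evaluate(source, env):
    """Split on '-' first, so each application is complete before subtracting."""
    segments, current = [], []
    for tok in source.split():
        if tok == "-":
            segments.append(current)
            current = []
        else:
            current.append(tok)
    segments.append(current)
    values = [apply_segment(seg, env) for seg in segments]
    result = values[0]
    for v in values[1:]:
        result -= v          # left-associative subtraction
    return result


env = {"A": 4, "B": 9, "C": 1, "D": 7, "E": 3}
print(evaluate("Max A B C D E - Min A B C D E", env))
```

Because the operator ends the argument list, the parentheses around each call become optional.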

cr88192

unread,
Jul 31, 2004, 10:12:40 AM7/31/04
to

"James Harris" <no.email.please> wrote in message
news:410b960e$0$7125$db0f...@news.zen.co.uk...

>
> "Marco van de Voort" <mar...@stack.nl> wrote in message
> news:slrncfuugl....@toad.stack.nl...
>
> >
> > I wouldn't like it. But I don't like any significantly meaning placed on
> > whitespace. Call me old fashioned :-)
>
> I already replied but I've been wondering about this statement. As a
> programmer for many years I've agreed with this but I'm finding my views
> changing.
>
well, there are reasons to use common syntax, and reasons why it is not
necessary to do such.


<snip>


> range = (max A B C D E) - (min A B C D E)
>
> which provides the grouping required. How does that look?
>

you are half-way there in reinventing a hybrid of lisp and logo style
syntax...

lisp style:
(= range (- (max A B C D E) (min A B C D E)))

logo style (as far as I remember anyways):
= range [- [max A B C D E] [min A B C D E]]

now compare this with:


range = max(A, B, C, D, E) - min(A, B, C, D, E)

it is worth noting that in many cases newbies will likely be scared away by
lisp style syntax, mostly as things can happen like:
parens can build up to large numbers;
it is not terribly obvious how to break up/indent things;
a lot of the visual cues are missing;
...

which is sad really, but not that much can be done (though many people would
rather deny the issue, or say that it is unimportant). yes, s-exps do allow
cool features, but at the costs listed above.

or maybe all this was somehow a historical accident, and the only reason
people use c-style syntax is because of history.
in any case for the time being it is the most common, and programmers often
seem comfortable with it.

afaik there is probably somewhere a balance between opposing forces:
punctuation vs whitespace;
symbols vs words;
blocks/statements vs. expressions;
...

I don't know.

my lang seems to be going in the general direction of being fairly loose
about some things, but there are limits, and often particularly odd syntax
for things...


now, why in my lang did I make it so that:
(a)=3;
causes 'a' to evaluate to a value that is used like a target/pattern?

dunno exactly, but at least I can do proxy assignment (among other things):

var foo, bar;
foo=#bar;
(foo)="baz";
bar => "baz"

and at least I can have basic pattern decomposition based on this...


James Harris

unread,
Jul 31, 2004, 10:29:12 AM7/31/04
to
"Marcin 'Qrczak' Kowalczyk" <qrc...@knm.org.pl> wrote in message
news:pan.2004.07.31...@knm.org.pl...

> Write "Hello " username "\n";
> WriteLine "Hello " username;


> let highest = Max A B C;

> let range = Max A B C D E - Min A B C D E;

I like the format and this set me to look more into your web site. It's great
to see some overlapping of ideas - i.e. someone else having gone along some of
the same thought processes, though there are more differences.

Some questions:
1) Have you written a starter-guide - something to 'sell' the language including
simple examples?
2) The FAQ says that whitespace is insignificant. Aren't you using whitespace to
separate parameters?


Marcin 'Qrczak' Kowalczyk

unread,
Jul 31, 2004, 10:38:08 AM7/31/04
to
On Sat, 31 Jul 2004 15:29:12 +0100, James Harris wrote:

> 1) Have you written a starter-guide - something to 'sell' the language
> including simple examples?

Not yet, unfortunately, and the language reference is incomplete.
You know, this is less fun than implementing and using a language :-)

I will make better references to existing examples, and I'm making PLEAC
entries <http://pleac.sourceforge.net/>.

> 2) The FAQ says that whitespace is insignificant. Aren't you using
> whitespace to separate parameters?

There are places where some whitespace or comments is required to separate
tokens, but the amount and shape of whitespace (i.e. whether it's spaces
or newlines, or how large the indent is) is insignificant - like in many
languages, unlike Python, Ruby, Haskell, Unix shell, and C preprocessor.

James Harris

unread,
Jul 31, 2004, 11:10:04 AM7/31/04
to
"cr88192" <cr8...@protect.hotmail.com> wrote in message
news:IMNOc.5763$4%2.3...@fe07.usenetserver.com...

> you are half-way there in reinventing a hybrid of lisp and logo style
> syntax...
>
> lisp style:
> (= range (- (max A B C D E) (min A B C D E)))

<snip>

> it is worth noting that in many cases newbies will likely be scared away by
> lisp style syntax, mostly as things can happen like:
> parens can build up to large numbers;
> it is not terribly obvious how to break up/indent things;
> a lot of the visual cues are missing;

Agreed. I don't want to put people off. I've come to see the Lisp prefix example
as very flexible and logical from the point of view of the language designer;
however, while it is more reasonable for a function such as Max, above, it is
much less familiar to most programmers than an infix version (if such is
possible). Infix falls down in suggesting exactly two operands. I'm (currently)
intending to allow both, hence the hybrid,

range = (max A B C D E) - (min A B C D E)

which uses both forms and is the clearest way I can think of to write this. The
infix notation is just syntactic sugar for the more general prefix notation so
the above is really

range = - (max A B C D E) (min A B C D E)

but I find the former clearer in source code, and it has more of the visual
cues you mention, which are important. A program is read many more times
than it is written!

Parentheses will build up a little partly because I (currently) intend to
require the order of operations to be explicitly specified. They won't build up
in the same way as they do in Lisp because a) assignment will not appear within
an expression, b) the compiler will allow infix as mentioned, c) functions will
not require parameters to be in parens. These three reasons should reduce the
paren count.

<snip>


> or maybe all this was somehow a historical accident, and the only reason
> people use c-style syntax is because of history.

I think it goes back further than C, to school. We are taught to write 'sums'
such as 3+4. No bad thing, though.

<snip>


> now, why in my lang did I make it so that:
> (a)=3;
> causes 'a' to evaluate to a value that is used like a target/pattern?
>
> dunno exactly, but at least I can do proxy assignment (among other things):
>
> var foo, bar;
> foo=#bar;
> (foo)="baz";
> bar => "baz"

Does the third example assign "baz" to bar? I'm assuming foo has been set to a
reference to bar.

James Harris

unread,
Jul 31, 2004, 11:47:34 AM7/31/04
to
"Marcin 'Qrczak' Kowalczyk" <qrc...@knm.org.pl> wrote in message
news:pan.2004.07.31....@knm.org.pl...

> I will make better references to existing examples, and I'm making PLEAC
> entries <http://pleac.sourceforge.net/>.

Interesting. I'd seen the updated Shootout site
http://shootout.alioth.debian.org/ but this one is new to me.


> > 2) The FAQ says that whitespace is insignificant. Aren't you using
> > whitespace to separate parameters?
>
> There are places where some whitespace or comments is required to separate
> tokens, but the amount and shape of whitespace (i.e. whether it's spaces
> or newlines, or how large the indent is) is insignificant - like in many
> languages, unlike Python, Ruby, Haskell, Unix shell, and C preprocessor.

I would guess whitespace within tokens is invalid. Overall I would say this
regards whitespace as significant! The classic example of a language for which
whitespace is insignificant is this code, from Fortran,

DO3I = 1,10

which starts off looking like an assignment but is, in fact, a loop construct
(DO 3 I = 1,10) whereas,

DO 3 I=1

is an assignment statement assigning 1 to variable DO3I. The compiler can ignore
whitespace because of the use of punctuation such as equal-signs and commas. In
Kogut and in what I have in mind whitespace is required to delimit tokens.
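
The Fortran case can be imitated with a small Python sketch: remove every blank, then let the comma decide. This is a deliberately naive illustration (a real compiler must also cope with commas inside parentheses, for example), and `DO_LOOP` and `classify` are names made up here.

```python
import re

# DO <label> <variable> = <start> , <end>   with all blanks already removed
DO_LOOP = re.compile(r"DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)")


def classify(statement):
    """Classify a statement roughly as a blank-ignoring compiler would."""
    text = statement.replace(" ", "").upper()
    loop = DO_LOOP.fullmatch(text)
    if loop:
        label, var, start, end = loop.groups()
        return ("loop", label, var, start, end)
    name, _, value = text.partition("=")
    return ("assignment", name, value)


print(classify("DO3I = 1,10"))   # the comma makes it a loop
print(classify("DO 3 I=1"))      # no comma, so DO3I is just a variable name
```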


I'm intrigued by the use of semicolon to end a statement. How much would
treating newlines as significant affect your language? I don't think I need the
semicolon to terminate statements. Mostly context will show the need e.g.,

struct {
int var1
int var2
} myStruct

with no need for a terminating semicolon. I am thinking to use semicolon as a
statement separator so as to allow multiple statements on a line such as,

sum = a + b; diff = a - b

but not require them at the ends of lines as I think it looks neater and that
they are unnecessary. Your language structure is similar. Are there strong
reasons as to why you require semicolons to terminate statements?

--
Cheers,
James


cr88192

unread,
Jul 31, 2004, 12:10:00 PM7/31/04
to

"James Harris" <no.email.please> wrote in message
news:410bb644$0$7131$db0f...@news.zen.co.uk...
yes, at least you thought about it some.

a lot of what I post is intermediate, and, thus, subject to change...

> <snip>
> > or maybe all this was somehow a historical accident, and the only reason
> > people use c-style syntax is because of history.
>
> I think it goes back further than C, to school. We are taught to write
> 'sums' such as 3+4. No bad thing, though.
>

yes, this is the case for infix.

what about c syntax in general?

c syntax generally resembles a lot of algebra in various ways (ok, this is
debatable).
it has generally defeated many more wordy syntaxes (eg: pascal, cobol, ...);
it is not threatened that much by those with much larger amounts of symbols
(eg: perl).

I think it might be balanced here.

making general code structure exist in terms of large-expressions feels a
little weird, and maybe is a little less natural to many people than
sequences of commands.

...

of course, all of this may have been historical circumstance as well...

> <snip>
> > now, why in my lang did I make it so that:
> > (a)=3;
> > causes 'a' to evaluate to a value that is used like a target/pattern?
> >
> > dunno exactly, but at least I can do proxy assignment (among other
> > things):
> >
> > var foo, bar;
> > foo=#bar;
> > (foo)="baz";
> > bar => "baz"
>
> Does the third example assign "baz" to bar? I'm assuming foo has been set
> to a reference to bar.
>

yes.

foo is assigned a symbol.
(foo)="baz";
evaluates foo, notes it is a symbol, and binds "baz" to the named slot.

similarly, patterns can be passed in vars like this:

pat={#x, #y, #z};
(pat)={1, 2, 3};
x => 1
y => 2
z => 3

this may later have other uses as well.
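
A Python sketch of the behaviour just described, assuming the parenthesised target is first evaluated and then used as a pattern. `Symbol`, `assign` and the `env` dictionary are stand-ins invented for illustration; the real implementation is not shown in this thread.

```python
class Symbol:
    """Stands in for a quoted name such as #bar."""

    def __init__(self, name):
        self.name = name


def assign(env, target, value):
    """Bind value through an already-evaluated target, as in (foo)="baz"."""
    if isinstance(target, Symbol):
        env[target.name] = value                 # bind the named slot
    elif isinstance(target, (list, tuple)):
        if len(target) != len(value):
            raise ValueError("pattern and value differ in length")
        for sub_target, sub_value in zip(target, value):
            assign(env, sub_target, sub_value)   # decompose element by element
    else:
        raise TypeError("target is neither a symbol nor a pattern")


env = {}
env["foo"] = Symbol("bar")                  # foo=#bar;
assign(env, env["foo"], "baz")              # (foo)="baz";
print(env["bar"])

env["pat"] = [Symbol("x"), Symbol("y"), Symbol("z")]   # pat={#x, #y, #z};
assign(env, env["pat"], [1, 2, 3])                     # (pat)={1, 2, 3};
print(env["x"], env["y"], env["z"])
```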


Marcin 'Qrczak' Kowalczyk

unread,
Jul 31, 2004, 12:16:07 PM7/31/04
to
On Sat, 31 Jul 2004 16:47:34 +0100, James Harris wrote:

> Overall I would say this regards whitespace as significant! The classic
> example of a language for which whitespace is insignificant is this
> code, from Fortran,

Ok, this could be phrased differently.

Anyway, Fortran is an exception: I don't know of any other language which
doesn't require some whitespace between identifiers, numbers and keywords.
Most syntaxes fall into three groups:
1. Newlines and indentation are significant (Python, Haskell, Clean).
2. Newlines are significant, indentation is not, except some constructs
like string literals (Ruby, Unix shell, Visual Basic).
3. Neither is significant, except that sometimes some whitespace is needed
to separate tokens, and except some constructs like string literals and
to-end-of-line comments (most languages).

> I'm intriguged by the use of semicolon to end a statement. How much would
> treating newlines as significant affect your language?

I tried 1 at the beginning, then 2.

The problem is that there are too many cases where a newline doesn't end
a definition or statement. Even not counting cases which are obvious from
the context, e.g. after an operator or before a ")".

One could use \ or something to mark a newline as insignificant, but with
too many \'s it's uglier than with explicit semicolons. It's hard for a
human to see whether he is allowed to split a line in a particular place.

This is especially bad if arguments are separated by spaces, because you
can't rely on a newline after a comma being ignored. So every function
application which doesn't fit in one line needs a \.

Significant newlines constrained other parts of my syntax to use only such
combinations of tokens that can be nicely split into lines, without the
need of many explicit \'s. These constraints were too strong, so I abandoned
significant newlines.
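
The trouble can be shown with a small Python sketch of a statement splitter that treats a newline as a terminator except in the cases that are obvious from context (an open parenthesis or a trailing operator). The rule set and the names `TRAILING_OPERATORS` and `split_statements` are assumptions made for this illustration, not Kogut's former grammar.

```python
TRAILING_OPERATORS = tuple("+-*/=,")


def split_statements(source):
    """Split on newlines, joining lines that are obviously unfinished."""
    statements, current, depth = [], [], 0
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        current.append(line)
        depth += line.count("(") - line.count(")")
        if depth > 0 or line.endswith(TRAILING_OPERATORS):
            continue                  # obviously unfinished: keep reading
        statements.append(" ".join(current))
        current, depth = [], 0
    if current:
        statements.append(" ".join(current))
    return statements


source = "total = a +\n    b\nprint (total\n    )\nWrite x\n    y"
for statement in split_statements(source):
    print(repr(statement))
```

The first two statements are joined correctly, but the application `Write x y` split over two lines is cut in half, because nothing at the end of `Write x` says that more arguments follow. That is the case which would need an explicit `\`.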

Lasse Hillerøe Petersen

unread,
Jul 31, 2004, 3:32:44 PM7/31/04
to
In article <Z8KOc.5755$4%2.4...@fe07.usenetserver.com>,

"cr88192" <cr8...@protect.hotmail.com> wrote:
> >
> I wonder if your source is still in plaintext...

;-)

> in most normal conditions bold and italic are not usable in programming
> languages based on the fact that text editors don't support them, or the
> compiler doesn't except the formats for which that style of formatting is
> allowed.

Some ten years ago, I used Think Pascal, for the Macintosh. It did
syntax checking on the fly, and rendered programs with keywords in bold;
very nice. The Script Editor for AppleScript did much the same, taking
the approach a bit further. I know that this is not quite the same, as
typeface was not used to distinguish symbols in syntax, only as a pretty
rendering; but the step is not far. This is one thing I'd really love
support for in Algol68g, some way to write programs using a special
stropping instead of uppercase, and a way to prettyprint and edit on an
xterm using boldface for modes and keywords.

> unless of course you are using an ide or such that handles all of this, or
> you have an editor that does syntax highlighting or changing the style...

Some IDE is necessary, but bold and underline is available even with a
simple xterm. So it doesn't necessarily have to be a "window-based" IDE.
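
As an illustration of the terminal-only approach, this Python sketch wraps keywords in ANSI SGR escape sequences (`ESC[1m` for bold, `ESC[0m` to reset), which xterm and most other terminal emulators honour. The keyword list and the name `embolden` are assumptions for the example; it is only a pretty-printer and does not make typeface part of the syntax.

```python
import re

BOLD, RESET = "\x1b[1m", "\x1b[0m"   # ANSI SGR codes: bold on, all attributes off
KEYWORDS = ("begin", "end", "while", "do", "if", "then", "else")  # assumed list


def embolden(source):
    """Wrap each whole-word keyword in bold escape sequences."""
    pattern = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b")
    return pattern.sub(lambda m: BOLD + m.group(1) + RESET, source)


print(embolden("while x do begin y end"))
```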

> if, of course, formatting could be used in a language, this brings up
> interesting ideas, like, eg, using a bold '.' to mean dot-product, or and
> italic 'X' for cross product, ...

I'd rather use proper symbols for such things.

-Lasse

cr88192

unread,
Jul 31, 2004, 9:21:21 PM7/31/04
to

"Lasse Hillerøe Petersen" <lhp+...@toft-hp.dk> wrote in message
news:lhp+news-8C089B...@news.tele.dk...

> In article <Z8KOc.5755$4%2.4...@fe07.usenetserver.com>,
> "cr88192" <cr8...@protect.hotmail.com> wrote:
> > >
> > I wonder if your source is still in plaintext...
>
> ;-)
>
> > in most normal conditions bold and italic are not usable in programming
> > languages based on the fact that text editors don't support them, or the
> > compiler doesn't accept the formats for which that style of formatting
> > is allowed.
>
> Some ten years ago, I used Think Pascal, for the Macintosh. It did
> syntax checking on the fly, and rendered programs with keywords in bold;
> very nice. The Script Editor for AppleScript did much the same, taking
> the approach a bit further. I know that this is not quite the same, as
> typeface was not used to distinguish symbols in syntax, only as a pretty
> rendering; but the step is not far. This is one thing I'd really love
> support for in Algol68g, some way to write programs using a special
> stropping instead of uppercase, and a way to prettyprint and edit on an
> xterm using boldface for modes and keywords.
>
dunno. rendering text special, in general, does not seem that useful (sure,
it can add visual distinctiveness, but it is mostly just an editor feature).

> > unless of course you are using an ide or such that handles all of this,
> > or you have an editor that does syntax highlighting or changing the
> > style...
>
> Some IDE is necessary, but bold and underline is available even with a
> simple xterm. So it doesn't necessarily have to be a "window-based" IDE.
>

yes.

I was just meaning, eg, you can't write code in notepad...
one possibility could be binding an editor to the language (not my preferred
approach, but it is possible).
your code could be represented largely as a glob of xml with a lot of text
stuffed in there as well...

a custom binary format could be used as well, which might be simpler.

or, another weird thought:
a variation of ansi codes could be used for various features as well...

> > if, of course, formatting could be used in a language, this brings up
> > interesting ideas, like, eg, using a bold '.' to mean dot-product, or
> > an italic 'X' for cross product, ...
>
> I'd rather use proper symbols for such things.
>

yes.

doing anything weird would eliminate sending the code as plaintext as well
anyways.
this largely means that little can be done, people are stuck with a limited
character range.

of course, one could use unicode (assuming it gets a lot more common), and
maybe with special editors (that or doing a more complicated multi-lingual
setup or such) one could, eg, also use greek letters and various other
symbols in code or such...

I don't know.


Howard Ding <hading@hading.dnsalias.com>

unread,
Aug 1, 2004, 12:59:16 AM8/1/04
to
"cr88192" <cr8...@protect.hotmail.com> writes:

> it is worth noting that in many cases newbies will likely be scared away by
> lisp style syntax, mostly as things can happen like:
> parens can build up to large numbers;
> it is not terribly obvious how to break up/indent things;
> a lot of the visual cues are missing;
> ...
>

I think a lot here depends on whether you're talking about "newbies to
programming" or "newbies to this particular language who have
experience in other languages". The HtDP people (www.htdp.org and
www.teach-scheme.org) have a lot of experience teaching Scheme, and
the former group of people, according to their experience, don't
really seem to have many problems picking up a Lisp; it's those
bringing in experience from other languages that seem to struggle.

--
Howard Ding
<had...@hading.dnsalias.com>

cr88192

unread,
Aug 1, 2004, 2:03:38 AM8/1/04
to

<had...@hading.dnsalias.com> wrote in message
news:m3fz77p...@frisell.localdomain...
hmm, this is probably true...

I was working roughly under the assumption that people looking at it would
have already worked with other languages, and would have just encountered
it for whatever reason.

"newbies to programming" tend more often to be drawn to java apparently (and
then later get taught vb as it is most common for classes), and a lot of
them don't seem to know the difference between "machine language" and "c++"
anyways...
thus, the only way most of them will encounter programming will be through a
class, and when it is a class anything remotely reasonable can be taught
without complaint (except for those who object because it is not java or
whatever other language is overly hyped at the time...).
under this assumption most of them are unlikely to encounter lisp or scheme,
but if they did learning it would be no big deal.

or something...


Wilhelm B. Kloke

unread,
Aug 1, 2004, 4:22:45 AM8/1/04
to
In article <lhp+news-8C089B...@news.tele.dk>,

Lasse Hillerøe Petersen <lhp+...@toft-hp.dk> wrote:
>
>Some ten years ago, I used Think Pascal, for the Macintosh. It did
>syntax checking on the fly, and rendered programs with keywords in bold;
>very nice. The Script Editor for AppleScript did much the same, taking
>the approach a bit further. I know that this is not quite the same, as
>typeface was not used to distinguish symbols in syntax, only as a pretty
>rendering; but the step is not far. This is one thing I'd really love
>support for in Algol68g, some way to write programs using a special
>stropping instead of uppercase, and a way to prettyprint and edit on an
>xterm using boldface for modes and keywords.

It should be easy to add a new stropping regime for Algol68 (or your
favourite other language) to allow
for constructs like "{\b while}" (TeX) or <b>while</b> (HTML) or
your other favourite RTF (rich text format) to represent the
keyword WHILE. In this case pretty printing is easy: just use TeX,
your browser or your word processor. The textual representation seems
clumsy, but this may be eased by the use of editor/browser features.
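
A hedged sketch of the "tool in front of the compiler" variant in Python: it rewrites the `{\b while}` form quoted above into upper-case stropping so that an existing compiler could read the result. The exact markup and the names `BOLD_WORD` and `to_upper_stropping` are assumptions for illustration; no existing Algol68 tool is implied.

```python
import re

# Matches the "{\b while}" form quoted above; the exact markup is an assumption.
BOLD_WORD = re.compile(r"\{\\b\s+([A-Za-z][A-Za-z0-9]*)\}")


def to_upper_stropping(source):
    """Rewrite {\\b word} as WORD so an upper-stropping compiler can read it."""
    return BOLD_WORD.sub(lambda m: m.group(1).upper(), source)


print(to_upper_stropping(r"{\b while} i < 10 {\b do} i := i + 1 {\b od}"))
```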
--
Dipl.-Math. Wilhelm Bernhard Kloke
Institut fuer Arbeitsphysiologie an der Universitaet Dortmund
Ardeystrasse 67, D-44139 Dortmund, Tel. 0231-1084-257

cr88192

unread,
Aug 1, 2004, 5:06:20 AM8/1/04
to

"Wilhelm B. Kloke" <w...@arb-phys.uni-dortmund.de> wrote in message
news:1091349913.914106@vestein...

> In article <lhp+news-8C089B...@news.tele.dk>,
> Lasse Hillerøe Petersen <lhp+...@toft-hp.dk> wrote:
> >
> >Some ten years ago, I used Think Pascal, for the Macintosh. It did
> >syntax checking on the fly, and rendered programs with keywords in bold;
> >very nice. The Script Editor for AppleScript did much the same, taking
> >the approach a bit further. I know that this is not quite the same, as
> >typeface was not used to distinguish symbols in syntax, only as a pretty
> >rendering; but the step is not far. This is one thing I'd really love
> >support for in Algol68g, some way to write programs using a special
> >stropping instead of uppercase, and a way to prettyprint and edit on an
> >xterm using boldface for modes and keywords.
>
> It should be easy to add a new stropping regime for Algol68 (or your
> favourite other language) to allow
> for constructs like "{\b while}" (TeX) or <bold>while<\bold> (HTML) or
> your other favourite RTF (rich text format) to represent the
> keyword WHILE. In this case pretty printing is easy: Just use TeX/
> your browser/wordprocessor. The textual representation seams clumsy, but
> this may be facilitated by the use of edito/browser features.

hmm, yes, one just needs a compiler that supports it (or a tool for ripping
out the formatting).
for the rtf format, on windows wordpad is decent but lacks a line number
status or any way to jump to a specific line, which could be annoying.

there may exist other rtf editors that have such features though
(actually, I think ms word has both a page number and line number, but I am
not sure, and the fragmentation into pages is not that helpful either...).

similarly, many other kinds of editors lack a line number status as well...
one may get used to coding without it though, but I find it quite helpful...