ANN: pldev.org

James Harris

unread,

Sep 3, 2011, 3:31:05 AM9/3/11

to

A while ago we discussed setting up a website on programming language
development. The site is now live at

Don't expect too much as the site is new. I'll move some of the
external stuff in to it over time and will get round to improving the
wording of some of the principles. But it does show the basics. Thanks
to Robbert for providing the domain name.

For reference, discussions that led to the site are in the threads
linked below.

http://groups.google.com/group/comp.lang.misc/browse_frm/thread/93501b369a148fd6
http://groups.google.com/group/comp.lang.misc/browse_frm/thread/4500919f2cd56c96
http://groups.google.com/group/comp.lang.misc/browse_frm/thread/1b0a27386dd6aa67

James

tm

unread,

Sep 8, 2011, 4:05:54 AM9/8/11

to

On 3 Sep., 09:31, James Harris <james.harri...@googlemail.com> wrote:
> A while ago we discussed setting up a website on programming language
> development. The site is now live at
>
> http://pldev.org

I had just a short glimpse on it.

IMHO the most important hint, for somebody who wants to design his
own language, is missing:

DON'T DO IT

People should explore their motives to design a new language, before
they start to do it (to learn about language design or compiler /
interpreter writing, to make the world a better place, to become
famous). They should be aware of the alternatives (preprocessor,
macros or library for existing language, extension for existing
compiler/interpreter, joining another new-language project). They
should be aware which goal their language project has (will be
thrown away after the exam, only for private use, to be released on
the internet, to reach world domination). Will the designer be
able to implement the design (projects with great ideas, which
"just" need somebody to implement them, will usually fail). They
should ask themself: Why should anybody switch to this language.
What makes it unique.

Your page about source portability concentrates on hardware
portability issues. There are also other portability issues, which
IMHO have became more important than hardware portability issues.
I speak of portability between operating systems and libraries.
Many libraries are not available everywhere.

Some people distinguish between language and runtime library and
consider a language portable, when its core part is portable. But a
programmer is only interested to know, if his program can be moved
to another computer without effort.

A program which uses Win32 calls will not run on Linux and vice
versa (I know about Wine, but this would need that a language
integrates somehow with Wine).

Even when no OS specific functions are called, some OS specific
things, like path delimiters and drive letters, can still lead to
unportable programs. Even system calls which seem portable still can
make problems (e.g.: Under Linux you can open a directory as file,
while under Windows this fails).

The pldev page about Error location led me to some problem:

What should happen, when you read numbers from a file, and there is
no number? Most people will probably agree that throwing an
exception is the right thing to do. But when the file is connected
to a keyboard the program should probably not terminate, when
the user writes a letter instead of a digit. Requesting that every
read from the keyboard must be followed by exception handlers, seems
a little bit heavvy.

For that reason I decided to put an exception handler in the read
function. This exception handler sets a flag, when the conversion
(to an integer or some other type) fails. That way an iteractive
read of a number either reads the number or leaves the number
unchanged and sets the flag. This seems elegant for interactive I/O,
but could lead to hard to find errors when reading data from files.

I see two possible solutions:

- Use some file mode (conversion exceptions caught or not),
which needs to be set up.
- Use different read functions with and without catching of
conversion exceptions.

I would be pleased to get some feedback about this.

Greetings Thomas Mertes

--
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.

BGB

unread,

Sep 8, 2011, 2:46:28 PM9/8/11

to

yes, cool...

I hadn't heard stuff in a long enough time that I had wondered if
anything was still going on with this (good that it is).

sometimes, with all of the quietness, one can wonder just how much
longer usenet will live on (sad sometimes...).

the main site is, I guess, a plain website?...
doesn't appear to be a wiki or anything.

in my case, I had also set up my own wiki in the mean-time (based on
mediawiki), mostly intended for my own projects.

I personally like mediawiki, as it is a little less effort than more
traditional website editing. a downside though is, at the moment, I
don't really have any access to its stored contents (I guess they are in
a MySQL database somewhere on my servers' HDD or similar...).

it basically means having to use plaintext and HTML for most "real"
documentation, and the worry of what it would mean for any site/wiki
contents if my server crashes really hard.

if anyone cares (assuming server uptime, ...):
http://cr88192.dyndns.org/wiki/index.php/Main_Page

and, for a few 3D-engine screen shots (some including some since-fixed
glitches):
http://cr88192.dyndns.org/wiki/index.php/BGBTech_Shots

however, with the default settings, the wiki was very prone to getting
hacked and covered with spam, so I ended up having to lock-down the
security a bit more to prevent this (it shouldn't be too hard for
real-humans, but was mostly to limit bots from doing this).

or such...

BGB

unread,

Sep 8, 2011, 4:44:18 PM9/8/11

to

On 9/8/2011 1:05 AM, tm wrote:
> On 3 Sep., 09:31, James Harris<james.harri...@googlemail.com> wrote:
>> A while ago we discussed setting up a website on programming language
>> development. The site is now live at
>>
>> http://pldev.org
>
> I had just a short glimpse on it.
>
> IMHO the most important hint, for somebody who wants to design his
> own language, is missing:
>
> DON'T DO IT
>
> People should explore their motives to design a new language, before
> they start to do it (to learn about language design or compiler /
> interpreter writing, to make the world a better place, to become
> famous). They should be aware of the alternatives (preprocessor,
> macros or library for existing language, extension for existing
> compiler/interpreter, joining another new-language project). They
> should be aware which goal their language project has (will be
> thrown away after the exam, only for private use, to be released on
> the internet, to reach world domination). Will the designer be
> able to implement the design (projects with great ideas, which
> "just" need somebody to implement them, will usually fail). They
> should ask themself: Why should anybody switch to this language.
> What makes it unique.
>

I don't entirely agree.

other major uses for language design:
personal expression (artistic, or "how do *I* want things to be");
a desire to experiment and have the personal freedom to try new design
ideas (to this end, an ideally "open-ended" VM architecture is needed);
just not liking the existing options (there are some of us who would
rather not poke Perl or Python with a stick, and find languages like
Java to be an all-in-all painful and unsatisfying experience...);
maybe the existing languages/technologies don't fit with ones design goals;
...

whether or not other people care, would want to use the language, ...
may often not matter much.

I would assume many people could care less either about "control" or
about "changing the world".

many people may just want to do stuff for the hell of it, like, without
obligations or worries. much like putting a lift-kit / under-lights /
... on a car, or for a computer using a water-cooling system with
fluorescent materials in the water (and having UV lights to make it all
glow all spiffy-looking).

or, some combination: where under the hood there are all these glowing
things and tubes of fluorescing liquid and so on (like something out of
a SciFi movie, or Doom 3...).

practical?... maybe not. sometimes one doesn't need it to be practical,
sometimes one just wants it to be really cool-looking.

but, then one is faced with reality, where there is money and one has to
worry about how to do something to live off of, ... and where tasks and
competing with what exists may require *massive* amounts of effort (it
is hard to make something big and shiny, because software companies are
able to make things more big and shiny more quickly, since this is where
*their* source of profit lies...).

it is a sad realization that ones' ideas and expression may be stifled
by ones' own limitations (be it the mountains of code they have to
write, or the piles of graphics and 3D modeling/... effort required in
making something like a game, ...).

then one has to find many ways to cut corners and has to deal with much
less interesting matters, mostly to have "something" to show for ones'
efforts.

as well, one may find that in some areas, they are very uncreative, and
others will judge oneself for this, ... it is like the quest for "the
magic" (which is this strange and bizarre property of software/games/...
to look cool and also be usable, ...).

much like, one may also have fantasies about being sort of like a
rock-star, but then one is also left to realize that they would need to
be able to play instruments and be able to come up with songs and crap,
rather than just go up on stage with a costume and make-up and yell
"yeah..." a lot and have this work.

> Your page about source portability concentrates on hardware
> portability issues. There are also other portability issues, which
> IMHO have became more important than hardware portability issues.
> I speak of portability between operating systems and libraries.
> Many libraries are not available everywhere.
>

sadly, very true.

to a large degree one may end up creating a very large and complex
"framework" mostly just to gloss over the matter of OS and library
issues (so that ones' stuff works fairly similarly on, say, Windows and
Linux). also, one may consider trying to build off of an existing
framework, but then the portability and availability of that framework
itself may become an issue, as well as the tendency of many of these
frameworks to be fairly specialized to a particular type of application.

so, the CPU is just one concern among many.

it is much like, the creation of a usable compiler involves far more
than just the parser and some magical parser-generator tool (despite how
much those people who flog off parser generators jerk off over them
endlessly, the fact that they only really deal with one of the more
trivial parts of such a project, and IMO go about it in almost entirely
the wrong direction, doesn't really impress me all that much...).

> Some people distinguish between language and runtime library and
> consider a language portable, when its core part is portable. But a
> programmer is only interested to know, if his program can be moved
> to another computer without effort.
>

I don't think the concerns of a programmer are quite that limited.
otherwise, everyone would probably just use Java by now.

> A program which uses Win32 calls will not run on Linux and vice
> versa (I know about Wine, but this would need that a language
> integrates somehow with Wine).
>

and this option would probably suck as well...

> Even when no OS specific functions are called, some OS specific
> things, like path delimiters and drive letters, can still lead to
> unportable programs. Even system calls which seem portable still can
> make problems (e.g.: Under Linux you can open a directory as file,
> while under Windows this fails).
>

yep.

and/or, for other reasons, the app ends up needing to be run in a VFS.
my own stuff tends to internally create a virtual Unix-like VFS in which
the majority of all file IO is done.

there are both merits and drawbacks to doing this though, and as-is, I
have started to run into current organizational scalability issues with
my present design, and may need to find better ways to address them.

the VFS itself is working fairly well, mostly it is just a worry about
having to re-introduce concepts like PWD and search-paths for more types
of resources (textures and sound effects, for example).

the present system is based mostly on things like resource directories
and union mounts and similar (generally, 1 or more top-level resource
directories are created and union-mounted together, potentially followed
by mounting any "resource zip files" into the VFS as well, basically
sort of like the Quake-series engines).

> The pldev page about Error location led me to some problem:
>
> What should happen, when you read numbers from a file, and there is
> no number? Most people will probably agree that throwing an
> exception is the right thing to do. But when the file is connected
> to a keyboard the program should probably not terminate, when
> the user writes a letter instead of a digit. Requesting that every
> read from the keyboard must be followed by exception handlers, seems
> a little bit heavvy.
>

probably depends...

there are tradeoffs between exceptions, error status codes, and simply
making the failed operation behave as a no-op (graceful failure).

an ideal balance has not yet been found, and as well I am left with the
issue that I generally use graceful failure and status codes in places
where many other (traditionally more pedantic) languages would throw
exceptions.

an issue was the current ugly mess that is cross-language exception passing.

I am currently leaning towards a possibility that I may make thrown
exceptions (in BGBScript) behave more like check-able status indicators
in C.

to, BGBScript code throws an exception, and either handles it locally or
returns to C. the C code may check it, and do whatever. if code returns
back into BGBScript (without clearing the exception status), it may
resume being treated as an exception.

the prior strategy was based more on treating them as exceptions both in
C and in BGBScript (on Windows, an unhandled exception would be passed
over to Win32 SEH, and on returning to the VM be converted back to a VM
exception).

the above, however, caused me to generally not use them and regard them
as a "presently broken" feature.

I may likely reserve SEH-like "hard" exceptions for things that actually
need to be handled so severely, such as "oh crap, I have just
dereferenced an invalid pointer" and whatever (as in, what stuff the OS
does already), and leave things which are often currently handled via
no-ops or status codes, either as no-ops, status codes, or as "soft"
exceptions.

> For that reason I decided to put an exception handler in the read
> function. This exception handler sets a flag, when the conversion
> (to an integer or some other type) fails. That way an iteractive
> read of a number either reads the number or leaves the number
> unchanged and sets the flag. This seems elegant for interactive I/O,
> but could lead to hard to find errors when reading data from files.
>

more common in my case:
just ignore the problem (trying to parse a non-number as a number
effectively just eats the token and returns 0).

yes, there are potentially drawbacks here as well, but oh well...

> I see two possible solutions:
>
> - Use some file mode (conversion exceptions caught or not),
> which needs to be set up.
> - Use different read functions with and without catching of
> conversion exceptions.
>
> I would be pleased to get some feedback about this.
>

dunno...

I am left wondering some even about things like object slot accesses and
bounds-check errors.

as-is, one often knows that they have failed because they are seeing
"undefined" everywhere, but this is about as reliable as seeing NaN
everywhere when doing floating-point calculations.

granted, whether a math problem is better off simply "going NaN" or
causing the application to crash at first sight, is probably a matter of
debate as well.

Bartc

unread,

Sep 9, 2011, 9:28:11 AM9/9/11

to

"tm" <thomas...@gmx.at> wrote in message
news:4dbfd900-b0e7-416b...@h11g2000vbc.googlegroups.com...

> IMHO the most important hint, for somebody who wants to design his
> own language, is missing:
>
> DON'T DO IT
>
> People should explore their motives to design a new language

Perhaps for the same reason some people design or make their own anything.
For fun maybe...

> should ask themself: Why should anybody switch to this language.
> What makes it unique.

Because it's yours! That won't be an advantage to anyone else of course...

> A program which uses Win32 calls will not run on Linux and vice
> versa (I know about Wine, but this would need that a language
> integrates somehow with Wine).

So this leads to someone creating their own abstract library (for graphics,
files, whatever) which is not tied to an OS and which can be made to run on
a range of systems, rather than relying on someone's huge library which only
runs on machine X, or someone else's even huger system which is
cross-platform, but has to be programmed in C++...

The same sort of necessity which might lead someone to invent a
special-purpose language.

(My first language was created because there was no practical way of getting
someone else's language onto the microprocessor systems I was building.

And even when I was using floppy disks a bit later, and I could afford to
buy a language (in pre-internet days), the ones available ran too slowly for
my purpose (taking minutes of disk grinding to compile the smallest program,
while mine compiled instantly and I could get on with my work.

In 2011 it's a bit different, but it can still be fun.)

--
Bartc

tm

unread,

Sep 9, 2011, 3:09:43 PM9/9/11

to

On 9 Sep., 15:28, "Bartc" <b...@freeuk.com> wrote:
> "tm" <thomas.mer...@gmx.at> wrote in message

>
> news:4dbfd900-b0e7-416b...@h11g2000vbc.googlegroups.com...
>
> > IMHO the most important hint, for somebody who wants to design his
> > own language, is missing:
>
> > DON'T DO IT
>
> > People should explore their motives to design a new language
>
> Perhaps for the same reason some people design or make their own anything.
> For fun maybe...

That's one possible reason (my list of possible reasons is
not complete).

> > should ask themself: Why should anybody switch to this language.
> > What makes it unique.
>
> Because it's yours! That won't be an advantage to anyone else of course...

The implementer of a language will probably use his
language. :-) But he (she) might not switch to it.
The question (Why should anybody ...) is tailored for
people who want to release their language and who want
that their language is actually used.

> > A program which uses Win32 calls will not run on Linux and vice
> > versa (I know about Wine, but this would need that a language
> > integrates somehow with Wine).
>
> So this leads to someone creating their own abstract library (for graphics,
> files, whatever) which is not tied to an OS and which can be made to run on

> a range of systems, ...

This is exactly the stategy I use for Seed7.

> ... rather than relying on someone's huge library which only

> runs on machine X, or someone else's even huger system which is
> cross-platform, but has to be programmed in C++...

Functions from the C++ library could be called from the
new designed language. To use this stategy some conditions
must be fulfilled:

1. The C++ library must be available on the supported
platforms (ideally under a free license). And it
should be released together with the language
implementation.
2. The types used by the C++ library and the types used
by the new language must be compatible. Often this
is not easy. E.g.: Your language may have a different
concept to represent strings (the internal representation
of a C++ string is implementation dependent).
3. Something like prototypes and header files and conversion
routines must be defined.

BartC

unread,

Sep 10, 2011, 5:52:17 AM9/10/11

to

"tm" <thomas...@gmx.at> wrote in message

news:70f63325-7b32-47be...@n11g2000yqh.googlegroups.com...

> On 9 Sep., 15:28, "Bartc" <b...@freeuk.com> wrote:
>> "tm" <thomas.mer...@gmx.at> wrote in message

>> Because it's yours! That won't be an advantage to anyone else of

>> course...
>
> The implementer of a language will probably use his
> language. :-) But he (she) might not switch to it.

(I would; it's just so much more 'comfortable' than anything else. Although
that also applies to other kinds of software..)

> The question (Why should anybody ...) is tailored for
> people who want to release their language and who want
> that their language is actually used.

I created a scripting language to go with my applications. The only way to
extend the application was to use that language. So I did have 'users', even
if they didn't exactly choose it out of free will... I wasn't bothered
however about whether the language (which was powerful enough to do general
stuff) was used much elsewhere; it's other people's loss if I could be more
productive than them!).

>> So this leads to someone creating their own abstract library (for
>> graphics,
>> files, whatever) which is not tied to an OS and which can be made to run
>> on
>> a range of systems, ...
>
> This is exactly the stategy I use for Seed7.

OK, so why can't someone create a new language for exactly the same reasons?
Or are you saying you wouldn't have created Seed7 in hindsight...?

>> ... rather than relying on someone's huge library which only
>> runs on machine X, or someone else's even huger system which is
>> cross-platform, but has to be programmed in C++...
>
> Functions from the C++ library could be called from the
> new designed language. To use this stategy some conditions
> must be fulfilled:
>
> 1. The C++ library must be available on the supported
> platforms (ideally under a free license). And it
> should be released together with the language
> implementation.

...

C++ is pretty much impossible if you don't use, understand, or like C++,
especially trying to interface from a different language.

I've wasted plenty of time trying to link into C++ libraries via C-style
interfaces (GDI+ for example), now I don't even bother. If it's in C++, then
forget it.

--
Bartc

James Harris

unread,

Sep 10, 2011, 6:01:49 PM9/10/11

to

On Sep 8, 9:05 am, tm <thomas.mer...@gmx.at> wrote:

...

> The pldev page about Error location led me to some problem:
>
> What should happen, when you read numbers from a file, and there is
> no number? Most people will probably agree that throwing an
> exception is the right thing to do. But when the file is connected
> to a keyboard the program should probably not terminate, when
> the user writes a letter instead of a digit. Requesting that every
> read from the keyboard must be followed by exception handlers, seems
> a little bit heavvy.
>
> For that reason I decided to put an exception handler in the read
> function. This exception handler sets a flag, when the conversion
> (to an integer or some other type) fails. That way an iteractive
> read of a number either reads the number or leaves the number
> unchanged and sets the flag. This seems elegant for interactive I/O,
> but could lead to hard to find errors when reading data from files.
>
> I see two possible solutions:
>
> - Use some file mode (conversion exceptions caught or not),
> which needs to be set up.
> - Use different read functions with and without catching of
> conversion exceptions.
>
> I would be pleased to get some feedback about this.

OK here are some thoughts. If the numbers are present in the file as
digit characters why not read them as such and convert them to binary
separately?

Exceptions might be good wherever you convert from digits to binary so
as to catch errors.

Of course, if a file contains binary integers and you can accept any
integer in the range there can be no conversion exception. File read
or keyboard read could always generate an exception. Maybe all inputs
to a process need to be policed.

James

James Harris

unread,

Sep 10, 2011, 6:17:56 PM9/10/11

to

On Sep 8, 7:46 pm, BGB <cr88...@hotmail.com> wrote:
> On 9/3/2011 12:31 AM, James Harris wrote:

> > A while ago we discussed setting up a website on programming language
> > development. The site is now live at
>
> > http://pldev.org

...

> yes, cool...
>
> I hadn't heard stuff in a long enough time that I had wondered if
> anything was still going on with this (good that it is).
>
> sometimes, with all of the quietness, one can wonder just how much
> longer usenet will live on (sad sometimes...).
>
> the main site is, I guess, a plain website?...
> doesn't appear to be a wiki or anything.

Since we discussed the idea I put a lot of work into trying out
different options to make editing by multiple users easy and secure. I
learned a lot but there's more to do and it's a bit of a minefield ...
and I'm too busy at work and with other things. In the end I decided
to just start by setting up the first part of what I had in mind.

> in my case, I had also set up my own wiki in the mean-time (based on
> mediawiki), mostly intended for my own projects.

I saw it. It's good that you set that up.

> I personally like mediawiki, as it is a little less effort than more
> traditional website editing. a downside though is, at the moment, I
> don't really have any access to its stored contents (I guess they are in
> a MySQL database somewhere on my servers' HDD or similar...).

I considered some CMS and Wiki systems including Mediawiki. It's one
of the best, for sure. For similar reasons to what you mention I stuck
with smaller facilities such as CSS, PHP, SQL and HTML. This means
more work to edit but I have more control and I know where things are.

James

BGB

unread,

Sep 10, 2011, 8:29:08 PM9/10/11

to

On 9/10/2011 3:17 PM, James Harris wrote:
> On Sep 8, 7:46 pm, BGB<cr88...@hotmail.com> wrote:
>> On 9/3/2011 12:31 AM, James Harris wrote:
>
>>> A while ago we discussed setting up a website on programming language
>>> development. The site is now live at
>>
>>> http://pldev.org
>
> ...
>
>> yes, cool...
>>
>> I hadn't heard stuff in a long enough time that I had wondered if
>> anything was still going on with this (good that it is).
>>
>> sometimes, with all of the quietness, one can wonder just how much
>> longer usenet will live on (sad sometimes...).
>>
>> the main site is, I guess, a plain website?...
>> doesn't appear to be a wiki or anything.
>
> Since we discussed the idea I put a lot of work into trying out
> different options to make editing by multiple users easy and secure. I
> learned a lot but there's more to do and it's a bit of a minefield ...
> and I'm too busy at work and with other things. In the end I decided
> to just start by setting up the first part of what I had in mind.
>

fair enough...

>> in my case, I had also set up my own wiki in the mean-time (based on
>> mediawiki), mostly intended for my own projects.
>
> I saw it. It's good that you set that up.
>
>> I personally like mediawiki, as it is a little less effort than more
>> traditional website editing. a downside though is, at the moment, I
>> don't really have any access to its stored contents (I guess they are in
>> a MySQL database somewhere on my servers' HDD or similar...).
>
> I considered some CMS and Wiki systems including Mediawiki. It's one
> of the best, for sure. For similar reasons to what you mention I stuck
> with smaller facilities such as CSS, PHP, SQL and HTML. This means
> more work to edit but I have more control and I know where things are.
>

yep.

more ideal would be something sort of like mediawiki but with the
ability to store contents in plain text files.

granted, one could argue whether or not it could scale as well or have
as many features, but disk files would have the advantage that one one
more easily copy them to/from their server, make backups, ...

I had actually partly considered something like the above, but more in
the context of a documentation system, and with partial integration with
a javadoc like system, so like javadoc+wiki or similar (and probably
using a similar markup syntax to in mediawiki as well).

potentially, the syntax would also be more compact than the existing
Javadoc/Doxygen style syntax, and more ideally could integrate with
standalone documentation (probably like "wiki-markup as standalone files").

another component could probably also run on a webserver as a CGI
script, and serve a similar role to mediawiki (apart from being
file-based, probably with each article as its own markup file).

potentially, an optional dummy webserver could exist, which basically
does nothing apart from manage a local HTTP server and pass requests to
the CGI portion, and invoke the browser for said local server (the
browser providing the UI).

slightly less ambitious though would be a tool to simply convert
mediawiki-like markup into HTML (could be run batch-style).

however, given that my efforts are spread fairly thin as-is, I haven't
really gotten around to anything like this.

> James

BGB

unread,

Sep 11, 2011, 3:11:43 PM9/11/11

to

On 9/10/2011 2:52 AM, BartC wrote:
> "tm" <thomas...@gmx.at> wrote in message
> news:70f63325-7b32-47be...@n11g2000yqh.googlegroups.com...
>> On 9 Sep., 15:28, "Bartc" <b...@freeuk.com> wrote:
>>> "tm" <thomas.mer...@gmx.at> wrote in message
>
>>> Because it's yours! That won't be an advantage to anyone else of
>>> course...
>>
>> The implementer of a language will probably use his
>> language. :-) But he (she) might not switch to it.
>
> (I would; it's just so much more 'comfortable' than anything else. Although
> that also applies to other kinds of software..)
>
>> The question (Why should anybody ...) is tailored for
>> people who want to release their language and who want
>> that their language is actually used.
>
> I created a scripting language to go with my applications. The only way to
> extend the application was to use that language. So I did have 'users',
> even
> if they didn't exactly choose it out of free will... I wasn't bothered
> however about whether the language (which was powerful enough to do general
> stuff) was used much elsewhere; it's other people's loss if I could be more
> productive than them!).
>

probably same here...

I just try to make my scripting languages be "reasonably familiar"
though, mostly so that any would-be modders are not scratching their
heads being all like "now WTF is all this crap?...".

>>> So this leads to someone creating their own abstract library (for
>>> graphics,
>>> files, whatever) which is not tied to an OS and which can be made to run
>>> on
>>> a range of systems, ...
>>
>> This is exactly the stategy I use for Seed7.
>
> OK, so why can't someone create a new language for exactly the same
> reasons?
> Or are you saying you wouldn't have created Seed7 in hindsight...?
>

it is a mystery.
I am not entirely sure of his motives sometimes...

in my case, whether or not any of my stuff is/was a good idea is
debatable...

much like this idea (random idea which came up last night):
http://cr88192.dyndns.org/wiki/index.php/ExWAD

which I may or may not implement a tool and VFS support for.
note: yes, one can just shove a ZIP archive onto the end of an EXE and
do the same thing, but this option avoids just being able to unzip the
file (merit or drawback, either way).

>>> ... rather than relying on someone's huge library which only
>>> runs on machine X, or someone else's even huger system which is
>>> cross-platform, but has to be programmed in C++...
>>
>> Functions from the C++ library could be called from the
>> new designed language. To use this stategy some conditions
>> must be fulfilled:
>>
>> 1. The C++ library must be available on the supported
>> platforms (ideally under a free license). And it
>> should be released together with the language
>> implementation.
> ...
>
> C++ is pretty much impossible if you don't use, understand, or like C++,
> especially trying to interface from a different language.
>
> I've wasted plenty of time trying to link into C++ libraries via C-style
> interfaces (GDI+ for example), now I don't even bother. If it's in C++,
> then forget it.
>

and, recently, I got into a big argument with people for reasons why my
scripting VM only provides direct C access via the FFI, and would
require the C++ code to use a C-like interface to have "common ground".

but, seriously, vs plain C, C++ "opens up a whole new can of worms..."
(parsing, semantics, FFI, ...). and, transparent auto-plugging between
an ECMAScript variant and C++ may well pose non-trivial complexities...

or such...

BartC

unread,

Sep 12, 2011, 5:24:07 PM9/12/11

to

"BGB" <cr8...@hotmail.com> wrote in message
news:j4j157$1k0$1...@news.albasani.net...

> On 9/10/2011 2:52 AM, BartC wrote:

>> I created a scripting language to go with my applications. The only way
>> to
>> extend the application was to use that language. So I did have 'users',
>> even
>> if they didn't exactly choose it out of free will

> probably same here...
>
> I just try to make my scripting languages be "reasonably familiar" though,
> mostly so that any would-be modders are not scratching their heads being
> all like "now WTF is all this crap?...".

When users are not really programmers, then you don't want to impose on them
something like C-syntax (bristling with symbols and unnecessary
punctuation), or Perl, or Lisp.

You want something that looks like what someone might expect a programming
language to look like! So more like Basic (or, for a more modern example,
like Ruby, Python or especially Lua).

I had my own preferred syntax, but it was tamed down to make it look simpler
and more informal.

--
Bartc

BGB

unread,

Sep 13, 2011, 1:37:39 AM9/13/11

to

On 9/12/2011 2:24 PM, BartC wrote:
> "BGB" <cr8...@hotmail.com> wrote in message
> news:j4j157$1k0$1...@news.albasani.net...
>> On 9/10/2011 2:52 AM, BartC wrote:
>
>>> I created a scripting language to go with my applications. The only way
>>> to
>>> extend the application was to use that language. So I did have 'users',
>>> even
>>> if they didn't exactly choose it out of free will
>
>> probably same here...
>>
>> I just try to make my scripting languages be "reasonably familiar"
>> though,
>> mostly so that any would-be modders are not scratching their heads being
>> all like "now WTF is all this crap?...".
>
> When users are not really programmers, then you don't want to impose on
> them
> something like C-syntax (bristling with symbols and unnecessary
> punctuation), or Perl, or Lisp.
>

eh?...

C-like syntax works pretty well in mainstream language lists.
they probably hold top-place in these rankings for a reason...

> You want something that looks like what someone might expect a programming
> language to look like! So more like Basic (or, for a more modern example,
> like Ruby, Python or especially Lua).
>

but these languages look weird, especially Python...

> I had my own preferred syntax, but it was tamed down to make it look
> simpler
> and more informal.
>

I am using a mostly JavaScript/ActionScript style syntax...

but, anyways, the presumed modder would have at least basic familiarity
with programming, and most likely with other mainstream languages.

BartC

unread,

Sep 13, 2011, 5:50:57 AM9/13/11

to

"BGB" <cr8...@hotmail.com> wrote in message

news:j4mq77$q03$1...@news.albasani.net...

> On 9/12/2011 2:24 PM, BartC wrote:

>> When users are not really programmers, then you don't want to impose on
>> them
>> something like C-syntax (bristling with symbols and unnecessary
>> punctuation), or Perl, or Lisp.
>>
>
> eh?...
>
> C-like syntax works pretty well in mainstream language lists.
> they probably hold top-place in these rankings for a reason...

That doesn't mean they are good for beginners, or for casual programming by
people who just want to automate part of their work.

The first piece of code I ever saw in action was the following Basic program
(I'd asked what sort of things some mini-computer could do, and the guy
typed this into the nearest teletype, which was even more impressive as you
got an immediate printout. However it was actually connected to a mainframe
so was a bit of a cheat!):

10 for i=1 to 10
20 print i,sqr(i)
30 next i

This was clear enough to me, even though I'd never seen any code before in
my life (other input which talked to the OS was a bit more cryptic). There
is nothing much in there that wasn't anything to do with the job in hand, or
that you could appreciate was necessary for this 'programming' business.

Now let's take a piece of equivalent C code. A bit unfair, as this is actual
C syntax, while other languages with derived syntax have less extraneous
stuff, but anyway:

#include <stdio.h>
#include <math.h>
int main(void) {
int i;
for (i=1; i<=10; ++i)
printf("%d %f\n",i, sqrt(i));
}

Immediately, the first four lines seem to be gobbledygook, and appear to
have little to do with anything.

The loop statement for some reason has the loop variable written 3 times
instead of once, and you have the extra symbol <=, a mysterious ++i, plus
parentheses and two semicolons for good measure!

The print statement also has parentheses and the semicolon, and the
strange-looking "%d %f\n"; what's that all about?!

The syntax I used was along the lines of:

for i:=1 to 10 do
println i,sqrt(i)
end

This also has things such as "to", "do" and "end" thrown in, but they are
English words and partly self-explanatory. As in the Basic, you might
rightly expect to be able to write extra lines in the loop body (not so in
the C).

Lua code is almost the same (for i=1,10 do print (i,math.sqrt(i)) end).

(or, for a more modern example,
>> like Ruby, Python or especially Lua).
>>
>
> but these languages look weird, especially Python...

Yeah, the layout, otherwise nothing remarkable. (However, Python can be
unreadable for other reasons...)

> I am using a mostly JavaScript/ActionScript style syntax...
>
>
> but, anyways, the presumed modder would have at least basic familiarity
> with programming, and most likely with other mainstream languages.

I suppose if any syntax was imposed on a beginner, or a 'modder', they will
pick it up eventually and just write whatever is necessary to make things
work.

However I'm quite familiar with C yet I still find writing C codes gives me
RSI; apart from all the typos I get with my clumsy typing and needing all
that punctuation.

--
Bartc

BGB

unread,

Sep 13, 2011, 1:50:44 PM9/13/11

to

one doesn't need most of that for a C-like syntax.

(note, I am not advocating here using C itself as a scripting language,
as it has many drawbacks in the area...).

also possible is this:
var i;
for(i=0; i<10; i++)
println(i, " ", sqrt(i));

or, also:
for(i in 1..10)
println(i, " ", sqrt(i));

my language also has and often uses printf, but more it is for sake of
formatting control(it is easier to do formatted output with a
control-string and arguments, than with inline formatting controls).

> The loop statement for some reason has the loop variable written 3 times
> instead of once, and you have the extra symbol <=, a mysterious ++i, plus
> parentheses and two semicolons for good measure!
>
> The print statement also has parentheses and the semicolon, and the
> strange-looking "%d %f\n"; what's that all about?!
>

each token serves its purpose...

> The syntax I used was along the lines of:
>
> for i:=1 to 10 do
> println i,sqrt(i)
> end
>
> This also has things such as "to", "do" and "end" thrown in, but they are
> English words and partly self-explanatory. As in the Basic, you might
> rightly expect to be able to write extra lines in the loop body (not so in
> the C).
>

COBOL also used lots of English under the assumption that it would make
it readable to ordinary business men... it didn't work...

> Lua code is almost the same (for i=1,10 do print (i,math.sqrt(i)) end).
>

this is part of why it looks weird...

> (or, for a more modern example,
>>> like Ruby, Python or especially Lua).
>>>
>>
>> but these languages look weird, especially Python...
>
> Yeah, the layout, otherwise nothing remarkable. (However, Python can be
> unreadable for other reasons...)
>

Python's indentation-based syntax is a cause of much controversy.
mixing tabs and spaces, or reading/writing code with different
tab-widths, can go awry.
...

>> I am using a mostly JavaScript/ActionScript style syntax...
>>
>>
>> but, anyways, the presumed modder would have at least basic familiarity
>> with programming, and most likely with other mainstream languages.
>
> I suppose if any syntax was imposed on a beginner, or a 'modder', they will
> pick it up eventually and just write whatever is necessary to make
> things work.
>

usually, a modder wants to hack on or tweak the behavior of a program or
game, sometimes practical, more often as a form of expression or
experimentation.

some of the most popular games for modders (most of the Quake-series
games) have used C-like syntax for scripts (Quake used a custom language
it called QuakeC, Quake2 and Quake3 just used C though, the former using
it directly via compiling to DLLs, the latter compiling it to bytecode
and running this).

Doom 3 uses a vaguely C++ like scripting language, but with some added
funkiness (aspects retained from QuakeC, ...).

Valve used C++ for Half-Life and Half-Life 2 (generally by compiling it
to DLLs, like in Quake2). IIRC starting with Garry's Mod or maybe TF2,
people ended up starting to use a lot of Lua with the Source engine.

I guess one could also write mods for GIMP or similar, but maybe the
GIMP people would have a different name for them?...

> However I'm quite familiar with C yet I still find writing C codes gives me
> RSI; apart from all the typos I get with my clumsy typing and needing all
> that punctuation.
>

not like lots of letters and keywords are any easier, FWIW...
all require hitting keys...

operators/braces/... generally allow expressing more meaning in less
characters, which is ultimately a win.

this is more so the case when entering code interactively, as one may
only have so many characters in the console (otherwise one starts having
to resort to the "text editor as a scratch pad" thing).

one nifty feature (seen elsewhere) was the ability to select code
fragments (in an editor) and eval them directly with a keypress, but
there is currently no easy/obvious way to add this to my 3D engine
(would need to think up UI concerns for an integrated console-linked
text editor...).

or such...

Tony

unread,

Sep 14, 2011, 3:23:28 PM9/14/11

to

"BartC" <b...@freeuk.com> wrote in message
news:j4n94e$do6$1...@dont-email.me...

> The first piece of code I ever saw in action was the following Basic
> program
>

> 10 for i=1 to 10
> 20 print i,sqr(i)
> 30 next i
>

> Now let's take a piece of equivalent C code.

> int i;
> for (i=1; i<=10; ++i)
> printf("%d %f\n",i, sqrt(i));
>

> The syntax I used was along the lines of:
>
> for i:=1 to 10 do
> println i,sqrt(i)
> end

And you think that is now good syntax? It's cryptic, verbose, not easily
comprehendable, inelegant.

(The C header thing is a separate issue not to confuse it with the syntax
of a language proper).

BartC

unread,

Sep 14, 2011, 5:36:17 PM9/14/11

to

"Tony" <nosp...@ever.net> wrote in message
news:Qa7cq.20761$OO1....@newsfe02.iad...

> "BartC" <b...@freeuk.com> wrote in message

>> int i;
>> for (i=1; i<=10; ++i)
>> printf("%d %f\n",i, sqrt(i));
>>
>> The syntax I used was along the lines of:
>>
>> for i:=1 to 10 do
>> println i,sqrt(i)
>> end
>
> And you think that is now good syntax? It's cryptic, verbose, not easily
> comprehendable, inelegant.

It's derived from the Algol family of languages. It's not perfect, but I
disagree that it's cryptic. Algol used to be used to illustrate algorithms
in technical books and articles.

But what would *you* consider to be clear, concise, easy to understand, and
elegant? For a language suitable for embedding in an application. Perhaps
you can show how to print a table of square roots in that syntax.

Would would the task look like in pseudo-code? Or English?

To be that much more precise, it might have to look something like:

[(i,√i) | i=1..10]

in some interactive loop which evaluates and prints every expression
entered. But I don't consider this easy to read or understand for people who
might only be interested in some scripting.

> (The C header thing is a separate issue not to confuse it with the syntax
> of a language proper).

It's just another thing that has nothing to do with what you're trying to
achieve, so affecting productivity and becoming another source of errors.

--
Bartc

Tony

unread,

Sep 14, 2011, 8:05:56 PM9/14/11

to

"BartC" <b...@freeuk.com> wrote in message

news:j4r6rv$ki9$1...@dont-email.me...

> "Tony" <nosp...@ever.net> wrote in message
> news:Qa7cq.20761$OO1....@newsfe02.iad...
>> "BartC" <b...@freeuk.com> wrote in message
>
>>> int i;
>>> for (i=1; i<=10; ++i)
>>> printf("%d %f\n",i, sqrt(i));
>>>
>>> The syntax I used was along the lines of:
>>>
>>> for i:=1 to 10 do
>>> println i,sqrt(i)
>>> end
>>
>> And you think that is now good syntax? It's cryptic, verbose, not
>> easily comprehendable, inelegant.
>
> It's derived from the Algol family of languages.

Obviously, but that is what you were criticizing.

> It's not perfect,

Moreso (read, "understatement of the year"), it's just as bad as what you
were trying to improve on.

> but I disagree that it's cryptic.

To someone who has programming experience it is not, but that is not
saying much.

> But what would *you* consider to be clear, concise, easy to understand,
> and elegant?

I could show you, but then I'd have to kill you. ;-) So, I'll let others
offer up the "obvious" ways to improve upon any/all of the given examples
of the simple snippet.

>> (The C header thing is a separate issue not to confuse it with the
>> syntax of a language proper).
>
> It's just another thing that has nothing to do with what you're trying
> to achieve, so affecting productivity and becoming another source of
> errors.

I was trying to focus the discussion by removing unnecessary issues. The
smaller scope still has enough "meat" in it (obviously). Indeed,
narrowing the scope a bit more by removing the specific library calls
would be prudent to use as a start for analysis:

A. Visual Basic

for i=1 to 10
stmt1
stmt2
next i

B. C++

for (int i=1; i<=10; ++i)
{
stmt1;
stmt2;
}

C. BartC

for i:=1 to 10 do

stmt1
stmt2
end

where "stmt1" and "stmt2" can be any statement, using Visual Basic rather
than some old Basic for (A), and C++ rather than C for (B) and taking
some preference toward layout.

My opinion is that A, B, and C, all suck to about the same degree.
Others, please do comment on the good/bad/ugly of A, B, C above, or even
show/propose a D, E, F...

Tony

unread,

Sep 14, 2011, 8:12:01 PM9/14/11

to

"Tony" <nosp...@ever.net> wrote in message

news:Hkbcq.13433$Ll3....@newsfe16.iad...

>
> "BartC" <b...@freeuk.com> wrote in message
> news:j4r6rv$ki9$1...@dont-email.me...
>> "Tony" <nosp...@ever.net> wrote in message
>> news:Qa7cq.20761$OO1....@newsfe02.iad...
>>> "BartC" <b...@freeuk.com> wrote in message
>>
>>>> int i;
>>>> for (i=1; i<=10; ++i)
>>>> printf("%d %f\n",i, sqrt(i));
>>>>
>>>> The syntax I used was along the lines of:
>>>>
>>>> for i:=1 to 10 do
>>>> println i,sqrt(i)
>>>> end
>>>
>>> And you think that is now good syntax? It's cryptic, verbose, not
>>> easily comprehendable, inelegant.
>>
>> It's derived from the Algol family of languages.
>
> Obviously, but that is what you were criticizing.

Did I say that? It actually looks more Pascal-ish.

BartC

unread,

Sep 15, 2011, 5:47:04 AM9/15/11

to

"Tony" <nosp...@ever.net> wrote in message

news:Hkbcq.13433$Ll3....@newsfe16.iad...

> I was trying to focus the discussion by removing unnecessary issues. The
> smaller scope still has enough "meat" in it (obviously). Indeed, narrowing
> the scope a bit more by removing the specific library calls would be
> prudent to use as a start for analysis:
>
> A. Visual Basic
> for i=1 to 10

> B. C++
> for (int i=1; i<=10; ++i)

> C. BartC
> for i:=1 to 10 do

> My opinion is that A, B, and C, all suck to about the same degree. Others,
> please do comment on the good/bad/ugly of A, B, C above, or even
> show/propose a D, E, F...

For a traditional loop where you need the index, you are always going to
have these elements:

o A symbol or keyword introducing a loop
o The name chosen for the index
o The start value
o The end value

(Plus some way of delimiting the statements forming the body of the loop.)

That's four elements. The Basic manages it in 6 tokens, which is not bad.
Otherwise it would look like:

D. Minimalistic
for i 1 10

My version uses 7 tokens, including the 'do' delimiter because the syntax is
not line oriented.

But it can also be reduced to 5 tokens: for i to 10 do, since loops start at
1 if not specified.

(And where the index is not needed, it's just 3 tokens: to 10 do, whereas
most others still require a loop variable anyway).

The C (or C++) version uses 14 tokens! 15 if you include the { delimiter. A
lot more error-prone, especially with the loop index written 3 times.

Maybe they do all suck, but some suck more than others!

Newer languages have eliminated much of the need for explicit loops like
this, but if you do need one, there aren't so many different ways of
specifying those four things, with the Basic still one of the clearest in my
opinion.

--
Bartc

James Harris

unread,

Sep 15, 2011, 3:52:46 PM9/15/11

to

On Sep 15, 10:47 am, "BartC" <b...@freeuk.com> wrote:

..

> Newer languages have eliminated much of the need for explicit loops like
> this, but if you do need one, there aren't so many different ways of
> specifying those four things, with the Basic still one of the clearest in my
> opinion.

I fully agree with what you say and I too think that Basic is just so
user-friendly that it's hard to beat or even come close to. More than
most languages the various dialects of Basic, on the whole, seem more
'natural' than other languages.

I often feel my own design efforts fail to produce a language that is
as easy to think in as what I remember of Basic. When I analyse them
the components I try to express are simply too complex for simple
language.

Perhaps that is the nub of the issue. Basic's components tend to be
simple but incomplete. Taking your loop example, Basic loops are great
for the simple 1 to 10 or 10 to 0 in steps of 2 etc, but are not
general. If you want to express a more complex loop in Basic you end
up making the code MORE complex than it can be in, say, C.

Is it a case of: you can have it simple or complete but not both?

James

Marco van de Voort

unread,

Sep 15, 2011, 4:03:42 PM9/15/11

to

On 2011-09-13, BGB <cr8...@hotmail.com> wrote:
>> punctuation), or Perl, or Lisp.
>>
>
> eh?...
>
> C-like syntax works pretty well in mainstream language lists.

Yeah, like assembler in the fifties and sixties....

BartC

unread,

Sep 15, 2011, 7:39:29 PM9/15/11

to

"James Harris" <james.h...@googlemail.com> wrote in message
news:5b4f52cc-671d-42f3...@h11g2000vbc.googlegroups.com...

> On Sep 15, 10:47 am, "BartC" <b...@freeuk.com> wrote:

>> with the Basic still one of the clearest in my opinion.
>
> I fully agree with what you say and I too think that Basic is just so
> user-friendly that it's hard to beat or even come close to.

Especially with it's string handling, which was ahead of it's time. It took
me a while to come up with something where string handling was equally
effortless for the programmer.

> Basic loops are great
> for the simple 1 to 10 or 10 to 0 in steps of 2 etc, but are not
> general. If you want to express a more complex loop in Basic you end
> up making the code MORE complex than it can be in, say, C.

> Is it a case of: you can have it simple or complete but not both?

I keep my iterative loops simple: they only ever iterate over an integer
range, and always in steps of 1, either up or down (steps other than one are
so rare, I've eliminated that option).

Anything more complex, is a different kind of statement, for example C's
general purpose loop 'for(a;b;c)' I write as 'while a,b,c do'.

Then I use 'for all' for other kinds of iteration; over all elements of a
list for example, but I use it for anything that it makes sense to iterate
over, or sometimes even if it doesn't!

You're right that you can't express most of these directly in Basic, that's
why I don't actually use Basic...

--
bartc

BGB

unread,

Sep 16, 2011, 10:42:31 AM9/16/11

to

however, a difference is that at this point there is a lack of anything
clearly better.

Marco van de Voort

unread,

Sep 17, 2011, 8:09:04 AM9/17/11

to

On 2011-09-16, BGB <cr8...@hotmail.com> wrote:
>>> eh?...
>>>
>>> C-like syntax works pretty well in mainstream language lists.
>>
>> Yeah, like assembler in the fifties and sixties....
>
> however, a difference is that at this point there is a lack of anything
> clearly better.

No, that is what the ones that stuck to assembler in the fifties and sixties
also said :-)

James Harris

unread,

Sep 17, 2011, 8:09:58 AM9/17/11

to

On Sep 16, 12:39 am, "BartC" <b...@freeuk.com> wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message

>
> news:5b4f52cc-671d-42f3...@h11g2000vbc.googlegroups.com...
>
> > On Sep 15, 10:47 am, "BartC" <b...@freeuk.com> wrote:
> >> with the Basic still one of the clearest in my opinion.
>
> > I fully agree with what you say and I too think that Basic is just so
> > user-friendly that it's hard to beat or even come close to.
>
> Especially with it's string handling, which was ahead of it's time. It took
> me a while to come up with something where string handling was equally
> effortless for the programmer.

The Basic I used first had left$, mid$ and right$ which, IMHO, are
*awful*. What did you have in mind?

> > Basic loops are great
> > for the simple 1 to 10 or 10 to 0 in steps of 2 etc, but are not
> > general. If you want to express a more complex loop in Basic you end
> > up making the code MORE complex than it can be in, say, C.
> > Is it a case of: you can have it simple or complete but not both?
>
> I keep my iterative loops simple: they only ever iterate over an integer
> range, and always in steps of 1, either up or down (steps other than one are
> so rare, I've eliminated that option).

This leads to some familiar questions:

1. What happens to the loop control variable outside the loop? If it
is in scope after the loop finishes what value does it have?

2. What happens if the programmer changes or tries to change the
control variable value in the body of the loop?

3. What happens if the programmer changes or tries to change the far
bound?

4. What should programmers generally do if they want step sizes other
than one and does the compiler cope well if they make inefficient
choices? In other words, when a more complex loop control is needed do
you encourage a programmer to write (for i ... ; real_index = equation
involving i; ...) or to write (real_index := start; for i ...;
real_index := real_index + step; ...), i.e. to maintain real_index
manually? Or, at what point should they switch to a while loop?

All of these are complications which can occur when a Basic-style for
loop is used. It looks friendly in the, er, basic case. And it may
look simple. But it isn't. It has complexities that are hidden. This
is a case of a non-manifest interface. Certain choices are made by the
language designer but they are not at all apparent in the syntax.

C has its own plethora of complex hidden rules in many areas but it
deals fairly well with *all* of the above questions in its loop
construct.

One additional benefit of C's for loop: it makes fine adjustments
simple by allowing such tests as less than (<) as well as less than or
equal to (<=). Thus it avoids some of the "minus one" bounds
adjustments such as "for i = near to far - 1" that a Basic-type for
loop sometimes/often requires.

> Anything more complex, is a different kind of statement, for example C's
> general purpose loop 'for(a;b;c)' I write as 'while a,b,c do'.

Do you mean you write 'for(a;b;c)d' as 'a;while(b)d;c do'?

James

Marco

unread,

Sep 17, 2011, 12:12:56 PM9/17/11

to

suggestion:
James please put the edition of books you review because they seem to be different then the current editions.

BartC

unread,

Sep 17, 2011, 4:27:40 PM9/17/11

to

"James Harris" <james.h...@googlemail.com> wrote in message

news:c76208e5-8ccf-40b7...@l4g2000vbv.googlegroups.com...

> On Sep 16, 12:39 am, "BartC" <b...@freeuk.com> wrote:

>> Especially with it's string handling, which was ahead of it's time.

> The Basic I used first had left$, mid$ and right$ which, IMHO, are
> *awful*. What did you have in mind?

That you could do A$+B$ without worrying about how it's managed.

As for left$ and right$, I actually prefer that syntax (eg. left(s,n))
instead of the alternative s.1..n] which I also use, or s[n:] which I'm
thinking of using.

> This leads to some familiar questions:

OK, these answers apply to one interpreted language of mine, but in general
languages can vary and it's probably not good to depend on specific
behaviours.

> 1. What happens to the loop control variable outside the loop? If it
> is in scope after the loop finishes what value does it have?

I only use function-scope. The loop index in 'for i:=a to b' will end up
with value b+1, or b-1 if using 'downto', if the loop terminates normally.
(And yes I ignore overflow problems at the limits of the range.)

> 2. What happens if the programmer changes or tries to change the
> control variable value in the body of the loop?

Whatever you might expect I suppose. If i has the value 5, and is changed to
value 50, then the next iteration it will be 51 instead of 6. (A loop won't
be terminated in the middle of a body if the index suddenly exceeds the
limit, if that's what you mean.)

> 3. What happens if the programmer changes or tries to change the far
> bound?

A loop limit is one of Constant, Simple, or Complex. A Constant limit can't
be changed. A Complex one is evaluated once and copied to a temporary. So
that can't be changed. A Simple limit (eg. a single named variable) *can* be
changed.

(I suppose a Simple limit can be treated as Complex, and copied too,
especially if the compiler is clever enough to detect when it might change;
but mine isn't...)

A For loop is one of the fastest constructions in the interpreter and uses a
single bytecode, so it tries not to do anything clever.

> 4. What should programmers generally do if they want step sizes other
> than one and does the compiler cope well if they make inefficient
> choices? In other words, when a more complex loop control is needed do
> you encourage a programmer to write (for i ... ; real_index = equation
> involving i; ...) or to write (real_index := start; for i ...;
> real_index := real_index + step; ...), i.e. to maintain real_index
> manually? Or, at what point should they switch to a while loop?

Probably another kind of loop is better. My For-loops can just about
accommodate other step-sizes, using various ways, but it's untidy and not
quite as efficient as doing it properly.

> All of these are complications which can occur when a Basic-style for
> loop is used. It looks friendly in the, er, basic case. And it may
> look simple. But it isn't. It has complexities that are hidden. This
> is a case of a non-manifest interface. Certain choices are made by the
> language designer but they are not at all apparent in the syntax.
>
> C has its own plethora of complex hidden rules in many areas but it
> deals fairly well with *all* of the above questions in its loop
> construct.

It does, but it's too general, and a pain to write for the 19 times out of
20 that you want a basic loop. In fact, you practically have to tell the C
compiler exactly how to implement the loop, every time you write a for
statement!

(C programmers tend to respond to this by suggesting using a tacky macro
such as FOR(i,a,b), but that is not really the answer.)

> One additional benefit of C's for loop: it makes fine adjustments
> simple by allowing such tests as less than (<) as well as less than or
> equal to (<=). Thus it avoids some of the "minus one" bounds
> adjustments such as "for i = near to far - 1" that a Basic-type for
> loop sometimes/often requires.

That's true, in a zero-based language, it has that advantage. But my
languages tend to be one-based! And having to get it just right is a
disadvantage as mentioned.

And I haven't dismissed it completely, as I've added a form of it to the
syntax of my While statement. (And I have plenty of other ways than a basic
For statement, to execute unusual loops. So For can be kept simple - and
efficient.)

>> Anything more complex, is a different kind of statement, for example C's
>> general purpose loop 'for(a;b;c)' I write as 'while a,b,c do'.
>
> Do you mean you write 'for(a;b;c)d' as 'a;while(b)d;c do'?

Well, it would be written as 'while a,b,c do d od', which is almost the same
as 'a; while(b) d; c od; but not quite: all my loops support four loop
control statements: restart, redo, next, and exit (like C's break), and some
of these only work properly when the loop is self-contained.

So a 'Restart' will not re-execute the 'a' in 'a; while(b) d; c do'.

--
Bartc

BartC

unread,

Sep 17, 2011, 6:19:43 PM9/17/11

to

"BartC" <b...@freeuk.com> wrote in message news:j52vvi$4c1$1...@dont-email.me...

> "James Harris" <james.h...@googlemail.com> wrote in message

>> Do you mean you write 'for(a;b;c)d' as 'a;while(b)d;c do'?
>
> Well, it would be written as 'while a,b,c do d od', which is almost the
> same
> as 'a; while(b) d; c od

> 'a' in 'a; while(b) d; c do'.

Don't know how I came up with these. The equivalent of C's 'for (a;b;c;)d'
would be written as:

a
while b do
d
c
end

if there was no other way of expressing it.

--
Bartc

BGB

unread,

Sep 18, 2011, 6:08:28 PM9/18/11

to

I am not sure of the point of all this...

maybe at the time they were right, as pretty much the only other options
were FORTRAN and COBOL (mostly in the 60s), which were not necessarily
"clearly better" than assembler.

C didn't come around until the 70s... (so its later existence or
non-existence was of no relevance to them).

in any case, it is not clear at the moment that there is anything
clearly better (who knows about 10 or 20 years from now, that is the
future, not "right now").

BASIC-like, Python-like, or Lua-like syntax is not clearly better, and
on many fronts I might contend that they are actually worse on various
fronts.

hence, the issue is as it is.

for the moment, C-family syntax (extended to include the likes of Java,
C#, JavaScript, ActionScript, ... as well) seems "reasonably close" to
optimal (the rest then is "fine-tuning").

not that one has to treat syntactic details or language trivia as a
religion or anything, but rather refrain from making significant changes
unless there is a good reason for doing so (for example, my choice of
"/expr/ as /type/" and "/expr/ as! /type/" cast syntax over
"(/type/)/expr/" syntax for sake of avoiding parser ambiguity related to
the use of function-returning-expressions and curried functions and
similar...).

the rest is pedantics:
some people don't like ';' so they make line-breaks significant... then
you have all of these funky (and often subtle) issues related to
formatting (the parser doesn't always parse the code as intended, ...);
some people like keywords more (making very "wordy" languages), and
others prefer to avoid keywords wherever possible and have nearly
everything as operator glyphs;
...

so, at this point, things are as they are...

BartC

unread,

Sep 18, 2011, 7:53:20 PM9/18/11

to

"BGB" <cr8...@hotmail.com> wrote in message

news:j55q4v$mr3$1...@news.albasani.net...

> BASIC-like, Python-like, or Lua-like syntax is not clearly better,

No? It's certainly more informal and therefore more friendly.

> for the moment, C-family syntax (extended to include the likes of Java,
> C#, JavaScript, ActionScript, ... as well) seems "reasonably close" to
> optimal (the rest then is "fine-tuning").

There's one thing wrong with thinking C-family syntax is 'optimal': if you
ask anyone to write an algorithm in pseudo-code, the chances are they will
write something that looks like Basic, Algol, Pascal, Lua, Python ... in
fact pretty much anything except C!

This might give a clue that perhaps that perhaps C-family languages aren't
the most natural way of writing code. (In fact even the C pre-processor
prefers to use #if ... #endif!)

(Understandly, C-family programmers will defend their syntax to the death,
even to the extent of pretending that there's nothing really wrong with C's
type-declaration syntax, a format so convoluted and impossible, it's
necessary to employ third-party utilities to disentangle their meaning!)

However, syntax is just syntax, it's merely a superficial layer over the
language proper, and it ought to be possible to just switch from one syntax
style to another, but that hasn't really happened. (It's not a wild idea:
I've been pretty much writing C code, but with Algol-like syntax, for
years.)

> the rest is pedantics:
> some people don't like ';' so they make line-breaks significant... then
> you have all of these funky (and often subtle) issues related to
> formatting (the parser doesn't always parse the code as intended, ...);
> some people like keywords more (making very "wordy" languages), and others
> prefer to avoid keywords wherever possible and have nearly everything as
> operator glyphs;

Exactly, so why not have a choice? A choice of selecting a preferred syntax
instead of having to completely switch languages.

--
Bartc

BGB

unread,

Sep 18, 2011, 9:19:37 PM9/18/11

to

On 9/18/2011 4:53 PM, BartC wrote:
> "BGB" <cr8...@hotmail.com> wrote in message
> news:j55q4v$mr3$1...@news.albasani.net...
>
>> BASIC-like, Python-like, or Lua-like syntax is not clearly better,
>
> No? It's certainly more informal and therefore more friendly.
>

BASIC code often ends up requiring explicit line-continuation characters;
Python's indentation-based syntax is a source of many problems and much
controversy;
Lua has lots of keywords and generally looks funky;
...

also, many other attempts at "soft line-break" languages have ended up
creating lots of awkward ambiguities, and often require funky rules so
that the parser doesn't get confused.

>> for the moment, C-family syntax (extended to include the likes of
>> Java, C#, JavaScript, ActionScript, ... as well) seems "reasonably
>> close" to optimal (the rest then is "fine-tuning").
>
> There's one thing wrong with thinking C-family syntax is 'optimal': if
> you ask anyone to write an algorithm in pseudo-code, the chances are
> they will write something that looks like Basic, Algol, Pascal, Lua,
> Python ... in fact pretty much anything except C!
>

my own pseudo-code tends to look mostly C-like (actually, my own
pseudo-code tends to mostly resemble something Java-like, but sometimes
with features which wouldn't map well to real languages, such as
conditionals or procedural logic in structures, ...).

> This might give a clue that perhaps that perhaps C-family languages
> aren't the most natural way of writing code. (In fact even the C
> pre-processor prefers to use #if ... #endif!)
>

as-is, my own language uses:
ifdef(...) { ... }
and:
ifndef(...) { ... }

mostly as I ended up "promoting" it from the use of attributes:
$[ifdef(...)] { ... }

> (Understandly, C-family programmers will defend their syntax to the
> death, even to the extent of pretending that there's nothing really
> wrong with C's type-declaration syntax, a format so convoluted and
> impossible, it's necessary to employ third-party utilities to
> disentangle their meaning!)
>

whatever dude...

although, C and C++ proper have ambiguous declaration syntax, languages
like C# and Java have disambiguated this through the use of various
rules (the syntax allows for context-independent parsing, as the cost of
being able to express certain constructions).

I had before considered a language I had called C-Aux, which would have
been mostly C-like but move over to a technically more C#-like syntax,
but little has been done with this idea thus far.

for my own language, I am currently leaning mostly towards ActionScript
style declarations "var x;" or "var obj:Foo;", although as-is both this
and Java-style declarations are supported ("variant x;" and "Foo obj;").

technical note:
"var x;", although both "var" and "variant" are valid names for the
type, is in-fact a syntactic variable definition (var is parsed as a
keyword, rather than a type, however "variant" is also the default type
if no other type is given).

this also means that the full form (variant) is required for a function
returning the type, rather than being able to use the 'var' short-form.
partly, I have considered the possibility of dropping the 'var'
shorthand for the type-name, essentially making "variant" the sole
official name for the type.

also note that "variant" may behave as either a dynamic type, or more
like the "auto" type in C++ or C# (IIRC), depending on context (a proper
"auto" type could be added, but at the moment would simply be an alias,
as it would be extra effort to add a "single type or die" special case,
vs the current "try to guess the type, or fall back to dynamic
type-checking" strategy...).

> However, syntax is just syntax, it's merely a superficial layer over the
> language proper, and it ought to be possible to just switch from one
> syntax style to another, but that hasn't really happened. (It's not a
> wild idea: I've been pretty much writing C code, but with Algol-like
> syntax, for years.)
>

for many people though, the syntax is the language (many people express
their thoughts/memories/... more in syntax than in semantics).

hence, a change in appearance may also constitute a break in
understanding (and a need to internally convert code between mental
representations).

>> the rest is pedantics:
>> some people don't like ';' so they make line-breaks significant...
>> then you have all of these funky (and often subtle) issues related to
>> formatting (the parser doesn't always parse the code as intended, ...);
>> some people like keywords more (making very "wordy" languages), and
>> others prefer to avoid keywords wherever possible and have nearly
>> everything as operator glyphs;
>
> Exactly, so why not have a choice? A choice of selecting a preferred
> syntax instead of having to completely switch languages.
>

well, there are issues with this:
one would likely need a standardized AST or some kind of standardized
meta-parser system;
it could lead to an inability for people to nearly so readily copy-paste
code between projects or between programmers (this is why standardized
syntax, APIs, and coding conventions, are often so much emphasized).

as-is, I have seen people essentially get into fights over things which
the compiler doesn't give a crap about, such as "where the brace goes"
and similar...

say, a person goes and types:
"if(foo)doSomething();"

and several other people argue about how this should be written instead:
"if(foo) { doSomething(); }"
vs:
"if(foo)
{
doSomething();
}"
vs:
"if(foo) {
doSomething(); }"
vs:
...

and, then there are people who believe their particular preferred style
should be made part of the standard and/or enforced by the compiler.

then I have seen things where people have made "lint" type tools, and
then gotten into big arguments over the default rule-sets (because said
lint tool by-default prefers different style rules then their own
preferred rules), ...

ultimately, it is all pointless, but many people seem to take such
matters fairly seriously.

allowing much more variability than there is already would potentially
make such people go berserk...

it is bad enough with the optional ';' terminator in JavaScript, which
many feel should be mandatory (it is a "do not omit and let others see
ones' code" type matter), ...

(in many of my own parsers, the ';' is in-fact optional, as are most
uses of commas, as the parsers generally involve "whitespace
heuristics", but they are generally recommended).

and, I myself have gotten into big-ass arguments for sake of asserting
that a lot of stuff like this doesn't matter, or that various commonly
black-listed coding practices (such as copy-paste) are in-fact often
acceptable (or even preferable) for sake of avoiding bigger problems
(say, physical dependencies between unrelated libraries or programs).

but, in many ways, interpersonal interactions and relations are a battle
field, much as is things like "office politics", ...

or such...

Marco van de Voort

unread,

Sep 19, 2011, 4:24:38 AM9/19/11

to

On 2011-09-18, BGB <cr8...@hotmail.com> wrote:
>>>> Yeah, like assembler in the fifties and sixties....
>>>
>>> however, a difference is that at this point there is a lack of anything
>>> clearly better.
>>
>> No, that is what the ones that stuck to assembler in the fifties and sixties
>> also said :-)
>
> I am not sure of the point of all this...

Reducing retro sentiment presented as universal fact in this group.

> maybe at the time they were right, as pretty much the only other options
> were FORTRAN and COBOL (mostly in the 60s), which were not necessarily
> "clearly better" than assembler.

They were in the direction that matter, namely a better tradeoff between
programmer time and computer time.

> in any case, it is not clear at the moment that there is anything
> clearly better (who knows about 10 or 20 years from now, that is the
> future, not "right now").

Except for a few system programming jobs, C is considered very low
productivity compared to most common OOP languages. C++ is even entering the
kernel.

> BASIC-like, Python-like, or Lua-like syntax is not clearly better, and
> on many fronts I might contend that they are actually worse on various
> fronts.

BASIC is too heterogenous to say anything sensible about it. For the rest I
agree, since obviously Pascal/Wirthian is the better syntax :-)

> for the moment, C-family syntax (extended to include the likes of Java,
> C#, JavaScript, ActionScript, ... as well) seems "reasonably close" to
> optimal (the rest then is "fine-tuning").

Curly braces suck.

> the rest is pedantics:
> some people don't like ';' so they make line-breaks significant... then
> you have all of these funky (and often subtle) issues related to
> formatting (the parser doesn't always parse the code as intended, ...);
> some people like keywords more (making very "wordy" languages), and
> others prefer to avoid keywords wherever possible and have nearly
> everything as operator glyphs;

Both have reasons. Glyph operators are more math like, word based is more
normal reading style like.

Math syntax is really good in cramming a lot on a line, and for most
scientific formula's that makes sense. For programming IMHO less, unless it
is really math/scientific formula oriented.

Andy Walker

unread,

Sep 19, 2011, 6:22:11 AM9/19/11

to

On 18/09/11 23:08, BGB wrote:
> On 9/17/2011 5:09 AM, Marco van de Voort wrote:

[...]

>> No, that is what the ones that stuck to assembler in the fifties and sixties
>> also said :-)
> I am not sure of the point of all this...
> maybe at the time they were right, as pretty much the only other
> options were FORTRAN and COBOL (mostly in the 60s), which were not
> necessarily "clearly better" than assembler.

There is a danger of re-writing history. In the '50s and '60s,
there were literally hundreds of autocodes and other high-level languages.
These thrived or died out as the specific computers they were aimed at
thrived or died out -- there was no general expectation at that time that
you could take a program from one machine to another [even of the same
type, for there would be differences in storage and peripherals] and have
it work. Fortran was just one of them; it thrived primarily because IBM
thrived and threw their weight behind the project.

Algol and Cobol grew out of the need for commonality in ways of
expressing [resp] scientific and business applications. In those terms,
Algol thrived; but it never really superseded the generally much more
efficient autocodes. Cobol on the other hand did eventually manage to
squeeze out its rivals.

As for "clearly better" than assembler or not, I started writing
programs in the '60s, so have only a folk-lore view of the '50s. But by
my time no-one in any of the environments I was familiar with wrote in
assembler for any serious project [it came back some years later when
mini-computers and, later, micros arrived, and were too small/slow for
many purposes using the high-level languages available]. I imagine that
there remained niche markets [eg control] where assembler programs were
still de rigeur.

However, it was *also* the case that every language I used in
those days allowed assembler inserts, used either for efficiency at
critical points of the code or for operations [such as magnetic tape
manipulation] that had no counterpart in Algol [or whatever]. There
was usually [always?] a primitive interface back to the higher level,
eg so that you could use named variables. IOW, there was no need ever
to write "pure" assembler programs; it was virtually always "clearly
better" to use the high-level language. Note that an assembler [or
"machine code"] insert is quite different from the modern practice of
libraries of functions, some of which may have been been written in
assembler. The "insert" was right there in your program, and the
assembler syntax was part of the defined syntax of the HLL.

--
Andy Walker,
Nottingham.

BartC

unread,

Sep 19, 2011, 8:27:02 AM9/19/11

to

"BGB" <cr8...@hotmail.com> wrote in message

news:j565bh$amh$1...@news.albasani.net...

>> No? It's certainly more informal and therefore more friendly.
>>
>
> BASIC code often ends up requiring explicit line-continuation characters;

C uses line-continuation too.

> Python's indentation-based syntax is a source of many problems and much
> controversy; Lua has lots of keywords and generally looks funky;

(Is 'funky' good or bad?)

>
> also, many other attempts at "soft line-break" languages have ended up
> creating lots of awkward ambiguities, and often require funky rules so
> that the parser doesn't get confused.

For years I used this rule for line-breaks: "End-of-line is converted to a
semicolon, unless preceded by a comma or a "\" line continuation".
(Although the syntax has to be tolerant of superfluous semicolons.)

This has worked well so far: even though my syntax is semicolon-separated,
you can look at a thousand lines of code and have trouble finding even one!
In fact I'm thinking of adding tokens like "(" and "[" to the comma, where
further input is obviously expected.

>> if
>> you ask anyone to write an algorithm in pseudo-code, the chances are
>> they will write something that looks like Basic, Algol, Pascal, Lua,
>> Python ... in fact pretty much anything except C!

> my own pseudo-code tends to look mostly C-like (actually, my own
> pseudo-code tends to mostly resemble something Java-like, but sometimes
> with features which wouldn't map well to real languages, such as
> conditionals or procedural logic in structures, ...).

That's fair enough, but bear in mind that pseudo-code often has to be impart
an idea to someone without knowledge of the idiosyncrasies of your syntax.

(Pseudo-code also tends to use higher-level expressions than might be
available in an actual language, although the higher level the language, the
closer it will map.)

> as-is, my own language uses:
> ifdef(...) { ... }
> and:
> ifndef(...) { ... }

That's good, at least the syntax is consistent!

>> However, syntax is just syntax, it's merely a superficial layer over the
>> language proper

> for many people though, the syntax is the language (many people express

> their thoughts/memories/... more in syntax than in semantics).

I agree there might be issues in putting the concept across.

>>> the rest is pedantics:
>>> some people don't like ';' so they make line-breaks significant...
>>> then you have all of these funky (and often subtle) issues related to
>>> formatting (the parser doesn't always parse the code as intended, ...);
>>> some people like keywords more (making very "wordy" languages), and
>>> others prefer to avoid keywords wherever possible and have nearly
>>> everything as operator glyphs;
>>
>> Exactly, so why not have a choice? A choice of selecting a preferred
>> syntax instead of having to completely switch languages.
>>
>
> well, there are issues with this:
> one would likely need a standardized AST or some kind of standardized
> meta-parser system;
> it could lead to an inability for people to nearly so readily copy-paste
> code between projects or between programmers (this is why standardized
> syntax, APIs, and coding conventions, are often so much emphasized).

I had in mind some switch on an IDE that would instantly convert from one
form to another. Possibly copy-and-paste might be a problem when dealing
with partial code-fragments (which can't be converted).

I can see further problems also as I prefer case-insensitive syntax, C-style
(and quite a few others) are case-sensitive, and converting between the two
is not trivial.

> as-is, I have seen people essentially get into fights over things which
> the compiler doesn't give a crap about, such as "where the brace goes" and
> similar...

Don't IDEs already have some way of reformatting code according to one style
or another?

> (in many of my own parsers, the ';' is in-fact optional, as are most uses
> of commas, as the parsers generally involve "whitespace heuristics", but
> they are generally recommended).

If you get rid of commas, some parsing possibilities disappear, for example
I can write "25 cm" (a constant length), which is otherwise parsed as "25,
cm".

--
Bartc

BGB

unread,

Sep 19, 2011, 10:17:05 AM9/19/11

to

On 9/19/2011 1:24 AM, Marco van de Voort wrote:
> On 2011-09-18, BGB<cr8...@hotmail.com> wrote:
>>>>> Yeah, like assembler in the fifties and sixties....
>>>>
>>>> however, a difference is that at this point there is a lack of anything
>>>> clearly better.
>>>
>>> No, that is what the ones that stuck to assembler in the fifties and sixties
>>> also said :-)
>>
>> I am not sure of the point of all this...
>
> Reducing retro sentiment presented as universal fact in this group.
>

ok.

>> maybe at the time they were right, as pretty much the only other options
>> were FORTRAN and COBOL (mostly in the 60s), which were not necessarily
>> "clearly better" than assembler.
>
> They were in the direction that matter, namely a better tradeoff between
> programmer time and computer time.
>
>> in any case, it is not clear at the moment that there is anything
>> clearly better (who knows about 10 or 20 years from now, that is the
>> future, not "right now").
>
> Except for a few system programming jobs, C is considered very low
> productivity compared to most common OOP languages. C++ is even entering the
> kernel.
>

except that the topic was "C family syntax", not C itself...

languages like C++, Java, C#, ... are all related in terms of having a
similar-looking syntax (although Java and C# have varied in a
fundamental way from C and C++ regarding how their parsers work).

>> BASIC-like, Python-like, or Lua-like syntax is not clearly better, and
>> on many fronts I might contend that they are actually worse on various
>> fronts.
>
> BASIC is too heterogenous to say anything sensible about it. For the rest I
> agree, since obviously Pascal/Wirthian is the better syntax :-)
>

I disagree, I am not so fond of Wirthian syntax...

>> for the moment, C-family syntax (extended to include the likes of Java,
>> C#, JavaScript, ActionScript, ... as well) seems "reasonably close" to
>> optimal (the rest then is "fine-tuning").
>
> Curly braces suck.
>

they are easier to type and more compact than begin/end though...

next up someone claims begin/nigeb to be the ideal replacement for {/}.

>> the rest is pedantics:
>> some people don't like ';' so they make line-breaks significant... then
>> you have all of these funky (and often subtle) issues related to
>> formatting (the parser doesn't always parse the code as intended, ...);
>> some people like keywords more (making very "wordy" languages), and
>> others prefer to avoid keywords wherever possible and have nearly
>> everything as operator glyphs;
>
> Both have reasons. Glyph operators are more math like, word based is more
> normal reading style like.
>
> Math syntax is really good in cramming a lot on a line, and for most
> scientific formula's that makes sense. For programming IMHO less, unless it
> is really math/scientific formula oriented.

better IMO is to have a tradeoff.
most C-family syntax styles have such a tradeoff...

I had before experimented with more concise syntax styles, but managed
to create a syntax with an amusing drawback: I had a hard time
remembering it and thus was often unable to read/write code later.

OTOH, one could write things like:
t=\x\y x
f=\x\y y
{t t f}

the language mostly ended up being used as a calculator for a while...

BGB

unread,

Sep 19, 2011, 10:24:35 AM9/19/11

to

On 9/19/2011 3:22 AM, Andy Walker wrote:
> On 18/09/11 23:08, BGB wrote:
>> On 9/17/2011 5:09 AM, Marco van de Voort wrote:
> [...]
>>> No, that is what the ones that stuck to assembler in the fifties and
>>> sixties
>>> also said :-)
>> I am not sure of the point of all this...
>> maybe at the time they were right, as pretty much the only other
>> options were FORTRAN and COBOL (mostly in the 60s), which were not
>> necessarily "clearly better" than assembler.
>
> There is a danger of re-writing history.

well, I could have added the word "real" before "options", but whatever.
I guess Lisp existed in the 60s, and its time in the sun mostly came and
went.

I am not really that huge into all the history though...

> In the '50s and '60s,
> there were literally hundreds of autocodes and other high-level languages.
> These thrived or died out as the specific computers they were aimed at
> thrived or died out -- there was no general expectation at that time that
> you could take a program from one machine to another [even of the same
> type, for there would be differences in storage and peripherals] and have
> it work. Fortran was just one of them; it thrived primarily because IBM
> thrived and threw their weight behind the project.
>

AFAIK, also common at the time were HLLs that were essentially
macro-assemblers by modern standards.

> Algol and Cobol grew out of the need for commonality in ways of
> expressing [resp] scientific and business applications. In those terms,
> Algol thrived; but it never really superseded the generally much more
> efficient autocodes. Cobol on the other hand did eventually manage to
> squeeze out its rivals.
>

fair enough.

> As for "clearly better" than assembler or not, I started writing
> programs in the '60s, so have only a folk-lore view of the '50s. But by
> my time no-one in any of the environments I was familiar with wrote in
> assembler for any serious project [it came back some years later when
> mini-computers and, later, micros arrived, and were too small/slow for
> many purposes using the high-level languages available]. I imagine that
> there remained niche markets [eg control] where assembler programs were
> still de rigeur.
>

I was born in the 80s and got into computers in the 90s.
2000 to present I have written much code, but still have sadly not
achieved much "of significance".

AFAIK, many programs back in the 80s were written in ASM, with most 90s
style code in C or C++.

one would have presumed that earlier decades were more of the same,
essentially ASM all the way back until the early days of lights and
switches or similar, with HLLs like C/Lisp/... being used "here and
there" as more compact and portable ASM alternatives.

IIRC, I had read somewhere before that at the time FORTRAN was a large
achievement, like people previously saying it couldn't be done, ...

> However, it was *also* the case that every language I used in
> those days allowed assembler inserts, used either for efficiency at
> critical points of the code or for operations [such as magnetic tape
> manipulation] that had no counterpart in Algol [or whatever]. There
> was usually [always?] a primitive interface back to the higher level,
> eg so that you could use named variables. IOW, there was no need ever
> to write "pure" assembler programs; it was virtually always "clearly
> better" to use the high-level language. Note that an assembler [or
> "machine code"] insert is quite different from the modern practice of
> libraries of functions, some of which may have been been written in
> assembler. The "insert" was right there in your program, and the
> assembler syntax was part of the defined syntax of the HLL.
>

fair enough...

I guess it differs some from many modern HLLs (such as Java) where
interfacing with much of anything besides themselves (say, calling
to/from C land) is made into an exercise in pain.

a big issue in my language efforts is that calling to/from C, and
sharing code/data/... between the HLL and C, is something I try to make
reasonably painless.

the next major goal would be to try to do likewise for C++ level
interfacing, where one can reasonably easily share classes/methods/...
between C++ and the scripting language (and mutually overload classes,
...). however, this would be a good deal technically more complex at the
moment (essentially, trying to do this would open up a whole new level
of pain and complexity).

granted, one can still interface between them at the C level (say, via
using structs and function pointers in place of classes), even if this
is far from "ideal" as far as many people are concerned.

but, oh well...

its major merit is that it can be loaded in source-form from scripts and
one can use eval with it and similar...

or such...

BGB

unread,

Sep 19, 2011, 6:54:54 PM9/19/11

to

On 9/19/2011 5:27 AM, BartC wrote:
> "BGB" <cr8...@hotmail.com> wrote in message
> news:j565bh$amh$1...@news.albasani.net...
>
>>> No? It's certainly more informal and therefore more friendly.
>>>
>>
>> BASIC code often ends up requiring explicit line-continuation characters;
>
> C uses line-continuation too.
>

but, generally only as part of the preprocessor, which is generally
regarded as separate from the main language proper.

>> Python's indentation-based syntax is a source of many problems and
>> much controversy; Lua has lots of keywords and generally looks funky;
>
> (Is 'funky' good or bad?)
>

generally bad.

people prefer things which are conventional/familiar/...

Lua has a strange syntax and semantics (unified objects/arrays, for
example), which can be off-putting (although it has gained some
popularity among game modders for whatever reason).

>>
>> also, many other attempts at "soft line-break" languages have ended up
>> creating lots of awkward ambiguities, and often require funky rules so
>> that the parser doesn't get confused.
>
> For years I used this rule for line-breaks: "End-of-line is converted to
> a semicolon, unless preceded by a comma or a "\" line continuation".
> (Although the syntax has to be tolerant of superfluous semicolons.)
>
> This has worked well so far: even though my syntax is
> semicolon-separated, you can look at a thousand lines of code and have
> trouble finding even one! In fact I'm thinking of adding tokens like "("
> and "[" to the comma, where further input is obviously expected.
>

my assembler is naturally linebreak-based, but allows using a semicolon
to place multiple opcodes on a line (whitespace sensitive, it may also
be interpreted as a comment depending on where the whitespace is at).

>>> if
>>> you ask anyone to write an algorithm in pseudo-code, the chances are
>>> they will write something that looks like Basic, Algol, Pascal, Lua,
>>> Python ... in fact pretty much anything except C!
>
>> my own pseudo-code tends to look mostly C-like (actually, my own
>> pseudo-code tends to mostly resemble something Java-like, but
>> sometimes with features which wouldn't map well to real languages,
>> such as conditionals or procedural logic in structures, ...).
>
> That's fair enough, but bear in mind that pseudo-code often has to be
> impart an idea to someone without knowledge of the idiosyncrasies of
> your syntax.
>

except that C-family syntax is so common now that it is nearly
universally understood.

> (Pseudo-code also tends to use higher-level expressions than might be
> available in an actual language, although the higher level the language,
> the closer it will map.)
>

yep.

>> as-is, my own language uses:
>> ifdef(...) { ... }
>> and:
>> ifndef(...) { ... }
>
> That's good, at least the syntax is consistent!
>

well, it is also because ifdef/ifndef are handled much later than the
C-style ifdef (in my VM, they are handled via special conditional
function call opcodes, where if the condition is false, then the special
call operation is no-op and the associated code blocks may be omitted).

> >> However, syntax is just syntax, it's merely a superficial layer over
> the
>>> language proper
>
>> for many people though, the syntax is the language (many people
>> express their thoughts/memories/... more in syntax than in semantics).
>
> I agree there might be issues in putting the concept across.
>

yep.

>>>> the rest is pedantics:
>>>> some people don't like ';' so they make line-breaks significant...
>>>> then you have all of these funky (and often subtle) issues related to
>>>> formatting (the parser doesn't always parse the code as intended, ...);
>>>> some people like keywords more (making very "wordy" languages), and
>>>> others prefer to avoid keywords wherever possible and have nearly
>>>> everything as operator glyphs;
>>>
>>> Exactly, so why not have a choice? A choice of selecting a preferred
>>> syntax instead of having to completely switch languages.
>>>
>>
>> well, there are issues with this:
>> one would likely need a standardized AST or some kind of standardized
>> meta-parser system;
>> it could lead to an inability for people to nearly so readily
>> copy-paste code between projects or between programmers (this is why
>> standardized syntax, APIs, and coding conventions, are often so much
>> emphasized).
>
> I had in mind some switch on an IDE that would instantly convert from
> one form to another. Possibly copy-and-paste might be a problem when
> dealing with partial code-fragments (which can't be converted).
>

possible, but this could risk punishing people who would prefer just use
plain text editors.

> I can see further problems also as I prefer case-insensitive syntax,
> C-style (and quite a few others) are case-sensitive, and converting
> between the two is not trivial.
>

potentially. I prefer case-sensitive, and many of my languages are
case-sensitive (OTOH, my assembler is case-sensitive for
label/function/variable names, but not for register names or nmonics).

>> as-is, I have seen people essentially get into fights over things
>> which the compiler doesn't give a crap about, such as "where the brace
>> goes" and similar...
>
> Don't IDEs already have some way of reformatting code according to one
> style or another?
>

dunno...
people still fight over stuff like this, and many people still prefer
plain text editors over that of full IDEs.

>> (in many of my own parsers, the ';' is in-fact optional, as are most
>> uses of commas, as the parsers generally involve "whitespace
>> heuristics", but they are generally recommended).
>
> If you get rid of commas, some parsing possibilities disappear, for
> example I can write "25 cm" (a constant length), which is otherwise
> parsed as "25, cm".
>

in cases where suffixes are allowed, no whitespace is allowed before the
suffix.

say, "2+3i" vs "2+3 i".

for example:
#[16 -80 30]
is the same as:
#[16, -80, 30]

with the parser noting the asymmetric whitespace around '-' as
indicating a negative number rather than a subtraction.

a lot of this means that stuff which handles code fragments before the
parser generally has to take care to preserve significant whitespace.

in general it has worked fairly well though...

Fritz Wuehler

unread,

Sep 21, 2011, 8:30:36 AM9/21/11

to

Andy Walker <ne...@cuboid.co.uk> wrote:

> On 18/09/11 23:08, BGB wrote:
> > On 9/17/2011 5:09 AM, Marco van de Voort wrote:
> [...]
> >> No, that is what the ones that stuck to assembler in the fifties and sixties
> >> also said :-)
> > I am not sure of the point of all this...
> > maybe at the time they were right, as pretty much the only other
> > options were FORTRAN and COBOL (mostly in the 60s), which were not
> > necessarily "clearly better" than assembler.

Didn't see most of this thread but it sounds interesting.

PL/I was also around in the 60s and got a good start early. FORTRAN and
COBOL were and are better than assembler if you used them as they were
intended. Assembler is better if used as intended. There really isn't any
overlap. No sane person would write numerical code in assembler and no sane
person would write financial transactions or accounting reports in
assembler, at least not since the days of COBOL. OTOH no sane person would
try to write systems software in FORTRAN or COBOL, at least on IBM that
simply isn't possible.

> Algol and Cobol grew out of the need for commonality in ways of
> expressing [resp] scientific and business applications. In those terms,
> Algol thrived; but it never really superseded the generally much more
> efficient autocodes. Cobol on the other hand did eventually manage to
> squeeze out its rivals.

I am not sure ALGOL grew out of any need and I don't know that it was ever
used in commercial production. I thought the point of ALGOL was as a reality
check against BNF, that you could design a grammar on paper and then make a
real language out of it. ALGOL68 was pretty usable but on IBM not nearly as
usable as PL/I. Most of that was not inherent in the language but in how
much money IBM spent producing libraries and working on the COBOL and PL/I
compilers to interface and exploit their OS features. IBM never had an
ALGOL compiler and the ALGOL compilers that did run on IBM systems didn't
have the money behind them to make them commercially viable. They didn't
have the library support and interfaces to all of the IBM facilities that
IBM's languages did.

COBOL didn't really have any rivals, not then and not now. It's as special
purpose a language as FORTRAN is. Now 50 years later they've both been
stretched pretty far from their original framework. FORTRAN seems to be
tolerating the changes better than COBOL.

> As for "clearly better" than assembler or not, I started writing
> programs in the '60s, so have only a folk-lore view of the '50s. But by
> my time no-one in any of the environments I was familiar with wrote in
> assembler for any serious project [it came back some years later when
> mini-computers and, later, micros arrived, and were too small/slow for
> many purposes using the high-level languages available]. I imagine that
> there remained niche markets [eg control] where assembler programs were
> still de rigeur.

It depends where you worked and on what platform. Assembler was used in the
early days even for things COBOL would have been a far better choice, simply
because there was a pool of ex-autocoder and assembler programmers. I have
personally seen and worked on many hundreds of thousands of lines of
commercial (business) applications written in assembler. Sad but true. Of
course all the systems software was and is still written in assembler for
IBM. There wasn't ever a choice.

> However, it was *also* the case that every language I used in
> those days allowed assembler inserts, used either for efficiency at
> critical points of the code or for operations [such as magnetic tape
> manipulation] that had no counterpart in Algol [or whatever]. There
> was usually [always?] a primitive interface back to the higher level,
> eg so that you could use named variables. IOW, there was no need ever
> to write "pure" assembler programs;

I seem to remember you could indeed backspace a tape with ALGOL.

There was a need for pure assembler code on IBM that continues until today,
on the platform(s) you worked on apparently that wasn't true. What machines
and OS are you referring to?

> it was virtually always "clearly better" to use the high-level language.
> Note that an assembler [or "machine code"] insert is quite different from
> the modern practice of libraries of functions, some of which may have been
> been written in assembler.

That is not necessarily a "modern practice." IBM has been doing that since
the days of FORTRAN, possibly earlier. That was 1957.

> The "insert" was ight there in your program, and the assembler syntax was

> part of the defined syntax of the HLL.

That is not true in the only case I'm aware of which is a very modern
addition to C language on IBM allowing inline assembler. The syntax isn't
defined as part of the HLL, it's pure assembler. But the C compiler does
break up long lines and insert continuations as needed for the assembler.

Rod Pemberton

unread,

Sep 21, 2011, 5:44:38 PM9/21/11

to

"James Harris" <james.h...@googlemail.com> wrote in message

news:c76208e5-8ccf-40b7...@l4g2000vbv.googlegroups.com...

On Sep 16, 12:39 am, "BartC" <b...@freeuk.com> wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message
>
> news:5b4f52cc-671d-42f3...@h11g2000vbc.googlegroups.com...
>
> > On Sep 15, 10:47 am, "BartC" <b...@freeuk.com> wrote:
> >> with the Basic still one of the clearest in my opinion.
>
> > > I fully agree with what you say and I too think that Basic is just so
> > > user-friendly that it's hard to beat or even come close to.
>
> > Especially with it's string handling, which was ahead of it's time. It
> > took me a while to come up with something where string handling
> > was equally effortless for the programmer.
>
> The Basic I used first had left$, mid$ and right$ which, IMHO, are
> *awful*. What did you have in mind?

Do you really think that James? ...

While BartC needs to lookup how to read C declarations, he has this one
correct. BASIC has the most elegant string operators compared to other
language, even C. Here are some of my past comments on the usefulness of
those:

"Heh! You just reminded me of BASIC. BASIC had/has three substring
operators, i.e., LEFT$, RIGHT$, MID$, and one operator for string
concatenation, +. Even after years of C programming with the immensely
useful and powerful C string functions, I'm always amazed by how much those
four operations could do in BASIC. Unfortunately, C needs a function to do
string concatenation, i.e., strcat()..."
http://groups.google.com/group/comp.lang.misc/msg/0832c639a455ead1

"About the only useful aspect [of BASIC] was the
simple string functionality: mid$, left$, right$, and + for concatenation."
http://groups.google.com/group/alt.lang.asm/msg/073d81892ec6c9e5

"BASIC had the easiest method of string manipulation. This is the thing I
remember the most about BASIC. It was so easy, that I usually create the
MID$,LEFT$,RIGHT$, and a concatenation function when I'm programming in
languages other than C. C has less easy to use duplicate funtionality."
http://groups.google.com/group/comp.lang.c/msg/430927e5b2189ab9

"It took many years, and a language review, before I realized that my fond
memories about the strength of BASIC were misplaced. It was more a
representation of my strength or skill than the language's. Although,
BASIC does have a few useful language constructs for string processing
which are still useful. In fact, I implemented a left$, right$, mid$ for
another language that was lacking in string processing functionality.
Unfortunately, native language syntax for concatentation wasn't available.
For C, I usually just put up with the slightly extra work of using C's
string functions since they are more compatible with each other than with
an implementation of left$, right$, mid$."
http://groups.google.com/group/comp.lang.misc/msg/ba2d6459a470407e

Rod Pemberton

Rod Pemberton

unread,

Sep 21, 2011, 5:46:04 PM9/21/11

to

"BartC" <b...@freeuk.com> wrote in message news:j560br$7iq$1...@dont-email.me...

>
> (Understandly, C-family programmers will defend their syntax to the death,
> even to the extent of pretending that there's nothing really wrong with

> C's type-declaration syntax, a format so convoluted and impossible, [...]
>

Are you joking? Look up "right-left rule" and "c declaration" sometime.

Or, go here:
http://sites.google.com/site/sanselgroup/c/complex-declarations

> [...] C's

> type-declaration syntax, a format so convoluted and impossible,

> necessary to employ third-party utilities to disentangle their meaning!)

No one uses 'cdecl' to "disentangle" the meaning ... If your C declarations
are that complicated, you've done something wrong.

> Exactly, so why not have a choice? A choice of selecting a preferred
> syntax instead of having to completely switch languages.
>

Why have a choice? If one syntax suffices to express what's necessary, why
do you need another? If English is just as effective as Spanish or French,
why do I need Spanish and French?

Rod Pemberton

Rod Pemberton

unread,

Sep 21, 2011, 6:10:28 PM9/21/11

to

"Marco van de Voort" <mar...@turtle.stack.nl> wrote in message
news:slrnj7dv26....@turtle.stack.nl...
>
> Curly braces suck.
>

Suck? Suck the most? Or, suck, but suck much less than the other choices?

Lets start with the two most widespread languages: C and Forth.

C - concisely uses the following for all blocks
{ }

Forth - has no syntax so a numerous variety Forth words are used to delimit
blocks
BEGIN AGAIN
BEGIN UNTIL
BEGIN WHILE REPEAT
IF ELSE THEN
DO LOOP
: ;
etc.

I had to look these up. It's either been too long since I coded them, or I
don't program in them.

Pascal -
begin end

Fortran -
do enddo

Cobol - word delimited
PERFORM VARYING
EVALUATE WHEN
etc.

Lisp -
( )

Lua - similar to Forth, delimited by various words ...
function end
for do end
if elseif else end

Perl -
{ }

AWK -
{ }

Java -
{ }

Javascript -
{ }

It seems the curly braces might be winning, and where they aren't parens are
... ;)

Rod Pemberton

Andy Walker

unread,

Sep 21, 2011, 8:00:03 PM9/21/11

to

On 19/09/11 15:24, BGB wrote:
> one would have presumed that earlier decades were more of the same,
> essentially ASM all the way back until the early days of lights and
> switches or similar, with HLLs like C/Lisp/... being used "here and
> there" as more compact and portable ASM alternatives.

Computing really had three separate, though linked, starts.
Up to the '70s, you're talking about machines the size of buildings,
owned by universities, government or big corporations, costing around
1000 person-years of salary, and with staffs of perhaps 70 to run them.
Then came minicomputers, fitting into rooms, owned by departments,
costing 5 years salary, run by the most savvy person in the department.
Finally came micros, fitting on desks, used by individuals, costing
[now] a day or two of salary, mostly run by novices.

In each strand, there was a development process, as machines
improved in terms of speed and storage. Initially, there were no, or
very limited, OS's and HLLs. These come into real use when you get
up to 32K bytes [or even an "elephantine 64K"]. So assemblers and
machine code get a boost with each new strand. But the old strands
continued, at least for a while. So each strand has the sort of
development you describe, but the overall picture of the state of
CS is more complicated.

Portability wasn't really an issue in the early days. It
was talked about, but it was largely pie-in-the-sky stuff. There
were two different sorts of portability. One was taking your paper
tape to a different computer and expecting it to work -- unlikely.
The other was expecting your programs to continue to work when the
computer was upgraded or replaced; and here you were on better
ground, as the entire university [or whatever] had the same problem,
and we're talking huge amounts of money, so it was worthwhile for
the manufacturer to provide a compiler for the old language on the
new machine.

--
Andy Walker,
Nottingham.

BartC

unread,

Sep 21, 2011, 8:33:08 PM9/21/11

to

"Rod Pemberton" <do_no...@noavailemail.cmm> wrote in message
news:j5dlue$qfj$1...@speranza.aioe.org...

> "BartC" <b...@freeuk.com> wrote in message
> news:j560br$7iq$1...@dont-email.me...

>> [...] C's
>> type-declaration syntax, a format so convoluted and impossible,
>> necessary to employ third-party utilities to disentangle their meaning!)
>
> No one uses 'cdecl' to "disentangle" the meaning ... If your C
> declarations
> are that complicated, you've done something wrong.

So what's it for then?

When I write type-declarations, they read left-to-right. Made a little more
verbose, they resemble the English output of cdecl!

It shouldn't be necessary to employ an algorithm to understand a
declaration. And here's one example I actually had to use (and now have to
keep in a locked drawer in case I lose it):

(*(void (*(*pcptr))))();

>> Exactly, so why not have a choice? A choice of selecting a preferred
>> syntax instead of having to completely switch languages.
>>
>
> Why have a choice? If one syntax suffices to express what's necessary,
> why
> do you need another? If English is just as effective as Spanish or
> French,
> why do I need Spanish and French?

But we do have English, Spanish and French! English hasn't taken over the
world yet.

And there are tools to convert from one language to another. They're not
perfect, so you'd think artificial computer languages would be much easier
to convert from one representation to another.

--
Bartc

Andy Walker

unread,

Sep 21, 2011, 8:54:27 PM9/21/11

to

On 21/09/11 13:30, Fritz Wuehler wrote:
>> Algol and Cobol grew out of the need for commonality in ways of

>> expressing [resp] scientific and business applications. [...]

> I am not sure ALGOL grew out of any need

There's a hint in the name, ALGOrithmic Language. If you can
get hold of the "Preliminary Report" on "International Algebraic
Language" [nowadays usually called Algol 58], and published in various
places under somewhat different titles, the Introduction is pretty
explicit [more so than the Algol 60 or 68 reports] about the background
and the need.

> and I don't know that it was ever
> used in commercial production. I thought the point of ALGOL was as a reality
> check against BNF, that you could design a grammar on paper and then make a
> real language out of it.

Not sure I understand your point. Algol 60 compilers were
certainly produced by most [all?] important manufacturers, and were
widely used in the scientific community, esp in Europe. I doubt
whether most users had ever heard of BNF.

[...]

> There was a need for pure assembler code on IBM that continues until today,
> on the platform(s) you worked on apparently that wasn't true. What machines
> and OS are you referring to?

Up to 1970, primarily Edsac, Atlas and KDF9. "What is an OS?"
I never used one until 1972, when the university replaced its KDF9 with
a 1906A running "George 3". Atlas [eg] had a permanently-resident
"supervisor" program which implemented "job control" [and many other
things], but it would be stretching a point to call it an OS.

Note that I wasn't saying that I never needed to use assembler;
but that the assembler used was, apart from experimentation, always simply
embedded in the program.

>> it was virtually always "clearly better" to use the high-level language.
>> Note that an assembler [or "machine code"] insert is quite different from
>> the modern practice of libraries of functions, some of which may have been
>> been written in assembler.
> That is not necessarily a "modern practice." IBM has been doing that since
> the days of FORTRAN, possibly earlier. That was 1957.

Yes, but the point was rather that these days you can't [usually
-- your example below seems to be an exception!] simply put the odd line
of assembler in your C or Java, you have to do it by [in C/Unix terms]
compiling/assembling something into a ".o" file and linking that with
the rest of your program.

>> The "insert" was right there in your program, and the assembler syntax was

>> part of the defined syntax of the HLL.
> That is not true in the only case I'm aware of which is a very modern
> addition to C language on IBM allowing inline assembler. The syntax isn't
> defined as part of the HLL, it's pure assembler. But the C compiler does
> break up long lines and insert continuations as needed for the assembler.

C wasn't around in the '60s! Presumably, if the C extension does
not define what the assembler syntax is, then you can't use C constructs
as part of that assembler? Eg, labels, variables, array/struct elements,
bit fields, etc? Else how does the C compiler know to replace variable
names by suitable assembler but not op codes?

--
Andy Walker,
Nottingham.

James Harris

unread,

Sep 22, 2011, 1:35:22 PM9/22/11

to

On Sep 21, 10:44 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>

wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message

...

> > The Basic I used first had left$, mid$ and right$ which, IMHO, are
> > *awful*. What did you have in mind?
>
> Do you really think that James? ...

Yes, I'm afraid so, though perhaps I overemphasised it. You and I may
have discussed this before. I prefer to see a string as a sequence of
characters and am really intrigued that you see it so differently.
Care for a code-off?

> ... BASIC has the most elegant string operators compared to other

> language, even C. Here are some of my past comments on the usefulness of
> those:

...

C's string handling is a bit awkward. I'm not sure it is a good
comparison. (Of course C's ethos is minimal; it doesn't have a garbage
collector or reference counts and similar, so its library approach to
string handling is appropriate and consistent.)

OK how about some code comparisons? Using no language in particular,

left$(a$, 7)
vs
a$[.. 7]

mid$(a$, 2, 3)
vs
a$[2 .. 3]

assuming this particular mid$ works that way. ISTM even worse if the
last parameter is a length but that's maybe because I think in terms
of subscripts.

So, do you dislike the second of each example? If so, why? They look
clear to me and are essentially a single notation, which I think is a
benefit.

Or, do you have some code that shows why you prefer the Basic-style
functions?

James

James Harris

unread,

Sep 22, 2011, 1:38:33 PM9/22/11

to

On Sep 17, 5:12 pm, Marco <prenom_no...@yahoo.com> wrote:
> suggestion:
> James please put the edition of books you review because they seem to be different then the current editions.

Sure. I want to database-ise the books info and will look for a way to
include edition info.

James

BartC

unread,

Sep 22, 2011, 7:35:57 PM9/22/11

to

"James Harris" <james.h...@googlemail.com> wrote in message

news:29286864-45b5-4808...@s16g2000yqc.googlegroups.com...

> OK how about some code comparisons? Using no language in particular,
>
> left$(a$, 7)
> vs
> a$[.. 7]
>
> mid$(a$, 2, 3)
> vs
> a$[2 .. 3]

> So, do you dislike the second of each example? If so, why? They look

> clear to me and are essentially a single notation, which I think is a
> benefit.

OK, you're using slicing notation.

That's fine, but how do you select, say, the last 4 characters of a string?
Or everything *except* the first character?

Don't know how Basic's right$() deals with that second example, but with
*my* left/right functions, that would be right(a,4) and right(a,-1).

And my left()/right() can also specify a substring *longer* than the
original, and can pad out with another string, by adding an extra argument.
So functionality has been easily extended using this format.

With slicing notation, I would have to write those examples as follows (for
strings, I needed a 'dot' to break apart an object normally considered a
single entity):

a.[a.len-4..a.len]
a.[2..a.len].

With a tidier way of writing the end-bound, say by writing "$", or just
omitting it, and getting rid of that ".", the examples might become:

a[$-4..]
a[2..]

Definitely shorter, but much more cryptic too. These forms also depend on
the zero-based or one-based nature of the subscripts. In fact I created
special notation for these left/right slices, and the examples become:

a[:4]
a[:-1]

But even here, it's not clear whether that last one should be a[-1:] or
a[:-1] (probably the latter, as the slice is to the right in both cases).
And when specifying a larger slice, this form seems to demand that a bounds
error should be raised.

Finally, when porting an algorithm to another language without these
features, then left()/right() functions can be more easily emulated than
dedicated syntax. So all good reasons for retaining function format!

--
Bartc

Marco van de Voort

unread,

Sep 23, 2011, 5:22:47 AM9/23/11

to

On 2011-09-21, Rod Pemberton <do_no...@noavailemail.cmm> wrote:
> "Marco van de Voort" <mar...@turtle.stack.nl> wrote in message
> news:slrnj7dv26....@turtle.stack.nl...
>>
>> Curly braces suck.
>
> Suck? Suck the most? Or, suck, but suck much less than the other choices?

suck. No additional qualifiers needed.

> Lets start with the two most widespread languages: C and Forth.

Forth widespread? Where, in Apple firmwares ? :)

> I had to look these up. It's either been too long since I coded them, or I
> don't program in them.
>
> Pascal -
> begin end

That's what I'm most familiar with. Though I like the Modula2 system more,
snce it solves dangling els problems.

> It seems the curly braces might be winning, and where they aren't parens are
> ... ;)

Not all popular things are good things. Take for instance Berlusconi :-)

So my remark was about quality, not quantity.

Bill Gunshannon

unread,

Sep 23, 2011, 9:27:45 AM9/23/11

to

In article <j5ggsk$g8k$1...@dont-email.me>,

"BartC" <b...@freeuk.com> writes:
> "James Harris" <james.h...@googlemail.com> wrote in message
> news:29286864-45b5-4808...@s16g2000yqc.googlegroups.com...
>
>> OK how about some code comparisons? Using no language in particular,
>>
>> left$(a$, 7)
>> vs
>> a$[.. 7]
>>
>> mid$(a$, 2, 3)
>> vs
>> a$[2 .. 3]
>
>> So, do you dislike the second of each example? If so, why? They look
>> clear to me and are essentially a single notation, which I think is a
>> benefit.
>
> OK, you're using slicing notation.
>
> That's fine, but how do you select, say, the last 4 characters of a string?

a$ = left$(z$,4)

> Or everything *except* the first character?

a$ = left$(z$,len(z$)-1)

>
> Don't know how Basic's right$() deals with that second example, but with

For every right$() there is (obviously) a complimentary left$().

> *my* left/right functions, that would be right(a,4) and right(a,-1).
>
> And my left()/right() can also specify a substring *longer* than the
> original,

Something longer than the original would not be a substring but a superset
of the original string. Give me an example of what you want and I will
show you how it would be done using the BASIC String Functions.

> and can pad out with another string, by adding an extra argument.
> So functionality has been easily extended using this format.

See above. Easily done with the BASIC String functions (which consist
of much more than right$(), left$() and mid$().

>
> With slicing notation, I would have to write those examples as follows (for
> strings, I needed a 'dot' to break apart an object normally considered a
> single entity):
>
> a.[a.len-4..a.len]
> a.[2..a.len].
>
> With a tidier way of writing the end-bound, say by writing "$", or just
> omitting it, and getting rid of that ".", the examples might become:
>
> a[$-4..]
> a[2..]

It is probably a matter of taste, but IMHO the BASIC examples are probably
a lot easier to understand for someone coming in from the cold. :-)

>
> Definitely shorter, but much more cryptic too. These forms also depend on
> the zero-based or one-based nature of the subscripts. In fact I created
> special notation for these left/right slices, and the examples become:
>
> a[:4]
> a[:-1]
>
> But even here, it's not clear whether that last one should be a[-1:] or
> a[:-1] (probably the latter, as the slice is to the right in both cases).
> And when specifying a larger slice, this form seems to demand that a bounds
> error should be raised.
>
> Finally, when porting an algorithm to another language without these
> features, then left()/right() functions can be more easily emulated than
> dedicated syntax. So all good reasons for retaining function format!

But they assume that all strings are arrays of characters and while true
in C, that is not a common convention as well as being one that most non-C
programmers consider a C weakness, not strength.

bill

--
Bill Gunshannon | de-moc-ra-cy (di mok' ra see) n. Three wolves
bill...@cs.scranton.edu | and a sheep voting on what's for dinner.
University of Scranton |
Scranton, Pennsylvania | #include <std.disclaimer.h>

BGB

unread,

Sep 23, 2011, 9:43:45 AM9/23/11

to

On 9/23/2011 2:22 AM, Marco van de Voort wrote:
> On 2011-09-21, Rod Pemberton<do_no...@noavailemail.cmm> wrote:
>> "Marco van de Voort"<mar...@turtle.stack.nl> wrote in message
>> news:slrnj7dv26....@turtle.stack.nl...
>>>
>>> Curly braces suck.
>>
>> Suck? Suck the most? Or, suck, but suck much less than the other choices?
>
> suck. No additional qualifiers needed.
>

IMO, there are plenty worse options:
paired keywords;
indentation;
opening and closing tags ("<foo>...</foo>" or "[foo]...[/foo]");
...

>> Lets start with the two most widespread languages: C and Forth.
>
> Forth widespread? Where, in Apple firmwares ? :)
>

IIRC, it is also used as an alternative to C in various embedded systems
(since it can perform similarly and has similar operating
characteristics, but requires a much simpler compiler).

sort of like PostScript...

>> I had to look these up. It's either been too long since I coded them, or I
>> don't program in them.
>>
>> Pascal -
>> begin end
>
> That's what I'm most familiar with. Though I like the Modula2 system more,
> snce it solves dangling els problems.
>

but, these do involve, on average, typing 4x as many characters as { and }.

>> It seems the curly braces might be winning, and where they aren't parens are
>> ... ;)
>
> Not all popular things are good things. Take for instance Berlusconi :-)
>

had to look this up...

found: a politician.

then was thinking: wow, this guy sort of resembles the mobster
stereotype (but this may be because in the US, Italians form one of two
major popular cultural/stereotypical roles: being mobsters, or running
small restaurants and talking funny, like "mamamia my pizzaria" and
similar, meanwhile using lots of hand-gestures and similar...).

prior to looking up the word, was actually more expecting to find some
sort of food product (like, say, something similar to haggis or similar,
...).

hmm...

> So my remark was about quality, not quantity.

?...

Rod Pemberton

unread,

Sep 23, 2011, 10:21:49 AM9/23/11

to

"BartC" <b...@freeuk.com> wrote in message news:j5dvoq$ukp$1...@dont-email.me...

>
> It shouldn't be necessary to employ an algorithm to understand a
> declaration. And here's one example I actually had to use (and now have to
> keep in a locked drawer in case I lose it):
>
> (*(void (*(*pcptr))))();
>

The parens seem to be incorrect to me. Do you have all the parens present
and in the correct spots? My guess is that was supposed to call a function
using a pointer to the function or maybe a pointer to a pointer to the
function, depending on the whether the parens are correct ... It may do
that through the void type conversion, depending on the type of pcptr ...
As is, it appears, without me confirming for sure, that it's an invalid type
conversion.

Function pointers are the most extreme example you should be using.
Usually, one or two typedef's will eliminate some or all of the awkward
syntax.

If it is, it likely needs some casts for ANSI C or later. E.g., something
like:

(*( (void(*)(void)) (*pcptr) ))();

In that example, 'pcptr' is a pointer to a non-function type such as an
integer or a non-void pointer, for example 'unsigned long *' or 'unsigned
long **'. void pointers are generally not here used because of a type
conversion restriction in ANSI C. Next, it is derefenced. Since the result
is also a non-function type, for example 'unsigned long' or 'unsigned long
*', it is then cast as a pointer to a function by 'void (*)(void)'. That
says the function pointer, say 'func', is the same as a function declared
'void *func(void)'. The cast for the function type can be different or even
unecessary depending on the type of pcptr. The function is then called, by
the dereference: * func () where func is the result of the pointer
dereference and function cast. Typically, for an interpreter at least, the
dereference of 'pcptr' must result in the same type as 'pcptr' which
requires a cast in C. E.g., it needs an additional cast like so:

(*( (void(*)(void)) (*((cast_type *)pcptr)) ))();

Rod Pemberton

Rod Pemberton

unread,

Sep 23, 2011, 10:30:08 AM9/23/11

to

"James Harris" <james.h...@googlemail.com> wrote in message

news:29286864-45b5-4808...@s16g2000yqc.googlegroups.com...

> On Sep 21, 10:44 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > "James Harris" <james.harri...@googlemail.com> wrote in message
...

> > > The Basic I used first had left$, mid$ and right$ which, IMHO, are
> > > *awful*. What did you have in mind?
>
> > Do you really think that James? ...
>
> Yes, I'm afraid so, though perhaps I overemphasised it. You and I may
> have discussed this before. I prefer to see a string as a sequence of
> characters

It is. A string a sequence of characters whether in BASIC or C. At least,
it is with every version of each that I've used ...

> [...] and am really intrigued that you see it so differently.

I'm not sure what you mean by that. Both C and BASIC manipulate strings of
characters.

> Care for a code-off?

That depends.

> C's string handling is a bit awkward.

I agree. I stated so in one of those quotes.

But, the set of functions in C which manipulate strings are quite powerful
and flexible. One of their advantages is you can do stuff like this:

*strrchr(line,'\n')='\0';

strrchr() searches backwards from the end-of-string for a specific
character. Obviously, strrchr() would be a subroutine in BASIC with a for
loop that cuts and compares one character at a time using mid$ until it
finds '\n'. The C function is implemented "identically".

strrchr(), in this case, searches for '\n'. strrchr() returns a pointer to
the found char or NULL. So, if found, we can dereference that pointer in
order to assign a nul '\0' character to location where it found '\n',
thereby trimming and terminating the string. Of course, '\n' must be known
to be present or you could have a NULL pointer dereference ...

*strrchr(line,'\n')='\0';
*strrchr(line,'\r')='\0';

That trims CRLF or CR or LF from the end-of-string. By, "you can do stuff
like this", I mean use pointers and char assignments with string functions.

> OK how about some code comparisons? Using no language in particular,
>
> left$(a$, 7)
> vs
> a$[.. 7]
>

a$[0 ... 7]

Zero. You probably want to explictly require arguments. At a minimum,
it'll reduce parsing issues.

> mid$(a$, 2, 3)
> vs
> a$[2 .. 3]
>

... 3 dots or 2 ?

Looks Pascal-ish to me ... E.g., replace two dots .. with a colon :
Pascal is the only language that I currently recall that allows indexed
array subsets.

From a syntax perspective, use of dots may be an issue, e.g., complicates
parsing, if your language uses them elsewhere. I.e., you may need a flag or
state variable to indicate you're inside a [ ] section.

> [...] assuming this particular mid$ works that way. ISTM even

> worse if the last parameter is a length but that's maybe because
> I think in terms of subscripts.

Your right$ would be a$[3..], but again you probably want to explicitly
require range arguments: a$[3..7]. So, in essence, both of your examples
are really mid$ without the start or end defined. And, that's basically
what they are in BASIC too, but named differently ... If you decide to
explicitly require both ends of the range, you might consider a special flag
value, e.g., -1 or -0 etc., that can't be used as a normal index value for
the maximum index.

> So, do you dislike the second of each example?

Without seeing how you do string concatenation, I can't say for sure one way
or another. The elegant feature of BASIC is two-fold. First, the
operations are simple and can basically much of what the C functions can do.
They can easily be emulated using functions or procedures in other
languages, if desired, except for maybe the use of '$' as part of the
function or procedure name. Second, the string concatentation '+' and
string assignment '=' are part of BASIC's language. String concatenation is
not part of the C language. It requires a function to implement it:
strcat(). That's part of the awkwardness.

Instead of c$=a$+b$, you have this equivalent for C:

strcpy(c,a);
strcat(c,b);

Or,

strcat(strcpy(c,a),b);

You can use assignment in C for string pointers, but that just sets the
pointer to the location of another variable. So, that requires that the
assigned from variable contain the correct result. In this case, it would
modify either 'a' or 'b' to be 'a+b' prior to assigning to 'c', which is not
what you want.

> If so, why? They look clear to me and are essentially a single notation,
> which I think is a benefit. Or, do you have some code that shows why
> you prefer the Basic-style functions?

Hopefully, I've clearly explained that above.

Rod Pemberton

BartC

unread,

Sep 23, 2011, 12:27:19 PM9/23/11

to

"Bill Gunshannon" <bill...@cs.uofs.edu> wrote in message
news:9e3fqh...@mid.individual.net...

> In article <j5ggsk$g8k$1...@dont-email.me>,
> "BartC" <b...@freeuk.com> writes:

>> That's fine, but how do you select, say, the last 4 characters of a
>> string?
>
> a$ = left$(z$,4)

That's fine (although you need right$ there!), but the discussion was about
about using function-style as opposed to indexing and slicing notation.

>
>> Or everything *except* the first character?
>
> a$ = left$(z$,len(z$)-1)

(Again, you need right$ here. And it's no longer quite as sweet as writing
right$(z$,-1) as convention for excluding rather than including characters.)

>> and can pad out with another string, by adding an extra
>> argument.
>> So functionality has been easily extended using this format.
>
> See above. Easily done with the BASIC String functions (which consist
> of much more than right$(), left$() and mid$().

Example (no longer Basic syntax):

a:="123"
println right(a,6,"*")

Output:
***123

It was just an example of something awkward with slice notation.

>> a[$-4..]
>> a[2..]
>
> It is probably a matter of taste, but IMHO the BASIC examples are probably
> a lot easier to understand for someone coming in from the cold. :-)

Exactly my point of view! And they can be more functional too.

--
Bartc

Bill Gunshannon

unread,

Sep 23, 2011, 3:11:31 PM9/23/11

to

In article <j5ic2s$qvt$1...@dont-email.me>,

"BartC" <b...@freeuk.com> writes:
> "Bill Gunshannon" <bill...@cs.uofs.edu> wrote in message
> news:9e3fqh...@mid.individual.net...
>> In article <j5ggsk$g8k$1...@dont-email.me>,
>> "BartC" <b...@freeuk.com> writes:
>
>>> That's fine, but how do you select, say, the last 4 characters of a
>>> string?
>>
>> a$ = left$(z$,4)
>
> That's fine (although you need right$ there!), but the discussion was about
> about using function-style as opposed to indexing and slicing notation.

Why? As I stated later if you have right$() you will also have left$().
If we are going to add in arbitrary restrictions how about you can only
use odd numbers for your array index examples.

>
>>
>>> Or everything *except* the first character?
>>
>> a$ = left$(z$,len(z$)-1)
>
> (Again, you need right$ here. And it's no longer quite as sweet as writing
> right$(z$,-1) as convention for excluding rather than including characters.)

See above.

>
>>> and can pad out with another string, by adding an extra
>>> argument.
>>> So functionality has been easily extended using this format.
>>
>> See above. Easily done with the BASIC String functions (which consist
>> of much more than right$(), left$() and mid$().
>
> Example (no longer Basic syntax):
>
> a:="123"
> println right(a,6,"*")
>
> Output:
> ***123
>
> It was just an example of something awkward with slice notation.
>
>>> a[$-4..]
>>> a[2..]
>>
>> It is probably a matter of taste, but IMHO the BASIC examples are probably
>> a lot easier to understand for someone coming in from the cold. :-)
>
> Exactly my point of view! And they can be more functional too.
>

James Harris

unread,

Sep 23, 2011, 3:15:36 PM9/23/11

to

On Sep 23, 3:30 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>

wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message
>
> news:29286864-45b5-4808...@s16g2000yqc.googlegroups.com...> On Sep 21, 10:44 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> > wrote:
> > > "James Harris" <james.harri...@googlemail.com> wrote in message

...

> > ... I prefer to see a string as a sequence of

> > characters
>
> It is. A string a sequence of characters whether in BASIC or C. At least,
> it is with every version of each that I've used ...
>
> > [...] and am really intrigued that you see it so differently.
>
> I'm not sure what you mean by that. Both C and BASIC manipulate strings of
> characters.

I mean if a string is a sequence of characters why not use the
language's sequence selectors. I avoided saying an array of characters
as I thought that was too limiting. For example, a string could be
appropriately expressed as an array or characters or as a list of
characters. But to be more concrete let's take the probably more often
used option and discuss a string as an array of characters. I'm not
talking about C here, just the concept in general. If a string is
considered an array of characters why not allow it to be indexed using
array notation? (Similar could be said of a list if the language
permitted arbitrary indices for accessing lists.)

Consider a loop through a string one character at a time using a Basic-
style function:

for i = 1 to len(a$)
print mid$(a$, i, 1)
next i

What I was saying was that I prefer

for i = 1 to len(a$)
print a$[i]
next i

Basically, I couldn't see the need for a separate function to extract
part of a string. The array notation seemed quite adequate to me. And
it expresses the essential operation arguably more clearly. We want
element i of the string. Period.

We alternative it saying we want the substring of the string that
starts at position i and is one character long.

Please bear in mind I'm not advocating any specific language here,
merely the notation.

Similar could be said for a substring of length greater than 1. Say we
wanted 'N' characters in each substring:

Basic-style:

for i = 1 to len(a$) - N + 1
print mid$(a$, i, N)
next i

vs

for i = 1 to len(a$) - N + 1
print a$[i : i + N - 1]
next i

(Note I've used a colon as the range operator here rather than the
dots I used before for no other reason than to emphasise that I'm not
talking about a particular language.)

Another option is to allow the string slicing to include an operator
other than a range operator. For example,

foreach i = 1 to len(a$) - N + 1
print a$(i for N)
next i

> > Care for a code-off?
>
> That depends.

Haha! On what? I was hoping to compare options with you.

<snipped interesting points but not about left, mid and right$>

> > So, do you dislike the second of each example?
>
> Without seeing how you do string concatenation, I can't say for sure one way
> or another.

A plus sign is fine. I don't mind too much what it is. I'm not talking
about a specific language.

Normal assignment is fine too as in a$ = b$ or whatever is used in
the rest of the language.

> The elegant feature of BASIC is two-fold. First, the
> operations are simple and can basically much of what the C functions can do.
> They can easily be emulated using functions or procedures in other
> languages, if desired, except for maybe the use of '$' as part of the
> function or procedure name. Second, the string concatentation '+' and
> string assignment '=' are part of BASIC's language. String concatenation is
> not part of the C language. It requires a function to implement it:
> strcat(). That's part of the awkwardness.

Sure. Let's ignore C here. I thought we were discussing the relative
merits of left, mid and right (i.e. your preference) against string
slicing (my preference). My apologies if you wanted to discuss
something else.

...

> > If so, why? They look clear to me and are essentially a single notation,
> > which I think is a benefit. Or, do you have some code that shows why
> > you prefer the Basic-style functions?
>
> Hopefully, I've clearly explained that above.

Well, I understand that as functions they can be added to other
languages (because those languages allow user-defined functions) and I
agree with that. And I can see that you are comparing a lot against C
but I don't know whether you prefer them over string slicing or why. I
was hoping you could give some examples of code showing why they are
better.

James

James Harris

unread,

Sep 23, 2011, 3:52:50 PM9/23/11

to

On Sep 23, 12:35 am, "BartC" <b...@freeuk.com> wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message

You've made the case well for both. I agree up to this point, and here
I only slightly disagree. The more-cryptic part is a matter of
opinion. Once you get used to the slicing notation it's really quite
intuitive. In fact I'd say it's much easier to read than mid$.

> These forms also depend on
> the zero-based or one-based nature of the subscripts. In fact I created
> special notation for these left/right slices, and the examples become:
>
> a[:4]
> a[:-1]
>
> But even here, it's not clear whether that last one should be a[-1:] or
> a[:-1] (probably the latter, as the slice is to the right in both cases).
> And when specifying a larger slice, this form seems to demand that a bounds
> error should be raised.

If you go down the route of some languages (where all strings can be
indexed using non-negative integers as offsets from the left; and as
negative integers as indices from the right - i.e. the last character
is -1, the one before it is -2 etc) then your two examples become

a[-4:]
a[1:]

For anyone who is not used to the syntax [-4:] means between -4 and
the end of the string (a string four characters long, i.e. the
rightmost four characters), and [1:] means from character offset 1 to
the end of the string.

NB I guess most left$ etc functions apply only to strings of
*characters*. A very good thing about slicing from a language design
point of view is it can apply to arbitrary sequences. You could slice
arrays or sequences of any type.

> Finally, when porting an algorithm to another language without these
> features, then left()/right() functions can be more easily emulated than
> dedicated syntax.

Very true.

> So all good reasons for retaining function format!

Well, one good reason. :-)

James

Rod Pemberton

unread,

Sep 23, 2011, 4:35:33 PM9/23/11

to

"James Harris" <james.h...@googlemail.com> wrote in message

news:4518caba-b99b-4427...@l4g2000vbz.googlegroups.com...

> On Sep 23, 3:30 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > "James Harris" <james.harri...@googlemail.com> wrote in message

...

> > > [...] and am really intrigued that you see it so differently.
>

> I mean if a string is a sequence of characters why not use the
> language's sequence selectors. I avoided saying an array of characters
> as I thought that was too limiting. For example, a string could be
> appropriately expressed as an array or characters or as a list of
> characters. But to be more concrete let's take the probably more often
> used option and discuss a string as an array of characters. I'm not
> talking about C here, just the concept in general. If a string is
> considered an array of characters why not allow it to be indexed using
> array notation? (Similar could be said of a list if the language
> permitted arbitrary indices for accessing lists.)

Uh, my responses to this are varying ... a couple of aspects ...

If I take the first sentence and the final question together, it seems
you're arguing that only accessing the characters of string via a function,
e.g., mid$, left$, right$, is restrictive, i.e., need array or range
operations for sequences of characters. Once you have the functionality of
mid$ and length (LEN in BASIC) and concatenation, you can do "everything".
mid$(string, offset, length) is used to select one or more characters from
anywhere in the string. E.g., mid$(str, X, 1) selects one character. left$
is mid$(str, 0, X). right$ is mid$(str, X, len(str)). So, while there is
no direct indexing via syntax, i.e., one must use mid$ etc, you have full
string indexing via the function(s).

If I take the first sentence, I'd say that that both C and BASIC do that.
BASIC's sequence selectors for strings are mid$, left$, right$. C allows
you to index strings from a specific location to the terminator. So, you
can
use indexing for sequence selection, albeit limited and primitively like
BASIC. C also has a set of functions to do various sequence selections.

> Consider a loop through a string one character at a time using
> a Basic- style function:
>
> for i = 1 to len(a$)
> print mid$(a$, i, 1)
> next i
>
> What I was saying was that I prefer
>
> for i = 1 to len(a$)
> print a$[i]
> next i

Ok. As a side, if you were to use mid$, left$, right$, you could come up a
function for just one char in order to not type ",1)" , but chr$ is in use
already.

> Basically, I couldn't see the need for a separate function
> to extract part of a string.

I think that has to do with the design of a language like BASIC. Strings
are a fundamental part of the language in BASIC. That's why there is a
concatenation operator. AIR, BASIC has limited array functionality which
doesn't support strings (?).

C doesn't have a string type. It supports chars and "arrays". C has types
and "arrays" of them. So, "strings" are built using chars and "arrays". C
uses "arrays" of bytes as the fundamental concept upon which everything in
the language is constructed or stored.

> Please bear in mind I'm not advocating any specific language
> here, merely the notation.

If it allows you to take substrings and concatenate, then it's sufficient,
but primitive.

> Similar could be said for a substring of length greater than 1.
> Say we wanted 'N' characters in each substring:
>
> Basic-style:
>
> for i = 1 to len(a$) - N + 1
> print mid$(a$, i, N)
> next i
>

What do you need the loop for?

You can print a specific substring directly if you know it's location:

i=5
N=3
print mid$(a$,i,N)

I'm assuming the -N+1 here and N-1 below are correct. You might consider
zero based arrays. It eliminates many, but not all, off-by-one errors with
indexing.

> vs
>
> for i = 1 to len(a$) - N + 1
> print a$[i : i + N - 1]
> next i

Normally, wouldn't you just set j=N-1 and use it?

j=N-1
for i = 1 to len(a$) - j
print a$[i : i + j]
next i

You should see one disadvantage by specifying start and end instead of start
(or offset) and length ...

> Another option is to allow the string slicing to include
> an operator other than a range operator. For example,
>
> foreach i = 1 to len(a$) - N + 1
> print a$(i for N)
> next i

See comments above.

> Normal assignment is fine too as in a$ = b$ or whatever
> is used in the rest of the language.

The issue here with assignment is what gets copied. Does b$ pointer get
copied into a$ pointer, i.e., both pointers are the same value and there is
only one string which is shared. Or, does string b$ contents get copied to
string a$ content location, i.e., each string is in separate space, and both
pointers are different.

> Sure. Let's ignore C here. I thought we were discussing the relative
> merits of left, mid and right (i.e. your preference) against string
> slicing (my preference). My apologies if you wanted to discuss
> something else.

If one ignores syntax choices, I think it's all been covered. It only takes
two sentences to describe.

> [...] but I don't know whether you prefer them over string slicing or why.

Well, I prefer them to C's usage, or at least the basic functionality of
them, i.e., ability to take a substring. They are few, simple, easy to use,
powerful, flexible, easy to remember, easy to build up more powerful
functions with them, and concatenation is part of the language when they are
present.

> I was hoping you could give some examples of code showing
> why they are better.

Your examples, I'd say, are equivalent and slightly shorter.

As I said above: if it allows you to take substrings and concatenate, then
it's sufficient, but primitive.

I think you mean "examples of code" _using them_ that "show why they are
better." The "examples of code showing why they are better" is usually done
not by showing how they are better. It's done by showing how other
languages are far worse. You show the complexity or awkwardness of string
operations in other languages as compared to the simplicity of BASIC. I did
an example of that for C with strcpy() and strcat() vs. BASIC. C is worse.

Rod Pemberton

Rod Pemberton

unread,

Sep 23, 2011, 4:36:10 PM9/23/11

to

"Marco van de Voort" <mar...@turtle.stack.nl> wrote in message

news:slrnj7ojv7....@turtle.stack.nl...

> On 2011-09-21, Rod Pemberton <do_no...@noavailemail.cmm> wrote:
> > "Marco van de Voort" <mar...@turtle.stack.nl> wrote in message
> > news:slrnj7dv26....@turtle.stack.nl...
> >>

> > Lets start with the two most widespread languages: C and Forth.
>
> Forth widespread? Where, in Apple firmwares ? :)
>

Everywhere. It's been implemented on more platforms than C.

Still no reason as to why "curly braces suck", so I'll take that
as just an intense personal "preference" ...

Rod Pemberton

Rod Pemberton

unread,

Sep 23, 2011, 4:41:54 PM9/23/11

to

"James Harris" <james.h...@googlemail.com> wrote in message

news:9518b81f-0e77-425d...@u19g2000vbm.googlegroups.com...

>
> If you go down the route of some languages (where all strings
> can be indexed using non-negative integers as offsets from the
> left; and as negative integers as indices from the right - i.e. the
> last character is -1, the one before it is -2 etc) then your two
> examples become
>

Which languages allow negative indexing from the right?

The one language I know of that allows negative indexing is relative to the
left ...

Rod Pemberton

BartC

unread,

Sep 23, 2011, 4:34:14 PM9/23/11

to

"James Harris" <james.h...@googlemail.com> wrote in message

news:9518b81f-0e77-425d...@u19g2000vbm.googlegroups.com...

One thing I forgot, in favour of slicing notation, is that usually this can
be used for l-values too.

I suppose left()/right() *could* be l-values, but I haven't done that with
mine, and if they were, then that would also be difficult to port to another
language.

But then, in some languages strings are immutable, so the problem never
comes up!

--
Bartc

BartC

unread,

Sep 23, 2011, 5:19:10 PM9/23/11

to

"James Harris" <james.h...@googlemail.com> wrote in message

news:4518caba-b99b-4427...@l4g2000vbz.googlegroups.com...

> I mean if a string is a sequence of characters why not use the
> language's sequence selectors. I avoided saying an array of characters
> as I thought that was too limiting. For example, a string could be
> appropriately expressed as an array or characters or as a list of
> characters.

I did at one time make a distinction between objects with multiple values
(lists, arrays), and compound objects usually considered to be a single
value (strings, records/structs).

The latter needed special indexing notation to break them apart (mentioned
on another post):

a.[i] instead of a[i]
a.[i..j] instead of a[i..j]

When a is a string, then a.[i] returns a 1-character string (not a character
scalar, as discussed one time in this group!).

But having two different kinds of indexing caused some problems. Combining
them is neater and orthogonal, but I lose some the ability to detect some
errors.

> What I was saying was that I prefer
>
> for i = 1 to len(a$)
> print a$[i]
> next i
>
> Basically, I couldn't see the need for a separate function to extract
> part of a string. The array notation seemed quite adequate to me. And
> it expresses the essential operation arguably more clearly. We want
> element i of the string. Period.

Basic indexing only worked for arrays, not strings. Hence the
special-purpose functions. When indexing *is* available, then I think few
would object to making use of it.

When selecting a substring however, 'mid$' still has a use.

> print mid$(a$, i, N)

> vs

> print a$[i : i + N - 1]

About the same. Basic has 8 tokens, slicing has 10, but a couple fewer
characters.

> (Note I've used a colon as the range operator here rather than the
> dots I used before for no other reason than to emphasise that I'm not
> talking about a particular language.)
>
> Another option is to allow the string slicing to include an operator
> other than a range operator. For example,
>
> foreach i = 1 to len(a$) - N + 1
> print a$(i for N)
> next i

Not quite sure what 'i for N' means. One problem in the slicing above is
lack of a range notation giving a start value, and length, rather than start
and end values. For example:

A .. B means A through B inclusive (perhaps A through B-1 in zero-based
language)
A : B means A through A+B-1 inclusive

(In place of just a range to select a slice, I'm also experimenting with
having a list or set value in there. That's do-able for r-values, but I
haven't figured out a way of doing it for l-values:

A := (10,20,30,40,50,60,70) # 1-based list
print A[(5,1,2,2)] # Output (50,10,20,20)
print A[5..7,3] # Output (30,50,60,70)

A[(1,1,1,1,6)] := ... # Haven't figured out what it means yet!)

> And I can see that you are comparing a lot against C
> but I don't know whether you prefer them over string slicing or why. I

C is hopeless at higher level string handling, so doesn't really enter it
it.

Python is more interesting to compare with; it uses quite advanced slicing:

a[5:20:3]

which I think means: start with the 6th character (index 5), then select
every 3rd character until the 20th is reached or passed (index 19). But the
first time you see something like a[15::-3], it can be quite puzzling!

--
Bartc

BartC

unread,

Sep 23, 2011, 5:29:04 PM9/23/11

to

"Rod Pemberton" <do_no...@noavailemail.cmm> wrote in message

news:j5iqu2$8dp$1...@speranza.aioe.org...

Python and Ruby, for a start.

An index of -3 means 3rd element from the end (ie. the 'right', if you
imagine the elements progressing from left to right).

Interestingly, both these languages switch to 1-based indexing in this case!
Otherwise index 0 is ambiguous.

--
Bartc

Jacko

unread,

Sep 23, 2011, 7:45:01 PM9/23/11

to

apart from zero not being a true number for zero indexing, I have little to say except SuperBASIC QL.

Cheers Jacko

X(A TO B,C,D TO E)

Ted Joyce

unread,

Sep 23, 2011, 11:39:55 PM9/23/11

to

Personally, I regard "string" as a sequence of anything, not just
characters. A string of characters is a "character string". A string of
dwords is... well you get it. Intel x86 assembly language has "string"
instructions, and they don't give a shit what is in the byte, word, dword
or qword string the instructions are manipulating.

So, "character string", I prefer to regard as a specific kind of the
general concept of "string", which is "a sequence of whatevers".

As for "character strings" there are as many implementations of that
fuzzy concept as there are programming languages that include that
concept. More actually. Terminated strings, like in C. Length-prefixed
strings like in Pascal. C#'s immutable strings. Etc.

(I forget if Pascal's strings were length-prefixed or length-ptr, but
just for illustration the above should be OK).

> If a string is
> considered an array of characters why not allow it to be indexed using
> array notation?

Indeed. It seems quite natural to do so, and wrong not to do so.

> (Similar could be said of a list if the language
> permitted arbitrary indices for accessing lists.)
>
> Consider a loop through a string one character at a time using a
> Basic-
> style function:
>
> for i = 1 to len(a$)
> print mid$(a$, i, 1)
> next i
>
> What I was saying was that I prefer
>
> for i = 1 to len(a$)
> print a$[i]
> next i
>
> Basically, I couldn't see the need for a separate function to extract
> part of a string. The array notation seemed quite adequate to me. And
> it expresses the essential operation arguably more clearly. We want
> element i of the string. Period.
>
> We alternative it saying we want the substring of the string that
> starts at position i and is one character long.

Yeah, that is an odd way of thinking about it. Reminds me of that
commercial where that NASCAR driver is giving someone directions and
makes him go around the block because a NASCAR driver only knows how to
turn left.

>
> Please bear in mind I'm not advocating any specific language here,
> merely the notation.
>
> Similar could be said for a substring of length greater than 1. Say we
> wanted 'N' characters in each substring:
>
> Basic-style:
>
> for i = 1 to len(a$) - N + 1
> print mid$(a$, i, N)
> next i
>
> vs
>
> for i = 1 to len(a$) - N + 1
> print a$[i : i + N - 1]
> next i
>
> (Note I've used a colon as the range operator here rather than the
> dots I used before for no other reason than to emphasise that I'm not
> talking about a particular language.)

I'm not sure I like the range plus array semantics for "substring". It
seems more orthogonal to have a function/method because it more than just
access. If the above returns a new string, even moreso.

>
> Another option is to allow the string slicing to include an operator
> other than a range operator. For example,
>
> foreach i = 1 to len(a$) - N + 1
> print a$(i for N)
> next i
>
>>> Care for a code-off?
>>
>> That depends.
>
> Haha! On what? I was hoping to compare options with you.
>
> <snipped interesting points but not about left, mid and right$>
>
>>> So, do you dislike the second of each example?
>>
>> Without seeing how you do string concatenation, I can't say for sure
>> one way
>> or another.
>
> A plus sign is fine. I don't mind too much what it is. I'm not talking
> about a specific language.

Plus-equals seems more appropriate for concatenation if by
"concatenation", you really mean "append". If a new string is returned,
then plus is OK.

Ted Joyce

unread,

Sep 23, 2011, 11:49:15 PM9/23/11

to

Rod Pemberton wrote:

> C doesn't have a string type.

Not a type, no, but it does have the concept of "character string".

> It supports chars and "arrays".

Well that is just a character array and not a "C string". You have to say
"null-terminated" somewhere along with that to have the C concept of
"string".

> C has
> types and "arrays" of them. So, "strings" are built using chars and
> "arrays".

And, that pesky little null, which of course has caused much grief.

> C uses "arrays" of bytes as the fundamental concept upon
> which everything in the language is constructed or stored.
>

It has "structs" also, don't forget, for heterogenous grouping.

James Harris

unread,

Sep 24, 2011, 7:36:32 AM9/24/11

to

On Sep 23, 9:35 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>

Yes, mid$ and a single slice can be equivalent in effect - at least on
the RHS of an assignment. As Bart pointed out a slice can also be
assigned to. That's a vastly different prospect. I'm not sure if one
would want mid$ to be assigned to.

I'll add something else in favour of slicing. One can take multiple
slices easily. In fact I can think of a good example. Say you had a
date in the form YYYY-MM-DD and wanted to do two things: check the
hyphens were present and form a copy without them. Here it is with
string slicing. For variety and to make the slicing as wordy as
possible I'll use words as slicing operators (and stick with 1-based
for purposes of comparison).

if dat$[5] == "-" and dat$[8] == "-" then
short_dat$ = dat$[1 upto 4, 6 upto 7, 9 upto 10]

Now let's do that with left$, mid$ and right$

if mid$(dat$, 5) == "-" and mid$(dat$, 8) == "-" then
short_dat$ = left$(dat$, 4) + mid$(dat$, 6, 2) + right$(dat$, 2)

My eyes start to cross as I read that second one. X-(

Here's another example, an expression to copy just the digits and
leave the three spaces out of a credit card number that appears at the
beginning of a string.

c$[1 upto 4, 6 upto 9, 11 upto 14, 16 upto 19]
left$(c$, 1, 4) + mid$(c$, 6, 4) + mid$(c$, 11, 4), mid$(c$, 16, 4)

Try reading that in your head. The second one is awful. As well as all
the function calls there's the mental issue of keeping track of where
in the string you are. E.g. mid$(c$, 6, 4) - now that takes me up to
position 6 plus 4 minus 1 equals 9 so I skip 10 and the next mid$
starts at 11. You asked if I really didn't like left$, mid$ and right
$. These examples show one reason why. To me, at least, slicing is a
more natural way to think of a string. I remember doing a lot of code
like the second when I wrote in Basic.

> If I take the first sentence, I'd say that that both C and BASIC do that.
> BASIC's sequence selectors for strings are mid$, left$, right$. C allows
> you to index strings from a specific location to the terminator. So, you
> can
> use indexing for sequence selection, albeit limited and primitively like
> BASIC. C also has a set of functions to do various sequence selections.

You keep mentioning C. I'm not sure why. C has neither left$, mid$ and
right$ nor slicing operations.

While I'm in the frame of mind here's another example using
assignment. This time I'll do the Basic-style first.

s$ = left$(s$, 5) + "zz" + mid$(s$, 8)

What does that do? It replaces two characters in s$. Now how can
anyone claim that such a language treats strings as arrays of
characters?

The slicing equivalent might be

s$[6 upto 7] = "zz"

Even C (yes, I'll join you for a moment - a brief moment - in
referring to C) allows individual characters of strings to be changed!

Seriously, obscuring the nature of strings as arrays of characters or
similar STM unnecessarily unhelpful.

> > Consider a loop through a string one character at a time using
> > a Basic- style function:
>
> > for i = 1 to len(a$)
> > print mid$(a$, i, 1)
> > next i
>
> > What I was saying was that I prefer
>
> > for i = 1 to len(a$)
> > print a$[i]
> > next i
>
> Ok. As a side, if you were to use mid$, left$, right$, you could come up a
> function for just one char in order to not type ",1)" , but chr$ is in use
> already.

Yet another named function to do what a language should provide
naturally???

I know of a couple of people famed for drawings of ludicrous machines.
One is Heath Robinson. The other is Rube Goldberg. The "solve it with
a function" approach of various Basics (and much more modern designs
such as PHP and others) is great for very simple friendly code but it
doesn't scale well. After a while such a language can become a bit
Heath Robinson or like a Rube Goldberg machine.

http://www.google.co.uk/search?q=heath%20robinson&um=1&tbm=isch
http://www.google.co.uk/search?q=rube%20goldberg&um=1&tbm=isch

> > Basically, I couldn't see the need for a separate function
> > to extract part of a string.
>
> I think that has to do with the design of a language like BASIC. Strings
> are a fundamental part of the language in BASIC. That's why there is a
> concatenation operator. AIR, BASIC has limited array functionality which
> doesn't support strings (?).
>
> C doesn't have a string type. It supports chars and "arrays". C has types
> and "arrays" of them. So, "strings" are built using chars and "arrays". C
> uses "arrays" of bytes as the fundamental concept upon which everything in
> the language is constructed or stored.

C again!

> > Please bear in mind I'm not advocating any specific language
> > here, merely the notation.
>
> If it allows you to take substrings and concatenate, then it's sufficient,
> but primitive.
>
> > Similar could be said for a substring of length greater than 1.
> > Say we wanted 'N' characters in each substring:
>
> > Basic-style:
>
> > for i = 1 to len(a$) - N + 1
> > print mid$(a$, i, N)
> > next i
>
> What do you need the loop for?

It is only a structure to allow comparison. Any code would do.

> You can print a specific substring directly if you know it's location:
>
> i=5
> N=3
> print mid$(a$,i,N)

Assume the print statement in the loop prints each character on a
separate line or something. Really, this is not code to be optimised
or improved. It's merely a piece of code to allow comparison of like
with like.

> I'm assuming the -N+1 here and N-1 below are correct. You might consider
> zero based arrays. It eliminates many, but not all, off-by-one errors with
> indexing.

I would use zero-based, yes, but kept to 1-based so that the two
examples both had the same warts!

> > vs
>
> > for i = 1 to len(a$) - N + 1
> > print a$[i : i + N - 1]
> > next i
>
> Normally, wouldn't you just set j=N-1 and use it?

I'd do something even simpler but if one example has the cumbersome
plus ones and minus ones surely the other has to have it too or the
comparison would be biased.

> j=N-1
> for i = 1 to len(a$) - j
> print a$[i : i + j]
> next i
>
> You should see one disadvantage by specifying start and end instead of start
> (or offset) and length ...
>
> > Another option is to allow the string slicing to include
> > an operator other than a range operator. For example,
>
> > foreach i = 1 to len(a$) - N + 1
> > print a$(i for N)
> > next i
>
> See comments above.
>
> > Normal assignment is fine too as in a$ = b$ or whatever
> > is used in the rest of the language.
>
> The issue here with assignment is what gets copied. Does b$ pointer get
> copied into a$ pointer, i.e., both pointers are the same value and there is
> only one string which is shared. Or, does string b$ contents get copied to
> string a$ content location, i.e., each string is in separate space, and both
> pointers are different.

A different topic.

> > Sure. Let's ignore C here. I thought we were discussing the relative
> > merits of left, mid and right (i.e. your preference) against string
> > slicing (my preference). My apologies if you wanted to discuss
> > something else.
>
> If one ignores syntax choices, I think it's all been covered. It only takes
> two sentences to describe.
>
> > [...] but I don't know whether you prefer them over string slicing or why.
>
> Well, I prefer them to C's usage, or at least the basic functionality of
> them, i.e., ability to take a substring. They are few, simple, easy to use,
> powerful, flexible, easy to remember, easy to build up more powerful
> functions with them, and concatenation is part of the language when they are
> present.

More C!

>
> > I was hoping you could give some examples of code showing
> > why they are better.
>
> Your examples, I'd say, are equivalent and slightly shorter.
>
> As I said above: if it allows you to take substrings and concatenate, then
> it's sufficient, but primitive.
>
> I think you mean "examples of code" _using them_ that "show why they are
> better." The "examples of code showing why they are better" is usually done
> not by showing how they are better. It's done by showing how other
> languages are far worse. You show the complexity or awkwardness of string
> operations in other languages as compared to the simplicity of BASIC. I did
> an example of that for C with strcpy() and strcat() vs. BASIC. C is worse.

You initially questioned whether I really disliked the functions
approach. I hoped that as an exponent of the left$, mid$ and right$
functions you would be able to produce some code to show why they are
so good in comparison to what I could show for my favoured option. To
choose to compare against something that we agree is worse (in this
respect) seems hardly relevant. :-(

James

James Harris

unread,

Sep 24, 2011, 7:40:14 AM9/24/11

to

On Sep 23, 9:41 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message

>
> news:9518b81f-0e77-425d...@u19g2000vbm.googlegroups.com...
>
>
>
> > If you go down the route of some languages (where all strings
> > can be indexed using non-negative integers as offsets from the
> > left; and as negative integers as indices from the right - i.e. the
> > last character is -1, the one before it is -2 etc) then your two
> > examples become
>
> Which languages allow negative indexing from the right?

Search for the word "negative" in

http://en.wikipedia.org/wiki/Comparison_of_programming_languages_%28string_functions%29

> The one language I know of that allows negative indexing is relative to the
> left ...

I can't see any sense in this. Which language is that?

James

Bill Gunshannon

unread,

Sep 24, 2011, 8:03:18 AM9/24/11

to

In article <j5jk2d$28a$2...@speranza.aioe.org>,

"Ted Joyce" <tpj2...@mynews.net> writes:
> Rod Pemberton wrote:
>
>> C doesn't have a string type.
>
> Not a type, no, but it does have the concept of "character string".
>
>> It supports chars and "arrays".
>
> Well that is just a character array and not a "C string". You have to say
> "null-terminated" somewhere along with that to have the C concept of
> "string".
>
>> C has
>> types and "arrays" of them. So, "strings" are built using chars and
>> "arrays".
>
> And, that pesky little null, which of course has caused much grief.
>

I suppose I am required to point out inthis group as well that C
did not invent the null termnated string. It was in common use
before C was created and Ritchie, et al. just followed industry
practice. The PDP-11, where C started had null terminate strings
as a part of its Macro Assembler (.ASCIZ to create them, .PRINT
to display them).

Null terminated strings have never caused one minute of grief.
incompetent programmers have and they usually don't need C or
null terminated strings for that.

James Harris

unread,

Sep 24, 2011, 7:57:53 AM9/24/11

to

On Sep 23, 10:19 pm, "BartC" <b...@freeuk.com> wrote:

...

> Basic indexing only worked for arrays, not strings. Hence the
> special-purpose functions. When indexing *is* available, then I think few
> would object to making use of it.
>
> When selecting a substring however, 'mid$' still has a use.

Couldn't you use a slice for that too?

> > print mid$(a$, i, N)
> > vs
> > print a$[i : i + N - 1]
>
> About the same. Basic has 8 tokens, slicing has 10, but a couple fewer
> characters.

I try to be fair!

> > (Note I've used a colon as the range operator here rather than the
> > dots I used before for no other reason than to emphasise that I'm not
> > talking about a particular language.)
>
> > Another option is to allow the string slicing to include an operator
> > other than a range operator. For example,
>
> > foreach i = 1 to len(a$) - N + 1
> > print a$(i for N)
> > next i
>
> Not quite sure what 'i for N' means.

As you guessed, Number of characters. i for N is intended to select a
string of length N.

> One problem in the slicing above is
> lack of a range notation giving a start value, and length, rather than start
> and end values. For example:
>
> A .. B means A through B inclusive (perhaps A through B-1 in zero-based
> language)

That's interesting. If using zero-based I would treat both A and B as
zero-based rather than make a distinction. Similarly, I've never got
used to languages that treat the indices as positions *between*
characters.

> A : B means A through A+B-1 inclusive

I prefer "for" as it seems more natural even though some would object
that it starts a for loop.

> (In place of just a range to select a slice, I'm also experimenting with
> having a list or set value in there. That's do-able for r-values, but I
> haven't figured out a way of doing it for l-values:
>
> A := (10,20,30,40,50,60,70) # 1-based list
> print A[(5,1,2,2)] # Output (50,10,20,20)
> print A[5..7,3] # Output (30,50,60,70)
>
> A[(1,1,1,1,6)] := ... # Haven't figured out what it means yet!)

Understood. I've been in a similar place - and haven't any answers
yet. This happens any time parameters can overlap.

> > And I can see that you are comparing a lot against C
> > but I don't know whether you prefer them over string slicing or why. I
>
> C is hopeless at higher level string handling, so doesn't really enter it
> it.

I've been trying to persuade Rod of that but he keeps comparing with
C!

> Python is more interesting to compare with; it uses quite advanced slicing:
>
> a[5:20:3]
>
> which I think means: start with the 6th character (index 5), then select
> every 3rd character until the 20th is reached or passed (index 19). But the
> first time you see something like a[15::-3], it can be quite puzzling!

Yes, it's maybe getting a bit to technological. I find similar issues
in my own ideas and have to keep bringing myself back to reality - the
technical possibilities can intoxicate a designer into going beyond
what is desirable.

James

James Harris

unread,

Sep 24, 2011, 7:59:49 AM9/24/11

to

On Sep 23, 10:29 pm, "BartC" <b...@freeuk.com> wrote:
> "Rod Pemberton" <do_not_h...@noavailemail.cmm> wrote in message
>
> news:j5iqu2$8dp$1...@speranza.aioe.org...
>
>
>
> > "James Harris" <james.harri...@googlemail.com> wrote in message

> >news:9518b81f-0e77-425d...@u19g2000vbm.googlegroups.com...
>
> >> If you go down the route of some languages (where all strings
> >> can be indexed using non-negative integers as offsets from the
> >> left; and as negative integers as indices from the right - i.e. the
> >> last character is -1, the one before it is -2 etc) then your two
> >> examples become
>
> > Which languages allow negative indexing from the right?
>
> > The one language I know of that allows negative indexing is relative to
> > the
> > left ...
>
> Python and Ruby, for a start.
>
> An index of -3 means 3rd element from the end (ie. the 'right', if you
> imagine the elements progressing from left to right).
>
> Interestingly, both these languages switch to 1-based indexing in this case!
> Otherwise index 0 is ambiguous.

I don't know if they 'switch' to anything. They just index as they do.
That's it.

James

Dmitry A. Kazakov

unread,

Sep 24, 2011, 11:01:14 AM9/24/11

to

On 24 Sep 2011 12:03:18 GMT, Bill Gunshannon wrote:

> In article <j5jk2d$28a$2...@speranza.aioe.org>,
> "Ted Joyce" <tpj2...@mynews.net> writes:

>> And, that pesky little null, which of course has caused much grief.
>
> I suppose I am required to point out inthis group as well that C
> did not invent the null termnated string. It was in common use
> before C was created and Ritchie, et al. just followed industry
> practice. The PDP-11, where C started had null terminate strings
> as a part of its Macro Assembler (.ASCIZ to create them, .PRINT
> to display them).

I don't remember I ever used null-terminated strings in MACRO-11, which I
used very extensively. I even wrote compilers in MACRO-11.

In fact PDP-11 (RSX-11M) did not use null terminated strings. File names
were Radix-50 encoded, without any ASCII-NUL, of course. Text files used
record format, where lines had length counts, no nulls, no LF/CR-mess, no
idiotic escape sequences.

--
Regards,
Dmitry A. Kazakov
http://www.dmitry-kazakov.de

Bill Gunshannon

unread,

Sep 24, 2011, 1:59:34 PM9/24/11

to

In article <taju9i13rgrq$.164im7gwtxbxa$.d...@40tude.net>,

"Dmitry A. Kazakov" <mai...@dmitry-kazakov.de> writes:
> On 24 Sep 2011 12:03:18 GMT, Bill Gunshannon wrote:
>
>> In article <j5jk2d$28a$2...@speranza.aioe.org>,
>> "Ted Joyce" <tpj2...@mynews.net> writes:
>
>>> And, that pesky little null, which of course has caused much grief.
>>
>> I suppose I am required to point out inthis group as well that C
>> did not invent the null termnated string. It was in common use
>> before C was created and Ritchie, et al. just followed industry
>> practice. The PDP-11, where C started had null terminate strings
>> as a part of its Macro Assembler (.ASCIZ to create them, .PRINT
>> to display them).
>
> I don't remember I ever used null-terminated strings in MACRO-11, which I
> used very extensively. I even wrote compilers in MACRO-11.

That's hard to believe as every text on Macro-11 I have ever seen
(and I have at least 5 in my personal library!) uses them both.

>
> In fact PDP-11 (RSX-11M) did not use null terminated strings.

RSX11M_V2 -- DEC-11-OIMRA-A-D -- MACRO-11 Reference Manual

Page 6-21 -- Para. 6.3.5 .ASCIZ Directive

> File names
> were Radix-50 encoded, without any ASCII-NUL,

Radix-50 is without any ASCII Characters at all.
" It uses a 40 character subset of ASCII to fit three characters into
a 16-bit word".

> of course. Text files used
> record format, where lines had length counts, no nulls, no LF/CR-mess, no
> idiotic escape sequences.

RMS was available, but you don't have to use it. I don't think DECUS-C
did. And Text Files are not "strings" on any system I have ever worked
on. They come in numerous colors and flavors even on a single machine.
Take Windows/DOS for example where a text file may or may not use a ^Z
to signify EOF. Two totally distinct types of plain text files and with
no nulls in either.

Dmitry A. Kazakov

unread,

Sep 24, 2011, 3:09:14 PM9/24/11

to

On 24 Sep 2011 17:59:34 GMT, Bill Gunshannon wrote:

> In article <taju9i13rgrq$.164im7gwtxbxa$.d...@40tude.net>,

>> of course. Text files used
>> record format, where lines had length counts, no nulls, no LF/CR-mess, no
>> idiotic escape sequences.
>
> RMS was available, but you don't have to use it. I don't think DECUS-C
> did.

I only marginally used C and only under VMS. The C compiler I had under
RSX-11M was so slow that it was impossible to use. It had 5 or 6 passes and
took several minutes to compile hello-world. I believe it was some Sys V
port. DEC C was the best ever C compiler I saw, but it was already under
VMS, and for VMS there was even better DEC Ada compiler.

> And Text Files are not "strings" on any system I have ever worked
> on.

They are logically sets of strings.

> They come in numerous colors and flavors even on a single machine.
> Take Windows/DOS for example where a text file may or may not use a ^Z
> to signify EOF. Two totally distinct types of plain text files and with
> no nulls in either.

It was the dark age of Unix, when the idea of "typed" files was killed.
Null-terminator and escape sequences were contributors to this, because
they were needed to uphold the flawed Unix idea that everything is a
character stream.

Rod Pemberton

unread,

Sep 24, 2011, 7:44:29 PM9/24/11

to

"Ted Joyce" <tpj2...@mynews.net> wrote in message
news:j5jk2d$28a$2...@speranza.aioe.org...

> Rod Pemberton wrote:
>
> > C doesn't have a string type.
>
> Not a type, no, but it does have the concept of "character string".
>

I more than adequately understand that. I was just reminding James that
strings are not a fundamental type in some languages. I think he wanted
them to be in his.

> > It supports chars and "arrays".
>
> Well that is just a character array and not a "C string". You have to say
> "null-terminated" somewhere along with that to have the C concept of
> "string".
>

Yes. It seems you mentioned this in 2008's "reinventing ASCII?" c.l.m.
thread. No I didn't recall that either.

Pedantically, NUL is a character in ASCII and EBCDIC.

More pedantically, the terminator in C is not a character at all but a a
byte, which C defines differently than common usage of byte being 8-bits and
which can be larger in size than a character in C. The language of the ANSI
and ISO C specifications confuses people regarding the terminator being a
character or a byte. The spec's imply it's a character. It's a byte - as C
defines a byte. I believe this is a holdover from C on the PDP-11 which was
word addressed, and so used a string terminator larger in size than the
character size. Use of a terminator for strings was a holdover from B.
Some quotes:

"Strings [in B] are double-quoted, and have zero or more characters; they
are left justified and terminated by an explicit delimiter, '*e'."
http://cm.bell-labs.com/cm/cs/who/dmr/btut.html

" ... strings are terminated by a special character, which B spelled `*e'.
This change was made partially to avoid the limitation on the length of a
string caused by holding the count in an 8- or 9-bit slot, and partly
because maintaining the count seemed, in our experience, less convenient
than using a terminator."
http://cm.bell-labs.com/cm/cs/who/dmr/chist.html

"C treats strings as arrays of characters conventionally terminated by a
marker. Aside from one special rule about initialization by string literals,
the semantics of strings are fully subsumed by more general rules governing
all arrays, and as a result the language is simpler to describe and to
translate than one incorporating the string as a unique data type."
http://cm.bell-labs.com/cm/cs/who/dmr/chist.html

"A minor difference was that the unit of I/O was the word, not the byte,
because the PDP-7 was a word-addressed machine. In practice this meant
merely that all programs dealing with character streams ignored null
characters, because null was used to pad a file to an even number of
characters."
http://cm.bell-labs.com/cm/cs/who/dmr/hist.html

Also, this describes the C machine model:
http://cm.bell-labs.com/cm/cs/who/dmr/portpap.html

> > C has
> > types and "arrays" of them. So, "strings" are built using chars and
> > "arrays".
>
> And, that pesky little null, which of course has caused much grief.
>

Having programmed in a number of non-terminated languages, e.g., assembly
with ASCIN (ASCII negated) and PL/I with counted strings, I can tell you
that null terminated strings have fewer issues. Just make sure they're
terminated.

> > C uses "arrays" of bytes as the fundamental concept upon
> > which everything in the language is constructed or stored.
> >
>
> It has "structs" also, don't forget, for heterogenous grouping.
>

It has structs, without quotes ...

I put arrays in quotes for a reason: C doesn't actually have them either. I
don't want to divert James' thread by explaining this. The five or six
times I've brought the "no arrays in C issue" up, it's led to lengthy and
hostile discussions in various NGs. You can lookup my past posts on this
via Google Groups advanced search, if need be.

Rod Pemberton

Rod Pemberton

unread,

Sep 24, 2011, 7:45:06 PM9/24/11

to

"James Harris" <james.h...@googlemail.com> wrote in message

news:d87d4221-d042-4243...@t11g2000yqa.googlegroups.com...
...

> I'll add something else in favour of slicing. One can take multiple
> slices easily. In fact I can think of a good example. Say you had a
> date in the form YYYY-MM-DD and wanted to do two things: check the
> hyphens were present and form a copy without them. Here it is with
> string slicing. For variety and to make the slicing as wordy as
> possible I'll use words as slicing operators (and stick with 1-based
> for purposes of comparison).
>
> if dat$[5] == "-" and dat$[8] == "-" then
> short_dat$ = dat$[1 upto 4, 6 upto 7, 9 upto 10]
>
> Now let's do that with left$, mid$ and right$
>
> if mid$(dat$, 5) == "-" and mid$(dat$, 8) == "-" then
> short_dat$ = left$(dat$, 4) + mid$(dat$, 6, 2) + right$(dat$, 2)

FYI, you forgot ,1 on mid$...

Now let's do that in C:

if(dat[5]=='-')&&(dat[8]=='-')
{
strncpy(short_dat,dat,4);
/* strncpy could be replaced with strcpy like below */
strcpy(&short_dat[4],&dat[5]);
strcpy(&short_dat[6],&dat[8]);
}

Or, somesuch, that's untested, btw ...

> The second one is awful.

Ok. It's awful to you. I just deal with it.

> You keep mentioning C. I'm not sure why. C has neither left$, mid$
> and right$ nor slicing operations.

C's "array" indexing of strings is slice-like IMO. Basically, C has the
equivalent of right$ or left$, but no mid$ because of a single index,
instead of dual. I.e., only the start and end of the string are defined, so
you can only work from the index to either the start or end, i.e., left$
right$.

E.g., truncate to the left 4 chars of a string:

dat[4]='\0';

E.g., cut to right 2 chars of a string (if declared correctly):

dat=&dat[strlen(dat)-2];

Or, with functions ...

E.g., short_dat$=left$(dat$,4) is:

strncpy(short_dat,dat,4)

E.g., short_dat$=right$(dat$,2) is:

strcpy(short_dat,&dat[strlen(dat)-2]);

> While I'm in the frame of mind here's another example using
> assignment. This time I'll do the Basic-style first.
>
> s$ = left$(s$, 5) + "zz" + mid$(s$, 8)
>
> What does that do? It replaces two characters in s$. Now
> how can anyone claim that such a language treats strings as
> arrays of characters?
>

A: ... because left$ and right$ index the string as an array ... I.e., they
access the characters in the string based upon their positions as an array
of characters.

What is it that in that example makes you think "that such a language" isn't
"treat[ing] strings as arrays of characters? The assignment? The
concatenation? The implicit garbage collection?

Did you mean right$ instead of mid$?

> The slicing equivalent might be
>
> s$[6 upto 7] = "zz"
>
> Even C (yes, I'll join you for a moment - a brief moment - in
> referring to C) allows individual characters of strings to be changed!
>
> Seriously, obscuring the nature of strings as arrays of characters or
> similar STM unnecessarily unhelpful.

How do you plan to store them in memory?

> The "solve it with a function" approach of various Basics
> (and much more modern designs such as PHP and others)
> is great for very simple friendly code but it doesn't scale well.

FYI, that's exactly how interpreted Forth is implemented. They define a
small set of low-level functionality ("primitives") usually code in
assembler (~30 to 45 functions), or sometimes C, and then everything else is
high-level functionality ("words") which we would know as parameterless or
0-operand functions or procedures. :-)

> You initially questioned whether I really disliked the functions
> approach.

Did I? Or, did I ask you something else, but similar:

JH> The Basic I used first had left$, mid$ and right$ which, IMHO,
JH> are *awful*. What did you have in mind?
>
RP> Do you really think that James? ...

(Your emphasis is on *awful*, but you didn't say why.)

> I hoped that as an exponent of the left$, mid$ and right$
> functions you would be able to produce some code to show
> why they are so good in comparison to what I could show for
> my favoured option.

My claim was without knowing your favored option. It was in regards to the
languages I've experienced. No one has experienced your language yet,
except you. If it's more succinct, then it is.

> To choose to compare against something
> that we agree is worse (in this respect) seems hardly relevant. :-(

And, to compare against something newly created by you to
which no one has any familiarity is hardly fair. :-)

FYI, I don't like the "upto" syntax, but it is more succinct. It could be
replaced with a colon though ... I'd avoid the dash with signed values.

Rod Pemberton

Rod Pemberton

unread,

Sep 24, 2011, 7:46:48 PM9/24/11

to

"James Harris" <james.h...@googlemail.com> wrote in message

news:76f1ecf4-69b5-419c...@h34g2000yqd.googlegroups.com...
> BartC
> > James
...

> > > And I can see that you are comparing a lot against C
> > > but I don't know whether you prefer them over string slicing or why. I
>
> > C is hopeless at higher level string handling, so doesn't really enter

> > it.
>
> I've been trying to persuade Rod of that but he keeps comparing with
> C!

Huh? 90% of what I use C for is string processing ... Sometimes it's a bit
awkward, but it's also powerful. Gain some. Lose some. I use very few of
the string library functions. So, there is lots more power available there.

RP

Rod Pemberton

unread,

Sep 24, 2011, 7:50:55 PM9/24/11

to

"James Harris" <james.h...@googlemail.com> wrote in message

news:6cb958bf-349b-4306...@g33g2000yqc.googlegroups.com...

> On Sep 23, 9:41 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > "James Harris" <james.harri...@googlemail.com> wrote in message
> >
news:9518b81f-0e77-425d...@u19g2000vbm.googlegroups.com...
>
...

> > > If you go down the route of some languages (where all strings
> > > can be indexed using non-negative integers as offsets from the
> > > left; and as negative integers as indices from the right - i.e. the
> > > last character is -1, the one before it is -2 etc) then your two
> > > examples become
>
> > Which languages allow negative indexing from the right?
>
> Search for the word "negative" in
>

> [link]

They only marked negative indexes *from* the right, not the left,
apparently.

> > The one language I know of that allows negative indexing is
> > relative to the left ...
>
> I can't see any sense in this. Which language is that?

...

> Which language is that?

You have to ask after you kept saying I mentioned it too much already? Of
course, it's C. I hope you didn't think it was BASIC! :-) I'd suspect
that the other C derived languages do so also, but I don't know.

> I can't see any sense in this.

C allows lots of stuff that doesn't make sense due to it's grammar. If it's
a legal construct, it compiles. That's why I only use a subset of C's
features that work for me. You can trip and fall on the other stuff.

For most C strings, you can only access from index 0 to the end-of-string.
Actually, C allows one char past the end-of-string for loops or something.
So, if you index backwards from zero without knowing that characters are
present, it's undefined behavior. It may work, or not. But, you can setup
strings where backwards accesses are valid. If you've got a string in C
that's already allocated and filled with chars, and also a pointer to char,
then you can set the pointer to char into the middle of the allocated
string. The "array" that the pointer to char points to is made up of chars
and null terminated, so it's a "string". Now, you can index forward with
positive values or backwards with negative values, but only to the start and
end chars of the original string. If you go past the ends of the original
string, it's undefined.

The only place I've ever heard of it being used is with implementing a
memory header for memory blocks allocated by the hidden memory allocator
used by C. Just prior to the address returned by C for malloc(), on some
implementations, there is a header which has the hidden information that C
needs to implement corresponding functions like realloc() and free(). If
you ever try to code a realloc(), you'll find that you don't have quite
enough information to safely reallocate under all circumstances. I've
personally never used negative indexing. I avoid signed types, although I
have been converting some of my trivial text programs from unsigned to
signed chars lately just to clean them up for others so I won't hear them
whine, and eliminate casts.

Rod Pemberton

Rod Pemberton

unread,

Sep 24, 2011, 7:51:38 PM9/24/11

to

"Bill Gunshannon" <bill...@cs.uofs.edu> wrote in message

news:9e5v86...@mid.individual.net...

>
> I suppose I am required to point out inthis group as well that C
> did not invent the null termnated string. It was in common use
> before C was created and Ritchie, et al. just followed industry
> practice.

Ok.

> The PDP-11, where C started had null terminate strings
> as a part of its Macro Assembler (.ASCIZ to create them, .PRINT
> to display them).

"where C started" was with B on the PDP-7
http://cm.bell-labs.com/cm/cs/who/dmr/chist.html

Rod Pemberton

Ted Joyce

unread,

Sep 25, 2011, 12:03:45 AM9/25/11

to

Rod Pemberton wrote:
> "Ted Joyce" <tpj2...@mynews.net> wrote in message
> news:j5jk2d$28a$2...@speranza.aioe.org...
>> Rod Pemberton wrote:
>>
>>> C doesn't have a string type.
>>
>> Not a type, no, but it does have the concept of "character string".
>>
>
> I more than adequately understand that.

I am sure you did, but I feel compelled to say pedantic shit when a
discussion is going on about pedantic shit! It's not good to be casual
with verbage when discussing at the lowest level, because a lot of times,
discussions go 'round-n-'round because those discussing pedantic shit are
not on the same page.

That said, I thought the statement you made was like "a trick question".
Umm, not really, actually more like, well nevermind. The word 'type'
needed emphasis.

> I was just reminding James
> that strings are not a fundamental type in some languages.

More vague verbage: "fundamental". Isn't a string always a "fundamental"
type? ;) It's just a holder for data. How more fundamental can a type be?
;)

> I think
> he wanted them to be in his.

I don't think he has to worry about THAT! (See above).

Well I hope that helped you to understand that character arrays and
strings in C are not the same thing, because they are not the same thing.
I'm not sure if you were saying that you did or didn't get that in the
first place. Conceptually, it seems rather obvious (to me).

>>> C has
>>> types and "arrays" of them. So, "strings" are built using chars and
>>> "arrays".
>>
>> And, that pesky little null, which of course has caused much grief.
>>
>
> Having programmed in a number of non-terminated languages, e.g.,
> assembly with ASCIN (ASCII negated) and PL/I with counted strings, I
> can tell you that null terminated strings have fewer issues. Just
> make sure they're terminated.

That is a tangent thread perhaps. I was trying to point out how you
continually associate "C string" with characters and arrays but leave off
the most important thing: the null. The null is what makes a sequence of
characters a string in C! Forget about array, any sequence of characters
with a null at the end of it can be a C string.

(I think posting about and "discussing" pedantic shit is my alternative
choice to watching television).

>
>>> C uses "arrays" of bytes as the fundamental concept upon
>>> which everything in the language is constructed or stored.
>>>
>>
>> It has "structs" also, don't forget, for heterogenous grouping.
>>
>
> It has structs, without quotes ...

I put it in quotes for a reason, but it escapes me now as to why. I don't
normally use quotes to emphasize (? maybe I do), but maybe I did that
time? I don't know. This is going to bug me now!

>
> I put arrays in quotes for a reason: C doesn't actually have them
> either.

C "arrays" are just strings!

> I don't want to divert James' thread by explaining this.
> The five or six times I've brought the "no arrays in C issue" up,
> it's led to lengthy and hostile discussions in various NGs. You can
> lookup my past posts on this via Google Groups advanced search, if
> need be.

Cool, but please sum up concisely your position on the matter just this
once? (Emphasis on *concise*). "Time is money", don't ya know.

Ted Joyce

unread,

Sep 25, 2011, 12:07:35 AM9/25/11

to

Rod Pemberton wrote:
> "James Harris" <james.h...@googlemail.com> wrote in message
> news:76f1ecf4-69b5-419c...@h34g2000yqd.googlegroups.com...
>> BartC
>>> James
> ...
>
>>>> And I can see that you are comparing a lot against C
>>>> but I don't know whether you prefer them over string slicing or
>>>> why. I
>>
>>> C is hopeless at higher level string handling, so doesn't really
>>> enter it.
>>
>> I've been trying to persuade Rod of that but he keeps comparing with
>> C!
>
> Huh? 90% of what I use C for is string processing ...

Yeah, but you just do that "because it is there", right? Otherwise you
would use SNOBOL, right?

> Sometimes
> it's a bit awkward, but it's also powerful. Gain some. Lose some.
> I use very few of the string library functions. So, there is lots
> more power available there.
>

The question that comes to mind is what the scope of "string processing"
(literally quoted in quotes!) is for you.

Dmitry A. Kazakov

unread,

Sep 25, 2011, 3:12:25 AM9/25/11

to

On Sat, 24 Sep 2011 23:03:45 -0500, Ted Joyce wrote:

> Rod Pemberton wrote:

>> I was just reminding James
>> that strings are not a fundamental type in some languages.
>
> More vague verbage: "fundamental". Isn't a string always a "fundamental"
> type? ;) It's just a holder for data. How more fundamental can a type be?

The string type is normally defined in terms of other types, e.g. as an
array of characters, which makes the index type, the character type and
array type more fundamental than strings. Not all arrays are strings, not
even all character arrays are.

>> I put arrays in quotes for a reason: C doesn't actually have them
>> either.
>
> C "arrays" are just strings!

C "arrays" are pointers. C does not have string type. It has string
literals of some unnamed type, which cannot be used explicitly as a proper
type and implicitly converted to char*.

Marco van de Voort

unread,

Sep 25, 2011, 8:45:54 AM9/25/11

to

On 2011-09-23, Rod Pemberton <do_no...@noavailemail.cmm> wrote:
>> > Lets start with the two most widespread languages: C and Forth.
>>
>> Forth widespread? Where, in Apple firmwares ? :)
>
> Everywhere. It's been implemented on more platforms than C.

Well, that is something different from "usage". Your argument reminds me a
bit of NetBSD propaganda ;-)

> Still no reason as to why "curly braces suck", so I'll take that
> as just an intense personal "preference" ...

It is for sure. But that was the whole point, to contradict the notion that that
is a fact, and the crazy idea that the prevalence of curly braces language has
something to do with its syntactic superiority .

Marco van de Voort

unread,

Sep 25, 2011, 9:00:36 AM9/25/11

to

On 2011-09-23, BGB <cr8...@hotmail.com> wrote:
> IMO, there are plenty worse options:
> paired keywords;
> indentation;
> opening and closing tags ("<foo>...</foo>" or "[foo]...[/foo]");
> ...

I give you tags and indentation

>> Forth widespread? Where, in Apple firmwares ? :)
>>
>

> IIRC, it is also used as an alternative to C in various embedded systems
> (since it can perform similarly and has similar operating
> characteristics, but requires a much simpler compiler).

I know. But if sb says widespread, I assume widespread usage, not number of
toasters a minimal system has been ported too.

Minimalistic systems will win that contest, but IMHO that is a sterile win.

>>> Pascal -
>>> begin end
>>
>> That's what I'm most familiar with. Though I like the Modula2 system more,
>> snce it solves dangling els problems.
>>
>
> but, these do involve, on average, typing 4x as many characters as { and }.

Yes, but they read faster. I don't know if it is by training (reading), or
innate, but experienced readers read entire words in one go. Dropping back
to single character encoding slows down reading, but worse, the extra
attention needed disrupts the natural reading flow.

At least in the West; It might be different for the eastern Asian population
which are trained to writing systems where a word is a glyph.

Anyway, even if you don't buy the above, typespeed is rarely, if ever, a
limiting factor in programming.

I've been now programming C for about 10 years, and I still miss braces now
and then, if they are misplaced, or I have to get used to an new editor with
a different font (or new highlighting). I never miss a begin/end that way.

>>> It seems the curly braces might be winning, and where they aren't parens are
>>> ... ;)
>>
>> Not all popular things are good things. Take for instance Berlusconi :-)
>
> had to look this up...
>
> found: a politician.
>
> then was thinking: wow, this guy sort of resembles the mobster
> stereotype (but this may be because in the US, Italians form one of two
> major popular cultural/stereotypical roles: being mobsters, or running
> small restaurants and talking funny, like "mamamia my pizzaria" and
> similar, meanwhile using lots of hand-gestures and similar...).

His business is slightly larger than a pizzaria :-)

> prior to looking up the word, was actually more expecting to find some
> sort of food product (like, say, something similar to haggis or similar,
> ...).
>
> hmm...
>
>
>> So my remark was about quality, not quantity.
>
> ?...

The virtues of a syntax, not how many use it. Most of them never made a
conscious choice about syntax, but just inherited the language because of
the company/school whatever.

Ted Joyce

unread,

Sep 25, 2011, 9:13:38 AM9/25/11

to

Dmitry A. Kazakov wrote:
> On Sat, 24 Sep 2011 23:03:45 -0500, Ted Joyce wrote:
>
>> Rod Pemberton wrote:
>
>>> I was just reminding James
>>> that strings are not a fundamental type in some languages.
>>
>> More vague verbage: "fundamental". Isn't a string always a
>> "fundamental" type? ;) It's just a holder for data. How more
>> fundamental can a type be?
>

> Ther string type

Do tell.
> is normally

quite qualifier.

>defined in terms of other types,

you wishing?

> e.g. as
> an array of characters, which makes the index type, the character
> type and array type more fundamental than strings. Not all arrays are
> strings, not even all character arrays are.

You seem to be struggling with basics. :P

>
>>> I put arrays in quotes for a reason: C doesn't actually have them
>>> either.
>>
>> C "arrays" are just strings!
>

> C "arrays" are pointers.'

no I don't think so.. else there is no "C", but C.

> C does not have string type.

We did that already.

> It has string
> literals of some unnamed type, which cannot be used explicitly as a
> proper type and implicitly converted to char*.

You are so drunk you don't even know what you are saying.

BartC

unread,

Sep 25, 2011, 9:30:26 AM9/25/11

to

"Marco van de Voort" <mar...@turtle.stack.nl> wrote in message
news:slrnj7u9fk...@turtle.stack.nl...

> Anyway, even if you don't buy the above, typespeed is rarely, if ever, a
> limiting factor in programming.

I can type an entire word in not much more time that it takes to fumble for
a shift+punctuation key combination ...

> I never miss a begin/end that way.

... but begin/end is sometimes a bit much. You don't really need them here
for example:

if ... then begin .... end else ...

as 'then' and 'else' are already perfectly good delimiters.

--
Bartc

Marco van de Voort

unread,

Sep 25, 2011, 3:39:27 PM9/25/11

to

On 2011-09-25, BartC <b...@freeuk.com> wrote:
>> I never miss a begin/end that way.
>
> ... but begin/end is sometimes a bit much. You don't really need them here
> for example:

First, I rarely ever type the end btw, and the begin not that much either.
They are typically completed by the (Delphi or Lazarus) ide.

And if you are a keystroke counter, IDE is also important in handling the
number of keys needed for identation

> if ... then begin .... end else ...
>
> as 'then' and 'else' are already perfectly good delimiters.

No, they are not redundant per se if not every IF mandatory has an ELSE, and
you allow empty blocks and/or blocks without begin..end. It gets you into
trouble with nesting, and the problem is called dangling ELSE. Both Pascal
and C suffer from it and have workarounds.

As said earlier, the successor to Pascal fixes this in the way I like best:

* begin..end is kept for the function scope. The "end" is followed by the
function name, like end functionname;
* For other blocks "begin" is dropped and end is mandatory.

The second bit is the nice part. Readable, reasonbly slim and never again
expand a single line to a block (e.g. to add a log msg). IMHO still the
best system. Unfortunately, compiler quality on Modula2 was bad, so I ended
up with the next best thing, Pascal.

For the people interested in the math, the number of characters is C:3 vs
M2: 3-4.

C has { } and a mandatory semi colon before the ;, Modula2 has three (end),
followed by a semicolon only if the next line is a statement (and not e.g.
another end)

The first bit is a mixed blessing, meant to make sure that an error wrt
block nesting will always be detected at the function's end, and never
beyond. Very secure, but back then I considered it a nuisance when renaming
functions, and in the relatively safe Wirthian languages the error was
99.9% on the next line (next function declaration) anyway. So I then
considered the tradeoff bad.

OTOH, with a decent IDE (with editing based on syntax parsing) it might be
not so much of a problem. (since you could easily create an editor that
would rename the functionname after "end" too when renaming the function,
assuming that the general blockstructure is intact)

BartC

unread,

Sep 25, 2011, 4:32:52 PM9/25/11

to

"Marco van de Voort" <mar...@turtle.stack.nl> wrote in message

news:slrnj7v0rf...@turtle.stack.nl...

> On 2011-09-25, BartC <b...@freeuk.com> wrote:

>> if ... then begin .... end else ...
>>
>> as 'then' and 'else' are already perfectly good delimiters.
>
> No, they are not redundant per se if not every IF mandatory has an ELSE,
> and
> you allow empty blocks and/or blocks without begin..end. It gets you into
> trouble with nesting, and the problem is called dangling ELSE. Both
> Pascal
> and C suffer from it and have workarounds.
>
> As said earlier, the successor to Pascal fixes this in the way I like
> best:
>
> * begin..end is kept for the function scope. The "end" is followed by the
> function name, like end functionname;
> * For other blocks "begin" is dropped and end is mandatory.
>
> The second bit is the nice part. Readable, reasonbly slim and never again
> expand a single line to a block (e.g. to add a log msg). IMHO still the
> best system. Unfortunately, compiler quality on Modula2 was bad, so I
> ended
> up with the next best thing, Pascal.

I thought Algol68 sorted this out pretty well.

However, every 'if' statement needs to end with 'fi', not to everyone's
taste. (In my versions of the syntax, I also allow 'end', 'endif' and 'end
if', just to provide a choice.)

Anyway, it means many places where you might sometimes write more than one
statement, are already delimited, or bracketed. So you can change from 1 to
2 or N statements, and back again, without inserting and remove blocks.
Quite luxurious! Of course not all of us use fancy IDEs..

Also there are a few places where you wouldn't usually expect multiple
statements, where they will also work thanks to being naturally delimited:
in between 'if' and 'then' for example. It all helps to give a nice,
expressive, orthogonal language.

--
bartc

BGB

unread,

Sep 25, 2011, 4:38:23 PM9/25/11

to

On 9/25/2011 6:00 AM, Marco van de Voort wrote:
> On 2011-09-23, BGB<cr8...@hotmail.com> wrote:
>> IMO, there are plenty worse options:
>> paired keywords;
>> indentation;
>> opening and closing tags ("<foo>...</foo>" or "[foo]...[/foo]");
>> ...
>
> I give you tags and indentation
>

ok.

I guess at one point, it was a fad to have transposed paired keywords,
like: begin/nigeb and dne/end and if/fi and similar, but this went out
of style.

>>> Forth widespread? Where, in Apple firmwares ? :)
>>>
>>
>> IIRC, it is also used as an alternative to C in various embedded systems
>> (since it can perform similarly and has similar operating
>> characteristics, but requires a much simpler compiler).
>
> I know. But if sb says widespread, I assume widespread usage, not number of
> toasters a minimal system has been ported too.
>
> Minimalistic systems will win that contest, but IMHO that is a sterile win.
>

yeah...

sort of how like claiming that ARM and Z80 are far more common than x86
(one sees this argument sometimes...).

yes, there are more devices running them, but most are mass-produced
limited-function devices (especially for Z80). most people need not know
or care what kind of CPU runs their wrist-watch or alarm clock or
calculator, and the software which runs on them is generally one-off,
meaning the number of units sold is not really a good indicator.

this is unlike, say, a PC or laptop, where people actually have to care.
the partial issue at this point is becoming modern cell-phones and
tablets, which are using ARM in an area disturbingly close to x86
territory (it is a mystery how this will play out).

>>>> Pascal -
>>>> begin end
>>>
>>> That's what I'm most familiar with. Though I like the Modula2 system more,
>>> snce it solves dangling els problems.
>>>
>>
>> but, these do involve, on average, typing 4x as many characters as { and }.
>
> Yes, but they read faster. I don't know if it is by training (reading), or
> innate, but experienced readers read entire words in one go. Dropping back
> to single character encoding slows down reading, but worse, the extra
> attention needed disrupts the natural reading flow.
>
> At least in the West; It might be different for the eastern Asian population
> which are trained to writing systems where a word is a glyph.
>

I don't figure it makes much of a difference, however the glyph does
take a little less space on screen, making it often preferable for sake
of code density (usually more an issue when using the often black-listed
"clump formatting" though).

I personally tire of "readability" based arguments though, as except in
severe cases (Lisp style or RPN based syntax) it tends not to be a big
issue, and is not strongly effected much by things like
glyphs-vs-keywords or naming conventions (people almost invariable
arguing that their personally preferred naming or formatting convention
is more readable than the others, and thus should be universally upheld,
...).

> Anyway, even if you don't buy the above, typespeed is rarely, if ever, a
> limiting factor in programming.
>
> I've been now programming C for about 10 years, and I still miss braces now
> and then, if they are misplaced, or I have to get used to an new editor with
> a different font (or new highlighting). I never miss a begin/end that way.
>

I don't really have a problem with braces, but more that people keep
insisting on using/creating fonts which have the I/l/1 being visually
identical issue, and then software keeps defaulting to them...

apparently, I am also in the apparent minority who prefers fixed-width
fonts over variable-width fonts. it is worse in some apps which
apparently prefer to use blurry fonts.

>>>> It seems the curly braces might be winning, and where they aren't parens are
>>>> ... ;)
>>>
>>> Not all popular things are good things. Take for instance Berlusconi :-)
>>
>> had to look this up...
>>
>> found: a politician.
>>
>> then was thinking: wow, this guy sort of resembles the mobster
>> stereotype (but this may be because in the US, Italians form one of two
>> major popular cultural/stereotypical roles: being mobsters, or running
>> small restaurants and talking funny, like "mamamia my pizzaria" and
>> similar, meanwhile using lots of hand-gestures and similar...).
>
> His business is slightly larger than a pizzaria :-)
>

yep.

was just noting some stereotypes, but then was left feeling paranoid
that someone might take this the wrong way or similar (like, implying
that because one notes the existence of a stereotype, that it means they
also believe it to be true, ...).

>> prior to looking up the word, was actually more expecting to find some
>> sort of food product (like, say, something similar to haggis or similar,
>> ...).
>>
>> hmm...
>>
>>
>>> So my remark was about quality, not quantity.
>>
>> ?...
>
> The virtues of a syntax, not how many use it. Most of them never made a
> conscious choice about syntax, but just inherited the language because of
> the company/school whatever.

I go by the assumption that most people generally like what is familiar,
and most language designers go by what is their preference or
expectation of preference.

under this model, if there is something substantially wrong with
something, it will generally tend to weed itself out after several
generations of programmers and/or new languages.

like, on a large scale (time and people) things will tend to head
towards the local minima.

given that many aspects of mainstream languages (such as curly braces)
are spreading to most other new languages, and many other styles are in
decline, it is very possible that this is "generally the direction
things are going" (and it will all eventually become something to be
taken for granted).

it is a mystery though how long until programmer-style thinking and
syntax begins spreading into the "mainstream" as-in, something which
also begins to permeate into the non-programmer world (or maybe even
language in-general...).

decided to leave out distant-future speculations about the possible
development of "language" (or merging of natural and machine languages).

or such...

BartC

unread,

Sep 25, 2011, 5:29:35 PM9/25/11

to

"BGB" <cr8...@hotmail.com> wrote in message
news:j5o3fv$roa$1...@news.albasani.net...

> I guess at one point, it was a fad to have transposed paired keywords,
> like: begin/nigeb and dne/end and if/fi and similar, but this went out of
> style.

If it did, no one told me about it! But I've never seen begin/nigeb and
such, only ever used if/fi, do/od, case/esac. (And they have the advantage,
over {/}, that they do have to be properly paired, so that if/od is an error
that might not be picked up so easily with {/}.)

> given that many aspects of mainstream languages (such as curly braces) are
> spreading to most other new languages, and many other styles are in
> decline, it is very possible that this is "generally the direction things
> are going" (and it will all eventually become something to be taken for
> granted).

Maybe. Or maybe it's like fashion, where people wear a particular style
simply because it's the only thing they can buy!

Anyway there are enough people writing Lua, Ruby, Python, Basic even, to
keep the spread of curly braces in check I hope. They are just too
anaemic-looking to be taken seriously...

--
Bartc

Rod Pemberton

unread,

Sep 26, 2011, 2:54:56 PM9/26/11

to

"Marco van de Voort" <mar...@turtle.stack.nl> wrote in message

news:slrnj7u8k2...@turtle.stack.nl...

> On 2011-09-23, Rod Pemberton <do_no...@noavailemail.cmm> wrote:
>
> >> > Lets start with the two most widespread languages: C and Forth.
> >>
> >> Forth widespread? Where, in Apple firmwares ? :)
> >
> > Everywhere. It's been implemented on more platforms than C.
>
> Well, that is something different from "usage". Your argument reminds me a
> bit of NetBSD propaganda ;-)
>

Well, I wasn't trying to spread "propaganda". I'm sorry that it comes
across that way to you. I was only stating the truth as I understand it.
That truth is based on my personal research and many years of experience.

> > Still no reason as to why "curly braces suck", so I'll take that
> > as just an intense personal "preference" ...
>
> It is for sure. But that was the whole point, to contradict the notion
> that that is a fact,

You presented nothing to contradict the notion. In fact, I presented mostly
contradictory evidence, albeit minimal. I didn't obviously compare all the
possible languages. I picked languages that I thought would be
representative, which leaves plenty of bias in the sampling. That leaves
plenty of room for you to post languages that are better. I'll even help
you by providing links to sites that show code in many languages. Of the
language I did compare, Forth, Cobol, and Lua are clearly worse. Pascal and
Fortran are arguably slightly worse. The rest use curly braces (and/or
parens).

Links to solutions in a wide variety of languages:
http://rosettacode.org/wiki/Category:Programming_Tasks
http://www.99-bottles-of-beer.net/

Links to rankings of programming language use:
http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html
http://lang-index.sourceforge.net/

> [...] and the crazy idea that the prevalence of curly braces

> language has something to do with its syntactic superiority .
>

Why then is it being used if not for superiority, or at least perceived
superiority? Curly braces are exceptional for grouping code. : and ; of
Forth used to define "words" (i.e., procedures) are also good, but since
those are just for procedures Forth needs many others.

Rod Pemberton

Rod Pemberton

unread,

Sep 26, 2011, 3:01:19 PM9/26/11

to

"Ted Joyce" <tpj2...@mynews.net> wrote in message

news:j5m9e2$lru$2...@speranza.aioe.org...

> Rod Pemberton wrote:
> > "James Harris" <james.h...@googlemail.com> wrote in message
> >
news:76f1ecf4-69b5-419c...@h34g2000yqd.googlegroups.com...
> >> BartC
> >>> James
...

> >>>> And I can see that you are comparing a lot against C
> >>>> but I don't know whether you prefer them over string slicing or
> >>>> why. I
> >>
> >>> C is hopeless at higher level string handling, so doesn't really
> >>> enter it.
> >>
> >> I've been trying to persuade Rod of that but he keeps comparing
> >> with C!
> >
> > Huh? 90% of what I use C for is string processing ...
>
> Yeah, but you just do that "because it is there", right? Otherwise you
> would use SNOBOL, right?
>

No, SNOBOL is a "dead" language. I think I have a book on it somewhere that
I never read since there is no point in reading about a language one can't
use ... C is by far the best overall of the numerous languages I've
programmed, which is why I use it. It's also about as widespread as Forth.
However, I admit I don't have much experience with object-oriented
languages. I've looked at C++ and Ruby.

> > Sometimes
> > it's a bit awkward, but it's also powerful. Gain some. Lose some.
> > I use very few of the string library functions. So, there is lots
> > more power available there.
>
> The question that comes to mind is what the scope of "string processing"
> (literally quoted in quotes!) is for you.

Anything to do with chars, which is what one would commonly call text ...
I've got everything from trivial character conversions to reformatting to
parsing and lexing of programming languages, like C, Forth, and x86
assembly.

Rod Pemberton

Rod Pemberton

unread,

Sep 26, 2011, 3:06:56 PM9/26/11

to

"Ted Joyce" <tpj2...@mynews.net> wrote in message

news:j5m9e1$lru$1...@speranza.aioe.org...

> Rod Pemberton wrote:
> > "Ted Joyce" <tpj2...@mynews.net> wrote in message
> > news:j5jk2d$28a$2...@speranza.aioe.org...

...

> > I was just reminding James
> > that strings are not a fundamental type in some languages.
>
> More vague verbage: "fundamental". Isn't a string always a
> "fundamental" type? ;) It's just a holder for data. How more
> fundamental can a type be? ;)
>

Strings are implemented as part of the syntax and type system of some
language but not others. So, e.g., a string is a fundamental type in BASIC,
but a string is not fundamental type in C. It's not a part of C's type
system. 'char' and "'Arrays' of 'char'" and 'pointer to char' are part of
C's type system, but 'strings' are not.

> > [quotes]

>
> Well I hope that helped you to understand that character arrays and
> strings in C are not the same thing, because they are not the same thing.

You're correct that they're not the same thing. In fact, they don't even
exist in C. C doesn't have strings. C doesn't have arrays. Neither is
actually a complete type in C's type system. Are you confused? You seem to
be equating C concepts of arrays and strings as actually being C types ...
"String concept" as you stated earlier is more accurate.

> I was trying to point out how you continually associate "C string"
> with characters and arrays but leave off the most important thing:
> the null. The null is what makes a sequence of
> characters a string in C!

A null is what differentiates an "array" of chars from a null terminated
"array" of chars. That's what C has. The latter of the two is used as a
string. Strings don't exist in C as a fundamental type like 'char' or
'long'. They are use programmatically, i.e., certain code checks for the
null.

> Forget about array, any sequence of characters
> with a null at the end of it can be a C string.

How do you intend to access it?

C doesn't have 'string' as a type, like 'char' or 'int' or 'long', etc. In
order to access a "string" in C, you must do so through some other type,
such as an "array" of 'char', or a 'pointer to char'. That means 'strings'
are not a fundamental type in C.

> > I put arrays in quotes for a reason: C doesn't actually
> > have them either.
>
> C "arrays" are just strings!
>

No. Firstly, "strings" refers to text. So, an "array" of structs or longs
is not a "string". Secondly, pointers, either explicitly or implicitly, are
used as the method to implement and access "arrays" in C, or other
sequences of objects.

Rod Pemberton

Dmitry A. Kazakov

unread,

Sep 26, 2011, 3:52:18 PM9/26/11

to

On Mon, 26 Sep 2011 15:01:19 -0400, Rod Pemberton wrote:

> No, SNOBOL is a "dead" language. I think I have a book on it somewhere that
> I never read since there is no point in reading about a language one can't
> use

If it is The SNOBOL4 Programming Language by Griswold et al, then I would
recommend to read it. It is an excellent book and it is worth reading to
see ideas of pattern matching beyond ugly and weak regular expressions.

SNOBOL is dead, but its patterns are alive. There are implementations of
SNOBOL/SPITBOL and SNOBOL-like patterns for various languages.

Marco van de Voort

unread,

Sep 27, 2011, 10:20:37 AM9/27/11

to

On 2011-09-25, BartC <b...@freeuk.com> wrote:
>> best system. Unfortunately, compiler quality on Modula2 was bad, so I
>> ended
>> up with the next best thing, Pascal.
>
> I thought Algol68 sorted this out pretty well.

There are many solutions. I just like the M2 one best.

Algol60 simply doesn't allow nesting of IFs. I can't exactly remember what
Algol-W did.

> However, every 'if' statement needs to end with 'fi', not to everyone's
> taste. (In my versions of the syntax, I also allow 'end', 'endif' and 'end
> if', just to provide a choice.)

The "end" solution is pretty equivalent to the M2 one.

> Anyway, it means many places where you might sometimes write more than one
> statement, are already delimited, or bracketed. So you can change from 1 to
> 2 or N statements, and back again, without inserting and remove blocks.
> Quite luxurious! Of course not all of us use fancy IDEs..

I usually use IDEs, but they are better at writing new code then modifying
existing ones.

Andy Walker

unread,

Sep 27, 2011, 6:36:31 PM9/27/11

to

On 27/09/11 15:20, Marco van de Voort wrote:
> Algol60 simply doesn't allow nesting of IFs.

The statement following "THEN" had to be unconditional [though
it could, of course, be a compound statement containing "IT ..."], but
any form of statement could follow "ELSE". But direct nesting of "IF"s
was allowed in expressions: the Report itself gives the example of a
Boolean expression:

IF IF IF a THEN b ELSE c THEN d ELSE f THEN g ELSE h < k

[no, I don't know what happened to "e", "i" and "j" either]. The syntax
wasn't very orthogonal, in that the "IF" and "ELSE" parts could be "IF"
expressions, but the "THEN" part couldn't.

Algol 68 is much better at this sort of thing.

--
Andy Walker,
Nottingham.

Fritz Wuehler

unread,

Sep 27, 2011, 7:06:21 PM9/27/11

to

"Rod Pemberton" <do_no...@noavailemail.cmm> wrote:

> No, SNOBOL is a "dead" language. I think I have a book on it somewhere that
> I never read since there is no point in reading about a language one can't
> use ...

SNOBOL4 is dead in production but it is as alive as any current language in
that there are interpreters available for UNIX-like platforms and there are
compilers available for Windows and the original Bell Labs SNOBOL4 is
available on MVS and so is SPITBOL-360. The UNIX port is being maintained
although from what I see it's excellent and anything else done is
non-conforming enhancements. You can't call a language that is still
available on so many different platforms and OS and still being maintained,
dead.

A lot of the silly foaming over BASIC's so-called greatness in string
processing would disappear if you would read your book. There's plenty to
learn from SNOBOL4. Those who don't know history are destined to repeat it.

Of all the languages I have seen none can come close to SNOBOL4's string
processing power. It's interesting to see how much they knew in 1969 that
people still don't get today. Where SNOBOL4 comes up short is it's
interpreted, although the interpreter is a VM, way ahead of its time. With
library support it would be as usable as any popular scripting language that
came after it. It doesn't handle big projects well although the SPITBOL
compiler is written mostly in SPITBOL (SNOBOL).

Rod Pemberton

unread,

Sep 27, 2011, 8:36:22 PM9/27/11

to

"Fritz Wuehler" <fr...@spamexpire-201109.rodent.frell.theremailer.net> wrote
in message
news:97924b292823516a...@msgid.frell.theremailer.net...
> "Rod Pemberton" <do_no...@noavailemail.cmm> wrote:
...

> > No, SNOBOL is a "dead" language. I think I have a book on it somewhere
> > that I never read since there is no point in reading about a language
> > one can't use ...
>
> SNOBOL4 is dead in production but it is as alive as any current language
> in that there are interpreters available for UNIX-like platforms and there
> are compilers available for Windows and the original Bell Labs SNOBOL4 is
> available on MVS and so is SPITBOL-360. The UNIX port is being maintained
> although from what I see it's excellent and anything else done is
> non-conforming enhancements. You can't call a language that is still
> available on so many different platforms and OS and still being
> maintained, dead.
>

Sure I can and will, again. If a language has less than 1% of market share
and that language has existed for a quite while, it's "dead". I.e., by
requiring them to exist for a while, I'm allowing for new languages that may
be growing in the under 1% category, albeit that is a rarity.

So, find Snobol here:
http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html

Or, here:
http://lang-index.sourceforge.net/

Snobol doesn't rank for either link. As you can see, Snobol effectively has
0% market share. The first link ranks down to 0.20%. That's 1/5th of a
percent! How much more "dead" can you get? The second link ranks some
exceptionally rare languages. They rank down to absolutely "dead", er, I
mean 0%. Even then, Snobol is not listed ... It's "dead", as dead as a
non-living thing can be.

> A lot of the silly foaming over BASIC's so-called greatness in string
> processing would disappear if you would read your book.

From my perspective, that's an utterly irrational claim. Although I know
BASIC (or knew a variety of it), I can only use ON-GOSUB with BASIC. It's
not available elsewhere. It's concepts are not that relevant to other
languages which have similar or alternate constructs. In which case, the
mere
fact that I'm aware of how to use ON-GOSUB is irrelevant and worthless
knowledge. It's just like reading a book on a language one can't use
anymore ...

> There's plenty to learn from SNOBOL4. Those who don't know
> history are destined to repeat it.
>

BASIC is simple, sufficient, and elegant. C is powerful. C is powerful
enough that one doesn't need AWK. As I stated previously, I use only a
minimal amount of C's capability and have done just about everything that
can be done with text. So, why in the world would I need Snobol or AWK
or etc for anything?

> Of all the languages I have seen none can come close to SNOBOL4's
> string processing power. It's interesting to see how much they knew in
> 1969 that people still don't get today.

At some point, you reach a certain level of capability, which I'll call
"saturation". At the "saturation" point, no further capability is required,
even if additional capability is more powerful, complete, or flexible. C is
beyond the "saturation" point for strings, so even if AWK, Perl, Snobol, etc
are way beyond C's abilities, those abilities aren't needed. A good example
of this is how flow control can be broken down into jumps, branches, and a
conditional. We only need a handful of higher level constructs, primarily
for their syntax which hides the jumps and branches, thereby reducing
"spaghetti code". BASIC is the same with strings.

Rod Pemberton

Dmitry A. Kazakov

unread,

Sep 28, 2011, 3:46:12 AM9/28/11

to

On Tue, 27 Sep 2011 20:36:22 -0400, Rod Pemberton wrote:

> At some point, you reach a certain level of capability, which I'll call
> "saturation". At the "saturation" point, no further capability is required,
> even if additional capability is more powerful, complete, or flexible. C is
> beyond the "saturation" point for strings, so even if AWK, Perl, Snobol, etc
> are way beyond C's abilities, those abilities aren't needed. A good example
> of this is how flow control can be broken down into jumps, branches, and a
> conditional. We only need a handful of higher level constructs, primarily
> for their syntax which hides the jumps and branches, thereby reducing
> "spaghetti code". BASIC is the same with strings.

SNOBOL had some fundamental constructs. unknown in BASIC or C, pioneering
to that time.

1. The concept of cursor which moves along the string as the latter gets
matched against the pattern.

(This concept was more or less incorporated into collections design in the
form of "iterators". But in SNOBOL cursors are built-in and implicit, while
iterators are shaped as proper objects.)

2. The concept of immediate assignment, that is when the part of a string
matched by some subpattern is assigned to a variable.

(In parsing techniques it was extended by semantic callbacks, subprograms
called when this or that syntactic element has been matched.)

3. The concept of rollback upon matching failure. It was extensively used
in SNOBOL for various schemes of enumeration and recursion. This was kind
of tree traversal built into the language.

Edward Feustel

unread,

Sep 28, 2011, 5:12:43 AM9/28/11

to

Fritz,
Snobol and Spitbol have yahoogroups for people who want to
do more with the languages. SPITBOL has been declared public domain.

You might want to investigate the Unicon Programming Language.
It is still under active development and is suitable for general
purposes.

Unicon generally maintains the spirit of Griswold's successor to
Snobol: Icon (which not only does strings but has various extensions
including graphics). Unicon extends Icon to have Objects and to
implement extensions for System Programming including internet access,
SQL databases, and parallel programming. It can use the extensive set
of pattern matching, string, and graphics libraries that Icon
supports. It provides an IDE that I routinely use when I have to do
programs supporting string algorithms.

You can find more information on the Unicon.org site and in
the code comparison files in Rosetta collection. Implementations
are available cross platform and you can get these from Unicon.org.

Ed Feustel

On Wed, 28 Sep 2011 01:06:21 +0200, Fritz Wuehler