
Portable Assembly


rickman

May 27, 2017, 3:39:41 PM
Someone in another group is thinking of using a portable assembler to write
code for an app that would be ported to a number of different embedded
processors including custom processors in FPGAs. I'm wondering how useful
this will be in writing code that will require few changes across CPU ISAs
and manufacturers.

I am aware that there are many aspects of porting between CPUs that are
assembly language independent, like writing to Flash memory. I'm more
interested in the issues involved in trying to use a universal assembler to
write portable code in general. I'm wondering if it restricts the
instructions you can use or if it works more like a compiler where a single
instruction translates to multiple target instructions when there is no one
instruction suitable.

Or do I misunderstand how a portable assembler works? Does it require a
specific assembly language source format for each target just like using the
standard assembler for the target?

--

Rick C

Grant Edwards

May 27, 2017, 5:05:21 PM
On 2017-05-27, rickman <gnu...@gmail.com> wrote:

> Someone in another group is thinking of using a portable assembler

You'll have to ask them what they mean by "portable assembler".

The only "portable assemblers" I've ever seen were frameworks that let
you build multiple CPU-specific assemblers from a single source tree.

I vaguely remember one product many years ago that used some sort of
configuration file to define the mnemonics and instruction format for
each architecture. The base "executable" for the assembler itself was
universal.

But that was simply an implementation detail. You still had to write
separate source code for each CPU using that CPU's instruction set.
When you were using it, it wasn't any different than using a set of
separate assemblers that shared a common macro/directive processor.

> to write code for an app that would be ported to a number of
> different embedded processors including custom processors in FPGAs.
> I'm wondering how useful this will be in writing code that will
> require few changes across CPU ISAs and manufacturers.
>
> I am aware that there are many aspects of porting between CPUs that
> is assembly language independent, like writing to Flash memory. I'm
> more interested in the issues involved in trying to use a universal
> assembler to write portable code in general.

You'll have to provide a reference to such a "universal assembler" if
you want any real answers.

> I'm wondering if it restricts the instructions you can use or if it
> works more like a compiler where a single instruction translates to
> multiple target instructions when there is no one instruction
> suitable.

AFAIK, you still have to write separate source code for each CPU
instruction set.

> Or do I misunderstand how a portable assembler works? Does it
> require a specific assembly language source format for each target
> just like using the standard assembler for the target?

That's all I've ever run across in 35 years of low-level embedded
work...

--
Grant





Les Cargill

May 27, 2017, 5:17:31 PM
That's what C is for.

This being said, I've been doing this for
37 years and have only a few times seen an actual need for
portability - usually, the new hardware is so radically
different that porting makes little sense.

--
Les Cargill

Dimiter_Popoff

May 27, 2017, 5:23:20 PM
The only thing of that kind I know of is vpa (virtual processor
assembler), which I created some 17 years ago.
It takes 68k source and assembles it into power architecture code.
It is a pretty huge thing; all of dps (the OS I had originally written
for the 68k (CPU32)), the toolchains, the application code for our
products etc. (millions of lines) go through it - and it can do a lot
more than just assemble statements. It does everything I ever wanted
it to do - and when it could not, I extended it so it could.

It would be a lot simpler for a smaller processor and less
demanding code, of course - as typical MCU firmware would be.
Basically, apart from some exceptions, any source working on one
processor can be assembled into code for another one; and the
exceptions are not bulky, though they are critical - some handlers
within the supervisor/hypervisor code, task switching and so on. In
general, dealing with exceptions is highly processor dependent -
though in large part the code which does the handling is still
processor independent, one has to go through it manually.

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/

Dimiter_Popoff

May 27, 2017, 5:31:40 PM
On 28.5.2017 г. 00:17, Les Cargill wrote:
> rickman wrote:
>> Someone in another group is thinking of using a portable assembler
>> to write code for an app that would be ported to a number of
>> different embedded processors including custom processors in FPGAs.
>> I'm wondering how useful this will be in writing code that will
>> require few changes across CPU ISAs and manufacturers.
>>
>> I am aware that there are many aspects of porting between CPUs that
>> is assembly language independent, like writing to Flash memory. I'm
>> more interested in the issues involved in trying to use a universal
>> assembler to write portable code in general. I'm wondering if it
>> restricts the instructions you can use or if it works more like a
>> compiler where a single instruction translates to multiple target
>> instructions when there is no one instruction suitable.
>>
>> Or do I misunderstand how a portable assembler works? Does it
>> require a specific assembly language source format for each target
>> just like using the standard assembler for the target?
>>
>
> That's what C is for.

Or Basic. Or Fortran etc.

However, they are far from what a "portable assembler" - existing
under the name Virtual Processor Assembler in our house - is,
and never will be. Like any high level language, C is yet another
phrase book - convenient when you need a quick interaction in a
language you don't speak - and only then.

>
> This being said, I've been doing this for
> 37 years and have only a few times seen an actual need for
> portability - usually, the new hardware is so radically
> different that porting makes little sense.
>

The need for portability arises when you have megabytes of
sources which are good and need to be moved to another, better
platform. For smallish projects - anything which would fit in
an MCU flash - porting is likely a waste of time; rewriting it
for the new target will be faster if done by the same person
who has already done it once.

Mel Wilson

May 27, 2017, 5:32:44 PM
On Sat, 27 May 2017 21:05:18 +0000, Grant Edwards wrote:

> I vaguely remember one product many years ago that used some sort of
> configuration file to define the mnemonics and instruction format for
> each architecture. The base "executable" for the assembler itself was
> universal.

The Cross-32 Meta-assembler worked like that. Configuration tables
described the instruction word formats and the mnemonics you were going
to use. The unconfigured part of the program supplied the symbol-table
handling and the macro facility. It provided very powerful assembly-time
address processing. It worked very well on most architectures, except
for some DSPs that packed several instructions into single machine words,
so you couldn't summarize the format of a line of assembly code.
The ADSP-2100 rings a bell here, though I may have misremembered. Back when
development software was expensive it was a boon to have to spend that
kind of money only once.
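
As a rough illustration of that table-driven idea (everything below is
invented - a toy 16-bit single-word format, nothing like the real
Cross-32 tables): the target-specific part is pure data, and the generic
engine just looks the mnemonic up and packs the word.

  /* Toy table-driven assembler core in C. The per-target part is just
     data (mnemonics plus word layouts); the generic part looks the
     mnemonic up and packs the operand. Invented format, illustration only. */
  #include <stdio.h>
  #include <string.h>

  struct insn_desc {
      const char *mnemonic;
      unsigned    opcode;       /* placed in bits [15:12]        */
      int         has_operand;  /* takes a 12-bit operand or not */
  };

  /* The only target-specific piece - one such table per architecture. */
  static const struct insn_desc target_table[] = {
      { "lda", 0x1, 1 },
      { "sta", 0x2, 1 },
      { "add", 0x3, 1 },
      { "hlt", 0xF, 0 },
  };

  static int assemble(const char *mnemonic, unsigned operand, unsigned *word)
  {
      for (size_t i = 0; i < sizeof target_table / sizeof target_table[0]; i++) {
          if (strcmp(mnemonic, target_table[i].mnemonic) == 0) {
              *word = (target_table[i].opcode << 12)
                    | (target_table[i].has_operand ? (operand & 0x0FFF) : 0);
              return 1;
          }
      }
      return 0;   /* unknown mnemonic */
  }

  int main(void)
  {
      unsigned word;
      if (assemble("lda", 0x042, &word))
          printf("%04X\n", word);   /* prints 1042 */
      return 0;
  }

The fixed "one source line = one instruction word" assumption baked into
assemble() is exactly what broke down on the DSPs mentioned above.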

upsid...@downunder.com

May 27, 2017, 5:36:32 PM
On Sat, 27 May 2017 21:05:18 +0000 (UTC), Grant Edwards
<inv...@invalid.invalid> wrote:

>On 2017-05-27, rickman <gnu...@gmail.com> wrote:
>
>> Someone in another group is thinking of using a portable assembler

The closest I can think of is called "C" :-)

>You'll have to ask them what they mean by "portable assembler".
>
>The only "portable assemblers" I've ever seen were frameworks that let
>you build multiple CPU-specific assemblers from a single source tree.
>
>I vaguely remember one product many years ago that used some sort of
>configuration file to define the mnemonics and instruction format for
>each architecture. The base "executable" for the assembler itself was
>universal.

I have written several (cross)assemblers and disassemblers for various
platforms using some sort of instruction tables. This works well with
some regular instruction sets, such as PDP-11, VAX and 68K and to a
degree with DG Nova.

Other architectures are so seriously oddball that you can only handle
a small subset with some general purpose instruction set description
language. You still had to provide hard coded assembly/disassembly
routines for a large number of instructions, especially privileged
instructions (which were often added to the instruction set later on
and then needed to find free op-codes).

>But that was simply an implementaiton detail. You still had to write
>seperate source code for each CPU using that CPU's instruction set.
>When you were using it, it wasn't any different than using a set of
>seperate assemblers that shared a common macro/directive processor.

Absolutely true.

In addition, some general purpose (dis)assemblers, such as the GNU
tools, define source and destination operand order in a particular way,
while the native assembler may use a different order for source and
destination operands.

Theo Markettos

May 27, 2017, 5:52:47 PM
Dimiter_Popoff <d...@tgi-sci.com> wrote:
> The need for portability arises when you have megabytes of
> sources which are good and need to be moved to another, better
> platform. For smallish projects - anything which would fit in
> an MCU flash - porting is likely a waste of time, rewriting it
> for the new target will be faster if done by the same person
> who has already done it once.

Back in the 80s, lots of software was written in assembly. But it was
common for software to be cross-platform - a popular game might come out for
half a dozen or more machines, using Z80, 6502, 68K, 8086, 6809, etc.

Obviously 'conversion' involved more than just the instruction set - parts
had to be written for the memory available and make use of the platform's
graphics capabilities (which could be substantially different). But were
there tools to handle this, or did the programmers sit down and rewrite the
assembly from scratch for each version?

Theo

Grant Edwards

May 27, 2017, 5:56:46 PM
On 2017-05-27, upsid...@downunder.com <upsid...@downunder.com> wrote:

>>But that was simply an implementaiton detail. You still had to write
>>seperate source code for each CPU using that CPU's instruction set.
>>When you were using it, it wasn't any different than using a set of
>>seperate assemblers that shared a common macro/directive processor.
>
> Absolutely true.
>
> In addition, some general purpose (dis)assemblers, such as the GNU
> tools define source and destination order in a particular way, why the
> native assembler may use different order for source and destination
> operands.

I always found it exceedingly odd that with the Gnu assembler, some of
the meta-level syntax/semantics differed from one target to the next
(e.g. comment delimiters, data directives, etc.).

--
Grant

Grant Edwards

May 27, 2017, 5:58:55 PM
On 2017-05-27, Theo Markettos <theom...@chiark.greenend.org.uk> wrote:

> Back in the 80s, lots of software was written in assembly. But it was
> common for software to be cross-platform - a popular game might come out for
> half a dozen or more machines, using Z80, 6502, 68K, 8086, 6809, etc.
>
> Obviously 'conversion' involved more than just the instruction set - parts
> had to be written for the memory available and make use of the platform's
> graphics capabilities (which could be substantially different). But were
> there tools to handle this, or did the programmers sit down and rewrite the
> assembly from scratch for each version?

Usually the latter.

There were tools that were supposed to help you do things like port
8080 assembly language programs to the 8086, but from what I
read/heard they didn't turn out to be very useful in the real world.

--
Grant


Dimiter_Popoff

May 27, 2017, 6:04:54 PM
I am not aware of tools doing it; they must have been rewritten. The
exception on your list is the 6809, which was source level compatible
with the 6800 (i.e. 6800 code could be assembled into 6809 code -
slightly larger but very similar object code). BTW I still have a 6809
system working under DPS - emulated as a task in a window, running MDOS09
(which ran on the EXORciser systems),
http://tgi-sci.com/misc/sc09em.gif . The 6809 assembler is what I grew
up on back in the '80s.

I may of course be simply unaware of something. I have never looked
into other people's work more than I needed to do what I wanted to
do as fast as I could, many times I may have chosen to reinvent things
simply because this has been the fastest (pre-www) way.

Don Y

May 27, 2017, 6:46:12 PM
On 5/27/2017 2:17 PM, Les Cargill wrote:
> That's what C is for.

Arguably, ANY HLL.

> This being said, I've been doing this for
> 37 years and have only a few times seen an actual need for
> portability - usually, the new hardware is so radically
> different that porting makes little sense.

Depends on how you handle your abstractions in the design.
If you tie the design directly to the hardware, then you've
implicitly made it dependent on that hardware -- without
even being aware of the dependencies.

OTOH, you can opt to create abstractions that give you a "slip sheet"
above the bare iron -- at some (small, if done well) cost in efficiency.
(e.g., "Hardware Abstraction Layer" -- though not necessarily as
explicit or limiting therein)

E.g., my current RTOS moves reasonably well between different hardware
platforms (I'm running on ARM and x86, currently) with the same sorts of
services exported to the higher level API's.

OTOH, the API's explicitly include provisions that allow the "application"
layers to tailor themselves to key bits of the hardware made largely
opaque by the API (e.g., MMU page sizes, number and granularity of hardware
timers, etc.)

But, this changes the level of proficiency required of folks working
with those API's. Arguably, I guess it should (?)

Of course, if you want to shed all "hardware dependencies" and just code
to a POSIX API... <shrug>

One could make an abstraction that is sufficiently *crude* (the equivalent of
single-transistor logic) and force the coder to use that as an implementation
language; then, recognize patterns of "operations" and map those to templates
that correlate with opcodes of a particular CPU (i.e., many operations -> one
opcode). Or, the HLL approach of mapping *an* operation into a sequence of
CPU-specific opcodes. Or, many<->many, in between.

Don Y

May 27, 2017, 6:53:06 PM
On 5/27/2017 2:31 PM, Dimiter_Popoff wrote:
> The need for portability arises when you have megabytes of
> sources which are good and need to be moved to another, better
> platform. For smallish projects - anything which would fit in
> an MCU flash - porting is likely a waste of time, rewriting it
> for the new target will be faster if done by the same person
> who has already done it once.

+1

The *concepts*/design are what you are trying to reuse, not
the *code*.

OTOH, we see increasing numbers of designs migrating into
software that would previously have been done with hardware
as the cost of processors falls and capabilities rise.
This makes it economical to leverage the higher levels of
integration available in an MCU over that of "discretes"
or, worse, a *specific* "custom".

E.g., I can design an electronic tape rule in hardware or
software in roughly the same amount of effort. But, the software
version will be more mutable, in the long term, and leverage
a single "raw part number" (the unprogrammed MCU) in the MRP
system.

OToOH, we are seeing levels of complexity now -- even in SoC's -- that
make "big" projects much more commonplace. I'd hate to have to
recode a cell-phone for a different choice of processor if I'd not
made plans for that contingency in the first place!

Dimiter_Popoff

May 27, 2017, 7:14:08 PM
On 28.5.2017 г. 01:52, Don Y wrote:
> On 5/27/2017 2:31 PM, Dimiter_Popoff wrote:
>> The need for portability arises when you have megabytes of
>> sources which are good and need to be moved to another, better
>> platform. For smallish projects - anything which would fit in
>> an MCU flash - porting is likely a waste of time, rewriting it
>> for the new target will be faster if done by the same person
>> who has already done it once.
>
> +1
>
> The *concepts*/design are what you are trying to reuse, not
> the *code*.
>
> OTOH, we see increasing numbers of designs migrating into
> software that would previously have been done with hardware
> as the costs of processors falls and capabilities rise.
> This makes it economical to leverage the higher levels of
> integration available in an MCU over that of "discretes"
> or, worse, a *specific* "custom".

Well of course, it is where all of us here have been moving to for the
last 25 years or so (for me, since the HC11 days).

>
> E.g., I can design an electronic tape rule in hardware or
> software in roughly the same amount of effort. But, the software
> version will be more mutable, in the long term, and leverage
> a single "raw part number" (the unprogrammed MCU) in the MRP
> system.

Yes of course, but porting does not necessarily mean porting to
another CPU architecture, typically you will reuse the code on
the same one - and modify just some peripheral interactions etc.
sort of thing.

>
> OToOH, we are seeing levels of complexity now -- even in SoC's -- that
> make "big" projects much more commonplace. I'd hate to have to
> recode a cell-phone for a different choice of processor if I'd not
> made plans for that contingency in the first place!
>

Well, phones do not have the flash as part of the SoC; I said
"in the MCU flash", meaning on the same chip. This is what I regard
as a "small" thingie - I can't see what it would have to do to take
up more than 3-4 months of my time, as long as I know what I want
to program.
Anything where external disks and/or "disks" are involved is in the
other category of course.

Dimiter




jim.bra...@ieee.org

May 27, 2017, 7:54:30 PM
> ported to a number of different embedded processors including custom processors in FPGAs

It's possible to do direct threaded code in C. For small projects, the number of threaded code routines is small and highly application specific.
So all the threaded code segments are very portable and the debugging is confined to the threaded code routines (e.g. one can perfect the application in C on a PC and then migrate to any number of custom ISAs).
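
A minimal call-threaded sketch in standard C (all names invented; true
direct threading would need computed goto or assembly, which is less
portable):

  /* The "thread" is an array of pointers to small primitive routines;
     the inner interpreter just walks the array. */
  #include <stdio.h>

  typedef void (*prim_t)(void);

  static int stack[16];
  static int sp;

  static void push5(void) { stack[sp++] = 5; }
  static void push7(void) { stack[sp++] = 7; }
  static void add(void)   { sp--; stack[sp - 1] += stack[sp]; }
  static void print(void) { printf("%d\n", stack[--sp]); }

  /* The application is just this table - portable as-is; only the
     primitives (or a generator targeting a custom ISA) change. */
  static const prim_t thread[] = { push5, push7, add, print, NULL };

  int main(void)
  {
      for (const prim_t *ip = thread; *ip; ip++)
          (*ip)();              /* execute the next word in the thread */
      return 0;
  }

Perfecting the application then means perfecting the thread[] table on a
PC, exactly as described above.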

That said, I am currently creating a system of symbolic constants for all the op-codes and operand values (using VHDL, and for each specific ISA). One can create symbolic constants for various locations in the code (and manually update the constants as code gets inserted or deleted). Per-opcode functions can be defined that make code generation less troublesome. The code (either constant expressions or function calls) is laid out as initialization for the instruction memory. Simulation can be used to debug the code and the ISA. A quick two-step process: edit the code and run the simulator.

One can also write a C (or any other language) program that generates the binary code file, which is then inserted into the FPGA RAM during the FPGA compile step. Typically one writes a separate function for each op-code or label generator (and for each ISA). Two passes are made through all the function calls (i.e. the application program): the first pass to generate the labels and the second pass to generate the binary file. For use with FPGA simulation this is a three-step process: edit the application program, run the binary file generator and run the FPGA simulator.
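
A bare-bones sketch of that two-pass scheme (imaginary accumulator ISA,
all names made up): each op-code is a C function that appends to a
memory image, labels are plain variables captured from the location
counter, and the second pass sees the forward-label values collected by
the first.

  #include <stdio.h>

  #define MEM_WORDS 256
  static unsigned short mem[MEM_WORDS];
  static unsigned pc;      /* location counter                   */
  static int pass;         /* 1 = collect label values, 2 = emit */

  static void emit(unsigned short w)
  {
      if (pass == 2 && pc < MEM_WORDS)
          mem[pc] = w;
      pc++;
  }

  /* per-opcode helpers for an imaginary 16-bit accumulator ISA */
  static void ldi(unsigned imm)  { emit(0x1000 | (imm  & 0x0FFF)); }
  static void dec(void)          { emit(0x2000); }
  static void bnz(unsigned addr) { emit(0x3000 | (addr & 0x0FFF)); }
  static void jmp(unsigned addr) { emit(0x4000 | (addr & 0x0FFF)); }
  static void hlt(void)          { emit(0xF000); }

  static unsigned loop_top, done_lbl;   /* "label generators"          */

  static void program(void)             /* the application program     */
  {
      ldi(10);
      loop_top = pc;                    /* backward label              */
      dec();
      bnz(loop_top);
      jmp(done_lbl);                    /* forward label: 0 on pass 1,
                                           correct value on pass 2     */
      ldi(0);                           /* something for the jump to skip */
      done_lbl = pc;                    /* forward label defined here  */
      hlt();
  }

  int main(void)
  {
      for (pass = 1; pass <= 2; pass++) {
          pc = 0;
          program();
      }
      for (unsigned i = 0; i < pc; i++)   /* memory initialization file */
          printf("%04X\n", (unsigned)mem[i]);
      return 0;
  }

The printed hex words are the memory initialization data to drop into
the FPGA RAM.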

The preferred solution is to support label generators in the memory initialization sections of the VHDL or Verilog code.
I would be very interested to hear whether someone has managed to do label generators.

Don Y

May 27, 2017, 8:14:33 PM
On 5/27/2017 4:14 PM, Dimiter_Popoff wrote:
>> E.g., I can design an electronic tape rule in hardware or
>> software in roughly the same amount of effort. But, the software
>> version will be more mutable, in the long term, and leverage
>> a single "raw part number" (the unprogrammed MCU) in the MRP
>> system.
>
> Yes of course, but porting does not necessarily mean porting to
> another CPU architecture, typically you will reuse the code on
> the same one - and modify just some peripheral interactions etc.
> sort of thing.

Yes, but you can't always be sure of that. I've seen many
products "squirm" when the platform they adopted for early
versions suddenly became unavailable -- or, too costly -- to
support new versions/revisions of the product. This is probably
one of the most maddening positions to be in: having *a* product
and facing a huge re-development just to come up with the NEXT
product in its evolution.

>> OToOH, we are seeing levels of complexity now -- even in SoC's -- that
>> make "big" projects much more commonplace. I'd hate to have to
>> recode a cell-phone for a different choice of processor if I'd not
>> made plans for that contingency in the first place!
>
> Well phones do not have the flash as part of the SoC, I said
> "in the MCU flash", meaning on the same chip. This is what I regard
> as a "small" thingie, can't see what it will have to do to take
> up more that 3-4 months of my time as long as I know what I want
> to program.

But you can pick devices with *megabytes* of on-board (on-chip) FLASH,
nowadays:
<https://www.microchip.com/wwwproducts/en/ATSAM4SD32C>
It seems fairly obvious that more real-estate will find its way into
devices' "memory" allocations.

You could keep a small staff busy just tracking new offerings and
evaluating price/performance points for each. I've discarded several
"finished" hardware designs for my current project because I *know* they'll
be obsolete before the rest of the designs are complete!

Instead, I concentrate on getting all of the software written for the
various applications on hardware that I *know* I won't be using
(I've a stack of a couple dozen identical x86 SBC's that I've been
repurposing for each of the application designs) just to allow me
to have "working prototypes" that the other applications can talk
to as THEY are being developed.

As most of the design effort is in OS and application software -- with
a little bit of specialized hardware I/O development -- the choice of
processor is largely boring (so, why make it NOW?)

Dimiter_Popoff

May 27, 2017, 8:38:15 PM
On 28.5.2017 г. 03:14, Don Y wrote:
> On 5/27/2017 4:14 PM, Dimiter_Popoff wrote:
>>> E.g., I can design an electronic tape rule in hardware or
>>> software in roughly the same amount of effort. But, the software
>>> version will be more mutable, in the long term, and leverage
>>> a single "raw part number" (the unprogrammed MCU) in the MRP
>>> system.
>>
>> Yes of course, but porting does not necessarily mean porting to
>> another CPU architecture, typically you will reuse the code on
>> the same one - and modify just some peripheral interactions etc.
>> sort of thing.
>
> Yes, but you can't always be sure of that. I've seen many
> products "squirm" when the platform they adopted for early
> versions suddenly became unavailable -- or, too costly -- to
> support new versions/revisions of the product.

Of course you can't be sure what other people will do. We can't
really be sure what we'll do ourselves.... :-)

> ... This is probably
> one of the most maddening positions to be in: having *a* product
> and facing a huge re-development just to come up with the NEXT
> product in its evolution.

Exactly this situation forced my hand to create vpa (virtual processor
assembly language). I had several megabytes of good sources written in
68k assembly and the 68k line was going to an end. Sure I could have
used it for another few years but it was obvious I had to move
forward so I did.

>
>>> OToOH, we are seeing levels of complexity now -- even in SoC's -- that
>>> make "big" projects much more commonplace. I'd hate to have to
>>> recode a cell-phone for a different choice of processor if I'd not
>>> made plans for that contingency in the first place!
>>
>> Well phones do not have the flash as part of the SoC, I said
>> "in the MCU flash", meaning on the same chip. This is what I regard
>> as a "small" thingie, can't see what it will have to do to take
>> up more that 3-4 months of my time as long as I know what I want
>> to program.
>
> But you can pick devices with *megabytes* of on-board (on-chip) FLASH,
> nowadays:
> <https://www.microchip.com/wwwproducts/en/ATSAM4SD32C>
> It seems fairly obvious that more real-estate will find its way into
> devices' "memory" allocations.

Well, this part is closer to a "big thing" but not there yet. 160k RAM
is by no means much nowadays; try buffering a 100 Mbps Ethernet link
on that for example. It is still for stuff you can do within a few
months if you know what you want to do. The high level language will
take care of clogging up the 2M flash if anything :D.

Although I must say that my devices (MPC5200b based) have 2M flash
and can boot a fully functional dps off it... including most of the
MCA software. It takes a disk to get the complete functionality
but much of it - OS, windows, shell and all commands etc.
fit in (just between 100 and 200k are used for "BIOS" purposes,
the rest of the 2M is a "disk").
This with 64M RAM of course - and no wallpapers, you need a proper
disk for that :-).

However, I doubt anyone writing in C could fit a tenth of that
in 2M flash, which is why it is there; it is just way more than really
necessary for the RAM that part has, if the programming were not
done at high level.

Don Y

May 27, 2017, 9:42:13 PM
On 5/27/2017 5:38 PM, Dimiter_Popoff wrote:
>> ... This is probably
>> one of the most maddening positions to be in: having *a* product
>> and facing a huge re-development just to come up with the NEXT
>> product in its evolution.
>
> Exactly this situation forced my hand to create vpa (virtual processor
> assembly language). I had several megabytes of good sources written in
> 68k assembly and the 68k line was going to an end. Sure I could have
> used it for another few years but it was obvious I had to move
> forward so I did.

Its one of the overwhelming reasons I opt for HLL's -- I can *buy* a
tool that will convert my code to the target of my choosing. :>

>>>> OToOH, we are seeing levels of complexity now -- even in SoC's -- that
>>>> make "big" projects much more commonplace. I'd hate to have to
>>>> recode a cell-phone for a different choice of processor if I'd not
>>>> made plans for that contingency in the first place!
>>>
>>> Well phones do not have the flash as part of the SoC, I said
>>> "in the MCU flash", meaning on the same chip. This is what I regard
>>> as a "small" thingie, can't see what it will have to do to take
>>> up more that 3-4 months of my time as long as I know what I want
>>> to program.
>>
>> But you can pick devices with *megabytes* of on-board (on-chip) FLASH,
>> nowadays:
>> <https://www.microchip.com/wwwproducts/en/ATSAM4SD32C>
>> It seems fairly obvious that more real-estate will find its way into
>> devices' "memory" allocations.
>
> Well this part is closer to a "big thing" but not there yet. 160k RAM
> is by no means much nowadays, try buffering a 100 Mbps Ethernet link
> on that for example. It is still for stuff you can do within a few
> months if you know what you want to do. The high level language will
> take care of clogging up the 2M flash if anything :D.

I don't think you realize just how clever modern compilers have become.
I've taken portions of old ASM projects and tried to code them in HLL's
to see what the "penalty" would be. It was alarming to see how much
cleverer they have become (over the course of many decades!).

Of course, you need to be working on a processor that is suitable to
their use to truly benefit from their cleverness -- I doubt an 8x300
compiler would beat my ASM code! :>

> Although I must say that my devices (MPC5200b based) have 2M flash
> and can boot a fully functional dps off it... including most of the
> MCA software. It takes a disk to get the complete functionality
> but much of it - OS, windows, shell and all commands etc.
> fit in (just between 100 and 200k are used for "BIOS" purposes,
> the rest of the 2M is a "disk").
> This with 64M RAM of course - and no wallpapers, you need a proper
> disk for that :-).

I use the FLASH solely for initial POST, a secure netboot protocol, a
*tiny* RTOS and "fail safe/secure" hooks to ensure a "mindless" device
can't get into -- or remain in -- an unsafe state (including the field
devices likely tethered to it).

[A device may not be easily accessible to a human user!]

Once "enough" of the hardware is known to be functional, a second level
boot drags in more diagnostics and a more functional protocol stack.

A third level boot drags in the *real* OS and real network stack.

After that, other aspects of the "environment" can be loaded and,
finally, the "applications".

[Of course, any of these steps can fail/timeout and leave me with
a device with *just* the functionality of the FLASH]

Even this level of functionality deferral isn't enough to keep
me from having to "customize" the FLASH in each type of device
(cuz they have different I/O complements). So, big incentive
to come up with a more universal set of "peripherals" just to
cut down on the number of different designs.

upsid...@downunder.com

May 28, 2017, 12:37:10 AM
In the original marketing material, it was claimed that the 8086 was
8080 compatible. However, when the opcode tables were released, it was
quite obvious that this was not the case. Then they claimed assembly
level compatibility ...

rickman

May 28, 2017, 1:28:32 AM
I've never heard they claimed anything other than assembly source
compatibility. It would have been very hard to make the 8086 opcode
compatible with the 8080. They needed to add a lot of new instructions and
the 8080 opcodes used nearly all the space. They would have had to treat
all the new opcodes as extended opcodes wasting a byte on each one. Or
worse, they could have used instruction set modes which would have been a
disaster.

--

Rick C

Tom Gardner

May 28, 2017, 2:33:45 AM
I only saw the assembly level compatibility claims.

I did see the claims that the 80286 and 80386(!) would
be completely compatible.

rickman

May 28, 2017, 3:22:13 AM
With each other? Isn't the 386 upwardly compatible with the 286? I know
the 286 was upward compatible with the 8086, kinda sorta. They used modes
for the different instruction sets and no one liked it. It was a big PITA
to switch modes.

--

Rick C

Tom Gardner

May 28, 2017, 3:49:33 AM
Those were the salesmen's claims. Your points about modes
were obvious to any engineer who read the (blessedly
short) preliminary data sheets.

Unfortunately my 1978 Intel data book is too early
to contain the 8086, so I can't check it there.


Tauno Voipio

May 28, 2017, 5:26:12 AM
I have practical experience of porting 8080 (actually 8085) code
to an 8088 (8086 architecture) when the 8086 family was brand new.
The Intel tool kinda did it, but the resulting code was not good,
so I ended up re-writing the tool output.

IMHO, the best portability with decent run code is with plain C.
If there is assembly code not expressible efficiently with C, it
is so tightly tied with the hardware that it has to be redone anyway.

I ported several embedded systems from 80188 to ARM7TDMI with little
trouble. Of course, the low-level startup and interrupt handling went
to a total re-build, but other code (several hundreds of kB) went rather
painless from Borland C to GCC.

There was a bit more work to port from 68k to ARM, due to the different
endianness of the architectures.

--

-TV

upsid...@downunder.com

May 28, 2017, 8:03:54 AM
On Sun, 28 May 2017 03:22:04 -0400, rickman <gnu...@gmail.com> wrote:

>Tom Gardner wrote on 5/28/2017 2:33 AM:
>> On 28/05/17 05:37, upsid...@downunder.com wrote:
>>> On Sat, 27 May 2017 21:58:50 +0000 (UTC), Grant Edwards
>>> <inv...@invalid.invalid> wrote:
>>>
>>>> On 2017-05-27, Theo Markettos <theom...@chiark.greenend.org.uk> wrote:
>>>>
>>>>> Back in the 80s, lots of software was written in assembly. But it was
>>>>> common for software to be cross-platform - a popular game might come out
>>>>> for
>>>>> half a dozen or more machines, using Z80, 6502, 68K, 8086, 6809, etc.
>>>>>
>>>>> Obviously 'conversion' involved more than just the instruction set - parts
>>>>> had to be written for the memory available and make use of the platform's
>>>>> graphics capabilities (which could be substantially different). But were
>>>>> there tools to handle this, or did the programmers sit down and rewrite the
>>>>> assembly from scratch for each version?
>>>>
>>>> Usually the latter.
>>>>
>>>> There were tools that were supposed to help you do things like port
>>>> 8080 assmebly language programs to the 8086, but from what I
>>>> read/heard they didn't turn out to be very useful in the real world.
>>>
>>> In original marketing material, itt was claimed that the 8086 was
>>> 8080 compatible. However, when the opcode tables were released, it was
>>> quite obvious that this was not the case. Then they claimed assembly
>>> level compatibility ...
>>
>> I only saw the assembly level compatibility claims.

I am referring to times long before you could get any usable
information even with NDA.

>>
>> I did see the claims that the 80286 and 80386(!) would
>> be completely compatible.
>
>With each other? Isn't the 386 upwardly compatible with the 286? I know
>the 286 was upward compatible with the 8086, kinda sorta. They used modes
>for the different instruction sets and no one liked it. It was a big PITA
>to switch modes.

All the x86 series PC processors start in Real86 mode in order to run
the BIOS; the software then switches to more advanced modes, up to and
including the 64 bit modes.

The problem with 286 was that it contained an instruction for going
from Real86 to Protected286 mode. However, there was no direct way of
going back to Real86 apart from doing a reset.

The 386 did things better and also contained the Virtual86 mode,
usable for running MS-DOS programs (in user mode).

As for compatibility between the 286 and the 386, the kernel mode was
quite different.

Some here have criticized the concept of mode bits to select between
different instruction sets. For a microprogrammed machine, this just
requires a bit more microcode.

For instance, the VAX-11/7xx executed the PDP-11 user mode instructions
quite well. For at least five years we had only a VAX, on which we also
did all our RSX-11 PDP-11 program development. The same user mode
executable ran nicely on a real PDP-11 as well as on a VAX, by just
copying the executable one way or the other.

The only nuisance was that the VAX file system used decimal notation
for file version numbers while a native PDP-11 used octal :-). Thus, a
program compiled on a real PDP-11 rejected file references like
FILE.TXT;9, but of course usually the newest version was needed and
hence no explicit version numbers were used on the command line.

On IBM mainframes, sometimes previous generation instruction sets were
"emulated" thus allowing the most recent hardware run the previous
generation OS and programs. That emulator might run a two generation
old OS and programs and so on. I have no idea how much of these older
generation instructions were partially executed on new hardware and
how much was software emulated instruction by instruction.

Hans-Bernhard Bröker

May 28, 2017, 8:36:16 AM
Am 27.05.2017 um 23:56 schrieb Grant Edwards:

> I always found it exceedingly odd that with the Gnu assembler, some of
> the meta-level syntax/semantics differed from one target to the next
> (e.g. comment delimiters, data directives, etc.).

Yeah, well, that's what you get for trying to emulate a whole bunch of
different, pre-existing assemblers made by big companies who all excel
at doing things differently just for the heck of it. In other words,
that's the price of fighting an up-hill battle against epidemically
ingrained not-invented-here syndrome.

Stefan Reuther

May 28, 2017, 12:51:07 PM
Am 27.05.2017 um 23:31 schrieb Dimiter_Popoff:
> On 28.5.2017 г. 00:17, Les Cargill wrote:
>> rickman wrote:
>>> Or do I misunderstand how a portable assembler works? Does it
>>> require a specific assembly language source format for each target
>>> just like using the standard assembler for the target?
>>
>> That's what C is for.
>
> Or Basic. Or Fortran etc.
>
> However, they are by far not what a "portable assembler" - existing
> under the name Virtual Processor Assembler in our house is.
> And never will be, like any high level language C is yet another
> phrase book - convenient when you need to do a quick interaction
> when you don't speak the language - and only then.

So, then, what is a portable assembler?

One major variable of a processor architecture is the number of
registers, and what you can do with them. On one side of the spectrum,
we have PICs or 6502 with pretty much no registers, on the other side,
there's things like x86_64 or ARM64 with plenty 64-bit registers. Using
an abstraction like C to let the compiler handle the distinction (which
register to use, when to spill) sounds like a pretty good idea to me. If
you were closer to assembler, you'd either limit yourself to an
unhelpfully small subset that works everywhere, or to a set that works
only in one or two places.
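
To make that concrete with a toy example (the per-target notes in the
comments are approximate recollections, not verified compiler output):

  /* The same portable C source is mapped by the compiler onto whatever
     register set the target has. */
  #include <stdio.h>

  long dot3(const long *a, const long *b)
  {
      /* x86_64: a,b arrive in rdi/rsi, products formed in scratch
       *         registers, result returned in rax - no spills.
       * ARM32:  a,b in r0/r1, intermediates in r2/r3, result in r0.
       * A PIC- or 6502-class part has no register file to speak of,
       * so every operand is staged through memory/the accumulator. */
      return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
  }

  int main(void)
  {
      long a[3] = { 1, 2, 3 }, b[3] = { 4, 5, 6 };
      printf("%ld\n", dot3(a, b));   /* 32 */
      return 0;
  }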


Stefan

Dimiter_Popoff

May 28, 2017, 1:47:32 PM
On 28.5.2017 г. 19:45, Stefan Reuther wrote:
> Am 27.05.2017 um 23:31 schrieb Dimiter_Popoff:
>> On 28.5.2017 г. 00:17, Les Cargill wrote:
>>> rickman wrote:
>>>> Or do I misunderstand how a portable assembler works? Does it
>>>> require a specific assembly language source format for each target
>>>> just like using the standard assembler for the target?
>>>
>>> That's what C is for.
>>
>> Or Basic. Or Fortran etc.
>>
>> However, they are by far not what a "portable assembler" - existing
>> under the name Virtual Processor Assembler in our house is.
>> And never will be, like any high level language C is yet another
>> phrase book - convenient when you need to do a quick interaction
>> when you don't speak the language - and only then.
>
> So, then, what is a portable assembler?

One which is not tied to a particular architecture, rather to an
idealized machine model.
It makes sense to use this assuming that processors evolve towards
better, larger register sets - which has been the case for the last few
decades. It would be impractical to try to assemble something written
once for say 68k and then assemble it for a 6502 - perhaps doable but
insane.

>
> One major variable of a processor architecture is the number of
> registers, and what you can do with them. On one side of the spectrum,
> we have PICs or 6502 with pretty much no registers, on the other side,
> there's things like x86_64 or ARM64 with plenty 64-bit registers. Using
> an abstraction like C to let the compiler handle the distinction (which
> register to use, when to spill) sounds like a pretty good idea to me.

Using a phrase book is of course a good idea if you want to conduct
a quick conversation.
It is a terrible idea if you try to use the language for years and
choose to stay confined within the phrases you have in the book.

> If you were more close to assembler, you'd either limit yourself to an
> unuseful subset that works everywhere, or to a set that works only in
> one or two places.

Like I said before, there is no point in writing code which can work
on any processor ever made. I have no time to waste on that, I just need
my code to be working on what is the best silicon available. This
used to be 68k, now it is power. You have to program with some
constraints - e.g. knowing that the "assembler" (which in reality
is more a compiler) may use r3-r4 as it wishes and not preserve
them on a per line basis etc.
Since the only person who could make a comparison between a HLL and
my vpa is me, I can say it has made me orders of magnitude more
efficient. Obviously you can take my word for that or ignore it,
I can only say what I know.

Grant Edwards

May 28, 2017, 9:59:53 PM
On 2017-05-28, Stefan Reuther <stefa...@arcor.de> wrote:
> Am 27.05.2017 um 23:31 schrieb Dimiter_Popoff:
>> On 28.5.2017 г. 00:17, Les Cargill wrote:
>>> rickman wrote:
>>>> Or do I misunderstand how a portable assembler works? Does it
>>>> require a specific assembly language source format for each target
>>>> just like using the standard assembler for the target?
>>>
>>> That's what C is for.
>>
>> Or Basic. Or Fortran etc.
>>
>> However, they are by far not what a "portable assembler" - existing
>> under the name Virtual Processor Assembler in our house is.
>> And never will be, like any high level language C is yet another
>> phrase book - convenient when you need to do a quick interaction
>> when you don't speak the language - and only then.
>
> So, then, what is a portable assembler?

1) It's what Unicorns use when writing code to run the automation
equipment used by Elves to mass-produce cookies inside hollow trees.

2) It's a trigger phrase that indicates the person using it is
delusional and is about to lure you into a time-sink of
relativistic proportions.

If I were you, I'd either smile politely and change the topic or just
turn and run.

--
Grant

Dimiter_Popoff

May 28, 2017, 10:21:36 PM
Oh smile as much as you want. Then try to match 10% of what I have made
and try to smile again.

Les Cargill

May 28, 2017, 11:04:04 PM
Dimiter_Popoff wrote:
> On 28.5.2017 г. 00:17, Les Cargill wrote:
>> rickman wrote:
>>> Someone in another group is thinking of using a portable assembler
>>> to write code for an app that would be ported to a number of
>>> different embedded processors including custom processors in FPGAs.
>>> I'm wondering how useful this will be in writing code that will
>>> require few changes across CPU ISAs and manufacturers.
>>>
>>> I am aware that there are many aspects of porting between CPUs that
>>> is assembly language independent, like writing to Flash memory. I'm
>>> more interested in the issues involved in trying to use a universal
>>> assembler to write portable code in general. I'm wondering if it
>>> restricts the instructions you can use or if it works more like a
>>> compiler where a single instruction translates to multiple target
>>> instructions when there is no one instruction suitable.
>>>
>>> Or do I misunderstand how a portable assembler works? Does it
>>> require a specific assembly language source format for each target
>>> just like using the standard assembler for the target?
>>>
>>
>> That's what C is for.
>
> Or Basic. Or Fortran etc.
>

Not so much. Perhaps Fortran plus say, LINPACK.

> However, they are by far not what a "portable assembler" - existing
> under the name Virtual Processor Assembler in our house is.
> And never will be, like any high level language C is yet another
> phrase book - convenient when you need to do a quick interaction
> when you don't speak the language - and only then.
>

"I am returning this tobacconist; it is scratched." - Monty Python.

It has been a long time since C presented a serious constraint
in performance for me.

>>
>> This being said, I've been doing this for
>> 37 years and have only a few times seen an actual need for
>> portability - usually, the new hardware is so radically
>> different that porting makes little sense.
>>
>
> The need for portability arises when you have megabytes of
> sources which are good and need to be moved to another, better
> platform.

Mostly, I've seen the source code outlast the company for
which it was written :)

I would personally view "megabytes of source" as an opportunity to
infuse a system with better ideas through a total rewrite. I
understand that this view is rarely shared; people prefer the
arbitrage of technical debt.


> For smallish projects - anything which would fit in
> an MCU flash - porting is likely a waste of time, rewriting it
> for the new target will be faster if done by the same person
> who has already done it once.
>

L'il MCU projects are essentially disposable. Too many
heresies.

> Dimiter
>
> ------------------------------------------------------
> Dimiter Popoff, TGI http://www.tgi-sci.com
> ------------------------------------------------------
> http://www.flickr.com/photos/didi_tgi/
>
>
>
>

--
Les Cargill

Tim Wescott

May 29, 2017, 12:59:17 AM
On Sat, 27 May 2017 16:17:57 -0500, Les Cargill wrote:

> rickman wrote:
>> Someone in another group is thinking of using a portable assembler to
>> write code for an app that would be ported to a number of different
>> embedded processors including custom processors in FPGAs. I'm wondering
>> how useful this will be in writing code that will require few changes
>> across CPU ISAs and manufacturers.
>>
>> I am aware that there are many aspects of porting between CPUs that is
>> assembly language independent, like writing to Flash memory. I'm more
>> interested in the issues involved in trying to use a universal
>> assembler to write portable code in general. I'm wondering if it
>> restricts the instructions you can use or if it works more like a
>> compiler where a single instruction translates to multiple target
>> instructions when there is no one instruction suitable.
>>
>> Or do I misunderstand how a portable assembler works? Does it require
>> a specific assembly language source format for each target just like
>> using the standard assembler for the target?
>>
>>
> That's what C is for.
>
> This being said, I've been doing this for 37 years and have only a few
> times seen an actual need for portability - usually, the new hardware is
> so radically different that porting makes little sense.

I have used some very good portable C code across three or four different
architectures (depending on whether you view a 188 and a 286 as different
architectures). This all in one company over the span of 9 years or so.

So -- perhaps your scope is limited?

--
www.wescottdesign.com

David Brown

May 29, 2017, 3:53:05 AM
On 27/05/17 23:36, upsid...@downunder.com wrote:
> On Sat, 27 May 2017 21:05:18 +0000 (UTC), Grant Edwards
> <inv...@invalid.invalid> wrote:
>
>> On 2017-05-27, rickman <gnu...@gmail.com> wrote:
>>
>>> Someone in another group is thinking of using a portable assembler
>
> The closest I can think of is called "C" :-)
>

Sometimes people call C a "portable assembly" - they are wrong. But one
of the purposes of C is so that you don't /need/ assembly, portable or not.

What has been discussed so far in this branch (I haven't read the whole
thread yet) has been a retargetable assembler - a way to generate an
assembler program for different processors without going through all the
work each time. Such tools have existed for many years, and are an
efficient way to make an assembler if you need to cover more than one
target. They don't help much for writing the actual target assembly
code, however - though usually you can share the same directives
(commands for sections, macros, etc.). GNU binutils "gas" is the most
widely used example.


As far as a portable assembly language is concerned, that does not and
cannot exist. Assembly language is by definition too tightly connected
to the ISA of the target. It is possible to have a language that is
higher abstraction than assembler, but still lower level and with
tighter control than C, and which can be translated/compiled to
different target assemblies. LLVM is a prime example.






David Brown

May 29, 2017, 4:44:00 AM
On 27/05/17 23:52, Theo Markettos wrote:
> Dimiter_Popoff <d...@tgi-sci.com> wrote:
>> The need for portability arises when you have megabytes of
>> sources which are good and need to be moved to another, better
>> platform. For smallish projects - anything which would fit in
>> an MCU flash - porting is likely a waste of time, rewriting it
>> for the new target will be faster if done by the same person
>> who has already done it once.
>
> Back in the 80s, lots of software was written in assembly. But it was
> common for software to be cross-platform - a popular game might come out for
> half a dozen or more machines, using Z80, 6502, 68K, 8086, 6809, etc.
>
> Obviously 'conversion' involved more than just the instruction set - parts
> had to be written for the memory available and make use of the platform's
> graphics capabilities (which could be substantially different). But were
> there tools to handle this, or did the programmers sit down and rewrite the
> assembly from scratch for each version?
>

Writing a game involves a great deal more than just the coding.
Usually, the coding is in fact just a small part of the whole effort -
all the design of the gameplay, the storyline, the graphics, the music,
the algorithms for interaction, etc., is inherently cross-platform. The
code structure and design is also mostly cross-platform. Some parts
(the graphics and the music) need to be adapted to suit the limitations of the
different target platforms. The final coding in assembly would be done
by hand for each target.

David Brown

May 29, 2017, 5:00:24 AM
On 27/05/17 23:31, Dimiter_Popoff wrote:
> On 28.5.2017 г. 00:17, Les Cargill wrote:
>> rickman wrote:
>>> Someone in another group is thinking of using a portable assembler
>>> to write code for an app that would be ported to a number of
>>> different embedded processors including custom processors in FPGAs.
>>> I'm wondering how useful this will be in writing code that will
>>> require few changes across CPU ISAs and manufacturers.
>>>
>>> I am aware that there are many aspects of porting between CPUs that
>>> is assembly language independent, like writing to Flash memory. I'm
>>> more interested in the issues involved in trying to use a universal
>>> assembler to write portable code in general. I'm wondering if it
>>> restricts the instructions you can use or if it works more like a
>>> compiler where a single instruction translates to multiple target
>>> instructions when there is no one instruction suitable.
>>>
>>> Or do I misunderstand how a portable assembler works? Does it
>>> require a specific assembly language source format for each target
>>> just like using the standard assembler for the target?
>>>
>>
>> That's what C is for.
>
> Or Basic. Or Fortran etc.

Exactly - you use a programming language appropriate for the job. For
most low-level work, that is C (or perhaps C++, if you /really/ know
what you are doing). Some parts of your code will be target-specific C,
some parts will be portable C. And a few tiny bits will be assembly or
"intrinsic functions" that are assembly made to look like C functions.

Most of the assembly used will actually be written by the toolchain
provider (startup code, library code, etc.) - and if you are using a
half-decent processor, this would almost certainly have been better
written in C than assembly.

C is /not/ a "portable assembly" - it means you don't /need/ a portable
assembly.

>
> However, they are by far not what a "portable assembler" - existing
> under the name Virtual Processor Assembler in our house is.

No, it is not a "portable assembler". It is just a translator to
generate PPC assembly from 68K assembly, because you had invested so
much time and code in 68K assembly and wanted to avoid re-writing
everything for the PPC. That's a reasonable enough business strategy,
and an alternative to writing an emulator for the 68K on the PPC, or
some sort of re-compiler.

But it is not a "portable assembler". If you can take code written in
your VPA and translate it into PIC, 8051, msp430, ARM, and x86 assembly,
in a way that gives near-optimal efficiency on each target, while
letting you write your VPA code knowing exactly which instructions will
be generated on the target, /then/ you would have a portable assembler.
But such a language cannot be made, for obvious reasons.

What you have is a two-target sort-of assembler that gives you
reasonable code on two different targets. You could also say that you
have your own personal low-level programming language with compiler
backends for two different targets. Again, that's fair enough - and if
it lets you write the code you want, great. But it is not a portable
assembly.


> And never will be, like any high level language C is yet another
> phrase book - convenient when you need to do a quick interaction
> when you don't speak the language - and only then.

Spoken like a true fanatic (or salesman).

>
>>
>> This being said, I've been doing this for
>> 37 years and have only a few times seen an actual need for
>> portability - usually, the new hardware is so radically
>> different that porting makes little sense.
>>
>

Don Y

May 29, 2017, 5:33:56 AM
On 5/27/2017 2:52 PM, Theo Markettos wrote:
> Dimiter_Popoff <d...@tgi-sci.com> wrote:
>> The need for portability arises when you have megabytes of
>> sources which are good and need to be moved to another, better
>> platform. For smallish projects - anything which would fit in
>> an MCU flash - porting is likely a waste of time, rewriting it
>> for the new target will be faster if done by the same person
>> who has already done it once.
>
> Back in the 80s, lots of software was written in assembly.

For embedded systems (before we called them that), yes. There were
few compilers that were really worth the media they were delivered
on -- and few meant to generate code for bare iron.

> But it was
> common for software to be cross-platform - a popular game might come out for
> half a dozen or more machines, using Z80, 6502, 68K, 8086, 6809, etc.
>
> Obviously 'conversion' involved more than just the instruction set - parts
> had to be written for the memory available and make use of the platform's
> graphics capabilities (which could be substantially different). But were
> there tools to handle this, or did the programmers sit down and rewrite the
> assembly from scratch for each version?

Speaking from the standpoint of the *arcade* game industry, games were
developed on hardware specific to that particular game (trying, where possible,
to leverage as much of a previous design as possible -- for reasons of
economy).

Most games were coded from scratch in ASM; very little "lifted" from Game X
to act as a basis for Game Y (this slowly changed, over time -- but, mainly
in terms of core services... runtime executives predating "real" OS's).

Often, the hardware was *very* specific to the game (e.g., a vector graphic
display didn't draw vectors in a frame buffer but, rather, directly controlled
the deflection amplifiers -- X & Y -- of the monitor to move the "beam" around
the display tube in a particular path). As such, the "display I/O" wasn't
really portable in an economic sense -- no reason to make a Z80 version of
a 6502-based game with that same wonky display hardware. E.g., Atari had a
vector graphic display system (basically, a programmable display controller)
that could ONLY draw curves -- because curves were so hard to draw with a
typical vector graphic processor! (You'd note that every "line segment" on
the display was actually a curve of a particular radius)

Also, games taxed their hardware to the limit. There typically wasn't an "idle
task" that burned excess CPU cycles; all cycles were used to make the game
"do more" (players are demanding). The hardware was designed to leverage
whatever features the host CPU (often more than one CPU for different aspects
of the game -- e.g., "sound" was its own processor, etc.) to the greatest
advantage. E.g., 680x processors were a delight to interface to a frame buffer
as the bus timing directly lent itself to "display controller gets access
to the frame buffer for THIS half clock cycle... and the CPU gets access
for the OTHER half cycle" (no wait states, as would be the case with a
processor having variable bus cycle timings, e.g., the Z80).

Many manufacturers invested in full custom chips to add value (and make the
games harder to counterfeit).

A port of a game to another processor (and perhaps entire hardware platform)
typically meant rewriting the entire game, from scratch. But, 1980's games
(arcade pieces) weren't terribly big -- tens of KB of executables. Note
that any graphics for the game were directly portable (many of the driving
games and some of the Japanese pseudo-3D games had HUGE image ROMs that
were displayed by dedicated hardware -- under the control of the host CPU).

In practical terms, these were small enough projects that *seeing* one that
already works (that YOU coded or someone at your firm/affiliate coded) was
the biggest hurdle to overcome; you know how the game world operates,
the algorithms for the "robots", what the effects should look like, etc.

If you look at emulations of these games (e.g., MAME), you will see that they
aren't literal copies but, rather, just intended to make you THINK you're
playing the original game (because the timing of the algorithms in the
emulations isn't the same as that in the original game). E.g., the host
(application) typically synchronized its actions to the position of the "beam"
repainting the display from the frame buffer (in the case of a raster game;
similar concepts for vector games) to minimize visual artifacts (like "object
tearing") and provide other visual features ("OK, the beam has passed this
portion of the display, we can now go in and alter it in preparation for
its next pass, through")

In a sense, the games were small systems, by today's standards. Indeed, many
could be *emulated* on SoC's, today -- for far less money than their original
hardware and far less development time!

Don Y

unread,
May 29, 2017, 5:57:19 AM5/29/17
to
On 5/28/2017 8:04 PM, Les Cargill wrote:
>> The need for portability arises when you have megabytes of
>> sources which are good and need to be moved to another, better
>> platform.
>
> Mostly, I've seen the source code outlast the company for
> which it was written :)

Or, the platform on which it was originally intended to run!

OTOH, there are many "regulated" industries where change is
NOT seen as "good". Where even trivial changes can have huge
associated costs (e.g., formal validation, reestablishing
performance and reliability data, etc.)

[I've seen products that required the manufacturer to scour the
"used equipment" markets in order to build more devices simply
because the *new* equipment on which the design was based was no
longer being sold!]

[[I've a friend here who hoards big, antique (Sun) iron because
his enterprise systems *run* on old SPARCservers and the cost of
replacing/redesigning the software to run on new/commodity hardware
and software is simply too far beyond the company's means!]]

> I would personally view "megabytes of source" as an opportunity to
> infuse a system with better ideas through a total rewrite. I
> understand that this view is rarely shared; people prefer the
> arbitrage of technical debt.

I've never seen this done, successfully. The "second system"
effect seems to sabotage these attempts -- even for veteran
developers! Instead of reimplementing the *same* system,
they let feeping creaturism take over. The more developers,
the more "pet features" try to weasel their way into the
new design.

As each *seems* like a tiny little change, no one ever approaches
any of them with a serious evaluation of their impact(s) on the
overall system. And, everyone is chagrined at how much *harder*
it is to actually fold these changes into the new design -- because
the new design was conceived with the OLD design in mind (i.e., WITHOUT
these additions -- wasn't that the whole point of this effort?).

Meanwhile, your (existing) market is waiting on the new release of the
OLD product (with or without the new features) instead of a truly NEW
product.

And, your competitors are focused on their implementations of "better"
products (no one wants to play "catch-up"; they all aim to "leap-frog").

Save your new designs for new products!

Dimiter_Popoff

unread,
May 29, 2017, 8:08:10 AM5/29/17
to
On 29.5.2017 г. 12:00, David Brown wrote:
> On 27/05/17 23:31, Dimiter_Popoff wrote:
>.....
>>
>> However, they are by far not what a "portable assembler" - existing
>> under the name Virtual Processor Assembler in our house is.
>
> No, it is not a "portable assembler". It is just a translator to
> generate PPC assembly from 68K assembly, ....

I might agree with that - if we understand "portable" as "universally
portable".

>
> But it is not a "portable assembler". If you can take code written in
> your VPA and translate it into PIC, 8051, msp430, ARM, and x86 assembly,

Well, who in his right mind would try to port serious 68020 or similar
code to a PIC or MSP430, etc.?
I am talking about what is practical and has worked for me. It would be
a pain to port back from code I have written for power to
something with fewer registers but within reason it is possible and
can even be practical. Yet porting to power has been easier because
it had more registers than the original 68k and many other things,
it is just more powerful and very well thought out; whoever did it knew
what he was doing. It even has little endian load and store opcodes...
(I wonder if ARM has big endian load/store opcodes.)

Yet I agree it is not an "assembler" I suppose. I myself refer to it
at times as a compiler, then as an assembler... It can generate many
lines per statement - many opcodes, e.g. the 64/32 bit divide the 68020
has is done in a loop, no way around that (17 opcodes, just counted it).
Practically the same as what any other compiler would have to do.

> in a way that gives near-optimal efficiency on each target, while
> letting you write your VPA code knowing exactly which instructions will
> be generated on the target, /then/ you would have a portable assembler.

It comes pretty close to that as long as your CPU has 32 registers,
but you need to know exactly what each line does only during debugging,
running step by step through the native code.

>> And never will be, like any high level language C is yet another
>> phrase book - convenient when you need to do a quick interaction
>> when you don't speak the language - and only then.
>
> Spoken like a true fanatic (or salesman).

It may sound so but it is not what I intended.
VPA has made me a lot more efficient than anyone else I have been able
to compare myself with. Since I also am only human it can't be down
to me, not by *that* much. It has to be down to something else; in all
likelihood it is the toolchain I use. My "phrasebook" comment
stays I'm afraid.

David Brown

unread,
May 29, 2017, 9:06:37 AM5/29/17
to
On 29/05/17 14:08, Dimiter_Popoff wrote:
> On 29.5.2017 г. 12:00, David Brown wrote:
>> On 27/05/17 23:31, Dimiter_Popoff wrote:
>> .....
>>>
>>> However, they are by far not what a "portable assembler" - existing
>>> under the name Virtual Processor Assembler in our house is.
>>
>> No, it is not a "portable assembler". It is just a translator to
>> generate PPC assembly from 68K assembly, ....
>
> I might agree with that - if we understand "portable" as "universally
> portable".
>

"Universally portable" is perhaps a bit ambitious :-) But to be called
a "portable assembly", I would expect a good deal more than two
architectures that are relatively closely related (32-bit, reasonably
orthogonal instruction sets, big endian). I would imagine that
translating 68K assembly into PPC assembly is mostly straightforward -
unlike translating it into x86, or even ARM. (The extra registers on
the PPC give you the freedom you need for converting complex addressing
modes on the 68K into reasonable PPC code - while the ARM has fewer
registers available.)

>>
>> But it is not a "portable assembler". If you can take code written in
>> your VPA and translate it into PIC, 8051, msp430, ARM, and x86 assembly,
>
> Well, who in his right mind would try to port serious 68020 or similar
> code to a PIC or MSP430, etc.?

If the code were /portable/ assembly, then it would be possible.
Standard C code will work fine on the msp430, ARM, x86, 68K and PPC -
though it is unlikely to be efficient on a PIC or 8051.

> I am talking about what is practical and has worked for me. It would be
> a pain to port back from code I have written for power to
> something with fewer registers but within reason it is possible and
> can even be practical. Yet porting to power has been easier because
> it had more registers than the original 68k and many other things,
> it is just more powerful and very well thought out; whoever did it knew
> what he was doing. It even has little endian load and store opcodes...
> (I wonder if ARM has big endian load/store opcodes.)
>

(I agree that the PPC is a fine ISA, and have enjoyed using it on a
couple of projects. ARM Cortex M, the most prevalent cores for
microcontrollers, does not have big endian load or store opcodes. But
it has byte-reverse instructions for both 16-bit and 32-bit values. The
traditional ARM instruction set may have them - I am not as familiar
with that.)

If it were /portable/ assembly, then your code that works well for the
PPC would automatically work well for the 68K.

The three key points about assembly, compared to other languages, are
that you know /exactly/ what instructions will be generated, including
the ordering, register choices, etc., that you can access /all/ features
of the target cpu, and that you can write code that is as efficient as
possible for the target. There is simply no way for this to be portable.
Code written for the 68k may use complex addressing modes - they need
multiple instructions in PPC assembly. If you do this mechanically, you
will know exactly what instructions this generates - but the result will
not be as efficient as code that re-uses registers or re-orders
instructions for better pipelining. Code written for the PPC may use
more registers than are available on the 68K - /something/ has to give.

Thus your VPA may be a fantastic low-level programming language (having
never used it or seen it, I can't be sure - but I'm sure you would not
have stuck with it if it were not good!). But it is not a portable
assembly language - it cannot let you write assembly-style code for more
than one target.

> Yet I agree it is not an "assembler" I suppose. I myself refer to it
> at times as a compiler, then as an assembler... It can generate many
> lines per statement - many opcodes, e.g. the 64/32 bit divide the 68020
> has is done in a loop, no way around that (17 opcodes, just counted it).
> Practically the same as what any other compiler would have to do.

That's fine - you have a low-level language and a compiler, not a
portable assembler.

Some time it might be fun to look at some example functions, compiled
for either the 68K or the PPC (or, better still, both) and compare both
the source code and the generated object code to modern C and modern C
compilers. (Noting that the state of C compilers has changed a great
deal since you started making VPA.)

>
>> in a way that gives near-optimal efficiency on each target, while
>> letting you write your VPA code knowing exactly which instructions will
>> be generated on the target, /then/ you would have a portable assembler.
>
> It comes pretty close to that as long as your CPU has 32 registers,
> but you need to know exactly what each line does only during debugging,
> running step by step through the native code.
>
>>> And never will be, like any high level language C is yet another
>>> phrase book - convenient when you need to do a quick interaction
>>> when you don't speak the language - and only then.
>>
>> Spoken like a true fanatic (or salesman).
>
> It may sound so but it is not what I intended.

Fair enough.

> VPA has made me a lot more efficient than anyone else I have been able
> to compare myself with. Since I also am only human it can't be down
> to me, not by *that* much. It has to be down to something else; in all
> likelihood it is the toolchain I use. My "phrasebook" comment
> stays I'm afraid.
>

Good comparisons are, of course, extremely difficult - and not least,
extremely expensive. You would need to do large scale experiments with
at least dozens of programmers working on a serious project before you
could compare efficiency properly.


Grant Edwards

unread,
May 29, 2017, 9:48:41 AM5/29/17
to
On 2017-05-29, Dimiter_Popoff <d...@tgi-sci.com> wrote:
> On 29.5.2017 г. 12:00, David Brown wrote:
>> On 27/05/17 23:31, Dimiter_Popoff wrote:
>>.....
>>>
>>> However, they are by far not what a "portable assembler" - existing
>>> under the name Virtual Processor Assembler in our house is.
>>
>> No, it is not a "portable assembler". It is just a translator to
>> generate PPC assembly from 68K assembly, ....
>
> I might agree with that - if we understand "portable" as "universally
> portable".
>
>>
>> But it is not a "portable assembler". If you can take code written in
>> your VPA and translate it into PIC, 8051, msp430, ARM, and x86 assembly,
>
> Well, who in his right mind would try to port serious 68020 or similar
> code to a PIC or MSP430, etc.?

Nobody.

Yet, that's what a "Universal Assembler" would be able to do.

> I am talking about what is practical and has worked for me.

And it is not anything close to a "Universal Assembler".

--
Grant

Stefan Reuther

unread,
May 29, 2017, 12:47:51 PM5/29/17
to
On 28.05.2017 at 19:47, Dimiter_Popoff wrote:
> On 28.5.2017 г. 19:45, Stefan Reuther wrote:
>> On 27.05.2017 at 23:31, Dimiter_Popoff wrote:
>>> However, they are by far not what a "portable assembler" - existing
>>> under the name Virtual Processor Assembler in our house is.
>>> And never will be, like any high level language C is yet another
>>> phrase book - convenient when you need to do a quick interaction
>>> when you don't speak the language - and only then.
>>
>> So, then, what is a portable assembler?
>
> One which is not tied to a particular architecture, rather to an
> idealized machine model.

So, to what *is* it tied then? What is its *concrete* machine model?

> It makes sense to use this assuming that processors evolve towards
> better, larger register sets - which has been the case last few
> decades. It would be impractical to try to assemble something written
> once for say 68k and then assemble it for a 6502 - perhaps doable but
> insane.

Doable and not insane with C.

Actually, you can program the 6502 in C++17.

>> One major variable of a processor architecture is the number of
>> registers, and what you can do with them. On one side of the spectrum,
>> we have PICs or 6502 with pretty much no registers, on the other side,
>> there's things like x86_64 or ARM64 with plenty 64-bit registers. Using
>> an abstraction like C to let the compiler handle the distinction (which
>> register to use, when to spill) sounds like a pretty good idea to me.
>
> Using a phrase book is of course a good idea if you want to conduct
> a quick conversation.
> It is a terrible idea if you try to use the language for years and
> choose to stay confined within the phrases you have in the book.

My point being: if you work on assembler level, that is: registers,
you'll not have anything more than a phrase book. A C compiler can use
knowledge from one phrase^Wstatement and carry it into the next, and it
can use grammar to generate not only "a = b + c" and "x = y * z", but
also "a = b + (y*z)".

>> If you were more close to assembler, you'd either limit yourself to an
>> unuseful subset that works everywhere, or to a set that works only in
>> one or two places.
>
> Like I said before, there is no point to write code which can work
> on any processor ever made. I have no time to waste on that, I just need
> my code to be working on what is the best silicon available. This
> used to be 68k, now it is power. You have to program with some
> constraints - e.g. knowing that the "assembler" (which in reality
> is more a compiler) may use r3-r4 as it wishes and not preserve
> them on a per line basis etc.

I am not an expert in either of these two architectures, but 68k has 8
data + 8 address registers whereas Power has 32 GPRs. If you work on a
virtual pseudo-assembler level you probably ignore most of your Power.

A classic compiler will happily use as many registers as it finds useful.

The only possible gripe with C would be that it has no easy way to write
a memory cell by number. But a simple macro fixes that.
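
For example (my sketch of the kind of macro I mean - the address in the
usage comment is made up):

#include <stdint.h>

/* access a memory cell by absolute address, as a volatile 32-bit word */
#define MEM32(addr)  (*(volatile uint32_t *)(addr))

/* usage, e.g. for a (hypothetical) register at address 0x40001000:    */
/*   MEM32(0x40001000) = 0x12345678;                                   */
/*   uint32_t v = MEM32(0x40001000);                                   */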


Stefan

Dimiter_Popoff

unread,
May 29, 2017, 1:02:29 PM5/29/17
to
On 29.5.2017 г. 16:06, David Brown wrote:
> .... I would imagine that
> translating 68K assembly into PPC assembly is mostly straightforward -
> unlike translating it into x86, or even ARM. (The extra registers on
> the PPC give you the freedom you need for converting complex addressing
> modes on the 68K into reasonable PPC code - while the ARM has fewer
> registers available.)

Indeed, having more registers is of huge help. But it is not as
straightforward as it might seem at first glance. Then while at it I did a lot
more than just emulate the 68k - on power we have a lot more on offer,
I wanted to take advantage of it, like adding syntax to not touch
the CCR - as the 68k unavoidably does on moves, add and many others,
use the 3 address mode - source1,source2, destination - and have this
available not just as registers but as any addressing mode etc.
If you assemble plain CPU32 code the resulting power object code size is
about 3.5 times the native CPU32 code size. If you write with power
in mind - e.g. you discourage all the unnecessary CCR (CR in power)
updates - code sizes get pretty close. I have designed in a pass
for optimizing that automatically, 1.5 decades later still waiting
to happen... :-). No need for it which would deflect me from more
pressing issues I suppose.

>
> If it were /portable/ assembly, then your code that works well for the
> PPC would automatically work well for the 68K.

This is semantics - but since user level 68k code assembles directly
I think it is fair enough to borrow the word "assembler". Not what
everyone understands under it every time of course but must have
sounded OK to me back then. Then I am a practical person and tend
not to waste much time on names as long as they do not look
outright ugly or misleading (well I might go on purpose for "ugly"
of course but have not done it for vpa).

>
> The three key points about assembly, compared to other languages, are
> that you know /exactly/ what instructions will be generated, including
> the ordering, register choices, etc.,

Well yes, if we accept that, we have to accept that VPA (Virtual
Processor Assembler) is not exactly an assembler. But I think the
name is telling enough what to expect.


> that you can access /all/ features
> of the target cpu, and that you can write code that is as efficient as
> possible for the target.

That is completely possible with vpa for power, nothing is stopping you
from using native to power opcodes (I use rlwinm and rlwimi quite often,
realizing there might be no efficient way to emulate them but I do what
I can do best, if I get stuck in a world with x86 processors only
which have just the few original 8086 registers I'll switch occupation
to herding kangaroos or something. Until then I'll change to a new
architecture only if I see why it is better than the one I use now,
for me portability is just a means, not a goal).

> There is simply no way for this to be portable.
> Code written for the 68k may use complex addressing modes - they need
> multiple instructions in PPC assembly.

Yes but they run in fewer cycles. Apart from the PC relative - there is
no direct access to the PC on power, takes 2-3 opcodes to get to it
alone - the rest works faster. And say the An,Dn.l*4 mode can take
not just powers of 2... etc., it is pretty powerful.

> If you do this mechanically, you
> will know exactly what instructions this generates - but the result will
> not be as efficient as code that re-uses registers or re-orders
> instructions for better pipelining. Code written for the PPC may use
> more registers than are available on the 68K - /something/ has to give.

Oh yes, backward porting would be quite painful. I do use all registers
I have - rarely resorting to r4-r7, they are used for addressing mode
calculations, intermediate operands etc., use one of them and you
have something to work on when porting later. I still do it at times
when I think it is justified... may well bite me one day.

>
> Thus your VPA may be a fantastic low-level programming language (having
> never used it or seen it, I can't be sure - but I'm sure you would not
> have stuck with it if it were not good!). But it is not a portable
> assembly language - it cannot let you write assembly-style code for more
> than one target.

Hmmm, not for any target - yes. For more than one target with the code
not losing efficiency - it certainly can, if the new target is right
(as was the case 68k -> power).
Basically I have never been after a "universal assembler", I just wanted
to do what you already know I wanted. How we call it is of secondary
interest to me to be fair :-).


> Some time it might be fun to look at some example functions, compiled
> for either the 68K or the PPC (or, better still, both) and compare both
> the source code and the generated object code to modern C and modern C
> compilers. (Noting that the state of C compilers has changed a great
> deal since you started making VPA.)

Yes, I would also be curious to see that. Not just a function - as it
will likely have been written in assembly by the compiler author - but
some sort of standard thing, say a base64 encoder/decoder or some
vnc server thing etc. (the vnc server under dps is about 8 kilobytes,
just looked at it. Does one type of compression (RRE misused as RLE) and
raw).

Dimiter

======================================================
Dimiter Popoff, TGI http://www.tgi-sci.com
======================================================
http://www.flickr.com/photos/didi_tgi/


Don Y

unread,
May 29, 2017, 1:23:52 PM5/29/17
to
On 5/29/2017 9:43 AM, Stefan Reuther wrote:
> The only possible gripe with C would be that it has no easy way to write
> a memory cell by number. But a simple macro fixes that.

"Only" gripe?

Every language choice makes implicit tradeoffs in abstraction management.
The sorts of data types and the operations that can be performed on them
are baked into the underlying assumptions of the language.

What C construct maps to the NS16032's native *bit* array instructions?
Or, the test-and-set capability present in many architectures? Or,
x86 BCD data types? Support for 12 or 60 bit integers? 24b floats?
How is the PSW exposed? Why pointers in some languages and not others?

Why do we have to *worry* about atomic operations in the language in
a different way than on the underlying hardware? Why doesn't the language
explicitly acknowledge the idea of multiple tasks, foreground/background,
etc.?
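
(To be fair, C11 finally gave one of those a direct mapping - a minimal
sketch using <stdatomic.h>, nothing target specific:

#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;

/* test-and-set exposed at the language level: returns true if we
   acquired the flag, i.e. it was previously clear                */
bool try_lock(void) {
    return !atomic_flag_test_and_set(&lock);
}

void unlock(void) {
    atomic_flag_clear(&lock);
}

The rest of that list -- bit arrays, BCD, odd-width integers -- still has
no such mapping.)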

Folks designing languages make the 90-10 (%) decisions and hope the
10 aren't unduly burdened by the wins afforded to the 90. Or, that
the applications addressed by the 10 can tolerate the contortions
they must endure as a necessary cost to gain *any* of the benefits
granted to the 90.

upsid...@downunder.com

unread,
May 29, 2017, 4:46:11 PM5/29/17
to
On Mon, 29 May 2017 02:33:47 -0700, Don Y
<blocked...@foo.invalid> wrote:

>On 5/27/2017 2:52 PM, Theo Markettos wrote:
>> Dimiter_Popoff <d...@tgi-sci.com> wrote:
>>> The need for portability arises when you have megabytes of
>>> sources which are good and need to be moved to another, better
>>> platform. For smallish projects - anything which would fit in
>>> an MCU flash - porting is likely a waste of time, rewriting it
>>> for the new target will be faster if done by the same person
>>> who has already done it once.
>>
>> Back in the 80s, lots of software was written in assembly.
>
>For embedded systems (before we called them that),

When did the "embedded system" term become popular ?

Of course, there were some military systems (such as SAGE) that used
purpose built computers in the 1950s.

In the 1970s the PDP-11/34 was very popular as a single purpose
computer and the PDP-11/23 in the 1980's. After that 8080/Z80/6800
became popular as the low end processors.

Don Y

unread,
May 29, 2017, 4:54:22 PM5/29/17
to
On 5/29/2017 1:46 PM, upsid...@downunder.com wrote:
> On Mon, 29 May 2017 02:33:47 -0700, Don Y
> <blocked...@foo.invalid> wrote:
>
>> On 5/27/2017 2:52 PM, Theo Markettos wrote:
>>> Dimiter_Popoff <d...@tgi-sci.com> wrote:
>>>> The need for portability arises when you have megabytes of
>>>> sources which are good and need to be moved to another, better
>>>> platform. For smallish projects - anything which would fit in
>>>> an MCU flash - porting is likely a waste of time, rewriting it
>>>> for the new target will be faster if done by the same person
>>>> who has already done it once.
>>>
>>> Back in the 80s, lots of software was written in assembly.
>>
>> For embedded systems (before we called them that),
>
> When did the "embedded system" term become popular ?

No idea. I was "surprised" when told that this is what I did
for a living (and HAD been doing all along!).

I now tell people that I design "computers that don't LOOK like
computers" (cuz everyone thinks they KNOW what a "computer" looks
like!) "things that you know have a computer *in* them but
don't look like the stereotype you think of..."

> Of course, there were some military systems (such as SAGE) that used
> purpose built computers in the 1950s.
>
> In the 1970s the PDP-11/34 was very popular as a single purpose
> computer and the PDP-11/23 in the 1980's. After that 8080/Z80/6800
> became popular as the low end processors.

11's were used a lot as they were reasonably affordable and widely
available (along with folks who could code for them). E.g., the Therac
was 11-based.

The i4004 was the first real chance to put "smarts" into something
that didn't also have a big, noisey box attached. I recall thinking
the i8080 (and 85) were pure luxury coming from that more crippled
world ("Oooh! Kilobytes of memory!!!")

David Brown

unread,
May 29, 2017, 5:13:10 PM5/29/17
to
On 29/05/17 19:02, Dimiter_Popoff wrote:
> On 29.5.2017 г. 16:06, David Brown wrote:

<snipped some interesting stuff about VPA>

>
>> Some time it might be fun to look at some example functions, compiled
>> for either the 68K or the PPC (or, better still, both) and compare both
>> the source code and the generated object code to modern C and modern C
>> compilers. (Noting that the state of C compilers has changed a great
>> deal since you started making VPA.)
>
> Yes, I would also be curious to see that. Not just a function - as it
> will likely have been written in assembly by the compiler author - but
> some sort of standard thing, say a base64 encoder/decoder or some
> vnc server thing etc. (the vnc server under dps is about 8 kilobytes,
> just looked at it. Does one type of compression (RRE misused as RLE) and
> raw).
>

To be practical, it /should/ be a function - or no more than a few
functions. (I don't know why you think functions might be written in
assembly by the compiler author - the compiler author is only going to
provide compiler-assist functions such as division routines, floating
point emulation, etc.) And it should be something that has a clear
algorithm, so no one can "cheat" by using a better algorithm for the job.


Walter Banks

unread,
May 29, 2017, 5:23:09 PM5/29/17
to
On 2017-05-27 3:39 PM, rickman wrote:
> Someone in another group is thinking of using a portable assembler to
> write code for an app that would be ported to a number of different
> embedded processors including custom processors in FPGAs. I'm
> wondering how useful this will be in writing code that will require
> few changes across CPU ISAs and manufacturers.
>
> I am aware that there are many aspects of porting between CPUs that
> is assembly language independent, like writing to Flash memory. I'm
> more interested in the issues involved in trying to use a universal
> assembler to write portable code in general. I'm wondering if it
> restricts the instructions you can use or if it works more like a
> compiler where a single instruction translates to multiple target
> instructions when there is no one instruction suitable.
>
> Or do I misunderstand how a portable assembler works? Does it
> require a specific assembly language source format for each target
> just like using the standard assembler for the target?
>

I have done a few portable assemblers of the general type you're
describing. There are two approaches. One is to write macros for the
instruction set for the target processor and effectively assemble
processor A into processor B with macros. This might work for
architecturally close processors, but even then it has significant problems.
To give an example, 6805 to 6502: the carry following a subtract of 0 - 0
is different.
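
A sketch of that particular mismatch (my illustration, not the actual
macros): the two families use opposite carry conventions after a subtract,
so a macro that maps one SUB onto the other has to compensate.

#include <stdint.h>
#include <stdbool.h>

/* Carry after an 8-bit subtract a - b (ignoring the 6502's use of the
   incoming carry in SBC): the 6805 sets C on borrow, the 6502 carry is
   the inverted borrow.                                                 */
static bool carry_after_sub_6805(uint8_t a, uint8_t b) { return b > a; }
static bool carry_after_sub_6502(uint8_t a, uint8_t b) { return !(b > a); }

/* So for 0 - 0 the 6805 leaves C = 0 while the 6502 leaves C = 1. */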

There is one approach that I have used that does work reasonably well.
Assemble processor A into functionally rich intermediate code and
compile the intermediate code into processor B. The resulting code is
quite portable between the processors and it is capable of supporting
diverse architectures quite well.

I have done mostly 8 bit processors this way: 6808 (3 major families) to
PIC (many varieties: the 12, 14, 14x, and 16 families). In all cases I set
up the translation so I could go either way. I have also targeted some 16,
24, and 32 bit processors. For pure code this has worked quite well with a
low penalty for the translation.

Application code usually has processor specific I/O which can actually
be detected by the translator but generally needs to have some hand
intervention.

w..

Dimiter_Popoff

unread,
May 30, 2017, 9:53:06 AM5/30/17
to
I am pretty sure I have seen - or read about - compiler generated
code where the compiler detects what you want to do and inserts
some prewritten piece of assembly code. It was something about CRC
or about a TCP checksum, not sure - and it was someone who said that,
I don't know it from direct experience.

But if the compiler does this it will be obvious enough.

Anyway, a function would do - if complex and long enough to
be close to real life, i.e. a few hundred lines.

But I don't see why not compare written stuff, I just checked
again on that vnc server for dps - not 8k, closer to 11k (the 8k
I saw was a half-baked version, no keyboard tables inside it etc.;
the complete version also includes a screen mask to allow it
to ignore mouse clicks at certain areas, that sort of thing).
Add to it some menu (it is command line option driven only) - I have
a much more complex menu than the Windows and Android RealVNC - and
it adds up to 25k.
Compare this to the 350k exe for windows or to the 4M for Android
(and the android does only raw...) and the picture is clear enough
I think.

Dimiter

======================================================
Dimiter Popoff, TGI http://www.tgi-sci.com
======================================================
http://www.flickr.com/photos/didi_tgi/

Boudewijn Dijkstra

unread,
May 30, 2017, 10:33:26 AM5/30/17
to
On Mon, 29 May 2017 18:43:01 +0200, Stefan Reuther
<stefa...@arcor.de> wrote:
Unless it uses a push/pop architecture like Java bytecode, which can get
'assembled' to any number of registers.


--
(Remove the obvious prefix to reply privately.)
Made with Opera's e-mail program: http://www.opera.com/mail/

Anssi Saari

unread,
May 31, 2017, 3:58:53 AM5/31/17
to
David Brown <david...@hesbynett.no> writes:

> Writing a game involves a great deal more than just the coding.
> Usually, the coding is in fact just a small part of the whole effort -
> all the design of the gameplay, the storyline, the graphics, the music,
> the algorithms for interaction, etc., is inherently cross-platform. The
> code structure and design is also mostly cross-platform. Some parts
> (the graphics and the music) need adapted to suit the limitations of the
> different target platforms. The final coding in assembly would be done
> by hand for each target.

I've sometimes wondered what kind of development systems were used for
those early 1980s home computers. Unreliable, slow and small storage
media would've made it pretty awful to do development on target
systems. I've read Commodore used a VAX for ROM development so they
probably had a cross assembler there but other than that, not much idea.

Don Y

unread,
May 31, 2017, 4:49:34 AM5/31/17
to
On 5/31/2017 12:19 AM, Anssi Saari wrote:
> I've sometimes wondered what kind of development systems were used for
> those early 1980s home computers. Unreliable, slow and small storage
> media would've made it pretty awful to do development on target
> systems. I've read Commodore used a VAX for ROM development so they
> probably had a cross assembler there but other than that, not much idea.

You forget that those computers were typically small and ran small
applications.

In the early 80's, we regularly developed products using CP/M-hosted
tools on generic Z80 machines, RIO-based tools on "Z-boxes", motogorilla's
tools on Exormacs, etc. None were much better than a 64K 8b machine
with one or two 1.4MB floppies. Even earlier, the MDS-800 systems
and ISIS-II, etc.

Your development *style* changes with the capabilities of the tools
available. E.g., in the 70's, I could turn the crank *twice* in an
8 hour shift -- edit, assemble, link, burn ROMs, debug. So, each
iteration *had* to bring you closer to a finished product. You
couldn't afford to just try the "I wonder if THIS is the problem"
game that seems so common, today ("Heck, I can just try rebuilding
everything and see if it NOW works...")

But, that doesn't necessarily limit you to the size of a final executable
(ever hear of overlays?) or the overall complexity of the product.


David Brown

unread,
May 31, 2017, 5:36:45 AM5/31/17
to
A compiler sees the source code you write, and generates object code
that does that job. It may be smart about it, but it will not insert
"pre-written assembly code". Code generation in compilers is usually
defined with some sort of templates (such a pattern for reading data at
a register plus offset, or a pattern for doing a shift by a fixed size,
etc.). They are not "pre-written assembly", in that many of the details
are determined at generation time, such as registers, instruction
interleaving, etc.

The nearest you get to pre-written code from the compiler is in the
compiler support libraries. For example, if the target does not support
division instructions, or floating point, then the compiler will supply
routines as needed. These /might/ be written in assembly - but often
they are written in C.
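
(As an illustration of what such a helper looks like - this is just a
sketch in that spirit, not the actual libgcc source, and the name is made
up:)

#include <stdint.h>

/* 32/32 unsigned division by shift-and-subtract, the kind of routine a
   compiler library supplies for a target with no divide instruction.
   Division by zero is not handled here.                                */
uint32_t soft_udiv32(uint32_t num, uint32_t den) {
    uint32_t quot = 0, rem = 0;
    for (int i = 31; i >= 0; i--) {
        rem = (rem << 1) | ((num >> i) & 1u);   /* bring down next bit  */
        if (rem >= den) {                       /* subtract if it fits  */
            rem -= den;
            quot |= 1u << i;
        }
    }
    return quot;
}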


A compiler /will/ detect patterns in your C code and use that to
generate object code rather than doing a "direct translation". The
types of patterns it can detect varies - it is one of the things that
differentiates between compilers. A classic example for the PPC would be:

#include <stdint.h>

uint32_t reverseLoad(uint32_t * p) {
    uint32_t x = *p;
    return ((x & 0xff000000) >> 24)
         | ((x & 0x00ff0000) >> 8)
         | ((x & 0x0000ff00) << 8)
         | ((x & 0x000000ff) << 24);
}

I am using gcc 4.8 here, since there is a convenient online version as
part of the <https://gcc.godbolt.org/> "compiler explorer". gcc is at
7.0 these days, and has advanced significantly since then - but that is
the version that is most convenient.

A direct translation (compiling with no optimisation) would be:

reverseLoad:
stwu 1,-48(1)
stw 31,44(1)
mr 31,1
stw 3,24(31)
lwz 9,24(31)
lwz 9,0(9)
stw 9,8(31)
lwz 9,8(31)
srwi 10,9,24
lwz 9,8(31)
rlwinm 9,9,0,8,15
srwi 9,9,8
or 10,10,9
lwz 9,8(31)
rlwinm 9,9,0,16,23
slwi 9,9,8
or 10,10,9
lwz 9,8(31)
slwi 9,9,24
or 9,10,9
mr 3,9
addi 11,31,48
lwz 31,-4(11)
mr 1,11
blr

Gruesome, isn't it? Compiling with -O0 puts everything on the stack
rather than holding variables in registers. Code like that was used in
the old days - perhaps at the time when you decided you needed something
better than C. But even then, it was mainly only for debugging - since
debugger software was not good enough to handle variables in registers.

Next up, -O1 optimisation. This is a level where the code becomes
sensible, but not too smart - and it is not uncommon to use it in
debugging because you usually get a one-to-one correspondence between
lines in the source code and blocks of object code. It makes it easier
to do single stepping.

reverseLoad:
lwz 9,0(3)
slwi 3,9,24
srwi 10,9,24
or 3,3,10
rlwinm 10,9,24,16,23
or 3,3,10
rlwinm 9,9,8,8,15
or 3,3,9
blr

Those that can understand the PPC's bit field instruction "rlwinm" will
see immediately that this is a straightforward translation of the source
code, but with all data held in registers.

But if we ask for smarter optimisation, with -O2, we get:

reverseLoad:
lwbrx 3,0,3
blr

This is, of course, optimal. (Even the function call overhead will be
eliminated if the compiler can do so when the function is used.)


>
> But if the compiler does this it will be obvious enough.

If you had some examples or references, it would be easier to see what
you mean.

>
> Anyway, a function would do - if complex and long enough to
> be close to real life, i.e. a few hundred lines.

A function that is a few hundred lines of source code is /not/ real life
- it is broken code. Surely in VPA you divide your code into functions
of manageable size, rather than single massive functions?

>
> But I don't see why not compare written stuff, I just checked
> again on that vnc server for dps - not 8k, closer to 11k (the 8k
> I saw was a half-baked version, no keyboard tables inside it etc.;
> the complete version also includes a screen mask to allow it
> to ignore mouse clicks at certain areas, that sort of thing).
> Add to it some menu (it is command line option driven only),
> a much more complex menu than windows and android RealVNC has
> I have and it adds up to 25k.
> Compare this to the 350k exe for windows or to the 4M for Android
> (and the android does only raw...) and the picture is clear enough
> I think.
>

A VNC server is completely useless for such a test. It is far too
complex, with far too much variation in implementation and features, too
many external dependencies on an OS or other software (such as for
networking), and far too big for anyone to bother with such a comparison.

You specifically need something /small/. The algorithm needs to be
simple and clearly expressible. Total source code lines in C should be
no more than about a 100, with no more than perhaps 3 or 4 functions.
Smaller than that would be better, as it would make it easier for us to
understand the VLA and see its benefits.


Here is a possible example:

// Type for the data - this can easily be changed
typedef float data_t;

static int max(int a, int b) {
    return (a > b) ? a : b;
}

static int min(int a, int b) {
    return (a < b) ? a : b;
}


// Calculate the convolution of two input arrays pointed to by
// pA and pB, placing the results in the output array pC.
void convolute(const data_t * pA, int lenA, const data_t * pB,
               int lenB, data_t * pC, int lenC) {

    // i is the index of the output sample, run from 0 to lenC - 1
    // For each i, we calculate the sum as j goes from -inf to +inf
    //   of A(j) * B(i - j)
    // Clearly we can limit j to the range 0 to (lenA - 1)
    // We use k to hold i - j, which will run down as j runs up.
    // k will be limited to (lenB - 1) down to 0.
    // From (i - j) >= 0, we have j <= i
    // From (i - j) < lenB, we have j > (i - lenB)
    // These give us tighter bounds on the run of j

    for (int i = 0; i < lenC; i++) {
        int firstJ = max(0, 1 + i - lenB);
        int endJ = min(lenA, i + 1);
        data_t x = 0;
        for (int j = firstJ; j < endJ; j++) {
            int k = i - j;
            x += (pA[j] * pB[k]);
        }

        pC[i] = x;
    }
}

With gcc 4.8 for the PPC, that's about 55 lines of assembly. An
interesting point is that the size and instructions are very similar
with -O1 and -O2, but the ordering is significantly different - with
-O2, the pipeline scheduling is considered. (I don't know which
particular cpu model is used for scheduling by default in gcc.)


To be able to compare with VPA, you'd have to write this algorithm in
VPA. Then you could compare various points. It should be easy enough
to look at the size of the code. For speed comparison, we'd have to
know your target processor and compile specifically for that (to get the
best scheduling, and to handle small differences in the availability of
particular instructions). Then you would need to run the code - I don't
have any PPC boards conveniently on hand, and of course you are the only
one with VLA tools.

Comparing code clarity and readability is, of course, difficult - but
you could publish your VPA and we can maybe get an idea. Productivity
is also hard to measure. For a function like this, the time is spent on
the details of the algorithm and getting the right bounds on the loops -
the actual C code is easy.


You can get a gcc 5.2 cross-compiler for PPC for Windows from here
<http://tranaptic.ca/wordpress/downloads/>, or you can use the online
compiler at <https://gcc.godbolt.org/>. The PowerPC is not nearly as
popular an architecture as ARM, and it is harder to find free
ready-built tools (though there are plenty of guides to building them
yourself, and you can get supported commercial versions of modern gcc
from Mentor/CodeSourcery). You can also find tools directly from
Freescale/NXP.


Tauno Voipio

unread,
May 31, 2017, 8:56:09 AM5/31/17
to
I used an Intel MDS and a Data General Eclipse to bootstrap a
Z80-based CP/M computer (self made). After that, the CP/M
system could be used to create the code, though the 8 inch
floppies were quite small for the task.

--

-Tauno Voipio

Dimiter_Popoff

unread,
Jun 1, 2017, 3:43:03 PM6/1/17
to
We are referring to the same thing under different names - again.
At the end of the day everything the compiler generates is written
in plain assembly, it must be executable by the CPU.
Under "prewritten" I mean some sort of template which gets filled
with addresses etc. before committing.
To what lengths the compiler writers go to make common cases look
good know only the writers themselves, my memory is vague but I
do think the guy who said that a few years ago knew what he was
talking about.

>.. A classic example for the PPC would be:
Above all this is a good example of how limiting the high level language
is. Just look at the source and then at the final result.

You will get *exactly* the same result (- the return) with no
optimization in vpa from the line:

mover.l (source),r3

Logic optimization is more or less a kindergarten exercise. If you need
logic optimization you don't know what you are doing anyway so the
compiler won't be able to help much, no matter how good.

Of course if you stick by a phrase book at source level - as is the case
with *any* high level language - you will need plenty of optimization,
like your example demonstrates. I bet it will be good only in demo
cases like yours and much less useful in real life, so the only benefit
of writing this in C is the source length, 10+ times the necessary (I
counted it and I included a return line in the count, 238 vs. 23 bytes).
While 10 times more typing may seem no serious issue to many 10 times
higher chance to insert an error is no laughing matter, and 10 times
more obscurity just because of that is a productivity killer.

>> But if the compiler does this it will be obvious enough.
>
> If you had some examples or references, it would be easier to see what
> you mean.
>
>>
>> Anyway, a function would do - if complex and long enough to
>> be close to real life, i.e. a few hundred lines.
>
> A function that is a few hundred lines of source code is /not/ real life
> - it is broken code. Surely in VLA you divide your code into functions
> of manageable size, rather than single massive functions?

I meant "function" not the in C subroutine kind of sense, I meant it
more as "functionality", i.e. some code doing some job. How it split
into pieces etc. will depend on many factors, language, programmer
style etc., not relevant to this discussion.

>>
>> But I don't see why not compare written stuff, I just checked
>> again on that vnc server for dps - not 8k, closer to 11k (the 8k
>> I saw was a half-baked version, no keyboard tables inside it etc.;
>> the complete version also includes a screen mask to allow it
>> to ignore mouse clicks at certain areas, that sort of thing).
>> Add to it some menu (it is command line option driven only) - I have
>> a much more complex menu than the Windows and Android RealVNC - and
>> it adds up to 25k.
>> Compare this to the 350k exe for windows or to the 4M for Android
>> (and the android does only raw...) and the picture is clear enough
>> I think.
>>
>
> A VNC server is completely useless for such a test. It is far too
> complex, with far too much variation in implementation and features, too
> many external dependencies on an OS or other software (such as for
> networking), and far too big for anyone to bother with such a comparison.

Actually I think a comparison between two pieces of code doing the same
thing is quite telling when the difference is in the orders of
magnitude, as in this case.
Writing small benchmarking toy sort of stuff is a waste of time, I am
interested in end results.

>
> You specifically need something /small/.

No, something "small" is kind of kindergarten exercise again, it can
only be good enough to fool someone into believing this or that.
It is end results which count.

Dimiter

======================================================
Dimiter Popoff, TGI http://www.tgi-sci.com
======================================================
http://www.flickr.com/photos/didi_tgi/

David Brown

unread,
Jun 2, 2017, 7:27:52 AM6/2/17
to
OK. I think your naming and description is odd, but I am glad to see we
are getting a better understanding of what the other is saying.

> At the end of the day everything the compiler generates is written
> in plain assembly, it must be executable by the CPU.
> Under "prewritten" I mean some sort of template which gets filled
> with addresses etc. before committing.

I think of "prewritten" as referring to larger chunks of assembly code,
with much more concrete choices of values, registers, scheduling, etc.
You described the "prewritten" code as being easily recognisable - in
reality, the majority of the code from modern compilers is generated
from very small templates with great variability. And on a processor
like the PPC, these will be intertwined with each other according to the
best scheduling for the chip.

As an example, if we have the function:

int foo0(int * p) {
    int a = *p * *p;
    return a;
}

The template for reading "*p" generates

lwz 3, 0(3)

(Register r3 is used for the first parameter in the PPC eabi. It is
also used for the return value from a function, which is why it may seem
"over used" in the examples here. In bigger code, and when the compiler
can inline functions, it will be more flexible about register choices.
I don't know whether you follow the standard PPC eabi in your tools.)

Multiplication is another template:

mullw 3, 3, 3

As is function exit, in this case just:

blr

I find it very strange to consider these as "pre-written assembly".

And if the function is more complex, the intertwining causes more
mixups, making it less "pre-written":

int foo1(int * p, int * q) {
    int a = *p * *p;
    int b = *q * *q;
    return a + b;
}

foo1:
lwz 9,0(3)
lwz 10,0(4)
mullw 9,9,9
mullw 3,10,10
add 3,9,3
blr


> To what lengths the compiler writers go to make common cases look
> good know only the writers themselves, my memory is vague but I
> do think the guy who said that a few years ago knew what he was
> talking about.

Well, it is known to the compiler writers and to users who look at the
generated code! Certainly there is plenty of variation between tools,
with more advanced compilers working harder at this sort of thing.
Command line switches with choices of optimisation levels can also make
a big difference.

How much experience do you have of using C compilers, and studying their
output?

>
>> .. A classic example for the PPC would be:
>>
>> #include <stdint.h>
>>
>> uint32_t reverseLoad(uint32_t * p) {
>> uint32_t x = *p;
>> return ((x & 0xff000000) >> 24)
>> | ((x & 0x00ff0000) >> 8)
>> | ((x & 0x0000ff00) << 8)
>> | ((x & 0x000000ff) << 24);
>> }
>>

<skipping the details>

>> But if we ask for smarter optimisation, with -O2, we get:
>>
>> reverseLoad:
>> lwbrx 3,0,3
>> blr
>>
>> This is, of course, optimal. (Even the function call overhead will be
>> eliminated if the compiler can do so when the function is used.)
>
> Above all this is a good example of how limiting the high level language
> is. Just look at the source and then at the final result.

No, that is a good example of how smart the compiler is (or can be)
about generating optimal code from the source.

You may in addition view this as a limitation of the C language, which
has no direct way to specify a "bit reversed pointer". That is fair
enough. However, it is not really any harder than defining a function
like this, and then using it. For situations where the compiler can't
generate ideal code, and it is particularly useful to get such optimal
assembly, it is also possible to write a simple little inline assembly
function - it is not really any harder than writing the same thing in
"normal" assembly.

Another option (for newer gcc) is to define the endianness of a struct.
Then you can access the fields directly, and the loads and stores will
be reversed as needed.

typedef struct __attribute__((scalar_storage_order ("little-endian"))) {
    uint32_t x;
} le32_t;

uint32_t reverseLoad2(le32_t * p) {
    return p->x;
}

reverseLoad2:
lwbrx 3,0,3
blr


So the high level language gives you a number of options, with specific
tools giving more options, and the implementation gives you efficient
object code in the end. You might need to define a function or macro
yourself, but that is a one-time job.
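
(And with gcc specifically, the one-time helper can be as small as a
wrapper around a builtin - on the PPC the load and the swap will
typically be folded back into a single lwbrx:)

#include <stdint.h>

static inline uint32_t bswap_load32(const uint32_t *p) {
    return __builtin_bswap32(*p);   /* byte-swap the loaded word */
}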

>
> You will get *exactly* the same result (- the return) with no
> optimization in vpa from the line:
>
> mover.l (source),r3

When you say "no optimisation" here, does that mean that VPA supports
some kinds of optimisations?

>
> Logic optimization is more or less a kindergarten exercise. If you need
> logic optimization you don't know what you are doing anyway so the
> compiler won't be able to help much, no matter how good.

What do you mean by "logic optimisation" ? It is normal for a good
compiler to do a variety of strength reduction and other re-arrangements
of code to give you something with the same result, but more efficient
execution. And it is a /good/ thing that the compiler does that - it
means you can write your source code in the clearest and most
maintainable fashion, and let the compiler generate better code.

For example, if you have a simple division by a constant:

uint32_t divX(uint32_t a) {
    return a / 5;
}

The direct translation of this would be:

divX:
li 4,5
divwu 3,3,4
blr

But a compiler can do better:

divX: // divide by 5
lis 9,0xcccc
ori 9,9,52429
mulhwu 3,3,9
srwi 3,3,2
blr

Such optimisation is certainly not a "kindergarten exercise", and doing
it by hand is hardly a maintainable or flexible solution. Changing the
denominator to 7 means significant changes:

divX: // divide by 7
lis 9,0x2492
ori 9,9,18725
mulhwu 9,3,9
subf 3,9,3
srwi 3,3,1
add 3,9,3
srwi 3,3,2
blr
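
(Written out in C, for the curious, what the divide-by-5 sequence above is
computing - my annotation, not part of the generated code: 0xCCCCCCCD is
ceil(2^34 / 5), so a multiply-high followed by a shift right of 2 gives
exactly a / 5 for every 32-bit a.)

#include <stdint.h>

uint32_t div5(uint32_t a) {
    /* mulhwu + srwi 2  ==  (a * 0xCCCCCCCD) >> 34 */
    return (uint32_t)(((uint64_t)a * 0xCCCCCCCDu) >> 34);
}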


>
> Of course if you stick by a phrase book at source level - as is the case
> with *any* high level language - you will need plenty of optimization,
> like your example demonstrates.

I still don't know what you mean with "phrase book" here.

> I bet it will be good only in demo
> cases like yours and much less useful in real life,

Nonsense. The benefits of using a higher level language and a compiler
get more noticeable with larger code, as the compiler has no problem
tracking register usage, instruction scheduling, etc., across large
pieces of code - unlike a human. And it has no problem re-creating code
in different ways when small details change in the source (such as the
divide by 5 and divide by 7 examples).

> so the only benefit
> of writing this in C is the source length, 10+ times the necessary (I
> counted it and I included a return line in the count, 238 vs. 23 bytes).

You have this completely backwards. If I write a simple example like
this, in a manner that is compilable code, then it is going to take
longer in high-level source code. But that is the effect of giving that
function definition. In use, writing "reverseLoad" does not take
significantly more characters than "mover" - and with everything else
around, the C code will be much shorter. And this was a case picked
specifically to show how some long patterns in C code can be handled by
a compiler to generate optimal short assembly sequences.

The division example shows the opposite - in C, I write "a / 7", while
in assembly you have to write 7 lines (excluding labels and blr). And
the C code there is nicer in every way.


> While 10 times more typing may seem no serious issue to many 10 times
> higher chance to insert an error is no laughing matter, and 10 times
> more obscurity just because of that is a productivity killer.

In real code, the C source will be 10 times shorter than the assembly.
And if the assembly has enough comments to make it clear, there is
another order of magnitude difference.

>
>>> But if the compiler does this it will be obvious enough.
>>
>> If you had some examples or references, it would be easier to see what
>> you mean.
>>
>>>
>>> Anyway, a function would do - if complex and long enough to
>>> be close to real life, i.e. a few hundred lines.
>>
>> A function that is a few hundred lines of source code is /not/ real life
>> - it is broken code. Surely in VPA you divide your code into functions
>> of manageable size, rather than single massive functions?
>
> I meant "function" not the in C subroutine kind of sense, I meant it
> more as "functionality", i.e. some code doing some job. How it split
> into pieces etc. will depend on many factors, language, programmer
> style etc., not relevant to this discussion.

OK.

But again, it has to be a specific clearly defined and limited
functionality. "Write a VNC server" is not a specification - that would
take at least many dozens of pages of specifications, not including the
details of the interfacing to the network stack, the types of library
functions available, the API available to client programs that will
"draw" on the server, etc.

>
>>>
>>> But I don't see why not compare written stuff, I just checked
>>> again on that vnc server for dps - not 8k, closer to 11k (the 8k
>>> I saw was a half-baked version, no keyboard tables inside it etc.;
>>> the complete version also includes a screen mask to allow it
>>> to ignore mouse clicks at certain areas, that sort of thing).
>>> Add to it some menu (it is command line option driven only) - I have
>>> a much more complex menu than the Windows and Android RealVNC - and
>>> it adds up to 25k.
>>> Compare this to the 350k exe for windows or to the 4M for Android
>>> (and the android does only raw...) and the picture is clear enough
>>> I think.
>>>
>>
>> A VNC server is completely useless for such a test. It is far too
>> complex, with far too much variation in implementation and features, too
>> many external dependencies on an OS or other software (such as for
>> networking), and far too big for anyone to bother with such a comparison.
>
> Actually I think a comparison between two pieces of code doing the same
> thing is quite telling when the difference is in the orders of
> magnitude, as in this case.

No, it is not. The code is not comparable in any way, and does not do
the same thing except in a very superficial sense. It's like comparing
a small car with a train - both can transport you around, but they are
very different things, each with their advantages and disadvantages.

If you want to compare your VNC server for DPS written in VPA to a VNC
server written in C, then you would need to give /exact/ specifications
of all the features of your VNC server, and exact details of how it
interfaces with everything else in the DPS system, and have someone
write a VNC server in C for DPS that follows those same specifications.
That would be no small feat - indeed, it would be totally impossible
unless you wanted to do it yourself.

The nearest existing comparison I can think of would be the eCos VNC
server, written in C. I can't say how it compares in features with your
server, but it has approximately 2100 lines of code, written in a wide
style. Since I have no idea about how interfacing with DPS compares
with interfacing with eCos (I don't know either system), I have no idea
if that is a useful comparison or not.

> Writing small benchmarking toy sort of stuff is a waste of time, I am
> interested in end results.
>
>>
>> You specifically need something /small/.
>
> No, something "small" is kind of kindergarten exercise again, it can
> only be good enough to fool someone into believing this or that.
> It is end results which count.
>

Then we will all remain in ignorance about whether VPA is useful or not,
in comparison to developing in C.



Boudewijn Dijkstra

unread,
Jun 2, 2017, 11:12:08 AM6/2/17
to
On Sat, 27 May 2017 21:39:36 +0200, rickman <gnu...@gmail.com> wrote:
> Someone in another group is thinking of using a portable assembler to
> write code for an app that would be ported to a number of different
> embedded processors including custom processors in FPGAs. I'm wondering
> how useful this will be in writing code that will require few changes
> across CPU ISAs and manufacturers.
>
> I am aware that there are many aspects of porting between CPUs that is
> assembly language independent, like writing to Flash memory. I'm more
> interested in the issues involved in trying to use a universal assembler
> to write portable code in general. I'm wondering if it restricts the
> instructions you can use or if it works more like a compiler where a
> single instruction translates to multiple target instructions when there
> is no one instruction suitable.
>
> Or do I misunderstand how a portable assembler works? Does it require a
> specific assembly language source format for each target just like using
> the standard assembler for the target?

LLVM has a pretty generic intermediate assembler language, though I'm not
sure if it's meant for actually writing code in.

http://llvm.org/docs/LangRef.html#instruction-reference

Another portable assembly language is Java Bytecode, though it assumes a
32-bit machine.

Mike Perkins

unread,
Jun 5, 2017, 10:39:24 AM6/5/17
to
On 02/06/2017 16:03, Boudewijn Dijkstra wrote:
> On Sat, 27 May 2017 21:39:36 +0200, rickman <gnu...@gmail.com> wrote:
>> Someone in another group is thinking of using a portable assembler to
>> write code for an app that would be ported to a number of different
>> embedded processors including custom processors in FPGAs. I'm
>> wondering how useful this will be in writing code that will require
>> few changes across CPU ISAs and manufacturers.
>>
>> I am aware that there are many aspects of porting between CPUs that is
>> assembly language independent, like writing to Flash memory. I'm more
>> interested in the issues involved in trying to use a universal
>> assembler to write portable code in general. I'm wondering if it
>> restricts the instructions you can use or if it works more like a
>> compiler where a single instruction translates to multiple target
>> instructions when there is no one instruction suitable.
>>
>> Or do I misunderstand how a portable assembler works? Does it require
>> a specific assembly language source format for each target just like
>> using the standard assembler for the target?
>
> LLVM has a pretty generic intermediate assembler language, though I'm
> not sure if it's meant for actually writing code in.
>
> http://llvm.org/docs/LangRef.html#instruction-reference

Interesting, but it's not obvious who the audience is. Why would anyone
want to learn another language that is not in common use or aligned to
any specific CPU?

> Another portable assembly language is Java Bytecode, though it assumes a
> 32-bit machine.

I've been watching this thread for some time. My first impression was
why not just write in C? So far that impression hasn't changed. Despite
the odd line of CPU specific assembler code for those occasions that
require it, C is still perhaps the most portable code you can write?

--
Mike Perkins
Video Solutions Ltd
www.videosolutions.ltd.uk

David Brown

unread,
Jun 5, 2017, 12:10:22 PM6/5/17
to
On 05/06/17 16:39, Mike Perkins wrote:
> On 02/06/2017 16:03, Boudewijn Dijkstra wrote:
>> On Sat, 27 May 2017 21:39:36 +0200, rickman <gnu...@gmail.com> wrote:
>>> Someone in another group is thinking of using a portable assembler to
>>> write code for an app that would be ported to a number of different
>>> embedded processors including custom processors in FPGAs. I'm
>>> wondering how useful this will be in writing code that will require
>>> few changes across CPU ISAs and manufacturers.
>>>
>>> I am aware that there are many aspects of porting between CPUs that is
>>> assembly language independent, like writing to Flash memory. I'm more
>>> interested in the issues involved in trying to use a universal
>>> assembler to write portable code in general. I'm wondering if it
>>> restricts the instructions you can use or if it works more like a
>>> compiler where a single instruction translates to multiple target
>>> instructions when there is no one instruction suitable.
>>>
>>> Or do I misunderstand how a portable assembler works? Does it require
>>> a specific assembly language source format for each target just like
>>> using the standard assembler for the target?
>>
>> LLVM has a pretty generic intermediate assembler language, though I'm
>> not sure if it's meant for actually writing code in.
>>
>> http://llvm.org/docs/LangRef.html#instruction-reference
>
> Interesting, but it's not obvious who the audience is. Why would anyone
> want to learn another language that is not in common use or aligned to
> any specific CPU?

The LLVM "assembly" is intended as an intermediary language. Front-end
tools like clang (a C, C++ and Objective-C compiler) generate LLVM
assembly. Middle-end tools like optimisers and linkers "play" with it,
and back-end tools translate it into target-specific assembly. Each
level can do a wide variety of optimisations. The aim is that the whole
LLVM system can be more modular and more easily ported to new
architectures and new languages than a traditional multi-language
multi-target compiler (such as gcc). So LLVM assembly is not an
assembly language you would learn or code in - it's the glue holding the
whole system together.

>
>> Another portable assembly language is Java Bytecode, though it assumes a
>> 32-bit machine.
>
> I've been watching this thread for some time. My first impression was
> why not just write in C? So far that impression hasn't changed. Despite
> the odd line of CPU specific assembler code for those occasions that
> require it, C is still perhaps the most portable code you can write?
>

Well, yes - of course C is the sensible option here. Depending on the
exact type of code and the targets, Ada, C++, and Forth might also be
viable options. But since there is no such thing as "portable
assembly", it's a poor choice :-) However, the thread has lead to some
interesting discussions, IMHO.

Don Y

unread,
Jun 5, 2017, 12:10:22 PM6/5/17
to
On 6/5/2017 7:39 AM, Mike Perkins wrote:
> On 02/06/2017 16:03, Boudewijn Dijkstra wrote:
>> LLVM has a pretty generic intermediate assembler language, though I'm
>> not sure if it's meant for actually writing code in.
>>
>> http://llvm.org/docs/LangRef.html#instruction-reference
>
> Interesting, but it's not obvious who the audience is. Why would anyone want to
> learn another language that is not in common use or aligned to any specific CPU?

Esperanto? :>

>> Another portable assembly language is Java Bytecode, though it assumes a
>> 32-bit machine.
>
> I've been watching this thread for some time. My first impression was why not
> just write in C? So far that impression hasn't changed. Despite the odd line of
> CPU specific assembler code for those occasions that require it, C is still
> perhaps the most portable code you can write?

The greater the level of abstraction in a language choice, the less
control you have over expressing the minutiae of what you want done.

When I design a new processor (typ. application specific), I code up
sample algorithms using a very low level set of abstractions... virtual
registers, virtual operators, etc.

Once I'm done with a number of these, I "eyeball" the "code" and sort out
what the instructions (opcodes) should be for the processor. I.e., invent
the "assembly language".

If I'd coded these algorithms in a HIGHER level language, I'd end up
implementing a much more "complex" processor (because it would have
to implement much more capable "primitives")

C's portability problem isn't with the language, per se, as much as it is
with the "practitioners". It could benefit from much stricter type
checking and a lot fewer "undefined/implementation-defined behaviors"
(cuz it seems folks just get the code working on THEIR target and
never see how it fails to execute properly on any OTHER target!)

George Neuner

unread,
Jun 6, 2017, 12:24:09 AM6/6/17
to
On Mon, 5 Jun 2017 09:10:10 -0700, Don Y <blocked...@foo.invalid>
wrote:

>C's portability problem isn't with the language, per se, as much as it is
>with the "practitioners". It could benefit from much stricter type
>checking and a lot fewer "undefined/implementation-defined behaviors"
>(cuz it seems folks just get the code working on THEIR target and
>never see how it fails to execute properly on any OTHER target!)

The argument always has been that if implementation defined behaviors
are locked down, then C would be inefficient on CPUs that don't have
good support for <whatever>.


Look at the (historical) troubles resulting from Java (initially)
requiring IEEE-754 compliance and that FP results be exactly
reproducible *both* on the same platform *and* across platforms.

No FP hardware fully implements any version of IEEE-754: every chip
requires software fixups to achieve compliance, and most fixup suites
are not even complete [e.g., ignoring unpopular rounding modes, etc.].
Java FP code ran slower on chips that needed more fixups, and the
requirements prevented even implementing a compliant Java on some
chips despite their having FP support.

Java ultimately had to entirely back away from its reproducibility
guarantees. It now requires only best consistency - not exact
reproducibility - on the same platform. If you want reproducible
results, you have to use software floating point (BigFloat), and
accept much slower code. And by requiring consistency, it can only
approximate the performance of C code which is likewise compiled. Most
C compilers allow you to eschew FP consistency for more speed ... Java does
not.


Of course, FP in general is somewhat less important to this crowd than
to other groups, and C has a lot of implementation defined behavior
unrelated to FP. But the lesson of trying to lock down hardware
(and/or OS) dependent behavior still is important.


There is no question that C could do much better type/value and
pointer/index checking, but it likely would come at the cost of far
more explicit casting (more verbose code), and likely many more
runtime checks.

A more expressive type system would help [e.g., range integers, etc.],
but that would constitute a significant change to the language.

Some people point to Ada as an example of a language that can be both
"fast" and "safe", but many people (maybe not in this group, but many
nonetheless) are unaware that quite a lot of Ada's type/value checks
are done at runtime and throw exceptions if they fail.

Obviously, a compiler could provide a way to disable the automated
runtime checking, and even when enabled checks can be elided if the
compiler can statically prove that a given operation will always be
safe. But even in Ada with its far more expressive types there are
many situations in which the compiler simply can't do that.

More stringent languages like ML won't even compile if they can't
statically type check the code. In such languages, quite a lot of
programmer effort goes toward clubbing the type checker into
submission.


TANSTAAFL,
George

David Brown

unread,
Jun 6, 2017, 5:05:22 AM6/6/17
to
On 06/06/17 06:24, George Neuner wrote:
> On Mon, 5 Jun 2017 09:10:10 -0700, Don Y <blocked...@foo.invalid>
> wrote:
>
>> C's portability problem isn't with the language, per se, as much as it is
>> with the "practitioners". It could benefit from much stricter type
>> checking and a lot fewer "undefined/implementation-defined behaviors"
>> (cuz it seems folks just get the code working on THEIR target and
>> never see how it fails to execute properly on any OTHER target!)

I don't think C would benefit from a /lot/ fewer undefined or
implementation-dependent behaviours. Some could happily be removed
(IMHO), but most are fine. However, I would like to see
/implementations/ working harder towards spotting this sort of thing in
user code - working to fix the /real/ problem of bad programmers, rather
than changing the language.

For example, some people seem to think that ints have two's complement
wrap-around behaviour on overflow in C, just because that is how the
underlying cpu handles it. Some languages (like Java) avoid undefined
behaviour by giving this a definition - they say exactly how signed
overflow should be handled. In my opinion, this is missing the point -
if your code has signed integers that overflow, you've got a bug. There
is /no/ right answer - picking one and saying "we define the behaviour
/this/ way" does not make it right. So allowing the compiler to assume
that it will never happen, and to optimise accordingly, is a good idea.

But compilers should do their best to spot such cases, and hit out hard
when they see it. When the compiler sees "for (int i = 0; i >= 0;
i++)", it should throw a tantrum - it should not merely break off
compilation with an error message, it should send an email to the
programmer's boss. (I'll settle for a warning message that is enabled
by default.)
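
(To make "optimise accordingly" concrete - the function here is made up
purely for illustration:)

int is_int_max(int x)
{
    /* With wrap-around semantics this would detect INT_MAX.  Because
       signed overflow is undefined behaviour, the compiler may assume
       x + 1 never wraps and legitimately fold this to "return 0;". */
    return x + 1 < x;
}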

Compilers /are/ getting better at warning on undefined behaviour, but
they could always be better.

>
> The argument always has been that if implementation defined behaviors
> are locked down, then C would be inefficient on CPUs that don't have
> good support for <whatever>.
>

Yes - and it is still a good argument.

It might be a nice idea to do a little bit of clean-up of some of the
options. I don't think it would do much harm if future C standards
enforced two's complement signed integers without padding, for example -
one's complement and signed-magnitude machines are extremely rare.


>
> There is no question that C could do much better type/value and
> pointer/index checking, but it likely would come at the cost of far
> more explicit casting (more verbose code), and likely many more
> runtime checks.

That is not necessarily the case - but to get much stronger type
checking in C, you would need to add so many features to the
language that you might as well use C++. For example, it is quite
possible in C to define types "speed", "distance" and "time" so that you
can't simply add a "distance" and a "time", while still being able to
generate optimal code. But you can't use normal operators in
expressions with the types - you can't write "v = d / t;", but need to
write "v = divDistanceTime(d, t);".

>
> A more expressive type system would help [e.g., range integers, etc.],
> but that would constitute a significant change to the language.

Yes, and yes.

Ranged integer types would be nice, and would give not just safer code,
but more efficient code.
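
(For contrast, a hand-rolled stand-in for what a built-in ranged type
would give you automatically - the names are purely illustrative:)

#include <assert.h>
#include <stdint.h>

typedef int_fast8_t percent_t;      /* intended range: 0..100 */

static inline percent_t make_percent(int v)
{
    assert(v >= 0 && v <= 100);     /* manual run-time check; a real ranged
                                       type would let the compiler verify or
                                       elide this */
    return (percent_t)v;
}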


There are many things that would be nice to add to the language (and to
C++), some of which are common as extensions in compilers but which
could usefully be standardised. An example is gcc's
"__builtin_constant_p" feature. This can be used to let the compiler do
compile-time checking where possible, but skip run-time checks for code
efficiency:

extern void __attribute__((error("Assume failed"))) assumeFailed(void);

// The compiler can assume that "x" is true, and optimise or warn
// accordingly
// If the compiler can see that the assume will fail, it gives an error
#define assume(x) \
    do { \
        if (__builtin_constant_p(x)) { \
            if (!(x)) { \
                assumeFailed(); \
            } \
        } \
        if (!(x)) __builtin_unreachable(); \
    } while (0)


If such features were standardised, they could be used in all code - not
just gcc-specific code.
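
(A hypothetical use of the macro above, with a made-up table purely for
illustration:)

enum { TABLE_SIZE = 16 };
extern const int table[TABLE_SIZE];

int lookup(unsigned idx)
{
    assume(idx < TABLE_SIZE);   /* tells the optimiser idx is in range, and
                                   gives a compile-time error if a constant
                                   argument is provably out of range */
    return table[idx];
}
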

>
> Some people point to Ada as an example of a language that can be both
> "fast" and "safe", but many people (maybe not in this group, but many
> nonetheless) are unaware that quite a lot of Ada's type/value checks
> are done at runtime and throw exceptions if they fail.

They also involve a good deal more verbose code.

Don Y

unread,
Jun 6, 2017, 10:03:50 AM6/6/17
to
Hi George,

Snow melt, yet? :> (105+ here, all week)

On 6/5/2017 9:24 PM, George Neuner wrote:
> On Mon, 5 Jun 2017 09:10:10 -0700, Don Y <blocked...@foo.invalid>
> wrote:
>
>> C's portability problem isn't with the language, per se, as much as it is
>> with the "practitioners". It could benefit from much stricter type
>> checking and a lot fewer "undefined/implementation-defined behaviors"
>> (cuz it seems folks just get the code working on THEIR target and
>> never see how it fails to execute properly on any OTHER target!)
>
> The argument always has been that if implementation defined behaviors
> are locked down, then C would be inefficient on CPUs that don't have
> good support for <whatever>.

Of course. When you design a language, you seek to optimize some set
of *many* (often conflicting) design criteria. Do you want it to be
portable? Deterministic? Require minimal keystrokes? Lead to
pronounceable code? etc.

Most "programmers" like to consider themselves "artists" -- in the same
vein as authors/novelists. The fewest constraints on the way we practice
our craft (art?). Imagine if all novels *had* to be written composed entirely
of simple sentences built in a subject-predicate order. Or, if every subject
had to be proper noun, etc.

Alternatively, you can try to constrain the programmer (protect him from
himself) and *hope* he's compliant.

Of course, the range of applications is also significant. A language
intended for scripting has a different set of goals than one that can
effectively implement an OS, etc.

> Look at the (historical) troubles resulting from Java (initially)
> requiring IEEE-754 compliance and that FP results be exactly
> reproducible *both* on the same platform *and* across platforms.
>
> No FP hardware fully implements any version of IEEE-754: every chip
> requires software fixups to achieve compliance, and most fixup suites
> are not even complete [e.g., ignoring unpopular rounding modes, etc.].
> Java FP code ran slower on chips that needed more fixups, and the
> requirements prevented even implementing a compliant Java on some
> chips despite their having FP support.
>
> Java ultimately had to entirely back away from its reproducibility
> guarantees. It now requires only best consistency - not exact
> reproducibility - on the same platform. If you want reproducible
> results, you have to use software floating point (BigFloat), and
> accept much slower code. And by requiring consistency, it can only
> approximate the performance of C code which is likewise compiled. Most
> C compilers allow to eshew FP consistency for more speed ... Java does
> not.
>
> Of course, FP in general is somewhat less important to this crowd than
> to other groups, and C has a lot of implementation defined behavior
> unrelated to FP. But the lesson of trying to lock down hardware
> (and/or OS) dependent behavior still is important.

Yes. But this moves the language's design choices along the axis
AWAY from (inherent) "portability".

You can write "portable" code, in C -- but, it requires a conscious decision
to do so (and a fair bit of practice to do so WELL).

And, what does it mean to claim some piece of code is "portable"? That
it produces the same results without concern for resource usage, execution
speed, code size, etc.? (who decided that THOSE criteria were more/less
important?)

For example, My BigDecimal package will tailor itself to the size of the
largest integer data type in the target. But, this is often NOT what you
would want (it tends to make larger BigDecimals more space efficient), esp
on smaller machines! E.g., on a 16b machine, there might be support for
ulonglong's that my code will exploit... but, if the inherent data type
is "ushort" and all CPU operations are geared towards that size data,
there will be countless "helper routines" invoked just to let my code
use these "unnaturally large" data types, even if they do so inefficiently
(and the code could have just as easily tailored itself to a data size closer
to ushort)
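
(A minimal sketch of that kind of build-time tailoring - the macro and type
names are illustrative, not taken from the actual BigDecimal package:)

#include <stdint.h>

/* Pick the "limb" the big-number arithmetic is built on.  Defaulting to
   the widest integer is space-efficient, but on a 16-bit target you may
   want to override it to avoid emulated 64-bit helper routines. */
#ifndef BIGDEC_LIMB_BITS
#  define BIGDEC_LIMB_BITS 64
#endif

#if BIGDEC_LIMB_BITS == 64
typedef uint64_t bigdec_limb_t;
#elif BIGDEC_LIMB_BITS == 32
typedef uint32_t bigdec_limb_t;
#else
typedef uint16_t bigdec_limb_t;
#endif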

> There is no question that C could do much better type/value and
> pointer/index checking, but it likely would come at the cost of far
> more explicit casting (more verbose code), and likely many more
> runtime checks.

And, more contortions from programmers trying to "work-around" those
checks ("Yeah, I *want* to dereference NULL! I want to 'jump 0x0000'")

> A more expressive type system would help [e.g., range integers, etc.],
> but that would constitute a significant change to the language.
>
> Some people point to Ada as an example of a language that can be both
> "fast" and "safe", but many people (maybe not in this group, but many
> nonetheless) are unaware that quite a lot of Ada's type/value checks
> are done at runtime and throw exceptions if they fail.

IME, that's the only way to get the degree of checking you *want*
(i.e., you, as an individual programmer on a specific project coding
a particular algorithm). But, that voids many tools designed to protect
you from these evils.

> Obviously, a compiler could provide a way to disable the automated
> runtime checking, and even when enabled checks can be elided if the
> compiler can statically prove that a given operation will always be
> safe. But even in Ada with its far more expressive types there are
> many situations in which the compiler simply can't do that.
>
> More stringent languages like ML won't even compile if they can't
> statically type check the code. In such languages, quite a lot of
> programmer effort goes toward clubbing the type checker into
> submission.

I'd considered coding my current project in C++ *just* for the
better type-checking.

E.g., I want new types to be syntactically treated as different
types, not just aliases:

typedef long int handle_t; // reference for an OS object
typedef handle_t file_handle_t; // reference for an OS *file* object
typedef handle_t lamp_handle_t; // reference for an OS lamp object

extern turn_on(lamp_handle_t theLamp);
extern unlink(file_handle_t theFile);

main()
{
    file_handle_t aFile;
    lamp_handle_t aLamp;

    ... // initialization

    turn_on( (lamp_handle_t) aFile);
    unlink( (file_handle_t) aLamp);
}

should raise eyebrows!

Add the potential source of ambiguity that the IDL can introduce
and its too easy for an "uncooperative" developer to craft code
that is way too cryptic and prone to errors.

Imagine poking bytes into a buffer called "stack_frame" and
then trying to wedge it "under" a function invocation...
sure, you can MAKE it work. But, will you know WHY it works,
next week??

David Brown

unread,
Jun 6, 2017, 10:58:53 AM6/6/17
to
On 06/06/17 16:03, Don Y wrote:

> I'd considered coding my current project in C++ *just* for the
> better type-checking.
>
> E.g., I want new types to be syntactically treated as different
> types, not just aliases:
>
> typedef long int handle_t; // reference for an OS object
> typedef handle_t file_handle_t; // reference for an OS *file* object
> typedef handle_t lamp_handle_t; // reference for an OS lamp object
>
> extern turn_on(lamp_handle_t theLamp);
> extern unlink(file_handle_t theFile);
>
> main()
> {
>     file_handle_t aFile;
>     lamp_handle_t aLamp;
>
>     ... // initialization
>
>     turn_on( (lamp_handle_t) aFile);
>     unlink( (file_handle_t) aLamp);
> }
>
> should raise eyebrows!
>

You can get that in C - put your types in structs. It's a pain for
arithmetic types, but works fine for handles.

typedef struct { long int h; } handle_t;
typedef struct { handle_t fh; } file_handle_t;
typedef struct { handle_t lh; } lamp_handle_t;

Now "turn_on" will not accept a lamp_handle_t object.

Tauno Voipio

unread,
Jun 6, 2017, 2:26:45 PM6/6/17
to
Don seems to invent objects without objects.

--

-TV

George Neuner

unread,
Jun 6, 2017, 4:42:02 PM6/6/17
to
On Tue, 6 Jun 2017 07:03:38 -0700, Don Y <blocked...@foo.invalid>
wrote:

>Hi George,
>
>Snow melt, yet? :> (105+ here, all week)

45 and raining. It has rained at least some nearly every day for the
last 2 weeks. Great for pollen counts, bad for mold spores.



>On 6/5/2017 9:24 PM, George Neuner wrote:
>> On Mon, 5 Jun 2017 09:10:10 -0700, Don Y <blocked...@foo.invalid>
>> wrote:
>
>Most "programmers" like to consider themselves "artists" -- in the same
>vein as authors/novelists. The fewest constraints on the way we practice
>our craft (art?).

But that's the thing ... programming in and of itself is a skill that
can be taught, but software "engineering" *IS* an art.

It is exactly analogous to sculpting or piano playing ... almost
anyone can learn to wield hammer and chisel, or to play notes on keys
- but only some can produce beauty in statue or music.


>Imagine if all novels *had* to be composed entirely
>of simple sentences built in a subject-predicate order. Or, if every
>subject had to be a proper noun, etc.

Iambic meter? Limerick? Words that rhyme with "orange".


>Alternatively, you can try to constrain the programmer (protect him from
>himself) and *hope* he's compliant.

Yes. And my preference for a *general* purpose language is to default
to protecting the programmer, but to selectively permit use of more
dangerous constructs in marked "unsafe" code.


>Of course, the range of applications is also significant. A language
>intended for scripting has a different set of goals than one that can
>effectively implement an OS, etc.

Absolutely. A scripting or domain specific language often does not
need to be Turing powerful (or even anywhere close).



>> A more expressive type system would help [e.g., range integers, etc.],
>> but that would constitute a significant change to the language.
>>
>> Some people point to Ada as an example of a language that can be both
>> "fast" and "safe", but many people (maybe not in this group, but many
>> nonetheless) are unaware that quite a lot of Ada's type/value checks
>> are done at runtime and throw exceptions if they fail.
>
>IME, that's the only way to get the degree of checking you *want*
>(i.e., you, as an individual programmer on a specific project coding
>a particular algorithm). But, that voids many tools designed to protect
>you from these evils.

Lisp is safe by default, and even without type declarations a good
compiler can produce quite good code through extensive type/value
propagation analyses.

But by selectively adding declarations and local annotations, a Lisp
programmer can improve performance - sometimes significantly. In many
cases, carefully tuned Lisp code can approach C performance. Type
annotation in Lisp effectively tells the compiler "trust me, I know
what I'm doing" and lets the compiler elide runtime checks that it
couldn't otherwise eliminate, and sometimes switch to (smaller,faster)
untagged data representations.

There's nothing special about Lisp that makes it particularly amenable
to that kind of tuning - Lisp simply can benefit more than some other
languages because it uses tagged data and defaults to perform (almost)
all type checking at runtime.

[Aside: BiBOP still *effectively* tags data even where tags are not
explicitly stored. In BiBOP systems the type of a value is deducible
from its address in memory.]


Modern type inferencing allows a "safer than C" language to use C-like
raw data representations, and to rely *mostly* on static type checking
without the need for a whole lot of type declarations and casting. But
type inferencing has limitations: with current methods it is not
possible to completely eliminate the need for declarations. And as
with Lisp, most type inferencing systems can be assisted to generate
better code by selective use of annotations.

In any case, regardless of what type system is used, it isn't possible
to completely eliminate runtime checking IFF you want a language to be
safe: e.g., pointers issues aside, there's no way to statically
guarantee that range reduction will produce a value that can be safely
stored into a type having a smaller (bit) width. No matter how
sophicated the compiler becomes, there always will be cases where the
programmer knows better and should be able to override it.


But even with these limitations, there are languages that are useful
now and do far more of what you want than does C.


YMMV,
George

Niklas Holsti

unread,
Jun 6, 2017, 5:30:44 PM6/6/17
to
On 17-06-06 07:24 , George Neuner wrote:
> On Mon, 5 Jun 2017 09:10:10 -0700, Don Y <blocked...@foo.invalid>
> wrote:
>
>> C's portability problem isn't with the language, per se, as much as it is
>> with the "practitioners". It could benefit from much stricter type
>> checking and a lot fewer "undefined/implementation-defined behaviors"
>> (cuz it seems folks just get the code working on THEIR target and
>> never see how it fails to execute properly on any OTHER target!)
>
> The argument always has been that if implementation defined behaviors
> are locked down, then C would be inefficient on CPUs that don't have
> good support for <whatever>.

[snip]

> There is no question that C could do much better type/value and
> pointer/index checking, but it likely would come at the cost of far
> more explicit casting (more verbose code), and likely many more
> runtime checks.
>
> A more expressive type system would help [e.g., range integers, etc.],
> but that would constitute a significant change to the language.
>
> Some people point to Ada as an example of a language that can be both
> "fast" and "safe", but many people (maybe not in this group, but many
> nonetheless) are unaware that quite a lot of Ada's type/value checks
> are done at runtime and throw exceptions if they fail.

None of the Ada *type* checks are done at runtime. Only *value* checks
are done at runtime.

> Obviously, a compiler could provide a way to disable the automated
> runtime checking,

All Ada compilers I have seen have an option to disable runtime checks.

> and even when enabled checks can be elided if the
> compiler can statically prove that a given operation will always be
> safe. But even in Ada with its far more expressive types there are
> many situations in which the compiler simply can't do that.

There are other, more proof-oriented tools that can be used for that,
for example the CodePeer tool from AdaCore, or the SPARK toolset.

It is not uncommon for real Ada programs to be proven exception-free
with such tools, which means that it is safe to turn off the runtime checks.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
. @ .

George Neuner

unread,
Jun 7, 2017, 2:02:13 AM6/7/17
to
On Wed, 7 Jun 2017 00:30:40 +0300, Niklas Holsti
<niklas...@tidorum.invalid> wrote:

>On 17-06-06 07:24 , George Neuner wrote:
>
>> Some people point to Ada as an example of a language that can be both
>> "fast" and "safe", but many people (maybe not in this group, but many
>> nonetheless) are unaware that quite a lot of Ada's type/value checks
>> are done at runtime and throw exceptions if they fail.
>
>None of the Ada *type* checks are done at runtime. Only *value* checks
>are done at runtime.

Note that I said "type/value", not simply "type".

In any case, it's a distinction without a difference. The value
checks that need to be performed at runtime are due mainly to use of
differing types that have overlapping range compatibility.

The remaining uses of runtime checks are due to I/O where input values
may be inconsistent with the types involved.


>> Obviously, a compiler could provide a way to disable the automated
>> runtime checking,
>
>All Ada compilers I have seen have an option to disable runtime checks.

Yes. And if you were following the discussion, you would have noticed
that that comment was directed not at Ada, but toward runtime checking
in a hypothetical "safer" C.

George

Don Y

unread,
Jun 7, 2017, 2:17:03 AM6/7/17
to
On 6/6/2017 1:42 PM, George Neuner wrote:
> On Tue, 6 Jun 2017 07:03:38 -0700, Don Y <blocked...@foo.invalid>
> wrote:
>
>> Snow melt, yet? :> (105+ here, all week)
>
> 45 and raining. It has rained at least some nearly every day for the
> last 2 weeks. Great for pollen counts, bad for mold spores.

Yeah, until it eases up a bit and all that stuff comes into BLOOM! :<
We're paying the price for a Spring that came 4 weeks early...

>> On 6/5/2017 9:24 PM, George Neuner wrote:
>>> On Mon, 5 Jun 2017 09:10:10 -0700, Don Y <blocked...@foo.invalid>
>>> wrote:
>>
>> Most "programmers" like to consider themselves "artists" -- in the same
>> vein as authors/novelists. The fewest constraints on the way we practice
>> our craft (art?).
>
> But that's the thing ... programming in and of itself is a skill that
> can be taught, but software "engineering" *IS* an art.

Exactly. But, there are a sh*tload of folks who THINK themselves
"artists" that really should see how the rest of the world views
their "art"! :>

The problem is that you have to *design* systems with these
folks in mind as their likely "maintainers" and/or "evolvers".

So, even if you're "divinely inspired", you have to manage to either
put in place a framework that effectively guides (constrains!) their
future work to remain consistent with that design...

Or, leave copious notes and HOPE they read them, understand them and
take them to heart...

Or, create mechanisms (tools) that cripple the budding artistry they
(think) possess! :>

> It is exactly analogous to sculpting or piano playing ... almost
> anyone can learn to wield hammer and chisel, or to play notes on keys
> - but only some can produce beauty in statue or music.

Yes. Now, imagine The David needing a 21st century "update".
Michelangelo is "unavailable" for the job. <grin>

Do you find the current "contemporary master" (of that art form)
and hire him/her to perform the task? Or, some run-of-the-mill
guy from Sculptors University, Class of 2016??

If you're only concerned with The David *and* have the resources that
such an asset would command, you can probably afford to have the
current Master tackle the job.

OTOH, if you've got a boatload of similar jobs (The David, The Rita,
The Bob, The Bethany, The Harold, The Gretchen, etc.), that one artist
may decide he's tired of being asked to "tweek" the works of past
artists and want a commission of his own! Or, simply not have time
enough in his schedule to get to all of them at the pace you desire!

>> Alternatively, you can try to constrain the programmer (protect him from
>> himself) and *hope* he's compliant.
>
> Yes. And my preference for a *general* purpose language is to default
> to protecting the programmer, but to selectively permit use of more
> dangerous constructs in marked "unsafe" code.

And, count on the DISCIPLINE of all these would-be Michelangelos to
understand (and admit!) their own personal limitations prior to enabling
those constructs?

Like telling a 16 year old "keep it under 35MPH"... <grin>

>> Of course, the range of applications is also significant. A language
>> intended for scripting has a different set of goals than one that can
>> effectively implement an OS, etc.
>
> Absolutely. A scripting or domain specific language often does not
> need to be Turing powerful (or even anywhere close).

What you (ideally) want, is to be able to "set a knob" on the 'side' of
the language to limit its "potential for misuse". But, to do so in a
way that the practitioner doesn't feel intimidated/chastened at its
apparent "setting".

How do you tell a (as yet, unseen!) developer "you're not capable of
safely using pointers to functions? Or, recursive algorithms? Or,
self-modifying code? Or, ... But, *I* am!" :>

(Returning to "portability"...)

Even if I can craft something that is portable under some set of
conditions/criteria that I deem appropriate -- often by leveraging
particular features of the language of a given implementation
thereof -- how do I know the next guy will understand those issues?
How do I know he won't *break* that aspect (portability) -- and
only belatedly discover his error (two years down the road when the
code base moves to a Big Endian 36b processor)?

It's similar to trying to ensure "appropriate" documentation
accompanies each FUTURE change to the system -- who decides
what is "appropriate"? (Ans: the guy further into the future who
can't sort out the changes made by his predecessor!)

>>> A more expressive type system would help [e.g., range integers, etc.],
>>> but that would constitute a significant change to the language.
>>>
>>> Some people point to Ada as an example of a language that can be both
>>> "fast" and "safe", but many people (maybe not in this group, but many
>>> nonetheless) are unaware that quite a lot of Ada's type/value checks
>>> are done at runtime and throw exceptions if they fail.
>>
>> IME, that's the only way to get the degree of checking you *want*
>> (i.e., you, as an individual programmer on a specific project coding
>> a particular algorithm). But, that voids many tools designed to protect
>> you from these evils.
>
> Lisp is safe by default, and even without type declarations a good
> compiler can produce quite good code through extensive type/value
> propagation analyses.
>
> But by selectively adding declarations and local annotations, a Lisp
> programmer can improve performance - sometimes significantly. In many
> cases, carefully tuned Lisp code can approach C performance. Type
> annotation in Lisp effectively tells the compiler "trust me, I know
> what I'm doing" and lets the compiler elide runtime checks that it
> couldn't otherwise eliminate, and sometimes switch to (smaller,faster)
> untagged data representations.

I.e., "trust the programmer" -- except in those cases where you can't! :>

> There's nothing special about Lisp that makes it particularly amenable
> to that kind of tuning - Lisp simply can benefit more than some other
> languages because it uses tagged data and defaults to perform (almost)
> all type checking at runtime.
>
> [Aside: BiBOP still *effectively* tags data even where tags are not
> explicitly stored. In BiBOP systems the type of a value is deducible
> from its address in memory.]
>
> Modern type inferencing allows a "safer than C" language to use C-like
> raw data representations, and to rely *mostly* on static type checking
> without the need for a whole lot of type declarations and casting. But
> type inferencing has limitations: with current methods it is not
> possible to completely eliminate the need for declarations. And as
> with Lisp, most type inferencing systems can be assisted to generate
> better code by selective use of annotations.

But it still requires the programmer to know what he's doing; the compiler
takes its cues from the programmer's actions!

How often have you seen a bare int used as a pointer? Or, vice versa?
("Yeah, I know what you *mean* -- but that's not what you *coded*!")

> In any case, regardless of what type system is used, it isn't possible
> to completely eliminate runtime checking IFF you want a language to be
> safe: e.g., pointers issues aside, there's no way to statically
> guarantee that range reduction will produce a value that can be safely
> stored into a type having a smaller (bit) width. No matter how
> sophicated the compiler becomes, there always will be cases where the
> programmer knows better and should be able to override it.

Exactly. Hence the contradictory issues at play:
- enable the competent
- protect the incompetent

> But even with these limitations, there are languages that are useful
> now and do far more of what you want than does C.

But, when designing (or choosing!) a language, one of the dimensions
in your decision matrix has to be the availability of that language AND
the existing skillsets of its practitioners.

Being "better" (in whatever set of criteria) isn't enough to ensure
acceptance or adoption (witness the Betamax).

ASM saw widespread use -- not because it was the BEST tool for the
job but, rather, because it was (essentially) the ONLY game in town
(in the early embedded world). Amusing that we didn't repeat the same
evolution of languages that was the case in the "mainframe" world
(despite having comparable computational resources to those
ANCIENT machines!).

The (early) languages that we settled on were simple to implement
on the development platforms and with the target resources. Its
only as targets have become more resource-rich that we're exploring
richer execution environments (and the attendant consequences of
that for the developer).

David Brown

unread,
Jun 7, 2017, 3:20:38 AM6/7/17
to
On 06/06/17 22:42, George Neuner wrote:

>
> Modern type inferencing allows a "safer than C" language to use C-like
> raw data representations, and to rely *mostly* on static type checking
> without the need for a whole lot of type declarations and casting. But
> type inferencing has limitations: with current methods it is not
> possible to completely eliminate the need for declarations. And as
> with Lisp, most type inferencing systems can be assisted to generate
> better code by selective use of annotations.
>

Modern C++ has quite powerful type inference now, with C++11 "auto".
This allows you to have complex types, encoding lots of information in
compile-time checkable types, while often being able to use "auto" or
"decltype" to avoid messy and hard to maintain source code.

The next step up is "concepts", which are a sort of meta-type. A
"Number" concept, for example, might describe a type that has arithmetic
operations. Instead of writing your code specifying exactly which
concrete types are used, you describe the properties you need your types
to have.

As you say, however, concrete type declarations cannot be eliminated -
in C++ they are essential when importing or exporting functions and data
between units.

> In any case, regardless of what type system is used, it isn't possible
> to completely eliminate runtime checking IFF you want a language to be
> safe: e.g., pointers issues aside, there's no way to statically
> guarantee that range reduction will produce a value that can be safely
> stored into a type having a smaller (bit) width. No matter how
> sophicated the compiler becomes, there always will be cases where the
> programmer knows better and should be able to override it.
>

Yes. You can do /some/ of the checking at compile-time, but not all of
it. And a sophisticated whole-program optimiser can eliminate some of
the logical run-time checks, but not all of them.

George Neuner

unread,
Jun 8, 2017, 6:38:47 AM6/8/17
to
On Tue, 6 Jun 2017 23:16:52 -0700, Don Y <blocked...@foo.invalid>
wrote:

>>> On 6/5/2017 9:24 PM, George Neuner wrote:
>>
>> ... software "engineering" *IS* an art.
>
>Exactly. But, there are a sh*tload of folks who THINK themselves
>"artists" that really should see how the rest of the world views
>their "art"! :>
>
>The problem is that you have to *design* systems with these
>folks in mind as their likely "maintainers" and/or "evolvers".
>
>So, even if you're "divinely inspired", you have to manage to either
>put in place a framework that effectively guides (constrains!) their
>future work to remain consistent with that design...
>
>Or, leave copious notes and HOPE they read them, understand them and
>take them to heart...

Or adopt a throw-away mentality: replace rather than maintain.

That basically is the idea behind the whole agile/devops/SaaS
movement: if it doesn't work today, no problem - there will be a new
release tomorrow [or sooner].


>> It is exactly analogous to sculpting or piano playing ... almost
>> anyone can learn to wield hammer and chisel, or to play notes on keys
>> - but only some can produce beauty in statue or music.
>
>Yes. Now, imagine The David needing a 21st century "update".
>Michelangelo is "unavailable" for the job. <grin>
>
>Do you find the current "contemporary master" (of that art form)
>and hire him/her to perform the task? Or, some run-of-the-mill
>guy from Sculptors University, Class of 2016??
>
>If you're only concerned with The David *and* have the resources that
>such an asset would command, you can probably afford to have the
>current Master tackle the job.

You know what they say: one decent Lisp programmer is worth 10,000
python monkeys.


>OTOH, if you've got a boatload of similar jobs (The David, The Rita,
>The Bob, The Bethany, The Harold, The Gretchen, etc.), that one artist
>may decide he's tired of being asked to "tweek" the works of past
>artists and want a commission of his own! Or, simply not have time
>enough in his schedule to get to all of them at the pace you desire!

No problem: robots and 3-D printers will take care of that. Just read
an article that predicts AI will best humans at *everything* within 50
years.



>>> Alternatively, you can try to constrain the programmer (protect him from
>>> himself) and *hope* he's compliant.
>>
>> Yes. And my preference for a *general* purpose language is to default
>> to protecting the programmer, but to selectively permit use of more
>> dangerous constructs in marked "unsafe" code.
>
>And, count on the DISCIPLINE of all these would-be Michelangelos to
>understand (and admit!) their own personal limitations prior to enabling
>those constructs?

For the less experienced, fear, uncertainty and doubt are better
counter-motivators than is any amount of discipline. When a person
believes (correctly or not) that something is hard to learn or hard to
use, he or she usually will avoid trying it for as long as possible.

The basic problem with C is that some of its hard-to-master concepts
are dangled right in the faces of new programmers.


For almost any non-system application, you can do without (explicit
source level) pointer arithmetic. But pointers and the address
operator are fundamental to function argument passing and returning
values (note: not "value return"), and it's effectively impossible to
program in C without using them.

This pushes newbies to learn about pointers, machine addressing and
memory management before many are ready. There is plenty else to
learn without *simultaneously* being burdoned with issues of object
location.

Learning about pointers then invariably leads to learning about
arithmetic on pointers because they are covered together in most
tutorials.

Keep in mind that the majority of people learning and using C (or C++)
today have no prior experience with hardware or even with programming
in assembler. If C isn't their 1st (non-scripting) language then most
likely their prior experiences were with "safe", high level, GC'd
languages that do not expose object addressing: e.g., Java, Scheme,
Python, etc. ... the commonly used "teaching" languages.

For general application programming, there is no need for a language
to provide mutable pointers: initialized references, together with
array (or stream) indexing and struct/object member access are
sufficient for virtually any non-system programming use. This has
been studied extensively and there is considerable literature on the
subject.

[Note also I am talking about what a programmer is permitted to do at
the source code level ... what a compiler does to implement object
addressing under the hood is beside the point.]


<frown>

Mutable pointers are just the tip of the iceberg: I could write a
treatise on the difficulties / frustrations of the *average*
programmer with respect to manual memory management, limited precision
floating point, differing logical "views" of the same data,
parallelism, etc. ...
... and how C's "defects" with regard to safe *application*
programming conspire to add to their misery.

But this already is long and off the "portability" topic.



>What you (ideally) want, is to be able to "set a knob" on the 'side' of
>the language to limit its "potential for misuse". But, to do so in a
>way that the practitioner doesn't feel intimidated/chastened at its
>apparent "setting".

Look at Racket's suite of teaching and extension languages. They all
are implemented over the same core language (an extended Scheme), but
they leverage the flexibility of the core language to offer different
syntaxes, different semantics, etc.

In the case of the teaching languages, there is reduced functionality,
combined with more newbie friendly debugging output, etc.

http://racket-lang.org/
https://docs.racket-lang.org/htdp-langs/index.html

And, yeah, the programmer can change which language is in use with a
simple "#lang <_>" directive, but the point here is the flexibility of
the system to provide (more or less) what you are asking for.



>(Returning to "portability"...)
>
>Even if I can craft something that is portable under some set of
>conditions/criteria that I deem appropriate -- often by leveraging
>particular features of the language of a given implementation
>thereof -- how do I know the next guy will understand those issues?
>How do I know he won't *break* that aspect (portability) -- and
>only belatedly discover his error (two years down the road when the
>code base moves to a Big Endian 36b processor)?

You don't, and there is little you can do about it. You can try to be
helpful - e.g., with documentation - but you can't be responsible for
what the next person will do.

No software truly is portable except that which runs on an abstract
virtual machine. As long as the virtual machine can be realized on a
particular base platform, the software that runs on the VM is
"portable" to that platform.


>It's similar to trying to ensure "appropriate" documentation
>accompanies each FUTURE change to the system -- who decides
>what is "appropriate"? (Ans: the guy further into the future who
>can't sort out the changes made by his predecessor!)

Again, you are only responsible for what you do.



>> ... No matter how sophisticated the compiler becomes, there always
>> will be cases where the programmer knows better and should be able
>> to override it.
>
>Exactly. Hence the contradictory issues at play:
>- enable the competent
>- protect the incompetent
>
>> But even with these limitations, there are languages that are useful
>> now and do far more of what you want than does C.
>
>But, when designing (or choosing!) a language, one of the dimensions
>in your decision matrix has to be the availability of that language AND
>the existing skillsets of its practitioners.

The modern concept of availability is very different than when you had
to wait for a company to provide a turnkey solution, or engineer
something yourself from scratch. Now, if the main distribution
doesn't run on your platform, you are likely to find source that you
can port yourself (if you are able), or if there's any significant
user base, you may find that somebody else already has done it.

Tutorials, reference materials, etc. are a different matter, but the
simpler and more uniform the syntax and semantics, the easier the
language is to learn and to master.

question: why in C is *p.q == p->q
but *p != p
and p.q != p->q

followup: given coincidental addresses and modulo a cast,
how is it that *p can == *p.q

Shit like this makes a student's head explode.
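
(For the record, with a made-up struct, the relationships in question are:)

struct S { int q; };
struct S s, *p = &s;

/* p->q  is defined as (*p).q - the same object                          */
/* p.q   does not compile: p is a pointer, '.' wants the struct itself   */
/* *p.q  always parses as *(p.q), because '.' binds tighter than '*' -
         exactly the kind of precedence surprise that catches students   */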


In Pascal, the pointer dereference operator '^' and the record
(struct) member access operator '.' were separate and always used
consistently. The type system guarantees that p and p^ and p^.q can
never, ever be the same object.

This visual and logical consistency made Pascal easier to learn. And
not any less functional.

My favorite dead horse - Modula 3 - takes a similar approach. Modula
3 is both a competent bare metal system language AND a safe OO
application language. It does a whole lot more than (extended) Pascal
- yet it isn't that much harder to learn.

It is possible to learn Modula 3 incrementally: leaving advanced
subjects such as where objects are located in memory and when it's
safe to delete() them - until you absolutely need to know.

And if you stick to writing user applications in the safe subset of
the language, you may never need to learn it: Modula 3 uses GC by
default.


>Being "better" (in whatever set of criteria) isn't enough to ensure
>acceptance or adoption (witness the Betamax).

Unfortunately.

>ASM saw widespread use -- not because it was the BEST tool for the
>job but, rather, because it was (essentially) the ONLY game in town
>(in the early embedded world). Amusing that we didn't repeat the same
>evolution of languages that was the case in the "mainframe" world
>(despite having comparable computational resources to those
>ANCIENT machines!).
>
>The (early) languages that we settled on were simple to implement
>on the development platforms and with the target resources. It's
>only as targets have become more resource-rich that we're exploring
>richer execution environments (and the attendant consequences of
>that for the developer).


There never was any C compiler that ran on any really tiny machine.
Ritchies' technotes on the development of C stated that the original
1972 PDP-11 compiler had to run in ~6KB (all that was left after
loading Unix), required several passes, and really was not usable
until the machine was given a hard disk. Note also that that 1st
compiler implemented only a subset of K&R1.

K&R1 - as described in the book - was 1st implemented in 1977 and I
have never seen any numbers on the size of that compiler.


The smallest K&R1 compiler I can remember that *ran* on an 8-bit micro
was circa 1983. It was a few hundred KB of code. It ran in 48KB
using overlays, needed 2 floppy drives or a hard disk, and required 2
compile passes per source file and a final link pass.

It was quite functional (if glacially slow), and included program code
overlay support and emulated single precision FP (in VAX format IIRC).
Although it targeted a 16-bit virtual machine with 6 16-bit registers,
it produced native 8-bit code : i.e. the "16-bit VM" program was not
interpreted, but was emulated by 8-bit code.

As part of the pro package (and available separately for personal use)
there also was a bytecode compiler that allowed packing much larger
applications (or their data) into memory. It had all the same
features as the native code compiler, but produced interpreted code
that ran much slower. You could use both native and interpreted code
in the same application via overlays.


There existed various subset C compilers that could run in less than
48KB, but most of them were no more than expensive learning toys.



But even by the standard of "the compiler could run on the machine",
there were languages better suited than C for application programming.

Consider that in the late 70's there already were decent 8-bit
implementations of BASIC, BCPL, Logo, SNOBOL, etc. (Extended) Pascal,
Smalltalk, SNOBOL4, etc. became available in the early 80's for both 8
and 16-bit systems. But C really wasn't useable on any micro prior to
~1985 when reasonably<?> priced hard disks appeared.

Undoubtedly, AT&T giving away Unix to colleges from 1975..1979 meant
that students in that time frame would have gained some familiarity
with C. 16-bit micros powerful enough to really be characterized as
useful "development" systems popped out in the early 80's as these
students would have been graduating (or shortly thereafter).

But they were extremely expensive: tens of thousands of dollars for a
usable system. You'd have to mortgage your home to afford one, which
is not something the newly working with looming college loans would do
lightly. And sans hard disk (more $$$), you'd manage only one or two
compiles a day.

Turbo Pascal was the 1st really useable [in the modern sense]
development system. It did not need a hard disk and it hit the
market before commodity hard disks were widely available.


The question is not why C was adopted for system programming, or for
cross development from a capable system to a smaller target. Rather
the question is why it was so widely adopted for ALL kinds of
programming on ALL platforms given that there were many other reasonable
choices available.


YMMV. I remain perplexed.
George

Dimiter_Popoff

unread,
Jun 8, 2017, 8:58:54 AM6/8/17
to
On 08.06.2017 13:38, George Neuner wrote:
> ...
>
> The question is not why C was adopted for system programming, or for
> cross development from a capable system to a smaller target. Rather
> the question is why it was so widely adopted for ALL kinds of
> programming on ALL platforms given that were many other reasonable
> choices available.

My take on that is it happened because people needed a low level
language, some sort of assembler - and the most widespread CPU was
the x86 with a register model for which no sane person would consider
programming larger pieces of code.
I am sure there have been people who have done
it but they can't have been exactly sane :) (i.e. have been insane in
a way most people would have envied them for their insanity).
So C made x86 usable - and the combination (C+x86) is the main factor
which led to the absurd situation we have today, where code which
used to take kilobytes of memory takes gigabytes (not because of the
inefficiency of compilers, just because of where most programmers
have been led to).

Tom Gardner

unread,
Jun 8, 2017, 9:50:05 AM6/8/17
to
On 08/06/17 11:38, George Neuner wrote:
> The smallest K&R1 compiler I can remember that *ran* on an 8-bit micro
> was circa 1983. It was a few hundred KB of code. It ran in 48KB
> using overlays, needed 2 floppy drives or a hard disk, and required 2
> compile passes per source file and a final link pass.
>
> It was quite functional (if glacially slow), and included program code
> overlay support and emulated single precision FP (in VAX format IIRC).
> Although it targeted a 16-bit virtual machine with 6 16-bit registers,
> it produced native 8-bit code : i.e. the "16-bit VM" program was not
> interpreted, but was emulated by 8-bit code.

Whitesmiths? IIRC the symbol table size became a limiting
factor during linking, so linking became multipass :(


> There existed various subset C compilers that could run in less than
> 48KB, but most of them were no more than expensive learning toys.

I always found that remarkable, since the Algol-60 compiler ran in
4K words of 2 instructions/word.


> But even by the standard of "the compiler could run on the machine",
> there were languages better suited than C for application programming.
>
> Consider that in the late 70's there already were decent 8-bit
> implementations of BASIC, BCPL Logo, SNOBOL, etc. (Extended) Pascal,
> Smalltalk, SNOBOL4, etc. became available in the early 80's for both 8
> and 16-bit systems. But C really wasn't useable on any micro prior to
> ~1985 when reasonably<?> priced hard disks appeared.

I'll debate Smalltalk :) Apple's implementation (pre L Peter
Deutsch's JIT) was glacially slow. I know: it is still running
on my fat Mac downstairs :)


> The question is not why C was adopted for system programming, or for
> cross development from a capable system to a smaller target. Rather
> the question is why it was so widely adopted for ALL kinds of
> programming on ALL platforms given that were many other reasonable
> choices available.

Yes indeed.

Fortunately The New Generation has seen the light, for better
and for worse.

But then if you make it possible to program in English,
you will find that people cannot think and express
themselves in English.

Tauno Voipio

unread,
Jun 8, 2017, 10:38:45 AM6/8/17
to
On 8.6.17 16:50, Tom Gardner wrote:
> On 08/06/17 11:38, George Neuner wrote:
>> The smallest K&R1 compiler I can remember that *ran* on an 8-bit micro
>> was circa 1983. It was a few hundred KB of code. It ran in 48KB
>> using overlays, needed 2 floppy drives or a hard disk, and required 2
>> compile passes per source file and a final link pass.
>>
>> It was quite functional (if glacially slow), and included program code
>> overlay support and emulated single precision FP (in VAX format IIRC).
>> Although it targeted a 16-bit virtual machine with 6 16-bit registers,
>> it produced native 8-bit code : i.e. the "16-bit VM" program was not
>> interpreted, but was emulated by 8-bit code.
>
> Whitesmiths? IIRC the symbol table size became a limiting
> factor during linking, so linking became multipass :(

Must be. It ran on CP/M machines.

>> There existed various subset C compilers that could run in less than
>> 48KB, but most of them were no more than expensive learning toys.
>
> I always found that remarkable, since Algol-60 compiler ran in
> 4kwords of 2 instructions/word.

You mean Elliott 803 / 503?

It had also an overlay structure. If the program grew above a
certain limit, it was dumped out in an intermediate format, and
the operator needed to feed in the second compiler pass paper
tape and the intermediate code ('owncode') to get the final
run code.

--

-TV

Tom Gardner

unread,
Jun 8, 2017, 11:37:36 AM6/8/17
to
Yes and yes.

I saw a running 803 a couple of weeks ago, and discussed
the circuit diagrams with the staff member there.

upsid...@downunder.com

unread,
Jun 8, 2017, 2:34:36 PM6/8/17
to
On Thu, 08 Jun 2017 15:58:51 +0300, Dimiter_Popoff <d...@tgi-sci.com>
wrote:
PL/M-80 and PL/M-86 were quite reasonable intermediate languages.

The same also applies to BLISS for PDP-10/PDP-11/VAX/Alpha and
recently some Intel HW.

The reason these languages did not become popular was that the
hardware vendors wanted to make money from compiler sales.

Some HW companies, wanting to boost their HW sales, gave away
compilers and development software for free, and boosted their
HW sales that way.

Don Y

unread,
Jun 9, 2017, 3:06:17 AM6/9/17
to
On 6/8/2017 3:38 AM, George Neuner wrote:
>> The problem is that you have to *design* systems for with these
>> folks in mind as their likely "maintainers" and/or "evolvers".
>>
>> So, even if you're "divinely inspired", you have to manage to either
>> put in place a framework that effectively guides (constrains!) their
>> future work to remain consistent with that design...
>>
>> Or, leave copious notes and HOPE they read them, understand them and
>> take them to heart...
>
> Or adopt a throw-away mentality: replace rather than maintain.
>
> That basically is the idea behind the whole agile/devops/SaaS
> movement: if it doesn't work today, no problem - there will be a new
> release tomorrow [or sooner].

I think those are just enablers for PHB's who are afraid to THINK
about what they want (in a product/design) and, instead, want to be shown
what they DON'T want.

I encountered a woman who was looking for a "mobility scooter" a week or
two ago. I showed her *one* and she jumped at the opportunity. I
quickly countered with a recommendation that some OTHER form of "transport"
might be better for her:
"The scooter has a wide turning radius. If you head down a hallway
(i.e., in your home) and want to turn around, you'll have to either
continue in the current direction until you encounter a wider space
that will accommodate the large turning radius *or* travel backwards
retracing your steps. A powerchair will give you a smaller turning
radius. An electric wheelchair tighter still!"
She was insistent on the scooter. Fearing that she was clinging to it
as the sole CONCRETE example available, I told her that I also had
examples of each of the other options available.

[I was fearful of getting into a situation where I refurbished one
"transport device", sent her home with it -- only to find her returning
a week later complaining of its limitations, and wanting to "try another
option"]

In this case, she had clearly considered the options and come to the
conclusion that the scooter was best suited to her needs: the chair
options tend to be controlled by a joystick interface whereas the
scooter has a tiller (handlebars) and "speed setting". For her,
the tremor in her hands made the fine motor skills required to interact
with the joystick impractical. So, while the scooter was less
maneuverable (in the abstract sense), it was more CONTROLLABLE in her
particular case. She'd actively considered the options instead of
needing to "see" each of them (to discover each of their shortcomings).

>> OTOH, if you've got a boatload of similar jobs (The David, The Rita,
>> The Bob, The Bethany, The Harold, The Gretchen, etc.), that one artist
>> may decide he's tired of being asked to "tweek" the works of past
>> artists and want a commission of his own! Or, simply not have time
>> enough in his schedule to get to all of them at the pace you desire!
>
> No problem: robots and 3-D printers will take care of that. Just read
> an article that predicts AI will best humans at *everything* within 50
> years.

Yeah, Winston told me that... 40 years ago! :>

>>>> Alternatively, you can try to constrain the programmer (protect him from
>>>> himself) and *hope* he's compliant.
>>>
>>> Yes. And my preference for a *general* purpose language is to default
>>> to protecting the programmer, but to selectively permit use of more
>>> dangerous constructs in marked "unsafe" code.
>>
>> And, count on the DISCIPLINE of all these would-be Michelangelos to
>> understand (and admit!) their own personal limitations prior to enabling
>> those constructs?
>
> For the less experienced, fear, uncertainty and doubt are better
> counter-motivators than is any amount of discipline. When a person
> believes (correctly or not) that something is hard to learn or hard to
> use, he or she usually will avoid trying it for as long as possible.

Or, will think they are "above average" and, thus, qualified to KNOW
how to use/do it!

> The basic problem with C is that some of its hard to master concepts
> are dangled right in the faces of new programmers.

I think the problem is that the "trickier" aspects aren't really
labeled as such.

I know most folks would rather tackle a multiplication problem than
an equivalent one of division. But, they've learned (from experience)
of the relative costs/perils of each. It's not like there is a
big red flag on the chapter entitled "division" that warns of Dragons!

> For almost any non-system application, you can do without (explicit
> source level) pointer arithmetic. But pointers and the address
> operator are fundamental to function argument passing and returning
> values (note: not "value return"), and it's effectively impossible to
> program in C without using them.

But, if you'd a formal education in CS, it would be trivial to
semantically map the mechanisms to value and reference concepts.
And, thinking of "reference" in terms of an indication of WHERE
it is! etc.

Similarly, many of the "inconsistencies" (to noobs) in the language
could easily be explained with "common sense":
- why aren't strings/arrays passed by value? (think about how
ANYTHING is passed by value; the answer should then be obvious)
- the whole notion of references being IN/OUT's
- gee, const can ensure an IN can't be used as an OUT!
etc.
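
(For instance, a minimal C sketch of that IN/OUT point -- the function
and names here are invented for illustration, not taken from any real
code:)

    #include <stddef.h>

    /* 'src' is the IN: const turns any write through it into a compile
       error.  'dst' is the OUT: the callee fills the caller's storage.
       Only the two pointers are copied, never the arrays they point at,
       which is also the "why aren't arrays passed by value" answer.    */
    void to_upper(const char *src, char *dst, size_t len)
    {
        for (size_t i = 0; i < len; i++)
            dst[i] = (src[i] >= 'a' && src[i] <= 'z') ? src[i] - 32
                                                      : src[i];
    }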

I think the bigger problem is that folks are (apparently) taught
"keystrokes" instead of "concepts": type THIS to do THAT.

> This pushes newbies to learn about pointers, machine addressing and
> memory management before many are ready. There is plenty else to
> learn without *simultaneously* being burdoned with issues of object
> location.

Then approach the topics more incrementally. Instead of introducing
the variety of data types (including arrays), introduce the basic
ones. Then, discuss passing arguments -- and how they are COPIED into
a stack frame.

This can NATURALLY lead to the fact that you can only "return" one
datum; which the caller would then have to explicitly assign to
<whatever>. "Gee, wouldn't it be nice if we could simply POINT to
the things that we want the function (subroutine) to operate on?"

Then, how you can use references to economize on the overhead of passing
large objects (like strings/arrays) to functions.

Etc.
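
(Continuing that line of thought, a tiny hedged example -- the struct is
made up -- of why pointing at the caller's object beats both "return"
and copying:)

    #include <stdio.h>
    #include <string.h>

    struct record { char name[64]; int count; };

    /* Passing 'struct record' by value would copy the whole thing into
       the callee's stack frame; passing its address copies one pointer,
       and the callee can update the caller's copy directly -- no need
       to "return" it at all.                                           */
    static void bump(struct record *r)
    {
        r->count++;
    }

    int main(void)
    {
        struct record rec;
        strcpy(rec.name, "widget");
        rec.count = 0;

        bump(&rec);
        printf("%s: %d\n", rec.name, rec.count);   /* widget: 1 */
        return 0;
    }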

I just think the teaching approach is crippled. It's driven by industry
with the goal of getting folks who can crank out code, regardless of
quality or comprehension.

> Learning about pointers then invariably leads to learning about
> arithmetic on pointers because they are covered together in most
> tutorials.
>
> Keep in mind that the majority of people learning and using C (or C++)
> today have no prior experience with hardware or even with programming
> in assembler. If C isn't their 1st (non-scripting) language then most
> likely their prior experiences were with "safe", high level, GC'd
> languages that do not expose object addressing: e.g., Java, Scheme,
> Python, etc. ... the commonly used "teaching" languages.

But you can still expose a student to the concepts of the underlying
machine, regardless of language. Introduce a hypothetical machine...
something with, say, memory and a computation unit. Treat memory
as a set of addressable "locations", etc. My first "computer texts"
all presented a conceptual model of a "computer system" -- even though
the languages discussed (e.g., FORTRAN) hid much of that from the
casual user.

Instead, there's an emphasis on idioms and tricks that aren't portable
and confuse the issue(s). It's like teaching a student driver about the
infotainment system in the vehicle instead of how the brake and accelerator
operate.

> For general application programming, there is no need for a language
> to provide mutable pointers: initialized references, together with
> array (or stream) indexing and struct/object member access are
> sufficient for virtually any non-system programming use. This has
> been studied extensively and there is considerable literature on the
> subject.

But then you force the developer to pick different languages for
different aspects of a problem. How many folks are comfortable
with this "application specific" approach to *a* problem's solution?

E.g., my OS is coded in C and ASM. Most of the core services are
written in C (so I can provide performance guarantees) with my bogus
IDL to handle RPC/IPC. The RDBMS server is accessed using SQL.
And, "applications" are written in my modified-Limbo.

This (hopefully) "works" because most folks will only be involved
with *one* of these layers. And, folks who are "sufficiently motivated"
to make their additions/modifications *work* can resort to cribbing
from the existing parts of the design -- as "examples" of how they
*could* do things ("Hey, this works; why not just copy it?")

OTOH, if someone had set out to tackle the whole problem in a single
language/style... <shrug>

>> What you (ideally) want, is to be able to "set a knob" on the 'side' of
>> the language to limit its "potential for misuse". But, to do so in a
>> way that the practitioner doesn't feel intimidated/chastened at its
>> apparent "setting".
>
> Look at Racket's suite of teaching and extension languages. They all
> are implemented over the same core language (an extended Scheme), but
> they leverage the flexibility of the core langauge to offer different
> syntaxes, different semantics, etc.
>
> In the case of the teaching languages, there is reduced functionality,
> combined with more newbie friendly debugging output, etc.
>
> http://racket-lang.org/
> https://docs.racket-lang.org/htdp-langs/index.html
>
> And, yeah, the programmer can change which language is in use with a
> simple "#lang <_>" directive, but the point here is the flexibility of
> the system to provide (more or less) what you are asking for.

I'm sure you've worked in environments where the implementation
was "dictated" by what appeared to be arbitrary constraints:
will use this language, these tools, this process, etc. IME,
programmers *chafe* at such constraints. Almost as if they were
personal affronts ("*I* know the best way to tackle the problem
that *I* have been assigned!"). Imagine how content they'd be
knowing they were being told to eat at the "kiddie table".

I designed a little serial protocol that lets me daisy-chain
messages through simple "motes". The protocol had to be simple
and low overhead as the motes are intended to be *really*
crippled devices -- at best, coded in C (on a multitasking
*executive*, not even a full-fledged OS) and, more likely,
in ASM.

When I went to code the "host" side of the protocol, my first
approach was to use Limbo -- this should make it more maintainable
by those who follow (goal is to reduce the requirements imposed
on future developers as much as possible).

But, I was almost literally grinding my teeth as I was forced to
build message packets in "byte arrays" with constant juggling
of array indices, etc. (no support for pointers). I eventually
"rationalized" that this could be viewed as a "core service"
(communications) and, thus, suitable for coding along the same
lines as the other services: in C! :>

An hour later, the code is working and (to me) infinitely more
intuitive than a bunch of "array slices" and "casts".
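
(To make the point concrete, here's the flavor of it in C -- the frame
layout below is hypothetical, not the actual protocol:)

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical frame: sync byte, destination node, payload length,
       payload bytes, then a trivial additive checksum.                 */
    static size_t build_frame(uint8_t *buf, uint8_t node,
                              const uint8_t *payload, uint8_t len)
    {
        uint8_t *p = buf;
        uint8_t sum = 0;

        *p++ = 0x7E;            /* sync               */
        *p++ = node;            /* destination node   */
        *p++ = len;             /* payload length     */
        memcpy(p, payload, len);
        p += len;

        for (const uint8_t *q = buf + 1; q < p; q++)
            sum += *q;          /* checksum over everything after sync */
        *p++ = sum;

        return (size_t)(p - buf);   /* bytes to push out the wire */
    }

Doing the same thing with nothing but array slices and explicit index
bookkeeping is certainly possible -- it just reads far worse.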

>> (Returning to "portability"...)
>>
>> Even if I can craft something that is portable under some set of
>> conditions/criteria that I deem appropriate -- often by leveraging
>> particular features of the language of a given implementation
>> thereof -- how do I know the next guy will understand those issues?
>> How do I know he won't *break* that aspect (portability) -- and
>> only belatedly discover his error (two years down the road when the
>> code base moves to a Big Endian 36b processor)?
>
> You don't, and there is little you can do about it. You can try to be
> helpful - e.g., with documentation - but you can't be responsible for
> what the next person will do.

Of course! My approach is to exploit laziness and greed. Leave
bits of code that are RIPE for using as the basis for new services
("templates", of sorts). And, let the developer feel he can do
whatever he wants -- if he's willing to bear the eventual cost
for those design decisions (which might include users opting not
to deploy his enhancements!)

> No software truly is portable except that which runs on an abstract
> virtual machine. As long as the virtual machine can be realized on a
> particular base platform, the software that runs on the VM is
> "portable" to that platform.
>
>> It's similar to trying to ensure "appropriate" documentation
>> accompanies each FUTURE change to the system -- who decides
>> what is "appropriate"? (Ans: the guy further into the future who
>> can't sort out the changes made by his predecessor!)
>
> Again, you are only responsible for what you do.

But, you can use the same lazy/greedy motivators there, as well.
E.g., my gesture recognizer builds the documentation for the
gesture from the mathematical model of the gesture. This
relieves the developer from that task, ensures the documentation
is ALWAYS in sync with the implementation *and* makes it trivial
to add new gestures by lowering the effort required.

>>> ... No matter how sophisticated the compiler becomes, there always
>>> will be cases where the programmer knows better and should be able
>>> to override it.
>>
>> Exactly. Hence the contradictory issues at play:
>> - enable the competent
>> - protect the incompetent
>>
>>> But even with these limitations, there are languages that are useful
>>> now and do far more of what you want than does C.
>>
>> But, when designing (or choosing!) a language, one of the dimensions
>> in your decision matrix has to be availability of that language AND in
>> the existing skillsets of its practitioners.
>
> The modern concept of availability is very different than when you had
> to wait for a company to provide a turnkey solution, or engineer
> something yourself from scratch. Now, if the main distribution
> doesn't run on your platform, you are likely to find source that you
> can port yourself (if you are able), or if there's any significant
> user base, you may find that somebody else already has done it.

That works for vanilla implementations. It leads to all designs
looking like all others ("Lets use a PC for this!"). This is
fine *if* that's consistent with your product/project goals.
But, if not, you're SoL.

Or, faced with a tool porting/development task that exceeds the
complexity of your initial problem.

> Tutorials, reference materials, etc. are a different matter, but the
> simpler and more uniform the syntax and semantics, the easier the
> language is to learn and to master.
>
> question: why in C is *p.q == p->q
> but *p != p
> and p.q != p->q
>
> followup: given coincidental addresses and modulo a cast,
> how is it that *p can == *p.q
>
> Shit like this makes a student's head explode.

But C is lousy for its use of graphemes/glyphs. You'd think
K&R were paraplegics given how stingy they are with keystrokes!

Or, supremely lazy! (or, worse, think *us* that lazy!)

[I guess it could be worse; they could have forced all
identifiers to be single character!]
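
(One reading of that quoted head-exploder is simple precedence: '.' and
'->' bind tighter than unary '*'.  A tiny compilable illustration, with
made-up names:)

    #include <stdio.h>

    struct node { int v; int *q; };

    int main(void)
    {
        int x = 42;
        struct node n = { 7, &x };
        struct node *p = &n;

        printf("%d\n", (*p).q == p->q);  /* 1: identical by definition      */
        printf("%d\n", *p->q);           /* 42: '->' binds before unary '*' */
        printf("%d\n", *(*p).q);         /* 42: the same thing spelled out  */
        /* "*p.q" would not even compile here: '.' wants a struct, not a
           pointer -- it parses as *(p.q).                                  */
        return 0;
    }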

> In Pascal, the pointer dereference operator '^' and the record
> (struct) member access operator '.' were separate and always used
> consistently. The type system guarantees that p and p^ and p^.q can
> never, ever be the same object.
>
> This visual and logical consistency made Pascal easier to learn. And
> not any less functional.
>
> My favorite dead horse - Modula 3 - takes a similar approach. Modula
> 3 is both a competent bare metal system language AND a safe OO
> application language. It does a whole lot more than (extended) Pascal
> - yet it isn't that much harder to learn.
>
> It is possible to learn Modula 3 incrementally: leaving advanced
> subjects such as where objects are located in memory and when it's
> safe to delete() them - until you absolutely need to know.
>
> And if you stick to writing user applications in the safe subset of
> the language, you may never need to learn it: Modula 3 uses GC by
> default.

The same is largely true of Ada. But, with Ada, you end up knowing
an encyclopaedic language that, in most cases, is overkill and affords
little for nominal projects.

An advantage of ASM was that there were *relatively* few operators
and addressing modes, etc. Even complex instructions could be reliably
(*and* mechanically) "decoded". You didn't find yourself wondering
if something was a constant pointer to variable data, or a variable
pointer to constant data, or a constant pointer to constant data, or...

And, ASM syntax tended to be more "fixed form". There wasn't as much
poetic license to how you expressed particular constructs.

E.g., I instinctively write "&array[0]" instead of "array" (depending on
the use).
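
(For anyone decoding those declarations, a compilable scrap -- nothing
here is from any particular project:)

    #include <stdio.h>

    static char array[16];

    int main(void)
    {
        const char *a       = array;  /* variable pointer to constant data  */
        char *const b       = array;  /* constant pointer to variable data  */
        const char *const c = array;  /* constant pointer to constant data  */

        /* In an expression, 'array' decays to a pointer to its first
           element, so the two spellings name the same address:        */
        printf("%d\n", array == &array[0]);   /* prints 1 */

        (void)a; (void)b; (void)c;
        return 0;
    }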

>> ASM saw widespread use -- not because it was the BEST tool for the
>> job but, rather, because it was (essentially) the ONLY game in town
>> (in the early embedded world). Amusing that we didn't repeat the same
>> evolution of languages that was the case in the "mainframe" world
>> (despite having comparable computational resources to those
>> ANCIENT machines!).
>>
>> The (early) languages that we settled on were simple to implement
>> on the development platforms and with the target resources. Its
>> only as targets have become more resource-rich that we're exploring
>> richer execution environments (and the attendant consequences of
>> that for the developer).
>
> There never was any C compiler that ran on any really tiny machine.

Doesn't have to run *on* a tiny machine. It just had to generate code
that could run on a tiny machine!

E.g., we used an 11 to write our i4004 code; the idea of even something
as crude as an assembler running *ON* an i4004 was laughable!

> Ritchies' technotes on the development of C stated that the original
> 1972 PDP-11 compiler had to run in ~6KB (all that was left after
> loading Unix), required several passes, and really was not usable
> until the machine was given a hard disk. Note also that that 1st
> compiler implemented only a subset of K&R1.
>
> K&R1 - as described in the book - was 1st implemented in 1977 and I
> have never seen any numbers on the size of that compiler.
>
> The smallest K&R1 compiler I can remember that *ran* on an 8-bit micro
> was circa 1983. It was a few hundred KB of code. It ran in 48KB
> using overlays, needed 2 floppy drives or a hard disk, and required 2
> compile passes per source file and a final link pass.

It wasn't uncommon for early *assemblers* to require multiple passes.

I built some small CP/M based development systems for an employer
many years ago. To save a few bucks, he opted to deploy most of them
(mine being the exception! :> ) with a single 1.4M floppy. The
folks using them were ecstatic as they were so much faster than the
ZDS boxes we'd used up to then (hard sectored floppies, etc.).

But, had the boss *watched* folks using them and counted the amount
of time LOST swapping floppies (esp when you wanted to make a backup
of a floppy!), he'd have realized how foolhardy his "disk economy"
had been!

> It was quite functional (if glacially slow), and included program code
> overlay support and emulated single precision FP (in VAX format IIRC).
> Although it targeted a 16-bit virtual machine with 6 16-bit registers,
> it produced native 8-bit code : i.e. the "16-bit VM" program was not
> interpreted, but was emulated by 8-bit code.
>
> As part of the pro package (and available separately for personal use)
> there also was a bytecode compiler that allowed packing much larger
> applications (or their data) into memory. It had all the same
> features as the native code compiler, but produced interpreted code
> that ran much slower. You could use both native and interpreted code
> in the same application via overlays.
>
> There existed various subset C compilers that could run in less than
> 48KB, but most of them were no more than expensive learning toys.

Whitesmith's and Manx?

JRT Pascal ($19.95!) ran on small CP/M boxes. IIRC, there was an M2 that
also ran there. And, MS had a BASIC compiler.

> But even by the standard of "the compiler could run on the machine",
> there were languages better suited than C for application programming.
>
> Consider that in the late 70's there already were decent 8-bit
> implementations of BASIC, BCPL Logo, SNOBOL, etc. (Extended) Pascal,
> Smalltalk, SNOBOL4, etc. became available in the early 80's for both 8
> and 16-bit systems. But C really wasn't useable on any micro prior to
> ~1985 when reasonably<?> priced hard disks appeared.
>
> Undoubtedly, AT&T giving away Unix to colleges from 1975..1979 meant
> that students in that time frame would have gained some familiarity
> with C. 16-bit micros powerful enough to really be characterized as
> useful "development" systems popped out in the early 80's as these
> students would have been graduating (or shortly thereafter).
>
> But they were extremely expensive: tens of thousands of dollars for a
> usable system. You'd have to mortage your home to afford one, which
> is not something the newly working with looming college loans would do
> lightly. And sans hard disk (more $$$), you'd manage only one or two
> compiles a day.

But you didn't have to rely on having a home system to write code.
Just like most folks don't *rely* on having home internet to access
the web, email, etc.

If you're still in school, there's little to prevent you from using
their tools for a "personal project". Ditto if employed. The only
caveat being "not on company time".

> Turbo Pascal was the 1st really useable [in the modern sense]
> developement system. It did not need a hard disk and it hit the
> market before commodity hard disks were widely available.
>
> The question is not why C was adopted for system programming, or for
> cross development from a capable system to a smaller target. Rather
> the question is why it was so widely adopted for ALL kinds of
> programming on ALL platforms given that were many other reasonable
> choices available.

Look at them, individually. And, at the types of products that
were being developed in that time frame.

You could code most algorithms *in* BASIC. But, if forced into a
single-threaded environment, most REAL projects would fall apart
(cuz the processor would be too slow to get around to polling
everything AND doing meaningful work). I wrote a little BASIC
compiler that targeted the 647180 (one of the earliest SoC's).

It was useless for product development. But, great for throwing
together dog-n-pony's for clients. Allow multiple "program
counters" to walk through ONE executable and you've got an effective
multitasking environment (though with few RT guarantees). Slap
*one* PLCC in a wirewrap socket with some misc signal conditioning/IO
logic and show the client a mockup of a final product in a couple
of weeks.

[Then, explain why it was going to take several MONTHS to go from
that to a *real* product! :> ]

SNOBOL is really only useful for text processing. Try implementing
Bresenham's algorithm in it -- or any other DDA. This sort of thing
highlights the differences between "mainframe" applications and
"embedded" applications.

Ditto Pascal. How much benefit is there in controlling a motor
that requires high level math and flagrant automatic type conversion?

Smalltalk? You *do* know how much RAM cost in the early 80's??

Much embedded coding could (today) be done with as crippled a
framework as PL/M. What you really want to do is give the developer
some syntactic freedom (e.g., infix notation for expressions)
and relieve him of the minutiae of setting up stack frames,
tracking binary points, etc.

C goes a long way towards that goal without favoring a particular
application domain. And, because it's relatively easy to "visualize"
what is happening "behind the code", it's easy to deploy applications
coded in it in multiple different environments.

[By contrast, think about how I tackled the multitasking BASIC
implementation and how I'd have to code *for* that implementation
to avoid "unexpected artifacts"]

> YMMV. I remain perplexed.

George Neuner

unread,
Jun 9, 2017, 2:14:39 PM6/9/17
to
On Thu, 8 Jun 2017 14:50:06 +0100, Tom Gardner
<spam...@blueyonder.co.uk> wrote:

>On 08/06/17 11:38, George Neuner wrote:
>> The smallest K&R1 compiler I can remember that *ran* on an 8-bit micro
>> was circa 1983. It was a few hundred KB of code. It ran in 48KB
>> using overlays, needed 2 floppy drives or a hard disk, and required 2
>> compile passes per source file and a final link pass.
>>
>> It was quite functional (if glacially slow), and included program code
>> overlay support and emulated single precision FP (in VAX format IIRC).
>> Although it targeted a 16-bit virtual machine with 6 16-bit registers,
>> it produced native 8-bit code : i.e. the "16-bit VM" program was not
>> interpreted, but was emulated by 8-bit code.
>
>Whitesmiths? IIRC the symbol table size became a limiting
>factor during linking, so linking became multipass :(

No, it was the Aztec compiler from Manx.

I'm not aware that Whitesmiths ever ran on an 8-bit machine. The
versions I remember were for CP/M-86 and PC/MS-DOS. My (maybe faulty)
recollection is that Whitesmiths was enormous: needing at least 512KB
and a hard disk to be useful.

I remember at one time using Microsoft's C compiler on 1.2MB floppies
and needing half a dozen disk swaps to compile "hello world!".


>> There existed various subset C compilers that could run in less than
>> 48KB, but most of them were no more than expensive learning toys.
>
>I always found that remarkable, since Algol-60 compiler ran in
>4kwords of 2 instructions/word.

Must have been written in assembler - I would have loved to have seen
that.


>> But even by the standard of "the compiler could run on the machine",
>> there were languages better suited than C for application programming.
>>
>> Consider that in the late 70's there already were decent 8-bit
>> implementations of BASIC, BCPL Logo, SNOBOL, etc. (Extended) Pascal,
>> Smalltalk, SNOBOL4, etc. became available in the early 80's for both 8
>> and 16-bit systems. But C really wasn't useable on any micro prior to
>> ~1985 when reasonably<?> priced hard disks appeared.
>
>I'll debate Smalltalk :) Apple's implementation (pre L Peter
>Deutsch's JIT) was glacially slow. I know: it is still running
>on my fat Mac downstairs :)

I agree that Apple's version was slow - I maybe never saw the version
with JIT - but ParcPlace Smalltalk ran very well on a FatMac.

I had a Smalltalk for my Apple IIe. It needed 128KB so required a IIe
or a II with language card. It used a text based browser and ran
quite acceptably for small programs. Unfortunately, the version I had
was not able to produce a separate executable.


Unfortunately, after too many moves, I no longer have very much of the
early stuff. I never figured on it becoming valuable.

George

Tom Gardner

unread,
Jun 9, 2017, 5:25:26 PM6/9/17
to
On 09/06/17 19:14, George Neuner wrote:
> On Thu, 8 Jun 2017 14:50:06 +0100, Tom Gardner
>>> There existed various subset C compilers that could run in less than
>>> 48KB, but most of them were no more than expensive learning toys.
>>
>> I always found that remarkable, since Algol-60 compiler ran in
>> 4kwords of 2 instructions/word.
>
> Must have been written in assembler - I would have loved to have seen
> that.

You probably still can. Certainly the 803 was playing
tunes a month ago.

http://www.tnmoc.org/news/notes-museum/iris-atc-has-hiccup-and-elliott-803-store-fault-returns




>>> But even by the standard of "the compiler could run on the machine",
>>> there were languages better suited than C for application programming.
>>>
>>> Consider that in the late 70's there already were decent 8-bit
>>> implementations of BASIC, BCPL Logo, SNOBOL, etc. (Extended) Pascal,
>>> Smalltalk, SNOBOL4, etc. became available in the early 80's for both 8
>>> and 16-bit systems. But C really wasn't useable on any micro prior to
>>> ~1985 when reasonably<?> priced hard disks appeared.
>>
>> I'll debate Smalltalk :) Apple's implementation (pre L Peter
>> Deutsch's JIT) was glacially slow. I know: it is still running
>> on my fat Mac downstairs :)
>
> I agree that Apple's version was slow - I maybe never saw the version
> with JIT - but ParcPlace Smalltalk ran very well on a FatMac.

I never saw PP Smalltalk on a fat mac. L Peter Deutsch's JIT
was a significant improvement.

I moved onto Smalltalk/V on a PC, a Tek Smalltalk machine,
and then Objective-C.

Later I was surprised to find that both Tek and HP had
embedded Smalltalk in some of their instruments.


> I had a Smalltalk for my Apple IIe. It needed 128KB so required a IIe
> or a II with language card. It used a text based browser and ran
> quite acceptably for small programs. Unfortunately, the version I had
> was not able to produce a separate executable.
>
> Unfortunately, after too many moves, I no longer have very much of the
> early stuff. I never figured on it becoming valuable.

I'm collecting a bit now; I was surprised the fat mac
only cost £90 inc shipping.

Clifford Heath

unread,
Jun 9, 2017, 8:29:00 PM6/9/17
to
On 10/06/17 04:14, George Neuner wrote:
> On Thu, 8 Jun 2017 14:50:06 +0100, Tom Gardner
> <spam...@blueyonder.co.uk> wrote:
>
>> On 08/06/17 11:38, George Neuner wrote:
>>> The smallest K&R1 compiler I can remember that *ran* on an 8-bit micro
>>> was circa 1983. It was a few hundred KB of code. It ran in 48KB
>>> using overlays, needed 2 floppy drives or a hard disk, and required 2
>>> compile passes per source file and a final link pass.
>>>
>>> It was quite functional (if glacially slow), and included program code
>>> overlay support and emulated single precision FP (in VAX format IIRC).
>>> Although it targeted a 16-bit virtual machine with 6 16-bit registers,
>>> it produced native 8-bit code : i.e. the "16-bit VM" program was not
>>> interpreted, but was emulated by 8-bit code.
>>
>> Whitesmiths? IIRC the symbol table size became a limiting
>> factor during linking, so linking became multipass :(
>
> No, it was the Aztec compiler from Manx.
>
> I'm not aware that Whitesmith ever ran on an 8-bit machine. The
> versions I remember were for CP/M-86 and PC/MS-DOS. My (maybe faulty)
> recollection is that Whitesmith was enormous: needing at least 512KB
> and a hard disk to be useful.
>
> I remember at one time using Microsoft's C compiler on 1.2MB floppies
> and needing half a dozen disk swaps to compile "hello world!".

I built my first software product like that, a personal filing system.
I was very glad when we got a 5MB disk drive, and didn't have to swap
disks any more. It was even better when, a few months later (1983) we
got MS-DOS 2, with mkdir/rmdir, so not all files were in the root
directory any more.

George Neuner

unread,
Jun 9, 2017, 10:14:34 PM6/9/17
to
On Fri, 9 Jun 2017 00:06:05 -0700, Don Y <blocked...@foo.invalid>
wrote:

>On 6/8/2017 3:38 AM, George Neuner wrote:
>>
>> ... adopt a throw-away mentality: replace rather than maintain.
>>
>> That basically is the idea behind the whole agile/devops/SaaS
>> movement: if it doesn't work today, no problem - there will be a new
>> release tomorrow [or sooner].
>
>I think those are just enablers for PHB's who are afraid to THINK
>about what they want (in a product/design) and, instead, want to be shown
>what they DON'T want.

IME most people [read "clients"] don't really know what they want
until they see what they don't want.

Most people go into a software development effort with a reasonable
idea of what it should do ... subject to revision if they are allowed
to think about it ... but absolutely no idea what it should look like
until they see - and reject - several demos.

The entire field of "Requirements Analysis" would not exist if people
knew what they wanted up front and could articulate it to the
developer.



>> For almost any non-system application, you can do without (explicit
>> source level) pointer arithmetic. But pointers and the address
>> operator are fundamental to function argument passing and returning
>> values (note: not "value return"), and it's effectively impossible to
>> program in C without using them.
>
>But, if you'd a formal education in CS, it would be trivial to
>semantically map the mechanisms to value and reference concepts.
>And, thinking of "reference" in terms of an indication of WHERE
>it is! etc.

But only a small fraction of "developers" have any formal CS, CE, or
CSE education. In general, the best you can expect is that some of
them may have a certificate from a programming course.


>Similarly, many of the "inconsistencies" (to noobs) in the language
>could easily be explained with "common sense":
>- why aren't strings/arrays passed by value? (think about how
> ANYTHING is passed by value; the answer should then be obvious)
>- the whole notion of references being IN/OUT's
>- gee, const can ensure an IN can't be used as an OUT!
>etc.

That's true ... but then you get perfectly reasonable questions like
"why aren't parameters marked as IN or OUT?", and have to dance around
the fact that the developers of the language were techno-snobs who
didn't expect that clueless people ever would be trying to use it.

Or "how do I ensure that an OUT can't be used as an IN?" Hmmm???


>I think the bigger problem is that folks are (apparently) taught
>"keystrokes" instead of "concepts": type THIS to do THAT.

There is a element of that. But also there is the fact that many who
can DO cannot effectively teach.

I knew someone who was taking a C programming course, 2 nights a week
at a local college. After (almost) every class, he would come to me
with questions and confusions about the subject matter. He remarked
on several occasions that I was able to teach him more in 10 minutes
than he learned in a 90 minute lecture.


>> This pushes newbies to learn about pointers, machine addressing and
>> memory management before many are ready. There is plenty else to
>> learn without *simultaneously* being burdoned with issues of object
>> location.
>
>Then approach the topics more incrementally. Instead of introducing
>the variety of data types (including arrays), introduce the basic
>ones. Then, discuss passing arguments -- and how they are COPIED into
>a stack frame.

A what frame?

I once mentioned "stack" in a response to a question posted in another
forum. The poster had proudly announced that he was a senior in a CS
program working on a midterm project. He had no clue that "stacks"
existed other than as abstract notions, didn't know the CPU had one,
and didn't understand why it was needed or how his code was faulty for
(ab)using it.

So much for "CS" programs.


>This can NATURALLY lead to the fact that you can only "return" one
>datum; which the caller would then have to explicitly assign to
><whatever>. "Gee, wouldn't it be nice if we could simply POINT to
>the things that we want the function (subroutine) to operate on?"

Huh? I saw once in a textbook that <insert_language> functions can
return more than one object. Why is this language so lame?



>I just think the teaching approach is crippled. Its driven by industry
>with the goal of getting folks who can crank out code, regardless of
>quality or comprehension.

You and I have had this discussion before [at least in part].

CS programs don't teach programming - they teach "computer science".
For the most part CS students simply are expected to know.

CSE programs are somewhat better because they [purport to] teach
project management: selection and use of tool chains, etc. But that
can be approached largely in the abstract as well.

Many schools are now requiring that a basic programming course be
taken by all students, regardless of major. But this is relatively
recent, and the language de choix varies widely.



>But you can still expose a student to the concepts of the underlying
>machine, regardless of language. Introduce a hypothetical machine...
>something with, say, memory and a computation unit. Treat memory
>as a set of addressable "locations", etc.

That's covered in a separate course: "Computer Architecture 106". It
is only offered Monday morning at 8am, and it costs another 3 credits.


>My first "computer texts" all presented a conceptual model of a
>"computer system" -- even though the languages discussed
>(e.g., FORTRAN) hid much of that from the casual user.

Every intro computer text introduces the hypothetical machine ... and
spends 6-10 pages laboriously stretching out the 2 sentence description
you gave above. If you're lucky there will be an illustration of an
array of memory cells.

Beyond that, you are into specialty texts.



>> For general application programming, there is no need for a language
>> to provide mutable pointers: initialized references, together with
>> array (or stream) indexing and struct/object member access are
>> sufficient for virtually any non-system programming use. This has
>> been studied extensively and there is considerable literature on the
>> subject.
>
>But then you force the developer to pick different languages for
>different aspects of a problem. How many folks are comfortable
>with this "application specific" approach to *a* problem's solution?

Go ask this question in a Lisp forum where writing a little DSL to
address some knotty aspect of a problem is par for the course.


>E.g., my OS is coded in C and ASM. Most of the core services are
>written in C (so I can provide performance guarantees) with my bogus
>IDL to handle RPC/IPC. The RDBMS server is accessed using SQL.
>And, "applications" are written in my modified-Limbo.

What does CLIPS use?

By my count you are using 6 different languages ... 4 or 5 of which
you can virtually count on the next maintainer not knowing.

What would you have done differently if C were not available for
writing your applications? How exactly would that have impacted your
development?


>This (hopefully) "works" because most folks will only be involved
>with *one* of these layers. And, folks who are "sufficiently motivated"
>to make their additions/modifications *work* can resort to cribbing
>from the existing parts of the design -- as "examples" of how they
>*could* do things ("Hey, this works; why not just copy it?")

Above you complained about people being taught /"keystrokes" instead
of "concepts": type THIS to do THAT./ and something about how that
led to no understanding of the subject.



>OTOH, if someone had set out to tackle the whole problem in a single
>language/style... <shrug>

It would be a f_ing nightmare. That's precisely *why* you *want* to
use a mix of languages: often the best tool is a special purpose
domain language.



>>> What you (ideally) want, is to be able to "set a knob" on the 'side' of
>>> the language to limit its "potential for misuse". But, to do so in a
>>> way that the practitioner doesn't feel intimidated/chastened at its
>>> apparent "setting".
>>
>> Look at Racket's suite of teaching and extension languages. They all
>> are implemented over the same core language (an extended Scheme), but
>> they leverage the flexibility of the core langauge to offer different
>> syntaxes, different semantics, etc.
>>
>> In the case of the teaching languages, there is reduced functionality,
>> combined with more newbie friendly debugging output, etc.
>>
>> http://racket-lang.org/
>> https://docs.racket-lang.org/htdp-langs/index.html
>>
>
>I'm sure you've worked in environments where the implementation
>was "dictated" by what appeared to be arbitrary constraints:
>will use this language, these tools, this process, etc. IME,
>programmers *chaffe* at such constraints. Almost as if they were
>personal affronts ("*I* know the best way to tackle the problem
>that *I* have been assigned!"). Imagine how content they'd be
>knowing they were being told to eat at the "kiddie table".

If the tool is Racket, it supports creating, using and ad-mixing any
special purpose domain languages you are able to come up with.

<grin>

Racket isn't the only such versatile tool ... it's just the one I
happened to have at hand.


>> The modern concept of availability is very different than when you had
>> to wait for a company to provide a turnkey solution, or engineer
>> something yourself from scratch. Now, if the main distribution
>> doesn't run on your platform, you are likely to find source that you
>> can port yourself (if you are able), or if there's any significant
>> user base, you may find that somebody else already has done it.
>
>That works for vanilla implementations. It leads to all designs
>looking like all others ("Lets use a PC for this!"). This is
>fine *if* that's consistent with your product/project goals.
>But, if not, you're SoL.

Yeah ... well the world is going that way. My electric toothbrush is
a Raspberry PI running Linux.



>An advantage of ASM was that there were *relatively* few operators
>and addressing modes, etc.

Depends on the chip. Modern x86_64 chips can have instructions up to
15 bytes (120 bits) long. [No actual instruction *is* that long, but
that is the maximum the decoder will accept.]



>>> The (early) languages that we settled on were simple to implement
>>> on the development platforms and with the target resources. Its
>>> only as targets have become more resource-rich that we're exploring
>>> richer execution environments (and the attendant consequences of
>>> that for the developer).
>>
>> There never was any C compiler that ran on any really tiny machine.
>
>Doesn't have to run *on* a tiny machine. It just had to generate code
>that could run on a tiny machine!

Cross compiling is cheating!!!

In most cases, it takes more resources to develop a program than to
run it ... so if you have a capable machine for development, why do
need a *small* compiler?

A small runtime footprint is a different issue, but *most* languages
[even GC'd ones] are capable of operating with a small footprint.

Once upon a time, I created a Scheme-like GC'd language that could do
a hell of a lot in 8KB total for the compiler, runtime, a reasonably
complex user program and its data.


>E.g., we used an 11 to write our i4004 code; the idea of even something
>as crude as an assembler running *ON* an i4004 was laughable!

My point exactly. In any case, you wouldn't write for the i4004 in a
compiled language. Pro'ly not for the i8008 either, although I have
heard claims that that was possible.
But we aren't talking about *embedded* applications ... we're talking
about ALL KINDS of applications on ALL KINDS of machines.

You view everything through the embedded lens.


>Ditto Pascal. How much benefit is there in controlling a motor
>that requires high level math and flagrant automatic type conversion?

I don't even understand this.


>Smalltalk? You *do* know how much RAM cost in the early 80's??

Yes, I do.

I also know that I had a Smalltalk development system that ran on my
Apple IIe. Unfortunately, it was a "personal" edition that was not
able to create standalone executables ... there was a "professional"
version that could, but it was too expensive for me ... so I don't
know how small a 6502 Smalltalk program could have been.

I also had a Lisp and a Prolog for the IIe. No, they did not run in
4KB, but they were far from useless on an 8-bit machine.

George

Don Y

unread,
Jun 10, 2017, 1:50:42 AM6/10/17
to
On 6/9/2017 7:14 PM, George Neuner wrote:
> On Fri, 9 Jun 2017 00:06:05 -0700, Don Y <blocked...@foo.invalid>
> wrote:
>
>> On 6/8/2017 3:38 AM, George Neuner wrote:
>>>
>>> ... adopt a throw-away mentality: replace rather than maintain.
>>>
>>> That basically is the idea behind the whole agile/devops/SaaS
>>> movement: if it doesn't work today, no problem - there will be a new
>>> release tomorrow [or sooner].
>>
>> I think those are just enablers for PHB's who are afraid to THINK
>> about what they want (in a product/design) and, instead, want to be shown
>> what they DON'T want.
>
> IME most people [read "clients"] don't really know what they want
> until they see what they don't want.

I've typically only found that to be the case when clients (often
using "their own" money) can't decide *if* they want to enter a
particular market. They want to see something to gauge their own
reaction to it: is it an exciting product or just another warmed over
stale idea.

I used to make wooden mockups of devices just to "talk around".
Then foamcore. Then, just 3D CAD sketches.

But, how things work was always conveyed in prose. No need to see
the power light illuminate when the power switch was toggled. If
you can't imagine how a user will interact with a device, then
you shouldn't be developing that device!

The only "expensive" dog-and-pony's were cases where the underlying
technology was unproven. Typically mechanisms that weren't known
to behave as "envisioned" without some sort of reassurances (far from a
clinical *proof*). I don't have an ME background so can never vouch
for mechanical designs; if the client needs reassurance, the ME has
to provide it *or* invest in building a real mechanism (which often
just "looks pretty" without any associated driving electronics)

> Most people go into a software development effort with a reasonable
> idea of what it should do ... subject to revision if they are allowed
> to think about it ... but absolutely no idea what it should look like
> until they see - and reject - several demos.

That's just a failure of imagination. A good spec (or manual) should
allow a developer or potential user to imagine actually using the
device before anything has been reified. It's expensive building
space shuttles just to figure out what it should look like! :>

> The entire field of "Requirements Analysis" would not exist if people
> knew what they wanted up front and could articulate it to the
> developer.

IMO, the problem with the agile approach is that there is too much
temptation to cling to whatever you've already implemented. And, if
you've not thoroughly specified its behavior and characterized its
operation, you've got a black box with unknown contents -- that you
will now convince yourself does what it "should" (without having
designed it with knowledge of that "should").

So, you end up on the wrong initial trajectory and don't discover
the problem until you've baked lots of "compensations" into the
design.

[The hardest thing to do is convince yourself to start over]

>>> For almost any non-system application, you can do without (explicit
>>> source level) pointer arithmetic. But pointers and the address
>>> operator are fundamental to function argument passing and returning
>>> values (note: not "value return"), and it's effectively impossible to
>>> program in C without using them.
>>
>> But, if you'd a formal education in CS, it would be trivial to
>> semantically map the mechanisms to value and reference concepts.
>> And, thinking of "reference" in terms of an indication of WHERE
>> it is! etc.
>
> But only a small fraction of "developers" have any formal CS, CE, or
> CSE education. In general, the best you can expect is that some of
> them may have a certificate from a programming course.

You've said that in the past, but I can't wrap my head around it.
It's like claiming very few doctors have taken any BIOLOGY courses!
Or, that a baker doesn't understand the basic chemistries involved.

>> Similarly, many of the "inconsistencies" (to noobs) in the language
>> could easily be explained with "common sense":
>> - why aren't strings/arrays passed by value? (think about how
>> ANYTHING is passed by value; the answer should then be obvious)
>> - the whole notion of references being IN/OUT's
>> - gee, const can ensure an IN can't be used as an OUT!
>> etc.
>
> That's true ... but then you get perfectly reasonable questions like
> "why aren't parameters marked as IN or OUT?", and have to dance around
> the fact that the developers of the language were techno-snobs who
> didn't expect that clueless people ever would be trying to use it.

That's a shortcoming of the language's syntax. But, it doesn't prevent
you from annotating the parameters as such.

My IDL requires formal specification because it has to know how to marshal
and unmarshal on each end.

> Or "how do I ensure that an OUT can't be used as an IN?" Hmmm???
>
>> I think the bigger problem is that folks are (apparently) taught
>> "keystrokes" instead of "concepts": type THIS to do THAT.
>
> There is a element of that. But also there is the fact that many who
> can DO cannot effectively teach.

Of course! SWMBO has been learning that lesson with her artwork.
Taking a course from a "great artist" doesn't mean you'll end up
learning anything or improving YOUR skillset.

> I knew someone who was taking a C programming course, 2 nights a week
> at a local college. After (almost) every class, he would come to me
> with questions and confusions about the subject matter. He remarked
> on several occasions that I was able to teach him more in 10 minutes
> than he learned in a 90 minute lecture.

But I suspect you had a previous relationship with said individual.
So, knew how to "relate" concepts to him/her.

Many of SWMBO's (female) artist-friends seem to have trouble grok'ing
perspective. They read books, take courses, etc. and still can't seem
to wrap their head around the idea.

I can sit down with them one-on-one and convey the concept and "mechanisms"
in a matter of minutes: "Wow! This is EASY!!" But, I'm not trying to sell
a (fat!) book or sign folks up for hours of coursework, etc. And, I know
how to pitch the ideas to each person individually, based on my prior knowledge
of their backgrounds, etc.

>>> This pushes newbies to learn about pointers, machine addressing and
>>> memory management before many are ready. There is plenty else to
>>> learn without *simultaneously* being burdoned with issues of object
>>> location.
>>
>> Then approach the topics more incrementally. Instead of introducing
>> the variety of data types (including arrays), introduce the basic
>> ones. Then, discuss passing arguments -- and how they are COPIED into
>> a stack frame.
>
> A what frame?
>
> I once mentioned "stack" in a response to a question posted in another
> forum. The poster had proudly announced that he was a senior in a CS
> program working on a midterm project. He had no clue that "stacks"
> existed other than as abstract notions, didn't know the CPU had one,
> and didn't understand why it was needed or how his code was faulty for
> (ab)using it.
>
> So much for "CS" programs.

<frown> As time passes, I am becoming more convinced of the quality of
my education. This was "freshman-level" coursework: S-machines, lambda
calculus, Petri nets, formal grammars, etc.

[My best friend from school recounted taking some graduate level
courses at Northwestern. First day of the *graduate* level AI
course, a fellow student walked in with the textbook under his
arm. My friend asked to look at it. After thumbing through
a few pages, he handed it back: "I already took this course...
as a FRESHMAN!"]

If I had "free time", I guess it would be interesting to see just what
modern teaching is like, in this field.

>> This can NATURALLY lead to the fact that you can only "return" one
>> datum; which the caller would then have to explicitly assign to
>> <whatever>. "Gee, wouldn't it be nice if we could simply POINT to
>> the things that we want the function (subroutine) to operate on?"
>
> Huh? I saw once in a textbook that <insert_language> functions can
> return more than one object. Why is this language so lame?

Limbo makes extensive use of tuples as return values. So, silly
not to take advantage of that directly. (changes the syntax of how you'd
otherwise use a function in an expression but the benefits outweigh the
costs, typ).
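
The nearest plain-C analogue -- nothing Limbo-specific, names are just
illustrative -- is returning a small struct:

    /* Return two results at once, C-style. */
    struct divmod { int quot; int rem; };

    static struct divmod divmod(int n, int d)
    {
        struct divmod r = { n / d, n % d };
        return r;
    }

    /* usage:  struct divmod r = divmod(17, 5);   r.quot == 3, r.rem == 2 */

Clunkier than a real tuple, but it makes the "one return value" limitation
mostly a syntactic one.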

>> I just think the teaching approach is crippled. Its driven by industry
>> with the goal of getting folks who can crank out code, regardless of
>> quality or comprehension.
>
> You and I have had this discussion before [at least in part].
>
> CS programs don't teach programming - they teach "computer science".
> For the most part CS students simply are expected to know.

I guess I don't understand the difference.

In my mind, "programming" is the plebian skillset.
programming : computer science :: ditch-digging : landscaping
I.e., ANYONE can learn to "program". It can be taught as a rote skill.
Just like anyone can be taught to reheat a batch of ready-made cookie
dough to "bake cookies".

The CS aspect of my (EE) degree showed me the consequences of different
machine architectures, the value of certain characteristics in the design
of a language, the duality of recursion/iteration, etc. E.g., when I
designed my first CPU, the idea of having an "execution unit" started
by the decode of one opcode and CONTINUING while other opcodes were
fetched and executed wasn't novel; I'd already seen it done on 1960's
hardware.

[And, if the CPU *hardware* can do two -- or more -- things at once, then
the idea of a *program* doing two or more things at once is a no-brainer!
"Multitasking? meh..."]

> CSE programs are somewhat better because they [purport to] teach
> project management: selection and use of tool chains, etc. But that
> can be approached largely in the abstract as well.

This was an aspect of "software development" that was NOT stressed
in my curriculum. Nor was "how to use a soldering iron" in the
EE portion thereof (the focus was more towards theory with the
understanding that you could "pick up" the practical skills relatively
easily, outside of the classroom)

> Many schools are now requiring that a basic programming course be
> taken by all students, regardless of major. But this is relatively
> recent, and the language de choix varies widely.

I know every EE was required to take some set of "software" courses.
Having attended an engineering school, I suspect that was true of
virtually every "major". Even 40 years ago, it was hard to imagine
any engineering career that wouldn't require that capability.

[OTOH, I wouldn't trust one of the ME's to design a programming
language any more than I'd trust an EE/CS to design a *bridge*!]

>> But you can still expose a student to the concepts of the underlying
>> machine, regardless of language. Introduce a hypothetical machine...
>> something with, say, memory and a computation unit. Treat memory
>> as a set of addressable "locations", etc.
>
> That's covered in a separate course: "Computer Architecture 106". It
> is only offered Monday morning at 8am, and it costs another 3 credits.

I just can't imagine how you could explain "programming" a machine to a
person without that person first understanding how the machine works.
It's not like trying to teach someone to *drive* while remaining
ignorant of the fact that there are many small explosions happening
each second, under the hood!

[How would you teach a car mechanic to perform repairs if he didn't
understand what the components he was replacing *did* or how they
interacted with the other components?]

>> My first "computer texts" all presented a conceptual model of a
>> "computer system" -- even though the languages discussed
>> (e.g., FORTRAN) hid much of that from the casual user.
>
> Every intro computer text introduces the hypothetical machine ... and
> spends 6-10 pages laboriously stretching out the 2-sentence description
> you gave above. If you're lucky there will be an illustration of an
> array of memory cells.
>
> Beyond that, you are into specialty texts.

My first courses (pre-college) went to great lengths to explain the hardware
of the machine, DASD's vs. SASD's, components of access times, overlapped
I/O, instruction formats (in a generic sense -- PC's hadn't been invented,
yet), binary-decimal conversion, etc. But, then again, these were new ideas
at the time, not old saws.

>>> For general application programming, there is no need for a language
>>> to provide mutable pointers: initialized references, together with
>>> array (or stream) indexing and struct/object member access are
>>> sufficient for virtually any non-system programming use. This has
>>> been studied extensively and there is considerable literature on the
>>> subject.
>>
>> But then you force the developer to pick different languages for
>> different aspects of a problem. How many folks are comfortable
>> with this "application specific" approach to *a* problem's solution?
>
> Go ask this question in a Lisp forum where writing a little DSL to
> address some knotty aspect of a problem is par for the course.
>
>> E.g., my OS is coded in C and ASM. Most of the core services are
>> written in C (so I can provide performance guarantees) with my bogus
>> IDL to handle RPC/IPC. The RDBMS server is accessed using SQL.
>> And, "applications" are written in my modified-Limbo.
>
> What does CLIPS use?

It's hard to consider CLIPS's "language" to be a real "programming language"
(i.e., Turing complete -- though it probably *is*, but with ghastly syntax!).
It bears the same sort of relationship that SQL has to RDBMS, SNOBOL to
string processing, etc. It's primarily concerned with asserting and retracting
facts based on patterns of recognized facts.

While you *can* code an "action" routine in its "native" language, I
find it easier to invoke an external routine (C) that uses the API
exported by CLIPS to do all the work. In my case, it would be difficult
to code an "action routine" entirely in CLIPS and be able to access
the rest of the system via the service-based interfaces I've implemented.

> By my count you are using 6 different languages ... 4 or 5 of which
> you can virtually count on the next maintainer not knowing.

Yes. But I'm not designing a typical application; rather, a *system*
of applications, services, OS, etc. I wouldn't expect one language to
EFFICIENTLY tackle all of it. (And, I'd have to build all of those components
from scratch if I wanted complete control over their own implementation
languages; I have no desire to write an RDBMS just so I can AVOID using
SQL.)

> What would you have done differently if C were not available for
> writing your applications? How exactly would that have impacted your
> development?

The applications are written in Limbo. I'd considered other scripting
languages for that role -- LOTS of other languages! -- but Limbo already
had much of the support I needed to layer onto the "structure" of my
system. Did I want to invent a language and a hosting VM (to make it
easy to migrate applications at run-time)? Add multithreading hooks
to an existing language? etc.

[I was disappointed with most language choices as they all tend to
rely heavily on punctuation and other symbols that aren't "voiced"
when reading the code]

C just gives me lots of bang for the buck. I could implement all of this
on a bunch of 8b processors -- writing interpreters to allow more complex
machines to APPEAR to run on the simpler hardware, creating virtual address
spaces to exceed the limits of those tiny processors, etc. But, all that
would come at a huge performance cost. Easier just to *buy* faster
processors and run code written in more abstract languages.

>> This (hopefully) "works" because most folks will only be involved
>> with *one* of these layers. And, folks who are "sufficiently motivated"
>> to make their additions/modifications *work* can resort to cribbing
>> from the existing parts of the design -- as "examples" of how they
>> *could* do things ("Hey, this works; why not just copy it?")
>
> Above you complained about people being taught /"keystrokes" instead
> of "concepts": type THIS to do THAT./ and something about how that
> led to no understanding of the subject.

There's a difference between the types of people involved. I don't
expect anyone from "People's Software Institute #234B" to be writing
anything beyond application layer scripts. So, they only need to
understand the scripting language and the range of services available
to them. They don't have to worry about how I've implemented each
of these services. Or, how I move their application from processor
node 3 to node 78 without corrupting any data -- or, without their
even KNOWING that they've been moved!

Likewise, someone writing a new service (in C) need not be concerned with
the scripting language. Interfacing to it can be done by copying an
interface for an existing service. And, interfacing to the OS can as
easily mimic the code from a similar service.

You obviously have to understand the CONCEPT of "multiplication" in
order to avail yourself of it. But, do you care if it's implemented
in a purely combinatorial fashion? Or, iteratively with a bunch of CSA's?
Or, by tiny elves living in a hollow tree?

In my case, you have to understand that each function/subroutine invocation
just *appears* to be a subroutine/function invocation. That, in reality,
it can be running code on another processor in another building -- concurrent
with what you are NOW doing (this is a significant conceptual difference
between traditional "programming" where you consider everything to be a
series of operations -- even in a multithreaded environment!).

You also have to understand that your "program" can abend or be aborted
at any time. And, that persistent data has *structure* (imposed by
the DBMS) instead of being just BLOBs. And, that agents/clients have
capabilities that are finer-grained than "permissions" in conventional
systems.

But, you don't have to understand how any of these things are implemented
in order to use them correctly.

>> OTOH, if someone had set out to tackle the whole problem in a single
>> language/style... <shrug>
>
> It would be a f_ing nightmare. That's precisely *why* you *want* to
> use a mix of languages: often the best tool is a special purpose
> domain language.

But that complicates the design (and maintenance) effort(s) -- by requiring
staff with those skillsets to remain available. Imagine if you had to
have a VLSI person on hand all the time in case the silicon in your CPU
needed to be changed...

>>> The modern concept of availability is very different than when you had
>>> to wait for a company to provide a turnkey solution, or engineer
>>> something yourself from scratch. Now, if the main distribution
>>> doesn't run on your platform, you are likely to find source that you
>>> can port yourself (if you are able), or if there's any significant
>>> user base, you may find that somebody else already has done it.
>>
>> That works for vanilla implementations. It leads to all designs
>> looking like all others ("Lets use a PC for this!"). This is
>> fine *if* that's consistent with your product/project goals.
>> But, if not, you're SoL.
>
> Yeah ... well the world is going that way. My electric toothbrush is
> a Raspberry PI running Linux.

I suspect my electric toothbrush has a small MCU at its heart.

>> An advantage of ASM was that there were *relatively* few operators
>> and addressing modes, etc.
>
> Depends on the chip. Modern x86_64 chips can have instructions up to
> 15 bytes (120 bits) long. [No actual instruction *is* that long, but
> that is the maximum the decoder will accept.]

But the means by which the "source" is converted to the "binary" is
well defined. Different EA modes require different data to be present
in the instruction byte stream -- and, in predefined places relative to
the start of the instruction (or specific locations in memory).

And, SUB behaved essentially the same as ADD -- with the same range of
options available, etc.

[You might have to remember that certain instructions expected certain
parameters to be implicitly present in specific registers, etc.]

>>>> The (early) languages that we settled on were simple to implement
>>>> on the development platforms and with the target resources. Its
>>>> only as targets have become more resource-rich that we're exploring
>>>> richer execution environments (and the attendant consequences of
>>>> that for the developer).
>>>
>>> There never was any C compiler that ran on any really tiny machine.
>>
>> Doesn't have to run *on* a tiny machine. It just had to generate code
>> that could run on a tiny machine!
>
> Cross compiling is cheating!!!
>
> In most cases, it takes more resources to develop a program than to
> run it ... so if you have a capable machine for development, why do
> need a *small* compiler?

Because not all development machines were particularly capable.

My first project was i4004 based, developed on an 11.

The newer version of the same product was i8085 hosted and developed on
an MDS800. IIRC, the MDS800 was *8080* based and limited to 64KB of memory
(no fancy paging, bank switching, etc.) I think a second 8080 ran
the I/O's. So, building an object image was lots of passes, lots
of "egg scrambling" (the floppies always sounded like they were
grinding themselves to death)

I.e., if we'd opted to replace the EPROM in our product with SRAM
(or DRAM) and add some floppies, the product could have hosted the
tools.

> A small runtime footprint is a different issue, but *most* languages
> [even GC'd ones] are capable of operating with a small footprint.
>
> Once upon a time, I created a Scheme-like GC'd language that could do
> a hell of a lot in 8KB total for the compiler, runtime, a reasonably
> complex user program and its data.
>
>> E.g., we used an 11 to write our i4004 code; the idea of even something
>> as crude as an assembler running *ON* an i4004 was laughable!
>
> My point exactly. In any case, you wouldn't write for the i4004 in a
> compiled language. Pro'ly not for the i8008 either, although I have
> heard claims that that was possible.

I have a C compiler that targets the 8080, hosted on CP/M. Likewise, a
Pascal compiler and a BASIC compiler (and I think an M2 compiler) all
hosted on that 8085 CP/M machine.

The problem with HLL's on small machines is the helper routines and
standard libraries can quickly eat up ALL of your address space!

I designed several z180-based products in C -- but the (bizarre!)
bank switching capabilities of the processor would let me do things like
stack the object code for different libraries in the BANK section
and essentially do "far" calls through a bank-switching intermediary
that the compiler would automatically invoke for me.

By cleverly designing the memory map, you could have large DATA
and large CODE -- at the expense of lengthened call/return times
(of course, the interrupt system had to remain accessible at
all times so you worked hard to keep that tiny lest you waste
address space catering to it).
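
Roughly what that bank-switching intermediary amounts to, sketched in C
with hypothetical helpers (the real glue was emitted by the compiler, and
the Z180's bank register is actually an I/O port):

    #include <stdint.h>

    /* Hypothetical accessors for the bank-select hardware. */
    extern uint8_t bank_current(void);
    extern void    bank_select(uint8_t bank);

    typedef int (*func_t)(int);

    /* Resident (COMMON-area) trampoline: map the callee's bank into
     * the switchable window, make an ordinary near call, then restore
     * whatever the caller had mapped before returning. */
    static int far_call(uint8_t bank, func_t fn, int arg)
    {
        uint8_t saved = bank_current();
        int     result;

        bank_select(bank);     /* callee's code is now visible      */
        result = fn(arg);
        bank_select(saved);    /* caller's bank back in the window  */

        return result;
    }
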
Sure we are! This is C.A.E! :> If we're talking about all
applications, then are we also dragging big mainframes into the mix?
Where's mention of PL/1 and the other big iron running it?

> You view everything through the embedded lens.
>
>> Ditto Pascal. How much benefit is there in controlling a motor
>> that requires high level math and flagrant automatic type conversion?
>
> I don't even understand this.

Motor control is a *relatively* simple algorithm. No *need* for complex
data types, automatic type casts, etc. And, what you really want is
deterministic behavior; you want to know that a particular set of
"instructions" (in a HLL?) will execute in a particular, predictable time
frame without worrying about some run-time support mechanism (e.g., GC)
kicking in and confounding the expected behavior.

[Or, having to take explicit measures to avoid this because of the choice
of HLL]

>> Smalltalk? You *do* know how much RAM cost in the early 80's??
>
> Yes, I do.
>
> I also know that I had a Smalltalk development system that ran on my
> Apple IIe. Unfortunately, it was a "personal" edition that was not
> able to create standalone executables ... there was a "professional"
> version that could, but it was too expensive for me ... so I don't
> know how small a 6502 Smalltalk program could have been.
>
> I also had a Lisp and a Prolog for the IIe. No, they did not run in
> 4KB, but they were far from useless on an 8-bit machine.

As I said, I did a lot with 8b hardware. But, you often didn't have a lot
of resources "to spare" with that hardware.

I recall going through an 8085 design and counting the number of
subroutine invocations (CALL's) for each specific subroutine.
Then, replacing the CALLs to the most frequently accessed subroutine
with "restart" instructions (RST) -- essentially a one-byte CALL
that vectored through a specific hard-coded address in the memory
map. I.e., each such replacement trimmed *2* bytes from the size of
the executable. JUST TWO!

We did that for seven of the eight possible RST's. (RST 0 is hard to
cheaply use as it doubles as the RESET entry point). The goal being to
trim a few score bytes out of the executable so we could eliminate
*one* 2KB EPROM from the BoM (because we didn't need the entire
EPROM, just a few score bytes of it -- so why pay for a $50 (!!)
chip if you only need a tiny piece of it? And, why pay for ANY of
it if you can replace 3-byte instructions with 1-byte instructions??)

Les Cargill

unread,
Jun 10, 2017, 4:22:57 PM6/10/17
to
Dimiter_Popoff wrote:
> On 08.6.2017 г. 13:38, George Neuner wrote:
>> ...
>>
>> The question is not why C was adopted for system programming, or for
>> cross development from a capable system to a smaller target. Rather
>> the question is why it was so widely adopted for ALL kinds of
>> programming on ALL platforms given that were many other reasonable
>> choices available.
>
> My take on that is it happened because people needed a low level
> language, some sort of assembler - and the widest spread CPU was
> the x86 with a register model for which no sane person would consider
> programming larger pieces of code.
> I am sure there have been people who have done
> it but they can't have been exactly sane :) (i.e. have been insane in
> a way most people would have envied them for their insanity).

I doubt that, but there's something to be said for architectures
that limit the complexity available to the financial/investor
classes.

The trouble with large swaths of assembly is that organizations
aren't stable enough to support maintainers long enough to keep things
running.

The game here is funding, not function.

> So C made x86 usable - and the combination (C+x86) is the main factor
> which led to the absurd situation we have today, where code which
> used to take kilobytes of memory takes gigabytes (not because of the
> inefficiency of compilers, just because of where most programmers
> have been led to).
>

Most of the people who need gigs of memory aren't native C
speakers, at least.

It took the execrable web protocols and "relational databases"
to make things utterly reek of doom. These are fine for
toy programs to get you through a course, but no fun at all
for Real Work(tm).

Behold the $400, WiFi-enabled juicer: Juicero.

> Dimiter
>
> ======================================================
> Dimiter Popoff, TGI http://www.tgi-sci.com
> ======================================================
> http://www.flickr.com/photos/didi_tgi/
>
>
--
Les Cargill

Les Cargill

unread,
Jun 10, 2017, 4:26:38 PM6/10/17
to
PL/M was fairly hard to maintain in. A couple of lines of C would
replace a page of PL/M, in some cases.

> The same also applies to BLISS for PDP-10/PDP-11/VAX/Alpha and
> recently some Intel HW.
>
> The problem why these languages did not become popular was that the
> hardware vendors did want to make money by compiler sales.
>

Gates identified massive cognitive dissonance against the idea
of selling software en masse and set the tools price quite low.

"Hardware vendors" like IBM had OS/360, where the O/S cost
more than the machine.

People still don't want to pay for software.

> Some HW companies wanting to boost their HW sales did give away
> compilers and development software for free and that way boost their
> HW sale.
>

That's more generally true of chip vendors, who use the tools as
an enabler for sales.

--
Les Cargill


George Neuner

unread,
Jun 10, 2017, 9:01:56 PM6/10/17
to
On Fri, 9 Jun 2017 22:50:31 -0700, Don Y <blocked...@foo.invalid>
wrote:

>On 6/9/2017 7:14 PM, George Neuner wrote:
>> On Fri, 9 Jun 2017 00:06:05 -0700, Don Y wrote:
>>
>>> ..., if you'd a formal education in CS, it would be trivial to
>>> semantically map the mechanisms to value and reference concepts.
>>> And, thinking of "reference" in terms of an indication of WHERE
>>> it is! etc.
>>
>> But only a small fraction of "developers" have any formal CS, CE, or
>> CSE education. In general, the best you can expect is that some of
>> them may have a certificate from a programming course.
>
>You've said that in the past, but I can't wrap my head around it.
>It's like claiming very few doctors have taken any BIOLOGY courses!
>Or, that a baker doesn't understand the basic chemistries involved.

Comparatively few bakers actually can tell you the reason why yeast
makes dough rise, or why you need to add salt to make things taste
sweet. It's enough for many people to know that something works -
they don't have a need to know how or why.


WRT "developers":

A whole lot of "applications" are written by people in professions
unrelated to software development. They become "developers" de facto
when their programs get passed around and used by others.

Consider all the scientists, mathematicians, statisticians, etc., who
write data analysis programs in the course of their work.

Consider all the data entry clerks / "accidental" database admins who
end up having to learn SQL and form coding to do their jobs.

Consider the frustrated office workers who study VBscript or
Powershell on their lunch hour and start automating their manual
processes to be more productive.

: < more examples elided - use your imagination >

Some of these "non-professional" programs end up being very effective
and reliable. The better ones frequently are passed around, modified,
extended, and eventually are coaxed into new uses that the original
developer never dreamed of.


Then consider the legions of (semi)professional coders who maybe took
a few programming courses, or who learned on their own, and went to
work writing, e.g., web applications, Android apps, etc.


It has been estimated that over 90% of all software today is written
by people who have no formal CS/CE/CSE or IS/IT education, and 40% of
all programmers are employed primarily to do something other than
software development.

Note: programming courses != CS/CE/CSE education


>> I knew someone who was taking a C programming course, 2 nights a week
>> at a local college. After (almost) every class, he would come to me
>> with questions and confusions about the subject matter. He remarked
>> on several occasions that I was able to teach him more in 10 minutes
>> than he learned in a 90 minute lecture.
>
>But I suspect you had a previous relationship with said individual.
>So, knew how to "relate" concepts to him/her.

In this case, yes. But I also had some prior teaching experience.

I rarely have much trouble explaining complicated subjects to others.
As you have noted in the past, it is largely a matter of finding
common ground with a student and drawing appropriate analogies.


>> CS programs don't teach programming - they teach "computer science".
>> For the most part CS students simply are expected to know.
>
>I guess I don't understand the difference.
>
>In my mind, "programming" is the plebian skillset.

Only sort of. Programming is fundamental to computer *engineering*,
but that is a different discipline.

Computer "science" is concerned with

- computational methods,
- language semantics,
- ways to bridge the semantic gap between languages and methods,
- design and study of algorithms,
- design of better programming languages [for some "better"]
- ...

Programming per se really is not a requirement for a lot of it. A
good foundation of math and logic is more necessary.


>> CSE programs are somewhat better because they [purport to] teach
>> project management: selection and use of tool chains, etc. But that
>> can be approached largely in the abstract as well.
>
>This was an aspect of "software development" that was NOT stressed
>in my curriculum. Nor was "how to use a soldering iron" in the
>EE portion thereof (the focus was more towards theory with the
>understanding that you could "pick up" the practical skills relatively
>easily, outside of the classroom)

Exactly! If you can't learn to solder on your own, you don't belong
here. CS regards programming in the same way.



>I just can't imagine how you could explain "programming" a machine to a
>person without that person first understanding how the machine works.

Take a browse through some classics:

- Abelson, Sussman & Sussman, "Structure and Interpretation of
Computer Programs" aka SICP

- Friedman, Wand & Haynes, "Essentials of Programming Languages"
aka EOPL

There are many printings of each of these. I happen to have SICP 2nd
Ed and EOPL 8th Ed on my shelf.


Both were - and are still - widely used in undergrad CS programs.

SICP doesn't mention any concrete machine representation until page
491, and then a hypothetical machine is considered with respect to
emulating its behavior.

EOPL doesn't refer to any concrete machine at all.


>> What would you have done differently if C were not available for
>> writing your applications? How exactly would that have impacted your
>> development?
>
>The applications are written in Limbo. I'd considered other scripting
>languages for that role -- LOTS of other languages! -- but Limbo already
>had much of the support I needed to layer onto the "structure" of my
>system. Did I want to invent a language and a hosting VM (to make it
>easy to migrate applications at run-time)? Add multithreading hooks
>to an existing language? etc.
>
>[I was disappointed with most language choices as they all tend to
>rely heavily on punctuation and other symbols that aren't "voiced"
>when reading the code]

Write in BrainF_ck ... that'll fix them.

Very few languages have been deliberately designed to be read. The
very idea has negative connotations because the example everyone jumps
to is COBOL - which was too verbose.

It's also true that reading and writing effort are inversely related,
and programmers always seem to want to type fewer characters - hence
the proliferation of languages whose code looks suspiciously like line
noise.

I don't know about you, but I haven't seen a teletype connected to a
computer since about 1972.


>You obviously have to understand the CONCEPT of "multiplication" in
>order to avail yourself of it. But, do you care if it's implemented
>in a purely combinatorial fashion? Or, iteratively with a bunch of CSA's?
>Or, by tiny elves living in a hollow tree?

Rabbits are best for multiplication.


>In my case, you have to understand that each function/subroutine invocation
>just *appears* to be a subroutine/function invocation. That, in reality,
>it can be running code on another processor in another building -- concurrent
>with what you are NOW doing (this is a significant conceptual difference
>between traditional "programming" where you consider everything to be a
>series of operations -- even in a multithreaded environment!).
>
>You also have to understand that your "program" can abend or be aborted
>at any time. And, that persistent data has *structure* (imposed by
>the DBMS) instead of being just BLOBs. And, that agents/clients have
>capabilities that are finer-grained than "permissions" in conventional
>systems.
>
>But, you don't have to understand how any of these things are implemented
>in order to use them correctly.

Which is one of the unspoken points of those books I mentioned above:
that (quite a lot of) programming is an exercise in logic that is
machine independent.

Obviously I am extrapolating and paraphrasing, and the authors did not
have device programming in mind when they wrote the books.

Nevertheless, there is lot of truth in it: identifying required
functionality, designing program logic, evaluating and choosing
algorithms, etc. ... all may be *guided* in situ by specific knowledge
of the target machine, but they are skills which are independent of
it.

YMMV,
George

Don Y

unread,
Jun 11, 2017, 3:39:52 AM6/11/17
to
On 6/10/2017 6:01 PM, George Neuner wrote:
> On Fri, 9 Jun 2017 22:50:31 -0700, Don Y <blocked...@foo.invalid>
> wrote:
>
>> On 6/9/2017 7:14 PM, George Neuner wrote:
>>> On Fri, 9 Jun 2017 00:06:05 -0700, Don Y wrote:
>>>
>>>> ..., if you'd a formal education in CS, it would be trivial to
>>>> semantically map the mechanisms to value and reference concepts.
>>>> And, thinking of "reference" in terms of an indication of WHERE
>>>> it is! etc.
>>>
>>> But only a small fraction of "developers" have any formal CS, CE, or
>>> CSE education. In general, the best you can expect is that some of
>>> them may have a certificate from a programming course.
>>
>> You've said that in the past, but I can't wrap my head around it.
>> It's like claiming very few doctors have taken any BIOLOGY courses!
>> Or, that a baker doesn't understand the basic chemistries involved.
>
> Comparatively few bakers actually can tell you the reason why yeast
> makes dough rise, or why you need to add salt to make things taste
> sweet. It's enough for many people to know that something works -
> they don't have a need to know how or why.

I guess different experiences. Growing up, I learned these sorts
of things by asking countless questions of the vendors we frequented.
Yeast vs. baking soda as leavening agent; baking soda vs. powder;
vs. adding cream of tartar; cake flour vs. bread flour; white sugar
vs. brown sugar; vege shortening vs. butter (vs. oleo/oil); sugar
as a "wet" ingredient; etc.

Our favorite baker was a weekly visit. He'd take me in the back room
(much to the chagrin of other customers) and show me the various bits
of equipment, what he was making at the time, his "tricks" to eke a
bit more life out of something approaching its "best by" date, etc.

[I wish I'd pestered him, more, to learn about donuts and, esp, bagels
as he made the *best* of both! OTOH, probably too many details for
a youngster to commit to memory...]

The unfortunate thing (re: US style of measurement by volume) is that
you don't have as fine control over some of the ingredients (e.g.,
what proportion of "other ingredients" per "egg unit")

[I've debated purchasing a scale just to weigh eggs! Not to tweak
the amount of other ingredients proportionately but, rather, to
select a "set" of eggs closest to a target weight for a particular
set of "other ingredients". Instead, I do that "by feel", presently
(one of the aspects of my Rx's that makes them "non-portable" -- the
other being my deliberate failure to upgrade the written Rx's as I
improve upon them). Leaves folks wondering why things never come
out "as good" when THEY make them... <grin>]

> WRT "developers":
>
> A whole lot of "applications" are written by people in professions
> unrelated to software development. They become "developers" de facto
> when their programs get passed around and used by others.
>
> Consider all the scientists, mathematicians, statisticians, etc., who
> write data analysis programs in the course of their work.
>
> Consider all the data entry clerks / "accidental" database admins who
> end up having to learn SQL and form coding to do their jobs.
>
> Consider the frustrated office workers who study VBscript or
> Powershell on their lunch hour and start automating their manual
> processes to be more productive.
>
> : < more examples elided - use your imagination >
>
> Some of these "non-professional" programs end up being very effective
> and reliable. The better ones frequently are passed around, modified,
> extended, and eventually are coaxed into new uses that the original
> developer never dreamed of.
>
> Then consider the legions of (semi)professional coders who maybe took
> a few programming courses, or who learned on their own, and went to
> work writing, e.g., web applications, Android apps, etc.
>
> It has been estimated that over 90% of all software today is written
> by people who have no formal CS/CE/CSE or IS/IT education, and 40% of
> all programmers are employed primarily to do something other than
> software development.
>
> Note: programming courses != CS/CE/CSE education

And these folks tend to use languages (and tools) that are tailored to
those sorts of "applications". Hence the reason I include a scripting
language in my design; no desire to force folks to understand data types,
overflow, mathematical precision, etc.

"I have a room that is 13 ft, 2-1/4 inches by 18 ft, 3-3/8 inches.
Roughly how many 10cm x 10cm tiles will it take to cover the floor?"

Why should the user have to normalize to some particular unit of measure?
All he wants, at the end, is a dimensionless *count*.
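
The arithmetic the scripting layer should be hiding is trivial -- a
throwaway C sketch, using the numbers above (and ignoring reuse of cut
tiles):

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        /* 13 ft 2-1/4 in  by  18 ft 3-3/8 in, converted to metres */
        double width_m  = (13.0 * 12.0 + 2.25)  * 0.0254;
        double length_m = (18.0 * 12.0 + 3.375) * 0.0254;
        double tile_m   = 0.10;                  /* 10cm x 10cm tile */

        /* round each dimension up to whole tiles */
        long tiles = (long)(ceil(width_m / tile_m) * ceil(length_m / tile_m));

        printf("Roughly %ld tiles\n", tiles);
        return 0;
    }

The user shouldn't have to see ANY of that -- least of all the 0.0254.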

[I was recently musing over the number of SOIC8 devices that could fit
on the surface of a sphere having a radius equal to the average distance
of Pluto from the Sun (idea came from a novel I was reading). And, how
much that SOIC8 collection would *weigh*...]

>>> I knew someone who was taking a C programming course, 2 nights a week
>>> at a local college. After (almost) every class, he would come to me
>>> with questions and confusions about the subject matter. He remarked
>>> on several occasions that I was able to teach him more in 10 minutes
>>> than he learned in a 90 minute lecture.
>>
>> But I suspect you had a previous relationship with said individual.
>> So, knew how to "relate" concepts to him/her.
>
> In this case, yes. But I also had some prior teaching experience.
>
> I rarely have much trouble explaining complicated subjects to others.
> As you have noted in the past, it is largely a matter of finding
> common ground with a student and drawing appropriate analogies.

Exactly. I had a lady friend many years ago to whom I'd always explain
computer-related issues (more typ operational ones than theoretical ones)
using "kitchen analogies". In a playful mood, one day, she chided me for
the misogynistic examples. So, I started explaining things in terms
of salacious "bedroom activities". Didn't take long for her to request
a return to the kitchen analogies! :>

>>> CS programs don't teach programming - they teach "computer science".
>>> For the most part CS students simply are expected to know.
>>
>> I guess I don't understand the difference.
>>
>> In my mind, "programming" is the plebian skillset.
>
> Only sort of. Programming is fundamental to computer *engineering*,
> but that is a different discipline.
>
> Computer "science" is concerned with
> - computational methods,
> - language semantics,
> - ways to bridge the semantic gap between languages and methods,
> - design and study of algorithms,
> - design of better programming languages [for some "better"]
> - ...
> Programming per se really is not a requirement for a lot of it. A
> good foundation of math and logic is more necessary.

Petri nets, lambda calculus, S-machines, etc.

But, to become *practical*, these ideas have to eventually be bound
to concrete representations. You need ways of recording algorithms
and verifying that they do, in fact, meet their desired goals.

I know no one who makes a living dealing in abstractions, entirely.
Even my physics friends have lives beyond a blackboard.

>>> CSE programs are somewhat better because they [purport to] teach
>>> project management: selection and use of tool chains, etc. But that
>>> can be approached largely in the abstract as well.
>>
>> This was an aspect of "software development" that was NOT stressed
>> in my curriculum. Nor was "how to use a soldering iron" in the
>> EE portion thereof (the focus was more towards theory with the
>> understanding that you could "pick up" the practical skills relatively
>> easily, outside of the classroom)
>
> Exactly! If you can't learn to solder on your own, you don't belong
> here. CS regards programming in the same way.

But you can't examine algorithms and characterize their behaviors,
costs, etc. without being able to reify them. You can't just
magically invent an abstract language that supports:
solve_homework_problem(identifier)

>> I just can't imagine how you could explain "programming" a machine to a
>> person without that person first understanding how the machine works.
>
> Take a browse through some classics:
>
> - Abelson, Sussman & Sussman, "Structure and Interpretation of
> Computer Programs" aka SICP
>
> - Friedman, Wand & Haynes, "Essentials of Programming Languages"
> aka EOPL

All written long after I'd graduated. :> Most (all?) of my college
CS courses didn't have "bound textbooks". Instead, we had collections
of handouts coupled with notes that formed our "texts". In some cases,
the handouts were "bound" (e.g., a cheap "perfect binding" paperback)
for convenience as the instructors were writing the texts *from*
their teachings.

Sussman taught one of my favorite courses and I'm chagrined that
all I have to show for it are the handouts and my notes -- it would
have been nicer to have a lengthier text that I could explore at
my leisure (esp after the fact).

The books that I have on the subject predate my time in college
(I attended classes at a local colleges at night and on weekends
while I was in Jr High and High School). Many of the terms used
in them have long since gone out of style (e.g., DASD, VTOC, etc.)
I still have my flowcharting template and some FORTRAN coding forms
for punched cards... I suspect *somewhere* these are still used! :>

Other texts from that period are amusing to examine to see how
terminology and approaches to problems have changed. "Real-time"
being one of the most maligned terms! (e.g., Caxton's book)

> There are many printings of each of these. I happen to have SICP 2nd
> Ed and EOPL 8th Ed on my shelf.
>
> Both were - and are still - widely used in undergrad CS programs.
>
> SICP doesn't mention any concrete machine representation until page
> 491, and then a hypothetical machine is considered with respect to
> emulating its behavior.
>
> EOPL doesn't refer to any concrete machine at all.
>
>>> What would you have done differently if C were not available for
>>> writing your applications? How exactly would that have impacted your
>>> development?
>>
>> The applications are written in Limbo. I'd considered other scripting
>> languages for that role -- LOTS of other languages! -- but Limbo already
>> had much of the support I needed to layer onto the "structure" of my
>> system. Did I want to invent a language and a hosting VM (to make it
>> easy to migrate applications at run-time)? Add multithreading hooks
>> to an existing language? etc.
>>
>> [I was disappointed with most language choices as they all tend to
>> rely heavily on punctuation and other symbols that aren't "voiced"
>> when reading the code]
>
> Write in BrainF_ck ... that'll fix them.
>
> Very few languages have been deliberately designed to be read. The
> very idea has negative connotations because the example everyone jumps
> to is COBOL - which was too verbose.

Janus (Consistent System) was equally verbose. It's what I think of when
I'm writing SQL :< An 80 column display was dreadfully inadequate!

> It's also true that reading and writing effort are inversely related,
> and programmers always seem to want to type fewer characters - hence
> the proliferation of languages whose code looks suspiciously like line
> noise.

Yes, but if you're expecting to exchange code snippets with folks
who can't *see*, the imprecision of "speaking" a program's contents
is fraught with opportunity for screwups -- even among "professionals"
who know where certain punctuation are *implied*.

Try dictating "Hello World" to a newbie over the phone...

I actually considered altering the expression syntax to deliberately
render parens unnecessary (and illegal). I.e., if an expression
can have two different meanings with/without parens, then ONLY the
meaning without parens would be supported.

But, this added lots of superfluous statements just to meet that
goal *and* quickly overloads STM as you try to keep track of
which "component statements" you've already encountered:
  area = (width_feet+(width_inches/12))*(length_feet+(length_inches/12))
becomes:
width = width_feet + width_inches/12
length = length_feet + length_inches/12
area = length * width
[Imagine you were, instead, computing the *perimeter* of a 6 walled room!]

> I don't know about you, but I haven't seen a teletype connected to a
> computer since about 1972.

Actually, I have one :>

>> You obviously have to understand the CONCEPT of "multiplication" in
>> order to avail yourself of it. But, do you care if it's implemented
>> in a purely combinatorial fashion? Or, iteratively with a bunch of CSA's?
>> Or, by tiny elves living in a hollow tree?
>
> Rabbits are best for multiplication.

Or, Adders and log tables! (bad childhood joke)

>> In my case, you have to understand that each function/subroutine invocation
>> just *appears* to be a subroutine/function invocation. That, in reality,
>> it can be running code on another processor in another building -- concurrent
>> with what you are NOW doing (this is a significant conceptual difference
>> between traditional "programming" where you consider everything to be a
>> series of operations -- even in a multithreaded environment!).
>>
>> You also have to understand that your "program" can abend or be aborted
>> at any time. And, that persistent data has *structure* (imposed by
>> the DBMS) instead of being just BLOBs. And, that agents/clients have
>> capabilities that are finer-grained than "permissions" in conventional
>> systems.
>>
>> But, you don't have to understand how any of these things are implemented
>> in order to use them correctly.
>
> Which is one of the unspoken points of those books I mentioned above:
> that (quite a lot of) programming is an exercise in logic that is
> machine independent.
>
> Obviously I am extrapolating and paraphrasing, and the authors did not
> have device programming in mind when they wrote the books.
>
> Nevertheless, there is lot of truth in it: identifying required
> functionality, designing program logic, evaluating and choosing
> algorithms, etc. ... all may be *guided* in situ by specific knowledge
> of the target machine, but they are skills which are independent of
> it.

But I see programming (C.A.E) as having moved far beyond the sorts of
algorithms you would run on a desktop, mainframe, etc. It's no longer
just about this operator in combination with these arguments yields
this result.

When I was younger, I'd frequently use "changing a flat tire" as an
example to coax folks into describing a "familiar" algorithm. It
was especially helpful at pointing out all the little details that
are so easy to forget (omit) that can render an implementation
ineffective, buggy, etc.
"Wonderful! Where did you get the spare tire from?"
"The trunk!"
"And, how did you get it out of the trunk?"
"Ah, I see... 'I *opened* the trunk!'"
"And, you did this while seated behind the wheel?"
"Oh, OK. 'I got out of the car and OPENED the trunk'"
"While you were driving down the road?"
"Grrr... 'I pulled over to the shoulder and stopped the car; then got out'"
"And got hit by a passing vehicle?"

Now, it's not just about the language and the target hardware but, also, the
execution environment, OS, etc.

Why are people surprised to discover that it's possible for <something> to
see partial results of <something else's> actions? (i.e., the need for
atomic operations) Or, to be frustrated that such problems are so hard
to track down?

(In a multithreaded environment,) we all know that the time between
execution of instruction N and instruction N+1 can vary -- from whatever
the "instruction rate" of the underlying machine happens to be up to
the time it takes to service all threads at this, and higher, priority...
up to "indefinite". Yet, how many folks are consciously aware of that
as they write code?

A "programmer" can beat on a printf() statement until he manages to stumble
on the correct combination of format specifiers, flags, arguments, etc.
But, will it ever occur to him that the printf() can fail, at RUNtime?
Or, the NEXT printf() might fail while this one didn't?
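
The check itself is trivial -- it's the *habit* that's missing. (A
minimal illustration; the errno detail is POSIX behavior, not something
the C standard promises for printf():)

    #include <stdio.h>
    #include <errno.h>
    #include <string.h>

    /* printf() returns the number of characters written, or a negative
     * value on failure -- e.g., stdout is a pipe whose reader went away. */
    int log_msg(const char *msg)
    {
        if (printf("%s\n", msg) < 0) {
            /* stderr may still be usable even when stdout isn't */
            fprintf(stderr, "printf failed: %s\n", strerror(errno));
            return -1;
        }
        return 0;
    }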

How many "programmers" know how much stack to allocate to each thread?
How do they decide -- wait for a stack fence to be breached and then
increase the number and try again? Are they ever *sure* that they've
got the correct, "worst case" value?
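
The usual empirical dodge is to "paint" the stack and read back a
high-water mark -- which only tells you what you've *seen*, not the true
worst case (names here are illustrative):

    #include <stdint.h>
    #include <stddef.h>

    #define STACK_PAINT 0xA5A5A5A5u

    /* Fill a thread's stack with a known pattern before it ever runs. */
    void stack_paint(uint32_t *stack, size_t words)
    {
        for (size_t i = 0; i < words; i++)
            stack[i] = STACK_PAINT;
    }

    /* Later: count untouched words from the far end (descending stack;
     * index 0 is the address farthest from the initial stack pointer). */
    size_t stack_headroom(const uint32_t *stack, size_t words)
    {
        size_t untouched = 0;
        while (untouched < words && stack[untouched] == STACK_PAINT)
            untouched++;
        return untouched * sizeof(uint32_t);   /* bytes never used */
    }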

I.e., there are just too many details of successful program deployment
that don't work when you get away from the rich and tame "classroom
environment". This is especially true as we move towards scenarios
where things "talk to" each other, more (for folks who aren't prepared
to deal with a malloc/printf *failing*, how do they address "network
programming"? Or, RPC/RMI? etc.)

Its easy to see how someone can coax a piece of code to work in a
desktop setting -- and fall flat on their face when exposed to a
less friendly environment (i.e., The Real World).

[Cookies tonight (while its below 100F) and build a new machine to replace
this one. Replace toilets tomorrow (replaced flange in master bath today).]

George Neuner

unread,
Jun 12, 2017, 11:27:12 PM6/12/17
to
On Sun, 11 Jun 2017 00:39:41 -0700, Don Y
<blocked...@foo.invalid> wrote:

>On 6/10/2017 6:01 PM, George Neuner wrote:
>> On Fri, 9 Jun 2017 22:50:31 -0700, Don Y <blocked...@foo.invalid>
>> wrote:

>[I was recently musing over the number of SOIC8 devices that could fit
>on the surface of a sphere having a radius equal to the average distance
>of Pluto from the Sun (idea came from a novel I was reading). And, how
>much that SOIC8 collection would *weigh*...]

Reading about Dyson spheres are we? So how many trillion-trillion
devices would it take?


>But you can't examine algorithms and characterize their behaviors,
>costs, etc. without being able to reify them.

To a 1st approximation, you can. E.g., given just an equation, you
can count the arithmetic operations and approximate the number of
operand reads and result writes.
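
For instance, given nothing but the expression for a cubic, you can
already compare a naive evaluation against Horner's form on paper:

    /* a*x^3 + b*x^2 + c*x + d, two ways.  Counted without running
     * anything: naive form is 6 multiplies and 3 adds; Horner's form
     * is 3 multiplies and 3 adds. */
    double poly_naive(double a, double b, double c, double d, double x)
    {
        return a*x*x*x + b*x*x + c*x + d;
    }

    double poly_horner(double a, double b, double c, double d, double x)
    {
        return ((a*x + b)*x + c)*x + d;
    }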

Certain analyses are very sensitive to the language being considered.
E.g., you'll get different results from analyzing an algorithm
expressed in C vs the same algorithm expressed in assembler because
the assembler version exposes low level minutia that is hidden by the
C version.


>You can't just magically invent an abstract language that supports:
> solve_homework_problem(identifier)

You can invent it ... you just [currently] can't implement it.

And you probably even can get a patent on it since the USPTO no longer
requires working prototypes.



>>> I just can't imagine how you could explain "programming" a machine to a
>>> person without that person first understanding how the machine works.
>>
>> Take a browse through some classics:
>>
>> - Abelson, Sussman & Sussman, "Structure and Interpretation of
>> Computer Programs" aka SICP
>>
>> - Friedman, Wand & Haynes, "Essentials of Programming Languages"
>> aka EOPL
>
>All written long after I'd graduated. :>

SICP and EOPL both were being written during the time I was in grad
school. I had some courses with Mitch Wand and I'm sure I was used as
a guinea pig for EOPL.

I acquired them later because they subsequently became famous as
foundation material for legions of CS students.


>Most (all?) of my college
>CS courses didn't have "bound textbooks". Instead, we had collections
>of handouts coupled with notes that formed our "texts". In some cases,
>the handouts were "bound" (e.g., a cheap "perfect binding" paperback)
>for convenience as the instructors were writing the texts *from*
>their teachings.

I'm not *that* far behind you. Many of my courses did have books, but
quite a few of those books were early (1st or 2nd) editions.

I have a 1st edition on denotational semantics that is pre-press and
contains inserts of hand drawn illustrations.


>Sussman taught one of my favorite courses and I'm chagrined that
>all I have to show for it are the handouts and my notes -- it would
>have been nicer to have a lengthier text that I could explore at
>my leisure (esp after the fact).

I once met Gerry Sussman at a seminar. Never had the opportunity to
take one of his classes.


>The books that I have on the subject predate my time in college
>(I attended classes at a local colleges at night and on weekends
>while I was in Jr High and High School). Many of the terms used
>in them have long since gone out of style (e.g., DASD, VTOC, etc.)
>I still have my flowcharting template and some FORTRAN coding forms
>for punched cards... I suspect *somewhere* these are still used! :>

I have the Fortran IV manual my father used when he was in grad
school. <grin>



>I actually considered altering the expression syntax to deliberately
>render parens unnecessary (and illegal). I.e., if an expression
>can have two different meanings with/without parens, then ONLY the
>meaning without parens would be supported.

Indentation sensitive syntax (I-expressions) is a recurring idea to
rid the world of parentheses.

Given the popularity of Python, iexprs may eventually find a future.
OTOH, many people - me included - are philosophically opposed to the
idea of significant whitespace.

If you want syntax visualization, use a structure editor.


>But, this added lots of superfluous statements just to meet that
>goal *and* quickly overloads STM as you try to keep track of
>which "component statements" you've already encountered:
> area = (width_feet+(width_inches/12))*(length_feet+(length_inches/12))
>becomes:
> width = width_feet + width_inches/12
> length = length_feet + length_inches/12
> area = length * width
>[Imagine you were, instead, computing the *perimeter* of a 6 walled room!]

??? For what definition of "STM"?

Transactional memory - if that's what you mean - shouldn't require
refactoring code in that way.


>> ... identifying required
>> functionality, designing program logic, evaluating and choosing
>> algorithms, etc. ... all may be *guided* in situ by specific knowledge
>> of the target machine, but they are skills which are independent of
>> it.
>
>But I see programming (C.A.E) as having moved far beyond the sorts of
>algorithms you would run on a desktop, mainframe, etc. Its no longer
>just about this operator in combination with these arguments yields
>this result.

As long as you don't dismiss desktops and servers, etc.

[Mainframes and minis as distinct concepts are mostly passe. Super
and cluster computers, however, are very important].

Despite the current IoT and BYO device fads, devices are not all there
are. Judging from some in the computer press, you'd think the legions
of office workers in the world would need nothing more than iPads and
Kinkos. That isn't even close to being true.



>I.e., there are just too many details of successful program deployment
>that don't work when you get away from the rich and tame "classroom
>environment". This is especially true as we move towards scenarios
>where things "talk to" each other, more (for folks who aren't prepared
>to deal with a malloc/printf *failing*, how do they address "network
>programming"? Or, RPC/RMI? etc.)

Again: CS is about computation and language theory, not about systems
engineering.

I got into it a while ago with a VC guy I met at a party. He wouldn't
(let companies he was backing) hire anyone more than 5 years out of
school because he thought their skills were out of date.

I told him I would hesitate to hire anyone *less* than 5 years out of
school because most new graduates don't have any skills and need time
to acquire them. I also said something about how the average new CS
grad would struggle to implement a way out of a wet paper bag.

Obviously there is a component of this that is industry specific, but
few (if any) industries change so fast that skills learned 5 years ago
are useless today. For me, it was a scary look into the (lack of)
mind of modern business.

YMMV,
George

Don Y

unread,
Jun 13, 2017, 1:42:17 AM6/13/17
to
On 6/12/2017 8:27 PM, George Neuner wrote:
> On Sun, 11 Jun 2017 00:39:41 -0700, Don Y
> <blocked...@foo.invalid> wrote:
>
>> On 6/10/2017 6:01 PM, George Neuner wrote:
>>> On Fri, 9 Jun 2017 22:50:31 -0700, Don Y <blocked...@foo.invalid>
>>> wrote:
>
>> [I was recently musing over the number of SOIC8 devices that could fit
>> on the surface of a sphere having a radius equal to the average distance
>> of Pluto from the Sun (idea came from a novel I was reading). And, how
>> much that SOIC8 collection would *weigh*...]
>
> Reading about Dyson spheres are we? So how many trillion-trillion
> devices would it take?

Matrioshka Brain -- "concentric" Dyson spheres each powered by the waste
heat of the innermore spheres.

I didn't do the math as I couldn't figure out what a good representative
weight for a "wired" SOIC SoC might be...

>> But you can't examine algorithms and characterize their behaviors,
>> costs, etc. without being able to reify them.
>
> To a 1st approximation, you can. E.g., given just an equation, you
> can count the arithmetic operations and approximate the number of
> operand reads and result writes.

Yes, but only for evaluating *relative* costs/merits of algorithms.
It assumes you can "value" the costs/performance of the different
operators in some "intuitive" manner.

This doesn't always hold. E.g., a more traditionally costly operation
might be "native" while the *expected* traditional operation has to be
approximated or emulated.

> Certain analyses are very sensitive to the language being considered.
> E.g., you'll get different results from analyzing an algorithm
> expressed in C vs the same algorithm expressed in assembler because
> the assembler version exposes low level minutia that is hidden by the
> C version.
>
>> You can't just magically invent an abstract language that supports:
>> solve_homework_problem(identifier)
>
> You can invent it ... you just [currently] can't implement it.

Sure you can! You just have to find someone sufficiently motivated to
apply their meatware to the problem! There's nothing specifying the
*time* that the implementation needs to take to perform the operation!

> And you probably even can get a patent on it since the USPTO no longer
> requires working prototypes.
>
>>>> I just can't imagine how you could explain "programming" a machine to a
>>>> person without that person first understanding how the machine works.
>>>
>>> Take a browse through some classics:
>>>
>>> - Abelson, Sussman & Sussman, "Structure and Interpretation of
>>> Computer Programs" aka SICP
>>>
>>> - Friedman, Wand & Haynes, "Essentials of Programming Languages"
>>> aka EOPL
>>
>> All written long after I'd graduated. :>
>
> SICP and EOPL both were being written during the time I was in grad
> school. I had some courses with Mitch Wand and I'm sure I was used as
> a guinea pig for EOPL.

Having not seen SICP, it's possible the notes for GS's class found
their way into it -- or, at least, *shaped* it.

> I acquired them later because they subsequently became famous as
> foundation material for legions of CS students.
>
>> Most (all?) of my college
>> CS courses didn't have "bound textbooks". Instead, we had collections
>> of handouts coupled with notes that formed our "texts". In some cases,
>> the handouts were "bound" (e.g., a cheap "perfect binding" paperback)
>> for convenience as the instructors were writing the texts *from*
>> their teachings.
>
> I'm not *that* far behind you. Many of my courses did have books, but
> quite a few of those books were early (1st or 2nd) editions.

It's not just *when* you got your education but what the folks teaching
opted to use as their "teaching materials". Most of my "CS" professors
obviously considered themselves "budding authors" as each seemed unable
to find a suitable text from which to teach and opted, instead, to
write their own.

OTOH, all my *other* classes (including the "EE" ones) had *real*
textbooks.

> I have a 1st edition on denotational semantics that is pre-press and
> contains inserts of hand drawn illustrations.
>
>> Sussman taught one of my favorite courses and I'm chagrined that
>> all I have to show for it are the handouts and my notes -- it would
>> have been nicer to have a lengthier text that I could explore at
>> my leisure (esp after the fact).
>
> I met once Gerry Sussman at a seminar. Never had the opportunity to
> take one of his classes.

Unfortunately, I never realized the sorts of folks I was surrounded by,
at the time. It was "just school", in my mind.

>> I actually considered altering the expression syntax to deliberately
>> render parens unnecessary (and illegal). I.e., if an expression
>> can have two different meanings with/without parens, then ONLY the
>> meaning without parens would be supported.
>
> Indentation sensitive syntax (I-expressions) is a recurring idea to
> rid the world of parentheses.
>
> Given the popularity of Python, iexprs may eventually find a future.
> OTOH, many people - me included - are philosophically opposed to the
> idea of significant whitespace.
>
> If you want syntax visualization, use a structure editor.

Still doesn't work without *vision*!

>> But, this added lots of superfluous statements just to meet that
>> goal *and* quickly overloads STM as you try to keep track of
>> which "component statements" you've already encountered:
>> area = (width_feet+(width_inches/12))*(length_feet+(length_inches/12))
>> becomes:
>> width = width_feet + width_inches/12
>> length = length_feet + length_inches/12
>> area = length * width
>> [Imagine you were, instead, computing the *perimeter* of a 6 walled room!]
>
> ??? For what definition of "STM"?
>
> Transactional memory - if that's what you mean - shouldn't require
> refactoring code in that way.

How many nested levels of parens can you keep track of if I'm dictating
the code to you over the phone and your eyes are closed? Will I be disciplined
enough to remember to alert you to the presence of every punctuation mark
(e.g., paren)? Will you be agile enough to notice when I miss one?

>>> ... identifying required
>>> functionality, designing program logic, evaluating and choosing
>>> algorithms, etc. ... all may be *guided* in situ by specific knowledge
>>> of the target machine, but they are skills which are independent of
>>> it.
>>
>> But I see programming (C.A.E) as having moved far beyond the sorts of
>> algorithms you would run on a desktop, mainframe, etc. Its no longer
>> just about this operator in combination with these arguments yields
>> this result.
>
> As long as you don't dismiss desktops and servers, etc.
>
> [Mainframes and minis as distinct concepts are mostly passe. Super
> and cluster computers, however, are very important].

"Mainframe" is a colloquial overloading to reference "big machines"
that have their own dedicated homes. The data center servicing your
bank is a mainframe -- despite the fact that it might be built of
hundreds of blade servers, etc.

"Desktop" is the sort of "appliance" that a normal user relates
to when you say "computer". He *won't* think of his phone even
though he knows its one.

He certainly won't think of his microwave oven, furnace, doorbell,
etc.

> Despite the current IoT and BYO device fads, devices are not all there
> are. Judging from some in the computer press, you'd think the legions
> of office workers in the world would need nothing more than iPads and
> Kinkos. That isn't even close to being true.
>
>> I.e., there are just too many details of successful program deployment
>> that don't work when you get away from the rich and tame "classroom
>> environment". This is especially true as we move towards scenarios
>> where things "talk to" each other, more (for folks who aren't prepared
>> to deal with a malloc/printf *failing*, how do they address "network
>> programming"? Or, RPC/RMI? etc.)
>
> Again: CS is about computation and language theory, not about systems
> engineering.
>
> I got into it a while ago with a VC guy I met at a party. He wouldn't
> (let companies he was backing) hire anyone more than 5 years out of
> school because he thought their skills were out of date.
>
> I told him I would hesitate to hire anyone *less* than 5 years out of
> school because most new graduates don't have any skills and need time
> to acquire them. I also said something about how the average new CS
> grad would struggle to implement a way out of a wet paper bag.

I think it depends on the "pedigree". When I was hired at my first
job, the boss said, outright, "I don't expect you to be productive,
today. I hired you for 'tomorrow'; if I wanted someone to be productive
today, I'd have hired from the other side of the river -- and planned on
mothballing him next year!"

From the few folks that I interact with, I have learned to see his point.
Most don't know anything about the "history" of their technology or the
gyrations as it "experimented" with different things. They see "The Cloud"
as something new and exciting -- and don't see the parallels to
"time sharing", centralized computing, etc. that the industry routinely
bounces through. Or, think it amazingly clever to turn a PC into an
X terminal ("Um, would you like to see some REAL ones?? You know, the
idea that you're PILFERING?")

Employers/clients want to know if you've done THIS before (amusing
if it's a cutting-edge project -- that NO ONE has done before!) as
if that somehow makes you MORE qualified to solve their problem(s).
I guess they don't expect people to LEARN...

> Obviously there is a component of this that is industry specific, but
> few (if any) industries change so fast that skills learned 5 years ago
> are useless today. For me, it was a scary look into the (lack of)
> mind of modern business.

A lady friend once told me "Management is easy! No one wants to take
risks or make decisions so, if YOU will, they'll gladly hide behind you!"

/Pro bono/ day tomorrow. Last sub 100F day for at least 10 days (103 on Wed
climbing linearly to 115 next Mon with a LOW of 82F) so I'm hoping to get my
*ss out of here bright and early in the morning! <frown>

45 and raining, you say... :>

Ed Prochak

unread,
Jun 13, 2017, 5:24:54 PM6/13/17
to
On Tuesday, June 13, 2017 at 1:42:17 AM UTC-4, Don Y wrote:
> On 6/12/2017 8:27 PM, George Neuner wrote:
> > On Sun, 11 Jun 2017 00:39:41 -0700, Don Y
> > <blocked...@foo.invalid> wrote:
[]
>
> >> But you can't examine algorithms and characterize their behaviors,
> >> costs, etc. without being able to reify them.
> >
> > To a 1st approximation, you can. E.g., given just an equation, you
> > can count the arithmetic operations and approximate the number of
> > operand reads and result writes.
>
> Yes, but only for evaluating *relative* costs/merits of algorithms.
> It assumes you can "value" the costs/performance of the different
> operators in some "intuitive" manner.

I'm jumping in late here so forgive me if you covered this.

Algorithmic analysis is generally order of magnitude (the familiar
Big O notation) and independent of hardware implementation.

>
> This doesn't always hold. E.g., a more traditionally costly operation
> might be "native" while the *expected* traditional operation has to be
> approximated or emulated.

I'm not quite sure what you are saying here, Don.
What's the difference between "native" and *expected*?

Is it that you *expected* the system to have a floating point
multiply, but the "native" hardware does not so it is emulated?

[]
first:
> >> You can't just magically invent an abstract language that supports:
> >> solve_homework_problem(identifier)
> >
second:
> > You can invent it ... you just [currently] can't implement it.
>
third:
> Sure you can! You just have to find someone sufficiently motivated to
> apply their meatware to the problem! There's nothing specifying the
> *time* that the implementation needs to take to perform the operation!

I'm confused here too, Don, unless the quotation levels are off.
Is it you that said the first and third comments above?
They seem contradictory. (or else you are referencing
different contexts?)


[lots of other interesting stuff deleted]

ed

Don Y

unread,
Jun 14, 2017, 12:57:26 AM6/14/17
to
Hi Ed,

On 6/13/2017 2:24 PM, Ed Prochak wrote:

>>>> But you can't examine algorithms and characterize their behaviors,
>>>> costs, etc. without being able to reify them.
>>>
>>> To a 1st approximation, you can. E.g., given just an equation, you
>>> can count the arithmetic operations and approximate the number of
>>> operand reads and result writes.
>>
>> Yes, but only for evaluating *relative* costs/merits of algorithms.
>> It assumes you can "value" the costs/performance of the different
>> operators in some "intuitive" manner.
>
> I'm jumping in late here so forgive me if you covered this.
>
> Algorithmic analysis is generally order of magnitude (the familiar
> Big O notation) and independent of hardware implementation.

Correct. But, it's only "order of" assessments. I.e., is this
a constant time algorithm? Linear time? Quadratic? Exponential?
etc.

There's a lot of handwaving in O() evaluations of algorithms.
What's the relative cost of multiplication vs. addition operators?
Division? etc.

With O() you're just trying to evaluate the relative merits of one
approach over another in gross terms.

>> This doesn't always hold. E.g., a more traditionally costly operation
>> might be "native" while the *expected* traditional operation has to be
>> approximated or emulated.
>
> I'm not quite sure what you are saying here, Don.
> What's the difference between "native" and *expected*?
>
> Is it that you *expected* the system to have a floating point
> multiply, but the "native" hardware does not so it is emulated?

Or, exactly the reverse: that it had the more complex operator
but not the "simpler" (expected) one.

We *expect* integer operations to be cheap. We expect logical
operators to be <= additive operators <= multiplication, etc.
But, that's not always the case.

E.g., having a "multiply-and-accumulate" instruction (common in DSP)
can eliminate the need for an "add" opcode (i.e., multiplication is as
"expensive" as addition).

I've designed (specialty) CPU's that had hardware to support direct
(native) implementation of DDA's. But, trying to perform a simple
"logical" operation would require a page of code (because there were no
logical operators so they'd have to be emulated).

Atari (?) made a processor that could only draw arcs -- never straight
lines (despite the fact that line segments SHOULD be easier).

Limbo initially took the approach of having five "base" data types:
- byte
- int ("long")
- big ("long long")
- real ("double")
- string
(These are supported directly by the underlying VM) No pointer types.
No shorts, short-reals (floats), etc. If you want something beyond
an integer, you go balls out and get a double!

The haughtiness of always relying on "gold" instead of "lead" proved
impractical, in the real world. So, there are now things like native
support for Q-format -- and beyond (i.e., you can effectively declare the
value of the rightmost bit AND a maximum value representable by
that particular "fixed" type):

hourlyassessment: type fixed(0.1, 40.0);
timespentworking, timeinmeetings: hourlyassessment;

LETTER: con 11.0;
DPI: con 400;
inches: type fixed(1/DPI, LETTER);
topmargin: inches;

Likewise, the later inclusion of REFERENCES to functions as a compromise
in the "no pointers" mentality. You don't *need* references, as you
can map functions to integer identifiers and then use big "case"
statements (the equivalent of "switch") to invoke one of N functions
as indicated by that identifier; just far less efficiently (enough
so that you'd make a change to the LANGUAGE to support it?).
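
A rough C analogue of the two dispatch styles (the enum, handlers and
names below are made up purely for illustration):

    #include <stdio.h>

    static void do_start(void) { puts("start"); }
    static void do_stop(void)  { puts("stop");  }
    static void do_reset(void) { puts("reset"); }

    /* Dispatch by integer identifier: every call funnels through one
       big switch -- the "no function pointers" workaround. */
    enum op { OP_START, OP_STOP, OP_RESET };

    void dispatch_by_id(enum op id)
    {
        switch (id) {
        case OP_START: do_start(); break;
        case OP_STOP:  do_stop();  break;
        case OP_RESET: do_reset(); break;
        }
    }

    /* Dispatch by reference: the "identifier" *is* the function,
       and the call is a single indirect jump. */
    typedef void (*handler)(void);

    void dispatch_by_ref(handler h)
    {
        h();
    }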

While not strictly on-topic, it's vindication that "rough" approximations
of the cost of operators can often be far enough astray that you need to
refine the costs "in practice". I.e., if "multiplication" could just
be considered to have *a* cost, then there'd be no need for all those
numerical data types.

So... you can use O-notation to compare the relative costs of different
algorithms on SOME SET of preconceived available operators and data types.
You have some implicit preconceived notion of what the "real" machine is
like.

But, mapping this to a real implementation is fraught with opportunities
to come to the wrong conclusions (e.g., if you think arcs are expensive
as being built from multiple chords, you'll favor algorithms that minimize
the number of chords used)

> first:
>>>> You can't just magically invent an abstract language that supports:
>>>> solve_homework_problem(identifier)
>>>
> second:
>>> You can invent it ... you just [currently] can't implement it.
>>
> third:
>> Sure you can! You just have to find someone sufficiently motivated to
>> apply their meatware to the problem! There's nothing specifying the
>> *time* that the implementation needs to take to perform the operation!
>
> I'm confused here too, Don, unless the quotation levels are off.
> Is it you that said the first and third comments above?

Yes.

> They seem contradictory. (or else you are referencing
> different contexts?)

Read them again; the subject changes:

1. You can't invent that magical language that allows you to solve
homework assignments with a single operator <grin>
2. You can INVENT it, but can't IMPLEMENT it (i.e., it's just a conceptual
language that doesn't run on any REAL machine)
3. You *can* IMPLEMENT it; find a flunky to do the work FOR you!
(tongue firmly in cheek)

Reinhardt Behm

unread,
Jun 14, 2017, 1:26:56 AM6/14/17
to
AT Wednesday 14 June 2017 12:57, Don Y wrote:

> Correct. But, it's only "order of" assessments. I.e., is this
> a constant time algorithm? Linear time? Quadratic? Exponential?
> etc.
>
> There's a lot of handwaving in O() evaluations of algorithms.
> What's the relative cost of multiplication vs. addition operators?
> Division? etc.

I found many such evaluations grossly wrong. We had some iterative
solutions praised as taking far fewer iterative steps than others. But
nobody took into account that each step was much more complicated and took
much more CPU time than a step of the solutions that took more steps.
And quite often the simpler solution worked for the more general case
whereas the "better" one worked only under limited conditions.

--
Reinhardt

Don Y

unread,
Jun 14, 2017, 2:58:42 AM6/14/17
to
That doesn't invalidate the *idea* of modeling algorithms using some
set of "abstract operation costs".

But, it points to the fact that the real world eventually intervenes;
and, it's a "bitch". :> Approximations have to eventually give way to
REAL data!

For example, I rely heavily on the (paged) MMU in the way my RTOS
handles "memory objects". I'll map a page into another processes
address space instead of bcopy()-ing the contents of the page
across the protection boundary. A win, right? "Constant time"
operation (instead of a linear time operation governed by the
amount of data to be bcopied).

But, there's a boat-load of overhead in remapping pages -- including
a trap to the OS to actually do the work. So, what you'd *think* to be
a more efficient way of handling the data actually is *less* efficient.

(Unless, of course, you can escalate the amount of data involved
to help absorb the cost of that overhead)
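
A back-of-the-envelope model of that tradeoff; every constant below is
an invented placeholder, not a measurement -- plug in numbers from your
own target before trusting any conclusion:

    /* Crude cost model: copying is a per-byte cost; handing pages
       across is a fixed trap into the OS plus a small per-page
       map/TLB cost.  Constants are assumptions, NOT data. */
    #include <stdio.h>

    #define NS_PER_BYTE_COPY  0.25    /* assumed bcopy() throughput   */
    #define NS_TRAP           3000.0  /* assumed syscall/trap cost    */
    #define NS_PER_PAGE_MAP   150.0   /* assumed map + TLB cost/page  */
    #define PAGE_SIZE         4096

    int main(void)
    {
        for (size_t n = 512; n <= 64 * 1024; n *= 2) {
            double copy  = n * NS_PER_BYTE_COPY;
            double pages = (double)((n + PAGE_SIZE - 1) / PAGE_SIZE);
            double remap = NS_TRAP + pages * NS_PER_PAGE_MAP;
            printf("%6zu bytes: copy %7.0f ns  remap %7.0f ns  -> %s\n",
                   n, copy, remap, copy < remap ? "copy wins" : "remap wins");
        }
        return 0;
    }

With these made-up numbers the crossover lands around a few pages; on a
real MMU (TLB shootdowns, multiprocessor IPIs, etc.) it can move a long
way in either direction.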

OTOH, there are often "other concerns" that can bias an implementation
in favor of (what appears to be) a less efficient solution.

E.g., as I work in a true multiprocessing environment, I want to
ensure a caller can't muck with an argument for which it has provided
a "reference" to the called function WHILE the called function is
(potentially) using it. So, the MMU lets me mark the page as
immutable (even though it *should* be mutable) UNTIL the called
function has completed.

[This allows the caller to *access* the contents of the page
concurrently with the called function -- but, lets the OS
intervene if the caller tries to alter the contents prematurely.
The "write lock" has value that couldn't be implemented with the
bcopy() approach -- think of large objects -- and is "cheaper"
to implement once you've already decided to take the hit for
manipulating the page tables]

The "real machine" fixes values to all those constants (K) in the
O()-evaluation. And, can muck with the decision making process
in certain sets of constraints.

Ed Prochak

unread,
Jun 19, 2017, 9:48:13 AM6/19/17
to
On Wednesday, June 14, 2017 at 2:58:42 AM UTC-4, Don Y wrote:
> On 6/13/2017 10:26 PM, Reinhardt Behm wrote:
> > AT Wednesday 14 June 2017 12:57, Don Y wrote:
> >
> >> Correct. But, it's only "order of" assessments. I.e., is this
> >> a constant time algorithm? Linear time? Quadratic? Exponential?
> >> etc.
> >>
> >> There's a lot of handwaving in O() evaluations of algorithms.
> >> What's the relative cost of multiplication vs. addition operators?
> >> Division? etc.
> >
> > I found many such evaluations grossly wrong. We had some iterative
> > solutions praised as taking far fewer iterative steps than others. But
> > nobody took into account that each step was much more complicated and took
> > much more CPU time than a step of the solutions that took more steps.
> > And quite often the simpler solution worked for the more general case
> > whereas the "better" one worked only under limited conditions.
>
> That doesn't invalidate the *idea* of modeling algorithms using some
> set of "abstract operation costs".
>
> But, it points to the fact that the real world eventually intervenes;
> and, it's a "bitch". :> Approximations have to eventually give way to
> REAL data!
>
[]
>
> The "real machine" fixes values to all those constants (K) in the
> O()-evaluation. And, can muck with the decision making process
> in certain sets of constraints.

Exactly right!

The BIG-O notation too often is depicted as O() when it is
really O(n), where n is the input set size. What gets lost sometimes
is that people treat O(n) as a comparison at a given set size, and that
is the error. (I don't think you fall into this error, Don.)

BIG-O analysis allows you to do some testing at a data set size n1
and then make a rough estimate of the run time for n2 > n1.
This can be done for the hardware and processor instruction set.
Its purpose is to avoid the naive estimate:
  "I ran this on a data set of 100 and it took 0.5 microseconds,
  so on the production run of 1,000,000 it should take less
  than a minute (50 seconds)."

Hopefully, folks here are not so naive as to depend on just O(n).
Your point is very important and worth repeating:
the analysis of algorithms can be more precise when it can
take into account the features of the implementation environment.
(I hope I phrased that close to what you meant.)

have a great day
ed

Don Y

unread,
Jun 19, 2017, 7:45:25 PM6/19/17
to
Hi Ed,

On 6/19/2017 6:48 AM, Ed Prochak wrote:
> On Wednesday, June 14, 2017 at 2:58:42 AM UTC-4, Don Y wrote:

>> The "real machine" fixes values to all those constants (K) in the
>> O()-evaluation. And, can muck with the decision making process
>> in certain sets of constraints.
>
> Exactly right!
>
> The BIG-O notation too often is depicted as O() when it is
> really O(n), where n is the input set size. What gets lost sometimes
> is that people treat O(n) as a comparison at a given set size, and that
> is the error. (I don't think you fall into this error, Don.)

It's helpful for evaluating the *relative* costs of different
algorithms where you assign some abstract cost to particular
classes of operations. It is particularly effective in a
classroom setting -- where newbies haven't yet learned to THINK
in terms of the costs of their "solutions" (algorithms).

For example, one way to convert a binary number to an equivalent
decimal value is to count the binary value down towards zero (i.e.,
using a "binary subtract 1") while simultaneously counting the
decimal value *up* (i.e., using a "decimal add 1"). Clearly, the
algorithm executes in O(n) time (n being the magnitude of the number
being converted).

Other algorithms might perform an operation (or set of operations)
for each bit of the argument. So, they operate in O(log(n)) time
(i.e., a 32b value takes twice as long to process as a 16b value).
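
A pair of minimal C sketches of those two flavors (digits stored
least-significant first; purely illustrative, and the counting version
is only sane for small magnitudes):

    #include <string.h>

    #define DIGITS 10

    /* O(n) in the *magnitude*: count the binary value down while
       counting the decimal value up, one unit per step. */
    void to_decimal_by_counting(unsigned v, unsigned char d[DIGITS])
    {
        memset(d, 0, DIGITS);
        while (v--) {                 /* "binary subtract 1"         */
            int i = 0;
            while (d[i] == 9)         /* "decimal add 1", with carry */
                d[i++] = 0;
            d[i]++;
        }
    }

    /* Work proportional to the number of *bits* (assumes a 32-bit
       unsigned): for each bit, double the decimal accumulator and
       add the incoming bit. */
    void to_decimal_by_bits(unsigned v, unsigned char d[DIGITS])
    {
        memset(d, 0, DIGITS);
        for (int b = 31; b >= 0; b--) {
            int carry = (v >> b) & 1;
            for (int i = 0; i < DIGITS; i++) {  /* d = d*2 + bit */
                int t = d[i] * 2 + carry;
                d[i] = t % 10;
                carry = t / 10;
            }
        }
    }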

This is great for a "schoolbook" understanding of the algorithms
and an idea of how the "work" required varies with the input
value and/or range of supported values (i.e., an algorithm may
work well with a certain set of test values -- yet STUN the
developer when applied to *other* values).

But, when it comes to deciding which approach to use in a particular
application, you need to know what the "typical values" (and worst
case) to be processed are likely to be. And, what the costs of the
operations required in each case (the various 'k' that have been
elided from the discussion).

If you're converting small numbers, the cost of a pair of counters
can be much less than the cost of a "machine" that knows how to
process a bit at a time. So, while (in general) the counter approach
looks incredibly naive (stupid), it can, in fact, be the best
approach in a particular set of conditions!

> BIG-O analysis allows you to do some testing at a data set size n1
> and then make a rough estimate of the run time for n2 > n1.
> This can be done for the hardware and processor instruction set.
> Its purpose is to avoid the naive estimate:
> "I ran this on a data set of 100 and it took 0.5 microseconds,
> so on the production run of 1,000,000 it should take less
> than a minute (50 seconds)."

If O(n), you'd expect it to be done in 5ms (0.5us*1,000,000/100).

By thinking in terms of HOW the algorithm experiences its costs,
you can better evaluate the types of operations (implementations)
you'd like to favor/avoid. If you know that you are intending to
deploy on a target that has a particular set of characteristics
for its operations, you might opt for a different algorithm
to avoid the more expensive operators and exploit the cheaper ones.

Many years ago, I wrote a little piece of code to exhaustively
probe a state machine (with no buried state) to build a state
transition table empirically with only indirect observation
of the "next state" logic (i.e., by looking at the "current state"
and applying various input vectors and then tabulating the
resulting next state).

[This is actually a delightfully interesting problem to solve!
Hint: once you've applied an input vector and clocked the machine,
you've accumulated knowledge of how that state handles that input
vector -- but, you're no longer in that original "current state".
And, you still have other input vectors to evaluate for it!]

How do you evaluate the efficacy of the algorithm that "walks"
the FSM? How do you determine how long it will take to map
a FSM of a given maximum complexity (measured by number of
states and input vector size)? All you know, /a priori/, is
the time that it takes to apply a set of inputs to the machine
and observe that *one* state transition...

[Exercise left for the reader.]

> Hopefully, folks here are not so naive as to depend on just O(n).
> Your point is very important and worth repeating:
> the analysis of algorithms can be more precise when it can
> take into account the features of the implementation environment.
> (I hope I phrased that close to what you meant.)

The more interesting cases are O(1), O(n^2), etc. And, how to
downgrade the cost of an algorithm that *appears*, on its surface,
to be of a higher order than it actually needs to be.
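
A toy illustration of that sort of downgrade, using the sum of all
pairwise products of an array (ignore overflow for the sake of the
example):

    /* Sum of a[i]*a[j] over all i < j.  The obvious double loop is
       O(n^2); using (sum^2 - sum_of_squares)/2 it collapses to O(n). */
    long long pairwise_naive(const int *a, int n)
    {
        long long s = 0;
        for (int i = 0; i < n; i++)
            for (int j = i + 1; j < n; j++)
                s += (long long)a[i] * a[j];
        return s;
    }

    long long pairwise_linear(const int *a, int n)
    {
        long long sum = 0, sumsq = 0;
        for (int i = 0; i < n; i++) {
            sum   += a[i];
            sumsq += (long long)a[i] * a[i];
        }
        return (sum * sum - sumsq) / 2;
    }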

> have a great day

Too late -- 119F today.

Ed Prochak

unread,
Jun 20, 2017, 6:07:51 PM6/20/17
to
On Monday, June 19, 2017 at 7:45:25 PM UTC-4, Don Y wrote:
> Hi Ed,
>
> On 6/19/2017 6:48 AM, Ed Prochak wrote:
[Lots of Don's good advice left out for a little socializing]

> > BIG-O analysis allows you to do some testing at a data set size n1
> > and then make a rough estimate of the run time for n2 > n1.
> > This can be done for the hardware and processor instruction set.
> > Its purpose is to avoid the naive estimate:
> > "I ran this on a data set of 100 and it took 0.5 microseconds,
> > so on the production run of 1,000,000 it should take less
> > than a minute (50 seconds)."
>
> If O(n), you'd expect it to be done in 5ms (0.5us*1,000,000/100).

Oh crap, I slipped my units.

Thanks Don.
[]
>
> > have a great day
>
> Too late -- 119F today.

Wow, it cooled off a bit here, back to the 70's.
Well, try to stay cool.
ed

Don Y

unread,
Jun 21, 2017, 10:37:38 AM6/21/17
to
On 6/20/2017 3:07 PM, Ed Prochak wrote:

>>> have a great day
>>
>> Too late -- 119F today.
>
> Wow, it cooled off a bit here, back to the 70's.
> Well try to stay cool.

Our highs aren't expected to fall below 110 for at least a week.
Night-time lows are 80+.

Hopefully this will pass before Monsoon starts (in a few weeks)