
FORTH on the Advanced Placement Computer Science!


visua...@rocketmail.com

Dec 19, 2014, 6:59:34 PM
You may know that there are rules in the United States about which programming language has to be taught. The program is called Advanced Placement Computer Science.

It's interesting to read how the board decided which programming language should be the one taught:

"In the early 1990s it became clear that colleges were moving away from Pascal toward languages that allowed the creation of abstract data types that could be written in separate modules and incorporated into any program."

Hear, hear! Could this have been a decision to approve Forth?
Where did it go?

"So, the AP Computer Science Development Committee chose C++ as the language that would best keep the AP courses comparable to their college counterparts, and teachers learned a new syntax as well as how to design and implement classes...
Between the time that the decision was made to switch to C++ and the first AP Exam in C++ in 1999, Java came onto the scene. Java is a safer language than C++ and has a clean way of implementing inheritance. At the 1999 AP Reading, we were already hearing speculation that Java was on the way -- and
indeed, Java became the new language for AP Computer Science in 2004.

It was chosen because it was judged to be one of the best available languages to teach the fundamental concepts that colleges require beginning computer science students to know, and because a significant percentage of colleges are currently using it as their introductory language, increasing the likelihood that AP students will receive college credit for their work."

One of the best available languages - but certainly not the best!

"Look back at the definition of computer science. With all three languages, Pascal, C++, and Java, AP CS instructors have taught and are teaching the concepts listed in this definition. The design methodology has changed, and so has the knowledge representation and implementation, but we are still teaching these essential principles.
What will happen in the future? Is there a new language on the horizon? I don't know. I do know that computer science is a dynamic field. As it changes and grows, so will AP Computer Science."

We all know that there has been a new language on the horizon for nearly five decades. This programming language is called "FORTH" - in 2018 it will be the AP Computer Science programming language.

Sources:
AP Computer Science Teacher's Guide, Chapter 1: About AP Computer Science
http://apcentral.collegeboard.com/apc/members/repository/ap07_compsci_teachersguide.pdf
http://apcentral.collegeboard.com/apc/public/repository/ap-computer-science-course-description.pdf
http://www.whitehouse.gov/blog/2014/12/08/celebrating-computer-science-education-week-kids-code-white-house

mix

Dec 20, 2014, 3:36:42 AM
Is it clearly stated somewhere? I didn't follow the links you gave us because I
already know that C++ and Java are used in education.

I don't see how Forth could be "safer" than C++ or Java. ANS Forth and many
specific implementations lack many features of popular high-level languages,
especially safety features. Many implementations will crash when trying to pop
a value from an empty stack, and that's considered completely normal.
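
As a sketch of that behaviour (in Python for illustration, since the point is language-independent): an unchecked pop, like those in many Forth systems, silently returns whatever happens to lie below the stack, while a checked pop reports the underflow at the faulty operation. The `JUNK` value here is a hypothetical stand-in for arbitrary memory contents.

```python
JUNK = 0xDEADBEEF  # hypothetical stand-in for whatever happens to be in memory

class DataStack:
    """Toy model of a Forth data stack, with optional underflow checking."""

    def __init__(self, checked):
        self.items = []
        self.checked = checked

    def push(self, x):
        self.items.append(x)

    def pop(self):
        if not self.items:
            if self.checked:
                raise RuntimeError("stack underflow")  # diagnostic at the fault
            return JUNK  # unchecked system: garbage result, failure surfaces later
        return self.items.pop()

fast = DataStack(checked=False)
print(fast.pop())          # no error here; the bad value propagates silently

safe = DataStack(checked=True)
try:
    safe.pop()
except RuntimeError as e:
    print(e)               # "stack underflow", reported immediately
```

The unchecked version is the typical trade-off: no per-operation cost, but the failure shows up far from its cause.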

I think Forth could be really useful in the education field, but not because
it's the "best" language out there (even though I like Forth and
Assembler better than any other language).

I think that to really teach the concepts of programming, teachers have to
start with Assembler, then let students write a high-level language on top of
it, and Forth would be the best choice here because of its simplicity.

It seems that in many cases people think programming is a part of math, and
that the first things students have to learn are math and algorithms. However,
math is pure abstraction, while in real life we have to work with real machines
and write algorithms for real processors.

When a student is writing something like a tree or another data-structure
implementation, he should already understand how it will look in RAM, and
what price will be paid for speed or for lower memory consumption.

Well, that's just my opinion.
--
mix x

Andrew Haley

Dec 20, 2014, 4:40:43 AM
mix <m...@test.net> wrote:
>
> I think to really teach concepts of programming teachers have to start with
> Assembler, then let students write high level language on it, and Forth
> would be the best choice here because of its simplicity.

I think many students would be demotivated by this. You have to give
people a way to get feedback and enjoy themselves; just crashing is
frustrating.

Andrew.


mix

Dec 20, 2014, 8:57:54 AM
A program written by Intel or Microsoft just crashing is one thing; a program
written by you just crashing is another, because you know exactly why it is
crashing.

--
mix x

Andrew Haley

Dec 21, 2014, 5:38:42 AM
mix <m...@test.net> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> wrote:
>> mix <m...@test.net> wrote:
>>>
>>> I think to really teach concepts of programming teachers have to start with
>>> Assembler, then let students write high level language on it, and Forth
>>> would be the best choice here because of its simplicity.
>>
>> I think many students would be demotivated by this. You have to give
>> people a way to get feedback and enjoy themselves; just crashing is
>> frustrating.
>
> Just crashing of program written by Intel or Microsoft is one thing, and
> just crashing of program written by you is another thing, because you know
> exactly why is it crashing.

Err, no you don't: not at this stage of learning to program, anyway.

Andrew.

mix

Dec 21, 2014, 4:38:17 PM
Disagree.

Didn't you notice that I mentioned Assembler has to be learned first?

--
mix x

Andrew Haley

Dec 21, 2014, 5:32:30 PM
mix <m...@test.net> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> wrote:
>> mix <m...@test.net> wrote:
>>> Andrew Haley <andr...@littlepinkcloud.invalid> wrote:
>>>> mix <m...@test.net> wrote:
>>>>>
>>>>> I think to really teach concepts of programming teachers have to
>>>>> start with Assembler, then let students write high level
>>>>> language on it, and Forth would be the best choice here because
>>>>> of its simplicity.
>>>>
>>>> I think many students would be demotivated by this. You have to give
>>>> people a way to get feedback and enjoy themselves; just crashing is
>>>> frustrating.
>>>
>>> Just crashing of program written by Intel or Microsoft is one thing, and
>>> just crashing of program written by you is another thing, because you know
>>> exactly why is it crashing.
>>
>> Err, no you don't: not at this stage of learning to program, anyway.
>
> Disagree.
>
> Didn't you notice that I mention Assembler have to be learned first?

Sure; it's right up there, about twenty lines above this one. I don't
quite know why you believe that a beginner knows exactly why their
assembly language program is crashing. Assembly language isn't all
that easy to understand. Debugging assembly language is difficult.

Andrew.

Michael Barry

Dec 21, 2014, 7:02:32 PM
On Sunday, December 21, 2014 1:38:17 PM UTC-8, mix wrote:
>
> Didn't you notice that I mention Assembler have to be learned first?
>
> --
> mix x

When I started college in the mid-80s, we had the lower-division
choices of Pascal and Assembler, both of which were hosted on NOS,
the CDC Cyber time-sharing system. The Assembler was IBM 360
assembler, and was run on a simulator on the Cyber! C and 68k
assembler showed up in the course catalog within a couple of years,
and they were hosted by the time-sharing VAX (C was anyway, I can't
remember the 68k one, because I took that class elsewhere). Good
times fighting over time-share terminals during crunch time, and
waiting for the wide fan-fold line printer listings to show up in
the output bins!

Mike

mix

Dec 21, 2014, 10:56:33 PM
Assembly language is easy to understand and easy to debug.

--
mix x

m...@iae.nl

Dec 22, 2014, 3:53:16 AM
On Monday, December 22, 2014 4:56:33 AM UTC+1, mix wrote:
[..]
> Assembly language is easy to understand and easy to debug.
[..]

I made an error here ...

FORTH> see ud/mod
Flags:
$01131438 pop r8
$0113143A pop rcx
$0113143B pop rbx
$0113143C pop rdx
$0113143D pop rax
$0113143E test rcx, rcx
$01131441 jne $01131462 offset SHORT
$01131443 cmp rdx, rbx
$01131446 jb $01131457 offset SHORT
$01131448 mov rcx, rax
$0113144B mov rax, rdx
$0113144E xor rdx, rdx
$01131451 div rbx
$01131454 xchg rcx, rax
$01131457 div rbx
$0113145A push rdx
$0113145B push 0 b#
$0113145D push rax
$0113145E push rcx
$0113145F jmp r8
$01131462 push rsi
$01131463 mov r9, rdx
$01131466 mov r10, rax
$01131469 mov rsi, rbx
$0113146C mov rdi, rcx
$0113146F shr rdx, 1 b#
$01131473 rcr rax, 1 b#
$01131477 ror rdi, 1 b#
$0113147B rcr rbx, 1 b#
$0113147F bsr rcx, rcx
$01131483 shrd rbx, rdi, cl
$01131487 shrd rax, rdx, cl
$0113148B shr rdx, cl
$0113148E rol rdi, 1 b#
$01131492 div rbx
$01131495 mov rbx, r10
$01131498 mov rcx, rax
$0113149B imul rdi, rax
$0113149F mul rsi
$011314A2 add rdx, rdi
$011314A5 sub rbx, rax
$011314A8 mov rax, rcx
$011314AB mov rcx, r9
$011314AE sbb rcx, rdx
$011314B1 sbb rdx, rdx
$011314B4 and rsi, rdx
$011314B7 and rdi, rdx
$011314BA add rbx, rsi
$011314BD adc rcx, rdi
$011314C0 add rax, rdx
$011314C3 xor rdx, rdx
$011314C6 pop rsi
$011314C7 push rbx
$011314C8 push rcx
$011314C9 push rax
$011314CA push rdx
$011314CB jmp r8

-marcel

Alex McDonald

Dec 22, 2014, 6:42:51 AM
As a commercial assembly programmer in IBM BAL for several years, I
disagree. I started programming with assembler, and it was tough to learn
and tougher to debug.

Branch on indeX Low or Equal
BXLE Ri,Rn,D(B)

D(B) is an address formed from the sum of a displacement and a base
register. The displacement is at most 4K.

Where Rn is an even register, R(n+1) is its paired odd register:
Ri <-- Ri+Rn; branch to D(B) if Ri <= R(n+1)

Where Rn is an odd register:
Ri <-- Ri+Rn; branch to D(B) if Ri <= Rn

Let me know how you might use this instruction, and the value of the
odd register form in particular. Then we can talk about how to use it for
registers where i=n. And how to debug it. This might seem an artificial
instruction used once in a blue moon by sadistic programmers. It isn't.
Any IBM mainframe assembler programmer will have come across it and used
it, since it's an incredibly expressive instruction.
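
As a rough illustration of the two forms (a Python model, not IBM's definition; the register numbers and values are made up for the sketch):

```python
def bxle(regs, ri, rn, branch_target):
    """Rough model of S/360 BXLE: Ri += increment; branch if Ri <= comparand.

    Even Rn: the increment is Rn and the comparand is the paired odd
    register R(n+1). Odd Rn: the same register is both.
    """
    comparand = rn + 1 if rn % 2 == 0 else rn
    regs[ri] += regs[rn]
    return branch_target if regs[ri] <= regs[comparand] else None

# Even form: a compact counted loop over 1, 3, 5, 7, 9.
regs = {1: 1, 2: 2, 3: 10}   # R1 = index, R2 = increment, R3 = limit
visited = []
while True:
    visited.append(regs[1])             # loop body
    if bxle(regs, ri=1, rn=2, branch_target="top") is None:
        break
print(visited)                          # [1, 3, 5, 7, 9]
```

The odd form, where one register serves as both increment and limit, is the less obvious one; part of the instruction's expressiveness (and debugging difficulty) comes from that register-pairing rule being implicit in the register number.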

Now add in a complete algorithm based on its use, and debugging requires
experience and knowledge not only of the machine code, the disassembly,
and the source (if you have it), but also of the person or compiler that
wrote the code. It's not just the place where the code crashes
either, since often the problem starts a long, long way away (both in
time and space).

Writing assembler and debugging it is (for me at least) a holistic
experience. That experience doesn't come cheap. While teaching students
assembler is a fine and noble goal, and IMHO does make for a better
programmer in any language, it really isn't a great starting point.

> --
> mix x
>

rickman

Dec 22, 2014, 7:51:33 AM
On 12/22/2014 6:42 AM, Alex McDonald wrote:
>
> Writing assembler and debugging it is (for me at least) a holistic
> experience. That experience doesn't come cheap. While teaching students
> assembler is a fine and noble goal, and IMHO does make for a better
> programmer in any language, it really isn't a great starting point.

You picked a very complex example of a rather complex machine. But
learning the assembly language of the 8 bit AVR processor or the 16 bit
MSP430 would be *much* easier.

I won't argue that assembler is the right starting point for learning to
program computers, but I think it should be taught early on, alongside a
higher-level language. I remember the first time I learned to program,
in a chemistry class where the professor felt the computer was a
mysterious thing and had no idea of its insides. Then when I did learn
how it all worked, it was a bit of a letdown... so primitive. That in
turn led me to programming in microcode and designing hardware for
microcoded functions. Can't get much more primitive than that.

Knowing that the hardware is doing 3 billion things per second
gives a perspective that I don't think is easy to get when doing only
high-level programming.

--

Rick

Albert van der Horst

Dec 22, 2014, 8:43:44 AM
In article <a026689d-c89f-46e8...@googlegroups.com>,
Although I agree with your point, you didn't prove it.

This is unfair. This isn't assembler source; this is a
disassembly. A reasonable assembler source would
have symbolic names indicating the purpose of registers,
named labels, and possibly some helpful macros.

>
>-marcel

Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

JUERGEN

Dec 22, 2014, 8:47:53 AM
Rick, have you looked at a NISC - no microcode, just control of registers? You have only registers, buses, memory, and register control, so Forth would drive the hardware directly. CISC - RISC - NISC.

rickman

Dec 22, 2014, 9:11:09 AM
On 12/22/2014 8:47 AM, JUERGEN wrote:
> On Friday, December 19, 2014 11:59:34 PM UTC, visua...@rocketmail.com wrote:
>> You may know that there are rules which programming language has to be taught in the United States. It is called Advanced Placement Computer Science..
I believe what you are describing would be a microcoded computer with
no assembly language. In other words, VLIW in its purest form. That is
exactly the sort of machine I worked on many years ago. It was a
floating-point processor before there was floating point on a chip. The
control word was over a hundred bits wide. There was no instruction set,
just lots of control points, a few of them encoded.

--

Rick

Albert van der Horst

Dec 22, 2014, 10:23:20 AM
This reminds me of the DEC Alpha. There were a lot of spare words that did
special things, via the PAL instruction. It could be interesting to
play with, but I never had enough documentation for that.
It is not even clear to me whether those instructions were chip- or
system-dependent. My DEC Alpha is an NT workstation that I abuse for
Linux. Does anyone know more about this PAL stuff?

>
>--
>
>Rick

Alex McDonald

Dec 22, 2014, 10:59:35 AM
on 22/12/2014 12:51:15, rickman wrote:
> On 12/22/2014 6:42 AM, Alex McDonald wrote:
>>
>> Writing assembler and debugging it is (for me at least) a holistic
>> experience. That experience doesn't come cheap. While teaching students
>> assembler is a fine and noble goal, and IMHO does make for a better
>> programmer in any language, it really isn't a great starting point.
>
> You picked a very complex example of a rather complex machine. But
> learning the assembly language of the 8 bit AVR processor or the 16
> bit MSP430 would be *much* easier.

Knuth had MIX; it was horribly complex, and a mistake due to its weird
and wacky decimal number system, not the architecture of the
instruction set. MMIX didn't fix it; it made it much worse. The books
(TAOCP) were successful despite that mistake, mainly due to Knuth's fine
analysis in pseudo-code and words.

>
> I won't argue that assembler is the right starting point for learning
> to program computers, but I think it should be taught early on next to
> a higher level language. I remember the first time I learned to
> program in a chemistry class where the professor felt the computer was
> a mysterious thing and had no idea of the insides. Then when I did
> learn how it all worked, it was a bit of a letdown... so primitive.

That for me was the "aha!" moment. All that emergent power from simple
instructions, wrapped in a high-level language. The same with chemistry:
with a handful of elements you can construct two brains and a damn fine
argument on Usenet.

> That in turn led me to programming in microcode and designing hardware
> for microcoded functions. Can't get much more primitive than that.

That's a desire for the primitive, which is not what assembler is about,
and for students interested in programming it is very definitely headed
in the wrong direction.

>
> Knowing that the hardware is running to do 3 billion things per second
> gives a perspective that I don't think is so easy to get when only
> doing high level programming.
>

High-level languages get translated to code which is executed, or to some
intermediate form that is interpreted; assembler gets translated to code in
a more one-to-one fashion. Speed is about clock frequency, and assembler vs.
high level has nothing to do with that.

visua...@rocketmail.com

Dec 22, 2014, 1:51:08 PM
I started in 1975 with the MOS Technology 6502 KIM (Keyboard Input Monitor).
Knowing only BASIC at the time - but never having programmed in it - I found 6502 machine language really easy to use: easy-to-remember mnemonics and an orthogonal command table of approx. 60 commands, of which ten were enough to start with, using the command table plus pencil and paper to write programs, including a VM and a database.

In 1984 I got RSC-Forth for the 65xx family, just as easy to use but much more effective than machine language - and RSC-Forth allowed writing to floppy disks. That was a great improvement.

mix

Dec 22, 2014, 9:42:38 PM
The original _ulldiv routine was written by Norbert Juffa for 32-bit x86
assembly language. I don't see a mistake in your code, unless it's some
typo I overlooked. I would rewrite this routine in pretty much the
same way you did, with a couple of exceptions: I wouldn't use push and pop,
and I wouldn't use a test instruction to check the contents of rcx.

# (d1 d2 -- d3 d4 )
word "ud/mod",6,NONE,ud_slash_mod,d_min,ud_slash_mod
movq CELL*3(%rsp), %rax
movq CELL*2(%rsp), %rdx
movq CELL(%rsp), %rbx
movq (%rsp), %rcx

jrcxz ud_mod__no_big_divisor
ud_mod__big_divisor:
movq %rdx, %r9
movq %rax, %r10
movq %rbx, %rsi
movq %rcx, %rdi
shrq $1, %rdx
rcrq $1, %rax
rorq $1, %rdi
rcrq $1, %rbx
bsrq %rcx, %rcx
shrdq %cl, %rdi, %rbx
shrdq %cl, %rdx, %rax
shrq %cl, %rdx
rolq $1, %rdi
divq %rbx
movq %r10, %rbx
movq %rax, %rcx
imulq %rax, %rdi
mulq %rsi
addq %rdi, %rdx
subq %rax, %rbx
movq %rcx, %rax
movq %r9, %rcx
sbbq %rdx, %rcx
sbbq %rdx, %rdx
andq %rdx, %rsi
andq %rdx, %rdi
addq %rsi, %rbx
adcq %rdi, %rcx
addq %rdx, %rax

movq $0, (%rsp)
movq %rax, CELL(%rsp)
movq %rcx, CELL*2(%rsp)
movq %rbx, CELL*3(%rsp)
NEXT
ud_mod__no_big_divisor:
cmpq %rbx, %rdx
jb ud_mod__one_div
movq %rax, %rcx
movq %rdx, %rax
xorq %rdx, %rdx
divq %rbx
xchgq %rcx, %rax
ud_mod__one_div:
divq %rbx

movq %rdx, CELL*3(%rsp)
movq $0, CELL*2(%rsp)
movq %rax, CELL(%rsp)
movq %rcx, (%rsp)
NEXT
drow

--
mix x

mix

Dec 22, 2014, 9:57:47 PM
This is a very well know routine named _ulldiv written many years ago for
32-bit x86 assembly language. Marcel just changed 32-bit registers to
64-bit ones.

Such routines in assembly language are similar to AWK, sed, or Perl
one-liners: you don't even have to understand what's going on inside them
since you know which registers are used for input and output.

To find commented assembly source code, google "_ulldiv Norbert Juffa".

--
mix x

mix

Dec 22, 2014, 9:57:48 PM
That's not the best example of machine with easy-to-understand and
easy-to-debug assembly language.
--
mix x

mix

Dec 22, 2014, 10:38:39 PM
"Very well known", of course. I apologize for being sloppy.

> 32-bit x86 assembly language. Marcel just changed 32-bit registers to
> 64-bit ones.
>
> Such routines in assembly language are similar to AWK, sed, or Perl
> one-liners: you don't even have to understand what's going on inside them
> since you know which registers are used for input and output.
>
> To find commented assembly source code, google "_ulldiv Norbert Juffa".

Actually, here is the direct link:
http://www.df.lth.se/~john_e/gems/gem002a.html

--
mix x

mix

Dec 23, 2014, 1:09:44 AM
OK, I've read your message and I still disagree.

I have access to an IBM z mainframe; however, I've never tried to write
assembly code for it. Moreover, I'm not sure that particular mainframe has
the same instruction set you used, and I'm not familiar with the developer
tools provided there.

Can I see the opcode of this instruction? I would like to see if such an
opcode can be generated and executed at run time.

--
mix x

Alex McDonald

Dec 23, 2014, 7:44:48 AM
>> Ri <-- Ri+Rn; branch to D2(B2) if Ri <= R(n+1)
>>
>> where Rn is an odd register
>> Ri <-- Ri+Rn; branch to D2(B2) if Ri <= Rn
>>
>> Let me know how how you might use this instruction, and the value of the
>> odd register form in particular. Then we can talk about how to use it for
>> registers where i=n. And how to debug it. This might seem an artificial
>> instruction used once in a blue moon by sadistic programmers. It isn't.
>> Any IBM mainframe assembler programmer will have come across it and used
>> it, since it's an incredibly expressive instruction.
>>
>> Now add in a complete algorithm based on its use, and debugging requires
>> experience and knowledge not only of the machine code, the disassemby,
>> and the source (if you have it), but also of the person or compiler that
>> wrote the code too. It's not just the place where the code crashes
>> either, since often the problem starts a long, long way away (both in
>> time and space).
>>
>> Writing assembler and debugging it is (for me at least) a holistic
>> experience. That experience doesn't come cheap. While teaching students
>> assembler is a fine and noble goal, and IMHO does make for a better
>> programmer in any language, it really isn't a great starting point.
>
> OK, I've read your message and I still disagree.
>
> I have access to IBM z mainframe, however, I've never tried to write
> assembly code for it. Moreover, I'm not sure that particular mainframe
> have same instruction set you used to use, and I'm not familiar with
> developer tools provided there.

The instruction set has been extended over the years, but hasn't otherwise
changed since 1964, all the way from S/360 through to the Z series. BXLE
(and all the others) are still there, and 50-year-old programs will run
with little to no change. The toolset doc is here:


http://www-01.ibm.com/support/knowledgecenter/SSLTBW_1.12.0/com.ibm.zos.r12.asmk200/asmtug20.htm

>
> Can I see the opcode of this instruction? I would like to see if such
> opcode can be generated and executed in run-time.

I'm not quite sure why that would help. The opcode mnemonic is BXLE; it's
a 4-byte RS-format instruction (two registers and an unindexed
base-displacement address), and the assembler (HLASM) works the rest out.
I'm sure you'll find the hex encoding in a Principles of Operation manual.
HLASM is very high level, and there are macros like DO and ENDDO. An
example might help.

DO FROM=(R1,1),TO=(R3,10),BY=(R2,2)
Code for F
ENDDO

which generates and then assembles

LA R1,1
LA R3,10
LA R2,2
#@LB2 DC 0H
* code for F
#@LB3 DC 0H
BXLE R1,R2,#@LB2


The assembled code is what you debug. The DO/ENDDO aren't the source
that's assembled; they are macros that generate the source that's
assembled, and the debugger has no knowledge of those statements. HLASM
isn't a compiler.

Learning assembler as a first programming language makes it difficult to
grasp a whole host of important programming concepts, far less be able to
implement them. Although it may be instructive to understand what goes on
at a lower level, it's a poor first introduction to programming.

I still contend that assembler, regardless of architecture, is much
harder to write well and debug effectively than high-level programs.

>
> --
> mix x
>

mix

Dec 23, 2014, 8:12:19 AM
If I remember correctly, IBM has lately been in favor of virtualization,
i.e. old code written for S/360 is translated to pseudo-code, and that
pseudo-code is translated to actual machine code.

Thank you for the links, though. I'll check them out.

> http://www-01.ibm.com/support/knowledgecenter/SSLTBW_1.12.0/com.ibm.zos.r12.asmk200/asmtug20.htm
>
>>
>> Can I see the opcode of this instruction? I would like to see if such
>> opcode can be generated and executed in run-time.
>
> I'm not quite sure why that would help. The opcode mnemonic is BXLE, it's
> an RS 4 byte (two registers and an unindexed base-displacement address)
> and the assembler (HLASM) works the rest out. I'm sure you'll find the
> hex encoding in a Principle of Operations manual. HLASM is very high
> level, and there are macros like DO and ENDDO. An example might help.
>
> DO FROM=(R1,1),TO=(R3,10),BY=(R2,2)
> Code for F
> ENDDO

I would never write something like that. That's not Assembler. I would
write Forth first and then do high level language things if needed.

OK, give me some time and I'll try to answer your questions.

> which generates and then assembles
>
> LA R1,1
> LA R3,10
> LA R2,2
> #@LB2 DC 0H
> * code for F
> #@LB3 DC 0H
> BXLE R1,R2,#@LB2
>
>
> The assembled code is what you debug. The DO/ENDDO aren't the source
> that's assembled; they are macros that generate the source that's
> assembled, and the debugger has no knowledge of those statements. HLASM
> isn't a compiler.
>
> Learning assembler as a first programming language makes it difficult to
> grasp a whole host of important programming concepts, far less be able to
> implement them. Although it may be instructive to understand what goes on
> at a lower level, it's a poor first introduction to programming.
>
> I still contend that assembler, regardless of architecture, is much
> harder to write well and debug effectively that high level programs.

The reason for that is a lack of good teachers. Assembler is no more
complicated than any other programming language, but many things are done
differently in assembly code.

Anton Ertl

Dec 23, 2014, 10:25:16 AM
alb...@spenarnc.xs4all.nl (Albert van der Horst) writes:
>This reminds of the DEC Alpha. There were a lot of spare words that did
>special things, via the PAL instruction. It could be interesting to
>play with, but I never had enough decumentation for that.
>It is not even clear to me whether those instructions were chip or
>system dependant. My Dec Alpha is an NT work station that I abuse for
>Linux. Does anyone know more about this PAL stuff?

PAL instructions invoke machine code routines that can access some
hardware resources that normal code is not allowed to access. So they
were somewhat similar to system calls. But sometimes it is just
ordinary code. I looked at the IMB (instruction memory barrier) code
for IIRC the 21064a, and it just tramples through the whole cache in
order to flush it. Some PAL instructions are guaranteed to be
available everywhere (e.g., IMB), others are OS-specific (or maybe
rather console-specific: the SRM console for Digital Unix and VMS, and
ARC/Alphabios for WNT; Linux worked with either).

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2014: http://www.euroforth.org/ef14/

Anton Ertl

Dec 23, 2014, 10:38:15 AM
"Alex McDonald" <bl...@rivadpm.com> writes:
>Learning assembler as a first programming language makes it difficult to
>grasp a whole host of important programming concepts

Like what?

I don't see the problem there, I see it in the feedback you get when
you make a mistake; it's not very enlightening to the novice.

Paul Rubin

Dec 23, 2014, 10:57:55 AM
mix <m...@test.net> writes:
> The reason for that is lack of good teachers. Assembler is no any more
> complicated than any other programming language, but many things are done
> differently in assembly code.

Assembly is harder to debug mostly because of its complete lack of
memory safety: it's easy for a buggy program to clobber data
belonging to some other part of the program, which causes no
observable misbehaviour until that other part runs and you
have to diagnose what happened to it.

If a program in a safe language has a similar bug, then it crashes with
an understandable diagnostic as soon as the buggy code tries writing to
the wrong place; or better yet, the error is found at compile time, say
through type checking.
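
A sketch of the contrast (modelled in Python, since Python itself is memory-safe): treat one flat bytearray as RAM shared by two routines, the way assembly shares a single address space. The addresses and contents are made up for the illustration.

```python
# One flat "RAM" shared by everything, as in assembly.
ram = bytearray(8)
ram[0:4] = b"AAAA"   # buffer owned by routine 1
ram[4:8] = b"BBBB"   # unrelated data owned by routine 2

def unsafe_store(addr, data):
    ram[addr:addr + len(data)] = data   # no bounds or ownership check

unsafe_store(2, b"XXXX")     # routine 1 overruns its buffer by two bytes
print(bytes(ram[4:8]))       # b'XXBB': routine 2's data silently clobbered;
                             # nothing fails until routine 2 later reads it

# A safe language reports the fault at the buggy write itself:
buf = bytearray(4)
try:
    buf[5] = 0               # out of range
except IndexError as e:
    print("caught at the faulty store:", e)
```

The first failure surfaces far from its cause; the second names the exact statement that went wrong.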

There is also the far higher amount of code you have to write to do even
the simplest things, compared to an HLL.

rickman

Dec 23, 2014, 11:17:29 AM
I think you missed my point. I see a computer accept a command and
spend seconds doing something that should happen in the blink of an eye.
I ask the programmer what the computer is doing and they don't even
understand the question. Many people find it mind-boggling that
computers do what they do so quickly, while I find it mind-boggling
that computers are so slow. That is because the programmer only sees
high-level constructs and has no idea how they are implemented.
Programming in assembly gives him a better perspective on how
his ideas are implemented.

--

Rick

Andrew Haley

Dec 23, 2014, 11:31:06 AM
Paul Rubin <no.e...@nospam.invalid> wrote:
> mix <m...@test.net> writes:
>> The reason for that is lack of good teachers. Assembler is no any more
>> complicated than any other programming language, but many things are done
>> differently in assembly code.
>
> Assembly is harder to debug mostly because of its complete lack of
> memory safety, so it's easy for a buggy program to clobber data
> belonging to some other part of the program, that doesn't cause
> observable misbehaviour until the other part of the program runs and you
> have to diagnose what happened to it.

I don't think that's the reason at all. There are plenty of other
languages without memory safety that are much easier to debug. Forth,
for one.

> If a program in a safe language has a similar bug, then it crashes with
> an understandable diagnostic as soon as the buggy code tries writing to
> the wrong place; or better yet, the error is found at compile time, say
> through type checking.
>
> There is also the far higher amount of code you have to write to do even
> the simplest things, compared to an HLL.

Yeah. That one. :-)

Andrew.

mix

Dec 23, 2014, 1:56:42 PM
Paul Rubin <no.e...@nospam.invalid> wrote:
> mix <m...@test.net> writes:
>> The reason for that is lack of good teachers. Assembler is no any more
>> complicated than any other programming language, but many things are done
>> differently in assembly code.
>
> Assembly is harder to debug mostly because of its complete lack of
> memory safety, so it's easy for a buggy program to clobber data
> belonging to some other part of the program, that doesn't cause
> observable misbehaviour until the other part of the program runs and you
> have to diagnose what happened to it.
>
> If a program in a safe language has a similar bug, then it crashes with
> an understandable diagnostic as soon as the buggy code tries writing to
> the wrong place; or better yet, the error is found at compile time, say
> through type checking.

Better yet, the programmer does not make that mistake, because there is only
one type - pointer to array of bytes - and that's it.

> There is also the far higher amount of code you have to write to do even
> the simplest things, compared to an HLL.


--
mix x

mix

Dec 23, 2014, 3:13:51 PM
Paul Rubin <no.e...@nospam.invalid> wrote:
> mix <m...@test.net> writes:
>> The reason for that is lack of good teachers. Assembler is not any more
>> complicated than any other programming language, but many things are done
>> differently in assembly code.
>
> Assembly is harder to debug mostly because of its complete lack of
> memory safety, so it's easy for a buggy program to clobber data
> belonging to some other part of the program, that doesn't cause
> observable misbehaviour until the other part of the program runs and you
> have to diagnose what happened to it.
>
> If a program in a safe language has a similar bug, then it crashes with
> an understandable diagnostic as soon as the buggy code tries writing to
> the wrong place; or better yet, the error is found at compile time, say
> through type checking.

Not sure what you are talking about. You can implement OOP or type checking
in Assembler just as you can in Forth. Moreover, ANS Forth and Assembler
are exactly the same with respect to this particular problem. The
difference from VB or some similar language is that *you* have to implement
the safety facilities yourself. It's not very complicated, but most people
decide not to deal with it; I hope you understand why.

Google "Assembler OOP", for example. If you're really interested, I can
write you a trivial safe-type implementation.

> There is also the far higher amount of code you have to write to do even
> the simplest things, compared to an HLL.


--
mix x

mix

unread,
Dec 23, 2014, 3:21:59 PM12/23/14
to
That's not exactly true. The number of constructs in Assembler is limited,
just as in an HLL. It's true that the code will be longer in terms of bytes,
but that doesn't mean you'll spend more time typing it.

Programmers do not spend most of their time typing things.

--
mix x

Paul Rubin

unread,
Dec 23, 2014, 3:27:28 PM12/23/14
to
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
> I don't think that's the reason at all. There are plenty of other
> languages without memory safety that are much easier to debug. Forth,
> for one.

Hmm. Forth style tends to avoid writing to memory (using the stack
instead) which might help; and the interactive interpreter (test each
word carefully before going to the next one) probably makes debugging
and bug prevention easier. The stack conventions also create some
informal type discipline, that could also be followed in assembler but
is more stuff for the programmer to have to stay on top of. Especially
for beginners, the complexity gets out of control.

I've never written any significant sized asm programs (just some small
ones) and don't really have a picture in my mind of how the generation
of programmers who wrote complex apps in asm did it.

Paul Rubin

unread,
Dec 23, 2014, 3:31:53 PM12/23/14
to
mix <m...@test.net> writes:
> Better yet, the programmer doesn't make that mistake, because there is only
> one type: pointer to array of bytes, and that's it.

To use an analogy that came up in Haskell, think of a jigsaw puzzle
(picture of scenery, say) whose pieces have complicated shapes (types).
So if you try putting a piece in the wrong place, it doesn't fit. Now
imagine the same puzzle except the pieces are all 1x1 squares. Yes the
square pieces are "simpler", but the puzzle becomes much harder to
solve.
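To make the analogy concrete, here is a minimal Haskell sketch (the function names are mine, not from the thread). Each type is a piece shape: values only fit where their shape matches, and a mismatched piece is rejected before the program ever runs.

```haskell
-- A minimal sketch of the "jigsaw" analogy (names are illustrative only).
double :: Int -> Int           -- an Int-shaped hole
double n = 2 * n

shout :: String -> String      -- a String-shaped hole
shout s = s ++ "!"

-- double "hello"   -- does not compile: a String piece can't fill
--                  -- an Int-shaped hole

main :: IO ()
main = putStrLn (shout (show (double 21)))  -- prints "42!"
```

With 1x1 square pieces (one untyped "pointer to bytes"), the compiler has no way to notice that a piece landed in the wrong place.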

Paul Rubin

unread,
Dec 23, 2014, 3:39:43 PM12/23/14
to
mix <m...@test.net> writes:
> That's not exactly true. The amount of constructs is limited in Assembler
> same as in HLL. It's true that code will be longer in terms of bytes, but
> it's not mean you'll spend more time typing it.

I get the impression you've never used an HLL. For this purpose, an HLL
means something with type safety, garbage collection, and preferably
first class functions (like Python and Javascript but probably unlike
VB). Here is the notorious Quicksort example in Haskell:

quicksort [] = []
quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
  where
    lesser  = filter (< p) xs
    greater = filter (>= p) xs

Like any small program this doesn't rely much on types, but it relies on
GC to clean up intermediate results, and it uses a higher-order function
(filter) that takes more cognitive overhead in assembler. Someone
familiar with the algorithm could have written the above in about 2
minutes. I don't think even an expert could implement something
comparable in assembler in less than 10x that much time.

Andrew Haley

unread,
Dec 23, 2014, 4:04:16 PM12/23/14
to
Paul Rubin <no.e...@nospam.invalid> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>> I don't think that's the reason at all. There are plenty of other
>> languages without memory safety that are much easier to debug. Forth,
>> for one.
>
> Hmm. Forth style tends to avoid writing to memory (using the stack
> instead) which might help; and the interactive interpreter (test
> each word carefully before going to the next one) probably makes
> debugging and bug prevention easier. The stack conventions also
> create some informal type discipline, that could also be followed in
> assembler but is more stuff for the programmer to have to stay on
> top of. Especially for beginners, the complexity gets out of
> control.

Does it, indeed. The thing you've not mentioned, perhaps because
you've not tried it, is the ease of extending the compiler to inject
monitoring and tracing facilities. That really is the proverbial
magic bullet when it comes to finding weird bugs.

> I've never written any significant sized asm programs (just some
> small ones) and don't really have a picture in my mind of how the
> generation of programmers who wrote complex apps in asm did it.

I've written a great deal of asm, but never in an app with no HLL at
all. But I'm fairly sure I know how it's done: macros, lots of them,
and a rigorous set of interface conventions.

Andrew.

Andrew Haley

unread,
Dec 23, 2014, 4:16:10 PM12/23/14
to
Paul Rubin <no.e...@nospam.invalid> wrote:
> Here is the notorious Quicksort example in Haskell:

And it's a terrible implementation of quicksort.

> quicksort [] = []
> quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
>   where
>     lesser  = filter (< p) xs
>     greater = filter (>= p) xs
>
> Like any small program this doesn't rely much on types, but it relies on
> GC to clean up intermediate results, and it uses a higher-order function
> (filter) that takes more cognitive overhead in assembler. Someone
> familiar with the algorithm could have written the above in about 2
> minutes. I don't think even an expert could implement something
> comparable in assembler in less than 10x that much time.

Probably not, but it would be much better. It would be in-place so it
wouldn't generate any garbage, and it would make a more intelligent
choice of pivot, and it would only stack what was necessary, and it
would have the tight inner loop which is what really makes quicksort
fast. It's like comparing a little toy car that you push along with
your feet with something you'd drive to work.

Andrew.

mix

unread,
Dec 23, 2014, 5:08:01 PM12/23/14
to
I'm not sure you understand what you are trying to compare with what. In
your example, you're calling routines written by someone else for you. An
assembly programmer will also call routines, probably written by someone
else. Why should it take 10x more time to place data in memory and write a
call instruction?

--
mix x

Paul Rubin

unread,
Dec 23, 2014, 5:27:37 PM12/23/14
to
mix <m...@test.net> writes:
>> lesser = filter (< p) xs
>> greater = filter (>= p) xs

> I'm not sure if you understand what are you trying to compare with
> what. In your example, you're calling some routines written by someone
> else for you.

You mean the filter function? It happens there's also syntactic support
in the language to do that without calling a function:

    lesser  = [x | x <- xs, x < p]
    greater = [x | x <- xs, x >= p]

> Assembly programmer will also call some routines, probably written by
> someone else. Why should it take 10x more times to place data in
> memory and write call instruction?

What routine would you call (other than a sorting routine), that's
likely to be available in an asm subroutine library, that would let you
write a sorting routine in 2 minutes?

The example came from here, fwiw:

https://www.haskell.org/haskellwiki/Introduction

mix

unread,
Dec 23, 2014, 5:58:09 PM12/23/14
to
I guess you should google something like "asm sorting" or "asm sorting
subroutine".

"Part of the language" really makes no difference here. It's still a
clearly defined subroutine, not some kind of magic that the language will
do for you.

For some reason you think you are free to use years of other people's work,
but the Assembler programmer has to start from zero with nothing but
registers and empty memory. I'm sorry, that's not the case.

--
mix x

Paul Rubin

unread,
Dec 23, 2014, 6:38:05 PM12/23/14
to
mix <m...@test.net> writes:
> I guess you should google something like "asm sorting" or "asm sorting
> subroutine".

Of course there are sorting routines for just about any language. The
question is writing a sorting routine from available building blocks
instead of calling a sorting routine. The building blocks available in
a library will be shaped by the language itself, including if you count
the list comprehensions in that example as library calls. So what asm
libraries out there give building blocks that let you sort so easily?
How would you even design one? The same idea applies to stuff other
than sorting.

> For some reason you think you are free to use years of other people's work,
> but the Assembler programmer has to start from zero with nothing but
> registers and empty memory. I'm sorry, that's not the case.

Of course you can use existing work from assembler, such as subroutine
libraries that make things easier. But why stop there? Compilers also
make things easier.

Paul Rubin

unread,
Dec 23, 2014, 8:40:20 PM12/23/14
to
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
> And it's a terrible implementation of quicksort.

It's illustrative.

> it would have the tight inner loop which is what really makes
> quicksort fast.

It's the O(n log n) asymptotic speed (with non-pathological data) that
makes quicksort fast, compared with the quadratic algorithms (no matter
how optimized) one still finds in programs where the implementer didn't
know better or have a good library routine available.

This was a real issue in a crufty old C program that I used to work on.
It had what amounted to a quadratic-time selection sort, smeared all
through a subsystem with 1000's of LOC doing stuff unrelated to sorting,
that was causing performance problems in an embedded system. Factoring
out the sorting stuff to a routine with a straightforwardly coded,
reasonable algorithm fixed the issue much better than adding any amount
more micro-optimization to the quadratic version (which already had a
lot, at the cost of clarity).

> It's like comparing a little toy car that you push along with your
> feet with something you'd drive to work.

Sure, you have to start somewhere. FWIW, here's a merge sort that I
wrote in less than 10 minutes, that avoids all the quicksort problems
about pivot selection etc. Of particular note is the function

prop_msort xs = msort xs == sort xs

This function takes a list of ints, sorts it with both msort and the
library sort routine, and returns True iff the results are the same.
The type signature is inferred from the one on msort.

Then the "main" function calls quickCheck on prop_msort. quickCheck
generates a bunch (default 100) of test cases that are random lists of
ints, and calls prop_msort to make sure that msort correctly sorts all
these random lists. How does it know to generate random lists of ints
instead of, say, random characters? Again, it figures it out from the
type signature. So you get lightweight automatic generation of random
test suites, that often find stuff that manually written unit tests
miss.

I think it would be pretty hard to write something that general and
convenient for asm programs, because of QuickCheck's use of type
inference in figuring out what test values to generate. Although, there
is something like it for Erlang (dynamically typed) that I haven't yet
tried.

I cheated a little with the [Int]->[Int] type signature since the
more general signature would be

Ord a => [a] -> [a]

to sort lists of any type with an ordering relation. That works fine,
but to use QuickCheck I would have had to give it a bit more info about
how to generate arbitrary instances since the concrete type is unknown.
That it can do lists of ints (floats, chars, etc) though is pretty good.

================================================================

import Data.List
import Test.QuickCheck

msort :: [Int] -> [Int]
msort xs
    | n < 2     = xs
    | otherwise = merge (msort x1s) (msort x2s)
  where
    n = length xs
    (x1s,x2s) = splitAt (n `div` 2) xs
    merge xs ys = case (xs,ys) of
      ([], ys')     -> ys'
      (xs', [])     -> xs'
      (x:xs',y:ys') | x < y     -> x : merge xs' ys
                    | otherwise -> y : merge xs ys'

prop_msort xs = msort xs == sort xs

main = quickCheck prop_msort

Melzzzzz

unread,
Dec 23, 2014, 11:31:56 PM12/23/14
to
On Tue, 23 Dec 2014 15:16:09 -0600
Andrew Haley <andr...@littlepinkcloud.invalid> wrote:

> Paul Rubin <no.e...@nospam.invalid> wrote:
> > Here is the notorious Quicksort example in Haskell:
>
> And it's a terrible implementation of quicksort.

It's not that it is a terrible quicksort; it's the data structure
used (a singly linked list) that is terrible.

>
> > quicksort [] = []
> > quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
> >   where
> >     lesser  = filter (< p) xs
> >     greater = filter (>= p) xs
> >
> > Like any small program this doesn't rely much on types, but it
> > relies on GC to clean up intermediate results, and it uses a
> > higher-order function (filter) that takes more cognitive overhead
> > in assembler. Someone familiar with the algorithm could have
> > written the above in about 2 minutes. I don't think even an expert
> > could implement something comparable in assembler in less than 10x
> > that much time.
>
> Probably not, but it would be much better. It would be in-place so it
> wouldn't generate any garbage, and it would make a more intelligent
> choice of pivot, and it would only stack what was necessary, and it
> would have the tight inner loop which is what really makes quicksort
> fast. It's like comparing a little toy car that you push along with
> your feet with something you'd drive to work.

Try that on a singly linked list and the result would be even slower :p
The function is also generic, so a provided comparison function has to be
called every time a comparison is needed, which makes it much slower :p
In Haskell, several lists can share the same node, as nothing
is mutable, so the garbage generated would be minimal.

>
> Andrew.


Melzzzzz

unread,
Dec 23, 2014, 11:34:53 PM12/23/14
to
> n = length xs

O(n)

> (x1s,x2s) = splitAt (n`div`2) xs

O(n)

> merge xs ys = case (xs,ys) of
>   ([], ys')     -> ys'
>   (xs', [])     -> xs'
>   (x:xs',y:ys') | x < y -> x : merge xs' ys
>                 | otherwise -> y : merge xs ys'
>
> prop_msort xs = msort xs == sort xs
>
> main = quickCheck prop_msort

Anyway, I really don't know why Haskell insists on using
linked lists, when it's actually a pretty useless data structure
for everything but toy examples.
E.g. a string as a linked list, pleeeaaazeee.

Paul Rubin

unread,
Dec 24, 2014, 12:12:13 AM12/24/14
to
Melzzzzz <m...@zzzzz.com> writes:
>> n = length xs
> O(n)
>> (x1s,x2s) = splitAt (n`div`2) xs
> O(n)

Yes, sorting is O(n log n) at best, so doing an O(n) operation is fine.
Note those steps are really O(n) per level of the recursion, since the
lists involved shrink by half at each level. Maybe those two steps could
be combined somehow as a micro-optimization, but it works fine as is.
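The two O(n) passes (length, then splitAt) can indeed be fused into one traversal; here is a sketch of that micro-optimization using the classic two-pointer walk (splitHalf is my name, not part of the posted code):

```haskell
-- Split a list into halves in a single traversal: the second "pointer"
-- advances two nodes per step, so it runs out when the first is halfway.
splitHalf :: [a] -> ([a], [a])
splitHalf xs = go xs xs
  where
    go (y:ys) (_:_:zs) = let (front, back) = go ys zs in (y : front, back)
    go ys     _        = ([], ys)

main :: IO ()
main = print (splitHalf [1..7 :: Int])  -- prints ([1,2,3],[4,5,6,7])
```

The odd element, if any, lands in the second half, which is all the merge step needs.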

> Anyway I really don't know why Haskell insists on using
> linked lists when , actually is pretty useless data structure
> for everything but toy examples.

That example sorts 100k ints in about 0.1 sec which is fine for all
kinds of purposes. A compiled Forth implementation of an in-place
algorithm could probably do better, but I think most people use
interpreted Forth which might not do as well.

> eg string as linked list, pleeeaaazeee.

Yes, better to use Data.Text for stuff with lots of strings.

Andrew Haley

unread,
Dec 24, 2014, 4:26:34 AM12/24/14
to
Paul Rubin <no.e...@nospam.invalid> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>> And it's a terrible implementation of quicksort.
>
> It's illustrative.
>
>> it would have the tight inner loop which is what really makes
>> quicksort fast.
>
> It's the O(n log n) asymptotic speed (with non-pathological data)
> that makes quicksort fast, compared with the quadratic algorithms
> (no matter how optmized) one still finds in programs where the
> implementer didn't know better or have a good library routine
> available.

There are other O(n log n) sorting algorithms: what makes quicksort fast
is the simplicity of the inner loop and the fact that it can be done
(almost) entirely in place. If you're sorting lists it makes more
sense to use some kind of merge sort. There is a real quicksort for
Haskell at
http://augustss.blogspot.co.uk/2007/08/quicksort-in-haskell-quicksort-is.html
Yes, I like that. It makes more sense than the notorious Haskell
quicksort, which isn't really quicksort at all.

Andrew.

Andrew Haley

unread,
Dec 24, 2014, 4:28:41 AM12/24/14
to
Melzzzzz <m...@zzzzz.com> wrote:
> On Tue, 23 Dec 2014 15:16:09 -0600
> Andrew Haley <andr...@littlepinkcloud.invalid> wrote:
>
>> Paul Rubin <no.e...@nospam.invalid> wrote:
>> > Here is the notorious Quicksort example in Haskell:
>>
>> And it's a terrible implementation of quicksort.
>
> It's not that it is terrible quicksort, it is data structure
> used (singly linked list) terrible.

Hmm. That's an implementation detail: even with arrays it would be
horrible.

Andrew.

Anton Ertl

unread,
Dec 24, 2014, 6:42:47 AM12/24/14
to
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>Paul Rubin <no.e...@nospam.invalid> wrote:
>> Here is the notorious Quicksort example in Haskell:
>
>And it's a terrible implementation of quicksort.
>
>> quicksort [] = []
>> quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
>>   where
>>     lesser  = filter (< p) xs
>>     greater = filter (>= p) xs
>>
>> Like any small program this doesn't rely much on types, but it relies on
>> GC to clean up intermediate results, and it uses a higher-order function
>> (filter) that takes more cognitive overhead in assembler. Someone
>> familiar with the algorithm could have written the above in about 2
>> minutes. I don't think even an expert could implement something
>> comparable in assembler in less than 10x that much time.
>
>Probably not, but it would be much better. It would be in-place so it
>wouldn't generate any garbage, and it would make a more intelligent
>choice of pivot

Paul Rubin's choice of pivot is not just unintelligent, it's about
as bad as possible: it brings out the worst-case (quadratic) behaviour
in the pre-sorted case, which tends to be a much more likely input
than any random sequence.
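The point is easy to check by counting: with the head of the list as pivot, a pre-sorted input leaves one of the two partitions empty at every level, so the naive list quicksort performs a quadratic number of comparisons. A quick instrumented sketch (comparisons is my own helper; it assumes the two filter passes each test every element of the tail):

```haskell
-- Count the element comparisons the naive list quicksort performs:
-- each recursion filters the tail twice (once for < p, once for >= p).
comparisons :: [Int] -> Int
comparisons [] = 0
comparisons (p:xs) =
    2 * length xs + comparisons lesser + comparisons greater
  where
    lesser  = filter (< p) xs
    greater = filter (>= p) xs

main :: IO ()
main = do
  print (comparisons [1..100])  -- pre-sorted: 9900 comparisons, ~n^2
  print (comparisons [1..200])  -- doubling n roughly quadruples the count
```

A median-of-three or randomized pivot avoids this degenerate case on sorted input.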

Anton Ertl

unread,
Dec 24, 2014, 7:47:56 AM12/24/14
to
Paul Rubin <no.e...@nospam.invalid> writes:
>That example sorts 100k ints in about 0.1 sec which is fine for all
>kinds of purposes. A compiled Forth implementation of an in-place
>algorithm could probably do better, but I think most people use
>interpreted Forth which might not do as well.

As it happens, <2013Aug1...@mips.complang.tuwien.ac.at> contains
data on that:

|Here they are (on a 3GHz Core 2 Duo E8400); the numbers are the user
|times in seconds.
|
|gforth-fast 64-bit
|m-a m r i i2 d
...
|3.22 3.11 3.16 3.12 3.14 3.53 100000 * 100
|vfxlin
|m-a m r i i2 d
...
|1.00 0.98 1.00 1.00 1.01 1.12 100000 * 100
...
|
|The line "100000 * 100" indicates that an array with 100000 elements
|was filled 100 times with pseudo-random numbers and sorted.

So gforth-fast takes 0.03s for 100k 64-bit ints, three times faster
than your Haskell program. I leave it to you whether you consider
gforth-fast to be an interpreted Forth. If not, what makes you think
that most people use an interpreted Forth?

Vfxlin takes 0.01s for 100k 32-bit ints, ten times faster than your
Haskell program.

Anton Ertl

unread,
Dec 24, 2014, 8:03:10 AM12/24/14
to
Melzzzzz <m...@zzzzz.com> writes:
>Anyway, I really don't know why Haskell insists on using
>linked lists, when it's actually a pretty useless data structure
>for everything but toy examples.

You can build a linked list step-by-step without side effects (which
you must not use in Haskell proper). With an array you have to build
it all at once. I guess building it all at once is pretty common in
Haskell, but can it be used for everything, e.g., the merge step in
mergesort? I don't think so.

In general, linked lists are a nice data structure when you don't know
the size in advance, and you don't need random access.

Paul Rubin

unread,
Dec 24, 2014, 12:15:32 PM12/24/14
to
an...@mips.complang.tuwien.ac.at (Anton Ertl) writes:
> You can build a linked list step-by-step without side effects (which
> you must not use in Haskell proper). With an array you have to build
> it all at once. I guess building it all at once is pretty common in
> Haskell, but can it be used for everything, e.g., the merge step in
> mergesort? I don't think so.

Not sure what you mean about the merge step--do you mean could it have
used an array? Haskell has mutable arrays, so you can write the same
algorithms on them as with any other language. The main difference is
that the mutation code is of a special type (in the state transformer or
ST monad), so the type system prevents any code other than the mutation
action from modifying the array contents.

The merge step in the Haskell code I posted is very interesting and I
don't have it completely visualized. But basically because of lazy
evaluation, you don't actually generate complete intermediate lists.
Instead, all the nested merge steps happen simultaneously
coroutine-style, so as you read elements off the sorted list one by one,
each of the nested merges makes a little more progress.

Paul Rubin

unread,
Dec 24, 2014, 12:40:57 PM12/24/14
to
an...@mips.complang.tuwien.ac.at (Anton Ertl) writes:
> So gforth-fast takes 0.03s for 100k 64-bit ints, three times faster
> than your Haskell program.

Nice. That's actually quite a bit faster than Python sorting 100k
pseudo-random floats, and that's in a Python array rather than a linked
list, using Python's library sorting routine which is written in
optimized C, but which has to be type-generic with runtime dispatch for
comparisons.

Can I ask what the coding time was for the Forth version? The debugging
time? The amount of time spent testing before you felt confident that
the code worked? Thanks.

mix

unread,
Dec 24, 2014, 9:37:09 PM12/24/14
to
Paul Rubin <no.e...@nospam.invalid> wrote:
> mix <m...@test.net> writes:
>> I guess you should google something like "asm sorting" or "asm sorting
>> subroutine".
>
> Of course there are sorting routines for just about any language. The
> question is writing a sorting routine from available building blocks
> instead of calling a sorting routine. The building blocks available in
> a library will be shaped by the language itself, including if you count
> the list comprehensions in that example as library calls.

I assume you never had to do any actual programming using Assembler,
right?

> So what asm
> libraries out there give building blocks that let you sort so easily?
> How would you even design one? The same idea applies to stuff other
> than sorting.

You design one any way you want, any way that eventually lets you meet your
goals.

>> For some reason you think you are free to use years of other people's work,
>> but the Assembler programmer has to start from zero with nothing but
>> registers and empty memory. I'm sorry, that's not the case.
>
> Of course you can use existing work from assembler, such as subroutine
> libraries that make things easier. But why stop there? Compilers also
> make things easier.

Even easier would be to pay someone to write the code for you.

If you remember the beginning of the conversation, I proposed teaching
students how to write an HLL (Forth) in Assembler.

"Easier" is a matter of habit and knowledge. If you know Haskell and know
no Assembler, of course it will be easier for you to write "Hello World"
in Haskell. There are many people out there who have had to use Assembler
for big projects but have no knowledge of Haskell, and they will probably
choose Assembler for the same task.

When you are planning to write a program, you do a little research first.
You already know what you are planning to write; you know the algorithms
you're going to use. There is no "easier" there, it's just different.

--
mix x

Paul Rubin

unread,
Dec 24, 2014, 10:16:24 PM12/24/14
to
mix <m...@test.net> writes:
> I assume you never have to do any actual programming using Assembler,
> right?

Never any large apps. Some small stuff.


>> Compilers also make things easier.
> Even easier would be to pay someone so he would write the code for you.

But (some) compilers are free, so if they make things easier than
assembler...

> If you remember the beginning of the conversation, I proposed to teach
> students how to write HLL (Forth) on Assembler.

That's a bit much for a beginner, and anyway I don't think of Forth as
an HLL, though it's higher level than assembler.

> There are many people out there who had to use Assembler for big
> projects, but have no knowledge of Haskell, and they will probably
> choose Assembler for the same task.

Has anyone done a big project in assembler in the current century for
reasons other than amusement? And what's "big"? These days I'd say a
big project starts around 100 KLOC. 20 years ago "big" might have meant
10 KLOC.

There have hardly even been any big C projects done since the 1900's.
We had that discussion here a while back. Criterion for "done in the
current century" is "the first release of the program was on or after
January 1, 2000". I give you the extra year instead of the technical
definition of the century as starting in 2001, and possibly extra years
for development that happened before the first release. But, 1990's and
earlier programs (anything first released before 2000) don't count even
if they're still being maintained. Using assembler as a small bootstrap
layer for an HLL implementation doesn't count as a big assembler project
either.

> When you are planning to write a program, you do a little research first.
> You already know what are you planning to write, you know the algorithms
> you're going to use. There is no "easier" there, it's just different.

OK, here's the first and easiest Euler problem:

https://projecteuler.net/problem=1

How about starting a timer, coding a solution in assembler, and posting
how long it took. It's a very easy problem, just above "hello world".
Takes under 1 minute in Haskell or Python, about 5 minutes in Forth. A
real Forther could do it in Forth quicker than me, of course.

You might try a few of the other Euler problems too, or (less math
oriented) some of the less Ruby-specific rubyquiz.com problems.
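For comparison, the Haskell version alluded to above is essentially a one-liner (my sketch, not code from the thread):

```haskell
-- Project Euler problem 1: sum of the multiples of 3 or 5 below 1000.
euler1 :: Int
euler1 = sum [x | x <- [1..999], x `mod` 3 == 0 || x `mod` 5 == 0]

main :: IO ()
main = print euler1  -- prints 233168
```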

Paul Rubin

unread,
Dec 24, 2014, 10:28:28 PM12/24/14
to
Andrew Haley <andr...@littlepinkcloud.invalid> writes:
> Does it, indeed. The thing you've not mentioned, perhaps because
> you've not tried it, is the ease of extending the compiler to inject
> monitoring and tracing facilities. That really is the proverbial
> magic bullet when it comes to finding weird bugs.

That does sound useful and I'd be interested in hearing examples. It
sounds limited though, unless you carefully write in a style where all
the memory operations are funnelled through a few words that are aware
of the program data structures.

By comparison, with today's dynamic scripting languages, there's not
much debugging per se. The program crashes and there's a diagnostic
dump that tells you immediately what went wrong. So at least for
throwaway scripting, you can fling code at the screen, then test and fix
it until you get the desired output, without spending much time on design.

> I've written a great deal of asm, but never in an app with no HLL at
> all. But I'm fairly sure I know how it's done: macros, lots of them,
> and a rigorous set of interface conventions.

I could see a significant efficiency loss from that. I'm imagining
those macros saving and restoring registers much more than necessary,
etc.

mix

unread,
Dec 25, 2014, 1:30:36 AM12/25/14
to
Paul Rubin <no.e...@nospam.invalid> wrote:
> mix <m...@test.net> writes:
>> I assume you never had to do any actual programming using Assembler,
>> right?
>
> Never any large apps. Some small stuff.
>
>
>>> Compilers also make things easier.
>> Even easier would be to pay someone so he would write the code for you.
>
> But (some) compilers are free, so if they make things easier than
> assembler...

Easier for you, don't forget that.

>> If you remember the beginning of the conversation, I proposed teaching
>> students how to write an HLL (Forth) in Assembler.
>
> That's a bit much for a beginner, and anyway I don't think of Forth as
> an HLL, though it's higher level than assembler.

You didn't prove your point on that.

>> There are many people out there who had to use Assembler for big
>> projects, but have no knowledge of Haskell, and they will probably
>> choose Assembler for the same task.
>
> Has anyone done a big project in assembler in the current century for
> reasons other than amusement? And what's "big"? These days I'd say a
> big project starts around 100 KLOC. 20 years ago "big" might have meant
> 10 KLOC.

What if I give you a link to a project which is bigger than anything
you've written before, but still written using Assembler as the primary
programming language? Will you do some bullshit like trying to show it's
not what you meant, or will you agree with my point?

> There have hardly even been any big C projects done since the 1900's.
> We had that discussion here a while back. Criterion for "done in the
> current century" is "the first release of the program was on or after
> January 1, 2000". I give you the extra year instead of the technical
> definition of the century as starting in 2001, and possibly extra years
> for development that happened before the first release. But, 1990's and
> earlier programs (anything first released before 2000) don't count even
> if they're still being maintained. Using assembler as a small bootstrap
> layer for an HLL implementation doesn't count as a big assembler project
> either.
>
>> When you are planning to write a program, you do a little research first.
>> You already know what are you planning to write, you know the algorithms
>> you're going to use. There is no "easier" there, it's just different.
>
> OK, here's the first and easiest Euler problem:
>
> https://projecteuler.net/problem=1
>
> How about starting a timer, coding a solution in assembler, and posting
> how long it took. It's a very easy problem, just above "hello world".
> Takes under 1 minute in Haskell or Python, about 5 minutes in Forth. A
> real Forther could do it in Forth quicker than me, of course.
>
> You might try a few of the other Euler problems too, or (less math
> oriented) some of the less Ruby-specific rubyquiz.com problems.


--
mix x

Paul Rubin

unread,
Dec 25, 2014, 1:49:36 AM12/25/14
to
mix <m...@test.net> writes:
> What if I give you a link to a project which is bigger than anything
> you've written before, but still written using Assembler as the primary
> programming language? Will you do some bullshit like trying to show it's
> not what you meant, or will you agree with my point?

Sure, I'll be happy to look at the link. Once the link is up, we can
discuss the specifics.

I'd like to see a project that is:

1) 10 KLOC or more of assembly code (I won't hold out for 100KLOC)
2) Written in the 21st century,
3) an actual application whose goal is the end result rather than
as an exercise in asm programming,
4) Written in assembler as a choice among reasonable alternatives.

I can imagine some scenarios (special purpose processor etc) where asm
is the only possible choice so you have to bite the bullet and use it.
So I'm looking for an example of where someone used assembler by
preference and got a sane technical outcome.

The 64-bit Picolisp implementation might qualify (picolisp.de) since it
was originally written in C, then rewritten in asm to get around some C
limitations, and this might have happened after 2000. The assembler
part of it (actually appears to be a Lisp-based macroassembler) is
around 30 KLOC. There was also a Scheme interpreter in ARM assembler
that I've been wanting to look at.

But, the reasons for writing those interpreters in asm was to implement
stuff like coroutines and continuations that are difficult in C.
I'm not sure why they didn't use C with some low level asm support though.

Tom O'Donnell

unread,
Dec 25, 2014, 2:08:15 AM12/25/14
to
On 2014-12-25, Paul Rubin <no.e...@nospam.invalid> wrote:
> mix <m...@test.net> writes:
>> I assume you never have to do any actual programming using Assembler,
>> right?
>
> Never any large apps. Some small stuff.
>
>
>>> Compilers also make things easier.

But they're not always suitable for certain types of programming.

>> Even easier would be to pay someone so he would write the code for you.
>
> But (some) compilers are free, so if they make things easier than
> assembler...
>
>> If you remember the beginning of the conversation, I proposed to teach
>> students how to write HLL (Forth) on Assembler.

Incidentally, I'm not sure what the above sentence means, but my point in
replying was to answer the question below anyway.

>> There are many people out there who had to use Assembler for big
>> projects, but have no knowledge of Haskell, and they will probably
>> choose Assembler for the same task.

It is a good thing to know languages for the domains you're working in. That
may be one language or many, depending on what you do. For a hobby coder the
demands are different than someone who has a job doing it. But just because
you can apply a certain language or technique in your problem domain doesn't
mean you would automatically choose that language or technique when doing
something that isn't in the normal frame of reference.

> Has anyone done a big project in assembler in the current century for
> reasons other than amusement?

Oh yes.

Are you really unaware there are platforms where the primary language is
assembler? Products range from foundation code of about 100 KLOC up to
complete saleable products of 500 KLOC to 15 MLOC total, including
comments but not including macro expansion.

> And what's "big"? These days I'd say a big project starts around 100
> KLOC. 20 years ago "big" might have meant 10 KLOC.

I would say big starts around a couple million lines but I've worked on
software considerably larger than that.

> There have hardly even been any big C projects done since the 1900's.

Really? I don't know much about C but I would have thought in the last
hundred years or so there *must* have been some large projects ;-)

Tom O'Donnell

Tom O'Donnell

unread,
Dec 25, 2014, 2:15:37 AM12/25/14
to
On 2014-12-25, Paul Rubin <no.e...@nospam.invalid> wrote:
> mix <m...@test.net> writes:
>> What if I'll give you a link to the project which is bigger than anything
>> you wrote before, but still written using Assembler as primary programming
>> language? You'll do some bullshit like trying to show it's not what you
>> meant or you'll agree with my point?
>
> Sure, I'll be happy to look at the link. Once the link is up, we can
> discuss the specifics.
>
> I'd like to see a project that is:
>
> 1) 10 KLOC or more of assembly code (I won't hold out for 100KLOC)
> 2) Written in the 21st century,
> 3) an actual application whose goal is the end result rather than
> as an exercise in asm programming,

We do this every day for products that are currently sold and
supported. However the code is all proprietary and unless you work for one
of the vendors you won't ever see any of it.

If you ask for examples on the Hercules mailing lists somebody might be able
to point you to some code.

> 4) Written in assembler as a choice among reasonable alternatives.

For systems programming on z/OS there really is not any reasonable
alternative. For the past few years it has been technically possible to use
C to do that but in practice it is not. C/C++ can be used for several things
including most commonly UI but it is not viable as a systems programming
language on z.

Tom O'Donnell

> I can imagine some scenarios (special purpose processor etc) where asm
> is the only possible choice so you have to bite the bullet and use it.
> So I'm looking for an example of where someone used assembler by
> preference and got a sane technical outcome.

A lot depends on the tools, too. Most of the assemblers I have seen for
other systems haven't provided the functionality to work on large software
projects so it is probably self-limiting most of the time. I don't think
it's necessarily a problem of language.

Tom O'Donnell

Paul Rubin

unread,
Dec 25, 2014, 2:27:35 AM12/25/14
to
Tom O'Donnell <t...@nospam.com> writes:
>> Has anyone done a big project in assembler in the current century for
>> reasons other than amusement?
> Oh yes. Are you really unaware there are platforms where the primary
> language is assembler?

Can you name one from this century?

> Products range from foundation code running about 100 KLOC and a
> complete saleable product usually ranges from 500KLOC to 15MLOC total,

Hmm, I guess you're talking about IBM mainframes with their need for
legacy support. OK, good observation. But in this case maybe asm is
being used by necessity rather than choice.

Anyway, can you name one of those 100 KLOC - 15 MLOC products that was
written in this century? Products handed down from the last century
don't establish that asm is a sane choice for new products today.

>> There have hardly even been any big C projects done since the 1900's.
> Really? I don't know much about C but I would have thought in the last
> hundred years or so there *must* have been some large projects ;-)

1900's referred to the century (like the 1400's), not the decade.
I.e. the 1900's ended in 1999. By "this century" I mean 2000 and up.

Paul Rubin

unread,
Dec 25, 2014, 2:32:45 AM12/25/14
to
Tom O'Donnell <t...@nospam.com> writes:
>> 1) 10 KLOC or more of assembly code (I won't hold out for 100KLOC)
>> 2) Written in the 21st century,....
> We do this every day for products that are currently sold and
> supported. However the code is all proprietary

That's ok, I'll take your word for it without seeing the code. What I
want to know is whether any of those products was first released in 2000
or later (and repackaging/rebranding of older products or codebases
doesn't count for this purpose). Continuing to sell and support older
products is great, but this question is about new development.

WJ

unread,
Dec 25, 2014, 3:37:16 AM12/25/14
to
Paul Rubin wrote:

> I get the impression you've never used an HLL. For this purpose, an HLL
> means something with type safety, garbage collection, and preferably
> first class functions (like Python and Javascript but probably unlike
> VB). Here is the notorious Quicksort example in Haskell:
>
> quicksort [] = []
> quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
>   where
>     lesser  = filter (< p) xs
>     greater = filter (>= p) xs

Factor:

USING: sequences locals ;

:: quicksort ( seq -- seq )
seq empty?
[ { } ]
[ seq unclip :> hd ! Separate 1st el. from rest of sequence.
[ hd < ] partition [ quicksort ] bi@
hd prefix append ]
if ;

{ 9 8 7 0 2 88 3 4 22 5 1 6 } quicksort .
===>
{ 0 1 2 3 4 5 6 7 8 9 22 88 }

Andrew Haley

unread,
Dec 25, 2014, 3:43:32 AM12/25/14
to
Paul Rubin <no.e...@nospam.invalid> wrote:
> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>> Does it, indeed. The thing you've not mentioned, perhaps because
>> you've not tried it, is the ease of extending the compiler to inject
>> monitoring and tracing facilities. That really is the proverbial
>> magic bullet when it comes to finding weird bugs.
>
> That does sound useful and I'd be interested in hearing examples. It
> sounds limited though, unless you carefully write in a style where all
> the memory operations are funnelled through a few words that are aware
> of the program data structures.

All memory operations are already funnelled through a few words, so
that's easy: you just redefine @ and ! (etc.) to be aware of the
program data structures. There are lots of powerful variations on
that, such as redefining : .
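
[The funnelling trick described above can be sketched outside Forth. A hypothetical Python analogy — the names `store` and `fetch` stand in for ! and @ , and are this editor's invention, not code from the thread:]

```python
# Hypothetical analogy to redefining Forth's ! and @ : all memory
# traffic already goes through two words, so instrumenting those two
# functions instruments every access in the program.
memory = {}

def store(addr, value):
    # stands in for ! ; an injected check lives here
    if not isinstance(addr, int) or addr < 0:
        raise TypeError("store to bad address: %r" % (addr,))
    memory[addr] = value

def fetch(addr):
    # stands in for @ ; catches reads of uninitialized cells
    if addr not in memory:
        raise ValueError("fetch from uninitialized address %r" % (addr,))
    return memory[addr]

store(100, 42)
print(fetch(100))  # -> 42
```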

> By comparison, with today's dynamic scripting languages, there's not
> much debugging per se. The program crashes and there's a diagnostic
> dump that tells you immediately what went wrong. So at least for
> throwaway scripting, you can fling code at the screen, then test and fix
> it until you get the desired output, without spending much time on design.

Heh. Yeah right. :-)

I've never had much success with "inspect the entrails" as a way of
debugging a program. IME it's usually too late by then: you want to
catch the problem when things start to go wrong.

>> I've written a great deal of asm, but never in an app with no HLL at
>> all. But I'm fairly sure I know how it's done: macros, lots of them,
>> and a rigorous set of interface conventions.
>
> I could see a significant efficiency loss from that. I'm imagining
> those macros saving and restoring registers much more than necessary,
> etc.

Oh, absolutely, yes. But I suspect it's the only way to maintain your
sanity.

As an aside, here's a real gem:
http://oai.cwi.nl/oai/asset/4155/04155D.pdf

It's Dijkstra et al's Algol compiler, written in Electrologica X1
assembly language. (It's very low-level: maybe not even assembly
language, at least by modern standards.)

Andrew.

WJ

unread,
Dec 25, 2014, 3:52:36 AM12/25/14
to
Paul Rubin wrote:

> Can I ask what the coding time was for the Forth version? The debugging
> time? The amount of time spent testing before you felt confident that
> the code worked? Thanks.

You can ask, but he doesn't want to answer honestly.

Disciples of ANS Forth have no desire to increase programmer
productivity.

They want to keep Forth hobbled; they want it to be suitable
only for programming embedded controllers.

Some of them are so mentally limited that they cannot grasp
high-level languages.

They are like the caveman who was disturbed and agitated when
he first saw a round wheel. It puzzled him so that he declared
that only the decadent and wicked would use it and that he
would stick with the old reliable square wheels.

mix

unread,
Dec 25, 2014, 4:06:49 AM12/25/14
to
Paul Rubin <no.e...@nospam.invalid> wrote:
> mix <m...@test.net> writes:
>> What if I'll give you a link to the project which is bigger than anything
>> you wrote before, but still written using Assembler as primary programming
>> language? You'll do some bullshit like trying to show it's not what you
>> meant or you'll agree with my point?
>
> Sure, I'll be happy to look at the link. Once the link is up, we can
> discuss the specifics.
>
> I'd like to see a project that is:

I'm sorry, my point is not to make you happy. Moreover, the only reason
you're arguing is that you would be unhappy to admit that you were wrong
and your opponent is right.

I'm just going to prove what I've said:

http://en.m.wikipedia.org/wiki/KolibriOS
http://en.m.wikipedia.org/wiki/MenuetOS

Those are just the best-known ones, because they are open-source projects.
Are they big enough for you?

> 1) 10 KLOC or more of assembly code (I won't hold out for 100KLOC)
> 2) Written in the 21st century,
> 3) an actual application whose goal is the end result rather than
> as an exercise in asm programming,
> 4) Written in assembler as a choice among reasonable alternatives.
>
> I can imagine some scenarios (special purpose processor etc) where asm
> is the only possible choice so you have to bite the bullet and use it.
> So I'm looking for an example of where someone used assembler by
> preference and got a sane technical outcome.
>
> The 64-bit Picolisp implementation might qualify (picolisp.de) since it
> was originally written in C, then rewritten in asm to get around some C
> limitations, and this might have happened after 2000. The assembler
> part of it (actually appears to be a Lisp-based macroassembler) is
> around 30 KLOC. There was also a Scheme interpreter in ARM assembler
> that I've been wanting to look at.
>
> But, the reasons for writing those interpreters in asm was to implement
> stuff like coroutines and continuations that are difficult in C.
> I'm not sure why they didn't use C with some low level asm support though.

Because there is no difference between them if you are experienced in both.

--
mix x

mix

unread,
Dec 25, 2014, 4:18:22 AM12/25/14
to
Lol, you're a poor troll, let me pet you.

--
mix x

Tom O'Donnell

unread,
Dec 25, 2014, 5:34:48 AM12/25/14
to
On 2014-12-25, Paul Rubin <no.e...@nospam.invalid> wrote:
> Tom O'Donnell <t...@nospam.com> writes:
>>> Has anyone done a big project in assembler in the current century for
>>> reasons other than amusement?
>> Oh yes. Are you really unaware there are platforms where the primary
>> language is assembler?
>
> Can you name one from this century?

IBM's z/OS. It is the descendant of OS/360 from 1964 and it is still making
us all money 50 some odd years later.

>> Products range from foundation code running about 100 KLOC and a
>> complete saleable product usually ranges from 500KLOC to 15MLOC total,
>
> Hmm, I guess you're talking about IBM mainframes with their need for
> legacy support. OK, good observation. But in this case maybe asm is
> being used by necessity rather than choice.

Yes for systems software in that environment assembler is certainly a
necessity. But we're perfectly satisfied with that. The OS is designed to
support assembler and the tools are designed to support large software. And
it is not only "legacy" support but all new systems software.

Most of the legacy support for IBM is in COBOL. "Trillions served".

> Anyway, can you name one of those 100 KLOC - 15 MLOC products that was
> written in this century? Products handed down from the last century
> don't establish that asm is a sane choice for new products today.

Look around for product announcements. That will be a superset of what is
totally new because there is some rebadging but it will give you a rough
idea. I know of several completely new products that came out in the past
ten years. I'm not going to name anything I've written or worked on here.
Development is expensive and time to market is fairly long. We don't put out
a new product every year. But we do put them out. We also add features to
products consisting of completely new code. Those new features keep clients
who pay for support and they help generate new sales. It's hard to say what
an average would be for that but I would think 50 to 100KLOC for a new
feature that sells as an add-on would be a reasonable median value.

Tom


Tom O'Donnell

unread,
Dec 25, 2014, 5:39:52 AM12/25/14
to
On 2014-12-25, Paul Rubin <no.e...@nospam.invalid> wrote:
I can think of at least 5 new products I'm aware of either having worked on
them or know the developers since 2000. I could probably name 10 products
that had significant 100KLOC+ new features (new code) added since 2000.

And I could probably count 50 products written between 1985 and 2000. That
is many lines of code, and selling at an average of $100,000 to $500,000 per
copy (and sometimes quite a bit more) it is a significant business. It is
99% assembler and is being refreshed more or less continuously to keep
current with new OS and program product releases and features, etc. Certainly there is some dust
gathering on parts that don't need to be changed but there is also plenty of
activity at times on parts that do.

Tom O'Donnell

Tom O'Donnell

unread,
Dec 25, 2014, 5:56:24 AM12/25/14
to
On 2014-12-25, Tom O'Donnell <t...@nospam.com> wrote:
> On 2014-12-25, Paul Rubin <no.e...@nospam.invalid> wrote:
>> Tom O'Donnell <t...@nospam.com> writes:
>>>> 1) 10 KLOC or more of assembly code (I won't hold out for 100KLOC)
>>>> 2) Written in the 21st century,....
>>> We do this every day for products that are currently sold and
>>> supported. However the code is all proprietary
>>
>> That's ok, I'll take your word for it without seeing the code. What I
>> want to know is whether any of those products was first released in 2000
>> or later (and repackaging/rebranding of older products or codebases
>> doesn't count for this purpose). Continuing to sell and support older
>> products is great, but this question is about new development.

A point I forgot to mention is in many cases we are maintaining our own
code. It's not as if our grandfathers wrote something and we're working on
legacy code they handed down. Product lifetimes are fairly long. We write
new stuff occasionally and we also support products we wrote or worked on in
early phases.

Tom

Anton Ertl

unread,
Dec 25, 2014, 5:58:27 AM12/25/14
to
Paul Rubin <no.e...@nospam.invalid> writes:
>I'm not sure why they didn't use C with some low level asm support though.

I wouldn't start a new project in C today. The quality of the C
compiler maintainers has degraded too much. Assembly language does
not have that problem.

Anton Ertl

unread,
Dec 25, 2014, 6:06:00 AM12/25/14
to
Tom O'Donnell <t...@nospam.com> writes:
>Are you really unaware there are platforms where the primary language is
>assembler? Products range from foundation code running about 100 KLOC and a
>complete saleable product usually ranges from 500KLOC to 15MLOC total,
>including comments but not including macro expansion.

I am aware of platforms where the primary language is assembly
language: small embedded systems. I am not aware of any platforms for
new projects where one would write 500KLOC-15MLOC in assembly
language. Could you elaborate on that?

Anton Ertl

unread,
Dec 25, 2014, 6:41:31 AM12/25/14
to
Paul Rubin <no.e...@nospam.invalid> writes:
>an...@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>> So gforth-fast takes 0.03s for 100k 64-bit ints, three times faster
>> than your Haskell program.
>
>Nice. That's actually quite a bit faster than Python sorting 100k
>pseudo-random floats, and that's in a Python array rather than a linked
>list, using Python's library sorting routine which is written in
>optimized C, but which has to be type-generic with runtime dispatch for
>comparisons.

Having looked at CPython's bytecode interpreter, I am not surprised;
and the type dispatch is not necessarily the most expensive part of
that. The bytecode interpreter goes through a large amount of code
for each bytecode, checking for various conditions; it may skip a
large part of these 50-100 lines, but the amount of checking alone is
appalling to someone interested in efficient interpreters like me.

One might see how much the type dispatch may cost by changing the
Forth version to use boxed values and some form of type dispatch
(maybe from mini-oof.fs, although that does not contain support for
checking the type of the other argument). The source code is
<http://www.complang.tuwien.ac.at/forth/programs/sort.fs>.

>Can I ask what the coding time was for the Forth version? The debugging
>time? The amount of time spent testing before you felt confident that
>the code worked? Thanks.

That was 17 months ago, and I don't remember the time needed. I don't
remember if there was a bug; if there was a hard one, I would remember
it:-). For testing, I used pseudo-random inputs and ensured that there
were equal integers among the tests.

One thing I remember (and actually wrote down in
<2013Jul2...@mips.complang.tuwien.ac.at>) is that I originally
thought that one could be clever on partitioning by having the interface

partition1 ( a1 a3 -- a1 a2 a3 )

for that; but that was a bad idea, because it would have produced the
quadratic behaviour for the all-equals inputs (like the Haskell
version), whereas

partition ( al ah -- a1l a1h a2l a2h )

produces the O(n) behaviour for this case.
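
[The difference between the two partition interfaces can be illustrated with a hypothetical Python sketch of three-way partitioning — this is this editor's illustration, not Anton's Forth code from sort.fs. Keeping the pivot-equal elements in their own group is what gives O(n) behaviour per level on all-equal inputs:]

```python
def quicksort3(xs):
    """Quicksort with three-way ("fat") partitioning.

    Elements equal to the pivot form their own group and are never
    recursed on, so an all-equal input costs O(n) per level rather
    than degenerating quadratically."""
    if len(xs) <= 1:
        return list(xs)
    p = xs[len(xs) // 2]
    less    = [x for x in xs if x < p]
    equal   = [x for x in xs if x == p]
    greater = [x for x in xs if x > p]
    return quicksort3(less) + equal + quicksort3(greater)
```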

Tom O'Donnell

unread,
Dec 25, 2014, 7:01:05 AM12/25/14
to
On 2014-12-25, Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> Tom O'Donnell <t...@nospam.com> writes:
>>Are you really unaware there are platforms where the primary language is
>>assembler? Products range from foundation code running about 100 KLOC and a
>>complete saleable product usually ranges from 500KLOC to 15MLOC total,
>>including comments but not including macro expansion.
>
> I am aware of platforms where the primary language is assembly
> language: small embedded systems. I am not aware of any platforms for
> new projects where one would write 500KLOC-15MLOC in assembly
> language. Could you elaborate on that?

IBM mainframe hardware and the OS are designed around an assembler interface. The
large majority of system services are simply not available to C or HLL
programs. Over 95% of all vendor software is and has always been written in
assembler. 10 or 15 years ago that number would have been closer to 99%. The
parts that aren't are UI and some algorithmic stuff. For user interface REXX
or a scripting language called CLIST and C/C++ are often used. Algorithmic
code is sometimes written in C or C++ but that is usually a very small part
of a systems software product and their use even for those parts is
relatively new.

The IBM OS and architecture has been evolving since the early 1960s and it's
been thoughtfully and carefully upgraded and refined in all that time so
customers don't lose their investment in vendor software or development
which are very expensive. Backwards compatibility is extremely important:
for example, application object code and load modules from the 1960s will
still run on the newest hardware and OS. Customers stay with the platform
because of throughput, security, RAS, and software stability & compatibility.

Vendor product lifetimes probably average 20 or more years and 30 year old
and older products are not uncommon.

Tom

Anton Ertl

unread,
Dec 25, 2014, 7:06:51 AM12/25/14
to
Paul Rubin <no.e...@nospam.invalid> writes:
>By comparison, with today's dynamic scripting languages, there's not
>much debugging per se. The program crashes and there's a diagnostic
>dump that tells you immediately what went wrong.

Not in my experience. A typical experience is that I use a complex
regexp in awk, and it does not match what I want it to match (and I
doubt that it's any better with other languages). So I have to try
out simpler forms of that regexp to see what's wrong. Or I do some
complex shell pipe, and I get no output, or the wrong output. Then I
leave away parts to see where it went wrong. No diagnostic tells me
what went wrong.

>So at least for
>throwaway scripting, you can fling code at the screen, then test and fix
>it until you get the desired output, without spending much time on design.

The time needed for design depends mainly on the complexity of the
task, on whether you have experience with a similar task (so you can
reuse the design), and on whether there are different designs for the task
that appear similarly viable at first sight; the latter may have to do
with the programming language, and you might do less investigation on
that in a throw-away program.

Anyway, I don't really see a connection between the time needed for
design and the way you debug programs; sure, you also consider testing
and debugging in design, but having certain errors reported by the
language does not have more influence than other language features.

Anton Ertl

unread,
Dec 25, 2014, 7:20:54 AM12/25/14
to
Paul Rubin <no.e...@nospam.invalid> writes:
>an...@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>> You can build a linked list step-by-step without side effects (which
>> you must not use in Haskell proper). With an array you have to build
>> it all at once. I guess building it all at once is pretty common in
>> Haskell, but can it be used for everything, e.g., the merge step in
>> mergesort? I don't think so.
>
>Not sure what you mean about the merge step--do you mean could it have
>used an array? Haskell has mutable arrays, so you can write the same
>algorithms on them as with any other language. The main difference is
>that the mutation code is of a special type (in the state transformer or
>ST monad), so the type system prevents any code other than the mutation
>action from modifying the array contents.

And yet you chose to use a linked list. I don't really program in
Haskell, but it seems to me that the mutation stuff is a second-class
citizen and cannot be used as flexibly as the side-effect-free stuff.

Or maybe that was just in the beginning, and that established a
preference towards linked lists that still lives on despite the base
now being equal?

>The merge step in the Haskell code I posted is very interesting and I
>don't have it completely visualized. But basically because of lazy
>evaluation, you don't actually generate complete intermediate lists.
>Instead, all the nested merge steps happen simultaneously
>coroutine-style, so as you read elements off the sorted list one by one,
>each of the nested merges makes a little more progress.
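
[The coroutine-style merging described above can be imitated with Python generators — a hypothetical sketch by this editor, not the Haskell runtime's actual mechanism. Each nested merge only advances when an element is demanded from it:]

```python
_END = object()  # sentinel, so None could appear in the data

def merge(a, b):
    # Lazily merge two sorted streams; nothing is computed until a
    # consumer asks for the next element.
    a, b = iter(a), iter(b)
    x, y = next(a, _END), next(b, _END)
    while x is not _END and y is not _END:
        if x <= y:
            yield x
            x = next(a, _END)
        else:
            yield y
            y = next(b, _END)
    while x is not _END:
        yield x
        x = next(a, _END)
    while y is not _END:
        yield y
        y = next(b, _END)

def mergesort(xs):
    # Returns an iterator: the nested merges each make "a little more
    # progress" every time one element is pulled off the front.
    if len(xs) <= 1:
        return iter(list(xs))
    mid = len(xs) // 2
    return merge(mergesort(xs[:mid]), mergesort(xs[mid:]))
```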

Is the result faster than the eager version in, say, Ocaml (or
Haskell, if you can force it to be eager)?

Anyway, my point was about the way it is expressed, and it's expressed
using linked lists. I can imagine a list->array conversion word, and
an implementation that is smart enough about that to turn a list-based
merge-sort or quicksort with that conversion applied at the
appropriate places into an array-based version of the sort, without
needing to resort to monads. You would still express the merging step
with linked lists, because that allows you to express every element
construction separately.

Anton Ertl

unread,
Dec 25, 2014, 11:05:33 AM12/25/14
to
Tom O'Donnell <t...@nospam.com> writes:
>On 2014-12-25, Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>> Tom O'Donnell <t...@nospam.com> writes:
>>>Are you really unaware there are platforms where the primary language is
>>>assembler? Products range from foundation code running about 100 KLOC and a
>>>complete saleable product usually ranges from 500KLOC to 15MLOC total,
>>>including comments but not including macro expansion.
>>
>> I am aware of platforms where the primary language is assembly
>> language: small embedded systems. I am not aware of any platforms for
>> new projects where one would write 500KLOC-15MLOC in assembly
>> language. Could you elaborate on that?
>
>IBM mainframe hardware and OS is designed around an assembler interface. The
>large majority of system services are simply not available to C or HLL
>programs. Over 95% of all vendor software is and has always been written in
>assembler. 10 or 15 years ago that number would have been closer to 99%.

Interesting. Up to now my impression was that these mainframes are
programmed in Cobol, Fortran, and PL/I, with assembly language in
selected places.

Tom O'Donnell

unread,
Dec 25, 2014, 12:20:09 PM12/25/14
to
On 2014-12-25, Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
> Tom O'Donnell <t...@nospam.com> writes:
>>On 2014-12-25, Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>>> Tom O'Donnell <t...@nospam.com> writes:
>>>>Are you really unaware there are platforms where the primary language is
>>>>assembler? Products range from foundation code running about 100 KLOC and a
>>>>complete saleable product usually ranges from 500KLOC to 15MLOC total,
>>>>including comments but not including macro expansion.
>>>
>>> I am aware of platforms where the primary language is assembly
>>> language: small embedded systems. I am not aware of any platforms for
>>> new projects where one would write 500KLOC-15MLOC in assembly
>>> language. Could you elaborate on that?
>>
>>IBM mainframe hardware and OS is designed around an assembler interface. The
>>large majority of system services are simply not available to C or HLL
>>programs. Over 95% of all vendor software is and has always been written in
>>assembler. 10 or 15 years ago that number would have been closer to 99%.
>
> Interesting. Up to now my impression was that these mainframes are
> programmed in Cobol, Fortran, and PL/I, with assembly language in
> selected places.

It matters whether we're discussing application programming or systems
programming. There are fairly clear lines and not much overlap.

COBOL is by far the most popular application programming language but Java
and C/C++ have started to make a dent especially in web-enabled apps and UI
respectively. PL/I is a great language but is neither fish nor fowl so it
has a tough time. It has always been there but is still a tiny percentage of
all mainframe code. It was probably too little too late. FORTRAN has a niche
and there are big old systems still running but I don't know of any new
mainframe FORTRAN written in many years. The compiler hasn't been updated
since F77 although it has extensions. It's just easier to code Fortran on
UNIX or Windows with a new compiler and if you need supercomputing a
mainframe isn't an appropriate platform.

I guess COBOL accounts for 99% of all application code written on the
mainframe since the dawn of time. I have come across and worked on a fair
amount of application code in assembler but there aren't many good reasons
to have used assembler for applications and it's usually not pretty.

Tom


Paul Rubin

unread,
Dec 25, 2014, 12:39:16 PM12/25/14
to
mix <m...@test.net> writes:
> http://en.m.wikipedia.org/wiki/KolibriOS
> http://en.m.wikipedia.org/wiki/MenuetOS
>
> Those are just most known ones, because they are free source projects. Are
> they big enough for you?

They are nontrivial so I'll accept them without being finicky about the
size requirement. According to those articles, KolibriOS is apparently
a fork of MenuetOS so I won't count it as a separate program. MenuetOS
was first released on May 16, 2000, so it meets the 21st century first
release criterion, though just barely. They do seem to be examples of
writing in asm for its own sake, but as the saying goes, coding in
assembly language is good for the soul. Thanks!

>> I'm not sure why they didn't use C with some low level asm support though.
> Because there is no difference between them if you are experienced in both.

One difference is you can run the C program on multiple cpu
architectures without much change. I've actually seen several programs
(apparently including Picolisp64) written in a sort of abstracted
assembly language, that's mapped onto a hardware instruction set using
macros.

Paul Rubin

unread,
Dec 25, 2014, 12:47:12 PM12/25/14
to
Tom O'Donnell <t...@nospam.com> writes:
>> Can you name one from this century?
> IBM's z/OS. It is the descendant of OS/360 from 1964 and it is still making
> us all money 50 some odd years later.

Right, but that's what I'd call continued maintenance and development of
a 20th century product.

>> Anyway, can you name one of those 100 KLOC - 15 MLOC products

> I know of several completely new products that came out in the past
> ten years. I'm not going to name anything I've written or worked on
> here.

No prob. The confirmation that new stuff is being done suffices.

> Development is expensive and time to market is fairly long... It's
> hard to say what an average would be for that but I would think 50 to
> 100KLOC for a new feature that sells as an add-on would be a
> reasonable median value.

Fair enough. What kind of features? Just system level (add support
for some new hardware device), or also at the application level?

If application level, how would you compare the development cost and
time, to what it would take for a Java shop to implement comparable
functionality?

Bernd Paysan

Dec 25, 2014, 1:36:53 PM
to
Paul Rubin wrote:

> mix <m...@test.net> writes:
>> That's not exactly true. The amount of constructs is limited in Assembler
>> same as in HLL. It's true that code will be longer in terms of bytes, but
>> it doesn't mean you'll spend more time typing it.
>
> I get the impression you've never used an HLL. For this purpose, an HLL
> means something with type safety, garbage collection, and preferably
> first class functions (like Python and Javascript but probably unlike
> VB). Here is the notorious Quicksort example in Haskell:
>
> quicksort [] = []
> quicksort (p:xs) = (quicksort lesser) ++ [p] ++ (quicksort greater)
> where
> lesser = filter (< p) xs
> greater = filter (>= p) xs
>
> Like any small program this doesn't rely much on types, but it relies on
> GC to clean up intermediate results,

Ouch. This is actually not Quicksort, as Hoare's Quicksort is an in-place
array sort, and has the advantage (compared to other O(n log n) sorts) that
you don't have intermediate results to clean up later.
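For contrast, an in-place array Quicksort along Hoare's lines can be sketched in Python (a minimal illustration written for this point, not code from any post; the middle-element pivot sidesteps the easy first-element pathology):

```python
# Hoare-style in-place Quicksort: partitioning swaps elements within
# the array itself, so there are no intermediate lists to clean up.
def quicksort(a, lo=0, hi=None):
    if hi is None:
        hi = len(a) - 1
    if lo >= hi:
        return
    # Middle-element pivot avoids the trivially constructed worst case
    # (already-sorted input) that a first-element pivot suffers from.
    pivot = a[(lo + hi) // 2]
    i, j = lo, hi
    while i <= j:
        while a[i] < pivot:
            i += 1
        while a[j] > pivot:
            j -= 1
        if i <= j:
            a[i], a[j] = a[j], a[i]
            i, j = i + 1, j - 1
    quicksort(a, lo, j)   # sort left partition in place
    quicksort(a, i, hi)   # sort right partition in place

data = [63077, 933004, 814103, 61894, 128938]
quicksort(data)
# data == [61894, 63077, 128938, 814103, 933004]; no auxiliary lists.
```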

--
Bernd Paysan
"If you want it done right, you have to do it yourself"
http://bernd-paysan.de/

WJ

Dec 25, 2014, 1:44:37 PM
to
Paul Rubin wrote:

> mix <m...@test.net> writes:
> > Better yet, the programmer is not doing that mistake, because there is only
> > one type: pointer to array of bytes, that's it.
>
> To use an analogy that came up in Haskell, think of a jigsaw puzzle
> (picture of scenery, say) whose pieces have complicated shapes (types).
> So if you try putting a piece in the wrong place, it doesn't fit. Now
> imagine the same puzzle except the pieces are all 1x1 squares. Yes the
> square pieces are "simpler", but the puzzle becomes much harder to
> solve.

Very nice analogy, but disciples of ANS Forth won't understand it.
I'm afraid that they are too simple-minded to grok high-level
languages.

They are the worst enemies that Forth has ever had.

If Forth is to live, then ANS Forth must die.

Bernd Paysan

Dec 25, 2014, 1:45:24 PM
to
Paul Rubin wrote:

> Andrew Haley <andr...@littlepinkcloud.invalid> writes:
>> And it's a terrible implementation of quicksort.
>
> It's illustrative.
>
>> it would have the tight inner loop which is what really makes
>> quicksort fast.
>
> It's the O(n log n) asymptotic speed (with non-pathological data) that
> makes quicksort fast, compared with the quadratic algorithms (no matter
>> how optimized) one still finds in programs where the implementer didn't
> know better or have a good library routine available.

No, there are quite a number of O(n log n) sorts, and quicksort is fast
because it has considerably lower constant overhead than the other O(n log
n) sorts. First of all, it's in-place, so it uses only half the memory of
the sorts that aren't in-place. Then, its tight inner loop does
unit strides through the data (typical databases used to have fixed-length
fields, so no pointers, just the data), which means it takes advantage of
caching.

Quicksort has the disadvantage of degrading to a O(n²) sort in some
pathological cases; especially if you choose the first element as pivot, the
pathological case is really easy to construct.

IMHO, a "HLL" like Haskell should rather use mergesort, which works quite
well on lists, and is always O(n log n), and in Haskell, the costs are the
same.
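A linked-list mergesort of the kind suggested can be sketched in Python (my own illustration; the `Node`, `merge_sort`, `from_list`, and `to_list` names are hypothetical, not from any post here):

```python
# Mergesort on a singly linked list: split at the middle, sort each
# half, merge.  Always O(n log n), with no pathological inputs.
class Node:
    def __init__(self, data, next=None):
        self.data, self.next = data, next

def merge_sort(head):
    if head is None or head.next is None:
        return head
    # Find the middle with slow/fast pointers, then cut the list there.
    slow, fast = head, head.next
    while fast and fast.next:
        slow, fast = slow.next, fast.next.next
    mid, slow.next = slow.next, None
    left, right = merge_sort(head), merge_sort(mid)
    # Merge the two sorted halves by relinking the existing nodes.
    dummy = tail = Node(None)
    while left and right:
        if left.data <= right.data:
            tail.next, left = left, left.next
        else:
            tail.next, right = right, right.next
        tail = tail.next
    tail.next = left or right
    return dummy.next

def from_list(xs):
    head = None
    for x in reversed(xs):
        head = Node(x, head)
    return head

def to_list(head):
    out = []
    while head:
        out.append(head.data)
        head = head.next
    return out
```

Since only the `next` pointers are relinked, no node is ever copied, and `<=` in the merge keeps the sort stable.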

Paul Rubin

Dec 25, 2014, 2:00:36 PM
to
an...@mips.complang.tuwien.ac.at (Anton Ertl) writes:
> And yet you chose to use a linked list. I don't really program in
> Haskell, but it seems to me that the mutation stuff is a second-class
> citizen and cannot be used as flexibly as the side-effect free stuff.

That's a reasonable description. Mutation is there if you need it, but
it feels dirty, and you have to wear a hazmat suit (run inside the ST
monad) while you do it.

> Or maybe that was just in the beginning, and that established a
> preference towards linked lists that still lives on despite the base
> now being equal?

Lists are just very natural for Haskell and other lambda-calculus
languages. If you want, you can think of Haskell as a stylized Lisp
with infix syntax and a fancy type system.

>>Instead, all the nested merge steps happen simultaneously
>
> Is the result faster than the eager version in, say, Ocaml (or
> Haskell, if you can force it to be eager)?

I'd be interested in a benchmark against Ocaml. Forcing it to be eager
in Haskell (by reading out all the values before they're needed) would
almost certainly slow it down. Lazy evaluation probably saves memory
but I don't know the cpu tradeoff. E.g., if you have a 1000 element
list "xs" and ask for the first 500 elements (take 500 xs), a strict
language would have to cons up a new 500 element list, since it has to
end with nil. Lazy evaluation creates a thunk that delivers the desired
elements one by one, until it has counted to 500.
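Python generators give a loose analogy to this (only an analogy: Python iterators are one-shot and not memoized the way Haskell thunks are):

```python
from itertools import count, islice

# count(0) is conceptually an infinite list, like [0..] in Haskell.
# islice plays the role of `take`: elements are produced one at a time,
# on demand, and nothing past the requested prefix is ever computed.
first_five = list(islice(count(0), 5))
# first_five == [0, 1, 2, 3, 4]
```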

> I can imagine a list->array conversion word, and an implementation
> that is smart enough about that to turn a list-based merge-sort or
> quicksort with that conversion applied at the appropriate places into
> an array-based version of the sort, without needing to resort to
> monads.

In pure Haskell this is impossible, because the mutation stuff only
exists in certain monads. It could be done with GHC's "unsafe"
operations, which are FFI calls that allow bypassing type safety. Their
intended purpose is to pass values between Haskell and C, but they're
also useful in certain situations like that. But for this sorting, you
can use the ST monad which you can introduce anywhere in your program.
It's not like the IO monad which you can only use from other IO actions.

(Slight elaboration: "main" is your program's entry point called by the
runtime, like in C. "main" is an IO action and as such, it can call
other IO actions and it can also call pure functions. But, pure
functions can't call IO actions. So your program is written in a style
where only the outermost layer can do i/o, and that sometimes means
having to refactor if you suddenly want something in an inner layer to
open a socket or something.)

> You would still express the merging step with linked lists, because
> that allows you to express every element construction separately.

It might make more sense to convert the list to an array, do an in-place
sorting operation, and convert back.

Bernd Paysan

Dec 25, 2014, 2:00:43 PM
to
Paul Rubin wrote:

> OK, here's the first and easiest Euler problem:
>
> https://projecteuler.net/problem=1
>
> How about starting a timer, coding a solution in assembler, and posting
> how long it took. It's a very easy problem, just above "hello world".
> Takes under 1 minute in Haskell or Python, about 5 minutes in Forth. A
> real Forther could do it in Forth quicker than me, of course.

Yes, it took me less than a minute. This is really trivial.

: 3or5? ( n -- flag ) dup 3 mod 0= swap 5 mod 0= or ; ok
: sum3or5 0 swap 1 ?DO i 3or5? IF i + THEN LOOP ; ok
10 sum3or5 . 23
1000 sum3or5 . 233168 ok
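For comparison, the same problem in Python (my own quick version, in the "under 1 minute" spirit mentioned above):

```python
# Project Euler problem 1: sum the multiples of 3 or 5 below n.
def sum3or5(n):
    return sum(i for i in range(1, n) if i % 3 == 0 or i % 5 == 0)

print(sum3or5(10))    # 23
print(sum3or5(1000))  # 233168
```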

Tom O'Donnell

Dec 25, 2014, 2:44:34 PM
to
On 2014-12-25, Paul Rubin <no.e...@nospam.invalid> wrote:
> Tom O'Donnell <t...@nospam.com> writes:

>> Development is expensive and time to market is fairly long... It's
>> hard to say what an average would be for that but I would think 50 to
>> 100KLOC for a new feature that sells as an add-on would be a
>> reasonable median value.
>
> Fair enough. What kind of features? Just system level (add support
> for some new hardware device), or also at the application level?

Toleration or exploitation of new OS features or hardware is mostly done as
ongoing support, since customers expect their stuff to work as long as they
pay for maintenance. New features are usually significant line items that show up in
marcom materials so yes, application level.

> If application level, how would you compare the development cost and
> time, to what it would take for a Java shop to implement comparable
> functionality?

Oh. We've been talking about systems software, so application-level features
are not something that could normally be done in any language but assembler.
In the systems software venue, comparable functionality is not possible in
any language but assembler, so I can't answer your question. But for
application code generally there is nothing that ever has to be done to go
from one release of the OS to the next or even a bunch of releases. Once in
a great while there is something but usually you can go for decades running
the same loadmodule (program). That means companies that sell application
software very seldom worry about anything but bug fixes and new features,
because toleration and exploitation are built into the compilers rather than
the product code.

Tom

Albert van der Horst

Dec 25, 2014, 3:36:23 PM
to
In article <Cq2dndVL0quuUwbJ...@supernews.com>,
That is an extremely interesting reference. Those photographs...
We really looked like that in the 50's. Fortunately I can easily
read the comments in Dutch.
I saved it. Not sure when I will take time to look at it.

>
>Andrew.
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Albert van der Horst

Dec 25, 2014, 3:41:22 PM
to
In article <2014Dec2...@mips.complang.tuwien.ac.at>,
Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>Paul Rubin <no.e...@nospam.invalid> writes:
>>By comparison, with today's dynamic scripting languages, there's not
>>much debugging per se. The program crashes and there's a diagnostic
>>dump that tells you immediately what went wrong.
>
>Not in my experience. A typical experience is that I use a complex
>regexp in awk, and it does not match what I want it to match (and I
>doubt that it's any better with other languages). So I have to try
>out simpler forms of that regexp to see what's wrong. Or I do some
>complex shell pipe, and I get no output, or the wrong output. Then I
>leave away parts to see where it went wrong. No diagnostic tells me
>what went wrong.

Especially for regexp the rule is: a complicated program is a simple
program with some extensions. While I tend (nowadays) to be able to
get a regexp right the first time (!), one really should start with a
simple expression and make it more complicated. In this way it doesn't
cost an inordinate amount of time.
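One way to apply that rule, sketched in Python (the floating-point pattern here is purely a hypothetical example):

```python
import re

# Grow the expression step by step, re-testing after each extension,
# instead of debugging one big regexp that gives no diagnostics.
step1 = re.compile(r"\d+")                            # a run of digits
assert step1.fullmatch("2014")

step2 = re.compile(r"\d+\.\d+")                       # add a fraction part
assert step2.fullmatch("3.14") and not step2.fullmatch("3.")

step3 = re.compile(r"[+-]?\d+\.\d+([eE][+-]?\d+)?")   # sign and exponent
assert step3.fullmatch("-2.5e10")
```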

<SNIP>
>Anyway, I don't really see a connection between the time needed for
>design and the way you debug programs; sure, you also consider testing
>and debugging in design, but having certain errors reported by the
>language does not have more influence than other language features.

Well. For really well-designed programs (like Ahem when I designed them),
an error often triggers a reaction like "of course ... that is probably ..".

>
>- anton

Julian Fondren

Dec 25, 2014, 3:46:04 PM
to
On Thursday, December 25, 2014 12:44:37 PM UTC-6, WJ wrote:
> Very nice analogy, but disciples of ANS Forth won't understand it.
> I'm afraid that they are too simple-minded to grok high-level
> languages.
>
> They are the worst enemies that Forth has ever had.
>
> If Forth is to live, then ANS Forth must die.

Here we have the most readily and naturally extended language on
the planet, and the criticism against it is ever "This language
[without extension] is less expressive or less capable of this
task than that language (with some libraries) is." I wonder
in how many languages and for how many tasks you can make that
point. If you were more productive I guess you would already have
the next Factor or RetroForth or blah.


Meanwhile... just so you have some idea at all of the kinds of
things that actually *do* go through the heads of the 'disciples
of ANS Forth':

Have you heard of PHP? It sucks. One of the biggest things it
sucks is CPU time and memory. It is the absolute suckiest on
shared hosting, because the hacks to make it suck less aren't safe
to use when you don't trust your users and your users also don't
trust each other. That it sucks is actually a big problem for a
lot of its users. Nonetheless, PHP is a big deal, because people
have put a lot of work into content management systems (WordPress,
Joomla, Drupal, Magento) that use it to let ordinary people do
creative stuff and make money, with very minimal investment.
Those people don't care that they're using PHP.

How much does PHP suck? It sucks so hard that if you direct
enough traffic at a .php file that contains straight HTML -- that
contains no PHP at all -- that the CPU cost of firing up PHP and
having it look at this file and decide to do nothing but regurgitate
it will cause your account to be suspended on shared hosting. And
not unreasonably; 'overselling' doesn't have a whole lot to do
with your ability to rack up ten times more CPU seconds than
anyone else on the server. OK, it requires a lot of traffic, but
it happens, and people are surprised when it does.

The proposed answers to PHP's suckage (e.g., from CMS producers,
from Zend, from alleged competitors like CMS systems in Perl) all
go like this: first, get dedicated resources; second, hire a
developer; third, set up this elaborate system, and then--
basically they miss the point from the get-go. One of the hot
things in Perl is the Dancer framework, and I spent ages looking
for a quick tutorial on how to get that running on shared hosting,
and there isn't any such tutorial. You can do it, though - badly.
And if you complain, the only way they know how to help you make
it better is to miss the point.

But shared hosting, these days, isn't just a little web UI with a
button that you can use to install your PHP CMS of choice.
They give you an account on a normal-ish unix box and you can run
all kinds of stuff, provided that you understand the environment.

Typically, under shared hosting, PHP sites aren't using persistent
processes. A PHP process is fired up on every individual request
that isn't for a static asset like an image or a CSS file, and it
generates some HTML. It might also generate a cache file of the
generated HTML that mod_rewrite rules (or another PHP process) can
pull the HTML from instead. Even if you have process limits (and
you always have SOME resource limit that concurrent running PHP
processes would eventually hit), you can handle large numbers of
requests per second because each request's PHP process finishes
fast enough for all the rest of the requests to come through.

So you want a process to start up, do its work, and then go away
as quickly as possible, while consuming as little CPU as possible.
(memory usage is much less important, yes even on shared hosting.)
You use standard I/O and environment variables to do this;
possibly, you connect to a database server or an SMTP server over
the local network.
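That model is essentially classic CGI. A minimal sketch in Python (the `handle` function and the parameter names are hypothetical, just to show the shape: environment in, standard output out, exit immediately):

```python
import os
import sys

# One short-lived process per request: read the request from environment
# variables, write the response to standard output, and exit quickly.
def handle(environ, out):
    name = environ.get("QUERY_STRING", "world")  # hypothetical parameter
    out.write("Content-Type: text/plain\r\n\r\n")
    out.write("hello %s\n" % name)

if __name__ == "__main__":
    handle(os.environ, sys.stdout)
```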

So I'll just leave this here:

https://www.complang.tuwien.ac.at/forth/gforth/Docs-html/Startup-speed.html


.... oh, and ANS Forth is such an obstacle that a Forth CMS might
come with a little slot that customers can put money into to make
their sites work more betterer. I sure wish I could discard that
potentiality for no benefit whatsoever.

-- Julian

WJ

Dec 25, 2014, 4:15:44 PM
to
Paul Rubin wrote:

> mix <m...@test.net> writes:
> > http://en.m.wikipedia.org/wiki/KolibriOS
> > http://en.m.wikipedia.org/wiki/MenuetOS
> >
> > Those are just most known ones, because they are free source projects. Are
> > they big enough for you?
>
> They are nontrivial so I'll accept them without being finicky about the
> size requirement. According to those articles, KolibriOS is apparently
> a fork of MenuetOS so I won't count it as a separate program. MenuetOS
> was first released on May 16, 2000, so it meets the 21st century first
> release criterion, though just barely. They do seem to be examples of
> writing in asm for its own sake, but as the saying goes, coding in
> assembly language is good for the soul. Thanks!

One assembly language program produced in this century, versus hundreds of
C and C++ programs.

Why did so many programmers choose C instead of assembly language?
Were they trying to make their programming tasks harder?
No, they knew that it would be easier to program in a higher-level
language.

Melzzzzz

Dec 25, 2014, 5:34:12 PM
to
On Thu, 25 Dec 2014 21:15:20 +0000 (UTC)
"WJ" <w_a_...@yahoo.com> wrote:

> Paul Rubin wrote:
>
> > mix <m...@test.net> writes:
> > > http://en.m.wikipedia.org/wiki/KolibriOS
> > > http://en.m.wikipedia.org/wiki/MenuetOS
> > >
> > > Those are just most known ones, because they are free source
> > > projects. Are they big enough for you?
> >
> > They are nontrivial so I'll accept them without being finicky about
> > the size requirement. According to those articles, KolibriOS is
> > apparently a fork of MenuetOS so I won't count it as a separate
> > program. MenuetOS was first released on May 16, 2000, so it meets
> > the 21st century first release criterion, though just barely. They
> > do seem to be examples of writing in asm for its own sake, but as
> > the saying goes, coding in assembly language is good for the soul.
> > Thanks!
>
> One assembly language program produced in this century, versus
> hundreds of C and C++ programs.

Don't forget that writing C or C++ sooner or later will require
use of assembler.

>
> Why did so many programmers choose C insted of assembly language?

Because C is a portable assembler. Take the Go language, for example.
I have seen people write assembler files in combination with
high-level constructs. Why? Because Go has a calling convention
different from the common C ABI (calling C functions has a high cost),
and a built-in assembler that does not have such performance
costs. So you will see assembler files for three architectures:
i386, amd64 and ARM ;)

> Were they trying to make their programming tasks harder?
> No, they knew that it would be easier to program in a higher-level
> language.

Writing things in assembler requires mental discipline, which
is good exercise for making fewer bugs ;)


Melzzzzz

Dec 25, 2014, 7:21:37 PM
to
On Wed, 24 Dec 2014 23:32:44 -0800
Paul Rubin <no.e...@nospam.invalid> wrote:

> Tom O'Donnell <t...@nospam.com> writes:
> >> 1) 10 KLOC or more of assembly code (I won't hold out for 100KLOC)
> >> 2) Written in the 21st century,....
> > We do this every day for products that are currently sold and
> > supported. However the code is all proprietary
>
> That's ok, I'll take your word for it without seeing the code. What I
> want to know is whether any of those products was first released in
> 2000 or later (and repackaging/rebranding of older products or
> codebases doesn't count for this purpose). Continuing to sell and
> support older products is great, but this question is about new
> development.

You have ciforth from Albert van der Horst written in assembler?

Paul Rubin

Dec 25, 2014, 8:09:23 PM
to
Melzzzzz <m...@zzzzz.com> writes:
> Don't forget that writing C or C++ sooner or later will require
> use of assembler.

Only occasionally (maybe a little more for OS's or embedded programs),
and the amount of assembler code is usually quite small. Assembly as a
low level assist to a HLL program is fine, I'm just in doubt about the
practicality of writing large asm programs these days. Some mainframe shops
(mentioned by Tom O'Donnell) apparently do it for legacy OS
compatibility and it sounds like they spend a boatload of money on
development.

> You have ciforth from Albert van der Horst written in assembler?

It looks to me like ciforth is a descendant of FigForth which is from
way before 2000.

Melzzzzz

Dec 25, 2014, 8:22:23 PM
to
On Thu, 25 Dec 2014 17:09:22 -0800
Paul Rubin <no.e...@nospam.invalid> wrote:

> Melzzzzz <m...@zzzzz.com> writes:
> > Don't forget that writing C or C++ sooner or later will require
> > use of assembler.
>
> Only occasionally (maybe a little more for OS's or embedded programs),
> and the amount of assembler code is usually quite small.

True.

Assembly as
> a low level assist to a HLL program is fine, I'm just in doubt about
> the practicality of writing large asm programs these days.

Main problem is lack of available programmers.

Some
> mainframe (mentioned by Tom O'Donnell) apparently do it for legacy OS
> compatibility and it sounds like they spend a boatload of money on
> development.

That is because there are not many such programmers.

>
> > You have ciforth from Albert van der Horst written in assembler?
>
> It looks to me like ciforth is a descendant of FigForth which is from
> way before 2000.

Don't forget you can't reuse assembly code from older processors.
I have the x86-64 version and it does not look like it is pre-2000.

Paul Rubin

Dec 25, 2014, 8:49:49 PM
to
Melzzzzz <m...@zzzzz.com> writes:
>> legacy OS compatibility and it sounds like they spend a boatload of
>> money on development.
> That is because that there are not many such programmers.

The extra cost is mostly because it takes so much longer to implement
functionality in asm than in an HLL.

>> It looks to me like ciforth is a descendant of FigForth
> Don't forget you can't reuse assembly code from older processors.
> I have the x86-64 version and it does not look like it is pre-2000.

I haven't looked closely at ciforth but iirc, Figforth was metacompiled,
and it may have used a semi-portable virtual machine mapped to hardware
instructions with asm macros (ie. with a W register and so on).
Figforth ran on a lot of different cpu architectures with a relatively
small amount of asm code hacking needed for each one.

Melzzzzz

Dec 25, 2014, 11:22:16 PM
to
On Thu, 25 Dec 2014 17:49:46 -0800
Paul Rubin <no.e...@nospam.invalid> wrote:

> Melzzzzz <m...@zzzzz.com> writes:
> >> legacy OS compatibility and it sounds like they spend a boatload of
> >> money on development.
> > That is because that there are not many such programmers.
>
> The extra cost is mostly because it takes so much longer to implement
> functionality in asm than in an HLL.

True but results are satisfactory ;p

Eg here is mine merge sort and test asm program in fasm linux/amd64.
It assembles into dynamically linked executable directly, without
linker.

I needed about 4 hours to write it ;p
Time to generate list of million nodes of 32bit integers and to sort it
follows:

[bmaxa@maxa-pc sort]$ time ./list_sort
seed 1419567339
unsorted

0x314e7d0 63077
0x314e7b0 933004
0x314e790 814103
0x314e770 61894
0x314e750 128938
0x314e730 894326
0x314e710 691552
0x314e6f0 42594
0x314e6d0 234833
0x314e6b0 133719
0x314e690 997511
0x314e670 269168
0x314e650 55560
0x314e630 364881
0x314e610 221141
0x314e5f0 324993
sorted

0x13777f0 0
0x18d82f0 0
0x2da8bb0 0
0x1d836b0 0
0x1b537f0 1
0x18f2e90 2
0x2f99d30 2
0x2fec4d0 2
0x26532d0 3
0x1647af0 3
0x172ddf0 5
0x1ab3f30 6
0x28938b0 7
0x16c0ed0 7
0x2b84990 8
0x236bfb0 9
size of node 12, length 1000000

real 0m0.388s
user 0m0.387s
sys 0m0.000s

Here is whole program (test and merge_sort):

[bmaxa@maxa-pc sort]$ cat list_sort.asm
format elf64 executable 3
struc Node {
.next dq ?
.data dd ?
.size = $-.next
}
virtual at 0
n Node
end virtual

include 'import64.inc'
interpreter '/lib64/ld-linux-x86-64.so.2'
needed 'libc.so.6'
import printf,puts,exit,rand_r,time,malloc

N = 1000000 ; number of nodes
segment executable
entry $
xor edi,edi
call [time]
mov [seed],eax ; seed on time
mov rdi,fmt1
mov esi,[seed]
xor eax,eax
call [printf] ; print seed
call init_list
mov rdi,fmtu
call print_list
mov rbx,[list]
push rbx
call sort
pop rbx
mov [list],rbx
mov rdi,fmts
call print_list
mov rdi,[list]
call length
mov rdx,rcx
mov rdi,fmt3
mov rsi,n.size
call [printf]
xor edi,edi
call [exit]
init_list:
mov ebx,N
.L0:
mov edi,n.size
call [malloc]
mov rcx,[list]
mov [rax+n.next],rcx
mov [list],rax
mov rdi,seed
call [rand_r]
xor edx,edx
mov ecx,N
div ecx
mov rcx,[list]
mov [rcx+n.data],edx
dec ebx
jnz .L0
ret
print_list:
call [puts]
mov rbx,[list]
mov r12,16
.L0:
test rbx,rbx
jz .exit
mov rdi,fmt2
mov rsi,[rbx+n.next]
mov edx,[rbx+n.data]
xor eax,eax
call [printf]
mov rbx,[rbx+n.next]
dec r12
jz .exit
jmp .L0
.exit:
ret
; [rsp+8] list to sort
sort:
mov rdi,[rsp+8]
call length
cmp rcx,1
jle .exit
shr rcx,1 ; middle
sub rsp,16 ; left,right
mov qword[rsp],0
mov qword[rsp+8],0
mov rbx,[rsp+8+16]
.L0: ; append to left
mov rax,[rsp]
mov rdx,[rbx+n.next]
mov [rbx+n.next],rax
mov [rsp],rbx
mov rbx,rdx
dec rcx
jnz .L0
.L1: ; append to right
mov rax,[rsp+8]
mov rdx,[rbx+n.next]
mov [rbx+n.next],rax
mov [rsp+8],rbx
mov rbx,rdx
test rbx,rbx
jnz .L1
sub rsp,8 ; result
mov rbx,[rsp+8]
mov [rsp],rbx
call sort
mov rbx,[rsp]
mov [rsp+8],rbx

mov rbx,[rsp+16]
mov [rsp],rbx
call sort
mov rbx,[rsp]
mov [rsp+16],rbx
call merge
mov rbx,[rsp]
add rsp,24
mov [rsp+8],rbx
.exit:
ret
; [rsp+8] output , [rsp+16] left, [rsp+24] right
merge:
sub rsp,8 ; append position
mov qword[rsp+16],0
mov qword[rsp],0
.L0:
cmp qword[rsp+24],0
jz .right
cmp qword[rsp+32],0
jz .left
mov rax,[rsp+24]
mov ebx,[rax+n.data]
mov rcx,[rsp+32]
cmp ebx,[rcx+n.data]
jl .add_left
.add_right:
cmp qword[rsp],0
je .just_set_right
mov rdx,[rsp]
mov [rdx+n.next],rcx
mov rdx,[rcx+n.next]
mov [rsp+32],rdx
mov qword[rcx+n.next],0
mov [rsp],rcx
jmp .L0
.add_left:
cmp qword[rsp],0
je .just_set_left
mov rdx,[rsp]
mov [rdx+n.next],rax
mov rdx,[rax+n.next]
mov [rsp+24],rdx
mov qword[rax+n.next],0
mov [rsp],rax
jmp .L0
.just_set_left:
mov rdx,[rax+n.next]
mov qword[rax+n.next],0
mov [rsp],rax
mov [rsp+16],rax
mov [rsp+24],rdx
jmp .L0
.just_set_right:
mov rdx,[rcx+n.next]
mov qword[rcx+n.next],0
mov [rsp],rcx
mov [rsp+16],rcx
mov [rsp+32],rdx
jmp .L0
.right:
cmp qword[rsp+32],0
jz .exit
mov rcx,[rsp+32]
cmp qword[rsp],0
je .just_set_right_only
mov rdx,[rsp]
mov [rdx+n.next],rcx
mov [rsp],rcx
mov rdx,[rcx+n.next]
mov qword[rcx+n.next],0
mov [rsp+32],rdx
jmp .right
.just_set_right_only:
mov rdx,[rcx+n.next]
mov qword[rcx+n.next],0
mov [rsp],rcx
mov [rsp+16],rcx
mov [rsp+32],rdx
jmp .right
.left:
cmp qword[rsp+24],0
jz .exit
mov rax,[rsp+24]
cmp qword[rsp],0
je .just_set_left_only
mov rdx,[rsp]
mov [rdx+n.next],rax
mov [rsp],rax
mov rdx,[rax+n.next]
mov qword[rax+n.next],0
mov [rsp+24],rdx
jmp .left
.just_set_left_only:
mov rdx,[rax+n.next]
mov qword[rax+n.next],0
mov [rsp],rax
mov [rsp+24],rax
mov [rsp+32],rdx
jmp .left
.exit:
add rsp,8
ret
; rdi input list, rcx count
length:
mov rcx,0
.L0:
test rdi,rdi
jz .exit
mov rdi,[rdi+n.next]
inc rcx
jmp .L0
.exit:
ret
segment readable
fmtu db 'unsorted',0ah,0
fmts db 'sorted' ,0ah,0
fmt1 db 'seed %d',0ah,0
fmt2 db '%16p %d',0ah,0
fmt3 db 'size of node %d, length %d',0ah,0
segment writeable
list rq 1
seed rd 1
;------------------------------------------------ ende

>
> >> It looks to me like ciforth is a descendant of FigForth
> > Don't forget you can't reuse assembly code from older processors.
> > I have the x86-64 version and it does not look like it is pre-2000.
>
> I haven't looked closely at ciforth but iirc, Figforth was
> metacompiled, and it may have used a semi-portable virtual machine
> mapped to hardware instructions with asm macros (ie. with a W
> register and so on). Figforth ran on a lot of different cpu
> architectures with relatively small amount of asm code hacking needed
> for each one.
Really, I am not sure; the code has human-readable comments, but I don't
really know.


Andrew Haley

Dec 26, 2014, 4:28:20 AM
to
Paul Rubin <no.e...@nospam.invalid> wrote:
> Melzzzzz <m...@zzzzz.com> writes:
>
>>> It looks to me like ciforth is a descendant of FigForth
>> Don't forget you can't reuse assembly code from older processors.
>> I have the x86-64 version and it does not look like it is pre-2000.
>
> I haven't looked closely at ciforth but iirc, Figforth was
> metacompiled, and it may have used a semi-portable virtual machine
> mapped to hardware instructions with asm macros (ie. with a W
> register and so on). Figforth ran on a lot of different cpu
> architectures with relatively small amount of asm code hacking
> needed for each one.

All true, kinda sorta. fig-FORTH was written for the 6502 in Forth
with only a small part in assembly language, and metacompiled with a
FORTH, Inc metacompiler. Then, a program was written to convert the
metacompiler output into assembly language source code, and then that
assembly code was translated by hand into code for other processors.

Andrew.

Tom O'Donnell

Dec 26, 2014, 5:36:04 AM
to
On 2014-12-26, Paul Rubin <no.e...@nospam.invalid> wrote:
> Melzzzzz <m...@zzzzz.com> writes:
>>> legacy OS compatibility and it sounds like they spend a boatload of
>>> money on development.
>> That is because that there are not many such programmers.
>
> The extra cost is mostly because it takes so much longer to implement
> functionality in asm than in an HLL.

You said you haven't written much assembler but you sure make a lot of
strong statements about the use of it! I've been doing this for almost 41
years and I beg to differ strongly with most of what you have been saying.

What you said isn't the reason. Assembler is completely appropriate and the best
choice for systems programming in this environment. You can't make a comparison
between how long things take to do with a language designed specifically for a
particular purpose and how long they *might* take to do in language(s) that aren't
capable of being used for a particular job at all. We are not talking about
application programming. Consider the IBM assembler a DSL for systems programming
on IBM mainframes because that's exactly what it is.

The OS is not a legacy OS. It's the oldest OS still being *developed*, sold,
and marketed. The clocks on mainframe CPUs are up to over 5 GHz. There's tons
and tons of high tech in every mainframe. The use of "legacy" to describe
any of this is factually wrong.

The reason it takes a long time to write systems software on this platform
is because of the same values that keep people paying for the OS: you need
skilled people with decades of experience who understand that crashing isn't
acceptable ever, failures have to leave all resources in known, consistent states,
performance has to be much better than the functionality in the OS they're
enhancing and/or replacing, and things have to work now and on future
versions of the OS and with no bad interactions with countless other vendor
products. We also have to build in ways to fix the product in the field
right now when somebody is down. We don't send them source to compile.

We also need good tech writers and a lot of coordination between managers,
developers, QA, and doc. Customers are paying a lot of money for a fully
finished product that meets the same standards of the OS. They expect clear,
usable professional doc. The product has to install and work, always. It
can't do anything bad to the system. It has to perform, it has to be secure,
it has to ensure data integrity über alles. Messing up in any of these areas
means possible lawsuits and loss of big money.

Mainframes scale up, not out. Your stuff has to work and not require a
failover box. They may have some load balancing and failover but we can't
rely on that. Our stuff has to stay up. If it doesn't heads roll. The amount
of design that goes into a good systems software product is not something
application coders have any experience with. Yes, it takes longer to design
a Boeing 747 than it takes to design a Piper Cub because the values are
different. All that costs money. But if we didn't save the customers a lot
more than the products cost we wouldn't have a financial model that could work.

Tom

Anton Ertl

Dec 26, 2014, 8:37:18 AM
to
Paul Rubin <no.e...@nospam.invalid> writes:
>One difference is you can run the C program on multiple cpu
>architectures without much change.

That used to be the case, but not any longer. Nowadays you cannot run
a C program on a single CPU architecture without change.

Anton Ertl

unread,
Dec 26, 2014, 9:00:00 AM12/26/14
to
alb...@spenarnc.xs4all.nl (Albert van der Horst) writes:
>In article <2014Dec2...@mips.complang.tuwien.ac.at>,
>Anton Ertl <an...@mips.complang.tuwien.ac.at> wrote:
>>Paul Rubin <no.e...@nospam.invalid> writes:
>>>By comparison, with today's dynamic scripting languages, there's not
>>>much debugging per se. The program crashes and there's a diagnostic
>>>dump that tells you immediately what went wrong.
>>
>>Not in my experience. A typical experience is that I use a complex
>>regexp in awk, and it does not match what I want it to match (and I
>>doubt that it's any better with other languages). So I have to try
>>out simpler forms of that regexp to see what's wrong. Or I do some
>>complex shell pipe, and I get no output, or the wrong output. Then I
>>leave away parts to see where it went wrong. No diagnostic tells me
>>what went wrong.
>
>Especially for regexp the rule is: a complicated program is a simple
>program with some extensions. While I tend (nowadays) to be able to
>get a regexp right the first time (!), one really should start with a
>simple expression and make it more complicated. In this way it doesn't
>cost an inordinate amount of time.

Yes. Anyway, the point is that you don't get a diagnostic that tells
you what went wrong.

><SNIP>
>>Anyway, I don't really see a connection between the time needed for
>>design and the way you debug programs; sure, you also consider testing
>>and debugging in design, but having certain errors reported by the
>>language does not have more influence than other language features.
>
>Well. For really well-designed programs (like Ahem when I designed them),
>an error often triggers a reaction like "of course ... that is probably ..".

Yes, that happens in some cases, and in some cases a different design
might have been chosen that does not have the same property. But you
can have that in any language. And you can have bugs that mystify you
in any language. And you can have programs where you don't spend much
time on design in any language. And my guess is, that if the task
fits a well-established design (design pattern is the not-so-new
buzzword for that), that design also has its debugging methodology
established (a debugging pattern?), and if you know it, you get the
"that is probably ..." feeling more often.

So there is a certain connection between the time needed for the
design and the way you debug programs, contrary to what I wrote above.
The connection to the language is slim, though. Sure, a design and
debugging methodology that relies on array bounds checking won't work
on a language without array bounds checking, but OTOH, a language
with, say, address arithmetic may lead to other designs that one would
not use in checked languages; I remember the horrible contortions a
student of mine went through when implementing Postscript in C#.
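The bounds-checking half of that contrast is easy to make concrete (a minimal sketch in Python, standing in for any checked language): the runtime stops at the faulty access and names it, which is exactly the property an unchecked design cannot rely on.

```python
# In a bounds-checked runtime, an out-of-range access fails immediately
# at the point of the error, with a message naming the problem.
buf = [0] * 8
try:
    buf[8] = 1                    # one element past the end
except IndexError as err:
    print("caught:", err)         # diagnostic identifies the bad access
```

A debugging methodology built on such checks simply has nothing to hook into in a language where the same store silently scribbles over adjacent memory.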