incremental assembly

Hugh Aguilar

Oct 26, 2011, 1:12:51 AM

Does any assembler support incremental assembly? What I mean by
"incremental assembly" is that I can assemble a function and the
generated code will get tacked onto the end of the code that has
already been generated. I don't have to assemble an entire file and
then link the generated object file together with other object files
to produce an executable program. I want to incrementally build the
code. Also, when I tack a new function onto the end of the existing
code, I don't want to mess up the data at all. Most of the time, this
new function is a rewrite of a buggy old function, so I want to test
it on the same data that I had tested the old function on --- I don't
want to have to rerun the program from the start to regenerate all of
that data.

I noticed in the NASM documentation that it generates a wide variety
of object formats, but that one of them is a simple binary file. Is it
possible to assemble a single function into a binary file, locating
that function at a specific address, and then just copy the binary
file into memory at that address? This would be similar to writing a
one-function .COM file, but instead of placing the function at 0x0100,
it would be located somewhere else. Actually though, most functions
are relocatable (or I could make sure that they were), so they don't
really have to be located anywhere by the assembler (0x0100 would be
fine, even though the function will actually get copied somewhere else
in memory).

If HLA does incremental assembly, I would prefer to use it for my
project because it has such an awesome library of code all of which
will assemble for either Windows or Linux. Cool! But I don't want to
have to delve into the internal workings of HLA to make it do
incremental assembly, if it doesn't already do this. I would use NASM
or whatever if that assembler already has incremental assembly.

BTW, has anybody here used the assembler that comes with PLT Scheme?
If I had to delve into the internal workings of any assembler, that
one might be the easiest. Also, as powerful as Mr. Hyde's compile-time-
language is, I expect that Scheme is more powerful. Does anybody know
why Mr. Hyde wrote his own language rather than use an existing
language, such as Scheme or Ruby or Python or whatever?

I tried to get on the HLA mailing list, but haven't gotten any
response in several days. Is that still active?

Rod Pemberton

Oct 26, 2011, 7:00:55 PM

"Hugh Aguilar" <hughag...@nospicedham.yahoo.com> wrote in message
news:e580e82d-fb49-43d6...@h30g2000pro.googlegroups.com...

>
> Does any assembler support incremental assembly? What I mean by
> "incremental assembly" is that I can assemble a function and the
> generated code will get tacked onto the end of the code that has
> already been generated. I don't have to assemble an entire file and
> then link the generated object file together with other object files
> to produce an executable program. I want to incrementally build the
> code. Also, when I tack a new function onto the end of the existing
> code, I don't want to mess up the data at all. Most of the time, this
> new function is a rewrite of a buggy old function, so I want to test
> it on the same data that I had tested the old function on --- I don't
> want to have to rerun the program from the start to regenerate all of
> that data.

I have no answer for you on that.

> I noticed in the NASM documentation that it generates a wide variety
> of object formats, but that one of them is a simple binary file. Is it
> possible to assemble a single function into a binary file, locating
> that function at a specific address, and then just copy the binary
> file into memory at that address?

Yes. It's possible. It's do-able for disk boot code loaded to 0x7C00, or
with DOS as your OS, like a .com file. You'd have to comply with the
executable and/or object formats of other OSes. I.e., you may not be able
to load it to a specific address. Others here, hopefully, know more about
Windows and Linux in that regard.

> BTW, has anybody here used the assembler that comes with PLT Scheme?

Not I.

> If I had to delve into the internal workings of any assembler, that
> one might be the easiest. Also, as powerful as Mr. Hyde's compile-time-
> language is, I expect that Scheme is more powerful. Does anybody know
> why Mr. Hyde wrote his own language rather than use an existing
> language, such as Scheme or Ruby or Python or whatever?
>

Not I.

> I tried to get on the HLA mailing list, but haven't gotten any
> response in several days. Is that still active?

Frank? Nathan? ...

AIR, we haven't seen a post from "Mr. Hyde" in a few years here, or on
alt.lang.asm.

Rod Pemberton

Hugh Aguilar

Oct 26, 2011, 9:41:12 PM

On Oct 26, 5:00 pm, "Rod Pemberton"
<do_not_h...@nospicedham.noavailemail.cmm> wrote:
> "Hugh Aguilar" <hughaguila...@nospicedham.yahoo.com> wrote in message

> > I noticed in the NASM documentation that it generates a wide variety
> > of object formats, but that one of them is a simple binary file. Is it
> > possible to assemble a single function into a binary file, locating
> > that function at a specific address, and then just copy the binary
> > file into memory at that address?
>
> Yes. It's possible. It's do-able for disk boot code loaded to 0x7C00, or
> with DOS as your OS, like a .com file. You'd have to comply with the
> executable and/or object formats of other OSes. I.e., you may not be able
> to load it to a specific address. Others here, hopefully, know more about
> Windows and Linux in that regard.

Generating code to a specific address shouldn't be too difficult; any
assembler that can create .COM files should be able to do this. The
tricky part is that the assembler has to have the symbol table for the
program still in place, so the function can reference sub-functions
and data and so forth that are in the program.
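
A rough sketch of that idea in Python (not any real assembler's code): the image only ever grows at the end, and the symbol table stays alive between definitions so that a later function can call an earlier one. The class name Image, the base address 0x1000, and the stand-in bodies (a bare RET) are all made up for illustration; only the CALL rel32 (E8) and RET (C3) encodings are real x86.

```python
import struct


class Image:
    """Toy code image that grows at the end and keeps its symbol table."""

    def __init__(self, base):
        self.base = base          # address the image is assumed to be loaded at
        self.code = bytearray()   # machine code emitted so far
        self.symbols = {}         # name -> address; survives between definitions

    def here(self):
        """Address of the next free byte."""
        return self.base + len(self.code)

    def begin(self, name):
        """Start a new function at the end of the image and record its address."""
        self.symbols[name] = self.here()

    def ret(self):
        self.code += b"\xC3"      # RET

    def call(self, name):
        """CALL rel32 (E8 cd) to an already-defined symbol."""
        target = self.symbols[name]            # KeyError if never defined
        disp = target - (self.here() + 5)      # relative to the next instruction
        self.code += b"\xE8" + struct.pack("<i", disp)


img = Image(base=0x1000)

img.begin("plus")      # hypothetical stand-in body: just RET
img.ret()

img.begin("twice")     # later definition that calls the earlier one
img.call("plus")
img.call("plus")
img.ret()

old_plus = img.symbols["plus"]
img.begin("plus")      # a rewrite is tacked onto the end; the old bytes stay put
img.ret()

print(hex(old_plus), hex(img.symbols["twice"]), hex(img.symbols["plus"]))
print(img.code.hex())
```

Here the second "plus" lands at 0x100c while "twice" still calls the old copy at 0x1000, which is how redefinition usually behaves in Forth; patching existing callers would be a separate step.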

I have written a couple of Forth assemblers, but not for the x86 (for
the 65c02 and the MiniForth, the latter done professionally). I have
also used a couple of x86 Forth assemblers (UR/Forth and SwiftForth)
extensively, and learned the basics of some others. All Forth
assemblers do incremental assembly. For example, I entered the
following code at the console of SwiftForth (the "ok" is produced by
the Forth console after every line; I didn't type that myself):

code plus ( a b -- a+b ) ok
0 [ebp] ebx add 4 # ebp add ok
ret end-code ok
89 7 plus . 96 ok

The PLUS function was just tacked onto the existing code. The downside
of Forth assemblers however, is that they don't generate object files,
so it is difficult to link Forth code together with C code and
traditional assembly code.

I want the best of both worlds. I want the interactive development of
incremental assembly such as I'm accustomed to in Forth (I don't go
through an assemble, link and debug cycle for the entire program). I
also want to be able to produce object files though, so that my code
can be linked together with libraries of code. The big problem with
Forth is that there aren't very many libraries of code available. One
way to solve this is to just get access to the myriad C libraries
floating around. This should work well for high-level code (like
Berkeley DB, for example), although it won't work well for low-level
code (like linked-lists, associative arrays, and the other code in my
novice package:
http://www.forth.org/novice.html) --- this stuff has to be written in
Forth so that it will have a Forthish face.

> > BTW, has anybody here used the assembler that comes with PLT Scheme?
>
> Not I.
>
> > If I had to delve into the internal workings of any assembler, that
> > one might be the easiest. Also, as powerful as Mr. Hyde's compile-time-
> > language is, I expect that Scheme is more powerful. Does anybody know
> > why Mr. Hyde wrote his own language rather than use an existing
> > language, such as Scheme or Ruby or Python or whatever?
>
> Not I.

Actually, I studied more of the HLA compile-time-language last night
and I discovered that it has goal-directed execution similar to Icon
(http://en.wikipedia.org/wiki/Icon_(programming_language)). I think
the reason why Scheme etc. weren't used, is because they don't have
anything like that. This is actually pretty sophisticated! I learned
Icon way back in the 1990s and I was very impressed. Icon never became
popular though, so I forgot about it --- but I'm pleased now to see
that the ideas in Icon have turned up in HLA --- this makes me even
more interested in learning HLA.

> > I tried to get on the HLA mailing list, but haven't gotten any
> > response in several days. Is that still active?
>
> Frank? Nathan? ...
>
> AIR, we haven't seen a post from "Mr. Hyde" in a few years here, or on
> alt.lang.asm.

I shouldn't have called him "Mr. Hyde" --- "Prof. Hyde" would have
been better, as I have read that he is a professor somewhere.

I read some of the old threads here, dating back to when he was
posting on this forum, and people were quite critical of him, and
generally over trivial matters such as his use of forward operand
ordering (source, destination) rather than the more common backward
ordering (destination, source). People also criticized him for
providing all of those macros, saying that they prefer to write code
by hand (which they consider to be part-and-parcel of assembly-
language programming). That makes no sense to me. With every assembler
that I have used (IBM/360, PIC16, PIC24, etc.) I have always written a
*lot* of macros. Invariably, what bugs me the most about assembly
language is that the macro language is too weak for me and it won't
let me do the things that I want to do. I'm not the only person that
feels this way either; I noticed that MicroChip upgraded to a new
assembler, and it has a much better macro language than their old one.
It is possible that HLA will be the first assembler that I use in
which I won't be able to complain about the macro language being too
weak! I don't understand why people are criticizing HLA --- what is
the downside to having a powerful macro language?

If everybody was putting him down, he likely just got sick of it and
stopped posting messages about HLA --- I can relate --- nobody has
ever said anything positive about my novice package, and many have
denounced it in the most vulgar terms.

Nathan Baker

Oct 26, 2011, 10:42:29 PM

"Hugh Aguilar" <hughag...@nospicedham.yahoo.com> wrote in message
news:e580e82d-fb49-43d6...@h30g2000pro.googlegroups.com...

> Does any assembler support incremental assembly? What I mean by
> "incremental assembly" is that I can assemble a function and the
> generated code will get tacked onto the end of the code that has
> already been generated. I don't have to assemble an entire file and
> then link the generated object file together with other object files
> to produce an executable program. I want to incrementally build the
> code. Also, when I tack a new function onto the end of the existing
> code, I don't want to mess up the data at all. Most of the time, this
> new function is a rewrite of a buggy old function, so I want to test
> it on the same data that I had tested the old function on --- I don't
> want to have to rerun the program from the start to regenerate all of
> that data.
>

The only thing close to "incremental assembler" would be RosAsm. However, I
don't have any idea how anyone would go about obtaining the latest copy of
it.
http://en.wikipedia.org/wiki/User:B2kguga/RosAsm

> I tried to get on the HLA mailing list, but haven't gotten any
> response in several days. Is that still active?

It is still active.

Nathan.

Hugh Aguilar

Oct 27, 2011, 12:54:40 AM

On Oct 26, 8:42 pm, "Nathan Baker"
<nathancba...@nospicedham.gmail.com> wrote:
> "Hugh Aguilar" <hughaguila...@nospicedham.yahoo.com> wrote in message

>
> news:e580e82d-fb49-43d6...@h30g2000pro.googlegroups.com...
>
> > Does any assembler support incremental assembly? What I mean by
> > "incremental assembly" is that I can assemble a function and the
> > generated code will get tacked onto the end of the code that has
> > already been generated. I don't have to assemble an entire file and
> > then link the generated object file together with other object files
> > to produce an executable program. I want to incrementally build the
> > code. Also, when I tack a new function onto the end of the existing
> > code, I don't want to mess up the data at all. Most of the time, this
> > new function is a rewrite of a buggy old function, so I want to test
> > it on the same data that I had tested the old function on --- I don't
> > want to have to rerun the program from the start to regenerate all of
> > that data.
>
> The only thing close to "incremental assembler" would be RosAsm. However, I
> don't have any idea how anyone would go about obtaining the latest copy of
> it.
> http://en.wikipedia.org/wiki/User:B2kguga/RosAsm
>
> > I tried to get on the HLA mailing list, but haven't gotten any
> > response in several days. Is that still active?
>
> It is still active.
>
> Nathan.

I just made up the term "incremental assembly" --- I don't know if
there is an official term. RosAsm might do the incremental assembly
(or it might not, I can't tell), but it looks like it is too fringe
even for me (and as a Forther, I'm pretty fringe!). BTW, you don't
usually see statements like this on Wikipedia: "for mankind in
general, a free gift that can be used to improve their lives wherever
possible by the continuous development of high technologies." Well, so
much for objectivity!

HLA looks like my kind of software. I like the idea of a powerful
macro language, especially one based on Icon. Like I said, I have a
long history of being frustrated with macro languages that were too
weak for me, so maybe HLA will be the first one that I like. I might be
able to tweak HLA so that it will do the incremental assembly (at some
far future date when I know more about how HLA works internally).

HLA is also somewhat fringe apparently. Is it true that most people
use NASM? Is that the mainstream nowadays? I used TASM and A86 way
back in MS-DOS days (I liked A86 because it got rid of most of those
MASM/TASM directives, and it was also significantly faster), but those
assemblers seem to have fallen by the wayside. If I wanted to be a
totally mainstream programmer who fits in well on comp.lang.asm.x86
(and could possibly even get a job), would NASM be the best choice for
me? Is it true that HLA is academic and hobbyist, but isn't used
commercially, so there is no chance of me ever getting a paycheck? For
that matter, is there any work available in assembly (using any
assembler), or has Java totally taken over the world? Should I forget
about all of this fringe foolishness and just learn to love curly
brackets?

Phil Carmody

Oct 27, 2011, 6:13:32 AM

Hugh Aguilar <hughag...@nospicedham.yahoo.com> writes:
> Does any assembler support incremental assembly? What I mean by
> "incremental assembly" is that I can assemble a function and the
> generated code will get tacked onto the end of the code that has
> already been generated.

Paging Beth Stone...

Phil
--
Unix is simple. It just takes a genius to understand its simplicity
-- Dennis Ritchie (1941-2011), Unix Co-Creator

Steve

Oct 27, 2011, 7:43:20 AM

Hugh Aguilar <hughag...@nospicedham.yahoo.com> writes:
>Does any assembler support incremental assembly? What I mean by
>"incremental assembly" is that I can assemble a function and the
>generated code will get tacked onto the end of the code that has
>already been generated. I don't have to assemble an entire file and
>then link the generated object file together with other object files
>to produce an executable program. I want to incrementally build the
>code. Also, when I tack a new function onto the end of the existing
>code, I don't want to mess up the data at all. Most of the time, this
>new function is a rewrite of a buggy old function, so I want to test
>it on the same data that I had tested the old function on --- I don't
>want to have to rerun the program from the start to regenerate all of
>that data.

Hi,

If programming in DOS you could use overlays. Other
common OS'es could use DLL's. An option might be a
device driver, but that would require dynamic loading.

Regards,

Steve N.

Frank Kotler

Oct 27, 2011, 11:11:05 AM

Phil Carmody wrote:
> Hugh Aguilar <hughag...@nospicedham.yahoo.com> writes:
>> Does any assembler support incremental assembly? What I mean by
>> "incremental assembly" is that I can assemble a function and the
>> generated code will get tacked onto the end of the code that has
>> already been generated.
>
> Paging Beth Stone...
>
> Phil

:)

This gets into a long story, Hugh! I made the statement, "Even though
I'm a command-line guy, I think a port of RosAsm to Linux would be
cool." Betov (author of RosAsm) replied, "Okay, Frank, you're the chief
maintainer of Luxasm, the Linux port of RosAsm." Although I had no
intention of writing the thing, I thought the name "Luxasm" was so great
that I started a project at SourceForge. Beth signed up for it, as did
CRChafer ("C"), YeohHS, and a few others. Beth thought that "incremental
assembly" would be good. I thought it was impractical, but C (the only
one who ever actually wrote any code) thought we might be able to do it.
Then C announced that he didn't have time to continue with it, and Beth
dropped out of sight entirely, so Luxasm is in a "medically induced
coma" (indistinguishable from "dead").

cmp eax, ebx "ok" (to use your example)
jz L1 "???"

If "L1" has been "seen", fine. If it's a forward reference... put in a
placeholder? Will a "short" jump fit, or will we need "near"? We may
need to relocate any following code, once we find out where "L1" is. In
this case, is it still "incremental"? I dunno.
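
For the case where L1 has already been seen, the choice is mechanical, and a small Python sketch shows it. It assumes 32-bit code, where JZ has a 2-byte short form (74 cb) and a 6-byte near form (0F 84 cd); the function name encode_jz is made up for illustration. The forward case is the hard one precisely because the distance, and therefore the instruction size, is not known yet.

```python
import struct


def encode_jz(src, target):
    """Encode JZ at address src jumping to target, using the short form when it fits."""
    disp8 = target - (src + 2)            # short form is 2 bytes: 74 cb
    if -128 <= disp8 <= 127:
        return b"\x74" + struct.pack("<b", disp8)
    disp32 = target - (src + 6)           # near form is 6 bytes: 0F 84 cd
    return b"\x0F\x84" + struct.pack("<i", disp32)


print(encode_jz(0x100, 0x100).hex())      # jump to itself: short form
print(encode_jz(0x1000, 0x100).hex())     # target far behind: near form
```

The two calls print 74fe and 0f84faf0ffff. The four-byte size difference between the forms is what forces following code to move if a placeholder turns out to be too small.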

So Luxasm and RosAsm are, for practical purposes, "dead". If Betov is
still alive, he'll be angry to see RosAsm and HLA mentioned on the same
page (same universe!). If not, he'll be rolling over in his grave. Tough!

HLA is still alive and well(?). When Randy was teaching at University of
California, Riverside, his title was "lecturer", not "professor", AFAIK.
I used to call him "professor" until I figured out that he's not a
"pompous asm", and that "Randy" seems to be an acceptable form of
address. I think he's working for General Atomics now, writing code for
the control panel of a research reactor - in HLA, so HLA is used
"commercially" in that case, at least. He also has a "hobby job"
providing lights (and other "production"?) for music groups, so he
doesn't have too much time to work on HLA. Definitely a "teacher"! He
has always taken time to help me out, despite the fact that I disagree
with him frequently.

My "objection" to HLA is that I feel that teaching a beginner
"stdout.puts("hello, world");" and claiming that you're teaching them
"assembly language" is... "misleading" (nicest word I can think of).
Since you're looking for a powerful macro language, it may be just what
you want. It won't do "incremental assembly", but it's open source. If
you can figure out what to do with forward references, perhaps it could
be modified...

http://tech.groups.yahoo.com/group/aoaprogramming/

If that's the mailing list you tried to join, and got no response...
maybe try again. Maybe try a different "reason" for wanting to join. They
didn't do that when I joined, or I would have had to say, "I'm a spy
from Nasm!" :) I think it's just an anti-spam measure - shouldn't matter
what you say, as long as you say something. It *may* be that the "owner"
of the group (not Randy, AFAIK) is 404, but the group is still "active"
(not very) - Randy posted just the other day! I don't recall seeing many
posts from "new" members, mostly the "regulars", so there might be a
problem joining(?). We could inquire, if you continue to have trouble
with it - get back to us.

Best,
Frank

io_x

Oct 27, 2011, 1:23:12 PM

"Frank Kotler" <fbko...@nospicedham.myfairpoint.net> wrote in message
news:j8bsg4$vsn$1...@speranza.aioe.org...

> cmp eax, ebx "ok" (to use your example)
> jz L1 "???"
>
> If "L1" has been "seen", fine. If it's a forward reference... put in a
> placeholder?

> Will a "short" jump fit, or will we need "near"? We may

I remember a discussion between you and Betov about that some time ago...

Until some time ago I did not know that NASM accepted the word "near" in jump
instructions, so I broke every jump that was too long into two jumps.
To explain: in place of
...
.b: je near .z
many instructions
.z:

i wrote:
jmp .1
.a: jmp .z ; jmp seem always "near"
.1: ....
.b: je .a
many instructions
.z:

But what is the problem with making every jump near?

Rod Pemberton

Oct 27, 2011, 1:35:42 PM

"Hugh Aguilar" <hughag...@nospicedham.yahoo.com> wrote in message
news:add664ee-863e-44b4...@h39g2000prh.googlegroups.com...
> [SNIP]
...

> [...] people were quite critical of ["Mr. Hyde"]
>

Yes, much of that was due to his continual, commercial-like,
"advertisements" of his HLA. Whenever serious criticism of his
work was posted, such as some posts by me, he seemed to run
and hide ...

> I have always written a
> *lot* of macros. Invariably, what bugs me the most about
> assembly language is that the macro language is too weak
> for me and it won't let me do the things that I want to do.

NASM doesn't have that issue. Herbert Kleebauer uses a 68000 style
syntax for x86 with his personal assembler. He also implemented that
syntax for NASM using NASM's macros. The macros allow others to
compile his Windela (or Lindela ?) code with NASM.

A recent post of the mac.inc file which does that is here:
http://groups.google.com/group/alt.lang.asm/msg/25de6b2c28ec7004

Windela.zip with windela.inc:
http://www.bitlib.de/pub/assembler/

> It is possible that HLA will be the first assembler that I use in
> which I won't be able to complain about the macro language being
> too weak! I don't understand why people are criticizing HLA ---
> what is the downside to having a powerful macro language?

As Frank noted, HLA is really a high-level language with an assembly
appearance. NASM has a powerful macro language. See my
comments above.

In your other post,

> Is it true that most people use NASM?

No idea. I prefer NASM since it has fewer syntax issues versus MASM
or GAS. I always struggle to figure out the correct syntax for MASM
(or clones like WASM, JWasm, etc.) GAS doesn't generate some x86
instructions and GAS syntax causes some confusion too. There are also
certain forms of instructions that neither will generate, but NASM will.
I don't like having to create instructions using DB's. People use FASM
and YASM too which are NASM syntax based, IIRC.

Rod Pemberton

Hugh Aguilar

Oct 27, 2011, 3:57:16 PM

On Oct 27, 9:11 am, Frank Kotler
<fbkot...@nospicedham.myfairpoint.net> wrote:
> cmp eax, ebx "ok" (to use your example)
> jz L1 "???"
>
> If "L1" has been "seen", fine. If it's a forward reference... put in a
> placeholder? Will a "short" jump fit, or will we need "near"? We may
> need to relocate any following code, once we find out where "L1" is. In
> this case, is it still "incremental"? I dunno.

Well, I tried this out in SwiftForth:

code flag ( n -- flag ) ok
ebx ebx or 0<> if -1 # ebx mov then ok
ret end-code ok
9 flag . -1 ok
0 flag . 0 ok
see flag
46E89F EBX EBX OR 09DB
46E8A1 46E8A8 JZ 7405
46E8A3 -1 # EBX MOV BBFFFFFFFF
46E8A8 RET C3 ok

SwiftForth seems to just be using short jumps by default. I don't know
what would happen if your function was so big that it required a near
jump; maybe get an error message. If I was writing the assembler, I
would be fine with this being an error. When I wrote my 65c02 Forth
and assembler, I required forward jumps to be short and only in a very
few cases was this a problem, but I just factored the function rather
than "fix" the assembler. Another option when this error cropped up,
would be to switch to inline assembly and manually code a far jump (my
Forth wasn't standard anyway, so the occasional use of inline-assembly
wasn't a problem). For backward jumps my compiler was smart enough to
use either a short or a far jump as necessary.
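
The placeholder-and-patch scheme behind a Forth-style IF ... THEN can be sketched in a few lines of Python. This is only a guess at the general technique, not SwiftForth's actual implementation; the class Asm and its method names are invented here, and it handles just the four instructions needed for FLAG. A forward jump that will not fit in a signed byte is simply reported as an error.

```python
import struct


class Asm:
    """Toy one-pass assembler with short forward jumps patched after the fact."""

    def __init__(self):
        self.code = bytearray()
        self.fixups = []                 # offsets of rel8 bytes awaiting a target

    def or_ebx_ebx(self):
        self.code += b"\x09\xDB"         # OR EBX, EBX

    def mov_ebx_imm(self, value):
        # MOV EBX, imm32 (BB id); mask so negative values encode as two's complement
        self.code += b"\xBB" + struct.pack("<I", value & 0xFFFFFFFF)

    def ret(self):
        self.code += b"\xC3"             # RET

    def if_nonzero(self):
        """Emit JZ rel8 with a placeholder displacement and remember where it is."""
        self.code += b"\x74\x00"
        self.fixups.append(len(self.code) - 1)

    def then(self):
        """Resolve the most recent open forward jump to the current position."""
        slot = self.fixups.pop()
        disp = len(self.code) - (slot + 1)   # distance from the end of the JZ
        if disp > 127:
            raise ValueError("forward jump out of short range; factor the function")
        self.code[slot] = disp


a = Asm()
a.or_ebx_ebx()
a.if_nonzero()
a.mov_ebx_imm(-1)
a.then()
a.ret()
print(a.code.hex())
```

The output, 09db7405bbffffffffc3, matches the bytes in the SEE listing above (09DB, 7405, BBFFFFFFFF, C3). Because the displacement is relative, nothing here depends on where the code ends up in memory.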

This issue has never been important to me. In Forth, programmers are
strongly urged to factor the heck out of their programs. We don't
generally have lengthy functions; we have a lot of very short
functions. Forth allows interactive testing, so it makes sense to have
short functions that do only one thing and which do not access global
data but whose results are entirely dependent upon the input
parameters. It is possible to test each function by trying it out with
some input parameters (a few typical parameters, the edge cases and
the illegal parameters), and then you are reasonably confident that
your function is good and you don't have to worry about it anymore.

I have seen both C and traditional assembly programs in which the
functions were gigantic. A lot of such programmers will give lip
service to the idea of factoring a program into small functions, but
they don't do this in practice. They can't interactively test their
functions at the console, so having small functions doesn't benefit
them. They run their entire program through a debugger and step
through their code, so it doesn't really matter if the code is
organized as a few big functions or a lot of small functions. The only
advantage of small functions is that they make for handy breakpoint
targets, and it is possible to step over a function call rather than
step through all the code in the function, but neither of these
advantages are apparently important enough to induce the C and
traditional assembly programmers to factor their programs --- because
I never see them do it!

I hate debuggers! I haven't used a debugger or an ICE this century,
and I don't plan on doing so again in my lifetime. I'm really enamored
of functional programming, in which small functions are stand-alone
and can be tested interactively. In my 65c02 Forth I provided source-
level debugging; I could single-step through programs, step over
function calls, set breakpoints, etc. I wrote that 20 years ago
though. I know more about programming now, and I no longer have any
need for that kind of thing.

> So Luxasm and RosAsm are, for practical purposes, "dead". If Betov is
> still alive, he'll be angry to see RosAsm and HLA mentioned on the same
> page (same universe!). If not, he'll be rolling over in his grave. Tough!

I did notice that RosAsm and HLA seem to be very different, and might
have difficulty coexisting in the same universe. HLA is all about the
macro language, and RosAsm is all about simplicity in the sense of
eschewing macros. I think that I'm a lot more on the HLA side of the
fence, which is another reason why I didn't feel inspired to
investigate RosAsm further.

> My "objection" to HLA is that I feel that teaching a beginner
> "stdout.puts("hello, world");" and claiming that you're teaching them
> "assembly language" is... "misleading" (nicest word I can think of).

Realistically, how many beginning programmers (or advanced
programmers, for that matter), are going to write their own console
interface code? Not me! I want to get my program completed before the
end of the century, so I don't want to spend a lot of time on low-
level stuff like this --- I'm happy to let somebody else do that for
me --- if Randy Hyde and his students are volunteering to provide that
stuff for free, then I'll take it.

I think that HLA and its macros are great. I also think that Forth and
its compile-time code is great. I also remain unemployed...

I have worked as a programmer in the past. I was often told that my
code had to look just like everybody else's code. The goal was that,
if I got run over by a bus while walking to work, the company could
hire another programmer who could sit down at my desk and look at my
code and immediately understand it and begin working on it. What this
means essentially, is that I can't use macros. That other programmer
doesn't want to look at my code and see a macro being used, and not
know what it does --- he just wants to see MOV and LEA and so forth,
that he is familiar with. This is why Forth is not used commercially
--- because Forthers use a lot of macros (immediate words) --- one
Forth program doesn't look very much like another Forth program, as
the language has been made to fit the application. By comparison, C
doesn't have macros (except for #define which is so simplistic that it
doesn't really count), so all C programs look alike.

I have worked as an assembly language programmer. My experience is
that the company typically has an approved library of macros that were
written by some genius (the boss describes himself as such), but the
ordinary programmers (me) are strongly discouraged from writing macros
of their own. The rationale is that they aren't smart enough to write
macros, and that doing so is outside of their paygrade. The real
reason is what I described above, in regard to what will happen if the
programmer gets run over by a bus (or, more likely, annoys the boss
and gets fired).

I like macros though! As for getting run over by a bus, I avoid that
by looking both ways before crossing the street. As for annoying the
boss, that can also be avoided. Whenever the boss begins talking, the
best thing to do is shut one's mouth, grin like an idiot, and bob
one's head up and down in wholehearted agreement --- I've seen this
done successfully, but have never tried it myself.

> Since you're looking for a powerful macro language, it may be just what
> you want. It won't do "incremental assembly", but it's open source. If
> you can figure out what to do with forward references, perhaps it could
> be modified...

Requiring short jumps should be adequate --- that is pretty simple ---
and it is a draconian way to force programmers to factor their
functions down to reasonable size.

If that doesn't work, requiring near jumps is equally simple, although
it results in a performance hit.

> http://tech.groups.yahoo.com/group/aoaprogramming/
>
> If that's the mailing list you tried to join, and got no response...
> maybe try again. Maybe try a different "reason" for wanting to join. They
> didn't do that when I joined, or I would have had to say, "I'm a spy
> from Nasm!" :) I think it's just an anti-spam measure - shouldn't matter
> what you say, as long as you say something. It *may* be that the "owner"
> of the group (not Randy, AFAIK) is 404, but the group is still "active"
> (not very) - Randy posted just the other day! I don't recall seeing many
> posts from "new" members, mostly the "regulars", so there might be a
> problem joining(?). We could inquire, if you continue to have trouble
> with it - get back to us.

What does 404 mean?

If it means that he is on vacation, then that is the most likely
explanation. I'll give them another week and if there is no response
I'll get back to youse.

io_x

Oct 28, 2011, 2:55:47 AM

"Hugh Aguilar" <hughag...@nospicedham.yahoo.com> wrote in message
news:23542726-db97-4637...@u37g2000prh.googlegroups.com...

On Oct 27, 9:11 am, Frank Kotler
<fbkot...@nospicedham.myfairpoint.net> wrote:

#io_x
#disclaimer: I'm only a hobby programmer, so what I say may easily be wrong
#I have seen only a small part of the programming field
#[I have never seen a big computer or a mainframe, or many HLLs such as Perl etc.]
#so it is possible that everything I say is wrong

> cmp eax, ebx "ok" (to use your example)
> jz L1 "???"
>
> If "L1" has been "seen", fine. If it's a forward reference... put in a
> placeholder? Will a "short" jump fit, or will we need "near"? We may
> need to relocate any following code, once we find out where "L1" is. In
> this case, is it still "incremental"? I dunno.

Well, I tried this out in SwiftForth:

code flag ( n -- flag ) ok
ebx ebx or 0<> if -1 # ebx mov then ok
ret end-code ok
9 flag . -1 ok
0 flag . 0 ok
see flag
46E89F EBX EBX OR 09DB
46E8A1 46E8A8 JZ 7405
46E8A3 -1 # EBX MOV BBFFFFFFFF
46E8A8 RET C3 ok

#this is more concise: b==ebx [don't know 100% if it is equivalent]
# b|=b|jz .a|b=-1
#.a: ret
#if a macro-asm has 1-char register names (e.g. a, b, c, etc.),
#allows multiple instructions per line,
#uses the C-language assignment operators =, |=, ^=, etc.,
#and uses a reduced set of the most important CPU instructions,
#then it is an OK language for me

SwiftForth seems to just be using short jumps by default. I don't know
what would happen if your function was so big that it required a near
jump; maybe get an error message. If I was writing the assembler, I
would be fine with this being an error. When I wrote my 65c02 Forth
and assembler, I required forward jumps to be short and only in a very
few cases was this a problem, but I just factored the function rather
than "fix" the assembler.
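A minimal sketch of that "short jumps only, otherwise an error" policy, written in Python rather than Forth and assuming the 32-bit x86 `JZ rel8` encoding (opcode 0x74). The function name `encode_jz_short` is made up for illustration. Fed the addresses from the SwiftForth listing above, it reproduces the bytes 74 05:

```python
def encode_jz_short(jump_addr: int, target_addr: int) -> bytes:
    """Encode x86 `JZ rel8`; the offset is relative to the next instruction."""
    rel = target_addr - (jump_addr + 2)  # the short form is 2 bytes long
    if not -128 <= rel <= 127:
        # The policy described above: refuse rather than silently widen.
        raise ValueError(f"target out of short-jump range: {rel:+d} bytes")
    return bytes([0x74, rel & 0xFF])


# JZ at 46E8A1 targeting the RET at 46E8A8, as in the listing.
print(encode_jz_short(0x46E8A1, 0x46E8A8).hex())  # 7405
```

An assembler built this way would presumably report the error and leave it to the programmer to factor the function.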

#I like big functions that do everything
#and I know that it should not be so;
#I think there are some functions that have to be big, too,
#for example the ones that control all events
#in a window program [hook functions?]

Another option when this error cropped up,
would be to switch to inline assembly and manually code a far jump (my
Forth wasn't standard anyway, so the occasional use of inline-assembly
wasn't a problem). For backward jumps my compiler was smart enough to
use either a short or a far jump as necessary.

#for me backward jumps and forward jumps have the same importance
#and in my asm code there could be the same number of each,
#even if I have not counted them all

This issue has never been important to me. In Forth, programmers are
strongly urged to factor the heck out of their programs. We don't
generally have lengthy functions; we have a lot of very short
functions. Forth allows interactive testing, so it makes sense to have
short functions that do only one thing and which do not access global
data but whose results are entirely dependent upon the input
parameters. It is possible to test each function by trying it out with
some input parameters (a few typical parameters, the edge cases and
the illegal parameters), and then you are reasonably confident that
your function is good and you don't have to worry about it anymore.

I have seen both C and traditional assembly programs in which the
functions were gigantic. A lot of such programmers will give lip
service to the idea of factoring a program into small functions, but
they don't do this in practice. They can't interactively test their
functions at the console, so having small functions doesn't benefit
them.

#every function can be tested in C or in assembly too, using some
#arg values; I do that for each assembly function I write

They run their entire program through a debugger and step
through their code,

#it is enough to step through the new function only

so it doesn't really matter if the code is
organized as a few big functions or a lot of small functions. The only
advantage of small functions is that they make for handy breakpoint
targets, and it is possible to step over a function call rather than
step through all the code in the function,

#yes small functions are easier to follow than big ones

but neither of these
advantages are apparently important enough to induce the C and
traditional assembly programmers to factor their programs --- because
I never see them do it!

I hate debuggers! I haven't used a debugger or an ICE this century,
and I don't plan on doing so again in my lifetime. I'm really enamored
of functional programming, in which small functions are stand-alone
and can be tested interactively. In my 65c02 Forth I provided source-
level debugging; I could single-step through programs, step over
function calls, set breakpoints, etc. I wrote that 20 years ago
though. I know more about programming now, and I no longer have any
need for that kind of thing.

#programming is about executing simple instructions
#so the key is to see them execute; it does not matter whether in a debugger
#or in the mind's-eye debugger

> So Luxasm and RosAsm are, for practical purposes, "dead". If Betov is
> still alive, he'll be angry to see RosAsm and HLA mentioned on the same
> page (same universe!). If not, he'll be rolling over in his grave. Tough!

I did notice that RosAsm and HLA seem to be very different, and might
have difficulty coexisting in the same universe. HLA is all about the
macro language, and RosAsm is all about simplicity in the sense of
eschewing macros. I think that I'm a lot more on the HLA side of the
fence, which is another reason why I didn't feel inspired to
investigate RosAsm further.

> My "objection" to HLA is that I feel that teaching a beginner
> "stdout.puts("hello, world");" and claiming that you're teaching them
> "assembly language" is... "misleading" (nicest word I can think of).

Realistically, how many beginning programmers (or advanced
programmers, for that matter), are going to write their own console
interface code? Not me! I want to get my program completed before the
end of the century,

#I do not agree; with the right macro-asm language
#writing code could be as fast as, or faster than,
#in any HLL;
#macro-asm scales well for me from easy to difficult problems
#I do not have strong evidence for that,
#but the time it takes me to write a routine in C
#is the same as for the asm one;
#only the .asm programs seem very good to me:
#when I see them in a debugger or see them run
#I say this is the way, this is very good
#and I do not say that about the C/C++ programs

so I don't want to spend a lot of time on low-
level stuff like this

#programming begins with low-level stuff;
#each program needs this low-level stuff
#and it is not so easy
#it is not like calling one function...

--- I'm happy to let somebody else do that for
me --- if Randy Hyde and his students are volunteering to provide that
stuff for free, then I'll take it.

I think that HLA and its macros are great. I also think that Forth and
its compile-time code is great. I also remain unemployed...

#I'm sorry about that; programming is a hobby for me, but I see
#that people like math through the computer better than from a book;
#so I too try to teach some programming language, or to do
#some exercises with them using Excel or some other spreadsheet.
#The problem for HLL people is that they have no control over the hardware
#and in practice they think too little about the problem and about the CPU
#or virtual CPU

#the world is going into a dark age
#the only way out of it is to pray and hope that each nation
#stops the import-export of goods or puts duties on imported goods

I have worked as a programmer in the past. I was often told that my
code had to look just like everybody else's code. The goal was that,
if I got run over by a bus while walking to work, the company could
hire another programmer who could sit down at my desk and look at my
code and immediately understand it and begin working on it. What this
means essentially, is that I can't use macros. That other programmer
doesn't want to look at my code and see a macro being used, and not
know what it does --- he just wants to see MOV and LEA and so forth,
that he is familiar with. This is why Forth is not used commercially
--- because Forthers use a lot of macros (immediate words) --- one
Forth program doesn't look very much like another Forth program, as
the language has been made to fit the application. By comparison, C
doesn't have macros (except for #define which is so simplistic that it
doesn't really count), so all C programs look alike.

#this is the first point: macros are only for building *one* language.
#macros cannot vary from one program to another, because people need to
#be familiar with *one* language
#[the lines above are not about macros like "#define name 16" but about macros
#that define the language]

I have worked as an assembly language programmer. My experience is
that the company typically has an approved library of macros that were
written by some genius (the boss describes himself as such), but the
ordinary programmers (me) are strongly discouraged from writing macros
of their own.

The rationale is that they aren't smart enough to write
macros, and that doing so is outside of their paygrade.

#the key is to see
#who wrote the better language [set of macros], the boss or you?
#there should be some time to test and
#see which language is better...

wolfgang kern

Oct 30, 2011, 4:46:52 AM

Frank answered Phil:

>> Hugh Aguilar <hughag...@nospicedham.yahoo.com> writes:
>>> Does any assembler support incremental assembly? What I mean by
>>> "incremental assembly" is that I can assemble a function and the
>>> generated code will get tacked onto the end of the code that has
>>> already been generated.
>>
>> Paging Beth Stone...
>>
>> Phil
>
> :)

> This gets into a long story, Hugh! I made the statement, "Even though I'm
> a command-line guy, I think a port of RosAsm to Linux would be cool."
> Betov (author of RosAsm) replied, "Okay, Frank, you're the chief
> maintainer of Luxasm, the Linux port of RosAsm." Although I had no
> intention of writing the thing, I thought the name "Luxasm" was so great
> that I started a project at SourceForge. Beth signed up for it, as did
> CRChafer ("C"), YeohHS, and a few others. Beth thought that "incremental
> assembly" would be good. I thought it was impractical, but C (the only one
> who ever actually wrote any code) thought we might be able to do it. Then
> C announced that he didn't have time to continue with it, and Beth dropped
> out of sight entirely, so Luxasm is in a "medically induced coma"
> (indistinguishable from "dead").

> cmp eax, ebx "ok" (to use your example)
> jz L1 "???"

> If "L1" has been "seen", fine. If it's a forward reference... put in a
> placeholder? Will a "short" jump fit, or will we need "near"?

I now remember Beth's idea about an (almost) immediate compilation for
LuxAsm. Forward references (and data not yet defined) will
call for 'opcode prototypes', which must use the longer form for forward branches.

The idea of 'compile while typing' will not only save on compile time,
it also could show the address-offset and so tell the programmer about
alignment, cache-line crossing and code size on the fly.
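As a rough illustration of the kind of feedback such an editor could show per instruction, here is a small Python sketch. The 64-byte cache-line size and the name `placement_info` are assumptions made for the example, not anything LuxAsm or RosAsm is known to have implemented:

```python
CACHE_LINE = 64  # assumed line size; it varies by CPU


def placement_info(addr: int, size: int, line: int = CACHE_LINE) -> dict:
    """Report where an instruction of `size` bytes at `addr` sits in its cache line."""
    first_line = addr // line
    last_line = (addr + size - 1) // line
    return {
        "offset_in_line": addr % line,
        "crosses_line": first_line != last_line,
    }


# A 5-byte instruction starting 2 bytes before a line boundary straddles it.
print(placement_info(0x3E, 5))
print(placement_info(0x40, 5))
```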

And even if this may not really produce optimised jump sizes,
I still find this a great idea.
OTOH, isn't fw-branching avoidable and called 'spaghetti' anyway? :)

> We may need to relocate any following code, once we find out where
> "L1" is. In this case, is it still "incremental"? I dunno.

> So Luxasm and RosAsm are, for practical purposes, "dead". If Betov is
> still alive, he'll be angry to see RosAsm and HLA mentioned on the same
> page (same universe!). If not, he'll be rolling over in his grave. Tough!

and I'll keep my mouth shut about HLA for today ;)

RosAsm is really a good tool, unfortunately made for windoze only
and so very rarely used by me.

[...]
__
wolfgang

wolfgang kern

Oct 30, 2011, 5:00:48 AM

"io_x" asked:
...

> i wrote:
> jmp .1
> .a: jmp .z ; jmp seem always "near"
> .1: ....
> .b: je .a
> many instructions
> .z:

> but where is the problem in using all jump near?

A jcc taking either 6 bytes or just 2 bytes may be seen as a size issue.
[an unconditional near jmp needs 5 bytes]

Your 'jump over the near jump' trick was once needed, when conditional
branches existed only in the two-byte form.
Jumps eat CPU resources, so better use 'em less! :)

Meanwhile we also have 32-bit displacements for conditional jumps,
in addition to the 8-bit versions.
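For reference, the sizes mentioned above can be checked with a few lines of Python. This sketch assumes 32-bit mode x86 encodings (short `jcc` = 0x70+cc rel8, near `jcc` = 0F 80+cc rel32, `jmp` = EB rel8 or E9 rel32); the helper names are invented for the example:

```python
import struct

CC_Z = 0x4  # condition code for JZ/JE


def jcc_short(cc: int, rel: int) -> bytes:
    return bytes([0x70 | cc]) + struct.pack("<b", rel)


def jcc_near(cc: int, rel: int) -> bytes:
    return bytes([0x0F, 0x80 | cc]) + struct.pack("<i", rel)


def jmp_short(rel: int) -> bytes:
    return bytes([0xEB]) + struct.pack("<b", rel)


def jmp_near(rel: int) -> bytes:
    return bytes([0xE9]) + struct.pack("<i", rel)


for name, code in [
    ("jz short", jcc_short(CC_Z, 5)),
    ("jz near", jcc_near(CC_Z, 5)),
    ("jmp short", jmp_short(5)),
    ("jmp near", jmp_near(5)),
]:
    print(f"{name:10} {len(code)} bytes  {code.hex()}")
```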

__
wolfgang

Rod Pemberton

Oct 30, 2011, 2:18:00 PM

"Frank Kotler" <fbko...@nospicedham.myfairpoint.net> wrote in message
news:j8bsg4$vsn$1...@speranza.aioe.org...
> [snip]

>
> cmp eax, ebx "ok" (to use your example)
> jz L1 "???"
>
> If "L1" has been "seen", fine. If it's a forward reference... put in a
> placeholder? Will a "short" jump fit, or will we need "near"? We
> may need to relocate any following code, once we find out where
> "L1" is. In this case, is it still "incremental"? I dunno.
>

Yes, I'd say it's incremental, but with an unresolved forward reference ...

If the placeholder allocates the largest possible branch size padded with
NOPs, it can be patched once its location is determined, even in a
single-pass assembler. If the resulting branch has a large offset, all the
NOPs will be overwritten. If the resulting branch has a small offset, you
get a NOP or a few, after the branch. That way, the space allocated for
the branch and/or padding is of a fixed size and won't require relocating
code that comes later.
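A minimal Python sketch of the NOP-padded placeholder idea, assuming 32-bit x86 where the largest conditional branch is the 6-byte `jcc rel32`. The names `emit_placeholder` and `patch_jcc` are hypothetical:

```python
import struct

NOP = 0x90
SLOT = 6  # bytes reserved per forward branch: the size of `jcc rel32`


def emit_placeholder(code: bytearray) -> int:
    """Reserve a fixed-size slot of NOPs and return its position."""
    pos = len(code)
    code += bytes([NOP]) * SLOT
    return pos


def patch_jcc(code: bytearray, pos: int, cc: int, target: int) -> None:
    """Fill the slot at `pos` once `target` is known; nothing after it moves."""
    rel8 = target - (pos + 2)
    if -128 <= rel8 <= 127:
        # Short form: the remaining 4 bytes of the slot stay as NOPs.
        code[pos:pos + 2] = bytes([0x70 | cc, rel8 & 0xFF])
    else:
        # Near form: overwrites every NOP in the slot.
        rel32 = target - (pos + SLOT)
        code[pos:pos + SLOT] = bytes([0x0F, 0x80 | cc]) + struct.pack("<i", rel32)


code = bytearray()
slot = emit_placeholder(code)      # jz L1, target not yet known
code += bytes([NOP]) * 3           # stand-in for the code being skipped
patch_jcc(code, slot, 0x4, len(code))
print(code.hex())
```

The trade-off is the one described above: a few wasted bytes (and executed NOPs on the fall-through path) in exchange for never shifting code.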

Rod Pemberton

Hugh Aguilar

Oct 31, 2011, 1:43:30 AM

On Oct 30, 12:18 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "Frank Kotler" <fbkot...@nospicedham.myfairpoint.net> wrote in message

What is the big deal about relocating code? There is no reason why
code can't be relocated in a single-pass compiler. You can have a fix-
up done after the function has been compiled that converts any short
jumps to near jumps by moving the latter code to make room for the
larger jump instruction. This will have to be done repeatedly until no
more can be done, because changing one short into a near increases the
size of the function and may cause another short to need to be changed
to a near. Make sure that, if the function has multiple entry points
(I never do that, but some people write such ugly code), that the
addresses of the entry points get adjusted as necessary.
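The "repeat until no more can be done" fix-up can be sketched roughly as follows, in Python. This assumes the function is held as a list of items rather than raw bytes, that every jump is a conditional one (2 bytes short, 6 bytes near), and that all names (`layout`, `relax_grow`, the item tuples) are invented for the example:

```python
SHORT, NEAR = 2, 6


def layout(items, sizes):
    """Return (address of each item, address of each label) for given jump sizes."""
    addrs, labels, pc = [], {}, 0
    for i, (kind, arg) in enumerate(items):
        addrs.append(pc)
        if kind == "label":
            labels[arg] = pc
        elif kind == "code":
            pc += arg          # arg = number of bytes of ordinary code
        else:                  # "jcc": arg = target label
            pc += sizes[i]
    return addrs, labels


def relax_grow(items):
    """Start with every jump short; widen those out of range until stable."""
    sizes = {i: SHORT for i, (kind, _) in enumerate(items) if kind == "jcc"}
    changed = True
    while changed:
        changed = False
        addrs, labels = layout(items, sizes)
        for i, size in sizes.items():
            if size == SHORT:
                rel = labels[items[i][1]] - (addrs[i] + SHORT)
                if not -128 <= rel <= 127:
                    sizes[i] = NEAR
                    changed = True
    return sizes


# The first jump fits at first, but stops fitting once the second one is widened.
items = [("jcc", "L1"), ("code", 100), ("jcc", "L2"), ("code", 22),
         ("label", "L1"), ("code", 200), ("label", "L2")]
print(relax_grow(items))
```

Because jumps only ever grow, the loop is guaranteed to terminate. Label addresses, including any extra entry points, come out of the final `layout` call rather than being adjusted by hand.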

The only way that this could fail, is if the function was self-
modifying --- if the programmer was doing that though, then he
deserves bugs.

io_x

Oct 31, 2011, 3:04:54 AM

"Rod Pemberton" <do_no...@noavailemail.cmm> wrote in message
news:j8k4bf$pe4$1...@speranza.aioe.org...

Possibly I do not understand the problem of jumps and labels,
but does nobody think it is better to put each block of label-addressable
code in a region of memory reachable through one pointer?

So the executable would be held in an array of pointers,
with a final pass to calculate the right minimum distance
for jumps. Something like:

.1: instruction1
instruction2
instruction3
.2: instruction4
instruction5
je .1
instruction6
code

pointer1Label-> "HexTraslate(instruction1)...HexTraslate(instruction3)"
pointer2Label-> "HexTraslate(instruction4)..."HexTraslate(je)PointerLabel1"..."
...

Use pointer1Label and pointer2Label to calculate the minimum distance needed for the jump
in ..."HexTraslate(je)PointerLabel1"
and for all other jumps.

I think it would take 2 or 3 passes to correct errors caused by the jumps added.
All of the above might not work, so it could be better to use only near jumps
or to adjust too-far jumps manually.

Hugh Aguilar

Oct 31, 2011, 4:45:18 PM

On Oct 30, 11:43 pm, Hugh Aguilar
<hughaguila...@nospicedham.yahoo.com> wrote:

> This will have to be done repeatedly until no
> more can be done, because changing one short into a near increases the
> size of the function and may cause another short to need to be changed
> to a near.

This multiple-pass on the fix-up is only necessary if you are
supporting spaghetti code. If your assembler only supports structured
programming, then you can be sure that there are no jumps into or out
of the middle of a control structure. All the jumps and destinations
inside of your block of code are local to that block of code, so the
entire block can be relocated and you don't have to worry that any of
those jumps got messed up. You just fix-up a single forward jump at a
time, and you do this at the time that you have gotten to your
destination address --- this is very simple --- structured programming
is a very good thing!

Hugh Aguilar

Oct 31, 2011, 5:01:03 PM

On Oct 27, 1:57 pm, Hugh Aguilar <hughaguila...@nospicedham.yahoo.com>
wrote:

> On Oct 27, 9:11 am, Frank Kotler

> >http://tech.groups.yahoo.com/group/aoaprogramming/
>
> > If that's the mailing list you tried to join, and got no response...
> > maybe try again. Maybe try a different "reason" for wanting to join. They
> > didn't do that when I joined, or I would have had to say, "I'm a spy
> > from Nasm!" :) I think it's just an anti-spam measure - shouldn't matter
> > what you say, as long as you say something. It *may* be that the "owner"
> > of the group (not Randy, AFAIK) is 404, but the group is still "active"
> > (not very) - Randy posted just the other day! I don't recall seeing many
> > posts from "new" members, mostly the "regulars", so there might be a
> > problem joining(?). We could inquire, if you continue to have trouble
> > with it - get back to us.
>
> What does 404 mean?
>
> If it means that he is on vacation, then that is the most likely
> explanation. I'll give them another week and if there is no response
> I'll get back to youse.

Well, it has been 4 days, which is not quite a week, but I'm getting
impatient. I don't want to commit to using HLA until I can be sure
that there is support available. I could just go with NASM, which
seems to be more mainstream --- I can just ask my dumb questions here
on this forum, and there will be plenty of youse who can answer them.
Is there any way to contact the HLA group owner though, to find out if
I can get on their mailing list or not? I really would prefer to use
HLA than NASM if I can.

I told them that my reason for being interested in HLA was because it
is big on compile-time code, which I thought would make HLA similar to
Forth which is also big on compile-time code. Forth does have a bad
reputation though --- a lot of people think that Forthers are
inherently incompetent and unteachable, and not worth talking to ---
it is possible that they rejected me because of this reason. If so,
then that is their decision --- I would like to be told what is going
on though.

Nathan Baker

Oct 31, 2011, 5:27:08 PM

The group owner can be contacted here:

aoaprogramming-owner [at] yahoogroups [dot] com

Nathan.

Nathan Baker

Oct 31, 2011, 7:54:33 PM

There also exist a few other active forums where you can ask questions about
HLA:

The 'HLA Forum' subforum at the MASM32 board:

http://www.masm32.com/board/index.php?board=9.0

The 'High Level Languages' subforum at the Flat Assembler board:

http://board.flatassembler.net/forum.php?f=19

Also, feel free to ask such questions either here (in CLAX) or the
'alt.lang.asm' group -- UseNet is still a viable support network.

Nathan.

Rod Pemberton

Oct 31, 2011, 7:52:31 PM

"Hugh Aguilar" <hughag...@nospicedham.yahoo.com> wrote in message
news:aaab2dc1-e970-4f8c...@h23g2000pra.googlegroups.com...

> On Oct 30, 12:18 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > "Frank Kotler" <fbkot...@nospicedham.myfairpoint.net> wrote in message
> > news:j8bsg4$vsn$1...@speranza.aioe.org...
>
> > [snip]
>
> > > cmp eax, ebx "ok" (to use your example)
> > > jz L1 "???"
>
> > > If "L1" has been "seen", fine. If it's a forward reference... put in a
> > > placeholder? Will a "short" jump fit, or will we need "near"? We
> > > may need to relocate any following code, once we find out where
> > > "L1" is. In this case, is it still "incremental"? I dunno.
>
> > Yes, I'd say it's incremental, but with an unresolved forward reference
> ...
>
> > If the placeholder allocates the largest possible branch size padded
> > with NOPs, it can be patched once its location is determined, even
> > in a single-pass assembler. If the resulting branch has a large offset,
> > all the NOPs will be overwritten. If the resulting branch has a small
> > offset, you get a NOP or a few, after the branch. That way, the space
> > allocated for the branch and/or padding is of a fixed size and won't
> > require relocating code that comes later.
>
>

> What is the big deal about relocating code?
>

As long as all the code only uses relative addressing, no absolute
addressing, and has fixed sizes in bytes for control-flow offsets, then
there is no issue with shifting code to assemble it. I.e., relative
addressing and fixed sized offsets. But, that's not the case for most x86
code. 64-bit x86 can be mostly relative addressing.

x86 jumps and branches use relative addressing and have offsets which can be
of different sizes. Shifting the code can change the size of the offset in
bytes and the offset's value. If you shift the position of the code,
because one instruction consumes more bytes than expected, then you must
recalculate all other labels that come after the shift, no matter where they
are used, including references from earlier in the code.

E.g., let's look at this common scenario:

jz label1
...
jc label2
...
...
label1:
...
label2:

So, your assembler left one byte of space for "label1" offset at the jz
instruction. Now, your assembler is at label1. It decides to patch the
"label1" offset at the jz instruction. It needs two bytes instead of one.
It shifts all code after "jz label1" by one, computes and patches the offset
for label1 using label1's shifted location. So far, so good. Now, your
assembler is at label2. It decides to patch the "label2" offset at the jc
instruction. It too needs two bytes instead of one. It shifts all code
after "jc label2" by one byte. Now, you've got problems. The shifting of
the code to fix the size of label2's offset moved the location of label1.
The already patched "jz label1" forward reference now points to label1 minus
one byte. How do you intend to fix this now? Your assembler already
patched "jz label1" and so discarded the patch information for it.
Backtrack? Keep a table? I suspect that any solution to this issue will
result in an assembler that won't be single pass.
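The failure described above can be reproduced in a few lines of Python. This is only a simulation under assumed sizes (2-byte short and 6-byte near conditional jumps, NOPs standing in for real code), with hypothetical helper names. It models an assembler that tracks still-pending jumps but forgets a jump once it has been patched:

```python
import struct


def widen(code: bytearray, pos: int, cc: int, target: int) -> int:
    """Grow the 2-byte short jcc at `pos` into a 6-byte near jcc.

    Everything after `pos` moves up by 4 bytes, so the target does too.
    Returns the target's new address.
    """
    target += 4
    code[pos:pos + 2] = bytes([0x0F, 0x80 | cc]) + struct.pack("<i", target - (pos + 6))
    return target


def near_target(code: bytearray, pos: int) -> int:
    """Decode where the near jcc at `pos` currently points."""
    (rel,) = struct.unpack_from("<i", code, pos + 2)
    return pos + 6 + rel


code = bytearray()
jz_pos = len(code); code += b"\x74\x00"      # jz label1 (offset unknown)
code += b"\x90" * 100
jc_pos = len(code); code += b"\x72\x00"      # jc label2 (offset unknown)
code += b"\x90" * 100

# Reached label1: too far for rel8, so widen and patch "jz label1".
label1 = widen(code, jz_pos, 0x4, len(code))
jc_pos += 4                                  # the pending jc moved as well
code += b"\x90" * 50

# Reached label2: widen and patch "jc label2". This shifts label1 again.
label2 = widen(code, jc_pos, 0x2, len(code))
label1 += 4                                  # where label1 really is now

print(label1 - near_target(code, jz_pos))    # 4: the jz lands 4 bytes short
```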

Basically, when using relative addressing with differently sized offsets and
absolute addressing, all code must be at fixed locations or calculation of
the addresses must be delayed until the end, i.e., two-passes or more.

In a single-pass assembler, if one uses a fixed amount of bytes for unknown
forward references large enough to prevent recomputing addresses and
offsets, you only have to keep track of a single address, the current
address being compiled. When you reach a forward reference label, you back
patch with the address or the offset needed for that address. No other
addresses or offsets need to be changed since there is no code shifting. If
you allow code shift, the exact address or offset will be different
depending on the offset size of the code being patched.
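A small Python sketch of that fixed-size approach, assuming every branch is emitted as a 6-byte `jcc rel32` so that nothing ever shifts. The class and method names are invented for illustration; besides the current address, it keeps a table of labels seen and of fields waiting for a label:

```python
import struct


class OnePass:
    """Single-pass emitter: fixed-size branches, back-patched when the label appears."""

    def __init__(self):
        self.code = bytearray()
        self.labels = {}    # name -> address, for backward references
        self.pending = {}   # name -> positions of rel32 fields awaiting the label

    def raw(self, data: bytes) -> None:
        self.code += data

    def jcc(self, cc: int, name: str) -> None:
        self.code += bytes([0x0F, 0x80 | cc])
        field = len(self.code)
        self.code += bytes(4)                      # rel32 placeholder
        if name in self.labels:
            self._patch(field, self.labels[name])  # backward: known already
        else:
            self.pending.setdefault(name, []).append(field)

    def label(self, name: str) -> None:
        here = len(self.code)
        self.labels[name] = here
        for field in self.pending.pop(name, []):
            self._patch(field, here)

    def _patch(self, field: int, target: int) -> None:
        # rel32 is relative to the end of the 4-byte field.
        struct.pack_into("<i", self.code, field, target - (field + 4))


asm = OnePass()
asm.jcc(0x4, "L1")          # forward reference
asm.raw(b"\x90" * 3)
asm.label("L1")
asm.jcc(0x4, "L1")          # backward reference
print(asm.code.hex())
```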

> There is no reason why code can't be relocated in a single-pass compiler.

See above.

> You can have a fix-up done after the function has been compiled that
> converts any short jumps to near jumps by moving the latter code to make
> room for the larger jump instruction.

See above.

> This will have to be done repeatedly until no more can be done, because
> changing one short into a near increases the size of the function and
> may cause another short to need to be changed to a near. Make sure that,
> if the function has multiple entry points (I never do that, but some
> people write such ugly code), that the addresses of the entry points
> get adjusted as necessary.

That doesn't seem to describe single pass ... Single pass means just that:
goes through the code and assembles just once.

Rod Pemberton

io_x

Nov 1, 2011, 1:48:30 AM

"io_x" <a...@nospicedham.b.c.invalid> wrote in message
news:4eae47d8$0$1377$4faf...@reader2.news.tin.it...

I have rethought it all; it would take 3 passes.
In pass I, I would always use near.
In pass II, if the jump instruction does not have an explicit "near" in it,
I would test whether the short form is OK.
If it is OK, done; continue with the other jumps.
If it is not OK, I would check whether the instruction has an explicit "short" in it:
if it has "short", then the compilation fails;
if it does not have "short", then it remains "near".

In pass III, write everything into the executable,
calculating all the label-to-jump offsets
of the code.

io_x

Nov 1, 2011, 2:00:08 AM

"io_x" <a...@nospicedham.b.c.invalid> wrote in message
news:4eaf8774$0$1392$4faf...@reader1.news.tin.it...

> I have rethought it all; it would take 3 passes.
> In pass I, I would always use near.
> In pass II, if the jump instruction does not have an explicit "near" in it,
> I would test whether the short form is OK.
> If it is OK, done; continue with the other jumps.
> If it is not OK, I would check whether the instruction has an explicit "short" in it:
> if it has "short", then the compilation fails;
> if it does not have "short", then it remains "near".
>
> In pass III, write everything into the executable,
> calculating all the label-to-jump offsets
> of the code.

but making all jumps near is simple and portable to
other CPUs

Hugh Aguilar

Nov 1, 2011, 4:50:14 AM

On Oct 31, 5:52 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:

> "Hugh Aguilar" <hughaguila...@nospicedham.yahoo.com> wrote in message

> E.g., let's look at this common scenario:
>
> jz label1
> ...
> jc label2
> ...
> ...
> label1:
> ...
> label2:

> ...

> > This will have to be done repeatedly until no more can be done, because
> > changing one short into a near increases the size of the function and
> > may cause another short to need to be changed to a near. Make sure that,
> > if the function has multiple entry points (I never do that, but some
> > people write such ugly code), that the addresses of the entry points
> > get adjusted as necessary.
>
> That doesn't seem to describe single pass ... Single pass means just that:
> goes through the code and assembles just once.

This is what I described as "spaghetti code." If only supporting
structured programming, then a single pass is sufficient. I'm not
necessarily planning on supporting spaghetti code --- it is almost
never necessary --- I can support it though, doing a multiple-pass fix-
up on jumps is a hassle but is not a deal-breaker.

Even if doing a multiple-pass on the function, the assembler/compiler
is still single-pass if it is compiling the functions one at a time in
the order that they were written. The point of this thread, if you go
back and read my original post, was that the assembler could assemble a
single function at a time --- what I call "incremental assembly" ---
the purpose of this is interactive development. I am not much
concerned with trivial details, such as fixing jumps, that are
involved in assembling that function --- so long as I can assemble a
single function at a time, incrementally.

I have said many times (on comp.lang.forth) that it is best to
assemble into an intermediate data structure (a linked list), optimize
that, and then go from there into memory as an executable. It is a
huge mistake to go straight into your final data format, as even the
most trivial optimizations such as jump sizes as discussed here,
become complicated. Almost all computer programming is conversion of
data from one format to another. I'm a big fan of doing this in
multiple stages, using intermediate data formats. That is why my
novice package has so much support for linked lists --- because they
make for a good intermediate data format most of the time.

Robert Wessel

Nov 1, 2011, 3:10:09 AM

On Tue, 1 Nov 2011 06:48:30 +0100, "io_x" <a...@nospicedham.b.c.invalid>
wrote:

It's harder than that. Assume you assemble the long form for all
branches in the first pass. Then each additional pass can look for
which of those can be replaced with a short branch. But that might
make it possible to shorten other long branches. IOW, pass N might
shorten a branch B, which lies between branch A and its target. That
may present a possibility that branch A can now also be shortened.
Thus multiple passes may be needed.

Worse, there are cases where you cannot make an optimal determination
by looking at a single branch, even if you make multiple passes.
Consider two branches that branch past each other:

a:
...
jx b
...
jy a
...
b:

If they're just the right distance apart, it may be possible to make
them both short form branches at the same time, but not individually.
IOW, if the "jx b" is the long form, then the "jy a" is out of range
for the short form, and vice versa, but if you shorten both, they're
then both in range. And that can be extended to include an
indefinitely long list of branches which could all assemble in short
or long form en masse, but not individually. The basic multiple-pass
approach can't deal with that at all (although it obviously can be
extended by tracking those potential combinations of branches).
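The mutual-branch case can be demonstrated with a short Python sketch. It assumes 2-byte short and 6-byte near conditional jumps, and picks byte counts (4, 120, 4) so that the two branches sit at exactly the awkward distance; all names here are made up for the example:

```python
SHORT, NEAR = 2, 6


def in_range(items, sizes):
    """True if every jump currently marked SHORT reaches its target with a rel8."""
    addrs, labels, pc = {}, {}, 0
    for i, (kind, arg) in enumerate(items):
        addrs[i] = pc
        if kind == "label":
            labels[arg] = pc
        else:
            pc += arg if kind == "code" else sizes[i]
    return all(-128 <= labels[items[i][1]] - (addrs[i] + SHORT) <= 127
               for i, size in sizes.items() if size == SHORT)


def shrink_from_long(items):
    """Start all-near; shorten one branch at a time while the result stays valid."""
    sizes = {i: NEAR for i, (kind, _) in enumerate(items) if kind == "jcc"}
    changed = True
    while changed:
        changed = False
        for i in list(sizes):
            if sizes[i] == NEAR:
                sizes[i] = SHORT
                if in_range(items, sizes):
                    changed = True
                else:
                    sizes[i] = NEAR   # does not fit on its own; put it back
    return sizes


# a: ... jx b ... jy a ... b:
items = [("label", "a"), ("code", 4), ("jcc", "b"), ("code", 120),
         ("jcc", "a"), ("code", 4), ("label", "b")]

print(shrink_from_long(items))                    # stuck with both near
print(in_range(items, {2: SHORT, 4: SHORT}))      # yet both short is valid
```

Starting from all-short and widening only when needed would find the short pair in this particular layout, which is one reason the starting point of the iteration matters.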

And then you're assuming that your assembler has a traditional
multi-pass structure. That's hardly required. Many single-pass
assemblers exist; those have to keep a list of addresses they need to
patch up after their (only) lexical pass. That can be extended to
support jump size optimization by continually processing the list of
fixups (which now includes both forwards and backwards jumps), by
fixing individual branches to be short or long form if it can be
guaranteed to be one form or the other, and then continuing as the list
of unresolved branch forms dwindles. That structure can also help
with the complex cases involving multiple branches (although still not
without a fair bit of work).

Another alternative is to put some limit (say 10) on ambiguous branches in
some sized object (say within a MASM-style PROC). After taking a
few conventional passes, simply try all the combinations (1024 for
10), and take the shortest. For more, display a warning, and accept
the result from the several conventional passes.
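A brute-force version of that idea might look like the following Python sketch. As a simplification it enumerates every jump in the list rather than only the ones left ambiguous by earlier passes, assumes 2-byte and 6-byte conditional jumps, and uses invented names throughout:

```python
from itertools import product

SHORT, NEAR = 2, 6


def valid(items, sizes):
    """Lay the items out with the given jump sizes; check every SHORT jump fits a rel8."""
    addrs, labels, pc = {}, {}, 0
    for i, (kind, arg) in enumerate(items):
        addrs[i] = pc
        if kind == "label":
            labels[arg] = pc
        else:
            pc += arg if kind == "code" else sizes[i]
    return all(-128 <= labels[items[i][1]] - (addrs[i] + SHORT) <= 127
               for i, size in sizes.items() if size == SHORT)


def best_sizes(items, limit=10):
    """Try all 2**n short/near assignments and keep the smallest valid one."""
    jumps = [i for i, (kind, _) in enumerate(items) if kind == "jcc"]
    if len(jumps) > limit:
        raise ValueError("too many branches to enumerate")
    candidates = (dict(zip(jumps, combo))
                  for combo in product((SHORT, NEAR), repeat=len(jumps)))
    # All-near is always valid, so there is at least one candidate.
    return min((c for c in candidates if valid(items, c)),
               key=lambda c: sum(c.values()))


# Two branches that cross each other: a: ... jx b ... jy a ... b:
items = [("label", "a"), ("code", 4), ("jcc", "b"), ("code", 120),
         ("jcc", "a"), ("code", 4), ("label", "b")]
print(best_sizes(items))
```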

In practice, a few conventional passes will cover almost all the real
cases, and the programmer can always add explicit shorts for the
remaining cases.

Rod Pemberton

Nov 1, 2011, 5:44:03 AM

"Hugh Aguilar" <hughag...@nospicedham.yahoo.com> wrote in message
news:efffb448-7c38-4609...@j36g2000prh.googlegroups.com...

> On Oct 31, 5:52 pm, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
> wrote:
> > E.g., let's look at this common scenario:
> >
> > jz label1
> > ...
> > jc label2
> > ...
> > ...
> > label1:
> > ...
> > label2:
> > ...
>

> This is what I described as "spaghetti code." If only supporting
> structured programming, then a single pass is sufficient.
>

Ok. Does this C code qualify as structured code or spaghetti code?

while(1)
{
if()
{
...
if()
break;
}
else
{
...
}
}

That is structured code: single entry, single exit. Yes? However, it has
the same issues as the assembly code above ... Convert it to assembly.
Look at where the branches and branch destinations are located. I wouldn't
doubt it if Forth has the same problem too. 0BRANCH and BRANCH place
branches at the same locations for Forth's IF-ELSE-THEN as for C's if-else.
If there is a way in Forth to exit a loop in the middle, like C's break
statement, then Forth will have the same problem. I'd have to look up some
Forth control-flow words, but I seem to recall UNLOOP ...

Rod Pemberton

Hugh Aguilar

Nov 1, 2011, 10:09:49 PM

On Nov 1, 3:44 am, "Rod Pemberton" <do_not_h...@noavailemail.cmm>
wrote:
> "Hugh Aguilar" <hughaguila...@nospicedham.yahoo.com> wrote in message

UNLOOP clears out the DO stuff from the return stack in preparation
for an EXIT out of the function. I think you meant LEAVE, that clears
out the DO stuff and jumps to just past the LOOP or +LOOP. Also, BEGIN
WHILE REPEAT loops can usually have multiple WHILE statements, all of
which jump past the REPEAT. ANS-Forth has some kind of bizarre system
that I won't go into here (and won't support in my own Forth) --- but
the more rational approach that I have implemented myself (in UR/
Forth) is to just allow multiple WHILE statements.

None of this stuff matters in regard to fixing up the jump sizes
though. At the time that you find out what the destination address is,
you know that all of those jumps go to this destination. You just fix-
up the latest one first, then the next latest one, and so on back to
the earliest one. Each time, you are guaranteed that there are no
unresolved branches in the block of code extending from the jump to
the destination. You don't have to do multiple passes --- just do the
jumps in the correct order (latest to earliest) and you are good to go
with a single pass.
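
A rough sketch of that ordering, in Python rather than Forth, might look like the following. It covers only the simple case where every pending jump is an unconditional JMP to the same destination and no other branch spans them. The function name and the list representation are hypothetical; the sizes assume 32-bit x86 (2 bytes for JMP rel8, 5 bytes for JMP rel32).

```python
SHORT_JMP = 2    # EB cb: opcode plus 8-bit displacement
NEAR_JMP = 5     # E9 cd: opcode plus 32-bit displacement
SHORT_MAX = 127  # largest forward displacement that fits in rel8

def size_jumps_to_common_exit(items):
    """Size forward jumps that all target the end of `items`.

    `items` is a list whose entries are either an int (bytes of
    straight-line code) or the string "jmp" (a pending forward jump).
    Walking from the latest item back to the earliest means everything
    between a jump and the destination is already sized when the jump
    is reached, so each decision is final.
    """
    sizes = [0] * len(items)
    distance = 0  # bytes from the end of the current item to the destination
    for i in range(len(items) - 1, -1, -1):
        if items[i] == "jmp":
            sizes[i] = SHORT_JMP if distance <= SHORT_MAX else NEAR_JMP
        else:
            sizes[i] = items[i]
        distance += sizes[i]
    return sizes

print(size_jumps_to_common_exit([10, "jmp", 100, "jmp", 20, "jmp", 30]))
```

With that input the two later jumps stay short and only the earliest one is widened, giving [10, 5, 100, 2, 20, 2, 30]. The sketch does not handle a branch that was resolved earlier and spans one of these jumps, such as an IF whose body contains the jump; that case would need a range check of its own.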

Similarly, as described elsewhere, I often have a series of IF ...
EXIT THEN statements, to give me something similar to a CASE
statement. This is easy if each EXIT just compiles a RET. Sometimes
however, there is more complicated code involved in exiting from the
function. Each EXIT does not compile this code, but instead compiles a
JMP to that code at the end of the function. Once again, all of these
jumps go to a common destination. All that is necessary is to resolve
them in the order latest to earliest. Each time, you are guaranteed
that there are no unresolved branches in the block of code extending
from the jump to the destination.

The problem that Robert Wessel was describing can only come up in
spaghetti code. Like I said before, there is not that much need to
support spaghetti code --- I can't recall the last time that I wrote
unstructured code --- and I think that it was done in some (likely
misguided) effort at optimization, and it wasn't at all necessary. I'm
really a lot more interested in a compiler that generates assembly
code, rather than hand-written assembly code --- so I can easily
require that only structured programming be done --- forget about all
of this foolishness that assembly language "hackers" do, with labels
scattered about and jumps to and fro --- none of that stuff makes
sense anyway, and it creates havoc in regard to the fix-up of the jump
sizes.

Frank Kotler

Nov 16, 2011, 12:27:48 AM

Hi Hugh,

Sorry for the delay in getting back to you. I got distracted (easily done).

Hugh Aguilar wrote:
> On Oct 27, 9:11 am, Frank Kotler
> <fbkot...@nospicedham.myfairpoint.net> wrote:
>> cmp eax, ebx "ok" (to use your example)
>> jz L1 "???"
>>
>> If "L1" has been "seen", fine. If it's a forward reference... put in a
>> placeholder? Will a "short" jump fit, or will we need "near"? We may
>> need to relocate any following code, once we find out where "L1" is. In
>> this case, is it still "incremental"? I dunno.
>
> Well, I tried this out in SwiftForth:
>
> code flag ( n -- flag ) ok
> ebx ebx or 0<> if -1 # ebx mov then ok

If this is considered an "increment", okay, but it represents multiple
lines of assembly, which would not assemble "incrementally" in themselves.

> ret end-code ok
> 9 flag . -1 ok
> 0 flag . 0 ok
> see flag
> 46E89F EBX EBX OR 09DB
> 46E8A1 46E8A8 JZ 7405
> 46E8A3 -1 # EBX MOV BBFFFFFFFF
> 46E8A8 RET C3 ok
>
> SwiftForth seems to just be using short jumps by default. I don't know
> what would happen if your function was so big that it required a near
> jump; maybe get an error message. If I was writing the assembler, I
> would be fine with this being an error. When I wrote my 65c02 Forth
> and assembler, I required forward jumps to be short and only in a very
> few cases was this a problem, but I just factored the function rather
> than "fix" the assembler.
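
The "short jumps only, otherwise an error" policy described in the quote can be sketched in a few lines of Python. The function name is hypothetical; the encoding is the standard x86 JZ rel8 (opcode 0x74), where the displacement is measured from the end of the two-byte instruction. The addresses in the usage line are the ones from the SwiftForth dump above.

```python
def encode_short_jz(jump_addr, target_addr):
    """Encode JZ rel8 placed at jump_addr, or raise if the target is too far."""
    rel = target_addr - (jump_addr + 2)  # relative to the next instruction
    if not -128 <= rel <= 127:
        raise ValueError(f"target is {rel:+d} bytes away, out of short-jump range")
    return bytes([0x74, rel & 0xFF])     # two's-complement 8-bit displacement

print(encode_short_jz(0x46E8A1, 0x46E8A8).hex().upper())
```

This prints 7405, matching the bytes shown for the JZ in the dump. An assembler that wanted to fall back to a near jump would replace the raise with the longer encoding and move the code that follows.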

Okay. I won't argue against your preference for "highly factored" code,
but it would be a poor assembler (IMO) which *forced* you to program in
a certain style.

> Another option when this error cropped up,
> would be to switch to inline assembly and manually code a far jump (my
> Forth wasn't standard anyway, so the occasional use of inline-assembly
> wasn't a problem). For backward jumps my compiler was smart enough to
> use either a short or a far jump as necessary.

Yeah, backward jumps would be much less of a problem here. To get
nit-picky, a "far" jump would involve loading a new code segment
register as well as an offset, which is a slightly different issue.

Strictly speaking, what RosAsm and Betov object to is "built in" macros.
RosAsm does (did) have macro capabilities (not too powerful, I think).
It has (had) a utility whereby you could right-click on a macro name and
see the expansion of the macro. This might overcome some of the "Boss'
objections" to using macros.

>> My "objection" to HLA is that I feel that teaching a beginner
>> "stdout.puts("hello, world");" and claiming that you're teaching them
>> "assembly language" is... "misleading" (nicest word I can think of).
>
> Realistically, how many beginning programmers (or advanced
> programmers, for that matter), are going to write their own console
> interface code?

Once you're past DOS, you pretty much have to go through the OS to put
anything on the console. "call" is a CPU instruction, "stdout.puts" is
not. If you wanna call "stdout.puts" "teaching assembly language", feel
free... and I'll feel free to disagree. :)

> Not me! I want to get my program completed before the
> end of the century, so I don't want to spend a lot of time on low-
> level stuff like this --- I'm happy to let somebody else do that for
> me --- if Randy Hyde and his students are volunteering to provide that
> stuff for free, then I'll take it.

Well, sure... but why mess with assembly language at all in that case?

> I think that HLA and its macros are great. I also think that Forth and
> its compile-time code is great. I also remain unemployed...

A lot of people are unemployed these days. Probably no connection... :)

> I have worked as a programmer in the past. I was often told that my
> code had to look just like everybody else's code. The goal was that,
> if I got run over by a bus while walking to work, the company could
> hire another programmer who could sit down at my desk and look at my
> code and immediately understand it and begin working on it. What this
> means essentially, is that I can't use macros. That other programmer
> doesn't want to look at my code and see a macro being used, and not
> know what it does --- he just wants to see MOV and LEA and so forth,
> that he is familiar with.

This is where RosAsm's "right click expansion" of the macro might come
in handy.

> This is why Forth is not used commercially
> --- because Forthers use a lot of macros (immediate words) --- one
> Forth program doesn't look very much like another Forth program, as
> the language has been made to fit the application. By comparison, C
> doesn't have macros (except for #define which is so simplistic that it
> doesn't really count), so all C programs look alike.

I don't think that everyone would agree with that. Some folks who claim
that C is "inherently" more readable, when pointed at Linux code, claim
that they can't read "that kind of C"... :)

I've got no problem with the notion that the code ought to be tailored
to fit the application in question. I'd even go so far as to say that
large functions might be appropriate in some cases.

>
> I have worked as an assembly language programmer. My experience is
> that the company typically has an approved library of macros that were
> written by some genius (the boss describes himself as such), but the
> ordinary programmers (me) are strongly discouraged from writing macros
> of their own. The rationale is that they aren't smart enough to write
> macros, and that doing so is outside of their paygrade. The real
> reason is what I described above, in regard to what will happen if the
> programmer gets run over by a bus (or, more likely, annoys the boss
> and gets fired).
>
> I like macros though! As for getting run over by a bus, I avoid that
> by looking both ways before crossing the street. As for annoying the
> boss, that can also be avoided. Whenever the boss begins talking, the
> best thing to do is shut one's mouth, grin like an idiot, and bob
> one's head up and down in wholehearted agreement --- I've seen this
> done successfully, but have never tried it myself.

Perhaps the next time you acquire a boss, you ought to give it a whirl.
If you can get into a situation where you're your own boss, that's an
even better solution. I was in that situation at one time, until I
looked in the mirror one day and said "You can take this job and shove
it!" Been retired ever since - a situation I like even better! :)

>> Since you're looking for a powerful macro language, it may be just what
>> you want. It won't do "incremental assembly", but it's open source. If
>> you can figure out what to do with forward references, perhaps it could
>> be modified...
>
> Requiring short jumps should be adequate --- that is pretty simple ---
> and it is a draconian way to force programmers to factor their
> functions down to reasonable size.

Some languages *do* attempt to force programmers into a certain style -
assembly language isn't one of 'em. Assembly language represents machine
instructions in a vaguely text-like format. How you string 'em together
is none of the language's business!

> If that doesn't work, requiring near jumps is equally simple, although
> it results in a performance hit.

Yeah... not a very big deal, but the "appropriate" size for the jump in
question is best, I think. Whether the assembler calculates this for you
or requires you to specify what you want is another question...

>> http://tech.groups.yahoo.com/group/aoaprogramming/
>>
>> If that's the mailing list you tried to join, and got no response...
>> maybe try again. Maybe try a different "reason" for wanting to join. They
>> didn't do that when I joined, or I would have had to say, "I'm a spy
>> from Nasm!" :) I think it's just an anti-spam measure - shouldn't matter
>> what you say, as long as you say something. It *may* be that the "owner"
>> of the group (not Randy, AFAIK) is 404, but the group is still "active"
>> (not very) - Randy posted just the other day! I don't recall seeing many
>> posts from "new" members, mostly the "regulars", so there might be a
>> problem joining(?). We could inquire, if you continue to have trouble
>> with it - get back to us.
>
> What does 404 mean?
>
> If it means that he is on vacation,

Possibly permanently on vacation. As you found out, the nasm-users list
at SF has the same problem...

> then that is the most likely
> explanation. I'll give them another week and if there is no response
> I'll get back to youse.

I have seen a post from an apparently "new" member on the aoa list (no
response yet), so they are apparently still accepting members. I'll see
if I can get you straightened out on the nasm-users list... but you're
pretty much talking to the same people as on the forum. It's a small
club. :)

Best,
Frank

io_x

Nov 16, 2011, 12:11:37 PM

"Frank Kotler" <fbko...@nospicedham.myfairpoint.net> wrote in message
news:j9vhjs$g7b$1...@speranza.aioe.org...

> Some languages *do* attempt to force programmers into a certain style -
> assembly language isn't one of 'em. Assembly language represents machine
> instructions in a vaguely text-like format. How you string 'em together is
> none of the language's business!

Each mapping between a "text-like format" and machine instructions
defines a language on the "text-like format" side.

Hugh Aguilar

Nov 16, 2011, 10:55:47 PM

On Nov 15, 10:27 pm, Frank Kotler
<fbkot...@nospicedham.myfairpoint.net> wrote:
> > Well, I tried this out in SwiftForth:
>
> > code flag ( n -- flag ) ok
> > ebx ebx or 0<> if -1 # ebx mov then ok

> > ret end-code ok

>
> If this is considered an "increment", okay, but it represents multiple
> lines of assembly, which would not assemble "incrementally" in themselves.

It is incremental in the sense that the program can be incrementally
grown, one function at a time --- the programmer doesn't have to
assemble and link the entire program all at once. The purpose of this
is to allow the programmer to test each function immediately after
writing it, using the data that the program has already generated.

> Okay. I won't argue against your preference for "highly factored" code,
> but it would be a poor assembler (IMO) which *forced* you to program in
> a certain style.

Well, I described in another post how the assembler can assemble the
appropriate size jump, short or near, so long as the code is
structured. This is a reasonable restriction --- spaghetti code has
been "considered harmful" since before I was born --- I can disallow
GOTO in my compiler and nobody will complain (or, at least, I won't
listen to them if they do complain).

I didn't assemble appropriate size jumps in my old 65c02 compiler, but
I forced the programmer to use small jumps and highly factored code
--- because the programmer was just myself, and I didn't mind --- if I
expected somebody other than myself to use the compiler though, then I
would make more of an effort at smartening up the compiler.

So far, I have written two Forth cross-compilers. The first was for
the 65c02, and I was the only programmer who used it. The second was
for the MiniForth processor when I was employed at Testra, and my
coworker and I were the only programmers who used it. Now I'm writing
another Forth cross-compiler and I expect (or, at least, hope) that
lots of people will use it --- definitely something new for me, to be
obliged to care about other people's opinions.

> > I don't want to spend a lot of time on low-
> > level stuff like this --- I'm happy to let somebody else do that for
> > me --- if Randy Hyde and his students are volunteering to provide that
> > stuff for free, then I'll take it.
>
> Well, sure... but why mess with assembly language at all in that case?

To a large extent, I am "messing with" assembly because I want to use
a Forth calling convention, rather than a C or Pascal calling
convention. There are several Forth calling conventions. What I am
doing is using ESP as the parameter stack pointer, so PUSH and POP can
be used to push and pop data. I use ESI as the return stack pointer.
Functions are called with CALL. The first thing that a function does,
is POP the return address into EAX. From there, EAX is pushed onto the
return stack (ESI), and at the end of the function the return address
is popped from the return stack back into EAX and a JMP is made
through EAX (the RET instruction is not used). Alternatively, the
return address can be held in EAX throughout the function's execution.
This only works with "primitive" functions that don't call any other
functions or mess up EAX in any other way (EAX is a general-purpose
register and it can get used for various purposes). Also, btw, EBX
holds the top value of the parameter stack at all times.
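
Here is a toy Python model of that convention, just to show the data movement. The class, function names and string "addresses" are all hypothetical stand-ins: lists play the role of the memory behind ESP and ESI, and a function "jumps" by returning the address it would jump to.

```python
class Machine:
    def __init__(self):
        self.pstack = []  # memory addressed by ESP (parameter stack below the top)
        self.rstack = []  # memory addressed by ESI (return stack)
        self.eax = None   # scratch register; holds the return address on entry
        self.ebx = 0      # cached top-of-parameter-stack value

    def call(self, fn, return_addr):
        self.pstack.append(return_addr)  # CALL pushes the return address via ESP
        return fn(self)                  # result is the address jumped back to

def dup(m):
    """Primitive: return address stays in EAX for the whole function."""
    m.eax = m.pstack.pop()    # POP EAX
    m.pstack.append(m.ebx)    # PUSH EBX (top of stack is still cached in EBX)
    return m.eax              # JMP EAX

def plus(m):
    """Primitive: add the second stack item into the cached top."""
    m.eax = m.pstack.pop()    # POP EAX
    m.ebx += m.pstack.pop()   # pop the second item and add it to EBX
    return m.eax              # JMP EAX

def double(m):
    """Non-primitive: must park its return address on the return stack."""
    m.eax = m.pstack.pop()    # POP EAX
    m.rstack.append(m.eax)    # save it on the ESI stack before calling others
    m.call(dup, "double+1")
    m.call(plus, "double+2")
    m.eax = m.rstack.pop()    # restore the return address
    return m.eax              # JMP EAX (no RET)

m = Machine()
m.ebx = 21
returned_to = m.call(double, "main+1")
print(returned_to, m.ebx, m.pstack, m.rstack)
```

The point of the model is that the called primitives overwrite EAX, which is why double has to save its own return address on the return stack, while dup and plus can leave theirs in EAX.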

It is possible to write a Forth system in C (Gforth and FICL are both
examples of this), but the code is a lot more complicated than an
assembly language implementation, and it is significantly slower.

HLA is better than C because it allows me to make low-level decisions,
such as global register usage and a custom calling convention. HLA is
also better than C (from what I've seen so far) because it has high-
level features such as thunks and generators that C lacks --- it
appears to be closer to ICON than to C, which makes it a pretty high-
level language --- not even C++ has this kind of stuff. From my point
of view, it is the best of both worlds. Maybe it is the worst of both
worlds though --- a lot of assembly language programmers (including
yourself) disparage it for being too high-level because you have to
learn about things like thunks and generators (or, at least, you think
that such things are better hand-written than provided as part of the
language), and a lot of high-level language programmers disparage it
for being too low-level because they have to learn about things like
registers and addressing modes --- so HLA is vilified by both high and
low!