
GAS ".code16" and DJASM quirks ...


Rugxulo

Nov 9, 2009, 11:03:14 PM
I've been messing with a small Befunge93 interpreter in assembly. I
didn't write it, just tweaked it a bit, converted it to something
besides TASM, fixed a few bugs (though some obscure ones still exist),
and shrank its size a lot. I know it's silly, but ...

http://board.flatassembler.net/topic.php?t=10810

Yesterday I made it work with DJASM (via MAKE.BAT, which uses a sed
script). Unfortunately, I couldn't manually get GAS working earlier
today. Anyways, here's what I discovered:

DJASM:

1). no "salc" (0xD6) support (though other assemblers have it except
GAS)
2). no hex numbers a la "30h" (only "0x30" ??), that's very annoying
and not Intel-friendly, IMHO, can't be hard to add support for that
can it?
3). '.type "com"' even when using ".org 0x100" seems useless, "djasm
befi.dj" still makes an .EXE unless explicitly saying "djasm befi.dj
befi.com"
4). "rep movsb" must be split to use two lines (vaguely annoying but
not the only assembler to require this, e.g. GAS or WolfASM)
5). it's also picky on "byte" or "word" overrides, which is kinda
clunky
6). jumps that are > 128 bytes away aren't automatically converted to
3-byte jumps, so they have to be manually tweaked ("jmp" -> "jmpl"),
which is annoying
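
For item 6, the range rule can be sketched in C. This is only an illustration of the x86 encoding rule (a short jump is opcode 0xEB plus one signed displacement byte, measured from the end of the instruction), not code from DJASM. The helper names `fits_short_jump` and `jump_size` are made up for this example, and 16-bit segment wraparound is ignored.

```c
#include <stdbool.h>
#include <stdint.h>

/* Instruction sizes in 16-bit mode: short jmp is EB rel8, near jmp is E9 rel16. */
enum { SHORT_JMP_SIZE = 2, NEAR_JMP_SIZE = 3 };

/* True if a jump located at offset `from` can reach `target` with an
 * 8-bit displacement. The displacement is relative to the address of
 * the byte following the 2-byte instruction. */
bool fits_short_jump(uint16_t from, uint16_t target)
{
    int32_t disp = (int32_t)target - ((int32_t)from + SHORT_JMP_SIZE);
    return disp >= -128 && disp <= 127;
}

/* Size of the encoding an assembler would have to pick
 * (hypothetical helper, not part of DJASM). */
int jump_size(uint16_t from, uint16_t target)
{
    return fits_short_jump(from, target) ? SHORT_JMP_SIZE : NEAR_JMP_SIZE;
}
```

An assembler that sized jumps automatically would apply a check like this per jump; DJASM leaves the choice to the programmer via "jmp" versus "jmpl".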

Okay, so I know it's not perfect and never was meant for too much
initially. I also know it's probably moot mentioning this, but I tried
anyways. At least it works! ;-)

While GAS now supports ".code16" and everything mostly seems to work
(almost), I have a few doubts that anybody really uses it much for 16-
bit (or maybe I did it wrong, who knows).

GAS:

1). no "salc" support (argh, oh well)
2). "0x300" required instead of "300h" (seriously?? bah)
3). ".org 0x100" doesn't seem to do anything useful (adds a bunch of
zero bytes at the beginning), nor does ".data" (adds bunch of crud to
end)
4). seems to align the code with no way to turn that off (is there?),
so it's "too big" a result (e.g. smallest .COM I got was 1536 bytes)
5). can't even directly output a flat binary (.COM) file (can it?)
except via external tools like OBJCOPY (yes) or LD (doubt it)
6). it seems to assume 0x1000 as "entry", and using "-e0x100" (or
anything else I tried) doesn't seem to change the following (which I
assume is a "bug"):

(normal assembler)
dw offset lnop

(GAS)
.short lnop

For some reason, while most ".intel_syntax noprefix" code requires
"offset" and "word ptr" etc., using "offset" like ".short offset lnop"
isn't accepted. So the closest I can get is as mentioned above, BUT it
doesn't generate a correct address/offset (instead of being 0x262 it's
like 0x1162). I'm blindly guessing entry is accidentally hard-coded
or maybe not handled correctly or whatever. (This was 2.19, I think.)

So based upon that and not knowing a solution, I gave up on GAS (for
now). I almost didn't even post this, but there's not much other
traffic (die, spam!) and it is at least *almost* on topic. ;-)

DJ Delorie

Nov 9, 2009, 11:29:06 PM
to dj...@delorie.com

Keep in mind that both djasm and gas/.code16 were designed for very
limited uses - in djgpp, for the stub and sbrk code and in gas for the
linux boot sector.

Rugxulo

Nov 10, 2009, 2:54:57 PM
Hi,

Yes, but DJASM was extended quite a bit (by Bill Currie, apparently).
And I quote:

"The `djasm' compiler was to have no other purpose in life, and was
never intended to be a generic utility, until some of the developers
got carried away and introduced support for many additional opcodes
that had not yet been used. As a result, `djasm' has become sufficiently
powerful to be useful for more 16-bit applications than just the DJGPP
stub."

A check of the CVS shows that it hasn't changed too too much, but it
indeed has had a few minor changes in the past few years.

http://www.delorie.com/bin/cvsweb.cgi/djgpp/src/djasm/djasm.y


Linux, however, I'm not that familiar with, but I always thought it
used dev86 (bcc, as86) for the 16-bit stuff.

"... within Linux the assembler and linker are used for bootblocks,
DOSEMU and other packages."

http://www.debath.co.uk/

DJ Delorie

Nov 10, 2009, 3:10:59 PM
to dj...@delorie.com

> "... got carried away ..."

I think that might be the key phrase ;-)

If you want to continue the fine tradition of making it meet your
needs, go ahead :-)

Rugxulo

Nov 10, 2009, 6:34:58 PM
Hi,

On Nov 10, 2:10 pm, DJ Delorie <d...@delorie.com> wrote:
>
> If you want to continue the fine tradition of making it meet your
> needs, go ahead :-)

Well, I don't grok yacc/bison, but here's the easiest change so far.
(Now for the hard part, argh ....)


*** djasm.y Fri Aug 29 08:33:45 2008
--- tony.y Tue Nov 10 17:18:24 2009
***************
*** 285,290 ****
--- 285,291 ----
{"insb", ONEBYTE, 0x6c},
{"insw", ONEBYTE, 0x6d},
{"insd", TWOBYTE, 0x666d},
+ {"int3", ONEBYTE, 0xcc},
{"into", ONEBYTE, 0xce},
{"iret", ONEBYTE, 0xcf},
{"iretd", TWOBYTE, 0x66cf},
***************
*** 315,320 ****
--- 316,322 ----
{"repne", ONEBYTE, 0xf2},
{"repnz", ONEBYTE, 0xf2},
{"sahf", ONEBYTE, 0x9e},
+ {"salc", ONEBYTE, 0xd6},
{"scasb", ONEBYTE, 0xae},
{"scasw", ONEBYTE, 0xaf},
{"scasd", TWOBYTE, 0x66af},
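
For anyone who hasn't looked at djasm.y, here is a rough standalone sketch of what a table like the one patched above amounts to. The entry values and the ONEBYTE/TWOBYTE names come from the diff; the struct layout, the `encode_op` helper and the linear search are assumptions made for this illustration, not djasm's actual code. TWOBYTE values are emitted high byte first, so 0x666d becomes the operand-size prefix 0x66 followed by 0x6d.

```c
#include <stddef.h>
#include <string.h>

enum op_type { ONEBYTE, TWOBYTE };

/* Hypothetical layout; djasm's real struct may differ. */
struct opcode_entry {
    const char *name;
    enum op_type type;
    unsigned value;
};

/* A few entries copied from the diff, including the two new ones. */
static const struct opcode_entry table[] = {
    {"insd", TWOBYTE, 0x666d},
    {"int3", ONEBYTE, 0xcc},
    {"into", ONEBYTE, 0xce},
    {"salc", ONEBYTE, 0xd6},
};

/* Looks up `name` and writes its encoding into `out`.
 * Returns the number of bytes written, or 0 if the mnemonic is unknown. */
size_t encode_op(const char *name, unsigned char out[2])
{
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(table[i].name, name) != 0)
            continue;
        if (table[i].type == TWOBYTE) {
            out[0] = (unsigned char)(table[i].value >> 8);   /* prefix byte */
            out[1] = (unsigned char)(table[i].value & 0xff); /* opcode byte */
            return 2;
        }
        out[0] = (unsigned char)table[i].value;
        return 1;
    }
    return 0;
}
```

With this, encode_op("salc", buf) yields the single byte 0xd6, which is why adding an operand-less instruction is a one-line change.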

Rugxulo

Nov 10, 2009, 10:35:33 PM
Hi,

On Nov 10, 2:10 pm, DJ Delorie <d...@delorie.com> wrote:
>
> If you want to continue the fine tradition of making it meet your
> needs, go ahead :-)

Okay, this hack below seems to work (even though I swear I am not a C
programmer):

=======================================
*** tony_y.bk1 Tue Nov 10 17:37:22 2009
--- tony.y Tue Nov 10 21:26:58 2009
***************
*** 2335,2340 ****
--- 2335,2341 ----
{
int c, c2, i, oldc;
struct opcode *opp, op;
+ char str[33], str2[33];

do {
c = fgetc(infile);
***************
*** 2454,2460 ****
}
#else
ungetc(c, infile);
! fscanf(infile, "%i", &(yylval.i));
#endif
sprintf(last_token, "%d", yylval.i);
return NUMBER;
--- 2455,2473 ----
}
#else
ungetc(c, infile);
!
! fscanf(infile, "%[0-9a-fA-FhHxX]",str);
!
! if (str[strlen(str)-1] == 'h' || str[strlen(str)-1] == 'H')
! {
! str[strlen(str)-1]='\0';
! strcpy(str2,"0x");
! strcat(str2,str);
! strcpy(str,str2);
! }
! sscanf(str, "%i", &(yylval.i));
!
!
#endif
sprintf(last_token, "%d", yylval.i);
return NUMBER;
=======================================

And here's the simplest example (modified HELLO.ASM). A bigger
example, BEFI.COM, assembles to the exact same .COM (matching CRC32)
whether it uses "0x100" or "100h" style.

=======================================
.type "com"
.org 0x100 ; origin for .COM program

mov dx, msg ; point DX to message
mov ah, 0x9 ; DOS print string function
int 21h
mov ax, 4c00h ; DOS exit with errorlevel
int 0x21
msg:
.db "hello, world\r\n$"
=======================================
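
To show what the origin does in a flat .COM image, here is a small C sketch that builds the bytes the listing above should assemble to, using the standard 8086 encodings (BA iw, B4 ib, CD ib, B8 iw). It assumes "\r\n" in the .db string means CR LF; `build_hello_com` and `COM_ORIGIN` are names invented for this example. The point is that the origin only shifts the address stored for "msg" (file offset 12 becomes 0x10C); it adds no padding to the file.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define COM_ORIGIN 0x100u  /* DOS loads a .COM image at offset 0x100 */

/* Writes the hello-world image into `buf` and returns its length,
 * or 0 if `cap` is too small. */
size_t build_hello_com(uint8_t *buf, size_t cap)
{
    static const char msg[] = "hello, world\r\n$";
    const size_t code_len = 12;           /* bytes of code before msg */
    const size_t msg_len = sizeof msg - 1; /* drop the C terminator */
    /* Run-time address of msg = origin + file offset. */
    const uint16_t msg_addr = (uint16_t)(COM_ORIGIN + code_len);

    if (cap < code_len + msg_len)
        return 0;

    const uint8_t code[] = {
        0xBA, (uint8_t)(msg_addr & 0xff), (uint8_t)(msg_addr >> 8), /* mov dx, msg */
        0xB4, 0x09,                                                 /* mov ah, 9 */
        0xCD, 0x21,                                                 /* int 21h */
        0xB8, 0x00, 0x4C,                                           /* mov ax, 4c00h */
        0xCD, 0x21,                                                 /* int 21h */
    };
    memcpy(buf, code, sizeof code);
    memcpy(buf + sizeof code, msg, msg_len);
    return sizeof code + msg_len;
}
```

The resulting image is 27 bytes: 12 bytes of code followed by the 15-byte string.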

Martin Str|mberg

Nov 11, 2009, 3:42:59 AM
Rugxulo <rug...@gmail.com> wrote:
> =======================================
> *** tony_y.bk1 Tue Nov 10 17:37:22 2009
> --- tony.y Tue Nov 10 21:26:58 2009
> ***************
> *** 2335,2340 ****
> --- 2335,2341 ----
> {
> int c, c2, i, oldc;
> struct opcode *opp, op;
> + char str[33], str2[33];

No.

#define MAX 32
#define CHARS "0-9a-fA-FhHxX"
char str[MAX+1+1], ... /* One for nul and one for "h"->"0x" transformation. */
char format[1+ sizeof("2147483647") /* INT_MAX */ +1+ sizeof(CHARS) +1+1];
sprintf(format, "%%%d[%s]", MAX, CHARS); /* Possibly error handling here too. */

> ! fscanf(infile, "%[0-9a-fA-FhHxX]",str);

fscanf(infile, format, str);

> !
> ! if (str[strlen(str)-1] == 'h' || str[strlen(str)-1] == 'H')
> ! {
> ! str[strlen(str)-1]='\0';
> ! strcpy(str2,"0x");
> ! strcat(str2,str);

Why not sprintf(str2, "0x%s", str)? /* Again some proper size! */

> ! strcpy(str,str2);
> ! }
> ! sscanf(str, "%i", &(yylval.i));
> !
> !
> #endif
> sprintf(last_token, "%d", yylval.i);
> return NUMBER;
> =======================================


Thanks!

MartinS
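
Putting the suggestions above together, a bounded version of the "30h" handling might look roughly like the following standalone C sketch. It works on an already-read token rather than on `infile`, and uses strtol instead of sscanf("%i"), so it is an illustration of the idea and not the patch that went to djgpp-workers. `parse_number` and `MAX_DIGITS` are names invented here.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define MAX_DIGITS 32  /* longest token accepted, as in the MAX suggestion */

/* Parses "123", "0x30" or "30h"/"30H" into *out.
 * Returns false for empty, oversized or malformed tokens. */
bool parse_number(const char *tok, long *out)
{
    char buf[MAX_DIGITS + 1];
    size_t len = strlen(tok);
    int base = 0;   /* base 0: decimal, 0x hex, leading-0 octal, like "%i" */
    char *end;

    if (len == 0 || len > MAX_DIGITS)
        return false;               /* never overflow buf */
    memcpy(buf, tok, len + 1);

    /* An Intel-style trailing 'h' means hex: drop it and force base 16. */
    if (buf[len - 1] == 'h' || buf[len - 1] == 'H') {
        buf[--len] = '\0';
        if (len == 0)
            return false;           /* a lone "h" is not a number */
        base = 16;
    }

    *out = strtol(buf, &end, base);
    return end != buf && *end == '\0';  /* whole token must be consumed */
}
```

Checking the length before copying avoids the fixed-size buffer overrun, and forcing base 16 removes the need for the second string and the "0x" prepend.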

Martin Str|mberg

Nov 11, 2009, 3:48:36 AM
Rugxulo <rug...@gmail.com> wrote:
> 3). ".org 0x100" doesn't seem to do anything useful (adds a bunch of
> zero bytes at the beginning), nor does ".data" (adds bunch of crud to
> end)

How do you link? If you use GNU ld you need to mess with linker scripts.
I've written small .com files using gas and ld. I've also recoded
FDXMS using gas and ld.

I remember it was hard to get rid of the extra zeroes before the real
code. (Bug in ld, perhaps?) I think I just skipped .org 0x100 and
instead used nothing or .org 0 and let ld with the correct script deal
with it.


If you wait and are lucky I might dig up some of what I've written.


--

MartinS

Rugxulo

Nov 11, 2009, 4:03:03 PM
Hi,

On Nov 11, 2:48 am, Martin Str|mberg <a...@sister.ludd.ltu.se> wrote:
>
> How do you link? If you use GNU ld you need to mess with linker scripts.
> I've written small .com files using gas and ld. I've also recoded
> FDXMS using gas and ld.
>
> I remember it was hard to get rid of the extra zeroes before the real
> code. (Bug in ld, perhaps?) I think I just skipped .org 0x100 and
> instead used nothing or .org 0 and let ld with the correct script deal
> with it.
>
> If you wait and are lucky I might dig up some of what I've written.

No hurry, esp. since I already have FDXMS. (My only complaint is AT&T,
ugh.)

http://www.ibiblio.org/pub/micro/pc-stuff/freedos/files/dos/fdxms/

Quoting from makefile :

LD = ld -T link_script --oformat binary -Map $*.map -s


link_script :
===============================
SECTIONS
{
.text 0 : {
*(.text)
etext = . ; _etext = .;
}
.data : {
*(.data)
edata = . ; _edata = .;
}
.bss SIZEOF(.data) + ADDR(.data) :
{
*(.bss)
*(COMMON)
end = . ; _end = .;
}
}
===============================

Rugxulo

Nov 13, 2009, 5:59:59 PM
Hi,

On Nov 11, 2:42 am, Martin Str|mberg <a...@sister.ludd.ltu.se> wrote:
>
> No.
>
> (snip code suggestions)
>
> Thanks!
>
> MartinS

I sent a patch for this to djgpp-workers, did you see it? I can post
it here if needed (no new functionality, though).
