
Linker address translation


Ron Rondis

Dec 31, 2011, 5:49:10 AM
Hi,

I've run into a problem, and I don't know whether it is a bug or the
expected result. I'm hoping someone can clarify.


The head of the linker map file:

Open Watcom Linker Version 1.9
Portions Copyright (c) 1985-2002 Sybase, Inc. All Rights Reserved.
Created on: 11/12/31 10:44:12
Executable Image: bios.bin
creating a DOS .COM executable


+------------+
| Groups |
+------------+

Group Address Size
===== ======= ====

DGROUP 0000:0000 00008f82



+--------------+
| Segments |
+--------------+

Segment   Class   Group    Address     Size
=======   =====   =====    =======     ====

ENTRY     CODE    DGROUP   0000:0000   00000743
_TEXT     CODE    DGROUP   0074:0004   0000678c
CONST     DATA    DGROUP   06ed:0000   00001d02
CONST2    DATA    DGROUP   08bd:0002   00000395
_DATA     DATA    DGROUP   08f6:0008   0000001a


The linker assumes CS == 0x0074 when translating addresses in the _TEXT
segment. I assumed that all segments in the same group share the same
segment value, so the linker needs to translate all addresses relative
to DGROUP.


For example the following fragment from _TEXT:


0AD8 L$70:
0AD8 FB 0A DW offset L$71
0ADA FB 0A DW offset L$71
0ADC FB 0A DW offset L$71
0ADE FB 0A DW offset L$71
0AE0 FB 0A DW offset L$71
0AE2 FB 0A DW offset L$71

Routine Size: 1246 bytes, Routine Base: _TEXT + 0606

0AE4 _switch_test:
0AE4 56 push si
0AE5 57 push di
0AE6 55 push bp
0AE7 89 E5 mov bp,sp
0AE9 90 nop
0AEA 8B 46 08 mov ax,word ptr 0x8[bp]
0AED 3D 05 00 cmp ax,0x0005
0AF0 77 09 ja L$71
0AF2 89 C3 mov bx,ax
0AF4 01 C3 add bx,ax
0AF6 2E FF A7 D8 0A jmp word ptr cs:L$70[bx]
0AFB L$71:

Is translated by the linker to the following:

121c: ff 0a decw (%bp,%si)
121e: ff 0a decw (%bp,%si)
1220: ff 0a decw (%bp,%si)
1222: ff 0a decw (%bp,%si)
1224: ff 0a decw (%bp,%si)
1226: ff 0a decw (%bp,%si)
1228: 56 push %si
1229: 57 push %di
122a: 55 push %bp
122b: 89 e5 mov %sp,%bp
122d: 90 nop
122e: 8b 46 08 mov 0x8(%bp),%ax
1231: 3d 05 00 cmp $0x5,%ax
1234: 77 09 ja 0x123f
1236: 89 c3 mov %ax,%bx
1238: 01 c3 add %ax,%bx
123a: 2e ff a7 dc 0a jmp *%cs:0xadc(%bx)
123f: b0 01 mov $0x1,%al


The addresses stored at 121c-1226 (0AD8-0AE2 in the object file) were
translated relative to the _TEXT segment (i.e. as if cs == 0x74).
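
For readers who have not seen this pattern before, the compiler-generated dispatch in the listing can be mimicked in a few lines. This is only a sketch: `dispatch` and its arguments are illustrative names, and the table contents assume the object-file offsets shown above (every slot pointing at L$71, offset 0x0AFB).

```python
def dispatch(val, table, default):
    """Mimic: cmp ax,5 / ja default / mov bx,ax / add bx,ax / jmp cs:table[bx]."""
    ax = val & 0xFFFF      # 'ja' is an unsigned compare, so -1 behaves like 0xFFFF
    if ax > 5:
        return default
    bx = ax + ax           # each table slot is a 2-byte near offset
    return table[bx // 2]  # the word read at table + bx

L71 = 0x0AFB               # offset of L$71 in the object file
table = [L71] * 6          # six slots, one per case label

print(hex(dispatch(3, table, L71)))
```

The table holds absolute 16-bit offsets that only make sense relative to whatever CS holds at run time, which is why the base the linker uses when it fixes them up matters.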


Thanks,
Ron

Paul S. Person

Dec 31, 2011, 12:42:31 PM
I could be wrong, but I don't interpret it that way.

I read it as saying that ENTRY starts at 0000:0000 and extends 743
bytes, that is, to 0000:0743. But 0000:0743 is the same address as
0074:0003. So _TEXT starts the byte after ENTRY ends. That is all it
is saying.

The "segments" here are not descriptors, but merely the point in RAM
where the code or data begins, using a 16-byte metric (that is, there
is one of these every 16 bytes).
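
The equivalence described above can be checked numerically. A minimal sketch, assuming plain real-mode addressing (linear address = segment * 16 + offset); the helper name `linear` is made up for illustration.

```python
def linear(segment: int, offset: int) -> int:
    """Real-mode linear address: the segment value counts 16-byte paragraphs."""
    return (segment << 4) + offset

# 0000:0743 and 0074:0003 name the same byte.
same = linear(0x0000, 0x0743) == linear(0x0074, 0x0003)

# The map lists _TEXT at 0074:0004, i.e. 0x744 bytes from the start of the image.
text_start = linear(0x0074, 0x0004)

print(same, hex(text_start))
```

So the map's segment:offset pairs are just another spelling of positions within the image; they do not by themselves say what CS will hold at run time.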

I suspect that this is done so that each segment can contain 64K bytes
(if it starts with an offset of 0000). Note that segments greater than
64K bytes produce linker errors (if the compiler doesn't catch them
first).

Segmentation in DOS is very hard to decode from the MAP file. If you
compile the code for 16-bit OS/2 (or even 16-bit Windows), the MAP
file will use what I think of as virtual descriptors (0001, 0002,
etc), making it a bit easier to interpret. For one thing, everything
shown with a given virtual descriptor is, in fact, in the same 64K
segment. For another, the function map uses the virtual descriptors as
well, making it easy to tell which segment each function is in.

OS/2 is better than Windows because Windows is prone to expanding the
functions and so changing the segment they end up in, in some cases.
It must be 16-bit because, for 32-bit, everything is in one segment
with the same virtual descriptor. I call them "virtual descriptors"
because I do not know the proper term and the debugger makes it clear
that, when the program is loaded, they are replaced with actual
descriptors.
--
"'If God foreknew that this would happen,
it will happen.'"

Ron Rondis

Jan 1, 2012, 3:38:25 AM
On Dec 31 2011, 7:42 pm, Paul S. Person
<psper...@ix.netscom.com.invalid> wrote:

>
> I read it as saying that ENTRY starts at 0000:0000 and extends 743
> bytes, that is, to 0000:0743. But 0000:0743 is the same address as
> 0074:0003. So _TEXT starts the byte after ENTRY ends. That is all it
> is saying.
>

Maybe my explanation was not clear enough. The question is what CS the
linker assumes at run time for code that is part of _TEXT, and what CS
it assumes for code that is part of ENTRY. According to the linker
output, it assumes CS == (image_load_seg + 0) when executing code in
ENTRY and CS == (image_load_seg + 0x74) for _TEXT. Is that the expected
result even though they are in the same group?

To clarify, what I'm trying to achieve is to control the addresses of
some symbols. I planned to use the linker "order" directive combined
with "segaddr", but I'm not sure I want to use it if far calls are
involved. Is there any alternative?

Thanks,
Ron


Paul S. Person

Jan 1, 2012, 1:25:52 PM
On Sun, 1 Jan 2012 00:38:25 -0800 (PST), Ron Rondis
<ron.r...@gmail.com> wrote:

>On Dec 31 2011, 7:42 pm, Paul S. Person
><psper...@ix.netscom.com.invalid> wrote:
>
>>
>> I read it as saying that ENTRY starts at 0000:0000 and extends 743
>> bytes, that is, to 0000:0743. But 0000:0743 is the same address as
>> 0074:0003. So _TEXT starts the byte after ENTRY ends. That is all it
>> is saying.
>>
>
>Maybe my explanation was not good enough. The question is what CS
>the linker assume for code that is part of _TEXT (during runtime) and
>what
>CS it assume for code that is part of ENTRY. According to the linker
>output
>it assume CS == (image_load_seg + 0) when executing code in ENTRY and
>CS == (image_load_seg + 0x74) for _TEXT. Is it the expected result
>although
>they are in the same group?

I think you are correct that I don't know what you are asking, for I
am not sure where you are getting this from. 0x121c - 0x0ae4 = 0x0738;
0x0738 - 0x0606 = 0x0132. Provided, of course, that all these values
are hexadecimal. That leaves 0x0132, not 0x0074, as the Routine Base.

But what all this has to do with the value of CS, particularly when
the program is being run, I have no idea. My impression, which could
be wrong, is that it is the loader that actually sets the value of CS.
The linker appears to be concerned mostly in cramming as many
compiler-generated segments as will fit into each linker segment in
turn.

This isn't as clear with 16-bit DOS as it is with, say, 16-bit OS/2,
and yet comparing the map files for programs with many modules and
many segments per module suggests that the linker is trying to group
compiler-generated segments the same way in 16-bit DOS as it is in
16-bit OS/2. For one thing, there is a clear tendency to use the same
"segment" address for as many of the compiler-generated segments as
will fit in 64K and only then to change it to another 64K segment.

Which illustrates another problem: the term "segment" is ambiguous.
But that doesn't seem to matter here. What matters here is that I am
not at all sure that the value of CS can be found from the segment
data in the MAP file.

Running the program in the debugger, in contrast, should be very
informative on this point.

>To clarify thing, what I'm trying to achieve is to control the address
>of some
>symbols. I planned to use the linker "order" directive combined with
>"segaddr"
>but I don't know whether I like to use it in case far calls are
>involved. Is there any
>other alternative?

That I definitely have no idea on. You could, of course, try it and
see what happens.

Ron Rondis

Jan 1, 2012, 2:38:54 PM
I'll try again.

I load the image myself. The image is a PC BIOS and is loaded at
0xf000:0x0000 (cs == 0xf000). The link format is DOS COM.

Addresses 0x121c - 0x1228 hold a compiler-generated table of addresses
of switch-case labels (a switch optimization). In the object file the
same table is at 0x0AD8 - 0x0AE4.

All those entries must point to L$71. The linker converts them from
0x0AFB to 0x0aff, while the address of L$71 in the image is 0x123f.
Now 0x123f - 0x0aff = 0x740, so the linker assumes cs is at 0x740
relative to the image base address (DGROUP).

The same goes for the instruction at 0x123a, jmp *%cs:0xadc(%bx). The
table is at address 0x121c, and 0x121c - 0xadc = 0x740.
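
The arithmetic above can be replayed directly. This is just a sketch that copies the numbers from the listings in this thread; the constant names are invented for readability.

```python
# Values copied from the listings in this thread.
L71_IN_IMAGE = 0x123F    # where L$71 really ends up in bios.bin
TABLE_ENTRY = 0x0AFF     # value the linker stored in each table slot
TABLE_IN_IMAGE = 0x121C  # where the jump table really ends up
JMP_OPERAND = 0x0ADC     # displacement in "jmp *%cs:0xadc(%bx)"
TEXT_SEGMENT = 0x0074    # segment part of _TEXT's map address 0074:0004

delta_entry = L71_IN_IMAGE - TABLE_ENTRY
delta_operand = TABLE_IN_IMAGE - JMP_OPERAND
text_base = TEXT_SEGMENT << 4  # paragraph number -> byte offset in the image

print(hex(delta_entry), hex(delta_operand), hex(text_base))
```

Both deltas come out equal to 0x74 paragraphs, which is what suggests the offsets were computed against the segment part of _TEXT's map address rather than against DGROUP.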

In contrast, calls from the _TEXT segment to the ENTRY segment and
vice versa use a near relative call instruction (i.e. no cs switch).

I would like to know whether this is a bug or the expected result. I
assumed that if _DATA, CONST, and CONST2 are all addressed relative to
DGROUP, then all code segments that belong to DGROUP will also be
addressed relative to DGROUP.

Thanks,
Ron


Kevin G. Rhoads

Jan 1, 2012, 7:48:43 PM
>I assumed that if _DATA,
>CONST, and CONST2 are all addressed relative to DGROUP then all code
>segment that belong to the DGROUP will also be addressed relative to
>DGROUP.


OK- let's back up a bit. In x86 real mode you have 6 addressing models
in common use: tiny, small, compact, medium, large and huge.

It sounds like you want tiny, where CS=DS and code and data are all in
the same segment, but maybe you are getting treated as small instead (?).

You mention DOS .COM, which is a variant on .BIN. .COM assumes (normally)
tiny and a specific load-time set of conditions (CS=DS=PSP seg) and everything
is relative to that.

But you say you load the image yourself; that is when you should set the value of
CS and DS to match what they should be for the memory model and the linker's
processing.

As for why the linker's processing doesn't match your expectations, what are
you feeding the linker? Can you (are you willing) to share asm or C source?
compiler or IDE settings?

I'm not that familiar with OW linker for targeting DOS .COM stuff, but have
you tried generating an EXE and doing EXE2BIN and comparing results that
way? Have you used one of the 16 bit MS linkers (5.60 is still findable
out there) and compared? What about TLINK? Or VAL? any comparisons?

Ron Rondis

Jan 2, 2012, 3:33:47 AM
On Jan 2, 2:48 am, "Kevin G. Rhoads" <kgrho...@alum.mit.edu> wrote:
> >I assumed that if _DATA,
> >CONST, and CONST2 are all addressed relative to DGROUP then all code
> >segment that belong to the DGROUP will also be addressed relative to
> >DGROUP.
>
> OK- let's back up a bit.  In x86 real mode you have 6 addressing models
> in common use: tiny, small, compact, medium, large and huge.
>
> It sounds like you want tiny, where CS=DS and code and data are all in
> the same segment, but maybe you are getting treated as small instead (?).

I use -ms (-mt is not a valid option), and according to the
documentation, for DOS COM it is implicitly the tiny model. But even
in the small model all code is supposed to be in one segment.
Furthermore, at run time CS=DS and data addressing is correct, so it
is in fact the tiny model.

>
> You mention DOS .COM, which is a variant on .BIN.  .COM assumes (normally)
> tiny and a specific load-time set of conditions (CS=DS=PSP seg) and everything
> is relative to that.
>
> But you say load the image yourself, that is when you should set the value of
> CS and DS to match what they should be for the memory model and linker's
> processing.

The image is loaded at 0xf000:0x0000 and cs=ds=0xf000.

>
> As for why the linker's processing doesn't match your expectations, what are
> you feeding the linker?  Can you (are you willing) to share asm or C source?
> compiler or IDE settings?
>

I have not decided what licence to use for the sources, so I can't
share them at this point. The following lists the relevant data:


c files compile command:
wcc -q -6 -ecc -zls -ms -zc -zu -s -os -we -fr=/dev/null -i=..
-ad=$(TMP_DEP_FILE) <file_name>.c

asm files compile command:
nasm -f obj -o <file_name>.o <file_name>.nasm


segment definition in asm files

in case the file is part of ENTRY:
segment ENTRY class=CODE USE16 align=1 CPU=686
group DGROUP ENTRY

in case the file is part of _TEXT:
segment _TEXT class=CODE USE32 align=1 CPU=686
group DGROUP _TEXT


linker command:
wlink option q @bios.link

/*********************** bios.link ************
name bios.bin
format dos com

file entry.o
file bios.o
file utils_16.o
file pci_16.o
file platform_16.o
file ata.o
file boot.o
file keyboard.o
file asm_utils.o

file lib16/i8d086.o
file lib16/i8m086.o
file lib16/u8rs086.o
file lib16/i8ls086.o
file lib16/i4d.o
file lib16/i4m.o

order
clname CODE segment ENTRY segment _TEXT
clname DATA

option nodefaultlibs
option map

disable 1023
/********************************************************************************



> I'm not that familiar with OW linker for targeting DOS .COM stuff, but have
> you tried generating an EXE and doing EXE2BIN and comparing results that
> way?  Have you used one of the 16 bit MS linkers (5.60 is still findable
> out there) and compared?  What about TLINK?  Or VAL?  any comparisons?

I will try to use other linker.

Thanks,
Ron


Ron Rondis

Jan 2, 2012, 6:52:17 PM

>
> > I'm not that familiar with OW linker for targeting DOS .COM stuff, but have
> > you tried generating an EXE and doing EXE2BIN and comparing results that
> > way?  Have you used one of the 16 bit MS linkers (5.60 is still findable
> > out there) and compared?  What about TLINK?  Or VAL?  any comparisons?
>
> I will try to use other linker.
>

Linking with the MS linker matches my expectation:

11ff: 65 gs
1200: 15 65 15 adc $0x1565,%ax
1203: 65 gs
1204: 15 65 15 adc $0x1565,%ax
1207: 65 gs
1208: 15 65 15 adc $0x1565,%ax
120b: 56 push %si
120c: 57 push %di
120d: 55 push %bp
120e: 89 e5 mov %sp,%bp
1210: 90 nop
1211: 90 nop
1212: 90 nop
1213: 8b 46 08 mov 0x8(%bp),%ax
1216: 3d 05 00 cmp $0x5,%ax
1219: 0f 87 48 03 ja 0x1565
121d: 89 c3 mov %ax,%bx
121f: 01 c3 add %ax,%bx
1221: 2e ff a7 ff 11 jmp *%cs:0x11ff(%bx)
1226: 56 push %si
1227: 57 push %di

As you can see, the address used by the instruction at 0x1221 (0x11ff)
is the start of the switch jump table, and all the addresses in the
table are relative to DGROUP (the image base), as I expect.

The MS linker output map file header is:

Start  Stop   Length Name   Class
00000H 00742H 00743H ENTRY  CODE
00744H 074B2H 06D6FH _TEXT  CODE
074B4H 09586H 020D3H CONST  DATA
09588H 0991DH 00396H CONST2 DATA
0991EH 09937H 0001AH _DATA  DATA

Origin Group
0000:0 DGROUP


The TLink output is similar to that of wlink, but I do not understand
what purpose such output serves.


Wilton Helm

Jan 6, 2012, 2:38:16 PM
Hi, I just ran across this and may be able to help. I don't know if I
understand all the questions, but let's deal with the first one first.

As someone once explained to me, once upon a time someone in Intel land
envisioned a world where data stretched across multiple adjacent segments to
allow access a bit beyond 64 K. Code on one side might access it as
400:C000
and code on the other side could access it as
1000:0000
Each access can only reach 64 K, but between them they can share portions
that total more than 64K. For instance
Segment A - size 32K
Segment B - size 32K
Segment C - size 32K

Either A and B or B and C can be accessed, allowing B to be shared without
any far accesses.

I've never actually used any tools that allowed this, but in theory it would
work.
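
To make the overlapping-window idea concrete, here is a small sketch. The layout (three adjacent 32K blocks starting at linear address 0) and the names `reachable` and `blocks` are hypothetical, chosen only to illustrate the A/B/C example above.

```python
WINDOW = 0x10000  # one segment register reaches 64K

def linear(segment, offset=0):
    """Real-mode linear address for segment:offset."""
    return (segment << 4) + offset

def reachable(window_segment, start, size):
    """True if bytes [start, start + size) fit in the 64K window at window_segment."""
    base = linear(window_segment)
    return base <= start and start + size <= base + WINDOW

# Hypothetical layout: name -> (linear start, size).
blocks = {"A": (0x00000, 0x8000), "B": (0x08000, 0x8000), "C": (0x10000, 0x8000)}

low = [name for name, (start, size) in blocks.items() if reachable(0x0000, start, size)]
high = [name for name, (start, size) in blocks.items() if reachable(0x0800, start, size)]

print(low, high)
print(linear(0x400, 0xC000) == linear(0x1000, 0x0000))
```

With the window based at segment 0x0000 the code sees A and B; based at 0x0800 it sees B and C, so B is shared without any far access.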
For that reason, the segments that make up DGROUP have two addresses (except
the first one where they are the same). One is a common DGROUP based
address that is based on what DS would normally be set to. The other is an
address based on the start of the segment itself, which generally has a
higher base address and lower offset, but represents the same physical
memory. OW shows the former address in the symbol listing in the map file,
but the latter address in the segment summary preceding that listing.

The only place that might be useful (and can be dangerous, if not
understood) is in assembly. The assume directive tells the assembler what
DS and ES are expected to be set to for a region of code. Of course, it is
up to the programmer to ensure that they are actually set as specified,
either explicitly, or by assuming the register handling rules of the memory
model in question (such as DS = DGROUP). The assembler then checks to see
if the address referenced is accessible (within the 64 K range) of DS or ES.
If it is, it makes the appropriate translation using that base address. If
it is not, it generates an error (I just got one this morning in some code
that was using DS and ES with non-standard values and I dropped in a memory
access instruction that needed DGROUP).

Now if I understand your question, the answer is that the C compiler will
include
assume DS=DGROUP
in its source. In assembly you need to explicitly do so. Avoid the trap of
saying something like assume DS=_DATA, because that isn't the same. You
also need to make sure DS is really set as stated. Then the assembler will
generate the correct offsets.

As for the Tiny memory model, the C compiler couldn't care less. DS is still
set to DGROUP. It just happens that _TEXT is part of DGROUP, and thus CS =
DS. The compiler can be told to store constants in CS if you wish in that
case, but that's about all. The main thing that matters is the startup
code, because if Tiny model is used, it will place _TEXT in DGROUP and set
DS = CS. Other than that, it functions the same as Small model. There are
very few places where the Tiny model offers any advantage, and it limits
code + data to 64 K rather than allowing 64K for each.

Wilton


Ron Rondis

Jan 12, 2012, 7:40:44 AM
On Jan 6, 9:38 pm, "Wilton Helm" <wh...@compuserve.com> wrote:
>
> Now if I understand your question, the answer is that the C compiler will
> include
>     assume DS=DGROUP

No, the question was about the code segments ENTRY and _TEXT.
Both ENTRY and _TEXT are of class CODE and both are in DGROUP, so
what is CS expected to be at run time?

TLink and WLink => assume ENTRY and _TEXT do not have the same CS at
run time.
MS linker => assumes ENTRY and _TEXT have the same CS at run time.

I found a workaround (or the expected thing to do) for my problem:

order
clname CODE segment ENTRY segaddr=0 segment _TEXT segaddr=0
clname DATA

Thanks,
Ron






Wilton Helm

Jan 18, 2012, 7:19:14 PM
Glad you have a work-around. I wrote that feature for the linker, but never
envisioned it being used in that manner, so I'm glad it is helpful. Its
original purpose was to allow classes and segments to be placed at specific
addresses in embedded systems.

Since you are working with the Tiny model, all the code has the same segment
address as the data and has to be in DGROUP. Therefore the comments I made
about DGROUP addressing apply. The linker handles DGROUP specially in that
the segments keep their own segment addresses, but offset generation in
code fixups is relative to the DGROUP segment instead of the segment's own
address. It looks confusing in the map file, but generally produces the
expected results. Since all of DGROUP has to fit in 64 K, addresses 100:200
and 110:100 refer to the same location, so if an entry point in _TEXT is at
110:100 but ds, which is set to DGROUP, is 100, then computing an offset of
200 (based on a segment of 100) is the expected result, even though the map
file might list it as 110:100.

Wilton


Ron Rondis

Jan 19, 2012, 8:53:10 AM
So according to the above, it is a linker bug. The linker fixes up
offsets relative to the CODE segment _TEXT instead of translating them
relative to DGROUP.

Example based on my first post:

_TEXT at 0074:0004

123a: 2e ff a7 dc 0a jmp *%cs:0xadc(%bx)

The above instruction is supposed to take the jump address from a
table that starts at address 0x121c (121c: ff 0a). The linker fixup is
0xadc instead of 0x121c, and 0x121c - 0xadc equals the _TEXT segment
base, 0x740.


Thanks,
Ron

Wilton Helm

Jan 19, 2012, 5:27:29 PM
I went back and looked more carefully at the original example. It is a
fragment, so I can't tell for sure what happened. What I would need to see
is what assume statements were in effect at that time. There should have
been an
assume cs:DGROUP
If there was an
assume cs:_TEXT
instead, that would cause the problem.

Alternately, instead of saying
cs:0xadc
you could say
DGROUP:0xadc
Some other linkers (my own experience is with Borland) automatically assume
that any segment that is part of DGROUP should inherit DGROUP's segment
address. While that assumption is convenient, it is incorrect. When I
switched to OW, I had to change several assembly files that said things like
assume ds:_DATA
to
assume ds:DGROUP
In my case I wasn't using the tiny model, so there isn't an equivalent cs
version.

Wilton


Ron Rondis

Jan 20, 2012, 7:55:55 AM
On Jan 20, 12:27 am, "Wilton Helm" <wh...@compuserve.com> wrote:
> I went back and looked more carefully at the original example.  It is a
> fragment, so I can't tell for sure what happened.  What I would need to see
> is what assume statements were in effect at that time.  There should have
> been an
>     assume cs:DGROUP
> if there was an
>     assume cs:_TEXT
> that would cause the problem.
>

The code is generated from a C file by wcc.


The following is a complete example.


*********************** file1.nasm ************************
segment ENTRY class=CODE USE16 align=1 CPU=686
group DGROUP ENTRY

%define BIOS16_STACK_BASE 0xfff0

extern _init

entry:
    mov ax, cs
    mov ds, ax
    xor ax, ax
    mov ss, ax
    mov sp, BIOS16_STACK_BASE
    call _init
    cli
.infloop:
    hlt
    jmp .infloop

*********************** file2.c ************************
static void switch_test(int val)
{
    switch (val) {
    case 0:
        break;
    case 1:
        break;
    case 2:
        break;
    case 3:
        break;
    case 4:
        break;
    case 5:
        break;
    }
}


void init()
{
    switch_test(0);
}

*********************** file3.link ************************
name test.bin
format dos com

file file1.o
file file2.o

order
clname CODE segment ENTRY segment _TEXT
clname DATA

option nodefaultlibs
option map

disable 1023
***********************************************************

nasm command line:
nasm -f obj -o file1.o file1.nasm
wcc command line:
~/watcom/binl/wcc -q -6 -ecc -zls -ms -zc -zu -s -os -we file2.c
link command line:
~/watcom/binl/wlink option q @file3.link

****************** map file ******************************
Open Watcom Linker Version 1.9
Portions Copyright (c) 1985-2002 Sybase, Inc. All Rights Reserved.
Created on: 12/01/20 14:23:45
Executable Image: test.bin
creating a DOS .COM executable


+------------+
| Groups |
+------------+

Group Address Size
===== ======= ====

DGROUP 0000:0000 0000003e



+--------------+
| Segments |
+--------------+

Segment   Class   Group    Address     Size
=======   =====   =====    =======     ====

ENTRY     CODE    DGROUP   0000:0000   00000012
_TEXT     CODE    AUTO     0001:0002   0000002b
CONST     DATA    DGROUP   0003:000e   00000000
CONST2    DATA    DGROUP   0003:000e   00000000
_DATA     DATA    DGROUP   0003:000e   00000000


+----------------+
| Memory Map |
+----------------+

* = unreferenced symbol
+ = symbol only referenced locally

Address Symbol
======= ======

Module: file2.o(~/bug_example/file2.c)
0001:0024 _init


+-----------------------+
| Linker Statistics |
+-----------------------+

Stack size: 1000 (4096.)
Memory size: 003e (62.)
Entry point address: 0000:0000
Link time: 00:00.00


****************** linker output ******************************
test.bin: file format binary


Disassembly of section .data:

00000000 <.data>:
0: 8c c8 mov %cs,%ax
2: 8e d8 mov %ax,%ds
4: 31 c0 xor %ax,%ax
6: 8e d0 mov %ax,%ss
8: bc f0 ff mov $0xfff0,%sp
b: e8 26 00 call 0x34
e: fa cli
f: f4 hlt
10: eb fd jmp 0xf
12: 22 00 and (%bx,%si),%al
14: 22 00 and (%bx,%si),%al
16: 22 00 and (%bx,%si),%al
18: 22 00 and (%bx,%si),%al
1a: 22 00 and (%bx,%si),%al
1c: 22 00 and (%bx,%si),%al
1e: 55 push %bp
1f: 89 e5 mov %sp,%bp
21: 8b 46 04 mov 0x4(%bp),%ax
24: 3d 05 00 cmp $0x5,%ax
27: 77 09 ja 0x32
29: 89 c3 mov %ax,%bx
2b: 01 c3 add %ax,%bx
2d: 2e ff a7 02 00 jmp *%cs:0x2(%bx)
32: 5d pop %bp
33: c3 ret
34: 6a 00 push $0x0
36: e8 e5 ff call 0x1e
39: 83 c4 02 add $0x2,%sp
3c: c3 ret

*****************************************************************

As you can see, at address 0x2d the jump table address is 0x2 instead
of 0x12, and all the addresses in the jump table itself are 0x22
instead of 0x32. In both cases the delta is 0x10 (i.e. cs == _TEXT).
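
To see what the CPU would actually do with this image, the indirect jump can be replayed over the bytes. A sketch only: the byte string is transcribed by hand from the disassembly above, `read_word` is a made-up helper, and it assumes CS equals the segment the image is loaded at, so that CS-relative offsets are plain offsets into the image.

```python
import struct

# test.bin as shown in the disassembly above (0x3d bytes).
IMAGE = bytes.fromhex(
    "8cc8 8ed8 31c0 8ed0 bcf0ff e82600 fa f4 ebfd"  # ENTRY code, 0x00-0x11
    + "2200" * 6                                     # jump table, 0x12-0x1d
    + "55 89e5 8b4604 3d0500 7709 89c3 01c3"         # switch_test up to the jmp
    + "2effa70200"                                   # 0x2d: jmp *%cs:0x2(%bx)
    + "5d c3 6a00 e8e5ff 83c402 c3"                  # rest of _TEXT
)

def read_word(image, offset):
    """Little-endian 16-bit read at the given image offset."""
    return struct.unpack_from("<H", image, offset)[0]

# The jmp is prefix 2e, opcode ff, modrm a7, then a 16-bit displacement.
operand = read_word(IMAGE, 0x2D + 3)

# Slots the CPU would read for val = 0..5, and the slots the compiler meant.
cpu_targets = [read_word(IMAGE, operand + 2 * val) for val in range(6)]
stored_targets = [read_word(IMAGE, 0x12 + 2 * val) for val in range(6)]

print(hex(operand), [hex(t) for t in cpu_targets], [hex(t) for t in stored_targets])
```

Under that assumption the jump reads its "table" out of the ENTRY code at offset 0x2, and even the real table entries (0x22) miss the intended label at 0x32.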

Ron




Wilton Helm

Jan 20, 2012, 1:23:10 PM
At the moment it looks to me like a memory model issue. Because you are
trying to create a tiny model program, the map file line
_TEXT CODE AUTO 0001:0002 0000002b

should say
_TEXT CODE DGROUP 0001:0002 0000002b

I think this is controlled by the startup file. For tiny model the
startup file should have a line

DGROUP group _TEXT,CONST,STRINGS,_DATA,XIB,XI,XIE,YIB,YI,YIE,_BSS

(it actually should have ENTRY included as well, but I took this from a more
conventional source.)

Without _TEXT being part of DGROUP, the linker won't use DGROUP's segment address to
compute the offset fixup.

Wilton


Ron Rondis

Jan 21, 2012, 4:25:32 AM
In my first post the map file shows that _TEXT is in DGROUP, and the
fixup was still wrong.
Files assembled with nasm include the following:
segment _TEXT class=CODE USE16 align=1 CPU=686
group DGROUP _TEXT
I assume there was a conflict between the files compiled with wcc and
the files assembled with nasm. I do not know what the linker is supposed
to do in such a case, but I think that at the very least it should
reflect the correct segment mapping in the map file.


I added "-g=DGROUP -nt=_TEXT" to the wcc command line and now the
segments are:


ENTRY CODE DGROUP 0000:0000 00000012
_TEXT CODE DGROUP 0001:0002 0000002b
CONST DATA DGROUP 0003:000e 00000000
CONST2 DATA DGROUP 0003:000e 00000000
_DATA DATA DGROUP 0003:000e 00000000


and the linker output is:


test.bin: file format binary


Disassembly of section .data:

00000000 <.data>:
0: 8c c8 mov %cs,%ax
2: 8e d8 mov %ax,%ds
4: 31 c0 xor %ax,%ax
6: 8e d0 mov %ax,%ss
8: bc f0 ff mov $0xfff0,%sp
b: e8 26 00 call 0x34
e: fa cli
f: f4 hlt
10: eb fd jmp 0xf
12: 32 00 xor (%bx,%si),%al
14: 32 00 xor (%bx,%si),%al
16: 32 00 xor (%bx,%si),%al
18: 32 00 xor (%bx,%si),%al
1a: 32 00 xor (%bx,%si),%al
1c: 32 00 xor (%bx,%si),%al
1e: 55 push %bp
1f: 89 e5 mov %sp,%bp
21: 8b 46 04 mov 0x4(%bp),%ax
24: 3d 05 00 cmp $0x5,%ax
27: 77 09 ja 0x32
29: 89 c3 mov %ax,%bx
2b: 01 c3 add %ax,%bx
2d: 2e ff a7 12 00 jmp *%cs:0x12(%bx)
32: 5d pop %bp
33: c3 ret
34: 6a 00 push $0x0
36: e8 e5 ff call 0x1e
39: 83 c4 02 add $0x2,%sp
3c: c3 ret


As you can see, the fixup is OK now (e.g. the jump table address used
at 0x2d is 0x12).

Thanks,
Ron

Wilton Helm

Jan 26, 2012, 3:06:47 PM
Sorry, I'm trying to sort out conflicting information. I just reviewed the
whole discussion. I don't know if nasm is having any impact here. I am not
familiar with it or the details of the relocatable object records it might
generate.

Why I mentioned the Group directive is that the example about 4 posts back
clearly states that _TEXT was in group AUTO and I knew that would not
produce the correct results. I hadn't noticed that the first example didn't
have that, and I don't know what messed it up.

It looks like the -g option is what you needed. Probably the disassembly
source would show the difference it made. I think -nt=_TEXT is the default
for any model that uses small code, so that probably didn't change anything.

Fixups are tricky, both because of the possible dual segment nature involved
and because some of the work is done by the assembler (the assembler, not
the linker, determines what segment address it is relative to and simply
provides the linker with instructions) and some by the linker (its job is
simply to do the math the assembler told it to do, now that it knows the
numbers). That is why assume statements are so important in assembly.
Maybe the -g caused an assume statement that wouldn't have otherwise been
there.
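
The division of labour described above can be sketched as a toy model: the object file names a frame (the segment the offset is relative to), and the linker subtracts that frame's base once it knows the final addresses. This is a deliberate simplification, not the OMF fixup record format or wlink's actual code; `fixup_offset` is a hypothetical helper and the numbers come from the small test.bin example earlier in the thread.

```python
def fixup_offset(target_linear: int, frame_segment: int) -> int:
    """16-bit offset of a target as seen from frame_segment (simplified model)."""
    return (target_linear - (frame_segment << 4)) & 0xFFFF

TABLE = 0x12  # position of the jump table in test.bin
LABEL = 0x32  # position of the switch exit label in test.bin

# Frame = _TEXT (map address 0001:0002, so segment part 0x0001).
text_frame = (fixup_offset(TABLE, 0x0001), fixup_offset(LABEL, 0x0001))

# Frame = DGROUP (map address 0000:0000).
dgroup_frame = (fixup_offset(TABLE, 0x0000), fixup_offset(LABEL, 0x0000))

print([hex(v) for v in text_frame], [hex(v) for v in dgroup_frame])
```

The first pair matches the values in the image built without -g=DGROUP, and the second pair matches the image built with it, which is consistent with the choice of frame being the only thing that changed.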

It seems to me that you were fighting two battles: 1) the assembler source
needed to assume cs:DGROUP, and 2) the linker needed to see _TEXT as part of
DGROUP. Lose either of those two and either the assembler is going to give
the linker wrong relative information or the linker is going to use wrong
segment information.

Wilton



Ron Rondis

Feb 1, 2012, 2:26:45 PM
On Jan 26, 10:06 pm, "Wilton Helm" <wh...@compuserve.com> wrote:

> It looks like the -g option is what you needed.  Probably the disassembly
> source would show the difference it made.  I think -nt=_TEXT is the default
> for any model that uses small code, so that probably didn't change anything.
>

Without -nt=_TEXT the segment is DGROUP_TEXT

Thanks for your help,
Ron