DOS RELOCATION TABLE USAGE BY DOS EXE LOADER

Dave Sper

Feb 1, 1997

From Dave Speir, systems analyst student, Florida.
I know this material is a little dated, but I need to start with the DOS
EXE header before moving on to NE, PE, and LE.
I'm researching how the DOS loader actually deals with the alignment
instructions in the EXE header relocation table. Can anyone help me with
the following? I've found and digested the object file format of the
segtab records and how the linker uses them to write the EXE header for
the disk image. I also know how to read the header to find the offset of
the start of the relocation table, but I can't quite understand what the
bytes in the table are saying mathematically that the DOS loader could
use. I.e., when I look at the map file and the ACBP byte, I can readily
spot the values for byte, word, and para alignment, but I cannot see the
correlation to the relocation table bytes. I know that the MS-DOS
Encyclopedia has examples of said material, but it is out of print. To
clarify further, here is a listing from an assembled EXE file I wrote up
just to check this out. It has the stack and code segs and 3 data segs
with the following alignments, in order of creation: dataseg1 byte
aligned, dataseg2 word aligned, dataseg3 para aligned. Here are the
relocation table bytes of the linked EXE file, from the EXE header disk
image:

01 00 20 00 0D 00 20 00 19 00 20 00

Offset 6h of the header gave a value of 3 for the number of relocation
items, and offset 18h gave me the offset to the relocation bytes (table),
but what exactly do these bytes represent? When I add the values given to
the disk image offsets of the segments, the results don't seem to be
aligned on the given boundaries. Obviously I'm not understanding what
algorithm the DOS loader executes to load the disk image into memory and
pad for the desired alignments. I understand that all alignment is
machine/platform related, as to the performance of RAM accesses. I also
understand that the NE, PE, and LE formats deal with different alignments
(namely page) as opposed to the alignment options of DOS EXE real-mode
execution, but I feel that after understanding the DOS relocations first
I can better piece together the NE relocations. Help, anyone with a firm
grip on the DOS EXE loading procedures! If the above info is too vague, I
could send a copy of the map file and .obj file seg/record.
Sincerely Dave Speir Florida. sp...@mail.gte.net
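
For readers following along, the two header fields mentioned above (the
item count at offset 6h and the table offset at 18h) can be read with a
few lines of code. This is only a minimal sketch in Python, not anything
DOS itself runs. The header built here is synthetic, the table offset of
1Eh is an assumed value for illustration (the post does not state it),
and the function name read_reloc_info is made up for the example. The
4-byte item size follows from the post: 12 table bytes for 3 items.

```python
import struct

def read_reloc_info(exe: bytes):
    """Return (count, table_offset, raw_table) from an MZ EXE image."""
    if exe[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (count,) = struct.unpack_from("<H", exe, 0x06)         # number of relocation items
    (table_offset,) = struct.unpack_from("<H", exe, 0x18)  # file offset of the table
    raw = exe[table_offset:table_offset + 4 * count]       # 4 bytes per item
    return count, table_offset, raw

# Synthetic header for illustration only; 0x1E is an assumed table offset.
header = bytearray(0x40)
header[0:2] = b"MZ"
struct.pack_into("<H", header, 0x06, 3)
struct.pack_into("<H", header, 0x18, 0x1E)
header[0x1E:0x1E + 12] = bytes.fromhex("01 00 20 00 0D 00 20 00 19 00 20 00")

count, table_offset, raw = read_reloc_info(bytes(header))
print(count, hex(table_offset), raw.hex(" "))
```

The sketch only locates the raw table bytes; it does not interpret them.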

Philippe Auphelle

Feb 2, 1997

Dave,

this is something I looked at many moons ago (and I once knew the exact
answers - I must still have documentation somewhere). But here is what
I can tell off the top of my head (caution):

I don't think the alignment is handled at the loader level. The DOS
loader is a very crude, simple loader.
I think the DOS loader always loads with the same alignment, like a
paragraph (or is it a 256-byte page?) boundary. Loaders in systems
where the swap page is 4K would rather load on a coarser
alignment boundary, like a 4K page.
Assembly-defined alignment is handled at the binary image level
i.e.
- The linker aligns segments in the binary image according to the
directives in the SEGMENT directives.
Various linkers behave differently when they get contradicting
directives for the same segment from different modules. The smarter
ones use the more demanding alignment they found for any given
segment, but it's not safe to rely on this.
- The assembler pads code (and/or data) according to ALIGN directives
(or "STRUCT X", etc...) so that alignment is respected. This of
course assumes that the segment alignment has been properly defined by
the programmer.
As I said above, I don't think there is any alignment directive for
the loader in the .EXE file. The loader is only assumed to always
align on a boundary that's greater or equal to the largest alignment
required by the program.

- In the assembly language code, you can't request in a segment an
alignment that's larger than the segment's own alignment. For
instance, if the segment is defined as

FOO SEGMENT BYTE 'DATA'

neither the assembler nor the linker will be able to honor an

ALIGN DWORD

directive. They can't, because they can't ensure at assembly/link time
that the segment will be at least DWORD aligned (typically, it won't).
The assembler should give an error in such a case. A really smart
assembler would give a warning and enlarge the segment alignment
definition on the fly, but I don't know any that does.
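
The point about ALIGN inside a BYTE-aligned segment can be shown with a
little arithmetic. The following Python sketch is an illustration only;
the function name pad_to and the sample addresses are made up for the
example and do not come from any particular assembler or linker.

```python
def pad_to(offset: int, alignment: int) -> int:
    """Bytes of padding needed to bring `offset` up to a multiple of `alignment`."""
    return (-offset) % alignment

# Inside the segment, padding brings offset 5 up to a multiple of 4.
offset_in_segment = 5
aligned_offset = offset_in_segment + pad_to(offset_in_segment, 4)

# A BYTE-aligned segment may be placed at any address, for example an
# odd one (hypothetical value).
segment_start = 0x0013
final_address = segment_start + aligned_offset

print(aligned_offset % 4 == 0)   # aligned relative to the segment start
print(final_address % 4 == 0)    # no longer aligned once the segment is placed
```

The padding is correct relative to the start of the segment, but it says
nothing about the final address unless the segment itself starts on a
boundary at least as strict.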

- Segment alignment

I think the DOS loader only handles three allocation blocks separately:
the PSP, the memory image and the stack.
The relocation table contains offsets relative to the start of the
program image. The loader loads the image (pure image, not PSP)
somewhere in memory at address XYZ, loads the relocation table
somewhere in some work memory and then for each entry in the table:
- uses the entry as an index into the image,
- adds the XYZ base address to the value at the specified location.

I'm not sure knowing the DOS loader format will help you study the LE
and NE formats. I don't know much about them. I know they are
documented somewhere (I think I've run into the document once), but I
never looked at them.

But I've looked at PE format, and I can tell you that
1) PE format is completely different from DOS loader format. PE format
is actually much smarter.
2) PE format at least is fully documented in MSDN, on the library CD,
in "Specifications / Microsoft Portable Executable and Common Object
File Format Specification 4.1"

Hope this helps somehow. If it doesn't, yell back and I'll try to
figure out where my very old documentation is.

On Sat, 01 Feb 1997 14:03:54 -0800, "Dave Sper" <sp...@mail.gte.net>
wrote:

Tim Farley

Feb 3, 1997

Boy, you're really digging deep in the past here...<grin>

I think what Philippe said is correct about alignment. It is handled
when the EXE is built, not at load time. Because of the way the segment
registers are set up, you are guaranteed 64K alignment at the "start" of
your EXE, but only in the context of the segment registers you are
using. Actually you are only guaranteed 16-byte alignment on physical
memory.

If you are writing a device driver or some other hardware-specific code
that needs a specific alignment with a physical memory address, you have
to handle that at runtime in your code in DOS EXEs. I've seen drivers
that had to guarantee 4K alignment for some reason like 386 paging or
DMA, and they would simply allocate a buffer that was twice that in
size. Somewhere within any given 8K of memory is a 4K-aligned block of
4K just waiting to be found. <grin>
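
The "allocate twice as much" trick can be sketched in a few lines. This
is a minimal illustration in Python rather than driver code; the helper
names and the buffer address 1234:0000 are made up for the example. It
relies only on the real-mode rule that a physical address is
segment * 16 + offset.

```python
def physical(segment: int, offset: int) -> int:
    """Real-mode physical address of segment:offset."""
    return (segment << 4) + offset

def align_up(address: int, alignment: int) -> int:
    """Round `address` up to the next multiple of `alignment` (a power of two)."""
    return (address + alignment - 1) & ~(alignment - 1)

# Hypothetical 8K buffer; DOS only guarantees that it starts on a paragraph.
buffer_start = physical(0x1234, 0x0000)
buffer_size = 8 * 1024

# The first 4K boundary at or after the start of the buffer.
block_start = align_up(buffer_start, 4096)

print(hex(buffer_start), hex(block_start))
print(block_start + 4096 <= buffer_start + buffer_size)  # block fits in the buffer
```

Because the buffer is twice the block size, the aligned 4K block always
fits inside it, whatever paragraph the buffer happens to start on.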

Now for the relocation table, as I recall these were pairs of words.
One word "pointed to" a word in the actual text of the program which
needed to be relocated and the second word in the pair was the
"adjustment" which was required to the target word. These adjustments
were relative to the base segment at which the first byte of the EXE
was loaded.

However, clearly EXEs larger than 64K are possible, and I'm not sure
how that was handled. I seem to recall the offset might be relative to
the previous relocation, or maybe it uses the word at the target
location as part of the relocation too.

I have source to an EXE loader at home (I'm at work right now) so I can
get you an answer to this a little later. Email me if you don't see
anything and still need it.

--Tim Farley
Email: Tim.F...@XcelleNet.com

Dave Speir

Feb 5, 1997

Yes Tim,
The last two paragraphs of your reply are close to what I've been
trying to find out. I understand about the DOS 16-byte paragraph MCBs and
how the "start" of the EXE image (from disk) would be loaded at the
beginning of one of these MCBs, since that's how DOS loads programs. I
also see how this start location would set up the possibility for further
segment alignment and then for further struct alignment, each dependent
on its preceding alignment as to the possibilities. But what I've been
unable to find documentation on (since DOS material is dated nowadays) is
the relocation ptr definitions and how they are used. To be clear, I have
found some more material since I posted the masm news question, but
nothing really definitive, only a clue here or there. And some of what I
found sounded a lot like what you replied. I know now (from documentation
found on the net) that the relocation table is an array of 32-bit ptrs.
Actually it's only an array if there's more than one relocation ptr. And
the relo-ptrs are for segments defined in the EXE image. Now if they are
only 32 bits, then we can safely deduce that the location of the segment
within the EXE image AND the relocation of the segment must be defined
within these 32 bits. Setting aside the possibility of bit-field info,
that only leaves us with one of the words as the source and the other
word as the target, like you said. And the base segment (as you said) is
certainly feasible as a starting pt, since its value should be sitting in
the CS reg (correct me if I'm astray here) upon DOS allocating the
available MCB for the EXE to load in. It seems just as feasible that the
problem of a larger-than-64K program could also be solved by making the
second relo-ptr a 32-bit far ptr from the first "near offset relo-source"
that replaced the "first near offset relo-target" upon loading. Or
possibly the second relo-ptr could be a "near offset" from the first
source ptr, but this would seem limiting, since if the first segment was
within, say, 3 bytes of 64K in size then you couldn't relocate the next
seg on a para alignment with a near offset value. Anyhow, your (and
Philippe's) replies have helped me a great deal in getting underneath
this. It would seem that all I have to do now is experiment with DEBUG to
see mathematically whether the patched-up segments are indeed starting at
the locations we are proposing. However, if there's some way I could
download the source of the exec loader you have, I would like to look at
it. Or if you know where I could find a copy of the MS-DOS Encyclopedia
by Microsoft Press, which is out of print I believe. After I finish this
study of the DOS relo-ptrs I can store it away and move on to the newer
executable types.

Here's to asking you another question someday.

I'm a systems analyst student at night and a construction project manager
by day. My biggest problem is time and my biggest solution is the net.
Or, as it has been said, time is NOT a constant but rather moves at the
speed of the observer. Maybe this is why some days go faster than others
<grin>
Sincerely Dave Speir Florida. , sp...@mail.gte.net

Dave Speir

Feb 6, 1997

Thursday 6th. about 12 noon eastern standard.
Tim, I should have waited until I had rested before answering you last
night at midnight, but I was intent on doing it before I went to
bed. In retrospect, it hit me this morning that I was in
error in stating that para segment alignment could not
be achieved with a near offset if the previous source (fixup) seg
was 61 kb in size. It could be done, obviously, now that I'm
clear-headed enough to see it. But the huge model could pose some
problems as to whether a word offset could accomplish alignment.
In that case it would seem that, without the DOS loader being
more intelligent than it is, the 32-bit relo-ptr could be a seg:offset
from the previous source (fixup) patch address in the relo-table.
Wish I had more time but I'm on my lunch hour now.
Greatly appreciate your help.
Dave .


Tim Farley

Feb 7, 1997

I sent you an answer via Email, but just for the folks looking in
here over our shoulders, here's a repeat...

I did dig up that disassembled copy of Novell's NETX and looked at their
loader. The relocation table works this way:

Each relocation element in the table is a DWORD (32 bits). That DWORD
contains a standard 16-bit segmented pointer, i.e. the offset word
first, followed by the segment value.

Each pointer in the relocation table points to one WORD in the actual
program which needs to be fixed. These words are segment references
within instructions or data in the code.
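
Applied to the twelve table bytes from the original post, that layout
can be decoded as follows. This is a minimal Python sketch; the function
name decode_relocations is made up for the example, and the
"image-relative position" it computes is simply segment * 16 + offset,
before any load address is added.

```python
import struct

# Relocation table bytes quoted in the original post.
TABLE = bytes.fromhex("01 00 20 00 0D 00 20 00 19 00 20 00")

def decode_relocations(table: bytes):
    """Split the table into (offset, segment) pairs, offset word first."""
    return [struct.unpack_from("<HH", table, i) for i in range(0, len(table), 4)]

entries = decode_relocations(TABLE)
positions = [segment * 16 + offset for offset, segment in entries]

for (offset, segment), pos in zip(entries, positions):
    print(f"{segment:04X}:{offset:04X} -> image position {pos:#06x}")
```

Read this way, the three entries are 0020:0001, 0020:000D and 0020:0019,
that is, three words at positions 201h, 20Dh and 219h from the start of
the program image. They locate the words to be patched; they do not
describe segment alignment.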

Ironically enough, that means that you have to fix up the relocation
pointers themselves first, then use them to fix up the target words in
the EXE image! Other than that one weirdness, it's pretty simple,
really.

To fix up the EXE, you must do the following:

1. Load the EXE into its destination location in memory.

2. Load the relocation table somewhere else in memory where you can use
it. (I think DOS uses one of the BUFFERS= buffers for this, NETX uses
an internal buffer).

3. Remember the segment which represents the first byte of the actual
EXE image, i.e. the first byte of the program AFTER the EXE header,
which gets loaded just after the end of the PSP for this program. Let's
assume for the moment that you hold this in DX, meaning the first byte
of your actual program is loaded at DX:0000.

4. Load a relocation element into a pointer like DS:SI or ES:DI.

5. Adjust the segment value of your relocation pointer by adding DX to
it (see step 3).

6. Use the resulting pointer to fetch a word of the actual program
sitting in memory.

7. Add DX to this word as well, and store it back where you found it.
(Actually 6 and 7 can be done in one instruction, of course).

8. Repeat steps 4 through 7 until you are done with the relocation
table.
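
Steps 4 through 8 can be sketched as a short loop. This is an
illustration in Python under stated assumptions, not the DOS or NETX
code: the program image is modeled as a bytearray whose index 0 stands
for DX:0000, load_seg plays the role of DX, and the function name and
the demo values stored in the image are made up for the example.

```python
import struct

def apply_fixups(image: bytearray, table: bytes, load_seg: int) -> None:
    """Patch every word named by the relocation table, in place."""
    for i in range(0, len(table), 4):
        offset, segment = struct.unpack_from("<HH", table, i)    # step 4
        # Step 5 in real memory forms (segment + load_seg):offset. Here
        # index 0 of `image` already stands for load_seg:0000, so the
        # image-relative position is enough.
        pos = segment * 16 + offset
        (word,) = struct.unpack_from("<H", image, pos)           # step 6
        struct.pack_into("<H", image, pos, (word + load_seg) & 0xFFFF)  # step 7

# Hypothetical demo: a blank image with three made-up segment values
# stored at the positions named by the table from the original post.
table = bytes.fromhex("01 00 20 00 0D 00 20 00 19 00 20 00")
image = bytearray(0x300)
for pos, value in ((0x201, 0x0002), (0x20D, 0x0004), (0x219, 0x0006)):
    struct.pack_into("<H", image, pos, value)

apply_fixups(image, table, load_seg=0x1000)
patched = [struct.unpack_from("<H", image, p)[0] for p in (0x201, 0x20D, 0x219)]
print([hex(w) for w in patched])
```

Each patched word ends up as its link-time value plus the load segment,
wrapped to 16 bits, which is all the relocation amounts to.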

--Tim Farley
Email: Tim.F...@XcelleNet.com
