This is something I looked at many moons ago (and I once knew the exact
answers - I must still have documentation somewhere). But here is what
I can tell off the top of my head (caution):
I don't think the alignment is handled at the loader level. The DOS
loader is a very crude, simple loader.
I think the DOS loader always loads with the same alignment, on a
paragraph (or is it a 256-byte page?) boundary. Loaders on systems
where the swap page is 4K would rather load on a coarser
alignment boundary, like a 4K page.
Assembly-defined alignment is handled at the binary image level,
i.e.
- The linker aligns segments in the binary image according to the
alignment attributes given in the SEGMENT directives.
Various linkers behave differently when they get conflicting
directives for the same segment from different modules. The smarter
ones use the more demanding alignment they found for any given
segment, but it's not safe to rely on this.
- The assembler pads code (and/or data) according to ALIGN directives
(or "STRUCT X", etc.) so that alignment is respected. This of
course assumes that the segment alignment has been properly defined by
the programmer.
As I said above, I don't think there is any alignment directive for
the loader in the .EXE file. The loader is only assumed to always
align on a boundary that's greater or equal to the largest alignment
required by the program.
- In the assembly language code, you can't request in a segment an
alignment that's larger than the segment's own alignment. For
instance, if the segment is defined as
    FOO SEGMENT BYTE 'DATA'
neither the assembler nor the linker will be able to honor an
    ALIGN DWORD
directive. They can't, because they can't ensure at assembly or link time that
the segment will be at least DWORD aligned (typically, it won't). The
assembler should give an error in such a case. A really smart
assembler would give a warning and enlarge the segment alignment
definition on the fly, but I don't know any that does.
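
To make the point above concrete, here is a small sketch in Python (not assembler, just arithmetic). The names padding_for and is_aligned and the two segment base addresses are made up for illustration. It shows the padding an assembler would insert for a 4-byte alignment, and why that padding only helps if the segment itself starts on a 4-byte boundary.

```python
def padding_for(offset, alignment):
    """Bytes of padding needed to round offset up to a multiple of alignment."""
    return (-offset) % alignment


def is_aligned(address, alignment):
    """True if address is a multiple of alignment."""
    return address % alignment == 0


# The assembler is at offset 5 inside the segment and must align to 4 bytes.
offset = 5
aligned_offset = offset + padding_for(offset, 4)   # 3 padding bytes are added

# Hypothetical segment start addresses chosen later by the linker.
dword_aligned_base = 0x1000   # a DWORD-aligned segment could start here
byte_aligned_base = 0x1003    # a BYTE-aligned segment may start anywhere

# Same offset inside the segment, different final alignment.
ok_in_dword_segment = is_aligned(dword_aligned_base + aligned_offset, 4)
ok_in_byte_segment = is_aligned(byte_aligned_base + aligned_offset, 4)

print(aligned_offset, ok_in_dword_segment, ok_in_byte_segment)
```

The padded offset is the same in both cases. Only the segment's own alignment decides whether the final address is aligned, which is why the assembler cannot honor the request inside a BYTE segment.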
- Segment alignment
I think the DOS loader only handles three allocation blocks separately:
the PSP, the memory image and the stack.
The relocation table contains offsets relative to the start of the
program image. The loader loads the image (pure image, not PSP)
somewhere in memory at address XYZ, loads the relocation table
somewhere in some work memory and then for each entry in the table:
- uses the entry as an index into the image,
- adds the segment value of the XYZ base address to the word stored at
the specified location.
I'm not sure that knowing the DOS loader format will help you study the
LE and NE formats. I don't know much about them. I know they are
documented somewhere (I think I've run into the document once), but I
never looked at them.
But I've looked at PE format, and I can tell you that
1) PE format is completely different from DOS loader format. PE format
is actually much smarter.
2) PE format at least is fully documented in MSDN, on the library CD,
in "Specifications / Microsoft Portable Executable and Common Object
File Format Specification 4.1"
Hope this helps somehow. If it doesn't, yell back and I'll try to
figure out where my very old documentation is.
On Sat, 01 Feb 1997 14:03:54 -0800, "Dave Sper" <sp...@mail.gte.net>
wrote:
I think what Philippe said is correct about alignment. It is handled
when the EXE is built, not at load time. Because of the way the segment
registers are set up, you are guaranteed 64K alignment at the "start" of
your EXE, but only in the context of the segment registers you are
using. Actually you are only guaranteed 16-byte alignment on physical
memory.
If you are writing a device driver or some other hardware-specific code
that needs a specific alignment with a physical memory address, you have
to handle that at run time in your own code in a DOS EXE. I've seen drivers
that had to guarantee 4K alignment for some reason like 386 paging or
DMA, and they would simply allocate a buffer that was twice that in
size. Somewhere within any given 8K of memory is a 4K-aligned block of
4K just waiting to be found. <grin>
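
The double-size buffer trick can be sketched as follows, in Python for the arithmetic only. The segment value 0x1234 is a made-up example of what a DOS allocation might return, and linear and aligned_block are hypothetical helper names. Real-mode physical addresses are segment * 16 + offset, which is also why only 16-byte alignment is guaranteed.

```python
PAGE = 4096  # the 4K alignment the driver needs


def linear(segment, offset=0):
    """Real-mode physical address of segment:offset."""
    return (segment << 4) + offset


def aligned_block(buffer_start, alignment=PAGE):
    """First address at or after buffer_start that is a multiple of alignment.

    alignment must be a power of two for the mask trick to work.
    """
    return (buffer_start + alignment - 1) & ~(alignment - 1)


# Hypothetical: an 8K buffer was allocated and starts at segment 0x1234.
buffer_start = linear(0x1234)
buffer_end = buffer_start + 2 * PAGE

block = aligned_block(buffer_start)

# The aligned 4K block always fits inside the 8K buffer.
fits = buffer_start <= block and block + PAGE <= buffer_end
print(hex(buffer_start), hex(block), fits)
```

Since the buffer start is already paragraph aligned, the rounding skips at most 4080 bytes, so a full 4K block is always left over.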
Now for the relocation table, as I recall these were pairs of words.
One word "pointed to" a word in the actual text of the program which
needed to be relocated and the second word in the pair was the
"adjustment" which was required to the target word. These adjustments
were relative to the base segment at which the first byte of the EXE was
loaded.
However, clearly EXE's larger than 64K are possible, and I'm not sure
how that was handled. I seem to recall the offset might be relative to
the previous relocation, or maybe it uses the word at the target
location as part of the relocation too.
I have source to an EXE loader at home (I'm at work right now) so I can
get you an answer to this a little later. Email me if you don't see
anything and still need it.
--Tim Farley
Email: Tim.F...@XcelleNet.com
relo-source" that replaced the "first near offset relo-target" upon
loading.
Or possibly the second relo-ptr could be a "near offset" from the first
source ptr, but this would seem limiting: if the first segment was within,
say, 3 bytes of 64K in size, then you couldn't relocate the next seg on a
para alignment with a near offset value. Anyhow, your (and Philippe's)
replies have helped me a great deal in getting underneath this. It would
seem now that all I have to do is experiment with DEBUG to see
mathematically whether the patched-up segments are indeed starting at the
locations we are proposing. However, if there's some way I could download
the source of the exec loader you have, I would like to look at it. Or
perhaps you know where I could find a copy of the MS-DOS Encyclopedia by
Microsoft Press, which is out of print, I believe. After I finish this
study of the DOS relo-ptrs I can store it away and move on to the newer
executable types.
Here's to asking you another question someday.
I'm a systems analyst student at night and a construction project manager
by day. My biggest problem is time and my biggest solution is the net. Or,
as it has been said, time is NOT a constant but rather moves at the speed
of the observer. Maybe this is why some days go faster than others <grin>
Sincerely, Dave Speir, Florida, sp...@mail.gte.net
I did dig up that disassembled copy of Novell's NETX and looked at their
loader. The relocation table works this way:
Each relocation element in the table is a DWORD (32 bits). That DWORD
contains a standard 16-bit segmented pointer, i.e. the offset word
first, followed by the segment value.
Each pointer in the relocation table points to one WORD in the actual
program which needs to be fixed. These words are segment references
within instructions or data in the code.
Ironically enough, that means that you have to fix up the relocation
pointers themselves first, then use them to fix up the target words in
the EXE image! Other than that one weirdness, it's pretty simple,
really.
To fix up the EXE, you must do the following:
1. Load the EXE into its destination location in memory.
2. Load the relocation table somewhere else in memory where you can use
it. (I think DOS uses one of the BUFFERS= buffers for this, NETX uses
an internal buffer).
3. Remember the segment which represents the first byte of the actual
EXE image. I.e. the first byte of the program AFTER the EXE header,
which gets loaded just after the end of the PSP for this program. Let's
assume for the moment that you hold this in DX, meaning the first byte
of your actual program is loaded at DX:0000.
4. Load a relocation element into a pointer like DS:SI or ES:DI.
5. Adjust the segment value of your relocation pointer by adding DX to
it (see step 3).
6. Use the resulting pointer to fetch a word of the actual program
sitting in memory.
7. Add DX to this word as well, and store it back where you found it.
(Actually 6 and 7 can be done in one instruction, of course).
8. Repeat steps 4 through 7 until you are done with the relocation
table.
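
The steps above can be sketched as a small simulation. This is Python standing in for the real-mode code, not the DOS or NETX loader itself. The bytearray plays the role of physical memory, load_segment plays the role of DX, and apply_relocations, the memory size and the sample table entries are all made up for illustration.

```python
import struct


def apply_relocations(memory, load_segment, raw_table):
    """Fix up segment references in an image already copied into memory.

    memory       -- bytearray simulating real-mode physical memory
    load_segment -- segment of the first byte of the image (DX in the steps)
    raw_table    -- relocation table bytes: offset word, then segment word
    """
    for offset, segment in struct.iter_unpack("<HH", raw_table):
        # Step 5: adjust the relocation pointer's own segment by DX.
        target = ((segment + load_segment) << 4) + offset
        # Step 6: fetch the word of the program that needs fixing.
        (value,) = struct.unpack_from("<H", memory, target)
        # Step 7: add DX (wrapping at 16 bits) and store it back.
        struct.pack_into("<H", memory, target, (value + load_segment) & 0xFFFF)


# Hypothetical setup: 8K of memory, image loaded at segment 0x0100.
memory = bytearray(0x2000)
load_segment = 0x0100

# A segment reference of 0x0002 sits at image address 0000:0010.
struct.pack_into("<H", memory, (load_segment << 4) + 0x0010, 0x0002)

# Two table entries: 0000:0010 and 0001:0004 (offset word stored first).
raw_table = struct.pack("<HHHH", 0x0010, 0x0000, 0x0004, 0x0001)

apply_relocations(memory, load_segment, raw_table)

first = struct.unpack_from("<H", memory, 0x1010)[0]
second = struct.unpack_from("<H", memory, 0x1014)[0]
print(hex(first), hex(second))
```

The word that held 0x0002 now holds 0x0102, i.e. the original segment reference plus the load segment, which is all the fixup amounts to.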
--Tim Farley
Email: Tim.F...@XcelleNet.com