DOS COM File format question

Will Hartung

Oct 26, 2002, 12:32:27 PM
Hi all,

I understand that a DOS COM file format is simply a memory image.

DOS loads the program at 0x0000? and then starts executing at 0x0100?

Can I place data in the first 256 bytes?

What are the segment registers set at? Anything, or do I need to clear them
first thing?

Also, it's my understanding that even though a COM file is, essentially,
only a "tiny" model program, that MS-DOS allocates "all of the memory" to
the COM file, so technically, I could have a tiny model program query MS-DOS
for the size of the heap, and go dancing through all of that memory, but the
original image must be less than 64K. Is that correct?

Thanx for any thoughts you might have on this.

Regards,

Will Hartung
(wi...@msoft.com)

Dave Dunfield

Oct 26, 2002, 3:39:31 PM
>Hi all,

>I understand that a DOS COM file format is simply a memory image.

>DOS loads the program at 0x0000? and then starts executing at 0x0100?
>Can I place data in the first 256 bytes?

No, DOS assigns a free memory segment, and then loads the program
at offset 0x100 into that segment. In other words, the first byte from the file
goes at 0x100 and on-up. The file cannot contain data to be placed in the
offset range 0x00-0xFF.

The contents of locations 0x00-0xFF are filled in by DOS and are called
the "Program Segment Prefix" (PSP). The top half of this (0x80-0xFF)
is the "command tail", and contains the command line argument string.
Locations 0x00-0x7F contain various internal DOS information, some of
which remain as artifacts from CP/M. Included in this area are things
like:
- Default file control blocks
- Saved termination, Ctrl-C and critical error vectors
- A pointer to the environment segment
- A pointer to the program's highest allocated memory block
- An INT 20 call for program termination

>What are the segment registers set at? Anything, or do I need to clear them
>first thing?

All segment registers (CS, DS, ES and SS) are set to the same value, which
will be the segment in which your program was loaded.

>Also, it's my understanding that even though a COM file is, essentially,
>only a "tiny" model program, that MS-DOS allocates "all of the memory" to
>the COM file, so technically, I could have a tiny model program query MS-DOS
>for the size of the heap, and go dancing through all of that memory, but the
>original image must be less than 64K. Is that correct?

DOS uses a very simple scheme to track memory allocation, mainly because
the nature of the real mode processor means that once a program is loaded,
neither it nor its data can be moved. Memory is allocated in simple
non-movable blocks. A .COM program does not contain information about how
much memory it needs, so DOS takes the safe approach and gives it the
largest available block. Unless you have loaded and unloaded transient
programs out of sequence (which fragments the simple blocking scheme),
this will be essentially the entire amount of free conventional memory. It is up
to the program to "give back" any memory it doesn't need.

If you don't give it back, yes you can use the memory any way you like. Keep in
mind that DOS will set the stack to the top of the program load segment, or the
top of physical memory, whichever is less. For most systems, this means that
your stack is sitting out there 64k above your load segment. Be sure to either
move it or avoid it when "dancing through all that memory".

The maximum size of a .COM file that DOS5 will load is 65279 bytes, which
works out to 64k (65536) - 256 (PSP) - 1 extra byte, which is probably an
"off by one error". However, it is not practical to have a load file this large,
because the upper end would be corrupted by the stack.

DOS places an "INT 20" instruction in the PSP at location 0, and pushes a
word of value 0 on the stack when launching the program. This means:

- Your program can just use "RET" to exit (provided that you have not
moved the stack)

- At least 2 bytes at the upper end of the image will be written by the
stack before your program launches, meaning the largest image you
could actually get uncorrupted into memory is 65278 bytes (until an
interrupt comes along and the stack grows some more).
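
The size arithmetic above can be checked with a few lines. This is only a sketch of the numbers quoted in this post, assuming the program gets a full 64K segment so that SP ends up at 0xFFFE after DOS pushes the zero word; the helper names are made up for illustration.

```python
SEGMENT_SIZE = 0x10000   # 64K: offsets 0x0000-0xFFFF
PSP_SIZE = 0x100         # the image starts right after the PSP
INITIAL_SP = 0xFFFE      # assumed: SP after DOS pushes the zero word

def last_offset(image_size: int) -> int:
    """Offset of the final image byte within the load segment."""
    return PSP_SIZE + image_size - 1

def survives_initial_stack(image_size: int) -> bool:
    """True if the image ends below the word DOS pushed at 0xFFFE."""
    return last_offset(image_size) < INITIAL_SP

dos5_limit = SEGMENT_SIZE - PSP_SIZE - 1   # 65279, the load limit quoted above
largest_intact = INITIAL_SP - PSP_SIZE     # 65278, ends at offset 0xFFFD

print(dos5_limit, hex(last_offset(dos5_limit)), survives_initial_stack(dos5_limit))
print(largest_intact, hex(last_offset(largest_intact)), survives_initial_stack(largest_intact))
```

A 65279-byte image would end at offset 0xFFFE, which is exactly where the pushed zero word sits.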

>Thanx for any thoughts you might have on this.

>Regards,

>Will Hartung
>(wi...@msoft.com)


--
dave@ Dunfield Development Systems http://www.dunfield.com
dunfield. Low cost software development tools for embedded systems
com Software/firmware development services Fax:613-256-5821

Kenneth Brody

Oct 26, 2002, 4:10:18 PM
Will Hartung wrote:
>
> Hi all,
>
> I understand that a DOS COM file format is simply a memory image.
>
> DOS loads the program at 0x0000? and then starts executing at 0x0100?

No. DOS loads the image at 0x0100.

> Can I place data in the first 256 bytes?

No. DOS uses that for the PSP ("Program Segment Prefix") which contains
critical system information.

> What are the segment registers set at? Anything, or do I need to clear them
> first thing?

SP = 0xfffe
DS, ES, SS, CS = segment
IP = 0x0100

I don't recall if the other registers are guaranteed to contain anything
meaningful. (Though, under DEBUG, BX:CX=image size, everything else = 0.)
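
As a rough illustration of what those register values mean in real mode, the sketch below turns segment:offset pairs into physical addresses (segment times 16 plus offset). The load segment 0x1A2B is a made-up value; DOS picks the real one at load time.

```python
def linear(segment: int, offset: int) -> int:
    """Real-mode physical address: segment * 16 + offset."""
    return (segment << 4) + offset

load_seg = 0x1A2B                      # hypothetical segment chosen by DOS

psp_base = linear(load_seg, 0x0000)    # start of the PSP
entry = linear(load_seg, 0x0100)       # CS:IP, first byte of the .COM image
stack_top = linear(load_seg, 0xFFFE)   # SS:SP at program start

print(hex(psp_base), hex(entry), hex(stack_top))
```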

[...]

--

+---------+----------------------------------+-----------------------------+
| Kenneth | kenbrody at spamcop.net | "The opinions expressed |
| J. | http://www.hvcomputer.com | herein are not necessarily |
| Brody | http://www.fptech.com | those of fP Technologies." |
+---------+----------------------------------+-----------------------------+

franz

Oct 26, 2002, 5:46:49 PM

Will Hartung wrote:

> Hi all,
>
> I understand that a DOS COM file format is simply a memory image.

Yes, that's right.

> DOS loads the program at 0x0000? and then starts executing at 0x0100?

Not quite. The image is loaded at 0x0100 and execution begins at that
address.

> Can I place data in the first 256 bytes?

No. That's the program segment prefix and contains information that DOS
constructs for your program at load time. It remains there to provide backward
compatibility to previous DOS versions as well as the CP/M operating system
which DOS 1.x was based on.

> What are the segment registers set at? Anything, or do I need to clear them
> first thing?

The four 8086 segment registers (CS, DS, SS, ES) point to the base of the program
segment prefix mentioned above. I don't know if later versions of DOS do the
same thing to FS and GS in 386 and later chips.


> Also, it's my understanding that even though a COM file is, essentially,
> only a "tiny" model program, that MS-DOS allocates "all of the memory" to
> the COM file, so technically, I could have a tiny model program query MS-DOS
> for the size of the heap, and go dancing through all of that memory, but the
> original image must be less than 64K. Is that correct?

Yes, that's correct. DOS 1.x didn't have much in the way of memory management,
and the 8086 didn't have any protection mechanisms to prevent a rogue program
from modifying memory it wasn't supposed to, so as a backwards compatibility
measure your little .COM program gets allocated all available conventional
memory at load time. If you want to exec other programs, you'll have to call
the appropriate DOS functions to shrink your program's memory allocation down to
what it actually needs.
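
The usual call for this is INT 21h function 4Ah (resize memory block), which takes the block's segment in ES and the new size in BX, counted in 16-byte paragraphs. A minimal sketch of the size calculation follows, assuming the block starts at the PSP; the function name and the sample sizes are made up for illustration.

```python
PARAGRAPH = 16      # DOS allocates memory in 16-byte paragraphs
PSP_SIZE = 0x100    # the block begins at the PSP, so it has to be counted

def paragraphs_to_keep(image_size: int, extra_bytes: int = 0) -> int:
    """Paragraph count to pass in BX when shrinking the program's block.

    extra_bytes covers whatever the program still needs after the image,
    such as a relocated stack or scratch buffers.
    """
    total = PSP_SIZE + image_size + extra_bytes
    return (total + PARAGRAPH - 1) // PARAGRAPH   # round up to a whole paragraph

# Hypothetical program: a 5000-byte image plus a 1K stack.
bx = paragraphs_to_keep(image_size=5000, extra_bytes=1024)
print(bx, hex(bx))
```

Note that the stack DOS sets up normally sits near the top of the 64K segment, so a program that shrinks below that point would have to move SP inside the retained area before making the call.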

> Thanx for any thoughts you might have on this.
>
> Regards,
>
> Will Hartung
> (wi...@msoft.com)

Interesting email address. Was this a test? Hope I get an "A" ...

-Frank
