C's system() & COMMAND.COM

Alexei A. Frounze

Sep 11, 2014, 2:37:26 PM

I'm implementing system() in the DOS version of my compiler's (https://github.com/alexfru/SmallerC) standard C library and having a bit of a dilemma.

It appears that if I build system("command") around "%COMSPEC% /C command", neither the "execute" function 0x4b00 nor the "get exit code" function 0x4d returns the exit code from "command" when "command" resolves to a .COM/.EXE executable. It looks like I'm getting the success exit code of 0 from COMMAND.COM itself.

This is unfortunate since I can't easily detect errors in child processes based on their exit code.
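
For reference, the "%COMSPEC% /C command" construction can be sketched roughly as below. This is only an illustration, not the library's actual code: build_comspec_tail() and command_processor() are made-up names, the fallback to plain "COMMAND.COM" when COMSPEC is unset is an assumption, and the 126-character cap reflects the usual limit on the DOS command tail. The 0x4b00 call itself is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed limit on the DOS command tail (PSP offset 0x80), excluding
   the terminating CR. */
#define DOS_TAIL_MAX 126

/* Hypothetical helper: builds the command tail handed to the command
   processor, i.e. "/C <command>". Returns 0 on success, -1 if the
   result does not fit in the buffer or exceeds the DOS limit. */
int build_comspec_tail(const char *command, char *tail, size_t size)
{
    int n;

    assert(command != NULL && tail != NULL);
    n = snprintf(tail, size, "/C %s", command);
    if (n < 0 || (size_t)n >= size || n > DOS_TAIL_MAX)
        return -1;
    return 0;
}

/* Hypothetical helper: picks the command processor from %COMSPEC%,
   falling back to "COMMAND.COM" (an assumption) when it is unset. */
const char *command_processor(void)
{
    const char *comspec = getenv("COMSPEC");

    return (comspec != NULL && *comspec != '\0') ? comspec : "COMMAND.COM";
}
```

The program path from command_processor() and the tail from build_comspec_tail() would then be passed to the "execute" function, and whatever function 0x4d reports afterwards is COMMAND.COM's own exit code, which is exactly the problem.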

I can think of a few pretty nasty workarounds.

1. Do COMMAND.COM's job: find the full path of the .COM/.EXE and execute it directly without involving "COMMAND.COM /C". The problem here is that even though I can traverse %PATH% and find the executable, if it's there, I need to know that the command is not one of COMMAND.COM's internal commands, and the set of the command processor's internal commands varies between DOS versions, vendors and DOS emulators (e.g. DosBox). I don't want to incorporate substantial knowledge about different DOS versions into the library. I don't feel it's the right way to do things.

2. Hook int 0x20 and int 0x21's function 0x4c, which are used to terminate executables, note the terminations and stash away the exit codes (from function 0x4c only, as int 0x20 doesn't take an exit code). If I ignore the last one (from COMMAND.COM), then the one before it is from my executable. But again, what if the command resolves to a batch file, which in turn invokes a .COM/.EXE? In that case I'm occasionally going to get an exit code that has nothing to do with the batch file's execution. Further, I may run into trouble hooking DOS interrupts: either DOS may not appreciate it or some ancient anti-virus may block such suspicious activity.
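
To make workaround 1 a bit more concrete, the %PATH% traversal might look roughly like the sketch below. It is a simplification under several assumptions: find_in_path() and file_exists() are hypothetical names, existence is tested with fopen() rather than a DOS find-first call, the current directory and .BAT files are not considered, and it does nothing about the real difficulty, which is telling internal commands apart.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical existence test: a file "exists" if it can be opened. */
static int file_exists(const char *path)
{
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}

/* Searches each ';'-separated directory in path_var for name with
   ".COM" and then ".EXE" appended. On success stores the full path
   in out and returns 1; returns 0 if nothing is found. */
int find_in_path(const char *path_var, const char *name,
                 char *out, size_t size)
{
    static const char *const exts[] = { ".COM", ".EXE" };
    const char *dir = path_var;

    assert(path_var != NULL && name != NULL && out != NULL);
    while (*dir != '\0') {
        size_t len = strcspn(dir, ";");   /* length of this entry */
        size_t i;

        if (len > 0) {
            char last = dir[len - 1];
            /* Add a backslash unless the entry already ends in a
               separator or a drive colon. */
            const char *sep =
                (last == '\\' || last == '/' || last == ':') ? "" : "\\";

            for (i = 0; i < 2; i++) {
                int n = snprintf(out, size, "%.*s%s%s%s",
                                 (int)len, dir, sep, name, exts[i]);
                if (n > 0 && (size_t)n < size && file_exists(out))
                    return 1;
            }
        }
        dir += len;
        if (*dir == ';')
            dir++;                        /* skip the separator */
    }
    return 0;
}
```

A caller would pass getenv("PATH") and the bare program name, then execute the returned path directly with function 0x4b00 so that function 0x4d reports the program's own exit code.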

I could probably devise a more intrusive workaround, but it might result in incompatibilities with various DOSes and be even more fragile.

Any suggestions, ideas?

Thanks,
Alex

Ross Ridge

Sep 11, 2014, 7:07:46 PM

Alexei A. Frounze <alexf...@gmail.com> wrote:
>Any suggestions, ideas?

I wouldn't worry about it. Other MS-DOS C compilers' system() functions
didn't return the exit status, so there's no reason for yours to behave
differently. It was a common complaint back in the day, but as you've
apparently discovered there was a good reason for this.

Ross Ridge

--
l/ // Ross Ridge -- The Great HTMU
[oo][oo] rri...@csclub.uwaterloo.ca
-()-/()/ http://www.csclub.uwaterloo.ca/~rridge/
db //

Rod Pemberton

Sep 11, 2014, 9:25:31 PM

On Thu, 11 Sep 2014 14:37:26 -0400, Alexei A. Frounze
<alexf...@gmail.com> wrote:

> I'm implementing system() in the DOS version of my compiler's [link]
> standard C library and having a bit of a dilemma.
>
> It appears that if I build system("command") around "%COMSPEC% /C
> command", neither the "execute" function 0x4b00 nor the "get exit code"
> function 0x4d does return the exit code from "command" if "command"
> resolves to a .COM/.EXE executable. Looks like I'm getting success exit
> code of 0 from COMMAND.COM itself.
>

I'm not sure what DJGPP or OpenWatcom do, but the code is available
for both and you can look at it at your leisure. I've done so for
DJGPP, but it gets long-winded and complicated.

> This is unfortunate since I can't easily detect errors in child
> processes based on their exit code.

Is there some other way to monitor/trap/store exit codes?

> I can think of a few pretty nasty workarounds.
>
> 1. Do COMMAND.COM's job and find the full path of the .COM/.EXE and
> execute it directly without involving "COMMAND.COM /C". The problem here
> is that even though I can traverse %PATH% and find the executable, if
> it's there, I need to know that the command is not one of COMMAND.COM's
> internal commands and the set of command processor's commands varies
> between DOS versions and vendors and DOS emulators (e.g. DosBox).

Yes. I encountered the same issue a while ago. I wrote a function,
a large function with hashing and such, to detect the more widely
available internal DOS commands, before I attempted to call system().
Of course, if it's internal, it should still work with a spawn, yes?
Although, maybe slow ...

I know DJGPP has some functions which execute programs differently
based on their file extensions. These are used by system() and the
various spawn functions.

> I don't want to incorporate substantial knowledge about different DOS
> versions into the library. I don't feel it's the right way to do things.
>
> 2. Hook int 0x20 and int 0x21's function 0x4c that are used to terminate
> executables and note terminations and stash away the exit codes (from
> function 0x4c only as int 0x20 doesn't take any exit code).

FYI, DPMI also exits the DPMI host via Int 0x21, function 0x4c.
Some DPMI hosts are single tasking, while others, like DJGPP's
CWSDPMI, are nestable. I.e., a nestable DPMI host may have 0x21 0x4c
called many times before actually exiting the host. If it's permanent
or resident, it'll never exit. This could get into complex issues of
bimodal trapping of the interrupt (both RM and PM) and preventing the
DPMI host from changing the interrupt vector which would bypass your
trap, etc.

Also, weren't there a couple of other ways to exit a DOS application
besides 0x20 and 0x21 0x4c? CP/M far jump? ret instruction?
I.e., I'm not sure how you'd trap those, if they're being used.

Rod Pemberton

Alexei A. Frounze

Sep 11, 2014, 9:41:00 PM

On Thursday, September 11, 2014 4:07:46 PM UTC-7, Ross Ridge wrote:

> Alexei A. Frounze <...@gmail.com> wrote:
>
> >Any suggestions, ideas?
>
> I wouldn't worry about it. Other MS-DOS C compilers' system() functions
> didn't return the exit status, so there's no reason for yours to behave
> differently. It was a common complaint back in the day, but as you've
> apparently discovered there was a good reason for this.

I don't think the reason was good. Unfortunately, even the C standard says system() can return anything at all if the parameter isn't NULL. This makes it nearly impossible to stay within the C standard (e.g. use only system(), but not exec*(), which is POSIX) and still develop programs that invoke other programs and can make sense of what happened in the child process. Basically, I'm forced to use other communication channels (naturally, files) to find out whether or not the child process exited cleanly. That still won't help with existing software that doesn't use such channels and can't be changed easily, if at all.

Alex

Alexei A. Frounze

Sep 11, 2014, 11:11:39 PM

On Thursday, September 11, 2014 6:25:31 PM UTC-7, Rod Pemberton wrote:
> On Thu, 11 Sep 2014 14:37:26 -0400, Alexei A. Frounze
> <...@gmail.com> wrote:
>
> > I'm implementing system() in the DOS version of my compiler's [link]
> > standard C library and having a bit of a dilemma.
> >
> > It appears that if I build system("command") around "%COMSPEC% /C
> > command", neither the "execute" function 0x4b00 nor the "get exit code"
> > function 0x4d does return the exit code from "command" if "command"
> > resolves to a .COM/.EXE executable. Looks like I'm getting success exit
> > code of 0 from COMMAND.COM itself.
> >
>
> I'm not sure what DJGPP or OpenWatcom do, but the code is available
> for both and you can look at it at your leisure. I've done so for
> DJGPP, but it gets long-winded and complicated.

OW looks at the extension and at a list of internal commands.
DJGPP looks at the extension only.

Seems like if there's the .COM or .EXE extension, they get the fully qualified path and execute the binary directly without invoking COMMAND.COM.

I think I should do the same extension check and explicitly provide the .EXE extension in my compiler driver so that it can detect failures in the subordinate processes in DOS (the core compiler, NASM, the linker).

Better than nothing.
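
A minimal sketch of such an extension check, assuming the program name is the first whitespace-delimited token of the command. The function name names_executable() is hypothetical, and sending anything with redirection or pipe characters back to COMMAND.COM is a choice made for this sketch, not something OW or DJGPP is claimed to do.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Case-insensitive comparison of exactly n characters. */
static int equal_nocase(const char *a, const char *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (toupper((unsigned char)a[i]) != toupper((unsigned char)b[i]))
            return 0;
    return 1;
}

/* Returns 1 if the program name (the first whitespace-delimited token
   of command) ends in ".COM" or ".EXE" and the command contains no
   redirection or pipe characters; returns 0 otherwise. */
int names_executable(const char *command)
{
    size_t len;

    assert(command != NULL);
    /* Assumption of this sketch: redirection and pipes are left to
       the command processor. */
    if (strpbrk(command, "<>|") != NULL)
        return 0;
    command += strspn(command, " \t");   /* skip leading blanks */
    len = strcspn(command, " \t");       /* length of the program name */
    if (len < 5)                         /* need at least "x.exe" */
        return 0;
    return equal_nocase(command + len - 4, ".COM", 4) ||
           equal_nocase(command + len - 4, ".EXE", 4);
}
```

When this returns 1, system() can locate and execute the binary directly and report its exit code. Everything else still goes through "%COMSPEC% /C" and yields COMMAND.COM's exit code.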

> > This is unfortunate since I can't easily detect errors in child
> > processes based on their exit code.
>
> Is there some other way to monitor/trap/store exit codes?

Doesn't look like there is.

> > I can think of a few pretty nasty workarounds.
> >
> > 1. Do COMMAND.COM's job and find the full path of the .COM/.EXE and
> > execute it directly without involving "COMMAND.COM /C". The problem here
> > is that even though I can traverse %PATH% and find the executable, if
> > it's there, I need to know that the command is not one of COMMAND.COM's
> > internal commands and the set of command processor's commands varies
> > between DOS versions and vendors and DOS emulators (e.g. DosBox).
>
> Yes. I encountered the same issue a while ago. I wrote a function,
> a large function with hashing and such, to detect the more widely
> available internal DOS commands, before I attempted to call system().
> Of course, if it's internal, it should still work with a spawn, yes?
> Although, maybe slow ...

If I stick to standard C, there's no spawn(), no exec*(), no nothing. Just the ugly system().

> I know DJGPP has some functions which execute programs differently
> based on their file extensions. These are used by system() and the
> various spawn functions.

Yep.

> > I don't want to incorporate substantial knowledge about different DOS
> > versions into the library. I don't feel it's the right way to do things.
> >
> > 2. Hook int 0x20 and int 0x21's function 0x4c that are used to terminate
> > executables and note terminations and stash away the exit codes (from
> > function 0x4c only as int 0x20 doesn't take any exit code).
>
> FYI, DPMI also exits the DPMI host via Int 0x21, function 0x4c.
> Some DPMI hosts are single tasking, while others, like DJGPP's
> CWSDPMI, are nestable. I.e., a nestable DPMI host may have 0x21 0x4c
> called many times before actually exiting the host. If it's permanent
> or resident, it'll never exit. This could get into complex issues of
> bimodal trapping of the interrupt (both RM and PM) and preventing the
> DPMI host from changing the interrupt vector which would bypass your
> trap, etc.

I'm not planning to offer DPMI support in the compiler (nor LFN). That would be too much research and work for something that's supposed to be small and fun.

DJGPP goes to great lengths to make DOS look like some kind of UNIX and to hide its 16-bit nature. But it's still a pig, no matter how much lipstick you put on it.

> Also, weren't there a couple of other ways to exit a DOS application
> besides 0x20 and 0x21 0x4c? CP/M far jump? ret instruction?
> I.e., I'm not sure how you'd trap those, if they're being used.

I have thought of those as well. It's a nightmare. :)

Thanks!
Alex

Ross Ridge

Sep 11, 2014, 11:49:37 PM

Alexei A. Frounze <alexf...@gmail.com> wrote:

>I don't think the reason was good.

It's not clear to me your proposed alternatives are better.

> Unfortunately, even the C standard says system() can return anything
> at all if the parameter isn't NULL. This makes it nearly impossible
> to stay within the C standard ...

Yes. Certainly any portable C program targeting MS-DOS can't depend on
system() returning an exit status. Your compiler isn't going to change
that no matter how you implement system().

> ... and develop programs that invoke other programs

As soon as you invoke a program with system() you've gone outside the scope
of the C standard. Standard C doesn't say whether that program exists
nor does it define its behaviour.

Unless you're building a toy C compiler, not intended to be actually
useful, you should reconsider your desire to "stick to standard C".
Of all the useful C programs ever written, very few have stuck to just
what's defined in the C standard.

Alexei A. Frounze

Sep 12, 2014, 1:05:31 AM

On Thursday, September 11, 2014 8:49:37 PM UTC-7, Ross Ridge wrote:

> Alexei A. Frounze <...@gmail.com> wrote:
>
> >I don't think the reason was good.
>
> It's not clear to me your proposed alternatives are better.

If the command has .COM or .EXE in what might be a program name, I could execute that directly and get the exit code. It will solve my compiler's dependency on the child's success/failure exit code when the compiler is compiled by itself for DOS. Isn't that better?

> > Unfortunately, even the C standard says system() can return anything
> > at all if the parameter isn't NULL. This makes it nearly impossible
> > to stay within the C standard ...
>
> Yes. Certainly any portable C program targeting MS-DOS can't depend on
> system() returning an exit status. Your compiler isn't going to change
> that no matter how you implement system().

Right. But, like I just said, I can partially solve the problem in my compiler's library.

Further, even though the C standard defines very little, there's some help in POSIX:

http://pubs.opengroup.org/onlinepubs/009695399/functions/system.html:

"
If command is not a null pointer, system() shall return the termination status of the command language interpreter in the format specified by waitpid(). The termination status shall be as defined for the sh utility; ...
"

http://pubs.opengroup.org/onlinepubs/009695399/functions/waitpid.html:

"
The value stored at the location pointed to by stat_loc shall be 0 if and only if the status returned is from a terminated child process that terminated by one of the following means:

The process returned 0 from main().

The process called _exit() or exit() with a status argument of 0.

The process was terminated because the last thread in the process terminated.
"

http://pubs.opengroup.org/onlinepubs/009695399/utilities/sh.html:

"
The following exit values shall be returned:

0      The script to be executed consisted solely of zero or more blank lines or comments, or both.
1-125  A non-interactive shell detected a syntax, redirection, or variable assignment error.
127    A specified command_file could not be found by a non-interactive shell.

Otherwise, the shell shall return the exit status of the last command it invoked or attempted to invoke...
"

I can reasonably expect to get 0 from system() on POSIX systems if the command has succeeded.

So, I'll be taking a dependency on POSIX-ish behavior and I'll be providing similar behavior in my system() in DOS (and Windows, when I get to supporting it).
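
On the POSIX side, the behavior being relied on can be sketched as follows. This is a minimal illustration for a POSIX host only: run_and_get_status() is a hypothetical wrapper name, while WIFEXITED and WEXITSTATUS are the standard macros from <sys/wait.h> for decoding the waitpid()-format value that system() returns.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/wait.h>

/* Hypothetical wrapper: runs command via system() and returns the
   exit status reported by the shell (0-255), or -1 if the child
   could not be created or did not terminate normally. */
int run_and_get_status(const char *command)
{
    int status;

    assert(command != NULL);
    status = system(command);
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Because the shell passes on the exit status of the last command it ran, a zero result here normally means the invoked program itself reported success, which is the behavior a DOS system() can only approximate when it bypasses COMMAND.COM.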

And, clearly, I'm not going to support every imaginable or existing system and architecture. So, it should be good enough within the limited range of what's supported.

> > ... and develop programs that invoke other programs
>
> As soon as you invoke a program with system() you've gone outside the scope
> of the C standard. Standard C doesn't say whether that program exists
> nor does it define its behaviour.

True.

> Unless you're building a toy C compiler, not intended to be actually
> useful,

It is a toy C compiler by all "standards", but it's functional and its special value is in targeting DOS and real/v86 mode.

> you should reconsider your desire to "stick to standard C".
> Of all the useful C programs ever written, very few have stuck to just
> what's defined in the C standard.

I've kind of noticed that in my 15+ years of programming in C/C++. :)

Alex

Rod Pemberton

Sep 12, 2014, 4:56:56 AM

On Thu, 11 Sep 2014 23:11:39 -0400, Alexei A. Frounze
<alexf...@gmail.com> wrote:
> On Thursday, September 11, 2014 6:25:31 PM UTC-7, Rod Pemberton wrote:
>> On Thu, 11 Sep 2014 14:37:26 -0400, Alexei A. Frounze

>> > I don't want to incorporate substantial knowledge about different DOS
>> > versions into the library. I don't feel it's the right way to do things.
>> >
>> > 2. Hook int 0x20 and int 0x21's function 0x4c that are used to terminate
>> > executables and note terminations and stash away the exit codes (from
>> > function 0x4c only as int 0x20 doesn't take any exit code).
>>
>> FYI, DPMI also exits the DPMI host via Int 0x21, function 0x4c.
>> Some DPMI hosts are single tasking, while others, like DJGPP's
>> CWSDPMI, are nestable. I.e., a nestable DPMI host may have 0x21 0x4c
>> called many times before actually exiting the host. If it's permanent
>> or resident, it'll never exit. This could get into complex issues of
>> bimodal trapping of the interrupt (both RM and PM) and preventing the
>> DPMI host from changing the interrupt vector which would bypass your
>> trap, etc.
>
> I'm not planning to offer DPMI support in the compiler (nor LFN). That
> would be too much research and work for something that's supposed to be
> small and fun.
>

What if someone, like me, executes a DPMI application via your system()?

Well, that was the point. I.e., DPMI may mess up your monitoring of
Int 0x21 0x4c, if you haven't tested it with a variety of DPMI hosts.
FYI, CWSDPMI is nestable and PMODETSR is non-nestable. Both come with
DJGPP. Also, some emulations have built in DPMI, e.g., dosemu, Windows,
etc.

Rod Pemberton

Alexei A. Frounze

Sep 12, 2014, 6:12:27 AM

On Friday, September 12, 2014 1:56:56 AM UTC-7, Rod Pemberton wrote:
> On Thu, 11 Sep 2014 23:11:39 -0400, Alexei A. Frounze
> <...@gmail.com> wrote:
>
> > I'm not planning to offer DPMI support in the compiler (nor LFN). That
> > would be too much research and work for something that's supposed to be
> > small and fun.
> >
>
> What if someone, like me, executes a DPMI application via your system()?
>
> Well, that was the point. I.e., DPMI may mess up your monitoring of
> Int 0x21 0x4c, if you haven't tested it with a variety of DPMI hosts.
> FYI, CWSDPMI is nestable and PMODETSR is non-nestable. Both come with
> DJGPP. Also, some emulations have built in DPMI, e.g., dosemu, Windows,
> etc.

I've just implemented extension checking in system() (and started supplying ".exe" in the commands passed to my system()). I'm running self-compiled DOS versions of my compiler under DosBox, where smlrcc invokes NASM, which, as you probably know, is distributed as a DJGPP DPMI binary for DOS, and everything seems to work. The only problem is that NASM is tremendously slow under DosBox. Heck, under Windows it isn't blazingly fast either. I suspect the relative jump optimization needs algorithmic improvement, and there may be something else. But that's just speculation; I haven't looked into it.

Alex

Steve

Sep 12, 2014, 7:44:06 AM

"Rod Pemberton" <buz...@nonamewhichexists.cmm> writes:
>Also, weren't there a couple of other ways to exit a DOS application
>besides 0x20 and 0x21 0x4c? CP/M far jump? ret instruction?

Hi,

Both of those end up invoking Int 20H.

Cheers,

Steve N.

rug...@gmail.com

Sep 17, 2014, 6:06:26 AM

Hi,

On Friday, September 12, 2014 5:12:27 AM UTC-5, Alexei A. Frounze wrote:
>
> I've just implemented extension checking in system() (and started
> supplying ".exe" in the commands passed to my system()) and I'm
> running self-compiled DOS versions of my compiler under DosBox
> and smlrcc invokes NASM, which, as you probably know, is distributed
> as a DJGPP DPMI binary for DOS, and everything seems to work. The
> only problem is that NASM is tremendously slow under DosBox.

Everything is slow under DOSBox. It's not fast. It's full software
emulation. It's a "fast" 486, at best (on a 1 GHz host machine).
It's not really meant for anything besides games. You'd get better
results testing elsewhere, obviously. But I don't know what host
OS you're using or what you would prefer (VirtualBox with VT-X,
NTVDM, native FreeDOS via RUFUS USB, QEMU, etc).

You could probably tweak/clone the dosbox.conf (core=dynamic) and/or
up the cycles, up the frameskip, or try a third-party build (or
even a newer MSVC build, allegedly faster). It's not totally
impossible, but honestly I never do any of that, so I'm probably
not the one to be giving much advice here. :-)

> Heck, under Windows it isn't blazingly fast either.

Back in the day, when I still used NTVDM (before I gave up due to
billions of bugs), it was discovered that a UPX'd DJGPP binary
was twice (or more) as slow as a non-UPX'd binary. You could
try decompressing it (upx -d), and see if that helps.

> I'm suspecting the relative jump optimization needs algorithmic
> improvement. And there may be something else. But that's just
> a speculation. I haven't looked into it.

You mean -Ox? That's been turned on by default for a few years now
(apparently since 2.09, according to the online doc). It used to
be optional (unlike YASM or FASM). You can disable that, of
course (-O0), even via %NASMENV% environment variable. Somehow
I don't think this is your problem, but who knows. For me, it
was never noticeable except on really old machines. But
some things make it more obvious than others. (I never messed
with ZSNES, but I vaguely recall that they claimed it exposed
something ridiculous that made it painfully slow. Dunno the
details, you'd have to ask Frank Kotler.)

Alexei A. Frounze

Sep 17, 2014, 6:28:47 AM

On Wednesday, September 17, 2014 3:06:26 AM UTC-7, rug...@gmail.com wrote:
> Hi,
>
> On Friday, September 12, 2014 5:12:27 AM UTC-5, Alexei A. Frounze wrote:
>
> >
> > I've just implemented extension checking in system() (and started
> > supplying ".exe" in the commands passed to my system()) and I'm
> > running self-compiled DOS versions of my compiler under DosBox
> > and smlrcc invokes NASM, which, as you probably know, is distributed
> > as a DJGPP DPMI binary for DOS, and everything seems to work. The
> > only problem is that NASM is tremendously slow under DosBox.
>
> Everything is slow under DOSBox. It's not fast. It's full software
> emulation. It's a "fast" 486, at best (on a 1 GHz host machine).
> It's not really meant for anything besides games. You'd get better
> results testing elsewhere, obviously. But I don't know what host
> OS you're using or what you would prefer (VirtualBox with VT-X,
> NTVDM, native FreeDOS via RUFUS USB, QEMU, etc).

I know. DosBox is probably some 50 times slower than the host (3.x GHz Win7).

> You could probably tweak/clone the dosbox.conf (core=dynamic) and/or
> up the cycles, up the frameskip, or try a third-party build (or
> even a newer MSVC build, allegedly faster). It's not totally
> impossible, but honestly I never do any of that, so I'm probably
> not the one to be giving much advice here. :-)

I'm using Ctrl+F12 to get to 100K when I'm impatient. :)

> > Heck, under Windows it isn't blazingly fast either.
>
> Back in the day, when I still used NTVDM (before I gave up due to
> billions of bugs), it was discovered that a UPX'd DJGPP binary
> was twice (or more) as slow as a non-UPX'd binary. You could
> try decompressing it (upx -d), and see if that helps.

DJGPP version of NASM runs quite fast in XP (aka XP Mode in Win7) and real DOS.

> > I'm suspecting the relative jump optimization needs algorithmic
> > improvement. And there may be something else. But that's just
> > a speculation. I haven't looked into it.
>
> You mean -Ox? That's been turned on by default for a few years now
> (apparently since 2.09, according to the online doc). It used to
> be optional (unlike YASM or FASM). You can disable that, of
> course (-O0), even via %NASMENV% environment variable. Somehow
> I don't think this is your problem, but who knows. For me, it
> was never noticeable except on really old machines. But
> some things make it more obvious than others. (I never messed
> with ZSNES, but I vaguely recall that they claimed it exposed
> something ridiculous that made it painfully slow. Dunno the
> details, you'd have to ask Frank Kotler.)

The Win32 version of NASM (2.10), when run against smlrc.asm (my compiler's main file, translated from C; 94K lines), takes ~7 seconds to assemble it into an ELF file. -O0 shaves off ~2 seconds, for a total of just ~5 seconds. Basically, almost a third of the time is spent optimizing relative jump instructions.

Alex