
A short-but-bizarre program that shouldn't work but does.


Robbie Hatley

Apr 28, 2012, 5:59:57 PM

Greetings, group. I was just looking through my source code
folders and found the following program, author unknown
(did I write this myself years ago and forget? dunno).
According to the C Standard, as I interpret it, this should
not work... and yet, it does:

#include <stdio.h>
int main(int argc, char **argv, char **envp)
{
    char **envstr = envp;
    while(*envstr)
    {
        puts(*envstr);
        ++envstr;
    }
    return 0;
}

My expectation: either won't compile, or will crash if run.

Actuality: gcc compiles this without error. It does give
warnings about the unused "argv" and "argc", but it does NOT
give any error or warning about the illegal third parameter
"envp". On execution, the program does not crash, but prints
a list of all environmental variables and their values.

How can this even compile?
C99 section 5.1.2.2.1(2) clearly states:
"The function called at program startup is named main.
The implementation declares no prototype for this function.
It shall be defined with a return type of int and with no
parameters:
int main(void) { /* ... */ }
or with two parameters (referred to here as argc and argv,
though any names may be used, as they are local to the function
in which they are declared):
int main(int argc, char *argv[]) { /* ... */ }
or equivalent; or in some other implementation-defined manner."

So, how can main() have *three* parameters if the standard
demands it have either zero or two? Why does the above
program even compile? Is this some kind of non-standard
extension or something?

--
Confused,
Robbie Hatley
Santa Ana, CA, USA
lonewolf (at) well (dot) com

Kaz Kylheku

Apr 28, 2012, 6:58:07 PM
On 2012-04-28, Robbie Hatley <see.my.s...@for.my.contact.info> wrote:
> Actuality: gcc compiles this without error. It does give

GCC also compiles this without error:

int x = ({ puts("foo"); 42; });

if you don't give it any options to control dialect.

There is no restriction on what may compile. First, compilers often
do not accept one of the ISO C dialects by default, but rather their own
dialect, unless they are asked to accept one of the ISO C dialects (and then
it's still a question of which one).

Then, ISO C only requires programs which break syntax rules and semantic
constraints to be diagnosed. But to be diagnosed does not mean to be
rejected.

For instance GCC compiles this also, even in some ISO C mode, like -ansi
or -std=c99.

double x = 3.0;
int *p = &x;

There is a diagnostic, but it is treated as a warning and the code
is translated anyway.

Of course, the behavior is then undefined. As far as ISO C is concerned,
if the translation unit requires a diagnostic, it is incorrect. If the
implementation chooses to run it, that is in the domain of behavior that
is not defined by ISO C.

Note there is no diagnostic required if a program uses a nonstandard form of
the main startup function. That is just undefined behavior without
the requirement for a diagnostic.

> C99 section 5.1.2.2.1(2) clearly states:
> "The function called at program startup is named main.
> The implementation declares no prototype for this function.

See, since there is no prototype, no rule is being broken that requires
a diagnostic (no conflict between declaration and definition). The behavior is
simply undefined.

That just means that ISO C gives no opinion on what the behavior should be,
nor any recommended handling of the situation. It does not mean that the
program is wrong.

Your environment does have an opinion: it supports the additional parameter for
passing the environment pointer.

> Is this some kind of non-standard extension or something?

Yes, this extra argument to main is a traditional extension found in Unix
environments which evidently has not been standardized.

It is not documented in POSIX 1003/Single Unix Spec.

What /is/ documented is the "extern char **environ" variable; so it is better
to rely on that (if you're targeting POSIX and you need to enumerate
the environment variables rather than just perform lookup).

Reference: http://pubs.opengroup.org/onlinepubs/9699919799/functions/environ.html

Further down in the page it talks about main. Nothing is mentioned about
any third parameter, and I cannot find anything else in the document that
gives requirements about main.

--
If you ever need any coding done, I'm your goto man!

pete

Apr 28, 2012, 7:40:23 PM
It's called a "Common extension".

ISO/IEC 9899:1999 (E)

Annex J
(informative)
Portability issues
1 This annex collects some information about portability
that appears in this International Standard.

J.5 Common extensions

J.5.1 Environment arguments
1 In a hosted environment,
the main function receives a third argument,
char *envp[],
that points to a null-terminated array of pointers to char,
each of which points to a string
that provides information about the environment
for this execution of the program (5.1.2.2.1).

--
pete

James Kuyper

Apr 28, 2012, 10:15:39 PM
On 04/28/2012 05:59 PM, Robbie Hatley wrote:
....
> C99 section 5.1.2.2.1(2) clearly states:
> "The function called at program startup is named main.
> The implementation declares no prototype for this function.
> It shall be defined with a return type of int and with no
> parameters:
> int main(void) { /* ... */ }
> or with two parameters (referred to here as argc and argv,
> though any names may be used, as they are local to the function
> in which they are declared):
> int main(int argc, char *argv[]) { /* ... */ }
> or equivalent; or in some other implementation-defined manner."
>
> So, how can main() have *three* parameters if the standard
> demands it have either zero or two? Why does the above
> program even compile? Is this some kind of non-standard
> extension or something?

Look at the last line again: "or in some other implementation-defined
manner".
--
James Kuyper

Keith Thompson

Apr 28, 2012, 11:18:59 PM
Robbie Hatley <see.my.s...@for.my.contact.info> writes:
> Greetings, group. I was just looking through my source code
> folders and found the following program, author unknown
> (did I write this myself years ago and forget? dunno).
> According to the C Standard, as I interpret it, this should
> not work... and yet, it does:
>
> #include <stdio.h>
> int main(int argc, char **argv, char **envp)
> {
>     char **envstr = envp;
>     while(*envstr)
>     {
>         puts(*envstr);
>         ++envstr;
>     }
>     return 0;
> }
>
> My expectation: either won't compile, or will crash if run.
>
> Actuality: gcc compiles this without error. It does give
> warnings about the unused "argv" and "argc", but it does NOT
> give any error or warning about the illegal third parameter
> "envp". On execution, the program does not crash, but prints
> a list of all environmental variables and their values.
>
> How can this even compile?
> C99 section 5.1.2.2.1(2) clearly states:
> "The function called at program startup is named main.
[snip]
> or equivalent; or in some other implementation-defined manner."
>
> So, how can main() have *three* parameters if the standard
> demands it have either zero or two? Why does the above
> program even compile? Is this some kind of non-standard
> extension or something?

James Kuyper already pointed out the phrase "or in some other
implementation-defined manner".

In addition, that section specifies how the main function "shall be
defined", but when the word "shall" appears outside a constraint, it
states a rule whose violation causes undefined behavior. There's no
requirement for a diagnostic -- even if the implementation *doesn't*
define
int main(int argc, char **argv, char **envp) /* ... */
as a valid definition of main.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Robbie Hatley

Apr 29, 2012, 1:15:33 AM
Regarding "int main(int argc, char **argv, char **envp)",
"Pete" writes:

> It's called a "Common extension".
>
> ISO/IEC 9899:1999 (E)
>
> Annex J
> (informative)
> Portability issues
> 1 This annex collects some information about portability
> that appears in this International Standard.
>
> J.5 Common extensions
>
> J.5.1 Environment arguments
> 1 In a hosted environment,
> the main function receives a third argument,
> char *envp[],
> that points to a null-terminated array of pointers to char,
> each of which points to a string
> that provides information about the environment
> for this execution of the program (5.1.2.2.1).

Ah-ha! Thanks so much for the explanation. Obviously, then,
this program's author made use of that, and my OS (Win-XP)
and compiler (DJGPP's port of GCC) do support that, so when
I compile it and run it, it works as intended. Not portable,
but on my setup works quite well. Could be useful for certain
file maintenance utilities I write from time to time if they
need to look at an environmental variable in real time.

--
Cheers,
Robbie Hatley
Santa Ana, CA, USA
lonewolf (at) well (dot) com
http://www.well.com/user/lonewolf/

Philip Lantz

Apr 29, 2012, 4:13:43 AM
If by "real time", you mean that the value of an environment variable
passed to your program is updated by some external program while your
program is running, don't count on it.

If all you need to do is read specific environment variables, you should
use getenv, not this feature.

However, this feature (or the POSIX-standard "extern char **environ")
can be useful if you need to do something with all the environment
variables, or to find environment variables where you don't know the
exact name.

Heinrich Wolf

Apr 29, 2012, 5:20:00 AM
Hi,

My old Borland Turbo C 2.0 also compiles that program, and it lists the
environment correctly without error.

Heiner

Heinrich Wolf

Apr 29, 2012, 5:31:24 AM
The help for Borland C++ Builder 5 documents the third parameter, env.
Here is the documentation (originally in German):

Three parameters (arguments) are passed to main by the Borland
C++Builder program startup routine: argc, argv, and env.

argc is an integer giving the number of command-line arguments passed to
main, including the name of the executable program itself.
argv is an array of pointers to strings (char *[]).

- argv[0] is the full path name of the running program.
- argv[1] points to the first string entered on the operating system
command line after the program name.
- argv[2] points to the second string entered after the program name.
- argv[argc-1] points to the last argument passed to main.
- argv[argc] contains NULL.

env is likewise an array of pointers to strings. Each element of env[]
holds a string of the form ENVVAR=value.

- ENVVAR is the name of an environment variable, such as PATH or COMSPEC.
- value is the value to which ENVVAR is set, for example
C:\APPS;C:\TOOLS; (for PATH) or C:\DOS\COMMAND.COM (for COMSPEC).

When declaring these parameters, the exact order must be observed:
argc, argv, env. For example, all of the following declarations of
arguments to main are valid:

int main()
int main(int argc) /* permitted, but very unusual */
int main(int argc, char * argv[])
int main(int argc, char * argv[], char * env[])

The declaration int main(int argc) is permitted; however, it is very
unusual to use argc in a program without also using the elements of argv.

The env argument is also available through the global variable environ.

In all environments, argc and argv are also available through the global
variables _argc and _argv.

main in a Unicode application

The Unicode version of the main function is:

int wmain (int argc, wchar_t *argv[])

The argv parameter (and optionally the envp parameter) supports
wide-character types.

The following _tmain function is a macro that expands to the appropriate
main function, depending on the application type.

int _tmain (int argc, _TCHAR *argv[])

Heinrich Wolf

Apr 29, 2012, 5:38:03 AM

"Heinrich Wolf" <inv...@invalid.invalid> wrote in message
news:jnj143$qu1$1...@news.m-online.net...
My Borland Turbo C 2.0 manual also documents env as the third parameter of
main.

Heinrich Wolf

Apr 29, 2012, 6:26:12 AM
man gcc on my Fedora 14 Linux says:

...
-Wmain
    Warn if the type of main is suspicious. main should be a function
    with external linkage, returning int, taking either zero arguments,
    two, or three arguments of appropriate types. This warning is
    enabled by default in C++ and is enabled by either -Wall or
    -pedantic.
...

pete

Apr 29, 2012, 10:02:46 AM
You're welcome.

--
pete