Questions about getopt() (Linux)

Kenny McCormack

Jul 8, 2008, 6:25:22 AM
1) Is it documented anywhere that you have to reset optind to 0 in order
to use getopt again (i.e., to use it more than once in a single program) ?
Observation: I've discovered this on my own at least twice now, the hard way.

2) Is it documented anywhere how/that getopt silently ignores the 0
element of the passed argv[] array? It is obvious, of course, why it
does this, but it is a catch point if you build up your own argv, rather
than just pass it the one supplied to main().

Observation: Obviously, I think the answer to both of the above
questions is "no" - at least I could find nothing on them. I'm curious
also as to how well known/understood these limitations are.

Geoff Clare

Jul 8, 2008, 8:49:02 AM
Kenny McCormack wrote:

> 1) Is it documented anywhere that you have to reset optind to 0 in order
> to use getopt again (i.e., to use it more than once in a single program) ?
> Observation: I've discovered this on my own at least twice now, the hard way.
>
> 2) Is it documented anywhere how/that getopt silently ignores the 0
> element of the passed argv[] array? It is obvious, of course, why it
> does this, but it is a catch point if you build up your own argv, rather
> than just pass it the one supplied to main().

POSIX/SUS says:

"The variable optind is the index of the next element of the
argv[] vector to be processed. It shall be initialized to 1 by the
system, and getopt() shall update it when it finishes with each
element of argv[]."

This would seem to provide the answer to both your questions.

You can get access to the on-line HTML version of POSIX.1-2001/SUSv3
at http://www.unix.org/version3/online.html

--
Geoff Clare <net...@gclare.org.uk>

Thomas E. Dickey

Jul 8, 2008, 11:19:36 AM
Geoff Clare <ge...@clare.see-my-signature.invalid> wrote:
> Kenny McCormack wrote:
>
>> 1) Is it documented anywhere that you have to reset optind to 0 in order
>> to use getopt again (i.e., to use it more than once in a single program) ?
>> Observation: I've discovered this on my own at least twice now, the hard way.
>>
>> 2) Is it documented anywhere how/that getopt silently ignores the 0
>> element of the passed argv[] array? It is obvious, of course, why it
>> does this, but it is a catch point if you build up your own argv, rather
>> than just pass it the one supplied to main().
>
> POSIX/SUS says:
>
> "The variable optind is the index of the next element of the
> argv[] vector to be processed. It shall be initialized to 1 by the
> system, and getopt() shall update it when it finishes with each
> element of argv[]."
>
> This would seem to provide the answer both your questions.

It doesn't really answer either one (none of the first question, and only
part of the second). I really reading that the first one depends on the
implementation.

--
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net

Thomas E. Dickey

Jul 8, 2008, 12:34:22 PM
Thomas E. Dickey <dic...@invisible-island.net> wrote:
> It doesn't really answer either one (none of the first question, and only
> part of the second). I really reading that the first one depends on the
                         recall

Kenny McCormack

Jul 8, 2008, 1:10:55 PM
In article <ua4ck5-...@leafnode-msgid.gclare.org.uk>,
Geoff Clare <net...@gclare.org.uk> wrote:
>Kenny McCormack wrote:
>
>> 1) Is it documented anywhere that you have to reset optind to 0 in order
>> to use getopt again (i.e., to use it more than once in a single program) ?
>> Observation: I've discovered this on my own at least twice now, the hard way.
>>
>> 2) Is it documented anywhere how/that getopt silently ignores the 0
>> element of the passed argv[] array? It is obvious, of course, why it
>> does this, but it is a catch point if you build up your own argv, rather
>> than just pass it the one supplied to main().
>
>POSIX/SUS says:
>
> "The variable optind is the index of the next element of the
> argv[] vector to be processed. It shall be initialized to 1 by the
> system, and getopt() shall update it when it finishes with each
> element of argv[]."
>
>This would seem to provide the answer both your questions.

Well, as Mr. Dickey observes, it sorta implies an answer to the first,
but it certainly would have been nice if it had been made explicit.
There also seems to be some magic going on here - see below - which is
probably why the thing is overall not well documented. I.e., it has a
"it's really just for this one thing, it's not re-usable, and just use
it this way, don't ask a lot of questions, it will be OK" flavor to it.

Observations:
1) I see now that if you are going to re-initialize optind, you
should set it to 1, not 0.
2) I couldn't get it to work with a zero-based array. It seems to
only work with a 1 based array (like argv - that is, once you
ignore the irrelevant-for-this-purpose zero element). I.e.
(pseudo-code - not my real test case):

    char *argv[100] = {0};
    int argc = 0;

    argv[argc++] = "zeroElement";
    argv[argc++] = "oneElement";
    now call getopt with argv and argc - results in core dump.

Michael Kerrisk

Jul 9, 2008, 1:23:37 AM
On Tue, 8 Jul 2008 13:49:02 +0100, Geoff Clare
<ge...@clare.See-My-Signature.invalid> wrote:

>Kenny McCormack wrote:
>
>> 1) Is it documented anywhere that you have to reset optind to 0 in order
>> to use getopt again (i.e., to use it more than once in a single program) ?

Reset to *1*, surely?

>> Observation: I've discovered this on my own at least twice now, the hard way.
>>
>> 2) Is it documented anywhere how/that getopt silently ignores the 0
>> element of the passed argv[] array? It is obvious, of course, why it
>> does this, but it is a catch point if you build up your own argv, rather
>> than just pass it the one supplied to main().
>
>POSIX/SUS says:
>
> "The variable optind is the index of the next element of the
> argv[] vector to be processed. It shall be initialized to 1 by the
> system, and getopt() shall update it when it finishes with each
> element of argv[]."
>
>This would seem to provide the answer both your questions.
>
>You can get access to the on-line HTML version of POSIX.1-2001/SUSv3
>at http://www.unix.org/version3/online.html

The Linux man page was not explicit on how optind is initialized, and also
didn't state that optind can be reset to 1 to restart scanning. I've fixed
both of these points for the next (3.04) man-pages release.

Cheers,

Michael

PS The best way to get fixes made for Linux system call and glibc man-pages
is described here: http://www.kernel.org/doc/man-pages/reporting_bugs.html

Rainer Weikusat

Jul 9, 2008, 5:34:17 AM
gaz...@xmission.xmission.com (Kenny McCormack) writes:
> In article <ua4ck5-...@leafnode-msgid.gclare.org.uk>,
> Geoff Clare <net...@gclare.org.uk> wrote:
>>Kenny McCormack wrote:
>>
>>> 1) Is it documented anywhere that you have to reset optind to 0 in order
>>> to use getopt again (i.e., to use it more than once in a single program) ?
>>> Observation: I've discovered this on my own at least twice now, the hard way.
>>>
>>> 2) Is it documented anywhere how/that getopt silently ignores the 0
>>> element of the passed argv[] array? It is obvious, of course, why it
>>> does this, but it is a catch point if you build up your own argv, rather
>>> than just pass it the one supplied to main().
>>
>>POSIX/SUS says:
>>
>> "The variable optind is the index of the next element of the
>> argv[] vector to be processed. It shall be initialized to 1 by the
>> system, and getopt() shall update it when it finishes with each
>> element of argv[]."
>>
>>This would seem to provide the answer both your questions.
>
> Well, as Mr. Dickey observes, it sorta implies an answer to the first,
> but it certainly would have been nice if it had been made explicit.

Stating that the variable is used to hold the index of the next
argument to be processed and that it is initialized to 1 before the
first call to getopt can hardly be anymore explicit.

[...]

> Observations:
> 1) I see now that if you are going to re-initialize optind, you
> should set it to 1, not 0.

As opposed to what the Glibc getopt does (by default), option
processing is supposed to stop when the first non-option argument is
encountered, cf

    If, when getopt() is called: [...] *argv[optind] is not the
    character - [...] getopt() shall return -1 without changing
    optind.

This enables programs to process a set of options which cause the
'exec-persistent' environment of the process running them to be
changed, and then exec another program, complete with a sequence of
option and non-option arguments of its own.

For obvious reasons, option processing must therefore not start with
the program name, which should conventionally reside at argv[0],
because it would then terminate immediately.

> 2) I couldn't get it to work with a zero-based array. It seems to
> only work with a 1 based array (like argv - that is, once you
> ignore the irrelevant-for-this-purpose zero element). I.e.
> (pseudo-code - not my real test case):
>
> char *argv[100] = {0};
> int argc = 0;
>
> argv[argc++] = "zeroElement";
> argv[argc++] = "oneElement";
> now call getopt with argv and argc - results in core dump.

Assuming this is referring to GNU getopt: That treats a zero-valued
optind as 'has not yet been initialized':

    int
    _getopt_internal_r (int argc, char *const *argv, const char *optstring,
                        const struct option *longopts, int *longind,
                        int long_only, struct _getopt_data *d)
    {
      int print_errors = d->opterr;
      if (optstring[0] == ':')
        print_errors = 0;

      if (argc < 1)
        return -1;

      d->optarg = NULL;

      if (d->optind == 0 || !d->__initialized)
        {
          if (d->optind == 0)
            d->optind = 1; /* Don't scan ARGV[0], the program name. */
          optstring = _getopt_initialize (argc, argv, optstring, d);
          d->__initialized = 1;
        }

http://cvs.savannah.gnu.org/viewvc/libc/posix/getopt.c?root=libc&view=markup

Additionally, optind is statically initialized to 1 to 'conform to
POSIX requirements' (kind-of useless exercise, considering that the
default behaviour is completely non-conformant).

Presumably, this could be regarded as a bug.

Geoff Clare

Jul 9, 2008, 8:38:19 AM
Thomas E. Dickey wrote:

> Geoff Clare <ge...@clare.see-my-signature.invalid> wrote:
>> Kenny McCormack wrote:
>>
>>> 1) Is it documented anywhere that you have to reset optind to 0 in order
>>> to use getopt again (i.e., to use it more than once in a single program) ?
>>> Observation: I've discovered this on my own at least twice now, the hard way.
>>>
>>> 2) Is it documented anywhere how/that getopt silently ignores the 0
>>> element of the passed argv[] array? It is obvious, of course, why it
>>> does this, but it is a catch point if you build up your own argv, rather
>>> than just pass it the one supplied to main().
>>
>> POSIX/SUS says:
>>
>> "The variable optind is the index of the next element of the
>> argv[] vector to be processed. It shall be initialized to 1 by the
>> system, and getopt() shall update it when it finishes with each
>> element of argv[]."
>>
>> This would seem to provide the answer both your questions.
>
> It doesn't really answer either one (none of the first question, and only
> part of the second).

Okay, maybe it doesn't explicitly talk about the things Kenny is
asking about, but it is possible to deduce the answers from what
POSIX/SUS states.

"optind is the index of the next element of the argv[] vector to be
processed" and "getopt() shall update it when it finishes with each
element of argv[]" together imply that in order to make getopt()
re-process an argv[] that it has previously processed (or to start
processing a different vector - I'm not sure which case Kenny had in
mind), optind must be set back to its original value.

"It shall be initialized to 1 by the system" explains why getopt()
ignores argv[0].

--
Geoff Clare <net...@gclare.org.uk>

Thomas E. Dickey

Jul 9, 2008, 9:23:41 AM
Geoff Clare <ge...@clare.see-my-signature.invalid> wrote:
> Thomas E. Dickey wrote:
...

>> It doesn't really answer either one (none of the first question, and only
>> part of the second).
>
> Okay, maybe it doesn't explicitly talk about the things Kenny is
> asking about, but it is possible to deduce the answers from what
> POSIX/SUS states.

Actually, "infer" would be the proper word in this context.
There's not enough information to deduce the answer.

> "optind is the index of the next element of the argv[] vector to be
> processed" and "getopt() shall update it when it finishes with each
> element of argv[]" together imply that in order to make getopt()
> re-process an argv[] that it has previously processed (or to start
> processing a different vector - I'm not sure which case Kenny had in
> mind), optind must be set back to its original value.

But if getopt had some additional internal state, it wouldn't necessarily
be reusable (in the same process). For example - reading a manpage on
OpenBSD 3.4:

    The getopt() function implements a superset of the functionality
    specified by IEEE Std 1003.1 (``POSIX'').

    The following extensions are supported:

    o   The optreset variable was added to make it possible to call the
        getopt() function multiple times.

    o   If the optind variable is set to 0, getopt() will behave as if the
        optreset variable has been set. This is for compatibility with GNU
        getopt(). New code should use optreset instead.

That's pretty clear that someone else decided that POSIX left the
behavior less than fully specified.

> "It shall be initialized to 1 by the system" explains why getopt()
> ignores argv[0].

;-)

Michael Kerrisk

Jul 10, 2008, 4:52:47 AM
On 09 Jul 2008 13:23:41 GMT, dic...@invisible-island.net (Thomas E. Dickey)
wrote:

Maybe.

It seems to me that's just one possible interpretation of events.
Alternatively, it could just as easily have been the case that the
implementer had not read the standard well enough, or (wrongly) thought
they couldn't achieve the desired result without extending the interface.

Anyway, if I'm reading glibc's getopt() code correctly, then the situation
is this:

* resetting optind to 1 does the Right Thing by POSIX.1, that is, restarts
the scan following POSIX.1 rules.

* resetting optind to 0 causes the invocation of a glibc-specific
initialization routine that scans optstring for glibc extensions ('+' and
'-' at the start of the string) and rechecks POSIXLY_CORRECT. Resetting to
0 is required, if you want to employ these extensions.

Cheers,

Michael

berndj

Jul 16, 2008, 9:02:10 AM
Funny how this thread appeared just a week before I needed its
answers...

On Jul 9, 11:34 am, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:
> Stating that the variable is used to hold the index of the next
> argument to be processed and that it is initialized to 1 before the
> first call to getopt can hardly be anymore explicit.

No: that only explicitly documents what you get when you *read* the
variable. Although unlikely, it's conceivable that an implementation
of getopt() would work like this:

    static char **real_optind = NULL;
    int optind = 1;

    int getopt(int argc, char *argv[], char const *optstring)
    {
        if (real_optind == NULL) {
            real_optind = argv + 1;
        }
        <read next option at real_optind>
        <find it in optstring>
        real_optind += <has colon> ? 2 : 1;
        optind = real_optind - argv;
        return optopt;
    }

> >            char *argv[100] = {0};
> >            int argc = 0;
>
> >            argv[argc++] = "zeroElement";
> >            argv[argc++] = "oneElement";
> >            now call getopt with argv and argc - results in core dump.

Because you forgot argv[argc++] = NULL;

Rainer Weikusat

Jul 16, 2008, 9:39:27 AM
berndj <bernd.je...@gmail.com> writes:
> On Jul 9, 11:34 am, Rainer Weikusat <rweiku...@mssgmbh.com> wrote:
>> Stating that the variable is used to hold the index of the next
>> argument to be processed and that it is initialized to 1 before the
>> first call to getopt can hardly be anymore explicit.
>
> No: that only explicitly documents what you get when you *read* the
> variable.

The verbatim statement (which you 'accidentally' deleted) would be:

The variable optind is the index of the next element of the
argv[] vector to be processed.

Any assumption you may want to make regarding 'how "is the index"
could be interpreted as "is not the index", should be discussed with
your girlfriend (presumably impressed by your mental ejaculations) or
your psychiatrist (gets paid to endure them).

*P*L*O*N*K

Ok, that's childish. But I am really tired of this shit ..


Kenny McCormack

Jul 17, 2008, 10:06:11 AM
In article <20b39b0a-5c5e-4855...@p25g2000hsf.googlegroups.com>,
berndj <bernd.je...@gmail.com> wrote:
...

>> >            char *argv[100] = {0};
>> >            int argc = 0;
>>
>> >            argv[argc++] = "zeroElement";
>> >            argv[argc++] = "oneElement";
>> >            now call getopt with argv and argc - results in core dump.
>
>Because you forgot argv[argc++] = NULL;

I didn't *forget* anything. I was explicitly testing something, and
derived the (it seems now expected) result.

And besides, your solution is wrong anyway. The obviously right thing is:

int argc = 1;

Kenny McCormack

Jul 17, 2008, 10:06:54 AM
In article <87zloid...@fever.mssgmbh.com>,
Rainer Weikusat <rwei...@mssgmbh.com> wrote:
...

>Any assumption you may want to make regarding 'how "is the index"
>could be interpreted as "is not the index", should be discussed with
>you girl friend (presumably impressed by your mental ejaculations) or
>your psychatrist (gets paid to endure them).

Everyone's a comedian these days...

Rainer Weikusat

Jul 17, 2008, 10:35:53 AM
gaz...@xmission.xmission.com (Kenny McCormack) writes:
> Rainer Weikusat <rwei...@mssgmbh.com> wrote:
> ...
>>Any assumption you may want to make regarding 'how "is the index"
>>could be interpreted as "is not the index", should be discussed with
>>you girl friend (presumably impressed by your mental ejaculations) or
>>your psychatrist (gets paid to endure them).
>
> Everyone's a comedian these days...

This is, unfortunately, not really funny: One of the people who always
manage to "understand" the exact opposite of the contents of a text is
certainly going to develop software for the ABS in your car ...

Kenny McCormack

Jul 17, 2008, 3:10:05 PM
In article <87r69st...@fever.mssgmbh.com>,

Um, I was referring to your silly comments about girlfriends and
psychiatrists...

Rainer Weikusat

Jul 17, 2008, 5:56:09 PM

That was hardly noticeable.

berndj

Jul 18, 2008, 9:41:26 AM
On Jul 16, 3:02 pm, berndj <bernd.jendris...@gmail.com> wrote:

> > Kenny McCormack wrote:
> > >            char *argv[100] = {0};

How embarrassing for me not to see the initialization!

> > >            int argc = 0;
>
> > >            argv[argc++] = "zeroElement";
> > >            argv[argc++] = "oneElement";
> > >            now call getopt with argv and argc - results in core dump.
>
> Because you forgot argv[argc++] = NULL;

No, instead because (GNU) getopt wants to print your program name as
part of the error message it prints when it doesn't recognize an
option. In a zero-based array, that is argv[-1], which will be
whatever stack turds are adjacent to argv. Hence the segfault.
