Should use of goto completely avoided in programs?

SUMIT

unread,

Sep 6, 2005, 9:13:15 AM9/6/05

to

I wrote a program for removing comment from C source file for which
i...
1.first constructed a DFA
2.used goto statement to write a program.

now it was very easy to model DFA using goto & i suppose it was also
efficient.

so was the use of goto right here?

thanx

Sumit
PUCSD

Richard Heathfield

unread,

Sep 6, 2005, 9:54:49 AM9/6/05

to

SUMIT said:

If it made the program clearer to read and maintain than the alternatives,
yes. Otherwise, no. And the odds are good that it didn't.

State machines such as DFAs are normally implemented (in procedural
languages, at least) using a construct such as C's switch (multi-way
select). In OOP, you would probably do it using something a bit more - er -
OO. :-)

The problem with goto is that it tends to make maintenance a pig.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/2005
http://www.cpax.org.uk
email: rjh at above domain

John Bode

unread,

Sep 6, 2005, 11:45:53 AM9/6/05

to

Whether it was right or not depends on the answers to the following
questions:

1. Is this a one-off, quick-n-dirty job that's never going to be
touched again?

2. Is the code easy to understand by someone who didn't write it?

3. Can the code easily be updated to reflect any potential changes in
the DFA?

If the answer to 1 is "no", then the answers to both 2 and 3 had better
be "yes"; otherwise you need to re-evaluate how you are using gotos.

Gotos by themselves aren't a bad thing, but you have to be disciplined
in how you use them, otherwise they can lead to maintenance headaches.
You should never replace a structured construct (if-else, for, while,
do-while, switch) with a goto. You should never branch into a block,
and you should branch forward only. In some cases they may interfere
with compiler optimizations.

jiush...@gmail.com

unread,

Sep 6, 2005, 12:13:21 PM9/6/05

to

Keith Thompson

unread,

Sep 6, 2005, 4:52:50 PM9/6/05

to

I'll agree with everyone else: Maybe.

Any program maps a problem space (the real-world problem you're trying
to solve) onto a program space (the implementation of your program).
Ideally, a control-flow construct should have some meaning in the
problem space. For example, an if-else should usually correspond to
some decision in the problem space; a loop should correspond to some
kind of iteration in the problem space.

A goto statement usually applies only to the program itself; it rarely
corresponds to anything in the problem space. An exception to this is
an explicit finite state machine, where a goto (a jump from one place
in your program to another) can actually correspond to a state
transition.

The other cases where a goto is acceptable in C are exception handling
(giving up on processing and jumping to wrap-up code) and breaking out
of a nested loop. These both, in my opinion, represent gaps in the
language. If the language provided an explicit construct to break out
of a nested loop, for example, a goto statement wouldn't be necessary
there. And yes, a hypothetical multi-level break would be
functionally equivalent to a goto; the difference is that it wouldn't
be spelled g-o-t-o, and it couldn't be mistaken by a reader for
undisciplined spaghetti code.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Tim Rentsch

unread,

Sep 6, 2005, 11:17:19 PM9/6/05

to

"SUMIT" <sumit....@gmail.com> writes:

Other responders have given answers like "probably not",
"maybe", or "it depends". Here's another opinion: No.

I'll elaborate a bit on that two-letter answer.

First, there's an excellent chance the program you have now
is wrong. It's not too hard to remove C comments if one
ignores the corner cases, but dealing with the corner cases
adds noticeably to the complexity. Using goto doesn't scale
up very well as the DFA gets more complicated.

Second, even if it was clear to you when you wrote it, it
probably won't be so clear to the next guy, or yourself when
you have to look at it again later. Right now you've got
all the structure in your head; but someone who doesn't
have all the structure in their head will have to discover
it looking over the various goto's (presumably in a single
large function body).

Third, perhaps most important, there's a simple and clear
way to write the code that needs no goto's at all. Here's
a sketch:

int
main(){
State s;
int c;

s = NORMAL;
do {
c = getchar();
switch( s ){
case NORMAL: s = normal_next(c); break;

case STRING: s = string_next(c); break;
case STRING_1: s = string_1_next(c); break;

case CHARLIT: s = charlit_next(c); break;
case CHARLIT_1: s = charlit_1_next(c); break;

case SLASH: s = slash_next(c); break;

case COMMENT: s = comment_next(c); break;
case COMMENT_1: s = comment_1_next(c); break;

...
}
} while( c != EOF );

return appropriate_return_code();
}

State
normal_next( int c ){
if( c == '"' ){ out(c); return STRING; }
if( c == '\'' ){ out(c); return CHARLIT; }
if( c == '/' ){ return SLASH; }

...

out(c);
return NORMAL;
}

... etc ...

I expect a program along these lines would be somewhat
longer than the program you wrote, and probably take
slightly longer to write (because of typing time if nothing
else). However, when the time comes to fix or improve the
program, you'll be glad you took the time to structure the
code in a way that scales up better.

Incidentally, the removal of comments from C source cannot,
strictly speaking, be done with just a finite state machine
(aka DFA). Additional state (in principle unbounded) is
needed. (I'll just leave this topic here, without saying
why, in case some comp.lang.c readers would enjoy figuring
that out on their own.)

Antonio Contreras

unread,

Sep 7, 2005, 4:36:18 AM9/7/05

to

Tim Rentsch wrote:
<SNIP digression about gotos, structured programing and scalability>

> Incidentally, the removal of comments from C source cannot,
> strictly speaking, be done with just a finite state machine
> (aka DFA). Additional state (in principle unbounded) is
> needed. (I'll just leave this topic here, without saying
> why, in case some comp.lang.c readers would enjoy figuring
> that out on their own.)

Nested comments? By the way, are they legal? (I know they weren't, but,
are they in C99?)

Lawrence Kirby

unread,

Sep 7, 2005, 8:52:13 AM9/7/05

to

Comments don't nest in C99, although you can put /* */ comments around
the // comments that C99 supports.

Lawrence

Michael Wojcik

unread,

Sep 7, 2005, 12:58:37 PM9/7/05

to

No, thank goodness. They'd break existing code. Not *good* existing
code, perhaps, but existing nonetheless.

I thought perhaps Tim was referring to incomplete comments in
included files, but that's illegal (C90 5.1.1.2 phase 3); or to
incomplete comments controlled by #if, but comments are removed
(translation phase 3) before preprocessing directives are processed
(phase 4).

Similarly, string literals can't span lines (in conforming code), and
string literals are pp-tokens, so you can't play games with enclosing
(what appear to be) comments in strings using preprocessing
directives.

Trigraphs don't require unbounded state, and at any rate there are no
trigraphs for "*" or "/". It should be legal to separate a comment
delimiter using backslash-continuation, but that only requires a
fixed amount of state, too.

It's probably something obvious that I'm just overlooking.

--
Michael Wojcik michael...@microfocus.com

Be sure to push the button of the bottom, and push the button of the
settlement page indicated next only once, there is fear of the bottom
rhinoceros multiplex lesson money. -- Sukebe Net

Tim Rentsch

unread,

Sep 8, 2005, 2:36:16 PM9/8/05

to

mwo...@newsguy.com (Michael Wojcik) writes:

The hint is, one of Michael's comments is getting warm.

Incidentally, it wasn't obvious to me either when I first started
thinking about how to do comment removal.

Anonymous 7843

unread,

Sep 9, 2005, 7:48:14 PM9/9/05

to

In article <kfnbr35...@alumnus.caltech.edu>,

Tim Rentsch <t...@alumnus.caltech.edu> wrote:
>
> Incidentally, the removal of comments from C source cannot,
> strictly speaking, be done with just a finite state machine
> (aka DFA). Additional state (in principle unbounded) is
> needed. (I'll just leave this topic here, without saying
> why, in case some comp.lang.c readers would enjoy figuring
> that out on their own.)

/\
\
\
\
* Are you referring to comments like this? *\
\
\
\
/

Tim Rentsch

unread,

Sep 10, 2005, 3:27:23 PM9/10/05

to

anon...@example.com (Anonymous 7843) writes:

Yes. The start of that comment needs unbounded
lookahead to tell that it's a comment rather than,
for example, a '/=' operator.

P.J. Plauger

unread,

Sep 10, 2005, 3:34:31 PM9/10/05

to

"Tim Rentsch" <t...@alumnus.caltech.edu> wrote in message
news:kfnzmqk...@alumnus.caltech.edu...

Uh, no. A comment is syntactically equivalent to a single
space character, which turns any comment of the form
"//**/=" into "/ =".

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com

Tim Rentsch

unread,

Sep 10, 2005, 6:40:41 PM9/10/05

to

"P.J. Plauger" <p...@dinkumware.com> writes:

Right, however that isn't the case I was talking
about. I meant this one:

/\
\
\
\
\
\
=

which looks like it might be a comment until
the scan gets to the '=' character.

Anonymous 7843

unread,

Sep 12, 2005, 1:05:12 PM9/12/05

to

In article <kfnzmqk...@alumnus.caltech.edu>,

Unbounded lookahead is not the same as unbounded state. The repetition
of backslash/newline is two states no matter how long it goes on for
and is well within the abilities of a DFA.

However, catering to backslash/newline for every possible token
is a tedious way to implement a lexer. One might prefer to paste the
backslash/newlines together in an early phase, even before removing
comments.

Tim Rentsch

unread,

Sep 13, 2005, 2:54:45 AM9/13/05

to

anon...@example.com (Anonymous 7843) writes:

I think you may have missed the significance of the context here.
We weren't talking about writing a lexer. The requirement is for
a program to remove comments from C source code, and except for
removing comments leaves the original source completely intact
(not counting possibly inserting spaces to preserve the token
boundaries around the removed comments).

In this context it's important to know how many \<NL> pairs have
been skipped over, because they may need to be output if they
don't belong to a comment. Needing that count is what changes
the problem to one that can't be done with a finite state
machine.

Anonymous 7843

unread,

Sep 13, 2005, 1:31:51 PM9/13/05

to

In article <kfnoe6x...@alumnus.caltech.edu>,

You have piqued my curiousity. I'm going to read about flex some more
and have a go at writing a proper comment-removal thingy. I think the
trailing context feature would be sufficient to decide whether or not
to emit backslash/newline, but I'm going to try it out before digging
myself any deeper into the hole.

Tim Rentsch

unread,

Sep 13, 2005, 4:10:42 PM9/13/05

to

anon...@example.com (Anonymous 7843) writes:

I expect the trailing context feature is sufficient to resolve
the question. Of course, using the trailing context feature
already (in the general case) makes the program exceed what a
finite state machine can do: the trailing context lookahead has
to remember where to continue scanning after checking the
trailing context, and remembering where to go back to takes (in
the theoretical sense) unbounded memory.