Do you trust the "indent" program?


for...@sybase.com

Dec 22, 1990, 6:17:09 PM
Let's say your company just decided on a coding standard and let's
say you were able to come up with a specification file for the
'indent' program that would allow 'indent' to change all your
1000's of source files to meet the standard. Do you trust 'indent'
enough to run it on all your source files without making any mistakes?
By mistakes I mean changes that actually change your code so that it
does something different than what it did before you ran it through
'indent'.

----
Anything you read here is my opinion and in no way represents Sybase, Inc.

Jon Forrest WB6EDM
for...@sybase.com
{pacbell,sun,{uunet,ucbvax}!mtxinu}!sybase!forrest
415-596-3422

Rick Stevenson

Dec 24, 1990, 12:21:58 AM
for...@sybase.com writes:
>Let's say your company just decided on a coding standard and let's
>say you were able to come up with a specification file for the
>'indent' program that would allow 'indent' to change all your
>1000's of source files to meet the standard. Do you trust 'indent'
>enough to run it on all your source file without making any mistakes?
>By mistakes I mean changes that actually change your code so that it
>does something different than what it did before you ran it through
>'indent'.

Why don't you run a copy of your src files through indent, recompile,
and compare new object files with the old versions? The proof of the
pudding...

Rick.
--
-m------- Rick Stevenson
---mmm----- Pyramid Technology +61 75 522475 FAX
-----mmmmm--- Research Park, Bond University +61 75 950249 VOICE
-------mmmmmmm- Gold Coast, Q 4229, AUSTRALIA ri...@ptcburp.ptcbu.oz.au

Dave Eisen

Dec 24, 1990, 11:47:10 AM
In article <12...@sybase.sybase.com> for...@sybase.com writes:
>Do you trust 'indent'
>enough to run it on all your source file without making any mistakes?


I don't trust indent to go through my source files without making any
indenting mistakes; I find that any source file that was typed with anywhere
near what I consider to be reasonable indenting gets made worse by indent.
But, I have been using indent for some time now and I've never seen
any changes in anything other than whitespace. Well, there was that
bug where it would change >>= to >> = .

--
Dave Eisen dke...@Gang-of-Four.Stanford.EDU
1447 N. Shoreline Blvd.
Mountain View, CA 94043 Anybody have an extra New Year's ticket?
(415) 967-5644 (I can hope, can't I?)

Andrew Koenig

Dec 24, 1990, 1:37:09 PM

> Do you trust 'indent'
> enough to run it on all your source file without making any mistakes?

I don't need to -- I can compile my programs before and after
and compare the object files. If they're not identical,
something's broken.
--
--Andrew Koenig
a...@europa.att.com

Arnold Robbins

Dec 26, 1990, 1:24:31 PM
>In article <12...@sybase.sybase.com> for...@sybase.com writes:
>> Do you trust 'indent'
>> enough to run it on all your source file without making any mistakes?

Personally, no. See below for what I'd do if I had thousands of lines of
code to massage.

In article <11...@alice.att.com> a...@alice.UUCP () writes:
>I don't need to -- I can compile my programs before and after
>and compare the object files. If they're not identical,
>something's broken.

Andrew is fortunate enough to be running on a Unix system that doesn't
use COFF for its object files --- COFF files have a timestamp in them.
If you know where it is (I don't), you can arrange to strip off the
COFF header and then compare the objects, but it is not as simple an
operation as it used to be.

A better, more useful technique would be to use lex/flex to write a
C scanner that produced a stream of tokens, one per line. Run it against
the before and after versions, and then use diff on the two outputs.

main() { printf("hello, world\n"); }

and
main (
) {
printf
(
"hello, world\n"
) ; }

should both produce (collapsed to save space)

ident open-p close-p open-brace ident open-p string-const
close-p semi-colon close-brace

as their token streams. Comments should be noted, as should preprocessor
lines and such. It's about a day's job to do well. Some versions of
indent were notorious for turning

i =-1; /* XXX --- could be old style assign op */

into

i -= 1;

which of course is wrong. Comparing the tokenized version of the code
would make catching this sort of change much easier.

Even if there were a few errors of this sort, using indent and then cleaning
up after it would still be a win over reformatting code by hand.
--
Arnold Robbins AudioFAX, Inc. | Laundry increases
2000 Powers Ferry Road, #200 / Marietta, GA. 30067 | exponentially in the
INTERNET: arn...@audiofax.com Phone: +1 404 933 7612 | number of children.
UUCP: emory!audfax!arnold Fax-box: +1 404 618 4581 | -- Miriam Robbins
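
The token-stream comparison described above can also be sketched directly in C instead of lex. This is a minimal illustration, not a full C scanner: the name same_tokens and the operator table are assumptions made for the example, and comments and preprocessor lines get no special treatment here.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Multi-character operators, longest first so that ">>=" wins over ">>". */
static const char *const OPS[] = {
    "<<=", ">>=", "...",
    "->", "++", "--", "<<", ">>", "<=", ">=", "==", "!=", "&&", "||",
    "+=", "-=", "*=", "/=", "%=", "&=", "^=", "|="
};

/* Skip whitespace at *p, then return the length of the token that starts
   there (0 at end of input). *p is left pointing at the token. */
static size_t next_token(const char **p)
{
    const char *s = *p;
    size_t i, n;

    while (isspace((unsigned char)*s))
        s++;
    *p = s;
    if (*s == '\0')
        return 0;

    if (isalnum((unsigned char)*s) || *s == '_') {   /* identifier or number */
        n = 1;
        while (isalnum((unsigned char)s[n]) || s[n] == '_')
            n++;
        return n;
    }
    if (*s == '"' || *s == '\'') {                   /* string or char literal */
        n = 1;
        while (s[n] != '\0' && s[n] != *s) {
            if (s[n] == '\\' && s[n + 1] != '\0')
                n++;                                 /* skip escaped character */
            n++;
        }
        return s[n] == '\0' ? n : n + 1;             /* include closing quote */
    }
    for (i = 0; i < sizeof OPS / sizeof OPS[0]; i++) {
        n = strlen(OPS[i]);
        if (strncmp(s, OPS[i], n) == 0)
            return n;
    }
    return 1;                                        /* single-character token */
}

/* Return 1 if a and b yield the same token sequence, 0 otherwise. */
int same_tokens(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    for (;;) {
        size_t la = next_token(&a);
        size_t lb = next_token(&b);
        if (la != lb || memcmp(a, b, la) != 0)
            return 0;
        if (la == 0)
            return 1;
        a += la;
        b += lb;
    }
}
```

With this, the two "hello, world" layouts compare equal, while "i =-1;" against "i -= 1;" (or ">>=" against ">> =") compares unequal, which is the kind of change being worried about.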

Brian K. W. Hook

Dec 26, 1990, 1:40:20 PM
I know that C and Pascal pass parameters differently and clean up the calling
stack differently, but I was wondering whether there is a NOTICEABLE difference
in the speed and size of a program that uses PASCAL declarations for fixed-argument
functions and the cdecl declaration for variable-argument ones, e.g.

void pascal foobar ( void );

void pascal foobar ( void )
{
    ...
}

VERSUS:

void cdecl foobar ( void );

void cdecl foobar ( void )
{
    ....
}

Assume that parameters can be passed as well, just a fixed number of them. I know
main() has to be declared as cdecl, but otherwise, is it possible for me
to change all of my other functions to pascal declarations without any
side effects?

And will I actually gain anything? I know a lot of Windows prototypes are
declared with the PASCAL keyword.

Brian

Kevin D. Quitt

Dec 26, 1990, 1:36:36 PM
>Let's say your company just decided on a coding standard and let's
>say you were able to come up with a specification file for the
>'indent' program that would allow 'indent' to change all your
>1000's of source files to meet the standard. Do you trust 'indent'
>enough to run it on all your source file without making any mistakes?

Certainly. Just as soon as I'd proven that the executables were
bit-for-bit indentical. 8-{)}


--
_
Kevin D. Quitt demott!kdq k...@demott.com
DeMott Electronics Co. 14707 Keswick St. Van Nuys, CA 91405-1266
VOICE (818) 988-4975 FAX (818) 997-1190 MODEM (818) 997-4496 PEP last

Pete Kvitek

Dec 26, 1990, 11:24:34 PM
In article <26...@uflorida.cis.ufl.EDU> j...@reef.cis.ufl.edu (Brian K. W. Hook) writes:
>I know that C and Pascal pass parameters differently and clean up the calling
>stack differently, but I was wondering whether there is a NOTICEABLE difference
>in speed and size of program that uses PASCAL declarations for fixed argument
>functios and the cdecl declaration for variable...e.g.
>
[ stuff deleted ]

>
>Assume that parameters can be passed, also, just a fixed amount....I know
>main() has to be declared as cdecl, but otherwise, is it possible for me
>to change all of my other functions to pascal declarations without any
>side effects?
>
>And will I actually gain anything? I know a lot fo Windows prototypes are
>prototype with the PASCAL keyword.
>
>Brian

The order in which parameters are pushed on the stack
does not seem to make any speed or size difference.
However, if a procedure uses C calling conventions, then
the calling procedure is responsible for cleaning up the stack
after _every_ call, thus consuming some extra memory in
the code segment. A Pascal procedure cleans up the stack _before_
returning to the caller.

This probably explains why the Windows designers chose
pascal calling conventions, since Windows code usually
contains a huge number of references to the Windows
libraries.

Another reason to use pascal calling conventions is to
conserve stack space with heavily nested procedures
(for example recursive ones).

> Pete
--
--
Pete I. Kvitek <kvi...@jvd.msk.su> | Phone: (095) 328-1327
Speaking from but not for JV Dialogue, Moscow, USSR | Fax: (095) 329-4711

Ozan Yigit

Dec 27, 1990, 1:49:02 AM
>Do you trust 'indent'
>enough to run it on all your source file without making any mistakes?

Yes. Ever since the famous assignment bug was fixed, I ran literally
hundreds of thousands of lines of code through indent [the one that
was posted] and on randomly paranoid occasions, I compared binaries.

[at this point, I stopped, went to a system running SunOS 4.0.3, and ran
/bin/indent on a recently arrived 63 files containing 110,000 lines of
awful C code, compiled the result, and diffed against the first binary
I had created earlier. No differences were found.]

Nowadays, almost any piece of foreign code actually goes through indent
before I ever look at it. [Almost all such code looks better afterwards :-)]

I think there exist versions of indent, including the one included in the
latest Berkeley distributions, and the one posted to comp.sources.unix,
and probably the one hacked for GNU, that are basically reliable in terms
of one-to-one translation. Any unreliable versions should be flushed.

oz
---
Good design means less design. Design | Internet: o...@nexus.yorku.ca
must serve users, not try to fool them. | UUCP: utzoo/utai!yunexus!oz
-- Dieter Rams, Chief Designer, Braun. | phonet: 1+ 416 736 5257

Ronald S H Khoo

Dec 26, 1990, 9:38:28 PM
k...@demott.COM (Kevin D. Quitt) writes:

> In article <12...@sybase.sybase.com> for...@sybase.com writes:
> > Do you trust 'indent'
> > enough to run it on all your source file without making any mistakes?

> Certainly. Just as soon as I'd proven that the executables were
> bit-for-bit indentical. 8-{)}

And you gotta repeat this process for each and every possible combination
of compile-time #ifdef options. Ugh. It's not a small job. My feeling is
that if the original code were *that* badly indented, it's indicative of
poor quality anyway, and it's probably time to rewrite the whole damn lot.

only 1/2 :-)

--
ron...@robobar.co.uk +44 81 991 1142 (O) +44 71 229 7741 (H)
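
To put a number on that: n independent #ifdef switches give 2^n builds to compare. A rough sketch of enumerating them follows; build_flags is a hypothetical helper written for this example, and any macro names passed to it are up to the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "-DNAME" for every bit set in mask into out, separated by blanks.
   Bit i of mask selects names[i]. Returns 0 on success, -1 if out is
   too small to hold the result. */
int build_flags(const char *const *names, size_t n, unsigned long mask,
                char *out, size_t outsz)
{
    size_t i, used = 0;

    assert(names != NULL && out != NULL && outsz > 0);
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        int w;

        if (!(mask & (1UL << i)))
            continue;                       /* this switch is off */
        w = snprintf(out + used, outsz - used, "%s-D%s",
                     used ? " " : "", names[i]);
        if (w < 0 || (size_t)w >= outsz - used)
            return -1;                      /* truncated or failed */
        used += (size_t)w;
    }
    return 0;
}
```

Looping mask from 0 to (1UL << n) - 1 and handing each flag string to the compiler covers every combination; with 10 switches that is already 1024 before-and-after builds.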

Jody Hagins

Dec 27, 1990, 10:49:17 AM
In article <3...@audfax.audiofax.com>, arn...@audiofax.com (Arnold Robbins) writes:
|> >In article <12...@sybase.sybase.com> for...@sybase.com writes:
|> >> Do you trust 'indent'
|> >> enough to run it on all your source file without making any mistakes?
|>
|> Personally, no. See below for what I'd do if I had thousands of lines of
|> code to massage.
|>
|> In article <11...@alice.att.com> a...@alice.UUCP () writes:
|> >I don't need to -- I can compile my programs before and after
|> >and compare the object files. If they're not identical,
|> >something's broken.
|>
|> Andrew is fortunate enough to be running on a Unix system that doesn't
|> use COFF for it's object files --- COFF files have a timestamp in them.
|> If you know where it is (I don't), you can arrange to strip off the
|> COFF header and then compare the objects, but it is not as simple an
|> operation as it used to be.


The COFF header layout follows, for those interested, but without
the references.


COFF object file format specifies that the header is first, then
the optional (aout) header, followed by section headers, etc.

The file header is a <struct filehdr>, declared in filehdr.h.
The optional header is <struct aouthdr>, declared in aouthdr.h.

struct filehdr
{
    unsigned short f_magic;
    unsigned short f_nscns;
    long f_timdat;
    long f_symptr;
    long f_nsyms;
    unsigned short f_opthdr;
    unsigned short f_flags;
};

By looking at <struct filehdr>, we see that the time stamp is
at bytes 4-7.

All this can be (and was) found in the UNIX programmers guide.


--

Jody Hagins
hag...@gamecock.rtp.dg.com
Data General Corp.
62 Alexander Dr.
RTP, N.C. 27709
(919) 248-6035
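
Given that layout, a comparison that skips the timestamp is short. A minimal sketch, assuming both object files have already been read into memory and that short is 2 bytes and long is 4 bytes on the target (so f_timdat sits at offsets 4-7); the function and macro names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define COFF_TIMDAT_OFFSET 4   /* after f_magic and f_nscns, 2 bytes each */
#define COFF_TIMDAT_SIZE   4   /* f_timdat, assuming a 4-byte long */

/* Return 1 if the two images match everywhere except the timestamp field. */
int same_ignoring_timestamp(const unsigned char *a, size_t na,
                            const unsigned char *b, size_t nb)
{
    size_t i;

    assert(a != NULL && b != NULL);
    if (na != nb)
        return 0;
    for (i = 0; i < na; i++) {
        if (i >= COFF_TIMDAT_OFFSET &&
            i < COFF_TIMDAT_OFFSET + COFF_TIMDAT_SIZE)
            continue;               /* skip f_timdat */
        if (a[i] != b[i])
            return 0;
    }
    return 1;
}
```

This only masks the file header timestamp; any other embedded data that varies between builds would still show up as a difference.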

Chip Salzenberg

Dec 27, 1990, 10:55:28 AM
According to arn...@audiofax.com (Arnold Robbins):

>A better, more useful technique would be to use lex/flex to write a
>C scanner that produced a stream of tokens, one per line. Run it against
>the before and after versions, and then use diff on the two outputs.

No need. There is already a utility in the c.s.u archives called
"spiff" (for Spaceman Spiff). Spiff is a token-based diff program.
Just run indent and "spiff -C"; if indent worked properly, spiff
should produce no output.
--
Chip Salzenberg at Teltronics/TCT <ch...@tct.uucp>, <uunet!pdn!tct!chip>
"Please don't send me any more of yer scandalous email, Mr. Salzenberg..."
-- Bruce Becker

Risto Lankinen

Dec 28, 1990, 3:41:39 AM
kvi...@jvd.msk.su (Pete Kvitek) writes:

> Another reason to use pascal calling convections is to
> conserve stack space with heavily nested procedures
> (for example recourcive ones).

Hi!

This is a false assumption, at least in case of Microsoft C (using _cdecl
versus _pascal). In either way, the stack contains the arguments of the
(possibly recursively) called function, the backed-up local frame pointer
and the return address. There's no penalty in stack usage for using the
_cdecl even in recursive functions.

The size difference is, indeed, significant in programs using a lot of
function calls: Using _pascal, each function *code* carries an overhead
of 1 byte (by RET nn instead of plain RET), while with _cdecl each *call*
to a function spends 3 extra bytes (by an inserted ADD SP,nnnn) for stack
clean-up.

The _cdecl convention is good for functions that have a variable number of
arguments. I have also seen it claimed to be faster than _pascal, although on
the i286 I doubt the difference is significant: with _cdecl the next
instruction is 'guaranteed' to be 3 bytes long (and to spend a few additional
cycles on pre-fetch), while with _pascal there is a good chance the next
instruction is smaller than that (say, PUSH AX = 1 byte within nested calls,
for example).

By the way, an 'extremely optimizing' compiler could fight back a bit with
_cdecl, by leaving the stack arguments intact between calls, should there be
a (rare) piece of code, where exactly (or almost exactly) the same arguments
are used in subsequent calls, and the arguments were declared as const.

Terveisin: Risto Lankinen
--
Risto Lankinen / product specialist ***************************************
Nokia Data Systems, Technology Dept * 2 2 *
THIS SPACE INTENTIONALLY LEFT BLANK * 2 -1 is PRIME! Now working on 2 +1 *
replies: ri...@yj.data.nokia.fi ***************************************
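
Plugging numbers into those byte counts makes the trade-off concrete. The costs (1 byte per _pascal function, 3 bytes per _cdecl call site) are taken from the post above; the function names are assumptions for illustration.

```c
#include <assert.h>

/* Extra code bytes spent on stack clean-up under each convention,
   using the per-function and per-call costs quoted above. */
long pascal_cleanup_bytes(long functions)
{
    assert(functions >= 0);
    return 1 * functions;      /* RET nn is 1 byte longer than plain RET */
}

long cdecl_cleanup_bytes(long call_sites)
{
    assert(call_sites >= 0);
    return 3 * call_sites;     /* one ADD SP,nnnn after every call */
}
```

For a program with 200 functions and 3000 call sites, that works out to 200 bytes against 9000 bytes.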

Dan Mercer

Dec 28, 1990, 5:35:48 PM
In article <11...@alice.att.com> a...@alice.UUCP () writes:

There are differences - just tried it. They're minor - byte 8
differs, and so does a byte 18000 bytes in - haven't checked the
significance. Time stamps? I'd look it up, but I'm too lazy.
Edification anyone?

--
Dan Mercer
NCR Network Products Division - Network Integration Services
Reply-To: mer...@npdiss1.StPaul.NCR.COM (Dan Mercer)
"MAN - the only one word oxymoron in the English Language"

Alan J Rosenthal

Dec 28, 1990, 10:05:27 PM
arn...@audiofax.com (Arnold Robbins) writes:
>Andrew is fortunate enough to be running on a Unix system that doesn't
>use COFF for it's object files --- COFF files have a timestamp in them.
>If you know where it is (I don't) ...

sigh..
just cmp -l the files. if more than a few bytes are different, they're
different, otherwise it's the timestamps. I've done this, works fine.
of course just compiling a program twice & "cmp -l" ing it will show you where
the bytes are. and there's always /usr/include/a.out.h. no problem.
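
The same "list the differing bytes and count them" check can be done in-process. A minimal sketch, assuming both files are already in memory and are the same length; diff_offsets is a name invented here, and how many bytes count as "a few" is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Record up to max_out 1-based offsets (numbered from 1, as cmp does)
   at which a and b differ. Returns the total number of differing bytes. */
size_t diff_offsets(const unsigned char *a, const unsigned char *b, size_t n,
                    size_t *out, size_t max_out)
{
    size_t i, count = 0;

    assert(a != NULL && b != NULL);
    assert(out != NULL || max_out == 0);
    for (i = 0; i < n; i++) {
        if (a[i] != b[i]) {
            if (count < max_out)
                out[count] = i + 1;
            count++;
        }
    }
    return count;
}
```

Compiling the same source twice and running this on the two results shows which offsets vary on their own; anything beyond those offsets after running indent is a real difference.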

Alvin the Chipmunk Sylvain

Dec 28, 1990, 5:59:43 PM
In article <277A19...@tct.uucp> ch...@tct.uucp (Chip Salzenberg) writes:
> According to arn...@audiofax.com (Arnold Robbins):
> >A better, more useful technique would be to use lex/flex to write a
> >C scanner that produced a stream of tokens, one per line. Run it against
> >the before and after versions, and then use diff on the two outputs.
>
> No need. There is already a utility in the c.s.u archives called
> "spiff" (for Spaceman Spiff). Spiff is a token-based diff program.
> Just run indent and "spiff -C"; if indent worked properly, spiff
> should produce no output.

You would be better off, or at least more accurate, to compile both
before and after versions and 'diff' the object code.

What you're suggesting is that we take a piece of relatively new software
(with s/w, new == buggy) and validate it using a piece of relatively
new software (probably just as buggy). The only thing that'll save you
is the low probability that the bugs line up such that they both make
the same mistake.

'diff' at least is old enough to have most of the bugs removed.
--
asyl...@felix.UUCP (Alvin "the Chipmunk" Sylvain)
========================= Opinions are Mine, Typos belong to /usr/ucb/vi
"We're sorry, but the reality you have dialed is no longer in service.
Please check the value of pi, or see your SysOp for assistance."
UUCP: hplabs!felix!asylvain ============================================

Bob Balkwill

Dec 30, 1990, 12:45:37 AM
Why would anyone want to indent code that already compiles, thereby
permitting a diff of the object code? Indent is invaluable (IMO) in
porting code that looks yucky but the code is nowhere near compilable
at that stage.
--
----
Bob Balkwill Operations Research and Analysis Establishment
balk...@ncs.dnd.ca DND, Canada
----

Dave P. Schaumann

Dec 30, 1990, 1:55:10 AM
In article <1990Dec30.0...@ncs.dnd.ca> balk...@ncs.dnd.ca (Bob Balkwill) writes:
>Why would anyone want to indent code that alreay compiles, thereby
>permitting a diff of the object code? Indent is invaluable (IMO) in
>porting code that looks yucky but the code is nowhere near compilable
>at that stage.

Imagine you have just inherited some working code, on the machine it was
intended for. (The author won the lottery...). Imagine that the author
used the Code Style From Hell. Here you have a situation where you want
to re-indent functioning code. (Masochists excepted, of course...)

>Bob Balkwill Operations Research and Analysis Establishment
>balk...@ncs.dnd.ca DND, Canada
>----

Dave Schaumann | You are in a twisty maze of little
da...@cs.arizona.edu | C statements, all different.

David Brooks

Dec 31, 1990, 1:47:20 AM

Re suggestions that you compile the before and after code, and compare
the .o files... and complaints that the assembler plunks the date and
time into the .o file...

Why not compile both sources with -S and diff the assembly source?
That way, you stand a chance of finding out not only that there is a
difference, but where it is.
--
David Brooks dbr...@osf.org
Systems Engineering, OSF uunet!osf.org!dbrooks
In Memoriam: Chris Naughton, aged 16, killed by a drunk driver Dec 22, 1990

Ken Lerman

Dec 31, 1990, 10:04:46 AM

NO, I would not trust indent.

Alternative 1:
You could write a program (call it undent) which replaces each
contiguous string of white space with a single blank character. You
could run it on my code before and after running indent and diff the
results. If they are the same, you are probably OK.

The one area of concern I'd have left is what is done with comments.
In particular, some CPPs treat the zero length comment /**/ as a
concatenation operator. Make sure that the undent program does NOT
treat this as white space. (Or grep your input and output code
looking for this.)

Alternative 2:
The purpose of this standard is to make code easier to maintain.
Before doing any maintenance, the programmer should run regression
tests on the application he is working on. Then run indent on the
module(s) he will be changing. Then run the regression tests again.
Now you can start modifying code. If some module is never worked on,
there is no need to "indent" it.

What, you don't have regression tests? Then your problem is far worse
than the lack of a standard coding format.

Ken
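
The "undent" filter from Alternative 1 is only a few lines of C. A minimal sketch operating on strings rather than files; the signature is an assumption for the example, and it squeezes whitespace inside string literals and comments too, which a careful version would not.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Copy src to dst, replacing each run of whitespace with one blank.
   dst must have room for strlen(src) + 1 bytes. Returns dst. */
char *undent(char *dst, const char *src)
{
    char *d = dst;
    int in_space = 0;

    assert(dst != NULL && src != NULL);
    for (; *src != '\0'; src++) {
        if (isspace((unsigned char)*src)) {
            if (!in_space)
                *d++ = ' ';         /* first blank of a run is kept */
            in_space = 1;
        } else {
            *d++ = *src;
            in_space = 0;
        }
    }
    *d = '\0';
    return dst;
}
```

Note that "a=b;" and "a = b;" still differ after this filter, so any place where a formatter adds or removes whitespace between tokens entirely will show up as a difference even though the code is unchanged.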

Ken Lerman

Dec 31, 1990, 10:10:03 AM
In article <2...@ptcburp.ptcbu.oz.au> ri...@ptcburp.ptcbu.oz.au (Rick Stevenson) writes:
.....
.Why don't you run a copy of your src files through indent, recompile,
.and compare new object files with the old versions? The proof of the
.pudding...
.
.Rick.
.--
. -m------- Rick Stevenson
. ---mmm----- Pyramid Technology +61 75 522475 FAX
. -----mmmmm--- Research Park, Bond University +61 75 950249 VOICE
.-------mmmmmmm- Gold Coast, Q 4229, AUSTRALIA ri...@ptcburp.ptcbu.oz.au

I've never been able to compare object successfully. Object files
tend to have imbedded line numbers, version numbers, etc. which cause
them to compare differently. Is there a utility out there which will
enable me to determine if the code and data parts of an executable or
object file are the same?

Ken

Mark A Terribile

Dec 31, 1990, 10:15:03 PM
> In article <11...@alice.att.com> a...@alice.UUCP () writes:
> >I don't need to -- I can compile my programs before and after
> >and compare the object files. If they're not identical,
> >something's broken.

> Andrew is fortunate enough to be running on a Unix system that doesn't
> use COFF for it's object files --- COFF files have a timestamp in them.
> If you know where it is (I don't), you can arrange to strip off the
> COFF header and then compare the objects, but it is not as simple an
> operation as it used to be.

If you are on a System V family UNIX (or even a System III or System IV,
when they introduced the frotzenglarken timestamp into the COFF) you should
have the -l on your cmp; this will report differences in the two or three
bytes of timestamp and go running merrily onward. It does make it harder
to do shell scripts that must stop if a problem arises, but if you are
using a shell with enough magic (like ksh) it's surely possible.

--

(This man's opinions are his own.)
From mole-end Mark Terribile

Joseph Hillenburg

Dec 31, 1990, 11:18:50 PM
In article <59...@stpstn.UUCP>, ler...@stpstn.UUCP (Ken Lerman) writes...

I fiddled with indent on a pure BSD4.3/MicroVAX II, and compiled a single
C file with the options:

% gcc -O -S test.c
% mv test.s test_noindent.s
% indent test.c
% gcc -O -S test.c
% mv test.s test_indent.s

I won't show diffs, but there were fewer than 10 lines of difference in the two
files. The test.c file was the source to a mutant version of ctar.

Of course, you may get different results.

+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
| // Joseph Hillenburg, Secretary, Bloomington Amiga Users Group |
| \X/ anlh...@ucs.indiana.edu anlh...@iurose.BITNET |
| "Have fun folks. It's the last time you'll be seeing this place" |
+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

Doug McDonald

Jan 1, 1991, 10:11:57 AM
***************************************************************************
I would have used e-mail, but I know that it is hopeless: the poster of
the following gave no "real" address, only the bogus one "ler...@stpstn.UUCP"
He should edit his sendmail program to use a real address, either UUCP-
style complete with !'s or internet style a...@b.c.d.e. Or put
his real address in a signature line.
***************************************************************************

In article <59...@stpstn.UUCP> ler...@stpstn.UUCP (Ken Lerman) writes:
>
>NO, I would not trust indent.

Then you are deluded. Indent is probably more trustworthy than your
C compiler.


>
>Alternative 1:
>You could write a program (call it undent) which replaces each
>contiguous string of white space with a single blank character. You
>could run it on my code before and after running indent and diff the
>results. If they are the same, you are probably OK.

This is a sufficient but not necessary condition. Indent will not
in general pass this test.

>
>The one area of concern I'd have left is what is done with comments.
>In particular, some CPPs treat the zero length comment /**/ as a
>concatenation operator.

Not if they are compiling the C language they don't. The string
/**/ MUST be replaced by a single white space. Period. The pertinent
references (K&R first edition, K&R second edition, and the actual
ANSI standard) all agree 100% on this. If your so-called C compiler
does something else, it is terminally broken.

>Make sure that the undent program does NOT
>treat this as white space.

It MUST treat it as white space!!! Because that is what it is.

Doug McDonald (mcdo...@aries.scs.uiuc.edu)
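
For code that relied on the /**/ trick, ANSI C provides the ## operator for token pasting. A small sketch showing both side by side; every macro and function name here is made up for the example.

```c
#include <assert.h>
#include <string.h>

#define PASTE(a, b)     a##b      /* ANSI token pasting */
#define OLD_PASTE(a, b) a/**/b    /* the comment becomes one space under ANSI rules */
#define STR(x)  #x
#define XSTR(x) STR(x)            /* expand x first, then stringify it */

/* Defines a function named get_answer. */
int PASTE(get_, answer)(void)
{
    return 42;
}

/* Returns 1 if OLD_PASTE left two separate tokens, as ANSI requires. */
int old_paste_keeps_tokens_apart(void)
{
    return strcmp(XSTR(OLD_PASTE(foo, bar)), "foo bar") == 0;
}
```

On a preprocessor that follows the ANSI rules, old_paste_keeps_tokens_apart() returns 1; a pre-ANSI preprocessor that pastes across the empty comment would produce "foobar" instead.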

Garrett Wollman

Jan 1, 1991, 1:10:41 PM
We are getting *very* far afield from the subject of C... :-(

That said, I should point out to the previous posters, that there is
*no* reason why the arguments have to be popped off the stack
immediately after every function call. In fact, by default, gcc -O
waits to pop until there is a branch or unbranch in control-flow (or,
at any rate, that's what I *think* it's doing) before popping. You have
to specifically request -fno-defer-pop for it to stop doing this.

Of course, you can also do without a frame pointer if it strikes your
fancy--just don't try to debug!

Unfortunately, as all PC programmers know, Microsoft is (and always
has been) considerably behind the times when it comes to optimization.
I'm thinking of trying to compile the mutated, 16-bit, 8086-supporting
version of GCC so that I can get decent optimization on PC programs.
I'm also thinking of scrapping PC programming altogether, at least so
long as I have access to a Real Computer :-)

-GAWollman

Garrett A. Wollman - wol...@emily.uvm.edu

Disclaimer: I'm not even sure this represents *my* opinion, never
mind UVM's, EMBA's, EMBA-CF's, or indeed anyone else's.

Linus Torvalds

Jan 1, 1991, 1:30:08 PM
In article <8...@tuura.UUCP> ri...@tuura.UUCP (Risto Lankinen) writes:
[lots of interesting stuff deleted ...]

>By the way, an 'extremely optimizing' compiler could fight back a bit with
>_cdecl, by leaving the stack arguments intact between calls, should there be
>a (rare) piece of code, where exactly (or almost exactly) the same arguments
>are used in subsequent calls, and the arguments were declared as const.

In fact the compiler needn't be THAT optimizing to use the _cdecl to
speed up programs - it can defer stack cleanup until really necessary.
Thus instead of cleaning up the stack after EVERY call, it can call a
few functions and clean up the stack for ALL calls with one instruction
(ADD #n,A7 on mc68k etc.) This, I believe is done (optionally) by gcc.
Of course the compiler must be pretty certain that the stack doesn't get
TOO big, but that isn't (shouldn't be) difficult to check for. I.e.:

....
pea addr # argument on stack, then
bsr _func1 # call _func1

move.l #0,-(a7) # don't clean up, just put new arg
bsr _func2 # on top of stack and call _func2

addq.l #8,a7 # now clean up both call arguments
...

Obviously, the speedup isn't dramatic, just my $ 0.02

>
>Terveisin: Risto Lankinen
>--
>Risto Lankinen / product specialist ***************************************
>Nokia Data Systems, Technology Dept * 2 2 *
>THIS SPACE INTENTIONALLY LEFT BLANK * 2 -1 is PRIME! Now working on 2 +1 *
>replies: ri...@yj.data.nokia.fi ***************************************

Linus Torvalds torv...@cs.helsinki.fi

Steve Summit

Jan 1, 1991, 6:01:41 PM
In article <3...@audfax.audiofax.com> Arnold Robbins writes:
>Andrew is fortunate enough to be running on a Unix system that doesn't
>use COFF for it's object files --- COFF files have a timestamp in them.
>If you know where it is (I don't), you can arrange to strip off the
>COFF header and then compare the objects...

(Others have correctly noted that a workable solution to the
immediate problem is to use cmp -l.)

Just a side note to those designing object file formats and the
programs that manipulate them: it would be extremely nice if the
nonessential information could be optionally suppressed, for
exactly this reason. (There is usually adequate control over the
amount of source line number and other debugging information;
what's needed is a way to suppress _everything_ -- symbol tables,
relocation information, source file information, invocation
options, compilation timestamps -- other than the basic machine
code found in an old-style Unix a.out file.) I'm not arguing
against the default presence of extra information (it's usually
useful, and I often wish there were more, such as source file
name, timestamp, and/or inode so that a source-level debugger
wouldn't quietly show me the wrong lines when single-stepping a
module which I've inadvertently modified since compiling); I'm
just asking for the traditional extra degree of flexibility (i.e.
creeping feature).

Steve Summit
s...@adam.mit.edu

Michael Meissner

Jan 1, 1991, 11:41:35 PM
In article <1990Dec28.2...@jarvis.csri.toronto.edu>

This doesn't always work with some object file formats out there. For
example, with the MIPS ECOFF object file format, the filenames are put
into the object file. The MIPS assembler explicitly puts an entry for
the assembly file into the table, even if the first line of the
assembly file is a .file to set the filename (it puts that in as
well). If you have a compiler that produces assembly output and
invokes the assembler (such as GCC), the assembler input file is
invariably created using the process ID in the filename. Two
successive runs will produce differences in the filenames stored. One
of the things that I did in my MIPS debug patches to GCC, is strip out
the useless filename, so that only the timestamps will differ.

The OSF/rose object file format has the potential for the same problem
in that it has a field to record the command line of the object file
creator. Thus if you compare .o's you will see differences, but if
you compare linked programs you won't (since the linker puts its own
command line in the field).
--
Michael Meissner email: meis...@osf.org phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Considering the flames and intolerance, shouldn't USENET be spelled ABUSENET?

Benson I. Margulies

Jan 2, 1991, 11:47:22 AM
Multics has a compare_object command that did a logical compare,
section by section. Surely OSF can do as well. Surely someone could
write one and put it in the public domain, in any case.
I would if I needed it.

--
Benson I. Margulies

Michael Meissner

Jan 2, 1991, 2:51:13 PM
In article <1991Jan2.1...@odi.com> ben...@odi.com (Benson I.
Margulies) writes:

| Multics has a compare_object command that did a logical compare,
| section by section. Surely OSF can do as well. Surely someone could
| write one and put it in the public domain, in any case.
| I would if I needed it.

I'm planning on writing one. I hacked together a perl script that
compares MIPS ECOFF binaries section by section. Perl is not the
language to write such a beast, so the mark-II version will be in C.
It reminded me of my early days at Data General, where you had to
write your own structures for doing system calls from the assembly
language parameters (which is why I became a fanatic about providing
the structures for the C compiler I wrote for them).

Rahul Dhesi

Jan 4, 1991, 12:25:39 AM
In <26...@uflorida.cis.ufl.EDU> j...@reef.cis.ufl.edu (Brian K. W. Hook) writes:

>...but I was wondering whether there is a NOTICEABLE difference
>in speed and size of program that uses PASCAL declarations...

This belongs in comp.lang.pascal, does it not?

Seriously, please don't assume that comp.lang.c readers will be able to
answer questions about nonstandard extensions of the C programming
language. The "pascal" keyword is used by MS-DOS-specific compilers,
and a better place to ask the question will be one of the MS-DOS
newsgroups.

However...

The effect of using the "pascal" keyword is to tell the compiler that
it need not assume that traditional C style function calls, with a
possibly variable number of parameters, will be used. The compiler can
thus generate code that allows the called function to pop the stack
before it returns, rather than inserting stack-popping code after each
call to each function. Difference in speed: probably not noticeable.
Difference in code size: a few bytes (let's say about 2 bytes) saved
each time any function is called. If a function is called 10 times,
that means 20 bytes saved by making that function a "pascal" function.
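The size estimate above is just a multiplication, sketched here in C. The 2-byte figure is the rough guess from the paragraph, not a measurement, and the function name is made up for illustration.

```c
/* Rough code-size saving from letting the called function pop the stack:
   one block of cleanup code is avoided at every call site.  The per-call
   byte count is an assumption supplied by the caller of this function. */
static unsigned long
bytes_saved(unsigned long call_sites, unsigned long cleanup_bytes_per_call)
{
    return call_sites * cleanup_bytes_per_call;
}
```

With 10 call sites and 2 bytes of cleanup per call, bytes_saved(10, 2) gives the 20 bytes mentioned above.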

In theory, there is no need for a "pascal" keyword if you're using ANSI
C, since the compiler knows whether or not a prototype is in scope, and
can generate efficient or inefficient code accordingly. In practice,
it seems (unverified suspicion) that most MS-DOS compilers don't take
advantage of the presence of ANSI C prototypes for such optimization
but do take advantage of the "pascal" keyword. This could be because
of the fear that the user might link the object code resulting from his
C source with an assembly language program, or out of the knowledge
that most MS-DOS users don't use "lint" and can therefore not check
function calls for consistency across files.
--
History never | Rahul Dhesi <dhesi%cir...@oliveb.ATC.olivetti.com>
becomes obsolete. | UUCP: oliveb!cirrusl!dhesi

Alvin the Chipmunk Sylvain

Jan 3, 1991, 6:11:36 PM
In article <59...@stpstn.UUCP> ler...@stpstn.UUCP (Ken Lerman) writes:
> In article <2...@ptcburp.ptcbu.oz.au> ri...@ptcburp.ptcbu.oz.au (Rick Stevenson) writes:
> .....
> .Why don't you run a copy of your src files through indent, recompile,
> .and compare new object files with the old versions? The proof of the
> .pudding...
>
> I've never been able to compare object successfully. Object files
> tend to have imbedded line numbers, version numbers, etc. which cause
> them to compare differently. Is there a utility out there which will
> enable me to determine if the code and data parts of an executable or
> object file are the same?

You should be able to run 'strip' on the object files to remove any
extraneous garbage, including information for debuggers. Although I
like someone's suggestion of compiling to assembly source with the -S
option much better. As he pointed out, this not only identifies a
difference, it indicates where it is as well. (Or at least gives you an
idea, depending on how transmogrified it becomes with optimization.)

Alvin the Chipmunk Sylvain

Jan 3, 1991, 6:22:34 PM
In article <1991Jan1.1...@ux1.cso.uiuc.edu> mcdo...@aries.scs.uiuc.edu (Doug McDonald) writes:
[...]

> In article <59...@stpstn.UUCP> ler...@stpstn.UUCP (Ken Lerman) writes:
> >
> >NO, I would not trust indent.
> Then you are deluded. Indent is probably more trustworthy than your
> C compiler.

Well, it has less work to do ... other than that, who knows?

[...]


> >The one area of concern I'd have left is what is done with comments.
> >In particular, some CPPs treat the zero length comment /**/ as a
> >concatenation operator.
>
> Not if they are compiling the C language they don't. The string
> /**/ MUST be replaced by a single white space. Period. All pertinent
> references (K&R first edition, K&R second edition, and the actual
> ANSI standard) all agree 100% on this. If your so-called C compiler
> does something else, it is terminally broken.

It may be standard, but there *are* C compilers which do this. And, of
course, programmers who have taken advantage of the "feature".

> >Make sure that the undent program does NOT
> >treat this as white space.
>
> It MUST treat it as white space!!! Because that is what it is.

I agree. However, when one is stuck with the task of making a piece of
code work, s/he is usually not concerned with what the "correct" behavior
is supposed to be, only with the results generated by the application.

Actually, anyone with code which makes extensive use of the "feature"
should probably rewrite it anyway, especially if upgrading to a system
or compiler where the "feature" (correctly) doesn't exist. But that's
a decision to be made on a case-by-case basis, i.e., is it cheaper to
rewrite the code, or find a kludge around the problem? (I know, nobody
likes "kludges", but this is the real world and elegance must sometimes
take a backseat to expediency. Ah, but that's a whole 'nother thread!)

Karl Heuer

Jan 4, 1991, 12:54:21 PM
In article <1991Jan1.1...@ux1.cso.uiuc.edu> mcdo...@aries.scs.uiuc.edu (Doug McDonald) writes:
>In article <59...@stpstn.UUCP> ler...@stpstn.UUCP (Ken Lerman) writes:
>>some CPPs treat the zero length comment /**/ as a concatenation operator.
>
>Not if they are compiling the C language they don't. The string
>/**/ MUST be replaced by a single white space. Period. All pertinent
>references (K&R first edition, K&R second edition, and the actual
>ANSI standard) all agree 100% on this.

Unfortunately, prior to ANSI C there was no such thing as "the C language";
there were multiple dialects of C. One subfamily, including the dialect
described by K&R1, treated comments as whitespace. Another, which includes
the dialect implemented by the Ritchie compiler with the Reiser cpp, treated
them as empty. Whether this was a bug in the compiler or in the manual cannot
be resolved without further information.

X3J11 accepted both K&R1 and widespread existing practice as evidence in
defining the new language. In this case they went with K&R1, not because it
was sacred text, but because the behavior described by K&R1 is better. (And
because they'd added an alternate way to do concatenation.)
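A short sketch of that alternate way, the ANSI ## operator, next to the old empty-comment trick. The macro and function names here are made up for illustration; the stringizing helpers are only there so the result of each expansion can be inspected as text.

```c
#include <string.h>

/* ANSI C token pasting: ## joins its two operands into a single token. */
#define PASTE(a, b) a##b

/* Old Reiser-cpp idiom.  Under ANSI rules the empty comment is replaced
   by one space before macros are processed, so the two names stay
   separate tokens instead of being joined. */
#define OLD_PASTE(a, b) a/**/b

/* Two-level stringizing, so the argument is macro-expanded first. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* Text of the expansion produced by the ANSI operator. */
static const char *pasted_text(void)
{
    return STR(PASTE(foo, bar));
}

/* Text of the expansion produced by the empty-comment idiom. */
static const char *old_style_text(void)
{
    return STR(OLD_PASTE(foo, bar));
}
```

On an ANSI preprocessor, pasted_text() yields the single identifier foobar, while old_style_text() does not, because foo and bar remain two tokens. A Reiser-style cpp would have joined them, which is why code relying on the trick has to be rewritten with ## when it moves to an ANSI compiler.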

Karl W. Z. Heuer (ka...@ima.isc.com or uunet!ima!karl), The Walking Lint

Karl Heuer

Jan 4, 1991, 1:32:43 PM
In article <28...@cirrusl.UUCP> dhesi%cir...@oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
>In theory, there is no need for a "pascal" keyword if you're using ANSI C...

Unless the compiler is going to do cross-file analysis (e.g. with a fixup at
link time), it would have to assume that all non-variadic functions are
pascal-like. (It's irrelevant whether or not a prototype is in scope, since
    int main(void) { return f(1); }
and
    extern int f(int);
    int main(void) { return f(1); }
are equivalent programs, irregardful of whether f() is defined with or without
a prototype in its own source file.)

But if the compiler treats non-variadic functions and variadic functions
differently, it would break the common (but non-ANSI) usage
    /* no <stdio.h>! */
    int main() { printf("hello, world\n"); return 0; }
because it doesn't know that printf() is variadic.

Conclusion: an implementation must do one of the following:

[0] Be prepared to distinguish variadic from non-variadic functions at link
time.

[1] Treat all functions as caller-pop (unless otherwise indicated by a
nonstandard keyword like "__pascal").

[2] Treat all non-variadic functions as callee-pop, and break old programs
such as the above. (A warning "no prototype for printf()" helps here.)

[3] Recognize "printf" and friends as reserved words even when <stdio.h> is
missing.

Brian K. W. Hook

Jan 4, 1991, 5:31:53 PM
> "CDECL and PASCAL keywords...shouldn't this be in comp.lang.pascal?"

*ahem* Granted, I am new to the net, but I was not sure whether to post a
question RE:CDECL and PASCAL to comp.lang.c or comp.os.msdos.programmer...
The keywords CDECL and PASCAL are part of DOS compiler C language extensions,
and thus I posted here. I have assumed that comp.lang.c is, in fact, for
all flavors of C, and that comp.std.c is for standard C (ANSI or K&R?).

For those that answered, thank you for your assistance. As an aside, maybe
there should be:

comp.lang.c.msdos
comp.lang.c.ansi
comp.lang.c.kr
comp.lang.c.sun

etc...posting on comp.os.msdos.programmer invariably gets me few answers,
since many of the posters there are familiar with Pascal or asm...and I
need help with C....

Brian

Gertjan van Oosten

Jan 7, 1991, 7:49:50 AM
> > Do you trust 'indent'
> > enough to run it on all your source file without making any mistakes?
>
> I don't need to -- I can compile my programs before and after
> and compare the object files. If they're not identical,
> something's broken.
> --
> --Andrew Koenig
> a...@europa.att.com

Sadly, this is not enough :-(

I had a version of 'indent' (sorry, don't know which, nor the OS version (it
was SunOS)) that ate some of the comments from my file.
And I definitely do NOT consider source files with and without comments equal.
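Since the object files cannot show a lost comment, one possible check is to count the comments in the source before and after. What follows is only a sketch in C: the function name is made up, and it ignores trigraphs and backslash-newline splicing.

```c
#include <stddef.h>

/* Count the comments in a C source text held in a string.  String and
   character literals are skipped so that a comment opener inside quotes
   is not counted.  Sketch only: trigraphs and line splicing are ignored. */
static size_t count_comments(const char *src)
{
    size_t count = 0;
    const char *p = src;

    while (*p != '\0') {
        if (*p == '"' || *p == '\'') {
            char quote = *p++;

            /* Skip to the matching quote, honoring backslash escapes. */
            while (*p != '\0' && *p != quote) {
                if (*p == '\\' && p[1] != '\0')
                    p++;
                p++;
            }
            if (*p == quote)
                p++;
        } else if (p[0] == '/' && p[1] == '*') {
            count++;
            p += 2;
            /* Skip to the end of the comment, or to the end of the text
               if the comment is never closed. */
            while (*p != '\0' && !(p[0] == '*' && p[1] == '/'))
                p++;
            if (*p != '\0')
                p += 2;
        } else {
            p++;
        }
    }
    return count;
}
```

Running this over each file before and after 'indent' and comparing the two counts would catch a comment that vanished entirely; it would not catch one that was merely altered.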

As an aside: what does one do when the object files aren't identical?

Regards,
--
-- Gertjan van Oosten, ger...@westc.uucp OR mcsun!hp4nl!westc!gertjan
-- West Consulting bv, Phoenixstraat 49, 2611 AL Delft, The Netherlands
-- P.O. Box 3318, 2601 DH Delft
-- Tel: +31-15-123190, Fax: +31-15-147889

joseph.a.brownlee

Jan 7, 1991, 8:02:50 AM
In article <28...@cirrusl.UUCP>, dhesi%cir...@oliveb.ATC.olivetti.com
(Rahul Dhesi) writes:
> Seriously, please don't assume that comp.lang.c readers will be able to
> answer questions about nonstandard extensions of the C programming
> language. The "pascal" keyword is used by MS-DOS-specific compilers,
> and a better place to ask the question will be one of the MS-DOS
> newsgroups.

Well, it is also commonly used on the Macintosh, where the ROM software is
all written assuming PASCAL calling conventions. The keyword "pascal" is
generally used in one of two ways:

    . to prototype a ROM (or other library) routine.

    . to force one of your own functions to use PASCAL conventions. This
      is necessary, for example, when a ROM routine expects a procedure
      pointer and you pass a pointer to your own routine written in C.

--
- _ Joe Brownlee, Analysts International Corp. @ AT&T Network Systems
/_\ @ / ` 471 E Broad St, Suite 1610, Columbus, Ohio 43215 (614) 860-7461
/ \ | \_, E-mail: j...@cblph.att.com Who pays attention to what _I_ say?
"Scotty, we need warp drive in 3 minutes or we're all dead!" --- James T. Kirk
