# size of a sizeof(pointer)


### syntax

Feb 8, 2004, 2:37:15 PM

what is the size of a pointer?

suppose i am writing,

datatype *ptr;
sizeof(ptr);

now what will this sizeof(ptr) give? will it give the size of the
data the pointer is pointing to?

if no, can you give a counter example?

basically, i want to know what is the meaning of the size of a pointer.

as you know

sizeof(int)=4;

sizeof(char)= 2;

but what does sizeof(ptr) means??

can anybody explain?

### Josh Sebastian

Feb 8, 2004, 2:40:52 PM

On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote:

> what is the size of a pointer?
>
> suppose i am writing,
>
>
> datatype *ptr;
> sizeof(ptr);
>
>
> now what does this sizeof(ptr) will give? will it give the size of the
> data the pointer is pointing to?
>
> if no, can you give an counter example?
>
> basically , i want to know what is the meaning of size of a ponter.
>
> as you know
>
> sizeof(int)=4;

Maybe. It must be >= 2.

> sizeof(char)= 2;

sizeof(char) is, by definition, 1.

> but what does sizeof(ptr) means??

It's the amount of space the pointer itself takes up. Not the data pointed
to, but the pointer itself. Often, it's == sizeof(int).

Josh

### Malcolm

Feb 8, 2004, 2:53:17 PM

"syntax" <san...@yahoo.com.hk> wrote in message

> what is the size of a pointer?
>
A pointer is a variable that holds an address. The size of a pointer is the
size of this address.

For instance, most computers have an address space of 4GB. 32 bits allows
you 4GB, so the size of a pointer will be 32 bits, or 4 (char is usually 8
bits). On some microcomputers the address space is only 64K, so 16-bit
pointers are used.

>
> datatype *ptr;
> sizeof(ptr);
>
> now what does this sizeof(ptr) will give? will it give the size of the
> data the pointer is pointing to?
>
No, it gives the size of the pointer, probably 4.

>
> if no, can you give an counter example?
>
One confusing thing about C is that arrays and pointers have array/pointer
equivalence.

char string[32];

printf("sizeof string %d\n", (int) sizeof(string));

will give you 32.

char *string = malloc(32);

printf(" sizeof string %d\n", (int) sizeof(string));

will give you the size of a pointer on your system, probably 4.

>
> basically , i want to know what is the meaning of size of a ponter.
>
> as you know
>
> sizeof(int)=4;
>
> sizeof(char)= 2;
>

sizeof(char) is always 1, one of the little quirks of the C language.
sizeof(int) is very commonly 4, but it can be any size. It is meant to be
the natural size for the machine to use, which means the width of the
register.
For technical reasons pointers are usually the same size as ints, but again
they can be any size.

### Richard Heathfield

Feb 8, 2004, 2:58:20 PM

Josh Sebastian wrote:

> On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote:
>
>> as you know
>>
>> sizeof(int)=4;
>
> Maybe. It must be >= 2.

Wrong. It must, however, be an exact multiple of 1.

>> sizeof(char)= 2;
>
> sizeof(char) is, by definition, 1.

Right.

>
>> but what does sizeof(ptr) means??
>
> It's the amount of space the pointer itself takes up. Not the data pointed
> to, but the pointer itself. Often, it's == sizeof(int).

But, of course, it doesn't have to be (as you know).

--
Richard Heathfield : bin...@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

### Josh Sebastian

Feb 8, 2004, 3:29:51 PM

On Sun, 08 Feb 2004 19:58:20 +0000, Richard Heathfield wrote:

> Josh Sebastian wrote:
>
>> On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote:
>>
>>> as you know
>>>
>>> sizeof(int)=4;
>>
>> Maybe. It must be >= 2.
>
> Wrong. It must, however, be an exact multiple of 1.

Jeez... yeah, thanks.

### Keith Thompson

Feb 8, 2004, 3:41:32 PM

Josh Sebastian <cur...@cox.net> writes:
> On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote:
[...]

> > but what does sizeof(ptr) means??
>
> It's the amount of space the pointer itself takes up. Not the data pointed
> to, but the pointer itself. Often, it's == sizeof(int).

It's true that the size of a pointer is often equal to sizeof(int),
but it's dangerous (and unnecessary) to assume that it always is.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>
Schroedinger does Shakespeare: "To be *and* not to be"

### Mike Wahler

Feb 8, 2004, 4:19:18 PM

"Keith Thompson" <ks...@mib.org> wrote in message
news:lnfzdlc...@nuthaus.mib.org...

> Josh Sebastian <cur...@cox.net> writes:
> > On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote:
> [...]
> > > but what does sizeof(ptr) means??
> >
> > It's the amount of space the pointer itself takes up. Not the data
> > pointed to, but the pointer itself. Often, it's == sizeof(int).
>
> It's true that the size of a pointer is often equal to sizeof(int),
> but it's dangerous (an unnecessary) to assume that it always is.

Or for that matter, to assume that all pointer types have the same size.

-Mike

### Keith Thompson

Feb 8, 2004, 5:01:19 PM

"Malcolm" <mal...@55bank.freeserve.co.uk> writes:
[...]

> One confusing thing about C is that arrays and pointer have array/pointer
> equivalence.

No, there is no array/pointer equivalence (or rather, "equivalence" is
a misleading term for what's really going on). Array names are
implicitly converted to pointer values in many contexts.

See the C FAQ at <http://www.eskimo.com/~scs/C-faq/faq.html>,
particularly section 6, particularly question 6.3.

### Malcolm

Feb 8, 2004, 5:04:50 PM

"Keith Thompson" <ks...@mib.org> wrote in message
>
> No, there is no array/pointer equivalence (or rather, "equivalence" is
> a misleading term for what's really going on). Array names are
> implicitly converted to pointer values in many contexts.
>
> See the C FAQ at <http://www.eskimo.com/~scs/C-faq/faq.html>,
> particularly section 6, particularly question 6.3.
>
Exactly. "Equivalence" is the accepted term for what is going on, which is
confusing.

### CBFalconer

Feb 8, 2004, 9:00:20 PM

Malcolm wrote:
> "syntax" <san...@yahoo.com.hk> wrote in message
>
> > what is the size of a pointer?
> >
> A pointer is a variable that holds an address. The size of a
> pointer is the size of this address.
>
> For instance, most computers have an address space of 4GB. 32
> bits allows you 4GB, so the size of a pointer will be 32 bits,
> or 4 (char is usually 8 bits). On some microcomputers the
> address space is only 64K, so 16-bit pointers are used.

Nope. A pointer points. What information it needs to hold to do
that is up to the implementation. It could consist of a URL and
other information, just as a not too wild example. Another might
be "Malcolms house, under the bed beside the dirty socks, last
Tuesday". The amount of information needed is usually constrained
by limiting the things that the pointer is allowed to point to.
Clear now?

At any rate the C expression "sizeof ptr", where ptr is an actual
pointer, is available to tell you how much space that particular
implementation needs for the job.

Sometimes that pointer may be a real memory address. Today it
more often represents an offset from another pointer which points
to a block of some sort of storage. You should neither know nor
care, unless you are implementing the system.

--
Chuck F (cbfal...@yahoo.com) (cbfal...@worldnet.att.net)
Available for consulting/temporary embedded and systems.

### Jack Klein

Feb 8, 2004, 9:47:43 PM

On Sun, 08 Feb 2004 14:40:52 -0500, Josh Sebastian <cur...@cox.net>
wrote in comp.lang.c:

> On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote:
>
> > what is the size of a pointer?
> >
> > suppose i am writing,
> >
> >
> > datatype *ptr;
> > sizeof(ptr);
> >
> >
> > now what does this sizeof(ptr) will give? will it give the size of the
> > data the pointer is pointing to?
> >
> > if no, can you give an counter example?
> >
> > basically , i want to know what is the meaning of size of a ponter.
> >
> > as you know
> >
> > sizeof(int)=4;
>
> Maybe. It must be >= 2.

No, the number of bits in an int must be at least 16, but on some
platforms CHAR_BIT is greater than 8.

I am actually developing code right now for a Texas Instruments
2812 DSP with their Code Composer Studio. CHAR_BIT is 16. The types
char, signed char, unsigned char, signed short, unsigned short, signed
int and unsigned int all contain 16 bits and the sizeof operator
yields a value of 1 for each and every one of these types. The
processor only reads and writes memory in 16 bit words.

In the past I have worked with a 32 bit DSP from Analog Devices which
only addressed memory in 32 bit words. CHAR_BIT was 32. All the
integer types (this was before C99, so there was no long long type)
had 32 bits and sizeof yielded 1, even for signed and unsigned long.

> sizeof(char) is, by definition, 1.

This is true.

> > but what does sizeof(ptr) means??
>
> It's the amount of space the pointer itself takes up. Not the data pointed
> to, but the pointer itself. Often, it's == sizeof(int).

And often it is not. Under the TI compiler I mentioned above, while
int has 16 bits and sizeof(int) is 1, while pointers have 32 bits and
sizeof(void *) is 2.

Under Keil's compiler for the 8051, int has 16 bits and occupies 2
bytes, sizeof(void *) is 3, although it doesn't really use all 24
bits.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html

### Grumble

Feb 9, 2004, 6:30:59 AM

Richard Heathfield wrote:

> Josh Sebastian wrote:

>
>> syntax wrote:
>>
>>> sizeof(int)=4;
>>
>> Maybe. It must be >= 2.
>
> Wrong. It must, however, be an exact multiple of 1.

An implementation cannot have 16-bit chars and 24-bit ints?

How about 16-bit chars and 24-bit pointers?

### pete

Feb 9, 2004, 7:40:21 AM

Grumble wrote:
>
> Richard Heathfield wrote:
>
> > Josh Sebastian wrote:
> >
> >> syntax wrote:
> >>
> >>> sizeof(int)=4;
> >>
> >> Maybe. It must be >= 2.
> >
> > Wrong. It must, however, be an exact multiple of 1.

It must be greater than 1, on hosted implementations.

> An implementation cannot have 16-bit chars and 24-bit ints?

The sum of the numbers of padding bits,
value bits and the sign bit, is a multiple of CHAR_BIT.

> How about 16-bit chars and 24-bit pointers?

The bit representation of pointers is not specified.

--
pete

### Mark McIntyre

Feb 9, 2004, 9:08:39 AM

On Mon, 09 Feb 2004 12:40:21 GMT, in comp.lang.c , pete
<pfi...@mindspring.com> wrote:

>Grumble wrote:
>>
>> Richard Heathfield wrote:
>>
>> > Josh Sebastian wrote:
>> >
>> >> syntax wrote:
>> >>
>> >>> sizeof(int)=4;
>> >>
>> >> Maybe. It must be >= 2.
>> >
>> > Wrong. It must, however, be an exact multiple of 1.
>
>It must be greater than 1, on hosted implementations.

Not if a char were 16 bits wide.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>


### Richard Bos

Feb 9, 2004, 8:59:44 AM

pete <pfi...@mindspring.com> wrote:

> Grumble wrote:
> >
> > Richard Heathfield wrote:
> >
> > > Josh Sebastian wrote:
> > >
> > >> syntax wrote:
> > >>
> > >>> sizeof(int)=4;
> > >>
> > >> Maybe. It must be >= 2.
> > >
> > > Wrong. It must, however, be an exact multiple of 1.
>
> It must be greater than 1, on hosted implementations.

Of course, it's exceedingly awkward for a hosted implementation to have
sizeof(int)==1, but it isn't illegal.

> > An implementation cannot have 16-bit chars and 24-bit ints?
>
> The sum of the numbers of padding bits,
> value bits and the sign bit, is a multiple of CHAR_BIT.
>
> > How about 16-bit chars and 24-bit pointers?
>
> The bit representation of pointers is not specified.

Even so, all types have sizes measurable in whole chars; look up the
definition of sizeof.

Richard

### Mark A. Odell

Feb 9, 2004, 10:43:36 AM

"Mike Wahler" <mkwa...@mkwahler.net> wrote in

>> > to, but the pointer itself. Often, it's == sizeof(int).
>>
>> It's true that the size of a pointer is often equal to sizeof(int),
>> but it's dangerous (an unnecessary) to assume that it always is.
>
> Or for that matter, to assume that all pointer types have the same size.

Indeed. For example, Keil C51 has 1 byte, 2 byte, and 3 byte pointer sizes
depending upon which memory space the pointer points to.

### Grumble

Feb 9, 2004, 10:47:03 AM

Richard Bos wrote:

> Of course, it's exceedingly awkward for a hosted implementation

> to have sizeof(int)==1 [...]

Is it awkward because getc() can return either a char or EOF?

### Richard Bos

Feb 9, 2004, 10:12:34 AM

Grumble <inv...@kma.eu.org> wrote:

That, and related problems, yes. If you need to take these legal-but-
unlikely implementations into account, you need to check feof() and
ferror() as well as comparing the return value with EOF.

Personally, I never do.

Richard

### Papadopoulos Giannis

Feb 9, 2004, 12:28:21 PM

Mark McIntyre wrote:

> On Mon, 09 Feb 2004 12:40:21 GMT, in comp.lang.c , pete
> <pfi...@mindspring.com> wrote:
>
>
>>Grumble wrote:
>>
>>>Richard Heathfield wrote:
>>>
>>>
>>>>Josh Sebastian wrote:
>>>>
>>>>
>>>>>syntax wrote:
>>>>>
>>>>>
>>>>>>sizeof(int)=4;
>>>>>
>>>>>Maybe. It must be >= 2.
>>>>
>>>>Wrong. It must, however, be an exact multiple of 1.
>>
>>It must be greater than 1, on hosted implementations.
>
>
> Not if a char were 16 bits wide.
>
>

Is there any live implementation that uses 16-bit chars? (I know of the
existence of a machine whose byte is 6 bits.)

--
#include <stdio.h>
#define p(s) printf(#s" endian")
int main(void){int v=1;*(char*)&v?p(Little):p(Big);return 0;}

http://dop.users.uth.gr/
University of Thessaly
Computer & Communications Engineering dept.

### Mark McIntyre

Feb 9, 2004, 12:55:10 PM

On Mon, 09 Feb 2004 19:28:21 +0200, in comp.lang.c , Papadopoulos Giannis
<ipap...@inf.uth.gr> wrote:
>
>Is there any alive implementation that uses 16bit chars?? (I know of the
>existance of a machine that a byte is 6-bit)

Unicode springs to mind.

I suspect that quite a few DSPs do, tho typically they're freestanding
implementations.

That aside, I'd be unsurprised to see future implementations using 16 bits
for chars.

### Malcolm

Feb 9, 2004, 3:06:54 PM

"Grumble" <inv...@kma.eu.org> wrote in message

>
> An implementation cannot have 16-bit chars and 24-bit ints?
>
> How about 16-bit chars and 24-bit pointers?
>
Not allowed. chars and bytes (or to be pedantic, unsigned chars and bytes)
are the same thing in C. An unfortunate hangover from the early days.
All types have to be a whole multiple of a char in size.

### Malcolm

Feb 9, 2004, 4:21:01 PM

"CBFalconer" <cbfal...@yahoo.com> wrote in message

> > For instance, most computers have an address space of 4GB. 32
> > bits allows you 4GB, so the size of a pointer will be 32 bits,
> > or 4 (char is usually 8 bits). On some microcomputers the
> > address space is only 64K, so 16-bit pointers are used.
>
> Nope. A pointer points. What information it needs to hold to do
> that is up to the implementation. It could consist of a URL and
> other information, just as a not too wild example. Another might
> be "Malcolms house, under the bed beside the dirty socks, last
> Tuesday". The amount of information needed is usually constrained
> by limiting the things that the pointer is allowed to point to.
> Clear now?
>
Don't patronise.
You and I both know that perverse implementations are allowed. Since
pointers have to be a fixed size then using a URL would be grossly
inefficient.
Since the OP needs to understand how pointers are represented in memory on a
typical system such as the one he will certainly be using, telling him that
32 bit pointers are needed to address 4GB gets across the message clearly.
Talk about URL pointers is liable to confuse.

>
> You should neither know nor care, unless you are implementing the
> system.
>
Well you very often need to break the bounds of ANSI C and go to a lower
level. An example would be if you have a custom memory scheme. How do you
know if a pointer comes from your arena or from elsewhere?
Another example would be using a debugger. Invalid pointers are often set to
some defined bit pattern. You need to know something about addressing to
make sense of them.
Programming is practical. It doesn't make sense to hand someone a copy of
the standard and expect them to be able to write fully-conforming ANSI C.
You need to play with a real implementation on a real machine to have any
hope of understanding what is going on.

### Leor Zolman

Feb 9, 2004, 4:40:37 PM

I've never heard the term before starting to read this newsgroup. I've
always called it "array/pointer duality".
-leor


Leor Zolman
BD Software
le...@bdsoft.com
www.bdsoft.com -- On-Site Training in C/C++, Java, Perl & Unix
Decryptor at www.bdsoft.com/tools/stlfilt.html

### Papadopoulos Giannis

Feb 9, 2004, 7:03:13 PM

Mark McIntyre wrote:

> On Mon, 09 Feb 2004 19:28:21 +0200, in comp.lang.c , Papadopoulos Giannis
> <ipap...@inf.uth.gr> wrote:
>
> That aside, I'd be unsurprised to see future implementations using 16 bits
> for chars.

If we use 16-bit values as char, then the new C0x spec must define
something like "byte" (Java's char is Unicode, and it has a separate 8-bit
byte type).

There is of course wchar_t, so there is definitely no need for 16-bit
chars. Or so I think... Comments?

### Keith Thompson

Feb 9, 2004, 8:31:36 PM

> Mark McIntyre wrote:
> > On Mon, 09 Feb 2004 19:28:21 +0200, in comp.lang.c , Papadopoulos Giannis
> > <ipap...@inf.uth.gr> wrote:
> > That aside, I'd be unsurprised to see future implementations using
> > 16 bits
> > for chars.
>
> If we use 16-bit values as char, then the new C0x spec must define
> something like "byte" (java's char is unicode and it haves an 8-bit
> type)..
>
> There is of course wchar_t so there is definately no need for 16bit
> chars.. Or so I think... Comments?

I think C will always define a char as being one byte (sizeof(char)==1).
There's too much code that would break if that were changed. The
process that led to the 1989 ANSI standard was probably the last real
opportunity to change this.

I'd greatly prefer the concepts of "character" and "uniquely
addressable storage unit" to be separate, but it's too late to fix it.

It just might be possible to deprecate the use of the word "byte"
(which is part of the description of the language, not part of the
language itself) while continuing to guarantee that sizeof(char)==1,
but I doubt that even that will be done.

### pete

Feb 9, 2004, 9:54:58 PM

Mark McIntyre wrote:
>
> On Mon, 09 Feb 2004 12:40:21 GMT, in comp.lang.c , pete
> <pfi...@mindspring.com> wrote:
>
> >Grumble wrote:
> >>
> >> Richard Heathfield wrote:
> >>
> >> > Josh Sebastian wrote:
> >> >
> >> >> syntax wrote:
> >> >>
> >> >>> sizeof(int)=4;
> >> >>
> >> >> Maybe. It must be >= 2.
> >> >
> >> > Wrong. It must, however, be an exact multiple of 1.
> >
> >It must be greater than 1, on hosted implementations.
>
> Not if a char were 16 bits wide.

You can't implement the whole standard library
if sizeof(int) is one.

putchar(EOF) has to be able to return EOF
converted to an unsigned char value,
converted back to a nonnegative int.

--
pete

### Mike Wahler

Feb 10, 2004, 1:23:04 AM

"Malcolm" <mal...@55bank.freeserve.co.uk> wrote in message
news:c08tlg\$maa\$1...@newsg1.svr.pol.co.uk...

>
> "CBFalconer" <cbfal...@yahoo.com> wrote in message
> > > For instance, most computers have an address space of 4GB. 32
> > > bits allows you 4GB, so the size of a pointer will be 32 bits,
> > > or 4 (char is usually 8 bits). On some microcomputers the
> > > address space is only 64K, so 16-bit pointers are used.
> >
> > Nope. A pointer points. What information it needs to hold to do
> > that is up to the implementation. It could consist of a URL and
> > other information, just as a not too wild example. Another might
> > be "Malcolms house, under the bed beside the dirty socks, last
> > Tuesday". The amount of information needed is usually constrained
> > by limiting the things that the pointer is allowed to point to.
> > Clear now?
> >
> Don't patronise.
> You and I both know that perverse implementations are allowed.

For suitable definitions of 'perverse'.

> Since
> pointers have to be a fixed size

C & V please.

> then using a URL would be grossly
> inefficient.
> Since the OP needs to understand how pointers are represented in memory

That's platform/implementation dependent.

>on a
> typical system

Whose definition of 'typical'?

>such as the one he will certainly be using,

Doesn't matter which one. The answers will be platform-specific,
not applicable to standard C.

>telling him that
> 32 bit pointers are needed to address 4GB gets across the message clearly.

That's one of many possible ways to represent such an address space.

> Talk about URL pointers is liable to confuse.

It's intended to clarify (and imo it did) that a pointer is
an *abstraction*, and as such, one need not (should not) be
concerned about its physical implementation.

> >
> > You should neither know nor care, unless you are implementing the
> > system.
> >
> Well you very often need to break the bounds of ANSI C and go to a lower
> level.

In which case the discussion needs to depart from clc.

>An example would be if you have a custom memory scheme. How do you
> know if a pointer comes from your arena or from elsewhere?

Not here.

> Another example would be using a debugger. Invalid pointers are often set
> to some defined bit pattern. You need to know something about addressing to

Not here.

> Programming is practical.

The subject of clc is not programming.

> It doesn't make sense to hand someone a copy of
> the standard and expect them to be able to write fully-conforming ANSI C.

That's why we have books, schools, instructors, etc.

> You need to play with a real implementation on a real machine to have any
> hope of understanding what is going on.

Not at the abstract level of ISO C. 'Way' back when, I got a decent
understanding of how COBOL worked, before I ever laid eyes on any hardware.
This was proven when I actually coded, compiled, and successfully ran
programs when we did get access to a computer.

-Mike

### Kelsey Bjarnason

Feb 10, 2004, 5:32:04 AM

[snips]

On Tue, 10 Feb 2004 06:23:04 +0000, Mike Wahler wrote:

>> Since
>> pointers have to be a fixed size
>
>
>> then using a URL would be grossly
>> inefficient.
>> Since the OP needs to understand how pointers are represented in memory
>
> That's platform/implemenatation dependent.

I've always favored SQL queries. Store all the values in a database and
the pointers are all just queries to retrieve them.

>>telling him that
>> 32 bit pointers are needed to address 4GB gets across the message
>> clearly.
>
> That's one of many possible ways to represent such an address space.

Anyone who ever used older DOS compilers will appreciate the clarity of
not assuming pointers make any sort of inherent sense. :)

### Richard Bos

Feb 10, 2004, 4:02:08 AM

"Mike Wahler" <mkwa...@mkwahler.net> wrote:

> "Malcolm" <mal...@55bank.freeserve.co.uk> wrote in message
> news:c08tlg\$maa\$1...@newsg1.svr.pol.co.uk...

> > Programming is practical.
>
> The subject of clc is not programming.

Well, yes, it is. Where Malcolm goes wrong is in believing that locking
yourself into the Wintel platform is part of that practicality.

Richard

### Malcolm

Feb 10, 2004, 4:08:17 PM

"Mike Wahler" <mkwa...@mkwahler.net> wrote in message

> > Since
> > pointers have to be a fixed size
>
> C & V please.
>
Uggle *ptr = 0;

Uggle **uptr = malloc(sizeof(Uggle *));

*uptr = ptr;

*uptr now must be set to NULL. How is this achieved if an Uggle * is of
variable width?

>
> Whose definition of 'typical'?
>

Natural language definition of "typical".

>
> >such as the one he will certainly be using,
>
> Doesn't matter which one. The answers will be platform-specific,
> not applicable to standard C.
>

But standard C is deeply dependent on the types of architectures that exist
in the real world. That's why it has pointers, rather than the "advance"
commands that would be expected of Turing machines.

>
> That's one of many possible ways to represent such an address
> space.

Use of 32 bit pointers to address a 4GB memory space is not just one of many
possible ways to represent such a space. It's the most obvious, natural way
to do so.

>
> > Talk about URL pointers is liable to confuse.
>
> It's intended to clarify (and imo it did) that a pointer is
> an *abstraction*, and as such, one need not (should not) be
> concerned about its physical implementation.
>

You need to understand the physical representation to understand how the
ANSI committee made their decisions. Or else why not say that a pointer is
held in a variable size memory entity?

>
> > Well you very often need to break the bounds of ANSI C and go
> > to a lower level.
>
> In which case the dicussion needs to depart from clc.
>

No, because clc is not cl.ansic. The newsgroup precedes the ANSI standard,
which is proving itself to be an ephemeral chapter in the history of the
language. The C99 standard seems to have failed.

> > An example would be if you have a custom memory scheme. How do you
> > know if a pointer comes from your arena or from elsewhere?
>
> discussed. Not here.
>

It's a perfectly on-topic question. I have implemented a mymalloc() using a
static arena; when a pointer is passed to myfree(), how can I verify that it
is from the arena? The ANSI answer is that you can't, but that's not good
enough.
>
[ debuggers ]

> discussed. Not here.
>

You need to understand the sorts of ways pointers are represented in memory
before you can understand debuggers, or indeed the (ANSI) %p format
specifier to the printf() family of functions. Perfectly on topic, but
nothing to do with ANSI.

>
> > Programming is practical.
>
> The subject of clc is not programming.
>

It's C programming. Not ANSI C programming but portable C programming, i.e.
compiler-specific questions are off-topic, but not, for example, "how does a
typical implementation provide malloc()".

>
> > It doesn't make sense to hand someone a copy of
> > the standard and expect them to be able to write fully-conforming
> > ANSI C.
>
> That's why we have books, schools, intructors, etc.
>

And also comp.lang.c. Otherwise one could simply post the standard in answer
to every query.

>
> Not at the abstract level of ISO C. 'Way' back when, I got a decent
> understanding of how COBOL worked, before I ever laid eyes on
> any hardware. This was proven when I actually coded, compiled,
> and successfully ran programs when we did get access to a computer.
>

Well done but that's unusual, and an inefficient way of learning. Basically
you are using the tutor to dry run code, and he will do so several million
times slower than a processor.
Programming is a practical skill, which means that you need to understand
your implementation. Otherwise we could simply hand a copy of the standard
to every newbie and expect them to become proficient C programmers. It
doesn't work like that.

Basically engage brain before trying to obfuscate my explanations with
references to URL pointers and other such rubbish.

### Malcolm

Feb 10, 2004, 4:09:10 PM

"Richard Bos" <r...@hoekstra-uitgeverij.nl> wrote in message

> Well, yes, it is. Where Malcolm goes wrong is in believing that
> locking yourself into the Wintel platform is part of that practicality.
>
So you think that Wintel is the only platform that uses 32-bit pointers to
address a 4GB memory space?

### Mark McIntyre

Feb 10, 2004, 5:26:06 PM

On Tue, 10 Feb 2004 21:08:17 -0000, in comp.lang.c , "Malcolm"
<mal...@55bank.freeserve.co.uk> wrote:

>
>"Mike Wahler" <mkwa...@mkwahler.net> wrote in message
>> > Since
>> > pointers have to be a fixed size
>>
>>
>Uggle *ptr = 0;
>
>Uggle **uptr = malloc(sizeof(Uggle *));
>
>*uptr = ptr;
>
>*uptr now must be set to NULL. How is this achieved if an Uggle * is of
>variable width?

Mike meant that different types' pointers might be different widths. Thus
an Uggle** might be wider (or narrower) than an Uggle*, which might in turn
be wider (or narrower) than an int*.

### Mike Wahler

Feb 11, 2004, 1:12:41 AM

"Malcolm" <mal...@55bank.freeserve.co.uk> wrote in message
news:c0bh9j\$fb8\$1...@newsg1.svr.pol.co.uk...

>
> "Mike Wahler" <mkwa...@mkwahler.net> wrote in message
> > > Since
> > > pointers have to be a fixed size
> >
> > C & V please.

In case you didn't know, that acronym means "Chapter & Verse" of
the standard.

> >
> Uggle *ptr = 0;
>
> Uggle **uptr = malloc(sizeof(Uggle *));
>
> *uptr = ptr;
>
> *uptr now must be set to NULL. How is this achieved if an Uggle * is of
> variable width?

Doesn't matter "how". It must simply 'work correctly'. That's
all the standard requires.

> >
> > Whose definition of 'typical'?

Malcolm: on a typical system

Mike: Whose definition of 'typical'?

> >
> Natural language definition of "typical".

OK I suppose I have to spell it out. Whose definition of
'typical *system*'. In some contexts a 'typical system'
is a PC. In others, it's a cell phone. In the widest
(computer system) context, if 'typical' is the most
widely used, it's certainly not a PC, but more likely
some embedded system I've probably never heard of.

> > >such as the one he will certainly be using,
> >
> > Doesn't matter which one. The answers will be platform-specific,
> > not applicable to standard C.
> >
> But standard C is deeply dependent on the types of architectures that
> exist in the real world.

Not at all. The standard makes requirements that an implementation
must meet. If a platform cannot provide support sufficient for
such an implementation (either directly or via e.g. software emulation,
etc.)
(perhaps it only has 6 bit bytes) then it's simply not possible to create a
conforming C implemenation for it. Period. So you have the 'dependency'
issue exactly backwards.

>That's why it has pointers,

I'd have to ask Mr. Ritchie for the 'real' answer, but imo
it has pointers because they allow one to do the useful things
they can do. They implement an abstraction: indirection.

> commands that would be expected of Turing machines.
> >
> > That's one of many possible ways to represent such an address
> > space.
> Use of 32 bit pointers to address a 4GB memory space is not just one of
> many possible ways to represent such a space. It's the most obvious,
> natural way to do so.
> >
> > > Talk about URL pointers is liable to confuse.
> >
> > It's intended to clarify (and imo it did) that a pointer is
> > an *abstraction*, and as such, one need not (should not) be
> > concerned about its physical implementation.
> >
> You need to understand the physical representation to understand how the
> ANSI committee made their decisions.

I need to understand neither physical representation, nor know (or care)
why the committee decided what they did, in order to successfully write
standard C. All I need is a conforming implementation, and access to
the rules (the standard). Of course textbooks written in a more 'prose'
like form are a huge help.

>Or else why not say that a pointer is
> held in a variable size memory entity?

Because either one would be acceptable with regard to the standard.
It's called flexibility, which I suspect the committee allowed for
when possible. For example why do you suppose there's no hard
definition for the exact representation of '\n'?

> >
> > > Well you very often need to break the bounds of ANSI C and go
> > > to a lower level.
> >
> > In which case the dicussion needs to depart from clc.
> >
> NO, because clc is not cl.ansic.

For the zillionth time that I've stated this here, the name of a newsgroup
does *not* define its exact nature. It's only a general guideline.

The nature and guidelines of clc are stated in the 'welcome message',
which has by consensus of the regulars become the defining document.

>The newsgroup precedes the ANSI standard,

Irrelevant.

> which is proving itself to be an ephemeral chapter in the history of the
> language. The C99 standard seems to have failed.

Your opinion. And you seem to have imposed some arbitrary
time limit for C99 to 'succeed'.

> > > An example would be if you have a custom memory scheme. How do you
> > > know if a pointer comes from your arena or from elsewhere?
> >
> > discussed. Not here.
> >
> It's a perfectly on-topic question. I have implemented a mymalloc() using
> a static arena; when a pointer is passed to myfree(), how can I verify
> that it is from the arena? The ANSI answer is that you can't, but that's
> not good enough.

Tough.

> >
> [ debuggers ]
> > discussed. Not here.
> >
> You need to understand the sorts of ways pointers are represented in
> memory before you can understand debuggers,

Debuggers are not topical here.

>or indeed the (ANSI) %p format

All one need know is that it will print the value of a type 'void*'
object. The exact display format used is left up to the implementation.

> specifier to the printf() family of functions. Perfectly on topic, but
> nothing to do with ANSI.

%p (the ISO specification of it) is indeed topical. Its implementation
is not.

> >
> > > Programming is practical.
> >
> > The subject of clc is not programming.
> >
> It's C programming.

It's the C programming *language* and how to *use* it.

>Not ANSI C programming, portable C programming, i.e.
> compiler-specific questions are off-topic, but not, for example, "how
> does a typical implementation provide malloc()".

That's an implementation specific issue. The language only
specifies 'malloc()'s *behavior*.

> >
> > > It doesn't make sense to hand someone a copy of
> > > the standard and expect them to be able to write fully-conforming
> > > ANSI C.
> >
> > That's why we have books, schools, instructors, etc.
> >
> And also comp.lang.c. Otherwise one could simply post the standard in
> response to every query.

So here you are at comp.lang.c where so many experts graciously share
their knowledge and skill, gratis. So instead of desperately trying
to prove yourself "right", why not *listen* and learn? I did.
When I first came to clc, I considered myself, if not 'expert',
at least very knowledgeable about C. A couple days here proved
me wrong. I did not allow my ego to obscure or deny this fact.

> > Not at the abstract level of ISO C. 'Way' back when, I got a decent
> > understanding of how COBOL worked, before I ever laid eyes on
> > any hardware. This was proven when I actually coded, compiled,
> > and successfully ran programs when we did get access to a computer.
> >
> Well done but that's unusual,

I suppose one might call it "unusual". I found my instructor's
methods to be brilliant.

>and an inefficient way of learning.

I suppose that depends upon what you mean by "efficient". Fast?
Fast just means fast, not necessarily "good".

I found it a very *effective* way to learn.

> Basically
> you are using the tutor to dry run code,

Actually the students all used one another to represent
system components, one of which was the CPU, who was
given a sequence of predefined instructions. Others
represented data objects, peripheral devices, etc.
We 'executed' a 'program' according to a strict
formal set of rules (analogous to a standard
language specification). But these rules did *not*
mandate implementation methods. E.g. the person
representing an 'accumulator' was only required
to 'reset', 'accumulate', and report a value.
It was not mandated *how* to do so. He was free
to rely on his memory, or he could write things
down, or use a handheld calculator, etc.

>and he will do so several million
> times slower than a processor.

Speed was not the objective. Learning was.
And after the students all having participated
in the 'execution' of a 'program' we all had a
much better appreciation for the true power
of a computer, and the discipline required to
effectively program one.

> Programming is a practical skill,

Yes, and a programming language is only a small part of it.
This newsgroup provides only a small part of the knowledge
necessary. Other learning resources exist for the other
issues.

> which means that you need to understand

Not to use C you don't.

>Otherwise we could simply hand a copy of the standard
> to every newbie and expect them to become proficient C programmers. It
> doesn't work like that.

As I already said, that's why we have schools, books, instructors, etc.

>
> Basically engage brain before trying to obfuscate my explanations

I have in no way tried to obfuscate anything you've 'explained'.

>with
> references to URL pointers and other such rubbish.

I made no reference to a URL pointer.

-Mike

### Malcolm

Feb 11, 2004, 2:06:32 PM2/11/04
to

"Mike Wahler" <mkwa...@mkwahler.net> wrote in message
>
> > > Whose definition of 'typical'?
>
> Please don't omit context. Restored:
>
> Malcolm: on a typical system
>
> Mike: Whose definition of 'typical'?
>
Well, every system I know uses fixed-size pointers. There is one main
exception to the rule that the size of the pointer represents the size of
the address space, and it's quite an important one: old x86 compilers with
their segmented architecture.
I think we can call the x86 "non-typical" because the natural thing to do is
to have one pointer value equalling one address, and because virtually every
other system works that way.

>
> > > But standard C is deeply dependent on the types of architectures
> > > that exist in the real world.
>
> Not at all. The standard makes requirements that an implementation
> must meet. If a platform cannot provide support sufficient for
> such an implementation (either directly or via e.g. software emulation,
> etc.)
> (perhaps it only has 6 bit bytes) then it's simply not possible to create
> a conforming C implementation for it. Period. So you have the
> 'dependency' issue exactly backwards.
>
C is not an abstract language for specifying the behaviour of Turing
machines, but one that is deeply dependent on the types of architectures
that exist. You can, incidentally, provide a conforming C implementation
for any Turing-compatible machine, even if it uses 6-bit bytes internally,
as long as you are prepared to accept gross inefficiency.
It is precisely because 6-bit byte general-purpose processors are rare that
C doesn't easily support them.

>
> I need to understand neither physical representation, nor know (or
> care) why the committee decided what they did, in order to
> successfully write standard C. All I need is a conforming
> implementation, and access to the rules (the standard). Of course
> textbooks written in a more 'prose' like form are a huge help.
>
This is nonsense. People are not machines. You can't learn French from a
dictionary and grammar, nor is it possible to learn C from the standard.
And over-literal explanations, such as "pointers can be URLs", obfuscate
rather than illuminate.

>
>The newsgroup precedes the ANSI standard,
>
> Irrelevant.
>
No, highly relevant. And ANSI has shot itself in the foot by proposing a
standard that has not been widely adopted, which means that now C will
probably spread into several dialects. The newsgroup precedes ANSI, and will
survive when ANSI is just a memory.

>
> Your opinion. And you seem to have imposed some arbitrary
> time limit for C99 to 'succeed'.
>
It's only five years, and obviously I cannot foretell the future, but it
seems likely that C99 will never be widely implemented. I think that what
will happen is that people will increasingly run C code through a C++
compiler to use useful C99 features such as single-line comments and inline
functions.

>
> > The ANSI answer is that you can't, but that's not good
> > enough.
>
> Tough.
>
Tough for you, but you're being unnecessarily restrictive. How about
explaining how this can be done in C on some platforms, but not portably?

> > Debuggers are not topical here.
>
The details of a specific debugger are not topical; debuggers generally
(for wasters) are topical.

>
> %p (the ISO specification of it) is indeed topical. Its implementation
> is not.
>
Implementation of standard library functions is topical.

>
> So here you are at comp.lang.c where so many experts graciously
> share their knowledge and skill, gratis. So instead of desperately
> trying to prove yourself "right", why not *listen* and learn? I did.
> When I first came to clc, I considered myself, if not 'expert',
> at least very knowledgable about C. A couple days here proved
> me wrong. I did not allow my ego to obscure or deny this fact.
>
If you already know another language, it doesn't take more than a couple
of days to learn all the C you need to know, unless you want to write a
compiler. That is one of the great strengths of C.
To know the answer to exotica takes a bit longer, but you don't actually
need to know this to write successful C. How about learning from someone who
knows a great deal about programming, without claiming to be at the leading
edge?

>
> Actually the students all used one another to represent
> system components, one of which was the CPU, who was
> given a sequence of predefined instructions. Others
> represented data objects, peripheral devices, etc.
> We 'executed' a 'program' according to a strict
> formal set of rules (analogous to a standard
> language specification). But these rules did *not*
> mandate implementation methods. E.g. the person
> representing an 'accumulator' was only required
> to 'reset', 'accumulate', and report a value.
> It was not mandated *how* to do so. He was free
> to rely on his memory, or he could write things
> down, or use a handheld calculator, etc.
>
If you don't have a computer then you can use these sorts of devices to
teach programming. It sounds highly creative and I wouldn't want to knock
your tutor. However, if you just hand someone a computer and let them play
with it, they can very quickly pick up programming if they have a natural
aptitude for it.

>
> > Programming is a practical skill,
>
> Yes, and a programming language is only a small part of it.
> This newsgroup provides only a small part of the knowledge
> necessary. Other learning resources exist for the other
> issues.
>
Yes sure, knowing C is only a small part of knowing "how to program", which
is a bit like "knowing how to cook", there are a few basics everyone has to
learn, but you can be perfectly competent at meat and 2 veg without being a
cordon bleu chef.

>
> > which means that you need to understand
>
> Not to use C you don't.
>
Yes you do, because when you make mistakes, funny things happen. Formally
we could just post a copy of the standard in response to every query; in
practice, humans aren't built like that.

>
> I made no reference to a URL pointer.
>
No, you've defended someone who corrected my statement that typically a
pointer has enough bits to address the memory space of the computer by
pointing out that the implementation could use a URL pointer. Formally he's
right of course, in the same way that it could use decimal ten-state memory

In fact a non-perverse use of pointers would be to store the bounds of the
data item pointed to in every pointer. Then an attempt to address memory
illegally could be caught. To my knowledge not a single implementation
actually uses safe pointers. The reason of course is that C programmers
expect pointer dereferences to compile to single machine instructions -
something again not mentioned in the standard but highly relevant to anyone
who programs in C.

### Michael Wojcik

Feb 12, 2004, 9:15:48 AM2/12/04
to

In article <c0duh9$u4v$1...@news5.svr.pol.co.uk>, "Malcolm" <mal...@55bank.freeserve.co.uk> writes:

> In fact a non-perverse use of pointers would be to store the bounds of the
> data item pointed to in every pointer. Then an attempt to address memory
> illegally could be caught. To my knowledge not a single implementation
> actually uses safe pointers.

Your knowledge is incomplete. At least three C implementations for the
AS/400 - EPM C, System C, and ILE C - use 16-byte / 128-bit pointers
(CHAR_BIT is 8) which are not simple addresses but descriptors, and
which include a reference to a memory space, an offset in that memory
space, and a validity flag which can only be set by a privileged-mode
instruction. Mucking about with a pointer's internals resets the
flag, rendering the pointer invalid.

All three implementations will immediately trap on invalid pointer
access.

I believe ILE C (the current one) is a fully conforming C94 hosted
implementation, and System C was a fully conforming C90 hosted
implementation. I suspect EPM C wasn't a conforming hosted
implementation, though it probably came fairly close, and may have
been a conforming freestanding implementation.

> The reason of course is that C programmers
> expect pointer dereferences to compile to single machine instructions -
> something again not mentioned in the standard but highly relevant to anyone
> who programs in C.

C programmers working on the AS/400 will find that expectation is
incorrect. In C on the AS/400, *nothing* compiles to machine
instructions, single or otherwise. It compiles to a pseudoassembly
language called "MI". And that's a good thing, for AS/400 software,
since it's one of the qualities that allowed IBM to completely change
the machine's architecture without breaking working programs. (That's
*binaries*, with no recompilation required, in many cases.)

On the AS/400, robustness trumps performance. That was the design
decision for the whole architecture, and C needed to fall in line.
One of the nice things about the C standard was that it could
accommodate that.

More C programmers should do some work on the AS/400. (For one thing,
it'd make them appreciate their other development environments all
the more, if they use IBM's awful Program Development Manager and
Source Entry Utility.) You can learn a lot about what a conforming
hosted implementation can do. And if you're using a real 5250
terminal, you can also learn those swell trigraph sequences (or the
EBCDIC code points for various C punctuation characters).

--
Michael Wojcik michael...@microfocus.com

Pseudoscientific Nonsense Quote o' the Day:
From the scientific standpoint, until these energies are directly
sensed by the evolving perceptions of the individual, via the right
brain, inner-conscious, intuitive faculties, scientists will never
grasp the true workings of the universe's ubiquitous computer system.
-- Noel Huntley

### Malcolm

Feb 12, 2004, 6:32:43 PM2/12/04
to

"Michael Wojcik" <mwo...@newsguy.com> wrote in message

>
> C programmers working on the AS/400 will find that expectation
>[that pointer dereferences compile to single machine instructions ] is

> incorrect. In C on the AS/400, *nothing* compiles to machine
> instructions, single or otherwise. It compiles to a pseudoassembly
> language called "MI".
>
This really is the exception that proves the point. A platform that
disallows native machine language programs cannot really be said to have a
compiler. Nor is C the ideal language for such an environment - you need
something which does memory management for you.

### Chris Torek

Feb 13, 2004, 12:56:00 AM2/13/04
to
>"Michael Wojcik" <mwo...@newsguy.com> wrote in message
>>
>> C programmers working on the AS/400 will find that expectation
>>[that pointer dereferences compile to single machine instructions ] is
>> incorrect. In C on the AS/400, *nothing* compiles to machine
>> instructions, single or otherwise. It compiles to a pseudoassembly
>> language called "MI".

In article <news:c0h2g8$9sj$1...@newsg4.svr.pol.co.uk>

Malcolm <mal...@55bank.freeserve.co.uk> writes:
>This really is the exception that proves the point. A platform that
>disallows native machine language programs cannot really be said to have a
>compiler. Nor is C the ideal language for such an environment - you need
>something which does memory management for you.

But if you believe that C on this machine is not "compiled", then
you must believe that *nothing* on the AS/400 is *ever* compiled --
not COBOL, not RPG, not Modula-2. Yet IBM will sell you "compilers"
for all of these, as well as for C and C++. There are even AS/400
assemblers that read "MI" source and produce "machine code":
<http://www-1.ibm.com/servers/eserver/iseries/whpapr/translator.html>.

Would you also claim that any machine on which the machine's "opcodes"
are interpreted by microcode has no compilers? If not, why do you
distinguish between OMI opcodes and microcoded-machine opcodes?
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
Reading email is like searching for food in the garbage, thanks to spammers.

### Keith Thompson

Feb 13, 2004, 3:42:29 AM2/13/04
to

Exceptions don't prove points, at least not in the sense you mean.

There are plenty of compilers that generate something other than
machine code. I'm not familiar with the AS/400, but I haven't seen
anything to suggest that C is a poor language for it.

### pete

Feb 13, 2004, 7:18:42 AM2/13/04
to
Keith Thompson wrote:
>
> "Malcolm" <mal...@55bank.freeserve.co.uk> writes:

> > This really is the exception that proves the point.

If you ever see me using sophistry like that, here,
it will be the first of April.

> Exceptions don't prove points, at least not in the sense you mean.

--
pete

### Dan Pop

Feb 13, 2004, 9:34:40 AM2/13/04
to
In <ln65ebp...@nuthaus.mib.org> Keith Thompson <ks...@mib.org> writes:

>machine code. I'm not familiar with the AS/400, but I haven't seen
>anything to suggest that C is a poor language for it.

It depends on how you define the notion of poor language.

It is a fact that C is not the language of choice for the primary
application domain of this machine (small business server) and that very
little (if any) of the open source C code available on the Internet
has been ported to that platform (or written with portability to this
platform in mind).

It is possible to program in C on this machine, but apparently few of
those who did it actually enjoyed the experience. And this has precious
little to do with the unusual pointer size/representation.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Dan...@ifh.de

### Malcolm

Feb 13, 2004, 4:00:36 PM2/13/04
to

"Chris Torek" <nos...@torek.net> wrote in message

>
> Would you also claim that any machine on which the machine's
> "opcodes" are interpreted by microcode has no compilers? If not,
> why do you distinguish between OMI opcodes and microcoded-
> machine opcodes?
>
Let's say someone produces a tool that converts C code to compliant C++
code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds
explicit casts of void * etc. Would you describe such a program as a C
compiler? If not, why not?

Microcode creates a grey area. I would say that the difference is between an
intermediate bytecode that is designed for arbitrary hardware, and a program
which is hardware-specific, although it relies on some microcode to support
that hardware. Of course a really good optimising C compiler should build

Ultimately it's just a question of definition - how far can we extend the
term "compiler" until we're talking about two totally different things? The
AS/400 almost certainly contains a substantial amount of C code which is
compiled to native machine code and runs the OS. How are we to distinguish
this compiler from the "compiler" shipped to customers?

### Malcolm

Feb 13, 2004, 4:08:54 PM2/13/04
to

"pete" <pfi...@mindspring.com> wrote in message

>
> > > This really is the exception that proves the point.
>
> If you ever see me using sophistry like that, here,
> it will be the first of April.
>
> > Exceptions don't prove points, as least not in the sense you mean.
>
"The exception proves the rule" is a famous proverb. "Prove" means "test",
not "demonstrate the point".

Now I claimed that not a single compiler, to my knowledge, implemented safe
pointers. An exception was raised. However on examination we see that the
"compiler" isn't really a compiler at all, if we define "compiler" as
"something that translates source code to machine code". So the exception
actually demonstrates that the point is valid.

### Michael Wojcik

Feb 13, 2004, 3:49:18 PM2/13/04
to

In article <c0h2g8$9sj$1...@newsg4.svr.pol.co.uk>, "Malcolm" <mal...@55bank.freeserve.co.uk> writes:
>
> "Michael Wojcik" <mwo...@newsguy.com> wrote in message
> >
> > C programmers working on the AS/400 will find that expectation
> >[that pointer dereferences compile to single machine instructions ] is
> > incorrect. In C on the AS/400, *nothing* compiles to machine
> > instructions, single or otherwise. It compiles to a pseudoassembly
> > language called "MI".

> This really is the exception that proves the point.

That's not what that idiom means. "The exception proves the rule"
is a partial vernacular translation of a Latin legal principle which
means that when an exception is explicit in the law ("No parking
between 9AM and 5PM"), it implies a general rule where the exception
does not apply ("You may park between 5PM and 9AM").

In what logical system does the existence of an exception prove that
the general thesis is true? In fact, what we have here is an
exception which disproves the thesis. See [1].

> A platform that
> disallows native machine language programs cannot really be said to have a
> compiler.

Oh yes it can. Observe: There are compiled languages on the AS/400.
Perhaps you need to review what a "compiler" is. Hint: it's not a
system for translating some source language into "native machine
language". That's why Java, for example, is still a compiled
language.

A compiler *compiles*. It collects multiple source statements and
processes them as a whole into some form more amenable for execution.
Contrast that with an interpreter, which is incremental - it processes
and executes one "statement" (however defined by the language) at a
time.

In any case, the C standard says nothing about compilation. There is
an implementation, which acts upon translation units. A program is
composed of one or more translation units, which undergo the various
translation stages specified by the standard.

> Nor is C the ideal language for such an environment - you need
> something which does memory management for you.

Really. Care to expand upon this rather bizarre thesis? In what
way do the characteristics of the AS/400 1) make C any less "ideal"
there than on any other platform, or 2) require automatic memory
management?

--
Michael Wojcik michael...@microfocus.com

Is it any wonder the world's gone insane, with information come to be
the only real medium of exchange? -- Thomas Pynchon

### Michael Wojcik

Feb 13, 2004, 3:55:37 PM2/13/04
to

In article <c0hot...@enews2.newsguy.com>, Chris Torek <nos...@torek.net> writes:
>
> But if you believe that C on this machine is not "compiled", then
> you must believe that *nothing* on the AS/400 is *ever* compiled --
> not COBOL, not RPG, not Modula-2. Yet IBM will sell you "compilers"
> for all of these, as well as for C and C++.

Indeed, though I suppose we shouldn't in general allow IBM to define
"compiler" for us. Still, I think the consensus among AS/400
programmers is that we are, indeed, compiling our programs, and I
defy Malcolm to prove otherwise.

> There are even AS/400
> assemblers that read "MI" source and produces "machine code":
> <http://www-1.ibm.com/servers/eserver/iseries/whpapr/translator.html>.

In fact, there used to be (and probably still is) a C API supplied by
IBM for this purpose; IIRC, it was just a function that took a
FILE* referring to a file open for writing and a string containing
MI source, assembled the latter, and wrote it into the former. Which
made the AS/400 the easiest machine I knew of to write an assembler
for...

(MI is a nicely CISCy pseudo-assembly, with opcodes like "translate
byte using table". Not as CISCy as VAX assembly, as I recall, but
pretty rich.)

--
Michael Wojcik michael...@microfocus.com

This record comes with a coupon that wins you a trip around the world.
-- Pizzicato Five

### Chris Torek

Feb 13, 2004, 4:57:45 PM2/13/04
to
In article <c0jduv$bd9$1...@newsg3.svr.pol.co.uk>

Malcolm <mal...@55bank.freeserve.co.uk> writes:
>Let's say someone produces a tool that converts C code to compliant C++
>code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds
>explicit casts of void * etc. Would you describe such a program as a C
>compiler? If not, why not?

Generally, I *would* call it a compiler (provided it produced an
executable image in the process, perhaps by later invoking the
"assembler" that translates the C++ to machine code). But if this
particular translator depended on the C++-to-machine-code step to
find certain fundamental errors, that is a -- perhaps even the only
-- condition under which I would not call it a compiler.

I am not sure I can define it very well, so consider the following
as an example, before I go on to an attempt at a definition:

% cat bug.c
int main(void] { return *42; }
% ctocxx -C bug.c

(Here, please assume the -C option means "leave the C++ `assembly'
visible for inspection", and that no diagnostics occur.)

% cat bug.c++
int main(] { return *42; }
%

This fails the "compiler" criterion by missing the obvious syntax
error ("]" should be "}") and semantic error (unary "*" cannot be
applied to an integer constant). (And of course, if main() were
to call itself recursively in the C version, the C++ code would
have to use some other function, or depend on that particular C++
implementation to allow recursive calls to main() -- either would
be acceptable, provided the "C compiler" comes *with* the C++
compiler portion. If the C compiler is meant to work with *any*
C++ compiler, depending on implementation-defined characteristics
would be at best a bug.)

The difference is basically one of responsibility: to be called a
"compiler", the program must make a complete syntactic and semantic
analysis of the source code, determine its "intended meaning" (or
one of several meanings, in cases where the source language has
various freedoms), and generate as its output code that is intended
to pass cleanly through any (required and/or supplied) intermediate
stages before it produces the final "executable". If something
fails to "assemble" without the "compiler" stage first pointing out
an error, this indicates a bug in the compiler.

A preprocessor, macro-processor, or textual-substitution system, on
the other hand, does not need to make complete analyses -- if the
input is erroneous, its output can be arbitrarily malformed without
this necessarily being a bug. Diagnostics from later passes are
acceptable and expected.

Of course, escape hatches (as commonly found in C compilers with
__asm__ keywords and the like) can muddy things up a bit. If you
use __asm__ to insert invalid assembly code, while the compiler
assumes that you know what you are doing, this is probably "your
fault". Likewise, a C-via-C++-to-executable compiler might provide
an escape hatch to "raw C++", and if you muck that up, it would be
your fault, rather than a compiler bug or disqualifier.

(Note that a clever implementor might even use the C++ stage to
find [some of the] required-diagnostic bugs in incorrect C code.
I consider this "OK" and "not a disqualifier" *if* the C compiler
actually reads and digests the C++ stage's diagnostics, and re-forms
them back to refer to the original C code, so that the process is
invisible to the C programmer.)

### Mark McIntyre

Feb 13, 2004, 5:48:29 PM2/13/04
to
On 13 Feb 2004 20:49:18 GMT, in comp.lang.c , mwo...@newsguy.com (Michael
Wojcik) wrote:

>
>In article <c0h2g8$9sj$1...@newsg4.svr.pol.co.uk>, "Malcolm" <mal...@55bank.freeserve.co.uk> writes:
>>
>> "Michael Wojcik" <mwo...@newsguy.com> wrote in message
>> >
>> > C programmers working on the AS/400 will find that expectation
>> >[that pointer dereferences compile to single machine instructions ] is
>> > incorrect. In C on the AS/400, *nothing* compiles to machine
>> > instructions, single or otherwise. It compiles to a pseudoassembly
>> > language called "MI".
>
>> This really is the exception that proves the point.
>
>That's not what that idiom means. "The exception proves the rule"
>is a partial vernacular translation of a Latin legal principle

Possibly. It's probably more likely that the saying uses the alternate
meaning of "prove", which is "test". As in "the proof of the pudding is in
the eating".


### Richard Heathfield

Feb 14, 2004, 12:38:41 AM2/14/04
to
Chris Torek wrote:

> In article <c0jduv$bd9$1...@newsg3.svr.pol.co.uk>
> Malcolm <mal...@55bank.freeserve.co.uk> writes:
>>Let's say someone produces a tool that converts C code to compliant C++
>>code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds
>>explicit casts of void * etc. Would you describe such a program as a C
>>compiler? If not, why not?
>
> Generally, I *would* call it a compiler (provided it produced an
> executable image in the process, perhaps by later invoking the
> "assembler" that translates the C++ to machine code).

Well, it's obviously your prerogative to use words as you choose, but your
proviso here flies in the face of Aho, Sethi and Ullman's definition: "a
compiler is a program that reads a program written in one language - the
source language - and translates it to an equivalent program in another
language - the target language" - no mention there of executable images.
Source: Dragon Book (Chapter 1, page 1!)

<snip>

--
Richard Heathfield : bin...@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

### Keith Thompson

Feb 14, 2004, 4:34:21 AM2/14/04
to
"Malcolm" <mal...@55bank.freeserve.co.uk> writes:
[...]

> Ultimately it's just a question of definition - how far can we extend the
> term "compiler" until we're talking about two totally different things? The
> AS/400 almost certainly contains a substantial amount of C code which is
> compiled to native machine code and runs the OS. How are we to distinguish
> this compiler from the "compiler" shipped to customers?

You're almost certain that the AS/400 OS is written in C? You may be
right, but my guess is that it's written in some other language(s).

### Malcolm

Feb 14, 2004, 4:43:59 AM2/14/04
to

"Michael Wojcik" <mwo...@newsguy.com> wrote in message
>
> > This really is the exception that proves the point.
>
> That's not what that idiom means. "The exception proves the rule"
> is a partial vernacular translation of a Latin legal principle which
> means that when an exception is explicit in the law ("No parking
> between 9AM and 5PM"), it implies a general rule where the
> exception does not apply ("You may park between 5PM and 9AM").
>
Etymology isn't meaning. The proverb is not used in that way. (By the way,
the etymology itself is dodgy.)
http://www.icsi.berkeley.edu/~nchang/personal/exception.html

>
> In what logical system does the existence of an exception prove that
> the general thesis is true? In fact, what we have here is an
> exception which disproves the thesis. See [1].
>
It's not something in formal logic, but a rule of thumb. To see if a rule
applies, look at cases that appear to be exceptions. For instance, if I say
"All mammals are viviparous" then looking at chickens, or bats, which are
not exceptions to the rule, isn't helpful. However if we look at the
duck-billed platypus and echidna, which lay eggs, we find that they are
formally mammals, but they split off from the rest of the mammals a long
time ago. The rule is still useful - we won't find an oviparous antelope.
Let's take another example: "No mammals are eusocial." Well, the naked mole
rat is eusocial, and wolves are a borderline case. There is nothing much
else in common between these animals, and they are not otherwise special. We
conclude that the rule isn't too useful - there's nothing special about
being a mammal that precludes eusociality.

End of logic 101, back to C.

>
> > A platform that disallows native machine langauge programs cannot
> > really be said to have a compiler.
>
> Oh yes it can. Observe: There are compiled languages on the
> AS/400.
>

It depends how you want to use the word. If anyone with a little bit of
computer knowledge asked "What's a compiler?" I would say "Something that
translates a high-level language to machine code."

>
> In any case, the C standard says nothing about compilation.
>

You can build a C interpreter. What's your point, that mentioning "the
compiler" makes a post off-topic?

>
> > Nor is C the ideal language for such an environment - you need
> > something which does memory management for you.
>
> Really. Care to expand upon this rather bizarre thesis? In what
> way do the characteristics of the AS/400 1) make C any less "ideal"
> there than on any other platform, or 2) require automatic memory
> management?
>

Because C sacrifices safety in memory access for efficiency. Since the
platform won't allow this, the safety has to be put in at an inappropriate
level. So I would guess that when writing a function to iterate over a
string, the pointer is checked for out-of-bounds at every increment.
Certainly passing a pointer, if it contains validity information, will be
very slow.
If you do memory management at a higher level then you can have similar
safety, but raw pointers can be used internally (where the user code can't
mess with them).

There ceases to be a point in using C on the AS/400, except that C is a very
popular language, and there is always a point in supporting a standard. A
bit like driving a sports car over a traffic-calmed road - it can't go very
fast and a hatchback would make more sense, but if you own a sports car
already then you might want to do it.

### Joona I Palaste

Feb 14, 2004, 6:57:06 AM2/14/04
to
Michael Wojcik <mwo...@newsguy.com> scribbled the following:

> A compiler *compiles*. It collects multiple source statements and
> processes them as a whole into some form more amenable for execution.
> Contrast that with an interpreter, which is incremental - it processes
> and executes one "statement" (however defined by the language) at a
> time.

Could the distinction between a compiler and an interpreter be that when
they encounter program code, compilers translate it into another
language, while interpreters execute it? In other words, more or less,
compilers store away code for later execution while interpreters execute
it when they see it?

--
/-- Joona Palaste (pal...@cc.helsinki.fi) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"All that flower power is no match for my glower power!"
- Montgomery Burns

### nrk

Feb 14, 2004, 9:38:13 PM2/14/04
to
Richard Heathfield wrote:

> Chris Torek wrote:
>
>> In article <c0jduv$bd9$1...@newsg3.svr.pol.co.uk>
>> Malcolm <mal...@55bank.freeserve.co.uk> writes:
>>>Let's say someone produces a tool that converts C code to compliant C++
>>>code - e.g. alters C++ keywords used as identifiers, adds prototypes,
>>>adds explicit casts of void * etc. Would you describe such a program as a
>>>C compiler? If not, why not?
>>
>> Generally, I *would* call it a compiler (provided it produced an
>> executable image in the process, perhaps by later invoking the
>> "assembler" that translates the C++ to machine code).
>
> Well, it's obviously your prerogative to use words as you choose, but your
> proviso here flies in the face of Aho, Sethi and Ullman's definition: "a
> compiler is a program that reads a program written in one language - the
> source language - and translates it to an equivalent program in another
> language - the target language" - no mention there of executable images.
> Source: Dragon Book (Chapter 1, page 1!)
>
> <snip>
>

From "Advanced Compiler Design and Implementation" by Steven S. Muchnik:

<quote>
Strictly speaking, compilers are software systems that translate programs
written in higher-level languages into equivalent programs in object code
or machine language for execution on a computer.
...
The definition can be widened to include systems that translate from one
higher-level language to an intermediate-level form, etc.
</quote>

One might argue that an author/book cannot serve as an authoritative
definition of a term, but considering the widespread use and popularity of
the book, I would tend to take this to be an appropriate definition.

-nrk.

--
Remove devnull for email

### Les Cargill

Feb 15, 2004, 10:35:48 AM2/15/04
to

Muchnik's book's version is problematic. The Aho, Sethi and Ullman
version is a much more disciplined definition.

Object code itself is a "language".

There may be, and usually are, several stages to producing executables
from source code - compilation, assembly (which is just a specialization
of compilation), linking, and (possibly) locating. Whether some of these
stages are hidden behind one command (or a button on an IDE) is a matter
of packaging, not of much else.

--
Les Cargill

### Keith Thompson

Feb 15, 2004, 11:50:00 AM2/15/04
to
nrk <ram_n...@devnull.verizon.net> writes:
[...]

> From "Advanced Compiler Design and Implementation" by Steven S. Muchnik:
>
> <quote>
> Strictly speaking, compilers are software systems that translate programs
> written in higher-level languages into equivalent programs in object code
> or machine language for execution on a computer.
> ...
> The definition can be widened to include systems that translate from one
> higher-level language to an intermediate-level form, etc.
> </quote>
>
> One might argue that an author/book cannot serve as an authoritative
> definition of a term, but considering the widespread use and popularity of
> the book, I would tend to take this to be an appropriate definition.

Doesn't IEEE have an official dictionary of computer terms? Can
someone who has a copy look up "compiler"?

For what it's worth, the first compiler I used (UCSD Pascal) generated
a pseudo-code (P-code) which was then interpreted; nobody ever called
it a translator rather than a compiler. (Later, one company started
making chips that executed P-code in hardware, or at least in
microcode.)

### Chris Torek

Feb 15, 2004, 1:55:20 PM2/15/04
to
(This is getting pretty far off topic and perhaps should move to
comp.programming or even the moderated group, comp.compilers...)

>> In article <c0jduv\$bd9\$1...@newsg3.svr.pol.co.uk>
>> Malcolm <mal...@55bank.freeserve.co.uk> writes:
>>>Let's say someone produces a tool that converts C code to compliant C++
>>>code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds
>>>explicit casts of void * etc. Would you describe such a program as a C
>>>compiler? If not, why not?

>Chris Torek wrote:
>> Generally, I *would* call it a compiler (provided it produced an
>> executable image in the process, perhaps by later invoking the
>> "assembler" that translates the C++ to machine code).

In article <news:c0kc90$r6m$1...@sparta.btinternet.com>

Richard Heathfield <bin...@eton.powernet.co.uk> writes:
>Well, it's obviously your prerogative to use words as you choose, but your
>proviso here flies in the face of Aho, Sethi and Ullman's definition: "a
>compiler is a program that reads a program written in one language - the
>source language - and translates it to an equivalent program in another
>language - the target language" - no mention there of executable images.
>Source: Dragon Book (Chapter 1, page 1!)

As nrk points out in a followup, there is at least some disagreement
over precisely what a "compiler" is. I am happy to work with the
Dragon Book definition (which, to be honest, I had actually forgotten
-- it has, after all, been 20 years! :-) ) as well. In this case,
a translator much like the original "cfront" is also a compiler
even if it never produces an executable.

Given that we have the word "translator", however, I personally
would tend to use that word for a system in which the "produce
something useful" step requires outside assistance, such as a
C++-to-C step that not only does not come with a C compiler, but
is provided for a computer for which no C compiler is even available.
It is probably also worth pointing out that there are a number% of
compilers that have produced C as their "assembly code", but in
all cases of which I am aware, that C code was not portable at all
-- you had to tell the XYZ-to-C step "this implementation has 32-bit
int, is big-endian, widens all floats to double, uses a stack for
variable arguments", and all sorts of other things tied to the
specific implementation. Thus, calling the C output an "equivalent
program" is perhaps stretching the truth: it is only "equivalent"
for one specific machine or group of machines, not for all systems
on which C runs.

Finally, let me note that this sort of thing is why it sometimes
pays to step back and define one's terms. People can argue forever
fruitlessly over nitpicky details, never reaching any agreement,
simply because they started with different definitions. This, in
fact, is why we have C standards: without a common definition of
what it is to "be" a C program, it may be impossible for two people
-- or even one person and a compiler -- to reach agreement as to
what the source-language program *means*.

[%footnote: when I say "a number" I do mean "more than one". While
cfront is perhaps the best-known example, I believe Xerox PARC had
C back-ends for some of their compilers, for the language that was
a follow-on to Mesa -- I have forgotten its name -- and for Modula-3,
for instance.]

### Michael Wojcik

Feb 16, 2004, 10:29:07 AM2/16/04
to

In article <c0jeeh$e41$1...@news6.svr.pol.co.uk>, "Malcolm" <mal...@55bank.freeserve.co.uk> writes:
>
> "The exception proves the rule" is a famous proverb. "Prove" means "tests",
> not "demonstrates the point".

Yes, "prove" does sometimes mean "test". However, the phrase "the
exception proves the rule" does not use "prove" in this sense. See the
link I provided in my first reply. You are a victim of a folk etymology.

> Now I claimed that not a single compiler, to my knowledge, implemented safe
> pointers. An exception was raised.

Well, that's what safe pointers are for, after all...

> However on examination we see that the
> "compiler" isn't really a compiler at all, if we define "compiler" as
> "something that translates source code to machine code".

And if we define "compiler" as "person who assembles documents out
of individual pages", gcc isn't a compiler either.

> So the exception actually demonstrates that the point is valid.

What it demonstrated was that your claim was wrong, and your definition
of "compiler" is overly restrictive and opportunistic, introduced solely
in an attempt to save a bogus argument.

--
Michael Wojcik michael...@microfocus.com

This book uses the modern technology to explain the phenomemon in the world
of Japanese animation. If you love anime so much, you'd better read this.
After you read it, you may agree that is destroying the dream of the child.
Needs Chinese viewing system. -- The Goodboy Scientific

### Michael Wojcik

Feb 16, 2004, 10:22:24 AM2/16/04
to

In article <c0jduv$bd9$1...@newsg3.svr.pol.co.uk>, "Malcolm" <mal...@55bank.freeserve.co.uk> writes:
>
> "Chris Torek" <nos...@torek.net> wrote in message
> >
> > Would you also claim that any machine on which the machine's
> > "opcodes" are interpreted by microcode has no compilers?

> Let's say someone produces a tool that converts C code to compliant C++
> code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds
> explicit casts of void * etc. Would you describe such a program as a C
> compiler? If not, why not?

I posted a perfectly workable definition of "compiler" which easily
deals with this sort of case. A compiler is a software tool which
processes translation units (files, in many implementations - though
not in OS/400, for one) in their entirety (that's the "compilation"
process), converting them into a form more amenable for execution.
The compiler's output need not be directly executable by the OS - and
indeed in many cases is not - it just needs to be further along the
process.

If, for some reason, you have an implementation which supplies only
a C++ compiler, then a C-to-C++ translator would be a C compiler for
that system. It would serve to move C source translation units
further along the path to executable form.

> Microcode creates a grey area.

Not if you have a sensible, consistent definition of "compiler".

> I would say that the difference is between an
> intermediate bytecode that is designed for arbitrary hardware, and a program
> which is hardware-specific, although it relies on some microcode to support
> that hardware.

The difference between what and what else?

> Ultimately it's just a question of definition

Yes, that's what determines what a word, such as "compiler", means.

> - how far can we extend the
> term "compiler" until we're talking about two totally different things?

I'm afraid I don't see what "two totally different things" are under
discussion here. I might guess you mean, on the one hand, one of the
OS/400 C implementations, and on the other some C implementation more
familiar to you, but those strike me as two things which are not very
different at all.

> The
> AS/400 almost certainly contains a substantial amount of C code which is
> compiled to native machine code and runs the OS.

How did you come by this information? The internals of OS/400 are for
the most part a closely guarded secret. Could it be that you are making
a wild, unsubstantiated guess?

(In point of fact, what I've heard informally from IBM insiders is that
the majority of the original OS/400 implementation was written in PL/I.
Rumor has it that it was substantially rewritten for the RISC implementation.
That's the LIC - Licensed Internal Code - which is the equivalent of the
kernel in OS/400 and is said to be the only part of the OS which is
native code. The bulk of the OS, including the shell and utilities, is
in theory compiled to MI. Again, this may have only been true in the
early implementations, with more pieces migrated to native code as the
OS evolved.)

> How are we to distinguish
> this compiler from the "compiler" shipped to customers?

Most obviously, in that it's not available to customers. And, of
course, it has a different target. What exactly is your point?

--
Michael Wojcik michael...@microfocus.com

We are subdued to what we work in. (E M Forster)