stack smashing

frank

Jan 16, 2010, 4:56:58 PM

I've heard of stack smashing but never done it myself until about 36
hours ago. Wiki had an example that I'm having problems following, but
it does do the trick:

http://en.wikipedia.org/wiki/Stack_buffer_overflow

dan@dan-desktop:~/source$ gcc -std=c99 -Wall -Wextra ss1.c -o out; ./out
a is 4
b is 32
My Float value = 10.500000
My Float value = 10.500000
*** stack smashing detected ***: ./out terminated
======= Backtrace: =========
/lib/tls/i686/cmov/libc.so.6(__fortify_fail+0x48)[0xb7fc0da8]
/lib/tls/i686/cmov/libc.so.6(__fortify_fail+0x0)[0xb7fc0d60]
./out[0x8048536]
[0x21212067]
======= Memory map: ========
08048000-08049000 r-xp 00000000 08:04 213111 /home/dan/source/out
08049000-0804a000 r--p 00000000 08:04 213111 /home/dan/source/out
0804a000-0804b000 rw-p 00001000 08:04 213111 /home/dan/source/out
09cff000-09d20000 rw-p 09cff000 00:00 0 [heap]
b7ea4000-b7eb1000 r-xp 00000000 08:01 2601 /lib/libgcc_s.so.1
b7eb1000-b7eb2000 r--p 0000c000 08:01 2601 /lib/libgcc_s.so.1
b7eb2000-b7eb3000 rw-p 0000d000 08:01 2601 /lib/libgcc_s.so.1
b7ec2000-b7ec3000 rw-p b7ec2000 00:00 0
b7ec3000-b801f000 r-xp 00000000 08:01 2661 /lib/tls/i686/cmov/libc-2.9.so
b801f000-b8020000 ---p 0015c000 08:01 2661 /lib/tls/i686/cmov/libc-2.9.so
b8020000-b8022000 r--p 0015c000 08:01 2661 /lib/tls/i686/cmov/libc-2.9.so
b8022000-b8023000 rw-p 0015e000 08:01 2661 /lib/tls/i686/cmov/libc-2.9.so
b8023000-b8026000 rw-p b8023000 00:00 0
b8034000-b8037000 rw-p b8034000 00:00 0
b8037000-b8038000 r-xp b8037000 00:00 0 [vdso]
b8038000-b8054000 r-xp 00000000 08:01 8001 /lib/ld-2.9.so
b8054000-b8055000 r--p 0001b000 08:01 8001 /lib/ld-2.9.so
b8055000-b8056000 rw-p 0001c000 08:01 8001 /lib/ld-2.9.so
bf9f6000-bfa0b000 rw-p bffeb000 00:00 0 [stack]
Aborted
dan@dan-desktop:~/source$ cat ss1.c
#include <string.h>
#include <stdio.h>
#include <stdlib.h>

void foo (char *bar)
{
    float My_Float = 10.5; // Addr = 0x0023FF4C
    char c[12]; // Addr = 0x0023FF30
    size_t a, b;

    a = sizeof(float);
    printf("a is %d\n", a);
    b = strlen( bar);
    printf("b is %d\n", b);

    // Will print 10.500000
    printf("My Float value = %f\n", My_Float);

    /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       Memory map:
       @ : c allocated memory
       # : My_Float allocated memory
       - : other memory

       *c                          *My_Float
       0x0023FF30                  0x0023FF4C
       |                           |
       @@@@@@@@@@@@----------------#####
       foo("my string is too long !!!!! XXXXX");

       memcpy will put 0x1010C042 in My_Float value.
       ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/

    memcpy(c, bar, strlen(bar)); // no bounds checking...

    // Will print 96.031372
    printf("My Float value = %f\n", My_Float);
}

int main (void)
{
    foo("my string is too long !!!!! \x10\x10\xC0\x42");
    return 0;
}

// gcc -std=c99 -Wall -Wextra ss1.c -o out; ./out

dan@dan-desktop:~/source$

I have a couple questions:

1) Does the backtrace and memory map data tell anyone something of
relevance?

2) Why do I not get 96.03 as the wiki promises?

Thanks for your comment.
--
frank

Richard Heathfield

Jan 16, 2010, 5:53:53 PM

frank wrote:

<snip>

> dan@dan-desktop:~/source$ cat ss1.c
> #include <string.h>
> #include <stdio.h>
> #include <stdlib.h>
>
> void foo (char *bar)
> {
> float My_Float = 10.5; // Addr = 0x0023FF4C
> char c[12]; // Addr = 0x0023FF30
> size_t a, b;
>
> a = sizeof(float);
> printf("a is %d\n", a);

size_t is an unsigned integral type. If you want to pass a size_t to
printf to match a %d format specifier, cast it to int.

> b = strlen( bar);
> printf("b is %d\n", b);
>
> // Will print 10.500000
> printf("My Float value = %f\n", My_Float);

<snip>

> memcpy(c, bar, strlen(bar)); // no bounds checking...

Since the source string is longer than the amount of memory available at
the destination, your call to memcpy will violate the bounds of the
array, at which point the behaviour of the program is undefined.

<snip>

> 2) Why do I not get 96.03 as the wiki promises?

C does not define the behaviour of a program whose behaviour is
undefined. So any result is okay, including the Wiki's result and any
other result (or no result at all).

--
Richard Heathfield <http://www.cpax.org.uk>
Email: -http://www. +rjh@
"Usenet is a strange place" - dmr 29 July 1999
Sig line vacant - apply within

Alan Curry

Jan 16, 2010, 6:01:46 PM

In article <7rer1b...@mid.individual.net>,

frank <fr...@example.invalid> wrote:
>I've heard of stack smashing but never done it myself until about 36
>hours ago. Wiki had an example that I'm having problems following, but
>it does do the trick:
>
>http://en.wikipedia.org/wiki/Stack_buffer_overflow
>

[snip]

>1) Does the backtrace and memory map data tell anyone something of
>relevance?

A little bit. The backtrace ended at 0x21212067, which is "!! g" in ASCII.
It's a piece of the "too long !!!!!" string, appearing reversed because
you're on a little-endian architecture.
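To make the decoding concrete, here is a minimal sketch that lays the four bytes of that address out in the order a little-endian machine stores them. The helper name addr_to_le_text is invented for illustration and does not come from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): write the four bytes of
   `addr` into `out` in little-endian memory order, lowest byte first,
   and NUL-terminate the result so it can be printed as text. */
void addr_to_le_text(uint32_t addr, char out[5])
{
    assert(out != NULL);
    for (size_t i = 0; i < 4; i++) {
        /* Shift the wanted byte down to the low 8 bits, then mask it. */
        out[i] = (char)((addr >> (8 * i)) & 0xFFu);
    }
    out[4] = '\0';
}
```

Called with 0x21212067, this fills the buffer with "g !!", which is the run of characters at offsets 20 through 23 of "my string is too long !!!!! ". Read from the high byte down, the same value spells "!! g".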

>
>2) Why do I not get 96.03 as the wiki promises?

It's an example, not a promise. Come on, man, this is a demonstration of what
evil things can be done with deliberately invalid code. You can't expect it
to just work out of the box!

Exploiting buffer overflows is delicate work. When the victim doesn't use the
same compiler and options that the attacker was expecting, the memory
addresses come out wrong and it doesn't work. The attacker's next step would
be to adjust the length of the string, and the 0x21212067 would be a helpful
clue.

Quit trying to learn how to exploit bad code, at least until you've learned
to write good code. Here's a wikipedia article on that:
http://en.wikipedia.org/wiki/Script_kiddie

--
Alan Curry

frank

Jan 16, 2010, 9:44:16 PM

Alan Curry wrote:
> In article <7rer1b...@mid.individual.net>,
> frank <fr...@example.invalid> wrote:
>> I've heard of stack smashing but never done it myself until about 36
>> hours ago. Wiki had an example that I'm having problems following, but
>> it does do the trick:
>>
>> http://en.wikipedia.org/wiki/Stack_buffer_overflow
>>
> [snip]
>
>> 1) Does the backtrace and memory map data tell anyone something of
>> relevance?
>
> A little bit. The backtrace ended at 0x21212067, which is "!! g" in ASCII.
> It's a piece of the "too long !!!!!" string, appearing reversed because
> you're on a little-endian architecture.

Interesting.

>
>> 2) Why do I not get 96.03 as the wiki promises?
>
> It's an example, not a promise. Come on, man, this is a demonstration of what
> evil things can be done with deliberately invalid code. You can't expect it
> to just work out of the box!
>
> Exploiting buffer overflows is delicate work. When the victim doesn't use the
> same compiler and options that the attacker was expecting, the memory
> addresses come out wrong and it doesn't work. The attacker's next step would
> be to adjust the length of the string, and the 0x21212067 would be a helpful
> clue.

Right.

>
> Quit trying to learn how to exploit bad code, at least until you've learned
> to write good code. Here's a wikipedia article on that:
> http://en.wikipedia.org/wiki/Script_kiddie
>

It doesn't make much sense for me to hack myself here. I'm trying to
figure out why this is happening in other code:
#define PATH_SIZE 100

int readdir_r(DIR *restrict dirp, struct dirent *restrict entry,
struct dirent **restrict result);

int lstat(const char *restrict path, struct stat *restrict buf);

ssize_t readlink(const char *restrict path, char *restrict buf,
size_t bufsize);

unsigned
process_directory (char *theDir)
{
DIR *dir = NULL;
struct dirent entry;
struct dirent *entryPtr = NULL;
int retval = 0;
unsigned count = 0;
char pathName[PATH_SIZE + 1];

dir = opendir (theDir);
if (dir == NULL)
{
printf ("Error opening %s: %s", theDir, strerror (errno));
return 0;
}

retval = readdir_r (dir, &entry, &entryPtr);
while (entryPtr != NULL)
{
struct stat entryInfo;

if ((strncmp (entry.d_namessize_t readlink(const char *restrict
path, char *restrict buf,
size_t bufsize);, ".", PATH_SIZE) == 0) ||
(strncmp (entry.d_name, "..", PATH_SIZE) == 0))
{
/* Short-circuit the . and .. entries. */
retval = readdir_r (dir, &entry, &entryPtr);
continue;
}

(void) strncpy (pathName, theDir, PATH_SIZE);
(void) strncat (pathName, "/", PATH_SIZE);
(void) strncat (pathName, entry.d_name, PATH_SIZE);

if (lstat (pathName, &entryInfo) == 0)
{
/* stat() succeeded, let's party */
count++;

if (S_ISDIR (entryInfo.st_mode))
{ /* directory */
printf ("processing %s/\n", pathName);
count += process_directory (pathName);
}
else if (S_ISREG (entryInfo.st_mode))
{ /* regular file */
printf ("\t%s has %lld bytes\n",
pathName, (long long) entryInfo.st_size);
}
else if (S_ISLNK (entryInfo.st_mode))
{ /* symbolic link */
char targetName[PATH_SIZE + 1];
if (readlink (pathName, targetName, PATH_SIZE) != -1)
{
printf ("\t%s -> %s\n", pathName, targetName);
}
else
{
printf ("\t%s -> (invalid symbolic link!)\n", pathName);
}
}
}
else
{ssize_t readlink(const char *restrict path, char *restrict buf,
size_t bufsize);
printf ("Error statting %s: %s\n", pathName, strerror (errno));
}

retval = readdir_r (dir, &entry, &entryPtr);
}

/* Close the directory and return the number of entries. */
(void) closedir (dir);
return count;
}

int
main (void)
{
process_directory ("/home/dan/source");
return EXIT_SUCCESS;
}

I wonder if my kludge to work around PATH_MAX is causing the stack
corruption.
--
frank

Ben Bacarisse

Jan 16, 2010, 10:37:49 PM

frank <fr...@example.invalid> writes:
<snip>

> (void) strncpy (pathName, theDir, PATH_SIZE);
> (void) strncat (pathName, "/", PATH_SIZE);
> (void) strncat (pathName, entry.d_name, PATH_SIZE);

Just a quick heads up: strncpy is almost always the wrong function to
use. For one thing, you may end up with something that is not a
string.

Also, strncat does not do what you seem to think. The last argument
is the maximum number of (non-null) characters to append to the
buffer. Using PATH_SIZE (one less than the buffer size) is therefore
not useful. There is really no way to do this without knowing the
sizes of the strings involved.
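A minimal sketch of what passing the right count to strncat looks like, assuming the destination already holds a valid string. The helper name append_checked is made up for this example. The third argument is the space still free in the buffer, not the buffer's total size.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append src to the string already in dst, where
   dst is a buffer of dst_size bytes. Returns 0 on success, or -1 if
   the result would not fit, in which case dst is left unchanged. */
int append_checked(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t used = strlen(dst);   /* characters already in the buffer */
    size_t need = strlen(src);   /* characters we want to add */

    /* One byte must stay free for the terminating NUL. */
    if (used >= dst_size || need > dst_size - used - 1) {
        return -1;
    }
    /* The limit is the remaining space, not the whole buffer size. */
    strncat(dst, src, dst_size - used - 1);
    return 0;
}
```

Because the lengths are checked first, the strncat call can never truncate here. Plain strcat or memcpy would work equally well after the check.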

<snip>
--
Ben.

frank

Jan 16, 2010, 10:59:45 PM

Ben Bacarisse wrote:
> frank <fr...@example.invalid> writes:
> <snip>
>> (void) strncpy (pathName, theDir, PATH_SIZE);
>> (void) strncat (pathName, "/", PATH_SIZE);
>> (void) strncat (pathName, entry.d_name, PATH_SIZE);
>
> Just a quick heads up: strncpy is almost always the wrong function to
> use. For one thing, you may end up with something that is not a
> string.

Ben, I'm not strong at all with string processing in C. I see no
warnings about strncpy in H&S. What should I use instead?

>
> Also, strncat does not do what you seem to think. The last argument
> is the maximum number of (non-null) characters to append to the
> buffer. Using PATH_SIZE (one less than the buffer size) is therefore
> not useful. There is really no way to do this without knowing the
> sizes of the strings involved.
>

It makes some sense to me. It does however look flawed in general
because you'd have
/home/dan/source/ and 400 null characters then
/home/dan/source// and 399 null characters then
/home/dan/source//rd7.c and 395 null characters

I don't know. I'm trying to break down this source and was hoping that
it was written by an expert, but it appears to have been written by
someone with little experience or ability.
--
frank

Richard Heathfield

Jan 17, 2010, 1:40:11 AM

frank wrote:
> Ben Bacarisse wrote:
>> frank <fr...@example.invalid> writes:
>> <snip>
>>> (void) strncpy (pathName, theDir, PATH_SIZE);
>>> (void) strncat (pathName, "/", PATH_SIZE);
>>> (void) strncat (pathName, entry.d_name, PATH_SIZE);
>>
>> Just a quick heads up: strncpy is almost always the wrong function to
>> use. For one thing, you may end up with something that is not a
>> string.
>
> Ben, I'm not strong at all with string processing in C. I see no
> warnings about strncpy in H&S. What should I use instead?

In this case, malloc (to build a sufficiently long buffer), followed by
a check to ensure it succeeded, followed either by error handling or
sprintf.

<snip>

frank

Jan 17, 2010, 3:13:23 AM

Richard Heathfield wrote:
> frank wrote:
>> Ben Bacarisse wrote:
>>> frank <fr...@example.invalid> writes:
>>> <snip>
>>>> (void) strncpy (pathName, theDir, PATH_SIZE);
>>>> (void) strncat (pathName, "/", PATH_SIZE);
>>>> (void) strncat (pathName, entry.d_name, PATH_SIZE);
>>>
>>> Just a quick heads up: strncpy is almost always the wrong function to
>>> use. For one thing, you may end up with something that is not a
>>> string.
>>
>> Ben, I'm not strong at all with string processing in C. I see no
>> warnings about strncpy in H&S. What should I use instead?
>
> In this case, malloc (to build a sufficiently long buffer), followed by
> a check to ensure it succeeded, followed either by error handling or
> sprintf.

Well yes and no. My buffer wasn't long enough. When I switched from
100, which I thought was plenty, to 300 for PATH_SIZE, the program
compiled and behaved as far as I could tell. I got a good tip from
yowie in alt.os.linux.ubuntu there.

So there is a proper value for this declaration:

char pathName[PATH_SIZE + 1];

so that one doesn't overwrite pathName, which is precisely what I was doing.

I haven't yet figured out how to get the information that has supplanted
PATH_MAX. I'm told this:

So instead of using PATH_MAX, you call pathconf(filename, _PC_PATH_MAX),
where filename is the name of a file on the filesystem whose maximum
pathname length you're interested in.

end quote. How do I know where any file is without a path? So I need a
path to find the file that is going to tell me how large _PC_PATH_MAX is
going to be for that directory.

Sounds like a circle to me. :-(
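One way out of the circle, as a hedged sketch: the path handed to pathconf can simply be the directory that is about to be traversed, which is already known. The helper name path_max_for and its fallback parameter are assumptions made for illustration. The _POSIX_C_SOURCE define is there so that pathconf is declared under -std=c99.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: ask for POSIX declarations */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: ask the filesystem that holds `dir` for its
   maximum relative pathname length. If pathconf reports no limit or
   fails, return the caller-supplied fallback instead. */
long path_max_for(const char *dir, long fallback)
{
    assert(dir != NULL);
    errno = 0;
    long limit = pathconf(dir, _PC_PATH_MAX);
    if (limit == -1) {
        /* errno == 0 means "no fixed limit"; otherwise the call failed.
           Either way there is no usable number, so use the fallback. */
        return fallback;
    }
    return limit;
}
```

In process_directory this could be called as path_max_for(theDir, 4096) before the buffer is sized, where 4096 is only a placeholder fallback.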
--
frank

io_x

Jan 17, 2010, 4:10:34 AM

"frank" <fr...@example.invalid> wrote in message
news:7rer1b...@mid.individual.net...

> #include <string.h>
> #include <stdio.h>
> #include <stdlib.h>
>
> void foo (char *bar)
> {
> float My_Float = 10.5; // Addr = 0x0023FF4C
> char c[12]; // Addr = 0x0023FF30
> size_t a, b;
>
> a = sizeof(float);
> printf("a is %d\n", a);
> b = strlen( bar);
> printf("b is %d\n", b);
>
>
>
> // Will print 10.500000
> printf("My Float value = %f\n", My_Float);

The format string "%f" for printf (but not for scanf) is for printing a
double and not a float; I think you should write this instead:
printf("My Float value = %f\n", (double) My_Float);

Richard Heathfield

Jan 17, 2010, 4:23:23 AM

frank wrote:
> Richard Heathfield wrote:
>> frank wrote:
>>> Ben Bacarisse wrote:
>>>> frank <fr...@example.invalid> writes:
>>>> <snip>
>>>>> (void) strncpy (pathName, theDir, PATH_SIZE);
>>>>> (void) strncat (pathName, "/", PATH_SIZE);
>>>>> (void) strncat (pathName, entry.d_name, PATH_SIZE);
>>>>
>>>> Just a quick heads up: strncpy is almost always the wrong function to
>>>> use. For one thing, you may end up with something that is not a
>>>> string.
>>>
>>> Ben, I'm not strong at all with string processing in C. I see no
>>> warnings about strncpy in H&S. What should I use instead?
>>
>> In this case, malloc (to build a sufficiently long buffer), followed
>> by a check to ensure it succeeded, followed either by error handling
>> or sprintf.
>
> Well yes and no.

You're half-right.

> My buffer wasn't long enough. When I switched from
> 100, which I thought was plenty, to 300 for PATH_SIZE, the program
> compiled and behaved as far as I could tell. I got a good tip from
> yowie in alt.os.linux.ubuntu there.

char *pathName = malloc(strlen(theDir) + strlen("/") +
strlen(entry.d_name) + 1);
if(pathName != NULL)
{
sprintf(pathName, "%s/%s", theDir, entry.d_name);

If the malloc call succeeded, pathName is now guaranteed to be long
enough, no matter how long theDir and entry.d_name are (up to the upper
limit of size_t, anyway).
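The fragment above can be rounded out into a complete function along these lines. This is a sketch, not the poster's own code: the name join_path and the rule that the caller frees the result are assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "dir/name" in freshly allocated memory.
   Returns NULL if malloc fails. The caller must free() the result. */
char *join_path(const char *dir, const char *name)
{
    assert(dir != NULL && name != NULL);
    size_t dir_len = strlen(dir);
    size_t name_len = strlen(name);

    /* dir + '/' + name + terminating NUL */
    char *path = malloc(dir_len + 1 + name_len + 1);
    if (path == NULL) {
        return NULL;
    }
    /* sprintf is safe here because the buffer was sized from the
       same two strings that are being written into it. */
    sprintf(path, "%s/%s", dir, name);
    return path;
}
```

Inside the readdir loop this would replace the three strn* calls with one call to join_path(theDir, entry.d_name), followed by free() once the path is no longer needed.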

Richard Heathfield

Jan 17, 2010, 4:28:13 AM

io_x wrote:

<snip>

> The format string "%f" for printf (but not for scanf) is for printing a
> double and not a float; I think you should write this instead:
> printf("My Float value = %f\n", (double) My_Float);

No, the float value will be converted to double anyway prior to the
call. Look up "default argument promotions" in the Standard.
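A small sketch of the promotion at work in a variadic function of one's own. The function name sum_floats is invented for this example. The callee has to fetch each argument as double, because that is what actually arrives even when the caller passes a float.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical example: add up `count` floating-point arguments.
   Float arguments in the variable part are promoted to double before
   the call, so va_arg must read them as double. */
double sum_floats(int count, ...)
{
    assert(count >= 0);
    va_list ap;
    va_start(ap, count);
    double total = 0.0;
    for (int i = 0; i < count; i++) {
        total += va_arg(ap, double);  /* a float argument arrives as double */
    }
    va_end(ap);
    return total;
}
```

printf is in the same position: by the time it sees the argument for %f, a float has already become a double, so the explicit cast changes nothing.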

Antoninus Twink

Jan 17, 2010, 4:54:55 AM

On 16 Jan 2010 at 22:53, Richard Heathfield wrote:
> frank wrote:

>> printf("a is %d\n", a);
>
> size_t is an unsigned integral type. If you want to pass a size_t to
> printf to match a %d format specifier, cast it to int.

This is exceptionally poor advice - and as usual, the famous "clc peer
review" that would have seen half a dozen people piling in to point out
that the resulting int might have been a trap representation or some
such nonsense if a newbie had posted this remains strangely silent when
it is one of their chums who's boobooed.

Casting to an int so that you can use %d is utterly stupid when you can
just use the %zu format specifier and not risk printing nonsense in the
quite likely event that size_t is wider than int.

If you happen to be stuck with C90 (and hopefully you've got a better
reason for this than Heathfield's one of irrational prejudice), then
using %lu and casting to unsigned long int is a much better option than
using %d and casting to int.
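For reference, the two alternatives named here look roughly like this. The function names are made up for illustration. Both return whatever fprintf returns, that is, the number of characters written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* C99 and later: %zu is the conversion for size_t, no cast needed. */
int print_size_c99(FILE *out, size_t value)
{
    assert(out != NULL);
    return fprintf(out, "%zu\n", value);
}

/* C90 fallback: convert to the widest standard unsigned type that
   C90 offers and print it with %lu. */
int print_size_c90(FILE *out, size_t value)
{
    assert(out != NULL);
    return fprintf(out, "%lu\n", (unsigned long)value);
}
```

With either function, print_size_c99(stdout, strlen(bar)) in the original program would print 32 without a format mismatch.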

Gordon Burditt

Jan 17, 2010, 5:34:26 AM

>So instead of using PATH_MAX, you call pathconf(filename, _PC_PATH_MAX),
> where filename is the name of a file on the filesystem whose maximum
>pathname length you're interested in.

Hint: in FreeBSD, FILENAME_MAX is defined as 1024, along with a
comment that it must be <= PATH_MAX. Linux has similar filesystems
to FreeBSD. If you can't find PATH_MAX, consider using FILENAME_MAX
as a minimum for it (or perhaps 10*FILENAME_MAX). That's at least
a starter for the length of a filename of the mount point of the
relevant filesystem.

Nobody

Jan 17, 2010, 5:37:21 AM

On Sat, 16 Jan 2010 20:59:45 -0700, frank wrote:

>> Just a quick heads up: strncpy is almost always the wrong function to
>> use. For one thing, you may end up with something that is not a
>> string.
>
> Ben, I'm not strong at all with string processing in C. I see no
> warnings about strncpy in H&S. What should I use instead?

In C99 (or on Unix), you can use snprintf(), e.g.:

snprintf(pathName, sizeof pathName, "%s/%s", theDir, entry.d_name);

Unlike the strn* functions, snprintf() always includes a NUL byte.
Unfortunately, it's not in C89.

[Where a function takes a pointer to a buffer and the size of the buffer,
it's usually a good idea to use sizeof rather than the macro you used when
defining the buffer. That way, you don't have to update the rest of the
code when you change the declaration.]

However, even if you don't overflow the buffer, you can end up with a
truncated value, which can mean that your program silently produces bogus
output (if the constructed pathname was being used to open a file for
write, or to remove a file, you could end up overwriting or removing the
wrong file).

If you must use fixed-sized buffers, it's probably better to calculate
the length of the resulting string (adding the results of various strlen()
calls, plus 1 for the terminating NUL) and simply report an error if the
result won't fit into the buffer.

E.g.:

size_t dir_len = strlen(theDir);
size_t name_len = strlen(entry.d_name);
size_t total_len = dir_len + 1 + name_len + 1;
if (total_len > sizeof pathName) {
    fprintf(stderr, "pathname too long\n");
    return -1;
}
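The truncation problem with snprintf() can also be caught from its return value, which is the length the full string would have had. A minimal sketch, with the helper name build_path invented for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "dir/name" into buf (buf_size bytes).
   Returns 0 on success, -1 if the result was truncated or snprintf
   reported an error. */
int build_path(char *buf, size_t buf_size, const char *dir, const char *name)
{
    assert(buf != NULL && dir != NULL && name != NULL);
    int n = snprintf(buf, buf_size, "%s/%s", dir, name);

    /* n is the length the complete string needs, not counting the NUL.
       If it is not strictly less than buf_size, the output was cut. */
    if (n < 0 || (size_t)n >= buf_size) {
        return -1;
    }
    return 0;
}
```

A caller would treat -1 as "skip this entry and report it" rather than going on to lstat() a shortened, and therefore wrong, pathname.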

Ben Bacarisse

Jan 17, 2010, 6:09:24 AM

frank <fr...@example.invalid> writes:

> Ben Bacarisse wrote:
>> frank <fr...@example.invalid> writes:
>> <snip>
>>> (void) strncpy (pathName, theDir, PATH_SIZE);
>>> (void) strncat (pathName, "/", PATH_SIZE);
>>> (void) strncat (pathName, entry.d_name, PATH_SIZE);
>>
>> Just a quick heads up: strncpy is almost always the wrong function to
>> use. For one thing, you may end up with something that is not a
>> string.
>
> Ben, I'm not strong at all with string processing in C. I see no
> warnings about strncpy in H&S. What should I use instead?

I'd use memcpy, but that's not really the central issue.

>> Also, strncat does not do what you seem to think. The last argument
>> is the maximum number of (non-null) characters to append to the
>> buffer. Using PATH_SIZE (one less than the buffer size) is therefore
>> not useful. There is really no way to do this without knowing the
>> sizes of the strings involved.
>>
>
> It makes some sense to me. It does however look flawed in general
> because you'd have
> /home/dan/source/ and 400 null characters then
> /home/dan/source// and 399 null characters then
> /home/dan/source//rd7.c and 395 null characters
>
> I don't know. I'm trying to break down this source and was hoping
> that it was written by an expert, but it appears to have been written
> by someone with little experience or ability.

The code seems to be putting effort into preventing buffer overflow
(it fails to do that, but that is almost incidental) but the real issue
is that it should decide if there is room and stop if there is not
(the alternative is to allocate space, of course). It seems pointless
to try to pack as much of the string into the buffer as possible since
even if one character won't fit, the program won't work.

It would be better to start off by testing if the size of the
two strings plus 1 for the '/' and one for the '\0' will fit. Since
this involves finding the two lengths you then have all the
information you need to copy safely (using, say, memcpy) or to
allocate space (if that's your preferred solution).
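A sketch of that approach for a fixed-size buffer, assuming a made-up helper name join_into. The lengths are measured once, the fit is checked, and only then are the bytes copied with memcpy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: build "dir/name" in buf (buf_size bytes).
   Returns 0 on success, -1 if the result would not fit; in that case
   buf is not written to at all. */
int join_into(char *buf, size_t buf_size, const char *dir, const char *name)
{
    assert(buf != NULL && dir != NULL && name != NULL);
    size_t dir_len = strlen(dir);
    size_t name_len = strlen(name);

    /* Need dir_len + 1 ('/') + name_len + 1 ('\0') <= buf_size.
       Written with subtraction so the sum cannot overflow. */
    if (buf_size < 2 || dir_len > buf_size - 2 ||
        name_len > buf_size - 2 - dir_len) {
        return -1;
    }
    memcpy(buf, dir, dir_len);
    buf[dir_len] = '/';
    /* name_len + 1 copies the terminating NUL along with the name. */
    memcpy(buf + dir_len + 1, name, name_len + 1);
    return 0;
}
```

The same two lengths could instead feed a malloc call if allocating is preferred. The check and the copy stay the same.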

--
Ben.

Ike Naar

Jan 17, 2010, 7:41:44 AM

In article <7rfv53...@mid.individual.net>,

frank <fr...@example.invalid> wrote:
>Well yes and no. My buffer wasn't long enough. When I switched from
>100, which I thought was plenty, to 300 for PATH_SIZE, the program
>compiled and behaved as far as I could tell.

Apparently the code that you compiled is not the code that you posted.
The posted code has parts that are almost, but not quite, entirely unlike C:

frank

Jan 17, 2010, 3:15:31 PM

Richard Heathfield wrote:
> io_x wrote:
>
> <snip>
>
>> The format string "%f" for printf (but not for scanf) is for printing a
>> double and not a float; I think you should write this instead:
>> printf("My Float value = %f\n", (double) My_Float);
>
> No, the float value will be converted to double anyway prior to the
> call. Look up "default argument promotions" in the Standard.
>

It seems like floats have floated away. I can't think of a reason a
person would want a float as opposed to a double: anything that float
can represent is also represented by a double.

6.3.1.7 is a good place to read up on this.
--
frank

jacob navia

Jan 17, 2010, 3:24:05 PM

frank wrote:

Please, do not generalize too much. Floats use 50% of the memory
needed by a double, and when memory is important and precision is not,
it is better to use float.

NVIDIA has proposed a new 16 bit floating point format (used now internally
in their GPUs). For game/graphic applications, it is not important to know
with extreme precision where a ray will pass or what color it will have. But
using half the memory is, since you can output twice as much.

I agree that in "normal" applications double is much better, but float
(and even "short float") have their uses. The new x86 machines will
feature those and treat a lot of them in parallel. They feature now float
support in the XMM registers.

jacob

frank

Jan 17, 2010, 3:51:42 PM

Ike Naar wrote:
> In article <7rfv53...@mid.individual.net>,

> Apparently the code that you compiled is not the code that you posted.
> The posted code has parts that are almost, but not quite, entirely unlike C:
>
> if ((strncmp (entry.d_namessize_t readlink(const char *restrict
> path, char *restrict buf,
> size_t bufsize);, ".", PATH_SIZE) == 0) ||
> (strncmp (entry.d_name, "..", PATH_SIZE) == 0))
>

It is true that I posted something different than I compiled to minimize
the extent of the posix extension, providing, for example, function
declarations for readlink as opposed to #including a non-c-standard
header. Otherwise, it's not unlike C precisely because it is C.

I'm trying to work out ways to make my current project more topical for
clc. Hence the stack-smashing example from wiki. It seems that with my
implementation, anytime you write past the storage extent of a variable,
you get told that you are stack smashing, and the OS shuts you down.

I am, however, pleasantly surprised to see a lot of the usual suspects
in comp.unix.programmer.
--
frank

frank

Jan 17, 2010, 4:32:47 PM

Thanks, Gordon, this might be a way for me to work around this until I
can implement something fancier:

dan@dan-desktop:~/source$ gcc -std=c99 -Wall -Wextra ss3.c -o out
dan@dan-desktop:~/source$ ./out
FILENAME_MAX is 4096
dan@dan-desktop:~/source$ cat ss3.c
#include <stdio.h>

int main (void)
{
printf("FILENAME_MAX is %d\n", FILENAME_MAX);
return 0;
}

// gcc -std=c99 -Wall -Wextra ss3.c -o out

dan@dan-desktop:~/source$

--
frank

john

Jan 17, 2010, 4:53:12 PM

As usual you are completely wrong Twink. Why dont you FOD?

frank

Jan 17, 2010, 5:25:20 PM

a's value isn't going to exceed one hundred. If size_t is wider than
int, doesn't it "demote" appropriately?
--
frank

Seebs

Jan 17, 2010, 5:30:46 PM

On 2010-01-17, john <jo...@nospam.com> wrote:
> Antoninus Twink wrote:
>> On 16 Jan 2010 at 22:53, Richard Heathfield wrote:
>>> frank wrote:
>>>> printf("a is %d\n", a);
>>>
>>> size_t is an unsigned integral type. If you want to pass a size_t to
>>> printf to match a %d format specifier, cast it to int.

>> Casting to an int so that you can use %d is utterly stupid when you can

>> just use the %zu format specifier and not risk printing nonsense in the
>> quite likely event that size_t is wider than int.

This is nearly good advice, except for the small issue that %zu hasn't seen
the widespread adoption I would have liked.

>> If you happen to be stuck with C90 (and hopefully you've got a better
>> reason for this than Heathfield's one of irrational prejudice), then
>> using %lu and casting to unsigned long int is a much better option than
>> using %d and casting to int.

> As usual you are completely wrong Twink. Why dont you FOD?

Uh, can you explain what's wrong there? I'm not seeing something wrong with
it. I have certainly used systems where size_t was larger than int, and
where objects could have a size larger than the range expressible in int,
signed or unsigned.

Heathfield's advice was correct, technically -- if you want to match a %d
format specifier, yes, cast to int. There may be circumstances in which the
format specifier is externally determined for some reason, so you might
actually need to know this. But Twink's position seems reasonable to me;
if you really do need to support systems without %zu, then %lu/unsigned long
is probably the next best thing.

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!

Seebs

Jan 17, 2010, 5:31:46 PM

On 2010-01-17, frank <fr...@example.invalid> wrote:
> a's value isn't going to exceed one hundred. If size_t is wider than
> int, doesn't it "demote" appropriately?

Not in the argument list of a variable-arguments function, because there's
no way for the compiler to know in advance what type it would demote to.
(Actually, to be super picky, gcc often does know, and so do some other
compilers, but as a matter of design, there is no such demotion in place,
and there are easily-generated examples of cases where it's impossible
to tell except at runtime.)

Ben Bacarisse

Jan 17, 2010, 6:13:13 PM

frank <fr...@example.invalid> writes:

> Ike Naar wrote:
>> In article <7rfv53...@mid.individual.net>,
>
>> Apparently the code that you compiled is not the code that you posted.
>> The posted code has parts that are almost, but not quite, entirely unlike C:
>>
>> if ((strncmp (entry.d_namessize_t readlink(const char *restrict
>> path, char *restrict buf,
>> size_t bufsize);, ".", PATH_SIZE) == 0) ||
>> (strncmp (entry.d_name, "..", PATH_SIZE) == 0))
>>
>
> It is true that I posted something different than I compiled to
> minimize the extent of the posix extension, providing, for example,
> function declarations for readlink as opposed to #including a
> non-c-standard header. Otherwise, it's not unlike C precisely because
> it is C.

It's C, Jim, but not as we know it (to use what may be a more familiar
reference). It looks like a function prototype has been pasted into a
strncmp call. BTW, why strncmp? It seems to me that strcmp is
the obvious choice here.

<snip>
--
Ben.

Ben Bacarisse

Jan 17, 2010, 6:23:58 PM

Seebs <usenet...@seebs.net> writes:

> On 2010-01-17, john <jo...@nospam.com> wrote:
>> Antoninus Twink wrote:
>>> On 16 Jan 2010 at 22:53, Richard Heathfield wrote:
>>>> frank wrote:
>>>>> printf("a is %d\n", a);
>>>>
>>>> size_t is an unsigned integral type. If you want to pass a size_t to
>>>> printf to match a %d format specifier, cast it to int.
>
>>> Casting to an int so that you can use %d is utterly stupid when you can
>>> just use the %zu format specifier and not risk printing nonsense in the
>>> quite likely event that size_t is wider than int.
>
> This is nearly good advice, except for the small issue that %zu hasn't seen
> the widespread adoption I would have liked.
>
>>> If you happen to be stuck with C90 (and hopefully you've got a better
>>> reason for this than Heathfield's one of irrational prejudice), then
>>> using %lu and casting to unsigned long int is a much better option than
>>> using %d and casting to int.
>
>> As usual you are completely wrong Twink. Why dont you FOD?
>
> Uh, can you explain what's wrong there?

Nothing wrong except what I can only see as deliberately provocative
snipping. The context (just one line above) was:

a = sizeof(float);

If Richard Heathfield had suggested that int was wrong and unsigned
long was the type to use, AT would quite likely have gone off on one
about absurd fears that floats might be longer than 32,767 bytes.
Since int is perfectly reasonable here, AT had to snip the context to
go off on one. The common thread here is that AT will go off on one
no matter what one says.

<snip>
--
Ben.

frank

Jan 17, 2010, 7:52:36 PM

Richard Heathfield wrote:

> char *pathName = malloc(strlen(theDir) + strlen("/") +
> strlen(entry.d_name) + 1);
> if(pathName != NULL)
> {
> sprintf(pathName, "%s/%s", theDir, entry.d_name);
>
> If the malloc call succeeded, pathName is now guaranteed to be long
> enough, no matter how long theDir and entry.d_name are (up to the upper
> limit of size_t, anyway).

Ok, I see. This obviates the need to know a priori how big these have
to be. Let me ask something a little different. In
comp.unix.programmer, Jens Thoms Toerring writes the following about a similar
getdir function:

If I were to "redesign" the function I probably would have
getdir() take only the path argument and have it return a
pointer to the file list, with the modification that the
file list always ends in a NULL pointer to mark its end -
that way you don't have to also keep track of how many
elements it has. And then I'd supply a second function,
free_dir_list(), which receives the file list and free()'s
everything in it.

Does this sound like a good memory model for this type of thing?
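To make the quoted design concrete, here is a rough sketch of the two functions it describes. Everything beyond the names getdir and free_dir_list is an assumption: the use of readdir rather than readdir_r, the skipping of "." and "..", the growth strategy, and returning NULL on any failure. It is not the code from the comp.unix.programmer thread.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: ask for POSIX declarations */
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Free a NULL-terminated list returned by getdir(). */
void free_dir_list(char **list)
{
    if (list == NULL) {
        return;
    }
    for (char **p = list; *p != NULL; p++) {
        free(*p);
    }
    free(list);
}

/* Return a NULL-terminated array of the names in `path` (without "."
   and ".."), or NULL if the directory can't be read or memory runs out. */
char **getdir(const char *path)
{
    assert(path != NULL);
    DIR *dir = opendir(path);
    if (dir == NULL) {
        return NULL;
    }
    size_t count = 0, cap = 8;
    char **list = malloc(cap * sizeof *list);
    if (list == NULL) {
        closedir(dir);
        return NULL;
    }
    list[0] = NULL;  /* the list is kept NULL-terminated at every step */

    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0) {
            continue;
        }
        if (count + 1 == cap) {  /* keep one slot for the final NULL */
            char **tmp = realloc(list, 2 * cap * sizeof *list);
            if (tmp == NULL) {
                goto fail;
            }
            list = tmp;
            cap *= 2;
        }
        size_t len = strlen(ent->d_name) + 1;
        char *copy = malloc(len);
        if (copy == NULL) {
            goto fail;
        }
        memcpy(copy, ent->d_name, len);
        list[count++] = copy;
        list[count] = NULL;
    }
    closedir(dir);
    return list;

fail:
    closedir(dir);
    free_dir_list(list);
    return NULL;
}
```

A caller walks the result with a loop such as for (char **p = list; *p != NULL; p++) and hands the list to free_dir_list() when finished, so no element count has to be carried around.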
--
frank

Seebs

Jan 17, 2010, 8:00:26 PM

On 2010-01-17, Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
> Nothing wrong except what I can only see as deliberately provocative
> snipping. The context (just one line above) was:
>
> a = sizeof(float);
>
> If Richard Heathfield had suggested that int was wrong and unsigned
> long was the type to use, AT would quite likely have gone off on one
> about absurd fears that floats might be longer than 32,767 bytes.
> Since int is perfectly reasonable here, AT had to snip the context to
> go off on one. The common thread here is that AT will go off on one
> no matter what one says.

While in general this is true, I don't think I agree in this case.
It is quite possible for programs to change over time, and sooner or
later, the assignment to a is going to be twenty lines from the printf,
and it's going to change into a calculation of the expected size of a
gigantic matrix or something, and we're going to want to print the
size as %zu (or %lu).

Basically, in my experience, it's always been better to do it right even
when I'm *totally* sure it doesn't matter, because code has a disconcerting
way of surviving until my shortcuts matter.
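
As a rough illustration of the two options mentioned above, here is a sketch that formats a size_t both ways. The function names are invented for this example; the first assumes a C99 library (for %zu and snprintf), the second sticks to what C90 offers.

```c
#include <stddef.h>
#include <stdio.h>

/* C99 and later: %zu is the conversion specifier for size_t. */
int format_size_c99(char *buf, size_t bufsize, size_t value)
{
    return snprintf(buf, bufsize, "%zu", value);
}

/* C90 fallback: convert to unsigned long and print with %lu.
   buf must be large enough for any unsigned long plus '\0'. */
int format_size_c90(char *buf, size_t value)
{
    return sprintf(buf, "%lu", (unsigned long)value);
}
```

Either version keeps working if the value later grows past INT_MAX, which is the scenario described above; the %lu variant can still lose information on an implementation where size_t is wider than unsigned long.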

Jens Thoms Toerring

unread,

Jan 17, 2010, 8:25:49 PM1/17/10

to

Please keep in mind that this was not about your process_directory()
function posted here but a function getdir() by a different poster
(in c.u.p) with perhaps some rather different goals than yours.
Your function returns just the count of files that have been found,
while the original getdir() function was meant to set up a list of
file names in a directory. So that function's declaration was (nearly)

int getdir( const char * path, char *** list );

while your process_directory() function has more or less

unsigned int process_directory( const char * path );

Since your function doesn't return a list of file names the
whole thing is quite a bit different.

If you return an array from a function the basic question is:
do I need to keep track of the address of the array *and* the
number of its elements? If an array can't have an element that
tells 'this element is invalid and thus can signify the end of
the array' then you have no alternative to keeping track of the
array's address *and* the number of its elements. But if you
have e.g. an array of pointers there are situations where a NULL
element can be used to indicate 'this is the last (and invalid)
element', and in that case you can use such an element as a
'sentential' (like the '\0' character at the end of a char array
is used to mark the end of a string).
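
A minimal sketch of what walking such a NULL-terminated array of pointers looks like, in the same way strlen() walks a string up to its '\0'. The function name list_length is made up for this illustration.

```c
#include <stddef.h>

/* Counts the entries of a NULL-terminated array of strings.
   The terminating NULL marks the end and is not counted. */
size_t list_length(char *const *list)
{
    size_t n = 0;

    while (list[n] != NULL) {
        n++;
    }
    return n;
}
```

With char *names[] = { "a.txt", "b.txt", NULL }; the call list_length(names) yields 2, so the caller never has to carry a separate element count around.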

Regards, Jens
--
\ Jens Thoms Toerring ___ j...@toerring.de
\__________________________ http://toerring.de

steve

unread,

Jan 17, 2010, 8:47:57 PM1/17/10

to

On Jan 17, 12:24 pm, jacob navia <ja...@nospam.org> wrote:

>
> NVIDIA has proposed a new 16 bit floating point format
> (used now internally in their GPUs).
>

Is there a problem with binary16 as defined in the
IEEE754 standard?

--
steve

Keith Thompson

unread,

Jan 17, 2010, 11:50:31 PM1/17/10

to

jacob navia <ja...@nospam.org> writes:
> frank wrote:
[...]

>> It seems like floats have floated away. I can't think of a reason a
>> person would want a float as opposed to a double: anything that
>> float can represent is also represented by a double.
>>
>> 6.3.1.7 is a good place to read up on this.
>
> Please, do not generalize too much. Floats use 50% of the memory
> needed by a double, and when memory is important and precision is not,
> it is better to use float.

[...]

Speaking of generalizing too much:

float is *typically* half the size of double, but the only guarantee
is that double is no smaller than float (actually that it has at least
the same range and precision). I've worked on systems where float and
double were the same size.

Still your advice is correct most of the time and probably harmless
the rest of the time.
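
One way to see what a given implementation actually provides is to look at <float.h>. The sketch below uses made-up function names; it prints the sizes and checks the relationships that follow from the rule that every float value is also representable as a double. It deliberately checks nothing about the sizes being different.

```c
#include <float.h>
#include <stdio.h>

/* Prints the storage size and decimal precision of both types. */
void report_float_vs_double(void)
{
    printf("float : %lu bytes, %d decimal digits\n",
           (unsigned long)sizeof(float), FLT_DIG);
    printf("double: %lu bytes, %d decimal digits\n",
           (unsigned long)sizeof(double), DBL_DIG);
}

/* Returns 1 if double has at least the precision and maximum of float. */
int double_covers_float(void)
{
    return FLT_DIG <= DBL_DIG && FLT_MAX <= DBL_MAX;
}
```

On common desktop systems the report shows 4 and 8 bytes, but an implementation where both lines report the same size is equally conforming.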

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

frank

unread,

Jan 18, 2010, 1:55:49 AM1/18/10

to

It was my first try, mostly borrowed from something I found while
googling. It seems to me that your ideas that you had for someone
else in c.u.p. were closer to what I'm looking to do.

>
> If you return an array from a function the basic question is:
> do I need to keep track of the address of the array *and* the
> number of it elements? If an array can't have an element that
> tells 'this elememt is invalid and thus can signify the end of
> the array' then you have no alternative to keeping track of the
> array's address *and* the number of its arguments. But if you
> have e.g. an array of pointers there are situations where a NULL
> element can be used to indicate 'this is the last (and invalid)
> element', and in that case you can use such an element as a
> 'sentential' (like the '\0' character at the end of a char array
> is used to mark the end of a string).

Ok. Well this had been a big problem for me when I was declaring
things from a caller. I'll need to make a mock-up in standard c with
a similar function like getText, where the items are returned in a
list.

For the getDir function that you envision, is this an appropriate
declaration:

char * getDir( char * pathName);

"Sentinel" seems to be a word that non-native English writers struggle
with. It means something like "guard." I delivered the "sentinel
tribune" growing up, as a paperboy, which might be a job that doesn't
exist anymore.

Gruss aus Amiland,
--
frank

Richard Heathfield

unread,

Jan 18, 2010, 3:36:59 AM1/18/10

to

One problem with killfiling trolls is that, when they post misleading
information, I usually don't even get to see it, so I don't get to
correct it. The above seems to have nearly been one such case. John, you
are right, and Twink is wrong. He is, however, *nearly* right on this
occasion, which makes a pleasant change from what I remember. Casting to
an int is indeed especially stupid, but if you want to pass a size_t to
printf to match a %d format specifier, you have little option but to
cast it to an int. Thus, the advice I gave was 100% correct in every
particular.

Twink's %zu specifier is of course a non-starter for those who cannot
guarantee to depend on the foibles of particular compilers, as he
himself recognises. Casting to unsigned long and using %lu is, however,
a far better option than using %d and casting to int. In that respect
and that respect only, Twink is correct - and may I say how delighted I
am to be able to report at last that he's actually got one tiny little
part of an article right. It's a start. It's taken several years, but
it's a start. Nevertheless, he stays in my killfile until he learns how
to be a civilised human being.

Richard Heathfield

unread,

Jan 18, 2010, 3:41:30 AM1/18/10

to

Seebs wrote:

<snip>

> Heathfield's advice was correct, technically -- if you want to match a %d
> format specifier, yes, cast to int.

Right.

> But Twink's position seems reasonable to me;
> if you really do need to support systems without %zu, then %lu/unsigned long
> is probably the next best thing.

Also right. Unfortunately, it isn't possible to give an entirely
comprehensive response to every single article. We take shortcuts,
either through time pressure or through simple humanness. Sometimes we
take shortcuts that seem reasonable to some observers but not to others.
And sometimes the opinion of an observer on whether the shortcut is a
reasonable one will depend on his opinion of the shortcutter.

Michael Foukarakis

unread,

Jan 18, 2010, 4:10:24 AM1/18/10

to

On Jan 17, 12:53 am, Richard Heathfield <r...@see.sig.invalid> wrote:
> > 2) Why do I not get 96.03 as the wiki promises?
>
> C does not define the behaviour of a program whose behaviour is
> undefined. So any result is okay, including the Wiki's result and any
> other result (or no result at all).

Puh-lease. :-) If only that were true, then exploits would fail half
the time, only because stack smashing invokes UB. Heh.

OP, compile the example with -fno-stack-protector and/or
-D_FORTIFY_SOURCE=0. You will get your desired result then. Read up on
stack canaries and GCC's stack protection schemes to understand why
your current setup fails (there's code that detects you've overwritten
something on the stack and produces the "*** stack smashing detected
***" message along with the diagnostic backtrace/mmap).
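
To give a feel for the canary idea, here is a toy model only. It is not how GCC implements -fstack-protector (the compiler places its guard value in the real stack frame and checks it before the function returns); the struct, the names, and the guard value below are all invented for this sketch, and the copy is clamped to the struct so the toy itself stays within defined behaviour.

```c
#include <stddef.h>
#include <string.h>

#define TOY_CANARY 0xDEADBEEFUL /* arbitrary guard value for this sketch */

struct toy_frame {
    char buf[12];         /* the buffer a careless copy may overrun */
    unsigned long canary; /* guard value stored after the buffer */
};

void toy_frame_init(struct toy_frame *f)
{
    memset(f, 0, sizeof *f);
    f->canary = TOY_CANARY;
}

/* Copies len bytes starting at buf WITHOUT checking sizeof f->buf,
   mimicking an unchecked strcpy. The copy is clamped to the end of
   the struct so that only the toy frame can be damaged. */
void toy_unchecked_copy(struct toy_frame *f, const void *data, size_t len)
{
    size_t room = sizeof *f - offsetof(struct toy_frame, buf);

    if (len > room) {
        len = room;
    }
    memcpy((unsigned char *)f + offsetof(struct toy_frame, buf), data, len);
}

/* Returns 1 while the guard value is untouched, 0 once it was overwritten. */
int toy_frame_intact(const struct toy_frame *f)
{
    return f->canary == TOY_CANARY;
}
```

A short copy leaves toy_frame_intact() returning 1; a copy that runs over the whole struct flips it to 0, which is the moment a real protector would abort with the "stack smashing detected" message instead of returning.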

Seebs

unread,

Jan 18, 2010, 4:02:56 AM1/18/10

to

On 2010-01-18, Richard Heathfield <r...@see.sig.invalid> wrote:

> Seebs wrote:
>> Heathfield's advice was correct, technically -- if you want to match a %d
>> format specifier, yes, cast to int.

> Right.

I think the issue is that an uncharitable or hostile reader might ignore
the qualifier or misunderstand it. I assumed that the qualifier meant
exactly what it said; if you need to pass something so that it will print with
%d, cast it to int. I didn't infer any particular assertion as to whether
or not trying to print sizes with %d was a good idea; merely an observation
about how one would go about it.

> Also right. Unfortunately, it isn't possible to give an entirely
> comprehensive response to every single article. We take shortcuts,
> either through time pressure or through simple humanness. Sometimes we
> take shortcuts that seem reasonable to some observers but not to others.
> And sometimes the opinion of an observer on whether the shortcut is a
> reasonable one will depend on his opinion of the shortcutter.

Yes. In this case, I think, a reasonable reader familiar with your
background would not imagine that you were suggesting that %d was the
right choice for printing a size, but a hostile reader might choose to
read it that way to try to score points.

Jens Thoms Toerring

unread,

Jan 18, 2010, 4:22:58 AM1/18/10

to

frank <abqha...@gmail.com> wrote:
> For the getDir function that you envision, is this an appropriate
> declaration:

> char * getDir( char * pathName);

Nearly - but since it's supposed to return an array of strings
it has to be

char ** getDir( char * pathName);

And I probably would throw in a 'const' before 'char * pathName'
if the function doesn't change that string.
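
Putting the pieces together, a function with that declaration might look roughly like the sketch below. It assumes a POSIX system (<dirent.h> is not standard C), returns every entry that readdir() reports (usually including "." and ".."), and pairs getDir() with the free_dir_list() function suggested earlier. The body is one possible illustration, not the code from the c.u.p thread.

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Frees a NULL-terminated list as returned by getDir(). */
void free_dir_list(char **list)
{
    size_t i;

    if (list == NULL) {
        return;
    }
    for (i = 0; list[i] != NULL; i++) {
        free(list[i]);
    }
    free(list);
}

/* Returns a NULL-terminated array of entry names, or NULL on error. */
char **getDir(const char *pathName)
{
    DIR *dir = opendir(pathName);
    struct dirent *entry;
    char **list;
    size_t count = 0;

    if (dir == NULL) {
        return NULL;
    }
    list = malloc(sizeof *list); /* room for the sentinel only */
    if (list == NULL) {
        closedir(dir);
        return NULL;
    }
    list[0] = NULL;

    while ((entry = readdir(dir)) != NULL) {
        size_t len = strlen(entry->d_name) + 1;
        char *name = malloc(len);
        /* one slot for the new name, one for the sentinel */
        char **grown = realloc(list, (count + 2) * sizeof *list);

        if (grown != NULL) {
            list = grown;
        }
        if (name == NULL || grown == NULL) {
            free(name);
            free_dir_list(list); /* list is still NULL-terminated here */
            closedir(dir);
            return NULL;
        }
        memcpy(name, entry->d_name, len);
        list[count++] = name;
        list[count] = NULL;
    }
    closedir(dir);
    return list;
}
```

The caller then loops with for (i = 0; list[i] != NULL; i++) and hands the whole thing to free_dir_list() when done, without ever needing a separate count.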

> "Sentinel" seems to be a word that non-native English writers struggle
> with.

At least at half past two in the morning;-)

Nick Keighley

unread,

Jan 18, 2010, 4:34:08 AM1/18/10

to

On 17 Jan, 06:40, Richard Heathfield <r...@see.sig.invalid> wrote:

> frank wrote:
> > Ben Bacarisse wrote:
> >> frank <fr...@example.invalid> writes:

> >>> (void) strncpy (pathName, theDir, PATH_SIZE);
> >>> (void) strncat (pathName, "/", PATH_SIZE);
> >>> (void) strncat (pathName, entry.d_name, PATH_SIZE);
>
> >> Just a quick heads up: strncpy is almost always the wrong function to
> >> use. For one thing, you may end up with something that is not a
> >> string.
>
> > Ben, I'm not strong at all with string processing in C. I see no
> > warnings about strncpy in H&S. What should I use instead?

if you read the documentation for strncpy() you will find it is rather odd.
strncpy() is designed for copying small fixed arrays of characters that
were not necessarily zero terminated. I believe unix has (or had) such
things.

char thing_name[8] = "AAAABBBB";
the above is not nul terminated in C (C++ is different) because the
number of characters in the initialiser exactly fits.

strncpy (thing_name, "NEW_NAME", 8);
copies exactly 8 characters. No nul on the end.

strncpy (thing_name, "SMALL", 8);
copies 5 characters and appends three nuls.

This all makes sense but it often isn't what people expect. The strncpy()
of "NEW_NAME" means that thing_name isn't a valid C string after
the call! There's no terminator. And that padding habit can go wildly
wrong.

char buffer[100];
strncpy (buffer, "tmp/", 100);

avoids a call of strlen() on "tmp/" (saving 5 character reads) but
tags 96 nuls onto the end of buffer.

One possibility is to write a function that does what you might expect
strncpy() to do. (but don't call it str*() because that invades a
namespace reserved for the implementation).
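
A sketch of such a function, with a small checker for the missing terminator. Both names (copy_string, is_terminated) are hypothetical, and truncating silently is just one possible policy; some callers would rather treat "does not fit" as an error.

```c
#include <string.h>

/* Returns 1 if the first n bytes of buf contain a '\0', else 0. */
int is_terminated(const char *buf, size_t n)
{
    return memchr(buf, '\0', n) != NULL;
}

/* Copies src into dst, writing at most dstsize bytes. The result is
   always nul terminated and never padded. Returns 1 if src fit
   completely, 0 if it was truncated (or dstsize was 0). */
int copy_string(char *dst, size_t dstsize, const char *src)
{
    size_t len = strlen(src);
    int fits = 1;

    if (dstsize == 0) {
        return 0;
    }
    if (len >= dstsize) {
        len = dstsize - 1; /* leave room for the terminator */
        fits = 0;
    }
    memcpy(dst, src, len);
    dst[len] = '\0';
    return fits;
}
```

After strncpy (thing_name, "NEW_NAME", 8), is_terminated(thing_name, 8) reports 0; after copy_string(thing_name, 8, "NEW_NAME") the array holds the valid string "NEW_NAM" and the return value tells the caller that something was cut off.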

> In this case, malloc (to build a sufficiently long buffer), followed by
> a check to ensure it succeeded, followed either by error handling or
> sprintf.

(I was going to post code, but Richard has done it else-thread)

--
Murphy's Law of Thermodynamics
Things get worse under pressure.

jacob navia

unread,

Jan 18, 2010, 4:34:59 AM1/18/10

to

Jens Thoms Toerring wrote:

> If you return an array from a function the basic question is:
> do I need to keep track of the address of the array *and* the
> number of it elements? If an array can't have an element that
> tells 'this elememt is invalid and thus can signify the end of
> the array' then you have no alternative to keeping track of the
> array's address *and* the number of its arguments. But if you
> have e.g. an array of pointers there are situations where a NULL
> element can be used to indicate 'this is the last (and invalid)
> element', and in that case you can use such an element as a
> 'sentential' (like the '\0' character at the end of a char array
> is used to mark the end of a string).
>
> Regards, Jens

And here we see (AGAIN) how useful it would be to have in C a
standard array container that can be used in situations like this!

That is one of the principal motivations of the containers library.

jacob

jacob navia

unread,

Jan 18, 2010, 4:38:25 AM1/18/10

to

steve wrote:

No, I think the 16 bit float corresponds to some IEEE754
type. I researched this a while ago but I do not have the details
right now.

The next pentium generation will have those built in, that is why
I researched that.

Michael Foukarakis

unread,

Jan 18, 2010, 4:48:32 AM1/18/10

to

It does. When using/building containers, you have two ways of
signifying the end of data; either a sentinel value that denotes the
end of available items (like NULL, or whatever) or a value that tells
you the "length" or "size" of your container. Both are perfectly
acceptable, with their own sets of problems. :-)

John Bode

unread,

Jan 18, 2010, 12:12:14 PM1/18/10

to

> As usual you are completely wrong Twink. Why dont you FOD?

Twink's not wrong. For *this specific case*, it's highly unlikely
that the length of the string will exceed the maximum value for a
signed integer, but in general it's best to treat size_t values as
unsigned, potentially longer than int.

Note that Heathfield included the caveat "If you want to pass a size_t
to printf to match a %d format specifier," so his advice was correct
as far as it went, however I think the OP would have been better
served if Heathfield had given the same advice Twink gave (minus the
personal animosity towards the larger clc community): i.e., use "%lu"
and cast to unsigned long in C89, use %zu in C90.

John Bode

unread,

Jan 18, 2010, 12:16:21 PM1/18/10

to

On Jan 18, 11:12 am, John Bode <jfbode1...@gmail.com> wrote:
> On Jan 17, 3:53 pm, john <j...@nospam.com> wrote:
>
> > Antoninus Twink wrote:
> > > On 16 Jan 2010 at 22:53, Richard Heathfield wrote:
> > >> frank wrote:
> > >>> printf("a is %d\n", a);
>
> > >> size_t is an unsigned integral type. If you want to pass a size_t to
> > >> printf to match a %d format specifier, cast it to int.
>
> > > This is exceptionally poor advice - and as usual, the famous "clc peer
> > > review" that would have seen half a dozen people piling in to point out
> > > that the resulting int might have been a trap representation or some
> > > such nonsense if a newbie had posted this remains strangely silent when
> > > it is one of their chums who's boobooed.
>
> > > Casting to an int so that you can use %d is utterly stupid when you can
> > > just use the %zu format specifier and not risk printing nonsense in the
> > > quite likely event that size_t is wider than int.
>
> > > If you happen to be stuck with C90 (and hopefully you've got a better
> > > reason for this than Heathfield's one of irrational prejudice), then
> > > using %lu and casting to unsigned long int is a much better option than
> > > using %d and casting to int.
>
> > As usual you are completely wrong Twink. Why dont you FOD?
>
> Twink's not wrong. For *this specific case*, it's highly unlikely
> that the

size of type float

> will exceed the maximum value for a
> signed integer, but in general it's best to treat size_t values as
> unsigned, potentially longer than int.
>

Fixed because I'm a moron who doesn't always pay attention to what
he's responding to.

Keith Thompson

unread,

Jan 18, 2010, 12:15:58 PM1/18/10

to

Michael Foukarakis <electr...@gmail.com> writes:
> On Jan 17, 12:53 am, Richard Heathfield <r...@see.sig.invalid> wrote:
>> > 2) Why do I not get 96.03 as the wiki promises?
>>
>> C does not define the behaviour of a program whose behaviour is
>> undefined. So any result is okay, including the Wiki's result and any
>> other result (or no result at all).
>
> Puh-lease. :-) If only that were true, then exploits would fail half
> the time, only because stack smashing invokes UB. Heh.

Ah, but it is true. Any result is okay in the sense that it's
permitted by the standard. That includes getting consistent results
on a particular system, which is what exploits generally take
advantage of.

[...]

Antoninus Twink

unread,

Jan 18, 2010, 3:23:32 PM1/18/10

to

On 18 Jan 2010 at 8:41, Richard Heathfield wrote:

> Seebs wrote:
>> Heathfield's advice was correct, technically -- if you want to match
>> a %d format specifier, yes, cast to int.
>
> Right.

Read again.

I never said that your advice was incorrect. I said it was exceptionally
poor advice.

Consider this example:

"Q: Why do I get a compiler warning for this code?
const char *s;
gets(s);
A: You need to remove the const qualifier if you want to pass s to gets()."

Perfectly correct advice, but also exceptionally poor advice.

> We take shortcuts, either through time pressure or through simple
> humanness. Sometimes we take shortcuts that seem reasonable to some
> observers but not to others.

Yes, and most people by instinct make allowances for this perfectly
normal part of human nature.

You, on the other hand, have made a career out of humiliating newbies
and pissing off old hands by an excessively literal reading of their
posts and a refusal to read between the lines in exactly the way you
describe.

I'm happy to be able to give you a taste of your own medicine once in a
while.

Antoninus Twink

unread,

Jan 18, 2010, 3:26:01 PM1/18/10

to

On 18 Jan 2010 at 9:02, Seebs wrote:
> I think the issue is that an uncharitable or hostile reader might ignore
> the qualifier or misunderstand it.

Yes - and the very person who delights in being an uncharitable or
hostile reader when others are writing is none other than one Dicky
Heathfield.

Antoninus Twink

unread,

Jan 18, 2010, 3:28:57 PM1/18/10

to

On 18 Jan 2010 at 17:12, John Bode wrote:
> use "%lu" and cast to unsigned long in C89, use %zu in C90.

AFAIK, neither C89 nor C90 (which are for all practical purposes
identical) defines %zu.

frank

unread,

Jan 19, 2010, 12:48:30 AM1/19/10

to

Keith Thompson wrote:
> Michael Foukarakis <electr...@gmail.com> writes:
>> On Jan 17, 12:53 am, Richard Heathfield <r...@see.sig.invalid> wrote:
>>>> 2) Why do I not get 96.03 as the wiki promises?
>>> C does not define the behaviour of a program whose behaviour is
>>> undefined. So any result is okay, including the Wiki's result and any
>>> other result (or no result at all).
>> Puh-lease. :-) If only that were true, then exploits would fail half
>> the time, only because stack smashing invokes UB. Heh.
>
> Ah, but it is true. Any result is okay in the sense that it's
> permitted by the standard. That includes getting consistent results
> on a particular system, which is what exploits generally take
> advantage of.

It took me a while in this thread to figure out a) what stack-smashing
is and b) why my program was doing it.

When I worked up the example from the wiki in the original post, with
the 10.5 and the 96.1, one thing notable was that it didn't change that
value on my (ubuntu) implementation. It didn't succeed in changing that
value irrespective of how many characters I used in the buffer that was
being overrun. The OS detected it and refused to write to the stack
data inappropriately.

Also in researching this, I found out some sentinel values used by
stacks to prevent this type of hack. My friend mentioned 0xdeadbeef as
one such value.
--
frank

jaysome

unread,

Jan 19, 2010, 1:53:43 AM1/19/10

to

On Sun, 17 Jan 2010 09:54:55 +0000 (UTC), Antoninus Twink
<nos...@nospam.invalid> wrote:

>On 16 Jan 2010 at 22:53, Richard Heathfield wrote:
>> frank wrote:
>>> printf("a is %d\n", a);
>>
>> size_t is an unsigned integral type. If you want to pass a size_t to
>> printf to match a %d format specifier, cast it to int.
>
>This is exceptionally poor advice - and as usual, the famous "clc peer
>review" that would have seen half a dozen people piling in to point out
>that the resulting int might have been a trap representation or some
>such nonsense if a newbie had posted this remains strangely silent when
>it is one of their chums who's boobooed.

I disagree that it's "exceptionally poor advice", and would go further
to argue that it's sane advice, in most cases.

The C standard guarantees that INT_MAX is at least 32767, and the size
of any scalar type will always be less than this (at least in the real
world). Rather than casting the result of sizeof to "unsigned long",
it's simply easier to cast it to "int". In the instant case, we know
that the result of "sizeof(float)" is guaranteed to fit within type
int (again, in the real world).

In all of my years of development, I've never run into a case where
casting the return value of sizeof to "int" in a printf statement has
ever been a problem. If there were cases in which sizeof returned a
value greater than 32767, I was working on a platform in which the
compiler used 32-bit int, so it was not a problem. And if I had ever
run into a case in which sizeof returned a value greater than 32767
and I was working on a platform in which the compiler used 16-bit int
(e.g., on some embedded devices), then, admittedly, I had bigger fish
to fry (like I don't even have 16K let alone 4K of RAM and thus the
printf code was never executed).

I find printf very useful in test programs to print out the size of my
user-defined (e.g., structure) types, and the pattern I use is:

printf("sizeof(T) is %d\n", (int)sizeof(T));

where T is my user-defined type.

--
jay

Michael Foukarakis

unread,

Jan 19, 2010, 2:01:14 AM1/19/10

to

On Jan 18, 7:15 pm, Keith Thompson <ks...@mib.org> wrote:

> Michael Foukarakis <electricde...@gmail.com> writes:
> > On Jan 17, 12:53 am, Richard Heathfield <r...@see.sig.invalid> wrote:
> >> > 2) Why do I not get 96.03 as the wiki promises?
>
> >> C does not define the behaviour of a program whose behaviour is
> >> undefined. So any result is okay, including the Wiki's result and any
> >> other result (or no result at all).
>
> > Puh-lease. :-) If only that were true, then exploits would fail half
> > the time, only because stack smashing invokes UB. Heh.
>
> Ah, but it is true. Any result is okay in the sense that it's
> permitted by the standard. That includes getting consistent results
> on a particular system, which is what exploits generally take
> advantage of.

Exactly - stack smashing is only related to the C standard until we
overrun a buffer; after that, there's a whole other system of rules to
break - not as meticulously defined or standardized as C (or any
programming language) but it's what makes it fun to break. :)

Nick Keighley

unread,

Jan 19, 2010, 3:44:36 AM1/19/10

to

[as previous post but with (hopefully!) fewer typos]

On 18 Jan, 09:34, Nick Keighley <nick_keighley_nos...@hotmail.com>
wrote:

> On 17 Jan, 06:40, Richard Heathfield <r...@see.sig.invalid> wrote:
> > frank wrote:
> > > Ben Bacarisse wrote:
> > >> frank <fr...@example.invalid> writes:
>
> > >>> (void) strncpy (pathName, theDir, PATH_SIZE);
> > >>> (void) strncat (pathName, "/", PATH_SIZE);
> > >>> (void) strncat (pathName, entry.d_name, PATH_SIZE);
>
> > >> Just a quick heads up: strncpy is almost always the wrong function to
> > >> use. For one thing, you may end up with something that is not a
> > >> string.
>
> > > Ben, I'm not strong at all with string processing in C. I see no
> > > warnings about strncpy in H&S. What should I use instead?
>

> if you read the documentation for strncpy() you will [find] it is rather odd.
> [strncpy()] is designed for copying small fixed arrays of characters that
> were not necessarily zero terminated. I believe unix has (or had) such
> things.
>
> char thing_name[8] = "AAAABBBB";
> the above is not nul terminated in C (C++ is different) because the
> number of characters in the initialiser exactly fits.
>
> strncpy (thing_name, "NEW_NAME", 8);
> copies exactly 8 characters. No nul on the end.
>
> strncpy (thing_name, "SMALL", 8);
> copies 5 characters and appends three nuls.
>
> This all makes sense but it often isn't what people expect. The strncpy()
> of "NEW_NAME" means that thing_name isn't a valid C string after
> the call! There's no terminator.
>
> And that padding habit can go wildly wrong.
> char buffer[100];
> strncpy (buffer, "tmp/", 100);
>
> avoids a call of strlen() on "tmp/" (saving 5 character reads) but
> tags 96 nuls onto the end of buffer.

optimising gnats and pessimising camels.

> One possibility is to write a function that does what you might expect
> strncpy() to do. (but don't call it strsomething() because that invades a
> namespace reserved for the implementation).

<snip>

Keith Thompson

unread,

Jan 19, 2010, 3:44:02 AM1/19/10

to

Michael Foukarakis <electr...@gmail.com> writes:
> On Jan 18, 7:15 pm, Keith Thompson <ks...@mib.org> wrote:
>> Michael Foukarakis <electricde...@gmail.com> writes:

[...

>> > Puh-lease. :-) If only that were true,

[...]

>> Ah, but it is true.

[...]
> Exactly
[...]

Did I miss something?

Michael Foukarakis

unread,

Jan 19, 2010, 3:50:22 AM1/19/10

to

On Jan 19, 10:44 am, Keith Thompson <ks...@mib.org> wrote:
> Michael Foukarakis <electricde...@gmail.com> writes:
> > On Jan 18, 7:15 pm, Keith Thompson <ks...@mib.org> wrote:
> >> Michael Foukarakis <electricde...@gmail.com> writes:
> [...
> >> > Puh-lease. :-) If only that were true,
> [...]
> >> Ah, but it is true.
> [...]
> > Exactly
>
> [...]
>
> Did I miss something?

I don't think so - I was just picking up where your last sentence left
off. :S

Nick Keighley

unread,

Jan 19, 2010, 4:03:05 AM1/19/10

to

On 17 Jan, 10:37, Nobody <nob...@nowhere.com> wrote:

> [Where a function takes a pointer to a buffer and the size of the buffer,
> it's usually a good idea to use sizeof rather than the macro you used when
> defining the buffer. That way, you don't have to update the rest of the
> code when you change the declaration.]

as long as we're talking about chars that is. If buffer is made up of
some other type it's often better to pass the number of elements
rather than the size.

I use
#define ARRAY_SIZE(A) (sizeof(A)/sizeof(A[0]))
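
A small sketch of how that macro is typically used when the elements are not chars. The function sum() is made up for this example, and the macro is repeated here (with the argument parenthesised) so the fragment stands on its own.

```c
#include <stddef.h>

#define ARRAY_SIZE(A) (sizeof(A) / sizeof((A)[0]))

/* Sums count doubles; count is a number of elements, not of bytes. */
double sum(const double *values, size_t count)
{
    double total = 0.0;
    size_t i;

    for (i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}
```

A call such as sum(v, ARRAY_SIZE(v)) keeps working when the declaration of v changes. Note that the macro only gives the right answer for a real array; inside sum(), where values is a pointer, it would silently compute something else.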

Ersek, Laszlo

unread,

Jan 19, 2010, 6:34:14 AM1/19/10

to

In article <eujal512n73u1pl3r...@4ax.com>, jaysome <jay...@spamcop.net> writes:

> The C standard guarantees that INT_MAX is at least 32767, and the size
> of any scalar type will always be less than this (at least in the real
> world). Rather than casting the result of sizeof to "unsigned long",
> it's simply easier to cast it to "int". In the instant case, we know
> that the result of "sizeof(float)" is guaranteed to fit within type
> int (again, in the real world).

(Topic change.) This "int vs. size_t" question makes me remember what I
don't really like about the printf() family:

- The return value signals the number of characters transmitted (if no
error occurred). While strlen() returns a size_t, printf() and co.
return an int.

- Same for %n.

- Same for the "*" field width and precision.

(C99 7.19.6.1 The fprintf function, p15:

----v----
Environmental limits

The number of characters that can be produced by any single conversion
shall be at least 4095.
----^----

Does this mean one can't portably pass a string to a single %s if
strlen() returns at least 4096 for that string?)

I have the (very superficial) impression that unsigned integers are
historically very under-used in favor of signed integers. For example,
the (not standard C) BSD socket interfaces historically took a lot of
"size parameters" (obvious candidates for the sizeof operator) as int's:

- accept(): 3rd param
- bind(): 3rd param
- connect(): 3rd param
- getpeername(): 3rd param
- getsockname(): 3rd param
- getsockopt(): 5th param
- recvfrom(): 6th param
- sendto(): 6th param
- setsockopt(): 5th param

If one opens the manual page for accept() on a GNU/Linux distribution,
something like this should come up:

----v----
The third argument of accept was originally declared as an `int *' (and
is that under libc4 and libc5 and on many other systems like BSD 4.*,
SunOS 4, SGI); a POSIX 1003.1g draft standard wanted to change it into a
`size_t *', and that is what it is for SunOS 5. Later POSIX drafts
have `socklen_t *', and so do the Single Unix Specification and glibc2.
Quoting Linus Torvalds: _Any_ sane library _must_ have "socklen_t" be
the same size as int. Anything else breaks any BSD socket layer stuff.
POSIX initially _did_ make it a size_t, and I (and hopefully others,
but obviously not too many) complained to them very loudly indeed.
Making it a size_t is completely broken, exactly because size_t very
seldom is the same size as "int" on 64-bit architectures, for example.
And it _has_ to be the same size as "int" because that's what the BSD
socket interface is. Anyway, the POSIX people eventually got a clue,
and created "socklen_t". They shouldn't have touched it in the first
place, but once they did they felt it had to have a named type for some
unfathomable reason (probably somebody didn't like losing face over
having done the original stupid thing, so they silently just renamed
their blunder).
----^----

I believe historically wrong parameter types should be fixed in
standards, and it was right to change those parameter types to size_t in
the first place, because size_t is the return type of the sizeof
operator. If for whatever reason it was necessary to match int's size,
"unsigned" would still fit better than "int".

Now if one writes code simultaneously for SUSv1 (UNIX 95) and SUSv2
(UNIX 98) or later, all such function calls need preprocessor magic or
the following ugly but useful "technique":

{
struct msghdr dummy;
struct sockaddr_in addr;
int acc_sock;

dummy.msg_namelen = sizeof addr;
acc_sock = accept(sock, (struct sockaddr *)&addr, &dummy.msg_namelen);
}

This works because the type of the msg_namelen member changed from size_t
(SUSv1) to socklen_t (SUSv2+) in parallel with the other parameters listed
above. (Not surprisingly, as msg_namelen also communicates an address
size.)
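
On systems that already provide socklen_t (SUSv2 and later), the dummy
structure is not needed. A minimal sketch, assuming POSIX headers;
accept_peer is a made-up helper name, not part of any API:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: accept one IPv4 connection on listen_fd and
 * store the peer address in *peer. Returns the new descriptor, or -1
 * on error (errno is set by accept()). */
int accept_peer(int listen_fd, struct sockaddr_in *peer)
{
  /* In/out parameter: buffer size on entry, address size on return. */
  socklen_t len = sizeof *peer;

  return accept(listen_fd, (struct sockaddr *)peer, &len);
}
```

This only compiles where socklen_t exists, which is exactly the
portability problem described above.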

Similarly, fcntl() / F_SETFL (not part of standard C) lets or requires the
programmer to manipulate a bitmask. Why is the mask represented as an
int, instead of an unsigned? Let's suppose we possibly opened a FIFO in
nonblocking mode and now we want to ensure blocking behavior:

{
  int opts;

  opts = fcntl(fd, F_GETFL);
  if (-1 == opts /* query could have been incorporated here */
      || -1 == fcntl(fd, F_SETFL, opts & ~O_NONBLOCK)) {
    (void)fprintf(stderr, "%s: fcntl(): %s\n", progname, strerror(errno));
  }
}

O_NONBLOCK is positive (because the value returned by fcntl() / F_GETFL
is non-negative if no error occurs). It must be representable by an int
(see return type again), thus it is not promoted above int within the
bitwise complement operator. ~O_NONBLOCK will not be a trap
representation, but it will probably be negative. Then we BIT-AND that
negative value with the current flags. Ugly.
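
One way to sidestep the negative operand is to do the bit arithmetic on
unsigned values and convert back only for the call. A minimal sketch,
assuming a POSIX system; set_blocking is a made-up helper name:

```c
#include <fcntl.h>

/* Hypothetical helper: clear O_NONBLOCK on fd.
 * Returns 0 on success, -1 on error (errno is set by fcntl()). */
int set_blocking(int fd)
{
  int opts = fcntl(fd, F_GETFL);
  unsigned int mask;

  if (opts == -1) {
    return -1;
  }
  /* opts is non-negative here, so converting it to unsigned keeps its
   * value; the complement and the AND never see a negative operand. */
  mask = (unsigned int)opts & ~(unsigned int)O_NONBLOCK;

  /* mask <= opts <= INT_MAX, so converting back to int is exact. */
  return fcntl(fd, F_SETFL, (int)mask) == -1 ? -1 : 0;
}
```

The interface still traffics in int; the casts only keep the
intermediate computation away from signed bit patterns.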

-o-

read() and write() take a size_t parameter for the number of bytes to be
read/written, but return ssize_t so that -1 can be returned to signal an
error. Consequently, only min { actual number of bytes, SSIZE_MAX } can
be passed.

I think all such functions should return -1 for error or 0 for success,
and store the actual output through a programmer-supplied pointer.

int made_up_fprintf(size_t *wr, FILE *strm, const char *fmt, ...);
int made_up_read(size_t *rd, int fd, void *buf, size_t nbyte);

("restrict" omitted for simplicity.)

fcntl() / F_GETFL takes a variable number of arguments anyway, so it
could return the current file status flags and file access modes through
a pointer using the current prototype. The type carrying the flags
should be unsigned int.
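
The made_up_read() prototype above can be approximated as a thin wrapper
over read(). This is only a sketch under the assumption of a POSIX
system: the wrapper changes the shape of the interface, but it cannot
lift the SSIZE_MAX limit of the underlying call.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical wrapper with the proposed shape: returns 0 on success
 * and -1 on error; the byte count is stored through rd on success.
 * Note: POSIX leaves read() with nbyte > SSIZE_MAX
 * implementation-defined, and this wrapper does not change that. */
int made_up_read(size_t *rd, int fd, void *buf, size_t nbyte)
{
  ssize_t n = read(fd, buf, nbyte);

  if (n == -1) {
    return -1;          /* *rd is left untouched on error */
  }
  *rd = (size_t)n;      /* n >= 0 here, so the conversion is exact */
  return 0;
}
```

A caller then tests the return value for the error and reads the count
from a separate object, so no value of the count doubles as an error
marker.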

-o-

If I'm already talking about what I perceive as illogical interfaces,
the hierarchy of the BSD socket address structures / functions is wrong.
Consider:

{
  struct sockaddr_in addr;

  addr.sin_family = AF_INET;
  addr.sin_port = htons(12345);
  addr.sin_addr.s_addr = inet_addr("192.168.1.3");
}

Part of the TCP/IP (v4) protocol stack looks like this:
1) internet layer (eg. IP addresses and transport protocol selection)
2) transport layer (eg. TCP/UDP ports, dependent on the protocol
selected above)

The very existence of the "port" notion depends on the protocol selected
at the internet layer. There are IP-based protocols other than TCP and
UDP, with a different or no "port" notion (eg. ICMP). The order of
"specialization" in an API should reflect the underlying protocol
structure.

1a) internet layer: address(es)
1b) internet layer: transport protocol
--------
2) transport layer: protocol-specific stuff, eg. port(s)

In reality, we have

socket(AF_INET, SOCK_STREAM, 0);
or socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);

or

socket(AF_INET, SOCK_DGRAM, 0);
or socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);

or similar. Then we provide local or remote IP address(es) and TCP/UDP
port(s) in the second step (bind() / connect()). This corresponds to

1b) internet layer: transport protocol
--------
1a) internet layer: address(es)
2) transport layer: protocol-specific stuff, eg. port(s)
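
The actual order of calls can be sketched as follows, assuming a POSIX
system. Both function names are made up for illustration; inet_pton() is
used instead of inet_addr() because it can report a malformed address.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical helper: fill an IPv4 endpoint (address plus port).
 * Returns 0 on success, -1 if ip is not a valid dotted-quad string. */
int make_ipv4_endpoint(struct sockaddr_in *out, const char *ip,
                       unsigned short port)
{
  memset(out, 0, sizeof *out);
  out->sin_family = AF_INET;
  out->sin_port = htons(port);
  return inet_pton(AF_INET, ip, &out->sin_addr) == 1 ? 0 : -1;
}

/* Hypothetical helper: the transport protocol is chosen first (1b),
 * and only then are address and port supplied together (1a + 2).
 * Returns the bound descriptor, or -1 on error. */
int open_udp_bound(const struct sockaddr_in *addr)
{
  int fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);

  if (fd == -1) {
    return -1;
  }
  if (bind(fd, (const struct sockaddr *)addr, sizeof *addr) == -1) {
    close(fd);
    return -1;
  }
  return fd;
}
```

Note how the port travels in the same structure as the IP address, even
though it only has meaning for the protocol picked in socket().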

I'm not saying this doesn't work -- it does and I like network
programming (see http://lacos.hu for my few tiny toys that are
nevertheless hugely useful to me). I claim that most of the BSD socket
interface is a very non-intuitive black book of magic incantations. See
"Hobbit"'s comments in the netcat source (IIRC), for example.

... I think this was a bit off-topic, sorry.

Cheers,
lacos

Ben Pfaff

Jan 19, 2010, 12:59:30 PM

jaysome <jay...@spamcop.net> writes:

> In all of my years of development, I've never run into a case where
> casting the return value of sizeof to "int" in a printf statement has
> ever been a problem. If there were cases in which sizeof returned a
> value greater than 32767, I was working on a platform in which the

> compiler used 32-bit int, so it was not a problem. [...]

Desktop and server platforms are all moving to 64-bit address
spaces, but many of these still have 32-bit int, so you may soon
run into a platform where "(unsigned) int" is not adequate for
the size of an object.
--
"Programmers have the right to be ignorant of many details of your code
and still make reasonable changes."
--Kernighan and Plauger, _Software Tools_

Antoninus Twink

Jan 19, 2010, 1:13:04 PM

On 19 Jan 2010 at 6:53, jaysome wrote:
> In the instant case, we know that the result of "sizeof(float)" is
> guaranteed to fit within type int (again, in the real world).

Regular readers of this group will know that I'm more than pragmatic
when it comes to using features or constructions that will work fine in
the real world even if they are not ISO C.

However, even I baulk at a completely gratuitous use of something
non-portable for no gain or convenience whatsoever...

> Rather than casting the result of sizeof to "unsigned long",
> it's simply easier to cast it to "int".

...except saving a few characters of typing. It smacks to me of
carelessness, and that's a bad attribute in a programmer whether they
are real-world pragmatists or clc pie-in-the-sky dreamers.

> I find printf very useful in test programs to print out the size of my
> user-defined (e.g., structure) types, and the pattern I use is:
>
> printf("sizeof(T) is %d\n", (int)sizeof(T));

If you're writing test programs and not production code, then do
whatever you like, of course! It makes no difference either way.

Even so, doesn't it appeal to your lazy instincts to be able to save a
whole 3 characters' worth of typing by using

printf("sizeof(T) is %zu\n", sizeof(T));

Richard Tobin

Jan 19, 2010, 1:38:29 PM

In article <slrnhlbthg...@nospam.invalid>,
Antoninus Twink <nos...@nospam.invalid> wrote:

>If you're writing test programs and not production code, then do
>whatever you like, of course! It makes no difference either way.

How often does the question of printing out the value of a sizeof
expression come up in production code?

>Even so, doesn't it appeal to your lazy instincts to be able to save a
>whole 3 characters' worth of typing by using
>
>printf("sizeof(T) is %zu\n", sizeof(T));

There's laziness and laziness... for many people it will be the choice
between typing 3 extra characters and looking up the format string
for a size_t.

-- Richard
--
Please remember to mention me / in tapes you leave behind.

Ben Pfaff

Jan 19, 2010, 2:25:18 PM

ric...@cogsci.ed.ac.uk (Richard Tobin) writes:

> How often does the question of printing out the value of a sizeof
> expression come up in production code?

Fairly often, at least in code that emits log messages, for
production code that deals heavily with network protocol parsing.
(Of course, this code is generally not 100% comp.lang.c-compliant
anyhow, since it typically represents protocol entities with C
structures whose data sizes are not 100% portable.)
--
Ben Pfaff
http://benpfaff.org

Seebs

Jan 19, 2010, 3:15:17 PM

On 2010-01-19, Richard Tobin <ric...@cogsci.ed.ac.uk> wrote:
> How often does the question of printing out the value of a sizeof
> expression come up in production code?

Depends on how heavily instrumented or logged it is. :)

(More generally, while it's rarely sizeof, I print a lot of size_t
values.)

lawrenc...@siemens.com

Jan 19, 2010, 3:08:07 PM

Ben Pfaff <b...@cs.stanford.edu> wrote:
>
> Desktop and server platforms are all moving to 64-bit address
> spaces, but many of these still have 32-bit int, so you may soon
> run into a platform where "(unsigned) int" is not adequate for
> the size of an object.

Even with a 64-bit address space, it's exceedingly rare to have a single
object whose size won't fit in 32 bits. The value of a large address
space is usually the ability to have *lots* of objects rather than very
big ones.
--
Larry Jones

I've never seen a sled catch fire before. -- Hobbes

Ben Pfaff

Jan 19, 2010, 4:51:41 PM

lawrenc...@siemens.com writes:

> Ben Pfaff <b...@cs.stanford.edu> wrote:
>>
>> Desktop and server platforms are all moving to 64-bit address
>> spaces, but many of these still have 32-bit int, so you may soon
>> run into a platform where "(unsigned) int" is not adequate for
>> the size of an object.
>
> Even with a 64-bit address space, it's exceedingly rare to have a single
> object whose size won't fit in 32 bits. The value of a large address
> space is usually the ability to have *lots* of objects rather than very
> big ones.

Most of the time, yes. But sometimes you want to do something
like map an entire hard drive (or hard drive virtual image) into
your address space, simulate a virtual machine with lots of
memory, etc. And certainly one can read a large file into memory
as well.
--
"All code should be deliberately written for the purposes of instruction.
If your code isn't readable, it isn't finished yet."
--Richard Heathfield

Nick Keighley

Jan 20, 2010, 4:55:44 AM

On 19 Jan, 18:13, Antoninus Twink <nos...@nospam.invalid> wrote:

> If you're writing test programs and not production code, then do
> whatever you like, of course! It makes no difference either way.

I've found if I apply this too literally then I spend all my time
debugging test code, because my test code is buggily claiming my
production code is buggy!

John Bode

Jan 20, 2010, 9:58:41 AM

Grr; that's supposed to be C99 instead of C90. There's another
synapse popped...

jaysome

Jan 21, 2010, 12:54:20 AM

On Tue, 19 Jan 2010 13:51:41 -0800, Ben Pfaff <b...@cs.stanford.edu>
wrote:

>lawrenc...@siemens.com writes:
>
>> Ben Pfaff <b...@cs.stanford.edu> wrote:
>>>
>>> Desktop and server platforms are all moving to 64-bit address
>>> spaces, but many of these still have 32-bit int, so you may soon
>>> run into a platform where "(unsigned) int" is not adequate for
>>> the size of an object.
>>
>> Even with a 64-bit address space, it's exceedingly rare to have a single
>> object whose size won't fit in 32 bits. The value of a large address
>> space is usually the ability to have *lots* of objects rather than very
>> big ones.
>
>Most of the time, yes. But sometimes you want to do something
>like map an entire hard drive (or hard drive virtual image) into
>your address space, simulate a virtual machine with lots of
>memory, etc. And certainly one can read a large file into memory
>as well.

Ben,

In these cases, wouldn't you most likely (always?) be using a pointer
to memory rather than something like a fixed-size array to refer to
the entire hard drive or virtual machine? If not, how do you know what
size the fixed-size array should be?

That's how typical "map" functions work--they return a pointer and a
size (or you specify the size). In such a case, sizeof is of no use.
Certainly in these types of cases, you would not use sizeof and would
not use "%d" to print out the size. For example:

unsigned char *p;
size_t size;

if ( MapHardDrive("/dev/hd0", &p, &size) )
{
  // For C90
  printf("Mapped hard drive is size %lu.\n", (unsigned long)size);
  // For C99
  printf("Mapped hard drive is size %zu.\n", size);
}
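
MapHardDrive() above is not a real function. On a POSIX system,
something in that spirit could be sketched with mmap(). The name
map_file is made up, and the sketch only handles regular, non-empty
files (fstat() does not generally report the size of a block device):

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for MapHardDrive(): map a regular file
 * read-only. Returns 1 on success and 0 on failure, matching the way
 * the call above is tested. Release the mapping with munmap(). */
int map_file(const char *path, unsigned char **p, size_t *size)
{
  struct stat st;
  void *addr;
  int fd = open(path, O_RDONLY);

  if (fd == -1) {
    return 0;
  }
  /* Reject errors, empty files, and sizes that do not fit a size_t. */
  if (fstat(fd, &st) == -1 || st.st_size <= 0
      || (off_t)(size_t)st.st_size != st.st_size) {
    close(fd);
    return 0;
  }
  addr = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
  close(fd);            /* the mapping stays valid after close() */
  if (addr == MAP_FAILED) {
    return 0;
  }
  *p = addr;
  *size = (size_t)st.st_size;
  return 1;
}
```

The size comes back as a size_t, so the printing question from this
thread applies to it unchanged.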

--
jay

jaysome

Jan 21, 2010, 1:08:38 AM

On Tue, 19 Jan 2010 18:13:04 +0000 (UTC), Antoninus Twink
<nos...@nospam.invalid> wrote:

>printf("sizeof(T) is %zu\n", sizeof(T));

#include <stdio.h>
int main(void)
{
  printf("sizeof(int)is %zu\n", sizeof(int));
  return 0;
}

This is the output I get with VC++ 6.0:

sizeof(int)is zu

That's because the "%zu" conversion specifier is new in C99, and VC++
6.0, like most of the dozen or so compilers I use, is not
C99-compliant. I think it will always be this way. YMMV.

So rather than have to deal with portability issues, I simply write:

#include <stdio.h>
int main(void)
{
  printf("sizeof(int) is %d\n", (int)sizeof(int));
  return 0;
}
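
A middle ground between the two positions in this thread is to pick the
conversion at compile time. This is only a sketch: format_size is a
made-up name, and __STDC_VERSION__ describes the compiler rather than
the C library, so a C99 compiler paired with an old printf could still
mishandle "%zu".

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write n in decimal into buf and return the
 * number of characters written. buf must have room for at least
 * 3 * sizeof(size_t) + 1 bytes. */
int format_size(char *buf, size_t n)
{
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
  return sprintf(buf, "%zu", n);
#else
  /* C90 fallback: unsigned long is the widest unsigned type C90
   * guarantees, but it may be narrower than size_t on some platforms. */
  return sprintf(buf, "%lu", (unsigned long)n);
#endif
}
```

Either branch avoids the signed "%d" path, at the cost of one extra
function.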

--
jay

Ben Pfaff

Jan 21, 2010, 12:23:41 PM

jaysome <jay...@spamcop.net> writes:

> In these cases, wouldn't you most likely (always?) be using a pointer
> to memory rather than something like a fixed-size array to refer to
> the entire hard drive or virtual machine?

Yes.

But you should not use "int" or "unsigned int" to print the size
of a size_t that measures the size of such a large object.
--
"In My Egotistical Opinion, most people's C programs should be indented six
feet downward and covered with dirt." -- Blair P. Houghton

frank

Jan 21, 2010, 10:20:06 PM

jaysome wrote:
> On Tue, 19 Jan 2010 18:13:04 +0000 (UTC), Antoninus Twink
> <nos...@nospam.invalid> wrote:
>
>> printf("sizeof(T) is %zu\n", sizeof(T));
>
> #include <stdio.h>
> int main(void)
> {
> printf("sizeof(int)is %zu\n", sizeof(int));
> return 0;
> }
>
> This is the output I get with VC++ 6.0:
>
> sizeof(int)is zu
>
> That's because the "%zu" conversion specifier is new in C99, and VC++
> 6.0, like most of the dozen or so compilers I use, is not
> C99-compliant. I think it will always be this way. YMMV.

My mileage didn't vary, which is why I stopped using VC++6. It's stuck
in the eighties like the mullet, preppie clothes, and cocaine.

>
> So rather than have to deal with portability issues, I simply write:
>
> #include <stdio.h>
> int main(void)
> {
> printf("sizeof(int) is %d\n", (int)sizeof(int));
> return 0;
> }
>

When I know a number isn't going to be bigger than a hundred, I don't
care enough sometimes to add a cast.

frank

Jan 22, 2010, 12:06:55 AM

Jens Thoms Toerring wrote:
> frank <abqha...@gmail.com> wrote:
>> For the getDir function that you envision, is this an appropriate
>> declaration:
>
>> char * getDir( char * pathName);
>
> Nearly - but since it's supposed to return an array of strings
> it has to be
>
> char ** getDir( char * pathName);
>

Thanks, Jens, I won't forget to add the const, but I did tonight. I
think I've got a pretty good template to move forward. I see the
analogy between reading a text file with standard c and reading a
directory with posix extensions. I think both tasks are suited to
resizable 2-d arrays.

So this is what I have (thanks, Richard):

dan@dan-desktop:~/source/unleashed/ch11$ gcc -Wall -Wextra -c -o string2.o strarr2.c
dan@dan-desktop:~/source/unleashed/ch11$ gcc -std=c99 -Wall -Wextra string2.o t2.c -o out
dan@dan-desktop:~/source/unleashed/ch11$ ./out strarr2.c

#include <stdlib.h>
#include <string.h>
#include <assert.h>

#include "strarr.h"

void FreeStrArray(char **Array, size_t NumFiles)
{
  size_t index;

  if(Array != NULL)
  {
    for(index = 0; index < NumFiles; index++)
    {
      if(Array[index] != NULL)
      {
        free(Array[index]);
      }
    }
    free(Array);
  }
}

char **AllocStrArray(size_t NumFiles, size_t Width)
{
  char **Array = NULL;
  size_t index;
  int Success = 1;

  /* allocating 0 bytes is not a great idea, and
   * represents a logic error.
   */
  assert(NumFiles > 0);
  assert(Width > 0);

  /* Just in case the zero allocation is NOT caught
   * in testing, we'll check for it here.
   */
  if(NumFiles > 0 && Width > 0)
  {
    Array = malloc(NumFiles * sizeof *Array);
    if(Array != NULL)
    {
      for(index = 0; index < NumFiles; index++)
      {
        Array[index] = malloc(Width * sizeof *Array[index]);
        if(NULL == Array[index])
        {
          Success = 0;
        }
        else
        {
          /* Making this into an empty string is a quick
           * op which will almost invariably be The Right
           * Thing and can never be The Wrong Thing, so
           * we might as well do it.
           */
          Array[index][0] = '\0';
        }
      }
      /* If any inner allocation failed,
       * we should clean up.
       */
      if(1 != Success)
      {
        FreeStrArray(Array, NumFiles);
        Array = NULL;
      }
    }
  }

  return Array;
}

int ResizeOneString(char **Array,
                    size_t index,
                    size_t NewSize)
{
  char *p;
  int Success = 1;

  assert(Array != NULL);

  p = realloc(Array[index], NewSize);
  if(p != NULL)
  {
    Array[index] = p;
  }
  else
  {
    Success = 0;
  }

  return Success;
}

int AddFilesToStrArray(char ***ArrayPtr,
                       size_t OldNumFiles,
                       int NumFilesToAdd,
                       size_t InitWidth)
{
  char **p;
  int Success = 1;
  int index;
  int OldFiles;

  OldFiles = (int)OldNumFiles;
  if(NumFilesToAdd < 0)
  {
    for(index = OldFiles - 1;
        index >= OldFiles + NumFilesToAdd;
        index--)
    {
      free((*ArrayPtr)[index]);
    }
  }

  p = realloc(*ArrayPtr,
              (OldFiles + NumFilesToAdd) *
              sizeof(**ArrayPtr));

  if(p != NULL)
  {
    *ArrayPtr = p;

    for(index = OldFiles;
        Success && index < OldFiles + NumFilesToAdd;
        index++)
    {
      (*ArrayPtr)[index] = malloc(InitWidth);
      if((*ArrayPtr)[index] != NULL)
      {
        (*ArrayPtr)[index][0] = '\0';
      }
      else
      {
        Success = 0;
      }
    }
  }
  else
  {
    Success = 0;
  }
  return Success;
}

int ConsolidateStrArray(char **ArrayPtr,
                        size_t NumFiles)
{
  size_t index;
  size_t Len;
  int NumFailures = 0;

  for(index = 0; index < NumFiles; index++)
  {
    /* If the library has been correctly used, no
     * index pointer will ever be NULL, so we should
     * assert that this is the case.
     */
    assert(ArrayPtr[index] != NULL);
    Len = 1 + strlen(ArrayPtr[index]);
    if(0 == ResizeOneString(ArrayPtr, index, Len))
    {
      ++NumFailures;
    }
  }
  return NumFailures;
}

/* end of strarr.c */
/*
gcc -Wall -Wextra -c -o string2.o strarr2.c */
dan@dan-desktop:~/source/unleashed/ch11$ ./out t2.c
#include <stdio.h>
#include <string.h>

#include "strarr2.h"

#define DEFAULT_LINE_LEN 64
#define LINES_PER_ALLOC 16
#define ERR_FILES_NOT_ADDED 1
#define ERR_STRING_NOT_RESIZED 2
#define ERR_PATH_OPEN_FAILED 3
#define ERR_ALLOC_FAILED 4

int ReadFile(char *Filename,
             char ***Array,
             int *NumFiles)
{
  char Buffer[DEFAULT_LINE_LEN] = {0};
  char *NewLine = NULL;
  FILE *fp;
  int Error = 0;
  int index = 0;
  size_t NumBlocks;

  *NumFiles = 0;

  *Array = AllocStrArray(LINES_PER_ALLOC,
                         DEFAULT_LINE_LEN);
  if(NULL != *Array)
  {
    fp = fopen(Filename, "r");
    if(fp != NULL)
    {
      *NumFiles = LINES_PER_ALLOC;
      NumBlocks = 1;

      /* fgets will give us no more than sizeof Buffer
       * bytes, including zero terminator and newline
       * if one is present within that number of bytes.
       * Therefore we need to cater for longer lines.
       * To do this, we call fgets again (and again
       * and again) until we encounter a newline.
       */
      while(0 == Error &&
            NULL != fgets(Buffer, sizeof Buffer, fp))
      {
        NewLine = strchr(Buffer, '\n');
        if(NewLine != NULL)
        {
          *NewLine = '\0';
        }
        /* This strcat relies on the AllocStrArray()
         * function initialising entries to empty strings.
         */
        strcat((*Array)[index], Buffer);
        if(NewLine != NULL)
        {
          /* There was a newline, so the
           * next line is a new one.
           */
          NumBlocks = 1;
          ++index;
          if(index >= *NumFiles)
          {
            /* Add another LINES_PER_ALLOC lines.
             * If it didn't work, give up.
             */
            if(0 == AddFilesToStrArray(Array,
                                       *NumFiles,
                                       LINES_PER_ALLOC,
                                       DEFAULT_LINE_LEN))
            {
              Error = ERR_FILES_NOT_ADDED;
            }
            else
            {
              *NumFiles += LINES_PER_ALLOC;
            }
          }
        }
        else
        {
          ++NumBlocks;
          /* Make room for some more data on this line */
          if(0 ==
             ResizeOneString(*Array,
                             index,
                             NumBlocks * DEFAULT_LINE_LEN))
          {
            Error = ERR_STRING_NOT_RESIZED;
          }
        }
      }
      fclose(fp);
      if(0 == Error && *NumFiles > index)
      {
        if(0 == AddFilesToStrArray(Array,
                                   *NumFiles,
                                   index - *NumFiles,
                                   0))
        {
          Error = ERR_ALLOC_FAILED;
        }
        *NumFiles = index;
      }
    }
    else
    {
      Error = ERR_PATH_OPEN_FAILED; /* Can't open file */
    }
  }
  else
  {
    Error = ERR_ALLOC_FAILED; /* Can't allocate memory */
  }
  if(Error != 0)
  {
    /* If the original allocation failed,
     * *Array will be NULL. FreeStrArray()
     * correctly handles this possibility.
     */
    FreeStrArray(*Array, *NumFiles);
    *NumFiles = 0;
  }
  else
  {
    ConsolidateStrArray(*Array, *NumFiles);
  }

  return Error;
}

int main(int argc, char **argv)
{
  char **array = NULL;

  int numFiles;
  int index;
  int error;

  if(argc > 1)
  {
    error = ReadFile(argv[1], &array, &numFiles);
    switch(error)
    {
      case 0:
        for(index = 0; index < numFiles; index++)
        {
          printf("%s\n", array[index]);
        }

        FreeStrArray(array, numFiles);
        break;
      case ERR_STRING_NOT_RESIZED:
      case ERR_ALLOC_FAILED:
      case ERR_FILES_NOT_ADDED:
        puts("Insufficient memory.");
        break;
      case ERR_PATH_OPEN_FAILED:
        printf("Couldn't open %s for reading\n", argv[1]);
        break;
      default:
        printf("Unknown error! Code %d.\n", error);
        break;
    }
  }
  else
  {
    puts("Please specify the text file name.");
  }

  return 0;
}
/* end of c11_018.c */

/*
gcc -std=c99 -Wall -Wextra string2.o t2.c -o out */
dan@dan-desktop:~/source/unleashed/ch11$

My questions now focus on main. Somehow, I've got to take this line:
error = ReadFile(argv[1], &array, &numFiles);
, and turn it into:
char ** getDir( const char * pathName);

Any constructive comments gladly received. Cheers,
--
frank
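
For illustration only, one shape getDir() could take with the POSIX
directory functions is sketched below. Returning a NULL-terminated array
(instead of a separate count) is an assumption made here, not something
settled in the thread; the entries include "." and "..", and a readdir()
error is not told apart from the end of the directory.

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: return a NULL-terminated, malloc'd array holding the entry
 * names found in pathName, or NULL on error. The caller frees each
 * string and then the array itself. */
char **getDir(const char *pathName)
{
  DIR *dir = opendir(pathName);
  struct dirent *ent;
  char **names = NULL;
  size_t count = 0;
  size_t cap = 0;

  if (dir == NULL) {
    return NULL;
  }
  for (;;) {
    ent = readdir(dir);
    /* Make sure the next slot exists; it receives either a name
     * or the terminating NULL. */
    if (count == cap) {
      size_t newcap = cap ? 2 * cap : 16;
      char **tmp = realloc(names, newcap * sizeof *tmp);
      if (tmp == NULL) {
        goto fail;
      }
      names = tmp;
      cap = newcap;
    }
    if (ent == NULL) {
      names[count] = NULL;
      break;
    }
    names[count] = malloc(strlen(ent->d_name) + 1);
    if (names[count] == NULL) {
      goto fail;
    }
    strcpy(names[count], ent->d_name);
    count++;
  }
  closedir(dir);
  return names;

fail:
  while (count > 0) {
    free(names[--count]);
  }
  free(names);
  closedir(dir);
  return NULL;
}
```

The growth-by-doubling loop plays the same role as the LINES_PER_ALLOC
logic in ReadFile(), just with one realloc() call site.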

David Thompson

Jan 28, 2010, 11:41:59 AM

On 19 Jan 2010 12:34:14 +0100, la...@ludens.elte.hu (Ersek, Laszlo)
wrote:

> (Topic change.) This "int vs. size_t" question makes me remember what I
> don't really like about the printf() family:
>
> - The return value signals the number of characters transmitted (if no
> error occurred). While strlen() returns a size_t, printf() and co.
> return an int.
>

The return value needs to allow negative for error. Like some other
stdio routines and many Unix syscalls, this practice of 'inband' error
indication isn't aesthetically beautiful, but it worked and at this
point we're basically stuck with it.

> - Same for %n.
>
Pointer (both *scanf and *printf), so once set unsafe to change.

> - Same for the "*" field width and precision.
>

(For *printf only) varargs, thus less convenient to use something that
might be wider than u/int.

Remember that much of what is now the standard library was first
implemented while the type system was still in some flux, and there
was no point thereafter when there was agreement on a flag-day change.
It would be nicer if dmr&co had been prescient, but they had enough
trouble getting anything to work at all, and AIUI no idea C would
become as widespread and persistent as it did.

C89 did change most, and I think all, value arguments to size_t, since
those can be (automatically) fixed by the newly-added prototypes.

> (C99 7.19.6.1 The fprintf function, p15:
>
> ----v----
> Environmental limits
>
> The number of characters that can be produced by any single conversion
> shall be at least 4095.
> ----^----
>
> Does this mean one can't portably pass a string to a single %s if
> strlen() returns at least 4096 for that string?)
>

AIUI not absolutely 100% guaranteed portable, no. FWIW *f were
originally intended mostly to be human read or entered, or at least
edited, and single chunks of data larger than some tens of chars, or
at least hundreds, are usually unsuitable for that. IME when I need
(in C) to do larger chunks they are handled separately and it's no
real trouble to use fwrite().
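
A minimal sketch of that approach, with write_long_string as a made-up
helper name; since no conversion specification is involved, the 4095
character environmental limit does not come into play:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the whole string s to fp with fwrite()
 * instead of a single %s conversion.
 * Returns 0 on success, -1 on a short write. */
int write_long_string(FILE *fp, const char *s)
{
  size_t len = strlen(s);

  return fwrite(s, 1, len, fp) == len ? 0 : -1;
}
```

The length is carried in a size_t from strlen() to fwrite(), so no int
enters the picture either.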

> I have the (very superficial) impression that unsigned integers are
> historically very under-used in favor of signed integers. For example,
> the (not standard C) BSD socket interfaces historically took a lot of
> "size parameters" (obvious candidates for the sizeof operator) as int's:

<snip long rant ending mostly in socklen_t> (actually int or int*)

Yeah, by the time sockets was done, there was enough experience they
could have got this one right. Especially since they worked hard to
support extendable network types, almost all of which have turned out
to be useless. OTOH if addresses (or options) ever really need to
exceed 32K I don't want to be there to see it, so in practice it
doesn't actually cause a problem, just looks a bit ugly.

Ersek, Laszlo

Jan 28, 2010, 2:26:07 PM

In article <kcc3m5955g9jllo0v...@4ax.com>,
David Thompson <dave.th...@verizon.net> writes:

> [historical and practical insights]

Thank you very much.
lacos