
Copying string till newline


arnuld

Sep 1, 2010, 2:23:52 AM
WANTED: To write a function like strcpy(). Unlike strcpy() it copies till
a newline occurs.

GOT: It is working fine. Just want to have any ideas for improvement :)


#include <stdio.h>

enum { SIZE_ARR = 20 };

int string_copy_till_newline(char dest[], char src[]);


int main(void)
{
char arr_dest[SIZE_ARR]; /* intentionally not initialized with NULLs.
Check the intended function definition */
char arr_src[] = "This is\n an Array\n";

printf("arr_dest = %s\n", arr_dest);
printf("arr_src = %s\n", arr_src);
printf("\n\n------------------------------\n\n");
string_copy_till_newline(arr_dest, arr_src);
printf("arr_dest = %s\n", arr_dest);
printf("arr_src = %s\n", arr_src);

return 0;
}


/* Will copy contents from SRC to DEST till a newline occurs, will not
include newline, puts a NULL character at the end.
returns number of characters copied, else -1 on error. Will write
beyond the array, size checking is user's responsibility */
int string_copy_till_newline(char dest[], char src[])
{
int idx;

if(NULL == dest || NULL == src)
{
printf("IN: %s at %d: One of the arguments is NULL\n", __func__,
__LINE__);
return -1;
}

for(idx = 0; src[idx] != '\n'; ++idx)
{
dest[idx] = src[idx];
}

dest[idx] = '\0';

return idx;
}


==================== OUTPUT ==========================
[arnuld@dune TEST]$ gcc -ansi -pedantic -Wall -Wextra string-copy-till-newline.c
[arnuld@dune TEST]$ ./a.out
arr_dest = %Ψ
arr_src = This is
an Array

------------------------------

arr_dest = This is
arr_src = This is
an Array

[arnuld@dune TEST]$

--
www.lispmachine.wordpress.com
my email is @ the above blog.

Jens Thoms Toerring

Sep 1, 2010, 2:49:31 AM
arnuld <sun...@invalid.address> wrote:
> WANTED: To write a function like strcpy(). Unlike strcpy() it copies till
> a newline occurs.

> GOT: It is working fine. Just want to have any ideas for improvement :)

Unfortunately it's working fine only for your test input...

> #include <stdio.h>

> enum { SIZE_ARR = 20 };

> int string_copy_till_newline(char dest[], char src[]);


> int main(void)
> {
> char arr_dest[SIZE_ARR]; /* intentionally not initialized with NULLs.
> Check the intended function definition */
> char arr_src[] = "This is\n an Array\n";

> printf("arr_dest = %s\n", arr_dest);

Since 'arr_dest' is uninitialized this is WRONG: printf() has no way
to figure out when to stop printing chars from 'arr_dest'. You
rely on a '\0' char being somewhere in the uninitialized array
by accident.

> printf("arr_src = %s\n", arr_src);
> printf("\n\n------------------------------\n\n");
> string_copy_till_newline(arr_dest, arr_src);
> printf("arr_dest = %s\n", arr_dest);
> printf("arr_src = %s\n", arr_src);

> return 0;
> }

> /* Will copy contents from SRC to DEST till a newline occurs, will not
> include newline, puts a NULL character at the end.
> returns number of characters copied, else -1 on error. Will write
> beyond the array, size checking is user's responsibility */

> int string_copy_till_newline(char dest[], char src[])
> {
> int idx;

> if(NULL == dest || NULL == src)
> {
> printf("IN: %s at %d: One of the arguments is NULL\n", __func__,
> __LINE__);
> return -1;
> }

> for(idx = 0; src[idx] != '\n'; ++idx)
> {
> dest[idx] = src[idx];
> }

Did you consider what happens if there's no '\n' in the source
string? Then you will copy the final '\0' and continue to copy
and copy and copy....

So this needs to be

for ( idx = 0; src[ idx ] != '\0' && src[ idx ] != '\n'; ++idx )
    dest[ idx ] = src[ idx ];

> dest[idx] = '\0';

> return idx;
> }
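
For reference, here is roughly what the whole function looks like with the corrected loop condition folded in. It is only a sketch: the name copy_till_newline is made up here, the error message has been dropped, src is declared const, and the caller is still assumed to pass a destination that is large enough.

```c
#include <stddef.h>

/* Copies src into dest up to, but not including, the first '\n' or the
   terminating '\0', whichever comes first, and terminates dest.
   Returns the number of characters copied, or -1 if an argument is a
   null pointer. No bounds checking: dest must be large enough. */
int copy_till_newline(char dest[], const char src[])
{
    int idx;

    if (dest == NULL || src == NULL)
        return -1;

    for (idx = 0; src[idx] != '\0' && src[idx] != '\n'; ++idx)
        dest[idx] = src[idx];

    dest[idx] = '\0';
    return idx;
}
```

With this condition a source string that has no newline is simply copied whole, much as strcpy() would do.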
Regards, Jens
--
\ Jens Thoms Toerring ___ j...@toerring.de
\__________________________ http://toerring.de

Nick Keighley

Sep 1, 2010, 4:45:56 AM
On 1 Sep, 07:23, arnuld <sunr...@invalid.address> wrote:

> WANTED: To write a function like strcpy(). Unlike strcpy() it copies till
> a newline occurs.
>
> GOT: It is working fine. Just want to have any ideas for improvement :)
>
> #include <stdio.h>
>
> enum { SIZE_ARR = 20 };
>
> int string_copy_till_newline(char dest[], char src[]);

int string_copy_till_newline(char dest[], const char src[]);

consider returning a char*, say to the end of the dest; this can make
chaining calls together easier.

technically string_copy_till_newline is in a reserved namespace
(function names beginning "str" followed by a lowercase letter are
reserved for the implementation).

consider using C99's "restrict" (though using it marginally reduces
portability)

> int main(void)
> {
>   char arr_dest[SIZE_ARR];  /* intentionally not initialized with NULLs.
> Check the intended function definition */
>   char arr_src[] = "This is\n an Array\n";
>
>   printf("arr_dest = %s\n", arr_dest);
>   printf("arr_src = %s\n", arr_src);
>   printf("\n\n------------------------------\n\n");
>   string_copy_till_newline(arr_dest, arr_src);
>   printf("arr_dest = %s\n", arr_dest);
>   printf("arr_src = %s\n", arr_src);
>
>   return 0;
>
> }
>
> /* Will copy contents from SRC to DEST till a newline occurs, will not
> include newline, puts a NULL character at the end.
>    returns number of characters copied, else -1 on error. Will write
> beyond the array, size checking is user's responsibility */
> int string_copy_till_newline(char dest[], char src[])
> {
>   int idx;
>
>   if(NULL == dest || NULL == src)
>     {
>       printf("IN: %s at %d: One of the arguments is NULL\n", __func__,
> __LINE__);

some people would frown on a library routine that produced error
messages. Some would prefer errors to go to stderr


>       return -1;
>     }
>
>   for(idx = 0; src[idx] != '\n'; ++idx)
>     {
>       dest[idx] = src[idx];
>     }

or more idiomatically

while (*src != '\n')
    *dest++ = *src++;

I note you don't copy the \n. Is that intended?

I'd also worry about what happens if there is no \n in the string.
If src is a string (i.e. has \0 at the end) then I'd check for that.

while ((*dest++ = *src++) != '\0')
    if (*src == '\n')
        break;

>   dest[idx] = '\0';
>
>   return idx;
>
> }
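
A pointer-style version along the lines suggested in this reply might look like the sketch below. It also stops at '\0', never copies the newline, and returns a pointer to the terminator it wrote so that calls can be chained. The name copy_line is hypothetical, and the destination is assumed to be large enough.

```c
/* Copies src into dst up to, but not including, the first '\n' or '\0'.
   Terminates dst and returns a pointer to that terminator, so that a
   following call can append directly. dst must be large enough. */
char *copy_line(char *dst, const char *src)
{
    while (*src != '\0' && *src != '\n')
        *dst++ = *src++;

    *dst = '\0';
    return dst;
}
```

With that return value, end = copy_line(buf, "foo\nbar"); followed by copy_line(end, "baz\n"); leaves "foobaz" in buf.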

<snip>

arnuld

Sep 1, 2010, 5:30:10 AM
> On Wed, 01 Sep 2010 06:49:31 +0000, Jens Thoms Toerring wrote:
>> arnuld <sun...@invalid.address> wrote:

>> printf("arr_dest = %s\n", arr_dest);

> Since 'arr_dest' is uninitialized this is WRONG: printf() has no way to
> figure out when to stop printing chars from 'arr_dest'. You rely on a
> '\0' char being somewhere in the uninitialized array by accident.

So, should an array always be initialized in C before it is used?


>> for(idx = 0; src[idx] != '\n'; ++idx)
>> {
>> dest[idx] = src[idx];
>> }

> Did you consider what happens if there's no '\n' in the source string?
> Then you will copy the final '\0' and continue to copy and copy and
> copy....

> So this needs to be
>
> for ( idx = 0; src[ idx ] != '\0' && src[ idx ] != '\n'; ++idx )
> dest[ idx ] = src[ idx ];
>
>> dest[idx] = '\0';
>
>> return idx;


I also know the size of dest, so how about using this condition:

for(idx=0; idx < SIZE_ARR; ++idx)

??

Jens Thoms Toerring

Sep 1, 2010, 5:43:59 AM
arnuld <sun...@invalid.address> wrote:
> > On Wed, 01 Sep 2010 06:49:31 +0000, Jens Thoms Toerring wrote:
> >> arnuld <sun...@invalid.address> wrote:

> >> printf("arr_dest = %s\n", arr_dest);
>
> > Since 'arr_dest' is uninitialized this is WRONG: printf() has no way to
> > figure out when to stop printing chars from 'arr_dest'. You rely on a
> > '\0' char being somewhere in the uninitialized array by accident.

> So, should an array always be initialized in C before it is used?

Yes, of course. What sense would it make to read elements of
an array that were never initialized (except maybe when you
want to write a really horribly bad random generator)?

And in the case where you were using it you did rely on the array
having a '\0' value somewhere by mere chance - if that's
not the case printf() accesses elements beyond the end of
the array, which is forbidden.
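
A minimal sketch of the difference, assuming the SIZE_ARR constant from the original post (the function name is invented for illustration): giving the array an initializer makes it a valid, empty string, so handing it to printf() with %s is well defined.

```c
#include <stdio.h>
#include <string.h>

enum { SIZE_ARR = 20 };

/* Prints a freshly initialized buffer and returns its string length. */
size_t print_empty_dest(void)
{
    /* = "" stores '\0' in arr_dest[0] and zero-fills the remaining
       elements, so the array holds a valid empty string. */
    char arr_dest[SIZE_ARR] = "";

    printf("arr_dest = \"%s\"\n", arr_dest);
    return strlen(arr_dest);
}
```

Without the initializer both the printf() and the strlen() call would read indeterminate values and could run off the end of the array.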

> >> for(idx = 0; src[idx] != '\n'; ++idx)
> >> {
> >> dest[idx] = src[idx];
> >> }
>
> > Did you consider what happens if there's no '\n' in the source string?
> > Then you will copy the final '\0' and continue to copy and copy and
> > copy....
>
> > So this needs to be
> >
> > for ( idx = 0; src[ idx ] != '\0' && src[ idx ] != '\n'; ++idx )
> > dest[ idx ] = src[ idx ];
> >
> >> dest[idx] = '\0';
> >
> >> return idx;


> I also know the size of dest, so how about using this condition:

> for(idx=0; idx < SIZE_ARR; ++idx)

And how would that help you? You explicitly stated that
you want to stop copying at the first '\n' (and, since you're
dealing with strings, you must stop at the first '\0' you
encounter). And then there's the length of the destination
array - you must also take that into consideration, since
you're not allowed to write past the end of the destination
array if the source string is too long - but if the source
string is shorter then you must stop at the end of the source
string or you would access elements past the end of that array.

So the correct form, when also taking the final length of the
destination array into account would be

for ( idx = 0;
      idx < SIZE_ARR - 1 && src[ idx ] != '\0' && src[ idx ] != '\n';
      ++idx )

Ben Bacarisse

Sep 1, 2010, 6:01:02 AM
arnuld <sun...@invalid.address> writes:

> WANTED: To write a function like strcpy(). Unlike strcpy() it copies till
> a newline occurs.
>
> GOT: It is working fine. Just want to have any ideas for improvement
> :)

I'd suggest looking at standard library functions. Set yourself a
challenge to do it with as little of your own code as possible. The
point being to learn what is already there.

<snip>
--
Ben.

arnuld

Sep 1, 2010, 6:38:59 AM
> On Wed, 01 Sep 2010 11:01:02 +0100, Ben Bacarisse wrote:

> I'd suggest looking at standard library functions. Set yourself a
> challenge to do it with as little of you own code as possible. The
> point being to learn what already there.


I did not get your point. My best guess is that there is some function in
the C Std. Lib. that almost matches my requirement of copying till a newline?

James

Sep 1, 2010, 6:54:26 AM
"arnuld" <sun...@invalid.address> wrote in message
news:4c7df177$0$50446$1472...@news.sunsite.dk...

> WANTED: To write a function like strcpy(). Unlike strcpy() it copies till
> a newline occurs.

Here is a fairly simple approach:


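#include <string.h>

/* Copies src up to, but not including, the first newline into dest.
   If src contains no newline, dest is set to the empty string. */
char* copy_to_newline(char const* src,
                      char* dest)
{
    char const* target = strchr(src, '\n');

    if (target)
    {
        /* target - src is the number of chars before the newline */
        memcpy(dest, src, target - src);
        dest[target - src] = '\0';
    }
    else
    {
        dest[0] = '\0';
    }

    return dest;
}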


[...]


Nick Keighley

Sep 1, 2010, 7:46:50 AM
On 1 Sep, 10:43, j...@toerring.de (Jens Thoms Toerring) wrote:

> arnuld <sunr...@invalid.address> wrote:
> > > On Wed, 01 Sep 2010 06:49:31 +0000, Jens Thoms Toerring wrote:
> > >> arnuld <sunr...@invalid.address> wrote:


> > >>   printf("arr_dest = %s\n", arr_dest);
>
> > > Since 'arr_dest' is uninitialized this is WRONG: printf() has no way to
> > > figure out when to stop printing chars from 'arr_dest'. You rely on a
> > > '\0' char being somewhere in the uninitialized array by accident.
>
> > So, should an array always be initialized in C before it is used?

there must be a sensible value in a variable before you read it and
take action on it. If you printf("%s") a char array it must contain a
valid string. Isn't this kind of obvious?

<snip>

> > >>   for(idx = 0; src[idx] != '\n'; ++idx)
> > >>     {
> > >>       dest[idx] = src[idx];
> > >>     }
>
> > > Did you consider what happens if there's no '\n' in the source string?
> > > Then you will copy the final '\0' and continue to copy and copy and
> > > copy....
>
> > > So this needs to be
>
> > >    for ( idx = 0; src[ idx ] != '\0' && src[ idx ] != '\n'; ++idx )
> > >        dest[ idx ] = src[ idx ];
>
> > >>   dest[idx] = '\0';
>
> > >>   return idx;
>
> > I also know the size of dest, so how about using this condition:
> >  for(idx=0; idx < SIZE_ARR; ++idx)

yuk. If you know the size of dest and want to use that fact you ought
to pass it in as a parameter. How do you sort this out?

void piffle (void)
{
#   define SIZE_ARR 7
    char dst1 [SIZE_ARR] = "";
    char dst2 [SIZE_ARR * 2] = "";

    char source [SIZE_ARR * 2] = "woofle dust\n";

    string_copy_to_newline (dst2, source);
}

this fails to copy the whole string even though there's room

this might be a better declaration

char *string_copy_to_newline (char *dst,
size_t dst_size, const char *source);
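
One possible implementation of that declaration, as a sketch only. The name copy_to_newline_n and the choice to return a pointer to the terminator (or NULL when dst_size is 0) are assumptions, not something stated in the thread.

```c
#include <stddef.h>

/* Copies source into dst up to, but not including, the first '\n' or
   '\0', writing at most dst_size bytes including the terminator.
   Returns a pointer to the terminator in dst, or NULL if dst_size is 0
   (in which case nothing is written, not even a terminator). */
char *copy_to_newline_n(char *dst, size_t dst_size, const char *source)
{
    if (dst_size == 0)
        return NULL;

    /* Always keep one byte in reserve for the terminator. */
    while (dst_size > 1 && *source != '\0' && *source != '\n') {
        *dst++ = *source++;
        --dst_size;
    }

    *dst = '\0';
    return dst;
}
```

Called with a 7-byte destination and the source "woofle dust\n" it stores "woofle"; with a 14-byte destination it stores the full "woofle dust", because each call is told the real size of its buffer.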

Ben Bacarisse

Sep 1, 2010, 2:23:30 PM
arnuld <sun...@invalid.address> writes:

>> On Wed, 01 Sep 2010 11:01:02 +0100, Ben Bacarisse wrote:
>
>> I'd suggest looking at standard library functions. Set yourself a
>> challenge to do it with as little of your own code as possible. The
>> point being to learn what [is] already there.


>
> I did not get your point. Best guess is there is some function if C Std.
> Lib. that almost matches my requirements of copying till newline ?

Yes, almost. It really helps to know the standard library -- not that it's
perfect, it's just that it's always there. The library is as much part
of C as ++ and * but people often neglect to learn it.

The plus side is that it is small compared to some languages' standard
libraries.
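
The function is not named in this reply. One candidate, offered here purely as a guess, is strcspn() from <string.h>, which returns the length of the leading part of a string that contains none of a given set of characters. Combined with memcpy() it leaves very little code to write. The name copy_till_newline_lib is made up, and the destination is assumed to be large enough.

```c
#include <string.h>

/* Copies src into dest up to, but not including, the first '\n' or the
   end of the string. Returns the number of characters copied.
   dest must have room for that many characters plus the terminator. */
size_t copy_till_newline_lib(char *dest, const char *src)
{
    /* strcspn() gives the length of the initial segment of src that
       contains no '\n'; it stops at the terminating '\0' by itself. */
    size_t len = strcspn(src, "\n");

    memcpy(dest, src, len);
    dest[len] = '\0';
    return len;
}
```

Because strcspn() stops at the end of the string, the case of a source without a newline needs no separate branch.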

--
Ben.

Keith Thompson

Sep 1, 2010, 2:42:32 PM
arnuld <sun...@invalid.address> writes:
> WANTED: To write a function like strcpy(). Unlike strcpy() it copies till
> a newline occurs.
>
> GOT: It is working fine. Just want to have any ideas for improvement :)
[...]

What should it do if the source string doesn't contain a newline?

What if the source array doesn't contain a string (i.e., there's no
terminating '\0')?

Is the newline copied?

What if the source array contains both a newline and a '\0', but the
'\0' is first, for example "foo\0bar\n"? Note that if you copy up to
the newline in this case, then you're not dealing with strings but with
'\n'-terminated character arrays -- which is fine if that's what you
want.

I have no opinion on what the answers to these questions should be;
for at least some of them, you could even say that the behavior
is undefined. But you should have clear answers to all of them,
ideally before you start writing code.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Nokia
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

ImpalerCore

Sep 1, 2010, 3:08:00 PM
On Sep 1, 6:54 am, "James" <n...@spam.invalid> wrote:
> "arnuld" <sunr...@invalid.address> wrote in message

For that matter, might as well abstract the newline '\n' character to
any character. I also prefer destination argument first to keep in
style with the other C library functions. In this version, the
arguable point is whether it's better to return 'dst' or 'target' as
the result (I would argue that 'target' is probably more useful since
I rarely see strcpy's result being used). And there is the question
whether the character found by 'strchr' is included or not in the
copy, which should be explicitly documented either way.

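\code snippet
#include <string.h>

char* strchrcpy( char* dst, const char* src, int ch )
{
  const char* target = strchr( src, ch );

  if (target)
  {
    memcpy( dst, src, target - src );
    dst[target - src] = '\0';
  }
  else {
    dst[0] = '\0';
  }

  /* target points into src (or is NULL); the cast is needed because
     target is const-qualified and the return type is not */
  return (char*)target;
}
\endcode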

If one wanted to include buffer length, one possible prototype that
comes to mind is:

size_t strchrlcpy( char* dst, const char* src, int ch,
                   size_t dst_size );
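
A sketch of how that prototype could be implemented. The function is named chr_lcpy here to stay out of the reserved str namespace, and the return convention (the length the untruncated copy would have had, as strlcpy() does on systems that provide it) is an assumption. Unlike the snippet above, it copies the whole string when ch is not found.

```c
#include <string.h>

/* Copies src into dst up to, but not including, the first occurrence
   of ch or the end of the string, writing at most dst_size bytes
   including the terminator. Returns the length the full copy would
   have had, so a result >= dst_size means the copy was truncated. */
size_t chr_lcpy(char *dst, const char *src, int ch, size_t dst_size)
{
    const char *target = strchr(src, ch);
    size_t wanted = target ? (size_t)(target - src) : strlen(src);

    if (dst_size > 0) {
        /* Copy as much as fits, leaving room for the terminator. */
        size_t n = wanted < dst_size ? wanted : dst_size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return wanted;
}
```

A caller can detect truncation by comparing the result with the buffer size it passed in.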

Nick Keighley

Sep 2, 2010, 4:00:24 AM
On 1 Sep, 11:54, "James" <n...@spam.invalid> wrote:
> "arnuld" <sunr...@invalid.address> wrote in message

this scans the string twice


arnuld

Sep 2, 2010, 5:43:58 AM
> On Wed, 01 Sep 2010 03:54:26 -0700, James wrote:

> Here is a fairly simple approach:

> char* copy_to_newline(char const* src,
> char* dest)
> {
> char const* target = strchr(src, '\n');
>
> if (target)
> {
> memcpy(dest, src, target - src);
> dest[target - src] = '\0';


Holy Cow... Pointer-Magic :-o

arnuld

Sep 2, 2010, 5:48:29 AM
> On Wed, 01 Sep 2010 11:42:32 -0700, Keith Thompson wrote:

Hey Keith, say Happy Holidays to me ;) , coming after long time to CLC


> What should it do if the source string doesn't contain a newline?

> What if the source array doesn't contain a string (i.e., there's no
> terminating '\0')?

> Is the newline copied?

(1) If it does not contain a newline then we will stop at the end of
string, when NULL character is encountered.

(2) If there is no terminating NULL then we will copy till the \n or
SIZE_ARR whichever comes first.

(3) no, newline is not copied.


> What if the source array contains both a newline and a '\0', but the
> '\0' is first, for example "foo\0bar\n"? Note that if you copy up to
> the newline in this case, then you're not dealing with strings but with
> '\n'-terminated character arrays -- which is fine if that's what you
> want.

Well, We will stop as soon as we get the NULL.
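
These answers can be turned into code fairly directly. The sketch below assumes the caller passes both array sizes explicitly, so that a source array without a terminating '\0' is never read past its end; the name copy_line_from_array and its parameters are invented for illustration.

```c
#include <stddef.h>

/* Copies from src, an array of src_size chars that need not be
   '\0'-terminated, into dest, which has room for dest_size chars.
   Stops at the first '\n', the first '\0', the end of src, or when
   dest is full, whichever comes first. The newline is not copied.
   Returns the number of characters copied. */
size_t copy_line_from_array(char *dest, size_t dest_size,
                            const char *src, size_t src_size)
{
    size_t idx = 0;

    if (dest_size == 0)
        return 0;   /* no room even for the terminator */

    /* The size checks come first so src[idx] is only read when it
       is inside the source array. */
    while (idx < src_size && idx + 1 < dest_size
           && src[idx] != '\0' && src[idx] != '\n') {
        dest[idx] = src[idx];
        ++idx;
    }

    dest[idx] = '\0';
    return idx;
}
```

For the "foo\0bar\n" case it copies "foo" and returns 3, and for a three-element array holding 'a', 'b', 'c' with no terminator it copies all three characters and stops.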



> I have no opinion on what the answers to these questions should be; for
> at least some of them, you could even say that the behavior is
> undefined. But you should have clear answers to all of them, ideally
> before you start writing code.


You are right, and I answered these questions only after I had written the
code :-/. That comes from being so long in a corporate job: they pay you to
write code on deadlines, not to write correct code. They pay for code
that 'just works' rather than code without 'hidden bugs'.

Barry Schwarz

Sep 2, 2010, 12:20:16 PM
On 02 Sep 2010 09:48:29 GMT, arnuld <sun...@invalid.address> wrote:

snip

>(1) If it does not contain a newline then we will stop at the end of
>string, when NULL character is encountered.

You should not invent new names for things which already have
perfectly good names. But if you must, you should never use a name
which already has a completely different meaning. NULL is not the
same thing as nul.


--
Remove del for email

Keith Thompson

Sep 2, 2010, 3:32:52 PM
arnuld <sun...@invalid.address> writes:
>> On Wed, 01 Sep 2010 11:42:32 -0700, Keith Thompson wrote:
>
> Hey Keith, say Happy Holidays to me ;) , coming after long time to CLC

Happy Holidays to me!

>> What should it do if the source string doesn't contain a newline?
>
>> What if the source array doesn't contain a string (i.e., there's no
>> terminating '\0')?
>
>> Is the newline copied?
>
> (1) If it does not contain a newline then we will stop at the end of
> string, when NULL character is encountered.
>
> (2) If there is no terminating NULL then we will copy till the \n or
> SIZE_ARR whichever comes first.
>
> (3) no, newline is not copied.

From your original post:

enum { SIZE_ARR = 20 };

int string_copy_till_newline(char dest[], char src[]);

Is 20 really the maximum size you're interested in? That seems
unlikely. It would make much more sense to pass the maximum size (the
size of the destination array) as an argument to the function. This
requires the caller to be aware of the size.

And please stop misusing the word NULL.

[...]

Chad

Sep 2, 2010, 9:26:11 PM

> Did you consider what happens if there's no '\n' in the source
> string? Then you will copy the final '\0' and continue to copy
> and copy and copy....
>
> So this needs to be
>
>    for ( idx = 0; src[ idx ] != '\0' && src[ idx ] != '\n'; ++idx )
>        dest[ idx ] = src[ idx ];
>
> >   dest[idx] = '\0';
> >   return idx;
> > }
>

Wouldn't this make the loop (invariant) become src[ idx ] == '\0' or
src[ idx ] == '\n' ?

Ike Naar

Sep 3, 2010, 3:55:39 AM
On 2010-09-03, Chad <cda...@gmail.com> wrote:
>>    for ( idx = 0; src[ idx ] != '\0' && src[ idx ] != '\n'; ++idx )
>>        dest[ idx ] = src[ idx ];
>> >   dest[idx] = '\0';
>> >   return idx;
>> > }
>
> Wouldn't this make the loop (invariant) become src[ idx ] == '\0' or
> src[ idx ] == '\n' ?

It wouldn't.
An invariant is a condition that is true at the beginning of the
loop, remains true during execution of the loop, and still holds
when the loop finishes. For the example above, an invariant could be:

dest[0 .. idx-1] is a prefix of src that doesn't contain '\0' or '\n'

At the start of the loop it's enforced by setting idx to zero.
Every iteration of the loop preserves the invariant, because idx is
only incremented under the proper conditions, and dest is updated
accordingly.
When the loop finishes, in addition to the invariant, the negation of
the loop condition holds, so you have:

dest[0 .. idx-1] is a prefix of src that doesn't contain '\0' or '\n',
_and_ src[idx] equals '\0' or '\n'.

which, together, implies:

dest[0 .. idx-1] is the longest prefix of src that doesn't
contain '\0' or '\n'.
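
The invariant and the exit condition can be written into the code as assertions. This is an illustrative sketch only (the function name is made up, and the checks would be far too slow for real use), assuming the destination is large enough.

```c
#include <assert.h>
#include <string.h>

/* The loop under discussion, with the invariant and the exit
   condition spelled out as assertions. */
int copy_till_newline_checked(char dest[], const char src[])
{
    int idx;

    for (idx = 0; src[idx] != '\0' && src[idx] != '\n'; ++idx) {
        /* Invariant: dest[0 .. idx-1] is a copy of src[0 .. idx-1]. */
        assert(memcmp(dest, src, (size_t)idx) == 0);
        dest[idx] = src[idx];
    }

    /* The invariant still holds after the loop ... */
    assert(memcmp(dest, src, (size_t)idx) == 0);
    /* ... the copied prefix contains neither '\n' nor '\0' ... */
    assert(memchr(src, '\n', (size_t)idx) == NULL);
    assert(memchr(src, '\0', (size_t)idx) == NULL);
    /* ... and the negated loop condition holds as well. */
    assert(src[idx] == '\0' || src[idx] == '\n');

    dest[idx] = '\0';
    return idx;
}
```

Taken together, the last three assertions say that the copied prefix is the longest one free of '\0' and '\n'.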

arnuld

Sep 9, 2010, 6:25:44 AM
> On Thu, 02 Sep 2010 09:20:16 -0700, Barry Schwarz wrote:

> You should not invent new names for things which already have perfectly
> good names. But if you must, you should never use a name which already
> has a completely different meaning. NULL is not the same thing as nul.

Now wait a minute. I already messed into this in 2007 or so. Now lets
make my mind clear.


H&S 5, section 11.1: The value of macro NULL is the traditional null
pointer constant.

H&S 5, section 5.3.2: Every pointer in C has a special value called a
null pointer, which is different from every valid pointer of that type,
which compares equal to a null pointer constant, which converts to the
null pointers of other pointer types, and which has the value "false"
when used in a boolean context.


Hence, (1) NULL is a macro defined in one of the std lib headers

(2) null pointer is equal to the null pointer constant.

(3) And from C-FAQ 5.1 and 5.9, I see null pointer constant is just a 0
(zero). zero and null pointer are equal (in the case of pointers only).

(4) null character is '\0' which is not equal to the null pointer.


am I right now ?

Now I know NULL, null pointer, null pointer constant and null character.
What is nul (with single l) that you are talking about ?

Ben Bacarisse

Sep 9, 2010, 7:50:33 AM
arnuld <sun...@invalid.address> writes:

>> On Thu, 02 Sep 2010 09:20:16 -0700, Barry Schwarz wrote:
>
>> You should not invent new names for things which already have perfectly
>> good names. But if you must, you should never use a name which already
>> has a completely different meaning. NULL is not the same thing as nul.
>
> Now wait a minute. I already messed into this in 2007 or so. Now lets
> make my mind clear.
>
>
> H&S 5, section 11.1: The value of macro NULL is the traditional null
> pointer constant.
>
> H&S 5, section 5.3.2: Every pointer in C has a special value called a
> null pointer, which is different from every valid pointer of that type,
> which compares equal to a null pointer constant, which converts to the
> null pointers of other pointer types, and which has the value "false"
> when used in a boolean context.
>
>
> Hence, (1) NULL is a macro defined in one of the std lib headers

Yes, except that NULL is required to be defined in several headers. You
get it with stddef.h, stdlib.h, stdio.h and others.

> (2) null pointer is equal to the null pointer constant.
>
> (3) And from C-FAQ 5.1 and 5.9, I see null pointer constant is just a 0
> (zero). zero and null pointer are equal (in the case of pointers
> only).

"just a 0" is one permitted form. Any integer constant expression whose
value is zero will do, as will any of these cast to void *.

> (4) null character is '\0' which is not equal to the null pointer.

First, there is no single thing that is "the null pointer". There are
lots of null pointers -- at least one for every type. They all compare
equal so your statement is not ambiguous. I am just tidying it up.

More importantly, '\0' is just a fancy way of writing 0. Both are integer
constant expressions equal to zero and so both are perfectly good null
pointer constants. They therefore both compare equal to any null
pointer.
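
A few of these points can be checked directly. The sketch below is illustrative only (the function name is invented); every expression in it should be true on any conforming C implementation.

```c
#include <stddef.h>

/* Returns 1 if every check below holds, 0 otherwise. */
int null_checks_hold(void)
{
    int  *ip = 0;           /* 0 is a null pointer constant */
    char *cp = NULL;        /* so is whatever NULL expands to */
    void *vp = (void *)0;   /* and so is 0 cast to void * */

    return '\0' == 0                     /* the null character is the integer 0 */
        && sizeof '\0' == sizeof(int)    /* in C a character constant has type int */
        && ip == NULL
        && cp == 0
        && vp == NULL
        && (void *)ip == (void *)cp;     /* null pointers of different types
                                            compare equal once converted */
}
```

Note that the sizeof check is specific to C; in C++ a character literal has type char.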

> am I right now ?

Very nearly.

> Now I know NULL, null pointer, null pointer constant and null character.
> What is nul (with single l) that you are talking about ?

It is probably the lower case version of the character's ASCII name.
The control characters were all given three-letter names, though they are
usually written all upper case. NUL is the ASCII name of the null
character.

--
Ben.

Vincenzo Mercuri

Sep 9, 2010, 9:53:06 AM
arnuld wrote:

> Now I know NULL, null pointer, null pointer constant and null character.
> What is nul (with single l) that you are talking about ?

NUL (sometimes nul) is a symbolic name or an abbreviation for the
non-printable 'null character' in character sets like ASCII or EBCDIC.
Check the following link for a direct comparison of these character sets:

http://www.natural-innovations.com/computing/asciiebcdic.html


--
"Non puoi insegnare qualcosa ad un uomo.
Lo puoi solo aiutare a scoprirla dentro di sé." (G. Galilei)

Vincenzo Mercuri

Barry Schwarz

Sep 9, 2010, 7:25:27 PM
On 09 Sep 2010 10:25:44 GMT, arnuld <sun...@invalid.address> wrote:

>> On Thu, 02 Sep 2010 09:20:16 -0700, Barry Schwarz wrote:
>
>> You should not invent new names for things which already have perfectly
>> good names. But if you must, you should never use a name which already
>> has a completely different meaning. NULL is not the same thing as nul.
>
>Now wait a minute. I already messed into this in 2007 or so. Now lets
>make my mind clear.
>
>
>H&S 5, section 11.1: The value of macro NULL is the traditional null
>pointer constant.
>
>H&S 5, section 5.3.2: Every pointer in C has a special value called a
>null pointer, which is different from every valid pointer of that type,
>which compares equal to a null pointer constant, which converts to the
>null pointers of other pointer types, and which has the value "false"
>when used in a boolean context.

I don't have H&S but if it really says this it is misleading.

First off, the null pointer value is a perfectly valid value for
the pointer. Therefore it can't be different from every valid value
that the pointer can have since it can't be different from itself.

Second, the null pointer value of one type of pointer cannot be
converted to the null pointer value of an incompatible type of pointer
except by using a cast.
int *x = NULL;
float *y;
y = x; /* constraint violation */

>
>
>Hence, (1) NULL is a macro defined in one of the std lib headers

Already answered.

>
>(2) null pointer is equal to the null pointer constant.

Not necessarily. On those systems where NULL is defined as 0, the
null pointer constant has type int while the null pointer has type
pointer to <appropriate type>. On those systems where NULL is defined
as (void*)0, the null pointer constant has type pointer to void while
the null pointer has type pointer to <appropriate type>. Not only are
the types different but the bit patterns can also be different. On a
system with 8-bit bytes and 4-byte pointers and int, the following hex
representations are valid:
    0           00000000
    (void*)0    DEADBEEF
    (int*)0     F00BAD00

>
>(3) And from C-FAQ 5.1 and 5.9, I see null pointer constant is just a 0
>(zero). zero and null pointer are equal (in the case of pointers only).

Equivalent rather than equal. And 0 is only one form of the null
pointer constant.


>
>(4) null character is '\0' which is not equal to the null pointer.

My references always use "nul" but I have no objection to "null
character" (except for the extra typing).

>
>
>am I right now ?
>
>Now I know NULL, null pointer, null pointer constant and null character.
>What is nul (with single l) that you are talking about ?

Already answered.

Keith Thompson

Sep 9, 2010, 9:16:55 PM
Barry Schwarz <schw...@dqel.com> writes:
> On 09 Sep 2010 10:25:44 GMT, arnuld <sun...@invalid.address> wrote:
>
>>> On Thu, 02 Sep 2010 09:20:16 -0700, Barry Schwarz wrote:
>>
>>> You should not invent new names for things which already have perfectly
>>> good names. But if you must, you should never use a name which already
>>> has a completely different meaning. NULL is not the same thing as nul.
>>
>>Now wait a minute. I already messed into this in 2007 or so. Now lets
>>make my mind clear.
>>
>>
>>H&S 5, section 11.1: The value of macro NULL is the traditional null
>>pointer constant.
>>
>>H&S 5, section 5.3.2: Every pointer in C has a special value called a
>>null pointer, which is different from every valid pointer of that type,
>>which compares equal to a null pointer constant, which converts to the
>>null pointers of other pointer types, and which has the value "false"
>>when used in a boolean context.
>
> I don't have H&S but if it really says this it is misleading.
>
> First off, the null pointer value is a perfectly valid value for
> the pointer. Therefore it can't be different from every valid value
> that the pointer can have since it can't be different from itself.

The meaning of "valid" is vague. A null pointer is a pointer that
doesn't point. It's certainly invalid in the context of a unary "*"
operator -- though I agree it's valid for other purposes.

> Second, the null pointer value of one type of pointer cannot be
> converted to the null pointer value of an incompatible type of pointer
> except by using a cast.
> int *x = NULL;
> float *y;
> y = x; /* constraint violation */

Right. H&S didn't say otherwise. An expression whose value is a
null pointer does compare equal to a null pointer constant, and a
null pointer constant does convert to the null pointers of other
pointer types. No transitivity was stated or, IMHO, implied.

>>Hence, (1) NULL is a macro defined in one of the std lib headers
>
> Already answered.
>
>>
>>(2) null pointer is equal to the null pointer constant.
>
> Not necessarily.

Yes, assuming that "is equal to" means that the "==" operator will
yield 1.

some_type *ptr = NULL;
/* some_type is a null pointer */
int yes_its_equal = some_type == NULL;
/* yes_its_equal is guaranteed to have the value 1 */

> On those systems where NULL is defined as 0, the
> null pointer constant has type int while the null pointer has type
> pointer to <appropriate type>. On those systems where NULL is defined
> as (void*)0, the null pointer constant has type pointer to void while
> the null pointer has type pointer to <appropriate type>. Not only are
> the types different but the bit patterns can also be different. On a
> system with 8-bit bytes and 4-byte pointers and int, the following hex
> representations are valid:
> 0 00000000
> (void*)0 DEADBEEF
> (int*)0 F00BAD00

Irrelevant; it still compares equal.

[...]

Ike Naar

Sep 10, 2010, 2:02:11 AM
On 2010-09-10, Keith Thompson <ks...@mib.org> wrote:
> some_type *ptr = NULL;
> /* some_type is a null pointer */

/* some_type is some type; ptr is a null pointer */

> int yes_its_equal = some_type == NULL;

int yes_its_equal = ptr == NULL;

io_x

Sep 10, 2010, 2:58:56 AM

"Barry Schwarz" <schw...@dqel.com> ha scritto nel messaggio
news:bvmi86p1snqvbvs95...@4ax.com...

> On 09 Sep 2010 10:25:44 GMT, arnuld <sun...@invalid.address> wrote:
> Not necessarily. On those systems where NULL is defined as 0, the
> null pointer constant has type int while the null pointer has type
> pointer to <appropriate type>. On those systems where NULL is defined
> as (void*)0, the null pointer constant has type pointer to void while
> the null pointer has type pointer to <appropriate type>. Not only are
> the types different but the bit patterns can also be different. On a
> system with 8-bit bytes and 4-byte pointers and int, the following hex
> representations are valid:
> 0 00000000
> (void*)0 DEADBEEF
> (int*)0 F00BAD00

In what way are the above better than
    0           0x00000000
    (void*)0    0x00000000
    (int*)0     0x00000000

Is it to help the compiler?

Keith Thompson

Sep 10, 2010, 3:48:55 AM

Yeah, thanks.

Seebs

Sep 10, 2010, 5:29:16 AM
On 2010-09-09, arnuld <sun...@invalid.address> wrote:
> Hence, (1) NULL is a macro defined in one of the std lib headers
>
> (2) null pointer is equal to the null pointer constant.

At least, it compares equal.

> (3) And from C-FAQ 5.1 and 5.9, I see null pointer constant is just a 0
> (zero). zero and null pointer are equal (in the case of pointers only).

Not exactly. The key is that a *constant* zero, converted to a pointer,
is guaranteed to be a null pointer.

Consider:
    int *p = 0;
    int i = 0;
    if (((int *) i) != p) {
        /* this CAN happen */
    }

There aren't all that many systems where the if control expression will
evaluate to true, but they do exist.

> (4) null character is '\0' which is not equal to the null pointer.

Almost right. It turns out that '\0' is, in fact, an integer constant
with the value 0, so it is a valid null pointer constant, and if
converted to a pointer type, yields a null pointer.

> am I right now ?

Sorta.

> Now I know NULL, null pointer, null pointer constant and null character.
> What is nul (with single l) that you are talking about ?

Not nul, NUL. The ASCII character set defines a number of characters
with values from 0 through 31 which have special meanings. Character
zero is NUL. (The first few are NUL, SOH, STX, ETX, EOT, ENQ, ACK,
BEL...)

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.

Barry Schwarz

Sep 10, 2010, 11:30:40 AM
On Fri, 10 Sep 2010 08:58:56 +0200, "io_x" <a...@b.c.invalid> wrote:

>
>"Barry Schwarz" <schw...@dqel.com> ha scritto nel messaggio
>news:bvmi86p1snqvbvs95...@4ax.com...
>> On 09 Sep 2010 10:25:44 GMT, arnuld <sun...@invalid.address> wrote:
>> Not necessarily. On those systems where NULL is defined as 0, the
>> null pointer constant has type int while the null pointer has type
>> pointer to <appropriate type>. On those systems where NULL is defined
>> as (void*)0, the null pointer constant has type pointer to void while
>> the null pointer has type pointer to <appropriate type>. Not only are
>> the types different but the bit patterns can also be different. On a
>> system with 8-bit bytes and 4-byte pointers and int, the following hex
>> representations are valid:
>> 0 00000000
>> (void*)0 DEADBEEF
>> (int*)0 F00BAD00
>
>where above are better than
> 0 0x00000000
> (void*)0 0x00000000
> (int*) 0 0x00000000
>
>are for help the compiler?

Better, like beauty, is in the eye of whoever is developing the
implementation.

There are reasons why a null pointer of all bits zero is convenient.
It obviously simplifies the code to be generated for something like
char *p = 0;

There are also reasons why it is not. For example, on my system,
address 0 is universally available to all tasks for reading only.
Therefore
char x = *(char*)0;
should not be prohibited while
*(char*)0 = '1';
should be limited to operating system tasks.
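
One practical consequence, sketched below with a made-up struct node: code that wants a null pointer should assign one rather than clear the bytes. Assigning NULL (or 0) gives a null pointer whatever its representation is, while memset() to zero only does so where null pointers happen to be all bits zero.

```c
#include <stddef.h>
#include <string.h>

struct node {
    int          value;
    struct node *next;
};

/* Portable: the assignment yields a null pointer whatever bit
   pattern the implementation uses for it. */
void node_init(struct node *n)
{
    n->value = 0;
    n->next  = NULL;
}

/* Not strictly portable: sets every byte to zero, which makes next a
   null pointer only where null pointers are all bits zero. */
void node_init_by_memset(struct node *n)
{
    memset(n, 0, sizeof *n);
}
```

On most current systems the two functions leave the same bytes behind, but only the first is guaranteed by the language to do so.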
