How to copy a string safely and efficiently in C?

will hunt

Sep 1, 2016, 9:50:37 PM
should I use strncpy or snprintf, and why?

John Gordon

Sep 1, 2016, 9:58:46 PM
In <0b30b0cc-f7c5-4928...@googlegroups.com> will hunt <glo...@gmail.com> writes:

> should I use strncpy or snprintf, and why?

It seems like strncpy() would be the obvious choice. snprintf() is
meant for assembling an output string from one or more component
parts, which doesn't seem like what you want.

--
John Gordon A is for Amy, who fell down the stairs
gor...@panix.com B is for Basil, assaulted by bears
-- Edward Gorey, "The Gashlycrumb Tinies"

will hunt

Sep 1, 2016, 10:40:32 PM
On Friday, September 2, 2016 at 9:58:46 AM UTC+8, John Gordon wrote:
> In <0b30b0cc-f7c5-4928...@googlegroups.com> will hunt <glo...@gmail.com> writes:
>
> > should I use strncpy or snprintf, and why?
>
> It seems like strncpy() would be the obvious choice. snprintf() is
> meant for assembling an output string from one or more component
> parts, which doesn't seem like what you want.
>
for copying a string into a buffer of a given size,
char dst[size];
strncpy(dst, src, size);
snprintf(dst, size,"%s", src);
which is better?

Joe Pfeiffer

Sep 1, 2016, 11:12:16 PM
will hunt <glo...@gmail.com> writes:

> should I use strncpy or snprintf, and why?

In spite of its name, strncpy() isn't a good general purpose routine like
(for instance) strncat(). It copies exactly n bytes (even if the input
string is shorter) and doesn't necessarily null-terminate (if the input
string is longer).
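
Both effects can be observed with a small sketch. The helper name nuls_after_strncpy and the 8-byte buffer are made up for illustration; the only library calls are memset() and strncpy().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: run strncpy() into an 8-byte buffer that was
   pre-filled with 'x' and count how many null bytes ended up in it.
   A result of 0 means the buffer does not hold a string at all. */
size_t nuls_after_strncpy(const char *src)
{
    char dst[8];
    size_t i, count = 0;

    memset(dst, 'x', sizeof dst);
    strncpy(dst, src, sizeof dst);      /* always writes exactly 8 bytes */
    for (i = 0; i < sizeof dst; i++)
        if (dst[i] == '\0')
            count++;
    return count;
}
```

With a 3-character source the count is 5 (one terminator plus four bytes of padding); with an 8-character source it is 0, so the destination is not null-terminated.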

I sort of like

dst[0] = '\0';
strncat(dst, src, n-1);

Does what you most likely intended strncpy() to do, and if you don't
know the length of the source string in advance it's hard to imagine a
significantly more efficient implementation.

If I were strictly limited to the choices in your question I'd probably
use snprintf(), since that would give me a result that didn't have extra
random bytes appended to my string by accident.

If I were limited to the two functions, but could put some extra code
in, I'd probably use

strncpy(dst, src, n);
dst[n-1] = '\0';

since that would be safe and would probably be faster than the extra
code needed to parse the format string in snprintf() unless the string
were *really* long.

(just last week I was bitten by passing an array generated by xxd to a
routine expecting a null-terminated string....).
--
"Erwin, have you seen the cat?" -- Mrs. Schrödinger

Geoff

Sep 1, 2016, 11:29:13 PM
On Thu, 1 Sep 2016 19:40:24 -0700 (PDT), will hunt <glo...@gmail.com>
wrote:

>
>for copying a sting into a given size buffer,

char src[] = "ABC";

This is an error:
>char dst[size];
The size of the array must be a constant at compile time.
For example:
char dst[20];

assert(sizeof dst >= sizeof src);

>strncpy(dst, src, size);

Better to write
strncpy(dst, src, sizeof dst);

Why?
Because if you change the size of dst, you don't have to rewrite the
call to strncpy.

You could also write:

size_t size = sizeof dst;
strncpy(dst, src, size);

But this is redundant.

Be aware, the strncpy function copies the initial count characters of
src to dst and returns dst. If count is less than or equal to the
length of src, a nul character is not appended automatically to the
copied string. If count is greater than the length of src, the
destination string is padded with nul characters up to length count.

>snprintf(dst, size,"%s", src);

Better to write:
snprintf(dst, sizeof dst, "%s", src);

Same reason as above.

Note:
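The standard (C99) snprintf function stores at most count - 1 characters
in buffer and always appends a terminating null character when count is
greater than zero. It returns the length the complete formatted string
would have had, so a return value of count or more means truncation.
(Microsoft's older _snprintf behaved differently: it appended the null
only if the formatted length was strictly less than count.)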

You have to ensure that dst is large enough to receive the formatted
string that results from the function call.

>which is better?

That depends entirely on context.


#include <stdio.h>
#include <string.h>
#include <assert.h>

const char src[] = "ABC";

int main(void)
{
    char dst[10];

    assert(sizeof dst >= sizeof src);
    strncpy(dst, src, sizeof dst);
    snprintf(dst, sizeof dst, "%s", src);
}

wil...@wilbur.25thandclement.com

Sep 2, 2016, 1:15:17 AM
It may have changed since last time I checked, but the implementation of
snprintf in glibc could fail on allocation failure. It uses the same
internals as vfprintf, must allocate (at least once per thread) a special
FILE-like object, and deeper down the implementation wasn't shy about
dynamically allocating various temporaries so that even a simple format
specifier like "%s" could fail.

Things may have improved since then. And most people never cared to begin
with, or [wrongly] assumed allocation failure wasn't possible on Linux. But
because of that I avoided using snprintf, as failure-proof code is much
easier to work with.

Those details also made glibc's snprintf totally unsafe for use from signal
handlers. OpenBSD's snprintf is async-signal-safe, at least for anything not
formatting floats, and even that may have been fixed. OpenBSD has strlcpy,
anyhow, but it's nice when an implementation makes misuse difficult.

Keith Thompson

Sep 2, 2016, 2:07:13 AM
John Gordon <gor...@panix.com> writes:
> In <0b30b0cc-f7c5-4928...@googlegroups.com> will hunt <glo...@gmail.com> writes:
>
>> should I use strncpy or snprintf, and why?
>
> It seems like strncpy() would be the obvious choice. snprintf() is
> meant for assembling an output string from one or more component
> parts, which doesn't seem like what you want.

Yes, strncpy() is the obvious choice. It's also, most of the time, the
*wrong* choice.

I've written about strncpy() at some length here:

https://the-flat-trantor-society.blogspot.com/2012/03/no-strncpy-is-not-safer-strcpy.html

To summarize: The name implies that strncpy() is to strcpy() as
strncat() is to strcat(), i.e., that it's a "safer" version that will
not write past the end of the target array (as long as you correctly
tell it how big the target array is). The name is misleading.
In a very real sense, strncpy() isn't really a string function.
If the target array isn't big enough, it will not be null-terminated,
meaning that it will not contain a string. Any further attempt to
treat it as a string can fail in arbitrarily bad ways. (It also
pads the target with nulls when it *is* big enough, but that's just
a very minor inefficiency.)

This:

target[0] = '\0';
strncat(target, src, sizeof target - 1);

is very close to what strncpy() probably *would* do if it really
were a "safer" strcpy().

But even that provides what may be only a false sense of security.

The fact that you're considering using strncpy or snprintf rather than
strcpy means that you're at least thinking about avoiding overflowing
the target array. That's a very good thing. But simply avoiding a
buffer overflow may not be enough. You should ask yourself, if the
target array isn't big enough to hold a copy of the source, *what should
I do*? If you use strncat() or snprintf(), that question is already
answered for you: the copy quietly truncates your data.

Maybe that's what you want. But imagine, for example, that you're
copying a Unix command string:

"rm -rf /home/username/unimportant_directory"

And suppose the target only has enough room to hold:

"rm -rf /home/username"

and you then pass it to system(). (If you're not familiar with
Unix-like systems, the latter will attempt to remove your entire
home directory.)

Only you can decide, in the context of your program, what should
happen if a target array isn't big enough. Sometimes quiet
truncation is the right choice, sometimes you can try to use a bigger
array, and sometimes you'd be better off terminating the program
with a fatal error message rather than continuing to operate on
incomplete data.
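
One of those choices, refusing to copy rather than truncating, might be sketched as follows. The name copy_or_fail and its return convention are assumptions made for this illustration, not part of any library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dst_size bytes) only
   if the whole string, terminator included, fits.  Returns 0 on success.
   On failure it returns -1 and leaves dst holding an empty string (when
   there is room for one), so a truncated value cannot be used by
   accident.  Assumes dst and src do not overlap. */
int copy_or_fail(char *dst, size_t dst_size, const char *src)
{
    size_t need = strlen(src) + 1;

    if (need > dst_size) {
        if (dst_size > 0)
            dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, need);
    return 0;
}
```

With a 22-byte target, the long rm command above is rejected outright instead of being shortened to the dangerous one; the caller then decides whether to retry with a bigger buffer or give up.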

snprintf() has the advantage that it returns the number of
characters it *would* have copied if there were enough room, so
you can detect truncation. (You still have to remember to check,
and to decide how to handle it.) snprintf() was added in C99,
and in the past Microsoft's implementation didn't support it
well; I don't think that's a problem these days, but check your
implementation's documentation.
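
A minimal sketch of that check, assuming a C99 snprintf() and a hypothetical wrapper name copy_checked:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: returns 0 if src was copied whole, 1 if it was
   truncated, and -1 if snprintf() reported an error. */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    int n = snprintf(dst, dst_size, "%s", src);

    if (n < 0)
        return -1;
    /* n is the length the full result would have had, so anything
       that does not leave room for the terminator was cut short. */
    return (size_t)n >= dst_size ? 1 : 0;
}
```

The cast is safe because n has already been tested for a negative value.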

To summarize: Programming is hard.

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Malcolm McLean

Sep 2, 2016, 2:59:07 AM
On Friday, September 2, 2016 at 2:50:37 AM UTC+1, will hunt wrote:
>
> should I use strncpy or snprintf, and why?
>

As others have said, strncpy() is a trap (it is designed for databases
with fixed string fields instead of nul-terminated strings, and is
a sort of historical hangover).

if( snprintf(out, N, "%s", str) >= N)
/* error */;

is safe.

Be sure to use %s in case the string contains % characters.

Because C has undefined behaviour, a lot of people think that
they can make programs correct by removing the undefined behaviour.
Whilst that is an important first step, it doesn't do much good
if undefined behaviour is replaced by wrong behaviour. If a man's
name won't fit in your string field, what are you going to do
about it? Lots of possible answers, but just chopping it is
unlikely to be acceptable - you get complaints about that sort
of thing.

Ben Bacarisse

Sep 2, 2016, 5:42:15 AM
will hunt <glo...@gmail.com> writes:

> should I use strncpy or snprintf, and why?

...(from the subject line) to copy a string safely and efficiently in C?

I vote for neither. Several people have explained why strncpy is out --
never use it unless you know its dangerous peculiarities happen to be
*exactly* what you need -- but snprintf, though safe, is rather
complicated for the job and requires a relatively recent C library.

One simple solution is to use strncat(dst, src, n). This concatenates
two strings by adding at most n characters from src to those already
present in dst. To use it for copying, dst must contain an empty
string, and you need to remember that a terminating null byte is added
in addition to the characters that get copied. This means you would
write something like this

char copy[STR_SIZE];
*copy = 0;
strncat(copy, src, STR_SIZE - 1);

--
Ben.

Richard Heathfield

Sep 2, 2016, 7:36:13 AM
On 02/09/16 02:50, will hunt wrote:
> should I use strncpy or snprintf, and why?

You have no doubt read the other answers by now. If and only if you
aren't happy with any of them, read on.

I generally prefer to write a function that does exactly what's
required, rather than pick one particular library function for general
use. And you need to establish exactly what's required. In this case,
the key point at issue is, I think, whether it's okay to discard data if
there's too much.

If losing data is acceptable, you can do this:

#include <string.h>

/* Copy As Much Of String As Possible (camosap) */
/* target: destination string */
/* maxlen: amount of storage allocated for string, inc null */
/* source: place from which to copy data */
/* return: 1 if truncated, else 0 */

int camosap(char *target, size_t maxlen, char *source)
{
int truncated = 0;
size_t len = strlen(source);
if(len >= maxlen)
{
/* if the strings are guaranteed not to overlap, you
can use memcpy instead of memmove */
memmove(target, source, len);
target[len] = '\0';
truncated = 1;
}
else
{
strcpy(target, source);
}
return truncated;
}

If losing data is unacceptable, then you can make sure there's enough by
allocating it yourself:

#include <stdlib.h>
#include <string.h>

char *dupstring(const char *source)
{
    size_t len = strlen(source) + 1;
    char *new = malloc(len);
    if(new != NULL)
    {
        memcpy(new, source, len);
    }
    return new;
}

but then you take on responsibility for ensuring (as far too many sloppy
programmers do not ensure) that the allocation was successful, and of
course you have to free() it when you've finished with it.
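
What that responsibility looks like at a call site can be sketched as below. dupstring() is repeated so the example stands alone; count_words is a hypothetical caller invented for the illustration.

```c
#include <stdlib.h>
#include <string.h>

/* dupstring() as given above */
char *dupstring(const char *source)
{
    size_t len = strlen(source) + 1;
    char *new = malloc(len);
    if (new != NULL)
    {
        memcpy(new, source, len);
    }
    return new;
}

/* Hypothetical caller: strtok() modifies its argument, so work on a
   private copy.  Returns the number of space-separated words, or -1
   if the copy could not be allocated. */
int count_words(const char *text)
{
    int count = 0;
    char *tok;
    char *copy = dupstring(text);

    if (copy == NULL)
        return -1;                      /* allocation failed */
    for (tok = strtok(copy, " "); tok != NULL; tok = strtok(NULL, " "))
        count++;
    free(copy);                         /* caller owns the copy */
    return count;
}
```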

--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within

Ben Bacarisse

Sep 2, 2016, 9:24:46 AM
Richard Heathfield <r...@cpax.org.uk> writes:

> On 02/09/16 02:50, will hunt wrote:
>> should I use strncpy or snprintf, and why?
<snip>
> If losing data is acceptable, you can do this:
>
> #include <string.h>
>
> /* Copy As Much Of String As Possible (camosap) */
> /* target: destination string */
> /* maxlen: amount of storage allocated for string, inc null */
> /* source: place from which to copy data */
> /* return: 1 if truncated, else 0 */
>
> int camosap(char *target, size_t maxlen, char *source)
> {
> int truncated = 0;
> size_t len = strlen(source);
> if(len >= maxlen)
> {
> /* if the strings are guaranteed not to overlap, you
> can use memcpy instead of memmove */
> memmove(target, source, len);
> target[len] = '\0';
> truncated = 1;
> }
> else
> {
> strcpy(target, source);
> }
> return truncated;
> }

Several things have gone wrong with that. You overrun the buffer by
using len rather than maxlen and you are careful to allow copying
between overlapping objects in one case and then use strcpy in the
other (strcpy is undefined when the source and destination overlap).
And the setting of the null was probably intended to be

target[maxlen - 1] = '\0';

I find using maxlen for a size (i.e. a string length plus the
terminating null) to be rather confusing, especially as you also have
len which is a string length (excluding the terminating null).

If I had to maintain house style, I think I'd rewrite it, with a const
added as well, like this:

int camosap(char *target, size_t target_size, const char *source)
{
int truncated = 0;
size_t len = strlen(source);
if (len >= target_size)
{
/* if the strings are guaranteed not to overlap, you
can use memcpy instead of memmove */
memmove(target, source, target_size - 1);
target[target_size - 1] = '\0';
truncated = 1;
}
else
{
memmove(target, source, len + 1);
}
return truncated;
}

But I'd probably actually write like this, swapping the sense of the
return value as well (I suspect you favour zero for success for Anna
Karenina reasons):

int camosap(char *target, size_t target_size, const char *source)
{
    size_t source_size = strlen(source) + 1;
    if (source_size > target_size) {
        memmove(target, source, target_size - 1);
        target[target_size - 1] = '\0';
        return 0;
    }
    memmove(target, source, source_size);
    return 1;
}

<snip>
--
Ben.

Richard Heathfield

Sep 2, 2016, 9:51:25 AM
I've got a bad feeling about this.

> You overrun the buffer by
> using len rather than maxlen

Ouch one.

> and you are careful to allow copying
> between overlapping objects in one case and then use strcpy in the
> other (strcpy is undefined when the source and destination overlap).

Ouch two.

> And the setting of the null was probably intended to be
>
> target[maxlen - 1] = '\0';

Ouch three.

> I find using maxlen for a size (i.e. a string length plus the
> terminating null) to be rather confusing, especially as you also have
> len which is a string length (excluding the terminating null).

Ouch four. In the grand scheme of ouches (as seen above), it's more of
an "ooh", but I do take your point. Believe it or not... and I know
you'll find it difficult to believe... I did actually think about that
issue. But, on balance, I think I'll spend the rest of the day doing
documentation. Today is clearly not meant to be a programming day.

>
> If I had to maintain house style, I think I'd rewrite it, with a const
> added as well, like this:

Ouch five. And, when I saw that you'd replied but hadn't yet got to your
reply, I reviewed my code, and the missing const was the /only/ thing I
could see that was wrong with the code! Today is so not being a good
day. But, of course, that's why we have code reviews.

There was an ouch six, too - the name. It's terrible. And I knew it was
terrible when I wrote it. I just wanted a name that wasn't mystrcpy.

>
> int camosap(char *target, size_t target_size, const char *source)
> {
> int truncated = 0;
> size_t len = strlen(source);
> if (len >= target_size)
> {
> /* if the strings are guaranteed not to overlap, you
> can use memcpy instead of memmove */
> memmove(target, source, target_size - 1);
> target[target_size - 1] = '\0';
> truncated = 1;
> }
> else
> {
> memmove(target, source, len + 1);
> }
> return truncated;
> }
>
> But I'd probably actually write like this, swapping the sense of the
> return value as well (I suspect you favour zero for success for Anna
> Karenina reasons):

Right. If it worked, it worked. If it didn't work, I want to know why.

For functions that return pointers, the indication of failure that makes
most sense to me is NULL, and I think the C standard library bears me
out on this (fopen, malloc, strchr, etc). (Alas, this technique can't
tell me why it failed.)

For integer returns, however, 0 makes an awful lot of sense. -1 would
/also/ make a lot of sense, with the return value otherwise being an
index into an array of reasons for screwup, but (a) that doesn't sit
well with the standard library (fclose, rename, etc), and (b) it sounds
terribly messy.

James Kuyper

Sep 2, 2016, 12:29:31 PM
On 09/01/2016 11:29 PM, Geoff wrote:
> On Thu, 1 Sep 2016 19:40:24 -0700 (PDT), will hunt <glo...@gmail.com>
> wrote:
[Snipped text restored:]
>> char dst[size];
>> strncpy(dst, src, size);
>> snprintf(dst, size,"%s", src);
...
> This is an error:
>> char dst[size];
> The size of the array must be a constant at compile time.

Such a declaration defines a variable length array (VLA).

VLAs were not standardized until C99, and since C2011, if
__STDC_NO_VLA__ is pre#defined by the implementation, VLAs will not be
supported (6.10.8.3p1). VLAs can only have block scope or function
prototype scope (6.7.6.2p1), no linkage, and must not have static or
thread storage duration (6.7.6.2p2). However, the context of this code
example was not given in sufficient detail to prohibit a VLA for any of
those reasons.

Taken literally, the example code had the declaration of dst immediately
followed by function calls, which implies block scope. Therefore, since
that declaration didn't use the static, extern or _Thread_local key
words, dst has no linkage and automatic storage duration. The fact that
it was written at all implies that the intended target supported VLAs.
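
As a sketch of the block-scope case, assuming an implementation that supports VLAs (the function name copy_into_vla is invented for the example, and size must be at least 1):

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copy src into a block-scope array whose size is
   only known at run time, and return the length of what was stored. */
size_t copy_into_vla(const char *src, size_t size)
{
#ifdef __STDC_NO_VLA__
    (void)src;
    (void)size;
    return 0;                   /* this implementation has no VLAs */
#else
    char dst[size];             /* VLA: automatic storage, no linkage */

    /* sizeof dst is evaluated at run time and equals size */
    snprintf(dst, sizeof dst, "%s", src);
    return strlen(dst);
#endif
}
```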

Richard Heathfield

Sep 2, 2016, 1:05:56 PM
On 02/09/16 17:22, Stefan Ram wrote:
> will hunt <glo...@gmail.com> writes:
>> should I use strncpy or snprintf, and why?
>
> It is a good tradition in the Usenet to write
> the full question into the body of the post.
>
> If safety is an issue, one can check out
>
> K.3.7.1.4 The strncpy_s function

(a) Optional - compilers need not provide it.
(b) This doesn't address the issue of incomplete copies.
(c) It is easy (as I have already demonstrated in this thread) to
specify the wrong length.

>
> and
>
> K.3.5.3.5 The snprintf_s function

The same arguments apply.

There is quite simply no substitute for taking care to ensure that the
destination is sufficiently large to receive all the required data from
the source. (Again, I have proved this, in the most embarrassing way
possible, earlier in this thread.)

Ike Naar

Sep 2, 2016, 3:48:33 PM
On 2016-09-02, Stefan Ram <r...@zedat.fu-berlin.de> wrote:
> The implementation of "salsub" below /is/ using "strncpy",
> but I hope that this usage is correct.

Or use memcpy which has simpler semantics and does the job just as well.

> [...]
>
> s_type.c
>
> #include <stdarg.h> /* va_list, va_start, va_end */
> #include <stdio.h> /* vsnprintf, size_t, vsprintf */
> #include <string.h> /* strncpy */
> #include <stdlib.h> /* malloc, free */
> #include <stddef.h> /* ptrdiff_t */
> #include "s_type.h" /* s_type */

In my opinion it is best to move the #include of a header file
(in this case, s_type.h) to the very top of its implementation
(in this case, s_type.c).
Reason: this way one can easily check whether the header
s_type.h is self-contained and does not depend on other includes
used by its implementation s_type.c .

> [...]
> /* not tested! */
> s_type salfmt( s_type const f, ... )
> { va_list a;
> va_start(a,f);
> salfmtv( f, a );

That should be:

s_type const b = salfmtv(f, a);

> va_end(a);
> return b; }

Tim Rentsch

Sep 5, 2016, 1:24:01 AM
Or maybe this (with the same declaration of 'copy')

strncat( strcpy( copy, "" ), src, STR_SIZE-1 );

Keith Thompson

Sep 5, 2016, 1:53:29 AM
Tim Rentsch <t...@alumni.caltech.edu> writes:
> Ben Bacarisse <ben.u...@bsb.me.uk> writes:
[...]
>> char copy[STR_SIZE];
>> *copy = 0;
>> strncat(copy, src, STR_SIZE - 1);
>
> Or maybe this (with the same declaration of 'copy')
>
> strncat( strcpy( copy, "" ), src, STR_SIZE-1 );

Interesting. It's easy to forget that strcpy() returns a result,
since that result is rarely useful and usually ignored.

(I'm resisting the temptation to repeat the warning that simple
truncation isn't always the correct response to an overflow.)

Tim Rentsch

Sep 5, 2016, 2:42:15 AM
Two more alternatives: one that uses library functions, and one
that doesn't. The second one has different semantics in some
cases where the operands overlap, but still avoids undefined
behavior. The set of return values is expanded to deal with the
case that the "size" is zero. (Disclaimer: not tested.)

int
camosap( char *d, size_t n, const char *s ){
    const char *t = memchr( s, 0, n );
    size_t const k = t ? t-s+1 : n;
    memmove( d, s, k );
    return t ? 0 : n > 0 ? d[n-1] = 0, 1 : -1;
}


int
camosap( char *d, size_t n, const char *s ){
    return n > 1 && (*d = *s) ? camosap( d+1, n-1, s+1 )
        : n > 0 && *s == 0 ? *d = 0, 0
        : n > 0 ? *d = 0, 1
        : -1;
}
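
For readers who want to see the three return values, here is the first version restated with comments and one extra pair of parentheses; the behaviour is intended to be the same, and like the original it has had only light checking.

```c
#include <stddef.h>
#include <string.h>

/* Returns 0 if s was copied whole, 1 if it was truncated to fit in
   n bytes, and -1 if n is zero (nothing can be stored at all). */
int camosap(char *d, size_t n, const char *s)
{
    /* Look for the terminator within the first n bytes of s. */
    const char *t = memchr(s, 0, n);
    /* Bytes to move: up to and including the terminator if it was
       found, otherwise the whole capacity. */
    size_t const k = t ? (size_t)(t - s) + 1 : n;

    memmove(d, s, k);
    return t ? 0 : n > 0 ? (d[n-1] = 0, 1) : -1;
}
```

With a 4-byte target, "abc" gives 0, "abcdef" gives 1 and leaves "abc" in the target, and a size of zero gives -1 without touching the target.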

Malcolm McLean

Sep 5, 2016, 2:59:25 AM
On Friday, September 2, 2016 at 2:58:46 AM UTC+1, John Gordon wrote:
> In <0b30b0cc-f7c5-4928...@googlegroups.com> will hunt <glo...@gmail.com> writes:
>
> > should I use strncpy or snprintf, and why?
>
> It seems like strncpy() would be the obvious choice. snprintf() is
> meant for assembling an output string from one or more component
> parts, which doesn't seem like what you want.
>
Please read other replies before responding.

The advice sounds sensible, and it's a mistake anyone could make.
But as others have pointed out, it's quite wrong.

mark.b...@gmail.com

Sep 5, 2016, 4:04:12 AM
On Monday, 5 September 2016 06:53:29 UTC+1, Keith Thompson wrote:
> Tim Rentsch <t...@alumni.caltech.edu> writes:
> > Ben Bacarisse <ben.u...@bsb.me.uk> writes:
> [...]
> >> char copy[STR_SIZE];
> >> *copy = 0;
> >> strncat(copy, src, STR_SIZE - 1);
> >
> > Or maybe this (with the same declaration of 'copy')
> >
> > strncat( strcpy( copy, "" ), src, STR_SIZE-1 );
>
> Interesting. It's easy to forget that strcpy() returns a result,
> since that result is rarely useful and usually ignored.
>
> (I'm resisting the temptation to repeat the warning that simple
s/resisting/unsuccessfully resisting/
> truncation isn't always the correct response to an overflow.)

FTFY

Keith Thompson

Sep 5, 2016, 4:18:58 AM
mark.b...@gmail.com writes:
> On Monday, 5 September 2016 06:53:29 UTC+1, Keith Thompson wrote:
[...]
>> (I'm resisting the temptation to repeat the warning that simple
> s/resisting/unsuccessfully resisting/
>> truncation isn't always the correct response to an overflow.)
>
> FTFY

Good point (but unsuccessful resistance is still resistance, futile
though it may be).

Ben Bacarisse

Sep 5, 2016, 5:30:42 AM
Keith Thompson <ks...@mib.org> writes:

> Tim Rentsch <t...@alumni.caltech.edu> writes:
>> Ben Bacarisse <ben.u...@bsb.me.uk> writes:
> [...]
>>> char copy[STR_SIZE];
>>> *copy = 0;
>>> strncat(copy, src, STR_SIZE - 1);
>>
>> Or maybe this (with the same declaration of 'copy')
>>
>> strncat( strcpy( copy, "" ), src, STR_SIZE-1 );
>
> Interesting. It's easy to forget that strcpy() returns a result,
> since that result is rarely useful and usually ignored.

For some reason, I find I use the return value of strcpy (and the
others) quite often. Maybe too much LISP in my youth. But when I want
an expression to do the job in question, I write (or, more accurately,
have written in the past)

strncat((*copy = 0, copy), src, STR_SIZE-1)

so while I do use the value of strcpy, I've never thought to use it to
write the null as Tim did.

<snip>
--
Ben.

Tim Rentsch

Sep 10, 2016, 2:25:42 PM
A different way of thinking of it is not writing a null but
initializing 'copy' to an empty string, so the strncat() can
then concatenate the 'src' string onto that empty string. That
ideation may suggest using strcpy() more naturally. Personally I
find this framing more attractive, because both operations are
occurring at the same level of abstraction. (And I readily
confess to using the '*copy = 0' method myself at times.)

Tim Rentsch

Sep 11, 2016, 9:23:15 AM
Keith Thompson <ks...@mib.org> writes:

> Tim Rentsch <t...@alumni.caltech.edu> writes:
>> Ben Bacarisse <ben.u...@bsb.me.uk> writes:
>
> [...]
>
>>> char copy[STR_SIZE];
>>> *copy = 0;
>>> strncat(copy, src, STR_SIZE - 1);
>>
>> Or maybe this (with the same declaration of 'copy')
>>
>> strncat( strcpy( copy, "" ), src, STR_SIZE-1 );
>
> Interesting. It's easy to forget that strcpy() returns a result,
> since that result is rarely useful and usually ignored.

I would say that slightly differently. The return value of
strcpy() (and other str...() functions) is rarely used, because
people forget that these functions return a result. In many (or
most?) cases that happens because people are used to thinking
imperatively more than functionally. These return values are
actually useful noticeably more often than they are used.

supe...@casperkitty.com

Sep 11, 2016, 6:42:19 PM
On Sunday, September 11, 2016 at 8:23:15 AM UTC-5, Tim Rentsch wrote:
> I would say that slightly differently. The return value of
> strcpy() (and other str...() functions) is rarely used, because
> people forget that these functions return a result. In many (or
> most?) cases that happens because people are used to thinking
> imperatively more than functionally. These return values are
> actually useful noticeably more often than they are used.

The functions return the pointer that was passed in, and thus represent
something that the code already had. It may in some cases allow some things
to be done in initialization expressions which would otherwise require
statements, but that's about the extent of their usefulness. Returning the
address of the trailing 0 byte would have been more helpful, but that's not
what the functions do.

Tim Rentsch

Sep 12, 2016, 2:40:18 AM
supe...@casperkitty.com writes:

> On Sunday, September 11, 2016 at 8:23:15 AM UTC-5, Tim Rentsch wrote:
>> I would say that slightly differently. The return value of
>> strcpy() (and other str...() functions) is rarely used, because
>> people forget that these functions return a result. In many (or
>> most?) cases that happens because people are used to thinking
>> imperatively more than functionally. These return values are
>> actually useful noticeably more often than they are used.
>
> The functions return the pointer that was passed in, [...]

I've heard this broken record before. It isn't any more
interesting or convincing the eleventh time than it was
the first ten times.

Ian Collins

Sep 12, 2016, 5:04:37 AM
The Unix standard folks appear to think differently:

"The stpcpy() function shall return a pointer to the terminating NUL
character copied into the s1 buffer."

http://pubs.opengroup.org/onlinepubs/9699919799/functions/strcpy.html
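
Why a pointer to the terminator is handy can be sketched with a portable stand-in. copy_end and join_path are hypothetical names; on a POSIX.1-2008 system stpcpy() itself could replace copy_end.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper with the semantics quoted above: copy src to dst
   and return a pointer to the terminating null byte now in dst.  The
   caller must guarantee that dst is large enough. */
char *copy_end(char *dst, const char *src)
{
    size_t len = strlen(src);

    memcpy(dst, src, len + 1);
    return dst + len;
}

/* Build "dir/name" in out without rescanning what was already written.
   Returns the length of the result; out must be large enough. */
size_t join_path(char *out, const char *dir, const char *name)
{
    char *p = out;

    p = copy_end(p, dir);
    p = copy_end(p, "/");
    p = copy_end(p, name);
    return (size_t)(p - out);
}
```

Each call continues exactly where the previous one stopped, which repeated strcat() calls cannot do without walking the destination again.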

--
Ian

Melzzzzz

Sep 12, 2016, 5:21:42 AM
Hm, gcc returns a pointer to the result. What does this mean? We can't rely on
what strcpy returns?

Ian Collins

Sep 12, 2016, 5:33:19 AM
Do what? The library function returns what the standard says it returns,
gcc returns success or failure :)

--
Ian

Ben Bacarisse

Sep 12, 2016, 5:41:45 AM
stpcpy != strcpy

--
Ben.

Melzzzzz

Sep 12, 2016, 6:48:12 AM
Aaaaah, I missed that, sorry ;(

Tim Rentsch

Sep 12, 2016, 10:05:10 AM
I think you are imagining I'm saying things other than what I
actually said. There is no inconsistency between what I said
and the Open Group's decision to define the stpcpy() function.

asetof...@gmail.com

Sep 12, 2016, 4:09:25 PM
______
for copying a sting into a given size buffer,
char dst[size];
strncpy(dst, src, size);
snprintf(dst, size,"%s", src);
which is better?
-----+++
for(i=0; i<size&&(dst[i]=src[i]); ++i) ;
if(i>=size) return -1;

If src and dst point into
the same array, use memmove,
but I do not remember its args...

Keith Thompson

Sep 12, 2016, 4:58:46 PM
Please learn to quote properly. (Your newsreader should handle this for
you.)

This part of your followup:
> for copying a sting into a given size buffer,
> char dst[size];
> strncpy(dst, src, size);
> snprintf(dst, size,"%s", src);
> which is better?
was written by will hunt on September 1.

Dan Cross

Sep 12, 2016, 9:45:34 PM
In article <0b30b0cc-f7c5-4928...@googlegroups.com>,
will hunt <glo...@gmail.com> wrote:
>should I use strncpy or snprintf, and why?

This has been covered, but I'll throw my hat in the ring for the
"neither" camp.

The biggest deficiencies of strncpy() have to do with its surprising
semantics: it was originally written for copying data into
fixed-width fields of data structures that were written to disk
(actually as part of a filesystem). Thus, it zeros out the part of
the destination after the end of the source if the source is shorter
than the buffer size (instead of just appending a single NUL
terminator), and possibly won't NUL terminate if the source is of
length equal-to or longer than the size argument. It always writes
exactly 'size' characters, which is inefficient for short source
strings and large destination buffers.

The possible lack of NUL termination requires care on the part of the
programmer and is sufficiently non-intuitive that it is often
overlooked: this has been a fruitful source of errors over the years.
Further, there's no way to detect truncation (short of comparison
against the source string after the copy or similar).

snprintf() doesn't have the NUL-termination or zeroing problem, but
requires a lot of machinery (all the rest of printf, which is
complicated in its full glory) and is inefficient. However,
snprintf() *does* return a useful value that can be used to probe
for truncation.

The,

*dst = 0;
strncat(dst, src, size - 1);

trick is clever and reasonably efficient, but leaves the programmer
no way to detect truncation (again, unless followed by a comparison
against the source or something).

It should be noted that this "compare against the source" business as
a workaround to detect truncation is redundant: the copying routine
could report if it truncates or not, so requiring a second probing
step should not be necessary. Perhaps that's fine for one's
particular application, but the general problem persists.
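
The probing step might look like the following sketch. The wrapper name strncat_copy is invented here; it assumes size is at least 1 and that the two strings do not overlap.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper around the strncat idiom.  Returns 1 if src was
   truncated and 0 if it was copied whole. */
int strncat_copy(char *dst, size_t size, const char *src)
{
    size_t copied;

    *dst = 0;
    strncat(dst, src, size - 1);
    copied = strlen(dst);           /* the redundant second pass */
    /* If src has anything left at this position, it did not all fit. */
    return src[copied] != '\0';
}
```

The extra strlen() is the cost complained about above: the copy already knew where it stopped, but the interface gives no way to report it.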

To summarize, no string-copying function in the C standard library
provides an acceptable combination of efficiency, safety and the
ability to detect truncation.

The OpenBSD people recognized this some years ago and came up with
functions that had the right properties: strlcpy() and strlcat().
These are pretty simple. They will *always* terminate a greater-than-
zero length destination buffer, they will never write more than one
terminating NUL character, and they return a value that can be used
to detect truncation. They are available from the OpenBSD
distribution under a permissive license and have made it into most
of the BSD distributions, the Mac, and (I think) Solaris. Most Linux
distributions provide copies in a BSD compatibility library, though
they were rejected from glibc because, well, Ulrich Drepper didn't
like them.

An ersatz implementation of strlcpy() taken from
http://pub.gajendra.net/src/strlcpy.c is:

#include <stddef.h>
#include <string.h>

size_t
strlcpy(char *dst, const char *src, size_t size)
{
    size_t len, srclen;

    /*
     * Get the length of the source first, test for the
     * pathological case, then copy as much as we can.
     */
    srclen = strlen(src);
    if (size-- == 0)
        return(srclen);
    len = (size < srclen) ? size : srclen;
    memmove(dst, src, len);
    dst[len] = '\0';

    return(srclen);
}

Similarly, an implementation of strlcat() taken from
http://pub.gajendra.net/src/strlcat.c is:

#include <stddef.h>
#include <string.h>

size_t
strlcat(char *dst, const char *src, size_t size)
{
    char *p;
    size_t len;

    /*
     * This mimics the OpenBSD semantics most closely.
     *
     * We would like to use strlen() here, but the idea is to
     * catch cases where dst isn't a C-style string, and this
     * function is (presumably) called in error.
     */
    p = memchr(dst, '\0', size);
    if (p == NULL)
        p = dst + size;
    len = p - dst;

    return(len + strlcpy(p, src, size - len));
}

Feel free to use these, if you'd like (I'm the original author of
these versions and have put them into the public domain).
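
A usage sketch: truncation shows up as a return value greater than or equal to the buffer size. The copy of the function below uses the same logic as the strlcpy() above but is renamed sketch_strlcpy (a made-up name) only so the example cannot collide with a system-supplied strlcpy(); copy_truncated is likewise hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Same logic as the strlcpy() above, under a hypothetical name. */
size_t
sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t len, srclen;

    srclen = strlen(src);
    if (size == 0)
        return srclen;
    len = (srclen < size - 1) ? srclen : size - 1;
    memmove(dst, src, len);
    dst[len] = '\0';

    return srclen;
}

/*
 * Hypothetical caller-side check: returns 1 if src did not fit
 * in dst[size], 0 if it was copied in full.
 */
int
copy_truncated(char *dst, size_t size, const char *src)
{
    /* The return value is strlen(src); it fits only if it is < size. */
    return sketch_strlcpy(dst, src, size) >= size;
}
```

The same return value also says how large the buffer would have needed to be: the returned length plus one.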

- Dan C.

Ben Bacarisse

Sep 13, 2016, 6:07:10 AM
cr...@spitfire.i.gajendra.net (Dan Cross) writes:

> In article <0b30b0cc-f7c5-4928...@googlegroups.com>,
> will hunt <glo...@gmail.com> wrote:
>>should I use strncpy or snprintf, and why?
<snip lots of stuff>

> The,
>
> *dst = 0;
> strncat(dst, src, size - 1);
>
> trick is clever and reasonably efficient, but leaves the programmer
> no way to detect truncation (again, unless followed by a comparison
> against the source or something).

As written, yes, but a test similar to the one I use for detecting long
lines with fgets can give you this information in O(1) time (i.e. with
no re-scan for length or comparison). You just add one more null byte:

dst[0] = dst[size - 2] = 0;
strncat(dst, src, size - 1);

after which the expression dst[size - 2] && src[size - 1] is true if,
and only if, there was truncation.

(Note: the context was with a constant STR_SIZE where you could be
permitted to know that STR_SIZE - 2 was valid. Real life may not be so
simple.)

<snip>
--
Ben.

Richard Damon

Sep 13, 2016, 8:19:28 AM
On 9/12/16 9:45 PM, Dan Cross wrote:
> In article <0b30b0cc-f7c5-4928...@googlegroups.com>,
> will hunt <glo...@gmail.com> wrote:
>> should I use strncpy or snprintf, and why?
>
> This has been covered, but I'll throw my hat in the ring for the,
> "neither" camp.
>
> The biggest deficiencies of strncpy() have to do with its surprising
> semantics: it was originally written for copying data into
> fixed-width fields of data structures that were written to disk
> (actually as part of a filesystem). Thus, it zeros out the part of
> the destination after the end of the source if the source is shorter
> than the buffer size (instead of just appending a single NUL
> terminator), and possibly won't NUL terminate if the source is of
> length equal-to or longer than the size argument. It always writes
> exactly 'size' characters, which is inefficient for short source
> strings and large destination buffers.
>
> The possible lack of NUL termination requires care on the part of the
> programmer and is sufficiently non-intuitive that it is often
> overlooked: this has been a fruitful source of errors over the years.
> Further, there's no way to detect truncation (short of comparison
> against the source string after the copy or similar).
>
All it takes is comparing the last byte of dest to nul (0). Since
strncpy fills the tail with 0, a string that fit will have trailing
nuls; if it doesn't fit, there won't be one. Yes, if the 'payload' of the
string exactly fits, you will consider it 'truncated', but then for a
standard string, the nul IS part of the string, so it is.
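
A minimal sketch of that check, assuming size is at least 1 (the function name strncpy_checked is made up for illustration):

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wrapper: strncpy() followed by a last-byte test.
 * Requires size >= 1. Returns 1 if src (including its terminator)
 * did not fit, 0 otherwise. dst is always left NUL-terminated.
 */
int
strncpy_checked(char *dst, size_t size, const char *src)
{
    strncpy(dst, src, size);

    /* strncpy zero-pads, so a non-zero last byte means no room for the NUL. */
    if (dst[size - 1] != '\0') {
        dst[size - 1] = '\0';
        return 1;
    }
    return 0;
}
```

A source of exactly size characters is reported as truncated, since its terminator did not fit.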

Note, the lack of notification of truncation isn't that surprising as
that is one of the purposes of the function, to truncate the over length
string.


Dan Cross

Sep 13, 2016, 10:07:59 AM
In article <99SBz.71132$eM3....@fx04.iad>,
Note I said, "or similar." Yes, you can do that, but the larger
point I was trying to make is that it must be done as a separate
step.

>Note, the lack of notification of truncation isn't that surprising as
>that is one of the purposes of the function, to truncate the over length
>string.

This is subjective, but I disagree. The function takes an argument
to bound the amount it will copy, but that does not necessarily imply
it shouldn't tell you if it hit that boundary. Yes, you *can* detect
it after the call, but one can also define a function that lets you
know whether it truncated as in strlcpy().

The other deficiencies of strncpy() are such that it really should
not be used for things other than its original purpose.

- Dan C.

Dan Cross

Sep 13, 2016, 10:09:53 AM
In article <87h99kq...@bsb.me.uk>,
That's clever, but sadly, as you note, it won't work in all cases:
specifically, when STR_SIZE - 2 is invalid. That case may seem
superfluous, but one never knows.

- Dan C.

Ben Bacarisse

Sep 13, 2016, 10:30:16 AM
The original assumed that size was at least 1. If that's reasonable you
can test

src[size - 1] && size > 1 && dst[size - 2]

and if size could be 0 the test gets more complicated still but not very
much more complicated. The most obvious one being

size && src[size - 1] && size > 1 && dst[size - 2]

It may not be of any practical value, but I think it's interesting to
know that O(1) overflow detection is always possible in these cases.
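
For illustration, here is one way the strncat() idea could be packaged as a function. This is a sketch under assumptions: the name strncat_copy is made up, size 0 is reported as truncation because not even the terminator can be stored, and size 1 is handled separately because dst[size - 2] does not exist there. It tests dst[size - 2] before reading src[size - 1], since src[size - 1] is only known to be readable once dst[size - 2] is seen to be non-zero.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wrapper around the strncat() idiom.
 * Returns 1 if src was truncated, 0 if it fit in dst[size].
 */
int
strncat_copy(char *dst, size_t size, const char *src)
{
    if (size == 0)
        return 1;               /* nothing can be stored at all */
    if (size == 1) {
        dst[0] = '\0';
        return src[0] != '\0';  /* only the empty string fits */
    }

    /* Sentinel: stays zero unless at least size - 1 chars are copied. */
    dst[0] = dst[size - 2] = '\0';
    strncat(dst, src, size - 1);

    /* Check dst first so src is never read past its terminator. */
    return dst[size - 2] != '\0' && src[size - 1] != '\0';
}
```

No second pass over either string is needed: the check looks at two bytes regardless of length.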

--
Ben.

Dan Cross

Sep 13, 2016, 10:52:53 AM
In article <87wpifq...@bsb.me.uk>,
Yes yes, that's true, but surely we can see that one approaches a point
of diminishing returns with these sorts of shenanigans. One could simply
use strlcpy() and avoid all of this, no?

- Dan C.

Dan Cross

Sep 13, 2016, 10:55:01 AM
In article <nr93rt$lgt$1...@reader2.panix.com>,
Dan Cross <cr...@spitfire.i.gajendra.net> wrote:
>In article <87wpifq...@bsb.me.uk>,
>Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
>>> [snip]
>>
>>The original assumed that size was at least 1. If that's reasonable you
>>can test
>>
>> src[size - 1] && size > 1 && dst[size - 2]
>>
>>and if size could be 0 the test gets more complicated still but not very
>>much more complicated. The most obvious one being
>>
>> size && src[size - 1] && size > 1 && dst[size - 2]
>>
>>It may not be of any practical value, but I think it's interesting to
>>know that O(1) overflow detection is always possible in these cases.
>
>Yes yes, that's true, but surely we can see that one approaches a point
>of diminishing returns with these sorts of shenanigans. One could simply
>use strlcpy() and avoid all of this, no?

And to wit: Surely one would not expect a reasonable program to duplicate
these checks at every strncat() call site. Instead, one would wrap this
sequence of code up into some kind of function; at that point, one may as
well simply call the 8 line strlcpy().

- Dan C.

supe...@casperkitty.com

Sep 13, 2016, 10:58:16 AM
On Tuesday, September 13, 2016 at 9:07:59 AM UTC-5, Dan Cross wrote:
> The other deficiencies of strncpy() are such that it really should
> not be used for things other than its original purpose.

Which is writing to fixed-sized zero-padded string buffers which may or
may not have anything to do with directory entries. While the specific
use for directory entries may be obsolete, the function is still
convenient in places where one is dealing with lots of strings that are
short and precisely fit within a data structure which includes aligned
data items. For example, if one has a data structure that contains some
64-bit aligned data items and includes an 8-character field, adding a
trailing byte to the field could double the amount of space it requires.
While memcpy() may often be better than strncpy(), it requires strings
that are padded out to the correct length. If one wanted to populate an
8-character field with "N/A", for example one would need to use

memcpy(thing->field, "N/A\0\0\0\0", 8);

rather than

strncpy(thing->field, "N/A", THING_FIELD_SIZE);.

Note that the former version of the code will break if the thing field
size grows, **and using a macro for the size wouldn't help**, while the
latter version will automatically adapt to any increase in field size.


BartC

Sep 13, 2016, 10:59:54 AM
On 13/09/2016 02:45, Dan Cross wrote:

> semantics: it was originally written for copying data into
> fixed-width fields of data structures that were written to disk
> (actually as part of a filesystem). Thus, it zeros out the part of
> the destination after the end of the source if the source is shorter
> than the buffer size

That's handy to do anyway, whether you're going to write to disk or not.

For example, for making full use of small char-arrays within structs.
Then, if you only have a 4-char field, you want to be able to use all 4
chars, instead of needing an odd 5-char field or being limited to an odd
3-char string.

Nothing to do with the subject (and I just fancied writing some C), but
the following code uses store- and get- functions for copying normal
terminated strings into packed strings and extracting them again.

But it stores them as counted strings (storing a length within the
packed string without sacrificing a byte, or a top bit, to do so).

Truncation is reported (only because it was mentioned in the
sub-thread!). However unused characters within the field are not zeroed.


#include<stdio.h>
#include<string.h>

void setfslength(char* s, int m, int n){
    // encode length n of packed string within m-char field, within
    // last two bytes of field. Strings can be up to m characters so
    // using the entire field.
    //   a,b   The last two chars of the fixed string
    //   ---
    //   0,N   Length is N
    //   0,0   Length is 0 (special case of 0,N)
    //   X,0   Length is M-1
    //   X,Y   Length is M
    // NOTE: this only works for m in 2..256, and the string can't
    // contain zero bytes.

    if (m == n);                // x,y
    else if (n == m-1)
        *(s+m-1) = 0;           // x,0
    else {                      // 0,n
        *(s+m-2) = 0;
        *(s+m-1) = n;
    }
}

int getfslength(char* s, int m) {
    // s points to a packed string encoded with length at its end.
    // m is the max length (m>=2)
    // return the encoded length

    s += m-1;

    if (*(s-1) == 0)
        // cast so that lengths above 127 survive where char is signed
        return (unsigned char)*s;
    else if (*s == 0)
        return m-1;
    else
        return m;
}

int storepackedstring(char* dest, int width, char* source) {
    // store C string into packed fixed-length non-nul-terminated
    // field of size 'width' chars.
    // return 1 (OK), 0 (truncated)
    // width must be 2 to 256

    int slen = strlen(source);
    int status;

    if (slen>width) {
        status = 0;
        slen = width;
    }
    else
        status = 1;

    if (slen)
        memcpy(dest,source,slen);

    setfslength(dest,width,slen);
    return status;
}

int getpackedstring(char* dest, int width, char* source) {
    // copy packed string in fixed width field to normal C terminated
    // string. dest must point to at least width+1 characters (but
    // maximum space needed is 257 characters).
    // returns length of packed string

    int slen = getfslength(source, width);

    if (slen)
        memcpy(dest,source,slen);
    *(dest+slen) = 0;
    return slen;
}

int main(void) {

    typedef struct {
        char A[4];
        char B[2];
        char C[26];
    } R;

    R S;
    char str[257];
    int len;
#define STR1 "bartc"
#define STR2 "XY"
#define STR3 "ABC"

    if (!storepackedstring(S.A, sizeof S.A, STR1))
        printf("%s truncated\n",STR1);

    if (!storepackedstring(S.B, sizeof S.B, STR2))
        printf("%s truncated\n",STR2);

    if (!storepackedstring(S.C, sizeof S.C, STR3))
        printf("%s truncated\n",STR3);

    len = getpackedstring(str, sizeof S.A, S.A);
    printf("S.A = %s Length: %d\n", str, len);

    len = getpackedstring(str, sizeof S.B, S.B);
    printf("S.B = %s Length: %d\n", str, len);

    len = getpackedstring(str, sizeof S.C, S.C);
    printf("S.C = %s Length: %d\n", str, len);

}


--
Bartc

Ben Bacarisse

Sep 13, 2016, 12:18:24 PM
Yes, of course. That's the "it may not be of any practical value" part.

I suspect we've kept this going simply because my interests differ from
yours. Nowadays, programming is largely an intellectual exercise for
me, so I like to show alternatives, especially when they seem
non-obvious. I don't expect anyone to prefer the above over other, more
robust, engineering solutions. I suppose the trouble is you worry that
readers won't make the assessment correctly whereas I just assume they
will do.

--
Ben.

Ben Bacarisse

Sep 13, 2016, 12:33:41 PM
<snip text on strncpy>
supe...@casperkitty.com writes:
> [...] the function is still
> convenient in places where one is dealing with lots of strings that are
> short and precisely fit within a data structure which includes aligned
> data items. For example, if one has a data structure that contains some
> 64-bit aligned data items and includes an 8-character field, adding a
> trailing byte to the field could double the amount of space it
> requires.

I would not call these strings. In fact, I prefer not to call strncpy a
string function.

> While memcpy() may often be better than strncpy(), it requires strings
> that are padded out to the correct length. If one wanted to populate an
> 8-character field with "N/A", for example one would need to use
>
> memcpy(thing->field, "N/A\0\0\0\0", 8);

You might consider using

memcpy(thing->field, (char [sizeof thing->field]){"N/A"}, sizeof thing->field);

since you get the padding and the size automatically. It can always be
turned into a macro to remove the repetition.
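
A sketch of such a macro follows. The names SET_FIELD, struct thing and set_na are made up for illustration, and the macro assumes its first argument is a true array (so that sizeof gives the field width) with a constant bound.

```c
#include <string.h>

/*
 * Hypothetical macro: fill an array field from a string literal,
 * zero-padded to the field's full width. 'field' must be an array,
 * not a pointer, and is evaluated more than once.
 */
#define SET_FIELD(field, lit) \
    memcpy((field), (char [sizeof (field)]){lit}, sizeof (field))

/* Illustrative structure with an 8-character field. */
struct thing {
    long long id;
    char field[8];
};

void
set_na(struct thing *t)
{
    /* Expands to a memcpy from an 8-byte, zero-padded compound literal. */
    SET_FIELD(t->field, "N/A");
}
```

Because the width comes from sizeof, the call site does not change if the field grows.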

> rather than
>
> strncpy(thing->field, "N/A", THING_FIELD_SIZE);.

One advantage of memcpy + compound literal is that if you use a string
literal that is too long, you will most likely get a message from the
compiler. The strncpy won't go wrong, of course, but a string longer
than THING_FIELD_SIZE is almost certainly a programmer error you'd like
to know about.

> Note that the former version of the code will break if the thing field
> size grows, **and using a macro for the size wouldn't help**, while the
> latter version will automatically adapt to any increase in field size.

That's the advantage of using a compound literal.

--
Ben.

Dan Cross

Sep 13, 2016, 2:13:49 PM
In article <87lgyvp...@bsb.me.uk>,
Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
> [snip]
>
>You might consider using
>
> memcpy(thing->field, (char [sizeof thing->field]){"N/A"}, sizeof thing->field);
>
>since you get the padding and the size automatically. It can always be
>turned into a macro to remove the repetition.

I was curious about this, as it struck me that a decent compiler
would likely generate the same code for both of these. For strings
longer than the destination this was more or less true with GCC 5.4
and Clang. The generated machine code from GCC was identical; in
clang, the compound literal truncates the string data in the text
section.

Interestingly, in the case where the source was NOT longer than
the destination buffer, clang always calls strncpy (though it does
optimize it as a tail call); the memcpy case is optimized to a couple
of load instructions, depending on the destination buffer size.

For non-overflow, GCC optimizes out the calls to both strncpy() and
memcpy().

>> rather than
>>
>> strncpy(thing->field, "N/A", THING_FIELD_SIZE);.
>
>One advantage of memcpy + compound literal is that if you use a string
>literal that is too long, you will most likely get a message from the
>compiler. The strncpy won't go wrong, of course, but a string longer
>than THING_FIELD_SIZE is almost certainly a programmer error you'd like
>to know about.

Indeed. That's a useful diagnostic. It also generates faster code
on some compilers.

>> Note that the former version of the code will break if the thing field
>> size grows, **and using a macro for the size wouldn't help**, while the
>> latter version will automatically adapt to any increase in field size.
>
>That's the advantage of using a compound literal.

- Dan C.

Dan Cross

Sep 13, 2016, 2:15:18 PM
In article <87r38np...@bsb.me.uk>,
Ben Bacarisse <ben.u...@bsb.me.uk> wrote:
>cr...@spitfire.i.gajendra.net (Dan Cross) writes:
>> And to wit: Surely one would not expect a reasonable program to duplicate
>> these checks at every strncat() call site. Instead, one would wrap this
>> sequence of code up into some kind of function; at that point, one may as
>> well simply call the 8 line strlcpy().
>
>Yes, of course. That's the "it may not be of any practical value" part.
>
>I suspect we've kept this going simply because my interests differ from
>yours. Nowadays, programming is largely an intellectual exercise for
>me, so I like to show alternatives, especially when they seem
>non-obvious. I don't expect anyone to prefer the above over other, more
>robust, engineering solutions. I suppose the trouble is you worry that
>readers won't make the assessment correctly whereas I just assume they
>will do.

Ah, yes; I think you are right there. I have enjoyed the back-and-forth,
though.

- Dan C.

supe...@casperkitty.com

Sep 13, 2016, 2:30:31 PM
On Tuesday, September 13, 2016 at 11:33:41 AM UTC-5, Ben Bacarisse wrote:
> I would not call these strings. In fact, I prefer not to call strncpy a
> string function.

I would call them "zero-padded" strings. If one is interacting with APIs
that describe various representations for sequences of characters as
different kinds of strings (e.g. a Pascal string, a zero-padded string, etc.)
I don't think using some noun other than "string" is helpful. I agree that
in the absence of context to indicate otherwise, the noun "string" usually
means "C string", but that doesn't mean that the noun "string" should not
be part of the name of other kinds.

> You might consider using
>
> memcpy(thing->field, (char [sizeof thing->field]){"N/A"}, sizeof thing->field);
>
> since you get the padding and the size automatically. It can always be
> turned into a macro to remove the repetition.

Such an approach might be good if C supported static constant compound
literals. Can you identify any implementations where your version would
not end up producing substantially bulkier code than the strncpy version?

Dan Cross

Sep 13, 2016, 4:03:40 PM
In article <04886746-5004-45fd...@googlegroups.com>,
<supe...@casperkitty.com> wrote:
>On Tuesday, September 13, 2016 at 11:33:41 AM UTC-5, Ben Bacarisse wrote:
>> I would not call these strings. In fact, I prefer not to call strncpy a
>> string function.
>
>I would call them "zero-padded" strings. If one is interacting with APIs
>that describe various representations for sequences of characters as
>different kinds of strings (e.g. a Pascal string, a zero-padded string, etc.)
>I don't think using some noun other than "string" is helpful. I agree that
>in the absence of context to indicate otherwise, the noun "string" usually
>means "C string", but that doesn't mean that the noun "string" should not
>be part of the name of other kinds.

I would call this "textual data" or a "text buffer" or similar. "string"
is too overloaded of a term to not cause confusion with C strings and their
semantics.

>> You might consider using
>>
>> memcpy(thing->field, (char [sizeof thing->field]){"N/A"}, sizeof thing->field);
>>
>> since you get the padding and the size automatically. It can always be
>> turned into a macro to remove the repetition.
>
>Such an approach might be good if C supported static constant compound
>literals. Can you identify any implementations where your version would
>not end up producing substantially bulkier code than the strncpy version?

As I mentioned earlier, GCC 5.4 generates the same machine code for both
the strncpy() and memcpy()+compound literal variants. Clang 3.7 calls
strncpy() but optimizes out the call to memcpy(). So not only are there
at least two platforms where the generated code is less bulky than the
strncpy() version, on at least one of those it is even somewhat lighter.

- Dan C.

supe...@casperkitty.com

Sep 13, 2016, 4:39:02 PM
On Tuesday, September 13, 2016 at 3:03:40 PM UTC-5, Dan Cross wrote:
> As I mentioned earlier, GCC 5.4 generates the same machine code for both
> the strncpy() and memcpy()+compound literal variants. Clang 3.7 calls
> strncpy() but optimizes out the call to memcpy(). So not only are there
> at least two platforms where the generated code is less bulky than the
> strncpy() version, on at least one of those it is even somewhat lighter.

That's curious; the code generation of 6.2 seems to be a step backward there.
Given

#include <string.h>

#define MSG_SIZE 12
void test(char *dest)
{
    memcpy(dest,(char[MSG_SIZE]){"Test"}, MSG_SIZE);
}

The x86-64 GCC 6.2 on godbolt.org wrote out 12 consecutive single-byte
move instructions; if the size were increased to the point where inline
expansion would become impractical (e.g. 256) it seems to create
a temporary array on the stack, though it only subtracts 136 from SP, which
I find odd.

Dan Cross

Sep 13, 2016, 5:39:36 PM
In article <8b37afbd-a536-4d35...@googlegroups.com>,
Interesting. I wrote it for arm64 and ran it against gcc 6.1, which
is the most recent compiler I have on my dev machine for the aarch64
port of the Harvey OS. On that platform, the strncpy() version emits
a call to strncpy(), as on x86 with Clang, while your code optimizes
out the call to memcpy() and emits four moves and twelve STRB instructions
(this makes sense as presumably the function in question doesn't know
the alignment of `dest` at compile time).

Somewhat more surprising to me was that if I changed the code to write
into e.g. a global variable and I changed the type of that to be e.g.
an int, one still sees the same number of single-byte loads. I would
have expected the compiler to recognize that these were aligned and
load registers with constants containing more than a single byte of
text, and then write to memory with word-storing instructions.

- Dan C.

Ben Bacarisse

Sep 13, 2016, 6:26:19 PM
supe...@casperkitty.com writes:

> On Tuesday, September 13, 2016 at 11:33:41 AM UTC-5, Ben Bacarisse wrote:
<snip>
>> You might consider using
>>
>> memcpy(thing->field, (char [sizeof thing->field]){"N/A"}, sizeof thing->field);
>>
>> since you get the padding and the size automatically. It can always be
>> turned into a macro to remove the repetition.
>
> Such an approach might be good if C supported static constant compound
> literals. Can you identify any implementations where your version would
> not end up producing substantially bulkier code than the strncpy version?

Yes. The gcc I have here (5.2.1) compiles

struct S { char field[8]; };

void f(struct S *thing)
{
    memcpy(thing->field, (char [sizeof thing->field]){"N/A"},
           sizeof thing->field);
}

to just

movq $4271950, (%rdi)
ret

using -O2.

--
Ben.

Tim Rentsch

Sep 13, 2016, 7:07:35 PM
cr...@spitfire.i.gajendra.net (Dan Cross) writes:

> An ersatz implementation of strlcpy() taken from
> http://pub.gajendra.net/src/strlcpy.c is: [...]

I'm curious - did you really mean ersatz? The code doesn't
look that ersatz to me. :)

Tim Rentsch

Sep 13, 2016, 7:14:29 PM
Ben Bacarisse <ben.u...@bsb.me.uk> writes:

> <snip text on strncpy>
> supe...@casperkitty.com writes:
>> [...]
>> While memcpy() may often be better than strncpy(), it requires strings
>> that are padded out to the correct length. If one wanted to populate an
>> 8-character field with "N/A", for example one would need to use
>>
>> memcpy(thing->field, "N/A\0\0\0\0", 8);
>
> You might consider using
>
> memcpy(thing->field, (char [sizeof thing->field]){"N/A"}, sizeof thing->field);
>
> since you get the padding and the size automatically. It can always be
> turned into a macro to remove the repetition.

This technique using a compound literal is a nice one to know.
One caution though: it doesn't work when the array bound is
not a constant expression.

Dan Cross

Sep 13, 2016, 8:24:42 PM
In article <kfn4m5j...@x-alumni2.alumni.caltech.edu>,
Thanks. Perhaps I should have said "bespoke" but I'm not
quite that much of a hipster. :-)

- Dan C.

Tim Rentsch

Sep 13, 2016, 11:31:16 PM
The safe copy instruction sequence should of course be wrapped up
in a function. In my view though there are several reasons to
prefer not using strlcpy() for this application.

One, strlcpy() might not be available or in an obvious library.
That means navigating a configuration question so it can be
provided if not there. (Sadly I have run into this very issue
in some open source programs.) A further complication is
that it might be provided by the implementation, as the name
is in the reserved name space.

Two, in the case that truncation occurs, strlcpy() reads more
than it needs to (ie, if all we care about is whether or not
truncation occurred). That is somehow wasteful.

Three, given what strlcpy() returns, truncation is not detectable
directly but needs a comparison against the passed length
argument value, which might mean having to put that value in a
temporary so it can subsequently be compared. Of course it isn't
hard to do that but it makes the code a bit messy.

An alternative is to supply the truncation-detecting safe copy
function directly, using only standard functions. This can be
done with, eg, memchr() and memmove(), in just a few lines of
code (as shown in the example posted in a cousin thread not
too long ago).
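
For readers without that earlier example to hand, a sketch along these lines is possible. This is not the code from the other thread; the name copy_fits and the choice to report size 0 as "did not fit" are assumptions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical function: copy src into dst[size], always terminating
 * dst when size > 0. Returns true if all of src fit, false if it
 * was truncated. Reads at most 'size' bytes of src.
 */
bool
copy_fits(char *dst, size_t size, const char *src)
{
    const char *end;
    size_t len;

    if (size == 0)
        return false;

    /* Look for the terminator within the first 'size' bytes only. */
    end = memchr(src, '\0', size);
    len = (end != NULL) ? (size_t)(end - src) : size - 1;

    memmove(dst, src, len);
    dst[len] = '\0';

    return end != NULL;
}
```

Unlike strlcpy(), this never scans the source beyond the destination size, and the result can be tested directly without comparing against the length argument.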

I appreciate the value of strlcpy() et al. For this particular
application however it seems like not quite the right fit.

Dan Cross

Sep 15, 2016, 10:03:27 AM
In article <kfnvaxz...@x-alumni2.alumni.caltech.edu>,
Tim Rentsch <t...@alumni.caltech.edu> wrote:
>cr...@spitfire.i.gajendra.net (Dan Cross) writes:
>> In article <nr93rt$lgt$1...@reader2.panix.com>,
>> Dan Cross <cr...@spitfire.i.gajendra.net> wrote:
>>> [snip]
>>>
>>> Yes yes, that's true, but surely we can see that one approaches a point
>>> of diminishing returns with these sorts of shenanigans. One could simply
>>> use strlcpy() and avoid all of this, no?
>>
>> And to wit: Surely one would not expect a reasonable program to duplicate
>> these checks at every strncat() call site. Instead, one would wrap this
>> sequence of code up into some kind of function; at that point, one may as
>> well simply call the 8 line strlcpy().
>
>The safe copy instruction sequence should of course be wrapped up
>in a function. In my view though there are several reasons to
>prefer not using strlcpy() for this application.

Thanks, you raise some valid points but it begs the question: to
which application do you refer? The original poster simply asked
"How to copy a string safely and efficiently in C?", with an explicit
question about a choice between two functions: snprintf() and
strncpy(). As far as I know, a more specific context was not given.

>One, strlcpy() might not be available or in an obvious library.
>That means navigating a configuration question so it can be
>provided if not there. (Sadly I have run into this very issue
>in some open source programs.) A further complication is
>that it might be provided by the implementation, as the name
>is in the reserved name space.

Yes, this is unfortunate; I rather hope that they will standardize
it in an upcoming revision of the language, but I have some doubts.

It is, however, easy to write oneself -- or simply take one of the
available open source versions (mine is public domain, OpenBSD's
is under a permissive license). Conflict with a system-supplied
version could be an issue in portable programs, but there are
well-known techniques for working around that (e.g., separate
compilation and conditional compilation based on platform, etc).

>Two, in the case that truncation occurs, strlcpy() reads more
>than it needs to (ie, if all we care about is whether or not
>truncation occurred). That is somehow wasteful.

This is valid; I suspect the original rationale was something along
the lines of, "the next question one may ask after determining that
the string was truncated is how much more buffer was needed for
it?"

But yeah, it could just return a boolean indicating whether it
successfully copied the entirety of the source string. Indeed,
this is what the original version in the Akaros kernel did before
I rewrote it.

Incidentally: I went on a search for strncpy() in that kernel at
one point; out of something like 80ish call sites, only 4 were
correct.

>Three, given what strlcpy() returns, truncation is not detectable
>directly but needs a comparison against the passed length
>argument value, which might mean having to put that value in a
>temporary so it can subsequently be compared. Of course it isn't
>hard to do that but it makes the code a bit messy.

This is a very valid criticism.

>An alternative is to supply the truncation-detecting safe copy
>function directly, using only standard functions. This can be
>done with, eg, memchr() and memmove(), in just a few lines of
>code (as shown in the example posted in a cousin thread not
>too long ago).
>
>I appreciate the value of strlcpy() et al. For this particular
>application however it seems like not quite the right fit.

I'm still not quite sure what application you are referring to.
Certainly, for the filesystem thing or similar applications,
strncpy() really does do what one wants.

- Dan C.

Tim Rentsch

Sep 15, 2016, 3:32:39 PM
cr...@spitfire.i.gajendra.net (Dan Cross) writes:

> In article <kfnvaxz...@x-alumni2.alumni.caltech.edu>,
> Tim Rentsch <t...@alumni.caltech.edu> wrote:
>> cr...@spitfire.i.gajendra.net (Dan Cross) writes:
>>> In article <nr93rt$lgt$1...@reader2.panix.com>,
>>> Dan Cross <cr...@spitfire.i.gajendra.net> wrote:
>>>> [snip]
>>>>
>>>> Yes yes, that's true, but surely we can see that one approaches a point
>>>> of diminishing returns with these sorts of shenanigans. One could simply
>>>> use strlcpy() and avoid all of this, no?
>>>
>>> And to wit: Surely one would not expect a reasonable program to duplicate
>>> these checks at every strncat() call site. Instead, one would wrap this
>>> sequence of code up into some kind of function; at that point, one may as
>>> well simply call the 8 line strlcpy().
>>
>> The safe copy instruction sequence should of course be wrapped up
>> in a function. In my view though there are several reasons to
>> prefer not using strlcpy() for this application.
>
> Thanks, you raise some valid points but it begs the question: to
> which application do you refer? The original poster simply asked
> "How to copy a string safely and efficiently in C?", with an explicit
> question about a choice between two functions: snprintf() and
> strncpy(). As far as I know, a more specific context was not given.

Sorry, probably I should have been more explicit. The
application I meant is the one I thought the discussion had
converged on, namely, safe copying of strings with an easy
way to detect that the source string was truncated.

>> One, strlcpy() might not be available or in an obvious library.
>> That means navigating a configuration question so it can be
>> provided if not there. (Sadly I have run into this very issue
>> in some open source programs.) A further complication is
>> that it might be provided by the implementation, as the name
>> is in the reserved name space.
>
> Yes, this is unfortunate; I rather hope that they will standardize
> it in an upcoming revision of the language, but I have some doubts.

Ditto (and sadly, ditto).

> It is, however, easy to write oneself -- or simply take one of the
> available open source versions (mine is public domain, OpenBSD's
> is under a permissive license). Conflict with a system-supplied
> version could be an issue in portable programs, but there are
> well-known techniques for working around that (e.g., separate
> compilation and conditional compilation based on platform, etc).

Right, that's basically what I was saying, although my emphasis
was different.

>> Two, in the case that truncation occurs, strlcpy() reads more
>> than it needs to (ie, if all we care about is whether or not
>> truncation occurred). That is somehow wasteful.
>
> This is valid; I suspect the original rationale was something along
> the lines of, "the next question one may ask after determining that
> the string was truncated is how much more buffer was needed for
> it?"

Definitely, returning the length that would be needed is a
useful functionality, in those cases that want to do something
along those lines.

> But yeah, it could just return a boolean indicating whether it
> successfully copied the entirety of the source string. Indeed,
> this is what the original version in the Akaros kernel did before
> I rewrote it.

Interesting. It has been many years since I used a BSD system,
and I haven't paid much attention (or probably any) to how
those interfaces may have changed over time or between systems.

> Incidentally: I went on a search for strncpy() in that kernel at
> one point; out of something like 80ish call sites, only 4 were
> correct.

The problem with strncpy() is not what functionality it provides
but the name used to provide it. In retrospect I think it should
have been called something like strnotwhatyouexpectcpy().

>> Three, given what strlcpy() returns, truncation is not detectable
>> directly but needs a comparison against the passed length
>> argument value, which might mean having to put that value in a
>> temporary so it can subsequently be compared. Of course it isn't
>> hard to do that but it makes the code a bit messy.
>
> This is a very valid criticism.
>
>> An alternative is to supply the truncation-detecting safe copy
>> function directly, using only standard functions. This can be
>> done with, eg, memchr() and memmove(), in just a few lines of
>> code (as shown in the example posted in a cousin thread not
>> too long ago).
>>
>> I appreciate the value of strlcpy() et al. For this particular
>> application however it seems like not quite the right fit.
>
> I'm still not quite sure what application you are referring to.
> Certainly, for the filesystem thing or similar applications,
> strncpy() really does do what one wants.

Oh yes, strncpy() does the job it was meant to do very well.
Now if only more people knew what that job is....
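
A minimal sketch of the truncation-detecting copy described above, built
only from memchr() and memmove(). The name copy_string() and its int
return convention are assumptions made for illustration; this is not the
code posted in the cousin thread. It relies on memchr() stopping at the
first match, which C11 specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, which has room for n bytes
   (n must be at least 1). dst is always null-terminated.
   Returns 1 if all of src fit, 0 if it had to be truncated. */
static int copy_string(char *dst, size_t n, const char *src)
{
    assert(dst != NULL && src != NULL && n > 0);

    /* Search for the terminator in the first n bytes of src only. */
    const char *end = memchr(src, '\0', n);

    /* If found, copy the whole string; otherwise copy what fits. */
    size_t len = end ? (size_t)(end - src) : n - 1;

    memmove(dst, src, len);
    dst[len] = '\0';
    return end != NULL;
}
```

Unlike strlcpy(), this examines at most n bytes of src, and the result
can be tested directly, e.g. if (!copy_string(buf, sizeof buf, s)) to
handle truncation.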

luser droog

unread,
Sep 15, 2016, 3:46:11 PM9/15/16
to
On Thursday, September 15, 2016 at 2:32:39 PM UTC-5, Tim Rentsch wrote:
> cr...@spitfire.i.gajendra.net (Dan Cross) writes:
>
>
> > Incidentally: I went on a search for strncpy() in that kernel at
> > one point; out of something like 80ish call sites, only 4 were
> > correct.
>
> The problem with strncpy() is not what functionality it provides
> but the name used to provide it. In retrospect I think it should
> have been called something like strnotwhatyouexpectcpy().
>

I suggest char_array_cpy() or fixed_field_cpy(). Maybe *fld*.

supe...@casperkitty.com

unread,
Sep 15, 2016, 4:00:28 PM9/15/16
to
Since the functions were defined before the preprocessor became an
essential part of the language, the names were limited to six characters;
I think it might have been good to apply a looser constraint that all
library functions must be *unique* in the first 6 characters, but I don't
think fixed_ would be a particularly good name even then. From a naming
standpoint, I think what's important is that the function converts zero-
terminated strings to zero-padded strings; perhaps "strpad" might be a
good name if the behavior when the source and destination pointers are
equal was defined as padding a string in-place (I'd have a hard time
imagining an implementation where that guarantee would have any cost
beyond requiring that compilers not optimize out what would otherwise
be useful code).

Joe Pfeiffer

unread,
Sep 15, 2016, 4:52:14 PM9/15/16
to
The better question would be why something whose definition was so
idiosyncratic and whose use was so specialized was ever in the standard
library in the first place.

Keith Thompson

unread,
Sep 15, 2016, 4:56:14 PM9/15/16
to
supe...@casperkitty.com writes:
> On Thursday, September 15, 2016 at 2:46:11 PM UTC-5, luser droog wrote:
>> On Thursday, September 15, 2016 at 2:32:39 PM UTC-5, Tim Rentsch wrote:
>> > The problem with strncpy() is not what functionality it provides
>> > but the name used to provide it. In retrospect I think it should
>> > have been called something like strnotwhatyouexpectcpy().
>>
>> I suggest char_array_cpy() or fixed_field_cpy(). Maybe *fld*.
>
> Since the functions were defined before the preprocessor became an
> essential part of the language, the names were limited to six characters;
> I think it might have been good to apply a looser constraint that all
> library functions must be *unique* in the first 6 characters, but I don't
> think fixed_ would be a particularly good name even then.
[...]

C90 required external names to be unique in the first 6 characters,
possibly ignoring case. The C90 standard library had several
functions whose names are longer than 6 characters; vprintf, at 8
characters, is the longest I found in a quick search.

There might have been some pre-ANSI implementations that didn't
permit external names longer than 6 characters -- but surely strncat,
at 7 characters, must have been added after any such restrictions
were removed.

How is the preprocessor relevant?

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Working, but not speaking, for JetHead Development, Inc.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

Keith Thompson

unread,
Sep 15, 2016, 5:07:16 PM9/15/16
to
Joe Pfeiffer <pfei...@cs.nmsu.edu> writes:
[...]
> The better question would be why something whose definition was so
> idiosyncratic and whose use was so specialized was ever in the standard
> library in the first place.

[referring to strncpy]

One possibility is that it's because early Unix systems used a
fixed-width character array to store file names in directory entries,
and strncpy() was useful for manipulating them.

It's not clear that that's the actual rationale, but it's plausible.
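
A rough illustration of that use, assuming a directory entry with a
14-byte name field (the layout commonly described for early Unix file
systems). The struct and helper names are hypothetical.

```c
#include <string.h>

#define NAME_FIELD 14   /* assumed width of the name field */

struct dir_entry {
    unsigned short inode;
    char name[NAME_FIELD];  /* zero-padded; NOT terminated when full */
};

/* Store a name: strncpy() truncates long names and zero-pads short ones,
   so every byte of the field ends up with a defined value. */
static void set_name(struct dir_entry *e, const char *name)
{
    strncpy(e->name, name, sizeof e->name);
}

/* Compare a field against a name by padding the name the same way and
   comparing all NAME_FIELD bytes. */
static int name_equals(const struct dir_entry *e, const char *name)
{
    char padded[NAME_FIELD];
    strncpy(padded, name, sizeof padded);
    return memcmp(e->name, padded, sizeof padded) == 0;
}
```

Here the "copy exactly n bytes, pad with zeros, maybe no terminator"
behavior is what the field format calls for.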

Richard Heathfield

unread,
Sep 15, 2016, 5:17:33 PM9/15/16
to
On 15/09/16 23:53, Keith Thompson wrote:
<snip>
>
> C90 required external names to be unique in the first 6 characters,
> possibly ignoring case. The C90 standard library had several
> functions whose names are longer than 6 characters; vprintf, at 8
> characters, is the longest I found in a quick search.

I make it 7. Did you mean vfprintf?

<snip>

--
Richard Heathfield
Email: rjh at cpax dot org dot uk
"Usenet is a strange place" - dmr 29 July 1999
Sig line 4 vacant - apply within

Joe Pfeiffer

unread,
Sep 15, 2016, 5:52:40 PM9/15/16
to
Keith Thompson <ks...@mib.org> writes:

> Joe Pfeiffer <pfei...@cs.nmsu.edu> writes:
> [...]
>> The better question would be why something whose definition was so
>> idiosyncratic and whose use was so specialized was ever in the standard
>> library in the first place.
>
> [referring to strncpy]
>
> One possibility is that it's because early Unix systems used a
> fixed-width character array to store file names in directory entries,
> and strncpy() was useful for manipulating them.
>
> It's not clear that that's the actual rationale, but it's plausible.

I've always been under the impression that that is what it's for -- it
doesn't answer why it's in the standard library.

Keith Thompson

unread,
Sep 15, 2016, 6:03:32 PM9/15/16
to
Richard Heathfield <r...@cpax.org.uk> writes:
> On 15/09/16 23:53, Keith Thompson wrote:
> <snip>
>>
>> C90 required external names to be unique in the first 6 characters,
>> possibly ignoring case. The C90 standard library had several
>> functions whose names are longer than 6 characters; vprintf, at 8
>> characters, is the longest I found in a quick search.
>
> I make it 7. Did you mean vfprintf?
>
> <snip>

Yes. (And that 'f' at the end means I can't even claim that my
'f' key is broken.)

Keith Thompson

unread,
Sep 15, 2016, 6:09:43 PM9/15/16
to
The C standard library wasn't designed as a coherent whole.
It evolved. strncpy was useful at the time, so it went in.

supe...@casperkitty.com

unread,
Sep 15, 2016, 7:36:33 PM9/15/16
to
On Thursday, September 15, 2016 at 3:56:14 PM UTC-5, Keith Thompson wrote:
> supercat writes:
> > Since the functions were defined before the preprocessor became an
> > essential part of the language, the names were limited to six characters;
> > I think it might have been good to apply a looser constraint that all
> > library functions must be *unique* in the first 6 characters, but I don't
> > think fixed_ would be a particularly good name even then.
> [...]
>
> C90 required external names to be unique in the first 6 characters,
> possibly ignoring case. The C90 standard library had several
> functions whose names are longer than 6 characters; vprintf, at 8
> characters, is the longest I found in a quick search.

> How is the preprocessor relevant?

If implementations were allowed to regard external names starting with
some character sequence as reserved, then the preprocessor could get
around the 6-character limit for function names, by having an
implementation assign a short implementation-specific external name for
each library function.

#define strcpy_with_pad _Q382x

Note that a similar approach may be useful on platforms which have
configurable integer sizes, e.g.

#if INT_MAX==32767
#define printf __printf16
#else
#define printf __printf32
#endif

Keith Thompson

unread,
Sep 15, 2016, 8:20:59 PM9/15/16
to
supe...@casperkitty.com writes:
> On Thursday, September 15, 2016 at 3:56:14 PM UTC-5, Keith Thompson wrote:
>> supercat writes:
>> > Since the functions were defined before the preprocessor became an
>> > essential part of the language, the names were limited to six characters;
>> > I think it might have been good to apply a looser constraint that all
>> > library functions must be *unique* in the first 6 characters, but I don't
>> > think fixed_ would be a particularly good name even then.
>> [...]
>>
>> C90 required external names to be unique in the first 6 characters,
>> possibly ignoring case. The C90 standard library had several
>> functions whose names are longer than 6 characters; vprintf, at 8
>> characters, is the longest I found in a quick search.
>
>> How is the preprocessor relevant?
>
> If implementations were allowed to regard external names starting with
> some character sequence as reserved, then the preprocessor could get
> around the 6-character limit for function names, by having an
> implementation assign a short implementation-specific external name for
> each library function.
>
> #define strcpy_with_pad _Q382x

So, a hypothetical solution for a problem that, as far as I can tell,
never existed, and certainly did not exist by the time of the 1989
ANSI C standard.

[...]

supe...@casperkitty.com

unread,
Sep 16, 2016, 12:39:58 AM9/16/16
to
On Thursday, September 15, 2016 at 7:20:59 PM UTC-5, Keith Thompson wrote:
> supercat writes:
> > If implementations were allowed to regard external names starting with
> > some character sequence as reserved, then the preprocessor could get
> > around the 6-character limit for function names, by having an
> > implementation assign a short implementation-specific external name for
> > each library function.
> >
> > #define strcpy_with_pad _Q382x
>
> So, a hypothetical solution for a problem that, as far as I can tell,
> never existed, and certainly did not exist by the time of the 1989
> ANSI C standard.

The issue did exist when the names of standard-library functions were being
established. If the ability to use longer names in source than were attached
at link time had existed then, the length limit for linker names would not
have been an issue.

Further, the approach would still be useful even today to allow an
implementation to make additional library functions available to compilation
units that include header files for them. If an implementation's library
included a function named "fred()", that would conflict with a user-code
identifier by the same name. If, however, the library contained a function
called __qz_fred(), the implementation could bundle a header file <fred.h>
with the line "#define fred __qz_fred". User code that includes that
header would be unable to define a "fred()" function, but code which does
not include that header would be free to do so.
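
The header trick can be sketched in a single file. Here qz_fred stands
in for the reserved-style name __qz_fred, since user code should not
declare identifiers beginning with two underscores; all names are
hypothetical.

```c
/* --- contents of a hypothetical <fred.h> --- */
int qz_fred(int x);         /* the name the linker actually sees */
#define fred qz_fred        /* the name user code writes */

/* --- library side: defines only the short, prefixed name --- */
int qz_fred(int x)
{
    return x + 1;
}

/* --- user code that included the header: fred(...) expands to
       qz_fred(...), so no external symbol called "fred" is needed --- */
static int call_fred(void)
{
    return fred(41);
}
```

A translation unit that never includes the header sees no macro and can
define its own fred() without clashing with the library symbol.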

Tim Rentsch

unread,
Sep 16, 2016, 9:17:27 AM9/16/16
to
All of these (mine included) are stylistically at odds with other
standard library functions. In light of that I withdraw my
previous strawman and offer strnwtf() as a possible replacement
(the last part being short for "write truncated field").

(I really hope no one is taking these suggestions seriously.)

Malcolm McLean

unread,
Sep 16, 2016, 10:06:21 AM9/16/16
to
On Friday, September 16, 2016 at 2:17:27 PM UTC+1, Tim Rentsch wrote:
>
> >> The problem with strncpy() is not what functionality it provides
> >> but the name used to provide it. In retrospect I think it should
> >> have been called something like strnotwhatyouexpectcpy().
> >
> > I suggest char_array_cpy() or fixed_field_cpy(). Maybe *fld*.
>
> All of these (mine included) are stylistically at odds with other
> standard library functions. In light of that I withdraw my
> previous strawman and offer strnwtf() as a possible replacement
> (the last part being short for "write truncated field").
>
> (I really hope no one is taking these suggestions seriously.)
>
Standardisation actually narrows the functionality available to the programmer,
until the standard sweeps the board and every non-conforming platform
becomes obsolete. That can take a long time.
You can't use the standard identifier because it will be available on some
systems and not on others. You can try conditional define guards, but you just
dig a hole for yourself going too far down that route - the conditionals keep
on breaking.
Now with isnan() we're really in trouble. We can't use the identifier isnan()
because it is in the process of standardisation, and we can't write it portably
as #define myisnan(x) ( (x)==(x) ? 0 : 1) because some compiler will optimise
out the "redundant" test.
But with a function to copy a string to a buffer, taking a length for safety,
just write one, call it mysafestrcpy(), make it static, and then supply it.

The trivial is important, and you can't have programs breaking just because
of a little string copying routine.

supe...@casperkitty.com

unread,
Sep 16, 2016, 12:11:57 PM9/16/16
to
On Friday, September 16, 2016 at 9:06:21 AM UTC-5, Malcolm McLean wrote:
> Now with isnan() we're really in trouble. We can't use the identifier isnan()
> because it is in the process of standardisation, and we can't write it portably
> as #define myisnan(x) ( (x)==(x) ? 0 : 1) because some compiler will optimise
> out the "redundant" test.

The solution would be to define a class of identifiers which are reserved
for future C standards *but* which user code will likely be authorized by
future C standards to use in certain ways on older implementations, with
the older implementations processing them according to natural semantics.

If the _CX prefix had been designated for that purpose, and if a future
standard were to promise that _CX_isnan() will be a macro that invokes an
intrinsic, then user code could safely start including

#if !defined(_CX_isnan)
#define _CX_isnan(x) user_isnan_implementation(x)
#endif

Anyone who used the _CX prefix in a way which didn't end up getting
authorized by a future standard might be out of luck if a future standard
uses their identifier for some contrary purpose, but if compilers were
required to allow user-code definitions of macros they didn't otherwise
know about, user code could easily be written to take advantage of features
on compilers that have them, but be just as useful on compilers that don't.

Dan Cross

unread,
Sep 19, 2016, 11:15:57 AM9/19/16
to
In article <kfnfup1...@x-alumni2.alumni.caltech.edu>,
Tim Rentsch <t...@alumni.caltech.edu> wrote:
>cr...@spitfire.i.gajendra.net (Dan Cross) writes:
>> [snip]
>>
>> I'm still not quite sure what application you are referring to.
>> Certainly, for the filesystem thing or similar applications,
>> strncpy() really does do what one wants.
>
>Oh yes, strncpy() does the job it was meant to do very well.
>Now if only more people knew what that job is....

Indeed. Personally, I think the name, 'fldncpy()' would have
been better: bounded field copy.

Tim Rentsch

unread,
Sep 23, 2016, 1:25:05 AM9/23/16
to
Malcolm McLean <malcolm...@btinternet.com> writes:

> On Friday, September 16, 2016 at 2:17:27 PM UTC+1, Tim Rentsch wrote:
>>
>>>> The problem with strncpy() is not what functionality it provides
>>>> but the name used to provide it. In retrospect I think it should
>>>> have been called something like strnotwhatyouexpectcpy().
>>>
>>> I suggest char_array_cpy() or fixed_field_cpy(). Maybe *fld*.
>>
>> All of these (mine included) are stylistically at odds with other
>> standard library functions. In light of that I withdraw my
>> previous strawman and offer strnwtf() as a possible replacement
>> (the last part being short for "write truncated field").
>>
>> (I really hope no one is taking these suggestions seriously.)
>
> Standardisation actually narrows the functionality available to the programmer,
> until the standard sweeps the board and every non-conforming platform
> becomes obsolete. That can take a long time.
> You can't use the standard identifier because it will be available on some
> systems and not on others. You can try conditional define guards, but you just
> dig a hole for yourself going too far down that route - the conditionals keep
> on breaking.
> Now with isnan() we're really in trouble. We can't use the identifier isnan()
> because it is in the process of standardisation, and we can't write it portably
> as #define myisnan(x) ( (x)==(x) ? 0 : 1) because some compiler will optimise
> out the "redundant" test.

I find these statements silly or nonsensical on a variety of
levels. For starters, isnan() is not currently in the process of
standardization: in point of fact, that particular process
finished more than fifteen years ago. The idea that a standard
(either original or a revised version) is not usable until every
single implementation is conforming is laughable. As for using
preprocessor-controlled selection, part of the point of having a
standard is to facilitate that, and make it easy and bulletproof.
For example, if C99 features like isnan() are needed, it is very
easy to establish that requirement using a preprocessor test:

#if __STDC_VERSION__ > 199900
#else
#error Sorry, Charlie
#endif

As for the argument that isnan() cannot be written portably for
pre-C99 environments, what is offered is, pure and simple, a
strawman argument. There are at least a couple easy ways to
write a myisnan() function that evaluates nan-ness reliably and does
not suffer from the optimization malady mentioned. Is the
approach you mentioned the only way you could think of, or did
you just not bother to try exploring alternatives? Either way,
don't blame how standardization happens for the results of your
own shortcomings.
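
One possible approach, sketched under the assumption of IEEE 754
arithmetic and a compiler not running in a fast-math mode (it is not
necessarily the method alluded to above): route the value through a
volatile object, so the two operands are separate reads that cannot be
treated as a redundant self-comparison.

```c
/* Hypothetical pre-C99 replacement for isnan().
   Each use of v is a distinct volatile read, so the compiler must
   perform the comparison instead of assuming v equals itself. */
static int myisnan(double x)
{
    volatile double v = x;
    return v != v;      /* true only for a NaN */
}
```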

> But with a function to copy a string to a buffer, taking a length
> for safety, just write one, call it mysafestrcpy(), make it static,
> and then supply it.
>
> The trivial is important, and you can't have programs breaking just
> because of a little string copying routine.

ISTM you have totally missed the point. The whole reason the
discussion took place is because the current language standard
(even the latest one) does NOT contain a suitable standardized
function for this purpose. Of course that means writing one.
But the existence of other, lower-level, standard building block
functions makes it almost trivial to do that. Even if the
standard language does not have exactly what is needed, it pays
to leverage what library functionality it does provide. The
point of the discussion is identifying good ways to do that.