strncpy and 'n'

nroberts

Feb 16, 2012, 12:05:07 PM

Consider:

char const* f(char const* incoming)
{
    static char buf[MAX];

    strncpy(buf, incoming, strlen(incoming));
}

Is there ANY reason to use strncpy like that? I'm working on a
project that has such uses all throughout it and before I tell the
team leader that he's using a basic C function incorrectly I thought
I'd make sure I'm right.

Anders Wegge Keller

Feb 16, 2012, 12:57:36 PM

nroberts <robert...@gmail.com> writes:

> Consider:
>
> char const* f(char const* incoming)
> {
> static char buf[MAX];
>
> strncpy(buf, incoming, strlen(incoming));
> }
>
> Is there ANY reason to use strncpy like that?

If you want to avoid having the '\0' copied from the incoming text,
and risk overrunning buf, there could be a point. But apart from that, no.

I would rather use strncpy like this:

strncpy (buf, incoming, MAX);
buf[MAX] = 0;

> I'm working on a project that has such uses all throughout it and
> before I tell the team leader that he's using a basic C function
> incorrectly I thought I'd make sure I'm right.

Submit some code to www.thedailywtf.com :)

--
/Wegge

Looking for redundant peering of dk.*,linux.debian.*

Keith Thompson

Feb 16, 2012, 1:01:57 PM

There's rarely *any* reason to use strncpy(). It's not a "safer"
version of strcpy(); it's a quite different function. It can leave
the target buffer without a terminating '\0' (i.e., not a string),
or it can pad it with multiple needless '\0' bytes.

For that particular call, if strlen(incoming) is 10, for example,
it will only copy 10 bytes; it will not copy the terminating '\0'.
If that's what you want (either buf doesn't need to be a string, or
some other code supplies the '\0'), then memcpy() makes more sense.
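
A minimal sketch of the two behaviours described above. The buffer size DEMO_SIZE and the helper nul_bytes_after_strncpy() are made up for illustration and are not from the thread; the helper just counts how many '\0' bytes a pre-filled scratch buffer holds after the call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { DEMO_SIZE = 8 };  /* hypothetical buffer size, not from the thread */

/* Fill a scratch buffer with 'x', run strncpy(buf, src, n) on it and
   return how many '\0' bytes the buffer holds afterwards. */
static size_t nul_bytes_after_strncpy(const char *src, size_t n)
{
    char buf[DEMO_SIZE];
    size_t i, count = 0;

    assert(src != NULL && n <= sizeof buf);
    memset(buf, 'x', sizeof buf);
    strncpy(buf, src, n);
    for (i = 0; i < sizeof buf; i++)
        if (buf[i] == '\0')
            count++;
    return count;
}
```

With "abc" and n == 3 the count is 0 (no terminator was written); with n == DEMO_SIZE it is 5 (one terminator plus four padding bytes).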

--
Keith Thompson (The_Other_Keith) ks...@mib.org <http://www.ghoti.net/~kst>
Will write code for food.
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"

A. K.

Feb 16, 2012, 1:27:47 PM

On 16.02.2012 18:57, Anders Wegge Keller wrote:
> nroberts<robert...@gmail.com> writes:
>
>> Consider:
>>
>> char const* f(char const* incoming)
>> {
>> static char buf[MAX];
>>
>> strncpy(buf, incoming, strlen(incoming));
>> }
>>
>> Is there ANY reason to use strncpy like that?
>
> If you want to avoid having the '\0' copied from the incoming text,
> and risk overrunning buf, there could be a point. But apart from that, no.
>
> I would rather use strncpy like this:
>
> strncpy (buf, incoming, MAX);
> buf[MAX] = 0;

buffer overflow !!! :o)

Anders Wegge Keller

Feb 16, 2012, 1:34:16 PM

"A. K." <a...@nospam.org> writes:

> On 16.02.2012 18:57, Anders Wegge Keller wrote:

>> buf[MAX] = 0;

buffer overflow !!! :o)

If you have got to have them, better to decide for yourself where to have
them. I plead insanity in the act.

Markus Schaub

Feb 16, 2012, 1:53:24 PM

nroberts wrote:

He has probably heard that strcpy() is bad and that he should use
strncpy().

Markus

Malcolm McLean

Feb 16, 2012, 2:10:14 PM

On Feb 16, 6:01 pm, Keith Thompson <ks...@mib.org> wrote:
>
> There's rarely *any* reason to use strncpy(). It's not a "safer"
> version of strcpy(); it's a quite different function. It can leave
> the target buffer without a terminating '\0' (i.e., not a string),
> or it can pad it with multiple needless '\0' bytes.
>

It's designed for databases with fixed fields and non-nul terminated
strings. The padding zeros aren't unnecessary, because often these
databases do a quick match or lookup by applying some algorithm to the
whole field.
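
A rough sketch of the fixed-field use described above, under the assumption of an 8-byte field. FIELD_WIDTH, set_field() and fields_equal() are hypothetical names, not part of any real database API. Because strncpy() pads the rest of the field with zeros, two fields holding the same text compare equal byte for byte.

```c
#include <assert.h>
#include <string.h>

enum { FIELD_WIDTH = 8 };  /* hypothetical field width */

/* Store s in a fixed-width field: zero-padded if short, and
   deliberately NOT '\0'-terminated if it fills the field. */
static void set_field(char field[FIELD_WIDTH], const char *s)
{
    assert(field != NULL && s != NULL);
    strncpy(field, s, FIELD_WIDTH);
}

/* Whole-field comparison; only meaningful because of the padding. */
static int fields_equal(const char a[FIELD_WIDTH], const char b[FIELD_WIDTH])
{
    return memcmp(a, b, FIELD_WIDTH) == 0;
}
```

The field is a byte array rather than a string, so it must not be passed to strlen() or printed with "%s" without a precision.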

lawrenc...@siemens.com

Feb 16, 2012, 1:25:56 PM

nroberts <robert...@gmail.com> wrote:
> Consider:
>
> char const* f(char const* incoming)
> {
> static char buf[MAX];
>
> strncpy(buf, incoming, strlen(incoming));
> }
>
> Is there ANY reason to use strncpy like that?

Maybe, but almost certainly not. If the incoming string is longer than
MAX bytes, you get a buffer overflow, which is very bad. If it happens
to be exactly MAX characters long, you get an unterminated string in
buf, which is bad if the following code (you don't show any, but I
presume there is some or the function is completely pointless) expects
to treat it as a string. And if the incoming string is less than MAX
bytes, since the code doesn't copy the null byte, you get whatever is
leftover in buf tacked on to the end. (buf is initialized to all null
bytes, but since it's static, that only happens once, not on each call.)
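
The leftover-bytes case can be reproduced with a small sketch. It assumes MAX is 16 and adds a return statement (the posted function has none) so the result can be inspected; the name f_as_posted() is made up here.

```c
#include <assert.h>
#include <string.h>

#define MAX 16  /* assumed value; the post does not show it */

/* The function from the question, plus a return so the buffer is visible. */
static const char *f_as_posted(const char *incoming)
{
    static char buf[MAX];  /* zeroed once, at program start */

    assert(incoming != NULL && strlen(incoming) < MAX);
    strncpy(buf, incoming, strlen(incoming));  /* the '\0' is never copied */
    return buf;
}
```

Calling it with "hello" and then with "hi" leaves "hillo" in the buffer: the second call overwrites only the first two bytes.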
--
Larry Jones

Oh yeah? You just wait! -- Calvin

Ben Pfaff

Feb 16, 2012, 2:18:09 PM

nroberts <robert...@gmail.com> writes:

> char const* f(char const* incoming)
> {
> static char buf[MAX];
>
> strncpy(buf, incoming, strlen(incoming));
> }
>
> Is there ANY reason to use strncpy like that?

It looks very odd. I think it would be equivalent code if you
replaced "strncpy" by "memcpy" here.

The behavior here makes sense for the first call to the function,
if strlen(incoming) < MAX, but it will be strange on subsequent
calls.
--
"The fact that there is a holy war doesn't mean that one of the sides
doesn't suck - usually both do..."
--Alexander Viro

Stephen Sprunk

Feb 16, 2012, 2:29:16 PM

Assuming this is representative of the actual code, it's clearly wrong
because strncpy() will overflow buf if strlen(incoming)+1 is greater
than MAX. This means it is no better than strcpy(buf, incoming).

The correct way to write this would be:

char const* f(char const* incoming)
{
    static char buf[MAX];

    strncpy(buf, incoming, MAX);
}

Unlike the above code, this guarantees the copy will not overflow buf.

S

--
Stephen Sprunk "God does not play dice." --Albert Einstein
CCIE #3723 "God is an inveterate gambler, and He throws the
K5SSS dice at every possible opportunity." --Stephen Hawking

ImpalerCore

Feb 16, 2012, 2:32:22 PM

When you're working with fixed width character buffers, use a fixed
width strlen.

size_t c_strnlen( const char* str, size_t n )
{
    const char* p;
    p = memchr( str, '\0', n );
    return p ? (size_t)( p - str ) : n;
}

Then use the following

strncpy(buf, incoming, c_strnlen(incoming, sizeof (buf)));

Of course, this replaces the potential of a buffer overrun with
truncation, which can lead to other subtle problems.

An alternative is to use something like 'strlcpy', if you want to
guarantee a '\0' character at the end of the string.
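
strlcpy() is not part of ISO C, so it may not be available everywhere. As a sketch under that assumption, a function with similar semantics could look like the following; copy_truncating() is a hypothetical name, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst (capacity size), truncating if needed and always
   terminating when size > 0. Returns strlen(src), so a result >= size
   tells the caller that truncation happened. */
static size_t copy_truncating(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

Unlike strncpy(), this writes exactly one '\0' and the return value makes the truncation detectable instead of silent.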

Best regards,
John D.

A. K.

Feb 16, 2012, 2:58:46 PM

On 16.02.2012 19:34, Anders Wegge Keller wrote:
> "A. K."<a...@nospam.org> writes:
>
>> On 16.02.2012 18:57, Anders Wegge Keller wrote:
>
>>> buf[MAX] = 0;
>
> buffer overflow !!! :o)
>
> If you got to have them, better decide yourself where to have them. I
> pledge insanity in the act.
>

guess where I have learnt that this produces an overflow?
:o)))

Keith Thompson

Feb 16, 2012, 5:20:29 PM

Yes, the padding zeros are necessary *if* you're dealing with that
kind of data structure.

I suspect that strncpy() is used incorrectly, under the assumption
that it's a "safer" strcpy(), more often than it's used correctly.
IMHO it shouldn't be in the standard library, at least not with
that name.

Ike Naar

Feb 16, 2012, 5:45:16 PM

On 2012-02-16, Stephen Sprunk <ste...@sprunk.org> wrote:
> On 16-Feb-12 11:05, nroberts wrote:
>> char const* f(char const* incoming)
>> {
>> static char buf[MAX];
>>
>> strncpy(buf, incoming, strlen(incoming));
>> }
>>

> Assuming this is representative of the actual code, it's clearly wrong
> because strncpy() will overflow buf if strlen(incoming)+1 is greater
> than MAX. This means it is no better than strcpy(buf, incoming).

Nit: it will overflow if strlen(incoming) is greater than MAX.
It will not overflow if strlen(incoming) equals MAX.
In that case, it will leave an unterminated string in buf,
but most people wouldn't call that an overflow.

Nick Keighley

Feb 17, 2012, 3:50:52 AM

On Feb 16, 5:05 pm, nroberts <roberts.n...@gmail.com> wrote:
> Consider:
>
> char const* f(char const* incoming)
> {
> static char buf[MAX];
>
> strncpy(buf, incoming, strlen(incoming));
>
> }
>
> Is there ANY reason to use strncpy like that?

no. Besides all the other problems the function doesn't return
anything and buf[] is inaccessible. I suspect you meant to return
&buf[0].

Malcolm McLean

Feb 17, 2012, 7:19:02 AM

On Feb 16, 10:20 pm, Keith Thompson <ks...@mib.org> wrote:
>
> I suspect that strncpy() is used incorrectly, under the assumption
> that it's a "safer" strcpy(), more often than it's used correctly.
> IMHO it shouldn't be in the standard library, at least not with
> that name.
>

You're right.
There's no point providing strncpy() but not functions like hash() and
faststrncmpwithtrailingzeros() to actually use fixed width strings.
--
Malcolm's website
http://www.malcommclean.site11.com/www

James Kuyper

Feb 17, 2012, 10:04:43 AM

On 02/17/2012 07:19 AM, Malcolm McLean wrote:
> On Feb 16, 10:20 pm, Keith Thompson <ks...@mib.org> wrote:
>>
>> I suspect that strncpy() is used incorrectly, under the assumption
>> that it's a "safer" strcpy(), more often than it's used correctly.
>> IMHO it shouldn't be in the standard library, at least not with
>> that name.
>>
> You're right.
> There's no point providing strncpy() but not functions like hash() and
> faststrncmpwithtrailingzeros() to actually use fixed width strings.

What would the benefits of using faststrncmpwithtrailingzeros()
rather than memcmp() be?
--
James Kuyper

Malcolm McLean

Feb 17, 2012, 1:27:21 PM

On Feb 17, 3:04 pm, James Kuyper <jameskuy...@verizon.net> wrote:
>
> What would the benefits of using faststrncmpwithtrailingzeros()
> rather than memcmp() be?
>

That's a point. It documents that you're doing a string compare, but
actually it's the same as memcmp(). On most platforms, it will need
guaranteed integer-aligned fields to be fast, however. That's not
something it's easy to specify in the C standard.
--
Visit my website. Play the Alice in Wonderland Card game
http://www.malcommclean.site11.com/www

Malcolm McLean

Feb 17, 2012, 2:22:09 PM

On Feb 17, 3:04 pm, James Kuyper <jameskuy...@verizon.net> wrote:
>

> What would the benefits of using faststrncmpwithtrailingzeros()
> rather than memcmp() be?
>

If you have long fields with mainly short contents, it could also be
faster, since it can terminate at the first pair of nul bytes.

James Kuyper

Feb 17, 2012, 2:40:34 PM

If you know that the end of the string will be determined either by
the end of a fixed-length field, or by a terminating null character,
use strncmp(). If you want to check the entire length of the fixed length
field, regardless of null terminators, memcmp() would do. I don't think
that there's sufficient need for a function whose behavior falls
between those two extremes, to make it a standard library function.
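
The difference between the two extremes can be shown with a short sketch. The wrapper names equal_as_strings() and equal_as_bytes() are hypothetical; they only label which standard function is being used.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Compare up to width bytes, stopping at the first '\0'. */
static int equal_as_strings(const char *a, const char *b, size_t width)
{
    assert(a != NULL && b != NULL);
    return strncmp(a, b, width) == 0;
}

/* Compare all width bytes, whatever they contain. */
static int equal_as_bytes(const char *a, const char *b, size_t width)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, width) == 0;
}
```

Two 8-byte fields holding "abc" followed by different junk after the first '\0' are equal for strncmp() but not for memcmp(), which is why whole-field comparison relies on the zero padding.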
--
James Kuyper

Malcolm McLean

Feb 18, 2012, 5:17:22 AM

On Feb 17, 7:40 pm, James Kuyper <jameskuy...@verizon.net> wrote:
>
> If you know that the end of the string will be determined either by
> the end of a fixed-length field, or by a terminating null character,
> use strncmp(). If you want to check the entire length of the fixed length
> field, regardless of null terminators, memcmp() would do. I don't think
> that there's sufficient need for a function whose behavior falls
> between those two extremes, to make it a standard library function.
>

The main issue is usually that reading chars byte by byte is slow,
reading words is fast.

So if fields are guaranteed to be memory aligned, and a whole number
of words, a comparison will be certainly four times and often many
more times as fast as a byte by byte compare of unaligned strings of
arbitrary length. Aligned fields of whole word size are quite easy to
achieve at a low level, but difficult to specify in ANSI standard C.

Malcolm's website
http://www.malcolmmclean.site11.com/www

Joe keane

Feb 19, 2012, 3:04:57 PM

In article <lnvcn6f...@nuthaus.mib.org>,

Keith Thompson <ks...@mib.org> wrote:
>There's rarely *any* reason to use strncpy(). It's not a "safer"
>version of strcpy(); it's a quite different function.

It is a safer version of 'strcpy'. There is the issue of what to do if
the full copy can't be done, but that's program logic and the library
function can't read people's minds. Even if the program just calls
'abort', that's a huge improvement.

Keith Thompson

Feb 19, 2012, 3:33:01 PM

Did you not read my description of what strncpy actually does, or do you
disagree with it?

strncat is a "safer" version of strcat. It takes an extra argument
"n" that specifies the maximum number of characters to be copied. If
the source is longer than n characters, it appends just n characters.
It properly zero-terminates the destination in all cases.

strncpy *looks* like it should be to strcpy as strncat is to strcat,
but it isn't. If the source string is shorter than n characters,
it will pad the destination with multiple null characters, something
that strcpy never does. If the source string is longer than n
characters, it will leave the destination unterminated (i.e.,
not a string).

If it had been defined something like this:

char *better_strncpy(char *dest, const char *src, size_t n) {
    dest[0] = '\0';
    return strncat(dest, src, n);
}

then it would be reasonable to call it a "safer" version of strcpy.

(It's possible I have an off-by-one error in the above code;
I haven't taken the time to check.)
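
On the possible off-by-one: strncat() appends at most n characters and then a terminator, so the destination needs room for n + 1 bytes. A sketch of how the function above might be called, with the body repeated (plus an assert) so the example is self-contained:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same logic as better_strncpy() in the post above.
   dest must have room for at least n + 1 bytes. */
static char *better_strncpy(char *dest, const char *src, size_t n)
{
    assert(dest != NULL && src != NULL);
    dest[0] = '\0';
    return strncat(dest, src, n);
}
```

For a 4-byte destination the caller would pass sizeof dest - 1, giving "abc" from "abcdef". A short source gets a single terminator and no padding.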

Joe keane

Feb 20, 2012, 4:25:09 PM

In article <lnbooue...@nuthaus.mib.org>,

Keith Thompson <ks...@mib.org> wrote:
>strncpy *looks* like it should be to strcpy as strncat is to strcat,
>but it isn't.

Well there is a number of options.

a) should it make sure there is a zero terminator
b) maybe you want to clear everything that isn't copied
c) does it handle overlapping copies

Do i think it's slightly stupid that they don't match? A little bit.
But one can imagine that good choices were made, and that consistency is
sometimes negative.

Why does 'puts' add a newline, and 'fputs' doesn't?
Why does 'fgets' take a size argument, and 'gets' doesn't?
Why can i use the same format for float/double in 'fprintf', but not in
'fscanf'?
Why is it that multiply long and long gives long, but short and short
gives int?
Why can i use a bitfield in a struct, but not as a local variable?
Why is it that 'int' means signed, but 'char' can be either one?

Keith Thompson

Feb 20, 2012, 5:46:13 PM

j...@panix.com (Joe keane) writes:
> In article <lnbooue...@nuthaus.mib.org>,
> Keith Thompson <ks...@mib.org> wrote:
>>strncpy *looks* like it should be to strcpy as strncat is to strcat,
>>but it isn't.
>
> Well there is a number of options.
>
> a) should it make sure there is a zero terminator
> b) maybe you want to clear everything that isn't copied
> c) does it handle overlapping copies
>
> Do i think it's slightly stupid that they don't match? A little bit.
> But one can imagine that good choices were made, and that consistency is
> sometimes negative.

The point is that strncpy is a very different function from strcpy.
It is not intended to work with a *string* in the target array;
it works with a specialized data structure (used to store file
names in very early Unix systems).

> Why does 'puts' add a newline, and 'fputs' doesn't?
> Why does 'fgets' take a size argument, and 'gets' doesn't?
> Why can i use the same format for float/double in 'fprintf', but not in
> 'fscanf'?
> Why is it that multiply long and long gives long, but short and short
> gives int?
> Why can i use a bitfield in a struct, but not as a local variable?
> Why is it that 'int' means signed, but 'char' can be either one?

There are answers for most of these questions. For others,
it's certainly true that the C standard library is not entirely
consistent. But I don't think any of them are particularly relevant
to strncpy.

I think you're understating the differences between strcpy and
strncpy. The strncpy function is radically different from strcpy,
and there are very few legitimate uses for it. On the other hand,
the deceptive name has led many C programmers to use it incorrectly,
and I strongly suspect that it's used incorrectly far more often
than it's used correctly.

James Kuyper

Feb 20, 2012, 5:51:43 PM

On 02/20/2012 04:25 PM, Joe keane wrote:
...

> Why can i use the same format for float/double in 'fprintf', but not in
> 'fscanf'?

Because float gets promoted to double when passed to fprintf(), while
the concept of "promoted type" doesn't even apply to the pointer
arguments passed to fscanf(). float* and double* are incompatible types,
which means they must be treated differently, and fscanf() needs to know
about that fact.
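
A small sketch of that asymmetry, using sscanf() and snprintf() in place of fscanf() and fprintf() since the conversion rules are the same. The helper names parse_pair() and format_pair() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Input: float* needs "%f", double* needs "%lf"; mixing them up is
   undefined behaviour because the pointer types are incompatible. */
static int parse_pair(const char *text, float *f, double *d)
{
    assert(text != NULL && f != NULL && d != NULL);
    return sscanf(text, "%f %lf", f, d);
}

/* Output: the float argument is promoted to double in the variadic
   call, so "%f" serves for both. */
static int format_pair(char *out, size_t size, float f, double d)
{
    assert(out != NULL);
    return snprintf(out, size, "%.2f %.2f", f, d);
}
```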

> Why is it that multiply long and long gives long, but short and short
> gives int?

Integer types that are smaller than 'int' get promoted to 'int' on the
principle that 'int' should be the natural integer type for a given
platform. Integer types smaller than 'int' should be used only when
needed to save space - types larger than 'int' should be used only when
needed to represent large numbers.
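
The promotion can be made visible with C11's _Generic, assuming a C11 compiler. TYPE_NAME and the two helper functions are hypothetical names used only for this sketch.

```c
#include <assert.h>

/* Name the type of an expression (C11 _Generic). */
#define TYPE_NAME(expr) _Generic((expr), \
    short: "short", int: "int", long: "long", default: "other")

/* Both operands are promoted to int before the multiplication. */
static_assert(sizeof((short)1 * (short)1) == sizeof(int),
              "short * short has type int");

static const char *short_product_type(short a, short b)
{
    return TYPE_NAME(a * b);   /* selects the int association */
}

static const char *long_product_type(long a, long b)
{
    return TYPE_NAME(a * b);   /* selects the long association */
}
```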

...

> Why is it that 'int' means signed, but 'char' can be either one?

Because the normal character type on many of the machines that C was
first ported to was signed, while on many others it was unsigned,
whereas the normal integer type was (almost?) always signed.
--
James Kuyper

Shao Miller

Feb 20, 2012, 6:14:03 PM

My impression is that in:

char foo[] = "foo";
char bar[3] = "bar";
char baz[10] = "baz";

each of these could be roughly equivalent to:

char foo[sizeof "foo"];
char bar[3];
char baz[10];

strncat(foo, "foo", sizeof "foo");
strncat(bar, "bar", 3);
strncat(foo, "baz", 10);

And that in:

char blah[40] = { 0 };

this is roughly equivalent to:

char blah[40];

memset(blah, 0, 40);

And that in:

int blee[2][2] = { { 13, 42 } };

this is roughly equivalent to:

static const int hidden_blee_initializer[2][2] =
{ { 13, 42 }, { 0, 0 } };
int blee[2][2];

memcpy(blee, hidden_blee_initializer, sizeof blee);

Where each of these standard functions might be highly optimized and
where an implementation might actually choose to implement the
initializations using just the same logic.

Rich Webb

Feb 20, 2012, 7:09:05 PM

On Sun, 19 Feb 2012 12:33:01 -0800, Keith Thompson <ks...@mib.org>
wrote:

I'm surprised that the construct

strncpy(dest, source, BUF_LEN)[BUF_LEN - 1] = '\0';

hasn't been mentioned. It seems to be a reasonably compact way of
dealing with uncontrolled input.
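
A sketch of that construct wrapped in a function, assuming BUF_LEN is 8; the wrapper name copy_terminated() is hypothetical. It works because strncpy() returns its first argument, which can then be indexed.

```c
#include <assert.h>
#include <string.h>

enum { BUF_LEN = 8 };  /* assumed size, for illustration only */

/* Copy at most BUF_LEN - 1 characters of source and always terminate. */
static char *copy_terminated(char dest[BUF_LEN], const char *source)
{
    assert(dest != NULL && source != NULL);
    strncpy(dest, source, BUF_LEN)[BUF_LEN - 1] = '\0';
    return dest;
}
```

Over-long input is truncated silently: "supercalifragilistic" ends up as "superca".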

--
Rich Webb Norfolk, VA

Keith Thompson

Feb 20, 2012, 7:46:01 PM

Rich Webb <bbe...@mapson.nozirev.ten> writes:
[...]

> I'm surprised that the construct
>
> strncpy(dest, source, BUF_LEN)[BUF_LEN - 1] = '\0';
>
> hasn't been mentioned. It seems to be a reasonably compact way of
> dealing with uncontrolled input.

Yes, that should work. (I've never seen anyone actually use that
idiom; have you?)

Note that if dest is, say, 1000 bytes, and strlen(source)==3,
then it will write 997 null characters into dest, when 1 would do.
If the source string is from user or file input, that's probably
not going to be significant.

This avoids that problem:

dest[0] = '\0';
strncat(dest, source, BUF_LEN - 1);

Another problem is that it *silently* truncates overly long input.
(That might be just what you want.)

The problems of strncat can be worked around if you're aware of them,
but I'm skeptical that it's worth the effort.

Shao Miller

Feb 20, 2012, 7:46:37 PM

Or maybe:

strncpy(dest, source, BUF_LEN - 1)[BUF_LEN - 1] = '\0';

Rich Webb

Feb 20, 2012, 8:48:49 PM

On Mon, 20 Feb 2012 16:46:01 -0800, Keith Thompson <ks...@mib.org>
wrote:

>Rich Webb <bbe...@mapson.nozirev.ten> writes:
>[...]
>> I'm surprised that the construct
>>
>> strncpy(dest, source, BUF_LEN)[BUF_LEN - 1] = '\0';
>>
>> hasn't been mentioned. It seems to be a reasonably compact way of
>> dealing with uncontrolled input.
>
>Yes, that should work. (I've never seen anyone actually use that
>idiom; have you?)

Ermmm, me, actually. Where I use it constantly is in parsing data inputs
that can have real-world issues, such as a noise burst that over-filled
a field or a separator character between two fields that got dropped. I
deal with the correctness of the buffer subsequently but first I need to
ensure that the buffer is safely filled and terminated so that I can
look at it.

There will typically be separate checks for overall format (e.g.,
framing words, number of fields) and a checksum/CRC/hash but some day
the planets will be in a bad alignment where everything else is okay
and, as the saying goes, in a field where "foo" was expected
"supercalifragilisticexpialidocious" was read.

>Note that if dest is, say, 1000 bytes, and strlen(source)==3,
>then it will write 997 null characters into dest, when 1 would do.
>If the source string is from user or file input, that's probably
>not going to be significant.

Which is fine, although the difference is usually on the order of an
input length of eight into a buffer with room for, say, twelve.

>This avoids that problem:
>
> dest[0] = '\0';
> strncat(dest, source, BUF_LEN - 1);

True. A more complete solution would probably be a strlen() test
followed by truncation with \0 and an application diagnostic if the
input is too long. But I'm often dealing with embedded systems where
it's just a black box that can not fail from a data line hiccup.

>Another problem is that it *silently* truncates overly long input.
>(That might be just what you want.)

Pretty much, yes.

pete

Feb 20, 2012, 11:16:09 PM

Shao Miller wrote:

> My impression is that in:
>
> char foo[] = "foo";
> char bar[3] = "bar";
> char baz[10] = "baz";
>
> each of these could be roughly equivalent to:
>
> char foo[sizeof "foo"];
> char bar[3];
> char baz[10];
>
> strncat(foo, "foo", sizeof "foo");
> strncat(bar, "bar", 3);
> strncat(foo, "baz", 10);

ITYM strncat(baz, "baz", 10);

foo, bar and baz
would each have to be initialized with a null byte
in order to properly use strncat on them like that.

But, even if bar[3] had static duration
and so was initialized with a null byte,
strncat(bar, "bar", 3);
writes 4 characters and overruns the array.

n1570
7.24.3.2 The strncat function
Synopsis
1 #include <string.h>
char *strncat(char * restrict s1,
const char * restrict s2, size_t n);
Description
2 The strncat function appends not more than n characters
(a null character and characters that follow it are not appended)
from the array pointed to by s2 to the end of the string
pointed to by s1.
The initial character of s2 overwrites the null character
at the end of s1.
A terminating null character is always appended to the result.
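
Following the quoted description, the space a strncat() call needs can be worked out as below. This is only a sketch; strncat_space_needed() is a hypothetical helper, not a standard function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes the destination array must hold for strncat(dst, src, n):
   the existing string, at most n appended characters, and one '\0'. */
static size_t strncat_space_needed(const char *dst, const char *src, size_t n)
{
    size_t src_len, appended;

    assert(dst != NULL && src != NULL);
    src_len = strlen(src);
    appended = src_len < n ? src_len : n;
    return strlen(dst) + appended + 1;
}
```

For an empty destination, "bar" and n == 3 this gives 4, one more than bar[3] can hold.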

--
pete

Joe keane

Feb 21, 2012, 4:46:30 PM

In article <lnbooue...@nuthaus.mib.org>,

Keith Thompson <ks...@mib.org> wrote:
>If it had been defined something like this:
>
> char *better_strncpy(char *dest, const char *src, size_t n) {
> dest[0] = '\0';
> return strncat(dest, src, n);
> }
>
>then it would be reasonable to call it a "safer" version of strcpy.

Well there you go. You pretty much solved your complaint. If you are
catenating several strings, it's almost more natural to zero out at the
beginning, then use 'strcat' from there.

You can understand why you might want to zero out the whole array at the
start; from there 'strcat' is fine.

James Kuyper

Feb 21, 2012, 5:15:15 PM

On 02/21/2012 04:46 PM, Joe keane wrote:
> In article <lnbooue...@nuthaus.mib.org>,
> Keith Thompson <ks...@mib.org> wrote:
>> If it had been defined something like this:
>>
>> char *better_strncpy(char *dest, const char *src, size_t n) {
>> dest[0] = '\0';
>> return strncat(dest, src, n);
>> }
>>
>> then it would be reasonable to call it a "safer" version of strcpy.
>
> Well there you go. You pretty much solved your complaint.

I don't see how the ability to define such a function changes any of the
issues he complained about - there's still a function named strncpy() in
the standard library, and its name still creates well-justified but false
expectations in newbies about what it might do. Careful reading of the
standard or good documentation will correct those mistaken expectations
- but that shouldn't have been necessary.

I'm not saying that newbies shouldn't need to read the documentation, or
that the standard should be written so that reading it is unnecessary.
I'm merely saying that good naming conventions can make it easier to
guess what the functions do, and easier to remember what they do once
you have learned what it is. Either the behavior should have matched
those expectations, or the name should have been changed to make them
not well-justified.

Keith Thompson

Feb 21, 2012, 5:29:17 PM

j...@panix.com (Joe keane) writes:
> In article <lnbooue...@nuthaus.mib.org>,
> Keith Thompson <ks...@mib.org> wrote:
>>If it had been defined something like this:
>>
>> char *better_strncpy(char *dest, const char *src, size_t n) {
>> dest[0] = '\0';
>> return strncat(dest, src, n);
>> }
>>
>>then it would be reasonable to call it a "safer" version of strcpy.
>
> Well there you go. You pretty much solved your complaint. If you are
> catenating several strings, it's almost more natural to zero out at the
> beginning, then use 'strcat' from there.

Not really. My complaint is that strncpy() is in the standard library
with the name "strncpy", and that too many programmers use it
incorrectly.

> You can understand why you might want to zero out the whole array at the
> start; from there 'strcat' is fine.

Note that multiple strcat() calls can be quite inefficient, since
each one has to scan the destination to find the terminating '\0'
before it appends the new data.
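
One way around the repeated scanning, sketched under the assumption that the caller keeps track of how much of the buffer is used. append_at() is a hypothetical helper, not a library function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Append src at offset used in dst (capacity size) without rescanning
   dst. Returns the new used length, or (size_t)-1 if src plus its
   terminator does not fit (dst is left unchanged in that case). */
static size_t append_at(char *dst, size_t size, size_t used, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL && used < size);
    len = strlen(src);
    if (len >= size - used)
        return (size_t)-1;
    memcpy(dst + used, src, len + 1);  /* includes the '\0' */
    return used + len;
}
```

Each call costs time proportional to the appended piece only, whereas strcat() also walks the whole existing string every time.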

Malcolm McLean

Feb 22, 2012, 4:15:42 AM

On Feb 21, 10:29 pm, Keith Thompson <ks...@mib.org> wrote:
>
> Not really. My complaint is that strncpy() is in the standard library
> with the name "strncpy", and that too many programmers use it
> incorrectly.
>

What's worse is that often the wrong use won't be detected.

strncpy() appears to be a safe strcpy() if the buffer length is never
exceeded. Since normally the buffer will be larger than any string you
expect, this often won't be tested. Who's going to pass a string of
more than FILE_MAX to a program?

Then even if it is tested, there's a reasonable chance that the
character immediately following the buffer is a byte of value zero. So
it might well appear to a casual tester that the function has worked as
expected - he might not notice the extra character.
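
A test can catch this without relying on whatever happens to follow the buffer. As a sketch, a hypothetical helper is_terminated() checks for a '\0' strictly inside the buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns nonzero if buf[0..size) contains a '\0'. Unlike strlen(),
   memchr() never reads past the end of the buffer. */
static int is_terminated(const char *buf, size_t size)
{
    assert(buf != NULL);
    return memchr(buf, '\0', size) != NULL;
}
```

After strncpy(b, "abcd", 4) into a 4-byte array this returns 0, even if the byte after the array happens to be zero.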


Joe keane

Feb 22, 2012, 7:02:50 PM

In article <lnhayjb...@nuthaus.mib.org>,

Keith Thompson <ks...@mib.org> wrote:
>My complaint is that strncpy() is in the standard library
>with the name "strncpy", and that too many programmers use it
>incorrectly.

I think i agree on naming. For example, 'strncpyz' zeros out the
buffer, 'strncpyu' can leave it without a terminator (the default is
less surprising), 'strncpyzu' does both. The zero seems harmless, at
worst it runs slower, at best it runs faster.

But 'strcpy' doesn't give us much guidance here.

It *can't* zero out the buffer, because it doesn't know the buffer size.
It *can't* decide to leave out the terminator, because it doesn't know
the buffer size. It can't avoid trashing your memory, for the same reason.

Keith Thompson

Feb 22, 2012, 8:02:44 PM

j...@panix.com (Joe keane) writes:
> In article <lnhayjb...@nuthaus.mib.org>,
> Keith Thompson <ks...@mib.org> wrote:
>>My complaint is that strncpy() is in the standard library
>>with the name "strncpy", and that too many programmers use it
>>incorrectly.
>
> I think i agree on naming. For example, 'strncpyz' zeros out the
> buffer, 'strncpyu' can leave it without a terminator (the default is
> less surprising), 'strncpyzu' does both. The zero seems harmless, at
> worst it runs slower, at best it runs faster.
>
> But 'strcpy' doesn't give us much guidance here.
>
> It *can't* zero out the buffer, because it doesn't know the buffer size.

It doesn't bother to zero out the buffer, because that's rarely a useful
thing to do. If you want to zero a buffer, use memset().

> It *can't* decide to leave out the terminator, because it doesn't know
> the buffer size.

It doesn't leave out the terminator because it *needs* to store the
terminator in order for the destination to be a valid string.

> It can't avoid trashing your memory, for the same reason.

It's up to the caller to avoid trashing memory by ensuring that the
destination is big enough to hold the data to be copied into it.
(Admittedly strcpy() doesn't make this easy.)
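
A sketch of what that caller-side check might look like; copy_if_fits() is a hypothetical name. It refuses the copy rather than truncating, leaving the policy decision (error, abort, truncate) to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst (capacity size) only if it fits with its
   terminator. Returns 0 on success, -1 if dst was left untouched. */
static int copy_if_fits(char *dst, size_t size, const char *src)
{
    assert(dst != NULL && src != NULL);
    if (strlen(src) >= size)
        return -1;
    strcpy(dst, src);  /* safe: the length was checked above */
    return 0;
}
```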

Jorgen Grahn

Feb 23, 2012, 6:13:35 PM

On Mon, 2012-02-20, Keith Thompson wrote:
> j...@panix.com (Joe keane) writes:
>> In article <lnbooue...@nuthaus.mib.org>,
>> Keith Thompson <ks...@mib.org> wrote:
>>>strncpy *looks* like it should be to strcpy as strncat is to strcat,
>>>but it isn't.
>>
>> Well there is a number of options.
>>
>> a) should it make sure there is a zero terminator
>> b) maybe you want to clear everything that isn't copied
>> c) does it handle overlapping copies
>>
>> Do i think it's slightly stupid that they don't match? A little bit.
>> But one can imagine that good choices were made, and that consistency is
>> sometimes negative.
>
> The point is that strncpy is a very different function from strcpy.
> It is not intended to work with a *string* in the target array;
> it works with a specialized data structure (used to store file
> names in very early Unix systems).

Malcolm McLean wrote something similar upthread. Do you have any
references for this?

It would explain the function's weird semantics, but I haven't seen
anything before which says this is its background. (There's also
wcsncpy() for wchar_t -- that one is certainly newer, and useless in
the data structures you mention.)

...

> I think you're understating the differences between strcpy and
> strncpy. The strncpy function is radically different from strcpy,
> and there are very few legitimate uses for it. On the other hand,
> the deceptive name has led many C programmers to use it incorrectly,
> and I strongly suspect that it's used incorrectly far more often
> than it's used correctly.

The static analysis tool we use at work screams bloody murder every
time I use strcpy() and tells me to use strncpy() instead. Argh ...

/Jorgen

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

Keith Thompson

Feb 23, 2012, 7:20:24 PM

Jorgen Grahn <grahn...@snipabacken.se> writes:
> On Mon, 2012-02-20, Keith Thompson wrote:

[...]

>> The point is that strncpy is a very different function from strcpy.
>> It is not intended to work with a *string* in the target array;
>> it works with a specialized data structure (used to store file
>> names in very early Unix systems).
>
> Malcolm McLean wrote something similar upthread. Do you have any
> references for this?
>
> It would explain the function's weird semantics, but I haven't seen
> anything before which says this is its background. (There's also
> wcsncpy() for wchar_t -- that one is certainly newer, and useless in
> the data structures you mention.)

I don't have any definitive references, but there's at
least been plenty of speculation to that effect, for example
<http://stackoverflow.com/a/1454071/827263>.

Shao Miller

Feb 25, 2012, 1:51:40 AM

On 2/20/2012 23:16, pete wrote:
> Shao Miller wrote:
>
>> My impression is that in:
>>
>> char foo[] = "foo";
>> char bar[3] = "bar";
>> char baz[10] = "baz";
>>
>> each of these could be roughly equivalent to:
>>
>> char foo[sizeof "foo"];
>> char bar[3];
>> char baz[10];
>>
>> strncat(foo, "foo", sizeof "foo");
>> strncat(bar, "bar", 3);
>> strncat(foo, "baz", 10);
>
> ITYM strncat(baz, "baz", 10);
>

Right. Except...

> foo, bar and baz
> would each have to be initialized with a null byte
> in order to properly use strncat on them like that.
>
> But, even if bar[3] had static duration
> and so was initialized with a null byte,
> strncat(bar, "bar", 3);
> writes 4 characters and overruns the array.
>
>
> n1570
> 7.24.3.2 The strncat function
> Synopsis
> 1 #include<string.h>
> char *strncat(char * restrict s1,
> const char * restrict s2, size_t n);
> Description
> 2 The strncat function appends not more than n characters
> (a null character and characters that follow it are not appended)
> from the array pointed to by s2 to the end of the string
> pointed to by s1.
> The initial character of s2 overwrites the null character
> at the end of s1.
> A terminating null character is always appended to the result.
>

Somehow, I completely was confusing 'strncpy' with 'strncat', here. :(
Please substitute 'strncpy' in place of 'strncat' in my post. :( Of
course, that makes it irrelevant to Keith's immediately-preceding post.
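
With 'strncpy' substituted, the claimed equivalence can be checked directly. The following is only a sketch: the function name and the second set of arrays (foo2, bar2, baz2) are made up for illustration, and baz2 is pre-filled with junk so that the zero padding done by strncpy is actually exercised.

```c
#include <string.h>

/* Hypothetical check: returns 1 if each initialized array is
 * byte-for-byte identical to the array filled by strncpy. */
static int initializers_match_strncpy(void)
{
    char foo[] = "foo";      /* 4 bytes, including the '\0' */
    char bar[3] = "bar";     /* exactly 3 bytes, no '\0': legal C, not a string */
    char baz[10] = "baz";    /* 3 characters followed by 7 zero bytes */

    char foo2[sizeof "foo"];
    char bar2[3];
    char baz2[10];

    memset(baz2, 'x', sizeof baz2);        /* junk that strncpy must overwrite */

    strncpy(foo2, "foo", sizeof foo2);     /* copies 'f','o','o','\0' */
    strncpy(bar2, "bar", sizeof bar2);     /* copies 3 bytes, no terminator */
    strncpy(baz2, "baz", sizeof baz2);     /* copies 3 bytes, pads with 7 zeros */

    return memcmp(foo, foo2, sizeof foo) == 0
        && memcmp(bar, bar2, sizeof bar) == 0
        && memcmp(baz, baz2, sizeof baz) == 0;
}
```

Note that bar and bar2 are not strings in either case, so neither may be passed to strlen or printed with %s.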

Thank you for the correction, pete!

Philip Lantz

Feb 25, 2012, 3:26:56 AM

Jorgen Grahn wrote:
> On Mon, 2012-02-20, Keith Thompson wrote:
> > The point is that strncpy is a very different function from strcpy.
> > It is not intended to work with a *string* in the target array;
> > it works with a specialized data structure (used to store file
> > names in very early Unix systems).
>
> Malcolm McLean wrote something similar upthread. Do you have any
> references for this?
>
> It would explain the function's weird semantics, but I haven't seen
> anything before which says this is its background. (There's also
> wcsncpy() for wchar_t -- that one is certainly newer, and useless in
> the data structures you mention.)

I've never seen a reference for that either, but I guessed that this was
its purpose when I first learned the semantics of the function, oh, about
30 years ago, and I've assumed that ever since. Fixed-length Unix file
names are the only place I know of where a string buffer had to be
padded out to its full length with nulls.

> > I think you're understating the differences between strcpy and
> > strncpy. The strncpy function is radically different from strcpy,
> > and there are very few legitimate uses for it. On the other hand,
> > the deceptive name has led many C programmers to use it incorrectly,
> > and I strongly suspect that it's used incorrectly far more often
> > than it's used correctly.
>
> The static analysis tool we use at work screams bloody murder every
> time I use strcpy() and tells me to use strncpy() instead. Argh ...

Aargh! Well, presumably a non-broken fix will shut it up just as well.
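
One possible shape for such a fix, as a sketch only: the name copy_string and its return convention are assumptions made for illustration, and whether any particular analysis tool accepts it is not something this thread establishes. Unlike strncpy, it always leaves a string in the destination and reports truncation to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, which has room for cap bytes.
 * The result is always '\0'-terminated when cap > 0.
 * Returns 0 if the whole string fit, -1 if it was truncated or cap == 0. */
static int copy_string(char *dst, size_t cap, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    if (cap == 0)
        return -1;

    len = strlen(src);
    if (len >= cap) {
        memcpy(dst, src, cap - 1);   /* keep as much as fits */
        dst[cap - 1] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);       /* includes the terminator */
    return 0;
}
```

A caller would pass sizeof buf as cap and treat -1 as an error rather than silently continuing with a truncated name.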

pete

Feb 25, 2012, 10:34:34 AM

Shao Miller wrote:
>
> On 2/20/2012 23:16, pete wrote:
> > Shao Miller wrote:
> >
> >> My impression is that in:
> >>
> >> char foo[] = "foo";
> >> char bar[3] = "bar";
> >> char baz[10] = "baz";
> >>
> >> each of these could be roughly equivalent to:
> >>
> >> char foo[sizeof "foo"];
> >> char bar[3];
> >> char baz[10];
> >>
> >> strncat(foo, "foo", sizeof "foo");
> >> strncat(bar, "bar", 3);

> Somehow, I completely was confusing 'strncpy' with 'strncat', here. :(
> Please substitute 'strncpy' in place of 'strncat' in my post. :( Of
> course, that makes it irrelevant
> to Keith's immediately-preceding post.
>
> Thank you for the correction, pete!

You're welcome.

What I don't like about the idea of using strncpy instead of strcpy
is that learning to use strncpy instead of strcpy
seems to me to be more complicated
than learning to use strcpy properly.

The n parameter of strncpy makes it harder to forget
that the size of something must be taken into account.

But anyone who would have trouble remembering
how to use strcpy correctly
might also have trouble remembering
that if n is less than (1 + source string length),
the result of the strncpy write will not be a string.
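
A small sketch of that rule, with an 8-byte destination chosen arbitrarily; both function names are hypothetical. The destination is pre-filled with 'x' so that a missing terminator cannot be hidden by leftover zero bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 1 if the first n bytes of buf contain a '\0',
 * i.e. buf holds a string that fits in n bytes; 0 otherwise. */
static int is_terminated(const char *buf, size_t n)
{
    assert(buf != NULL);
    return memchr(buf, '\0', n) != NULL;
}

/* Hypothetical demo: strncpy src into an 8-byte array and report
 * whether the array ends up holding a string. */
static int strncpy_yields_string(const char *src)
{
    char dst[8];

    assert(src != NULL);
    memset(dst, 'x', sizeof dst);      /* stale contents, no zero bytes */
    strncpy(dst, src, sizeof dst);
    return is_terminated(dst, sizeof dst);
}
```

Here strncpy_yields_string("1234567") is 1, because 8 >= 1 + 7, while strncpy_yields_string("12345678") is 0, because n is less than 1 + 8.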

I've always been comfortable using strcpy.

--
pete

Alan Curry

Feb 25, 2012, 8:34:58 PM

In article <slrnjkdi0u.1...@frailea.sa.invalid>,

Jorgen Grahn <grahn...@snipabacken.se> wrote:
>On Mon, 2012-02-20, Keith Thompson wrote:
>>
>> The point is that strncpy is a very different function from strcpy.
>> It is not intended to work with a *string* in the target array;
>> it works with a specialized data structure (used to store file
>> names in very early Unix systems).
>
>Malcolm McLean wrote something similar upthread. Do you have any
>references for this?
>

Ancient Unix source code is available. And it's not too big to grep. Let's go
mythbusting.

First observation: strncpy is present only in V7, not any earlier versions.

The item most commonly accused of being the original reason for strncpy is
the directory entry, which is defined like this:

#ifndef DIRSIZ
#define DIRSIZ 14
#endif
struct direct
{
    ino_t d_ino;
    char d_name[DIRSIZ];
};

That definition appears in 2 different header files, one which is used in the
kernel and one which is used in userspace. The headers are identical. In
fact, the above 8 lines are the entire contents.

This struct represents the format of a directory entry. There is no
distinction between the external format on permanent storage and the internal
format used by the system. There was no need for such a distinction because
there was only one supported filesystem type.

If the strncpy hypothesis is true, the original use of strncpy should have
been to copy from a \0-terminated string into a d_name field. There are 10
uses of strncpy in V7. (11 if you also include lpr from the "Addenda" tape.
But let's not, because it came later.)

Results of grepping for strncpy, after removing instances that were not
actually calls to strncpy:

usr/src/cmd/atrun.c: strncpy(file, dirent.d_name, DIRSIZ);
usr/src/cmd/crypt.c: strncpy(buf, pw, 8);
usr/src/cmd/ed.c: strncpy(buf, keyp, 8);
usr/src/cmd/expr.y: strncpy(Mstring[0], p, num);
usr/src/cmd/login.c:#define SCPYN(a, b) strncpy(a, b, sizeof(a))
usr/src/cmd/login.c: SCPYN(utmp.ut_name, "");
usr/src/cmd/login.c: SCPYN(utmp.ut_name, argv[1]);
usr/src/cmd/login.c: SCPYN(utmp.ut_line, index(ttyn+1, '/')+1);
usr/src/cmd/mkdir.c: strncpy(pname, d, slash);
usr/src/cmd/ranlib.c: strncpy(firstname, arp.ar_name, 14);
usr/src/cmd/xsend/lib.c: strncpy(buf, s, 10);

Some of the calls were through the SCPYN macro so I also included that as a
grep target.

First notice that all the matches are in usr/src/cmd, not in usr/sys where
the kernel source is. I expected the primeval strncpy to be in the kernel,
perhaps in the creat call where a \0-terminated string from userspace must be
used to populate a new directory entry. Nope.

Well, at least the first match (atrun.c) is working on a d_name. Yay! A
confirmation! Wait a minute, what's the argument order for strncpy again?
Destination first. Crap! A non-confirmation. With additional context, this
looks like exactly the kind of sloppy usage of strncpy that we now try to
avoid. The destination, "file", is declared like this:

char file[DIRSIZ+1];

And after the strncpy we find the usual fixup:

strncpy(file, dirent.d_name, DIRSIZ);
file[DIRSIZ] = '\0';

And then "file" is used as a \0-terminated string. It didn't need the extra
padding. strncpy is doing something useful here though, protecting against
an unterminated source buffer. That's probably as close as we're going to get
to confirming the strncpy hypothesis, since none of the rest of the uses
involve d_name.
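
For readers who want to try the atrun.c pattern outside V7, here is a minimal sketch. The struct entry type, NAMESZ, and entry_name_to_string are stand-ins invented for illustration (they mirror struct direct and DIRSIZ but are not the V7 declarations).

```c
#include <assert.h>
#include <string.h>

#define NAMESZ 14   /* assumed width, mirroring DIRSIZ */

/* Stand-in for a directory entry: name is a fixed-width field that
 * carries NO terminator when the name is exactly NAMESZ bytes long. */
struct entry {
    unsigned short ino;
    char name[NAMESZ];
};

/* Turn the fixed-width field into an ordinary string.
 * out must have room for NAMESZ + 1 bytes. */
static void entry_name_to_string(char out[NAMESZ + 1], const struct entry *e)
{
    assert(out != NULL && e != NULL);
    strncpy(out, e->name, NAMESZ);   /* stops at '\0' or after NAMESZ bytes */
    out[NAMESZ] = '\0';              /* the fixup: covers the full-width case */
}
```

The point of strncpy here is the bounded read from the source; the padding it writes into out is harmless but not needed.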

In crypt.c we have

char buf[13];

and

strncpy(buf, pw, 8);

In this case, the source string is a user-supplied password, which is
\0-terminated, so strncpy is not protecting against an unterminated source.
It is, however, truncating the source if it is longer than 8 characters. And
if the password is less than 8 bytes long, the padding will actually be
relevant. The buffer is required to contain an 8-byte "key". This use of
strncpy actually needs all of its features.
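
A sketch of that key-setup behaviour in isolation. KEYLEN and make_key are hypothetical names, and the real crypt.c does more with the buffer afterwards; this only shows the truncate-and-pad step.

```c
#include <assert.h>
#include <string.h>

#define KEYLEN 8   /* assumed key width, matching the 8 in the call above */

/* Fill a fixed-width key from a '\0'-terminated password.
 * A long password is truncated to KEYLEN bytes; a short one is padded
 * with zero bytes. The key is a byte array, not a string. */
static void make_key(char key[KEYLEN], const char *pw)
{
    assert(key != NULL && pw != NULL);
    strncpy(key, pw, KEYLEN);
}
```

Because the padding is deterministic, the same short password always yields the same 8 bytes regardless of what the buffer held before.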

In ed.c there is basically a copy of the same function from crypt.c, for
editing encrypted files.

In expr.y we have what looks like another sloppy strncpy. It has the fixup:

strncpy(Mstring[0], p, num);
Mstring[0][num] = '\0';

But the length "num" is not related to the size of the destination buffer.
It's the length of the portion of the source buffer that matched the \(...\)
subexpression of a regexp, which will be saved into Mstring[0]. Mstring is
declared like this:

char Mstring[1][128];

so there's a potential buffer overflow here if you use \(...\) to match a
string longer than 128 bytes. Maybe it's impossible to pass a string that
long to the program; I don't know. Moving on...
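
For comparison, a bounded version of the same substring copy might look like the sketch below. copy_substring is a hypothetical helper, not anything from expr.y, and it assumes the caller guarantees that src has at least len readable bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy len bytes starting at src into dst (capacity cap) and terminate.
 * Returns 0 on success, -1 if len bytes plus the '\0' would not fit;
 * dst is left untouched in that case. */
static int copy_substring(char *dst, size_t cap, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    if (len >= cap)
        return -1;
    memcpy(dst, src, len);   /* the length is known, so no strncpy needed */
    dst[len] = '\0';
    return 0;
}
```

The difference from the expr.y code is only the check of len against the destination size before anything is written.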

login.c looks good. The SCPYN macro ties the strncpy length limit to the
destination buffer size, and the destinations are ut_name and ut_line, both
of which are fixed-size buffers that need to be padded. This might actually
be a better case than the atrun.c usage. It fits everything we expected to
find except that it's utmp, not a directory.
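
A sketch of the login.c idiom with a made-up record type. struct rec and set_name are hypothetical, the field widths are chosen for illustration, and the macro is the one quoted above with extra parentheses added.

```c
#include <assert.h>
#include <string.h>

/* Ties the length limit to the destination size.
 * a must be an array, not a pointer, or sizeof gives the wrong answer. */
#define SCPYN(a, b) strncpy((a), (b), sizeof(a))

/* Hypothetical fixed-width record, loosely modelled on utmp. */
struct rec {
    char line[8];
    char name[8];
};

/* Store a user name in the fixed-width field: truncated if too long,
 * zero-padded if short, never running into the neighbouring field. */
static void set_name(struct rec *r, const char *user)
{
    assert(r != NULL && user != NULL);
    SCPYN(r->name, user);
}
```

Since the whole field is rewritten on every call, a record written to disk never carries leftover bytes from a previous, longer name.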

mkdir.c is next, and it's an exciting candidate, isn't it? Especially if you
remember that mkdir wasn't a syscall yet, so the userspace mkdir program was
actually setuid root and worked at a low level. Not quite low enough to
operate on a struct direct though. What's happening here

strncpy(pname, d, slash);

is, like the expr.y usage, copying a substring of the source string, which is
taken directly from main's argv, so it's \0-terminated. strncpy is not
protecting against an unterminated source, and it's not truncating a long
source, so its only possibly useful feature is padding. Nope. After the
strncpy, pname is treated as a \0-terminated string. The fixup is hidden in a
strcpy this time, but it's still there:

if(slash)
    strncpy(pname, d, slash);
strcpy(pname+slash, ".");

Again there's a potential buffer overflow if the source string is longer than
128 bytes.

ranlib.c looks strange. I don't know exactly what it's doing, since I don't
know anything about the ar format. But the strncpy destination is

char firstname[17];

and the strncpy is

strncpy(firstname, arp.ar_name, 14);

which seems a bit weird. Later, firstname is used as a \0-terminated string. So if
the source string arp.ar_name was shorter than 14 bytes, the first padding
byte added by strncpy will be the terminator and the rest will be
unnecessary. If the source string was 14 bytes, the terminator will be the \0
found at firstname[14]... not courtesy of strncpy or any post-strncpy fixup,
but just because it's in the bss and it never gets modified.

xsend/lib.c looks like another copy of the encryption key setup code found in
crypt.c and ed.c using a slightly different key generation method that uses
up to 10 characters of the user-supplied password.

I've looked in the V6 source for the code corresponding to each strncpy call
in V7. Most of them (at, expr, ranlib, xsend, and the encryption ability of
ed) don't exist in V6. mkdir was pure assembly in V6. The crypt program looks
like a total rewrite. login is the only one that had a direct equivalent.

struct utmp in V6 was different, using only a single char to identify the
tty, but ut_name was there (called simply "name") and it appears plausible
that strncpy and SCPYN were added specifically to simplify the existing code,
which copied and padded the array with manual loops.

And one last thing... if strncpy wasn't used to populate the d_name field,
how was it done? Well, the kernel creates directory entries in response to
user requests like creat and mknod and link. All of those eventually end up
calling wdir() in usr/sys/sys/iget.c which does this:

bcopy((caddr_t)u.u_dbuf, (caddr_t)u.u_dent.d_name, DIRSIZ);

So a d_name is created as an exact copy of a u_dbuf, which is declared like
this:

char u_dbuf[DIRSIZ]; /* current pathname component */

and must already be properly padded. That was done before the wdir() call by
namei() in usr/sys/sys/nami.c and here's the answer:

(at this point, c is either the first character of a pathname or a non-slash
character that was found after a slash)

cp = &u.u_dbuf[0];
while (c != '/' && c != '\0' && u.u_error == 0 ) {
    if (mpxip!=NULL && c=='!')
        break;
    if(cp < &u.u_dbuf[DIRSIZ])
        *cp++ = c;
    c = (*func)();
}
while(cp < &u.u_dbuf[DIRSIZ])
    *cp++ = '\0';

Characters are read one at a time from the source string by calling a
callback function (*func). It does that because the source string may or may
not be in userspace, and userspace strings can't be directly addressed.
Afterward, the padding is done with a loop.
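
The same copy-then-pad structure can be tried in userspace with the sketch below. Everything here is a stand-in: NAMESZ replaces DIRSIZ, read_component and next_from_string are invented names, the callback takes a context pointer instead of using kernel globals, and the u.u_error and mpxip checks are left out.

```c
#include <assert.h>
#include <string.h>

#define NAMESZ 14   /* assumed width, mirroring DIRSIZ */

/* Character source: returns the next character, '\0' at the end. */
typedef int (*next_char_fn)(void *ctx);

/* Read one pathname component into a fixed-width, zero-padded field.
 * Characters beyond NAMESZ are consumed but dropped. */
static void read_component(char out[NAMESZ], next_char_fn next, void *ctx)
{
    char *cp = out;
    int c;

    assert(out != NULL && next != NULL);
    c = next(ctx);
    while (c != '/' && c != '\0') {
        if (cp < out + NAMESZ)
            *cp++ = (char)c;
        c = next(ctx);
    }
    while (cp < out + NAMESZ)    /* pad the remainder by hand */
        *cp++ = '\0';
}

/* Hypothetical source that walks an ordinary in-memory string.
 * ctx points at a const char * cursor, advanced on each call. */
static int next_from_string(void *ctx)
{
    const char **p = ctx;
    int c = (unsigned char)**p;

    if (c != '\0')
        ++*p;
    return c;
}
```

The explicit padding loop does the job that strncpy's zero fill would do if the source were directly addressable.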

--
Alan Curry

Jorgen Grahn

Feb 28, 2012, 5:46:27 PM

On Sun, 2012-02-26, Alan Curry wrote:
> In article <slrnjkdi0u.1...@frailea.sa.invalid>,
> Jorgen Grahn <grahn...@snipabacken.se> wrote:
>>On Mon, 2012-02-20, Keith Thompson wrote:
>>>
>>> The point is that strncpy is a very different function from strcpy.
>>> It is not intended to work with a *string* in the target array;
>>> it works with a specialized data structure (used to store file
>>> names in very early Unix systems).
>>
>>Malcolm McLean wrote something similar upthread. Do you have any
>>references for this?
>>
>
> Ancient Unix source code is available. And it's not too big to grep. Let's go
> mythbusting.
>
> First observation: strncpy is present only in V7, not any earlier versions.

[huge snip]

Interesting, and at the same time I'm ashamed to admit that I haven't
read it in detail yet.

That it came from V7 was news to me. In a way, that's almost enough
information. V7 means the late 1970s; programmers were fewer and on
average smarter, and the conventions weren't the same as today.

Joe keane

Feb 29, 2012, 7:03:36 PM

In article <slrnjkqma0.1...@frailea.sa.invalid>,

Jorgen Grahn <grahn...@snipabacken.se> wrote:
>That it came from V7 was news to me. In a way, that's almost enough
>information. V7 means the late 1970s; programmers were fewer and on
>average smarter, and the conventions weren't the same as today.

Fixed-width members [or variable with fixed maximum] are old as dirt.
Think punched cards and 'tabulating machines'.

Used as a key in a B-tree or hash table, you really do want to clear out
the 'junk' (no need to go looking for the terminator).
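
A sketch of why the padding matters for keys. The struct key type, KEYSZ, and the three functions are hypothetical, and the hash is just a simple multiply-and-add over all bytes, chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define KEYSZ 8   /* assumed fixed key width */

struct key {
    char bytes[KEYSZ];
};

/* Build a key from a string: truncated to KEYSZ, zero-padded if shorter,
 * so every byte of the key has a defined value. */
static struct key key_from_string(const char *s)
{
    struct key k;

    assert(s != NULL);
    strncpy(k.bytes, s, KEYSZ);
    return k;
}

/* Whole-field comparison: no search for a terminator. */
static int key_equal(const struct key *a, const struct key *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a->bytes, b->bytes, KEYSZ) == 0;
}

/* Hash over all KEYSZ bytes; only meaningful because the tail is zeroed. */
static unsigned key_hash(const struct key *k)
{
    unsigned h = 5381u;
    size_t i;

    assert(k != NULL);
    for (i = 0; i < KEYSZ; i++)
        h = h * 33u + (unsigned char)k->bytes[i];
    return h;
}
```

If the tail were left as junk, two keys built from the same short string could compare unequal and land in different buckets.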

Jorgen Grahn

Mar 1, 2012, 7:03:15 AM

I wasn't referring to that ... I was more thinking of things like
function naming conventions, i.e. did people in 1979 see a function
named strsomething() and immediately think of normal nul-terminated C
strings?