Pointer Question II - The Rebirth

DSF

Feb 1, 2010, 7:07:45 PM

Hello all,

I want to thank all of those who responded to my previous "Pointer
question" post. The reason I am starting a new thread is that I had a
heck of a time finding my post in the first place. It was attached to
a post titled "Pointer question" from July 28 of 2004! I have to
remember from now on to try to be less generic in the naming of my
posts. :o)

From the replies I received, I gather there is no guaranteed way to
compare pointers against each other to check if they are in range.

And even though it causes no problems on my system, and the odds are
pretty small it will ever be used on a system where it would be
problematic, I would like to get it right for the sake of getting it
right. After all, that is why I asked here.

For the record, I have written programs on systems where there was
no such thing as an invalid pointer. Or maybe I should say an illegal
pointer. A pointer not pointing to where it should could be
considered invalid, but not necessarily illegal.

As for the name, I do seem to remember reading that str* is
reserved. It makes sense to follow that convention. I ran afoul of
it all by myself. I spent four or five minutes trying to find the
documentation for strpad, only to discover I wrote it.

Anyway, here is the updated code. Look better?

int StringDelete(char *str, size_t pos, size_t n)
{
    char *p1;
    char *p2;

    size_t l = strlen(str);

    if(pos < l)
    {
        if(pos + n > l)
            n = l - pos;

        p1 = str + pos;
        p2 = p1 + n;

        while(*p2)
            *p1++ = *p2++;
        *p1 = *p2;
        return 0;
    }
    return 1;
}

DSF

Peter Nilsson

Feb 1, 2010, 8:02:31 PM

DSF <notava...@address.here> wrote:
> ... I gather there is no guaranteed way to compare pointers

> against each other to check if they are in range.

There is, it just isn't very efficient...

/* return 1 if p is in [u..v), otherwise return 0 */
int in_range(const void *p, const void *u, const void *v)
{
    const char *cp = p;
    const char *cu = u;
    const char *cv = v;

    for (; cu < cv; cu++)
        if (cp == cu)
            return 1;

    return 0;
}
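
A short usage sketch of the function above. The pointer p is only ever tested with ==, which is well defined for any two valid pointers, whereas < is only defined for pointers into the same object; that is why the loop walks [u..v) instead of testing u <= p && p < v directly. The names buf, other, and in_range_demo are made up for illustration.

```c
/* return 1 if p is in [u..v), otherwise return 0 */
static int in_range(const void *p, const void *u, const void *v)
{
    const char *cp = p;
    const char *cu = u;
    const char *cv = v;

    /* cu and cv bound the same object, so cu < cv is well defined;
       p is only compared with ==, which is safe for any valid pointer */
    for (; cu < cv; cu++)
        if (cp == cu)
            return 1;

    return 0;
}

/* hypothetical demo: returns 1 if all three checks behave as expected */
static int in_range_demo(void)
{
    char buf[16];
    char other[16];

    return in_range(buf + 3, buf, buf + sizeof buf)            /* inside */
        && !in_range(buf + sizeof buf, buf, buf + sizeof buf)  /* one past the end is excluded */
        && !in_range(other, buf, buf + sizeof buf);            /* a different object */
}
```

The cost is linear in the size of the range, which is the inefficiency mentioned above.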

<snip>

> int StringDelete(char *str, size_t pos, size_t n)

Is that 0 if it succeeds?

Also, somewhat technically, C90 need not be case insensitive
with regards to identifiers with external linkage. So your
function name may tread on implementation namespace. Any
identifier beginning with str followed by a lowercase letter
is reserved.

> {
> char *p1;
> char *p2;
>
> size_t l = strlen(str);
>
> if(pos < l)
> {
> if(pos + n > l)
> n = l - pos;
>
> p1 = str + pos;
> p2 = p1 + n;

Consider pos = 5, n = (size_t) -3.

>
> while(*p2)
> *p1++ = *p2++;
> *p1 = *p2;
> return 0;
> }
> return 1;
> }
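
To see what the counterexample pos = 5, n = (size_t) -3 does, here is a small sketch of the arithmetic only. Unsigned addition wraps around modulo SIZE_MAX + 1, so pos + n comes out as 2, the test pos + n > l never fires, and n stays enormous. The function name clamp_naive is hypothetical; it just mirrors the clamp in the quoted code.

```c
#include <stddef.h>

/* mirrors the clamp in the quoted StringDelete: returns the n that
   would be used for the copy */
static size_t clamp_naive(size_t len, size_t pos, size_t n)
{
    if (pos + n > len)      /* pos + n can wrap around to a small value */
        n = len - pos;
    return n;
}
```

With len = 10, clamp_naive(10, 5, 100) is clamped to 5 as intended, but clamp_naive(10, 5, (size_t)-3) returns (size_t)-3 unchanged, so p2 = p1 + n would point far outside the string.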

<snip>

char *str_del(char *s, size_t p, size_t n)
{
    size_t z = strlen(s);

    if (0 < n && p < z)
    {
        size_t nn = z - p;
        if (nn > n) nn = n;
        memmove(s + p, s + p + nn, z - p - nn + 1);
    }

    return s;
}

--
Peter

Barry Schwarz

Feb 1, 2010, 10:54:01 PM

On Mon, 01 Feb 2010 19:07:45 -0500, DSF <nota...@address.here>
wrote:

snip

>int StringDelete(char *str, size_t pos, size_t n)
>{
> char *p1;
> char *p2;
>
> size_t l = strlen(str);

Variable names consisting entirely of characters from the set (l,1)
should be banned.

>
> if(pos < l)
> {
> if(pos + n > l)
> n = l - pos;
>
> p1 = str + pos;
> p2 = p1 + n;
>
> while(*p2)
> *p1++ = *p2++;
> *p1 = *p2;
> return 0;
> }
> return 1;
>}
>
>DSF

--
Remove del for email

DSF

Feb 2, 2010, 1:14:22 AM

On Mon, 1 Feb 2010 17:02:31 -0800 (PST), in comp.lang.c you wrote:

>DSF <notava...@address.here> wrote:
>> ... I gather there is no guaranteed way to compare pointers
>> against each other to check if they are in range.

{snip}

>> int StringDelete(char *str, size_t pos, size_t n)
>
>Is that 0 if it succeeds?

Yes, because I wasn't sure if there might be other error conditions
I would want to handle. If not, I will probably use 0 = false = fail,
1 = true = succeed.

>
>Also, somewhat technically, C90 need not be case insensitive
>with regards to identifiers with external linkage.

Did you mean need not be case sensitive? I didn't think that C was
case insensitive anywhere. Is it case insensitive with external
linkage to allow compatibility with other case insensitive languages?

> So your
>function name may tread on implementation namespace. Any
>identifier beginning with str followed by a lowercase letter
>is reserved.

That ties up a lot of identifiers!

>
>> {
>> char *p1;
>> char *p2;
>>
>> size_t l = strlen(str);
>>
>> if(pos < l)
>> {
>> if(pos + n > l)
>> n = l - pos;
>>
>> p1 = str + pos;
>> p2 = p1 + n;
>
>Consider pos = 5, n = (size_t) -3.

Thanks. I knew neither pos nor n could be negative, but forgot that
pos + n can overflow when n is within pos of the largest size_t value.
And since my compiler doesn't even peep at StringDelete(str, 5, -3);
it's a definite concern.
Fortunately, it was easy to fix. (See bottom.)

>
>>
>> while(*p2)
>> *p1++ = *p2++;
>> *p1 = *p2;
>> return 0;
>> }
>> return 1;
>> }
><snip>
>
> char *str_del(char *s, size_t p, size_t n)
> {
> size_t z = strlen(s);
>
> if (0 < n && p < z)
> {
> size_t nn = z - p;
> if (nn > n) nn = n;
> memmove(s + p, s + p + nn, z - p - nn + 1);
> }
>
> return s;
> }

This brings up an interesting question: why do many (all?) of the
string functions return a copy of the "destination" string? Is it
just for the convenience of being able to use the function itself in
code requiring a char * instead of calling the function and then
passing the pointer to the code requiring a char * on another line?

Here's the updated code: (Name's still the same, for now.)

int StringDelete(char *str, size_t pos, size_t n)
{
    char *p1;
    char *p2;

    size_t l = strlen(str);

    if(pos < l && !((pos + n) < n))
    {
        if(pos + n > l)
            n = l - pos;

        p1 = str + pos;
        p2 = p1 + n;

        while(*p2)
            *p1++ = *p2++;
        *p1 = *p2;
        return 0;
    }
    return 1;
}

DSF

Feb 2, 2010, 1:26:53 AM

On Mon, 01 Feb 2010 19:54:01 -0800, Barry Schwarz <schw...@dqel.com>
wrote:

>On Mon, 01 Feb 2010 19:07:45 -0500, DSF <nota...@address.here>
>wrote:
>
{snip}

>> size_t l = strlen(str);
>
>Variable names consisting entirely of characters from the set (l,1)
>should be banned.

I understand that. I use Windows and I'd like to get ahold of the
person who chose a system font where the lower case 'i' & 'j' and 'q'
& 'g' are virtually indistinguishable on a 1280x1024 screen. I have a
friend who asks me every once in a while what an ".IPG" (upper case
for clarity here) extension is. :o)

DSF

Ben Bacarisse

Feb 2, 2010, 7:01:23 AM

DSF <nota...@address.here> writes:

> On Mon, 1 Feb 2010 17:02:31 -0800 (PST), in comp.lang.c you wrote:

<snip>

>>Also, somewhat technically, C90 need not be case insensitive
>>with regards to identifiers with external linkage.
> Did you mean need not be case sensitive? I didn't think that C was
> case insensitive anywhere.

Well, it is not mandated -- it is simply a possibility. C90
implementations are permitted to ignore case on identifiers with
external linkage and they need only treat the first 6 letters as being
significant. 1990 seems such a long time ago, now.

> Is it case insensitive with external
> linkage to allow compatibility with other case insensitive
> languages?

Maybe, though that could probably be handled by a mechanism outside of
the C language. I suspect it is simply to accommodate old-fashioned
linkers.

<snip>

> This brings up an interesting question: why do many (all?) of the
> string functions return a copy of the "destination" string? Is it
> just for the convenience of being able to use the function itself in
> code requiring a char * instead of calling the function and then
> passing the pointer to the code requiring a char * on another line?

"Just for the convenience" does not do that notion justice. What
reason could there be for not returning a value if there is a value
that might reasonably be returned? The question is then simply which
value should be returned and the destination is a clear winner in most
cases.
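
As a small illustration of that convenience: because strcpy and strcat both return their destination, calls can be nested and the result passed straight on. This is only a sketch; build_greeting and its parameters are invented for the example, and the caller is assumed to supply a large enough buffer.

```c
#include <string.h>

/* builds "Hello, <name>" in buf, which the caller guarantees is large
   enough; returns buf so the call can itself be used in an expression */
static char *build_greeting(char *buf, const char *name)
{
    return strcat(strcpy(buf, "Hello, "), name);
}
```

A caller can then write something like puts(build_greeting(buf, "DSF")) without an intermediate statement.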

> Here's the updated code: (Name's still the same, for now.)
>
> int StringDelete(char *str, size_t pos, size_t n)
> {
> char *p1;
> char *p2;

I'd define these inside the next block, but that is a matter of taste
and style.

> size_t l = strlen(str);
>
> if(pos < l && !((pos + n) < n))

What is that second test all about?

> {
> if(pos + n > l)
> n = l - pos;

Ah, maybe you are worried about the possibility that pos+n is bigger
than SIZE_MAX. I would test if n > l - pos and adjust if needed and
delete the "mystery" test in the previous if.

> p1 = str + pos;
> p2 = p1 + n;
>
> while(*p2)
> *p1++ = *p2++;
> *p1 = *p2;

This is what memmove is for.

> return 0;
> }
> return 1;
> }

Odd return values, but I've seen your explanation of why.

--
Ben.

Nobody

Feb 2, 2010, 10:57:05 AM

On Tue, 02 Feb 2010 01:14:22 -0500, DSF wrote:

>>Also, somewhat technically, C90 need not be case insensitive
>>with regards to identifiers with external linkage.
>
> Did you mean need not be case sensitive? I didn't think that C was
> case insensitive anywhere. Is it case insensitive with external
> linkage to allow compatibility with other case insensitive languages?

The "case-insensitive, 6 significant characters" rule is to allow
for limitations of the platform's linker and object file format.

DSF

Feb 3, 2010, 1:28:27 AM

On Tue, 02 Feb 2010 12:01:23 +0000, Ben Bacarisse
<ben.u...@bsb.me.uk> wrote:

>DSF <nota...@address.here> writes:
>
{snip}

>> This brings up an interesting question: why do many (all?) of the
>> string functions return a copy of the "destination" string? Is it
>> just for the convenience of being able to use the function itself in
>> code requiring a char * instead of calling the function and then
>> passing the pointer to the code requiring a char * on another line?
>
>"Just for the convenience" does not do that notion justice. What
>reason could there be for not returning a value if there is a value
>that might reasonably be returned? The question is then simply which
>value should be returned and the destination is a clear winner in most
>cases.

I guess what I was getting at is that the string function can sit in
place of, but not take the place of a variable.

strcpy(string1, strcpy(string2, string3));

Is functionally the same as:

strcpy(string2, string3);
strcpy(string1, string2);

I had an example of what I mean, but I can't find it. Something
where string2 above was only a placeholder: a buffer to get from
string3 to string1. It's really not important though, and I do
understand your point.

>
>> Here's the updated code: (Name's still the same, for now.)
>>
>> int StringDelete(char *str, size_t pos, size_t n)
>> {
>> char *p1;
>> char *p2;
>
>I'd define these inside the next block, but that is a matter of taste
>and style.
>
>> size_t l = strlen(str);
>>
>> if(pos < l && !((pos + n) < n))
>
>What is that second test all about?

No "mystery." The reason is in the part of my reply to Peter
Nilsson that was snipped. The second part above simply tests pos + n
for overflow. Off the top of my head (untested) I guess it could have
been stated as ((pos + n) >= n) and lose the !, but it does work as
written.
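
A sketch of why that test detects wraparound: size_t arithmetic is reduced modulo SIZE_MAX + 1, so the sum ends up smaller than either operand exactly when it wrapped. The helper name add_wraps is made up for illustration.

```c
#include <stddef.h>

/* returns 1 if pos + n wraps around, 0 otherwise */
static int add_wraps(size_t pos, size_t n)
{
    return pos + n < n;     /* same condition as (pos + n) < n in the post */
}
```

For example, add_wraps(5, (size_t)-3) is 1 because the sum wraps to 2, while add_wraps(5, 3) is 0.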

>
>> {
>> if(pos + n > l)
>> n = l - pos;
>
>Ah, maybe you are worried about the possibility that pos+n is bigger
>than SIZE_MAX. I would test if n > l - pos and adjust if needed and
>delete the "mystery" test in the previous if.

See above. I really don't think it's all that "mysterious." ;o)

>
>> p1 = str + pos;
>> p2 = p1 + n;
>>
>> while(*p2)
>> *p1++ = *p2++;
>> *p1 = *p2;
>
>This is what memmove is for.

Yeah, I know. But I figured making a wide (16-bit character)
version of this function would be a lot easier: just change all char
to wchar_t and use the wide version of strlen. However, its
companion function "StringInsert" already uses memmove, and memmove
will be faster except on the shortest strings (or if pos+n >= l, in
which case the function merely plops a 0 in str[pos]).

>
>> return 0;
>> }
>> return 1;
>> }
>
>Odd return values, but I've seen your explanation of why.

The conventional method would be to return str unaltered, but I
figure returning "Hey you! Either pos or n is out of range or str is
shorter than you thought!" would be much more useful for debugging or
error handling. :o)

DSF

Ben Bacarisse

Feb 3, 2010, 6:49:31 AM

DSF <nota...@address.here> writes:

> On Tue, 02 Feb 2010 12:01:23 +0000, Ben Bacarisse
> <ben.u...@bsb.me.uk> wrote:
>
>>DSF <nota...@address.here> writes:
<snip>

>>> Here's the updated code: (Name's still the same, for now.)
>>>
>>> int StringDelete(char *str, size_t pos, size_t n)
>>> {
>>> char *p1;
>>> char *p2;
>>>

>>> size_t l = strlen(str);
>>>
>>> if(pos < l && !((pos + n) < n))
>>
>>What is that second test all about?
>
> No "mystery." The reason is in the part of my reply to Peter
> Nilsson that was snipped. The second part above simply tests pos + n
> for overflow. Off the top of my head (untested) I guess it could have
> been stated as ((pos + n) >= n) and lose the !, but it does work as
> written.
>
>>
>>> {
>>> if(pos + n > l)
>>> n = l - pos;
>>
>>Ah, maybe you are worried about the possibility that pos+n is bigger
>>than SIZE_MAX. I would test if n > l - pos and adjust if needed and
>>delete the "mystery" test in the previous if.
>
> See above. I really don't think it's all that "mysterious." ;o)

The mystery is why it is there, not what it does. Not only is it not
needed to get the job done (though you do need it with the way you've
written this), it focuses the reader on the wrong thing: something
technical about the addition rather than the logical condition that
n > l - pos.
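
A minimal sketch of the alternative described here, assuming the same interface and return convention as the posted StringDelete (0 on success, 1 on failure). Because pos < len has already been established, len - pos cannot wrap, so comparing n against it needs no separate overflow test. The name string_delete_sketch and the use of memmove are choices made for this example, not DSF's code.

```c
#include <string.h>

static int string_delete_sketch(char *str, size_t pos, size_t n)
{
    size_t len = strlen(str);

    if (pos >= len)
        return 1;                   /* nothing to delete at pos */

    if (n > len - pos)              /* len - pos > 0 here, no wraparound */
        n = len - pos;

    /* move the tail, including the terminating '\0', down over the gap */
    memmove(str + pos, str + pos + n, len - pos - n + 1);
    return 0;
}
```

For str = "abcdefgh", pos = 2, n = 3, the call moves 8 - 2 - 3 + 1 = 4 bytes ("fgh" plus the terminator) and leaves "abfgh". Note one behavioral difference from the posted version: a huge n such as (size_t)-3 is clamped to the end of the string rather than rejected.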

>>> p1 = str + pos;
>>> p2 = p1 + n;
>>>
>>> while(*p2)
>>> *p1++ = *p2++;
>>> *p1 = *p2;
>>
>>This is what memmove is for.
>
> Yeah, I know. But I figured making a wide (16 bit character)
> version of this function would be a lot easier. Just change all char
> to wchar_t and use the wide version of strlen.

It's not that hard to change memmove to wmemmove at the same time.
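
For what it's worth, a wide-character sketch along those lines, using the standard wcslen and wmemmove from <wchar.h>. The function name wstring_delete_sketch and the 0/1 return convention are assumptions carried over from the posted code.

```c
#include <wchar.h>

static int wstring_delete_sketch(wchar_t *str, size_t pos, size_t n)
{
    size_t len = wcslen(str);

    if (pos >= len)
        return 1;
    if (n > len - pos)
        n = len - pos;

    /* wmemmove counts wchar_t elements, not bytes, so the arithmetic
       is the same as in a char version using memmove */
    wmemmove(str + pos, str + pos + n, len - pos - n + 1);
    return 0;
}
```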

Interestingly, I find it easier to write these things with a loop
myself (getting the offsets and lengths right for the memmove often
seems rather fiddly), but I prefer to use the library function in part
because it seems easier to check that the call is correct than to
check all the pointer setup and copying. This may be a figment of my
imagination, but I'd rather have code that is self-evident than code
that was simple to write.

> However, its
> companion function "StringInsert" already uses memmove, and memmove
> will be faster except on the shortest strings (or if pos+n >= l, in
> which case the function merely plops a 0 in str[pos]).
>
>>
>>> return 0;
>>> }
>>> return 1;
>>> }
>>
>>Odd return values, but I've seen your explanation of why.
>
> The conventional method would be to return str unaltered, but I
> figure returning "Hey you! Either pos or n is out of range or str is
> shorter than you thought!" would be much more useful for debugging or
> error handling. :o)

An alternative is to return str on success and NULL on failure.
There won't be a lot of benefit, but on occasion the caller will know
that all the sizes are right and they can rely on a non-null result.
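
A sketch of that convention, assuming "failure" means pos is at or past the end of the string. The name string_delete_or_null is hypothetical.

```c
#include <string.h>

/* returns str on success, NULL if pos is at or past the end of the string */
static char *string_delete_or_null(char *str, size_t pos, size_t n)
{
    size_t len = strlen(str);

    if (pos >= len)
        return NULL;
    if (n > len - pos)
        n = len - pos;
    memmove(str + pos, str + pos + n, len - pos - n + 1);
    return str;
}
```

A caller that knows the sizes are right can use the result directly in an expression; everyone else tests for NULL first, much as they would with strchr.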

--
Ben.