Overlapping strings?

Amittai Aviram

Jan 6, 2002, 11:41:01 PM
When you use strcpy() or strcat(), if the source and destination strings
overlap, you get undefined behavior. But how would this condition come
about in the first place? How would you get two strings that would overlap?
Thanks!

Amittai Aviram

Lance Purple

Jan 6, 2002, 11:50:05 PM

const char* example = "This is a string";
const char* overlap = example + 5;
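
To make the overlap concrete: both pointers above refer into the same array, so the string at overlap is simply the tail of the string at example. Below is a minimal sketch of an overlap check. The name strings_overlap is hypothetical (it is not from this thread or the standard library), and the function assumes both pointers point into the same array, since comparing pointers into unrelated objects with < is itself undefined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns nonzero if the strings starting at a and b
   share at least one byte of storage (terminators included).
   Assumption: a and b point into the same array. */
int strings_overlap(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    const char *a_end = a + strlen(a) + 1;   /* one past a's '\0' */
    const char *b_end = b + strlen(b) + 1;   /* one past b's '\0' */
    return a < b_end && b < a_end;           /* half-open ranges intersect */
}
```

With char buf[32] = "This is a string", the call strings_overlap(buf, buf + 5) returns nonzero, which is exactly the situation strcpy() and strcat() do not allow.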


--
.-----------------------------------------------------------------.
/ Lance Purple lpurple<a>io.com http://purple.home.texas.net /
'-----------------------------------------------------------------'

Gregory Pietsch

Jan 6, 2002, 11:58:04 PM
"Amittai Aviram" <ami...@amittai.com> wrote in message
news:oF9_7.393456$er5.15...@e3500-atl2.usenetserver.com...

This very contrived snippet of code is an example of the undefined behavior:

char s[100];

strcpy(s,"ABCDEFG");
strcpy(s+1, s);

To avoid something this goofy, use memmove() whenever there's any
possibility of overlap among the source and destination strings.
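
A minimal sketch of that advice applied to the snippet above, assuming the intent was to shift the string one position to the right. The name shift_right_one and its cap parameter are made up for illustration.

```c
#include <assert.h>
#include <string.h>

/* Shift the string in s one position to the right, in place.
   cap is the total size of the array that holds s (an assumed parameter). */
void shift_right_one(char *s, size_t cap)
{
    size_t len = strlen(s);
    assert(len + 2 <= cap);       /* len chars, one extra slot, and '\0' */
    memmove(s + 1, s, len + 1);   /* overlap-safe; the '\0' moves too */
}
```

After strcpy(s, "ABCDEFG") and shift_right_one(s, sizeof s), the array holds "AABCDEFG": s[0] is left untouched and everything else has moved up by one. The assert is only there to show the size requirement and disappears when NDEBUG is defined.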

Gregory Pietsch


Robert Stankowic

Jan 7, 2002, 12:11:55 AM

"Lance Purple" <lpu...@io.com> schrieb im Newsbeitrag
news:1O9_7.30231$wa.23...@bin6.nnrp.aus1.giganews.com...

> Amittai Aviram <ami...@amittai.com> wrote:
> >When you use strcpy() or strcat(), if the source and destination strings
> >overlap, you get undefined behavior. But how would this condition come
> >about in the first place? How would you get two strings that would overlap?
> >Thanks!
>
> const char* example = "This is a string";
> const char* overlap = example + 5;

No way to (legally) pass both of the above to strcpy() or strcat(), since the
destination would be a string literal ;-)
What about:

char my_string[100];

strcpy(my_string, "This is a string"); /* OK */
strcat(my_string, my_string); /* oops, UB */
strcpy(my_string, my_string); /* oops, UB */

Robert

Amittai Aviram

Jan 7, 2002, 12:23:20 AM

"Robert Stankowic" <pcdo...@netway.at> wrote in message
news:a1bag3$90p$1...@newsreader1.netway.at...

>
> "Lance Purple" <lpu...@io.com> schrieb im Newsbeitrag
> news:1O9_7.30231$wa.23...@bin6.nnrp.aus1.giganews.com...
> > Amittai Aviram <ami...@amittai.com> wrote:
> > >When you use strcpy() or strcat(), if the source and destination strings
> > >overlap, you get undefined behavior. But how would this condition come
> > >about in the first place? How would you get two strings that would overlap?
> > >Thanks!

> char my_string[100];

O.k., an empty array of 100 char-sized memory units.

> strcpy(my_string, "This is a string"); /* OK */

Now the array has the letters in the string, plus NUL, plus the rest of the
reserved space.

> strcat(my_string, my_string); /* oops, UB */

These two seem to me to be overlapping 100%, i.e., on top of each other. Is
that right?

> strcpy(my_string, my_string); /* oops, UB */

Likewise, copying my_string[0] onto my_string[0], etc., thus 100% overlap.
Correct?

> Robert

Amittai

Amittai Aviram

Jan 7, 2002, 12:23:57 AM

"Gregory Pietsch" <gk...@flash.net> wrote in message
news:wV9_7.3073$232.58...@newssvr15.news.prodigy.com...


Cool. Thank you!

Amittai


Robert Stankowic

Jan 7, 2002, 1:32:10 AM

"Amittai Aviram" <ami...@amittai.com> schrieb im Newsbeitrag
news:Iaa_7.211808$BX4.12...@e3500-atl1.usenetserver.com...

>
> "Robert Stankowic" <pcdo...@netway.at> wrote in message
> news:a1bag3$90p$1...@newsreader1.netway.at...
> >
> > "Lance Purple" <lpu...@io.com> schrieb im Newsbeitrag
> > news:1O9_7.30231$wa.23...@bin6.nnrp.aus1.giganews.com...
> > > Amittai Aviram <ami...@amittai.com> wrote:
> > > >When you use strcpy() or strcat(), if the source and destination strings
> > > >overlap, you get undefined behavior. But how would this condition come
> > > >about in the first place? How would you get two strings that would overlap?
> > > >Thanks!
>
> > char my_string[100];
>
> O.k., an empty array of 100 char-sized memory units.

Uninitialized, to be precise. There is space for 100 chars, but they contain
garbage.

>
> > strcpy(my_string, "This is a string"); /* OK */
>
> Now the array has the letters in the string, plus NUL, plus the rest of the
> reserved space.

Yes. By the way, better to call it "plus '\0'".

>
> > strcat(my_string, my_string); /* oops, UB */
>
> These two seem to me to be overlapping 100%, i.e., on top of each other. Is
> that right?
>
> > strcpy(my_string, my_string); /* oops, UB */
>
> Likewise, copying my_string[0] onto my_string[0], etc., thus 100% overlap.
> Correct?

Yes. Not really very useful, just meant as an example.
Actually, this would probably give the expected result on many
implementations (UB can be anything, including the expected result). But a
slightly different example:

strcpy(my_string + 1, my_string);

will pretty likely crash (unfortunately probably not always) on many
implementations.

Robert

Richard Heathfield

Jan 7, 2002, 12:50:20 AM
Gregory Pietsch wrote:
>
> "Amittai Aviram" <ami...@amittai.com> wrote in message
> news:oF9_7.393456$er5.15...@e3500-atl2.usenetserver.com...
> > When you use strcpy() or strcat(), if the source and destination strings
> > overlap, you get undefined behavior. But how would this condition come
> > about in the first place? How would you get two strings that would overlap?
> > Thanks!
> >
> > Amittai Aviram
>
> This very contrived snippet of code is an example of the undefined behavior:
>
> char s[100];
>
> strcpy(s,"ABCDEFG");
> strcpy(s+1, s);

Actually, it's not all /that/ contrived. I've seen something similar to
this done (and I wouldn't be surprised to learn I'd done it myself in
the dim and distant past) as a way of removing the first character from
the string:

strcpy(s, s + 1);

Needless to say, the behaviour is completely undefined, just as yours
is.

>
> To avoid something this goofy, use memmove() whenever there's any
> possibility of overlap among the source and destination strings.

<shrug> Yes, or use a temp if you find that more to your liking. (I can
hear the health and efficiency people screaming at me already.)
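
For the "remove the first character" case, here is a rough sketch of both options: one with memmove() and one with a temporary copy. The function names are invented for this example, and the temp version assumes a heap allocation is acceptable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Remove the first character of s in place using memmove(). */
void drop_first(char *s)
{
    assert(s != NULL);
    if (*s != '\0')
        memmove(s, s + 1, strlen(s + 1) + 1);   /* tail plus its '\0' */
}

/* Same result via a temporary copy.
   Returns 0 on success, -1 if malloc() fails (s is then unchanged). */
int drop_first_tmp(char *s)
{
    assert(s != NULL);
    if (*s == '\0')
        return 0;
    size_t n = strlen(s + 1) + 1;   /* bytes in the tail, '\0' included */
    char *tmp = malloc(n);
    if (tmp == NULL)
        return -1;
    memcpy(tmp, s + 1, n);          /* tmp and s are distinct objects */
    memcpy(s, tmp, n);
    free(tmp);
    return 0;
}
```

Either call turns "ABCDEFG" into "BCDEFG". The temp version costs an allocation and two copies, which is presumably what the efficiency people would scream about.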

--
Richard Heathfield : bin...@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton


Gordon Burditt

Jan 7, 2002, 3:15:26 AM

You are writing a text editor. One of the commands is "delete word".
You have this line of text in a buffer:

Now is the time for all good men to kick kick Osama bin Laden's ass.
                                    ^
                                    cursor

The cursor, represented by a char pointer (named "cursor"), is at
the first word "kick". The user issues the "delete word" command.
You determine that you need to delete 5 characters ("kick" and the
following space). You can do this by copying the tail end of the
line (cursor+5 on) to cursor. Unfortunately,

strcpy(cursor, cursor+5); /* trouble */

invokes the wrath of undefined behavior. One way to deal with
this is to use memmove(), which can deal with overlapped copies:

memmove(cursor, cursor+5, strlen(cursor+5)+1);

The length of the tail end of the line is strlen(cursor+5),
plus one to copy the string terminator.
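
As a sketch of how the whole "delete word" step might look, the function below works out the count itself instead of hard-coding 5. The name delete_word is hypothetical, and it assumes a "word" is a run of non-space characters and that all spaces following it should go too.

```c
#include <assert.h>
#include <string.h>

/* Delete the word at cursor plus any spaces that follow it,
   by sliding the rest of the line left over the deleted text. */
void delete_word(char *cursor)
{
    assert(cursor != NULL);
    size_t n = strcspn(cursor, " ");    /* length of the word itself */
    n += strspn(cursor + n, " ");       /* plus the trailing spaces */
    memmove(cursor, cursor + n, strlen(cursor + n) + 1);  /* tail and '\0' */
}
```

For a line containing "kick kick" with cursor at the first "kick", n comes out as 5 and the call is equivalent to the memmove() shown above.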

Another example: same starting line above, but the user requests
changing the text "kick kick" to "assassinate". The text to delete
is 9 chars long; the replacement is 11 chars long. You can do this
by moving the tail end 2 chars down (assume there's room), and
copying in the replacement text.

strcpy(cursor+11, cursor+9); /* trouble */
strncpy(cursor, "assassinate", 11);

The above might seem to work, but there's that nasty old undefined
behavior again. An implementation of strcpy() that copies one
character at a time would copy the space after the second "kick"
over the 's' in "Osama", and it would never stop copying as any \0
characters would be written over before they are looked at. It
would stop only when something like OS memory protection or stomping
over the code of strcpy() intrudes, leaving behind a long series
of space and 'O' characters.
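
A sketch of an overlap-safe version of that replacement: memmove() shifts the tail, then memcpy() drops in the new text. The name replace_at and the room parameter are assumptions made for this example, and repl must not point into the line buffer.

```c
#include <assert.h>
#include <string.h>

/* Replace the first old_len characters at cursor with the string repl.
   room is the number of bytes available from cursor to the end of the
   buffer (an assumed parameter, so the "assume there's room" is checked). */
void replace_at(char *cursor, size_t old_len, const char *repl, size_t room)
{
    assert(old_len <= strlen(cursor));
    size_t new_len = strlen(repl);
    size_t tail = strlen(cursor + old_len) + 1;          /* tail incl. '\0' */
    assert(new_len + tail <= room);
    memmove(cursor + new_len, cursor + old_len, tail);   /* overlap-safe */
    memcpy(cursor, repl, new_len);                       /* no '\0' copied */
}
```

Replacing 9 characters with an 11-character string moves the tail 2 positions toward the end of the buffer, as described above, and the same function also handles a replacement that is shorter than the original text.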

Gordon L. Burditt

Thomas Stegen

Jan 7, 2002, 9:44:59 AM
[snip]

>
> strcpy(my_string + 1, my_string);
>
> will pretty likely crash (unfortunately probably not always) on many
> implementations.

This will invoke UB since it will dereference an element one past the
end of an allocated block of memory. (You can point, but not touch :)

The OS might also decide to throw you out if you try to write to memory
you have no business altering.

>
> Robert
>
>
>

--
Thomas.

Approaching singularity.


Robert Stankowic

Jan 7, 2002, 10:05:56 AM

"Thomas Stegen" <tho_s...@hotmail.com> schrieb im Newsbeitrag
news:a1cceb$mqn$1...@dennis.cc.strath.ac.uk...

> [snip]
>
> >
> > strcpy(my_string + 1, my_string);
> >
> > will pretty likely crash (unfortunately probably not always) on many
> > implementations.
>
> This will invoke UB since it will dereference an element one past the
> end of an allocated block of memory. (You can point, but not touch :)

No, Thomas, not really :-)
If you look at my previous posting you will see that my_string is defined
as char[100], but the string copied there is much shorter.
Anyway, it _is_ UB as I stated, but that is because the source and
destination overlap, and most likely strcpy() will never terminate because
the copying process overwrites the terminating '\0'.
But maybe I misunderstand your reply :-)

>
> The OS might also decide to throw you out if you try to write to memory
> you have no business altering.
>

Hopefully it will; a crash is always better than a wrong result without any
indication.

Kind regards
Robert


Lawrence Kirby

Jan 7, 2002, 7:32:32 AM
On Monday, in article
<Iaa_7.211808$BX4.12...@e3500-atl1.usenetserver.com>
ami...@amittai.com "Amittai Aviram" wrote:

>"Robert Stankowic" <pcdo...@netway.at> wrote in message
>news:a1bag3$90p$1...@newsreader1.netway.at...
>>
>> "Lance Purple" <lpu...@io.com> schrieb im Newsbeitrag
>> news:1O9_7.30231$wa.23...@bin6.nnrp.aus1.giganews.com...
>> > Amittai Aviram <ami...@amittai.com> wrote:
>> > >When you use strcpy() or strcat(), if the source and destination strings
>> > >overlap, you get undefined behavior. But how would this condition come
>> > >about in the first place? How would you get two strings that would overlap?
>> > >Thanks!
>
>> char my_string[100];
>
>O.k., an empty array of 100 char-sized memory units.
>
>> strcpy(my_string, "This is a string"); /* OK */
>
>Now the array has the letters in the string, plus NUL, plus the rest of the
>reserved space.
>
>> strcat(my_string, my_string); /* oops, UB */
>
>These two seem to me to be overlapping 100%, i.e., on top of each other. Is
>that right?

Except that since strcat() copies to the end of the destination string
it would typically be larger than the source.

>> strcpy(my_string, my_string); /* oops, UB */
>
>Likewise, copying my_string[0] onto my_string[0], etc., thus 100% overlap.
>Correct?

Yes. Even 100% overlap results in undefined behaviour.
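
If the goal of strcat(my_string, my_string) was to double the string, one way to get there without overlap is sketched below. The name append_self and the cap parameter are hypothetical. It works because the bytes read, [s, s+len), and the bytes written, [s+len, s+2*len), never intersect, so plain memcpy() is enough.

```c
#include <assert.h>
#include <string.h>

/* Append the string in s to itself, in place.
   cap is the total size of the array that holds s (an assumed parameter). */
void append_self(char *s, size_t cap)
{
    size_t len = strlen(s);
    assert(2 * len + 1 <= cap);   /* room for both copies and '\0' */
    memcpy(s + len, s, len);      /* source and destination are disjoint */
    s[2 * len] = '\0';            /* terminate the doubled string */
}
```

The length is measured once, before anything is written, so overwriting the original '\0' with the first copied character does no harm.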

--
-----------------------------------------
Lawrence Kirby | fr...@genesis.demon.co.uk
Wilts, England | 7073...@compuserve.com
-----------------------------------------
