The list.erase(remove...) idiom, doing half the job???

99 views
Skip to first unread message

Tim Ottinger

unread,
Jan 24, 2000, 3:00:00 AM1/24/00
to
Maybe this has been covered, but I can't find much mention of it.

The STL is brilliant, wonderfully made, efficient, and in many ways
it is intuitive once you've spent a little time with it (expert-
friendly).

However, remove() remains one of the most counter-intuitive bits.
If I call remove(x.begin(),x.end(),someValue), then the size of
'x' is unchanged. We have to take the iterator returned by remove()
and call x.erase(returnedIterator,x.end()) for some reason.

Now, "erase" and "remove" are essentially the same word, they're such
close synonyms. Nothing in the name 'remove' leads a speaker English
(or American <grin>) intuitively to think that 'remove' does a partial
job, while 'erase' does a complete job. Maybe
'remove_but_do_not_truncate' would have been too cumbersome a name.
But why have such a function at all?

I find that I have difficulty recalling things when I don't understand
them really well. Maybe that's normal, but I've never been given a
rationale for remove() working the way it does. The stuff from
returnedIterator to x.end() isn't very useful. Why keep it?

Maybe someone here can explain it in a way that I can recall it more
easily, instead of just memorizing it as an unfortunate fact of C++
life. Right now it sits in my list of "legitimate reasons to dislike
C++" (which, by the way, is shorter than my list of reasons to like
it).

[ Send an empty e-mail to c++-...@netlab.cs.rpi.edu for info ]
[ about comp.lang.c++.moderated. First time posters: do this! ]


Dietmar Kuehl

unread,
Jan 25, 2000, 3:00:00 AM1/25/00
to
Hi,
In article <388c778e...@news.concentric.net>,

tott...@concentric.net (Tim Ottinger) wrote:
> However, remove() remains one of the most counter-intuitive bits.
> If I call remove(x.begin(),x.end(),someValue), then the size of
> 'x' is unchanged. We have to take the iterator returned by remove()
> and call x.erase(returnedIterator,x.end()) for some reason.

I think understanding the reason why 'remove()' does not really remove
the elements is the key to remembering the rule. However, the reason
is extremely simple: There is, in general, no way to determine which
container the iterator are from! Thus, there is in general no way to
remove elements from this container. In fact, there are even containers
where it is impossible to remove elements from, eg. built-in arrays. On
the other hand, it makes very well sense to separate elements into
groups of used and unused elements, even if they cannot really be
removed from the container. If you really want to remove the elements,
you can use the container together with the method 'erase()' which is
named differently because it has a different semantic: This one, being
a method of the container, can very well remove the elements.

The general rule is that you cannot remove or add elements without
having a reference to the container somehow. You can, at least for
non-associative containers, reorder the elements within a container
without a reference to the container because the elements are always
assignable (a basic requirement on the value type for non-associative
containers).

Now, there are algorithms which can actually add elements to a
container, for example you can add elements using 'copy()'. However,
this requires that you pass in an iterator which knows the containers,
for example a 'back_insert_iterator'. However, this is kind of cheating
and also these iterators are normally rather primitive: For example,
the 'back_insert_iterator' is just an OutputIterator' but 'remove()'
would require at least a ForwardIterator. The actual problem in this
case is the multi-pass facility required for ForwardIterator makes it
much harder to implement an iterator which adds some functionality.

> Now, "erase" and "remove" are essentially the same word, they're such
> close synonyms. Nothing in the name 'remove' leads a speaker English
> (or American <grin>) intuitively to think that 'remove' does a partial
> job, while 'erase' does a complete job.

Just look at what is possible to the algorithm to do: Can it actually
change the container because it is available? If so, it will change
the container. As a litmus test you can just ask the question: Can the
algorithm work on built-in arrays? If it can, the algorithm will
neither erase nor append elements: This is impossible for built-in
arrays. If you cannot apply the algorithm to a built-in array, eg.
'x.erase()' because built-in arrays don't have member functions, the
algorithm *might* change the container in other ways than reordering
it. Of course, whether it will depends on the semantics of the
algorithm: While 'erase()' will erase elements from the container,
'sort()' will just rearrange them without removing any of them.

> I find that I have difficulty recalling things when I don't understand
> them really well. Maybe that's normal, but I've never been given a
> rationale for remove() working the way it does. The stuff from
> returnedIterator to x.end() isn't very useful. Why keep it?

I think the above gives a plausible rationale. To answer your question:
There is no way to really remove it from the container in the first
place because it is unknown to which container the elements belong (if
they belong to a container at all...)!
--
<mailto:dietma...@claas-solutions.de>
homepage: <http://www.informatik.uni-konstanz.de/~kuehl>


Sent via Deja.com http://www.deja.com/
Before you buy.

John Potter

unread,
Jan 25, 2000, 3:00:00 AM1/25/00
to
On 24 Jan 2000 13:50:05 -0500, tott...@concentric.net (Tim Ottinger)
wrote:

: However, remove() remains one of the most counter-intuitive bits.
: If I call remove(x.begin(),x.end(),someValue), then the size of
: 'x' is unchanged.

Notice that the function receives iterators. There is no container_of
operator for iterators. The generic function operates on a range. It
could be a contiguous block from a vector or a bunch of nodes from a
list. There is no way for the algorithm to modify the container, it
simply rearanges the data.

: We have to take the iterator returned by remove()


: and call x.erase(returnedIterator,x.end()) for some reason.

Now the member of x can adjust the container to dispose of the portion
which contains the remaining garbage.

Unfortunately, you are still left with list::remove and list::remove_if
which erase. At one time, those functions of set/map were also named
remove, but have been changed to erase.

John

Marco Dalla Gasperina

unread,
Jan 26, 2000, 3:00:00 AM1/26/00
to
Tim Ottinger <tott...@concentric.net> wrote in message
news:388c778e...@news.concentric.net...

> Maybe this has been covered, but I can't find much mention of it.
>
> However, remove() remains one of the most counter-intuitive bits.
> If I call remove(x.begin(),x.end(),someValue), then the size of
> 'x' is unchanged. We have to take the iterator returned by remove()

> and call x.erase(returnedIterator,x.end()) for some reason.
>
> Now, "erase" and "remove" are essentially the same word, they're such
> close synonyms. Nothing in the name 'remove' leads a speaker English
> (or American <grin>) intuitively to think that 'remove' does a partial
> job, while 'erase' does a complete job. Maybe
> 'remove_but_do_not_truncate' would have been too cumbersome a name.
> But why have such a function at all?

I agree with Tim here. From my (admittedly limited) understanding
remove_if() is effectively the same thing as stable_partition().
And remove() is just a convient form which uses operator== as the
predicate. Seems like one should use the 'partition' forms because
it's clear that it divides the sequence into 2 pieces... one that
statisfies a predicate, and one that doesn't. Removal has nothing
to do with any of this.

marco

Marco Dalla Gasperina

unread,
Jan 26, 2000, 3:00:00 AM1/26/00
to

Dietmar Kuehl <dietma...@claas-solutions.de> wrote in message
news:86ik43$5ra$1...@nnrp1.deja.com...

> > I find that I have difficulty recalling things when I don't understand
> > them really well. Maybe that's normal, but I've never been given a
> > rationale for remove() working the way it does. The stuff from
> > returnedIterator to x.end() isn't very useful. Why keep it?
>
> I think the above gives a plausible rationale. To answer your question:
> There is no way to really remove it from the container in the first
> place because it is unknown to which container the elements belong (if
> they belong to a container at all...)!

I think this was Tim's point. If there is no way to 'remove' the
elements, the function shouldn't be called 'remove'.

Pavel Mayer

unread,
Jan 28, 2000, 3:00:00 AM1/28/00
to
Marco Dalla Gasperina <mar...@pacifier.com> wrote:

> I agree with Tim here. From my (admittedly limited) understanding
> remove_if() is effectively the same thing as stable_partition().

At the first glance it looks like that, but "remove_if()" leaves the
elements in the second part of the container unspecified; usually the "true"
elements are just copied into the first part, possibly destroying some
"false" elements in the first part and leaving the original elements in the
second part untouched, which might in fact result in that afterwards no
"false" element exists due to "remove by overwriting", so the chosen name is
not completely wrong.

"stable_partition()" instead swaps elements and is a "lossless" reordering
so that *all* elements are still somewhere in the container.

"stable_partition()" is more expensive and might use temporary buffers do do
its job, so it is reasonable to use "remove_if" if only one part of the
result is needed.
--
Pavel Mayer (pa...@datango.de)
"Without chaos, nothing is created. Without order, nothing can exist."

Christopher Eltschka

unread,
Jan 28, 2000, 3:00:00 AM1/28/00
to
Marco Dalla Gasperina wrote:

[...]

> I agree with Tim here. From my (admittedly limited) understanding
> remove_if() is effectively the same thing as stable_partition().

> And remove() is just a convient form which uses operator== as the
> predicate. Seems like one should use the 'partition' forms because
> it's clear that it divides the sequence into 2 pieces... one that
> statisfies a predicate, and one that doesn't. Removal has nothing
> to do with any of this.

>From the SGI docs for remove:

---8<---8<---
Remove removes from the range [first, last) all elements that are equal
to value. That is, remove returns an iterator new_last
such that the range [first, new_last) contains no elements equal to
value. [1] The iterators in the range [new_last, last) are all
still dereferenceable, but the elements that they point to are
unspecified. Remove is stable, meaning that the relative order of
elements
that are not equal to value is unchanged.
---8<---8<---

The analog is in the remove_if documentation.

Indeed, SGI's remove just calls remove_copy with the output
sequence being the same as the input sequence. This obviously
does _not_ write the removed items back (so in some sense they
_are_ removed, since there's no way to get the value back).

So remove_if is _not_ the same as stable_partition, and
will generally be cheaper (because moving the removed
elements to the end is not required, let alone keeping
them in the same order).

Reply all
Reply to author
Forward
0 new messages