There seems to be a persistent myth about STL’s remove
, remove_if
, etc.
Ask even a relatively experienced C++ programmer to explain this code.
vector v = { 1,2,3,4,5 };
v.erase(remove_if(v.begin(), v.end(),
[] (int i) { return (i & 1) == 0; }),
v.end());
They’ll recognize the erase-remove idiom and correctly say that it’s removing even numbers from the vector
so that what’s left is { 1,3,5 }
. So then you ask them how it works, and likely as not, they say something like:
“The remove_if
moves all the elements you want to remove to the end, then the erase
gets rid of them.”
This isn’t what remove_if
does. (And likewise, the other remove*
algorithms.) If it did that – which is more work than it does – it would in fact be partition
. What remove
does is move the elements that won’t be removed to the beginning. A suitable implementation of remove_if
, which you can see in the libc++ source, is:
template
_ForwardIterator
remove_if(_ForwardIterator __first, _ForwardIterator __last,
_Predicate __pred)
{
__first = _VSTD::find_if<
_ForwardIterator,
typename add_lvalue_reference<_Predicate>::type>
(__first, __last, __pred);
if (__first != __last)
{
_ForwardIterator __i = __first;
while (++__i != __last)
{
if (!__pred(*__i))
{
*__first = _VSTD::move(*__i);
++__first;
}
}
}
return __first;
}
The elements that got removed just got overwritten. They didn’t get moved to the end. After the call to remove_if
(before the erase
call), the vector
was { 1,3,5,4,5 }
.
This means that remove
can potentially invalidate invariants, e.g. if we expect the sequence to contain unique values (the erase
restores the invariant, so erase-remove should always be paired in this case). And if this had been a container of pointers, in all likelihood, memory would have been leaked. Hence item 33 in Effective STL, “Be wary of remove-like algorithms on containers of pointers”.
So remember, remove
doesn’t move things to the end. Next time you hear that, congratulate the programmer saying it – they’re one of today’s lucky 10,000!
For containers holding pointers, a lot of people recommend doing the memory free within the remove_if predicate.
It seems like a nice solution, but also seems a bit strange to have a predicate that accesses the object in a non const way (from a pursist POV)
What do you think about that method?
I recommend using
unique_ptr
in owning containers 🙂You’re right,
remove_if
must not call non-const member functions (if it does, it’s a precondition violation, which technically means undefined behaviour). Butdelete
is not a non-const member function 🙂 so from a purist POV, it’s fine.