A persistent myth about STL’s remove (and friends)

There seems to be a persistent myth about STL’s remove, remove_if, etc.
Ask even a relatively experienced C++ programmer to explain this code.

vector v = { 1,2,3,4,5 };
v.erase(remove_if(v.begin(), v.end(),
                  [] (int i) { return (i & 1) == 0; }),
        v.end());

They’ll recognize the erase-remove idiom and correctly say that it’s removing even numbers from the vector so that what’s left is { 1,3,5 }. So then you ask them how it works, and likely as not, they say something like:

“The remove_if moves all the elements you want to remove to the end, then the erase gets rid of them.”

This isn’t what remove_if does. (And likewise, the other remove* algorithms.) If it did that – which is more work than it does – it would in fact be partition. What remove does is move the elements that won’t be removed to the beginning. A suitable implementation of remove_if, which you can see in the libc++ source, is:

template 
_ForwardIterator
remove_if(_ForwardIterator __first, _ForwardIterator __last, 
          _Predicate __pred)
{
    __first = _VSTD::find_if<
        _ForwardIterator, 
        typename add_lvalue_reference<_Predicate>::type>
            (__first, __last, __pred);
    if (__first != __last)
    {
        _ForwardIterator __i = __first;
        while (++__i != __last)
        {
            if (!__pred(*__i))
            {
                *__first = _VSTD::move(*__i);
                ++__first;
            }
        }
    }
    return __first;
}

The elements that got removed just got overwritten. They didn’t get moved to the end. After the call to remove_if (before the erase call), the vector was { 1,3,5,4,5 }.

This means that remove can potentially invalidate invariants, e.g. if we expect the sequence to contain unique values (the erase restores the invariant, so erase-remove should always be paired in this case). And if this had been a container of pointers, in all likelihood, memory would have been leaked. Hence item 33 in Effective STL, “Be wary of remove-like algorithms on containers of pointers”.

So remember, remove doesn’t move things to the end. Next time you hear that, congratulate the programmer saying it – they’re one of today’s lucky 10,000!

2 comments

  1. For containers holding pointers, a lot of people recommend doing the memory free within the remove_if predicate.

    It seems like a nice solution, but also seems a bit strange to have a predicate that accesses the object in a non const way (from a pursist POV)

    What do you think about that method?

  2. I recommend using unique_ptr in owning containers 🙂

    You’re right, remove_if must not call non-const member functions (if it does, it’s a precondition violation, which technically means undefined behaviour). But delete is not a non-const member function 🙂 so from a purist POV, it’s fine.

Leave a comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.