{"id":1056,"date":"2015-03-08T23:53:59","date_gmt":"2015-03-09T06:53:59","guid":{"rendered":"http:\/\/www.elbeno.com\/blog\/?p=1056"},"modified":"2015-06-30T21:16:23","modified_gmt":"2015-07-01T04:16:23","slug":"a-persistent-myth-about-stls-remove-and-friends","status":"publish","type":"post","link":"https:\/\/www.elbeno.com\/blog\/?p=1056","title":{"rendered":"A persistent myth about STL&#8217;s remove (and friends)"},"content":{"rendered":"<p>There seems to be a persistent myth about STL&#8217;s <code>remove<\/code>, <code>remove_if<\/code>, etc.<br \/>\nAsk even a relatively experienced C++ programmer to explain this code.<\/p>\n<pre lang=\"cpp\">\r\nvector<int> v = { 1,2,3,4,5 };\r\nv.erase(remove_if(v.begin(), v.end(),\r\n                  [] (int i) { return (i & 1) == 0; }),\r\n        v.end());\r\n<\/pre>\n<p>They&#8217;ll recognize <a href=\"http:\/\/en.wikipedia.org\/wiki\/Erase%E2%80%93remove_idiom\">the erase-remove idiom<\/a> and correctly say that it&#8217;s removing even numbers from the <code>vector<\/code> so that what&#8217;s left is <code>{ 1,3,5 }<\/code>. So then you ask them how it works, and likely as not, they say something like:<\/p>\n<p>&#8220;The <code>remove_if<\/code> moves all the elements you want to remove to the end, then the <code>erase<\/code> gets rid of them.&#8221;<\/p>\n<p><strong>This isn&#8217;t what <code>remove_if<\/code> does.<\/strong> (And likewise, the other <code>remove*<\/code> algorithms.) If it did that &#8211; which is <em>more<\/em> work than it does &#8211; it would in fact be <code>partition<\/code>. What <code>remove<\/code> does is move the elements that <strong>won&#8217;t<\/strong> be removed <strong>to the beginning<\/strong>. A suitable implementation of <code>remove_if<\/code>, which you can <a href=\"http:\/\/llvm.org\/svn\/llvm-project\/libcxx\/trunk\/include\/algorithm\">see in the libc++ source<\/a>, is:<\/p>\n<pre lang=\"cpp\">\r\ntemplate <class _ForwardIterator, class _Predicate>\r\n_ForwardIterator\r\nremove_if(_ForwardIterator __first, _ForwardIterator __last, \r\n          _Predicate __pred)\r\n{\r\n    __first = _VSTD::find_if<\r\n        _ForwardIterator, \r\n        typename add_lvalue_reference<_Predicate>::type>\r\n            (__first, __last, __pred);\r\n    if (__first != __last)\r\n    {\r\n        _ForwardIterator __i = __first;\r\n        while (++__i != __last)\r\n        {\r\n            if (!__pred(*__i))\r\n            {\r\n                *__first = _VSTD::move(*__i);\r\n                ++__first;\r\n            }\r\n        }\r\n    }\r\n    return __first;\r\n}\r\n<\/pre>\n<p>The elements that got removed just got overwritten. They didn&#8217;t get moved to the end. After the call to <code>remove_if<\/code> (before the <code>erase<\/code> call), the <code>vector<\/code> was <code>{ 1,3,5,4,5 }<\/code>. <\/p>\n<p>This means that <code>remove<\/code> can potentially invalidate invariants, e.g. if we expect the sequence to contain unique values (the <code>erase<\/code> restores the invariant, so erase-remove should always be paired in this case). And if this had been a container of pointers, in all likelihood, memory would have been leaked. Hence item 33 in <em><a href=\"http:\/\/www.amazon.com\/Effective-STL-Specific-Standard-Template\/dp\/0201749629\/\">Effective STL<\/a><\/em>, &#8220;Be wary of <em>remove<\/em>-like algorithms on containers of pointers&#8221;.<\/p>\n<p>So remember, <strong><code>remove<\/code> doesn&#8217;t move things to the end<\/strong>. Next time you hear that, congratulate the programmer saying it &#8211; they&#8217;re <a href=\"http:\/\/xkcd.com\/1053\/\">one of today&#8217;s lucky 10,000<\/a>!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>There seems to be a persistent myth about STL&#8217;s remove, remove_if, etc. Ask even a relatively experienced C++ programmer to explain this code. vector v = { 1,2,3,4,5 }; v.erase(remove_if(v.begin(), v.end(), [] (int i) { return (i &#038; 1) == 0; }), v.end()); They&#8217;ll recognize the erase-remove idiom and correctly&#8230;<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[22,8],"tags":[],"class_list":["post-1056","post","type-post","status-publish","format-standard","hentry","category-cpp","category-programming"],"_links":{"self":[{"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1056","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1056"}],"version-history":[{"count":10,"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1056\/revisions"}],"predecessor-version":[{"id":1066,"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=\/wp\/v2\/posts\/1056\/revisions\/1066"}],"wp:attachment":[{"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1056"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1056"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.elbeno.com\/blog\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1056"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}