Chapter 19: The STL Generic Algorithms

Don't hesitate to send in feedback: send an e-mail if you like the C++ Annotations; if you think that important material was omitted; if you find errors or typos in the text or the code examples; or if you just feel like e-mailing. Send your e-mail to Frank B. Brokken.

Please state the document version you're referring to, as found in the title (in this document: 8.3.1) and please state chapter and paragraph name or number you're referring to.

All received mail is processed conscientiously, and received suggestions for improvements will usually have been processed by the time a new version of the Annotations is released. Except for the incidental case I will normally not acknowledge the receipt of suggestions for improvements. Please don't interpret this as me not appreciating your efforts.

19.1: The Generic Algorithms

Before using the generic algorithms presented in this chapter, except for those in the Operators category (defined below), the <algorithm> header file must have been included. Before using a generic algorithm in the Operators category the <numeric> header file must have been included.

In the previous chapter the Standard Template Library (STL) was introduced. An important element of the STL, the generic algorithms, was not covered in that chapter as they form a fairly extensive part of the STL. Over time the STL has grown considerably, mainly as a result of a growing importance and appreciation of templates. Covering generic algorithm in the STL chapter itself would turn that chapter into an unwieldy one and so the generic algoritmhs were moved to a chapter of their own.

Generic algorithms perform an amazing task. Due to the strenght of templates algorithms could be developed that can be applied to a wide range of different data types while maintaining type safety. The prototypical example of this is the sort generic algorithm. To contrast: while C requires programmers to write callback functions in which type-unsafe void const * parameters have to be used, internally forcing the programmer to resort to casts, STL's sort frequently allows the programmer merely to state something akin to

    sort(first-element, last-element)

Generic algoritms should be used wherever possible. Avoid the urge to design your own code for commonly encountered algorithms. Make it a habit to first thoroughly search the generic algorithms for an available candidate. The generic algorithms should become your weapon of choice when writing code: acquire full familiarity with them and turn their use into your `second nature'.

This chapter's sections cover the STL's generic algorithms in alphabetical order. For each algorithm the following information is provided:

In the prototypes of the algorithms Type is used to specify a generic data type. Also, the particular type of iterator (see section 18.2) that is required is mentioned as well as other generic types that might be required (e.g., performing BinaryOperations, like plus<Type>).

Almost every generic algorithm expects an iterator range [first, last), defining the series of elements on which the algorithm operates. The iterators point to objects or values. When an iterator points to a Type value or object, function objects used by the algorithms usually receive Type const & objects or values. Usually function objects cannot modify the objects they receive as their arguments. This does not hold true for modifying generic algorithms, which are of course able to modify the objects they operate upon.

Generic algorithms may be categorized. The C++ Annotations distinguishes the following categories of generic algorithms:

19.1.1: accumulate

19.1.2: adjacent_difference

19.1.3: adjacent_find

19.1.4: binary_search

19.1.5: copy

19.1.6: copy_backward

19.1.7: count

19.1.8: count_if

19.1.9: equal

19.1.10: equal_range

19.1.11: fill

19.1.12: fill_n

19.1.13: find

19.1.14: find_end

19.1.15: find_first_of

19.1.16: find_if

19.1.17: for_each

The example also shows that the for_each algorithm may be used with functions defining const and non-const parameters. Also, see section 19.1.63 for differences between the for_each and transform generic algorithms.

The for_each algorithm cannot directly be used (i.e., by passing *this as the function object argument) inside a member function to modify its own object as the for_each algorithm first creates its own copy of the passed function object. A lambda function or a wrapper class whose constructor accepts a pointer or reference to the current object and possibly to one of its member functions solves this problem.

19.1.18: generate

19.1.19: generate_n

19.1.20: includes

19.1.21: inner_product

19.1.22: inplace_merge

19.1.23: iter_swap

19.1.24: lexicographical_compare

19.1.25: lower_bound

19.1.26: max

19.1.27: max_element

19.1.28: merge

19.1.29: min

19.1.30: min_element

19.1.31: mismatch

19.1.32: next_permutation

19.1.33: nth_element

19.1.34: partial_sort

19.1.35: partial_sort_copy

19.1.36: partial_sum

19.1.37: partition

19.1.38: prev_permutation

19.1.39: random_shuffle

19.1.40: remove

19.1.41: remove_copy

19.1.42: remove_copy_if

19.1.43: remove_if

19.1.44: replace

19.1.45: replace_copy

19.1.46: replace_copy_if

19.1.47: replace_if

19.1.48: reverse

19.1.49: reverse_copy

19.1.50: rotate

19.1.51: rotate_copy

19.1.52: search

19.1.53: search_n

19.1.54: set_difference

19.1.55: set_intersection

19.1.56: set_symmetric_difference

19.1.57: set_union

19.1.58: sort

19.1.59: stable_partition

19.1.60: stable_sort

Note that the example implements a solution to an often occurring problem: how to sort using multiple hierarchal criteria. The example deserves some additional attention:

19.1.61: swap

19.1.62: swap_ranges

19.1.63: transform

the following differences between the for_each (section 19.1.17) and transform generic algorithms should be noted:

19.1.64: unique

19.1.65: unique_copy

19.1.66: upper_bound

19.1.67: Heap algorithms

A heap is a kind of binary tree which can be represented by an array. In the standard heap, the key of an element is not smaller than the key of its children. This kind of heap is called a max heap. A tree in which numbers are keys could be organized as shown in figure 19.

Figure 19 is shown here.
Figure 19: A binary tree representation of a heap


Such a tree may also be organized in an array:

        12, 11, 10, 8, 9, 7, 6, 1, 2, 4, 3, 5
In the following description, keep two pointers into this array in mind: a pointer node indicates the location of the next node of the tree, a pointer child points to the next element which is a child of the node pointer. Initially, node points to the first element, and child points to the second element. Since child now points beyond the array, the remaining nodes have no children. So, nodes 6, 1, 2, 4, 3 and 5 don't have children.

Note that the left and right branches are not ordered: 8 is less than 9, but 7 is larger than 6.

A heap is created by traversing a binary tree level-wise, starting from the top node. The top node is 12, at the zeroth level. At the first level we find 11 and 10. At the second level 6, 7, 8 and 9 are found, etc.

Heaps can be constructed in containers supporting random access. So, a list is not an appropriate data structure for a heap. Heaps can be constructed from an (unsorted) array (using make_heap). The top-element can be pruned from a heap, followed by reordering the heap (using pop_heap), a new element can be added to the heap, followed by reordering the heap (using push_heap), and the elements in a heap can be sorted (using sort_heap, which (of course) invalidates the heap).

19.1.67.1: The `make_heap' function

19.1.67.2: The `pop_heap' function

19.1.67.3: The `push_heap' function

19.1.67.4: The `sort_heap' function

19.1.67.5: An example using the heap functions

Here is an example showing the various generic algorithms manipulating heaps:
    #include <algorithm>
    #include <iostream>
    #include <functional>
    #include <iterator>
    using namespace std;

    void show(int *ia, char const *header)
    {
        cout << header << ":\n";
        copy(ia, ia + 20, ostream_iterator<int>(cout, " "));
        cout << '\n';
    }
    int main()
    {
        int ia[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
                    11, 12, 13, 14, 15, 16, 17, 18, 19, 20};

        make_heap(ia, ia + 20);
        show(ia, "The values 1-20 in a max-heap");

        pop_heap(ia, ia + 20);
        show(ia, "Removing the first element (now at the end)");

        push_heap(ia, ia + 20);
        show(ia, "Adding 20 (at the end) to the heap again");

        sort_heap(ia, ia + 20);
        show(ia, "Sorting the elements in the heap");


        make_heap(ia, ia + 20, greater<int>());
        show(ia, "The values 1-20 in a heap, using > (and beyond too)");

        pop_heap(ia, ia + 20, greater<int>());
        show(ia, "Removing the first element (now at the end)");

        push_heap(ia, ia + 20, greater<int>());
        show(ia, "Re-adding the removed element");

        sort_heap(ia, ia + 20, greater<int>());
        show(ia, "Sorting the elements in the heap");
    }
    /*
        Displays:
            The values 1-20 in a max-heap:
            20 19 15 18 11 13 14 17 9 10 2 12 6 3 7 16 8 4 1 5
            Removing the first element (now at the end):
            19 18 15 17 11 13 14 16 9 10 2 12 6 3 7 5 8 4 1 20
            Adding 20 (at the end) to the heap again:
            20 19 15 17 18 13 14 16 9 11 2 12 6 3 7 5 8 4 1 10
            Sorting the elements in the heap:
            1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
            The values 1-20 in a heap, using > (and beyond too):
            1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
            Removing the first element (now at the end):
            2 4 3 8 5 6 7 16 9 10 11 12 13 14 15 20 17 18 19 1
            Re-adding the removed element:
            1 2 3 8 4 6 7 16 9 5 11 12 13 14 15 20 17 18 19 10
            Sorting the elements in the heap:
            20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1
    */

19.2: STL: More function adaptors

Before using the function adaptors presented in these (sub)sections the <functional> header file must have been included.

Member function adaptors are part of the Standard Template Library (STL). They are most useful in combination with the generic algorithms, which is why they are discussed in this chapter instead of the previous chapter which was devoted to the STL.

The member function adaptors defined in the STL allow us to call (parameterless) member functions of class objects as though they were non-member functions and to compose unary or binary argument functions into single function objects so that they can jointly be used with generic algorithms.

In the next section calling member functions as non-member functions is discusses; adaptable functions are covered thereafter.

19.2.1: member function adaptors

Member function adaptors allow the use of generic algorithms when the function to call is a member function of a class. Consider the following class:
    class Data
    {
        public:
            void display();
    };
Obviously, the display members is used to display the information stored in Data objects.

When storing Data objects in a vector the for_each generic algorithm cannot easily be used to call the display member for each of the objects stored in the vector, as for_each accepts either a (non-member) function or a function object as its third argument, but does not accept a member function.

The member function adaptor mem_fun_ref can be used to solve this problem. It expects the address of a member function without any parameters and its function call operator calls that function for the object that is passed to its function call operator. In the next example vector<Data> data is filled with Data objects. Then for_each is used to display the information in the various Data objects that are stored in the data vector:

    int main()
    {
        vector<Data> data;

        // the 'data' vector is filled with Data objects here

        for_each(data.begin(), data.end(), mem_fun_ref(&Data::display));
    }

A second member function adaptor is mem_fun, which is used to call a member function from a pointer to an object. The above example could be rewritten to something like:

    int main()
    {
        vector<Data *> data;

        data.push_back(new Data);   // multiple push-backs if applicable

        for_each(data.begin(), data.end(), mem_fun(&Data::display));

        // delete the Data objects pointed to by data's elements
    }

An interesting observation is that http://www.sgi.com provides an example of the use of mem_fun where polymorphic members are used. If Data::display is a virtual member function and Derived1 and Derived2 (both derived from Data) provide their own implementations of display, then pointers to Derived1 and Derived2 objects can be entered into the data vector. The for_each algorithm will then call the proper (overridden) display function. E.g.,

    int main()
    {
        vector<Data *> data;

        data.push_back(new Derived1);
        data.push_back(new Derived2);

                                        // calls Derived1::display or
                                        // Derived2::display.
        for_each(data.begin(), data.end(), mem_fun(&Data::display));
    }
This example, however, uses public virtual members, confounding the virtual and public interfaces of classes (cf. section 14.6) and is therefore deprecated. Polymorphism could still be used, though, but the public interface should be provided by Data as follows:
    class Data
    {
        public:
            void display();             // calls v_display
        private:
            virtual void v_display();   // overriden by derived
    };                                  // classes
In the above example mem_fun would simply receive &Data::display's address. No need to modify the above main function, but now Data's polymorphic features are completely separated from its public interface.

19.2.2: Adaptable functions

Within the context of the STL an adaptable function is a function object defining and defining result_type as the type of the return value of its function call operator.

The STL defines pointer_to_unary_function and pointer_to_binary_function as adaptors accepting pointers to, respectively, unary and binary functions, converting them into adaptable functions. In the context of the STL using adaptable functions should probably be preferred over pointers to functions as the STL's adaptable functions define the abovementioned types describing their arguments and result types. At the end of this section an example is given where these types are actually required, and a plain pointer to a function cannot be used.

In practice the pointer_to_unary_function and pointer_to_binary_function adaptors are hardly ever explicitly used. Instead, the overloaded function adaptor ptr_fun is used.

When ptr_fun is provided with a unary function it uses pointer_to_unary_function to create an adaptable unary function and when provided with a binary function it uses pointer_to_binary_function to create an adaptable binary function.

Here is an example showing the use of ptr_fun, creating an adaptable binary function. In main, the word to search for is extracted from the standard input stream as well as additional words that are stored in a vector of string objects. If the target word is found the program displays the word following the target word in the vector of string objects. Searching is performed case insensitively, for which the C library function strcasecmp is used:

#include <vector>
#include <algorithm>
#include <functional>
#include <cstring>
#include <string>
#include <iterator>
#include <iostream>

using namespace std;

inline int stringcasecmp(string lhs, string rhs)
{
    return strcasecmp(lhs.c_str(), rhs.c_str());
}

int main()
{
    string target;
    cin >> target;

    vector<string> v1;
    copy(istream_iterator<string>(cin), istream_iterator<string>(),
        back_inserter(v1));

    auto pos = find_if(
                    v1.begin(), v1.end(),
                    not1( bind2nd(ptr_fun(stringcasecmp), target) )
               );

    if ( pos != v1.end())
       cout <<   "The search for `" << target << "' was successful.\n"
                 "The next string is: `" << pos[1] << "'.\n";
}

    // on input:
    //  VERY   I have existed for years, but very little has changed
    // the program displays:
    //  The search for `VERY' was successful.
    //  The next string is: `little'.
The observant reader will have noticed that this is not a very efficient program. The function stringcasecmp defines value type parameters forcing the construction of copies of the arguments passed to stringcasecmp every time it is called. However, when changing the parameter definitions into
    inline int stringcasecmp(string const &lhs, string const &rhs)
the compiler generates an error message like:
In instantiation of 'std::binder2nd<std::pointer_to_binary_function<
        const std::string&, const std::string&, int> >':

    typename _Operation::result_type std::binder2nd<_Operation>::operator()(
            typename _Operation::first_argument_type&) const
cannot be overloaded with:
    typename _Operation::result_type std::binder2nd<_Operation>::operator()(
            const typename _Operation::first_argument_type&) const
This problem (and its solution) is covered in a later chapter (cf. section 21.5.2.1).