Chapter 11: The Standard Template Library, generic algorithms

We're always interested in getting feedback. E-mail us if you like this guide, if you think that important material is omitted, if you encounter errors in the code examples or in the documentation, if you find any typos, or generally just if you feel like e-mailing. Send your email to Frank Brokken.

Please state the document version you're referring to, as found in the title (in this document: 4.4.2).

The Standard Template Library (STL) consists of containers, generic algorithms, iterators, function objects, allocators and adaptors. The STL is a general purpose library consisting of algorithms and data structures. The data structures that are used in the algorithms are abstract in the sense that the algorithms can be used on (practically) every data type.

The algorithms can work on these abstract data types because they are template based algorithms. In this chapter the construction of these templates is not further discussed (see chapter 17 for that). Rather, the use of these template algorithms is the focus of this chapter.

Several parts of the standard template library have already been discussed in the C++ Annotations. In chapter 8 the abstract containers were discussed, and in section 7.8 function objects and adaptors were covered. Also, iterators were mentioned at several places in this document.

The remaining components of the STL will be covered in this chapter. Iterators, and the generic algorithms will be discussed in the coming sections. Allocators take care of the memory allocation within the STL. The default allocator class suffices for most applications.

Forgetting to delete allocated memory is a common source of errors or memory leaks in a program. The auto_ptr template class may be used to prevent these types of problems. The auto_ptr class is discussed in section 11.2 of this chapter.

11.1: Iterators

Iterators are an abstraction of pointers. In general, the following holds true of iterators: The STL containers produce iterators (i.e., type iterator) using member functions begin() and end() and, in the case of reversed iterators (type reverse_iterator), rbegin() and rend(). Standard practice requires the iterator range to be left inclusive: the notation [left, right) indicates that left is an iterator pointing to the first element that is to be considered, while right is an iterator pointing just beyond the last element to be used. The iterator range is said to be empty when left == right.

The following example shows a situation where all elements of a vector of strings are written to cout using the iterator range [begin(), end()), and the iterator range [rbegin(), rend()). Note that the for-loops for both ranges are identical:

#include <iostream>
#include <vector>
#include <string>

using namespace std;

int main(int argc, char **argv)
{
    vector<string>
        args(argv, argv + argc);

    for 
    (
        vector<string>::iterator iter = args.begin();
            iter != args.end();
                ++iter
    )
        cout << *iter << " ";

    cout << endl;

    for 
    (
        vector<string>::reverse_iterator iter = args.rbegin();
            iter != args.rend();
                ++iter
    )
        cout << *iter << " ";

    cout << endl;
    
    return (0);
}

Furthermore, the STL defines const_iterator types to be able to visit a range of the elements in a constant container. Whereas the elements of the vector in the previous example could have been altered, the elements of the vector in the next example are immutable, and const_iterators are required:

#include <iostream>
#include <vector>
#include <string>

using namespace std;

int main(int argc, char **argv)
{
    const vector<string>
        args(argv, argv + argc);

    for 
    (
        vector<string>::const_iterator iter = args.begin();
            iter != args.end();
                ++iter
    )
        cout << *iter << " ";

    cout << endl;

    for 
    (
        vector<string>::const_reverse_iterator iter = args.rbegin();
            iter != args.rend();
                ++iter
    )
        cout << *iter << " ";

    cout << endl;
    
    return (0);
}

The examples also illustrate the use of plain pointers for iterators. The initialization vector<string> args(argv, argv + argc) provides the args vector with a pair of pointer-based iterators: argv points to the first element to initialize args with, argv + argc points just beyond the last element to be used, and incrementing the pointer reaches the next string. This is a general characteristic of pointers, which is why they too can be used in situations where iterators are expected.

The STL defines five types of iterators. These types recur in the generic algorithms, and in order to be able to create a particular type of iterator yourself it is important to know their characteristics. In general, it must be possible to

The example given with the RandomAccessIterator provides an approach towards iterators: look for the iterator that's required by the (generic) algorithm, and then see whether the data structure supports the required iterator or not. If not, the algorithm cannot be used with the particular data structure.

11.1.1: Insert iterators

The generic algorithms often require a target container into which the results of the algorithm are deposited. For example, the copy() algorithm has three parameters: the first two define the range of elements which are visited, and the third defines the first position where the result of the copy operation is to be stored. With the copy() algorithm the number of elements that are copied is normally known beforehand, since it is equal to the number of elements in the range defined by the first two parameters. With other algorithms the number of resulting elements may differ from the number of elements in the initial range. The generic algorithm unique_copy() is a case in point: the number of elements which are copied to the destination container is normally not known beforehand.

In situations like these, the inserter() adaptor functions may be used to create elements in the destination container when they are needed.

There are three inserter() adaptors:

11.1.2: istream iterators

The istream_iterator<Type>() can be used to define an iterator (pair) for an istream object or for a subtype of an istream. The general form of the istream_iterator<Type>() iterator is:
istream_iterator<Type> identifier(istream &inStream)
Here, Type is the type of the data elements that are to be read from the istream stream. Type may be any of the types for which the operator>>() is defined with istream objects.

The default (empty) constructor defines the end of the iterator pair, corresponding to end-of-stream. For example,

istream_iterator<string> endOfStream;
Note that the actual stream object which is specified for the begin-iterator is not mentioned here.

Using a back_inserter() and a set of istream_iterator<>()s all strings could be read from cin as follows:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>
#include <vector>

using namespace std;

int main()
{
    vector<string>
        vs;

    copy(istream_iterator<string>(cin), istream_iterator<string>(),
         back_inserter(vs));

    for 
    (
        vector<string>::iterator from = vs.begin();
            from != vs.end();
                ++from
    )
        cout << *from << " ";
    cout << endl;

    return (0);
}
In the above example, note the use of the anonymous versions of the istream_iterators. Especially note the use of the anonymous default constructor. Instead of using istream_iterator<string>() the (non-anonymous) construction

    istream_iterator<string>
            eos;

    copy(istream_iterator<string>(cin), eos, back_inserter(vs));
    
could have been used.

The istream_iterator iterators are available when the iterator header file is included. Some implementations also make them available when iostream is included, but this should not be relied upon.

11.1.3: ostream iterators

The ostream_iterator<Type>() can be used to define a destination iterator for an ostream object or for a subtype of an ostream. The general forms of the ostream_iterator<Type>() iterator are:
ostream_iterator<Type> identifier(ostream &outStream)
and
ostream_iterator<Type> identifier(ostream &outStream, char const *delimiter)
Type is the type of the data elements that are to be written to the ostream stream. Type may be any of the types for which the operator<<() is defined with ostream objects. The latter form of the ostream_iterators separates the individual Type data elements by delimiter strings. The former form does not use any delimiters.

The following example shows the use of istream_iterators and an ostream_iterator to copy the contents of one file to another file. A subtlety is the statement in.unsetf(ios::skipws): it clears the ios::skipws flag. The consequence of this is that the default behavior of the operator>>(), to skip whitespace, is modified. White space characters are simply returned by the operator, and the file is copied unrestrictedly. Here is the program:

#include <algorithm>
#include <fstream>
#include <iterator>

using namespace std;

int main(int argc, char **argv)
{
    ifstream
        in(argv[1]);

    in.unsetf(ios::skipws);

    ofstream
        out(argv[2]);

    copy(istream_iterator<char>(in), istream_iterator<char>(),
         ostream_iterator<char>(out));

    return (0);
}

The ostream_iterator iterators are available when the iterator header file is included. Some implementations also make them available when iostream is included, but this should not be relied upon.

11.2: The 'auto_ptr' class

One of the problems using pointers is that strict bookkeeping is required about the memory the pointers point to. When a pointer variable goes out of scope, the memory pointed to by the pointer is suddenly inaccessible, and the program suffers from a memory leak. For example, in the following code, a memory leak is introduced in which 200 int values remain allocated:

    #include <iostream>

    using namespace std;

    int main()
    {
        for (int idx = 0; idx < 200; ++idx)
        {
            int
                c,
                *ip;

            cin >> c;               // read an int
            ip = new int(c);        // ip points to int initialized to 'c'
        }                           // no delete-operation
        // ...                      // whatever comes next
        return (0);
    }

The standard way to prevent memory leakage is strict bookkeeping: the programmer has to make sure that the memory pointed to by a pointer is deleted just before the pointer variable dies. In the above example the repair would be:


    #include <iostream>

    using namespace std;

    int main()
    {
        for (int idx = 0; idx < 200; ++idx)
        {
            int
                c,
                *ip;

            cin >> c;               // read an int
            ip = new int(c);        // ip points to int initialized to 'c'
            delete ip;              // and delete the allocated memory again
        }
        // ...                      // whatever comes next
        return (0);
    }

When a pointer variable is used to point to a single value or object, the bookkeeping becomes less of a burden when the pointer variable is defined as an auto_ptr object. The template class auto_ptr is available when the header file memory is included.

Normally, an auto_ptr object is initialized to point to a dynamically created value or object. When the auto_ptr object goes out of scope, the memory pointed to by the object is automatically deleted, taking over the programmer's responsibility to delete memory.

Alternative forms to create auto_ptr objects are available as well, as discussed in the coming sections.

Note that

The class auto_ptr has several member functions which can be used to access the pointer itself and to have the auto_ptr point to another block of memory. These member functions are discussed in the following sections as well.

Note:

The memory header file which must be included to use the auto_ptr objects used to be incomplete. A modified memory header file which can be used to replace an incomplete file can be found at ftp://ftp.icce.rug.nl/pub/frank/windows/memory. This file can replace the memory file on Linux systems (found in /usr/include/g++), and on computers running MS-Windows (found in /usr/include/g++-3 on the current RedHat distribution of Cygnus), if problems are encountered when using the distributed memory file.

11.2.1: Defining auto_ptr variables

There are three ways to define auto_ptr objects. Each definition contains the usual <type> specifier between angle brackets. Concrete examples are given in the coming sections, but an overview of the various possibilities is presented here:

11.2.2: Pointing to a newly allocated object

The basic form to initialize an auto_ptr object is to pass its constructor a block of memory that's allocated by the new operator. The generic form is:
auto_ptr<type> identifier (new-expression);

For example, to initialize an auto_ptr to a string variable the construction

auto_ptr<string> strPtr (new string("Hello world"));
can be used. To initialize an auto_ptr to a double variable the construction
auto_ptr<double> dPtr (new double(123.456));
can be used.

Note the use of the operator new in the above expressions. The use of the operator new ensures the dynamic nature of the memory pointed to by the auto_ptr objects, and allows the deletion of the memory once the auto_ptr objects go out of scope. Also note that the type does not contain the pointer: the type used in the auto_ptr construction is the same type as used in the new expression.

In the example of the 200 int values given earlier, the memory leak can be avoided by using auto_ptr objects as follows:


    #include <iostream>
    #include <memory>

    using namespace std;

    int main()
    {
        for (int idx = 0; idx < 200; ++idx)
        {
            int
                c;

            cin >> c;               // read an int
            auto_ptr<int> ip (new int(c));
        }                           // no delete-operation needed
        return (0);
    }

Following each cycle of the for loop, the memory allocated by the new int(c) expression is deleted automatically.

All member functions that are available for objects that are allocated by the new expression (like the string object in the first example in this section) can be reached via the auto_ptr as if it were a plain pointer to the dynamically allocated object. E.g., to insert some text beyond the word Hello in the string pointed to by strPtr, an expression like

strPtr->insert(strPtr->find_first_of(" ") + 1, "C++ ");
can be used.

11.2.3: Pointing to another auto_ptr

Another form to initialize an auto_ptr object is to initialize it from another auto_ptr object for the same type. The generic form is:
auto_ptr<type> identifier (other auto_ptr object);

For example, to initialize an auto_ptr to a string variable, given the strPtr variable defined in the previous section, the construction

auto_ptr<string> newPtr(strPtr);
can be used.

A comparable construction can be used with the assignment operator in expressions. One auto_ptr object may be assigned to another auto_ptr object of the same type. For example:


    #include <iostream>
    #include <memory>
    #include <string>

    using namespace std;

    int main()
    {
        auto_ptr<string> 
            hello(new string("Hello world")),
            hello2(hello),
            hello3(new string("Another string"));

        hello3 = hello2;
        return (0);
    }        

Looking at the above example, we see that hello is initialized as described in the previous section. A new expression is used to allocate a string variable dynamically. Next, hello2 is initialized to hello, which is possible, as they are auto_ptr objects of the same types. However, in order to prevent problems when either object goes out of scope, special measures are required.

If the program were to stop here, both hello and hello2 would go out of scope. But only hello2 would point to the dynamically allocated string Hello world: once an auto_ptr object is used to initialize another auto_ptr object, the former (initializing) object does not refer anymore to the allocated string. The string is now `owned' by the latter (initialized) object.

A comparable action takes place in the assignment statement hello3 = hello2. Here, prior to the actual assignment, the memory pointed to by hello3 is deleted automatically. Then hello3 gains the ownership of the string Hello world, and hello2 cannot be used anymore to reach the string Hello world.

11.2.4: Creating a plain auto_ptr

The third form to create an auto_ptr object simply creates an empty auto_ptr object that does not point to a particular block of memory:
auto_ptr<type> identifier;

In this case the underlying pointer is set to 0 (zero). Since the auto_ptr object itself is not the pointer, its value cannot be compared to 0 to see if it has not been initialized. E.g., code like


    auto_ptr<int>
        ip;

    if (!ip)
        cout << "0-pointer with an auto_ptr object ?" << endl;

will not produce any output (actually, it won't compile either...). So, how do we inspect the value of the pointer that's maintained by the auto_ptr object? For this the member get() is available. This member function, as well as the other member functions of the class auto_ptr are described in the following sections.

11.2.5: The get() member function

The member function get() of an auto_ptr object returns the underlying pointer. The value returned by get() is a pointer to the underlying data-type. It may be inspected: if it's zero the auto_ptr object does not point to any memory.

The member function get() cannot be used to let the auto_ptr object point to (another) block of memory. Instead the member function reset(), discussed in the next section, should be used.

11.2.6: The reset() member function

The member function reset() of an auto_ptr object can be used to (re)assign a block of memory allocated by the operator new to an auto_ptr. The function reset() does not return a value.

An example of its use is:


    auto_ptr<string>
        str;

    str.reset(new string("Hello"));         // assignment of a value
    str.reset(new string("Hello world"));   // reassignment of a value
 
The object that is assigned to the pointer using reset() must have been allocated using the new operator. The object the pointer points to just before applying reset() is deleted first. The value 0 can be passed to reset() if the object pointed to by the pointer should be deleted. Following reset(0) the pointer variable has been reinitialized.

Note that it is usually more efficient to use a reassignment member function of the object pointed to by the pointer if the only purpose of the exercise is to redefine the value of the object. For example, the string class supports a function assign() which may be used for that purpose. So, a construction like:


        auto_ptr<string>
            aps(new string("Hello"));

        aps.reset(new string("Hello world"));
 
can more efficiently be implemented as:

        auto_ptr<string>
            aps(new string("Hello"));

        aps->assign("Hello world");
 

11.2.7: The release() member function

As we saw in section 11.2.3, when an auto_ptr is assigned to another auto_ptr, the pointer providing the value loses its value and is reinitialized to 0. The member function release() performs this transfer of ownership explicitly, without requiring another auto_ptr object as the receiver.

The release() member function returns the underlying pointer used by the auto_ptr object, and releases the ownership of the object at the same time. The ownership can then be taken over by another auto_ptr variable (or, indeed, by any other pointer).

In the following example an auto_ptr (second) is initialized, and its string is then handed over to another auto_ptr (first) using release() and reset(). With the auto_ptr as defined by the C++ standard, release() sets the underlying pointer of second to 0, so second must not be dereferenced anymore. Since second no longer owns the string, the string is not deleted when second goes out of scope, and first can still be used to reach it.

    #include <iostream>
    #include <memory>
    #include <string>

    using namespace std;

    int main()
    {
        auto_ptr<string>
            first;

        {
            auto_ptr<string>
                second(new string("Hello world"));

            first.reset(second.release());

            cout << "Second auto_ptr now holds a 0-pointer: "
                 << (second.get() == 0) << endl
                 << "First auto_ptr points to: " << *first << endl;
        }
        cout << "Second object now out of scope. First auto_ptr\n"
            "still points at: " << *first << endl;
    }

11.3: The Generic Algorithms

The following sections describe the generic algorithms in alphabetical order. For each algorithm the following information is provided: In the prototypes of the algorithms Type is used to specify a generic (i.e., template) datatype. The particular kind of iterator that is required is mentioned, and possibly other generic types, e.g., performing BinaryOperations, like plus<Type>().

Almost every generic algorithm has as its first two arguments an iterator range [first, last), defining the range of elements on which the algorithm operates.

11.3.1: accumulate()

11.3.2: adjacent_difference()

11.3.3: adjacent_find()

11.3.4: binary_search()

11.3.5: copy()

11.3.6: copy_backward()

11.3.7: count()

11.3.8: count_if()

11.3.9: equal()

11.3.10: equal_range()

11.3.11: fill()

11.3.12: fill_n()

11.3.13: find()

11.3.14: find_if()

11.3.15: find_end()

11.3.16: find_first_of()

11.3.17: for_each()

11.3.18: generate()

11.3.19: generate_n()

11.3.20: includes()

11.3.21: inner_product()

11.3.22: inplace_merge()

11.3.23: iter_swap()

11.3.24: lexicographical_compare()

11.3.25: lower_bound()

11.3.26: max()

11.3.27: max_element()

11.3.28: merge()

11.3.29: min()

11.3.30: min_element()

11.3.31: mismatch()

11.3.32: next_permutation()

11.3.33: nth_element()

11.3.34: partial_sort()

11.3.35: partial_sort_copy()

11.3.36: partial_sum()

11.3.37: partition()

11.3.38: prev_permutation()

11.3.39: random_shuffle()

11.3.40: remove()

11.3.41: remove_copy()

11.3.42: remove_if()

11.3.43: remove_copy_if()

11.3.44: replace()

11.3.45: replace_copy()

11.3.46: replace_if()

11.3.47: replace_copy_if()

11.3.48: reverse()

11.3.49: reverse_copy()

11.3.50: rotate()

11.3.51: rotate_copy()

11.3.52: search()

11.3.53: search_n()

11.3.54: set_difference()

11.3.55: set_intersection()

11.3.56: set_symmetric_difference()

11.3.57: set_union()

11.3.58: sort()

11.3.59: stable_partition()

11.3.60: stable_sort()

11.3.61: swap()

11.3.62: swap_ranges()

11.3.63: transform()

11.3.64: unique()

11.3.65: unique_copy()

11.3.66: upper_bound()

11.3.67: Heap algorithms

A heap is a form of binary tree represented as an array. In the standard heap, the key of an element is greater or equal to the key of its children. This kind of heap is called a max heap.

A tree in which numbers are keys could be organized as follows:

figure 11: A binary tree representation of a heap

This tree can be organized in an array as follows:

12, 11, 10, 8, 9, 7, 6, 1, 2, 4, 3, 5

Here, 12 is the top node. Its children are 11 and 10, both less than 12. 11, in turn, has 8 and 9 as its children, while the children of 10 are 7 and 6. 8 has 1 and 2 as its children, 9 has 4 and 3, and finally, 7 has left child 5. 7 doesn't have a right child, and 6 has no children.

Note that the left and right branches are not ordered: 8 is less than 9, but 7 is larger than 6.

The heap is formed by traversing a binary tree level-wise, starting from the top node. The top node is 12, at the zeroth level. At the first level we find 11 and 10. At the second level 6, 7, 8 and 9 are found, etc.

Heaps can be created in containers supporting random access. So, a heap is not, for example, constructed in a list. Heaps can be constructed from an (unsorted) array (using make_heap()). The top-element can be pruned from a heap, followed by reordering the heap (using pop_heap()), a new element can be added to the heap, followed by reordering the heap (using push_heap()), and the elements in a heap can be sorted (using sort_heap(), which invalidates the heap, though).

The following subsections introduce the prototypes of the heap algorithms; the final subsection provides a small example in which the heap algorithms are used.

11.3.67.1: make_heap()

11.3.67.2: pop_heap()

11.3.67.3: push_heap()

11.3.67.4: sort_heap()

11.3.67.5: A small example using the heap algorithms

#include <algorithm>
#include <iostream>
#include <functional>
#include <iterator>

using namespace std;

void show(int *ia, char const *header)
{
    cout << header << ":\n";
    copy(ia, ia + 20, ostream_iterator<int>(cout, " "));
    cout << endl;
}

int main()
{
    int
        ia[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 
                11, 12, 13, 14, 15, 16, 17, 18, 19, 20};

    make_heap(ia, ia + 20);
    show(ia, "The values 1-20 in a max-heap");

    pop_heap(ia, ia + 20);
    show(ia, "Removing the first element (now at the end)");

    push_heap(ia, ia + 20);
    show(ia, "Adding 20 (at the end) to the heap again");

    sort_heap(ia, ia + 20);
    show(ia, "Sorting the elements in the heap");


    make_heap(ia, ia + 20, greater<int>());
    show(ia, "The values 1-20 in a heap, using > (and beyond too)");

    pop_heap(ia, ia + 20, greater<int>());
    show(ia, "Removing the first element (now at the end)");

    push_heap(ia, ia + 20, greater<int>());
    show(ia, "Adding 1 (at the end) to the heap again");

    sort_heap(ia, ia + 20, greater<int>());
    show(ia, "Sorting the elements in the heap");

    return (0);
}