Back to July Calendar

Previous Entry Next Entry→

September 29, 2005

And Now a Word About Algorithms

Algophobia (al-guh-FO-bee-uh) noun

Usually having a phobia might brand you as a nut but here is a phobia that indicates you're a regular human being, if you have it. Algophobia is the fear of pain. Though the word indicates an unusual, morbid fear of pain, producing intense anxiety.

There is even an instrument called an algometer to measure pain. Now I know why they called that grueling course "algorithms" in my computer science curriculum.

We got the word from the Greek algo- (pain) and -phobia (fear).

-Anu Garg (gargATwordsmith.org)

STL's algorithms can be used generically across a variety of containers.   Inserting, deleting, searching, sorting and other algorithms are appropriate for some or all of the STL containers. 

The 70 or so algorithms of STL operate on container elements only indirectly through iterators. Many algorithms operate on sequences of elements defined by pairs of iterators—a first iterator pointing to the first element of the sequence and a second iterator pointing to one element past the last element of the sequence. Also, you can create new algorithms that operate on the STL containers and iterators 

Algorithms often return iterators. Algorithm find(), for example, locates an element and returns an iterator to it, or, if it's not found, find() returns the end() iterator. The find() algorithm can be used with any STL container.  If an algorithm uses less powerful iterators, the algorithm can also be used with containers that support more powerful iterators. Some algorithms demand powerful iterators; e.g., sort demands random-access iterators.

Sequence Containers

The basic fact about human existence is not that it is a tragedy, but that it is a bore.  It is not so much a war as an endless standing in line.  ~H.L. Mencken

The STL provides three sequence containers—vector, list and deque.  Classes vector and deque are based on arrays. Class list implements a linked-list data structure like the List template class I fussed over a week or so ago. 

Class vector is a kind of “smart” array.  A vector can change size dynamically and one vector can be assigned to another. This is not possible with pointer-based, C-like arrays, because those array names are constant pointers and cannot be the targets of assignments. Just as with C arrays, vector subscripting does not perform automatic range checking, but class vector does provide this capability via member function at.

Some general observations:

It is relatively expensive to insert (or delete) an element in the middle (or front) of a vector—the entire portion of the vector after the insertion (or deletion) point must be moved, because vector elements occupy contiguous cells in memory just as do C or C++ “raw” arrays. So, when possible, insert at the back.

Applications that require frequent insertions and deletions at both ends of a container normally use a deque rather than a vector. Although we can insert and delete elements at the front and back of both, class deque is more efficient than vector for doing insertions and deletions at the front.

Applications with frequent insertions and deletions in the middle and/or at the extremes of a container normally use a list, due to its efficient implementation of insertion and deletion anywhere in the data structure.

In light of these observations, it seems that my partitions data structure might be best described as a vector, since we only ever really need to adjust entries on one end?  Hmmm.  We'll see, maybe.

Sequence containers have several common operations—front to return a reference to the first element in the container, back to return a reference to the last element in the container, push_back to insert a new element at the end of the container and pop_back to remove the last element of the container.

All STL algorithms can operate on a vector. The iterators for a  vector are normally implemented as pointers to elements of the  vector. Each of the STL algorithms that take iterator arguments requires those iterators to provide a minimum level of functionality. If an algorithm requires a forward iterator, for example, that algorithm can operate on any container that provides  forward iterators, bidirectional iterators or random-access iterators. As long as the container supports the algorithm’s minimum iterator functionality, the algorithm can operate on the container.

Here's an example illustrating some of the functions of the vector class template and comparing its usage with that of an array.  

We have std::vector< int > integers; that defines integers to be an instance of class vector storing int values.  When instantiated, an empty vector of size and capacity 0 is created.  The capacity is the number of elements that can be stored without allocating more memory (the dreaded "malloc," now known as new.)

Outputting integers.size() and integers.capacity() demonstrate these functions.  Function size—available in every container—returns the number of elements currently stored in the container. Function capacity returns the number of elements that can be stored in the vector before the vector dynamically resizes itself to accommodate more elements.

The function calls integers.push_back( 22 ) and so on use function push_back—available in all sequence containers—to add an element to the end of the vector. If an element is added to a full vector, the vector increases its size—some STL implementations have the vector double its size.  The change in size and capacity resulting from these pushes is illustrated by printing them out again.  The reason why the capacity goes to 4 when the size increases to 3 is that the capacity is doubled when an element is pushed onto a full vector.  I guess this reduces the number of times the new function has to be called (which, mercifully, is handled by the vector function automatically.)

Scrolling down to the first for loop, find a nice demonstration of how to output the contents of an array using pointers and pointer arithmetic. Compare this with the call to printVector, which uses iterators to output the contents of the integers vector. 

Notice how the function template printVector is declared in the prototype before the main function.  It receives a const reference to a vector (integers2) as its argument. In the definition of the function template a const_iterator called constIterator is defined that iterates through the vector and outputs its contents. A const_iterator enables the program to read the elements of the vector, but does not allow the program to modify the elements. The for structure in the printVector function initializes constIterator using vector member function begin, which returns a const_iterator to the first element in the vector—there is another version of begin that returns an iterator that can be used for non-const containers. Note that a const_iterator is returned because the identifier integers2 was declared const in the parameter list of function printVector.  The loop continues as long as constIterator has not reached the end of the vector. This is determined by comparing constIterator to the result of integers2.end(), which returns an iterator indicating the location after the last element of the vector.  If constIterator is equal to this value, the end of the vector has been reached.  Functions begin and end are available for all first-class containers.  The body of the loop dereferences iterator constIterator to get the value in the current element of the vector.  Remember that the iterator acts like a pointer to the element and that operator * is overloaded to return the value of the element. The expression constIterator++ positions the iterator to the next element of the vector.

After the printVector function call, a reverse_iterator is declared that the following for sturcture uses to iterate through a vector backwards. Functions rbegin (i.e., the iterator for the starting point for iterating in reverse through the container) and rend (i.e., the iterator for the ending point for iterating in reverse through the container) delineate the range of elements to output in reverse. As with functions begin and end, rbegin and rend can return a const_reverse_iterator or a reverse_iterator based on whether or not the container is constant.  All first-class containers support this type of iterator

// Demonstrating standard library vector class template.
#include <iostream>

using std::cout;
using std::cin;
using std::endl;

#include <vector> // vector class-template definition

// prototype for function template printVector
template < class T >
void printVector( const std::vector< T > &integers2 );

int main()  {
   const int SIZE = 6;
   int array[ SIZE ] = { 1, 2, 3, 4, 5, 6 };

   std::vector< int > integers;

   cout << "The initial size of integers is: "
        << integers.size()
        << "\nThe initial capacity of integers is: "
        << integers.capacity();

   // function push_back is in every sequence collection
   integers.push_back( 22 );
   integers.push_back( 33 );
   integers.push_back( 44 );

   cout << "\nThe size of integers is: " << integers.size()
        << "\nThe capacity of integers is: "
        << integers.capacity();

   cout << "\n\nOutput array using pointer notation: ";

   for ( int *ptr = array; ptr != array + SIZE; ++ptr )
      cout << *ptr << ' ';

   cout << "\nOutput vector using iterator notation: ";
   printVector( integers );

   cout << "\nReversed contents of vector integers: ";

   std::vector< int >::reverse_iterator reverseIterator;

   for ( reverseIterator = integers.rbegin();
         reverseIterator!= integers.rend();
         ++reverseIterator )
      cout << *reverseIterator << ' ';

   cout << endl;

   return 0;

} // end main

// function template for outputting vector elements
template < class T >
void printVector( const std::vector< T > &integers2 )
{
   std::vector< T >::const_iterator constIterator;

   for ( constIterator = integers2.begin();
         constIterator != integers2.end();
         constIterator++ )
      cout << *constIterator << ' ';

} // end function printVector

Here's the output of this program:

The initial size of integers is: 0
The initial capacity of integers is: 0
The size of integers is: 3
The capacity of integers is: 4

Output array using pointer notation: 1 2 3 4 5 6
Output vector using iterator notation: 22 33 44
Reversed contents of vector integers: 44 33 22