Home > Articles > Programming > C/C++

  • Print
  • + Share This
From the author of Three-Legged Algorithms

Three-Legged Algorithms

Several algorithms in STL use three iterators: one for the beginning of input, one for the middle, and one for the end. For example, consider STL's nth_element and rotate, with the following (stylized) signatures:

void nth_element(RIt first, RIt mid, RIt last);
void rotate(FIt first, FIt mid, FIt last);

where RIt is a random access iterator and FIt is a forward iterator. mid must fall somewhere between first and last. nth_element is a useful algorithm that moves the smallest mid - first elements of the range toward the beginning, and makes mid point to exactly the (mid - first)-smallest element in the range. The trick is that nth_element does not sort anything—it just finds the nth smallest element (hence its name). Sorting the range and then looking at mid would achieve the same result, but nth_element does considerably less work than sort, important when handling large data sets. (nth_element is used in index searching and nearest-neighbors algorithms.)

rotate is one of my favorites but has a rather arcane name. It rearranges elements in the range (using standard math interval notation) [first, last) such that the elements in [mid, last) appear before the elements in [first, mid). Put another way, rotate is a bring-to-front operation: The [mid, last) portion is brought to the front of the range. Naively implemented, rotate could take a long time, but the algorithm is quite clever about minimizing data moves.

How can such functions be translated into range lingo? This was quite a conundrum that had me thinking for quite a while, until I realized a simple fact: Three-legged algorithms conceptually do not take three iterators. They take two ranges, left and right! The left-hand range is [first, mid), and the right-hand one is [mid, last). Armed with this simple fact, I first defined and implemented nth_element and rotate as follows:

void nth_element(RR left, RR right);
void rotate(FR left, FR right);

where RR is a random access range and FR is a forward range. If you want to call the functions for a given range, you just say, for example:

Range r = ...;
nth_element(r.slice(0, 5),
   r.slice(5, r.length));
rotate(r.slice(0, 5),
   r.slice(5, r.length));

Similar reasoning can be applied for all three-legged STL algorithms, such as partial_sort. This solution was satisfactory, and I was ready to leave it at that, when opportunity knocked at the door in the form of a neat generalization. Consider defining the functions above like this:

void nth_element(R1 left, R2 right);
void rotate(R1 left, R2 right);

where R1 and R2 are two arbitrary ranges that need not be adjacent and not even of the same category. (Adjacency does not inform the algorithms at all.) Suddenly, we have much more potent algorithms to play with. For example, nth_element not only can find the n smallest elements in one range; it can do so in two unrelated ranges sitting in memory at different places! Even better, R2 for nth_element does not need to be random access—an input range would suffice. The implementation of nth_element can detect that and use different algorithms depending on R2's capability.

Using two ranges instead of three iterators not only solves the problem, but offers additional functionality.

  • + Share This
  • 🔖 Save To Your Account