3.2 Sequential Search with Sentinel |
Try to think of some ways to improve the speed of sequential search. It should be clear that, to speed up a program, it pays to concentrate on the parts that use the most time to begin with. In this case, it's the loop.
Consider what happens each time through the loop:
If we could somehow eliminate one of these comparisons, the loop might be a lot faster. So, let's try... why do we need step 1? It's because, otherwise, we might run off the end of array[], causing undefined behavior, which is in turn because we aren't sure that key is in array[]. If we knew that key was in array[], then we could skip step 1.
But, hey! we can ensure that the item we're looking for is in the array. How? By putting a copy of it at the end of the array. This copy is called a sentinel (see sentinel), and the search technique as a whole is called sequential search with sentinel (see sequential search with sentinel). Here's the code:
18. <Sequentially search an array of ints using a sentinel 18> = /* Returns the smallest i such that array[i] == key,
or -1 if key is not in array[]. array[] must be an modifiable array of n ints
with room for a (n + 1)th element. */ int
seq_sentinel_search (int array[], int n, int key)
{ int *p; array[n] = key; for (p = array; *p != key; p++) /* Nothing to do. */; return p - array < n ? p - array : -1; }
This code is included in 602.
Notice how the code above uses a pointer, int *p, rather than a counter i as in <Sequentially search an array of ints 17> earlier. For the most part, this is simply a style preference: for iterating through an array, C programmers usually prefer pointers to array indexes. Under older compilers, code using pointers often compiled into faster code as well, but modern C compilers usually produce the same code whether pointers or indexes are used.
The return statement in this function uses two somewhat advanced features of C: the conditional or “ternary” operator ?: and pointer arithmetic. The former is a bit like an expression form of an if statement. The expression a ? b : c first evaluates a. Then, if a != 0, b is evaluated and the expression takes that value. Otherwise, a == 0, c is evaluated, and the result is the expression's value.
Pointer arithmetic is used in two ways here. First, the expression p++ acts to advance p to point to the next int in array. This is analogous to the way that i++ would increase the value of an integer or floating point variable i by one. Second, the expression p - array results in the “difference” between p and array, i.e., the number of int elements between the locations to which they point. For more information on these topics, please consult a good C reference, such as [Kernighan 1988].
Searching with a sentinel requires that the array be modifiable and large enough to hold an extra element. Sometimes these are inherently problematic—the array may not be modifiable or it might be too small—and sometimes they are problems because of external circumstances. For instance, a program with more than one concurrent thread (see thread) cannot modify a shared array for sentinel search without expensive locking.
Sequential sentinel search is an improvement on ordinary sequential search, but as it turns out there's still room for improvement—especially in the runtime for unsuccessful searches, which still always take n comparisons. In the next section, we'll see one technique that can reduce the time required for unsuccessful searches, at the cost of longer runtime for successful searches.
See also: [Knuth 1998b], algorithm 6.1Q; [Cormen 1990], section 11.2; [Bentley 2000], section 9.2.