4.13 Aside: Joining BSTs |

Occasionally we may want to take a pair of BSTs and merge or “join”
their contents, forming a single BST that contains all the items in
the two original BSTs. It's easy to do this with a series of calls to
*bst_insert*(), but we can optimize the process if we write a function
exclusively for the purpose. We'll write such a function in this
section.

There are two restrictions on the trees to be joined. First, the
BSTs' contents must be disjoint. That is, no item in one may match
any item in the other. Second, the BSTs must have compatible
comparison functions. Typically, they are the same. Speaking more
precisely, if *f*() and *g*() are the comparison functions, *p* and
*q* are nodes in either BST, and *r* and *s* are the BSTs'
user-provided extra comparison parameters, then the expressions
*f*(*p*, *q*, *r*), *f*(*p*, *q*, *s*), *g*(*p*, *q*, *r*), and *g*(*p*, *q*, *s*) must all have the same value for all possible choices of *p* and
*q*.

Suppose we're trying to join the trees shown below:

Our first inclination is to try a “divide and conquer” approach by
reducing the problem of joining *a* and *b* to the subproblems of
joining *a*'s left subtree with *b*'s left subtree and joining *a*'s
right subtree with *b*'s right subtree. Let us postulate for the
moment that we are able to solve these subproblems and that the
solutions that we come up with are the following:

To convert this partial solution into a full solution we must combine
these two subtrees into a single tree and at the same time reintroduce
the nodes *a* and *b* into the combined tree. It is easy enough to do
this by making *a* (or *b*) the root of the combined tree with these
two subtrees as its children, then inserting *b* (or *a*) into the
combined tree. Unfortunately, in neither case will this actually work
out properly for our example. The diagram below illustrates one
possibility, the result of combining the two subtrees as the child of
node 4, then inserting node 7 into the final tree. As you can see,
nodes 4 and 5 are out of order:^{1}

Now let's step back and analyze why this attempt failed. It was
essentially because, when we recombined the subtrees, a node in the
combined tree's left subtree had a value larger than the root. If we
trace it back to the original trees to be joined, we see that this was
because node 5 in the left subtree of *b* is greater than *a*. (If we
had chosen 7 as the root of the combined tree we would have found
instead node 6 in the right subtree of *b* to be the culprit.)

On the other hand, if every node in the left subtree of *a* had a
value less than *b*'s value, and every node in the right subtree of
*a* had a value greater than *b*'s value, there would be no problem.
Hey, wait a second... we can force that condition. If we perform
a root insertion (see Root Insertion in a BST) of *b* into subtree
*a*, then we end up with one pair of subtrees whose node values are
all less than 7 (the new and former left subtrees of node 7) and one
pair of subtrees whose node values are all greater than 7 (the new and
former right subtrees of node 7). Conceptually it looks like this,
although in reality we would need to remove node 7 from the tree on
the right as we inserted it into the tree on the left:

We can then combine the two subtrees with values less than 7 with each other, and similarly for the ones with values greater than 7, using the same algorithm recursively, and safely set the resulting subtrees as the left and right subtrees of node 7, respectively. The final product is a correctly joined binary tree:

Of course, since we've defined a join recursively in terms of itself, there must be some maximum depth to the recursion, some simple case that can be defined without further recursion. This is easy: the join of an empty tree with another tree is the second tree.

It's easy to implement this algorithm recursively. The only
nonobvious part of the code below is the treatment of node *b*. We
want to insert node *b*, but not *b*'s children, into the subtree
rooted at *a*. However, we still need to keep track of *b*'s
children. So we temporarily save *b*'s children as *b0* and *b1* and
set its child pointers to `NULL` before the root insertion.

This code makes use of *root_insert*() from <Robust root insertion of existing node in arbitrary subtree 627>.

97. <BST join function, recursive version 97> = /* Joinsaandb, which are subtrees oftree,

and returns the resulting tree. */staticstructbst_node*join(structbst_table*tree,structbst_node*a,structbst_node*b)

{if(b==NULL)returna;elseif(a==NULL)returnb;else

{structbst_node*b0=b->bst_link[0];structbst_node*b1=b->bst_link[1];b->bst_link[0] =b->bst_link[1] =NULL;root_insert(tree, &a,b);a->bst_link[0] =join(tree,b0,a->bst_link[0]);a->bst_link[1] =join(tree,b1,a->bst_link[1]);returna; } } /* Joinsaandb, which must be disjoint and have compatible

comparison functions.bis destroyed in the process. */voidbst_join(structbst_table*a,structbst_table*b)

{a->bst_root=join(a,a->bst_root,b->bst_root);a->bst_count+=b->bst_count;free(b); }

**See also:**
[Sedgewick 1998], program 12.16.

**Exercises:**

**1.** Rewrite *bst_join*() to avoid use of recursion.
[answer]