Oct 26

Treap Add

Recall the add pseudocode from last time

add_helper(newNode, v):
    # Adds newNode into the tree rooted at v
    # returns the new root of the tree, which could be different from v

    if v == nil:
        return newNode

    if newNode.x == v.x:
        v.x = newNode.x # overwrite the existing element; recall x could be a pair of a key and a value, so this updates the stored value

    elif newNode.x < v.x:
        # Insert newNode to the left.  The root of the left subtree could change,
        # so the return value of the recursive call will replace v.left
        v.left = add_helper(newNode, v.left)

        # If the heap property is violated, rotate right
        if v.left.p < v.p:
            v = rotate_right(v)

    else:
        # Insert newNode to the right.  The root of the right subtree could change,
        # so the return value of the recursive call will replace v.right
        v.right = add_helper(newNode, v.right)

        # If the heap property is violated, rotate left
        if v.right.p < v.p:
            v = rotate_left(v)

    return v
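
The pseudocode relies on rotate_right and rotate_left but does not show them. Below is a minimal C++ sketch of the two rotations, assuming an illustrative Node struct with a key x, a priority p, and left/right child pointers; the names and types here are placeholders and need not match treap.cpp.

// Illustrative node type for the sketches in these notes; treap.cpp may
// use different names and store a (key, value) pair instead of an int.
struct Node {
    int x;                   // key (or key/value pair)
    int p;                   // priority, chosen at random when the node is created
    Node *left = nullptr;
    Node *right = nullptr;
};

// Rotate the subtree rooted at v to the right: v's left child becomes the
// new root of the subtree and v becomes its right child.  Returns the new root.
Node* rotate_right(Node* v) {
    Node* u = v->left;
    v->left = u->right;
    u->right = v;
    return u;
}

// Mirror image of rotate_right: v's right child becomes the new root.
Node* rotate_left(Node* v) {
    Node* u = v->right;
    v->right = u->left;
    u->left = v;
    return u;
}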

Definition We say that \(v\) is the root of a treap if the subtree rooted at \(v\) satisfies the following two properties:

1. (Binary search tree property) For every node \(u\) in the subtree, every \(x\) value in \(u\)'s left subtree is less than \(u.x\) and every \(x\) value in \(u\)'s right subtree is greater than \(u.x\).
2. (Heap property) For every node \(u\) in the subtree and every child \(w\) of \(u\), \(u.p \le w.p\); in other words, the priorities form a min-heap with the smallest priority at the root.

Lemma If \(v\) is the root of a treap and \(v'\) is the return value of \(add\_helper(newNode, v)\), then \(v'\) is the root of a treap containing all the original elements plus the element newNode.

Proof The proof is by induction on the size of the tree. In the base case the tree is empty and add_helper returns newNode, which by itself is a treap containing only the new element. For the inductive step, by the inductive hypothesis the recursive call returns the root of a treap containing the original elements of that subtree plus newNode. Reattaching that subtree as a child of \(v\) preserves the binary search tree property, and the rotation (performed only when the child's priority is smaller than \(v.p\)) restores the heap property at \(v\) without disturbing the binary search tree order.

Running time

The add method has running time \(O(height)\). Since the expected height of the tree is \(O(\log n)\), the expected running time of add is \(O(\log n)\). Also, add uses \(O(\log n)\) stack space, since each recursive call has a frame on the call stack. The iterative version of add in the book also has an expected running time of \(O(\log n)\), but uses only \(O(1)\) space. It obtains this smaller space by using a parent pointer stored at each node. The downside is that these parent pointers take \(O(n)\) space to store, so there is a tradeoff between less space used during add and more space used overall in the nodes. Note that the iterative version could instead drop the parent pointers and use an explicit stack during add, which brings the space used during add to \(O(\log n)\) without the extra space in every node (a sketch of this idea appears below). Similarly, if the recursive version above had parent pointers available, it could be rewritten to be tail-recursive, in which case it would use only \(O(1)\) stack space.
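
To make the explicit-stack idea concrete, here is a hedged sketch of an iterative add that keeps a stack of pointers to the child links along the search path instead of using recursion or parent pointers. It reuses the illustrative Node struct and rotation helpers from the earlier sketch; it is not the book's implementation, and add_iterative is a made-up name.

#include <vector>

// Sketch only: assumes the Node, rotate_left, and rotate_right definitions
// from the earlier sketch.  newNode is assumed to already have a random
// priority and nil children.
void add_iterative(Node*& root, Node* newNode) {
    std::vector<Node**> path;            // child slots visited on the way down
    Node** slot = &root;
    while (*slot != nullptr) {
        Node* v = *slot;
        if (newNode->x == v->x) {
            v->x = newNode->x;           // duplicate key: update the stored element
            delete newNode;              // this sketch takes ownership of newNode
            return;
        }
        path.push_back(slot);
        slot = (newNode->x < v->x) ? &v->left : &v->right;
    }
    *slot = newNode;                     // attach the new node as a leaf

    // Walk back up the stored path, rotating while the heap property is violated.
    while (!path.empty()) {
        Node** parentSlot = path.back();
        path.pop_back();
        Node* v = *parentSlot;
        if (v->left != nullptr && v->left->p < v->p)
            *parentSlot = rotate_right(v);
        else if (v->right != nullptr && v->right->p < v->p)
            *parentSlot = rotate_left(v);
        else
            break;                       // no violation here, so none above either
    }
}

The stack holds one entry per level of the search path, so the extra space during add is \(O(height)\), which is \(O(\log n)\) in expectation.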

Treap Remove

Here is a recursive version of remove.

remove_helper(x, v):
    # Removes x from subtree rooted at v
    # returns the root of the new subtree

    if v == nil:
        # x is not in the tree, so do nothing
        return v

    if x < v.x:
        v.left = remove_helper(x, v.left)
        return v
    elif x > v.x:
        v.right = remove_helper(x, v.right)
        return v
    else:
        # We have found the element to delete!
        
        # If the left or right subtree is missing, the other subtree can just replace v
        if v.left == nil:
            # Use the right subtree of v as the new root of this subtree
            return v.right
        elif v.right == nil:
            # Use the left subtree of v as the new root of this subtree
            return v.left
        else:
            # Both the left and right subtrees exist.  In this case we can't immediately
            # remove v; we need to push v closer to a leaf, which we can do by performing
            # a rotation.  The direction of the rotation is determined by which of the left
            # or right child should replace v to satisfy the heap property
            if v.left.p < v.right.p:
                # Rotate the tree right so that the left child of v becomes the root of the subtree
                t = rotate_right(v)
                # Now v (which is the element we are trying to delete) has moved into the right
                # subtree of t, so recursively delete it from there.
                t.right = remove_helper(x, t.right)
                return t
            else:
                # Rotate the tree left so that the right child of v (which has smaller p) becomes
                # the root of the subtree
                t = rotate_left(v)
                # v is now in the left subtree of t, so remove it from there.
                t.left = remove_helper(x, t.left)
                return t

remove(x):
    root = remove_helper(x, root)

Lemma If \(v\) is the root of a treap and \(v'\) is the return value of \(remove\_helper(x, v)\), then \(v'\) is the root of a treap containing all the original elements except the element equal to \(x\) if such an element existed.

Proof The proof is again by induction on the size of the tree, along the same lines as the proof for add.

Running time

remove runs in time \(O(height)\), so the expected running time is \(O(\log n)\). Similar to add, as written the space usage of remove is \(O(\log n)\). But if we examine the remove_helper function, it is almost tail-recursive: the only thing that happens after a recursive call returns is that we replace the parameter we passed in with the return value. For example, here is a snippet from the code:

v.left = remove_helper(x, v.left)

We pass \(v.left\) into the recursive call and then replace \(v.left\) with whatever is returned. If in our C++ implementation we passed \(v.left\) by reference, the value in \(v.left\) could just be updated directly. (Recall that a parameter passed by reference can be directly edited by the callee.) If we do change the parameter of remove_helper to a reference, the function becomes tail-recursive, in which case the stack usage drops to \(O(1)\).
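
As a concrete illustration, here is a hedged C++ sketch of what a reference-parameter, tail-recursive remove_helper might look like. It reuses the illustrative Node struct and rotation helpers from the earlier sketch; the actual code in treap.cpp may differ in names and details.

// Sketch only: v is a reference to the pointer stored in the parent (or to
// the root pointer), so assigning to v updates the tree directly and every
// recursive call is in tail position.
void remove_helper(int x, Node*& v) {
    if (v == nullptr)
        return;                          // x is not in the tree, so do nothing

    if (x < v->x) {
        remove_helper(x, v->left);       // tail call
    } else if (x > v->x) {
        remove_helper(x, v->right);      // tail call
    } else if (v->left == nullptr) {
        Node* old = v;
        v = v->right;                    // right subtree replaces v
        delete old;                      // this sketch frees the removed node
    } else if (v->right == nullptr) {
        Node* old = v;
        v = v->left;                     // left subtree replaces v
        delete old;
    } else if (v->left->p < v->right->p) {
        v = rotate_right(v);             // left child (smaller priority) becomes the root
        remove_helper(x, v->right);      // the old root is now in the right subtree
    } else {
        v = rotate_left(v);              // right child becomes the root
        remove_helper(x, v->left);       // the old root is now in the left subtree
    }
}

The top-level remove(x) would then simply call remove_helper(x, root), passing the root pointer itself by reference.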

Implementation

treap.cpp is my implementation. The add function is essentially directly translated into C++. The remove functions use the above trick to obtain a tail-recursive version; the C++ contains some comments explaining in more detail how the parameter to remove_helper works as a reference.

I had to compile with -O2 in order for g++ to perform tail-call optimization on remove_helper. To check that it in fact worked, I compiled with g++ -O2 -g -std=c++11 treap.cpp. I then opened a.out inside gdb and placed a breakpoint on line 156 (which is the first line of the remove_helper function). Each time the breakpoint was hit, I checked the backtrace and saw that it contained only a single remove_helper frame.

Exercise

Write pseudocode for an extract_min method on treaps. The extract_min method takes no parameters. It locates the entry in the treap with the smallest \(x\) value, removes it, and returns the removed \(x\) value. What is the running time of your pseudocode? Next, translate your pseudocode to C++ and add it to my treap.cpp implementation.