Oct 9

Lecture Overview

The final issue to mention with the space efficient linked lists is that my existing implementation from a few days ago does not properly deallocate memory! Did you notice? I delayed discussing it until now because properly deallocating memory requires a technique that works with iterators and I did not want to introduce all the issues at once. The worry is that we deallocate the memory for the list, but there is still an iterator refering to the memory that used to hold our data.

The problem

Why does the existing implementation not deallocate memory properly? A shared_ptr uses what we call a reference count to determine when the memory should be reclaimed. Each time a new shared_ptr is created the reference count is incremented and each time a shared_ptr is reclaimed the count is decremented. Only when the count reaches zero is the memory pointed to by the pointer reclaimed. But if you consider our space efficient list implementation, we had this doubly-linked list of blocks. Each block had shared_ptrs to the next and previous blocks and each of these shared_ptrs caused the reference count to be increased. For the moment, ignore iterators and think of the reference counts on each block. The dummy block will have a count of three, since there is one shared_ptr to it living inside the List structure, and the first and last actual block also both have shared_ptrs to the dummy block. Each non-dummy block will have a reference count of two, since there are shared_ptrs to it from the neighboring blocks in the list. Now when the List structure is reclaimed the destructor for the shared_ptr to dummy executes, which decrements dummy's count from three down to two. Since this is not zero, nothing is deleted. The memory for these blocks is then kept forever since there is no longer any way for our code to reference the blocks, and none of the reference counts will reach zero.

The problem is a pointer cycle. The reference counting implemented by shared_ptrs will fail exactly when there exists a sequence of shared_ptrs such that if you follow the shared_ptrs one after another you come back to where you start. Such a cycle of shared_ptrs causes the reference counters to never reach zero so the memory is never deallocated. The solution is to use a weak_ptr to break the cycle. A weak_ptr is almost like a shared_ptr except that it does not increment the reference counter so will not keep the pointed to structure alive. Therefore, it is possible for the memory pointed to by a weak_ptr to be reclaimed so that there is no direct way of accessing the memory pointed to by a weak_ptr. Instead, the weak_ptr must be converted to a shared_ptr before it can be accessed. The idea is that it is temporarily converted to a shared_ptr to carry out some operation (during which the memory pointed to by the shared_ptr will be kept around), and after the operation is over the temporary shared_ptr can be discarded, keeping only the weak pointer.

The fix

To properly deallocate the memory in the space efficient linked list, we will change the code as follows. First, each previous pointer will become a weak_ptr. This isn't quite enough because we still have a pointer cycle by just following next until we come back around to the dummy. There are several techniques to break this cycle, but the only one that plays nice with iterators is the following: create two dummy blocks, one for the start and one for the end. This makes the list no longer circular.

struct List {

    struct Block {
        shared_ptr<Block> next;
        weak_ptr<Block> prev;
        ...
    }

    struct ListIterator {
        shared_ptr<Block> dummyStart;
        shared_ptr<Block> dummyEnd;
        shared_ptr<Block> blk;
        int j;
        ...
    }


    shared_ptr<Block> dummyStart;
    shared_ptr<Block> dummyEnd;

    ....

    List() {
        dummyStart = make_shared<Block>(0);
        dummyEnd = make_shared<Block>(0);
        dummyStart->next = dummyEnd;
        dummyEnd->prev = dummyStart;
    }
}

Ignore iterators for the moment. Each non-dummy block will have exactly one shared_ptr referring to it, coming from the block right before it in the list. The dummyStart block has exactly one shared_ptr to it coming from the List, and dummyEnd has exactly two pointers referring to it, one from List and one from the block preceding it in the list. When the List struct goes out of scope, dummyStart will have its reference count decremented to zero causing it to be reclaimed. The shared pointer next inside the dummy start Block will then execute, causing the reference count for the next block to decrement to zero. This cascades down the list, causing all blocks, including the dummyEnd block, to be deallocated.

Now consider iterators. Each iterator also holds a shared pointer to dummyStart and dummyEnd, so while the iterator exists the reference count for dummyStart and dummyEnd is incremented by one. Say the List is reclaimed while there is still a ListIterator on the stack somewhere. The shared_ptr to dummyStart living inside List is destructed but that does not decrement the reference count to zero so the dummy start block will be kept in memory which then keeps all other blocks in memory. Once the final ListIterator disappears, the reference count for the dummy start block will go to zero causing a ripple effect to deallocate all blocks in the list.

Exercise