The final issue to mention with the space efficient linked lists is that my existing implementation from a few days ago does not properly deallocate memory! Did you notice? I delayed discussing it until now because properly deallocating memory requires a technique that works with iterators and I did not want to introduce all the issues at once. The worry is that we deallocate the memory for the list, but there is still an iterator refering to the memory that used to hold our data.
Why does the existing implementation not deallocate memory properly? A shared_ptr
uses what we call a reference count to determine when the memory should be reclaimed. Each time a new shared_ptr
is created the reference count is incremented and each time a shared_ptr
is reclaimed the count is decremented. Only when the count reaches zero is the memory pointed to by the pointer reclaimed. But if you consider our space efficient list implementation, we had this doubly-linked list of blocks. Each block had shared_ptr
s to the next and previous blocks and each of these shared_ptr
s caused the reference count to be increased. For the moment, ignore iterators and think of the reference counts on each block. The dummy block will have a count of three, since there is one shared_ptr
to it living inside the List
structure, and the first and last actual block also both have shared_ptr
s to the dummy block. Each non-dummy block will have a reference count of two, since there are shared_ptr
s to it from the neighboring blocks in the list. Now when the List
structure is reclaimed the destructor for the shared_ptr
to dummy executes, which decrements dummy's count from three down to two. Since this is not zero, nothing is deleted. The memory for these blocks is then kept forever since there is no longer any way for our code to reference the blocks, and none of the reference counts will reach zero.
The problem is a pointer cycle. The reference counting implemented by shared_ptr
s will fail exactly when there exists a sequence of shared_ptr
s such that if you follow the shared_ptr
s one after another you come back to where you start. Such a cycle of shared_ptr
s causes the reference counters to never reach zero so the memory is never deallocated. The solution is to use a weak_ptr to break the cycle. A weak_ptr
is almost like a shared_ptr
except that it does not increment the reference counter so will not keep the pointed to structure alive. Therefore, it is possible for the memory pointed to by a weak_ptr
to be reclaimed so that there is no direct way of accessing the memory pointed to by a weak_ptr
. Instead, the weak_ptr
must be converted to a shared_ptr
before it can be accessed. The idea is that it is temporarily converted to a shared_ptr
to carry out some operation (during which the memory pointed to by the shared_ptr
will be kept around), and after the operation is over the temporary shared_ptr
can be discarded, keeping only the weak pointer.
To properly deallocate the memory in the space efficient linked list, we will change the code as follows. First, each previous pointer will become a weak_ptr
. This isn't quite enough because we still have a pointer cycle by just following next until we come back around to the dummy. There are several techniques to break this cycle, but the only one that plays nice with iterators is the following: create two dummy blocks, one for the start and one for the end. This makes the list no longer circular.
struct List {
struct Block {
shared_ptr<Block> next;
weak_ptr<Block> prev;
...
}
struct ListIterator {
shared_ptr<Block> dummyStart;
shared_ptr<Block> dummyEnd;
shared_ptr<Block> blk;
int j;
...
}
shared_ptr<Block> dummyStart;
shared_ptr<Block> dummyEnd;
....
List() {
dummyStart = make_shared<Block>(0);
dummyEnd = make_shared<Block>(0);
dummyStart->next = dummyEnd;
dummyEnd->prev = dummyStart;
}
}
Ignore iterators for the moment. Each non-dummy block will have exactly one shared_ptr
referring to it, coming from the block right before it in the list. The dummyStart
block has exactly one shared_ptr
to it coming from the List
, and dummyEnd
has exactly two pointers referring to it, one from List
and one from the block preceding it in the list. When the List
struct goes out of scope, dummyStart
will have its reference count decremented to zero causing it to be reclaimed. The shared pointer next
inside the dummy start Block
will then execute, causing the reference count for the next block to decrement to zero. This cascades down the list, causing all blocks, including the dummyEnd block, to be deallocated.
Now consider iterators. Each iterator also holds a shared pointer to dummyStart and dummyEnd, so while the iterator exists the reference count for dummyStart and dummyEnd is incremented by one. Say the List
is reclaimed while there is still a ListIterator
on the stack somewhere. The shared_ptr to dummyStart living inside List
is destructed but that does not decrement the reference count to zero so the dummy start block will be kept in memory which then keeps all other blocks in memory. Once the final ListIterator
disappears, the reference count for the dummy start block will go to zero causing a ripple effect to deallocate all blocks in the list.
dummyStart
block so there is something for each iterator to reference to keep the list alive, but do we need the dummyEnd
block? Write an argument either way, that is if you think the dummyEnd
block is needed, write a few sentences explaining why it is needed. If you think it is not needed, briefly describe how you could remove it and what would need to change in the list. Hint: how does the ListIterator know it is at the end of the list?