The only major downside to the space efficient linked lists we discussed the past few days is pseudocode such as the following:
for i from 0 to n-1 {
x = mylist.get(i)
do something with x
}
Assuming the "do something with x" runs in time \(O(1)\) and mylist is a space efficient linked list, this entire loop will run in time \(O(\sum_i \frac{\min(i,n-i)}{b})\) so if we keep \(b = O(\sqrt{n})\), it will run in time \(O(n^{3/2})\). The above loop is slow because it calls get
over and over and get
must walk down the linked list of blocks each time it is called. Knowing the structure, we can write a more efficient loop that loops through each element in the list:
blk = dummy.next
while (blk != dummy) {
for j from 0 to blk.size() - 1 {
x = jth entry from blk
do something with x
}
blk = blk.next
}
Again assuming "do something with x" runs in time \(O(1)\), the above code will run in time \(O(n)\), better than the \(O(n^{3/2})\). But we don't want the user to know about blocks or anything about the internal structure of our data structure so we do not want them writing a loop like this. Therefore, we modify the list ADT to add a new operation that each implementation (including space efficient linked lists) can the implement. Different languages use different approaches to modifying the list ADT.
One common approach, used in languages like Haskell, JavaScript, recent versions of C# with LINQ, and many others is to provide a list ADT operation called map
. (Languages use various names, but one common name is map
.) For example, javascript's map.
The list ADT operation map
accepts a single parameter f
which is a function. The map
operation then calls the function f
on every element of the list. On a space efficient linked list, we could implement this in pseudocode as follows:
map(f):
blk = dummy.next
while (blk != dummy) {
for j from 0 to blk.size() - 1 {
x = jth entry from blk
f(x)
}
blk = blk.next
}
A user that wants to perform some operation on each element in the list creates a function and then passes it to map. As long as f
is \(O(1)\), the execution time of map(f)
will be \(O(n)\) and the user did not need to know the details of how map
is implemented. Note that some implementations of map such as the javascript one I linked above also return a new list with the values replaced by the output of f
.
A second common approach, used in languages such as Rust, Go, and Python, is to use a coroutine or generator. For example, in python we could yield x
from inside the loop. If you recall generators from python, we could write code such as
make_generator(f):
blk = dummy.next
while (blk != dummy):
for j in range(0, blk.size()):
x = blk.get(j)
yield x
blk = blk.next
If you don't know python generators, that is OK, I won't specifically ask about coroutines or generators on an exam.
Finally, the technique C++ uses is called iterators. The idea is that there is a list ADT method called begin
which when called returns a new data value called an iterator. We can think of the iterator as "referencing" a specific element in the list. The iterator has methods get()
, set()
, next()
, and prev()
. In our space efficient linked list implementation, we could write code such as
struct ListIterator {
private:
shared_ptr<Block> dummy;
shared_ptr<Block> blk;
int j;
int block_array_size;
public:
ListIterator(shared_ptr<Block> d, int bsize) {
dummy = d;
blk = dummy->next;
j = 0;
block_array_size = bsize;
}
T get() {
return blk->get(j, block_array_size);
}
void set(T x) {
blk->set(j, x, block_array_size);
}
void next() {
if (j == blk->n - 1) {
//move to the next block
blk = blk->next;
j = 0;
} else {
// just increment j
j += 1;
}
}
bool isAtEnd() {
return blk == dummy;
}
void prev() {
//write me
}
}
and then our list could implement begin
as
ListIterator begin() {
return ListIterator(dummy, block_array_size);
}
public
part of the List
structure insideselist.cpp
. Then, add a begin
method to the List structure.
Fill in the prev
method on the ListIterator
.
Write some test code to loop through the elements of a list using the iterator. You use List<sometype>::ListIterator myiter
to define a variable containing a list iterator. Note the colons, which are used to refer to structs nested inside other structs.