Oct 7

Lecture Overview

Iteration in the list ADT

The only major downside to the space efficient linked lists we discussed the past few days is pseudocode such as the following:

for i from 0 to n-1 {
    x = mylist.get(i)
    do something with x
}

Assuming the "do something with x" runs in time \(O(1)\) and mylist is a space efficient linked list, this entire loop will run in time \(O(\sum_i \frac{\min(i,n-i)}{b})\) so if we keep \(b = O(\sqrt{n})\), it will run in time \(O(n^{3/2})\). The above loop is slow because it calls get over and over and get must walk down the linked list of blocks each time it is called. Knowing the structure, we can write a more efficient loop that loops through each element in the list:

blk = dummy.next
while (blk != dummy) {
    for j from 0 to blk.size() - 1 {
        x = jth entry from blk
        do something with x
    }
    blk = blk.next
}

Again assuming "do something with x" runs in time \(O(1)\), the above code will run in time \(O(n)\), better than the \(O(n^{3/2})\). But we don't want the user to know about blocks or anything about the internal structure of our data structure so we do not want them writing a loop like this. Therefore, we modify the list ADT to add a new operation that each implementation (including space efficient linked lists) can the implement. Different languages use different approaches to modifying the list ADT.

Map

One common approach, used in languages like Haskell, JavaScript, recent versions of C# with LINQ, and many others is to provide a list ADT operation called map. (Languages use various names, but one common name is map.) For example, javascript's map.

The list ADT operation map accepts a single parameter f which is a function. The map operation then calls the function f on every element of the list. On a space efficient linked list, we could implement this in pseudocode as follows:

map(f):
    blk = dummy.next
    while (blk != dummy) {
        for j from 0 to blk.size() - 1 {
            x = jth entry from blk
            f(x)
        }
        blk = blk.next
    }

A user that wants to perform some operation on each element in the list creates a function and then passes it to map. As long as f is \(O(1)\), the execution time of map(f) will be \(O(n)\) and the user did not need to know the details of how map is implemented. Note that some implementations of map such as the javascript one I linked above also return a new list with the values replaced by the output of f.

Coroutines/Generators

A second common approach, used in languages such as Rust, Go, and Python, is to use a coroutine or generator. For example, in python we could yield x from inside the loop. If you recall generators from python, we could write code such as

make_generator(f):
    blk = dummy.next
    while (blk != dummy):
        for j in range(0, blk.size()):
            x = blk.get(j)
            yield x
        blk = blk.next

If you don't know python generators, that is OK, I won't specifically ask about coroutines or generators on an exam.

Iterators

Finally, the technique C++ uses is called iterators. The idea is that there is a list ADT method called begin which when called returns a new data value called an iterator. We can think of the iterator as "referencing" a specific element in the list. The iterator has methods get(), set(), next(), and prev(). In our space efficient linked list implementation, we could write code such as

struct ListIterator {
    private:
      shared_ptr<Block> dummy;
      shared_ptr<Block> blk;
      int j;
      int block_array_size;

    public:

      ListIterator(shared_ptr<Block> d, int bsize) {
          dummy = d;
          blk = dummy->next;
          j = 0;
          block_array_size = bsize;
      }

       T get() {
           return blk->get(j, block_array_size);
       }

       void set(T x) {
           blk->set(j, x, block_array_size);
       }

       void next() {
           if (j == blk->n - 1) {
               //move to the next block
               blk = blk->next;
               j = 0;
           } else {
               // just increment j
               j += 1;
           }
       }

       bool isAtEnd() {
           return blk == dummy;
       }

       void prev() {
           //write me
       }
}

and then our list could implement begin as

ListIterator begin() {
    return ListIterator(dummy, block_array_size);
}

Exercise

Copy the ListIterator structure into the public part of the List structure inside
selist.cpp. Then, add a begin method to the List structure.
Fill in the prev method on the ListIterator.
Write some test code to loop through the elements of a list using the iterator. You use List<sometype>::ListIterator myiter to define a variable containing a list iterator. Note the colons, which are used to refer to structs nested inside other structs.