Merge and Quick Sort

We consider the application of the divide and conquer method to the problem of sorting.

Divide and Conquer Applied to Sorting

To solve a problem, divide it into smaller problems, solve the smaller problems, and use their solutions to solve the original problem. Recursion is natural: the base case is the trivial problem, and mathematical induction justifies the general case. Sorting is a classical, important problem. Consider various issues:

  • Data in memory or on file?
  • Procedural or functional solution?
  • Recursive or iterative algorithm?

Selection sort selects the minimum of the unsorted list and swaps the minimum to the front of the list. The selection is then applied to the remainder of the list (the list minus its first, minimal element), until the remainder is empty.

Consider the application of selection sort to the list L = [51, 26, 30, 49, 43, 27]. The list of sorted numbers is initialized to the empty list S = [].

L = [51, 26, 30, 49, 43, 27]; S = []
min(L) == 26; L.index(26) == 1
L = [51, 30, 49, 43, 27]; S = [26]
min(L) == 27; L.index(27) == 4
L = [51, 30, 49, 43]; S = [26, 27]
min(L) == 30; L.index(30) == 1
L = [51, 49, 43]; S = [26, 27, 30]
min(L) == 43; L.index(43) == 2
L = [51, 49]; S = [26, 27, 30, 43]
min(L) == 49; L.index(49) == 1
L = [51]; S = [26, 27, 30, 43, 49]
min(L) == 51; L.index(51) == 0
L = []; S = [26, 27, 30, 43, 49, 51]

We can define selection sort recursively or iteratively.

  • Selection Sort done recursively:
    • Base case: a list of length \(\leq 1\) is already sorted.
    • Otherwise, let m be the minimum of the list, denote by rest the list minus m, and return [m] + sort(rest).
  • Selection Sort done iteratively:
    • Let S be the sorted list, S = [].
    • As long as the unsorted list is not empty:
      1. Select the minimum from the unsorted list.
      2. Append the minimum to the sorted list S.
      3. Remove the minimum from the unsorted list.

A procedural version does not return a sorted list, but sorts the list given on input. We say that such a sort happens in place.

The function recursive_select() defines the recursive selection sort algorithm.

def recursive_select(data):
    """
    Returns a list with the data
    sorted in increasing order.
    Note: the input list is modified.
    """
    if len(data) <= 1:
        return data
    else:
        mindata = min(data)              # the smallest element
        data.pop(data.index(mindata))    # remove its first occurrence
        return [mindata] + recursive_select(data)

An iterative version of the selection sort is defined by the function iterative_select().

def iterative_select(data):
    """
    Returns a list with the data
    sorted in increasing order.
    Note: the input list is emptied.
    """
    result = []
    while len(data) > 0:
        mindata = min(data)              # the smallest element
        result.append(mindata)
        data.pop(data.index(mindata))    # remove its first occurrence
    return result

In considering the cost of a sorting algorithm, we count the number of comparisons and the number of moves. Memory access is often the bottleneck. Let us count the cost to sort n elements with selection sort:

  1. There are n steps in the algorithm.
  2. Every step requires looking for the minimum in a list of k elements, for \(k=n,n-1,\ldots,1\).

Adding up the number of comparisons:

\[{\rm \#comparisons:} \quad n-1 + n-2 + \cdots + 2 + 1 = \frac{n(n-1)}{2}\]

Selection sort thus performs \(O(n^2)\) many comparisons. What does a quadratic cost mean? Well, if we double the number of elements, then the amount of work is expected to increase fourfold.

If the list is already sorted, then no moves are needed, but on average it also takes \(O(n^2)\) moves to sort a list, because removing the minimum shifts all the elements behind it.
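To see the quadratic growth concretely, the following sketch (count_selection_comparisons is our own helper, not part of the notes) adds up the comparisons: finding the minimum of a list of k elements takes k - 1 comparisons.

```python
def count_selection_comparisons(n):
    """
    Returns the number of comparisons selection sort
    needs on n elements: finding the minimum of a
    list of k elements takes k - 1 comparisons.
    """
    return sum(k - 1 for k in range(2, n + 1))

print(count_selection_comparisons(100))  # 4950
print(count_selection_comparisons(200))  # 19900, about four times as much
```

Doubling the number of elements from 100 to 200 indeed quadruples the count, up to the lower order term.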

Merge Sort

We can do better than selection sort by divide and conquer: split the list in two equal halves, sort the halves, and then merge the sorted halves.

For example, consider the sorting of the list [23, 86, 41, 24, 69, 85, 30, 31].

[23, 86, 41, 24, 69, 85, 30, 31]
split
[23, 86, 41, 24]  [69, 85, 30, 31]
split
[23, 86]  [41, 24]  [69, 85]  [30, 31]
sort
[23, 86]  [24, 41]  [69, 85]  [30, 31]
merge
[23, 24, 41, 86]  [30, 31, 69, 85]
merge
[23, 24, 30, 31, 41, 69, 85, 86]

The merging of two sorted lists is defined in the function merge() below.

def merge(one, two):
    """
    Returns the merge of lists one and two,
    adding each time min(one[0], two[0]).
    """
    result = []
    while len(one) > 0 and len(two) > 0:
        if one[0] <= two[0]:
            result.append(one.pop(0))
        else:
            result.append(two.pop(0))
    return result + (one if len(one) > 0 else two)
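Note that merge() consumes its input lists and that every pop(0) shifts all remaining elements, a linear-time operation. A non-destructive variant (a sketch, not part of the original notes) walks both lists with indices instead:

```python
def merge_indexed(one, two):
    """
    Returns the merge of sorted lists one and two,
    leaving both input lists unchanged.
    """
    result = []
    i, j = 0, 0
    while i < len(one) and j < len(two):
        if one[i] <= two[j]:
            result.append(one[i])    # front of one is smallest
            i = i + 1
        else:
            result.append(two[j])    # front of two is smallest
            j = j + 1
    return result + one[i:] + two[j:]
```

For example, merge_indexed([1, 4, 7], [2, 3, 9]) returns [1, 2, 3, 4, 7, 9] and leaves both arguments unchanged.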

A recursive merge sort is defined by the function recursive_merge().

def recursive_merge(data, verbose=True):
    """
    Returns a list with the data
    sorted in increasing order.
    """
    if verbose:
        print('data =', data)
    if len(data) <= 1:
        return data
    else:
        middle = len(data)//2
        left = recursive_merge(data[:middle], verbose)
        right = recursive_merge(data[middle:], verbose)
        return merge(left, right)

To introduce an iterative version of merge sort, consider again the sorting of the list [23, 86, 41, 24, 69, 85, 30, 31].

view list as lists of singletons:
[[23], [86], [41], [24], [69], [85], [30], [31]]
merge lists of length 1:
[[23, 86], [24, 41], [69, 85], [30, 31]]
merge lists of length 2:
[[23, 24, 41, 86], [30, 31, 69, 85]]
merge lists of length 4:
[[23, 24, 30, 31, 41, 69, 85, 86]]

The iterative version keeps doubling the length of the merged sublists, for as long as that length does not exceed the length of the original list. In the function below, we assume the length of the list to sort is a power of two.

def iterative_merge(data):
    """
    Returns a list with the data
    sorted in increasing order.
    Assumes len(data) is a power of 2.
    """
    ind = 1
    hop = 2
    while hop <= len(data):
        k = 0
        while k+hop <= len(data):
        data[k:k+hop] = merge(data[k:k+ind],
                              data[k+ind:k+hop])
            k = k + hop
        ind = hop
        hop = 2*hop
    return data

Quick Sort

Quick sort partitions the list in two parts, based on a pivot. Numbers in the first part are less than the value of the pivot. Numbers in the second part are greater than or equal to the value of the pivot. Consider the sorting of [28, 14, 12, 31, 19, 20, 65, 41]. As pivot, we take the first element in the list.

[28, 14, 12, 31, 19, 20, 65, 41]
< 28:                    >= 28:
[14, 12, 19, 20]         [31, 65, 41]
< 14:    >= 14:          < 31:    >= 31:
[12]     [19, 20]        []       [65, 41]
[12] + [14] + [19, 20]   [] + [31] + [41, 65]
[12, 14, 19, 20] + [28] + [31, 41, 65]
[12, 14, 19, 20, 28, 31, 41, 65]

The function quick_sort() defines the recursive version of the quick sort algorithm.

def quick_sort(data, verbose=True):
    """
    Returns a list with the data
    sorted in increasing order.
    """
    if verbose:
        print('data =', data)
    if len(data) <= 1:
        return data
    else:
        (left, right) = ([], [])
        for k in range(1, len(data)):
            if data[k] < data[0]:
                left.append(data[k])
            else:
                right.append(data[k])
        result = quick_sort(left, verbose)
        result.append(data[0])
        return result + quick_sort(right, verbose)

Merge sort and quick sort cut lists in half in each stage. Let us consider their cost.

Consider the cost for merge sort:

  • Best case: if the data is already sorted, every merge exhausts one of its two lists with the fewest comparisons.
  • Worst case: if the two halves interleave, merging them into a list of length \(k\) takes \(k - 1\) comparisons.
  • Merge sort always needs \(2 \log_2(n)\) stages: \(\log_2(n)\) to split and \(\log_2(n)\) to merge. Every stage costs \(O(n)\), so the total cost is \(O(n \log_2(n))\).

Consider the cost for quick sort:

  • In the best case, the partitions are balanced and only \(\log_2(n)\) stages are needed.
  • In the worst case, the partitions are unbalanced and the cost grows to \(O(n^2)\).
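The unbalanced case is easy to trigger with the first element as pivot: on an already sorted list, every partition puts all remaining elements on one side. The sketch below (quick_sort_depth is our own helper, not part of the notes) counts the number of partition levels:

```python
def quick_sort_depth(data):
    """
    Returns the number of partition levels quick sort
    needs with the first element as pivot.
    """
    if len(data) <= 1:
        return 0
    pivot = data[0]
    left = [x for x in data[1:] if x < pivot]     # smaller than pivot
    right = [x for x in data[1:] if x >= pivot]   # at least the pivot
    return 1 + max(quick_sort_depth(left), quick_sort_depth(right))

print(quick_sort_depth([3, 1, 4, 1, 5, 9, 2, 6]))  # roughly balanced
print(quick_sort_depth(list(range(16))))           # sorted input: depth 15
```

On list(range(16)) the depth equals 15, one level per element, whereas a balanced partitioning of 16 elements would need only 4 levels.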

We can time Python code with the perf_counter() function of the time module. Its syntax is illustrated below.

import time
starttime = time.perf_counter()
< code to be timed >
stoptime = time.perf_counter()
elapsed = stoptime - starttime
print('elapsed time: %.2f seconds' % elapsed)

In the above, replace the < code to be timed > by a function call.
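For instance, a concrete sketch (the list size and the choice of the built-in sorted() are our own, for illustration) times the sorting of a random list:

```python
import random
import time

# a random list of one hundred thousand integers
data = [random.randint(0, 10**6) for _ in range(10**5)]
starttime = time.perf_counter()
result = sorted(data)
stoptime = time.perf_counter()
elapsed = stoptime - starttime
print('elapsed time: %.2f seconds' % elapsed)
```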

An alternative and more extensive way to time Python scripts is via the times() function of the os module. The total elapsed time consists of

  • the user cpu time: time spent on user process, and
  • the system time: time spent by operating system.

It is interesting to separate user cpu from system time, because a process which consumes a lot of memory will most likely also consume a lot of system time. The template to time code with os.times is below:

import os
a = os.times()
< code to be timed >
b = os.times()
print('user cpu time : %.4f' % (b[0] - a[0]))
print('  system time : %.4f' % (b[1] - a[1]))
print(' elapsed time : %.4f' % (b[4] - a[4]))

An intrinsic manner to quantify the cost of an algorithm is to count the operations, which gives a measure of the cost that does not depend on the hardware. As a byproduct of the computations, we could report the number of operations.

For sorting, we distinguish between the comparisons of data, and the assignments, which are considered as data movements.

We alter the code to make it more general:

  1. Replace the comparison operators <, ==, > by a compare function.
  2. Replace assignments by a copy function.

The functions compare and copy maintain tallies of the number of times they have been executed. This is similar to the memoization technique we applied to make recursive functions efficient.
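A minimal sketch of such instrumentation (the dictionary tallies and the function counted_select are our own names, not from the notes): the minimum is found by an explicit scan, so every comparison passes through compare and every data movement through copy.

```python
tallies = {'compares': 0, 'copies': 0}

def compare(a, b):
    """Returns True if a <= b; tallies the comparison."""
    tallies['compares'] += 1
    return a <= b

def copy(value):
    """Returns value unchanged; tallies the move."""
    tallies['copies'] += 1
    return value

def counted_select(data):
    """Iterative selection sort instrumented with compare and copy."""
    result = []
    data = list(data)                    # work on a copy of the input
    while len(data) > 0:
        idx = 0                          # index of the minimum so far
        for k in range(1, len(data)):
            if not compare(data[idx], data[k]):
                idx = k
        result.append(copy(data.pop(idx)))
    return result

result = counted_select([51, 26, 30, 49, 43, 27])
print(result)
print(tallies)  # {'compares': 15, 'copies': 6}
```

For six elements, the tally of 5 + 4 + 3 + 2 + 1 = 15 comparisons matches the formula \(n(n-1)/2\).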

Exercises

  1. Compare the recursive selection sort with the iterative version for longer lists. Which is most efficient? Justify.
  2. Modify the selection sort function so that it takes on input an array instead of a list. Instead of returning the sorted array, the array on input should be sorted.
  3. Compare running times for the iterative versions of selection sort and merge sort. How long must the list be before merge sort beats selection sort?
  4. Assume the data to sort is in an array. Change both versions of merge sort so that they do not return a sorted array but sort the array on input.
  5. Adjust the implementation of the iterative merge sort so it can also sort lists whose length is not a power of 2.
  6. Describe a linear cost algorithm to sort less than 100 natural numbers, all different from each other.