Parallel Sorting Methods: Bucket Sort

Our first application of data partitioning and divide-and-conquer methods is the design of parallel sorting methods. Recall that quicksort is one of the best sorting algorithms. The standard C library provides the sorting function qsort, named after quicksort; see use_qsort.c for an example of how to use it. The function qsort is extremely fast, but it cannot be applied directly to huge sequences, say over one billion numbers: qsort needs all numbers in one array, and it is no longer reasonable to allocate so many numbers in one array.
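As a reminder of the calling convention, here is a minimal sketch of sorting an array of doubles with qsort. It is not the content of use_qsort.c; the names compare_doubles and sort_doubles are chosen here for illustration only.

```c
#include <stdlib.h>

/* Comparison callback for qsort: returns a negative, zero, or positive
   value when *a is less than, equal to, or greater than *b. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper: sorts the n doubles in a into increasing order. */
void sort_doubles(double *a, size_t n)
{
    qsort(a, n, sizeof(double), compare_doubles);
}
```

The comparison is written as (x > y) - (x < y) rather than as x - y, because qsort expects an int and the difference of two doubles would be truncated.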

When we have to sort huge sequences, we naturally represent the sequence as a list of arrays, and then bucket sort is a natural sorting strategy. Bucket sort consists of two steps: (1) distribute the numbers into p buckets, so that all numbers in the i-th bucket are smaller than those in the (i+1)-th bucket; (2) apply qsort to every bucket. A basic implementation of this method is in the file bucket_sort.c. Bucket sort belongs to the category of "sorting by distribution", which also includes radix sort. It sorts sequences of size n in time O(n log(n)).
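The two steps can be sketched as follows. This is not the code of bucket_sort.c, but a minimal illustration under the assumption that all numbers lie in the interval [0,1), so that the i-th bucket collects the numbers in [i/p, (i+1)/p). The names bucket_index and bucket_sort, and the choice to copy the sorted buckets back into the input array, are assumptions made for this sketch.

```c
#include <stdlib.h>

/* Comparison callback for qsort on doubles. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Index of the bucket for x, assuming 0 <= x < 1:
   bucket i holds the numbers in [i/p, (i+1)/p). */
static size_t bucket_index(double x, size_t p)
{
    size_t i = (size_t)(x * (double)p);
    return (i < p) ? i : p - 1;   /* guard against rounding up to p */
}

/* Hypothetical sketch: sorts the n numbers in a, all assumed to lie
   in [0,1), with p >= 1 buckets, each stored in its own array.
   Returns 0 on success, -1 if a memory allocation fails. */
int bucket_sort(double *a, size_t n, size_t p)
{
    size_t *count = calloc(p, sizeof(size_t));     /* size of each bucket */
    size_t *fill = calloc(p, sizeof(size_t));      /* numbers placed so far */
    double **bucket = calloc(p, sizeof(double *)); /* the list of arrays */
    int status = -1;
    size_t i, j, k;

    if (count == NULL || fill == NULL || bucket == NULL) goto cleanup;

    /* step 1: count the bucket sizes, allocate, and distribute */
    for (k = 0; k < n; k++) count[bucket_index(a[k], p)]++;
    for (i = 0; i < p; i++) {
        if (count[i] > 0) {
            bucket[i] = malloc(count[i] * sizeof(double));
            if (bucket[i] == NULL) goto cleanup;
        }
    }
    for (k = 0; k < n; k++) {
        i = bucket_index(a[k], p);
        bucket[i][fill[i]++] = a[k];
    }

    /* step 2: apply qsort to every bucket and copy the buckets back in order */
    for (i = 0, k = 0; i < p; i++) {
        if (count[i] > 0)
            qsort(bucket[i], count[i], sizeof(double), compare_doubles);
        for (j = 0; j < count[i]; j++) a[k++] = bucket[i][j];
    }
    status = 0;

cleanup:
    if (bucket != NULL)
        for (i = 0; i < p; i++) free(bucket[i]);
    free(bucket);
    free(fill);
    free(count);
    return status;
}
```

For a truly huge sequence the input would itself arrive as a list of arrays and the sorted buckets would stay separate; the copy back into a is only there to keep the sketch short and easy to check.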

The suggestive notation of p for the number of buckets leads to an immediate parallel version of bucket sort, with p as the number of processors. The manager distributes the sequence among the processors so that every processor (including the manager node) has one bucket to sort. Every processor applies qsort to its bucket. The manager then gathers the sorted buckets. We showed that, when the size n of the sequence dominates p, an even distribution of the numbers into the buckets leads to an optimal speedup. Moreover, for sufficiently large n, the computation time (that is, the total cost of the comparisons) dominates the communication time.
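The speedup claim can be retraced with a short calculation. The cost model below is an assumption made for this sketch: sorting m numbers is taken to cost about m log_2(m) comparisons, every bucket holds exactly n/p numbers, and the costs of distribution and communication are ignored.

```latex
% Assumed cost model: sorting m numbers takes about m \log_2(m) comparisons;
% every bucket holds n/p numbers; distribution and communication are ignored.
\begin{align*}
  T_1 &= p \cdot \frac{n}{p} \log_2\!\left(\frac{n}{p}\right)
       = n \log_2\!\left(\frac{n}{p}\right)
       && \text{(one processor sorts all $p$ buckets)} \\
  T_p &= \frac{n}{p} \log_2\!\left(\frac{n}{p}\right)
       && \text{($p$ processors, one bucket each)} \\
  S_p &= \frac{T_1}{T_p} = p
\end{align*}
% Compared with one qsort on the entire sequence instead:
\[
  \frac{n \log_2(n)}{\frac{n}{p} \log_2\!\left(\frac{n}{p}\right)}
  = p \, \frac{\log_2(n)}{\log_2(n) - \log_2(p)}
  \longrightarrow p
  \quad \text{relative to } p, \text{ as } n \to \infty \text{ for fixed } p .
\]
```

Under these assumptions the ratio log_2(n)/(log_2(n) - log_2(p)) tends to 1 when n dominates p, so both comparisons give a speedup of about p.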

Encouraged by this promising analysis of the performance of this parallel bucket sort, we left the coding of it as an exercise.

Bibliography