Preface

This document contains the lecture notes for the course MCS 572, Introduction to Supercomputing, at the University of Illinois at Chicago.

This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License.

The first runs of the course followed the book of Wilkinson and Allen. The book by Kirk, Hwu, and El Hajj is our main reference for acceleration on Graphics Processing Units (GPUs).

The goal of the course is to study the design and analysis of parallel algorithms and their implementation using message passing, multithreading, multitasking, and acceleration. We study the application of parallel programs to solve scientific problems.

Three Different Types of Parallelism

We distinguish between three different types of parallel computers:

  1. Distributed Memory Parallel Computers

  2. Shared Memory Parallel Computers

  3. General Purpose Graphics Processing Units

for which we have three corresponding programming models:

  1. Message Passing

  2. Multithreading and Multitasking

  3. Data Staging Algorithms

Load balancing algorithms are introduced for distributed memory and shared memory parallel computers. Partitioning and divide-and-conquer strategies are applied in the design of parallel algorithms. Tools and models to evaluate the performance of parallel programs are the task graph, isoefficiency, and the roofline model. Pipelining is a common technique in the design of parallel programs.
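
To make the performance measures concrete, speedup and efficiency, which underlie the isoefficiency analysis, can be computed as in the minimal Python sketch below. The timings are hypothetical and serve only to illustrate the formulas.

    def speedup(t_seq, t_par):
        """Speedup S(p) = T(1)/T(p), the ratio of the sequential
        running time to the running time on p processors."""
        return t_seq / t_par

    def efficiency(t_seq, t_par, p):
        """Efficiency E(p) = S(p)/p, which is at most one."""
        return speedup(t_seq, t_par) / p

    # hypothetical timings: 10.0 seconds sequentially, 1.4 seconds on 8 processors
    print(speedup(10.0, 1.4))        # prints about 7.14
    print(efficiency(10.0, 1.4, 8))  # prints about 0.89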

Programming Languages

The course is not a programming course, but a computational course. Familiarity with computers and programming is assumed.

  1. Python allows for high level parallel programming. The package mpi4py enables distributed memory parallel programming via message passing, and kernels can be launched for GPU execution using PyCUDA; see the sketches after this list.

  2. Julia has a MATLAB-like syntax, offers support for message passing via the package MPI.jl, and provides tools for multithreading and multitasking. One nice feature of the Julia ecosystem is its support for vendor-agnostic GPU acceleration.

  3. C++ achieves performance portability. Programming in C and C++ requires greater attention to the memory model.
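
As a first illustration of message passing with mpi4py, the minimal sketch below prints the rank of every process. It assumes a working MPI installation and the mpi4py package; the file name hello_mpi.py is hypothetical.

    from mpi4py import MPI

    comm = MPI.COMM_WORLD    # the default communicator, holding all processes
    rank = comm.Get_rank()   # the identification number of this process
    size = comm.Get_size()   # the total number of processes
    print(f"Hello from process {rank} of {size}")

Saved as hello_mpi.py, the program could be run on four processes with mpiexec -n 4 python hello_mpi.py.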
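For GPU acceleration, the sketch below compiles and launches a kernel with PyCUDA, assuming a CUDA-capable GPU and a working PyCUDA installation; the kernel name double_them is illustrative.

    import numpy as np
    import pycuda.autoinit    # creates a context on the first GPU
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    # compile a CUDA kernel that doubles every element of an array
    mod = SourceModule("""
    __global__ void double_them(float *a)
    {
        int i = threadIdx.x;
        a[i] = 2.0f*a[i];
    }
    """)
    double_them = mod.get_function("double_them")

    a = np.ones(32, dtype=np.float32)
    # copy a to the GPU, run 32 threads in one block, copy the result back
    double_them(drv.InOut(a), block=(32, 1, 1), grid=(1, 1))
    print(a)    # all entries are now 2.0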

It is important to emphasize that we are using programming languages to run short parallel programs, or to call code from software libraries.

Bibliography

  1. Barry Wilkinson and Michael Allen: Parallel Programming. Techniques and Applications Using Networked Workstations and Parallel Computers. Pearson Prentice Hall, second edition, 2005.

  2. David B. Kirk, Wen-mei W. Hwu, Izzat El Hajj: Programming Massively Parallel Processors. A Hands-on Approach. Elsevier/Morgan Kaufmann Publishers, fourth edition, 2023.