MCS 549 - Mathematical Foundations of Data Science
University of Illinois at Chicago
Fall 2022

This course covers the mathematical foundations of modern data science from a theoretical computer science perspective. Topics will include random graphs, small world phenomena, random walks, Markov chains, streaming algorithms, clustering, graphical models, singular value decomposition, and random projections.

Basic Information

Syllabus: pdf
Time and Location: M-W-F 12:00-12:50PM, 302 Addams Hall (AH)
Instructor Contact Information: Lev Reyzin, SEO 417
Online Textbook: Avrim Blum, John Hopcroft, and Ravi Kannan, Mathematical Foundations of Data Science
Office Hours: T 9:00-9:50AM (online), F 10:00-10:50AM (SEO 417)
Piazza site: link

Problem Sets

problem set 1, due 10/3/22

Lectures and Readings

Lecture 1 (8/22/22)
covered material: intro to the course, preview of the material
reading: chapter 1

Lecture 2 (8/24/22)
covered material: some concentration inequalities, intro to geometry in high dimensions
reading: chapters 2.1 - 2.3

Lecture 3 (8/26/22)
covered material: properties of the unit ball, sampling from the unit ball
reading: chapters 2.4 - 2.5

Lecture 4 (8/29/22)
covered material: Gaussian annulus theorem, separating Gaussians, fitting a spherical Gaussian to data
reading: chapters 2.6, 2.8, and 2.9

Lecture 5 (8/31/22)
covered material: random projection theorem, Johnson-Lindenstrauss lemma
reading: chapter 2.7

Lecture 6 (9/2/22)
covered material: singular value decomposition (SVD), best-fit subspaces, and optimality of the greedy algorithm
reading: chapters 3.1 - 3.4

Lecture 7 (9/7/22)
covered material: power iteration, SVD for clustering mixtures of Gaussians
reading: chapter 3.7, begin chapter 3.9

Lecture 8 (9/9/22)
covered material: centering data
reading: finish chapter 3.9

Lecture 9 (9/12/22)
covered material: introduction to random graphs, counting triangles
reading: chapter 8.1

Lecture 10 (9/14/22)
covered material: first and second moment methods for showing phase transitions
reading: begin chapter 8.2

Lecture 11 (9/16/22)
covered material: sharp threshold for diameter, Hamiltonian cycles example
reading: finish chapter 8.2

Lecture 12 (9/19/22)
covered material: isolated vertices, increasing graph properties
reading: chapter 8.5

Lecture 13 (9/21/22)
covered material: Molloy-Reed condition for non-uniform degrees
reading: chapter 8.8

Lecture 14 (9/23/22)
covered material: growth models with and without preferential attachment
reading: chapter 8.9

Lecture 15 (9/26/22)
covered material: intro to Markov chains, stationary distribution, Fundamental Theorem of Markov Chains
reading: chapter 4 intro, 4.1

Lecture 16 (9/28/22)
covered material: Markov chain Monte Carlo (MCMC), Metropolis-Hastings, Gibbs sampling
reading: chapter 4.2

Lecture 17 (9/30/22)
covered material: MCMC for efficient sampling, volume estimation of convex bodies
reading: chapter 4.3

Lecture 18 (10/3/22)
covered material: mixing time of random walks on undirected graphs, normalized conductance
reading: chapter 4.4.1