E0 234: Introduction to Randomized Algorithms, Spring 2021

Instructors: Siddharth Barman and Arindam Khan

TA: Anand Krishna

Time: Tuesdays & Thursdays, 11:00 AM-12:30 PM, Online (Teams).

Course Description

Tentative topics:

Lectures

Probability Refresher: Scribe notes from Toolkit, Handwritten notes.

Week 1 (Arindam): Introduction, Monte Carlo and Las Vegas Algorithms, Karger's Min-cut Algorithm, Coupon Collector, Quicksort.
[M-U Chapter 1, 2]
Notes.
Related Links: Randomized Complexity Classes (Arora-Barak), Karger-Stein paper, STOC'21 deterministic mincut paper, Anupam Gupta's talk on k-cut.

Week 2 (Arindam): Polynomial Identity Testing, Schwartz-Zippel Lemma, Isolation Lemma, MVV Perfect Matching Algorithm.
[M-U Chapter 1, and Notes]
Notes.
Related Links: Mulmuley-Vazirani-Vazirani paper, Field version of Schwartz-Zippel, Ola Svensson's talk on matching in Quasi-NC.

Week 3 (Arindam): Concentration Inequalities, Randomized Median Selection, Set Balancing, Balls and Bins.
[M-U Chapter 3, 4, 5]
Notes.
Related Links: Hashing, Load Balancing and Multiple Choice - an excellent source for Balls and Bins related results, Chazelle's book on Discrepancy Methods, Concentration Inequality survey by McDiarmid.

Week 4 (Arindam): Markov Chains; Random Walks; Algorithms for 2-SAT, 3-SAT, s-t connectivity.
[M-U Chapter 11, JA-YALE, LPW]
Notes.
Related Links: Derandomization of Schöning and k-SAT version, ETH and SETH.

Week 5 (Arindam): Monte Carlo Methods, Markov Chain Monte Carlo, Metropolis Algorithm, DNF Counting, Coupling.
[M-U Chapter 12, 13, JA-YALE, LPW]
Notes.

Week 6 (Arindam): VC Dimensions, Shattering dimension, Epsilon Net, Epsilon Sample, PAC and agnostic Learning.
[M-U Chapter 14 and SH-UIUC Chapter 20]
Notes.
Related Links: Set covers in finite VC-dimension, Epsilon-Nets and Simplex Range Queries.

Week 7 (Siddharth): Introduction to high-dimensional probability, Sub-Gaussian and Sub-Exponential distributions, Hoeffding's inequality, Bernstein's inequality.
[RV Chapter 2]
Notes.

Week 8 (Siddharth): Sub-Gaussian random vectors, Concentration of the norm, Johnson-Lindenstrauss Lemma (JL).
[RV Chapter 2 and Notes]
Notes(part1), Notes(part2).
Related Links: Sparse JL [Achlioptas 03], Fast JL [Ailon Chazelle 09].

Week 9 (Siddharth): JL Applications: Streaming Algorithms (AMS F2 estimation), Nearest Neighbor Search; Matrix Deviation Inequalities (MDI), Gaussian width, MDI tail inequality.
[RV Chapter 4,7,9 and Notes]
Notes(part1), Notes(part2).

Week 10 (Siddharth): MDI Applications: Spectra of random matrices, covariance estimation, random projection of sets, random sections of sets (M* bounds), Escape Theorem (Gordon), compressed sensing.
[RV Chapter 4,5,7,9,10 and Notes]
Notes(part1), Notes(part2).

Week 11 (Siddharth): MDI Applications: Community Detection, Spectral Clustering. Lovasz Local Lemma.
Notes. LLL Notes from Toolkits'20: Handwritten, Scribe.

Important Dates:

References

[M-U] Michael Mitzenmacher and Eli Upfal. Probability and Computing. Cambridge University Press, 2017.
[MR] Rajeev Motwani and Prabhakar Raghavan. Randomized Algorithms. Cambridge University Press.
[RK] R.M. Karp, An introduction to randomized algorithms, Discrete Applied Mathematics, 34, pp. 165-201, 1991.
[BHK] Avrim Blum, John Hopcroft, and Ravindran Kannan. Foundations of Data Science, 2020.
[RV] Roman Vershynin, High-Dimensional Probability.
[DP] D.B. Dubhashi, A. Panconesi, Concentration of Measure for the Analysis of Randomized Algorithms, Cambridge University Press, 2009.
[LPW] David A. Levin, Yuval Peres, Elizabeth L. Wilmer. Markov Chains and Mixing Times.
[MIT-YZ] Yufei Zhao. Lecture Notes (The Probabilistic Methods in Combinatorics), MIT, 2019.
[A-S] Noga Alon and Joel Spencer, The probabilistic method, John Wiley & Sons, 2004.
[UW-TR] Thomas Rothvoss, Lecture Notes (Probabilistic Combinatorics), U Washington, 2019.
[AC] Amit Chakrabarti, Data Stream Algorithms, 2020.
[SM] S. Muthukrishnan. Data streams: Algorithms and applications. Now Publishers Inc, 2005.
Various surveys and lecture notes.

IISc, 2016, by Arnab Bhattacharyya and Deeparnab Chakrabarty.
[JA-YALE] Yale, 2020, by James Aspnes.
[SH-UIUC] UIUC, 2018, by Sariel Har-Peled.
MIT, 2002, by David Karger.
UT Austin, 2020, by Eric Price.
UC Berkeley, 2003, by Luca Trevisan.
Columbia, 2019, by Tim Roughgarden.
Stanford, 2020, by Mary Wootters.
CMU, 1997, by Avrim Blum.
Weizmann, 2013, by Robert Krauthgamer and Moni Naor.
UMCP, 2017, by Aravind Srinivasan.
U Iowa, 2018, by Sriram V. Pemmaraju.
UBC, 2012, by Nick Harvey.
EPFL, 2014, by Friedrich Eisenbrand.
NUS, 2019, by Seth Gilbert.
Duke, 2013, by Kamesh Munagala.
NTHU, 2012, by Wing Kai Hon.
U Waterloo, 2019, by Gautam Kamath.
U Waterloo, 2018, by Lap Chi Lau.
U Washington, 2016, by James Lee.

Drunkard's Walk, Book by Leonard Mlodinow.
Veritasium Video, How We are Fooled By Probability: Regression to the Mean.
Vsauce Video, Birthday Paradox.
Numberphile Video, Monty Hall Problem.
Sunlight is way older than you think, An interesting application of Random Walks (Markov chains).
Veritasium Video, The Bayesian Trap.
MindYourDecisions Video, Buffon's Needle Problem: Pi from Probability.

Assignments

Project Topics

Derandomization

Sublinear time algorithms

Sketching

Survey by David Woodruff.

Survey by Radhakrishnan and Sudan.

Streaming Algorithms for Coin Tossing

VC dimension

Random Order Arrival (Online Algorithms)

Stochastic analysis of bin packing

Counting knapsack solutions

Expanders

Survey by Hoory, Linial, Wigderson.

Solving sparse linear systems

Peng-Vempala [SODA'21].

Minimax Theorem

Ben-David, Blais [FOCS'20].

Singular Value Decomposition (SVD)

Vempala, Santosh, and Grant Wang. "A spectral algorithm for learning mixtures of distributions." IEEE FOCS 2002.
The paper shows that a simple spectral algorithm learns mixtures of Gaussians with provable guarantees, and extends the results to mixtures of weakly isotropic distributions.

k-means clustering

Kumar, A., Sabharwal, Y., & Sen, S. "A simple linear time (1+epsilon)-approximation algorithm for k-means clustering in any dimensions." IEEE FOCS 2004.
This paper gives the first linear-time algorithm for k-means, using random subsampling of point sets.

Anti-concentration

Lovett, Shachar. "An elementary proof of anti-concentration of polynomials in Gaussian variables." Electronic Colloquium on Computational Complexity (ECCC). Vol. 17. 2010.
This gives us an elementary proof of anti-concentration for multilinear polynomials for a large class of distributions, which includes polynomials over Gaussian distributions.

Approximate Matrix Multiplication

Drineas, P., Kannan, R. and Mahoney, M.W., 2006. Fast Monte Carlo algorithms for matrices I: Approximating matrix multiplication. SIAM Journal on Computing, 36(1), pp.132-157.
This paper gives fast approximation algorithms for matrix multiplication via matrix sub-sampling, with several applications.
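
The sub-sampling idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's exact algorithm or analysis: it samples c column-row pairs with probability proportional to the product of their norms and rescales each sampled outer product so that the estimate is unbiased. The function name approx_matmul and its parameters are hypothetical and chosen only for this example.

```python
import math
import random


def approx_matmul(A, B, c, rng=None):
    """Estimate the product A @ B from c sampled column-row outer products.

    A is an m x n list of lists and B is an n x p list of lists.
    Index k is drawn with probability p_k proportional to
    |A[:, k]| * |B[k, :]|, and each sampled outer product is scaled by
    1 / (c * p_k), which makes the estimate unbiased.
    (Hypothetical helper for illustration only.)
    """
    rng = rng or random.Random()
    m, n, p = len(A), len(B), len(B[0])

    # Euclidean norms of the columns of A and the rows of B.
    col_norms = [math.sqrt(sum(A[i][k] ** 2 for i in range(m))) for k in range(n)]
    row_norms = [math.sqrt(sum(x * x for x in B[k])) for k in range(n)]

    weights = [col_norms[k] * row_norms[k] for k in range(n)]
    total = sum(weights)
    if total == 0:
        # Every column-row pair contributes nothing, so the product is zero.
        return [[0.0] * p for _ in range(m)]
    probs = [w / total for w in weights]

    estimate = [[0.0] * p for _ in range(m)]
    # Draw c indices independently, with replacement.
    for k in rng.choices(range(n), weights=probs, k=c):
        scale = 1.0 / (c * probs[k])
        for i in range(m):
            for j in range(p):
                estimate[i][j] += scale * A[i][k] * B[k][j]
    return estimate
```

In this sketch the work after computing the norms is proportional to c * m * p rather than n * m * p, so it only pays off when c is much smaller than n. How large c must be for a given error is the subject of the paper and is not reproduced here.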

Nearest Neighbor via LSH and p-Stable distributions

Indyk, P., Motwani, R., Raghavan, P., & Vempala, S. Locality-preserving hashing in multidimensional spaces. STOC 1997.
This paper introduces locality-preserving hashing, which has several applications in geometric optimization problems in high dimensions. It gives explicit constructions and lower bounds for these hash families.

Computing Gaussian Width

Meka, R. A PTAS for computing the supremum of Gaussian processes. IEEE FOCS 2012.
This gives a PTAS for computing the supremum of Gaussian processes.

Optimality of JL Lemma

Larsen, Kasper Green, and Jelani Nelson. "Optimality of the Johnson-Lindenstrauss lemma." IEEE FOCS 2017.
This paper shows lower bounds on the dimension necessary for any approximately isometry-preserving map.

Primality and Identity Testing via Chinese Remaindering, by Manindra Agrawal and Somenath Biswas.
Randomness Efficient Identity Testing of Multivariate Polynomials, by Adam Klivans and Daniel Spielman.

Intended audience:

Prerequisites: Mathematical maturity and a solid background in math (elementary combinatorics, graph theory, discrete probability, algebra, calculus) and theoretical computer science (big-O/Omega/Theta, P/NP, fundamental algorithms).

Grading: 40% HW, 30% Projects, 30% Final.

Prepared and Maintained by Arindam Khan