CS365: Foundations of Data Science (Spring’22)



Teaching Fellow

  • TF: Mr. Ryan Yu
  • Email: ryu1@bu.edu
  • Teaching labs every Monday (attendance is mandatory; attend your assigned session!)
  • Office hours: Mon (PSY B53): 12:20 – 2:15 PM; Wed (via Zoom): 5:30 – 7:00 PM

    Zoom links and passcodes are available on Piazza.

Piazza website

CS365 Github


Students taking this class must have taken:

  • CS 112
  • CS 131 (MA293)
  • CS 132 (MA242) 
  • and CS 237 (MA581) or equivalent.

The instructor's consent is required to take the class; without it, you will not receive a final grade. CS 330 is highly recommended but, unlike the courses listed above, is not mandatory.


Topics will include probability, information theory, linear algebra, calculus, Fourier analysis, and graph theory, with a strong focus on their applicability to analyzing datasets. Finally, two lectures will be devoted to data management, specifically the classic relational model, SQL, and Datalog. A detailed syllabus is available on Piazza, along with the code to sign up on Gradescope.


No need to buy a textbook. There will be assigned readings from the following books, which are available online (click for the PDF):

  1. Foundations of Data Science by Avrim Blum, John Hopcroft, Ravi Kannan
  2. Understanding Machine Learning: From theory to algorithms by Shai Shalev-Shwartz and Shai Ben-David
  3. Introduction to Probability for Data Science by Stanley Chan
  4. Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong.


The class assumes familiarity with programming. The recommended languages are Python 3 and Julia; R and MATLAB are also good choices. Other languages (C, C++, Java, etc.) are welcome but not recommended for this class.


Note: at the end of each lecture, you will find the assigned readings. The readings marked with a magnifying glass are mandatory. The rest is supplementary material for those who are further interested and have the time to devote to it.

  • Lecture 1 (1/20): data visualization – introduction, class logistics, types of data, basics of data visualization
    Slides available here
  • Lecture 2 (1/25): probability I – review of prerequisite material, and other basic concepts through problem solving
    Slides available here.
  • Lecture 3 (1/27): probability II – convergence of random variables, probability inequalities, Weak law of large numbers, confidence intervals
    Slides available here.
  • Lecture 4 (2/1): probability III – π estimation via a randomized algorithm, Central Limit Theorem, moment generating functions, Chernoff bounds
    Slides available here.
  • Lecture 5 (2/3): probability IV, statistical inference I, machine learning I – Bayes’ rule, Naive Bayes classifier
    Slides available here.
  • Lecture 6 (2/8): probability V, statistical inference II, machine learning II – denoising images using Bayes’ rule
    Slides available here.
  • Lecture 7 (2/10): probability VI, statistical inference III – concentration of measure (cont.), sampling theorem
    Slides available here.
  • Lecture 8 (2/15): statistical inference IV – method of moments, MLE, Bayesian inference, MAP
    Slides available here.
  • Lecture 9 (2/17): statistical inference V – EM algorithm for parametric inference
    Slides available here.
  • Midterm 2/24
  • Lecture 10 (3/1): streaming algorithms I – streaming model, missing number puzzle, reservoir sampling, moment estimation problem
    Slides available here.
  • Lecture 11 (3/3): streaming algorithms II – F1 estimation using Morris counters
    Slides available here.
  • Lecture 12 (3/15): streaming algorithms III – k-wise independence, F0, F2 estimation
    Slides available here and here.
  • Lecture 13 (3/17): dimensionality reduction I, machine learning III – distance functions, k-nearest neighbors classifier, Johnson-Lindenstrauss lemma
    Slides available here.
  • Lecture 14 (3/22): linear algebra I – vector space, subspace, linear mapping, linear independence, basis, basics of matrices (whiteboard lecture)
    Prerequisite CS132 material here.
  • Lecture 15 (3/24): linear algebra II – projections on subspaces, least squares, eigenvalue decomposition, real symmetric matrices’ spectral properties (whiteboard lecture)
    Prerequisite CS132 material here.
  • Lecture 16 (3/29): dimensionality reduction II, matrix decompositions I – singular value decomposition (SVD)
    Slides available here.
  • Lecture 17 (3/31): dimensionality reduction III, matrix decompositions II – math of singular value decomposition (SVD), and principal component analysis (PCA) (whiteboard lecture)
  • Lecture 18 (4/5): dimensionality reduction IV, matrix decompositions III – PCA for dimensionality reduction
    Python code here and a demo here.
  • Lecture 19 (4/7): graphs I – G(n,p) model, community detection using spectral clustering
    Python code here and notes for the emergence of K4s here.
  • Lecture 20 (4/12): graphs II – community detection using spectral clustering, Markov Chains (intro)
    Python code here.
  • Lecture 21 (4/14): graphs III – Markov Chains (cont.), Pagerank
    Python code here.
  • Lecture 22 (4/19): vector calculus I – level curves, gradient, directional derivative, Hessian, chain rule
    Slides available here.
  • Lecture 23 (4/21): vector calculus II – matrix calculus
    Slides available here.
  • Lecture 24 (4/26): vector calculus III, optimization I – Taylor series, formulating problems, minimization, Weierstrass theorem
    Slides available here.
  • Lecture 25 (4/28): vector calculus IV, optimization II – 1st and 2nd order necessary conditions, gradient descent
    Slides available here.
  • Lecture 26 (5/3): vector calculus V, optimization III – gradient descent, convexity, Lagrange multipliers, duality
    Slides available here.
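
A few of the algorithms above are small enough to sketch in Python 3 (one of the class's recommended languages). These sketches are illustrative only; function names and parameters are my own, not the class's code. First, the π-estimation randomized algorithm from Lecture 4: sample points uniformly in the unit square and count the fraction falling inside the quarter circle.

```python
import random

def estimate_pi(n_samples: int, seed: int = 0) -> float:
    """Monte Carlo estimate of pi: the quarter circle has area pi/4,
    so 4 * (fraction of points with x^2 + y^2 <= 1) approximates pi."""
    rng = random.Random(seed)
    inside = 0
    for _ in range(n_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    return 4.0 * inside / n_samples
```

By the weak law of large numbers (Lecture 3), the estimate converges to π as the sample count grows; the Central Limit Theorem gives the error roughly a 1/√n scaling.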
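
Reservoir sampling from Lecture 10 can be sketched the same way (this is the classic Algorithm R; the function signature is my own): keep a uniform sample of k items from a stream of unknown length using only O(k) memory.

```python
import random

def reservoir_sample(stream, k: int, seed: int = 0):
    """Uniform random sample of k items from a stream of unknown
    length. Item i (0-indexed) replaces a random reservoir slot
    with probability k / (i + 1), which keeps every prefix uniform."""
    rng = random.Random(seed)
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            reservoir.append(item)
        else:
            j = rng.randint(0, i)  # uniform over 0..i inclusive
            if j < k:
                reservoir[j] = item
    return reservoir
```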
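
The Morris counter from Lecture 11 estimates the stream length F1 while storing only roughly log log n bits. A minimal sketch (names are mine; a single counter has high variance, so in practice one averages several independent copies):

```python
import random

def morris_count(n_events: int, seed: int = 0) -> float:
    """Approximate counting with a Morris counter: store only a
    small exponent c, incremented with probability 2^-c on each
    event. The estimator 2^c - 1 is unbiased for the true count."""
    rng = random.Random(seed)
    c = 0
    for _ in range(n_events):
        if rng.random() < 2.0 ** (-c):
            c += 1
    return 2.0 ** c - 1
```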
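
PageRank from Lecture 21 reduces to power iteration on the Markov chain of a random surfer. A small sketch under the usual conventions (adjacency-list input and uniform teleport for dangling nodes are my assumptions, not the lecture's exact formulation):

```python
def pagerank(adj, d: float = 0.85, iters: int = 100):
    """PageRank by power iteration. adj[u] lists the nodes u links
    to; with probability d the surfer follows a link, otherwise it
    teleports uniformly. Dangling nodes spread their mass uniformly."""
    n = len(adj)
    ranks = [1.0 / n] * n
    for _ in range(iters):
        new = [(1.0 - d) / n] * n
        for u, outs in enumerate(adj):
            if outs:
                share = d * ranks[u] / len(outs)
                for v in outs:
                    new[v] += share
            else:  # dangling node: no out-links
                for v in range(n):
                    new[v] += d * ranks[u] / n
        ranks = new
    return ranks
```

On the 3-node cycle-like graph 0→{1,2}, 1→2, 2→0, node 2 collects the most mass since both 0 and 1 link to it.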
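
Finally, the gradient descent iteration from Lectures 25–26 in its plainest form (step size and test function are illustrative choices of mine):

```python
def gradient_descent(grad, x0, lr=0.1, iters=200):
    """Minimize a differentiable function by repeatedly stepping
    in the direction opposite its gradient: x <- x - lr * grad(x)."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi - lr * gi for xi, gi in zip(x, g)]
    return x

# Example: f(x, y) = (x - 3)^2 + 2*(y + 1)^2, minimized at (3, -1).
grad_f = lambda p: [2 * (p[0] - 3), 4 * (p[1] + 1)]
```

For this convex quadratic, any sufficiently small step size gives linear convergence to the unique minimizer; Lecture 26's convexity material explains when such guarantees hold.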

