Info
Instructor
- When: Tue, Thu 5:00-6:15 pm
- Where: CAS-211
- Prof: Babis Tsourakakis
- Email: ctsourak@bu.edu
- Office hours (CDS 912): Tu 12-1pm, Th 10-11am
Teaching Fellow
- TF: Mr. Tiany Chen
- Email: ctony@bu.edu
- Labs: schedule
- Office hours (CDS 362): Wed 2:00-3:30 pm, Fri 10:00-11:30 am
Piazza website
GitHub
Prerequisites
Students taking this class must have taken:
- CS 112
- CS 131 (MA293)
- CS 132 (MA242)
- and CS 237 (MA581) or equivalent.
This year the prerequisites will be strictly enforced. CS 330 is highly recommended but is not a prerequisite.
Syllabus
Topics will include probability, information theory, linear algebra, calculus, Fourier analysis, and graph theory, with a strong focus on their applicability to analyzing datasets. Finally, two lectures will be devoted to data management, specifically the classic relational model, SQL, and Datalog. A detailed syllabus is available on Piazza.
Textbooks
There will be assigned readings from the following books, all of which are available online (click for the PDF):
- Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong
- Foundations of Data Science by Avrim Blum, John Hopcroft, Ravi Kannan
- Understanding Machine Learning: From Theory to Algorithms by Shai Shalev-Shwartz and Shai Ben-David
- Introduction to Probability for Data Science by Stanley Chan
Programming
The class assumes familiarity with programming. The recommended languages for this class are Python 3 and Julia. R and MATLAB are also recommended. Other languages (C, C++, Java, etc.) are welcome, but are not recommended for this class.
Lectures
Note: at the end of each lecture, you will find the assigned readings. Readings marked with a magnifying glass are mandatory. The rest are optional material for those who are interested in going further and have the time to devote.
- Lecture 1 (1/19): data visualization – introduction, class logistics, types of data, basics of data visualization
  Slides available here.
- Lecture 2 (1/25): probability I – review of prerequisite material, and other basic concepts through problem solving
  Slides available here.
- Lecture 3 (1/26): probability II – convergence of random variables, Markov’s inequality
  Slides available here.
- Lecture 4 (2/1): probability III – weak law of large numbers, confidence intervals, randomized algorithm for estimating π, central limit theorem
  Slides available here.
Assignments
- Homework 1 (to be released on 1/27, due on 2/3)