CS591 Large-Scale Graph Mining – Spring 2020

General Information

Instructor: Charalampos E. Tsourakakis
Classroom: EPC 204
Time: 2:30 pm-5:15 pm
Email: tsourolampis at gmail dot com.


Course description

This course aims to introduce students to important topics in large-scale graph mining ranging from graph models, and their applications to deep learning and large-scale algorithm design. Students will learn about random graph models, graph models and various real-world applications, the theory and practice of designing and applying efficient graph algorithms on large-scale real-world graphs, and machine learning applications that involve graph including node embeddings.


The class does not a textbook. However, the following books that are also available in a draft form online will be useful during this course.


  • Class Project 50%
  • Homeworks 25%
    • Homework policy: You may collaborate on each homework with at most one other person. The name of your collaborator must be written down. You must write your solutions on your own. You may change collaborator from homework to homework.  You may consult outside materials, but you must cite your sources in your solutions. You may not search the Web for solutions.

      Violation of the policy may result in receiving a zero grade for the question or the assignment  or even the whole course. The penalty will be assessed at the instructor’s discretion.

  • Exam 15%
  • Class scribes 10%

Central topics

  1. Random graph models
    • Erdős–Rényi model
    • Preferential attachment graphs
    • Kronecker graphs
  2. Properties of real world networks
    • Skewed distributions
    • Community structure
    • Small world and decentralized search
  3. Graph models in practice
    • Opinion dynamics
    • Modeling traffic
    • Kidney exchanges and more generally
    • Markets and strategic interaction in networks
  4. Positive and negative interactions
    • Balance theory
    • Prediction of signed interactions
    • Correlation clustering
  5. Graphs at scale (learning):
    • Graph-based semi-supervised learning
    • Graph convolutional networks
    • Representation learning  (node embeddings)
  6. Graph at scale (systems):
    • MapReduce (Hadoop)
    • Pregel (Apache Giraph)
    • GraphX
    • PowerGraph
  7. Graph problems at scale (algorithm design):
    • Graph connectivity
    • Diameter estimation
    • Subgraph counting
    • Dense subgraph discovery
    • Graph partitioning
    • Graph decompositions
    • Influence maximization


%d bloggers like this: