CS 7880: Foundations of Trustworthy Machine Learning

Seminar in Theoretical CS — Spring 2025

Overview

This seminar course will cover recent advances in the theoretical foundations of (trustworthy) machine learning, including but not limited to topics like privacy, fairness, robustness, generalization, safety, and incentives. The course will consist primarily of student-led presentations of classic and recent research papers, and will be shaped by students' interests.

Time & Location

MTh 12:00 – 1:40pm
177 Huntington Room 503 (5th Floor)

Note: We've slightly modified the time of the class so we could have a larger room. The class now begins at 12:00pm.
Note: This class meets in the 177 Huntington building, not the official room assigned by the registrar. This building has its own access system that requires you to show ID at the front desk. I will add you to the guest list for the semester.

Instructor

Jonathan Ullman
jullman@ccs.neu.edu
Location: 177 Huntington, Rm 623

Course Content

The course will cover a variety of themes around the theoretical foundations of (trustworthy) machine learning, as well as how those ideas are being applied or might be applied in practice. Each theme will span a few meetings, and I will solicit preferences between themes. We may cover themes I haven't even anticipated yet, but the following (soon to be growing) list represents topics I am interested in that we may cover:

Privacy for Statistics and ML

Privacy attacks on ML systems
Differential privacy
Contextual integrity
Private ML in practice
Privacy for the Census

Fairness and Decision-Making

Individual and group fairness definitions
Calibration and decision-making
Causal notions of fairness
Tradeoffs between fairness notions
Fair ML in practice

Generalization and Statistical Validity

Generalization theory
Generalization in deep learning
Generalization in adaptive data analysis
The reproducibility crisis

Robustness and Attacks on ML Systems

Robust statistical inference
Data poisoning
Out-of-distribution generalization

Data Deletion and Machine Unlearning

Attempts at definitions
Common methods and pitfalls

Copyright

Attempts at definitions
Approaches to protecting copyright

Student-Led Meetings

Each student will be responsible for leading one or more meetings (depending on the number of students we have). Before leading a meeting, students are encouraged to discuss the paper(s) they will be presenting informally with me and make a plan for how to lead the meeting effectively.

Grading

This is a PhD seminar course with the goal of exposing students to frontier research topics and building the skills of reading, discussing, and presenting research papers. As such, there is no set material you are supposed to come away from the course mastering, and there is no systematic evaluation process. All that is expected is that you attend regularly, participate, and put effort into your presentations.

Resources

Since each topic will involve reading a number of classic and recent research papers (possibly with some relevant background material), I will compile a list of relevant papers and other resources here as we go.