Introduction to Flow Matching and Diffusion Models
MIT Computer Science Class 6.S184: Generative AI with Stochastic Differential Equations
Diffusion and flow-based models have become the state of the art for generative AI across a wide range of data modalities, including images, videos, shapes, molecules, music, and more! This course aims to build up the mathematical framework underlying these models from first principles. By the end of the class, students will have built a toy image diffusion model from scratch and, along the way, gained hands-on experience with the mathematical toolbox of stochastic differential equations, which is useful in many other fields. This course is ideal for students who want to develop a principled understanding of the theory and practice of generative AI.
Course Notes
The course notes serve as the backbone of the course and provide a self-contained explanation of all material in the class. In contrast, lecture slides will generally not be self-contained and are intended to provide accompanying visualizations during the lecture. You may view the notes by clicking on the colored link below.
Lectures
Lecture | Topic | Slides | Recording |
---|---|---|---|
1 | Flow and Diffusion Models | [slides 1] | |
2 | Constructing a Training Target | [slides 2] | |
3 | Training Flow and Diffusion Models | [slides 3] | |
4 | Building an Image Generator | [slides 4] | |
5 | Generative Robotics | N/A | |
6 | Generative Protein Design | [slides 6] | |
Labs
Three labs accompany the class as exercises to give you hands-on practical experience. The labs guide you step by step through building a flow matching and diffusion model from scratch. To do the exercises, perform the following steps:
- Click on the "Open in Colab" link to open the lab in Google Drive.
- Click on the "Open in Google Colaboratory" link at the center top of the page. A Jupyter notebook should appear.
- Click on "File" → "Save a copy in Drive" to save a copy of the lab to your own Google Drive.
- Follow the instructions in the lab to complete the exercises.
Lab 1: Working with SDEs

Lab 2: Flow Matching and Score Matching

Lab 3: Conditional Image Generation

Stuck? Solutions can be found here.
Instructors
This class was co-taught by Peter and Ezra. We are fortunate to have Tommi Jaakkola as our sponsor and advisor.
Prerequisites: Linear algebra, real analysis, and basic probability theory. Students should be familiar with Python and have some experience with PyTorch.
Questions? Email either Peter or Ezra!
Remark about LLMs: This course does not cover large language models (LLMs). LLMs involve discrete data such as text, while this course focuses on data lying in continuous spaces such as images, videos, and protein structures.
Acknowledgements
We would like to thank the following individuals and organizations, without whose support this course would not have been possible:
- Professor Tommi Jaakkola for his support as our sponsor and advisor
- Lisa Bella, Ellen Reid, and everyone else at MIT EECS for their generous support
- Christian Fiedler, Tim Griesbach, Benedikt Geiger, and Albrecht Holderrieth for invaluable feedback on the lecture notes
- Elaine Mello from MIT Open Learning for support with lecture recordings
- Ashay Athalye from Students for Open and Universal Learning for helping to edit and publish lecture recordings
- Cameron Diao, Tally Portnoi, Andi Qu, and many others for providing invaluable feedback on the labs
- The Missing Semester of Your CS Education, whose website inspired this one
- Participants in the original course offering (MIT 6.S184/6.S975, taught over IAP 2025), as well as readers like you for your interest in this course
Licensed under CC BY-NC-SA.