What are Diffusion Models?
Diffusion models learn to reverse a gradual corruption process. During training, data is progressively noised; during generation, the model runs that process backward, removing a little noise at each step until coherent structure emerges.
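The forward (noising) half of this process has a simple closed form. Below is a minimal NumPy sketch assuming a DDPM-style linear beta schedule; the schedule values and step count are illustrative assumptions, not prescribed by this page.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed linear beta schedule over T steps (illustrative values).
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)  # cumulative signal-retention factor

def forward_noise(x0, t):
    """Sample x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x_0, (1 - abar_t) * I)."""
    noise = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
    return xt, noise

x0 = rng.standard_normal(8)          # toy "data" vector
xt, eps = forward_noise(x0, t=500)   # partially corrupted state
# As t -> T, alpha_bars[t] -> 0 and x_t approaches pure Gaussian noise.
```

Training then amounts to asking a network to predict `eps` (or equivalently the clean `x0`) from `xt` and `t`.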
The Core Intuition
Generation is framed as iterative refinement. Instead of predicting the entire output in one shot, the model repeatedly improves a noisy state, which often yields stable and high-quality samples.
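The refinement intuition can be shown concretely in a toy case where the score (the gradient of the log-density) is known exactly. For data drawn from N(mu, I), the score is simply `mu - x`; a trained diffusion model approximates this quantity for real data. This is an illustrative sketch, not a production sampler.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: data ~ N(mu, I), so the exact score is score(x) = mu - x.
mu = np.array([3.0, -2.0])

x = rng.standard_normal(2) * 5.0     # start from a very noisy state
step = 0.1
for _ in range(200):
    score = mu - x                   # a real model would predict this
    # Small Langevin-style update: drift toward the data plus a little noise.
    x = x + step * score + np.sqrt(2 * step) * 0.05 * rng.standard_normal(2)

# After many small refinements, x lands near the data mode mu.
```

Each individual update is easy; quality comes from repeating many of them, which is exactly the stability argument made above.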
Text Diffusion
Operate over masked token sequences and iteratively refine token guesses with denoising-style updates.
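A minimal sketch of the mask-and-predict loop, using a random-logits stand-in for the trained denoiser (the predictor, vocabulary size, and unmasking schedule here are all assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, MASK, LENGTH, STEPS = 10, -1, 6, 3

def toy_predict(tokens):
    # Stand-in for a trained denoiser: random logits per position.
    return rng.standard_normal((len(tokens), VOCAB))

tokens = np.full(LENGTH, MASK)       # start fully masked
for step in range(STEPS):
    logits = toy_predict(tokens)
    confidence = logits.max(axis=1)
    guesses = logits.argmax(axis=1)
    masked = tokens == MASK
    # Commit the most confident masked positions this round.
    k = int(np.ceil(masked.sum() / (STEPS - step)))
    order = np.argsort(-np.where(masked, confidence, -np.inf))
    tokens[order[:k]] = guesses[order[:k]]
```

Unlike autoregressive decoding, every position can be revised in parallel at each step, and the number of steps is decoupled from the sequence length.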
Image Diffusion
Operate over latent image representations and denoise them into coherent visuals guided by text conditions.
Learning Roadmap
Follow this sequence to build intuition from fundamentals to modality-specific systems.
How Diffusion Works
Forward noise, reverse denoising, schedulers, and score matching intuition.
Text Diffusion
Discrete token diffusion, mask-and-predict, padding, and autoregressive differences.
Image Diffusion
Latent pipelines, U-Net vs DiT, text conditioning, CFG, and step tradeoffs.
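Classifier-free guidance (CFG), mentioned above, combines two noise predictions per step. The formula itself is standard; the toy vectors below are placeholders for two actual U-Net/DiT forward passes.

```python
import numpy as np

def cfg_combine(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: push the conditional prediction
    away from the unconditional one by a tunable scale."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

# Toy noise predictions (stand-ins for two model forward passes).
eps_u = np.array([0.1, -0.2, 0.3])
eps_c = np.array([0.2, -0.1, 0.1])

guided = cfg_combine(eps_u, eps_c, guidance_scale=7.5)
```

A scale of 1 recovers the plain conditional prediction; larger scales strengthen prompt adherence at some cost in diversity, which is one of the step/quality tradeoffs this topic covers.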