Flow Matching

Resources

An Introduction to Flow Matching by Cambridge Machine Learning Group
Utkarsh, Utkarsh, Pengfei Cai, Alan Edelman, Rafael Gomez-Bombarelli, and Christopher Vincent Rackauckas. 2025. “Physics-Constrained Flow Matching: Sampling Generative Models with Hard Constraints.” arXiv:2506.04171. Preprint, arXiv, June 4. https://doi.org/10.48550/arXiv.2506.04171.

Neural ODEs

Main Idea

Flow matching is a generative modeling technique that relies on an iterated map in pseudo-time ("flow time"). That is, beginning with $z_{0} \sim p_{0}(z_{0})$, where $p_{0}$ is known, updates are computed as

$$z_{k+1} = f(z_{k}), \quad k = 0, \dots, K - 1,$$

with the goal that $p_{K}(z_{K})$ matches some target distribution $p^{*}(z)$.
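As a rough illustration of the iterated map, the sketch below pushes samples from a standard normal $p_{0}$ through $K$ applications of a hypothetical affine map $f(z) = a z + b$. The values of `a`, `b`, `K`, and the sample count are assumptions chosen for illustration and are not tied to any particular method. For this choice, $p_{K}$ is Gaussian with a mean and standard deviation that can be written down directly, which gives something to compare the samples against.

```python
import numpy as np

# Hypothetical affine map f(z) = a * z + b; values chosen only for illustration.
a, b = 0.5, 1.0
K = 3

def f(z):
    return a * z + b

rng = np.random.default_rng(0)
z = rng.standard_normal(100_000)  # z_0 ~ p_0 = N(0, 1)

# Apply the same map K times: z_{k+1} = f(z_k).
for k in range(K):
    z = f(z)

# For an affine map, z_K = a^K * z_0 + b * (1 + a + ... + a^(K-1)),
# so p_K is Gaussian with the mean and standard deviation below.
expected_mean = b * sum(a**j for j in range(K))
expected_std = abs(a) ** K

print("sample mean:", z.mean(), "expected:", expected_mean)
print("sample std: ", z.std(), "expected:", expected_std)
```

With a learned, nonlinear $f$ the closed form is no longer available, which is why the density of $z_{K}$ has to be tracked through the change-of-variables argument.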

To derive an explicit (but expensive) form of $p_{K}$, begin with the first step, which relates $z_{0}$ and $z_{1}$, and assume that $f$ is invertible:

$$z_{1} = f(z_{0}).$$

By the change-of-variables formula, with $z_{0} = f^{-1}(z_{1})$, the two densities are related explicitly (although possibly expensively) by

$$p_{1}(z_{1}) = p_{0}(z_{0}) \left| \det \left( \frac{\partial}{\partial z_{1}} f^{-1}(z_{1}) \right) \right|.$$

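Assuming $f$ is differentiable with a nonsingular Jacobian, the inverse function theorem says that the Jacobian of $f^{-1}$ at $z_{1}$ is the matrix inverse of the Jacobian of $f$ at $z_{0}$. Combined with $\det(A^{-1}) = \det(A)^{-1}$, this gives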

$$p_{1}(z_{1}) = p_{0}(z_{0}) \left| \det \left( \frac{\partial f(z_{0})}{\partial z_{0}} \right) \right|^{-1}.$$
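A quick numerical check can make the two forms of the formula concrete. The sketch below assumes a hypothetical one-dimensional affine map $f(z) = a z + b$ with a standard normal $p_{0}$, so that $p_{1}$ is known in closed form as a normal density with mean $b$ and standard deviation $|a|$. The parameter values and helper names are illustrative assumptions.

```python
import numpy as np

def standard_normal_pdf(z):
    return np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)

def normal_pdf(z, mean, std):
    return np.exp(-0.5 * ((z - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

# Hypothetical affine map; values chosen only for illustration.
a, b = 2.0, 1.0

def f(z0):
    return a * z0 + b

def f_inv(z1):
    return (z1 - b) / a

z1 = np.linspace(-5.0, 7.0, 25)
z0 = f_inv(z1)

# Form 1: Jacobian of the inverse map, d f^{-1} / d z1 = 1 / a.
p1_inverse_form = standard_normal_pdf(z0) * abs(1.0 / a)

# Form 2: inverse of the Jacobian of the forward map, d f / d z0 = a.
p1_forward_form = standard_normal_pdf(z0) / abs(a)

# Closed form for comparison: z1 = a * z0 + b with z0 ~ N(0, 1) is N(b, a^2).
p1_exact = normal_pdf(z1, mean=b, std=abs(a))

print(np.max(np.abs(p1_inverse_form - p1_exact)))
print(np.max(np.abs(p1_forward_form - p1_exact)))
```

Both printed differences should be at the level of floating-point round-off, since the two forms are algebraically identical for this map.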

Repeating this for each of the $K$ steps and taking logarithms, so that products become sums, we get

$$\log p_{K}(z_{K}) = \log p_{0}(z_{0}) - \sum_{k=1}^{K} \log \left| \det \left( \frac{\partial f(z_{k-1})}{\partial z_{k-1}} \right) \right|.$$
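The sum of log-determinants can be sketched in a few lines for a case where the answer is known. The example below assumes a hypothetical linear map $f(z) = A z + b$ in two dimensions, applied $K$ times to a standard normal $p_{0}$. It evaluates $\log p_{K}(z_{K})$ by stepping backward through $f^{-1}$ and accumulating the log-determinant terms, then compares the result with the closed-form Gaussian log-density. The matrix `A`, the vector `b`, and the function names are illustrative assumptions.

```python
import numpy as np

# Hypothetical invertible linear map f(z) = A @ z + b; values are illustrative.
A = np.array([[1.2, 0.3],
              [-0.1, 0.9]])
b = np.array([0.5, -0.2])
K = 4
d = 2

def log_p0(z):
    # Log-density of a standard normal in d dimensions.
    return -0.5 * (z @ z + d * np.log(2.0 * np.pi))

def log_pK(zK):
    # Walk backward from z_K to z_0, summing log|det(df/dz)| along the way.
    z = zK
    log_det_sum = 0.0
    for _ in range(K):
        z = np.linalg.solve(A, z - b)         # z_{k-1} = f^{-1}(z_k)
        _, logabsdet = np.linalg.slogdet(A)   # Jacobian of f is A everywhere
        log_det_sum += logabsdet
    return log_p0(z) - log_det_sum

def log_pK_exact(zK):
    # z_K = A^K z_0 + sum_j A^j b is Gaussian with the mean and covariance below.
    AK = np.linalg.matrix_power(A, K)
    mean = sum(np.linalg.matrix_power(A, j) @ b for j in range(K))
    cov = AK @ AK.T
    diff = zK - mean
    _, logdet_cov = np.linalg.slogdet(cov)
    return -0.5 * (diff @ np.linalg.solve(cov, diff) + logdet_cov + d * np.log(2.0 * np.pi))

zK = np.array([1.0, -0.5])
print(log_pK(zK), log_pK_exact(zK))
```

For a general nonlinear $f$, the Jacobian changes at every $z_{k-1}$ and each determinant costs roughly cubic time in the dimension, which is the expense the derivation refers to.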

From here, we can choose a simple $p_{0}$, parameterize $f$, and maximize the log-likelihood. Concretely, we substitute data samples for $z_{K}$, map them backward through $f^{-1}$ to obtain $z_{K-1}, \dots, z_{0}$, evaluate the expression above, and maximize it with respect to the parameters of $f$. Equivalently, we could minimize the KL divergence between the data distribution $p^{*}$ and our final distribution $p_{K}$: $\mathrm{KL}(p^{*} \,\|\, p_{K}) = \mathbb{E}_{p^{*}}[\log p^{*}(z)] - \mathbb{E}_{p^{*}}[\log p_{K}(z)]$. The first term depends only on $p^{*}$, so it does not change the optimization procedure.
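The maximum-likelihood procedure can be sketched on a deliberately tiny problem. The example below assumes a single step ($K = 1$) with a hypothetical one-dimensional map $f(z) = e^{s} z + b$, a standard normal $p_{0}$, and synthetic data standing in for samples from $p^{*}$. Gradients are written by hand and plain gradient ascent is used. The parameterization, learning rate, step count, and data are all illustrative assumptions rather than a recommended training setup.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=0.5, size=1000)  # stand-in for samples from p*

# Parameters of the hypothetical map f(z) = exp(s) * z + b.
s, b = 0.0, 0.0
learning_rate = 0.1

def mean_log_likelihood(s, b, x):
    # Map data backward through f^{-1}, then apply the change-of-variables formula:
    # log p_1(x) = log p_0(f^{-1}(x)) - log|df/dz| = log p_0(u) - s.
    u = (x - b) * np.exp(-s)
    log_p0 = -0.5 * u**2 - 0.5 * np.log(2.0 * np.pi)
    return np.mean(log_p0 - s)

initial_ll = mean_log_likelihood(s, b, data)

for step in range(2000):
    u = (data - b) * np.exp(-s)
    grad_s = np.mean(u**2 - 1.0)          # d/ds of the mean log-likelihood
    grad_b = np.mean(u) * np.exp(-s)      # d/db of the mean log-likelihood
    s += learning_rate * grad_s           # gradient ascent step
    b += learning_rate * grad_b

final_ll = mean_log_likelihood(s, b, data)
print("fitted shift b:", b, "sample mean:", data.mean())
print("fitted scale exp(s):", np.exp(s), "sample std:", data.std())
print("mean log-likelihood:", initial_ll, "->", final_ll)
```

For this one-parameter-pair family the maximizer is known: the shift matches the sample mean and the scale matches the sample standard deviation, so the fitted values can be checked directly.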

However, several questions remain, and the answers to them distinguish the various approaches:

How do we construct/parameterize $f$ so that it is invertible, and so that we know its inverse?
How can we compute the Jacobian determinant efficiently?
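One common answer to both questions in the normalizing-flow literature is an affine coupling layer (as used in RealNVP-style models). It is named here as an illustration and is not something these notes prescribe. Half of the coordinates pass through unchanged and determine an elementwise scale and shift for the other half. The inverse is then available in closed form, and the Jacobian is block triangular, so its log-determinant is just the sum of the log-scales. In the sketch below, the `scale` and `shift` functions and their random weights are hypothetical stand-ins for learned networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fixed weights standing in for learned networks.
W_s = rng.normal(size=(2, 2)) * 0.5
W_t = rng.normal(size=(2, 2)) * 0.5

def scale(z_a):
    return np.tanh(W_s @ z_a)   # log-scale for the second half

def shift(z_a):
    return W_t @ z_a            # shift for the second half

def coupling_forward(z):
    z_a, z_b = z[:2], z[2:]
    s = scale(z_a)
    y_b = z_b * np.exp(s) + shift(z_a)
    # Jacobian is [[I, 0], [*, diag(exp(s))]], so log|det| = sum(s).
    return np.concatenate([z_a, y_b]), np.sum(s)

def coupling_inverse(y):
    y_a, y_b = y[:2], y[2:]
    z_b = (y_b - shift(y_a)) * np.exp(-scale(y_a))
    return np.concatenate([y_a, z_b])

def numerical_jacobian(func, z, eps=1e-6):
    # Central finite differences, used only to check the analytic log-determinant.
    d = z.size
    J = np.zeros((d, d))
    for j in range(d):
        step = np.zeros(d)
        step[j] = eps
        J[:, j] = (func(z + step) - func(z - step)) / (2.0 * eps)
    return J

z = rng.normal(size=4)
y, log_det = coupling_forward(z)
J = numerical_jacobian(lambda v: coupling_forward(v)[0], z)
_, log_det_numeric = np.linalg.slogdet(J)

print("inverse recovers input:", np.allclose(coupling_inverse(y), z))
print("log|det| analytic vs numeric:", log_det, log_det_numeric)
```

The trade-off in this design is that a single layer leaves half the coordinates untouched, so in practice several layers with alternating splits are stacked.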
