Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling

1Computer Vision Group, Institute of Informatics, University of Bern, Bern, Switzerland 2LIONS, EPFL, Lausanne, Switzerland

Published at ICLR 2025

Abstract

Conditional Flow Matching (CFM), a simulation-free method for training continuous normalizing flows, provides an efficient alternative to diffusion models for key tasks like image and video generation. The performance of CFM in solving these tasks depends on the way data is coupled with noise. A recent approach uses minibatch optimal transport (OT) to reassign noise-data pairs in each training step to streamline sampling trajectories and thus accelerate inference. However, its optimization is restricted to individual minibatches, limiting its effectiveness on large datasets.

To address this shortcoming, we introduce LOOM-CFM (Looking Out Of Minibatch-CFM), a novel method that extends the scope of minibatch OT by preserving and optimizing noise-data assignments across minibatches throughout training.

Our approach demonstrates consistent improvements in the sampling speed-quality trade-off across multiple datasets. LOOM-CFM also enhances distillation initialization and supports high-resolution synthesis in latent space training.

Algorithm

In diffusion and flow-based generative models, generating a sample means starting from noise and integrating the learned vector field until a clean data point is reached. However, numerical integration requires discretization, which introduces errors. Finer discretization reduces these errors but increases computational cost, making inference slower. Straighter trajectories would reduce the number of function evaluations (NFE) needed and thus speed up sampling.
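
As a concrete illustration, below is a minimal sketch of sampling with a forward Euler integrator, where the number of steps equals the NFE. The network `model` is a hypothetical velocity predictor, not code from the paper.

    # Minimal sketch: sampling by integrating a learned vector field with
    # forward Euler. `model(x, t)` is a hypothetical velocity network; the
    # number of steps equals the number of function evaluations (NFE).
    import torch

    @torch.no_grad()
    def euler_sample(model, noise, nfe=12):
        x = noise
        ts = torch.linspace(0.0, 1.0, nfe + 1)
        for i in range(nfe):
            t = ts[i].expand(x.shape[0])         # broadcast time to the batch
            v = model(x, t)                      # one function evaluation
            x = x + (ts[i + 1] - ts[i]) * v      # Euler step along the flow
        return x                                 # approximate clean sample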


In flow matching, the default coupling samples noise and data points independently when forming minibatches. This creates ambiguity during training, as the same network input may have multiple different targets across minibatches. Resolving this ambiguity through averaging results in learning curved sampling trajectories. Prior work has tackled this problem by locally modifying the independent coupling, reassigning noise and data points within each separate minibatch (e.g., by solving local OT). However, this approach does not scale effectively to larger and higher-dimensional data.
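
For reference, here is a minimal sketch of such within-minibatch reassignment, assuming a squared Euclidean cost and SciPy's linear assignment solver (the exact cost and solver used in [1, 2] may differ):

    # Sketch of minibatch OT coupling: noise and data are sampled
    # independently, then re-paired within the minibatch by solving a
    # linear assignment problem on pairwise squared distances.
    import torch
    from scipy.optimize import linear_sum_assignment

    def minibatch_ot_pairs(x_data, x_noise):
        # cost[i, j] = ||x_data_i - x_noise_j||^2
        cost = torch.cdist(x_data.flatten(1), x_noise.flatten(1)) ** 2
        _, col = linear_sum_assignment(cost.numpy())   # rows return in order
        return x_data, x_noise[torch.from_numpy(col)]  # OT-matched pairs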

Similarly to prior work (e.g., OT-CFM [1] or BatchOT [2]), LOOM-CFM solves local OT to reassign independently sampled noise-data pairs within a minibatch. In contrast to previous approaches, however, LOOM-CFM works with a fixed set of noise samples and stores the assignments, reusing them as starting points in future minibatches. This procedure allows minibatches to communicate, which leads to strictly more globally optimal assignments than prior work.
To prevent overfitting to the fixed set of noise samples, we propose to store more than one assigned noise sample per data point. At each training iteration, a minibatch of data-noise pairs is obtained by first sampling data points and then randomly picking one of their assigned noise samples. This corresponds to artificially enlarging the dataset by duplicating data points and does not change the underlying data distribution.
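
Below is a hypothetical sketch of this bookkeeping, assuming K stored noise vectors per data point and the same squared-cost assignment as above; names such as `noise_bank` and `loom_step` are illustrative and not taken from the paper's code.

    # Each data point keeps K persistent noise assignments. Every step, one
    # stored noise per sampled data point enters the local OT problem, and
    # the improved assignment is written back, so optimization carries over
    # between minibatches instead of restarting from independent pairs.
    import torch
    from scipy.optimize import linear_sum_assignment

    num_data, dim, K = 50000, 3 * 32 * 32, 2     # illustrative sizes
    noise_bank = torch.randn(num_data, K, dim)   # fixed noise set

    def loom_step(idx, x_data):
        # idx: data indices sampled without replacement; x_data: (B, dim)
        k = torch.randint(0, K, (len(idx),))     # pick one stored slot each
        z = noise_bank[idx, k]                   # warm-start assignments
        cost = torch.cdist(x_data.flatten(1), z.flatten(1)) ** 2
        _, col = linear_sum_assignment(cost.numpy())
        col = torch.from_numpy(col)
        noise_bank[idx, k] = z[col]              # persist improved assignment
        return x_data, z[col]                    # pairs for the CFM loss

Because improved pairings are written back into the bank, later minibatches warm-start from already optimized couplings, which is what lets the optimization look beyond a single minibatch.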

Results


Highlights:

  • LOOM-CFM yields straighter sampling trajectories and reduces FID at 12 NFE by 41% on CIFAR10, 46% on ImageNet-32, and 54% on ImageNet-64 compared to minibatch OT methods;
  • LOOM-CFM serves as an effective initialization for model distillation, further enhancing inference speed;
  • LOOM-CFM is compatible with latent flow matching for generating higher-resolution outputs;
  • For more results and details, check the paper.

BibTeX

    @inproceedings{davtyan2025faster,
      title={Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling},
      author={Aram Davtyan and Leello Tadesse Dadi and Volkan Cevher and Paolo Favaro},
      booktitle={The Thirteenth International Conference on Learning Representations},
      year={2025},
      url={https://openreview.net/forum?id=rsGPrJDIhh}
    }

References

[1] Tong, Alexander, et al. "Improving and Generalizing Flow-Based Generative Models with Minibatch Optimal Transport." Transactions on Machine Learning Research, 2023.
[2] Pooladian, Aram-Alexandre, et al. "Multisample Flow Matching: Straightening Flows with Minibatch Couplings." International Conference on Machine Learning, PMLR, 2023.