Illustration of diffusion model sampling (red) and consistency model sampling (blue). Credit: https://openai.com/index/simplifying-stabilizing-and-scaling-continuous-time-consistency-models/

OpenAI unveils sCM, a new model that generates video media 50 times faster than current diffusion models

Tech Xplore

Two experts with the OpenAI team have developed a new kind of continuous-time consistency model (sCM) that they claim can generate video media 50 times faster than models currently in use. Cheng Lu and Yang Song have published a paper describing their new model on the arXiv preprint server. They have also posted an introductory paper on the company's website.

In machine learning, diffusion models, sometimes called diffusion probabilistic models or score-based generative models, are a class of latent variable generative models used to train many AI applications.

Such models typically have three major components: a forward process, a reverse process and a sampling procedure. They are the basis for generating visual products such as videos and still images, though they have also been applied to other tasks, such as audio generation.
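
In practical terms, the forward process gradually drowns a clean example in random noise, and the model is trained to undo that corruption. As a rough, illustrative sketch (not OpenAI's code; the function below is a placeholder assumed for this example), the forward process can be as simple as adding Gaussian noise of a chosen strength:

    import numpy as np

    def forward_noising(x0, sigma):
        """Forward process: corrupt a clean sample x0 with Gaussian noise of scale sigma."""
        return x0 + sigma * np.random.randn(*x0.shape)

The reverse process is the learned counterpart that removes this noise, and the sampling procedure applies it repeatedly to turn pure noise into a finished image or video frame.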

As with other machine-learning models, diffusion models are trained on large amounts of data. At generation time, however, most of them must run hundreds of sequential denoising steps to produce an end product, which is why they typically take a few moments to carry out their tasks.
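
To see why that is slow, the following is a minimal, generic sketch of the kind of iterative sampler many diffusion systems use. It is only an illustration under assumed conventions, not the authors' implementation; the denoiser network and the noise schedule are hypothetical placeholders:

    import numpy as np

    def diffusion_sample(denoiser, shape, num_steps=100, sigma_max=80.0, sigma_min=0.002):
        """Generate one sample by running many sequential denoising steps."""
        x = np.random.randn(*shape) * sigma_max                      # start from pure noise
        sigmas = np.geomspace(sigma_max, sigma_min, num_steps + 1)   # decreasing noise levels
        for i in range(num_steps):
            sigma, sigma_next = sigmas[i], sigmas[i + 1]
            denoised = denoiser(x, sigma)                            # one network evaluation per step
            x = x + (sigma_next - sigma) * (x - denoised) / sigma    # simple Euler update
        return x

Each pass through the loop costs a full evaluation of a large neural network, so a hundred or more steps quickly add up.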

In sharp contrast, Lu and Song have developed a model that carries out its sampling in just two steps. That reduction in steps, they note, drastically cuts the time their model takes to generate a video, without any loss in quality.
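
A consistency model instead learns a function that jumps straight from noise to a clean estimate, so sampling needs only one or two network evaluations. The sketch below shows the general two-step idea under assumed conventions; it is not the authors' code, and the consistency_fn network and the noise levels are hypothetical placeholders:

    import numpy as np

    def consistency_sample_two_steps(consistency_fn, shape, sigma_max=80.0, sigma_mid=0.8):
        """Generate one sample with only two network evaluations."""
        x = np.random.randn(*shape) * sigma_max            # start from pure noise
        x0 = consistency_fn(x, sigma_max)                  # step 1: jump straight to a clean estimate
        x_mid = x0 + np.random.randn(*shape) * sigma_mid   # re-noise to an intermediate level
        return consistency_fn(x_mid, sigma_mid)            # step 2: refine with a second jump

Cutting from around a hundred network evaluations to two is the core of the speedup the researchers report.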

The new model uses more than 1.5 billion parameters and can produce a sample video in a fraction of a second when running on a single A100 GPU. This is approximately 50 times faster than models currently in use.

The researchers note that their new model also requires far less computational power than other models, an ongoing concern for AI applications in general as their use skyrockets. They have benchmarked the new approach against other models, both those currently in use and those under development by other teams, and they suggest it should enable real-time generative AI applications in the near future.

More information: Cheng Lu et al, Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models, arXiv (2024). DOI: 10.48550/arxiv.2410.11081
OpenAI blog: openai.com/index/simplifying-stabilizing-and-scaling-continuous-time-consistency-models/
Journal information: arXiv