Translations:Diffusion Models Are Real-Time Game Engines/43/en

    From Marovi AI
    Revision as of 05:28, 1 September 2024 by FuzzyBot (talk | contribs) (Importing a new version from external source)
    (diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

    Using just 4 denoising steps leads to a total U-Net cost of 40ms (and total inference cost of 50ms, including the auto encoder) or 20 frames per second. We hypothesize that the negligible impact to quality with few steps in our case stems from a combination of: (1) a constrained images space, and (2) strong conditioning by the previous frames.