Translations:Diffusion Models Are Real-Time Game Engines/83/en

    From Marovi AI
    Revision as of 05:28, 1 September 2024 by FuzzyBot (talk | contribs) (Importing a new version from external source)
    (diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

    Diffusion models achieved state-of-the-art results in text-to-image generation (Saharia et al., 2022; Rombach et al., 2022; Ramesh et al., 2022; Podell et al., 2023), a line of work that has also been applied for text-to-video generation tasks (Ho et al., 2022; Blattmann et al., 2023b; a; Gupta et al., 2023; Girdhar et al., 2023; Bar-Tal et al., 2024). Despite impressive advancements in realism, text adherence, and temporal consistency, video diffusion models remain too slow for real-time applications. Our work extends this line of work and adapts it for real-time generation conditioned autoregressively on a history of past observations and actions.