Translations:Diffusion Models Are Real-Time Game Engines/9/zh

    From Marovi AI
    Revision as of 00:18, 9 September 2024 by Felipefelixarias (talk | contribs) (Created page with "近年来,生成模型在根据文本或图像等多模态输入生成图像和视频方面取得了重大进展。在这一浪潮的前沿,扩散模型成为非语言媒体生成的事实标准,如 Dall-E(Ramesh 等人,[https://arxiv.org/html/2408.14837v1#bib.bib25 2022])、Stable Diffusion(Rombach 等人,[https://arxiv.org/html/2408.14837v1#bib.bib26 2022])和 Sora(Brooks 等人,[https://arxiv.org/html/2408.14837v1#bib.bib6 2024])。乍一看,...")

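    In recent years, generative models have made major advances in producing images and videos conditioned on multimodal inputs such as text or images. At the forefront of this wave, diffusion models have become the de facto standard for non-language media generation, with works such as Dall-E (Ramesh et al., 2022), Stable Diffusion (Rombach et al., 2022), and Sora (Brooks et al., 2024). At first glance, simulating the interactive worlds of video games may seem similar to video generation. However, "interactive" world simulation is more than just fast video generation. Generation must be conditioned on a stream of input actions that only becomes available as generation proceeds, which breaks some assumptions of existing diffusion model architectures. In particular, it requires generating frames autoregressively, which tends to be unstable and leads to sampling divergence (see Section 3.2.1).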