All public logs

Combined display of all available logs of Marovi AI.

00:26, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/61/zh (Created page with "=== 5.2 消融实验 ===")
00:26, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/60/zh (Created page with "'''人类评估。''' 作为评估仿真质量的另一项标准，我们向 10 名评测员提供了 130 个随机短片段（长度为 1.6 秒和 3.2 秒），并排展示我们的仿真和真实游戏。评测员的任务是识别真实游戏（见附录[https://arxiv.org/html/2408.14837v1#A1.SS6 A.6]中的图[https://arxiv.org/html/2408.14837v1#A1.F14 14]）。评测员在 1.6 秒和 3.2 秒的片段中，选择真实游戏而非仿真的比例分别为 58% 和 60%。")
00:26, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/59/zh (Created page with "因此，我们对512个随机保留的轨迹计算FVD（Unterthiner等人，[https://arxiv.org/html/2408.14837v1#bib.bib35 2019]），测量预测轨迹分布与真实值轨迹分布之间的距离，仿真的长度为16帧（0.8秒）和32帧（1.6秒）。对于16帧，我们的模型获得的FVD为<math>114.02</math>。对于32帧，我们的模型获得的FVD为<math>186.23</math>。")
00:26, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/58/zh (Created page with "center|thumb|600x600px|图 6：自回归评估。64 个自回归步骤的 LPIPS 指标")
00:26, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/57/zh (Created page with "center|thumb|600x600px|图 6：自回归评估。64步自回归过程中的PSNR指标。")
00:26, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/56/zh (Created page with "视频质量我们使用第[https://arxiv.org/html/2408.14837v1#S2 2]节中描述的自回归设置，按照真实轨迹所定义的动作序列对帧进行迭代采样，同时将模型自身的过往预测作为条件。自回归采样时，预测轨迹和真实轨迹常常在几步后发生偏离，这主要是由于不同轨迹的帧间积累了少量不同的运动速度。因此，如图[https://arxiv.org/html/2408.14837v1#S5.F6 6]所示，每帧的PSNR和LPIPS值...")
00:26, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/55/zh (Created page with "center|thumb|900x900px|图 5：模型预测与地面实况对比。仅显示过去观测上下文的最后 4 帧。")
00:26, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/54/zh (Created page with "'''图像质量。''' 我们使用第[https://arxiv.org/html/2408.14837v1#S2 2]节中描述的教师强迫设置来测量LPIPS（Zhang 等人，[https://arxiv.org/html/2408.14837v1#bib.bib40 2018]）和PSNR。在该设置中，我们对初始状态进行采样，并根据地面实况的过去观察轨迹预测单帧。在对5个不同级别的2048条随机轨迹进行评估时，我们的模型实现了<math>29.43</math>的PSNR值和<math>0.249</math>的LPIPS值。PSNR...")
00:25, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/53/zh (Created page with "总体而言，从图像质量来看，我们的方法在长轨迹上实现了与原始游戏相当的仿真质量。对于短轨迹，人类评估者在区分仿真片段和实际游戏片段时，仅比随机猜测略胜一筹。")
00:25, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/52/zh (Created page with "=== 5.1 仿真质量 ===")
00:25, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/51/zh (Created page with "== 5 结果 ==")
00:25, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/50/zh (Created page with "我们使用 Stable Diffusion 1.4 的预训练检查点训练所有仿真模型，解冻所有 U-Net 参数。我们使用的批量大小为 128，恒定学习率为 2e-5，采用无权重衰减的 Adafactor 优化器（Shazeer & Stern，[https://arxiv.org/html/2408.14837v1#bib.bib31 2018]），以及梯度剪切为 1.0。我们将扩散损失参数化更改为 v预测（Salimans & Ho [https://arxiv.org/html/2408.14837v1#bib.bib28 2022a]）。我们以 0.1 的概率去...")
00:25, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/49/zh (Created page with "=== 4.2 生成模型训练 ===")
00:25, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/48/zh (Created page with "代理模型使用 PPO（Schulman 等人，[https://arxiv.org/html/2408.14837v1#bib.bib30 2017]）进行训练，采用简单的 CNN 作为特征网络，基于 Mnih 等人（[https://arxiv.org/html/2408.14837v1#bib.bib21 2015]）的方法。在 CPU 上使用 Stable Baselines 3 基础架构（Raffin 等人，[https://arxiv.org/html/2408.14837v1#bib.bib24 2021]）进行训练。代理接收缩小后的帧图像和游戏地图，每个分辨率为 160x120。代理还可以...")
00:25, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/47/zh (Created page with "=== 4.1 代理训练 ===")
00:24, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/46/zh (Created page with "== 4 实验设置 ==")
00:24, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/45/zh (Created page with "我们注意到，类似于 NVidia 的经典 SLI 交替帧渲染（AFR）技术，通过在额外硬件上并行生成多个帧，可以显著提高图像生成速率。然而，与 AFR 类似，实际的仿真速率不会提高，输入延迟也不会减少。")
00:24, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/44/zh (Created page with "由于我们在使用单步采样时确实观察到了质量下降，因此我们在单步设置中进行了类似于（Yin 等人，[https://arxiv.org/html/2408.14837v1#bib.bib39 2024]；Wang 等人，[https://arxiv.org/html/2408.14837v1#bib.bib36 2023]）的模型蒸馏实验。蒸馏确实有很大帮助（使我们达到了上述的 50 FPS），但仍会对仿真质量造成一定影响，因此我们选择在我们的方法中使用不带蒸馏的 4 步版本（见...")
00:24, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/43/zh (Created page with "仅使用 4 个去噪步骤导致 U-Net 总耗时为 40 毫秒（包括自动编码器的推理总耗时为 50 毫秒），即每秒 20 帧。我们推测，在我们的案例中，较少步骤对质量影响可忽略不计，是由于以下因素的结合：(1) 受限的图像空间，以及 (2) 前一帧的强条件作用。")
00:24, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/42/zh (Created page with "在推理过程中，我们需要运行 U-Net 去噪器（进行若干步）和自动编码器。在我们的硬件配置（TPU-v5）下，一次去噪步骤和自动编码器的评估各需 10 毫秒。如果我们以单步去噪器运行模型，设置中的最小总延迟为每帧 20 毫秒，即每秒 50 帧。通常情况下，生成扩散模型（如 Stable Diffusion）通过单次去噪步骤无法产生高质量结果，而是需要数十个采样步骤才能生...")
00:24, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/41/zh (Created page with "==== 3.3.2 去噪器采样步骤 ====")
00:24, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/40/zh (Created page with "我们还尝试了同时生成 4 个样本并合并结果，希望能防止罕见的极端预测被采纳，并减少误差累积。我们尝试了对样本进行平均和选择最接近中位数的样本。平均效果略逊于单帧，而选择最接近中位数的样本效果仅略有提升。由于这两种方法都会将硬件需求提高到 4 个张量处理单元（TPU），因此我们决定不使用这些方法，但注意到这可能是未来研究的一个有...")
00:23, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/39/zh (Created page with "我们使用DDIM采样（Song等人，[https://arxiv.org/html/2408.14837v1#bib.bib34 2022]）。我们仅对过去观测条件<math>o_{< n}</math>采用了无分类器指导（Ho & Salimans，[https://arxiv.org/html/2408.14837v1#bib.bib12 2022]）。我们发现对过去动作条件<math>a_{< n}</math>的指导无法提高质量。我们使用的权重相对较小（1.5），因为较大的权重会产生伪影，而我们的自动回归采样则会放大这些伪影。")
00:22, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/38/zh (Created page with "==== 3.3.1 设置 ====")
00:22, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/37/zh (Created page with "=== 3.3 推理 ===")
00:22, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/36/zh (Created page with "Stable Diffusion v1.4 的预训练自动编码器将 8x8 像素块压缩为 4 个潜通道，在预测游戏帧时会导致有意义的伪影，影响小细节，尤其是底栏 HUD（“抬头显示”）。为了在提高图像质量的同时利用预训练的知识，我们仅使用针对目标帧像素计算的 MSE 损失来训练潜在自动编码器的解码器。使用 LPIPS（Zhang 等人（[https://arxiv.org/html/2408.14837v1#bib.bib40 2018]））等感知损失...")
00:22, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/35/zh (Created page with "==== 3.2.2 潜在变量解码器微调 ====")
00:22, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/34/zh (Created page with "center|thumb|900x900px|图 4：自回归漂移。顶部：我们展示了一个简单轨迹的每第 10 帧，共 50 帧，其中玩家没有移动。在 20-30 步后，质量迅速下降。底部：带有噪声增强的相同轨迹没有出现质量下降。")
00:21, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/33/zh (Created page with "如图[https://arxiv.org/html/2408.14837v1#S3.F4 4]所示，教师强制训练和自动回归采样之间的领域偏移会导致误差积累和采样质量的快速下降。为了避免由于模型的自动回归应用而导致的这种偏差，我们在训练时向编码帧中添加不同程度的高斯噪声来扰动背景帧，并将噪声水平作为输入提供给模型，仿效 Ho 等人（[https://arxiv.org/html/2408.14837v1#bib.bib13 2021]）的方法。为此...")
00:21, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/32/zh (Created page with "==== 3.2.1 使用噪声增强缓解自回归漂移 ====")
00:21, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/31/zh (Created page with "其中 <math>T = {\{ o_{i \leq n},a_{i \leq n}\}} \sim \mathcal{T}_{代理}</math>，<math>x_{0} = \phi{(o_{n})}</math>，<math>t \sim \mathcal{U}{(0,1)}</math>，<math>\epsilon \sim \mathcal{N}{(0,\mathbf{I})}</math>，<math>x_{t} = {\sqrt{\overline{\alpha}_{t}}x_{0} + \sqrt{1 - \overline{\alpha}_{t}}\epsilon}</math>，<math>v{(\epsilon,x_{0},t)} = {\sqrt{\overline{\alpha}_{t}}\epsilon - \sqrt{1 - \overline{\alpha}_{t}}x_{0}}</math>，而 <math>v_{\theta^{\prime}}</math...")
00:21, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/30/zh (Created page with "<math>\mathcal{L} = {{\mathbb{E}}_{t,\epsilon,T}\left\lbrack {\|{v{(\epsilon,x_{0},t)}} - {v_{\theta^{\prime}}{(x_{t},t,\{{\phi{(o_{i < n})}}\},\{{A_{emb}{(a_{i < n})}}\}})}}\|}_{2}^{2} \right\rbrack}</math> (1)")
00:21, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/29/zh (Created page with "我们通过速度参数化训练模型，使得扩散损失最小化（Salimans & Ho, [https://arxiv.org/html/2408.14837v1#bib.bib29 2022b]）：")
00:21, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/28/zh (Created page with "我们重新利用预训练的文本到图像扩散模型 Stable Diffusion v1.4（Rombach 等人，[https://arxiv.org/html/2408.14837v1#bib.bib26 2022]）。我们将模型 <math>f_{\theta}</math> 置于轨迹 <math>T \sim \mathcal{T}_{agent}</math> 的条件下，即在之前的动作 <math>a_{< n}</math> 和观察（帧） <math>o_{< n}</math> 的序列条件下，并移除所有文本条件。具体来说，为了以动作为条件，我们仅需学习将每个动作...")
00:21, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/27/zh (Created page with "现在，我们训练一个生成扩散模型，该模型以在前一阶段收集的代理轨迹<math>\mathcal{T}_{agent}</math>（行动和观察）作为条件。")
00:20, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/26/zh (Created page with "=== 3.2 训练生成扩散模型 ===")
00:20, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/25/zh (Created page with "我们在整个训练过程中记录了代理的训练轨迹，其中涵盖了不同技能水平的游戏。这组记录的轨迹构成了我们的<math>\mathcal{T}_{agent}</math>数据集，用于训练生成模型（见第[https://arxiv.org/html/2408.14837v1#S3.SS2 3.2]节）。")
00:20, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/24/zh (Created page with "我们的最终目标是让人类玩家与我们的仿真进行互动。为此，第[https://arxiv.org/html/2408.14837v1#S2 2]节中的策略<math>\pi</math>即为“人类游戏策略”。由于我们无法直接大规模地从中取样，因此我们首先通过教一个自动代理来玩游戏，以此来近似人类游戏。与典型的强化学习设置不同，该设置旨在最大化游戏得分，我们的目标是生成与人类游戏类似的训练数据，或...")
00:20, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/23/zh (Created page with "=== 3.1 通过代理进行数据收集 ===")
00:20, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/22/zh (Created page with "center|thumb|900x900px|图3：GameNGen方法概览。为了简洁起见，省略了v预测的详细信息。")
00:20, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/21/zh (Created page with "GameNGen（发音为“游戏引擎”）是一个生成扩散模型，它能够在第[https://arxiv.org/html/2408.14837v1#S2 2]节的设置下学习模拟游戏。为了收集该模型的训练数据，我们首先使用教师强制目标训练一个独立的模型与环境进行交互。这两个模型（代理和生成模型）依次进行训练。在训练过程中，代理的全部行为和观察语料 <math>\mathcal{T}_{agent}</math> 被保留下来，并在第二...")
00:20, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/20/zh (Created page with "== 3 GameNGen ==")
00:19, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/19/zh (Created page with "我们总是使用教师强迫目标来训练我们的生成模型。给定一个模拟分布函数 <math>q</math>，可以通过自回归地采样观测值来模拟环境 <math>\mathcal{E}</math>。")
00:19, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/18/zh (Created page with "给定输入交互环境 <math>\mathcal{E}</math> 和初始状态 <math>s_{0} \in \mathcal{S}</math>，一个“交互世界模拟”是一个“模拟分布函数” <math>q \left( o_{n} \,|\, \{o_{< n}, a_{\leq n}\} \right), \; o_{i} \in \mathcal{O}, \; a_{i} \in \mathcal{A}</math>。给定观测值之间的距离度量 <math>D: \mathcal{O} \times \mathcal{O} \rightarrow \mathbb{R}</math>，一个“策略”，即给定过去动作和观测的代理动作分布 <math>...")
00:19, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/17/zh (Created page with "例如，在游戏 DOOM 中，<math>\mathcal{S}</math> 是程序的动态内存内容，<math>\mathcal{O}</math> 是渲染的屏幕像素，<math>V</math> 是游戏的渲染逻辑，<math>\mathcal{A}</math> 是按键和鼠标移动的集合，而 <math>p</math> 是基于玩家输入的程序逻辑（包括任何潜在的非确定性）。")
00:19, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/16/zh (Created page with "一个''交互环境''<math>\mathcal{E}</math>由一个潜在状态空间<math>\mathcal{S}</math>、一个潜在空间的部分投影空间<math>\mathcal{O}</math>、一个部分投影函数<math>V: \mathcal{S} \rightarrow \mathcal{O}</math>、一组动作<math>\mathcal{A}</math>，以及一个转移概率函数<math>p \left( s^{\prime} \,|\, a, s \right)</math>，使得<math>s, s^{\prime} \in \mathcal{S}, a\in \mathcal{A}</math>。")
00:19, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/15/zh (Created page with "== 2 互动世界仿真 ==")
00:19, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/14/zh (Created page with "center|thumb|800x800px|图 2：GameNGen 与之前最先进的 DOOM 仿真的比较")
00:19, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/13/zh (Created page with "GameNGen 回答了在通往游戏引擎新范式的道路上一个重要的问题，即游戏可以自动生成，就像近年来神经模型生成图像和视频一样。仍然存在关键问题，例如如何训练这些神经游戏引擎，以及如何有效地创建游戏，包括如何最佳地利用人类输入。尽管如此，我们对这种新范式的可能性感到非常兴奋。")
00:19, 9 September 2024 Felipefelixarias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/12/zh (Created page with "在这项工作中，我们证明答案是肯定的。具体来说，我们展示了一款复杂的视频游戏——标志性游戏《DOOM》，可以在神经网络（开放式 Stable Diffusion v1.4 的增强版（Rombach 等人，[https://arxiv.org/html/2408.14837v1#bib.bib26 2022]））上实时运行，同时获得与原始游戏相当的视觉质量。尽管这不是精确仿真，该神经模型能够执行复杂的游戏状态更新，例如统计生命值和弹...")