All public logs

Combined display of all available logs of Marovi AI. You can narrow down the view by selecting a log type, the username (case-sensitive), or the affected page (also case-sensitive).

00:28, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/76/zh (Created page with "{| class="wikitable" ! 难度级别 ! 数据生成策略 ! PSNR <math>\uparrow</math> ! LPIPS <math>\downarrow</math> |- | 简单 | 代理 | <math>20.94 \pm 0.76</math> | <math>0.48 \pm 0.01</math> |- | | 随机 | <math>20.20 \pm 0.83</math> | <math>0.48 \pm 0.01</math> |- | 中等 | 代理 | <math>20.21 \pm 0.36</math> | <math>0.50 \pm 0.01</math> |- | | 随机 | <math>16.50 \pm 0.41</math> | <math>0.59 \pm 0.01</math> |- | 困难 | 代理 | <math>17.51 \pm 0.35</math...")
00:28, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/75/zh (Created page with "'''表 2：不同难度级别的表现。''' 我们比较了使用代理生成数据和随机生成数据训练的模型在简单、中等和困难数据集上的表现。简单和中等数据集各有 112 个样本，困难数据集有 232 个样本。在 3 秒后的单帧上计算每个轨迹的指标。")
00:28, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/74/zh (Created page with "总体而言，我们观察到在随机轨迹上训练模型的效果出奇地好，但受到随机策略探索能力的限制。在比较单帧生成时，代理的效果稍好，PSNR 为 25.06，而随机策略为 24.42。在比较 3 秒自回归生成后的帧时，差距增大到 19.02 对 16.84。在手动操作模型时，我们发现某些区域对两者都很容易，而某些区域对两者都很困难，而在某些区域，代理的表现要好得多。基于...")
00:28, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/73/zh (Created page with "我们将代理生成的数据训练与使用随机策略生成的数据训练进行比较。对于随机策略，我们根据与观测结果无关的均匀分类分布对动作进行采样。我们通过对两个模型及其解码器进行")
00:28, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/72/zh (Created page with "==== 5.2.3 代理执行 ====")
00:28, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/71/zh (Created page with "center|thumb|600x600px|图 7：噪声增强的影响。图中显示了每个自回归步骤的 PSNR 平均值（越高越好）。不使用噪声增强时，质量在 10-20 帧后迅速下降。噪声增强可以防止这种情况。")
00:28, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/70/zh (Created page with "center|thumb|600x600px|图 7：噪声增强的影响。图中显示了每个自回归步骤的 LPIPS 平均值（越低越好）。未使用噪声增强时，质量在 10-20 帧后迅速下降，而噪声增强可以防止这种情况。")
00:28, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/69/zh (Created page with "在没有噪声增强的情况下，与真实值相比，LPIPS 距离迅速增加，而 PSNR 下降，这表明仿真结果与真实值的偏差加大。")
00:28, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/68/zh (Created page with "为了消除噪声增强的影响，我们训练了一个不添加噪声的模型。我们对标准噪声增强模型和不添加噪声的模型（经过 200,000 步训练后）进行自回归评估，并计算在随机保留的 512 条轨迹上预测帧与真实帧之间的 PSNR 和 LPIPS 指标。我们在图 [https://arxiv.org/html/2408.14837v1#S5.F7 7] 中报告了每个自回归步骤的平均指标值，最多可达 64 帧。")
00:27, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/67/zh (Created page with "==== 5.2.2 噪声增强 ====")
00:27, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/66/zh (Created page with "{| class="wikitable" ! 历史上下文长度 ! PSNR <math>\uparrow</math> ! LPIPS <math>\downarrow</math> |- | 64 | <math>22.36 \pm 0.033</math> | <math>0.295 \pm 0.001</math> |- | 32 | <math>22.31 \pm 0.033</math> | <math>0.296 \pm 0.001</math> |- | 16 | <math>22.28 \pm 0.033</math> | <math>0.296 \pm 0.001</math> |- | 8 | <math>22.26 \pm 0.033</math> | <math>0.296 \pm 0.001</math> |- | 4 | <math>22.26 \pm 0.034</math> | <math>0.298 \pm 0.001</math> |- | 2 | <math>22.03...")
00:27, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/65/zh (Created page with "表 1：历史帧数量。我们在来自 5 个级别的 8912 个测试集示例中分析了用作上下文的历史帧数量。更多的帧通常会改善 PSNR 和 LPIPS 指标。")
00:27, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/64/zh (Created page with "我们通过训练使用<math>N \in \{ 1,2,4,8,16,32,64\}</math>的模型来评估改变条件上下文中过去观测值数量<math>N</math>的影响（请注意，我们的方法使用<math>N = 64</math>）。这影响了历史帧和动作的数量。我们在解码器保持冻结的情况下训练模型200,000步，并在5个级别的测试集轨迹上进行评估。结果见表[https://arxiv.org/html/2408.14837v1#S5.T1 1]。正如预期的那样，我们发现生成...")
00:27, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/63/zh (Created page with "==== 5.2.1 上下文长度 ====")
00:27, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/62/zh (Created page with "为了评估我们方法中不同组件的重要性，我们从评估数据集中采样轨迹，并计算真实值与预测帧之间的 LPIPS 和 PSNR 指标。")
00:26, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/61/zh (Created page with "=== 5.2 消融实验 ===")
00:26, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/60/zh (Created page with "'''人类评估。''' 作为评估仿真质量的另一项标准，我们向 10 名评测员提供了 130 个随机短片段（长度为 1.6 秒和 3.2 秒），并排展示我们的仿真和真实游戏。评测员的任务是识别真实游戏（见附录[https://arxiv.org/html/2408.14837v1#A1.SS6 A.6]中的图[https://arxiv.org/html/2408.14837v1#A1.F14 14]）。评测员在 1.6 秒和 3.2 秒的片段中，选择真实游戏而非仿真的比例分别为 58% 和 60%。")
00:26, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/59/zh (Created page with "因此，我们对512个随机保留的轨迹计算FVD（Unterthiner等人，[https://arxiv.org/html/2408.14837v1#bib.bib35 2019]），测量预测轨迹分布与真实值轨迹分布之间的距离，仿真的长度为16帧（0.8秒）和32帧（1.6秒）。对于16帧，我们的模型获得的FVD为<math>114.02</math>。对于32帧，我们的模型获得的FVD为<math>186.23</math>。")
00:26, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/58/zh (Created page with "center|thumb|600x600px|图 6：自回归评估。64 个自回归步骤的 LPIPS 指标")
00:26, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/57/zh (Created page with "center|thumb|600x600px|图 6：自回归评估。64步自回归过程中的PSNR指标。")
00:26, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/56/zh (Created page with "视频质量我们使用第[https://arxiv.org/html/2408.14837v1#S2 2]节中描述的自回归设置，按照真实轨迹所定义的动作序列对帧进行迭代采样，同时将模型自身的过往预测作为条件。自回归采样时，预测轨迹和真实轨迹常常在几步后发生偏离，这主要是由于不同轨迹的帧间积累了少量不同的运动速度。因此，如图[https://arxiv.org/html/2408.14837v1#S5.F6 6]所示，每帧的PSNR和LPIPS值...")
00:26, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/55/zh (Created page with "center|thumb|900x900px|图 5：模型预测与地面实况对比。仅显示过去观测上下文的最后 4 帧。")
00:26, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/54/zh (Created page with "'''图像质量。''' 我们使用第[https://arxiv.org/html/2408.14837v1#S2 2]节中描述的教师强迫设置来测量LPIPS（Zhang 等人，[https://arxiv.org/html/2408.14837v1#bib.bib40 2018]）和PSNR。在该设置中，我们对初始状态进行采样，并根据地面实况的过去观察轨迹预测单帧。在对5个不同级别的2048条随机轨迹进行评估时，我们的模型实现了<math>29.43</math>的PSNR值和<math>0.249</math>的LPIPS值。PSNR...")
00:25, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/53/zh (Created page with "总体而言，从图像质量来看，我们的方法在长轨迹上实现了与原始游戏相当的仿真质量。对于短轨迹，人类评估者在区分仿真片段和实际游戏片段时，仅比随机猜测略胜一筹。")
00:25, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/52/zh (Created page with "=== 5.1 仿真质量 ===")
00:25, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/51/zh (Created page with "== 5 结果 ==")
00:25, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/50/zh (Created page with "我们使用 Stable Diffusion 1.4 的预训练检查点训练所有仿真模型，解冻所有 U-Net 参数。我们使用的批量大小为 128，恒定学习率为 2e-5，采用无权重衰减的 Adafactor 优化器（Shazeer & Stern，[https://arxiv.org/html/2408.14837v1#bib.bib31 2018]），以及梯度剪切为 1.0。我们将扩散损失参数化更改为 v预测（Salimans & Ho [https://arxiv.org/html/2408.14837v1#bib.bib28 2022a]）。我们以 0.1 的概率去...")
00:25, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/49/zh (Created page with "=== 4.2 生成模型训练 ===")
00:25, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/48/zh (Created page with "代理模型使用 PPO（Schulman 等人，[https://arxiv.org/html/2408.14837v1#bib.bib30 2017]）进行训练，采用简单的 CNN 作为特征网络，基于 Mnih 等人（[https://arxiv.org/html/2408.14837v1#bib.bib21 2015]）的方法。在 CPU 上使用 Stable Baselines 3 基础架构（Raffin 等人，[https://arxiv.org/html/2408.14837v1#bib.bib24 2021]）进行训练。代理接收缩小后的帧图像和游戏地图，每个分辨率为 160x120。代理还可以...")
00:25, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/47/zh (Created page with "=== 4.1 代理训练 ===")
00:24, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/46/zh (Created page with "== 4 实验设置 ==")
00:24, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/45/zh (Created page with "我们注意到，类似于 NVidia 的经典 SLI 交替帧渲染（AFR）技术，通过在额外硬件上并行生成多个帧，可以显著提高图像生成速率。然而，与 AFR 类似，实际的仿真速率不会提高，输入延迟也不会减少。")
00:24, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/44/zh (Created page with "由于我们在使用单步采样时确实观察到了质量下降，因此我们在单步设置中进行了类似于（Yin 等人，[https://arxiv.org/html/2408.14837v1#bib.bib39 2024]；Wang 等人，[https://arxiv.org/html/2408.14837v1#bib.bib36 2023]）的模型蒸馏实验。蒸馏确实有很大帮助（使我们达到了上述的 50 FPS），但仍会对仿真质量造成一定影响，因此我们选择在我们的方法中使用不带蒸馏的 4 步版本（见...")
00:24, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/43/zh (Created page with "仅使用 4 个去噪步骤导致 U-Net 总耗时为 40 毫秒（包括自动编码器的推理总耗时为 50 毫秒），即每秒 20 帧。我们推测，在我们的案例中，较少步骤对质量影响可忽略不计，是由于以下因素的结合：(1) 受限的图像空间，以及 (2) 前一帧的强条件作用。")
00:24, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/42/zh (Created page with "在推理过程中，我们需要运行 U-Net 去噪器（进行若干步）和自动编码器。在我们的硬件配置（TPU-v5）下，一次去噪步骤和自动编码器的评估各需 10 毫秒。如果我们以单步去噪器运行模型，设置中的最小总延迟为每帧 20 毫秒，即每秒 50 帧。通常情况下，生成扩散模型（如 Stable Diffusion）通过单次去噪步骤无法产生高质量结果，而是需要数十个采样步骤才能生...")
00:24, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/41/zh (Created page with "==== 3.3.2 去噪器采样步骤 ====")
00:24, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/40/zh (Created page with "我们还尝试了同时生成 4 个样本并合并结果，希望能防止罕见的极端预测被采纳，并减少误差累积。我们尝试了对样本进行平均和选择最接近中位数的样本。平均效果略逊于单帧，而选择最接近中位数的样本效果仅略有提升。由于这两种方法都会将硬件需求提高到 4 个张量处理单元（TPU），因此我们决定不使用这些方法，但注意到这可能是未来研究的一个有...")
00:23, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/39/zh (Created page with "我们使用DDIM采样（Song等人，[https://arxiv.org/html/2408.14837v1#bib.bib34 2022]）。我们仅对过去观测条件<math>o_{< n}</math>采用了无分类器指导（Ho & Salimans，[https://arxiv.org/html/2408.14837v1#bib.bib12 2022]）。我们发现对过去动作条件<math>a_{< n}</math>的指导无法提高质量。我们使用的权重相对较小（1.5），因为较大的权重会产生伪影，而我们的自动回归采样则会放大这些伪影。")
00:22, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/38/zh (Created page with "==== 3.3.1 设置 ====")
00:22, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/37/zh (Created page with "=== 3.3 推理 ===")
00:22, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/36/zh (Created page with "Stable Diffusion v1.4 的预训练自动编码器将 8x8 像素块压缩为 4 个潜通道，在预测游戏帧时会导致有意义的伪影，影响小细节，尤其是底栏 HUD（“抬头显示”）。为了在提高图像质量的同时利用预训练的知识，我们仅使用针对目标帧像素计算的 MSE 损失来训练潜在自动编码器的解码器。使用 LPIPS（Zhang 等人（[https://arxiv.org/html/2408.14837v1#bib.bib40 2018]））等感知损失...")
00:22, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/35/zh (Created page with "==== 3.2.2 潜在变量解码器微调 ====")
00:22, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/34/zh (Created page with "center|thumb|900x900px|图 4：自回归漂移。顶部：我们展示了一个简单轨迹的每第 10 帧，共 50 帧，其中玩家没有移动。在 20-30 步后，质量迅速下降。底部：带有噪声增强的相同轨迹没有出现质量下降。")
00:21, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/33/zh (Created page with "如图[https://arxiv.org/html/2408.14837v1#S3.F4 4]所示，教师强制训练和自动回归采样之间的领域偏移会导致误差积累和采样质量的快速下降。为了避免由于模型的自动回归应用而导致的这种偏差，我们在训练时向编码帧中添加不同程度的高斯噪声来扰动背景帧，并将噪声水平作为输入提供给模型，仿效 Ho 等人（[https://arxiv.org/html/2408.14837v1#bib.bib13 2021]）的方法。为此...")
00:21, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/32/zh (Created page with "==== 3.2.1 使用噪声增强缓解自回归漂移 ====")
00:21, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/31/zh (Created page with "其中 <math>T = {\{ o_{i \leq n},a_{i \leq n}\}} \sim \mathcal{T}_{代理}</math>，<math>x_{0} = \phi{(o_{n})}</math>，<math>t \sim \mathcal{U}{(0,1)}</math>，<math>\epsilon \sim \mathcal{N}{(0,\mathbf{I})}</math>，<math>x_{t} = {\sqrt{\overline{\alpha}_{t}}x_{0} + \sqrt{1 - \overline{\alpha}_{t}}\epsilon}</math>，<math>v{(\epsilon,x_{0},t)} = {\sqrt{\overline{\alpha}_{t}}\epsilon - \sqrt{1 - \overline{\alpha}_{t}}x_{0}}</math>，而 <math>v_{\theta^{\prime}}</math...")
00:21, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/30/zh (Created page with "<math>\mathcal{L} = {{\mathbb{E}}_{t,\epsilon,T}\left\lbrack {\|{v{(\epsilon,x_{0},t)}} - {v_{\theta^{\prime}}{(x_{t},t,\{{\phi{(o_{i < n})}}\},\{{A_{emb}{(a_{i < n})}}\}})}}\|}_{2}^{2} \right\rbrack}</math> (1)")
00:21, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/29/zh (Created page with "我们通过速度参数化训练模型，使得扩散损失最小化（Salimans & Ho, [https://arxiv.org/html/2408.14837v1#bib.bib29 2022b]）：")
00:21, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/28/zh (Created page with "我们重新利用预训练的文本到图像扩散模型 Stable Diffusion v1.4（Rombach 等人，[https://arxiv.org/html/2408.14837v1#bib.bib26 2022]）。我们将模型 <math>f_{\theta}</math> 置于轨迹 <math>T \sim \mathcal{T}_{agent}</math> 的条件下，即在之前的动作 <math>a_{< n}</math> 和观察（帧） <math>o_{< n}</math> 的序列条件下，并移除所有文本条件。具体来说，为了以动作为条件，我们仅需学习将每个动作...")
00:21, 9 September 2024 FelipeArias talk contribs created page Translations:Diffusion Models Are Real-Time Game Engines/27/zh (Created page with "现在，我们训练一个生成扩散模型，该模型以在前一阶段收集的代理轨迹<math>\mathcal{T}_{agent}</math>（行动和观察）作为条件。")