Translations:Diffusion Models Are Real-Time Game Engines/64/zh

    Message definition (Diffusion Models Are Real-Time Game Engines)
We evaluate the impact of changing the number <math>N</math> of past observations in the conditioning context by training models with <math>N \in \{ 1,2,4,8,16,32,64\}</math> (recall that our method uses <math>N = 64</math>). This affects both the number of historical frames and actions. We train the models for 200,000 steps while keeping the decoder frozen and evaluate on test-set trajectories from 5 levels. See the results in Table [https://arxiv.org/html/2408.14837v1#S5.T1 1]. As expected, we observe that generation quality improves with the length of the context. Interestingly, while the improvement is large at first (e.g., between 1 and 2 frames), we quickly approach an asymptote, and further increasing the context size provides only small improvements in quality. This is somewhat surprising, as even with our maximal context length the model only has access to a little over 3 seconds of history (64 frames at 20 FPS). Notably, much of the game state persists for much longer periods (see Section [https://arxiv.org/html/2408.14837v1#S7 7]). While the length of the conditioning context is an important limitation, Table [https://arxiv.org/html/2408.14837v1#S5.T1 1] hints that we would likely need to change the architecture of our model to efficiently support longer contexts, and to employ better selection of the past frames to condition on; we leave this for future work.
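
The conditioning context described above can be pictured as a sliding window over the last <math>N</math> frames and actions. The following is a minimal sketch of such a buffer, assuming zero-padding at the start of an episode; the names (ContextBuffer, FRAME_SHAPE) and the padding scheme are illustrative assumptions, not taken from the paper's implementation.

<syntaxhighlight lang="python">
from collections import deque

import numpy as np

# Illustrative constants (assumptions, not from the paper's code):
N = 64                        # context length used by the method
FRAME_SHAPE = (3, 240, 320)   # assumed frame layout for illustration


class ContextBuffer:
    """Sliding window over the last N observed frames and actions."""

    def __init__(self, n: int = N):
        self.frames = deque(maxlen=n)
        self.actions = deque(maxlen=n)

    def push(self, frame: np.ndarray, action: int) -> None:
        # deque(maxlen=n) automatically drops the oldest entry once full.
        self.frames.append(frame)
        self.actions.append(action)

    def context(self) -> tuple[np.ndarray, np.ndarray]:
        """Return (frames, actions), left-padded with zeros when fewer
        than N steps have been observed (e.g., at episode start)."""
        pad = self.frames.maxlen - len(self.frames)
        frames = [np.zeros(FRAME_SHAPE, dtype=np.float32)] * pad + list(self.frames)
        actions = [0] * pad + list(self.actions)
        return np.stack(frames), np.asarray(actions, dtype=np.int64)


# Usage sketch: at each step, push the newest observation, then condition
# the generative model on buffer.context() to predict the next frame.
buffer = ContextBuffer(N)
buffer.push(np.zeros(FRAME_SHAPE, dtype=np.float32), action=0)
frames, actions = buffer.context()
print(frames.shape, actions.shape)  # (64, 3, 240, 320) (64,)
</syntaxhighlight>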
