Translations:Attention Mechanisms/3/zh

    From Marovi AI
    Revision as of 23:36, 27 April 2026 by DeployBot (talk | contribs) (Batch translate Attention Mechanisms unit 3 → zh)

    Early sequence-to-sequence models used recurrent neural networks to encode the entire input sequence into a single fixed-dimensional vector. This bottleneck forces long-range dependencies to be compressed into a constant-size vector, degrading performance on long sequences. Attention mechanisms address this by letting the decoder, at each generation step, query every encoder hidden state and weight them by learned relevance scores.
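    The query-and-weight step above can be sketched in a few lines of NumPy. This is a minimal illustration, not any particular paper's formulation: it assumes simple dot-product scoring (the original attention work used a learned additive score), a single decoder state vector, and encoder states stacked in a matrix.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the last axis.
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attend(decoder_state, encoder_states):
    """One attention step (illustrative dot-product variant):
    score every encoder hidden state against the current decoder
    state, normalize the scores, and take the weighted average."""
    scores = encoder_states @ decoder_state   # (T,) one score per source position
    weights = softmax(scores)                 # (T,) nonnegative, sums to 1
    context = weights @ encoder_states        # (d,) weighted sum of encoder states
    return context, weights

# Toy example: 5 source positions, hidden size 4 (arbitrary values).
rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(5, 4))
decoder_state = rng.normal(size=4)
context, weights = attend(decoder_state, encoder_states)
print(weights, context)
```

    The context vector replaces the single fixed vector of earlier models: it is recomputed at every decoding step, so the decoder can focus on different source positions as it generates.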