<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://marovi.ai/index.php?action=history&amp;feed=atom&amp;title=Dropout_A_Simple_Way_to_Prevent_Overfitting%2Fzh</id>
	<title>Dropout A Simple Way to Prevent Overfitting/zh - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://marovi.ai/index.php?action=history&amp;feed=atom&amp;title=Dropout_A_Simple_Way_to_Prevent_Overfitting%2Fzh"/>
	<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Dropout_A_Simple_Way_to_Prevent_Overfitting/zh&amp;action=history"/>
	<updated>2026-04-27T16:59:44Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.39.1</generator>
	<entry>
		<id>https://marovi.ai/index.php?title=Dropout_A_Simple_Way_to_Prevent_Overfitting/zh&amp;diff=4551&amp;oldid=prev</id>
		<title>DeployBot: Batch translate Dropout A Simple Way to Prevent Overfitting unit 10 → zh</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Dropout_A_Simple_Way_to_Prevent_Overfitting/zh&amp;diff=4551&amp;oldid=prev"/>
		<updated>2026-04-27T02:53:28Z</updated>

		<summary type="html">&lt;p&gt;Batch translate Dropout A Simple Way to Prevent Overfitting unit 10 → zh&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 02:53, 27 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;languages /&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;languages /&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;{{LanguageBar | page = Dropout A Simple Way to Prevent Overfitting}}&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;{{PaperInfobox&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;{{PaperInfobox&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-4548:rev-4551 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Dropout_A_Simple_Way_to_Prevent_Overfitting/zh&amp;diff=4548&amp;oldid=prev</id>
		<title>DeployBot: Batch translate Dropout A Simple Way to Prevent Overfitting unit 20 → zh</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Dropout_A_Simple_Way_to_Prevent_Overfitting/zh&amp;diff=4548&amp;oldid=prev"/>
		<updated>2026-04-27T02:51:03Z</updated>

		<summary type="html">&lt;p&gt;Batch translate Dropout A Simple Way to Prevent Overfitting unit 20 → zh&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 02:51, 27 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l70&quot;&gt;Line 70:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 70:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== 另见 ==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== 另见 ==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;div lang=&quot;en&quot; dir=&quot;ltr&quot; class=&quot;mw-content-ltr&quot;&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* [[ImageNet Classification with Deep CNNs]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* [[ImageNet Classification with Deep CNNs]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* [[Batch Normalization Accelerating Deep Network Training]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* [[Batch Normalization Accelerating Deep Network Training]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* [[Deep Residual Learning for Image Recognition]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* [[Deep Residual Learning for Image Recognition]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;/div&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;div lang&lt;/del&gt;=&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;quot;en&amp;quot; dir&lt;/del&gt;=&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;quot;ltr&amp;quot; class&lt;/del&gt;=&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;quot;mw-content-ltr&amp;quot;&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;== &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;参考文献 &lt;/ins&gt;==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;=&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;= References ==&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;/div&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;div lang=&quot;en&quot; dir=&quot;ltr&quot; class=&quot;mw-content-ltr&quot;&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., &amp;amp; Salakhutdinov, R. (2014). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. &amp;#039;&amp;#039;Journal of Machine Learning Research 15&amp;#039;&amp;#039;, 1929-1958. [https://arxiv.org/abs/1207.0580 arXiv:1207.0580]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., &amp;amp; Salakhutdinov, R. (2014). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. &amp;#039;&amp;#039;Journal of Machine Learning Research 15&amp;#039;&amp;#039;, 1929-1958. [https://arxiv.org/abs/1207.0580 arXiv:1207.0580]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., &amp;amp; Salakhutdinov, R. (2012). Improving Neural Networks by Preventing Co-adaptation of Feature Detectors. &amp;#039;&amp;#039;arXiv:1207.0580&amp;#039;&amp;#039;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., &amp;amp; Salakhutdinov, R. (2012). Improving Neural Networks by Preventing Co-adaptation of Feature Detectors. &amp;#039;&amp;#039;arXiv:1207.0580&amp;#039;&amp;#039;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Wan, L., Zeiler, M., Zhang, S., Le Cun, Y., &amp;amp; Fergus, R. (2013). Regularization of Neural Networks using DropConnect. &amp;#039;&amp;#039;ICML 2013&amp;#039;&amp;#039;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Wan, L., Zeiler, M., Zhang, S., Le Cun, Y., &amp;amp; Fergus, R. (2013). Regularization of Neural Networks using DropConnect. &amp;#039;&amp;#039;ICML 2013&amp;#039;&amp;#039;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;/div&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;div lang=&quot;en&quot; dir=&quot;ltr&quot; class=&quot;mw-content-ltr&quot;&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Deep Learning]] [[Category:Research]] [[Category:Research Papers]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Deep Learning]] [[Category:Research]] [[Category:Research Papers]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;/div&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-4542:rev-4548 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Dropout_A_Simple_Way_to_Prevent_Overfitting/zh&amp;diff=4542&amp;oldid=prev</id>
		<title>DeployBot: Batch translate Dropout A Simple Way to Prevent Overfitting unit 21 → zh</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Dropout_A_Simple_Way_to_Prevent_Overfitting/zh&amp;diff=4542&amp;oldid=prev"/>
		<updated>2026-04-27T02:50:53Z</updated>

		<summary type="html">&lt;p&gt;Batch translate Dropout A Simple Way to Prevent Overfitting unit 21 → zh&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&amp;lt;languages /&amp;gt;&lt;br /&gt;
{{LanguageBar | page = Dropout A Simple Way to Prevent Overfitting}}&lt;br /&gt;
&lt;br /&gt;
{{PaperInfobox&lt;br /&gt;
| topic_area  = Deep Learning&lt;br /&gt;
| difficulty  = Research&lt;br /&gt;
| authors     = Nitish Srivastava; Geoffrey Hinton; Alex Krizhevsky; Ilya Sutskever; Ruslan Salakhutdinov&lt;br /&gt;
| year        = 2014&lt;br /&gt;
| venue       = JMLR&lt;br /&gt;
| arxiv_id    = 1207.0580&lt;br /&gt;
| source_url  = https://arxiv.org/abs/1207.0580&lt;br /&gt;
| pdf_url     = https://arxiv.org/pdf/1207.0580&lt;br /&gt;
}}&lt;br /&gt;
{{ContentMeta | generated_by = claude-opus | model_used = claude-opus-4-6 | generated_date = 2026-03-13}}&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Dropout: A Simple Way to Prevent Neural Networks from Overfitting&amp;#039;&amp;#039;&amp;#039; 是 Srivastava 等人于 2014 年发表在《机器学习研究杂志》（Journal of Machine Learning Research）上的论文。该论文形式化并广泛评估了 &amp;#039;&amp;#039;&amp;#039;dropout&amp;#039;&amp;#039;&amp;#039;，这是一种在训练期间随机选择并临时移除神经元的正则化技术。Dropout 防止神经元之间形成复杂的共适应，相当于在单一架构内训练一个指数级大的子网络集成，并成为深度学习中应用最广泛的正则化方法之一。&lt;br /&gt;
&lt;br /&gt;
== 概述 ==&lt;br /&gt;
&lt;br /&gt;
具有大量参数的深度神经网络是强大的函数近似器，但容易出现过拟合，尤其是在训练数据有限时。传统的正则化方法（如 L2 权重衰减和早停）能在一定程度上缓解过拟合，但对于大型网络往往不够。模型组合——即训练多个模型并对它们的预测取平均——被认为可以减少过拟合，但计算代价高昂。&lt;br /&gt;
&lt;br /&gt;
Dropout 提供了一种高效的模型组合近似方法。在每个训练步骤中，每个神经元（包括输入单元）以概率 &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; 被保留，以概率 &amp;lt;math&amp;gt;1 - p&amp;lt;/math&amp;gt; 被丢弃（置零）。这意味着在每个训练样本上都会采样出一个不同的“变薄”子网络。在测试时使用所有神经元，但其输出会被 &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; 缩放，以近似集成的期望输出。&lt;br /&gt;
&lt;br /&gt;
== 主要贡献 ==&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Dropout 正则化&amp;#039;&amp;#039;&amp;#039;：在每次前向和反向传播过程中随机省略神经元的训练流程，防止神经元形成过度专门化的共适应。&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;集成解释&amp;#039;&amp;#039;&amp;#039;：从理论上将 dropout 视为对 &amp;lt;math&amp;gt;2^n&amp;lt;/math&amp;gt; 个可能的变薄网络（其中 &amp;lt;math&amp;gt;n&amp;lt;/math&amp;gt; 为可丢弃单元的数量）进行近似模型平均，且这些网络共享权重。&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;全面的实证评估&amp;#039;&amp;#039;&amp;#039;：在视觉、语音识别、文本分类和计算生物学等多个领域中均一致地观察到性能提升。&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;实用指南&amp;#039;&amp;#039;&amp;#039;：关于 dropout 比率（隐藏层 &amp;lt;math&amp;gt;p = 0.5&amp;lt;/math&amp;gt;，输入层 &amp;lt;math&amp;gt;p = 0.8&amp;lt;/math&amp;gt;）以及与其他超参数交互方式的建议。&lt;br /&gt;
&lt;br /&gt;
== 方法 ==&lt;br /&gt;
&lt;br /&gt;
在训练期间，对于每个训练样本和每一层，每个神经元的输出都会以概率 &amp;lt;math&amp;gt;1 - p&amp;lt;/math&amp;gt; 独立地被置零。如果 &amp;lt;math&amp;gt;h_i&amp;lt;/math&amp;gt; 是神经元 &amp;lt;math&amp;gt;i&amp;lt;/math&amp;gt; 的输出，则 dropout 操作如下：&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;r_i \sim \text{Bernoulli}(p)&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;\tilde{h}_i = r_i \cdot h_i&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
其中 &amp;lt;math&amp;gt;r_i&amp;lt;/math&amp;gt; 是随机掩码变量。然后将丢弃后的网络用于该训练样本的前向传播和反向传播。每个训练样本、每个梯度步都会独立采样一个新的随机掩码。&lt;br /&gt;
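上述掩码采样与前向传播可以用 NumPy 写成如下最小示意（其中的激活数值与保留概率均为本示例的假设，并非论文原始实现）：

```python
import numpy as np

def dropout_forward(h, p, rng):
    """以保留概率 p 采样 Bernoulli 掩码 r_i，并逐元素乘到激活 h 上。"""
    r = rng.binomial(1, p, size=h.shape)   # r_i ~ Bernoulli(p)
    return r * h, r                        # 返回掩码，反向传播需复用同一掩码

rng = np.random.default_rng(0)
h = np.array([0.5, -1.2, 3.0, 0.7])        # 某一层的激活（假设值）
h_tilde, r = dropout_forward(h, p=0.5, rng=rng)
```

注意掩码在前向传播时采样一次并保存，反向传播沿同一个变薄子网络计算梯度。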
&lt;br /&gt;
在测试时不丢弃任何单元。相反，每个神经元的输出会乘以 &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt;，以匹配训练期间的期望值：&lt;br /&gt;
&lt;br /&gt;
&amp;lt;math&amp;gt;h_i^{\text{test}} = p \cdot h_i&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
这种 &amp;#039;&amp;#039;&amp;#039;权重缩放推断规则&amp;#039;&amp;#039;&amp;#039; 确保每个神经元在测试时的期望输出与训练期间的期望输出相等。一种等价的替代方法 &amp;#039;&amp;#039;&amp;#039;反向 dropout&amp;#039;&amp;#039;&amp;#039;（inverted dropout）在训练期间将激活值缩放 &amp;lt;math&amp;gt;1/p&amp;lt;/math&amp;gt;，从而在测试时无需进行任何修改。这种做法在现代实现中更为常见。&lt;br /&gt;
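两种推断方案的对应关系可以用蒙特卡罗估计验证：标准 dropout 的训练期望输出约为 p·h（与测试时乘以 p 对应），反向 dropout 的训练期望输出约为 h（与测试时不做修改对应）。以下为一个示意（激活数值为假设）：

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.5
h = np.array([2.0, -0.4, 1.5])                # 某一层的激活（假设值）
masks = rng.binomial(1, p, size=(200_000, h.size))

# 标准 dropout：训练时仅置零；测试时把输出乘以 p（权重缩放推断规则）
mean_standard = (masks * h).mean(axis=0)      # 训练期望，约等于 p * h
test_standard = p * h

# 反向 dropout：训练时把保留的激活除以 p；测试时不做任何修改
mean_inverted = (masks * h / p).mean(axis=0)  # 训练期望，约等于 h
test_inverted = h
```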
&lt;br /&gt;
作者证明，dropout 可以被解释为训练 &amp;lt;math&amp;gt;2^n&amp;lt;/math&amp;gt; 个共享权重的子网络的集成。在测试时，按比例缩放的完整网络提供了对集成预测的几何均值近似；作者证明，对于具有 softmax 输出的单层网络，这一近似是精确的。&lt;br /&gt;
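在 p = 0.5 的单层 softmax 情形下，这一精确性可以直接数值验证：此时全部 2^n 个掩码等概率，对各子网络 softmax 输出取归一化几何平均，结果与权重缩放网络的输出逐项相等（示例中的权重矩阵与输入均为随机假设值）：

```python
import numpy as np
from itertools import product

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(2)
n, k = 4, 3                              # n 个可丢弃输入单元，k 个类别（假设的小规模）
W = rng.normal(size=(k, n))
x = rng.normal(size=n)

# 枚举全部 2^n 个等概率掩码，对各子网络 softmax 输出取几何平均并重新归一化
log_probs = [np.log(softmax(W @ (np.array(m) * x))) for m in product([0, 1], repeat=n)]
geo = np.exp(np.mean(log_probs, axis=0))
geo = geo / geo.sum()

scaled = softmax(W @ (0.5 * x))          # 权重缩放后的完整网络
```

几何平均在对数域是算术平均，而各掩码下 logits 的平均恰为 0.5·W·x，因此归一化后两者完全一致。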
&lt;br /&gt;
该论文还探讨了 dropout 与其他正则化方法的组合，发现将 dropout 与最大范数约束（将权重向量裁剪为具有最大 L2 范数）以及较大且带衰减的学习率结合使用，能产生最佳效果。&lt;br /&gt;
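其中的最大范数约束通常在每次梯度更新后通过一次投影实现：若某神经元的传入权重向量的 L2 范数超过上限 c，就把它缩放回范数恰为 c。一个最小示意（矩阵形状与 c 的取值均为假设）：

```python
import numpy as np

def max_norm_project(W, c):
    """把每个神经元的传入权重向量（W 的行）投影回 L2 范数不超过 c 的球内。"""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.minimum(1.0, c / np.maximum(norms, 1e-12))
    return W * scale

rng = np.random.default_rng(3)
W = rng.normal(scale=3.0, size=(8, 16))  # 假设的权重矩阵：8 个神经元，各 16 个输入
W_proj = max_norm_project(W, c=2.0)      # 通常在每次梯度更新之后调用一次
```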
&lt;br /&gt;
== 结果 ==&lt;br /&gt;
&lt;br /&gt;
Dropout 在多个基准上进行了评估，并一致地降低了测试误差：&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;MNIST&amp;#039;&amp;#039;&amp;#039;（手写数字）：在标准前馈网络上使用 dropout 后，错误率从 1.60% 降至 1.25%。&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;CIFAR-10/CIFAR-100&amp;#039;&amp;#039;&amp;#039;：在卷积网络上显著降低错误率；在 CIFAR-100 上的相对改进约为 15-25%。&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;SVHN&amp;#039;&amp;#039;&amp;#039;（街景门牌号）：错误率从 2.80% 降至 2.68%。&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;ImageNet&amp;#039;&amp;#039;&amp;#039;：dropout 将一个大型卷积网络的 top-1 错误率降低约 2 个百分点。&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;TIMIT&amp;#039;&amp;#039;&amp;#039;（语音识别）：在不同规模的架构中均观察到一致的提升。&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Reuters&amp;#039;&amp;#039;&amp;#039;（文本分类）：在词袋文本分类任务上性能改善。&lt;br /&gt;
&lt;br /&gt;
论文还分析了使用 dropout 训练的网络所学到的特征：隐藏单元的激活更稀疏，各单元学到的特征也更易单独解释；而不使用 dropout 的网络则倾向于学习冗余且相互共适应的特征。&lt;br /&gt;
&lt;br /&gt;
== 影响 ==&lt;br /&gt;
&lt;br /&gt;
在 2010 年代，dropout 成为神经网络训练的标准做法，并在大多数深度学习框架中默认启用。其概念上的简洁性以及一贯的有效性，使其成为机器学习领域被引用次数最多的论文之一。在训练期间通过随机扰动进行随机正则化的思想，影响了许多后续技术，包括 DropConnect、DropBlock、随机深度（stochastic depth）和数据增强策略。&lt;br /&gt;
&lt;br /&gt;
虽然批归一化（batch normalization）和其他技术在一些卷积架构中降低了对 dropout 的需求，但 dropout 在全连接层、Transformer 模型以及任何存在过拟合风险的场景中仍然被广泛使用。该论文确立了随机化正则化作为深度学习方法论中的核心原则。&lt;br /&gt;
&lt;br /&gt;
== 另见 ==&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div lang=&amp;quot;en&amp;quot; dir=&amp;quot;ltr&amp;quot; class=&amp;quot;mw-content-ltr&amp;quot;&amp;gt;&lt;br /&gt;
* [[ImageNet Classification with Deep CNNs]]&lt;br /&gt;
* [[Batch Normalization Accelerating Deep Network Training]]&lt;br /&gt;
* [[Deep Residual Learning for Image Recognition]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div lang=&amp;quot;en&amp;quot; dir=&amp;quot;ltr&amp;quot; class=&amp;quot;mw-content-ltr&amp;quot;&amp;gt;&lt;br /&gt;
== References ==&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div lang=&amp;quot;en&amp;quot; dir=&amp;quot;ltr&amp;quot; class=&amp;quot;mw-content-ltr&amp;quot;&amp;gt;&lt;br /&gt;
* Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., &amp;amp; Salakhutdinov, R. (2014). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. &amp;#039;&amp;#039;Journal of Machine Learning Research 15&amp;#039;&amp;#039;, 1929-1958. [https://arxiv.org/abs/1207.0580 arXiv:1207.0580]&lt;br /&gt;
* Hinton, G. E., Srivastava, N., Krizhevsky, A., Sutskever, I., &amp;amp; Salakhutdinov, R. (2012). Improving Neural Networks by Preventing Co-adaptation of Feature Detectors. &amp;#039;&amp;#039;arXiv:1207.0580&amp;#039;&amp;#039;.&lt;br /&gt;
* Wan, L., Zeiler, M., Zhang, S., Le Cun, Y., &amp;amp; Fergus, R. (2013). Regularization of Neural Networks using DropConnect. &amp;#039;&amp;#039;ICML 2013&amp;#039;&amp;#039;.&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;lt;div lang=&amp;quot;en&amp;quot; dir=&amp;quot;ltr&amp;quot; class=&amp;quot;mw-content-ltr&amp;quot;&amp;gt;&lt;br /&gt;
[[Category:Deep Learning]] [[Category:Research]] [[Category:Research Papers]]&lt;br /&gt;
&amp;lt;/div&amp;gt;&lt;/div&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
</feed>