<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://marovi.ai/index.php?action=history&amp;feed=atom&amp;title=Batch_Normalization%2Fes</id>
	<title>Batch Normalization/es - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://marovi.ai/index.php?action=history&amp;feed=atom&amp;title=Batch_Normalization%2Fes"/>
	<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Batch_Normalization/es&amp;action=history"/>
	<updated>2026-04-24T12:58:54Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.39.1</generator>
	<entry>
		<id>https://marovi.ai/index.php?title=Batch_Normalization/es&amp;diff=2149&amp;oldid=prev</id>
		<title>DeployBot: [deploy-bot] Deploy from CI (8c92aeb)</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Batch_Normalization/es&amp;diff=2149&amp;oldid=prev"/>
		<updated>2026-04-24T07:09:00Z</updated>

		<summary type="html">&lt;p&gt;[deploy-bot] Deploy from CI (8c92aeb)&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 07:09, 24 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l98&quot;&gt;Line 98:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 98:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Intermediate]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Intermediate]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Neural Networks]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Neural Networks]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!--v1.2.0 cache-bust--&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!-- pass 2 --&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-2090:rev-2149 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Batch_Normalization/es&amp;diff=2090&amp;oldid=prev</id>
		<title>DeployBot: Pass 2 force re-parse</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Batch_Normalization/es&amp;diff=2090&amp;oldid=prev"/>
		<updated>2026-04-24T07:00:26Z</updated>

		<summary type="html">&lt;p&gt;Pass 2 force re-parse&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 07:00, 24 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l99&quot;&gt;Line 99:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 99:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Neural Networks]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Neural Networks]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;!--v1.2.0 cache-bust--&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;!--v1.2.0 cache-bust--&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!-- pass 2 --&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-2053:rev-2090 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Batch_Normalization/es&amp;diff=2053&amp;oldid=prev</id>
		<title>DeployBot: Force re-parse after Math source-mode rollout (v1.2.0)</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Batch_Normalization/es&amp;diff=2053&amp;oldid=prev"/>
		<updated>2026-04-24T06:57:50Z</updated>

		<summary type="html">&lt;p&gt;Force re-parse after Math source-mode rollout (v1.2.0)&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 06:57, 24 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l98&quot;&gt;Line 98:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 98:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Intermediate]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Intermediate]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Neural Networks]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Neural Networks]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!--v1.2.0 cache-bust--&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-1998:rev-2053 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Batch_Normalization/es&amp;diff=1998&amp;oldid=prev</id>
		<title>DeployBot: [deploy-bot] Deploy from CI (775ba6e)</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Batch_Normalization/es&amp;diff=1998&amp;oldid=prev"/>
		<updated>2026-04-24T04:01:46Z</updated>

		<summary type="html">&lt;p&gt;[deploy-bot] Deploy from CI (775ba6e)&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{{LanguageBar | page = Batch Normalization}}&lt;br /&gt;
{{ArticleInfobox | topic_area = Deep Learning | difficulty = Intermediate | prerequisites = [[Neural Networks]], [[Backpropagation]]}}&lt;br /&gt;
{{ContentMeta | generated_by = claude-opus | model_used = claude-opus-4-6 | generated_date = 2026-03-13}}&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Batch normalization&amp;#039;&amp;#039;&amp;#039; (often abbreviated &amp;#039;&amp;#039;&amp;#039;BatchNorm&amp;#039;&amp;#039;&amp;#039; or &amp;#039;&amp;#039;&amp;#039;BN&amp;#039;&amp;#039;&amp;#039;) is a technique for improving the speed, stability, and performance of deep neural networks by normalizing the inputs to each layer. Introduced by Ioffe and Szegedy in 2015, it has become a standard component of most modern deep learning architectures.&lt;br /&gt;
&lt;br /&gt;
== Internal covariate shift ==&lt;br /&gt;
&lt;br /&gt;
The original motivation for batch normalization was to address &amp;#039;&amp;#039;&amp;#039;internal covariate shift&amp;#039;&amp;#039;&amp;#039;, the phenomenon whereby the distribution of the inputs to each layer changes during training as the parameters of the preceding layers are updated. This shifting distribution forces each layer to adapt continually, slowing convergence and requiring careful initialization and small learning rates.&lt;br /&gt;
&lt;br /&gt;
Although the precise role of internal covariate shift has been debated (Santurkar et al., 2018, argued that the benefits of BatchNorm stem more from smoothing the loss landscape), the practical effectiveness of the technique is well established.&lt;br /&gt;
&lt;br /&gt;
== The batch normalization algorithm ==&lt;br /&gt;
&lt;br /&gt;
=== During training ===&lt;br /&gt;
&lt;br /&gt;
For a mini-batch &amp;lt;math&amp;gt;\mathcal{B} = \{x_1, \dots, x_m\}&amp;lt;/math&amp;gt; of activations at a given layer, BatchNorm proceeds as follows:&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Step 1.&amp;#039;&amp;#039;&amp;#039; Compute the mini-batch mean and variance:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\mu_{\mathcal{B}} = \frac{1}{m} \sum_{i=1}^{m} x_i, \qquad \sigma_{\mathcal{B}}^2 = \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu_{\mathcal{B}})^2&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Step 2.&amp;#039;&amp;#039;&amp;#039; Normalize:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\hat{x}_i = \frac{x_i - \mu_{\mathcal{B}}}{\sqrt{\sigma_{\mathcal{B}}^2 + \epsilon}}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;math&amp;gt;\epsilon&amp;lt;/math&amp;gt; is a small constant (e.g., &amp;lt;math&amp;gt;10^{-5}&amp;lt;/math&amp;gt;) for numerical stability.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Step 3.&amp;#039;&amp;#039;&amp;#039; Scale and shift with learned parameters &amp;lt;math&amp;gt;\gamma&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\beta&amp;lt;/math&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;y_i = \gamma \hat{x}_i + \beta&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The parameters &amp;lt;math&amp;gt;\gamma&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;\beta&amp;lt;/math&amp;gt; are learned during training. They restore the ability of the network to represent the identity transformation when that is optimal, ensuring that normalization does not reduce the expressive power of the model.&lt;br /&gt;
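The three training-time steps above can be sketched in NumPy (a minimal illustration, not code from the original paper; the function name and the (batch, features) layout are assumptions):&lt;br /&gt;

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Training-time BatchNorm for a (batch, features) array.

    gamma and beta are the learned per-feature scale and shift;
    eps is the small constant for numerical stability.
    """
    mu = x.mean(axis=0)                    # Step 1: mini-batch mean
    var = x.var(axis=0)                    # Step 1: mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)  # Step 2: normalize
    return gamma * x_hat + beta            # Step 3: scale and shift
```

With gamma = 1 and beta = 0, each feature of the output has mean 0 and variance approximately 1 over the mini-batch.&lt;br /&gt;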
&lt;br /&gt;
=== At inference time ===&lt;br /&gt;
&lt;br /&gt;
At inference time, the statistics of individual mini-batches are unreliable (the input may be a single example). Instead, BatchNorm uses estimates of the population mean and variance, accumulated during training via exponential moving averages:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\mu_{\mathrm{running}} \leftarrow (1 - \alpha)\, \mu_{\mathrm{running}} + \alpha\, \mu_{\mathcal{B}}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\sigma^2_{\mathrm{running}} \leftarrow (1 - \alpha)\, \sigma^2_{\mathrm{running}} + \alpha\, \sigma^2_{\mathcal{B}}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;math&amp;gt;\alpha&amp;lt;/math&amp;gt; is the momentum parameter (typically 0.1). These fixed statistics ensure deterministic outputs at inference time.&lt;br /&gt;
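The running-average update and the inference-time pass can be sketched as follows (illustrative names, not an API from the original paper):&lt;br /&gt;

```python
import numpy as np

def update_running_stats(running_mean, running_var, batch_mean, batch_var, alpha=0.1):
    # Exponential moving average of the batch statistics; alpha is the momentum.
    running_mean = (1 - alpha) * running_mean + alpha * batch_mean
    running_var = (1 - alpha) * running_var + alpha * batch_var
    return running_mean, running_var

def batchnorm_inference(x, gamma, beta, running_mean, running_var, eps=1e-5):
    # Deterministic: normalizes with the accumulated statistics,
    # not with statistics computed from the current input.
    x_hat = (x - running_mean) / np.sqrt(running_var + eps)
    return gamma * x_hat + beta
```

Because the accumulated statistics are fixed, a given example produces the same output every time it passes through the layer.&lt;br /&gt;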
&lt;br /&gt;
== Benefits ==&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Higher learning rates&amp;#039;&amp;#039;&amp;#039;: By constraining the activation distributions, BatchNorm allows larger steps without divergence.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Less sensitivity to initialization&amp;#039;&amp;#039;&amp;#039;: Networks with BatchNorm are more tolerant of poor weight initialization.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Regularizing effect&amp;#039;&amp;#039;&amp;#039;: The noise introduced by mini-batch statistics acts as a mild regularizer, sometimes reducing the need for [[Dropout]].&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Faster convergence&amp;#039;&amp;#039;&amp;#039;: Training typically requires fewer epochs to reach a given level of performance.&lt;br /&gt;
&lt;br /&gt;
== Placement ==&lt;br /&gt;
&lt;br /&gt;
BatchNorm is typically applied &amp;#039;&amp;#039;&amp;#039;before&amp;#039;&amp;#039;&amp;#039; the activation function (as in the original paper), although some practitioners place it &amp;#039;&amp;#039;&amp;#039;after&amp;#039;&amp;#039;&amp;#039; the activation. For convolutional layers, normalization is performed per channel across the spatial dimensions and the batch dimension.&lt;br /&gt;
&lt;br /&gt;
== Normalization alternatives ==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Method !! Normalizes over !! Use case&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Batch Norm&amp;#039;&amp;#039;&amp;#039; || Batch and spatial dimensions, per channel || CNNs with large batches&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Layer Norm&amp;#039;&amp;#039;&amp;#039; || All channels and spatial dimensions, per sample || Transformers, RNNs, small batches&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Instance Norm&amp;#039;&amp;#039;&amp;#039; || Spatial dimensions only, per sample and per channel || Style transfer, image generation&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Group Norm&amp;#039;&amp;#039;&amp;#039; || Groups of channels, per sample || Object detection, small-batch training&lt;br /&gt;
|}&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Layer normalization&amp;#039;&amp;#039;&amp;#039; (Ba et al., 2016) normalizes across all features within a single sample, making it independent of batch size. It is the standard choice in Transformer architectures.&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Group normalization&amp;#039;&amp;#039;&amp;#039; (Wu and He, 2018) divides the channels into groups and normalizes within each group, per sample. It bridges Layer Norm and Instance Norm and works well when batch sizes are too small for reliable batch statistics.&lt;br /&gt;
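The comparison above amounts to a choice of reduction axes. For an activation tensor in (N, C, H, W) layout this can be shown directly (an illustrative NumPy sketch; the tensor shape and group count are arbitrary):&lt;br /&gt;

```python
import numpy as np

# Activations in (N, C, H, W) layout: batch, channels, height, width.
x = np.random.default_rng(0).normal(size=(8, 6, 4, 4))

# Batch Norm: per channel, over the batch and spatial dimensions.
bn_mean = x.mean(axis=(0, 2, 3), keepdims=True)   # shape (1, 6, 1, 1)

# Layer Norm: per sample, over all channels and spatial dimensions.
ln_mean = x.mean(axis=(1, 2, 3), keepdims=True)   # shape (8, 1, 1, 1)

# Instance Norm: per sample and per channel, over spatial dimensions only.
in_mean = x.mean(axis=(2, 3), keepdims=True)      # shape (8, 6, 1, 1)

# Group Norm: split the 6 channels into 3 groups of 2,
# then reduce within each group, per sample.
groups = 3
gx = x.reshape(8, groups, 6 // groups, 4, 4)
gn_mean = gx.mean(axis=(2, 3, 4), keepdims=True)  # shape (8, 3, 1, 1, 1)
```

Only Batch Norm reduces over the batch axis, which is why the other three remain usable when the batch is small.&lt;br /&gt;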
&lt;br /&gt;
== Limitations ==&lt;br /&gt;
&lt;br /&gt;
* Performance degrades with very small batch sizes, since the batch statistics become noisy.&lt;br /&gt;
* It introduces a discrepancy between training behavior (batch statistics) and inference behavior (running statistics).&lt;br /&gt;
* It is not directly applicable to variable-length sequences without padding or masking.&lt;br /&gt;
* The running statistics require careful handling in distributed training across multiple devices.&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
&lt;br /&gt;
* [[Neural Networks]]&lt;br /&gt;
* [[Backpropagation]]&lt;br /&gt;
* [[Dropout]]&lt;br /&gt;
* [[Stochastic Gradient Descent]]&lt;br /&gt;
* [[Transformer]]&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* Ioffe, S. and Szegedy, C. (2015). &amp;quot;Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift&amp;quot;. &amp;#039;&amp;#039;ICML&amp;#039;&amp;#039;.&lt;br /&gt;
* Ba, J. L., Kiros, J. R. and Hinton, G. E. (2016). &amp;quot;Layer Normalization&amp;quot;. &amp;#039;&amp;#039;arXiv:1607.06450&amp;#039;&amp;#039;.&lt;br /&gt;
* Wu, Y. and He, K. (2018). &amp;quot;Group Normalization&amp;quot;. &amp;#039;&amp;#039;ECCV&amp;#039;&amp;#039;.&lt;br /&gt;
* Santurkar, S. et al. (2018). &amp;quot;How Does Batch Normalization Help Optimization?&amp;quot;. &amp;#039;&amp;#039;NeurIPS&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
[[Category:Deep Learning]]&lt;br /&gt;
[[Category:Intermediate]]&lt;br /&gt;
[[Category:Neural Networks]]&lt;/div&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
</feed>