<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://marovi.ai/index.php?action=history&amp;feed=atom&amp;title=Loss_Functions%2Fes</id>
	<title>Loss Functions/es - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://marovi.ai/index.php?action=history&amp;feed=atom&amp;title=Loss_Functions%2Fes"/>
	<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Loss_Functions/es&amp;action=history"/>
	<updated>2026-04-24T13:01:33Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.39.1</generator>
	<entry>
		<id>https://marovi.ai/index.php?title=Loss_Functions/es&amp;diff=2155&amp;oldid=prev</id>
		<title>DeployBot: [deploy-bot] Deploy from CI (8c92aeb)</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Loss_Functions/es&amp;diff=2155&amp;oldid=prev"/>
		<updated>2026-04-24T07:09:01Z</updated>

		<summary type="html">&lt;p&gt;[deploy-bot] Deploy from CI (8c92aeb)&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 07:09, 24 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l111&quot;&gt;Line 111:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 111:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Machine Learning]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Machine Learning]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Introductory]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Introductory]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!--v1.2.0 cache-bust--&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!-- pass 2 --&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-2104:rev-2155 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Loss_Functions/es&amp;diff=2104&amp;oldid=prev</id>
		<title>DeployBot: Pass 2 force re-parse</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Loss_Functions/es&amp;diff=2104&amp;oldid=prev"/>
		<updated>2026-04-24T07:00:58Z</updated>

		<summary type="html">&lt;p&gt;Pass 2 force re-parse&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 07:00, 24 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l112&quot;&gt;Line 112:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 112:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Introductory]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Introductory]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;!--v1.2.0 cache-bust--&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;!--v1.2.0 cache-bust--&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!-- pass 2 --&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-2067:rev-2104 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Loss_Functions/es&amp;diff=2067&amp;oldid=prev</id>
		<title>DeployBot: Force re-parse after Math source-mode rollout (v1.2.0)</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Loss_Functions/es&amp;diff=2067&amp;oldid=prev"/>
		<updated>2026-04-24T06:58:22Z</updated>

		<summary type="html">&lt;p&gt;Force re-parse after Math source-mode rollout (v1.2.0)&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 06:58, 24 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l111&quot;&gt;Line 111:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 111:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Machine Learning]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Machine Learning]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Introductory]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Introductory]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!--v1.2.0 cache-bust--&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-2004:rev-2067 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Loss_Functions/es&amp;diff=2004&amp;oldid=prev</id>
		<title>DeployBot: [deploy-bot] Deploy from CI (775ba6e)</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Loss_Functions/es&amp;diff=2004&amp;oldid=prev"/>
		<updated>2026-04-24T04:01:49Z</updated>

		<summary type="html">&lt;p&gt;[deploy-bot] Deploy from CI (775ba6e)&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{{LanguageBar | page = Loss Functions}}&lt;br /&gt;
{{ArticleInfobox | topic_area = Machine Learning | difficulty = Introductory | prerequisites = }}&lt;br /&gt;
{{ContentMeta | generated_by = claude-opus | model_used = claude-opus-4-6 | generated_date = 2026-03-13}}&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Loss functions&amp;#039;&amp;#039;&amp;#039; (also called &amp;#039;&amp;#039;&amp;#039;cost functions&amp;#039;&amp;#039;&amp;#039; or &amp;#039;&amp;#039;&amp;#039;objective functions&amp;#039;&amp;#039;&amp;#039;) quantify how far a model&amp;#039;s predictions are from the desired outcome. Minimizing the loss function is the central goal of training in machine learning: the optimization algorithm adjusts the model&amp;#039;s parameters to drive the loss as low as possible.&lt;br /&gt;
&lt;br /&gt;
== Purpose ==&lt;br /&gt;
&lt;br /&gt;
A loss function maps the model&amp;#039;s prediction &amp;lt;math&amp;gt;\hat{y}&amp;lt;/math&amp;gt; and the true target &amp;lt;math&amp;gt;y&amp;lt;/math&amp;gt; to a non-negative real number. Formally, for a single example:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\ell: \mathcal{Y} \times \mathcal{Y} \to \mathbb{R}_{\geq 0}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
Over a dataset of &amp;lt;math&amp;gt;N&amp;lt;/math&amp;gt; examples, the total loss is typically the average:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;L(\theta) = \frac{1}{N}\sum_{i=1}^{N}\ell\bigl(y_i,\, \hat{y}_i(\theta)\bigr)&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The choice of loss function encodes the structure of the problem: which kinds of errors matter, and how severely they should be penalized. A poorly chosen loss function can lead to a model that optimizes the wrong objective.&lt;br /&gt;
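The averaging above can be sketched in a few lines; this illustrative NumPy snippet (the function names are ours, not a standard API) computes the dataset loss from any per-example loss:

```python
import numpy as np

def average_loss(per_example_loss, y, y_hat):
    """Dataset loss L(theta): the mean of a per-example loss over N examples."""
    return np.mean([per_example_loss(yi, yhi) for yi, yhi in zip(y, y_hat)])

# Squared error as an example of a per-example loss l(y, y_hat).
squared = lambda yi, yhi: (yi - yhi) ** 2
```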
&lt;br /&gt;
== Mean squared error ==&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Mean squared error&amp;#039;&amp;#039;&amp;#039; (MSE) is the default loss for regression tasks:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;L_{\text{MSE}} = \frac{1}{N}\sum_{i=1}^{N}(y_i - \hat{y}_i)^2&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
MSE penalizes large errors quadratically, which makes it sensitive to outliers. Its gradient is straightforward:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\frac{\partial}{\partial \hat{y}_i} (y_i - \hat{y}_i)^2 = -2(y_i - \hat{y}_i)&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A closely related variant is the &amp;#039;&amp;#039;&amp;#039;mean absolute error&amp;#039;&amp;#039;&amp;#039; (MAE), &amp;lt;math&amp;gt;\frac{1}{N}\sum|y_i - \hat{y}_i|&amp;lt;/math&amp;gt;, which is more robust to outliers but has a non-smooth gradient at zero. The &amp;#039;&amp;#039;&amp;#039;Huber loss&amp;#039;&amp;#039;&amp;#039; combines the two: it behaves like MSE for small errors and like MAE for large ones.&lt;br /&gt;
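The contrast between these three regression losses can be seen on data with a single outlier; a minimal NumPy sketch (function names are illustrative):

```python
import numpy as np

def mse(y, y_hat):
    """Mean squared error: quadratic penalty, sensitive to outliers."""
    return np.mean((y - y_hat) ** 2)

def mae(y, y_hat):
    """Mean absolute error: linear penalty, more robust to outliers."""
    return np.mean(np.abs(y - y_hat))

def huber(y, y_hat, delta=1.0):
    """Huber loss: quadratic for |error| <= delta, linear beyond it."""
    err = np.abs(y - y_hat)
    quad = 0.5 * err ** 2
    lin = delta * (err - 0.5 * delta)
    return np.mean(np.where(err <= delta, quad, lin))

y = np.array([1.0, 2.0, 3.0, 100.0])    # the last target is an outlier
y_hat = np.array([1.1, 1.9, 3.2, 3.0])
```

The outlier dominates MSE far more than MAE or Huber, which is exactly the sensitivity the text describes.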
&lt;br /&gt;
== Cross-entropy loss ==&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Cross-entropy loss&amp;#039;&amp;#039;&amp;#039; is the standard choice for classification tasks. It measures the dissimilarity between the predicted probability distribution and the true label distribution.&lt;br /&gt;
&lt;br /&gt;
=== Binary cross-entropy ===&lt;br /&gt;
&lt;br /&gt;
For binary classification with predicted probability &amp;lt;math&amp;gt;p&amp;lt;/math&amp;gt; and true label &amp;lt;math&amp;gt;y \in \{0, 1\}&amp;lt;/math&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;L_{\text{BCE}} = -\frac{1}{N}\sum_{i=1}^{N}\bigl[y_i \log p_i + (1 - y_i)\log(1 - p_i)\bigr]&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
This loss is minimized when the predicted probability matches the true label exactly (&amp;lt;math&amp;gt;p = 1&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;y = 1&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;p = 0&amp;lt;/math&amp;gt; when &amp;lt;math&amp;gt;y = 0&amp;lt;/math&amp;gt;).&lt;br /&gt;
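A direct NumPy sketch of the formula above (the clipping constant is an implementation detail we add so that log(0) never occurs, not part of the definition):

```python
import numpy as np

def binary_cross_entropy(y, p, eps=1e-12):
    """Average negative log-likelihood of the true binary labels.

    eps clips probabilities away from 0 and 1 so log() stays finite.
    """
    p = np.clip(p, eps, 1 - eps)
    return -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

y = np.array([1, 0, 1, 1])
confident = np.array([0.9, 0.1, 0.8, 0.95])   # close to the true labels
uncertain = np.array([0.6, 0.4, 0.5, 0.6])    # hedging near 0.5
```

Confident, correct predictions yield a lower loss than hedged ones, matching the statement that the loss is minimized when p matches y.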
&lt;br /&gt;
=== Categorical cross-entropy ===&lt;br /&gt;
&lt;br /&gt;
For multi-class classification with &amp;lt;math&amp;gt;C&amp;lt;/math&amp;gt; classes and predicted probability vector &amp;lt;math&amp;gt;\hat{\mathbf{y}}&amp;lt;/math&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;L_{\text{CE}} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{i,c} \log \hat{y}_{i,c}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
When the true labels are one-hot encoded, only the term for the correct class survives.&lt;br /&gt;
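That reduction can be checked numerically; a small sketch (illustrative code, with an eps guard added by us for numerical safety):

```python
import numpy as np

def categorical_cross_entropy(y_onehot, y_hat, eps=1e-12):
    """Cross-entropy between one-hot labels and predicted class probabilities."""
    return -np.mean(np.sum(y_onehot * np.log(y_hat + eps), axis=1))

# Two examples, three classes; each row of y_hat sums to 1.
y_onehot = np.array([[1, 0, 0],
                     [0, 0, 1]])
y_hat = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])

# With one-hot labels the sum collapses to -log of the correct-class probability.
nll = -np.mean(np.log(y_hat[np.arange(2), [0, 2]]))
```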
&lt;br /&gt;
== Hinge loss ==&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Hinge loss&amp;#039;&amp;#039;&amp;#039; is associated with support vector machines (SVMs) and maximum-margin classifiers. For a binary classification problem with labels &amp;lt;math&amp;gt;y \in \{-1, +1\}&amp;lt;/math&amp;gt; and raw model output &amp;lt;math&amp;gt;s&amp;lt;/math&amp;gt;:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;L_{\text{hinge}} = \frac{1}{N}\sum_{i=1}^{N}\max(0,\; 1 - y_i \, s_i)&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
The hinge loss is zero when the prediction has the correct sign with a margin of at least 1, and grows linearly otherwise. Since it is not differentiable at the hinge point, subgradient methods are used for optimization.&lt;br /&gt;
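The margin behavior is easy to verify directly; a minimal NumPy sketch:

```python
import numpy as np

def hinge_loss(y, s):
    """Mean hinge loss for labels y in {-1, +1} and raw scores s."""
    return np.mean(np.maximum(0.0, 1.0 - y * s))

y = np.array([+1, -1, +1, -1])
s = np.array([2.0, -1.5, 0.3, 0.5])
# First two scores clear the margin (y*s >= 1) and contribute zero;
# the last two violate it (0.7 and 1.5), so the mean is 0.55.
```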
&lt;br /&gt;
== Other common loss functions ==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Loss !! Formula !! Typical use&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Huber&amp;#039;&amp;#039;&amp;#039; || &amp;lt;math&amp;gt;\begin{cases}\tfrac{1}{2}(y-\hat{y})^2 &amp;amp; |y-\hat{y}|\leq\delta \\ \delta(|y-\hat{y}|-\tfrac{\delta}{2}) &amp;amp; \text{otherwise}\end{cases}&amp;lt;/math&amp;gt; || Robust regression&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;KL divergence&amp;#039;&amp;#039;&amp;#039; || &amp;lt;math&amp;gt;\sum_c p_c \log\frac{p_c}{q_c}&amp;lt;/math&amp;gt; || Distribution matching, VAEs&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Focal loss&amp;#039;&amp;#039;&amp;#039; || &amp;lt;math&amp;gt;-\alpha(1-p_t)^\gamma \log p_t&amp;lt;/math&amp;gt; || Imbalanced classification&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;CTC loss&amp;#039;&amp;#039;&amp;#039; || Dynamic programming over alignments || Speech recognition, OCR&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Triplet loss&amp;#039;&amp;#039;&amp;#039; || &amp;lt;math&amp;gt;\max(0,\; d(a,p) - d(a,n) + m)&amp;lt;/math&amp;gt; || Metric learning, face verification&lt;br /&gt;
|}&lt;br /&gt;
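As one example from the table, the focal loss down-weights easy examples through its &amp;lt;math&amp;gt;(1-p_t)^\gamma&amp;lt;/math&amp;gt; factor; a minimal binary-case sketch (the eps clipping is our addition for numerical safety):

```python
import numpy as np

def focal_loss(y, p, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss -alpha * (1 - p_t)^gamma * log(p_t), averaged.

    The (1 - p_t)^gamma factor shrinks the contribution of easy,
    well-classified examples, focusing training on hard ones.
    """
    p = np.clip(p, eps, 1 - eps)
    p_t = np.where(y == 1, p, 1 - p)   # probability assigned to the true class
    return -np.mean(alpha * (1 - p_t) ** gamma * np.log(p_t))

easy = focal_loss(np.array([1]), np.array([0.9]))   # well classified
hard = focal_loss(np.array([1]), np.array([0.1]))   # badly classified
```

An easy example contributes far less than it would under plain (alpha-weighted) cross-entropy, which is the property that helps on imbalanced data.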
&lt;br /&gt;
== Choosing the right loss ==&lt;br /&gt;
&lt;br /&gt;
The appropriate loss function depends on the task:&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Regression&amp;#039;&amp;#039;&amp;#039;: MSE is the default; switch to MAE or Huber if outliers are a concern.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Binary classification&amp;#039;&amp;#039;&amp;#039;: binary cross-entropy with a sigmoid output.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Multi-class classification&amp;#039;&amp;#039;&amp;#039;: categorical cross-entropy with a softmax output.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Multi-label classification&amp;#039;&amp;#039;&amp;#039;: binary cross-entropy applied independently to each label.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Ranking or retrieval&amp;#039;&amp;#039;&amp;#039;: contrastive loss, triplet loss, or listwise ranking losses.&lt;br /&gt;
&lt;br /&gt;
An important consideration is whether the loss is &amp;#039;&amp;#039;&amp;#039;calibrated&amp;#039;&amp;#039;&amp;#039;, that is, whether minimizing it yields well-calibrated predicted probabilities. Cross-entropy is a proper scoring rule and produces calibrated probabilities, while hinge loss does not.&lt;br /&gt;
&lt;br /&gt;
== Regularization terms ==&lt;br /&gt;
&lt;br /&gt;
In practice, the total objective often includes a &amp;#039;&amp;#039;&amp;#039;regularization term&amp;#039;&amp;#039;&amp;#039; that penalizes model complexity:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;J(\theta) = L(\theta) + \lambda \, R(\theta)&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;math&amp;gt;\lambda&amp;lt;/math&amp;gt; controls the regularization strength. Common choices include L2 regularization (&amp;lt;math&amp;gt;R = \|\theta\|_2^2&amp;lt;/math&amp;gt;) and L1 regularization (&amp;lt;math&amp;gt;R = \|\theta\|_1&amp;lt;/math&amp;gt;). See [[Overfitting and Regularization]] for details.&lt;br /&gt;
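The combined objective &amp;lt;math&amp;gt;J(\theta) = L(\theta) + \lambda R(\theta)&amp;lt;/math&amp;gt; can be sketched directly for both penalties (illustrative helper names, not a library API):

```python
import numpy as np

def l2_objective(data_loss, theta, lam):
    """J(theta) = L(theta) + lambda * ||theta||_2^2 (ridge-style penalty)."""
    return data_loss + lam * np.sum(theta ** 2)

def l1_objective(data_loss, theta, lam):
    """J(theta) = L(theta) + lambda * ||theta||_1 (lasso-style penalty)."""
    return data_loss + lam * np.sum(np.abs(theta))

theta = np.array([0.5, -2.0, 0.0])   # ||theta||_2^2 = 4.25, ||theta||_1 = 2.5
```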
&lt;br /&gt;
== See also ==&lt;br /&gt;
&lt;br /&gt;
* [[Gradient Descent]]&lt;br /&gt;
* [[Neural Networks]]&lt;br /&gt;
* [[Backpropagation]]&lt;br /&gt;
* [[Overfitting and Regularization]]&lt;br /&gt;
* [[Stochastic Gradient Descent]]&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* Bishop, C. M. (2006). &amp;#039;&amp;#039;Pattern Recognition and Machine Learning&amp;#039;&amp;#039;, Chapter 1. Springer.&lt;br /&gt;
* Goodfellow, I., Bengio, Y. and Courville, A. (2016). &amp;#039;&amp;#039;Deep Learning&amp;#039;&amp;#039;, Chapters 6 and 8. MIT Press.&lt;br /&gt;
* Lin, T.-Y. et al. (2017). &amp;quot;Focal Loss for Dense Object Detection&amp;quot;. &amp;#039;&amp;#039;ICCV&amp;#039;&amp;#039;.&lt;br /&gt;
* Murphy, K. P. (2022). &amp;#039;&amp;#039;Probabilistic Machine Learning: An Introduction&amp;#039;&amp;#039;. MIT Press.&lt;br /&gt;
&lt;br /&gt;
[[Category:Machine Learning]]&lt;br /&gt;
[[Category:Introductory]]&lt;/div&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
</feed>