<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://marovi.ai/index.php?action=history&amp;feed=atom&amp;title=Neural_Networks</id>
	<title>Neural Networks - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://marovi.ai/index.php?action=history&amp;feed=atom&amp;title=Neural_Networks"/>
	<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Neural_Networks&amp;action=history"/>
	<updated>2026-04-24T11:53:22Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.39.1</generator>
	<entry>
		<id>https://marovi.ai/index.php?title=Neural_Networks&amp;diff=2140&amp;oldid=prev</id>
		<title>DeployBot: [deploy-bot] Deploy from CI (8c92aeb)</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Neural_Networks&amp;diff=2140&amp;oldid=prev"/>
		<updated>2026-04-24T07:08:59Z</updated>

		<summary type="html">&lt;p&gt;[deploy-bot] Deploy from CI (8c92aeb)&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 07:08, 24 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l111&quot;&gt;Line 111:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 111:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Introductory]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Introductory]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Neural Networks]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Neural Networks]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!--v1.2.0 cache-bust--&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!-- pass 2 --&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-2105:rev-2140 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Neural_Networks&amp;diff=2105&amp;oldid=prev</id>
		<title>DeployBot: Pass 2 force re-parse</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Neural_Networks&amp;diff=2105&amp;oldid=prev"/>
		<updated>2026-04-24T07:01:02Z</updated>

		<summary type="html">&lt;p&gt;Pass 2 force re-parse&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 07:01, 24 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l112&quot;&gt;Line 112:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 112:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Neural Networks]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Neural Networks]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;!--v1.2.0 cache-bust--&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;!--v1.2.0 cache-bust--&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!-- pass 2 --&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-2068:rev-2105 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Neural_Networks&amp;diff=2068&amp;oldid=prev</id>
		<title>DeployBot: Force re-parse after Math source-mode rollout (v1.2.0)</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Neural_Networks&amp;diff=2068&amp;oldid=prev"/>
		<updated>2026-04-24T06:58:25Z</updated>

		<summary type="html">&lt;p&gt;Force re-parse after Math source-mode rollout (v1.2.0)&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 06:58, 24 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l111&quot;&gt;Line 111:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 111:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Introductory]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Introductory]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Neural Networks]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Neural Networks]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!--v1.2.0 cache-bust--&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-1989:rev-2068 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Neural_Networks&amp;diff=1989&amp;oldid=prev</id>
		<title>DeployBot: [deploy-bot] Deploy from CI (775ba6e)</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Neural_Networks&amp;diff=1989&amp;oldid=prev"/>
		<updated>2026-04-24T04:01:43Z</updated>

		<summary type="html">&lt;p&gt;[deploy-bot] Deploy from CI (775ba6e)&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{{LanguageBar | page = Neural Networks}}&lt;br /&gt;
{{ArticleInfobox | topic_area = Deep Learning | difficulty = Introductory | prerequisites = }}&lt;br /&gt;
{{ContentMeta | generated_by = claude-opus | model_used = claude-opus-4-6 | generated_date = 2026-03-13}}&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Neural networks&amp;#039;&amp;#039;&amp;#039; (also called &amp;#039;&amp;#039;&amp;#039;artificial neural networks&amp;#039;&amp;#039;&amp;#039;, or ANNs) are computational models inspired by the structure of biological nervous systems. They consist of interconnected layers of simple processing units called &amp;#039;&amp;#039;&amp;#039;neurons&amp;#039;&amp;#039;&amp;#039; (or nodes) and form the basis of modern deep learning.&lt;br /&gt;
&lt;br /&gt;
== Biological inspiration ==&lt;br /&gt;
&lt;br /&gt;
The biological neuron receives electrical signals through its &amp;#039;&amp;#039;&amp;#039;dendrites&amp;#039;&amp;#039;&amp;#039;, integrates them in the &amp;#039;&amp;#039;&amp;#039;cell body&amp;#039;&amp;#039;&amp;#039;, and, if the combined signal exceeds a threshold, fires an output signal along its &amp;#039;&amp;#039;&amp;#039;axon&amp;#039;&amp;#039;&amp;#039; to downstream neurons. Artificial neural networks abstract this process: each artificial neuron computes a weighted sum of its inputs, adds a bias term, and passes the result through a nonlinear &amp;#039;&amp;#039;&amp;#039;activation function&amp;#039;&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
While the analogy to biology motivated early research, modern neural networks are best understood as flexible parameterised function approximators rather than faithful brain simulations.&lt;br /&gt;
&lt;br /&gt;
== The perceptron ==&lt;br /&gt;
&lt;br /&gt;
The &amp;#039;&amp;#039;&amp;#039;perceptron&amp;#039;&amp;#039;&amp;#039;, introduced by Frank Rosenblatt in 1958, is the simplest neural network: a single artificial neuron acting as a binary classifier. It computes:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;y = \sigma\!\left(\sum_{i=1}^{n} w_i x_i + b\right) = \sigma(\mathbf{w}^\top \mathbf{x} + b)&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;math&amp;gt;\mathbf{x}&amp;lt;/math&amp;gt; is the input vector, &amp;lt;math&amp;gt;\mathbf{w}&amp;lt;/math&amp;gt; are learnable weights, &amp;lt;math&amp;gt;b&amp;lt;/math&amp;gt; is a bias, and &amp;lt;math&amp;gt;\sigma&amp;lt;/math&amp;gt; is a step function that outputs 1 if the argument is positive and 0 otherwise. The perceptron can learn any linearly separable function but famously cannot represent the XOR function — a limitation, highlighted by Minsky and Papert in 1969, that stalled neural-network research for over a decade.&lt;br /&gt;
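&lt;br /&gt;
The computation above is easy to sketch in NumPy. The code below is a minimal illustration; the AND-gate weights are hand-picked for this example and do not come from the text above:&lt;br /&gt;

```python
import numpy as np

def step(z):
    # Heaviside step: 1.0 if the argument is positive, else 0.0
    return float(z > 0)

def perceptron(x, w, b):
    # y = step(w . x + b)
    return step(np.dot(w, x) + b)

# An AND gate is linearly separable, so a perceptron can represent it
w = np.array([1.0, 1.0])
b = -1.5
for x in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]:
    print(x, perceptron(np.array(x), w, b))
```

With these weights the unit outputs 1 only for the input (1, 1); by contrast, no choice of weights and bias reproduces XOR, because no single line separates its two classes.&lt;br /&gt;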
&lt;br /&gt;
== Feedforward networks ==&lt;br /&gt;
&lt;br /&gt;
A &amp;#039;&amp;#039;&amp;#039;feedforward neural network&amp;#039;&amp;#039;&amp;#039; (also called a &amp;#039;&amp;#039;&amp;#039;multilayer perceptron&amp;#039;&amp;#039;&amp;#039;, or MLP) stacks multiple layers of neurons. Information flows in one direction — from the &amp;#039;&amp;#039;&amp;#039;input layer&amp;#039;&amp;#039;&amp;#039; through one or more &amp;#039;&amp;#039;&amp;#039;hidden layers&amp;#039;&amp;#039;&amp;#039; to the &amp;#039;&amp;#039;&amp;#039;output layer&amp;#039;&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
For a network with one hidden layer, the computation is:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\mathbf{h} = g(\mathbf{W}_1 \mathbf{x} + \mathbf{b}_1)&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\mathbf{y} = f(\mathbf{W}_2 \mathbf{h} + \mathbf{b}_2)&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
where &amp;lt;math&amp;gt;g&amp;lt;/math&amp;gt; and &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt; are activation functions, &amp;lt;math&amp;gt;\mathbf{W}_1, \mathbf{W}_2&amp;lt;/math&amp;gt; are weight matrices, and &amp;lt;math&amp;gt;\mathbf{b}_1, \mathbf{b}_2&amp;lt;/math&amp;gt; are bias vectors. The hidden layer enables the network to learn nonlinear relationships that a single perceptron cannot capture.&lt;br /&gt;
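&lt;br /&gt;
The two equations translate line for line into NumPy. This is a sketch only; the layer sizes (3 inputs, 4 hidden units, 2 outputs) and the choice of tanh and identity activations are arbitrary illustrative assumptions:&lt;br /&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

# Layer sizes: x in R^3 -> h in R^4 -> y in R^2
W1 = rng.normal(size=(4, 3))   # hidden-layer weight matrix
b1 = np.zeros(4)               # hidden-layer bias vector
W2 = rng.normal(size=(2, 4))   # output-layer weight matrix
b2 = np.zeros(2)               # output-layer bias vector

g = np.tanh                    # hidden activation
f = lambda z: z                # output activation (identity, e.g. regression)

x = rng.normal(size=3)
h = g(W1 @ x + b1)             # h = g(W1 x + b1)
y = f(W2 @ h + b2)             # y = f(W2 h + b2)
print(h.shape, y.shape)        # (4,) (2,)
```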
&lt;br /&gt;
Networks with many hidden layers are called &amp;#039;&amp;#039;&amp;#039;deep&amp;#039;&amp;#039;&amp;#039; neural networks, and training them is the subject of &amp;#039;&amp;#039;&amp;#039;deep learning&amp;#039;&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
== Activation functions ==&lt;br /&gt;
&lt;br /&gt;
The activation function introduces nonlinearity; without it, a multi-layer network would collapse to a single linear transformation. Common choices include:&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Function !! Formula !! Range !! Notes&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Sigmoid&amp;#039;&amp;#039;&amp;#039; || &amp;lt;math&amp;gt;\sigma(z) = \frac{1}{1+e^{-z}}&amp;lt;/math&amp;gt; || (0, 1) || Historically popular; suffers from vanishing gradients&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Tanh&amp;#039;&amp;#039;&amp;#039; || &amp;lt;math&amp;gt;\tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}}&amp;lt;/math&amp;gt; || (−1, 1) || Zero-centred; still saturates for large inputs&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;ReLU&amp;#039;&amp;#039;&amp;#039; || &amp;lt;math&amp;gt;\max(0, z)&amp;lt;/math&amp;gt; || [0, ∞) || Default choice in modern networks; can cause &amp;quot;dead neurons&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Leaky ReLU&amp;#039;&amp;#039;&amp;#039; || &amp;lt;math&amp;gt;\max(\alpha z, z)&amp;lt;/math&amp;gt; for small &amp;lt;math&amp;gt;\alpha &amp;gt; 0&amp;lt;/math&amp;gt; || (−∞, ∞) || Addresses the dead-neuron problem&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Softmax&amp;#039;&amp;#039;&amp;#039; || &amp;lt;math&amp;gt;\frac{e^{z_i}}{\sum_j e^{z_j}}&amp;lt;/math&amp;gt; || (0, 1) || Used in the output layer for multi-class classification&lt;br /&gt;
|}&lt;br /&gt;
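&lt;br /&gt;
Each of these functions is a one-liner in NumPy; the sketch below is illustrative and not taken from any particular library:&lt;br /&gt;

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

def leaky_relu(z, alpha=0.01):
    return np.maximum(alpha * z, z)

def softmax(z):
    # subtracting the max does not change the result but avoids overflow
    e = np.exp(z - np.max(z))
    return e / e.sum()

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))      # [0. 0. 3.]
print(softmax(z))   # three positive values summing to 1
```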
&lt;br /&gt;
== Universal approximation theorem ==&lt;br /&gt;
&lt;br /&gt;
The &amp;#039;&amp;#039;&amp;#039;universal approximation theorem&amp;#039;&amp;#039;&amp;#039; (Cybenko 1989, Hornik 1991) states that a feedforward network with a single hidden layer containing a finite number of neurons can approximate any continuous function on a compact subset of &amp;lt;math&amp;gt;\mathbb{R}^n&amp;lt;/math&amp;gt; to arbitrary accuracy, provided the activation function satisfies mild conditions (e.g. is non-constant, bounded, and continuous).&lt;br /&gt;
&lt;br /&gt;
This theorem guarantees the &amp;#039;&amp;#039;existence&amp;#039;&amp;#039; of a good approximation but says nothing about how to &amp;#039;&amp;#039;find&amp;#039;&amp;#039; it — in practice, training deep networks with many layers is far more effective than using a single wide layer.&lt;br /&gt;
&lt;br /&gt;
== Training overview ==&lt;br /&gt;
&lt;br /&gt;
Training a neural network involves:&lt;br /&gt;
&lt;br /&gt;
# &amp;#039;&amp;#039;&amp;#039;Defining a loss function&amp;#039;&amp;#039;&amp;#039; — a measure of how far the network&amp;#039;s predictions are from the true targets (see [[Loss Functions]]).&lt;br /&gt;
# &amp;#039;&amp;#039;&amp;#039;Forward pass&amp;#039;&amp;#039;&amp;#039; — computing the output of the network for a given input by propagating values layer by layer.&lt;br /&gt;
# &amp;#039;&amp;#039;&amp;#039;Backward pass (backpropagation)&amp;#039;&amp;#039;&amp;#039; — computing the gradient of the loss with respect to every weight by applying the chain rule in reverse through the network (see [[Backpropagation]]).&lt;br /&gt;
# &amp;#039;&amp;#039;&amp;#039;Parameter update&amp;#039;&amp;#039;&amp;#039; — adjusting the weights using an optimisation algorithm such as [[Gradient Descent]] or one of its variants.&lt;br /&gt;
# &amp;#039;&amp;#039;&amp;#039;Iteration&amp;#039;&amp;#039;&amp;#039; — repeating steps 2–4 over many passes (epochs) through the training data.&lt;br /&gt;
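&lt;br /&gt;
As a sketch, the five steps above fit in a short NumPy loop. Everything here (the toy sine-regression task, the 1-16-1 architecture, the learning rate) is an illustrative assumption, and the gradients are written out by hand rather than taken from a framework:&lt;br /&gt;

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: learn y = sin(x) on [-pi, pi]
X = rng.uniform(-np.pi, np.pi, size=(256, 1))
Y = np.sin(X)

# A 1-16-1 network with tanh hidden units and a linear output
W1 = rng.normal(scale=0.5, size=(1, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 1)); b2 = np.zeros(1)
lr = 0.05
losses = []

for epoch in range(500):                  # step 5: iterate over epochs
    H = np.tanh(X @ W1 + b1)              # step 2: forward pass (hidden)
    P = H @ W2 + b2                       #         forward pass (output)
    losses.append(np.mean((P - Y) ** 2))  # step 1: mean squared error loss
    dP = 2 * (P - Y) / len(X)             # step 3: backward pass
    dW2 = H.T @ dP;  db2 = dP.sum(axis=0)
    dH = (dP @ W2.T) * (1 - H ** 2)       # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ dH;  db1 = dH.sum(axis=0)
    W1 -= lr * dW1;  b1 -= lr * db1       # step 4: gradient-descent update
    W2 -= lr * dW2;  b2 -= lr * db2

print(losses[0], losses[-1])              # initial vs final loss
```

The same structure underlies real training code; frameworks replace the hand-derived backward pass with automatic differentiation and plain gradient descent with adaptive optimisers.&lt;br /&gt;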
&lt;br /&gt;
Successful training also requires attention to &amp;#039;&amp;#039;&amp;#039;initialisation&amp;#039;&amp;#039;&amp;#039; (e.g. Xavier or He schemes), &amp;#039;&amp;#039;&amp;#039;regularisation&amp;#039;&amp;#039;&amp;#039; (to prevent [[Overfitting and Regularization|overfitting]]), and &amp;#039;&amp;#039;&amp;#039;hyperparameter tuning&amp;#039;&amp;#039;&amp;#039; (learning rate, batch size, network architecture).&lt;br /&gt;
&lt;br /&gt;
== Common architectures ==&lt;br /&gt;
&lt;br /&gt;
Beyond the basic feedforward network, several specialised architectures have been developed:&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;[[Convolutional Neural Networks]]&amp;#039;&amp;#039;&amp;#039; (CNNs) — designed for grid-structured data such as images, using local connectivity and weight sharing.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;[[Recurrent Neural Networks]]&amp;#039;&amp;#039;&amp;#039; (RNNs) — designed for sequential data, with connections that form cycles to maintain hidden state.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Transformers&amp;#039;&amp;#039;&amp;#039; — attention-based architectures that have become dominant in natural language processing and increasingly in vision.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Autoencoders&amp;#039;&amp;#039;&amp;#039; — networks trained to reconstruct their input, used for dimensionality reduction and generative modelling.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Generative adversarial networks&amp;#039;&amp;#039;&amp;#039; (GANs) — pairs of networks (generator and discriminator) trained in competition to generate realistic data.&lt;br /&gt;
&lt;br /&gt;
== Applications ==&lt;br /&gt;
&lt;br /&gt;
Neural networks are applied across a vast range of domains:&lt;br /&gt;
&lt;br /&gt;
* Computer vision (image classification, object detection, segmentation)&lt;br /&gt;
* Natural language processing (translation, summarisation, question answering)&lt;br /&gt;
* Speech recognition and synthesis&lt;br /&gt;
* Game playing (AlphaGo, Atari agents)&lt;br /&gt;
* Scientific discovery (protein folding, drug design, weather prediction)&lt;br /&gt;
* Autonomous vehicles and robotics&lt;br /&gt;
&lt;br /&gt;
== See also ==&lt;br /&gt;
&lt;br /&gt;
* [[Gradient Descent]]&lt;br /&gt;
* [[Backpropagation]]&lt;br /&gt;
* [[Loss Functions]]&lt;br /&gt;
* [[Convolutional Neural Networks]]&lt;br /&gt;
* [[Recurrent Neural Networks]]&lt;br /&gt;
* [[Overfitting and Regularization]]&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* Rosenblatt, F. (1958). &amp;quot;The Perceptron: A Probabilistic Model for Information Storage and Organization in the Brain&amp;quot;. &amp;#039;&amp;#039;Psychological Review&amp;#039;&amp;#039;.&lt;br /&gt;
* Cybenko, G. (1989). &amp;quot;Approximation by Superpositions of a Sigmoidal Function&amp;quot;. &amp;#039;&amp;#039;Mathematics of Control, Signals, and Systems&amp;#039;&amp;#039;.&lt;br /&gt;
* Hornik, K. (1991). &amp;quot;Approximation Capabilities of Multilayer Feedforward Networks&amp;quot;. &amp;#039;&amp;#039;Neural Networks&amp;#039;&amp;#039;.&lt;br /&gt;
* LeCun, Y., Bengio, Y. and Hinton, G. (2015). &amp;quot;Deep learning&amp;quot;. &amp;#039;&amp;#039;Nature&amp;#039;&amp;#039;, 521, 436–444.&lt;br /&gt;
* Goodfellow, I., Bengio, Y. and Courville, A. (2016). &amp;#039;&amp;#039;Deep Learning&amp;#039;&amp;#039;. MIT Press.&lt;br /&gt;
&lt;br /&gt;
[[Category:Deep Learning]]&lt;br /&gt;
[[Category:Introductory]]&lt;br /&gt;
[[Category:Neural Networks]]&lt;/div&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
</feed>