<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://marovi.ai/index.php?action=history&amp;feed=atom&amp;title=Transfer_Learning</id>
	<title>Transfer Learning - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://marovi.ai/index.php?action=history&amp;feed=atom&amp;title=Transfer_Learning"/>
	<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Transfer_Learning&amp;action=history"/>
	<updated>2026-04-24T11:52:16Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.39.1</generator>
	<entry>
		<id>https://marovi.ai/index.php?title=Transfer_Learning&amp;diff=2145&amp;oldid=prev</id>
		<title>DeployBot: [deploy-bot] Deploy from CI (8c92aeb)</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Transfer_Learning&amp;diff=2145&amp;oldid=prev"/>
		<updated>2026-04-24T07:09:00Z</updated>

		<summary type="html">&lt;p&gt;[deploy-bot] Deploy from CI (8c92aeb)&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 07:09, 24 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l98&quot;&gt;Line 98:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 98:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Machine Learning]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Machine Learning]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Intermediate]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Intermediate]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!--v1.2.0 cache-bust--&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!-- pass 2 --&gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-2116:rev-2145 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Transfer_Learning&amp;diff=2116&amp;oldid=prev</id>
		<title>DeployBot: Pass 2 force re-parse</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Transfer_Learning&amp;diff=2116&amp;oldid=prev"/>
		<updated>2026-04-24T07:01:22Z</updated>

		<summary type="html">&lt;p&gt;Pass 2 force re-parse&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 07:01, 24 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l99&quot;&gt;Line 99:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 99:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Intermediate]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Intermediate]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;!--v1.2.0 cache-bust--&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&amp;lt;!--v1.2.0 cache-bust--&amp;gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!-- pass 2 --&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-2079:rev-2116 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Transfer_Learning&amp;diff=2079&amp;oldid=prev</id>
		<title>DeployBot: Force re-parse after Math source-mode rollout (v1.2.0)</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Transfer_Learning&amp;diff=2079&amp;oldid=prev"/>
		<updated>2026-04-24T06:58:45Z</updated>

		<summary type="html">&lt;p&gt;Force re-parse after Math source-mode rollout (v1.2.0)&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 06:58, 24 April 2026&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l98&quot;&gt;Line 98:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 98:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Machine Learning]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Machine Learning]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Intermediate]]&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[[Category:Intermediate]]&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;!--v1.2.0 cache-bust--&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;

&lt;!-- diff cache key mediawiki:diff::1.12:old-1994:rev-2079 --&gt;
&lt;/table&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
	<entry>
		<id>https://marovi.ai/index.php?title=Transfer_Learning&amp;diff=1994&amp;oldid=prev</id>
		<title>DeployBot: [deploy-bot] Deploy from CI (775ba6e)</title>
		<link rel="alternate" type="text/html" href="https://marovi.ai/index.php?title=Transfer_Learning&amp;diff=1994&amp;oldid=prev"/>
		<updated>2026-04-24T04:01:45Z</updated>

		<summary type="html">&lt;p&gt;[deploy-bot] Deploy from CI (775ba6e)&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;{{LanguageBar | page = Transfer Learning}}&lt;br /&gt;
{{ArticleInfobox | topic_area = Machine Learning | difficulty = Intermediate | prerequisites = [[Neural Networks]]}}&lt;br /&gt;
{{ContentMeta | generated_by = claude-opus | model_used = claude-opus-4-6 | generated_date = 2026-03-13}}&lt;br /&gt;
&lt;br /&gt;
&amp;#039;&amp;#039;&amp;#039;Transfer learning&amp;#039;&amp;#039;&amp;#039; is a machine learning technique in which a model trained on one task is reused as the starting point for a model on a different but related task. By leveraging knowledge acquired from large-scale pretraining, transfer learning can substantially reduce the amount of labelled data, compute, and training time required for downstream applications.&lt;br /&gt;
&lt;br /&gt;
== Motivation ==&lt;br /&gt;
&lt;br /&gt;
Training deep neural networks from scratch typically requires large datasets and significant computational resources. In many practical domains — medical imaging, legal text analysis, low-resource languages — labelled data is scarce. Transfer learning addresses this mismatch: a model pretrained on a data-rich source task captures general features (edges, textures, syntactic patterns) that transfer well to a data-scarce target task.&lt;br /&gt;
&lt;br /&gt;
== Key Concepts ==&lt;br /&gt;
&lt;br /&gt;
=== Domain and Task ===&lt;br /&gt;
&lt;br /&gt;
Formally, a &amp;#039;&amp;#039;&amp;#039;domain&amp;#039;&amp;#039;&amp;#039; &amp;lt;math&amp;gt;\mathcal{D} = \{\mathcal{X}, P(X)\}&amp;lt;/math&amp;gt; consists of a feature space &amp;lt;math&amp;gt;\mathcal{X}&amp;lt;/math&amp;gt; and a marginal distribution &amp;lt;math&amp;gt;P(X)&amp;lt;/math&amp;gt;. A &amp;#039;&amp;#039;&amp;#039;task&amp;#039;&amp;#039;&amp;#039; &amp;lt;math&amp;gt;\mathcal{T} = \{\mathcal{Y}, f(\cdot)\}&amp;lt;/math&amp;gt; consists of a label space &amp;lt;math&amp;gt;\mathcal{Y}&amp;lt;/math&amp;gt; and a predictive function &amp;lt;math&amp;gt;f&amp;lt;/math&amp;gt;. Transfer learning applies when the source and target differ in domain, task, or both.&lt;br /&gt;
&lt;br /&gt;
=== Domain Adaptation ===&lt;br /&gt;
&lt;br /&gt;
When the source and target share the same task but differ in data distribution (&amp;lt;math&amp;gt;P_s(X) \neq P_t(X)&amp;lt;/math&amp;gt;), the problem is called &amp;#039;&amp;#039;&amp;#039;domain adaptation&amp;#039;&amp;#039;&amp;#039;. Techniques include:&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Instance reweighting&amp;#039;&amp;#039;&amp;#039; — adjusting sample weights so the source distribution approximates the target.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Feature alignment&amp;#039;&amp;#039;&amp;#039; — learning domain-invariant representations (e.g., via adversarial training or maximum mean discrepancy).&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Self-training&amp;#039;&amp;#039;&amp;#039; — using model predictions on unlabelled target data as pseudo-labels.&lt;br /&gt;
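&lt;br /&gt;
Instance reweighting, the first technique in the list, can be illustrated with a minimal sketch. It assumes the features have already been discretised into bins, so that the density ratio &amp;lt;math&amp;gt;P_t(x) / P_s(x)&amp;lt;/math&amp;gt; can be estimated by counting. The helper name importance_weights, the smoothing constant, and the toy day/night data are illustrative assumptions and do not come from any library.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;
```python
from collections import Counter


def importance_weights(source_bins, target_bins, smoothing=1.0):
    """Estimate w(x) = P_t(x) / P_s(x) for discrete feature bins (hypothetical helper)."""
    bins = sorted(set(source_bins) | set(target_bins))
    src = Counter(source_bins)
    tgt = Counter(target_bins)
    # Additive smoothing keeps the ratio finite when a bin is missing on one side.
    src_total = len(source_bins) + smoothing * len(bins)
    tgt_total = len(target_bins) + smoothing * len(bins)
    ratio = {
        b: ((tgt[b] + smoothing) / tgt_total) / ((src[b] + smoothing) / src_total)
        for b in bins
    }
    # One weight per source sample, to be multiplied into that sample's loss.
    return [ratio[b] for b in source_bins]


# Toy data (assumed): the source is mostly "day" images, the target mostly "night".
source = ["day", "day", "day", "night"]
target = ["night", "night", "night", "day"]
weights = importance_weights(source, target)
```
&lt;/pre&gt;&lt;br /&gt;
Source samples from the bin that is rarer in the source but common in the target receive the larger weight, so a weighted loss over the source data behaves more like a loss over the target distribution.&lt;br /&gt;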
&lt;br /&gt;
== Fine-Tuning vs Feature Extraction ==&lt;br /&gt;
&lt;br /&gt;
{| class=&amp;quot;wikitable&amp;quot;&lt;br /&gt;
|-&lt;br /&gt;
! Strategy !! Description !! When to use&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Feature extraction&amp;#039;&amp;#039;&amp;#039; || Freeze all pretrained layers; train only a new output head || Very small target dataset; source and target are closely related&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Fine-tuning (full)&amp;#039;&amp;#039;&amp;#039; || Unfreeze all layers and train end-to-end with a small learning rate || Moderate target dataset; source and target differ meaningfully&lt;br /&gt;
|-&lt;br /&gt;
| &amp;#039;&amp;#039;&amp;#039;Gradual unfreezing&amp;#039;&amp;#039;&amp;#039; || Progressively unfreeze layers over training, starting from the top (output-side) layers and moving toward the bottom (input-side) layers || Balances stability of lower features with adaptation of higher ones&lt;br /&gt;
|}&lt;br /&gt;
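&lt;br /&gt;
The gradual unfreezing row of the table can be made concrete with a small scheduling sketch. It assumes one additional layer is unfrozen per stage and that layers are listed from input to output. The function name trainable_layers, the epochs_per_stage parameter, and the layer names are hypothetical.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;
```python
def trainable_layers(layer_names, epoch, epochs_per_stage=1):
    """Return the layers that are unfrozen at a given epoch (hypothetical schedule)."""
    # layer_names is ordered from input (bottom) to output (top).
    # Epoch 0 trains only the top layer; each later stage unfreezes one more layer below it.
    n_unfrozen = min(len(layer_names), 1 + epoch // epochs_per_stage)
    return layer_names[len(layer_names) - n_unfrozen:]


layers = ["embed", "block1", "block2", "head"]  # assumed toy architecture
schedule = [trainable_layers(layers, epoch) for epoch in range(5)]
```
&lt;/pre&gt;&lt;br /&gt;
With this schedule only the new head is trained at first, and the earliest layer is the last to be unfrozen. Feature extraction corresponds to never advancing past the first stage, and full fine-tuning to starting at the last one.&lt;br /&gt;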
&lt;br /&gt;
A common heuristic is to use a learning rate 10–100× smaller for pretrained layers than for the new classification head, which limits catastrophic forgetting of the pretrained representations.&lt;br /&gt;
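&lt;br /&gt;
The learning-rate heuristic can be sketched as a plain SGD step with two parameter groups. This is a framework-free illustration under stated assumptions: the function sgd_step, the parameter names, the gradient values, and the factor of 100 are all hypothetical choices.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;
```python
def sgd_step(params, grads, group_of, lrs):
    """One plain SGD update where each parameter uses its group's learning rate."""
    return {
        name: value - lrs[group_of[name]] * grads[name]
        for name, value in params.items()
    }


head_lr = 1e-3
# Assumed setting: pretrained layers get a rate 100 times smaller than the new head.
lrs = {"head": head_lr, "backbone": head_lr / 100}

params = {"conv1.w": 0.50, "fc.w": 0.50}   # toy scalar parameters
grads = {"conv1.w": 1.0, "fc.w": 1.0}      # identical gradients for comparison
group_of = {"conv1.w": "backbone", "fc.w": "head"}

updated = sgd_step(params, grads, group_of, lrs)
```
&lt;/pre&gt;&lt;br /&gt;
Given the same gradient, the pretrained weight moves 100 times less than the head weight, which is the mechanism by which the smaller rate protects pretrained features.&lt;br /&gt;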
&lt;br /&gt;
== Pretrained Models ==&lt;br /&gt;
&lt;br /&gt;
=== Computer Vision ===&lt;br /&gt;
&lt;br /&gt;
ImageNet-pretrained backbones, both convolutional networks (ResNet, EfficientNet) and Vision Transformers (ViT), serve as standard starting points. Lower layers learn broadly reusable features such as edges and textures, while higher layers learn more task-specific patterns. Fine-tuning an ImageNet model on a medical imaging dataset with only a few thousand images often outperforms training from scratch.&lt;br /&gt;
&lt;br /&gt;
=== Natural Language Processing ===&lt;br /&gt;
&lt;br /&gt;
Language model pretraining transformed NLP. Key milestones include:&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Word2Vec / GloVe&amp;#039;&amp;#039;&amp;#039; — static word embeddings pretrained on large corpora.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;ELMo&amp;#039;&amp;#039;&amp;#039; — contextualised embeddings from bidirectional LSTMs.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;BERT&amp;#039;&amp;#039;&amp;#039; (Devlin et al., 2019) — bidirectional Transformer pretrained with masked language modelling; fine-tuned for classification, QA, NER, and more.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;GPT series&amp;#039;&amp;#039;&amp;#039; — autoregressive Transformers demonstrating that scale and pretraining enable few-shot and zero-shot transfer.&lt;br /&gt;
&lt;br /&gt;
== When to Use Transfer Learning ==&lt;br /&gt;
&lt;br /&gt;
Transfer learning is most beneficial when:&lt;br /&gt;
&lt;br /&gt;
# The target dataset is small relative to the model&amp;#039;s capacity.&lt;br /&gt;
# The source and target domains share structural similarities (e.g., both involve natural images or natural language).&lt;br /&gt;
# Computational resources for full pretraining are unavailable.&lt;br /&gt;
# Rapid prototyping is needed before committing to large-scale data collection.&lt;br /&gt;
&lt;br /&gt;
Transfer can also hurt performance (&amp;#039;&amp;#039;&amp;#039;negative transfer&amp;#039;&amp;#039;&amp;#039;) when the source and target domains are fundamentally dissimilar, for instance when transferring from natural images to spectrograms without appropriate adaptation.&lt;br /&gt;
&lt;br /&gt;
== Practical Tips ==&lt;br /&gt;
&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Data augmentation&amp;#039;&amp;#039;&amp;#039; complements transfer learning by artificially expanding the effective size of the target dataset.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Learning rate warmup&amp;#039;&amp;#039;&amp;#039; helps stabilise early training when fine-tuning large pretrained models.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Early stopping&amp;#039;&amp;#039;&amp;#039; on a validation set prevents overfitting during fine-tuning, especially with small datasets.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Layer-wise learning rate decay&amp;#039;&amp;#039;&amp;#039; assigns smaller rates to earlier (more general) layers and larger rates to later (more task-specific) layers.&lt;br /&gt;
* &amp;#039;&amp;#039;&amp;#039;Intermediate task transfer&amp;#039;&amp;#039;&amp;#039; — fine-tuning on a related intermediate task before the final target (e.g., NLI before sentiment analysis) can further improve results.&lt;br /&gt;
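&lt;br /&gt;
Layer-wise learning rate decay from the list above can be sketched as a geometric schedule. The sketch assumes the top layer keeps the base rate and each layer closer to the input is scaled by a constant factor. The function name layerwise_lrs and the values 1e-3 and 0.5 are illustrative assumptions.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;
```python
def layerwise_lrs(num_layers, top_lr, decay):
    """Learning rate per layer; index 0 is the earliest layer, the last index the top layer."""
    # The top layer keeps top_lr; each step toward the input multiplies the rate by decay.
    return [top_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


rates = layerwise_lrs(num_layers=4, top_lr=1e-3, decay=0.5)
```
&lt;/pre&gt;&lt;br /&gt;
Here the four rates grow from 1.25e-4 at the earliest layer to 1e-3 at the top layer, so general features change slowly while task-specific layers adapt faster.&lt;br /&gt;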
&lt;br /&gt;
== Evaluation ==&lt;br /&gt;
&lt;br /&gt;
Transfer learning effectiveness is typically measured as the accuracy gain over a model trained from scratch on the same target task:&lt;br /&gt;
&lt;br /&gt;
:&amp;lt;math&amp;gt;\Delta_{\mathrm{transfer}} = \mathrm{Acc}_{\mathrm{transfer}} - \mathrm{Acc}_{\mathrm{scratch}}&amp;lt;/math&amp;gt;&lt;br /&gt;
&lt;br /&gt;
A positive &amp;lt;math&amp;gt;\Delta_{\mathrm{transfer}}&amp;lt;/math&amp;gt; indicates successful knowledge transfer. Practitioners also track convergence speed, as transferred models often reach target performance in a fraction of the epochs.&lt;br /&gt;
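&lt;br /&gt;
As a worked illustration of the formula, the following sketch averages accuracy over several runs before taking the difference, which is one reasonable way to reduce seed noise. The function name transfer_gain and all accuracy values are made up for illustration and are not measured results.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;
```python
from statistics import mean


def transfer_gain(acc_transfer_runs, acc_scratch_runs):
    """Delta_transfer = mean Acc_transfer - mean Acc_scratch on the same target test set."""
    return mean(acc_transfer_runs) - mean(acc_scratch_runs)


# Hypothetical accuracies from three random seeds per setting.
delta = transfer_gain([0.91, 0.89, 0.90], [0.84, 0.86, 0.85])
```
&lt;/pre&gt;&lt;br /&gt;
With these assumed numbers the gain is about 0.05, that is, five accuracy points in favour of the transferred model.&lt;br /&gt;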
&lt;br /&gt;
== See also ==&lt;br /&gt;
&lt;br /&gt;
* [[Neural Networks]]&lt;br /&gt;
* [[Transformer]]&lt;br /&gt;
* [[Self-supervised learning]]&lt;br /&gt;
* [[Domain adaptation]]&lt;br /&gt;
* [[Fine-tuning]]&lt;br /&gt;
&lt;br /&gt;
== References ==&lt;br /&gt;
&lt;br /&gt;
* Pan, S. J. and Yang, Q. (2010). &amp;quot;A Survey on Transfer Learning&amp;quot;. &amp;#039;&amp;#039;IEEE Transactions on Knowledge and Data Engineering&amp;#039;&amp;#039;.&lt;br /&gt;
* Yosinski, J. et al. (2014). &amp;quot;How transferable are features in deep neural networks?&amp;quot;. &amp;#039;&amp;#039;NeurIPS&amp;#039;&amp;#039;.&lt;br /&gt;
* Devlin, J. et al. (2019). &amp;quot;BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding&amp;quot;. &amp;#039;&amp;#039;NAACL&amp;#039;&amp;#039;.&lt;br /&gt;
* Howard, J. and Ruder, S. (2018). &amp;quot;Universal Language Model Fine-tuning for Text Classification&amp;quot;. &amp;#039;&amp;#039;ACL&amp;#039;&amp;#039;.&lt;br /&gt;
* Zhuang, F. et al. (2021). &amp;quot;A Comprehensive Survey on Transfer Learning&amp;quot;. &amp;#039;&amp;#039;Proceedings of the IEEE&amp;#039;&amp;#039;.&lt;br /&gt;
&lt;br /&gt;
[[Category:Machine Learning]]&lt;br /&gt;
[[Category:Intermediate]]&lt;/div&gt;</summary>
		<author><name>DeployBot</name></author>
	</entry>
</feed>