Translations:Transfer Learning/24/en: Difference between revisions
Revision as of 19:42, 27 April 2026
- Data augmentation complements transfer learning by artificially expanding the effective size of the target dataset.
- Learning rate warmup helps stabilise early training when fine-tuning large pretrained models.
- Early stopping on a validation set prevents overfitting during fine-tuning, especially with small datasets.
- Layer-wise learning rate decay assigns smaller rates to earlier (more general) layers and larger rates to later (more task-specific) layers.
- Intermediate task transfer — fine-tuning on a related intermediate task before the final target (e.g., NLI before sentiment analysis) can further improve results.
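Three of the points above (warmup, layer-wise learning rate decay, and early stopping) lend themselves to a short sketch. This is a minimal illustration in plain Python, not code from any particular framework: the function and class names are invented here, and in practice these schedules would feed per-layer learning rates into an optimizer's parameter groups.

```python
def warmup_lr(step, base_lr, warmup_steps):
    """Linear warmup: ramp the rate up to base_lr over the first warmup_steps."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr


def layerwise_lrs(base_lr, num_layers, decay=0.9):
    """Layer-wise decay: layer 0 is the earliest (most general) layer and gets
    the smallest rate; the last (most task-specific) layer gets base_lr."""
    return [base_lr * decay ** (num_layers - 1 - i) for i in range(num_layers)]


class EarlyStopping:
    """Stop fine-tuning once validation loss fails to improve for `patience` epochs."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True when training should stop."""
        if val_loss < self.best:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

Combining the first two, the rate for layer `i` at step `t` would be `warmup_lr(t, layerwise_lrs(base_lr, n)[i], warmup_steps)`, so every layer warms up toward its own decayed target rate.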