Language Models are Few-Shot Learners
- GPT-3: A 175-billion-parameter autoregressive transformer language model, more than 100 times larger than GPT-2, trained on a diverse corpus of internet text (a toy autoregressive decoding loop is sketched after this list).
- In-context learning: Demonstration that large language models can learn tasks from examples presented in the prompt, without gradient updates or fine-tuning (see the prompt-construction sketch after this list).
- Scaling laws for few-shot performance: Evidence that few-shot performance improves smoothly with model size across three orders of magnitude (125M to 175B parameters), as in the power-law fit sketched after this list.
- Broader impacts: Analysis of the social impacts and potential misuse of large language models, including bias, fairness, and energy consumption.
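
To make the term "autoregressive" concrete, here is a minimal sketch of the decoding loop such a model runs: tokens are sampled one at a time, each conditioned on what has been generated so far. The bigram table is an invented toy stand-in for the 175B-parameter transformer, not anything from the paper.

```python
# A toy illustration of autoregressive decoding: tokens are sampled
# one at a time, each drawn from a distribution conditioned on the
# prefix. This bigram table conditions only on the previous token;
# GPT-3's transformer attends to the entire prefix. The table is an
# invented stand-in, not data from the paper.
import random

TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "</s>": 0.3},
    "dog": {"sat": 0.7, "</s>": 0.3},
    "sat": {"on": 0.4, "</s>": 0.6},
    "on": {"the": 1.0},
}

def generate(max_tokens=10):
    """Sample a sentence token by token until </s> or the length cap."""
    tokens = ["<s>"]
    for _ in range(max_tokens):
        dist = TOY_MODEL[tokens[-1]]
        (next_token,) = random.choices(list(dist), weights=list(dist.values()))
        if next_token == "</s>":
            break
        tokens.append(next_token)
    return " ".join(tokens[1:])

print(generate())
```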
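
The few-shot setting conveys a task entirely through the prompt: a natural-language instruction, a handful of demonstrations, and an unfinished query the model completes. The sketch below builds such a prompt for the English-to-French example used in the paper's figures; `build_few_shot_prompt` and the commented `generate()` call are illustrative helpers, not an API from the paper.

```python
# A minimal sketch of few-shot in-context learning: the task is
# specified entirely in the prompt and no weights are updated.
# build_few_shot_prompt is an illustrative helper, not a paper API.

def build_few_shot_prompt(instruction, demonstrations, query):
    """Concatenate an instruction, K demonstrations, and an open query."""
    lines = [instruction]
    for source, target in demonstrations:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")  # the model is asked to complete this line
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French:",
    [("sea otter", "loutre de mer"), ("plush giraffe", "girafe en peluche")],
    "cheese",
)
print(prompt)
# completion = model.generate(prompt)  # hypothetical inference call
```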
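
One way to read the scaling claim is that few-shot performance looks roughly linear against model size in log-log space. The sketch below fits such a power law across several model sizes; the accuracy values are invented placeholders for illustration, not measurements from the paper.

```python
# An illustrative power-law fit for the scaling claim: smooth scaling
# with model size appears roughly linear in log-log space. The
# accuracy values below are invented placeholders, NOT paper results.
import numpy as np

params = np.array([125e6, 1.3e9, 13e9, 175e9])  # model sizes (parameters)
accuracy = np.array([0.30, 0.45, 0.58, 0.71])   # hypothetical few-shot scores

# Least-squares line in log-log space: log10(acc) = m * log10(N) + b
m, b = np.polyfit(np.log10(params), np.log10(accuracy), 1)

def predict(n_params):
    """Extrapolate the fitted trend to a new model size."""
    return 10 ** (m * np.log10(n_params) + b)

print(f"fitted exponent m = {m:.3f}")
print(f"extrapolated accuracy at 1e12 params: {predict(1e12):.2f}")
```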