Language Models are Few-Shot Learners
- GPT-3: A 175-billion-parameter autoregressive transformer language model, more than 100 times larger than GPT-2, trained on a diverse corpus of internet text (a toy autoregressive decoding loop is sketched after this list).
- In-context learning: Demonstration that large language models can learn tasks from examples presented in the prompt, without gradient updates or fine-tuning (see the prompt-construction sketch after this list).
- Scaling laws for few-shot performance: Evidence that few-shot performance improves smoothly with model size across three orders of magnitude (125M to 175B parameters), as in the power-law fit sketched after this list.
- Broader impacts: Analysis of the social impacts and potential misuse of large language models, including bias, fairness, and energy consumption.
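
To make the term "autoregressive" concrete, here is a minimal sketch of the decoding loop such a model runs: tokens are sampled one at a time, each conditioned on what has been generated so far. The bigram table is an invented toy stand-in for the 175B-parameter transformer, not anything from the paper.

```python
# A toy illustration of autoregressive decoding: tokens are sampled
# one at a time, each drawn from a distribution conditioned on the
# prefix. This bigram table conditions only on the previous token;
# GPT-3's transformer attends to the entire prefix. The table is an
# invented stand-in, not data from the paper.
import random

TOY_MODEL = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.5, "dog": 0.5},
    "cat": {"sat": 0.7, "</s>": 0.3},
    "dog": {"sat": 0.7, "</s>": 0.3},
    "sat": {"on": 0.4, "</s>": 0.6},
    "on": {"the": 1.0},
}

def generate(max_tokens=10):
    """Sample a sentence token by token until </s> or the length cap."""
    tokens = ["<s>"]
    for _ in range(max_tokens):
        dist = TOY_MODEL[tokens[-1]]
        (next_token,) = random.choices(list(dist), weights=list(dist.values()))
        if next_token == "</s>":
            break
        tokens.append(next_token)
    return " ".join(tokens[1:])

print(generate())
```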
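
The few-shot setting conveys a task entirely through the prompt: a natural-language instruction, a handful of demonstrations, and an unfinished query the model completes. The sketch below builds such a prompt for the English-to-French example used in the paper's figures; `build_few_shot_prompt` and the commented `generate()` call are illustrative helpers, not an API from the paper.

```python
# A minimal sketch of few-shot in-context learning: the task is
# specified entirely in the prompt and no weights are updated.
# build_few_shot_prompt is an illustrative helper, not a paper API.

def build_few_shot_prompt(instruction, demonstrations, query):
    """Concatenate an instruction, K demonstrations, and an open query."""
    lines = [instruction]
    for source, target in demonstrations:
        lines.append(f"{source} => {target}")
    lines.append(f"{query} =>")  # the model is asked to complete this line
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French:",
    [("sea otter", "loutre de mer"), ("plush giraffe", "girafe en peluche")],
    "cheese",
)
print(prompt)
# completion = model.generate(prompt)  # hypothetical inference call
```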
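
One way to read the scaling claim is that few-shot performance looks roughly linear against model size in log-log space. The sketch below fits such a power law across several model sizes; the accuracy values are invented placeholders for illustration, not measurements from the paper.

```python
# An illustrative power-law fit for the scaling claim: smooth scaling
# with model size appears roughly linear in log-log space. The
# accuracy values below are invented placeholders, NOT paper results.
import numpy as np

params = np.array([125e6, 1.3e9, 13e9, 175e9])  # model sizes (parameters)
accuracy = np.array([0.30, 0.45, 0.58, 0.71])   # hypothetical few-shot scores

# Least-squares line in log-log space: log10(acc) = m * log10(N) + b
m, b = np.polyfit(np.log10(params), np.log10(accuracy), 1)

def predict(n_params):
    """Extrapolate the fitted trend to a new model size."""
    return 10 ** (m * np.log10(n_params) + b)

print(f"fitted exponent m = {m:.3f}")
print(f"extrapolated accuracy at 1e12 params: {predict(1e12):.2f}")
```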