Dropout: A Simple Way to Prevent Overfitting

    • Dropout regularization: A training procedure that randomly omits units on each forward and backward pass, preventing them from developing overly specialized co-adaptations.
    • Ensemble interpretation: Theoretical motivation of dropout as approximate model averaging over $ 2^n $ possible thinned networks (where $ n $ is the number of droppable units), with shared weights.
    • Comprehensive empirical evaluation: Demonstration of consistent improvements across diverse domains including vision, speech recognition, text classification, and computational biology.
    • Practical guidelines: Recommended retention probabilities ($ p = 0.5 $ for hidden units, $ p = 0.8 $ for input units) and advice on tuning other hyperparameters, such as learning rate and max-norm constraints, alongside dropout.
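
The train/test behavior described above can be sketched in a few lines. This is an illustrative toy example, not code from the paper: it uses the paper's convention that $ p $ is the probability of *retaining* a unit, drops units independently at training time, and scales activations by $ p $ at test time to approximate averaging over the $ 2^n $ thinned networks.

```python
import random

random.seed(0)

def dropout_train(activations, p):
    """Training pass: keep each unit with probability p (the paper's
    retention probability), zeroing the rest."""
    return [a if random.random() < p else 0.0 for a in activations]

def dropout_test(activations, p):
    """Test pass: use every unit but scale its output by p, approximating
    the average prediction of all 2^n thinned sub-networks."""
    return [a * p for a in activations]

# Hypothetical hidden-layer activations (all 1.0 for illustration).
h = [1.0] * 10_000
p = 0.5  # recommended retention probability for hidden units

train_mean = sum(dropout_train(h, p)) / len(h)
test_mean = sum(dropout_test(h, p)) / len(h)
```

Because each unit survives training with probability $ p $, the expected activation at training time matches the deterministically scaled activation at test time, which is what makes the weight-scaling rule a reasonable stand-in for explicit model averaging.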