Translations:Dropout A Simple Way to Prevent Overfitting/12/en
where $ r_i $ is a random mask variable. The dropped-out network is then used for the forward pass and backpropagation on that training case. Different random masks are drawn for each training example and each gradient step.
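The per-example, per-step resampling described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the mask entries are taken to be independent Bernoulli draws with a retention probability `keep_prob`, and the helper names (`sample_mask`, `apply_mask`, `masks_for_batch`) and the toy activations are hypothetical, not taken from the original text.

```python
import random


def sample_mask(n_units, keep_prob, rng):
    """Draw one mask entry r_i per unit: 1 keeps the unit, 0 drops it.

    Assumption: entries are independent Bernoulli(keep_prob) draws.
    """
    return [1 if rng.random() < keep_prob else 0 for _ in range(n_units)]


def apply_mask(values, mask):
    """Multiply element-wise by the mask.

    The same operation serves both directions: it thins activations in the
    forward pass and zeroes the gradients of dropped units in backpropagation.
    """
    return [v * r for v, r in zip(values, mask)]


def masks_for_batch(batch_size, n_units, keep_prob, rng):
    """Draw an independent mask for every training example in the batch."""
    return [sample_mask(n_units, keep_prob, rng) for _ in range(batch_size)]


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    # Toy activations for a batch of two examples with four units each.
    batch = [[0.5, -1.0, 2.0, 0.3], [1.5, 0.2, -0.7, 0.9]]

    for step in range(2):
        # Fresh masks are drawn at every gradient step, one per example.
        masks = masks_for_batch(len(batch), n_units=4, keep_prob=0.5, rng=rng)
        thinned = [apply_mask(y, r) for y, r in zip(batch, masks)]
        print(f"step {step}: masks={masks} thinned={thinned}")
```

Reusing the mask from the forward pass when multiplying the incoming gradients means that only the units retained for a given training case receive an update on that case.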