The update rule is $ \mathbf{w} \leftarrow \mathbf{w} - \eta \nabla_{\mathbf{w}} \mathcal{L} $, where $ \eta $ is the learning rate. Stochastic and mini-batch variants replace the full gradient with an estimate computed from a single example or a small batch, which lets the method scale to datasets with millions of points.
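As a minimal sketch, the mini-batch update can be written out for least-squares regression, where $ \mathcal{L}(\mathbf{w}) = \frac{1}{2n}\|X\mathbf{w} - \mathbf{y}\|^2 $ and the gradient on a batch $ b $ is $ X_b^\top (X_b \mathbf{w} - \mathbf{y}_b) / |b| $. The names (`X`, `y`, `eta`, `batch_size`) and the synthetic data are illustrative assumptions, not from the text:

```python
import numpy as np

# Synthetic noiseless regression problem (illustrative only).
rng = np.random.default_rng(0)
n, d = 1000, 3
X = rng.normal(size=(n, d))
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w

w = np.zeros(d)     # parameters to learn
eta = 0.1           # learning rate (eta in the update rule)
batch_size = 32

for epoch in range(200):
    idx = rng.permutation(n)                # shuffle each epoch
    for start in range(0, n, batch_size):
        b = idx[start:start + batch_size]
        # Mini-batch estimate of the gradient of the least-squares loss.
        grad = X[b].T @ (X[b] @ w - y[b]) / len(b)
        # The update rule: w <- w - eta * grad.
        w -= eta * grad
```

Setting `batch_size = 1` recovers pure stochastic gradient descent; setting it to `n` recovers full-batch gradient descent.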