where $ \alpha $ is the step size (learning rate) and $ \epsilon $ is a small constant for numerical stability.
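To make the roles of $ \alpha $ and $ \epsilon $ concrete, here is a minimal sketch. It assumes an update of the generic adaptive form $ \theta \leftarrow \theta - \alpha \, g / (\sqrt{v} + \epsilon) $, which may differ from the exact rule given above (for instance, some variants place $ \epsilon $ inside the square root). The function name `adaptive_step`, the argument names, and the default values are hypothetical and chosen only for illustration.

```python
import math


def adaptive_step(theta, grad, v, alpha=0.001, eps=1e-8):
    """One scalar update of the assumed form theta - alpha * grad / (sqrt(v) + eps).

    theta: current parameter value
    grad:  current gradient (or gradient estimate)
    v:     non-negative running estimate of the squared gradient
    alpha: step size (learning rate)
    eps:   small positive constant that keeps the denominator away from zero
    """
    return theta - alpha * grad / (math.sqrt(v) + eps)


# With a well-scaled squared-gradient estimate, eps barely affects the step:
# the update is close to theta - alpha * grad / sqrt(v).
normal = adaptive_step(theta=1.0, grad=0.5, v=0.25)

# When v is exactly zero, the denominator would vanish without eps.
# With eps the division is well defined and the parameter is left unchanged.
degenerate = adaptive_step(theta=1.0, grad=0.0, v=0.0)
```

In this sketch $ \alpha $ scales the overall step, while $ \epsilon $ only matters when $ \sqrt{v} $ is very small, where it caps the effective step at roughly $ \alpha \, |g| / \epsilon $.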