Marcel Hudiani: Convergence Rate for the Last Iterate of Stochastic Heavy Ball

(Mathematical Physics and Probability Seminar; Math 402)

When

3 – 4 p.m., Feb. 18, 2026
What does 'a ball rolling down a hill' have to do with Machine Learning? Consider an unconstrained optimization problem involving a convex objective with Hölder continuous gradient. Such a problem arises in the training of a linear classifier for text classification. When the risk is differentiable with respect to the training parameters, gradient-based optimization schemes can be used. However, when the gradient is small, relying entirely on the gradient to 'descend' to a minimum is slow. In 1964, Polyak introduced an algorithm called the Heavy Ball to accelerate convergence by adding a momentum term, the difference between consecutive iterates, scaled by a positive parameter less than 1. I will present almost sure and in-probability convergence rate results for the last iterate of the Stochastic Heavy Ball (SHB) in the above setting.
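For reference, the classical Heavy Ball update takes the standard form below; the notation (step size \alpha_k, momentum parameter \beta, noisy gradient g_k) is illustrative and not taken from the abstract.

% Polyak's Heavy Ball (1964): a gradient step plus a momentum term,
% the scaled difference of the two most recent iterates.
% The stochastic variant (SHB) replaces the exact gradient
% \nabla f(x_k) with a noisy estimate g_k.
\[
    x_{k+1} = x_k - \alpha_k \, g_k + \beta \, (x_k - x_{k-1}),
    \qquad \beta \in (0, 1).
\]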