The Intuition Behind Gradient Descent Optimizers
In machine learning, our goal is almost always to find the "best" set of parameters for a model, a process we can visualize as searching for the lowest point in a vast, hilly landscape of error. The fundamental tool for this search is Gradient Descent, but the strategy for taking each "downhill step" can make the difference between getting stuck on a treacherous slope and efficiently reaching the bottom.
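To make the "downhill step" concrete, here is a minimal sketch of plain gradient descent on a one-dimensional error landscape. The landscape f(x) = (x - 3)^2, the function names, and the two learning rates are illustrative assumptions chosen for this example, not part of any particular library or model.

```python
def gradient_descent(grad, x0, learning_rate, steps):
    """Repeatedly step against the gradient, starting from x0."""
    x = x0
    for _ in range(steps):
        # Move "downhill": opposite to the slope, scaled by the learning rate.
        x = x - learning_rate * grad(x)
    return x


# Hypothetical error landscape: f(x) = (x - 3)^2, with its lowest point at x = 3.
def grad_f(x):
    return 2.0 * (x - 3.0)


# A modest step size settles near the minimum.
x_small = gradient_descent(grad_f, x0=0.0, learning_rate=0.1, steps=100)

# An overly large step size overshoots further on every step and diverges.
x_large = gradient_descent(grad_f, x0=0.0, learning_rate=1.1, steps=100)

print(x_small, x_large)
```

On this toy landscape the only strategic choice is the step size, and even that choice alone decides whether the search converges or runs away from the minimum.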