Gradient descent steps downhill on a loss surface. Learning rate η controls step size; momentum and adaptive methods refine the basic idea.
Explore interactively
Open equation pageθ ← θ − η∇L — the optimization loop behind virtually all modern machine learning.
8 min read · 2026-06-11
Gradient descent steps downhill on a loss surface. Learning rate η controls step size; momentum and adaptive methods refine the basic idea.
Explore interactively
Open equation page