Information Sciences
Deep Learning
1974/1986
Advanced

Backpropagation Chain Rule

\frac{\partial L}{\partial w_{ij}} = \frac{\partial L}{\partial z_j} \cdot \frac{\partial z_j}{\partial w_{ij}}

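The gradient of the loss with respect to each weight is computed by passing error signals backward from the output, layer by layer, using the chain rule. This is what makes gradient-based training of deep networks practical.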
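As a minimal sketch of the two-factor product in the equation, the snippet below evaluates it for a single weight and compares the result with a finite-difference estimate. The sigmoid neuron, the squared-error loss, and every name and number (`a_in`, `target`, `chain_rule_grad`, and so on) are assumptions chosen for illustration; they do not come from the catalog entry.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w, a_in, target):
    # Assumed setup: one neuron, z = w * a_in, output y = sigmoid(z),
    # squared-error loss L = 0.5 * (y - target)^2.
    y = sigmoid(w * a_in)
    return 0.5 * (y - target) ** 2

def chain_rule_grad(w, a_in, target):
    y = sigmoid(w * a_in)
    dL_dz = (y - target) * y * (1.0 - y)  # dL/dz: loss derivative times sigmoid'(z)
    dz_dw = a_in                          # dz/dw: the input feeding this weight
    return dL_dz * dz_dw                  # dL/dw = dL/dz * dz/dw

def numeric_grad(w, a_in, target, eps=1e-6):
    # Central finite difference, used only as an independent check.
    return (loss(w + eps, a_in, target) - loss(w - eps, a_in, target)) / (2 * eps)

# Hypothetical values for illustration.
w, a_in, target = 0.8, 1.5, 1.0
analytic = chain_rule_grad(w, a_in, target)
numeric = numeric_grad(w, a_in, target)
print(analytic, numeric)
```

The two printed values should agree to several decimal places, which is a common sanity check for a hand-written gradient.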

By Paul Werbos, Geoffrey Hinton et al.


Why it matters: Backpropagation made it practical to train multi-layer neural networks, which underpins the deep learning revolution.

Discoverers: Paul Werbos (1974); David Rumelhart, Geoffrey Hinton, and Ronald Williams (1986)


Equation Compass

Variables & Units

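Symbol | Name | Unit | Meaning
L | Loss | - | Output loss
w_{ij} | Weight | - | Weight on the connection from neuron i into neuron j
z_j | Pre-activation | - | Input to neuron j before its activation function is applied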
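To make the second factor of the equation concrete, here is a short derivation under one common assumption: that the pre-activation is a weighted sum of the previous layer's outputs. The symbols a_i (output of neuron i), b_j (bias), and \delta_j (shorthand for the error signal) are introduced here for illustration and are not part of the table above.

```latex
\begin{align}
z_j &= \sum_i w_{ij}\, a_i + b_j
  &&\Rightarrow&
  \frac{\partial z_j}{\partial w_{ij}} &= a_i \\
\delta_j &\equiv \frac{\partial L}{\partial z_j}
  &&\Rightarrow&
  \frac{\partial L}{\partial w_{ij}} &= \delta_j \, a_i
\end{align}
```

Under this assumption, the gradient for a weight is the error signal at the receiving neuron multiplied by the activation of the sending neuron.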

Worked Example

4-layer network: the gradient is multiplied by one Jacobian term per layer on its way back from the loss, 4 terms in total.
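The worked example can be sketched in code under simplifying assumptions: one neuron per layer (so each Jacobian is a scalar), tanh activations, no biases, and a squared-error loss. The function names, weights, input, and target below are hypothetical; the sketch only illustrates how the 4 per-layer terms multiply during the backward pass.

```python
import math

def forward(weights, x):
    """Return activations [a_0, ..., a_4] for a chain of scalar tanh layers."""
    activations = [x]
    for w in weights:
        activations.append(math.tanh(w * activations[-1]))
    return activations

def loss(weights, x, target):
    return 0.5 * (forward(weights, x)[-1] - target) ** 2

def backward(weights, x, target):
    """Backward pass: weight gradients, per-layer Jacobians, and dL/dx."""
    a = forward(weights, x)
    grad_a = a[-1] - target                  # dL/da at the output
    grads = [0.0] * len(weights)
    jacobians = []
    for k in reversed(range(len(weights))):  # last layer first
        dact = 1.0 - a[k + 1] ** 2           # tanh'(z) for this layer
        grad_z = grad_a * dact               # dL/dz for this layer
        grads[k] = grad_z * a[k]             # dL/dw = dL/dz * dz/dw
        jacobians.append(dact * weights[k])  # d(layer output)/d(layer input)
        grad_a = grad_z * weights[k]         # dL/da passed to the layer below
    return grads, jacobians, grad_a

# Hypothetical values for illustration.
weights = [0.5, -1.2, 0.9, 1.1]
x, target = 0.7, 0.3

grads, jacobians, grad_x = backward(weights, x, target)

# Independent check: central finite differences on each weight.
eps = 1e-6
numeric = []
for k in range(len(weights)):
    up = list(weights); up[k] += eps
    down = list(weights); down[k] -= eps
    numeric.append((loss(up, x, target) - loss(down, x, target)) / (2 * eps))

print(grads)
print(numeric)
print(len(jacobians), math.prod(jacobians))
```

The gradient that reaches the input equals the output error times the product of the 4 Jacobian terms. When those terms are small in magnitude the product shrinks quickly, and when they are large it grows quickly.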


Real-world impact

Intelligent systems

Backpropagation trains the models behind intelligent systems that are reshaping work and creative practice.
