This is one of those rules that easily becomes second nature, without the need to understand what is happening at a deeper level.
The chain rule states: to find the derivative of nested functions, multiply the derivative of the outer function (evaluated at the inner function) by the derivative of the inner function.
The objective of this video is to explore why this is the case. Understanding the underlying concept is useful as an exercise in understanding calculus on a deeper level.
Or, if you're like me, it's fun to fantasize about whether you yourself could have come up with such a rule.
First to formalize the chain rule
Making similar calculations a decade earlier
Imagine we have three gears: A, B, and C.
We could expand this and use the power rule term by term, but the chain rule is far quicker.
To use the gear analogy, think of everything being driven by a fundamental gear, which here is x.
X will always spin at a constant rate. One full spin represents one unit.
For the sake of this explanation, let's class a full spin of x as representing an infinitesimal — essentially a tiny, tiny change in x, which we call dx.
The x cog is turning the y function. But with our current setup, it's challenging to decipher how the x cog affects the rate of the function cog directly.
The chain rule allows us to drop in a middle gear, u, to make finding that rate simpler.
If the x cog spins once, the u cog spins three times because of the scaling by 3.
The derivative represents the instantaneous scaling of the spin rate.
Since $y = u^5$, taking the derivative with respect to u gives $\frac{dy}{du} = 5u^4$.
For example, if u = 1, the scale factor would be $5(1)^4 = 5$.
That means at that moment, a single spin of u would spin y five times.
For every 1 spin of x, u spins 3 times.
So we multiply our dy/du by 3 to account for all those extra spins!
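Written out in symbols, that last step might look like the following. This is only a sketch: it assumes the inner function has slope 3 everywhere (for instance u = 3x; the exact expression is not shown here), so that du/dx = 3.

```latex
\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = 5u^4 \cdot 3 = 15u^4
```

At u = 1 this gives 15: at that moment, one spin of x turns y fifteen times.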
That final multiplication isn't an arbitrary rule we just have to memorise; it is simply the mechanical consequence of connecting these gears.
When you see that dot representing multiplication in the chain rule formula, visualise that middle gear, transmitting the spin rate from the input x all the way to the output y.
It is just the cumulative effect of rates acting upon rates.
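If you want to see the gears turn numerically, here is a minimal Python sketch. It assumes the inner function is u = 3x, which is a guess made only to match the "three spins of u per spin of x" picture, and the helper names (`u`, `y_of_u`, `central_difference`) are made up for illustration. It estimates each rate with a small finite difference and compares the direct rate of y against x with the product of the two gear rates.

```python
# Numerical sanity check of the gear picture.
# Assumption (not stated in the narration): the inner function is u = 3x,
# picked only because the u cog spins three times per spin of x.

def u(x):
    return 3 * x              # inner gear: du/dx = 3


def y_of_u(u_val):
    return u_val ** 5         # outer gear: dy/du = 5u^4


def central_difference(f, point, h=1e-6):
    """Approximate f'(point) with a symmetric finite difference."""
    return (f(point + h) - f(point - h)) / (2 * h)


x0 = 1 / 3                    # chosen so that u(x0) is 1, as in the example
u0 = u(x0)

dy_du = central_difference(y_of_u, u0)                   # close to 5
du_dx = central_difference(u, x0)                        # close to 3
dy_dx = central_difference(lambda x: y_of_u(u(x)), x0)   # close to 15

print("dy/du        ~", dy_du)
print("du/dx        ~", du_dx)
print("dy/dx direct ~", dy_dx)
print("product      ~", dy_du * du_dx)
```

The last two printed values should agree to several decimal places, which is the chain rule showing up as rates acting upon rates.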