The Chain Rule
Suppose you are asked to differentiate the function $F(x) = \sqrt{x^2 + 1}$.
The differentiation formulas you learned in the previous
sections of this chapter do not enable you to calculate F′(x).
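Observe that F is a composite function. In fact, if we let $y = f(u) = \sqrt{u}$ and let $u = g(x) = x^2 + 1$, then we can write $y = F(x) = f(g(x))$, that is, $F = f \circ g$.
We know how to differentiate both f and g, so it would be useful to have a rule that tells us how to find the derivative of $F = f \circ g$ in terms of the derivatives of f and g.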
It turns out that the derivative of the composite function $f \circ g$ is the product of the derivatives of f and g. This fact is one of the most important of the differentiation rules and is called the Chain Rule.
It seems plausible if we interpret derivatives as rates of change. Regard du/dx as the rate of change of u with
respect to x, dy/du as the rate of change of y with respect to u, and dy/dx as the rate of change of y with respect to x.
If u changes twice as fast as x and y changes three times as fast as u, then it seems reasonable that y changes six times as fast as x, and so we expect that
$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx}$$
The Chain Rule can be written either in the prime notation
$$(f \circ g)'(x) = f'(g(x))\, g'(x) \tag{2}$$
or, if $y = f(u)$ and $u = g(x)$, in Leibniz notation:
$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} \tag{3}$$
Equation 3 is easy to remember because if dy/du and du/dx were quotients, then we could cancel du.
Remember, however, that du has not been defined and du/dx should not be thought of as an actual quotient.
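The prime-notation form of the Chain Rule, (f ∘ g)′(x) = f′(g(x)) g′(x), translates almost word for word into code. The sketch below is only an illustration: the helper name `chain_rule` and the sample functions f(u) = u³ and g(x) = 2x + 1 are assumptions chosen for the demonstration, not part of the text.

```python
def chain_rule(f_prime, g, g_prime):
    """Build the derivative of x -> f(g(x)) from f', g, and g'."""
    def composite_prime(x):
        # Evaluate f' at the inner value g(x), then multiply by g'(x).
        return f_prime(g(x)) * g_prime(x)
    return composite_prime


# Hypothetical example: f(u) = u**3 and g(x) = 2x + 1, so f(g(x)) = (2x + 1)**3.
d = chain_rule(lambda u: 3 * u**2, lambda x: 2 * x + 1, lambda x: 2.0)
print(d(1.0))  # 3 * 3**2 * 2 = 54.0
```

Note that f′ is evaluated at g(x), not at x; forgetting this is the most common slip when applying the rule by hand.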
Example 1
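Find $F'(x)$ if $F(x) = \sqrt{x^2 + 1}$.
Solution 1:
(Using Equation 2): We have expressed F as
$F(x) = (f \circ g)(x) = f(g(x))$ where $f(u) = \sqrt{u}$ and $g(x) = x^2 + 1$.
Since
$$f'(u) = \tfrac{1}{2}u^{-1/2} = \frac{1}{2\sqrt{u}} \qquad \text{and} \qquad g'(x) = 2x$$
we have
$$F'(x) = f'(g(x))\, g'(x) = \frac{1}{2\sqrt{x^2 + 1}} \cdot 2x = \frac{x}{\sqrt{x^2 + 1}}$$
Solution 2:
(Using Equation 3): If we let $u = x^2 + 1$ and $y = \sqrt{u}$, then
$$F'(x) = \frac{dy}{du}\,\frac{du}{dx} = \frac{1}{2\sqrt{u}}\,(2x) = \frac{1}{2\sqrt{x^2 + 1}}\,(2x) = \frac{x}{\sqrt{x^2 + 1}}$$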
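A quick numerical sanity check of Example 1 can be sketched as follows. It assumes the function in the example is F(x) = √(x² + 1), with inner function g(x) = x² + 1, and compares the Chain Rule result x/√(x² + 1) against a central finite difference. The helper `central_difference` and the step size are illustrative choices.

```python
import math


def F(x):
    # Assumed form of the function in Example 1: F(x) = sqrt(x^2 + 1).
    return math.sqrt(x * x + 1.0)


def F_prime(x):
    # Chain Rule: f'(g(x)) * g'(x) = 1 / (2 * sqrt(x^2 + 1)) * 2x.
    return x / math.sqrt(x * x + 1.0)


def central_difference(func, x, h=1e-6):
    # Symmetric difference quotient, accurate to O(h^2).
    return (func(x + h) - func(x - h)) / (2.0 * h)


for x in (-2.0, 0.0, 0.5, 3.0):
    approx = central_difference(F, x)
    print(f"x = {x:5.1f}  chain rule = {F_prime(x): .8f}  finite difference = {approx: .8f}")
```

The two columns should agree to several decimal places; agreement is evidence for the formula, not a proof of it.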
When using Equation 3 we should bear in mind that dy/dx refers to the derivative of y when y is considered as a function of x (called the derivative of y with respect to x), whereas dy/du refers to the derivative of y when considered as a function of u (the derivative of y with respect to u). For instance, in Example 1, y can be considered as a function of x ($y = \sqrt{x^2 + 1}$) and also as a function of u ($y = \sqrt{u}$).
Note that
$$\frac{dy}{dx} = F'(x) = \frac{x}{\sqrt{x^2 + 1}} \qquad \text{whereas} \qquad \frac{dy}{du} = f'(u) = \frac{1}{2\sqrt{u}}$$
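In general, if $y = \sin u$, where u is a differentiable function of x, then, by the Chain Rule,
$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} = \cos u\,\frac{du}{dx}$$
Thus
$$\frac{d}{dx}(\sin u) = \cos u\,\frac{du}{dx}$$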
In a similar fashion, all of the formulas for differentiating trigonometric functions can be combined with the Chain Rule.
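The sine rule combined with the Chain Rule, d/dx (sin u) = cos u · du/dx, can be checked numerically. In the sketch below the inner function u(x) = x² and the helper name `d_sin_of` are assumptions made for illustration.

```python
import math


def d_sin_of(u, du_dx):
    """Return the derivative of x -> sin(u(x)), namely cos(u(x)) * u'(x)."""
    return lambda x: math.cos(u(x)) * du_dx(x)


# Illustrative inner function (an assumption, not from the text): u(x) = x**2.
derivative = d_sin_of(lambda x: x * x, lambda x: 2.0 * x)

x0 = 1.3
h = 1e-6
# Central difference applied directly to sin(x**2) for comparison.
numeric = (math.sin((x0 + h) ** 2) - math.sin((x0 - h) ** 2)) / (2.0 * h)
print(derivative(x0), numeric)
```

The same pattern applies to the other trigonometric functions: replace `math.cos` with the derivative of the chosen outer function.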
Let’s make explicit the special case of the Chain Rule where the outer function f is a power function.
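If $y = [g(x)]^n$, then we can write $y = f(u) = u^n$ where $u = g(x)$.
By using the Chain Rule and then the Power Rule, we get
$$\frac{dy}{dx} = \frac{dy}{du}\,\frac{du}{dx} = n u^{n-1}\,\frac{du}{dx} = n[g(x)]^{n-1}\, g'(x) \tag{4}$$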
Example 3
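Differentiate $y = (x^3 - 1)^{100}$.
Solution:
Taking $u = g(x) = x^3 - 1$ and $n = 100$ in (4), we have
$$\begin{aligned}
\frac{dy}{dx} &= \frac{d}{dx}(x^3 - 1)^{100} \\
&= 100(x^3 - 1)^{99}\,\frac{d}{dx}(x^3 - 1) \\
&= 100(x^3 - 1)^{99} \cdot 3x^2 \\
&= 300x^2(x^3 - 1)^{99}
\end{aligned}$$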
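The result of Example 3 can be spot-checked numerically. Because (x³ − 1)¹⁰⁰ grows extremely fast, the sketch below compares values with a relative tolerance rather than an absolute one. The evaluation point x₀ = 1.5 and the step size are arbitrary illustrative choices.

```python
import math


def y(x):
    # The function from Example 3.
    return (x**3 - 1.0) ** 100


def dy_dx(x):
    # Power Rule combined with the Chain Rule: 100 * u**99 * 3x^2.
    return 300.0 * x**2 * (x**3 - 1.0) ** 99


x0 = 1.5
h = 1e-6
numeric = (y(x0 + h) - y(x0 - h)) / (2.0 * h)
print(math.isclose(dy_dx(x0), numeric, rel_tol=1e-6))
```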
The reason for the name “Chain Rule” becomes clear when we make a longer chain by adding another link.
Suppose that y = f(u), u = g(x), and x = h(t), where f, g, and h are differentiable functions.
Then, to compute the derivative of y with respect to t, we use the Chain Rule twice:
$$\frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt} = \frac{dy}{du}\,\frac{du}{dx}\,\frac{dx}{dt}$$
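A three-link chain can be sketched in code by multiplying one factor per link. The concrete functions below (f(u) = sin u, g(x) = x², h(t) = eᵗ) are hypothetical choices for illustration; the text only requires that f, g, and h be differentiable.

```python
import math

# Hypothetical links in the chain (not from the text):
#   y = f(u) = sin(u),  u = g(x) = x**2,  x = h(t) = exp(t)


def dy_dt(t):
    x = math.exp(t)          # x = h(t)
    u = x * x                # u = g(x)
    dy_du = math.cos(u)      # f'(u)
    du_dx = 2.0 * x          # g'(x)
    dx_dt = math.exp(t)      # h'(t)
    return dy_du * du_dx * dx_dt


def y_of_t(t):
    # The full composition y = f(g(h(t))), used only for the numerical check.
    return math.sin(math.exp(t) ** 2)


t0 = 0.3
step = 1e-6
numeric = (y_of_t(t0 + step) - y_of_t(t0 - step)) / (2.0 * step)
print(dy_dt(t0), numeric)
```

Each intermediate derivative is evaluated at the value produced by the link before it, which is what makes the factors fit together as a chain.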
How to Prove the Chain Rule
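Recall that if $y = f(x)$ and x changes from a to $a + \Delta x$, we define the increment of y as
$$\Delta y = f(a + \Delta x) - f(a)$$
According to the definition of a derivative, we have
$$\lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x} = f'(a)$$
So if we denote by $\varepsilon$ the difference between the difference quotient and the derivative, we obtain
$$\lim_{\Delta x \to 0} \varepsilon = \lim_{\Delta x \to 0} \left( \frac{\Delta y}{\Delta x} - f'(a) \right) = f'(a) - f'(a) = 0$$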
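But
$$\varepsilon = \frac{\Delta y}{\Delta x} - f'(a) \quad \Longrightarrow \quad \Delta y = f'(a)\,\Delta x + \varepsilon\,\Delta x$$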
If we define $\varepsilon$ to be 0 when $\Delta x = 0$, then $\varepsilon$ becomes a continuous function of $\Delta x$. Thus, for a differentiable function f, we can write
$$\Delta y = f'(a)\,\Delta x + \varepsilon\,\Delta x \qquad \text{where } \varepsilon \to 0 \text{ as } \Delta x \to 0$$
and $\varepsilon$ is a continuous function of $\Delta x$. This property of differentiable functions is what enables us to prove the Chain Rule.
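The behavior of ε can be observed numerically. The sketch below uses the hypothetical function f(x) = x³ at a = 2 (an assumption for illustration), for which ε works out to 6Δx + (Δx)², and prints ε for shrinking values of Δx.

```python
def f(x):
    # Hypothetical differentiable function chosen for illustration.
    return x**3


def f_prime(x):
    return 3.0 * x**2


def epsilon(a, dx):
    """Difference between the difference quotient and f'(a); defined as 0 at dx = 0."""
    if dx == 0.0:
        return 0.0
    return (f(a + dx) - f(a)) / dx - f_prime(a)


a = 2.0
for dx in (1e-1, 1e-2, 1e-3, 1e-4):
    eps = epsilon(a, dx)
    delta_y = f(a + dx) - f(a)
    # delta_y should equal f'(a) * dx + eps * dx up to rounding.
    print(f"dx = {dx:.0e}  epsilon = {eps:.6f}  residual = {delta_y - (f_prime(a) * dx + eps * dx):.2e}")
```

The printed ε values shrink roughly in proportion to Δx, consistent with ε → 0 as Δx → 0, while the residual stays at the level of floating-point rounding.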