Lecture 3: Change and Derivatives

3.1 Derivatives and the Chain Rule

Given a function deriv1.png , the change from deriv2.png to deriv3.png is deriv4.png . For example, if deriv5.png , then the change in going from deriv6.png to deriv7.png is

deriv8.png

This can be seen as a linear function deriv9.png plus another part deriv10.png which is small in the sense that

deriv11.png

Definition 1: If deriv12.png is a function, then a derivative deriv13.png of deriv14.png at deriv15.png is a linear function from deriv16.png to deriv17.png such that

deriv18.png
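The limit condition in Definition 1 can be checked numerically. The sketch below is only an illustration: the map f(x, y) = (x^2 y, x + y), the base point, and the candidate linear map are assumptions chosen here, not examples from the lecture. It evaluates the ratio |f(a + h) - f(a) - L(h)| / |h| for shrinking h and shows it tending to zero.

```python
import math

def f(x, y):
    # Hypothetical example map f: R^2 -> R^2, chosen for illustration.
    return (x * x * y, x + y)

def candidate_derivative(a, h):
    # Candidate linear map L at a = (a1, a2), applied to h = (h1, h2).
    # Its matrix is [[2*a1*a2, a1**2], [1, 1]].
    a1, a2 = a
    h1, h2 = h
    return (2 * a1 * a2 * h1 + a1 * a1 * h2, h1 + h2)

def norm(v):
    # Euclidean length of a vector given as a tuple or list.
    return math.sqrt(sum(c * c for c in v))

def remainder_ratio(a, h):
    # |f(a + h) - f(a) - L(h)| / |h|, the quantity whose limit must be 0.
    fa = f(*a)
    fah = f(a[0] + h[0], a[1] + h[1])
    lh = candidate_derivative(a, h)
    diff = [fah[i] - fa[i] - lh[i] for i in range(2)]
    return norm(diff) / norm(h)

a = (1.0, 2.0)
# Shrink h along the fixed direction (1, -0.5).
ratios = [remainder_ratio(a, (t, -0.5 * t)) for t in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The ratios shrink roughly in proportion to |h|, which is what one expects when the remainder is quadratic in h.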

Proposition 1: There is at most one derivative of deriv19.png at deriv20.png .

Proof: Suppose that deriv21.png and deriv22.png are two derivatives. Then each satisfies the limit condition of the definition. Subtracting these gives:

deriv23.png

For deriv24.png , let deriv25.png for deriv26.png . Then

deriv27.png

since deriv28.png and deriv29.png are linear. So deriv30.png .

Exercise 1: If deriv31.png is linear, then deriv32.png for all deriv33.png . If deriv34.png is a constant function, then deriv35.png for all deriv36.png .

Proposition 2: If deriv37.png and deriv38.png , then deriv39.png is differentiable at deriv40.png if and only if the coordinate functions deriv41.png are differentiable for all deriv42.png . If either is true, then deriv43.png .

Proof: If deriv44.png is differentiable at deriv45.png , then

deriv46.png

In particular, each term of the sum has limit zero. So the coordinate functions deriv47.png have derivatives, and these are the coordinate functions of the derivative of deriv48.png . Conversely, if the coordinate functions deriv49.png are differentiable, then substituting their derivatives for the coordinate functions of deriv50.png in the above formula shows that deriv51.png is differentiable at deriv52.png .

The main computational tool for derivatives is:

Theorem 1: (Chain Rule) If deriv53.png is differentiable at deriv54.png and if deriv55.png is differentiable at deriv56.png , then deriv57.png is differentiable at deriv58.png and deriv59.png .

Proof: Define deriv60.png and deriv61.png by

deriv62.png

and

deriv63.png

We need to show that deriv64.png where

deriv65.png

Now, letting deriv66.png and deriv67.png , we have by the definition of deriv68.png ,

deriv69.png

Using the definition of deriv70.png , we get

deriv71.png

By linearity of deriv72.png , this simplifies to:

deriv73.png

We must show that deriv74.png . Since deriv75.png is linear, there is an deriv76.png such that

deriv77.png

and the right hand side approaches zero.

For the other term, for any deriv78.png , there is a deriv79.png such that

deriv80.png

whenever deriv81.png . Note that we can drop the condition that the middle term be non-zero: if it is zero, then the condition becomes deriv82.png , which is true since deriv83.png . Now deriv84.png is bounded for small deriv85.png . So

deriv86.png

This completes the proof.
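As a sanity check on the Chain Rule, one can compare matrices numerically: the matrix of the derivative of a composition should equal the product of the matrices of the two derivatives. The sketch below is a minimal illustration under stated assumptions: the maps g and f and the base point are hypothetical, and the derivative matrices are approximated by central finite differences rather than computed exactly.

```python
import math

def g(x, y):
    # Hypothetical inner map g: R^2 -> R^2.
    return (x * y, x + y * y)

def f(u, v):
    # Hypothetical outer map f: R^2 -> R^1, returned as a 1-tuple.
    return (math.sin(u) + u * v,)

def composite(x, y):
    # The composition f o g.
    return f(*g(x, y))

def jacobian(func, point, eps=1e-6):
    # Central finite-difference approximation to the derivative matrix.
    rows, cols = len(func(*point)), len(point)
    mat = [[0.0] * cols for _ in range(rows)]
    for j in range(cols):
        plus, minus = list(point), list(point)
        plus[j] += eps
        minus[j] -= eps
        fp, fm = func(*plus), func(*minus)
        for i in range(rows):
            mat[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return mat

def matmul(A, B):
    # Plain matrix product of two lists of rows.
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

a = (0.7, -1.3)
lhs = jacobian(composite, a)                       # derivative of f o g at a
rhs = matmul(jacobian(f, g(*a)), jacobian(g, a))   # product of the two derivatives
max_gap = max(abs(lhs[0][j] - rhs[0][j]) for j in range(2))
print(max_gap)
```

The gap is limited only by the finite-difference error, so agreement to several decimal places is consistent with the theorem but is of course not a proof.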

Corollary 1: If deriv87.png are differentiable at deriv88.png , then

  1. deriv89.png
  2. deriv90.png

Proof: The sum deriv91.png (respectively the product deriv92.png ) can be written as the composition of the function deriv93.png defined by deriv94.png with the function deriv95.png (respectively deriv96.png ) defined by deriv97.png (respectively deriv98.png ). Now deriv99.png is linear, and so its derivative is itself. Further, it is easy to check that:

deriv100.png

and so deriv101.png since

deriv102.png

The formulas now follow by the chain rule.
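For concreteness, the product case can be written out as follows. The notation is an assumption made for this sketch: h(x) = (f(x), g(x)) is the pairing map, p(u, v) = uv is the product map, and D denotes the derivative; the lecture's own symbols may differ.

```latex
% Assumed notation: h(x) = (f(x), g(x)), p(u, v) = uv, D = derivative.
\begin{align*}
Dp(u, v)(x, y) &= v\,x + u\,y, \\
\frac{|p(u + x, v + y) - p(u, v) - (v x + u y)|}{|(x, y)|}
  &= \frac{|x y|}{\sqrt{x^2 + y^2}}
  \le \frac{\tfrac12 (x^2 + y^2)}{\sqrt{x^2 + y^2}}
  = \tfrac12 \sqrt{x^2 + y^2} \longrightarrow 0, \\
D(fg)(a) &= Dp\bigl(f(a), g(a)\bigr) \circ Dh(a)
  = g(a)\, Df(a) + f(a)\, Dg(a).
\end{align*}
```

The last line uses Proposition 2 to identify Dh(a) with the pair (Df(a), Dg(a)).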

3.2 Partial Derivatives

Definition 2: Let deriv103.png be a function and deriv104.png . Then, for each j, one can define deriv105.png by deriv106.png . The deriv107.png partial derivative deriv108.png of deriv109.png at deriv110.png is defined to be deriv111.png .

Proposition 3: Let deriv112.png be a function with domain deriv113.png , let deriv114.png be in the interior of deriv115.png , and assume that deriv116.png is either a local minimum or a local maximum of deriv117.png . Then deriv118.png , provided that the partial derivative exists.

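Proof: deriv119.png is a local extremum of the function of one variable deriv120.png , so the derivative of that function vanishes there by the one-variable result. By Definition 2, this derivative is the partial derivative in question.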

Theorem 2: Let deriv121.png be a function and deriv122.png . If deriv123.png is differentiable at deriv124.png , then deriv125.png is the linear transformation whose matrix is the deriv126.png array of partial derivatives deriv127.png of the coordinate functions deriv128.png of deriv129.png . Conversely, if the coordinate functions of deriv130.png all have partial derivatives defined in an open neighborhood of deriv131.png and continuous at deriv132.png , then deriv133.png is differentiable at deriv134.png .

Proof: By Proposition 2, it is enough to check the case where deriv135.png . The first assertion is immediate from the definitions. For the second assertion break up the change in deriv136.png into a sum of changes in which one varies only one parameter at a time:

deriv137.png

One can apply the Mean Value Theorem to each of the differences on the right:

deriv138.png

where deriv139.png . It follows that

deriv140.png

because the deriv141.png are continuous at deriv142.png .
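The continuity hypothesis in the converse cannot simply be dropped. The sketch below uses a standard counterexample that is not taken from the lecture: f(x, y) = xy / (x^2 + y^2) with f(0, 0) = 0. Both partial derivatives exist at the origin and equal zero, yet the remainder ratio from Definition 1 blows up along the diagonal, so f is not differentiable there. The difference quotients are numerical approximations.

```python
import math

def f(x, y):
    # Standard counterexample (an assumption of this sketch):
    # f(0, 0) = 0 and f(x, y) = x*y / (x^2 + y^2) elsewhere.
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

def partial(func, point, j, eps=1e-6):
    # Central difference quotient in the j-th coordinate.
    plus, minus = list(point), list(point)
    plus[j] += eps
    minus[j] -= eps
    return (func(*plus) - func(*minus)) / (2 * eps)

origin = (0.0, 0.0)
fx0 = partial(f, origin, 0)  # f vanishes on the x-axis, so this is 0
fy0 = partial(f, origin, 1)  # f vanishes on the y-axis, so this is 0

# If f were differentiable at the origin, its derivative would be the zero
# map, so |f(h)| / |h| would tend to 0. Along h = (t, t) it grows instead.
ratios = [abs(f(t, t)) / math.hypot(t, t) for t in (1e-1, 1e-2, 1e-3)]

# The x-partial at (0, s) equals 1/s, so it is not continuous at the origin.
fx_near = [partial(f, (0.0, s), 0, eps=s * 1e-3) for s in (1e-1, 1e-2)]

print(fx0, fy0, ratios, fx_near)
```

The unbounded values in `fx_near` show which hypothesis of Theorem 2 fails for this function.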

Corollary 2: (Chain Rule) If deriv143.png for deriv144.png are continuously differentiable at deriv145.png and if deriv146.png is differentiable at (g_1(a),...,g_m(a)), then deriv147.png has partial derivatives

deriv148.png

Proof: Apply the Chain Rule and Theorem 2.
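A small worked instance may help make the formula concrete. The functions below are hypothetical choices for illustration: g_1(x, y) = xy, g_2(x, y) = x + y, and f(u, v) = u^2 v, with F denoting the composite.

```latex
% Hypothetical example: g_1(x, y) = xy, g_2(x, y) = x + y, f(u, v) = u^2 v.
\begin{align*}
F(x, y) &= f\bigl(g_1(x, y), g_2(x, y)\bigr) = x^2 y^2 (x + y) = x^3 y^2 + x^2 y^3, \\
\frac{\partial F}{\partial x}
  &= \frac{\partial f}{\partial u}\,\frac{\partial g_1}{\partial x}
   + \frac{\partial f}{\partial v}\,\frac{\partial g_2}{\partial x}
   = \bigl(2 x y (x + y)\bigr)\, y + \bigl(x^2 y^2\bigr) \cdot 1 \\
  &= 3 x^2 y^2 + 2 x y^3 .
\end{align*}
```

Differentiating the expanded polynomial x^3 y^2 + x^2 y^3 directly with respect to x gives the same result.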

Theorem 3: If deriv149.png is a function defined on an open subset deriv150.png and if deriv151.png and its first and second order partials exist throughout deriv152.png and are continuous there, then deriv153.png for all deriv154.png .

Proof: It is clearly enough to prove the result in the case where deriv155.png . Let deriv156.png and deriv157.png be such that the rectangle with diagonal from deriv158.png to deriv159.png is contained in deriv160.png . Let deriv161.png be defined by deriv162.png . First apply the Mean Value Theorem to find a deriv163.png such that

deriv164.png

Now, deriv165.png . Apply the Mean Value Theorem again to find deriv166.png between deriv167.png and deriv168.png such that

deriv169.png

Combining results we have

deriv170.png

But

deriv171.png

which is symmetric in the two variables. So one can obtain another expression for the same quantity by repeating the construction with the roles of the two variables swapped: let deriv172.png and obtain deriv173.png and deriv174.png such that

deriv175.png

and

deriv176.png

So deriv177.png . Now take the limit as deriv178.png and use the continuity of the second partials to conclude that the mixed partials are equal.
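The symmetric second difference at the heart of this proof can also be explored numerically. In the sketch below, the function f(x, y) = x^3 y^2 + sin(xy) and the base point are assumptions chosen for illustration, and the exact mixed partial was computed by hand. Dividing the second difference by hk approximates the mixed partial, and the approximation improves as h and k shrink.

```python
import math

def f(x, y):
    # Hypothetical smooth example, not from the lecture.
    return x**3 * y**2 + math.sin(x * y)

def second_difference(func, a, b, h, k):
    # f(a+h, b+k) - f(a+h, b) - f(a, b+k) + f(a, b); this expression is
    # unchanged when the roles of the two variables are swapped.
    return func(a + h, b + k) - func(a + h, b) - func(a, b + k) + func(a, b)

def mixed_partial_exact(x, y):
    # Hand-computed mixed partial of f, the same in either order:
    # 6 x^2 y + cos(xy) - x y sin(xy).
    return 6 * x * x * y + math.cos(x * y) - x * y * math.sin(x * y)

a, b = 0.5, -1.2
exact = mixed_partial_exact(a, b)
errors = [abs(second_difference(f, a, b, t, t) / (t * t) - exact)
          for t in (1e-1, 1e-2, 1e-3)]
print(exact, errors)
```

The errors decrease roughly linearly in the step size, matching the limit taken at the end of the proof.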