MATH1231 Mathematics 1B
MATH1241 Higher Mathematics 1B
CALCULUS NOTES
Copyright 2020 School of Mathematics and Statistics, UNSW

Preface
Please read carefully.
These Notes form the basis for the calculus strand of MATH1231 and MATH1241. However, not
all of the material in these Notes is included in the MATH1231 or MATH1241 calculus syllabuses.
A detailed syllabus will be uploaded to Moodle.
In using these Notes, you should remember the following points:
1. Most courses at university present new material at a faster pace than you will have been
accustomed to in high school, so it is essential that you start working right from the beginning
of the semester and continue to work steadily throughout the semester. Make every effort to
keep up with the lectures and to do problems relevant to the current lectures.
2. These Notes are not intended to be a substitute for attending lectures or tutorials. The
lectures will expand on the material in the notes and help you to understand it.
3. These Notes may seem to contain a lot of material but not all of this material is equally
important. One aim of the lectures will be to give you a clearer idea of the relative importance
of the topics covered in the Notes.
4. Use the tutorials for the purpose for which they are intended, that is, to ask questions about
both the theory and the problems being covered in the current lectures.
5. Some of the material in these Notes is more difficult than the rest. This extra material
is marked with the symbol [H]. Material marked with an [X] is intended for students in
MATH1241.
6. It is essential for you to do problems which are given at the end of each chapter in addition
to the Online Tutorials found on Moodle. If you find that you do not have time to attempt
all of the problems, you should at least attempt a representative selection of them.
7. You will be expected to use the computer algebra package Maple in tests and understand
Maple syntax and output for the end of term examination.
Note.
This version of the Calculus Notes has been prepared by Robert Taggart and Peter Brown. They
build on notes first developed by Tony Dooley and subsequently edited by several members of
the School of Mathematics and Statistics. The main editors include Mike Banner, Ian Doust and
V. Jeyakumar.
Copyright is vested in The University of New South Wales, ©2020.
©2020 School of Mathematics and Statistics, UNSW Sydney
Contents
Preface
Calculus Syllabus
  Syllabus for MATH1231/1241
  Problem schedule
1 Functions of several variables
  1.1 Sketching simple surfaces in $\mathbb{R}^3$
  1.2 Partial differentiation
  1.3 Tangent planes to surfaces
  1.4 The total differential approximation
  1.5 Chain rules
  1.6 Functions of more than two variables
  1.7 Maple notes
  Problems for Chapter 1
2 Integration techniques
  2.1 Trigonometric integrals
    2.1.1 Integrating powers of sine and cosine
    2.1.2 Integrating multiple angles of sine and cosine
    2.1.3 Integrating powers of tan and sec
  2.2 Reduction formulae
    2.2.1 [X] Application: the irrationality of π
  2.3 Trigonometric and hyperbolic substitutions
  2.4 Integrating rational functions
    2.4.1 The overall strategy
    2.4.2 Partial fractions decompositions
    2.4.3 Integrating rational functions: two examples
  2.5 Other substitutions
  2.6 Maple notes
  Problems for Chapter 2
3 Ordinary differential equations
  3.1 An introduction
  3.2 Initial value problems
  3.3 Separable ODEs
  3.4 First order linear ODEs
  3.5 Exact ODEs
  3.6 Solving ODEs by using a change of variable [X]
  3.7 Modelling with first order ODEs
    3.7.1 Mixing problems
    3.7.2 Population models
  3.8 Second order linear ODEs with constant coefficients
    3.8.1 The homogeneous case
    3.8.2 The non-homogeneous case
    3.8.3 An application: vibrations and resonance
    3.8.4 A connection with linear algebra
  3.9 Maple notes
  Problems for Chapter 3
4 Taylor series
  4.1 Taylor polynomials
  4.2 Taylor’s theorem
    4.2.1 Classifying stationary points
    4.2.2 Some questions arising from Taylor’s theorem
  4.3 Sequences
    4.3.1 Describing the limiting behaviour of sequences
    4.3.2 Techniques for calculating limits of sequences
    4.3.3 Suprema and infima [X]
  4.4 Infinite series
  4.5 Tests for series convergence
    4.5.1 Some preliminary results on series summation
    4.5.2 The kth term divergence test
    4.5.3 The integral test
    4.5.4 The comparison test
    4.5.5 [X] The limit form of the comparison test
    4.5.6 The ratio test
    4.5.7 Leibniz’ test for alternating series
    4.5.8 Absolute and conditional convergence
  4.6 Taylor series
  4.7 Power series
    4.7.1 Radius of Convergence
    4.7.2 Convergence of power series at endpoints [X]
  4.8 Manipulation of power series
    4.8.1 Proof of theorems in Section 4.8 [X]
  4.9 Maple notes
  Problems for Chapter 4
5 Averages, arc length, speed and surface area
  5.1 The average value of a function
  5.2 The arc length of a curve
    5.2.1 An intuitive derivation of the arc length formula
    5.2.2 Arc length for a parametrised curve
    5.2.3 Arc length for the graph of a function
    5.2.4 Arc length for a polar curve
  5.3 The speed of a moving particle
  5.4 Surface area
    5.4.1 An heuristic derivation for the surface area of a surface of revolution
    5.4.2 Surface area formulae and examples
  Problems for Chapter 5
Answers to selected problems
  Chapter 1
  Chapter 2
  Chapter 3
  Chapter 4
  Chapter 5
Index
CALCULUS SYLLABUS FOR
MATH1231 MATHEMATICS 1B
The calculus course for both MATH1231 and MATH1241 is based on the MATH1231/MATH1241
calculus notes that are included in the Course Pack.
A detailed syllabus and lecture schedule will be uploaded to Moodle.
The computer package MAPLE will be used in the algebra course. An introduction to Maple
is included in the booklet titled First Year Maple Notes.
PROBLEM SETS
The Calculus problems are located at the end of each chapter of the Calculus Notes booklet.
To help you decide which problems to try first, each problem is marked with an [R], an [H] or an
[X].
The problems marked [R] form a basic set of problems which you should try first. Problems
marked [H] are harder and can be left until you have done the problems marked [R]. Some harder
parts of [R] problems are marked with a star. Any problems which depend on work covered only
in MATH1241 are marked [X].
You do need to make an attempt at the [H] problems because problems of this type will occur
on tests and in the exam. If you have difficulty with the [H] problems, ask for help in your tutorial.
Questions marked with a [V] have a video solution available from the course page for this subject
on Moodle.
CALCULUS PROBLEM SCHEDULE
Solving problems and writing mathematics clearly are two separate skills that need to be developed
through practice. We recommend that you keep a workbook to practice writing solutions to
mathematical problems. The range of questions suitable for each week will be provided on Moodle
along with a suggestion of specific recommended problems to do before your classroom tutorials.
The Online Tutorials will develop your problem solving skills, and give you examples of mathematical
writing. Online Tutorials help build your understanding from lectures towards solving
problems on your own.
Chapter 1
Functions of several variables
Most functions which arise in real world applications depend on more than one variable. In this
chapter, we give a brief introduction to functions of two (or more) variables. Examples include
functions defined by the following well known formulae:
• $A(b, h) = \tfrac{1}{2}bh$ (the area of a triangle);
• $D(x, y) = \sqrt{x^2 + y^2}$ (the distance of a point $(x, y)$ from the origin);
• $\ell(x, y) = 2(x + y)$ (the perimeter of a rectangle of dimensions x and y units); and
• $F(m_1, m_2, \mathbf{x}_1, \mathbf{x}_2) = \dfrac{G m_1 m_2}{|\mathbf{x}_1 - \mathbf{x}_2|^2}$ (the gravitational force between two bodies of mass $m_1$ and
$m_2$ positioned at points $P(\mathbf{x}_1)$ and $Q(\mathbf{x}_2)$).
Geometrically, a function of two variables represents a surface in $\mathbb{R}^3$, just as a function of one
variable represents a curve in $\mathbb{R}^2$. We will introduce the partial derivative of a function of two
variables and use this to calculate the equation of the tangent plane to a surface at a point. Other
applications of the partial derivative include function and error estimation.
1.1 Sketching simple surfaces in $\mathbb{R}^3$
(Ref: SH10 §15.2, 15.3)
The graph of a function f of one variable, given by $y = f(x)$, gives rise to a curve in $\mathbb{R}^2$. The graph
of a function F of two variables, given by $z = F(x, y)$, gives rise to a surface in $\mathbb{R}^3$. In this section
we introduce some simple techniques for sketching surfaces given by an equation $z = F(x, y)$.
Sketching a surface in $\mathbb{R}^3$ can be challenging because the sketch must be represented in $\mathbb{R}^2$.
Topographic maps solve this problem by using contour lines to represent the height (above sea
level) of the surface of the earth at various points. We adapt this idea to sketching a surface
z = F (x, y) described by a function F . Here, z represents the height of the surface above the
xy-plane, and the contours of the surface are defined as follows.
Definition 1.1.1. A contour or level curve of a function $F : \mathbb{R}^2 \to \mathbb{R}$ is a curve in
$\mathbb{R}^2$ corresponding to an equation of the form $F(x, y) = C$, where C is a constant.
For each level curve, the corresponding value of C gives the height of the curve above the
xy-plane.
Example 1.1.2. Sketch level curves for the function $F : \mathbb{R}^2 \to \mathbb{R}$, where $F(x, y) = x^2 + y^2$.
Solution. The level curves of F are of the form $x^2 + y^2 = C$, where C is a nonnegative constant
(since if $C < 0$ then there is no solution to the equation $x^2 + y^2 = C$). The level curves given by
$x^2 + y^2 = 0$, $x^2 + y^2 = 1$, $x^2 + y^2 = 2$, $x^2 + y^2 = 3$ and $x^2 + y^2 = 4$ are shown below.
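[Figure: the level curves $x^2 + y^2 = C$ for $C = 0, 1, 2, 3, 4$, drawn in the $xy$-plane as concentric circles about the origin and labelled $z = 0$ to $z = 4$.]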
Level curves can be interpreted in the following way. If one walked around the circle $x^2 + y^2 = 4$,
then one would remain a constant height of 4 units above the xy-plane. If one started at the origin
and walked 2 units ‘east,’ then one would rise from ‘sea level’ to 4 units above sea level.
Some simple surfaces may be sketched in $\mathbb{R}^3$ by using the level curves of the surface and
considering the intersection of the surface with the yz-plane. The final sketch is drawn using
perspective.
Example 1.1.3. Sketch the surface in $\mathbb{R}^3$ described by the equation $z = x^2 + y^2$.
Solution. The level curves for the surface are circles (see the sketch in the previous example). When
$x = 0$, we have $z = y^2$. This gives the intersection of the surface with the yz-plane. The profile of
this intersection is shown below.
[Figure: the parabola $z = y^2$ in the $yz$-plane.]
The yz-profile and level curves help us produce the sketch of the surface.
[Figure: the surface $z = x^2 + y^2$, showing the profile $z = y^2$ and the level curves $z = 1$, $z = 2$, $z = 3$ and $z = 4$.]
Remark 1.1.4. The previous diagram illustrates the conventional orientation for the x-, y- and
z-axes.
Example 1.1.5. A surface in $\mathbb{R}^3$ is described by the equation $x^2 + y^2 - z^2 = 1$. Sketch some level
curves and hence sketch the surface in $\mathbb{R}^3$.
Solution. Each level curve is obtained by setting z equal to C, for some constant C.
z          level curve
$0$        $x^2 + y^2 = 1$
$\pm 1$    $x^2 + y^2 = 2$
$\pm 2$    $x^2 + y^2 = 5$
$\pm 3$    $x^2 + y^2 = 10$
Each level curve is a circle. Those given in the table are sketched below.
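[Figure: the level curves for $z = 0$, $z = \pm 1$, $z = \pm 2$ and $z = \pm 3$ in the $xy$-plane; the innermost circle has radius $1$ and the outermost has radius $\sqrt{10}$.]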
If $x = 0$ then $y^2 - z^2 = 1$, which is a hyperbola.
[Figure: the hyperbola $y^2 - z^2 = 1$ in the $yz$-plane, crossing the $y$-axis at $y = \pm 1$.]
Putting these two sketches together gives the following surface.
[Figure: the surface $x^2 + y^2 - z^2 = 1$, showing the profile $y^2 - z^2 = 1$ and the level curves $z = 0$, $z = \pm 1$, $z = \pm 2$ and $z = \pm 3$.]
This surface is called a hyperboloid of one sheet.
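The table of level curves in Example 1.1.5 can be reproduced numerically. The following is a minimal sketch, not part of the original notes; the function name `level_curve_radius` is an illustrative choice. It uses the fact that setting $z = C$ in $x^2 + y^2 - z^2 = 1$ gives the circle $x^2 + y^2 = 1 + C^2$.

```python
import math


def level_curve_radius(c):
    """Radius of the circle x^2 + y^2 = 1 + c^2 obtained by setting z = c."""
    return math.sqrt(1 + c * c)


# Reproduce the rows of the table for z = 0, ±1, ±2, ±3.
for c in range(4):
    print(f"z = ±{c}: x^2 + y^2 = {1 + c * c}, radius = {level_curve_radius(c):.4f}")
```

The radius grows with $|C|$, which is why the surface widens as one moves away from the $xy$-plane in either direction.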
1.2 Partial differentiation
(Ref: SH10 §15.4, 15.6)
Suppose that f is a differentiable function of one variable. In MATH1131 we
• investigated techniques for calculating the derivative of f ,
• interpreted the derivative in terms of the rate of change of f ,
• used the derivative to calculate the equation of the tangent line to the graph of f at a point
a, and
• used the tangent line to give a linear approximation to f near a.
Over the next few sections, we generalise some of these ideas and techniques to functions F of two
variables. In particular, we introduce the notion of a partial derivative and then
• show how to calculate the partial derivatives of F ,
• interpret the partial derivatives in terms of the rate of change of F ,
• use the partial derivatives to calculate the equation of the tangent plane to the graph of F
at a point (a, b), and
• use the tangent plane to give a linear approximation for F near (a, b).
Our discussion will assume that we have an intuitive idea of what is meant by a ‘tangent plane to
a surface.’ A rigorous discussion of the existence of tangent planes is connected with the formal
definition of the derivative (as opposed to partial derivative) of a function of two variables. This
rigour is pursued in some second year courses.
To introduce the notion of a partial derivative, consider the function F given by
$$F(x, y) = x^2 + y^2.$$
The surface $z = x^2 + y^2$, corresponding to the graph of F, was sketched in Example 1.1.3. Our
immediate goal is to quantify the rate of change of F (x, y) at the point (1, 2).
First, we find the rate of change of F (x, y) at (1, 2) in the y-direction. This means that we hold
x constant and only allow y to vary. Since we are considering the rate of change at the point (1, 2),
we have x = 1 and
$$z = F(x, y) = F(1, y) = 1 + y^2$$
(since x is constant but y is not). Geometrically, we may interpret $z = 1 + y^2$ as the intersection
of the surface $z = x^2 + y^2$ with the plane $x = 1$. This intersection is shown in Figure 1.1. The rate
of change of F (x, y) at (1, 2) in the y-direction is equal to the gradient of the dashed tangent line
shown in the figure. This gradient may be calculated in the usual way:
$$\text{gradient} = \frac{d}{dy}\left(1 + y^2\right)\Big|_{y=2} = 2y\,\Big|_{y=2} = 4.$$
Therefore the rate of change of F (x, y) at (1, 2) in the y-direction is 4.
A fast way of finding this rate of change is the following. First, differentiate F with respect to
y, treating x as a constant, to obtain
$$F_y(x, y) = 2y.$$
Then the rate of change of $F(x, y)$ at $(1, 2)$ in the y-direction is
$$F_y(1, 2) = 2 \times 2 = 4.$$
The function Fy is called the partial derivative of F with respect to y.
In a similar manner, one may find the rate of change of F (x, y) at (1, 2) in the x-direction.
First, differentiate F with respect to x, treating y as a constant, to obtain
$$F_x(x, y) = 2x.$$
Then the rate of change of $F(x, y)$ at $(1, 2)$ in the x-direction is
$$F_x(1, 2) = 2 \times 1 = 2.$$
The function Fx is called the partial derivative of F with respect to x.
The partial derivatives of a function F may be defined formally by using limits.
[Figure 1.1 shows the cross-section $z = 1 + y^2$, $x = 1$, with the points $(1, 0, 0)$, $(1, 2, 0)$ and $(1, 2, 5)$ marked and a dashed tangent line of gradient 4.]
Figure 1.1: The intersection of the surface $z = x^2 + y^2$ with the plane $x = 1$. The dashed line is
the tangent line to the cross-section at the point $(1, 2, 5)$.
Definition 1.2.1. Suppose that F is a function of two variables x and y. The
partial derivatives of F with respect to x and y are defined by
$$F_x(x, y) = \lim_{h \to 0} \frac{F(x + h, y) - F(x, y)}{h}$$
and
$$F_y(x, y) = \lim_{h \to 0} \frac{F(x, y + h) - F(x, y)}{h},$$
wherever these limits exist.
Remark 1.2.2. If F is a function of x and y, then Fx(x, y) is calculated by treating y as a constant
and differentiating F with respect to x. On the other hand, Fy(x, y) is calculated by treating x as
a constant and differentiating F with respect to y.
Remark 1.2.3. As illustrated at the beginning of this section, Fy(a, b) gives the rate of change
of F at the point (a, b) in the y-direction. Interpreted geometrically, the number Fy(a, b) is the
gradient of the tangent to the cross-section at (a, b) when the surface z = F (x, y) is intersected
with the plane x = a. Similarly, Fx(a, b) gives the rate of change of F at the point (a, b) in the
x-direction.
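The limit definition can be explored numerically by evaluating the difference quotients for a small value of h. The following is a rough sketch under stated assumptions, not part of the original notes: the helper names `partial_x` and `partial_y` and the step size `h = 1e-6` are illustrative choices, and a finite h only approximates the limit.

```python
def F(x, y):
    """The function F(x, y) = x^2 + y^2 used at the start of this section."""
    return x**2 + y**2


def partial_x(F, x, y, h=1e-6):
    """Difference quotient approximating F_x(x, y): y is held constant."""
    return (F(x + h, y) - F(x, y)) / h


def partial_y(F, x, y, h=1e-6):
    """Difference quotient approximating F_y(x, y): x is held constant."""
    return (F(x, y + h) - F(x, y)) / h


# Compare with the exact values F_x(1, 2) = 2 and F_y(1, 2) = 4.
print(partial_x(F, 1.0, 2.0))
print(partial_y(F, 1.0, 2.0))
```

Both printed values should be close to the exact rates of change 2 and 4 found earlier, with a small discrepancy of order h.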
Example 1.2.4. Suppose that $F(x, y) = x^2 y + 2y + 4$. Find $F_x$ and $F_y$.
Solution. To calculate $F_x$, we treat y as a constant and differentiate F with respect to x:
$$F_x(x, y) = 2xy.$$
To calculate $F_y$, we treat x as a constant and differentiate F with respect to y:
$$F_y(x, y) = x^2 + 2.$$
Partial derivatives may be denoted in a variety of ways.
Remark 1.2.5 (Notation). Suppose that a function F has partial derivatives $F_x$ and $F_y$. Then $F_x$
may be denoted by
$$\frac{\partial F}{\partial x} \quad\text{or}\quad D_1 F,$$
while $F_y$ may be denoted by
$$\frac{\partial F}{\partial y} \quad\text{or}\quad D_2 F.$$
The notation involving the ‘curly d’ is used most frequently. However, the notation $D_1 F$ and $D_2 F$
is less ambiguous. (For example, to calculate $D_1 F(y, x)$, we differentiate F with respect to its first
variable and then evaluate this partial derivative at the point $(y, x)$. In ‘curly d’ notation, one
would write $\frac{\partial F}{\partial x}(y, x)$. Here, the first x represents the first variable of the function while the x in
parentheses represents the second ordinate of the point $(y, x)$.)
Example 1.2.6. Suppose that
$$F(x, y) = 3e^{xy^3} \sin y.$$
Find $\frac{\partial F}{\partial x}$ and $\frac{\partial F}{\partial y}$.
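Solution. To calculate $\frac{\partial F}{\partial x}$, we treat y as a constant and differentiate F with respect to x:
$$\frac{\partial F}{\partial x} = 3y^3 e^{xy^3} \sin y.$$
To calculate $\frac{\partial F}{\partial y}$, we treat x as a constant and differentiate F with respect to y:
$$\begin{aligned}
\frac{\partial F}{\partial y} &= 3 \sin(y)\, \frac{\partial}{\partial y}\left(e^{xy^3}\right) + 3e^{xy^3}\, \frac{\partial}{\partial y}(\sin y) \qquad \text{(by using the product rule)} \\
&= 3 \sin(y) \left(3xy^2 e^{xy^3}\right) + 3e^{xy^3} (\cos y) \\
&= 3e^{xy^3} \left(3xy^2 \sin y + \cos y\right).
\end{aligned}$$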
So far we have seen examples of first order partial derivatives. Second order partial derivatives
can be computed by differentiating first order partial derivatives. Some notation is given below:
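$$\frac{\partial^2 F}{\partial x^2} \ \text{means}\ \frac{\partial}{\partial x}\left(\frac{\partial F}{\partial x}\right);$$
$$\frac{\partial^2 F}{\partial y^2} \ \text{means}\ \frac{\partial}{\partial y}\left(\frac{\partial F}{\partial y}\right);$$
$$\frac{\partial^2 F}{\partial x\,\partial y} \ \text{means}\ \frac{\partial}{\partial x}\left(\frac{\partial F}{\partial y}\right); \text{ and}$$
$$\frac{\partial^2 F}{\partial y\,\partial x} \ \text{means}\ \frac{\partial}{\partial y}\left(\frac{\partial F}{\partial x}\right).$$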
Second order partial derivatives may be denoted using other notation. For example,
$$\frac{\partial^2 F}{\partial x^2} = F_{xx} = D_1^2 F$$
while
$$\frac{\partial^2 F}{\partial y\,\partial x} = F_{xy} = D_2 D_1 F.$$
Example 1.2.7. Compute all second order partial derivatives of F, where $F(x, y) = 3xy^4 + e^x \sin y$.
Solution. The first order partial derivatives are given by
$$F_x(x, y) = 3y^4 + e^x \sin y \quad\text{and}\quad F_y(x, y) = 12xy^3 + e^x \cos y.$$
The second order partial derivatives are obtained by further differentiation. Thus
$$\begin{aligned}
F_{xx}(x, y) &= \frac{\partial}{\partial x}\left(3y^4 + e^x \sin y\right) = e^x \sin y \\
F_{yy}(x, y) &= \frac{\partial}{\partial y}\left(12xy^3 + e^x \cos y\right) = 36xy^2 - e^x \sin y \\
F_{yx}(x, y) &= \frac{\partial}{\partial x}\left(12xy^3 + e^x \cos y\right) = 12y^3 + e^x \cos y \\
F_{xy}(x, y) &= \frac{\partial}{\partial y}\left(3y^4 + e^x \sin y\right) = 12y^3 + e^x \cos y.
\end{aligned}$$
Note in the previous example that Fxy = Fyx. This is no accident.
Theorem 1.2.8 (The mixed derivative theorem). Suppose that F is a function of two variables.
If F and all its first and second order partial derivatives are continuous then
$$\frac{\partial^2 F}{\partial x\,\partial y} = \frac{\partial^2 F}{\partial y\,\partial x}.$$
Two questions arise. First, what does it mean for a function F of two variables to be continuous?
Second, why is the theorem true? We give a partial answer to each of these questions in the following
remarks.
Remark 1.2.9. Suppose that F is a function of two variables. While a formal definition for
continuity is not given in this course, the following rule may often be used to verify that F is
continuous.
If F can be constructed by combining (via function addition, multiplication, division
and composition) a finite number of continuous functions of a single variable, then F is
continuous on its domain.
Thus the function F, given by
$$F(x, y) = 3xy^4 + e^x \sin y,$$
is continuous since
$$F(x, y) = f(x)g(y) + h(x)k(y),$$
where the functions given by $f(t) = 3t$, $g(t) = t^4$, $h(t) = e^t$ and $k(t) = \sin t$ are continuous.
Similarly, the function G given by
$$G(x, y) = e^{3x^4 y}$$
is continuous since $G(x, y) = h(g(x)f(y))$. Suffice to say, most functions of two variables given in
this course are continuous on their domains.
Remark 1.2.10. To prove the mixed derivative theorem, one starts from the definition
of the partial derivative. Now
$$\begin{aligned}
F_{xy}(a, b) &= \lim_{h_2 \to 0} \frac{F_x(a, b + h_2) - F_x(a, b)}{h_2} \\
&= \lim_{h_2 \to 0} \lim_{h_1 \to 0} \frac{\dfrac{F(a + h_1, b + h_2) - F(a, b + h_2)}{h_1} - \dfrac{F(a + h_1, b) - F(a, b)}{h_1}}{h_2} \\
&= \lim_{h_2 \to 0} \lim_{h_1 \to 0} \frac{F(a + h_1, b + h_2) - F(a, b + h_2) - F(a + h_1, b) + F(a, b)}{h_1 h_2}.
\end{aligned}$$
Similarly,
$$F_{yx}(a, b) = \lim_{h_1 \to 0} \lim_{h_2 \to 0} \frac{F(a + h_1, b + h_2) - F(a + h_1, b) - F(a, b + h_2) + F(a, b)}{h_1 h_2}.$$
To show that Fyx(a, b) = Fxy(a, b), one need only swap the order in which the limits are taken.
This step can be justified by the continuity hypothesis of the mixed derivative theorem. The details
are not easy and will be omitted here.
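The double difference quotient that appears in both limits can be evaluated numerically for the function of Example 1.2.7. The following is an illustrative sketch only, not part of the original notes: the name `second_difference_quotient`, the sample point $(0.5, 1)$ and the step sizes are assumptions, and a numerical check of this kind does not replace the proof.

```python
import math


def F(x, y):
    """F(x, y) = 3xy^4 + e^x sin y, as in Example 1.2.7."""
    return 3 * x * y**4 + math.exp(x) * math.sin(y)


def second_difference_quotient(F, a, b, h1, h2):
    """The common quotient in the limits defining F_xy(a, b) and F_yx(a, b)."""
    numerator = F(a + h1, b + h2) - F(a, b + h2) - F(a + h1, b) + F(a, b)
    return numerator / (h1 * h2)


def mixed_partial_exact(a, b):
    """F_xy = F_yx = 12y^3 + e^x cos y, taken from Example 1.2.7."""
    return 12 * b**3 + math.exp(a) * math.cos(b)


a, b = 0.5, 1.0  # sample point chosen for illustration
approx = second_difference_quotient(F, a, b, 1e-4, 1e-4)
print(approx, mixed_partial_exact(a, b))
```

Because the same quotient appears in both iterated limits, a single numerical value approximates $F_{xy}(a, b)$ and $F_{yx}(a, b)$ at once; the theorem concerns whether the two orders of taking limits agree.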
1.3 Tangent planes to surfaces
In this section, we give an heuristic derivation of the Cartesian equation for the tangent plane to
a surface at a given point. Our treatment relies on intuition rather than rigour; a full presentation
requires a rigorous account of what is meant by a ‘tangent plane’ and conditions under which a
tangent plane exists. Such questions are tackled in some second year courses.
As a side benefit of our derivation for the equation of the tangent plane, we also obtain a formula
for a normal vector to the surface at a given point. A normal vector is defined as follows. Suppose
that a surface has a tangent plane at a point P . We say that a vector n is normal to the surface
at P if n is normal to the tangent plane to the surface at P .
The main ideas for deriving a formula for the tangent planes and normal vectors are contained
in the following example.
Example 1.3.1. Suppose that $F(x, y) = x^2 + y^2$. By using vector geometry, find the Cartesian
equation of the tangent plane to the surface $z = F(x, y)$ at the point where $(x, y, z) = (1, 2, 5)$.
Find also a vector n that is normal to the surface at this point.
Solution. We will complete the solution in four steps.
Step 1. First, intersect the surface $z = x^2 + y^2$ with the plane $x = 1$. This
gives the cross-sectional profile
$$\begin{cases} z = 1 + y^2 \\ x = 1 \end{cases}$$
as illustrated in Figure 1.1. The dashed line passing through the point $(1, 2, 5)$ lies in the plane
$x = 1$ and is tangent to the parabola $z = 1 + y^2$. Its gradient is
$$F_y(1, 2) = 4.$$
By using the point-gradient formula for a straight line, the Cartesian equation for this tangent is
given by
$$z - 5 = 4(y - 2), \quad x = 1.$$
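If $\lambda = y - 2$ then the equation of the tangent line in parametric vector form is
$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 5 \end{pmatrix} + \lambda \begin{pmatrix} 0 \\ 1 \\ 4 \end{pmatrix} \tag{1.1}$$
where $\lambda \in \mathbb{R}$. Note that this line lies in the tangent plane to the surface at $(1, 2, 5)$.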
Step 2. Now we intersect the surface with the plane $y = 2$ and repeat the method of Step 1.
The cross-sectional profile of this intersection is given by
$$\begin{cases} z = x^2 + 4 \\ y = 2. \end{cases}$$
Consider the line that lies in the plane $y = 2$ and is tangent to $z = x^2 + 4$ at $(1, 2, 5)$. Its gradient
is given by
$$F_x(1, 2) = 2$$
and hence its Cartesian equation by
$$z - 5 = 2(x - 1), \quad y = 2.$$
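If $\mu = x - 1$ then we find that the parametric vector form of the tangent line is
$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 5 \end{pmatrix} + \mu \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix} \tag{1.2}$$
where $\mu \in \mathbb{R}$. Note that this line also lies in the tangent plane to the surface at $(1, 2, 5)$.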
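Step 3. Since the lines given by (1.1) and (1.2) both lie in the tangent plane, and since the
vectors
$$\begin{pmatrix} 0 \\ 1 \\ 4 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix}$$
are nonparallel, the tangent plane to the surface at $(1, 2, 5)$ is given by
$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 5 \end{pmatrix} + \lambda \begin{pmatrix} 0 \\ 1 \\ 4 \end{pmatrix} + \mu \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix}$$
where $\lambda$ and $\mu$ are arbitrary real numbers.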
Step 4. We now convert the parametric vector form of the plane to point-normal form. A vector
n that is normal to the plane is given by
$$\mathbf{n} = \begin{pmatrix} 0 \\ 1 \\ 4 \end{pmatrix} \times \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix} = \begin{pmatrix} 2 \\ 4 \\ -1 \end{pmatrix}.$$
Hence the point-normal form for the tangent plane is
$$\begin{pmatrix} 2 \\ 4 \\ -1 \end{pmatrix} \cdot \begin{pmatrix} x - 1 \\ y - 2 \\ z - 5 \end{pmatrix} = 0.$$
By expanding the dot product, one obtains $z = 5 + 2(x - 1) + 4(y - 2)$, which simplifies to
$$2x + 4y - z = 5.$$
This is the Cartesian form of the plane, and the vector $(2, 4, -1)^T$ is normal to the surface at the
point $(1, 2, 5)$.
Note in the above example that $F_x(1, 2) = 2$ and $F_y(1, 2) = 4$. Each of these numbers appears
as a coefficient in the equation of the tangent plane and as a component of the normal vector. By
generalising the above example, one obtains the formulae in the following proposition.
Proposition 1.3.2. Suppose that F is a function of two variables and $(x_0, y_0, z_0)$ is a point that
lies on the surface $z = F(x, y)$. If the surface has a tangent plane at the point $(x_0, y_0, z_0)$, then the
tangent plane is given by the equation
$$z = z_0 + F_x(x_0, y_0)(x - x_0) + F_y(x_0, y_0)(y - y_0)$$
and a normal vector to the surface at $(x_0, y_0, z_0)$ is given by
$$\begin{pmatrix} F_x(x_0, y_0) \\ F_y(x_0, y_0) \\ -1 \end{pmatrix}.$$
Remark 1.3.3. The above proposition is similar to what is already known for functions of a single
variable. In fact, we have the following. Suppose that f is a function of one variable and $(x_0, y_0)$
is a point that lies on the curve $y = f(x)$. If f is differentiable at $x_0$, then the tangent line to the
curve at $(x_0, y_0)$ is given by the equation
$$y = y_0 + f'(x_0)(x - x_0)$$
and a normal vector to the curve at $(x_0, y_0)$ is given by
$$\begin{pmatrix} f'(x_0) \\ -1 \end{pmatrix}.$$
Remark 1.3.4. The proposition implicitly assumes that the first order partial derivatives for F
exist at (x0, y0). It can be shown that if a tangent plane exists then these first order partial
derivatives also exist. Hence this implicit assumption causes no problems. (Be warned however,
that the existence of first order partial derivatives at (x0, y0) does not guarantee the existence of a
tangent plane at (x0, y0, z0).)
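The formulae of Proposition 1.3.2 translate directly into a short computation. The following is a minimal sketch, not part of the original notes; the function names `tangent_plane` and `cross` are illustrative, and the code assumes that a tangent plane exists at the given point. It recovers the plane $2x + 4y - z = 5$ of Example 1.3.1 and checks the normal against the cross product used in Step 4.

```python
def tangent_plane(fx, fy, x0, y0, z0):
    """Return (normal, d) such that the tangent plane is normal . (x, y, z) = d.

    fx and fy are the values F_x(x0, y0) and F_y(x0, y0).
    Rearranging z = z0 + fx (x - x0) + fy (y - y0) gives
    fx x + fy y - z = fx x0 + fy y0 - z0.
    """
    normal = (fx, fy, -1)
    d = fx * x0 + fy * y0 - z0
    return normal, d


def cross(u, v):
    """Cross product of two vectors in R^3, given as 3-tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


# Example 1.3.1: F(x, y) = x^2 + y^2 at (1, 2, 5), with F_x = 2 and F_y = 4.
normal, d = tangent_plane(2, 4, 1, 2, 5)
print(normal, d)
print(cross((0, 1, 4), (1, 0, 2)))
```

Both approaches give the same normal vector, which is the point of the proposition: once the partial derivatives are known, no cross product is needed.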
Example 1.3.5. Suppose that $F(x, y) = \sin(\pi x y^2)$. Write down the Cartesian equation for the
tangent plane to the surface $z = F(x, y)$ at the point $(2, -1, 0)$ and find a vector that is normal to
the surface at this point.
Solution. The partial derivatives of F are given by
$$F_x(x, y) = \pi y^2 \cos(\pi x y^2) \quad\text{and}\quad F_y(x, y) = 2\pi x y \cos(\pi x y^2).$$
Hence
$$F_x(2, -1) = \pi \quad\text{and}\quad F_y(2, -1) = -4\pi.$$
So the equation of the tangent plane is given by
$$z = 0 + \pi(x - 2) - 4\pi(y + 1) \quad\text{or}\quad \pi x - 4\pi y - z = 6\pi,$$
and a normal vector is
$$\begin{pmatrix} \pi \\ -4\pi \\ -1 \end{pmatrix}.$$
Notes:
• One could also use the point-normal form directly to find the equation of the tangent plane.
Thus the vector normal to the plane is
\[ \mathbf{n} = \begin{pmatrix} f_x \\ f_y \\ -1 \end{pmatrix} \]
(evaluated at the point P), and so the equation of the tangent plane is (x − P) · n = 0.
Hence, in the previous example, we could simply expand
\[ \left( \mathbf{x} - \begin{pmatrix} 2 \\ -1 \\ 0 \end{pmatrix} \right) \cdot \begin{pmatrix} \pi \\ -4\pi \\ -1 \end{pmatrix} = 0 \]
to find the equation of the tangent plane.
• More generally, if a surface is given in the form g(x, y, z) = 0, then the vector
\[ \mathbf{n} = \begin{pmatrix} g_x \\ g_y \\ g_z \end{pmatrix}, \]
evaluated at some point P on the surface, is a vector normal to the surface at P.
Example 1.3.6. Find the equation of the tangent plane to the ellipsoid
\[ \frac{x^2}{4} + \frac{y^2}{2} + \frac{z^2}{8} = 1 \]
at the point P(0, 1, 2).
Solution. A vector normal to the ellipsoid at P is given by
\[ \mathbf{n} = \begin{pmatrix} x/2 \\ y \\ z/4 \end{pmatrix}_{P} = \begin{pmatrix} 0 \\ 1 \\ 1/2 \end{pmatrix}. \]
Hence the equation of the tangent plane has the form
\[ 0x + 1y + \tfrac{1}{2} z = d \]
for some constant d. Substituting in P, we find that d = 2 and so the desired equation is 2y + z = 4.
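As a rough check of Example 1.3.6, the sketch below (an illustration only; the helper names `grad_g`, `dot` and `on_ellipsoid` and the nearby test point are assumptions, not part of the notes) evaluates the gradient at P and confirms that a point on the ellipsoid close to P almost satisfies the plane equation n · x = d.

```python
import math

def grad_g(x, y, z):
    # gradient of g(x, y, z) = x^2/4 + y^2/2 + z^2/8 - 1
    return (x / 2, y, z / 4)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

P = (0.0, 1.0, 2.0)
n = grad_g(*P)      # normal vector at P
d = dot(n, P)       # right-hand side of the plane equation n . x = d

def on_ellipsoid(x, y):
    """Point on the upper half of the ellipsoid lying above (x, y)."""
    return (x, y, math.sqrt(8 * (1 - x**2 / 4 - y**2 / 2)))

residual = dot(n, on_ellipsoid(0.01, 1.01)) - d
print(n, d, residual)
```

The residual is small but not zero, because the plane only touches the surface at P.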
1.4 The total differential approximation
Suppose that f is a differentiable function of one variable. The equation of the tangent to the
graph of f at a point x0 is given by
\[ y = y_0 + f'(x_0)(x - x_0), \]
where y0 = f(x0). When x is close to x0, the tangent line is close to the graph of f . In other words,
f(x) ≈ f(x0) + f ′(x0)(x− x0).
Now writing ∆x = x− x0 and ∆f = f(x)− f(x0) we have
∆f ≈ f ′(x0)∆x.
This is called the differential approximation to ∆f . Our goal is to generalise this idea to functions
of two variables.
Suppose that a surface given by z = F (x, y) has a tangent plane at the point (x0, y0, z0). The
equation of the tangent plane is given by
z = z0 + Fx(x0, y0)(x− x0) + Fy(x0, y0)(y − y0).
If (x, y) is near (x0, y0) then
F (x, y) ≈ z0 + Fx(x0, y0)(x− x0) + Fy(x0, y0)(y − y0).
Since (x0, y0, z0) lies on the surface, z0 = F (x0, y0). Thus
F (x, y)− F (x0, y0) ≈ Fx(x0, y0)(x− x0) + Fy(x0, y0)(y − y0). (1.3)
Let ∆x denote the difference x − x0 between x and x0 and let ∆y denote the difference y − y0
between y and y0. Then (1.3) becomes
F (x0 +∆x, y0 +∆y)− F (x0, y0) ≈ Fx(x0, y0)∆x+ Fy(x0, y0)∆y.
The left-hand side of this approximation represents the change in F when x0 and y0 are changed
by ∆x and ∆y respectively. We call this the increment in F and denote it by ∆F ; that is,
∆F = F (x0 +∆x, y0 +∆y)− F (x0, y0).
Hence
∆F ≈ Fx(x0, y0)(x− x0) + Fy(x0, y0)(y − y0).
This formula is called the total differential approximation to ∆F. By suppressing the point of
evaluation, the total differential approximation may be written as
\[ \Delta F \approx \frac{\partial F}{\partial x}\Delta x + \frac{\partial F}{\partial y}\Delta y. \]
The approximation improves when ∆x and ∆y are smaller.
The total differential approximation can be used to estimate the change in the output of a
function given changes to each of the inputs.
Example 1.4.1. The ideal gas law asserts that the pressure P , volume V and temperature T of
an ideal gas are related by the formula
PV = kT,
where k is a constant. If the temperature is increased by 4% and the volume is decreased by 5%,
estimate the percentage increase in pressure.
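Solution. We have
\[ P = \frac{kT}{V}, \qquad \frac{\partial P}{\partial T} = \frac{k}{V} \qquad\text{and}\qquad \frac{\partial P}{\partial V} = -\frac{kT}{V^2}. \]
Now the temperature is increased by 4%, so ∆T = 0.04T. Similarly, the volume is decreased by
5% and so ∆V = −0.05V. By the total differential approximation,
\begin{align*}
\Delta P &\approx \frac{\partial P}{\partial T}\Delta T + \frac{\partial P}{\partial V}\Delta V \\
&= \frac{k}{V} \times 0.04T - \frac{kT}{V^2} \times (-0.05V) \\
&= 0.04P + 0.05P \qquad \text{(since } P = kT/V\text{)} \\
&= 0.09P.
\end{align*}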
Hence the pressure increases by approximately 9%.
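To see how good the 9% estimate is, one can compare it with the exact change. The sketch below is illustrative only: the state T = 300, V = 2 and the constant k = 1 are arbitrary assumptions (the relative change does not depend on them).

```python
def pressure(T, V, k=1.0):
    return k * T / V

T, V = 300.0, 2.0                 # arbitrary illustrative state
P = pressure(T, V)
dT, dV = 0.04 * T, -0.05 * V      # +4% in temperature, -5% in volume

# total differential estimate with k = 1: dP ~ (1/V) dT - (T/V^2) dV
estimate = (1.0 / V) * dT - (T / V**2) * dV
exact = pressure(T + dT, V + dV) - P

print(estimate / P)   # relative change predicted by the approximation
print(exact / P)      # actual relative change, 1.04/0.95 - 1
```

The exact relative change is 1.04/0.95 − 1, a little under 9.5%, so the linear estimate slightly understates the increase here.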
In science, engineering, psychology, economics and so on, measurements are often made that are
not exact. Any quantities calculated from these measurements will also contain errors. The total
differential approximation can give us an idea how bad such errors can get. Given a function F of
two variables x and y, one can interpret ∆F as the error in the output given errors ∆x and ∆y in
the inputs. Typically, one does not know the precise value of ∆x and ∆y, but sometimes one can
find an upper bound for the absolute errors |∆x| and |∆y|. The total differential approximation
then gives an approximate upper bound for the absolute error |∆F | in F :
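\[ |\Delta F| \approx \left| \frac{\partial F}{\partial x}\Delta x + \frac{\partial F}{\partial y}\Delta y \right| \le \left| \frac{\partial F}{\partial x} \right| |\Delta x| + \left| \frac{\partial F}{\partial y} \right| |\Delta y|, \]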
(where we have used the triangle inequality in the last step). The next example illustrates these
ideas.
Example 1.4.2. The dimensions of a cylinder are measured to the nearest millimeter using a
measuring tape. The circumference is measured to be 22.0 cm and height is measured to be 15.0 cm.
Use these measurements to (a) estimate the volume of the cylinder, and (b) estimate an upper bound
for the percentage error in your answer to part (a).
Solution. (a) Let r, C and h denote the radius, circumference and height respectively. Then
C = 2πr and so
\[ V = \pi r^2 h = \pi \left( \frac{C}{2\pi} \right)^2 h = \frac{C^2 h}{4\pi}. \]
By using the measurements C = 22 and h = 15, one finds that
\[ V = \frac{1815}{\pi}. \]
And so the volume is estimated to be 1815/π cm³, which is approximately 577.73 cm³.
(b) The absolute error in each measurement is at most 0.5mm, which is 0.05 cm. Let ∆C and
∆h denote the error in each measurement. Then
|∆C| ≤ 0.05 and |∆h| ≤ 0.05.
The increment ∆V is the error in our calculation for the volume. Now
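\[ V = \frac{C^2 h}{4\pi}, \qquad \frac{\partial V}{\partial C} = \frac{Ch}{2\pi} \qquad\text{and}\qquad \frac{\partial V}{\partial h} = \frac{C^2}{4\pi}. \]
So the total differential approximation (when C = 22 and h = 15) is
\begin{align*}
\Delta V &\approx \frac{\partial V}{\partial C}\Delta C + \frac{\partial V}{\partial h}\Delta h \\
&= \frac{Ch}{2\pi}\Delta C + \frac{C^2}{4\pi}\Delta h \\
&= \frac{165}{\pi}\Delta C + \frac{121}{\pi}\Delta h.
\end{align*}
If we take absolute values of both sides then
\begin{align*}
|\Delta V| &\approx \left| \frac{165}{\pi}\Delta C + \frac{121}{\pi}\Delta h \right| \\
&\le \frac{165}{\pi}|\Delta C| + \frac{121}{\pi}|\Delta h| \qquad \text{(by the triangle inequality)} \\
&\le \frac{165}{\pi} \times 0.05 + \frac{121}{\pi} \times 0.05 \\
&= \frac{286}{20\pi}.
\end{align*}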
So an upper bound for the absolute error in V is approximately 286/(20π) (that is, approximately
4.55 cm³). An upper bound for the percentage error is given by
\[ \frac{\max |\Delta V|}{V} \times 100\% \approx \frac{286}{20\pi} \cdot \frac{\pi}{1815} \times 100\% = \frac{26}{33}\%. \]
Hence the percentage error is no more than about 0.79%.
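The arithmetic of Example 1.4.2 can be reproduced in a few lines. This is a sketch under the example's own assumptions (C = 22, h = 15, errors of at most 0.05 cm); the variable names and the brute-force comparison over the four extreme error combinations are illustrative additions.

```python
import math

def volume(C, h):
    return C**2 * h / (4 * math.pi)

C, h, tol = 22.0, 15.0, 0.05
V = volume(C, h)

# bound from the total differential: |V_C| * tol + |V_h| * tol
bound = (C * h / (2 * math.pi)) * tol + (C**2 / (4 * math.pi)) * tol

# actual error at the four extreme combinations of measurement error
worst = max(abs(volume(C + sc * tol, h + sh * tol) - V)
            for sc in (-1, 1) for sh in (-1, 1))

print(V, bound, 100 * bound / V, worst)
```

Because the bound comes from a linear approximation, the true worst-case error need not lie below it; in this instance it should come out marginally larger, which is why the notes describe the result as an approximate upper bound.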
1.5 Chain rules
If f and g are functions of one variable, then the derivative of f ◦ g may be calculated using the
chain rule for functions of one variable. In this section, we study compositions of functions of more
than one variable. To calculate their partial derivatives, we use a chain rule for functions of more
than one variable.
Suppose that F is a function of x and y and that x and y are each functions of t. A small
change ∆t in t produces a corresponding change ∆x and ∆y in x and y. These changes in turn
produce a corresponding change ∆F in F . By the total differential approximation,
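\[ \Delta F \approx \frac{\partial F}{\partial x}\Delta x + \frac{\partial F}{\partial y}\Delta y, \]
and this approximation gets better as ∆x and ∆y approach zero. If we divide through by ∆t then
\[ \frac{\Delta F}{\Delta t} \approx \frac{\partial F}{\partial x}\frac{\Delta x}{\Delta t} + \frac{\partial F}{\partial y}\frac{\Delta y}{\Delta t}. \]
As ∆t → 0,
\[ \frac{\Delta x}{\Delta t} = \frac{x(t+\Delta t) - x(t)}{\Delta t} \to \frac{dx}{dt} \qquad\text{and}\qquad \frac{\Delta y}{\Delta t} = \frac{y(t+\Delta t) - y(t)}{\Delta t} \to \frac{dy}{dt}. \]
Finally, if we view F as a function of t then, by a similar argument, ∆F/∆t → dF/dt. So in the limit,
the total differential approximation becomes
\[ \frac{dF}{dt} = \frac{\partial F}{\partial x}\frac{dx}{dt} + \frac{\partial F}{\partial y}\frac{dy}{dt}. \tag{1.4} \]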
This is an example of a chain rule for a function of two variables.
The above chain rule must be interpreted properly. First, each derivative must be evaluated at
a correct point. Second, the F appearing on the left-hand side is a function of one variable t, while
the F that appears on the right-hand side is a function of two variables x and y. Technically, these
are two different functions. The next theorem expresses chain rule (1.4) without these ambiguities.
Theorem 1.5.1. Suppose that F is a function of two variables and that x and y are both functions
of one variable. Define the function φ by φ(t) = F (x(t), y(t)) and the point (x0, y0) by (x0, y0) =
(x(t0), y(t0)). If x and y are both differentiable at t0 and the partial derivatives of F exist and are
continuous at (x0, y0), then φ is differentiable at t0 and
φ′(t0) = D1F (x0, y0)x′(t0) +D2F (x0, y0)y′(t0). (1.5)
Remark 1.5.2. Formulae (1.4) and (1.5) are equivalent. The former is easier to remember while
the latter is more precise. To remember the rule, consider the following chain diagram.
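[Chain diagram: arrows run from F to x and to y, labelled ∂F/∂x and ∂F/∂y respectively; arrows run from x and from y to t, labelled dx/dt and dy/dt respectively.]
To construct the diagram, draw an arrow from each function to each of its variables. Then dF/dt is
the sum of all paths (left to right) from F to t, where the derivatives are multiplied across each
path.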
Example 1.5.3. The potential energy E of a particle at point (x, y) is given by E(x, y) = sin(πx²y).
If the x-coordinate of the particle is increasing at a rate of 3 units per second, and the y-coordinate of
the particle is decreasing at a rate of 2 units per second, find the rate of change of potential energy
when the particle has coordinates (−1, 2).
Solution. Since the coordinates x and y of the particle change with time t, we may view x and y as
functions of t. We are told that
\[ \frac{dx}{dt} = 3 \qquad\text{and}\qquad \frac{dy}{dt} = -2. \]
By the chain rule,
\begin{align*}
\frac{dE}{dt} &= \frac{\partial E}{\partial x}\frac{dx}{dt} + \frac{\partial E}{\partial y}\frac{dy}{dt} \\
&= 2\pi x y \cos(\pi x^2 y) \times 3 + \pi x^2 \cos(\pi x^2 y) \times (-2) \\
&= 2\pi x (3y - x) \cos(\pi x^2 y).
\end{align*}
When (x, y) = (−1, 2),
\[ \frac{dE}{dt} = -14\pi. \]
So the rate of change of E at (−1, 2) is −14π.
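The answer −14π can be checked without the chain rule by differentiating along a concrete path. In the sketch below, the straight-line path x(t) = −1 + 3t, y(t) = 2 − 2t is a hypothetical choice made only because it passes through (−1, 2) at t = 0 with the stated velocities; the example itself does not specify the path.

```python
import math

def E(x, y):
    return math.sin(math.pi * x**2 * y)

def phi(t):
    # hypothetical path through (-1, 2) with dx/dt = 3 and dy/dt = -2
    return E(-1 + 3 * t, 2 - 2 * t)

h = 1e-6
numeric = (phi(h) - phi(-h)) / (2 * h)   # central-difference estimate of phi'(0)
print(numeric, -14 * math.pi)
```

Any other differentiable path with the same position and velocity at that instant should give the same derivative, which is what formula (1.5) asserts.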
We now examine the case when F is a function of x and y, where each of x and y is a function
of both s and t. This situation is sometimes written as
F = F (x, y), x = x(s, t) and y = y(s, t).
If we treat s as a constant and differentiate F with respect to t, then chain rule (1.4) gives
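\[ \frac{\partial F}{\partial t} = \frac{\partial F}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial t}. \]
Similarly, if we treat t as a constant and differentiate F with respect to s, then chain rule (1.4)
gives
\[ \frac{\partial F}{\partial s} = \frac{\partial F}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial s}. \]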
Each of these new chain rules may be remembered using the following chain diagram.
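[Chain diagram: arrows run from F to x and to y, labelled ∂F/∂x and ∂F/∂y; arrows run from each of x and y to each of t and s, labelled ∂x/∂t, ∂y/∂t, ∂x/∂s and ∂y/∂s.]
For example, to remember the rule for ∂F/∂s, simply sum all paths (left to right) from F to s, where
the derivatives are multiplied across each path.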
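Example 1.5.4. Suppose that z = F (x, y). Express the point (x, y) in terms of polar coordinates
(r, θ). Hence express ∂z/∂r and ∂z/∂θ in terms of x, y, Fx and Fy. Finally, show that the partial derivatives
satisfy the equation
\[ r\frac{\partial z}{\partial r} + \frac{\partial z}{\partial \theta} = (x - y)\frac{\partial F}{\partial x} + (x + y)\frac{\partial F}{\partial y}. \]
Solution. We have
\[ z = F(x, y), \qquad x = r\cos\theta, \qquad y = r\sin\theta \qquad\text{and}\qquad r^2 = x^2 + y^2. \]
So the chain rule gives
\begin{align*}
\frac{\partial z}{\partial r} &= \frac{\partial z}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial r} \\
&= F_x(x, y)\cos\theta + F_y(x, y)\sin\theta \\
&= \frac{x}{r}F_x(x, y) + \frac{y}{r}F_y(x, y) \\
&= \frac{x}{\sqrt{x^2 + y^2}}F_x(x, y) + \frac{y}{\sqrt{x^2 + y^2}}F_y(x, y)
\end{align*}
and
\begin{align*}
\frac{\partial z}{\partial \theta} &= \frac{\partial z}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial \theta} \\
&= -F_x(x, y)\, r\sin\theta + F_y(x, y)\, r\cos\theta \\
&= -y F_x(x, y) + x F_y(x, y).
\end{align*}
Finally,
\begin{align*}
r\frac{\partial z}{\partial r} + \frac{\partial z}{\partial \theta} &= r\left( \frac{x}{\sqrt{x^2 + y^2}}\frac{\partial F}{\partial x} + \frac{y}{\sqrt{x^2 + y^2}}\frac{\partial F}{\partial y} \right) + \left( -y\frac{\partial F}{\partial x} + x\frac{\partial F}{\partial y} \right) \\
&= x\frac{\partial F}{\partial x} + y\frac{\partial F}{\partial y} - y\frac{\partial F}{\partial x} + x\frac{\partial F}{\partial y} \\
&= (x - y)\frac{\partial F}{\partial x} + (x + y)\frac{\partial F}{\partial y},
\end{align*}
as required.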
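The identity of Example 1.5.4 holds for any suitably differentiable F, so it can be spot-checked with a concrete function. In the sketch below, the sample function F(x, y) = x²y + sin y and the point (r, θ) = (1.5, 0.7) are arbitrary assumptions chosen for illustration; the polar derivatives are estimated by central differences and compared with the right-hand side computed from the exact Fx and Fy.

```python
import math

def F(x, y):                      # sample function, chosen only for illustration
    return x**2 * y + math.sin(y)

def Fx(x, y):
    return 2 * x * y

def Fy(x, y):
    return x**2 + math.cos(y)

def z(r, theta):
    return F(r * math.cos(theta), r * math.sin(theta))

r, theta, h = 1.5, 0.7, 1e-6
z_r = (z(r + h, theta) - z(r - h, theta)) / (2 * h)
z_theta = (z(r, theta + h) - z(r, theta - h)) / (2 * h)

x, y = r * math.cos(theta), r * math.sin(theta)
lhs = r * z_r + z_theta
rhs = (x - y) * Fx(x, y) + (x + y) * Fy(x, y)
print(lhs, rhs)
```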
We present one more useful version of the chain rule. Suppose that F is a function of u and
that u is a function of both x and y. This is sometimes written as
F = F (u) and u = u(x, y).
The corresponding chain rules are
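\[ \frac{\partial F}{\partial x} = \frac{dF}{du}\frac{\partial u}{\partial x} \qquad\text{and}\qquad \frac{\partial F}{\partial y} = \frac{dF}{du}\frac{\partial u}{\partial y}. \]
The chain diagram corresponding to this situation is illustrated below.
[Chain diagram: an arrow runs from F to u, labelled dF/du; arrows run from u to x and to y, labelled ∂u/∂x and ∂u/∂y.]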
These chain rules may be easily written down after sketching the chain diagram.
1.6 Functions of more than two variables
Until now we have only discussed functions of two variables. In this section, the ideas met in this
chapter are generalised to functions of three variables. We present a summary only.
Suppose that F is a function of three variables x, y and z. The partial derivatives of F are
defined by
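\begin{align*}
F_x(x, y, z) &= \lim_{h \to 0} \frac{F(x + h, y, z) - F(x, y, z)}{h} \\
F_y(x, y, z) &= \lim_{h \to 0} \frac{F(x, y + h, z) - F(x, y, z)}{h} \\
F_z(x, y, z) &= \lim_{h \to 0} \frac{F(x, y, z + h) - F(x, y, z)}{h}
\end{align*}
wherever these limits exist. Equivalent notation for each of these partial derivatives is given below:
\[ F_x = \frac{\partial F}{\partial x} = D_1 F, \qquad F_y = \frac{\partial F}{\partial y} = D_2 F \qquad\text{and}\qquad F_z = \frac{\partial F}{\partial z} = D_3 F. \]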
If (a, b, c) is a point in R3 then Fx(a, b, c) is the rate of change of F in the x-direction at (a, b, c).
Similarly, Fz(a, b, c) is the rate of change of F in the z-direction at (a, b, c).
The partial derivatives of F are calculated by differentiating F with respect to one variable and
treating the other variables as constants. For example, if
\[ F(x, y, z) = e^{2x} z \cos y \]
then
\[ F_x(x, y, z) = 2e^{2x} z \cos y, \qquad F_y(x, y, z) = -e^{2x} z \sin y \qquad\text{and}\qquad F_z(x, y, z) = e^{2x} \cos y. \]
A ‘hyper-surface’ in R4 is the natural generalisation of a surface in R3. A function F of three
variables can be used to define a hyper-surface w = F (x, y, z) in R4. Given a point (x0, y0, z0, w0)
on the surface, the equation of the ‘tangent plane’ to the surface at this point is given by
w = w0 + Fx(x0, y0, z0)(x− x0) + Fy(x0, y0, z0)(y − y0) + Fz(x0, y0, z0)(z − z0)
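(assuming, of course, that a ‘tangent plane’ to the surface exists at this point).
The total differential approximation to ∆F is given by
\[ \Delta F \approx \frac{\partial F}{\partial x}\Delta x + \frac{\partial F}{\partial y}\Delta y + \frac{\partial F}{\partial z}\Delta z. \]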
The chain rules are easily written down using chain diagrams. For example, suppose that F is
a function of x, y and z and that x, y and z are each functions of both u and v. The corresponding
chain diagram is shown below.
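[Chain diagram: arrows run from F to each of x, y and z, and from each of x, y and z to each of u and v.]
So the chain rule for ∂F/∂u is given by
\[ \frac{\partial F}{\partial u} = \frac{\partial F}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial u} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial u}, \]
while the chain rule for ∂F/∂v is given by
\[ \frac{\partial F}{\partial v} = \frac{\partial F}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial v} + \frac{\partial F}{\partial z}\frac{\partial z}{\partial v}. \]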
The generalisation of each of these formulae to a function of four (or more) variables should be
obvious.
1.7 Maple notes
The plot3d command is useful for visualizing the graphs of functions of several variables.
The MAPLE diff command carries out partial differentiation: diff(f(x,y), x); computes ∂f/∂x,
and diff(f(x,y), x,y); calculates ∂²f/∂y∂x. For example,
> diff(x^3*y-sin(y^2),x);
3x²y
> diff(x^3*y-sin(y^2),y$2);
4 sin(y²)y² − 2 cos(y²)
Problems for Chapter 1
Questions marked with [R] are routine, with [H] are harder and with [X] are for MATH1241 only.
You should make sure that you can do the easier questions before you tackle the more difficult
questions.
Problems 1.1 : Sketching simple surfaces in R3
1. [R] For each of the following surfaces, sketch some level curves and sketch the yz-profile
(which is found by intersecting the surface with the plane x = 0). Hence sketch the surface.
a) z = x² + y² b) x² + y² + z² = 1
c) z² = x² + y² − 1 d) z² = x² + y²
e) [H] z = x² − y²
Problems 1.2 : Partial differentiation
2. [R] Given that z = e^{x²y}, find ∂z/∂x, ∂z/∂y and ∂²z/∂y∂x.
3. [R] In each case, find all first and second order partial derivatives and verify that
\[ \frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial y\,\partial x}. \]
a) z = x²y + y² b) z = tan⁻¹(y/x) c) z = sin(x − cy)
Problems 1.3 : Tangent planes to surfaces
4. [R] Find a normal vector n and the equation of the tangent plane to the surface S at the
point x0.
a) S : z = x² + y², x0 = (3, 5, 34).
b) S : z = 4x²y, x0 = (2, −1, −16).
c) S : z = ln(x² + 3y²), x0 = (2, −1, ln 7).
d) S : z² + x² + y² = 1, x0 = (1/3, 1/2, √23/6).
Problems 1.4 : The total differential approximation
5. [R] The volume V of a football in the shape of an ellipsoid of revolution with semi-axes
of length a, b and b is given by
\[ V = \frac{4}{3}\pi a b^2. \]
The values of a and b are measured to be 12.0 cm and 7.0 cm respectively, each to the
nearest millimetre.
a) Use these measurements to calculate the volume of the football.
b) Use the total differential to estimate the maximum absolute error in the calculated
value of V .
c) Hence estimate the percentage error in your answer of (a).
6. [R] Suppose that z = (x + 1)/(y² + 1). The measured values of x and y are 3 and 1 respectively and
each of the measurements is made with an error whose absolute value is at most 0.02. Use
the total differential approximation of z to estimate the maximum error in the calculated
value of z.
7. [R] Use the total differential approximation of f(x, y) = √(x² + y²) to estimate
√(2.98² + 4.03²).
8. [R] The specific gravity S of a solid is given by
\[ S = \frac{A}{A - W}, \]
where A and W are its weights in air and water respectively.
a) Find ∂S/∂A and ∂S/∂W.
b) If A and W are measured to be 15.1 gm and 5.1 gm respectively, and if each of these
measurements is made with an error whose absolute value is at most 0.2 gm, then
use the total differential approximation of S to estimate the maximum error in the
calculated value of S.
9. [R] The specific volume v of a compressible fluid flowing through a section of area A with
mean velocity V is given by
v = kAV
where k is a constant. If v decreases by 5% and A increases by 4%, then estimate the
percentage change in V .
10. [R] A triangle has two sides of length a and b with an included angle measuring π/3
radians. Given that a increases by 5%, b decreases by 6% and the included angle increases
by 2%, estimate the percentage increase of area of the triangle.
Problems 1.5 : Chain rules
11. [R] Use a chain rule to calculate dw/dt (as a function of t) when
a) w = xy, x = e^t, y = t²;
b) w = x² + y² + z², x = cos t, y = sin t, z = t.
12. [R] A cylindrical metallic solid is expanding under heat in such a way that its height is
increasing at the rate of 0.1 cm/sec and its radius is increasing at the rate of 0.05 cm/sec.
Find the rate of increase of its volume at the instant when the height is 10 cm and the
radius is 5 cm.
13. [H] Suppose that f is a differentiable function of a single variable and F (x, y) is defined
by F (x, y) = f(x² − y).
a) Show that F satisfies the partial differential equation
\[ \frac{\partial F}{\partial x} + 2x\frac{\partial F}{\partial y} = 0. \]
b) Given that F (0, y) = sin y for all y, find a formula for F (x, y).
14. [H] Consider the differential equation
\[ \frac{\partial^2 u}{\partial t^2} - 16\frac{\partial^2 u}{\partial x^2} = 0. \]
(This is an example of the one dimensional wave equation, which can be used to model,
for example, the displacement u(x, t) of a particle at position x along a vibrating guitar
string at time t.)
a) Suppose that g is an arbitrary twice-differentiable function of one variable and that
u(x, t) = g(x+ λt), where λ is a constant. Calculate uxx and utt.
b) Given that u(x, t) = g(x+λt), find all values of λ such that u satisfies the differential
equation.
15. [X] A function f of two variables is said to be homogeneous of degree n if
\[ f(tx, ty) = t^n f(x, y) \]
whenever t > 0. Show that such a function f satisfies the equation
\[ x\frac{\partial f}{\partial x} + y\frac{\partial f}{\partial y} = nf. \]
16. [X]
a) Suppose that a and b are real numbers. Consider the partial differential equation
\[ a\frac{\partial u}{\partial x} + b\frac{\partial u}{\partial y} = 0. \tag{1} \]
Use the chain rule to show that for a suitable choice of constants α and β, every
function of the form u(x, y) = f(αx + βy), where f is differentiable, satisfies (1).
b) Generalise this remark to find solutions u : Rⁿ → R to the differential equation
\[ a_1\frac{\partial u}{\partial x_1} + a_2\frac{\partial u}{\partial x_2} + \cdots + a_n\frac{\partial u}{\partial x_n} = 0. \]
17. [X] A function u of two variables is defined implicitly by
\[ u(x, t) = f\bigl(x - t\,u(x, t)\bigr), \]
where f is a given bounded, differentiable function of one variable, f : R → R.
a) Calculate ∂u/∂t and ∂u/∂x.
b) Show that
\[ \frac{\partial u}{\partial t} + u\frac{\partial u}{\partial x} = 0. \]
c) Given that f(s) = 1 − tanh s, find the smallest positive number tm such that ∂u/∂x is
undefined for precisely one value of x.
d) Sketch u(x, t) as a function of x for several fixed values of t (taking t to be positive),
and hence interpret your result in (c). (Maple would be useful for the plots.)
e) For fixed t > tm is u(x, t) a function of x?
f) Generalise these results and find a way to predict tm for arbitrary differentiable,
bounded, monotonic decreasing functions f .
18. [H] A point sits on the hyperboloid
\[ F(x, y, z) = x^2 + y^2 - z^2 = 1. \]
The position of the point moves with respect to time as x(t) = z(t) = t. The parameterisation
of y is unspecified aside from y(t) ≠ 0.
a) Find the parameterisation of the y-coordinate and then find dy/dt.
b) Calculate a normal to the hyperboloid at the point (x, y, z) by calculating the vector
\[ \nabla F = \left( \frac{\partial F}{\partial x}, \frac{\partial F}{\partial y}, \frac{\partial F}{\partial z} \right)^T. \]
c) The chain rule states that
\[ \frac{dF}{dt} = (\nabla F) \cdot \left( \frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt} \right)^T. \]
Deduce that (∇F) · (dx/dt, dy/dt, dz/dt)ᵀ = 0 and interpret this equation geometrically. Use
this equation to find dy/dt (without first finding y as a function of t as in part a).
Chapter 2
Integration techniques
Many real world problems, such as
• calculating the area of a region,
• locating the centre of a region,
• calculating the volume, surface area and centre of mass of a solid,
• calculating the length of a curve,
• determining the probability that a certain event occurs,
• analysing the harmonics of a musical instrument,
• determining the solution to models of various physical phenomena, and
• calculating the work done by a force,
boil down to evaluating an appropriate integral. Some of these applications were explored in last
semester’s course while others will be discussed later in this course or in second year. But what
these applications demand is mastery of integration. In this chapter we work towards that goal by
examining techniques for integrating various types of integrals that arise when solving real world
problems. The hard work done here will pay off when applications of such integrals are studied.
2.1 Trigonometric integrals
(Ref: SH10 §8.3)
In this section we focus specifically on integrals involving the trigonometric functions.
2.1.1 Integrating powers of sine and cosine
The first class of trigonometric integrals considered consists of integrals of the form
\[ \int \cos^m x \sin^n x \, dx, \tag{2.1} \]
where m and n are non-negative integers. There are essentially two cases: (i) either m or n (or
both) are odd; or (ii) both m and n are even. We’ll begin with the first case.
Case (i). Suppose that m is odd in (2.1). Then we use the substitution u = sin x along with
the identity
\[ \sin^2 x + \cos^2 x = 1 \]
to evaluate the integral.
Example 2.1.1. Evaluate the integral
\[ \int \cos^3 x \sin^4 x \, dx. \]
Solution. The substitution
\[ u = \sin x, \qquad du = \cos x \, dx \]
yields
\begin{align*}
\int \cos^3 x \sin^4 x \, dx &= \int \cos^2 x \sin^4 x \cos x \, dx \\
&= \int (1 - \sin^2 x) \sin^4 x \cos x \, dx \\
&= \int (1 - u^2) u^4 \, du \\
&= \int (u^4 - u^6) \, du \\
&= \frac{u^5}{5} - \frac{u^7}{7} + C \\
&= \frac{\sin^5 x}{5} - \frac{\sin^7 x}{7} + C.
\end{align*}
If n is odd in (2.1) then we use the substitution u = cos x and follow the same strategy. (If
both m and n are odd, then either of the substitutions u = sinx or u = cos x will work.)
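Example 2.1.2. Evaluate the integral
\[ \int \cos^6 x \sin^5 x \, dx. \]
Solution. This time we use the substitution
\[ u = \cos x, \qquad du = -\sin x \, dx \]
to evaluate the integral:
\begin{align*}
\int \cos^6 x \sin^5 x \, dx &= -\int \cos^6 x (\sin^2 x)^2 (-\sin x) \, dx \\
&= -\int \cos^6 x (1 - \cos^2 x)^2 (-\sin x) \, dx \\
&= -\int u^6 (1 - u^2)^2 \, du \\
&= -\int (u^6 - 2u^8 + u^{10}) \, du \\
&= -\left( \frac{u^7}{7} - \frac{2u^9}{9} + \frac{u^{11}}{11} \right) + C \\
&= -\frac{\cos^7 x}{7} + \frac{2\cos^9 x}{9} - \frac{\cos^{11} x}{11} + C.
\end{align*}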
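An antiderivative can always be checked by differentiating it. The sketch below (illustrative only; the function names `A1`, `A2`, `derivative` and the sample point x = 0.8 are assumptions) differentiates the answers to Examples 2.1.1 and 2.1.2 numerically and compares them with the original integrands.

```python
import math

def A1(x):  # antiderivative found in Example 2.1.1
    return math.sin(x)**5 / 5 - math.sin(x)**7 / 7

def A2(x):  # antiderivative found in Example 2.1.2
    return -math.cos(x)**7 / 7 + 2 * math.cos(x)**9 / 9 - math.cos(x)**11 / 11

def derivative(f, x, h=1e-5):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

x = 0.8  # arbitrary sample point
err1 = abs(derivative(A1, x) - math.cos(x)**3 * math.sin(x)**4)
err2 = abs(derivative(A2, x) - math.cos(x)**6 * math.sin(x)**5)
print(err1, err2)
```

Both discrepancies should be at the level of the finite-difference error, far smaller than the integrand values themselves.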
Case (ii). The case where both m and n are even in (2.1) requires an entirely different approach.
This time we use the identities
\[ \cos^2 x = \frac{1 + \cos 2x}{2} \qquad\text{and}\qquad \sin^2 x = \frac{1 - \cos 2x}{2} \tag{2.2} \]
to change integral (2.1) into a sum of integrals of the form
\[ \int \cos^k(2x) \, dx. \]
We then repeat the methods of Case (i) or Case (ii) until each integral in the sum is easy to
compute.
Example 2.1.3. Evaluate
\[ \int \sin^2 x \, dx. \]
Solution. The second identity in (2.2) gives
\begin{align*}
\int \sin^2 x \, dx &= \frac{1}{2} \int (1 - \cos 2x) \, dx \\
&= \frac{1}{2} \left( x - \frac{\sin 2x}{2} \right) + C.
\end{align*}
The next example is much harder.
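Example 2.1.4. Evaluate
\[ \int \sin^2 x \cos^4 x \, dx. \]
Solution. The identities (2.2) give
\begin{align*}
\int \sin^2 x \cos^4 x \, dx &= \int \left( \frac{1 - \cos 2x}{2} \right) \left( \frac{1 + \cos 2x}{2} \right)^2 dx \\
&= \frac{1}{8} \int (1 - \cos 2x)(1 + \cos 2x)^2 \, dx \\
&= \frac{1}{8} \int (1 + \cos 2x - \cos^2 2x - \cos^3 2x) \, dx \\
&= \frac{x}{8} + \frac{\sin 2x}{16} - \frac{1}{8} \int \cos^2 2x \, dx - \frac{1}{8} \int \cos^3 2x \, dx. \tag{2.3}
\end{align*}
The first integrand of (2.3) is an even power of cos 2x and is evaluated using the first identity in
(2.2):
\begin{align*}
\int \cos^2 2x \, dx &= \frac{1}{2} \int (1 + \cos 4x) \, dx \\
&= \frac{x}{2} + \frac{\sin 4x}{8} + C_1.
\end{align*}
The second integrand of (2.3) is an odd power of cos 2x. The substitution
\[ u = \sin 2x, \qquad du = 2\cos 2x \, dx \]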
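gives
\begin{align*}
\int \cos^3 2x \, dx &= \frac{1}{2} \int (1 - \sin^2 2x) \, 2\cos 2x \, dx \\
&= \frac{1}{2} \int (1 - u^2) \, du \\
&= \frac{u}{2} - \frac{u^3}{6} + C_2 \\
&= \frac{\sin 2x}{2} - \frac{\sin^3 2x}{6} + C_2.
\end{align*}
Following from (2.3) we obtain
\begin{align*}
\int \sin^2 x \cos^4 x \, dx &= \frac{x}{8} + \frac{\sin 2x}{16} - \frac{x}{16} - \frac{\sin 4x}{64} - \frac{\sin 2x}{16} + \frac{\sin^3 2x}{48} + C \\
&= \frac{x}{16} - \frac{\sin 4x}{64} + \frac{\sin^3 2x}{48} + C.
\end{align*}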
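With this many steps it is easy to drop a factor, so a numerical cross-check is worthwhile. The following sketch compares the closed form from Example 2.1.4 with a composite Simpson's rule estimate on [0, 1]; the interval, the number of subintervals and the helper name `simpson` are arbitrary choices made for illustration.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def antiderivative(x):
    # result of Example 2.1.4 (constant of integration omitted)
    return x / 16 - math.sin(4 * x) / 64 + math.sin(2 * x)**3 / 48

numeric = simpson(lambda x: math.sin(x)**2 * math.cos(x)**4, 0.0, 1.0)
closed_form = antiderivative(1.0) - antiderivative(0.0)
print(numeric, closed_form)
```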
2.1.2 Integrating multiple angles of sine and cosine
The next class of trigonometric integrals consists of integrals of the form
\[ \int \cos mx \sin nx \, dx, \qquad \int \cos mx \cos nx \, dx \qquad\text{or}\qquad \int \sin mx \sin nx \, dx, \tag{2.4} \]
where m and n are real numbers. These have many applications, including analysis of waves,
musical harmonics and distribution of heat in solids. Such applications are discussed in some
second year courses.
To evaluate the integrals in (2.4) we need the following trigonometric identities.
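Lemma 2.1.5. Suppose that A and B are real numbers. Then
\begin{align*}
\sin A \cos B &= \tfrac{1}{2}\bigl( \sin(A + B) + \sin(A - B) \bigr) \tag{2.5} \\
\cos A \cos B &= \tfrac{1}{2}\bigl( \cos(A - B) + \cos(A + B) \bigr) \tag{2.6} \\
\sin A \sin B &= \tfrac{1}{2}\bigl( \cos(A - B) - \cos(A + B) \bigr) \tag{2.7}
\end{align*}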
Proof. We only prove the first identity; the other proofs are similar. We begin with the sum and
difference formulae
sin(A+B) = sinA cosB + cosA sinB
sin(A−B) = sinA cosB − cosA sinB.
Adding the two identities gives
sin(A+B) + sin(A−B) = 2 sinA cosB,
whereupon dividing by 2 establishes (2.5).
Example 2.1.6. Evaluate
\[ \int \cos 5x \cos 3x \, dx. \]
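Solution. Identity (2.6) implies that
\begin{align*}
\int \cos 5x \cos 3x \, dx &= \frac{1}{2} \int \bigl( \cos(5x - 3x) + \cos(5x + 3x) \bigr) \, dx \\
&= \frac{1}{2} \int \bigl( \cos(2x) + \cos(8x) \bigr) \, dx \\
&= \frac{\sin 2x}{4} + \frac{\sin 8x}{16} + C.
\end{align*}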
2.1.3 Integrating powers of tan and sec
Students will not be expected to evaluate difficult integrals involving powers of tan and sec. However,
it is expected that they will be able to use the facts that
\[ \tan^2 x + 1 = \sec^2 x, \qquad \frac{d}{dx}\tan x = \sec^2 x \qquad\text{and}\qquad \frac{d}{dx}\sec x = \tan x \sec x \]
to find suitable substitutions or strategies.
Example 2.1.7. Evaluate
\[ \int \tan^2 x \, dx. \]
Solution. The Pythagorean identity gives
\begin{align*}
\int \tan^2 x \, dx &= \int (\sec^2 x - 1) \, dx \\
&= \tan x - x + C.
\end{align*}
Example 2.1.8. Evaluate
\[ \int \sec^4 x \tan x \, dx. \]
Solution. The substitution
\[ u = \sec x, \qquad du = \sec x \tan x \, dx \]
yields
\begin{align*}
\int \sec^4 x \tan x \, dx &= \int u^3 \, du \\
&= \frac{\sec^4 x}{4} + C.
\end{align*}
(Of course, one can always bypass the substitution and integrate by inspection.)
2.2 Reduction formulae
We begin with an example.
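Example 2.2.1. Suppose that In is defined by
\[ I_n = \int_0^{\pi/4} \tan^n x \, dx \]
whenever n ≥ 0. Show that
\[ I_n = \frac{1}{n - 1} - I_{n-2} \qquad \forall n \ge 2. \tag{2.8} \]
Hence evaluate
\[ \int_0^{\pi/4} \tan^6 x \, dx. \]
Solution. By the identity tan²x = sec²x − 1,
\begin{align*}
I_n &= \int_0^{\pi/4} \tan^{n-2} x \tan^2 x \, dx \\
&= \int_0^{\pi/4} \tan^{n-2} x (\sec^2 x - 1) \, dx \\
&= \int_0^{\pi/4} \tan^{n-2} x \sec^2 x \, dx - \int_0^{\pi/4} \tan^{n-2} x \, dx \\
&= \left[ \frac{\tan^{n-1} x}{n - 1} \right]_0^{\pi/4} - I_{n-2} \\
&= \frac{1}{n - 1} - I_{n-2}
\end{align*}
as required.
Using (2.8), we see that
\begin{align*}
\int_0^{\pi/4} \tan^6 x \, dx &= I_6 \\
&= \frac{1}{5} - I_4 \\
&= \frac{1}{5} - \left( \frac{1}{3} - I_2 \right) \\
&= \frac{1}{5} - \frac{1}{3} + \left( \frac{1}{1} - I_0 \right) \\
&= \frac{1}{5} - \frac{1}{3} + \frac{1}{1} - \int_0^{\pi/4} dx \\
&= \frac{1}{5} - \frac{1}{3} + \frac{1}{1} - \frac{\pi}{4} \\
&= \frac{13}{15} - \frac{\pi}{4}.
\end{align*}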
(Note that I0 must be evaluated directly, since formula (2.8) is only valid when n ≥ 2.)
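A reduction formula translates directly into a recursive function. The sketch below implements (2.8) for even n only, with I0 = π/4 as the base case, and compares I6 with a midpoint-rule estimate of the integral; the function name `I`, the restriction to even n and the number of sample points are illustrative assumptions.

```python
import math

def I(n):
    """Integral of tan^n x over [0, pi/4] for even n >= 0, using formula (2.8)."""
    if n % 2 or n < 0:
        raise ValueError("this sketch only handles even n >= 0")
    if n == 0:
        return math.pi / 4          # base case, evaluated directly
    return 1 / (n - 1) - I(n - 2)

# independent estimate of I(6) by the midpoint rule
N = 100_000
h = (math.pi / 4) / N
numeric = h * sum(math.tan((i + 0.5) * h)**6 for i in range(N))

print(I(6), 13 / 15 - math.pi / 4, numeric)
```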
Formula (2.8) is an example of a reduction formula, since it expresses an integral in terms of
a ‘smaller’ integral of the same type. As illustrated above, once a reduction formula is known,
integrals of that type may be evaluated rapidly. Although not the case with the previous example,
most reduction formulae are proved using integration by parts.
Example 2.2.2. Suppose that
\[ I_n = \int \sin^n x \, dx \]
whenever n ≥ 0. Show that
\[ I_n = -\frac{\sin^{n-1} x \cos x}{n} + \frac{n - 1}{n} I_{n-2} \qquad \forall n \ge 2. \]
Solution. If we apply integration by parts with
\[ u = \sin^{n-1} x, \qquad v = -\cos x, \qquad u' = (n - 1)\sin^{n-2} x \cos x, \qquad v' = \sin x \]
then
\begin{align*}
I_n &= \int \sin^{n-1} x \sin x \, dx \\
&= -\sin^{n-1} x \cos x + (n - 1) \int \sin^{n-2} x \cos x \cos x \, dx \\
&= -\sin^{n-1} x \cos x + (n - 1) \int \sin^{n-2} x (1 - \sin^2 x) \, dx \\
&= -\sin^{n-1} x \cos x + (n - 1) \int \sin^{n-2} x \, dx - (n - 1) \int \sin^n x \, dx \\
&= -\sin^{n-1} x \cos x + (n - 1) I_{n-2} - (n - 1) I_n.
\end{align*}
If we gather the In terms to the left-hand side then
\[ n I_n = -\sin^{n-1} x \cos x + (n - 1) I_{n-2}. \]
Dividing both sides by n gives the result.
In the final example, the reduction formula has two parameters (m and n) instead of one.
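Example 2.2.3. Suppose that
\[ I_{m,n} = \int_0^{\pi/2} \cos^m x \sin^n x \, dx \tag{2.9} \]
whenever m and n are nonnegative integers.
(a) [X] Show that
\[ I_{m,n} = \begin{cases} \left( \dfrac{m - 1}{m + n} \right) I_{m-2,n} & \text{provided that } m \ge 2 \\[2ex] \left( \dfrac{n - 1}{m + n} \right) I_{m,n-2} & \text{provided that } n \ge 2. \end{cases} \tag{2.10} \]
(b) [R] Using the result of (a), evaluate
\[ \int_0^{\pi/2} \cos^4 x \sin^6 x \, dx. \]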
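Solution. (a) Integration by parts with
\[ u = \cos^{m-1} x, \qquad v = \frac{\sin^{n+1} x}{n + 1}, \qquad u' = -(m - 1)\cos^{m-2} x \sin x, \qquad v' = \sin^n x \cos x \]
gives
\begin{align*}
I_{m,n} &= \left[ \frac{\cos^{m-1} x \sin^{n+1} x}{n + 1} \right]_0^{\pi/2} + \frac{m - 1}{n + 1} \int_0^{\pi/2} \cos^{m-2} x \sin^{n+2} x \, dx \\
&= \frac{m - 1}{n + 1} \int_0^{\pi/2} \sin^n x \cos^{m-2} x (1 - \cos^2 x) \, dx \\
&= \frac{m - 1}{n + 1} I_{m-2,n} - \frac{m - 1}{n + 1} I_{m,n}.
\end{align*}
By bringing the Im,n terms to the left-hand side and rearranging, the first formula is obtained. The
second formula is proved similarly.
(b) The first formula in (2.10) allows us to reduce the first parameter:
\begin{align*}
\int_0^{\pi/2} \cos^4 x \sin^6 x \, dx &= I_{4,6} \\
&= \frac{3}{10} I_{2,6} \\
&= \frac{3}{10} \cdot \frac{1}{8} I_{0,6}.
\end{align*}
The second formula in (2.10) allows us to reduce the second parameter:
\begin{align*}
\int_0^{\pi/2} \cos^4 x \sin^6 x \, dx &= \frac{3}{10} \cdot \frac{1}{8} I_{0,6} \\
&= \frac{3}{10} \cdot \frac{1}{8} \cdot \frac{5}{6} I_{0,4} \\
&= \frac{3}{10} \cdot \frac{1}{8} \cdot \frac{5}{6} \cdot \frac{3}{4} I_{0,2} \\
&= \frac{3}{10} \cdot \frac{1}{8} \cdot \frac{5}{6} \cdot \frac{3}{4} \cdot \frac{1}{2} I_{0,0}.
\end{align*}
Finally, we note from (2.9) that I0,0 = π/2. Hence
\[ \int_0^{\pi/2} \cos^4 x \sin^6 x \, dx = \frac{3}{10} \cdot \frac{1}{8} \cdot \frac{5}{6} \cdot \frac{3}{4} \cdot \frac{1}{2} \cdot \frac{\pi}{2} = \frac{3\pi}{512}, \]
completing the problem.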
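The repeated use of (2.10) in part (b) is mechanical and can be automated with exact rational arithmetic. The sketch below handles only the case where m and n are both even, so that the reduction ends at I0,0 = π/2; the function name `coefficient` and this restriction are assumptions of the sketch rather than part of the notes.

```python
from fractions import Fraction

def coefficient(m, n):
    """Rational c with I_{m,n} = c * I_{0,0}, for even m, n >= 0, via (2.10)."""
    if m % 2 or n % 2 or m < 0 or n < 0:
        raise ValueError("this sketch only handles even, nonnegative m and n")
    c = Fraction(1)
    while m >= 2:                       # first formula: reduce m by 2
        c *= Fraction(m - 1, m + n)
        m -= 2
    while n >= 2:                       # second formula: reduce n by 2
        c *= Fraction(n - 1, m + n)
        n -= 2
    return c

print(coefficient(4, 6))   # multiply by pi/2 to obtain I_{4,6}
```

Multiplying the returned fraction by π/2 should reproduce the value 3π/512 found above.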
2.2.1 [X] Application: the irrationality of π
In this subsection, we use a reduction formula to prove that π is an irrational number. Students
studying MATH1231 may want to skip to the next section if this does not interest them.
Although the number π has been studied for over 2000 years, it was only in the 1760s that it was
shown (by Johann Heinrich Lambert) that π is an irrational number. The proof we give is simpler
than Lambert’s proof and is similar to a proof discovered in the twentieth century. The main
idea (as with many other irrationality proofs) is to assume that π is a rational number and find a
contradiction. In our proof, a contradiction arises by showing that a certain definite integral, which
is known to lie in the interval (0, 1), is an integer if one assumes that π is a rational number.
Suppose that q is a positive integer. For each natural number n, define the integral In by
\[ I_n = \frac{q^{2n}}{n!} \int_{-\pi/2}^{\pi/2} \left( \frac{\pi^2}{4} - x^2 \right)^n \cos x \, dx. \tag{2.11} \]
Lemma 2.2.4. Suppose that q is a positive integer and In is defined as above. If n ≥ 2 then
\[ I_n = (4n - 2) q^2 I_{n-1} - q^4 \pi^2 I_{n-2}. \tag{2.12} \]
Moreover,
\[ I_1 = 4q^2 \qquad\text{and}\qquad I_0 = 2. \]
Proof. We give an outline proof only; students should be able to fill in the details. Suppose that
n ≥ 2. Integration by parts with
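\[ u = \left( \frac{\pi^2}{4} - x^2 \right)^n, \qquad v = \sin x, \qquad u' = -2nx \left( \frac{\pi^2}{4} - x^2 \right)^{n-1}, \qquad v' = \cos x \]
yields
\[ I_n = \frac{2n q^{2n}}{n!} \int_{-\pi/2}^{\pi/2} x \left( \frac{\pi^2}{4} - x^2 \right)^{n-1} \sin x \, dx. \]
A second application of integration by parts with
\[ u = x \left( \frac{\pi^2}{4} - x^2 \right)^{n-1}, \qquad v' = \sin x \]
gives
\begin{align*}
I_n &= \frac{2q^{2n}}{(n - 1)!} \int_{-\pi/2}^{\pi/2} \left( \frac{\pi^2}{4} - x^2 \right)^{n-1} \cos x \, dx - \frac{4q^{2n}}{(n - 2)!} \int_{-\pi/2}^{\pi/2} x^2 \left( \frac{\pi^2}{4} - x^2 \right)^{n-2} \cos x \, dx \\
&= 2q^2 I_{n-1} - \frac{4q^{2n}}{(n - 2)!} \int_{-\pi/2}^{\pi/2} x^2 \left( \frac{\pi^2}{4} - x^2 \right)^{n-2} \cos x \, dx.
\end{align*}
In the right-most integrand, write x² as π²/4 − (π²/4 − x²). Hence
\begin{align*}
I_n &= 2q^2 I_{n-1} - \frac{4q^{2n}}{(n - 2)!} \left( \frac{\pi^2}{4} \int_{-\pi/2}^{\pi/2} \left( \frac{\pi^2}{4} - x^2 \right)^{n-2} \cos x \, dx - \int_{-\pi/2}^{\pi/2} \left( \frac{\pi^2}{4} - x^2 \right)^{n-1} \cos x \, dx \right) \\
&= 2q^2 I_{n-1} - q^4 \pi^2 I_{n-2} + 4(n - 1) q^2 I_{n-1}.
\end{align*}
If we gather both In−1 terms then we obtain (2.12) as required. The proof that I1 = 4q² and I0 = 2
is straightforward and is left as an exercise.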
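Since the proof above is only an outline, a numerical spot-check of Lemma 2.2.4 may be reassuring. The sketch below estimates In from (2.11) by Simpson's rule and compares both sides of (2.12) for a few values of n; the choice q = 1, the number of Simpson steps and the function name `I` are arbitrary assumptions, and agreement for a handful of cases is of course not a proof.

```python
import math

def I(n, q=1, steps=2000):
    """Simpson's rule estimate of I_n from (2.11); q = 1 is an arbitrary test value."""
    a, b = -math.pi / 2, math.pi / 2
    h = (b - a) / steps

    def f(x):
        return (math.pi**2 / 4 - x**2)**n * math.cos(x)

    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return q**(2 * n) / math.factorial(n) * total * h / 3

q = 1
for n in range(2, 5):
    lhs = I(n, q)
    rhs = (4 * n - 2) * q**2 * I(n - 1, q) - q**4 * math.pi**2 * I(n - 2, q)
    print(n, lhs, rhs)
```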
The next lemma will be used to show that $0 < I_n < 1$ for sufficiently large n.
Lemma 2.2.5. If $a > 0$ then $\displaystyle\lim_{n\to\infty} \frac{a^n}{n!} = 0$.
Proof. Take N to be any integer greater than 2a. If $n > N$ then
\[ \frac{a^n}{n!} = \frac{a^N}{N!} \cdot \frac{a^{n-N}}{(N+1)(N+2)\cdots n} = \frac{a^N}{N!} \cdot \frac{a}{N+1} \cdot \frac{a}{N+2} \cdots \frac{a}{n}. \]
Now $\frac{a^N}{N!}$ is some fixed number and
\[ \frac{a}{N+1} < \frac{1}{2}, \quad \frac{a}{N+2} < \frac{1}{2}, \quad \ldots, \quad \frac{a}{n} < \frac{1}{2}, \]
so
\[ 0 < \frac{a^n}{n!} < \frac{a^N}{N!} \left(\frac{1}{2}\right)^{n-N}. \]
As $n \to \infty$, the right-hand side approaches 0 and hence $\displaystyle\lim_{n\to\infty} \frac{a^n}{n!} = 0$ by a sequence version of the
pinching theorem.
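The inequality used in this proof can be observed numerically. The Python sketch below computes $a^n/n!$ as a running product (to avoid overflow) and the geometric bound from the proof; the function names and the sample values of a and N are our own choices for illustration.

```python
def ratio(a, n):
    """Return a**n / n! computed as the product (a/1)(a/2)...(a/n)."""
    term = 1.0
    for k in range(1, n + 1):
        term *= a / k
    return term

def geometric_bound(a, n, N):
    """Return (a**N / N!) * (1/2)**(n - N), the bound from the proof (needs N > 2a)."""
    return ratio(a, N) * 0.5 ** (n - N)
```

For example, with `a = 10` one may take `N = 21`, and then `ratio(10, n)` stays below `geometric_bound(10, n, 21)` for every `n > 21`.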
Theorem 2.2.6. The number π is irrational.
Proof. Suppose that $\pi = p/q$ where p and q are positive integers and consider the integral $I_n$ defined
by (2.11) whenever $n \ge 0$.
First we argue by mathematical induction that $I_n$ is an integer for every value of n. By Lemma
2.2.4, $I_0$ and $I_1$ are integers. Suppose inductively that $I_{k-2}$ and $I_{k-1}$ are integers for some $k \ge 2$.
By (2.12) and the assumption that $\pi = p/q$,
\[ I_k = (4k - 2)q^2 I_{k-1} - p^2q^2 I_{k-2} \]
and so $I_k$ is also an integer. Hence $I_n$ is an integer whenever $n \ge 0$.
On the other hand, it is not hard to see that
\[ 0 < \left(\frac{\pi^2}{4} - x^2\right)^n \cos x \le \left(\frac{\pi^2}{4}\right)^n \]
whenever $n \ge 0$ and $-\frac{\pi}{2} < x < \frac{\pi}{2}$. Hence
\[ 0 < I_n < \frac{q^{2n}}{n!} \int_{-\pi/2}^{\pi/2} \left(\frac{\pi^2}{4}\right)^n dx = \frac{q^{2n}}{n!} \left(\frac{\pi^2}{4}\right)^n \pi = \frac{p}{q} \cdot \frac{(p^2/4)^n}{n!}. \tag{2.13} \]
As $n \to \infty$, the expression in (2.13) approaches 0 by Lemma 2.2.5. Hence $0 < I_n < 1$ whenever n
is sufficiently large. In particular, there is a large value of n for which $I_n$ is not an integer, giving
a contradiction.
Hence we conclude that π is an irrational number.
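To get a feel for how large n must be in the final step, the following Python sketch evaluates the bound (2.13) exactly for a hypothetical pair (p, q). The function names are our own, and any values of p and q fed to it (for instance p = 22, q = 7) are purely illustrative; of course π is not equal to 22/7.

```python
from fractions import Fraction

def bound(p, q, n):
    """Exact value of (p/q) * (p^2/4)^n / n!, the right-hand side of (2.13)."""
    value = Fraction(p, q)
    for k in range(1, n + 1):
        value *= Fraction(p * p, 4 * k)
    return value

def first_n_below_one(p, q):
    """Smallest n for which the bound in (2.13) drops below 1."""
    n = 0
    value = Fraction(p, q)
    while value >= 1:  # terminates because a^n / n! tends to 0 (Lemma 2.2.5)
        n += 1
        value *= Fraction(p * p, 4 * n)
    return n
```

Calling `first_n_below_one(22, 7)` returns the first index at which the bound falls below 1 for that hypothetical pair; the bound first grows and only later decays, which is why "sufficiently large" is needed in the proof.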
2.3 Trigonometric and hyperbolic substitutions
(Ref: SH10 §8.4)
Many integrals can be evaluated by finding the right substitution, but unfortunately there is
no general systematic way to do this. Integrals involving square roots of quadratics often yield to
trigonometric or hyperbolic substitutions.
The following table indicates which substitution can be tried for integrals containing an expression of the form $\sqrt{\pm x^2 \pm a^2}$.
Expression in integrand    Trigonometric substitution    Hyperbolic substitution
$\sqrt{a^2 - x^2}$         $x = a\sin\theta$             $x = a\tanh\theta$
$\sqrt{a^2 + x^2}$         $x = a\tan\theta$             $x = a\sinh\theta$
$\sqrt{x^2 - a^2}$         $x = a\sec\theta$             $x = a\cosh\theta$
Whether or not a trigonometric substitution is more efficient than a hyperbolic substitution depends
on the particular integral. In general, trigonometric substitutions are favoured because once
integration is completed in the variable θ, it is easier to restate the result in terms of x.
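Example 2.3.1. Evaluate $\displaystyle\int \sqrt{1 - x^2}\, dx$.
Solution. The substitution
\[ x = \sin\theta, \qquad dx = \cos\theta\, d\theta \]
yields
\begin{align*}
\int \sqrt{1 - x^2}\, dx &= \int \sqrt{1 - \sin^2\theta}\, \cos\theta\, d\theta \\
&= \int \sqrt{\cos^2\theta}\, \cos\theta\, d\theta && \text{(since $\sin^2\theta + \cos^2\theta = 1$)} \\
&= \int \cos^2\theta\, d\theta \\
&= \frac{1}{2} \int (1 + \cos 2\theta)\, d\theta && \text{(by the double-angle formula for cosine)} \\
&= \frac{1}{2} \left( \theta + \frac{\sin 2\theta}{2} \right) + C \\
&= \frac{1}{2} (\theta + \sin\theta\cos\theta) + C && \text{(by the double-angle formula for sine).}
\end{align*}
To state our answer in terms of x, it is easiest to draw a triangle.
[Figure: right-angled triangle with angle θ, hypotenuse 1, opposite side x and adjacent side $\sqrt{1 - x^2}$, so that $\sin\theta = \frac{x}{1}$.]
We see that $\theta = \sin^{-1} x$ and $\cos\theta = \frac{\sqrt{1 - x^2}}{1}$. Hence
\[ \int \sqrt{1 - x^2}\, dx = \frac{1}{2} \left( \sin^{-1} x + x\sqrt{1 - x^2} \right) + C. \]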
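An antiderivative obtained by substitution can always be checked by differentiating it. The Python sketch below does this numerically for Example 2.3.1, using a symmetric difference quotient; the helper names and the step size are our own assumptions, and the check is only approximate.

```python
import math

def F(x):
    """Antiderivative found in Example 2.3.1, taking C = 0."""
    return 0.5 * (math.asin(x) + x * math.sqrt(1 - x * x))

def f(x):
    """Integrand of Example 2.3.1."""
    return math.sqrt(1 - x * x)

def central_difference(g, x, h=1e-5):
    """Symmetric difference quotient approximating g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)
```

For any x strictly between −1 and 1 (and not too close to the endpoints), `central_difference(F, x)` should agree with `f(x)` to several decimal places.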
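Example 2.3.2. Evaluate $\displaystyle\int \frac{dx}{(4 + x^2)^{3/2}}$.
Solution. By using the substitution
\[ x = 2\tan\theta, \qquad dx = 2\sec^2\theta\, d\theta \]
and the identity $\tan^2\theta + 1 = \sec^2\theta$, we have
\begin{align*}
\int \frac{dx}{(4 + x^2)^{3/2}} &= \int \frac{2\sec^2\theta\, d\theta}{\left(\sqrt{4\tan^2\theta + 4}\right)^3} \\
&= \int \frac{2\sec^2\theta\, d\theta}{\left(2\sqrt{\tan^2\theta + 1}\right)^3} \\
&= \int \frac{2\sec^2\theta\, d\theta}{(2\sec\theta)^3} \\
&= \frac{1}{4} \int \frac{d\theta}{\sec\theta} \\
&= \frac{1}{4} \int \cos\theta\, d\theta \\
&= \frac{\sin\theta}{4} + C.
\end{align*}
To write the answer in terms of x, consider the following triangle.
[Figure: right-angled triangle with angle θ, opposite side x, adjacent side 2 and hypotenuse $\sqrt{x^2 + 4}$, so that $\tan\theta = \frac{x}{2}$.]
Thus $\sin\theta = \dfrac{x}{\sqrt{x^2 + 4}}$ and hence
\[ \int \frac{dx}{(4 + x^2)^{3/2}} = \frac{x}{4\sqrt{x^2 + 4}} + C. \]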
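Example 2.3.3. Use the substitution $x = 3\cosh\theta$ to evaluate $\displaystyle\int \frac{x^3\, dx}{\sqrt{x^2 - 9}}$.
Solution. The substitution
\[ x = 3\cosh\theta, \qquad dx = 3\sinh\theta\, d\theta \]
and the identity $\cosh^2\theta - \sinh^2\theta = 1$ give $\sqrt{x^2 - 9} = 3\sinh\theta$. Hence
\begin{align*}
\int \frac{x^3\, dx}{\sqrt{x^2 - 9}} &= \int \frac{3^4\cosh^3\theta\, \sinh\theta\, d\theta}{\sqrt{9\cosh^2\theta - 9}} \\
&= \int \frac{3^4\cosh^3\theta\, \sinh\theta\, d\theta}{3\sinh\theta} && \text{(since $\cosh^2\theta - \sinh^2\theta = 1$)} \\
&= 27 \int \cosh^3\theta\, d\theta \\
&= 27 \int \cosh\theta\, \cosh^2\theta\, d\theta \\
&= 27 \int \cosh\theta\, (1 + \sinh^2\theta)\, d\theta && \text{(since $\cosh^2\theta - \sinh^2\theta = 1$)} \\
&= 27 \int (1 + u^2)\, du && \text{(using the substitution $u = \sinh\theta$)} \\
&= 27 \left( u + \tfrac{1}{3}u^3 \right) + C.
\end{align*}
As was observed above, $\sqrt{x^2 - 9} = 3\sinh\theta$ and hence
\[ u = \sinh\theta = \tfrac{1}{3}\sqrt{x^2 - 9}. \]
Therefore
\begin{align*}
\int \frac{x^3\, dx}{\sqrt{x^2 - 9}} &= 27 \left( u + \tfrac{1}{3}u^3 \right) + C \\
&= 9\sqrt{x^2 - 9} + \tfrac{1}{3}\left(\sqrt{x^2 - 9}\right)^3 + C.
\end{align*}
Exercise: Evaluate the integral in Example 2.3.3 by using an appropriate trigonometric substitution.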
2.4 Integrating rational functions
(Ref: SH10 §8.5)
The main result of this section is that every rational function has an antiderivative among the
elementary functions. Moreover, there is a systematic way of finding this antiderivative.
Before we begin, we remind the reader that a rational function f is of the form
\[ f(x) = \frac{p(x)}{q(x)}, \]
where p and q are polynomials. We say that f is proper if the degree of the denominator q
is greater than the degree of the numerator p. We say that f is improper if the degree of the
denominator q is less than or equal to the degree of the numerator p. We say that a quadratic
polynomial is irreducible if it has no real linear factors. (Equivalently, a quadratic $ax^2 + bx + c$ is
irreducible if its discriminant $b^2 - 4ac$ is negative.)
Before articulating the general strategy for integrating a rational function, we revise some known
tactics for integrating simpler examples.
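Example 2.4.1. Evaluate $\displaystyle\int \frac{x}{x^2 + 2x + 10}\, dx$.
Solution. The first tactic is to rewrite the integrand so that the derivative of the denominator appears
in the numerator:
\begin{align*}
\int \frac{x}{x^2 + 2x + 10}\, dx &= \frac{1}{2} \int \frac{2x}{x^2 + 2x + 10}\, dx \\
&= \frac{1}{2} \int \frac{(2x + 2) - 2}{x^2 + 2x + 10}\, dx \\
&= \frac{1}{2} \int \left( \frac{2x + 2}{x^2 + 2x + 10} - \frac{2}{x^2 + 2x + 10} \right) dx.
\end{align*}
The first term can now be integrated using the ln function. To integrate the second term, we
complete the square in the denominator:
\begin{align*}
x^2 + 2x + 10 &= x^2 + 2x + 1 + 9 \\
&= (x + 1)^2 + 3^2.
\end{align*}
Hence
\begin{align*}
\int \frac{x}{x^2 + 2x + 10}\, dx &= \frac{1}{2} \int \frac{2x + 2}{x^2 + 2x + 10}\, dx - \int \frac{1}{(x + 1)^2 + 3^2}\, dx \\
&= \frac{1}{2} \ln |x^2 + 2x + 10| - \frac{1}{3} \tan^{-1}\left( \frac{x + 1}{3} \right) + C.
\end{align*}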
The integrand of Example 2.4.1 is a proper rational function whose denominator is an irreducible
quadratic. Any such function can be integrated using the techniques illustrated in that example.
We turn now to study a general strategy for integrating any rational function.
2.4.1 The overall strategy
In this subsection we give an overview of the approach to integrating rational functions. The basic
procedure is summarised below and then illustrated with an example.
1. If the rational function f is improper, then use polynomial division to write f as the sum
of a polynomial and a proper rational function. Since the polynomial is easy to integrate,
we need only focus on integrating a proper rational function.
2. It can be shown using algebra that every proper rational function f can be written as a
unique sum of functions of the form
\[ \frac{A}{(x - a)^k} \quad \text{and} \quad \frac{Bx + C}{(x^2 + bx + c)^k}, \tag{2.14} \]
where the quadratic $x^2 + bx + c$ is irreducible. This sum is called the partial fractions
decomposition of f. We discuss how to find the partial fractions decomposition in the next
subsection.
3. Now we only need to integrate functions of the form given by (2.14). By completing the
square, using a substitution or performing simple algebraic manipulation, these can be
integrated by the standard formulae
\begin{align*}
\int x^k\, dx &= \frac{x^{k+1}}{k + 1} + C, \quad k \ne -1, \\
\int \frac{g'(x)}{g(x)}\, dx &= \ln |g(x)| + C, \\
\int \frac{dx}{a^2 + x^2} &= \frac{1}{a} \tan^{-1} \frac{x}{a} + C.
\end{align*}
Example 2.4.2. Find $\displaystyle\int \frac{x^4 - 5x^3 + 12x^2 - 21x + 35}{x^3 - 3x^2 + 4x - 12}\, dx$.
Solution. Denote the integrand by f(x).
Step 1. Note that f is improper. So polynomial division gives
\[
\begin{array}{r|l}
 & x - 2 \\ \hline
x^3 - 3x^2 + 4x - 12 & x^4 - 5x^3 + 12x^2 - 21x + 35 \\
 & x^4 - 3x^3 + 4x^2 - 12x \\ \hline
 & -2x^3 + 8x^2 - 9x + 35 \\
 & -2x^3 + 6x^2 - 8x + 24 \\ \hline
 & 2x^2 - x + 11
\end{array}
\]
and hence
\[ f(x) = x - 2 + \frac{2x^2 - x + 11}{x^3 - 3x^2 + 4x - 12}. \]
Note that the rational expression on the far right-hand side is proper.
Step 2. The partial fractions decomposition of $\dfrac{2x^2 - x + 11}{x^3 - 3x^2 + 4x - 12}$ is given by
\[ \frac{2x^2 - x + 11}{x^3 - 3x^2 + 4x - 12} = \frac{2}{x - 3} - \frac{1}{x^2 + 4}. \tag{2.15} \]
(Note that the quadratic $x^2 + 4$ is irreducible.) It is not hard to verify that (2.15) is true; the
question is, how does one find such a decomposition? We answer this question in Subsection 2.4.2.
Step 3. The results of Steps 1 and 2 give
\[ f(x) = x - 2 + \frac{2}{x - 3} - \frac{1}{x^2 + 4}. \]
To integrate f, we need only integrate each term in the sum. Hence
\begin{align*}
\int \frac{x^4 - 5x^3 + 12x^2 - 21x + 35}{x^3 - 3x^2 + 4x - 12}\, dx &= \int \left( x - 2 + \frac{2}{x - 3} - \frac{1}{x^2 + 4} \right) dx \\
&= \tfrac{1}{2}x^2 - 2x + 2\ln |x - 3| - \tfrac{1}{2}\tan^{-1}\tfrac{x}{2} + C,
\end{align*}
completing the problem.
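Step 1 of this procedure is entirely mechanical, and it can be sketched in a few lines of Python. The function below is our own illustrative helper (not a library routine): it takes polynomials as lists of coefficients, highest degree first, and uses exact rational arithmetic. Applied to the integrand of Example 2.4.2 it reproduces the quotient x − 2 and remainder 2x² − x + 11 used in Step 3.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Polynomial long division; coefficient lists are highest degree first.

    Returns (quotient, remainder) as lists of Fractions.
    """
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    while len(num) >= len(den):
        factor = num[0] / den[0]
        quotient.append(factor)
        # Subtract factor * den, aligned with the leading term of num.
        for i, d in enumerate(den):
            num[i] -= factor * d
        num.pop(0)  # the leading coefficient is now zero
    return quotient, num

quotient, remainder = poly_divmod([1, -5, 12, -21, 35], [1, -3, 4, -12])
```

Here `quotient` holds the coefficients of x − 2 and `remainder` those of 2x² − x + 11.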
In the next subsection, we focus on finding the partial fractions decomposition of a proper
rational function.
2.4.2 Partial fractions decompositions
To find the partial fractions decomposition of a proper rational function $\frac{p}{q}$, we factorise the denominator
q as much as possible; that is, we express q as a product of real linear factors and real
irreducible quadratic factors. The form of the partial fractions decomposition is determined by this
factorisation. There are several cases, depending on the type of factorisation.
Case 1: The denominator splits into distinct linear factors. Examples of two such rational
functions and the form of their partial fractions decompositions are given below:
\[ \frac{x - 3}{(x - 1)(x - 2)} = \frac{A}{x - 1} + \frac{B}{x - 2} \]
\[ \frac{x^2 - x + 7}{x(2x + 1)(x - 3)} = \frac{A}{x} + \frac{B}{2x + 1} + \frac{C}{x - 3}. \]
The constants A, B and C in each case can be determined using the following method.
Example 2.4.3. Find the partial fractions decomposition of $\dfrac{7x - 1}{x^2 - 2x - 3}$.
Solution. By factorising we find that $x^2 - 2x - 3 = (x - 3)(x + 1)$. So the partial fractions
decomposition takes the form
\[ \frac{7x - 1}{(x - 3)(x + 1)} = \frac{A}{x - 3} + \frac{B}{x + 1}, \]
where A and B are constants to be determined. To find A and B, multiply through by $(x - 3)(x + 1)$
to obtain the polynomial equation
\[ 7x - 1 = A(x + 1) + B(x - 3) \quad \forall x \in \mathbb{R}. \]
Since this identity is true for all values of x, the values of A and B are easily determined by choosing
suitable values of x:
\begin{align*}
x = 3 \quad &\Rightarrow \quad 7 \times 3 - 1 = A(3 + 1) \quad \Rightarrow \quad A = 5 \\
x = -1 \quad &\Rightarrow \quad 7 \times (-1) - 1 = B(-1 - 3) \quad \Rightarrow \quad B = 2.
\end{align*}
Hence the partial fractions decomposition is given by
\[ \frac{7x - 1}{x^2 - 2x - 3} = \frac{7x - 1}{(x - 3)(x + 1)} = \frac{5}{x - 3} + \frac{2}{x + 1}. \]
This may be easily verified by rewriting the right-hand side over a common denominator and
simplifying.
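When the denominator is a product of distinct factors $(x - r_1)\cdots(x - r_m)$ with leading coefficient 1, the substitution $x = r_i$ used above gives each constant directly as $p(r_i) / \prod_{j \ne i}(r_i - r_j)$, where p is the numerator. This shortcut is sometimes called the cover-up rule. The Python sketch below packages it; the function names are our own, and the restriction to a monic denominator with distinct real roots is an assumption of the sketch.

```python
from fractions import Fraction

def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients highest degree first) by Horner's rule."""
    value = Fraction(0)
    for c in coeffs:
        value = value * x + c
    return value

def distinct_linear_constants(numerator, roots):
    """Constants A_i in sum A_i / (x - r_i) for p(x) / prod (x - r_i).

    Assumes the roots are distinct and deg p < number of roots.
    """
    constants = []
    for r in roots:
        denom = Fraction(1)
        for s in roots:
            if s != r:
                denom *= (r - s)
        constants.append(evaluate(numerator, Fraction(r)) / denom)
    return constants

# Example 2.4.3: numerator 7x - 1, denominator (x - 3)(x + 1).
A, B = distinct_linear_constants([7, -1], [3, -1])
```

For Example 2.4.3 this gives A = 5 and B = 2, in agreement with the working above.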
Case 2: The denominator has a repeated linear factor. Examples of two such rational functions
and the form of their partial fractions decompositions are given below:
\[ \frac{x^2 + 1}{(x + 4)^3} = \frac{A}{x + 4} + \frac{B}{(x + 4)^2} + \frac{C}{(x + 4)^3} \]
\[ \frac{x^2 - 2}{(x - 1)(x - 2)^2} = \frac{A}{x - 1} + \frac{B}{x - 2} + \frac{C}{(x - 2)^2}. \]
Note carefully how the repeated factors appear on the right-hand side. The constants A, B and C
in each case can be determined using the following method.
Example 2.4.4. Find the partial fractions decomposition of $\dfrac{x^2 - 3x + 8}{x(x - 2)^2}$.
Solution. The partial fractions decomposition takes the form
\[ \frac{x^2 - 3x + 8}{x(x - 2)^2} = \frac{A}{x} + \frac{B}{x - 2} + \frac{C}{(x - 2)^2}, \]
where A, B and C are constants. To find these constants, we multiply through by $x(x - 2)^2$ to
obtain
\[ x^2 - 3x + 8 = A(x - 2)^2 + Bx(x - 2) + Cx \quad \forall x \in \mathbb{R}. \]
Now substitute the obvious values for x to determine the values of A and C:
\begin{align*}
x = 2 \quad &\Rightarrow \quad 6 = 2C \quad \Rightarrow \quad C = 3 \\
x = 0 \quad &\Rightarrow \quad 8 = 4A \quad \Rightarrow \quad A = 2.
\end{align*}
To determine B, we can substitute any other value for x. However, it is best to choose a small
integer to keep the arithmetic simple:
\[ x = 1 \quad \Rightarrow \quad 6 = A - B + C \quad \Rightarrow \quad B = A + C - 6 = -1. \]
(Alternatively, one can find B by noting that
\[ x^2 - 3x + 8 = 2(x - 2)^2 + Bx(x - 2) + 3x \quad \forall x \in \mathbb{R} \]
and comparing the coefficients of $x^2$.) Hence we obtain the partial fractions decomposition
\[ \frac{x^2 - 3x + 8}{x(x - 2)^2} = \frac{2}{x} - \frac{1}{x - 2} + \frac{3}{(x - 2)^2}. \]
Case 3: The denominator has an irreducible quadratic factor. Examples of two such rational
functions and the form of their partial fractions decompositions are given below:
\[ \frac{x^2 + x}{(x - 1)(x^2 + 9)} = \frac{A}{x - 1} + \frac{Bx + C}{x^2 + 9} \]
\[ \frac{x^3 - 2x + 4}{(x^2 + 5)(x^2 + x + 1)} = \frac{Ax + B}{x^2 + 5} + \frac{Cx + D}{x^2 + x + 1}. \]
Note carefully how the irreducible quadratic appears on the right-hand side. As before, the constants
A, B, C and D in each case can be determined by algebra.
Example 2.4.5. Find the partial fractions decomposition of $\dfrac{4x^2 + 2x + 1}{(x + 1)(x^2 + x + 1)}$.
Solution. The partial fractions decomposition takes the form
\[ \frac{4x^2 + 2x + 1}{(x + 1)(x^2 + x + 1)} = \frac{A}{x + 1} + \frac{Bx + C}{x^2 + x + 1}, \]
where A, B and C are constants. Multiplying through by $(x + 1)(x^2 + x + 1)$ gives
\[ 4x^2 + 2x + 1 = A(x^2 + x + 1) + (Bx + C)(x + 1) \quad \forall x \in \mathbb{R}. \]
Now substitute suitable values for x:
\begin{align*}
x = -1 \quad &\Rightarrow \quad 3 = A \quad \Rightarrow \quad A = 3 \\
x = 0 \quad &\Rightarrow \quad 1 = A + C \quad \Rightarrow \quad C = 1 - A = -2 \\
x = 1 \quad &\Rightarrow \quad 7 = 3A + 2(B + C) \quad \Rightarrow \quad B = 1.
\end{align*}
(Alternatively, after finding A, we could compare the coefficients of the $x^2$ terms on both sides to
deduce that B = 1, and compare the constant terms to deduce that C = −2.) Hence
\[ \frac{4x^2 + 2x + 1}{(x + 1)(x^2 + x + 1)} = \frac{3}{x + 1} + \frac{x - 2}{x^2 + x + 1} \]
is the partial fractions decomposition.
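A decomposition such as this one can be sanity-checked by evaluating both sides at a handful of points in exact arithmetic. The Python sketch below does so for Example 2.4.5; the function names and the sample points are our own choices. Since the underlying polynomial identity has degree 2, agreement at three or more points (away from x = −1) already confirms it.

```python
from fractions import Fraction

def lhs(x):
    """Original rational function of Example 2.4.5."""
    return (4 * x * x + 2 * x + 1) / ((x + 1) * (x * x + x + 1))

def rhs(x):
    """Partial fractions decomposition found in Example 2.4.5."""
    return 3 / (x + 1) + (x - 2) / (x * x + x + 1)

# Sample points k/3 for k = -10, ..., 10, skipping the pole at x = -1.
sample_points = [Fraction(k, 3) for k in range(-10, 11) if Fraction(k, 3) != -1]
all_agree = all(lhs(x) == rhs(x) for x in sample_points)
```

The flag `all_agree` is `True` exactly when the two sides coincide at every sample point.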
Case 4: The denominator has a repeated irreducible quadratic factor. This case rarely appears
in first year mathematics courses because it is more computationally intensive. Nevertheless, for
completeness the basic form of the decomposition is illustrated below:
\[ \frac{x^2 + x}{(x^2 + 9)^3} = \frac{Ax + B}{x^2 + 9} + \frac{Cx + D}{(x^2 + 9)^2} + \frac{Ex + F}{(x^2 + 9)^3} \]
\[ \frac{x^3 - 2x + 4}{(x - 2)(x^2 + x + 1)^2} = \frac{A}{x - 2} + \frac{Bx + C}{x^2 + x + 1} + \frac{Dx + E}{(x^2 + x + 1)^2}. \]
As before, the constants appearing in each example can be determined by algebra.
The final example tests our ability to generalise each of these cases to rational functions whose
denominators have many factors of different types.
Example 2.4.6. Write down the form of the partial fractions decomposition for the rational function
given by
\[ \frac{4x^4 - 3x^2 + x - 9}{x^3(x - 7)(x^2 + 3)^2(x^2 + x + 2)}. \]
(You are not required to evaluate the constant coefficients.)
Solution. The partial fractions decomposition is given by
\[ \frac{4x^4 - 3x^2 + x - 9}{x^3(x - 7)(x^2 + 3)^2(x^2 + x + 2)} = \frac{A}{x} + \frac{B}{x^2} + \frac{C}{x^3} + \frac{D}{x - 7} + \frac{Ex + F}{x^2 + 3} + \frac{Gx + H}{(x^2 + 3)^2} + \frac{Ix + J}{x^2 + x + 2}, \]
where A, B, . . . , J are real constants.
Remark 2.4.7. It is important to check that the denominator of the rational function has been
completely factorised before writing down the form of partial fractions decomposition. In particular,
one should check that every quadratic factor appearing in the factorisation is irreducible.
2.4.3 Integrating rational functions: two examples
In this subsection we illustrate how techniques discussed in the previous subsections are applied.
Example 2.4.8. Find $\displaystyle\int \frac{8x^3 - 12x^2 - 13x - 5}{2x^2 - 3x - 2}\, dx$.
Solution. We follow the steps outlined in Subsection 2.4.1. Denote the integrand by f(x).
Step 1. Since f is improper, we begin with polynomial division. This gives
\[
\begin{array}{r|l}
 & 4x \\ \hline
2x^2 - 3x - 2 & 8x^3 - 12x^2 - 13x - 5 \\
 & 8x^3 - 12x^2 - 8x \\ \hline
 & -5x - 5
\end{array}
\]
whence
\begin{align*}
f(x) &= 4x + \frac{-5x - 5}{2x^2 - 3x - 2} \\
&= 4x - \frac{5x + 5}{2x^2 - 3x - 2}.
\end{align*}
Step 2. To find the partial fractions decomposition of
\[ \frac{5x + 5}{2x^2 - 3x - 2} \]
we factorise the denominator:
\[ 2x^2 - 3x - 2 = (2x + 1)(x - 2). \]
Hence the decomposition is given by
\[ \frac{5x + 5}{(2x + 1)(x - 2)} = \frac{A}{2x + 1} + \frac{B}{x - 2}, \]
where A and B are constants. Multiplying through by $(2x + 1)(x - 2)$ gives
\[ 5x + 5 = A(x - 2) + B(2x + 1) \]
and by using the substitution x = 2 we deduce that B = 3. By comparing coefficients of x on both
sides it is easy to see that A = −1. Hence
\[ \frac{5x + 5}{(2x + 1)(x - 2)} = \frac{-1}{2x + 1} + \frac{3}{x - 2}. \]
Step 3. By the previous two steps,
\[ f(x) = 4x + \frac{1}{2x + 1} - \frac{3}{x - 2}. \]
Integrating gives
\[ \int f(x)\, dx = 2x^2 + \tfrac{1}{2}\ln |2x + 1| - 3\ln |x - 2| + C, \]
completing our answer.
Example 2.4.9. Find $\displaystyle\int \frac{4x^2 - 15x + 29}{(x - 5)(x^2 - 4x + 13)}\, dx$.
Solution. The integrand is a proper rational function and its denominator is completely factorised over
the real numbers. So we immediately look for its partial fractions decomposition, which is of the
form
\[ \frac{4x^2 - 15x + 29}{(x - 5)(x^2 - 4x + 13)} = \frac{A}{x - 5} + \frac{Bx + C}{x^2 - 4x + 13} \]
for some real constants A, B and C. Hence
\[ 4x^2 - 15x + 29 = A(x^2 - 4x + 13) + (Bx + C)(x - 5), \]
from which appropriate substitutions allow the evaluation of the unknown constants:
\begin{align*}
x = 5 \quad &\Rightarrow \quad 54 = 18A \quad \Rightarrow \quad A = 3 \\
x = 0 \quad &\Rightarrow \quad 29 = 13A - 5C \quad \Rightarrow \quad C = 2 \\
x = 1 \quad &\Rightarrow \quad 18 = 10A - 4(B + C) \quad \Rightarrow \quad B = 1.
\end{align*}
Hence
\[ \frac{4x^2 - 15x + 29}{(x - 5)(x^2 - 4x + 13)} = \frac{3}{x - 5} + \frac{x + 2}{x^2 - 4x + 13}. \]
The first term of the decomposition is easy to integrate. We therefore focus on the second term:
\begin{align*}
\int \frac{x + 2}{x^2 - 4x + 13}\, dx &= \frac{1}{2} \int \frac{2x + 4}{x^2 - 4x + 13}\, dx \\
&= \frac{1}{2} \left[ \int \frac{2x - 4}{x^2 - 4x + 13}\, dx + \int \frac{8}{x^2 - 4x + 13}\, dx \right] \\
&= \frac{1}{2} \int \frac{2x - 4}{x^2 - 4x + 13}\, dx + 4 \int \frac{1}{x^2 - 4x + 13}\, dx \\
&= \frac{1}{2} \ln |x^2 - 4x + 13| + 4 \int \frac{1}{(x - 2)^2 + 9}\, dx \\
&= \frac{1}{2} \ln |x^2 - 4x + 13| + \frac{4}{3} \tan^{-1}\left( \frac{x - 2}{3} \right) + C.
\end{align*}
Putting everything together gives
\[ \int \frac{4x^2 - 15x + 29}{(x - 5)(x^2 - 4x + 13)}\, dx = 3\ln |x - 5| + \frac{1}{2} \ln |x^2 - 4x + 13| + \frac{4}{3} \tan^{-1}\left( \frac{x - 2}{3} \right) + C. \]
2.5 Other substitutions
(Ref: SH10 §8.6)
The method of partial fractions allows us, in principle, to find an antiderivative, among the
elementary functions, for any given rational function. So given a ‘non-standard’ integral, a sound
technique for integration is to look for a substitution that will convert the given integral into the
integral of a rational function. Choosing a good substitution is often a matter of experience and a
little inspiration.
Example 2.5.1. Evaluate the following integrals.
(a) $\displaystyle\int \frac{dx}{1 + x^{1/4}}$
(b) $\displaystyle\int \frac{x^{1/2}}{x^{1/3} + x^{1/4}}\, dx$
(c) $\displaystyle\int \frac{dx}{\sqrt{e^{2x} - 1}}$
Solution. (a) The aim is to replace the fractional power $x^{1/4}$ with something more convenient. The
obvious substitution to use is $x = u^4$, for which $dx = 4u^3\, du$. Hence
\[ \int \frac{dx}{1 + x^{1/4}} = \int \frac{4u^3\, du}{1 + u} \]
and we now have the integral of a rational function. Polynomial division gives
\[ \frac{4u^3}{1 + u} = 4 \left( u^2 - u + 1 - \frac{1}{1 + u} \right). \]
(This result may also be obtained by writing
\begin{align*}
u^3 &= (u^3 + 1) - 1 \\
&= (u + 1)(u^2 - u + 1) - 1,
\end{align*}
thus avoiding the use of polynomial long division.) Consequently,
\begin{align*}
\int \frac{dx}{1 + x^{1/4}} &= 4 \left( \frac{u^3}{3} - \frac{u^2}{2} + u - \ln |1 + u| \right) + C \\
&= \frac{4x^{3/4}}{3} - 2x^{1/2} + 4x^{1/4} - 4\ln \left| 1 + x^{1/4} \right| + C.
\end{align*}
(b) We aim to remove the fractional powers $x^{1/2}$, $x^{1/3}$ and $x^{1/4}$. The lowest common multiple
of 2, 3 and 4 is 12, so we choose the substitution $x = u^{12}$. Hence
\begin{align*}
\int \frac{x^{1/2}}{x^{1/3} + x^{1/4}}\, dx &= \int \frac{u^6}{u^4 + u^3}\, 12u^{11}\, du \\
&= 12 \int \frac{u^{14}}{u + 1}\, du.
\end{align*}
From here we either use polynomial division, or we can observe the factorisation
\[ u^n - 1 = (u + 1)(u^{n-1} - u^{n-2} + \cdots + u - 1) \quad \text{when n is even,} \]
to obtain
\[ \int \frac{x^{1/2}}{x^{1/3} + x^{1/4}}\, dx = 12 \int \left( u^{13} - u^{12} + u^{11} - \cdots + u - 1 + \frac{1}{u + 1} \right) du. \]
It is easy to evaluate the integral from here.
(c) One option is to use the substitution
\[ u = e^x, \qquad du = e^x\, dx \]
so that
\begin{align*}
\int \frac{dx}{\sqrt{e^{2x} - 1}} &= \int \frac{e^x\, dx}{e^x\sqrt{e^{2x} - 1}} \\
&= \int \frac{du}{u\sqrt{u^2 - 1}}.
\end{align*}
From here the integral can be evaluated using the substitution $u = \sec\theta$ or $u = \cosh\theta$. This is left
as an exercise.
A better approach is to remove the square root with the very first substitution. The substitution
$u^2 = e^{2x} - 1$ implies that
\[ 2u\frac{du}{dx} = 2e^{2x} = 2(u^2 + 1), \]
which gives
\[ dx = \frac{u\, du}{u^2 + 1}. \]
Hence
\begin{align*}
\int \frac{dx}{\sqrt{e^{2x} - 1}} &= \int \frac{u\, du}{u(u^2 + 1)} \\
&= \int \frac{du}{u^2 + 1} \\
&= \tan^{-1} u + C \\
&= \tan^{-1} \sqrt{e^{2x} - 1} + C.
\end{align*}
Thus the second method is more efficient than the first.
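After a chain of substitutions it is easy to lose a constant factor, so a numerical cross-check can be reassuring. The Python sketch below compares the antiderivative from part (a) with a Simpson's rule approximation of the corresponding definite integral. The interval [1, 16], the function names and the number of panels are our own illustrative assumptions.

```python
import math

def simpson(f, a, b, panels=2000):
    """Composite Simpson's rule on [a, b]; panels must be even."""
    h = (b - a) / panels
    total = f(a) + f(b)
    for i in range(1, panels):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def integrand(x):
    """Integrand of Example 2.5.1(a)."""
    return 1 / (1 + x ** 0.25)

def antiderivative(x):
    """Antiderivative from Example 2.5.1(a), written in terms of u = x^(1/4)."""
    u = x ** 0.25
    return 4 * (u ** 3 / 3 - u ** 2 / 2 + u - math.log(1 + u))

numerical = simpson(integrand, 1, 16)
exact = antiderivative(16) - antiderivative(1)
```

The two values `numerical` and `exact` should agree to many decimal places; a mismatch would point to an algebra slip in the substitution.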
As seen in the last example, there may be more than one method to evaluate a given integral.
Choosing the most efficient substitution to use is not always easy, but intuition can be developed
with time, experience and practice. The next example is therefore left to the student as an exercise.
Example 2.5.2. Evaluate $\displaystyle\int_0^1 \frac{x^3}{(4 + x^2)^{5/2}}\, dx$ by
(i) using the substitution $x = 2\tan\theta$ (since the integrand involves $\sqrt{4 + x^2}$);
(ii) using the substitution $x = 2\sinh\theta$ (since the integrand involves $\sqrt{4 + x^2}$);
(iii) using the substitution $u^2 = 4 + x^2$ (aiming for a rational function);
(iv) using the substitution $u = 4 + x^2$.
Which method works best?
2.6 Maple notes
The following MAPLE command is relevant to the material of this chapter:
convert(f, parfrac, x); performs a partial fraction decomposition of the rational function f in
the variable x. For example,
> convert(x^2/(x+2), parfrac, x);
\[ x - 2 + \frac{4}{x + 2} \]
> convert(x/(x-b)^2, parfrac, x);
\[ \frac{b}{(x - b)^2} + \frac{1}{x - b} \]
Problems for Chapter 2
Revision problems
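1. [R] Evaluate each of the following integrals by inspection. Do not use substitution.
a) $\int x e^{2x^2}\, dx$   b) $\int x\sin(x^2)\, dx$   c) $\int x^2\cos(2x^3)\, dx$
d) $\int \frac{x}{5x^2 - 11}\, dx$   e) $\int \sin x\cos^3 x\, dx$   f) $\int \frac{dx}{x\ln x}$
g) $\int \frac{x + 2}{\sqrt{x^2 + 4x + 7}}\, dx$   h) $\int x\sqrt{1 + x^2}\, dx$   i) $\int x^2\sqrt{9 - 4x^3}\, dx$
j) $\int \frac{x^2}{\sqrt{9 - 4x^3}}\, dx$   k) $\int \frac{x^3}{(1 + x^4)^3}\, dx$   l) $\int \frac{\sec^2 x}{\tan^4 x}\, dx$
m) $\int \frac{\cos x}{\sin^3 x}\, dx$   n) $\int e^{2x}(4 + 3e^{2x})^{1/3}\, dx$   o) $\int \frac{1}{x(\ln x)^5}\, dx$
2. [R] Integrate the following by parts.
a) $\int x^2 e^{-x}\, dx$   b) $\int x^3\ln x\, dx$   c) $\int \frac{x}{\cos^2 x}\, dx$   d) $\int \frac{(\ln x)^2}{x^2}\, dx$
e) $\int e^x\cos x\, dx$   f) $\int \ln x\, dx$   g) $\int \tan^{-1} x\, dx$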
Problems 2.1 : Trigonometric integrals
3. [R] Evaluate the following integrals.
a) $\int_0^{\pi/2} \sin^7 x\cos x\, dx$   b) $\int_0^{\pi} \sin^3 x\cos^2 x\, dx$
c) $\int \sec^3 x\tan x\, dx$   d) $\int \cos^2\theta\, d\theta$
e) $\int \cos x\cos 10x\, dx$   f) $\int \sin 2x\cos 3x\, dx$
Problems 2.2 : Reduction formulae
4. [R]
a) By multiplying the integrand by $\dfrac{\tan x + \sec x}{\tan x + \sec x}$, find $\int \sec x\, dx$.
b) The reduction formula
\[ \int \sec^n x\, dx = \frac{\sec^{n-2} x\tan x}{n - 1} + \frac{n - 2}{n - 1} \int \sec^{n-2} x\, dx \]
is valid whenever $n \ge 2$. Use it to find the following integrals.
i) $\int \sec^4 x\, dx$   ii) $\int \sec^5 x\, dx$
c) [X] Prove the reduction formula given above.
5. [R] Suppose that
\[ I_{m,n} = \int_0^{\pi/2} \cos^m x\sin^n x\, dx \]
whenever m and n are nonnegative integers. Use the reduction formula
\[ I_{m,n} = \begin{cases} \left( \dfrac{m - 1}{m + n} \right) I_{m-2,n} & \text{provided that } m \ge 2, \\[2ex] \left( \dfrac{n - 1}{m + n} \right) I_{m,n-2} & \text{provided that } n \ge 2 \end{cases} \]
to evaluate the following integrals.
a) $\int_0^{\pi/2} \cos^6 x\sin^4 x\, dx$   b) $\int_0^{\pi/2} \cos^5 x\sin^5 x\, dx$   c) $\int_0^{\pi/2} \cos^3 x\sin^4 x\, dx$
6. [R] Suppose that $I_n = \int_0^1 x^n e^{-x}\, dx$. Prove that
\[ I_n = nI_{n-1} - \frac{1}{e} \]
whenever $n > 0$. Hence evaluate $\int_0^1 x^3 e^{-x}\, dx$.
7. [R] It was proven in the notes that if $I_n = \int_0^{\pi/4} \tan^n x\, dx$ then
\[ I_n = \frac{1}{n - 1} - I_{n-2} \]
whenever $n > 1$. Use this to evaluate $I_7$ and $I_8$.
8. [R] Suppose that $I_n = \int_1^e x(\ln x)^n\, dx$. Show that
\[ I_n = \frac{1}{2} \left( e^2 - nI_{n-1} \right) \]
whenever $n \ge 1$. Hence evaluate $I_3$.
9. [R] By writing $\cos^n x$ as $\cos^{n-1} x\cos x$ and integrating by parts, show that
\[ \int_0^{\pi/2} \cos^n x\, dx = \frac{n - 1}{n} \int_0^{\pi/2} \cos^{n-2} x\, dx \]
whenever $n \ge 2$. Hence find $\int_0^{\pi/2} \cos^8 x\, dx$ and $\int_0^{\pi/2} \cos^7 x\, dx$.
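10. [R] Suppose that $I_n = \int_0^1 \frac{x^n}{\sqrt{1 + x}}\, dx$. Find a reduction formula for $I_n$.
11. [H] Show that $\displaystyle\int_0^1 x^m(1 - x)^n\, dx = \frac{m!\, n!}{(m + n + 1)!}$ for all nonnegative integers m and n.
12. [H] Suppose that $I_n = \int_0^{\pi/2} \cos^n x\, dx$.
a) By writing the reduction formula of Question 9 as $I_n = \left( 1 - \frac{1}{n} \right) I_{n-2}$, show that
\[ I_{2m} = \left( 1 - \frac{1}{2m} \right) \left( 1 - \frac{1}{2m - 2} \right) \cdots \left( 1 - \frac{1}{2} \right) \frac{\pi}{2} = \frac{\pi}{2} \prod_{k=1}^{m} \left( 1 - \frac{1}{2k} \right) \]
and
\[ I_{2m+1} = \prod_{k=1}^{m} \left( 1 - \frac{1}{2k + 1} \right). \]
b) Deduce that
\[ \frac{2}{\pi} \frac{I_{2m}}{I_{2m+1}} = \prod_{k=1}^{m} \left( 1 - \frac{1}{(2k)^2} \right). \]
c) By considering $\cos x$ on $(0, \frac{\pi}{2})$, show that
\[ I_{2m+2} \le I_{2m+1} \le I_{2m} \]
whenever $m \ge 1$.
d) Use the results of (a) and (c) and the pinching theorem to deduce that
\[ \lim_{m\to\infty} \frac{I_{2m}}{I_{2m+1}} = 1. \]
e) Conclude that
\[ \lim_{m\to\infty} \prod_{k=1}^{m} \left( 1 - \frac{1}{(2k)^2} \right) = \frac{2}{\pi}. \]
This limit is called Wallis’ product.
f) Show that
\[ \prod_{k=1}^{m} \left( 1 - \frac{1}{(2k)^2} \right) = \prod_{k=1}^{m} \frac{(2k - 1)(2k + 1)}{(2k)^2} = \frac{(2m + 1)((2m)!)^2}{2^{4m}(m!)^4} \]
and deduce that Wallis’ product may be written as
\[ \frac{\pi}{2} = \lim_{m\to\infty} \frac{2^{4m}(m!)^4}{(2m + 1)((2m)!)^2}. \]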
13. [X]
a) Show that
\[ \int_m^{2m} \ln x\, dx = m\ln m + 2m\ln 2 - m. \]
b) Show that the trapezoidal rule (see SH6 page 457, SH7 page 537 or SH8 page 484)
with m subintervals gives the approximate value of this integral as
\[ \ln \left( \frac{(2m)!}{m!} \right) - \frac{1}{2} \ln 2. \]
c) It can be shown that if the trapezoidal rule with m intervals of equal width is used to
approximate $\int_a^b f(x)\, dx$, where f is a twice differentiable function, then the absolute
error in the approximation is no greater than $\dfrac{(b - a)^3 M}{12m^2}$, where M is the maximum
of $|f''(x)|$ on $[a, b]$. Use this error bound to show that the error in the approximation
in part (b) approaches 0 as $m \to \infty$.
d) Conclude that
\[ \frac{(2m)!}{m!} \Big/ \left\{ \sqrt{2}\, 2^{2m} m^m e^{-m} \right\} \to 1 \]
as $m \to \infty$.
14. [X] Using the previous two questions (or otherwise), show that
\[ \frac{\pi}{2} = \lim_{m\to\infty} \frac{(m!)^2}{4m\, m^{2m} e^{-2m}}, \]
so that
\[ \frac{m!}{\sqrt{2\pi}\, m^{m + \frac{1}{2}} e^{-m}} \to 1 \]
as $m \to \infty$. This is sometimes called Stirling’s formula (or Stirling’s approximation for
m!).
Problems 2.3 : Trigonometric and hyperbolic substitutions
15. [R] Evaluate the following integrals.
a) $\int_0^1 \frac{x^2}{\sqrt{4 - x^2}}\, dx$   b) $\int \frac{dx}{\sqrt{x^2 - 6x + 13}}$
c) $\int_0^3 \sqrt{9 - x^2}\, dx$   d) $\int \frac{dx}{x^2\sqrt{x^2 + 16}}$
e) $\int (1 - x^2)^{-3/2}\, dx$   f) $\int_{-1}^{1} \frac{dx}{x^2 + 2x + 2}$
16. [R] Evaluate $\int \frac{x}{\sqrt{x^2 - 4}}\, dx$ by making an appropriate substitution. Are there any other
methods or substitutions that could be used? Which one is most efficient?
Problems 2.4 : Integrating rational functions
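17. [R] Evaluate the following integrals.
a) $\int \frac{1}{x^2 + 4x + 3}\, dx$   b) $\int \frac{5x - 7}{x^2 - 3x + 2}\, dx$
c) $\int \frac{x + 1}{x^2(x - 1)}\, dx$   d) $\int \frac{1}{(x^2 - 1)^2}\, dx$
e) $\int \frac{x^2 + 1}{x^2 - 1}\, dx$   f) $\int \frac{18}{(x^2 + 9)(x - 3)}\, dx$
g) $\int \frac{x^2 + x + 2}{(x + 1)(x + 2)^2}\, dx$   h) $\int \frac{1 - x}{(1 + x)^3}\, dx$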
Problems 2.5 : Other substitutions and miscellaneous integrals
18. [R] Evaluate the following integrals.
a) $\int \frac{x}{x^2 + 2x + 10}\, dx$   b) $\int \frac{x}{\sqrt{x^2 + 2x + 10}}\, dx$
c) $\int \frac{dx}{1 + \sqrt{x}}$   d) [H] $\int_1^{64} \frac{1}{x^{1/2} + x^{1/3}}\, dx$
19. [R] Evaluate $\displaystyle\int_0^1 \frac{x^3}{(4 + x^2)^{5/2}}\, dx$ by
a) using the substitution $x = 2\tan\theta$ (since the integrand involves $\sqrt{4 + x^2}$);
b) using the substitution $x = 2\sinh\theta$ (since the integrand involves $\sqrt{4 + x^2}$);
c) using the substitution $u^2 = 4 + x^2$ (aiming for a rational function);
d) using the substitution $u = 4 + x^2$.
Which method works best?
20. [X] Caution: for each integral below, it is unlikely to be profitable to seek an indefinite
integral.
a) Use the substitution $x = \pi - u$ to show that
\[ \int_0^{\pi} \frac{x\sin x}{1 + \cos^2 x}\, dx = \frac{\pi^2}{4}. \]
b) Show that
\[ \int_0^1 \frac{\ln(1 + x)}{1 + x^2}\, dx = \frac{\pi}{8} \ln 2. \]
21. [X] Find
\[ \int \frac{x^2 - 1}{x^2 + 1} \cdot \frac{1}{\sqrt{1 + x^4}}\, dx. \]
The substitution $u^2 = x^2 + \frac{1}{x^2}$ may be useful, but there are other possible approaches.
[Taken from Spivak’s Calculus.]
22. [R] The following integrals were selected from past papers. Evaluate each one.
a) $\int \frac{dx}{x(x^2 + x + 1)}$   b) $\int 8\sinh x\cosh^4 x\, dx$
c) $\int \frac{3x + 5}{x^2 + 4x + 8}\, dx$   d) $\int \sqrt{25 - x^2}\, dx$
e) $\int \frac{3x^2 - 5x + 3}{(x - 1)(x^2 - 2x + 2)}\, dx$   f) $\int \frac{1}{x^2\sqrt{1 + x^2}}\, dx$
g) $\int \frac{dx}{(x^2 + 3)^{3/2}}$   h) $\int \cos(4x)\sin(3x)\, dx$
i) $\int_1^{\infty} \frac{e^{-\sqrt{x}}}{\sqrt{x}}\, dx$. If convergent, evaluate the integral.
Chapter 3
Ordinary differential equations
In many practical applications (in physics, economics, social sciences, engineering, applied science,
mathematics and so on), information is known about the relationship between a quantity and its
rates of change, but one may not have an exact formula for the quantity itself. For example,
a simple population model states that the rate of change of a population, at any given time, is
proportional to the size of the population itself. If we write P(t) for the population at time t, then we
arrive at the equation
\[ \frac{dP}{dt} = kP, \]
where k is the constant of proportionality. An equation, such as the one given above, which involves
one (or more) of the derivatives of a function, is called a differential equation. Some other simple
examples include
• $\dfrac{d^2x}{dt^2} = -k^2x$, which is used to describe the displacement x from the origin of a particle
undergoing simple harmonic motion;
• $\dfrac{dT}{dt} = k(T - 20)$, which is used to describe how the temperature T of an object changes
at room temperature; and
• $\dfrac{dy}{dt} - 0.08y = 0.05(60000 + 1000t)$, which is used to describe how the amount y (in dollars)
of a particular investment changes in time (see Example 3.4.3).
If possible, the aim from here is to find an explicit formula (or formulae) describing the unknown
function (respectively P, x, T and y in the examples above) appearing in each differential equation.
The primary goal of this chapter is to examine some techniques for obtaining a formula for a function
given a differential equation for that function.
3.1 An introduction
We begin with a definition.
Definition 3.1.1. An ordinary differential equation is an equation expressed in
terms of exactly one independent variable and one (or more) of the derivatives of a
function of this variable. The order of an ordinary differential equation is the order
of the highest derivative present.
For example, the equation
\[ \frac{d^3y}{dx^3} + \sin x\, \frac{dy}{dx} = 3x^2y \]
is an ordinary differential equation of order 3; the independent variable is x and y is assumed to
be a function of x. The equation
\[ \left( \frac{d^2x}{dt^2} \right)^{3/2} + \frac{dx}{dt} - tx = 0 \]
is an ordinary differential equation of order 2; the independent variable is t and x is assumed to be
a function of t.
In these notes, the term ‘ordinary differential equation’ will often be abbreviated as ODE. Such
equations are called ‘ordinary’ because they involve ordinary derivatives. This is to distinguish
them from differential equations that involve partial derivatives. (The study of ‘partial differential
equations’ will be introduced in some second year courses.)
Ordinary differential equations can be written in several ways using a variety of notations. For
example, each of the equations
\[ \frac{d^2y}{dx^2} + 4x\frac{dy}{dx} = e^x \]
\[ f''(x) + 4xf'(x) = e^x \]
\[ y'' + 4xy' = e^x \]
represents the same ODE.
Definition 3.1.2. A solution to an nth order ordinary differential equation is a
function which is n-times differentiable and satisfies the given equation.
The next example illustrates this definition as well as introducing the terms ‘particular solution’
and ‘general solution.’
Example 3.1.3. Consider the ODE
\[ \frac{dy}{dx} = x^2 + 5. \]
Then the function y, given by
\[ y(x) = \frac{x^3}{3} + 5x, \]
is a solution to the ODE. Note that if $y(x) = \frac{x^3}{3} + 5x + 6$ or if $y(x) = \frac{x^3}{3} + 5x - 45$, then y is also
a solution to the ODE. Each of these solutions is called a particular solution to the ODE. Using
the mean value theorem (see Section 5.9 of the MATH1131 calculus notes), it is easily shown that
every particular solution y to the ODE can be written in the form
\[ y(x) = \frac{x^3}{3} + 5x + C, \tag{3.1} \]
where $C \in \mathbb{R}$. The family of solutions given by (3.1), where $C \in \mathbb{R}$, is called the general solution
to the ODE.
In the above example, the solution y could be expressed explicitly as a function of the independent
variable x. Hence we obtained an explicit solution to the ODE. However, this cannot always be
done, as the following example illustrates. Sometimes we must settle for an implicit solution to the
ODE.
Example 3.1.4. Show that y, given implicitly by the equation
y2 = cos(x2 + y2), (3.2)
is a particular solution to the ODE
2x sin(x2 + y2) +
(
2y sin(x2 + y2) + 2y
)dy
dx
= 0.
Proof. To verify that y solves the ODE, we first need to calculate dydx . Implicit differentiation of
(3.2) with respect to x gives
2y
dy
dx
= − sin(x2 + y2)×
(
2x+ 2y
dy
dx
)
(where we have used the chain rule to obtain the right-hand side). Hence
(
2y + 2y sin(x2 + y2)
)dy
dx
= −2x sin(x2 + y2).
By simple rearrangement it is easily seen that
2x sin(x2 + y2) +
(
2y sin(x2 + y2) + 2y
)dy
dx
= 0,
and hence the ODE is satisfied. (This ODE shall be revisited again in Section 3.5, where we shall
find the general solution, instead of merely verifying that a given function is a solution.)
For some differential equations, it may not even be possible to find an implicit solution. If it
is possible to prove that an solution exists, then mathematicians and scientists must often settle
for working with an approximate solution to the ODE. However, such issues will not concern us in
this course.
3.2 Initial value problems
In most practical applications where ODEs are used, information is also known about the value of
the unknown function and its derivatives at a particular point. This information, together with an
ODE, forms an initial value problem.
Definition 3.2.1. An initial value problem is an nth order ODE together with a
set of values of the solution and its first (n − 1) derivatives at some fixed point x0.
These values are called the initial conditions of the initial value problem.
For example,
• \(\frac{d^2y}{dx^2} + 5x\frac{dy}{dx} + y = \sin x\), \(y(0) = 2\), \(\left.\frac{dy}{dx}\right|_{x=0} = 7\);
• \(f'(t) - e^{2t}f(t) = 3t^2\), \(f(1) = 5\); and
• \(y''' + 3y'' + 4y = \cosh x\), \(y''(\pi) = 2\), \(y'(\pi) = 0\), \(y(\pi) = -1\)
are all initial value problems. The term ‘initial value problem’ is often abbreviated as IVP.
To solve an initial value problem, we usually try to find a general solution to the ODE (which is
expressed using unspecified constants) and then determine the values of these constants by imposing
the initial conditions.
Example 3.2.2. Solve the IVP
\[ \frac{d^2y}{dx^2} = 6x, \qquad y'(0) = 2, \quad y(0) = -1. \]
Solution. Integrating the ODE once gives
\[ \frac{dy}{dx} = 3x^2 + C \]
where C ∈ R. By imposing the initial condition y′(0) = 2, we deduce that C = 2. Hence
\[ \frac{dy}{dx} = 3x^2 + 2. \]
Integrating again gives
\[ y = x^3 + 2x + D, \]
where D ∈ R. The initial condition y(0) = −1 implies that D = −1. Hence the solution y to the
IVP is given by
\[ y = x^3 + 2x - 1. \]
Note that this solution is valid for all x in R; that is, the solution y is defined on R.
Not every initial value problem is as straightforward to solve as the example above. Solving an
IVP is, in general, very difficult, and the following questions arise.
(a) Does the IVP have a solution?
(b) Does it have a unique solution?
(c) If initial values are given at the point a, then how far on either side of a does the solution
extend?
The following two examples show that care must be taken in answering such questions, even for
IVPs that appear to be ‘simple.’
Example 3.2.3. Solve the initial value problem
\[ \frac{dy}{dx} = \sqrt{y}, \qquad y(0) = 0. \]
Solution. If we assume that y(x) ≠ 0 then the ODE can be written as
\[ \frac{dx}{dy} = \frac{1}{\sqrt{y}}, \tag{3.3} \]
from which we find that \(x = 2\sqrt{y} + C\), where C is a real number. When x = 0 we have that y = 0
and hence C = 0. Rearranging gives, for x ≥ 0,
\[ y(x) = \frac{x^2}{4}. \]
However, note that y(x) = 0 is also a solution to the IVP. Hence the IVP does not have a unique
solution.
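The lack of uniqueness can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names and the finite-difference step h are our own choices, and the check is restricted to sample points with x > 0. It estimates the derivative of each candidate solution and compares it with the right-hand side of the ODE.

```python
import math

def rhs(y):
    """Right-hand side of dy/dx = sqrt(y)."""
    return math.sqrt(y)

def y_parabola(x):
    return x * x / 4.0   # candidate solution, considered for x >= 0

def y_zero(x):
    return 0.0           # the constant solution

def residual(y, x, h=1e-6):
    """|y'(x) - sqrt(y(x))|, with y' estimated by a central difference."""
    dydx = (y(x + h) - y(x - h)) / (2 * h)
    return abs(dydx - rhs(y(x)))

xs = [0.5, 1.0, 2.0, 3.0]   # illustrative sample points
max_res_parabola = max(residual(y_parabola, x) for x in xs)
max_res_zero = max(residual(y_zero, x) for x in xs)
print(max_res_parabola, max_res_zero)
```

Both residuals are zero up to rounding error, and both functions take the value 0 at x = 0, so both satisfy the same IVP.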
Example 3.2.4. Solve the initial value problem
\[ \frac{dy}{dx} = \frac{1}{x}, \qquad y(1) = 2. \]
How far does the solution extend on either side of the point 1?
Solution. First, the ODE implies that \(\frac{dy}{dx}\) does not exist at 0. However, on the interval (0,∞), we
obtain the general solution
\[ y(x) = \ln x + C, \]
where C ∈ R. The initial condition implies that C = 2 and so
\[ y(x) = \ln x + 2 \]
whenever x > 0. Hence we have found a solution that extends to the interval (0,∞).
Remark 3.2.5. Note that in Example 3.2.4 we could give a family of solutions defined on the set
{x ∈ R : x ≠ 0} by
\[ y(x) = \begin{cases} \ln|x| + D & \text{if } x < 0 \\ \ln|x| + 2 & \text{if } x > 0, \end{cases} \]
where D ∈ R. However, for most practical applications such a solution would not be used on
(−∞, 0) because of the break in the domain of y at 0.
3.3 Separable ODEs
(Ref: SH10 §9.2)
A separable ODE is a differential equation where the two variables involved (say x and y) can
be separated so that all the y’s are on one side of the equation and all x’s are on the other. We
give an example and then state the general form.
Example 3.3.1. Solve the initial value problem
\[ \frac{dy}{dx} = y^2(1 + x^2), \qquad y(0) = 1. \]
Solution. First we separate the variables x and y to obtain
\[ \frac{1}{y^2}\,dy = (1 + x^2)\,dx. \tag{3.4} \]
Integrating gives
\[ \int \frac{1}{y^2}\,dy = \int (1 + x^2)\,dx, \tag{3.5} \]
whence
\[ -\frac{1}{y} = x + \frac{x^3}{3} + C, \]
where C ∈ R. This gives an implicit solution to the ODE.
We now impose the initial condition to evaluate the constant C. When x = 0 and y = 1 we
obtain
\[ -\frac{1}{1} = 0 + \frac{0^3}{3} + C, \]
whence C = −1 and
\[ -\frac{1}{y} = x + \frac{x^3}{3} - 1. \]
Finally, in this particular example, one can easily make y the subject to obtain the explicit
solution
\[ y = \frac{-1}{x + \frac{x^3}{3} - 1} = \frac{-3}{3x + x^3 - 3}, \]
thus completing the problem.
Remark 3.3.2. Equation (3.4) makes sense within the context of integration (see equation (3.5)).
The fact that this kind of symbolic manipulation with dy and dx ‘works’ can be traced back to the
chain rule, which is the basis for implicit differentiation. (To make this clear, note in the example
above that
\[ \frac{1}{y^2}\frac{dy}{dx} = 1 + x^2 \]
and so integrating both sides with respect to x gives
\[ \int \frac{1}{y^2}\frac{dy}{dx}\,dx = \int (1 + x^2)\,dx. \]
Hence
\[ -\frac{1}{y} = x + \frac{x^3}{3} + C, \tag{3.6} \]
which may be easily verified by (implicitly) differentiating both sides of (3.6) with respect to x.) Students
should not think that they can manipulate the symbols dy and dx in other ways and still obtain
valid results.
In general, a separable ODE is one that can be written in the form
\[ \frac{dy}{dx} = \frac{g(x)}{h(y)}. \tag{3.7} \]
To solve this equation, write
\[ h(y)\,dy = g(x)\,dx, \]
and then integrate both sides to obtain an implicit solution
\[ H(y) = G(x) + C, \]
where H and G are antiderivatives of h and g respectively, and C is the constant of integration.
Whenever possible, isolate y on the left-hand side to find
the explicit solution. If initial conditions are given, then the constant C can be determined.
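As a sanity check on the method, one can compare an explicit solution found by separation of variables with a direct numerical integration of the ODE. The sketch below does this for the IVP of Example 3.3.1. It is only an illustration: the Runge-Kutta routine, the step count and the end point x = 0.5 are our own choices (the end point is kept away from the point near x ≈ 0.82 where the denominator of the explicit solution vanishes).

```python
def f(x, y):
    """Right-hand side of dy/dx = y^2 (1 + x^2) from Example 3.3.1."""
    return y * y * (1.0 + x * x)

def exact(x):
    """Explicit solution obtained by separating variables with y(0) = 1."""
    return -3.0 / (3.0 * x + x ** 3 - 3.0)

def rk4(f, x0, y0, x_end, steps):
    """Classical fourth-order Runge-Kutta integration from x0 to x_end."""
    h = (x_end - x0) / steps
    x, y = x0, y0
    for _ in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2, y + h * k1 / 2)
        k3 = f(x + h / 2, y + h * k2 / 2)
        k4 = f(x + h, y + h * k3)
        y += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        x += h
    return y

x_end = 0.5                       # illustrative end point
numeric = rk4(f, 0.0, 1.0, x_end, 1000)
print(numeric, exact(x_end))
```

The two printed values should agree to many decimal places, which supports (but does not prove) the algebra.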
Example 3.3.3. Solve the equation
\[ \sinh y \cos^2 x\,\frac{dy}{dx} = \tan x + 4. \]
Solution. We separate the variables to obtain
\[ \sinh y\,dy = \frac{\tan x + 4}{\cos^2 x}\,dx. \]
Simplification followed by integration gives
\[ \sinh y\,dy = (\tan x \sec^2 x + 4\sec^2 x)\,dx \]
\[ \int \sinh y\,dy = \int (\tan x \sec^2 x + 4\sec^2 x)\,dx \]
\[ \cosh y = \tfrac{1}{2}\tan^2 x + 4\tan x + C, \tag{3.8} \]
where C ∈ R. In this case it is best to leave the solution in implicit form (3.8), since cosh is not a
one-to-one function.
We end with an application of this method to the real world.
Example 3.3.4 (Newton’s law of cooling). (a) Newton’s law of cooling states that the rate
of heat loss of a body is proportional to the difference in temperatures between the body
and its surroundings. Set up an ODE to model this law and solve it.
(b) A hot object is placed into a room of temperature 20°C. Unfortunately the object is too
hot for the thermometer to measure its initial temperature. However, after 6 minutes, the
temperature of the object was measured as 80°C, and after 8 minutes as 50°C. What
was the original temperature of the object?
Solution. (a) Suppose that
• T is the temperature of the object at time t,
• A is the ambient temperature (that is, the temperature of the surroundings), and
• k is the constant of proportionality.
Newton’s law of cooling implies that
\[ \frac{dT}{dt} = k(T - A). \]
To solve this equation, we separate variables (assuming that T > A, as is the case for a hot object):
\[ \frac{1}{T - A}\,dT = k\,dt \]
\[ \int \frac{1}{T - A}\,dT = \int k\,dt \]
\[ \ln(T - A) = kt + C, \]
where C is the constant of integration. By taking exponentials of both sides we obtain
\[ T - A = e^{kt + C}. \]
Hence
\[ T = A + Ke^{kt}, \tag{3.9} \]
where \(K = e^C > 0\).
(b) We have
\[ A = 20, \quad T(6) = 80, \quad \text{and} \quad T(8) = 50. \]
Hence (3.9) implies that
\[ \begin{cases} 80 = 20 + Ke^{6k} \\ 50 = 20 + Ke^{8k}, \end{cases} \]
which can be solved to give
\[ k = -\frac{\ln 2}{2} \quad \text{and} \quad K = 480. \]
Hence
\[ T(0) = 20 + 480e^0 = 500. \]
So the initial temperature of the object was 500°C.
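The arithmetic in part (b) can be reproduced in a few lines. This is a sketch only; the variable names are ours. Dividing the two measurement equations eliminates K and gives k, after which K follows from either measurement.

```python
import math

A = 20.0                 # ambient temperature (deg C)
t1, T1 = 6.0, 80.0       # first measurement (minutes, deg C)
t2, T2 = 8.0, 50.0       # second measurement (minutes, deg C)

# From T = A + K e^{kt}: (T2 - A) / (T1 - A) = e^{k (t2 - t1)}
k = math.log((T2 - A) / (T1 - A)) / (t2 - t1)
K = (T1 - A) * math.exp(-k * t1)

def temperature(t):
    """Temperature predicted by (3.9) with the fitted constants."""
    return A + K * math.exp(k * t)

print(k, K, temperature(0.0))
```

The printed values agree, up to rounding, with k = −(ln 2)/2, K = 480 and T(0) = 500.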
3.4 First order linear ODEs
(Ref: SH10 §9.1)
A first order linear ODE can be written in the form
\[ \frac{dy}{dx} + f(x)y = g(x), \tag{3.10} \]
where f and g are given functions of a single variable x. The ODE is called linear since there are
no non-linear terms (such as \(y^2\), \(\sin y\) or \(\sqrt{y'}\)) involving y or its derivative y′.
A very slick method for solving first order linear ODEs is summarised in the steps below.
1. Write the ODE in the form (3.10).
2. Calculate \(e^{\int f(x)\,dx}\) (ignoring the constant of integration). We denote this by h(x) and call
it the integrating factor.
3. Multiply (3.10) by the integrating factor h(x) to obtain
\[ h(x)\frac{dy}{dx} + h(x)f(x)y = g(x)h(x). \]
Since h′(x) = f(x)h(x), the product rule for differentiation shows that the left-hand side
can now be rewritten so that
\[ \frac{d}{dx}\bigl(h(x)\,y\bigr) = g(x)h(x) \]
(this is easily seen in the examples that follow).
4. Integrate both sides and then rearrange for y to solve the ODE. Don’t forget the constant
of integration!
Example 3.4.1. Solve
\[ \frac{dy}{dx} + 3y = e^{-x}. \]
Solution. The ODE is already in the form (3.10) with f(x) equal to 3. The integrating factor h is
therefore given by
\[ h(x) = e^{\int 3\,dx} = e^{3x}. \]
Multiplying the ODE by the integrating factor \(e^{3x}\) gives
\[ e^{3x}\frac{dy}{dx} + 3e^{3x}y = e^{2x}. \]
By the product rule, we can contract the left-hand side to obtain
\[ \frac{d}{dx}\bigl(e^{3x}y\bigr) = e^{2x} \]
(this step is easy to check by working backwards). Integrating gives
\[ e^{3x}y = \tfrac{1}{2}e^{2x} + C, \]
where C ∈ R. Now divide by \(e^{3x}\) to obtain the explicit solution
\[ y = \tfrac{1}{2}e^{-x} + Ce^{-3x}, \]
where C ∈ R. (Note that if the constant of integration is accidentally omitted then we lose half the
solution!)
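A quick numerical check confirms that every member of this family satisfies the ODE, whatever the value of C. The sketch below is illustrative only: the sample points, the sample values of C and the step h are arbitrary choices of ours, and the derivative is estimated by a central difference rather than computed exactly.

```python
import math

def y(x, C):
    """General solution y = (1/2) e^{-x} + C e^{-3x} from Example 3.4.1."""
    return 0.5 * math.exp(-x) + C * math.exp(-3.0 * x)

def residual(x, C, h=1e-5):
    """y' + 3y - e^{-x}, with y' estimated by a central difference."""
    dydx = (y(x + h, C) - y(x - h, C)) / (2 * h)
    return dydx + 3.0 * y(x, C) - math.exp(-x)

worst = max(abs(residual(x, C))
            for x in (-1.0, 0.0, 0.5, 2.0)
            for C in (-2.0, 0.0, 1.0, 5.0))
print(worst)
```

The largest residual is tiny (of the size expected from the finite-difference approximation), including for C = 0 and for nonzero C.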
Example 3.4.2. Solve the IVP
\[ (x - 1)^3\frac{dy}{dx} + 4(x - 1)^2 y = x + 1, \qquad y(0) = 2. \tag{3.11} \]
Solution. First rewrite the ODE into the standard form (3.10) to obtain
\[ \frac{dy}{dx} + 4(x - 1)^{-1}y = \frac{x + 1}{(x - 1)^3}. \tag{3.12} \]
The integrating factor h is given by
\[ h(x) = e^{\int 4(x - 1)^{-1}\,dx} = e^{4\ln|x - 1|} = e^{\ln (x - 1)^4} = (x - 1)^4. \]
(Note that it is important to simplify h(x) before proceeding with the method.)
Multiplying (3.12) by the integrating factor gives
\[ (x - 1)^4\frac{dy}{dx} + 4(x - 1)^3 y = x^2 - 1, \]
from which we obtain
\[ \frac{d}{dx}\bigl((x - 1)^4\,y\bigr) = x^2 - 1 \]
by the product rule. Integrating gives
\[ (x - 1)^4 y = \tfrac{1}{3}x^3 - x + C, \]
where C ∈ R. To evaluate C, we impose the initial condition y(0) = 2 and find that
\[ (0 - 1)^4 \cdot 2 = \tfrac{1}{3}\cdot 0^3 - 0 + C. \]
Hence C = 2. Therefore the solution of the IVP is given by
\[ y = \frac{\tfrac{1}{3}x^3 - x + 2}{(x - 1)^4} = \frac{x^3 - 3x + 6}{3(x - 1)^4}. \]
(Note that the solution is valid when x ∈ (−∞, 1) or x ∈ (1,∞) but not when x = 1. In fact, it is
not hard to see from (3.11) that there is no real-valued function y satisfying the ODE with 1 in its
domain.)
The following example shows an application of a first order linear ODE to a real world problem.
Example 3.4.3. An investor has a salary of $60,000 per year which is expected to increase at a
rate of $1000 per annum. Suppose that an initial deposit of $1000 is invested in a program that
pays 8% per annum, and that the investor deposits 5% of their salary each year. Find the amount
invested after t years.
Solution. We will approximate the situation by assuming that interest is calculated continuously
and that deposits are made continuously.
Let y(t) denote the dollars invested after t years. Then
\[ \frac{dy}{dt} = 0.08y + 0.05(60000 + 1000t) \]
(rate of increase of investment = 8% of investment + 5% of salary).
This is a first order linear ODE. If we rewrite the ODE as
\[ \frac{dy}{dt} - 0.08y = 0.05(60000 + 1000t), \]
then we see that the integrating factor h is given by
\[ h(t) = e^{\int -0.08\,dt} = e^{-0.08t}. \]
After multiplying the ODE by the integrating factor and contracting the left-hand side by the
product rule, one obtains
\[ \frac{d}{dt}\bigl(e^{-0.08t}y\bigr) = 0.05(60000 + 1000t)e^{-0.08t}. \]
Integration (where we use integration by parts for the right-hand side) and rearrangement gives
\[ y(t) = -625t - 45312.5 + Ce^{0.08t}, \]
where C is the constant of integration. Imposing the initial condition y(0) = 1000 yields the final
solution
\[ y(t) = 46312.5\,e^{0.08t} - 625t - 45312.5, \]
where t ≥ 0.
To illustrate, note that after 10 years the investment totals about y(10) ≈ 51507.86 dollars.
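The constants in this solution are easy to get wrong by hand, so it is worth recomputing them. The sketch below is our own illustration and uses a slightly different route from the integrating factor method: it looks for a particular solution of the form mt + c, adds the homogeneous solution Ce^{0.08t}, and fixes C from y(0) = 1000. The names r, a, b, m and c are assumptions introduced for this sketch.

```python
import math

r = 0.08                  # continuous interest rate
a, b = 3000.0, 50.0       # deposits per year: 0.05 * (60000 + 1000 t) = a + b t
y0 = 1000.0               # initial deposit

# Try a particular solution y_p = m t + c in y' - r y = a + b t:
#   m - r (m t + c) = a + b t   =>   m = -b / r,   c = (m - a) / r
m = -b / r
c = (m - a) / r
C = y0 - c                # from y(0) = c + C = y0

def y(t):
    """Amount invested after t years."""
    return C * math.exp(r * t) + m * t + c

print(m, c, C, round(y(10.0), 2))
```

The value of y(10) printed here is about 51507.86, in agreement with the figure quoted above.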
3.5 Exact ODEs
(Ref: SH10 §19.2)
In this section we examine another approach to solving (some) first order ODEs.
To begin, suppose that H is a function of two variables x and y satisfying the equation
H(x, y) = C,
where C is a real constant. If we consider y as a function of x and differentiate both sides with
respect to x, then the chain rule gives
\[ \frac{\partial H}{\partial x} + \frac{\partial H}{\partial y}\frac{dy}{dx} = 0. \]
If \(\frac{\partial H}{\partial x}\) and \(\frac{\partial H}{\partial y}\) are denoted by F and G respectively, then we obtain the differential equation
\[ F(x, y) + G(x, y)\frac{dy}{dx} = 0. \tag{3.13} \]
Of course, if F and G are defined as above, then H(x, y) = C is a solution to (3.13).
Conversely, suppose that we want to solve a differential equation of the form (3.13). The above
discussion shows that if there exists a function H of two variables such that \(F = \frac{\partial H}{\partial x}\) and \(G = \frac{\partial H}{\partial y}\)
then the solution is given by H(x, y) = C, where C ∈ R. The difficulty is, this condition on F
and G is not so easy to verify. Fortunately, there is an easier condition. Recall, by the theorem on
mixed partial derivatives, that if H is a ‘nice’ function then
\[ \frac{\partial^2 H}{\partial y\,\partial x} = \frac{\partial^2 H}{\partial x\,\partial y}, \]
or in other words,
\[ \frac{\partial F}{\partial y} = \frac{\partial G}{\partial x}. \]
This second condition on F and G is known as the condition for exactness and is much easier to
verify.
We summarise these observations in the next definition and theorem.
Definition 3.5.1. An ordinary differential equation of the form
\[ F(x, y) + G(x, y)\frac{dy}{dx} = 0 \]
is called exact if
\[ \frac{\partial F}{\partial y} = \frac{\partial G}{\partial x}. \]
Theorem 3.5.2. Suppose that an ordinary differential equation of the form (3.13) is exact. Then
the solution to (3.13) is given by H(x, y) = C, where C is a constant and where H is a function
satisfying the equations
\[ \frac{\partial H}{\partial x} = F \quad \text{and} \quad \frac{\partial H}{\partial y} = G. \]
Remark 3.5.3. The differential equation (3.13) is equivalent to
\[ \frac{dy}{dx} = -\frac{F(x, y)}{G(x, y)} \]
or
\[ F(x, y)\,dx + G(x, y)\,dy = 0. \]
The left-hand side of the second expression is an example of a differential form.
Example 3.5.4. Show that the differential equation
\[ \frac{dy}{dx} = -\frac{2x + y + 1}{2y + x + 1} \]
is exact, and hence find its solution.
Solution. First we rewrite the differential equation to obtain
\[ (2x + y + 1) + (2y + x + 1)\frac{dy}{dx} = 0. \]
Write F = 2x + y + 1 and G = 2y + x + 1. Then
\[ \frac{\partial F}{\partial y} = 1 = \frac{\partial G}{\partial x}, \]
so the differential equation is exact. Hence there exists a function H satisfying
\[ \frac{\partial H}{\partial x} = F(x, y) = 2x + y + 1, \tag{3.14} \]
\[ \frac{\partial H}{\partial y} = G(x, y) = 2y + x + 1. \tag{3.15} \]
To find H, we begin by integrating (3.14) with respect to x (and treating y as a constant), so that
\[ H(x, y) = x^2 + xy + x + C_1(y), \tag{3.16} \]
where the ‘constant of integration’ C1(y) is a function of y. Similarly, integrating (3.15) with
respect to y (and treating x as a constant) gives
\[ H(x, y) = y^2 + xy + y + C_2(x), \tag{3.17} \]
where the ‘constant of integration’ C2(x) is a function of x. A comparison of (3.16) and (3.17)
shows that
\[ H(x, y) = x^2 + xy + y^2 + x + y. \tag{3.18} \]
Hence the solution to the differential equation is given by
\[ x^2 + xy + y^2 + x + y = C, \tag{3.19} \]
where C is a real constant.
Note that, since (3.19) is a quadratic in y, we could rewrite the solution explicitly for y to give
\[ y = \frac{-(x + 1) \pm \sqrt{(x + 1)^2 - 4(x^2 + x - C)}}{2}, \qquad C \in \mathbb{R} \]
(this gives two functions for every value of C). However, in this case the solution is probably better
left in implicit form, as in (3.19).
Remark 3.5.5. Technically, (3.18) should read
\[ H(x, y) = x^2 + xy + y^2 + x + y + K, \]
where K is an arbitrary constant. However, then the solution to the ODE is given by
\[ x^2 + xy + y^2 + x + y + K = C_0, \]
where C0 is yet another constant. By combining the constants K and C0 on the right-hand side,
this solution is equivalent to (3.19). Hence it is customary to ignore the constant K.
The next example follows the same overall strategy, but illustrates a slightly different approach
to finding the function H.
Example 3.5.6. Solve the differential equation
\[ 2x\sin(x^2 + y^2) + \bigl(2y\sin(x^2 + y^2) + 2y\bigr)\frac{dy}{dx} = 0. \]
Solution. Write \(F(x, y) = 2x\sin(x^2 + y^2)\) and \(G(x, y) = 2y\sin(x^2 + y^2) + 2y\). It is easily seen that
\[ \frac{\partial F}{\partial y} = 4xy\cos(x^2 + y^2) = \frac{\partial G}{\partial x}, \]
and so the equation is exact. Hence we look for a function H satisfying
\[ \frac{\partial H}{\partial x} = F(x, y) = 2x\sin(x^2 + y^2), \tag{3.20} \]
\[ \frac{\partial H}{\partial y} = G(x, y) = 2y\sin(x^2 + y^2) + 2y. \tag{3.21} \]
As with the previous example, we integrate (3.20) with respect to x (treating y as a constant)
to obtain
\[ H(x, y) = -\cos(x^2 + y^2) + C_1(y), \tag{3.22} \]
where the ‘constant of integration’ C1(y) is a function of y. So now we only need to determine
C1(y). To do so, we differentiate (3.22) with respect to y (treating x as a constant) to obtain
\[ \frac{\partial H}{\partial y} = 2y\sin(x^2 + y^2) + C_1'(y). \]
Comparing this with (3.21) shows that \(C_1'(y) = 2y\), whence \(C_1(y) = y^2\). (Here we omit the constant
of integration for reasons given in Remark 3.5.5.) Hence
\[ H(x, y) = -\cos(x^2 + y^2) + y^2 \]
and the solution to the differential equation is given by
\[ -\cos(x^2 + y^2) + y^2 = C, \]
where C ∈ R. (Note that in this example it is not possible to give an explicit expression for y in
terms of x.)
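Both the exactness condition and the function H found in this example can be spot-checked numerically. The following sketch is an illustration under our own assumptions: the helper functions partial_x and partial_y, the step h and the sample points are not from the notes, and partial derivatives are approximated by central differences.

```python
import math

def F(x, y):
    return 2 * x * math.sin(x * x + y * y)

def G(x, y):
    return 2 * y * math.sin(x * x + y * y) + 2 * y

def H(x, y):
    """Candidate function H from Example 3.5.6."""
    return -math.cos(x * x + y * y) + y * y

def partial_x(func, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative in x."""
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def partial_y(func, x, y, h=1e-6):
    """Central-difference estimate of the partial derivative in y."""
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

points = [(0.3, -0.7), (1.0, 0.5), (-1.2, 0.8)]   # arbitrary sample points
exactness_gap = max(abs(partial_y(F, x, y) - partial_x(G, x, y))
                    for x, y in points)
potential_gap = max(
    max(abs(partial_x(H, x, y) - F(x, y)), abs(partial_y(H, x, y) - G(x, y)))
    for x, y in points
)
print(exactness_gap, potential_gap)
```

Both gaps are zero up to finite-difference error, consistent with the equation being exact and with H having partial derivatives F and G.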
The final example gives an interesting variation on this theme.
Example 3.5.7. Solve the differential equation
\[ (e^x - \sin y)\,dx + \cos y\,dy = 0. \tag{3.23} \]
Solution. Suppose that \(F(x, y) = e^x - \sin y\) and \(G(x, y) = \cos y\). Since
\[ \frac{\partial F}{\partial y} = -\cos y \quad \text{and} \quad \frac{\partial G}{\partial x} = 0, \]
the differential equation is not exact. What happens if we use the method of the last two examples
regardless?
Suppose that there is a function H such that
\[ \frac{\partial H}{\partial x} = F(x, y) = e^x - \sin y, \tag{3.24} \]
\[ \frac{\partial H}{\partial y} = G(x, y) = \cos y. \tag{3.25} \]
Integrating (3.24) with respect to x gives
\[ H(x, y) = e^x - x\sin y + C_1(y), \]
where C1(y) is independent of x. Partial differentiation with respect to y yields
\[ \frac{\partial H}{\partial y} = -x\cos y + C_1'(y). \]
If we compare this with (3.25), then we conclude that \(C_1'(y) = (1 + x)\cos y\), which contradicts the
fact that C1(y) is independent of x. Hence no such function H exists.
Fortunately, not all is lost. If we multiply (3.23) through by the function \(e^{-x}\), then we obtain
\[ (1 - e^{-x}\sin y)\,dx + e^{-x}\cos y\,dy = 0. \tag{3.26} \]
Since the integrating factor \(e^{-x}\) is never zero, solutions to (3.26) will also be solutions to (3.23).
Moreover, it is easily verified that (3.26) is exact. Hence we can now (successfully) use the method
for solving exact differential equations to obtain the solution
\[ x + e^{-x}\sin y = C, \]
where C ∈ R. Details are left to the reader.
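For readers who wish to check their working, one possible way of filling in these details is sketched below. It follows the approach of Example 3.5.6 and omits the additive constant in H for the reasons given in Remark 3.5.5.

```latex
% With F = 1 - e^{-x}\sin y and G = e^{-x}\cos y taken from (3.26):
\begin{align*}
\frac{\partial F}{\partial y} &= -e^{-x}\cos y = \frac{\partial G}{\partial x}
  && \text{(so (3.26) is exact)} \\
H(x, y) &= \int (1 - e^{-x}\sin y)\,dx = x + e^{-x}\sin y + C_1(y) \\
\frac{\partial H}{\partial y} &= e^{-x}\cos y + C_1'(y) = e^{-x}\cos y
  && \text{(so } C_1'(y) = 0\text{)} \\
H(x, y) &= x + e^{-x}\sin y .
\end{align*}
```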
Remark 3.5.8. As illustrated in the previous example, if an ODE of the form
F (x, y) dx +G(x, y) dy = 0
is not exact, then it may be possible to transform it into an exact ODE by multiplying through by
a suitable function. In general, finding such a function is difficult and lies beyond the scope of this
course.
3.6 Solving ODEs by using a change of variable [X]
(Ref: SH10 §19.1)
This section is for MATH1241 students only. We have seen in previous sections how to solve
separable, linear and exact first order ODEs. While not all first order ODEs are among these types,
some can be transformed into one of these types by a suitable change of variables. We illustrate
the principle with two examples.
Example 3.6.1. Use the substitution y(x) = x · v(x) to solve the differential equation
\[ \frac{dy}{dx} = \frac{xy - y^2}{x^2}. \tag{3.27} \]
Solution. The idea is to transform (3.27) into a separable ODE involving v and x. Once the general
solution for v is found, then it is easy to write down the solution for y.
Using the substitution y(x) = x · v(x) and the product rule for differentiation, we see that
\[ \frac{dy}{dx} = \frac{d}{dx}(xv) = v\frac{dx}{dx} + x\frac{dv}{dx} = v + x\frac{dv}{dx}. \]
Hence (3.27) becomes
\[ v + x\frac{dv}{dx} = \frac{x(xv) - (xv)^2}{x^2}. \]
If we simplify the right-hand side then
\[ v + x\frac{dv}{dx} = v - v^2, \]
and hence we obtain the separable ODE
\[ -\frac{dv}{v^2} = \frac{dx}{x}. \]
Integrating both sides gives
\[ \frac{1}{v} = \ln|x| + C \]
which implies that
\[ v = \frac{1}{\ln|x| + C} \]
where C ∈ R. Now \(v = \frac{y}{x}\) and so
\[ y = \frac{x}{\ln|x| + C} \]
gives the general solution to (3.27).
Example 3.6.2. Solve the equation
\[ \frac{dy}{dt} + 2y + y^2t^2e^{2t} = 0 \tag{3.28} \]
by using the substitution z = 1/y.
Solution. Note that (3.28) is not a first order linear ODE because of the term involving \(y^2\). In this
example we will see that the nonlinear equation in y and t becomes a linear ODE in z and t under
the transformation z = 1/y.
To make use of the substitution z = 1/y, we need to express \(\frac{dy}{dt}\) in terms of z. To do so,
differentiate both sides of the equation
\[ y = \frac{1}{z} \]
with respect to t to obtain
\[ \frac{dy}{dt} = -\frac{1}{z^2}\frac{dz}{dt}. \]
With this substitution, (3.28) becomes
\[ -\frac{1}{z^2}\frac{dz}{dt} + \frac{2}{z} + \frac{t^2e^{2t}}{z^2} = 0. \]
Rearranging gives
\[ \frac{dz}{dt} - 2z = t^2e^{2t}, \]
which is a first order linear ODE in z. As usual, multiply through by the integrating factor \(e^{-2t}\) to
obtain
\[ \frac{d}{dt}\bigl(e^{-2t}z\bigr) = t^2. \]
Integrating gives
\[ e^{-2t}z = \frac{t^3}{3} + C_0, \]
whereupon
\[ z = e^{2t}\Bigl(\frac{t^3}{3} + C_0\Bigr), \]
where C0 ∈ R. Since y = 1/z, the solution to (3.28) is given by
\[ y = \frac{3}{e^{2t}(t^3 + C)}, \]
where C = 3C0 ∈ R.
3.7 Modelling with first order ODEs
(Ref: SH10 §9.1, 9.2)
Many real-life problems can be analysed and solved by attempting to convert them into
mathematics. In doing so, a number of assumptions have to be made and a theoretical framework set
up which attempts to reflect what is happening in the real world. Such a framework is called a
mathematical model. The reliability of that model can be judged by how well it predicts what
actually happens in the real world.
To construct a mathematical model, one should
1. describe accurately the data we have,
2. decide exactly what information we want to extract from the model,
3. decide which variables in the model are dependent and which are independent, and
4. describe how the dependent variables change as the independent ones vary (which may
lead to a differential equation).
Examples of mathematical modelling with differential equations have already been encountered (see
Examples 3.3.4 and 3.4.3). The following two subsections provide further examples and discussion.
3.7.1 Mixing problems
In this subsection we illustrate how to construct a mathematical model that leads to a differential
equation.
Example 3.7.1. A martini drink is, in essence, a mixture of the two liquids gin and vermouth.
James Blond insists that his martinis be prepared as follows. Initially, 40 cc of gin are placed in a
large container. Then gin is poured into the container at a rate of 2 cc/sec and at the same time
vermouth is poured in at a rate of 6 cc/sec. The mixture is constantly shaken (not stirred) and
flows out at a rate of 4 cc/sec.
(a) Find an expression for the volume of vermouth in the container t seconds after the pouring
commences.
(b) James likes his martini to have roughly two parts gin to three of vermouth. How many
seconds should elapse before he stops pouring and inserts a cocktail glass in the outflow
from the container?
Solution. (a) To begin, it is recommended that we place the relevant information on a diagram.
[Diagram: a container with an initial volume of 40 cc of gin; gin inflow at 2 cc/sec, vermouth inflow at 6 cc/sec, and martini outflow at 4 cc/sec.]
Next, it is important to identify what we want to find. In this case, we want a formula for the
volume of vermouth. So let V(t) denote the volume of vermouth (in cc) in the container at time t.
Now we write down as much information about V as we can. We know that V(0) = 0. The
other information given tells us how V changes with time. In particular,
\[ \frac{dV}{dt} = \text{rate of change of } V = (\text{rate of inflow}) - (\text{rate of outflow}). \tag{3.29} \]
Now the rate of inflow of vermouth is 6 cc/sec. To calculate the rate of outflow, we note that
the total volume of liquid in the container at time t is given by
\[ 40 + 2t + 6t - 4t = 40 + 4t. \]
Hence the proportion by volume of vermouth in the container at time t is
\[ \frac{V(t)}{40 + 4t}. \tag{3.30} \]
Since the rate of outflow of liquid is 4 cc/sec, it follows that the rate of outflow of vermouth is
\[ \frac{V(t)}{40 + 4t} \times 4 \ \text{cc/sec}. \]
Following from (3.29), we obtain the initial value problem
\[ \frac{dV}{dt} = 6 - \frac{V(t)}{10 + t}, \qquad V(0) = 0. \tag{3.31} \]
Finally, we solve the IVP. The ODE is a first order linear equation, whose solution (when V(0) = 0)
is given by
\[ V(t) = \frac{3t^2 + 60t}{10 + t}. \tag{3.32} \]
The details for finding this solution are left to the reader as an exercise.
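One way the exercise might be carried out, using the integrating factor method of Section 3.4, is sketched below. It is offered as a check on the reader's own working rather than as the only route.

```latex
\begin{align*}
\frac{dV}{dt} + \frac{1}{10 + t}\,V &= 6
  && \text{(standard form (3.10))} \\
h(t) &= e^{\int \frac{1}{10 + t}\,dt} = e^{\ln(10 + t)} = 10 + t
  && \text{(integrating factor, } t \ge 0\text{)} \\
\frac{d}{dt}\bigl((10 + t)\,V\bigr) &= 6(10 + t) = 60 + 6t \\
(10 + t)\,V &= 60t + 3t^2 + C \\
V(0) = 0 &\implies C = 0, \qquad V(t) = \frac{3t^2 + 60t}{10 + t}.
\end{align*}
```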
(b) The liquid will be two parts gin to three of vermouth exactly when three-fifths of the liquid
is vermouth. Hence, using (3.30), we require that
\[ \frac{V(t)}{40 + 4t} = \frac{3}{5}. \]
But V(t) is given by (3.32), so we require that
\[ \frac{3t^2 + 60t}{4(10 + t)^2} = \frac{3}{5}. \]
By rearranging this equation, we obtain the quadratic equation
\[ t^2 + 20t - 400 = 0. \]
The quadratic formula shows that
\[ t = \sqrt{500} - 10 \approx 12.36, \]
where we have chosen the positive solution to the quadratic equation. Hence James should insert
the cocktail glass approximately 12.36 seconds after the mixing process begins.
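The answer to part (b) can be confirmed by substituting the root back into the proportion (3.30). The sketch below is illustrative; the function names are our own.

```python
import math

def V(t):
    """Volume of vermouth (cc) at time t, from (3.32)."""
    return (3 * t * t + 60 * t) / (10 + t)

def vermouth_fraction(t):
    """Proportion of vermouth by volume, from (3.30)."""
    return V(t) / (40 + 4 * t)

# Positive root of t^2 + 20 t - 400 = 0 via the quadratic formula.
t_star = (-20 + math.sqrt(20 ** 2 + 4 * 400)) / 2
print(round(t_star, 2), vermouth_fraction(t_star))
```

At this time the vermouth fraction equals three-fifths up to rounding error, as required.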
3.7.2 Population models
In this subsection we compare three different mathematical models for population growth.
Suppose that the city of Mathopolis initially has a population of 3,000,000 inhabitants and
(initially) grows at a rate of 2% per annum. Can we predict what the population of the city will be
in 10, 20 or 100 years time? In the next three examples, we examine different population models,
discuss their accuracy and see what growth forecast each model gives for Mathopolis.
Example 3.7.2 (Population model 1). In this model, we assume that the growth rate of the
population remains constant. That is, we assume that the rate of change of population is proportional
to the population, where the constant of proportionality r is the growth rate. In other words, we
have the IVP
\[ \frac{dP}{dt} = rP, \qquad P(0) = P_0, \]
where P is the population at time t, r is the growth rate and P0 is the initial population. Solving
this (separable) ODE gives the solution
\[ P(t) = P_0e^{rt}. \]
(This model goes back to Thomas Malthus’ book An Essay on the Principle of Population, published
in 1798.)
Application. For Mathopolis, P0 = 3,000,000 and r = 0.02. Thus the population P(t) of the city
at time t is given by
\[ P(t) = 3000000\,e^{0.02t}. \]
The table in Figure 3.1 shows the predicted population when t is 10, 20 and 100. The graph of P
against t is also shown.
Model     Initially      After 10 years   After 20 years   After 100 years
Model 1   3.00 million   3.66 million     4.48 million     22.17 million
Model 2   3.00 million   3.56 million     4.04 million     6.11 million
Model 3   3.00 million   3.61 million     4.21 million     6.73 million
(a) Table showing population forecast according to various models
[Graph: population (millions) plotted against time (years, from 0 to 100) for Models 1, 2 and 3.]
(b) Graphs showing how models compare
Figure 3.1: Population growth for a city as projected by different models.
Criticisms of this model.
1. The model predicts that population will grow indefinitely. This ignores the fact that the
resources and space needed to support such a population are finite.
2. The model ignores external factors (such as disease, natural disasters and wars) that have
an effect on population size.
In light of the first criticism, we introduce a second model.
Example 3.7.3 (Population model 2). In this model the rate of population growth is not
proportional to the population. Instead, there is a critical population Pc which when exceeded causes the
population P to decrease; otherwise the population increases. We try the IVP
\[ \frac{dP}{dt} = k(P_c - P), \qquad P(0) = P_0, \tag{3.33} \]
where k is a positive constant. Hence the rate of change of P is positive if P < Pc and negative if
P > Pc.
To solve (3.33), we separate the variables and integrate:
\[ \int \frac{dP}{P_c - P} = \int k\,dt. \]
In the case when P < Pc we obtain
\[ -\ln(P_c - P) = kt + C \]
and so
\[ P = P_c - Ae^{-kt}, \]
where \(A = e^{-C}\). Now P = P0 when t = 0 and so A = Pc − P0. Hence
\[ P(t) = P_c - (P_c - P_0)e^{-kt}. \]
(The case when P > Pc gives the same solution, as can be easily verified by carefully working
through the details.) Note that P(t) → Pc as t → ∞.
Application. For Mathopolis, P0 = 3,000,000 and \(\frac{dP}{dt} = 0.02P_0 = 60{,}000\) when t = 0. Assume also
that Pc = 7,000,000. (That is, due to available land, resources and other factors, one expects the
city’s maximum sustainable population size is 7 million.) By considering the differential equation
(3.33) when t = 0, we conclude that
\[ 60000 = \frac{dP}{dt} = k(P_c - P_0) = k(7000000 - 3000000) \]
and hence that k = 0.015. Therefore the population P(t) of Mathopolis at time t is given by
\[ P(t) = 7000000 - 4000000\,e^{-0.015t}. \]
See Figure 3.1 for specific population projections under this model and a corresponding graph.
Criticisms of this model.
1. Observe that if P is close to 0 then \(\frac{dP}{dt} \approx kP_c\), which means that when the population is
very small the growth rate may be a large positive number. In fact, the rate of increase is
most rapid for tiny populations!
2. As with the first population model, external factors are ignored.
In light of the first criticism we introduce a third model.
Example 3.7.4 (Population model 3). To overcome the first criticism of the previous model, we
instead try the IVP
\[ \frac{dP}{dt} = kP(P_c - P), \qquad P(0) = P_0, \tag{3.34} \]
where k is a positive constant and Pc is the critical population. Note that \(\frac{dP}{dt}\) is small when P
is small. (This model was first published by Pierre Verhulst in 1838 after he had read Thomas
Malthus’ An Essay on the Principle of Population.)
To solve (3.34), separation of variables and integration gives
\[ \int \frac{dP}{P(P_c - P)} = \int k\,dt. \]
The integral on the left-hand side is evaluated by the method of partial fractions:
\[ \frac{1}{P_c}\int \Bigl(\frac{1}{P} + \frac{1}{P_c - P}\Bigr)\,dP = \int k\,dt. \]
From here it is not difficult to show that
\[ P(t) = \frac{P_cP_0}{P_0 + (P_c - P_0)e^{-kP_ct}} \]
(try this as an exercise). Note once again that P(t) → Pc as t → ∞. The resulting curve (see,
for example, the solid gray curve in Figure 3.1 (b)) is called a logistic curve. The initial stage of
growth is approximately exponential; then the growth slows and P approaches
Pc asymptotically.
Application. Once again, for Mathopolis we have the data
\[ P_0 = 3{,}000{,}000, \quad P_c = 7{,}000{,}000 \quad \text{and} \quad \frac{dP}{dt} = 0.02P_0 = 60{,}000 \text{ when } t = 0. \]
By considering the differential equation (3.34) when t = 0, we conclude that
\[ 60000 = \frac{dP}{dt} = kP_0(P_c - P_0) \]
and hence that \(k = 5 \times 10^{-9}\). Therefore the population P(t) of the city at time t is given by
\[ P(t) = \frac{21000000}{3 + 4e^{-0.035t}}. \]
See Figure 3.1 for specific population projections under this model.
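The forecasts in Figure 3.1 (a) can be reproduced directly from the three closed-form solutions. The sketch below is an illustration with names of our own choosing (model1, model2, model3, k2, k3); each constant k is computed from the requirement that the initial growth rate be 2% of P0, as in the applications above.

```python
import math

P0 = 3_000_000.0      # initial population
Pc = 7_000_000.0      # assumed critical population
r = 0.02              # initial growth rate

def model1(t):
    """Exponential growth, P = P0 e^{rt}."""
    return P0 * math.exp(r * t)

k2 = r * P0 / (Pc - P0)            # from 60000 = k (Pc - P0)
def model2(t):
    """Solution of dP/dt = k (Pc - P)."""
    return Pc - (Pc - P0) * math.exp(-k2 * t)

k3 = r * P0 / (P0 * (Pc - P0))     # from 60000 = k P0 (Pc - P0)
def model3(t):
    """Logistic solution of dP/dt = k P (Pc - P)."""
    return Pc * P0 / (P0 + (Pc - P0) * math.exp(-k3 * Pc * t))

for t in (0, 10, 20, 100):
    row = [round(m(t) / 1e6, 2) for m in (model1, model2, model3)]
    print(t, row)
```

Each printed row lists the populations (in millions) for Models 1, 2 and 3 at the given time, and should match the corresponding column of the table.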
Criticisms of this model. As with the other population models, external factors are ignored. In
particular, Pc may change due to factors such as technological advances or climate change.
Remark 3.7.5. Note that as the models introduced become more realistic, the mathematics needed
to solve the corresponding differential equations becomes more sophisticated. In the case of modelling fluid
flow (such as water flow in pipes or air flow around an aeroplane wing), the set of (partial)
differential equations which must be solved (known as the Navier–Stokes equations) raises difficulties
that are beyond the grasp of current mathematical knowledge. For example, it is as yet unknown
whether a solution to these equations always exists and whether (in the case that a solution exists)
it is a ‘smooth’ solution. The Clay Mathematics Institute has listed this problem as one of the
seven Millennium Prize Problems, with prize money of US$1,000,000 for a correct solution.
3.8 Second order linear ODEs with constant coefficients
(Ref: SH10 §9.3, §19.4)
In this section we consider a special class of second order equations, known as second order
linear ODEs with constant coefficients. An equation of this class has the form
\[ \frac{d^2y}{dx^2} + a\frac{dy}{dx} + by = f(x), \tag{3.35} \]
where a and b are real numbers. Such equations naturally arise in modelling wave mechanics and
prey-predator interaction. You will have seen in earlier calculus courses (and in Physics) that the
second order equation \(\frac{d^2x}{dt^2} + n^2x = 0\) is used to model simple harmonic motion.
3.8.1 The homogeneous case
To simplify our treatment of the second order ODE (3.35), we first look at the case when f(x) ≡ 0
(which means that f(x) = 0 for all x).
Definition 3.8.1. A second order linear ODE with constant coefficients is said to
be homogeneous if it is of the form
$$\frac{d^2y}{dx^2} + a\frac{dy}{dx} + by = 0, \tag{3.36}$$
where a and b are real numbers.
It turns out that we can always solve a homogeneous second order linear ODE with constant real
coefficients. The first important observation towards proving this fact is given by the following lemma.
Lemma 3.8.2. If $y_1$ and $y_2$ are two solutions to the differential equation (3.36) then any linear
combination $Ay_1 + By_2$, where $A$ and $B$ are real numbers, is also a solution to (3.36).
Proof. Suppose that $y_1$ and $y_2$ are two solutions to the differential equation (3.36). If $y = Ay_1 + By_2$,
where $A$ and $B$ are real numbers, then
$$\begin{aligned}
y'' + ay' + by &= (Ay_1 + By_2)'' + a(Ay_1 + By_2)' + b(Ay_1 + By_2) \\
&= Ay_1'' + By_2'' + Aay_1' + Bay_2' + Aby_1 + Bby_2 \\
&= A(y_1'' + ay_1' + by_1) + B(y_2'' + ay_2' + by_2) \\
&= 0 + 0 \quad \text{(since $y_1$ and $y_2$ are solutions)} \\
&= 0.
\end{aligned}$$
Hence $y$ is also a solution to (3.36).
Remark 3.8.3. It can also be shown that every homogeneous second order linear ODE of the form
(3.36) has at most two linearly independent solutions (this will be demonstrated in second year
linear algebra courses). Hence if $y_1$ and $y_2$ are two linearly independent solutions of (3.36) then
every solution $y$ to (3.36) is of the form $y = Ay_1 + By_2$. In this context, $y_1$ and $y_2$ are linearly
independent if and only if neither is a constant multiple of the other.
In view of the above lemma and remark, to find a complete solution to (3.36), one only needs
to find two linearly independent solutions. To look for a solution to (3.36), we try a function y that
does not change too much when differentiated. (The idea is that, upon substitution, the terms on
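the left-hand side need to cancel each other out to give zero.) If $y = e^{\lambda x}$, where $\lambda$ is a constant,
then $y' = \lambda e^{\lambda x}$ and $y'' = \lambda^2 e^{\lambda x}$. When these are substituted into (3.36) we obtain
$$\lambda^2 e^{\lambda x} + a\lambda e^{\lambda x} + b e^{\lambda x} = 0.$$
Dividing by $e^{\lambda x}$ (which is never zero) gives
$$\lambda^2 + a\lambda + b = 0. \tag{3.37}$$
Hence we have shown that
$y = e^{\lambda x}$ is a solution to (3.36) if and only if $\lambda$ is a root of (3.37).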
We give the quadratic equation (3.37) a special name.
Definition 3.8.4. The characteristic equation of the second order linear ODE
$$\frac{d^2y}{dx^2} + a\frac{dy}{dx} + by = 0$$
is given by
$$\lambda^2 + a\lambda + b = 0. \tag{3.38}$$
The following example illustrates what we have learnt so far.
Example 3.8.5. Solve the second order homogeneous linear ODE
$$\frac{d^2y}{dx^2} - 5\frac{dy}{dx} + 6y = 0. \tag{3.39}$$
Solution. The characteristic equation associated to (3.39) is given by
$$\lambda^2 - 5\lambda + 6 = 0.$$
By solving the quadratic equation we find that $\lambda = 2, 3$. Hence $y_1 = e^{2x}$ and $y_2 = e^{3x}$ are solutions
to (3.39). By Lemma 3.8.2 the linear combination $y$, given by
$$y = Ay_1 + By_2 = Ae^{2x} + Be^{3x},$$
where $A$ and $B$ are real numbers, is also a solution to (3.39). Moreover, since $y_1$ and $y_2$ are linearly
independent, every solution is of this form (see Remark 3.8.3).
Note that in the last example, the characteristic equation had two distinct real roots, thus
leading to two linearly independent solutions to the homogeneous ODE. In general, there are three
possibilities since all the coefficients we consider are real numbers. Either (i) the characteristic
equation has two distinct real roots, (ii) the characteristic equation has a repeated real root, or (iii)
the characteristic equation has two distinct complex roots (which are complex conjugates of each
other).
Case (i): The characteristic equation has two distinct real roots $\lambda_1$ and $\lambda_2$. Then, as seen
above, we obtain two linearly independent solutions $y_1 = e^{\lambda_1 x}$ and $y_2 = e^{\lambda_2 x}$. Hence the general
solution is given by
$$y = Ae^{\lambda_1 x} + Be^{\lambda_2 x},$$
where $A$ and $B$ are real numbers.
Case (ii): The characteristic equation has a repeated real root $\lambda_1$. Then one solution is given
by $y_1 = e^{\lambda_1 x}$. Is there another independent solution? It turns out in this case that $y_2 = xe^{\lambda_1 x}$
also solves the homogeneous equation (3.36), as can be easily verified by substituting this into the
left-hand side of (3.36) and using the fact that $2\lambda_1 + a = 0$ for a repeated root. Hence the general
solution is given by
$$y = Ae^{\lambda_1 x} + Bxe^{\lambda_1 x},$$
where $A$ and $B$ are real numbers.
Case (iii): The characteristic equation has two distinct complex roots $\alpha + \beta i$ and $\alpha - \beta i$, where
$\alpha$ and $\beta$ are real numbers and $\beta \neq 0$. Then we obtain two solutions
$$y_1 = e^{(\alpha + \beta i)x} \quad \text{and} \quad y_2 = e^{(\alpha - \beta i)x}.$$
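Hence $y$ is also a solution, where
$$y = Ce^{(\alpha + \beta i)x} + De^{(\alpha - \beta i)x},$$
and $C$ and $D$ are (complex) constants. The goal is to choose $C$ and $D$ so that the solution is real.
Now Euler’s formula
$$e^{i\theta} = \cos\theta + i\sin\theta$$
gives
$$\begin{aligned}
y &= Ce^{(\alpha + \beta i)x} + De^{(\alpha - \beta i)x} \\
&= e^{\alpha x}\left(Ce^{\beta i x} + De^{-\beta i x}\right) \\
&= e^{\alpha x}\left(C(\cos\beta x + i\sin\beta x) + D(\cos\beta x - i\sin\beta x)\right) \\
&= e^{\alpha x}\left((C + D)\cos\beta x + i(C - D)\sin\beta x\right) \\
&= e^{\alpha x}\left(A\cos\beta x + B\sin\beta x\right),
\end{aligned}$$
where $A = C + D$ and $B = i(C - D)$. If we choose $C$ and $D$ to be complex conjugates of each other,
but otherwise with arbitrary real and imaginary parts, then $A$ and $B$ will be real. It is easy to see
that $e^{\alpha x}\cos(\beta x)$ and $e^{\alpha x}\sin(\beta x)$ are independent solutions; hence the general solution in this case
is given by
$$y = e^{\alpha x}\left(A\cos\beta x + B\sin\beta x\right),$$
where $A$ and $B$ are real numbers.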
We summarise our findings in the following theorem.
Theorem 3.8.6. Consider the second order homogeneous ODE given by (3.36) and let $\lambda_1$ and $\lambda_2$
denote the roots of the corresponding characteristic equation (3.38).
(i) If $\lambda_1$ and $\lambda_2$ are different real numbers then the solution to (3.36) is given by
$$y = Ae^{\lambda_1 x} + Be^{\lambda_2 x},$$
where $A, B \in \mathbb{R}$.
(ii) If $\lambda_1 = \lambda_2$ then the solution to (3.36) is given by
$$y = Ae^{\lambda_1 x} + Bxe^{\lambda_1 x},$$
where $A, B \in \mathbb{R}$.
(iii) If $\lambda_1 = \alpha + \beta i$ and $\lambda_2 = \alpha - \beta i$, where $\alpha, \beta \in \mathbb{R}$ and $\beta \neq 0$, then the solution to (3.36) is
given by
$$y = e^{\alpha x}\left(A\cos(\beta x) + B\sin(\beta x)\right),$$
where $A, B \in \mathbb{R}$.
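The three cases of the theorem can be told apart by the sign of the discriminant $a^2 - 4b$ of the characteristic equation. The following Python sketch (not part of the original notes; the function name `classify` and its return format are illustrative assumptions) reports which case applies and the numbers needed to write down the solution.

```python
import math

def classify(a, b):
    """Classify y'' + a y' + b y = 0 by the roots of lambda^2 + a lambda + b = 0.

    Returns (case, values):
      "distinct real" -> (lambda1, lambda2)
      "repeated real" -> (lambda1,)
      "complex"       -> (alpha, beta), meaning roots alpha +/- beta*i
    """
    disc = a * a - 4 * b
    if disc > 0:
        r = math.sqrt(disc)
        return ("distinct real", ((-a - r) / 2, (-a + r) / 2))
    if disc == 0:
        # Exact comparison is fine for integer coefficients; with floating
        # point input a small tolerance would be safer.
        return ("repeated real", (-a / 2,))
    return ("complex", (-a / 2, math.sqrt(-disc) / 2))

if __name__ == "__main__":
    print(classify(-5, 6))   # the ODE of Example 3.8.5
    print(classify(4, 4))
    print(classify(-6, 25))
```

For instance, `classify(-5, 6)` should report the distinct real roots 2 and 3 found in Example 3.8.5.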
Example 3.8.7. Solve the following differential equations:
(a) $y'' - 6y' + 25y = 0$,
(b) $y'' + 4y' + 4y = 0$, with initial conditions $y(0) = 1$ and $y'(0) = 0$.
Solution. (a) The characteristic equation
$$\lambda^2 - 6\lambda + 25 = 0$$
has roots $3 + 4i$ and $3 - 4i$ (as determined by completing the square or using the quadratic formula).
Hence the solution $y$ to the ODE is given by
$$y = e^{3x}(A\cos 4x + B\sin 4x),$$
where $A$ and $B$ are real numbers.
(b) The left-hand side of the characteristic equation
$$\lambda^2 + 4\lambda + 4 = 0$$
is easily factorised to give
$$(\lambda + 2)^2 = 0.$$
Hence $-2$ is a repeated root and the solution $y$ to the IVP is given by
$$y = Ae^{-2x} + Bxe^{-2x}, \tag{3.40}$$
where the constants $A$ and $B$ are to be determined by imposing initial conditions. Differentiation
shows that
$$y' = -2Ae^{-2x} - 2Bxe^{-2x} + Be^{-2x}. \tag{3.41}$$
When $x = 0$, (3.40), (3.41) and the initial conditions imply that $A = 1$ and $B = 2$. Hence
$$y = e^{-2x} + 2xe^{-2x}$$
describes the solution $y$ to the IVP.
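The step of imposing initial conditions in part (b) can be sketched in Python as follows. This is an illustrative check rather than part of the original notes: the function names are hypothetical, and the derivatives inside `residual` were computed by hand from $y = (A + Bx)e^{\lambda x}$.

```python
import math

def repeated_root_ivp(lam, y0, dy0):
    """Return (A, B) so that y = A e^{lam x} + B x e^{lam x} has y(0)=y0, y'(0)=dy0."""
    A = y0              # y(0) = A
    B = dy0 - lam * A   # y'(0) = lam*A + B
    return A, B

def residual(x, lam, A, B, a, b):
    """Value of y'' + a y' + b y at x for y = (A + B x) e^{lam x}."""
    e = math.exp(lam * x)
    y = (A + B * x) * e
    dy = (lam * (A + B * x) + B) * e
    d2y = (lam * lam * (A + B * x) + 2 * lam * B) * e
    return d2y + a * dy + b * y

if __name__ == "__main__":
    A, B = repeated_root_ivp(-2, 1, 0)
    print(A, B)
    print([residual(x, -2, A, B, 4, 4) for x in (0.0, 0.5, 1.0, 3.0)])
```

The residuals should vanish up to rounding, which confirms that the function found really does satisfy the ODE as well as the initial conditions.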
3.8.2 The non-homogeneous case
We return to solving the second order linear ODE (3.35) in the case when f is not identically zero.
The main idea will be illustrated in the following example.
Example 3.8.8. Solve the equation
$$y'' - 5y' + 6y = 12x - 4. \tag{3.42}$$
Solution. Since the first and second derivatives of a polynomial are also polynomials, it seems likely
that at least one particular solution $y_P$ to the ODE is a polynomial. A little more thought shows
that if $y_P$ is a polynomial that satisfies (3.42), then the degree of $y_P$ is no greater than one. So we
look for a particular solution $y_P$ of the form
$$y_P = ax + b,$$
where $a$ and $b$ are real numbers whose values are to be determined. Now $y_P' = a$ and $y_P'' = 0$, so
substituting $y_P$ into (3.42) gives
$$0 - 5a + 6(ax + b) = 12x - 4.$$
By equating coefficients we find that $6a = 12$ and $-5a + 6b = -4$, so $a = 2$ and $b = 1$. Hence one
particular solution $y_P$ to (3.42) is given by $y_P = 2x + 1$.
Are there any other solutions? The answer to this question is ‘yes’. To find them, we consider
the associated homogeneous ODE
$$y'' - 5y' + 6y = 0, \tag{3.43}$$
whose solution $y_H$ is given by
$$y_H = Ae^{2x} + Be^{3x},$$
where $A$ and $B$ are real numbers (see Example 3.8.5). Now we will show that $y$, where $y = y_H + y_P$,
is a solution to (3.42):
$$\begin{aligned}
y'' - 5y' + 6y &= (y_H + y_P)'' - 5(y_H + y_P)' + 6(y_H + y_P) \\
&= (y_H'' - 5y_H' + 6y_H) + (y_P'' - 5y_P' + 6y_P) \quad \text{(by linearity of differentiation)} \\
&= 0 + (12x - 4) \quad \text{(by the properties of $y_H$ and $y_P$)} \\
&= 12x - 4.
\end{aligned}$$
Hence $y$, given by
$$y = y_H + y_P = Ae^{2x} + Be^{3x} + 2x + 1,$$
where $A$ and $B$ are real constants, also solves the ODE. In fact, this gives the general solution (the
discussion in Subsection 3.8.4 explains why).
Bearing in mind the above example, we now detail an algorithm for solving a second order ODE
of the form (3.35).
1. Find the solution yH to the corresponding homogeneous equation (3.36) (by first identifying
the roots of the characteristic equation, as in Subsection 3.8.1).
2. Find a particular solution yP to (3.35).
3. The general solution y to (3.35) is then given by y = yH + yP .
In general, it is best to perform Step 1 before Step 2; the reason for this will soon become obvious.
Step 1 has already been discussed in some detail, and Step 3 is easy. We need only devote some
discussion to performing Step 2.
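For a right-hand side of degree one, Step 2 reduces to equating coefficients, which can be sketched in Python. This sketch is not part of the original notes; the function `linear_particular` is a hypothetical helper that handles only the case $f(x) = c_1 x + c_0$ with $b \neq 0$.

```python
def linear_particular(a, b, c1, c0):
    """Coefficients (p, q) of y_P = p x + q for y'' + a y' + b y = c1 x + c0.

    Substituting y_P gives a*p + b*(p x + q) = c1 x + c0, so
    b*p = c1 (coefficients of x) and a*p + b*q = c0 (constant terms).
    """
    if b == 0:
        # Then 0 is a root of the characteristic equation and a
        # degree-one guess cannot work without modification.
        raise ValueError("b = 0: the guess p x + q must be modified")
    p = c1 / b
    q = (c0 - a * p) / b
    return p, q

if __name__ == "__main__":
    # Data of Example 3.8.8: y'' - 5y' + 6y = 12x - 4.
    print(linear_particular(-5, 6, 12, -4))
```

Applied to the data of Example 3.8.8, the helper should return the coefficients 2 and 1 of the particular solution found there.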
Example 3.8.9. Solve the ODE
$$y'' - 4y' + 5y = 20e^{-x}. \tag{3.44}$$
Solution. First, we solve the homogeneous equation
$$y'' - 4y' + 5y = 0.$$
The roots of the characteristic equation $\lambda^2 - 4\lambda + 5 = 0$ are $2 - i$ and $2 + i$ (these can be found
using the quadratic formula). Hence the solution $y_H$ of the homogeneous equation is given by
$$y_H = e^{2x}(A\cos x + B\sin x),$$
where $A$ and $B$ are real numbers.
Second, we find a particular solution $y_P$. Since the right-hand side is an exponential, we try
$$y_P = ae^{-x},$$
where $a$ is a constant to be determined. Note that $y_P' = -ae^{-x}$ and $y_P'' = ae^{-x}$. So substituting
$y_P$ into (3.44) gives
$$ae^{-x} + 4ae^{-x} + 5ae^{-x} = 20e^{-x}.$$
By comparing coefficients on each side we conclude that $y_P$ is a particular solution if and only if
$a = 2$. Hence our particular solution is given by $y_P = 2e^{-x}$.
Third, the general solution $y$ to (3.44) is given by
$$y = y_H + y_P = e^{2x}(A\cos x + B\sin x) + 2e^{-x},$$
where $A$ and $B$ are real numbers.
Example 3.8.10. Solve the ODE
$$y'' - 3y' + 2y = 5e^{2x}. \tag{3.45}$$
Solution. First we solve the corresponding homogeneous ODE
$$y'' - 3y' + 2y = 0.$$
The characteristic equation $\lambda^2 - 3\lambda + 2 = 0$ has the solutions $\lambda = 1$ and $\lambda = 2$. Hence the solution
$y_H$ to the homogeneous equation is given by
$$y_H = Ae^{2x} + Be^{x}, \tag{3.46}$$
where $A$ and $B$ are real constants.
Second, we look for a particular solution $y_P$. Since the right-hand side of (3.45) is a multiple
of $e^{2x}$, it seems natural to try $y_P = ae^{2x}$. However, note that $ae^{2x}$ is a particular solution to the
homogeneous equation. (To see this, simply set $A$ as $a$ and $B$ as $0$ in (3.46).) Hence substituting
$y_P = ae^{2x}$ into (3.45) will produce $0$ on the left-hand side, and consequently this guess for $y_P$ will
not work.
In this circumstance, the ‘trick’ for finding a particular solution $y_P$ is to multiply the old guess
$ae^{2x}$ by $x$. That is, we now try
$$y_P = axe^{2x}.$$
(This guess for a particular solution is certainly not in the solution space for the homogeneous
equation.) Now we find that
$$y_P' = ae^{2x}(2x + 1) \quad \text{and} \quad y_P'' = 4ae^{2x}(x + 1),$$
and hence substituting $y_P = axe^{2x}$ into (3.45) gives
$$4ae^{2x}(x + 1) - 3ae^{2x}(2x + 1) + 2axe^{2x} = 5e^{2x}.$$
This simplifies to
$$4ae^{2x} - 3ae^{2x} = 5e^{2x},$$
and we thereby deduce that $a = 5$. Therefore a particular solution $y_P$ is given by $y_P = 5xe^{2x}$.
Finally, the general solution $y$ to (3.45) is given by
$$y = y_H + y_P = Ae^{2x} + Be^{x} + 5xe^{2x},$$
where $A$ and $B$ are real constants.
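Given a function $f$, the following table indicates which guess for $y_P$ will always yield a particular
solution for the nonhomogeneous ODE (3.35). Here $Q$, $Q_1$ and $Q_2$ denote polynomials of degree $n$
whose coefficients are to be determined.

  f(x)                                                  Guess for particular solution y_P
  ----------------------------------------------------  ----------------------------------------------------
  $P(x)$ (a polynomial of degree $n$)                   $Q(x)$ (a polynomial of degree $n$)
  $P(x)e^{sx}$                                          $Q(x)e^{sx}$
  $P(x)\cos(sx)$                                        $Q_1(x)\cos(sx) + Q_2(x)\sin(sx)$
  $P(x)\sin(sx)$                                        $Q_1(x)\cos(sx) + Q_2(x)\sin(sx)$
  $P(x)e^{sx}\cos(tx)$ or $P(x)e^{sx}\sin(tx)$          $Q_1(x)e^{sx}\cos(tx) + Q_2(x)e^{sx}\sin(tx)$
  ----------------------------------------------------  ----------------------------------------------------
  If any term of the guess for $y_P$ is a solution to the homogeneous ODE, then multiply the guess by $x$.
  If any term of the new guess is still a solution to the homogeneous ODE, then multiply by $x$ again.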
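The two rules about multiplying by $x$ can be restated using a standard fact: the number of factors of $x$ required equals the multiplicity of $s + ti$ as a root of the characteristic equation. The following Python sketch, which is not part of the original notes, uses this restatement. The function name `x_power` and its parameters are illustrative; `s` is the exponential rate and `t` the trigonometric frequency, so a forcing term of the form $P(x)\cos(\omega x)$ corresponds to `s = 0`, `t = omega`.

```python
def x_power(a, b, s, t=0.0, tol=1e-12):
    """Number of times the tabulated guess must be multiplied by x.

    The ODE is y'' + a y' + b y = f, where f is a polynomial times
    e^{sx} cos(tx) or e^{sx} sin(tx) (t = 0 covers the polynomial and
    pure exponential rows of the table).
    """
    z = complex(s, t)
    p = z * z + a * z + b   # characteristic polynomial evaluated at s + ti
    dp = 2 * z + a          # its derivative at s + ti
    if abs(p) > tol:
        return 0            # no term of the guess solves the homogeneous ODE
    if abs(dp) > tol:
        return 1            # simple root: multiply the guess by x once
    return 2                # repeated root: multiply the guess by x twice

if __name__ == "__main__":
    print(x_power(-4, 5, -1))   # y'' - 4y' + 5y = 20 e^{-x}
    print(x_power(-3, 2, 2))    # y'' - 3y' + 2y = 5 e^{2x}
    print(x_power(-6, 9, 3))    # y'' - 6y' + 9y = 8 e^{3x}
```

The three calls correspond to the right-hand sides of Examples 3.8.9, 3.8.10 and 3.8.11 respectively, and the results should match the number of times the guess is multiplied by $x$ in each of those examples.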
The next example illustrates the directive given in the last two rows of the table.
Example 3.8.11. Solve the ODE
$$y'' - 6y' + 9y = 8e^{3x}. \tag{3.47}$$
Solution. First, the characteristic equation
$$\lambda^2 - 6\lambda + 9 = 0$$
factorises as $(\lambda - 3)^2 = 0$ and thus has the repeated root 3. So the solution $y_H$ to the corresponding
homogeneous equation is given by
$$y_H = Ae^{3x} + Bxe^{3x}, \tag{3.48}$$
where $A$ and $B$ are real numbers.
Second, we search for a particular solution $y_P$. Our first guess $y_P = ae^{3x}$ will not work, as it is
a solution to the corresponding homogeneous equation (to see why, simply set $A$ as $a$ and $B$ as $0$
in (3.48)). So we multiply by $x$ to obtain a new guess $y_P = axe^{3x}$. However, this guess also solves
the homogeneous equation. So once again, multiply by $x$ to obtain a new guess $y_P = ax^2e^{3x}$. It is
easy to see that $y_P$ no longer lies in the solution space to the homogeneous equation. So this guess
will work.
To determine the value of $a$, substitute $y_P = ax^2e^{3x}$ into (3.47). We leave it to the reader to
verify that $a = 4$. Hence a particular solution $y_P$ is given by $y_P = 4x^2e^{3x}$.
Finally, the general solution $y$ is the sum of $y_H$ and $y_P$, namely
$$y = Ae^{3x} + Bxe^{3x} + 4x^2e^{3x},$$
where $A$ and $B$ are real numbers.
Remark 3.8.12. The previous example illustrates the importance of finding yH before making
a guess for yP ; without knowing yH it is not possible to make a suitable guess for yP . The next
example emphasises this point.
Example 3.8.13. Consider the nonhomogeneous ODE
$$\frac{d^2y}{dt^2} - 6\frac{dy}{dt} + 13y = 5e^{3t}\cos(2t). \tag{3.49}$$
Write down the form of a particular solution $y_P$ to this ODE. (You are not required to evaluate
the undetermined coefficients appearing in the form of $y_P$.)
Solution. The characteristic equation $\lambda^2 - 6\lambda + 13 = 0$ has roots $3 + 2i$ and $3 - 2i$ (these can be
found by completing the square or using the quadratic formula). Hence the solution $y_H$ to the
corresponding homogeneous equation is given by
$$y_H = e^{3t}(A\cos 2t + B\sin 2t), \tag{3.50}$$
where $A, B \in \mathbb{R}$.
Since the right-hand side of (3.49) is a product of $e^{3t}$ and $\cos 2t$, our initial guess for the
particular solution $y_P$ is given by
$$y_P = ae^{3t}\cos 2t + be^{3t}\sin 2t.$$
However, (3.50) shows that this guess solves the homogeneous equation. Instead, try
$$y_P = ate^{3t}\cos 2t + bte^{3t}\sin 2t.$$
This new guess for $y_P$ does not lie in the solution space for the homogeneous equation; hence this
gives the form of particular solution that we seek.
Remark 3.8.14. One must take care in the use of the method of undetermined coefficients, and
in particular the two rules at the end of the table given above.
Consider, for example, the differential equation
$$y'' - 6y' + 9y = x^2e^{3x}.$$
The homogeneous solution, from above, is $y_H = Ae^{3x} + Bxe^{3x}$. Now since $x^2e^{3x}$ is not one of the
homogeneous solutions, one might be tempted to try the particular solution $y_P = (Cx^2 + Dx + E)e^{3x}$.
However, since this contains terms which are part of the homogeneous solution, we need to multiply by $x$
twice, giving $y_P = (Cx^4 + Dx^3 + Ex^2)e^{3x}$, so that no term in this expression is a homogeneous
solution. Substitution will reveal that $D$ and $E$ are zero, while $C = \frac{1}{12}$.
There is an easier method to solve such a problem using differential operators. This is left to
more advanced courses in differential equations.
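The coefficients quoted in Remark 3.8.14 can be checked numerically. The Python sketch below is not part of the original notes; the function `residual` is a hypothetical helper that evaluates $y'' - 6y' + 9y - x^2e^{3x}$ for $y = u(x)e^{3x}$ with $u = Cx^4 + Dx^3 + Ex^2$, using the product rule in the form $y' = (u' + 3u)e^{3x}$ and $y'' = (u'' + 6u' + 9u)e^{3x}$.

```python
import math

def residual(x, C=1/12, D=0.0, E=0.0):
    """y'' - 6y' + 9y - x^2 e^{3x} for y = (C x^4 + D x^3 + E x^2) e^{3x}."""
    u = C * x**4 + D * x**3 + E * x**2
    du = 4 * C * x**3 + 3 * D * x**2 + 2 * E * x
    d2u = 12 * C * x**2 + 6 * D * x + 2 * E
    e = math.exp(3 * x)
    y = u * e
    dy = (du + 3 * u) * e
    d2y = (d2u + 6 * du + 9 * u) * e
    return d2y - 6 * dy + 9 * y - x**2 * e

if __name__ == "__main__":
    print([residual(x) for x in (-1.0, 0.0, 0.5, 2.0)])
    print(residual(1.0, C=1/6))   # a wrong coefficient leaves a nonzero residual
```

With the default coefficients the residual should vanish up to rounding error, whereas any other value of $C$ leaves a visibly nonzero remainder.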
3.8.3 An application: vibrations and resonance
Many structures have a natural frequency of vibration. If an external agent causes them to vibrate
at or near one of these frequencies then large oscillations build up and resonance occurs. This can
cause disasters such as the collapse of bridges. The ‘trick’ used by singers breaking wine glasses is
another example of this phenomenon. The first example of this subsection examines the behaviour
of a natural oscillating system; the second example illustrates how introducing external vibrations
into such a system causes resonance.
Example 3.8.15. Suppose that a spring is mounted to a (fixed) point P and that an object of
mass m is suspended from the spring. Let x denote the (vertical) displacement from the equilibrium
position (or resting position) of the object, taking x to be positive if it lies above the equilibrium
position.
[Figure: an object suspended by a spring from the point P, with its equilibrium position marked.]
By using Newton’s second law of motion, Hooke’s law and making some simple assumptions, one
can show that the system satisfies the differential equation
$$\frac{d^2x}{dt^2} + \omega^2 x = 0, \tag{3.51}$$
where ω is a positive constant that depends only on the mass of the object and the stiffness of the
spring.
The object is pulled downwards from its equilibrium position by a distance of 4 units and then
released from rest. Find x(t) when t ≥ 0.
Solution. We need to solve the differential equation (3.51) subject to the initial conditions $x(0) = -4$
and $x'(0) = 0$. The characteristic equation $\lambda^2 + \omega^2 = 0$ is easily solved, leading to the solution
$$x(t) = A\cos\omega t + B\sin\omega t$$
of (3.51), where $A$ and $B$ are constants. Before imposing initial conditions, we calculate $x'(t)$. This
is given by
$$x'(t) = -A\omega\sin\omega t + B\omega\cos\omega t.$$
Now $x'(0) = 0$ implies that $0 = B\omega$, from which we conclude that $B = 0$ (since $\omega$ is positive).
Finally, $x(0) = -4$ implies that $A = -4$. Hence the solution is given by
$$x(t) = -4\cos\omega t.$$
This type of motion is known as simple harmonic motion and is graphed in Figure 3.2 (a).
Example 3.8.16. Consider the same scenario as Example 3.8.15, except that now the point P
vibrates up and down, such that its vertical displacement $y$ is given by $y = 2\sin\Omega t$. In these
circumstances, a simple physical argument shows that $x$ obeys the differential equation
$$\frac{d^2x}{dt^2} + \omega^2 x = 2\sin\Omega t. \tag{3.52}$$
Describe the motion of the object, given that x(0) = −4 and x′(0) = 0.
Solution. We have already solved the homogeneous equation in Example 3.8.15. So we look for a
particular solution xP .
Case 1: Suppose that $\Omega \neq \omega$. Then we look for a particular solution of the form
$$x_P = C\cos\Omega t + D\sin\Omega t.$$
[Figure 3.2: Different oscillating systems. Each panel plots $x(t)$ against $t$: (a) simple harmonic motion; (b) stable oscillation, $\omega \neq \Omega$; (c) resonance, $\omega = \Omega$.]
By substituting this into (3.52) we obtain
$$x_P = \frac{2}{\omega^2 - \Omega^2}\sin\Omega t.$$
By using the fact that $x = x_H + x_P$ and imposing initial conditions we find that
$$x(t) = -4\cos\omega t - \frac{\Omega}{\omega}\,\frac{2}{\omega^2 - \Omega^2}\sin\omega t + \frac{2}{\omega^2 - \Omega^2}\sin\Omega t.$$
This is simply another oscillating system. Its graph is shown in Figure 3.2 (b) (in the case when
$\Omega = \frac{3}{2}\omega$).
Case 2: Suppose that $\Omega = \omega$. This time we look for a particular solution of the form
$$x_P = Ct\cos\omega t + Dt\sin\omega t.$$
By substituting this into (3.52) we obtain
$$x_P = -\frac{t}{\omega}\cos\omega t.$$
Using the fact that $x = x_H + x_P$ and imposing initial conditions we find that
$$x(t) = -4\cos\omega t + \frac{1}{\omega^2}\sin\omega t - \frac{t}{\omega}\cos\omega t.$$
Hence as $t$ increases, the amplitude of the $\cos\omega t$ term grows without bound and the system becomes
unstable. Its graph is shown in Figure 3.2 (c).
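The contrast between the two cases can be seen by evaluating the two solutions numerically. The following Python sketch is not part of the original notes; the function names `x_forced` and `x_resonant` are illustrative, and the formulas are those derived above for the initial conditions $x(0) = -4$ and $x'(0) = 0$.

```python
import math

def x_forced(t, omega, Omega):
    """Case 1 (Omega != omega): bounded oscillation."""
    d = 2.0 / (omega**2 - Omega**2)
    return (-4.0 * math.cos(omega * t)
            - (Omega / omega) * d * math.sin(omega * t)
            + d * math.sin(Omega * t))

def x_resonant(t, omega):
    """Case 2 (Omega == omega): the t*cos(omega t) term grows without bound."""
    return (-4.0 * math.cos(omega * t)
            + math.sin(omega * t) / omega**2
            - (t / omega) * math.cos(omega * t))

if __name__ == "__main__":
    omega = 1.0
    for n in (1, 10, 100):
        t = 2 * math.pi * n / omega   # whole numbers of natural periods
        print(n, x_forced(t, omega, 1.5 * omega), x_resonant(t, omega))
```

At the sampled times $t = 2\pi n/\omega$ the resonant solution equals $-4 - t/\omega$, so its size grows linearly with $n$, while the Case 1 solution can never exceed the sum of the absolute values of its three coefficients.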
3.8.4 A connection with linear algebra
In Section 3.8, we have made two claims whose proofs have not been given yet; namely (A) that
every second order homogeneous ODE has at most two linearly independent solutions and (B) that
every solution y to a nonhomogeneous second order ODE can be written in the form yH + yP .
Both of these assertions can be proved by using linear algebra. In this subsection we highlight the
connection between second order linear ODEs and linear algebra, and explain why assertion (B) is
true. The proof of assertion (A) is given in MATH2501. Students should not continue reading this
subsection until Chapter 7 from MATH1231/1241 algebra has been completed.
Consider the second order ODE
y′′ + ay′ + by = f, (3.53)
where a and b are real numbers and f : R→ R is a function. Let V denote the vector space of all
(infinitely) differentiable functions y : R → R. We define the linear transformation T : V → V by
the formula
T (y) = y′′ + ay′ + by.
Observe that
yH is a solution to the homogeneous equation ⇐⇒ y′′H + ay′H + byH = 0
⇐⇒ T (yH) = 0
⇐⇒ yH ∈ ker(T ).
Hence the general solution to the homogeneous equation is the kernel of T . Also observe that
yP is a particular solution to the nonhomogeneous equation ⇐⇒ y′′P + ay′P + byP = f
⇐⇒ T (yP ) = f.
Hence the ODE (3.53) has a particular solution if and only if f is in the image of T . Moreover,
since
T (yH + yP ) = T (yH) + T (yP ) (by the linearity of T )
= 0 + f
= f,
we conclude that yH + yP is also a solution to (3.53).
Finally, we prove assertion (B). Suppose that y is a solution to (3.53) and that yP is some
particular solution to (3.53). Then T (y) = f and T (yP ) = f . Hence
T (y − yP ) = T (y)− T (yP ) (by the linearity of T )
= f − f
= 0.
Therefore y − yP is in the kernel of T . But, as observed above, every function in the kernel of T is
a solution yH to the homogeneous equation. Hence y − yP = yH , where yH is some solution to the
homogeneous equation. It follows that y = yH + yP , thus completing the proof of assertion (B).
The kernel and image of T , and their connection to the homogeneous solution space and a
particular solution for the differential equation, may be represented pictorially as shown below.
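[Figure: the transformation $T : V \to V$, which sends each homogeneous solution $y_H$ to $0$ and a particular solution $y_P$ to $f$.]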
3.9 Maple notes
The following MAPLE command is relevant to the material of this chapter:
dsolve(deqn, y(x)); solves the ordinary differential equation (or IVP) deqn for the function
y(x). For example,
> dsolve(diff(y(x), x$2) - y(x) = 1, y(x));
y(x) = -1 + _C1 exp(x) + _C2 exp(-x)
> dsolve({diff(v(t), t) + 2*t = 0, v(1)=5}, v(t));
v(t) = -t^2 + 6
Problems for Chapter 3
Problems 3.3 : Separable ODEs
1. [R] Solve the following differential equations.
a) $\dfrac{dy}{dt} = t^2(1 + y^2)$
b) $\dfrac{dy}{dx} = xy^2$
c) $\dfrac{dy}{dx} = \dfrac{\sin x}{y^2}$
d) $\dfrac{dy}{dx} = \dfrac{y}{x(x - 1)}$
e) $\dfrac{dy}{dx} = e^{x+y}$
f) $y\cos^2 x\,\dfrac{dy}{dx} = \tan x + 2$, $y = 2$ when $x = \pi/4$
g) $x\dfrac{dy}{dx} = y\ln x$
h) $\dfrac{dy}{dx} = 3x^2y^2$, given that $y = 1$ when $x = 0$.
2. [H] Try to find the general solution for
$$\frac{dy}{dx} = 3y^{2/3}.$$
Your answer will probably be $y = (x + C)^3$.
Observe $y = 0$ is also a solution and it cannot be expressed as $y = (x + C)^3$ for any value
of $C$.
How do you account for this? Are there any other solutions?
3. [R] Oil is leaking out of a tank in such a way that the depth of oil $h$ in the tank at time
$t$ satisfies $\dfrac{dh}{dt} = -\sqrt{2h}$. If the initial height is 4 cm, then find the time taken for the tank
to empty.
Problems 3.4 : First order linear ODEs
4. [R] Solve the following linear ODEs.
a) $\dfrac{dy}{dx} - 2y = x^2e^{2x}$
b) $\dfrac{dy}{dx} + 3y = \dfrac{e^{-3x}}{1 + x^2}$
c) $x\dfrac{dy}{dx} + (1 + x)y = 2$
d) $x\dfrac{dy}{dx} - 2y = 6x^5$
e) $\cos^2 x\,\dfrac{dy}{dx} + y = \tan x$
f) $\dfrac{dy}{dx} = x + 2y\tan 2x$
5. [R] Solve $\dfrac{dx}{dt} = t - x$. Sketch the solution curves passing through $(0, 1)$, $(0, 0)$, $(0, -1)$ and
$(0, -2)$.
6. [R] An object falling vertically experiences a resistance which is proportional to its velocity.
a) Explain why its acceleration is given by $\dfrac{dv}{dt} = g - kv$, where $g$ is the acceleration due
to gravity and $k$ is a positive constant.
b) Solve this as a linear equation for the initial condition $v = 0$.
c) Solve this as a separable equation.
d) What is the terminal (or limiting) velocity of the object?
Problems 3.5 : Exact ODEs and miscellaneous first order ODEs
7. [R] Solve the following exact ODEs.
a) $2xy + (x^2 + y^2)\dfrac{dy}{dx} = 0$
b) $(\sin y - xy^2)\,dx + (x\cos y - x^2y)\,dy = 0$
8. [R] Determine which of the following differential equations are exact and solve those which
are.
a) $(2xe^y + e^x) + (x^2 + 1)e^y\dfrac{dy}{dx} = 0$
b) $y(x^2 + \ln y) + x\dfrac{dy}{dx} = 0$
c) $\left(\dfrac{y}{1 + x^2} + e^y\right)dx + \left(\tan^{-1} x + xe^y\right)dy = 0$
d) $e^{xy}(y\cos x - \sin x) + xe^{xy}\cos x\,\dfrac{dy}{dx} = 0$
9. [R] (Miscellaneous first order ODEs)
Solve the following ODEs.
a) $(2xy - 3\tan x)\dfrac{dy}{dx} = 3y\sec^2 x - y^2$
b) $x - (x^2y + y)\dfrac{dy}{dx} = 0$
c) $x^2\dfrac{dy}{dx} + xy = 1$, $x > 0$
d) $x^2\dfrac{dy}{dx} - xy = y$
e) $x\dfrac{dy}{dx} = \dfrac{1 + x^2}{y^2}$, $y(1) = 3$.
10. [X] Solve the differential equation
$$(2x - 10y^3)\frac{dy}{dx} + y = 0$$
by first multiplying it through by some function $\mu(y)$ to make it exact.
Problems 3.6 : Solving ODEs by using a change of variable [X]
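11. [X] Solve the following ODEs by using the substitution $y(x) = x \cdot v(x)$.
a) $\dfrac{dy}{dx} = \dfrac{xy - y^2}{x^2}$
b) $\dfrac{dy}{dx} = \dfrac{y}{x} + \cos\left(\dfrac{y - x}{x}\right)$
c) $\dfrac{dy}{dx} + \dfrac{2xy}{x^2 + y^2} = 0$
d) $\dfrac{dy}{dx} + \dfrac{x^2 + y^2}{2xy} = 0$
e) $x\dfrac{dy}{dx} = \sqrt{x^2 + y^2} + y$, $y(1) = 0$.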
12. [X] Use the substitution $y = x \cdot v(x)$ to find the solution to the differential equation
$$\frac{dy}{dx} = \frac{y - x}{y + x}, \quad y(1) = 0.$$
Use polar coordinates to sketch the corresponding curve.
13. [X] Solve $\dfrac{dy}{dx} = (y - x)^2$ using the substitution $u = y - x$.
14. [X]
a) Solve $\dfrac{dy}{dt} = y - 2y^2$ by making the substitution $z = \dfrac{1}{y}$ to obtain a linear differential
equation in $z$ and $t$.
b) Use the same trick to solve $\dfrac{dy}{dt} = 5y - ty^2$ given that $y = 1$ when $t = 0$.
Find the maximum value of $y$ when $t \geq 0$.
15. [X] Solve the following differential equations by using the given substitution.
a) $y\dfrac{dy}{dx} - 2y^2 = x$; $u = y^2$.
b) $x\dfrac{dy}{dx} + y(x^2 + \ln y) = 0$, $x > 0$, $y > 0$; $u = \ln y$.
c) $xyy'' = yy' + x(y')^2$; $u = y'/y$ or $u = \ln y$.
16. [X] In each case, use the substitution $v = \dfrac{dy}{dt}$ to solve the given ODE.
a) $\dfrac{d^2y}{dt^2} + 2\dfrac{dy}{dt} = 6$
b) $t^2\dfrac{d^2y}{dt^2} - \left(\dfrac{dy}{dt}\right)^2 = 0$
17. [X] Suppose that $\alpha(x)$ is a solution to the equation
$$\frac{d^2u}{dx^2} + b(x)\frac{du}{dx} + c(x)u = 0.$$
a) Use the substitution $u(x) = \alpha(x)v(x)$ to write down a first order equation for $\dfrac{dv}{dx}$.
What method can you see to solve this equation?
b) Verify that $u(x) = x^3$ is one solution to the differential equation
$$x^2\frac{d^2u}{dx^2} - 4x\frac{du}{dx} + 6u = 0.$$
Hence use the technique of (a) to find the general solution.
Problems 3.7 : Modelling with first order ODEs
18. [R] The simple population growth model, $\dfrac{dy}{dt} = ky$, is (because of limitations on resources,
pollution, . . . ) unsatisfactory over a ‘long’ period. We might look at
$$\frac{dy}{dt} = k(y)\,y \tag{1}$$
(which of course ignores seasonal and other variations with time). Mathematically one of
the simplest assumptions we can make (Verhulst 1839) is that $k(y)$ decreases linearly as $y$
increases. In this case (1) may be written in the form
$$\frac{dy}{dt} = k\left(1 - \frac{y}{K}\right)y, \tag{2}$$
where $k$ and $K$ are constants.
a) i) Equation (2) has two constant (stationary) solutions. What are they?
ii) Solve (2), given that $y = y_0$ when $t = 0$ and $0 < y_0 < K$.
iii) State what happens to $y$ as $t \to \infty$.
iv) For what value of $y$ is the rate of increase of $y$ a maximum?
(Caution: there is a simple, two line, method.)
v) If $y_0 > K$ then what happens to $y$ as $t$ increases?
b) [H] It has been found empirically that for certain bodies $k(y)$ decreases linearly with
$\ln y$. (Here $y$ could be the volume, mass or number of cells of the body.) In this case
(1) may be written as
$$\frac{dy}{dt} = \alpha\ln\left(\frac{K}{y}\right)y. \tag{3}$$
Solve this differential equation, given that $y = y_0$ when $t = 0$.
19. [R] An initially unpolluted lake of $10^9$ litres has a river flowing through it at 1,000,000
litres per day. A factory is built which discharges 10,000 litres per day of pollutant into
the lake. Assume that the total volume of liquid in the lake remains constant at $10^9$ litres.
a) What will be the eventual level of pollution in the lake?
b) How long will it take to reach half this level?
c) Is there anything unrealistic about your model?
20. [R] A tank can hold 100 litres. Initially it holds 50 litres of pure water. Brine, which
contains 2 grams of salt per litre, is run in at the rate of 3 litres per minute. The mixture,
which is stirred continuously, is run off at 1 litre per minute. Let x(t) denote the mass of
salt (in grams) present in the tank after t minutes.
a) Set up a differential equation in x and t to model the system.
b) Show that when the tank is at the point of overflowing it contains $50(4 - \sqrt{2})$ grams
of salt.
21. [R] A population of size $P$ is subject to seasonal variation. The population has a growth
rate given by
$$\frac{dP}{dt} + P = 100 + 50\sin t.$$
a) Solve the differential equation given that $P(0) = 20$.
b) Find the average population size over a long period of time.
22. [R] The equation
$$\frac{dy}{dt} = k(1 + a\cos(2\pi t))\,y$$
is an attempt to model seasonal variation. Solve it.
23. [R] It is estimated that the population growth rate of a certain developing country will
fall linearly from 2% per year to 1.5% per year over the next decade.
a) Express the population growth rate as a function of time.
b) Given that the present population is 10 million, find the estimated population in 10
years time.
24. [R] An object falling in a resisting medium has a constant acceleration due to gravity of
9.8 m/sec² and also a drag force, which is approximately proportional to the speed (see
Question 6). For a stone falling in water with velocity $v$ m/sec, the acceleration from this
drag force is approximately $10v$ m/sec².
a) Given that the stone is dropped from rest at the water surface, write a differential
equation describing this situation, and solve it to find $v$.
b) Determine the terminal velocity, $\lim_{t \to \infty} v(t)$.
c) How long does it take before the stone is travelling with 95% of its terminal velocity?
d) What if it starts at a velocity higher than the terminal velocity?
25. [R] The differential equation $\dfrac{dy}{dx} = \lambda\dfrac{y}{x}$ is employed as a simple model for comparative
growth.
a) Solve this differential equation.
b) The mass $y$ of the large claw of a fiddler crab is compared to the mass $x$ of its
body (without claw) over a period of time. The measurements taken (in grams) are
recorded in the following table.

  x    55    300    536    1080    1449    2233
  y     5     72    175     522     773    1498

By graphing $f(x)$ against $f(y)$ for a suitable function $f$, show that the above model
appears reasonable in this example. Determine approximately the value of $\lambda$ and the
constant of integration.
26. [R] An investor puts \$500,000 into Hitek Bonds, which pay a profit of 20% a year,
compounded daily (i.e., if $P$ is the amount of money in the bonds at any time, $\dfrac{dP}{dt} = 0.2P$
approximately).
a) How much money would he own at the end of a year?
b) [H] His tax accountant advises that this is overprofitable, and suggests that he use
\$200 a week of the interest towards the hire of a Volvo, which would be a tax write-off
under the government’s Small Business Incentive Scheme, and that at the end of each
six months, all the remaining interest should be taken out and invested in the Lake
Eyre Ricegrowers’ Cooperative, which loses money continuously at a rate of 10% a
year (so that $\dfrac{dP}{dt} = -0.1P$). This would qualify the investor for various advantages
under the Rural Rorts Scheme. If the investor takes this advice, what would the total
of both his investments be after one year? [Take it that there are 52 weeks in a year.]
27. [H] There are n+ 1 tanks each containing 100 litres and connected as shown.
. . .
Throughout, the liquid in every tank is kept well–mixed. The 0-th tank contains 100 litres
of pure water with 50 grams of salt dissolved in it. The remainder contain pure water.
Pure water is pumped into tank 0 at 3 litres per minute and liquid leaves the system from
tank n at 3 litres per minute. Let $m_k$ denote the mass (in grams) of salt in tank k.
a) Show that
\[ \frac{dm_k}{dt} = 0.03(m_{k-1} - m_k), \quad k = 1, 2, \ldots, n, \]
and
\[ \frac{dm_0}{dt} = -0.03 m_0. \]
b) Show that
\[ m_n = \frac{50(0.03)^n}{n!} t^n e^{-0.03t}. \]
28. [H] The decay of one atom of radioactive element A yields one atom of B which is itself
radioactive. The decay constants for A and B are 0.25 and 2 (per day) respectively. A
pure sample of K atoms of A is placed in a closed container. Let $y_1(t)$ and $y_2(t)$ denote
the numbers of atoms of A and B that are present at time t. Write down two differential
equations to describe this situation. Solve to obtain a formula for $y_2$. When is the amount
of B in the container a maximum?
29. [X] A cyclist freewheeling on a level road experiences a retardation (i.e., a negative
acceleration) that is proportional to the square of his speed. His speed is reduced from 20
metres per second to 10 metres per second in a distance of 100 metres. Find his average
(with respect to time) speed during this period.
Problems 3.8 : Second order linear ODEs with constant coefficients
30. [R] Find the general solutions of the following second order ODEs.
a) y′′ + 3y′ + 2y = 0 b) y′′ + 2y′ + 10y = 0
c) y′ + 3y = 0 d) y′′ + 4y′ + 4y = 0
31. [R] Find the solutions of the following initial value problems.
a) y′′ − 6y′ + 5y = 0; y = 1, y′ = 0, when x = 0.
b) y′′ + 2y′ + 2y = 0; y = 1, y′ = 0, when x = 0.
32. [R] Solve the following differential equations.
a) $y'' + 4y' + 3y = x$    b) $y'' - 6y' + 9y = 5e^{2x}$
c) $y'' + 2y' + 2y = 10 \cos 2x$    d) $y'' - y = e^{-x}$
e) $y'' + 4y = \sin 2x$    f) $y' + 3y = 10e^{2x}$
g) $y'' + 4y = \sin x$; $y = 0$, $y' = 1$ when $x = 0$
h) $y'' - 5y' + 4y = 2e^{2x}$; $y = 1$, $y' = 3$ when $x = 0$
33. [R] For each of the following differential equations, find the general solution of the associated
homogeneous equation and write down the form of the particular solution you would
seek. (Do not evaluate the unknown coefficients.)
a) $2y'' - 3y' - 5y = (x + 4)e^{5x/2}$
b) $y'' + 2y' - 24y = e^{4x} \sin 6x$
c) $y'' + 6y' + 9y = e^{-3x}$
34. [H] Find a particular solution for the differential equation $y'' - 4y' + 4y = 6x^2 e^{2x}$.
35. [X] By making the substitution $x = e^t$, reduce the linear differential equation
\[ x^2 \frac{d^2y}{dx^2} - 4x \frac{dy}{dx} + 6y = x^5, \quad x > 0, \]
to one with constant coefficients, and hence solve it.
36. [X]
a) Show that the second order differential equation
\[ y'' + (m_1 + m_2)y' + m_1 m_2 y = 0, \]
where $m_1$ and $m_2$ are constants, can be written in the form
\[ \frac{d}{dx}(y' + m_1 y) + m_2 (y' + m_1 y) = 0. \]
b) Suppose that $m_1 = m_2 = m$. Using the result of (a), solve the first differential
equation by solving successively two first order linear equations.
37. [R] A cylindrical buoy of 80 cm in diameter floats in water with its axis vertical. When
depressed slightly and released, it bobs up and down according to the differential equation
\[ m \frac{d^2x}{dt^2} = -\pi 40^2 g x, \]
where m is the mass (in grams) of the buoy and x is the displacement (in centimetres)
from the equilibrium position. Take the acceleration g due to gravity to be 980 cm/sec$^2$.
The period of oscillation is observed to be 2.5 seconds. What is the mass of the buoy?
38. [R] A block of wood is attached to a spring and slides across a surface as shown.
Let x(t) denote the horizontal distance of the block from its ‘resting’ position at time t.
The motion of the block is modelled by the initial value problem
\[ \frac{d^2x}{dt^2} + c \frac{dx}{dt} + 4x = 0, \quad x'(0) = 0, \quad x(0) = 1, \]
where c is the ‘coefficient of friction’ between the block and the surface.
a) By solving the differential equation, describe the motion of the block in the ‘ideal’
situation when there is no friction between the block and the surface (that is, when
c = 0).
b) Solve the initial value problem when c = 2 and when c = 5 and explain why the block
oscillates if c = 2 but does not if c = 5.
c) [H] Find the smallest positive c such that the system does not oscillate.
(This value of c corresponds to what is known as ‘critical damping.’)
39. [R] A circuit consists of an inductor and capacitor connected in series with a sinusoidal
power source. The charge q (in coulombs) stored in a capacitor is given by the differential
equation
\[ \frac{d^2q}{dt^2} + 10000q = 1000 \sin \Omega t, \]
where Ω/(2π) is the frequency of the power source.
a) Find the solution to the corresponding homogeneous equation.
b) Write down the form of particular solution you would seek for the differential equation.
(Warning: there are two cases, depending on the value of Ω.)
c) What sinusoidal frequency leads to unbounded oscillatory behaviour (known as
‘resonance’)?
40. [R]
a) Find the general solution of the vibrating system modelled by the equation
\[ \frac{d^2y}{dt^2} + 3 \frac{dy}{dt} + 2y = 20 \sin t. \]
b) The long term behaviour of this system is independent of the initial conditions. What
is this ‘steady state solution’?
41. [X]
a) Solve the previous question by first finding a particular complex-valued function z(t)
satisfying
\[ \frac{d^2z}{dt^2} + 3 \frac{dz}{dt} + 2z = 20e^{it}. \]
b) Similarly, find a particular solution of
\[ \frac{d^2y}{dt^2} + 4y = \sin 2t \]
by considering
\[ \frac{d^2z}{dt^2} + 4z = e^{2it}. \]
42. [R] A stationary wave on a guitar string of length L can be (partially) modelled by the
boundary value problem
y′′ = ky; y(0) = 0, y(L) = 0, L > 0,
where k is a constant. Assume that y is not identically zero.
a) Suppose that k is positive and write k as $\mu^2$, where $\mu > 0$. Show that the boundary
value problem has no nonzero solutions.
b) Are there any nonzero solutions when k = 0?
c) Suppose now that k is negative and write k as $-\mu^2$, where $\mu > 0$. Find all possible
values of $\mu$ such that the boundary value problem has a nonzero solution. Give the
corresponding solutions.
(Each such solution corresponds to a natural harmonic of the string.)
43. [X]
a) If g is a given function which is continuous and positive on the interval [0, L], show
that the only solution of the boundary value problem
y′′ − g(x)y = 0; y(0) = 0, y(L) = 0, L > 0,
is y = 0.
(Hint: Give a proof by contradiction using the maximum-minimum theorem for
continuous functions.)
b) Find all possible values for λ such that there is a solution to the differential equation
y′′ + 2y′ + 2λy = 0
satisfying y(0) = y(L) = 0 and y not identically zero.
44. [H] A simple linear predator-prey model:
Let X denote the number of predators at time t and Y the number of prey. Experience
and theoretical considerations suggest that
(α) there is an equilibrium state $(X, Y) = (X_0, Y_0)$ with $X_0, Y_0 \neq 0$;
(β) an increase in the number of prey, from $Y_0$, results in an increase in the number of
predators at a rate (approximately) proportional to the increase in prey, so that
\[ \frac{dX}{dt} = a(Y - Y_0), \quad a > 0; \tag{1} \]
and
(γ) an increase in the number of predators, from $X_0$, results in a decrease in the number
of prey at a rate (approximately) proportional to the increase in predators, so that
\[ \frac{dY}{dt} = -b(X - X_0), \quad b > 0. \tag{2} \]
[In fact the ‘linearisation’ of a number of complicated models near an equilibrium state
leads to (1) and (2).]
If $x = X - X_0$ and $y = Y - Y_0$ then we obtain
\[ \frac{dx}{dt} = ay, \tag{3} \]
and
\[ \frac{dy}{dt} = -bx. \tag{4} \]
a) Eliminate dt between (3) and (4) and solve to obtain a relation between x and y (the
‘phase trajectories’ of the system).
b) Eliminate x by differentiating (4) and substituting from (3). Hence solve for y and
then use (4) again to obtain x.
c) Suppose that a = 0.8 and b = 3.2. If x = −1.2 and y = 3.2 when t = 0, then find x
and y in terms of t and also find the phase trajectory.
45. [X] (Note: do not attempt this question until Chapter 7 of MATH1241 Algebra has been
completed.)
Let V denote the vector space of twice differentiable functions on $\mathbb{R}$. Define a linear map
L on V by the formula
\[ Lu = a \frac{d^2u}{dx^2} + b \frac{du}{dx} + cu, \]
where a, b and c are real numbers.
Suppose that $u_1, u_2$ is a basis for the solution space of L(u) = 0. Find a basis for the
solution space of the fourth order equation L(L(u)) = 0. What can you say about the
kernels of $L$ and $L^2$?
Chapter 4
Taylor series
Polynomials are nice functions to work with. Their values can be easily evaluated using a finite
number of additions and multiplications. They are easy to differentiate and integrate; moreover
their derivatives and antiderivatives are also polynomials and thus have these properties. Some
(but not all) of these properties are shared by a few other classes of functions. For example,
the derivatives of exponential functions are exponentials, but in general exponentials cannot be
evaluated using a finite number of additions and multiplications. Many other useful functions share
none of these properties.
If a function can be accurately approximated by a polynomial, then we can use the polynomial
to approximate the values, derivatives and antiderivative of the function. This generalises an idea
met in MATH1131, where we saw that a differentiable function can be locally approximated by a
linear function (which is a degree 1 polynomial). In this chapter we will see that many n-times
differentiable functions can be locally approximated by a polynomial of degree n. We will devote
considerable time to discussing how accurate such approximations are. Finally, we shall prove that
some functions can not only be approximated by polynomials but are also equal to series consisting
of infinitely many polynomial terms. Such series are known as Taylor series.
The ideas presented in this chapter have a long history, going back to Archimedes’ method
of exhaustion, which he used to approximate π. Beginning in the fourteenth century, a school of
Indian mathematicians based in Kerala found accurate polynomial approximations to trigonometric
functions. This allowed them to solve problems in astronomy. In the seventeenth century, the
Scottish astronomer James Gregory independently employed similar techniques. However, it was
not until 1715 that the English mathematician Brook Taylor published a theorem which gave a
general method for polynomial approximation, and which described precisely the errors involved. In
the twenty-first century, computers and calculators regularly use algorithms to find the approximate
value of functions; many of these modern computation techniques trace their roots back to Taylor’s
method.
4.1 Taylor polynomials
(Ref: SH10 §12.6, 12.7)
Recall from MATH1131 that the function $f : \mathbb{R} \to \mathbb{R}$, given by $f(x) = e^x$, is defined as the
inverse of the function $\ln : (0, \infty) \to \mathbb{R}$, and that ln is defined in terms of an integral. How, then,
does one evaluate $e^{0.1}$? (If the answer given is ‘use a calculator,’ then we merely ask the question,
How does a calculator evaluate $e^{0.1}$?)
One approach is to suppose that $y = e^{0.1}$. Then we need to solve the equation $\ln y = 0.1$. By
the definition of ln, this boils down to solving the integral equation
\[ \int_1^y \frac{dt}{t} = 0.1. \]
From here we could guess an approximate value for y and use Riemann sums to check whether our
guess is reasonable. Clearly this is an unsatisfactory approach to the problem.
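The guess-and-check idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `integral_of_reciprocal`, the use of the midpoint rule and the number of subintervals are choices made here, not part of the Notes.

```python
def integral_of_reciprocal(y, subintervals=10_000):
    """Midpoint Riemann sum approximating the integral of 1/t from 1 to y."""
    width = (y - 1.0) / subintervals
    total = 0.0
    for i in range(subintervals):
        midpoint = 1.0 + (i + 0.5) * width
        total += width / midpoint
    return total

# Try a few guesses for y; we want the integral to be as close to 0.1 as possible.
for guess in (1.1, 1.105, 1.11):
    print(guess, integral_of_reciprocal(guess))
```

Each guess costs thousands of arithmetic operations and still only says whether the guess is too small or too large, which is why the text calls the approach unsatisfactory.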
Another method is to locally approximate the function f with a linear function. This technique
was discussed in MATH1131 (see Chapter 4 of the MATH1131 calculus notes) and is based on the
idea that the tangent lies close to the graph of f near the point of contact. Suppose, once again,
that $f(x) = e^x$. To find an approximate value for $e^{0.1}$, we will approximate f using the tangent to
the graph of f at 0. The tangent function $p_1$ at 0 is a polynomial of degree 1 and is given by
\[ p_1(x) = 1 + x. \]
So when x is close to 0,
\[ e^x \approx 1 + x \]
and hence $e^{0.1} \approx 1.1$.
The polynomial $p_1$ has the property that its value at 0 and gradient at 0 agree with the value
and gradient of f at 0. That is,
\[ p_1(0) = f(0) \quad \text{and} \quad p_1'(0) = f'(0). \]
If we want to improve our approximation, we could generalise this idea and look for a degree two
polynomial $p_2$ such that the value, gradient and concavity of f and $p_2$ agree at 0. That is, if
\[ p_2(x) = b_0 + b_1 x + b_2 x^2, \]
then we choose the coefficients $b_0$, $b_1$ and $b_2$ such that
\[ p_2(0) = f(0), \quad p_2'(0) = f'(0) \quad \text{and} \quad p_2''(0) = f''(0). \]
By calculating the first and second derivatives of f and $p_2$, we find that
\[ b_0 = 1, \quad b_1 = 1 \quad \text{and} \quad 2b_2 = 1. \]
Hence
\[ p_2(x) = 1 + x + \frac{x^2}{2}. \]
One can see in Figure 4.1 that $p_2$ gives a better approximation to f near 0 than does $p_1$. Using $p_2$,
we obtain the approximation
\[ e^{0.1} = f(0.1) \approx p_2(0.1) = 1.105. \]
One can again improve this approximation by using a polynomial of degree 3. If
\[ p_3(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 \]
then we can determine the unknown coefficients by solving the equations
\[ p_3(0) = f(0), \quad p_3'(0) = f'(0), \quad p_3''(0) = f''(0) \quad \text{and} \quad p_3'''(0) = f'''(0). \]
Figure 4.1: Polynomial approximations $y = p_1(x)$, $y = p_2(x)$ and $y = p_3(x)$ (in gray) for the exponential function $y = e^x$ (in black) about 0.
These equations imply that
\[ c_0 = 1, \quad c_1 = 1, \quad 2c_2 = 1 \quad \text{and} \quad 6c_3 = 1 \]
and hence
\[ p_3(x) = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} \]
(see Figure 4.1). Thus we have the approximation
\[ e^{0.1} = f(0.1) \approx p_3(0.1) = 1.1051\dot{6}. \]
The table below compares the approximations to $e^{0.1}$ given by these first, second and third
degree polynomials.

n    $p_n(x)$                                       $p_n(0.1)$         Error in approximation (3 s.f.)
1    $1 + x$                                        1.1                $5.17 \times 10^{-3}$
2    $1 + x + \frac{x^2}{2}$                        1.105              $1.71 \times 10^{-4}$
3    $1 + x + \frac{x^2}{2} + \frac{x^3}{6}$        $1.1051\dot{6}$    $4.25 \times 10^{-6}$

In fact, $e^{0.1}$ (rounded to five decimal places) is equal to 1.10517 and so the approximation $p_3(0.1)$ is
accurate to four decimal places. Each of the polynomials $p_1$, $p_2$ and $p_3$ is called a Taylor polynomial
for f about 0.
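The entries of the table can be checked numerically. The following Python sketch is one way to do so; the helper name `taylor_exp` is an assumption made for illustration, and `math.exp` is used only as a reference value for the error column.

```python
import math

def taylor_exp(x, n):
    """Evaluate 1 + x + x^2/2! + ... + x^n/n! by building each term from the last."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k          # term is now x^k / k!
        total += term
    return total

for n in (1, 2, 3):
    approximation = taylor_exp(0.1, n)
    error = math.exp(0.1) - approximation
    print(f"n = {n}: p_n(0.1) = {approximation:.8f}, error = {error:.3g}")
```

Building each term from the previous one avoids computing large factorials and powers separately.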
Using the same technique, one can attempt to approximate any function f at 0 with a degree n
polynomial $p_n$, provided that f is n-times differentiable at 0. Suppose that
\[ p_n(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_n x^n. \]
We require that
\[ p_n(0) = f(0), \quad p_n'(0) = f'(0), \quad p_n''(0) = f''(0), \quad p_n^{(3)}(0) = f^{(3)}(0), \quad \ldots, \quad p_n^{(n)}(0) = f^{(n)}(0), \]
where $f^{(j)}$ denotes the jth derivative of f. By calculating the derivatives of $p_n$, one finds that
\[ a_0 = f(0), \quad a_1 = f'(0), \quad 2!\,a_2 = f''(0), \quad 3!\,a_3 = f^{(3)}(0), \quad \ldots, \quad n!\,a_n = f^{(n)}(0). \]
Hence
\[ p_n(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \frac{f^{(3)}(0)}{3!}x^3 + \cdots + \frac{f^{(n)}(0)}{n!}x^n. \]
Definition 4.1.1. Suppose that f is n-times differentiable at 0. Then the Taylor
polynomial $p_n$ of degree n for f about 0 is given by
\[ p_n(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \frac{f^{(3)}(0)}{3!}x^3 + \cdots + \frac{f^{(n)}(0)}{n!}x^n. \]
We also call $p_n$ the nth Taylor polynomial for f about 0.
Example 4.1.2. Suppose that $f(x) = e^x$ and $n \geq 0$. Find the Taylor polynomial of degree n for
f about 0.
Solution. It is clear that
\[ f'(x) = e^x, \quad f''(x) = e^x, \quad f'''(x) = e^x, \quad \ldots, \quad f^{(n)}(x) = e^x. \]
Hence $f(0) = f'(0) = f''(0) = \cdots = f^{(n)}(0) = 1$ and
\[ p_n(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} \]
gives the Taylor polynomial of degree n about 0.
Example 4.1.3. Suppose that $f(x) = \sin x$. Find the Taylor polynomials for f up to degree seven
about the point 0.
Solution. We have
\[
\begin{aligned}
f(x) &= \sin x, & f(0) &= 0, \\
f'(x) &= \cos x, & f'(0) &= 1, \\
f''(x) &= -\sin x, & f''(0) &= 0, \\
f^{(3)}(x) &= -\cos x, & f^{(3)}(0) &= -1, \\
f^{(4)}(x) &= \sin x, & f^{(4)}(0) &= 0, \\
&\ \vdots & &\ \vdots
\end{aligned}
\]
and the pattern will repeat itself. Hence
\[
\begin{aligned}
p_1(x) &= x, \\
p_3(x) &= x - \frac{x^3}{3!}, \\
p_5(x) &= x - \frac{x^3}{3!} + \frac{x^5}{5!}, \\
p_7(x) &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!}
\end{aligned}
\]
are the Taylor polynomials for f up to degree seven about the point 0. Their graphs are compared
to the sine function in the following diagram.
[Figure: graphs of $y = \sin x$, $y = p_1(x)$, $y = p_3(x)$, $y = p_5(x)$ and $y = p_7(x)$.]
Note that the sine function is odd and that its Taylor polynomials about 0 are also odd.
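As a numerical companion to Example 4.1.3, the sketch below evaluates the odd-degree Taylor polynomials of sine at a point near 0 and at a point farther away. The function name `sin_taylor` and the sample points 0.5 and 3.0 are illustrative assumptions.

```python
import math

def sin_taylor(x, degree):
    """Taylor polynomial of sin about 0, keeping the odd powers up to x**degree."""
    total = 0.0
    for k in range(1, degree + 1, 2):
        sign = -1.0 if (k // 2) % 2 else 1.0   # +, -, +, - for k = 1, 3, 5, 7
        total += sign * x**k / math.factorial(k)
    return total

for x in (0.5, 3.0):
    for degree in (1, 3, 5, 7):
        error = abs(sin_taylor(x, degree) - math.sin(x))
        print(f"x = {x}, degree {degree}: |error| = {error:.3g}")
```

At the same degree, the error at 3.0 is much larger than the error at 0.5, which matches what the diagram suggests.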
The diagram in Example 4.1.3 shows that the Taylor polynomials about 0 are good approximations
for f near 0. However, the approximations get worse farther away from 0. If one wants
to approximate a function f near a point a, where a is not close to 0, then one can try (i) using
Taylor polynomials about 0 of higher degree (which may or may not give satisfactory results) or
(ii) approximating f with Taylor polynomials about a. Method (ii) is particularly pertinent in the
case when $f(x) = \ln x$; since f is not defined at 0, the Taylor polynomials for f about 0 do not
exist.
Suppose f is n-times differentiable at a. To approximate f about a, we try a degree n polynomial
$p_n$, given by
\[ p_n(x) = a_0 + a_1(x - a) + a_2(x - a)^2 + \cdots + a_n(x - a)^n, \]
where
\[ p_n(a) = f(a), \quad p_n'(a) = f'(a), \quad p_n''(a) = f''(a), \quad \ldots, \quad p_n^{(n)}(a) = f^{(n)}(a). \]
By calculating the derivatives of $p_n$ at a, the unknown coefficients can be determined. Hence
\[ p_n(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f^{(3)}(a)}{3!}(x - a)^3 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n. \]
This leads to the following definition.
Definition 4.1.4. Suppose that f is n-times differentiable at a. Then the Taylor
polynomial $p_n$ of degree n for f about a is given by
\[ p_n(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f^{(3)}(a)}{3!}(x - a)^3 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n. \]
We also call $p_n$ the nth Taylor polynomial for f about a.
The nth Taylor polynomial for f about a can be expressed using summation notation as
\[ p_n(x) = \sum_{k=0}^{n} \frac{f^{(k)}(a)}{k!}(x - a)^k. \]
Here 0! = 1 (by definition) and we use the convention of writing f(a) as $f^{(0)}(a)$.
Example 4.1.5. Suppose that $f(x) = \ln x$. Find the Taylor polynomial of degree 5 for f about 1.
Solution. Since
\[
\begin{aligned}
f(x) &= \ln x, & f(1) &= 0, \\
f'(x) &= \frac{1}{x}, & f'(1) &= 1, \\
f''(x) &= -\frac{1}{x^2}, & f''(1) &= -1, \\
f^{(3)}(x) &= \frac{2!}{x^3}, & f^{(3)}(1) &= 2!, \\
f^{(4)}(x) &= -\frac{3!}{x^4}, & f^{(4)}(1) &= -3!, \\
f^{(5)}(x) &= \frac{4!}{x^5}, & f^{(5)}(1) &= 4!,
\end{aligned}
\]
we see that
\[
\begin{aligned}
p_5(x) &= (x - 1) + \frac{-1}{2!}(x - 1)^2 + \frac{2!}{3!}(x - 1)^3 + \frac{-3!}{4!}(x - 1)^4 + \frac{4!}{5!}(x - 1)^5 \\
&= (x - 1) - \frac{(x - 1)^2}{2} + \frac{(x - 1)^3}{3} - \frac{(x - 1)^4}{4} + \frac{(x - 1)^5}{5}.
\end{aligned}
\]
It is preferable to express such a polynomial in powers of (x − 1) rather than in powers of x. There
is usually no need to expand each of the terms $(x - 1)^k$.
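The summation formula for the Taylor polynomial about a translates directly into code. The sketch below is a minimal illustration: the function name `taylor_polynomial` and the choice to pass the derivative values as a list are assumptions, and the derivative values themselves are the ones computed in Example 4.1.5.

```python
import math

def taylor_polynomial(derivatives_at_a, a, x):
    """Evaluate the sum over k of f^(k)(a) / k! * (x - a)^k.

    derivatives_at_a[k] holds f^(k)(a), with f^(0)(a) = f(a).
    """
    return sum(d / math.factorial(k) * (x - a)**k
               for k, d in enumerate(derivatives_at_a))

# Values of ln and its first five derivatives at 1: 0, 1, -1, 2!, -3!, 4!.
ln_derivatives_at_1 = [0, 1, -1, 2, -6, 24]

x = 1.1
print(taylor_polynomial(ln_derivatives_at_1, 1, x), math.log(x))
```

Working in powers of (x − 1), as the text recommends, is also what the code does: it never expands the terms.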
4.2 Taylor’s theorem
(Ref: SH10 §12.6, 12.7)
We saw in the last section that $e^{0.1}$ can be approximated using Taylor polynomials. Suppose
that a calculator (with a 10 digit display) uses Taylor polynomials to calculate the (approximate)
value of $e^{0.1}$. What degree Taylor polynomial should the calculator use so that the displayed value
is accurate to 10 digits? Importantly, is there a way of answering this question without knowing in
advance the decimal expansion of $e^{0.1}$?
In other words, we seek a method for determining how bad the error can get when approximating
a function by one of its Taylor polynomials without knowing the precise values of the function. The
functions f and g, given in the table below, further highlight the need for such a method.
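Function               nth Taylor polynomial about 0
$f(x) = \sin x$        $p_n(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + (-1)^{(n-1)/2} \frac{x^n}{n!}$  (n odd)
$g(x) = \ln(1 + x)$    $q_n(x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots + (-1)^{n+1} \frac{x^n}{n}$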
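Suppose that we want to approximate f(7). Although 7 is not close to 0, the graph below shows
that the approximation $f(7) \approx p_n(7)$ is reasonable if n = 19 or n = 21.
[Figure: graphs of f and the Taylor polynomials $p_{15}$, $p_{17}$, $p_{19}$ and $p_{21}$, with x = 7 marked on the horizontal axis.]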
By plotting more graphs, one would discover that the approximation $f(7) \approx p_n(7)$ seems to improve
as n increases. We could conjecture that as $n \to \infty$ the error in the approximation approaches 0.
On the other hand, suppose that we want to approximate g(1.5). Although 1.5 is closer to 0
than is 7, the Taylor polynomials do not provide a good approximation to the true value of g(1.5),
as Figure 4.2 shows. The Taylor polynomials approximate g very well on the interval (−0.7, 0.7)
and are not too bad on (0.7, 1). However, to the right of 1 they should not be used to approximate
the function. Moreover, while the higher degree Taylor polynomials give better approximations
near 0, they give larger errors than lower degree polynomials to the right of 1. This surprising
observation highlights the need for establishing a rigorous basis for Taylor polynomial approximation.
In particular, we must have some idea whether the error involved in each approximation is going
to be large or small.
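The contrast between the two functions can also be seen numerically. The following Python sketch is illustrative only; the function names `sin_taylor` and `log1p_taylor` are assumptions, and the degrees are the ones that appear in the two figures.

```python
import math

def sin_taylor(x, n):
    """Taylor polynomial p_n of sin about 0 (n odd), evaluated at x."""
    return sum((-1)**j * x**(2 * j + 1) / math.factorial(2 * j + 1)
               for j in range((n + 1) // 2))

def log1p_taylor(x, n):
    """Taylor polynomial q_n of ln(1 + x) about 0, evaluated at x."""
    return sum((-1)**(k + 1) * x**k / k for k in range(1, n + 1))

print("sin at 7:")
for n in (15, 17, 19, 21):
    print(n, abs(sin_taylor(7.0, n) - math.sin(7.0)))

print("ln(1 + x) at 1.5:")
for n in (5, 8, 11, 14, 17):
    print(n, abs(log1p_taylor(1.5, n) - math.log(2.5)))
```

The first list of errors shrinks as n grows, while the second grows, in line with the behaviour described above.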
In light of the above discussion, we want an exact expression for the difference between a
function and one of its Taylor polynomials. To obtain this expression, we use integration by parts.
Suppose that f has n + 1 continuous derivatives on an open interval I containing 0. We are
about to compute the nth Taylor polynomial of f about 0 in such a way that we keep track of the
difference (also known as the error or remainder) between f(x) and $p_n(x)$. Fix a number x in the
interval I and note that
\[ \int_0^x f'(t)\,dt = f(x) - f(0). \tag{4.1} \]
Figure 4.2: Taylor polynomials $q_5$, $q_8$, $q_{11}$, $q_{14}$ and $q_{17}$ for $g(x) = \ln(1 + x)$ about 0.
On the other hand, we can evaluate the same integral using integration by parts. If we set
\[ u = f'(t), \quad v = -(x - t), \quad \frac{du}{dt} = f''(t), \quad \frac{dv}{dt} = 1, \]
then integration by parts gives
\[
\begin{aligned}
\int_0^x f'(t)\,dt &= \Big[ -f'(t)(x - t) \Big]_0^x + \int_0^x f''(t)(x - t)\,dt \\
&= f'(0)x + \int_0^x f''(t)(x - t)\,dt.
\end{aligned}
\tag{4.2}
\]
Thus, from equations (4.1) and (4.2) we see that
\[ f(x) = f(0) + f'(0)x + \int_0^x f''(t)(x - t)\,dt. \]
(Note that the first two terms on the right-hand side give the Taylor polynomial of degree 1, and
that the third term on the right-hand side is the error in the approximation $f(x) \approx p_1(x)$.) By
applying integration by parts with
\[ u = f''(t), \quad v = -\tfrac{1}{2}(x - t)^2, \quad \frac{du}{dt} = f'''(t), \quad \frac{dv}{dt} = x - t, \]
we obtain
\[ f(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \frac{1}{2!} \int_0^x f'''(t)(x - t)^2\,dt. \]
(The right-hand side is now the second Taylor polynomial of f with the associated error, also
known as the remainder.) If we continue integrating by parts then we obtain after n steps
\[ f(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \frac{f^{(3)}(0)}{3!}x^3 + \cdots + \frac{f^{(n)}(0)}{n!}x^n + \frac{1}{n!} \int_0^x f^{(n+1)}(t)(x - t)^n\,dt. \]
That is,
\[ f(x) = p_n(x) + R_{n+1}(x) \]
where $p_n$ is the nth Taylor polynomial for f about 0 and
\[ R_{n+1}(x) = \frac{1}{n!} \int_0^x f^{(n+1)}(t)(x - t)^n\,dt. \]
We call $R_{n+1}(x)$ the remainder term.
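The identity $f(x) = p_n(x) + R_{n+1}(x)$ can be tested numerically for $f(x) = e^x$, where every derivative is $e^t$. The Python sketch below approximates the integral with Simpson's rule; the function names, the choice of Simpson's rule and the number of panels are assumptions made for illustration, not the Notes' method.

```python
import math

def exp_taylor(x, n):
    """Degree-n Taylor polynomial of exp about 0, evaluated at x."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

def integral_remainder(x, n, panels=1000):
    """Approximate (1/n!) * integral from 0 to x of e^t (x - t)^n dt.

    Uses composite Simpson's rule; `panels` must be even.
    """
    h = x / panels

    def integrand(t):
        return math.exp(t) * (x - t)**n

    total = integrand(0.0) + integrand(x)
    for i in range(1, panels):
        weight = 4 if i % 2 else 2
        total += weight * integrand(i * h)
    return total * h / 3 / math.factorial(n)

for n in (1, 2, 3):
    exact_error = math.exp(0.1) - exp_taylor(0.1, n)
    print(n, exact_error, integral_remainder(0.1, n))
```

The two columns agree to many decimal places, as the derivation predicts.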
This argument may be easily adapted for Taylor polynomials about a point a, rather than 0.
We thus have the following theorem.
Theorem 4.2.1 (Taylor’s theorem). Suppose that f has n + 1 continuous derivatives on an open
interval I containing a. Then for each x in I,
\[ f(x) = p_n(x) + R_{n+1}(x), \]
where the nth Taylor polynomial $p_n$ about a is given by
\[ p_n(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f^{(3)}(a)}{3!}(x - a)^3 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n \]
and the remainder $R_{n+1}(x)$ is given by
\[ R_{n+1}(x) = \frac{1}{n!} \int_a^x f^{(n+1)}(t)(x - t)^n\,dt. \tag{4.3} \]
Taylor’s theorem tells us that the error in the approximation $f(x) \approx p_n(x)$ is exactly equal
to $R_{n+1}(x)$. However, the remainder $R_{n+1}(x)$, as given in the integral form (4.3), is usually very
difficult to compute. A more convenient form is known as the Lagrange formula for the remainder.
Corollary 4.2.2 (Lagrange formula for the remainder). Suppose that f has n + 1 continuous
derivatives on an open interval I containing a. Then for each x in I,
\[ f(x) = p_n(x) + R_{n+1}(x), \]
where $p_n$ is the nth Taylor polynomial for f about a and
\[ R_{n+1}(x) = \frac{f^{(n+1)}(c)}{(n + 1)!}(x - a)^{n+1} \tag{4.4} \]
for some real number c between a and x.
The proof of Corollary 4.2.2 is left as an exercise in the tutorial problems.
Remark 4.2.3. Taylor’s theorem with the Lagrange formula for the remainder is a generalisation
of the mean value theorem. To see this, suppose that f is once differentiable. Then
\[ f(x) - p_0(x) = R_1(x), \]
which means that
\[ f(x) - f(a) = f'(c)(x - a) \]
for some c between x and a. In other words,
\[ \frac{f(x) - f(a)}{x - a} = f'(c) \]
for some c between x and a.
In most examples, it is difficult to find the exact value of the number c that appears in the
Lagrange formula for the remainder. Instead, one uses the fact that c lies between x and a to find
an upper bound for the remainder term. That is, while we may not know what c is, we do know
where c is.
Example 4.2.4. Suppose that f(x) = cos x. By considering the second Taylor polynomial for f
about 0, estimate f(1/5) and find an upper bound for the error.
Solution. The second Taylor polynomial $p_2$ is given by
\[
\begin{aligned}
p_2(x) &= f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 \\
&= \cos(0) - \sin(0)x - \frac{\cos(0)}{2!}x^2 \\
&= 1 - \frac{x^2}{2!}.
\end{aligned}
\]
Hence we use the estimate $\cos(\tfrac{1}{5}) \approx p_2(\tfrac{1}{5}) = \tfrac{49}{50}$. Using the Lagrange formula for the remainder, we
calculate an upper bound for the absolute error:
\[
\begin{aligned}
|\text{error}| &= |f(1/5) - p_2(1/5)| \\
&= |R_3(1/5)| && \text{(by Taylor's theorem)} \\
&= \left| \frac{f^{(3)}(c)}{3!} (1/5)^3 \right| && \text{(for some } c \text{ in } [0, 1/5]) \\
&= \frac{\sin c}{6} \times \frac{1}{125} && \text{(for some } c \text{ in } [0, 1/5]) \\
&\leq \frac{1}{6} \times \frac{1}{125} && \text{(since } \sin c \leq 1) \\
&= \frac{1}{750}.
\end{aligned}
\]
So an upper bound for the error is $\tfrac{1}{750}$, which is approximately 0.001333.
This upper bound is actually quite crude due to the estimate $\sin c \leq 1$. If instead we use the
inequality
\[ \sin t < t \quad \text{whenever } t > 0 \]
(see Chapter 5 of MATH1131), then $\sin c < c < \tfrac{1}{5}$ and we obtain the upper bound
\[ |\text{error}| < \frac{1}{3750}. \]
In summary, $\cos(\tfrac{1}{5}) \approx \tfrac{49}{50}$ and the absolute error in this estimate is less than $\tfrac{1}{3750}$ (that is, less than
0.0002667).
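The numbers in Example 4.2.4 can be checked with a few lines of Python. In this sketch the variable names are assumptions made for illustration; exact fractions are used for the estimate and the two bounds, and `math.cos` supplies the reference value.

```python
import math
from fractions import Fraction

x = Fraction(1, 5)
estimate = 1 - x**2 / 2                  # p_2(1/5)
crude_bound = Fraction(1, 6) * x**3      # uses sin c <= 1
sharper_bound = x / 6 * x**3             # uses sin c < c < 1/5

actual_error = abs(math.cos(float(x)) - float(estimate))
print(estimate, crude_bound, sharper_bound, actual_error)
```

The actual error lies below both bounds, and the sharper bound is five times smaller than the crude one.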
Example 4.2.5. A calculator with a 10 digit display uses a Taylor polynomial about 0 to estimate
$e^{0.1}$. What degree polynomial should be used to guarantee that $e^{0.1}$ is displayed accurately?
Solution. We consider the function f given by $f(x) = e^x$. By Taylor’s theorem,
\[ e^x = p_n(x) + R_{n+1}(x) \]
where
\[ p_n(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} \]
and
\[ R_{n+1}(x) = \frac{e^c}{(n + 1)!} x^{n+1} \]
for some c between 0 and x. Since the calculator will display the first ten digits of $p_n(0.1)$, it suffices
if the error in the approximation $e^{0.1} \approx p_n(0.1)$ is less than $10^{-10}$. That is, we need to find n large
enough so that
\[ |R_{n+1}(0.1)| < 10^{-10}. \tag{4.5} \]
Now
\[
\begin{aligned}
|R_{n+1}(0.1)| &= \frac{e^c}{(n + 1)!} (0.1)^{n+1} && \text{(for some } c \text{ in } [0, 0.1]) \\
&\leq \frac{e^{0.1}}{(n + 1)!} (0.1)^{n+1} && \text{(since exp is an increasing function)} \\
&< \frac{2}{(n + 1)!} (0.1)^{n+1} && \text{(since } e^{0.1} < 3^{0.1} < 3^{1/2} < 2) \\
&= \frac{2}{(n + 1)!} 10^{-(n+1)} && \text{(since } 0.1 = 10^{-1}).
\end{aligned}
\]
Now $\frac{2}{(n+1)!} \leq 1$ whenever $n \geq 1$, so we obtain the (very crude) estimate
\[ |R_{n+1}(0.1)| < 10^{-(n+1)} \]
provided that $n \geq 1$. Thus if n = 9 then (4.5) is satisfied. That is, if the calculator displays the
first 10 digits appearing in the decimal expansion of $p_9(0.1)$, then the number appearing on the
display will be accurate.
However, we can do better than this. By trial and error we find that the first (positive) integer
n such that
\[ \frac{2}{(n + 1)!} 10^{-(n+1)} < 10^{-10} \]
is 6. That is, the calculator should display the first 10 digits appearing in the decimal expansion
of $p_6(0.1)$.
(Note that we did not use the decimal expansion of $e^{0.1}$ to estimate the error.)
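The trial and error in this solution is easy to automate. The following Python sketch is an illustration under stated assumptions: the function names are invented here, exact fractions are used so that the comparison with $10^{-10}$ is not affected by rounding, and `math.exp` is used only at the end as an independent check.

```python
import math
from fractions import Fraction

def remainder_bound(n):
    """The bound 2 / (n + 1)! * 10^-(n + 1) on |R_{n+1}(0.1)| from the solution."""
    return Fraction(2, math.factorial(n + 1)) * Fraction(1, 10**(n + 1))

def smallest_sufficient_degree(tolerance=Fraction(1, 10**10)):
    """First positive integer n whose bound is below the tolerance."""
    n = 1
    while remainder_bound(n) >= tolerance:
        n += 1
    return n

def exp_taylor(x, n):
    """Degree-n Taylor polynomial of exp about 0, evaluated at x."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

n = smallest_sufficient_degree()
print(n, float(remainder_bound(n)), abs(math.exp(0.1) - exp_taylor(0.1, n)))
```

The search stops at the same degree as the trial and error in the text, and the observed error is smaller than the bound.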
4.2.1 Classifying stationary points
Taylor’s theorem can also be applied to the problem of classifying the stationary points of
differentiable functions. For example, the function f given by $f(x) = (x - 3)^4$ has a stationary point at 3
(since $f'(3) = 0$). However, since $f''(3) = 0$, the second derivative test cannot be used to determine
whether the stationary point is a maximum, minimum or horizontal point of inflexion. By using
Taylor’s theorem, one can deduce the following improvement on the second derivative test.
Corollary 4.2.6. Suppose that f is n times differentiable at a and that $f'(a) = 0$. If
\[ f''(a) = f'''(a) = \cdots = f^{(k-1)}(a) = 0 \]
but $f^{(k)}(a) \neq 0$, where $k \leq n$, then
(i) a is a local minimum point if k is even and $f^{(k)}(a) > 0$;
(ii) a is a local maximum point if k is even and $f^{(k)}(a) < 0$;
(iii) a is a horizontal point of inflexion if k is odd.
Sketch proof. Suppose that f is n times differentiable at a, and that
\[ f'(a) = f''(a) = f'''(a) = \cdots = f^{(k-1)}(a) = 0 \quad \text{and} \quad f^{(k)}(a) \neq 0. \]
We will make the additional assumptions that $f^{(k)}(x)$ exists for all x sufficiently close to a and that
$f^{(k)}$ is continuous at a. (A proof of the general case is more complicated and will be omitted.)
Taylor’s theorem (with Lagrange’s remainder) implies that
\[
\begin{aligned}
f(x) &= f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \cdots + \frac{f^{(k-1)}(a)}{(k - 1)!}(x - a)^{k-1} + \frac{f^{(k)}(c)}{k!}(x - a)^k \\
&= f(a) + 0 + 0 + \cdots + 0 + \frac{f^{(k)}(c)}{k!}(x - a)^k \\
&= f(a) + \frac{f^{(k)}(c)}{k!}(x - a)^k
\end{aligned}
\tag{4.6}
\]
for some c between x and a.
We now prove case (i). If k is even then $(x - a)^k > 0$ whenever $x \neq a$. Since $f^{(k)}(a) > 0$ and
$f^{(k)}$ is continuous at a, we conclude that $f^{(k)}(c) > 0$ whenever x (and hence c) is sufficiently close
to a. By combining these inequalities with (4.6), we conclude that $f(x) \geq f(a)$ for all x sufficiently
close to a. Hence a is a local minimum point for f.
Cases (ii) and (iii) are proved in a similar way.
Example 4.2.7. Suppose that
\[ f(x) = x^7 - 17x^6 + 101x^5 - 229x^4 + 3x^3 + 621x^2 - 297x - 567. \]
You are given that 3 is a stationary point of f. Classify this stationary point.
Solution. It is easy to check that
\[ f'(3) = f''(3) = f'''(3) = 0 \quad \text{and} \quad f^{(4)}(3) = -1536 < 0. \]
By applying Corollary 4.2.6 (with k equal to 4), we conclude that 3 is a local maximum point for
f.
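The derivative values quoted in this solution can be verified mechanically. The Python sketch below applies Corollary 4.2.6 to a polynomial given by its list of coefficients; the function names and the highest-power-first coefficient convention are assumptions made for illustration.

```python
def differentiate(coeffs):
    """Coefficients (highest power first) of the derivative of a polynomial."""
    degree = len(coeffs) - 1
    return [c * (degree - i) for i, c in enumerate(coeffs[:-1])]

def evaluate(coeffs, x):
    """Horner evaluation of a polynomial given highest-power-first coefficients."""
    result = 0
    for c in coeffs:
        result = result * x + c
    return result

def classify_stationary_point(coeffs, a):
    """Apply Corollary 4.2.6: find the first k >= 2 with f^(k)(a) != 0."""
    current = differentiate(coeffs)                      # coefficients of f'
    assert evaluate(current, a) == 0, "a is not a stationary point"
    k = 1
    while current:
        current = differentiate(current)
        k += 1
        value = evaluate(current, a)
        if value != 0:
            if k % 2 == 1:
                kind = "horizontal point of inflexion"
            else:
                kind = "local minimum" if value > 0 else "local maximum"
            return k, value, kind
    return None  # every derivative vanishes at a, so f is constant

# f(x) = x^7 - 17x^6 + 101x^5 - 229x^4 + 3x^3 + 621x^2 - 297x - 567
f_coeffs = [1, -17, 101, -229, 3, 621, -297, -567]
print(classify_stationary_point(f_coeffs, 3))
```

Because the coefficients and the point are integers, all arithmetic here is exact.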
4.2.2 Some questions arising from Taylor’s theorem
To set the agenda for the rest of the chapter, we return now to our solution of Example 4.2.5, where
Taylor polynomials were used to estimate $e^{0.1}$. We saw that
\[ e^{0.1} = 1 + (0.1) + \frac{(0.1)^2}{2!} + \frac{(0.1)^3}{3!} + \cdots + \frac{(0.1)^n}{n!} + R_{n+1}(0.1), \tag{4.7} \]
where
\[ 0 < |R_{n+1}(0.1)| < \frac{2}{10^{n+1}(n + 1)!}. \]
Since
\[ \lim_{n \to \infty} \frac{2}{10^{n+1}(n + 1)!} = 0, \]
it seems reasonable to surmise that $|R_{n+1}(0.1)|$ (and hence $R_{n+1}(0.1)$) approaches 0 as $n \to \infty$.
By letting n approach infinity in (4.7), this suggests that
\[ e^{0.1} = 1 + (0.1) + \frac{(0.1)^2}{2!} + \frac{(0.1)^3}{3!} + \frac{(0.1)^4}{4!} + \frac{(0.1)^5}{5!} + \frac{(0.1)^6}{6!} + \cdots. \]
We therefore ask the following questions.
• What do we mean by $R_{n+1}(0.1) \to 0$ as $n \to \infty$? More generally, given a sequence of
numbers $\{a_1, a_2, a_3, \ldots\}$, how do we determine the limiting behaviour of $a_n$ as $n \to \infty$?
• What is meant by the infinite sum
\[ 1 + (0.1) + \frac{(0.1)^2}{2!} + \frac{(0.1)^3}{3!} + \frac{(0.1)^4}{4!} + \frac{(0.1)^5}{5!} + \frac{(0.1)^6}{6!} + \cdots? \]
Does the sum converge to a real number or (since we are adding infinitely many positive
numbers) does the sum diverge to infinity? If the sum does converge to a real number
then is that number $e^{0.1}$? More generally, given an infinite sum (also called a series)
$a_1 + a_2 + a_3 + \cdots$, how can we determine whether the series converges to a real number
or else diverges?
• Each of these questions may be framed in a much larger context. Given a function f which
is infinitely differentiable at a, Taylor’s theorem gives
\[ f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f^{(3)}(a)}{3!}(x - a)^3 + \cdots + \frac{f^{(n)}(a)}{n!}(x - a)^n + R_{n+1}(x). \]
For what values of x will $\lim_{n \to \infty} R_{n+1}(x) = 0$? For what values of x will the infinite series
\[ f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f^{(3)}(a)}{3!}(x - a)^3 + \cdots \]
converge to a finite number? If the series does converge for some x, is it always true that
\[ f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f^{(3)}(a)}{3!}(x - a)^3 + \cdots? \]
The example of ln(x + 1) given at the beginning of this section suggests that we cannot always
answer ‘yes’ to each of these questions. To appreciate a comprehensive answer to the questions
posed, we must spend some time studying the limiting behaviour of sequences and series. We do
so in the next three sections.
4.3 Sequences
(Ref: SH10 §11.6–11.4)
A sequence is a real-valued function defined on (a subset of) the natural numbers. For example,
the function $f : \mathbb{N} \to \mathbb{R}$, given by $f(n) = n^2$, is a sequence. Sequences are usually denoted using a
and b, rather than f and g. In the case of sequences, subscript notation is traditionally preferred
over function notation. Thus
\[ a(n) = n^2 \quad \forall n \in \mathbb{N} \]
and
\[ a_n = n^2 \quad \text{whenever } n = 0, 1, 2, 3, \ldots \]
describe the same sequence, but the second notation is used more frequently. Another way of
describing the same sequence is
\[ \{a_n\} = \{0, 1, 4, 9, 16, 25, \ldots\}, \]
or more precisely,
\[ \{a_n\}_{n=0}^{\infty} = \{n^2\}_{n=0}^{\infty}. \]
In each case, the number $a_n$ is called the nth term of the sequence.
Example 4.3.1. The sequence
\[ \left\{ \frac{1}{2}, \frac{2}{5}, \frac{3}{10}, \frac{4}{17}, \frac{5}{26}, \ldots \right\} \]
is described by the rule $a_n = \dfrac{n}{n^2 + 1}$ whenever $n \ge 1$.
Example 4.3.2. The Fibonacci sequence
\[ \{1, 1, 2, 3, 5, 8, 13, 21, 34, 55, \ldots\} \]
is described recursively by the rule
\[ a_n = \begin{cases} 1 & \text{if } n = 1 \text{ or } n = 2 \\ a_{n-1} + a_{n-2} & \text{if } n \ge 3. \end{cases} \]
Example 4.3.3. Suppose that $f$ is $m + 1$ times differentiable at $a$. Then for each $x$ in $\mathbb{R}$, the
remainder term $R_{n+1}(x)$ from Taylor's theorem gives rise to the (finite) sequence $\{R_{n+1}(x)\}_{n=1}^{m}$.
Since sequences are a class of functions, we can add, subtract, multiply and divide any two
sequences that share a common domain. For example,
\[ \{n^2\}_{n\in\mathbb{N}} \, \{\sqrt{n}\}_{n\in\mathbb{N}} = \{n^2\sqrt{n}\}_{n\in\mathbb{N}}. \]
If the domain (in this case $\mathbb{N}$) is understood, then one can simply write $\{n^2\}\{\sqrt{n}\} = \{n^2\sqrt{n}\}$.
4.3.1 Describing the limiting behaviour of sequences
Suppose that $\{a_n\}$ is a sequence. Our primary objective is to describe the behaviour of $a_n$ as
$n \to \infty$. There are two main types of behaviour. Either
(a) $a_n$ approaches some finite number $L$, in which case we say that the sequence $\{a_n\}$ is
convergent and write $\lim_{n\to\infty} a_n = L$; or
(b) the sequence $\{a_n\}$ is not convergent, in which case we say that $\{a_n\}$ is divergent.
Divergent sequences can be further classified according to the list below.
(i) If $a_n \to \infty$ as $n \to \infty$ (that is, $a_n$ grows without bound) then we say that the sequence
$\{a_n\}$ diverges to infinity.
(ii) If $a_n \to -\infty$ as $n \to \infty$ then we say that the sequence $\{a_n\}$ diverges to negative infinity.
(iii) If $\{a_n\}$ has no limit as $n \to \infty$ but remains bounded then we say that $\{a_n\}$ is boundedly
divergent.
(iv) If $\{a_n\}$ exhibits none of the above behaviour then we say that $\{a_n\}$ is unboundedly divergent.
There are rigorous definitions for each of these cases. We will see one of them later in Remark
4.3.5.
Example 4.3.4. Describe the behaviour of each sequence $\{a_n\}$ as $n \to \infty$.
(a) $a_n = n^3$
(b) $a_n = \sin(n\pi/2)$
(c) $a_n = \dfrac{3n^2}{n^2 + 4n + 3}$
(d) $a_n = (-1)^n 2^n$
Solution. (a) Since $a_n \to \infty$ as $n \to \infty$, the sequence $\{a_n\}$ diverges to infinity.
(b) Since
\[ \{a_n\}_{n=0}^{\infty} = \{0, 1, 0, -1, 0, 1, 0, -1, 0, \ldots\} \]
we see that $\{a_n\}$ is bounded but does not have a limit. Hence $\{a_n\}$ is boundedly divergent.
(c) Since
\[ \frac{3n^2}{n^2 + 4n + 3} = \frac{3}{1 + 4/n + 3/n^2} \to \frac{3}{1 + 0 + 0} \]
as $n \to \infty$, we conclude that $\{a_n\}$ converges to 3.
(d) Since
\[ \{a_n\}_{n\in\mathbb{N}} = \{1, -2, 4, -8, 16, -32, 64, -128, \ldots\} \]
we see that $\{a_n\}$ is not bounded. Moreover, since the even terms of the sequence approach infinity
while the odd terms approach negative infinity, we conclude that the sequence $\{a_n\}$ is unboundedly
divergent.
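The four behaviours in Example 4.3.4 can be explored numerically. The following Python sketch is an illustration only (the helper name `terms` is hypothetical and not part of these Notes); listing finitely many terms suggests the limiting behaviour but does not prove it.

```python
import math

def terms(a, count):
    """Return the list [a(0), a(1), ..., a(count - 1)] (hypothetical helper)."""
    return [a(n) for n in range(count)]

# (a) a_n = n^3
cubes = terms(lambda n: n**3, 6)
# (b) a_n = sin(n*pi/2), rounded to remove floating-point noise
sines = terms(lambda n: round(math.sin(n * math.pi / 2)), 8)
# (c) a_n = 3n^2 / (n^2 + 4n + 3)
ratios = terms(lambda n: 3 * n**2 / (n**2 + 4 * n + 3), 10001)
# (d) a_n = (-1)^n 2^n
signed = terms(lambda n: (-1)**n * 2**n, 8)

print(cubes)       # terms keep growing
print(sines)       # terms cycle through 0, 1, 0, -1
print(ratios[-1])  # the term with n = 10000 is close to 3
print(signed)      # signs alternate while magnitudes double
```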
Remark 4.3.5. The formal definition for $\lim_{n\to\infty} a_n = L$ is similar to that for $\lim_{x\to\infty} f(x) = L$ (see the
MATH1131 calculus notes) and is given as follows.
Suppose that $\{a_n\}_{n=0}^{\infty}$ is a sequence of real numbers and that $L \in \mathbb{R}$. We write
\[ \lim_{n\to\infty} a_n = L \]
if, for every positive number $\epsilon$, there is a number $M$ such that $|a_n - L| < \epsilon$
whenever $n > M$.
This definition may be interpreted geometrically.
[Figure: the terms $a_n$ plotted against $n$, with a horizontal band between $L - \epsilon$ and $L + \epsilon$ about the limit $L$; every term with $n > M$ lies inside the band.]
Given any small number $\epsilon$, there is a point $M$ such that the distance between $a_n$ and the limit
$L$ is less than $\epsilon$ for every $n$ past $M$. In other words, for every small band about the limit $L$, there
is a point $M$ such that the sequence always lies in the band past $M$.
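As an illustration of the definition, the following Python sketch searches for the first index at which the sequence $a_n = 3n^2/(n^2 + 4n + 3)$ of Example 4.3.4(c) enters the band of half-width $\epsilon$ about its limit 3. The function names and the search cap `max_n` are assumptions made for this sketch. For this particular sequence the error $|a_n - 3| = (12n + 9)/(n^2 + 4n + 3)$ decreases as $n$ increases, so once a term is inside the band all later terms are too; a finite search on its own would not establish that.

```python
def a(n):
    """The sequence a_n = 3n^2 / (n^2 + 4n + 3) from Example 4.3.4(c)."""
    return 3 * n**2 / (n**2 + 4 * n + 3)

def first_index_within(eps, limit=3.0, max_n=10**6):
    """Smallest n <= max_n with |a_n - limit| < eps, or None if none is found."""
    for n in range(max_n + 1):
        if abs(a(n) - limit) < eps:
            return n
    return None

# Any M with M >= first_index_within(eps) - 1 then works in the definition.
for eps in (1.0, 0.1, 0.01):
    print(eps, first_index_within(eps))
```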
4.3.2 Techniques for calculating limits of sequences
Many of the rules and techniques given in MATH1131 for calculating limits of functions also apply
for limits of sequences.
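The first proposition of this subsection shows that limits behave well under the standard
arithmetic operations.
Proposition 4.3.6. Suppose that $\lim_{n\to\infty} a_n$ and $\lim_{n\to\infty} b_n$ exist. Then
(i) $\lim_{n\to\infty} (a_n + b_n) = \lim_{n\to\infty} a_n + \lim_{n\to\infty} b_n$;
(ii) $\lim_{n\to\infty} (a_n b_n) = \lim_{n\to\infty} a_n \times \lim_{n\to\infty} b_n$;
(iii) $\lim_{n\to\infty} \dfrac{a_n}{b_n} = \dfrac{\lim_{n\to\infty} a_n}{\lim_{n\to\infty} b_n}$, provided that $\lim_{n\to\infty} b_n \ne 0$ and $b_n \ne 0$ for any $n$; and
(iv) $\lim_{n\to\infty} (\alpha a_n) = \alpha \lim_{n\to\infty} a_n$ for every real number $\alpha$.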
Example 4.3.7. Suppose that $a_n = \sqrt{n^2 + 4n} - n$. Determine the limiting behaviour of $a_n$ as
$n \to \infty$.
Solution. We proceed in the same manner as if we were asked to calculate $\lim_{x\to\infty} (\sqrt{x^2 + 4x} - x)$.
Now
\begin{align*}
a_n &= \sqrt{n^2 + 4n} - n \\
&= \frac{(\sqrt{n^2 + 4n} - n)(\sqrt{n^2 + 4n} + n)}{\sqrt{n^2 + 4n} + n} && \text{(multiplying top and bottom by the `conjugate')} \\
&= \frac{n^2 + 4n - n^2}{\sqrt{n^2 + 4n} + n} && \text{(difference of two squares)} \\
&= \frac{4n}{\sqrt{n^2 + 4n} + n} \\
&= \frac{4}{\sqrt{1 + 4/n} + 1} && \text{(dividing top and bottom by $n$).}
\end{align*}
Clearly $4/n \to 0$ as $n \to \infty$ and so by applying the limit rules of the above proposition we find that
$a_n \to 2$ as $n \to \infty$.
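The algebra in Example 4.3.7 can be checked numerically. In the Python sketch below (function names are illustrative assumptions), `a_direct` evaluates the original expression and `a_conjugate` evaluates the rearranged form; both should approach 2. The direct form subtracts two nearly equal numbers for large $n$, so in floating-point arithmetic it is usually the less accurate of the two.

```python
import math

def a_direct(n):
    """a_n = sqrt(n^2 + 4n) - n, evaluated as written."""
    return math.sqrt(n**2 + 4 * n) - n

def a_conjugate(n):
    """The same a_n after multiplying by the conjugate: 4 / (sqrt(1 + 4/n) + 1)."""
    return 4 / (math.sqrt(1 + 4 / n) + 1)

for n in (1, 10, 100, 10**4, 10**6):
    print(n, a_direct(n), a_conjugate(n))
```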
The next proposition can be used when a function is composed with a sequence.
Proposition 4.3.8. Suppose that $\lim_{n\to\infty} a_n = a$ and that $f$ is continuous at $a$. Then
\[ \lim_{n\to\infty} f(a_n) = f(a). \]
This proposition is easy to remember if $f$ is continuous everywhere; it amounts to saying that
the function and limit can be swapped, as shown below:
\[ \lim_{n\to\infty} f(a_n) = f\left( \lim_{n\to\infty} a_n \right). \]
Example 4.3.9. Find $\lim_{n\to\infty} \sin\left( \dfrac{\pi n^2}{4n^2 + 1} \right)$, if it exists.
Solution. Note that the sine function is continuous everywhere. Therefore
\begin{align*}
\lim_{n\to\infty} \sin\left( \frac{\pi n^2}{4n^2 + 1} \right) &= \sin\left( \lim_{n\to\infty} \frac{\pi n^2}{4n^2 + 1} \right) \\
&= \sin\left( \lim_{n\to\infty} \frac{\pi}{4 + 1/n^2} \right) \\
&= \sin\left( \frac{\pi}{4} \right) \\
&= \frac{1}{\sqrt{2}}.
\end{align*}
The following proposition allows the use of l'Hôpital's rule when calculating limits of sequences.
Proposition 4.3.10. Suppose that $\{a_n\}$ is a sequence and $f$ is a function defined on some interval
$(b, \infty)$. If $a_n = f(n)$ for all $n$ sufficiently large and $\lim_{x\to\infty} f(x)$ exists then
\[ \lim_{n\to\infty} a_n = \lim_{x\to\infty} f(x). \]
Note that this proposition only works for limits which are finite. If $a_n = f(n)$ and $f(x)$ diverges
as $x \to \infty$, one cannot say that $a_n$ diverges. (Consider, for example, the case when $a_n = 0$ for all
$n$ and $f(x) = x \sin(\pi x)$ for all $x$.)
The next example shows how Proposition 4.3.10 is applied in conjunction with l'Hôpital's rule.
Example 4.3.11. Suppose that $a_n = (1 + 1/n)^n$. Determine the limiting behaviour of $a_n$ as
$n \to \infty$.
Solution. Suppose that $f(x) = (1 + 1/x)^x$ whenever $x > 0$. Then $a_n = f(n)$ whenever $n > 0$.
Hence
\[ \lim_{n\to\infty} a_n = \lim_{x\to\infty} f(x). \]
The method for calculating the limit of $f$ is standard: we first take the logarithm of $f$ (to remove
the power) and then rearrange the resulting function so that we can apply l'Hôpital's rule. That
is,
\begin{align*}
\lim_{n\to\infty} a_n &= \lim_{x\to\infty} \left( 1 + \frac{1}{x} \right)^x \\
&= \lim_{x\to\infty} \exp\left\{ \ln\left( 1 + \frac{1}{x} \right)^x \right\} && \text{(since ln and exp are inverses)} \\
&= \exp\left\{ \lim_{x\to\infty} \ln\left( 1 + \frac{1}{x} \right)^x \right\} && \text{(since exp is continuous everywhere)} \\
&= \exp\left\{ \lim_{x\to\infty} x \ln\left( 1 + \frac{1}{x} \right) \right\} && \text{(by the log law)} \\
&= \exp\left\{ \lim_{x\to\infty} \frac{\ln(1 + 1/x)}{1/x} \right\} && \text{(to prepare for l'Hôpital's rule)} \\
&= \exp\left\{ \lim_{x\to\infty} \frac{\left( \dfrac{-1/x^2}{1 + 1/x} \right)}{-1/x^2} \right\} && \text{(by l'Hôpital's rule)} \\
&= \exp\left\{ \lim_{x\to\infty} \frac{1}{1 + 1/x} \right\} && \text{(simplifying the fraction)} \\
&= \exp\{1\} \\
&= e.
\end{align*}
In summary, $a_n$ converges to $e$ as $n \to \infty$.
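A quick numerical check of Example 4.3.11 is sketched below in Python; it only tabulates a few terms and the gap to `math.e`, and is no substitute for the argument above. The choice of sample values of $n$ is arbitrary.

```python
import math

def a(n):
    """a_n = (1 + 1/n)^n."""
    return (1 + 1 / n) ** n

# The gap e - a_n shrinks slowly, roughly like e / (2n) for large n.
for n in (1, 10, 100, 1000, 10**6):
    print(n, a(n), math.e - a(n))
```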
Remark 4.3.12. The limit $\lim_{n\to\infty} \left( 1 + \dfrac{1}{n} \right)^n = e$ is a standard result and should be familiar to
students.
Finally, we present a version of the pinching theorem for sequences.
Proposition 4.3.13 (The pinching theorem for sequences). Suppose that $\{a_n\}$, $\{b_n\}$ and $\{c_n\}$ are
sequences and that for some positive integer $N$ the inequality
\[ a_n \le b_n \le c_n \]
is satisfied whenever $n > N$. If $\lim_{n\to\infty} a_n = \lim_{n\to\infty} c_n = L$ then $\lim_{n\to\infty} b_n = L$.
The following example is important because sequences involving both factorials and powers
arise frequently in applications of Taylor's theorem.
Example 4.3.14. Suppose that $a_n = \dfrac{n!}{n^n}$. Discuss the limiting behaviour of $a_n$ as $n \to \infty$.
Solution. Note that
\[ a_n = \frac{1}{n} \cdot \frac{2}{n} \cdot \frac{3}{n} \cdots \frac{n}{n} \le \frac{1}{n} \cdot \frac{n}{n} \cdot \frac{n}{n} \cdots \frac{n}{n} = \frac{1}{n} \]
whenever $n \ge 1$. On the other hand, $a_n$ is always positive. Thus
\[ 0 \le a_n \le \frac{1}{n}. \]
Since $1/n \to 0$ as $n \to \infty$, we conclude that $a_n \to 0$ by the pinching theorem.
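The bound $0 \le a_n \le 1/n$ used in Example 4.3.14 can be verified exactly for small $n$ with rational arithmetic. The Python sketch below is illustrative only (the range of $n$ checked is an arbitrary choice) and uses the standard `fractions` module to avoid rounding error.

```python
from fractions import Fraction
from math import factorial

def a(n):
    """a_n = n! / n^n as an exact rational number."""
    return Fraction(factorial(n), n**n)

for n in (1, 2, 5, 10, 20):
    # Exact comparison: 0 < a_n <= 1/n.
    assert 0 < a(n) <= Fraction(1, n)
    print(n, float(a(n)), 1 / n)
```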
A similar technique was used in Lemma 2.2.5 to prove that
\[ \lim_{n\to\infty} \frac{c^n}{n!} = 0 \]
whenever $c > 0$.
Remark 4.3.15. It is helpful to have a good intuition of the order of growth of sequences. The
following table compares the growth of various sequences as $n \to \infty$. The lower down on the table
the sequence appears, the faster it grows at infinity.
$a_n$                      growth rate as $n \to \infty$
$1$                        constant: does not grow
$\ln n$                    grows slowly
$n^k$, where $k > 0$       growth rate is faster for larger $k$
$c^n$, where $c > 1$       growth rate is faster for larger $c$
$n!$                       grows rapidly
$n^n$                      grows very rapidly
For example, the ordering in the table reflects the fact that
\[ \lim_{n\to\infty} \frac{e^n}{n!} = 0 \quad \text{and} \quad \lim_{n\to\infty} \frac{n!}{n^n} = 0. \]
The final theorem is of great theoretical importance and will be used in later sections. We begin
with a definition.
Definition 4.3.16. A sequence $\{a_n\}_{n=0}^{\infty}$ of real numbers is said to be
(a) increasing if $a_n < a_{n+1}$ for each natural number $n$,
(b) nondecreasing if $a_n \le a_{n+1}$ for each natural number $n$,
(c) decreasing if $a_n > a_{n+1}$ for each natural number $n$, and
(d) nonincreasing if $a_n \ge a_{n+1}$ for each natural number $n$.
If any of these four properties holds then the sequence is said to be monotonic.
Theorem 4.3.17. If $\{a_n\}_{n=0}^{\infty}$ is a bounded monotonic sequence of real numbers then it converges
to some real number $L$.
This theorem is proved using a property that distinguishes the real numbers from the rational
numbers. Given a bounded monotonic sequence of rational numbers, it is not true, in general, that
the sequence converges to a rational number.
4.3.3 Suprema and infima [X]
In this subsection we summarise some more advanced ideas for students studying MATH1241.
Definition 4.3.18. Suppose that $\{a_n\}_{n=0}^{\infty}$ is a sequence of real numbers.
(a) We say that $M$ is an upper bound for $\{a_n\}_{n=0}^{\infty}$ if $a_n \le M$ for every natural
number $n$.
(b) We say that $M$ is a lower bound for $\{a_n\}_{n=0}^{\infty}$ if $a_n \ge M$ for every natural
number $n$.
(c) We say that $K$ is the least upper bound for $\{a_n\}_{n=0}^{\infty}$ if $K$ is an upper bound
for $\{a_n\}_{n=0}^{\infty}$ and $K \le M$ whenever $M$ is an upper bound for $\{a_n\}_{n=0}^{\infty}$.
(d) We say that $K$ is the greatest lower bound for $\{a_n\}_{n=0}^{\infty}$ if $K$ is a lower bound
for $\{a_n\}_{n=0}^{\infty}$ and $K \ge M$ whenever $M$ is a lower bound for $\{a_n\}_{n=0}^{\infty}$.
Example 4.3.19. Find the greatest lower bound and least upper bound for the sequence $\{a_n\}_{n=1}^{\infty}$,
where $a_n = \dfrac{(-1)^n n}{n + 1}$. Prove your answer.
Solution. We write out the first few terms of the sequence to get a feel for what is happening:
\[ \left\{ -\frac{1}{2}, \frac{2}{3}, -\frac{3}{4}, \frac{4}{5}, -\frac{5}{6}, \frac{6}{7}, -\frac{7}{8}, \ldots \right\}. \]
It is clear that the odd terms approach $-1$ (from above) while the even terms approach 1 (from
below).
We will prove that 1 is the least upper bound. First, it is clear that $a_n \le |a_n| = \frac{n}{n+1} < 1$ for every
positive integer $n$. So 1 is an upper bound. Suppose now that $K$ is also an upper bound but that
$K < 1$. Hence $K = 1 - \epsilon$ for some positive number $\epsilon$, while
\[ a_n = \frac{n}{n + 1} \le K \]
for every even positive integer $n$. Therefore
\[ \frac{n}{n + 1} \le 1 - \epsilon \]
and hence
\[ 1 - \frac{1}{n + 1} \le 1 - \epsilon \]
for every even positive integer $n$. But rearranging this inequality gives
\[ n \le \frac{1}{\epsilon} - 1 \]
for every even positive integer $n$, which gives a contradiction since the set of even positive integers has no
upper bound. Hence no such $K$ exists. We conclude that 1 is the least upper bound.
Using a similar technique, one can show that −1 is the greatest lower bound for the sequence.
Note that the sequence has neither a maximum nor minimum value.
The fact that every bounded monotonic sequence of real numbers has a limit in R (see Theorem
4.3.17) follows from one of the axioms of the real number system. This axiom is called the least
upper bound axiom and may be stated as
‘Every nonempty set of real numbers that has an upper bound has a least upper
bound.’
Note that this axiom is not true for the rational number system.
To prove Theorem 4.3.17 in the case when $\{a_n\}_{n=0}^{\infty}$ is a bounded increasing sequence of real
numbers, we note that the values of the sequence form a bounded nonempty set of real numbers.
By the least upper bound axiom, it therefore has a least upper bound, which we denote by $L$. Using
the definition of the limit (see Remark 4.3.5), one can now show that $\lim_{n\to\infty} a_n = L$. The proof of
the other cases is similar.
We now introduce some alternate terminology and new notation for least upper bound and
greatest lower bound.
Definition 4.3.20. Suppose that $\{a_n\}_{n=0}^{\infty}$ is a sequence of real numbers.
(a) If $\{a_n\}_{n=0}^{\infty}$ has a least upper bound $M$, then $M$ is also called the supremum
of $\{a_n\}_{n=0}^{\infty}$ and is denoted by
\[ \sup_{n \ge 0} a_n \quad \text{or} \quad \sup\{a_n : n \ge 0\}. \]
(b) If $\{a_n\}_{n=0}^{\infty}$ has a greatest lower bound $M$, then $M$ is also called the infimum
of $\{a_n\}_{n=0}^{\infty}$ and is denoted by
\[ \inf_{n \ge 0} a_n \quad \text{or} \quad \inf\{a_n : n \ge 0\}. \]
The plurals of supremum and infimum are suprema and infima.
4.4 Infinite series
(Ref: SH10 §12.1, 12.2)
At the end of Section 4.2, we asked the question: what is meant by the infinite sum
\[ 1 + (0.1) + \frac{(0.1)^2}{2!} + \frac{(0.1)^3}{3!} + \frac{(0.1)^4}{4!} + \frac{(0.1)^5}{5!} + \frac{(0.1)^6}{6!} + \cdots \, ? \]
In this section, we will develop a framework that gives meaning to this infinite sum by using existing
notions for convergence (and divergence) of sequences. The key is to recognise that
• the terms of the above series form a sequence $\{a_k\}_{k=0}^{\infty}$, where
\[ a_k = \frac{(0.1)^k}{k!}; \]
and
• if $s_n$ denotes the sum of the terms $a_0, a_1, \ldots, a_n$ of the series (so that
\[ s_n = 1 + (0.1) + \frac{(0.1)^2}{2!} + \frac{(0.1)^3}{3!} + \cdots + \frac{(0.1)^n}{n!} \]
whenever $n \ge 0$) then $\{s_n\}_{n=0}^{\infty}$ is also a sequence.
Thus questions concerning the meaning of an infinite series can be reduced to studying the limiting
behaviour of $\{s_n\}_{n=0}^{\infty}$. (Naturally, the limiting behaviour of $\{s_n\}$ depends on limiting properties of
the sequence $\{a_k\}$; we will pay more attention to this aspect of the theory in Section 4.5.)
Definition 4.4.1. Suppose that $\{a_k\}_{k=0}^{\infty}$ is a sequence of real numbers. For each
natural number $n$, let $s_n$ denote the nth partial sum given by
\[ s_n = a_0 + a_1 + a_2 + \cdots + a_n = \sum_{k=0}^{n} a_k. \]
If the sequence $\{s_n\}_{n=0}^{\infty}$ of partial sums converges to a number $L$ then we say that
the infinite series $\sum_{k=0}^{\infty} a_k$ converges to $L$ and we write
\[ \sum_{k=0}^{\infty} a_k = L. \]
In this case we also say that the series is summable. If the sequence $\{s_n\}_{n=0}^{\infty}$ of
partial sums diverges then we say that the infinite series $\sum_{k=0}^{\infty} a_k$ diverges.
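To make the definition concrete, the Python sketch below computes the first few partial sums $s_n$ of the series with $a_k = (0.1)^k/k!$ and compares them with `math.exp(0.1)`. The helper name `partial_sums` is an assumption of this sketch. Numerical agreement is suggestive only; it does not by itself prove that the series converges to $e^{0.1}$.

```python
import math
from itertools import accumulate

def partial_sums(term, count):
    """Return [s_0, ..., s_{count-1}], where s_n = term(0) + ... + term(n)."""
    return list(accumulate(term(k) for k in range(count)))

sums = partial_sums(lambda k: 0.1**k / math.factorial(k), 8)
for n, s in enumerate(sums):
    # Each row shows n, the partial sum s_n and its distance from exp(0.1).
    print(n, s, abs(s - math.exp(0.1)))
```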
The following example illustrates this definition.
Example 4.4.2 (Geometric series). Suppose that $r \in \mathbb{R}$ and consider the geometric series
\[ \sum_{k=0}^{\infty} r^k = 1 + r + r^2 + r^3 + r^4 + \cdots . \]
Determine the values of $r$ for which the series (a) converges and (b) diverges.
Solution. If $r \ne 1$ then $s_n$ is given by the formula
\[ s_n = \frac{1 - r^{n+1}}{1 - r} \]
(as is taught in high school). In the case when $r = 1$, we simply have
\[ s_n = \underbrace{1 + 1 + 1 + \cdots + 1}_{n+1 \text{ times}} = n + 1. \]
In summary,
\[ s_n = \begin{cases} \dfrac{1 - r^{n+1}}{1 - r} & \text{if } r \ne 1 \\ n + 1 & \text{if } r = 1. \end{cases} \]
To determine the convergence (or otherwise) of the infinite series $\sum_{k=0}^{\infty} r^k$, we simply determine
the convergence of $\{s_n\}_{n=0}^{\infty}$. We break this up into four cases.
• If $|r| < 1$ then $r^{n+1} \to 0$ as $n \to \infty$ and so
\[ \lim_{n\to\infty} s_n = \lim_{n\to\infty} \frac{1 - r^{n+1}}{1 - r} = \frac{1}{1 - r}. \]
Since the sequence of partial sums converges, so does the series and thus
\[ \sum_{k=0}^{\infty} r^k = \frac{1}{1 - r}. \]
• If $|r| > 1$ then $r^{n+1}$ diverges (either to infinity if $r > 1$ or unboundedly if $r < -1$). The
sequence $\{s_n\}$ of partial sums therefore diverges and consequently the series $\sum_{k=0}^{\infty} r^k$ also
diverges.
• If $r = 1$ then $s_n = n + 1 \to \infty$ as $n \to \infty$. Consequently the series diverges.
• If $r = -1$ then it is easily seen that $\{s_n\}_{n=0}^{\infty} = \{1, 0, 1, 0, 1, \ldots\}$. Hence $\{s_n\}$ is boundedly
divergent and the series $\sum_{k=0}^{\infty} r^k$ also diverges.
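The four cases of Example 4.4.2 can be seen in a short computation. The Python sketch below (the function name is an illustrative assumption) sums the terms directly, so it can also be used to check the closed formula for $s_n$ on particular values.

```python
def geometric_partial_sum(r, n):
    """s_n = 1 + r + r^2 + ... + r^n, computed term by term."""
    return sum(r**k for k in range(n + 1))

print(geometric_partial_sum(0.5, 50))                       # close to 1/(1 - 0.5) = 2
print(geometric_partial_sum(3, 5), (1 - 3**6) // (1 - 3))   # direct sum and closed formula
print(geometric_partial_sum(1, 9))                          # n + 1 terms, each equal to 1
print([geometric_partial_sum(-1, n) for n in range(5)])     # oscillates between 1 and 0
```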
The next example illustrates the technique of comparing a series with an integral.
Example 4.4.3 (The harmonic series). Show that the harmonic series
\[ \sum_{k=1}^{\infty} \frac{1}{k} \]
diverges.
Solution. This proof uses results from integration. Consider the diagram below.
[Figure: the curve $y = 1/x$ for $x \ge 1$, together with rectangles of width 1 and height $1/k$ on each interval $[k, k+1]$, $k = 1, 2, \ldots, n$; the top-left corner of the kth rectangle is the point $(k, 1/k)$ on the curve.]
It is clear that the area of the kth rectangle is $1/k$, which is also equal to the kth term of the series.
Moreover, the area under the rectangles on the interval $[1, n+1]$ is greater than the area under the
curve over the same interval. From these two observations it follows that
\[ s_n = \sum_{k=1}^{n} \frac{1}{k} \ge \int_1^{n+1} \frac{1}{x} \, dx = \Big[ \ln x \Big]_1^{n+1} = \ln(n + 1). \]
Now $\ln(n + 1) \to \infty$ as $n \to \infty$, so we conclude that $s_n \to \infty$ as $n \to \infty$. Hence the series
diverges.
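The inequality $s_n \ge \ln(n + 1)$ from Example 4.4.3 can be checked numerically for a few values of $n$. The following Python sketch is illustrative only; the function name `harmonic` is an assumption. It also shows how slowly the partial sums grow, which is why divergence is hard to see from a table of values alone.

```python
import math

def harmonic(n):
    """s_n = 1 + 1/2 + ... + 1/n."""
    return sum(1 / k for k in range(1, n + 1))

for n in (1, 10, 100, 1000, 10**5):
    # The partial sum always sits above ln(n + 1).
    print(n, harmonic(n), math.log(n + 1))
```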
Using the technique illustrated in the previous example, one can show that
• $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k^2}$ converges,
• $\displaystyle\sum_{k=1}^{\infty} \frac{1}{\sqrt{k}}$ diverges, and
• $\displaystyle\sum_{k=2}^{\infty} \frac{1}{k \ln k}$ diverges.
All one needs to do is draw a diagram and compare the series with an appropriate improper integral.
Since a convergent infinite series is the limit of a sequence (the sequence of partial sums), many
results for sequences can be interpreted as results for series. The following proposition illustrates
this point.
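Proposition 4.4.4. Suppose that $\sum_{k=0}^{\infty} a_k$ and $\sum_{k=0}^{\infty} b_k$ are two summable series. Then
(i) $\displaystyle\sum_{k=0}^{\infty} (a_k + b_k) = \sum_{k=0}^{\infty} a_k + \sum_{k=0}^{\infty} b_k$; and
(ii) $\displaystyle\sum_{k=0}^{\infty} (\alpha a_k) = \alpha \sum_{k=0}^{\infty} a_k$ for every real number $\alpha$.
Proof. Let $s_n$ and $t_n$ denote the partial sums of $\sum_{k=0}^{\infty} a_k$ and $\sum_{k=0}^{\infty} b_k$ respectively. Now apply
Proposition 4.3.6 (i) and (iv) to the sequences $\{s_n\}$ and $\{t_n\}$.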
Remark 4.4.5. While all the terms of a convergent series contribute to the value of the series, the
convergence (or otherwise) of any series only depends on the 'tail' of the series. That is, the first
hundred, thousand or even billion terms of the series are irrelevant to the question of whether the
series converges. More precisely, given any positive integer $N$,
\[ \sum_{k=0}^{\infty} a_k \text{ converges if and only if } \sum_{k=N}^{\infty} a_k \text{ converges.} \]
With this in mind, all of the theorems presented in the next section are just as true for series
of the form
\[ \sum_{k=1}^{\infty} a_k, \quad \sum_{k=50}^{\infty} a_k \quad \text{or} \quad \sum_{k=2000}^{\infty} a_k \]
as they are for series of the form $\sum_{k=0}^{\infty} a_k$. When the starting point for the series does not matter,
one sometimes simply writes
\[ \sum^{\infty} a_k \quad \text{or} \quad \sum a_k. \]
4.5 Tests for series convergence
(Ref: SH10 §12.3–12.5)
To determine whether a series
\[ \sum_{k=0}^{\infty} a_k \]
converges or diverges, mathematicians have developed some simple tests. Typically, these tests
examine the behaviour of the sequence $\{a_k\}$ and thereby deduce the convergence (or otherwise) of
the corresponding series. In this section we introduce several such tests, including the kth term test,
the integral test, the comparison test, the ratio test and the alternating series test.
Remark 4.5.1 (Warning). Care must be taken through this section not to confuse sequences and
series. For example, suppose that $a_k = \frac{1}{k}$. While the sequence $\{a_k\}$ converges, the infinite series
$\sum a_k$ does not.
4.5.1 Some preliminary results on series summation
The next two results are fundamental to the study of infinite series. They will later be used to
establish some simple tests for convergence or divergence.
Lemma 4.5.2. Suppose that $\{a_k\}_{k=0}^{\infty}$ is a sequence of positive numbers and let $s_n$ denote the partial
sum given by
\[ s_n = \sum_{k=0}^{n} a_k. \]
If $\{s_n\}_{n=0}^{\infty}$ is a bounded sequence then the infinite series $\sum_{k=0}^{\infty} a_k$ is convergent.
Proof. For any natural number $n$,
\[ s_{n+1} = s_n + a_{n+1} > s_n, \]
since $a_{n+1}$ is positive. Hence $\{s_n\}_{n=0}^{\infty}$ is a bounded increasing sequence and hence has a limit $L$
(see Theorem 4.3.17). Therefore
\[ \sum_{k=0}^{\infty} a_k = L \]
and the series converges.
4.5.2 The kth term divergence test
The next test we introduce is a simple test for divergence. One should always use this test first
when trying to decide whether a series converges.
Theorem 4.5.3 (The kth term test for divergence). If $a_k \not\to 0$ as $k \to \infty$ then $\sum_{k=0}^{\infty} a_k$ diverges.
Before proving the theorem, we give an example.
Example 4.5.4. Determine whether the series $\displaystyle\sum_{k=1}^{\infty} \frac{k^2 + 2k}{\sqrt{k^4 + 2}}$ converges.
Solution. If $a_k = \dfrac{k^2 + 2k}{\sqrt{k^4 + 2}}$ then
\begin{align*}
\lim_{k\to\infty} a_k &= \lim_{k\to\infty} \frac{k^2 + 2k}{\sqrt{k^4 + 2}} \\
&= \lim_{k\to\infty} \frac{1 + 2/k}{\sqrt{1 + 2/k^4}} && \text{(by dividing top and bottom by $k^2$)} \\
&= 1.
\end{align*}
Since $a_k \not\to 0$ as $k \to \infty$, the series diverges by the kth term test.
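A brief numerical look at Example 4.5.4 is sketched below in Python (the names `a` and `partial_sum` are assumptions of the sketch). The terms settle near 1 rather than 0, so each extra term adds roughly 1 to the partial sum; for this series every term is in fact at least 1, since $(k^2 + 2k)^2 \ge k^4 + 2$ when $k \ge 1$.

```python
import math

def a(k):
    """a_k = (k^2 + 2k) / sqrt(k^4 + 2)."""
    return (k**2 + 2 * k) / math.sqrt(k**4 + 2)

def partial_sum(n):
    """s_n = a_1 + a_2 + ... + a_n."""
    return sum(a(k) for k in range(1, n + 1))

for k in (1, 10, 100, 10**4):
    print(k, a(k))          # terms approach 1, not 0
for n in (10, 100, 1000):
    print(n, partial_sum(n))  # partial sums grow at least as fast as n
```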
Remark 4.5.5. The kth term test is not a test for convergence. For example, consider the series
$\sum_{k=1}^{\infty} \frac{1}{k}$. In this case, $\lim_{k\to\infty} 1/k = 0$ but the series diverges (see Example 4.4.3).
The kth term divergence test is equivalent to the following theorem, whose proof we give below.
Theorem 4.5.6. If the series $\sum_{k=0}^{\infty} a_k$ converges then $a_k \to 0$ as $k \to \infty$.
Proof. [H] Suppose that $\sum_{k=0}^{\infty} a_k$ converges to the real number $L$ and let $s_n$ denote the nth partial
sum of the series. Then
\begin{align*}
s_n - s_{n-1} &= (a_0 + a_1 + \cdots + a_{n-1} + a_n) - (a_0 + a_1 + \cdots + a_{n-1}) \\
&= a_n. \tag{4.8}
\end{align*}
Now $\lim_{n\to\infty} s_n = \lim_{n\to\infty} s_{n-1} = L$ and so
\begin{align*}
\lim_{n\to\infty} a_n &= \lim_{n\to\infty} (s_n - s_{n-1}) && \text{(by (4.8))} \\
&= \lim_{n\to\infty} s_n - \lim_{n\to\infty} s_{n-1} \\
&= L - L \\
&= 0,
\end{align*}
thus completing the proof.
4.5.3 The integral test
The idea from Example 4.4.3, where we bounded a sum by an integral, can be applied more generally
to produce a test for either convergence or divergence.
Theorem 4.5.7 (The integral test). Suppose that $\sum a_k$ is an infinite series with positive terms.
Suppose $f(x)$ is a positive integrable function decreasing on $[1, \infty)$ such that for each positive integer
$k$, $f(k) = a_k$.
(i) If $\displaystyle\int_1^{\infty} f(x) \, dx$ converges then so does $\displaystyle\sum_{k=1}^{\infty} a_k$.
(ii) If $\displaystyle\int_1^{\infty} f(x) \, dx$ diverges then so does $\displaystyle\sum_{k=1}^{\infty} a_k$.
The proof is similar to that given in Example 4.4.3.
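Example 4.5.8. Determine whether or not the following series converge.
(a) $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k^2}$
(b) $\displaystyle\sum_{k=1}^{\infty} \frac{k}{2k^2 + 1}$
(c) $\displaystyle\sum_{k=2}^{\infty} \frac{1}{k(\log k)^2}$
Solution. (a) Consider the improper integral $\displaystyle\int_1^{\infty} \frac{1}{x^2} \, dx$. We have
\[ \int_1^{\infty} \frac{1}{x^2} \, dx = \lim_{N\to\infty} \int_1^{N} \frac{1}{x^2} \, dx = \lim_{N\to\infty} \left[ -\frac{1}{x} \right]_1^{N} = 1. \]
Since the improper integral converges, so does the series $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k^2}$.
(Note: This is a famous series, first summed by Euler. Its value is remarkably $\frac{\pi^2}{6}$. This will be
proven in later courses, but a proof of it appeared in the NSW Extension 2 paper, 2010.)
(b) Consider the improper integral $\displaystyle\int_1^{\infty} \frac{x}{2x^2 + 1} \, dx$. We have
\[ \int_1^{N} \frac{x}{2x^2 + 1} \, dx = \left[ \frac{1}{4} \log(2x^2 + 1) \right]_1^{N} \to \infty \quad \text{as } N \to \infty. \]
Since the improper integral diverges, so does the series $\displaystyle\sum_{k=1}^{\infty} \frac{k}{2k^2 + 1}$.
(c) Consider the improper integral $\displaystyle\int_2^{\infty} \frac{1}{x(\log x)^2} \, dx$. We have
\[ \int_2^{\infty} \frac{1}{x(\log x)^2} \, dx = \lim_{N\to\infty} \int_2^{N} \frac{1}{x(\log x)^2} \, dx = \lim_{N\to\infty} \left[ -\frac{1}{\log x} \right]_2^{N} = \frac{1}{\log 2}. \]
Since the improper integral converges, so does the series $\displaystyle\sum_{k=2}^{\infty} \frac{1}{k(\log k)^2}$.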
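For part (a) of Example 4.5.8, the comparison with the integral gives more than convergence: bounding the terms with $k \ge 2$ by the area under $1/x^2$ yields $s_n \le 1 + \int_1^n x^{-2} \, dx = 2 - 1/n$. The Python sketch below (the function name `s` is an assumption of the sketch) checks this bound for a few values of $n$ and compares a large partial sum with the value $\pi^2/6$ quoted in the note above.

```python
import math

def s(n):
    """s_n = 1/1^2 + 1/2^2 + ... + 1/n^2."""
    return sum(1 / k**2 for k in range(1, n + 1))

for n in (1, 10, 100, 1000):
    # Partial sum alongside the integral bound 2 - 1/n.
    print(n, s(n), 2 - 1 / n)

print(s(10**5), math.pi**2 / 6)
```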
4.5.4 The comparison test
The integral test was basically a comparison between each term of a given series and the area of a
corresponding rectangle. This idea can also be applied to the terms of two series. If each term of a
given series with positive terms is no larger than the corresponding term of another series that is known
to converge, then we can conclude that the given series also converges. A similar test can be found for divergence.
Theorem 4.5.9 (The comparison test). Suppose that $\{a_k\}_{k=0}^{\infty}$ and $\{b_k\}_{k=0}^{\infty}$ are two positive
sequences such that $a_k \le b_k$ for every natural number $k$.
(i) If $\sum_{k=0}^{\infty} b_k$ converges then $\sum_{k=0}^{\infty} a_k$ also converges.
(ii) If $\sum_{k=0}^{\infty} a_k$ diverges then $\sum_{k=0}^{\infty} b_k$ also diverges.
The comparison test is often used in conjunction with series of the following type.
Proposition 4.5.10 (Convergence and divergence of p-series). The series
\[ \sum_{k=1}^{\infty} \frac{1}{k^p} \]
converges if $p > 1$ and diverges if $p \le 1$.
Proof. This proposition may be easily proved by the integral test, or by adapting the proof in Example
4.4.3. The details are left to the reader.
Example 4.5.11. Determine whether or not the following series converge.
(a) $\displaystyle\sum_{k=1}^{\infty} \frac{k}{k^3 + 1}$
(b) $\displaystyle\sum_{k=2}^{\infty} \frac{1}{\sqrt{k^2 - 1}}$
(c) $\displaystyle\sum_{k=1}^{\infty} \frac{1}{\sqrt{k^2 + 1}}$
Typically, one needs some intuition as to whether the series will converge or diverge before the
comparison test is used to construct a rigorous solution.
Solution. (a) By considering the dominant term (as $k \to \infty$), we see that
\[ \frac{k}{k^3 + 1} \approx \frac{k}{k^3} = \frac{1}{k^2} \]
whenever $k$ is a large positive integer. Since $\sum \frac{1}{k^2}$ converges (p-series when $p = 2$), this suggests
that $\displaystyle\sum_{k=1}^{\infty} \frac{k}{k^3 + 1}$ also converges.
To prove that $\displaystyle\sum_{k=1}^{\infty} \frac{k}{k^3 + 1}$ converges, we use part (i) of the comparison test. Note that
\[ 0 \le \frac{k}{k^3 + 1} \le \frac{k}{k^3} = \frac{1}{k^2} \]
whenever $k \ge 1$. So suppose that $a_k = \frac{k}{k^3 + 1}$ and $b_k = \frac{1}{k^2}$. We have shown that $0 \le a_k \le b_k$. Since
$\sum b_k$ converges, it follows from the comparison test that $\sum a_k$ converges.
(b) Looking at the dominant terms,
\[ \sum_{k=2}^{\infty} \frac{1}{\sqrt{k^2 - 1}} \approx \sum_{k=2}^{\infty} \frac{1}{\sqrt{k^2}} = \sum_{k=2}^{\infty} \frac{1}{k}. \]
Since this series diverges (p-series when $p = 1$), this suggests that $\displaystyle\sum_{k=2}^{\infty} \frac{1}{\sqrt{k^2 - 1}}$ also diverges.
To prove that $\displaystyle\sum_{k=2}^{\infty} \frac{1}{\sqrt{k^2 - 1}}$ diverges, we use part (ii) of the comparison test.
Note that
\[ \frac{1}{\sqrt{k^2 - 1}} \ge \frac{1}{\sqrt{k^2}} = \frac{1}{k} \]
whenever $k \ge 2$. So suppose that $a_k = \frac{1}{\sqrt{k^2 - 1}}$ and $b_k = \frac{1}{k}$. We
have shown that $a_k \ge b_k$.
Since $\sum b_k$ diverges, it follows from the comparison test that $\sum a_k$ diverges.
(c) The same analysis as in (b) suggests that this series also diverges. However, the inequality
\[ \frac{1}{\sqrt{k^2 + 1}} \ge \frac{1}{\sqrt{k^2}} = \frac{1}{k} \]
is false. To overcome this hurdle, we introduce a 'fudge factor' to obtain the
inequality we want. Thus,
\[ \frac{1}{\sqrt{k^2 + 1}} \ge \frac{1}{2\sqrt{k^2}} = \frac{1}{2k} \]
for all sufficiently large $k$ (in fact $k \ge 1$ will do; you should check this!). So suppose that
$a_k = \frac{1}{\sqrt{k^2 + 1}}$ and $b_k = \frac{1}{2k}$. We have shown that $a_k \ge b_k$.
Since $\sum b_k$ diverges, it follows from the comparison test that $\sum a_k$ diverges.
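The 'fudge factor' inequality in part (c) can be spot-checked numerically before proving it. The Python sketch below is an illustration under the assumption that checking $1 \le k \le 1000$ is representative; it is not a proof. It confirms that $\frac{1}{2k} \le \frac{1}{\sqrt{k^2+1}} < \frac{1}{k}$ on that range, so the naive comparison with $1/k$ fails while the comparison with $1/(2k)$ succeeds.

```python
import math

def a(k):
    """a_k = 1 / sqrt(k^2 + 1)."""
    return 1 / math.sqrt(k**2 + 1)

# The naive bound a_k >= 1/k fails, but a_k >= 1/(2k) holds on the range tested.
naive_holds = all(a(k) >= 1 / k for k in range(1, 1001))
fudged_holds = all(a(k) >= 1 / (2 * k) for k in range(1, 1001))
print(naive_holds, fudged_holds)
```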
4.5.5 [X] The limit form of the comparison test
This form of the comparison test is extremely useful and allows us to rely on our intuition without
having to work with inequalities. The disadvantage is that it gives no conclusion when the limit of the
ratio is zero or infinite, even in cases where the straight comparison test succeeds (see Remark 4.5.15).
Proposition 4.5.12. Suppose $\{a_n\}$, $\{b_n\}$ are sequences with positive terms and suppose $\lim_{n\to\infty} \dfrac{a_n}{b_n}$ is
finite and not zero. Then $\sum a_n$ converges if and only if $\sum b_n$ converges.
Proof. Suppose $\lim_{k\to\infty} \dfrac{a_k}{b_k} = K > 0$. For any given $\epsilon$ with $0 < \epsilon < K$, we have
\[ \left| \frac{a_k}{b_k} - K \right| < \epsilon \]
for all sufficiently large $k$. For such $k$ we have
\[ K - \epsilon < \frac{a_k}{b_k} < K + \epsilon \quad \Rightarrow \quad (K - \epsilon) b_k < a_k < (K + \epsilon) b_k. \]
Thus, from the last inequality and the comparison test, if $\sum a_k$ converges then $\sum (K - \epsilon) b_k$ converges and
hence (since $K - \epsilon > 0$) $\sum b_k$ does also; if $\sum b_k$ converges then $\sum (K + \epsilon) b_k$ converges and hence $\sum a_k$ does also.
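Example 4.5.13. Discuss the convergence of $\displaystyle\sum_{k=5}^{\infty} \frac{k^2}{k^4 + 3}$.
Solution. For large $k$ the summand $a_k = \dfrac{k^2}{k^4 + 3}$ is roughly $b_k = \dfrac{1}{k^2}$.
Now $\lim_{k\to\infty} \dfrac{a_k}{b_k} = \lim_{k\to\infty} \dfrac{k^4}{k^4 + 3} = 1$. Hence, since $\displaystyle\sum_{k=5}^{\infty} \frac{1}{k^2}$ converges (by p-series with $p = 2$), so does
$\displaystyle\sum_{k=5}^{\infty} \frac{k^2}{k^4 + 3}$.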
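Example 4.5.14. Discuss the convergence of $\displaystyle\sum_{k=1}^{\infty} \sin\left( \frac{1}{k} \right)$.
Solution. Since $\sin x \approx x$ for small $x$, we try comparing $a_k = \sin\left(\frac{1}{k}\right)$ (whose terms are positive) with
$b_k = \frac{1}{k}$.
Now $\lim_{k\to\infty} \dfrac{a_k}{b_k} = \lim_{k\to\infty} \dfrac{\sin(1/k)}{1/k} = 1$. Hence, since $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k}$ diverges (by p-series with $p = 1$), so does
$\displaystyle\sum_{k=1}^{\infty} \sin\left( \frac{1}{k} \right)$.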
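Remark 4.5.15. The series $\displaystyle\sum_{n=2}^{\infty} \frac{1}{\log n}$ diverges, since, for $n \ge 2$, we have $0 < \log n < n$ giving $\frac{1}{\log n} > \frac{1}{n}$
and so the comparison test may be applied. On the other hand, if we try to use the limit comparison
test with $a_n = \frac{1}{\log n}$ and $b_n = \frac{1}{n}$, then $\dfrac{a_n}{b_n} = \dfrac{n}{\log n} \to \infty$ as $n \to \infty$ and so Proposition 4.5.12 does not apply.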
4.5.6 The ratio test
We now introduce a simple convergence and divergence test known as the ratio test. It is important
to note that this test can only be applied to series whose terms are positive.
Theorem 4.5.16 (The ratio test). Suppose that $\sum a_k$ is an infinite series with positive terms and
that
\[ \lim_{k\to\infty} \frac{a_{k+1}}{a_k} = r. \]
(i) If $r < 1$ then $\sum a_k$ converges.
(ii) If $r > 1$ then $\sum a_k$ diverges.
Remark 4.5.17. The ratio test does not specify what happens if $r = 1$. In this case, the test is
inconclusive; the series may converge or diverge.
The reason the ratio test works is that the tail of any series $\sum a_k$ with 'ratio' $r$ given by
\[ r = \lim_{k\to\infty} \frac{a_{k+1}}{a_k} \]
behaves like a geometric series with common ratio $r$. When $r < 1$, the geometric series is convergent
and hence $\sum a_k$ also converges. Similarly, when $r > 1$, the geometric series is divergent and
therefore $\sum a_k$ also diverges. Of course, these assertions need to be proved.
Before we see a proof of the ratio test, we shall see how it is applied. As seen in the examples
below, the ratio test is particularly useful when k! or kth powers appear in each term ak.
Example 4.5.18. Determine whether or not the following series converge.
(a) $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k!}$
(b) $\displaystyle\sum_{k=1}^{\infty} \frac{k^k}{k!}$
(c) $\displaystyle\sum_{k=1}^{\infty} \frac{1}{k}$
Solution. (a) Suppose that $a_k = \dfrac{1}{k!}$. Then
\[ r = \lim_{k\to\infty} \frac{a_{k+1}}{a_k} = \lim_{k\to\infty} \frac{k!}{(k+1)!} = \lim_{k\to\infty} \frac{1}{k + 1} = 0. \]
Since $r < 1$, the series converges by the ratio test.
(b) Suppose that $a_k = \dfrac{k^k}{k!}$. Then
\begin{align*}
r = \lim_{k\to\infty} \frac{a_{k+1}}{a_k} &= \lim_{k\to\infty} \frac{(k+1)^{k+1}}{(k+1)!} \cdot \frac{k!}{k^k} \\
&= \lim_{k\to\infty} \frac{(k+1)^k (k+1)}{k! \, (k+1)} \cdot \frac{k!}{k^k} \\
&= \lim_{k\to\infty} \frac{(k+1)^k}{k^k} \\
&= \lim_{k\to\infty} \left( \frac{k+1}{k} \right)^k \\
&= \lim_{k\to\infty} \left( 1 + \frac{1}{k} \right)^k \\
&= e
\end{align*}
(see Example 4.3.11 for a calculation of this well known limit). Since $r = e > 1$, the series diverges
by the ratio test.
(c) Since $r = \lim_{k\to\infty} \dfrac{k}{k + 1} = 1$, we cannot say from the ratio test whether or not the series
converges. It can be shown using another method that the series diverges (see Example 4.4.3).
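The three ratios in Example 4.5.18 can be computed exactly for particular $k$ using rational arithmetic. The Python sketch below is illustrative only (the helper `ratio` and the three term functions are names chosen for this sketch); it evaluates $a_{k+1}/a_k$ at a few values of $k$ so that the limits 0, $e$ and 1 can be seen emerging.

```python
from fractions import Fraction
from math import e, factorial

def ratio(a, k):
    """The exact ratio a_{k+1} / a_k for a term function a returning Fractions."""
    return a(k + 1) / a(k)

def inv_factorial(k):      # (a) a_k = 1/k!
    return Fraction(1, factorial(k))

def power_over_factorial(k):  # (b) a_k = k^k / k!
    return Fraction(k**k, factorial(k))

def harmonic_term(k):      # (c) a_k = 1/k
    return Fraction(1, k)

for k in (1, 10, 100, 1000):
    print(k,
          float(ratio(inv_factorial, k)),
          float(ratio(power_over_factorial, k)),
          float(ratio(harmonic_term, k)))
print(e)  # for comparison with the middle column
```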
Sketch proof of the ratio test. [X] (i) Suppose that $r < 1$ and choose $R$ such that $r < R < 1$. Since
$\lim_{k\to\infty} \dfrac{a_{k+1}}{a_k} = r$, the terms of the sequence $\left\{ \dfrac{a_{k+1}}{a_k} \right\}$ eventually get so close to $r$ that they must also
be less than $R$. More precisely, there is an integer $N$ such that
\[ \frac{a_{k+1}}{a_k} < R \quad \text{whenever } k \ge N. \tag{4.9} \]
Our goal from here is to show that the tail $\sum_{k=N}^{\infty} a_k$ of the series can be bounded above by a
convergent geometric series with common ratio $R$. From (4.9) we see that
\[ a_{N+1} < R a_N, \quad a_{N+2} < R a_{N+1} < R^2 a_N, \quad a_{N+3} < R a_{N+2} < R^2 a_{N+1} < R^3 a_N, \]
and more generally that
\[ a_{N+j} \le R^j a_N \quad \text{whenever } j \ge 0. \tag{4.10} \]
Hence
\begin{align*}
\sum_{k=N}^{\infty} a_k &= \sum_{j=0}^{\infty} a_{N+j} \\
&\le \sum_{j=0}^{\infty} R^j a_N && \text{(by inequality (4.10))} \\
&= a_N \sum_{j=0}^{\infty} R^j && \text{(which is a geometric series)} \\
&= \frac{a_N}{1 - R} && \text{(since $R < 1$).}
\end{align*}
The partial sums of the tail are therefore bounded, so the tail converges by Lemma 4.5.2.
Since the tail of the series $\sum a_k$ converges, the series itself must converge.
(ii) If $r > 1$, then
\[ \lim_{k\to\infty} \frac{a_{k+1}}{a_k} > 1 \]
and so
\[ \frac{a_{k+1}}{a_k} > 1 \]
for all $k$ sufficiently large. That is, $a_{k+1} > a_k$ for all $k$ sufficiently large, which means that the
positive sequence $\{a_k\}$ eventually becomes an increasing sequence. Hence $a_k \not\to 0$ as $k \to \infty$. It
follows from the kth term test that the series is divergent.
4.5.7 Leibniz’ test for alternating series
So far we have given convergence tests for series all of whose terms are positive. If the terms are
all negative, then we simply multiply the series by $-1$ to obtain a series whose terms are positive.
However, if the series has a mixture of positive and negative terms, we cannot apply this trick. In
the next two subsections, we deal with series whose terms have mixed signs. The simplest case is
when the sign alternates from term to term.
Definition 4.5.19. If $\{a_k\}_{k=0}^{\infty}$ is a sequence of positive real numbers, then the series
\[ a_0 - a_1 + a_2 - a_3 + a_4 - a_5 + a_6 - a_7 + a_8 - a_9 + \cdots \]
is called an alternating series.
An alternating series is often written in the form
\[ \sum_{k=0}^{\infty} (-1)^k a_k. \]
The following theorem, proved by Leibniz in the early eighteenth century, is a simple test for
the convergence of alternating series.
Theorem 4.5.20 (Alternating series test). Suppose that $\{a_k\}_{k=0}^{\infty}$ is a sequence of real numbers
satisfying the following properties:
(a) $a_k \ge 0$;
(b) $a_k \ge a_{k+1}$ for all $k$ (that is, the sequence is nonincreasing); and
(c) $\lim_{k\to\infty} a_k = 0$.
Then the alternating series $\sum_{k=0}^{\infty} (-1)^k a_k$ converges.
Before proving the theorem, we give an example and state a corollary.
Example 4.5.21. Determine whether the series $\displaystyle\sum_{k=2}^{\infty} \frac{(-1)^k k}{k^2 + 1}$ is summable.
Solution. Since this is an alternating series, one naturally tries the alternating series test. Suppose
that $a_k = \dfrac{k}{k^2 + 1}$. We need to check that $\{a_k\}$ satisfies hypotheses (a), (b) and (c) of the alternating
series test.
It is clear that (a) and (c) hold. To prove (b), consider the function $f$ given by
\[ f(x) = \frac{x}{x^2 + 1}. \]
Now
\[ f'(x) = \frac{1 - x^2}{(1 + x^2)^2} \]
and hence $f'(x) < 0$ whenever $x > 1$. That is, $f$ is decreasing on the interval $(1, \infty)$. Since
$f(k) = a_k$ whenever $k \ge 2$, it follows that $\{a_k\}_{k=2}^{\infty}$ is a decreasing sequence.
We now apply the alternating series test and deduce that the series converges.
If the hypotheses of the alternating sequence test are satisfied, then not only do we know that
the series converges to some limit L, but we can also approximate L with any partial sum sn and
obtain an upper bound for the corresponding error.
Corollary 4.5.22. Suppose that $\{a_k\}_{k=0}^{\infty}$ is a sequence of numbers satisfying properties (a), (b)
and (c) of the alternating series test. Denote the value of the convergent series $\sum_{k=0}^{\infty} (-1)^k a_k$ by
$L$ and the nth partial sum of the same series by $s_n$. Then
\[ |s_n - L| \le a_{n+1} \tag{4.11} \]
for every natural number $n$.
In effect, the corollary says that if you chop the series off after the term $(-1)^n a_n$, the error in
approximation is at most $a_{n+1}$, the size of the first omitted term. The proof is given at the end of this subsection.
Example 4.5.23. Estimate the value of the series
\[ \sum_{k=0}^{\infty} \frac{(-1)^k}{k^2 + 1} \]
such that the error is less than $\frac{1}{100}$.
Solution. It is easy to verify that the sequence $\{a_k\}$, where $a_k = \frac{1}{k^2+1}$, satisfies the hypotheses of
the alternating series test. Hence the infinite series converges. Denote the value of the series by $L$.
We will use the estimate $s_n \approx L$, where $n$ is chosen such that the absolute error is less than $\frac{1}{100}$.
Now
\[ \text{absolute error} = |s_n - L| \le a_{n+1} = \frac{1}{(n+1)^2 + 1} \]
by Corollary 4.5.22. So it is enough to guarantee that
\[ \frac{1}{(n+1)^2 + 1} < \frac{1}{100}. \]
This inequality holds exactly when $(n+1)^2 > 99$, so the smallest positive integer $n$ that satisfies it is 9. So
\[ L \approx s_9 = \sum_{k=0}^{9} \frac{(-1)^k}{k^2 + 1} = 1 - \frac{1}{2} + \frac{1}{5} - \frac{1}{10} + \frac{1}{17} - \cdots - \frac{1}{82} \]
and the error in this approximation is less than $\frac{1}{100}$. (Using Maple, one finds that $s_9 = 0.6305785114$
(correct to 10 decimal places) and so $L \approx 0.63$.)
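The calculation in Example 4.5.23 can be checked with a short script. The following is only a sketch in Python (the Notes themselves use Maple); the function names `a`, `partial_sum` and `smallest_n` are illustrative choices and do not come from the Notes. Exact rational arithmetic is used so that the comparison with the tolerance is not affected by rounding.

```python
from fractions import Fraction

def a(k):
    """Magnitude a_k = 1/(k^2 + 1) of the k-th term in Example 4.5.23."""
    return Fraction(1, k * k + 1)

def partial_sum(n):
    """s_n = sum_{k=0}^{n} (-1)^k a_k, computed exactly."""
    return sum((-1) ** k * a(k) for k in range(n + 1))

def smallest_n(tol):
    """Smallest n >= 0 with a_{n+1} < tol, so |s_n - L| < tol by Corollary 4.5.22."""
    n = 0
    while a(n + 1) >= tol:
        n += 1
    return n

n = smallest_n(Fraction(1, 100))
s_n = partial_sum(n)
print(n, float(s_n), float(a(n + 1)))
```

The script searches for the first index at which the error bound drops below the tolerance and then evaluates the corresponding partial sum.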
We conclude this subsection by proving the alternating series test and its corollary. The follow-
ing diagram illustrates the typical behaviour of the partial sums of a series that satisfies hypotheses
(a), (b) and (c) of the alternating series test. It will be helpful to bear this diagram in mind when
reading the proofs.
[Figure: the partial sums $s_n$ plotted against $n$ for $n = 0, 1, \ldots, 12$, with the even partial sums, the odd partial sums and the limit marked.]
Proof of Theorem 4.5.20. [X] Let $s_n$ denote the nth partial sum of the series and suppose that
properties (a), (b) and (c) hold. The proof proceeds in three steps.
Step 1. We will prove that the sequence $\{s_{2n}\}_{n=0}^{\infty}$ of even partial sums is bounded below by 0.
Now
\[ s_{2n} = (a_0 - a_1) + (a_2 - a_3) + (a_4 - a_5) + \cdots + (a_{2n-2} - a_{2n-1}) + a_{2n}. \tag{4.12} \]
By property (b) we see that
\[ a_0 - a_1 \ge 0, \quad a_2 - a_3 \ge 0, \quad \ldots, \quad a_{2n-2} - a_{2n-1} \ge 0 \]
and by property (a) it is evident that $a_{2n} \ge 0$. It follows from (4.12) that $s_{2n} \ge 0$ for every $n$.
Step 2. We will prove that the sequence $\{s_{2n}\}_{n=0}^{\infty}$ of even partial sums is nonincreasing. Now
\[ s_{2n} - s_{2n+2} = (a_0 - a_1 + \cdots + a_{2n}) - (a_0 - a_1 + \cdots + a_{2n} - a_{2n+1} + a_{2n+2}) = a_{2n+1} - a_{2n+2} \ge 0 \]
since $a_{2n+1} \ge a_{2n+2}$ by property (b). Thus $s_{2n} \ge s_{2n+2}$ for all $n$, which means that $\{s_{2n}\}_{n=0}^{\infty}$ is
nonincreasing.
Step 3. From Steps 1 and 2, we conclude that $\{s_{2n}\}_{n=0}^{\infty}$ is a bounded monotonic sequence and
hence convergent. Call the limit of this sequence $L$. If we can show that the sequence $\{s_{2n+1}\}_{n=0}^{\infty}$
of odd partial sums also converges to $L$, then we can conclude that $\{s_n\}$ converges. Now
\[ s_{2n+1} = s_{2n} - a_{2n+1} \to L - 0 = L \]
as $n \to \infty$ by property (c). Hence $\{s_n\}$ converges.
Proof of Corollary 4.5.22. [X] Suppose that $\{a_k\}$ satisfies the hypotheses (a), (b) and (c) of the
alternating series test. Since the infinite series converges, $\lim_{n\to\infty} s_n = L$ for some real number $L$. If
$n$ is odd then
\[ s_{n+2} = s_n + a_{n+1} - a_{n+2} \ge s_n, \]
and so the odd partial sums increase towards $L$ from below. If $n$ is even then
\[ s_{n+2} = s_n - a_{n+1} + a_{n+2} \le s_n, \]
and so the even partial sums decrease towards $L$ from above. Hence if $n$ is odd then
\[ s_n \le L \le s_{n+1} = s_n + a_{n+1}, \]
while if $n$ is even then
\[ s_n - a_{n+1} = s_{n+1} \le L \le s_n. \]
That is,
\[ \text{either } s_n \le L \le s_n + a_{n+1} \text{ or } s_n - a_{n+1} \le L \le s_n. \]
Both cases imply (4.11).
4.5.8 Absolute and conditional convergence
Consider the series
\[ 1 + \frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} + \frac{1}{5!} - \frac{1}{6!} + \frac{1}{7!} + \frac{1}{8!} - \frac{1}{9!} + \cdots. \tag{4.13} \]
Since
\[ \lim_{k\to\infty} |a_k| = \lim_{k\to\infty} \frac{1}{k!} = 0, \]
the kth term test for divergence does not apply. One cannot apply the ratio test (since not all the
terms are positive) and clearly this is not an alternating series. Is there a way of determining whether
the series is summable? The theorem given below proves very helpful in this instance. First we
give a definition.
Definition 4.5.24. A series $\sum_{k=0}^{\infty} a_k$ is said to be absolutely convergent if the series
$\sum_{k=0}^{\infty} |a_k|$ is convergent.
Theorem 4.5.25. If a series is absolutely convergent then it converges.
Proof. [H] Suppose that the series $\sum_{k=0}^{\infty} a_k$ converges absolutely. For each natural number $k$,
\[ -|a_k| \le a_k \le |a_k| \]
and hence
\[ 0 \le a_k + |a_k| \le 2|a_k|. \tag{4.14} \]
Since $\sum |a_k|$ converges, it follows that $2\sum |a_k|$ converges and hence $\sum 2|a_k|$ converges. By the
comparison test, we deduce from (4.14) that $\sum (a_k + |a_k|)$ converges. Now
\[ a_k = (a_k + |a_k|) - |a_k|. \]
Since the sum of the terms on the right-hand side converges (by Proposition 4.4.4), the sum of the
terms on the left must also converge. That is, $\sum_{k=0}^{\infty} a_k$ converges, thus completing the proof.
Example 4.5.26. Determine whether or not the series given by (4.13) is convergent.
Solution. Let $a_k$ denote the kth term of the series. By Theorem 4.5.25, it is enough to show that
the series converges absolutely. Now
\[ \sum_{k=1}^{\infty} |a_k| = \sum_{k=1}^{\infty} \frac{1}{k!} \]
and this series converges by the ratio test (see Example 4.5.18 (a)).
Not every convergent series is absolutely convergent. For example, the alternating series
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots \tag{4.15} \]
converges (by the alternating series test) but the corresponding absolute series
\[ 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \cdots \]
diverges (see Example 4.4.3). In this situation, we say that the series (4.15) is conditionally convergent.
Definition 4.5.27. A series $\sum_{k=0}^{\infty} a_k$ is said to be conditionally convergent if it converges
but the series $\sum_{k=0}^{\infty} |a_k|$ diverges.
The distinction between conditionally and absolutely convergent series is brought into bold
relief when considering rearrangements of series.
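Definition 4.5.28. A rearrangement of a series $\sum a_k$ is a series that has exactly
the same terms but that is summed in a different order.
For example,
\[ 1 + \frac{1}{3} + \frac{1}{5} - \frac{1}{2} + \frac{1}{7} + \frac{1}{9} - \frac{1}{4} + \frac{1}{11} + \cdots \]
and
\[ -\frac{1}{2} + 1 - \frac{1}{4} + \frac{1}{3} - \frac{1}{6} + \frac{1}{5} - \frac{1}{8} + \frac{1}{7} - \cdots \]
are both rearrangements of
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7} - \frac{1}{8} + \cdots. \]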
If the series is finite, then every rearrangement has the same value (since addition of real numbers is
commutative). However, if the series is infinite, then the value (if it exists) of each rearrangement
is determined by a limit of partial sums, and one cannot appeal to commutativity of addition,
as in the finite case. In fact, some rather surprising phenomena occur with rearrangements of
conditionally convergent series.
Theorem 4.5.29. Suppose that $\sum_{k=0}^{\infty} a_k$ is an infinite series.
(i) If $\sum a_k$ converges absolutely, then every rearrangement of the series converges absolutely
and all rearrangements have the same limit as $\sum a_k$.
(ii) If $\sum a_k$ converges conditionally, then given any real number $L$, the series has a
rearrangement that converges to $L$. Moreover, every conditionally convergent series has a
rearrangement that diverges to $\infty$, and another rearrangement that diverges to $-\infty$.
This theorem was published in 1867 by Riemann. One of the tutorial problems illustrates that
the conditionally convergent series
\[ 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \frac{1}{7} - \cdots \]
can be rearranged to sum to different real numbers. The moral of the story is that one should not
rearrange a conditionally convergent series to determine its value.
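The effect of a rearrangement can be observed numerically. The sketch below, written in Python, compares partial sums of the alternating harmonic series in its usual order with partial sums of one particular rearrangement: two positive terms followed by one negative term. This rearrangement is chosen for illustration and is not taken from the tutorial problems; it is a standard fact (not proved in these Notes) that it converges to $\frac{3}{2}\ln 2$, whereas the original order gives $\ln 2$. The function names are illustrative.

```python
import math

def original_partial_sum(terms):
    """Partial sum of 1 - 1/2 + 1/3 - 1/4 + ... in its usual order."""
    return sum((-1) ** (k + 1) / k for k in range(1, terms + 1))

def rearranged_partial_sum(blocks):
    """Sum `blocks` groups of the form 1/(4m-3) + 1/(4m-1) - 1/(2m).

    Every term of the alternating harmonic series appears exactly once,
    but two positive terms are taken for each negative term.
    """
    total = 0.0
    for m in range(1, blocks + 1):
        total += 1.0 / (4 * m - 3) + 1.0 / (4 * m - 1) - 1.0 / (2 * m)
    return total

original = original_partial_sum(200_000)
rearranged = rearranged_partial_sum(100_000)
print(original, math.log(2))
print(rearranged, 1.5 * math.log(2))
```

Both sums use the same terms, yet the two printed pairs settle near different values, in line with Theorem 4.5.29 (ii).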
4.6 Taylor series
(Ref: SH10 §12.7)
At the end of Section 4.2 we posed the question, When is it true that
\[ f(x) = f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f^{(3)}(a)}{3!}(x - a)^3 + \cdots? \]
Having studied sequences and series of real numbers, we now have the tools to deal with this and
related questions. First, we give the above series expansion a special name.
Definition 4.6.1. Suppose that a function $f$ has derivatives of all orders at $a$. Then
the series
\[ f(a) + f'(a)(x - a) + \frac{f''(a)}{2!}(x - a)^2 + \frac{f^{(3)}(a)}{3!}(x - a)^3 + \cdots, \]
which may also be written as
\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k, \]
is called the Taylor series for $f$ about $a$. In the case when $a = 0$, the series is also
called the Maclaurin series for $f$.
Next, we need to define what we mean by the convergence (or divergence) of a Taylor series.
Definition 4.6.2. Suppose that $I$ is an interval and that $f$ has derivatives of all
orders at some point $a$. We say that
(a) the Taylor series for $f$ about $a$ converges on $I$ if the series
\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k \]
converges for each point $x$ in $I$;
(b) the Taylor series for $f$ about $a$ converges to $f$ on $I$ if for each $x$ in $I$, $x$ lies
in the domain of $f$ and
\[ f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k; \]
and
(c) the Taylor series for $f$ about $a$ diverges on $I$ if the series
\[ \sum_{k=0}^{\infty} \frac{f^{(k)}(a)}{k!}(x - a)^k \]
diverges for each point $x$ in $I$.
Thus the question asked at the beginning of this section may be rephrased as,
For what intervals I will the Taylor series of a function f converge to f?
The following corollary to Taylor’s theorem helps answer this question.
Corollary 4.6.3. Suppose that $f$ has derivatives of all orders at $a$ and that $x$ lies in the domain
of $f$. Let $R_{n+1}(x)$ denote the remainder term of Theorem 4.2.1 (or its equivalent form as given in
Corollary 4.2.2). If
\[ \lim_{n\to\infty} R_{n+1}(x) = 0 \tag{4.16} \]
then
\[ f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n. \]
Proof. Let $p_n$ denote the nth Taylor polynomial for $f$ about $a$. Taylor’s theorem implies that
\[ f(x) = p_n(x) + R_{n+1}(x). \tag{4.17} \]
Note that $p_n(x)$ is the nth partial sum of the series
\[ \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!}(x - a)^n. \]
So we only have to show that
\[ \lim_{n\to\infty} p_n(x) = f(x). \tag{4.18} \]
Now
\[ |p_n(x) - f(x)| = |R_{n+1}(x)| \to 0 \]
as $n \to \infty$, by (4.17) and (4.16). That is, the distance between $p_n(x)$ and $f(x)$ can be made as small as we
like. Hence (4.18) follows.
Example 4.6.4. Suppose that $x \in \mathbb{R}$. Show that
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots. \]
Solution. Suppose that $f(t) = e^t$ and fix $x$ in $\mathbb{R}$. The sum of the terms up to (and including) $\frac{x^n}{n!}$
is equal to $p_n(x)$, where $p_n$ is the nth Taylor polynomial for $f$ about 0. So by Corollary 4.6.3, we
only need to show that $\lim_{n\to\infty} R_{n+1}(x) = 0$. Now
\[ f^{(n)}(t) = e^t \quad \forall t \in \mathbb{R}, \]
and so by the Lagrange formula for the remainder,
\[ R_{n+1}(x) = \frac{e^c}{(n+1)!} x^{n+1} \]
for some $c$ between 0 and $x$. Now $e^c \le e^{|c|} \le e^{|x|}$. If $M = e^{|x|}$ then
\[ 0 \le |R_{n+1}(x)| = \frac{e^c}{(n+1)!} |x|^{n+1} \le \frac{M |x|^{n+1}}{(n+1)!} \to 0 \]
as $n \to \infty$ by Lemma 2.2.5. Therefore
\[ \lim_{n\to\infty} R_{n+1}(x) = 0 \]
by the pinching theorem for sequences.
We have just proved that the Taylor series about 0 for the exponential function converges to the
exponential function on R. We say that the exponential function is represented by its Taylor series
about 0 on R. Some other convergent Taylor series representations are given in the next theorem.
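The remainder estimate in Example 4.6.4 can be illustrated numerically. The following Python sketch compares the actual error $|e^x - p_n(x)|$ with the bound $e^{|x|}|x|^{n+1}/(n+1)!$ obtained from the Lagrange formula. The function names and the choice $x = 2$ are illustrative assumptions and are not part of the Notes.

```python
import math

def taylor_exp(x, n):
    """n-th Maclaurin polynomial p_n(x) = sum_{k=0}^{n} x^k / k! of the exponential."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term *= x / k          # turns x^(k-1)/(k-1)! into x^k/k!
        total += term
    return total

def lagrange_bound(x, n):
    """Upper bound e^{|x|} |x|^{n+1} / (n+1)! for |R_{n+1}(x)|."""
    return math.exp(abs(x)) * abs(x) ** (n + 1) / math.factorial(n + 1)

x = 2.0
for n in (2, 5, 10, 15):
    error = abs(math.exp(x) - taylor_exp(x, n))
    print(n, error, lagrange_bound(x, n))
```

In each printed row the actual error sits below the bound, and both shrink rapidly as $n$ grows.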
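Theorem 4.6.5. The following formulae hold whenever $x$ lies in the given interval.
\[ \frac{1}{1-x} = 1 + x + x^2 + x^3 + x^4 + \cdots, \quad x \in (-1, 1) \]
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots, \quad x \in \mathbb{R} \]
\[ \sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots, \quad x \in \mathbb{R} \]
\[ \cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots, \quad x \in \mathbb{R} \]
\[ \sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \frac{x^7}{7!} + \cdots, \quad x \in \mathbb{R} \]
\[ \cosh x = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \frac{x^6}{6!} + \cdots, \quad x \in \mathbb{R} \]
\[ \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots, \quad x \in (-1, 1] \]
\[ \tan^{-1} x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \cdots, \quad x \in [-1, 1] \]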
Moreover, if x lies outside the given interval then the corresponding Maclaurin series diverges.
Most of these formulae can be proved by showing that the Lagrange formula for the remainder
tends to 0 as n → ∞. However, sometimes one must resort to using the integral form of the
remainder (as given by Theorem 4.2.1). One of the tutorial problems illustrates its use. In Section
4.8, we introduce tools that provide an alternate approach to deriving some of these expansions.
Remark 4.6.6. The Taylor series expansions for sinx and ln(1+x) given by Theorem 4.6.5 explain
the phenomena discussed at the beginning of Section 4.2. In particular, the Taylor series for sinx
converges for all x in R, which explains why sin(7) could be approximated by Taylor polynomials
of sufficiently high degree. On the other hand, we cannot use Taylor polynomials to approximate
ln(1 + x) when x > 1 (as suggested by Figure 4.2) because the Taylor series diverges when x > 1.
This explains why higher order Taylor polynomials give worse approximations for ln(1 + x) when
x > 1.
Remark 4.6.7. The Maclaurin series given by Theorem 4.6.5 can be used to obtain beautiful series
expansions for some irrational numbers. By substituting particular values for $x$ into an appropriate
Maclaurin series, one finds that
\[ e = 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \frac{1}{4!} + \cdots \]
\[ \ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \cdots \]
\[ \frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + \cdots. \]
Unfortunately, the last two series converge too slowly to be of high computational value.
Remark 4.6.8. If f equals its Taylor series on an interval I, then the corresponding Taylor
polynomial pn can be used to approximate f on I. However, it is important to appreciate that
some Taylor series (such as that for ex) converge much more quickly to the function than do others
(such as that for ln(1 + x), which converges slowly). If the series converges very slowly then the
approximation f ≈ pn is only accurate when n is very large.
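The difference in speed of convergence mentioned in Remarks 4.6.7 and 4.6.8 is easy to see numerically. The sketch below (in Python, with illustrative function names that do not come from the Notes) prints the error of the partial sums of the three series for $e$, $\ln 2$ and $\pi/4$ after 10, 100 and 1000 terms.

```python
import math

def e_partial(n):
    """Sum of 1/k! for k = 0..n, built term by term to avoid huge factorials."""
    term, total = 1.0, 1.0
    for k in range(1, n + 1):
        term /= k
        total += term
    return total

def ln2_partial(n):
    """Sum of (-1)^(k+1)/k for k = 1..n."""
    return sum((-1) ** (k + 1) / k for k in range(1, n + 1))

def quarter_pi_partial(n):
    """Sum of the first n terms (-1)^k/(2k+1), k = 0..n-1."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n))

for n in (10, 100, 1000):
    print(n,
          abs(math.e - e_partial(n)),
          abs(math.log(2) - ln2_partial(n)),
          abs(math.pi / 4 - quarter_pi_partial(n)))
```

The first error column is already tiny at $n = 10$, while the other two columns shrink only roughly in proportion to $1/n$.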
Remark 4.6.9. If a Taylor series converges to a function f on an interval I, then (obviously)
the Taylor series converges on I. However, the converse is not true. That is, if the Taylor series
converges on I, then one cannot conclude that the Taylor series converges to f on I. In the tutorial
problems we give one example of a function f whose Maclaurin series converges on the entire
real line but only converges to f at the origin.
4.7 Power series
(Ref: SH10 §12.8)
A Maclaurin series is a series of the form
\[ a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4 + a_5 x^5 + \cdots, \tag{4.19} \]
where each coefficient $a_k$ is given by
\[ a_k = \frac{f^{(k)}(0)}{k!} \]
for some function $f$ that is infinitely differentiable at 0. For the remainder of this chapter, we
study more general series of the form (4.19), where each coefficient $a_k$ is not necessarily a Taylor
coefficient. Such a series is called a power series.
Definition 4.7.1. Suppose that $\{a_k\}_{k=0}^{\infty}$ is a sequence of real numbers and that
$a \in \mathbb{R}$. A series of the form
\[ \sum_{k=0}^{\infty} a_k x^k \]
is called a power series in powers of $x$. A series of the form
\[ \sum_{k=0}^{\infty} a_k (x - a)^k \]
is called a power series in powers of $x - a$.
Thus a Maclaurin series is a power series in powers of x, while a Taylor series about a is a power
series in powers of x − a. For the last two sections of this chapter, we discuss the convergence,
addition, multiplication, integration and differentiation of power series. Hence whatever is said
about power series also applies to Maclaurin and Taylor series.
In this section, we focus on the convergence and divergence of power series.
Definition 4.7.2. Suppose that $\{a_k\}_{k=0}^{\infty}$ is a sequence of real numbers, $I$ is an
interval and $a$ is a real number. We say that a power series $\sum_{k=0}^{\infty} a_k (x - a)^k$ converges
(a) at a real number $c$ if the series $\sum_{k=0}^{\infty} a_k (c - a)^k$ converges;
(b) on the interval $I$ if the series $\sum_{k=0}^{\infty} a_k (x - a)^k$ converges for each $x$ in $I$.
We say that a power series $\sum_{k=0}^{\infty} a_k (x - a)^k$ diverges
(a) at a real number $c$ if the series $\sum_{k=0}^{\infty} a_k (c - a)^k$ diverges;
(b) on the interval $I$ if the series $\sum_{k=0}^{\infty} a_k (x - a)^k$ diverges for each $x$ in $I$.
Using the ratio test, one can often determine for what values of x a power series converges
absolutely.
Example 4.7.3. Find an interval $I$ such that the power series
\[ \sum_{k=0}^{\infty} \frac{k x^k}{3^k} \]
converges on $I$.
Solution. We first find an interval $I$ on which the series converges absolutely. To do so, we apply
the ratio test to the (absolute) series
\[ \sum_{k=0}^{\infty} \left| \frac{k x^k}{3^k} \right|. \]
Now
\[ r = \lim_{k\to\infty} \left| \frac{(k+1) x^{k+1}}{3^{k+1}} \right| \cdot \left| \frac{3^k}{k x^k} \right| = \lim_{k\to\infty} \frac{(k+1)|x|}{3k} = \frac{|x|}{3}. \]
To conclude from the ratio test that the series converges, we require that $r < 1$, which means
that $|x| < 3$. So the series converges absolutely whenever $-3 < x < 3$. Hence the series
\[ \sum_{k=0}^{\infty} \frac{k x^k}{3^k} \]
converges on the interval $(-3, 3)$.
Remark 4.7.4. In the previous example, one can also conclude that the series diverges when
$|x| > 3$. To see this, fix $x$ in $\mathbb{R}$ such that $|x| > 3$. Let $b_k$ denote the kth term
\[ \frac{k x^k}{3^k} \]
of the series. Now, by the same calculation as before,
\[ \lim_{k\to\infty} \left| \frac{b_{k+1}}{b_k} \right| = \frac{|x|}{3}. \]
Since $|x| > 3$ we deduce that $\lim_{k\to\infty} \frac{|b_{k+1}|}{|b_k|} > 1$. This shows that $\frac{|b_{k+1}|}{|b_k|} > 1$ whenever $k$ is sufficiently
large. Rearranging implies that
\[ |b_{k+1}| > |b_k| \]
for all sufficiently large $k$ and hence the tail of the sequence $\{|b_k|\}$ is increasing. Thus $|b_k| \not\to 0$ as
$k \to \infty$. We conclude that $b_k \not\to 0$ as $k \to \infty$ and hence $\sum b_k$ diverges by the kth term test for
divergence.
4.7.1 Radius of Convergence
In the previous example, the power series, in powers of x, converged for |x| < 3. We call the
number 3 the radius of convergence of the power series. It is half the length of the interval of
convergence.
Definition 4.7.5. If a power series of the form $\sum_{k=0}^{\infty} a_k (x - a)^k$ converges at all points
in some interval $(-R + a, R + a)$, or equivalently, for $|x - a| < R$, then the number $R$
is called the radius of convergence for the power series. The corresponding interval
$(-R + a, R + a)$ is called the open interval of convergence for the power series.
If the power series converges for all real $x$, we say that the radius of convergence is
infinite.
Notes: 1. The term ‘radius’ is used since, when x is replaced by the complex variable z, the
open interval is replaced by an open disc, |z − a| < R, in the Argand plane. The number R then
is the radius of this open disc.
2. It is easiest to find the interval of convergence first, using the ratio test, and then write down
the radius of convergence. It can be shown that $R = \lim_{k\to\infty} \left| \frac{a_k}{a_{k+1}} \right|$, provided this limit exists. There
are, however, power series which have a radius of convergence, but for which this limit does not
exist.
By generalising the solution to Example 4.7.3 and the argument in Remark 4.7.4, one obtains
the following theorem.
Theorem 4.7.6. Suppose that $\{a_k\}_{k=0}^{\infty}$ is a sequence of real numbers such that
\[ \lim_{k\to\infty} \left| \frac{a_k}{a_{k+1}} \right| = R \]
for some real number $R$. Then the power series
\[ \sum_{k=0}^{\infty} a_k (x - a)^k \]
(i) converges absolutely whenever $|x - a| < R$, and
(ii) diverges whenever $|x - a| > R$.
Proof of Theorem 4.7.6. [H] The proof of (i) is similar to the solution of Example 4.7.3. We apply
the ratio test to the series
\[ \sum_{k=0}^{\infty} |a_k (x - a)^k|. \]
Now
\[ r = \lim_{k\to\infty} \frac{|a_{k+1}(x - a)^{k+1}|}{|a_k (x - a)^k|} = \lim_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right| |x - a|. \]
We have convergence whenever $r < 1$, which corresponds to the condition that
\[ \lim_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right| |x - a| < 1. \]
By rearranging we find that the series converges absolutely whenever
\[ |x - a| < \frac{1}{\lim_{k\to\infty} \left| \frac{a_{k+1}}{a_k} \right|} = \lim_{k\to\infty} \left| \frac{a_k}{a_{k+1}} \right| = R, \]
where $R$ is the limit given in the theorem.
The proof of (ii) is a simple modification of the argument given in Remark 4.7.4.
Note that we cannot tell from the theorem whether or not the power series converges at the
endpoints a+R or a−R. Sometimes the power series will converge at one endpoint but not at the
other. Other times it will converge at both endpoints or diverge at both endpoints.
Example 4.7.7. Find the largest open interval on which the power series
\[ \sum_{k=0}^{\infty} \frac{(5x + 2)^k}{k^2 + 1} \]
will converge.
Solution. We apply the ratio test to the series $\sum_{k=0}^{\infty} \left| \frac{(5x+2)^k}{k^2+1} \right|$. Now
\[ r = \lim_{k\to\infty} \left| \frac{(5x + 2)^{k+1}}{(k+1)^2 + 1} \right| \left| \frac{k^2 + 1}{(5x + 2)^k} \right| = \lim_{k\to\infty} \frac{k^2 + 1}{(k+1)^2 + 1} |5x + 2| = |5x + 2|. \]
We require that $r < 1$, that is, $|5x + 2| < 1$. Hence
\[ -1 < 5x + 2 < 1 \]
or in other words,
\[ -\frac{3}{5} < x < -\frac{1}{5}. \]
So the largest open interval of convergence is $(-\frac{3}{5}, -\frac{1}{5})$.
Hence the radius of convergence is $\frac{1}{5}$.
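The radius found in Example 4.7.7 can be cross-checked against the formula $R = \lim_{k\to\infty} |a_k / a_{k+1}|$ of Theorem 4.7.6. To do so, one first rewrites the series in powers of $x - a$: since $(5x + 2)^k = 5^k (x + \frac{2}{5})^k$, the coefficients are $a_k = 5^k/(k^2 + 1)$ with $a = -\frac{2}{5}$. The Python sketch below evaluates the ratio exactly for a few values of $k$; the function names are illustrative and not taken from the Notes.

```python
from fractions import Fraction

def coefficient(k):
    """a_k = 5^k / (k^2 + 1), so that (5x + 2)^k / (k^2 + 1) = a_k (x + 2/5)^k."""
    return Fraction(5 ** k, k * k + 1)

def ratio(k):
    """|a_k / a_{k+1}|, whose limit (when it exists) is the radius of convergence."""
    return abs(coefficient(k) / coefficient(k + 1))

for k in (1, 10, 100, 1000):
    print(k, float(ratio(k)))
```

The printed ratios approach $\frac{1}{5}$, in agreement with the interval $(-\frac{3}{5}, -\frac{1}{5})$, which is centred at $-\frac{2}{5}$ and has half-length $\frac{1}{5}$.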
4.7.2 Convergence of power series at endpoints [X]
Students studying MATH1241 are also required to determine whether a power series converges
at the endpoints of its interval of convergence. As the next example illustrates, one deduces the
convergence at each endpoint by substituting the endpoint into the power series and determining
whether the resulting series of real numbers converges.
Example 4.7.8. Find the interval of convergence (including endpoints, if appropriate) for the
power series
\[ \sum_{k=2}^{\infty} \frac{x^k}{\ln k}. \]
Solution. First we find the open interval of convergence. Now
\[ r = \lim_{k\to\infty} \left| \frac{x^{k+1}}{\ln(k+1)} \cdot \frac{\ln k}{x^k} \right| = \lim_{k\to\infty} \frac{\ln k}{\ln(k+1)} |x| = \lim_{k\to\infty} \frac{1/k}{1/(k+1)} |x| = \lim_{k\to\infty} \frac{k+1}{k} |x| = |x|, \]
where the third equality follows from l’Hôpital’s rule.
The series converges absolutely whenever $r = |x| < 1$. Hence the largest open interval of convergence
is $(-1, 1)$ and the series diverges whenever $|x| > 1$.
Now we determine whether the series converges at the endpoints 1 and −1. When $x = 1$, the
series becomes
\[ \sum_{k=2}^{\infty} \frac{1}{\ln k}, \]
which diverges by comparison with the harmonic series $\sum \frac{1}{k}$ (since $\ln k < k$ and hence $\frac{1}{\ln k} > \frac{1}{k}$). When $x = -1$, the series becomes
\[ \sum_{k=2}^{\infty} \frac{(-1)^k}{\ln k}, \]
which is alternating and converges (conditionally) by the alternating series test.
Hence the interval of convergence for the power series is $[-1, 1)$. (Note that the largest interval
on which the power series is absolutely convergent is $(-1, 1)$.)
4.8 Manipulation of power series
(Ref: SH10 §12.9)
In this section we investigate what sense (if any) can be made of adding, multiplying, differen-
tiating and integrating power series. Since differentiation and integration are operations applied to
functions, it is most natural to approach this investigation by viewing a power series as a function.
Suppose that a power series $\sum_{k=0}^{\infty} a_k x^k$ converges in the interval $(-R, R)$, where $R$ is its radius
of convergence. Then one can define a function $f : (-R, R) \to \mathbb{R}$ by the formula
\[ f(x) = \sum_{k=0}^{\infty} a_k x^k \quad \text{whenever } |x| < R. \]
Thus the value of f at each point x is a convergent sum of real numbers. Sometimes it is possible
to find a closed form for f , but other times we must approximate each value f(x) by using partial
sums.
Example 4.8.1. Suppose that $f$ is given by the rule
\[ f(x) = \sum_{k=0}^{\infty} x^k. \]
By using, say, the ratio test, we see that the series converges whenever $|x| < 1$ and diverges when
$|x| > 1$. The series also diverges when $x = \pm 1$ by the kth term test. Hence the natural domain for $f$ is $(-1, 1)$.
In fact, by summing the geometric series, we find that $f(x) = \frac{1}{1-x}$ whenever $|x| < 1$. This is
the closed form of $f(x)$.
Example 4.8.2. Suppose that $f$ is defined by the rule
\[ f(x) = \sum_{k=0}^{\infty} \frac{x^k}{k!}. \]
We instantly recognise this series as the Maclaurin series for the exponential function. Since this
series converges on $\mathbb{R}$, the maximal domain of $f$ is $\mathbb{R}$. The closed form of $f(x)$ is given by $f(x) = e^x$
whenever $x \in \mathbb{R}$.
Example 4.8.3. Suppose that $f$ is defined by the rule
\[ f(x) = \sum_{k=1}^{\infty} \frac{x^k}{k^2}. \]
By the ratio test, we find that the series converges whenever $|x| < 1$ and diverges whenever $|x| > 1$.
Therefore we take the domain of $f$ to be $(-1, 1)$. (Students in MATH1241 will note that the power
series also converges when $x = 1$ and $x = -1$. So the domain could be extended to $[-1, 1]$.)
There seems to be no obvious closed form for $f(x)$ whenever $|x| < 1$. How, then, does one
evaluate $f(-\frac{1}{2})$? Note that
\[ f(-\tfrac{1}{2}) = \sum_{k=1}^{\infty} \frac{(-1)^k}{2^k k^2} = -\frac{1}{2^1 1^2} + \frac{1}{2^2 2^2} - \frac{1}{2^3 3^2} + \frac{1}{2^4 4^2} - \frac{1}{2^5 5^2} + \cdots \]
and that the right-hand side is an alternating series. Thus
\[ f(-\tfrac{1}{2}) \approx -\frac{1}{2^1 1^2} + \frac{1}{2^2 2^2} - \frac{1}{2^3 3^2} + \frac{1}{2^4 4^2} - \frac{1}{2^5 5^2} + \cdots + \frac{1}{2^{10} 10^2}, \]
where the absolute error in this approximation is less than $\frac{1}{2^{11} 11^2}$ by Corollary 4.5.22.
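The approximation in Example 4.8.3 can be evaluated directly. The Python sketch below computes the partial sum up to $k = 10$ exactly, together with the error bound $\frac{1}{2^{11} 11^2}$, and compares it against a much longer partial sum used purely as a numerical check. The function names and the choice of 60 terms for the check are illustrative assumptions.

```python
from fractions import Fraction

def term(k):
    """k-th term (-1)^k / (2^k k^2) of the series for f(-1/2)."""
    return Fraction((-1) ** k, 2 ** k * k * k)

def partial_sum(n):
    """Sum of term(k) for k = 1..n, computed exactly."""
    return sum(term(k) for k in range(1, n + 1))

s10 = partial_sum(10)
bound = abs(term(11))          # 1 / (2^11 * 11^2), from Corollary 4.5.22
reference = partial_sum(60)    # many more terms, used only as a check
print(float(s10), float(bound), float(abs(reference - s10)))
```

The last printed number, the gap between the two partial sums, is smaller than the bound printed before it.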
It turns out that power series are very well behaved as functions defined on their interval of
convergence. Given two power series with the same interval of convergence, you can add, subtract
and multiply them together in the ‘natural’ way. Power series are also differentiable and integrable,
and their derivatives and antiderivatives can also be expressed as power series in the ‘natural’ way.
The following theorems articulate the precise details. Their proofs are given later in Subsection
4.8.1.
Theorem 4.8.4. Suppose that the functions $f : I \to \mathbb{R}$ and $g : I \to \mathbb{R}$ are defined by
\[ f(x) = \sum_{k=0}^{\infty} a_k (x - a)^k \quad \text{and} \quad g(x) = \sum_{k=0}^{\infty} b_k (x - a)^k, \]
where both power series converge on the interval $I$. Then, whenever $x \in I$,
\[ (f + g)(x) = \sum_{k=0}^{\infty} (a_k + b_k)(x - a)^k \]
and
\[ (fg)(x) = \sum_{k=0}^{\infty} c_k (x - a)^k, \tag{4.20} \]
where
\[ c_k = \sum_{j=0}^{k} a_j b_{k-j}. \]
Remark 4.8.5. The product formula (4.20) says that
\[ (fg)(x) = a_0 b_0 + (a_0 b_1 + a_1 b_0)(x - a) + (a_0 b_2 + a_1 b_1 + a_2 b_0)(x - a)^2 + (a_0 b_3 + a_1 b_2 + a_2 b_1 + a_3 b_0)(x - a)^3 + \cdots. \]
This is the natural generalisation of polynomial multiplication.
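The coefficient formula $c_k = \sum_{j=0}^{k} a_j b_{k-j}$ translates directly into code. The following Python sketch works with truncated coefficient lists; the function name `cauchy_product` and the two test cases (the geometric series multiplied by itself, and the series for $e^x$ multiplied by itself) are illustrative choices rather than examples from the Notes.

```python
from fractions import Fraction
from math import factorial

def cauchy_product(a, b):
    """Return c_0..c_{n-1} with c_k = sum_{j=0}^{k} a_j b_{k-j}, n = min(len(a), len(b))."""
    n = min(len(a), len(b))
    return [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(n)]

# 1/(1 - x) = 1 + x + x^2 + ... has every coefficient equal to 1.
geometric = [1] * 6
print(cauchy_product(geometric, geometric))

# e^x has coefficients 1/k!; the product should give the coefficients of e^{2x}.
exp_coeffs = [Fraction(1, factorial(k)) for k in range(6)]
print(cauchy_product(exp_coeffs, exp_coeffs))
```

Only the first $n$ product coefficients are returned, because later coefficients would need terms of the input series that were truncated away.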
Theorem 4.8.6. Suppose that $f : I \to \mathbb{R}$ is defined by
\[ f(x) = \sum_{k=0}^{\infty} a_k (x - a)^k \]
whenever $x \in I$, where $I$ denotes the open interval of convergence for the power series. Then
(i) $f$ is differentiable on $I$ and
\[ f'(x) = \sum_{k=1}^{\infty} k a_k (x - a)^{k-1} \]
whenever $x \in I$; and
(ii) $f$ is integrable on $I$ and an antiderivative $F$ for $f$ is given by
\[ F(x) = \sum_{k=0}^{\infty} \frac{a_k}{k+1} (x - a)^{k+1} + C \]
whenever $x \in I$, where $C$ is a constant.
Remark 4.8.7. This theorem says that a power series can be differentiated and integrated ‘term
by term’ inside its open interval of convergence. Another way of saying this is that
\[ \frac{d}{dx} \left( \sum_{k=1}^{\infty} a_k (x - a)^k \right) = \sum_{k=1}^{\infty} \left( \frac{d}{dx} a_k (x - a)^k \right) \]
and
\[ \int \left( \sum_{k=1}^{\infty} a_k (x - a)^k \right) dx = \sum_{k=1}^{\infty} \left( \int a_k (x - a)^k \, dx \right). \]
Students should note that one cannot always swap infinite summation with differentiation (or with
integration). For example, if
\[ f(x) = \sum_{k=1}^{\infty} \frac{\sin(2^k x)}{2^k} \]
then $f$ is a sum of differentiable functions but $f$ is not differentiable anywhere!
The following corollary follows from Theorem 4.8.6.
Corollary 4.8.8. Suppose that $f : I \to \mathbb{R}$ is defined by
\[ f(x) = \sum_{k=0}^{\infty} a_k (x - a)^k \]
whenever $x \in I$, where $I$ denotes the open interval of convergence for the power series. Then $f$ is
continuous on $I$ and has derivatives of all orders on $I$.
Proof. By Theorem 4.8.6 (i), $f$ is differentiable on $I$ and is therefore continuous on $I$.
We now prove that $f$ has derivatives of all orders on $I$. Suppose that $n$ is any natural number.
It suffices to show that $f$ is n-times differentiable on $I$. By Theorem 4.8.6 (i), $f$ is once differentiable
on $I$ and its derivative $f'$ has a power series expansion that converges on $I$. Now apply Theorem
4.8.6 (i) to $f'$. We conclude that $f'$ is differentiable on $I$ and its derivative $f''$ has a power series
expansion that converges on $I$. Now apply Theorem 4.8.6 (i) to $f''$. Continuing in this way, after
$n$ steps we conclude that $f^{(n-1)}$ is differentiable on $I$ with derivative $f^{(n)}$. Hence $f$ is n-times
differentiable on $I$, thus completing the proof.
Remark 4.8.9. Suppose that a function $f : I \to \mathbb{R}$ is defined by
\[ f(x) = \sum_{k=0}^{\infty} a_k (x - a)^k, \]
where $I$ is the open interval of convergence for the power series. By the corollary, $f$ has derivatives
of all orders at $a$ and therefore has a Taylor series about $a$. One can easily show that the Taylor
series for $f$ about $a$ converges on $I$ to $f$. Thus the Taylor series for $f$ about $a$ is equal to the power
series that defines $f$.
Theorems 4.8.4 and 4.8.6 allow us to find valid Taylor series expansions of functions without
having to derive the Taylor coefficients and verify that the remainder term from Taylor’s theorem
vanishes.
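Example 4.8.10. Given the Taylor expansions
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \frac{x^6}{6!} + \cdots \quad (x \in \mathbb{R}) \tag{4.21} \]
and
\[ \frac{1}{1-x} = 1 + x + x^2 + x^3 + x^4 + x^5 + \cdots \quad (|x| < 1), \tag{4.22} \]
find Taylor expansions for each function $f$, making sure that you state the interval of convergence.
(a) $f(x) = \cosh x$ (b) $f(x) = \sinh x$ (c) $f(x) = \tan^{-1}(x)$
Solution. (a) First note that $\cosh x = \frac{1}{2}(e^x + e^{-x})$. So we aim to add the Maclaurin series for $e^x$
and $e^{-x}$. By replacing $x$ with $-x$ in (4.21), we find that
\[ e^{-x} = 1 - x + \frac{x^2}{2!} - \frac{x^3}{3!} + \frac{x^4}{4!} - \frac{x^5}{5!} + \frac{x^6}{6!} - \cdots \quad (x \in \mathbb{R}). \]
So
\[ e^x + e^{-x} = 2 + 2\frac{x^2}{2!} + 2\frac{x^4}{4!} + 2\frac{x^6}{6!} + \cdots \quad (x \in \mathbb{R}) \]
by Theorem 4.8.4. Hence
\[ \cosh x = \frac{1}{2}(e^x + e^{-x}) = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \frac{x^6}{6!} + \cdots \]
whenever $x \in \mathbb{R}$.
(b) By differentiating both sides of the expansion
\[ \cosh x = 1 + \frac{x^2}{2!} + \frac{x^4}{4!} + \frac{x^6}{6!} + \frac{x^8}{8!} + \cdots \quad (x \in \mathbb{R}) \]
we find that
\[ \sinh x = x + \frac{x^3}{3!} + \frac{x^5}{5!} + \frac{x^7}{7!} + \cdots \]
whenever $x \in \mathbb{R}$.
(c) If $|x| < 1$ then $|-x^2| < 1$. So we can replace $x$ with $-x^2$ in (4.22) to obtain the convergent
expansion
\[ \frac{1}{1 + x^2} = 1 - x^2 + x^4 - x^6 + x^8 - x^{10} + \cdots \]
whenever $|x| < 1$. By integrating both sides of this identity, and noting that $\tan^{-1} 0 = 0$ forces the constant of integration to be zero, we find that
\[ \tan^{-1} x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + \frac{x^9}{9} - \frac{x^{11}}{11} + \cdots \]
whenever $|x| < 1$.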
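Parts (b) and (c) of Example 4.8.10 amount to simple operations on lists of coefficients, as described by Theorem 4.8.6. The Python sketch below represents a truncated power series in powers of $x$ by its coefficient list and applies term-by-term differentiation and integration. The helper names `differentiate` and `integrate` are illustrative and not part of the Notes.

```python
from fractions import Fraction
from math import factorial

def differentiate(coeffs):
    """Coefficients of f' given coefficients a_0, a_1, ... of f (powers of x)."""
    return [k * coeffs[k] for k in range(1, len(coeffs))]

def integrate(coeffs, constant=0):
    """Coefficients of the antiderivative whose constant term is `constant`."""
    return [Fraction(constant)] + [Fraction(c) / (k + 1) for k, c in enumerate(coeffs)]

# cosh x = 1 + x^2/2! + x^4/4! + ... (coefficients up to x^8)
cosh = [Fraction(1, factorial(k)) if k % 2 == 0 else Fraction(0) for k in range(9)]
sinh = differentiate(cosh)      # coefficients up to x^7

# 1/(1 + x^2) = 1 - x^2 + x^4 - ... (coefficients up to x^6)
recip = [(-1) ** (k // 2) if k % 2 == 0 else 0 for k in range(7)]
arctan = integrate(recip)       # constant 0 because arctan(0) = 0
print(sinh)
print(arctan)
```

The two printed lists reproduce the leading coefficients of the expansions for $\sinh x$ and $\tan^{-1} x$ obtained above.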
The function $f$, given by $f(x) = e^{-x^2}$, has no antiderivative among the elementary functions.
In MATH1131, we used Riemann sums to estimate the area underneath the graph of $f$. As the next
example shows, the use of Taylor series provides a more efficient approach to the same problem.
Example 4.8.11. Suppose that $f(x) = e^{-x^2}$. By using the Maclaurin expansion for $f$, estimate
$\int_0^1 f(x)\,dx$ and give an upper bound for the absolute error.
Solution. We begin with the Maclaurin expansion
\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \cdots, \]
which is valid for all $x$ in $\mathbb{R}$. By replacing $x$ with $-x^2$ we find that
\[ e^{-x^2} = 1 - x^2 + \frac{x^4}{2!} - \frac{x^6}{3!} + \frac{x^8}{4!} - \frac{x^{10}}{5!} + \cdots \]
for all real numbers $x$. By integrating both sides of this equation on the interval $[0, 1]$, we find that
\[ \int_0^1 e^{-x^2}\,dx = \left[ x - \frac{x^3}{3} + \frac{x^5}{5(2!)} - \frac{x^7}{7(3!)} + \frac{x^9}{9(4!)} - \frac{x^{11}}{11(5!)} + \cdots \right]_0^1 = 1 - \frac{1}{3} + \frac{1}{5(2!)} - \frac{1}{7(3!)} + \frac{1}{9(4!)} - \frac{1}{11(5!)} + \cdots. \]
Hence the integral is expressed as an alternating series of the form $\sum (-1)^k a_k$, where $\{a_k\}$ is a
positive decreasing sequence. If we estimate the series using a partial sum, then Corollary 4.5.22
gives an upper bound for the absolute error. For example, the absolute error in the approximation
\[ \int_0^1 e^{-x^2}\,dx \approx 1 - \frac{1}{3} + \frac{1}{5(2!)} - \frac{1}{7(3!)} \]
is no greater than $\frac{1}{9(4!)}$. One can evaluate this partial sum numerically to obtain
\[ \int_0^1 e^{-x^2}\,dx \approx 0.7429 \]
with an error no greater than 0.005.
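The estimate in Example 4.8.11 can be checked numerically. The Python sketch below evaluates the four-term partial sum and the error bound exactly, and compares the estimate with a reference value computed from the error function via the identity $\int_0^1 e^{-x^2}\,dx = \frac{\sqrt{\pi}}{2}\,\mathrm{erf}(1)$. That identity is a standard fact and is not used in the Notes; it appears here only as an independent check. The function names are illustrative.

```python
import math
from fractions import Fraction

def term(k):
    """Magnitude a_k = 1 / ((2k + 1) k!) of the k-th term after integrating on [0, 1]."""
    return Fraction(1, (2 * k + 1) * math.factorial(k))

def partial_sum(n):
    """Sum of (-1)^k a_k for k = 0..n."""
    return sum((-1) ** k * term(k) for k in range(n + 1))

estimate = partial_sum(3)                              # 1 - 1/3 + 1/10 - 1/42
bound = term(4)                                        # 1 / (9 * 4!)
reference = math.sqrt(math.pi) / 2 * math.erf(1.0)     # independent check via erf
print(float(estimate), float(bound), abs(reference - float(estimate)))
```

Adding further terms with `partial_sum(n)` tightens the estimate quickly, because the factorial in the denominator makes the bound `term(n + 1)` shrink very fast.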
4.8.1 Proof of theorems in Section 4.8 [X]
In this subsection, we prove Theorems 4.8.4 and 4.8.6 for the case where a = 0. It is not hard to adapt the
presented proofs to the general case.
Proof of Theorem 4.8.4. We shall only prove the product formula (4.20) when a = 0. Fix x in I
and define the partial sums sn(x) and tn(x) by
sn(x) =
n∑
k=0
akx
k and tn(x) =
n∑
k=0
bkx
k.
Then f(x) = lim
n→∞ sn(x) and g(x) = limn→∞ tn(x). So using Proposition 4.3.6, we find that the
sequence {sn(x)tn(x)} converges and
(f.g)(x) = f(x).g(x) = lim
n→∞ sn(x)× limn→∞ tn(x) = limn→∞
(
sn(x)tn(x)
)
.
c©2020 School of Mathematics and Statistics, UNSW Sydney
152 CHAPTER 4. TAYLOR SERIES
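But
\begin{align*}
s_n(x)t_n(x) &= \left( a_0 + a_1x + a_2x^2 + \cdots + a_nx^n \right) \times \left( b_0 + b_1x + b_2x^2 + \cdots + b_nx^n \right) \\
&= a_0b_0 + (a_0b_1 + a_1b_0)x + (a_0b_2 + a_1b_1 + a_2b_0)x^2 \\
&\quad + \cdots + (a_0b_n + a_1b_{n-1} + \cdots + a_{n-1}b_1 + a_nb_0)x^n + (\text{terms of degree } n+1 \text{ to } 2n).
\end{align*}
As $n \to \infty$, one obtains (4.20).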
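The coefficient pattern in the expansion above (the coefficient of $x^m$ is $a_0b_m + a_1b_{m-1} + \cdots + a_mb_0$) can be illustrated with a minimal Python sketch. The helper name `cauchy_product` is hypothetical, and the test case (multiplying the Maclaurin series of $e^x$ by itself to obtain that of $e^{2x}$) is chosen here only as an example.

```python
from fractions import Fraction
from math import factorial


def cauchy_product(a, b):
    """Coefficients c_0, ..., c_{n-1} with c_m = a_0 b_m + a_1 b_{m-1} + ... + a_m b_0."""
    n = min(len(a), len(b))
    return [sum(a[j] * b[m - j] for j in range(m + 1)) for m in range(n)]


# Maclaurin coefficients of e^x: a_k = 1/k!.
exp_coeffs = [Fraction(1, factorial(k)) for k in range(6)]

# The product series should represent e^x * e^x = e^(2x), with coefficients 2^k / k!.
product = cauchy_product(exp_coeffs, exp_coeffs)
expected = [Fraction(2 ** k, factorial(k)) for k in range(6)]

print(product == expected)
```

Only the coefficients up to the shorter input length are returned, since higher coefficients of the product would need terms that the truncated inputs do not contain.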
We move now to the proof of Theorem 4.8.6, which shall be broken into two parts. First we
prove the differentiation result, which is difficult and uses the mean value theorem. After this, the
integration result can be easily deduced from the first part.
Proof of Theorem 4.8.6 (i). Suppose that $f : (-R,R) \to \mathbb{R}$ is defined by
\[ f(x) = \sum_{k=0}^{\infty} a_k x^k \]
whenever $x \in (-R,R)$, where $R$ is the radius of convergence for the power series. It can be easily shown (see the tutorial problems) that the radius of convergence for the power series
\[ \sum_{k=1}^{\infty} k a_k x^{k-1} \]
is also $R$. So define the function $g : (-R,R) \to \mathbb{R}$ by the formula
\[ g(x) = \sum_{k=1}^{\infty} k a_k x^{k-1}. \]
Fix, now, a number $x$ in $(-R,R)$. Our task is to show that $f'(x) = g(x)$, or in other words, that
\[ \lim_{h\to 0} \frac{f(x+h) - f(x)}{h} = g(x). \]
Now if $x + h \in (-R,R)$ and $h \ne 0$ then
\begin{align*}
\left| g(x) - \frac{f(x+h) - f(x)}{h} \right| &= \left| \sum_{k=1}^{\infty} k a_k x^{k-1} - \sum_{k=0}^{\infty} \frac{a_k(x+h)^k - a_k x^k}{h} \right| \\
&= \left| \sum_{k=1}^{\infty} k a_k x^{k-1} - \sum_{k=1}^{\infty} a_k \left( \frac{(x+h)^k - x^k}{h} \right) \right|.
\end{align*}
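By the mean value theorem,
\[ \frac{(x+h)^k - x^k}{h} = k\,c_k^{k-1} \]
for some real number $c_k$ between $x$ and $x + h$. Hence
\begin{align*}
\left| g(x) - \frac{f(x+h) - f(x)}{h} \right| &= \left| \sum_{k=1}^{\infty} k a_k x^{k-1} - \sum_{k=1}^{\infty} k a_k c_k^{k-1} \right| \\
&= \left| \sum_{k=1}^{\infty} k a_k \left( x^{k-1} - c_k^{k-1} \right) \right| \\
&= \left| \sum_{k=2}^{\infty} k a_k \left( x^{k-1} - c_k^{k-1} \right) \right|.
\end{align*}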
Again by the mean value theorem,
\[ \frac{x^{k-1} - c_k^{k-1}}{x - c_k} = (k-1)\,d_{k-1}^{\,k-2} \]
for some real number $d_{k-1}$ between $x$ and $c_k$. Hence
\[ \left| x^{k-1} - c_k^{k-1} \right| = |x - c_k| \left| (k-1)\,d_{k-1}^{\,k-2} \right|. \]
Now $|x - c_k| < |h|$ and $|d_{k-1}| < M$, where $M = \max\{|x|, |x+h|\}$. So
\[ \left| x^{k-1} - c_k^{k-1} \right| \le |h| \left| (k-1)M^{k-2} \right|. \]
Thus
\[ \left| g(x) - \frac{f(x+h) - f(x)}{h} \right| \le |h| \sum_{k=2}^{\infty} \left| k(k-1)a_k M^{k-2} \right|. \]
One can show using the ratio test that the series on the right-hand side converges and hence
\[ \lim_{h\to 0} |h| \sum_{k=2}^{\infty} \left| k(k-1)a_k M^{k-2} \right| = 0. \]
Therefore
\[ \lim_{h\to 0} \left| g(x) - \frac{f(x+h) - f(x)}{h} \right| = 0 \]
(by the pinching theorem for limits) and we conclude that
\[ \lim_{h\to 0} \frac{f(x+h) - f(x)}{h} = g(x). \]
Hence $f'(x) = g(x)$ for all $x$ in $(-R,R)$.
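As a numerical illustration of the result just proved (not a substitute for the proof), the following Python sketch takes the geometric series with $a_k = 1$ and $R = 1$, an example chosen here for convenience. It compares a difference quotient of $f$ with the term-by-term derivative series and with the closed form $1/(1-x)^2$. The names `f_partial` and `g_partial` are illustrative.

```python
def f_partial(x, n):
    """Partial sum of the geometric series: sum of x^k for k = 0, ..., n."""
    return sum(x ** k for k in range(n + 1))


def g_partial(x, n):
    """Partial sum of the term-by-term derivative: sum of k x^(k-1) for k = 1, ..., n."""
    return sum(k * x ** (k - 1) for k in range(1, n + 1))


x, h, n = 0.5, 1e-6, 200
difference_quotient = (f_partial(x + h, n) - f_partial(x, n)) / h
series_derivative = g_partial(x, n)
closed_form = 1 / (1 - x) ** 2       # derivative of 1/(1 - x)

print(difference_quotient, series_derivative, closed_form)
```

With 200 terms the truncation error at $x = 0.5$ is far below floating-point precision, so any visible discrepancy in the difference quotient comes from the finite step $h$.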
Now that we have proved that a power series is differentiable inside its open interval of convergence, it is relatively easy to prove that it is integrable inside this interval.
Proof of Theorem 4.8.6 (ii). Suppose that $f : (-R,R) \to \mathbb{R}$ is defined by
\[ f(x) = \sum_{k=0}^{\infty} a_k x^k \]
whenever $x \in (-R,R)$, where $R$ is the radius of convergence for the power series. It can be easily shown (see the tutorial problems) that the radius of convergence for the power series
\[ \sum_{k=0}^{\infty} \frac{a_k}{k+1}\, x^{k+1} \]
is also $R$. So define the function $F : (-R,R) \to \mathbb{R}$ by the formula
\[ F(x) = \sum_{k=0}^{\infty} \frac{a_k}{k+1}\, x^{k+1}. \]
We now apply the differentiability theorem (Theorem 4.8.6 (i)) to $F$. In particular,
\begin{align*}
F'(x) &= \sum_{k=0}^{\infty} \frac{d}{dx} \left( \frac{a_k}{k+1}\, x^{k+1} \right) \\
&= \sum_{k=0}^{\infty} a_k x^k \\
&= f(x)
\end{align*}
whenever $x \in (-R,R)$. Hence $F$ is an antiderivative for $f$ on $(-R,R)$ and hence $f$ is integrable on $(-R,R)$.
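The same geometric-series example ($a_k = 1$, $R = 1$, chosen here purely for illustration) gives a quick check of term-by-term integration: the series $F(x) = \sum x^{k+1}/(k+1)$ should agree with $-\ln(1-x)$, the antiderivative of $1/(1-x)$ that vanishes at 0. The function name is hypothetical.

```python
from math import log


def antiderivative_partial(x, n):
    """Partial sum of the integrated series: sum of x^(k+1)/(k+1) for k = 0, ..., n."""
    return sum(x ** (k + 1) / (k + 1) for k in range(n + 1))


x = 0.5
series_value = antiderivative_partial(x, 200)
closed_form = -log(1 - x)            # antiderivative of 1/(1 - x) that vanishes at 0

print(series_value, closed_form)
```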
4.9 Maple notes
The following MAPLE command is relevant to the material of this chapter:
sum(f(k), k=m..n); computes the sum of f(k) as k runs from m to n. For example,
> sum(k^2, k=1..4);
        30
> sum(k^2, k=1..n);
        1/3 (n + 1)^3 - 1/2 (n + 1)^2 + 1/6 n + 1/6
> sum(1/k^2, k=1..infinity);
        1/6 π^2
?powseries will give information about the MAPLE package for manipulating formal power series.
taylor(expr, x=a, k); computes the Taylor series for expr about x=a, up to the term of order k.
convert(taylor(expr, x=a, k), polynom); computes the Taylor polynomial of order k-1 for
expr about x=a.
coeftayl(expr, x=a, k); computes the kth coefficient in the Taylor series expansion of expr
about x=a.
For example,
> taylor(sin(x), x=0, 8);
        x - 1/6 x^3 + 1/120 x^5 - 1/5040 x^7 + O(x^8)
> convert(%, polynom);
        x - 1/6 x^3 + 1/120 x^5 - 1/5040 x^7
> coeftayl(sin(x), x=0, 11);
        -1/39916800
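For readers without access to Maple, some of the outputs above can be cross-checked in Python using exact rational arithmetic. This is only a sketch: the function names are illustrative and it does not reproduce Maple's symbolic capabilities.

```python
from fractions import Fraction
from math import factorial


def sum_of_squares(n):
    """Direct sum 1^2 + 2^2 + ... + n^2."""
    return sum(k ** 2 for k in range(1, n + 1))


def maple_closed_form(n):
    """The closed form printed above: (n+1)^3/3 - (n+1)^2/2 + n/6 + 1/6."""
    n = Fraction(n)
    return (n + 1) ** 3 / 3 - (n + 1) ** 2 / 2 + n / 6 + Fraction(1, 6)


def sin_coefficient(k):
    """Coefficient of x^k in the Maclaurin series of sin x."""
    if k % 2 == 0:
        return Fraction(0)
    return Fraction((-1) ** ((k - 1) // 2), factorial(k))


print(sum_of_squares(4))
print(all(maple_closed_form(n) == sum_of_squares(n) for n in range(1, 21)))
print(sin_coefficient(11))
```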
Problems for Chapter 4
Problems 4.1 : Taylor polynomials
1. [R] For each function $f$, find the Taylor polynomial of degree 9 for $f$ about 0.
a) $f(x) = e^x$ b) $f(x) = \sin x$ c) $f(x) = \sinh x$
2. [R] Suppose that $f(x) = \sin x$ and $m \ge 0$. Using summation notation, find a formula for the Taylor polynomial $p_{2m+1}$ of degree $2m+1$ for $f$ about 0.
3. [R]
a) Suppose that $f(x) = \sqrt{x}$. Find the Taylor polynomial of degree 3 for $f$ about 4.
b) Suppose that $g(x) = \cos x$. Find the Taylor polynomial of degree 4 for $g$ about $\pi/4$.
4. [R] Let $f(x) = 1 + x + x^2$. Find the Taylor polynomial $p_n(x)$
a) of degree $n = 1$ about 1.
b) of degree $n = 2$ about 1.
c) of degree $n = 2$ about 2.
Problems 4.2 : Taylor’s theorem
5. [R] Suppose that $f(x) = \ln(1 + x)$.
a) Express $f(x)$ in the form $p_1(x) + R_2(x)$, where $p_1$ is the first Taylor polynomial for $f$ about 0 and $R_2$ is the Lagrange formula for the remainder.
b) Suppose that $x \in [-0.1, 0.1]$ and consider the approximation $\ln(1 + x) \approx x$. Use your answer to (a) to show that an upper bound for the absolute error in this approximation is $1/162$.
6. [R] Suppose that $f(x) = \sqrt{1 + x}$ and let $p_2$ denote the second Taylor polynomial for $f$ about 0. If $x \in [0, 1]$ then show that the absolute error in the approximation $f(x) \approx p_2(x)$ does not exceed $\frac{1}{16}$.
7. [R] Suppose that $f(x) = \cos x$ and that $n$ is a positive even integer.
a) Find the $n$th Taylor polynomial $p_n$ for $f$ about 0 and the Lagrange formula for the remainder $R_{n+1}$.
b) Use the mean value theorem to prove that $\sin x < x$ whenever $x > 0$.
c) Use parts (a) and (b) to find an upper bound for the absolute error in the approximation $f(1/10) \approx p_n(1/10)$.
d) Hence find a value for $n$ such that the absolute error in the approximation of (c) is less than $10^{-6}$.
e) The value of $\cos 2$ is estimated using $p_n(2)$ for some $n$. Explain why it is better not to use the inequality of (b) to find an upper bound for $|R_{n+1}(2)|$.
f) Find a value for $n$ such that the absolute error in the approximation $f(2) \approx p_n(2)$ is less than $10^{-6}$.
g) The Taylor polynomial $p_{10}$ is used to approximate $f$ on the interval $[-a, a]$, where $a$ is a positive real number. Find a value for $a$ such that the absolute error in the approximation $f(x) \approx p_{10}(x)$ is less than $10^{-6}$ whenever $x \in [-a, a]$.
8. [X] Below is a statement of the mean value theorem for integrals:
If $f$ and $g$ are continuous on $[a, b]$ and $g$ is nonnegative on $[a, b]$, then
\[ \int_a^b f(t)g(t)\,dt = f(c) \int_a^b g(t)\,dt \]
for some real number $c$ in $[a, b]$.
This theorem is used in several of the problems below. We give a proof of it in this question.
a) Suppose that $f$ is continuous on $[a, b]$ and let $m$ and $M$ denote the minimum and maximum values of $f$ on $[a, b]$ respectively. If $m \le z \le M$, then explain why there exists a real number $c$ in $[a, b]$ such that $f(c) = z$.
b) By considering a lower and upper bound for the integral $\int_a^b f(t)g(t)\,dt$ in terms of $m$ and $M$, prove the mean value theorem for integrals.
9. [X] In this question, we will prove the Lagrange formula for the remainder under some simple assumptions. Suppose that $f$ is $(n+1)$-times differentiable on some open interval $I$ containing 0 and that $f^{(n+1)}$ is continuous on $I$. Suppose that $x \in I$ and recall that the remainder term in Taylor's theorem is given by
\[ R_{n+1}(x) = \frac{1}{n!} \int_0^x f^{(n+1)}(t)(x - t)^n\,dt. \]
Use the mean value theorem for integrals (see Question 8) to deduce that
\[ R_{n+1}(x) = \frac{f^{(n+1)}(c)}{(n+1)!}\, x^{n+1}, \]
where $c$ is some real number between 0 and $x$.
10. [R]
a) Suppose that
\[ f(x) = x^7 + 5x^6 + 3x^5 - 17x^4 - 16x^3 + 24x^2 + 16x - 11. \]
Verify that 1 and $-2$ are stationary points for $f$ and use the corollary to Taylor's theorem to classify each of these stationary points.
b) Suppose that
\[ f(x) = x^7 - 7x^6 + 10x^5 + 22x^4 - 43x^3 - 35x^2 + 48x + 40. \]
Verify that $-1$, 2 and 3 are stationary points for $f$ and classify each of these stationary points.
11. [H] Use the following outline to show that $e$ is irrational.
a) If $e$ were rational, it would be of the form $e = \frac{p}{q}$, where $p$ and $q$ are positive integers. Select an integer $k$ such that $k \ge 3$ and $k \ge q$. Use Taylor's Theorem to show that
\[ \frac{p}{q} = e = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{k!} + \frac{e^z}{(k+1)!} \]
for some $z$ in $[0, 1]$.
b) Suppose that
\[ s_k = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{k!}. \]
Show that $k!(e - s_k)$ is an integer.
c) Show that $0 < k!(e - s_k) < 1$.
d) Conclude that $e$ is irrational.
Problems 4.3 : Sequences
12. [R] Describe the limiting behaviour of the following sequences. If the sequence converges, then state its limit.
a) $\dfrac{n^2 - 2n + 1}{2n^2 + 4n - 1}$ b) $\dfrac{\ln n}{n^a}$, where $a > 0$.
c) $\dfrac{n \sin(n\pi/4)}{\sqrt{n^2 + 1}}$ d) $\dfrac{n + \cos n\pi}{n - \cos n\pi}$
e) $\dfrac{n!}{n^n}$ f) $\dfrac{(2n)!}{(n!)^2}$
g) $(a^n + b^n)^{1/n}$, where $a \ge b > 0$.
13. [H] Suppose that $a_n = \dfrac{n^2 - 2n + 1}{2n^2 + 4n - 1}$.
a) Find $L$, where $L = \lim_{n\to\infty} a_n$.
b) For each positive number $\epsilon$, find a number $N$ such that $|a_n - L| < \epsilon$ whenever $n > N$.
14. [R] Use the result $\lim_{n\to\infty} \left( 1 + \frac{1}{n} \right)^n = e$ and standard results about limits to evaluate the following limits.
a) $\lim_{n\to\infty} \left( 1 + \dfrac{1}{n} \right)^{4n}$ b) $\lim_{n\to\infty} \left( \dfrac{n}{n+1} \right)^n$
15. [X] The limit of a recursively defined sequence.
Suppose that $a_1 = 1$ and $a_{n+1} = \sqrt{1 + a_n}$ whenever $n \ge 1$.
a) Show that $\sqrt{1 + x} \in [1, 2]$ whenever $x \in [1, 2]$.
b) Use induction to show that the sequence $\{a_n\}$ is bounded.
c) Use induction to show that $\{a_n\}$ is an increasing sequence.
d) Explain why $\lim_{n\to\infty} a_n$ exists.
e) Find $\lim_{n\to\infty} a_n$. (That is, find $\sqrt{1 + \sqrt{1 + \sqrt{1 + \cdots}}}$.)
16. [X] Find the supremum and infimum of each of the following sets.
a) $\left\{ \dfrac{n}{1 + n^2} : n = 1, 2, \ldots \right\}$ b) $\left\{ \dfrac{n}{1 + n^2} : n \in \mathbb{Z} \right\}$
c) $\{x \in \mathbb{Q} : x^2 < 2\}$ d) $\left\{ \dfrac{(-1)^n}{n} + \sin n : n = 1, 2, \ldots \right\}$
e) $\{x \in (0, \infty) : \sin x < 0\}$ f) $\{1 + \tan^{-1} x : x \in \mathbb{R}\}$
17. [X] Suppose that $A$ and $B$ are nonempty subsets of $\mathbb{R}^2$. Define the distance $d(A, B)$ between $A$ and $B$ by the formula
\[ d(A, B) = \inf\{|a - b| : a \in A,\ b \in B\}. \]
a) Explain why this infimum always exists.
b) Suppose that $A = \{(x, y) : x^2 + y^2 < 1\}$ and $B = \{(x, y) : x^2 - y^2 > 9\}$. Find $d(A, B)$.
c) Find two disjoint sets $A$ and $B$ such that $d(A, B) = 0$.
18. [X] Suppose that $\{a_n\}_{n=1}^{\infty}$ is a bounded sequence. Given a positive integer $m$, define $K_m$ by the formula
\[ K_m = \sup\{a_m, a_{m+1}, a_{m+2}, \ldots\}. \]
a) Explain why $K_m \ge \inf_{n \ge 1} a_n$ for all $m$.
b) Explain why $K_{m+1} \le K_m$ for all $m$.
c) Explain why the limit $\lim_{m\to\infty} K_m$ exists.
Let $K$ denote the limit of part (c). We call $K$ the limit superior of the sequence and write $K = \limsup_n a_n$. It is the largest number to which any subsequence of $\{a_n\}$ converges. The smallest number to which any subsequence converges is called the limit inferior,
\[ \liminf_n a_n = \lim_{m\to\infty} \left( \inf_{n \ge m} a_n \right). \]
d) Prove that $\liminf_n a_n = \limsup_n a_n$ if and only if $\{a_n\}$ converges.
Problems 4.4 : Infinite series
19. [R] Examples of telescoping series.
a) Consider the series $\sum_{k=1}^{\infty} a_k$, where $a_k = \dfrac{1}{k(k+1)}$.
i) Find the partial fractions decomposition of $a_k$.
ii) Find a simple formula for the partial sum $s_n$, where $s_n = \sum_{k=1}^{n} a_k$.
iii) Hence find $\sum_{k=1}^{\infty} a_k$.
b) Repeat the question for the series $\sum_{k=2}^{\infty} a_k$, where $a_k = \dfrac{1}{k^2 - 1}$.
20. [R] Let $s_n$ denote the $n$th partial sum of the series
\[ 1 + \frac{1}{\sqrt{2}} + \frac{1}{\sqrt{3}} + \cdots. \]
a) Show that $s_n > \sqrt{n}$ whenever $n > 1$.
b) Hence explain why the series diverges.
21. [R] Let $s_n$ denote the $n$th partial sum of the harmonic series $\sum_{k=1}^{\infty} \dfrac{1}{k}$. When $n = 2^{k-1}$, the terms of $s_n$ may be bracketed as shown:
\[ s_n = 1 + \frac{1}{2} + \left( \frac{1}{3} + \frac{1}{4} \right) + \left( \frac{1}{5} + \cdots + \frac{1}{8} \right) + \cdots + \left( \frac{1}{2^{k-2} + 1} + \frac{1}{2^{k-2} + 2} + \cdots + \frac{1}{2^{k-1}} \right). \]
Hence use an argument similar to that in Question 20 to show that the harmonic series diverges.
22. [R] The following three questions are similar to Example 4.4.3.
a) By drawing a diagram and interpreting each term in the sum as the area of a rectangle, show that
\[ \sum_{k=2}^{n} \frac{1}{k^2} \le \int_1^n \frac{dx}{x^2}. \]
b) Deduce that $\sum_{k=2}^{\infty} \dfrac{1}{k^2}$ converges.
23. [R]
a) By drawing a diagram and interpreting each term in the sum as the area of a rectangle, show that
\[ \sum_{k=1}^{n} \frac{1}{\sqrt{k}} \ge \int_1^{n+1} \frac{dx}{\sqrt{x}}. \]
b) Deduce that $\sum_{k=1}^{\infty} \dfrac{1}{\sqrt{k}}$ diverges.
24. [R] By using the technique of the previous two questions, determine whether or not the sum $\sum_{k=2}^{\infty} \dfrac{1}{k \ln k}$ converges.
Problems 4.5 : Tests for series convergence
25. [R] Use the integral test to examine the convergence of
a) $\sum_{k=1}^{\infty} \dfrac{k}{k^2 + 4}$ b) $\sum_{k=3}^{\infty} \dfrac{1}{k(\ln k)}$ c) $\sum_{k=1}^{\infty} \dfrac{1}{(k + 9)^3}$
26. [R] Use a comparison test to determine whether or not each series converges.
a) $\sum_{k=1}^{\infty} \dfrac{1}{(k^3 + 3)^{1/2}}$ b) $\sum_{k=2}^{\infty} \dfrac{1}{(k^2 - 1)^{1/3}}$ c) $\sum_{k=1}^{\infty} \dfrac{k}{k^2 - 6}$
d) $\sum_{k=2}^{\infty} \dfrac{1}{\ln k}$ e) $\sum_{k=2}^{\infty} \dfrac{1}{(\ln k)^9}$ f) $\sum_{k=1}^{\infty} \sin^2 \dfrac{1}{k}$
27. [R] Use the ratio test to examine the convergence of each series.
a) $\sum_{k=1}^{\infty} \dfrac{k^2}{2^k}$ b) $\sum_{k=1}^{\infty} \dfrac{3^k}{k!}$ c) $\sum_{k=1}^{\infty} \dfrac{k!}{k^k}$ d) $\sum_{k=1}^{\infty} \dfrac{5^k}{2^k + 4^k}$
28. [R] Determine which of the following alternating series converge. Which are absolutely convergent?
a) $\sum_{k=3}^{\infty} \dfrac{(-1)^k}{\ln k}$ b) $\sum_{k=1}^{\infty} \dfrac{(-1)^k k^k}{(k + 1)^k}$
c) $\sum_{k=2}^{\infty} \dfrac{(-1)^{k+1}}{\sqrt{k(k^2 - 2)}}$ d) [X] $\sum_{k=2}^{\infty} \dfrac{(-1)^k}{\sqrt{k} + (-1)^k}$
29. [R] Consider the alternating series $\sum_{k=0}^{\infty} \dfrac{(-1)^k}{k^3 + 1}$. Let $L$ denote the value of the series and let $s_n$ denote the $n$th partial sum of the series whenever $n \ge 0$.
a) Verify that the series is convergent.
b) Calculate $s_4$ and give an upper bound for the absolute error in the approximation $L \approx s_4$.
c) Find a value for $n$ such that the absolute error in the approximation $L \approx s_n$ is less than $10^{-6}$.
30. [R] By using an appropriate test, determine whether or not each series converges.
a) $\sum_{k=2}^{\infty} \dfrac{(-1)^k}{\sqrt{k}}$ b) $\sum_{k=1}^{\infty} \dfrac{3^k}{k^3}$ c) $\sum_{k=1}^{\infty} \dfrac{3^k}{1 + 7^k}$
d) $\sum_{k=2}^{\infty} \dfrac{2^k k!}{k^k}$ e) $\sum_{k=2}^{\infty} \dfrac{(-1)^{k+1} k^2}{\sqrt{4k^4 + 1}}$ f) $\sum_{k=1}^{\infty} \dfrac{\sin\bigl( (2k - 1)\pi/4 \bigr)}{2^k}$
31. [X] Determine the convergence or divergence of $\sum a_k$, for each $a_k$ given below.
a) $\dfrac{\sin k}{k^2}$ b) $\dfrac{\sqrt{k}}{k^2 + 1}$ c) $\dfrac{k}{(\ln k)^k}$ d) $\dfrac{(\ln k)^3}{\sqrt{k^3 - 3k^2 + 1}}$
e) $\dfrac{(-1)^k 2^k}{k^3}$ f) $\dfrac{1}{k^{1 + \frac{1}{k}}}$ g) $\dfrac{\ln(k!)}{k^3}$
32. [H] Discuss the convergence of the series
\[ \sum_{k=1}^{\infty} \frac{1}{a_1 a_2 \cdots a_k} = \frac{1}{a_1} + \frac{1}{a_1 a_2} + \frac{1}{a_1 a_2 a_3} + \cdots, \]
where $\{a_k\}$ is a strictly increasing sequence and $a_1 > 0$.
33. [X]
a) Prove that if $\sum_k a_k^2$ and $\sum_k b_k^2$ converge, then $\sum_k a_k b_k$ converges.
b) Prove that if $\sum_k a_k^2$ converges, then $\sum_k \dfrac{a_k}{k}$ converges.
Problems 4.6 : Taylor series
34. [R] Find, in each case, the Maclaurin series for $f$. Express your answer using summation notation. (You may find your answers to some questions of Problems 4.1 : Taylor polynomials helpful.)
a) $f(x) = e^x$ b) $f(x) = \sin x$ c) $f(x) = \sinh x$
d) $f(x) = \ln(1 + x)$ e) $f(x) = \dfrac{1}{x - 1}$
35. [R] Suppose that $f(x) = e^x$.
a) Express $f$ in the form $f(x) = p_n(x) + R_{n+1}(x)$, where $p_n$ is the $n$th Taylor polynomial for $f$ about 0 and $R_{n+1}$ is the Lagrange formula for the remainder.
b) Fix $x$ in $\mathbb{R}$ and show that $R_{n+1}(x) \to 0$ as $n \to \infty$.
c) Hence write down the Taylor series expansion for $f(x)$, stating clearly where the expansion is valid.
36. [H] Repeat the previous question in the case when $f(x) = \sinh x$.
37. [H] Suppose that $f(x) = \ln(1 + x)$.
a) Express $f$ in the form $f(x) = p_n(x) + R_{n+1}(x)$, where $p_n$ is the $n$th Taylor polynomial for $f$ about 0 and $R_{n+1}$ is the Lagrange formula for the remainder.
b) Suppose that $0 \le x \le 1$ and show that $R_{n+1}(x) \to 0$ as $n \to \infty$.
c) Hence write down the Taylor series expansion for $f(x)$, when $0 \le x \le 1$.
d) [X] We will show that this Taylor series expansion is also valid on $(-1, 0)$. Suppose that $-1 < x < 0$.
i) By using the integral form for the remainder, show that
\[ R_{n+1}(x) = \int_0^x \left( \frac{t - x}{1 + t} \right)^n \frac{1}{1 + t}\,dt. \]
ii) Use the mean value theorem for integrals (when $g(t) = 1$; see Question 8) to show that
\[ |R_{n+1}(x)| < \left( \frac{c_n + |x|}{1 + c_n} \right)^n \left( \frac{|x|}{1 + x} \right) \]
for some number $c_n$ between $x$ and 0.
iii) Deduce that
\[ |R_{n+1}(x)| < |x|^n \left( \frac{|x|}{1 + x} \right) \]
and hence that $R_{n+1}(x) \to 0$ as $n \to \infty$.
38. [H] Let $I$ denote the interval $(x_0 - R, x_0 + R)$, where $x_0$ and $R$ are real numbers. Suppose that a function $f$ has derivatives of all orders on $I$ and that all these derivatives have a common bound (that is, $|f^{(n)}(x)| \le M$ for all $x$ in $I$ and for all positive integers $n$). Show that $f$ is represented by its Taylor series about $x_0$ on $I$.
39. [X] Suppose that
\[ f(x) = \begin{cases} e^{-1/x^2} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases} \]
a) Show that $f$ is differentiable everywhere and find a formula for $f'$.
b) Show that $f'$ is differentiable everywhere and find a formula for $f''$.
c) Suppose that $k$ is a positive integer. Prove that $\dfrac{d}{dx} \left( \dfrac{e^{-1/x^2}}{x^k} \right)$ is a linear combination of functions of the form $\dfrac{e^{-1/x^2}}{x^m}$, where $m$ is a positive integer.
d) Hence deduce that $f^{(n)}(0) = 0$ for every natural number $n$.
e) Write down the Maclaurin series for $f$. Where does the Maclaurin series converge? Where does it converge to $f$?
40. [H] (This exercise illustrates that a conditionally convergent series, when rearranged, can have a different sum.)
Consider the series $s$ and $t$, given by
\[ s = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \frac{1}{5} - \frac{1}{6} + \cdots \]
and
\[ t = 1 - \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{6} - \frac{1}{8} + \frac{1}{5} - \frac{1}{10} - \frac{1}{12} + \cdots. \]
Note that the second series is a rearrangement of the first.
a) Explain why $s$ is conditionally convergent.
b) By considering an appropriate Maclaurin series (see Theorem 4.6.5), find the exact value of $s$.
c) Denote the sum of the first $n$ terms of $s$ by $s_n$ and of $t$ by $t_n$. By using induction, show that $t_{3n} = \frac{1}{2} s_{2n}$ whenever $n \ge 1$.
d) Hence find the value of $\lim_{n\to\infty} t_{3n}$.
e) Explain why $\lim_{n\to\infty} t_{3n+1} = \lim_{n\to\infty} t_{3n+2} = \lim_{n\to\infty} t_{3n}$.
f) Hence write down the value of $t$.
Problems 4.7 : Power series
41. [R] Determine the open interval of convergence for each of the following power series. (Students studying MATH1241 should also examine the behaviour of the power series at the endpoints.)
a) $\sum_{k=0}^{\infty} \left( \dfrac{x}{6} \right)^k$ b) $\sum_{k=0}^{\infty} \dfrac{x^k}{k^2 + 1}$ c) $\sum_{k=0}^{\infty} \dfrac{k x^k}{2^k}$
d) $\sum_{k=1}^{\infty} \dfrac{(x - 2)^k}{k^3}$ e) $\sum_{k=2}^{\infty} \dfrac{(3x - 2)^k}{k \ln k}$ f) $\sum_{k=1}^{\infty} \dfrac{(-1)^k x^k}{(k + 1) 3^k}$
g) [X] $\sum_{k=1}^{\infty} \dfrac{(\ln k)^k x^k}{k^k}$
42. [R] Consider the power series
(i) $\sum \dfrac{n!}{n^n}\, x^n$ and (ii) $\sum \dfrac{n^n}{n!}\, x^n$.
a) Show that their radii of convergence are $e$ and $e^{-1}$ respectively.
b) [X] Show, by any method, that $a(n) = \left( 1 + \dfrac{1}{n} \right)^n$ is a strictly increasing sequence (whose limit is $e$).
c) [X] Deduce from your working in (a) and (b) that (i) diverges when $x = \pm e$.
d) [X] Here you may assume Stirling's formula:
\[ n! \Big/ \left\{ \sqrt{2\pi}\, n^{n + 1/2} e^{-n} \right\} \to 1 \quad \text{as } n \to \infty. \]
Show that series (ii) diverges when $x = e^{-1}$. Using Stirling's formula and (b), show that (ii) is conditionally convergent when $x = -e^{-1}$.
Problems 4.8 : Manipulation of power series
43. [R] Use your answers to Question 34 to deduce the Maclaurin series for each function $g$.
a) $g(x) = (x + 1)e^x$ b) $g(x) = \sin(x^2)$ c) $g(x) = (x - 1)^{-2}$
44. [R] Suppose that $f(x) = (1 - x)^{-1}$. Write down the Maclaurin series for $f$ and hence find the Maclaurin series for each function $g$ given below. On what open interval is each function represented by its Maclaurin series?
a) $g(x) = (1 + x)^{-1}$ b) $g(x) = (1 + x^2)^{-1}$ c) $g(x) = \tan^{-1} x$
45. [R] Suppose that $f(x) = x^2 \sin(x^3)$.
a) By using the Maclaurin series for sine, find the Maclaurin series for $f$.
b) Hence show that 0 is a stationary point for $f$.
c) Is 0 a local maximum point, local minimum point or a horizontal point of inflexion? Explain.
46. [R] Consider the Maclaurin series representation
\[ \frac{1}{1 - x} = 1 + x + x^2 + x^3 + x^4 + \cdots, \]
which is valid whenever $|x| < 1$.
a) By first integrating the above Maclaurin series, deduce that
\[ \ln\left( \frac{1 + x}{1 - x} \right) = 2 \left( x + \frac{x^3}{3} + \frac{x^5}{5} + \frac{x^7}{7} + \cdots \right) \]
whenever $|x| < R$, for some real number $R$.
b) What is the largest possible value of $R$?
c) Use the first two terms of the series of (a) to find a rational number that approximates $\ln 2$.
47. [R] The function $\mathrm{Si} : \mathbb{R} \to \mathbb{R}$ is defined by
\[ \mathrm{Si}(x) = \int_0^x f(t)\,dt, \]
where
\[ f(t) = \begin{cases} \dfrac{\sin t}{t} & \text{if } t \ne 0 \\ 1 & \text{if } t = 0. \end{cases} \]
The Si function is used in signal processing and by surveyors for GPS.
a) Show that Si has a stationary point at $\pi$ and classify this stationary point.
b) By first writing down the Maclaurin series for the sine function, find the Maclaurin series for Si.
c) Hence find an estimate for the value of $\mathrm{Si}(\pi)$ such that the absolute error is less than $1/100$.
48. [H] A function $f$ is defined by the rule
\[ f(x) = \sum_{k=1}^{\infty} k x^k. \]
a) What is the largest open interval $I$ on which the function is well-defined?
b) By manipulating the power series expansion
\[ \frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k, \quad |x| < 1, \]
find a closed formula for $f(x)$.
49. [H] Consider the differential equation
\[ x^2 \frac{d^2y}{dx^2} + x \frac{dy}{dx} = e^x - 1. \]
Suppose that a solution $y$ has a power series representation given by
\[ y = \sum_{k=0}^{\infty} a_k x^k, \]
where the coefficients $a_k$ are to be determined.
a) Write down, in summation notation, the Maclaurin series representation of $e^x - 1$.
b) Write down the power series representations of $\dfrac{dy}{dx}$ and $\dfrac{d^2y}{dx^2}$.
c) By expressing both the left- and right-hand sides of the differential equation as a power series, determine the value of each coefficient $a_k$.
d) Hence write down a solution to the differential equation, stating the values of $x$ for which the solution is valid.
Note: This method for solving differential equations will be developed in some second year courses.
Chapter 5
Averages, arc length, speed and surface area
In this chapter we look at the application of calculus to the following problems:
• finding the average height of a cable, or the average temperature over a certain time
interval;
• finding the length of a curve;
• finding the speed of a particle that travels along a curve in the plane; and
• finding the surface area of certain solids.
Each of these applications involves integration, either through approximating a quantity with a
Riemann sum, or through approximating the rate of change of a quantity and thereafter applying
the fundamental theorem of calculus.
5.1 The average value of a function
(Ref: SH10 §5.9)
A cable is suspended between two poles as shown.
[Figure: a cable suspended between two poles, with heights of 2 metres and 3 metres marked.]
What is the average height of the cable above the ground? Clearly it must be somewhere between 2 and 3 metres. To obtain a precise answer, recall from MATH1131 that any suspended cable is the graph of a function $f$ of the form
\[ f(x) = \frac{1}{c} \cosh(cx) \]
over some interval [a, b], where c is a constant that depends on the tension in and mass of the cable,
and where the coordinate system is suitably chosen. Thus we rephrase the question as, ‘What is
the average value of f on the interval [a, b]?’
To answer this question, we need a suitable definition for the average value of a function f
over an interval [a, b]. To motivate such a definition, consider the function f whose graph is shown
below.
[Figure: graph of $y = f(x)$ over the interval $[a, b]$.]
One way to proceed is the following. Divide the interval [a, b] into n subintervals of equal length.
We sample the height of the graph in the kth subinterval by choosing a point ck in that subinterval
and calculating f(ck).
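[Figure: the interval $[a, b]$ divided into $n$ subintervals with sample points $c_1, c_2, \ldots, c_n$; the height $f(c_k)$ is marked above $c_k$.]
The average (that is, arithmetic mean) $a_n$ of these sampled heights is given by
\[ a_n = \frac{1}{n} \bigl( f(c_1) + f(c_2) + f(c_3) + \cdots + f(c_n) \bigr). \]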
As the number n of subintervals increases, an should get closer to what we intuitively understand
by ‘the average height of the graph.’
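Now, by multiplying and dividing $a_n$ by $(b - a)$, we find that
\begin{align*}
a_n &= \frac{1}{n} \sum_{k=1}^{n} f(c_k) \\
&= \frac{1}{n} \sum_{k=1}^{n} f(c_k) \left( \frac{b - a}{b - a} \right) \\
&= \frac{1}{b - a} \sum_{k=1}^{n} f(c_k) \left( \frac{b - a}{n} \right). \tag{5.1}
\end{align*}
Note that $f(c_k)$ and $\frac{b - a}{n}$ are the height and width, respectively, of the $k$th rectangle in the following diagram.
[Figure: rectangles of width $\frac{b - a}{n}$ and height $f(c_k)$ erected over the subintervals of $[a, b]$, with sample points $c_1, c_2, \ldots, c_n$.]
So the sum in (5.1) is a Riemann sum. If $f$ is Riemann integrable then
\[ \lim_{n\to\infty} a_n = \frac{1}{b - a} \lim_{n\to\infty} \sum_{k=1}^{n} f(c_k) \left( \frac{b - a}{n} \right) = \frac{1}{b - a} \int_a^b f(x)\,dx. \]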
This leads to the following definition.
Definition 5.1.1. Suppose that $f$ is integrable on a closed interval $[a, b]$. Then the average value $\bar{f}$ of $f$ on $[a, b]$ is defined by the formula
\[ \bar{f} = \frac{1}{b - a} \int_a^b f(x)\,dx. \]
Remark 5.1.2. By rearranging this formula, we see that $\bar{f}$ is the unique constant such that
\[ (b - a)\bar{f} = \int_a^b f(x)\,dx. \]
Interpreted geometrically, $\bar{f}$ is the unique $y$-value such that the area of the shaded rectangle is equal to the area under the graph of $f$.
[Figure: graph of $f$ on $[a, b]$ together with a shaded rectangle of height $\bar{f}$ over $[a, b]$.]
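The limiting process that motivates Definition 5.1.1 can be observed numerically. The sketch below is illustrative only: the function $f(x) = x^2$ on $[0, 3]$ and the choice of midpoints as the sample points $c_k$ are assumptions made here, and `sampled_average` is a hypothetical helper name.

```python
def sampled_average(f, a, b, n):
    """Arithmetic mean of f at the midpoints of n equal subintervals of [a, b]."""
    width = (b - a) / n
    return sum(f(a + (k + 0.5) * width) for k in range(n)) / n


def square(x):
    return x * x


# For f(x) = x^2 on [0, 3], the definition gives (1/3) * (3^3 / 3) = 3.
exact_average = 3.0
for n in (10, 100, 1000):
    print(n, sampled_average(square, 0.0, 3.0, n))
```

As $n$ increases, the printed averages should approach the exact average value 3.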
We now answer the question posed at the beginning of this section.
Example 5.1.3 (The average height of a suspended cable). The suspended cable illustrated at the beginning of this section is a curve given by the equation
\[ y = 2\cosh(x/2), \quad -a \le x \le a, \]
where the $x$-axis runs along the ground, the $y$-axis passes through the vertex of the curve and $a = 2\cosh^{-1}(3/2)$. Find, to the nearest centimetre, the average height of the cable above the ground.
Note: The fact that the $x$-axis runs along the ground is special to this example. This will not necessarily be the case for any given suspended cable.
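Solution. Suppose that $f(x) = 2\cosh(x/2)$. The average value $\bar{f}$ of $f$ on the interval $[-a, a]$ is given by
\begin{align*}
\bar{f} &= \frac{1}{a + a} \int_{-a}^{a} 2\cosh(x/2)\,dx \\
&= \frac{2}{2a} \int_0^a 2\cosh(x/2)\,dx \quad \text{(since cosh is even)} \\
&= \frac{1}{a} \Bigl[ 4\sinh(x/2) \Bigr]_0^a \\
&= \frac{4}{a} \sinh(a/2) \\
&= \frac{2}{\cosh^{-1}(3/2)} \sinh\bigl( \cosh^{-1}(3/2) \bigr).
\end{align*}
Now $\cosh^2(t) - \sinh^2(t) = 1$ and so, taking the positive square root because $\cosh^{-1}(3/2) > 0$,
\[ \sinh\bigl( \cosh^{-1}(3/2) \bigr) = \sqrt{\cosh^2\bigl( \cosh^{-1}(3/2) \bigr) - 1} = \frac{\sqrt{5}}{2}. \]
Hence
\[ \bar{f} = \frac{2}{\cosh^{-1}(3/2)} \times \frac{\sqrt{5}}{2}. \]
By using a calculator, we find that $\bar{f} \approx 2.32$. So the average height of the cable above the ground is approximately 2.32 metres.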
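In place of a calculator, the final value can be checked with Python's `math` module. The sketch below evaluates the closed form from the solution and compares it with a sampled average of the height; the variable names and the number of sample points are arbitrary choices made for this illustration.

```python
from math import acosh, cosh, sinh, sqrt

a = 2 * acosh(1.5)                       # half-width of the span, as in the example


def height(x):
    """Height of the cable above the ground at horizontal position x."""
    return 2 * cosh(x / 2)


closed_form = (4 / a) * sinh(a / 2)      # (4/a) sinh(a/2) from the solution
simplified = sqrt(5) / acosh(1.5)        # the same value after simplification

# Sampled average of the height at the midpoints of n equal subintervals of [-a, a].
n = 100_000
width = 2 * a / n
numerical = sum(height(-a + (k + 0.5) * width) for k in range(n)) / n

print(f"{closed_form:.4f} {simplified:.4f} {numerical:.4f}")
```

Evaluating `height` at the endpoints and at 0 also recovers the heights of 3 metres and 2 metres shown in the diagram at the start of the section.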
We saw in MATH1131 that every continuous function f defined on a closed interval [a, b]
attains its maximum and minimum values. (This result is called the maximum-minimum theorem;
see Chapter 2 in the MATH1131 calculus notes). The next theorem says that such a function also
attains its average value.
Theorem 5.1.4 (The mean value theorem for integrals). Suppose that $f$ is continuous on $[a, b]$. Then there is a number $c$ in $(a, b)$ such that
\[ \int_a^b f(t)\,dt = f(c)(b - a). \]
Restated, the conclusion of the mean value theorem for integrals says that there exists a point $c$ in $(a, b)$ such that $f(c) = \bar{f}$, where $\bar{f}$ is the average value of $f$ on $[a, b]$.
Proof. Define $F : [a, b] \to \mathbb{R}$ by the formula
\[ F(x) = \int_a^x f(t)\,dt. \]
By the fundamental theorem of calculus, $F$ is continuous on $[a, b]$, differentiable on $(a, b)$ and $F'(x) = f(x)$. By the mean value theorem, there exists $c \in (a, b)$ such that
\[ \frac{F(b) - F(a)}{b - a} = F'(c). \tag{5.2} \]
But
\[ F(a) = 0, \quad F(b) = \int_a^b f(t)\,dt \quad \text{and} \quad F'(c) = f(c). \]
Hence (5.2) implies that
\[ \frac{1}{b - a} \int_a^b f(t)\,dt = f(c) \]
as required.
Remark 5.1.5. A more general version of the mean value theorem for integrals is given in the
tutorial problems for Chapter 4 and is used to prove the Lagrange formula for the remainder in
Taylor’s theorem.
5.2 The arc length of a curve
(Ref: SH10 §10.7)
Suppose that $P_0(x_0, y_0)$ and $P_1(x_1, y_1)$ are two points in $\mathbb{R}^2$. The distance between $P_0$ and $P_1$ is given by
\[ \operatorname{dist}(P_0, P_1) = \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2}. \]
Suppose that the points $P_0$ and $P_1$ are the endpoints of a straight line segment $P_0P_1$, as shown.
[Figure: the straight line segment joining $P_0(x_0, y_0)$ and $P_1(x_1, y_1)$.]
Then we define the length of the line segment $P_0P_1$ to be the distance between $P_0$ and $P_1$. In other words,
\[ \operatorname{length}(P_0P_1) = \sqrt{(x_1 - x_0)^2 + (y_1 - y_0)^2}. \]
The line segment $P_0P_1$ is a special example of a curve in $\mathbb{R}^2$. In this section we study the lengths of curves that are not necessarily straight line segments. We begin by presenting an intuitive derivation of a formula that gives the arc length of a curve. (A rigorous approach to proving the validity of such a formula would involve giving a formal definition for arc length and considering the limits of some technically difficult Riemann sums; we shall not delve into this here.) The subsections following this derivation present variations of this formula and some examples.
5.2.1 An intuitive derivation of the arc length formula
Suppose that $C$ is a curve in $\mathbb{R}^2$. The goal is to give an heuristic derivation of a formula for the arc length of $C$. We make the assumption that $C$ can be expressed in parametric form as
\[ C = \{(x(t), y(t)) \in \mathbb{R}^2 : a \le t \le b\}, \]
where $x$ and $y$ are differentiable functions of $t$.
[Figure: the curve $C$ running from $(x(a), y(a))$ to $(x(b), y(b))$.]
We also assume that the parametrisation is chosen so that the path traversed by the moving point $(x(t), y(t))$ does not retrace its steps (either forwards or backwards).
When $a \le s \le b$, let $\ell(s)$ denote the arc length of the curve $C_s$, given by
\[ C_s = \{(x(t), y(t)) \in \mathbb{R}^2 : a \le t \le s\}. \]
The segment $C_s$ corresponding to the arc length $\ell(s)$ is illustrated in black in the diagram below.
[Figure: the curve $C$ with the portion $C_s$ from $(x(a), y(a))$ to $(x(s), y(s))$ drawn in black.]
The idea is to take a small segment of the curve and approximate its length with the length of a secant. Suppose that $a < t < b$ and that $h$ is a small nonzero real number. Consider the points $P(x(t), y(t))$ and $Q(x(t + h), y(t + h))$.
[Figure: the curve from $(x(a), y(a))$ to $(x(b), y(b))$ with the points $P$ and $Q$ marked and joined by a secant.]
The length of the arc from $P$ to $Q$ is approximately equal to the length of the secant $PQ$. That is,
\[ \ell(t + h) - \ell(t) \approx \sqrt{[x(t + h) - x(t)]^2 + [y(t + h) - y(t)]^2}, \]
where the length of the secant is calculated using the distance formula. If we divide both sides by $h$ then
\[ \frac{\ell(t + h) - \ell(t)}{h} \approx \sqrt{\left[ \frac{x(t + h) - x(t)}{h} \right]^2 + \left[ \frac{y(t + h) - y(t)}{h} \right]^2}. \]
This approximation gets better as $h$ gets smaller. If we make the assumption that $\ell$ is a differentiable function of $s$ then, by taking the limit as $h$ approaches zero, one obtains
\[ \ell'(t) = \sqrt{[x'(t)]^2 + [y'(t)]^2}. \]
Hence
\[ \ell(s) = \int_a^s \sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt + K \]
for some constant of integration $K$, by the fundamental theorem of calculus. To evaluate $K$, note that $\ell(a) = 0$. So if $s = a$ then
\[ 0 = \ell(a) = \int_a^a \sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt + K = 0 + K = K. \]
Hence $K = 0$ and thus
\[ \ell(s) = \int_a^s \sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt. \]
Finally, the length of the entire curve is $\ell(b)$. So the arc length of $C$ is
\[ \int_a^b \sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt. \]
Our findings are summarised at the beginning of the next subsection.
5.2.2 Arc length for a parametrised curve
Suppose that a curve $C$ can be expressed in parametric form as
\[ C = \{(x(t), y(t)) \in \mathbb{R}^2 : a \le t \le b\}, \]
where $x$ and $y$ are differentiable functions of $t$. Then its arc length $\ell$ is given by the formula
\[ \ell = \int_a^b \sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt. \tag{5.3} \]
The next example shows how this formula is applied. We consider the cycloid, which is closely related to the so-called ‘curve of fastest descent’ (see Section 7.2 of the MATH1131 calculus course notes).
Example 5.2.1 (The arc length of a cycloid). Find the arc length of one arch of the cycloid
\[ x(t) = r(t - \sin t), \quad y(t) = r(1 - \cos t), \quad 0 \le t \le 2\pi. \]
[Figure: one arch of the cycloid, with $2\pi r$ marked on the $x$-axis and $2r$ marked on the $y$-axis.]
Solution. We begin by calculating the derivatives:
\[ x'(t) = r(1 - \cos t), \quad y'(t) = r\sin t. \]
Hence
\begin{align*}
[x'(t)]^2 + [y'(t)]^2 &= r^2(1 - 2\cos t + \cos^2 t) + r^2 \sin^2 t \\
&= 2r^2(1 - \cos t),
\end{align*}
since $\cos^2 t + \sin^2 t = 1$. We want to substitute this into formula (5.3). Before doing so, it is best to express $1 - \cos t$ as a square. Now $1 - \cos 2\theta = 2\sin^2 \theta$ and so
\begin{align*}
[x'(t)]^2 + [y'(t)]^2 &= 2r^2(1 - \cos t) \\
&= 4r^2 \sin^2(t/2).
\end{align*}
Hence (5.3) gives
\begin{align*}
\ell &= \int_0^{2\pi} \sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt \\
&= \int_0^{2\pi} \sqrt{4r^2 \sin^2(t/2)}\,dt.
\end{align*}
At this point we should be careful with taking the square root, since $\sqrt{a^2} = a$ only when $a \ge 0$. Now $\sin(t/2)$ is positive whenever $0 < t < 2\pi$, and so taking the square root in the ‘naïve’ way causes no problems. Therefore
\begin{align*}
\ell &= \int_0^{2\pi} 2r\sin(t/2)\,dt \\
&= 2r \Bigl[ -2\cos(t/2) \Bigr]_0^{2\pi} \\
&= 8r.
\end{align*}
So the arc length of one arch of the cycloid is $8r$ units.
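The value $8r$ can be checked by evaluating formula (5.3) numerically. The sketch below uses a simple midpoint rule; the helper name `arc_length`, the number of subintervals and the sample radius $r = 1.5$ are all assumptions made for this illustration.

```python
from math import cos, pi, sin, sqrt


def arc_length(dx, dy, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of sqrt(x'(t)^2 + y'(t)^2) over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += sqrt(dx(t) ** 2 + dy(t) ** 2)
    return total * h


r = 1.5                                   # sample radius; any positive value works
length = arc_length(lambda t: r * (1 - cos(t)),   # x'(t)
                    lambda t: r * sin(t),         # y'(t)
                    0.0, 2 * pi)

print(length, 8 * r)
```

The two printed values should agree closely, since the integrand $2r\sin(t/2)$ is smooth on $[0, 2\pi]$.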
Remark 5.2.2. For parametrisations of closed curves, one should be careful with the limits of
integration. For example, a circle of radius r and centre (0, 0) may be parametrised as
\[ x(t) = r\cos t, \quad y(t) = r\sin t, \quad 0 \le t \le 2\pi. \]
Hence
\[ \ell = \int_0^{2\pi} \sqrt{[-r\sin t]^2 + [r\cos t]^2}\,dt = \int_0^{2\pi} r\,dt = 2\pi r, \]
which shows that the circumference of the circle is 2πr, as expected.
On the other hand, if we use the parametrisation
\[ x(t) = r\cos 2t, \quad y(t) = r\sin 2t, \]
then
\[ \int_0^{2\pi} \sqrt{[-2r\sin 2t]^2 + [2r\cos 2t]^2}\,dt = \int_0^{2\pi} 2r\,dt = 4\pi r, \]
which is not the circumference of the circle. The reason for this is that, as t varies from 0 to 2π, the
point (x(t), y(t)) moves around the circle twice! To find the arc length using this parametrisation,
one should instead integrate from 0 to π.
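The point of Remark 5.2.2 can be seen numerically. The sketch below applies a midpoint rule to formula (5.3) for the parametrisation x(t) = r cos 2t, y(t) = r sin 2t over two different parameter intervals. The helper name `arc_length` and the value r = 3 are assumptions made only for illustration.

```python
import math

def arc_length(dx, dy, a, b, n=10_000):
    """Midpoint-rule estimate of the integral of sqrt(dx(t)^2 + dy(t)^2) over [a, b]."""
    h = (b - a) / n
    return h * sum(
        math.hypot(dx(a + (k + 0.5) * h), dy(a + (k + 0.5) * h)) for k in range(n)
    )

r = 3.0  # illustrative radius (an assumption)
dx = lambda t: -2 * r * math.sin(2 * t)  # derivative of r cos 2t
dy = lambda t: 2 * r * math.cos(2 * t)   # derivative of r sin 2t

twice = arc_length(dx, dy, 0.0, 2 * math.pi)  # circle traversed twice
once = arc_length(dx, dy, 0.0, math.pi)       # circle traversed once
print(twice, 4 * math.pi * r)
print(once, 2 * math.pi * r)
```

Only the second interval reproduces the circumference, which is why the limits of integration must match a single traversal of the curve.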
5.2.3 Arc length for the graph of a function
Suppose that f is a function of one variable. To find the arc length of the graph of f on the interval
[a, b], we parametrise the curve
\[ y = f(x) \]
by
\[ x(t) = t, \quad y(t) = f(t), \quad a \le t \le b. \]
Now
\[ x'(t) = 1 \quad\text{and}\quad y'(t) = f'(t). \]
By using the arc length formula (5.3), one finds that
\[ \ell = \int_a^b \sqrt{1 + [f'(t)]^2}\,dt. \]
We usually write the variable of integration as $x$.
In summary, the arc length ℓ of the graph of a function f on the interval [a, b] is given by
\[ \ell = \int_a^b \sqrt{1 + [f'(x)]^2}\,dx. \tag{5.4} \]
The use of this formula is illustrated below. We remind readers that a catenary is the shape
of a hanging cable and is described using the hyperbolic cosine function. See Chapter 10 of the
MATH1131 calculus notes for further details.
Example 5.2.3 (The arc length of a catenary). Find the arc length of a catenary whose graph is
given by
\[ y = \frac{1}{a}\cosh(ax), \quad x \in [-b, b]. \]
[Figure: the catenary on $[-b, b]$, with lowest point at height $1/a$ on the y-axis.]
Solution. Suppose that $f(x) = \frac{1}{a}\cosh(ax)$, so that $f'(x) = \sinh(ax)$. By formula (5.4),
\begin{align*}
\ell &= \int_{-b}^{b} \sqrt{1 + [f'(x)]^2}\,dx \\
&= \int_{-b}^{b} \sqrt{1 + \sinh^2(ax)}\,dx \\
&= \int_{-b}^{b} \sqrt{\cosh^2(ax)}\,dx && \text{(since $\cosh^2 t - \sinh^2 t = 1$)} \\
&= \int_{-b}^{b} \cosh(ax)\,dx && \text{(since cosh is always positive)} \\
&= 2\int_0^b \cosh(ax)\,dx && \text{(since cosh is even)} \\
&= 2\Big[\frac{1}{a}\sinh(ax)\Big]_0^b \\
&= \frac{2}{a}\sinh(ab) \\
&= \frac{1}{a}\big(e^{ab} - e^{-ab}\big).
\end{align*}
So the arc length of the catenary is $\frac{1}{a}\big(e^{ab} - e^{-ab}\big)$ units.
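The closed form found in Example 5.2.3 can be compared with a direct numerical evaluation of formula (5.4). The following sketch uses a composite Simpson's rule; the helper `simpson` and the sample values a = 0.5 and b = 2 are assumptions chosen purely for illustration.

```python
import math

def simpson(g, lo, hi, n=1000):
    """Composite Simpson's rule for the integral of g over [lo, hi] (n is made even)."""
    if n % 2:
        n += 1
    h = (hi - lo) / n
    s = g(lo) + g(hi)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(lo + k * h)
    return s * h / 3

a, b = 0.5, 2.0  # illustrative parameter values (assumptions)

# Formula (5.4) with f'(x) = sinh(ax)
numeric = simpson(lambda x: math.sqrt(1 + math.sinh(a * x) ** 2), -b, b)
closed_form = (math.exp(a * b) - math.exp(-a * b)) / a
print(numeric, closed_form)
```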
5.2.4 Arc length for a polar curve
Suppose that a curve is described using polar coordinates by
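\[ r = f(\theta), \quad \theta_0 \le \theta \le \theta_1. \]
Since
\[ x = r\cos\theta = f(\theta)\cos\theta \quad\text{and}\quad y = r\sin\theta = f(\theta)\sin\theta, \]
we have a parametrisation for the curve in terms of $\theta$. Now
\[ x'(\theta) = -f(\theta)\sin\theta + f'(\theta)\cos\theta, \quad\text{while}\quad y'(\theta) = f(\theta)\cos\theta + f'(\theta)\sin\theta. \]
Hence
\begin{align*}
[x'(\theta)]^2 + [y'(\theta)]^2 &= [f(\theta)]^2\sin^2\theta - 2f(\theta)f'(\theta)\sin\theta\cos\theta + [f'(\theta)]^2\cos^2\theta \\
&\quad + [f(\theta)]^2\cos^2\theta + 2f(\theta)f'(\theta)\sin\theta\cos\theta + [f'(\theta)]^2\sin^2\theta \\
&= [f(\theta)]^2 + [f'(\theta)]^2,
\end{align*}
where we have used the fact that $\sin^2\theta + \cos^2\theta = 1$. So by using the parametric form (5.3) for arc
length, we find that the arc length $\ell$ of the polar curve is given by
\[ \ell = \int_{\theta_0}^{\theta_1} \sqrt{[f(\theta)]^2 + [f'(\theta)]^2}\,d\theta. \]
We usually write $f(\theta)$ as $r$ and $f'(\theta)$ as $\frac{dr}{d\theta}$.
In summary, the arc length $\ell$ of a polar curve is given by
\[ \ell = \int_{\theta_0}^{\theta_1} \sqrt{r^2 + \Big(\frac{dr}{d\theta}\Big)^2}\,d\theta. \]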
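Example 5.2.4. The spiral graphed below is given by the polar equation $r = e^{-\theta/10}$, where $\theta \ge 0$.
Is the total arc length finite? Explain.
[Figure: the spiral $r = e^{-\theta/10}$, starting at $(1, 0)$ and winding in towards the origin.]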
Solution. We begin by calculating the arc length for the segment of the curve when $0 \le \theta \le \theta_1$.
Now
\[ r = e^{-\theta/10} \quad\text{and}\quad \frac{dr}{d\theta} = -\frac{1}{10}e^{-\theta/10}. \]
Hence
\begin{align*}
\ell &= \int_0^{\theta_1} \sqrt{\big(e^{-\theta/10}\big)^2 + \big(-\tfrac{1}{10}e^{-\theta/10}\big)^2}\,d\theta \\
&= \int_0^{\theta_1} \sqrt{\big(1 + \tfrac{1}{100}\big)e^{-2\theta/10}}\,d\theta \\
&= \frac{\sqrt{101}}{10}\int_0^{\theta_1} e^{-\theta/10}\,d\theta \\
&= \sqrt{101}\,\big(1 - e^{-\theta_1/10}\big).
\end{align*}
Now as $\theta_1 \to \infty$, $\ell \to \sqrt{101}$. Hence the total arc length is finite and equals $\sqrt{101}$ units.
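To see the convergence in Example 5.2.4 numerically, the sketch below estimates the polar arc length of the spiral on [0, θ₁] for several values of θ₁ and compares each estimate with the closed form found above. The function name `spiral_length` and the sample values of θ₁ are illustrative assumptions.

```python
import math

def spiral_length(theta1, n=20_000):
    """Midpoint-rule estimate of the arc length of r = exp(-theta/10) on [0, theta1]."""
    h = theta1 / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        r = math.exp(-theta / 10)
        dr = -r / 10  # dr/dtheta
        total += math.sqrt(r * r + dr * dr)
    return total * h

limit = math.sqrt(101)
for theta1 in (10, 50, 100, 200):
    exact = limit * (1 - math.exp(-theta1 / 10))
    print(theta1, spiral_length(theta1), exact)
print("limiting value:", limit)
```

The partial lengths increase with θ₁ but never exceed √101, in line with the solution.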
5.3 The speed of a moving particle
(Ref: SH10 §10.7)
In Chapter 4 of the MATH1131 calculus notes, we discussed the speed of a particle that moves
along a straight line. Now we consider the speed of a particle that moves along a curve in the plane.
Suppose that a particle P is moving in the plane and that its position at time t is given by
(x(t), y(t)). The distance s(t) that the particle has travelled from time zero to any later time t is
given by the formula
\[ s(t) = \int_0^t \sqrt{[x'(u)]^2 + [y'(u)]^2}\,du \]
(which is simply the arc length formula for the path that P traverses in this time interval). By
definition, the speed of P is the rate of change of its distance with respect to time. So if v(t)
denotes the speed of P at time t then
\[ v(t) = s'(t) = \sqrt{[x'(t)]^2 + [y'(t)]^2} \]
by the fundamental theorem of calculus.
In summary, the speed v(t) of a particle P at time t is given by
\[ v(t) = \sqrt{[x'(t)]^2 + [y'(t)]^2}, \]
where the functions x and y give the position (x(t), y(t)) of P at time t.
Example 5.3.1. A stone is thrown horizontally from the deck of the Sydney Harbour Bridge at
20 metres per second. Its position (x(t), y(t)) exactly t seconds after the stone is thrown is given
by
\[ x(t) = 20t, \quad y(t) = 50 - 5t^2, \quad 0 \le t \le \sqrt{10}, \]
where y(t) is the height above the water (see Example 7.2.1 in the MATH1131 calculus notes).
Find the speed of the stone an instant before it hits the water.
Solution. We have
\[ x'(t) = 20 \quad\text{and}\quad y'(t) = -10t. \]
So the speed $v(t)$ of the stone at time $t$ is given by
\[ v(t) = \sqrt{20^2 + 100t^2} \]
whenever $0 < t < \sqrt{10}$. Now the stone hits the water when $y(t) = 0$, which is precisely when
$t = \sqrt{10}$. Hence the speed of the stone an instant before it hits the water is given by
\[ \lim_{t \to (\sqrt{10})^-} v(t) = \sqrt{20^2 + 100(\sqrt{10})^2} = 10\sqrt{14} \approx 37.42. \]
So the speed of the stone just before it hits the water is approximately 37.42 metres per second.
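The calculation in Example 5.3.1 is short enough to check directly. The sketch below evaluates the speed formula at the moment of impact; the names `speed` and `t_impact` are our own and are not notation from these Notes.

```python
import math

def speed(t):
    """Speed of the stone at time t, using x(t) = 20t and y(t) = 50 - 5t^2."""
    dx = 20.0        # x'(t)
    dy = -10.0 * t   # y'(t)
    return math.hypot(dx, dy)

t_impact = math.sqrt(10)  # y(t) = 0 exactly when 5t^2 = 50
print(round(speed(t_impact), 2))  # 37.42
```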
5.4 Surface area
(Ref: SH10 §10.8)
The problem of finding the surface area of a surface or solid in R3 is not easy. In this section,
we focus on finding the surface area for a surface (or solid) that is formed by rotating a curve
about one of the axes. In the first subsection we derive, intuitively, a formula for the area of such
a surface. In the second subsection, the relevant formulae are summarised and examples given.
The formulae presented all rely on the formula for the surface area of the frustum of a right
circular cone.
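[Figure: a frustum of a right circular cone with end radii $r$ and $R$ and slant height $s$.]
Given a frustum of slant height $s$ and radii $r$ and $R$, the surface area $A$ of the ‘curved surface’ is
given by
\[ A = \pi(r + R)s. \tag{5.5} \]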
This formula may be proved using elementary methods and is left as an exercise in the tutorial
problems.
5.4.1 An heuristic derivation for the surface area of a surface of revolution
Suppose that a curve C has parametrisation given by
\[ C = \{(x(t), y(t)) \in \mathbb{R}^2 : a \le t \le b\}. \]
We will assume that
• the curve lies in the upper half-plane (more precisely, $y(t) \ge 0$ and if the curve meets the
x-axis then it does so at only a finite number of points); and
• the curve $C$ is simple: if $(x(t_0), y(t_0)) = (x(t_1), y(t_1))$ then $t_0 = t_1$ (that is, the curve does
not intersect itself).
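An example of such a curve is shown below.
[Figure: a curve $C$ in the upper half-plane.]
If we rotate the curve about the x-axis, then a surface of revolution is formed.
[Figure: the surface of revolution formed by rotating $C$ about the x-axis.]
The goal of this subsection is to derive a formula for the area of this surface.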
Let $A(s)$ denote the area of the surface formed when the curve segment
\[ \{(x(t), y(t)) \in \mathbb{R}^2 : a \le t \le s\} \]
is rotated about the x-axis. We make the assumption that $A$ is a differentiable function.
Our immediate goal is to compute the derivative of $A$. Fix $t$ in $(a, b)$ and suppose that $h$ is a
small nonzero real number. Consider the points $P(x(t), y(t))$ and $Q(x(t + h), y(t + h))$ and note
that $A(t + h) - A(t)$ is the area of the surface formed by rotating the curve segment from $P$ to $Q$
about the x-axis.
[Figure: the curve $C$ with the points $P$ and $Q$ marked.]
Since $h$ is small, this surface area is approximately equal to the area of the surface formed when
the secant $PQ$ is rotated about the x-axis.
[Figure: the secant $PQ$ rotated about the x-axis.]
This area may be calculated using formula (5.5), where the radii are $y(t)$ and $y(t + h)$ and
\[ \text{slant height} = \sqrt{[x(t + h) - x(t)]^2 + [y(t + h) - y(t)]^2}. \]
Hence
\[ A(t + h) - A(t) \approx \pi\big(y(t + h) + y(t)\big)\sqrt{[x(t + h) - x(t)]^2 + [y(t + h) - y(t)]^2}, \]
and dividing both sides by $h$ (taking $h > 0$ for simplicity) gives
\[ \frac{A(t + h) - A(t)}{h} \approx \pi\big(y(t + h) + y(t)\big)\sqrt{\Big[\frac{x(t + h) - x(t)}{h}\Big]^2 + \Big[\frac{y(t + h) - y(t)}{h}\Big]^2}. \]
Note that the approximation improves as $h$ gets smaller. Moreover, since $y$ is differentiable at $t$, it
follows that $y$ is continuous at $t$ and so $y(t + h) \to y(t)$ as $h \to 0$. By taking the limit as $h$
approaches zero, we obtain
\[ A'(t) = \pi\big(y(t) + y(t)\big)\sqrt{[x'(t)]^2 + [y'(t)]^2} = 2\pi y(t)\sqrt{[x'(t)]^2 + [y'(t)]^2}. \]
This gives an expression for the derivative of $A$. By applying the fundamental theorem of calculus,
we find that
\[ A(s) = \int_a^s 2\pi y(t)\sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt + K \]
for some constant of integration $K$. To evaluate $K$, note that $A(a) = 0$. So the substitution $s = a$
yields
\[ 0 = A(a) = \int_a^a 2\pi y(t)\sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt + K = 0 + K = K. \]
Thus $K = 0$ and hence
\[ A(s) = \int_a^s 2\pi y(t)\sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt. \]
Finally, the substitution $s = b$ yields
\[ A(b) = \int_a^b 2\pi y(t)\sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt, \]
which is a formula for the area of the surface of revolution formed by rotating C about the x-axis.
5.4.2 Surface area formulae and examples
Assume that a curve $C$ lies in the upper half-plane and is simple (see the assumptions stated at
the beginning of Subsection 5.4.1). We present formulae for the area of the surface of revolution
about the x-axis when $C$ is described either parametrically, as the graph of a function or using polar
coordinates. In each case we assume that the appropriate derivatives exist.
If $C$ is described parametrically by
\[ C = \{(x(t), y(t)) \in \mathbb{R}^2 : a \le t \le b\}, \]
then the area $A$ of the surface of revolution about the x-axis is given by
\[ A = \int_a^b 2\pi y(t)\sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt. \tag{5.6} \]
If $C$ is the graph
\[ y = f(x), \quad x \in [a, b], \]
of a function $f$ on $[a, b]$ then the area $A$ of the surface of revolution about the x-axis is given by
\[ A = \int_a^b 2\pi f(x)\sqrt{1 + [f'(x)]^2}\,dx. \tag{5.7} \]
If $C$ is described using polar coordinates by
\[ r = f(\theta), \quad \theta_0 \le \theta \le \theta_1, \]
then the area $A$ of the surface of revolution about the x-axis is given by
\[ A = \int_{\theta_0}^{\theta_1} 2\pi r\sin\theta\,\sqrt{r^2 + \Big(\frac{dr}{d\theta}\Big)^2}\,d\theta. \tag{5.8} \]
Remark 5.4.1. Formula (5.6) was derived heuristically in Subsection 5.4.1. Formula (5.7) may be
easily deduced from (5.6) by using the following parametrisation of the graph of $f$:
\[ x(t) = t, \quad y(t) = f(t), \quad a \le t \le b. \]
Formula (5.8) may be easily deduced from (5.6) by using the following parametrisation of the
polar curve:
\[ x(\theta) = f(\theta)\cos\theta, \quad y(\theta) = f(\theta)\sin\theta, \quad \theta_0 \le \theta \le \theta_1. \]
Remark 5.4.2. In parametric form, the area $A$ of the surface of revolution about the
y-axis (provided that $x(t) \ge 0$) is given by
\[ A = \int_a^b 2\pi x(t)\sqrt{[x'(t)]^2 + [y'(t)]^2}\,dt. \]
Other versions of this formula may be easily deduced using the parametrisations given in the
previous remark.
Remark 5.4.3. Each of these formulae gives only the area of the surface of revolution. To find
the surface area of the solid of revolution, one must also add the surface area contributed by any
circular ‘caps’ appearing at each end of the surface of revolution. See, for example, Example 5.4.5.
Example 5.4.4. Find the surface area of a sphere of radius r.
Solution. A sphere of radius r is formed by rotating the curve
\[ x(t) = r\cos t, \quad y(t) = r\sin t, \quad 0 \le t \le \pi \]
about the x-axis.
[Figure: the upper semicircle of radius $r$ centred at the origin.]
By using formula (5.6), we find that the surface area A is given by
\begin{align*}
A &= \int_0^{\pi} 2\pi r\sin t\,\sqrt{[-r\sin t]^2 + [r\cos t]^2}\,dt \\
&= \int_0^{\pi} 2\pi r\sin t\,\sqrt{r^2(\sin^2 t + \cos^2 t)}\,dt \\
&= \int_0^{\pi} 2\pi r^2\sin t\,dt \\
&= 2\pi r^2\Big[-\cos t\Big]_0^{\pi} \\
&= 4\pi r^2.
\end{align*}
So the surface area of a sphere of radius $r$ is $4\pi r^2$.
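The heuristic of Subsection 5.4.1 (replace short pieces of the curve by secants and add up the areas of the resulting frusta) can be tried out on the sphere of Example 5.4.4. In the sketch below, each frustum contributes π(y₀ + y₁)s as in formula (5.5); the function name `frustum_surface`, the radius r = 2 and the number of pieces are assumptions made for illustration only.

```python
import math

def frustum_surface(x, y, a, b, n=20_000):
    """Add up the curved areas pi*(y0 + y1)*s of n frusta along (x(t), y(t)), a <= t <= b."""
    total = 0.0
    x0, y0 = x(a), y(a)
    for k in range(1, n + 1):
        t = a + (b - a) * k / n
        x1, y1 = x(t), y(t)
        slant = math.hypot(x1 - x0, y1 - y0)    # slant height of this frustum
        total += math.pi * (y0 + y1) * slant    # formula (5.5) with radii y0 and y1
        x0, y0 = x1, y1
    return total

r = 2.0  # illustrative radius (an assumption)
area = frustum_surface(
    lambda t: r * math.cos(t),
    lambda t: r * math.sin(t),
    0.0,
    math.pi,
)
print(area, 4 * math.pi * r ** 2)
```

The sum of frustum areas approaches 4πr² as the number of pieces grows, which is what the integral in formula (5.6) expresses in the limit.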
Example 5.4.5. A solid $S$ is formed by rotating the curve given by
\[ y = \sqrt{2 - x}, \quad x \in [0, 2], \]
about the x-axis. Find the surface area of the solid, making sure that every face of the solid is
accounted for.
Solution. The solid $S$ is drawn below.
[Figure: the solid $S$, which meets the x-axis at $x = 2$ and has a circular cap of radius $\sqrt{2}$ at $x = 0$.]
It has two faces: the truncated paraboloid (shaded in lighter gray) and the circular cap (shaded in
darker gray). The area $A_1$ of the truncated paraboloid is given by
\begin{align*}
A_1 &= \int_0^2 2\pi f(x)\sqrt{1 + [f'(x)]^2}\,dx && \text{(where $f(x) = \sqrt{2 - x}$)} \\
&= \int_0^2 2\pi\sqrt{2 - x}\,\Big(1 + \Big(\frac{-1}{2\sqrt{2 - x}}\Big)^2\Big)^{1/2}\,dx \\
&= \int_0^2 2\pi\sqrt{2 - x}\,\Big(1 + \frac{1}{4(2 - x)}\Big)^{1/2}\,dx \\
&= \pi\int_0^2 \sqrt{4(2 - x) + 1}\,dx \\
&= \pi\int_0^2 \sqrt{9 - 4x}\,dx \\
&= \pi\Big[\frac{(9 - 4x)^{3/2}}{-6}\Big]_0^2 \\
&= \frac{13\pi}{3}.
\end{align*}
The area $A_2$ of the circular cap is given by
\[ A_2 = \pi r^2 = \pi(\sqrt{2})^2 = 2\pi. \]
Hence the total surface area of the solid $S$ is given by
\[ A_1 + A_2 = \frac{19\pi}{3}, \]
which is approximately 19.9 square units.
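The two areas in Example 5.4.5 can be checked numerically. The sketch below applies a midpoint rule to formula (5.7); midpoints are used so that the integrand is never evaluated at x = 2, where f′ is undefined. The function names and the number of subintervals are illustrative assumptions.

```python
import math

def f(x):
    return math.sqrt(2 - x)

def df(x):
    return -1 / (2 * math.sqrt(2 - x))  # f'(x), undefined at x = 2

def paraboloid_area(n=100_000):
    """Midpoint-rule estimate of the integral of 2*pi*f(x)*sqrt(1 + f'(x)^2) over [0, 2]."""
    h = 2 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h  # midpoints stay strictly inside (0, 2)
        total += 2 * math.pi * f(x) * math.sqrt(1 + df(x) ** 2)
    return total * h

A1 = paraboloid_area()
A2 = math.pi * f(0) ** 2  # circular cap of radius f(0) = sqrt(2)
total_area = A1 + A2
print(A1, 13 * math.pi / 3)
print(total_area, 19 * math.pi / 3)
```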
We end with a simple but interesting example. Recall from high school that the volume $V$ of
the solid formed when the graph of a function $f : [a, b] \to [0, \infty)$ is rotated about the x-axis is given
by
\[ V = \int_a^b \pi[f(x)]^2\,dx. \tag{5.9} \]
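Example 5.4.6 (Gabriel’s horn). Suppose that the function $f : [1, \infty) \to \mathbb{R}$ is defined by $f(x) = \frac{1}{x}$.
Rotate the graph of $f$ about the x-axis, as shown.
[Figure: the surface formed by rotating the graph of $f$ about the x-axis.]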
The surface of revolution formed is known as Gabriel’s horn (after the biblical figure Gabriel) or
Torricelli’s trumpet (after the mathematician and philosopher Evangelista Torricelli, who was a
pupil of Galileo). Even though the length of the horn is infinite, it may still be possible to calculate
its surface area and the volume of the corresponding solid.
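We begin with the volume. First, consider the volume $V_R$ of the truncated solid shown below.
[Figure: the solid truncated at $x = R$, lying between $x = 1$ and $x = R$.]
Using formula (5.9),
\[ V_R = \int_1^R \frac{\pi}{x^2}\,dx = \pi\Big[-\frac{1}{x}\Big]_1^R = \pi\Big(1 - \frac{1}{R}\Big). \]
Now $\lim_{R \to \infty} V_R = \pi$, and so the volume of the solid of revolution is finite and equal to $\pi$ cubic units.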
We now examine the surface area. The area $A$ of the surface of revolution for the truncated
curve is given by
\[ A = 2\pi\int_1^R \frac{1}{x}\sqrt{1 + \frac{1}{x^4}}\,dx. \]
Finding an antiderivative for the integrand looks difficult. However, we are mainly interested in
whether or not the improper integral
\[ \int_1^\infty \frac{1}{x}\sqrt{1 + \frac{1}{x^4}}\,dx \]
converges. Note that
\[ \frac{1}{x}\sqrt{1 + \frac{1}{x^4}} > \frac{1}{x}\sqrt{1 + 0} = \frac{1}{x} \]
whenever $x \ge 1$. Since the improper integral
\[ \int_1^\infty \frac{1}{x}\,dx \]
diverges, so too does the integral
\[ \int_1^\infty \frac{1}{x}\sqrt{1 + \frac{1}{x^4}}\,dx \]
by the comparison test for integrals (see Chapter 8 in the MATH1131 calculus notes). Hence the
surface area of Gabriel’s horn is infinite.
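The contrast between the volume and the surface area can be observed numerically for the horn truncated at x = R. In the sketch below, the area integral is evaluated after the substitution x = eᵘ, which turns it into the integral of √(1 + e^(−4u)) over [0, ln R]; this substitution, the function names and the sample values of R are our own choices and do not appear in the example.

```python
import math

def volume(R):
    """Volume of the solid truncated at x = R, using the closed form pi*(1 - 1/R)."""
    return math.pi * (1 - 1 / R)

def area(R, n=20_000):
    """Surface area of the horn truncated at x = R.

    Uses the substitution x = e^u (an assumption of this sketch), so that
    (1/x) sqrt(1 + 1/x^4) dx becomes sqrt(1 + e^(-4u)) du on [0, ln R],
    followed by a midpoint rule.
    """
    top = math.log(R)
    h = top / n
    s = sum(math.sqrt(1 + math.exp(-4 * (k + 0.5) * h)) for k in range(n))
    return 2 * math.pi * s * h

for R in (10.0, 1000.0, 1e6):
    print(R, volume(R), area(R), 2 * math.pi * math.log(R))
```

The volumes stay below π, while the areas exceed 2π ln R and so grow without bound, in agreement with the comparison argument above.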
The fact that Gabriel’s horn has finite volume (π cubic units) and infinite surface area leads to
the following paradox. To paint the outside surface of the horn, one requires an infinite amount of
paint since the surface area is infinite. However, to paint the inside surface of the horn, one only
needs at most π cubic units of paint. Simply fill the horn with paint and then remove whatever
paint is not touching the surface. This is sometimes called the painter’s paradox.
Question: How is the paradox resolved?
Problems for Chapter 5
Problems 5.1 : The average value of a function
1. [R] It can be shown that when a cable is hanging between two poles the curve that it
forms is always the graph of a hyperbolic cosine function. Suppose that the height h (in
metres) of a cable above the ground is given by
\[ h(x) = 4\cosh\Big(\frac{x - 10}{20}\Big), \]
where 0 ≤ x ≤ 20. Find the average height of the cable above the ground.
2. [R] Suppose that the air temperature T (t), measured in degrees Celsius t hours after noon,
is given by
\[ T(t) = 25 + 2t - \frac{t^2}{3}. \]
Find the average temperature between noon and 5 p.m.
Problems 5.2 : The arc length of a curve
3. [R] Calculate the lengths of the arcs given by
a) $y = x^{3/2}$, where $0 \le x \le 1$;
b) $x = t - \sin t$, $y = 1 - \cos t$, where $0 \le t \le 2\pi$; and
c) $x = t^3$, $y = t^2$ from $(0, 0)$ to $(8, 4)$.
4. [R] The astroid
\[ x^{2/3} + y^{2/3} = a^{2/3} \tag{1} \]
has a parametrisation given by
\[ x(\theta) = a\cos^3\theta, \quad y(\theta) = a\sin^3\theta. \]
Its graph was sketched, in the case when a = 1, in one of the problems from Chapter 7 of
MATH1131.
a) Use the parametric form to calculate the arc length of the astroid.
b) [H]
i) Show that the improper integral $\int_0^1 x^{-1/3}\,dx$ converges to $3/2$ by calculating the
limit
\[ \lim_{h \to 0^+} \int_h^1 x^{-1/3}\,dx. \]
ii) Hence calculate the arc length of the astroid by using the implicit equation (1).
5. [R] Find the length of the curve $r = e^{\theta}$, where $0 \le \theta \le 2\pi$.
6. [R] Find the length of the cardioid $r = 1 + \cos\theta$.
Problems 5.3 : The speed of a moving particle
7. [R] A projectile is fired from an elevated cannon. Its horizontal distance x (in metres)
from the cannon and height y (in metres) above the ground, exactly t seconds after it is
fired, is given by
\[ x(t) = 40t, \quad y(t) = -5t^2 + 40t + 45, \quad 0 \le t \le t_1, \]
where $t_1$ is the time the projectile hits the ground.
a) Find t1.
b) Find the speed of the projectile immediately prior to impact.
c) What was the average height of the projectile above the ground during the period
after it was fired and before impact?
d) [X] What distance did the projectile travel during this period?
8. [H] The position (x(t), y(t)) of a particle P at time t is given by
\[ x(t) = \cos\Big(\frac{\pi}{2}(\cos \pi t - 1)\Big), \quad y(t) = -\sin\Big(\frac{\pi}{2}(\cos \pi t - 1)\Big), \]
where t ≥ 0.
a) Find a formula for the speed v(t) of the particle at time t.
b) For what values of t is the speed of the particle (i) a maximum and (ii) a minimum?
c) What curve does the trajectory of the particle trace out?
d) What is the length of the curve of (c)?
e) Find the distance that the particle travels during the time interval [0, 3].
Problems 5.4 : Surface area
9. [R] In this question we will show that the surface area of the frustum of a right circular
cone is given by π(R+ r)s. (For a diagram, see the beginning of Section 5.4.)
Consider a truncated right circular cone with slant height s and base radius R. Cut a line
from the vertex Q to the base and flatten the cone as shown below.
[Figure: a right circular cone with vertex $Q$, base radius $R$ and slant height $s$, together with its flattened surface.]
a) Explain why the flattened surface is the sector of a circle.
b) Find, in terms of R and s, the area of the sector and hence the surface area of the
cone.
c) Hence show that the surface area of a frustum of radii r and R and of slant height s
is given by π(R + r)s.
10. [R] Find the area of the surface of revolution formed when the given curve is rotated
about the x-axis.
a) $y = x^3$, where $0 \le x \le 2$.
b) $x = t - \sin t$, $y = 1 - \cos t$, where $0 \le t \le 2\pi$.
11. [R] Show that if $-a \le b < c \le a$ then the surface area of the sphere $x^2 + y^2 + z^2 = a^2$
between the planes $x = b$ and $x = c$ is $2\pi a(c - b)$.
12. [R] Suppose that 0 < r < R. A surface (doughnut) is formed by rotating a circle of radius
r and centre (0, R) about the x-axis. By finding a suitable parametrisation for the circle,
show that the area of the surface is (2πR)(2πr).
13. [R] Find the area of the surface formed when the polar curve r = 1 + cos θ, where
0 ≤ θ ≤ π, is rotated about the x-axis.
Answers to selected problems
Chapter 1
1. a) cup shaped b) unit sphere c) umbrella stand
d) cone with semi-vertical angle π/4 e) saddle
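2. $2xye^{x^2y}$, $x^2e^{x^2y}$, $(2x + 2x^3y)e^{x^2y}$.
3. In each part the entries are $\frac{\partial z}{\partial x}$; $\frac{\partial z}{\partial y}$; $\frac{\partial^2 z}{\partial x^2}$; $\frac{\partial^2 z}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial y\,\partial x}$; $\frac{\partial^2 z}{\partial y^2}$, in that order.
a) $2xy$; $x^2 + 2y$; $2y$; $2x$; $2$
b) $\frac{-y}{x^2 + y^2}$; $\frac{x}{x^2 + y^2}$; $\frac{2xy}{(x^2 + y^2)^2}$; $\frac{y^2 - x^2}{(x^2 + y^2)^2}$; $\frac{-2xy}{(x^2 + y^2)^2}$
c) $\cos(x - cy)$; $-c\cos(x - cy)$; $-\sin(x - cy)$; $c\sin(x - cy)$; $-c^2\sin(x - cy)$
4. a) $z = 6x + 10y - 34$, $\mathbf{n} = (6, 10, -1)^T$
b) $z = 32 - 16x + 16y$, $\mathbf{n} = (-16, 16, -1)^T$
c) $4x - 6y - 7z - 14 + 7\ln 7 = 0$, $\mathbf{n} = (4, -6, -7)^T$
d) $2x + 3y + \sqrt{23}\,z - 6 = 0$, $\mathbf{n} = (2, 3, \sqrt{23})^T$
5. a) $784\pi$ cm$^3$ b) $\frac{217\pi}{15}$ cm$^3$ c) 1.85%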
6. 0.05
7. 5.012 (calculator gives 5.012115)
8. |∆S| ≤ 0.0404
9. 9% decrease.
10. 0.21%
11. a) et(t2 + 2t) b) 2t
12. 7.5π cubic centimetres per second
13. b) F (x, y) = sin(y − x2)
14. a) $u_{xx}(x, t) = g''(x + \lambda t)$, $u_{tt}(x, t) = \lambda^2 g''(x + \lambda t)$
b) 4, −4
17. a) $u_t(x, t) = \dfrac{-u(x, t)\,f'(x - tu(x, t))}{1 + tf'(x - tu(x, t))}$, $u_x(x, t) = \dfrac{f'(x - tu(x, t))}{1 + tf'(x - tu(x, t))}$
c) $t_m = 1$
d) Hint: Try the Maple commands
with (plots):
implicitplot(y-1+tanh(x-t*y),x=-5..7, y=-1..3, gridrefine = 2);
for a few values of t. Can you animate this in time t?
e) No
18. a) $y = \pm 1$, $\frac{dy}{dt} = 0$.
b) $\nabla F = (2x, 2y, -2z)^T$.
c) The vector $\big(\frac{dx}{dt}, \frac{dy}{dt}, \frac{dz}{dt}\big)^T$ represents the velocity of a particle on the hyperboloid. So the
equation states that the velocity is perpendicular to the surface’s normal.
Chapter 2
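1. a) $\frac{1}{4}e^{2x^2}$ b) $-\frac{1}{2}\cos(x^2)$ c) $\frac{1}{6}\sin(2x^3)$
d) $\frac{1}{10}\ln|5x^2 - 11|$ e) $-\frac{1}{4}\cos^4 x$ f) $\ln(\ln x)$
g) $\sqrt{x^2 + 4x + 7}$ h) $\frac{1}{3}(1 + x^2)^{3/2}$ i) $-\frac{1}{18}(9 - 4x^3)^{3/2}$
j) $-\frac{1}{6}(9 - 4x^3)^{1/2}$ k) $-\frac{1}{8(1 + x^4)^2}$ l) $\frac{-1}{3\tan^3 x}$
m) $\frac{-1}{2\sin^2 x}$ n) $\frac{1}{8}(4 + 3e^{2x})^{4/3}$ o) $\frac{-1}{4(\ln x)^4}$
2. a) $-e^{-x}(x^2 + 2x + 2)$ b) $\frac{1}{4}x^4\ln x - \frac{1}{16}x^4$ c) $x\tan x + \ln(\cos x)$
d) $-\frac{(\ln x)^2}{x} - \frac{2\ln x}{x} - \frac{2}{x}$ e) $\frac{1}{2}e^x(\sin x + \cos x)$ f) $x\ln x - x$
g) $x\tan^{-1}x - \frac{1}{2}\ln(x^2 + 1)$
3. a) $1/8$ b) $4/15$
c) $\frac{1}{3}\sec^3 x + C$ d) $\frac{1}{4}\sin 2\theta + \frac{1}{2}\theta + C$
e) $-\frac{1}{99}(\sin x\cos 10x - 10\cos x\sin 10x) + C$ or $\frac{1}{2}\big(\frac{1}{9}\sin 9x + \frac{1}{11}\sin 11x\big) + C$
f) $-\frac{1}{10}\cos 5x + \frac{1}{2}\cos x + C$
4. a) $\ln|\tan x + \sec x| + C$
b) i) $\frac{1}{3}\sec^2 x\tan x + \frac{2}{3}\tan x + C$
ii) $\frac{1}{4}\sec^3 x\tan x + \frac{3}{8}\sec x\tan x + \frac{3}{8}\ln|\tan x + \sec x| + C$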
5. a) 3π/512 b) 1/60 c) 2/35
6. 6− 16e−1
7. 5/12− (ln 2)/2, π/4− 76/105
8. (e2 + 3)/8
9. 35π/256, 16/35
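10. $I_n = (2\sqrt{2} - 2nI_{n-1})/(1 + 2n)$
15. a) $\pi/3 - \sqrt{3}/2$; $x = 2\sin\theta$
b) $\sinh^{-1}\frac{x - 3}{2} + C$ or $\ln(x - 3 + \sqrt{x^2 - 6x + 13}) + K$; complete the square.
c) $9\pi/4$
d) $-\frac{\sqrt{x^2 + 16}}{16x} + C$; $x = 4\tan\theta$ to obtain $\int \frac{\cos\theta}{\sin^2\theta}\,d\theta$.
e) $x/\sqrt{1 - x^2} + C$
f) $\tan^{-1} 2$
16. $\sqrt{x^2 - 4} + C$
17. a) $\frac{1}{2}\ln\big|\frac{x + 1}{x + 3}\big| + C$ b) $3\log|x - 2| + 2\log|x - 1| + C$
c) $2\log|x| + \frac{1}{x} + 2\log|x - 1| + C$ d) $-\frac{1}{4}\big(\frac{2x}{x^2 - 1} - \log\big|\frac{x + 1}{x - 1}\big|\big) + C$
e) $x + \ln\big|\frac{x - 1}{x + 1}\big| + C$ f) $\ln|x - 3| - \frac{1}{2}\ln(x^2 + 9) - \tan^{-1}\frac{x}{3} + C$
g) $\ln\frac{(x + 1)^2}{|x + 2|} + \frac{4}{x + 2} + C$ h) $\frac{-1}{(1 + x)^2} + \frac{1}{1 + x} = \frac{x}{(1 + x)^2} + C$
18. a) $\frac{1}{2}\ln(x^2 + 2x + 10) - \frac{1}{3}\tan^{-1}\frac{x + 1}{3} + C$ b) $\sqrt{x^2 + 2x + 10} - \sinh^{-1}\frac{x + 1}{3} + C$
c) $2\sqrt{x} - 2\ln(1 + \sqrt{x}) + C$ d) $11 - 6\ln(3/2)$; $x = u^6$ first.
22. a) $\ln x - \frac{1}{2}\ln(x^2 + x + 1) - \frac{1}{\sqrt{3}}\tan^{-1}\big(\frac{2x + 1}{\sqrt{3}}\big) + C$
b) $\frac{8}{5}\cosh^5(x) + C$
c) $\frac{3}{2}\ln(x^2 + 4x + 8) - \frac{1}{2}\tan^{-1}\big(\frac{x + 2}{2}\big) + C$
d) $\frac{1}{2}x\sqrt{25 - x^2} + \frac{25}{2}\sin^{-1}\big(\frac{x}{5}\big) + C$
e) $\ln|x - 1| + \ln(x^2 - 2x + 2) + \tan^{-1}(x - 1) + C$
f) $-\frac{\sqrt{1 + x^2}}{x} + C$
g) $\frac{1}{3}\frac{x}{\sqrt{x^2 + 3}} + C$
h) $-\frac{1}{14}\cos(7x) + \frac{1}{2}\cos(x) + C$
i) $2/e$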
Chapter 3
1. a) y = tan(t3/3 + C) b) y = −2/(x2 +A)
c) y = 3
√
A− 3 cosx d) y = C(x− 1)/x
e) ex+y + Cey + 1 = 0 f) y2 = tan2 x+4 tanx−1, where y > 0
g) (ln x)2 = 2 ln y + C h) y = 1/(1− x3)
2. Let a and b be arbitrary real numbers with a < b. Then there are solutions of the form
y =


(x− a)3 if x < a
0 if a ≤ x ≤ b
(x− b)3 if x > b
.
3. 2
√
2
4. a) y = (x3/3 + C)e2x b) y = e−3x(tan−1 x+ C)
c) y = (2 + Ce−x)/x d) y = 2x5 +Ax2
e) y = tanx− 1 +Ae− tan x f) y = x2 tan 2x+ 14 +A sec 2x
5. x = t− 1 + Ce−t
7. a) 3x2y + y3 = A b) x sin y − 12x2y2 = C
8. a) (x2 + 1)ey + ex = C b) Not exact
c) y tan−1 x+ xey = C d) exy cosx = C
9. a) xy2 − 3y tanx = C b) y2 = ln(x2 + 1) + C
c) y =
lnx+ C
x
d) y = Axe−1/x
e) y =
[
3
(
ln |x|+ x
2
2
+
17
2
)]1/3
10. xy2 = 2y5 + C
11. a) x/y = ln |x| + C b) sec y−xx + tan y−xx = Ax
c) 3x2y + y3 = A d) 3xy2 + x3 = A
e) sinh−1(y/x) = ln x; that is, y = 12 (x
2 − 1)
12. ln
√
x2 + y2 = − tan−1(y/x). Logarithmic or equiangular spiral, r = e−θ.
13. 2x+ C = ln
∣∣∣y − x− 1
y − x+ 1
∣∣∣ or y = x− 1 + 2
1 +De2x
14. a) y =
1
2 +Ae−t
b) y = 25/(5t− 1 + 26e−5t); ymax = 25
ln 26
when t =
ln 26
5
15. a) y2 = Ce4x − 12 x− 18 b) x ln y + x3/3 = C c) y = AeBx
2
16. a) y = Ae−2t +B + 3t b) y =
t
c
− ln(1 + ct)
c2
+ d
17. a) Let z =
dv
dx
. We obtain α z′ + (2α′ + bα)z = 0, which is first order homogeneous linear.
b) u = Ax2 +Bx3
18. a) i) y = 0, y = K
ii) y =
y0K
y0 + (K − y0)e−kt
iii) y is strictly increasing and approaches K.
iv) y = K/2; work from the differential equation.
v) The solution is the same as for ii); y is strictly decreasing (and concave upwards) with
y = K as a horizontal asymptote.
b) y = K exp {− ln(K/y0)e−αt} ; one way, let z = ln y.
19. Let y litres of pollutant be present in the lake after t days.
dy
dt
= 104 − y(t)
109
× (106 + 104), giving y = 10
9
101
(
1− e−1.01t/103
)
a) y → 109/101 litres or just under 1%
b)
105 ln 2
101
≈ 686.3 days ≈ 1.88 years
c) That there is perfect mixing, that the pollutant does not precipitate or dissolve, that the
pollutant does not itself create more pollution, that ....
20. With an inflow of 3 litres per minute and an outflow of 1 litre per minute, the volume of liquid in
the tank at time t is 50+2t litres. The inflow of salt is 3×2 grams per minute. In running off 1 litre
per minute with a concentration of x/(50+ 2t) grams of salt per litre, the rate of removal of salt is
1× x/(50+ 2t) grams per minute. So the net rate of increase of x is given by dx
dt
= 6− x/(50+ 2t)
or
dx
dt
+
1
50 + 2t
x = 6. This is a first order linear ODE. (You may find it helpful to consider the
outflow over a small time interval [t, t+∆ t].)
21. a) P (t) = 100− 25 cos t+ 25 sin t− 55e−t
b) 100
22. y = A exp
{
k(t+
a
2π
sin(2πt))
}
23. a) r(t) = 2% + (1.5%− 2%)/10t = 0.02− 0.0005t, where t is measured in years.
b) The differential equation is
dy
dt
= (0.02− 0.0005t)y
y = 107 exp(0.02t− 0.00025t2). When t = 10, y = 107 e0.175 ≈ 107 × 1.19
24. a)
dv
dt
= g − kv; with g = 9.8, k = 10. So v = Ae−kt + g/k. With v = 0 at t = 0,
v = gk (1− e−kt) = 4950 (1− e−10t)
b) g/k = 49/50
c) t ≥ ln 20
k
(≈ 0.3 seconds)
d) A > 0 and v decreases (rapidly) towards g/k.
25. a) y = Axλ
b) Graph ln y against lnx. From the graph, λ ≈ 1.54 and lnA ≈ −4.55 (so that A ≈ 0.0106).
26. a) $610701.38
b) Treating the car payments as continuous at the rate of $10400 a year,
dP
dt
= 0.2P − 10400.
Then P = 52000 + 448000e0.2t, for 0 ≤ t ≤ 1/2. At t = 1/2, P = 547116.57. The capital
remaining in the cooperative after 6 months is 47116.57 e−0.05 = 44818.67. At the end of one
year, the total is 547116.57 (capital plus new interest in Hitek) + 44818.67 (in the co-op) =
591935.24 dollars.
28.
dy1
dt
= − 0.25y1, dy2
dt
= 0.25 y1 − 2y2 with y1(0) = K, y2(0) = 0. Thus y1 = Ke−0.25t
and hence y2 =
K
7 (e
−0.25t − e−2t). The maximum value of y2 occurs for t = 127 ln 2. That is, after
about 1.188 days.
29. 20 ln 2 ≈ 13.8 m/sec
30. a) y = Ae−2x +Be−x b) y = e−x(C cos 3x+D sin 3x)
c) y = Ae−3x d) y = (Ax +B)e−2x
31. a) 14 (5e
x − e5x) b) y = e−x(cos x+ sin x)
32. a) y = Ae−3x +Be−x + 13 x− 49 b) y = (Ax +B)e3x + 5e2x
c) y = e−x(A sin x+B cos x)+2 sin 2x−cos 2x d) y = (A− x/2)e−x +Bex
e) y = A cos 2x+B sin 2x− 14 x cos 2x f) y = Ae−3x + 2e2x
g) y = 13 (sin x+ sin 2x) h) y = e
x + e4x − e2x
33. a) Ae−x +Be5x/2; seek yP = x(ax + b)e5x/2
b) Ae4x +Be−6x; seek yP = e4x(a cos 6x+ b sin 6x)
c) (Ax+B)e− 3x; seek yP = x2 ae−3x
34. y = 12x
4e2x.
35. You should obtain
d2y
dt2
− 5 dy
dt
+ 6y = e5t. y = Ax3 +Bx2 +
1
6
x5
37. 502g/π grams; about 780 kilograms
38. a) x(t) = cos 2t, so the block oscillates with fixed amplitude.
b) If c = 2 then x(t) = e−t(cos
√
3t+ 1√
3
sin
√
3t), so the system has damped oscillations.
If c = 5 then x(t) = 13 (e
−t − e−4t), so the system does not oscillate.
c) If the characteristic equation has real roots then the solution has no oscillating terms. This
happens whenever c ≥ 4. So the smallest value of c is 4.
39. a) q(t) = A cos 100t+B sin 100t
b) $x_P = a\cos\Omega t + b\sin\Omega t$ if $\Omega \ne 100$; $x_P = t(a\cos\Omega t + b\sin\Omega t)$ if $\Omega = 100$.
c) 50/π (which corresponds to when Ω = 100).
40. a) y = Ae−t +Be−2t + 2 sin t− 6 cos t
b) y = −6 cos t+ 2 sin t
41. b) yp = − 14 t cos t
42. b) No
c) µ = nπ/L, where n = 1, 2, 3, . . .. The corresponding solutions are yn(x) = Bn sin(nπx/L).
43. b) 2λ = 1 + k
2pi2
L2 , k = 1, 2, 3, . . . with yk = Bke
−x sin (kπx/L)
44. a)
x2
a
+
y2
b
= C, a family of ellipses.
b)
d2y
dx2
+ aby = 0, y = A cos ωt+B sin ωt, where ω2 = ab
and x = (Aω/b) sin ωt− (Bω/b) cos ωt.
c) ω = 1.6, A = 3.2, B = 2.4
x = 1.6 sin ωt− 1.2 cos ωt, y = 3.2 cos ωt+ 2.4 sin ωt or
x = −2 cos(ωt+ φ) = 2 sin(ωt+ φ− π/2) and y = 4 sin(ωt+ φ)
where φ = sin−1
4
5
, and
x2
4
+
y2
16
= 1. The peaks in the predator population lag behind the
peaks in the prey population by a quarter of the period.
45. Care is required if the characteristic equation of Lu = 0 has a double root. Otherwise a basis for
ker(L2) is {u1, xu1, u2, xu2}.
The two dimensional kernel of L is a subspace of the four dimensional kernel of L2.
Chapter 4
1. a) 1 + x+
x2
2!
+
x3
3!
+ · · ·+ x
9
9!
b) x− x
3
3!
+
x5
5!
− x
7
7!
+
x9
9!
c) x+
x3
3!
+
x5
5!
+
x7
7!
+
x9
9!
2. p2m+1(x) =
m∑
k=0
(−1)kx2k+1
(2k + 1)!
3. a) 2 + 14 (x − 4)− 164 (x− 4)2 + 1512 (x− 4)3
b)
1√
2
− 1√
2
(
x− π
4
)
− 1
2
√
2
(
x− π
4
)2
+
1
6
√
2
(
x− π
4
)3
+
1
24
√
2
(
x− π
4
)4
4. a) 3 + 3(x− 1)
b) 3 + 3(x− 1) + (x − 1)2
c) 7 + 5(x− 2) + (x − 2)2
5. a) p1(x) = x and R2(x) = − x
2
2(1 + c)2
for some c between 0 and x.
7. a) pn(x) = 1− x
2
2!
+
x4
4!
− x
6
6!
+ · · ·+ (−1)k x
n
n!
where n = 2k;
Rn+1(x) =
(−1)k+1 sin c
(n+ 1)!
xn+1 for some c between 0 and x.
c) |Rn+1(x)| < 1
10n+2(n+ 1)!
d) n = 4
e) The estimate sinx ≤ 1 (whenever x ∈ R) is less crude in the case when x ≥ 1.
f) n = 14
g) Rearranging the inequality |error| ≤ a
11
11!
< 10−6 gives a < 11
√
11!10−6. So a less than 1.398
will do.
10. a) Horizontal point of inflexion at 1; local maximum at −2.
b) Horizontal point of inflexion at −1; local minima at 2 and 3.
12. a) 1/2 b) 0 c) boundedly divergent d) 1
e) 0 f) diverges to ∞ g) a
13. a) 1/2 b) N = 2/ǫ works
14. a) e4 b) 1/e
15. e) (1 +
√
5)/2
16. a) 1/2, 0 b) 1/2, −1/2 c) √2, −√2
d) 1/2 + sin 2, −1/5 + sin 5 e) ∞, π f) 1+π/2, 1−π/2
17. b) 2
19. a) iii) 1 b) iii) 3/4
24. diverges
25. a) divergent b) divergent c) convergent
26. a) convergent b) divergent c) divergent
d) divergent e) divergent f) convergent
27. a) convergent b) convergent c) convergent d) divergent
28. a) conditionally convergent
b) divergent (by the kth term test)
c) absolutely convergent
d) diverges to ∞; (−1)
k
√
k + (−1)k =
(−1)k√k
k − 1 −
1
k − 1
29. b) s4 = 9677/16380≈ 0.59078; |error| ≤ 1/126
c) n equal to 99 will do.
30. a) convergent b) divergent c) convergent
d) convergent e) divergent f) convergent
31. Only e) and f) are divergent.
32. Converges if lim
k→∞
ak > 1 and diverges otherwise.
34. a)
∞∑
k=0
xk
k!
b)
∞∑
k=0
(−1)kx2k+1
(2k + 1)!
c)
∞∑
k=0
x2k+1
(2k + 1)!
d)
∞∑
k=1
(−1)k+1 x
k
k
e) −
∞∑
k=0
xk
35. a) ex = 1 + x+
x2
2!
· · ·+ x
n
n!
+
ecnxn+1
(n+ 1)!
, for some cn between 0 and x.
36. a) Rn+1(x) =
sinh cn
(n+ 1)!
xn+1 if n is odd and Rn+1(x) =
cosh cn
(n+ 1)!
xn+1 if n is even. In each case,
cn lies between 0 and x.
37. a) Rn+1(x) =
(−1)nxn+1
(n+ 1)(1 + cn)n+1
for some cn between 0 and x.
39. a) $f'(x) = 2e^{-1/x^2}/x^3$ if $x \ne 0$; $f'(0) = 0$.
b) $f''(x) = 4e^{-1/x^2}/x^6 - 6e^{-1/x^2}/x^4$ if $x \ne 0$; $f''(0) = 0$.
e) The Maclaurin series is 0 + 0x+ 0x2 + 0x3 + · · · , and hence converges everywhere.
It only converges to f at 0.
40. b) ln 2
f) 12 ln 2
41. Students studying MATH1231 only need to state the corresponding open interval in each case.
a) (−6, 6) b) [−1, 1] c) (−2, 2)
d) [1, 3] e) [1/3, 1) f) (−3, 3]
g) (−∞,∞)
43. a)
∞∑
k=0
(k + 1)xk
k!
b)
∞∑
k=0
(−1)kx4k+2
(2k + 1)!
c) Differentiate 34(e):
∞∑
k=1
kxk−1
44. All series are valid on (−1, 1).
a)
∞∑
k=0
(−1)kxk b)
∞∑
k=0
(−1)kx2k c)
∞∑
k=0
(−1)k x
2k+1
2k + 1
45. a) f(x) = x5 − x
11
3!
+
x17
5!
− x
23
7!
+ · · · =
∞∑
k=1
(−1)k+1x6k−1
(2k − 1)!
c) A horizontal point of inflexion.
46. R = 1, ln 2 ≈ 5681 .
47. a) A local maximum
b) Si(x) = x− x
3
3!3
+
x5
5!5
− x
7
7!7
+ · · ·
c) Si(π) ≈ π − pi33!3 + pi
5
5!5 − pi
7
7!7 ≈ 1.84
48. a) $(-1, 1)$ b) $f(x) = \dfrac{x}{(1-x)^2}$
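49. a) $\sum_{k=1}^{\infty} \dfrac{x^k}{k!}$
b) $y' = \sum_{k=1}^{\infty} k a_k x^{k-1}$, $y'' = \sum_{k=2}^{\infty} k(k-1) a_k x^{k-2}$
c) $a_k = \dfrac{1}{k^2\,k!}$ whenever $k \ge 1$ d) $y = \sum_{k=1}^{\infty} \dfrac{x^k}{k^2\,k!}$ whenever $x \in \mathbb{R}$.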
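The numerical value quoted in answer 47(c) can be checked by summing the first four terms of the series in 47(b). The following is only a minimal sketch: the helper name `si_partial_sum` is a hypothetical choice made for this illustration, and Python is used here for convenience rather than because the Notes prescribe it.

```python
import math


def si_partial_sum(x: float, terms: int) -> float:
    """Sum the first `terms` terms of x - x^3/(3!*3) + x^5/(5!*5) - ...

    This is the series from answer 47(b); the function name is
    hypothetical and chosen only for this sketch.
    """
    total = 0.0
    for k in range(terms):
        n = 2 * k + 1  # odd exponents 1, 3, 5, 7, ...
        total += (-1) ** k * x ** n / (math.factorial(n) * n)
    return total


# Four terms, as in answer 47(c).
approx = si_partial_sum(math.pi, 4)
print(round(approx, 2))
```

Increasing `terms` shows how quickly the partial sums settle, which is one way to get a feel for the size of the truncation error.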
Chapter 5
1. 8 sinh(1/2) ≈ 4.17 metres
2. $27\tfrac{2}{9}$ °C
3. a) $(13^{3/2} - 8)/27$ b) 8 c) $\tfrac{8}{27}(10\sqrt{10} - 1)$
4. 6a
5. $\sqrt{2}\,(e^{2\pi} - 1)$
6. 8
7. a) 9 seconds
b) $10\sqrt{41} \approx 64.03$ metres per second
c) 90 metres
d) $80\sqrt{2} - 80\ln(\sqrt{2} - 1) + 25\sqrt{41} + 160\ln 2 - 80\ln(\sqrt{41} - 5) \approx 427.53$ metres
8. a) $v(t) = \dfrac{\pi^2}{2}\,|\sin(\pi t)|$
b) (i) $t = \tfrac{1}{2} + k$, where $k$ is a positive integer. (ii) $t = n$, where $n$ is a positive integer.
c) A semicircle of centre (0, 0) and radius 1 in the upper half-plane.
d) π
e) 3π
10. a) $\tfrac{\pi}{27}(145^{3/2} - 1)$ b) $64\pi/3$
13. 32π/5
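The decimal approximations quoted in answers 1, 7(b) and 7(d) of Chapter 5 can be reproduced directly from their closed forms. This is a minimal sketch; the variable names are hypothetical labels for the answers, not notation from the Notes.

```python
import math

# Answer 1: 8 sinh(1/2), in metres.
answer_1 = 8 * math.sinh(0.5)

# Answer 7(b): 10 * sqrt(41), in metres per second.
answer_7b = 10 * math.sqrt(41)

# Answer 7(d): the closed form quoted in the answers, in metres.
answer_7d = (
    80 * math.sqrt(2)
    - 80 * math.log(math.sqrt(2) - 1)
    + 25 * math.sqrt(41)
    + 160 * math.log(2)
    - 80 * math.log(math.sqrt(41) - 5)
)

for label, value in [("1", answer_1), ("7(b)", answer_7b), ("7(d)", answer_7d)]:
    print(f"Answer {label}: {value:.2f}")
```

Evaluating a closed form numerically like this is a quick sanity check on a hand computation, although it cannot confirm that the closed form itself was derived correctly.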
Index
absolutely convergent, 136
alternating series, 133
alternating series test, 133
answers, 191
arc length, 174
    of a catenary, 176
    of a cycloid, 174
    of a polar curve, 177
    of the graph of a function, 176
Archimedes, 101
area of the surface of revolution, 182
average height of a suspended cable, 170
average value of f , 169
bounded monotonic sequence, 120
chain rule(s), 16, 17, 19, 20
chain diagram(s), 17, 20
characteristic equation, 78
classifying stationary points, 112
    second derivative test, 112
closed form, 147
comparison test, 128
conditionally convergent, 137
contour, 1
contour lines, 1
convergence and divergence of p-series, 129
convergent sequence, 115
decreasing sequence, 120
differential approximation to ∆f , 13
differential equations, 55
    applications of, 71
    exact equations, 66
    explicit solution, 57
    general solution, 56
    implicit solution, 57
    initial value problem, 58
    integrating factor, 63
    order, 56
    ordinary, 56
    particular solution, 56, 83
    second order, 76
    separable equations, 60
    solution, 56
differential form, 66
divergent sequence, 115
diverges to negative infinity, 115
e, 157
    is irrational, 157
exact differential equations, 66
first order linear differential equations, 62
frustum, 179
Gabriel’s horn, 185
general solution, 56
greatest lower bound, 121
Gregory, James, 101
growth of sequences, 120
harmonic series, 124
increasing, 120
increment, 14
Indian mathematicians, 101
infima, 122
infimum, 122
infinite series, 122
    absolutely convergent, 136
    alternating series, 133
    comparison test, 128
    conditionally convergent, 137
    converges to L, 123
    diverges, 123
    integral test, 127
    kth term test for divergence, 126
    nth partial sum, 123
    p-series, 129
    ratio test, 131
    summable, 123
    tail of the series, 125
initial conditions, 58
initial value problem, 58
integral test, 127
integrating factor, 63
integration
    powers of sin and cos, 25
    by hyperbolic substitutions, 35
    by trigonometric substitutions, 35
    multiple angles of sin and cos, 28
    of rational functions, 37
    of trigonometric functions, 25
    powers of tan and sec, 29
    reduction formula, 30
irreducible, 38
Lambert, Johann Heinrich, 32
least upper bound, 121, 122
least upper bound axiom, 121
Leibniz, 133
length of a curve, 172
level curve, 1
limit inferior, 158
limit of sequence, 116
limit superior, 158
linear combination, 77
logistic curve, 76
lower bound, 121
Maclaurin series for f , 138
Malthus, Thomas, 73
mathematical model, 71
mean value theorem for integrals, 156, 171
Millennium problems, 76
monotonic, 120
Navier–Stokes equations, 76
Newton’s law of cooling, 61
non-homogeneous differential equations, 80, 83
nondecreasing, 120
nonincreasing, 120
normal vector, 10, 11
open interval of convergence, 144
order of growth of sequences, 120
π, 32, 101
    is irrational, 34
painter’s paradox, 186
partial derivative(s), 7, 19
    mixed derivative theorem, 9
    of F with respect to x, 5
    of F with respect to y, 5
    second order, 8
particular solution, 56, 83
pinching theorem for sequences, 119
population growth, 73
power series, 148
    closed form of, 147
    differentiable, 148
    in powers of x, 142
    in powers of x − a, 142
    integrable, 148
    open interval of convergence, 144
    radius of convergence, 144
radius of convergence, 144
ratio test, 131
rational function(s), 37
    distinct linear factors, 40
    improper, 37
    irreducible, 37
    irreducible quadratic factor, 41
    partial fractions decomposition, 38, 40
    proper, 37
    repeated irreducible quadratic factor, 42
    repeated linear factor, 40
rearrangement of a series, 137
reduction formula, 30
remainder term, 109
    Lagrange formula, 110, 156
resonance, 84
Riemann sum, 169
right circular cone, 179
second derivative test, 112
second order linear differential equations
    characteristic equation, 78
    homogeneous equations, 77
    linearly independent solutions, 77
    non-homogeneous equations, 80
    particular solution, 56, 83
    resonance, 84
    with constant coefficients, 76
separable differential equations, 60
sequence(s), 114
    bounded monotonic, 120
    boundedly divergent, 115
    convergent, 115
    decreasing, 120
    definition of limit, 116
    divergent, 115
    diverges to −∞, 115
    diverges to ∞, 115
    greatest lower bound, 121
    infimum, 122
    least upper bound, 121
    lower bound, 121
    nth term of, 114
    order of growth, 120
    pinching theorem, 119
    supremum, 122
    unboundedly divergent, 115
    upper bound, 121
series, see infinite series
simple curve, 179
sine integral Si(x), 164
speed v(t) of, 178
summable, 123
suprema, 122
supremum, 122
surface of revolution, 180
suspended cable, 167
tail of the series, 125
tangent plane(s), 10, 11, 20
Taylor polynomial
    about a, 105
    for f about a, 106
    of degree n, 104
Taylor series
    converges on I, 139
    converges to f on I, 139
    diverges on I, 139
    for f about a, 138
    function represented by, 140
Taylor’s theorem, 109
    remainder term, 109
        Lagrange formula, 110, 156
Taylor, Brook, 101
Torricelli’s trumpet, 185
Torricelli, Evangelista, 185
total differential, 13
    approximation ∆F, 20
    approximation to ∆F, 14
    use with measurement errors, 14
upper bound, 121
Verhulst, Pierre, 75
Wallis’ product, 50
