程序辅导案例 > Program >

程序代写案例-MTH2140

欢迎使用51辅导，51作业君孵化低价透明的学长辅导平台，服务保持优质，平均费用压低50%以上！ 51fudao.top

Real Analysis (MTH2140 & MTH3140)
Lecture notes
Je´roˆme Droniou
School of Mathematical Sciences, Monash University
[email protected]
Date: 15/02/2021
1
How to use these notes
• These notes are a skeleton on which lectures will be built. They are not meant to be used in
isolation from the lectures.
• The reference textbook on which these notes are based is: S. Abbott, Understanding Analysis,
Springer, 2001. Some sections in these notes are not covered by this textbook.
• Dark red texts are clickable and send you to the appropriate section/result/equation (some pdf
readers also underline these hyperlinks). For example this one, which refers to a very old result on
numbers.
• Appendix A gives some elements on proofs in mathematics. From time to time, we will be doing
some incursions in this appendix, but its reading is highly recommended (euphemism) right from
the start of the semester.
• The appendices marked ♦NE♦ are not examinable. Reading them, if only briefly, can however
strengthen your intuition on the strange things that can happen in Real Analysis.
As a general rule, all boxed texts present essential concepts.
Definition 0.1 (A simple definition)
Definitions are presented in these yellow-green-ish boxes, with title on white background. They are
the “building blocks” of mathematics, nothing can be done without understanding the definition.
They simply explain what a word means in mathematics. Definitions are not proved. If you know
a bit about programming, a definition is just a macro in the mathematical world: we gather under
a single word or expression a number of properties. Whenever we write the word/expression, it
means that all these properties hold true for the object referred to by the word/expression.
An essential thing to understand: contrary to common language, in mathematics, a word only
means what is written in a definition; nothing more, nothing less.
Along with worked examples and solutions, Definitions are what you cannot include in the sheet
of notes you are allowed to bring to the exam (see the unit guide).
Theorem 0.2 (A famous theorem)
Theorems, propositions, lemma, corollary... are mathematical statements that say that if some-
thing – the assumptions (or hypotheses) – is true, then something else – the conclusions – is
also true.
Theorems, etc. are results that must be proved, contrary to definitions. They are presented in
purple boxes such as this one, with titles also on coloured background.
It is essential to be able to distinguish assumptions from conclusions. Why? Because when proving
a theorem, etc., you do not try to prove the hypotheses; you assume they are satisfied and you try,
from there, to get to the conclusions. Also, any proof that starts by assuming that the conclusion
holds is necessarily incorrect (it cannot be a proof of the considered theorem, etc.).
Proof, skeleton of proof, hint for the proof
In greyed square boxes you’ll find proofs or skeletons of proof of theorems/lemma/propositions/etc.
To understand what “proving” means, in mathematics, ponder the following statement from Un-
derstanding Analysis by S. Abbott: [A proof] is a set of carefully crafted directions, which, when
followed, should leave the reader absolutely convinced of the truth of the proposition in question. To
achieve this, the steps in a proof must follow logically from previous steps or be justified
by some other agreed-upon set of facts.
In the boxes:
• “Proofs” are complete reasoning, with all details, that are commonly considered as convinc-
ing enough.
• “Skeletons of proof” only give the main arguments, the main steps, and need to be fleshed
2
out with some details to become proofs. The missing details are usually made explicit by
short expressions like “why?” or “do it!”.
Understanding skeletons of proofs is essential. Perhaps even more than understanding com-
plete proofs, since skeletons contain the main ideas, and the missing bits are mostly technical
details.
• “Hints for the proof” are merely nudges in the right direction. They might indicate which
other theorem can be used for the proof, but not give you a clear path to follow.
Complete proofs are not allowed in the sheet you can bring to the exam. Skeletons of proofs (or
hints) are allowed.
Solution, skeletons of solution, hint for the solution
Solutions (or skeletons thereof, or hints for solutions) of exercises are given in square boxes. All
the comments regarding proofs/skeletons or proof/hints for the proof apply to solutions/skeletons
of solution/hints for the solution, including the fact that complete solutions (contrary to skeletons
or hints) are not allowed on the sheet you can design for the exam.
Note 0.3
In green boxes with angular corners, you will find general notes. Most of these notes describe
classical techniques that can be applied to bring solutions to particular kinds of questions. They
are therefore quintessential to your learning.
Even though colours are used in the previous boxes, the layout (form of the box, title with or without
background colour, etc.) also makes them easily distinguishable in a black & white print of these notes.
Example 0.4
Examples and counter-examples are presented this way, with an additional indentation. They are of
utmost importance in mathematics, as they help shape your understanding of a concept. You should
seek to develop your own examples, beyond those in these notes.
Exercise 0.5
Exercises are presented in a similar way as examples. They are obviously here for practising the
notions in the notes, and complement the examples. The solutions to some of them will be covered
during the lectures.
3
Contents
1 Real numbers 8
1.1 Irrationality of
√
2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2 Bounds, supremum, maximum, etc. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Properties of R, Axiom of Completeness (AoC) . . . . . . . . . . . . . . . . . . . . . . . . 12
2 Sequences of real numbers 14
2.1 Basic notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.2 Limits of sequences: definition and properties . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.3 Indeterminate limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.4 Compactness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5 Monotone convergence theorem (MCT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.6 Cauchy sequences and completeness of R . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3 Series of real numbers 24
3.1 Series with positive terms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2 Convergence tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
4 Functions 28
4.1 What is a function, exactly? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.2 Limits of functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
4.3 Continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.3.1 Definition and basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
4.3.2 Types of discontinuities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
4.4 Contraction mapping theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
4.5 Continuous functions on compact sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
4.6 Intermediate value theorem (IVT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5 Derivatives 38
5.1 Definition and basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.2 Mean value theorem (MVT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.3 Fundamental theorem of calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
5.4 Taylor expansions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.5 Are derivatives continuous? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6 Sequences and series of functions 47
6.1 Polynomial solutions to differential equations... and beyond . . . . . . . . . . . . . . . . . 47
6.2 Sequences of functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
6.2.1 Pointwise convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
6.2.2 Uniform convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
6.3 Differentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
6.4 Series of functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
6.5 Power series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.5.1 Taylor series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
A Appendix: elements of mathematical proof 63
A.1 Distinguishing hypotheses from conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . 63
A.2 Table of truth of logical/mathematical statements . . . . . . . . . . . . . . . . . . . . . . 63
A.3 Principles of a proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
A.4 Some classical techniques of proof . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
A.5 What about intuition? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
B Appendix: fixed points of Markov chains ♦NE♦ 67
C Appendix: continuous nowhere differentiable? ♦NE♦ 69
D Appendix: cantor function ♦NE♦ 72
E Appendix: continuity points of pointwise limits ♦NE♦ 73
4
Approximate schedule
Week Topics Assignment due
1 Real numbers: notations, bounds, supremum Diagnostic test (optional)
2 Real numbers: AoC
Sequences: basic notions, limit
Exercise solution
3 Sequences: SLT, ALT, OLT, squeeze, compactness Quiz
4 Sequences: MCT, Cauchy
Series: comparison test, n-th term test, Cauchy, absolute cv test
Exercise solution
5 Series: ratio test, AST
Functions: limits, sequential characterisation, continuity
Quiz
Easter break
6 Functions: contraction mapping theorem, continuity and compact
sets, IVT
Assignment 1
7 Derivatives: properties, IET, extrema Quiz
8 Derivatives: MVT, fundamental theorem of calculus Quiz
9 Derivatives: Taylor expansions.
Sequences of functions: polynomial solutions to ODEs, pointwise
convergence
Exercise solution
10 Sequences of functions: uniform convergence, uniform Cauchy cri-
terion.
Series of functions: M-test
Quiz
11 Power series: definition, convergence, differentiability
12 Power series: Taylor series. Revisions Assignment 2 & oral
Assessment
All assignments can be found, organised by week of due date, in the “Assignments” section of Moodle.
Submissions are all online, through Moodle. The combined continuous assessment counts for 40%
of the final grade.
Diagnostic test: This is a non-mandatory, but highly recommended, test to be taken in Week 1. It is
designed to help you self-assess if your have a clear understanding of the background material (covered
in MTH1030) for this unit. It is self-marked and solutions with marking schemes are provided after you
take the quiz, but if you have questions regarding some of your answers you can ask your tutor in the
first applied class in Week 2. It does not count towards the final unit grade.
Exercise solution:
• Before the start of each of your tutorial, you must submit a solution for one exercise of that week’s
problem set (see Moodle for which specific exercise).
• Your tutor will go over your solution, and provide feedback and an indicative mark.
• The mark does not count towards the unit’s final grade (but see “Applied Classes” below). The
purpose of these works is to provide you clear feedback on your performance in the unit without
putting pressure on you with a grade that would impact your final result.
Applied classes:
• Attendance and active participation in applied classes count for 10% of the final grade.
• Active participation means:
– Submission of all 3 exercise solutions in Weeks 2, 4 and 9 (online and before your tutorial
time). Submitting all three solutions, even if they are not correct (but correspond to genuine
attempts), grants you 5%.
– Engaging in the applied classes: participating in the discussions in your group, proposing
solutions, discussing solutions proposed by peers, etc. Regularly engaging in the applied class
work grants you 5%.
5
Quizzes:
• There are multiple choice questions quizzes on Moodle.
• Each one will be available on the corresponding week, from Monday 8am to Tuesday 8am. You
won’t be able to attempt the quiz after this closing time. These quizzes also serve as preparatory
work for this class.
• Once you start a quiz, you have a limited amount of time to complete it. This time will be indicated
for each quiz on Moodle.
• Each one is worth 1% of the final grade.
Assignments:
• These are home assignments.
• The due date of each assignment is Tuesday 12noon of the corresponding week.
• Assignment 1 is worth 10% of the final grade, Assignment 2 and its oral component are worth
15% .
• Criteria for marking assignments include the following:
– Completeness of the solutions, mathematical correctness and precision of the arguments.
– Usage of words and clear but concise explanations of what you are doing.
– Proper usage of the rules/techniques of proof covered in the first lectures (check the assump-
tions of the theorems you utilise, etc.).
• Assignment 2 is a group assignment.
– Each group should (ideally) comprise 3 students.
– It is your responsibility to create or join a group, before the end of Week 8, by getting in touch
with your peers during the applied classes or through the Moodle social forum. The process
to record the groups will be indicated at a later stage on Moodle.
– Each group only submits one script, which will be marked by your tutor.
– There is an oral component to this assignment, which takes about 15’ per group. During this
oral component, each member of the group will be asked to explain how the group came up
with one particular solution in the assignment. The solution to justify will no be known in
advance and will be different for each member of the group, who will have to respond without
any help from the other members.
The idea is that even though you can decide to split the assignment questions between group
members, each member should have a clear understanding of all the solutions produced by
the group (so if you split the work, make sure you have a group discussion before submitting
so that all members understand all solutions).
The tutor will then allocate to each oral response a quality A, B or C. This quality will affect
the final mark each student will receive (and which might therefore be different from the marks
of other students in the group). If M is the mark awarded to the submitted script, the formula
to determine each student’s final mark for this assessment is:
Quality of oral Final mark
A M
B 0.8M
C 0.6M
General instructions for assessments
1. The types of questions and their specificities are as follows:
6
• Multiple choice questions: only one answer is correct and attracts the indicated marks;
the other answers attract negative marks equal to one-third of the indicated marks (so
responding at random on all such questions will attract on average zero marks).
When an answer (e.g. true/false) identifies some theorems/definition as justification, only
the answer which identifies the proper theorems/definitions is correct.
• Short answer questions: you can type your response using plain sentences and the nota-
tions below (any other notation that is clear enough is also valid). You don’t have to
show all your algebraic workings for these solutions, just the main steps of
the arguments (e.g., in a series of equalities or inequalities, you can skip the simple
intermediate steps). The response should typically be between 3 and 10 lines.
Possible notations:
– Underscore ‘ ’ to denote indices (such as (u_n)_n to denote the sequence (un)n), the
hat ˆ for exponents (e.g. eˆx for ex).
– Simple mathematical expressions can be typed without any particular symbol, such
as sup(A), 3x+2, etc.
– Set of real numbers, use R. Set of natural numbers, use N, or NN if you want to
distinguish it from an integer N.
– Convergence arrow (as in un → `) is ->.
– For ε, write eps.
– For sums, just write Sum (or Sum_n if you need to indicate the index), in lieu of
∑
(or
∑
n).
– Inequalities: write < and > for strict inequalities, and <=, => for large inequalities. Do
not use these symbols for implications, just write “implies”.
– To better separate numerator from denominator in fractions, you can use square
brackets: for example, write [5x+4]/[2x-1] for 5x+42x−1 .
• Paper questions: respond on sheets of paper, that you submit with your exam following
the process described. For this question, proper notations, full explanations, complete
sentences and details of computations are expected, as in any in-semester assignment.
2. In the questions, the sets of real numbers and natural numbers are denoted using bold face:
N, R. Unless otherwise specified, all sets are subsets of R.
3. Unless otherwise specified, you can use all results seen in the lectures. Do not forget to mention
theorems (and check assumptions) when you use them. Questions with “Using only...” or “Prove
from...” must be solved by using only the definitions/results mentioned and the basic properties
of real numbers/functions.
7
1 Real numbers
Let us recall some basic sets...
• N = set of all natural numbers = {1, 2, . . .}. Note that some authors use N = {0, 1, 2, . . .}.
• Z = set of all integers = {. . .− 2,−1, 0, 1, 2, . . .}.
• Q = set of all rational numbers = {pq , p ∈ Z , q ∈ N}.
• R = set of all real numbers, extension of Q by “filling in the gaps” (cf. Theorem 1.5).
... and set-related notations.
• ∅ = empty set.
• x ∈ A: x is an element of A.
• A ⊂ B: A is a subset of B (may be equal to B).
• A ∩B = intersection of A and B.
• A ∪B = union of A and B.
• Ac = complement of A = R\A.
A number x is positive if x ≥ 0, and strictly positive if x > 0; reversing the inequalities give the definition
of negative and strictly negative numbers, respectively. A warning here: the Abbott textbook is not
so clear on the usage of ‘positive’; you will find it sometimes uses ‘positive’ for ‘x > 0’, and sometimes
for ‘x ≥ 0’.
Unless otherwise specified, in this entire unit all sets are subsets of R and all numbers are real numbers.
Technique 1.1 (How to prove set inclusions and equalities?)
Consider two sets A and B. The most efficient and foolproof way to prove that A ⊂ B is the
following:
1. pick a generic element x ∈ A,
2. translate the fact that it belongs to A (using the definition and what you know of A, and
nothing else),
3. use this to show that it belongs to B (using the definition and what you know on B).
To prove that A = B, simply prove that A ⊂ B and B ⊂ A.
Exercise 1.2
Prove that {1/n : n ∈ N} ⊂ (0, 1].
Solution
Take x ∈ A = {1/n : n ∈ N}. By definition, this means that there is some n ∈ N such that
x = 1/n. Since 1 ≤ n, the rules of inequalities and algebra show that 1 = 1/1 ≥ 1/n = x.
Hence, x ≤ 1. Moreover, as the reciprocal of a strictly positive number, x itself is strictly
positive. Hence, 0 < x ≤ 1, which precisely means that x ∈ (0, 1]. We have therefore proved
the required inclusion.
Exercise 1.3
Prove that ∪n∈N[1/n, 1) = (0, 1).
8
Solution
Let us first prove “⊂”. Take x ∈ ∪n∈N[1/n, 1). By definition of an union, this means that
there exists n ∈ N such that x ∈ [1/n, 1), that is 1/n ≤ x < 1. Since 1/n > 0, this shows that
0 < x < 1, that is x ∈ (0, 1). The proof of ∪n∈N[1/n, 1) ⊂ (0, 1) is complete.
We now prove “⊃”. Take x ∈ (0, 1). Since x > 0, we can find k ∈ N such that x ≥ 1/k (this
very natural property is actually not so trivial when we start from an axiomatic definition
of R, see the Archimedean property). Then, 1/k ≤ x < 1, which shows that x ∈ [1/k, 1) for
that k, and thus, by definition of the union, x ∈ ∪n∈N[1/n, 1). Hence, (0, 1) ⊂ ∪n∈N[1/n, 1)
as required.
Intervals will also be of particular interest to us.
Definition 1.4 (Interval of R)
An interval of R is a subset I of R such that, whenever a and b are in I, any point between a and
b belongs to I.
In other words, I is an interval if for all a < b in I we have [a, b] ⊂ I. In particular, R, {a} and ∅
are intervals. Intervals of R have the following form, with u, v real numbers or ±∞: [u, v], (u, v) (open
interval), [u, v), and (u, v]. Intervals of the form [u, v] (with u, v real numbers), [u,∞), (−∞, u] and
R = (−∞,∞) are said to be closed.
1.1 Irrationality of
√
2
Theorem 1.5 (
√
2 is not rational)
There is no rational number whose square is 2.
Where are the hypotheses? The conclusions? We see in Abbott’s book that this theorem is proved by
re-wording it as “if z is a rational number then z2 6= 2”. Under this form, hypotheses and conclusions are
easy to decipher
• Hypothesis: z is a rational number.
• Conclusion: z2 6= 2.
Skeleton of proof of the theorem
By way of contradiction. Assume that z = p/q, with p, q without common factor, and z2 = 2.
Then 2q2 = p2, so p2 is even, so p is even (why?). Hence p = 2p′, so 2q2 = p2 = 4p′2 shows that
q2 is even (why?), and thus q is even. p and q both even, that is divisible by 2, is a contradiction
with the fact that they do not have any common factor.
1.2 Bounds, supremum, maximum, etc.
Definition 1.6 (Upper bound and lower bound)
An upper bound of a set A ⊂ R is x ∈ R such that, for any a ∈ A, x ≥ a (in other words, x is
greater than any element in A). A set is bounded above if it has an upper bound.
Similarly, a lower bound of A is a real which is lower than any element in A. If A has a lower
bound, we say that it is bounded below.
A set is bounded if it is bounded above and bounded below.
Exercise 1.7
If x is an upper bound of A then −x is a lower bound of −A := {b ∈ R : ∃a ∈ A , b = −a} = {−a :
a ∈ A}.
9
Technique 1.8 (Algebraic operations on inequalities)
To properly tackle questions involving inequalities (which appear in the majority of problems in
Real Analysis), you need to recall how inequalities and algebraic operations work:
• If x ≤ y and z ∈ R then x+ z ≤ y + z.
• If x ≤ y and c ≥ 0 then cx ≤ cy.
• If x ≤ y and c ≤ 0 then cx ≥ cy.
Beware of inventing incorrect rules!
Technique 1.9 (Finding an upper or a lower bound of a fraction)
We will often face the need to find upper or lower bounds of numbers of the form ab . The first
thing to deal with is the sign: make sure to bring the problem to the case where a and b are
positive. Then:
• To find an upper bound of ab , find an upper bound of a and a positive lower bound of b:
if 0 ≤ a ≤ A and 0 < β ≤ b then a
b
≤ A
β
.
• To find a lower bound of ab , find a positive lower bound of a and an upper bound of b:
if 0 ≤ α ≤ a and 0 < b ≤ B then α
B
≤ a
b
.
All these relations follow from the algebraic operations described in Technique 1.8.
Definition 1.10 (Supremum and infimum)
Let A ⊂ R be bounded above. A real number s is the supremum (or least upper bound, l.u.b. for
short) of A if
(i) s is an upper bound of A, and
(ii) if x < s then x is not an upper bound of A.
We then denote s = supA.
Similarly, the infimum (or greatest lower bound, g.l.b. for short) of a set A bounded below is a
real number m which is a lower bound of A, and such that any number strictly greater than m is
not a lower bound of A. The infimum of A, if it exists, is denoted by inf A.
Note 1.11 (Two useful characterisations of the supremum)
The following characterisations of the supremum are covered in the problem sets, but it is useful
to know them.
• s is a supremum of A if and only if
1. s is an upper bound of A,
2. for all ε > 0, there is a ∈ A such that s− ε ≤ a ≤ s.
• s is a supremum of A if and only if
1. s is an upper bound of A,
2. if z is an upper bound of A, then z ≥ s.
10
Exercise 1.12
1. If inf(A) exists then sup(−A) exists and sup(−A) = − inf(A).
2. Show that if a supremum exist, then it is unique (thus justifying the “the supremum” in the
definition).
3. Find a set A such that sup(A) 6∈ A.
Solution of 1., hints for solving 2. and 3.
1. Recall that −A = {y ∈ R : for some x ∈ A, y = −x} = {−x : x ∈ A}. Let r = inf(A).
We prove that −r is a supremum of −A by using the definition, i.e. we show that (i) −r is an
upper bound of −A and that (ii) any number lower than −r is not an upper bound of −A.
r is a lower bound of A so, for any x ∈ A, r ≤ x. Hence, by the algebraic operations on
inequalities, −r ≥ −x. By the description of −A, this shows that −r is greater than any
element of −A, that is −r is an upper bound of −A. This takes care of (i). To prove (ii), let
z < −r. Then −z > r and, since r is the g.l.b. of A, −z is not a lower bound of A. So there
is x ∈ A such that x < −z, which implies −x > z. We found an element, −x, of −A that is
greater than z, and thus z is not an upper bound of −A. This concludes (ii), and therefore
the proof that −r is a supremum of −A.
2. Hint: start by taking s and s′ two suprema of A. Translating the fact that s is a suprema,
show that s′ cannot be strictly lower than s, that is s′ ≥ s. Reverse the reasoning to obtain
s ≥ s′, and thus s = s′.
3. Hint: seek a counter example as a simple set, e.g. an interval.
Definition 1.13 (Maximum and minimum)
If A has a supremum s and if s ∈ A, then s is called the maximum of A.
Similarly, a number r is the minimum of a set A if r is an infimum of A and r belongs to A.
The set of rationals Q appears to be a nice one, without any “holes” (contrary to N and Z, for any
given x 6= y ∈ Q we can find a rational, say x+y2 , strictly between x and y), but it actually misses some
numbers... we already saw that in Theorem 1.5, and this result will also show us that subsets of Q does
not necessarily behave well will respect to supremum/infimum. In particular, some subsets in Q do not
have a supremum in Q.
Exercise 1.14
Prove that A = {q ∈ Q : q ≥ 0 and q2 < 2} is bounded above but does not have a supremum in Q.
Solution
Assume s = supA exists and is in Q. We will prove that s2 = 2, which will contradicts the
fact that s ∈ Q (cf. Theorem 1.5).
We see that (s + 1/n)2 = s2 + 2s/n + 1/n2 ≤ s2 + (2s + 1)/n. If s2 < 2 then there exists n
such that (2s+ 1)/n < 2− s2 (pick any integer n larger than (2s+ 1)/(2− s2)) and therefore
s+ 1/n ∈ A (it is rational since s is assumed to be rational), but s+ 1/n > s which violates
the fact that s = supA.
If we assume that s2 > 2 then by noticing that (s − 1/n)2 > s2 − 2s/n we can find n such
that s − 1/n > 0 and (s − 1/n)2 > 2. For any q ∈ A we then have (s − 1/n)2 > q2 which
shows, since the square root is strictly increasing on R+ (or the square function is strictly
increasing on R+), that s− 1/n > q. Hence, s− 1/n is an upper bound of A which is strictly
smaller than s, contradiction again.
Remark 1.15 A function f is increasing if, for any a ≤ b, we have f(a) ≤ f(b). A function f is strictly
increasing if, for any a < b, we have f(a) < f(b).
A function f is decreasing if, for any a ≤ b, we have f(a) ≥ f(b). A function f is strictly decreasing if,
for any a < b, we have f(a) > f(b).
11
1.3 Properties of R, Axiom of Completeness (AoC)
R is the set which fills Q and its holes. What is R? For our purpose, it is defined by its properties.
• R is a field: it is equipped with two laws, + and ×, which satisfy the usual commutativity, associativ-
ity and distributivity laws, and such that every element has an additive inverse and a multiplicative
inverse if non-zero.
• R is an ordered field, i.e. it has an order which compatible with its laws (cf. the note on algebraic
operations on inequalities in page 10).
• R satisfies the axiom of completeness.
Theorem 1.16 (Axiom of completeness (AoC))
Every non-empty subset of R that is bounded above has a supremum.
Z is not a field (although it has two laws, compatible with its order). Q is an ordered field, but it is not
complete (see Exercise 1.14). Proving the existence of such a complete order field R is quite lengthy and
technical, although not particularly complex (see, e.g., Appendix 1 of Chapter 1 in Rudin, Principles of
Mathematical Analysis).
Exercise 1.17
State and prove (using Theorem 1.16) an axiom of completeness for the infimum.
Skeleton of solution
The axiom is easy to state, by recalling that if suprema are associated with sets that are
bounded above, infima are related to sets that are bounded below: every non-empty subset
of R that is bounded below has an infimum.
For the proof, consider using Exercise 1.12.
Exercise 1.18
Find the supremum, infimum, maximum and minimum of A = {1/n : n ∈ N}, B = [0,√2] ∩ Q,
C = (−1, 3]. Prove that the numbers you intuitively find are indeed the supremum, infimum, etc. of
the corresponding sets.
Solution, only for A
Sketching a plot of A on the real line gives us a clear intuition regarding its infimum and
supremum: 0 looks to be the infimum, and 1 the supremum. Since the latter number belongs
to A, this tells us that it would actually be its maximum. 0 does not belong to A, so it cannot
be its minimum and A does not have a minimum. As a conclusion of our intuition: 1 is the
maximum (and thus supremum), 0 is the infimum but not the minimum.
Now with a rigorous proof (very easy here). Consider 1. We have to prove that (i) it is an
upper bound of A, and (ii) anything strictly below 1 is not an upper bound of A. Let x ∈ A;
there is n ∈ N such that x = 1/n. We have n ≥ 1 so, multiplying throughout by 1/n > 0,
1 ≥ 1/n = x. Hence, 1 is greater than anything in A, which concludes (i). We also notice
that 1 = 1/1 belongs to A. To prove (ii), consider z < 1. Then since 1 ∈ A, this shows that
z is not an upper bound of A, thus concluding (ii) (in passing, we see here something quite
general: if you know that some number s is an upper bound of a set A and belongs to A, then
it is necessarily the supremum, and maximum, of A).
Consider now the infimum. For any x ∈ A, there is some n ∈ N such that x = 1/n and since
n > 0, x > 0. Hence, 0 is a lower bound of A. Take now some number z > 0. We have to
show that z is not a lower bound of A. You probably already know that, since z > 0, we can
take n ∈ N large enough such that 1/n < z (take, e.g., n an integer strictly greater than 1/z).
Then 1/n ∈ A is lower than z, thus showing that z is not a lower bound of A. This concludes
the proof that 0 is the infimum of A.
12
B and C are left to you as practice exercises...
As shown by the following proposition, R can be considered as the “smallest” extension of Q which is
complete.
Proposition 1.19 (Density of Q in R)
For any real numbers x < y, there exists q ∈ Q such that x < q < y.
Proof
Let us start with a preliminary lemma.
Lemma 1.20 (Archimedean property)
For any x > 0 and any y ∈ R, there exists n ∈ N such that nx > y.
Proof of the Archimedean property
If no such n exist, then y is an upper bound of A = {nx : n ∈ N}. Let s = supA. Since
x > 0, s− x is not an upper bound of A and thus we can find n ∈ N such that s− x < nx,
that is s < (n+ 1)x, a contradiction.
Now for the proof of the density of Q in R. We want to find n ∈ N and m ∈ Z such that
x < m/n < y, that is m is an integer between nx and ny. To ensure the existence of such an
integer, we first start by selecting n such that the distance between nx and ny is greater than 1,
i.e. n such that n(y − x) > 1. This is possible by the Archimedean property since y − x > 0. We
then take the smallest possible integer m such that m > nx. This means that m > nx ≥ m − 1.
We already have m/n > x. But we also obtain (m− 1)/n ≤ x, that is m/n ≤ x+ 1/n. The choice
of n ensures that x+ 1/n < y (as n(y − x) > 1) and therefore m/n < y.
Another consequence of the completeness of R is the NIP.
Theorem 1.21 (Nested interval property (NIP))
Let In = [an, bn], for n ∈ N, be closed intervals such that I1 ⊃ I2 ⊃ I3 · · · . Then ∩n∈NIn is not
empty.
Skeleton of proof
Let A = {an : n ∈ N}. By the AoC (check the assumptions), A has a supremum s. It is then
proved that s ∈ ∩n∈NIn by checking that s ∈ In for all n ∈ N. s ≥ an by definition of A. bn is an
upper bound of A (why?), so s ≤ bn (why?). Hence an ≤ s ≤ bn so s ∈ In.
Example 1.22
We have
⋂
n∈N[−1/n, 1/n] = {0} but
⋂
n∈N(0, 1/n) = ∅. How is the latter example not a counter-
example to the NIP?
13
2 Sequences of real numbers
2.1 Basic notions
Definition 2.1 (Sequence)
A sequence u (of real numbers) is an ordered family of real numbers indexed by n: u =
(u1, u2, . . . , un, . . .) = (un)n∈N.
Equivalently, a sequence u can be considered as a function u : N 7→ R. We then denote, for n ∈ N,
un instead of u(n).
You already know the following (easy) notions: sum of two sequences and multiplication of a sequence
by a real number. We will also need the (sometimes subtle) notion of sub-sequence.
Definition 2.2 (Subsequence)
Let u be a sequence. A subsequence v of u is a sequence defined by vn = uϕ(n), where ϕ : N 7→ N
is a given strictly increasing function.
We usually do not explicitly define ϕ and sometimes use the notation v = (unk)k∈N for the subsequence.
In this notation, nk = ϕ(k) (so we must have nk+1 > nk) and we have vk = unk ; beware of the indexes,
especially if you want to use the index n for the elements of v...
Exercise 2.3
What are the functions ϕ and/or the indices (nk)k∈N for the following subsequences?
1. v = (n2)n∈N is a subsequence of u = (n)n∈N.
2. v = (un3+n)n∈N is a subsequence of (un)n∈N.
3. v = (u1, u3, u5, u7, . . .) is a subsequence of (un)n∈N.
Solution
1. We have vn = n2 and u` = `. So if we want vn = uϕ(n), we need n2 = ϕ(n), which defines
ϕ (and it is indeed strictly increasing). Equivalently, nk = ϕ(k) = k2.
2. As above (exercise for you).
3. We first need to idenfity the generic element vn. We have v1 = u1, v2 = u3, v3 = u5 so
an inductive reasoning shows that vn = u2n−1. Then, as above, this gives ϕ(n) = 2n− 1 and
thus nk = 2k − 1.
Exercise 2.4
Are v = (0, 1, 0, 1, 0, 1, . . .) and w = (0, 0, 0, . . .) subsequences of u = (0, 1, 2, 3, 4, . . .)?
Solution
No. If we were to write vn = uϕ(n), we would have to impose something like ϕ(1) = 1,
ϕ(2) = 2, ϕ(3) = 1, ϕ(4) = 2... and this function is not strictly increasing. Likewise,
imposing wn = uϕ(n) would lead to ϕ(n) = 1 for all n, which still does not define a strictly
increasing function.
The following notions should not be an issue.
Definition 2.5 (Bounded (above/below) sequence)
A sequence (un)n∈N is bounded above if the set of its values {u1, u2, . . .} is bounded above, i.e. if
there exists M ∈ R such that, for any n ∈ N, un ≤M .
14
We similarly define a sequence that is bounded below, and a bounded sequence.
An easy but interesting characterisation of a bounded sequence:
Proposition 2.6 (Characterisation of bounded sequences)
A sequence (un)n∈N is bounded if and only if (|un|)n∈N is bounded above, that is there exists
M ∈ R such that |un| ≤M for all n ∈ N.
Technique 2.7 (Absolute value and inequalities)
Beware of invented rules for dealing with absolute values in inequalities. For example, a ≤ b does
not necessarily imply |a| ≤ |b| (find a counter-example). The best to avoid making any mistake is
to remember the following things.
• |a| = a if a ≥ 0, |a| = −a if a ≤ 0.
• |a| ≤ m if and only if −m ≤ a ≤ m. In particular, −|a| ≤ a ≤ |a|.
• Triangle inequality: |a+ b| ≤ |a|+ |b|.
Proof of Proposition 2.6
Direction “⇒”: take (un)n∈N bounded. It is bounded above by some M and below by some m,
that is, m ≤ un ≤ M for all n. Hence, using the facts above, −|m| ≤ m ≤ un ≤ M ≤ |M |. Set
S = max(|M |, |m|). We have S ≥ |M | and −S ≤ −|m|, so −S ≤ un ≤ S for all n. Using again
the facts above, we infer |un| ≤ S for all n, which shows that (|un|)n∈N is bounded.
Direction “⇐”: if (|un|)n∈N is bounded by M then |un| ≤ M for all n, and thus −M ≤ un ≤ M .
So (un)n∈N is bounded below and above.
2.2 Limits of sequences: definition and properties
You probably have an intuitive knowledge of the limit of a sequence, and some (mostly algebraic) tech-
niques to compute this limit.
Exercise 2.8
Compute the limits as n→∞ of un = e1/n + 11+1/n2 and of vn = n
2+1
n2 .
“Solution”... sort of...
As n → ∞, 1/n → 0 so e1/n → e0 = 1. Similarly, 1/n2 → 0 so 1 + 1/n2 → 1 and
1/(1 + 1/n2)→ 1. Hence, un → 2.
For (vn)n∈N, we write vn = 1 + 1/n2 so, as above, vn → 1.
The arguments above are not incorrect, but lack some justifications... at this stage, we would actually not
be able to give a rigorous justification for any of the deductions above. The issue is that we only relied
on an intuitive definition of the limit. But what is a limit exactly? When can we say that a sequence has
a limit?
Exercise 2.9
1. Prove that if (un)n∈N has limit ` as n→∞ then v = (u2n − un + 1)n∈N converges to `2 − `+ 1.
2. Prove that if (un)n∈N has limit ` as n→∞ and if u2n+1 = un for all n ∈ N, then ` = 0 or ` = 1.
It is impossible to tackle these questions without a proper definition of “limit”, and possibly some prop-
erties of the limit. We could say “since un → ` then u2n → `2”, but why is that true?
So, let’s give a proper definition of the limit, in order to be able to do something with this concept.
15
Definition 2.10 (Limit of a sequence)
A sequence (un)n∈N has limit ` ∈ R as n→∞ if:
for any ε > 0, there exists a rank Nε ∈ N such that, for any n ≥ Nε, |un − `| ≤ ε.
In that case, we also say “converges to ` as n → ∞” and we write “limn→∞ un = `” or “un → `
as n→∞”.
If a sequence does not converge to any real number, then we say that it diverges.
Technique 2.11 (How to prove from the definition that a sequence converges?)
Say you have the intuition that un → ` and you want to prove it from the definition. You have to
take an arbitrary ε > 0 and provide a natural number Nε such that, whenever n ≥ Nε, |un−`| ≤ ε.
How can you find Nε? Here is a way.
• Write explicitly |un − `|, and try to find an nice simple upper bound for this quantity in
terms of n, which is small when n is large.
For example, |un − `| ≤ C/n for some fixed constant C not depending on n.
• Impose that n must be such that this upper bound is less that ε.
Following the previous example, this would be C/n ≤ ε.
• Do some algebraic manipulations on the inequality thus obtained to re-write it under the
form “n ≥M” (where M does not depend on n).
Here, this would be: C/n ≤ ε if C ≤ nε, that is C/ε ≤ n.
• As Nε, take any natural number that is greater than M . This ensures that whenever n ≥ Nε,
n ≥M , and thus |un − `| ≤ ε.
Exercise 2.12
Prove that 1n → 0 as n→∞, and that n+3n+1 → 1 as n→∞.
Solution
Let u = (1/n)n∈N. Fix ε > 0. We write |un − `| = |1/n− 0| = 1/n. We want 1/n ≤ ε, which
can be re-written as n ≥ 1/ε. Take N ∈ N greater than 1/ε (the existence is ensured by the
Archimedean property). Then whenever n ≥ Nε we have n ≥ 1/ε and thus |un−`| ≤ 1/n ≤ ε.
Consider now vn = n+3n+1 and ` = 1. Write
|vn − `| =
∣∣∣∣n+ 3n+ 1 − 1
∣∣∣∣ = ∣∣∣∣n+ 3− n− 1n+ 1
∣∣∣∣ = ∣∣∣∣ 2n+ 1
∣∣∣∣ = 2n+ 1 .
We want this to be less than ε: 2/(n + 1) ≤ ε. This holds true provided that 2 ≤ (n + 1)ε,
that is 2/ε ≤ n+1, or (2/ε)−1 ≤ n. Take Nε a natural number greater than (2/ε)−1. Then
n ≥ Nε implies 2/(n+ 1) ≤ ε so |vn − `| ≤ ε as required.
A sequence which does not have a limit may either go to +∞, −∞, oscillate, or do all sorts of crazy
things.
Exercise 2.13
Explain why the sequences (0, 1, 0, 1, 0, . . .) and (0, 1, 0, 2, 0, 3, 0, 4, . . .) diverge.
Do not try to give a rigorous proof of this fact, it’s way too complicated at this stage (just using the
definition of limit). Rather, try and intuitively understand why they cannot converge.
16
Proposition 2.14 (Convergent sequences are bounded)
If (un)n∈N converges, then it is bounded.
Exercise 2.15
Using this proposition, give a rigorous proof that the second sequence in Exercise 2.13 diverges.
Solution
By way of contradiction. If u = (0, 1, 0, 2, 0, 3, 0, 4, . . .) where convergent, then by Proposition
2.14, it would be bounded – and thus, by definition, in particular bounded above. There
would exist M such that un ≤M for all n. Given the structure of u, this would impose that
M is greater than any natural number, and that is a contradiction (there is no real number
greater than any natural number).
Skeleton of proof of Proposition 2.14
Let ε = 1 and take Nε from the definition of “(un)n∈N has a limit `” (translate this properly).
For n ≥ Nε, the triangle inequality gives |un| = |un − ` + `| ≤ |un − `| + |`| ≤ 1 + |`|. Then
M = max(|u1|, . . . , |uNε−1|, 1 + |`|) is an upper bound of (|un|)n∈N (why?) which shows by
Proposition 2.6 that (un)n∈N is bounded.
Proposition 2.16 (Uniqueness of the limit)
If (un)n∈N has a limit, then this limit is unique and it is denoted by limn→∞ un.
Skeleton of proof
By way of contradiction. Assume that (un)n∈N has two limits ` and `′, with ` < `′. Take
ε = (`′− `)/3 > 0 and translate, for each limit, the definition (do it!). This provides two ranks Nε
and N ′ε. Take n ≥ max(Nε, N ′ε). Then, by the triangle inequality, |`′ − `| ≤ |`− un|+ |un − `′| ≤
2ε = 23 (`′ − `) = 23 |`′ − `|. Hence, ` = `′ (why?), which is a contradiction.
Definition 2.17 (Infinite limit)
A sequence (un)n∈N diverges to +∞, and we write limn→∞ un = +∞, if
for any M ∈ R there exists a rank NM ∈ N such that, for any n ≥ NM , un ≥M .
Similarly, limn→∞ un = −∞ if for any M ∈ R there exists NM ∈ N such that un ≤ M whenever
n ≥ NM .
Theorem 2.18 (Subsequence limit theorem)
If (un)n∈N converges to ` as n→∞ then any subsequence of (un)n∈N also converges to the same
limit `.
This theorem is as much a way to prove that some sequences converge, as a way to prove that some
sequences do not converge.
Exercise 2.19
Give a rigorous proof that the first sequence in Exercise 2.13 does not converge.
17
Solution
By way of contradiction. If it converges to some `, then its subsequence (0, 0, 0, . . .) must also
converge to ` so ` = 0. But the other subsequence (1, 1, 1, . . .) should then also converge to
0, which is a contradiction.
The subsequence limit theorem is also valid if (un)n∈N diverges to +∞, that is if limn→∞ un = +∞ then
any subsequence of (un)n∈N also diverges to +∞. The same holds true for sequences that diverges to
−∞ (their subsequences diverge to −∞ too).
If (un)n∈N diverges “by oscillating”, then nothing can be said about its subsequences (some may diverge,
some may converge...).
Proof of the subsequence limit theorem
Let v be a subsequence of u. Then vn = uϕ(n) for a strictly increasing ϕ : N → N. Let us first
prove that ϕ(n) ≥ n for all n ∈ N. This is done by induction on n. The base case ϕ(1) ≥ 1 is
obvious since ϕ takes values in N. Let us do the induction step. Take n ∈ N and assume that
ϕ(n) ≥ n. Then since ϕ is strictly increasing, ϕ(n+ 1) > ϕ(n) ≥ n. ϕ(n+ 1) is a natural number
that is strictly greater than n; it must therefore be greater than or equal to n+1, the next natural
number. Hence, ϕ(n+ 1) ≥ n+ 1 and the inductive step is complete.
Let us now turn to the proof that vn → ` as n→∞. Let ε > 0. Let us translate the assumption,
i.e. that un → `: there is Nε such that, whenever n ≥ Nε, |un − `| ≤ ε. Take n ≥ Nε. We
proved above that ϕ(n) ≥ n, so ϕ(n) ≥ Nε. The choice of Nε shows that |uϕ(n) − `| ≤ ε, that is,
|vn − `| ≤ ε. Hence, we found Nε such that, whenever n ≥ Nε, |vn − `| ≤ ε. This precisely shows
that vn → ` as n→∞.
Theorem 2.20 (Algebraic Limit Theorem (ALT) for sequences)
Hypotheses : Let (un)n∈N and (vn)n∈N be two sequences. Assume that limn→∞ un = ` and
limn→∞ vn = m with `,m ∈ R.
Conclusions : The following holds:
(i) For any λ ∈ R, limn→∞ λun = λ`.
(ii) limn→∞[un + vn] = `+m.
(iii) limn→∞ unvn = `m.
(iv) limn→∞ unvn =
`
m , provided that m 6= 0 and vn 6= 0 for any n ∈ N.
Which ones of these properties are valid if ` = ±∞ or m = ±∞?
Skeleton of proof of the ALT
(i) Write |λun−λ`| = |λ| |un− `|. The case λ = 0 is trivial, so we consider λ 6= 0. Take ε > 0 and
translate the definition “un → `” with ε/|λ| instead of ε (do it!). This provides Nε such that, for
all n ≥ Nε, |λ| |un − `| ≤ ε (check this).
(ii) Write |un + vn − (`+m)| = |un − `+ vn −m| ≤ |un − `|+ |vn −m|. Take ε and translate the
fact that limn→∞ un = ` and limn→∞ vn = m (do it!). This gives two ranks Nε and Mε. Consider
max(Nε,Mε) as the rank for un + vn...
(iii) Write |unvn−`m| = |(un−`)vn+`(vn−m)| ≤ |un−`| |vn|+ |`| |vn−m|. Use Proposition 2.14
on (vn)n∈N (do it!) to find M such that, for all n, |unvn − `m| ≤ M |un − `| + |`| |vn −m|. Take
ε > 0 and translate the limits of (un)n∈N and (vn)n∈N with, respectively, ε/(2M) and ε/(2|`|) (dot
it, and treat ` = 0 aside). Take the maximum of the two obtained ranks to conclude.
(iv) Using (iii), we only need to prove (iv) in the case un = 1 for all n (why?). Write∣∣∣∣ 1vn − 1m
∣∣∣∣ = ∣∣∣∣m− vnvnm
∣∣∣∣ = |vn −m||vn| |m| .
18
Prove first the following: there is N0 such that, for n ≥ N0, |vn| ≥ |m|/2 (translate the fact that
vn → m with ε = |m|/2). Then, for n ≥ N0, 1|vn| ≤ 2/|m| and thus∣∣∣∣ 1vn − 1m
∣∣∣∣ ≤ 2|vn −m||m|2 .
Take ε > 0 and translate then the fact that vn → m with m2ε/2 instead of ε (do it!), and take
the greatest of the rank obtained and of N0.
Example 2.21
The ALT and the subsequence limit theorem provide very easy solutions to Exercise 2.9.
Theorem 2.22 (Order Limit Theorem (OLT))
If (un)n∈N and (vn)n∈N converge respectively to ` and m and if, for any n ∈ N, un ≤ vn then
` ≤ m.
Exercise 2.23
What can you say if (un)n∈N converges to `, (vn)n∈N converges to m, and un < vn for all n ∈ N?
Partial solution
Only that ` ≤ m, not that ` < m in general. Find an example where un < vn for all n but
` = m.
Skeleton of proof of the OLT
By contradiction. Assume that m < ` and take ε = (` − m)/3. Take Nε corresponding to
limn→∞ un = ` and Mε corresponding to limn→∞ vn = m (write this properly). Consider then
n = max(Nε,Mε). These choices and the assumption un ≤ vn show that ` ≤ un + ε ≤ vn + ε ≤
m+ 2ε (check that), that is (`−m)/2 ≤ ε = (`−m)/3. This gives (`−m)/6 ≤ 0, so ` ≤ m, which
is a contradiction with m < `.
Theorem 2.24 (Squeeze theorem)
Let (un)n∈N, (vn)n∈N and (wn)n∈N be three sequences. If (un)n∈N and (wn)n∈N converge to the
same limit ` and if un ≤ vn ≤ wn for any n ∈ N, then (vn)n∈N converges to `.
Skeleton of proof of the squeeze theorem
Let ε > 0 and take Nε and Mε corresponding to the convergence of (un)n∈N and (wn)n∈N (write
this properly). Let Lε = max(Nε,Mε). If n ≥ Lε then
`− ε ≤ un ≤ vn ≤ wn ≤ `+ ε
(why?) which shows that −ε ≤ wn − ` ≤ ε. This gives |wn − `| ≤ ε whenever n ≥ Lε.
2.3 Indeterminate limits
None of the results above allows us to directly prove that n2+1n2 has a limit and compute it. When trying
to use the ALT, for example, we obtain +∞+∞ which is not a valid limit and can end up being anything.
Exercise 2.25
19
The limits of all the following sequences are formally +∞+∞ and therefore cannot be determined by
simply using the ALT. Do those sequences converge? If so, towards what?
n2
n
,
n− 1
n2
,
an2 + 1
n2
(for a ∈ R+) , n
2(2 + (−1)n) + 1
n2
. (2.1)
Indeterminate limits are those of the form ∞∞ ,
0
0 , 0×∞ and some particular `0 when the denominator can
change sign (e.g. 1(−1)n/n ). They come from competitions between terms in the expression under study.
Technique 2.26 (Finding indeterminate limits)
There is no hard-and-fast rule to deal with indeterminate limits, but some general ideas can be
used.
1. Try and factorise the “dominant” term, to simplify the expression and find the limit or prove
that there is no limit.
This is for example the case for all limits in (2.1).
2. Use the following “classical indeterminate limits” (proved later) to find the dominant term:
• limn→∞ n−ken = +∞ for any k ∈ N,
• limn→∞ nke−n = 0 for any k ∈ N,
• limn→∞ ln(n)n = 0.
3. Being aware of some results regarding limits of functions (see Section 4.2 and Exercise 5.27)
may help.
All the preceding results can be applied if we change R into any ordered field, say e.g. Q. We did not
make use of the very specific property of R, that is the axiom of completeness. Can we obtain better
results if we use this axiom? In particular, all preceding results are mostly good if we know that some
limits exists and/or if we know the expected value of the limit. Can we prove that some limits exists
without knowing their values to start with? The following two sections answer this question.
2.4 Compactness
Theorem 2.27 (Bolzano-Weierstraß)
If (un)n∈N is a bounded sequence of reals then there exists a subsequence (unk)k∈N of (un)n∈N
which converges in R.
Q does not satisfy this theorem: there are bounded sequences in Q which have no subsequences converging
in Q (see Exercise 2.35).
Exercise 2.28
1. (un)n∈N = (0, 1, 0, 1, 0, 1, . . .) is bounded and has many convergent subsequences. Can you find
at least three of them?
2. (un)n∈N = (0, 1, 2, 3, 4, 5, . . .) is unbounded, and does not have any convergent subsequence
(why?).
3. (un)n∈N = (0,−1, 0,−2, 0,−3, 0,−4, . . .) is unbounded but has many convergent subsequences.
Find at least two.
Skeleton of solution
1. An easy one: (0, 0, 0, . . .). Find two more.
2. Remember that the subsequence limit theorem is a powerful tool to prove that some
sequences do not converge. What would happen with a subsequence of (un)n∈N.
3. Not very different from 1.
20
Proof of the Bolzano-Weierstraß theorem
The proof is based on a dichotomy argument and the NIP. Let us present the proof in clear steps.
• Since (un)n∈N is bounded, there is M such that un ∈ [−M,M ] for all n.
• Split [−M,M ] into two closed half intervals [−M, 0] and [0,M ]. One of these halves, let’s
call it I1, is such that there is an infinite number of n ∈ N such that un ∈ I1. Let n1 be such
that un1 ∈ I1.
• Split now I1 in two closed half intervals. Since I1 contains an infinite number of elements
of the sequence (un)n∈N, for one the closed halves of I1, let’s call it I2, there is an infinite
number of n ∈ N such that un ∈ I2. In particular, we can find n2 > n1 such that un2 ∈ I2.
• Continue the process by induction: having constructed Ik, a closed half of Ik−1, such that
{n ∈ N : un ∈ Ik} is infinite, and having selected nk > nk−1 with unk ∈ Ik, construct Ik+1,
a closed half of Ik+1, such that {n ∈ N : un ∈ Ik+1} is infinite, and select nk+1 > nk such
that unk+1 ∈ Ik+1.
• Since nk+1 > nk for all k, (unk)k∈N is a subsequence of (un)n∈N.
• Each Ik has length half of Ik−1. Since the length of I1 is 2M , an easy proof by induction
shows that the length of Ik is M/2k−2, for all k ∈ N.
• (Ik)k∈N is a nested sequence of closed intervals so, by the NIP, ∩k∈NIk is not empty. Let us
take ` in this intersection.
• For all k ∈ N, unk and ` both belong to Ik. Since the length of Ik is M/2k−2 this shows that
|unk − `| ≤M/2k−1.
• We “know” (but could you prove it from the definition?) that M/2k−1 → 0 as k →∞. The
squeeze theorem then shows that |unk − `| → 0 as k →∞.
• This implies (see e.g. Exercise 5 in Problem Set 2) that unk → `, and the proof is complete:
we found a convergent subsequence of (un)n∈N.
The Bolzano-Weierstraß theorem is strongly related to the notion of compact set.
Definition 2.29 (Compact set)
A subset K of R is compact if any sequence in K has a subsequence which converges in K.
Corollary 2.30
An interval I is compact if and only if it is bounded and if it contains its endpoints, i.e. if and
only if I is an interval of the form [a, b] with a and b real numbers.
Skeleton of proof (only two gaps)
Direction “⇒”: by way of contradiction. If I is unbounded, we can construct (do it!) a sequence
(un)n∈N in I such that un → +∞ or un → −∞ as n → ∞. Then (un)n∈N does not have
a converging subsequence, since any subsequence also goes to +∞ or −∞ by the subsequence
limit theorem. I is therefore not compact since one of its sequence does not have a convergent
subsequence.
If I does not contain one of its endpoints, say a, we can easily (do it!) construct a sequence in I
that converges to a. Then any subsequence also converges to a, which shows that none of these
subsequence has a limit in I, and I is therefore not compact.
Direction “⇐”: let (un)n∈N be a sequence in I. Since I is bounded, then (un)n∈N is bounded and
therefore has a convergent subsequence (unk)k∈N, by the Bolzano-Weierstraß theorem. Let us call
` the limit of this subsequence. To conclude the proof we need to show that ` ∈ I. We have
21
I = [a, b] by assumption (it contains its endpoints), so a ≤ unk ≤ b for all k. By the OLT, we
infer a ≤ ` ≤ b, which proves that ` ∈ I.
Corollary 2.30 has a generalisation to any subset of R, called the Heine-Borel theorem: a subset K of
R is compact if and only if it is bounded and closed (that is, any limit of a sequence of element in K
belongs to K).
Compactness has many consequences, in particular when studying functions, but it only establishes limits
of subsequences. Can we establish that a whole sequence converges, without knowing its limit to start
with?
2.5 Monotone convergence theorem (MCT)
Recall that a sequence (un)n∈N is increasing if un+1 ≥ un for all n ∈ N. A sequence is decreasing if
un+1 ≤ un for all n ∈ N. Finally, a sequence is monotone if it is either increasing or decreasing.
Theorem 2.31 (Monotone convergence theorem (MCT))
If (un)n∈N is an increasing and bounded above sequence of real numbers, then (un)n∈N converges.
The same conclusion holds if (un)n∈N is decreasing bounded below.
As in many, if not all, results presented here, all assumptions are important. What happens if you remove
one of the two assumptions (monotony – i.e. increasing or decreasing – or boundedness of the sequence)?
Example 2.32
Let (un)n∈N be defined by u1 = 0 and un+1 = 12 (u2n + 1) for all n ≥ 1. Prove that (un)n∈N is
increasing and converges to 1.
Skeleton of solution
Prove that un+1 ≥ un by writing un+1 − un = 12 (u2n − 2un + 1) = 12 (un − 1)2. Prove then by
induction that, for all n ∈ N, un ≤ 1.
Proof of the MCT
The set A = {un : n ∈ N} is not empty (it contains u1) and bounded above by assumption on
(un)n∈N. Hence, s = sup(A) exists by the AoC. We prove that un → s as n→∞.
Let ε > 0. Using one of the characterisation of the supremum (see page 10), we find a ∈ A such
that s − ε ≤ a ≤ s. There is Nε ∈ N such that a = uNε . Since (un)n∈N is increasing, we deduce
that s− ε ≤ uNε ≤ un whenever n ≥ Nε. s being an upper bound of A, we also have un ≤ s for
all n ∈ N.
Hence, for all n ≥ Nε, s − ε ≤ un ≤ s, which shows that |un − s| = s − un ≤ ε and thus that
un → s as n→∞.
2.6 Cauchy sequences and completeness of R
Definition 2.33 (Cauchy sequence)
A sequence (un)n∈N is a Cauchy sequence if it satisfies the Cauchy criterion: for any ε > 0, there
exists a rank Nε ∈ N such that, for any n, p ≥ Nε, |un − up| ≤ ε.
22
Proposition 2.34
1. Any Cauchy sequence is bounded.
2. Any converging sequence is a Cauchy sequence.
Skeleton of proof of Proposition 2.34
1. Extremely similar to the proof of Proposition 2.14 (convergent sequences are bounded). Re-
produce this proof by changing ` into up...
2. Let (un)n∈N be a convergent sequence and ` its limit. Take ε > 0 and Nε/2 given by the
definition of the convergence, with ε/2 instead of ε (write this properly). If n, p ≥ Nε/2 then, by
the triangle inequality, |un − up| = |un − `+ `− up| ≤ |un − `|+ |up − `| ≤ ε/2 + ε/2 = ε (check
that).
What about the converse? The converse of Item 1 is clearly false (find an example). The converse of
Item 2 is false in Q, as shown by the following exercise.
Exercise 2.35
Construct a Cauchy sequence in Q which does not converge in Q (Q is not complete...).
Solution
Let A = (0,
√
2). By density of Q in R we find xn ∈ Q such that
√
2 − 1/n < xn <
√
2.
(xn)n∈N converges in R to
√
2 by the squeeze theorem, so it is a Cauchy sequence, but it does
not converge in Q.
In R, however, the converse of Item 2 in Proposition 2.34 is true. The Cauchy criterion characterises
therefore converging sequences, without requiring an a priori knowledge of their limit.
Theorem 2.36 (R is complete)
Any Cauchy sequence of reals converges in R.
A powerful application of this theorem, in the context of Rm rather than R, is presented in Appendix B.
Skeleton of proofs of Theorem 2.36
Various possible proofs:
1. Using the MCT.
By the AoC, vn = inf{un, un+1, . . .} exists (why?) and is bounded (why?). (vn)n∈N is
increasing (why?). The MCT therefore shows that vn → ` as n→∞.
By the property of Cauchy sequence, for n, p ≥ Nε, |un − up| ≤ ε, so un − ε ≤ up. This
holds for any p ≥ n and thus un − ε is a lower bound of inf{un, un+1, . . .}, so un − ε ≤ vn
(why?). Hence, vn ≤ un ≤ un + ε if n ≥ Nε.
Take n large enough such that |vn − `| ≤ ε. The previous inequality shows that |un − `| ≤ ε
(why?).
2. Using the Bolzano-Weierstraß theorem
(un)n∈N has a convergent subsequence (unk)k∈N (why?). Let ` be its limit and take Nε
from the definition of a Cauchy sequence (write this properly). If n, k ≥ Nε then nk ≥ Nε
(remember that nk ≥ k) so |un − unk | ≤ ε. For k large enough (write this properly),
|unk − `| ≤ ε which shows that |un − `| ≤ 2ε.
23
3 Series of real numbers
Series are essentially just particular kinds of sequences, but for which we can establish nice results.
Definition 3.1 (Series)
Given a sequence (an)n∈N, the series with terms (a1, a2, . . .) is the sequence S = (sn)n∈N of partial
sums
sn = a1 + . . .+ an =
n∑
k=1
ak.
The series S is often denoted by
∑
an (note the absence of indices).
S converges if (sn)n∈N has a limit in R. In that case, we denote by
∑∞
n=1 an (note the presence
of indices) the sum of the series, i.e. limn→∞ sn.
The notations
∑
an and
∑∞
n=1 an represent two very different objects (one is a particular sequence, the
other is a real number). Note that the distinction between these two notations is often not clearly done
in textbooks.
Given that a series is nothing more than a sequence of partial sums, all results previously established
for sequences can be translated to series. For example, the ALT for series states that if
∑
an and
∑
bn
converges, then
∑
(an + bn) converges and
∑
(an + bn) =
∑
an +
∑
bn.
This is however NOT a trivial result since the sum is infinite. Infinite sums must NOT be handled like
finite ones, they must be handled like limits of finite sums and all justifications pertaining to their status
of limit must be properly given.
3.1 Series with positive terms
Series
∑
an with an ≥ 0 for all n ∈ N enjoy very nice convergence properties, consequences of the
monotone convergence theorem.
Theorem 3.2 (Behaviour of series with positive terms)
If
∑
an is a series with positive terms, then
∑
an converges if and only if the series is bounded
(that is, the partial sums (
∑n
k=1 ak)n∈N form a bounded sequence).
If the series is not bounded, then it diverges to +∞, that is limn→∞ sn = +∞.
Proof
For any n ∈ N, sn+1− sn = (a1 + . . .+ an+1)− (a1 + . . .+ an) = an+1 ≥ 0 by assumption. Hence,
the series (sn)n∈N is an increasing sequence. If it is bounded, the MCT shows that it converges.
Conversely, if it converges it is bounded by Proposition 2.14.
If (sn)n∈N is not bounded then by an exercise in Problem Set 3, it must diverge to +∞.
Corollary 3.3 (Comparison test)
Let (an)n∈N and (bn)n∈N be two sequences such that 0 ≤ an ≤ bn for any n ∈ N.
• If
∑
bn converges, then
∑
an converges.
• If
∑
an diverges (i.e. tends to +∞), then
∑
bn diverges.
Skeleton of proof
If
∑
bn converges then its partial sums are bounded (why?). So the partial sums of
∑
an must
also be bounded (why?), and
∑
an converges by Theorem 3.2.
If
∑
an diverges, then
∑
bn cannot be bounded (this is the contrapositive of the first item), so by
Theorem 3.2 the series
∑
bn must also diverge.
24
3.2 Convergence tests
The first test is actually not a “convergence” test, but rather a test to disprove convergence.
Proposition 3.4 (n-th term test)
If (an)n∈N does not converge to 0, then
∑
an diverges.
Remark 3.5 This test is often stated in the converse direction: if
∑
an converges then an → 0 as n→∞.
Both forms are equivalent, but the most frequent usage of this test is through the formulation in the
proposition.
Example 3.6
1/n→ 0 but ∑ 1/n diverges (see below). How is that not a contradiction with the n-th term test?
Remark 3.7 The n-th term test cannot be used to prove that series converge, only that they diverge.
Proof of the n-th term test
The proof is made by way of contradiction. Assuming that
∑
an converges, then the sequence
(sn)n∈N converges to some `. By the subsequence limit theorem, (sn−1)n≥2 also converges to the
same `. The ALT then show that sn − sn−1 = an → `− ` = 0.
Theorem 3.8 (Cauchy criterion for series)
The series
∑
an converges in R if and only if for any ε > 0, there exists Nε ∈ N such that, for any
n ≥ p ≥ Nε, |ap + . . .+ an| ≤ ε.
Skeleton of proof
This is a direct translation of the Cauchy criterion on the sequence (sn)n∈N, by noticing that, if
p ≤ n, sn − sp−1 = ap + . . .+ an.
The following theorem is mostly used in combination with the comparison test for series with positive
terms.
Theorem 3.9 (Absolute convergence test)
If the series
∑ |an| converges, then the series ∑ an converges.
Skeleton of proof
We have |ap + . . . + an| ≤ |ap| + . . . + |an|. The Cauchy criterion for series (direction “⇒”) for∑ |an| shows that the right-hand side is small when n and p are large (write this properly). Hence,
the Cauchy criterion for series (direction “⇐”) for ∑ an shows that this series converges (write
this properly).
Definition 3.10 (Absolutely and conditionally convergent series)
A series
∑
an such that
∑ |an| converges is said to be absolutely convergent. If a series converge
but is not absolutely convergent, it is conditionally convergent.
25
Note 3.11 (Some important series)
Given the previous results, it appears essential to have a set of “classical” series whose convergence
or divergence is known.
1. Geometric series:
∑
n r
n converges if and only if |r| < 1. In that case, ∑∞n=1 rn = r1−r .
2. Riemann series, or p-series:
∑ 1
np converge if and only if p > 1.
Theorem 3.12 (Ratio test)
Let (an)n∈N be non-zero reals and assume that limn→∞
∣∣∣an+1an ∣∣∣ = r. Then
1. If r < 1,
∑
an is absolutely convergent.
2. If r > 1,
∑
an diverges.
If r = 1, we cannot draw any conclusion (any situation can happen). Consider for example
∑
n
n2+1 and∑
n
n3+1 .
Skeleton of proof of the ratio test
Case r < 1: we compare the series with a geometric series. Take s ∈ (r, 1) and translate the limit
of |an+1/an| with ε = s − r > 0 (do it!). This shows that, for n ≥ Nε, |an+1/an| ≤ s (why?).
Using this relation, an induction (write it) yields, for n ≥ Nε, |an| ≤ |aNε |sn−Nε = Csn where
C = |aNε |s−Nε does not depend on n. The comparison test then shows that
∑ |an| is convergent
(why?).
case r > 1: by definition of the limit with ε = r− 1 > 0, for n ≥ Nε, |an+1/an| ≥ r− (r− 1) = 1,
so |an+1| ≥ |an|. Hence, |an| ≥ |aNε | and (an)n∈N cannot converge to zero (why? Remember that
all ak are non-zero). The n-th term test then shows that
∑
an does not converge.
Theorem 3.13 (Alternating series test (AST))
Let (an)n∈N be a decreasing sequence of positive reals such that limn→∞ an = 0. Then
∑
(−1)nan
converges.
Exercise 3.14
Prove that
∑ (−1)n
n is conditionally convergent.
Skeleton of solution
The series is convergent by the AST (check this properly). To prove that it is not absolutely
convergent, i.e. that
∑
1/n diverges, start by proving that, if nk = 2k, then
snk = 1 + 1/2 + (1/5 + 1/6 + 1/7 + 1/8) + · · ·
≥ 1 + 1/2 + (1/4 + 1/4) + (1/8 + 1/8 + 1/8 + 1/8) + · · ·
+ (1/2k + . . .+ 1/2k)
≥ 1 + 1/2 + 1/2 + · · · = 1 + k/2.
Use then the subsequence limit theorem.
There is a more general version of this result, called the Abel theorem. Actually, the proofs of these
theorems are based on a the transformation (the “Abel transformation”) which is nothing more than a
discrete integration-by-parts.
26
Skeleton of proof of the AST
By monotony of (an)n∈N, for p ≥ n,
|ap − ap+1 + ap+2 − ap+3 + ap+4 − ap+5 + . . .+ an−2 − an−1 + an|
= |(ap − ap+1) + (ap+2 − ap+3) + (ap+4 − ap+5) + . . .+ (an−2 − an−1) + an|
= (ap − ap+1) + (ap+2 − ap+3) + (ap+4 − ap+5) + . . .+ (an−2 − an−1) + an
(why?). Still using the monotony of (an)n∈N, we have ap+2 ≤ ap+1, ap+4 ≤ ap+3, etc. Hence,
(ap − ap+1) + (ap+2 − ap+3) + (ap+4 − ap+5) + . . .+ (an−2 − an−1) + an
≤ (ap − ap+1) + ( ap+1 − ap+3) + ( ap+3 − ap+5) + . . .+ ( an−3 − an−1) + an
≤ ap + (−ap+1 + ap+1) + (−ap+3 + ap+3) + . . .+ (−an−3 + an−3)− an−1 + an
≤ ap − an−1 + an ≤ ap.
The Cauchy criterion for series then shows that
∑
(−1)nan converges (write this properly).
Technique 3.15 (How to choose a convergence test?)
Asked to analyse the convergence properties of a series
∑
an, how to choose the right test/result
among all the previous ones?
• Your best buddy and first, easiest choice should be the ratio test.
• Should it fail to provide a response, the next attempt should be with the absolute convergence
test combined with the comparison test. This requires, however, to find a “nice” series whose
term bound |an|. To this end, the classical series (geometric, p-series) might help.
• If the signs of (an)n∈N obviously alternate, then maybe the AST can help.
• If you think the series diverges, the easiest way to prove it would be to use the n-th term
test. This test is however not foolproof, some series diverge whilst still having a generic term
that goes to 0 (can you find an example?).
• Only attempt the Cauchy criterion if you are totally desperate. This criterion is usually
very difficult to check on a given explicit series
∑
an, and mostly used to establish other
theoretical results than to prove that a particular series converges or diverges.
Exercise 3.16
Study the convergence of the following series:
(1)
∑
n
(−1)n
n2
, (2)
∑
n
cos(n)n2
2n , (3)
∑
n
n
n+ 1 ,
(4)
∑
n
√
n
n+ 1 , (5)
∑
n
cos(npi) ln
(
1 + 1
n
)
Hints
(1): absolute convergence test.
(2): absolute convergence test test and ratio test.
(3): n-th term test.
(4): comparison test and p-series.
(5): AST.
27
4 Functions
4.1 What is a function, exactly?
You may know several different representations of functions, for example:
• via a formula, e.g. f(x) = x2ex−ln(3x cos(x−1)),
• via a graph, e.g.
• via a rule, explaining you how to compute f(x) from x, e.g. “f(x) is the 3rd decimal in the decimal
expansion of x”,
• via a table of values, e.g.:
x f(x)
1 3.4
2 −3
2.4 pi
...
...
From a mathematical perspective, these does not rigorously define what a function is, these are merely
representations of the object “function” (and even very improper representations, since the domains are
not explicitly stated).
Definition 4.1 (Function)
Let A,B be two sets. A function F from A to B is a subset of A × B such that, for any x ∈ A,
there exists one and only one y ∈ B such that (x, y) ∈ F .
A is called the domain of the function, and B its co-domain.
The relation between this definition and the “usual” understanding we have of a function f : A 7→ B, as
an object which “associates” to each x ∈ A a unique element f(x) ∈ B, is natural: f(x) is the unique
y ∈ B such that (x, y) ∈ F .
BEWARE! Do not confuse the function f with its value f(x) taken at a given point x, or with the
equation y = f(x)...
4.2 Limits of functions
Definition 4.2 (Limit point)
A limit point of a set A ⊂ R is a real c that can be approximated by at least one sequence (xn)n∈N
in A such that xn 6= c for any n.
An equivalent characterisation is: for any δ > 0, (c−δ, c+δ) contains at least a point of A different
from c.
28
We denote by L(A) the set of limit points of A.
In practice in this unit, we will mostly deal with limit points of intervals I (not reduced to one point),
which are simply any element in the interval and the two endpoints of the interval. In other words, if
I = (u, v), [u, v), (u, v] or [u, v] with u < v, then L(I) = [u, v]
Definition 4.3 (Limit of a function)
Let f : A→ R, c a limit point of A and ` ∈ R. f has limit ` at c if
for all ε > 0 there exists δ > 0 such that, if x ∈ A and 0 < |x− c| ≤ δ, then |f(x)− `| ≤ ε.
We then write limx→c f(x) = `, or f(x)→ ` as x→ c.
Note that condition 0 < |x− c| implies that x 6= c. The value of f at c, if c belongs to A, is irrelevant in
this definition of limit (but see Proposition 4.11).
Technique 4.4 (How to prove from the definition that a function has a limit at some
point?)
We follow the same ideas as in the note for sequences in page 16, with natural adaptations (i.e.
we do not have a natural number n becoming large here, but a variable x becoming close to some
number `).
• Write explicitly |f(x) − `|, and try to find a nice simple upper bound for this quantity in
terms of x, which is small when |x− c| is small.
For example, |f(x)− `| ≤ C|x− c|2 for some fixed constant C not depending on x, at least
for x around c.
• Impose that x must be such that this upper bound is less that ε.
Following the previous example, this would be C|x− c|2 ≤ ε.
• Do some algebraic manipulations on the inequality thus obtained to re-write it under the
form “|x− c| ≤ δ” (where δ does not depend on x).
Here, this would be: C|x− c|2 ≤ ε if |x− c|2 ≤ ε/C, that is |x− c| ≤√ε/C.
• This δ is the one you are looking for.
There might need to be some adjustement on the δ if, during the estimate of |f(x) − `| you had
to assume some things about x...
Exercise 4.5
1. f(x) = 2x+ 4 has limit ` = 2 at c = −1.
2. f(x) = x4 + 2x+ 3 has limit ` = 6 at c = 1.
3. f(x) =
√
x has limit ` = 2 at c = 4.
Sketch of solution
1. Write f(x)−2 = 2x+4−2 = 2x+2 = 2(x+1), hence |f(x)−2| ≤ 2|x+1| = 2|x− (−1)| =
2|x− c|. We want 2|x− c| ≤ ε, which imposes |x− c| ≤ ε/2. So take δ = ε/2...
2. Write f(x)− ` = x4 + 2x+ 3− 6 = x4 + 2x− 3. This is a polynomial, and it vanishes at
c = 1 (why could we have the intuition that this was the case?). So we can factor out x− 1,
for example using long division: f(x) = (x− 1)(x3 + x2 + x+ 3). Assume that “x is around
c = 1” in the precise sense that |x− 1| ≤ 1, that is 0 ≤ x ≤ 2.
Then |x3+x2+x+3| ≤ |x3|+ |x2|+ |x|+3 ≤ 8+4+2+3 = 17 and thus |f(x)−6| ≤ 17|x−1|.
We want 17|x − 1| ≤ ε, that is |x − 1| ≤ ε/17. However, the previous estimate on f(x) − 6
29
only holds if |x− 1| ≤ 1 so we also need to impose this in the condition “|x− 1| ≤ δε”.
We therefore fix δε = min(1, ε/17). If |x − 1| ≤ δε then |x − 1| ≤ 1 so our estimate is valid
and |f(x)− 6| ≤ 17|x− 1| ≤ 17δε ≤ 17× ε/17 = ε.
3. (Hint) There is a trick to remember here:
√
a−√b = (
√
a−√b)(√a+√b)√
a+
√
b
= a−b√
a+
√
b
.
Theorem 4.6 (Sequential characterisation of limit)
Let f : A→ R and c ∈ L(A). Then (
lim
x→c f(x) = `
)
m(
for any sequence (xn)n∈N in A converging to c and such
that xn 6= c for any n, we have f(xn)→ `.
)
Theorem 4.6 is as much a tool to prove as a tool to disprove the existence of a limit.
Exercise 4.7
1. f(x) = x2. Prove that, for any c ∈ R, limx→c f(x) = c2 by using the ALT for sequences.
2. f(x) = cos(1/x) defined on A = R\{0}. Limit of f at c = 0?
Solution
1. Assume that (xn)n∈N is a sequence in R converging to c, but never equal to c. Then
f(xn) = xn × xn → c × c by the ALT. This shows that f(xn) → c2. Hence, the sequential
characterisation of limit enables us to conclude that limx→c f(x) = c2.
2. Looking at the graph of f , it does not look like the limit exists. To prove that, we use the
sequential characterisation of limit. We just need to find one sequence (xn)n∈N of numbers in
A, such that (f(xn))n∈N does not converge. Again looking at the graph of f , we can have the
idea of selecting a sequence that give us the minima and maxima of f . We consider therefore
xn = 1/(npi). Then f(xn) = cos(npi) = (−1)n, and we know that the sequence (−1)n does
not converge (could you re-prove this?). As a consequence, the limit of f at 0 does not exist.
Sketch of proof of the sequential characterisation of limit
Direction “⇓”: fix ε > 0 and translate the definition of “f(x)→ ` as x→ c” to find a corresponding
δ. Take now a sequence (xn)n∈N as the second statement, and translate “xn → c as n→∞” with
δ instead of ε. This gives N such that, whenever n ≥ N , 0 < |xn − c| ≤ δ. Then the choice of δ
implies |f(xn)− `| ≤ ε.
Direction “⇑”: by contraposition. Assume that f(x) 6→ ` as x → c. This means that there is
ε > 0 such that, for all δ > 0, we can find at least one x ∈ A satisfying 0 < |x − c| ≤ δ and
|f(x)− `| ≥ ε. Consider δ = 1/n and take the corresponding x, that you call xn. Argue then that
(xn)n∈N converges to c, is never equal to c, but that f(xn) 6→ ` (use for example the OLT). This
proves the negation of the statement at the bottom, as required.
Theorem 4.8 (Algebraic Limit Theorem (ALT) for functional limits)
Hypotheses : Let f and g be defined on A and c ∈ L(A). Assume that limx→c f(x) = ` and
limx→c g(x) = m.
30
Conclusions : The following holds:
(i) For any k ∈ R, limx→c kf(x) = k`.
(ii) limx→c[f(x) + g(x)] = `+m.
(iii) limx→c f(x)g(x) = `m.
(iv) limx→c f(x)g(x) =
`
m , provided that m 6= 0 and g does not vanish on A.
Hint for the proof
Everything follows from the sequential characterisation of limit and the ALT for sequences.
4.3 Continuity
4.3.1 Definition and basic properties
Intuitively:
“Continuous function”
=
“its graph can be drawn without lifting up the pencil.”
(4.1)
Or, equivalently, a function is not continuous if there’s a “jump” in its graph. To do any kind of
mathematics with the notion of continuity, we need a rigorous definition...
Definition 4.9 (Continuity)
Let f : A→ R and c ∈ A. f is continuous at c if
for all ε > 0, there exists δ > 0 such that, whenever x ∈ A and |x − c| ≤ δ, it follows
|f(x)− f(c)| ≤ ε.
f is continuous on B ⊂ A if it is continuous at each c ∈ B.
Example 4.10
1. The Dirichlet function (=1 on rational points, =0 on irrational points) is nowhere continuous.
2. The modified Dirichlet function (=x×Dirichlet function) is continuous only at 0.
3. The Thomae function (=0 on irrational points, =1/n at x = m/n in reduced form) is continuous
at irrational points.
Proposition 4.11 (Continuity and limit)
Let f : A→ R and c ∈ A∩L(A). Then “f continuous at c” is equivalent to “limx→c f(x) = f(c)”.
Hint for the proof
The proof is rather easy, it simply consists in translating each property, and in carefully dealing
in these translations with the case x = c. Cf. Problem set 7.
Theorem 4.12 (Sequential characterisation of continuity)
Let f : A → R and c ∈ A. f is continuous at c if and only if for any sequence (xn)n∈N in A
converging to c we have f(xn)→ f(c).
31
Hint for the proof
Follows immediately from the relation between continuity and limit, and the sequential charac-
terisation of functional limits. Note that the case c ∈ A\L(A) needs to be dealt separately, but it
is trivial (any function on A is continuous at any point of A\L(A)...).
Exercise 4.13
Thomae’s function is discontinuous at rational points.
Hint for the solution
To prove that f is not continuous at c, according to the sequential characterisation of con-
tinuity, you just need to find (xn) ⊂ A that converges to c and such that f(xn) 6→ f(c).
Remember that, as close as you want to any rational point, there are irrational points.
Theorem 4.14 (Algebraic Continuity Theorem (ACT))
Let f, g : A→ R be continuous at a point c ∈ A. Then:
(i) For all k ∈ R, kf is continuous at c.
(ii) f + g is continuous at c.
(iii) fg is continuous at c.
(iv) Provided that g does not vanish on A, fg is continuous at c.
Hint for the proof
Trivial consequence of the sequential characterisation of continuity and the ALT for sequences.
Example 4.15
The following functions are continuous:
1. All polynomials P (x) = a0 + a1x+ · · ·+ arxr are continuous on R. A rational function PQ , with
P and Q polynomials, is continuous on R\{roots of Q}.
2.
√·, sin, cos, tan, exp and ln are continuous on their domain of definition.
Theorem 4.16 (Composition of continuous functions)
Let f : A→ B be continuous at c ∈ A and g : B → R be continuous at f(c). Then g ◦ f (defined
by g ◦ f(x) = g(f(x))) is continuous at c.
Hint for the proof
Again, easy consequence of the sequential characterisation of continuity (used twice, and in both
directions).
4.3.2 Types of discontinuities
Let us come back to the intuitive point of view (4.1) on continuity, and consider the two following
functions.
32
Step function: Oscillating function:
f(x) = sign(x) g(x) = sin(1/x) if x 6= 0, g(0) = 0.
These functions are both non-continuous at 0. We cannot draw the graph of the step function without
lifting the pencil, so this matches our intuitive understanding on non-continuity. However, the situation
seems to be a little bit different for the oscillating function: it seems, if we had an infinite time, that we
could perhaps draw its graph without lifting the pencil...
Whereas the intuitive definition of continuity only made us think about one type of non-continuity (a
jump in the graph), the mathematical definition gives rise to two types of non-continuous functions: those
presenting a jump in their graph (Type 1) and those strongly oscillating (Type 2).
Remark 4.17 The step function attains values −1 and 1 but do not cover all the range of real numbers
between these values. The oscillating function attains values −1 and 1 and covers the whole range [−1, 1].
Remark 4.18 A function can oscillate while getting closer and closer to a given value, in which case it is
continuous. Example: f(x) = x sin(1/x) if x 6= 0, and f(0) = 0.
The function f defined by f(x) = x sin(1/x) if x 6= 0, and f(0) = 0
4.4 Contraction mapping theorem
The contraction mapping theorem is a very applicable result in real analysis, and a very good example
of the power of the Cauchy criterion. We consider here only a very specific case of this theorem; a more
general version – set on a complete metric space – can be found in the textbook (Abbott, Section 8.2).
33
Definition 4.19 (Fixed point)
Let f : A→ R be a function. A fixed point of f is x∗ ∈ A such that f(x∗) = x∗.
The usual distance between two real numbers x and y is denoted by d(x, y) = |x− y|.
Definition 4.20 (Contraction)
A contraction is a mapping f : A→ A for which there exists γ ∈ (0, 1) such that, for all x, y in A,
d(f(x), f(y)) ≤ γd(x, y).
γ is then called a contraction constant of f .
Remark 4.21 Note that some textbooks call this a strict contraction.
Exercise 4.22
1. Show that the function f(x) = 13 (x− 2) is a contraction on R.
2. Is f(x) = exp(x) a contraction on R?
3. Is f(x) = x2 a contraction on R?
4. Is f(x) = x/2 + 2 a contraction on I = [0, 1]?
5. A consequence of the Mean Value Theorem is: if g : R→ R is differentiable and |g′(x)| ≤ m for
all x ∈ R, then for all a, b in I we have |g(b)− g(a)| ≤ m|b− a|.
Show that g(x) = 11+x2 is a contraction on R.
Graph of g(x) = 11+x2 Graph of g′
Hint for the solution
1. Write the definition of contraction and you’ll see that γ = 1/3 is a contraction constant of
f .
2. No. Find two points x and y such that |ex − ey| ≥ |x− y|.
3. Same as above.
4. Do not forget that a contraction needs to have the same domain and co-domains.
5. We have g′(x) = − 2x(1+x2)2 . Study |g′| to find its maximum and see that it is stricly less
than 1.
We recall that a closed interval takes the form I = [a, b], I = [a,+∞), I = (−∞, b], or I = R.
34
Theorem 4.23 (Contraction Mapping Theorem)
Let I be a closed interval and f : I → I be a contraction. Then f has a unique fixed point x∗.
Moreover, if x0 is an arbitrary point of X then the sequence (xn)n∈N defined inductively by
xn = f(xn−1) converges to x∗.
Proof, with two gaps:
Let γ ∈ (0, 1) be a contraction constant of f , that is d(f(x), f(y)) ≤ γd(x, y) for all x, y ∈ X.
Uniqueness of the fixed point: if x∗ and x∗∗ are fixed points of f then f(x∗) = x∗ and f(x∗∗) =
x∗∗. Thus d(x∗, x∗∗) = d(f(x∗), f(x∗∗)) ≤ γd(x∗, x∗∗) and thus (1 − γ)|x∗ − x∗∗| ≤ 0. Since
1− γ > 0, dividing by this quantity yields |x∗ − x∗∗| ≤ 0, that is x∗ = x∗∗. In other words, f has
at most one fixed point.
Existence of the fixed point: construct a sequence (xn)n∈N as indicated in the theorem. Then
d(xn, xn+1) = d(f(xn−1), f(xn)) ≤ γd(xn−1, xn).
Iterating this inequality we find by induction d(xn, xn+1) ≤ γnd(x0, x1). Use now the triangle
inequality to write, for m > n,
d(xn, xm) ≤ d(xn, xn+1) + d(xn+1, xm)
≤ d(xn, xn+1) + d(xn+1, xn+2) + d(xn+2, xm)
≤ d(xn, xn+1) + d(xn+1, xn+2) + · · ·+ d(xm−1, xm)
≤ γnd(x0, x1) + γn+1d(x0, x1) + · · ·+ γm−1d(x0, x1).
Now,
γn + γn+1 + · · ·+ γm = γn(1 + γ + · · ·+ γm−n) = γn 1− γ
m−n+1
1− γ ≤
γn
1− γ
and thus
d(xn, xm) ≤ γ
n
1− γ d(x0, x1).
Let ε > 0. Since γn → 0 as n → ∞ (because γ ∈ (0, 1)), the ALT shows that γn1−γ d(x0, x1) → 0
as n → ∞. Hence there is Nε such that γ
n
1−γ d(x0, x1) ≤ ε. It follows that, for any m ≥ n ≥ Nε,
|xn − xm| ≤ ε. Hence (xn)n∈N is a Cauchy sequence and converges to some x∗ by Theorem 2.36.
Since I is closed, x∗ ∈ I (why?) It remains to prove that x∗ is a fixed point. For this, we note
that (xn+1)n∈N is a subsequence of (xn)n ∈ N and, as such, also converges to x∗. Since f is a
contraction, it is continuous (why?), and the sequential characterisation of continuity thus implies
that f(xn) → f(x∗) as n → ∞. We can therefore pass to the limit in the equality f(xn) = xn+1
to see that f(x∗) = x∗.
Exercise 4.24
Find, if they exist, the fixed points of the functions in Exercise 4.22.
Hint for the solution
Algebraically solve for items 1., 3. and 4. Use a graph for 2. and 5. (for the latter,
x∗ ≈ 0.68232).
4.5 Continuous functions on compact sets
If f : A→ R and B ⊂ A, we denote by f(B) = {f(x) : x ∈ B} the set of values covered by f(x) when
x runs through B.
Exercise 4.25
35
Find a continuous function f : R→ R and an open interval I such that f(I) is not an open interval.
Do the same with a closed interval I (and, possibly, a different f).
Hint for the solution
For the first one, take f not monotone, e.g. f(x) = x2. For the second one, consider an
unbounded interval and f(x) = sin(x).
Theorem 4.26 (Preservation of compact sets)
Let f : A→ R be continuous on A and K ⊂ A be compact. Then f(K) is compact.
Sketch of proof
We want to prove that f(K) is compact. Take a sequence (yn)n∈N in f(K). There is, for any
n ∈ N, xn ∈ K such that yn = f(xn). The sequence (xn)n∈N is in K. Translate the fact that K
is compact, and use the continuity of f along the sequential characterisation of continuity.
Definition 4.27 (Maximum/minimum of a function, extremum)
Let f : A→ R. A maximum of f on A is c ∈ A such that f(c) ≥ f(x) for all x ∈ A. In that case,
f(c) is called a maximum value.
We similarly define a minimum and a minimum value of f by reversing the inequality above.
A maximum or a minimum of f is called an extremum of f .
Exercise 4.28
Find a bounded set A and a continuous function f : A→ R such that f does not have a maximum
nor a minimum on A.
Hint for the solution
Start from A = (0, 1) and look for a very simple function.
Theorem 4.29 (Extreme Value Theorem (EVT))
Let K be a non-empty compact subset of R and f : K → R be continuous. Then f has a maximum
and a minimum on K.
In other words, there exists x0 and x1 in K such that, for all x ∈ K, f(x0) ≤ f(x) ≤ f(x1).
Proof
By the preservation of compact sets, since K is compact and f is continuous, then f(K) is compact
and non-empty (since K is non-empty). From Problem Set 4, we know that the set f(K) has
a maximum M and a minimum m. Since m,M are in f(K), we can find x0 and x1 in K such
that m = f(x0) and M = f(x1). Then, for any x ∈ K, f(x) ∈ f(K) so m ≤ f(x) ≤ M , that is
f(x0) ≤ f(x) ≤ f(x1) as required.
4.6 Intermediate value theorem (IVT)
Exercise 4.30
1. Let I be an interval. Find a function f : I → R that takes the values 0 and 1 but not the value
0.5.
36
2. Find a set A and a continuous function f : A → R that takes the values 0 and 1, but not the
value 0.5.
Hint for the solution
Use a graph to help your intuition. For the second one, think of a slightly “complicated” set
A (e.g., not an interval).
Theorem 4.31 (Intermediate Value Theorem (IVT))
Let f : [a, b]→ R be continuous and L be a real number between f(a) and f(b). Then there exists
c ∈ [a, b] such that f(c) = L.
Sketch of proof
Assume f(a) ≤ L ≤ f(b), let c = sup{x ∈ [a, b] | f(x) ≤ L}. Use Problem Set 3 to see that there
is a sequence (xn)n∈N such that f(xn) ≤ L and xn → c as n→∞. The OLT shows that f(c) ≤ L.
Assuming that c < b (treat the case c = b separately), argue that f(c+ 1/n) > L and deduce that
f(c) ≥ L.
Exercise 4.32
Let f : R→ R be defined by f(x) = x4 + cos(x2) sin(x)− 1. Show that there exists c ∈ R such that
f(c) = 0.
Hint for the solution
Find two real numbers a and b such that f(a) ≤ 0 and f(b) ≥ 0, then apply the IVT (do not
forget to check the assumptions).
Remark 4.33 Some functions are not continuous but satisfy the IVT: f(x) = sin(1/x), with f(0) = 0.
See Remark 4.17.
Theorem 4.34 (Preservation of intervals)
Let I be an interval and f : I → R be continuous. Then f(I) is an interval.
Sketch of proof
Remember the definition of an interval. Take y < y′ in f(I) and L between y and y′. There is
x, x′ in I such that f(x) = y and f(x′) = y′. Apply then the IVT between x and x′ to L.
As a consequence of the IVT, we have an interesting new fixed point theorem. The genuine Brouwer
theorem replaces [a, b] by a convex compact subset of Rm... and is infinitely more difficult to prove!
Theorem 4.35 (Brouwer)
Let f : [a, b]→ [a, b] be continuous. Then f has at least one fixed point.
Proof
Consider g(x) = f(x)− x, which defines a continuous function on [a, b] by the ACT. Since f(x) ∈
[a, b] for all x, we have g(a) = f(a) − a ≥ 0 and g(b) = f(b) − b ≤ 0. Hence, 0 lies between g(a)
and g(b), so the IVT shows that there is c ∈ [a, b] satisfying g(c) = 0. Translated in terms of f ,
37
this gives f(c) = c and concludes the proof.
5 Derivatives
5.1 Definition and basic properties
Intuitively: derivative = slope of the tangent to the graph.
Slope=g′(c)
Slope=g(x)−g(c)
x−c
c x
g(x)
g(c)
Definition 5.1 (Derivative)
Let I be an open interval of R, g : I → R and c ∈ I. g is differentiable at c if the limit
lim
x→c
g(x)− g(c)
x− c (5.1)
exists. We then define the derivative of g at c by g′(c) = limx→c g(x)−g(c)x−c .
If g is differentiable at any point of I, we say that g is differentiable on I.
Letting h = x− c, we can also write
g′(c) = lim
x→c
g(x)− g(c)
x− c = limh→0
g(c+ h)− g(c)
h
.
Exercise 5.2
1. g(x) = xn is differentiable on R and g′(x) = nxn−1.
2. g(x) =
√
x is differentiable on (0,∞) and g′(x) = 12√x .
3. g(x) = |x| is differentiable on R\{0}, with g′(x) = sign(x), but g is not differentiable at 0.
Hint for the solution
1. Use the identity xn − cn = (x− c)(xn−1 + xn−2c+ xn−3c2 + · · ·+ xcn−2 + cn−1) and the
ALT.
2. Remember that
√
x−√c = x−c√
x+
√
c
.
3. We have (g(x) − g(0))/(x − 0) = |x|/x. Consider this along the sequence xn = (−1)n/n
and use the sequential characterisation of functional limit.
38
Proposition 5.3 (1st order asymptotic expansion)
Let I be an open interval of R, g : I → R and c ∈ I. Then g is differentiable at c if and only if there
exists L ∈ R and a function r defined on an open interval around 0 such that limh→0 r(h) = 0
and, for all h in this interval, g(c+ h) = g(c) + Lh+ hr(h).
In this case, we have g′(c) = L and we can therefore write
g(c+ h) = g(c) + g′(c)h+ hr(h). (5.2)
Proof
Direction “⇒”: we know that g is differentiable at c. For h 6= 0 we set
r(h) = g(c+ h)− g(c)− g
′(c)h
h
,
and r(0) = 0. Then r is defined on an interval around 0, because c belongs to the open interval I
over which g is defined, and by construction g(c + h) − g(c) − g′(c)h = hr(h), that is g(c + h) =
g(c) + Lh + hr(h) with L = g′(c). Moreover, r(h) = g(c+h)−g(c)h − g′(c) tends to 0 as h → 0, by
definition of the derivative of g at c.
Direction “⇐”: assume that g(c+ h) = g(c) + Lh+ hr(h) for some L ∈ R and some r that tends
to 0 as h→ 0. Then, using the ALT for functional limits,
g(c+ h)− g(c)
h
= Lh+ hr(h)
h
= L+ r(h)→ L as h→ 0.
This shows, by definition, that g is differentiable at c with g′(c) = L.
“Differentiable” is stronger than “continuous”:
Theorem 5.4 (Differentiable ⇒ Continuous)
Let I be an open interval of R, g : I → R and c ∈ I. If g is differentiable at c, then g is continuous
at c.
Proof
By Proposition 5.3, we have g(c + h) = g(c) + Lh + hr(h). Using the ALT for functional limits,
we see that g(c + h) → g(c) as h → 0. According to the link between limit and continuity, we
deduce that g is continuous at c.
Theorem 5.5 (Algebraic Differentiation Theorem)
Let I be an open interval of R, f, g : I → R and c ∈ I. If f and g are differentiable at c, then
(i) For all k ∈ R, kf is differentiable at c and (kf)′(c) = kf ′(c).
(ii) f + g is differentiable at c and (f + g)′(c) = f ′(c) + g′(c).
(iii) fg is differentiable at c and (fg)′(c) = f ′(c)g(c) + f(c)g′(c).
(iv) If g does not vanish on I, fg is differentiable at c and
(
f
g
)′
(c) = g(c)f
′(c)−f(c)g′(c)
(g(c))2 .
Hint for the proof
39
Most of them are straightforward consequences of the definition of differentiability. For (iii), write
(fg)(c+ h)− (fg)(c) = f(c+ h)g(c+ h)− f(c)g(c)
= (f(c+ h)− f(c))g(c+ h) + f(c)(g(c+ h)− g(c)).
Divide throughout by h and let h→ 0, using Theorem 5.4 to ensure that g(c+h)→ g(c) as h→ 0.
The quotient rule, property (iv), is probably the trickiest. Here is a way to properly deal with it,
with a minimal risk of mistake. First, we reduce the complexity of what we have to prove. Assume
that we have proved that 1/g is differentiable at c and that (1/g)′(c) = −g′(c)/g(c)2. Then by
the product rule (iii) (which we already proved) applied to f and 1/g, f/g = f × (1/g) is also
differentiable at c and(
f
g
)′
(c) = f ′(c)× 1
g(c) + f(c)×
−g′(c)
g(c)2 =
f ′(c)g(c)− f(c)g′(c)
g(c)2
as we wanted (to obtain the second equality, all terms have been reduced to the same denominator).
We therefore only have to prove that 1/g is differentiable at c, and establish the formula for
(1/g)′(c) (so we have simplified the proof to the case f = 1). Let’s do that.
By differentiability of g, g(c+h) = g(c)+g′(c)h+hr(h) for some r that goes to 0 as h→ 0. Write
then
1
g(c+ h) −
1
g(c) =
g(c)− g(c+ h)
g(c+ h)g(c) =
−g′(c)h− hr(h)
g(c+ h)g(c) .
Divide then by h and let h → 0, using the property of r, the continuity of g, and the Algebraic
Limit Theorem for functional limits.
Example 5.6
The following functions are differentiable:
1. All polynomials P (x) = a0 + a1x + · · · + arxr are differentiable on R. A rational function PQ ,
with P and Q polynomials, is differentiable on R\{roots of Q}.
2. sin, cos, tan, exp and ln are differentiable on their domain of definition.
3.
√· is differentiable only on (0,+∞), not on its entire domain of definition (it is not differentiable
at 0...).
Theorem 5.7 (Chain Rule (=Differentiation of a Composition))
Let I and J be open intervals of R, f : I → J and g : J → R and c ∈ I. If f is differentiable at c
and g is differentiable at f(c) then g ◦ f is differentiable at c and (g ◦ f)′(c) = g′(f(c))f ′(c).
Sketch of proof
Write, by assumption, f(c+h) = f(c)+f ′(c)h+hr(h) and g(f(c)+k) = g(f(c))+g′(f(c))k+kq(k)
where r(h)→ 0 as h→ 0 and q(k)→ 0 as k → 0. Letting k = f(c+ h)− f(c) = f ′(c)h+ hr(h),
which tends to 0 when h→ 0, we get
g(f(c+ h)) = g(f(c)) + g′(f(c))f ′(c)h+ g′(f(c))hr(h)
+ h[f ′(c) + r(h)]q(f ′(c)h+ r(h)h)
= g(f(c)) + g′(f(c))f ′(c)h+ hR(h).
Here, R(h) gather several terms. Write precisely what R(h) is, and deduce that R(h) → 0 as
h→ 0.
Example 5.8
1. (gn)′(x) = ng′(x)g(x)n−1.
40
2. exp(g)′(x) = g′(x) exp(g(x)).
3. If we know that (1/x)′ = −1/x2, then we can find back (1/g)′ by composition and (f/g)′ by
product.
Theorem 5.9 (Interior Extremum Theorem (IET))
Let I be an open interval, f : I → R and c ∈ I. If c is an extremum of f and f is differentiable
at c, then f ′(c) = 0.
Sketch of proof
Assume that c is a maximum of f . We have f(c+h) ≤ f(c). Using the 1st order expansion shows
that
f ′(c)h+ hr(h) ≤ 0. (5.3)
Taking h > 0 gives f ′(c) + r(h) ≤ 0. Use the OLT (or a functional version thereof) to deduce that
f ′(c) ≤ 0. Taking h < 0 in (5.3) gives f ′(c) + r(h) ≥ 0 and thus f ′(c) ≥ 0.
Remark 5.10 A local maximum of f is c ∈ I for which there exists δ > 0 such that f(x) ≤ f(c) for all
x ∈ (c− δ, c+ δ). We similarly define the notion of “local minimum”.
The Interior Extremum Theorem works if we replace “extrema” by “local extrema” (just apply the
theorem with (a, b) replaced by (c− δ, c+ δ)).
Exercise 5.11
1. Find A, f : A → R and c ∈ A such that f ′(c) = 0 but c is not an extremum (not even a local
extremum) of f .
2. Let A = [0, 1] and f(x) = x. 0 is a minimum of f on A, yet f ′(0) = 1 6= 0. Where’s the
problem? How, in the proof of the IET, was this situation avoided?
Hint for the solution
1. It is a classical mistake to consider that whenever the derivative vanishes, you have a
maximum or a minimum of the function... find a counter-example and NEVER forget it.
2. One of the assumptions of the IET is failing. In the proof, this situation is avoided because
we can take both h > 0 and h < 0 (being able to only take one of these gives only an inequality
on f ′(c), not an equality to 0).
The EVT ensures the existence of extrema, and the IET helps locating these extremas (they should be
looked for at the zeros of the derivatives). However, both theorems have contradictory assumptions: the
former demands to consider a function on a compact set, whereas the latter requires the domain to be
open. To combine them in order to locate extrema, we thus have to be careful.
Proposition 5.12 (Location of extrema)
Let f : [a, b] → R be continuous on [a, b] and differentiable on (a, b). Then f has minima and
maxima, and these extrema are either a, b or c ∈ (a, b) such that f ′(c) = 0.
Sketch of proof
By the EVT, f has at least one minimum and one maximum. Consider one of these extrema. If
it is not a or b, then it lies in (a, b) and it is an extremum of f on the open interval (a, b). Thus,
the IET can be applied and f ′ vanishes at this extremum.
Exercise 5.13
41
Find the minima and maxima of
1. f(x) = x on [0, 1].
2. f(x) = x3 − x+ 1 on [0, 2].
Solution
1. f ′(x) = 1 never vanishes so by Proposition 5.12, the extrema of f are either 0 or 1. One
of them is a minimum, the other a maximum. Since f(0) = 0 < f(1) = 1, 0 is the minimum
and 1 the maximum.
2. f ′(x) = 3x2 − 1 has zero 1/√3 in [0, 2]. The extrema of f are either 0, 1/√3 or 2. To find
which one is a minimum and which one is a maximum we consider the values of f at these
points: f(0) = 1, f(1/
√
3) = 1/(3
√
3)− 1/√3 + 1 = −2/(3√3) + 1 and f(2) = 8− 2 + 1 = 7.
We clearly have f(1/
√
3) ≤ f(0) ≤ f(2) so the minimum of f is 1/√3 and the maximum is 2.
5.2 Mean value theorem (MVT)
Theorem 5.14 (Rolle’s theorem)
Let a < b. Let f : [a, b] → R be continuous on [a, b] and differentiable on (a, b). If f(a) = f(b)
then there exists c ∈ (a, b) such that f ′(c) = 0.
Proof
By the EVT, f has a minimum and a maximum. If one of them is different from a and b, then
it is an extremum of f on (a, b) and the IET shows that f ′ vanishes at this point, since f is
differentiable on (a, b). Assume now that the minimum and maximum of f are a and b, say a is
the minimum and b the maximum. Then for any x ∈ [a, b], f(a) ≤ f(x) ≤ f(b). But f(a) = f(b)
so this shows that f(x) = f(a) for all x. f is therefore constant on [a, b] so its derivative vanishes
everywhere on (a, b).
Theorem 5.15 (Mean Value Theorem (MVT))
Let a < b and f : [a, b]→ R be continuous on [a, b] and differentiable on (a, b). Then there exists
c ∈ (a, b) such that f ′(c) = f(b)−f(a)b−a .
Hint for the proof
Apply Rolle’s theorem to the difference between f and the straight line going through (a, f(a))
and (b, f(b)), that is to the function g defined by g(x) = f(x)− [f(a) + f(b)−f(a)b−a (x− a)].
As a consequence, something we used in Exercise 4.22 (item 5) in the section on the Contraction Mapping
Theorem.
Corollary 5.16 (Lipschitz estimate)
Let a < b. Let f be continuous on [a, b] and differentiable on (a, b). Assume that f ′ is bounded
on (a, b). Then |f(b)− f(a)| ≤ |b− a| supx∈(a,b) |f ′(x)|.
Hint for the proof
Straightforward from the MVT.
Finally, the MVT has a well-known consequence...
42
Corollary 5.17 (Monotony and sign of the derivative)
If f is differentiable on an open interval I and if f ′ = 0 (respectively f ′ ≥ 0, respectively f ′ ≤ 0)
on I then f is constant (resp. increasing, resp. decreasing) on I.
Hint for the proof
Again, straightforward from the MVT.
Exercise 5.18
Are Corollaries 5.16 and 5.17 true if I is not an interval?
Hint for the solution
Consider I made of the union of two intervals.
5.3 Fundamental theorem of calculus
Proposition 5.19
If f : I 7→ R is continuous and a ∈ I then F (x) = ∫ x
a
f(s) ds is differentiable and F ′ = f on I.
Proof
We write the definition of the derivative, use Chasles’ relation for integrals and the fact that
f(x) = 1h
∫ x+h
x
f(x) ds (in this integral, f(x) is constant...):
F (x+ h)− F (x)
h
− f(x) = 1
h
(∫ x+h
a
f(s) ds−
∫ x
a
f(s) ds
)
− f(x)
= 1
h
∫ x+h
x
f(s) ds− f(x)
= 1
h
∫ x+h
x
(f(s)− f(x)) ds.
Let ε > 0 and take δ from the definition of the continuity of f at x. Assume that |h| ≤ δ. Then
any s between x and x+h belongs to [x− δ, x+ δ] so by choice of δ, |f(s)− f(x)| ≤ ε. This yields∣∣∣∣∣ 1h
∫ x+h
x
(f(s)− f(x)) ds
∣∣∣∣∣ ≤ 1h
∫ x+h
x
|f(s)− f(x)| ds ≤ ε.
Hence, |F (x+h)−F (x)h − f(x)| ≤ ε whenever |h| ≤ δ, which shows that limh→0 F (x+h)−F (x)h = f(x).
Theorem 5.20 (Fundamental theorem of calculus)
Let I be an open interval, and f : I 7→ R be differentiable. Assume that f ′ is continuous on I.
Then, for any a, x ∈ I,
f(x) = f(a) +
∫ x
a
f ′(s) ds.
Proof
Define F (x) = f(a) +
∫ x
a
f ′(s) ds. By Proposition 5.19, since f ′ is continuous F is differentiable
43
on I and F ′ = f ′. In other words, (F − f)′ = 0. Corollary 5.17 then shows that F − f is constant
on the interval I. Since this function vanishes at a, it vanishes everywhere and F = f on I.
5.4 Taylor expansions
Definition 5.21 (n-times continuously differentiable)
Let I be an open interval of R and n ∈ N. f : I 7→ R is n times continuously differentiable if f ′,
f ′′, . . . , f (n) exist and are continuous on I.
f is indefinitely differentiable on I if its derivatives up to any order exist on I. In that case, all
these derivatives are also continuous on I.
Theorem 5.22 (Taylor expansion with integral remainder)
Let I be an open interval that contains 0 and n ∈ N ∪ {0}. Let f : I → R be (n + 1) times
continuously differentiable. Then, for all x ∈ I,
f(x) = f(0) + f ′(0)x+ f
′′(0)
2! x
2 + · · ·+ f
(n)(0)
n! x
n + 1
n!
∫ x
0
f (n+1)(ξ)(x− ξ)n dξ.
Sketch of proof
By induction on n. The base case n = 0 is the Fundamental Theorem of Calculus. The inductive
step is done by using an integration by part to write∫ x
0
f (n+1)(ξ)(x− ξ)n dξ =
[
−f (n+1)(ξ) (x− ξ)
n+1
n+ 1
]x
0
+
∫ x
0
f (n+2)(ξ) (x− ξ)
n+1
n+ 1 dξ.
Corollary 5.23 (Taylor expansion with O remainder)
Under the assumptions of Theorem 5.22, for any x ∈ I,
f(x) = f(0) + f ′(0)x+ f
′′(0)
2! x
2 + · · ·+ f
(n)(0)
n! x
n +Rn+1(x)
where
|Rn+1(x)| ≤ |x|
n+1
(n+ 1)! supξ∈[0,x]
|f (n+1)(ξ)| = O(xn+1).
(Note that ξ ∈ [0, x] means here “ξ between 0 and x” so, if x < 0, it should really be noted
ξ ∈ [x, 0].)
Note 5.24 (Notation O)
The notation “O(xn+1)” stands for “a function of x that is bounded, near x = 0, by C|x|n+1 for
some constant C not depending on x”. In a more precise way: there exists η > 0 and C ∈ R such
that for any x ∈ [−η, η],
|O(xn+1)| ≤ C|x|n+1.
The constants C and η can depend on the function “hidden” in O(xn+1), but not on x. For
example, we could write (prove it!)
x+ x2 = O(x) , x3 cos(x2) = O(x3) , x2(12 + sin(x)) = O(x).
For our purpose, the following simple properties will be useful:
44
1. if k ≥ m, O(xk) +O(xm) = O(xm) (the smallest power remains),
2. O(xn+1) = xnO(x),
3. O(x
n+1)
xn = O(x)→ 0 as x→ 0.
Note 5.25 (Expansion around a generic real number)
The above expansions are written around 0: we express f(x) = f(0 + x) in terms of derivatives of
f at 0, powers of x and a remainder. They can easily be generalised to yield expansions around
an arbitrary point x0 ∈ I. For example, Corollary 5.23 would become:
f(x0 + x) = f(x0) + f ′(x0)x+
f ′′(x0)
2! x
2 + · · ·+ f
(n)(x0)
n! x
n +O(xn+1).
Proof of Corollary 5.23
By the Taylor expansion with integral remainder, the formula holds with
Rn+1(x) =
1
n!
∫ x
0
f (n+1)(ξ)(x− ξ)n dξ.
The upper bound on Rn+1 follows by writing, assuming first that x ≥ 0,∣∣∣∣∫ x
0
f (n+1)(ξ)(x− ξ)n dξ
∣∣∣∣ ≤ ∫ x
0
|f (n+1)(ξ)|(x− ξ)n dξ
≤ sup
ξ∈[0,x]
|f (n+1)(ξ)|
∫ x
0
(x− ξ)n dξ
= sup
ξ∈[0,x]
|f (n+1)(ξ)| x
n+1
n+ 1 . (5.4)
Note that the supremum exists by the EVT because f (n+1) is a continuous function on the compact
set [0, x] ⊂ I. If x < 0, one can easily check that (5.4) is still valid provided that xn+1 is replaced
with |x|n+1 (do not forget that if x < 0 then | ∫ x0 (...)| ≤ ∫ 0x |(...)|). Estimate 5.4 thus shows that
|Rn+1(x)| ≤ |x|
n+1
(n+ 1)! supξ∈[0,x]
|f (n+1)(ξ)|. (5.5)
We now have to prove that Rn+1(x) = O(xn+1). To this purpose, Take η > 0 such that
[−η, η] ⊂ I and notice, as above, that f (n+1) is bounded on the compact set [−η, η]. The num-
ber M = supξ∈[−η,η] |f (n+1)(ξ)| is therefore well defined and, if x ∈ [−η, η], since [0, x] ⊂ [−η, η]
we have supξ∈[0,x] |f (n+1)(ξ)| ≤ M . Estimate (5.5) then shows that, for such an x, |Rn+1(x)| ≤
M
(n+1)! |x|n+1, which concludes the proof that Rn+1(x) = O(xn+1).
Note 5.26 (Usage of Taylor expansions, and classical expansions)
Taylor expansions, usually with n = 0 or n = 1, are extremely helpful to find indeterminate limits
of the kind 0/0 or, after transformation, ∞/∞.
45
To that purpose, the following classical Taylor expansions around 0 prove very useful:
exp(x) = 1 + x+ x
2
2! +
x3
3! + . . .+
xn
n! +O(x
n+1) ,
sin(x) = x− x
3
3! +
x5
5! −
x7
7! + . . .+ (−1)
n x
2n+1
(2n+ 1)! +O(x
2n+3) ,
cos(x) = 1− x
2
2! +
x4
4! −
x6
6! + . . .+ (−1)
n x
2n
(2n)! +O(x
2n+2) ,
1
1− x = 1 + x+ x
2 + . . .+ xn +O(xn+1) ,
ln(1 + x) = x− x
2
2 +
x3
3 −
x4
4 + . . .+ (−1)
n−1x
n
n
+O(xn+1).
Exercise 5.27
1. limx→0 sin(x)x = 1.
2. limx→0 (e
x−1)2
cos(x)−1 = −2.
3. limx→+∞ e
x
x = +∞.
4. limx→+∞ xe−x = 0.
5. limx→+∞ ln(x)x = limx→0 x ln(x) = 0.
Hint for the solution
1. and 2. are purely based on the expansions of sin(x), ex and cos(x) with O remainder.
3. and 4. A Taylor expansion with integral remainder shows that ex ≥ x2/2.
5. Set x = ey to come back to 4.
5.5 Are derivatives continuous?
Consider, for x 6= 0, gn(x) = xn sin(1/x) for n ∈ N, and let g0(x) = sin(1/x). Define gn(0) = 0 for all
n = 0 or n ∈ N. All these functions are continuous and differentiable on R\{0} (why?). What about 0?
We already saw that g0 is not continuous at 0, but that g1 is continuous at 0.
Graph of g0 Graph of g1
And differentiability? g1 is not differentiable at 0: we have
g1(x)− g1(0)
x− 0 = sin
(
1
x
)
= g0(x),
which does not have a limit as x→ 0.
46
Graph of g′1
Let us look at g2. It is differentiable on R and g′2(x) = 2x sin
( 1
x
)− cos ( 1x) if x 6= 0, g′2(0) = 0 (why?).
Graph of g2 Graph of g′2
However, g′2 is not continuous on R (why?). Conclusion: a derivative is not necessarily a continuous
function.
Remark 5.28 The function g2 also shows that it’s not because a the derivative of f has no limit at some
point c that f is not differentiable at c.
Remark 5.29 Things can be much worse than in the previous examples. Appendix C constructs a
function that is continuous everywhere on [0, 1] but differentiable nowhere on (0, 1).
Finally, let us notice a curiosity. Darboux’s Theorem (see Abbott) shows that a derivative always
satisfies the Intermediate Value Property: if f : I → R is differentiable on an open interval I and
a, b lie in I then, for all L between f ′(a) and f ′(b) there exists c ∈ (a, b) such that f ′(c) = L. Hence, the
IVT is not a characterisation of continuous functions...
6 Sequences and series of functions
6.1 Polynomial solutions to differential equations... and beyond
We all know how to solve simple ordinary differential equations (ODE), like
y′(x) = y(x).
The solution to that one is y(x) = Cex where (here and in the rest of this section) C is any real number.
Slightly more complicated:
y′(x) = xy(x).
47
This one involves the exponential of the anti-derivative x2/2 of the coefficient x in front of y(x): the
solutions are y(x) = Cex2/2. Let us modify a bit more by considering a non-homogeneous ODE:
y′(x) = xy(x) + 1. (6.1)
You might remember the notions of variation of parameter or integrating factors, which in the end give
you the solutions to this equation under the form
y(x) = Cex
2/2 + ex
2/2F (x) where F is an anti-derivative of e−x
2/2.
This raises the question of explicitly computing F ... not an easy task at first sight!
All the techniques used to compute the solutions above are based on algorithms that require to compute
anti-derivatives, which might be challenging. Another technique is based on a sort of trial-and-error
approach: we start by assuming that the solution has a certain form, with known functions and un-
known coefficients, and we plug this form in the ODE to identify the coefficients (it’s the technique of
undetermined coefficients). For example, we could try and find a polynomial solution to the ODE
y′(x) = 2y(x)− 2x2 − 3. (6.2)
We are therefore looking for a solution under the form y(x) = anxn+an−1xn−1 + . . .+a1x+a0, for some
unknown n ∈ N and a0, . . . , an in R. Let us plug this form into (6.2):
nanx
n−1 + (n− 1)an−2xn−2 + · · ·+ 3a3x2 + 2a2x+ a1 = 2anxn + 2an−1xn−1 + . . .+ 2a0 − 2x2 − 3.
To find the coefficients ai, we first move all terms to the same side:
− 2anxn + (nan − 2an−1)xn−1 + ((n− 1)an−2 − 2an−2)xn−2 + · · ·
+ (3a3 − 2a2 + 2)x2 + (2a2 − 2a1)x+ a1 − 2a0 + 3 = 0.
For this polynomial to be equal to 0, all its coefficients must be zero. This imposes
−2an = 0
nan − 2an−1 = 0
(n− 1)an−2 − 2an−2 = 0
· · ·
3a3 − 2a2 + 2 = 0
2a2 − 2a1 = 0
a1 − 2a0 + 3 = 0.
Although this was implicit in our writing above, this shows that n ≥ 2 (otherwise, the 3rd equation from
the bottom reduces to 2 = 0, which is not solvable). Starting from the top and working our way down,
we also see that an = an−1 = · · · = a3 = 0, that is, the polynomial y(x) is actually of degree 2. It’s then
a simple matter to solve the remaining equations −2a2 + 2 = 0, 2a2 − a1 = 0 and a1 − 2a0 + 3 = 0 to
find that y(x) = x2 + x+ 2.
Let’s come back to (6.1). We found a solution which requires to compute
∫
e−x
2/2 dx, something that we
do not know how to tackle. Maybe we could apply the method of undetermined coefficients and look for
a polynomial solution? After all, all coefficients in the ODE (6.1) are polynomial, so it would not be so
surprising to find a polynomial solution.
We start from the trial form y(x) = anxn + · · ·+ a0, and we plug this form in (6.1); this forces
nanx
n−1 + · · ·+ a1 = anxn+1 + · · ·+ a0x+ 1.
From there, we move all the terms on one side and gather the powers of x to find that the coefficients
must satisfy
− anxn+1 − an−1xn + (nan − an−2)xn−1 + ((n− 1)an−1 − an−3)xn−2 + · · ·
+ (3a3 − a1)x2 + (2a2 − a0)x+ a1 − 1 = 0.
48
Imposing that each coefficient is equal to zero leads to the system
−an = 0
−an−1 = 0
nan − an−2 = 0
(n− 1)an−1 − an−3 = 0
· · ·
3a3 − a1 = 0
2a2 − a0 = 0
a1 − 1 = 0.
(6.3)
Working our way from the top, we then see that an = 0, an−1 = 0, an−2 = 0, . . . , a1 = 0, a0 = 0 and,
finally, a1 = 1. That is a problem: a1 cannot be simultaneously equal to 0 and equal to 1. So there is no
polynomial solution to the (relatively simple) ODE (6.1).
What happened in the example above? The issue is with the two equations at the top of (6.3). These
two equations are the ones that impose that all coefficients must be equal to zero. Could we remove
them? In a sense, yes, if we assume that y(x) is a ‘polynomial with an infinite degree’, i.e. that instead
of writing
y(x) =
n∑
k=0
akx
k
we write
y(x) =
∞∑
k=0
akx
k. (6.4)
Each value y(x) is therefore computed as the limit of a series, with a varying parameter x inside. Formally,
if we differentiate by writing y′(x) =
∑∞
k=0 kakx
k−1, then the reasoning above can be conducted and
(6.3) becomes an infinite system of equations without the two at the top: kak − ak−2 = 0 for all k ≥ 2,
and a1 − 1 = 0.
This however raises important mathematical questions: even though (6.4) defines a value y(x) (provided
that the series converges), what are the properties of the resulting function y? Can we differentiate it
term by term, as if it were a polynomial?
6.2 Sequences of functions
6.2.1 Pointwise convergence
Definition 6.1 (Pointwise convergence of a sequence of functions)
A sequence (fn)n∈N of functions fn : A→ R converges pointwise on A to a function f : A→ R if:
for all x ∈ A, limn→∞ fn(x) = f(x).
In other words,
for all x ∈ A, for all ε > 0, there exists Nx,ε ∈ N such that, for all n ≥ Nx,ε, |fn(x)−f(x)| ≤ ε.
Technique 6.2 (How to prove a pointwise convergence?)
For any x ∈ A, (fn(x))n∈N is a sequence of real numbers. Proving the convergence of this sequence
can be done by using any of the tools we developed for sequences of real numbers.
Exercise 6.3
1. Prove that the sequence (fn)n∈N of functions defined by fn(x) = xn converges pointwise on
[0, 1] to some f . Give a formula for f and plot it, on the same graphs as the first few (fn)n∈N.
Any comment?
49
2. Let fn be defined by:
n2 Graph of fn
1/n 2/n0 1
Prove that (fn)n∈N converges pointwise to some function f to be determined. What is the
behaviour of the sequence of real numbers (
∫ 1
0 fn(x) dx)n∈N?
Solution
1. Let x ∈ [0, 1]. We have to study the limit of the sequence fn(x) = xn. If x ∈ [0, 1) we
know that xn → 0 as n → ∞ (could you re-prove it?). If x = 1, then xn = 1 converges to
1 as n → ∞. Hence, defining f(x) = 0 if x < 1 and f(1) = 1, we have proved that, for any
x ∈ [0, 1], fn(x) → f(x) as n → ∞. This establishes the pointwise convergence of (fn)n∈N
towards f .
Each fn is obviously continuous (it’s a polynomial) on [0, 1]. However, their limit f is not
continuous. The pointwise convergence of continuous functions (fn)n∈N does not
ensure that their limit function is continuous...
2. Let x ∈ [0, 1]. If x = 0 then fn(x) = 0 for all n, so fn(0)→ 0 as n→∞. If x > 0 then for
n large enough (such that x > 2/n, that is n > 2/x), fn(x) = 0 as x is “after” the tent of fn.
Hence, for any such x, fn(x) → 0 as n → ∞ (we have |fn(x) − 0| = 0 for n large enough).
This shows that (fn)n∈N converges pointwise towards f = 0.
Computing the area of the triangle in the graph of fn shows that
∫ 1
0 fn(x) dx = n. The se-
quence (
∫ 1
0 fn(x) dx)n∈N therefore diverges to +∞. The pointwise convergence of (fn)n∈N
does not ensure the convergence of their integrals...
6.2.2 Uniform convergence
Definition 6.4 (Uniform convergence of a sequence of functions)
A sequence of functions (fn)n∈N, defined on a set A, converges uniformly on A to a function
f : A→ R if:
for all ε > 0, there exists Nε ∈ N such that, for all n ≥ Nε and all x ∈ A, |fn(x)− f(x)| ≤ ε.
Exercise 6.5
Consider again the sequence of functions (fn)n∈N defined by fn(x) = xn on [0, 1]. This sequence
converges pointwise to some f (see Exercise 6.3). For a given ε ∈ (0, 1) and x ∈ [0, 1], find the minimal
Nx,ε that is valid in Definition 6.1. Deduce that, for any a < 1, (fn)n∈N converges uniformly on
[0, a], but that it does not converge uniformly on [0, 1].
50
Sketch of solution
If x = 1 or x = 0, Nx,ε = 1 is obviously valid since fn(x) = f(x) for all n ∈ N. If x < 1,
we want |fn(x) − f(x)| = xn ≤ ε, which is equivalent to n ln(x) ≤ ln(ε) (why?), that is
n ≥ ln(ε)/ ln(x) (why?). This a a necessary and sufficient condition (it is equivalent to
xn ≤ ε) and thus the minimal Nx,ε must be greater than ln(ε)/ ln(x). Hence, Nx,ε is the first
natural number after ln(ε)/ ln(x).
Let a < 1 and ε ∈ (0, 1) . Take Nε = Na,ε . For all x ∈ [0, a] we have ln(ε)/ ln(a) ≥
ln(ε)/ ln(x) (why?) and thus Na,ε ≥ Nx,ε. Hence, for any n ≥ Nε , we have n ≥ Nx,ε and
thus |fn(x)− f(x)| ≤ ε by definition of Nx,ε. The boxed terms show that fn → f uniformly
on [0, a].
Let us now consider the uniform convergence on [0, 1]. If fn → f uniformly on [0, 1], then
for ε = 1/2 we could find N1/2 such that for all x ∈ [0, 1], |fn(x) − f(x)| ≤ 1/2. Then N1/2
is valid in Definition 6.1 for any x ∈ [0, 1) and thus, by definition of Nx,1/2, N1/2 ≥ Nx,1/2.
But Nx,1/2 ≥ ln(1/2)/ ln(x) and thus N1/2 ≥ ln(1/2)/ ln(x), which shows that ln(x) ≤
ln(1/2)/N1/2 (why?). This should be valid for any x ∈ [0, 1). Letting x → 1 show that
0 ≤ − ln(2)/N1/2, which is a contradiction. Hence, (fn)n∈N cannot converge uniformly on
[0, 1].
Note 6.6 (Link between pointwise and uniform convergence)
Uniform convergence implies pointwise convergence (prove it!), but the converse is false in general:
some sequences of functions converge pointwise but not uniformly on some sets (consider Exercise
6.5).
Exercise 6.7
If f, g are functions on A, define the uniform distance between f and g by
d(f, g) = sup
x∈A
|f(x)− g(x)| (6.5)
(this “supremum” is set to +∞ if |f − g| is not bounded). Prove that (fn)n∈N converges uniformly
on A to f if and only if d(fn, f)→ 0 as n→∞.
Hint for the solution
Both direction consist in carefully writing the definition of uniform convergence of (fn)n∈N
and the definition of d(fn, f) → 0. Note also that if you find a real number M such that
|fn(x)−f(x)| ≤M for all x ∈ A, then by definition of the supremum (the least upper bound),
you can claim that supx∈A |fn(x)− f(x)| ≤M .
Theorem 6.8 (Continuity of the uniform limit)
Let (fn)n∈N be a sequence of functions on A and c ∈ A. Assume that each fn is continuous at c.
If (fn)n∈N converges uniformly to f on A then f is also continuous at c.
As a consequence, if each fn is continuous on A, then f is continuous on A.
Proof
We have
|f(x)− f(c)| ≤ |f(x)− fn(x)|+ |fn(x)− fn(c)|+ |fn(c)− f(c)|. (6.6)
Let ε > 0 and take Nε from the definition of uniform convergence of (fn)n∈N to f . We then have,
51
for any x ∈ A, |fNε(x) − f(x)| ≤ ε and |fNε(c) − f(c)| ≤ ε. The function fNε is continuous at c
so we can find δ = δ(Nε) > 0 such that, for x ∈ (c− δ, c+ δ) ∩A, |fNε(x)− fNε(c)| ≤ ε.
Combining the previous estimates in (6.6) with n = Nε shows that |f(x) − f(c)| ≤ 3ε for any
x ∈ (c− δ, c+ δ) ∩A, and the proof of the continuity of f at c is complete.
Technique 6.9 (Studying the convergence of a sequence of functions)
How can you tackle a question of the kind “study the convergence of the sequence (fn)n∈N defined
on A by...”?
1. The first task is to see if (fn)n∈N converges pointwise.
Fix an arbitrary x ∈ A and consider the sequence of real numbers (fn(x))n∈N. Use any of
the tools in Section 2, in particular the ALT, to analyse the convergence of this sequence. If
it does converge for any x ∈ A, call f(x) its limit and you have proved that fn → f pointwise
on A.
If there is at least one x ∈ A such that (fn(x))n∈N diverges, then (fn)n∈N does not converge
pointwise on A.
2. Assuming there is pointwise convergence, you then have to consider if the sequence converges
uniformly (which is stronger).
By Exercise 6.7, you just need to prove that supx∈A |fn(x)− f(x)| → 0 as n → ∞. This is
usually done in the following way.
(a) Write explictly, for a generic x, |fn(x)− f(x)|.
(b) Try to find ωn, as small as possible and not depending on x ∈ A, such that |fn(x)−
f(x)| ≤ ωn.
(c) If you can do that with (ωn)n∈N that converges to 0, then you’ve proved the uniform
convergence of (fn)n∈N to f .
3. If you think that the pointwise convergence is not uniform, several options exist:
• “The hard way”, as in Exercise 6.5: find the smallest possible Nx,ε (which can be risky)
and show that it does not remain bounded as x goes through A, or
• Use the continuity of the uniform limit in a contrapositive way: if each fn is continuous
but f is not continuous, then the convergence is not uniform, or
• Prove that supx∈A |fn(x) − f(x)| does not converge to zero, by finding a lower bound
of this quantity that does not converge to 0 as n→∞.
Exercise 6.10
Prove that the sequence of functions defined by fn(x) = nxn+x converges uniformly on [0, 1].
Solution
We first have to find the pointwise limit. For a fixed x ∈ [0, 1], we have to analyse the
convergence of ( nxn+x )n∈N. In general (for x 6= 0), this is an indeterminate limit so we divide
throughout by n to find
fn(x) =
nx
n+ x =
x
1 + x/n.
The ALT for sequences then shows that fn(x)→ x (since x/n→ 0) as n→∞. We found that
(fn)n∈N converges pointwise on [0, 1] to the function f defined by f(x) = x for all x ∈ [0, 1].
To shows the uniform convergence, we start by writing
fn(x)− f(x) = nx
n+ x − x =
nx− nx− x2
n+ x =
−x2
n+ x.
52
If x ∈ [0, 1] then | − x2| ≤ 1 and n+ x ≥ n, so
|fn(x)− f(x)| ≤ 1
n
=: ωn.
The number ωn does not depend on x ∈ [0, 1], and converges to 0 as n → ∞, which proves
that fn → f uniformly on [0, 1].
Exercise 6.11
Show that the sequence (fn)n∈N defined by fn(x) = nx1+nx converges pointwise but not uniformly on
[0, 1].
Solution
We have fn(x) = x(1/n)+x . If x = 0 then fn(0) = 0 for all n ∈ N. If x 6= 0, then (1/n) + x→
x 6= 0 as n → ∞ and the ALT therefore allows us to write that fn(x) → xx = 1 as n → ∞.
Setting f(0) = 0 and f(x) = 1 if x ∈ (0, 1], we have just proved that fn(x)→ f(x) as n→∞,
for all x ∈ [0, 1]. This establishes the pointwise convergence of (fn)n∈N.
Each fn is continuous by the ACT (note that, for any n ∈ N and x ∈ [0, 1], n + x is never
equal to 0), but f is clearly not continuous. Hence, the convergence of (fn)n∈N to f cannot
be uniform, as this would violate the continuity of the uniform limit.
Theorem 6.12 (Integration and uniform limit)
If (fn)n∈N be a sequences of continuous functions on [a, b] that converges uniformly to f on [a, b].
Then
∫ b
a
fn(x) dx→
∫ b
a
f(x) dx.
Hint for the proof
Write ∣∣∣∣∣
∫ b
a
fn(x) dx−
∫ b
a
f(x) dx
∣∣∣∣∣ =
∣∣∣∣∣
∫ b
a
(fn(x)− f(x)) dx
∣∣∣∣∣ ≤
∫ b
a
|fn(x)− f(x)| dx
and use the fact that, if 0 ≤ g(x) ≤M for all x ∈ [a, b], then 0 ≤ ∫ b
a
g(x) dx ≤M(b− a)
Theorem 6.13 (Uniform Cauchy criterion)
Let (fn)n∈N be a sequence of functions defined on A. (fn)n∈N converges uniformly on A if and only
if for all ε > 0, there exists Nε ∈ N such that, for all n, p ≥ Nε and all x ∈ A, |fn(x)− fp(x)| ≤ ε.
Sketch of proof
We consider only the direction “⇐”, the other one is left as an exercise.
Let x ∈ A. The uniform Cauchy property shows that (fn(x))n∈N is a Cauchy sequence of real
numbers (why?). It therefore converges to some limit, that we denote by f(x). This defines a
function f to which (fn)n∈N converges pointwise.
We then use the OLT to let p → ∞ in the uniform Cauchy property. This shows that, for all
n ≥ Nε and all x ∈ A, |fn(x)− f(x)| ≤ ε. This precisely states that (fn)n∈N converges uniformly
to f .
Remark 6.14 This Cauchy criterion for functions is as useful as the Cauchy criterion for sequences. It
enables in particular the construction of surprising objects, such that functions that are continuous but
nowhere differentiable (Appendix C), or functions on [0, 1] that are continuous, increase from 0 to 1, but
have a zero derivative on a set of “length” 1 in [0, 1] (Appendix D).
53
6.3 Differentiation
Exercise 6.15
Show that (fn)n∈N, defined on [−1, 1] by fn(x) =
√
x2 + 1/n, converges uniformly on [−1, 1] to some
f .
Show that each fn is differentiable and that (f ′n)n∈N converges pointwise on [−1, 1].
Is f differentiable?
Sketch of solution
The ALT, continuity of √ on [0,∞) and the sequential characterisation of continuity show
that fn(x)→
√
x2 = |x| for all x ∈ [−1, 1]. Hence, (fn)n∈N converges point wise to f defined
by f(x) = |x|.
The ADT and the chain rule, combined with the fact that √ is differentiable on (0,∞), show
that each fn is differentiable on [−1, 1] with f ′n(x) = x√x2+1/n .
If x = 0 then f ′n(0) = 0 converges obviously to 0 as n → ∞. If x 6= 0, the ALT and the
sequential characterisation of continuity show that f ′n(x) → x√x2 =
x
|x| as n → ∞. Hence,
(f ′n)n∈N converges pointwise on [−1, 1].
f is clearly not differentiable. Hence, the uniform limit of differentiable functions is not
necessarily differentiable, even if the derivative of the functions converge pointwise.
Theorem 6.16 (Differentiation of the limit)
Let (fn)n∈N be a sequence of continuous functions on an interval [a, b], that are differentiable
on (a, b). Assume that (fn)n∈N converges uniformly to f on [a, b] and that (f ′n)n∈N converges
uniformly to g on (a, b). Then f is differentiable on (a, b) and f ′ = g.
In other words, in that case, (limn→∞ fn)′ = limn→∞ f ′n.
Proof
Let us first consider what we need to prove. To establish that f is differentiable at z ∈ (a, b) with
derivative g(z), we need to prove that the function r(h), defined for non-zero but small h by (see
Proposition 5.3)
r(h) = f(z + h)− f(z)− g(z)h
h
,
tends to 0 as h→ 0. If we extend r to h = 0 by setting r(0) = 0, this amounts to proving that r
is continuous at 0.
We now consider what we know: each fn is differentiable. So if we define, still for non-zero small
h (exactly: 0 < |h| < min(b− z, z − a)),
rn(h) =
fn(z + h)− fn(z)− f ′n(z)h
h
,
we know that limh→0 rn(h) = 0. Extending rn to h = 0 by setting rn(0) = 0, this means that
each function rn is continuous at 0.
The functions r and rn are now defined on the interval (−h0, h0) with h0 = min(b−z, z−a). For a
fixed h in this interval, by pointwise convergence of (fn)n∈N and (f ′n)n∈N towards, respectively, f
and g, we easily see that rn(h)→ r(h) as n→∞. In other words, rn → r pointwise on (−h0, h0).
If we could prove that this convergence is actually uniform, then r would inherit from the continuity
of rn at 0 by the continuity of the uniform limit, which would prove that f is differentiable at
z with f ′(z) = g(z), as required. We now have a clear objective in mind: prove that rn → r
uniformly. How do we do that? Obviously, there is one assumption that we have not used so far,
and which by virtue of Example 6.15 is necessary to prove this result: the uniform convergence of
(f ′n)n∈N.
54
Let us form rn(h)− r(h), for h 6= 0 (this quantity is equal to zero if h = 0 anyway):
rn(h)− r(h) = fn(z + h)− fn(z)− f
′
n(z)h− [f(z + h)− f(z)− g(z)h]
h
=: Fn(h)
h
(where Fn(h) is a shorthand for the numerator). We see that Fn(0) = 0 and thus Fn(h) =
Fn(h) − Fn(0). This is an increment of Fn and if we could use the Lipschitz estimate we could
estimate this increment in terms of h multiplied by the supremum of |F ′n|. Unfortunately, because
we do not know if f is differentiable (that’s what we want to prove!), we cannot see if Fn is
differentiable with respect to h because of the term f(z + h). The trick here is therefore to
replace, in the definition of Fn, the terms f by some fp (and g by f ′p); the idea is that these are
differentiable, and if p is large they are close to f ...
Let us therefore consider, for a fixed z ∈ (a, b) and any h ∈ (−h0, h0),
Fn,p(h) = fn(z + h)− fn(z)− f ′n(z)h− [fp(z + h)− fp(z)− f ′p(z)h].
The function Fn,p is defined and differentiable on (−h0, h0), with
F ′n,p(h) = (f ′n(z + y)− f ′p(z + y))− (f ′n(z)− f ′p(z)).
Since Fn,p(0) = 0, the Lipschitz estimate applied to Fn,p between 0 and h thus gives
|Fn,p(h)− Fn,p(0)| ≤ h sup
y∈(0,h)
|(f ′n(z + y)− f ′p(z + y))− (f ′n(z)− f ′p(z))|
≤ h sup
y∈(0,h)
(|f ′n(z + y)− f ′p(z + y)|+ |f ′n(z)− f ′p(z)|)
≤ 2h sup
w∈(a,b)
|f ′n(w)− f ′p(w)| (6.7)
(note that |f ′n(z + y) − f ′p(z + y)| ≤ supw∈(a,b) |f ′n(w) − f ′p(w)| and |f ′n(z) − f ′p(z)| ≤
supw∈(a,b) |f ′n(w)− f ′p(w)|).
Take now ε > 0. The sequence (f ′n)n∈N converges uniformly on A, and therefore satisfies the
uniform Cauchy criterion on this set. We can thus find Nε ∈ N such that, for n, p ≥ Nε,
sup
w∈(a,b)
|f ′n(w)− f ′p(w)| ≤
ε
2 .
Plugged into (6.7) this shows that, for n, p ≥ Nε,
|Fn,p(h)− Fn,p(0)| ≤ hε.
By pointwise convergence of the sequences (fp)p∈N and (f ′p)p∈N towards f and g, respectively, we
see that Fn,p → Fn pointwise on (−h0, h0) as p → ∞. Hence, passing to the limit p → ∞ (by
using the ALT, the continuity of | · | and the OLT), we see that, for n ≥ Nε and any h ∈ (−h0, h0),
|Fn(h)− Fn(0)| ≤ hε.
In other words, for any n ≥ Nε and any h ∈ (−h0, h0) (note that Nε was chosen independently of
h, this is the key point), we have
|rn(h)− r(h)| ≤ ε.
This precisely proves that rn → r uniformly on (−h0, h0), which is what, based on our initial
analysis, we needed to establish in order to prove the theorem.
Exercise 6.17
Prove that (fn)n∈N defined on R by fn(x) = sin(nx)n converges uniformly on R to some f , that f is
differentiable, but that (f ′n)n∈N does not converge pointwise on R. Is that a contradiction with the
differentiation of the limit?
55
Sketch of solution
We have |fn(x)| ≤ 1/n for all x ∈ R and thus fn → 0 uniformly (see green box in page 52).
f ′n(x) = cos(nx) does not converge for many x (e.g. x = pi, since then f ′n(x) = (−1)n). Of
course, this is not a contradiction with the theorem...
6.4 Series of functions
Series are merely special kinds of sequences. Given a sequence (fn)n∈N of functions defined on A, we
define
Sn(x) = f1(x) + · · ·+ fn(x) =
n∑
k=1
fk(x).
The sequence of functions (Sn)n∈N is the series
∑
fn(x). This series converges pointwise (resp. uniformly)
to a function f if the sequence (Sn)n∈N converges pointwise (resp. uniformly) to f . We then write
f =
∑∞
n=1 fn. All previous theorems thus apply in the context of series.
Theorem 6.18 (Some translations to series of preceding theorems...)
1. Let (fn)n∈N be a sequence of continuous functions on A. If the series
∑
fn converges
uniformly then the limit function
∑∞
n=1 fn is continuous.
2. Let (fn)n∈N be a sequence of continuous functions on [a, b], that are differentiable on (a, b).
If the series
∑
fn and
∑
f ′n converge uniformly on (a, b), then the limit function
∑∞
n=1 fn
is differentiable and (
∑∞
n=1 fn)′ =
∑∞
n=1 f
′
n.
Hint for the proof
The proof is straightforward using the continuity of the uniform limit and differentiation of the
limit, but do not forget to use the ACT or the ADT when needed.
Theorem 6.19 (Uniform Cauchy Criterion for series)
A series of functions
∑
fn converges uniformly on A if and only if, for all ε > 0 there exists Nε ∈ N
such that, for all n ≥ p ≥ Nε and all x ∈ A,∣∣∣∣∣∣
n∑
k=p
fk(x)
∣∣∣∣∣∣ = |fp(x) + fp+1(x) + · · ·+ fn(x)| ≤ ε.
Hint for the proof
Again, trivial proof from the uniform Cauchy criterion, if you remember how we proved the Cauchy
criterion for series of real numbers from the Cauchy criterion for sequences of real numbers.
Corollary 6.20 (Weierstraß M-test)
Let (fn)n∈N be a sequence of functions defined on A and (Mn)n∈N be real numbers such that, for
all x ∈ A, |fn(x)| ≤Mn. If the series of positive real numbers
∑
Mn converges then the series of
functions
∑
fn converges uniformly on A.
Proof
We aim at applying the uniform Cauchy criterion for series. For all x ∈ A and n ≥ p, the triangle
56
inequality yields
|fp(x) + · · ·+ fn(x)| ≤ |fp(x)|+ · · ·+ |fn(x)| ≤Mp + · · ·+Mn.
Since
∑
Mn converges, it satisfies the Cauchy criterion for series of real numbers. Hence, for any
ε > 0, we can find Nε ∈ N such that, if n ≥ p ≥ Nε, Mp + · · ·+Mn ≤ ε. This shows that, for any
x ∈ A and any n ≥ p ≥ Nε, |fp(x) + · · ·+ fn(x)| ≤ ε. Hence, by the uniform Cauchy criterion for
series,
∑
fn converges uniformly on A.
Technique 6.21 (How to prove the uniform convergence of a series of functions?)
The Weierstraß M-test is your best buddy. You essentially have to find a nice upper bound Mn
of |fn| for each n. To show that
∑
Mn converges, use any of the tests in Section 3 for series of
real numbers.
Exercise 6.22
Prove that
∑ cos(nx+arctan(ln(n+x2)))
n2 converges uniformly on R.
Solution
Let fn(x) = cos(nx+arctan(ln(n+x
2)))
n2 . Since | cos | ≤ 1, we have |fn(x)| ≤ 1/n2 for all x ∈ R.
The series of real numbers
∑
1/n2 converges (this is a p-series with p = 2 > 1) so by
the Weierstraß M-test,
∑
fn converges uniformly on R. In particular, the limit function
f(x) =
∑
n≥1
cos(nx+arctan(ln(n+x2)))
n2 is defined for any x ∈ R.
Technique 6.23 (How to prove that the limit of a series is continuous/differentiable?)
It all comes from Theorem 6.18. Let us consider a series of functions
∑
fn.
1. Establish first the uniform convergence of
∑
fn, probably through the Weierstraß M-test as
we just saw (so, find a “nice” upper bound Mn of |fn|).
2. If each fn is continuous, then the first item in Theorem 6.18 shows that f =
∑
n≥0 fn is also
continuous.
If the continuity was the only property you wanted to establish, you can stop here. If you also
want to prove that f is differentiable, continue...
3. Check if each fn is differentiable, then establish the uniform convergence of the series of
derivatives
∑
f ′n. This is again probably easiest to do using the Weiertraß M-test, by finding
a “nice” upper bound Rn of |f ′n| (such that
∑
Rn converges).
4. The second item in Theorem 6.18 then shows that f is differentiable with f ′ =
∑
n≥0 f
′
n.
Note that the uniform convergences do not have to be done on the entire domain over which we
want to establish the continuity/differentiability of f . If these convergences are uniform on any
compact set in this domain, then the continuity/differentiability of f is valid on the entire domain.
Exercise 6.24
Prove that
∑ cos(nx2)
n3 converges uniformly on R and that the limit is differentiable on R.
Solution
Let fn(x) = cos(nx
2)
n3 . Inspection and the ADT shows that this function is defined and
57
continuous on R. Since | cos | ≤ 1, we have |fn(x)| ≤ 1/n2 for all x ∈ R. The series of real
numbers
∑
1/n3 converges (this is a p-series with p = 3 > 1) so by the Weierstraß M-test,∑
fn converges uniformly on R. Since each fn is continuous, f :=
∑∞
n=1 fn is also continuous
on R.
Let us now turn to the differentiability of f . We have f ′n(x) = − 2nx sin(nx
2)
n3 = − 2x sin(nx
2)
n2 .
We would like to apply the Weierstraß M-test to
∑
f ′n. We can write |f ′n(x)| ≤ 2|x|n2 . The
denominator is good (because
∑
1/n2 converges) but this quantity is not a fixed number
independent of x... and is not bounded uniformly with respect to x ∈ R! However, for any A >
0, if x ∈ [−A,A] then |f ′n(x)| ≤ 2An2 = Rn. Since
∑
Rn converges (it’s 2A times the convergent
series
∑
1/n2), the Weierstraß M-test shows that
∑
f ′n converges uniformly on [−A,A]. By
Theorem 6.18, f is therefore differentiable on (−A,A), with f ′(x) = −2x∑n≥1 sin(nx2)n2 . Since
this is true for any A > 0, this shows that f is indeed differentiable over R.
6.5 Power series
Power series are specific types of series of functions
∑
fn, in which fn(x) = anxn for a constant number
an (1). Note that power series actually start at n = 0, not n = 1 as before.
Theorem 6.25 (Convergence of Power Series)
Assume that the power series
∑
anx
n converges at some c 6= 0. Then:
1. For any x ∈ R satisfying |x| < |c|, ∑ anxn converges absolutely.
2. For any 0 ≤ r < |c|, ∑ anxn converges uniformly on the interval [−r, r].
Remark 6.26 We however cannot ensure that the series converges at x such that |x| = |c|. Consider∑
xn
n , which converges at c = −1 (alternating series) but not at x = 1, despite |x| = |c|.
Proof of Theorem 6.25:
1. We have
|anxn| = |ancn| ×
( |x|
|c|
)n
.
Since
∑
anc
n converges, the n-th term test shows that ancn → 0 as n→∞. Hence, (ancn)n∈N is
bounded by Proposition 2.14. Let M such that |ancn| ≤M for all n ∈ N. We infer
|anxn| ≤M
( |x|
|c|
)n
. (6.8)
Since |x|/|c| < 1, the series of real numbers∑M(|x|/|c|)n converges (it is a multiple of a geometric
series with ratio in [0, 1)). Hence, by comparison test,
∑ |anxn| converges, and thus ∑ anxn
converges absolutely.
2. Let x ∈ [−r, r]. Coming back to (6.8) we have
|anxn| ≤M
(
r
|c|
)n
=: Mn.
As above, the series
∑
Mn converges since r/|c| < 1. Hence, the Weierstraß M-test shows that∑
anx
n converges uniformly on [−r, r].
As a consequence, the set I ⊂ R on which a given power series ∑ anxn converges must have a very
specific form, namely an interval centered at 0, with or without some of its endpoints: I = (−R,R),
1We could also consider functions of the form fn(x) = an(x − x0)n, but a change of variable x¯ = x − x0 leads back to
anx¯n.
58
I = [−R,R), I = (−R,R] or I = [−R,R]. I is the interval of convergence, and R can be found the
following way.
Definition 6.27 (Radius of convergence)
The radius of convergence of a power series
∑
anx
n is
R = sup{c ≥ 0 :
∑
anc
n converges}.
If the set whose supremum is considered is not bounded, we set R = +∞. Otherwise, R is a
number in [0,∞).
A power series with radius of convergence R has the following behaviour:
• for |x| < R, it converges absolutely.
• for |x| = R, we cannot say...
• for |x| > R, it diverges.
• it converges uniformly on any interval [−r, r] with 0 ≤ r < R. In particular, the sum ∑∞n=0 anxn is
continuous on (−R,R).
Lemma 6.28
Assume that (an)n∈N is a sequence of non-zero real numbers such that ` = limn→∞ |an+1||an| exists.
Then
∑
anx
n has radius of convergence R = 1/` if ` > 0, or R = +∞ if ` = 0.
Proof
Let x ∈ R, x 6= 0, and let us analyse the convergence of ∑ anxn by using the ratio test. We have,
by the ALT,
|an+1xn+1|
|anxn| =
|an+1|
|an| |x| → `|x| as n→∞.
Hence, by the ratio test, if `|x| < 1 the series ∑ anxn converges, and if `|x| > 1 it diverges.
Assume that ` 6= 0. If x ∈ (−1/`, 1/`), then `|x| < 1 and therefore ∑ anxn converges. If |x| > 1/`
then `|x| > 1 and ∑ anxn diverges. Denoting by I the interval of convergence, this shows that
(−1/`, 1/`) ⊂ I and I ⊂ [−1/`, 1/`] (no x outside this latter interval can be a point of convergence).
Since I has endpoints ±R by definition, this proves that R = 1/`.
If ` = 0, then |x|` < 1 for all x ∈ R and therefore ∑ anxn converges for any real number x. Hence,
I = R and R = +∞.
Technique 6.29 (How to find the radius and interval of convergence of
∑
anx
n?)
Lemma 6.28 is in practice your best friend. Check if ( |an+1||an| )n∈N converges. If it does, write
R = 1/`. If it does not, you have to find another way...
In that latter case, you can simply remember that the radius of convergence can be defined as
R = sup{x ≥ 0 ;
∑
anx
n converges}
(with this supremum being +∞ if the corresponding set is not bounded above). Finding R
therefore consists in finding the largest possible x for which
∑
anx
n converges. This boils down to
analysing the convergence and divergence of a series of real numbers, depending on the positions
of x. Consider using any of the tests, other than the ratio test, of Section 3.
Once R is known, the interval of convergence I is easy to find. Just check if the series converges
at x = −R and/or x = R, and it will tell you if I = [−R,R], I = (−R,R], I = [−R,R) or
I = (−R,R) (include in I any endpoints ±R where the series converges).
59
Exercise 6.30
Prove that
1.
∑
xn
n2 has interval of convergence [−1, 1].
2.
∑
xn
n has interval of convergence [−1, 1).
3.
∑ (−x)n
n has interval of convergence (−1, 1].
4.
∑
xn has interval of convergence (−1, 1).
Hint for the solution
For each, Lemma 6.28 applies and gives R = 1. You then need to check if the given series
converge at ±1. This can be done by using AST, p-series, the classical combination of absolute
convergence test and comparison test, or the n-th term test.
Theorem 6.31 (Differentiation of power series)
A power series
∑
anx
n and the power series of its derivatives
∑
nanx
n−1 have same radius of
convergence R.
As a consequence,
∑
anx
n is differentiable on (−R,R) and (∑∞n=0 anxn)′ = ∑∞n=1 nanxn−1.
Proof
Let R the radius of the series and R′ the radius of the series of derivatives, and let us start by
proving that R′ ≤ R. Let 0 < r < R′. For n ≥ 1 we have
|nanrn−1| ≥ |anr
n|
|r| .
By choice of r and definition of R′,
∑
nanr
n−1 converges absolutely so, by comparison test and
absolute convergence test,
∑
anr
n converges. This shows that r ≤ R. Since this is true for any
r < R′, letting r → R′ shows that R′ ≤ R.
We now want to prove that R′ ≥ R, which will establish that R′ = R. Take r < R and s ∈ (r,R).
Then
|nanrn−1| ≤ n
s
(r
s
)n−1
|ansn|.
Since r/s < 1, the sequence (n( rs )n−1) is bounded (it converges to 0), say byM . Thus, |nanrn−1| ≤
M |ansn|. By choice of s,
∑
ans
n converges absolutely, which proves (still using comparison test
and absolute convergence test) that
∑
nanr
n−1 converges. Hence, r ≤ R′ and since this is true
for any r < R this shows that R ≤ R′ as required.
Finally, let us prove that
∑
anx
n is differentiable on (−R,R). For any r < R, ∑ anxn and∑
nanx
n−1 converge uniformly on [−r, r] by Theorem 6.25. Hence, Theorem 6.18 shows that∑
anx
n is differentiable on (−r, r) with (∑ anxn)′ = ∑nanxn−1 on this interval. Since this is
true for any r < R, this shows that the same differentiability and relation holds on (−R,R).
Corollary 6.32 (Power series are indefinitely differentiable)
If a power series
∑
anx
n has radius of convergence R, then its limit
∑
n≥0 anx
n is indefinitely
differentiable on (−R,R), and it can be differentiated term-by-term, that is, for any natural
number k,
dk
dxk
∑
n≥0
anx
n
 = ∑
n≥0
an
dk
dxk
xn =
∑
n≥0
n(n− 1) · · · (n− k + 1)anxn−k.
60
Hint for the proof
Argue by induction on k, using Theorem 6.31 in the inductive step.
Exercise 6.33
The famous exponential function properly defined...
Let
exp(x) =
∞∑
n=0
xn
n! .
1. Prove that exp is well defined and indefinitely differentiable on R.
2. Show that exp′ = exp.
3. Show that, for all x, exp(x) exp(−x) = 1.
4. Show that, for all x, y, exp(x+ y) = exp(x) exp(y).
Solution
1. exp is defined as the limit of a power series. We therefore have to check the convergence
properties of this power series. We have
1/(n+ 1)!
1/n! =
1
n+ 1 → 0 as n→∞
so Lemma 6.28 shows that the radius of convergence of
∑
xn
n! is +∞. Hence, this series is
defined on R.
2. By Theorem 6.31 we can differentiate term by term on (−∞,∞) = R to find
exp′(x) =
∞∑
n=0
nxn−1
n! .
The term n = 0 can be removed as it vanishes and, re-indexing, we find
exp′(x) =
∞∑
n=1
xn−1
(n− 1)! =
∞∑
k=0
xk
k! = exp(x).
3. f(x) = exp(x) exp(−x) is differentiable on R by the ADT and the chain rule, and we have
f ′(x) = exp(x) exp(−x) + exp(x) × (− exp(−x)) = 0. Corollary 5.17 then shows that f is
constant on R. We clearly have exp(0) = exp(−0) = 1, so f(0) = 1 and thus f(x) = 1 for all
x ∈ R.
4. Fix y and set g(x) = exp(x + y) and f(x) = exp(x) exp(y). f and g are differentiable on
R with f ′ = f and g′ = g. If x ≥ 0 then from its definition we see that exp(x) ≥ 1 > 0.
Hence, exp(−x) = 1/ exp(x) > 0, that is, exp is always strictly positive. This therefore also
holds for g and the ADT enable us to write ( fg )′ =
gf ′−fg′
g2 =
gf−fg
g2 = 0 on R. Hence
f
g is
constant on R. Since it takes value 1 at x = 0, we have fg = 1 everywhere, which proves that
exp(x) exp(y) = exp(x+ y).
Exercise 6.34
Find a solution of the ODE (6.1) under the form of a power series y(x) =
∑
n≥0 anx
n. Compute the
radius of convergence of this series (Hint: Exercise 5 in Problem Set 12 can be helpful).
Partial solution
Plugging the formula for y into the ODE and identifying the powers of x lead to the relations:
61
a1 = 1, 2a2 = a0, 3a3 = a1, . . . , kak = ak−2 (for k ≥ 2). No condition is imposed on a0.
From this, we can find a formula for the generic coefficient an of the power series (hint: find
formulas for a2n and for a2n+1). The radius of convergence follows easily from the recurrence
relation and from Exercise 5 in Problem Set 12.
6.5.1 Taylor series
Can we write a power series with two different sets of coefficients? In other words, can we have∑∞
n=0 anx
n =
∑∞
n=0 bnx
n on some non-trivial interval, with (an)n∈N and (bn)n∈N different sequences?
The answer is no. From the sum f(x) =
∑∞
n=0 anx
n of the power series we can find back all the coefficients
an by differentiating. We have f(0) = a0, f ′(0) = a1, f ′′(0) = 2a2, f ′′′(0) = 3!a3 and, more generally,
an =
f (n)(0)
n!
(recall that f is indefinitely differentiable on the interior of the interval of convergence). This formula
shows how to construct (an)n∈N from the sum f of the series.
This leads to another question: can any indefinitely differentiable function f be written as a power series?
It depends on the behaviour of its Taylor remainder...
As a consequence of Corollary 5.23 we have, if f is indefinitely differentiable on (−r, r), for any x in this
interval, ∣∣∣∣∣f(x)−
n−1∑
k=0
f (k)(0)
k! x
k
∣∣∣∣∣ ≤ sup(−r,r) |f (n)|r
n
n! . (6.9)
Hence, if sup(−r,r)
|f(n)|rn
n! → 0 as n→∞, then the Taylor series corresponding to f converges to f and
f can be represented as a power series on (−r, r).
Example 6.35
1. exp on R, of course.
2. 11−x on (−1, 1), with 11−x =
∑∞
n=0 x
n.
3. ln(1− x) on (−1, 1), with ln(1− x) = −∑∞n=1 xnn .
4. cos and sin on R (indeed, all their derivatives are bounded by 1 on R). We have
cos(x) =
∞∑
n=0
(−1)n x
2n
(2n)! , sin(x) =
∞∑
n=0
(−1)n x
2n+1
(2n+ 1)! .
5. Some functions cannot be represented as power series. A counter-example:
g(x) =
{
e−1/x
2 if x 6= 0,
0 if x = 0
(plot this function, and zoom over and over around x = 0...). It can be proved that g is indefi-
nitely differentiable on R with g(n)(0) = 0 for all n. Hence, if we had g(x) =
∑∞
n=0
g(n)(0)
n! x
n on
some (−r, r), we would have g = 0 on (−r, r), which is not the case.
Note however that the power series defined from g (with coefficients g
(n)(0)
n! = 0) has an infinite
radius of convergence!
62
A Appendix: elements of mathematical proof
Here, “theorem” is a generic word used in lieu of “theorem, lemma, proposition, etc.”
A.1 Distinguishing hypotheses from conclusions
A theorem states that if some hypotheses hold true, then so do some conclusions. To understand the
theorem, its usage and its proof, it is essential to be able to distinguish the hypotheses and the conclusions
in its statement.
Example A.1
Consider the AoC: “Every nonempty set of real numbers that is bounded above has a least upper
bound”.
This could be reworded as: “if A is a non-empty set of real numbers and A is bounded above, then
A has a least upper bound”. This form makes clearer the hypotheses and conclusions of the result:
• Hypotheses:
1. A is a set of real numbers,
2. A is not empty,
3. A is bounded above.
• Conclusion:
1. A has a least upper bound.
The Axiom of Completeness states that if all hypotheses are satisfied, then the conclusion holds.
Clearly writing down the hypotheses on one side and the conclusions on the other side, as done here, is
a very good exercise when trying to understand a theorem.
A.2 Table of truth of logical/mathematical statements
Any statement in mathematics has one of the two following values: true or false. In other words, a
given statement A is true, or its negation ¬A is true (excluded middle).
Statements can be combined to create new statements. For example, if A and B are mathematical
statements, then “if A then B” (or A⇒ B in logical form) is also a mathematical statement. The truth
value of a statement obtained by combining two (or more) others can be established by looking at the
value of each statement, and considering the following tables.
B true B false
A true TRUE FALSE
A false TRUE TRUE
B true B false
A true TRUE FALSE
A false FALSE FALSE
B true B false
A true TRUE TRUE
A false TRUE FALSE
Value of “A⇒ B” Value of “A & B” Value of “A or B”
Example A.2
1. Write the table of value of (¬B) ⇒ (¬A) and deduce the principle of proof by contrapositive:
to prove “A⇒ B”, it suffices to prove “(¬B)⇒ (¬A)”.
2. What is the negation of A ⇒ B (i.e. a statement that is true whenever A ⇒ B is false, and
vice-versa)?
A.3 Principles of a proof
A proof of a theorem consists in establishing that if all the theorem’s hypotheses hold, then all the
conclusions hold. If A is the set of hypotheses and B the set of conclusions, it therefore consists in
showing that “A⇒ B” is always true, whatever the truth value of A and B.
Consider the value table of “A⇒ B” in Section A.2. In the case where B is true, then “A⇒ B” is always
true, so we have nothing to do here. This is why starting a proof by assuming that the conclusions hold is
never useful, and never leads to anything interesting. In the case where A holds true, then for “A⇒ B”
to be true we must ensure that B also holds true (is not false). This is actually the only work we have
to do to prove “A⇒ B”: assume that A is true, and prove from there that B is also true.
63
This is why it’s essential to be able to distinguish assumptions from conclusions, in any mathematical
statement.
Example A.3
When proving the following result: “if f is a increasing function R → R then, for any y ∈ R,
{x ∈ R : f(x) = y} is an interval”, you should not try to prove that f is increasing, and you should
not start by assuming that, for any y ∈ R, {x ∈ R : f(x) = y} is an interval.
The proof by contrapositive, justified by Example A.2, also gives you another way to prove “A ⇒ B”:
assume that B is false, and prove that A must be false (note that, even here, we do not start by assuming
that the conclusions hold... we actually assume that they do not hold!).
Note A.4 (On writing mathematics)
Proofs, as results, do not necessarily need to be written using many mathematical symbols and
notations. In particular, utilising words and grammatically correct English sentences in a proof
tends to make it more legible and easier to understand. Proper written communication of math-
ematics do not use symbols (such as ⇒ or ∀) in English sentences; such symbols are restricted to
statements written in logical form.
As already mentioned, the textbook stipulates that, in a proof, each step must follow logically from
previous steps or be justified by some other agreed-upon set of facts. This describe the two “authorised”
ways of going from one step to another in a mathematical proof.
• Logical deduction: this must rely on on very simple and straightforward arguments. When you
pass from one elementary step of a proof to the other by using “deduction”, if the reader/listener
asks “why is it true?” then this means that more arguments (intermediate steps) must be used.
• Justified by agreed-upon set of facts: these are previously proved results, mostly theorems from
the lectures (or previous units, etc.). When using such results, you need to clearly (i) check that
the hypotheses hold in the context of your current reasoning, and (ii) detail the conclusions
in this context.
You are also expected to state a precise name/reference, and possibly also a source, of the result
you use (e.g. “Theorem 4.3 in Abbott”, “Lemma 4.7 in the lecture notes”, “Intermediate Value
Theorem from the lectures”, etc.)
Exercise A.5
Prove that A = {1− 1n : n = 1, 2, . . .} has a least upper bound.
Solution
Let us first highlight the hypothesis and the conclusion.
• Hypothesis: A is the particular set described by {1− 1n : n = 1, 2, . . .}.
• Conclusion: A has a least upper bound (l.u.b).
Now for a solution. We have a previously established result which precisely has a conclusion “some
set has a l.u.b”, so we should look into invoking this result (the Axiom of Completeness). We need
to check the hypotheses:
1. A is a set of real numbers (ok: 1− 1/n is real for any n),
2. it is not empty (ok: it contains 1− 1/1 = 0), and
3. it is bounded above (ok: by 1 since for any n, 1− 1/n ≤ 1... could you prove that?).
Then the conclusion of the Axiom of Completeness holds: A has a l.u.b... hurray, the proof is
done!
Exercise A.6
64
Prove that the function f(x) = x3 is increasing on R.
Solution... and discussion
Very classical: compute the derivative f ′(x) = 3x2. It is positive so the function is increasing.
But (i) why is the derivative like that?, (ii) why is it positive?, and (iii) why does that imply the
conclusion?
(i) “from the formula for derivatives of monomials” [e.g. Theorem 5.2.4 in Abbott],
(ii) “we know that x2 is nonnegative for all x” [if again we ask why: “because if x ≥ 0 then
x×x ≥ 0 by the established rules for signs and inequalities in R, and if x ≤ 0 then x×x ≥ 0
by the same rules], “so 3x2 is also nonnegative” [same rules],
(iii) we need to invoke (and thus check the hypotheses/conclusion) a theorem (from the lectures,
previous units, books, etc.). Here, we need the result of Exercise 5.3.7 (a) in Abbott: “If f
is differentiable on (a, b) and f ′(x) ≥ 0 for all x ∈ (a, b), then f is increasing on (a, b)” (see
also Remark 3.17 (2) in the Lecture Notes and Exercise 4 in Problem Set 9). In our case we
can take (a, b) = (−∞,∞) and f(x) = x3, which is differentiable and satisfies f ′(x) ≥ 0.
Note A.7 (Level of precision)
In the solution above, the arguments are very detailed and pedantic. This was for the sake
of presentation. In practice, many usual properties of real numbers can be directly used, such
as the fact that x2 is positive for any x. Such properties may also be dubbed “obvious” but,
from a mathematical perspective, it is important to understand that they are not just intuitive
or “natural”: they have been rigorously established using the construction of real numbers and
their associated order. Keep in mind the assumed background knowledge for the unit. Trivial
calculations and minor concepts need not be explained fully (but you should be able to prove
them if asked to), but everything “new” should be.
Recall the following citation from the textbook: [A proof ] is a set of carefully crafted directions, which,
when followed, should leave the reader absolutely convinced of the truth of the proposition in question.
How can you make sure that your proof convinced your reader/listener? Well, you cannot unless you
have some sort of feedback, but the first reader/listener to convince is yourself. If you are not able to
clearly explain how, in a proof, you deduce one line from the other then you need to work more and find
a justification. Then, if you are convinced yourself, you need to convince someone else. This can only be
done through practice and feedback (from your colleagues, tutor, etc.). It is an invaluable exercise to
present proofs to your fellow students.
A.4 Some classical techniques of proof
These are general techniques at your disposal.
• Direct proof : start by assuming that the hypotheses hold and try to infer the conclusions (see
e.g. the proof in Exercise A.6).
• Proof by contradiction, or contrapositive proof: assume that the hypotheses hold and that the
conclusions do not hold, and try to find a mathematical impossibility (see e.g. the proof of Theorem
1.5).
• Proof by induction: when you want to prove that a property depending on an integer n holds
for any natural numbers, use the following:
1. clearly state the property P (n), depending on n, which you want to prove for any integer, and
explain that you will prove that by induction on n,
2. prove P (1) (base case),
3. prove that, for any integer n, if P (n) holds (induction hypothesis) then P (n + 1) holds (i.e.
prove P (n)⇒ P (n+ 1)) (inductive step).
65
An alternate (equivalent) form of this is:
1. let S the set of all n ∈ N such that P (n) is true,
2. prove that 1 belongs to S (base case),
3. prove that, for any integer n, if n belongs to S (induction hypothesis) then n+ 1 belongs to S
(inductive step).
If you can do that, then the induction principle tells you that P (n) holds for any n.
Exercise A.8
Prove by induction that, for any natural number n, 1 + 2 + · · ·+ n = n(n+1)2 .
Solution
The property to prove is P (n) :“1 + 2 + · · ·+ n = n(n+1)2 ”.
Base case: for n = 1, the property is “1 = 1×22 ”, which is clearly true.
Inductive step: let us take n ∈ N and assume that P (n) holds true. Then 1 + 2 + · · · + n =
n(n+1)
2 . To prove P (n+ 1), we start with
1 + 2 + · · ·+ (n+ 1) = 1 + 2 + · · ·+ n+ (n+ 1).
We then use P (n) to infer
1 + 2 + · · ·+ (n+ 1) = n(n+ 1)2 + (n+ 1)
= n(n+ 1) + 2(n+ 1)2 =
(n+ 1)(n+ 2)
2
which proves that P (n+ 1) holds, as required.
Some tips and tricks may also help you get ideas for proofs (the order of the tips is meaningless):
• Tip 1: Look at the hypotheses and the conclusions of the result you want to prove. Do they look the
same as some previously established result? If so, you may want to try and invoke this previously
established result during your proof (we did that in Exercise A.5), or you may try to look at the
proof of this previously established result and see if it can be adapted to the result you want to
prove.
• Tip 2: Without necessarily having the same hypotheses or conclusions, does the result look like
something you have already seen elsewhere (another result, a previous exercise, etc.)? Try and see
if the technique used then can be applied.
• Tip 3: Translate, in different terms, the hypotheses and conclusions to see if that may help you
start.
• Tip 4: See if you can prove a simpler or special case of the result, to get insight into the proof of
the whole result,
• Tip 5: Use rough drawings/diagrams to help you get an intuition on the problem.
(see also the book “How to solve it” by G. Polya, published by Princeton University Press, Princeton NJ,
2004, 253p).
A.5 What about intuition?
Intuition is at the core of finding a proof (cf. previous tips), but an intuitive reasoning by itself cannot
be considered as a rigorous proof. And, sometimes, intuition is misleading.
Example A.9
66
Consider the following reasoning:
The series
∑∞
n=2
1
n log(n) converges. Indeed, its general term
1
n log(n) tends to 0 quickly than
1
n (since
log(n) → ∞ as n → ∞). We know that ∑∞n=2 1n diverges, but since 1n log(n) tends to 0 faster than
1
n we see that
∑∞
n=2
1
n log(n) converges.
Unfortunately, this intuitive reasoning is false as the series diverges – compare it with the integral
of 1x log(x) , whose anti-derivative is log(log(x)). The mistake in the preceding intuitive reasoning is
that series whose general term tend to 0 faster than 1n do not necessarily converge.
This issue of misleading intuition is also illustrated in some of the “paradoxes” presented at the start
of each week.
Example A.10
Consider the following reasoning:
The set A = (0, 2) does not have a maximum, for its maximum would necessarily be 2, which does
not belong to the set” (recall that a maximum is a point in the set which is larger than or equal to
all the other points in the set).
Good intuition, this time, but which needs to be formalised if we are to get a proper rigorous proof.
Proof that A = (0, 2) does not have a maximum
We prove the result by contradiction. Assume that A has a maximum m. Then, by definition,
m must be larger than or equal to 2; otherwise we have m < 2 and, taking the middle point
x = m+22 between m and 2 we find x ∈ A (can you check this?) such that x > m (can you
also check this?); this is a contradiction with the fact that m is greater than or equal to any
point in A. Hence we have m ≥ 2, but since m ∈ A (definition of the maximum), we must
have m < 2 (definition of A), and we end up with our final contradiction. Conclusion: A
cannot have a maximum...
Ideally, the difference between an intuitive proof and the corresponding formal proof should only lie within
technical details and computations not given in the intuitive form of the proof. But all main arguments,
including reference to key theorems used in the proof, should already be given in the intuitive reasoning.
B Appendix: fixed points of Markov chains ♦NE♦
A Markov chain is a process that evolves through a sequence of stages. At any stage the process occupies
one of say m possible states and the state to which the process will evolve next depends only on the current
state and not on the previous history. The process is modelled by an m by m column stochastic matrix A
called the transition probability matrix of the Markov chain. A column vector x = (x1, . . . , xm) ∈ Rm
is called stochastic if each component xi belongs to [0, 1] and if x1 + x2 + · · · + xm = 1 (the reason
for denoting components with superscripts instead of indices will become clear later). A matrix is called
column stochastic if each column is stochastic.
Of fundamental importance in the theory of Markov chains is that for every square column stochastic
matrix A there is a stochastic vector p such that Ap = p. For regular transition probability matrices
(meaning some power of A has no zero entries) this vector p is unique and is called the steady state
vector. Its components are the long-term probabilities that the process is in the corresponding states.
In the language of linear algebra, p is a stochastic eigenvector of A corresponding to the eigenvalue 1.
See MTH2021 for more details and practical examples of Markov chains and limit distributions.
The existence of p is frequently used but rarely proved in mathematics courses. A number of proofs are
known, most of them very long. For example, there is a five page proof of a special case in the classic
text by W. Feller (An Introduction to Probability Theory and its Applications. Vol 1, 1968). We will show
that the existence of p is a simple consequence of the compactness property of bounded sequences.
Let us start with a few basic extensions of the notion we saw in R.
Take x = (x1, x2, . . . , xm) ∈ Rm. In place of the modulus function |x| we have the Euclidean norm
‖x‖ = √(x1)2 + (x2)2 + · · ·+ (xm)2 and for the distance between x,y ∈ Rm we have d(x,y) = ‖x− y‖.
67
A sequence in Rm is (xn)∞n=1 where xn = (x
(1)
n , x
(2)
n , . . . , x
(m)
n ). A sequence (xn)∞n=1 in Rm is said to
converge to the limit x ∈ Rm if each of its components converge to the corresponding component of x:
for any i = 1, . . . ,m, x(i)n → x(i). This comes down to d(xn,x)→ 0.
Theorem B.1 (Heine-Borel theorem for Rm)
Let K be a subset of Rm which is bounded (i.e. there exists M > 0 such that for any x ∈ K we
have ||x|| ≤M) and closed (i.e. if (xn)n∈N is in K and converges to some x in Rm then x ∈ K).
Then K is compact, that is any sequence in K has a subsequence which converges in K.
Exercise B.2
Let K be the set of all stochastic vectors in Rm and let A be an m by m column stochastic matrix.
1. Sketch K in the cases m = 2 and m = 3.
2. Prove that K is bounded.
3. Prove that K is closed, and thus that it is compact.
4. Prove that K is convex, that is if x1, . . . ,xl belong to K and λ1, . . . , λl are reals in [0, 1] such
that
∑l
k=1 λk = 1, then
∑l
k=1 λkxk ∈ K.
5. If p ∈ K explain why Ap ∈ K.
Skeleton of solution
1. Doodle it!
2. If x is a stochastic vector, ||x||2 ≤ (x1 + · · ·+ xm)2 ≤ 1 (why?).
3. If each xin → `i as n → ∞, for all i = 1, . . . ,m, then by the ALT 1 = x1n + · · · + xmn →
`1 + · · ·+ `m. Moreover, by the OLT, 0 ≤ xin ≤ 1 for all n ∈ N implies 0 ≤ `i ≤ 1.
4. Simple algebra: using 0 ≤ xik ≤ 1 and x1k + · · · + xmk = 1, and 0 ≤ λk ≤ 1, check that
0 ≤∑lk=1 λkxik ≤ 1 for all i = 1, . . . ,m and that ∑mi=1 (∑lk=1 λkxik) = 1.
5. Also rather simple algebra using the properties on the coefficients of A.
As often taught in Linear Algebra, a function T : Rm → Rm is called an affine transformation if it is of
the form T (x) = Ax+b where A is a fixed m by m matrix and b ∈ Rm is also fixed. For ease of notation
we write T (x) = Tx. It can be shown that every affine transformation is continuous (exercise!). By
this we mean that it takes convergent sequences to convergent sequences; that is, if (xn) → x then
(Txn)→ Tx.
Exercise B.3
Prove that any affine transformation takes convex combinations to convex combinations, i.e. when-
ever x1,x2, . . . ,xl ∈ Rm, λ1, λ2, . . . , λl ∈ [0, 1] and λ1+λ2+· · ·+λl = 1, we have T (λ1x1+· · ·+λlxl) =
λ1T (x1) + · · ·+ λlT (xl).
Skeleton of solution
Very simple algebra: A(λixi) = λiAxi. Write down each side the the equality and use
λ1 + · · ·+ λl = 1.
Theorem B.4 (Markov-Kakutani theorem)
Let T be an affine transformation and K a non-empty compact convex subset of Rm. If T maps
K into K then T has a fixed point in K, that is there exists x ∈ K such that T (x) = x.
68
Proof
Choose any z ∈ K and define xn = 1n (z +Tz +T 2z + ...+Tn−1z) where T k = T ◦T ◦T ◦ ... ◦T is
the k-fold composite of T . By convexity of K we conclude xn ∈ K. By compactness of K we can
find a convergent subsequence (xnj )∞j=1 with limit x ∈ K. We show that x is a fixed point of T .
Since (xnj ) → x and T is continuous, we have (Txnj ) → Tx. By the Heine-Borel theorem, K is
bounded so there is an M > 0 such that ‖a‖ ≤M for all a ∈ K. Therefore, using Exercise B.3,
‖xn − Txn‖ =
∥∥∥∥ 1n (z + Tz + T 2z + ...+ Tn−1z)
− 1
n
(Tz + T 2z + T 3z + ...+ Tnz)
∥∥∥∥
=
∥∥∥∥ 1n (z− Tnz)
∥∥∥∥
≤ 1
n
(‖z‖+ ‖Tnz‖)
≤ 1
n
(M +M).
Replacing n by nj we get 0 ≤
∥∥xnj − Txnj∥∥ ≤ 2Mnj and letting j →∞ we conclude 0 ≤ ‖x− Tx‖ ≤
0. This means x = Tx showing that x is a fixed point as claimed.
Finally we apply this theory to Markov chains.
Theorem B.5 (Fundamental theorem of Markov chains)
Given any square column stochastic matrix A there is a stochastic vector p such that Ap = p.
Proof
Let K be the set of all stochastic vectors in Rm and let T be the affine transformation on Rm
defined by T (x) = Ax. By Example B.2 (item 5), T maps K into K. By the same example, K is
compact and convex. It is also non-empty. By the Markov-Kakutani theorem T has a fixed point
p ∈ K. Thus Ap = p.
C Appendix: continuous nowhere differentiable? ♦NE♦
All the continuous functions we saw these notes, and probably all the continuous functions we can think
of, are either differentiable everywhere or everywhere except a few points. Can we find a continuous
function that is nowhere differentiable?
The answer is yes, but the construction of such a function is much more difficult than everything we
previously encountered. The example we describe here is a variation of an example given by Bernard
Bolzano around 1830.
Consider I = [0, 1] and define the functions
• f0(x) = x on I.
• I is cut in thirds I1 = [0, 1/3], I2 = [1/3, 2/3], I3 = [2/3, 1] and f1 is the piecewise affine function
on I, with slopes s1 = 2 on I1, s2 = −1 on I2 and s3 = 2 on I3.
• Each interval Ii is cut in thirds Ii,1, Ii,2, Ii,3 and f2 is the piecewise affine function on I, with slopes
2si on Ii,1, −si on Ii,2 and 2si on Ii,3.
That is, f2 has slopes
? 4 on I1,1 = [0, 1/9], −2 on I1,2 = [1/9, 2/9], 4 on I1,3 = [2/9, 3/9],
? −2 on I2,1 = [3/9, 4/9], 1 on I2,2 = [4/9, 5/9], −2 on I2,3 = [5/9, 6/9],
? 4 on I3,1 = [6/9, 7/9], −2 on I3,2 = [7/9, 8/9] and 4 on I3,3 = [8/9, 1].
69
• and so on and so forth...
Functions f1 (dotted line), f2 (dashed line) and f3 (continuous line).
More precisely, knowing fn define
fn+1(x) =

2
3fn(3x) if x ∈ [0, 13 ],
2
3 − 13fn
(
3
(
x− 13
))
if x ∈ [ 13 , 23 ],
2
3fn
(
3
(
x− 23
))
) + 13 if x ∈ [ 23 , 1].
(C.1)
Using this formula, a proof by induction shows that fn is affine on each tri-adic interval of the form
[k/3n, (k + 1)/3n] (with k integer). Moreover, the set of slopes of fn on [0, 1] is Sn = (−1)n ×
{1,−2, 4,−8, 16, · · · , (−2)n−1} = {(−1)n+q2q : q = 0, . . . , n− 1}.
Hint for the proof
The formula shows that the set of slopes Sn+1 of fn+1 is 2 × Sn ∪ (−Sn), and from this we get
the description of the set of slopes.
Induction also shows that fn+1(k/3n) = fn(k/3n). In particular, if n ≥ p,
fn
(
k
3p
)
= fp
(
k
3p
)
(C.2)
Proof
We prove that fn+1(k/3n) = fn(k/3n), the conclusion (C.2) then follows easily by noticing that
k/3n = (3k)/3n+1.
If k3n ≤ 13 then fn+1( k3n ) = 23fn( k3n−1 ) = 23fn−1( k3n−1 ) by induction, and thus fn+1( k3n ) =
2
3fn−1(3× k3n ) = fn( k3n ).
If 13 ≤ k3n ≤ 23 then fn+1( k3n ) = 23 − 13fn(3( k3n − 13 )) = 23 − 13fn(k−3
n−1
3n−1 ) with k − 3n−1 integer
between 0 and 3n−1. Hence, by induction, fn(k−3
n−1
3n−1 ) = fn−1(
k−3n−1
3n−1 ) = fn−1(3(
k
3n − 13 )) and
70
fn+1(x) = 23 − 13fn−1(3( k3n − 13 )) = fn( k3n ).
Similarly if k3n ≥ 23 .
The sequence (fn)n∈N has a uniform limit f . This function is called the Bolzano function.
Skeleton of proof
We first prove by induction that d(fn+1, fn) ≤ 23d(fn, fn−1), where d(g, h) = supx∈[0,1] |g(x)−h(x)|
is the distance defined by (6.5). Different cases are studied depending on the position of x.
If x ∈ [1/3, 2/3] then
|fn+1(x)− fn(x)| = 13 |fn(3(x− 1/3))− fn−1(3(x− 1/3))|
≤ 13 supy∈[0,1]
|fn(y)− fn−1(y)| = 13d(fn, fn−1).
Likewise if x ∈ [0, 1/3] or x ∈ [2/3, 1] – that’s when we get the 2/3 factor (do it!)
Using again induction (do it!), we infer that d(fn+1, fn) ≤ ( 23 )nd(f1, f0) ≤ ( 23 )n for all n ∈ N. The
triangle inequality on d (check it using the definition of this distance) then yields, for p ≥ n,
d(fn, fp) ≤ d(fn, fn+1) + d(fn+1, fn+2) + · · ·+ d(fp−1, fp)
≤
(
2
3
)n
+ · · ·+
(
2
3
)p−1
≤
(
2
3
)n 1
1− 23
= 3
(
2
3
)n
.
The RHS in this inequality is as small as we want if n is large, which enables us to apply the
uniform Cauchy criterion to (fn)n∈N (write this properly).
The Bolzano function f is continuous, being the uniform limit of continuous functions, but it is nowhere
differentiable.
f20 (close to the Bolzano function...)
Skeleton of proof (only two gaps)
Assume that f is differentiable at x and write thus f(x+h) = f(x)+f ′(x)h+hr(h). In particular,
for h, k strictly positive, f(x+h)− f(x− k) = f ′(x)(h+ k) +hr(h) + kr(−k), or f(x+h)−f(x−k)h+k =
71
f ′(x) hh+k r(h) +
k
h+k r(−k).
Let αp ∈ Z and hp, kp be chosen as in the following figure.
kp hp
xαp
3p
αp+1
3p
With these choices, we get
3p
[
f(αp + 13p )− f(
αp
3p )
]
= f ′(x) + hp
hp + kp
r(hp) +
kp
hp + kp
r(−kp). (C.3)
But, taking the limit as n→∞ of (C.2), 3p[f(αp+13p )− f(αp3p )] = 3p[fp(αp+13p )− fp(αp3p )] is a slope
of fp (affine on this interval), i.e. is some integer lp in Sp. Since hphp+kp ≤ 1 and
kp
hp+kp ≤ 1,
hp
hp+kp r(hp) and
kp
hp+kp r(−kp) tend to 0 as p → ∞ (why?) and (C.3) shows that lp → f ′(x) as
p→∞.
However, the sequence (lp) cannot converge. Indeed, a sequence of integer which converges is
ultimately constant (i.e lp = lP for all p ≥ P – why?), but then we would have lP = lP+1 ∈
SP ∩ SP+1 which is impossible since SP ∩ SP+1 = ∅ (otherwise, there exists q, q′ such that
(−1)P+q2q = (−1)P+1+q′2q′ ; taking the absolute value then leads to q = q′ and then (−1)P =
(−1)P+1).
Example C.1
Another construction of a continuous nowhere differentiable function (from Rudin, Principles of
Mathematical Analysis, McGraw-Hill – Theorem 7.18).
1. Let ϕ(x) = |x| if x ∈ [−1, 1] and extend ϕ by 2-periodicity, i.e. by requiring that ϕ(x+2) = ϕ(x)
for all x ∈ R. Prove that ϕ satisfies |ϕ(s)− ϕ(t)| ≤ |s− t| for all s, t ∈ R.
2. Prove that f(x) =
∑
n≥0( 34 )nϕ(4nx) is defined and continuous on R.
3. Let x ∈ R. Define δm = ± 124−m, where the sign is chosen so that no integer lies between
4mx and 4m(x + δm) (one of the intervals (4mx, 4mx + 1/2) or (4mx − 1/2, 4mx) does not
contain any integer). Prove that, if n > m, ϕ(4n(x + δm)) − ϕ(4nx) = 0 and that, if n ≤ m,∣∣∣ϕ(4n(x+δm))−ϕ(4nx)δm ∣∣∣ ≤ 4n, with equality if n = m.
4. Deduce that ∣∣∣∣f(x+ δm)− f(x)δm
∣∣∣∣ =
∣∣∣∣∣3m +
m−1∑
n=0
(34)
nϕ(4n(x+ δm))− ϕ(4nx)
δm
∣∣∣∣∣
≥ 3m −
m−1∑
n=0
3n = 3m − 3
m − 1
2 =
3m
2 +
1
2
and that f is therefore not differentiable at x.
D Appendix: cantor function ♦NE♦
The Cantor function (see Abbott, exercise 6.2.3) is defined as the uniform limit on [0, 1] of the sequence:
f1(x) =

3
2x if x ∈ [0, 1/3],
1
2 if x ∈ [1/3, 2/3],
3
2x− 12 if x ∈ [2/3, 1],
72
and, for n ≥ 1,
fn+1(x) =

1
2fn(
3
2x) if x ∈ [0, 1/3],
fn(x) if x ∈ [1/3, 2/3],
1
2fn(3x− 2) + 12 if x ∈ [2/3, 1].
Graph of f1 Graph of f2
Graph of f3 Graph of f4
In a similar way as for the Bolzano function, it can be proved that d(fn+1, fn) ≤ 12d(fn, fn−1) and thus
that (fn)n∈N converges uniformly to some continuous f : [0, 1] → [0, 1]. Each fn being increasing, the
same holds for f (pointwise convergence is enough for that). Moreover, f(0) = 0, f(1) = 1 and f ′ = 0
on [0, 1]\C, where C is the Cantor set (see Abbott, Chapter 3.1).
The Cantor function f is therefore continuous, increasing from 0 to 1 but has a derivative that vanishes
on a set of length 1 (=length of [0, 1]).
E Appendix: continuity points of pointwise limits ♦NE♦
Exercise 6.3 shows that pointwise limits of sequences of continuous functions are not necessarily contin-
uous. However, the limit function given in this example is only discontinuous at one point. Is this a
general case? Can pointwise limits of sequences of continuous functions be discontinuous everywhere (or
have many points of discontinuity)?
The Baire-Osgood theorem states that if (gn)n∈N is a sequence of continuous functions which converges
pointwise to g, then the set of points of continuity of g is dense in R.
73
This shows in particular that derivatives must have “lots” of points of continuity. Indeed, if f is dif-
ferentiable on R, say, then f ′(x) = limn→∞ f(x+1/n)−f(x)1/n so f ′ is the pointwise limit of the continuous
functions x 7→ f(x+1/n)−f(x)1/n .
A reference for the Baire-Osgood theorem is “Functional analysis” (vol 1, page 420) by Dzung Minh
Ha (available at the Hargrave Andrew Library under the code 515.7 H111F 2006). The Baire-Osgood
theorem in this reference is presented in a generic setting and requires some background in topology (in
particular in induced topology) to be readable.
74

欢迎咨询51作业君