The University of Melbourne

School of Mathematics and Statistics

MAST20026

Real Analysis

Semester 2, 2020

Partial Lecture Slides

Some things to be aware of:

• These notes are not intended as a textbook. Rather they are an accompaniment to the lectures. You will need to take notes during the lectures.

• You are expected to watch all lectures in the week they are published in order to keep up with tutorials and homework assignments.

• You are expected also to utilise textbooks from the library (or elsewhere).

This compilation has been made in accordance with the provisions of Part VB of the copyright act for teaching purposes at the University.

For the use of students of the University of Melbourne enrolled in the subject MAST20026 Real Analysis.

Main Topics

Topic 1 Logic and Proof 9

Topic 2 Sequences 162

Topic 3 Limits and Continuity 223

Topic 4 Differentiability 284

Topic 5 Integration 296

Topic 6 Series 325

1

Introduction

In your previous subjects, you will have seen some simple proofs and constructed some

yourselves.

For example, in Linear Algebra, showing that a particular set is (or is not) a vector

subspace of a bigger vector space.

This subject is more about developing your skills in proof writing rather than about

calculation! To do this we need to have a thorough understanding of how

mathematical arguments work. This involves starting with logic and the way it is used

in mathematics.

This subject has two main aims:

• To study how to write mathematical proofs

• To study Real Analysis

2

Consider as a motivating example, Euler’s number e. How do we understand this

number? You might have been taught

e ≈ 2.718281828

or the deceptive expression

e = 2.718281828...

The first is an approximation while the latter is not even a precise

statement.

3

Hopefully you were told one of the rigorous definitions. Here are two of them

e = lim_{n→∞} (1 + 1/n)ⁿ

or

e = ∑_{n=0}^{∞} 1/n! = 1 + 1/1 + 1/(1·2) + 1/(1·2·3) + · · ·

Both of these definitions involve letting some value (in both cases “n”) tend to

infinity.

Moreover, they are equivalent, in the sense that they define the same real

number.
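Although proving the equivalence of the two definitions must wait, they are easy to compare numerically. A quick sketch (the helper names `e_limit` and `e_series` are ours, not part of the slides):

```python
import math

def e_limit(n):
    # The limit definition, truncated at a finite n: (1 + 1/n)^n
    return (1 + 1/n) ** n

def e_series(n):
    # The series definition, truncated after the term 1/n!
    return sum(1 / math.factorial(k) for k in range(n + 1))

# Both approach e = 2.718281828..., but the series converges much faster.
print(e_limit(10**6))   # agrees with e to only about 6 digits
print(e_series(20))     # agrees with math.e to machine precision
```

Note how slowly the limit definition converges compared with the series; this difference in speed is itself something the tools of this subject make precise.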

We mean to make this process mathematically precise. This is not so much about

making the idea of infinity precise. It is more about understanding the rigorous

definition of convergence.

4

e = ∑_{n=0}^{∞} 1/n! = 1 + 1/1 + 1/(1·2) + 1/(1·2·3) + · · ·    (1)

To make this expression (and the other definition above) precise took a long time.

Cauchy first did it in 1821, which in mathematics is recent history. This used what is sometimes called the (ε, δ) or the (ε, N) definition of convergence.

• You will learn this rigorous definition and apply it in all sorts of contexts to understand the concept of convergence

• You will learn rigorous algebraic properties of numbers; for example e is a real number but not a rational number.
  (More on ‘rational vs real’ later.)

5

The definition in (1) is best for us because it readily generalizes to give us the powers

of e:

eˣ = ∑_{n=0}^{∞} xⁿ/n! = 1 + x/1 + x²/(1·2) + x³/(1·2·3) + · · ·

This is probably the most important function in mathematics. In particular it solves

the important differential equation:

d(eˣ)/dx = eˣ

Perhaps you can show this is true formally, but proving it rigorously is another

matter.

Making rigorous such precise properties of functions is also a central part of this

subject.
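The partial sums of the series can be checked against the built-in exponential, and the differential equation can be spot-checked by a finite difference. This is only numerical evidence, not the rigorous proof the slides are pointing towards (the helper `exp_series` is ours):

```python
import math

def exp_series(x, terms=30):
    # Partial sum of 1 + x + x^2/2! + ... + x^(terms-1)/(terms-1)!
    return sum(x**n / math.factorial(n) for n in range(terms))

# The partial sums agree closely with the built-in exponential:
for x in (0.0, 1.0, -2.5, 3.0):
    print(x, exp_series(x), math.exp(x))

# A crude finite-difference check that d(e^x)/dx = e^x at x = 1:
h = 1e-6
print((math.exp(1 + h) - math.exp(1)) / h)  # close to e^1
```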

6

Thus what we seek are

Definitions: precise definitions of the objects we are working with.

Proofs: to be able to prove statements based on the definitions as well as

statements based on axioms.

(axioms are statements assumed to be true)

7

Without precision in reasoning it is easy to make (erroneous) arguments which

lead to ridiculous conclusions.

For example consider the sum

1 − 1 + 1 − 1 + 1 − 1 + 1 − 1 + · · ·

By grouping one way we argue that the (infinite) sum is 0, while by grouping another

way we argue it is 1. Have we proved that 1 = 0? No! We have not proven

anything.

Now consider the sum

1/2 + 1/4 + 1/8 + 1/16 + 1/32 + 1/64 + · · ·
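Partial sums make the contrast between the two series concrete (an illustration, not a proof; the helper `partial_sums` is ours):

```python
def partial_sums(terms, n):
    # Return the first n partial sums of the series whose k-th term is terms(k).
    total, out = 0, []
    for k in range(n):
        total += terms(k)
        out.append(total)
    return out

# Grandi's series 1 - 1 + 1 - 1 + ... : the partial sums oscillate forever.
print(partial_sums(lambda k: (-1) ** k, 8))        # [1, 0, 1, 0, 1, 0, 1, 0]

# Geometric series 1/2 + 1/4 + 1/8 + ... : the partial sums approach 1.
print(partial_sums(lambda k: 1 / 2 ** (k + 1), 8))
```

The first sequence of partial sums never settles on a value, which is why no amount of regrouping "proves" that 1 = 0; the second does settle, and the rigorous definition of convergence will say exactly what "settles" means.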

8

Logic and Proof

We want

1. A precise way to write mathematical definitions.

2. A precise notion of what constitutes a mathematical proof.

3. A precise notion of what constitutes a theorem.

4. Some methods of proof.

Need two forms of logic

1. Propositional logic

2. First order logic

9

Propositional Logic

Logic is about how to infer true things from known or assumed things.

Propositional logic is about two valued variables and two valued functions of those

variables. These two values will be called “true” and “false” or sometimes 0 and

1.

Definition 2.1 (Statement)

A statement is a “sentence” or “expression” that is either true or false.

The statement takes on the role of a logical variable (the variables are true or false).

We will generally use lower case letters p, q, r , . . . to represent statements.

Some books use the word proposition rather than statement.

10

Examples

Example 2.1 (Which of these are statements?)

• All maths lecturers are named Paul.

• 2 + 3 = 7

• 1 + 1 = 2

• potato

• x > 2

• For all x ∈ Z, x² ≥ 0

• Every even number greater than 2 is the sum of two primes.

11

Connectives

We can combine statements to give new statements. Such statements are called

compound statements (or logical formulae). They are constructed from:

1. primitive statements (that is, logical variables)

2. connectives

3. parentheses (to remove ambiguity)

Generally, we use upper case letters A,B,C , . . . to denote compound

statements.

The connectives we will consider are:

• negation

• disjunction

• conjunction

• implication

• equivalence

12

Definition 2.2 (Negation)

If p is a statement, the negation of p is the statement “not p” and is denoted by ∼p. Its logical value is the opposite of p (ie., if p is true then ∼p is false). It can also be denoted by ¬p.

p   ∼p
T   F
F   T

Example 2.2

If p is the statement “the grass is green” then ∼p is the statement “not [the case

that] the grass is green”. (We would usually say “the grass is not green”.)

Example 2.3

If p is the statement “5 > 0” then ∼ p is the statement

“5 is not > 0” which is the same as “5 ≤ 0”.

Example 2.4

If q is the statement “1 + 1 = 7” then “∼ q” is the statement “1 + 1 ≠ 7”.

13

In the English language, there are two different usages of the word ‘or’.

The exclusive use of “or”, as in “I will ride my bike or catch the train”, usually means that one or the other of these modes of transportation will be used, but not both. This is not the mathematical usage of “or”.

Definition 2.3 (Disjunction)

If p and q are statements, the disjunction of p and

q is the statement “p or q” and is denoted by p ∨ q.

It is True if one or both of p and q are True.

It is False if both p and q are False.

p q p ∨ q

T T

T F

F T

F F

The disjunction is an inclusive use of “or”; that is, it allows for the possibility that both statements are true.

14

Example 2.5

If p is the statement “2 + 5 = 7” and q is the statement “1 + 1 = 2”, then p ∨ q is

the statement “2 + 5 = 7 or 1 + 1 = 2”.

Example 2.6

If p is the statement “2 + 5 = 7” and q is the statement “1 + 1 = 0”, then p ∨ q is

the statement “2 + 5 = 7 or 1 + 1 = 0”.

15

When defining the above connectives we gave their Truth Tables.

p ∼ p

T

F

p q p ∨ q

T T

T F

F T

F F

We will use truth tables to define some other connectives.

16

Definition 2.4 (Conjunction)

If p and q are statements, the conjunction of p

and q is the statement “p and q”, denoted p ∧ q. p q p ∧ q

T T

T F

F T

F F

Conjunction accords with the natural language use of the word “and”.

Example 2.7

If p is the statement “2 + 5 = 7” and q is the statement “1 + 1 = 2” then the

conjunction of p and q is the statement “2 + 5 = 7 and 1 + 1 = 2”.

Example 2.8

If p is the statement “2 + 5 = 7” and q is the statement “1 + 1 = 0” then the

conjunction of p and q is the statement “2 + 5 = 7 and 1 + 1 = 0”.

17

Example 2.9

Note that p ∧ q has the same truth table as q ∧ p, and p ∨ q has the same truth table

as q ∨ p; that is, conjunction and disjunction are commutative.

18

Definition 2.5 (Implication)

A statement of the form “if p then q” is called an

implication (or a conditional statement).

It is written as p ⇒ q.

p q p ⇒ q

T T

T F

F T

F F

The statement p is called the antecedent and the statement q is called the

consequent.

19

Example 2.10

p : 1 + 1 = 2

q : 2² = 4

r : π = 3

p ⇒ q is True

p ⇒ r is

r ⇒ p is True

20

There are many natural language equivalent statements:

• If p then q

• p only if q

• q follows from p

• q is necessary for p

• q provided that p

• q is a necessary condition for p

• there are others...

21

Example 2.11

p q p ⇒ q ∼ p (∼ p) ∨ q

22

Definition 2.6 (equivalence)

A statement of the form “p if and only if q” is called

an equivalence (or a biconditional statement).

It is written as p ⇔ q.

p q p ⇔ q

Note that p if and only if q is the same as (ie., “logically equivalent to”)

(p ⇒ q) ∧ (q ⇒ p).

23

Definition 2.7 (Compound statement)

Statements constructed from several primitives are called compound statements.

Usually they are written with capital letters, P,Q,R, . . . .

A compound statement is also either true or false. Another way to look at it is, if you

combine statements with connectives in a valid way then you have a compound

statement which is either true or false, and its truth value depends on the truth values

of the statements comprising it.

Example 2.12

Let p, q, r be primitive statements, then

• p ∧ q

• (∼ p) ∧ (∼ q)

• (p ⇒ q) ∧ (q ⇒ p)

are all compound statements.

24

Negating Compound Statements

Example 2.13

Since p ∧ q is a statement, we can negate it, giving, ∼ (p ∧ q). The truth table

is

p q p ∧ q ∼ (p ∧ q)

Be careful with your use of brackets. They should be used to remove any ambiguity.

If I write ∼ p ∧ q do I mean (∼ p) ∧ q or ∼ (p ∧ q)? They are not the same.

25

Example 2.14

Now consider the truth table for (∼ p) ∨ (∼ q)

p q ∼ p ∼ q (∼ p) ∨ (∼ q)

Notice that (∼ p) ∨ (∼ q) always gives the same truth value as ∼ (p ∧ q)

26

Logical Equivalence

Definition 2.8 (Logically equivalent)

Statements which have the same truth table are said to be logically equivalent.

That is, p and q are logically equivalent if the statement p ⇔ q is a tautology.

It is denoted p ≡ q.

1. p ∧ q ≡ q ∧ p

2. p ∨ q ≡ q ∨ p

3. ∼ (p ∧ q) ≡ (∼ p) ∨ (∼ q)

4. ∼ (p ∨ q) ≡ (∼ p) ∧ (∼ q)
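Since a logical equivalence is just an agreement of truth tables, the four equivalences above can be checked exhaustively over the two truth values. A small sketch in Python (the loop enumerates every row of the truth table):

```python
from itertools import product

# Brute-force check of the four logical equivalences over all truth values.
for p, q in product([True, False], repeat=2):
    assert (p and q) == (q and p)                    # 1. commutativity of ∧
    assert (p or q) == (q or p)                      # 2. commutativity of ∨
    assert (not (p and q)) == ((not p) or (not q))   # 3. De Morgan for ∧
    assert (not (p or q)) == ((not p) and (not q))   # 4. De Morgan for ∨
print("all four equivalences hold")
```

Because each statement has only two possible values, this finite check really is a proof of the equivalences, in the same spirit as writing out the truth table.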

27

Example 2.15

We show number 4 from above

p q p ∨ q ∼ (p ∨ q) ∼ p ∼ q (∼ p) ∧ (∼ q) ∼ (p ∨ q)⇔ (∼ p) ∧ (∼ q)

28

First Order Logic

Mathematical theories have several ingredients that propositional logic cannot easily

represent (for example, sets, functions and mathematical operations).

We use mathematical variables x , y , z , . . . to represent elements of the set of objects

U (called the universal set).

We also have equations and inequalities containing variables:

• x > 0

• x² − 3x + 2 = 0.

These two formulae are not statements. They are neither true nor false, because their

truth depends on the value of x .

Formulae which depend on (free) variables are called conditions (or predicates)

29

Conditions on x are written as p(x), q(x), . . . .

Example 2.16

Some conditions on x are:

• p(x) : x > 0

• q(x) : x² − 3x + 2 = 0.

They are neither True nor False.

Some statements are:

• p(2) : 2 > 0

• p(−5) : −5 > 0

• q(3) : 3² − 3 · 3 + 2 = 0.

They can be seen as True or False.

30

If x is replaced by a particular value (or element of a set), then the formulae become

statements.

There are (at least) two ways to convert formulae containing mathematical variables

into logical statements.

1. Substitution

Pick any element of U and substitute into the formula:

Thus, if U is Z, then

• (x = 2)

• (x = −5)

• (x = 3)

31

2. Using quantifiers

Thus, if U is still Z, then

• for all x ∈ Z, x > 0

• for all x ∈ Z, x² − 3x + 2 = 0

• there exists an x ∈ Z such that x > 0

• there exists an x ∈ Z such that x² − 3x + 2 = 0

The phrases “for all” and “there exists” are quantifiers.

32

Expressions like p(x) ∧ q(x) and p(x) ⇒ q(x) are compound conditions

(or predicates).

Example 2.17

• p(x) ∧ q(x) is the condition “p(x) and q(x)”.

• p(x) ∨ q(x) is the condition “p(x) or q(x)”.

• p(x) ⇒ q(x) is the condition “if p(x) then q(x)”.

These all become valid logic statements when a particular value for x is

substituted.

33

Quantifiers

As we have seen, we can turn conditions on x into statements.

Consider the condition p(x) : x² ≥ 3.

We can turn this into a statement in various ways:

Definition 2.9 (Universal Quantifier)

The phrase “for all” is called the universal quantifier and is written as ∀.

34

Now consider the condition p(x) : x² − 1 = 0.

Definition 2.10 (Existential Quantifier)

The phrase “there exists” is called the Existential Quantifier and is written as ∃.

35

Proving and Disproving Quantified Statements

If Q is the statement “∀x ∈ D p(x)”, where D is a set and p(x) is a condition on x, then Q is true only if p(x) is true for every member of D (this might be difficult to show).

However if at least one instance is false (say p(c)) then p(a) ∧ p(b) ∧ p(c) ∧ · · · is false.

So Q can be disproved if at least one false case can be found.

This is called a counterexample.

36

Example 2.18

∀x ∈ R, (x − 1)² > 0

37

Similarly, if R is the statement “∃x ∈ D p(x)”, then R is true if there is one true

instance of p(x).

So R can be proved true if at least one primitive statement is true.

This is called an example.

38

Example 2.19

∃x ∈ R (x − 1)² > 0

In general

• Disprove a universal statement with a counterexample.

• Prove an existential statement with an example.
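The two rules above can be illustrated by a finite search (illustration only: a finite sample of reals can disprove a universal claim or prove an existential one, but can never prove a universal claim over all of R; the variable names are ours):

```python
# Universal claim: "for all real x, (x - 1)^2 > 0" -- search for a counterexample.
candidates = [x / 2 for x in range(-10, 11)]  # a small sample of reals

counterexamples = [x for x in candidates if not ((x - 1) ** 2 > 0)]
print(counterexamples)  # [1.0] -- the claim is disproved by x = 1

# Existential claim: "there exists real x with (x - 1)^2 > 0" -- one witness suffices.
example = next(x for x in candidates if (x - 1) ** 2 > 0)
print(example)  # any single witness proves the statement
```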

39

Writing theorems using First Order Logic

We can use First Order Logic to write theorems in a precise way.

1. Any real number has the property that its square is not negative.

2. For every even integer m, 3m⁵ will also be an even integer.

40

3. If x and y are real numbers, the absolute value of their sum is no greater than

the sum of their absolute values.

4. √2 is irrational.

5. If n is an integer greater than or equal to 5, then 2ⁿ is greater than n².

41

Negation of ∀

Since quantified conditions are statements their negation is well defined. For

example:

∼ [ ∀x ∈ D p(x) ] and ∼ [ ∃x ∈ D p(x) ]

Are these logically equivalent to anything else?

Theorem 2.1 (Negation of ∀)

∼( ∀x ∈ D p(x)) ≡ ∃x ∈ D ∼p(x)

42

Negation of ∃

Theorem 2.2 (Negation of ∃)

∼( ∃x ∈ D p(x) ) ≡ ∀x ∈ D ∼p(x)
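Over a finite domain D, Theorems 2.1 and 2.2 can be observed directly, since `all` plays the role of ∀ and `any` the role of ∃ (a sketch with a domain and condition of our choosing):

```python
# Quantifier negation laws over a finite domain D, using Python's all/any.
D = range(-3, 4)

def p(x):
    return x > 0

# ∼(∀x ∈ D, p(x))  ≡  ∃x ∈ D, ∼p(x)
assert (not all(p(x) for x in D)) == any(not p(x) for x in D)

# ∼(∃x ∈ D, p(x))  ≡  ∀x ∈ D, ∼p(x)
assert (not any(p(x) for x in D)) == all(not p(x) for x in D)
print("quantifier negation laws verified on D")
```

For a finite domain these are just De Morgan's laws applied to the conjunction p(a) ∧ p(b) ∧ · · · and the disjunction p(a) ∨ p(b) ∨ · · ·.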

Example 2.20

No positive real number satisfies x³ = −1.

This is the same as:

43

Tautologies and Contradictions

Definition 2.11 (Tautology)

If the truth table of a compound statement P(p1, . . . , pn) is true for all values of

p1, . . . , pn then it is called a tautology.

Example 2.21

p ∼p (∼p) ∨ p

Thus (∼p) ∨ p is a tautology.

44

Example 2.22

P(p, q) : p ⇒ p ∨ q

p q p ∨ q p ⇒ (p ∨ q)

Thus P is a tautology.

45

Example 2.23

Show [p ∧ (p ⇒ q)]⇒ q is a tautology.

p q p ⇒ q p ∧ (p ⇒ q) [p ∧ (p ⇒ q)]⇒ q
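The truth table can also be generated by enumerating all assignments; every row comes out true, which is exactly what "tautology" means (the helper `implies` encodes the conditional's truth table and is ours):

```python
from itertools import product

def implies(a, b):
    # Material conditional: false only when a is true and b is false.
    return (not a) or b

# [p ∧ (p ⇒ q)] ⇒ q evaluated on every row of the truth table.
rows = [implies(p and implies(p, q), q)
        for p, q in product([True, False], repeat=2)]
print(rows)  # [True, True, True, True] -- a tautology
```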

46

Definition 2.12 (Contradiction)

If the truth table of a compound statement P(p1, . . . , pn) is false for all values of

p1, . . . , pn then it is called a contradiction.

Example 2.24

P(p) : (∼p) ∧ p

p ∼p (∼p) ∧ p

Thus P is a contradiction.

Note: There is a second type of contradiction associated with inference rules — see

later.

47

Converse and Contrapositive

Definition 2.13 (Converse)

The converse of p ⇒ q is the statement q ⇒ p.

A conditional statement is not logically equivalent to its converse.

Example 2.25

p q p ⇒ q q ⇒ p

48

Definition 2.14 (Contrapositive)

The contrapositive of p ⇒ q is the statement (∼q)⇒ (∼p).

(ie., switch and negate p and q)

The following theorem says that the contrapositive of a conditional statement is

logically equivalent to the original statement. This is very useful!

Theorem 2.3 (Conditional and Contrapositive)

If p and q are statements, then

p ⇒ q ≡ (∼q)⇒ (∼p)

49

Proof.

p q ∼p ∼q p ⇒ q (∼q)⇒ (∼p)

This means that a conditional statement is logically equivalent to its

contrapositive.

50

The following is a very useful form of the biconditional.

Theorem 2.4 (Biconditional and Conditional)

If p and q are statements, then

[p ⇔ q] ≡ [(p ⇒ q) ∧ (q ⇒ p)].

Proof.

p q p ⇔ q p ⇒ q q ⇒ p (p ⇒ q) ∧ (q ⇒ p)

Columns 3 and 6 are the same thus the statements are logically equivalent.

51

For proof writing if we need to show p ⇔ q is true, then we can use

Theorem 2.4.

In general we show:

1. p ⇒ q is true, and

2. q ⇒ p is true (that is, the converse is true).

Thus (p ⇒ q) ∧ (q ⇒ p) is true (T–T row of the truth table).

Thus p ⇔ q is true.

Conversely, if we know p ⇔ q is true then Theorem 2.4 tells us that p ⇒ q is true and

that q ⇒ p is true

52

Example 2.26

An even number x is a number that can be written in the form x = 2k, k ∈ Z.

Written as a definition:

x ∈ Z is even ⇔ ∃k ∈ Z such that x = 2k.

53

Logic, Theorems, and Mathematical Proofs

We will now begin our discussion of proofs in depth. We start with an example:

Theorem 2.5

If x is an even integer, then x² is an even integer.
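Before proving it, the theorem can be spot-checked on a finite sample of integers. This is not a proof (a proof must cover every integer), but a failed search for a counterexample is reassuring (the helper `is_even` is ours and encodes the definition of evenness):

```python
def is_even(x):
    # Definition: x is even iff x = 2k for some integer k.
    return x % 2 == 0

# Search a finite range for a counterexample to "x even implies x^2 even".
for x in range(-50, 51):
    if is_even(x):
        assert is_even(x * x), f"counterexample at x = {x}"
print("no counterexample among the sampled even integers")
```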

54

Let p : x is an even integer and q : x² is an even integer.

The theorem states,

“If p then q” is true;

that is, the conditional

“p ⇒ q”

is true.

Method: To prove the theorem, we assume that p is true (i.e. that x is an even integer) and use this together with valid rules of inference to conclude that q is true (i.e. that x² is an even integer).

55

Many theorems can be written as P(p1, . . . , pn)⇒ Q(p1, . . . , pn).

To say that P(p1, . . . , pn)⇒ Q(p1, . . . , pn) is a theorem, means that

P(p1, . . . , pn)⇒ Q(p1, . . . , pn) is a tautology; it is always true.

From the truth table for P ⇒ Q, we see that it is true in all cases except when P is

TRUE and Q is FALSE.

So to prove the theorem, all we need to do is show that this can’t happen; that

is,

If p is true then q is true.

56

Using theorems: When we use the theorem we usually take p as true (ie., choose an

even integer) and then use the theorem to infer q is true (ie., the square of an even

integer is even).

This is justified by the truth table for the conditional:

p q p ⇒ q

Assuming p is true and having the theorem true puts us on row 1 and hence q must

be true.

57

Thus we get

p true and (p ⇒ q) true allows us to infer q

In logic this is usually written

{p, p ⇒ q} ⊢ q

where ⊢ is read “infers”.

This is a famous “rule of inference” called Modus Ponens¹ or the Law of Detachment: q has been detached from p ⇒ q. It is valid as long as p and (p ⇒ q) are assumed (or known) to be true.

¹ From Latin: “The method that affirms the consequent by affirming the antecedent”

58

To emphasize the structure of mathematical proofs, let us prove the following simple statement: the sum of two odd numbers is an even number.

First let’s write this in a form where the logic is clear.

If (m, n are odd integers), then (m + n is an even integer).

Assumed true

Inferred steps

Conclusion

59

True statements that can be used as steps in a proof without inference come in

various flavours.

• Premises – statements assumed true for the purposes of a proof.

• Axioms – very simple statements assumed true for the purposes of building an area of mathematics.

• Tautologies – always true.

• Theorems – already proved true.

• Definitions

• etc...

60

Thus a proof is a finite sequence of statements

A1,A2, . . . . . . ,An

such that each Ai is either

• known or assumed true, or

• can be inferred from a known or assumed true statement Aj, with j < i (ie., a previous statement).

The last statement, An, is usually called the conclusion and so a proof is sometimes

written

A1,A2, . . . . . . ,An−1 proves An

61

Indirect Proofs

Up until now we have used “direct proofs”:

• Start with true premises and infer a true conclusion using valid arguments.

Indirect proofs are different. We will consider two types:

• Proof using the contrapositive.

• Proof by contradiction.

62

Contrapositive Proofs

Instead of proving

P ⇒ Q

we use the logical equivalence (Theorem 2.3)

(P ⇒ Q) ≡ ((∼Q)⇒ (∼P))

and prove

(∼Q)⇒ (∼P)

using any method of proof.

63

Theorem 2.6

(∀x ∈ Z), x2 is even iff x is even.

64

Example 2.27

Prove that there are infinitely many prime numbers. (You may take as axiomatic the rules of arithmetic and the factorization of integers into prime numbers.)

65

Proof by Contradiction

If a theorem is in the form of a conditional:

Theorem: If P then Q

that is, we need to prove P ⇒ Q is true, then a proof by contradiction generally takes

the form:

Proof (by contradiction):

P Premise (ie. assume P true)

∼Q Contradiction premise (ie. assume ∼Q true)

Main steps of proof ...

Negation of some previous statement (ie. a contradiction)

Thus Q is true Contradiction Inference

Thus P ⇒ Q Deduction Inference. QED.

66

Example 2.28

Prove by contradiction that,

Theorem: If x is a rational number and y is an irrational number, then x + y is

irrational.

First place theorem into logic form:

Proof (by contradiction):

67

There are many methods of proof:

• Direct proofs

• Indirect proofs

  • proof by contradiction,

  • contrapositive proof

• Proof by cases

• Others...

All these methods of proof are shown to be valid methods using theorems from

logic.

The previous “sum of even integers is even” proof is an example of a direct

proof.

68

Naive Set Theory

We are going to spend a little time thinking about the basics of set theory. Although

fundamental to the way we think and write mathematics today, surprisingly, formal

set theory has only been around since the late 1800’s when Georg Cantor formulated

the basic definitions and axioms. The most basic of these notions is that of set,

which Cantor described simply as any collection of objects. But this quickly became

problematic, giving rise, for example, to Russell’s Paradox.

In the early 1900’s, axiomatic set theory was developed, by Von Neumann and others,

in an attempt to iron out some of the crinkles in early set theory.

Recommended if you want to go beyond this subject: “Naive Set Theory” by P

Halmos. See Wikipedia (ZF set theory) for the logic forms of these axioms.

69

Russell’s Paradox

70

In this subject, we will try to get to some understanding of sets, with a little logic as

its foundation.

The fundamental relation in set theory is that of belonging, written

x ∈ A

read “x belongs to A” or “x is contained in A” or “x is a member of A”.

There are several ways to build up the mathematics of sets.

Currently most mathematicians start with the ZFC axioms: Zermelo-Fraenkel axioms

+ axiom of choice.²

This is a list of ten axioms.

To give you some idea of what is required to start building mathematics here is an

informal list (for interest only):

² See Srivastava, p. 12, if interested.

71

Axioms of Set Theory (simplified)

1. Existence. There exists a set.

2. Equality. Two sets are the same (ie. equal) if they have the same elements.

3. Specification – defining subsets using conditions.

4. Pairing – For any two sets there exists a set that they both belong to.

5. Union – for any collection of sets there exists a set that contains all the elements

that belong to at least one set in the collection.

6. Power set – for each set there exists a set that contains all its subsets.

7. Replacement – the range of a function defined on a set falls inside some set.

8. Inductive set – there exists a set A such that if x ∈ A then the set x ∪ {x} is also

in A. Required to prove that a set like the natural numbers exists.

9. Foundation – required to prevent Russell’s paradox (there is no set of all sets).

10. Axiom of choice – from a collection of sets I can always choose one element from

each to make a new set.

72

The primary thing we do is build new sets from given sets.

First, in more detail, our most useful axiom:

Axiom of Specification

For every non-empty set E and every condition p(x) on its elements there is a set A

which contains those elements of E for which p(x) is true. We write

A = { x ∈ E : p(x) }

Note: This is essentially saying certain subsets of E exist: those containing the

elements of E for which p(x) is true.

73

Example 3.1

• If A is a set then the set {x ∈ A : x ≠ x} exists (separation axiom).
  It is called the

• {x ∈ R : x² = 2} =

• {x ∈ Z : x > 0} =

• {x ∈ R : (x < 0) ∨ (x = 0) ∨ (x > 0)} =

74

Recall that the Axiom of Specification says that if p(x) is a condition on the elements

of some set E then there always exists a set A which contains those elements of E for

which p(x) is true. But what if p(x) is false for all elements of E?

Definition 3.1 (Empty Set)

We define the empty set (or nullset), denoted ∅, to be the set

{x ∈ U : p(x)}

where U is any set and p(x) is any condition that is false for all elements in U.

The axiom of specification tells us such a set exists.

75

Relations between sets

Definition 3.2 (Equality)

Let A and B be sets. We say that A = B if A and B contain the same elements.

Logic: A = B ⇔ ∀x ∈ U (x ∈ A ⇔ x ∈ B)

Assuming some universal set U.

Example 3.2

A = {1, 2},B = {1, 2},C = {1, 3} and U = {1, 2, 3, 4}.

x x ∈ A x ∈ B x ∈ A⇔ x ∈ B

1 T T

2 T T

3 F F

4 F F

x x ∈ A x ∈ C x ∈ A⇔ x ∈ C

1 T T

2 T F

3 F T

4 F F

76

Definition 3.3 (Subsets)

Let A and B be sets. We say that A is a subset of B if every element of A is also an

element of B. This is written A ⊆ B.

Logic: A ⊆ B ⇔ ∀x ∈ U (x ∈ A⇒ x ∈ B)

If A is a subset of B and there is an element of B which is not an element of A then

we say A is a proper subset of B.

Assuming some universal set U.

Also useful for showing A is not a subset of B is the negation:

A ⊈ B ⇔ ∃x ∈ U s.t. (x ∈ A) ∧ (x ∉ B)

77

Example 3.3

Let A = {1},B = {1, 2}, E = {1, 2, 3, 4}

x x ∈ A x ∈ B (x ∈ A)⇒ (x ∈ B) (x ∈ B)⇒ (x ∈ A)

1 T T

2 F T

3 F F

4 F F

78

Theorem 3.1

Let A be a set. Then ∅ ⊆ A.

Logic form:

“Proof”.

79

Theorem 3.2 (Equality and Subsets)

If A and B are sets then A = B if and only if A ⊆ B and B ⊆ A.

Logic: A = B ⇔ (A ⊆ B) ∧ (B ⊆ A)

How do we prove a biconditional statement?

We use

p ⇔ q ≡ (p ⇒ q) ∧ (q ⇒ p)

and write two proofs:

First the “Forward proof” showing (p ⇒ q) is true

then the

“Converse proof” showing (q ⇒ p) is true.

Since (p ⇒ q) and (q ⇒ p) are both true, p ⇔ q is true.

This is a method of proof - justified by the logical equivalence.

80

Proof

81

82

New sets from old

Given two known sets A and B we can define new sets in many ways.

Definition 3.4 (Union)

Let A and B be sets. We define the union of A and B, denoted A ∪ B, to be the set

that contains all elements of A and all elements of B.

A ∪ B = {x ∈ U : (x ∈ A) ∨ (x ∈ B)}

Definition 3.5 (Intersection)

Let A and B be sets. We define the intersection of A and B, denoted by A∩B, to be

the set that contains only those elements of A which are also elements of B.

If A ∩ B = ∅ we say that A and B are disjoint.

A ∩ B = {x ∈ U : (x ∈ A) ∧ (x ∈ B)}

83

Definition 3.6 (Complement)

Let A and B be sets. We define the complement of B in A, denoted by A\B, to be

the set that contains only those elements of A which are not elements of B.

A \ B = {x ∈ U : (x ∈ A) ∧ (x ∉ B)}

84

Theorem 3.3

Let A,B and C be subsets of some set E . Then

A \ (B ∪ C) = (A \ B) ∩ (A \ C).

We usually prove set equality using Theorem 3.2

Proof.

85

86
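Before (or after) writing the two-inclusion proof, the identity of Theorem 3.3 can be sanity-checked on concrete sets, since Python's set operations match the definitions above (a spot-check on sets of our choosing, not a proof):

```python
# Spot-check of A \ (B ∪ C) = (A \ B) ∩ (A \ C) on concrete finite sets.
A = {1, 2, 3, 4, 5}
B = {2, 4}
C = {4, 5, 6}

lhs = A - (B | C)          # A \ (B ∪ C)
rhs = (A - B) & (A - C)    # (A \ B) ∩ (A \ C)
print(lhs, rhs)            # both sides are {1, 3}
```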

More notation

Definition 3.7 (Indexed Family)

Let i be an element of some nonempty set I . If for each i ∈ I there is a corresponding

set Ai , then

A = {Ai : i ∈ I}

is called an indexed family of sets and I is called an indexing set.

The union of the elements in A is denoted by ⋃_{i∈I} Ai and the intersection of the elements of A is denoted by ⋂_{i∈I} Ai.

Similarly, if N = {Ni : i ∈ I} is a set of objects Ni that can be multiplied (eg. real numbers, algebraic expressions etc.), then ∏_{i∈I} Ni denotes the product (multiplication) of all the elements of N.

87

Example 3.4

88

Ordered Pairs and Cartesian Products

Definition 3.8 (Ordered Pairs)

Let A and B be sets. An ordered pair is an object

(a, b)

where a ∈ A and b ∈ B.

Two ordered pairs (a, b) and (c, d) are equal iff a = c and b = d.

Definition 3.9 (Cartesian Product)

Given sets A and B, the Cartesian product of A and B, written A× B is the set of

all ordered pairs:

A× B = {(a, b) : a ∈ A and b ∈ B}

Proposition 3.1

Let A be a set. If (a1, a2) ∈ A × A and a1 ≠ a2 then (a1, a2) ≠ (a2, a1).

89

Rational Numbers

Definition 3.10 (Rational Numbers)

The set of rational numbers, denoted Q, is the set of ratios of integers,

Q = { r/s : r, s ∈ Z, s ≠ 0 }

We will investigate below the way in which the real numbers can be constructed from

the rationals.

Aside:

A rational number is actually an equivalence class of ordered pairs of integers

(k, l), l ≠ 0. Two pairs (m, n) and (k, l) represent the same rational number iff

ml = nk. Addition and multiplication of rational numbers is defined as

(k, l) + (m, n) = (kn + lm, ln)

(k , l) · (m, n) = (km, ln)

90

Example 3.5

There is no rational number q satisfying q² = 2

91

92
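A finite search with exact rational arithmetic illustrates Example 3.5: among fractions with small numerators and denominators, none squares to exactly 2, and decimal approximations to √2 always miss. This only samples finitely many rationals, so it proves nothing; Example 3.5 requires the proof covering all of Q (the bounds 100 and 200 below are arbitrary choices of ours):

```python
from fractions import Fraction

# Search fractions r/s with small numerator and denominator for a square root of 2.
solutions = [
    Fraction(r, s)
    for s in range(1, 100)
    for r in range(-200, 201)
    if Fraction(r, s) ** 2 == 2
]
print(solutions)  # [] -- no exact solution found in this range

# A good decimal approximation still misses: (14142/10000)^2 is close to, not equal to, 2.
print(Fraction(14142, 10000) ** 2 == 2)  # False
```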

Example 3.6

Let A and B be subsets of Q given by

A = {q ∈ Q : q < 0 or q² < 2},  B = {q ∈ Q : q > 0 and q² > 2}

Then A contains no largest element and B contains no smallest element.

93

Ordered sets and bounds

You are familiar with many basic properties of the real numbers.

We want to build up a list of properties that uniquely determine the real number

system and see how they are related to Q.

A familiar property of Q and R is that they are ‘ordered’.

Definition 3.11 (ordered set)

Let S be a set. An order on S is a relation, denoted < , satisfying

1. If x ∈ S and y ∈ S , then exactly one of the following holds:

x < y,   x = y,   y < x

2. If x , y , z ∈ S satisfy x < y and y < z , then x < z

An ordered set is a set equipped with an order.

The notation x ≤ y is used to mean that x < y or x = y .

We sometimes write y > x in place of x < y .

94

Example 3.7

The set Q of rational numbers equipped with the usual order (in which x < y is

defined to mean that y − x is positive) is an ordered set.

Definition 3.12 (bounded above)

Let S be an ordered set and A ⊆ S a subset. If there exists β ∈ S such that

∀x ∈ A, x ≤ β

then we say that A is bounded above and call β an upper bound for A.

Example 3.8

The set A from Example 3.6 is bounded above and any element of B is an upper

bound for A.

95

Definition 3.13 (least upper bound)

Let S be an ordered set and A ⊆ S a subset that is bounded above. An element

α ∈ S is called a least upper bound (or supremum) of A if it has the following

properties:

1. α is an upper bound for A

2. If γ is an upper bound for A, then α ≤ γ

It is clear from the definition that there can be at most one least upper bound for a

given set A. We denote it by supA.

Example 3.9

1. {q ∈ Q : q ≤ 1/2} ⊆ Q has least upper bound 1/2

2. {q ∈ Q : q < 1/2} ⊆ Q has least upper bound 1/2

3. The set A ⊆ Q as in Example 3.6 is bounded above but has no least upper bound

96

Lower bound and greatest lower bound (or infimum) are defined in exactly the same manner, but with ≤ replaced by ≥.

Definition 3.14

Let S be an ordered set and A ⊆ S a subset. If there exists β ∈ S such that

∀x ∈ A, x ≥ β

then we say that A is bounded below and call β a lower bound for A.

An element α ∈ S is called a greatest lower bound (or infimum) of A if it has the

following properties:

1. α is a lower bound for A

2. If γ is a lower bound for A, then α ≥ γ

It is clear from the definition that there can be at most one greatest lower bound for a

given set A. We denote it by inf A.

97

Example 3.10

1. E = { 1/n : n = 1, 2, 3, . . . } ⊂ Q is bounded above and bounded below.

We have supE = 1 and inf E = 0.

Notice that, in this example, the infimum is not itself an element of E .

2. The set B ⊆ Q of Example 3.6 is bounded below but has no greatest lower bound

98
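As a small numeric illustration of Example 3.10, part 1 (a finite truncation in Python; the truncation and names are mine, and a finite check is only evidence, not a proof):

```python
from fractions import Fraction

# E = {1/n : n = 1, 2, 3, ...}, truncated to finitely many terms.
# Any finite truncation attains its maximum 1, but its minimum keeps
# shrinking towards 0: sup E = 1 (attained) while inf E = 0, and 0 is not in E.
E = [Fraction(1, n) for n in range(1, 1001)]

assert max(E) == 1                    # sup E = 1, and 1 is in E
assert min(E) == Fraction(1, 1000)    # minimum of this truncation only
assert all(x > 0 for x in E)          # 0 is a lower bound, yet 0 is not in E
# longer truncations have ever-smaller minima, approaching inf E = 0
assert min(Fraction(1, n) for n in range(1, 10**5 + 1)) == Fraction(1, 10**5)
```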

Definition 3.15

An ordered set S is said to have the least-upper-bound property if the following

holds: If A ⊆ S is not empty and is bounded above, then A has a least upper bound

in S .

We have seen in the examples above that Q does not have the least-upper-bound

property.

99

Fields

Definition 3.16 (Field)

A field is a set F equipped with two binary operations called addition (denoted +)

and multiplication (denoted using × or simply by concatenation) that satisfy the

following axioms (A1– A5, M1–M5, D):

Field axioms for addition

(A1) ∀x , y ∈ F , x + y ∈ F (closure)

(A2) ∀x , y ∈ F , x + y = y + x (commutativity)

(A3) ∀x , y , z ∈ F , (x + y) + z = x + (y + z) (associativity)

(A4) ∃0 ∈ F ,∀x ∈ F , x + 0 = x (additive identity)

(A5) ∀x ∈ F ,∃ y ∈ F , x + y = 0 (additive inverse)

100

Definition (Field, continued)

Field axioms for multiplication

(M1) ∀x , y ∈ F , xy ∈ F (closure)

(M2) ∀x , y ∈ F , xy = yx (commutativity)

(M3) ∀x , y , z ∈ F , (xy)z = x(yz) (associativity)

(M4) ∃1 ∈ F \ {0}, (∀x ∈ F , 1x = x) (multiplicative identity)

(M5) ∀x ∈ F \ {0}, (∃y ∈ F , s.t. xy = 1) (multiplicative inverse)

Distributive law

(D) ∀x , y , z ∈ F , x(y + z) = xy + xz

Example 3.11

1. Q equipped with the usual operations (defined above) forms a field.

2. Z (with the usual operations) is not a field.

101

Definition 3.17 (Ordered Field)

An ordered field is a field F that is also an ordered set and satisfies:

1. ∀x , y , z ∈ F , (y < z ⇒ x + y < x + z)

2. ∀x , y ∈ F , ((x > 0) ∧ (y > 0)⇒ xy > 0)

Example 3.12

Q is an ordered field

102

The real field

We have seen that Q is an ordered field but does not have the least-upper-bound

property.

Theorem 3.4

There exists an ordered field R that has the least-upper-bound property.

Moreover, R contains Q as a subfield.

The proof of this theorem constructs R from Q using ‘Dedekind cuts’.

In fact, R is the unique field that satisfies the axioms of an ordered field having the

least-upper-bound property.

103

Axioms for the reals

The real numbers R consist of a set equipped with two binary operations called

addition (+) and multiplication (×), and a relation < that satisfy the following

axioms (A1–A5, M1–M5, D, O1, O2, AO, MO, C):

(A1) ∀x , y ∈ R, x + y ∈ R (closure)

(A2) ∀x , y ∈ R, x + y = y + x (commutativity)

(A3) ∀x , y , z ∈ R, (x + y) + z = x + (y + z) (associativity)

(A4) ∃0 ∈ R,∀x ∈ R, x + 0 = x (additive identity)

(A5) ∀x ∈ R,∃ y ∈ R, x + y = 0 (additive inverse)

(M1) ∀x , y ∈ R, xy ∈ R (closure)

(M2) ∀x , y ∈ R, xy = yx (commutativity)

(M3) ∀x , y , z ∈ R, (xy)z = x(yz) (associativity)

(M4) ∃1 ∈ R \ {0}, (∀x ∈ R, 1x = x) (multiplicative identity)

(M5) ∀x ∈ R \ {0}, (∃y ∈ R, s.t. xy = 1) (multiplicative inverse)

104

(D) ∀x , y , z ∈ R, x(y + z) = xy + xz (distributivity)

(O1) ∀x , y ∈ R exactly one of the following hold: x < y , x = y , y < x

(O2) ∀x , y , z ∈ R, (x < y ∧ y < z)⇒ x < z

(AO) ∀x , y , z ∈ R, (y < z ⇒ x + y < x + z)

(MO) ∀x , y ∈ R, (0 < x) ∧ (0 < y)⇒ 0 < xy

(C) Every non-empty subset of R that is bounded above has a least upper bound.

105

Dedekind Cuts

Definition

A subset A ⊆ Q is called a cut if it satisfies the following:

1. A ≠ ∅ and A ≠ Q

2. ∀a, b ∈ Q, (a ∈ A) ∧ (b < a)⇒ (b ∈ A)

3. ∀a ∈ A ∃b ∈ A, a < b

Example 3.13

A√2 = {q ∈ Q | q < 0 ∨ q² < 2} A2 = {q ∈ Q | q < 2}

Let R = {A ⊂ Q | A is a cut}.

Our goal is to define an order on R and operations (+ and ×) in such a way that R

satisfies all the axioms for R.
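The cut conditions can be spot-checked numerically. Below is a sketch in Python (the membership predicates and helper names are mine); exact rational arithmetic via `fractions.Fraction` keeps the tests honest, but finitely many samples are evidence, not a proof:

```python
from fractions import Fraction

def in_A_sqrt2(q):
    """Membership test for the cut A_sqrt2 = {q in Q : q < 0 or q^2 < 2}."""
    return q < 0 or q * q < 2

def in_A2(q):
    """Membership test for the cut A_2 = {q in Q : q < 2}."""
    return q < 2

samples = [Fraction(n, 100) for n in range(-300, 301)]
members = [q for q in samples if in_A_sqrt2(q)]

# Cut property 2: closed downwards (spot-checked on samples).
assert all(in_A_sqrt2(a - 1) and in_A_sqrt2(a - Fraction(1, 7)) for a in members)

def bigger_member(a):
    """Cut property 3: exhibit a member strictly bigger than a.
    The halving search terminates because sqrt(2) is irrational, so
    every rational member sits strictly below it."""
    step = Fraction(1, 2)
    while not in_A_sqrt2(a + step):
        step /= 2
    return a + step

assert all(bigger_member(a) > a for a in members)

# A_sqrt2 is a proper subset of A_2 (cf. the order defined below).
assert all(in_A2(q) for q in members)
assert in_A2(Fraction(3, 2)) and not in_A_sqrt2(Fraction(3, 2))
```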

106

Defining an order on R

Each element of R is a subset of Q, so we can ask if one is a subset of another.

We define an order on R by declaring that A < B precisely when A ⊊ B.

Of course, we then need to check that this does give an order on R.

(i.e., that axioms O1 and O2 are satisfied.)

Exercise

Check that O1 and O2 are satisfied.

Example 3.14

With A√2 and A2 as above, we have A√2 < A2 since

a ∈ A√2 ⇒ (a < 0) ∨ (a² < 2)⇒ a < 2⇒ a ∈ A2

107

R has the least upper bound property

Claim

The ordered set R satisfies axiom C (the completeness axiom).

Let E ⊆ R be non-empty and suppose that E is bounded above by some element B ∈ R.

Define α ⊂ Q by

α = ⋃_{A∈E} A

Then α is a cut, and is the least upper bound of E .

To see that α is an upper bound for E note that:

A ∈ E ⇒ A ⊆ ⋃_{A∈E} A ⇒ A ≤ α

It is the least upper bound because:

β an upper bound ⇒ ∀A ∈ E , A ⊆ β ⇒ ⋃_{A∈E} A ⊆ β ⇒ α ≤ β

108

Addition in R

We define addition on R as follows

A + B = {a + b | a ∈ A, b ∈ B}

Exercise

Show that A + B is a cut (given that A and B are cuts).

Let 0R be the cut given by 0R = {q ∈ Q | q < 0}

Exercise

Show that 0R is the additive identity in R, and that the additive inverse of A ∈ R is

given by −A = {q − a | q ∈ Q, q < 0, a ∈ Q \ A}
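The defining property of cut addition, that q ∈ A + B exactly when q = a + b with a ∈ A and b ∈ B, can be exhibited constructively for rational cuts. A sketch in Python (the decomposition q = (2 − ε) + (3 − ε) with ε = (5 − q)/2 is mine, chosen to witness A2 + A3 = A5):

```python
from fractions import Fraction

def in_cut_below(c):
    """Membership test for the rational cut {q in Q : q < c}."""
    return lambda q: q < c

in_A2, in_A3, in_A5 = in_cut_below(2), in_cut_below(3), in_cut_below(5)

def in_sum(in_A, in_B, q):
    """q is in A + B iff q = a + b with a in A, b in B.  For q < 5 a
    witnessing pair exists explicitly: with eps = (5 - q)/2 > 0, the
    choices a = 2 - eps and b = 3 - eps satisfy a + b = q."""
    eps = (Fraction(5) - q) / 2
    if eps <= 0:
        return False          # a < 2 and b < 3 force a + b < 5
    a, b = 2 - eps, 3 - eps
    assert a + b == q
    return in_A(a) and in_B(b)

samples = [Fraction(n, 10) for n in range(-80, 81)]
assert all(in_sum(in_A2, in_A3, q) == in_A5(q) for q in samples)
```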

109

Multiplication in R

Defining multiplication is a little more involved.

We first define it for A,B ∈ R satisfying 0R ≤ A and 0R ≤ B

A× B = {ab | a ∈ A, b ∈ B, 0 ≤ a, 0 ≤ b} ∪ {q ∈ Q | q < 0}

Example 3.15

Noting that 0R ≤ A√2, we have

A√2 × A√2 = {ab | (a < 0 ∨ a² < 2) ∧ (0 ≤ a) ∧ (b < 0 ∨ b² < 2) ∧ (0 ≤ b)}

∪ {q ∈ Q | q < 0}

= {ab | (a² < 2) ∧ (0 ≤ a) ∧ (b² < 2) ∧ (0 ≤ b)} ∪ {q ∈ Q | q < 0}

= {q ∈ Q | (q < 2) ∧ (0 ≤ q)} ∪ {q ∈ Q | q < 0}

= A2

110

To define A× B when one or both of A and B are less than 0R, we can use the

definitions already given to define

A× B = −(A× (−B)) if 0R ≤ A and B < 0R

A× B = −((−A)× B) if A < 0R and 0R ≤ B

A× B = (−A)× (−B) if A < 0R and B < 0R

It is possible to show that R with the operations defined above satisfies all the axioms

required for R.

111

Positive and negative numbers

Notation:

I The set R+ = {x ∈ R : 0 < x} is called the set of positive real numbers.

I The set R− = {x ∈ R : x < 0} is called the set of negative real numbers.

I Note, zero is in neither of these sets.

I Less common is the notation: R+0 = R+ ∪ {0} and R−0 = R− ∪ {0}.

112

Addition and Cancellation

Theorem 3.5 (Addition and Cancellation)

∀a, b, c ∈ R, (a = b) ⇔ (a + c = b + c).

113

114

Inequalities

Theorem 3.6

∀x ∈ R, (0 < x)⇒ (−x < 0).

115

116

Absolute value and inequalities

The absolute value will be used extensively in the subject so we need to know how to

use it in proofs. It is an example of a situation where it may be easier to prove a

“p ⇒ q” theorem by special cases rather than all cases at once.

Definition 3.18

The modulus or absolute value of x ∈ R, denoted |x |, is defined as

|x | = x if x ≥ 0, and |x | = −x if x < 0.

Geometrically, |x | is the distance from x to the origin on the real line.

117

The following theorem gives us a way of avoiding using |x | in some cases (or vice

versa).

Theorem 3.7 (Absolute-Interval)

Let x ∈ R and a ∈ R+. Then |x | ≤ a if and only if −a ≤ x ≤ a.

118

119

The following results are useful when using the absolute value in proofs.

Theorem 3.8

1. ∀x ∈ R 0 ≤ |x |

2. ∀x ∈ R x ≤ |x |

3. ∀x ∈ R − x ≤ |x |

120

The triangle inequality

Another useful result about inequalities (which you should remember from Linear

Algebra) is:

Theorem 3.9 (Triangle Inequality)

∀x , y ∈ R |x + y | ≤ |x |+ |y |

Corollary 3.1 (reverse triangle inequality)

∀x , y ∈ R | |x | − |y | | ≤ |x − y |
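A quick randomized sanity check of both inequalities (Python; random sampling is mine and is only evidence, the proofs are the exercise):

```python
import random

# Spot-check the triangle inequality and its reverse form on many
# random pairs; both hold exactly for real numbers, and float
# rounding is monotone enough that the checks also pass numerically.
random.seed(1)
for _ in range(1000):
    x = random.uniform(-100, 100)
    y = random.uniform(-100, 100)
    assert abs(x + y) <= abs(x) + abs(y)        # triangle inequality
    assert abs(abs(x) - abs(y)) <= abs(x - y)   # reverse triangle inequality
```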

121

122

Bounded sets in R

Geometrically, we would expect that any number bigger than an upper bound is also

an upper bound of the set E .

Theorem 3.10

Let s, t ∈ R and E ⊆ R. If s is an upper bound of E and s < t then t is an upper

bound of E .

Proof:

123

How do we show w is not an upper bound of E ⊆ R?

Proposition 3.2 (Not an Upper Bound)

Let w ∈ R and E ⊆ R. Then w is not an upper bound of E if and only if there exists

x ∈ E such that x > w .

Thus to prove a number, w , is not an upper bound of a subset, we need to find at

least one number in the subset that is greater than w .

124

Definition 3.19

(i) If a < b then the subset [a, b] = {x ∈ R : a ≤ x ≤ b} is called a closed interval.

(ii) If a < b then the subset (a, b) = {x ∈ R : a < x < b} is called an open interval.

Note that: the intervals [a, b) = {x ∈ R : a ≤ x < b} and

(a, b] = {x ∈ R : a < x ≤ b} are neither open nor closed.

Both of the sets [a, b] and (a, b) are bounded above, however there is a fundamental

difference between them.

125

126

Question: Does there exist s ∈ (a, b) such that s is an upper bound of (a, b)? More

generally, do all subsets E which are bounded above contain an upper bound of

themselves?

To answer this we will need the following theorem:

Theorem 3.11

∀x , y ∈ R x < y ⇒ x < (x + y)/2 < y .

Geometrically:

This theorem states that between any two real numbers, there exists at least one

other real number.

Note: This statement is also true in Q but not in Z!

127

Proof:

128

Proposition 3.3

Let w ∈ R. Then w is an upper bound of (a, b) if and only if w ≥ b.

Thus the subset (a, b) does not contain an upper bound of itself.

Proof:

129

130

Exercise

1. Let s, t ∈ R. If s is a lower bound of E and t < s then t is a lower bound of E .

2. If a, b ∈ R with a < b, then a is a lower bound of (a, b)

3. If a, b ∈ R with a < b, then a is a lower bound of [a, b]

4. ∀s ∈ R (s is not a lower bound of E ) ⇔ ∃w ∈ E s.t. w < s

Note: a /∈ (a, b) and a ∈ [a, b].

131

We now combine the idea of upper and lower bounds:

Definition 3.20 (Bounded)

If E ⊆ R and E is bounded above and below, we say that E is a bounded set. If E is

not bounded it is called an unbounded set.

The subsets R+ and R− are important examples of unbounded subsets.

Proposition 3.4

1. R = R− ∪ {0} ∪ R+.

2. If x ∈ R− and y ∈ R+ then x < y .

Proof: - exercise -

132

Recall that b is an upper bound of (a, b) and [a, b] but b /∈ (a, b) and b ∈ [a, b]. The

situation in which a set contains its upper bound is special.

Definition 3.21 (Greatest element)

Let E ⊆ R. If γ ∈ E and γ is an upper bound of E , then γ is called the greatest

element of E .

Definition 3.22 (Least element)

Let E ⊆ R. If λ ∈ E and λ is a lower bound of E , then λ is called the least element

of E .

133

Example 3.16

E1 = {1, 2, 3, 4}

Note: The greatest element and least element, if they exist, are unique. That is, a set

can have at most one greatest element and can have at most one least element.

134

Example 3.17

E2 = (a, b)

Example 3.18

E3 = [a, b]

135

Example 3.19

E4 = {−1/n : n ∈ N}

136

Summary:

Subset Least Element Greatest Element

So, some subsets have greatest elements or least elements and some don’t. However,

consider the following two subsets:

Definition 3.23 (Set of Upper/Lower bounds)

Let E ⊆ R.

UE = {x ∈ R : x is an upper bound of E}

LE = {x ∈ R : x is a lower bound of E}

137

That every nonempty subset of R that is bounded above (below) has a supremum

(infimum) is ensured by the least-upper-bound property. This is sometimes referred to

as the completeness axiom and stated in the following form:

Axiom (Completeness). Let E be a non-empty subset of R. Every non-empty set of

upper bounds of E has a least element.

138

The following result is sometimes useful when establishing suprema.

Proposition 3.5

γ is the supremum of the set E if and only if, for any ε > 0, no element of E is greater

than γ + ε and there is an element of E greater than γ − ε.

Logic:

γ = supE ⇔ ∀ε > 0 ((∀x ∈ E x ≤ γ + ε) ∧ (∃x ∈ E s.t. x > γ − ε))

139

Example: If A = {−1/n s.t. n ∈ N} then supA = 0.

140

Since we expect to add two real numbers to get another real, or multiply two reals to

get a real, we should be able to add and multiply sup’s to get new sup’s (since all

reals are sup’s of some set).

In fact sup’s should satisfy all the axioms of reals.

How adding sup’s works is seen from the following theorem:

Theorem 3.12 (Sup Addition)

Let A and B be nonempty subsets of R and

C = {x + y ∈ R : x ∈ A and y ∈ B}.

If A and B have suprema, then C has a supremum and

supC = supA + supB.
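For finite sets the supremum is just the maximum, so the theorem's conclusion can be checked directly on random finite data. A sketch in Python (the random sets are mine; float addition is monotone, so the equality holds exactly here):

```python
import random

# For finite A and B, sup = max, and the sum achieving max(C) is the
# sum of the two individual maxima, so sup C = sup A + sup B exactly.
random.seed(0)
A = [random.uniform(-5, 5) for _ in range(50)]
B = [random.uniform(-5, 5) for _ in range(50)]
C = [x + y for x in A for y in B]

assert max(C) == max(A) + max(B)
```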

141

Proof:

142

Mathematical Induction

We now consider our last method of proof - Induction.

Denote the set of positive integers by N = {1, 2, 3 . . .}.

Suppose p(n) is a condition on n ∈ N and that we wish to prove theorems such

as

∀n ∈ N, p(n).

That is, we want to prove that p(n) is true for all n ∈ N. For example,

1. p(n) : 1 + 2 + 3 + . . .+ n = n(n + 1)/2

2. p(n) : 2ⁿ > n

3. p(n) : (1 + x)ⁿ ≥ 1 + nx , x ∈ R, x > −1

To do this we use the Principle of Mathematical Induction.

143

Mathematical Induction

The following inference rule allows us to replace showing p(n) is true (for all n) with

showing a conditional, p(n)⇒ p(n + 1) is true (for all n) - generally much

easier.

Theorem 3.13 (Principle of Mathematical Induction)

If p(n), n ∈ N are statements such that:

I p(1) is true, and

I ∀k ∈ N (p(k)⇒ p(k + 1)) is true

then,

∀n ∈ N p(n) is true

This means that all the statements p(1), p(2), . . . are true.

The proof relies on the well ordering, or least element, property of N.

144

Theorem 3.14

Any non-empty subset of the natural numbers has a least element.

We use this theorem to prove Theorem 3.13. Given a proposition P(n), let

S = {n ∈ N | P(n) is false}.

If we can prove S is empty, then we have proven P(n) is true for all n ∈ N.

If P(1) is true then S is not all of N.

If S is non-empty then it possesses a least element m, say. Since P(1) is true, m > 1,

so m − 1 ∈ N; as m is the least element of S , m − 1 /∈ S , so P(m − 1) is true.

If we know P(k)⇒ P(k + 1) then P(m − 1) is true implies P(m) is true, but this

contradicts m ∈ S!

We conclude that S has no least element, hence it is empty.

145

To use Theorem 3.13, we need to

I Show p(1) is true.

I Show that, if we assume p(k) true then we can deduce p(k + 1) is true.

I Then, the deduction principle tells us that p(k)⇒ p(k + 1) is true.

I Then the Induction principle tells us that p(1), p(2), . . . are all true
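The steps above can be rehearsed numerically for the sum formula of Example 3.20 below. A sketch in Python (the function names are mine; a finite check is evidence only — the induction principle is what covers all n):

```python
# p(n): 1 + 2 + ... + n = n(n+1)/2
def lhs(n):
    return sum(range(1, n + 1))

def rhs(n):
    return n * (n + 1) // 2

assert lhs(1) == rhs(1) == 1                      # base case p(1)
# the inductive step p(k) => p(k+1) reduces to rhs(k) + (k+1) == rhs(k+1)
assert all(rhs(k) + (k + 1) == rhs(k + 1) for k in range(1, 1000))
# and indeed p(n) holds for every n checked
assert all(lhs(n) == rhs(n) for n in range(1, 200))
```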

146

Example 3.20

p(n) : 1 + 2 + 3 + . . .+ n = n(n + 1)/2

147

Example 3.21

p(n) : 2ⁿ > n

148

149

150

Generalisations

Some more general forms of induction exist.

I Shift from n ≥ 1 to n ≥ −m (m > 0);

that is, p(−4), p(−3), p(−2), . . ..

Show p(−4) is true then p(k)⇒ p(k + 1)

I Another form of induction (Strong Mathematical Induction) allows you to

assume all of p(k), p(k + 1), . . . p(n) are true in order to prove p(n + 1)

I Show p(k) is true for k ≤ n ∈ N

I Show p(k), p(k + 1), . . . p(n)⇒ p(n + 1)

I Conclude p(n) is true for all integers n ≥ k .

151

The Archimedean Property

An important consequence of the Completeness Axiom is called the Archimedean

Property.

Theorem 3.15 (Archimedean Property - Form I)

The set N of natural numbers is unbounded above in R

Proof.

152

153

The two other theorems equivalent to the Archimedean Property are:

Theorem 3.16 (Archimedean Property - Form II)

If x ∈ R+ then ∃n ∈ N s.t. n − 1 < x ≤ n

This gives us the picture that there is always an integer to the left and to the right of

any real number.

Theorem 3.17 (Archimedean Property - Form III)

If x , y ∈ R+ then ∃n ∈ N s.t. y < nx

This is usually paraphrased as: “No matter how small a measuring stick it is always

possible to place it end-to-end to extend beyond any distance”.
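Form III is constructive: floor(y/x) + 1 copies of the stick always suffice. A sketch in Python (function name mine; exact rationals avoid float edge cases):

```python
from fractions import Fraction
import math

def archimedean_n(x, y):
    """Given rationals x, y > 0, return an n in N with y < n*x
    (Theorem 3.17).  The choice n = floor(y/x) + 1 always works,
    since then n > y/x."""
    n = math.floor(y / x) + 1
    assert y < n * x
    return n

assert archimedean_n(Fraction(1, 1000), Fraction(7)) == 7001   # tiny stick, 7001 copies
assert archimedean_n(Fraction(2), Fraction(1)) == 1            # one copy already suffices
```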

154

Density of the Rationals

Theorem 3.18

If x and y are real numbers with x < y , then there exists a rational number r such

that x < r < y .

Proof.

155

An easy consequence of Theorem 3.18 tells us that we can also find an irrational

number between any two reals. This means that wherever you look along the real line

- every interval, no matter how small, contains a rational number and an irrational

number.
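The standard proof of Theorem 3.18 is itself a recipe: pick n with 1/n < y − x (Archimedean Property), then step to the first multiple of 1/n past x. A sketch in Python (names mine; rational endpoints keep the arithmetic exact, whereas the theorem allows real x, y):

```python
from fractions import Fraction
import math

def rational_between(x, y):
    """Given rationals x < y, return r in Q with x < r < y.
    Proof sketch: choose n with 1/n < y - x, then
    r = (floor(n*x) + 1)/n satisfies x < r <= x + 1/n < y."""
    n = math.floor(1 / (y - x)) + 1
    r = Fraction(math.floor(n * x) + 1, n)
    assert x < r < y
    return r

assert Fraction(1, 3) < rational_between(Fraction(1, 3), Fraction(1, 2)) < Fraction(1, 2)
# works in arbitrarily small intervals, matching the density statement
assert 0 < rational_between(Fraction(0), Fraction(1, 10**6)) < Fraction(1, 10**6)
```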

156

Functions

Recall the informal definition of a function:

A function (or map) f : A→ B from the set A, called the domain, to the set B called

the codomain, is a “rule” that assigns to every x ∈ A, a unique element y ∈ B,

called the image of x . The subset of all images is called the range of f .

Example 4.1

157

This informal definition is not quite precise enough for our purposes, since the

question of when something is, or isn’t, a rule is not easy to answer. To help us get a

handle on the precise idea of a function, we use the concept of a relation.

Definition 4.1 (Relation)

A relation R between two sets A and B, is a subset of the Cartesian product

A× B.

Thus the “rule” is defined by the pairs.

Example 4.2

158

Definition 4.2 (Function)

A function f : A→ B is a triple (A,B, f ), where f ⊆ A× B is a relation, such

that

1. if (x , y1) ∈ f and (x , y2) ∈ f then y1 = y2,

2. if x ∈ A then ∃z ∈ B such that (x , z) ∈ f .

159

Example 4.3

If A = {−1, 0, 1}, B = {T ,F} then

160

Special types of functions

Injective:

Surjective:

Bijective:

We will discuss functions again later in the subject. For now we are interested in

functions with domain A = N.

161

Sequences

Consider a function whose domain is N (or sometimes a subset of N).

162

Definition 4.3 (Sequence)

A sequence is a function f : N→ B (usually B = R) and is written

f (1), f (2), f (3), . . .

or

f1, f2, f3, . . .

or

{fn}

or

{fn}n≥1,

or

(fn)

or

(fn)n≥1,

where the image of n, f (n), is called the nth term of f.

163

Convergence of Sequences

In real analysis we are mostly interested in how the terms of a sequence approximate

some real number, L.

Question: As n gets larger are the terms fn a better approximation to L than previous

terms?

Think of decimal approximations to π. Terminating the expansion gives a rational

number:

f1 = 3.1, f2 = 3.14, f3 = 3.142, f4 = 3.1416, f5 = 3.14159, . . .

or

f1 = 31/10, f2 = 314/100, f3 = 3142/1000, f4 = 31416/10000, . . .

Is each successive rational a better approximation to π?

Sequences that do behave this way we will end up calling: convergent

164

Possible long-term behaviours of a sequence

165

Definition of Convergence of a Sequence

Definition 4.4 (Sequence Limit)

Let L ∈ R and f : N→ R (denoted n 7→ fn) be a sequence.

We say that L is the limit of the sequence fn, if

for all ε > 0 there exists an M ∈ N such that

for all n > M, we have |fn − L| < ε

This is written

lim_{n→∞} fn = L

If so, we say that the sequence fn converges to L

166

If we write this using logic symbols, we have

∀ε > 0 ∃M ∈ N s.t. ∀n > M, |fn − L| < ε

Thinking geometrically, this says:

fn converges to L if:

1. for any distance ε > 0 we choose,

2. we can always find an M ∈ N,

3. such that for all n > M,

4. the distance of every fn to L is smaller than ε.

167

Definition 4.5 (Sequence Convergence)

If there is some L such that lim_{n→∞} fn = L (i.e., if such an L exists) then we say that the

sequence fn converges.

If there is no such L then we say that the sequence fn diverges.

168

For the previous example, fn = 1− 1/n:

Question: Given ε > 0 what value of M do we need?

169

The previous calculation gives us a candidate for M (which depends on ). We now

need to prove that it “does the job”, that is, it will enable us to prove that

∀ε > 0 (∃M ∈ N s.t. (∀n > M, |fn − L| < ε))

is true.

Proof

170

Note the following notation:

I The phrase “For all ε > 0, p(ε)” will be written:

∀ε > 0 p(ε)

where p(ε) is some condition on ε.

This is an abbreviation for the logic statement:

∀ε ∈ R+ p(ε).

I Similarly, the phrase “For all n > M p(n)” will be written:

∀n > M p(n)

where p(n) is some condition on n.

This is an abbreviation for the logic statement:

∀n ∈ {M + 1,M + 2, . . .} p(n)

171

Example 4.4

f : N→ R, fn = (2ⁿ − 1)/2ⁿ

172

173

174

Example 4.5

g : N→ R, gn = 1 + (−1/2)ⁿ

175

176

177

Note: Not all sequences behave like fn and gn, i.e., “get closer to some number”.

Example 4.6

an = (−1)n

178

Luckily the proof that a sequence converges to L is almost always of the same form.

We have a simple template:

179

The integer M does not have to be the smallest integer that works.

The difficult part is finding M (which depends on ε) and then filling in the

middle.

Usually this is done before writing the proof. In the proof you have to work out how

to get from your already computed M to |fn − L| < ε.

Sometimes the previous calculation can be “reversed”. Other times a different route

is required.
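The template can be rehearsed numerically for the earlier example fn = 1 − 1/n. A sketch in Python (the candidate M(ε) = ⌈1/ε⌉ comes from |fn − L| = 1/n; spot-checking finitely many n > M is evidence, not the proof itself):

```python
import math

def f(n):
    return 1 - 1 / n

L = 1

def M(eps):
    """Candidate M for fn = 1 - 1/n: since |fn - L| = 1/n,
    any M >= 1/eps works; take M = ceil(1/eps)."""
    return math.ceil(1 / eps)

# The definition requires |fn - L| < eps for ALL n > M; here we can
# only spot-check a finite stretch of indices beyond M.
for eps in (0.5, 0.1, 1e-3, 1e-6):
    m = M(eps)
    assert all(abs(f(n) - L) < eps for n in range(m + 1, m + 1000))
```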

180

For proofs about convergence of sequences, we often need to use the ceiling

function.

Definition 4.6 (Ceiling)

Let x ∈ R. We define the ceiling of x , ⌈x⌉, to be the smallest integer greater than or

equal to x . Equivalently,

⌈x⌉ = inf{k ∈ Z : x ≤ k}.

Geometrically, ⌈x⌉ is the closest integer to the right of (or equal to) x .

Note, inf is only defined for non-empty sets. The fact that {k ∈ Z : x ≤ k} is

non-empty is a consequence of the Archimedean Property (Form II).
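The definition can be executed directly and compared against the library ceiling. A sketch in Python (`math.ceil` is the practical tool; the search below merely mirrors Definition 4.6, and int(x) − 1 is always strictly below x):

```python
import math

def ceil_from_definition(x):
    """Smallest integer k with x <= k (Definition 4.6), found by a
    short upward search starting strictly below x."""
    k = int(x) - 1
    while k < x:
        k += 1
    return k

for x in (2.0, 2.1, -2.1, 0.0, 7.999):
    assert ceil_from_definition(x) == math.ceil(x)
```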

Example 4.7

181

Example 4.8

Show that the sequence (n/(2n + 1))n∈N converges to 1/2.
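For the sequence n/(2n + 1), the bound |fn − 1/2| = 1/(2(2n + 1)) < 1/(4n) suggests the candidate M(ε) = ⌈1/(4ε)⌉. A numeric rehearsal in Python (this M is my choice of candidate; the full proof is the worked example):

```python
import math

def f(n):
    return n / (2 * n + 1)

L = 0.5

def M(eps):
    """|f(n) - 1/2| = 1/(2(2n+1)) < 1/(4n), so M = ceil(1/(4*eps))
    makes the tail estimate go through."""
    return math.ceil(1 / (4 * eps))

for eps in (0.25, 0.01, 1e-5):
    m = M(eps)
    assert all(abs(f(n) - L) < eps for n in range(m + 1, m + 2000))
```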

182

183

184

Example 4.9

For the sequence fn = 1/(n² + 6n + 4), which converges to L ∈ R,

find an M ∈ N such that, if ε > 0, and n ∈ N with n > M, then |fn − L| < ε.

185

186

187

188

Chain of inequalities

We use a chain of inequalities to simplify expressions.

189

Notes:

I Most books use “N” in place of “M”, in which case the proof of the existence of

a limit is called an “ε–N proof”.

I Many books use a “challenge-response” explanation of the idea of convergence.

Consider a two player game:

Player A chooses any ε > 0 and then challenges player B to find a natural number M

such that (∀n > M), |fn − L| < ε.

If player B can do this for any ε > 0 then B wins and L is the limit of the

sequence.

190

Written as a proof, player B needs M(ε) - hence producing an M for any ε > 0 - and

also needs the calculation starting from M(ε) and showing that |fn − L| is indeed less

than ε for all n > M. The proof shows that no matter what ε > 0 is, for every n in the

subset

{n ∈ N : n > M}

it is true that |fn − L| < ε.

Note: This approach appears to put time into the proof, i.e., if the

challenge-response goes on “forever” then the theorem is proved - this is not the

case. The logic statement only requires showing the existence of an integer (M) for every

positive real number (ε) such that a certain condition is satisfied (|fn − L| < ε) - this

has nothing to do with some process going on forever.

191

Divergent Sequences

To show that (fn) does not converge to L, we need to show that

∼ [ lim_{n→∞} fn = L ]

is true. That is,

∼ [∀ε > 0 (∃M ∈ N s.t. (∀n > M, |fn − L| < ε))] (2)

≡ ∃ε > 0 s.t. (∀M ∈ N (∃n > M, s.t. |fn − L| ≥ ε)) (3)

This is equivalent to saying that:

we can find an example of ε > 0 so that,

for any M ∈ N,

we can always find some n > M such that |fn − L| ≥ ε.

192

Note:

I We only need to find one such ε > 0, since the statement is now a “there exists”

statement.

I We only need to find one term (i.e., there exists n) such that |fn − L| ≥ ε, for

some n > M.

Example 4.10

Show that (−1)n does not converge to L = 1.

Proof:

193

Example 4.11

Show that 1/n does not converge to L = 1, but it does converge to L = 0.

Proof:

194

Not convergent

Beware the distinction between “not convergent to L” and “not convergent”.

To prove (fn) is not convergent, we need to prove that, for any L ∈ R,

lim_{n→∞} fn ≠ L.

This can usually be done by cases on L: choose ranges of L where lim_{n→∞} fn ≠ L is

easier to prove.

195

There is a theorem for proving divergence that generally gives simpler proofs. To

use the theorem we need the idea of “subsequences”. A subsequence of the

sequence {fn}n≥1 is a sequence fn1 , fn2 , fn3 , fn4 , . . . where n1 < n2 < n3 < . . . , i.e., pick

a subset (without changing the order) of the terms of {fn}n≥1. The subsequence is

usually written in the form {fnk}k≥1.

Example 4.12

The following are subsequences of {1/n}:

Theorem 4.1 (Divergence Criteria for Sequences)

A sequence {fn}n≥1 is divergent if any of the following hold:

1. {fn}n≥1 has two subsequences {fnk}k≥1 and {fnp}p≥1 that converge to two

different limits.

2. {fn}n≥1 has a subsequence that is divergent.

3. {fn}n≥1 is unbounded.
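Criterion 1 in action for the sequence of Example 4.13 below: its even-index and odd-index subsequences are constant, with different limits. A minimal illustration in Python (the finite index ranges are mine):

```python
# a_n = (-1)^n: the even-index subsequence is 1, 1, ... and the
# odd-index subsequence is -1, -1, ...; two subsequences with two
# different limits, so criterion 1 of Theorem 4.1 gives divergence.
def a(n):
    return (-1) ** n

evens = [a(2 * k) for k in range(1, 100)]
odds = [a(2 * k + 1) for k in range(1, 100)]

assert all(x == 1 for x in evens)    # constant subsequence, limit 1
assert all(x == -1 for x in odds)    # constant subsequence, limit -1
```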

196

Example 4.13

The sequence fn = (−1)n is divergent.

Proof:

197

Special Types of Sequences

For certain types of sequences we do not need full ε–M proofs to show

convergence.

Definition 4.7

The following sequences are called monotonic sequences.

A sequence (fn) is:

I Increasing if: ∀n ∈ N fn+1 > fn

I Non-Decreasing if: ∀n ∈ N fn+1 ≥ fn

I Decreasing if: ∀n ∈ N fn+1 < fn

I Non-Increasing if: ∀n ∈ N fn+1 ≤ fn

198

Example 4.14

199

200

Bounded Sequences

Definition 4.8 ((Un)Bounded Above)

I A sequence (fn) is bounded above if there exists an s ∈ R such that

for all n ∈ N, fn ≤ s.

I A sequence (fn) is unbounded above if no such s exists.

Similarly,

Definition 4.9 ((Un)Bounded Below)

I A sequence (fn) is bounded below if there exists an s ∈ R such that

for all n ∈ N, fn ≥ s.

I A sequence (fn) is unbounded below if no such s exists.

Note it can be proved (do it as an exercise) that unbounded sequences do not

converge and so are sometimes also called divergent.

201

Definition 4.10 (Bounded)

A sequence is called bounded if it is both bounded above and bounded below.

Example 4.15

Prove that the sequence (2ⁿ) is not bounded.

Proof:

202

203

Combining the ideas of bounded and monotonic gives a very useful theorem:

Theorem 4.2 (Non-decreasing and convergence)

Let (fn) be a non-decreasing sequence. If (fn) is bounded above, then it is

convergent.

Note: This theorem tells us that (fn) has a limit, but not what that limit is.

(The limit is actually supA, where A is the set of all terms of (fn).)
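A classic use of this theorem is the recursive sequence f1 = √2, f_{n+1} = √(2 + f_n), which is non-decreasing and bounded above by 2. A numeric sketch in Python (the example and iteration count are mine; the finite check illustrates rather than proves the monotonicity and bound):

```python
import math

# f1 = sqrt(2), f_{n+1} = sqrt(2 + f_n): non-decreasing and bounded
# above by 2, so Theorem 4.2 guarantees convergence -- here to 2,
# the supremum of the set of terms.
terms = [math.sqrt(2)]
for _ in range(20):
    terms.append(math.sqrt(2 + terms[-1]))

assert all(b >= a for a, b in zip(terms, terms[1:]))   # monotone
assert all(t < 2 for t in terms)                        # bounded above by 2
assert abs(terms[-1] - 2) < 1e-9                        # limit estimate
```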

204

205

206

There is a similar theorem for non-increasing, bounded below sequences. When we

combine the two, we get:

Theorem 4.3 (Bounded monotonic sequence)

Every bounded, monotonic sequence is convergent.

To use this theorem we need to show:

I The sequence is bounded above.

I The sequence is bounded below.

I The sequence is monotonic.

If Lu and Ll are the upper and lower bounds, then

Ll ≤ lim fn ≤ Lu

that is, the bounds give an estimate of the limit. The better the bounds, the better

the estimate.

207

Algebra of Limits

Now that we have precise definitions of convergence, we can prove some limit

laws.

Theorem 4.4

If fn → α and gn → β then:

1. (fn + gn)→ α + β.

2. (fn · gn)→ α · β.

3. fn/gn → α/β, for β ≠ 0.

4. k · fn → k · α, for any k ∈ R.

208

Many of these (and other) algebra theorems use the following:

Recall “∀ε > 0” is an abbreviation for “∀ε ∈ R+”. It does not matter how the

elements of R+ are “represented” so long as they give all of R+. This allows us to

change the ε to some other function, g(ε), so long as the following condition

holds:

{g(ε) : ε > 0} = R+

Examples of valid functions:

1. g1(ε) = ε/2

2. g2(ε) = ε²

3. g3(ε) = log(ε + 1)

Examples of invalid functions:

1. g4(ε) = 1 + ε

2. g5(ε) = ε² − 1

3. g6(ε) = log(ε)

If g() is valid then we get logical equivalence. For example, all the statements below

are logically equivalent.

1. ∀ε > 0 ∃M ∈ N s.t. ∀n > M |fn − L| < ε

2. ∀ε > 0 ∃M ∈ N s.t. ∀n > M |fn − L| < ε/2

3. ∀ε > 0 ∃M ∈ N s.t. ∀n > M |fn − L| < ε²

In the above three examples |fn − L| still has to be less than some positive real

number. What will in general change is M(ε).

209

Proof of (fn + gn)→ α + β

210

211

The downside of the convergence definition is that you need to know the limit. Can

we get around this for an arbitrary (convergent) sequence?

Geometrically, we have

This suggests that if (fn) converges, then |fn − fm| → 0. What we would like is the

converse: “if |fn − fm| → 0 then (fn) converges”.

212

Cauchy Sequences

Definition 4.11 (Cauchy Sequence)

A sequence fn of real numbers is called a Cauchy Sequence if for each ε > 0 there

exists an M ∈ N such that whenever n,m > M we have |fn − fm| < ε.
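The point of the Cauchy condition is that it mentions no limit. A sketch in Python with the partial sums of Σ 1/k² (the bound Σ_{k>m} 1/k² < 1/m and the choice M = ⌈1/ε⌉ are mine; the limit π²/6 is never used):

```python
import math

def s(n):
    """Partial sum s_n = sum_{k=1}^{n} 1/k^2."""
    return sum(1 / k**2 for k in range(1, n + 1))

# For n > m, |s_n - s_m| = sum_{k=m+1}^{n} 1/k^2 < 1/m, so
# M = ceil(1/eps) witnesses the Cauchy condition -- no limit needed.
for eps in (0.1, 0.01):
    M = math.ceil(1 / eps)
    pairs = [(M + i, M + j) for i in range(1, 30) for j in range(1, 30)]
    assert all(abs(s(n) - s(m)) < eps for n, m in pairs)
```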

Example 4.16

213

Lemma 4.1 (Convergent-Cauchy)

Every convergent sequence is a Cauchy sequence.

Proof.

214

Theorem 4.5 (Cauchy-Bounded)

Every Cauchy sequence is bounded.

Proof: Omitted - it’s rather long – see just about any real analysis text book.

215

Cauchy’s Convergence Criterion

Theorem 4.6 (Cauchy’s Convergence Criterion)

A sequence (fn) converges if and only if for every ε > 0 there exists M ∈ N such that

for all n,m > M,

|fn − fm| < ε.

This is known as Cauchy’s Convergence Criterion.

Proof:

216

Limit Points

We now generalise the idea of a point γ being supE .

Definition 5.1 (Limit Points)

Let E ⊆ R and ` ∈ R. Then ` is called a limit point of E if, given any real number

δ > 0 there is a point x ∈ E such that

0 < |x − `| < δ.

Logic: (` a limit point of E ) ⇔ ∀δ > 0 ∃x ∈ E s.t. 0 < |x − `| < δ

Note that:

1. The condition 0 < |x − `| excludes x = `.

2. E may or may not contain the limit point.

3. Limit points are also called accumulation points or cluster points in some

books.

217

218

Example 5.1

If E = {x ∈ R : sin(1/x) = ±1} then ` = 0 is a limit point of E (and 0 /∈ E ).

219

220

Definition 5.2 (Deleted neighbourhood)

Let ` ∈ R and δ > 0. The set defined as

N0(`, δ) = {x ∈ R : 0 < |x − `| < δ}

is called a deleted neighbourhood of `.

221

We can extend Definition 5.1 to allow ±∞ to be limit points.

Definition 5.3 (Limit point at Infinity)

Let E ⊆ R. Then if, given any α > 0, there is an element x ∈ E such that x > α, we

say that E has a limit point at ∞ (or, if there is an x ∈ E with x < −α, that E has a limit point at −∞).

This is equivalent to saying that E is unbounded above (respectively unbounded

below).

Note that;

1. “±∞” are not elements of R, but are limit points of R.

2. The only limit point of N is ∞.

222

Continuous Functions

We would like to generalise the idea of convergence of sequences to convergence of

functions to some limit L. That is, we will think about functions with domain E ⊆ R

where E is not N.

This means we want to know what happens to f (x) as x gets close to some point `.

1. n→∞ becomes x → ` ∈ R.

2. fn → L becomes f (x)→ L.

223

Similarly for functions

1. ` is not generally in E ,

2. f may not be defined at x = ` .

So we would like to define f → L without assuming that ` is in the domain of f . This

requires us to think about what “gets close to” means.

• ` /∈ E requires us to define the deleted neighbourhood.

• f not defined at x = ` will be avoided by using |f (x)− L| < ε with x ≠ `.

224

Example 5.2

225

We can now generalise the idea of sequence limits to function limits:

Definition 5.4 (Limits of Functions)

Let f be the function f : E → R. Let ` be a limit point of E and f be defined on a

deleted neighbourhood of ` (that is, N0(`, δ) ⊆ E ). We say that the limit of f as x

tends to ` is L ∈ R, if: for any ε > 0 there is a δ > 0 such that for all x

satisfying

0 < |x − `| < δ

we have that

|f (x)− L| < ε.

We write this as

lim_{x→`} f = L

or f → L as x → `. We also say that f converges to L as x tends to `.

Logic Form:

∀ε > 0 (∃δ > 0 s.t. (∀x ∈ E (0 < |x − `| < δ ⇒ |f (x)− L| < ε))) .

226

To motivate the conditional

(0 < |x − `| < δ) ⇒ (|f (x)− L| < ε)

consider the following example:

f : R \ {2} → R ; f (x) = (x² − 4)/(x − 2).

How does the value of f (x) change as the value of x gets closer to x = 2?

227

228

Logic Form:

∀ε > 0 (∃δ > 0 s.t. (∀x ∈ E (0 < |x − `| < δ ⇒ |f (x)− L| < ε))) .

It might be easier to read when written in the form:

∀ε > 0 ∃δ > 0 s.t. ∀x ∈ E p(x , δ, ε)

where

p(x , δ, ε) : (0 < |x − `| < δ) ⇒ (|f (x)− L| < ε).

Thus: for every ε > 0 we need to show there exists a δ > 0 such that p(x , δ, ε) is

true.

p(x , δ, ε) is true if: whenever x is in the deleted neighbourhood (i.e., 0 < |x − `| < δ),

it is the case that f (x) is within the strip (i.e., |f (x)− L| < ε).

229

Recall the logical equivalence: If A ⊂ B then,

∀x ∈ B (x ∈ A⇒ p(x)) ≡ ∀x ∈ A p(x) .

This enables us to rewrite the limit definition as

∀ε > 0 (∃δ > 0 s.t. (∀x ∈ N0(`, δ), |f (x)− L| < ε))

where N0(`, δ) is the subset 0 < |x − `| < δ.

This form of the logic then reads as:

For every ε > 0 we need to show there exists a δ > 0 such that for every x in the

deleted neighbourhood it is the case that |f (x)− L| < ε.

230

Notes:

I f is never evaluated at x = ` so it does not matter if ` /∈ E .

I N0(`, δ) ⊆ E is a strong restriction on the domain E .

I The definition requires ` ∈ R but can be extended to allow for limit points at

±∞.

I We specify ε first and then find the deleted neighbourhood of ` such that f (x) is

always within ε of L.

231

Note the similarity with the ε-M convergence definition:

232

I Thus a proof that f → L as x → ` has a very similar template to the one we

used for sequences.

I Before the proof (in the strategy section) we use |f (x) − L| < ε to find a suitable δ(ε). That is, we start with |f (x) − L| < ε and try to get to 0 < |x − `| < δ(ε).

I In the proof we start with 0 < |x − `| < δ(ε) and try to get to |f (x) − L| < ε.

Proof Template:

233

Geometrically we have the picture:

234

Example 5.3

If f : R → R, x 7→ x², prove that lim_{x→2} f (x) = 4.

235

We will need:

Lemma 5.1

Let a, x , y ∈ R. If a < min(x , y) then a < x and a < y .

Proof of previous example:

236

237
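The strategy for this example typically produces δ(ε) = min(1, ε/5): writing |x² − 4| = |x − 2||x + 2|, restricting to |x − 2| < 1 gives |x + 2| < 5, and Lemma 5.1 lets both bounds be imposed at once. A numerical spot-check of that choice (a sanity check under the assumption this is the δ found in the strategy, not a proof):

```python
import random

def delta(eps):
    # candidate from the strategy: if |x - 2| < 1 then |x + 2| < 5,
    # so |x^2 - 4| = |x - 2||x + 2| < 5|x - 2|
    return min(1.0, eps / 5.0)

random.seed(1)
for eps in [1.0, 0.1, 0.003]:
    d = delta(eps)
    for _ in range(1000):
        x = 2 + random.uniform(-d, d)
        if x != 2:  # deleted neighbourhood only
            assert abs(x**2 - 4) < eps
print("min(1, eps/5) passed all sampled points")
```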

Beware bad points

Functions which are not defined near limit points or are unbounded near limit points

need special care. Choosing δ small enough can help to avoid these points.

Example 5.4

Prove that

lim_{x→1} (x² − 1)/(2x − 1) = 0.

238

239

Proof:

240

241

One-sided limits

The definition of one-sided limits is the same as limits, except the deleted

neighbourhood N0(`, δ) is replaced by left or right deleted neighbourhoods.

242

Left and Right Limits

The left limit is written:

lim_{x→`−} f (x) = L

"x approaches ` from below (or from the left)".

The right limit is written:

lim_{x→`+} f (x) = L

"x approaches ` from above (or from the right)".

243

The following theorem is useful for showing a limit does not exist.

Theorem 5.1 (Left-Right Limits)

Let f : E → R be a function. Then

lim_{x→`} f (x) = L

if and only if

lim_{x→`−} f (x) = L and lim_{x→`+} f (x) = L.

244

Example 5.5

Prove that

f (x) =
  x − 1, x ≤ 2
  x + 1, x > 2

does not have a limit as x → 2.

245

246
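Numerically (an illustration of the idea, not the proof) the two one-sided limits are visibly different, so by Theorem 5.1 no limit can exist:

```python
def f(x):
    # the piecewise function from Example 5.5
    return x - 1 if x <= 2 else x + 1

left  = [f(2 - h) for h in [0.1, 0.01, 0.001]]   # tends to 1
right = [f(2 + h) for h in [0.1, 0.01, 0.001]]   # tends to 3
print(left, right)
```

Since 1 ≠ 3, the hypothesis of Theorem 5.1 fails and the (two-sided) limit does not exist.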

The limit points ±∞

We can now think about what happens to a function as:

1. “x → ±∞”

2. x → ` but “f (x)→ ±∞”.

f bounded.

In the first case, as with sequences, we are really asking, what happens to f as x gets

arbitrarily large (positive or negative)? That is, what values can f take in the

neighbourhood

N∞(δ) = {x ∈ R : x > δ}

or similarly in the neighbourhood

N−∞(δ) = {x ∈ R : x < δ}?

So we are looking at what happens on an unbounded subset of R.

247

Thus the logic form of the definition of lim_{x→∞} f (x) = L is:

∀ε > 0 (∃δ > 0 s.t. (∀x ∈ E (x > δ ⇒ |f (x) − L| < ε))).

where E is the domain of f . Note, E must be unbounded above for lim_{x→∞} f (x) to be defined. This is logically equivalent to:

∀ε > 0 (∃δ > 0 s.t. (∀x ∈ N∞(δ) |f (x) − L| < ε)).

Geometrically this gives:

248

f unbounded.

In the second case, if f is unbounded as x → ` ∈ R we say lim_{x→`} f (x) = +∞ if

∀α > 0 (∃δ > 0 s.t. (∀x ∈ E (0 < |x − `| < δ ⇒ f (x) > α))).

where E is the domain of f . We also say that f is divergent to ∞.

Note that there are analogous results for −∞.

Remember that: ∞ is not a real number, but a convenient notation for unbounded

subsets of R or as limit points.

249

Example 5.6

250

Limit Laws

Now that we have a precise definition of a limit, we can prove the standard limit

laws.

Theorem 5.2

If f → α and g → β as x → ` where α, β ∈ R, then

1. f + g → α + β

2. f · g → α · β

3. f /g → α/β, provided β ≠ 0

These are proved in a very similar way to the sequence limit laws and are left as an

exercise.

251

When a limit does not exist at x = `.

There are generally two ways to show that a function has no limit as x → `.

1. Prove the negation of the limit definition is true.

2. Use some theorems, for example left and right limits must both exist and be

equal.

Example 5.7

252

Continuity

Question: When is lim_{x→`} f (x) = f (`)? That is, when can we just substitute to get the value of a limit?

Consider the following functions at x = 0:

253

254

Definition 5.5 (Continuous)

Let f be a real valued function with domain E ⊆ R. Suppose ` ∈ E and E contains a

neighbourhood of `. Then f is said to be continuous at x = `, if

lim_{x→`} f (x) = f (`).

If D ⊆ E and f is continuous at all ` ∈ D, then f is said to be continuous on

D.

A very useful consequence of this is that, if we know a function is continuous at x = `

then we immediately know the value of the limit as x → `. We get the value of the

limit by substituting x = `, that is f (`).

255

So continuity at x = ` requires:

1. lim_{x→`} f (x) ∈ R

2. f (`) is defined (that is, ` ∈ E )

3. the limit equals the value of f at `.

The definition requires N(`, δ) ⊆ E , so that x = ` is in the domain of f (so f (`) must

be defined). Thus if f is not defined at x = ` it cannot be continuous at x = `.

256

Definition 5.6

A neighbourhood of x = ` is the open interval

N(`, δ) = {x ∈ R : |x − `| < δ}

for some δ > 0.

Note the difference between a neighbourhood and a deleted neighbourhood. Here,

x = ` is included.

Thus the answer to the original question is:

If f is continuous at x = ` then its limit as x → ` equals the value of the

function.

This is only really useful if we know a function is continuous at x = ` without proving the limit exists (via ε-δ), which is where the continuity laws come in.

257

Theorem 5.3

Let f and g be continuous at x = `. Then the following functions are continuous at

x = `.

1. f + g

2. fg

3. f /g , as long as g(`) ≠ 0

4. f ◦ g , as long as f is continuous at x = g(`).

These can all be proved using the modified ε-δ Definition 5.7, or using the limit laws.

258

We also have the following theorem:

Theorem 5.4

The following functions are all continuous on the stated domains:

I Polynomials: ∑_{n=0}^{m} a_n x^n; D = R, a_n ∈ R

I e^x, cos(x), sin(x); D = R

I log(x); D = R+

I x^{1/p}; D = R+, p ∈ N (that is, pth root).

259

Since the limit point ` is required to be in the domain of f , a deleted neighbourhood is not required in the ε-δ definition, which gives the following equivalent definition:

Definition 5.7 (Continuity at a point)

Let f be a real valued function with domain E and ` ∈ E . Then f is continuous at x = ` if: for any ε > 0 there exists δ(ε) > 0 such that for all x satisfying

|x − `| < δ(ε)

we have

|f (x) − f (`)| < ε.

Note: The original ε-δ definition of a limit requires a deleted neighbourhood, but

Definition 5.7 does not.

260

Example 5.8

Show that f : R→ R is continuous at x = 0, where

f (x) =
  x sin(1/x), x ≠ 0
  0, x = 0.

261

262

263
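The key inequality in Example 5.8 is |x sin(1/x) − 0| ≤ |x|, which makes δ(ε) = ε work in Definition 5.7. A quick check of the bound (illustrative only):

```python
import math

def f(x):
    # the function from Example 5.8
    return x * math.sin(1 / x) if x != 0 else 0.0

# since |sin(1/x)| <= 1, we get |f(x) - f(0)| <= |x|
for x in [0.5, -0.1, 1e-3, -1e-6]:
    assert abs(f(x) - f(0)) <= abs(x)
print("|f(x)| <= |x| holds at the sampled points")
```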

Bounded and Monotonic Functions

As with sequences, (where E = N) we have similar bounded and monotonic properties

for functions.

Assume that E ,D ⊆ R.

Definition 5.8 (Bounded Functions)

Let f : E → D.

1. f is bounded above if ∃α ∈ R such that ∀x ∈ E , f (x) ≤ α.

2. f is bounded below if ∃α ∈ R such that ∀x ∈ E , f (x) ≥ α.

3. f is bounded on E if and only if f is bounded above and bounded below on E .

264

Definition 5.9 (Monotonic Functions)

Let f : E → D. Then f is monotonic on E if and only if it satisfies one of the

following conditions:

(i) f is increasing; that is,

(∀x , y ∈ E x < y ⇒ f (x) < f (y)).

(ii) f is non-decreasing; that is,

(∀x , y ∈ E x < y ⇒ f (x) ≤ f (y)).

(iii) f is decreasing; that is,

(∀x , y ∈ E x < y ⇒ f (x) > f (y)).

(iv) f is non-increasing; that is,

(∀x , y ∈ E x < y ⇒ f (x) ≥ f (y)).

If either (i) or (iii) apply, then we say f is strictly monotonic.

Monotonicity is very useful for the existence of inverse functions.

Question: If f : E → D when can we define g : D → E such that

f (g(x)) = x?

265

Theorem 5.5 (Injectivity, Continuity & Monotonicity)

Let f be a continuous, real-valued, injective function on an interval I . Then f is either

increasing or decreasing on I .

The theorem leads to the following useful relationship between continuity and inverse

functions.

Theorem 5.6 (Continuity, Injectivity & Inverse)

Let f be a continuous, real-valued, injective function on an interval I . Then an

inverse function f −1 exists on I and is a continuous function.

Note that I can be [a, b], [a, b), (a, b] or (a, b).

266

Example 5.9

The sine function. The existence of an inverse depends on the domain.

267

Bounded and Continuous Functions

Boundedness and continuity are related as shown in the following theorem.

Theorem 5.7 (Bounded and Continuous)

Let f : E → R, E ⊆ R, be a continuous function on [a, b].

Then f is bounded on [a, b].

268

The following example shows why the interval [a, b] of the Bounded and Continuous

theorem must be closed.

269

To account for continuity at points where the domain of the function only contains half of any deleted neighbourhood, the definition of continuity can be weakened to left and right continuity.

Definition 5.10 (One sided Continuity)

A function f : A → B is left continuous at x = ` if

lim_{x→`−} f (x) = f (`)

and is right continuous at x = ` if

lim_{x→`+} f (x) = f (`).

Example 5.10

270

Continuity and Solving Polynomials

Question: How do I know that

5x6 − 3x4 + 2x3 − 2 = 0

has at least one real solution?

271

Example 5.11

272

Intermediate Value Theorem

We now come to a classic theorem of real analysis.

Theorem 5.8

Let g be a continuous real valued function on the closed interval I = [a, b].

Then g takes on every value in the interval

J = [inf_I g, sup_I g].

Conventionally, this is stated in a weaker form.

Theorem 5.9 (Intermediate Value Theorem)

Let f : [a, b]→ R be a continuous function and let y ∈ R such that f (a) < y < f (b)

or f (a) > y > f (b).

Then there exists c ∈ (a, b) such that f (c) = y .

273
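The IVT is exactly what makes the bisection method work: for the polynomial from the motivating question, f (0) = −2 < 0 < 2 = f (1), so a root lies in (0, 1), and repeatedly halving the sign-change interval pins it down. A sketch (the method is standard; this code is our illustration, not from the slides):

```python
def f(x):
    return 5*x**6 - 3*x**4 + 2*x**3 - 2

# f(0) = -2 and f(1) = 2, so the IVT gives a root in (0, 1).
a, b = 0.0, 1.0
for _ in range(60):
    m = (a + b) / 2
    if f(a) * f(m) <= 0:   # sign change stays in [a, m]
        b = m
    else:                  # sign change stays in [m, b]
        a = m
root = (a + b) / 2
print(root, abs(f(root)))
```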

274

275

Open and closed subsets of R

Recall that N(x, ε) = {y | d(x, y) < ε}.

Definition 5.11

A subset E ⊂ R is called open if for all x ∈ E there exists an ε > 0 such that N(x, ε) ⊆ E . A subset is called closed if its complement is open.

The union of any collection of open sets is open.

Proposition 5.1

Let A = {Ai | i ∈ I} be a collection of sets. If all the Ai are open, then ∪i∈IAi is

open.

276

Example 5.12

The intersection of a collection of open sets need not be open.

For i ∈ N define Ai = (−1/i , 1/i) ⊂ R. Then ∩i∈NAi = {0} and is not open.

However, the intersection of a finite number of open sets is open.

Proposition 5.2

Let A1,A2, . . .An ⊂ R be a finite collection of open sets.

Then A1 ∩ A2 ∩ · · · ∩ An is open.

Corollary 5.1

Let C = {Ci | i ∈ I} be a collection of closed sets. Then ∩_{i∈I} Ci is closed. Let C′ = {C1, . . . ,CN} be a finite collection of closed sets. Then ∪_{i=1}^{N} Ci is closed.

277

Compact subsets

Definition 5.12

A subset K ⊂ R is compact if every infinite sequence (xi )i∈N with xi ∈ K has a

convergent subsequence which converges to a point in K .

(This is in fact what is called ‘sequential compactness’.)

Example 5.13

Let a ∈ R be positive.

I [0, a) is not compact. (We’ll show this now.)

I [0, a] is compact. (This is the Heine-Borel theorem below.)

Example 5.14

[0,∞) is closed but not compact.

278

Theorem 5.10 (Heine-Borel)

Let E be a subset of R. Then E is compact iff E is both closed and bounded.

Proof: Suppose first that E ⊂ R is compact.

To see that E is bounded

279

Now to show that E is closed.

Suppose that it were not. Then R \ E is not open, therefore there exists x ∈ R \ E such that for all ε > 0, (x − ε, x + ε) ∩ E ≠ ∅.

280

For the converse, suppose now that E ⊂ R is closed and bounded.

We must show that E is compact.

281

One of the central features of compact sets is that a continuous function on a

compact set has a maximum value.

Lemma 5.2

Let K ⊂ R be compact. Then supK ∈ K .

Proof:

Lemma 5.3

Let f : R → R be a continuous function, and let O ⊂ R be open. Then the set f −1(O) = {a ∈ R : f (a) ∈ O} is open.

282

Proposition 5.3

Let f be a continuous function and K ⊂ R a compact set. Then there is an a ∈ K

such that f (a) ≥ f (b) for all b ∈ K .

Proof: By the previous lemma it suffices to show that f (K ) is compact.

283

Differentiability

In this topic, I represents any of the intervals [a, b], (a, b), [a, b), (−∞, a), (−∞, a],

etc, where a < b.

Refresher: Let f : I → R. The chord function

284

Since we now have an ε-δ definition of a limit, we can prove that f ′(x0) exists (as a real number).

Definition 6.1 (Differentiability)

Let I be an interval of R and f : I → R. The derivative of f at x0 denoted f ′(x0), is

defined as

f ′(x0) = lim_{h→0} [f (x0 + h) − f (x0)]/h.

If f ′(x0) ∈ R (that is, exists) we say that f is differentiable at x0. If f is

differentiable for all x ∈ I we say that f is differentiable. If f is differentiable we can

define the function

f ′ : I → R, x 7→ f ′(x).

285
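Definition 6.1 can be explored numerically by evaluating the chord (difference quotient) at ever smaller h ≠ 0; here for exp at x0 = 0, where the limit should be exp′(0) = 1. The helper name `chord` is ours, chosen to match the chord function of the refresher:

```python
import math

def chord(f, x0, h):
    # C_{x0}(h) = (f(x0 + h) - f(x0)) / h, defined only for h != 0
    return (f(x0 + h) - f(x0)) / h

for h in [0.1, 0.001, 1e-6, -1e-6]:
    print(h, chord(math.exp, 0.0, h))   # values approach 1 from both sides
```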

Note:

I f ′(x0) is the limiting value of the chord function, Cx0(h) as h→ 0.

Thus if Cx0(h) is not defined in a deleted neighbourhood N0(x0, δ), then the limit

defining f ′(x0) cannot exist.

In particular f (x0) must be defined.

286

What is the relationship between continuity and differentiability?

Theorem 6.1 (Differentiability and Continuity)

If f : I → R is differentiable at x0 ∈ I then f is continuous at x0.

Proof:

Uses the trick

f (x0 + h) − f (x0) = h · [f (x0 + h) − f (x0)]/h, h ≠ 0.

Also, if f ′(x0) exists then Cx0(h) exists in some deleted neighbourhood (N0(0, δ)) so

f (x0) is defined.

287

Note: Since lim f = lim g does not imply f = g , the converse of the theorem is not true. That is, continuity does not imply differentiability.

Example 6.1

The function f (x) = |x | is continuous for all x ∈ R but not differentiable at

x = 0.

288

Relationship between differentiability and monotonicity

If f is differentiable at x0 we can get some information about the “local”

monotonicity properties of f . That is, if we are close enough to x0 then we have the

following result:

Theorem 6.2 (Differentiability and Monotonicity)

Let f be differentiable at x0. Then if f ′(x0) > 0 we have:

I ∃ε > 0 such that ∀h, (0 < h < ε) ⇒ f (x0 + h) > f (x0).

I ∃ε > 0 such that ∀h, (−ε < h < 0) ⇒ f (x0) > f (x0 + h).

with similar statements for f ′(x0) < 0.

289

Rolle’s Theorem

We now come to another classic theorem of real analysis.

Theorem 6.3 (Rolle’s Theorem)

Let f be continuous on [a, b], differentiable on (a, b) and suppose f (a) = f (b) = 0.

Then there exists a point c ∈ (a, b) such that f ′(c) = 0.

That is, there must exist at least one point of (a, b) where the tangent is horizontal.

If f goes upwards somewhere after x = a it must “turn around” to get back down to

x = b.

Note carefully the conditions on the intervals.

It must be continuous on the closed interval, differentiable on the open interval and c

must be in the open interval.

290

Proof:

291

Mean Value Theorem

The primary application is to prove another classic theorem, the Mean Value theorem,

which in turn is used to prove results for Taylor polynomials - see later.

Rolle’s theorem can be generalised to remove the condition f (a) = f (b) = 0, in which

case the slope becomes non-zero, as given by:

Theorem 6.4 (Mean Value Theorem)

Let f be continuous on [a, b], differentiable on (a, b). Then there exists c ∈ (a, b)

such that

f ′(c) = (f (b) − f (a))/(b − a).

Note that f ′(c) equals the slope of the chord through (a, f (a)) and (b, f (b)).

292
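A concrete instance (our own example, not from the slides): take f (x) = x³ on [0, 1]. The chord slope is (f (1) − f (0))/(1 − 0) = 1, and f ′(c) = 3c² = 1 is solved by c = 1/√3 ∈ (0, 1):

```python
f = lambda x: x**3

# chord slope (f(b) - f(a)) / (b - a) on [a, b] = [0, 1]
slope = (f(1) - f(0)) / (1 - 0)

# the MVT point: f'(c) = 3c^2 = slope, so c = sqrt(slope/3)
c = (slope / 3) ** 0.5
assert 0 < c < 1
print(c)   # approximately 0.577, i.e. 1/sqrt(3)
```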

Thus the Mean Value theorem says

The proof uses a trick to define a function g(x) that can be used with Rolle’s

theorem:

g(x) = f (x) − f (a) − ((f (b) − f (a))/(b − a)) · (x − a).

Note that g(a) = g(b) = 0.

293

Proof:

294

Notes:

I You may use all the usual rules for differentiating functions:

I The sum rule, difference rule, scalar multiple rule.

I The product rule, quotient rule, chain rule.

I Standard derivatives for polynomials, sin, cos, log, etc.

These are all straightforward to prove, and are left as an exercise.

I The main use of the MVT is for proving things about Taylor polynomials.

295

Riemann Integration

(There is a more general form of integration, called Lebesgue Integration, but this

requires Measure Theory.)

Recall: Integration has two ideas

I The idea of an antiderivative, the opposite operation to differentiation.

I The area under a curve.

The two ideas are linked via the Fundamental Theorem of Calculus. We want a precise definition of:

∫_a^b f (x) dx

Generalising the first idea is too restrictive, some functions are not differentiable, but

can be integrated. So we will try to make the second idea precise.

296

To do this we will show that for some functions, approximating the area under the

curve by rectangles can be a very good approximation to the actual area.

297

There are two considerations:

I How do we divide up [a, b]?

I What height do we use for each rectangle?

In the first instance, we allow arbitrary positions:

Definition 7.1 (Partition)

A partition P of [a, b] is a finite ordered set, written

P = {a = x0 < x1 < . . . < xn−1 < xn = b}.

298

Thus, the partition fixes the widths of the rectangles, but what height should we

use?

There are many possibilities

But there is a natural choice:

299

Definition 7.2 (Riemann Sums)

Let f : [a, b]→ R be a bounded function. For each subinterval [xk−1, xk ] of the

partition P = {a = x0 < x1 < . . . < xn−1 < xn = b}, define

m_k = inf{f (x) : x ∈ [x_{k−1}, x_k ]}

M_k = sup{f (x) : x ∈ [x_{k−1}, x_k ]}

and then define two approximations to the area: the Lower Riemann Sum

L(f ,P) = ∑_{k=1}^{n} m_k (x_k − x_{k−1})

and the Upper Riemann Sum:

U(f ,P) = ∑_{k=1}^{n} M_k (x_k − x_{k−1}).

300

Informally,

I U(f ,P) uses the maximum value of f on each subinterval.

I L(f ,P) uses the minimum value of f on each subinterval.

301
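A small worked instance (our own; f is increasing, so the inf/sup on each subinterval sit at the endpoints — an assumption of this sketch, not of Definition 7.2): f (x) = x² on [0, 1] with the partition {0, 1/4, 1/2, 3/4, 1}:

```python
def riemann_sums(f, P):
    # Lower and Upper sums of Definition 7.2, using endpoint values,
    # which equal the inf/sup only because f is monotone here.
    L = sum(min(f(P[k-1]), f(P[k])) * (P[k] - P[k-1]) for k in range(1, len(P)))
    U = sum(max(f(P[k-1]), f(P[k])) * (P[k] - P[k-1]) for k in range(1, len(P)))
    return L, U

f = lambda x: x**2
P = [0, 0.25, 0.5, 0.75, 1]
L, U = riemann_sums(f, P)
print(L, U)   # 0.21875 0.46875, bracketing the true area 1/3
```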

Note:

1. We only assume f is bounded (so we can ensure that sup and inf exist), not

continuous.

2. Riemann sums depend on P. Changing the number of points, or the position of

points will change the sum.

The useful property of sup and inf is that mk ≤ Mk for all subintervals, thus we

have:

Theorem 7.1 (Sum ordering: same partition)

L(f ,P) ≤ U(f ,P)

for all partitions of [a, b].

Thus, L(f ,P) is always a lower bound and U(f ,P) is always an upper bound on the

“area”.

302

Question

How do L and U change as we add new points to the partition P? That is, what if we keep the current n points fixed, but add new points between the existing ones to give a new partition P̂ so that

P ⊆ P̂?

For example,

P = {1, 3/2, 2} and P̂ = {1, 5/4, 3/2, 2}.

303

Definition 7.3 (Refinement)

If P ⊆ P̂ then P̂ is called a refinement of P (that is, P̂ has the same points and more).

How L and U change under refinement is given by the following theorem:

Theorem 7.2 (Sum ordering: refinement)

If P̂ is a refinement of P then

L(f ,P) ≤ L(f , P̂) and U(f , P̂) ≤ U(f ,P).

So increasing the number of subintervals increases L and decreases U. Hence, it

improves the estimates.

304

Intuitively we can see this from the geometry:

305

The proof relies on the following theorem:

Theorem 7.3 (Subintervals and Sup’s/Inf’s)

Let f : [a, b]→ R be bounded on [a, b], with c ∈ (a, b). Then

1. sup_{[a,c]} f ≤ sup_{[a,b]} f

2. sup_{[c,b]} f ≤ sup_{[a,b]} f

with similar (reversed) inequalities for infimum.

The proof is left as an easy exercise.

306

We now prove Theorem 7.2.

307

308

309

What happens as we change the position of the points in P? If we compare U and L

(rather than U and U) then we have:

Theorem 7.4 (Sum Ordering: different partitions)

If P and Q are any partitions of [a, b], then

L(f ,P) ≤ U(f ,Q).

Note this is a generalisation of Theorem 7.1.

Proof:

310

Thus as we refine the partition (that is, increase points), L is non-decreasing and U is

non-increasing.

Question: Do they meet in the middle somewhere?

Answer: Sometimes... it depends on f .

311

Riemann Integrable

In the special case that L and U are equal in the limit, we define

Definition 7.4 (Riemann Integrable)

Let P˜ be the set of all possible partitions of [a, b].

Define the Lower Riemann Integral of f on [a, b] as:

L(f ) = sup{L(f ,P) : P ∈ P˜}

and the Upper Riemann Integral of f on [a, b] as:

U(f ) = inf{U(f ,P) : P ∈ P˜}.

We say that f is Riemann integrable on [a, b] if U(f ) = L(f )

312

In this case we define the integral of f by

∫_a^b f = U(f ) = L(f ).

Example 7.1

Here is a bounded function that is not integrable.

313

314

Question: Is it possible to determine if f is integrable without testing all

partitions?

Answered by Riemann:

Theorem 7.5 (Integrability and Boundedness)

Let f : [a, b] → R be bounded. Then f is integrable if and only if, for all ε > 0 there exists a partition P of [a, b] such that

U(f ,P) − L(f ,P) < ε.

This is similar to an ε-δ definition except finding δ(ε) is replaced by finding P(ε).

315
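For f (x) = x² on [0, 1] with n equal subintervals (our own worked check of the criterion), the gap U − L telescopes to (f (1) − f (0))/n = 1/n, so for any ε we can take any n > 1/ε — exactly the partition P(ε) the theorem asks for:

```python
def gap(n):
    # U(f,P) - L(f,P) for f(x) = x^2 on [0,1], n equal subintervals;
    # f is increasing, so inf/sup are the left/right endpoint values
    P = [k / n for k in range(n + 1)]
    f = lambda x: x**2
    L = sum(f(P[k-1]) * (P[k] - P[k-1]) for k in range(1, n + 1))
    U = sum(f(P[k])   * (P[k] - P[k-1]) for k in range(1, n + 1))
    return U - L

for n in [10, 100, 1000]:
    print(n, gap(n))   # shrinks like 1/n
```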

316

317

In the special case when f is continuous, we can say more (without considering any

partitions at all).

Theorem 7.6 (Integrability and Continuity)

If f : [a, b]→ R is continuous on [a, b], then f is integrable on [a, b].

Notes

1. The converse is NOT true: Integrable does not imply Continuous.

2. Functions which are not continuous on [a, b] may still be integrable, for example,

f : R → [−1, 1]

f (x) =
  1, x ≥ 0
  −1, x < 0.

3. The proof is not difficult but requires “uniform continuity” which is not in this

subject.

318

Theorem 7.7 (Fundamental Theorem of Calculus)

I If f : [a, b] → R is integrable and F : [a, b] → R satisfies F ′(x) = f (x) for all x ∈ [a, b] then

∫_a^b f (x) dx = F (b) − F (a)

I Let g : [a, b] → R be integrable and define G : [a, b] → R by

G (x) = ∫_a^x g(u) du, G (a) = 0.

Then G is continuous on [a, b]. If g is continuous at c ∈ [a, b] then G is differentiable at c and G ′(c) = g(c).

319

The theorem says:

1. If f has an antiderivative, then we can evaluate ∫_a^b f (x) dx .

2. If we can evaluate ∫_a^x f (u) du, then we get the antiderivative (which is also continuous).

320

Algebraic Properties of the Integral

All these usual properties can be proved from the definition:

Theorem 7.8 (Algebra of Integrals)

If f and g are integrable on [a, b] and c ∈ [a, b] then:

1. ∫_a^c f and ∫_c^b f exist

2. ∫_a^b f = ∫_a^c f + ∫_c^b f

3. ∫_a^b (f + g) = ∫_a^b f + ∫_a^b g

4. ∫_a^b λf = λ ∫_a^b f , λ ∈ R

321

Improper Integrals

Question: What if f is unbounded at some point c ∈ [a, b]? Without loss of generality, we can take c = a or b since we can split the integral at c using Theorem 7.8 part 2.

As with all things that might be unbounded, we consider the limit of finite

things!

Definition 7.5 (Improper Integral: unbounded integrand)

Let f : (a, b] → R be such that f is integrable on [c, b] for all c ∈ (a, b). If the limit

lim_{c→a+} ∫_c^b f

exists we call it the improper integral of f on (a, b].

322

Beware: the same notation

∫_a^b f

is generally used. You need to check the function f carefully to see if

lim_{c→a+} ∫_c^b f

is meant. That is, check to see if a is actually in the domain of f .

323

We can make a similar definition for unbounded intervals:

Definition 7.6 (Improper integral: unbounded interval)

Let f : [a,∞) → R be integrable on [a, c] for every c > a. If the limit

lim_{c→∞} ∫_a^c f

exists we call it an improper integral, written

∫_a^∞ f .

Note: there is a similar definition for

∫_{−∞}^a f = lim_{c→−∞} ∫_c^a f .

See practice class sheets for examples.

324
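A worked instance of Definition 7.5 (our own): f (x) = 1/√x is unbounded at 0, but on [c, 1] the FTC gives ∫_c^1 f = 2 − 2√c, and letting c → 0+ gives the improper integral, 2:

```python
import math

def integral_from(c):
    # FTC on [c, 1]: an antiderivative of x**-0.5 is 2*sqrt(x)
    return 2 * math.sqrt(1) - 2 * math.sqrt(c)

for c in [0.1, 1e-4, 1e-8]:
    print(c, integral_from(c))   # tends to 2 as c -> 0+
```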

Series

For this topic the focus is on applying theorems rather than proving them.

Consider the following method of approximating the area under the parabola

y = 1 − x²

325

The (finite) partial sums

A_k = 1 + 1/4 + · · · + 1/4^k

provide an increasingly better approximation to the area.

326
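Computing a few of these partial sums (illustrative only) shows them increasing towards, but always staying below, 4/3:

```python
def A(k):
    # partial sum A_k = 1 + 1/4 + ... + 1/4**k
    return sum(1 / 4**n for n in range(k + 1))

for k in [1, 5, 10, 20]:
    print(k, A(k))
assert all(A(k) < 4/3 for k in range(15))   # bounded above by 4/3
```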

This example shows very clearly that:

I A_k is an increasingly better approximation to the area

I The approximation is always less than 4/3:

A_k < 4/3, ∀k ∈ N

and so is bounded above

I Less obviously, if I choose any number less than 4/3, I can always find a k ∈ N such that A_k is larger than it, that is,

4/3 = sup{A_k : k ∈ N}.

Sound familiar?

∀ε > 0 ∃k0 ∈ N s.t. ∀k > k0, |A_k − 4/3| < ε.

327

Recall the definition of the convergence of a sequence f_n:

∀ε > 0 ∃M ∈ N s.t. ∀n > M, |f_n − L| < ε.

Thus the partial sums behave like the terms of a sequence

A1, A2, A3, . . . .

So we say that A_k converges to 4/3 and write it as

1 + 1/4 + 1/16 + . . . = 4/3.

This is NOT the addition of an "infinite" number of terms: it is a convenient notation for saying the partial sums converge to 4/3. That is, the approximations have a finite limit.

328

Question: Why can’t we think of it as the sum of an infinite number of terms?

Clearly Ak is a sum of terms!

If it were a summation, it would need to have all the properties of the binary

operation “+”, that is:

I Associative (1 + 2) + (3 + 4) = 1 + (2 + 3) + 4

I Commutative 1 + 2 + 3 = 3 + 2 + 1

I etc.

329

Example 8.1

Consider 1− 1 + 1− 1 + 1− 1 + . . . = ?

330

331

332

333

The essential idea is to consider the partial sums, Ak , as defining a sequence (Ak) and

then consider the convergence of the sequence (Ak).

Note: Historically things happened the other way around: Problems with “Infinite

sums” (particularly those from Fourier series) were finally resolved once convergence

was understood.

Beware: “a1 + a2 + a3 + . . .” is not the same mathematical object as

“a1 + a2 + . . .+ ak” (ie. a finite sum).

The notation is, however, convenient as sometimes a1 + a2 + a3 + . . . has some of

the properties of a finite sum.

334

Definition 8.1 (Series)

The expression "a1 + a2 + a3 + . . ." is called a series and is usually written as

∑_{n=1}^{∞} a_n or ∑_{n>0} a_n.

The number a_n is called the n-th term of the series.

Furthermore,

A_k = a1 + a2 + . . . + a_k

is called the k-th partial sum of the series and (A_k) the sequence of partial sums.

335

Thus we have

a1 + a2 + a3 . . .

Since the partial sums give a sequence of approximations to the series, in the sense of

Archimedes, we are motivated to define:

336

Definition 8.2 (Convergent series)

The series

∑_{n=1}^{∞} a_n

is said to be convergent if the sequence (A_k),

A_k = ∑_{n=1}^{k} a_n,

is convergent. If lim_{k→∞} A_k = L, we call L the sum (or value) of the series and write:

a1 + a2 + . . . = L, or ∑_{n>0} a_n = L.

The series is divergent if (A_k) is divergent.

337

Again we have the standard algebra:

Theorem 8.1 (Algebra of Series)

If ∑_{n>0} a_n = α and ∑_{n>0} b_n = β then

∑_{n=1}^{∞} (a_n + b_n) = α + β

and

∑_{n=1}^{∞} (λa_n) = λα, λ ∈ R.

Proof: Follows from the algebra of sequences.

338

Warning: Do not confuse the sequence of terms (an) with the sequence of partial

sums (Ak).

Given any sequence (an) we can define a series via its partial sums:

The convergence of (Ak) (and hence the value of the series “a1 + a2 + . . . ”) is related

to the sequence of terms in an important way:

339

Theorem 8.2 (Series and Term convergence)

If ∑_{n=1}^{∞} a_n converges then lim_{n→∞} a_n = 0.

The converse is false: lim_{n→∞} a_n = 0 does not imply that ∑_{n=1}^{∞} a_n converges.

The classic counterexample is the Harmonic Series:

∑_{n=1}^{∞} 1/n

which diverges, even though 1/n → 0.

340

The Harmonic series is the s = 1 case of the famous Riemann Zeta Function:

ζ(s) = ∑_{n=1}^{∞} 1/n^s

which is the function of the famous Riemann Hypothesis.

Proof of Theorem 8.2:

341

Even though the converse is false, the contrapositive (which is necessarily true) is very useful:

∼ (lim_{n→∞} a_n = 0) ⇒ ∼ (∑_{n>0} a_n converges)

which gives the equivalent theorem:

Theorem 8.3 (Terms & Series Divergence)

If lim_{n→∞} a_n ≠ 0 then ∑_{n>0} a_n diverges.

This is sometimes called the divergence test.

This is sometimes called the divergence test.

342

Example 8.2

1. ∑_{n>0} 2^n diverges.

2. ∑_{n>0} n³/(n² + 4) diverges.

3. ∑_{n>0} (−1)^{n+1} diverges.

343

Back to the original problem:

Question: How do we know when the series a1 + a2 + a3 + . . . behaves like a finite

sum, that is, we can commute or associate the terms without changing its

value?

Theorem 8.4 (Associativity)

If ∑_{n>0} a_n converges then any series obtained by grouping the terms without changing their order converges to the same value.

That is: convergence ⇒ associativity (regrouping) of terms is allowed.

Again the converse is false. If you split the terms of a1 + a2 + . . . into parts, then the

series may or may not converge.

344

But the contrapositive of Theorem 8.4 is still useful. Informally:

Theorem 8.5 (Associativity and Divergence)

Regrouped ∑_{n>0} a_n diverges ⇒ ∑_{n>0} a_n diverges.

We can use this to show that the harmonic series diverges.

345

346
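The grouping argument can be spot-checked numerically (a sanity check of the divergence, not the proof): grouping the harmonic terms into blocks of lengths 1, 2, 4, 8, . . . shows each block contributes at least 1/2, so H(2^k) ≥ 1 + k/2 and the partial sums are unbounded:

```python
def H(n):
    # n-th partial sum of the harmonic series
    return sum(1 / m for m in range(1, n + 1))

# the grouped lower bound from the divergence argument
for k in range(1, 11):
    assert H(2**k) >= 1 + k / 2
print("H(2^k) >= 1 + k/2 for k = 1..10")
```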

What about the convergence of ∑_{n=1}^{∞} 1/n² or ∑_{n=1}^{∞} 1/n^p etc?

Previous contrapositive theorems relate one divergent series to another divergent series.

For convergence, do we always have to use the definition? That is, show the sequence of partial sums converges via ε-M?

The answer depends on the type of series (that is, the properties of its terms).

347

Don’t get lost (summary of some of the tests)

348

Positive Series

Series with all positive terms are generally easier to test for convergence.

Definition 8.3 (Positive series)

A series ∑_{n>0} a_n is positive if a_n ≥ 0 for all n ∈ N.

Theorem 8.6 (Positive Bounded Series)

A positive series ∑_{n>0} a_n converges if and only if there exists K > 0 such that

∀n ∈ N, A_n ≤ K ,

where A_n = ∑_{m=1}^{n} a_m.

Note:

1. if all a_n ≤ 0 then you can factor out −1 to get a positive series

2. if some of the terms equal zero then 're-indexing' changes it to the case a_n > 0 for all n.

349

Example 8.3

Show that ∑_{n>0} 1/n² converges.

350

Comparison Test

Theorem 8.6 still requires knowledge of the partial sums {Ak}.

Question: Can we determine the convergence of a series using only the terms

an?

Answer: If we can compare the series with some other series known to converge,

then, yes:

351

Theorem 8.7 (Comparison Test)

Let ∑_{n>0} a_n be a series and ∑_{n>0} b_n a series convergent to L.

If there exists K > 0 such that a_n and b_n satisfy

0 ≤ a_n ≤ K b_n,

for all n ∈ N, then ∑_{n>0} a_n converges and ∑_{n>0} a_n ≤ KL.

Conditions:

The theorem is similar to the sandwich theorem for limits. If the terms of a positive series ∑_{n=1}^{∞} a_n are always bounded above by the terms of a convergent positive series (and bounded below by zero) then we get the convergence of ∑_{n=1}^{∞} a_n.

352

We also have the contrapositive of the theorem

Theorem 8.8 (Comparison test - divergence)

If there exists K > 0 such that for all n ∈ N, 0 ≤ b_n ≤ K a_n and ∑_{n>0} b_n diverges, then ∑_{n>0} a_n diverges.

Example 8.4

Does ∑_{n>0} 2/(n² + 1) converge?

353

Ratio Test

A test that does not rely on the convergence of another series is:

Theorem 8.9 (Ratio Test - Non-limit form)

Let ∑_{n>0} a_n be a positive series. Let r and R be real numbers with 0 < r < 1 and R > 1. Then

1. If there exists M1 ∈ N such that for all n > M1, a_{n+1}/a_n < r , then ∑_{n>0} a_n converges.

2. If there exists M2 ∈ N such that for all n > M2, R < a_{n+1}/a_n, then ∑_{n>0} a_n diverges.

Conditions:

354

Thus, if the ratio of consecutive terms is bounded by a positive constant, we can determine whether ∑_{n>0} a_n converges.

On occasion it is easier to use the limit form of the Ratio Test.

Theorem 8.10 (Ratio Test - Limit Form)

Let ∑_{n>0} a_n be a series such that the sequence |a_{n+1}/a_n| converges to a limit r .

1. If r < 1 then ∑_{n>0} |a_n| converges (as does ∑_{n>0} a_n).

2. If r > 1 then ∑_{n>0} a_n diverges.

Conditions:

355

Example 8.5

Does ∑_{n>0} 1/5^n converge?

356

Example 8.6

More generally, when does ∑_{n≥0} r^n converge?

357

Example 8.7

Show that ∑_{n>0} n²/2^n converges.

358
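For Example 8.7 the ratio a_{n+1}/a_n = ((n+1)/n)²/2 → 1/2 < 1, so the ratio test applies. A numerical look (the exact value 6 quoted in the comment comes from the standard identity ∑ n²xⁿ = x(1+x)/(1−x)³ at x = 1/2, which is ours, not from the slides):

```python
def a(n):
    return n**2 / 2**n

print(a(11) / a(10))                      # ratio already close to 1/2
partial = sum(a(n) for n in range(1, 200))
print(partial)                            # partial sums settle near 6
```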

What happens when |a_{n+1}/a_n| → 1 as n → ∞?

359

There are many convergence tests. A few more are given below (without proof). You may use any of them to prove convergence.

Theorem 8.11 (The Integral Test)

Let f be a continuous function defined on [N,∞), N ∈ N, which is positive and decreasing. Then the series

∑_{n=N}^{∞} f (n)

converges if and only if

lim_{n→∞} ∫_N^n f (x) dx

exists.

Conditions:

360

Theorem 8.12 (p-series)

The p-series is the series ∑_{n>0} 1/n^p where p > 0.

1. If p > 1, then the series converges.

2. If 0 < p ≤ 1 then the series diverges.

p-series can be used to test convergence by using it as part of other tests eg.

comparison test.

361

Theorem 8.13 (The Limit Comparison Test)

Let ∑_{n=1}^{∞} a_n and ∑_{n=1}^{∞} b_n be series with positive terms.

Consider

lim_{k→∞} a_k/b_k.

1. If the limit is a positive number, then ∑_{n=1}^{∞} a_n converges if and only if ∑_{n=1}^{∞} b_n converges.

2. If the limit is 0 and ∑_{n=1}^{∞} b_n converges then ∑_{n=1}^{∞} a_n converges.

3. If the limit is ∞ (ie. unbounded) and ∑_{n=1}^{∞} b_n diverges then ∑_{n=1}^{∞} a_n diverges.

Conditions:

362
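A numerical sketch of the Limit Comparison Test (an added illustration, reusing the series from Example 8.4): comparing a_n = 2/(n^2 + 1) with b_n = 1/n^2 gives limit 2, a positive number, so the two series converge or diverge together.

```python
# Limit Comparison Test sketch: a_n = 2/(n^2+1), b_n = 1/n^2.
# a_n / b_n = 2n^2/(n^2+1) -> 2, a positive number, so sum a_n
# converges iff sum b_n does (and the p-series with p = 2 converges).

def a(n):
    return 2 / (n**2 + 1)

def b(n):
    return 1 / n**2

ratio = a(10_000) / b(10_000)
print(ratio)  # close to the limit 2
assert abs(ratio - 2) < 1e-7
```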

Alternating Series

What about series that have some negative terms? The simplest type in this class are those with terms whose signs alternate.

Definition 8.4

Alternating series are those of the form

∑_{n=1}^∞ a_n where a_n = (−1)^{n+1} b_n and b_n > 0.

We use (−1)^{n+1} rather than (−1)^n so that the first term is positive. For example,

1 − 1/2 + 1/3 − 1/4 + 1/5 − 1/6 + … = ∑_{n=1}^∞ (−1)^{n+1}/n.

363

Alternating series have a very special relationship between (A_k) and (a_n):

Theorem 8.14 (Alternating Series Test)

Let (b_n) be a non-increasing sequence with b_n > 0 and b_n → 0. Then the alternating series

∑_{n=1}^∞ (−1)^{n+1} b_n

converges.

Conditions:

Thus, for alternating series with non-increasing terms, the condition b_n → 0 is both necessary and sufficient for convergence (see Theorem 6.3).

364

Example 8.8

The series ∑_{n>0} (−1)^{n+1}/n converges.

365
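A numerical sketch of Example 8.8 (an added illustration): b_n = 1/n is non-increasing and tends to 0, so the series converges, and the standard alternating-series bound says the k-th partial sum is within b_{k+1} of the limit. The limit is log 2, a known value used here only as a check.

```python
import math

# Alternating Series Test sketch for Example 8.8: b_n = 1/n is
# non-increasing and tends to 0, so sum (-1)^{n+1}/n converges.
# Standard bound: the error of the k-th partial sum is at most b_{k+1}.

def partial(k):
    return sum((-1) ** (n + 1) / n for n in range(1, k + 1))

k = 1000
approx = partial(k)
print(approx)  # the sum is log 2 ≈ 0.6931
assert abs(approx - math.log(2)) <= 1 / (k + 1)
```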

Absolute Convergence

We have seen that “convergence implies regrouping is ok”, but what about commutativity - rearranging the order of terms? This depends on the following property:

Definition 8.5 (Absolute Convergence)

The series ∑_{n>0} a_n is called absolutely convergent if ∑_{n>0} |a_n| converges.

Absolute convergence allows us to arbitrarily change the signs of the terms:

Theorem 8.15 (The Absolute Convergence Test)

If ∑_{n>0} a_n is absolutely convergent, then ∑_{n>0} a_n converges.

Conditions:

366

The converse is false.

Note: Series for which ∑_{n>0} a_n converges, but ∑_{n>0} |a_n| does not, are called conditionally convergent.

For example, ∑_{n>0} (−1)^{n+1}/n converges, but ∑_{n>0} 1/n does not.

367

Example 8.9

Does the series ∑_{n=1}^∞ (sin n)/n^2 converge?

368

We can now finally state the rearrangement condition:

Theorem 8.16 (Rearrangement)

Let ∑_{n>0} a_n be an absolutely convergent series. Then every rearrangement of ∑_{n>0} a_n converges absolutely to the same value.

This says that absolute convergence ⇒ the terms commute. So we have

“Absolute convergence ⇒ rearranging is allowed”

and

“Convergence ⇒ regrouping is allowed”.

Since

“Absolute convergence ⇒ convergence”,

absolutely convergent series can be both rearranged and regrouped!

369
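A numerical sketch (an added illustration) of why Theorem 8.16 genuinely needs absolute convergence: for a conditionally convergent series such as ∑ (−1)^{n+1}/n, a greedy reordering of the same terms can steer the partial sums toward an arbitrary target (this is the idea behind Riemann's rearrangement theorem). The target 1.5 below is an arbitrary choice, deliberately different from the usual sum log 2 ≈ 0.693.

```python
# Sketch: greedily reorder the terms of sum (-1)^{n+1}/n to steer the
# partial sums toward an arbitrary target.  Add positive terms while
# below the target, negative terms while above it.

target = 1.5
pos = (1 / n for n in range(1, 10**7, 2))    # positive terms 1, 1/3, 1/5, ...
neg = (-1 / n for n in range(2, 10**7, 2))   # negative terms -1/2, -1/4, ...

total, steps = 0.0, 0
while steps < 100_000:
    total += next(pos) if total < target else next(neg)
    steps += 1
print(total)  # hovers near 1.5: the rearranged series converges to the target
```

The same greedy scheme fails for an absolutely convergent series: there the positive or negative terms alone form a convergent series, so the sum cannot be steered.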

Aside

We need to define “rearrangement”.

Let σ : N → N be a bijection. If ∑_{n>0} a_n and ∑_{n>0} b_n are two series such that

∀n ∈ N, a_n = b_{σ(n)},

then ∑_{n>0} a_n is a rearrangement (or permutation) of ∑_{n>0} b_n.

370

Power Series

A power series is an infinite polynomial. These series can be added, subtracted, multiplied, differentiated and integrated to give new power series.

We will study some of these series, with particular interest in if and when they converge.

Consider the geometric series

1 + r + r^2 + r^3 + …, r ∈ R.

Example 8.10

What does it converge to?

371

Theorem 8.17 (Geometric Series)

The geometric series ∑_{n≥0} r^n converges if |r| < 1 and

1 + r + r^2 + r^3 + … = 1/(1 − r).

If |r| ≥ 1 then the geometric series diverges.

372
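A numerical sketch of Theorem 8.17 (an added illustration): for |r| < 1 the partial sums (1 − r^{k+1})/(1 − r) rapidly approach the closed form 1/(1 − r).

```python
# Geometric series sketch (Theorem 8.17): for |r| < 1 the partial sums
# tend to 1/(1 - r); here r = 0.3 is an arbitrary choice inside (-1, 1).

r = 0.3
k = 200
partial = sum(r**n for n in range(k + 1))
closed_form = 1 / (1 - r)
print(partial, closed_form)
assert abs(partial - closed_form) < 1e-12
```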

We can generalise the geometric series by changing the coefficients of each term:

Definition 8.6 (Power Series)

A (real) power series (about r = 0) is a series of the form

a_0 + a_1 r + a_2 r^2 + a_3 r^3 + …, a_n, r ∈ R.

The sequence (a_n)_{n≥0} is the sequence of coefficients.

Note, a power series conventionally starts with a_0, as then the coefficient index is the same as the exponent of r:

∑_{n=0}^∞ a_n r^n.

373

Question: The geometric series (a_n = 1, ∀n ∈ N) converges for |r| < 1. How do the coefficients change this?

Obviously convergence depends on (a_n). Thus we refine the question: given the coefficients (a_n), how does the convergence depend on r?

Notice that the partial sums are polynomials

A_k = a_0 + a_1 r + a_2 r^2 + … + a_k r^k,

thus we can think of the partial sums as polynomial functions of r:

P_k : R → R, r ↦ a_0 + a_1 r + … + a_k r^k.

374

So we can think of the sequence of partial sums (A_k) = (P_k(r)) as a sequence of functions, and hence the power series

f(r) = lim_{k→∞} P_k(r)

as a function obtained as a “limit of polynomial functions”.

Question: How does the convergence of (P_k(r)) depend on r?

Question: Polynomials are continuous and differentiable for all r ∈ R. What about f(r)?

Question: Can known functions (for example, sin, log) be expressed as power series?

375

First some examples.

f_1(r) = r − r^2/2 + r^3/3 − r^4/4 + …,  a_n = (−1)^{n+1}/n

f_2(r) = r − (1/3!) r^3 + (1/5!) r^5 − …,  a_n = (−1)^k/n! if n = 2k + 1 (i.e. odd), 0 if n even

f_3(r) = 1 + r + (1/2!) r^2 + (1/3!) r^3 + …,  a_n = 1/n!

Given the coefficients (a_n)_{n≥0}, how does the convergence of the series change with r?

376

Consider the set of all values of r for which f(r) = ∑_{n=0}^∞ a_n r^n converges. Let

S(f) = { r ∈ R : f(r) converges }.

I Since ∑_{n=0}^∞ a_n r^n converges if r = 0, the set S is non-empty.

I If ∑_{n=0}^∞ a_n r^n converges absolutely for some r = s (i.e. ∑_{n=0}^∞ |a_n s^n| converges), then it must also converge for all r satisfying 0 ≤ |r| ≤ |s|.

Why?

377

Thus we can consider the “largest possible |s|”:

ρ(f) = sup{ |s| : s ∈ S(f) }.

• If ∑_{n=0}^∞ a_n r^n converges for all |r| then of course ρ does not exist, but if not, then ρ certainly exists as S contains s = 0. Thus we define:

Definition 8.7 (Radius of convergence)

Let S(f) be the set of all r ∈ R for which f(r) = ∑_{n=0}^∞ a_n r^n converges.

I If S is unbounded then the series is said to have infinite radius of convergence.

I If S is bounded, then let

ρ(f) = sup{ |s| : s ∈ S(f) },

and ρ(f) is called the radius of convergence of the power series f(r).

378

Note:

1. ρ is the sup of the absolute values |s|, so ρ ≥ 0.

2. It is called the radius as r can be generalised to a complex number, and then ρ is the radius of a circle in the complex plane (and ∑_{n=0}^∞ a_n r^n converges for all values inside the circle).

3. If |r| > ρ, then by definition of ρ, the power series diverges.

4. The set S(f) may or may not contain ±ρ. Thus it is always necessary to check if f(r) converges when r = ±ρ. For example, the power series g(r) = ∑_{n≥0} r^n has ρ = 1 but g(1) does not converge.

5. S(f) unbounded means the series converges for all r ∈ R.

379

The radius of convergence marks the divide between values of r for which ∑_{n=0}^∞ a_n r^n certainly converges and values for which it certainly diverges:

Theorem 8.18 (Power series convergence)

Let ρ be the radius of convergence of the power series ∑_{n=0}^∞ a_n r^n. Then if |r| < ρ the series is absolutely convergent; if |r| > ρ it is divergent.

Proof:

380

Example 8.11

For what values of x does ∑_{n≥0} x^n/n! converge?

381
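A numerical sketch of Example 8.11 (an added illustration): the ratios of consecutive terms are x/(n + 1) → 0 for every fixed x, so the series converges for all x ∈ R (infinite radius of convergence), and its value agrees numerically with exp(x). The helper name exp_series is mine, not from the slides.

```python
import math

# Example 8.11 sketch: sum x^n / n! converges for every x (ratio test:
# |a_{n+1}/a_n| = |x|/(n+1) -> 0), and the value agrees with exp(x).

def exp_series(x, terms=60):
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= x / (n + 1)      # next term x^{n+1}/(n+1)!
    return total

for x in (-3.0, 0.0, 1.0, 5.0):
    assert abs(exp_series(x) - math.exp(x)) < 1e-9 * max(1.0, math.exp(x))
print(exp_series(1.0))  # ≈ e
```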

Example 8.12

For what values of x does ∑_{n≥1} ((−1)^{n+1}/n) x^n converge?

382

Example 8.13

For what values of x does ∑_{n≥0} x^n/5^n converge?

383
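A numerical sketch of Example 8.13 (an added illustration): here a_n = 1/5^n, so |a_{n+1}/a_n| = 1/5 and the Ratio Test gives convergence when |x|/5 < 1, i.e. radius of convergence ρ = 5. Inside the radius the sum is a geometric series with ratio x/5.

```python
# Radius-of-convergence sketch for Example 8.13: a_n = 1/5^n.

def a(n):
    return 1 / 5**n

ratio = a(11) / a(10)       # |a_{n+1}/a_n| = 1/5 for every n
rho = 1 / ratio             # Ratio Test: converges when |x| * (1/5) < 1
print(rho)                  # 5, up to rounding

x = 4.0                     # an arbitrary point inside the radius
partial = sum(a(n) * x**n for n in range(200))
assert abs(partial - 1 / (1 - x / 5)) < 1e-9   # geometric closed form
```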

Note that power series are easily generalised to the form:

∑_{n=0}^∞ a_n (x − b)^n, b ∈ R,

in which case the convergence condition for |r| is replaced by one for |x − b|.

Example 8.14

∑_{n≥0} (x − 2)^n/5^n

384

We first define a series related to f(x) and then consider whether the series converges to f.

The idea is to match the derivatives of f to those of the partial sums.

Assume f is infinitely differentiable (that is, all derivatives exist). Then, if

A_k = ∑_{n=0}^k a_n x^n,

we require

(d^n A_k / dx^n)|_{x=0} = (d^n f / dx^n)|_{x=0}  for 0 ≤ n ≤ k.

This will give unique values to the coefficients of ∑_{n≥0} a_n x^n.

385

Example 8.15

Let f(x) = e^x, then

387

Thus we get the series

∑_{n≥0} a_n x^n = 1 + x + (1/2!) x^2 + (1/3!) x^3 + (1/4!) x^4 + …

Question: Does this series converge to e^x, and if so, what is the radius of convergence?

We generalise this to an arbitrary point x = b rather than x = 0.

388

Definition 8.8 (Taylor Series)

Let b ∈ R and J be an open interval centred on x = b. Let f : J → R be infinitely differentiable at x = b.

The Taylor Series of f about x = b is the series

∑_{n=0}^∞ (f^{(n)}(b)/n!) (x − b)^n.

The partial sums

A_k(x) = ∑_{n=0}^k (f^{(n)}(b)/n!) (x − b)^n

are called the Taylor Polynomials of f about x = b.

389

Example 8.16

The Taylor polynomials for e^x are

390

But, does ∑_{n≥0} (1/n!) x^n converge to e^x for x ∈ (−ρ, ρ)?

In other words, does lim_{k→∞} A_k(x) = e^x?

To prove convergence (which has to be done every time we change f) we need to look at the error in the approximation

|f(x) − A_k(x)|.

That is, ∀ε > 0, ∃M such that ∀n > M, |f(x) − A_n(x)| < ε.

To help with this it would be useful if A_k, or better, f(x) − A_k could be bounded:

391

Theorem 8.19 (Taylor’s Theorem)

Let f : (−ρ + b, b + ρ) → R be a (k + 1)-times differentiable function on (−ρ + b, b + ρ) and let

A_k(x) = ∑_{n=0}^k (f^{(n)}(b)/n!) (x − b)^n

be the kth partial sum of the Taylor Series of f. Then for any x ∈ (−ρ + b, b + ρ) there exists a real number c between b and x such that

f(x) − A_k(x) = (f^{(k+1)}(c)/(k + 1)!) (x − b)^{k+1}.

392

Notes:

1. Instead of having to work with

f(x) − ∑_{n=0}^k (f^{(n)}(b)/n!) (x − b)^n,

we only need to know the (k + 1)th derivative of f.

2. f(x) − A_k(x) is a measure of how well A_k(x) approximates f(x) - compare with the Archimedes parabola area.

3. For fixed k the approximation gets worse as x moves away from b.

4. The only requirement is that f is (k + 1)-times differentiable in an open neighbourhood of x = b.

393

It is sometimes possible to estimate the remainder R_k(x). This is a very useful result.

Theorem 8.20 (The Remainder Estimation Theorem)

If there is a positive constant M such that |f^{(k+1)}(t)| ≤ M for all t between x and b, then the remainder term R_k(x) in Taylor’s Theorem satisfies

|R_k(x)| ≤ M |x − b|^{k+1} / (k + 1)!.

If this condition holds for every k ∈ N and all conditions of Taylor’s Theorem are satisfied by f, then the series converges to f(x).

394
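A numerical sketch of Theorem 8.20 (an added illustration) for f(x) = e^x about b = 0 on [0, 1]: every derivative of e^x is e^t ≤ e =: M on that interval, so |R_k(x)| ≤ M|x|^{k+1}/(k+1)!, which tends to 0, and the actual error indeed sits below the bound. The helper name taylor_partial is mine.

```python
import math

# Remainder Estimation sketch for f(x) = e^x about b = 0 on [0, 1]:
# |f^{(k+1)}(t)| = e^t <= e =: M there, so
#   |R_k(x)| <= M |x|^{k+1} / (k+1)!  -> 0 as k -> oo.

def taylor_partial(x, k):
    return sum(x**n / math.factorial(n) for n in range(k + 1))

x, k = 1.0, 10
actual_error = abs(math.exp(x) - taylor_partial(x, k))
bound = math.e * abs(x) ** (k + 1) / math.factorial(k + 1)
print(actual_error, bound)
assert actual_error <= bound
```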

We have seen how Taylor series can be used to approximate functions, locally, by polynomials. We now turn our attention to Fourier series, which are useful for approximating functions on a wider interval, and can sometimes be used for discontinuous functions.

Partial Sums

In this case, our partial sums look like

F_k(x) = ∑_{n=0}^k (a_n cos(nx) + b_n sin(nx)).

395

Thus if

lim_{k→∞} F_k(x)

exists (for x ∈ J) then so does the series

∑_{n≥0} (a_n cos(nx) + b_n sin(nx)).

Note:

The function sin(x) is an odd function: sin(−x) = −sin(x); and cos(x) is an even function: cos(−x) = cos(x). Thus

∑_{n=0}^k a_n cos(nx) is even and ∑_{n=0}^k b_n sin(nx) is odd.

396

Historically:

In the early 1800s, Fourier proposed a solution to heat flow in a thin infinite slab, as a solution to the heat equation - a partial differential equation for the temperature in the x-y plane: T(t; x, y).

397

The boundary conditions are:

T(x, 0) = f(x), where f is an even function.

T(−1, y) = T(1, y) = 0; that is, constant temperature on the left and right sides.

The simplest case is f(x) = 1.

Fourier proposed the time-independent solution

T(x, y) = ∑_{n≥0} a_n e^{−(2n−1)πy/2} cos((2n − 1)πx/2),

where the a_n's depend on f(x).

398

Thus, along the x-axis, he proposed a solution of the form:

f(x) = ∑_{n≥0} a_n cos(nπx/2),

and for f(x) = 1, his method gave

1 = (4/π) ∑_{n=0}^∞ ((−1)^{n−1}/(2n − 1)) cos((2n − 1)πx/2), −1 ≤ x ≤ 1,

or

π/4 = ∑_{n=0}^∞ ((−1)^{n−1}/(2n − 1)) cos((2n − 1)πx/2).

399

Let’s check the partial sums:

F_k(x) = ∑_{n=0}^k ((−1)^{n−1}/(2n − 1)) cos((2n − 1)πx/2).

400

Note: F_k(±1) does fit with the temperature of the plate on the two sides; however, mathematically we are trying to find an infinite series that converges to 1 for x ∈ [−1, 1]. Perhaps this is not possible (with a Fourier series) and it only converges to 1 for x ∈ (−1, 1)?

Let’s use the computer to plot the partial sums; that is, the approximations to f(x) = 1:

F_k(x) = (4/π) [ cos(πx/2) − (1/3) cos(3πx/2) + … + ((−1)^{n−1}/(2n − 1)) cos((2n − 1)πx/2) ]

401
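A computational sketch of Fourier's formula (an added illustration; here the sum is indexed from n = 1, an assumption made so that the first term is cos(πx/2)): the partial sums approach 1 for x strictly inside (−1, 1).

```python
import math

# Partial sums of Fourier's series for f(x) = 1 on (-1, 1):
#   F_k(x) = (4/pi) * sum_{n=1}^{k} (-1)^{n-1}/(2n-1) * cos((2n-1) pi x / 2)

def F(x, k):
    return (4 / math.pi) * sum(
        (-1) ** (n - 1) / (2 * n - 1) * math.cos((2 * n - 1) * math.pi * x / 2)
        for n in range(1, k + 1)
    )

for x in (0.0, 0.5, -0.9):
    print(x, F(x, 5000))   # close to 1 for x strictly inside (-1, 1)
assert abs(F(0.0, 5000) - 1) < 1e-3
```

Convergence is slow (the coefficients decay only like 1/n), which is visible in how many terms are needed; near x = ±1 the approach to 1 is slower still.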

A formula for computing the coefficients was given by Fourier:

Definition 8.9

Let f : [−π, π] → R be a function. Then the Fourier Coefficients are defined as:

a_0 = (1/2π) ∫_{−π}^{π} f(x) dx

a_n = (1/π) ∫_{−π}^{π} f(x) cos(nx) dx

b_n = (1/π) ∫_{−π}^{π} f(x) sin(nx) dx

and the Fourier Series as

a_0 + ∑_{n=1}^∞ (a_n cos(nx) + b_n sin(nx)).

402
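A numerical sketch of Definition 8.9 (an added illustration): approximate the coefficients of the odd square wave f(x) = sign(x) on [−π, π] by a Riemann (midpoint) sum. The standard result, used here only as a check, is b_n = 4/(nπ) for odd n, with all other coefficients 0.

```python
import math

# Approximate the Fourier coefficient b_n of f(x) = sign(x) on [-pi, pi]
# by a midpoint Riemann sum for (1/pi) * int f(x) sin(nx) dx.

def b(n, samples=100_000):
    h = 2 * math.pi / samples
    xs = (-math.pi + (i + 0.5) * h for i in range(samples))
    return (1 / math.pi) * h * sum(math.copysign(1, x) * math.sin(n * x) for x in xs)

print(b(1), 4 / math.pi)   # both ≈ 1.27
assert abs(b(1) - 4 / math.pi) < 1e-3
assert abs(b(2)) < 1e-3    # even-index coefficients vanish by symmetry
```

Since f is odd, every a_n vanishes as well; only the sine terms survive, matching the odd/even observation made earlier for F_k.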

Note: The assumed property of f is that all the integrals

∫_{−π}^{π} f(x) sin(nx) dx, ∫_{−π}^{π} f(x) cos(nx) dx

exist; that is, f(x) sin(nx) and f(x) cos(nx) are integrable.

Some books replace a_0 with a_0/2, and then all the integrals have a 1/π factor.

The big question is: what does the series converge to?

403

The answer was given by Dirichlet:

Theorem 8.21

Let f : [−π, π] → R be a bounded, piecewise continuous and piecewise monotonic function. Furthermore, assume that f is periodic with period 2π and that

f(x) = (1/2) ( lim_{h→0⁺} f(x + h) + lim_{h→0⁻} f(x + h) )

for every value of x. Then

f(x) = a_0 + ∑_{n=1}^∞ (a_n cos(nx) + b_n sin(nx)),

where (a_n) and (b_n) are given by Definition 8.9.

404

There are several assumptions for f:

1. f is integrable on [−π, π]

2. f is periodic with period 2π

3. If f has a discontinuity, its value must be the average of the left and right limits.

4. Piecewise continuous (can be discontinuous at isolated points)

5. Piecewise monotonic

The rather strange condition

f(x) = (1/2) ( lim_{h→0⁺} f(x + h) + lim_{h→0⁻} f(x + h) )

is only important where the function is discontinuous. If the function is continuous at x, then the left and right limits are equal:

lim_{h→0⁺} f(x + h) = lim_{h→0⁻} f(x + h) = f(x),

and the equation is redundant.

405

Note:

There is a more general version, where item 5 is relaxed and f is only required to be of bounded variation; that is, there exist increasing functions g and h such that f(x) = g(x) − h(x) for all x ∈ [−π, π].

Thus Dirichlet’s Theorem explains Fourier’s solution with f(x) = 1.

406

Example 8.17

407

408

Example 8.18

409

410
