Course Notes
for
LINEAR MATHEMATICS
Weeks 1–6 of MATH2061
Jenny Henderson
© School of Mathematics and Statistics
The University of Sydney, 2011.

Contents

1 Solving Linear Systems
  1.1 The existence of solutions
  1.2 Techniques for finding solutions
    1.2.1 Gaussian elimination
    1.2.2 Solution by other methods
  1.3 Summary
2 Vector Spaces and Subspaces
  2.1 Axioms of a vector space
  2.2 Examples of vector spaces
  2.3 Subspaces of a vector space
  2.4 Examples of subspaces
    2.4.1 Subspaces of R3
    2.4.2 Subspaces of F
  2.5 Summary
3 Linear Combinations and Span
  3.1 Null space of a matrix
  3.2 Linear combinations
  3.3 The span of a set of vectors
  3.4 Column space of a matrix
  3.5 Summary
4 Linear Dependence and Independence
  4.1 Linear dependence
  4.2 Linear independence
  4.3 Basis and dimension
  4.4 Summary
5 Some Special Bases
  5.1 Lagrange polynomials
  5.2 Bases of the null and column spaces of a matrix
    5.2.1 Null space basis
    5.2.2 Column space basis
  5.3 Summary
6 Linear Transformations
  6.1 Linear transformations
  6.2 Examples of linear transformations
  6.3 Matrix transformations
  6.4 Geometric significance of matrix transformations
    6.4.1 Reflections
    6.4.2 Rotations
    6.4.3 Shears
    6.4.4 Scalings
  6.5 Composite transformations
  6.6 Summary
7 Eigenvalues and Eigenvectors
  7.1 Eigenvalues, eigenvectors and eigenspaces
  7.2 How to find eigenvalues and eigenvectors
  7.3 The diagonalisation theorem
  7.4 Finding powers of a matrix
  7.5 Summary
8 Linear Recurrence Relations
  8.1 Leslie population model
  8.2 Conduction of electrical impulses
  8.3 Growth of branches on a tree
  8.4 The general case
  8.5 Summary
9 Systems of Differential Equations
  9.1 Introduction
  9.2 Solving using diagonalisation
  9.3 The mixing problem
  9.4 Populations of interlinked species
  9.5 The diffusion equation
  9.6 Summary
Chapter 1
Solving Linear Systems
The ability to solve linear systems is the fundamental technique of any ele-
mentary course in linear algebra. In this chapter we revise the methods of
solution that were taught in first year.
1.1 The existence of solutions
A system of m linear equations in the unknowns $x_1, x_2, \dots, x_n$,
\[
\begin{aligned}
a_{11}x_1 + \dots + a_{1n}x_n &= b_1\\
&\;\vdots\\
a_{m1}x_1 + \dots + a_{mn}x_n &= b_m
\end{aligned}
\]
is first rewritten as a single matrix equation:
\[
\begin{pmatrix} a_{11} & \dots & a_{1n}\\ \vdots & & \vdots\\ a_{m1} & \dots & a_{mn} \end{pmatrix}
\begin{pmatrix} x_1\\ \vdots\\ x_n \end{pmatrix}
=
\begin{pmatrix} b_1\\ \vdots\\ b_m \end{pmatrix}.
\]
Denoting the coefficient matrix by A, the unknowns by the $n \times 1$ vector $\mathbf{x}$ and the known constants $b_1, \dots, b_m$ by the $m \times 1$ vector $\mathbf{b}$, the system can be expressed concisely as $A\mathbf{x} = \mathbf{b}$.
Another way of writing this linear system is to describe it in a way which
emphasises the columns of the matrix A.
We use the fact that
\[
A\mathbf{x} =
\begin{pmatrix} a_{11} & \dots & a_{1n}\\ \vdots & & \vdots\\ a_{m1} & \dots & a_{mn} \end{pmatrix}
\begin{pmatrix} x_1\\ \vdots\\ x_n \end{pmatrix}
=
\begin{pmatrix} a_{11}x_1 + \dots + a_{1n}x_n\\ \vdots\\ a_{m1}x_1 + \dots + a_{mn}x_n \end{pmatrix}
= x_1 \begin{pmatrix} a_{11}\\ \vdots\\ a_{m1} \end{pmatrix} + \dots + x_n \begin{pmatrix} a_{1n}\\ \vdots\\ a_{mn} \end{pmatrix}
= x_1\mathbf{c}_1 + \dots + x_n\mathbf{c}_n,
\]
where the vectors $\mathbf{c}_1, \dots, \mathbf{c}_n$ denote the columns of A.
In this way, we see that the original linear system $A\mathbf{x} = \mathbf{b}$ can be written as
\[
x_1\mathbf{c}_1 + \dots + x_n\mathbf{c}_n = \mathbf{b}.
\]
The left-hand side is called a linear combination of the column vectors $\mathbf{c}_1, \dots, \mathbf{c}_n$. Linear combinations occur frequently in this unit of study. They consist of the sum of one or more scalar multiples of vectors. The scalars $x_1, \dots, x_n$ are called weights in the linear combination. (The use of the word “weights” does not imply that $x_1, \dots, x_n$ are always positive.)
If $x_1, \dots, x_n$ are numbers which satisfy the original linear system, then the right-hand side column vector $\mathbf{b}$ is therefore expressible as a linear combination of the columns of A. Conversely, given any column vector $\mathbf{b}$ which can be expressed as a linear combination of the vectors $\mathbf{c}_1, \dots, \mathbf{c}_n$ with weights $w_1, \dots, w_n$, that is,
\[
\mathbf{b} = w_1\mathbf{c}_1 + \dots + w_n\mathbf{c}_n,
\]
then the weights $w_1, \dots, w_n$ provide a solution of the linear system $A\mathbf{x} = \mathbf{b}$. That is, the system has at least one solution, namely
\[
\mathbf{x} = \begin{pmatrix} w_1\\ \vdots\\ w_n \end{pmatrix}.
\]
The system Ax = b has a solution if and only if b is a linear
combination of the columns of A, the components of each solution
vector x being weights in the linear combination.
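This column view is easy to check numerically. Here is a minimal sketch, assuming NumPy is available (the matrix and weights are arbitrary illustrations):

    import numpy as np

    # An arbitrary 3x3 coefficient matrix and a vector of weights.
    A = np.array([[1.0, 5.0, 1.0],
                  [2.0, 8.0, 3.0],
                  [1.0, 3.0, 1.0]])
    x = np.array([2.0, -1.0, 3.0])

    # The matrix-vector product Ax ...
    Ax = A @ x

    # ... equals the linear combination x1*c1 + x2*c2 + x3*c3 of the columns.
    combo = sum(x[j] * A[:, j] for j in range(A.shape[1]))

    print(np.allclose(Ax, combo))   # True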
1.2 Techniques for finding solutions
The three techniques for finding solutions of Ax = b that were taught in first
year are: solution by Gaussian elimination, solution by obtaining reduced row
echelon form and solution by obtaining the inverse of the coefficient matrix
A.
1.2.1 Gaussian elimination
We illustrate this technique with an example of a system with infinitely many
solutions.
Example Solve $A\mathbf{x} = \mathbf{b}$ with
\[
A = \begin{pmatrix} 1 & 5 & 1 & 0\\ 2 & 8 & 3 & 6\\ 1 & 3 & 1 & 4 \end{pmatrix}
\quad\text{and}\quad
\mathbf{b} = \begin{pmatrix} 3\\ 2\\ 1 \end{pmatrix}.
\]
The Gaussian elimination technique involves forming the augmented matrix
[A b] and applying elementary row operations (EROs) to obtain a matrix
in row echelon form.
\[
[A\;\mathbf{b}] =
\begin{pmatrix} 1 & 5 & 1 & 0 & 3\\ 2 & 8 & 3 & 6 & 2\\ 1 & 3 & 1 & 4 & 1 \end{pmatrix}
\xrightarrow{R_2 = R_2 - 2R_1,\ R_3 = R_3 - R_1}
\begin{pmatrix} 1 & 5 & 1 & 0 & 3\\ 0 & -2 & 1 & 6 & -4\\ 0 & -2 & 0 & 4 & -2 \end{pmatrix}
\]
\[
\xrightarrow{R_2 = -\frac{1}{2}R_2}
\begin{pmatrix} 1 & 5 & 1 & 0 & 3\\ 0 & 1 & -\frac{1}{2} & -3 & 2\\ 0 & -2 & 0 & 4 & -2 \end{pmatrix}
\xrightarrow{R_3 = R_3 + 2R_2}
\begin{pmatrix} 1 & 5 & 1 & 0 & 3\\ 0 & 1 & -\frac{1}{2} & -3 & 2\\ 0 & 0 & -1 & -2 & 2 \end{pmatrix}
\]
\[
\xrightarrow{R_3 = -R_3}
\begin{pmatrix} 1 & 5 & 1 & 0 & 3\\ 0 & 1 & -\frac{1}{2} & -3 & 2\\ 0 & 0 & 1 & 2 & -2 \end{pmatrix}
\]
The last matrix is in row echelon form. The (1,1), (2,2) and (3,3) positions of this matrix are called pivot positions. These are the positions of the leading 1s. The fourth column (corresponding to the coefficients of $x_4$) is missing a leading 1, so $x_4$ is a free variable (parameter) and we express all other variables in terms of $x_4$. The equivalent (simplified) linear system is
\[
\begin{aligned}
x_1 + 5x_2 + x_3 &= 3\\
x_2 - \tfrac{1}{2}x_3 - 3x_4 &= 2\\
x_3 + 2x_4 &= -2
\end{aligned}
\]
Using back substitution we get
\[
\begin{aligned}
x_3 &= -2 - 2x_4\\
x_2 &= 2 + \tfrac{1}{2}x_3 + 3x_4 = 2 + \tfrac{1}{2}(-2 - 2x_4) + 3x_4 = 1 + 2x_4\\
x_1 &= 3 - 5x_2 - x_3 = 3 - 5(1 + 2x_4) - (-2 - 2x_4) = -8x_4
\end{aligned}
\]
The system has infinitely many solutions, one for each value of $x_4$, namely
\[
\mathbf{x} =
\begin{pmatrix} -8x_4\\ 1 + 2x_4\\ -2 - 2x_4\\ x_4 \end{pmatrix}
=
\begin{pmatrix} 0\\ 1\\ -2\\ 0 \end{pmatrix}
+ x_4 \begin{pmatrix} -8\\ 2\\ -2\\ 1 \end{pmatrix}.
\]
This gives us not only the general solution x of the linear system but also a
way of expressing b as a linear combination of the columns of A, using the
components of the solution vector x as weights:
\[
\mathbf{b} = \begin{pmatrix} 3\\ 2\\ 1 \end{pmatrix}
= -8x_4 \begin{pmatrix} 1\\ 2\\ 1 \end{pmatrix}
+ (1 + 2x_4) \begin{pmatrix} 5\\ 8\\ 3 \end{pmatrix}
+ (-2 - 2x_4) \begin{pmatrix} 1\\ 3\\ 1 \end{pmatrix}
+ x_4 \begin{pmatrix} 0\\ 6\\ 4 \end{pmatrix}
\]
for all values of $x_4$. That is, $\mathbf{b}$ is expressible as a linear combination of the columns of A in infinitely many ways. For example, the particular solution of the system corresponding to $x_4 = 0$ is
\[
\mathbf{x} = \begin{pmatrix} 0\\ 1\\ -2\\ 0 \end{pmatrix},
\quad\text{and}\quad
\mathbf{b} = \begin{pmatrix} 3\\ 2\\ 1 \end{pmatrix}
= 0 \begin{pmatrix} 1\\ 2\\ 1 \end{pmatrix}
+ \begin{pmatrix} 5\\ 8\\ 3 \end{pmatrix}
- 2 \begin{pmatrix} 1\\ 3\\ 1 \end{pmatrix}
+ 0 \begin{pmatrix} 0\\ 6\\ 4 \end{pmatrix}.
\]
The particular solution of the system corresponding to $x_4 = 1$ is
\[
\mathbf{x} = \begin{pmatrix} -8\\ 3\\ -4\\ 1 \end{pmatrix},
\quad\text{and}\quad
\mathbf{b} = \begin{pmatrix} 3\\ 2\\ 1 \end{pmatrix}
= -8 \begin{pmatrix} 1\\ 2\\ 1 \end{pmatrix}
+ 3 \begin{pmatrix} 5\\ 8\\ 3 \end{pmatrix}
- 4 \begin{pmatrix} 1\\ 3\\ 1 \end{pmatrix}
+ \begin{pmatrix} 0\\ 6\\ 4 \end{pmatrix}.
\]
Example Consider the system $A\mathbf{x} = \mathbf{b}$ where this time A is the square matrix
\[
A = \begin{pmatrix} 2 & 2 & 6\\ 1 & 1 & 4\\ 3 & 4 & 5 \end{pmatrix}
\quad\text{and}\quad
\mathbf{b} = \begin{pmatrix} 2\\ 3\\ 6 \end{pmatrix}.
\]
Form the augmented matrix and apply EROs to obtain row echelon form.
\[
[A\;\mathbf{b}] =
\begin{pmatrix} 2 & 2 & 6 & 2\\ 1 & 1 & 4 & 3\\ 3 & 4 & 5 & 6 \end{pmatrix}
\xrightarrow{R_1 = \frac{1}{2}R_1}
\begin{pmatrix} 1 & 1 & 3 & 1\\ 1 & 1 & 4 & 3\\ 3 & 4 & 5 & 6 \end{pmatrix}
\]
\[
\xrightarrow{R_2 = R_2 - R_1,\ R_3 = R_3 - 3R_1}
\begin{pmatrix} 1 & 1 & 3 & 1\\ 0 & 0 & 1 & 2\\ 0 & 1 & -4 & 3 \end{pmatrix}
\xrightarrow{R_2 \leftrightarrow R_3}
\begin{pmatrix} 1 & 1 & 3 & 1\\ 0 & 1 & -4 & 3\\ 0 & 0 & 1 & 2 \end{pmatrix}
\]
The last matrix is in row echelon form. Again the pivot positions are the
(1,1), (2,2) and (3,3) positions of the matrix.
The equivalent (simplified) linear system is
\[
\begin{aligned}
x_1 + x_2 + 3x_3 &= 1\\
x_2 - 4x_3 &= 3\\
x_3 &= 2
\end{aligned}
\]
Back substitution then gives
\[
x_3 = 2, \quad x_2 = 3 + 4x_3 = 11, \quad x_1 = 1 - x_2 - 3x_3 = -16.
\]
The system has a unique solution,
\[
\mathbf{x} = \begin{pmatrix} -16\\ 11\\ 2 \end{pmatrix}.
\]
According to our previous remarks, the vector b is therefore expressible
uniquely as a linear combination of the columns of A, with the components
of the solution x as weights:
\[
\mathbf{b} = \begin{pmatrix} 2\\ 3\\ 6 \end{pmatrix}
= -16 \begin{pmatrix} 2\\ 1\\ 3 \end{pmatrix}
+ 11 \begin{pmatrix} 2\\ 1\\ 4 \end{pmatrix}
+ 2 \begin{pmatrix} 6\\ 4\\ 5 \end{pmatrix}.
\]
1.2.2 Solution by other methods
Another way of solving the system of the previous example would be to
continue to apply EROs until the reduced row echelon form of [A b] is
obtained. Using this procedure we obtain the reduced row echelon matrix
\[
\begin{pmatrix} 1 & 0 & 0 & -16\\ 0 & 1 & 0 & 11\\ 0 & 0 & 1 & 2 \end{pmatrix},
\]
yielding the unique solution $x_1 = -16$, $x_2 = 11$, $x_3 = 2$. This method involves
more row operations than Gaussian elimination but has the benefit that the
solution is displayed in the final matrix, and no back substitution is required.
A third way of solving the system is to use the fact that the matrix A is (in
this example) invertible, with
\[
A^{-1} = \begin{pmatrix} 11/2 & -7 & -1\\ -7/2 & 4 & 1\\ -1/2 & 1 & 0 \end{pmatrix}.
\]
The unique solution is then given by the following calculation:
\[
\mathbf{x} = A^{-1}\mathbf{b} =
\begin{pmatrix} 11/2 & -7 & -1\\ -7/2 & 4 & 1\\ -1/2 & 1 & 0 \end{pmatrix}
\begin{pmatrix} 2\\ 3\\ 6 \end{pmatrix}
= \begin{pmatrix} -16\\ 11\\ 2 \end{pmatrix}.
\]
1.3 Summary
These are the results you should know and understand:
• A linear system Ax = b is consistent (that is, a solution vector x exists)
if and only if the right-hand-side column vector b can be written as a linear
combination of the columns of the matrix A.
• Solutions of linear systems can be found in various ways, but Gaussian
elimination is the most efficient from a computational point of view when
the system is of reasonable size.
This is the technique you must be able to apply:
• To solve Ax = b by Gaussian elimination, apply elementary row op-
erations (EROs) to the augmented matrix [A b] to obtain a matrix in row
echelon form. Then find the solution by back-substitution. This is an essen-
tial technique to master for this unit of study.
Practice Examples
Solve Ax = b using Gaussian elimination for the following matrices A and
column vectors b. If you’re a bit rusty on the procedure, refer to “the little
blue book” or re-read your first year linear algebra notes.
1. $A = \begin{pmatrix} 1 & 2 & 5\\ 3 & 1 & 0\\ 0 & 2 & 2 \end{pmatrix}$, $\mathbf{b} = \begin{pmatrix} 2\\ 6\\ 4 \end{pmatrix}$.

2. $A = \begin{pmatrix} 1 & 2 & 5\\ 3 & 1 & 7\\ 0 & 5 & 8 \end{pmatrix}$, $\mathbf{b} = \begin{pmatrix} 2\\ 6\\ 1 \end{pmatrix}$.

3. $A = \begin{pmatrix} 1 & 0 & 1 & 3\\ 3 & 1 & 0 & 2\\ 1 & 0 & 2 & 2 \end{pmatrix}$, $\mathbf{b} = \begin{pmatrix} 5\\ -3\\ 6 \end{pmatrix}$.
Answers
1. Unique solution, $\mathbf{x} = \begin{pmatrix} 1\\ 3\\ -1 \end{pmatrix}$. 2. No solution – system inconsistent.

3. Infinitely many solutions, $\mathbf{x} = \begin{pmatrix} 4\\ -15\\ 1\\ 0 \end{pmatrix} + x_4 \begin{pmatrix} -4\\ 10\\ 1\\ 1 \end{pmatrix}$ for all values of $x_4$.
Chapter 2
Vector Spaces and Subspaces
So far we have been studying the theory and practice of solving Ax = b, in
particular when A is square and the system has a unique solution. Now we
look at the more general case of rectangular A and a system with infinitely
many solutions, to motivate the study of vector spaces.
Suppose that A is a $3 \times 4$ matrix and that the augmented matrix $[A\;\mathbf{b}]$ has the following reduced row echelon form:
\[
[A\;\mathbf{b}] \xrightarrow{\text{various EROs}}
\begin{pmatrix} 1 & 0 & 2 & 3 & 1\\ 0 & 1 & 5 & 7 & 8\\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}
\]
There are two parameters ($x_3$ and $x_4$) and infinitely many solutions,
\[
\mathbf{x} =
\begin{pmatrix} 1 - 2x_3 - 3x_4\\ 8 - 5x_3 - 7x_4\\ x_3\\ x_4 \end{pmatrix}
=
\begin{pmatrix} 1\\ 8\\ 0\\ 0 \end{pmatrix}
+ x_3 \begin{pmatrix} -2\\ -5\\ 1\\ 0 \end{pmatrix}
+ x_4 \begin{pmatrix} -3\\ -7\\ 0\\ 1 \end{pmatrix},
\qquad x_3, x_4 \in \mathbb{R}.
\]
One particular solution is $\mathbf{x}' = \begin{pmatrix} 1\\ 8\\ 0\\ 0 \end{pmatrix}$, corresponding to $x_3 = x_4 = 0$.
If we had started with the homogeneous system $A\mathbf{x} = \mathbf{0}$, then the reduction to reduced row echelon form would have given the following matrix:
\[
[A\;\mathbf{0}] \longrightarrow
\begin{pmatrix} 1 & 0 & 2 & 3 & 0\\ 0 & 1 & 5 & 7 & 0\\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}
\]
and the (infinitely many) solutions would be described by the equation
\[
\mathbf{x} = x_3 \begin{pmatrix} -2\\ -5\\ 1\\ 0 \end{pmatrix}
+ x_4 \begin{pmatrix} -3\\ -7\\ 0\\ 1 \end{pmatrix},
\qquad x_3, x_4 \in \mathbb{R}.
\]
We can see that every solution of Ax = b is the sum of a particular solution
x′ of Ax = b plus a solution of the homogeneous system Ax = 0. The
“infinity of solutions” comes from the homogeneous system Ax = 0.
The study of homogeneous systems is important for two reasons:
1. If $\mathbf{u}$ and $\mathbf{v}$ are both solutions of $A\mathbf{x} = \mathbf{0}$ then so is $\mathbf{u} + \mathbf{v}$:
\[ A(\mathbf{u} + \mathbf{v}) = A\mathbf{u} + A\mathbf{v} = \mathbf{0} + \mathbf{0} = \mathbf{0}. \]
2. If $\mathbf{u}$ is a solution of $A\mathbf{x} = \mathbf{0}$ then so is $k\mathbf{u}$ for any real number k:
\[ A(k\mathbf{u}) = kA\mathbf{u} = k\mathbf{0} = \mathbf{0}. \]
In other words, if we confine our attention to the set of all solutions of Ax = 0,
the operations of addition of vectors and multiplication of a vector by a scalar
go on within the set, and never take us outside the set. This particular set is
an example of a mathematical structure called a vector space.
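Both closure properties are easy to see in action. A small numerical sketch, assuming NumPy (the matrix is the one from the reduced system above):

    import numpy as np

    # A matrix in reduced row echelon form; Ax = 0 has non-trivial solutions.
    A = np.array([[1.0, 0.0, 2.0, 3.0],
                  [0.0, 1.0, 5.0, 7.0],
                  [0.0, 0.0, 0.0, 0.0]])

    # The two solutions of Ax = 0 read off in the text.
    u = np.array([-2.0, -5.0, 1.0, 0.0])
    v = np.array([-3.0, -7.0, 0.0, 1.0])

    print(np.allclose(A @ (u + v), 0))    # True: the sum is again a solution
    print(np.allclose(A @ (4.2 * u), 0))  # True: so is any scalar multiple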
2.1 Axioms of a vector space
Definition A vector space over R is a non empty set V whose elements
are called vectors, and on which there are two operations defined, namely
addition of vectors and multiplication of a vector by a scalar (scalars being
numbers in R), satisfying the following 10 axioms:
A1 For all u and v in V , the vector u + v is also in V . This property is
called closure under addition.
A2 For all u, v and w in V, the vector (u + v) + w equals the vector u + (v + w). This is called the associative law of addition.
A3 For all u and v in V , the vector u + v equals the vector v + u. That
is, addition is commutative.
A4 There is a zero vector, written 0, in V with the property that for all
vectors v in V ,
v + 0 = 0 + v = v.
A5 For every vector v in V there is a vector which we write as −v and call a negative of v, such that v + (−v) = 0.
S1 For every vector v in V , and for every scalar k in R, the vector kv is in
V . This property is called closure under multiplication by a scalar.
S2 For all scalars k in R and all vectors u and v in V, k(u + v) = ku + kv.
S3 For all scalars k1 and k2 in R and all vectors v in V ,
(k1 + k2)v = k1v + k2v.
S4 For all scalars k1 and k2 in R and all vectors v in V , (k1k2)v = k1(k2v).
S5 For all vectors v in V , 1v = v.
From these axioms it is possible to deduce a number of important results,
including the following. In a given vector space V , there is only one zero
vector; for each vector v in V there is a unique negative vector. As a result,
we may confidently speak of the zero vector in V and the negative of v. In
addition, it is also true that 0v = 0 and (−1)v = −v, for every vector v in
V .
2.2 Examples of vector spaces
The study of vector spaces is hugely important in mathematics because so
many apparently unrelated mathematical structures obey the vector space
axioms. Developing an abstract theory of vector spaces then allows us to
apply the results to a wide range of different situations.
Example The most important vector spaces for us are denoted by the symbols
\[ \mathbb{R}, \ \mathbb{R}^2, \ \mathbb{R}^3, \ \dots, \ \mathbb{R}^n, \ \dots \]
(one for each positive integer n). The vector space Rn is defined as follows:
\[
\mathbb{R}^n = \left\{ \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix} \;\middle|\; x_1, x_2, \dots, x_n \in \mathbb{R} \right\}
\]
This is read as “Rn is the set of all columns of length n containing real
number entries which can take all real values”.
In Rn, the operations of addition and multiplication by a scalar are performed “component by component” within the columns. For example, in R2,
\[
\begin{pmatrix} 1\\ 5 \end{pmatrix} + \begin{pmatrix} 7\\ -2 \end{pmatrix} = \begin{pmatrix} 8\\ 3 \end{pmatrix}
\]
and in R3,
\[
5 \begin{pmatrix} 3\\ 1\\ 4 \end{pmatrix} = \begin{pmatrix} 15\\ 5\\ 20 \end{pmatrix}.
\]
For a given positive integer n, the set Rn together with these operations satisfies all the axioms of a vector space. The vector
\[ \mathbf{0} = \begin{pmatrix} 0\\ 0\\ \vdots\\ 0 \end{pmatrix} \]
in Rn has the property required in A4, and the vector
\[ \begin{pmatrix} -x_1\\ -x_2\\ \vdots\\ -x_n \end{pmatrix} \]
has the property required in A5; namely it is the negative of
\[ \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix} \]
since
\[
\begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix} + \begin{pmatrix} -x_1\\ -x_2\\ \vdots\\ -x_n \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ \vdots\\ 0 \end{pmatrix} = \mathbf{0}.
\]
Example The next example of a vector space may be somewhat surprising.
Define the set F as follows:
\[
\mathbb{F} = \{ f \mid f \text{ is a function with domain and codomain } \mathbb{R} \} = \{ f \mid f : \mathbb{R} \to \mathbb{R} \}
\]
This set F satisfies all ten axioms of a vector space. It is called a function
space as the vectors are functions, with the two operations of addition of
functions and multiplication of a function by a scalar. Each of these is
calculated using the formula for the function. For example, if f and g are the functions whose formulas are $f(x) = x^2$ and $g(x) = \sin x$, then the functions $f + g$, $3g$, $-5f$ are also in the set F, and are functions with formulas $x^2 + \sin x$, $3 \sin x$, $-5x^2$ respectively. The zero vector in F is the function z
with formula z(x) = 0 for all x in R:
f(x) + z(x) = f(x) + 0 = f(x), for all x ∈ R,
that is,
f + z = z + f = f for all f ∈ F.
Example Yet another quite different example of a vector space over R is
the set C of all complex numbers, C = {a + ib | a, b ∈ R}. Addition and
multiplication are the usual complex number operations,
\[
(a + ib) + (c + id) = (a + c) + i(b + d) \quad\text{and}\quad k(a + ib) = ka + ikb.
\]
The zero vector is $0 + i0$ and the negative of $a + ib$ is $-a - ib$.
Example The set $M_{m,n}$ of all $m \times n$ matrices with real entries is a vector space over R. For example, two particular vectors in $M_{2,3}$ include the matrices
\[
\begin{pmatrix} 1 & 3 & 4\\ 0 & 2 & 1 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} -3 & 1 & -1\\ -8 & 4 & -6 \end{pmatrix}.
\]
In $M_{2,3}$ (and in all $M_{m,n}$) addition of vectors and multiplication of a vector by a scalar are defined by the usual rules:
\[
\begin{pmatrix} a & b & c\\ d & e & f \end{pmatrix} + \begin{pmatrix} g & h & i\\ j & k & l \end{pmatrix}
= \begin{pmatrix} a+g & b+h & c+i\\ d+j & e+k & f+l \end{pmatrix}
\]
and
\[
\lambda \begin{pmatrix} a & b & c\\ d & e & f \end{pmatrix}
= \begin{pmatrix} \lambda a & \lambda b & \lambda c\\ \lambda d & \lambda e & \lambda f \end{pmatrix}.
\]
The zero vector in $M_{2,3}$ is the zero $2 \times 3$ matrix $\begin{pmatrix} 0 & 0 & 0\\ 0 & 0 & 0 \end{pmatrix}$.
2.3 Subspaces of a vector space
The vector spaces we shall study most often are vector spaces contained within
Rn or within F. If one vector space is contained within another vector space,
with the same operations of addition and multiplication by a scalar, then the
smaller space is called a subspace of the larger space.
Definition Let S be a non empty subset of a vector space V . If S itself
satisfies the 10 vector space axioms previously listed (axioms A1, A2, A3,
A4, A5, S1, S2, S3, S4, S5), with the same operations of addition and
multiplication by a scalar, then S is called a subspace of V .
Note that this definition allows S to be equal to V , so that every vector space
is considered to be a subspace of itself. Also note that at the other extreme,
the set containing just the zero vector 0 of V also satisfies all ten axioms and
is therefore a subspace of V .
A more important observation is that it’s not necessary to test all 10 axioms
to confirm that a given subset of a known vector space is a subspace. This
is because every subset of a vector space automatically satisfies axioms A2,
A3, S2, S3, S4 and S5.
• If S also satisfies S1, then for all vectors v ∈ S, the vector 0v = 0 will
also be in S, as will the vector (−1)v = −v. That is, “axiom S1 is true for
S” implies that “axioms A4 and A5 are also true for S”.
• If, in addition, S satisfies A1, then we have the complete set of 10 axioms
satisfied for S.
Our conclusion can then be summarised in the form of a testing procedure
for subspaces, as follows.
A subset S of a vector space V is a subspace of V if S is not empty
and
(i) S satisfies A1 (that is, for all vectors u and v ∈ S, the vector
u+ v is also in S), and
(ii) S satisfies S1 (that is, for all vectors u ∈ S and for all scalars k, the vector ku is in S).
When testing a subset S to see if it’s a subspace of a given vector space V ,
the condition that S is non empty is traditionally confirmed by showing that
the zero vector in V is a member of the subset S. Every subspace is a vector
space in its own right, and therefore must possess a zero vector. In fact, the
zero vector of a subspace of V must be the same as the zero vector of the
space V . If the zero vector of V is not a member of a subset S of V , then that
subset cannot be a subspace of V . Checking this as a first step is therefore
a good idea.
If the zero vector of V is not in S then S is not a subspace. If the
zero vector of V is a member of S, then axioms A1 and S1 must
then be checked to determine whether or not S is a subspace.
2.4 Examples of subspaces
2.4.1 Subspaces of R3
Example Let
\[
S = \left\{ \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} \;\middle|\; x_1 = x_2 = x_3 \right\},
\]
a subset of R3. This set contains all columns of length 3 in which all entries are the same. The set S is not empty, as $\begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix} \in S$, since $0 = 0 = 0$.
We now check A1. Let $\begin{pmatrix} a\\ a\\ a \end{pmatrix}$ and $\begin{pmatrix} b\\ b\\ b \end{pmatrix}$ be any two vectors in S. Then
\[
\begin{pmatrix} a\\ a\\ a \end{pmatrix} + \begin{pmatrix} b\\ b\\ b \end{pmatrix} = \begin{pmatrix} a+b\\ a+b\\ a+b \end{pmatrix}.
\]
Clearly, $\begin{pmatrix} a+b\\ a+b\\ a+b \end{pmatrix}$ is also in S (since all the entries are the same), and therefore A1 is satisfied.
We now check S1. Let $\begin{pmatrix} a\\ a\\ a \end{pmatrix}$ be any vector in S and let k be any scalar. Then
\[
k \begin{pmatrix} a\\ a\\ a \end{pmatrix} = \begin{pmatrix} ka\\ ka\\ ka \end{pmatrix}.
\]
Clearly $\begin{pmatrix} ka\\ ka\\ ka \end{pmatrix}$ is also in S, and therefore S1 is satisfied.
We conclude that S is a subspace of R3.
The subspace S has a geometrical interpretation: S corresponds to all points in 3D space with coordinates which are multiples of (1, 1, 1). To see this, identify the point in space with coordinates $(x_1, x_2, x_3)$ with the vector $\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}$ in R3. Then it is possible to visualise S as the line through the origin and the point (1, 1, 1), that is, the line in the direction of the cartesian vector $\mathbf{i} + \mathbf{j} + \mathbf{k}$.
[Figure: the subspace S drawn as the line through the origin (0, 0, 0), the point (1, 1, 1) and a general point (a, a, a).]
Example Let
\[
S = \left\{ \begin{pmatrix} x_1\\ x_2 \end{pmatrix} \;\middle|\; x_2 \ge x_1 \right\},
\]
a subset of R2. It is clear that S is not empty: $\begin{pmatrix} 0\\ 0 \end{pmatrix}$ is in S, as $0 \ge 0$.
Check A1. Let $\mathbf{u} = \begin{pmatrix} x_1\\ x_2 \end{pmatrix}$ and $\mathbf{v} = \begin{pmatrix} y_1\\ y_2 \end{pmatrix}$ be any two vectors in S. Then $x_2 \ge x_1$ and $y_2 \ge y_1$. The vector $\mathbf{u} + \mathbf{v}$ is the vector $\begin{pmatrix} x_1 + y_1\\ x_2 + y_2 \end{pmatrix}$, and this vector is also in S since if $x_2 \ge x_1$ and $y_2 \ge y_1$ then $x_2 + y_2 \ge x_1 + y_1$ (add the two inequalities). Therefore A1 is satisfied.
Check S1. In this particular case, S does not satisfy axiom S1. To see this, simply multiply a vector in S by a negative scalar. For example,
\[
-2 \begin{pmatrix} 1\\ 2 \end{pmatrix} = \begin{pmatrix} -2\\ -4 \end{pmatrix}.
\]
The resulting vector $\begin{pmatrix} -2\\ -4 \end{pmatrix}$ is no longer a vector in S, as $-4$ is not greater than or equal to $-2$.
We conclude that S is not a subspace of R2.
Example Let
\[
S = \left\{ \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix} \;\middle|\; x_1 = x_2 + x_3 \right\}.
\]
Then S is not empty: $\begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix}$ is in S, since $0 = 0 + 0$.
Check A1. Let $\mathbf{u} = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}$ and $\mathbf{v} = \begin{pmatrix} y_1\\ y_2\\ y_3 \end{pmatrix}$ be any two vectors in S. Adding them gives the vector $\mathbf{u} + \mathbf{v} = \begin{pmatrix} x_1 + y_1\\ x_2 + y_2\\ x_3 + y_3 \end{pmatrix}$. We know that $x_1 = x_2 + x_3$ and $y_1 = y_2 + y_3$ since both $\mathbf{u}$ and $\mathbf{v}$ are in S. Therefore
\[
x_1 + y_1 = x_2 + x_3 + y_2 + y_3 = x_2 + y_2 + x_3 + y_3.
\]
Hence $\begin{pmatrix} x_1 + y_1\\ x_2 + y_2\\ x_3 + y_3 \end{pmatrix}$ is in S, as its first component is the sum of its second and third components.
Check S1. Let k be any scalar, and let $\begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}$ be any vector in S. Then $x_1 = x_2 + x_3$, and so $kx_1 = kx_2 + kx_3$. That is, the vector $\begin{pmatrix} kx_1\\ kx_2\\ kx_3 \end{pmatrix}$ is also in S.
We conclude that S is a subspace of R3. Geometrically, S corresponds to all points lying on the plane through the origin with cartesian equation
\[
x_1 - x_2 - x_3 = 0.
\]
[Figure: the subspace S drawn as a plane through the origin in (x, y, z)-space.]
Subspaces of Rn where n > 3 are not possible to visualise in this way. But
for the cases n = 2 and n = 3 we will shortly see that all proper subspaces
of R2 and R3 correspond geometrically to either lines or planes through the
origin in 2D and 3D space.
2.4.2 Subspaces of F
Example Let S be the set of all polynomials of degree less than or equal to 2. Thus S includes all constant polynomials (including the zero polynomial), all linear functions and all quadratics. In set notation,
\[
S = \{ f \mid f(x) = a + bx + cx^2 \text{ where } a, b, c \in \mathbb{R} \}.
\]
Clearly, S is not empty; it contains the zero polynomial z whose formula is $z(x) = 0 + 0x + 0x^2 = 0$ for all real x.
Check A1. Let f and g be any two functions in S. Let $f(x) = a + bx + cx^2$ and $g(x) = d + ex + fx^2$. Then $(f + g)(x) = (a + d) + (b + e)x + (c + f)x^2$, which shows that $f + g$ is in S. Therefore A1 is satisfied.
Check S1. Let k be any scalar, and let f be any function in S. Then as
\[
k(a + bx + cx^2) = ka + kbx + kcx^2,
\]
we see that $kf$ is a polynomial of degree $\le 2$, and hence $kf$ is in S. Therefore axiom S1 is satisfied, and we conclude that S is a subspace of F.
The subspace we have just investigated is part of a useful family of subspaces
of F which warrants a special notation:
P2 = the set of all polynomials of degree ≤ 2.
Similarly,
P1 = the set of all polynomials of degree ≤ 1,
P3 = the set of all polynomials of degree ≤ 3,
and in general,
Pn = the set of all polynomials of degree ≤ n.
For all non-negative integers n, Pn is a subspace of F, and if m and n are
non-negative integers with m ≤ n then Pm is a subspace of Pn.
Example Let S be the set of all polynomials of degree exactly equal to 2,
\[
S = \{ f \mid f(x) = a + bx + cx^2, \text{ where } a, b, c \in \mathbb{R} \text{ and } c \neq 0 \}.
\]
That is, S consists precisely of the set of all quadratic polynomials. Then axiom A1 is not satisfied, since it is possible to add two quadratics and obtain a polynomial of lesser degree. For example,
\[
(x^2 + x) + (-x^2 + 3x + 2) = 4x + 2,
\]
which has degree 1. We conclude that S is not a subspace of F. The same conclusion could also have been reached by observing that the zero vector in F, namely the function z with formula $z(x) = 0$ for all x, is not a member of S, as it can't be considered as a polynomial of degree 2.
Example Let S be the set of all functions f such that f(5) = 0. Then S
consists of all functions whose graphs pass through the point in the xy-plane
with coordinates (5,0). The zero function is one such function, as its graph
certainly passes through (5,0). Therefore S is not empty.
Multiplying any function in S by an arbitrary scalar k produces another
function whose graph still passes through (5,0), as does adding any two such
functions. The relevant formal calculations are as follows.
For axiom S1: if f(5) = 0, then (kf)(5) = k f(5) = k × 0 = 0.
For axiom A1: If f(5) = 0 and g(5) = 0, then
(f + g)(5) = f(5) + g(5) = 0 + 0 = 0.
Therefore S is a subspace of F.
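You can mimic this argument concretely with Python functions standing in for vectors of F; the particular functions below are arbitrary choices for illustration:

    # Two functions whose graphs pass through (5, 0).
    f = lambda x: x - 5
    g = lambda x: (x - 5) * x**2

    # Build the sum and a scalar multiple pointwise, as in F.
    h = lambda x: f(x) + g(x)     # (f + g)(x)
    k = lambda x: -3 * f(x)       # (-3f)(x)

    print(h(5), k(5))             # 0 0: both stay in S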
2.5 Summary
You should know and understand the following definitions.
• A vector space is a non empty set which obeys the ten axioms A1 to
A5 and S1 to S5 listed in this chapter.
• A subspace of a vector space V is a non empty subset of V which satisfies
the given ten axioms using the same operations of addition and multiplication
by a scalar that apply in V .
You must be able to apply the testing procedure for subspaces.
A subset S of a vector space V is a subspace of V if S is not empty
and
(i) S satisfies A1 (that is, for all vectors u and v ∈ S, the vector
u+ v is also in S), and
(ii) S satisfies S1 (that is, for all vectors u ∈ S and for all scalars k, the vector ku is in S).
To check that S is not empty, see whether or not the zero vector of V is a
member of the set S. If it’s not, then S is not a subspace and no further
testing is needed. If it is, then the next step is to find out if axioms A1
and S1 are satisfied. If you think that they are very likely to be satisfied,
then attempt to check them by selecting arbitrary vectors and scalars, not
particular numerical examples of vectors and scalars. However, if you believe
that S is not a subspace, you may select numerical examples of vectors
and/or scalars to demonstrate that either A1 or S1 is not satisfied.
Practice Examples
Test each subset to decide whether or not it is a subspace of the given vector
space.
1. $S = \left\{ \begin{pmatrix} x_1\\ x_2 \end{pmatrix} \;\middle|\; x_1 + x_2 \ge 0 \right\}$, a subset of R2.
2. T = {f | f(x) = f(−x) for all values of x}, a subset of F.
Solutions
1. The zero vector $\begin{pmatrix} 0\\ 0 \end{pmatrix}$ is in S, as $0 + 0 \ge 0$. Therefore S is not empty.
However, S is not closed under scalar multiplication. To see this, observe that $\begin{pmatrix} 5\\ -1 \end{pmatrix}$ is in S because $5 + (-1) \ge 0$, but
\[
-3 \begin{pmatrix} 5\\ -1 \end{pmatrix} = \begin{pmatrix} -15\\ 3 \end{pmatrix}
\]
is not in S as $-15 + 3$ is not greater than or equal to zero. Thus S is not a subspace.
2. You probably recognise T as the set of all even functions (with graphs
symmetrical about the y-axis). The set T contains the zero function z,
as z(x) = z(−x) = 0 for all x. Hence T is not empty.
To check A1, let f and g be any two functions in T . We know that
f(x) = f(−x) and g(x) = g(−x). Then the function f + g is also in T ,
because for all x,
(f + g)(x) = f(x) + g(x) = f(−x) + g(−x) = (f + g)(−x).
To check S1, let f be any function in T and let k be any scalar. We
know that f(x) = f(−x). Then the function kf is also in T , because
for all x,
(kf)(x) = k × f(x) = k × f(−x) = (kf)(−x).
The set T is therefore a subspace of F.
Chapter 3
Linear Combinations and Span
3.1 Null space of a matrix
Remarks at the beginning of the previous chapter show that if A is an m×n
matrix and S is the set of all solutions of the homogeneous linear system
Ax = 0, then
• the zero column vector of length n is a solution,
• the sum of any two solutions is also a solution, and
• any scalar multiple of a solution is also a solution.
Since all solutions x are members of Rn, we can now re-state these facts in
the light of our discussion of vector spaces and subspaces.
For every m × n matrix A, the set of all solutions of the homogeneous
linear system Ax = 0 is a subspace of Rn.
This particular subspace of Rn is given a special name.
Definition The set of all solutions of Ax = 0, where A is an m×n matrix,
is called the null space of A and is a subspace of Rn.
Null(A) = {x | Ax = 0}.
Example Find the null space of
\[
A = \begin{pmatrix} 1 & 3 & 2 & 4\\ 2 & 7 & 4 & 9 \end{pmatrix}.
\]
Let $\mathbf{x} = \begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix}$. Forming the augmented matrix $[A\;\mathbf{0}]$ and applying EROs gives the reduced row echelon form of $[A\;\mathbf{0}]$ to be the matrix
\[
\begin{pmatrix} 1 & 0 & 2 & 1 & 0\\ 0 & 1 & 0 & 1 & 0 \end{pmatrix}.
\]
Hence $x_3$ and $x_4$ are parameters and
\[
\mathbf{x} = x_3 \begin{pmatrix} -2\\ 0\\ 1\\ 0 \end{pmatrix} + x_4 \begin{pmatrix} -1\\ -1\\ 0\\ 1 \end{pmatrix},
\qquad \text{for all } x_3, x_4 \in \mathbb{R}.
\]
Every vector in the null space of A is therefore a combination of the two constant vectors $\begin{pmatrix} -2\\ 0\\ 1\\ 0 \end{pmatrix}$ and $\begin{pmatrix} -1\\ -1\\ 0\\ 1 \end{pmatrix}$ in R4. The coefficients in this combination are scalars $x_3$ and $x_4$. Expressions such as
\[
x_3 \begin{pmatrix} -2\\ 0\\ 1\\ 0 \end{pmatrix} + x_4 \begin{pmatrix} -1\\ -1\\ 0\\ 1 \end{pmatrix},
\]
which consist of scalar multiples of vectors added to other scalar multiples of vectors, are called linear combinations of vectors.
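SymPy produces exactly these spanning vectors. A quick check of the example, assuming SymPy is available:

    from sympy import Matrix

    A = Matrix([[1, 3, 2, 4],
                [2, 7, 4, 9]])

    # nullspace() returns a list of column vectors spanning Null(A).
    for v in A.nullspace():
        print(v.T)    # Matrix([[-2, 0, 1, 0]]) then Matrix([[-1, -1, 0, 1]])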
3.2 Linear combinations
We have already referred briefly to linear combinations in the first section of
Chapter 1, and we now make a formal definition.
Definition Let $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n$ be vectors in a vector space V, and let $a_1, a_2, \dots, a_n$ be scalars. Then the vector $a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n$ is called a linear combination of the vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n$.
Example In R3, the vector $\begin{pmatrix} 5\\ 1\\ -2 \end{pmatrix}$ is a linear combination of the vectors $\begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix}, \begin{pmatrix} 0\\ 1\\ 0 \end{pmatrix}, \begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix}$ in a natural way, as
\[
\begin{pmatrix} 5\\ 1\\ -2 \end{pmatrix} = 5 \begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix} + 1 \begin{pmatrix} 0\\ 1\\ 0 \end{pmatrix} - 2 \begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix}.
\]
Notice that $\begin{pmatrix} 5\\ 1\\ -2 \end{pmatrix}$ is also expressible as a linear combination of other sets of vectors in R3, for example
\[
\begin{pmatrix} 5\\ 1\\ -2 \end{pmatrix} = 1 \begin{pmatrix} 2\\ 1\\ 1 \end{pmatrix} + 1 \begin{pmatrix} 1\\ 2\\ 1 \end{pmatrix} + 2 \begin{pmatrix} 1\\ -1\\ -2 \end{pmatrix}.
\]
Example In F, let f, g, h be the functions given by the following formulas:
\[
f(x) = \sin^2 x, \quad g(x) = 1, \quad h(x) = \cos 2x
\]
for all values of x. Then f is a linear combination of g and h, since
\[
f = \tfrac{1}{2}g - \tfrac{1}{2}h.
\]
This comes from the trigonometric identity
\[
\sin^2 x = \tfrac{1}{2}(1) - \tfrac{1}{2}\cos 2x,
\]
which is true for all x.
Example Every vector in the null space of the matrix $A = \begin{pmatrix} 1 & 3 & 2 & 4\\ 2 & 7 & 4 & 9 \end{pmatrix}$ is a linear combination of the vectors $\begin{pmatrix} -2\\ 0\\ 1\\ 0 \end{pmatrix}$ and $\begin{pmatrix} -1\\ -1\\ 0\\ 1 \end{pmatrix}$. Moreover, referring back to the null space calculation at the beginning of this chapter, it is clear that Null(A) is the set of all linear combinations of these two particular vectors, as the coefficients $x_3$ and $x_4$ take all real values.
3.3 The span of a set of vectors
Definition Let $X = \{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}$ be a set of vectors in a vector space V. The span of X, written Span(X), is the set of all linear combinations of vectors in X. In set notation,
\[
\mathrm{Span}(X) = \{ a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n \mid a_1, a_2, \dots, a_n \in \mathbb{R} \}.
\]
The span of a set of vectors is an important concept to understand well.
Notice that the set of vectors X mentioned in the definition contains a fi-
nite number of vectors, whereas Span(X) contains infinitely many vectors,
generated as the coefficients a1, a2, . . . . . . , an take on infinitely many differ-
ent real values. (There is one exception. In the extreme case when X is
the set containing only the zero vector, then Span(X) also contains just the
zero vector.) More importantly, the span of a set of vectors is a subspace.
This means we have a method of constructing a subspace of a given vector
space, simply by selecting any non-empty set of vectors X from the space
and finding the span of that set.
Theorem 3.1 Let $X = \{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}$ be a set of vectors in a vector space V. Then Span(X) is a subspace of V. Moreover, Span(X) is the smallest subspace of V containing the set X.
Proof First we prove that Span(X) is a subspace of V. Denoting the zero vector of V by $\mathbf{0}$, we see that $\mathbf{0} \in \mathrm{Span}(X)$, as
\[
\mathbf{0} = 0\mathbf{v}_1 + 0\mathbf{v}_2 + \dots + 0\mathbf{v}_n.
\]
Therefore Span(X) is not empty.
Check A1: Let $a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n$ and $b_1\mathbf{v}_1 + b_2\mathbf{v}_2 + \dots + b_n\mathbf{v}_n$ be any two members of Span(X). Then their sum is
\[
(a_1 + b_1)\mathbf{v}_1 + (a_2 + b_2)\mathbf{v}_2 + \dots + (a_n + b_n)\mathbf{v}_n,
\]
which is clearly a linear combination of vectors in X, and hence also a member of Span(X). Thus A1 is satisfied.
Check S1: Let k be any scalar and $a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n$ be any vector in Span(X). Then
\[
k(a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n) = ka_1\mathbf{v}_1 + ka_2\mathbf{v}_2 + \dots + ka_n\mathbf{v}_n,
\]
again clearly in Span(X). Therefore axiom S1 is also satisfied. We conclude that Span(X) is a subspace of V.
Now we prove that Span(X) is the smallest subspace of V that contains X. Notice that X is certainly contained in Span(X), as every vector $\mathbf{v}_i$ in X is a linear combination of all the vectors in X,
\[
\mathbf{v}_i = 0\mathbf{v}_1 + 0\mathbf{v}_2 + \dots + 0\mathbf{v}_{i-1} + 1\mathbf{v}_i + 0\mathbf{v}_{i+1} + \dots + 0\mathbf{v}_n.
\]
Now suppose that W is a subspace of V containing X. We must show that $\mathrm{Span}(X) \subseteq W$. As W is a subspace, axioms A1 and S1 hold in W. Thus $a_i\mathbf{v}_i \in W$ for every value of $a_i$ (by S1), and then $\sum_{i=1}^{n} a_i\mathbf{v}_i \in W$ (by A1). Therefore every linear combination of $\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}$ is in W. Hence Span(X) is contained in W and the proof is complete.
We have defined the span of a set of vectors X as the set of all linear com-
binations of vectors in that set. Suppose that Span(X) is denoted by W .
There are several different but equivalent ways of stating this information –
you should be familiar with all of these:
• W = Span(X)
• W is the span of X
• X spans W
• W is spanned by X
• X is a spanning set for W .
Example Let $X = \left\{ \begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix}, \begin{pmatrix} 0\\ 1\\ 0 \end{pmatrix}, \begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix} \right\}$, a subset of R3. Then Span(X) is a subset of R3 by definition. However, every vector in R3 is expressible as a linear combination of the vectors in X,
\[
\begin{pmatrix} a\\ b\\ c \end{pmatrix} = a \begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix} + b \begin{pmatrix} 0\\ 1\\ 0 \end{pmatrix} + c \begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix}.
\]
That is, $\mathbb{R}^3 \subseteq \mathrm{Span}(X)$. Hence $\mathrm{Span}(X) = \mathbb{R}^3$.
Example Let $X = \left\{ \begin{pmatrix} 1\\ 2\\ 3 \end{pmatrix}, \begin{pmatrix} 1\\ 1\\ 1 \end{pmatrix} \right\}$, a subset of R3. By definition,
\[
\mathrm{Span}(X) = \left\{ a \begin{pmatrix} 1\\ 2\\ 3 \end{pmatrix} + b \begin{pmatrix} 1\\ 1\\ 1 \end{pmatrix} \;\middle|\; a, b \in \mathbb{R} \right\}.
\]
Can Span(X) be identified more precisely as a particular subspace of R3? All proper subspaces of R3 correspond geometrically to either a line through the origin or a plane through the origin. We can find Span(X) by finding the condition that an arbitrary vector $\mathbf{v} = \begin{pmatrix} x\\ y\\ z \end{pmatrix}$ must satisfy in order to be a member of Span(X).
The vector $\mathbf{v}$ is in Span(X) if and only if there are scalars a, b such that
\[
\mathbf{v} = \begin{pmatrix} x\\ y\\ z \end{pmatrix} = a \begin{pmatrix} 1\\ 2\\ 3 \end{pmatrix} + b \begin{pmatrix} 1\\ 1\\ 1 \end{pmatrix}.
\]
This is equivalent to the linear system
\[
\begin{pmatrix} 1 & 1\\ 2 & 1\\ 3 & 1 \end{pmatrix}
\begin{pmatrix} a\\ b \end{pmatrix}
= \begin{pmatrix} x\\ y\\ z \end{pmatrix}
\]
having at least one solution for the unknowns a and b; in other words, the system must be consistent.
Form the augmented matrix and obtain its row echelon form.
\[
\begin{pmatrix} 1 & 1 & x\\ 2 & 1 & y\\ 3 & 1 & z \end{pmatrix}
\longrightarrow
\begin{pmatrix} 1 & 1 & x\\ 0 & -1 & y - 2x\\ 0 & -2 & z - 3x \end{pmatrix}
\longrightarrow
\begin{pmatrix} 1 & 1 & x\\ 0 & 1 & 2x - y\\ 0 & 0 & x - 2y + z \end{pmatrix}
\]
We see that the system is consistent if and only if $x - 2y + z = 0$. This means that the vector $\begin{pmatrix} x\\ y\\ z \end{pmatrix}$ is in Span(X) if and only if $x - 2y + z = 0$. Geometrically, Span(X) corresponds to points (x, y, z) on the plane through the origin with cartesian equation $x - 2y + z = 0$.
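Membership tests like this one can be automated: a vector lies in Span(X) exactly when the corresponding linear system is consistent. A minimal sketch assuming SymPy (the two test vectors are arbitrary illustrations):

    from sympy import Matrix, linsolve, symbols

    a, b = symbols('a b')
    X = Matrix([[1, 1],
                [2, 1],
                [3, 1]])           # columns are the spanning vectors

    v_in = Matrix([3, 5, 7])       # satisfies x - 2y + z = 3 - 10 + 7 = 0
    v_out = Matrix([1, 0, 0])      # gives 1 - 0 + 0 = 1, so off the plane

    print(linsolve((X, v_in), a, b))    # {(2, 1)}: in the span
    print(linsolve((X, v_out), a, b))   # EmptySet: not in the span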
Example Suppose that $p(x) = 1$ and $q(x) = x$. Let X be the set $\{p, q\}$, a subset of the space P1. Then
\[
\mathrm{Span}(X) = \{ ap + bq \mid a, b \in \mathbb{R} \},
\]
where $(ap + bq)(x) = ap(x) + bq(x) = a + bx$. Therefore Span(X) is equal to the vector space P1.
Example Let $X = \{p, q\}$, a subset of the space P2, where $p(x) = 5$ and $q(x) = 3 + 2x^2$. What is Span(X)?
Functions in Span(X) have formulas in the style $5a + b(3 + 2x^2)$; that is, $(5a + 3b) + 2bx^2$. Since a and b can take any real values (independently of one another), the coefficient of the $x^2$ term, $2b$, can take any real value and the constant term $5a + 3b$ can also take any real value. This means that polynomials in Span(X) are precisely the polynomials of the form $\alpha + \gamma x^2$, for all values of $\alpha$ and $\gamma$. Span(X) is a proper subspace of P2 as it does not contain polynomials with formulas that have a non zero coefficient of x, such as $3x$ and $1 + x + x^2$.
3.4 Column space of a matrix
We have already seen one subspace associated with an $m \times n$ matrix A, namely the null space of A. Another important subspace is the column space of A. Let the columns of A be denoted by the vectors $\mathbf{c}_1, \mathbf{c}_2, \dots, \mathbf{c}_n$. These are all vectors in Rm. We write A as an augmented matrix (column 1 augmented by column 2, augmented by column 3, etc),
\[
A = [\mathbf{c}_1\ \mathbf{c}_2\ \dots\ \mathbf{c}_n].
\]
Definition The column space of A is defined to be the span of the set of columns of A, and is written as follows:
\[
\mathrm{Col}(A) = \mathrm{Span}\{\mathbf{c}_1, \mathbf{c}_2, \dots, \mathbf{c}_n\}.
\]
Example Find geometrically the column space of $A = \begin{pmatrix} 1 & 2 & 3\\ 2 & 7 & 3\\ 1 & 0 & 5 \end{pmatrix}$. Is the vector $\begin{pmatrix} 1\\ 0\\ 4 \end{pmatrix}$ in Col(A)?
The vector $\begin{pmatrix} x\\ y\\ z \end{pmatrix}$ is in Col(A) if and only if there are scalars $a_1, a_2, a_3$ such that
\[
a_1\mathbf{c}_1 + a_2\mathbf{c}_2 + a_3\mathbf{c}_3 = \begin{pmatrix} x\\ y\\ z \end{pmatrix},
\]
that is, if and only if the linear system
\[
\begin{pmatrix} 1 & 2 & 3\\ 2 & 7 & 3\\ 1 & 0 & 5 \end{pmatrix}
\begin{pmatrix} a_1\\ a_2\\ a_3 \end{pmatrix}
= \begin{pmatrix} x\\ y\\ z \end{pmatrix}
\]
is consistent. As usual, we reduce the augmented matrix.
\[
\begin{pmatrix} 1 & 2 & 3 & x\\ 2 & 7 & 3 & y\\ 1 & 0 & 5 & z \end{pmatrix}
\xrightarrow{\text{various EROs}}
\begin{pmatrix} 1 & 2 & 3 & x\\ 0 & 3 & -3 & y - 2x\\ 0 & 0 & 0 & 7x - 2y - 3z \end{pmatrix}
\]
The condition for consistency is $7x - 2y - 3z = 0$. The column space of A therefore corresponds geometrically to the points (x, y, z) lying on the plane through the origin with cartesian equation $7x - 2y - 3z = 0$. Notice that the points corresponding to the three columns of A, namely (1, 2, 1), (2, 7, 0), and (3, 3, 5), certainly satisfy this condition, but that (1, 0, 4) does not satisfy it, as $7 \times 1 - 2 \times 0 - 3 \times 4 = -5 \neq 0$. This means that the vector $\begin{pmatrix} 1\\ 0\\ 4 \end{pmatrix}$ is not in the column space of A, and so in particular, the linear system $A\mathbf{x} = \begin{pmatrix} 1\\ 0\\ 4 \end{pmatrix}$ cannot be solved.
In fact, as
\[
\begin{pmatrix} 3\\ 3\\ 5 \end{pmatrix} = 5 \begin{pmatrix} 1\\ 2\\ 1 \end{pmatrix} - \begin{pmatrix} 2\\ 7\\ 0 \end{pmatrix},
\]
we see that for the matrix A of the previous example, $\mathbf{c}_3 = 5\mathbf{c}_1 - \mathbf{c}_2$. So $\mathbf{c}_3$ is redundant in the calculation of Col(A). That is, $\mathbf{c}_3$ is in $\mathrm{Span}\{\mathbf{c}_1, \mathbf{c}_2\}$. Since every linear combination of $\mathbf{c}_1$ and $\mathbf{c}_2$ is also a linear combination of $\mathbf{c}_1$, $\mathbf{c}_2$ and $\mathbf{c}_3$:
\[
a_1\mathbf{c}_1 + a_2\mathbf{c}_2 = a_1\mathbf{c}_1 + a_2\mathbf{c}_2 + 0\mathbf{c}_3,
\]
and, since every linear combination of $\mathbf{c}_1, \mathbf{c}_2, \mathbf{c}_3$ is just a linear combination of $\mathbf{c}_1$ and $\mathbf{c}_2$:
\[
a_1\mathbf{c}_1 + a_2\mathbf{c}_2 + a_3\mathbf{c}_3 = a_1\mathbf{c}_1 + a_2\mathbf{c}_2 + a_3(5\mathbf{c}_1 - \mathbf{c}_2) = (a_1 + 5a_3)\mathbf{c}_1 + (a_2 - a_3)\mathbf{c}_2,
\]
we see that
\[
\mathrm{Span}\{\mathbf{c}_1, \mathbf{c}_2, \mathbf{c}_3\} = \mathrm{Span}\{\mathbf{c}_1, \mathbf{c}_2\}.
\]
This means that the column space of A can be created more efficiently using the smaller set of vectors $\{\mathbf{c}_1, \mathbf{c}_2\}$. This leads us naturally to the notion of linear dependence which is discussed in the next chapter.
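Rank comparisons give a quick computational version of this redundancy observation: appending a vector already in Col(A) leaves the rank unchanged. A sketch assuming SymPy:

    from sympy import Matrix

    A = Matrix([[1, 2, 3],
                [2, 7, 3],
                [1, 0, 5]])

    print(A.rank())                              # 2: the columns are dependent

    # Appending c3 = (3, 3, 5), already in Col(A), changes nothing ...
    print(A.row_join(Matrix([3, 3, 5])).rank())  # 2

    # ... but appending (1, 0, 4), which is not in Col(A), raises the rank.
    print(A.row_join(Matrix([1, 0, 4])).rank())  # 3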
3.5 Summary
You should know and understand these definitions and results:
• The null space of an $m \times n$ matrix A is the set of all vectors $\mathbf{x}$ satisfying $A\mathbf{x} = \mathbf{0}$. Null(A) is a subspace of Rn.
• The vector $a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n$ is called a linear combination of the vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n$.
• The span of a set of vectors X is the set of all linear combinations of the vectors in X.
• If X is a non empty subset of a vector space V, then Span(X) is the smallest subspace of V containing X.
• The column space of an $m \times n$ matrix A consists of all linear combinations of the columns $\mathbf{c}_1, \mathbf{c}_2, \dots, \mathbf{c}_n$ of A. Col(A) is a subspace of Rm.
You should master the following techniques:
• Find the null space and column space of small size matrices, and
identify the results geometrically as particular subspaces of R2 and R3.
• Determine if a given vector is a linear combination of a particular set
of vectors.
• Find the span of a given set of vectors.
Practice Examples
1. Find the null space and column space of $A = \begin{pmatrix} 1 & 4 & 0\\ 0 & 2 & -2\\ 1 & 0 & 4 \end{pmatrix}$, and identify the results geometrically as particular subspaces of R3.

2. If possible, express each of the vectors $\begin{pmatrix} 7\\ 7\\ 10 \end{pmatrix}$ and $\begin{pmatrix} -1\\ 7\\ 1 \end{pmatrix}$ as a linear combination of the vectors
\[
\begin{pmatrix} 1\\ 3\\ 1 \end{pmatrix}, \quad \begin{pmatrix} -2\\ 4\\ 0 \end{pmatrix}, \quad \begin{pmatrix} 1\\ 13\\ 3 \end{pmatrix}.
\]
Solutions
1. The reduced row echelon form of A is $\begin{pmatrix} 1 & 0 & 4\\ 0 & 1 & -1\\ 0 & 0 & 0 \end{pmatrix}$. With $\mathbf{x} = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}$, solving $A\mathbf{x} = \mathbf{0}$ gives $x_3$ as a parameter and
\[
\mathbf{x} = x_3 \begin{pmatrix} -4\\ 1\\ 1 \end{pmatrix}
\]
for all values of $x_3$. Therefore Null(A) corresponds geometrically to points on the line through the origin and the point $(-4, 1, 1)$.
Col(A) is the set of all linear combinations of the vectors
\[
\begin{pmatrix} 1\\ 0\\ 1 \end{pmatrix}, \quad \begin{pmatrix} 4\\ 2\\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0\\ -2\\ 4 \end{pmatrix}.
\]
A vector $\begin{pmatrix} x\\ y\\ z \end{pmatrix}$ is in the column space of A if and only if there are scalars a, b, c such that
\[
a \begin{pmatrix} 1\\ 0\\ 1 \end{pmatrix} + b \begin{pmatrix} 4\\ 2\\ 0 \end{pmatrix} + c \begin{pmatrix} 0\\ -2\\ 4 \end{pmatrix} = \begin{pmatrix} x\\ y\\ z \end{pmatrix}.
\]
That is, the linear system
\[
\begin{pmatrix} 1 & 4 & 0\\ 0 & 2 & -2\\ 1 & 0 & 4 \end{pmatrix}
\begin{pmatrix} a\\ b\\ c \end{pmatrix}
= \begin{pmatrix} x\\ y\\ z \end{pmatrix}
\]
must be consistent. Applying various EROs to the augmented matrix gives the following:
\[
\begin{pmatrix} 1 & 4 & 0 & x\\ 0 & 2 & -2 & y\\ 1 & 0 & 4 & z \end{pmatrix}
\longrightarrow
\begin{pmatrix} 1 & 4 & 0 & x\\ 0 & 1 & -1 & y/2\\ 0 & 0 & 0 & z - x + 2y \end{pmatrix}
\]
The system is consistent if and only if $z - x + 2y = 0$. This tells us that Col(A) corresponds geometrically to points on the plane through the origin with cartesian equation $z - x + 2y = 0$.
2. The vector $\begin{pmatrix} 7\\ 7\\ 10 \end{pmatrix}$ can be written as a linear combination of
\[
\begin{pmatrix} 1\\ 3\\ 1 \end{pmatrix}, \quad \begin{pmatrix} -2\\ 4\\ 0 \end{pmatrix}, \quad \begin{pmatrix} 1\\ 13\\ 3 \end{pmatrix}
\]
if and only if there are scalars a, b, c such that
\[
a \begin{pmatrix} 1\\ 3\\ 1 \end{pmatrix} + b \begin{pmatrix} -2\\ 4\\ 0 \end{pmatrix} + c \begin{pmatrix} 1\\ 13\\ 3 \end{pmatrix} = \begin{pmatrix} 7\\ 7\\ 10 \end{pmatrix}.
\]
This is equivalent to saying that the linear system
\[
\begin{pmatrix} 1 & -2 & 1\\ 3 & 4 & 13\\ 1 & 0 & 3 \end{pmatrix}
\begin{pmatrix} a\\ b\\ c \end{pmatrix}
= \begin{pmatrix} 7\\ 7\\ 10 \end{pmatrix}
\]
is consistent. Applying various EROs to the augmented matrix gives the following:
\[
\begin{pmatrix} 1 & -2 & 1 & 7\\ 3 & 4 & 13 & 7\\ 1 & 0 & 3 & 10 \end{pmatrix}
\longrightarrow
\begin{pmatrix} 1 & 0 & 3 & 0\\ 0 & 1 & 1 & 0\\ 0 & 0 & 0 & 1 \end{pmatrix}
\]
The last row corresponds to the impossible condition $0 = 1$, and hence there is no solution for a, b, c and the vector $\begin{pmatrix} 7\\ 7\\ 10 \end{pmatrix}$ can't be written as a linear combination of the given three vectors.
Repeating the procedure for the vector $\begin{pmatrix} -1\\ 7\\ 1 \end{pmatrix}$ yields the following calculation:
\[
\begin{pmatrix} 1 & -2 & 1 & -1\\ 3 & 4 & 13 & 7\\ 1 & 0 & 3 & 1 \end{pmatrix}
\longrightarrow
\begin{pmatrix} 1 & 0 & 3 & 1\\ 0 & 1 & 1 & 1\\ 0 & 0 & 0 & 0 \end{pmatrix}
\]
This shows that $\begin{pmatrix} -1\\ 7\\ 1 \end{pmatrix}$ can be written as a linear combination of the given three vectors in infinitely many ways. Using c as the parameter, the solution of the system is
\[
\begin{pmatrix} a\\ b\\ c \end{pmatrix} = \begin{pmatrix} 1 - 3c\\ 1 - c\\ c \end{pmatrix}
\]
for all values of c. Therefore
\[
\begin{pmatrix} -1\\ 7\\ 1 \end{pmatrix} = (1 - 3c) \begin{pmatrix} 1\\ 3\\ 1 \end{pmatrix} + (1 - c) \begin{pmatrix} -2\\ 4\\ 0 \end{pmatrix} + c \begin{pmatrix} 1\\ 13\\ 3 \end{pmatrix}
\]
for all values of c. In particular, when $c = 0$, this gives
\[
\begin{pmatrix} -1\\ 7\\ 1 \end{pmatrix} = 1 \begin{pmatrix} 1\\ 3\\ 1 \end{pmatrix} + 1 \begin{pmatrix} -2\\ 4\\ 0 \end{pmatrix} + 0 \begin{pmatrix} 1\\ 13\\ 3 \end{pmatrix}.
\]
Chapter 4
Linear Dependence and
Independence
Some insight into the notions of linear dependence and independence of vec-
tors can be gained from a geometrical viewpoint. Take two vectors in 3D
space, and think of them for the moment as directed line segments (arrows)
originating at the origin. They are linearly dependent if they have the same or opposite direction (in which case the angle between them is either 0° or 180°, and one is a scalar multiple of the other). They are linearly independent otherwise, that is, if the angle between them is not 0° and not 180°.
[Figure: two pairs of arrows from the origin (0, 0, 0): linearly independent vectors pointing in different directions, and linearly dependent vectors lying on one line.]
Suppose that the two vectors are linearly independent; then they determine a unique plane through the origin. Consider now a third vector (arrow). If it lies in the plane through the origin determined by the original two, the set of three vectors is linearly dependent: the third vector “depends” on the original two. If it doesn't lie in that plane, the three vectors are said to be linearly independent.
[Figure: three independent vectors not lying in one plane, and three dependent vectors lying in a common plane.]
4.1 Linear dependence
These geometric ideas are now defined algebraically.
Definition The set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}$ in a vector space V is called a linearly dependent set if there are scalars $a_1, a_2, \dots, a_n$, not all zero, such that
\[
a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n = \mathbf{0}.
\]
Suppose that in this equation $a_i$ is non zero. Then we can express $\mathbf{v}_i$ as a linear combination of the other vectors:
\[
\mathbf{v}_i = -\frac{1}{a_i}\left(a_1\mathbf{v}_1 + \dots + a_{i-1}\mathbf{v}_{i-1} + a_{i+1}\mathbf{v}_{i+1} + \dots + a_n\mathbf{v}_n\right).
\]
Immediate consequences are:
• Every set $\{\mathbf{0}, \mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}$ containing the zero vector is linearly dependent, since
\[
1\mathbf{0} + 0\mathbf{v}_1 + 0\mathbf{v}_2 + \dots + 0\mathbf{v}_n = \mathbf{0},
\]
and not all coefficients are zero.
• If a set contains exactly two non zero vectors $\{\mathbf{v}_1, \mathbf{v}_2\}$, then the set is linearly dependent if and only if $a_1\mathbf{v}_1 + a_2\mathbf{v}_2 = \mathbf{0}$, with at least one of $a_1$ and $a_2$ non zero. Suppose without loss of generality that $a_1 \neq 0$. Then $\mathbf{v}_1 = -\frac{a_2}{a_1}\mathbf{v}_2$. That is,
The set {v1, v2} is linearly dependent if and only if one vector is a
scalar multiple of the other.
Testing a given set of vectors for linear dependence is straightforward, and always reduces to solving a linear system to find the values of the scalars $a_1, a_2, \dots, a_n$ in the equation
\[
a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n = \mathbf{0}.
\]
Example Show that the set $\left\{ \begin{pmatrix} 1\\ 3\\ 5 \end{pmatrix}, \begin{pmatrix} 3\\ 5\\ 1 \end{pmatrix}, \begin{pmatrix} 3\\ 1\\ -13 \end{pmatrix} \right\}$ is linearly dependent.
Suppose that
\[
a \begin{pmatrix} 1\\ 3\\ 5 \end{pmatrix} + b \begin{pmatrix} 3\\ 5\\ 1 \end{pmatrix} + c \begin{pmatrix} 3\\ 1\\ -13 \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix}. \tag{$*$}
\]
This is equivalent to the linear system
\[
\begin{pmatrix} 1 & 3 & 3\\ 3 & 5 & 1\\ 5 & 1 & -13 \end{pmatrix}
\begin{pmatrix} a\\ b\\ c \end{pmatrix}
= \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix},
\]
which leads to the following reduction of the augmented matrix:
\[
\begin{pmatrix} 1 & 3 & 3 & 0\\ 3 & 5 & 1 & 0\\ 5 & 1 & -13 & 0 \end{pmatrix}
\xrightarrow{\text{various EROs}}
\begin{pmatrix} 1 & 3 & 3 & 0\\ 0 & 1 & 2 & 0\\ 0 & 0 & 0 & 0 \end{pmatrix}
\]
The system has infinitely many solutions for a, b, c and hence there are certainly non zero values of the scalars a, b, c that make ($*$) true. Therefore the set of vectors is linearly dependent. The actual dependency relationships between the three vectors can be found from the solution of the linear system. Using c as a parameter, we see that $b = -2c$ and $a = -3b - 3c = 3c$. Hence for all values of c,
\[
3c \begin{pmatrix} 1\\ 3\\ 5 \end{pmatrix} - 2c \begin{pmatrix} 3\\ 5\\ 1 \end{pmatrix} + c \begin{pmatrix} 3\\ 1\\ -13 \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix},
\]
and in particular, when $c = 1$,
\[
3 \begin{pmatrix} 1\\ 3\\ 5 \end{pmatrix} - 2 \begin{pmatrix} 3\\ 5\\ 1 \end{pmatrix} + \begin{pmatrix} 3\\ 1\\ -13 \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix}.
\]
Each of the three vectors can be written as a linear combination of the other two.
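The dependency weights can be found mechanically: they form the null space of the matrix whose columns are the vectors being tested. A sketch assuming SymPy:

    from sympy import Matrix

    # Columns are the three vectors of the example.
    V = Matrix([[1, 3,   3],
                [3, 5,   1],
                [5, 1, -13]])

    # Any non-zero null space vector gives a dependency relation.
    print(V.nullspace())    # [Matrix([[3], [-2], [1]])]: 3v1 - 2v2 + v3 = 0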
4.2 Linear independence
The complementary concept to linear dependence is linear independence. A given set of vectors is either a linearly dependent set or a linearly independent set. The case of linear independence requires that the only values of $a_1, a_2, \dots, a_n$ that satisfy the testing equation
\[
a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n = \mathbf{0}
\]
are the zero values.
Definition The set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}$ is called a linearly independent set if the only scalars that satisfy the equation
\[
a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n = \mathbf{0}
\]
are the scalars $a_1 = 0, a_2 = 0, \dots, a_n = 0$.
In a linearly independent set, none of the vectors can be expressed as a linear
combination of the others.
Example In F, the set $\{f, g, h\}$ is linearly independent, where
\[
f(x) = 1, \quad g(x) = \sin x, \quad h(x) = \cos x.
\]
For if
\[
a \cdot 1 + b \sin x + c \cos x = 0 \quad \text{for all } x,
\]
then letting x take the values $0, \pi/2, \pi$ gives the three equations
\[
\begin{aligned}
a + c &= 0\\
a + b &= 0\\
a - c &= 0
\end{aligned}
\]
Solving the equations simultaneously gives the unique solution $a = b = c = 0$. Hence $\{f, g, h\}$ is a linearly independent set.
Example In P2, the set $\{f, g, h\}$ where
\[
f(x) = 1, \quad g(x) = x, \quad h(x) = x^2,
\]
is linearly independent. For if
\[
a \cdot 1 + bx + cx^2 = 0 \quad \text{for all } x,
\]
then letting x take the values $x = 0, 1, -1$ gives the equations
\[
\begin{aligned}
a &= 0\\
a + b + c &= 0\\
a - b + c &= 0
\end{aligned}
\]
Again, there is a unique solution, $a = b = c = 0$. Hence $\{f, g, h\}$ is linearly independent in P2.
This result generalises: for any non-negative integer n, if
\[
f_0(x) = 1, \quad f_1(x) = x, \quad f_2(x) = x^2, \quad \dots, \quad f_n(x) = x^n,
\]
the set $\{f_0, f_1, \dots, f_n\}$ is linearly independent in Pn.
Example In R3, the set $\left\{ \begin{pmatrix} 1\\ 0\\ 1 \end{pmatrix}, \begin{pmatrix} 2\\ 1\\ 0 \end{pmatrix}, \begin{pmatrix} 3\\ 4\\ 5 \end{pmatrix} \right\}$ is linearly independent. For if
\[
a \begin{pmatrix} 1\\ 0\\ 1 \end{pmatrix} + b \begin{pmatrix} 2\\ 1\\ 0 \end{pmatrix} + c \begin{pmatrix} 3\\ 4\\ 5 \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix},
\]
then
\[
\begin{pmatrix} 1 & 2 & 3\\ 0 & 1 & 4\\ 1 & 0 & 5 \end{pmatrix}
\begin{pmatrix} a\\ b\\ c \end{pmatrix}
= \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix}
\]
from which we obtain the following reduction to row echelon form:
\[
\begin{pmatrix} 1 & 2 & 3 & 0\\ 0 & 1 & 4 & 0\\ 1 & 0 & 5 & 0 \end{pmatrix}
\xrightarrow{\text{various EROs}}
\begin{pmatrix} 1 & 2 & 3 & 0\\ 0 & 1 & 4 & 0\\ 0 & 0 & 1 & 0 \end{pmatrix}.
\]
This system has a unique solution found by back substitution, namely $c = 0$, then $b = 0$, then $a = 0$. That is, the given set $\left\{ \begin{pmatrix} 1\\ 0\\ 1 \end{pmatrix}, \begin{pmatrix} 2\\ 1\\ 0 \end{pmatrix}, \begin{pmatrix} 3\\ 4\\ 5 \end{pmatrix} \right\}$ is linearly independent.
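Computationally, n vectors in Rn are linearly independent exactly when the matrix with those vectors as columns has rank n, or equivalently a non-zero determinant. A sketch assuming SymPy:

    from sympy import Matrix

    V = Matrix([[1, 2, 3],
                [0, 1, 4],
                [1, 0, 5]])

    print(V.rank())    # 3: full rank, so the columns are independent
    print(V.det())     # 10: non-zero determinant, the same conclusion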
4.3 Basis and dimension
We now combine the two major concepts of “linear independence” and “span” to arrive at an important definition. Re-read the last example from Chapter 3, which demonstrated that the column space of a particular $3 \times 3$ matrix A could be given as the span of just two of its columns: not all columns were necessary. The span of all columns of A certainly gives the column space, but it is a larger set than is needed. For this particular matrix, the columns are linearly dependent. At least one column is a linear combination of others, and so is not needed as a member of the spanning set of the column space.
In the search for the most efficient spanning set for a given vector space, it is clear we seek one containing the smallest possible number of vectors. This will happen when the set both spans the space and is linearly independent.
Definition A set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}$ in a vector space V is called a basis of V if the following conditions are satisfied:
(i) $\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}$ is a linearly independent set, and
(ii) $\mathrm{Span}\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\} = V$.
Example The set $\{f_0, f_1, f_2, \dots, f_n\}$, where
\[
f_0(x) = 1, \quad f_1(x) = x, \quad f_2(x) = x^2, \quad \dots, \quad f_n(x) = x^n,
\]
is a basis of Pn for all non-negative integers n. It's linearly independent and spans Pn, as every polynomial p of degree less than or equal to n is already a natural linear combination of these vectors. For if
\[
p(x) = a_0 + a_1x + a_2x^2 + \dots + a_nx^n
\]
then clearly
\[
p = a_0f_0 + a_1f_1 + a_2f_2 + \dots + a_nf_n.
\]
The set $\{f_0, f_1, f_2, \dots, f_n\}$ is called the standard basis of Pn.
Example The set
\[
\left\{ \begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix}, \begin{pmatrix} 0\\ 1\\ 0 \end{pmatrix}, \begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix} \right\}
\]
is called the standard basis of R3. It spans R3, as
\[
\begin{pmatrix} x\\ y\\ z \end{pmatrix} = x \begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix} + y \begin{pmatrix} 0\\ 1\\ 0 \end{pmatrix} + z \begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix}
\]
for all vectors $\begin{pmatrix} x\\ y\\ z \end{pmatrix}$ in R3, and it's linearly independent, since if
\[
x \begin{pmatrix} 1\\ 0\\ 0 \end{pmatrix} + y \begin{pmatrix} 0\\ 1\\ 0 \end{pmatrix} + z \begin{pmatrix} 0\\ 0\\ 1 \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix}
\]
then
\[
\begin{pmatrix} x\\ y\\ z \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ 0 \end{pmatrix}
\]
and hence x, y and z must all be zero.
This generalises to Rn: the set
\[
\left\{ \begin{pmatrix} 1\\ 0\\ \vdots\\ 0 \end{pmatrix}, \begin{pmatrix} 0\\ 1\\ \vdots\\ 0 \end{pmatrix}, \dots, \begin{pmatrix} 0\\ 0\\ \vdots\\ 1 \end{pmatrix} \right\}
\]
is called the standard basis of Rn for all positive integers n.
Example The set
\[
\left\{ \begin{pmatrix} 1\\ 0\\ 1 \end{pmatrix}, \begin{pmatrix} 2\\ 1\\ 0 \end{pmatrix}, \begin{pmatrix} 3\\ 4\\ 5 \end{pmatrix} \right\}
\]
is also a basis for R3. The linear independence has already been confirmed in a previous example. To see that this set spans R3, let $\begin{pmatrix} x\\ y\\ z \end{pmatrix}$ be any vector in R3. Then $\begin{pmatrix} x\\ y\\ z \end{pmatrix}$ is in the span of the set if and only if the linear system
\[
\begin{pmatrix} 1 & 2 & 3\\ 0 & 1 & 4\\ 1 & 0 & 5 \end{pmatrix}
\begin{pmatrix} a\\ b\\ c \end{pmatrix}
= \begin{pmatrix} x\\ y\\ z \end{pmatrix}
\]
has a solution for a, b, c. We apply EROs to obtain row echelon form:
\[
\begin{pmatrix} 1 & 2 & 3 & x\\ 0 & 1 & 4 & y\\ 1 & 0 & 5 & z \end{pmatrix}
\xrightarrow{\text{various EROs}}
\begin{pmatrix} 1 & 2 & 3 & x\\ 0 & 1 & 4 & y\\ 0 & 0 & 1 & \frac{1}{10}(z - x + 2y) \end{pmatrix}
\]
The system is consistent, and hence the given set spans R3. Thus the set is a basis.
Theorem 4.1 If $\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}$ is a basis of a vector space V then each vector $\mathbf{v}$ in V can be expressed as a linear combination of the vectors $\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n$ in exactly one way.
Proof Suppose that
\[
\mathbf{v} = a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_n\mathbf{v}_n
\]
and also that
\[
\mathbf{v} = b_1\mathbf{v}_1 + b_2\mathbf{v}_2 + \dots + b_n\mathbf{v}_n.
\]
Then by subtraction,
\[
(a_1 - b_1)\mathbf{v}_1 + (a_2 - b_2)\mathbf{v}_2 + \dots + (a_n - b_n)\mathbf{v}_n = \mathbf{0}. \tag{$*$}
\]
But the vectors $\mathbf{v}_i$ form a linearly independent set, and so all coefficients in equation ($*$) must be zero. That is,
\[
a_1 = b_1, \quad a_2 = b_2, \quad \dots, \quad a_n = b_n,
\]
and the proof is complete.
Now, it is clear that a vector space V can have more than one basis. (The
previous examples give two bases for R3). However, we will now see that
different bases for the same vector space always contain the same number of
vectors. This fact (stated in Theorem 4.3) allows us to define the dimension
of a vector space. First, a preliminary result.
Theorem 4.2 If one basis of a vector space V contains n vectors, then
1. every set of vectors containing more than n vectors is linearly depen-
dent;
2. no set of vectors containing fewer than n vectors can span V .
Proof While it is beyond the scope of this course to prove this theorem for
vector spaces in general, we can prove it for Rn.
1. Let $\{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_m\}$ be a set of vectors in Rn, with $m > n$. Now consider the possible solutions (for $a_1, a_2, \dots, a_m$) to the system of equations
\[
a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_m\mathbf{v}_m = \mathbf{0}.
\]
Notice that there are m unknowns and n equations, and m > n. That
is, we have more unknowns than equations, and the coefficient matrix
for the system has more columns than rows. Therefore, when the coeffi-
cient matrix is reduced to row echelon form at least one of the columns
will be missing a leading 1. The system is homogeneous (and therefore
consistent), and hence there are infinitely many solutions. So there are
certainly solutions in which at least one of a1, a2, . . . . . . , am is non-zero,
and hence the vectors {v1, v2, . . . . . . ,vm} are linearly dependent.
2. Now let $W = \{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_k\}$ be a set of vectors in Rn, with $k < n$. We wish to show that this set of vectors cannot span Rn. In other words, we wish to show that there are vectors in Rn that cannot be written as a linear combination of the vectors in W. So suppose that $\mathbf{x} = \begin{pmatrix} x_1\\ x_2\\ \vdots\\ x_n \end{pmatrix}$ is a vector in Rn, and consider the system of equations
\[
a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \dots + a_k\mathbf{v}_k = \mathbf{x}.
\]
In this system there are k unknowns, and n equations, with k < n.
That is, there are more equations than unknowns, and the coefficient
matrix has more rows than columns. So when the coefficient matrix
is reduced to row echelon form there will be at least one row of zeros
at the bottom, while the entry in the last row and last column of
the augmented matrix will be an expression involving some (or all) of
x1, x2, . . . , xn. The system will only be consistent if this expression
equals zero, and hence there will exist some vectors x for which the
system is inconsistent and has no solution. So W cannot span Rn.
From these results, the next theorem follows immediately.
Theorem 4.3 – The Basis Theorem If one basis of a vector space V
contains n vectors, then every basis of V contains n vectors.
Proof Let B be a basis of a vector space V containing n vectors, and let
B′ be another basis of V containing m vectors. Then m ≤ n, since otherwise
B′ would be a linearly dependent set (by Theorem 4.2), and m ≥ n since
otherwise B′ would not span V . Therefore m = n.
Definition The number of vectors in any basis of a vector space V is called
the dimension of V , and is written dim(V ).
Previous examples allow us to conclude that
• dim(Rn) = n for all integers n ≥ 1.
• dim(Pn) = n+ 1 for all integers n ≥ 0.
• dim({0}) = 0 (basis is the empty set).
The vector space F is different:
• dim(F) is ∞.
Example Let W =




a
b
c
d

 ∣∣ a, b, c, d ∈ R, a− b = c

, a subset of R
4.
Given that W is a subspace, find a basis for W and the dimension of W .
Since c = a−b, an arbitrary vector inW has the form


a
b
a− b
d

 where a, b, d
can take any real values. Now
(a, b, a − b, d)^T = a (1, 0, 1, 0)^T + b (0, 1, −1, 0)^T + d (0, 0, 0, 1)^T (∗)
and so W = Span{ (1, 0, 1, 0)^T, (0, 1, −1, 0)^T, (0, 0, 0, 1)^T }.
Testing this set of vectors for linear independence, and making use of (∗), we
see that if
a (1, 0, 1, 0)^T + b (0, 1, −1, 0)^T + d (0, 0, 0, 1)^T = (0, 0, 0, 0)^T,
then the only solution for a, b, d is a = b = d = 0. Hence the set
{ (1, 0, 1, 0)^T, (0, 1, −1, 0)^T, (0, 0, 0, 1)^T }
is a basis for W , proving that W is a subspace of R4 of dimension 3.
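The example can be checked mechanically: put the three vectors into the columns of a matrix and compute its rank. (An illustrative sympy sketch, not part of the notes.)

    import sympy as sp

    # The three spanning vectors found above, as columns of a matrix.
    B = sp.Matrix([[1,  0, 0],
                   [0,  1, 0],
                   [1, -1, 0],
                   [0,  0, 1]])

    # Rank 3 means the columns are linearly independent, confirming
    # that W is a subspace of R^4 of dimension 3.
    print(B.rank())   # 3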
Example Find a basis for the subspace
W = {p ∈ P3 | p(x) = p(−x), for all x ∈ R} .
Let p(x) = a + bx + cx^2 + dx^3. Then if p(x) = p(−x) for all x, we have
a + bx + cx^2 + dx^3 = a − bx + cx^2 − dx^3
for all x. Since {1, x, x^2, x^3} is a basis for P3, Theorem 4.1 tells us that p is in W if and only if b = d = 0. Hence the typical vector in W has the formula p(x) = a + cx^2, where a and c can take any values in R. That is, W = Span{1, x^2}. Since {1, x^2} is a linearly independent set, we see that W is a subspace of P3 of dimension 2, with basis {1, x^2}. (For convenience we have used the formulas of the polynomials rather than their names in this explanation.)
As you might have expected, the dimensions of the subspaces in the previous
two examples were smaller than the dimensions of the whole spaces. The
relationship between the dimension of a vector space and the dimensions of
its subspaces is formalised in the following theorem.
Theorem 4.4 If W is a subspace of a vector space V , and dim(V ) = n,
then
1. dim(W ) ≤ n, and
2. dim(W ) = n if and only if W = V .
Proof
1. If dim(W ) > n, then W has a basis containing more than n vectors,
contradicting the fact that within V , each set of more than n vectors
is linearly dependent. Hence dim(W ) ≤ n.
2. If dim(W ) = n (with basis {w1, w2, ..., wn}) and W ≠ V , then
there is some vector v in V which is not in W . The set
{v, w1, w2, . . . . . . ,wn}
is then a linearly independent set, again contradicting the fact that
within V , each set containing more than n vectors is linearly dependent.
Hence W = V . Conversely, if W = V then clearly dim(W ) =dim(V ).
Now we see how to build a basis by enlarging a given linearly independent set
– that is, by enlarging a basis of a subspace of V . Suppose we have a linearly
independent set in a vector space V which is not a basis for V . Then it
doesn’t span V . We extend it to a basis of V by including vectors not in its
span. The process is illustrated by the following example.
Example The set X = { (1, 1, 1, 0)^T, (1, 0, 0, 0)^T } is linearly independent in R4 but is not a basis of R4, since it doesn't contain 4 vectors.
Span(X) = { a (1, 1, 1, 0)^T + b (1, 0, 0, 0)^T | a, b ∈ R } = { (a + b, a, a, 0)^T | a, b ∈ R }.
Span (X) is a subspace of R4 of dimension 2; a basis consists of the two
vectors in X. Select a vector not in the span of X, say the vector
(0, 0, 1, 0)^T.
Re-define the set X by including this new vector:
X = { (1, 1, 1, 0)^T, (1, 0, 0, 0)^T, (0, 0, 1, 0)^T }.
By including the third vector, which is not in the span of the previous two,
we have maintained the linear independence. (Check this!) However, the
new X still does not span R4. Let’s calculate Span (X).
Span(X) = { (a + b, a, a + c, 0)^T | a, b, c ∈ R }.
Span(X) is now a subspace of R4 of dimension 3; a basis for Span(X) consists
of the three vectors now listed in X. Again, select a vector not in Span(X),
such as
(0, 0, 0, 1)^T. Re-define X by including this vector as well:
X = { (1, 1, 1, 0)^T, (1, 0, 0, 0)^T, (0, 0, 1, 0)^T, (0, 0, 0, 1)^T }.
The set X is still linearly independent (check). As it has 4 vectors, Span(X)
has dimension 4. If Span(X) were not the whole of R4, we would be able to
continue this procedure by re-defining X to include another vector, and X
would then be a linearly independent set of five vectors. But in R4 a set of
five vectors cannot be linearly independent, so Span(X) must be the whole
of R4. Hence a basis for R4 is the set
X = { (1, 1, 1, 0)^T, (1, 0, 0, 0)^T, (0, 0, 1, 0)^T, (0, 0, 0, 1)^T }.
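The “building up” procedure can be mechanised: repeatedly test whether a candidate vector enlarges the span, and keep it if it does. The sketch below (illustrative only; it adjoins standard basis vectors, so it may select different vectors from the ones chosen in the worked example) uses the rank of a matrix of columns as the “not in the span” test:

    import sympy as sp

    def extend_to_basis(vectors, n):
        """Extend a linearly independent list of vectors in R^n to a
        basis, adjoining standard basis vectors that enlarge the span."""
        basis = list(vectors)
        for i in range(n):
            e = sp.Matrix([1 if j == i else 0 for j in range(n)])
            candidate = sp.Matrix.hstack(*basis, e)
            if candidate.rank() > len(basis):   # e is not in the span
                basis.append(e)
        return basis

    X = [sp.Matrix([1, 1, 1, 0]), sp.Matrix([1, 0, 0, 0])]
    for v in extend_to_basis(X, 4):
        print(v.T)   # four independent vectors: a basis of R^4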
The opposite situation to “building up” a basis occurs when we have a set
which spans a vector space but is not a basis. Then it must be a linearly
dependent set. We can obtain a basis by discarding unnecessary vectors from
this set. In particular, we discard vectors which are linear combinations of
others in the set. Again, this is illustrated by an example.
Example The set { (1, 1)^T, (3, 4)^T, (4, 1)^T } spans the vector space R2. To see
this, observe that each (x, y)^T in R2 can be written as a linear combination of these three vectors:
(x, y)^T = (4x − 3y − 13) (1, 1)^T + (3 + y − x) (3, 4)^T + (4, 1)^T
Hence
Span{ (1, 1)^T, (3, 4)^T, (4, 1)^T } = R2.
However, { (1, 1)^T, (3, 4)^T, (4, 1)^T } is a linearly dependent set, as
−13 (1, 1)^T + 3 (3, 4)^T + 1 (4, 1)^T = (0, 0)^T.
Each vector in the set of three is expressible as a linear combination of the
other two. Discard one of the vectors, say (4, 1)^T. Then the set { (1, 1)^T, (3, 4)^T }
still spans R2 and is linearly independent. Hence it is a basis of R2.
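The “discarding” process can be mechanised in the same spirit: walk through the spanning set and keep only the vectors that enlarge the span of those already kept. (An illustrative sympy sketch, not part of the notes.)

    import sympy as sp

    def prune_to_basis(vectors):
        """Discard vectors that are linear combinations of those kept."""
        kept = []
        for v in vectors:
            trial = sp.Matrix.hstack(*kept, v) if kept else v
            if trial.rank() > len(kept):   # v enlarges the span: keep it
                kept.append(v)
        return kept

    S = [sp.Matrix([1, 1]), sp.Matrix([3, 4]), sp.Matrix([4, 1])]
    print(prune_to_basis(S))   # keeps (1,1) and (3,4): a basis of R^2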
A basis of a vector space V is therefore a set of vectors of just the “right” size:
big enough to span the space and small enough to be linearly independent.
The “building up” and “discarding” processes are summarised in the final
theorem, which is stated without proof.
Theorem 4.5 Let V be a vector space, with dim(V ) = n.
1. Every linearly independent set containing fewer than n vectors can be
extended to a basis of V .
2. Every spanning set of V with more than n vectors contains a basis of
V .
3. Every linearly independent set of n vectors is a basis of V .
4. Every set of n vectors which spans V is a basis of V .
4.4 Summary
You should know and understand the following definitions and results.
• Definitions of linear dependence, linear independence, basis and di-
mension.
• If {v1, v2, . . . . . . ,vn} is a basis of a vector space V then each vector
v in V can be expressed as a linear combination of {v1, v2, . . . . . . ,vn} in
exactly one way.
• If one basis of V contains n vectors, then every set of vectors containing
more than n vectors is linearly dependent, and no set of vectors containing
fewer than n vectors can span V .
• The Basis Theorem: If one basis of a vector space V contains n vectors,
then every basis of V contains n vectors.
• If W is a non zero subspace of V and V has dimension n, then
dim(W ) ≤ n, with dim(W ) = n if and only if W = V .
• In a vector space V of dimension n, every linearly independent set
containing fewer than n vectors can be extended to a basis of V , and every
spanning set of V with more than n vectors contains a basis of V . Moreover,
every linearly independent set of n vectors is a basis of V , and every set of
n vectors which spans V is a basis of V .
You should master these techniques:
• Test a given set of vectors (in vector spaces Rn or F) and decide if the
set is linearly dependent or independent.
• Determine whether or not a given set is a basis of a vector space.
• Find a basis for a given vector space.
Practice Examples
1. The functions f, g, h are vectors in the vector space F. You are given
that
f(x) = e^x,   g(x) = xe^x,   h(x) = x^2 e^x.
Determine whether the set {f, g, h} is linearly independent or linearly
dependent.
2. Is the following set a basis for the vector space R3?
{ (1, 3, 2)^T, (−1, 0, 1)^T, (3, 1, 0)^T }
3. Find a basis for the subspace W of R3, where
W = { (x, y, z)^T | 2x − y + 3z = 0 }.
Answers
1. The set is linearly independent.
2. Yes, as the set is linearly independent and spans the space.
3. There are infinitely many bases: here are two of them.
{ (1, 2, 0)^T, (0, 3, 1)^T } and { (1, −1, −1)^T, (2, 13, 3)^T }.
Chapter 5
Some Special Bases
In previous work we have seen that a given vector space has many different
bases. Some are “natural” in the sense of being obvious or algebraically
simple, such as the standard bases of Rn and Pn. The fact is that when
working on a particular linear problem, the best choice of basis will usually
be determined by the problem itself. In a later chapter we will deal with
bases consisting of eigenvectors of a matrix, and see how they can greatly
simplify the solution to many different linear problems. In this chapter we
examine Lagrange polynomials, which give us a special basis of Pn chosen to
solve a specific problem. We also study bases of the column and null spaces
of a matrix, and establish a connection between their dimensions.
5.1 Lagrange polynomials
The Lagrange polynomials are used to solve the problem of finding a poly-
nomial of minimal degree that passes through a given set of points in the
xy plane. Knowing such a polynomial then permits predictions to be made
about y values corresponding to other values of x, a process termed interpo-
lation.
Suppose we consider the set of n+ 1 points
(c0, k0), (c1, k1), (c2, k2), . . . . . . , (cn, kn)
in the xy plane. We assume that all the x values c0, c1, . . . . . . , cn are different.
We can display the points in tabulated form:
x values c0 c1 c2 . . . cn
y values k0 k1 k2 . . . kn
Through two data points (c0, k0) and (c1, k1) we can fit a straight line:
(Diagram: a straight line through the two data points.)
We could of course find parabolas, cubic curves and so on to fit the above
two points, but for simplicity and efficiency we always seek a polynomial of
minimum degree.
Through three data points (c0, k0), (c1, k1) and (c2, k2) we can fit a parabola.
(In certain cases it may be possible to fit a straight line to three points, but
in general, a quadratic polynomial would be needed.)
(Diagram: a parabola through the three data points.)
With n+1 data points (c0, k0), (c1, k1), (c2, k2), . . . . . . , (cn, kn) a polynomial
of degree n is generally the one of minimum degree that fits the points exactly.
To find such a polynomial, we construct a special basis of the vector space
Pn making use of the x values from the data table. This basis consists of
n+ 1 polynomials of degree n,
{ p0, p1, p2, . . . . . . , pn}
satisfying the conditions
• pi(cj) = 0 if i ≠ j, and
• pi(ci) = 1,
for all i = 0, 1, 2, . . . . . . , n. That is, pi(x) is zero at all the x data values
except ci, and pi(x) is 1 when x = ci.
We now show that if such a set of polynomials { p0, p1, p2, . . . . . . , pn} exists,
then it forms a basis of Pn, called a Lagrange basis of the vector space Pn .
Theorem 5.1 If { p0, p1, p2, . . . . . . , pn } are polynomials in Pn satisfying
the conditions
• pi(cj) = 0 if i ≠ j, and
• pi(ci) = 1,
for all i = 0, 1, 2, . . . . . . , n, then { p0, p1, p2, . . . . . . , pn } is a basis of Pn.
Proof First we prove linear independence. Suppose that
a0p0(x) + a1p1(x) + a2p2(x) + . . . . . .+ anpn(x) = 0
for all values of x. Substituting x = c0 gives
a0 × 1 + a1 × 0 + a2 × 0 . . . . . .+ an × 0 = 0,
that is, a0 = 0.
Substituting x = c1 gives
a0 × 0 + a1 × 1 + a2 × 0 . . . . . .+ an × 0 = 0,
that is, a1 = 0.
Continuing in this way, it is clear that all scalars a0, a1, . . . . . . , an must be
zero – no other values are possible. This proves the linear independence of
{ p0, p1, p2, . . . . . . , pn }. Hence Span{ p0, p1, p2, . . . . . . , pn } is a subspace
of Pn of dimension n + 1, and by the results of the previous chapter, we
conclude that
Span{ p0, p1, p2, . . . . . . , pn } = Pn.
Therefore { p0, p1, p2, . . . . . . , pn } is a basis of Pn.
To illustrate how to find a Lagrange basis in practice and see how it’s used
to fit a polynomial to the original data points, we will use a (rather long)
example.
Example Find a quartic polynomial expression p(x) passing exactly through
the following data points, and use it to estimate the value of y when x = 1.5.
x values −1 0 1 2 3
y values 7 3 3 −5 −9
Since there are five data points, the construction will produce a Lagrange
basis { p0, p1, p2, p3, p4 } of P4. First consider p0. We know that p0(x)
must be zero at all the x data values except x = −1, and must be 1 at
x = −1. That is, p0(−1) = 1 and
p0(0) = 0, p0(1) = 0, p0(2) = 0, p0(3) = 0.
(Think of it systematically: the first x value from the data table is x = −1.
The first basis vector takes the value 1 at the first x value and is zero at all
other x values.) A sensible guess for p0(x) is therefore
x(x− 1)(x− 2)(x− 3),
since it’s equal to zero when x = 0, 1, 2, and 3. However, when x = −1, it
has the value
−1 × (−1 − 1)× (−1− 2)× (−1− 3) = 24.
We simply divide our guess by 24 to arrive at the first Lagrange basis vector,
p0(x) = (1/24) x(x − 1)(x − 2)(x − 3).
Now find the second basis vector p1. It must satisfy the conditions p1(0) = 1
and
p1(−1) = 0, p1(1) = 0, p1(2) = 0, p1(3) = 0.
(The second x value from the data table is x = 0. The second basis vector
takes the value 1 at the second x value and is zero at all other x values.) A
sensible guess for p1(x) is
(x+ 1)(x− 1)(x− 2)(x− 3)
since this is zero at all the right places. However, when x = 0 it takes the
value (0 + 1)(0 − 1)(0 − 2)(0 − 3) = −6. So we divide by −6 to obtain the
second Lagrange basis vector
p1(x) = −(1/6) (x + 1)(x − 1)(x − 2)(x − 3).
Similarly,
p2(x) = (1/4) (x + 1)x(x − 2)(x − 3),
p3(x) = −(1/6) (x + 1)x(x − 1)(x − 3),
p4(x) = (1/24) (x + 1)x(x − 1)(x − 2).
Having constructed a Lagrange basis for P4, we now try to find a polynomial
p in P4 that fits all the data points. Such a polynomial will be a linear
combination of the basis vectors of P4.
Let
p(x) = a0p0(x) + a1p1(x) + a2p2(x) + a3p3(x) + a4p4(x).
It only remains to find the values of the scalars a0, . . . . . . , a4 in the linear
combination. Substitute the x values in turn and use the y values from the
data table. We obtain
p(−1) = 7 = a0p0(−1) + a1p1(−1) + a2p2(−1) + a3p3(−1) + a4p4(−1)
= a0 × 1 + a1 × 0 + a2 × 0 + a3 × 0 + a4 × 0
= a0
That is, a0 = 7, the first y value! Similarly,
p(0) = 3 = a0p0(0) + a1p1(0) + a2p2(0) + a3p3(0) + a4p4(0)
= a0 × 0 + a1 × 1 + a2 × 0 + a3 × 0 + a4 × 0
= a1
Hence a1 = 3, the second y value, and so on. Using all the y values from the
data table we end up with
p(x) = 7p0(x) + 3p1(x) + 3p2(x)− 5p3(x)− 9p4(x).
After substituting the formulas for p0(x), . . . . . . p4(x) and simplifying, this
gives
p(x) = x^4 − 4x^3 + x^2 + 2x + 3.
Finally, when x takes the value 1.5, the value of p(x) is
1.5^4 − 4 × 1.5^3 + 1.5^2 + 2 × 1.5 + 3 = −0.1875.
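The whole construction is easily scripted. Here is an illustrative Python/sympy sketch (not part of the original notes) that builds the Lagrange basis for the five data points above and recovers both the interpolating polynomial and the value at x = 1.5:

    import sympy as sp

    x = sp.symbols('x')
    cs = [-1, 0, 1, 2, 3]      # x values from the data table
    ks = [7, 3, 3, -5, -9]     # y values from the data table

    def lagrange_basis(cs):
        """Return the Lagrange basis polynomials for the nodes cs."""
        ps = []
        for i, ci in enumerate(cs):
            num = sp.Integer(1)
            for j, cj in enumerate(cs):
                if j != i:          # product of (x - cj) over j != i
                    num *= (x - cj)
            ps.append(num / num.subs(x, ci))   # scale so p_i(c_i) = 1
        return ps

    p = sp.expand(sum(k * q for k, q in zip(ks, lagrange_basis(cs))))
    print(p)                              # x**4 - 4*x**3 + x**2 + 2*x + 3
    print(p.subs(x, sp.Rational(3, 2)))   # -3/16, i.e. -0.1875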
In the general case, given the data table
x values c0 c1 c2 . . . cn
y values k0 k1 k2 . . . kn
the polynomial p of degree n that fits the data exactly is
p = k0p0 + k1p1 + k2p2 + . . . . . .+ knpn,
where
{p0, p1, . . . . . . , pn}
is the Lagrange basis of Pn. The general formula for pi(x), for i = 0, 1, ..., n, is
pi(x) = [ (x − c0)(x − c1)(x − c2) ... (x − ci−1)(x − ci+1) ... (x − cn) ] / [ (ci − c0)(ci − c1)(ci − c2) ... (ci − ci−1)(ci − ci+1) ... (ci − cn) ],
where the factor (x − ci) is omitted from the numerator and the corresponding factor (ci − ci) is omitted from the denominator.
The Lagrange basis of Pn always consists of n + 1 polynomials of degree n,
the maximum degree possible. Even if the exact fit polynomial has degree
less than n in some particular cases, the above process will find it. This is
illustrated by the following example.
Example Suppose the data is
x values 0 1 2
y values 1 3 5
Notice that the data points lie on the straight line y = 2x+1. With three data
points, the Lagrange basis calculation takes place in P2 and the basis consists
of quadratic polynomials: we should expect that the exact fit polynomial (a
linear combination of the Lagrange basis vectors) simplifies to this linear
expression.
For the data table given, the Lagrange basis is { p0, p1, p2 }, where
p0(x) = (x − 1)(x − 2) / ((0 − 1)(0 − 2)) = (1/2) (x − 1)(x − 2),
p1(x) = x(x − 2) / (1 × (1 − 2)) = −x(x − 2),
p2(x) = x(x − 1) / (2 × (2 − 1)) = (1/2) x(x − 1).
The polynomial p that fits the data exactly is given by the linear combination
of the Lagrange basis vectors with the y data values as the coefficients:
p(x) = 1 p0(x) + 3 p1(x) + 5 p2(x)
= (1/2)(x − 1)(x − 2) − 3x(x − 2) + (5/2) x(x − 1)
= 2x + 1,
as expected.
5.2 Bases of the null and column spaces of a
matrix
Suppose that A is a matrix of size m× n. The null and column spaces of A
have already been defined in an earlier chapter. Recall that
Null(A) = {x | Ax = 0}
and
Col(A) = Span{c1, c2, . . . . . . , cn},
where x is in Rn and vectors c1, c2, . . . . . . , cn denote the columns of A.
These two subspaces “live” in different vector spaces. The null space of A is
a subspace of Rn and the column space of A is a subspace of Rm. It might
seem surprising, therefore, that there is a connection between the two. The
connection is made clear by finding a basis for each space and relating these
bases to the reduced row echelon form of A.
The idea will be illustrated using the 4× 6 matrix
A =
[ 1  4 −1  6  4  5
  0 −1  1  5  6  6
  4  2 10 −4  2  6
  1  0  3  0  2  3 ],
whose reduced row echelon form (check!) is
J =
[ 1 0  3 0  2  3
  0 1 −1 0 −1 −1
  0 0  0 1  1  1
  0 0  0 0  0  0 ].
5.2.1 Null space basis
First find the null space of A. Writing x = (x1, x2, x3, x4, x5, x6)^T, we know from first year
work that Ax = 0 if and only if Jx = 0. (This follows from the fact that
EkEk−1 . . . . . . E2E1A = J , where the Ei are elementary matrices.) Hence
the null space of a matrix is identical to the null space of its reduced row
echelon form. So x is in the null space of A if and only if
x = x3 (−3, 1, 1, 0, 0, 0)^T + x5 (−2, 1, 0, −1, 1, 0)^T + x6 (−3, 1, 0, −1, 0, 1)^T = x3v1 + x5v2 + x6v3
for some values of x3, x5 and x6, where v1, v2 and v3 are the constant
vectors shown. Therefore Null(A) = Span { v1, v2, v3 }, a subspace of R6.
Solving Ax = 0 and expressing the answer using the parameters (free vari-
ables) as coefficients in a linear combination (as above) always gives a span-
ning set for Null(A). Moreover, the spanning set thus obtained is always
a linearly independent set of vectors. In this particular case, suppose that
x3v1 + x5v2 + x6v3 = 0. That is,
(−3x3 − 2x5 − 3x6, x3 + x5 + x6, x3, −x5 − x6, x5, x6)^T = (0, 0, 0, 0, 0, 0)^T.
The only possible solution for the scalars x3, x5, x6 is that they are all zero
(look at the third, fifth and sixth rows!), showing the linear independence of
{ v1, v2, v3 }. Therefore these vectors form a basis of Null(A). In this
case, Null(A) is a subspace of R6 of dimension 3.
The number of vectors in the basis of the null space of A produced by solving
Ax = 0 equals the number of free variables (parameters), and the number
of parameters is always equal to the number of columns of the reduced row
echelon form which are missing a leading 1.
If A is an m× n matrix whose echelon form has r leading 1s, then
Null(A) is a subspace of Rn of dimension n − r, corresponding to
the number of parameters in the general solution of Ax = 0.
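For the 4 × 6 matrix used above, the basis { v1, v2, v3 } can be confirmed in a few lines. (An illustrative sympy check, not part of the notes.)

    import sympy as sp

    A = sp.Matrix([[1,  4, -1,  6, 4, 5],
                   [0, -1,  1,  5, 6, 6],
                   [4,  2, 10, -4, 2, 6],
                   [1,  0,  3,  0, 2, 3]])

    # One basis vector per free variable of Ax = 0.
    for v in A.nullspace():
        print(v.T)
    print(len(A.nullspace()))   # 3 = n - r = 6 - 3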
5.2.2 Column space basis
The column space of a matrix is defined in terms of a spanning set, but of
course the columns themselves may not be linearly independent and so may
not form a basis. We now show how to find which columns to discard from
the spanning set to obtain a basis for Col(A). Using the same A and J as
before,
A =
[ 1  4 −1  6  4  5
  0 −1  1  5  6  6
  4  2 10 −4  2  6
  1  0  3  0  2  3 ],
and
J =
[ 1 0  3 0  2  3
  0 1 −1 0 −1 −1
  0 0  0 1  1  1
  0 0  0 0  0  0 ],
an immediate conclusion can be drawn. The column space of a matrix is
generally not equal to the column space of its reduced row echelon form. Look
at the entries in the last row of J and think about calculating the span of the
columns of J . Every vector in Col(J) would have its last component equal to
zero, and so the first, third, fifth and sixth columns of A can’t be in Col(J).
The column spaces of the two matrices are different! However, the spaces
are related, as we shall soon see.
It’s really easy to find a basis for the column space of a matrix which is in
reduced row echelon form. In J , let the columns be called d1, . . . ,d6. Then
by looking at the numerical entries in J , the following relationships may be
observed:
d3 = 3d1 − d2, d5 = 2d1 − d2 + d4, d6 = 3d1 − d2 + d4.
That is, d3, d5, d6 are linear combinations of d1, d2, d4. Columns 1, 2 and
4 are part of the standard basis of R4, and so are linearly independent; they
therefore form a basis for Col(J).
The columns containing the leading 1s in a reduced row echelon
matrix J form a basis for the column space of J .
Now we develop a link between the column spaces of A and J in the general
case. Suppose A is an m × n matrix, and x = (x1, x2, ..., xn)^T is a solution of Ax = 0.
Let c1, . . . , cn and d1, . . . ,dn denote the columns of A and J , respectively.
Then Ax = 0 is equivalent to [ c1 c2 . . . . . . cn ] x = 0, which in turn is
equivalent to
x1c1 + x2c2 + . . . . . .+ xncn = 0. (∗)
If the only value of x is the zero vector, then equation (∗) tells us that
the columns c1, . . . , cn of A are linearly independent. If there are non-zero
vectors x, then (∗) tells us that these columns are linearly dependent, and
gives an exact dependency relationship between them.
Recall now that Ax = 0 is equivalent to Jx = 0. Therefore, as before,
Jx = 0 is equivalent to
x1d1 + x2d2 + . . . . . .+ xndn = 0. (∗∗)
That is, if the columns of A are linearly independent, so are the columns of J ,
and vice-versa. If the columns of A are linearly dependent, so are the columns
of J , and vice-versa. Equations like (∗) show the dependency relationship
and this gives an identical dependency relationship between the columns of J ,
shown in (∗∗). The dependency relationships are easy to see in J because J is
in reduced row echelon form. Going back to our particular examples of A and
J , we have already seen that d3 = 3d1−d2 in J , and therefore in A we must
have the same linear dependency, c3 = 3c1 − c2. Since d5 = 2d1 − d2 + d4,
we must have c5 = 2c1− c2+ c4 and since d6 = 3d1 −d2 +d4 we must have
c6 = 3c1 − c2 + c4. Check these!
If certain columns of J form a basis for Col(J), then the corre-
sponding columns in A form a basis for Col(A).
This means, of course, that the dimensions of the column spaces of A and J
are equal. The spaces themselves may be different but they have the same
dimension. Since the dimension of the column space of J is equal to the
number of columns with a leading 1, we have the following result.
If A is an m × n matrix whose reduced row echelon form J has r
leading 1s, then Col(A) has dimension r.
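This is precisely how a computer algebra system extracts a column space basis: row reduce, read off the pivot columns, then return the corresponding columns of the original matrix. (An illustrative sympy sketch, not part of the notes.)

    import sympy as sp

    A = sp.Matrix([[1,  4, -1,  6, 4, 5],
                   [0, -1,  1,  5, 6, 6],
                   [4,  2, 10, -4, 2, 6],
                   [1,  0,  3,  0, 2, 3]])

    J, pivots = A.rref()
    print(pivots)               # (0, 1, 3): leading 1s in columns 1, 2, 4
    for c in A.columnspace():   # the corresponding columns of A itself
        print(c.T)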
This leads to the matrix version of the famous Dimension Theorem of vector
spaces.
Theorem 5.2 – The Dimension Theorem for Matrices
If A is an m× n matrix, then
dim Null(A) + dim Col(A) = the number of columns of A.
Proof Suppose that the reduced row echelon form of A has r leading 1s.
Then the left hand side of the above equation equals n− r + r, which equals
the number of columns of A.
There are special names given to the dimensions of the null and column
spaces of a matrix.
Definition The nullity of a matrix A is the dimension of the null space of
A. The rank of A is the dimension of the column space of A.
Therefore if A is an m × n matrix whose reduced row echelon form J has r
leading 1s,
nullity = n− r, rank = r,
and
nullity + rank = the number of columns of the matrix.
5.3 Summary
Definitions and results you must understand and learn are:
• The definition of the Lagrange basis { p0, p1, p2, . . . . . . , pn} of Pn,
corresponding to a given set of x values c0, c1, . . . . . . , cn.
• The polynomial of degree n (or less) that fits the data points
(c0, k0), (c1, k1), (c2, k2), . . . . . . , (cn, kn)
exactly is
p = k0p0 + k1p1 + k2p2 + . . . . . .+ knpn,
where
{p0, p1, . . . . . . pn}
is the Lagrange basis of Pn.
• If A is an m × n matrix whose echelon form has r leading 1s, then
Null(A) is a subspace of Rn of dimension n−r, corresponding to the number
of parameters needed to solve Ax = 0.
• The columns containing the leading 1s in a reduced row echelon matrix
J form a basis for the column space of J .
• If certain columns of J form a basis for Col(J), then the corresponding
columns in A form a basis for Col(A).
• If A is anm×n matrix whose reduced row echelon form J has r leading
1s, then Col(A) has dimension r.
• Dimension Theorem for Matrices: If A is an m× n matrix then
dim Null(A) + dim Col(A) = the number of columns of A.
• The nullity of a matrix A is the dimension of the null space of A. The
rank of A is the dimension of the column space of A.
Techniques you must master:
• Given a data set with n+ 1 points, calculate a Lagrange basis of Pn.
• Find a polynomial of minimum degree that fits a set of data points,
using the Lagrange basis.
• Given a matrix, find a basis for its null space and a basis for its column
space.
• Find the rank and nullity of a given matrix.
Practice Examples
1. Given the data points
x values −2 −1 0 1 2
y values −30 −10 −4 0 14
use the Lagrange polynomial method to find a polynomial of minimum
degree that passes through all points. Use your answer to find the
approximate value of y when x = 1.5.
2. Find a basis for the null space of the matrix
A =
[ 1 1  0 2
  2 3  1 0
  3 0 −3 3 ]
and write down the nullity of A.
3. Using the same matrix A, find a basis for the column space of A and
write down the rank of A. Check that the nullity and the rank add up
to the number of columns of A.
Answers
1. p(x) = 2x^3 − x^2 + 3x − 4, of degree 1 less than the maximum! The interpolated value of y when x = 1.5 is 5.
2. A basis for the null space is { (1, −1, 1, 0)^T }.
The nullity of A is therefore 1.
3. A basis for the column space is
{ (1, 2, 3)^T, (1, 3, 0)^T, (2, 0, 3)^T }.
The rank of A is therefore 3. The sum of 1 and 3 is equal to 4, the
number of columns of A, as expected by the dimension theorem.
Chapter 6
Linear Transformations
So far our study of vector spaces has been mainly concerned with under-
standing the concepts of subspace, basis and dimension within a single vec-
tor space. We now broaden our approach and consider relationships between
different vector spaces. This is done by defining a special type of function
from one vector space to another that is compatible with the operations of
addition of vectors and multiplication of a vector by a scalar that are defined
on each space. These special functions are called linear transformations.
6.1 Linear transformations
Let V and W be any two vector spaces and let T : V → W be a function.
Then T is called a linear transformation if
(i) T (u+ v) = T (u) + T (v) for all vectors u and v in V , and
(ii) T (ku) = kT (u) for all scalars k and for all vectors u in V .
The two conditions given in the definition should be studied closely. Note
that the vector spaces V andW could be quite different; for example V could
be a space of functions like Pn and W could be a space of columns of real
numbers like R3. Thus the operations of addition and multiplication by a
scalar could be quite different in the two spaces. Yet the equations in the
definition use the same symbols to represent these different operations. In
writing T (u+v) on the left hand side of condition (i), the “+” sign stands for
addition of vectors in V , whereas in the expression T (u) + T (v) on the right
hand side of condition (i) the “+” sign now stands for addition of vectors in
W . Similar comments apply to the different operations of multiplication by
a scalar that appear in condition (ii).
If a function T : V → W satisfies condition (i) we say that T preserves
addition and if it satisfies (ii) we say it preserves multiplication by a scalar.
The two conditions (i) and (ii) can in fact be combined into a single condition,
as follows.
Suppose that both (i) and (ii) are true. Then for all scalars a and b and all
vectors u and v in V , we have
T (au+ bv) = T (au) + T (bv) ( by condition (i) )
= aT (u) + bT (v) ( by condition (ii) )
Conversely, if for all scalars a and b and all vectors u and v in V , we have
T (au+ bv) = aT (u) + bT (v),
then setting a = b = 1 gives condition (i) and setting b = 0 gives condition
(ii).
Checking that a given function T : V → W between vector spaces V and W
is a linear transformation can therefore be done either by checking that the
two separate conditions (i) and (ii) of the definition are valid, or by checking
the combined condition:
T : V → W is a linear transformation if for all scalars a and b and
all vectors u and v in V ,
T (au+ bv) = aT (u) + bT (v).
Every linear transformation maps the zero vector of V to the zero vector of
W . To see this, use condition (ii) with k = 0 and recall that 0x is equal to
the zero vector for all vectors x.
T (0V ) = T (0u) = 0T (u) = 0W ,
where 0V and 0W denote the zero vectors in V and W , respectively, and u
is any vector in V .
If T : V → W is a linear transformation, then T maps the zero
vector of V to the zero vector of W .
Thus if the zero vector of V is not mapped to the zero vector of W by a
function T : V → W , then the function T is not a linear transformation.
6.2 Examples of linear transformations
Example Differentiation is a linear transformation. Consider the function
T : Pn → Pn−1, given by
T (p) = p′
for all polynomials p in Pn. Suppose that p and q are any two polynomials in
Pn and a and b are any two scalars. Then checking the two conditions (i) and
(ii) in the combined version, we have (using the properties of differentiation)
T (ap+ bq) = (ap + bq)′
= (ap)′ + (bq)′
= ap′ + bq′
= aT (p) + bT (q)
Example Indefinite integration is also a linear transformation. Consider the
function T : Pn → Pn+1, where T(p) is the polynomial function whose formula is ∫_c^x p(t) dt (where c is any scalar). Using the properties of integration,
∫_c^x ( ap(t) + bq(t) ) dt = a ∫_c^x p(t) dt + b ∫_c^x q(t) dt
for all scalars a and b and all polynomials p and q in Pn. Therefore the
function T (ap+ bq) is the same as the function aT (p) + bT (q).
Example Consider the function T : Pn → Pn, given by T(p) = p˜, where
p˜(x) = x (d/dx) p(x)
for all polynomials p in Pn. Then T is a linear transformation. To see this, we first check condition (i) of the definition.
Let p and q be any two polynomials in Pn. Then for all real x,
(p + q)˜(x) = x (d/dx)( p(x) + q(x) ) = x ( dp/dx + dq/dx ) = x dp/dx + x dq/dx = p˜(x) + q˜(x).
Therefore T(p + q) = T(p) + T(q). Now check condition (ii). Let k be any scalar and let p be any polynomial in Pn. For all x,
(kp)˜(x) = x (d/dx)( kp(x) ) = xk dp/dx = kx dp/dx = k p˜(x).
That is, T(kp) = kT(p) for all scalars k and all polynomials p.
Example The function T : P2 → R2 given by
T(p) = (a + b, b + c)^T,
for all polynomials p in P2, where p(x) = a + bx + cx^2, is a linear transformation.
To see this, let p(x) = a + bx + cx^2 and q(x) = d + ex + fx^2 and let α and β be any scalars. Then
αp(x) + βq(x) = (αa + βd) + (αb + βe)x + (αc + βf)x^2
and
T(αp + βq) = ( (αa + βd) + (αb + βe), (αb + βe) + (αc + βf) )^T
= ( α(a + b) + β(d + e), α(b + c) + β(e + f) )^T
= α (a + b, b + c)^T + β (d + e, e + f)^T
= αT(p) + βT(q)
Example Suppose that M is any m×n matrix. The function T : Rn → Rm,
such that for all x in Rn,
T (x) = Mx
is a linear transformation. This follows from the properties of matrix multi-
plication. Let u, v be any vectors in Rn and let a, b be any scalars.
T (au+ bv) = M(au + bv)
= M(au) +M(bv)
= aMu + bMv
= aT (u) + bT (v)
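A quick numerical spot-check of this calculation, with an arbitrary matrix and arbitrary vectors and scalars (an illustrative numpy sketch, not part of the notes):

    import numpy as np

    rng = np.random.default_rng(0)
    M = rng.standard_normal((3, 2))    # an arbitrary 3 x 2 matrix
    u = rng.standard_normal(2)
    v = rng.standard_normal(2)
    a, b = 2.0, -5.0

    # T(au + bv) agrees with aT(u) + bT(v) for T(x) = Mx.
    print(np.allclose(M @ (a*u + b*v), a*(M @ u) + b*(M @ v)))   # True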
6.3 Matrix transformations
The last example in the previous section shows a linear transformation in
which the domain and codomain are vector spaces of Rn type and the function
is calculated by a matrix multiplication process. Such linear transformations
are called matrix transformations.
Definition A matrix transformation determined by the m × n matrix M
is the linear transformation T : Rn → Rm given by
T (x) = Mx
for all x in Rn.
The matrix transformation T : Rn → Rm associated with the matrix
M =
[ a11 a12 ... a1n
  a21 a22 ... a2n
  ...
  am1 am2 ... amn ]
is given by the formula
T( (x1, x2, ..., xn)^T ) = ( a11x1 + a12x2 + ... + a1nxn, a21x1 + a22x2 + ... + a2nxn, ..., am1x1 + am2x2 + ... + amnxn )^T
and conversely, every function T : Rn → Rm whose formula is of this type is
a matrix transformation.
We already know that every matrix transformation T : Rn → Rm is a linear
transformation. It’s also true that every linear transformation T : Rn → Rm
is in fact a matrix transformation.
Theorem 6.1 If T : Rn → Rm is a linear transformation, then there exists
a matrix M such that T (x) = Mx for all x in Rn. That is, T is a matrix
transformation.
Proof Express x = (a1, a2, ..., an)^T in Rn as a linear combination of the standard
basis vectors:
x = a1 (1, 0, ..., 0)^T + a2 (0, 1, ..., 0)^T + ... + an (0, 0, ..., 1)^T.
Suppose that T maps the standard basis vectors as follows:
T( (1, 0, ..., 0)^T ) = c1, T( (0, 1, ..., 0)^T ) = c2, ..., T( (0, 0, ..., 1)^T ) = cn.
Define the matrix M to be M = [ c1 c2 . . . . . . cn ]. Apply T to x:
T(x) = T( a1 (1, 0, ..., 0)^T + a2 (0, 1, ..., 0)^T + ... + an (0, 0, ..., 1)^T )
= a1 T( (1, 0, ..., 0)^T ) + a2 T( (0, 1, ..., 0)^T ) + ... + an T( (0, 0, ..., 1)^T )
= a1c1 + a2c2 + ... + ancn
= [ c1 c2 ... cn ] (a1, a2, ..., an)^T
= [ c1 c2 ... cn ] x
= Mx.
Hence T is a matrix transformation.
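The proof is constructive: the columns of M are simply the images of the standard basis vectors. The sketch below (illustrative; the function standard_matrix is a hypothetical helper, not part of the notes) turns that observation into code, using the map from the example that follows:

    import numpy as np

    def standard_matrix(T, n):
        """Build the matrix of a linear transformation T : R^n -> R^m
        by applying T to each standard basis vector of R^n."""
        return np.column_stack([T(e) for e in np.eye(n)])

    # The map T((x, y)) = (2x + 3y, -x, y - x) from the next example.
    T = lambda v: np.array([2*v[0] + 3*v[1], -v[0], v[1] - v[0]])
    print(standard_matrix(T, 2))   # [[ 2.  3.] [-1.  0.] [-1.  1.]]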
Example The function T : R2 → R3 given by
T( (x, y)^T ) = (2x + 3y, −x, y − x)^T
for all vectors (x, y)^T in R2 is a matrix transformation. To see this, we need only rewrite the formula as a matrix multiplication involving a 3 × 2 matrix as follows:
T( (x, y)^T ) = (2x + 3y, −x, y − x)^T =
[ 2 3
 −1 0
 −1 1 ] (x, y)^T,
that is, T( (x, y)^T ) = M (x, y)^T where M is the matrix
[ 2 3
 −1 0
 −1 1 ].
Non Example The function T : R3 → R3 given by
T( (x, y, z)^T ) = (2x + 3y + z, y + z + 3, y − x − z)^T
is not a matrix transformation, as the entry in the second position contains
an unattached constant “3”.
6.4 Geometric significance of matrix trans-
formations
Matrix transformations involving the vector spaces R2 and R3 lend them-
selves to geometric interpretation. All matrix transformations have the prop-
erty that they map lines to lines. That is, if point P is mapped by a linear
transformation to point P ′ and point Q is mapped to point Q′, then a point
X on the line PQ between P and Q is mapped to a point X ′ on the line
P ′Q′ between P ′ and Q′.
(Diagram: points P, Q, X on a line segment, and their images P′, Q′, X′ on the image segment.)
This follows from the fact that relative to some origin, the position vector
of X is simply a linear combination of the position vectors of points P and
Q, and applying a linear transformation to obtain image points P ′, Q′, X ′
maintains the linear combination. That is, the position vector of X ′ is a
linear combination of the position vectors of points P ′ and Q′ and hence X ′
lies on the line P ′Q′.
6.4.1 Reflections
Consider the matrix transformation T : R2 → R2 with formula
T(x) = [ 1 0 ; 0 −1 ] x
for all x in R2. Think of x = (x, y)^T geometrically as the point with coordinates (x, y) in the xy plane. Since
T( (x, y)^T ) = [ 1 0 ; 0 −1 ] (x, y)^T = (x, −y)^T,
we see that geometrically, T corresponds to reflection in the x axis, since it
takes as “input” the point (x, y) and gives as “output” the point (x,−y).
Similarly, reflection in the y axis is given by the matrix transformation
T : R2 → R2 with
T(x) = T( (x, y)^T ) = [ −1 0 ; 0 1 ] (x, y)^T = (−x, y)^T,
for all x = (x, y)^T in R2.
Reflection in the line y = x is given by the matrix transformation
T : R2 → R2 with
T(x) = T( (x, y)^T ) = [ 0 1 ; 1 0 ] (x, y)^T = (y, x)^T,
for all x in R2.
We now derive the matrix associated with reflection in an arbitrary line through the origin. Suppose that the axis of reflection is the line ℓ with equation y = mx, and the angle from the positive x axis drawn anticlockwise to ℓ is θ, as shown. Let the point (x, y) have reflected image (x′, y′) and suppose that the distance from (x, y) to the origin is r, so that x = r cosα, y = r sinα.
(Diagram: the reflection line ℓ at angle θ to the positive x axis; the point (x, y) at angle α and distance r from the origin, and its image (x′, y′).)
The angle between the x axis and the line from the origin to (x′, y′) is
α + (θ − α) + (θ − α) = 2θ − α. Then
x′ = r cos(2θ − α) = r cos 2θ cosα + r sin 2θ sinα
and
y′ = r sin(2θ − α) = r sin 2θ cosα− r cos 2θ sinα
Therefore
x′ = x cos 2θ + y sin 2θ
and
y′ = x sin 2θ − y cos 2θ.
If T : R2 → R2 is the function which reflects in the line ℓ, then for all x = (x, y)^T in R2, we have
T(x) = T( (x, y)^T ) = (x′, y′)^T = [ cos 2θ sin 2θ ; sin 2θ −cos 2θ ] (x, y)^T.
The general reflection matrix is therefore
[ cos 2θ   sin 2θ
  sin 2θ  −cos 2θ ].
All reflections maintain distances between points and angles between lines.
For example, a square drawn anywhere in the xy plane will still be a square
after it has been reflected in any line.
6.4.2 Rotations
(Diagram: the point (x, y) at angle α and distance r from the origin, rotated anticlockwise through θ to (x′, y′).)
Suppose that in the xy plane, (x, y) is rotated anticlockwise about the origin
through an angle of θ, to the point (x′, y′). Let r be the distance of the point
(x, y) to the origin, and let the ray from the origin to (x, y) make an angle α
with the positive x axis, as shown in the diagram. Then
x = r cosα, y = r sinα
while
x′ = r cos(α+ θ) = r cosα cos θ − r sinα sin θ
and
y′ = r sin(α + θ) = r sinα cos θ + r cosα sin θ.
Therefore
x′ = x cos θ − y sin θ
y′ = y cos θ + x sin θ,
that is,
(x′, y′)^T = [ cos θ −sin θ ; sin θ cos θ ] (x, y)^T.
This calculation demonstrates that rotation about the origin anticlockwise through an angle θ is a matrix transformation, T : R2 → R2, given by the rotation matrix
[ cos θ  −sin θ
  sin θ   cos θ ].
All rotations maintain distances between
. All rotations maintain distances between
points and angles between lines. For example, a square drawn anywhere in
the xy plane will still be a square after it has been rotated through any angle.
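The reflection and rotation matrices are easy to experiment with. An illustrative numpy sketch (not part of the notes) that builds both and verifies that distances from the origin are preserved:

    import numpy as np

    def rotation(theta):
        """Anticlockwise rotation about the origin through angle theta."""
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s], [s, c]])

    def reflection(theta):
        """Reflection in the line through the origin at angle theta."""
        c2, s2 = np.cos(2*theta), np.sin(2*theta)
        return np.array([[c2, s2], [s2, -c2]])

    p = np.array([3.0, 4.0])          # a point at distance 5 from O
    for M in (rotation(np.pi/6), reflection(np.pi/3)):
        print(np.linalg.norm(M @ p))  # 5.0 both times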
6.4.3 Shears
Shears are examples of matrix transformations in which distances between
points and angles between lines are not preserved. The simplest case is that
of a shear in the direction of the x axis, with shear factor α. Let T : R2 → R2
be the matrix transformation given by
T(x) = [ 1 α ; 0 1 ] x
for all x in R2. We shall use α = 2 and apply the shear transformation to a
test square with vertices at (0, 0), (1, 0), (1, 1), and (0, 1). When T is applied
to these four points, the following results are obtained.
T( (0, 0)^T ) = [ 1 2 ; 0 1 ] (0, 0)^T = (0, 0)^T
T( (1, 0)^T ) = [ 1 2 ; 0 1 ] (1, 0)^T = (1, 0)^T
T( (1, 1)^T ) = [ 1 2 ; 0 1 ] (1, 1)^T = (3, 1)^T
T( (0, 1)^T ) = [ 1 2 ; 0 1 ] (0, 1)^T = (2, 1)^T
Plotting the original four “input” points and the “output” points on the same
diagram gives this figure:
(Diagram: the unit square with vertices (0, 0), (1, 0), (1, 1), (0, 1), and its image parallelogram with vertices (0, 0), (1, 0), (3, 1), (2, 1).)
The transformation has mapped the square to a parallelogram. All points
on the x axis are fixed by T , as for any point (x, 0),
T( (x, 0)^T ) = [ 1 2 ; 0 1 ] (x, 0)^T = (x, 0)^T.
.
The larger the shear factor α, the more the final figure is “pushed” from the
original square shape. It is also possible to define shears in other directions.
For example, the formula
T(x) = [ 1 0 ; α 1 ] x
for all x in R2 defines a shear in the direction of the y axis with shear factor
α. This linear transformation fixes all points on the y axis.
6.4.4 Scalings
Changing scale on each axis is an example of a matrix transformation. Sup-
pose that we wish to scale a particular figure by a factor of s in the x direction
and by a factor of t in the y direction. We can achieve this by the matrix
transformation T : R2 → R2 where
T( (x, y)^T ) = [ s 0 ; 0 t ] (x, y)^T = (sx, ty)^T
for all (x, y)^T in R2. For example, apply the scaling transformation that doubles the x coordinates (s = 2) and leaves the y coordinates unchanged (t = 1), using the matrix [ 2 0 ; 0 1 ]. As a test figure, consider the 2 × 2 square with vertices at the points A(1, 0), B(1, 2), C(−1, 2), D(−1, 0). The calculations are shown below the diagram. The 2 × 2 test square ABCD is mapped to a 4 × 2 rectangle A′B′C′D′ by the scaling.
(Diagram: the square ABCD and its image rectangle A′B′C′D′, twice as wide.)
T( (1, 0)^T ) = (2, 0)^T,   T( (1, 2)^T ) = (2, 2)^T,
T( (−1, 2)^T ) = (−2, 2)^T,   T( (−1, 0)^T ) = (−2, 0)^T.
6.5 Composite transformations
All the previously mentioned matrix transformations can be used to manip-
ulate and display graphical images. (Other types of transformations such as
those which twist and deform images are not linear transformations and so
won’t be studied here.) Amongst the simplest 2D graphics are letters of the
alphabet, and we see now how to apply various geometric transformations
T1, T2, . . . one after the other to obtain altered images of these letters. Such
a process involves a composite transformation TkTk−1 . . . T2T1 applied to
vectors representing the coordinates of the vertices of the letter. An example
will illustrate the idea.
Example Consider a capital N such as the one shown in the diagram. It
can be drawn by knowing the coordinates of the eight vertices 1, 2, 3, . . . , 8
and then joining adjacent vertices with line segments. We are going to apply
a shear to this figure, followed by a scaling transformation.
(Diagram: the capital letter N, drawn by joining its eight numbered vertices.)
The coordinates of the 8 vertices are stored as columns of a 2×8 data matrix
D,
D =
[ 0  .5  .5    6  6  5.5  5.5   0
  0   0  6.42  0  8   8   1.58  8 ].
The first column of D corresponds to the coordinates of vertex 1, the second
column to coordinates of vertex 2 and so on.
First, apply a shear along the x axis, using shear matrix [ 1 .25 ; 0 1 ]. The “new” vertices after the shear are given by the columns of [ 1 .25 ; 0 1 ] D, that is, the matrix
[ 0  .5  2.105  6  8  7.5  5.895  2
  0   0  6.420  0  8   8   1.580  8 ].
The figure looks like this after the shear:
(Diagram: the letter N after the shear, leaning to the right.)
Now shrink the (horizontal) width of the letter by scaling using the matrix [ .75 0 ; 0 1 ]. The next set of coordinates after this scaling are given by the columns of the matrix
[ .75 0 ; 0 1 ] [ 1 .25 ; 0 1 ] D,
that is, by the columns of
[ 0  .375  1.5787  4.5  6  5.625  4.4212  1.5
  0    0   6.420    0   8    8    1.580    8 ].
The figure after the composite transformation (one matrix transformation
applied after the other) now looks like this:
(Diagram: the letter N after the shear followed by the horizontal scaling.)
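The arithmetic of the whole example is two matrix products. An illustrative numpy sketch (not part of the notes) reproducing the final vertex coordinates; note that because the shear is applied first, its matrix sits closest to D in the product:

    import numpy as np

    # Vertex coordinates of the letter N (top row x, bottom row y).
    D = np.array([[0, .5, .5,   6, 6, 5.5, 5.5,  0],
                  [0,  0, 6.42, 0, 8, 8,   1.58, 8]])

    shear = np.array([[1, .25], [0, 1]])    # shear along the x axis
    scale = np.array([[.75, 0], [0, 1]])    # shrink widths to 3/4

    print(scale @ shear @ D)   # columns are the transformed vertices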
6.6 Summary
You should know and understand the following definitions and results.
• A linear transformation from one vector space V to another vector
space W is a function T : V → W satisfying the conditions
(i) T (u+ v) = T (u) + T (v) for all vectors u and v in V , and
(ii) T (ku) = kT (u) for all scalars k and for all vectors u in V .
• If T : V → W is a linear transformation, then T maps the zero vector
of V to the zero vector of W .
• A matrix transformation determined by the m × n matrix M is the
linear transformation T : Rn → Rm given by
T (x) = Mx
for all x in Rn.
You should be able to apply these procedures.
• Determine if a given function T : V → W is a linear transformation,
either by checking conditions (i) and (ii) of the definition or by using the
combined version: T : V → W is a linear transformation if for all scalars a
and b and all vectors u and v in V ,
T (au+ bv) = aT (u) + bT (v).
• Find the matrices corresponding to particular scaling, shear, reflection
and rotation transformations in the xy plane and calculate the images of
points in the plane under one or more of these transformations.
Practice Examples
1. Determine if the function T : R3 → F given by T( (a, b, c)^T ) = f, where f(x) = ax + bc for all x, is a linear transformation.
2. Find the image of the line y = 2x when rotated about the origin anti-
clockwise through 45◦ then reflected in the line y = x.
3. Find the image of the line y = 2x when reflected in the line y = x then
rotated about the origin anticlockwise through 45◦.
Answers
1. No, as T( 2 (1, 1, 1)^T ) = T( (2, 2, 2)^T ) is the function with formula 2x + 4, while 2 T( (1, 1, 1)^T ) is the function with formula 2(x + 1). These are not equal for all x, and so T does not preserve multiplication by a scalar.
2. The image of y = 2x is the line y = −(1/3)x.
3. The image of y = 2x is the line y = 3x.
Chapter 7
Eigenvalues and Eigenvectors
In the previous chapter we studied a number of different matrix transforma-
tions from R2 to R2 which had geometric interpretations. Under a scaling
transformation with matrix [ s 0 ; 0 t ], points on the x and y axes are mapped to other points on the x and y axes, respectively:
[ s 0 ; 0 t ] (x, 0)^T = (sx, 0)^T   and   [ s 0 ; 0 t ] (0, y)^T = (0, ty)^T.
The x axis and the y axis are fixed lines for that particular matrix trans-
formation. Other types of matrix transformations we have studied also have
fixed lines. For example, a shear in the direction of the x axis, given by the
matrix [ 1 α ; 0 1 ], has the x axis as a fixed line, since [ 1 α ; 0 1 ] (x, 0)^T = (x, 0)^T.
In fact, in this case, the x axis is not only a fixed line but is also fixed point
by point: all points on the x axis map to themselves. In the case of a rotation
about the origin anticlockwise through an angle θ, no line is mapped to itself
unless θ is an integer multiple of π. However, for the reflection T in a line ℓ making an angle θ with the positive x axis, all points on ℓ itself are fixed. Moreover, each point (x, y) on the line m through the origin perpendicular to ℓ is mapped to the opposite point (−x, −y) (through the origin) on that same line. That is, a reflection has two fixed lines.
(Diagram: the reflection line ℓ at angle θ and the perpendicular line m through the origin; the point (c, d) on ℓ satisfies T(c, d) = (c, d), while (a, b) on m satisfies T(a, b) = (−a, −b).)
The fixed lines in these geometric transformations are eigenspaces of the
given transformation and non zero vectors in the direction of the fixed lines
are called eigenvectors of the transformation. These concepts were studied in
first year linear algebra and their algebraic definitions are revised in the next
section. The relevance of eigenvalues and eigenvectors for this unit of study
comes from the huge simplification that we can make in many applications
of linear algebra by using a basis of eigenvectors, when possible.
7.1 Eigenvalues, eigenvectors and eigenspaces
Definition If A is a square matrix of size n×n, and v is a non zero vector
such that Av = λv for some scalar λ, then v is called an eigenvector of the
matrix A. The scalar λ is called an eigenvalue of A corresponding to the
eigenvector v.
Our discussion of the fixed lines of reflections may now be re-phrased in terms
of eigenvalues and eigenvectors. Non zero vectors on the axis of reflection
are eigenvectors of the reflection matrix (corresponding to eigenvalue 1). For
these vectors,
T( (c, d)^T ) = (c, d)^T = 1 (c, d)^T.
Non zero vectors on the line perpendicular to the axis of reflection are also
eigenvectors of the reflection matrix (corresponding to eigenvalue −1). In
this case,
T( (a, b)^T ) = (−a, −b)^T = −1 (a, b)^T.
Theorem 7.1 If λ is an eigenvalue of the n × n matrix A, then the set
of all vectors v such that Av = λv is a subspace of Rn. It is called the
λ-eigenspace of A.
Proof Rearranging Av = λv gives the equivalent equation (A− λI)v = 0.
The set of all vectors v satisfying this equation is the null space of the matrix
A− λI, and is therefore a subspace of Rn.
7.2 How to find eigenvalues and eigenvectors
Eigenvalues and eigenvectors of matrices of size 4 × 4 and larger are rarely
found by hand calculation, unless the matrix in question has a particularly
simple structure. Commercial packages such as Matlab can be used instead.
However, the ability to calculate eigenvalues and eigenvectors of 2 × 2 and
3 × 3 matrices by hand is an essential technique of this unit of study. It
relies on knowledge of determinants (first year material), ability to factorise
algebraic expressions and ability to solve linear systems.
Since eigenvectors satisfy the equation (A−λI)v = 0, and since by definition
eigenvectors are non zero vectors, we seek values of λ which will guarantee
that the matrix A − λI has zero determinant. (If det(A − λI) ≠ 0, then the only solution of (A − λI)v = 0 is the zero vector.) The first step is therefore
to set
det(A− λI) = 0
and solve the resulting polynomial equation for λ. Observe that if A is an
n× n matrix, then det(A− λI) is a polynomial of degree n in λ, and so the
equation det(A− λI) = 0 has at most n distinct solutions.
An n×n matrix has at most n distinct eigenvalues.
Definition The polynomial equation det(A−λI) = 0 is called the character-
istic equation of the matrix A. The solutions of the characteristic equation
are the eigenvalues of A.
Having found the eigenvalues of A, then for each value of λ find the λ-
eigenspace by solving the linear system (A− λI)v = 0 for the vectors v. In
this part of the process it will usually be necessary to apply some EROs to
the augmented matrix [ (A−λI) 0 ] and the solution will always involve at
least one free variable (parameter). The numerical procedure is illustrated in
the following three examples, which demonstrate different types of outcomes
that might be encountered.
Example A Let us find the eigenvalues and eigenspaces of the matrix
A =
[ 2 4 −3
  0 3  0
 −1 5  4 ].
The first step is to solve the characteristic equation, det(A − λI) = 0.
|A − λI| = | 2−λ 4 −3 ; 0 3−λ 0 ; −1 5 4−λ |
= (2 − λ) | 3−λ 0 ; 5 4−λ | − 4 | 0 0 ; −1 4−λ | − 3 | 0 3−λ ; −1 5 |
= (2 − λ)(3 − λ)(4 − λ) − 3(3 − λ)
= (3 − λ)[(2 − λ)(4 − λ) − 3]
= (3 − λ)(λ^2 − 6λ + 5)
= (3 − λ)(λ − 5)(λ − 1)
Therefore det(A−λI) = 0 if and only if λ = 1, 3, 5. There are three distinct
eigenvalues, 1, 3 and 5.
Next we find the three eigenspaces, one for each value of λ. In fact, we end
up finding a basis for each eigenspace.
1. When λ = 1, solve (A − 1I)v = 0. The matrix A − I is the matrix
[ 1 4 −3
  0 2  0
 −1 5  3 ].
Set up the augmented matrix and apply appropriate EROs.
[ 1 4 −3 0 ; 0 2 0 0 ; −1 5 3 0 ] → [ 1 4 −3 0 ; 0 1 0 0 ; 0 9 0 0 ] → [ 1 0 −3 0 ; 0 1 0 0 ; 0 0 0 0 ]
If v = (x, y, z)^T, then z is the parameter and x = 3z, y = 0. Therefore every solution has the form v = (3z, 0, z)^T = z (3, 0, 1)^T for all real z. A
basis for the 1-eigenspace is therefore { (3, 0, 1)^T } and the 1-eigenspace has dimension 1.
2. When λ = 3, solve (A − 3I)v = 0. The matrix A − 3I is the matrix
[ −1 4 −3
   0 0  0
  −1 5  1 ].
Set up the augmented matrix and apply appropriate EROs.
[ −1 4 −3 0 ; 0 0 0 0 ; −1 5 1 0 ] → [ 1 −4 3 0 ; 0 1 4 0 ; 0 0 0 0 ] → [ 1 0 19 0 ; 0 1 4 0 ; 0 0 0 0 ]
If v = (x, y, z)^T, then z is the parameter and x = −19z, y = −4z. Therefore every solution has the form v = (−19z, −4z, z)^T = z (−19, −4, 1)^T for all real z. A basis for the 3-eigenspace is therefore { (−19, −4, 1)^T } and the 3-eigenspace has dimension 1.
3. When λ = 5, solve (A − 5I)v = 0. The matrix A − 5I is the matrix
[ −3  4 −3
   0 −2  0
  −1  5 −1 ].
Set up the augmented matrix and apply appropriate EROs.
[ −3 4 −3 0 ; 0 −2 0 0 ; −1 5 −1 0 ] → [ 1 −5 1 0 ; 0 1 0 0 ; 0 −11 0 0 ] → [ 1 0 1 0 ; 0 1 0 0 ; 0 0 0 0 ]
If v = (x, y, z)^T, then z is the parameter and x = −z, y = 0. Therefore every solution has the form v = (−z, 0, z)^T = z (−1, 0, 1)^T for all real z. A basis for the 5-eigenspace is therefore { (−1, 0, 1)^T } and the 5-eigenspace has dimension 1.
To sum up, the 3 × 3 matrix A has three distinct eigenvalues, each corre-
sponding to an eigenspace of dimension 1. The eigenspaces have a geometrical
interpretation as lines through the origin in 3D space, the directions of the
lines being given by the basis vectors.
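Hand calculations like these can be verified in a few lines. (An illustrative sympy check for Example A, not part of the notes.)

    import sympy as sp

    A = sp.Matrix([[2, 4, -3], [0, 3, 0], [-1, 5, 4]])

    # Each entry: (eigenvalue, algebraic multiplicity, eigenspace basis).
    for lam, mult, basis in A.eigenvects():
        print(lam, mult, [v.T for v in basis])
    # eigenvalues 1, 3 and 5, each with a one-dimensional eigenspace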
Example B Find eigenvalues and eigenspaces of the matrix
B =
[ 0 2 1
 −2 4 1
 −2 2 3 ].
First solve the characteristic equation to find the eigenvalues.
|B − λI| = | −λ 2 1 ; −2 4−λ 1 ; −2 2 3−λ |
= −λ(λ^2 − 7λ + 12 − 2) − 2(2λ − 6 + 2) + 1(−4 + 8 − 2λ)
= −λ(λ − 5)(λ − 2) − 2(2λ − 4) + (4 − 2λ)
= (λ − 2)(−λ^2 + 5λ − 6)
= (λ − 2)^2 (3 − λ)
The matrix B therefore has only two distinct eigenvalues, 2 and 3. The
eigenvalue 2 is a repeated solution of the characteristic equation, and has
algebraic multiplicity 2 (the factor (λ− 2) is squared).
We now find a basis for each eigenspace.
1. The 3-eigenspace has basis { (1, 1, 1)^T } and dimension 1 (exercise).
2. For the repeated solution λ = 2, we solve (B − 2I)v = 0. The augmented matrix reduces as follows.
[ −2 2 1 0 ; −2 2 1 0 ; −2 2 1 0 ] → [ 1 −1 −1/2 0 ; 0 0 0 0 ; 0 0 0 0 ]
If v = (x, y, z)^T, then y and z are parameters, and x = y + (1/2)z. Every
solution has the form
v = ( y + (1/2)z, y, z )^T = y (1, 1, 0)^T + z (1/2, 0, 1)^T,
for all values of y and z. The 2-eigenspace has basis { (1, 1, 0)^T, (1/2, 0, 1)^T } and is therefore a two-dimensional subspace of R3.
To sum up, the 3×3 matrix B has two distinct eigenvalues, the single solution
of the characteristic equation corresponding to an eigenspace of dimension
1 and the repeated solution corresponding to an eigenspace of dimension 2.
The eigenspace of dimension 1 has a geometrical interpretation as the line
through the origin in 3D space in the direction of the basis vector (1, 1, 1)^T, that is, the line joining (0, 0, 0) to (1, 1, 1). The eigenspace of dimension 2 corresponds to the plane through the origin containing the points (1, 1, 0) and (1/2, 0, 1). (This plane has cartesian equation z = 2x − 2y.) The repeated
equal to the dimension of the eigenspace. The next example shows quite a
different situation.
Example C Find eigenvalues and eigenspaces of the matrix
C =
[ 1 0 1
  0 2 0
  0 1 1 ].
As usual, the characteristic equation is found first.
|C − λI| = | 1−λ 0 1 ; 0 2−λ 0 ; 0 1 1−λ |
= (λ − 1)^2 (2 − λ)
The matrix C therefore has only two distinct eigenvalues, 2 and 1. The
eigenvalue 1 is repeated in the characteristic equation, and has algebraic
multiplicity 2. Now the eigenspaces are found.
1. The 2-eigenspace has basis { (1, 1, 1)^T } and dimension 1 (exercise).
2. When λ = 1, we solve (C − 1I)v = 0. Applying EROs to reduce the augmented matrix gives:
[ 0 0 1 0 ; 0 1 0 0 ; 0 1 0 0 ] → [ 0 1 0 0 ; 0 0 1 0 ; 0 0 0 0 ]
There is one parameter, x, and y = 0, z = 0. Every solution v has the form
v = (x, 0, 0)^T = x (1, 0, 0)^T
for all values of x. Therefore the 1-eigenspace has basis { (1, 0, 0)^T } and dimension 1.
To sum up, the 3×3 matrix C has two distinct eigenvalues, the single solution
of the characteristic equation corresponding to an eigenspace of dimension 1
and the repeated solution also corresponding to an eigenspace of dimension
1. Both eigenspaces correspond geometrically to lines through the origin in
3D space, one line joining (0, 0, 0) and (1, 1, 1) and the other joining (0, 0, 0)
and (1, 0, 0). The repeated solution of the characteristic equation occurred
with algebraic multiplicity 2, which does NOT equal the dimension of the
corresponding eigenspace.
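The mismatch between algebraic multiplicity and eigenspace dimension is exactly what a computer algebra system reports for this matrix. (An illustrative sympy check for Example C, not part of the notes.)

    import sympy as sp

    C = sp.Matrix([[1, 0, 1], [0, 2, 0], [0, 1, 1]])

    # Compare algebraic multiplicity with the eigenspace dimension.
    for lam, mult, basis in C.eigenvects():
        print(lam, mult, len(basis))
    # eigenvalue 1: multiplicity 2 but only a 1-dimensional eigenspace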
7.3 The diagonalisation theorem
In the early part of this chapter it was mentioned that many problems in-
volving matrices could be greatly simplified if the vector space in question
had a basis consisting of eigenvectors of the matrix. The Diagonalisation
Theorem expresses the natural consequence of the existence of such a basis
and will be used repeatedly in the rest of the course.
Examples A, B and C of the previous section were chosen to illustrate a
variety of different situations in the cases of distinct and repeated eigenvalues.
In Example A there were three eigenspaces each of dimension 1, all subspaces
of R3. If we collect the basis vectors of each of these eigenspaces together,
the set we obtain is a basis for R3,
{ (−1, 0, 1)^T, (3, 0, 1)^T, (−19, −4, 1)^T }.
In Example B there were two eigenspaces, of dimensions 1 and 2. Collecting
the basis vectors of the two eigenspaces together gives the set
{ (1, 1, 0)^T, (1/2, 0, 1)^T, (1, 1, 1)^T }
which is also a basis for R3.
However in Example C the eigenspaces were both of dimension 1. Collecting
the basis vectors of those eigenspaces together gives us the set


11
1

 ,

10
0




which is not a basis for R3 since there are not three vectors in the set.
What is the situation for an arbitrary n × n matrix M? Sometimes it is
possible to form a basis of Rn consisting entirely of eigenvectors of M , and
sometimes it is not. The simplest case is when all eigenvalues are distinct,
as in Example A.
Theorem 7.2 If an n×n matrix M has n distinct eigenvalues, then the set
of n eigenvectors formed by selecting a non zero vector from each eigenspace
is a basis for Rn.
Proof Suppose the (distinct) eigenvalues are λ1, λ2, . . . . . . , λn and let
v1, v2, . . . . . . ,vn
be corresponding eigenvectors. We shall give a proof by induction.
The set {v1} is linearly independent, as v1 is a non zero vector. Assume that
the set
{v1, v2, . . . . . . ,vi}
is linearly independent. We shall now prove that the set
{v1, v2, . . . . . . ,vi, vi+1}
is linearly independent. Consider the equation
a_1 v_1 + a_2 v_2 + \dots + a_i v_i + a_{i+1} v_{i+1} = 0.  (∗)

Multiply equation (∗) by M to obtain

a_1 M v_1 + a_2 M v_2 + \dots + a_i M v_i + a_{i+1} M v_{i+1} = 0,

that is,

a_1 λ_1 v_1 + a_2 λ_2 v_2 + \dots + a_i λ_i v_i + a_{i+1} λ_{i+1} v_{i+1} = 0.

Now multiply equation (∗) by λ_{i+1} to obtain

a_1 λ_{i+1} v_1 + a_2 λ_{i+1} v_2 + \dots + a_i λ_{i+1} v_i + a_{i+1} λ_{i+1} v_{i+1} = 0.
Subtracting the last two equations gives

a_1(λ_1 − λ_{i+1}) v_1 + a_2(λ_2 − λ_{i+1}) v_2 + \dots + a_i(λ_i − λ_{i+1}) v_i = 0.

Since the set {v_1, v_2, ..., v_i} is linearly independent, we deduce that

a_1(λ_1 − λ_{i+1}) = 0, a_2(λ_2 − λ_{i+1}) = 0, ..., a_i(λ_i − λ_{i+1}) = 0.

But the λs are all different, and therefore a_1 = 0, a_2 = 0, ..., a_i = 0.
From equation (∗), and knowing that v_{i+1} ≠ 0, we then obtain a_{i+1} = 0. So
the set
{v1, v2, . . . . . . ,vi, vi+1}
is linearly independent. By mathematical induction, we conclude that the
set
{v1, v2, . . . . . . ,vn}
is linearly independent, and since it contains n vectors and the dimension of
Rn is n, it is a basis for Rn.
Even if some of the eigenvalues are repeated solutions of the characteristic
equation, it may still be possible to form a basis of eigenvectors (refer to
Example B). The condition that is needed is for the algebraic multiplicity of
the repeated solution λ to be equal to the dimension of the λ-eigenspace. This
is why in Example C it was not possible to obtain a basis of eigenvectors.
The repeated solution had algebraic multiplicity 2 but the corresponding
eigenspace had dimension 1. This result is summarised below, and is stated
without proof.
If an n×n matrix M does not have n distinct eigenvalues, then Rn
has a basis consisting of eigenvectors of M if and only if for each
repeated solution λ of the characteristic equation, the algebraic
multiplicity of λ in the equation equals the dimension of the λ-
eigenspace.
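The boxed criterion is easy to test numerically. Below is a minimal sketch (our own illustration, written in Python/NumPy rather than the Matlab mentioned elsewhere in these notes) that checks it for the matrix C of Example C; the dimension of the 1-eigenspace is computed as 3 − rank(C − I).

import numpy as np

# Matrix C from Example C of the previous section.
C = np.array([[1.0, 0.0, 1.0],
              [0.0, 2.0, 0.0],
              [0.0, 1.0, 1.0]])

# Algebraic data: the eigenvalue 1 appears twice, the eigenvalue 2 once.
print(np.round(np.linalg.eigvals(C), 6))

# Geometric multiplicity of lambda = 1: dim of the 1-eigenspace
# equals 3 - rank(C - I).
geom = 3 - np.linalg.matrix_rank(C - np.eye(3))
print(geom)   # 1, less than the algebraic multiplicity 2, so there is
              # no basis of eigenvectors and C is not diagonalisable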
We now come to the main theorem of the course, which underpins all the
applications to follow in later chapters.
Theorem 7.3 – Diagonalisation Theorem
If Rn has a basis {v1, v2, . . . . . . ,vn} consisting of eigenvectors of an n × n
matrix A, then there exists an invertible matrix P and a diagonal matrix D
such that D = P−1AP .
Proof Suppose that each eigenvector vi corresponds to an eigenvalue λi,
for i = 1, 2, . . . . . . , n. Define the matrix P to be the matrix with columns
v1, v2, . . . . . . ,vn. That is,
P = [ v1 v2 . . . . . . vn ]
Then P is invertible, because if P were not invertible, then one of the columns
of the reduced row echelon form of P would be missing a leading 1, contra-
dicting the linear independence of the columns of P .
Now

AP = A[ v_1 v_2 \dots v_n ]
   = [ Av_1 Av_2 \dots Av_n ]
   = [ λ_1 v_1 λ_2 v_2 \dots λ_n v_n ]
   = [ v_1 v_2 \dots v_n ] \begin{pmatrix} λ_1 & 0 & \dots & 0 \\ 0 & λ_2 & \dots & 0 \\ & & \ddots & \\ 0 & 0 & \dots & λ_n \end{pmatrix}.

Defining D to be the diagonal matrix

D = \begin{pmatrix} λ_1 & 0 & \dots & 0 \\ 0 & λ_2 & \dots & 0 \\ & & \ddots & \\ 0 & 0 & \dots & λ_n \end{pmatrix},
we have AP = PD. Since P is invertible, AP = PD is equivalent to A =
PDP−1, or alternatively, to P−1AP = D.
If matrices A, P and D are as described in the Diagonalisation Theorem,
then we say that “P diagonalises A”, or “P is the diagonalising matrix for
A.” The matrix A is also called a “diagonalisable” matrix. Note that the
matrices P and D are not unique. It doesn’t matter in which order the
eigenvector basis vectors are arranged as the columns of P , as long as the
eigenvalues are placed on the main diagonal of D in the corresponding order.
The following example will illustrate.
Example Diagonalise the matrix
A = \begin{pmatrix} 2 & 4 & −3 \\ 0 & 3 & 0 \\ −1 & 5 & 4 \end{pmatrix}.
This matrix is the one used in Example A earlier in this chapter. We found
that A has three distinct eigenvalues 1, 3 and 5 and eigenspace bases as
follows: the 1-eigenspace has basis \left\{ \begin{pmatrix} 3 \\ 0 \\ 1 \end{pmatrix} \right\}, the 3-eigenspace has basis
\left\{ \begin{pmatrix} −19 \\ −4 \\ 1 \end{pmatrix} \right\} and the 5-eigenspace has basis \left\{ \begin{pmatrix} −1 \\ 0 \\ 1 \end{pmatrix} \right\}. A diagonalising
matrix P is therefore

P = \begin{pmatrix} 3 & −19 & −1 \\ 0 & −4 & 0 \\ 1 & 1 & 1 \end{pmatrix}

and the diagonal matrix D such that A = PDP^{−1} is

D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 5 \end{pmatrix}.
Observe that as (3, 0, 1)^T was used as the first column of P, the
corresponding eigenvalue λ = 1 is placed in the first main diagonal position
of D. Since the 3-eigenspace basis vector (−19, −4, 1)^T was used as the
second column of P, the eigenvalue 3 is placed in the second main diagonal
position of D, and so on.
Some alternative diagonalising matrices P and the corresponding diagonal
matrices D are:
P = \begin{pmatrix} −1 & 3 & −19 \\ 0 & 0 & −4 \\ 1 & 1 & 1 \end{pmatrix}, D = \begin{pmatrix} 5 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 3 \end{pmatrix},

in which the order of eigenvectors as columns of P has been changed (and
the corresponding diagonal entries of D have therefore changed to match),
and

P = \begin{pmatrix} 9 & 38 & 1 \\ 0 & 8 & 0 \\ 3 & −2 & −1 \end{pmatrix}, D = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 5 \end{pmatrix},
in which different scalar multiples of the original eigenvectors have been used.
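As a quick check of the diagonalisations above, here is a short sketch (ours, in Python/NumPy; the notes themselves use Matlab for machine computation) verifying that A = PDP^{−1} holds for the first choice of P and D:

import numpy as np

A = np.array([[ 2.0, 4.0, -3.0],
              [ 0.0, 3.0,  0.0],
              [-1.0, 5.0,  4.0]])
P = np.array([[3.0, -19.0, -1.0],
              [0.0,  -4.0,  0.0],
              [1.0,   1.0,  1.0]])
D = np.diag([1.0, 3.0, 5.0])

# A = P D P^(-1) should hold up to rounding error.
print(np.allclose(A, P @ D @ np.linalg.inv(P)))   # True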
7.4 Finding powers of a matrix
The main application of the Diagonalisation Theorem is that it gives an easy
way of calculating powers of a diagonalisable matrix.
Lemma If a square matrix A can be expressed in the form A = PDP^{−1}
for some matrices P and D, then A^n = PD^nP^{−1} for all positive integers n.

Proof Suppose that A = PDP^{−1}. Then

A^2 = (PDP^{−1})(PDP^{−1}) = PD^2P^{−1}.

Now assume that A^i = PD^iP^{−1} for some integer i ≥ 1. Then

A^{i+1} = A A^i = (PDP^{−1})(PD^iP^{−1}) = PD^{i+1}P^{−1}.
The result now follows by mathematical induction.
The usefulness of this result comes from the fact that if D is diagonal, then
Dn is easy to find, since no matrix multiplication is needed to find powers
of diagonal matrices.
Lemma If D is the k × k diagonal matrix with main diagonal entries
a_{11}, a_{22}, ..., a_{kk}, then for all positive integers n,

D^n = \begin{pmatrix} a_{11}^n & 0 & \dots & 0 \\ 0 & a_{22}^n & \dots & 0 \\ & & \ddots & \\ 0 & 0 & \dots & a_{kk}^n \end{pmatrix}.
Proof The proof is by induction. The result is clearly true when n = 1.
Assuming that for i ≥ 1,

D^i = \begin{pmatrix} a_{11}^i & 0 & \dots & 0 \\ 0 & a_{22}^i & \dots & 0 \\ & & \ddots & \\ 0 & 0 & \dots & a_{kk}^i \end{pmatrix},

we obtain

D^{i+1} = D^i D = \begin{pmatrix} a_{11}^i & 0 & \dots & 0 \\ 0 & a_{22}^i & \dots & 0 \\ & & \ddots & \\ 0 & 0 & \dots & a_{kk}^i \end{pmatrix} \begin{pmatrix} a_{11} & 0 & \dots & 0 \\ 0 & a_{22} & \dots & 0 \\ & & \ddots & \\ 0 & 0 & \dots & a_{kk} \end{pmatrix} = \begin{pmatrix} a_{11}^{i+1} & 0 & \dots & 0 \\ 0 & a_{22}^{i+1} & \dots & 0 \\ & & \ddots & \\ 0 & 0 & \dots & a_{kk}^{i+1} \end{pmatrix}.
The result now follows by mathematical induction.
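Combining the two lemmas, A^n is obtained from one entrywise power of a diagonal matrix and two matrix multiplications. A small sketch (ours, in Python/NumPy) using the matrices of the worked example in the previous section:

import numpy as np

A = np.array([[ 2.0, 4.0, -3.0],
              [ 0.0, 3.0,  0.0],
              [-1.0, 5.0,  4.0]])
P = np.array([[3.0, -19.0, -1.0],
              [0.0,  -4.0,  0.0],
              [1.0,   1.0,  1.0]])
D = np.diag([1.0, 3.0, 5.0])

n = 10
Dn = np.diag(np.diag(D) ** n)            # power a diagonal matrix entrywise
An = P @ Dn @ np.linalg.inv(P)           # A^n = P D^n P^(-1)

# Compare with repeated multiplication.
print(np.allclose(An, np.linalg.matrix_power(A, n)))   # True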
7.5 Summary
You must know and understand these definitions and results.
• If A is a square matrix of size n × n, and v is a non zero vector such
that Av = λv for some scalar λ, then v is called an eigenvector of the matrix
A. The scalar λ is called an eigenvalue of A corresponding to the eigenvector
v.
• If λ is an eigenvalue of the n× n matrix A, then the set of all vectors
v such that Av = λv is a subspace of Rn. It is called the λ-eigenspace of A.
• The polynomial equation det(A − λI) = 0 is called the characteristic
equation of the matrix A. The solutions of the characteristic equation are
the eigenvalues of A.
• If an n × n matrix M has n distinct eigenvalues, then the set of
n eigenvectors formed by selecting a basis vector from each eigenspace is
linearly independent and hence forms a basis for Rn.
• If an n × n matrix M does not have n distinct eigenvalues, then Rn
has a basis consisting of eigenvectors of M if and only if for each repeated
solution λ of the characteristic equation, the algebraic multiplicity of λ equals
the dimension of the λ-eigenspace.
• Diagonalisation Theorem: If Rn has a basis {v1, v2, . . . . . . ,vn} con-
sisting of eigenvectors of an n × n matrix A, then there exists an invertible
matrix P and a diagonal matrix D such that D = P^{−1}AP.
• If a square matrix A can be expressed in the form A = PDP^{−1} for
some matrices P and D, then A^n = PD^nP^{−1} for all positive integers n.
• If D is the k × k diagonal matrix with main diagonal entries
a_{11}, a_{22}, ..., a_{kk}, then for all positive integers n,

D^n = \begin{pmatrix} a_{11}^n & 0 & \dots & 0 \\ 0 & a_{22}^n & \dots & 0 \\ & & \ddots & \\ 0 & 0 & \dots & a_{kk}^n \end{pmatrix}.
You must master the following techniques. Inability to calculate
eigenvalues and eigenvectors correctly will cause serious difficulty in complet-
ing tutorial and assignment problems and exam questions.
• Find eigenvalues λ and a basis of each λ-eigenspace of 2× 2 and 3× 3
matrices.
• Given a 2 × 2 or 3 × 3 matrix A, determine if there is a basis of
eigenvectors, and if there is, find a matrix P which diagonalises A; that is,
find a matrix P and a diagonal matrix D such that A = PDP−1.
Practice Examples
1. Find eigenvalues and corresponding eigenspaces of the matrix

\begin{pmatrix} 1 & 3 \\ 6 & 4 \end{pmatrix}.

2. Find eigenvalues and corresponding eigenspaces of the matrix

A = \begin{pmatrix} 2 & 1 & −2 \\ 0 & 1 & 0 \\ 1 & 1 & −1 \end{pmatrix}.
Is A diagonalisable?
3. If the previous matrix A can be diagonalised, find a diagonalising ma-
trix P and a corresponding diagonal matrix D such that D = P−1AP .
Answers
1. The 7-eigenspace has basis \left\{ \begin{pmatrix} 1 \\ 2 \end{pmatrix} \right\} and the −2-eigenspace has basis \left\{ \begin{pmatrix} −1 \\ 1 \end{pmatrix} \right\}.
2. The characteristic equation has solutions 0 (single) and 1 (repeated).
The 0-eigenspace has basis \left\{ \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} \right\} and the 1-eigenspace has basis
\left\{ \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \right\}.
The matrix A is diagonalisable as the repeated eigenvalue has eigenspace
of dimension 2, enabling a basis of eigenvectors of R3 to be found,
namely

\left\{ \begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \right\}.
3. One suitable answer would be

P = \begin{pmatrix} 1 & 2 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{pmatrix}, D = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
Chapter 8
Linear Recurrence Relations
In this chapter we show how the Diagonalisation Theorem can be used to
solve problems expressible as linear recurrence relations. A variety of prob-
lems in applied science fall into this category and some of them are discussed
in this chapter.
What is a linear recurrence relation? Suppose that x0, x1, x2 . . . . . . is some
sequence of numbers. The term “recurrence relation” is used when the value
of xn is expressible in terms of one or more of the previous members of the
sequence. For example, the first two members of a sequence might be 3 and
5, and the recurrence relation might be that for n ≥ 2, the nth member
equals the average of the two preceding members,
x_n = \frac{1}{2}(x_{n−1} + x_{n−2}).
The information provided by the recurrence relation and the initial values is
enough to find any particular xn. For example, we find that
x_2 = \frac{1}{2}(x_0 + x_1) = 4,
x_3 = \frac{1}{2}(x_2 + x_1) = 4.5,
x_4 = \frac{1}{2}(x_3 + x_2) = 4.25,

and so on.
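Stepping a recurrence forward like this is easily automated. A minimal sketch (our own, in Python; the notes use Matlab for such computations):

# x0 = 3, x1 = 5; each later term is the average of the two before it.
x = [3.0, 5.0]
for n in range(2, 6):
    x.append(0.5 * (x[n - 1] + x[n - 2]))
print(x)     # [3.0, 5.0, 4.0, 4.5, 4.25, 4.375]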
Definition An equation of the form xn+2 = axn+1 + bxn where a and b
are constants and each x is expressed in terms of the previous two values is
called a linear recurrence relation of order 2 (or sometimes a linear difference
equation of order 2).
It is possible to have linear recurrence relations of other orders. For example,
an equation of the form
xn+3 = axn+2 + bxn+1 + cxn
where a, b and c are constants and each x is expressed in terms of the
previous three values is called a linear recurrence relation of order 3, and
so on. Thanks to the Diagonalisation Theorem, there is often a better
way to find the value of a particular xn than to find all preceding values
x0, x1, x2, . . . . . . , xn−1. As might be expected, this involves the use of vectors
and matrices rather than a simple scalar equation like xn+2 = axn+1 + bxn.
Once the necessary matrix and vectors have been introduced, a linear recur-
rence relation of any order can be rewritten as a (vector) recurrence relation
of order 1.
The procedure for modelling recurrence relations in vector/matrix form is ex-
plained via examples drawn from different areas of science. The first example,
the Leslie population model, falls into a category of recurrence relations in
which vectors and matrices occur naturally as part of the problem descrip-
tion. In the other examples, conduction of electrical impulses along nerve
fibres and the growth of branches on a tree, the problems present them-
selves naturally in the form of linear difference equations, and vectors and
matrices have to be artificially constructed in order to use the results of the
Diagonalisation Theorem.
8.1 Leslie population model
The Leslie population model assumes that the overall population growth of a
given species depends on the female population growth. Females are divided
into age classes such as “juvenile, young adult and mature”, or possibly by
a finer division such as “between 0 and 1 years, between 1 and 2 years”, and
so on. The overall aim is to predict population sizes in each age class into
the future, knowing the present population in each class. It is assumed that
the following information is known:
(i) the numbers of females in each age class at some particular time (usu-
ally denoted by t = 0),
(ii) the average number of daughters born to each female during the time
she is in each age class, and
(iii) the proportion of females in each age class who survive and pass into
the next age class.
Example In an animal population, the oldest age attained by a female is
15 years. Use three age classes of 5 years duration, and assume the following
survival and fertility data.
Age class            Ave. no. daughters/female   Prop'n surviving to next class
1 (0 ≤ age < 5)      0                           1/2
2 (5 ≤ age < 10)     4                           1/4
3 (10 ≤ age < 15)    3                           0
Time is measured in units of 5 years, so that t = 1 corresponds to 5 years,
t = 2 to 10 years and so on. Let
xn = the number in class 1 at time t = n,
yn = the number in class 2 at time t = n,
zn = the number in class 3 at time t = n,
Then

x_{n+1} = 0x_n + 4y_n + 3z_n
y_{n+1} = \frac{1}{2} x_n
z_{n+1} = \frac{1}{4} y_n.

That is,

\begin{pmatrix} x_{n+1} \\ y_{n+1} \\ z_{n+1} \end{pmatrix} = \begin{pmatrix} 0 & 4 & 3 \\ 1/2 & 0 & 0 \\ 0 & 1/4 & 0 \end{pmatrix} \begin{pmatrix} x_n \\ y_n \\ z_n \end{pmatrix}.
Define the population vector u_n and the Leslie matrix L as follows:

u_n = \begin{pmatrix} x_n \\ y_n \\ z_n \end{pmatrix}, L = \begin{pmatrix} 0 & 4 & 3 \\ 1/2 & 0 & 0 \\ 0 & 1/4 & 0 \end{pmatrix}.

Then u_{n+1} = L u_n. This is the standard form of a recurrence relation in
vector/matrix notation. Assume that the initial population in each age class
is 1000, and measure population in units of a thousand, so that u_0 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
Then

u_1 = L u_0 = \begin{pmatrix} 0 & 4 & 3 \\ 1/2 & 0 & 0 \\ 0 & 1/4 & 0 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 7 \\ 0.5 \\ 0.25 \end{pmatrix},
u_2 = L u_1 = \begin{pmatrix} 0 & 4 & 3 \\ 1/2 & 0 & 0 \\ 0 & 1/4 & 0 \end{pmatrix} \begin{pmatrix} 7 \\ 0.5 \\ 0.25 \end{pmatrix} = \begin{pmatrix} 2.75 \\ 3.5 \\ 0.125 \end{pmatrix},
and so on.
Notice that

u_2 = L u_1 = L(L u_0) = L^2 u_0,
u_3 = L u_2 = L(L^2 u_0) = L^3 u_0,

and in general,

u_n = L^n u_0 for all positive integers n.
This formula enables us to calculate the population vector un for any n, using
the initial population vector and the nth power of the Leslie matrix.
For example, to find the population vector at time t = 5 using Matlab,
calculate the fifth power of the Leslie matrix and multiply it by u0.
u_5 = L^5 u_0 = L^5 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 29.781 \\ 4.063 \\ 1.797 \end{pmatrix}.
Reading off the entries one by one, we see that the population in age class 1
at time t = 5 is approximately 29781, with the other age classes significantly
smaller at 4063 and 1797 respectively. At time t = 15, the same type of
calculation gives

u_{15} = L^{15} u_0 = L^{15} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1365.7 \\ 368.6 \\ 78.0 \end{pmatrix}.
Thus the three age classes have populations of approximately 1365700, 368600
and 78000 when t = 15. This type of information has some use, but in many
cases just knowing the populations at particular times is not as valuable as
being able to predict which age class will become dominant over long periods
of time. For this we make use of the eigenvalues and eigenvectors of L.
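Before moving on to the eigen-analysis, the power computations quoted above are easily reproduced. A sketch (ours, in Python/NumPy rather than Matlab):

import numpy as np

L = np.array([[0.0, 4.0,  3.0],
              [0.5, 0.0,  0.0],
              [0.0, 0.25, 0.0]])
u0 = np.array([1.0, 1.0, 1.0])

print(np.linalg.matrix_power(L, 5)  @ u0)   # approx (29.781, 4.063, 1.797)
print(np.linalg.matrix_power(L, 15) @ u0)   # approx (1365.7, 368.6, 78.0)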
Assuming that L is diagonalisable, R3 has a basis of eigenvectors {v_1, v_2, v_3}
of L. Let the corresponding eigenvalues be λ_1, λ_2, λ_3. Let P be a
diagonalising matrix for L and D the associated diagonal matrix:

P = [ v_1 v_2 v_3 ], D = \begin{pmatrix} λ_1 & 0 & 0 \\ 0 & λ_2 & 0 \\ 0 & 0 & λ_3 \end{pmatrix}.
Then by the Diagonalisation Theorem and lemma of the previous chapter,
L = PDP^{−1} and L^n = PD^nP^{−1}. Since u_n = L^n u_0, we have
u_n = PD^nP^{−1}u_0.

Because {v_1, v_2, v_3} is a basis of R3, the initial population vector u_0 is
expressible as a linear combination of these vectors. That is, for some scalars
c_1, c_2, c_3,

u_0 = c_1 v_1 + c_2 v_2 + c_3 v_3 = [ v_1 v_2 v_3 ] \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = P \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}.

Thus P^{−1}u_0 = \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}. In addition,
PD^n = [ v_1 v_2 v_3 ] \begin{pmatrix} λ_1^n & 0 & 0 \\ 0 & λ_2^n & 0 \\ 0 & 0 & λ_3^n \end{pmatrix} = [ λ_1^n v_1 λ_2^n v_2 λ_3^n v_3 ].

Therefore

u_n = PD^nP^{−1}u_0 = [ λ_1^n v_1 λ_2^n v_2 λ_3^n v_3 ] \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = c_1 λ_1^n v_1 + c_2 λ_2^n v_2 + c_3 λ_3^n v_3.
We call

u_n = c_1 λ_1^n v_1 + c_2 λ_2^n v_2 + c_3 λ_3^n v_3
the general solution of the recurrence relation un+1 = Lun. It expresses the
population vector in year n in terms of the eigenvalues and eigenvectors of L.
The values of the scalars c1, c2, c3 can be found using the initial population
data. When these values are known and are substituted into the general
solution, we have a particular solution of the recurrence relation.
We have already found the populations in each age class at t = 5 and t = 15.
We now evaluate the eigenvalues and eigenvectors of our particular L and
use the initial population data to write down the particular solution of our
problem and to make some population predictions.
|L − λI| = \begin{vmatrix} −λ & 4 & 3 \\ 1/2 & −λ & 0 \\ 0 & 1/4 & −λ \end{vmatrix} = −λ^3 + 2λ + \frac{3}{8}.

By inspection, λ = 3/2 is a solution (as −27/8 + 3 + 3/8 = 0). Therefore (λ − 3/2) is
a factor of −λ^3 + 2λ + 3/8; that is,

−λ^3 + 2λ + \frac{3}{8} = (λ − \frac{3}{2})(−λ^2 + aλ − \frac{1}{4}) for some scalar a.

Expand the right hand side and equate coefficients with the left hand side to
find that the value of a is −3/2. Therefore

|L − λI| = (λ − \frac{3}{2})(−λ^2 − \frac{3}{2}λ − \frac{1}{4}).

Using the formula for the roots of a quadratic allows us to evaluate all three
eigenvalues of L. We obtain

λ_1 = \frac{3}{2}, λ_2 = \frac{−3 − \sqrt{5}}{4}, λ_3 = \frac{−3 + \sqrt{5}}{4}.

Working to three decimal places, we have

λ_1 = \frac{3}{2}, λ_2 = −1.309, λ_3 = −0.191.
Corresponding eigenvectors of L are

v_1 = \begin{pmatrix} 18 \\ 6 \\ 1 \end{pmatrix}, v_2 = \begin{pmatrix} 0.932 \\ −0.356 \\ 0.068 \end{pmatrix}, v_3 = \begin{pmatrix} 0.226 \\ −0.591 \\ 0.774 \end{pmatrix}.

(Eigenvectors v_2 and v_3 are approximations and were calculated using
Matlab.)
Values of c_1, c_2, and c_3 are found by solving P \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = u_0, that is,

\begin{pmatrix} 18 & 0.932 & 0.226 \\ 6 & −0.356 & −0.591 \\ 1 & 0.068 & 0.774 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
This calculation (again done using Matlab) gives approximate values
c_1 = 0.158, c_2 = −2.289, c_3 = 1.289. Now

u_n = c_1 λ_1^n v_1 + c_2 λ_2^n v_2 + c_3 λ_3^n v_3
    = 0.158(1.5)^n \begin{pmatrix} 18 \\ 6 \\ 1 \end{pmatrix} − 2.289(−1.309)^n \begin{pmatrix} 0.932 \\ −0.356 \\ 0.068 \end{pmatrix} + 1.289(−0.191)^n \begin{pmatrix} 0.226 \\ −0.591 \\ 0.774 \end{pmatrix}.

If we wish, we could now read off the values of xn, yn and zn from the previous
equation, recalling that xn is the first entry in un, yn the second entry and
zn the third entry. However more interesting information can be obtained
by going back to the general solution and observing the behaviour of the
solutions when n is large. Selecting the eigenvalue of largest magnitude, we
factor it out of the general formula for u_n as follows.

u_n = c_1 λ_1^n v_1 + c_2 λ_2^n v_2 + c_3 λ_3^n v_3
    = c_1 (1.5)^n v_1 + c_2 (−1.309)^n v_2 + c_3 (−0.191)^n v_3
    = (1.5)^n \left\{ c_1 v_1 + c_2 \left(\frac{−1.309}{1.5}\right)^n v_2 + c_3 \left(\frac{−0.191}{1.5}\right)^n v_3 \right\}.

For large n, the values of (−1.309/1.5)^n and (−0.191/1.5)^n are small.
Therefore u_n is approximately equal to (1.5)^n c_1 v_1,

u_n ≈ 0.158 (1.5)^n \begin{pmatrix} 18 \\ 6 \\ 1 \end{pmatrix}.
That is, in terms of the original notation for the populations in each age
class,

x_n ≈ (1.5)^n × 18 × 0.158
y_n ≈ (1.5)^n × 6 × 0.158
z_n ≈ (1.5)^n × 0.158.
The proportions of the populations in each age class xn : yn : zn approach
18 : 6 : 1 as n gets large. In other words, for large n, the populations in the
age classes are approximately in proportion to the entries of the eigenvector
corresponding to the eigenvalue of largest magnitude.
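This long-term behaviour can be confirmed numerically. A sketch (ours) that picks out the eigenvalue of largest magnitude and rescales its eigenvector so the last entry is 1:

import numpy as np

L = np.array([[0.0, 4.0,  3.0],
              [0.5, 0.0,  0.0],
              [0.0, 0.25, 0.0]])

w, V = np.linalg.eig(L)
k = np.argmax(np.abs(w))        # index of the dominant eigenvalue
v = V[:, k] / V[2, k]           # rescale; the third entry of v1 is nonzero here
print(w[k].real)                # 1.5
print(v.real)                   # approx (18, 6, 1)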
8.2 Conduction of electrical impulses
In this section we look at an application from the field of medical science
which also leads to a recurrence relation in standard vector/matrix form.
However the matrix and vectors used to write this problem in standard form
do not arise naturally as they did in the Leslie population model, and have
to be artificially constructed.
The transmission of signals between the brain and muscles is made possible
by the conduction of electrical impulses along nerve fibres. Some nerve fibres
are insulated with a sheath of myelin, a mixture of proteins and lipids that
assists the propagation of the signal from one end of the nerve fibre to the
other. At regular intervals there is a break in the myelin sheath where the
nerve fibre is exposed; these gaps are called nodes of Ranvier. The nodes
are rich in molecules called sodium ion channels which assist the flow of
electricity from one segment of myelinated fibre to the next. There is some
leakage of current at each node.
[Diagram: a nerve fibre insulated by a myelin sheath, with nodes of Ranvier
at the (n−1)th, nth and (n+1)th positions; currents x_{n−1}, x_n, x_{n+1} flow
between consecutive nodes, with some current leakage at each node.]
Our aim is to find a formula for the current flowing in a myelinated segment
of the nerve fibre, in terms of the current flow in adjacent segments. Let xn
be the current flowing between the nth and (n+1)th nodes, let the electrical
potential of the nth node be denoted by En, the resistance of the nerve fibre
by R and the resistance of each node by r. We shall make use of Ohm’s Law,
which states that
Resistance × Current = Potential difference.
Applying Ohm's Law to the current flow between the nth and (n+1)th nodes
gives

R x_n = E_n − E_{n+1}.   (1)

The leakage of current at the nth node is x_{n−1} − x_n, and is also equal to E_n/r.
That is,

x_{n−1} − x_n = \frac{E_n}{r}.   (2)

Similarly, the leakage at the (n + 1)th node is

x_n − x_{n+1} = \frac{E_{n+1}}{r}.   (3)

Subtracting (3) from (2) and using (1) gives

x_{n−1} − 2x_n + x_{n+1} = \frac{1}{r}(E_n − E_{n+1}) = \frac{R}{r} x_n,

or

x_{n+1} = \left(2 + \frac{R}{r}\right) x_n − x_{n−1}.
The last equation is an example of a second order recurrence relation (differ-
ence equation) with constant coefficients, which would allow us to calculate
the currents x2, x3, . . . . . . if we knew the first two values, x0 and x1.
In order to use our theory about diagonalisation, this single difference equa-
tion is rewritten in a form involving matrices and vectors. The vectors we
need here are different from the “population vectors” that were used in the
Leslie model. In that example, each entry of a vector represented a different
population value at the same time stage, t = n, but here we’re going to
define a vector whose entries represent values of the same quantity (electric
current) at adjacent stages. We define the vector u_n as follows:

u_n = \begin{pmatrix} x_{n+1} \\ x_n \end{pmatrix} for all integers n ≥ 0.

Thus, replacing n by n − 1, we have

u_{n−1} = \begin{pmatrix} x_n \\ x_{n−1} \end{pmatrix}.
We now need a 2× 2 matrix to go with our vectors. We write down an extra
equation which is obviously true, underneath the given recurrence relation,
like this:
x_{n+1} = \left(2 + \frac{R}{r}\right) x_n − x_{n−1}   (original equation)
x_n = x_n   (dummy equation)

Now the pair of equations can be written as a single equation using vectors
and a matrix:

\begin{pmatrix} x_{n+1} \\ x_n \end{pmatrix} = \begin{pmatrix} 2 + R/r & −1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} x_n \\ x_{n−1} \end{pmatrix},

or

u_n = \begin{pmatrix} 2 + R/r & −1 \\ 1 & 0 \end{pmatrix} u_{n−1}.
This is the standard form of a recurrence relation in vector/matrix notation,
un = Aun−1. Although the mathematical modelling was different from the
Leslie population model, we ended up with the same standard form and the
Diagonalisation Theorem can be used in exactly the same way as in the Leslie
model. The next section shows another situation where the same standard
form can be achieved.
8.3 Growth of branches on a tree
A tree with a single branch is planted at time t = 0 and the number of
branches is counted at the beginning of each year. Growth of new branches
is assumed to follow this pattern:
∗ a new branch becomes mature after 1 year,
∗ each mature branch will grow 2 new branches in the following year,
∗ no branches die.
The diagrams show the development of new branches up to the beginning
of year 4. Our aim is to find a formula for the number of branches at the
beginning of year t.

[Diagrams: the branching tree at t = 0, 1, 2, 3, 4.]
Let bt be the number of branches at the beginning of year t. Then
b0 = 1, b1 = 1, b2 = 3, b3 = 5, b4 = 11.
Let us now count the number of branches at time t + 2. Since no branches
die, the total at time t+2 includes every branch present at time t+1, as well
as all new branches which have grown in the year between t + 1 and t + 2.
Using the first two assumptions of growth pattern, we see that
bt+2 = bt+1 + 2bt, for all t ≥ 0.
That is, the growth of branches is determined by a second order linear dif-
ference equation, and as we know the values of b0 and b1 we may calculate
all subsequent values bt. We now rewrite this second order linear difference
equation as a recurrence relation in standard vector/matrix form, following
the same procedure as in the nerve fibre example.
b_{t+2} = b_{t+1} + 2b_t   (original equation)
b_{t+1} = b_{t+1}   (dummy equation)

The vector u_t is defined as

u_t = \begin{pmatrix} b_{t+1} \\ b_t \end{pmatrix}.

Then u_{t+1} = \begin{pmatrix} b_{t+2} \\ b_{t+1} \end{pmatrix} and the pair of equations is rewritten

u_{t+1} = \begin{pmatrix} 1 & 2 \\ 1 & 0 \end{pmatrix} u_t = A u_t,
which is a recurrence relation in standard vector/matrix form. If A is
diagonalisable, then as the Leslie model example shows, the general solution
is

u_t = c_1 λ_1^t v_1 + c_2 λ_2^t v_2,

where λ_1, λ_2 are eigenvalues of A and v_1, v_2 are corresponding eigenvectors.
At this stage, c_1 and c_2 are arbitrary constants.

We shall now find the particular solution. The eigenvalues and corresponding
eigenvectors of A are λ_1 = 2, λ_2 = −1, v_1 = \begin{pmatrix} 2 \\ 1 \end{pmatrix}, v_2 = \begin{pmatrix} −1 \\ 1 \end{pmatrix}. (You should
check these results.) Since the eigenvalues are distinct, we are guaranteed a
basis of R2 consisting of eigenvectors, one from each eigenspace. Hence A is
diagonalisable.
At time t = 0,

u_0 = \begin{pmatrix} b_1 \\ b_0 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} = c_1 v_1 + c_2 v_2.

So

c_1 \begin{pmatrix} 2 \\ 1 \end{pmatrix} + c_2 \begin{pmatrix} −1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix},

or

2c_1 − c_2 = 1, c_1 + c_2 = 1.

Solving these equations simultaneously gives c_1 = 2/3, c_2 = 1/3. Therefore

u_t = \begin{pmatrix} b_{t+1} \\ b_t \end{pmatrix} = \frac{2}{3} 2^t \begin{pmatrix} 2 \\ 1 \end{pmatrix} + \frac{1}{3} (−1)^t \begin{pmatrix} −1 \\ 1 \end{pmatrix}.
This is the particular solution of the recurrence relation in vector/matrix
form. Reading off the second entry in the vectors of both sides, we obtain a
formula for b_t:

b_t = \frac{2}{3} 2^t + \frac{1}{3} (−1)^t.

This is the particular solution of the original scalar recurrence relation. We
can check that this formula gives the same result for t = 4 that was noted in
the branch diagram. Substituting t = 4 gives

b_4 = \frac{2}{3} 2^4 + \frac{1}{3} (−1)^4 = \frac{2}{3} × 16 + \frac{1}{3} = 11,
as expected.
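A short sketch (ours, in Python) checking the closed form against direct iteration of b_{t+2} = b_{t+1} + 2b_t:

# Direct iteration from b0 = b1 = 1.
b = [1, 1]
for t in range(2, 11):
    b.append(b[t - 1] + 2 * b[t - 2])

# Closed form (2/3)2^t + (1/3)(-1)^t, written with exact integer arithmetic.
closed = [(2 * 2**t + (-1)**t) // 3 for t in range(11)]
print(b == closed)   # True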
8.4 The general case
The process of rewriting a linear difference equation in vector/matrix form
as a recurrence relation of order 1 can be applied to difference equations of
any order. As the order increases, the size of the matrix and the length of
the vectors increase as well. For example, for the third order linear difference
equation
xn+3 = axn+2 + bxn+1 + cxn
we use two dummy equations.
xn+3 = axn+2 + bxn+1 + cxn (original equation)
xn+2 = xn+2 (dummy equation)
xn+1 = xn+1 (dummy equation)
With u_n = \begin{pmatrix} x_{n+2} \\ x_{n+1} \\ x_n \end{pmatrix}, we have u_{n+1} = \begin{pmatrix} x_{n+3} \\ x_{n+2} \\ x_{n+1} \end{pmatrix} and hence

u_{n+1} = \begin{pmatrix} a & b & c \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} u_n = A u_n.

This is in standard form and can be solved in the same way as our previous
examples, provided A is diagonalisable. With the usual notation, the general
solution is

u_n = c_1 λ_1^n v_1 + c_2 λ_2^n v_2 + c_3 λ_3^n v_3.
For a linear difference equation of order k,

x_{n+k} = a_1 x_{n+k−1} + a_2 x_{n+k−2} + \dots + a_k x_n,

we use k − 1 dummy equations and obtain

x_{n+k} = a_1 x_{n+k−1} + a_2 x_{n+k−2} + \dots + a_{k−1} x_{n+1} + a_k x_n   (original equation)
x_{n+k−1} = x_{n+k−1}   (dummy equation)
x_{n+k−2} = x_{n+k−2}   (dummy equation)
...
x_{n+1} = x_{n+1}   (dummy equation)

Rewritten in standard form, this gives

\begin{pmatrix} x_{n+k} \\ x_{n+k−1} \\ x_{n+k−2} \\ \vdots \\ x_{n+1} \end{pmatrix} = \begin{pmatrix} a_1 & a_2 & a_3 & \dots & a_{k−1} & a_k \\ 1 & 0 & 0 & \dots & 0 & 0 \\ 0 & 1 & 0 & \dots & 0 & 0 \\ & & & \ddots & & \\ 0 & 0 & 0 & \dots & 1 & 0 \end{pmatrix} \begin{pmatrix} x_{n+k−1} \\ x_{n+k−2} \\ x_{n+k−3} \\ \vdots \\ x_n \end{pmatrix},
or

u_{n+1} = A u_n,

where A is the k × k matrix

A = \begin{pmatrix} a_1 & a_2 & a_3 & \dots & a_{k−1} & a_k \\ 1 & 0 & 0 & \dots & 0 & 0 \\ 0 & 1 & 0 & \dots & 0 & 0 \\ & & & \ddots & & \\ 0 & 0 & 0 & \dots & 1 & 0 \end{pmatrix}

and

u_n = \begin{pmatrix} x_{n+k−1} \\ x_{n+k−2} \\ x_{n+k−3} \\ \vdots \\ x_n \end{pmatrix}.
Suppose that A is diagonalisable, with eigenvalues λ_1, λ_2, ..., λ_k and a
corresponding set of eigenvectors v_1, v_2, ..., v_k which form a basis for R^k.
Then the general solution of the recurrence relation is given by

u_n = c_1 λ_1^n v_1 + c_2 λ_2^n v_2 + \dots + c_k λ_k^n v_k.
It's clear from this formula that the entries of u_n for large n are determined
by the size and sign of the eigenvalues. Suppose that one eigenvalue, say
λ_1, has larger magnitude than all the other eigenvalues. Assuming that the
initial data produces a set of values for c_1, c_2, ..., c_k in which c_1 ≠ 0, then
as

u_n = c_1 λ_1^n v_1 + c_2 λ_2^n v_2 + \dots + c_k λ_k^n v_k
    = λ_1^n \left( c_1 v_1 + c_2 \left(\frac{λ_2}{λ_1}\right)^n v_2 + \dots + c_k \left(\frac{λ_k}{λ_1}\right)^n v_k \right),

we deduce that u_n approaches c_1 λ_1^n v_1 for large n. If λ_1 = 1, then u_n will
approach a stable value, namely c_1 v_1. If |λ_1| < 1, then all entries of u_n will
approach zero. If |λ_1| > 1, then entries of u_n will increase without bound or
will oscillate without bound. These ideas are illustrated by examples.
Example Suppose the populations of two inter-related species are given by
the equations
x_{n+1} = \frac{2}{3} x_n + \frac{1}{6} y_n
y_{n+1} = \frac{1}{3} x_n + \frac{5}{6} y_n,

where x_n and y_n represent the respective populations at time n. Predict the
long-term population of each species if the initial populations are equal.

Writing u_n = \begin{pmatrix} x_n \\ y_n \end{pmatrix}, we see that

u_{n+1} = \begin{pmatrix} 2/3 & 1/6 \\ 1/3 & 5/6 \end{pmatrix} u_n.
The eigenvalues and corresponding eigenvectors of \begin{pmatrix} 2/3 & 1/6 \\ 1/3 & 5/6 \end{pmatrix} are λ_1 = 1
and λ_2 = 1/2, with v_1 = \begin{pmatrix} 1 \\ 2 \end{pmatrix} and v_2 = \begin{pmatrix} −1 \\ 1 \end{pmatrix} (check!). Since the eigenvalues
are distinct, the matrix is diagonalisable and we may write

u_n = c_1 \begin{pmatrix} 1 \\ 2 \end{pmatrix} + c_2 \left(\frac{1}{2}\right)^n \begin{pmatrix} −1 \\ 1 \end{pmatrix}.
Solve for c_1 and c_2 by substituting n = 0:

\begin{pmatrix} 1 \\ 1 \end{pmatrix} = c_1 \begin{pmatrix} 1 \\ 2 \end{pmatrix} + c_2 \begin{pmatrix} −1 \\ 1 \end{pmatrix},

which gives c_1 = 2/3, c_2 = −1/3. Therefore

u_n = \frac{2}{3} \begin{pmatrix} 1 \\ 2 \end{pmatrix} − \frac{1}{3} \left(\frac{1}{2}\right)^n \begin{pmatrix} −1 \\ 1 \end{pmatrix}.

As the eigenvalue of largest magnitude is 1, we expect that u_n will approach
a stable value as n becomes large. Indeed this is the case. For large n,
u_n ≈ \frac{2}{3} \begin{pmatrix} 1 \\ 2 \end{pmatrix}, and so x_n ≈ 2/3, while y_n ≈ 4/3.
Example Repeat the previous example with the same initial data, but this
time with the equations

x_{n+1} = \frac{5}{4} x_n − \frac{1}{4} y_n
y_{n+1} = −\frac{9}{4} x_n + \frac{5}{4} y_n.

Writing u_n = \begin{pmatrix} x_n \\ y_n \end{pmatrix} as before, we have

u_{n+1} = \begin{pmatrix} 5/4 & −1/4 \\ −9/4 & 5/4 \end{pmatrix} u_n.
The eigenvalues and corresponding eigenvectors of \begin{pmatrix} 5/4 & −1/4 \\ −9/4 & 5/4 \end{pmatrix} are λ_1 = 1/2
and λ_2 = 2, with v_1 = \begin{pmatrix} 1 \\ 3 \end{pmatrix} and v_2 = \begin{pmatrix} −2 \\ 6 \end{pmatrix}. Since the eigenvalues are
distinct, the matrix is diagonalisable and we may write

u_n = c_1 \left(\frac{1}{2}\right)^n \begin{pmatrix} 1 \\ 3 \end{pmatrix} + c_2 2^n \begin{pmatrix} −2 \\ 6 \end{pmatrix}.
Solve for c_1 and c_2 by substituting n = 0:

\begin{pmatrix} 1 \\ 1 \end{pmatrix} = c_1 \begin{pmatrix} 1 \\ 3 \end{pmatrix} + c_2 \begin{pmatrix} −2 \\ 6 \end{pmatrix},

which gives c_1 = 2/3, c_2 = −1/6. Therefore

u_n = \frac{2}{3} \left(\frac{1}{2}\right)^n \begin{pmatrix} 1 \\ 3 \end{pmatrix} − \frac{1}{6} 2^n \begin{pmatrix} −2 \\ 6 \end{pmatrix}.

That is,

x_n = \frac{2}{3} \left(\frac{1}{2}\right)^n + \frac{1}{3} 2^n

and

y_n = 2 \left(\frac{1}{2}\right)^n − 2^n.
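Both examples can be checked by iterating u_{n+1} = Au_n directly. A sketch (ours, in Python/NumPy) for the first system, whose populations settle down:

import numpy as np

A = np.array([[2/3, 1/6],
              [1/3, 5/6]])
u = np.array([1.0, 1.0])
for _ in range(50):             # 50 steps; (1/2)^50 is negligible
    u = A @ u
print(u)                        # approx (2/3, 4/3)

# The same limit read off from the general solution: c1*v1 with c1 = 2/3.
print(2/3 * np.array([1.0, 2.0]))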
8.5 Summary
You must know and understand these results:
• The standard form of a linear recurrence relation in vector/matrix form
is u_n = A u_{n−1}, where A is a k × k matrix. The definition of u_n depends
upon the particular type of problem being solved. Sometimes the entries
of u_n represent values of particular quantities at the same time stage (for
example u_n = (x_n, y_n, z_n)^T), and sometimes they represent values of a
quantity at adjacent stages (for example u_n = (x_{n+2}, x_{n+1}, x_n)^T).
• If A is a diagonalisable k × k matrix, the general solution of u_n = A u_{n−1}
is

u_n = c_1 λ_1^n v_1 + c_2 λ_2^n v_2 + \dots + c_k λ_k^n v_k,

where the eigenvalues of A are λ_1, λ_2, ..., λ_k and v_1, v_2, ..., v_k are
corresponding eigenvectors of A which form a basis of R^k.
• A particular solution of u_n = A u_{n−1} can be obtained by setting n = 0
in the general solution, and using a given initial vector u_0. This enables us to
evaluate the arbitrary constants c_1, c_2, ..., c_k in the general solution.
You must be able to apply these techniques:
• Find an appropriate definition of un in order to express a recurrence
relation in standard form. This also involves finding a matrix A such that
un = Aun−1.
• Determine whether A is diagonalisable, and if so, find the general
solution of un = Aun−1.
• Given appropriate initial conditions, determine the values of the arbi-
trary constants in the general solution, and hence obtain a particular solution
of un = Aun−1.
• Interpret the solution in terms of the original problem (for example
make predictions about long term values of the variables).
Practice Examples
1. Initially there is only one supermarket (X) in a town and everyone
shops there. A new supermarket (Y) opens (at time t = 0) and after
one month half the population shops at Y and half shops at X. Each
month, X loses half its customers to Y and Y loses a quarter of its
customers to X. Model this problem as a linear recurrence relation for
the purpose of finding the proportion of people who shop at X and
Y after month n. State clearly the vectors and matrix used in your
standard form. Find the particular solution and give a prediction about
how many people will shop at X and Y in the long term. (You may
assume that the population is constant.)
2. Find the particular solution of the third order linear difference equation
xn+3 = xn+2 + 4xn+1 − 4xn
with initial conditions x0 = 6, x1 = 1, x2 = 3.
Answers
1. If xn and yn represent the proportions of the population shopping after
month n at X and Y respectively, then

\begin{pmatrix} x_n \\ y_n \end{pmatrix} = \begin{pmatrix} 1/2 & 1/4 \\ 1/2 & 3/4 \end{pmatrix} \begin{pmatrix} x_{n−1} \\ y_{n−1} \end{pmatrix}.
In standard form, we have u_n = A u_{n−1} where A = \begin{pmatrix} 1/2 & 1/4 \\ 1/2 & 3/4 \end{pmatrix}. The
particular solution corresponding to initial conditions x_0 = 1, y_0 = 0 is

u_n = \frac{1}{3} \begin{pmatrix} 1 \\ 2 \end{pmatrix} − \frac{2}{3} \left(\frac{1}{4}\right)^n \begin{pmatrix} −1 \\ 1 \end{pmatrix}.
In the long term, X has 1/3 of the shoppers and Y has 2/3.
2. The problem can be written

\begin{pmatrix} x_{n+3} \\ x_{n+2} \\ x_{n+1} \end{pmatrix} = \begin{pmatrix} 1 & 4 & −4 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x_{n+2} \\ x_{n+1} \\ x_n \end{pmatrix}.

This leads to the general solution

\begin{pmatrix} x_{n+2} \\ x_{n+1} \\ x_n \end{pmatrix} = c_1 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + c_2 2^n \begin{pmatrix} 4 \\ 2 \\ 1 \end{pmatrix} + c_3 (−2)^n \begin{pmatrix} 4 \\ −2 \\ 1 \end{pmatrix}.

The particular solution required is

\begin{pmatrix} x_{n+2} \\ x_{n+1} \\ x_n \end{pmatrix} = 7 \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} − 2(2^n) \begin{pmatrix} 4 \\ 2 \\ 1 \end{pmatrix} + (−2)^n \begin{pmatrix} 4 \\ −2 \\ 1 \end{pmatrix},

that is, x_n = 7 − 2^{n+1} + (−2)^n.
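A sketch (ours, in Python) checking this answer against the original third order difference equation:

# x_{n+3} = x_{n+2} + 4x_{n+1} - 4x_n with x0 = 6, x1 = 1, x2 = 3.
x = [6, 1, 3]
for n in range(3, 11):
    x.append(x[n - 1] + 4 * x[n - 2] - 4 * x[n - 3])

closed = [7 - 2**(n + 1) + (-2)**n for n in range(11)]
print(x == closed)   # True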
Chapter 9
Systems of Differential Equations
9.1 Introduction
Suppose that x_1(t), x_2(t), ..., x_n(t) are differentiable functions of a single
variable t and that for any t, the derivative of each function is a linear
combination of all these functions. We then have what is known as a system
of linear differential equations,

\frac{dx_1}{dt} = a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n
\frac{dx_2}{dt} = a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n
\vdots
\frac{dx_n}{dt} = a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n.

Such systems arise often when problems in science and engineering are
modelled mathematically. In this chapter we shall see how the Diagonalisation
Theorem can be used to find explicit formulas for x_1(t), x_2(t), ..., x_n(t).
As in the previous chapter on linear recurrence relations, the first step is to
re-write the system in vector/matrix notation.
Define vector X(t) and its derivative vector X′(t) as follows:

X(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{pmatrix}, X′(t) = \begin{pmatrix} x′_1(t) \\ x′_2(t) \\ \vdots \\ x′_n(t) \end{pmatrix}.
Then if A is defined to be the n× n matrix with (i, j) entry aij , the system
can be rewritten as a single equation involving vectors and a matrix,
X′(t) = AX(t),
or X′ = AX for simplicity.
9.2 Solving using diagonalisation
Suppose that we write
S = {X | X′ = AX},
that is, S is the set of all solutions of the system. The first observation is
that S is not empty, as if Z(t) = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} for all t, then Z′(t) = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} for all t
and Z′(t) = AZ(t). Therefore Z is a member of S. Further, if X and Y are
both in S, then
X′ = AX, Y′ = AY
and so
(X+Y)′ = X′ +Y′ = AX+ AY = A(X+Y).
This shows that X+Y is also a solution; that is, S is closed under addition.
Furthermore, for all scalars k,
(kX)′ = kX′ = kAX = A(kX).
Therefore kX is a solution for all scalars k, and S is closed under multipli-
cation by a scalar. Hence S is a subspace. To find all solutions (that is, all
members of S), we obviously seek a basis for S. Before we tackle this prob-
lem in its most general setting, we shall look at a small system (3 differential
equations involving 3 unknown functions x1, x2, x3) where the matrix is in
extremely simple form, namely a diagonal matrix.
Example Suppose that X′ = AX, with A = \begin{pmatrix} 3 & 0 & 0 \\ 0 & −2 & 0 \\ 0 & 0 & 5 \end{pmatrix}. Then

\begin{pmatrix} x′_1 \\ x′_2 \\ x′_3 \end{pmatrix} = \begin{pmatrix} 3 & 0 & 0 \\ 0 & −2 & 0 \\ 0 & 0 & 5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.
Recovering the original differential equations gives

x′_1(t) = 3x_1(t), so that x_1(t) = c_1 e^{3t},
x′_2(t) = −2x_2(t), so that x_2(t) = c_2 e^{−2t},
x′_3(t) = 5x_3(t), so that x_3(t) = c_3 e^{5t},

where c_1, c_2, c_3 are arbitrary constants. These are the only solutions possible.
For if x′_1(t) = 3x_1(t), then for all t,

\frac{d}{dt}\left(x_1(t)e^{−3t}\right) = x′_1(t)e^{−3t} − 3x_1(t)e^{−3t} = 3x_1(t)e^{−3t} − 3x_1(t)e^{−3t} = 0.

Therefore x_1(t)e^{−3t} must be a constant, say c_1, which gives x_1(t) = c_1 e^{3t}.
Similar calculations can be made for x_2(t) and x_3(t).
We say that the vector

X = \begin{pmatrix} c_1 e^{3t} \\ c_2 e^{−2t} \\ c_3 e^{5t} \end{pmatrix}

is the general solution of the system

\begin{pmatrix} x′_1 \\ x′_2 \\ x′_3 \end{pmatrix} = \begin{pmatrix} 3 & 0 & 0 \\ 0 & −2 & 0 \\ 0 & 0 & 5 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.

Observe that we may rewrite the general solution in the following way:

X = \begin{pmatrix} c_1 e^{3t} \\ c_2 e^{−2t} \\ c_3 e^{5t} \end{pmatrix} = c_1 e^{3t} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + c_2 e^{−2t} \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + c_3 e^{5t} \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.
This shows that every solution is a linear combination of the vectors

e^{3t} \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, e^{−2t} \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} and e^{5t} \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix},

which are in fact the basis vectors of S (the set of all solutions). The matrix A
has eigenvalues 3, −2 and 5 and eigenvectors \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} and \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. That is,
the basis vectors of S involve the eigenvalues and eigenvectors of A together
with the exponential function.
What happens in the general case? The matrix A will not necessarily be 3×3
and it is not likely to be a diagonal matrix such as in the previous example.
However it may be diagonalisable. Suppose that A is an n× n matrix which
is diagonalisable. First find its eigenvalues (which may not necessarily be
distinct)
λ1, λ2, . . . . . . , λn
and a corresponding set of eigenvectors
v1, v2, . . . . . . ,vn
which form a basis for Rn. Construct the matrix P which diagonalises A and
the associated diagonal matrix D, as we have done in the past.
P = [ v_1 v_2 \dots v_n ], D = \begin{pmatrix} λ_1 & 0 & \dots & 0 \\ 0 & λ_2 & \dots & 0 \\ & & \ddots & \\ 0 & 0 & \dots & λ_n \end{pmatrix}.


Then A = PDP^{−1} and as X′ = AX, we obtain X′ = PDP^{−1}X, or

P^{−1}X′ = DP^{−1}X.

We now make a change of notation, writing Y = P^{−1}X. Then since both P
and its inverse P^{−1} are just matrices with constant entries, we have
Y′ = P^{−1}X′.
Therefore Y′ = DY, or

\begin{pmatrix} y′_1(t) \\ y′_2(t) \\ \vdots \\ y′_n(t) \end{pmatrix} = \begin{pmatrix} λ_1 & 0 & \dots & 0 \\ 0 & λ_2 & \dots & 0 \\ & & \ddots & \\ 0 & 0 & \dots & λ_n \end{pmatrix} \begin{pmatrix} y_1(t) \\ y_2(t) \\ \vdots \\ y_n(t) \end{pmatrix}.
When multiplied out, this single vector/matrix equation gives us the linear
system

y′_1(t) = λ_1 y_1(t)
y′_2(t) = λ_2 y_2(t)
\vdots
y′_n(t) = λ_n y_n(t).

The diagonalisation of A has transformed the original linear system of
differential equations into a much simpler system in the new unknowns
y_1(t), y_2(t), ..., y_n(t). The process of diagonalising has "de-coupled" the
original equations and will enable us to see the part played by exponential
functions in the solution. Just as in the 3 × 3 example, the general solution
of Y′ = DY is

Y = \begin{pmatrix} y_1(t) \\ y_2(t) \\ \vdots \\ y_n(t) \end{pmatrix} = \begin{pmatrix} c_1 e^{λ_1 t} \\ c_2 e^{λ_2 t} \\ \vdots \\ c_n e^{λ_n t} \end{pmatrix},


where c1, c2, . . . . . . , cn are arbitrary constants.
We now return to the original unknown X and find the general solution. As
Y = P^{−1}X, we see that X = PY, and therefore

X = [ v_1 v_2 \dots v_n ] \begin{pmatrix} c_1 e^{λ_1 t} \\ c_2 e^{λ_2 t} \\ \vdots \\ c_n e^{λ_n t} \end{pmatrix},

that is,

X = c_1 e^{λ_1 t} v_1 + c_2 e^{λ_2 t} v_2 + \dots + c_n e^{λ_n t} v_n.

The vector X given by this formula is the general solution of X′ = AX.
Every solution X is a linear combination of the basis vectors

e^{λ_1 t} v_1, e^{λ_2 t} v_2, \dots, e^{λ_n t} v_n,

which are guaranteed to be linearly independent, as

{ v_1, v_2, \dots, v_n }
is a linearly independent set. If we know the initial conditions, that is, the
values of x1(0), x2(0), . . . . . . , xn(0), we can find values of c1, c2, . . . . . . , cn
and hence a particular solution of X′ = AX. We illustrate with an example.
Example Find the solution of the linear system of differential equations
\frac{dx_1}{dt} = 3x_1 + 4x_2
\frac{dx_2}{dt} = 3x_1 + 2x_2
satisfying the initial conditions x1(0) = 6, x2(0) = 1.
The system is written X′ = AX, where X = \begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix} and A = \begin{pmatrix} 3 & 4 \\ 3 & 2 \end{pmatrix}. The
matrix A has eigenvalues λ_1 = 6 (with corresponding eigenvector v_1 = \begin{pmatrix} 4 \\ 3 \end{pmatrix})
and λ_2 = −1 (with corresponding eigenvector v_2 = \begin{pmatrix} 1 \\ −1 \end{pmatrix}). Therefore the
general solution is

X = c_1 e^{6t} \begin{pmatrix} 4 \\ 3 \end{pmatrix} + c_2 e^{−t} \begin{pmatrix} 1 \\ −1 \end{pmatrix}.
Set t = 0 in the general solution and use the given initial conditions:

\begin{pmatrix} 6 \\ 1 \end{pmatrix} = c_1 \begin{pmatrix} 4 \\ 3 \end{pmatrix} + c_2 \begin{pmatrix} 1 \\ −1 \end{pmatrix}.

Equating corresponding entries on each side gives the simultaneous equations

4c_1 + c_2 = 6, 3c_1 − c_2 = 1,

whose solution is c_1 = 1, c_2 = 2. Therefore the particular solution satisfying
the given initial conditions is

X = e^{6t} \begin{pmatrix} 4 \\ 3 \end{pmatrix} + 2e^{−t} \begin{pmatrix} 1 \\ −1 \end{pmatrix},

or

x_1(t) = 4e^{6t} + 2e^{−t},
x_2(t) = 3e^{6t} − 2e^{−t}.
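This particular solution can be verified numerically. Below is a sketch (ours, in Python/SciPy) using the standard fact, not needed elsewhere in these notes, that the solution of X′ = AX is X(t) = e^{At}X(0), where e^{At} is the matrix exponential:

import numpy as np
from scipy.linalg import expm

A  = np.array([[3.0, 4.0],
               [3.0, 2.0]])
X0 = np.array([6.0, 1.0])

t = 0.7                                 # any sample time will do
formula = np.exp(6*t) * np.array([4.0, 3.0]) + 2*np.exp(-t) * np.array([1.0, -1.0])
print(np.allclose(formula, expm(A * t) @ X0))   # True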
Example Find the general solution of the linear system of differential
equations

\frac{dx_1}{dt} = 2x_2 + x_3
\frac{dx_2}{dt} = −2x_1 + 4x_2 + x_3
\frac{dx_3}{dt} = −2x_1 + 2x_2 + 3x_3.

The system is written X′ = AX, where X = \begin{pmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{pmatrix} and A = \begin{pmatrix} 0 & 2 & 1 \\ −2 & 4 & 1 \\ −2 & 2 & 3 \end{pmatrix}.

This matrix was discussed as Example B in the second section of the chapter
on eigenvalues and eigenvectors. It has eigenvalues λ_1 = 3 (with corresponding
eigenvector v_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}) and λ_2 = 2, a repeated root of the characteristic
equation. The 2-eigenspace has dimension 2, and so A is diagonalisable. The
corresponding (linearly independent) eigenvectors for λ_2 = 2 are v_2 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}
and v_3 = \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix}. Therefore the general solution of the system of DEs is

X = c_1 e^{3t} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + c_2 e^{2t} \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c_3 e^{2t} \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix}.
The next sections will illustrate various situations where this technique for
solving systems of linear differential equations can be applied.
9.3 The mixing problem
The diagram shows two tanks of capacity 100 litres. Initially, tank T1 contains
100 litres of pure water and tank T2 contains 100 litres of pure water in which
5 kg of fertiliser has been dissolved. The tanks are connected by pipes which
allow liquid to flow between them at a rate of 2 litres per minute, and the
mixtures are kept uniform by stirring. The problem is to find the amounts
of fertiliser (in kilograms) in each of the tanks at time t minutes.

[Diagram: tanks T1 and T2 joined by two pipes, each carrying 2 litres/min.]
Let x(t) and y(t) denote the number of kilograms of fertiliser in T1 and T2
respectively. Considering T1 first, we see that dx/dt is equal to the rate of
inflow of fertiliser minus the rate of outflow of fertiliser. That is,

\frac{dx}{dt} = \frac{2}{100} y(t) − \frac{2}{100} x(t) = −0.02x + 0.02y.

Similarly for T2,

\frac{dy}{dt} = \frac{2}{100} x(t) − \frac{2}{100} y(t) = 0.02x − 0.02y.

We now have a system of linked differential equations which can be written
as a single equation in vector/matrix form. Let X = \begin{pmatrix} x(t) \\ y(t) \end{pmatrix}. Then

X′ = \begin{pmatrix} −0.02 & 0.02 \\ 0.02 & −0.02 \end{pmatrix} X = AX.
Using the theory developed in the previous section, we solve this by finding
eigenvalues and eigenvectors of A. First, we find the eigenvalues.
|A − λI| = \begin{vmatrix} −0.02 − λ & 0.02 \\ 0.02 & −0.02 − λ \end{vmatrix} = (λ + 0.02)^2 − 0.0004 = λ^2 + 0.04λ = λ(λ + 0.04).
Hence A has eigenvalues 0 and −0.04, and as the eigenvalues are distinct, A
is diagonalisable.
Now we must find the corresponding eigenvectors.
When λ = 0, we solve (A − 0I)v = 0 to obtain an eigenvector v. A couple
of simple EROs leads from the initial augmented matrix to the following row
echelon form:

\begin{pmatrix} −0.02 & 0.02 & 0 \\ 0.02 & −0.02 & 0 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & −1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.

Hence the 0-eigenspace has basis \left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix} \right\}.

When λ = −0.04, we solve (A + 0.04I)v = 0 to obtain an eigenvector v.
Applying EROs leads from the initial augmented matrix to the following row
echelon form:

\begin{pmatrix} 0.02 & 0.02 & 0 \\ 0.02 & 0.02 & 0 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}.

Hence the −0.04-eigenspace has basis \left\{ \begin{pmatrix} −1 \\ 1 \end{pmatrix} \right\}.
The general solution of the system is therefore

X = c_1 e^{0t} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + c_2 e^{−0.04t} \begin{pmatrix} −1 \\ 1 \end{pmatrix} = c_1 \begin{pmatrix} 1 \\ 1 \end{pmatrix} + c_2 e^{−0.04t} \begin{pmatrix} −1 \\ 1 \end{pmatrix}.

To find the values of c_1 and c_2 we use the initial conditions, x(0) = 0 and
y(0) = 5. Setting t = 0 in the general solution then gives

\begin{pmatrix} 0 \\ 5 \end{pmatrix} = c_1 \begin{pmatrix} 1 \\ 1 \end{pmatrix} + c_2 \begin{pmatrix} −1 \\ 1 \end{pmatrix}

and solving for c_1 and c_2 we find c_1 = c_2 = 2.5. The particular solution is
therefore

X = 2.5 \begin{pmatrix} 1 \\ 1 \end{pmatrix} + 2.5 e^{−0.04t} \begin{pmatrix} −1 \\ 1 \end{pmatrix},
that is, x(t) = 2.5(1 − e^{−0.04t}) and y(t) = 2.5(1 + e^{−0.04t}). As expected from
the physical situation, these equations confirm that for large values of t, the
original 5 kg of fertiliser in T2 becomes shared equally between the two tanks,
as e^{−0.04t} approaches zero as t approaches ∞. Hence x(t) and y(t) both
approach 2.5. This is illustrated in the graphs of x and y plotted against t.
[Graph: x(t) rises from 0 towards 2.5 and y(t) falls from 5 towards 2.5 as t increases.]
9.4 Populations of interlinked species
Suppose that x(t) and y(t) denote the population sizes of two interacting
species at time t, and that
\frac{dx}{dt} = ax + by
\frac{dy}{dt} = cx + dy,
where a, b, c, d are constants. In general, we may think of a and d as natural
birth rates (a > 0, d > 0). How can the values of b and c help us to interpret
the interaction between the species?
Competing Species This situation corresponds to b and c both being
negative. For example, if dx/dt = ax + by and b < 0, then an increase in y
will cause a decrease in dx/dt, that is, the growth of x is slowed. Similarly, if
dy/dt = cx + dy and c < 0, then an increase in x will cause a decrease in dy/dt,
that is, the growth of y is slowed. Increasing the number of one species slows
the growth of the other.
Symbiotic Relationship This situation corresponds to b and c both being
positive. An increase in the population of one species causes an increase in
the growth of the other species.
Predator-Prey Relationship This situation corresponds to b and c having
opposite sign. Suppose for example that b < 0 and c > 0. Then from the
equation

\frac{dx}{dt} = ax + by,
we see that an increase in y will cause a decrease in the growth of x (since
b < 0), while from the equation

\frac{dy}{dt} = cx + dy,
we see that an increase in x will cause an increase in the growth of y (as
c > 0). With this arrangement of values for b and c, the species whose
population is x(t) is the prey species, and the species whose population is
y(t) is the predator species.
Example Solve the 'competing species' system

\frac{dx}{dt} = 3x − y
\frac{dy}{dt} = −2x + 2y

with initial conditions x(0) = 90, y(0) = 150.

Let X = \begin{pmatrix} x \\ y \end{pmatrix} and we then have X′ = AX = \begin{pmatrix} 3 & −1 \\ −2 & 2 \end{pmatrix} X.

Find the eigenvalues of A:

|A − λI| = \begin{vmatrix} 3 − λ & −1 \\ −2 & 2 − λ \end{vmatrix} = λ^2 − 5λ + 4 = (λ − 1)(λ − 4).
Hence A has eigenvalues 1 and 4; because the eigenvalues are distinct, A is
diagonalisable.
When λ = 1, we solve (A − I)v = 0 to obtain an eigenvector v. Applying
EROs in the usual way to the augmented matrix:

\begin{pmatrix} 2 & −1 & 0 \\ −2 & 1 & 0 \end{pmatrix} \longrightarrow \begin{pmatrix} 1 & −1/2 & 0 \\ 0 & 0 & 0 \end{pmatrix}.

Therefore \left\{ \begin{pmatrix} 1 \\ 2 \end{pmatrix} \right\} is a basis for the 1-eigenspace. In a similar way we find
that \left\{ \begin{pmatrix} −1 \\ 1 \end{pmatrix} \right\} is a basis for the 4-eigenspace. The general solution is

X = c_1 e^{t} \begin{pmatrix} 1 \\ 2 \end{pmatrix} + c_2 e^{4t} \begin{pmatrix} −1 \\ 1 \end{pmatrix}.
With initial conditions x(0) = 90, y(0) = 150, setting t = 0 in the general
solution gives

\begin{pmatrix} 90 \\ 150 \end{pmatrix} = c_1 \begin{pmatrix} 1 \\ 2 \end{pmatrix} + c_2 \begin{pmatrix} −1 \\ 1 \end{pmatrix}.

Solving for c_1 and c_2 then gives c_1 = 80, c_2 = −10. Our particular solution is

X = 80 e^{t} \begin{pmatrix} 1 \\ 2 \end{pmatrix} − 10 e^{4t} \begin{pmatrix} −1 \\ 1 \end{pmatrix}.

In terms of the original variables x and y, we have

x(t) = 80e^{t} + 10e^{4t} and y(t) = 160e^{t} − 10e^{4t}.
Observe that y(t) = 0 when 160 = 10e^{3t}, that is, when t = \frac{\ln 16}{3} ≈ 0.92. The
model suggests that the species whose population is y(t) dies out at time
0.92. It would therefore be pointless to use the original equations to predict
population sizes for values of t greater than 0.92.
9.5 The diffusion equation
An infinite pipe contains 4 segments, two of finite (equal) length and the two
at the ends semi-infinite. The segments are divided by porous membranes.
[Diagram: an infinite pipe divided into Segments 1–4 by porous membranes.]
At time t = 0, segments 2 and 3 contain chemical solution or gases, in concen-
trations v(0) and w(0) respectively. At all times t, the average concentrations
in segments 1 and 4 are zero, since they have infinite volume.
Diffusion starts at t = 0, and the diffusion rate between adjacent segments
of the pipe is equal to the difference in concentration between the segments.
The aim is to find formulas for v(t) and w(t), the concentrations in segments
2 and 3 respectively, at time t.
The concentration in segment 2 is changing in two ways: by diffusion into
segment 1 (given by 0 − v) and by diffusion into and out of segment 3 (given
by w − v). The net rate of change of v(t) is therefore

\frac{dv}{dt} = (0 − v) + (w − v) = −2v + w.

Similarly,

\frac{dw}{dt} = (0 − w) + (v − w) = v − 2w.

We rewrite this system in the usual way using vector/matrix notation. If
X = \begin{pmatrix} v(t) \\ w(t) \end{pmatrix} then X′ = \begin{pmatrix} v′(t) \\ w′(t) \end{pmatrix} and

X′ = \begin{pmatrix} −2 & 1 \\ 1 & −2 \end{pmatrix} X = AX.
We find that eigenvalues of A are λ_1 = −3 and λ_2 = −1, with corresponding
eigenvectors v_1 = \begin{pmatrix} −1 \\ 1 \end{pmatrix} and v_2 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} (you should check these results). The
matrix A is therefore diagonalisable and we may apply the theory derived
earlier in this chapter. The general solution of the system is

X = c_1 e^{−3t} \begin{pmatrix} −1 \\ 1 \end{pmatrix} + c_2 e^{−t} \begin{pmatrix} 1 \\ 1 \end{pmatrix}.

Equations for v(t) and w(t) are then found to be

v(t) = −c_1 e^{−3t} + c_2 e^{−t}

and

w(t) = c_1 e^{−3t} + c_2 e^{−t}.
Note that as t → ∞, both v(t) and w(t) approach zero, as expected.

With initial conditions v(0) = 3, w(0) = 5, we obtain

−c_1 + c_2 = 3
c_1 + c_2 = 5.

Thus c_2 = 4, c_1 = 1 and

v(t) = −e^{−3t} + 4e^{−t}, w(t) = e^{−3t} + 4e^{−t}.
The graphs of v and w against t are shown in the diagram.
[Graph: v(t) and w(t) plotted against t, both decaying towards zero from their initial values 3 and 5.]
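A sketch (ours, in Python/SciPy) checking these formulas for v(t) and w(t) against the matrix exponential, as in the earlier check:

import numpy as np
from scipy.linalg import expm

A  = np.array([[-2.0,  1.0],
               [ 1.0, -2.0]])
X0 = np.array([3.0, 5.0])               # v(0) = 3, w(0) = 5

t = 1.3
v = -np.exp(-3*t) + 4*np.exp(-t)
w =  np.exp(-3*t) + 4*np.exp(-t)
print(np.allclose([v, w], expm(A * t) @ X0))    # True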
9.6 Summary
You must know and understand the following results:
• The system of first order linear differential equations with constant
coefficients

\frac{dx_1}{dt} = a_{11}x_1 + a_{12}x_2 + \dots + a_{1n}x_n
\frac{dx_2}{dt} = a_{21}x_1 + a_{22}x_2 + \dots + a_{2n}x_n
\vdots
\frac{dx_n}{dt} = a_{n1}x_1 + a_{n2}x_2 + \dots + a_{nn}x_n

is rewritten as a single equation in vector/matrix form as X′ = AX, where

X(t) = \begin{pmatrix} x_1(t) \\ x_2(t) \\ \vdots \\ x_n(t) \end{pmatrix}, X′(t) = \begin{pmatrix} x′_1(t) \\ x′_2(t) \\ \vdots \\ x′_n(t) \end{pmatrix},

and A is the n × n matrix with (i, j) entry a_{ij}.
• If the matrix A is diagonalisable, the general solution of X′ = AX is

X = c_1 e^{λ_1 t} v_1 + c_2 e^{λ_2 t} v_2 + \dots + c_n e^{λ_n t} v_n,

where λ_1, λ_2, ..., λ_n are eigenvalues of A and v_1, v_2, ..., v_n are
corresponding eigenvectors which form a basis for Rn.
• The particular solution is obtained by finding the values of the arbitrary
constants c1, c2, . . . . . . , cn. This is done by setting t = 0 in the general
solution and using a given set of initial conditions, x1(0), x2(0), . . . . . . , xn(0).
You must be able to apply the following techniques:
• Write a given system of first order linear differential equations in vector/matrix
form, defining the variables you are using.
• Determine whether or not A is diagonalisable by finding its eigenvalues
and corresponding eigenvectors.
• If A is diagonalisable, write down the general solution and use it to
determine a particular solution given a set of initial conditions.
• Interpret the solution you have obtained in terms of the physical setting
of the original problem.
Practice Examples
1. Two interlinked species have populations x(t) and y(t) at time t. The
governing equations are
\frac{dx}{dt} = 5x + 8y
\frac{dy}{dt} = 4x + y
Find the populations at time t when the initial populations (measured
in thousands) are x(0) = 1, y(0) = 2. Are the two species in competi-
tion, in a predator-prey situation or in a symbiotic relationship? What
is the proportion of x(t) to y(t) when t is large?
2. Find the particular solution of the linear system
\frac{dx}{dt} = 2x + y + z
\frac{dy}{dt} = x + 3y + 3z
\frac{dz}{dt} = 3x + 2y + 2z
when the initial conditions are x(0) = 1, y(0) = −2, z(0) = 1.
Answers
1. The particular solution is
x(t) = 2e^{9t} − e^{−3t}, y(t) = e^{9t} + e^{−3t}.
The species are in a symbiotic relationship. For large t, x(t) is double
the value of y(t), the reverse of the situation when t = 0.
2. The particular solution is
x(t) = e^{t}, y(t) = −2e^{t}, z(t) = e^{t}.
Index
λ-eigenspace, 83
addition of functions, 13
addition of vectors, 10
age class, 98
associative law of addition, 10
axioms of a vector space, 10
back substitution, 4
basis, 38
building a basis, 43
of a vector space, 38
of column space of a matrix, 58
of null space of a matrix, 56
Basis Theorem, 41
characteristic equation, 83
closure under addition, 10
closure under multiplication by a scalar, 11
column space of a matrix, 27, 29, 55
basis of, 58
dimension of, 59
commutative law, 11
competing species, 124
composite of linear transformations, 75
consistent linear system, 7, 26
current flow in nerve fibre, 104
current leakage, 104
data matrix, 76
definition
λ-eigenspace, 83
basis of a vector space, 38
characteristic equation, 83
column space of a matrix, 27
dimension of a vector space, 41
eigenvalue of a matrix, 82
eigenvector of a matrix, 82
Lagrange basis of Pn, 51
linear combination, 22
linear difference equation, 97
linear recurrence relation, 97
linear transformation, 63
linearly dependent set, 34
linearly independent set, 36
matrix transformation, 67
null space of a matrix, 21
nullity of a matrix, 59
rank of a matrix, 59
span of a set of vectors, 23
standard basis of Pn, 38
standard basis of Rn, 39
subspace of a vector space, 14
vector space over R, 10
determinant, 83
diagonal matrix, 93
powers of, 93
diagonalisable matrix, 92
Diagonalisation Theorem, 91
diagonalising matrix, 92
diffusion equation, 126
dimension of a vector space, 41
dimension of column space of a matrix, 59
dimension of null space of a matrix, 57
Dimension Theorem for Matrices, 59
dummy equation, 105, 108, 109
eigenspaces, 82
eigenvalue(s), 82
distinct, 84
repeated, 86, 87
eigenvector, 82
exact fit polynomial, 54
fitting a polynomial to a data set, 54
fixed lines
reflection, 81
rotation, 81
scaling, 81
shear, 81
free variable, 3
function space, 12
Gaussian elimination, 3
general reflection matrix, 72
general rotation matrix, 73
general solution
of recurrence relation, 101
of system of linear DEs, 119
growth of branches on a tree, 106
initial population data, 101
Lagrange basis of Pn, 51
properties of, 51
Lagrange polynomials, 49
leading 1s, 3
Leslie matrix, 99
Leslie population model, 98
linear combination, 2, 22, 28
linear dependence, 34
linear independence, 36
linear recurrence relation, 97
general solution of, 101
in standard form, 99
in vector/matrix form, 98
of order 2, 97
of order 3, 98
particular solution of, 101
linear system, 1
linear transformation, 63
composite of, 75
eigenspaces of, 82
eigenvectors of, 82
testing procedure, 64
linearly dependent set, 34
linearly independent set, 36
matrix
column space of, 27, 29
data matrix, 76
diagonal, 93
diagonalisable, 92
Leslie, 99
null space of, 21, 28
nullity of, 59
powers of, 93
rank of, 59
reflection matrix, 72
rotation matrix, 73
scaling matrix, 75
shear matrix, 73
matrix of a reflection, 72
matrix of a rotation, 73
matrix of a scaling, 75
matrix of a shear, 73
matrix transformation, 67
eigenspaces of, 82
eigenvectors of, 82
fixed lines, 81
mixing problem, 121
multiplication of a function by a scalar, 13
multiplication of a vector by a scalar, 10
myelin sheath, 104
natural birth rates, 124
negative of a vector, 11
nerve fibres, 104
nodes of Ranvier, 104
null space of a matrix, 21, 28, 55
basis of, 56
dimension of, 57
spanning set for, 56
Ohm’s Law, 104
parameter, 3
particular solution of Ax=b, 9
particular solution of recurrence relation, 101
particular solution of system of linear DEs, 120
pivot positions, 3
population vector, 99
populations of interlinked species, 124
powers of a diagonal matrix, 93
powers of a matrix, 93
predator-prey relationship, 124
preserves addition, 64
preserves multiplication by a scalar, 64
recurrence relation
in standard form, 99
particular solution in scalar form, 108
particular solution in vector form, 108
reflection in the x axis, 70
reflection in the y axis, 70
reflection in the line y = x, 71
rotation through θ about origin, 73
scaling matrix, 75
shear along the x axis, 73
shear matrix, 73
solution of homogeneous system, 10
solving linear systems, 1
span of a set of vectors, 23, 29
standard basis of Pn, 38
standard basis of Rn, 39
standard form of a recurrence relation, 99
subspace of a vector space, 9, 14, 19
testing procedure, 14, 19
symbiotic relationship, 124
system of linear DEs, 115
de-coupled system, 119
general solution, 129
general solution of, 119
particular solution, 120, 129
vector/matrix form, 116
system of linear equations, 1
techniques for solving linear systems, 3
vector space, 9, 10, 19
Mm,n, 13
C, 13
F, 12
Rn, 11
basis, 38
dimension, 41
weights, 2
zero vector, 11
