Diploma Program
MA1131 Mathematics 1A
ALGEBRA NOTES
CRICOS Provider No: 00098G ©2020 UNSW Sydney

Preface
Please read carefully.
These Notes form the basis for the algebra strand of MA1131. However, not all of the material in
these Notes is included in the MA1131 algebra syllabus. In particular, any material marked [X] is
non-assessable. A detailed syllabus is given, commencing on page ix of these Notes.
In using these Notes, you should remember the following points:
1. Most courses at university present new material at a faster pace than you will have been
accustomed to in high school, so it is essential that you start working right from the beginning
of the session and continue to work steadily throughout the session. Make every effort to keep
up with the lectures and to do problems relevant to the current lectures.
2. These Notes are not intended to be a substitute for attending lectures or tutorials. The
lectures will expand on the material in the notes and help you to understand it.
3. These Notes may seem to contain a lot of material but not all of this material is equally
important. One aim of the lectures will be to give you a clearer idea of the relative importance
of the topics covered in the Notes.
4. Use the tutorials for the purpose for which they are intended, that is, to ask questions about
both the theory and the problems being covered in the current lectures.
5. Some of the material in these Notes is more difficult than the rest. This extra material is
marked with the symbol [H].
6. Problems marked with [V] have a video solution available from Moodle.
7. It is essential for you to do problems which are given at the end of each chapter. If you
find that you do not have time to attempt all of the problems, you should at least attempt
a representative selection of them. The problems set in tests and exams will be similar to
the problems given in these notes. Further information on the problems and class tests is on
pages x and 226.
8. You will be expected to use the computer algebra package Maple in tests and understand
Maple syntax and output for the end of semester examination.
Note.
We gratefully acknowledge the contributions of the School of Mathematics and Statistics towards
the creation of this resource. Copyright is vested in The University of New South Wales, ©2020.
©2020 School of Mathematics and Statistics, UNSW Sydney
Contents
Preface iii
Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Syllabus for MA1131 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix
Homework schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Tutorial schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Test schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
1 Introduction to vectors 1
1.1 Vector quantities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Geometric vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.2 Two dimensional vector quantities . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 Vector quantities and Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.1 Vectors in R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2.2 Vectors in Rn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3 Rn and analytic geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.3.1 Two dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.3.2 Three dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.3.3 n-dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.4 Lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.4.1 Lines in R2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.4.2 Lines in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.4.3 Lines through two given points (in Rn) . . . . . . . . . . . . . . . . . . . . . . 27
1.5 Planes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.5.1 Linear combination and span . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.5.2 Parametric vector form of a plane . . . . . . . . . . . . . . . . . . . . . . . . 31
1.5.3 Cartesian form of a plane in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.6 Vectors and Maple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Problems for Chapter 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2 Vector geometry 43
2.1 Lengths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.2 The dot product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.2.1 Arithmetic properties of the dot product . . . . . . . . . . . . . . . . . . . . . 46
2.2.2 Geometric interpretation of the dot product in Rn . . . . . . . . . . . . . . . 46
2.3 Applications: orthogonality and projection . . . . . . . . . . . . . . . . . . . . . . . . 48
2.3.1 Orthogonality of vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.3.2 Projections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.3.3 Distance between a point and a line in R3 . . . . . . . . . . . . . . . . . . . . 53
2.4 The cross product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
2.4.1 Arithmetic properties of the cross product . . . . . . . . . . . . . . . . . . . . 56
2.4.2 A geometric interpretation of the cross product . . . . . . . . . . . . . . . . . 57
2.4.3 Areas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.5 Scalar triple product and volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
2.5.1 Volumes of parallelepipeds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
2.6 Planes in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.6.1 Equations of planes in R3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
2.6.2 Distance between a point and a plane in R3 . . . . . . . . . . . . . . . . . . . 67
2.7 Geometry and Maple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
Problems for Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3 Complex numbers 75
3.1 A review of number systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.2 Introduction to complex numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.3 The rules of arithmetic for complex numbers . . . . . . . . . . . . . . . . . . . . . . 78
3.4 Real parts, imaginary parts and complex conjugates . . . . . . . . . . . . . . . . . . 80
3.5 The Argand diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
3.6 Polar form, modulus and argument . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.7 Properties and applications of the polar form . . . . . . . . . . . . . . . . . . . . . . 88
3.7.1 The arithmetic of polar forms . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
3.7.2 Powers of complex numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.7.3 Roots of complex numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.8 Trigonometric applications of complex numbers . . . . . . . . . . . . . . . . . . . . . 96
3.9 Geometric applications of complex numbers . . . . . . . . . . . . . . . . . . . . . . . 99
3.10 Complex polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
3.10.1 Roots and factors of polynomials . . . . . . . . . . . . . . . . . . . . . . . . . 103
3.10.2 Factorisation of polynomials with real coefficients . . . . . . . . . . . . . . . . 106
3.11 Appendix: A note on proof by induction . . . . . . . . . . . . . . . . . . . . . . . . . 107
3.12 Appendix: The Binomial Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
3.13 Complex numbers and Maple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
Problems for Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4 Linear equations and matrices 121
4.1 Introduction to linear equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
4.2 Systems of linear equations and matrix notation . . . . . . . . . . . . . . . . . . . . 125
4.3 Elementary row operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
4.3.1 Interchange of equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
4.3.2 Adding a multiple of one equation to another . . . . . . . . . . . . . . . . . . 130
4.3.3 Multiplying an equation by a non-zero number . . . . . . . . . . . . . . . . . 131
4.4 Solving systems of equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.4.1 Row-echelon form and reduced row-echelon form . . . . . . . . . . . . . . . . 132
4.4.2 Gaussian elimination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
4.4.3 Transformation to reduced row-echelon form . . . . . . . . . . . . . . . . . . 137
4.4.4 Back-substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
4.5 Deducing solubility from row-echelon form . . . . . . . . . . . . . . . . . . . . . . . . 141
4.6 Solving Ax = b for indeterminate b . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
4.7 General properties of the solution of Ax = b . . . . . . . . . . . . . . . . . . . . . . 143
4.8 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
4.8.1 Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
4.8.2 Chemical engineering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
4.8.3 Economics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
4.9 Matrix reduction and Maple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
Problems for Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
5 Matrices 165
5.1 Matrix arithmetic and algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165
5.1.1 Equality, addition and multiplication by a scalar . . . . . . . . . . . . . . . . 166
5.1.2 Matrix multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169
5.1.3 Matrix arithmetic and systems of linear equations . . . . . . . . . . . . . . . 174
5.2 The transpose of a matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
5.2.1 Some uses of transposes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
5.2.2 Some properties of transposes . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
5.3 The inverse of a matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
5.3.1 Some useful properties of inverses . . . . . . . . . . . . . . . . . . . . . . . . . 181
5.3.2 Calculating the inverse of a matrix . . . . . . . . . . . . . . . . . . . . . . . . 182
5.3.3 Inverse of a 2× 2 matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
5.3.5 Inverses and solution of Ax = b . . . . . . . . . . . . . . . . . . . . . . . . . . 187
5.4 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
5.4.1 The definition of a determinant . . . . . . . . . . . . . . . . . . . . . . . . . . 188
5.4.2 Properties of determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
5.4.3 The efficient numerical evaluation of determinants . . . . . . . . . . . . . . . 193
5.4.4 Determinants and solutions of Ax = b . . . . . . . . . . . . . . . . . . . . . . 196
5.5 Matrices and Maple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
Problems for Chapter 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
Answers to selected problems 205
Chapter 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
Chapter 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
Chapter 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
Chapter 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
Chapter 5 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
Past class tests 225
Index 237
ALGEBRA SYLLABUS AND LECTURE TIMETABLE
The algebra course for MA1131 is based on the MA1131 Algebra Notes given here.
The computer package Maple will be used in the algebra course. An introduction to Maple is
included in the booklet Computing Laboratories Information and First Year Maple Notes.
The lecture timetable is given below. Lecturers will try to follow this timetable, but some variations
may be unavoidable, especially in lecture groups affected by public holidays.
Chapter 1. Introduction to Vectors
Lecture 1. Vector quantities and Rn. (Section 1.1, 1.2).
Lecture 2. R2 and analytic geometry. (Section 1.3).
Lecture 3. Points, line segments and lines. Parametric vector equations. Parallel lines. (Section
1.4).
Lecture 4. Planes. Linear combinations and the span of two vectors. Planes through the origin.
Parametric vector equations for planes in Rn. The linear equation form of a plane. (Section 1.5).
Chapter 2. Vector Geometry
Lecture 5. Length, angles and dot product in R2, R3, Rn. (Sections 2.1,2.2).
Lecture 6. Orthogonality and orthonormal basis, projection of one vector on another. Orthonormal
basis vectors. Distance of a point to a line. (Section 2.3).
Lecture 7. Cross product: definition and arithmetic properties, geometric interpretation of cross
product as perpendicular vector and area (Section 2.4).
Lecture 8. Scalar triple products, determinants and volumes (Section 2.5). Equations of planes
in R3: the parametric vector form, linear equation (Cartesian) form and point-normal form of
equations, the geometric interpretations of the forms and conversions from one form to another.
Distance of a point to a plane in R3. (Section 2.6).
Chapter 3. Complex Numbers
Lecture 9. Development of number systems and closure. Definition of complex numbers and of
complex number addition, subtraction and multiplication. (Sections 3.1, 3.2, start Section 3.3).
Lecture 10. Division, equality, real and imaginary parts, complex conjugates. (Finish 3.3, 3.4).
Lecture 11. Argand diagram, polar form, modulus, argument. (Sections 3.5, 3.6).
Lecture 12. De Moivre’s Theorem and Euler’s Formula. Arithmetic of polar forms. (Section 3.7,
3.7.1).
Lecture 13. Powers and roots of complex numbers. Binomial theorem and Pascal’s triangle.
(Sections 3.7.2, 3.7.3, start Section 3.8).
Lecture 14. Trigonometry and geometry. (Finish 3.8, 3.9).
Lecture 15. Complex polynomials. Fundamental theorem of algebra, factorization theorem,
factorization of complex polynomials of form $z^n - z_0$, real linear and quadratic factors of real
polynomials. (Section 3.10).
Chapter 4. Linear Equations and Matrices
Lecture 16. Introduction to systems of linear equations. Solution of 2× 2 and 2× 3 systems and
geometrical interpretations. (Section 4.1).
Lecture 17. Matrix notation. Elementary row operations. (Sections 4.2, 4.3).
Lecture 18. Solving systems of equations via Gaussian elimination. (Section 4.4 to 4.8)
Chapter 5. Matrices
Lecture 19. Matrices. (Section 5.1).
Lecture 20. Transpose of a matrix. Inverse of a matrix. (Sections 5.2, 5.3)
Lecture 21. Inverses and definition of determinants. (Section 5.3 and start Section 5.4).
Lecture 22. Properties of determinants. (Section 5.4).
Revision
Lecture 23. Revision.
Lecture 24. Revision.
ALGEBRA PROBLEM SETS
The Algebra problems are located at the end of each chapter of the Algebra Notes booklet. They
are also available from the course module on the UNSW Moodle server. The problems marked [R]
form a basic set of problems which you should try first. Problems marked [H] are harder and can
be left until you have done the problems marked [R]. You do need to make an attempt at the [H]
problems because problems of this type will occur on tests and in the exam. If you have difficulty
with the [H] problems, ask for help in your tutorial. Questions marked [V] have a video solution
available from the course page for this subject on Moodle.
There are a number of questions marked [M], indicating that Maple is required in the solution of
the problem.
WEEKLY ALGEBRA SCHEDULES
Solving problems and writing mathematics clearly are two separate skills that need to be developed
through practice. We recommend that you keep a workbook to practise writing solutions to
mathematical problems. The following table gives the range of questions suitable for each week.
In addition, it suggests specific recommended problems to do before your classroom tutorials.
The Online Tutorials will develop your problem-solving skills and give you examples of mathematical
writing. Because this overlaps with the skills developed through homework, there are fewer
recommended homework problems in Online Tutorial weeks.
WEEKLY ALGEBRA HOMEWORK SCHEDULE
Week   Try to do up to       Recommended Homework Problems
       Chapter   Problem
1      1         25          4, 12(a), 14, 17, 20
2      1         40          26(d), 28(b), 29(b), 34(b), 34(d), 39
3      2         9           1(b), 3, 7, 8(b)
4      2         23          13(b), 17(a), 19(a), 21, 22(b)
5      3         17          5, 8(c), 10, 16(d)
6      3         33          18(d), 18(e), 21(a)-21(d), 26, 27, 33(a)
8      3         53          34(a), 40, 45, 49(a), 50(b)
9      3         71          57(b), 61, 67
       4         5           4(b), 5
10     4         41          8, 11(b), 12(c), 15, 20(a), 24, 25, 29
11     5         19          1(m), 6, 7, 14
12     5         40          17(d), 21(a), 24, 28(a), 31
WEEKLY MA1131 ALGEBRA TUTORIAL SCHEDULE
The main reason for having tutorials is to give you a chance to tackle and discuss problems which
you find difficult or don’t fully understand.
There are two kinds of tutorials: Online and Classroom. Algebra Online Tutorials are delivered
using MapleTA. These are optional and can be completed from home. Algebra Classroom tutorials
are delivered in a classroom by an algebra tutor. The topics covered in a classroom tutorial are
flexible, and you can (and should) ask your tutor to cover any homework topics you find difficult.
You may also be asked to present solutions to homework questions to the rest of the class.
ALGEBRA CLASS TESTS
Questions for the class tests in MA1131 will be similar to the questions marked [R] and [H] in the
problem sets. Since each class test is only thirty minutes long, only shorter, straightforward
tests of theory and practice will be set. As a guide, see the sample class test papers (at the end of
the Algebra notes).
The following table shows the week in which each test will be held and the topics covered.
Examination questions are, by their nature, different from short test questions. They may test a
greater depth of understanding. The questions will be longer, and sections of the course not covered
in the class tests will be examined. As a guide, see the sample exam papers in the separate sample
exam papers booklet.
Test   Week   Topics covered
              Chapter   Sections
1      5      1 & 2     All
2      10     3 & 4     All
Chapter 1
Introduction to vectors
“You see, the earth takes twenty-four hours to
turn round on its axis —”
“Talking of axes,” said the Duchess, “chop off her head.”
Lewis Carroll, Alice in Wonderland.
The aims of this chapter are to introduce the idea of a “vector” in a relatively informal and
intuitive manner, and to illustrate applications of this idea to the geometry of lines and planes.
Until quite recently, the main applications of vectors had been in the physical and engineering
sciences. However the study of vectors has now become an important branch of modern pure
and applied mathematics, and vectors are now being used in such diverse fields as economics
and management science, psychology and the social sciences, chemistry and chemical engineering,
mechanical and electrical engineering, computer science, numerical analysis and computational
mathematics.
The definition of vectors used in mathematics courses is essentially algebraic in nature, whereas
the one used by physicists is geometric. We shall begin with the geometric approach, then we shall
introduce the algebraic definition and show how they relate to one another. As we shall see however,
the algebraic definition of a vector is not limited to describing quantities that arise in physics and
engineering.
1.1 Vector quantities
Vector quantities, as opposed to scalar quantities, are very important in an understanding of the
laws of physics and engineering.
A scalar quantity is anything that can be specified by a single number. Examples of scalar
quantities are temperature, distance, mass and speed. For example, specifying the speed of a car
as 60 km per hour only involves the single real number 60.
A vector quantity is one which is specified by both a magnitude and a direction. Examples
of vector quantities are displacement, velocity, force and electric field. For example, specifying the
velocity of a car as 60 km per hour northeast involves a vector of magnitude 60 and direction
northeast.
The usual notational convention in books is to differentiate vector quantities from scalar ones by
denoting them by boldface symbols such as a, and we shall do this in these notes. In handwriting,
one usually signifies that a quantity is a vector by using a tilde sign under the letter (as in
$\underset{\sim}{a}$), or by writing an arrow above the letter (as in $\vec{a}$). Because the
properties of scalar and vector quantities are quite different, it is vital that you distinguish
them, especially in solutions to problems.
The magnitude of the vector a is usually denoted by |a|. Note that this is always a non-negative
real number, and that |a| = 0 only when a is the zero vector, usually denoted by 0.
Definition 1. The zero vector is the vector 0 of magnitude zero, and undefined
direction.
1.1.1 Geometric vectors
To represent a vector on a diagram we draw an arrow (i.e. a directed line segment) where the length
of the arrow is the magnitude of the vector, and the direction of the arrow is the direction of the
vector. An arrow can be specified by its initial point (the tail) and its terminal point (the head).
In Figure 1, a vector a is represented by an arrow with initial point P and terminal point Q. We
denote this arrow by $\overrightarrow{PQ}$.
Two vectors are said to be equal if they have the same magnitude and direction. As the arrows
$\overrightarrow{AB}$ and $\overrightarrow{EF}$ have the same length and direction as
$\overrightarrow{PQ}$, these arrows all represent the same vector. We can write
$\overrightarrow{PQ} = \overrightarrow{AB} = \overrightarrow{EF}$.
Each vector may be represented by many arrows, but each arrow only represents one vector.
Nonetheless, we shall sometimes find it convenient to blur the distinction and write expressions
like $\mathbf{a} = \overrightarrow{PQ}$, when we really mean that $\overrightarrow{PQ}$ represents a.
Figure 1. (Arrows from P to Q, from A to B and from E to F, each representing a.)
Note. Since we use the intuitive notion of direction in our physical world, apparently we can only
have two dimensional or three dimensional geometric vectors. We shall introduce the algebraic
definition of vectors including the higher dimensional ones in the next section. Though we can still
talk about higher dimensional geometric vectors, the notions of length and direction will depend
on the algebraic nature of the vectors.
There are two equivalent ways to add two vectors together. The first addition rule is often
known as the triangle law for vector addition.
Definition 2. (The addition of vectors). On a diagram drawn to scale, draw
an arrow representing the vector a. Now draw an arrow representing b whose initial
point lies at the terminal point of the arrow representing a. The arrow which goes
from the initial point of a to the terminal point of b represents the vector c = a+b.
Figure 2: Addition of Vectors (c = a + b).
The second addition rule is known as the parallelogram law for vector addition.
Definition 3. (Alternate definition of vector addition). On a diagram drawn
to scale, draw an arrow representing the vector a. Now draw an arrow representing
b whose initial point lies at the initial point of the arrow representing a. Draw
the parallelogram with the two arrows as adjacent sides. The initial point and the
two terminal points are vertices of the parallelogram. The arrow which goes from
the initial point of a to the fourth vertex of the parallelogram represents the vector
c = a+ b.
Figure 3: Addition of Vectors (c = a + b).
Obviously, these two definitions are equivalent.
Vector addition is of course quite different from usual addition of numbers, but they do share
some important properties. In particular, for any vectors a, b and c,
a+ b = b+ a, (Commutative law of vector addition)
and (a+ b) + c = a+ (b+ c). (Associative law of vector addition)
These laws, which follow from basic geometry (see Figures 4 and 5), assert that it makes no
difference in what order, or in what grouping we add vectors.
Figure 4: Commutative Law of Vector Addition.
Figure 5: Associative Law of Vector Addition.
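These two laws can also be checked numerically. The following Python sketch assumes, ahead of the
algebraic definition given in the next section, that a vector in the plane is stored as a pair of
components and that addition is done component by component. The helper name `add` and the sample
vectors are illustrative only and do not come from these Notes.

```python
def add(u, v):
    """Add two vectors stored as tuples of components (an assumed representation)."""
    return tuple(x + y for x, y in zip(u, v))


# Sample vectors chosen only for illustration.
a = (3, 1)
b = (-2, 4)
c = (5, -7)

# Commutative law: a + b = b + a.
commutative = add(a, b) == add(b, a)

# Associative law: (a + b) + c = a + (b + c).
associative = add(add(a, b), c) == add(a, add(b, c))

print(commutative, associative)
```

A check on particular vectors is not a proof; the laws themselves follow from the geometry in
Figures 4 and 5.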
The law of addition can easily be extended to cover the zero vector by the natural condition
that a+ 0 = 0 + a = a for all vectors a. We then can introduce the negative of a vector and the
subtraction of vectors.
Definition 4. (Negative of a vector and subtraction). The negative of a,
written −a, is the vector such that
a + (−a) = (−a) + a = 0.
If a and b are vectors, we define the subtraction by
a − b = a + (−b).
From the definition, the vector −a is the unique vector which has the same magnitude as a, but
whose direction is opposite to that of a. As shown in Figure 6, the arrow representing a − b can
be drawn by first reversing the arrow representing b, and then adding the vectors a and −b.
Alternatively, represent the vectors a and b by arrows with the same initial point O, and let P
and Q be the terminal points of a and b, respectively. Suppose that OPRQ is a parallelogram. From
the definition of vector addition, we have
$\overrightarrow{OQ} + \overrightarrow{QP} = \overrightarrow{OP}$
$\mathbf{b} + \overrightarrow{QP} = \mathbf{a}$
$\overrightarrow{QP} = \mathbf{a} - \mathbf{b}$
Hence the diagonal $\overrightarrow{QP}$ represents the difference a − b. In other words, a − b is
the vector which can be represented by the arrow from the terminal point of b to the terminal
point of a.
Figure 6: Subtraction of Vectors.
We now define the operation of multiplying a vector by a real number. Roughly speaking, to
multiply a vector a by a real number λ, all we do is stretch the vector by a factor of λ, whilst
keeping its direction unchanged. We need to be careful if λ is not positive.
Definition 5. (Multiplication of a vector by a scalar). Let a be a vector
and let λ ∈ R.
1. If λ > 0, then λa is the vector whose magnitude is λ|a| and whose direction is
the same as that of a.
2. If λ = 0, then λa = 0.
3. If λ < 0, then λa is the vector whose magnitude is |λ||a| and whose direction is
the opposite of the direction of a.
From the above, the negative of a and the product of the scalar −1 and the vector a are both the
vector which has the same magnitude as a, but whose direction is the opposite of that of a. Hence
we have
−a = (−1)a.
Figure 7: Scalar Multiplication (a, 2a and −a).
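The effect of Definition 5 on magnitudes can be illustrated with a short Python sketch. It assumes
that a plane vector is stored as a pair of components, which anticipates the algebraic definition
of the next section; the helper names `scale` and `magnitude` are hypothetical and are not part of
these Notes.

```python
import math


def scale(lam, v):
    """Multiply each component of the vector v by the scalar lam."""
    return tuple(lam * x for x in v)


def magnitude(v):
    """Length of a plane vector given by its two components."""
    return math.hypot(v[0], v[1])


a = (3.0, 4.0)  # sample vector with |a| = 5

for lam in (2.0, 0.0, -1.0, -2.5):
    scaled = scale(lam, a)
    # In every case the magnitude of lam * a should equal |lam| * |a|.
    print(lam, scaled, magnitude(scaled), abs(lam) * magnitude(a))
```

For negative values of λ every component changes sign, which corresponds to reversing the
direction of the arrow.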
We have already seen the commutative and associative laws of vector addition. There are some
other important properties of scalar multiplication and vector addition. Let a and b be vectors, λ
and µ be real numbers, then:
λ(µa) = (λµ)a, (Associative law of multiplication by a scalar)
(λ+ µ)a = λa+ µa, (Scalar distributive law)
λ(a+ b) = λa+ λb. (Vector distributive law)
The vector distributive law for the case that λ > 0 follows from properties of similar triangles and
Figure 8. The proof for the other cases and the proofs of the other two laws are left as exercises
for interested students.
Figure 8: Vector Distributive Law.
We can use these rules to simplify vector expressions.
Example 1. Simplify 3(2a− b) + (a− 2b).
Solution. In this example, we shall quote all the rules that we used.
3(2a − b) + (a − 2b)
= 3[2a + (−1)b] + [a + (−1)(2b)] (Definition of subtraction; −a = (−1)a)
= [3(2a) + 3((−1)b)] + [a + (−1)(2b)] (Vector distributive law)
= [6a + (−3)b] + [a + (−2)b] (Associative law of multiplication by a scalar)
= (6a + a) + [(−3)b + (−2)b] (Associative law and commutative law)
= (6 + 1)a + [(−3) + (−2)]b (Scalar distributive law)
= 7a − 5b (Definition of subtraction; −a = (−1)a)
In practice, we simply write
3(2a− b) + (a− 2b) = 6a− 3b+ a− 2b = 7a− 5b. ♦
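A simplification of this kind can be spot-checked by substituting particular vectors. The Python
sketch below assumes vectors are stored as pairs of components (the algebraic definition of the
next section); the helpers `add`, `scale` and `sub` and the sample values are illustrative only.

```python
def add(u, v):
    """Component-wise sum of two vectors."""
    return tuple(x + y for x, y in zip(u, v))


def scale(lam, v):
    """Multiply the vector v by the scalar lam."""
    return tuple(lam * x for x in v)


def sub(u, v):
    """Subtraction defined as u + (-1)v, as in Definition 4."""
    return add(u, scale(-1, v))


# Sample vectors chosen only for illustration.
a = (1, 2)
b = (3, -1)

left = add(scale(3, sub(scale(2, a), b)), sub(a, scale(2, b)))  # 3(2a - b) + (a - 2b)
right = sub(scale(7, a), scale(5, b))                            # 7a - 5b

print(left, right, left == right)
```

Agreement for one choice of a and b does not prove the identity, but a disagreement would reveal
an algebra slip.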
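Example 2. Simplify $2\overrightarrow{AC} - \overrightarrow{OC} + \overrightarrow{OA}$.
Solution. Let $\overrightarrow{OA} = \mathbf{a}$ and $\overrightarrow{OC} = \mathbf{c}$, so that
$\overrightarrow{AC} = \mathbf{c} - \mathbf{a}$. Then
$2\overrightarrow{AC} - \overrightarrow{OC} + \overrightarrow{OA} = 2(\mathbf{c} - \mathbf{a}) - \mathbf{c} + \mathbf{a} = 2\mathbf{c} - 2\mathbf{a} - \mathbf{c} + \mathbf{a} = \mathbf{c} - \mathbf{a} = \overrightarrow{AC}$.
♦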
1.1.2 Two dimensional vector quantities
Geometric vectors can be used to prove some theorems in geometry, and in some respects they have
advantages over other methods. For instance, we can prove that a quadrilateral OABC is a
parallelogram simply by proving that $\overrightarrow{OA} = \overrightarrow{CB}$. To prove that two
lines PQ and RS are parallel, we need to show that there exists a non-zero real number λ such that
$\overrightarrow{PQ} = \lambda\overrightarrow{RS}$.
Example 3. In a triangle OAB, take D on OA and E on OB such that OD : DA = OE : EB = 1 : 2. Prove
that DE is parallel to AB and that the length of DE is $\frac{1}{3}$ times that of AB.
Proof. Since D divides OA in the ratio 1 : 2, the vector $\overrightarrow{OD}$ has the same
direction as $\overrightarrow{OA}$ and its length is $\frac{1}{3}$ times that of
$\overrightarrow{OA}$. Hence $\overrightarrow{OD} = \frac{1}{3}\overrightarrow{OA}$. Similarly, we
also have $\overrightarrow{OE} = \frac{1}{3}\overrightarrow{OB}$. Thus
$\overrightarrow{DE} = \overrightarrow{OE} - \overrightarrow{OD} = \frac{1}{3}\overrightarrow{OB} - \frac{1}{3}\overrightarrow{OA} = \frac{1}{3}(\overrightarrow{OB} - \overrightarrow{OA}) = \frac{1}{3}\overrightarrow{AB}$.
Hence DE is parallel to AB and the length of DE is $\frac{1}{3}$ times that of AB.
Figure 9.
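The result of Example 3 can be illustrated on a particular triangle. The Python sketch below
assumes points are given by coordinates with O at the origin, and uses exact fractions so that the
comparison is not affected by rounding; the coordinates and helper names are illustrative
assumptions, not part of these Notes.

```python
from fractions import Fraction


def sub(u, v):
    """Component-wise difference u - v."""
    return tuple(x - y for x, y in zip(u, v))


def scale(lam, v):
    """Multiply the vector v by the scalar lam."""
    return tuple(lam * x for x in v)


# Sample triangle OAB chosen only for illustration.
O = (0, 0)
A = (6, 0)
B = (3, 9)

third = Fraction(1, 3)
OD = scale(third, sub(A, O))  # D divides OA in the ratio 1 : 2
OE = scale(third, sub(B, O))  # E divides OB in the ratio 1 : 2

DE = sub(OE, OD)
AB = sub(B, A)

# DE should be exactly one third of AB, so the two are parallel.
print(DE == scale(third, AB))
```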
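[X] Example 4. Let $\overrightarrow{OA} = \mathbf{a}$ and $\overrightarrow{OB} = \mathbf{b}$. Prove
that P is a point on the line AB between A and B if and only if
$\overrightarrow{OP} = (1 - \lambda)\mathbf{a} + \lambda\mathbf{b}$ for some real number λ with
0 < λ < 1.
Proof. If P is a point on AB between A and B, there exists a real number λ between 0 and 1 such
that $\overrightarrow{AP} = \lambda\overrightarrow{AB}$. Hence,
$\overrightarrow{OP} = \overrightarrow{OA} + \overrightarrow{AP} = \overrightarrow{OA} + \lambda\overrightarrow{AB} = \mathbf{a} + \lambda(\mathbf{b} - \mathbf{a}) = (1 - \lambda)\mathbf{a} + \lambda\mathbf{b}$.
Conversely, if $\overrightarrow{OP} = (1 - \lambda)\mathbf{a} + \lambda\mathbf{b}$ and 0 < λ < 1,
we have
$\overrightarrow{AP} = [(1 - \lambda)\mathbf{a} + \lambda\mathbf{b}] - \mathbf{a} = \lambda(\mathbf{b} - \mathbf{a}) = \lambda\overrightarrow{AB}$.
Since 0 < λ < 1, the point P lies on AB between A and B.
Figure 10.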
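[X] Example 5. Prove that the three medians of a triangle are concurrent.
Proof. Name the vertices of a triangle O, A and B. Let D, E, F be the midpoints of OB, OA, AB
respectively. Suppose that AD and BE intersect at G.
Let $\overrightarrow{OA} = \mathbf{a}$ and $\overrightarrow{OB} = \mathbf{b}$. Hence
$\overrightarrow{OE} = \frac{1}{2}\mathbf{a}$ and $\overrightarrow{OD} = \frac{1}{2}\mathbf{b}$.
Since G lies on both AD and BE and inside the triangle, from Example 4 there exist real numbers λ
and µ such that
$\overrightarrow{OG} = (1 - \lambda)\mathbf{a} + \lambda\left(\frac{1}{2}\mathbf{b}\right) = (1 - \mu)\mathbf{b} + \mu\left(\frac{1}{2}\mathbf{a}\right)$.
By rearranging terms, we get
$(1 - \lambda)\mathbf{a} - \frac{1}{2}\mu\,\mathbf{a} = (1 - \mu)\mathbf{b} - \frac{1}{2}\lambda\,\mathbf{b}$.
Since a and b are not parallel, a cannot be a non-zero scalar multiple of b, so both sides must be
the zero vector and we have
$(1 - \lambda) - \frac{1}{2}\mu = 0$ and $(1 - \mu) - \frac{1}{2}\lambda = 0$.
By solving the above simultaneous equations, we have $\lambda = \frac{2}{3}$ and
$\mu = \frac{2}{3}$. So we have $\overrightarrow{OG} = \frac{1}{3}(\mathbf{a} + \mathbf{b})$.
Since F is the midpoint of AB, $\overrightarrow{OF} = \frac{1}{2}(\mathbf{a} + \mathbf{b})$. Thus
$\overrightarrow{OG} = \frac{2}{3}\overrightarrow{OF}$, so $\overrightarrow{OG}$ and
$\overrightarrow{OF}$ are in the same direction.
Hence G lies on OF and therefore the three medians of the triangle OAB are concurrent.
Figure 11.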
Many calculations with vector quantities in the plane can be done geometrically using scale
diagrams.
Example 6. A yacht sails from a pier in the direction N60◦ E for 15 km, then turns to N 45◦W
for 10 km. What are the distance and the bearing of the yacht from the pier?
Solution. As shown in Figure 12, after the first leg, the yacht is at P which is 15 cos 30° km east and 15 sin 30° km north of the pier, O. The yacht then moves 10 cos 45° km west and 10 sin 45° km north, to Q. The yacht is then
15 cos 30° − 10 cos 45° ≈ 5.919 km east of O
and 15 sin 30° + 10 sin 45° ≈ 14.571 km north of O.
Hence $OQ \approx \sqrt{(5.919)^2 + (14.571)^2} \approx 15.73$ and
$$\theta \approx \tan^{-1}\left(\frac{14.571}{5.919}\right) \approx 67^\circ 54'.$$
Here, θ is the angle of OQ measured from the east. The distance and the bearing of the yacht from the pier are then 15.73 km and N22°6′ E. ♦
Figure 12.
In adding the two displacement vectors $\overrightarrow{OP}$ and $\overrightarrow{PQ}$, we added and subtracted displacements in the east-west and the north-south directions. We then specified the sum $\overrightarrow{OQ}$ by its length and direction.
We shall see how this relates to the algebraic definition of a vector in the next section.
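The arithmetic of Example 6 can be checked numerically. The following is a minimal Python sketch and not part of the original notes; the helper name `leg` and the convention of giving a bearing in degrees clockwise from north (so that N 45° W becomes −45) are assumptions made for illustration.

```python
import math

def leg(distance_km, bearing_deg):
    """Return the (east, north) components of one leg of a journey.

    bearing_deg is measured clockwise from north (an assumed convention),
    so N60°E is 60 and N45°W is -45.
    """
    theta = math.radians(bearing_deg)
    return (distance_km * math.sin(theta), distance_km * math.cos(theta))

# The two legs of Example 6.
op = leg(15, 60)    # from the pier O to P
pq = leg(10, -45)   # from P to Q

# Add the displacements component by component.
east = op[0] + pq[0]
north = op[1] + pq[1]

# Length and direction of the sum OQ.
distance = math.hypot(east, north)
bearing = math.degrees(math.atan2(east, north))  # clockwise from north

print(f"{east:.3f} km east, {north:.3f} km north")
print(f"distance {distance:.2f} km, bearing {bearing:.2f} degrees east of north")
```

The bearing is printed in decimal degrees; 22°6′ corresponds to roughly 22.1°.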
Note. For many problems in the physical world, the vector quantities are in three dimensions
rather than two. At least in theory, the same methods in this section could be used to solve such
problems. Each vector could be represented as an arrow in space, and then we can add two vectors
and multiply a vector by a scalar.
1.2 Vector quantities and Rn
In Example 6 of Section 1.1.2, if we denote the vector of length 1 km towards the east by i and the vector of length 1 km towards the north by j, then by the definition of geometric vectors we can write
$$\overrightarrow{OP} = 15\cos 30^\circ\,\mathbf{i} + 15\sin 30^\circ\,\mathbf{j} \quad\text{and}\quad \overrightarrow{PQ} = -10\cos 45^\circ\,\mathbf{i} + 10\sin 45^\circ\,\mathbf{j}.$$
By the associative, commutative and distributive laws of geometric vectors, we have
$$\overrightarrow{OQ} = \overrightarrow{OP} + \overrightarrow{PQ} = (15\cos 30^\circ - 10\cos 45^\circ)\,\mathbf{i} + (15\sin 30^\circ + 10\sin 45^\circ)\,\mathbf{j}.$$
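In a more convenient way, we can write
$$\overrightarrow{OP} = \begin{pmatrix} 15\cos 30^\circ \\ 15\sin 30^\circ \end{pmatrix}, \quad \overrightarrow{PQ} = \begin{pmatrix} -10\cos 45^\circ \\ 10\sin 45^\circ \end{pmatrix}, \quad \overrightarrow{OQ} = \begin{pmatrix} 15\cos 30^\circ - 10\cos 45^\circ \\ 15\sin 30^\circ + 10\sin 45^\circ \end{pmatrix}.$$
There is no reason why we cannot generalise this to all two dimensional vectors, three dimensional vectors and beyond.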
1.2.1 Vectors in R2
We first choose two vectors, conventionally denoted by i and j, of unit length, and at right angles to each other so that j is pointing at an angle of $\frac{\pi}{2}$ anticlockwise from i. The vectors i and j are known as the standard basis vectors for $\mathbb{R}^2$. See Figure 13.
Figure 13.
As shown in Figure 14, every vector a can be ‘resolved’ (in a unique way) into the sum of a scalar multiple of i plus a scalar multiple of j. That is, there are unique real numbers $a_1$ and $a_2$ such that $\mathbf{a} = a_1\mathbf{i} + a_2\mathbf{j}$. If the direction θ of a is measured from the direction of i, then these scalars can be easily found using the formulae
$$a_1 = |\mathbf{a}|\cos\theta \quad\text{and}\quad a_2 = |\mathbf{a}|\sin\theta.$$
We call $a_1\mathbf{i}$, $a_2\mathbf{j}$ the components of a.
Figure 14.
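Now, every vector a in the plane can be specified by these two unique real numbers. We can write a in the form of a column vector or a 2-vector $\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$. We call the numbers $a_1$, $a_2$ the components of $\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$. In this case, $\mathbf{i} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and $\mathbf{j} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$.
The column vector $\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$ is also called the coordinate vector with respect to the basis vectors $\left\{\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}\right\}$, and $a_1$, $a_2$ are also called the coordinates of the vector.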
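Theorem 1. Let a and b be (geometric) vectors, and let $\lambda \in \mathbb{R}$. Suppose that $\mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$. Then
1. the coordinate vector for $\mathbf{a} + \mathbf{b}$ is $\begin{pmatrix} a_1 + b_1 \\ a_2 + b_2 \end{pmatrix}$;
2. the coordinate vector for $\lambda\mathbf{a}$ is $\begin{pmatrix} \lambda a_1 \\ \lambda a_2 \end{pmatrix}$.
Proof. The basis vectors are i and j. We first give a detailed proof of the second part.
$$\lambda\mathbf{a} = \lambda(a_1\mathbf{i} + a_2\mathbf{j}) \quad\text{(by definition of coordinate vectors)}$$
$$= \lambda a_1\mathbf{i} + \lambda a_2\mathbf{j} \quad\text{(by the vector distributive law)}$$
Hence the coordinate vector for $\lambda\mathbf{a}$ is $\begin{pmatrix} \lambda a_1 \\ \lambda a_2 \end{pmatrix}$.
For the first part, by the associative, commutative and distributive laws, we have
$$\mathbf{a} + \mathbf{b} = (a_1\mathbf{i} + a_2\mathbf{j}) + (b_1\mathbf{i} + b_2\mathbf{j}) = (a_1\mathbf{i} + b_1\mathbf{i}) + (a_2\mathbf{j} + b_2\mathbf{j}) = (a_1 + b_1)\mathbf{i} + (a_2 + b_2)\mathbf{j}.$$
Hence the coordinate vector for $\mathbf{a} + \mathbf{b}$ is $\begin{pmatrix} a_1 + b_1 \\ a_2 + b_2 \end{pmatrix}$.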
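We can then define the mathematical structure $\mathbb{R}^2$, which is the set of 2-vectors, by
$$\mathbb{R}^2 = \left\{\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} : a_1, a_2 \in \mathbb{R}\right\},$$
with addition and multiplication by a scalar defined as follows: for any $\begin{pmatrix} a_1 \\ a_2 \end{pmatrix}, \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} \in \mathbb{R}^2$ and $\lambda \in \mathbb{R}$,
$$\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} + \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} a_1 + b_1 \\ a_2 + b_2 \end{pmatrix} \quad\text{and}\quad \lambda\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} \lambda a_1 \\ \lambda a_2 \end{pmatrix}.$$
The elements in $\mathbb{R}^2$ are called vectors and sometimes column vectors. It follows directly from these definitions that the set $\mathbb{R}^2$ is closed under addition and scalar multiplication. Like geometric vectors, the vectors in $\mathbb{R}^2$ also obey the commutative, associative, and distributive laws. There is a zero vector $\mathbf{0} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$ and a negative $\begin{pmatrix} -a_1 \\ -a_2 \end{pmatrix}$ for any vector $\mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$.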
1.2.2 Vectors in Rn
The concept of components can easily be generalised from two dimensions to any number of dimensions.
Definition 1. Let n be a positive integer. The set $\mathbb{R}^n$ is defined by
$$\mathbb{R}^n = \left\{\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix} : a_1, a_2, \ldots, a_n \in \mathbb{R}\right\}.$$
An element $\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}$ in $\mathbb{R}^n$ is called an n-vector or simply a vector; and $a_1, a_2, \ldots, a_n$ are called the components of the vector.
Note. We say that two vectors in $\mathbb{R}^n$ are equal if the corresponding components are equal. In other words,
$$\begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} \quad\text{if and only if}\quad a_1 = b_1, \ldots, a_n = b_n.$$
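Example 1. 1. Clearly $\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} \in \mathbb{R}^3$; $\begin{pmatrix} 0.5 \\ 1.4 \\ 2.3 \\ -4.1 \end{pmatrix} \in \mathbb{R}^4$.
2. $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$ and $\begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}$ are different elements of $\mathbb{R}^3$. ♦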
As in R2, we can define addition of two vectors in Rn and multiplication of a vector in Rn by a
scalar in R.
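Definition 2. Let $\mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix}$, $\mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}$ be vectors in $\mathbb{R}^n$ and $\lambda$ be a real number.
We define the sum of a and b by $\mathbf{a} + \mathbf{b} = \begin{pmatrix} a_1 + b_1 \\ a_2 + b_2 \\ \vdots \\ a_n + b_n \end{pmatrix}$.
We define the scalar multiplication of a by $\lambda$ by $\lambda\mathbf{a} = \begin{pmatrix} \lambda a_1 \\ \lambda a_2 \\ \vdots \\ \lambda a_n \end{pmatrix}$.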
Note. The addition rule tells us how to add two vectors with the same number of components.
The sum of vectors with different numbers of components is not defined.
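The two operations of Definition 2, together with the restriction in the note above, can be sketched in a few lines of Python. This is an illustration only: representing a vector in $\mathbb{R}^n$ as a Python list and the function names `add` and `scale` are assumptions, not notation from these Notes.

```python
def add(a, b):
    """Componentwise sum of two vectors with the same number of components."""
    if len(a) != len(b):
        # The sum is only defined for vectors in the same R^n.
        raise ValueError("sum of vectors with different numbers of components is not defined")
    return [x + y for x, y in zip(a, b)]

def scale(lam, a):
    """Scalar multiple lam * a, computed component by component."""
    return [lam * x for x in a]

print(add([4, -3, 5], [2, 4, -2]))
print(scale(3, [1, 2, 3]))

try:
    add([1, 2, 3], [2, 0, 4, -1])   # a 3-vector plus a 4-vector
except ValueError as err:
    print("undefined:", err)
```

Raising an error for mismatched lengths, rather than padding with zeros, mirrors the statement that such a sum is not defined.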
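Example 2.
1. $\begin{pmatrix} 4 \\ -3 \\ 5 \end{pmatrix} + \begin{pmatrix} 2 \\ 4 \\ -2 \end{pmatrix} = \begin{pmatrix} 4 + 2 \\ -3 + 4 \\ 5 + (-2) \end{pmatrix} = \begin{pmatrix} 6 \\ 1 \\ 3 \end{pmatrix}$.
2. $\begin{pmatrix} 1.3 \\ 2 \\ 0 \\ 1 \end{pmatrix} + \begin{pmatrix} 3.2 \\ -2 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} 4.5 \\ 0 \\ 2 \\ 2 \end{pmatrix}$.
3. $3\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 3 \\ 6 \\ 9 \end{pmatrix}$.
4. $\pi\begin{pmatrix} -1 \\ 2 \\ 0 \\ 5 \end{pmatrix} = \begin{pmatrix} -\pi \\ 2\pi \\ 0 \\ 5\pi \end{pmatrix}$.
5. $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \begin{pmatrix} 2 \\ 0 \\ 4 \\ -1 \end{pmatrix}$ is not defined! ♦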
Proposition 2. Let u, v and w be vectors in Rn.
1. u+ v = v + u. (Commutative Law of Addition)
2. (u+ v) +w = u+ (v +w). (Associative Law of Addition)
[X] Proof. We prove the commutative law, while proving the associative law will be left as an
exercise.
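Let $\mathbf{u} = \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix}$ and $\mathbf{v} = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}$. Then
$$\mathbf{u} + \mathbf{v} = \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} + \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}$$
$$= \begin{pmatrix} u_1 + v_1 \\ \vdots \\ u_n + v_n \end{pmatrix} \quad\text{(definition of addition in } \mathbb{R}^n\text{)}$$
$$= \begin{pmatrix} v_1 + u_1 \\ \vdots \\ v_n + u_n \end{pmatrix} \quad\text{(commutative law of real numbers)}$$
$$= \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix} + \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} \quad\text{(definition of addition in } \mathbb{R}^n\text{)}$$
$$= \mathbf{v} + \mathbf{u}.$$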
Definition 3. Let n be a positive integer.
1. The zero vector in Rn, denoted by 0, is the vector with all n components 0.
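2. Let $\mathbf{a} = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} \in \mathbb{R}^n$. The negative of a, denoted by $-\mathbf{a}$, is the vector $\begin{pmatrix} -a_1 \\ \vdots \\ -a_n \end{pmatrix}$.
3. Let a, b be vectors in $\mathbb{R}^n$. We define the difference, $\mathbf{a} - \mathbf{b}$, by
$$\mathbf{a} - \mathbf{b} = \mathbf{a} + (-\mathbf{b}).$$
Note. Let a, b be vectors in $\mathbb{R}^n$ and 0 be the zero vector in $\mathbb{R}^n$.
1. $\mathbf{a} + \mathbf{0} = \mathbf{0} + \mathbf{a} = \mathbf{a}$.
2. $\begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} - \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix} = \begin{pmatrix} a_1 - b_1 \\ \vdots \\ a_n - b_n \end{pmatrix}$.
3. $\mathbf{a} + (-\mathbf{a}) = (-\mathbf{a}) + \mathbf{a} = \mathbf{0}$.
Example 3.
1. $\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$ is the zero vector in $\mathbb{R}^3$, while $\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$ is the zero vector in $\mathbb{R}^4$.
2. $\begin{pmatrix} 4 \\ -3 \\ 5 \end{pmatrix} - \begin{pmatrix} 2 \\ 4 \\ -2 \end{pmatrix} = \begin{pmatrix} 4 - 2 \\ -3 - 4 \\ 5 - (-2) \end{pmatrix} = \begin{pmatrix} 2 \\ -7 \\ 7 \end{pmatrix}$.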
Proposition 3. Let λ, µ be scalars and u, v be vectors in Rn.
1. λ(µv) = (λµ)v (Associative Law of Scalar Multiplication)
2. (λ+ µ)v = λv + µv (Scalar Distributive Law)
3. λ(u+ v) = λu+ λv (Vector distributive Law)
1.3 Rn and analytic geometry
1.3.1 Two dimensions
From an algebraic point of view, $\mathbb{R}^2$ behaves very much the same as $\mathbb{R}^{43}$: the rules for addition and scalar multiplication are more or less the same. However, the different sets $\mathbb{R}^1, \mathbb{R}^2, \mathbb{R}^3, \mathbb{R}^4, \ldots$
tend to appear in different types of applications.
The sets R2 and R3 are particularly important for
solving problems which lie in a plane or in “space”.
You are probably quite familiar with working with
the coordinates of points in a plane. In this section
we shall look more closely at the relationship between
points in the plane and their coordinates.
Let’s suppose that we have a plane with several objects in it, as in Figure 15.
Figure 15.
To solve problems such as “what is the distance between A and B?”, it is often convenient to
work with the coordinates of these points. In real life, points don’t come with (x, y) coordinates
attached.
We have to
(i) specify an origin point O somewhere in the plane;
(ii) specify a unit of length;
(iii) specify a direction (usually called the x-direction); and
(iv) specify a y-direction (usually at angle $\frac{\pi}{2}$ anticlockwise to the x-direction).
Having done this we can then find the coordinates $(x_1, y_1)$ of A (with respect to our chosen coordinate system) by requiring that A is $x_1$ units from the origin in the x-direction and $y_1$ units from the origin in the y-direction. In this case we represent A by the 2-tuple, or ordered pair, $(x_1, y_1)$.
Figure 16.
The line through O in the x-direction is called the
x-axis. The y-axis is similarly defined.
With the choice of the origin O, the point A is naturally associated with a vector $\overrightarrow{OA}$ which is called the position vector of A with respect to the origin. Then we choose i to be the unit vector in the x-direction and j to be the unit vector in the y-direction. The vectors $x_1\mathbf{i}$ and $y_1\mathbf{j}$ are often known as the components of $\mathbf{a} = \overrightarrow{OA}$ in the x and y-directions respectively. As in Section 1.2.1, the position vector of A is represented by the coordinate vector $\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$ with respect to the basis vectors $\{\mathbf{i}, \mathbf{j}\}$.
Figure 17.
When a coordinate system is specified, each point in the plane corresponds to a unique 2-tuple $(x, y)$, and equivalently a unique column vector $\begin{pmatrix} x \\ y \end{pmatrix}$. Conversely, once the coordinate system has been fixed, each 2-tuple $(x, y)$ or equivalently each $\begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2$ corresponds to a unique point in the plane. We shall often denote the position vector of A by a.
It is a consequence of Pythagoras’ Theorem that if the points A and B in the plane have position vectors $\begin{pmatrix} x_1 \\ y_1 \end{pmatrix}$ and $\begin{pmatrix} x_2 \\ y_2 \end{pmatrix}$ respectively, then the distance between A and B is given by
$$\mathrm{dist}(A, B) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}.$$
Now, we can naturally define the length of a 2-vector a. Choose the point A such that $\overrightarrow{OA}$ represents a. The length of a is defined to be dist(O, A). Thinking of vectors as displacements in a plane, it is not difficult to see that two non-zero vectors a and b in $\mathbb{R}^2$ are parallel if and only if there is a non-zero real number $\lambda$ such that $\mathbf{a} = \lambda\mathbf{b}$.
Note. The use of coordinates to solve problems in geometry is a relatively recent introduction.
Although geometry has been studied for thousands of years, this “analytic geometry” was only
introduced by R. Descartes (from whose name the term ‘Cartesian’ is derived) in the early 17th
century.
1.3.2 Three dimensions
You can give coordinates to points in space just as you give coordinates to points in the plane.
Again however, we need to fix our coordinate system. Now, let us
(i) specify an origin point O somewhere in space;
(ii) specify a unit of length;
(iii) specify three directions, usually called the x, y and z directions, each at right angles to the
others.
We choose unit vectors i, j, k, respectively, in the x-direction, y-direction, z-direction. The coordinates of a point A with respect to the basis $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$ are given by a 3-tuple $(x_1, y_1, z_1)$ such that A is $x_1$ units from the origin in the x-direction, $y_1$ units from the origin in the y-direction, and $z_1$ units from the origin in the z-direction. The position vector of A, $\overrightarrow{OA}$, is $\mathbf{a} = x_1\mathbf{i} + y_1\mathbf{j} + z_1\mathbf{k}$, which can also be represented by $\begin{pmatrix} x_1 \\ y_1 \\ z_1 \end{pmatrix} \in \mathbb{R}^3$.
Figure 18: Right-handed Coordinate System.
The coordinate system in Figure 18 is called a right-handed Cartesian coordinate system. Using your right hand, if you point your index finger in the x-direction and your middle finger in the y-direction, then your extended thumb will point in the z-direction. This is the orientation that is conventionally used in mathematics and physics.
Again once we have set up our coordinate system, each point in space corresponds uniquely to
a 3-tuple and uniquely to a 3-vector (and vice-versa). It is rather hard to draw pictures of objects
in three dimensions (on two dimensional paper!). Nevertheless, this identification will provide us
with an important way of visualising relationships between vectors in R3.
Since the basis vectors chosen are of unit length and are perpendicular to each other, by Pythagoras’ theorem the length of $\overrightarrow{OA}$ is $\sqrt{x_1^2 + y_1^2 + z_1^2}$. Let B be another point with position vector $\mathbf{b} = \begin{pmatrix} x_2 \\ y_2 \\ z_2 \end{pmatrix}$. The distance between A and B is the same as the length of the vector $\mathbf{b} - \mathbf{a} = \overrightarrow{AB}$, i.e.
$$\mathrm{dist}(A, B) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2 + (z_1 - z_2)^2}.$$
Similar to $\mathbb{R}^2$, if $\mathbf{a} = \overrightarrow{OA}$, the length of a is dist(O, A). Two non-zero vectors a and b in $\mathbb{R}^3$ are parallel if and only if there is a non-zero real number $\lambda$ such that $\mathbf{a} = \lambda\mathbf{b}$.
1.3.3 n-dimensions
We have used elements in R2 and R3 to represent the position vectors of points in the plane and in
the three-dimensional space, respectively. How do we interpret n-vectors in Rn as position vectors
of points in an n-dimensional space which generalises the 2-dimensional and the 3-dimensional
spaces? To study the geometry in the n-dimensional space, we need the notions of length and
direction.
In R2 or R3, two non-zero vectors a and b are parallel if and only if there exists a non-zero real
number λ such that b = λa. We use this idea as definition.
Definition 1. Two non-zero vectors a and b in Rn are said to be parallel if
b = λa for some non-zero real number λ.
They are said to be in the same direction if λ > 0.
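Definition 1 suggests a direct test for parallelism: use one non-zero component of a to find the only possible value of λ, then check every component. The sketch below is illustrative and assumes vectors are given as lists of integers or `Fraction` values, so that the arithmetic is exact; the function names are not from these Notes.

```python
from fractions import Fraction

def scalar_multiple(a, b):
    """Return lam with b = lam * a, or None if no such non-zero scalar exists.

    a and b must be non-zero vectors with the same number of components.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of components")
    if not any(a) or not any(b):
        raise ValueError("both vectors must be non-zero")
    # The first non-zero component of a fixes the only candidate for lam.
    i = next(k for k, x in enumerate(a) if x != 0)
    lam = Fraction(b[i]) / Fraction(a[i])
    if lam != 0 and all(Fraction(y) == lam * Fraction(x) for x, y in zip(a, b)):
        return lam
    return None

def are_parallel(a, b):
    """True if b = lam * a for some non-zero real lam."""
    return scalar_multiple(a, b) is not None

def same_direction(a, b):
    """True if b = lam * a for some lam > 0."""
    lam = scalar_multiple(a, b)
    return lam is not None and lam > 0

print(are_parallel([3, -1, 2, 4], [6, -2, 4, 8]))
print(are_parallel([3, 0, 2, 1], [6, 1, 4, 2]))
```

Exact fractions are used instead of floating point so that a test such as b = λa is not affected by rounding.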
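Example 1. The vectors $\begin{pmatrix} 3 \\ -1 \\ 2 \\ 4 \end{pmatrix}$ and $\begin{pmatrix} 6 \\ -2 \\ 4 \\ 8 \end{pmatrix}$ are parallel because $\begin{pmatrix} 6 \\ -2 \\ 4 \\ 8 \end{pmatrix} = 2\begin{pmatrix} 3 \\ -1 \\ 2 \\ 4 \end{pmatrix}$.
The vectors $\begin{pmatrix} 3 \\ 0 \\ 2 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 6 \\ 1 \\ 4 \\ 2 \end{pmatrix}$ are not parallel because we cannot find a scalar $\lambda$ such that $\begin{pmatrix} 6 \\ 1 \\ 4 \\ 2 \end{pmatrix} = \lambda\begin{pmatrix} 3 \\ 0 \\ 2 \\ 1 \end{pmatrix}$. ♦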
Definition 2. Let $\mathbf{e}_j$ be the vector in $\mathbb{R}^n$ with a 1 in the jth component and 0 for all the other components. Then the n vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$ are called the standard basis vectors for $\mathbb{R}^n$.
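Example 2. The standard basis vectors for $\mathbb{R}^3$ are
$$\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} = a_1\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + a_2\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + a_3\begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$
The standard basis vectors for $\mathbb{R}^4$ are
$$\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}.$$
♦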
We imagine that we can choose a point in the n-dimensional space as the origin O and n axes in the directions of these n standard basis vectors respectively. Then we can define the coordinates of a point A with respect to this coordinate system in the same way as in 3-dimensional space. For instance, if A is $a_i$ units from O in the direction of $\mathbf{e}_i$ for each $1 \leq i \leq n$, the coordinates of A are given by an n-tuple $(a_1, \ldots, a_n)$. Thus the coordinate vector, or the position vector, of the point A is $\mathbf{a} = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} \in \mathbb{R}^n$. We can also discuss n-dimensional geometric vectors. For example, the vector $\overrightarrow{OA}$, which has the initial point O and the terminal point A, represents a.
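Example 3. i) The coordinate vector for the displacement from the point A with coordinates (2, 3) to the point B with coordinates (−1, 2) is
$$\overrightarrow{AB} = \overrightarrow{OB} - \overrightarrow{OA} = \begin{pmatrix} -1 \\ 2 \end{pmatrix} - \begin{pmatrix} 2 \\ 3 \end{pmatrix} = \begin{pmatrix} -3 \\ -1 \end{pmatrix}.$$
ii) The coordinate vector for the line segment joining the point A with coordinates (2, 4, −5, 3) to the point B with coordinates (−3, 4, 0, 4) is
$$\overrightarrow{AB} = \begin{pmatrix} -3 \\ 4 \\ 0 \\ 4 \end{pmatrix} - \begin{pmatrix} 2 \\ 4 \\ -5 \\ 3 \end{pmatrix} = \begin{pmatrix} -5 \\ 0 \\ 5 \\ 1 \end{pmatrix}.$$
iii) If the position vector of the point A is $\begin{pmatrix} -1 \\ 2 \\ 2 \end{pmatrix}$ and the displacement from A to B has coordinate vector $\begin{pmatrix} 4 \\ 2 \\ -3 \end{pmatrix}$, then the position vector for B is
$$\overrightarrow{OB} = \overrightarrow{OA} + \overrightarrow{AB} = \begin{pmatrix} -1 \\ 2 \\ 2 \end{pmatrix} + \begin{pmatrix} 4 \\ 2 \\ -3 \end{pmatrix} = \begin{pmatrix} 3 \\ 4 \\ -1 \end{pmatrix}.$$
Thus B has coordinates (3, 4, −1). ♦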
We can also define the length of a vector and the distance between two points which have
coordinate vectors in Rn.
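Definition 3. The length of a vector $\mathbf{a} = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix} \in \mathbb{R}^n$ is defined by
$$|\mathbf{a}| = \sqrt{a_1^2 + \cdots + a_n^2}.$$
Example 4. Find the lengths of $\mathbf{a} = \begin{pmatrix} 2 \\ -4 \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} 1 \\ -3 \\ 2 \\ -6 \\ 4 \end{pmatrix}$.
Solution. $|\mathbf{a}| = \sqrt{4 + 16} = 2\sqrt{5}$; $|\mathbf{b}| = \sqrt{1 + 9 + 4 + 36 + 16} = \sqrt{66}$. ♦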
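The length formula of Definition 3 translates directly into code. The following Python sketch is an illustration under the assumption that a vector is stored as a list of numbers; the helper `distance`, which takes the length of the difference b − a in the same way as the distance formulas for $\mathbb{R}^2$ and $\mathbb{R}^3$ given earlier, is named here for convenience only.

```python
import math

def length(a):
    """Length |a| of a vector in R^n: the square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in a))

def distance(a, b):
    """Distance between points with position vectors a and b, computed as |b - a|."""
    if len(a) != len(b):
        raise ValueError("position vectors must have the same number of components")
    return length([y - x for x, y in zip(a, b)])

# The two vectors of Example 4.
print(length([2, -4]))
print(length([1, -3, 2, -6, 4]))
```

The printed values are floating-point approximations of $2\sqrt{5}$ and $\sqrt{66}$.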
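Definition 4. The distance between two points A and B with position vectors in $\mathbb{R}^n$ is the length of the vector $\overrightarrow{AB}$.
Then, if $\mathbf{a} = \begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}$ is the coordinate vector of A and $\mathbf{b} = \begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}$ is the coordinate vector of B, we have
$$\overrightarrow{AB} = \mathbf{b} - \mathbf{a} = \begin{pmatrix} b_1 - a_1 \\ \vdots \\ b_n - a_n \end{pmatrix},$$
so the distance between A and B is
$$\left|\overrightarrow{AB}\right| = |\mathbf{b} - \mathbf{a}| = \sqrt{(b_1 - a_1)^2 + \cdots + (b_n - a_n)^2}.$$
Example 5. Find the distance between the point A with coordinates (3, −2, 5, 1) and the point B with coordinates (−1, 2, −8, 4).
Solution. As $\overrightarrow{AB} = \begin{pmatrix} -4 \\ 4 \\ -13 \\ 3 \end{pmatrix}$, the distance is $\left|\overrightarrow{AB}\right| = \sqrt{16 + 16 + 169 + 9} = \sqrt{210}$. ♦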
Note. Until now, we have distinguished an n-dimensional space from $\mathbb{R}^n$, which contains all the position vectors of the points in the space. From now on, we shall blur the difference. So we often refer to $\mathbb{R}^2$ as the plane and to $\mathbb{R}^3$ as three dimensional space. In our algebra notes, a point in an n-dimensional space is usually referred to as an n-tuple of coordinates $(a_1, \ldots, a_n)$ and the corresponding position vector is denoted by an n-vector $\begin{pmatrix} a_1 \\ \vdots \\ a_n \end{pmatrix}$.
1.4 Lines
Suppose now that we wish to express a line in the plane in terms of coordinate vectors. One form
for the equation of a line is y = mx+ b, where m is the slope of the line and b is the y-intercept.
Since a line is really just a set of points, each point on the line will have a corresponding
coordinate vector in R2. There is a simple way to describe the coordinate vector of each point on
a line in the plane.
Let us look firstly at a line through the origin, y = mx. One way of reading this equation is to say that the line is the set of all points whose coordinates are of the form (x, mx). In other words, the set
$$S = \left\{\mathbf{x} \in \mathbb{R}^2 : \mathbf{x} = x\begin{pmatrix} 1 \\ m \end{pmatrix}, \text{ for some } x \in \mathbb{R}\right\}$$
is the set of all coordinate vectors for points on the line.
There is another way of thinking about this. Let v be some non-zero displacement in the plane. Let Q be a point in the plane such that $\overrightarrow{OQ} = \mathbf{v}$. Consider the set S of all points P in the plane whose displacement from the origin is $\lambda\mathbf{v}$ for some real number $\lambda$. It is not difficult to see that S is the set of all points on the line through O parallel to v or, equivalently, S is the line through O and Q. See Figure 19.
Figure 19: A Line Through O.
For each value of the ‘parameter’ λ, we obtain a point, in the form of a coordinate vector, on
the line. It is for this reason that such a description is often called a parametric vector form
for the line.
This is how we are going to define a ‘line’ spanned by a vector in Rn.
Definition 1. Let v be any non-zero vector in Rn. The set
S = {x ∈ Rn : x = λv, for some λ ∈ R}.
is the line in Rn spanned by v, and we call the expression x = λv, a parametric
vector form for this line.
Example 1. Draw a diagram of the line in $\mathbb{R}^2$ spanned by $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$, and find a parametric vector form for this line.
Solution. This is the set of all points of the form λ(1, 2) = (λ, 2λ) for some $\lambda \in \mathbb{R}$. In the plane, it is the line going through the origin and the point with coordinates (1, 2). (See Figure 20.)
A parametric vector form is
$$\mathbf{x} = \lambda\begin{pmatrix} 1 \\ 2 \end{pmatrix}, \quad \lambda \in \mathbb{R}.$$
♦
Figure 20.
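Example 2. Write a parametric vector form for the line in $\mathbb{R}^4$ spanned by the vector $\begin{pmatrix} 3 \\ 5 \\ -6 \\ 8 \end{pmatrix}$.
A solution is $\mathbf{x} = \lambda\begin{pmatrix} 3 \\ 5 \\ -6 \\ 8 \end{pmatrix}$ for $\lambda \in \mathbb{R}$. ♦
Note. A given line does not have a unique parametric vector form. The following are also parametric vector forms of the same line.
$$\mathbf{x} = \lambda\begin{pmatrix} -3 \\ -5 \\ 6 \\ -8 \end{pmatrix} \quad\text{and}\quad \mathbf{x} = \lambda\begin{pmatrix} 9 \\ 15 \\ -18 \\ 24 \end{pmatrix}.$$
On the other hand, the name of the parameter in the expression is not important. For example, the forms
$$\mathbf{x} = \mu\begin{pmatrix} 3 \\ 5 \\ -6 \\ 8 \end{pmatrix} \text{ for } \mu \in \mathbb{R} \quad\text{and}\quad \mathbf{x} = s\begin{pmatrix} 3 \\ 5 \\ -6 \\ 8 \end{pmatrix} \text{ for } s \in \mathbb{R}$$
also represent the same line.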
Most lines in the plane, or in three dimensional space do not go through the origin of the
coordinate system being used. Consider the line in the plane going through the points A and B.
To find the coordinate vector of some point P on this line, we shall calculate the displacement from
O to P .
Clearly
$$\overrightarrow{OP} = \overrightarrow{OA} + \overrightarrow{AP}.$$
Because A, P and B are collinear, $\overrightarrow{AP}$ is parallel to $\overrightarrow{AB}$. That is, $\overrightarrow{AP} = \lambda\overrightarrow{AB}$, for some real number $\lambda$. Let a denote the position vector of A and x the position vector of P, and let v denote the displacement from A to B. In terms of these vectors, what we have just said is
$$\mathbf{x} = \mathbf{a} + \lambda\mathbf{v}, \quad\text{for some } \lambda \in \mathbb{R}.$$
Again, each value of the parameter $\lambda$ gives the position vector of a point on the line, and each point on the line can be found by choosing an appropriate value of $\lambda$.
Figure 21.
Definition 2. A line in Rn is any set of vectors of the form
S = {x ∈ Rn : x = a+ λv, for some λ ∈ R},
where a and v ≠ 0 are fixed vectors in Rn. The expression
x = a+ λv, for some λ ∈ R,
is a parametric vector form of the line through a parallel to v.
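Example 3. Find a parametric vector form of the line through the point (2, −3, 1, 6) parallel to the vector $\mathbf{v} = \begin{pmatrix} 1 \\ 4 \\ -6 \\ 2 \end{pmatrix}$.
Solution. Each point on the line has a coordinate vector of the form
$$\mathbf{x} = \begin{pmatrix} 2 \\ -3 \\ 1 \\ 6 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ 4 \\ -6 \\ 2 \end{pmatrix}, \quad\text{for some } \lambda \in \mathbb{R}. \qquad (*)$$
♦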
Note. Once again, a line may have different parametric vector forms. For instance, the line
$$\mathbf{x} = \begin{pmatrix} 3 \\ 1 \\ -5 \\ 8 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ 4 \\ -6 \\ 2 \end{pmatrix}, \quad\text{for some } \lambda \in \mathbb{R}, \qquad (**)$$
is one which passes through (3, 1, −5, 8) parallel to v. Note that $\begin{pmatrix} 2 \\ -3 \\ 1 \\ 6 \end{pmatrix} + \begin{pmatrix} 1 \\ 4 \\ -6 \\ 2 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \\ -5 \\ 8 \end{pmatrix}$, so (3, 1, −5, 8) is another point on the line with equation (*). Thus both (*) and (**) are parametric vector forms of the same line.
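Checking whether a given point lies on a line x = a + λv amounts to solving for λ in one component and verifying the rest. The sketch below illustrates this with the two forms (*) and (**); the function name `parameter_of` and the use of tuples of integers are assumptions made for the example.

```python
from fractions import Fraction

def parameter_of(point, a, v):
    """Return lam with point = a + lam * v, or None if the point is not on the line.

    The line is x = a + lam * v with v non-zero. All inputs are sequences of
    integers or Fractions with the same number of components.
    """
    if not (len(point) == len(a) == len(v)):
        raise ValueError("point, a and v must have the same number of components")
    if not any(v):
        raise ValueError("the direction vector v must be non-zero")
    # A non-zero component of v fixes the only possible value of lam.
    i = next(k for k, x in enumerate(v) if x != 0)
    lam = Fraction(point[i] - a[i]) / Fraction(v[i])
    on_line = all(p == ai + lam * vi for p, ai, vi in zip(point, a, v))
    return lam if on_line else None

v = (1, 4, -6, 2)
# Base point of (**) tested against the line (*), and vice versa.
print(parameter_of((3, 1, -5, 8), (2, -3, 1, 6), v))
print(parameter_of((2, -3, 1, 6), (3, 1, -5, 8), v))
```

Since (*) and (**) share the direction vector v, finding the base point of one on the other is enough to conclude that they describe the same line.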
1.4.1 Lines in R2
We can write the equation of a line in Cartesian form y = mx + d and also in parametric vector
form x = a+ λv. It is often necessary to convert between these two forms.
Conversion from Cartesian form to parametric vector form.
The equation y = mx+ d can be converted to parametric vector form as follows:
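Suppose $\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix}$ is the position vector of a point on the line and set $x = \lambda$. Then, since $y = m\lambda + d$, we obtain
$$\mathbf{x} = \begin{pmatrix} \lambda \\ m\lambda + d \end{pmatrix} = \begin{pmatrix} 0 \\ d \end{pmatrix} + \begin{pmatrix} \lambda \\ m\lambda \end{pmatrix} = \begin{pmatrix} 0 \\ d \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ m \end{pmatrix}.$$
Thus a parametric vector form for the line is
$$\mathbf{x} = \begin{pmatrix} 0 \\ d \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ m \end{pmatrix}, \quad\text{for } \lambda \in \mathbb{R}.$$
You should think of this as saying that the equation y = mx + d represents a line through the point (0, d) parallel to the vector $\begin{pmatrix} 1 \\ m \end{pmatrix}$.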
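Example 4. The equation y = 3x + 2 has a parametric vector form
$$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 2 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ 3 \end{pmatrix}, \quad\text{for } \lambda \in \mathbb{R}.$$
The line passes through (0, 2) and is parallel to $\begin{pmatrix} 1 \\ 3 \end{pmatrix}$. ♦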
The equation y = mx+ d is a special case of the more general linear equation in two variables
x and y, which is given by
ax+ by = c,
where a, b, and c are fixed numbers. If b ≠ 0, this linear equation can easily be converted to the
y = mx + d form by dividing by b. Alternatively, the equation ax + by = c can be converted
directly to parametric vector form by setting either x or y as a parameter. The following example
illustrates this technique.
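Example 5. Find parametric vector forms for the lines in $\mathbb{R}^2$ given by
i) 2x − 4y = 6, ii) 2x = 8, and iii) 4y = 8.
Solution. i) On setting $y = \lambda$, we have $2x - 4\lambda = 6$, and hence $x = 3 + 2\lambda$. Hence a parametric vector form for the line is
$$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3 + 2\lambda \\ \lambda \end{pmatrix} = \begin{pmatrix} 3 \\ 0 \end{pmatrix} + \lambda\begin{pmatrix} 2 \\ 1 \end{pmatrix}, \quad\text{for } \lambda \in \mathbb{R}.$$
ii) Note that 2x = 8 means x = 4. So x has a fixed value but the value of y varies. We need to set $y = \lambda$. Hence a parametric vector form for the line is
$$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 4 \\ \lambda \end{pmatrix} = \begin{pmatrix} 4 \\ 0 \end{pmatrix} + \lambda\begin{pmatrix} 0 \\ 1 \end{pmatrix}, \quad\text{for } \lambda \in \mathbb{R}.$$
iii) In this case the equation fixes y = 2, whereas x can have any real value. We therefore set $x = \lambda$ as the parameter and obtain
$$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \lambda \\ 2 \end{pmatrix} = \begin{pmatrix} 0 \\ 2 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad\text{for } \lambda \in \mathbb{R},$$
as a parametric vector form. ♦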
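The technique of Example 5 can be sketched as a small function. This is one possible convention and not the only one: it sets y = λ whenever the coefficient of x is non-zero and sets x = λ otherwise. The function name and the restriction to integer coefficients (so that `Fraction` gives exact results) are assumptions for illustration.

```python
from fractions import Fraction

def cartesian_to_parametric(a, b, c):
    """Return (point, direction) for the line a*x + b*y = c in R^2.

    a, b, c are integers with a and b not both zero. The line is then
    x = point + lam * direction for lam in R.
    """
    if a == 0 and b == 0:
        raise ValueError("a and b cannot both be zero")
    if a != 0:
        # Set y = lam, so x = c/a - (b/a) * lam.
        point = (Fraction(c, a), Fraction(0))
        direction = (Fraction(-b, a), Fraction(1))
    else:
        # a == 0: y = c/b is fixed, so set x = lam.
        point = (Fraction(0), Fraction(c, b))
        direction = (Fraction(1), Fraction(0))
    return point, direction

# The three lines of Example 5.
for coefficients in [(2, -4, 6), (2, 0, 8), (0, 4, 8)]:
    print(coefficients, cartesian_to_parametric(*coefficients))
```

A different choice of parameter gives a different, equally valid, parametric vector form of the same line.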
Conversion from parametric vector form to Cartesian form.
Let us start from a line in parametric vector form
$$\mathbf{x} = \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} + \lambda\begin{pmatrix} v_1 \\ v_2 \end{pmatrix}.$$
We can find two different points $(x_1, y_1)$ and $(x_2, y_2)$ on the line by choosing two different values for the parameter, and then find the Cartesian equation of the line by the two-point form
$$\frac{y - y_1}{x - x_1} = \frac{y_2 - y_1}{x_2 - x_1}.$$
However, we would like to use a method which can be easily generalised to higher dimensions.
By comparing the components of the vectors on the left side and right side of the parametric vector form, we can express the line as a pair of parametric equations:
$$\begin{cases} x = a_1 + \lambda v_1, \\ y = a_2 + \lambda v_2. \end{cases}$$
Then we can eliminate the parameter $\lambda$ to get the line in Cartesian form.
Example 6. Find the Cartesian form of each of the following lines which are given in parametric vector form.
i) $\mathbf{x} = \begin{pmatrix} 3 \\ 1 \end{pmatrix} + \lambda\begin{pmatrix} 2 \\ 1 \end{pmatrix}$, for $\lambda \in \mathbb{R}$, ii) $\mathbf{x} = \begin{pmatrix} 3 \\ 1 \end{pmatrix} + \lambda\begin{pmatrix} 2 \\ 0 \end{pmatrix}$, for $\lambda \in \mathbb{R}$.
Solution. i) By comparing the components, we get the parametric equations $x = 3 + 2\lambda$ and $y = 1 + \lambda$. If we eliminate the parameter $\lambda$, we find
$$\lambda = \frac{x - 3}{2} = y - 1,$$
which on rearranging gives
$$y = \frac{x - 1}{2}, \quad\text{or}\quad x - 2y = 1.$$
ii) For the second line, the parametric equations are $x = 3 + 2\lambda$ and $y = 1$. Since y has the fixed value 1 and x can be any real number, the Cartesian form of the line is y = 1. ♦
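Eliminating λ from x = a1 + λv1 and y = a2 + λv2 can be done in one step by cross-multiplying, which also covers the case where v1 or v2 is zero. The following sketch is illustrative; the function name and the choice to return unnormalised coefficients (A, B, C) of Ax + By = C are assumptions.

```python
def parametric_to_cartesian(a, v):
    """Return (A, B, C) with A*x + B*y = C for the line x = a + lam * v in R^2."""
    a1, a2 = a
    v1, v2 = v
    if v1 == 0 and v2 == 0:
        raise ValueError("the direction vector v must be non-zero")
    # From x - a1 = lam*v1 and y - a2 = lam*v2 we get
    #   v2*(x - a1) = lam*v1*v2 = v1*(y - a2),
    # that is, v2*x - v1*y = v2*a1 - v1*a2.
    return (v2, -v1, v2 * a1 - v1 * a2)

# The two lines of Example 6.
print(parametric_to_cartesian((3, 1), (2, 1)))
print(parametric_to_cartesian((3, 1), (2, 0)))
```

For the second line the coefficients describe −2y = −2, which is the same line as y = 1; dividing through by a common factor is left to the reader.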
1.4.2 Lines in R3
A line in R3 is still of the form x = a + λv, but now the vectors x,a and v each have three
components or three coordinates.
An alternative Cartesian (or symmetric) form for the equation of a line is sometimes used in
engineering. This form can be obtained as follows:
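Let $\mathbf{x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix}$, $\mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}$ and $\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}$ be vectors in $\mathbb{R}^3$. Then the parametric vector equation of the line through $(a_1, a_2, a_3)$ parallel to $\begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}$ is
$$\mathbf{x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix} + \lambda\begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}.$$
Thus the parametric equations of the line are
$$x = a_1 + \lambda v_1, \quad y = a_2 + \lambda v_2, \quad z = a_3 + \lambda v_3.$$
Eliminating the parameter $\lambda$ yields (if all $v_i \neq 0$) the Cartesian form
$$\frac{x - a_1}{v_1} = \frac{y - a_2}{v_2} = \frac{z - a_3}{v_3} \quad (= \lambda).$$
If $v_1$, $v_2$ or $v_3$ is 0, then x, y or z will, respectively, be constant.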
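Example 7. Find the Cartesian form for the lines
i) $\mathbf{x} = \begin{pmatrix} 2 \\ -3 \\ 1 \end{pmatrix} + \lambda\begin{pmatrix} 3 \\ 5 \\ 6 \end{pmatrix}$, for $\lambda \in \mathbb{R}$, ii) $\mathbf{x} = \begin{pmatrix} 2 \\ -3 \\ 1 \end{pmatrix} + \lambda\begin{pmatrix} 3 \\ 0 \\ 6 \end{pmatrix}$, for $\lambda \in \mathbb{R}$.
Solution. i) Here $x = 2 + 3\lambda$, $y = -3 + 5\lambda$, $z = 1 + 6\lambda$. Thus, eliminating $\lambda$ gives
$$\frac{x - 2}{3} = \frac{y + 3}{5} = \frac{z - 1}{6}.$$
ii) In this case, $x = 2 + 3\lambda$, $y = -3$, $z = 1 + 6\lambda$. By eliminating $\lambda$, the Cartesian form of the line is
$$\frac{x - 2}{3} = \frac{z - 1}{6} \quad\text{and}\quad y = -3.$$
♦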
We have seen how to convert a parametric vector form of a line in R3 to the Cartesian form.
To convert a line from Cartesian to the parametric vector form, we find a point on the line and a
vector parallel to the line. Alternatively, we can introduce a parameter.
Example 8. Find a parametric vector form for the line
$$\frac{x - 4}{7} = \frac{2y + 3}{2} = \frac{-z}{6}.$$
Solution. Method 1. Instead of introducing a parameter, we can rewrite the Cartesian equations in the standard form
$$\frac{x - 4}{7} = \frac{y + \frac{3}{2}}{1} = \frac{z}{-6}.$$
This is a line through $\left(4, -\frac{3}{2}, 0\right)$ parallel to $\begin{pmatrix} 7 \\ 1 \\ -6 \end{pmatrix}$. Hence a parametric vector form of the line is
$$\mathbf{x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 4 \\ -\frac{3}{2} \\ 0 \end{pmatrix} + \lambda\begin{pmatrix} 7 \\ 1 \\ -6 \end{pmatrix}, \quad\text{for } \lambda \in \mathbb{R}.$$
Method 2. Set each term equal to $\lambda$. That is,
$$\frac{x - 4}{7} = \frac{2y + 3}{2} = \frac{-z}{6} = \lambda.$$
By rearranging terms, we get the parametric equations
$$x = 7\lambda + 4, \quad y = \lambda - \frac{3}{2}, \quad z = -6\lambda.$$
In vector form, we have
$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 7\lambda + 4 \\ \lambda - \frac{3}{2} \\ -6\lambda \end{pmatrix} = \begin{pmatrix} 4 \\ -\frac{3}{2} \\ 0 \end{pmatrix} + \lambda\begin{pmatrix} 7 \\ 1 \\ -6 \end{pmatrix}.$$
This is the same parametric vector form as the one obtained by Method 1. ♦
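Example 9. Find a parametric vector form of the line defined by
x − 2y = 4 and z = 1.
Solution. Set y to be a parameter $\lambda$. Then $x = 4 + 2\lambda$. In vector form, we have
$$\mathbf{x} = \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 4 + 2\lambda \\ \lambda \\ 1 \end{pmatrix} = \begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix} + \lambda\begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}.$$
Hence a parametric vector form of the line is
$$\mathbf{x} = \begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix} + \lambda\begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}, \quad\text{for } \lambda \in \mathbb{R}.$$
♦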
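Example 10. Find a parametric vector form of the line through the points A (1, 3, −4) and B (2, 0, 6).
Solution. The line passes through the point A (1, 3, −4) and it is parallel to the vector
$$\overrightarrow{AB} = \begin{pmatrix} 2 \\ 0 \\ 6 \end{pmatrix} - \begin{pmatrix} 1 \\ 3 \\ -4 \end{pmatrix} = \begin{pmatrix} 1 \\ -3 \\ 10 \end{pmatrix}.$$
Hence a parametric vector form of the line is
$$\mathbf{x} = \begin{pmatrix} 1 \\ 3 \\ -4 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ -3 \\ 10 \end{pmatrix}, \quad\text{for } \lambda \in \mathbb{R}.$$
♦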
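The construction in Example 10, a base point A together with the direction $\overrightarrow{AB}$, works for any number of components. The sketch below is illustrative only; representing a line as a pair (a, v) of Python lists and the function names `line_through` and `point_at` are assumptions.

```python
def line_through(p, q):
    """Return (a, v) so that x = a + lam * v is the line through points p and q."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    direction = [qi - pi for pi, qi in zip(p, q)]
    if not any(direction):
        raise ValueError("two distinct points are needed to determine a line")
    return list(p), direction

def point_at(line, lam):
    """Position vector a + lam * v of the point on the line with parameter lam."""
    a, v = line
    return [ai + lam * vi for ai, vi in zip(a, v)]

line = line_through((1, 3, -4), (2, 0, 6))
print(line)
print(point_at(line, 0), point_at(line, 1))   # the points A and B
```

With this choice of base point and direction, λ = 0 gives A and λ = 1 gives B.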
1.4.3 Lines through two given points (in Rn)
The methods used in handling lines in $\mathbb{R}^3$ can be generalised to deal with lines in $\mathbb{R}^n$. We have seen the definition of a parametric vector form of a line in $\mathbb{R}^n$ (Definition 2 in Section 1.4), namely $\mathbf{x} = \mathbf{a} + \lambda\mathbf{v}$. We can also use the symmetric form of a line in $\mathbb{R}^n$. In particular, if none of the components of v is 0, the symmetric form, or the Cartesian form, of a line through $(a_1, \ldots, a_n)$ parallel to $\mathbf{v} = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}$ is
$$\frac{x_1 - a_1}{v_1} = \frac{x_2 - a_2}{v_2} = \cdots = \frac{x_n - a_n}{v_n}, \quad\text{where } \mathbf{x} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
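Example 11. Find the equation of the line in $\mathbb{R}^4$ through the points A (1, 0, 3, −2) and B (2, 1, 0, −1) in parametric vector form and Cartesian form.
Solution. The line is parallel to the vector $\begin{pmatrix} 2 \\ 1 \\ 0 \\ -1 \end{pmatrix} - \begin{pmatrix} 1 \\ 0 \\ 3 \\ -2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ -3 \\ 1 \end{pmatrix}$. Thus the line in parametric vector form is
$$\mathbf{x} = \begin{pmatrix} 1 \\ 0 \\ 3 \\ -2 \end{pmatrix} + \lambda\begin{pmatrix} 1 \\ 1 \\ -3 \\ 1 \end{pmatrix}, \quad\text{for } \lambda \in \mathbb{R}.$$
The Cartesian form of the line is
$$\frac{x_1 - 1}{1} = \frac{x_2}{1} = \frac{x_3 - 3}{-3} = \frac{x_4 + 2}{1}.$$
♦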
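More generally, a vector equation of the line joining A to B, with position vectors a and b respectively, is
$$\mathbf{x} = \mathbf{a} + \lambda(\mathbf{b} - \mathbf{a}) = (1 - \lambda)\mathbf{a} + \lambda\mathbf{b}, \quad \lambda \in \mathbb{R}.$$
Different values of $\lambda$ correspond to different points on this line. For example, the value $\lambda = \frac{1}{5}$ gives the position vector
$$\mathbf{x} = \frac{4}{5}\mathbf{a} + \frac{1}{5}\mathbf{b} = \mathbf{a} + \frac{1}{5}(\mathbf{b} - \mathbf{a}).$$
This vector can be written as
$$\mathbf{x} = \overrightarrow{OA} + \frac{1}{5}\overrightarrow{AB}.$$
In $\mathbb{R}^2$ or $\mathbb{R}^3$, it is the position vector of the point P which divides AB in the ratio 1 : 4 (see Figure 22).
Figure 22.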
Generally, in Rn, if P lies between A and B, then AP : PB = λ : (1 − λ), for some λ such that
0 < λ < 1. Therefore, the position vector of P is
(1 − λ)a + λb,
and the set of all the points on the line segment AB in Rn is
{x ∈ Rn : x = (1 − λ)a + λb, for 0 ≤ λ ≤ 1} .
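The formula (1 − λ)a + λb is easy to experiment with. The short Python sketch below is an illustration only; the helper name `point_on_line` is ours, and the points are A and B from Example 10. It computes the point with λ = 1/5 and the midpoint (λ = 1/2) using exact fractions.

```python
from fractions import Fraction

def point_on_line(a, b, lam):
    """Return (1 - lam) * a + lam * b, computed componentwise."""
    return [(1 - lam) * ai + lam * bi for ai, bi in zip(a, b)]

a = [Fraction(1), Fraction(3), Fraction(-4)]   # A (1, 3, -4)
b = [Fraction(2), Fraction(0), Fraction(6)]    # B (2, 0, 6)

p = point_on_line(a, b, Fraction(1, 5))        # divides AB in the ratio 1 : 4
m = point_on_line(a, b, Fraction(1, 2))        # midpoint of AB
print(p)
print(m)
```

Values of λ outside the interval from 0 to 1 give points on the line through A and B that lie outside the segment AB.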
Example 12. Find the medians of the triangle with vertices A (1, 2, 3), B (−2, 1,−4) and C (4,−1, 2).
Solution. A median of a triangle is a line through a vertex and the midpoint of the opposite side.
The median through A is therefore the line which passes through the point A and the midpoint P
of the line segment BC, as shown in Figure 23.
Let a, b, c be the position vectors of A, B, C, respectively.
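Since P is the midpoint of BC, the coordinate vector of P is
\[ \tfrac{1}{2}(\mathbf{b} + \mathbf{c}) = \tfrac{1}{2}\left( \begin{pmatrix} -2 \\ 1 \\ -4 \end{pmatrix} + \begin{pmatrix} 4 \\ -1 \\ 2 \end{pmatrix} \right) = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \]
and hence
\[ \overrightarrow{AP} = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} - \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 0 \\ -2 \\ -4 \end{pmatrix}. \]
Since A lies on the median and the vector $\overrightarrow{AP}$ is parallel to the median, we can write
\[ \mathbf{x} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \lambda_1 \begin{pmatrix} 0 \\ -2 \\ -4 \end{pmatrix}, \quad \lambda_1 \in \mathbb{R} \]
as a parametric vector form for the median through A.
[Figure 23: the triangle ABC, the origin O, and the midpoint P of the side BC.]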
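Similarly, we can obtain the following parametric vector forms for the medians through B and
C:
\[ \mathbf{x} = \begin{pmatrix} -2 \\ 1 \\ -4 \end{pmatrix} + \lambda_2 \left( \begin{pmatrix} 5/2 \\ 1/2 \\ 5/2 \end{pmatrix} - \begin{pmatrix} -2 \\ 1 \\ -4 \end{pmatrix} \right) = \begin{pmatrix} -2 \\ 1 \\ -4 \end{pmatrix} + \lambda_2 \begin{pmatrix} 9/2 \\ -1/2 \\ 13/2 \end{pmatrix}, \quad \lambda_2 \in \mathbb{R}; \]
and
\[ \mathbf{x} = \begin{pmatrix} 4 \\ -1 \\ 2 \end{pmatrix} + \lambda_3 \begin{pmatrix} -9/2 \\ 5/2 \\ -5/2 \end{pmatrix}, \quad \lambda_3 \in \mathbb{R}. \]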
♦
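The three direction vectors in Example 12 can be checked with a few lines of Python. This is a sketch for verification only, and the helper names `midpoint` and `subtract` are our own.

```python
from fractions import Fraction

def midpoint(p, q):
    """Coordinate vector of the midpoint of the segment from p to q."""
    return [Fraction(pi + qi, 2) for pi, qi in zip(p, q)]

def subtract(p, q):
    """Componentwise difference p - q."""
    return [pi - qi for pi, qi in zip(p, q)]

A, B, C = [1, 2, 3], [-2, 1, -4], [4, -1, 2]

# Each median passes through a vertex and is parallel to the vector
# from that vertex to the midpoint of the opposite side.
dir_A = subtract(midpoint(B, C), A)
dir_B = subtract(midpoint(A, C), B)
dir_C = subtract(midpoint(A, B), C)
print(dir_A, dir_B, dir_C)
```

The printed vectors should agree with the direction vectors of the three medians found above.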
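Example 13. Are the lines
\[ \mathbf{x} = \begin{pmatrix} 3 \\ -1 \\ 2 \end{pmatrix} + \lambda_1 \begin{pmatrix} -2 \\ -3 \\ 4 \end{pmatrix} \quad \text{and} \quad \mathbf{x} = \begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix} + \lambda_2 \begin{pmatrix} 3 \\ 9/2 \\ -6 \end{pmatrix} \]
parallel?
Solution. Since two vectors are parallel if and only if one is a non-zero scalar multiple of the
other, and
\[ \begin{pmatrix} 3 \\ 9/2 \\ -6 \end{pmatrix} = -\frac{3}{2} \begin{pmatrix} -2 \\ -3 \\ 4 \end{pmatrix}, \]
we can conclude that the two lines are parallel. ♦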
1.5 Planes
We have seen that the set of all vectors x ∈ Rn which are scalar multiples of a fixed non-zero vector
v ∈ Rn, i.e. the set {x ∈ Rn: x = λv, for λ ∈ R}, has a simple geometric interpretation as a
line through the origin. In this section we shall show that the set of all sums of scalar multiples of
two non-zero non-parallel vectors v1 and v2 has a geometric interpretation as a plane through the
origin. Furthermore, we shall look at how to find equations for planes.
1.5.1 Linear combination and span
The set of all sums of scalar multiples of some collection of vectors is very important in linear
algebra, and therefore has a special name. In the following definitions the word vector is used to
mean either a geometric vector or an element of Rn.
Definition 1. A linear combination of two vectors v1 and v2 is a sum of scalar
multiples of v1 and v2. That is, it is a vector of the form
λ1v1 + λ2v2,
where λ1 and λ2 are scalars.
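Example 1. i) The vector $\begin{pmatrix} -3 \\ 2 \\ 6 \end{pmatrix}$ is a linear combination of the vectors $\begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix}$ and $\begin{pmatrix} 3 \\ -1 \\ 0 \end{pmatrix}$ since
\[ \begin{pmatrix} -3 \\ 2 \\ 6 \end{pmatrix} = 3 \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix} + (-2) \begin{pmatrix} 3 \\ -1 \\ 0 \end{pmatrix}. \]
ii) The vector $\begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix}$ is not a linear combination of $\begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix}$ and $\begin{pmatrix} 3 \\ -1 \\ 0 \end{pmatrix}$ since there are no real
numbers λ1 and λ2 such that
\[ \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} = \lambda_1 \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix} + \lambda_2 \begin{pmatrix} 3 \\ -1 \\ 0 \end{pmatrix}. \]
(Can you prove this?) ♦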
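One way to test numerically whether a vector is a linear combination of two given vectors is sketched below in Python. This is our own illustration, not a method from the notes: it solves for λ1 and λ2 from two of the component equations (using Cramer's rule, which is an assumption brought in from outside this chapter) and then checks the remaining components. It assumes integer components and that v1 and v2 are non-zero and non-parallel.

```python
from fractions import Fraction

def combination_coefficients(v1, v2, w):
    """Return (l1, l2) with l1*v1 + l2*v2 == w, or None if no such scalars exist.

    Assumes v1 and v2 are non-zero, non-parallel and have integer components.
    """
    n = len(w)
    for i in range(n):
        for j in range(i + 1, n):
            det = v1[i] * v2[j] - v1[j] * v2[i]
            if det != 0:
                # Solve the two equations from components i and j.
                l1 = Fraction(w[i] * v2[j] - w[j] * v2[i], det)
                l2 = Fraction(v1[i] * w[j] - v1[j] * w[i], det)
                # These are the only possible values, so check every component.
                ok = all(l1 * v1[k] + l2 * v2[k] == w[k] for k in range(n))
                return (l1, l2) if ok else None
    return None

v1, v2 = [1, 0, 2], [3, -1, 0]
print(combination_coefficients(v1, v2, [-3, 2, 6]))   # coefficients 3 and -2
print(combination_coefficients(v1, v2, [0, 1, 2]))    # None
```

The second call returns None because the values forced by the first two components (λ1 = 3, λ2 = −1) fail in the third component.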
Definition 2. The span of two vectors v1 and v2, written span(v1,v2), is the set
of all linear combinations of v1 and v2. That is, it is the set
span(v1,v2) = {x : x = λ1v1 + λ2v2, for some λ1, λ2 ∈ R }.
Example 2. Suppose that v1 and v2 are non-zero, non-parallel vectors in R3. What does span(v1,v2)
look like?
Take x in span(v1,v2). So x = λ1v1 + λ2v2, for some λ1, λ2 ∈ R. By the parallelogram law
of vector addition, x lies in the plane which passes through O and contains v1 and v2.
Conversely, suppose that X is a point with position vector x on this plane and OX is neither
parallel to v1 nor parallel to v2. Since v1 and v2 are non-zero and non-parallel, there
exists a parallelogram OAXB, such that OA and OB are parallel to v1 and v2 respectively.
Note also that, in this case, $\overrightarrow{OA} = \lambda_1 \mathbf{v}_1$ and $\overrightarrow{OB} = \lambda_2 \mathbf{v}_2$, for some real numbers λ1, λ2. By
the parallelogram law, we have x = λ1v1 + λ2v2 which is in span(v1,v2).
When OX is parallel to v1, we have x = λ1v1 which is also in the span. Similarly, x is also in
the span when OX is parallel to v2.
[Figure 24: two diagrams. The first shows x = λ1v1 + λ2v2 constructed from λ1v1 and λ2v2 by the parallelogram law; the second shows the parallelogram OAXB with OA parallel to v1 and OB parallel to v2.]
Hence, span(v1,v2) is the plane through the origin and parallel to v1 and v2 with equation
x = λ1v1 + λ2v2, for some λ1, λ2 ∈ R.
♦
Note that the above construction does not work if v1 and v2 are parallel or one of them is 0.
Theorem 1. In R3, the span of any two non-zero non-parallel vectors is a plane through the origin.
We can extend this to Rn using:
Definition 3. A plane through the origin is the span of any two (non-zero)
non-parallel vectors.
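Example 3. Describe geometrically span$\left( \begin{pmatrix} 3 \\ -4 \\ 5 \end{pmatrix}, \begin{pmatrix} 4 \\ 2 \\ 7 \end{pmatrix} \right)$, and decide if it is a plane or a line.
Solution. By definition, the span is the set of all vectors x ∈ R3 such that
\[ \mathbf{x} = \lambda_1 \begin{pmatrix} 3 \\ -4 \\ 5 \end{pmatrix} + \lambda_2 \begin{pmatrix} 4 \\ 2 \\ 7 \end{pmatrix} \quad \text{for some } \lambda_1, \lambda_2 \in \mathbb{R}. \]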
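As $\begin{pmatrix} 3 \\ -4 \\ 5 \end{pmatrix}$ is not a scalar multiple of $\begin{pmatrix} 4 \\ 2 \\ 7 \end{pmatrix}$, the span is a plane through the origin parallel to the
vectors $\begin{pmatrix} 3 \\ -4 \\ 5 \end{pmatrix}$ and $\begin{pmatrix} 4 \\ 2 \\ 7 \end{pmatrix}$. ♦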
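Example 4. Describe geometrically span$\left( \begin{pmatrix} 3 \\ 5 \\ 2 \\ 7 \\ 8 \end{pmatrix}, \begin{pmatrix} 6 \\ 10 \\ 4 \\ 14 \\ 16 \end{pmatrix} \right)$ and decide whether it is a plane or a
line.
Solution. By definition, the span is the set of all vectors x ∈ R5 satisfying
\[ \mathbf{x} = \lambda_1 \begin{pmatrix} 3 \\ 5 \\ 2 \\ 7 \\ 8 \end{pmatrix} + \lambda_2 \begin{pmatrix} 6 \\ 10 \\ 4 \\ 14 \\ 16 \end{pmatrix} \quad \text{for some } \lambda_1, \lambda_2 \in \mathbb{R}. \]
As $\begin{pmatrix} 6 \\ 10 \\ 4 \\ 14 \\ 16 \end{pmatrix} = 2 \begin{pmatrix} 3 \\ 5 \\ 2 \\ 7 \\ 8 \end{pmatrix}$, the span is the line $\mathbf{x} = \lambda \begin{pmatrix} 3 \\ 5 \\ 2 \\ 7 \\ 8 \end{pmatrix}$ for some λ ∈ R. ♦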
1.5.2 Parametric vector form of a plane
The span of two non-zero non-parallel vectors is a plane through the origin. For planes that do not
pass through the origin we use a similar approach to that we use for lines.
Consider first three vectors a, v1, v2 ∈ R3.
We shall assume that v1 and v2 are non-zero non-
parallel. Let
S = {x ∈ R3 : x = a+ λ1v1 + λ2v2 for λ1, λ2 ∈ R}.
This set consists of all the vectors in R3 for which
x − a ∈ span(v1,v2). That is, S contains all the
points x for which x − a lies in the plane through
the origin spanned by v1 and v2. Thus S is the plane
in R3 which passes through the point with position
vector a, and is parallel to v1 and v2. The picture
is shown in Figure 25.
So in R3,
x = a+ λ1v1 + λ2v2, for λ1, λ2 ∈ R
is an equation of the plane passing through a and
parallel to the non-zero non-parallel vectors v1,v2.
This is called a parametric vector form of the plane.
[Figure 25: x = a + λ1v1 + λ2v2. The plane S, the origin O, and the vectors a, x, x − a, λ1v1 and λ2v2.]
As before we shall use this as a way of defining what we mean by a plane in Rn.
Definition 4. Let a, v1 and v2 be fixed vectors in Rn, and suppose that v1 and v2
are not parallel. Then the set
S = {x ∈ Rn : x = a+ λ1v1 + λ2v2, for some λ1, λ2 ∈ R} ,
is the plane through the point with position vector a, parallel to the vectors v1 and
v2. The expression
x = a+ λ1v1 + λ2v2, for λ1, λ2 ∈ R,
is called a parametric vector form of the plane.
Note. For a given plane, there is not a unique parametric vector form. For example, a plane
parallel to v1 and v2 is also parallel to v1 + v2 and v1 − v2.
From the definition, a point P with position vector p ∈ Rn is said to lie on the plane through
a parallel to v1, v2, if there exist real numbers λ1, λ2 such that p = a+ λ1v1 + λ2v2.
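Example 5. Find a parametric vector form for the plane through the point (1, 3, 0), parallel to
the vectors $\begin{pmatrix} 4 \\ 2 \\ 9 \end{pmatrix}$ and $\begin{pmatrix} -2 \\ 3 \\ 4 \end{pmatrix}$.
Solution. One answer (there are many possible!) is
\[ \mathbf{x} = \begin{pmatrix} 1 \\ 3 \\ 0 \end{pmatrix} + \lambda_1 \begin{pmatrix} 4 \\ 2 \\ 9 \end{pmatrix} + \lambda_2 \begin{pmatrix} -2 \\ 3 \\ 4 \end{pmatrix}, \quad \text{for } \lambda_1, \lambda_2 \in \mathbb{R}. \]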
♦
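Example 6. Describe geometrically the set
\[ \left\{ \mathbf{x} \in \mathbb{R}^4 : \mathbf{x} = \begin{pmatrix} 1 \\ 4 \\ 2 \\ 1 \end{pmatrix} + \lambda_1 \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} + \lambda_2 \begin{pmatrix} -3 \\ 2 \\ -4 \\ 1 \end{pmatrix}, \text{ for } \lambda_1, \lambda_2 \in \mathbb{R} \right\}. \]
Solution. Since $\begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} -3 \\ 2 \\ -4 \\ 1 \end{pmatrix}$ are not parallel (i.e. they are not scalar multiples of each
other), this set is the plane in R4 through the point (1, 4, 2, 1) parallel to the vectors
$\begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} -3 \\ 2 \\ -4 \\ 1 \end{pmatrix}$. ♦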
Example 7. Let A, B and C be three points in Rn whose coordinate vectors with respect to some
coordinate system are a, b and c respectively. Find a parametric vector form for the plane which
passes through A, B and C.
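Solution. Consider the plane through A parallel to $\overrightarrow{AB}$ and $\overrightarrow{AC}$. This has a parametric vector
form:
\[ \mathbf{x} = \mathbf{a} + \lambda_1 \overrightarrow{AB} + \lambda_2 \overrightarrow{AC} = \mathbf{a} + \lambda_1(\mathbf{b} - \mathbf{a}) + \lambda_2(\mathbf{c} - \mathbf{a}), \quad \text{for } \lambda_1, \lambda_2 \in \mathbb{R}. \]
When λ1 = λ2 = 0 we have x = a. When λ1 = 1, λ2 = 0 we have x = b, and when λ1 = 0, λ2 = 1
we have x = c. Hence it is the plane through A, B and C.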
♦
Example 8. Find a parametric vector form for the plane through the three points A (2,−1, 3), B
(−1, 4, 4) and C (3,−1, 2).
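Solution. The plane is parallel to
\[ \overrightarrow{AB} = \begin{pmatrix} -1 \\ 4 \\ 4 \end{pmatrix} - \begin{pmatrix} 2 \\ -1 \\ 3 \end{pmatrix} = \begin{pmatrix} -3 \\ 5 \\ 1 \end{pmatrix} \quad \text{and} \quad \overrightarrow{BC} = \begin{pmatrix} 4 \\ -5 \\ -2 \end{pmatrix}, \]
and passes through the point A (2,−1, 3). Hence a parametric vector form for the plane is
\[ \mathbf{x} = \begin{pmatrix} 2 \\ -1 \\ 3 \end{pmatrix} + \lambda_1 \begin{pmatrix} -3 \\ 5 \\ 1 \end{pmatrix} + \lambda_2 \begin{pmatrix} 4 \\ -5 \\ -2 \end{pmatrix}, \quad \text{for } \lambda_1, \lambda_2 \in \mathbb{R}. \]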
♦
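A quick sanity check for an answer of this kind is to confirm that suitable parameter values recover all three points. The Python sketch below (our own illustration; the name `plane_point` is hypothetical) does this for Example 8.

```python
def plane_point(a, v1, v2, lam1, lam2):
    """Return a + lam1*v1 + lam2*v2, computed componentwise."""
    return [ai + lam1 * p + lam2 * q for ai, p, q in zip(a, v1, v2)]

A, B, C = [2, -1, 3], [-1, 4, 4], [3, -1, 2]
AB = [b - a for a, b in zip(A, B)]   # direction from A to B
BC = [c - b for b, c in zip(B, C)]   # direction from B to C

print(plane_point(A, AB, BC, 0, 0))   # gives A
print(plane_point(A, AB, BC, 1, 0))   # gives B
print(plane_point(A, AB, BC, 1, 1))   # gives C, since AB + BC = AC
```

Because the second direction vector here is BC rather than AC, the point C corresponds to λ1 = λ2 = 1.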
1.5.3 Cartesian form of a plane in R3
In R3, any equation of the form
ax1 + bx2 + cx3 = d
represents a plane when a, b, c are not all zero. In Rn, n > 3, if a1, a2, . . . , an are not all zero, the
set of points satisfying an equation of the form
a1x1 + a2x2 + · · · + anxn = d
is called a hyperplane.
Example 9. Find a parametric vector form for the plane x1 − 3x2 + 4x3 = 4, and hence find a
point on the plane and two vectors parallel to the plane.
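Solution. We set x2 = λ1 and x3 = λ2. Then solving for x1 gives x1 = 4 + 3λ1 − 4λ2. Hence
\[ \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 4 + 3\lambda_1 - 4\lambda_2 \\ \lambda_1 \\ \lambda_2 \end{pmatrix} = \begin{pmatrix} 4 \\ 0 \\ 0 \end{pmatrix} + \lambda_1 \begin{pmatrix} 3 \\ 1 \\ 0 \end{pmatrix} + \lambda_2 \begin{pmatrix} -4 \\ 0 \\ 1 \end{pmatrix}, \]
for λ1, λ2 ∈ R. Thus the plane passes through the point (4, 0, 0) and is parallel to $\begin{pmatrix} 3 \\ 1 \\ 0 \end{pmatrix}$ and $\begin{pmatrix} -4 \\ 0 \\ 1 \end{pmatrix}$. ♦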
As we did for lines, we can find a Cartesian equation for a plane by eliminating the parameters
λ1, λ2 from the parametric vector form.
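Example 10. Find the Cartesian form of the plane $\mathbf{x} = \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} + \mu \begin{pmatrix} -1 \\ 0 \\ 2 \end{pmatrix}$ for λ, µ ∈ R.
Solution. Let $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$. By comparing the components, we get the following parametric
equations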
x1 = 1 + λ− µ, x2 = 1 + λ and x3 = 2 + λ+ 2µ.
Hence from the second equation, we get λ = x2 − 1. Substituting this into the first equation, we
have x1 = 1 + (x2 − 1)− µ, so µ = x2 − x1. Now substitute these values of λ and µ into the third
parametric equation. The Cartesian form of the plane is, therefore,
x3 = 2 + (x2 − 1) + 2(x2 − x1) or 2x1 − 3x2 + x3 = 1.
♦
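The result of Example 10 can be checked by substituting points of the parametric form into the Cartesian equation. The following Python sketch is a spot check over a small grid of integer parameter values, not a proof, and the function names are our own.

```python
def plane_point(lam, mu):
    """Point of the plane in Example 10 with parameters lam and mu."""
    a, v1, v2 = (1, 1, 2), (1, 1, 1), (-1, 0, 2)
    return tuple(ai + lam * p + mu * q for ai, p, q in zip(a, v1, v2))

def satisfies_cartesian(x):
    """Check the Cartesian form 2*x1 - 3*x2 + x3 = 1."""
    x1, x2, x3 = x
    return 2 * x1 - 3 * x2 + x3 == 1

all_on_plane = all(satisfies_cartesian(plane_point(lam, mu))
                   for lam in range(-3, 4) for mu in range(-3, 4))
print(all_on_plane)   # True
```

Every sampled point satisfies the equation because the λ and µ terms cancel identically: 2(1 + λ − µ) − 3(1 + λ) + (2 + λ + 2µ) = 1.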
1.6 Vectors and Maple
This section shows how to handle vectors in Maple. The capacity to plot simple curves in the plane
is illustrated. The 3-dimensional plot facility needs an x-terminal.
The following instruction loads the linear algebra package:
with(LinearAlgebra);
Before attempting anything involving vectors you may wish to plot a piece of a parabola paramet-
rically with:
plot([2*t,t^2,t=-4..4]);
To enter the vectors $\mathbf{a} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} 2 \\ 0 \\ 9 \end{pmatrix}$, we type:
a:=<1,2,3>;
b:=<2,0,9>;
The command
a[2];
selects the second component of a.
Since a and b have the same number of components, they may be added with:
a+b;
Multiplication by a scalar is handled analogously:
4*a;
Now start another example with
a:=<2,3>; b:=<1,4>;
If a and b are two vectors, a parametric vector form for the line through a and b is given by:
v:=b-a; x:=a+t*v;
To plot the line segment from t = −2 to t = 3, use:
plot([x[1],x[2],t=-2..3]);
Now let a, v1 and v2 be 3-vectors. The plane x = a+ sv1 + tv2 can be entered as:
x:=a+s*v1+t*v2;
To find out about the three-dimensional plot facility, just type:
?plot3d
Problems for Chapter 1
Questions marked with [R] are routine, [H] harder and [M] Maple. You should make sure that you
can do the easier questions before you tackle the more difficult questions. Questions marked with
[V] have video solutions available on Moodle.
Problems 1.1 : Vector quantities
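1. [R][V] Suppose that ABC, DEF and OGH are equally spaced parallel lines, as are ADO, BEG
and CFH, and that P is the midpoint of AD.
[Figure: a grid of points with rows A B C, D E F and O G H; P is the midpoint of AD.]
If $\overrightarrow{OH} = \mathbf{h}$ and $\overrightarrow{OA} = \mathbf{a}$, express the following in terms of a and h.
a) $\overrightarrow{OC}$, b) $\overrightarrow{HA}$, c) $\overrightarrow{GC}$, d) $\overrightarrow{OP}$, e) $\overrightarrow{GP}$.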
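2. [R] Simplify
a) $\overrightarrow{AB} - \overrightarrow{OB} + \overrightarrow{OA}$, b) $\overrightarrow{AB} - \overrightarrow{CB} + 3\overrightarrow{DA} + 3\overrightarrow{CD}$.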
3. [R] Express each of the following in terms of a and b.
a) 3(2a + b)− 2(5a− b),
b) 2(p a+ q b) + 3(r a− sb) where p, q, r, s ∈ R.
4. [R] Let ABC be a triangle with $\overrightarrow{OA} = \mathbf{a}$, $\overrightarrow{OB} = \mathbf{b}$, $\overrightarrow{OC} = \mathbf{c}$ where O is the origin.
a) If M is the midpoint of the line segment AB and P is the midpoint of the line segment
CB, express the vectors $\overrightarrow{OM}$ and $\overrightarrow{OP}$ in terms of a, b, and c.
b) Show that $\overrightarrow{MP}$ is parallel to $\overrightarrow{AC}$ and has half its length.
5. [H][V] Given a convex quadrilateral ABCD, prove, using vectors, that the quadrilateral
formed by joining the midpoints of AB, BC, CD, and DA is a parallelogram.
6. [R] Use geometric vectors to solve the following problems. In each case, draw a careful
picture and then use trigonometry to find the answer. If your picture is accurate, you may
wish to use a ruler and protractor to confirm your result.
a) An ant crawls 10 cm due east in a straight line and then crawls 5 cm northeast in a
straight line. What is the ant’s final displacement from its starting point?
b) An ant is standing at the western edge of a moving walkway which is moving at 12
cm per sec in the direction due South. The ant starts to walk at 5 cm/sec across the
walkway in the direction perpendicular to its edge. If the walkway is 40 cm wide, find
the displacement of the ant from its starting point just as it steps off the walkway.
c) An observer on a wharf sees a yacht sailing at 15 km per hour southeast. A sailor on
the yacht is watching a container ship and sees it sailing at 25 km per hour due north.
What is the velocity of the container ship as seen by the observer on the wharf?
d) A rower is rowing across a river. His rowing speed is 2 km per hour and there is a
current flowing in the river at 1 km per hour. Find the direction that the rower must
row to go directly across the river. If the river is 300 metres wide, how long will it
take him to cross the river?
Problems 1.2 : Vector quantities and Rn
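7. [R][V] Find u + 2v − 3w (if possible) given that
a) $\mathbf{u} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$, $\mathbf{v} = \begin{pmatrix} -1 \\ 2 \end{pmatrix}$, $\mathbf{w} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$;
b) $\mathbf{u} = \begin{pmatrix} 2 \\ 3 \\ 3 \end{pmatrix}$, $\mathbf{v} = \begin{pmatrix} 7 \\ 6 \\ -1 \end{pmatrix}$, $\mathbf{w} = \begin{pmatrix} 0 \\ 0 \\ 2 \end{pmatrix}$;
c) $\mathbf{u} = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 2 \end{pmatrix}$, $\mathbf{v} = \begin{pmatrix} -1 \\ 2 \\ 1 \\ 0 \end{pmatrix}$, $\mathbf{w} = \begin{pmatrix} 2 \\ 1 \\ 3 \\ 1 \end{pmatrix}$;
d) $\mathbf{u} = \begin{pmatrix} -1 \\ 3 \\ 5 \end{pmatrix}$, $\mathbf{v} = \begin{pmatrix} 10 \\ 2 \\ -3 \end{pmatrix}$, $\mathbf{w} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$;
e) u = 2i + 3j − 2k, v = i − 2j + k, w = −i + j − k.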
8. [R] A car travels 3km due North then 5km Northeast. Use coordinate vectors to find the
distance and direction from the starting point.
9. [R] Solve Problems 6 (a) and (b) using coordinate vectors.
10. [R] Suppose that $\mathbf{v} = \begin{pmatrix} a_1 \\ b_1 \\ c_1 \end{pmatrix}$ and $\mathbf{w} = \begin{pmatrix} a_2 \\ b_2 \\ c_2 \end{pmatrix}$ are vectors in R3; λ and µ are real numbers.
Prove the scalar distributive law (λ + µ)v = λv + µv and the vector distributive law
λ(v + w) = λv + λw.
Problems 1.3 : Rn and analytic geometry
11. [R][V] Let $\mathbf{v} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$ and $\mathbf{w} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$. Draw coordinate axes and mark in the points
whose coordinate vectors are v, −v, w, v + w, 2v and v − w.
12. [R] Given the following points A, B, C and D, are the vectors $\overrightarrow{AB}$ and $\overrightarrow{CD}$ parallel?
a) A = (1, 2, 3), B = (−2, 3, 4), C = (−3,−4, 7), D = (4,−6,−9);
b) A = (3, 2, 5), B = (5,−3,−6), C = (−2, 3, 7), D = (0,−2,−4);
c) A = (12,−4, 6), B = (2, 6,−4), C = (5,−2, 9), D = (0, 3, 4).
Do any of these sets of 4 points form a parallelogram?
13. [R] Prove that A(1, 2, 1), B(4, 7, 8), C(6, 4, 12) and D(3,−1, 5) are the vertices of a par-
allelogram. Draw and label the parallelogram.
14. [R] Show that the points A(1, 2, 3), B(3, 8, 1), C(7, 20,−3) are collinear.
15. [R][V] Show that the points A(−1, 2, 1), B(4, 6, 3), C(−1, 2,−1) are not collinear.
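16. [R] Show that the points A, B, C in R3 with coordinate vectors
\[ \mathbf{a} = \begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} 0 \\ 1 \\ 4 \end{pmatrix} \quad \text{and} \quad \mathbf{c} = \begin{pmatrix} 6 \\ -5 \\ -2 \end{pmatrix} \]
are collinear.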
17. [H] If A(−1, 3, 4), B(4, 6, 3), C(−1, 2, 1) and D are the vertices of a parallelogram, find
all the possible coordinates for the point D.
18. [H] Consider three non-collinear points D, E, F in R3 with coordinate vectors d, e and
f . There are exactly 3 points in R3 which, taken one at a time with D, E and F, form a
parallelogram. Calculate vector expressions for the three points.
19. [R][V] Let A = (2, 3,−1) and B = (4,−5, 7). Find the midpoint of A and B. Find the
point Q on the line through A and B such that B lies between A and Q and BQ is three
times as long as AB.
20. [R] The coordinate vectors, relative to the origin O, of the points A and B are respectively
a and b. State, in terms of a and b, the position vector of the point T which lies on AB
and is such that $\overrightarrow{AT} = 2\overrightarrow{TB}$.
21. [R] List the standard basis vectors for R5.
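22. [R] For each of the following vectors, find its length and find a vector of length one (a “unit”
vector) parallel to it.
\[ \mathbf{a} = \begin{pmatrix} 4 \\ -4 \\ 2 \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} 2 \\ 1 \\ 0 \\ 3 \end{pmatrix}, \quad \mathbf{c} = \begin{pmatrix} 4 \\ 0 \\ 1 \\ -2 \\ 0 \end{pmatrix}. \]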
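23. [R][V] Find the distances between each of the following pairs of points with coordinate
vectors:
a) $\begin{pmatrix} 8 \\ -4 \\ 2 \end{pmatrix}$, $\begin{pmatrix} -6 \\ 1 \\ 0 \end{pmatrix}$; b) $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 5 \\ -7 \\ -7 \end{pmatrix}$; c) $\begin{pmatrix} 3 \\ 0 \\ 1 \\ 4 \end{pmatrix}$, $\begin{pmatrix} -2 \\ 6 \\ 1 \\ 3 \end{pmatrix}$.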
24. [R] A triangle has vertices A, B and C which have coordinate vectors $\begin{pmatrix} 4 \\ 1 \\ 7 \end{pmatrix}$, $\begin{pmatrix} 7 \\ -4 \\ 6 \end{pmatrix}$ and
$\begin{pmatrix} 6 \\ 2 \\ 8 \end{pmatrix}$ respectively. Find the lengths of the sides of the triangle and deduce that the triangle
is right-angled.
25. [H] Construct a cube in R3 with the length of each edge 1. Show that the face diagonal
has length √2 and the long diagonal √3. Try to generalise this idea to R4 and show that
there are now diagonals of length √2, √3 and 2. How many vertices does a 4-cube have?
Problems 1.4 : Lines
26. [R][V] Find the coordinate vector for the displacement vector $\overrightarrow{AB}$ and parametric vector
forms for the lines through the points A and B with coordinates
a) A (1, 2), B (2, 7); b) A (1, 2,−1), B (−1,−1, 5);
c) A (1, 2, 1), B (7, 2, 3); d) A (1, 2,−1, 3), B (−1, 3, 1, 1).
27. [R] Does the point (3, 5, 7) lie on the line $\mathbf{x} = \begin{pmatrix} -1 \\ 3 \\ 6 \end{pmatrix} + \lambda \begin{pmatrix} 4 \\ 2 \\ 1 \end{pmatrix}$?
28. [R] Find parametric vector forms for the following lines in R2:
a) y = 3x+ 4; b) 3x+ 2y = 6; c) y = −7x;
d) y = 4; e) x = −2.
In each case indicate the direction of the line and a point through which the line passes.
29. [R] Find a parametric vector form and a Cartesian form for each of the following lines
a) through the points (−4, 1, 3) and (2, 2, 3);
b) through (1, 2,−3) parallel to the vector $\begin{pmatrix} 4 \\ -5 \\ 6 \end{pmatrix}$;
c) through (1,−1, 1) parallel to the line joining the points (2, 2, 1) and (7, 1, 3);
d) through (1, 0, 0) parallel to the line joining the points (3, 2,−1) and (3, 5, 2).
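30. [R] Let A, B, P be points in R3 with position vectors
\[ \mathbf{a} = \begin{pmatrix} 7 \\ -2 \\ 3 \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} 1 \\ -5 \\ 0 \end{pmatrix} \quad \text{and} \quad \mathbf{p} = \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix}. \]
Let Q be the point on AB such that $AQ = \tfrac{2}{3}AB$.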
a) Find q, the position vector of Q.
b) Find the parametric vector equation of the line that passes through P and Q.
31. [R] Decide whether each of the following statements is true or false.
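a) The lines y = 3x − 4 and $\mathbf{x} = \begin{pmatrix} 2 \\ 1 \end{pmatrix} + \lambda \begin{pmatrix} 4 \\ 12 \end{pmatrix}$ are parallel.
b) The lines $\mathbf{x} = \begin{pmatrix} 3 \\ -1 \end{pmatrix} + \lambda \begin{pmatrix} 6 \\ 4 \end{pmatrix}$ and 2x + 3y = 8 are parallel.
c) The lines $\mathbf{x} = \begin{pmatrix} 4 \\ -1 \\ 2 \end{pmatrix} + \lambda \begin{pmatrix} 10 \\ 2 \\ 8 \end{pmatrix}$ and $\frac{x + 10}{5} = y - 7 = \frac{z + 3}{4}$ are parallel.
d) The line $\mathbf{x} = \begin{pmatrix} 3 \\ -2 \\ 7 \end{pmatrix} + \lambda \begin{pmatrix} 10 \\ 0 \\ -4 \end{pmatrix}$ and the line $\frac{x + 10}{5} = \frac{z + 3}{-2}$ and y = −5
are parallel.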
Problems 1.5 : Planes
32. [R] Find a parametric vector form for the planes passing through the points
a) (0, 0, 0), (3,−1, 2), (1, 4,−6); b) (1, 4,−2), (2, 6, 4), (1,−10, 3).
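33. [R] For each of the following sets of vectors, decide if the set is a line or a plane, give a
point on the line or plane, and give vectors parallel to the line or plane, i.e., geometrically
describe the sets.
a) $S = \left\{ \mathbf{x} : \mathbf{x} = \lambda_1 \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \lambda_2 \begin{pmatrix} -2 \\ 3 \\ 4 \end{pmatrix} \text{ for } \lambda_1, \lambda_2 \in \mathbb{R} \right\}$.
b) $S = \left\{ \mathbf{x} : \mathbf{x} = \begin{pmatrix} 3 \\ 1 \\ 2 \\ 4 \end{pmatrix} + \lambda_1 \begin{pmatrix} -2 \\ 1 \\ 3 \\ 2 \end{pmatrix} + \lambda_2 \begin{pmatrix} 4 \\ -2 \\ -6 \\ -4 \end{pmatrix} \text{ for } \lambda_1, \lambda_2 \in \mathbb{R} \right\}$.
c) span$\left( \begin{pmatrix} 3 \\ 2 \\ 1 \\ 2 \end{pmatrix}, \begin{pmatrix} -9 \\ -6 \\ -3 \\ -6 \end{pmatrix} \right)$.
d) $S = \left\{ \mathbf{x} : \mathbf{x} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \mathbf{y} \text{ for } \mathbf{y} \in \operatorname{span}\left( \begin{pmatrix} 4 \\ -1 \\ 2 \end{pmatrix}, \begin{pmatrix} 8 \\ 2 \\ 4 \end{pmatrix} \right) \right\}$.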
34. [R][V] Find parametric vector forms for the planes
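a) through the point (1, 2, 3) parallel to $\begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}$ and $\begin{pmatrix} -1 \\ 2 \\ -3 \end{pmatrix}$;
b) through the points (3, 1, 4), (−1, 2, 4), (6, 7,−2);
c) through the points (−2, 4, 1, 6), (3, 2, 6,−1), (1, 4, 0, 0);
d) 4x1 − 3x2 + 6x3 = 12, where $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \in \mathbb{R}^3$;
e) 5x2 − 6x3 = 5, where $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \in \mathbb{R}^3$;
f) through the point (1, 2, 3, 4) parallel to the lines $\mathbf{x} = \begin{pmatrix} -3 \\ 1 \\ 2 \\ 4 \end{pmatrix} + \lambda \begin{pmatrix} 4 \\ 0 \\ -4 \\ 5 \end{pmatrix}$ and
$\frac{x_1 - 5}{7} = \frac{x_2 + 6}{2} = \frac{x_3 - 2}{-3} = \frac{x_4 + 1}{-5}$.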
35. [R] Find parametric vector forms to describe the following planes in R3.
a) x1 + x2 + x3 = 0. b) 3x1 − x2 + 4x3 = 12.
c) x2 + 6x3 = −1. d) x3 = 2.
36. [H] Show that the line $\mathbf{x} = t \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}$
a) lies on the plane 4x − 5y − z = 0, and b) is parallel to the plane 3x − 3y − z = 2.
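37. [H] a) Find the intersection of the line $\mathbf{x} = \begin{pmatrix} 2 + t \\ 3 - t \\ 4t \end{pmatrix}$ and the plane 2x + 3y + z = 16.
b) Find the intersection of the line $\mathbf{x} = \begin{pmatrix} -1 \\ 2 \\ 3 \end{pmatrix} + \lambda \begin{pmatrix} 2 \\ -3 \\ 4 \end{pmatrix}$ and the plane 9x + 4y − z = 0.
38. [H] a) Write the plane $\mathbf{x} = \begin{pmatrix} -3 \\ 2 \\ 6 \end{pmatrix} + \lambda \begin{pmatrix} 2 \\ 4 \\ 0 \end{pmatrix} + \mu \begin{pmatrix} -1 \\ 0 \\ 3 \end{pmatrix}$ in Cartesian form.
b) Write the plane $\mathbf{x} = \begin{pmatrix} 6 \\ -1 \\ 4 \end{pmatrix} + \lambda \begin{pmatrix} -1 \\ 6 \\ 6 \end{pmatrix} + \mu \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}$ in Cartesian form.
39. [H] Consider the line $\frac{x - 3}{-2} = \frac{y + 2}{3} = z - 1$ and the plane 2x + y + 3z = 23 in R3.
a) Find a parametric vector form for the line.
b) Hence find where the line meets the plane.
40. [H] Let ℓ be the line $\frac{x - 6}{5} = \frac{y - 4}{2} = \frac{z - 1}{-2}$ in R3.
a) Express the line ℓ in parametric vector form.
b) Find the coordinates of the point where ℓ meets the plane 2x + y − z = 1.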
Chapter 2
Vector geometry
“Why,” said the Gryphon, “you first form into a line along the sea-shore—”
“Two lines!” cried the Mock Turtle.
Lewis Carroll, Alice in Wonderland.
In Chapters 1 and 4 we have shown how vectors can be used to solve geometric problems
involving points, lines and planes.
Our aim in this chapter is to show how vectors can be used to solve geometric problems involving
lengths, distances, areas, angles and volumes. For simplicity, and because of the fundamental
importance of two and three dimensions in the physical sciences and engineering, we will concentrate
on problems in two and three dimensions. However, we shall see that many of the two and three
dimensional results can be easily generalised to Rn. The key idea is to use theorems in R2 and
R3 to motivate definitions in Rn for n > 3.
2.1 Lengths
We have defined lengths of vectors in Rn and distance between two points on page 18. We expect
that lengths and distances in Rn should have the same essential properties as those in two and
three dimensional spaces. For example, we expect them to be real non-negative numbers. Some of
these essential properties for Rn are proved in the following proposition.
Proposition 1. For all a ∈ Rn and λ ∈ R,
1. |a| is a real number,
2. |a| ≥ 0,
3. |a| = 0 if and only if a = 0,
4. |λa| = |λ| |a|.
Proof. From Definition 3 on page 18,
\[ |\mathbf{a}| = \sqrt{a_1^2 + \cdots + a_n^2}. \]
As a ∈ Rn, all $a_k^2$ are real and non-negative, and hence $a_1^2 + \cdots + a_n^2 \ge 0$. Thus, properties 1 and 2
hold.
Property 3 holds since a sum of non-negative numbers is zero if and only if every term in the
sum is zero.
The proof of Property 4 is as follows. From the definitions of λa and the length of a vector, we
have
\[ |\lambda \mathbf{a}| = \sqrt{(\lambda a_1)^2 + \cdots + (\lambda a_n)^2} = |\lambda| \sqrt{a_1^2 + \cdots + a_n^2} = |\lambda| \, |\mathbf{a}|. \]
Example 1. | − 6a| = 6|a|. ♦
2.2 The dot product
Most introductory courses in trigonometry include a statement of the cosine rule for triangles. This
rule can be stated in vector form as follows.
Proposition 1. Cosine Rule for Triangles. If the sides of a triangle in R2 or R3 are given by
the vectors a, b and c, then
\[ |\mathbf{c}|^2 = |\mathbf{a}|^2 + |\mathbf{b}|^2 - 2|\mathbf{a}| \, |\mathbf{b}| \cos\theta, \]
where θ is the interior angle between a and b.
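Proof. From Figure 1, since △BPA is a right-angled triangle, we have
\[ |\mathbf{c}|^2 = \left|\overrightarrow{BA}\right|^2 = \left|\overrightarrow{BP}\right|^2 + \left|\overrightarrow{PA}\right|^2. \]
But,
\[ \left|\overrightarrow{BP}\right| = \Big| \left|\overrightarrow{OP}\right| - \left|\overrightarrow{OB}\right| \Big| = \big| |\mathbf{a}| \cos\theta - |\mathbf{b}| \big| \quad \text{and} \quad \left|\overrightarrow{PA}\right| = |\mathbf{a}| \sin\theta, \]
and hence
\[ |\mathbf{c}|^2 = \big| |\mathbf{a}| \cos\theta - |\mathbf{b}| \big|^2 + |\mathbf{a}|^2 \sin^2\theta = (|\mathbf{a}| \cos\theta - |\mathbf{b}|)^2 + |\mathbf{a}|^2 \sin^2\theta = |\mathbf{a}|^2 + |\mathbf{b}|^2 - 2|\mathbf{a}| \, |\mathbf{b}| \cos\theta. \]
[Figure 1: The Cosine Rule. The triangle OAB with sides a, b and c, the angle θ at O, and the point P on the line through O and B.]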
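If we write $\mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}$ and use the formula for the length of c = a − b, we have
\begin{align*}
|\mathbf{c}|^2 &= (a_1 - b_1)^2 + (a_2 - b_2)^2 + (a_3 - b_3)^2 \\
&= \left( a_1^2 + a_2^2 + a_3^2 \right) + \left( b_1^2 + b_2^2 + b_3^2 \right) - 2 (a_1 b_1 + a_2 b_2 + a_3 b_3) \\
&= |\mathbf{a}|^2 + |\mathbf{b}|^2 - 2 (a_1 b_1 + a_2 b_2 + a_3 b_3).
\end{align*}
On comparing this expression with the cosine rule, we find
\[ a_1 b_1 + a_2 b_2 + a_3 b_3 = |\mathbf{a}| \, |\mathbf{b}| \cos\theta. \]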
This expression also applies immediately to vectors in the plane, where a3 = b3 = 0.
Let us now make the following definition.
Definition 1. The dot product of two vectors a, b ∈ Rn is
\[ \mathbf{a} \cdot \mathbf{b} = a_1 b_1 + \cdots + a_n b_n = \sum_{k=1}^{n} a_k b_k. \]
Notice that we have already met the expression for the dot product (see Example 4 of Section 5.2)
as the “scalar product” $\mathbf{a}^T \mathbf{b}$, i.e., $\mathbf{a} \cdot \mathbf{b} = \mathbf{a}^T \mathbf{b}$.
For the special case of R2 or R3, we then have
a · b = |a| |b| cos θ.
Students of physics and engineering should note that it is this geometric result which is usually
taken as the definition of the dot product in physics and engineering courses.
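Example 1. Find the dot product of $\begin{pmatrix} 2 \\ 1 \\ 4 \end{pmatrix}$ and $\begin{pmatrix} -1 \\ 3 \\ 2 \end{pmatrix}$, and hence find the cosine of the angle
between the vectors.
Solution. The dot product is
\[ \begin{pmatrix} 2 \\ 1 \\ 4 \end{pmatrix} \cdot \begin{pmatrix} -1 \\ 3 \\ 2 \end{pmatrix} = -2 + 3 + 8 = 9, \]
and the lengths are √21 and √14, and hence
\[ \cos\theta = \frac{9}{\sqrt{(21)(14)}} = \frac{9}{7\sqrt{6}}. \]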
♦
Notice that the value of cos θ does not uniquely define the value of the angle θ. It is conventional
to define the angle between two vectors as an angle θ in the interval [0, π], so that the value of cos θ
does uniquely define the value of θ.
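Example 2. On choosing the angle in the interval [0, π], the angle between the vectors $\begin{pmatrix} 2 \\ 1 \\ 4 \end{pmatrix}$ and
$\begin{pmatrix} -1 \\ 3 \\ 2 \end{pmatrix}$ of Example 1 is
\[ \theta = \cos^{-1} \frac{9}{7\sqrt{6}} = 1.018\ldots \text{ radians.} \]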
♦
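The calculations in Examples 1 and 2 can be reproduced with a short Python sketch. It is an illustration only: the function names `dot`, `length` and `angle` are our own, and the angle is computed in floating point, so the last digits are approximate.

```python
import math

def dot(a, b):
    """Dot product a1*b1 + ... + an*bn."""
    return sum(ak * bk for ak, bk in zip(a, b))

def length(a):
    """Length |a| = sqrt(a . a)."""
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle in [0, pi] between the non-zero vectors a and b."""
    return math.acos(dot(a, b) / (length(a) * length(b)))

a, b = [2, 1, 4], [-1, 3, 2]
print(dot(a, b))               # 9
print(round(angle(a, b), 3))   # 1.018
```

For vectors that are exactly parallel, rounding can push the quotient marginally outside the interval from −1 to 1, in which case `math.acos` raises an error; clamping the quotient first is a common remedy.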
2.2.1 Arithmetic properties of the dot product
Some of the main properties of the dot product are summarised in the following proposition.
Proposition 2. For all vectors a,b, c ∈ Rn and scalars λ ∈ R,
1. a · a = |a|², and hence |a| = √(a · a);
2. a · b is a scalar, i.e., is a real number;
3. Commutative Law: a · b = b · a;
4. a · (λb) = λ(a · b);
5. Distributive Law: a · (b+ c) = a · b+ a · c;
Proof. Properties 1 and 2 follow immediately from the definitions of length and dot product.
The proof of the commutative law is as follows.
a · b = a1b1 + · · · + anbn, and
b · a = b1a1 + · · · + bnan.
But, akbk = bkak (all ak and bk are real numbers), and hence the two expressions are equal.
The proof of Property 4 follows immediately on expanding the dot products on each side and
using the properties of real numbers.
Finally, to prove the distributive law we note that
a · (b+ c) = a1(b1 + c1) + · · ·+ an(bn + cn)
= (a1b1 + · · · + anbn) + (a1c1 + · · ·+ ancn)
= a · b+ a · c
2.2.2 Geometric interpretation of the dot product in Rn
We have seen above that the dot product between two vectors in R2 and R3 has a geometric
interpretation in terms of lengths and angles as
a · b = a1b1 + a2b2 + a3b3 = |a| |b| cos θ,
where θ ∈ [0, π] is the angle between a and b. Now, in Rn, we have defined lengths of vectors
(Definition 3 on page 18) and the dot product (Definition 1 of Section 2.2), but the idea of an
angle has not been defined. It is reasonable to try to define an angle in Rn so that the geometric
interpretation of the dot product in Rn is the same as the geometric interpretation of it in R2 or
R3. We therefore define an angle in Rn as follows.
Definition 2. If a, b are non-zero vectors in Rn, then the angle θ between a
and b is given by
\[ \cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}| \, |\mathbf{b}|}, \quad \text{where } \theta \in [0, \pi]. \]
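Example 3. Find the angle between $\mathbf{a} = \begin{pmatrix} -1 \\ 2 \\ -3 \\ -1 \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} 2 \\ -4 \\ 0 \\ -1 \end{pmatrix}$.
Solution. Using the definition of length, dot product and angle in Rn, we have
\[ |\mathbf{a}| = \sqrt{15}, \quad |\mathbf{b}| = \sqrt{21}, \quad \mathbf{a} \cdot \mathbf{b} = -2 - 8 + 0 + 1 = -9, \]
\[ \cos\theta = -\frac{9}{\sqrt{(15)(21)}} = -\frac{3}{\sqrt{35}}, \]
and hence, on choosing the angle between 0 and π, we have
\[ \theta = \cos^{-1} \left( \frac{-3}{\sqrt{35}} \right) = 2.103\ldots \text{ radians.} \]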
♦
Now, in Definition 2 of an angle in Rn, we have assumed that the definition makes sense, i.e., that
the equation
\[ \cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}| \, |\mathbf{b}|} \]
can always be solved to obtain a real number as the value for the angle θ. But, since
−1 ≤ cos θ ≤ 1 for real numbers, a real solution for θ is possible if and only if
\[ -1 \le \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{a}| \, |\mathbf{b}|} \le 1. \]
A proof that this inequality is true for all non-zero vectors in Rn is given in the following important
theorem.
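Theorem 3 (The Cauchy-Schwarz Inequality). If a, b ∈ Rn, then −|a| |b| ≤ a · b ≤ |a| |b|.
[X] Proof. Note first that the inequality is clearly true if either a or b is a zero vector.
For b ≠ 0, consider
\[ q(\lambda) = |\mathbf{a} - \lambda \mathbf{b}|^2 \quad \text{for } \lambda \in \mathbb{R}. \]
Then, from the properties of lengths and dot products in Rn, we have that q(λ) ≥ 0 for all λ ∈ R and
hence that
\[ 0 \le q(\lambda) = (\mathbf{a} - \lambda \mathbf{b}) \cdot (\mathbf{a} - \lambda \mathbf{b}) = |\mathbf{a}|^2 - 2\lambda \, \mathbf{a} \cdot \mathbf{b} + \lambda^2 |\mathbf{b}|^2. \]
This q(λ) is a quadratic function of λ which has a minimum at
\[ \lambda = \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{b}|^2}. \]
The minimum value is
\[ q_0 = q\left( \frac{\mathbf{a} \cdot \mathbf{b}}{|\mathbf{b}|^2} \right) = |\mathbf{a}|^2 - \frac{(\mathbf{a} \cdot \mathbf{b})^2}{|\mathbf{b}|^2}. \]
Now, as q(λ) ≥ 0 for all λ, we have
\[ 0 \le q_0 = |\mathbf{a}|^2 - \frac{(\mathbf{a} \cdot \mathbf{b})^2}{|\mathbf{b}|^2}. \]
Thus,
\[ (\mathbf{a} \cdot \mathbf{b})^2 \le |\mathbf{a}|^2 |\mathbf{b}|^2, \]
and therefore
\[ -|\mathbf{a}| \, |\mathbf{b}| \le \mathbf{a} \cdot \mathbf{b} \le |\mathbf{a}| \, |\mathbf{b}|, \]
and the proof is complete.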
Another useful inequality which follows immediately from the Cauchy-Schwarz inequality is the
following inequality for lengths of vectors.
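Theorem 4 (Minkowski’s Inequality (or the Triangle Inequality)). For a, b ∈ Rn,
\[ |\mathbf{a} + \mathbf{b}| \le |\mathbf{a}| + |\mathbf{b}|. \]
Proof.
\[ |\mathbf{a} + \mathbf{b}|^2 = (\mathbf{a} + \mathbf{b}) \cdot (\mathbf{a} + \mathbf{b}) = |\mathbf{a}|^2 + 2 \, \mathbf{a} \cdot \mathbf{b} + |\mathbf{b}|^2. \]
But, from the Cauchy-Schwarz inequality, a · b ≤ |a| |b|, and hence
\[ |\mathbf{a} + \mathbf{b}|^2 \le |\mathbf{a}|^2 + 2|\mathbf{a}| \, |\mathbf{b}| + |\mathbf{b}|^2 = (|\mathbf{a}| + |\mathbf{b}|)^2. \]
On taking positive square roots of both sides of this inequality, we then obtain the result to be proved.
[Figure 2: The Triangle Inequality. A triangle with sides of lengths |a|, |b| and |a + b|.]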
As illustrated in Figure 2, the geometric interpretation of the triangle inequality is that the
sum of two sides of a triangle is greater than or equal to the third side.
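Both inequalities are easy to test numerically. The Python sketch below is a spot check on the vectors of Example 3, not a proof, and the helper name `dot` is our own. The Cauchy-Schwarz inequality is tested in its squared form so that only integer arithmetic is needed.

```python
import math

def dot(a, b):
    """Dot product a1*b1 + ... + an*bn."""
    return sum(ak * bk for ak, bk in zip(a, b))

a = [-1, 2, -3, -1]
b = [2, -4, 0, -1]

# Cauchy-Schwarz in squared form: (a . b)^2 <= |a|^2 |b|^2.
cauchy_schwarz_holds = dot(a, b) ** 2 <= dot(a, a) * dot(b, b)

# Triangle inequality: |a + b| <= |a| + |b|.
s = [ak + bk for ak, bk in zip(a, b)]
triangle_holds = math.sqrt(dot(s, s)) <= math.sqrt(dot(a, a)) + math.sqrt(dot(b, b))

print(cauchy_schwarz_holds, triangle_holds)   # True True
```

Here (a · b)² = 81 and |a|²|b|² = 315, while |a + b|² = 18.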
2.3 Applications: orthogonality and projection
The dot product has many applications. Two important applications are for testing if two vectors
are at right angles and for projecting one vector on another.
2.3.1 Orthogonality of vectors
The dot product provides a simple test for vectors being at right angles to each other. We begin
with the following definition.
Definition 1. Two vectors a and b in Rn are said to be orthogonal if a · b = 0.
Note that the zero vector is orthogonal to every vector, including itself.
On using the formula for the dot product in terms of angles, we see that two vectors are
orthogonal if
a · b = |a| |b| cos θ = 0,
i.e., if either |a| or |b| or cos θ is zero. As a vector is zero if its length is zero, and cos θ = 0 only
when θ is a right angle, we have the result that vectors a and b are orthogonal if either vector is
the zero vector or if the vectors are at right angles to each other.
Note. Two non-zero vectors at right angles to each other are also said to be perpendicular to
each other or to be normal to each other.
By the definitions of length and orthogonality, the set of standard basis vectors {e1, . . . , en} is
a set of vectors of unit length at right angles to each other. Sets of vectors with this property are
of great practical importance, and they have been given a special name.
Definition 2. An orthonormal set of vectors in Rn is a set of vectors which
are of unit length and mutually orthogonal.
The connection between the dot product and lengths and angles provides a simple test for a set
of vectors to be an orthonormal set.
Example 1. The three standard basis vectors $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ of $\mathbb{R}^3$ form an orthonormal set.
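Proof. The three vectors are of unit length since
\[ |\mathbf{e}_1|^2 = \mathbf{e}_1\cdot\mathbf{e}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}\cdot\begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} = 1, \]
and similarly, $|\mathbf{e}_2|^2 = 1$ and $|\mathbf{e}_3|^2 = 1$.
The vectors $\mathbf{e}_1$ and $\mathbf{e}_2$ are orthogonal since
\[ \mathbf{e}_1\cdot\mathbf{e}_2 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}\cdot\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} = 0. \]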
Similarly, e1 and e3 are orthogonal since e1 · e3 = 0. Finally, e2 and e3 are orthogonal since
e2 · e3 = 0.
Thus, the three vectors e1, e2, e3 are each of unit length and they are mutually orthogonal, and
hence they form an orthonormal set.
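Note that a compact way of writing the conditions for the vectors $\mathbf{e}_1, \dots, \mathbf{e}_n$ to form an orthonormal set in $\mathbb{R}^n$ is that, for $1 \leq i \leq n$ and $1 \leq j \leq n$,
\[ \mathbf{e}_i\cdot\mathbf{e}_j = \begin{cases} 0 & \text{for } i \neq j; \\ 1 & \text{for } i = j. \end{cases} \]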
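Example 2. Show that the three vectors
\[ \mathbf{u}_1 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ 0 \\ -\frac{1}{\sqrt{2}} \end{pmatrix}, \quad \mathbf{u}_2 = \begin{pmatrix} -\frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{3}} \\ -\frac{1}{\sqrt{3}} \end{pmatrix}, \quad \mathbf{u}_3 = \begin{pmatrix} \frac{1}{\sqrt{6}} \\ \frac{2}{\sqrt{6}} \\ \frac{1}{\sqrt{6}} \end{pmatrix} \]
form an orthonormal set. Find scalars $\lambda_1, \lambda_2, \lambda_3$ such that $\mathbf{e}_1 = \lambda_1\mathbf{u}_1 + \lambda_2\mathbf{u}_2 + \lambda_3\mathbf{u}_3$.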
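Solution. The three vectors are of unit length, since
\[ \mathbf{u}_1\cdot\mathbf{u}_1 = \left(\frac{1}{\sqrt{2}}\right)^2 + 0^2 + \left(-\frac{1}{\sqrt{2}}\right)^2 = 1, \]
and similarly $\mathbf{u}_2\cdot\mathbf{u}_2 = 1$, $\mathbf{u}_3\cdot\mathbf{u}_3 = 1$. They are mutually orthogonal since
\[ \mathbf{u}_1\cdot\mathbf{u}_2 = \left(\frac{1}{\sqrt{2}}\right)\left(-\frac{1}{\sqrt{3}}\right) + 0\times\frac{1}{\sqrt{3}} + \left(-\frac{1}{\sqrt{2}}\right)\left(-\frac{1}{\sqrt{3}}\right) = 0, \]
and similarly $\mathbf{u}_1\cdot\mathbf{u}_3 = 0$, $\mathbf{u}_2\cdot\mathbf{u}_3 = 0$. The three vectors therefore form an orthonormal set.
For writing $\mathbf{e}_1$ as a linear combination of the three vectors, again we use the dot product. Since
\[ \mathbf{u}_1\cdot\mathbf{e}_1 = \mathbf{u}_1\cdot(\lambda_1\mathbf{u}_1 + \lambda_2\mathbf{u}_2 + \lambda_3\mathbf{u}_3), \]
that is,
\[ \frac{1}{\sqrt{2}} = \lambda_1(\mathbf{u}_1\cdot\mathbf{u}_1) + \lambda_2(\mathbf{u}_1\cdot\mathbf{u}_2) + \lambda_3(\mathbf{u}_1\cdot\mathbf{u}_3) = \lambda_1, \]
we have $\lambda_1 = \frac{1}{\sqrt{2}}$. Similarly,
\[ \lambda_2 = \mathbf{u}_2\cdot\mathbf{e}_1 = -\frac{1}{\sqrt{3}} \quad\text{and}\quad \lambda_3 = \mathbf{u}_3\cdot\mathbf{e}_1 = \frac{1}{\sqrt{6}}. \]
In other words, we can write $\mathbf{e}_1$ as a linear combination of the three vectors:
\[ \mathbf{e}_1 = \frac{1}{\sqrt{2}}\,\mathbf{u}_1 - \frac{1}{\sqrt{3}}\,\mathbf{u}_2 + \frac{1}{\sqrt{6}}\,\mathbf{u}_3. \]
♦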
The method used in the above example can generally be used to write any vector in the span
of an orthonormal set as a linear combination of vectors in this set.
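The method of Example 2 can be sketched in code as follows. This is a minimal illustration only: the function names `coefficients` and `combine` are hypothetical helpers introduced here, and floating-point arithmetic is assumed, so results agree with the exact values only up to rounding.

```python
import math

def dot(a, b):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def coefficients(v, orthonormal_set):
    """Coefficients lambda_i = u_i . v of v with respect to an orthonormal set."""
    return [dot(u, v) for u in orthonormal_set]

def combine(coeffs, vectors):
    """Linear combination sum_i coeffs[i] * vectors[i]."""
    n = len(vectors[0])
    return [sum(c * u[i] for c, u in zip(coeffs, vectors)) for i in range(n)]

# The orthonormal set of Example 2.
r2, r3, r6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
u1 = [1 / r2, 0.0, -1 / r2]
u2 = [-1 / r3, 1 / r3, -1 / r3]
u3 = [1 / r6, 2 / r6, 1 / r6]
basis = [u1, u2, u3]

e1 = [1.0, 0.0, 0.0]
lams = coefficients(e1, basis)   # approximately [1/sqrt(2), -1/sqrt(3), 1/sqrt(6)]
rebuilt = combine(lams, basis)   # approximately e1 again
print(lams, rebuilt)
```

Note that the sketch relies on the set being orthonormal; for a set that is merely orthogonal, each coefficient would also have to be divided by the squared length of the corresponding vector.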
Here is an application of orthogonality to the geometry of triangles.
Example 3. Show that the three altitudes of a triangle are concurrent, i.e., they intersect at a
point.
Solution. As shown in Figure 3, let the three vertices be A, B and C (coordinate vectors a, b,
c), and let D and E be the points at which the altitudes from A and B intersect the opposite sides
of the triangle. Finally, let P (coordinate vector p) be the point of intersection of the altitudes AD
and BE.
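Then, we can prove the result by showing that $P$ is also on the altitude from $C$ to $AB$.
Now, as $P$ is on the altitude from $A$ to the opposite side $BC$, we have that $\overrightarrow{AP}$ is perpendicular to $\overrightarrow{BC}$, and hence
\[ (\mathbf{p}-\mathbf{a})\cdot(\mathbf{c}-\mathbf{b}) = 0, \quad\text{i.e.,}\quad \mathbf{p}\cdot\mathbf{c} - \mathbf{p}\cdot\mathbf{b} - \mathbf{a}\cdot\mathbf{c} + \mathbf{a}\cdot\mathbf{b} = 0. \tag{1} \]
Similarly, as $P$ is on the altitude from $B$ to the opposite side $CA$, we have
\[ (\mathbf{p}-\mathbf{b})\cdot(\mathbf{a}-\mathbf{c}) = 0, \quad\text{i.e.,}\quad \mathbf{p}\cdot\mathbf{a} - \mathbf{p}\cdot\mathbf{c} - \mathbf{b}\cdot\mathbf{a} + \mathbf{b}\cdot\mathbf{c} = 0. \tag{2} \]
[Figure 3: Altitudes of a Triangle.]
Then, on adding (1) and (2) and grouping terms, we obtain
\[ (\mathbf{p}-\mathbf{c})\cdot(\mathbf{a}-\mathbf{b}) = 0, \]
i.e., $\overrightarrow{CP}\cdot\overrightarrow{BA} = 0$. Thus, $\overrightarrow{CP}$ is perpendicular to $\overrightarrow{BA}$ and hence $P$ is on the altitude from $C$ to the opposite side $BA$. The result is proved. ♦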
2.3.2 Projections
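The geometric idea of a projection of a vector $\mathbf{a}$ on a non-zero vector $\mathbf{b}$ in $\mathbb{R}^2$ or $\mathbb{R}^3$ is shown in Figure 4, where $\overrightarrow{OP}$ is the projection.
Note that $\overrightarrow{OP}$ is parallel to $\mathbf{b}$ and $\overrightarrow{PA} = \mathbf{a} - \overrightarrow{OP}$ is perpendicular to $\mathbf{b}$. Note also that
\[ |\overrightarrow{OP}| = |\mathbf{a}|\,|\cos\theta| = \frac{|\mathbf{a}\cdot\mathbf{b}|}{|\mathbf{b}|}, \]
because $\mathbf{a}\cdot\mathbf{b} = |\mathbf{a}|\,|\mathbf{b}|\cos\theta$. Since $\frac{\mathbf{b}}{|\mathbf{b}|}$ is a unit vector,
\[ \overrightarrow{OP} = |\overrightarrow{OP}|\,\frac{\mathbf{b}}{|\mathbf{b}|} \ \text{ when } \theta \text{ is acute;} \qquad \overrightarrow{OP} = -|\overrightarrow{OP}|\,\frac{\mathbf{b}}{|\mathbf{b}|} \ \text{ when } \theta \text{ is obtuse.} \]
[Figure 4: Projection of $\mathbf{a}$ on $\mathbf{b}$.]
In both cases, $\overrightarrow{OP} = \left(\dfrac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|^2}\right)\mathbf{b}$. Thus we can formally define a projection in terms of the dot product as follows.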
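Definition 3. For $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$ and $\mathbf{b} \neq \mathbf{0}$, the projection of $\mathbf{a}$ on $\mathbf{b}$ is
\[ \operatorname{proj}_{\mathbf{b}}\mathbf{a} = \left(\frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|^2}\right)\mathbf{b}. \]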
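The geometric properties in $\mathbb{R}^2$ used to motivate this definition can be proved to be true for projections in $\mathbb{R}^n$ also. We have
Proposition 1. $\operatorname{proj}_{\mathbf{b}}\mathbf{a}$ is the unique vector $\lambda\mathbf{b}$ parallel to the non-zero vector $\mathbf{b}$ such that
\[ (\mathbf{a}-\lambda\mathbf{b})\cdot\mathbf{b} = 0. \tag{\#} \]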
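Proof. From (#) we have
\[ 0 = (\mathbf{a}-\lambda\mathbf{b})\cdot\mathbf{b} = \mathbf{a}\cdot\mathbf{b} - \lambda|\mathbf{b}|^2. \]
For $\mathbf{b} \neq \mathbf{0}$, this equation can be solved for $\lambda$ to obtain the unique solution
\[ \lambda = \frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|^2}, \]
and hence the unique solution of (#) is
\[ \lambda\mathbf{b} = \left(\frac{\mathbf{a}\cdot\mathbf{b}}{|\mathbf{b}|^2}\right)\mathbf{b} = \operatorname{proj}_{\mathbf{b}}\mathbf{a}. \]
The proof is complete.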
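Alternative forms of writing the formula for a projection are sometimes useful. These are
\[ \operatorname{proj}_{\mathbf{b}}\mathbf{a} = (\mathbf{a}\cdot\hat{\mathbf{b}})\,\hat{\mathbf{b}} = |\mathbf{a}|\cos\theta\,\hat{\mathbf{b}}, \]
where $\hat{\mathbf{b}} = \dfrac{\mathbf{b}}{|\mathbf{b}|}$ is the unit vector in the direction of $\mathbf{b}$ and where $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$.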
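Example 4. Find the projections of a vector $\mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}$ on the three standard basis vectors $\mathbf{e}_1$, $\mathbf{e}_2$, $\mathbf{e}_3$.
Solution. $\mathbf{a}\cdot\mathbf{e}_1 = a_1$, $|\mathbf{e}_1|^2 = \mathbf{e}_1\cdot\mathbf{e}_1 = 1$, and hence
\[ \operatorname{proj}_{\mathbf{e}_1}\mathbf{a} = \frac{\mathbf{a}\cdot\mathbf{e}_1}{|\mathbf{e}_1|^2}\,\mathbf{e}_1 = a_1\mathbf{e}_1. \]
Similarly, $a_2\mathbf{e}_2$ is the projection of $\mathbf{a}$ on $\mathbf{e}_2$ and $a_3\mathbf{e}_3$ is the projection of $\mathbf{a}$ on $\mathbf{e}_3$. ♦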
Example 5. Find the projection of $\mathbf{a} = \begin{pmatrix} 1 \\ -3 \\ 2 \end{pmatrix}$ on $\mathbf{b} = \begin{pmatrix} -4 \\ 1 \\ 5 \end{pmatrix}$.
Solution. $\mathbf{a}\cdot\mathbf{b} = 3$, $|\mathbf{b}|^2 = 42$, and hence $\operatorname{proj}_{\mathbf{b}}\mathbf{a} = \dfrac{1}{14}\begin{pmatrix} -4 \\ 1 \\ 5 \end{pmatrix}$. ♦
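Note that a simple formula for the length of the projection of $\mathbf{a}$ on $\mathbf{b}$ is
\[ |\operatorname{proj}_{\mathbf{b}}\mathbf{a}| = |\mathbf{a}\cdot\hat{\mathbf{b}}| = \frac{|\mathbf{a}\cdot\mathbf{b}|}{|\mathbf{b}|}. \]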
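Example 6. Find the length of the projection of $\begin{pmatrix} 3 \\ 5 \\ 1 \end{pmatrix}$ on $\begin{pmatrix} -3 \\ 1 \\ -4 \end{pmatrix}$.
Solution. As $\left|\begin{pmatrix} 3 \\ 5 \\ 1 \end{pmatrix}\cdot\begin{pmatrix} -3 \\ 1 \\ -4 \end{pmatrix}\right| = 8$, and as $\left|\begin{pmatrix} -3 \\ 1 \\ -4 \end{pmatrix}\right| = \sqrt{26}$, the length of the projection is $\dfrac{8}{\sqrt{26}}$.
♦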
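The projection formula and the formula for the length of a projection can be sketched in code as below. This is an illustrative sketch, not a prescribed implementation: the function names `proj` and `proj_length` are introduced here, and `proj` assumes integer (or rational) components so that exact fractions can be used.

```python
from fractions import Fraction
import math

def dot(a, b):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def proj(a, b):
    """Projection of a on the non-zero vector b, i.e. (a.b / |b|^2) b.

    Assumes integer or Fraction components so the result is exact.
    """
    bb = dot(b, b)
    if bb == 0:
        raise ValueError("b must be a non-zero vector")
    scale = Fraction(dot(a, b), bb)
    return [scale * x for x in b]

def proj_length(a, b):
    """Length of the projection of a on b, i.e. |a.b| / |b|."""
    return abs(dot(a, b)) / math.sqrt(dot(b, b))

# Example 5: projection of (1, -3, 2) on (-4, 1, 5) is (1/14) (-4, 1, 5).
p = proj([1, -3, 2], [-4, 1, 5])
# Example 6: length of the projection of (3, 5, 1) on (-3, 1, -4) is 8/sqrt(26).
ell = proj_length([3, 5, 1], [-3, 1, -4])
print(p, ell)
```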
2.3.3 Distance between a point and a line in R3
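In $\mathbb{R}^2$ and $\mathbb{R}^3$, the distance between a point $B$ and a line $\mathbf{x} = \mathbf{a} + \lambda\mathbf{d}$ is the shortest distance between the point and the line. In Figure 5, the distance is $|\overrightarrow{PB}|$, where $P$ is the point on the line such that $\angle APB$ is a right angle. We can easily find $|\overrightarrow{AB}|$ from the coordinates of $A$ and $B$, and we can find $|\overrightarrow{AP}|$ as it is the length of the projection of $\overrightarrow{AB}$ on the direction $\mathbf{d}$. Thus,
\[ |\overrightarrow{AB}| = |\mathbf{b}-\mathbf{a}|, \qquad |\overrightarrow{AP}| = \frac{|\overrightarrow{AB}\cdot\mathbf{d}|}{|\mathbf{d}|}, \]
and
\[ |\overrightarrow{PB}| = \sqrt{|\overrightarrow{AB}|^2 - |\overrightarrow{AP}|^2}. \]
[Figure 5: Shortest Distance between Point and Line.]
Note that $\overrightarrow{AB}$ is the line segment joining some point $A$ on the line to the given point $B$, while $\overrightarrow{AP}$ is the projection of this line segment on the direction of the line.
An alternative method of solving this problem is to use the fact that
\[ \overrightarrow{PB} = \overrightarrow{AB} - \overrightarrow{AP} = \mathbf{b} - \mathbf{a} - \operatorname{proj}_{\mathbf{d}}(\mathbf{b}-\mathbf{a}), \]
and then the shortest distance is $|\overrightarrow{PB}|$.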
Example 7. Find the distance from the point $(2,-1,3)$ to the line through the points $(0,1,4)$ and $(4,2,9)$.
Solution. Let $A$, $B$ and $C$ be the points $(0,1,4)$, $(2,-1,3)$ and $(4,2,9)$, respectively. Suppose that $P$ is the foot of the perpendicular from $B$ to the line $AC$. Referring to Figure 5, the length of the projection of $\overrightarrow{AB} = \begin{pmatrix} 2 \\ -2 \\ -1 \end{pmatrix}$ on the direction $\mathbf{d} = \overrightarrow{AC} = \begin{pmatrix} 4 \\ 1 \\ 5 \end{pmatrix}$ is
\[ |\overrightarrow{AP}| = \frac{|\overrightarrow{AB}\cdot\mathbf{d}|}{|\mathbf{d}|} = \frac{|8-2-5|}{\sqrt{16+1+25}} = \frac{1}{\sqrt{42}}. \]
Together with $|\overrightarrow{AB}| = 3$, the distance from the point to the line is
\[ |\overrightarrow{BP}| = \sqrt{3^2 - \frac{1}{42}} = \frac{\sqrt{377}}{\sqrt{42}}. \]
♦
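The alternative method described before Example 7 (subtracting the projection and taking the length of what remains) might be sketched as follows. The function name `distance_point_to_line` is a hypothetical helper introduced for this illustration.

```python
import math

def dot(a, b):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    """Componentwise difference a - b."""
    return [x - y for x, y in zip(a, b)]

def distance_point_to_line(b, a, d):
    """Distance from the point b to the line x = a + lambda d (d non-zero)."""
    ab = sub(b, a)                              # vector AB
    t = dot(ab, d) / dot(d, d)                  # proj_d(AB) = t d
    pb = [x - t * y for x, y in zip(ab, d)]     # PB = AB - proj_d(AB)
    return math.sqrt(dot(pb, pb))

# Example 7: point B and the line through A and C.
A, B, C = [0, 1, 4], [2, -1, 3], [4, 2, 9]
dist = distance_point_to_line(B, A, sub(C, A))
print(dist)  # agrees with sqrt(377)/sqrt(42) up to rounding
```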
2.4 The cross product
We now define the cross product of two vectors in three dimensions. One motivation for a cross
product is to find a formula which gives a vector which is perpendicular to two other vectors.
Now, a vector x will be perpendicular to two vectors a and b if and only if the dot products
a · x and b · x are both zero. Using coordinates, we can write
a1x1 + a2x2 + a3x3 = 0
b1x1 + b2x2 + b3x3 = 0
This pair of equations can be easily solved in the usual way to obtain a solution which can be
written in the form
\[ \mathbf{x} = \lambda\begin{pmatrix} a_2b_3 - a_3b_2 \\ a_3b_1 - a_1b_3 \\ a_1b_2 - a_2b_1 \end{pmatrix}, \]
where λ is a real parameter. This expression for x looks like some kind of “product” of a and b.
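Definition 1. The cross product of two vectors $\mathbf{a} = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}$ in $\mathbb{R}^3$ is
\[ \mathbf{a}\times\mathbf{b} = \begin{pmatrix} a_2b_3 - a_3b_2 \\ a_3b_1 - a_1b_3 \\ a_1b_2 - a_2b_1 \end{pmatrix}. \]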
Note that the cross product of two vectors is a vector. For this reason the cross product is
often called the vector product of two vectors, in contrast to the dot product which is a scalar
and is often called the scalar product of two vectors. Note also that the cross product a× b has
the important property that it is perpendicular to the two vectors a and b. As an exercise you
might like to check directly that a× b is orthogonal to a by checking that a · (a× b) = 0.
There are several tricks available for remembering the formula for a cross product. The most
common trick is to use determinant notation. The more general theory of determinants will be
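covered in a later chapter. We define here a $2\times 2$ determinant by
\[ \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc. \]
To find the cross product of two vectors $\begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}$ and $\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}$ we write these as rows in a $3\times 3$ determinant:
\[ \mathbf{a}\times\mathbf{b} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix}. \]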
To calculate this, we use the following procedure. Firstly, take the vector e1 and multiply it by the
2× 2 determinant obtained by deleting the row and column in which e1 is contained. That is, we
write
\[ \mathbf{e}_1\begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix}. \]
Then take −e2 and repeat the process, followed by e3. Each of the 2×2 determinants can be found
using the definition above.
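We write
\begin{align*}
\mathbf{a}\times\mathbf{b} &= \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{vmatrix} \\
&= \mathbf{e}_1\begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} - \mathbf{e}_2\begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} + \mathbf{e}_3\begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} \\
&= \mathbf{e}_1(a_2b_3 - a_3b_2) - \mathbf{e}_2(a_1b_3 - a_3b_1) + \mathbf{e}_3(a_1b_2 - a_2b_1) \\
&= \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}(a_2b_3 - a_3b_2) - \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}(a_1b_3 - a_3b_1) + \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}(a_1b_2 - a_2b_1) \\
&= \begin{pmatrix} a_2b_3 - a_3b_2 \\ a_3b_1 - a_1b_3 \\ a_1b_2 - a_2b_1 \end{pmatrix}.
\end{align*}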
The determinant is expanded along the first row and as usual e1, e2, e3 are the standard basis
vectors of R3. While this may appear complicated at first, with practice it is much easier than
using the formula for the cross product.
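Example 1. Find the cross product of $\begin{pmatrix} 1 \\ -2 \\ 3 \end{pmatrix}$ and $\begin{pmatrix} 4 \\ -5 \\ 6 \end{pmatrix}$.
Solution. Using the determinant formula, we have
\[ \begin{pmatrix} 1 \\ -2 \\ 3 \end{pmatrix}\times\begin{pmatrix} 4 \\ -5 \\ 6 \end{pmatrix} = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ 1 & -2 & 3 \\ 4 & -5 & 6 \end{vmatrix} = \mathbf{e}_1(-12+15) - \mathbf{e}_2(6-12) + \mathbf{e}_3(-5+8) = \begin{pmatrix} 3 \\ 6 \\ 3 \end{pmatrix}. \]
♦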
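The component formula of Definition 1 translates directly into code. The sketch below is only an illustration (the function name `cross` is chosen here); it reproduces Example 1 and checks the orthogonality property mentioned above.

```python
def dot(a, b):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two vectors in R^3, using the formula of Definition 1."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1]

a = [1, -2, 3]
b = [4, -5, 6]
n = cross(a, b)
print(n)                      # [3, 6, 3], as in Example 1
print(dot(n, a), dot(n, b))   # 0 0, so n is orthogonal to both a and b
```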
2.4.1 Arithmetic properties of the cross product
Some of the basic properties of the cross product are summarised in the following proposition.
Proposition 1. For all a, b, c ∈ R3 and λ ∈ R,
1. a× a = 0, i.e., the cross product of a vector with itself is the zero vector.
2. a×b = −b×a. The cross product is not commutative. If the order of vectors in the cross
product is reversed, then the sign of the product is also reversed.
3. a× (λb) = λ(a× b) and (λa)× b = λ(a× b).
4. a× (λa) = 0, i.e., the cross product of parallel vectors is zero.
5. Distributive Laws. a× (b+ c) = a× b+ a× c and (a+ b)× c = a× c+ b× c.
Proof. Each of the properties listed in Proposition 1 can be proved by expanding each side of the
properties using the definition of cross product given in Definition 1. For example, for Property 2,
we have on expansion that
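\[ \mathbf{a}\times\mathbf{b} = \begin{pmatrix} a_2b_3 - a_3b_2 \\ a_3b_1 - a_1b_3 \\ a_1b_2 - a_2b_1 \end{pmatrix} = -\begin{pmatrix} b_2a_3 - b_3a_2 \\ b_3a_1 - b_1a_3 \\ b_1a_2 - b_2a_1 \end{pmatrix} = -\mathbf{b}\times\mathbf{a}. \]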
There are several useful relations for the cross products of the standard basis vectors e1, e2, e3
in R3.
Proposition 2. The three standard basis vectors in R3 satisfy the relations
1. e1 × e1 = e2 × e2 = e3 × e3 = 0,
2. e1 × e2 = e3, e2 × e3 = e1, e3 × e1 = e2.
Proof. The proof of these relations follows immediately from Property 1 of Proposition 1 and
from Definition 1. For example,
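\[ \mathbf{e}_1\times\mathbf{e}_2 = \begin{vmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \mathbf{e}_3 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{vmatrix} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} = \mathbf{e}_3. \]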
Note. The cross product is not associative. In fact,
(e1 × e2)× e2 = −e1,
but e1 × (e2 × e2) = 0.
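These arithmetic properties are easy to confirm on specific vectors. The following sketch (with helper names `cross` and `neg` chosen here for illustration) checks Property 2 of Proposition 1 for one pair of vectors and reproduces the counterexample to associativity given in the Note.

```python
def cross(a, b):
    """Cross product of two vectors in R^3."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1]

def neg(v):
    """The vector -v."""
    return [-x for x in v]

e1, e2, e3 = [1, 0, 0], [0, 1, 0], [0, 0, 1]
a, b = [1, -2, 3], [4, -5, 6]

# Property 2: a x b = -(b x a).
anticommutative = cross(a, b) == neg(cross(b, a))

# Non-associativity: (e1 x e2) x e2 differs from e1 x (e2 x e2).
left = cross(cross(e1, e2), e2)    # equals -e1
right = cross(e1, cross(e2, e2))   # equals the zero vector
print(anticommutative, left, right)
```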
Proposition 3. Suppose A, B are points in R3 that have coordinate vectors a and b, and ∠AOB =
θ then |a× b| = |a| |b| sin θ.
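Proof. If $\mathbf{a} = \mathbf{0}$ or $\mathbf{b} = \mathbf{0}$ then both sides are 0 although $\theta$ is not defined.
Also, since $0 \leq \theta \leq \pi$, we have $\sin\theta \geq 0$, so we need only prove
\[ |\mathbf{a}\times\mathbf{b}|^2 = |\mathbf{a}|^2\,|\mathbf{b}|^2\sin^2\theta, \]
or
\[ |\mathbf{a}\times\mathbf{b}|^2 = |\mathbf{a}|^2\,|\mathbf{b}|^2(1-\cos^2\theta) = |\mathbf{a}|^2\,|\mathbf{b}|^2 - (\mathbf{a}\cdot\mathbf{b})^2 \quad\text{by Section 2.2.} \]
To prove this identity we expand both sides:
\begin{align*}
\text{L.H.S.} &= \left|\begin{pmatrix} a_2b_3 - a_3b_2 \\ a_3b_1 - a_1b_3 \\ a_1b_2 - a_2b_1 \end{pmatrix}\right|^2 \\
&= a_2^2b_3^2 + a_3^2b_2^2 - 2a_2a_3b_2b_3 + a_3^2b_1^2 + a_1^2b_3^2 - 2a_3a_1b_3b_1 + a_1^2b_2^2 + a_2^2b_1^2 - 2a_1a_2b_1b_2, \\
\text{R.H.S.} &= (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1b_1 + a_2b_2 + a_3b_3)^2 \\
&= a_1^2b_1^2 + a_1^2b_2^2 + a_1^2b_3^2 + a_2^2b_1^2 + a_2^2b_2^2 + a_2^2b_3^2 + a_3^2b_1^2 + a_3^2b_2^2 + a_3^2b_3^2 \\
&\qquad - a_1^2b_1^2 - a_2^2b_2^2 - a_3^2b_3^2 - 2a_1a_2b_1b_2 - 2a_1a_3b_1b_3 - 2a_2a_3b_2b_3 \\
&= a_1^2b_2^2 + a_1^2b_3^2 + a_2^2b_1^2 + a_2^2b_3^2 + a_3^2b_1^2 + a_3^2b_2^2 - 2a_2a_3b_2b_3 - 2a_3a_1b_3b_1 - 2a_1a_2b_1b_2,
\end{align*}
as required.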
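The identity $|\mathbf{a}\times\mathbf{b}|^2 = |\mathbf{a}|^2|\mathbf{b}|^2 - (\mathbf{a}\cdot\mathbf{b})^2$ used in this proof can be spot-checked with exact integer arithmetic. The sketch below uses sample vectors chosen here for illustration; it confirms the identity for these vectors only.

```python
def dot(a, b):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two vectors in R^3."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1]

a, b = [1, -2, 3], [4, -5, 6]   # sample integer vectors
n = cross(a, b)

lhs = dot(n, n)                              # |a x b|^2
rhs = dot(a, a) * dot(b, b) - dot(a, b) ** 2   # |a|^2 |b|^2 - (a.b)^2
print(lhs, rhs)  # 54 54
```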
2.4.2 A geometric interpretation of the cross product
We have constructed a cross product to be perpendicular to two given vectors. In this subsection
we explore the geometric properties further.
Choose an orthonormal set of basis vectors so that i is in the direction of a, and so that i and
j are in the plane of a and b, with b = b1i+ b2 j, b2 > 0. Finally, k is taken at right angles to the
plane of a and b with its direction determined by a “right-hand rule”: using your right hand, point
to a with your index finger and to b with your middle finger, then your thumb can be extended in
the direction of k. If we take the direction of k as normal to the page and pointing outwards, then
the picture for the i, j plane is as shown in Figure 6.
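[Figure 6: Geometry of $\mathbf{a}\times\mathbf{b}$. The figure shows the $\mathbf{i}$-direction (the direction of $\mathbf{a}$) and the $\mathbf{j}$-direction, with $\mathbf{a} = \overrightarrow{OA} = a_1\mathbf{i}$, $\mathbf{b} = \overrightarrow{OP} + \overrightarrow{PB} = b_1\mathbf{i} + b_2\mathbf{j}$ and $\mathbf{a}\times\mathbf{b} = a_1b_2\mathbf{k}$.]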
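Hence, we have
\[ \mathbf{a} = \begin{pmatrix} a_1 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ 0 \end{pmatrix} \quad\text{and}\quad \mathbf{a}\times\mathbf{b} = \begin{pmatrix} 0 \\ 0 \\ a_1b_2 \end{pmatrix}, \]
and we also have that the coordinates $a_1$ and $b_2$ are positive.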
Now, since a1 and b2 are positive, the cross product is in the direction of k, i.e., it is in a
direction perpendicular to both a and b as given by the right-hand rule.
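Let $\theta$ denote the angle between $\mathbf{a}$ and $\mathbf{b}$. Note first that the angle $\theta$ is always chosen to be in the interval $[0,\pi]$ so that $\sin\theta \geq 0$. Also, the length of the vector $\mathbf{a}\times\mathbf{b}$ is
\[ |\mathbf{a}\times\mathbf{b}| = a_1b_2. \]
Now $a_1 = |\mathbf{a}|$, and from trigonometry, $b_2 = |\mathbf{b}|\sin\theta$, and hence $|\mathbf{a}\times\mathbf{b}| = |\mathbf{a}|\,|\mathbf{b}|\sin\theta$, in agreement with Proposition 3.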
We have therefore shown that:
a × b is a vector of length |a| |b| sin θ in the direction perpendicular to both a and b as given
by the right-hand rule.
This statement is usually taken as the definition of the cross product in physics and engineering
courses.
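Note. The above proof is valid since, if $\mathbf{a}, \mathbf{b}$ are arbitrary in $\mathbb{R}^3$, then we can apply a rotation to move $\mathbf{a}$ to $\alpha\mathbf{i}$. We then rotate about the $x$-axis to move $\mathbf{b}$ to $\beta\mathbf{i} + \gamma\mathbf{j}$ with $\gamma > 0$. Since rotations preserve lengths and angle sizes, the results
\[ |\mathbf{a}\times\mathbf{b}| = |\mathbf{a}|\,|\mathbf{b}|\sin\theta, \]
\[ (\mathbf{a}\times\mathbf{b})\cdot\mathbf{a} = 0, \quad (\mathbf{a}\times\mathbf{b})\cdot\mathbf{b} = 0, \quad\text{and}\quad (\mathbf{a}, \mathbf{b}, \mathbf{a}\times\mathbf{b}) \text{ form a right-hand triple} \]
hold for arbitrary vectors $\mathbf{a}, \mathbf{b}$ in $\mathbb{R}^3$.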
2.4.3 Areas
The length of the cross product a× b also has an alternative geometric interpretation in terms of
the area of a parallelogram with sides a and b. Consider the picture of Figure 7.
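[Figure 7: The Cross Product and the Area of a Parallelogram. The parallelogram $OACB$ has sides $\mathbf{a}$ and $\mathbf{b}$ meeting at angle $\theta$, and $\overrightarrow{OP}$ is its perpendicular height.]
The area of the parallelogram $OACB$ in Figure 7 is "base times perpendicular height".
\[ \text{Area} = |\overrightarrow{OA}|\,|\overrightarrow{OP}| = |\mathbf{a}|\,|\mathbf{b}|\sin\theta = |\mathbf{a}\times\mathbf{b}|. \]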
The vector a×b is perpendicular to both a and b, so it is normal to the plane of the parallelogram.
Example 2. Find the area of a parallelogram with vertices at points A (1, 0, 1), B (−2, 1, 3), and
C (3, 1, 4).
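Solution. One parallelogram can be formed with sides $\overrightarrow{AB} = \begin{pmatrix} -3 \\ 1 \\ 2 \end{pmatrix}$ and $\overrightarrow{AC} = \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}$. The cross product is $\overrightarrow{AB}\times\overrightarrow{AC} = \begin{pmatrix} 1 \\ 13 \\ -5 \end{pmatrix}$, and hence the area is $\left|\begin{pmatrix} 1 \\ 13 \\ -5 \end{pmatrix}\right| = \sqrt{195}$. ♦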
Note. There are three parallelograms which can be formed from three vertices. The areas of all
three are equal. As an exercise you might like to find the other two parallelograms with the same
three vertices and check that they have the same area.
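The area computation of Example 2, together with the exercise suggested in the Note, could be sketched as follows. The function name `parallelogram_area` and its argument order are assumptions made for this illustration: it takes a corner vertex and its two neighbouring vertices.

```python
import math

def sub(a, b):
    """Componentwise difference a - b."""
    return [x - y for x, y in zip(a, b)]

def cross(a, b):
    """Cross product of two vectors in R^3."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1]

def parallelogram_area(corner, p, q):
    """Area of the parallelogram with adjacent sides corner->p and corner->q."""
    n = cross(sub(p, corner), sub(q, corner))
    return math.sqrt(sum(x * x for x in n))

A, B, C = [1, 0, 1], [-2, 1, 3], [3, 1, 4]
areas = [parallelogram_area(A, B, C),   # sides AB and AC, as in Example 2
         parallelogram_area(B, A, C),   # sides BA and BC
         parallelogram_area(C, A, B)]   # sides CA and CB
print(areas)  # each value agrees with sqrt(195) up to rounding
```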
[X] 2.4.4 Shortest distance between lines in R3
It is necessary to consider the case of parallel lines and the case of skew (non-parallel) lines separately. Be careful: non-parallel lines in R3 may not intersect.
Parallel Lines. The distance between two parallel lines is the same as the distance between a
point on one of the lines and the other line. We only need to use the method in 2.3.3 to find the
distance between a point on one line and the other line.
Skew Lines. The distance, i.e. the shortest distance, between two skew lines is obtained by
drawing a perpendicular to both lines. The direction of the perpendicular is in the direction of the
cross product of the directions of the lines. The shortest distance is the length of the projection on
this perpendicular of a line segment joining any point on one line to any point on the other.
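[Figure 8: Shortest Distance between Skew Lines. Here $\mathbf{n} = \mathbf{d}_1\times\mathbf{d}_2$ is perpendicular to $\mathbf{d}_1$ and $\mathbf{d}_2$, and $|\overrightarrow{P_1P_2}| = |\operatorname{proj}_{\mathbf{n}}\overrightarrow{AB}|$.]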
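Example 3. Find the shortest distance between the lines
\[ \mathbf{x} = \begin{pmatrix} 1 \\ 4 \\ 2 \end{pmatrix} + \lambda\begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix} \quad\text{and}\quad \mathbf{x} = \begin{pmatrix} 0 \\ 3 \\ 1 \end{pmatrix} + \lambda\begin{pmatrix} 2 \\ 4 \\ 1 \end{pmatrix}. \]
Solution. The cross product of the direction vectors of the lines is
\[ \mathbf{n} = \begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix}\times\begin{pmatrix} 2 \\ 4 \\ 1 \end{pmatrix} = \begin{pmatrix} -7 \\ 1 \\ 10 \end{pmatrix}, \]
and hence $\mathbf{n}$ is perpendicular to both lines. A line segment from a point on one line to a point on the other is
\[ \begin{pmatrix} 1 \\ 4 \\ 2 \end{pmatrix} - \begin{pmatrix} 0 \\ 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}. \]
The shortest distance is the length of the projection of the line segment $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$ on the perpendicular $\begin{pmatrix} -7 \\ 1 \\ 10 \end{pmatrix}$ and is given by
\[ \frac{\left|\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}\cdot\begin{pmatrix} -7 \\ 1 \\ 10 \end{pmatrix}\right|}{\left|\begin{pmatrix} -7 \\ 1 \\ 10 \end{pmatrix}\right|} = \frac{4}{\sqrt{150}}. \]
♦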
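The method for skew lines can be sketched in code as below. This is a minimal illustration under the assumption that the two direction vectors are not parallel; the function name `distance_between_skew_lines` is introduced here, and the parallel case (where the cross product is the zero vector) is rejected rather than handled.

```python
import math

def dot(a, b):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    """Componentwise difference a - b."""
    return [x - y for x, y in zip(a, b)]

def cross(a, b):
    """Cross product of two vectors in R^3."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1]

def distance_between_skew_lines(p1, d1, p2, d2):
    """Shortest distance between x = p1 + lambda d1 and x = p2 + mu d2."""
    n = cross(d1, d2)                 # perpendicular to both lines
    if n == [0, 0, 0]:
        raise ValueError("lines are parallel; use the point-to-line method")
    # Length of the projection on n of a segment joining the two lines.
    return abs(dot(sub(p1, p2), n)) / math.sqrt(dot(n, n))

# Example 3.
dist = distance_between_skew_lines([1, 4, 2], [3, 1, 2], [0, 3, 1], [2, 4, 1])
print(dist)  # agrees with 4/sqrt(150) up to rounding
```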
2.5 Scalar triple product and volume
There are two products that can be constructed from three vectors in three dimensions. They are
Definition 1. For a, b, c ∈ R3, the scalar triple product of a, b and c is
a · (b× c).
Definition 2. For a, b, c ∈ R3, the vector triple product of a, b and c is
a× (b× c).
In this section we shall briefly examine the properties of the scalar triple product and give it a
geometric interpretation as a volume. Although the vector triple product is useful in physics and
engineering, we shall not consider it any further in this mathematics course.
It is important to notice that in evaluating the scalar triple product, the cross product must
be calculated before the dot product. The expression (a ·b)×c has no meaning, because a ·b
is a scalar and the cross product of a scalar and a vector has no meaning.
Some properties of the scalar triple product are listed in the following proposition.
Proposition 1. For a, b, c ∈ R3,
1. a · (b× c) = (a× b) · c, that is, the dot and cross can be interchanged.
2. a · (b× c) = −a · (c× b), that is, the sign is reversed if the order of two vectors is reversed.
3. a · (a × b) = (a × a) · b = 0, that is, the scalar triple product is zero if any two vectors are
the same.
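4. The scalar triple product can be written using the determinant notation.
\[ \mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) = \begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix}. \]
This means that we replace $\mathbf{e}_1$ by $a_1$, $\mathbf{e}_2$ by $a_2$ and $\mathbf{e}_3$ by $a_3$ in the determinant form of the cross product $\mathbf{b}\times\mathbf{c}$.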
Proof. The proof of Property 1 follows immediately on using the definitions of the dot and cross
product to expand the two expressions. We have
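\begin{align*}
\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c}) &= \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}\cdot\begin{pmatrix} b_2c_3 - b_3c_2 \\ b_3c_1 - b_1c_3 \\ b_1c_2 - b_2c_1 \end{pmatrix} \\
&= a_1b_2c_3 - a_1b_3c_2 + a_2b_3c_1 - a_2b_1c_3 + a_3b_1c_2 - a_3b_2c_1. \\
(\mathbf{a}\times\mathbf{b})\cdot\mathbf{c} &= \begin{pmatrix} a_2b_3 - a_3b_2 \\ a_3b_1 - a_1b_3 \\ a_1b_2 - a_2b_1 \end{pmatrix}\cdot\begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} \\
&= a_2b_3c_1 - a_3b_2c_1 + a_3b_1c_2 - a_1b_3c_2 + a_1b_2c_3 - a_2b_1c_3.
\end{align*}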
The two fully expanded expressions are equal, and hence the result is proved.
Property 2 is an immediate consequence of the fact that b× c = −c× b.
Property 3 follows immediately from Property 1 and from the fact that a× a = 0 for all a.
Property 4 can be proved by expanding both the scalar triple product and the determinant and
noting that the expansions of the two are equal.
Note. Property 3 gives one more proof that the cross product a×b is perpendicular to a. Clearly,
b · (a× b) = 0, and hence b is also orthogonal to a× b.
2.5.1 Volumes of parallelepipeds
A parallelepiped is a three-dimensional analogue of a parallelogram. Given three vectors a, b and
c we can form a parallelepiped as shown in Figure 9.
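[Figure 9: The Scalar Triple Product and the Volume of a Parallelepiped. The parallelepiped has edges $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$ from $O$, and $\mathbf{n} = \mathbf{b}\times\mathbf{c}$ is perpendicular to its base.]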
The parallelepiped formed from the vectors a, b, and c is called the parallelepiped spanned by
a, b, and c.
From geometry, the volume of a parallelepiped is “area of base times perpendicular height”.
The base is the parallelogram whose sides are the vectors b and c, and hence from the results of
Section 2.4.2, the area of the base is |b × c| and the direction of the perpendicular to the base is
the direction of n = b × c. From the results of Section 2.3.2, the length of the projection of the
vector a on the perpendicular n is the perpendicular height, and hence
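\[ \text{Perpendicular height} = \frac{|\mathbf{a}\cdot\mathbf{n}|}{|\mathbf{n}|} = \frac{|\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})|}{|\mathbf{b}\times\mathbf{c}|} = \frac{|\mathbf{a}\cdot(\mathbf{b}\times\mathbf{c})|}{\text{Area of base}}. \]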
Hence, the volume of the parallelepiped is given by the formula
Volume = |a · (b× c)|.
Example 1. Find the volume of the parallelepiped spanned by $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$, $\begin{pmatrix} -2 \\ 4 \\ -1 \end{pmatrix}$, and $\begin{pmatrix} 3 \\ 5 \\ 1 \end{pmatrix}$.
Solution. Using the determinant formula for the scalar triple product, we have
\[ \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}\cdot\left(\begin{pmatrix} -2 \\ 4 \\ -1 \end{pmatrix}\times\begin{pmatrix} 3 \\ 5 \\ 1 \end{pmatrix}\right) = \begin{vmatrix} 1 & 2 & 3 \\ -2 & 4 & -1 \\ 3 & 5 & 1 \end{vmatrix} = 1(4+5) - 2(-2+3) + 3(-10-12) = -59, \]
and hence the volume is $|-59| = 59$. ♦
Notice that if the volume of the parallelepiped spanned by three vectors a, b, c is zero, i.e., if
a · (b× c) = 0, then the three vectors are in the same plane (that is, they are coplanar).
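The volume formula and the coplanarity test can be sketched as follows. The function name `scalar_triple` is a helper introduced for this illustration, and the coplanar vectors in the second check are sample values chosen here, not taken from these Notes.

```python
def dot(a, b):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two vectors in R^3."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1]

def scalar_triple(a, b, c):
    """Scalar triple product a . (b x c); the cross product is computed first."""
    return dot(a, cross(b, c))

# Example 1: the volume is the absolute value of the scalar triple product.
stp = scalar_triple([1, 2, 3], [-2, 4, -1], [3, 5, 1])
volume = abs(stp)
print(stp, volume)  # -59 59

# Three vectors lying in the x1-x2 plane are coplanar, so the product is zero.
coplanar = scalar_triple([1, 0, 0], [0, 1, 0], [1, 1, 0]) == 0
print(coplanar)  # True
```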
2.6 Planes in R3
Another useful application of vectors and dot and cross products is to the geometry of planes in
three dimensions. There are three common forms for the equation of a plane in R3. These are
“parametric vector form”, “Cartesian form”, and “point-normal form”. Each of these equations
has a direct geometric interpretation. In this section we shall discuss the geometric interpretation
of each of these forms and we shall show how to convert one form to another. We shall also show
how to find the distance between a point and a plane in R3.
2.6.1 Equations of planes in R3
Parametric Vector Form. In Section 1.5, we showed that the equation of a plane through a
given point with position vector c and parallel to two given non-parallel vectors v1 and v2 could
be written in a parametric vector form
x = c+ λ1v1 + λ2v2 for λ1, λ2 ∈ R.
Cartesian Form. In Section 1.5, we have shown that the linear equation in three unknowns,
a1x1 + a2x2 + a3x3 = b,
represents a plane in R3. This linear equation is also often called the Cartesian form of the
equation of a plane in R3. We have already shown that this Cartesian form can be converted to a
parametric vector form by solving the linear equation.
It is important to note that a single linear equation a1x1+· · ·+anxn = b is the equation of a plane
only if n = 3. For n = 2, the equation is a1x1+a2x2 = b which represents a line. In general, solving
a single linear equation with n unknowns yields a parametric vector form of solution containing
n− 1 parameters, whereas a parametric vector equation of a plane must contain 2 parameters.
The Cartesian form can be given a geometric interpretation in terms of intercepts on the three coordinate axes.
We first divide the Cartesian form by the right hand side (assuming that $b$ and the coefficients $a_1$, $a_2$, $a_3$ are all non-zero) and rearrange it as
\[ \frac{x_1}{d_1} + \frac{x_2}{d_2} + \frac{x_3}{d_3} = 1, \quad\text{where } d_1 = \frac{b}{a_1},\ d_2 = \frac{b}{a_2},\ d_3 = \frac{b}{a_3}. \]
Then, to obtain the intercept on the $x_1$ axis, we set $x_2 = x_3 = 0$ in the equation, and find $x_1 = d_1$. By a similar argument, the intercept on the $x_2$ axis is $d_2$ and the intercept on the $x_3$ axis is $d_3$.
[Figure 10: The plane $a_1x_1 + a_2x_2 + a_3x_3 = b$, meeting the $x_1$, $x_2$ and $x_3$ axes at $(d_1, 0, 0)$, $(0, d_2, 0)$ and $(0, 0, d_3)$.]
The rule for obtaining intercepts is therefore to rewrite the equation with 1 on the right, and then
the intercepts are the reciprocals of the coefficients of the variables. Notice that if the coefficient
of x1 is zero, then the plane is parallel to the x1 axis. A similar result applies if the coefficient of
x2 or x3 is zero.
Example 1. Write down the equation of a plane which has the intercepts $4, 7, -2$ on the three coordinate axes.
Solution. The coefficients of the variables are the reciprocals of the intercepts, and hence the equation is
\[ \frac{x_1}{4} + \frac{x_2}{7} + \frac{x_3}{-2} = 1, \quad\text{or}\quad 14x_1 + 8x_2 - 28x_3 = 56. \]
♦
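The rule relating intercepts and coefficients can be sketched in code. The function name `plane_from_intercepts` is hypothetical, and the sketch assumes non-zero integer intercepts. It clears denominators with the least common multiple, so for Example 1 it returns $7x_1 + 4x_2 - 14x_3 = 28$, which is the equation above divided by 2 and represents the same plane.

```python
from fractions import Fraction
from math import gcd

def plane_from_intercepts(d1, d2, d3):
    """Integer coefficients (a1, a2, a3, b) of a plane with the given intercepts.

    Assumes d1, d2, d3 are non-zero integers. Starts from
    x1/d1 + x2/d2 + x3/d3 = 1 and multiplies through by the least
    common multiple of the denominators.
    """
    coeffs = [Fraction(1, d) for d in (d1, d2, d3)]
    m = 1
    for c in coeffs:
        m = m * c.denominator // gcd(m, c.denominator)   # running lcm
    return tuple(int(c * m) for c in coeffs) + (m,)

a1, a2, a3, b = plane_from_intercepts(4, 7, -2)
print(a1, a2, a3, b)   # 7 4 -14 28

# Recover the intercepts b/a1, b/a2, b/a3 as a check.
intercepts = [Fraction(b, a) for a in (a1, a2, a3)]
print(intercepts)
```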
Point-Normal Form. The point-normal form is the equation n · (x − c) = 0, where n and c
are fixed vectors and x is the position vector of a point in space. If we rewrite this equation as
n · x = n · c and then expand in terms of coordinates, we obtain
n1x1 + n2x2 + n3x3 = n1c1 + n2c2 + n3c3 = b,
which is the Cartesian form of the equation of a plane. Thus, the equation n · (x − c) = 0 is the
equation of a plane. As for the Cartesian form, the point-normal equation is the equation of a
plane for three-dimensional vectors only.
The name “point-normal form” for the equation n · (x − c) = 0 is based on the following
geometric interpretation (see Figure 11). Clearly, x = c is a solution of the equation, and hence c
is the position vector of a point on the plane. If x is any other point in the plane, then the line
segment x − c lies in the plane. The equation then says that the vector n is normal to all line
segments lying in the plane. n is called a normal to the plane. The point-normal form is therefore
the most convenient form of equation to use when a point on the plane and a normal to the plane
are known.
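[Figure 11: Point-Normal Form $\mathbf{n}\cdot(\mathbf{x}-\mathbf{c}) = 0$. The points $C$ and $P$ on the plane have position vectors $\mathbf{c}$ and $\mathbf{x}$, and $\mathbf{n}$ is normal to $\mathbf{x}-\mathbf{c}$.]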
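Example 2. Find the point-normal form of the equation of a plane which passes through the point $(1,-2,3)$ and whose normal is $\begin{pmatrix} -3 \\ 4 \\ 5 \end{pmatrix}$.
Solution.
\[ \begin{pmatrix} -3 \\ 4 \\ 5 \end{pmatrix}\cdot\left(\mathbf{x} - \begin{pmatrix} 1 \\ -2 \\ 3 \end{pmatrix}\right) = 0. \]
♦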
We shall now show how to convert from one form of equation to another by giving examples of
the methods.
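Example 3 (Conversion from point-normal to Cartesian form). Find a Cartesian form of the equation of a plane which passes through the point $(2,0,4)$ and whose normal is $\begin{pmatrix} 2 \\ -3 \\ 5 \end{pmatrix}$.
Solution. The point-normal form is $\begin{pmatrix} 2 \\ -3 \\ 5 \end{pmatrix}\cdot\left(\mathbf{x} - \begin{pmatrix} 2 \\ 0 \\ 4 \end{pmatrix}\right) = 0$, and hence the corresponding Cartesian form is
\[ 2x_1 - 3x_2 + 5x_3 = \begin{pmatrix} 2 \\ -3 \\ 5 \end{pmatrix}\cdot\begin{pmatrix} 2 \\ 0 \\ 4 \end{pmatrix} = 24. \]
♦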
Example 4 (Conversion from Cartesian to point-normal form). Find the point-normal form of the equation
\[ 3x_1 - 7x_2 + 5x_3 = 21. \]
Solution. A comparison of the Cartesian and point-normal forms shows that the coefficients of $x_1, x_2, x_3$ are just the coordinates of a normal to the plane. Thus, a normal to the plane is $\mathbf{n} = \begin{pmatrix} 3 \\ -7 \\ 5 \end{pmatrix}$. To find some point on the plane, we let $x_2 = x_3 = 0$, and then from the equation we find $x_1 = \frac{21}{3} = 7$. Hence $(7,0,0)$ is a point on the plane.
A point-normal form is therefore
\[ \begin{pmatrix} 3 \\ -7 \\ 5 \end{pmatrix}\cdot\left(\mathbf{x} - \begin{pmatrix} 7 \\ 0 \\ 0 \end{pmatrix}\right) = 0. \]
♦
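Example 5 (Conversion from parametric vector to point-normal form). Find the point-normal form for the parametric vector equation
\[ \mathbf{x} = \begin{pmatrix} 3 \\ -5 \\ 1 \end{pmatrix} + \lambda_1\begin{pmatrix} 0 \\ 3 \\ 4 \end{pmatrix} + \lambda_2\begin{pmatrix} -2 \\ 6 \\ 1 \end{pmatrix}. \]
Solution. The plane is parallel to $\begin{pmatrix} 0 \\ 3 \\ 4 \end{pmatrix}$ and $\begin{pmatrix} -2 \\ 6 \\ 1 \end{pmatrix}$. The cross product of these two vectors is therefore normal to the plane. Thus, a normal to the plane is
\[ \mathbf{n} = \begin{pmatrix} 0 \\ 3 \\ 4 \end{pmatrix}\times\begin{pmatrix} -2 \\ 6 \\ 1 \end{pmatrix} = \begin{pmatrix} -21 \\ -8 \\ 6 \end{pmatrix}. \]
As $(3,-5,1)$ is a point on the plane, a point-normal form is
\[ \begin{pmatrix} -21 \\ -8 \\ 6 \end{pmatrix}\cdot\left(\mathbf{x} - \begin{pmatrix} 3 \\ -5 \\ 1 \end{pmatrix}\right) = 0. \]
♦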
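Example 6 (Conversion from parametric vector to Cartesian form). Find a Cartesian form for the equation of a plane through the point $(2,-1,3)$ parallel to the vectors $\begin{pmatrix} 3 \\ -1 \\ 4 \end{pmatrix}$ and $\begin{pmatrix} 2 \\ 1 \\ 5 \end{pmatrix}$.
Solution. The simplest way to carry out this conversion is to first find a point-normal form as in Example 5, and then to find a Cartesian form as in Example 3.
A normal to the plane is
\[ \begin{pmatrix} 3 \\ -1 \\ 4 \end{pmatrix}\times\begin{pmatrix} 2 \\ 1 \\ 5 \end{pmatrix} = \begin{pmatrix} -9 \\ -7 \\ 5 \end{pmatrix}. \]
A point-normal form is
\[ \begin{pmatrix} -9 \\ -7 \\ 5 \end{pmatrix}\cdot\left(\mathbf{x} - \begin{pmatrix} 2 \\ -1 \\ 3 \end{pmatrix}\right) = 0. \]
Hence a Cartesian form is $-9x_1 - 7x_2 + 5x_3 = 4$. ♦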
Example 7 (Conversion from point-normal to parametric vector form). Find a parametric vector form for the equation of a plane with normal $\begin{pmatrix} -2 \\ 0 \\ 4 \end{pmatrix}$ through the point $(0,2,1)$.
Solution. One way to carry out this conversion is to first convert the point-normal to Cartesian form as in Example 3, and then convert the Cartesian to parametric vector form as in 1.5.3.
A point-normal form is $\begin{pmatrix} -2 \\ 0 \\ 4 \end{pmatrix}\cdot\left(\mathbf{x} - \begin{pmatrix} 0 \\ 2 \\ 1 \end{pmatrix}\right) = 0$. Hence, a Cartesian form is
\[ -2x_1 + 4x_3 = \begin{pmatrix} -2 \\ 0 \\ 4 \end{pmatrix}\cdot\begin{pmatrix} 0 \\ 2 \\ 1 \end{pmatrix} = 4. \]
The variable $x_2$ does not appear in the equation, but don't forget it. As in 1.5.3, we set $x_2 = \lambda_1$ to be a real parameter. Also, $x_3$ can have any value in the equation, and hence we set $x_3 = \lambda_2$ to be a second real parameter. Then, we obtain $x_1 = -2 + 2x_3 = -2 + 2\lambda_2$. Then, on rewriting the solution in vector form, we obtain a parametric vector form of the equation of the plane to be
\[ \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} -2 + 2\lambda_2 \\ \lambda_1 \\ \lambda_2 \end{pmatrix} = \begin{pmatrix} -2 \\ 0 \\ 0 \end{pmatrix} + \lambda_1\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + \lambda_2\begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}. \]
♦
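Example 8. Find a point-normal form of the equation of a plane through the three points $A\,(3,1,2)$, $B\,(0,-2,1)$, and $C\,(1,2,3)$.
Solution. The plane is parallel to the two line segments
\[ \overrightarrow{AB} = \begin{pmatrix} -3 \\ -3 \\ -1 \end{pmatrix} \quad\text{and}\quad \overrightarrow{AC} = \begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix}. \]
Hence a normal to the plane is
\[ \mathbf{n} = \overrightarrow{AB}\times\overrightarrow{AC} = \begin{pmatrix} -3 \\ -3 \\ -1 \end{pmatrix}\times\begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -2 \\ 5 \\ -9 \end{pmatrix}. \]
$A\,(3,1,2)$ is a point on the plane, and hence a point-normal form is
\[ \begin{pmatrix} -2 \\ 5 \\ -9 \end{pmatrix}\cdot\left(\mathbf{x} - \begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix}\right) = 0. \]
♦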
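The steps of Example 8, followed by the conversion to Cartesian form used in Example 3, might be sketched as follows. The function name `plane_through_points` and its return convention (a normal $\mathbf{n}$ together with the right hand side $b = \mathbf{n}\cdot\mathbf{a}$) are assumptions made for this illustration.

```python
def dot(a, b):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    """Componentwise difference a - b."""
    return [x - y for x, y in zip(a, b)]

def cross(a, b):
    """Cross product of two vectors in R^3."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1]

def plane_through_points(a, b, c):
    """Return (n, rhs) with n . x = rhs for the plane through a, b and c."""
    n = cross(sub(b, a), sub(c, a))   # normal n = AB x AC
    if n == [0, 0, 0]:
        raise ValueError("the three points are collinear")
    return n, dot(n, a)               # Cartesian right hand side is n . a

A, B, C = [3, 1, 2], [0, -2, 1], [1, 2, 3]
n, rhs = plane_through_points(A, B, C)
print(n, rhs)   # [-2, 5, -9] -19, i.e. -2 x1 + 5 x2 - 9 x3 = -19

# Each of the three points satisfies the Cartesian equation.
print([dot(n, p) == rhs for p in (A, B, C)])   # [True, True, True]
```

As the Note below points out, the answer is not unique: any non-zero multiple of the normal, and any of the three points, would serve equally well.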
Note. Although we have not stated it explicitly, most of the solutions given to the examples in
this section are not unique. For example, any multiple of a normal to a plane is still a normal, any
multiple of a vector parallel to a plane is still a vector parallel to the plane, there are an infinite
number of points on a plane etc. However, the intercepts of a plane on the coordinate axes are
unique.
2.6.2 Distance between a point and a plane in R3
The distance from the point $B$ to the plane is $|\overrightarrow{PB}|$, where $\overrightarrow{PB}$ is normal to the plane. If $A$ is any point on the plane, $\overrightarrow{PB}$ is the projection of $\overrightarrow{AB}$ on any vector normal to the plane. The shortest distance is the "perpendicular distance" to the plane, and it is the length of the projection on a normal to the plane of a line segment from any point on the plane to the given point.
[Figure 12: Shortest Distance between Point and Plane.]
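Example 9. Find the distance between the point $(2,-1,3)$ and the plane
\[ \mathbf{x} = \begin{pmatrix} 0 \\ 4 \\ 2 \end{pmatrix} + \lambda_1\begin{pmatrix} 4 \\ 1 \\ 2 \end{pmatrix} + \lambda_2\begin{pmatrix} -3 \\ 1 \\ 5 \end{pmatrix}. \]
Solution. $(0,4,2)$ is a point on the plane and
\[ \mathbf{n} = \begin{pmatrix} 4 \\ 1 \\ 2 \end{pmatrix}\times\begin{pmatrix} -3 \\ 1 \\ 5 \end{pmatrix} = \begin{pmatrix} 3 \\ -26 \\ 7 \end{pmatrix} \]
is a vector normal to the plane. A vector from the point $(0,4,2)$ on the plane to the given point $(2,-1,3)$ is
\[ \begin{pmatrix} 2 \\ -1 \\ 3 \end{pmatrix} - \begin{pmatrix} 0 \\ 4 \\ 2 \end{pmatrix} = \begin{pmatrix} 2 \\ -5 \\ 1 \end{pmatrix}. \]
The shortest distance is the length of the projection of $\begin{pmatrix} 2 \\ -5 \\ 1 \end{pmatrix}$ on $\mathbf{n}$, which is
\[ \frac{\left|\begin{pmatrix} 2 \\ -5 \\ 1 \end{pmatrix}\cdot\mathbf{n}\right|}{|\mathbf{n}|} = \frac{143}{\sqrt{734}}. \]
♦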
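The computation in Example 9 can be sketched in code as follows. The function name `distance_point_to_plane` is introduced here for illustration, and the plane is assumed to be given in parametric vector form by a point and two non-parallel direction vectors.

```python
import math

def dot(a, b):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))

def sub(a, b):
    """Componentwise difference a - b."""
    return [x - y for x, y in zip(a, b)]

def cross(a, b):
    """Cross product of two vectors in R^3."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return [a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1]

def distance_point_to_plane(b, c, v1, v2):
    """Distance from the point b to the plane x = c + l1 v1 + l2 v2."""
    n = cross(v1, v2)                 # a normal to the plane
    if n == [0, 0, 0]:
        raise ValueError("v1 and v2 must not be parallel")
    # Length of the projection of (b - c) on the normal n.
    return abs(dot(sub(b, c), n)) / math.sqrt(dot(n, n))

# Example 9.
dist = distance_point_to_plane([2, -1, 3], [0, 4, 2], [4, 1, 2], [-3, 1, 5])
print(dist)  # agrees with 143/sqrt(734) up to rounding
```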
2.7 Geometry and Maple
Rather than list all the commands that you might need in this chapter, we suggest that you have
a look through the commands available in the LinearAlgebra package using the on-line manual.
The command
?LinearAlgebra;
opens a help page with a list of the commands available in the linear algebra package. If
you think that CrossProduct looks promising, you should click on its hyperlink, or type in:
?CrossProduct
in the Maple worksheet. The on-line help usually gives a few examples. Try these first; then you
can experiment with a few problems of your own.
The package geom3d contains many useful procedures for solving problems in three-dimensional
geometry. In fact, every one of the problems for Section 5.7 can be solved with procedures in
geom3d. The following is an outline of some of the things which you can do with geom3d.
The method of assigning a name to a point, line or plane is not quite what you might expect. To
say that A is the point (1, 2, 3), you do NOT enter A:=[1,2,3];, you enter point(A,[1,2,3]);.
The command line is used to assign a name to a line. The line may be specified by giving two
points on it or a point on it and a direction parallel to it. The command plane is used to assign a
name to a plane. The plane may be specified by giving a Cartesian equation for it or three (non-
collinear) points on it or a point on it and a normal direction or a point on it and two lines parallel
to it. The command sphere is used to assign a name to a sphere. A sphere may be specified by
giving its Cartesian equation or four points on it or the end-points of a diameter or its center and
its radius. To display the specifications of one of these things you need to use detail (see the
example below). For a plane, the detail includes a Cartesian equation for the plane. (If you want
to find a normal to a plane p use
NormalVector(p);)
Be warned that if you specify a plane or sphere by means of an equation then Maple will want
you to specify the names of the variables which are associated with the three axes. You can do this
by listing them as a third argument to the plane or sphere command, as in
plane(P,x+y+z=1,[x,y,z]);
If you leave out the [x,y,z] then Maple will, rather strangely, prompt you to enter the name of
the x-axis, to which you reply x;, and similarly for the other two axes.
When you have set up objects of these types you can, for example, use the command distance
to find the distance between two of them or the command intersection to find the intersection
of two of them (except the intersection of a line and a sphere) or the command FindAngle to find
the angle between two of them. You can use the Maple help to find out more about any of these
commands and to find out about the many other commands available in geom3d.
In the following example, we first label the points A(0, 1, 2) and B(2, 3, 1) and the line AB
through A and B. Applying detail to AB shows that the direction of the line is
$\begin{pmatrix} 2 \\ 2 \\ -1 \end{pmatrix}$ and the line can be expressed in parametric vector form as
$$\mathbf{x} = \begin{pmatrix} 0 \\ 1 \\ 2 \end{pmatrix} + t \begin{pmatrix} 2 \\ 2 \\ -1 \end{pmatrix}, \quad t \in \mathbb{R}.$$
Then we assign the label P to the plane through C(4, 5, 6) with normal (1, 1, 1) and use detail to
find that P can be described by the Cartesian equation
x+ y + z = 15.
Then we assign the label X to the point of intersection of the line AB and the plane P and find
that the coordinates of X are (8, 9,−2). Finally, we find that the plane ABC through the three
points A,B,C can be described by the Cartesian equation
12x− 12y + 12 = 0.
Notice that we use a colon : to suppress the output of most of the commands because the output
would just be an echo of assigned names.
with(geom3d):
point(A,[0,1,2]),point(B,[2,3,1]):
line(AB,[A,B]):
detail(AB);
Warning, assume that the name of the parameter in the parametric equations is t
Warning, assuming that the names of the axes are x, y, and z
name of the object: AB
form of the object: line3d
equation of the line: [ x = 2* t, y = 1+2* t, z = 2- t]
point(C,[4,5,6]):
plane(P,[C,[1,1,1]]):
detail(P);
Warning, assuming that the names of the axes are x, y and z
name of the object: P
form of the object: plane3d
equation of the plane: -15+ x+ y+ z = 0
intersection(X,AB,P):
detail(X);
name of the object: X
form of the object: point3d
coordinates of the point: [8, 9, -2]
plane(ABC,[A,B,C]):
detail(ABC);
Warning, assuming that the names of the axes are x, y and z
name of the object: ABC
form of the object: plane3d
equation of the plane: 12+12* x-12* y = 0
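The results of the Maple session above can be cross-checked by hand or with a few lines of code. The following Python sketch is not a substitute for geom3d; it simply repeats the arithmetic with exact fractions, and all variable names are our own.

```python
from fractions import Fraction

def cross(u, v):
    """Cross product of two vectors in R^3, given as 3-tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

A, B, C = (0, 1, 2), (2, 3, 1), (4, 5, 6)
d = tuple(b - a for a, b in zip(A, B))        # direction of the line AB
n = (1, 1, 1)                                 # normal of the plane P
rhs = dot(n, C)                               # P is the plane n . x = n . C
t = Fraction(rhs - dot(n, A), dot(n, d))      # parameter at which AB meets P
X = tuple(a + t * di for a, di in zip(A, d))  # point of intersection

AC = tuple(c - a for a, c in zip(A, C))
n_abc = cross(d, AC)                          # normal of the plane ABC
k = dot(n_abc, A)                             # plane ABC is n_abc . x = k
print(d, rhs, X, n_abc, k)
```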
Problems for Chapter 2
Problems 2.2: The dot product
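1. [R][V] Find the angles between the following pairs of vectors:
a) $\begin{pmatrix} -2 \\ 2 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 3 \\ 0 \end{pmatrix}$;
b) $\begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix}$, $\begin{pmatrix} -2 \\ 5 \\ 1 \end{pmatrix}$;
c) $\begin{pmatrix} 7 \\ 1 \\ -2 \end{pmatrix}$, $\begin{pmatrix} 3 \\ -11 \\ 5 \end{pmatrix}$;
d) $\begin{pmatrix} 3 \\ 0 \\ 1 \\ 4 \end{pmatrix}$, $\begin{pmatrix} -2 \\ 6 \\ 1 \\ 3 \end{pmatrix}$.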
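2. [H] Find the cosines of the internal angles of the triangles whose vertices have the following
coordinate vectors:
a) A $\begin{pmatrix} 4 \\ 0 \\ 2 \end{pmatrix}$, B $\begin{pmatrix} 6 \\ 2 \\ 1 \end{pmatrix}$ and C $\begin{pmatrix} 5 \\ 1 \\ 6 \end{pmatrix}$;
b) A $\begin{pmatrix} 0 \\ 2 \\ 1 \end{pmatrix}$, B $\begin{pmatrix} -1 \\ 3 \\ 0 \end{pmatrix}$ and C $\begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix}$;
c) A $\begin{pmatrix} 1 \\ -2 \\ 0 \\ 3 \end{pmatrix}$, B $\begin{pmatrix} 0 \\ 4 \\ -2 \\ 5 \end{pmatrix}$ and C $\begin{pmatrix} -2 \\ 1 \\ 0 \\ 3 \end{pmatrix}$.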
3. [R] A cube has vertices at the 8 points O (0, 0, 0), A (1, 0, 0), B (1, 1, 0), C (0, 1, 0), D (0, 0, 1),
E (1, 0, 1), F (1, 1, 1), G (0, 1, 1). Sketch the cube, and then find the angle between the
diagonals $\overrightarrow{OF}$ and $\overrightarrow{AG}$.
4. [H][V] Prove the following properties of dot products for vectors $\mathbf{a}, \mathbf{b}, \mathbf{c} \in \mathbb{R}^3$:
a) a · b = b · a, b) a · (λb) = λ(a · b), c) a · (b+ c) = a · b+ a · c.
5. [H] Use the dot product to prove that the diagonals of a square intersect at right angles.
Problems 2.3 : Applications: orthogonality and projection
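6. [R] Let $\mathbf{u}_1 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ 0 \\ -\frac{1}{\sqrt{2}} \end{pmatrix}$, $\mathbf{u}_2 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$, $\mathbf{u}_3 = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ 0 \\ \frac{1}{\sqrt{2}} \end{pmatrix}$ and $\mathbf{a} = \begin{pmatrix} 2 \\ -3 \\ 1 \end{pmatrix}$. Show that the
set of vectors $\{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ is an orthonormal set. Find scalars $\lambda_1, \lambda_2, \lambda_3$ such that
$\mathbf{a} = \lambda_1\mathbf{u}_1 + \lambda_2\mathbf{u}_2 + \lambda_3\mathbf{u}_3$.
HINT. See Example 2 of Section 2.3.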
7. [H] Consider the triangle ABC in $\mathbb{R}^3$ formed by the points A(3, 2, 1), B(4, 4, 2) and C(6, 1, 0).
a) Find the coordinates of the midpoint M of the side BC.
b) Find the angle BAC.
c) Find the area of the triangle ABC.
d) Find the coordinates of the point D on BC such that AD is perpendicular to BC.
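8. [R][V] Find the following projections:
a) the projection of $\begin{pmatrix} 2 \\ 1 \\ 4 \end{pmatrix}$ on $\begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}$,
b) the projection of $\begin{pmatrix} 2 \\ -1 \\ 2 \\ 4 \end{pmatrix}$ on $\begin{pmatrix} -1 \\ 3 \\ 0 \\ 2 \end{pmatrix}$,
c) the projection of $\begin{pmatrix} -2 \\ 2 \\ 7 \end{pmatrix}$ on the direction of the line $\mathbf{x} = \begin{pmatrix} 1 \\ 0 \\ 2 \end{pmatrix} + \lambda \begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix}$.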
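9. [R] Find the shortest distances between
a) the point (−2, 1, 5) and the line $\mathbf{x} = \begin{pmatrix} 1 \\ 2 \\ -5 \end{pmatrix} + \lambda \begin{pmatrix} 6 \\ 3 \\ -4 \end{pmatrix}$;
b) the point (0, 3, 8) and the line $\dfrac{x_1 - 1}{1} = \dfrac{x_2 - 2}{-1} = \dfrac{x_3 - 3}{4}$.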
Problems 2.4 : The cross product
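10. [R] Find the cross product $\mathbf{a} \times \mathbf{b}$ of the following pairs of vectors:
a) $\mathbf{a} = \begin{pmatrix} 0 \\ 2 \\ -4 \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix}$,
b) $\mathbf{a} = \begin{pmatrix} 3 \\ 1 \\ 4 \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} -2 \\ 6 \\ 1 \end{pmatrix}$,
c) $\mathbf{a} = \begin{pmatrix} 1 \\ 9 \\ 2 \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} 2 \\ 0 \\ -5 \end{pmatrix}$.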
11. [R][V] Find a vector which is perpendicular to $\begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix}$ and $\begin{pmatrix} -2 \\ 0 \\ 4 \end{pmatrix}$.
12. [H] Prove the following properties of cross products for vectors $\mathbf{a}, \mathbf{b}, \mathbf{c} \in \mathbb{R}^3$:
a) a× a = 0; b) a× b = −b× a;
c) a× (λb) = λ(a× b); d) a× (b+ c) = a× b+ a× c.
13. [R] Find the areas of, and the normals to the planes of, the following parallelograms:
a) the parallelogram spanned by $\begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 2 \\ 4 \end{pmatrix}$;
b) a parallelogram which has vertices at the three points A (0, 2, 1), B (−1, 3, 0) and
C (3, 1, 2) and sides $\overrightarrow{AB}$ and $\overrightarrow{AC}$.
14. [R][V] Find the areas of the triangles with the following vertices:
a) A (0, 2, 1), B (−1, 3, 0) and C (3, 1, 2);
b) A (2, 2, 0), B (−1, 0, 2) and C (0, 4, 3).
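15. [R] Let D, E, F be the points with coordinate vectors
$$\mathbf{d} = \begin{pmatrix} 5 \\ 6 \\ 7 \end{pmatrix}, \quad \mathbf{e} = \begin{pmatrix} 6 \\ 7 \\ 8 \end{pmatrix}, \quad \mathbf{f} = \begin{pmatrix} 7 \\ 8 \\ 10 \end{pmatrix}.$$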
a) Calculate cos(∠DEF ) as a surd.
b) Calculate the area of ∆DEF as a surd.
Problems 2.5 : Scalar triple product and volume
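16. [R][V] Show that $\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})$ can be written in the form
$$\begin{vmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{vmatrix},$$
where the vectors $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$ are replaced by the scalars $a_1, a_2, a_3$.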
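17. [R] Find the volumes of the following parallelepipeds:
a) the parallelepiped spanned by $\begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}$, $\begin{pmatrix} 4 \\ 1 \\ 2 \end{pmatrix}$ and $\begin{pmatrix} 0 \\ 2 \\ 1 \end{pmatrix}$;
b) a parallelepiped which has vertices at the four points A (2, 1, 3), B (−2, 1, 4), C (0, 4, 1)
and D (3,−1, 0), with sides $\overrightarrow{AB}$, $\overrightarrow{AC}$ and $\overrightarrow{AD}$.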
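18. [R] Show that the four points A, B, C, O with coordinate vectors
$\begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix}$, $\begin{pmatrix} 4 \\ 1 \\ 2 \end{pmatrix}$, $\begin{pmatrix} 6 \\ 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$ are coplanar.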
Problems 2.6 : Planes in R3
19. [R][V] Find parametric vector, point-normal, and Cartesian forms for the following planes:
a) the plane through (1, 2,−2) perpendicular to $\begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix}$;
b) the plane through (1, 2,−2) parallel to $\begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix}$ and $\begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}$;
c) the plane through the three points (1, 2,−2), (−1, 1, 2) and (2, 3, 1);
d) the plane with intercepts −1, 2 and −4 on the $x_1$, $x_2$ and $x_3$ axes.
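20. [R] Consider four points O, A, B, C in $\mathbb{R}^3$ with coordinate vectors
$$\mathbf{0} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{a} = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \quad \mathbf{c} = \begin{pmatrix} 2 \\ -1 \\ -1 \end{pmatrix}.$$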
Let Π be the plane through A and parallel to the lines OB and OC.
a) Find a parametric vector form for Π.
b) Find a vector n normal to Π.
c) Use the point normal form to find a Cartesian equation for Π.
21. [R] Find the projection of $\begin{pmatrix} 2 \\ 3 \\ 8 \end{pmatrix}$ on the normal to the plane $2x_1 + 2x_2 + x_3 = 4$.
22. [R][V] Find the shortest distances between
a) the point (2, 6,−5) and the plane $\left( \mathbf{x} - \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \right) \cdot \begin{pmatrix} -2 \\ 4 \\ 4 \end{pmatrix} = 0$;
b) the point (1, 4, 1) and the plane 2x1 − x2 + x3 = 5;
c) the point (1, 2, 1) and the plane with intercepts at 3,−1, 2 on the three axes;
d) the origin and the plane through the three points (2, 1, 3), (5, 3, 1) and (5, 1, 2).
23. [R] Let P be the plane in R3 through the points A = (1, 2, 0), B = (0, 1, 2), and C =
(−1, 3, 1).
a) Find a parametric vector form for the plane P .
b) Find a vector n normal to the plane P .
c) Find a point normal form for the plane P .
d) Find the shortest distance from the point Q = (2, 4, 5) to the plane P.
Chapter 3
Complex numbers
“Ignorance of Axioms”, the Lecturer continued,
“is a great drawback in life. It wastes so much time
to have to say them over and over again.”
Lewis Carroll, Sylvie and Bruno Concluded.
The main purpose of this chapter is to introduce the system of complex numbers. In the calculus
part of this subject we concentrate on the set R of real numbers which we think of as corresponding
to the points on the number line.
[Figure: the real number line from −2 to 2 with a point x marked. x is real; x ∈ R.]
If one wants to solve all quadratic equations then the real numbers do not suffice. In particular,
if $b^2 - 4ac < 0$ then it is not possible to find real solutions to $ax^2 + bx + c = 0$. In this chapter,
we will construct a larger set of numbers in which such an equation can be solved. Indeed, in this
larger set, every polynomial equation has at least one solution. Much of mathematics becomes
simpler (not more complicated) when complex numbers are used and applications to areas such as
physics, chemistry, electrical and mechanical engineering, oceanography, economics and the theory
of dynamical systems are made simpler by using complex numbers.
3.1 A review of number systems
Before going into the details of the complex number system, it is worth while reviewing some of
the basic properties of the number systems which you have already met in primary and high school
mathematics. This review is not intended as a rigorous development of the theory of number
systems – such a rigorous development is quite difficult and is sometimes given in more advanced
courses.
The first system of numbers that you will have met consists of the set of natural numbers
(or counting numbers)
N = {0, 1, 2, . . .},
together with rules of addition and multiplication. This set of numbers has the property that
addition or multiplication of natural numbers always produces another natural number, whereas
subtraction or division may not. Thus 3 − 5 and 3/5 are not natural numbers. We say that the set
of natural numbers is closed under the operations of addition and multiplication, whereas the set
is not closed under the operations of subtraction or division.
A very limited class of equations have solutions in N. For example, in N, x+3 = 7 and 5x+2 = 17
can be solved, but x+ 7 = 3 and 5x+ 2 = 18 can not! Thus, to solve all linear equations, a larger
set of numbers is required.
A set of numbers that is closed under subtraction (i.e., for which subtraction is always possible)
can be obtained by extending the natural number system by introducing a new number (−1).
Then, after using the usual rules for addition and multiplication, the set of integers
Z = {. . . ,−2,−1, 0, 1, 2, . . .}
is obtained.
However, the set Z is not closed under division as, for example, 3/5 is not an integer. At this stage
x + 7 = 3 has a unique solution but 5x + 2 = 18 still has no solution. To solve such an equation,
we need fractions. Thus, we extend the system of integers to the set of rational numbers, Q,
defined by
$$\mathbb{Q} = \left\{ \frac{p}{q} : p, q \in \mathbb{Z},\ q \neq 0 \right\}.$$
This system is then closed under the four standard operations of arithmetic of addition, subtraction,
multiplication and division (division by zero excluded).
Now that the rationals are in the set that we are focusing on, all linear equations in one variable with
rational coefficients, such as 5x + 2 = 18 or $\frac{3}{2}x + \frac{1}{4} = 1$, can be solved. All solutions will be rational
although sometimes a solution may happen to be an integer. Indeed, if we now consider the general
linear equation in one variable ax + b = c with a, b, c ∈ Q then this has a unique solution x = (c − b)/a
unless a = 0.
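To see closure of the rationals at work, the solution x = (c − b)/a can be computed exactly. The sketch below uses Python's standard fractions module; the function name solve_linear is hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def solve_linear(a, b, c):
    """Solve a*x + b = c exactly over the rationals (requires a != 0)."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a == 0:
        raise ValueError("no unique solution when a = 0")
    return (c - b) / a      # a rational number, by closure of Q under - and /

x1 = solve_linear(5, 2, 18)                           # 5x + 2 = 18
x2 = solve_linear(Fraction(3, 2), Fraction(1, 4), 1)  # (3/2)x + 1/4 = 1
print(x1, x2)
```

Both answers are again rational numbers, neither of which is an integer.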
The rationals are the first and primary example of a mathematical concept called a field. (See
Definition 1.) A field is a set (of numbers) which satisfies “twelve number laws”. These laws, or
axioms as they are called, form a minimal list of properties that one needs in order to be able to
add, subtract, multiply and divide (by non-zero numbers).
Much elementary mathematics can be carried out using rational numbers, as is done by your
calculator. The set of real numbers, R, contains all the rationals, along with numbers such as
$\sqrt{2}$, $\sqrt{3}$, π, e, etc. which are not rational. The proofs that $\sqrt{2}, \sqrt{3}, \sqrt{5}, \ldots$ are irrational are very
straightforward. The proofs that π and e are irrational are harder. You will see a proof of the
irrationality of e later in the year. The last question in the HSC Extension 2 2003 paper outlines
a proof of the irrationality of π.
Using real numbers, we can now solve some quadratic and higher degree equations. For example,
$x^2 - 3 = 0$ has 2 real solutions $x = \pm\sqrt{3}$ and $x^3 - x^2 - 3x + 3 = 0$ has 3 real solutions $x = 1, \pm\sqrt{3}$.
The set of real numbers also satisfies all twelve number laws and thus forms a field. Hence all
equations ax + b = c with a, b, c ∈ R, a ≠ 0, have a unique solution.
Note that the general motivation behind the development of number systems sketched above
is that some set of numbers is extended to a new set by introducing new “numbers” so that some
operation (e.g., subtraction, division, or finding lengths of sides of squares) is always possible in
the extended set.
We conclude this section by introducing the definition of a field.
Definition 1. Let F be a non-empty set of elements for which a rule of addition (+)
and a rule of multiplication are defined. Then the system is a field if the following
twelve axioms (or fundamental number laws) are satisfied.
1. Closure under Addition. If x, y ∈ F then x+ y ∈ F.
2. Associative Law of Addition. (x+ y) + z = x+ (y + z) for all x, y, z ∈ F.
3. Commutative Law of Addition. x+ y = y + x for all x, y ∈ F.
4. Existence of a Zero. There exists an element of F (usually written as 0)
such that 0 + x = x+ 0 = x for all x ∈ F.
5. Existence of a Negative. For each x ∈ F, there exists an element w ∈ F
(usually written as −x) such that x+ w = w + x = 0.
6. Closure under Multiplication. If x, y ∈ F then xy ∈ F.
7. Associative Law of Multiplication. x(yz) = (xy)z for all x, y, z ∈ F.
8. Commutative Law of Multiplication. xy = yx for all x, y ∈ F.
9. Existence of a One. There exists a non-zero element of F (usually written
as 1) such that x · 1 = 1 · x = x for all x ∈ F.
10. Existence of an Inverse for Multiplication. For each non-zero x ∈ F,
there exists an element w of F (usually written as 1/x or $x^{-1}$) such that
xw = wx = 1.
11. Distributive Law. x(y + z) = xy + xz for all x, y, z ∈ F.
12. Distributive Law. (x+ y)z = xz + yz, for all x, y, z ∈ F.
3.2 Introduction to complex numbers
We cannot find a real number x such that $x^2 + 1 = 0$, since the square of a real number is never
negative. In the sixteenth century, mathematicians wrote down formal solutions to quadratic
equations such as $x^2 + 1 = 0$ in the form $x = \pm\sqrt{-1}$. If this object $\sqrt{-1}$ were treated just like an
ordinary number, but with the property that $\sqrt{-1}\sqrt{-1} = -1$, then indeed these values of x satisfy
the above equation. The problem was that no-one really had any idea what writing down $\sqrt{-1}$
meant! No real-life problems that were being treated at the time ever had meaningful solutions
that involved these “imaginary” numbers.
In the eighteenth and nineteenth centuries however, these imaginary numbers became more and
more useful. Leonhard Euler introduced the now standard notation i for
$\sqrt{-1}$ and showed that the
use of these numbers allows one to express deep relationships between the trigonometric functions
and the exponential function.
Let us begin with the general quadratic equation $ax^2 + bx + c = 0$, $a \neq 0$. Multiplying
both sides of the equation by 4a and completing the square, we have
$$4a^2x^2 + 4abx + 4ac = 0$$
$$(2ax + b)^2 = b^2 - 4ac.$$
Hence, if $\Delta = b^2 - 4ac \geq 0$,
$$2ax + b = \pm\sqrt{\Delta} \quad\text{or}\quad x = \frac{-b \pm \sqrt{\Delta}}{2a}$$
and the equation has been solved.
If $\Delta = b^2 - 4ac < 0$, we can express the solutions using the complex number i. For example, if
$x^2 = -4$, then we can write
$$x = \pm\sqrt{-4} = \pm 2\sqrt{-1} = \pm 2i.$$
So, returning to our quadratic equation $ax^2 + bx + c = 0$ with $\Delta = b^2 - 4ac < 0$, we write
$\Delta = -(4ac - b^2)$, thus $\sqrt{\Delta} = \pm i\sqrt{4ac - b^2}$ and
$$x = \frac{-b \pm i\sqrt{4ac - b^2}}{2a}$$
are our 2 solutions.
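The quadratic formula can be tried out with complex arithmetic on a computer. The following is a minimal Python sketch using the standard cmath module; the function name quadratic_roots is our own, and Python writes the number i as 1j.

```python
import cmath

def quadratic_roots(a, b, c):
    """Return both roots of a*x**2 + b*x + c = 0 for real a, b, c with a != 0."""
    if a == 0:
        raise ValueError("not a quadratic equation: a must be non-zero")
    delta = b * b - 4 * a * c
    root = cmath.sqrt(delta)     # a multiple of i when delta < 0
    return (-b + root) / (2 * a), (-b - root) / (2 * a)

r1, r2 = quadratic_roots(1, 0, 4)    # x^2 = -4
s1, s2 = quadratic_roots(1, 2, 5)    # delta = 4 - 20 = -16 < 0
print(r1, r2, s1, s2)
```

When the discriminant is negative the two roots returned are complex conjugates of each other, as the formula above suggests.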
We now define the set C of complex numbers by:
$$\mathbb{C} = \{a + bi \mid a, b \in \mathbb{R},\ i^2 = -1\}.$$
A complex number written in the form a + bi, where a, b ∈ R, is said to be in Cartesian form.
The real number a is called the real part of a + bi, and b is called the imaginary part. The set C
contains all the real numbers (when b = 0). Numbers of the form bi, with b real (b ≠ 0), are called
purely imaginary numbers. The set of complex numbers also satisfies the twelve number laws,
and so it also forms a field.
Example 1. Some examples of complex numbers in Cartesian form are
$$3 + 4i, \quad 2 - i = 2 + (-1)i, \quad -5i = 0 + (-5)i, \quad 6 = 6 + 0i \quad\text{and}\quad \cos\frac{\pi}{3} + i\sin\frac{\pi}{3}.$$
♦
Note. The symbol ♦ indicates the end of an example.
3.3 The rules of arithmetic for complex numbers
Let z = a+ bi and w = c+ di, where a, b, c, d ∈ R.
Addition and subtraction. We define the sum, z + w, by
z + w = (a+ c) + (b+ d)i.
and the difference, z − w, by
z − w = (a− c) + (b− d)i.
That is, we add or subtract the real parts and the imaginary parts separately.
Example 1. If z = 6 + 5i and w = −3 + 4i then z + w = 3 + 9i and z − w = 9 + i. ♦
Multiplication. Expanding out, we have
$$(a + bi)(c + di) = ac + bci + adi + (bi)(di) = ac + (bc + ad)i + (bd)i^2.$$
Since $i^2 = -1$, this can be simplified to (ac − bd) + (bc + ad)i. Hence we define the product, zw, by
zw = (ac− bd) + (bc+ ad)i.
Example 2. If z = 6 + 5i and w = −3 + 4i then
zw = (6 + 5i)(−3 + 4i) = −18− 20− 15i + 24i = −38 + 9i.
♦
(It is wise to multiply out the terms producing real numbers first and then purely imaginary
numbers second.)
Division. To divide two complex numbers we use a similar process to that used in “rationalising
the denominator”. For example,
$$\frac{1}{2 + \sqrt{3}} = \frac{2 - \sqrt{3}}{(2 + \sqrt{3})(2 - \sqrt{3})} = \frac{2 - \sqrt{3}}{4 - 3} = 2 - \sqrt{3}$$
rationalises the denominator by multiplying both numerator and denominator by the denominator
with the sign of $\sqrt{3}$ changed.
A similar process can be applied when dividing complex numbers:
$$\frac{z}{w} = \frac{a + bi}{c + di} = \frac{(a + bi)(c - di)}{(c + di)(c - di)} = \frac{ac + bd + (bc - ad)i}{c^2 + d^2} = \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}\,i.$$
Thus we define the quotient $\dfrac{z}{w}$ $(w \neq 0)$ by
$$\frac{z}{w} = \frac{ac + bd}{c^2 + d^2} + \frac{bc - ad}{c^2 + d^2}\,i.$$
Note that the quotient of two complex numbers is also a complex number.
Example 3. If z = 3 − 4i and w = 1 + 2i then
$$\frac{z}{w} = \frac{3 - 4i}{1 + 2i} = \frac{(3 - 4i)(1 - 2i)}{(1 + 2i)(1 - 2i)} = \frac{-5 - 10i}{5} = -1 - 2i.$$
♦
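The rules of this section are easy to check by machine. The sketch below implements the quotient formula directly and compares it with Python's built-in complex division; the function name divide is hypothetical, and Python writes i as 1j.

```python
def divide(z, w):
    """Quotient of z = a + bi by w = c + di, using the formula in the text."""
    a, b, c, d = z.real, z.imag, w.real, w.imag
    denom = c * c + d * d            # zero only when w = 0
    if denom == 0:
        raise ZeroDivisionError("division by the complex number 0")
    return complex((a * c + b * d) / denom, (b * c - a * d) / denom)

# The numbers of Examples 1 and 2.
z1, w1 = 6 + 5j, -3 + 4j
total, difference, product = z1 + w1, z1 - w1, z1 * w1

# The numbers of Example 3.
z2, w2 = 3 - 4j, 1 + 2j
quotient = divide(z2, w2)
print(total, difference, product, quotient, z2 / w2)
```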
Note. The formula for division by w = c + di given above fails if and only if $c^2 + d^2 = 0$. But,
since c and d are real, $c^2 + d^2 = 0$ if and only if c = 0 and d = 0, that is, if and only if w = 0. Thus
the formula for complex division fails if and only if the denominator is 0.
Before concluding this section on complex number arithmetic, we should point out that we now
have three examples of fields; namely, the rational numbers Q, the real numbers R and the complex
numbers C.
Proposition 1. [X] The following properties hold for addition of complex numbers:
1. Uniqueness of Zero. There is one and only one zero in C.
2. Cancellation Property. If z, v, w ∈ C satisfy z + v = z + w, then v = w.
Proposition 2. [X] The following properties hold for multiplication of complex numbers:
1. 0z = 0 for all complex numbers z.
2. (−1)z = −z for all complex numbers z.
3. Cancellation Property. If z, v, w ∈ C satisfy zv = zw and z ≠ 0, then v = w.
4. If z, w ∈ C satisfy zw = 0, then either z = 0 or w = 0 or both.
One final point should be made. So far we have stressed the similarities between real number
arithmetic and complex number arithmetic. However, there are also many important differences.
One important difference is that while it makes sense to say that a real number is positive or that
one real number is greater than (or less than) another, it does not make sense to say that a complex
number is positive or that one complex number is greater than (or less than) another. That is,
complex numbers cannot be ordered.
3.4 Real parts, imaginary parts and complex conjugates
Definition 1. The real part of z = a + bi (written Re(z)), where a, b ∈ R, is
given by
Re(z) = a.
Definition 2. The imaginary part of z = a+ bi (written Im(z)), where a, b ∈ R,
is given by
Im(z) = b.
Example 1. If z = 6− 5i then Re(z) = 6 and Im(z) = −5. ♦
NOTE.
1. The imaginary part of a complex number is a real number.
2. Two complex numbers are equal if and only if their real parts are equal and their imaginary
parts are equal. That is
a+ bi = c+ di if and only if a = c and b = d,
where a, b, c, d ∈ R.
Example 2. Find real numbers a, b such that (2a+ b) + (3a− 2b)i = 4− i.
Solution. By comparing the real parts and imaginary parts, we have
2a+ b = 4 and 3a− 2b = −1.
Hence a = 1, b = 2. ♦
The complex conjugate of a complex number is defined as follows.
Definition 3. If z = a + bi, where a, b ∈ R, then the complex conjugate of z is
$\bar{z} = a - bi$.
Example 3. $\overline{3 - 4i} = 3 + 4i$, $\overline{6} = 6$, $\overline{-10i} = 10i$. ♦
Example 4. Let z = 2 + i, w = 1 + 2i. We have
$\overline{z + w} = \overline{3 + 3i} = 3 - 3i$.
Note that $\bar{z} + \bar{w} = (2 - i) + (1 - 2i) = 3 - 3i$. So $\overline{z + w} = \bar{z} + \bar{w}$. ♦
Properties of the Complex Conjugate.
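1. $\overline{\bar{z}} = z$.
2. $\overline{z + w} = \bar{z} + \bar{w}$ and $\overline{z - w} = \bar{z} - \bar{w}$.
3. $\overline{zw} = \bar{z}\,\bar{w}$ and $\overline{\left(\dfrac{z}{w}\right)} = \dfrac{\bar{z}}{\bar{w}}$.
4. $\mathrm{Re}(z) = \frac{1}{2}(z + \bar{z})$ and $\mathrm{Im}(z) = \frac{1}{2i}(z - \bar{z})$.
5. If z = a + bi, then $z\bar{z} = a^2 + b^2$, so $z\bar{z} \in \mathbb{R}$ and $z\bar{z} \geq 0$.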
The proofs of most of these properties are straightforward. For example, the proof of property 4 is
as follows.
If z = a + bi then $\bar{z} = a - bi$ and
$$z + \bar{z} = (a + bi) + (a - bi) = 2a = 2\,\mathrm{Re}(z)$$
$$z - \bar{z} = (a + bi) - (a - bi) = 2bi = 2i\,\mathrm{Im}(z).$$
The proofs of the remaining properties are left as exercises.
So far, we have handled complex numbers by writing them in a + bi form. With the above properties,
we can write the real and imaginary parts of a complex number in terms of the complex number
and its conjugate. We can also prove some general results in a simpler way.
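The properties of the complex conjugate can be spot-checked for particular numbers. This is a numerical check only and is no substitute for the proofs; the Python sketch below uses the numbers of Example 4, and the variable names are our own.

```python
z, w = 2 + 1j, 1 + 2j

# Properties 2 and 3: conjugation respects sums and products.
assert (z + w).conjugate() == z.conjugate() + w.conjugate()
assert (z * w).conjugate() == z.conjugate() * w.conjugate()

# Property 4: real and imaginary parts recovered from z and its conjugate.
re_z = (z + z.conjugate()) / 2
im_z = (z - z.conjugate()) / 2j

# Property 5: z times its conjugate is real and equals a^2 + b^2.
zz_bar = z * z.conjugate()
print(re_z, im_z, zz_bar)
```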
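Example 5. Let z, w ∈ C such that $z\bar{z} = w\bar{w}$. Prove that $\dfrac{z + w}{z - w}$ is purely imaginary.
Proof. To show that $\dfrac{z + w}{z - w}$ is purely imaginary, we only need to show that $\mathrm{Re}\left(\dfrac{z + w}{z - w}\right)$ is 0. Using
$\alpha + \bar{\alpha} = 2\,\mathrm{Re}(\alpha)$,
$$\begin{aligned}
\mathrm{Re}\left(\frac{z + w}{z - w}\right)
&= \frac{1}{2}\left(\frac{z + w}{z - w} + \overline{\left(\frac{z + w}{z - w}\right)}\right) \\
&= \frac{1}{2}\left(\frac{z + w}{z - w} + \frac{\bar{z} + \bar{w}}{\bar{z} - \bar{w}}\right) \\
&= \frac{(z + w)(\bar{z} - \bar{w}) + (z - w)(\bar{z} + \bar{w})}{2(z - w)(\bar{z} - \bar{w})} \\
&= \frac{z\bar{z} - z\bar{w} + w\bar{z} - w\bar{w} + z\bar{z} + z\bar{w} - w\bar{z} - w\bar{w}}{2(z - w)(\bar{z} - \bar{w})} \\
&= \frac{2(z\bar{z} - w\bar{w})}{2(z - w)(\bar{z} - \bar{w})}
\end{aligned}$$
which is 0 since $z\bar{z} = w\bar{w}$. The result follows.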
Note. The symbol □ indicates the end of a proof.
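The claim of Example 5 can be tested numerically for particular z and w. The Python sketch below is an illustration only; the function name ratio and the sample values are our own choices, and the two numbers must be distinct so that the denominator is non-zero.

```python
import cmath

def ratio(z, w):
    """Return (z + w)/(z - w); z and w must be distinct."""
    return (z + w) / (z - w)

# Two distinct numbers with the same modulus 5, so that z*conj(z) = w*conj(w) = 25.
q1 = ratio(3 + 4j, 5 + 0j)

# Another pair of equal modulus, built from polar coordinates.
q2 = ratio(cmath.rect(2.0, 0.3), cmath.rect(2.0, 2.1))

# Up to rounding error, both real parts should vanish.
print(q1, q2)
```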
3.5 The Argand diagram
In the rest of this chapter, unless otherwise stated, we shall assume a, b, x, y ∈ R when we say
z = a + bi or z = x + yi is a complex number.
There is a simple and extremely useful geometric picture of complex numbers which is obtained
by identifying a complex number z = a + bi with the point in the xy-plane whose coordinates are
(a, b). For example, 4 + 3i is represented by (4, 3), 4 by (4, 0) and 3i by (0, 3). The coordinate
plane with complex numbers plotted in this way is called an Argand diagram, as shown in Figure 1.
Note that a number z = a + bi is plotted with Re(z) = a in the x-direction and Im(z) = b in the
y-direction. For this reason, the x-axis is usually called the real axis and the y-axis is usually
called the imaginary axis.
Figure 1: The Argand Diagram.
Example 1. Plot the numbers 4, −4, i, −2i, −2+ 3i, 3− 4i, −3− 4i on an Argand diagram. The
solution is shown in Figure 2. ♦
Figure 2: Plots of 4, −4, i, −2i, −2 + 3i, 3 − 4i, −3 − 4i, on the Argand Diagram.
Note that we can write either i and −2i on the y-axis above, or simply mark 1 and −2,
remembering that this is the imaginary axis.
Plotting real and imaginary parts leads to a simple geometric picture of addition and subtraction
of complex numbers. Addition and subtraction is done by adding or subtracting x-coordinates and
y-coordinates separately, as shown in Figure 3.
Figure 3: Addition and Subtraction of Complex Numbers.
3.6 Polar form, modulus and argument
An alternative representation for complex numbers, which proves to be very useful, is obtained by
using plane polar coordinates r and θ instead of the Cartesian coordinates x and y. The coordinate
r is the distance of a point from the origin, and θ is an angle measured from the positive x-axis,
as shown in Figure 4.
Take a complex number z ≠ 0. Then from Figure 4, Pythagoras’ theorem gives
$$r = \sqrt{x^2 + y^2} \quad\text{with } r > 0.$$
Figure 4: Polar Coordinates of a Complex Number.
By trigonometry,
$$\cos\theta = \frac{x}{\sqrt{x^2 + y^2}} = \frac{x}{r} \quad\text{and}\quad \sin\theta = \frac{y}{\sqrt{x^2 + y^2}} = \frac{y}{r}.$$
Thus the relations between the real and imaginary parts of z = x+ yi and the polar coordinates r
and θ are
Re(z) = x = r cos θ and Im(z) = y = r sin θ,
and hence a complex number z ≠ 0 can be written using the polar coordinates r and θ as:
z = r(cos θ + i sin θ).
It is important to note here that the angle θ for a given complex number z = x + yi is not uniquely
defined, since adding or subtracting 2π produces exactly the same values for x and y and hence
the same complex number z. This result is summarised in the following proposition.
Proposition 1 (Equality of Complex Numbers). Two complex numbers
$$z_1 = r_1(\cos\theta_1 + i\sin\theta_1) \quad\text{and}\quad z_2 = r_2(\cos\theta_2 + i\sin\theta_2), \qquad z_1, z_2 \neq 0,$$
are equal if and only if $r_1 = r_2$ and $\theta_1 = \theta_2 + 2k\pi$, where k is any integer.
[X] Proof. Let $z_1 = x_1 + iy_1$ and $z_2 = x_2 + iy_2$.
If $r_1 = r_2$ and $\theta_1 = \theta_2 + 2k\pi$, then $\cos\theta_1 = \cos\theta_2$, $\sin\theta_1 = \sin\theta_2$,
$$x_1 = r_1\cos\theta_1 = r_2\cos\theta_2 = x_2$$
$$y_1 = r_1\sin\theta_1 = r_2\sin\theta_2 = y_2$$
and $z_1 = z_2$.
Conversely, if $z_1 = z_2$, then we have $x_1 = x_2$ and $y_1 = y_2$. Hence,
$$r_1 = \sqrt{x_1^2 + y_1^2} = \sqrt{x_2^2 + y_2^2} = r_2,$$
and since $r_1, r_2 \neq 0$,
$$\cos\theta_1 = \frac{x_1}{r_1} = \frac{x_2}{r_2} = \cos\theta_2 \quad\text{and}\quad \sin\theta_1 = \frac{y_1}{r_1} = \frac{y_2}{r_2} = \sin\theta_2,$$
and hence, $\cos(\theta_1 - \theta_2) = \cos\theta_1\cos\theta_2 + \sin\theta_1\sin\theta_2 = \cos^2\theta_1 + \sin^2\theta_1 = 1$,
so $\theta_1 - \theta_2 = 2k\pi$, that is, $\theta_1 = \theta_2 + 2k\pi$ for some $k \in \mathbb{Z}$.
The polar coordinate r that we have associated with a complex number is often called the
modulus of the complex number. The formal definition is:
Definition 1. For z = x + yi, where x, y ∈ R, we define the modulus of z to be
$$|z| = \sqrt{x^2 + y^2}.$$
The quantity |z| is also called the magnitude of z or the absolute value of z. Note that it has
a geometric interpretation as the distance r = |z| of the point z from the origin in an Argand
diagram. Note also that $z\bar{z} = |z|^2$.
Example 1. |−4| = 4, |2i| = 2, and $|3 - 4i| = \sqrt{3^2 + (-4)^2} = 5$. ♦
Argument and Principal Argument.
The polar coordinate θ that we have associated with a complex number is often called an
argument of the complex number and is written as arg(z). As mentioned above, such an angle
can be increased or decreased by 2π without changing the corresponding complex number. It is
desirable to define a particular argument that is unique for a given complex number. We do this by
choosing a value θ of the argument so that −π < θ ≤ π. This is called the principal argument
of z and is written as Arg(z).
The following diagrams illustrate the possible positions of a complex number z in each of the
four quadrants.
Figure 5: Principal Argument.
We can see from the diagrams, that if z lies in the first or second quadrant, we measure the
principal argument θ anticlockwise. For z in the 3rd or the 4th quadrant, we measure θ, as a
negative angle, clockwise. Note that we leave Arg(0) undefined.
One useful strategy is always to draw a diagram, then use the tangent ratio to find the acute
angle α formed by the corresponding triangle and use this to find the principal argument.
Example 2. Find the principal arguments of $1 + i$, $-1 + \sqrt{3}\,i$, $-\sqrt{3} - i$ and $1 - \sqrt{3}\,i$.
Solution.
We first plot z = 1 + i on an Argand diagram as in Figure 6. The complex number lies in the 1st
quadrant, and so 0 < Arg(z) < π/2. As shown in the figure,
$$\tan\theta = 1, \quad\text{and}\quad \mathrm{Arg}(1 + i) = \theta = \frac{\pi}{4}.$$
Figure 6.
From Figure 7, the complex number $-1 + \sqrt{3}\,i$ lies in the 2nd quadrant. Also,
$$\tan\alpha = \sqrt{3}, \quad\text{so}\quad \alpha = \frac{\pi}{3}.$$
Hence
$$\mathrm{Arg}\left(-1 + \sqrt{3}\,i\right) = \theta = \pi - \alpha = \frac{2\pi}{3}.$$
Figure 7.
From Figure 8, the complex number $-\sqrt{3} - i$ lies in the 3rd quadrant. Also,
$$\tan\alpha = \frac{1}{\sqrt{3}}, \quad\text{so}\quad \alpha = \frac{\pi}{6}.$$
Hence
$$\mathrm{Arg}\left(-\sqrt{3} - i\right) = \theta = -\pi + \alpha = -\frac{5\pi}{6}.$$
Figure 8.
From Figure 9, the complex number $1 - \sqrt{3}\,i$ lies in the 4th quadrant. Also
$$\tan\alpha = \sqrt{3}, \quad\text{so}\quad \alpha = \frac{\pi}{3}.$$
Hence,
$$\mathrm{Arg}\left(1 - \sqrt{3}\,i\right) = -\alpha = -\frac{\pi}{3}.$$
Figure 9.
♦
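The four principal arguments of Example 2 can be checked with a computer. The sketch below uses cmath.phase from Python's standard library, which returns atan2(Im z, Re z); the wrapper name principal_argument is our own. One caveat is that phase returns values in [−π, π], and can return −π for a negative real number whose imaginary part is the floating-point value −0.0, whereas Arg is defined above to lie in (−π, π].

```python
import cmath
import math

def principal_argument(z):
    """Principal argument of a non-zero complex number, via cmath.phase."""
    if z == 0:
        raise ValueError("Arg(0) is undefined")
    return cmath.phase(z)

root3 = math.sqrt(3)
numbers = [1 + 1j, complex(-1, root3), complex(-root3, -1), complex(1, -root3)]
arguments = [principal_argument(z) for z in numbers]
# Express each argument as a multiple of pi for comparison with the text.
print([a / math.pi for a in arguments])
```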
In the special case that a given complex number is real or purely imaginary, we can easily read
the principal argument from an Argand diagram.
Example 3. Find the principal argument for each of the numbers 4, −4, i and −2i.
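Solution. From Figure 2 in Section 3.5, we can easily read the arguments as
\[ \operatorname{Arg}(4) = 0, \quad \operatorname{Arg}(-4) = \pi, \quad \operatorname{Arg}(i) = \frac{\pi}{2}, \quad \operatorname{Arg}(-2i) = -\frac{\pi}{2}. \]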
♦
Example 4. Find the “a + ib” form of the complex number with modulus 4 and argument −π/6.
Solution.
\[ z = 4\left(\cos\left(-\frac{\pi}{6}\right) + i\sin\left(-\frac{\pi}{6}\right)\right) = 4\left(\frac{\sqrt{3}}{2} - \frac{1}{2}\, i\right) = 2\sqrt{3} - 2i. \]
♦
3.7 Properties and applications of the polar form
We begin by proving a useful lemma.
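Lemma 1. For any real numbers $\theta_1$ and $\theta_2$,
\[ (\cos\theta_1 + i\sin\theta_1)(\cos\theta_2 + i\sin\theta_2) = \cos(\theta_1 + \theta_2) + i\sin(\theta_1 + \theta_2). \]
Proof. Expanding the left hand side, we obtain
\[ (\cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2) + i(\cos\theta_1\sin\theta_2 + \sin\theta_1\cos\theta_2). \]
Then, using the standard trigonometric formulae
\[ \cos(\theta_1 + \theta_2) = \cos\theta_1\cos\theta_2 - \sin\theta_1\sin\theta_2 \]
and
\[ \sin(\theta_1 + \theta_2) = \cos\theta_1\sin\theta_2 + \sin\theta_1\cos\theta_2, \]
the result follows.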
Lemma 1 can be used to derive a very important and useful theorem for integer powers of
complex numbers in polar form. This theorem is called De Moivre’s Theorem.
Theorem 2 (De Moivre’s Theorem). For any real number θ and integer n,
\[ (\cos\theta + i\sin\theta)^n = \cos n\theta + i\sin n\theta. \qquad (\#) \]
Proof. We shall prove this theorem by proving that the condition (#) holds for the four separate
cases of n > 0, n = 0, n = −1 and n < −1, and hence that it holds for all n ∈ Z.
CASE 1. n > 0. The proof is by induction (see Section 3.11).
We first note that (#) is obviously true for n = 1.
We now show that, if (#) is true for some value of n > 0, then it is also true for n + 1. Assuming
(#) is true for some value of n > 0, we have
\[ \begin{aligned} (\cos\theta + i\sin\theta)^{n+1} &= (\cos\theta + i\sin\theta)(\cos\theta + i\sin\theta)^n \\ &= (\cos\theta + i\sin\theta)(\cos n\theta + i\sin n\theta) \\ &= \cos(n+1)\theta + i\sin(n+1)\theta \quad \text{(from Lemma 1)}, \end{aligned} \]
and hence (#) is true for n + 1.
Now, we have already seen that (#) is true for n = 1, and hence, from the first principle of
induction, it is also true for all integers n ≥ 1.
CASE 2. n = 0. The condition (#) is true for this case, provided we use the convention that
$z^0 = 1$ for any complex number z.
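CASE 3. n = −1. By definition, $z^{-1} = 1/z$. Then, applying the division rule for complex numbers
to z = cos θ + i sin θ, we have
\[ \begin{aligned} (\cos\theta + i\sin\theta)^{-1} &= \frac{1}{\cos\theta + i\sin\theta} \\ &= \frac{1}{\cos\theta + i\sin\theta} \times \frac{\cos\theta - i\sin\theta}{\cos\theta - i\sin\theta} \\ &= \frac{\cos\theta - i\sin\theta}{\cos^2\theta + \sin^2\theta} \\ &= \cos(-\theta) + i\sin(-\theta), \end{aligned} \]
where we have used the trigonometric identities
\[ \cos\theta = \cos(-\theta), \quad \sin\theta = -\sin(-\theta) \quad \text{and} \quad \cos^2\theta + \sin^2\theta = 1. \]
The condition (#) is therefore true for n = −1 also.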
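CASE 4. n < −1. Write n = −m, where m > 1 is an integer. Then
\[ \begin{aligned} (\cos\theta + i\sin\theta)^{n} = (\cos\theta + i\sin\theta)^{-m} &= \left((\cos\theta + i\sin\theta)^{-1}\right)^{m} \\ &= (\cos(-\theta) + i\sin(-\theta))^{m} \quad \text{(from case 3)} \\ &= \cos(-m\theta) + i\sin(-m\theta) \quad \text{(from case 1)} \\ &= \cos n\theta + i\sin n\theta. \end{aligned} \]
Hence (#) is true for n < −1 also, and the proof is complete.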
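A quick numerical sanity check of De Moivre's Theorem can be sketched as follows. This is not a proof, only a floating-point spot check, and the helper name `de_moivre_gap` is hypothetical.

```python
import math

def de_moivre_gap(theta: float, n: int) -> float:
    """Return |(cos t + i sin t)^n - (cos nt + i sin nt)| for an integer n."""
    lhs = complex(math.cos(theta), math.sin(theta)) ** n
    rhs = complex(math.cos(n * theta), math.sin(n * theta))
    return abs(lhs - rhs)

# One value of n from each of the four cases in the proof, plus a few more.
for n in (-7, -1, 0, 1, 5, 12):
    print(n, de_moivre_gap(0.73, n))  # each gap should be at rounding-error level
```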
De Moivre’s Theorem provides a simple formula for integer powers of complex numbers. How-
ever, it can also be used to suggest a meaning for complex powers of complex numbers. To make
this extension to complex powers it is actually sufficient just to give a meaning to the exponential
function for imaginary exponents. We first make the following definition.
Definition 1. (Euler’s Formula). For real θ, we define
\[ e^{i\theta} = \cos\theta + i\sin\theta. \]
This definition may appear somewhat arbitrary at first, but there are several reasons why it is
reasonable. First recall that for real constants a, θ and φ, and for any integer n, we have
\[ (e^{a\theta})^n = e^{an\theta}, \qquad (1) \]
\[ e^{a\theta} e^{a\varphi} = e^{a(\theta + \varphi)}, \qquad (2) \]
\[ e^{0} = 1, \qquad (3) \]
\[ \frac{d}{d\theta}\left(e^{a\theta}\right) = a e^{a\theta}. \qquad (4) \]
In fact, properties (3) and (4) are often taken as a definition of the exponential function for real
numbers.
The next point to notice is that if a is replaced by i and if $e^{i\theta}$ is replaced by (cos θ + i sin θ)
in these four formulae, then all four formulae are still satisfied. In fact, equation (1) would be
De Moivre’s Theorem, equation (2) would be Lemma 1, equation (3) would obviously be true, and
equation (4) would be
\[ \frac{d}{d\theta}(\cos\theta + i\sin\theta) = i(\cos\theta + i\sin\theta) = -\sin\theta + i\cos\theta, \]
which is also true (provided we assume that differentiation of expressions containing the symbol i
can be carried out in the same way as if i were a real constant). Thus, cos θ + i sin θ has exactly
the same properties that we would like $e^{i\theta}$ to have, and so the definition
\[ e^{i\theta} = \cos\theta + i\sin\theta \]
is consistent with our experience with other exponential functions.
Note also that, since cosine is an even function and sine is an odd function,
\[ e^{-i\theta} = \cos(-\theta) + i\sin(-\theta) = \cos\theta - i\sin\theta, \]
which is the conjugate of $e^{i\theta}$.
We should also say that in more sophisticated treatments of complex functions, $e^{i\theta}$ will be
defined by other means (usually a “power series”) and Euler’s formula then becomes a theorem.
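As a small illustration, Python's standard `cmath.exp` (which computes the complex exponential by its own means) can be compared numerically with cos θ + i sin θ. The values of `theta` and `phi` below are arbitrary choices made for this sketch.

```python
import cmath
import math

theta, phi = 2.1, -0.4  # arbitrary real angles chosen for illustration

z = cmath.exp(1j * theta)
euler = complex(math.cos(theta), math.sin(theta))

# Euler's formula: e^{i theta} = cos(theta) + i sin(theta)
assert cmath.isclose(z, euler)
# e^{-i theta} is the conjugate of e^{i theta}
assert cmath.isclose(cmath.exp(-1j * theta), z.conjugate())
# Property (2) with a replaced by i, which is Lemma 1
assert cmath.isclose(z * cmath.exp(1j * phi), cmath.exp(1j * (theta + phi)))
print(z, euler)
```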
De Moivre’s Theorem and Euler’s formula have a wide variety of uses, ranging from calculation of
powers of complex numbers to calculation of roots of complex numbers to derivation of trigonometric
formulae etc. We shall examine some of these applications in the remainder of this chapter.
3.7.1 The arithmetic of polar forms
Using Euler’s formula, we can rewrite the complex number z = r(cos θ + i sin θ) in an alternative
and more usual form. We will call this form the polar form of the complex number.
Definition 2. The polar form for a non-zero complex number z is
\[ z = re^{i\theta} \]
where r = |z| and θ = Arg(z).
Four important special cases are:
\[ 1 = e^{0}, \quad i = e^{i\pi/2}, \quad -1 = e^{i\pi}, \quad -i = e^{-i\pi/2}. \]
Some further examples of polar forms are as follows.
Example 1. $-4 = 4e^{i\pi}$, $-4i = 4e^{-i\pi/2}$, $1 - i\sqrt{3} = 2e^{-i\pi/3}$, $-1 - i\sqrt{3} = 2e^{-2i\pi/3}$. ♦
(Note: At High School you may have taken r(cos θ + i sin θ) as the polar form and abbreviated it
to r cis (θ). This notation is not used in this course.)
Since $e^{i\theta} = e^{i(\theta + 2k\pi)}$ for every integer k, it is sometimes convenient to express the polar form using
a general argument, that is,
\[ z = re^{i\theta}, \quad \text{where } \theta = \operatorname{Arg}(z) + 2k\pi, \ k \in \mathbb{Z}. \]
Using Euler’s formula, we can rewrite the equality proposition for polar forms (Proposition 1 of
Section 3.6) as
\[ z_1 = r_1 e^{i\theta_1} = r_2 e^{i\theta_2} = z_2 \]
if and only if $r_1 = r_2$ and $\theta_1 = \theta_2 + 2k\pi$ for some k ∈ Z.
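The polar form is very useful for multiplication and division of complex numbers. The formulae
for multiplication and division of polar forms are:
\[ z_1 z_2 = r_1 e^{i\theta_1}\, r_2 e^{i\theta_2} = r_1 r_2\, e^{i(\theta_1 + \theta_2)}, \]
\[ \frac{z_1}{z_2} = \frac{r_1 e^{i\theta_1}}{r_2 e^{i\theta_2}} = \frac{r_1}{r_2}\, e^{i(\theta_1 - \theta_2)}. \]
It is sometimes useful to express these results in terms of the modulus and argument. In this case,
we have for multiplication,
\[ |z_1 z_2| = r_1 r_2 = |z_1|\,|z_2| \quad \text{and} \quad \operatorname{Arg}(z_1 z_2) = \operatorname{Arg}(z_1) + \operatorname{Arg}(z_2) + 2k\pi, \]
where k is an integer, chosen so that $-\pi < \operatorname{Arg}(z_1 z_2) \leq \pi$. That is, the rule is to multiply the
moduli and add the arguments. For division, we have
\[ \left|\frac{z_1}{z_2}\right| = \frac{r_1}{r_2} = \frac{|z_1|}{|z_2|} \quad \text{and} \quad \operatorname{Arg}\left(\frac{z_1}{z_2}\right) = \operatorname{Arg}(z_1) - \operatorname{Arg}(z_2) + 2k\pi, \]
where k is an integer chosen so that $-\pi < \operatorname{Arg}(z_1/z_2) \leq \pi$. That is, the rule is to divide the moduli
and subtract the arguments.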
Example 2. Use polar forms to find the modulus and argument of $(-1 - i)(1 - i\sqrt{3})$ and
$(-1 - i)/(1 - i\sqrt{3})$.
Solution. The modulus and argument of $-1 - i$ and $1 - i\sqrt{3}$ are
\[ |-1 - i| = \sqrt{2} \quad \text{and} \quad \operatorname{Arg}(-1 - i) = -\frac{3\pi}{4}, \]
\[ |1 - i\sqrt{3}| = 2 \quad \text{and} \quad \operatorname{Arg}(1 - i\sqrt{3}) = -\frac{\pi}{3}. \]
Hence, for multiplication, we have
\[ \left|(-1 - i)(1 - i\sqrt{3})\right| = |-1 - i|\,\left|1 - i\sqrt{3}\right| = 2\sqrt{2}, \]
and, for some integer k,
\[ \operatorname{Arg}\left((-1 - i)(1 - i\sqrt{3})\right) = -\frac{3\pi}{4} + \left(-\frac{\pi}{3}\right) + 2k\pi = -\frac{13\pi}{12} + 2k\pi. \]
Then, on choosing k = 1, to obtain a principal argument in the interval (−π, π], we have
\[ \operatorname{Arg}\left((-1 - i)(1 - i\sqrt{3})\right) = -\frac{13\pi}{12} + 2\pi = \frac{11\pi}{12}. \]
For division, we have
\[ \left|\frac{-1 - i}{1 - i\sqrt{3}}\right| = \frac{|-1 - i|}{\left|1 - i\sqrt{3}\right|} = \frac{\sqrt{2}}{2} \]
and, for some integer k,
\[ \operatorname{Arg}\left(\frac{-1 - i}{1 - i\sqrt{3}}\right) = -\frac{3\pi}{4} - \left(-\frac{\pi}{3}\right) + 2k\pi = -\frac{5\pi}{12}, \]
where k = 0 has been chosen to obtain a principal argument in the interval (−π, π]. ♦
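The "multiply the moduli, add the arguments, then choose k" rule can be sketched in Python. The helper names `wrap`, `polar_mul` and `polar_div` are hypothetical, and a polar form is represented here simply as a pair (r, θ).

```python
import math

def wrap(theta: float) -> float:
    """Add the multiple of 2*pi that brings theta into (-pi, pi]."""
    k = -math.ceil((theta - math.pi) / (2 * math.pi))  # the integer k of the text
    return theta + 2 * k * math.pi

def polar_mul(p, q):
    """Multiply two polar forms (r, theta): multiply moduli, add arguments."""
    (r1, t1), (r2, t2) = p, q
    return r1 * r2, wrap(t1 + t2)

def polar_div(p, q):
    """Divide two polar forms (r, theta): divide moduli, subtract arguments."""
    (r1, t1), (r2, t2) = p, q
    return r1 / r2, wrap(t1 - t2)

z1 = (math.sqrt(2), -3 * math.pi / 4)  # -1 - i
z2 = (2.0, -math.pi / 3)               # 1 - i*sqrt(3)
print(polar_mul(z1, z2))  # modulus 2*sqrt(2), argument 11*pi/12
print(polar_div(z1, z2))  # modulus sqrt(2)/2, argument -5*pi/12
```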
The next example shows a reasonably simple method for finding the square roots of a complex
number in Cartesian form. It rests on the observation that if z ∈ C and z = a + bi with a, b ∈ R,
then
\[ |z^2| = |z \cdot z| = |z|\,|z| = |z|^2 \quad \text{or} \quad |(a + bi)^2| = a^2 + b^2, \]
and also $\operatorname{Re}(z^2) = a^2 - b^2$, $\operatorname{Im}(z^2) = 2ab$.
Example 3. Find the square roots of −5 − 12i.
Solution. We want to find all complex solutions to $z^2 = -5 - 12i$. Writing z = a + bi, this is
equivalent to finding all real solutions (a, b) to the equation $(a + bi)^2 = -5 - 12i$. Expanding the
left-hand side and equating real and imaginary parts, we have
\[ a^2 - b^2 = -5, \qquad 2ab = -12. \]
Also,
\[ a^2 + b^2 = |(a + ib)^2| = |-5 - 12i| = \sqrt{(-5)^2 + (-12)^2} = 13. \]
So we want to solve the system of equations
\[ a^2 + b^2 = 13, \qquad a^2 - b^2 = -5, \qquad 2ab = -12. \]
Solving the first pair in this system, we get $a^2 = 4$ and $b^2 = 9$. This means that a = ±2 and b = ±3.
The third equation is satisfied precisely when we choose opposite signs for a and b. Therefore the
two square roots of −5 − 12i are
\[ z = 2 - 3i \quad \text{and} \quad z = -2 + 3i, \]
or more compactly,
\[ z = \pm(2 - 3i). \]
♦
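The three-equation method of Example 3 can be sketched as a short Python function. The name `square_roots` is hypothetical; the code solves a² + b² = |w| and a² − b² = Re(w) for a² and b², then uses the sign of 2ab = Im(w) to pair the signs of a and b.

```python
import math

def square_roots(w: complex):
    """Both square roots of w = u + iv, found from
    a^2 + b^2 = |w|, a^2 - b^2 = u and 2ab = v."""
    u, v = w.real, w.imag
    m = abs(w)
    a = math.sqrt(max((m + u) / 2, 0.0))  # a^2 = (|w| + u) / 2
    b = math.sqrt(max((m - u) / 2, 0.0))  # b^2 = (|w| - u) / 2
    if v < 0:
        b = -b  # 2ab = v < 0 forces a and b to have opposite signs
    root = complex(a, b)
    return root, -root

print(square_roots(complex(-5, -12)))  # the roots 2 - 3i and -2 + 3i
```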
Example 4. Use the quadratic formula to solve the equation
\[ z^2 - (4 + i)z + (5 + 5i) = 0. \]
Solution. The proof of the quadratic formula uses
only field axioms and the fact of the existence of square roots and thus carries over directly to
quadratic polynomials with complex coefficients. The details are left as an exercise.
Proceeding, the solutions to $z^2 - (4 + i)z + (5 + 5i) = 0$ are
\[ z = \frac{4 + i \pm \sqrt{(4 + i)^2 - 4(5 + 5i)}}{2} = \frac{4 + i \pm \sqrt{-5 - 12i}}{2}. \]
From the previous example, the square roots of −5 − 12i are ±(2 − 3i). Therefore the roots of
the quadratic are
\[ z = \frac{4 + i + (2 - 3i)}{2} = 3 - i \quad \text{and} \quad z = \frac{4 + i - (2 - 3i)}{2} = 1 + 2i. \]
We can easily check that z = 3 − i and z = 1 + 2i are roots of the equation by substitution. ♦
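The same calculation can be sketched in Python. Here the standard-library function `cmath.sqrt` (which returns one particular square root) stands in for the hand method of Example 3, and `solve_quadratic` is a hypothetical helper name.

```python
import cmath

def solve_quadratic(a: complex, b: complex, c: complex):
    """Roots of a z^2 + b z + c = 0 (a != 0) by the quadratic formula."""
    d = cmath.sqrt(b * b - 4 * a * c)  # one square root of the discriminant
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

b, c = -(4 + 1j), 5 + 5j
z1, z2 = solve_quadratic(1, b, c)
print(z1, z2)  # approximately 3 - i and 1 + 2i

# Check by substitution, as suggested at the end of the example.
for z in (z1, z2):
    assert abs(z * z + b * z + c) < 1e-12
```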
Example 5. Show that the set of numbers of unit modulus, that is, the set
\[ S = \{z \in \mathbb{C} : |z| = 1\}, \]
is closed under multiplication and division.
Solution. For closure under multiplication we must prove that the product $z_1 z_2 \in S$ for all $z_1$,
$z_2 \in S$. Now, if $z_1, z_2 \in S$, then $|z_1| = 1$ and $|z_2| = 1$. Then
\[ |z_1 z_2| = |z_1|\,|z_2| = 1 \]
also. Thus, $z_1 z_2 \in S$, and hence S is closed under multiplication.
For closure under division we must prove that $z_1/z_2 \in S$ for all $z_1, z_2 \in S$. For $z_1/z_2$, we have
\[ \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|} = 1, \]
as $|z_1| = 1$ and $|z_2| = 1$. Thus $z_1/z_2 \in S$, and hence S is closed under division. ♦
Example 6. [X] Suppose a and b are real numbers (not both zero) and w = (az + b)/(bz + a).
Show that, if |z| = 1 and bz + a ≠ 0 (so that w is defined), then |w| = 1.
Solution. We have
\[ |w| = \left|\frac{az + b}{bz + a}\right| = \frac{|az + b|}{|bz + a|}. \]
Now, |z| = 1, so z = cos θ + i sin θ for some real θ. Then
\[ |az + b| = |a(\cos\theta + i\sin\theta) + b| = \sqrt{(a\cos\theta + b)^2 + (a\sin\theta)^2} = \sqrt{a^2 + 2ab\cos\theta + b^2} \]
and
\[ |bz + a| = |b(\cos\theta + i\sin\theta) + a| = \sqrt{(b\cos\theta + a)^2 + (b\sin\theta)^2} = \sqrt{a^2 + 2ab\cos\theta + b^2}. \]
So |w| = 1. ♦
3.7.2 Powers of complex numbers
With the exception of very simple cases such as squares and cubes, the simplest method of
calculating powers of a complex number z is to use the polar form. Thus, if
\[ z = re^{i\theta}, \]
then the properties of exponentials give
\[ z^n = r^n e^{in\theta}. \]
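Example 7. Calculate $\left(1 + i\sqrt{3}\right)^{10}$.
Solution. For $z = 1 + i\sqrt{3}$, we have |z| = 2, Arg(z) = π/3 and so $z = 2e^{i\pi/3}$. Hence,
\[ \left(1 + i\sqrt{3}\right)^{10} = \left(2e^{i\pi/3}\right)^{10} = 2^{10} e^{10i\pi/3} = 2^{10} e^{-2i\pi/3} = 2^{10}\left(\cos\frac{2\pi}{3} - i\sin\frac{2\pi}{3}\right) = 2^{9}\left(-1 - i\sqrt{3}\right). \]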
♦
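A minimal Python sketch of the polar-form method for powers follows, using the standard-library conversions `cmath.polar` and `cmath.rect`. The function name `power_via_polar` is hypothetical.

```python
import cmath
import math

def power_via_polar(z: complex, n: int) -> complex:
    """Compute z^n as r^n e^{i n theta}, where z = r e^{i theta}."""
    r, theta = cmath.polar(z)            # modulus and principal argument
    return cmath.rect(r ** n, n * theta)  # back to Cartesian form

w = power_via_polar(complex(1, math.sqrt(3)), 10)
expected = 2 ** 9 * complex(-1, -math.sqrt(3))  # the answer of Example 7
assert cmath.isclose(w, expected, rel_tol=1e-9)
print(w)
```

Note that `cmath.rect` accepts any argument, so there is no need to reduce 10π/3 to −2π/3 first; that reduction only matters when a principal argument is wanted.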
3.7.3 Roots of complex numbers
The polar form can also be used to find roots of complex numbers, where a root of a complex
number is defined as follows.
Definition 3. A complex number z is an nth root of a number $z_0$ if $z_0$ is the nth
power of z, that is, z is an nth root of $z_0$ if $z^n = z_0$.
If $z_0 \neq 0$, the nth roots of $z_0$ can be found by equating the polar forms of $z^n$ and $z_0$. Thus, if
$z = re^{i\theta}$ and $z_0 = r_0 e^{i\theta_0}$, we have
\[ z^n = r^n e^{in\theta} = r_0 e^{i\theta_0}, \]
and hence, on equating the moduli and arguments, we have
\[ r^n = r_0 \quad \text{and} \quad n\theta = \theta_0 + 2k\pi \quad \text{for } k \in \mathbb{Z}. \]
Thus
\[ r = r_0^{1/n} \quad \text{and} \quad \theta = \frac{\theta_0 + 2k\pi}{n} = \frac{\theta_0}{n} + \frac{2k\pi}{n} \quad \text{for } k \in \mathbb{Z}, \]
and
\[ z = re^{i\theta} = r_0^{1/n}\, e^{i(\theta_0 + 2k\pi)/n}. \qquad (*) \]
As k ranges over the integers, z takes precisely n different values. These n values can be found
by letting k take any n consecutive values. Thus z0 has precisely n distinct nth roots.
Example 8. Find all fifth roots of unity.
Solution. The modulus and argument of $z_0 = 1$ are $r_0 = |1| = 1$ and $\theta_0 = \operatorname{Arg}(1) = 0$.
Hence, the fifth roots of unity are complex numbers z satisfying
\[ z^5 = e^{2k\pi i} \quad \text{for } k \in \mathbb{Z}. \]
By (*) above, these roots are
\[ z = e^{i(0 + 2k\pi)/5}, \]
where the five consecutive integers for k are chosen to be −2, −1, 0, 1, 2 so that each argument
θ = 2kπ/5 lies in (−π, π].
[Figure 10: The Five Fifth Roots of Unity. Five points on the unit circle in the Argand diagram, with angle 2π/5 between neighbouring roots.]
Polar forms for the five fifth roots of unity are therefore
\[ e^{-4\pi i/5}, \quad e^{-2\pi i/5}, \quad 1, \quad e^{2\pi i/5}, \quad e^{4\pi i/5}. \]
♦
Note that the non-real solutions occur in conjugate pairs, e.g. $e^{2\pi i/5}$ and $e^{-2\pi i/5}$, and so on.
As shown in Figure 10, these fifth roots are equally spaced on a circle of radius 1 with angles
2π/5 between them.
Example 9. Find all sixth roots of −2.
Solution. The modulus and argument of −2 are |−2| = 2 and Arg(−2) = π.
Hence, the sixth roots of −2 are complex numbers z such that
\[ z^6 = 2e^{(\pi + 2k\pi)i} \quad \text{for } k \in \mathbb{Z}. \]
So, the sixth roots are
\[ z = 2^{1/6}\, e^{i(2k+1)\pi/6}, \quad \text{for } k = -3, -2, -1, 0, 1, 2. \]
Writing $\alpha = 2^{1/6}$, the six roots are: $\alpha e^{-5i\pi/6}$, $\alpha e^{-i\pi/2}$, $\alpha e^{-i\pi/6}$, $\alpha e^{i\pi/6}$, $\alpha e^{i\pi/2}$, and $\alpha e^{5i\pi/6}$.
[Figure 11: The Six Sixth Roots of −2. Six points on a circle centred at the origin of the Argand diagram, with angle π/3 between neighbouring roots.]
In “a + ib” form, these sixth roots are
\[ -2^{1/6}\left(\frac{\sqrt{3} + i}{2}\right), \quad -2^{1/6}\, i, \quad 2^{1/6}\left(\frac{\sqrt{3} - i}{2}\right), \quad 2^{1/6}\left(\frac{\sqrt{3} + i}{2}\right), \quad 2^{1/6}\, i, \quad -2^{1/6}\left(\frac{\sqrt{3} - i}{2}\right). \]
♦
There is again a simple geometric picture of this result. From Figure 11, the 6 roots lie on a
circle of radius $2^{1/6}$ with angles 2π/6 = π/3 between them.
In summary, there are always exactly n nth roots of a non-zero complex number. On the
Argand diagram, these nth roots of $z_0$ lie on a circle of radius $|z_0|^{1/n}$ with angles 2π/n between them.
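Formula (*) translates directly into a few lines of Python. The sketch below uses the hypothetical name `nth_roots` and takes k = 0, 1, ..., n − 1, which is one valid choice of n consecutive values; the examples above instead choose k so that every argument is a principal argument.

```python
import cmath
import math

def nth_roots(z0: complex, n: int):
    """The n distinct nth roots of a non-zero z0, from formula (*)."""
    if z0 == 0:
        raise ValueError("formula (*) assumes z0 is non-zero")
    r0, theta0 = cmath.polar(z0)
    r = r0 ** (1 / n)  # every root has modulus r0^(1/n)
    return [cmath.rect(r, (theta0 + 2 * k * math.pi) / n) for k in range(n)]

for z in nth_roots(-2, 6):  # the sixth roots of -2 from Example 9
    print(z, abs(z ** 6 + 2))  # second value should be at rounding-error level
```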
3.8 Trigonometric applications of complex numbers
A large variety of trigonometric formulae can be obtained by using complex numbers and the
Binomial Theorem. This theorem is stated below, and a proof is given in Section 3.12.
Theorem 1 (Binomial Theorem). If a, b ∈ C and n ∈ N, then
\[ \begin{aligned} (a + b)^n &= a^n + n a^{n-1} b + \frac{n(n-1)}{2!}\, a^{n-2} b^2 + \frac{n(n-1)(n-2)}{3!}\, a^{n-3} b^3 + \cdots + n a b^{n-1} + b^n \\ &= \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k, \end{aligned} \]
where the numbers $\binom{n}{k} = \dfrac{n!}{k!\,(n-k)!}$ are the binomial coefficients.
For small values of n the binomial coefficients may be easily calculated using Pascal’s triangle.
n    BINOMIAL COEFFICIENTS
0    1
1    1  1
2    1  2  1
3    1  3  3  1
4    1  4  6  4  1
5    1  5  10  10  5  1
Figure 12: Pascal’s Triangle.
Note that each coefficient (except the 1’s at the ends) is obtained by adding the two coefficients
immediately above it.
Example 1. For n = 4, the coefficients are 1, 4, 6, 4, 1, and hence
\[ (a + b)^4 = a^4 + 4a^3 b + 6a^2 b^2 + 4ab^3 + b^4. \]
♦
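The rule for building Pascal's triangle can be sketched in Python as follows. The function name `pascal_rows` is hypothetical, and `math.comb` is used only to cross-check against the formula n!/(k!(n − k)!).

```python
from math import comb

def pascal_rows(n_max: int):
    """Rows 0..n_max of Pascal's triangle, each built from the row above."""
    row = [1]
    rows = [row]
    for _ in range(n_max):
        # Interior entries are sums of the two entries immediately above.
        row = [1] + [row[i] + row[i + 1] for i in range(len(row) - 1)] + [1]
        rows.append(row)
    return rows

rows = pascal_rows(5)
for n, row in enumerate(rows):
    assert row == [comb(n, k) for k in range(n + 1)]
    print(n, row)
```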
Sine and cosine of multiples of θ.
It is now easy to express cos nθ or sin nθ in terms of powers of cos θ and sin θ by using De Moivre’s
Theorem.
Example 2. Find a formula for cos 4θ in terms of powers of cos θ and sin θ.
Solution. Using De Moivre’s Theorem and the Binomial Theorem gives
\[ \begin{aligned} \cos 4\theta + i\sin 4\theta &= (\cos\theta + i\sin\theta)^4 \\ &= \cos^4\theta + 4\cos^3\theta\,(i\sin\theta) + 6\cos^2\theta\,(i\sin\theta)^2 + 4\cos\theta\,(i\sin\theta)^3 + (i\sin\theta)^4. \end{aligned} \]
Then, using $i^2 = -1$, $i^3 = -i$, $i^4 = 1$, and separately equating real and imaginary parts on the
left and right hand sides, we obtain
\[ \cos 4\theta = \cos^4\theta - 6\cos^2\theta\sin^2\theta + \sin^4\theta \]
and
\[ \sin 4\theta = 4\left(\cos^3\theta\sin\theta - \cos\theta\sin^3\theta\right). \]
Note that the formula for cos 4θ can be rewritten in terms of powers of cos θ only by using
$\sin^2\theta = 1 - \cos^2\theta$ to replace the sine terms by cosines. ♦
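The expansion step in Example 2 works for any positive integer n, and the bookkeeping can be sketched in Python. The function name `multiple_angle_terms` and its output format (dictionaries keyed by the power of sin θ) are choices made for this illustration only.

```python
from math import comb

def multiple_angle_terms(n: int):
    """Expand (c + i s)^n, where c = cos(t) and s = sin(t).

    Returns (cos_terms, sin_terms): each maps k, the power of s, to the
    integer coefficient of c^(n-k) s^k in cos(n t) and sin(n t)."""
    cos_terms, sin_terms = {}, {}
    for k in range(n + 1):
        coeff = comb(n, k)
        if k % 2 == 0:
            # i^k is real: +1 or -1, so the term belongs to cos(n t)
            cos_terms[k] = coeff * (-1) ** (k // 2)
        else:
            # i^k is +i or -i, so the term belongs to sin(n t)
            sin_terms[k] = coeff * (-1) ** ((k - 1) // 2)
    return cos_terms, sin_terms

print(multiple_angle_terms(4))
# ({0: 1, 2: -6, 4: 1}, {1: 4, 3: -4})
```

For n = 4 the output reads as cos 4θ = c⁴ − 6c²s² + s⁴ and sin 4θ = 4c³s − 4cs³, in agreement with the example.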
Powers of sine and cosine.
Euler’s formula shows that the trigonometric functions, sine and cosine, are closely related to
the exponential function with an imaginary exponent.
Using Euler’s formula, we note that
\[ e^{in\theta} = \cos n\theta + i\sin n\theta, \]
\[ e^{-in\theta} = \cos(-n\theta) + i\sin(-n\theta) = \cos n\theta - i\sin n\theta. \]
On first adding and then subtracting these formulae, we obtain the important formulae
\[ \cos n\theta = \frac{1}{2}\left(e^{in\theta} + e^{-in\theta}\right), \qquad \sin n\theta = \frac{1}{2i}\left(e^{in\theta} - e^{-in\theta}\right). \]
In particular, we have
\[ \cos\theta = \frac{1}{2}\left(e^{i\theta} + e^{-i\theta}\right), \qquad \sin\theta = \frac{1}{2i}\left(e^{i\theta} - e^{-i\theta}\right). \]
We can apply the above formulae to derive trigonometric formulae which relate powers of sin θ
or cos θ to sines or cosines of multiples of θ.
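Example 3. Find a formula for $\sin^3\theta$ in terms of sines of multiples of θ.
Solution. Using the formula $\sin n\theta = \frac{1}{2i}\left(e^{in\theta} - e^{-in\theta}\right)$, for n = 1 and n = 3, and the Binomial
Theorem gives
\[ \begin{aligned} \sin^3\theta &= \left(\frac{1}{2i}\left(e^{i\theta} - e^{-i\theta}\right)\right)^3 = -\frac{1}{8i}\left(e^{i\theta} - e^{-i\theta}\right)^3 \\ &= -\frac{1}{8i}\left(e^{i3\theta} - 3e^{i\theta} + 3e^{-i\theta} - e^{-i3\theta}\right) \\ &= -\frac{1}{4}\left(\frac{1}{2i}\left(e^{i3\theta} - e^{-i3\theta}\right) - 3\cdot\frac{1}{2i}\left(e^{i\theta} - e^{-i\theta}\right)\right) \\ &= \frac{3}{4}\sin\theta - \frac{1}{4}\sin 3\theta. \end{aligned} \]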
♦
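Example 4. Find a formula for $\cos^5\theta$ in terms of cosines of multiples of θ.
Solution. Using the formula $\cos n\theta = \frac{1}{2}\left(e^{in\theta} + e^{-in\theta}\right)$, for n = 1, 3, 5, and the Binomial Theorem
gives
\[ \begin{aligned} \cos^5\theta &= \left(\frac{1}{2}\left(e^{i\theta} + e^{-i\theta}\right)\right)^5 = \frac{1}{32}\left(e^{i\theta} + e^{-i\theta}\right)^5 \\ &= \frac{1}{32}\left(e^{i5\theta} + 5e^{i3\theta} + 10e^{i\theta} + 10e^{-i\theta} + 5e^{-i3\theta} + e^{-i5\theta}\right) \\ &= \frac{1}{16}\left(\frac{1}{2}\left(e^{i5\theta} + e^{-i5\theta}\right) + \frac{5}{2}\left(e^{i3\theta} + e^{-i3\theta}\right) + 5\left(e^{i\theta} + e^{-i\theta}\right)\right) \\ &= \frac{1}{16}\left(\cos 5\theta + 5\cos 3\theta + 10\cos\theta\right). \end{aligned} \]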
♦
Formulae of this type are very useful in integration. For example, using the above identity,
\[ \int \cos^5\theta \, d\theta = \frac{1}{16}\left(\frac{1}{5}\sin 5\theta + \frac{5}{3}\sin 3\theta + 10\sin\theta\right) + C. \]
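The method of Example 4 can be sketched for a general power n. The function name `cos_power_coeffs` is hypothetical; it expands ((e^{iθ} + e^{−iθ})/2)^n with the Binomial Theorem and collects the terms e^{imθ} and e^{−imθ} into cos mθ, using exact fractions.

```python
from fractions import Fraction
from math import comb

def cos_power_coeffs(n: int):
    """Coefficients c_m (m >= 0) with cos^n(t) = sum over m of c_m * cos(m t)."""
    coeffs = {}
    for k in range(n + 1):
        # The term C(n, k) / 2^n multiplies e^{i(n - 2k)t}. The frequencies
        # +m and -m land in the same slot, which gives 2 cos(m t) in total.
        m = abs(n - 2 * k)
        coeffs[m] = coeffs.get(m, Fraction(0)) + Fraction(comb(n, k), 2 ** n)
    return coeffs

print(cos_power_coeffs(5))
# cos^5 = (1/16) cos 5t + (5/16) cos 3t + (5/8) cos t
```

Integrating term by term then replaces each c_m cos mθ (m ≥ 1) by (c_m/m) sin mθ, as in the integral above.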
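[X] Example 5. Find the sum of
\[ \cos\theta + \cos(2\theta) + \cdots + \cos(n\theta). \]
Solution. We use the fact that cos θ is the real part of $e^{i\theta}$. The required sum is then the real part
of the sum
\[ S_n = e^{i\theta} + e^{2i\theta} + \cdots + e^{ni\theta}. \]
Since $e^{ki\theta} = \left(e^{i\theta}\right)^k$, the sum $S_n$ is a geometric progression with ratio of successive terms given by
$R = e^{i\theta}$. Then, on using the formula for the sum of a geometric progression (see Section 3.11), which
requires R ≠ 1, that is, θ not an integer multiple of 2π, we have
\[ S_n = R + R^2 + \cdots + R^n = R\left(1 + R + \cdots + R^{n-1}\right) = R\,\frac{1 - R^n}{1 - R} = e^{i\theta}\,\frac{1 - e^{in\theta}}{1 - e^{i\theta}}. \]
We require the real part of $S_n$. The simplest way of finding the real part is to use the following
trick. Note that
\[ 1 - e^{i\theta} = e^{i\theta/2}\left(e^{-i\theta/2} - e^{i\theta/2}\right) = -2i\, e^{i\theta/2} \sin\left(\frac{\theta}{2}\right), \]
and similarly $1 - e^{in\theta} = -2i\, e^{in\theta/2} \sin\left(\frac{n\theta}{2}\right)$, and hence that
\[ S_n = e^{i\theta}\, \frac{e^{in\theta/2} \sin\left(\frac{n\theta}{2}\right)}{e^{i\theta/2} \sin\left(\frac{\theta}{2}\right)} = e^{i(n+1)\theta/2}\, \frac{\sin\left(\frac{n\theta}{2}\right)}{\sin\left(\frac{\theta}{2}\right)}. \]
The required sum of the cosine terms is therefore
\[ \sum_{k=1}^{n} \cos(k\theta) = \operatorname{Re}(S_n) = \cos\left(\frac{(n+1)\theta}{2}\right) \frac{\sin\left(\frac{n\theta}{2}\right)}{\sin\left(\frac{\theta}{2}\right)}. \]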
♦
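The closed form can be compared numerically with the direct sum. The sketch below uses hypothetical function names and assumes θ is not an integer multiple of 2π, so that sin(θ/2) ≠ 0.

```python
import math

def cos_sum_direct(theta: float, n: int) -> float:
    """cos(theta) + cos(2 theta) + ... + cos(n theta), term by term."""
    return sum(math.cos(k * theta) for k in range(1, n + 1))

def cos_sum_closed(theta: float, n: int) -> float:
    """Closed form cos((n+1)theta/2) sin(n theta/2) / sin(theta/2).
    Assumes theta is not an integer multiple of 2*pi."""
    return (math.cos((n + 1) * theta / 2) * math.sin(n * theta / 2)
            / math.sin(theta / 2))

for theta, n in [(0.3, 1), (1.1, 7), (-2.4, 20)]:
    print(theta, n, cos_sum_direct(theta, n), cos_sum_closed(theta, n))
```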
Sums of this type are used in X-ray diffraction, solid-state physics, chemistry, signal processing
in electrical engineering, and tomography, as well as in many other areas where periodic functions
and waves must be analysed.
3.9 Geometric applications of complex numbers
We have seen that the Argand diagram can be used to represent every complex number as a
point in a plane. This plane is frequently called the complex plane. Relations between complex
numbers can therefore be given a geometric interpretation in this complex plane, and, conversely,
the geometry of a plane can be represented by algebraic relations between complex numbers. This
connection between geometry and complex numbers has proved to be a very useful and powerful tool
in mathematics and in areas such as physics, electrical engineering, fluid dynamics, oceanography,
aerodynamics and mechanical engineering, among others.
In this section, we shall look at some simple geometrical examples. From Figure 3 in Section 3.5,
we can see that the complex numbers 0, z1, z2 and z1+ z2 form a parallelogram. For the geometric
interpretation of the difference of two complex numbers, we shall use a different approach.
Example 1. Give a geometric interpretation of |z − w| and Arg (z − w).
Solution. Let z = x + iy, w = a + ib. Note that z − w = (x − a) + i(y − b). Now
\[ |z - w| = \sqrt{(x - a)^2 + (y - b)^2}, \]
which is the distance between the points representing z and w.
To understand the geometric interpretation of Arg (z − w), we represent the complex number
z − w by the directed line segment (the arrow) from w to z. We plot z and w on the Argand
diagram for each of the four cases as shown in Figure 13. The angle α satisfies
\[ \sin\alpha = \frac{y - b}{|z - w|} \quad \text{and} \quad \cos\alpha = \frac{x - a}{|z - w|}, \]
and hence α = Arg(z − w).
[Figure 13: Geometric Interpretation of Arg (z − w). Four Argand diagrams showing the arrow from w to z in each of the four possible directions, with α = Arg (z − w) measured from a line through w parallel to the positive real axis. The first panel also marks the sides x − a, y − b and |z − w| of the right triangle.]
The facts that |z − w| is the distance between the points z and w, and that Arg (z − w) is the angle
between the arrow from w to z and a line in the direction of the positive real axis, are important
and should be remembered. ♦
Example 2. Give a geometric interpretation of each of the following sets of points
a) |z − 4 − i| = 3, and b) |z − 4 − i| ≤ 3.
Solution. |z − 4 − i| is the distance of a point z from the point 4 + i. Therefore |z − 4 − i| = 3
says z is at a distance of 3 from the point 4 + i. Thus, the equation describes a circle of radius 3
with centre at the point 4 + i. Similarly, |z − 4 − i| ≤ 3 is the region in which all points are at
a distance at most 3 from the point 4 + i, and hence the inequality describes the disc of radius
3 with centre at 4 + i (with the boundary circle included). The plots are shown in Figures 14 and 15.
[Figure 14: {z ∈ C : |z − 4 − i| = 3}, a circle centred at 4 + i. Figure 15: {z ∈ C : |z − 4 − i| ≤ 3}, the shaded disc centred at 4 + i.]
An alternative way of arriving at a geometric interpretation of a set is to use the “x + yi” (or
Cartesian) form of a complex number. For example, using z = x + yi, we can square the equation
|z − 4 − i| = 3 and rewrite it to obtain
\[ |z - 4 - i|^2 = |(x - 4) + (y - 1)i|^2 = (x - 4)^2 + (y - 1)^2 = 9, \]
which corresponds to the equation of a circle with centre at x = 4 and y = 1 and of radius 3. ♦
Example 3. Sketch the set
\[ \left\{ z \in \mathbb{C} : -\frac{\pi}{3} \leq \operatorname{Arg}(z - 4 - i) \leq \frac{\pi}{4} \right\}. \]
Solution. The above set is the set of all complex numbers z such that
\[ -\frac{\pi}{3} \leq \operatorname{Arg}(z - 4 - i) \leq \frac{\pi}{4}. \]
Now, as shown in Example 1, Arg (z − 4 − i) represents the angle between the line segment from
4 + i to z and a line parallel to the real axis, and hence the required sketch is as shown in Figure 16.
Notice that the point z = 4 + i is omitted from the set. This is because the function
Arg(z − 4 − i) is undefined there. The boundary rays are
\[ \left\{ z \in \mathbb{C} : \operatorname{Arg}(z - 4 - i) = -\frac{\pi}{3} \right\}, \qquad \left\{ z \in \mathbb{C} : \operatorname{Arg}(z - 4 - i) = \frac{\pi}{4} \right\}. \]
[Figure 16: the set {z : −π/3 ≤ Arg (z − 4 − i) ≤ π/4}, a wedge with its vertex at 4 + i (drawn as an open circle), bounded by rays at angles π/4 and −π/3 to the direction of the positive real axis.]
♦
Example 4. Sketch the set {z ∈ C : |z − 3| ≥ 2 and Re(z) ≤ 4}.
Solution. The given set is the set of all complex numbers z which satisfy both |z − 3| ≥ 2 and
Re(z) ≤ 4. The inequality |z − 3| ≥ 2 describes the region on and outside the circle of radius 2
centred at 3, while Re(z) = x ≤ 4 corresponds to the half-plane on and to the left of the line x = 4.
The required set is the intersection of these two regions as both inequalities must be satisfied. The
plots are shown in Figure 17. In practice, only the last diagram needs to be shown.
[Figure 17: Argand Diagrams for Example 4. Three panels: {z ∈ C : |z − 3| ≥ 2}, {z ∈ C : Re(z) ≤ 4}, and {z ∈ C : |z − 3| ≥ 2 and Re(z) ≤ 4}.]
♦
Example 5. Sketch the set {z ∈ C : |z − 1 + i| < 2 or Im(z) ≥ 0}.
Solution. The inequality |z − 1 + i| < 2 represents the open disc of radius 2 with centre at 1 − i.
[Figure 18: Argand Diagrams for Example 5. Three panels: {z ∈ C : |z − 1 + i| < 2}, {z ∈ C : Im(z) ≥ 0}, and {z ∈ C : |z − 1 + i| < 2 or Im(z) ≥ 0}.]
To indicate that the circle |z − 1 + i| = 2 is not included in the set, we draw it with dashes.
Also, Im(z) = y ≥ 0 describes the half-plane on and above the real axis. The required set is the
union of these two regions (as either inequality being satisfied puts the point in the set). ♦
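The "and" (intersection) and "or" (union) conditions of Examples 4 and 5 can be sketched as membership tests in Python. The function names are hypothetical, and the inequalities are read as non-strict wherever the solutions describe the boundary as included ("on and outside", "on and to the left", "on and above").

```python
def in_example_4(z: complex) -> bool:
    """On or outside the circle |z - 3| = 2 AND on or left of the line x = 4."""
    return abs(z - 3) >= 2 and z.real <= 4

def in_example_5(z: complex) -> bool:
    """Inside the open disc |z - (1 - i)| < 2 OR on or above the real axis."""
    return abs(z - (1 - 1j)) < 2 or z.imag >= 0

print(in_example_4(0j), in_example_4(3 + 0j), in_example_4(6 + 0j))
print(in_example_5(1 - 1j), in_example_5(5 + 0j), in_example_5(3 - 1j))
```

The point 3 − i lies on the dashed circle of Example 5 and below the real axis, so it fails both conditions and is excluded.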
3.10 Complex polynomials
We saw at the beginning of this chapter that there are polynomials which have no roots in the
real numbers, and we defined the number i to be a solution of $x^2 + 1 = 0$. We shall now
investigate in more detail the properties of polynomials and their roots when we work over the
complex numbers.
Definition 1. Suppose n is a natural number and $a_0, a_1, \ldots, a_n$ are complex
numbers with $a_n \neq 0$. Then the function p : C −→ C such that
\[ p(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n \]
is called a polynomial of degree n.
The zero polynomial is defined to be the function p(z) = 0 and we do not define its
degree.
Note. If $a_0, a_1, \ldots, a_n$ are real and z takes only real values, then we say that the polynomial is
defined over R.
Example 1. $p_1(z) = 1 + 3z^2$ and $p_2(z) = 1 + i - 4iz^3$ are examples of polynomials.
Note that $p_1(z)$ has real coefficients. If z is allowed to take complex values, we say $p_1$ is a polynomial
defined over C. However, if z takes only real values, we say it is a polynomial defined over R.
3.10.1 Roots and factors of polynomials
An important mathematical and practical problem concerning polynomials is that of factorising
them into simpler polynomials. This factorisation problem is closely related to the problem of
finding the roots (or zeroes) of polynomials.
Definition 2. A number α is a root (or zero) of a polynomial p if p(α) = 0.
Definition 3. Let p be a polynomial. Then, if there exist polynomials p1 and p2
such that p(z) = p1(z)p2(z) for all complex z, then p1 and p2 are called factors of
p.
Theorem 1 (Remainder Theorem). The remainder r which results when p(z) is divided by z − α
is given by r = p(α).
Theorem 2 (Factor Theorem). A number α is a root of p if and only if z − α is a factor of p(z).
The major difference between polynomials over the complex numbers and polynomials over the
real numbers is contained in the following theorem.
Theorem 3 (The Fundamental Theorem of Algebra). A polynomial of degree n ≥ 1 has at least
one root in the complex numbers.
We shall not try to prove the fundamental theorem here. The proof is usually given in courses
on functions of a complex variable.
There are several important points to note about the fundamental theorem.
The theorem is not true in general for polynomials over R. For example, the real quadratic q
defined by $q(z) = 1 + z^2$ does not have any real roots. In contrast, it has the two complex roots
±i.
We can combine the Fundamental Theorem of Algebra and the Factor Theorem to prove the
following extremely important theorem.
Theorem 4 (Factorisation Theorem). Every polynomial of degree n ≥ 1 has a factorisation into
n linear factors of the form
\[ p(z) = a(z - \alpha_1)(z - \alpha_2) \cdots (z - \alpha_n), \qquad (\#) \]
where the n complex numbers $\alpha_1, \alpha_2, \ldots, \alpha_n$ are roots of p and where a is the coefficient of $z^n$.
[X] Proof. To prove the theorem we use induction on the degree of p.
Condition (#) is obviously true for deg(p) = 1. We now prove that, for n ≥ 1, (#) is true for
polynomials of degree n + 1 whenever it is true for polynomials of degree n.
Let p be a polynomial of degree n + 1, where n ≥ 1. Then, by the Fundamental Theorem
there exists a root $\alpha_1 \in \mathbb{C}$ of p, and by the Factor Theorem $z - \alpha_1$ is a factor of p. Hence
$p(z) = (z - \alpha_1)p_1(z)$, where $p_1$ is a polynomial of degree n. Now, if (#) is true for polynomials of
degree n, then there are n complex numbers $\alpha_2, \ldots, \alpha_{n+1}$ and a complex number a such that
\[ p_1(z) = a(z - \alpha_2) \cdots (z - \alpha_{n+1}). \]
Thus,
\[ p(z) = (z - \alpha_1)p_1(z) = a(z - \alpha_1)(z - \alpha_2) \cdots (z - \alpha_{n+1}). \]
Hence, (#) is true for polynomials of degree n + 1 whenever it is true for polynomials of degree
n, and as (#) is true for n = 1, it is true for all n ≥ 1, by induction.
Finally, note that in the factorisation shown in (#) a is the coefficient of the $z^n$ term in p.
Example 2. Find the roots and factors of the quadratics
\[ p_1(z) = z^2 + 2z + 1, \quad p_2(z) = 2z^2 - 9z + 4 \quad \text{and} \quad p_3(z) = z^2 + z + 1. \]
Solution. It is straightforward to factorise $p_1(z)$:
\[ p_1(z) = (z + 1)^2. \]
So the polynomial has a repeated root −1, with multiplicity 2.
Likewise, we have
\[ p_2(z) = (2z - 1)(z - 4). \]
The roots of this polynomial are 1/2 and 4.
The polynomial $p_3(z)$ does not easily factor. Using the quadratic formula, the roots are
$z = \dfrac{-1 \pm i\sqrt{3}}{2}$, so
\[ p_3(z) = \left(z - \frac{1}{2}\left(-1 + i\sqrt{3}\right)\right)\left(z - \frac{1}{2}\left(-1 - i\sqrt{3}\right)\right). \]
♦
The Factorisation Theorem guarantees that a polynomial of degree n always has n roots (counted
with multiplicity), but it does not tell us how to actually find these roots. For polynomials of
degree n ≤ 4, exact formulae for the roots in terms of the coefficients have been found. For
quadratics, the exact formula for the roots is well known and very easy to use. For cubic
polynomials (degree 3) the exact formula is called Cardano’s formula and it is also reasonably easy
to use. For quartic polynomials (degree 4) there is also an exact formula for the roots but the
formula is very complicated. However, for polynomials of degree n > 4, there is no general formula
for the roots in terms of the coefficients using only arithmetic operations and square roots, cube
roots, etc. A proof of this fact is given in courses on Galois Theory.
In general, it is either difficult or impossible to find exact roots, and hence an exact factorisation,
for higher degree polynomials. It is usually necessary to resort to approximate numerical methods
to find the roots of a polynomial. At the present time the best general-purpose numerical method
is based on finding the “eigenvalues” of a “companion matrix” for the polynomial. These numerical
methods are discussed in advanced courses on numerical matrix algebra.
There are, however, some very simple types of higher degree polynomials for which exact roots,
and hence an exact factorisation, can be found. One such case is for polynomials of the form
$p(z) = z^n - a$, since in this case the solutions of
\[ z^n - a = 0 \quad \text{or equivalently of} \quad z^n = a, \]
are the nth roots of the number a. This problem has been discussed in Section 3.7.3.
Example 3. Factorise $z^6 + 1$.
Solution. We first solve $z^6 + 1 = 0$ or $z^6 = -1$ by finding the sixth roots of −1. Since the modulus
and argument of −1 are |−1| = 1 and Arg(−1) = π, we have
\[ z^6 = e^{(\pi + 2k\pi)i} \quad \text{for } k \in \mathbb{Z}. \]
So, the sixth roots in polar form are
\[ z = e^{i(2k+1)\pi/6}, \quad \text{for } k = -3, -2, -1, 0, 1, 2. \]
That is, $e^{-5i\pi/6}$, $e^{-i\pi/2}$, $e^{-i\pi/6}$, $e^{i\pi/6}$, $e^{i\pi/2}$, and $e^{5i\pi/6}$.
By the Factor Theorem, we have
\[ z^6 + 1 = \left(z - e^{i\pi/2}\right)\left(z - e^{-i\pi/2}\right)\left(z - e^{i\pi/6}\right)\left(z - e^{-i\pi/6}\right)\left(z - e^{5i\pi/6}\right)\left(z - e^{-5i\pi/6}\right). \]
An alternative form can be obtained by putting the roots into the “a + ib” form. You should do
this as an exercise, and you should obtain the result
\[ z^6 + 1 = (z - i)(z + i)\left(z - \tfrac{1}{2}\left(\sqrt{3} + i\right)\right)\left(z - \tfrac{1}{2}\left(\sqrt{3} - i\right)\right)\left(z - \tfrac{1}{2}\left(-\sqrt{3} + i\right)\right)\left(z - \tfrac{1}{2}\left(-\sqrt{3} - i\right)\right). \]
♦
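One way to check a factorisation like this numerically is to multiply the linear factors back out and compare coefficients. The sketch below uses a hypothetical helper `poly_from_roots`, which stores a polynomial as its list of coefficients of z⁰, z¹, ..., zⁿ.

```python
import cmath
import math

def poly_from_roots(roots, lead=1):
    """Coefficients [a_0, ..., a_n] of lead * (z - r_1)(z - r_2)...(z - r_n)."""
    coeffs = [complex(lead)]
    for r in roots:
        new = [0j] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            new[i + 1] += c  # contribution of c z^i times z
            new[i] -= r * c  # contribution of c z^i times (-r)
        coeffs = new
    return coeffs

# The six sixth roots of -1 from Example 3, for k = -3, ..., 2.
roots = [cmath.exp(1j * (2 * k + 1) * math.pi / 6) for k in range(-3, 3)]
coeffs = poly_from_roots(roots)
print([round(c.real, 12) + 0.0 for c in coeffs])  # coefficients of 1 + z^6
```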
3.10.2 Factorisation of polynomials with real coefficients
This is an important special case of the general theory. In the examples of factorisation we have
given above in Examples 2 and 3, all of the polynomials have real coefficients. For the quadratics
in Example 2, we showed one case of two equal real roots, one case of two distinct real roots, and
one case of two complex roots. In Example 3, we obtained six complex roots of $z^6 + 1$. However, if
you examine the complex roots in Examples 2 and 3, you will see that the roots occur in conjugate
pairs. Thus, in the quadratic example the roots $(-1 + i\sqrt3)/2$ and $(-1 - i\sqrt3)/2$ are a conjugate
pair, while in the sixth-root example there are three conjugate pairs
$\{e^{-i\pi/2}, e^{i\pi/2}\}$, $\{e^{-i\pi/6}, e^{i\pi/6}\}$ and $\{e^{-5i\pi/6}, e^{5i\pi/6}\}$.
These results are examples of a useful proposition:
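Proposition 5. If $\alpha$ is a root of a polynomial $p$ with real coefficients, then the complex conjugate
$\overline{\alpha}$ is also a root of $p$.
Proof. Let $p$ be a polynomial with real coefficients given by
\[ p(z) = a_0 + a_1 z + a_2 z^2 + \cdots + a_n z^n \quad\text{for all } z \in \mathbb{C}, \]
where $a_0, a_1, \dots, a_n$ are real numbers. Then, since $\alpha$ is a root of $p$,
\[ p(\alpha) = a_0 + a_1\alpha + a_2\alpha^2 + \cdots + a_n\alpha^n = 0. \]
On taking the complex conjugate of this equation, and using the facts that if $a_k$ is real and $b$ is any
complex number, then
\[ \overline{a_k b} = \overline{a_k}\,\overline{b} = a_k\overline{b} \]
and that the complex conjugate of $\alpha^k$ is $(\overline{\alpha})^k$, we have
\[ 0 = \overline{p(\alpha)} = a_0 + a_1\overline{\alpha} + a_2\overline{\alpha}^2 + \cdots + a_n\overline{\alpha}^n = p(\overline{\alpha}). \]
Hence, $\overline{\alpha}$ is also a root of $p$.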
An immediate consequence of this proposition is that the roots of a complex polynomial with real
coefficients are either real or occur in conjugate pairs. This fact can be used to obtain a factorisation
of a polynomial with real coefficients into linear or quadratic factors with real coefficients.
Proposition 6. If p is a polynomial with real coefficients, then p can be factored into linear and
quadratic factors all of which have real coefficients.
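Proof. For a polynomial with real coefficients, if $\alpha$ is a root which is not real then so is $\overline{\alpha}$, and
hence both $z - \alpha$ and $z - \overline{\alpha}$ are factors of $p$. On multiplying these factors, we obtain
\[ (z - \alpha)(z - \overline{\alpha}) = z^2 - (\alpha + \overline{\alpha})z + \alpha\overline{\alpha}. \]
Then, from Section 3.3, we have that
\[ \alpha + \overline{\alpha} = 2\operatorname{Re}(\alpha) \quad\text{and}\quad \alpha\overline{\alpha} = |\alpha|^2, \]
where $\operatorname{Re}(\alpha)$ and $|\alpha|^2$ are both real numbers. Hence,
\[ q(z) = (z - \alpha)(z - \overline{\alpha}) = z^2 - 2\operatorname{Re}(\alpha)\,z + |\alpha|^2 \]
is a quadratic with real coefficients. Thus, all complex factors can be replaced in pairs by quadratic
factors with real coefficients, and the proof is complete.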
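The pairing step in this proof is easy to try out on a computer. The sketch below (Python, with a hypothetical helper name chosen for illustration) builds the real quadratic $z^2 - 2\operatorname{Re}(\alpha)z + |\alpha|^2$ from a non-real root $\alpha$ and checks that $\alpha$ and its conjugate are both roots of it.

```python
def real_quadratic_from_root(alpha):
    """Return (b, c) such that (z - alpha)(z - conj(alpha)) = z^2 + b z + c."""
    b = -2 * alpha.real                      # -(alpha + conj(alpha))
    c = alpha.real ** 2 + alpha.imag ** 2    # alpha * conj(alpha) = |alpha|^2
    return b, c

alpha = 1 + 2j                               # an arbitrary non-real number
b, c = real_quadratic_from_root(alpha)
print(b, c)

# Both alpha and its conjugate satisfy z^2 + b z + c = 0.
for z in (alpha, alpha.conjugate()):
    print(abs(z * z + b * z + c) < 1e-12)
```

Both coefficients come out as real numbers, which is the whole point of the proposition.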
Example 4. Factorise $z^5 + 32$ into linear and quadratic factors with real coefficients.
Solution. We first solve $z^5 = -32$ to find the fifth roots of $-32$. Using the usual procedure, we
obtain the five solutions
\[ 2e^{i\pi/5},\quad 2e^{3i\pi/5},\quad 2e^{-i\pi/5},\quad 2e^{i\pi} = -2,\quad 2e^{-3i\pi/5}, \]
which consist of one real root and two conjugate pairs.
The factors corresponding to a conjugate pair can be replaced by a quadratic factor with real
coefficients. For example,
\[ \left(z - 2e^{i\pi/5}\right)\left(z - 2e^{-i\pi/5}\right) = z^2 - 2\left(e^{i\pi/5} + e^{-i\pi/5}\right)z + 4 = z^2 - 4z\cos\frac{\pi}{5} + 4. \]
The required factorisation is therefore
\[ z^5 + 32 = (z + 2)\left(z - 2e^{i\pi/5}\right)\left(z - 2e^{-i\pi/5}\right)\left(z - 2e^{3i\pi/5}\right)\left(z - 2e^{-3i\pi/5}\right) \]
\[ \phantom{z^5 + 32} = (z + 2)\left(z^2 - 4z\cos\frac{\pi}{5} + 4\right)\left(z^2 - 4z\cos\frac{3\pi}{5} + 4\right). \]
Note that this factorisation is certainly not obvious. If you multiply out the right hand side and
compare coefficients you will find some very surprising relations between $\cos\frac{\pi}{5}$ and $\cos\frac{3\pi}{5}$. ♦
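As a numerical check of Example 4, the real factors can be multiplied out by machine. This is a rough Python sketch rather than anything from the notes; polynomials are stored as coefficient lists with the constant term first, and `poly_mul` is a helper written just for this illustration.

```python
import math

def poly_mul(p, q):
    # Multiply two polynomials given as coefficient lists, constant term first.
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

linear = [2.0, 1.0]                                      # z + 2
quad1 = [4.0, -4.0 * math.cos(math.pi / 5), 1.0]         # z^2 - 4 z cos(pi/5) + 4
quad2 = [4.0, -4.0 * math.cos(3 * math.pi / 5), 1.0]     # z^2 - 4 z cos(3pi/5) + 4

product = poly_mul(linear, poly_mul(quad1, quad2))
print([round(c, 9) for c in product])                    # coefficients of z^0, ..., z^5
```

Up to rounding, the middle coefficients vanish and the outer ones are 32 and 1, which is consistent with $z^5 + 32$.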
3.11 Appendix: A note on proof by induction
In mathematics and logic a proposition is a meaningful statement which can only be either true or
false. Proof by induction is a method of proof which is often used to prove that some proposition
$P(n)$ is true for all integers $n \ge n_0$, where $n_0$ is a fixed integer.
Example 1. For each integer $n \ge 1$, let $P(n)$ be the proposition
\[ 1 + 2 + 3 + \cdots + n = \tfrac12 n(n + 1). \]
In this example, $P(n)$ gives the formula for the sum of the first $n$ positive integers. ♦
Example 2. For each integer $n \ge 0$, let $P(n)$ be the proposition
\[ 1 + r + r^2 + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r}, \]
where $r \ne 1$ is a fixed number. In this example, $P(n)$ gives the formula for the sum of $n + 1$ terms
of a geometric progression with ratio $r$. ♦
Example 3. For each integer $n \ge 2$, let $P(n)$ be the proposition that “$n$ can be completely factored
into prime numbers”. ♦
As we shall see, all the propositions P (n) in examples 1, 2 and 3 are true.
There are several versions of “proof by induction”, but the two most commonly used are based
on what are sometimes called the first and second principles of induction.
First Principle of Induction. Let $n_0 \in \mathbb{Z}$, and let $P(n)$, for $n \ge n_0$, be propositions. Then, if
1. $P(n_0)$ is true, and
2. for each $n \ge n_0$, $P(n + 1)$ is true whenever $P(n)$ is true,
then $P(n)$ is true for all $n \ge n_0$.
Let us return to the previous examples.
Example 1. We prove the formula for the sum of the first $n$ positive integers, i.e., that:
For $n \ge 1$,
\[ 1 + 2 + 3 + \cdots + n = \tfrac12 n(n + 1). \]
Proof. We first note that the proposition $P(1)$ states that $1 = \tfrac12 \times 1 \times 2$, which is clearly true.
Now, if $P(n)$ is true for some fixed $n \ge 1$, then the sum of the first $n + 1$ positive integers is
\[ 1 + 2 + \cdots + n + (n + 1) = \tfrac12 n(n + 1) + (n + 1) = \tfrac12 (n + 1)(n + 2) = \tfrac12 (n + 1)\bigl((n + 1) + 1\bigr), \]
and hence $P(n + 1)$ is also true. Thus, for all $n \ge 1$, $P(n + 1)$ is true whenever $P(n)$ is true.
Therefore, from the first principle of induction, $P(n)$ is true for all integers $n \ge 1$.
Example 2. We prove the formula for the sum of a geometric progression, i.e., that:
For $n \in \mathbb{N}$ and $r \ne 1$,
\[ 1 + r + \cdots + r^n = \frac{1 - r^{n+1}}{1 - r}. \]
Proof. We first note that the proposition $P(0)$ states that
\[ 1 = \frac{1 - r}{1 - r}, \]
which is clearly true for $r \ne 1$.
Now, if $P(n)$ is true for some fixed integer $n \ge 0$, then the sum of $n + 2$ terms is
\[ 1 + r + \cdots + r^n + r^{n+1} = \frac{1 - r^{n+1}}{1 - r} + r^{n+1} = \frac{1 - r^{n+2}}{1 - r} = \frac{1 - r^{(n+1)+1}}{1 - r}, \]
and hence $P(n + 1)$ is also true. Thus, for any integer $n \ge 0$, $P(n + 1)$ is true whenever $P(n)$ is
true.
Therefore, from the first principle of induction, $P(n)$ is true for all $n \ge 0$.
We now briefly describe the second principle of induction.
Second Principle of Induction. Let $n_0 \in \mathbb{Z}$, and let $P(n)$, for $n \ge n_0$, be propositions. Then, if
1. $P(n_0)$ is true, and
2. for each $n \ge n_0$, $P(n + 1)$ is true whenever all propositions $P(m)$ are true
for $m = n_0, \dots, n$ (i.e. $n_0 \le m \le n$),
then $P(n)$ is true for all $n \ge n_0$.
Example 3. Every integer $n \ge 2$ can be factored into primes.
Proof. The proposition $P(n)$ is that $n$ can be factored into primes. $P(2)$ then asserts that 2 can
be factored into primes. This result is clearly true, since 2 is itself a prime.
We now show that, if $P(m)$ is true for all integers $m$ with $2 \le m \le n$, then $P(n + 1)$ is also
true.
Now, the integer $n + 1$ must either be prime or not prime.
Case 1. $n + 1$ is prime. Then $n + 1$ is already factored and hence $P(n + 1)$ is true in this case.
Case 2. $n + 1$ is not prime. Then there are two integers $m_1$ and $m_2$, each greater than 1 and less
than $n + 1$, such that $n + 1 = m_1 m_2$. But, as $P(m)$ is true for $2 \le m \le n$, $m_1$ and $m_2$ can both be
factored into primes, and hence $n + 1$ can also be factored into primes. Thus, $P(n + 1)$ is true in this case also.
We have shown that $P(2)$ is true, and that $P(n + 1)$ is true whenever $P(m)$ is true for all $m$ with
$2 \le m \le n$. Hence, from the second principle of induction, $P(n)$ is true for all integers $n \ge 2$.
In fact the two apparently different types of induction are the same, as we can easily show.
Simply let the proposition Q(n) be “P (m) is true for m = n0, . . . , n”. Then use the first type of
proof.
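The proof of Example 3 has the same shape as a recursive procedure: either the number is prime, or it splits into two smaller factors which are handled in the same way. A minimal Python sketch of this idea follows; it is an illustration only (trial division, with a hypothetical function name), not an efficient factorisation method.

```python
import math

def prime_factors(n):
    """Return a list of primes whose product is n, for an integer n >= 2."""
    if n < 2:
        raise ValueError("n must be an integer with n >= 2")
    for d in range(2, math.isqrt(n) + 1):
        if n % d == 0:
            # Case 2: n = d * (n // d) with both factors between 2 and n - 1,
            # so each factor is treated by the same procedure.
            return prime_factors(d) + prime_factors(n // d)
    # Case 1: no divisor was found, so n is prime and is already factored.
    return [n]

print(prime_factors(360))
```

The recursion terminates because each recursive call is made on a strictly smaller integer that is still at least 2, which mirrors the hypothesis "$P(m)$ is true for $2 \le m \le n$".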
3.12 Appendix: The Binomial Theorem
Theorem 1. If $a, b \in \mathbb{C}$ and $n \in \mathbb{N}$, then
\[ (a + b)^n = a^n + n a^{n-1} b + \frac{n(n - 1)}{2!} a^{n-2} b^2 + \cdots + n a b^{n-1} + b^n = \sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k, \]
where the numbers
\[ \binom{n}{k} = \frac{n!}{k!\,(n - k)!} \]
are the binomial coefficients.
NOTE. We are using the convention that $a^0 = 1$ and $0! = 1$.
Proof. The proof of the theorem is based on the first principle of induction. In this case, the
proposition $P(n)$ to be proved true is that the formula given above for $(a + b)^n$ is correct.
For $n = 1$, $P(1)$ asserts that $(a + b)^1 = a^1 + b^1$, which is clearly true.
Now, if the formula is correct for some integer $n \ge 1$, then we have
\[ (a + b)^{n+1} = (a + b)(a + b)^n = (a + b)\sum_{k=0}^{n} \binom{n}{k} a^{n-k} b^k. \]
On multiplying out, we find that the coefficients of $a^{n+1}$ and $b^{n+1}$ are both 1 and that, for
$1 \le k \le n$, the coefficient of $a^{n+1-k} b^k$ is
\[ \binom{n}{k} + \binom{n}{k - 1}. \]
Now,
\[ \binom{n}{k} + \binom{n}{k - 1} = \frac{n!}{k!\,(n - k)!} + \frac{n!}{(k - 1)!\,(n - k + 1)!} = \frac{n!}{k!\,(n - k + 1)!}\,(n - k + 1 + k) = \frac{(n + 1)!}{k!\,(n - k + 1)!} = \binom{n + 1}{k}. \]
Hence,
\[ (a + b)^{n+1} = a^{n+1} + \sum_{k=1}^{n} \binom{n + 1}{k} a^{n+1-k} b^k + b^{n+1} = \sum_{k=0}^{n+1} \binom{n + 1}{k} a^{n+1-k} b^k, \]
and hence $P(n + 1)$ is true.
We have therefore shown that $P(1)$ is true, and that, for all $n \ge 1$, $P(n + 1)$ is true whenever
$P(n)$ is true. Thus, $P(n)$ is true for all $n \ge 1$ by induction.
Note. The relation
\[ \binom{n}{k - 1} + \binom{n}{k} = \binom{n + 1}{k} \]
shows why Pascal’s triangle works.
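To see the relation at work, one can build each row of Pascal's triangle from the previous one using only addition, and compare with the factorial formula. The Python sketch below is illustrative only; `pascal_row` is a name chosen here, while `math.comb` is the standard-library binomial coefficient.

```python
import math

def pascal_row(n):
    """Row n of Pascal's triangle, built using C(n+1, k) = C(n, k-1) + C(n, k)."""
    row = [1]
    for _ in range(n):
        # Interior entries are sums of adjacent entries of the previous row.
        row = [1] + [row[k - 1] + row[k] for k in range(1, len(row))] + [1]
    return row

print(pascal_row(5))

# Compare with n! / (k! (n - k)!) for the first few rows.
print(all(pascal_row(n) == [math.comb(n, k) for k in range(n + 1)] for n in range(11)))
```

Row $n$ of the triangle lists the coefficients in the expansion of $(a + b)^n$.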
3.13 Complex numbers and Maple
Maple is a Symbolic Computing Package which enables computers to do algebra and calculus.
Information about computing and Maple is available from the School of Mathematics web site, my
eLearning Vista, the MATH1131/1141 information booklet and the Computing Notes for the First
Year Mathematics Courses.
This section gives a few hints on how it can be used for complex numbers. The symbol I is used
for the complex number i. Multiplication is denoted by * and exponentiation by the ‘circumflex’
or ‘uparrow’ ^. Complex number arithmetic is normally done automatically. However, in some
situations the command evalc is required to get a complex expression in a + ib form, where a, b
are real. Some examples are:
(3+5*I)^3;
gives the correct value of $(3 + 5i)^3 = -198 + 10i$ on the screen.
Re(%);
gives the real part of this last expression. If you now try:
conjugate(%%);
the complex conjugate of the expression before last will appear. The command
evalc(polar(2,Pi));
can be used to enter complex numbers (here $2e^{\pi i}$) in polar form and this command gives the correct
value $-2$.
Instructions on how to convert complex numbers into polar form are available ‘on line’ by means
of
?convert[polar]
Alternatively, you can use the Maple Help menu. Various other conversion operations are also available.
Problems for Chapter 3
“Why,” said the Dodo, “the best way to explain it is to do it.”
- Lewis Carroll, Alice in Wonderland.
Problems 3.1 : A review of number systems
1. [R] Solve (if possible) the following equations for x ∈ N, x ∈ Z, x ∈ Q and x ∈ R.
a) x+ 25 = 0, 3x− 9 = 0, 3x+ 9 = 0, 3x+ 10 = 0.
b) $x^2 + 4x - 5 = 0$, $2x^2 - 13x + 15 = 0$, $x^2 - x - 1 = 0$, $x^2 + 3x + 4 = 0$.
c) sin(πx/3) = 0, sin(x/3) = 0.
2. [R] Is the set {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} closed under addition? Prove your answer.
3. [H] Can any finite set of integers be closed under addition? Prove your answer.
4. [R] Is the set {−1, 1} closed under multiplication and division?
Problems 3.3 : The rules of arithmetic for complex numbers
5. [R][V] Let $z = 2 + 3i$, $w = -1 + 2i$. Calculate $3z$, $z^2$, $z + 2w$, $z(w + 3)$, $\dfrac{z}{w}$, $\dfrac{w}{z}$.
6. [R] Write the following expressions in $a + ib$ or “Cartesian” form:
a) $\dfrac{1 + i}{1 + 2i}$, b) $\dfrac{2 - i}{3 + i} - \dfrac{3 - i}{2 + i}$.
7. [R] If $z = a + ib$, express the following in “Cartesian” form:
a) $z^2$, b) $\dfrac{1}{z}$, c) $\dfrac{z + 1}{z - 1}$.
8. [R][V] Use the quadratic formula to find all complex roots of the following polynomials.
a) $z^2 + z + 1$, b) $z^2 + 2z + 3$, c) $z^2 - 6z + 10$,
d) $-2z^2 + 6z - 3$, e) $z^4 + 5z^2 + 4$.
9. [H] Show that $\left[\left(\sqrt3 + 1\right) + \left(\sqrt3 - 1\right)i\right]^3 = 16(1 + i)$.
10. [R] Simplify $\left(\sqrt{3 + 4i} + \sqrt{3 - 4i}\right)^2$ (where we assume $\sqrt{z}$ has non-negative real part).
11. [H][V] Simplify
\[ \left(\frac{a + bi}{a - bi}\right)^2 - \left(\frac{a - bi}{a + bi}\right)^2 \]
where $a$ and $b$ are real numbers not both zero.
Problems 3.4 : Real parts, imaginary parts and complex conjugates
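12. [R] Find $\operatorname{Re}(z)$, $\operatorname{Im}(z)$ and $\overline{z}$ for $z = -1 + i$, $2 + 3i$, $2 - 3i$, $\dfrac{2 - i}{1 + i}$, $\dfrac{1}{(1 + i)^2}$.
13. [R] Let $z = 1 + 2i$ and $w = 3 - 4i$. Calculate $z^2$ and $\dfrac{z}{w}$, expressing the answers in
Cartesian form.
14. [R][V] Given that $2z + 3w = 1 + 12i$ and $z - w = 3 - i$, find $z$ and $w$.
15. [R] By evaluating each side of the equations, check that $\overline{zw} = \overline{z}\,\overline{w}$ and $\overline{\left(\dfrac{z}{w}\right)} = \dfrac{\overline{z}}{\overline{w}}$ are
satisfied by the complex numbers $z = 2 + 3i$, $w = -1 + 2i$.
16. [R] Prove that for any two complex numbers $z$ and $w$
a) $\operatorname{Im}(z) = \dfrac{1}{2i}(z - \overline{z})$, b) $2\operatorname{Re}(z) = z + \overline{z}$,
c) $\overline{(z - w)} = \overline{z} - \overline{w}$, d) $\overline{\left(\dfrac{1}{z}\right)} = \dfrac{1}{\overline{z}}$,
e) $\overline{zw} = \overline{z}\,\overline{w}$, f) $\overline{\left(\dfrac{z}{w}\right)} = \dfrac{\overline{z}}{\overline{w}}$.
17. [H] a) Use the properties of the complex conjugate to show that if the complex number
$\alpha$ is a root of a quadratic equation $ax^2 + bx + c = 0$ with $a, b, c$ being real coefficients,
then so is $\overline{\alpha}$.
b) Write down the monic quadratic polynomial with real coefficients which has $3 - 2i$ as
one of its roots.
c) Does the result of a) generalise to higher degree polynomials?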
Problems 3.5 : The Argand diagram and
3.6 : Polar form, modulus and argument
18. [R][V] Find the modulus, principal argument and polar form of each of the following
numbers and plot them on an Argand diagram:
a) $6 + 6i$, b) $-4$, c) $\sqrt3 - i$, d) $-\dfrac{1}{\sqrt2} - \dfrac{i}{\sqrt2}$, e) $-7 + 3i$.
19. [R] If $z = 4 + 3i$ and $w = 2 + i$ find $|3z - 3iw|$, $\operatorname{Im}((1 - i)z - 3|w|)$.
20. [H] If $z = 1 + i$, calculate the powers $z^j$ for $j = 1, 2, \dots, 10$ and plot them on an Argand
diagram. Is there a pattern? What is the smallest positive integer $n$ such that $z^n$ is a real
number?
21. [R] Find the “a + ib” form of the complex numbers whose moduli and principal arguments
are
a) $|z| = 3$, $\operatorname{Arg}(z) = \dfrac{\pi}{3}$; b) $|z| = 3$, $\operatorname{Arg}(z) = \dfrac{5\pi}{6}$;
c) $|z| = 3$, $\operatorname{Arg}(z) = -\dfrac{2\pi}{3}$; d) $|z| = 3$, $\operatorname{Arg}(z) = -\dfrac{\pi}{6}$;
e) [H] $|z| = 3$, $\operatorname{Arg}(z) = \dfrac{\pi}{8}$.
22. [R][V] a) Show that $z\overline{z} = |z|^2$. Hence, or otherwise, show that if $|z| = 1$, then $\overline{z} = z^{-1}$.
b) Show that $|\overline{z}| = |z|$ for all $z \in \mathbb{C}$.
c) If $z = r(\cos\theta + i\sin\theta)$, show that a polar form for the complex conjugate is
$\overline{z} = r(\cos(-\theta) + i\sin(-\theta))$.
23. [H] Show that $\operatorname{Re}\left(\dfrac{1 - z}{1 + z}\right) = 0$ for any complex $z$ with $|z| = 1$.
24. [H] Use $z\overline{z} = |z|^2$ to prove the identity $|z_1 + z_2|^2 + |z_1 - z_2|^2 = 2\left(|z_1|^2 + |z_2|^2\right)$.
25. [H] Use $z\overline{z} = |z|^2$ to show that
\[ |1 - \overline{z}w|^2 - |z - w|^2 = \left(1 - |z|^2\right)\left(1 - |w|^2\right) \]
and deduce that $|1 - \overline{z}w|^2 = |z - w|^2$ if either $z$ or $w$ lies on the unit circle.
26. [R] Plot the following complex numbers on an Argand diagram:
a) $2e^{i\pi/4}$, b) $3e^{5i\pi/6}$, c) $e^{-2i\pi/3}$, d) $2e^{-i\pi/2}$, e) $4e^{i\pi}$.
27. [R] Let $z = (1 - i)$ and $w = 2e^{i\pi/3}$. Calculate $w^6$, $z - w$ and $\dfrac{w}{z}$ and express your answers
in Cartesian form.
28. [R] For $z = 3e^{-5\pi i/6}$ and $w = 1 + i$, find $\operatorname{Re}\left(iw + z^2\right)$.
29. [R] Solve $|e^{i\theta} - 1| = 2$ for $-\pi < \theta \le \pi$.
30. [R] Find $\operatorname{Arg}(-1 + i)$ and $\operatorname{Arg}(-\sqrt3 + i)$ and hence find the principal arguments of the
complex numbers $(-1 + i)(-\sqrt3 + i)$ and $\dfrac{-1 + i}{-\sqrt3 + i}$.
31. [H][V] Let $z = (1 + \sqrt3\,i)$ and $w = (1 + i)$. Find $\operatorname{Arg} z$ and $\operatorname{Arg} w$ and hence $\operatorname{Arg}(zw)$.
Evaluate $zw$ and hence show that $\cos\dfrac{7\pi}{12} = \dfrac{1 - \sqrt3}{2\sqrt2}$. Find a similar expression for $\sin\dfrac{7\pi}{12}$.
32. [R] Find polar forms for $z = 1 + i\sqrt3$ and $w = 1 - i$, and hence find first the polar forms
and then the “a + ib” forms of $zw$, $z^9$, and $\left(\dfrac{z}{w}\right)^{12}$.
Problems 3.7 : Properties and applications of the polar form
33. [R] Find the polar, and hence also the Cartesian form for:
a) $\left(\sqrt3 + i\right)^5$, b) $\left(\dfrac{-1 + i}{\sqrt2}\right)^{1002}$, c) $\left(\dfrac{1 + \sqrt3\,i}{2}\right)^{-8}$.
34. [H] Find the square roots (in Cartesian form) of
a) $21 - 20i$, b) $-16 + 30i$, c) $24 + 70i$.
35. [H] a) Explain why multiplying a complex number $z$ by $e^{i\theta}$ rotates the point represented
by $z$ anticlockwise about the origin, through an angle $\theta$.
b) The point represented by the complex number $1 + i$ is rotated anticlockwise about
the origin through an angle of $\dfrac{\pi}{6}$. Find its image in polar and Cartesian form.
c) Find the complex number (in Cartesian form) obtained by rotating $6 - 7i$ anticlockwise
about the origin through an angle $\dfrac{3\pi}{4}$.
36. [H] If $z = re^{i\theta}$, $0 \le \theta \le \dfrac{\pi}{2}$, show that
a) $\left|(1 - i)z^2\right| = \sqrt2\,r^2$, b) $\operatorname{Arg}\left((1 - i)z^2\right) = 2\theta - \dfrac{\pi}{4}$,
c) $\left|\dfrac{1 + i\sqrt3}{z}\right| = \dfrac{2}{r}$, d) $\operatorname{Arg}\left(\dfrac{1 + i\sqrt3}{z}\right) = \dfrac{\pi}{3} - \theta$.
37. [H][V] Find the roots (in Cartesian form) of
a) $z^2 - 3z + (3 - i) = 0$, b) $z^2 - (7 - i)z + (14 - 5i) = 0$,
c) $z^2 + (4 - i)z + (1 + 13i) = 0$.
38. [R] Find the seventh roots of $-1$ and plot the roots on an Argand diagram.
39. [R] Find the sixth roots of $i$ and plot the roots on an Argand diagram.
40. [H] Find the fifth roots of $16 - 16i\sqrt3$ and plot the roots on an Argand diagram.
41. [H] Find all $z \in \mathbb{C}$ satisfying $(z - 6 + i)^3 = -27$.
42. [H][V] Show that if $\omega$ is an $n$th root of unity ($\omega \ne 1$ and $n > 1$) then
\[ \omega + \omega^2 + \cdots + \omega^n = 0. \]
Hint: Sum the geometric progression.
43. [H] Suppose $\theta, \phi \ne \dfrac{\pi}{2}(2k + 1)$ where $k$ is an integer. Use the fact that
\[ z = \frac{1 + z}{1 + z^{-1}} \]
a) to find the real and imaginary parts of
\[ \frac{1 + \cos 2\theta + i\sin 2\theta}{1 + \cos 2\theta - i\sin 2\theta}; \]
b) to show that if $n$ is a positive integer then
\[ \left(\frac{1 + \sin\phi + i\cos\phi}{1 + \sin\phi - i\cos\phi}\right)^n = \cos n\left(\frac{\pi}{2} - \phi\right) + i\sin n\left(\frac{\pi}{2} - \phi\right). \]
Problems 3.8 : Trigonometric applications of complex numbers
44. [R][V] Using De Moivre’s Theorem and the binomial theorem, prove the identity
\[ \cos 3\theta = 4\cos^3\theta - 3\cos\theta. \]
45. [R] a) Use De Moivre’s Theorem to express $\cos 6\theta$ and $\sin 6\theta$ in terms of $\cos\theta$ and $\sin\theta$.
b) Write $\cos 6\theta$ in terms of $\cos\theta$ only.
46. [H] Express $\cos 7\theta$ and $\sin 7\theta$ in terms of powers of $\cos\theta$ and $\sin\theta$.
47. [R] a) Derive a formula for $\cos\theta$ in terms of $e^{i\theta}$ and $e^{-i\theta}$.
b) Deduce a formula for $\cos^6\theta$ in terms of $\cos k\theta$, $1 \le k \le 6$.
c) Show that $\displaystyle\int_0^{\pi/2} \cos^6\theta\,d\theta = \frac{5\pi}{32}$.
48. [R][V] Express $\sin^5\theta$ and $\cos^4\theta$ in terms of sines or cosines of multiples of $\theta$, and hence
find their integrals.
Problems 3.9 : Geometric applications of complex numbers
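49. [R][V] Sketch the set of points on the complex plane corresponding to each of the following:
a) $|z - i| \le 2$, b) $|z - i| \le 2$ or $-\dfrac{\pi}{3} \le \operatorname{Arg}(z - i) \le \dfrac{2\pi}{3}$,
c) $|z| > 2$ and $|\operatorname{Im}(z)| \le 3$, d) $\operatorname{Re}(z) > \operatorname{Im}(z)$,
e) $|z - i| = |z + i|$, f) $|z - 1 - i| < 1$ and $-\dfrac{\pi}{4} < \operatorname{Arg}(z - 1 - i) \le \dfrac{\pi}{2}$,
g) $|z - i| = 2|z + i|$.
50. [R] Sketch the following on two carefully labelled Argand diagrams.
a) $S_1 = \{z : \operatorname{Re}(z) > 3\operatorname{Im}(z) \text{ and } |z - (3 + i)| > 2\}$,
b) $S_2 = \left\{z : |z - i| < |z + i| \text{ and } -\dfrac{\pi}{6} \le \operatorname{Arg}(z - i) \le \dfrac{\pi}{6}\right\}$.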
51. [R] Let S = {z ∈ C : Im(z) > −4 and |z − 1− i| > 3}.
a) Sketch S on a carefully labelled Argand diagram.
b) Does 2 + 4i belong to S?
52. [R] Let $z$ be a complex number. Prove that $|z - \operatorname{Re}(z)| \le |z - x|$ for all real numbers $x$.
Draw a sketch to illustrate the result.
53. [H] Let $z, w$ be complex numbers.
a) Sketch the subset of the complex plane defined by $w = e^{i\alpha}$ for $-\pi < \alpha \le \pi$.
b) Given that $\operatorname{Arg}(z) = \theta$, prove that $|z - e^{i\theta}| \le |z - e^{i\alpha}|$ for all $\alpha \in \mathbb{R}$.
c) Give a geometric interpretation of the result in part b).
Problems 3.10 : Complex polynomials
54. [R] Use the remainder theorem to find the remainder when
a) $2 + 3z - z^2 + 6z^3$ is divided by $z - 5$,
b) $1 - 6z + 5z^2 - 8z^3 + 2z^4$ is divided by $z + 2$,
c) $3z + 2z^2 + z^3$ is divided by $z - 1 - i$.
55. [R][V] Use the remainder theorem and the factor theorem to show that $z - 2$ is a factor
of $p(z) = 30 - 17z - 3z^2 + 2z^3$. Then divide $p$ by $z - 2$ and hence find all linear factors of
$p$.
56. [R] Use the method of the previous question to show that $z - 1$ and $z + 2$ are factors of
$p(z) = -8 - 6z + 7z^2 + 6z^3 + z^4$. Then find all linear factors of $p$.
57. [R] Find all linear factors of
a) $z^5 + i$, b) $z^6 + 8$.
58. [H] a) Factorise $x^8 - 1$ into real linear and real quadratic factors.
b) Repeat for $x^6 + 8$.
59. [R][V] Factorise $z^4 + 4$ over the rational numbers.
60. [R] Factorise the polynomial $z^4 + i$ into complex linear factors.
61. [R][V] a) Solve the equation $z^6 = -1$ where $z \in \mathbb{C}$.
b) Plot your solutions from part a) as points in the Argand diagram.
c) Write $z^6 + 1$ as a product of complex linear factors.
d) Write $z^6 + 1$ as a product of real quadratic factors.
62. [H] Let $p(z) = z^6 + z^4 + z^2 + 1$.
a) By using the identity $(z^2 - 1)p(z) = z^8 - 1$, find all 6 complex roots of $p(z)$ in polar
form.
b) Hence factorise $p(z)$ into complex linear factors.
c) Factorise $p(z)$ into a product of 3 real irreducible quadratic polynomials.
63. [H] Let $p(z) = 1 + z + z^2 + z^3 + z^4$.
a) Solve $z^5 - 1 = 0$ and hence factorise $p(z)$ into linear factors.
b) Find all linear and quadratic factors with real coefficients for $p(z)$.
c) Divide the equation $p(z) = 0$ by $z^2$. Let $x = z + \dfrac{1}{z}$ and deduce that $x^2 + x - 1 = 0$.
d) Deduce that
\[ \cos\frac{2\pi}{5} = \frac{-1 + \sqrt5}{4} \quad\text{and}\quad \cos\frac{4\pi}{5} = \frac{-1 - \sqrt5}{4}. \]
64. [H] Consider $f(t) = t^6 + t^5 - t^4 - 5t^3 - 6t^2 - 6t - 4$. Given that $-1 + i$ is a root of $f$ and
that $f$ also has two real integer roots,
a) factorise $f$ into complex linear factors,
b) factorise $f$ into linear and quadratic factors with real coefficients.
65. [H][V] Let $f(z) = z^5 - 2z^4 + 2z^3 - 5z^2 + 10z - 10$. Given that $1 + i$ is a root, find all
solutions to $f(z) = 0$.
Problems 3.11 : Appendix: A note on proof by induction
66. [R][V] Prove by mathematical induction that for all positive integers $n$,
\[ 1 \cdot 2 + 2 \cdot 3 + \cdots + n(n + 1) = \tfrac13 n(n + 1)(n + 2). \]
67. [R] Prove that, for all integers $n \ge 1$,
\[ 1^2 + 2^2 + 3^2 + \cdots + n^2 = \tfrac16 n(n + 1)(2n + 1). \]
68. [R] Prove that, for all integers $n \ge 1$,
\[ 1^3 + 2^3 + 3^3 + \cdots + n^3 = \tfrac14 n^2(n + 1)^2. \]
69. [H] Prove that, for all integers $n \ge 1$,
\[ 1^4 + 2^4 + 3^4 + \cdots + n^4 = \tfrac{1}{30} n(n + 1)(6n^3 + 9n^2 + n - 1). \]
Problems 3.13 : Complex numbers and Maple
70. [M] Write a Maple command (or commands) to evaluate the complex number $\left(\sqrt2 + 7i\right)^{13}$
in Cartesian or “a + ib” form.
71. [M] Use Maple to evaluate $(5 + i)^4 (239 - i)$. Then use De Moivre’s Theorem to show that
\[ \frac{\pi}{4} = 4\cot^{-1} 5 - \cot^{-1} 239. \]
“I can do Addition,” she said, “if you give me time
— but I can’t do Subtraction under ANY circumstances!”
Lewis Carroll, Through the Looking Glass.
Chapter 4
Linear equations and matrices
One glass lemonade (Why ca’n’t you drink water, like me?)
three sandwiches (They never put in half mustard enough.
I told the young woman so, to her face . . . )
and seven biscuits. Total one-and-twopence.
Lewis Carroll, A Tangled Tale.
Linear equations and matrices are very important because they are used as mathematical models
in virtually all areas in which mathematics is applied in the modern world and because they
also appear at the heart of computational algorithms for solving a vast array of quite diverse
mathematical problems.
The processes described in this chapter are appropriate for the general theory of handling
systems of linear equations. They work well for most small-scale examples.
An understanding of linear equations and matrices is essential for much of the later work in
algebra in this book so you must master the material in this chapter.
4.1 Introduction to linear equations
You are already familiar with solving linear equations in one variable such as $2x = 6$, and systems
of two simultaneous linear equations in two variables such as
\[ \begin{cases} 2x + y = 3 \\ 3x - y = 7. \end{cases} \]
A linear equation in two variables such as x− 2y = 1 has infinitely many solutions, since once
y is specified, then x is determined and conversely. It is convenient to write the solutions to such
an equation parametrically, by introducing a parameter λ.
Example 1. Express solutions of x− 2y = 1 parametrically.
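Solution. We put $y = \lambda$; then $x = 2\lambda + 1$, so we can write
\[ \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 2\lambda + 1 \\ \lambda \end{pmatrix}, \quad \lambda \in \mathbb{R}, \]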
as the parametric solution to x − 2y = 1. Notice that this is not unique, since we could also have
put x = λ and solved for y. ♦
Two simultaneous linear equations in two unknowns may have:
(1) No solution, for example $\begin{cases} x + y = 3 \\ x + y = 5. \end{cases}$
(2) A unique solution, for example $\begin{cases} 2x + y = 3 \\ 3x - y = 7. \end{cases}$
(3) Infinitely many solutions, for example $\begin{cases} x - 2y = 1 \\ 2x - 4y = 2. \end{cases}$
Since these equations correspond to straight lines in the plane, case (1) represents two distinct
parallel lines; case (2), two non-parallel lines and in case (3), the two lines are the same.
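The three cases can be told apart from the coefficients alone. The following Python sketch is one way to do it, under the assumption that each equation has at least one non-zero coefficient (so that it really is a line); the function name and the use of $2 \times 2$ cross-products are choices made for this illustration, not a method taken from the notes.

```python
def classify_2x2(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Assumes each equation has at least one non-zero coefficient of x or y.
    """
    if a1 * b2 - a2 * b1 != 0:
        # The two lines are not parallel, so they meet in exactly one point.
        return "unique solution"
    # The lines are parallel; they coincide exactly when the full rows
    # (a1, b1, c1) and (a2, b2, c2) are proportional.
    if a1 * c2 - a2 * c1 == 0 and b1 * c2 - b2 * c1 == 0:
        return "infinitely many solutions"
    return "no solution"

print(classify_2x2(1, 1, 3, 1, 1, 5))      # case (1)
print(classify_2x2(2, 1, 3, 3, -1, 7))     # case (2)
print(classify_2x2(1, -2, 1, 2, -4, 2))    # case (3)
```

With integer coefficients, as here, the comparisons with zero are exact; with floating-point data a tolerance would be needed.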
Now consider some examples of two equations in three unknowns. Assume that none of the
equations is of the form 0x1 + 0x2 + 0x3 = b. The solutions of such simultaneous equations can be
interpreted as the points of intersection (if any) of two planes in R3. From geometry, we expect
that two planes either
1. intersect along a line, or
2. are parallel and do not intersect, or
3. are parallel with all points in common, (that is, they are the same plane).
We convert each system to another equivalent system, i.e. a system which has the same solution
set, until we get one which gives us a simple form of the solution set.
Example 2. Find the solution of the system of equations
\[ \begin{aligned} x_1 + x_2 + x_3 &= 5 \qquad (1) \\ 3x_1 + 4x_2 + 7x_3 &= 20 \qquad (2) \end{aligned} \]
Solution. To solve this pair, we first eliminate $x_1$ from equation (2) by subtracting 3 times
equation (1) from equation (2). The new equation (2) is then
\[ x_2 + 4x_3 = 5 \qquad (2') \]
Therefore the original system is equivalent to
\[ \begin{aligned} x_1 + x_2 + x_3 &= 5 \qquad (1) \\ x_2 + 4x_3 &= 5 \qquad (2') \end{aligned} \]
Equation (2′) has infinitely many solutions which can be represented using a parameter. Put
$x_3 = \lambda$, then
\[ x_2 = 5 - 4\lambda. \]
Substituting this back into equation (1), we have
\[ x_1 + (5 - 4\lambda) + \lambda = 5, \]
so $x_1 = 3\lambda$.
We write the solutions as
\[ \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3\lambda \\ 5 - 4\lambda \\ \lambda \end{pmatrix} = \begin{pmatrix} 0 \\ 5 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} 3 \\ -4 \\ 1 \end{pmatrix}, \quad \lambda \in \mathbb{R}. \]
The geometric interpretation of this example is that equations (1) and (2) each represent a
plane, and the solutions of the system represent the points of intersection of the two planes. In this
particular example, the intersection of the two planes is a line. This line passes through the point
$(0, 5, 0)$ and is parallel to the vector $\begin{pmatrix} 3 \\ -4 \\ 1 \end{pmatrix}$. ♦
Example 3. Find the solution of the system
2x1 − 3x2 + x3 = 20 (1)
4x1 − 6x2 + 2x3 = 34. (2)
Solution. As usual, we eliminate x1 from equation (2). On subtracting 2 × equation (1) from
equation (2), we obtain the new equation (2) as
0x1 + 0x2 + 0x3 = −6, (2′)
which clearly has no solution. Thus the system has no solution. The geometric interpretation is
that the planes represented by equations (1) and (2) are parallel and do not intersect. ♦
Example 4. Find the solutions of the system
x1 − 3x2 + x3 = 20 (1)
2x1 − 6x2 + 2x3 = 40. (2)
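Solution. Eliminating $x_1$ from equation (2) gives the new equation (2) as
\[ 0x_1 + 0x_2 + 0x_3 = 0. \qquad (2') \]
The given system is thus equivalent to
\[ \begin{aligned} x_1 - 3x_2 + x_3 &= 20 \qquad (1) \\ 0x_1 + 0x_2 + 0x_3 &= 0 \qquad (2') \end{aligned} \]
Thus the solutions to (1) and (2) are just the solutions to (1) and conversely. There are infinitely
many solutions, but here two of the unknowns need to be specified in order to obtain the third.
Thus two parameters $\lambda_1, \lambda_2$ need to be introduced. Put $x_2 = \lambda_1$ and $x_3 = \lambda_2$, then
\[ x_1 = 20 + 3\lambda_1 - \lambda_2, \]
and we write the solutions as
\[ \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 20 + 3\lambda_1 - \lambda_2 \\ \lambda_1 \\ \lambda_2 \end{pmatrix} = \begin{pmatrix} 20 \\ 0 \\ 0 \end{pmatrix} + \lambda_1 \begin{pmatrix} 3 \\ 1 \\ 0 \end{pmatrix} + \lambda_2 \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}, \quad\text{for } \lambda_1, \lambda_2 \in \mathbb{R}. \]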
Hence the system has an infinite number of solutions. The geometric interpretation of this example
is that equations (1) and (2) represent the same plane and the ‘solution’ is just a parametric vector
form of expression for that plane. ♦
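A parametric solution is easy to check by substitution, and a computer can do this for many parameter values at once. The Python sketch below (function names invented for this illustration) substitutes the parametric solutions of Examples 2 and 4 back into the original equations.

```python
def example2_solution(lam):
    # x = (0, 5, 0) + lam * (3, -4, 1), from Example 2.
    return (3 * lam, 5 - 4 * lam, lam)

def satisfies_example2(x):
    x1, x2, x3 = x
    return x1 + x2 + x3 == 5 and 3 * x1 + 4 * x2 + 7 * x3 == 20

def example4_solution(lam1, lam2):
    # x = (20, 0, 0) + lam1 * (3, 1, 0) + lam2 * (-1, 0, 1), from Example 4.
    return (20 + 3 * lam1 - lam2, lam1, lam2)

def satisfies_example4(x):
    x1, x2, x3 = x
    return x1 - 3 * x2 + x3 == 20 and 2 * x1 - 6 * x2 + 2 * x3 == 40

# Integer parameter values keep the arithmetic exact.
print(all(satisfies_example2(example2_solution(lam)) for lam in range(-5, 6)))
print(all(satisfies_example4(example4_solution(l1, l2))
          for l1 in range(-3, 4) for l2 in range(-3, 4)))
```

Such a check confirms that every listed point is a solution; that there are no other solutions follows from the elimination argument in the examples.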
Now, what about a system of more than two equations in two or three variables? Let us
exclude the cases in which one or more of the equations in the system is of the form $0x_1 + 0x_2 = b$ or
$0x_1 + 0x_2 + 0x_3 = b$. In such cases, the system either has no solutions or is equivalent to a system
with fewer equations.
For a system of three equations in two variables, we can interpret the three equations as three
lines in a plane. Geometrically, we have three cases.
1. The three lines are concurrent, so the system has a unique solution.
2. The three lines do not have a point in common, so the system has no solution. (See Figure 1
for cases in which the three lines are distinct.)
3. The three equations represent the same line. The system has infinitely many solutions.
Figure 1: Three Lines with no Point in Common.
For a system of three equations in three variables, the three equations represent three planes in
three dimensional space. Geometrically, we have four cases.
1. The three planes intersect at one point, so the system has a unique solution.
2. The three planes do not have a point in common, so the system has no solution. (See Figure 2
for cases in which the three planes are distinct.)
3. The three planes intersect in a line, so there are infinitely many solutions.
4. The three equations represent the same plane. Again, the system has infinitely many solu-
tions.
Figure 2: Three Planes with no Point in Common.
4.2 Systems of linear equations and matrix notation
We will now consider the general system of simultaneous linear equations and develop a systematic
method of solution.
Definition 1. A system of $m$ linear equations in $n$ variables is a set of $m$ linear
equations in $n$ variables which must be simultaneously satisfied. Such a system is of
the form
\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ &\;\;\vdots \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned} \qquad (*) \]
Note carefully the position of the subscripts in (∗): $a_{ij}$ is the coefficient of the variable $x_j$ in
the $i$th equation, that is, the equation index is first and the variable index is second.
A solution to a system of equations is a set of values of the variables which simultaneously
satisfy all the equations. We normally write a solution in the form of a column vector in $\mathbb{R}^n$. For
instance, if $x_1 = \alpha_1, \dots, x_n = \alpha_n$ satisfy the equations simultaneously, the vector
$\begin{pmatrix} \alpha_1 \\ \vdots \\ \alpha_n \end{pmatrix}$ is a
solution. A system of equations is said to be consistent if it has at least one solution. Otherwise,
the system is said to be inconsistent.
In particular, the system is called homogeneous when all the $b_i$ are zero.
Definition 2. The system (∗) is said to be homogeneous if $b_1 = 0, \dots, b_m = 0$.
Example 1.
\[ \begin{aligned} 2x_1 + 3x_2 - x_3 + x_4 &= 6 \\ x_1 - 4x_2 + 5x_3 \phantom{{}+ x_4} &= 2 \\ x_1 \phantom{{}- 4x_2} - 6x_3 \phantom{{}+ x_4} &= 5 \\ x_1 + 2x_2 \phantom{{}- x_3} - x_4 &= 7 \end{aligned} \]
is a system of 4 linear equations in 4 unknowns. ♦
The variables $x_1, x_2, \dots, x_n$ in the system (∗) are really just place markers for the coefficients.
Hence it is convenient to adopt a shorthand notation for (∗) which records only the coefficients.
This is called the augmented matrix for the system (∗):
\[ \left(\begin{array}{cccc|c} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{array}\right). \]
Example 2. The system in Example 1 has augmented matrix:
\[ \left(\begin{array}{rrrr|r} 2 & 3 & -1 & 1 & 6 \\ 1 & -4 & 5 & 0 & 2 \\ 1 & 0 & -6 & 0 & 5 \\ 1 & 2 & 0 & -1 & 7 \end{array}\right). \]
Note.
1. A rectangular array of numbers is called a matrix. So the above matrix form of the system
of linear equations is called the corresponding augmented matrix. The vertical line is not part
of the matrix. It is used to separate the right hand side of the system from the coefficients
on the left hand side.
2. We call $\begin{pmatrix} a_{i1} & a_{i2} & \cdots & a_{in} & b_i \end{pmatrix}$ the $i$th row of the augmented matrix, which is denoted by
$R_i$. The entries in $R_i$ correspond to the coefficients and the right hand side of
the $i$th equation.
3. We call $\begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix}$ the $j$th column, which is denoted by $C_j$. The entries in $C_j$ correspond to the
coefficients of $x_j$ when $1 \le j \le n$. The last column $C_{n+1}$ is the right hand side of the system.
We denote the matrix of coefficients by A, the column of the right hand side as the vector b
and the column vector of the variables as x. That is
A =

a11 a12 · · · a1n
a21 a22 · · · a2n
...
...
. . .
...
am1 am2 · · · amn
 , b =

b1
b2
...
bm
 . and x =

x1
x2
...
xn

Using such notation, the augmented matrix can be abbreviated by (A|b)1.
The drawback of using the augmented matrix is that the variables are omitted. We therefore introduce
the matrix equation form: the system of equations is represented by the matrix
equation Ax = b. The matrix A is called the coefficient matrix and x is called the unknown
vector.
When we write Ax, we mean the vector
\[
A\mathbf{x} = \begin{pmatrix}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n\\
\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n
\end{pmatrix}.
\]
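¹ Formally speaking, (A|b) should be
\[
\left(\left.\begin{pmatrix}
a_{11} & \cdots & a_{1n}\\
\vdots & \ddots & \vdots\\
a_{m1} & \cdots & a_{mn}
\end{pmatrix}\;\right|\;
\begin{pmatrix} b_1\\ \vdots\\ b_m \end{pmatrix}\right),
\]
but for simplicity we denote the augmented matrix
\[
\left(\begin{array}{ccc|c}
a_{11} & \cdots & a_{1n} & b_1\\
\vdots & \ddots & \vdots & \vdots\\
a_{m1} & \cdots & a_{mn} & b_m
\end{array}\right)
\]
by (A|b).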
This is our first meeting with “matrix multiplication”, which we shall investigate in more detail in
Chapter 5.
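Finally, we can also write the system of linear equations as a vector equation, also called the vector form:
\[
x_1\begin{pmatrix} a_{11}\\ a_{21}\\ \vdots\\ a_{m1} \end{pmatrix}
+ x_2\begin{pmatrix} a_{12}\\ a_{22}\\ \vdots\\ a_{m2} \end{pmatrix}
+ \cdots
+ x_n\begin{pmatrix} a_{1n}\\ a_{2n}\\ \vdots\\ a_{mn} \end{pmatrix}
= \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{pmatrix}.
\]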
This vector equation can be written more concisely as
x1a1 + x2a2 + · · ·+ xnan = b,
where
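\[
\mathbf{a}_j = \begin{pmatrix} a_{1j}\\ a_{2j}\\ \vdots\\ a_{mj} \end{pmatrix}
\quad\text{for } 1 \le j \le n
\qquad\text{and}\qquad
\mathbf{b} = \begin{pmatrix} b_1\\ b_2\\ \vdots\\ b_m \end{pmatrix}.
\]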
Notice that the vector aj contains the coefficients of the variable xj in the system of linear equations.
The vector b is called the right-hand-side vector.
This vector form of the system of equations shows that solving the system corresponds to
expressing the right-hand-side vector b as a linear combination of the vectors a1,a2, . . . ,an.
Example 3. The system of linear equations
x1 + 2x2 + 3x3 = 1
4x1 + 5x2 + 6x3 = −1
7x1 − 5x2 − 9x3 = 0
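may be written in vector form as
\[
x_1\begin{pmatrix} 1\\ 4\\ 7 \end{pmatrix}
+ x_2\begin{pmatrix} 2\\ 5\\ -5 \end{pmatrix}
+ x_3\begin{pmatrix} 3\\ 6\\ -9 \end{pmatrix}
= \begin{pmatrix} 1\\ -1\\ 0 \end{pmatrix};
\]
or represented by the matrix equation Ax = b with
\[
A = \begin{pmatrix} 1 & 2 & 3\\ 4 & 5 & 6\\ 7 & -5 & -9 \end{pmatrix},
\qquad
\mathbf{x} = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix},
\qquad
\mathbf{b} = \begin{pmatrix} 1\\ -1\\ 0 \end{pmatrix};
\]
or represented by the augmented matrix
\[
(A|\mathbf{b}) = \left(\begin{array}{ccc|c}
1 & 2 & 3 & 1\\
4 & 5 & 6 & -1\\
7 & -5 & -9 & 0
\end{array}\right).
\]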
♦
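The three forms in Example 3 can also be set up side by side in code. The following is a minimal sketch in plain Python; the function names and the test vector x are illustrative assumptions, not part of the notes. It builds (A|b) from A and b and checks that the row view of Ax (the matrix equation form) agrees with the column view (the vector form).

```python
# Example 3 in plain Python lists (all names here are illustrative).
A = [[1, 2, 3],
     [4, 5, 6],
     [7, -5, -9]]
b = [1, -1, 0]

# Augmented matrix (A|b): append b_i to the end of row i.
augmented = [row + [bi] for row, bi in zip(A, b)]

def matrix_times_vector(A, x):
    """Row view of Ax: entry i is a_i1*x_1 + ... + a_in*x_n."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def combination_of_columns(A, x):
    """Column view of Ax: x_1*a_1 + ... + x_n*a_n, where a_j is column j of A."""
    m, n = len(A), len(A[0])
    result = [0] * m
    for j in range(n):
        for i in range(m):
            result[i] += x[j] * A[i][j]
    return result

x = [2, -1, 3]  # an arbitrary test vector, not a solution of the system
row_view = matrix_times_vector(A, x)
column_view = combination_of_columns(A, x)
```

Both views give the same vector for any x, which is why Ax = b and x1a1 + x2a2 + x3a3 = b describe the same system.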
Example 4. Write the system of linear equations corresponding to the augmented matrix
\[
\left(\begin{array}{cccc|c}
1 & 3 & 6 & 7 & -2\\
-2 & 0 & 5 & -4 & 3\\
7 & 0 & 0 & -5 & -10
\end{array}\right).
\]
Solution. Because there are 4 columns before the “|”, the system has 4 variables, which we will
call x1, x2, x3 and x4 (although other names are acceptable). Because (A|b) has 3 rows, the system
has 3 equations. Reading off the coefficients of row 1, row 2, and row 3 in turn, we obtain the
system
\[
\left\{\begin{aligned}
x_1 + 3x_2 + 6x_3 + 7x_4 &= -2\\
-2x_1 + 5x_3 - 4x_4 &= 3\\
7x_1 - 5x_4 &= -10
\end{aligned}\right.
\]
Although it is conventional to omit variables with zero coefficients when writing a system of linear
equations, it is essential in matrix notation that every row contains exactly the same number of
entries and that every column also contains exactly the same number of entries so 0s must not be
omitted from matrices or vectors. ♦
Example 5. Write the system of linear equations corresponding to the matrix equation Ax = b,
where
\[
A = \begin{pmatrix} 3 & -2 & 6 & 7 & -8\\ 5 & 3 & -2 & -7 & 4 \end{pmatrix},
\qquad
\mathbf{b} = \begin{pmatrix} 7\\ -3 \end{pmatrix}.
\]
Solution. The matrix A has 5 columns, so there are 5 variables, which we call x1, x2, x3, x4, x5.
Also, A has 2 rows, so there are 2 equations.
Reading off the coefficients from the two rows, we obtain the system
\[
\left\{\begin{aligned}
3x_1 - 2x_2 + 6x_3 + 7x_4 - 8x_5 &= 7\\
5x_1 + 3x_2 - 2x_3 - 7x_4 + 4x_5 &= -3
\end{aligned}\right.
\]
♦
Example 6. Write down the Ax = b form of the system of linear equations corresponding to the
vector equation
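\[
x_1\begin{pmatrix} 2\\ -3\\ 4 \end{pmatrix}
+ x_2\begin{pmatrix} 5\\ 0\\ -1 \end{pmatrix}
+ x_3\begin{pmatrix} -4\\ 8\\ 0 \end{pmatrix}
+ x_4\begin{pmatrix} 6\\ 7\\ -2 \end{pmatrix}
= \begin{pmatrix} 0\\ -6\\ 2 \end{pmatrix}.
\]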
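Solution. The vector associated with xj is the jth column of the coefficient matrix, so as a matrix
equation:
\[
\begin{pmatrix} 2 & 5 & -4 & 6\\ -3 & 0 & 8 & 7\\ 4 & -1 & 0 & -2 \end{pmatrix}
\begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4 \end{pmatrix}
= \begin{pmatrix} 0\\ -6\\ 2 \end{pmatrix}.
\]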
♦
4.3 Elementary row operations
When you solve a pair of simultaneous linear equations, you add a multiple of one
equation to (or subtract it from) the other equation. Each equation in a system of linear equations is represented
by one row of the corresponding augmented matrix. This means that the operation of adding
a multiple of one equation to another equation can be carried out on the augmented matrix
by adding a multiple of one row to another row. This is an example of what we call a row
operation.
In this section we introduce the three elementary row operations. These operations are
applied to the augmented matrix in order to find the solution to a system of m linear equations
in n variables. Each of these operations gives us a new augmented matrix which is equivalent to
the old augmented matrix in the following sense: the system represented by the new matrix has
exactly the same solution set as the system represented by the old matrix.
We also show how to record row operations to make it easier to check your calculations.
4.3.1 Interchange of equations
If two equations are interchanged, then the set of solutions of the new system is clearly the same as
that of the original system. The corresponding operation for an augmented matrix is to interchange
two complete rows.
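Example 1. Consider the system
\[
\begin{array}{lr@{\;=\;}l}
(1) & 2x_2 + 3x_3 & 5\\
(2) & -x_1 + 3x_2 + x_3 & 6\\
(3) & 2x_1 + 4x_2 + 7x_3 & 8
\end{array}
\qquad\text{or}\qquad
\begin{array}{c} R_1\\ R_2\\ R_3 \end{array}
\left(\begin{array}{ccc|c}
0 & 2 & 3 & 5\\
-1 & 3 & 1 & 6\\
2 & 4 & 7 & 8
\end{array}\right).
\]
Interchanging equations (1) and (2), or rows R1 and R2, yields
\[
\begin{array}{lr@{\;=\;}l}
(1') = (2) & -x_1 + 3x_2 + x_3 & 6\\
(2') = (1) & 2x_2 + 3x_3 & 5\\
(3) & 2x_1 + 4x_2 + 7x_3 & 8
\end{array}
\qquad\text{or}\qquad
\begin{array}{l} \text{New } R_1 = \text{old } R_2\\ \text{New } R_2 = \text{old } R_1\\ R_3 \end{array}
\left(\begin{array}{ccc|c}
-1 & 3 & 1 & 6\\
0 & 2 & 3 & 5\\
2 & 4 & 7 & 8
\end{array}\right).
\]
This should be recorded as
\[
\left(\begin{array}{ccc|c}
0 & 2 & 3 & 5\\
-1 & 3 & 1 & 6\\
2 & 4 & 7 & 8
\end{array}\right)
\xrightarrow{R_2 \leftrightarrow R_1}
\left(\begin{array}{ccc|c}
-1 & 3 & 1 & 6\\
0 & 2 & 3 & 5\\
2 & 4 & 7 & 8
\end{array}\right).
\]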
♦
4.3.2 Adding a multiple of one equation to another
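The operation of adding a multiple of one equation to another does not change the solution set of
the system of equations. For example, consider the system of equations
\[
\left\{\begin{aligned}
a_1x_1 + \cdots + a_nx_n &= b\\
\alpha_1x_1 + \cdots + \alpha_nx_n &= \beta
\end{aligned}\right.
\tag{1}
\]
If we add λ times the first equation of system (1) to the second equation, we obtain the system
\[
\left\{\begin{aligned}
a_1x_1 + \cdots + a_nx_n &= b\\
(\alpha_1 + \lambda a_1)x_1 + \cdots + (\alpha_n + \lambda a_n)x_n &= \beta + \lambda b
\end{aligned}\right.
\tag{2}
\]
Proposition 1. The solution set of the system (1) is the same as the solution set of the system (2).
Proof. Let $\mathbf{x} = \begin{pmatrix} x_1\\ \vdots\\ x_n \end{pmatrix}$ be a solution of system (1). Then we have
\[
\begin{aligned}
a_1x_1 + \cdots + a_nx_n &= b\\
\text{and}\quad (\alpha_1 + \lambda a_1)x_1 + \cdots + (\alpha_n + \lambda a_n)x_n
&= (\alpha_1x_1 + \cdots + \alpha_nx_n) + \lambda(a_1x_1 + \cdots + a_nx_n)\\
&= \beta + \lambda b,
\end{aligned}
\]
and so system (2) is satisfied.
Conversely, let $\mathbf{x} = \begin{pmatrix} x_1\\ \vdots\\ x_n \end{pmatrix}$ be a solution of system (2). Then we have
\[
\begin{aligned}
a_1x_1 + \cdots + a_nx_n &= b\\
\text{and}\quad \alpha_1x_1 + \cdots + \alpha_nx_n
&= (\alpha_1 + \lambda a_1)x_1 + \cdots + (\alpha_n + \lambda a_n)x_n - \lambda(a_1x_1 + \cdots + a_nx_n)\\
&= (\beta + \lambda b) - \lambda b = \beta,
\end{aligned}
\]
and so system (1) is satisfied.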
In the augmented-matrix notation, the addition of a scalar multiple of one equation to another
is equivalent to addition of a scalar multiple of one complete row to another row.
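Example 2. Consider the system
\[
\begin{array}{lr@{\;=\;}l}
(1) & x_1 + 2x_2 + 3x_3 & 5\\
(2) & -x_1 - 4x_2 + x_3 & 6\\
(3) & 2x_1 + 10x_2 + 7x_3 & 8
\end{array}
\qquad\text{or}\qquad
\begin{array}{c} R_1\\ R_2\\ R_3 \end{array}
\left(\begin{array}{ccc|c}
1 & 2 & 3 & 5\\
-1 & -4 & 1 & 6\\
2 & 10 & 7 & 8
\end{array}\right).
\]
On adding equation (1) to equation (2) in the above system, we obtain
\[
\begin{array}{lr@{\;=\;}l}
(1) & x_1 + 2x_2 + 3x_3 & 5\\
(2') & -2x_2 + 4x_3 & 11\\
(3) & 2x_1 + 10x_2 + 7x_3 & 8
\end{array}
\qquad\text{or}\qquad
\begin{array}{c} R_1\\ R_2\\ R_3 \end{array}
\left(\begin{array}{ccc|c}
1 & 2 & 3 & 5\\
0 & -2 & 4 & 11\\
2 & 10 & 7 & 8
\end{array}\right),
\]
where the new R2 = old R2 + R1.
This should be recorded as
\[
\left(\begin{array}{ccc|c}
1 & 2 & 3 & 5\\
-1 & -4 & 1 & 6\\
2 & 10 & 7 & 8
\end{array}\right)
\xrightarrow{R_2 = R_2 + R_1}
\left(\begin{array}{ccc|c}
1 & 2 & 3 & 5\\
0 & -2 & 4 & 11\\
2 & 10 & 7 & 8
\end{array}\right).
\]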
♦
4.3.3 Multiplying an equation by a non-zero number
We can multiply one equation by a non-zero constant without changing the solutions of the system.
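For example, the systems of equations corresponding to the two matrices
\[
\left(\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 8\\
0 & 0 & 5 & 6 & 9\\
0 & 0 & 0 & 7 & 3
\end{array}\right)
\qquad\text{and}\qquad
\left(\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 8\\
0 & 0 & 1 & \frac{6}{5} & \frac{9}{5}\\
0 & 0 & 0 & 1 & \frac{3}{7}
\end{array}\right)
\]
are equivalent.
This should be recorded as
\[
\left(\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 8\\
0 & 0 & 5 & 6 & 9\\
0 & 0 & 0 & 7 & 3
\end{array}\right)
\xrightarrow[R_3 = \frac{1}{7}R_3]{R_2 = \frac{1}{5}R_2}
\left(\begin{array}{cccc|c}
1 & 2 & 3 & 4 & 8\\
0 & 0 & 1 & \frac{6}{5} & \frac{9}{5}\\
0 & 0 & 0 & 1 & \frac{3}{7}
\end{array}\right).
\]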
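Row operations should be done one after the other. Although in the above we have done two
row operations at once, the result is the same as performing them one after the other, because each
operation changes a different row and neither uses the row changed by the other.
However, we should not make the following mistake.
\[
\left(\begin{array}{ccc|c}
1 & 2 & 3 & 5\\
-1 & -4 & 1 & 6\\
2 & 10 & 7 & 8
\end{array}\right)
\xrightarrow[R_1 = R_1 + R_2]{R_2 = R_2 + R_1}
\left(\begin{array}{ccc|c}
0 & -2 & 4 & 11\\
0 & -2 & 4 & 11\\
2 & 10 & 7 & 8
\end{array}\right)
\]
The two systems are not equivalent. If we do want to perform the row operation R2 = R2 + R1
and then R1 = R1 + R2, we should have
\[
\left(\begin{array}{ccc|c}
1 & 2 & 3 & 5\\
-1 & -4 & 1 & 6\\
2 & 10 & 7 & 8
\end{array}\right)
\xrightarrow{R_2 = R_2 + R_1}
\left(\begin{array}{ccc|c}
1 & 2 & 3 & 5\\
0 & -2 & 4 & 11\\
2 & 10 & 7 & 8
\end{array}\right)
\xrightarrow{R_1 = R_1 + R_2}
\left(\begin{array}{ccc|c}
1 & 0 & 7 & 16\\
0 & -2 & 4 & 11\\
2 & 10 & 7 & 8
\end{array}\right).
\]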
4.4 Solving systems of equations
This section describes a process for solving systems of linear equations.
The process consists of two distinct stages.
1. In the first stage, known as Gaussian elimination, we use two types of row operation
(interchange of rows and subtracting a multiple of one row from another row) to produce
an equivalent system in a simpler form which is known as row-echelon form. From the
row-echelon form we can tell many things about solutions of the system. In particular, we
can tell whether the system has no solution, a unique solution or infinitely many solutions.
2. If the system does have solutions, the second stage is to find them. It can be carried out by
either of two methods.
(a) We can use further row operations to obtain an even simpler form which is called
reduced row-echelon form. From this form we can read off the solution(s).
(b) From the row-echelon form, we can read off the value (possibly in terms of parameters)
for at least one of the variables. We substitute this value into one of the other equations
and get the value for another variable, and so on. This process, which is called
back-substitution, will be described fully later.
4.4.1 Row-echelon form and reduced row-echelon form
We begin by defining some special forms of matrix.
Definition 1. In any matrix
1. a leading row is one which is not all zeros,
2. the leading entry in a leading row is the first (i.e. leftmost) non-zero entry,
3. a leading column is a column which contains the leading entry for some row.
For example, in the following matrix the 1st row is a leading row with leading entry 5 (boxed) and the 2nd
row is a non-leading row. The 2nd column is a leading column but the 1st and 3rd columns are
non-leading columns.
\[
\begin{pmatrix} 0 & \boxed{5} & 7\\ 0 & 0 & 0 \end{pmatrix}
\]
Definition 2. A matrix is said to be in row-echelon form if
1. all leading rows are above all non-leading rows (so any all-zero rows are at the
bottom of the matrix), and
2. in every leading row, the leading entry is further to the right than the leading
entry in any row higher up in the matrix.
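The following are examples of augmented matrices which are in row-echelon form. The boxed
entries are leading entries. We shall see later that the position of leading entries in a row-echelon
form gives important information about solubility of the system.
\[
(1)\ \left(\begin{array}{ccc|c}
\boxed{2} & 3 & 4 & 11\\
0 & \boxed{-3} & 2 & 7\\
0 & 0 & \boxed{4} & 8
\end{array}\right),
\qquad
(2)\ \left(\begin{array}{cccc|c}
\boxed{-5} & 0 & 4 & 1 & 0\\
0 & \boxed{2} & 3 & 2 & 1\\
0 & 0 & 0 & 0 & \boxed{2}
\end{array}\right),
\]
\[
(3)\ \left(\begin{array}{ccc|c}
\boxed{6} & 2 & 3 & 4\\
0 & 0 & \boxed{7} & 7
\end{array}\right),
\qquad
(4)\ \left(\begin{array}{ccccc|c}
\boxed{-5} & 0 & 2 & 4 & 1 & 0\\
0 & 0 & \boxed{2} & 3 & 2 & 1\\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right),
\]
\[
(5)\ \left(\begin{array}{ccc|c}
\boxed{2} & 3 & 4 & 11\\
0 & \boxed{-3} & 2 & 7\\
0 & 0 & \boxed{4} & 8\\
0 & 0 & 0 & 0
\end{array}\right),
\qquad
(6)\ \left(\begin{array}{cccc|c}
\boxed{-5} & 0 & 4 & 1 & 0\\
0 & \boxed{2} & 3 & 2 & 0\\
0 & 0 & 0 & 0 & 0
\end{array}\right).
\]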
The following are examples of augmented matrices which are NOT in row-echelon form.
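\[
(7)\ \left(\begin{array}{ccc|c}
0 & \boxed{6} & 3 & 4\\
\boxed{5} & 7 & 0 & 0
\end{array}\right),
\qquad
(8)\ \left(\begin{array}{ccc|c}
\boxed{-5} & 0 & 4 & 0\\
0 & 0 & 0 & 0\\
0 & \boxed{2} & 3 & 1
\end{array}\right),
\]
\[
(9)\ \left(\begin{array}{cccc|c}
\boxed{5} & 3 & 5 & -4 & 6\\
0 & 0 & \boxed{6} & -6 & 7\\
0 & 0 & \boxed{7} & 0 & -8
\end{array}\right).
\]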
Example (8) does not satisfy Condition 1 of Definition 2. Examples (7) and (9) do not satisfy
Condition 2.
Warning. The definition of row-echelon form varies from one book to another and the terminology
in Definition 1 is not universally adopted.
Definition 3. A matrix is said to be in reduced row-echelon form if it is in
row-echelon form and in addition
3. every leading entry is 1, and
4. every leading entry is the only non-zero entry in its column.
The following are examples of augmented matrices which are in reduced row-echelon form.
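\[
(10)\ \left(\begin{array}{ccc|c}
\boxed{1} & 0 & 0 & 1\\
0 & \boxed{1} & 0 & 2\\
0 & 0 & \boxed{1} & 3
\end{array}\right),
\qquad
(11)\ \left(\begin{array}{ccc|c}
0 & \boxed{1} & 2 & 0\\
0 & 0 & 0 & \boxed{1}\\
0 & 0 & 0 & 0
\end{array}\right),
\]
\[
(12)\ \left(\begin{array}{ccccc|c}
\boxed{1} & -1 & 0 & 2 & 0 & 2\\
0 & 0 & \boxed{1} & 1 & 0 & -1\\
0 & 0 & 0 & 0 & \boxed{1} & 6
\end{array}\right).
\]
Later in this section, we shall see that we can easily read off the solutions of a system whose
augmented matrix is in reduced row-echelon form. The solution set of a system whose augmented
matrix is in row-echelon form can be found by a process called back-substitution.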
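Definitions 2 and 3 translate directly into a mechanical check. The sketch below is one possible reading of those definitions in plain Python; the function names are illustrative assumptions, and a matrix is taken to be a list of rows with the right-hand column treated like any other column.

```python
def leading_index(row):
    """Position of the first non-zero entry of a row, or None for an all-zero row."""
    for j, entry in enumerate(row):
        if entry != 0:
            return j
    return None

def is_row_echelon(M):
    """Check Conditions 1 and 2 of Definition 2."""
    previous = -1           # column of the leading entry in the row above
    seen_zero_row = False
    for row in M:
        j = leading_index(row)
        if j is None:
            seen_zero_row = True
            continue
        if seen_zero_row:   # Condition 1 fails: a leading row below a non-leading row
            return False
        if j <= previous:   # Condition 2 fails: leading entry not further to the right
            return False
        previous = j
    return True

def is_reduced_row_echelon(M):
    """Check Definition 3: row-echelon form plus Conditions 3 and 4."""
    if not is_row_echelon(M):
        return False
    for i, row in enumerate(M):
        j = leading_index(row)
        if j is None:
            continue
        if row[j] != 1:                                          # Condition 3
            return False
        if any(M[k][j] != 0 for k in range(len(M)) if k != i):   # Condition 4
            return False
    return True

example_1 = [[2, 3, 4, 11], [0, -3, 2, 7], [0, 0, 4, 8]]
example_8 = [[-5, 0, 4, 0], [0, 0, 0, 0], [0, 2, 3, 1]]
example_9 = [[5, 3, 5, -4, 6], [0, 0, 6, -6, 7], [0, 0, 7, 0, -8]]
example_12 = [[1, -1, 0, 2, 0, 2], [0, 0, 1, 1, 0, -1], [0, 0, 0, 0, 1, 6]]
```

With these definitions, example (1) passes the row-echelon check but not the reduced check, examples (8) and (9) fail the row-echelon check, and example (12) passes both.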
4.4.2 Gaussian elimination
In this subsection we shall see how the operations of interchanging two rows and of adding a multiple
of one row to another row can be used to take a system of equations Ax = b and transform it into
an equivalent system Ux = y such that the augmented matrix (U |y) is in row-echelon form. The
method we shall describe is called the Gaussian-elimination algorithm.
We now describe the steps in the algorithm and illustrate each step by applying it to the
following augmented matrices.
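\[
\text{a)}\ (A|\mathbf{b}) = \left(\begin{array}{ccc|c}
1 & 2 & 1 & 3\\
2 & 3 & 1 & 3\\
1 & 3 & 3 & 2
\end{array}\right),
\qquad
\text{b)}\ (A|\mathbf{b}) = \left(\begin{array}{ccccc|c}
0 & 0 & 0 & 2 & 3 & -1\\
0 & 3 & -3 & 3 & 3 & -6\\
0 & 0 & 0 & 1 & 3 & -2\\
0 & 6 & -6 & 6 & 3 & -9
\end{array}\right).
\]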
Step 1. Select a pivot element.
We shall use the following rule to choose what is called a pivot element: start at the left of
the matrix and move to the right until you reach a column which is not all zeros, then go down this
column and choose the first non-zero entry as the pivot entry. The column containing the pivot
entry is called the pivot column and the row containing the pivot entry is called the pivot row.
Note. When solving linear equations on a computer, a more complicated pivot selection rule is
generally used in order to minimise “rounding errors”.
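In examples (a) and (b), our pivot selection rule selects the boxed entries below as pivot entries
for the first step of Gaussian elimination.
\[
\text{a)}\ \left(\begin{array}{ccc|c}
\boxed{1} & 2 & 1 & 3\\
2 & 3 & 1 & 3\\
1 & 3 & 3 & 2
\end{array}\right),
\qquad
\text{b)}\ \left(\begin{array}{ccccc|c}
0 & 0 & 0 & 2 & 3 & -1\\
0 & \boxed{3} & -3 & 3 & 3 & -6\\
0 & 0 & 0 & 1 & 3 & -2\\
0 & 6 & -6 & 6 & 3 & -9
\end{array}\right).
\]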
In example (a), row 1 is the pivot row and column 1 is the pivot column. In example (b), row 2 is
the pivot row and column 2 is the pivot column.
Step 2. By a row interchange, swap the pivot row and the top row if necessary.
The rule is that the first pivot row of the augmented matrix must finish as the first row of
(U |y). You can achieve this by interchanging row 1 and the pivot row.
In example (a), the pivot row is already row 1, so no row interchange is needed. In example
(b), the pivot row is row 2, so rows 1 and 2 of the augmented matrix must be interchanged. This
is shown as
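\[
\text{b)}\ \left(\begin{array}{ccccc|c}
0 & 0 & 0 & 2 & 3 & -1\\
0 & \boxed{3} & -3 & 3 & 3 & -6\\
0 & 0 & 0 & 1 & 3 & -2\\
0 & 6 & -6 & 6 & 3 & -9
\end{array}\right)
\xrightarrow{R_1 \leftrightarrow R_2}
\left(\begin{array}{ccccc|c}
0 & \boxed{3} & -3 & 3 & 3 & -6\\
0 & 0 & 0 & 2 & 3 & -1\\
0 & 0 & 0 & 1 & 3 & -2\\
0 & 6 & -6 & 6 & 3 & -9
\end{array}\right).
\]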
Step 3. Eliminate (i.e., reduce to 0) all entries in the pivot column below the pivot
element.
We can do this by adding suitable multiples of the pivot row to the lower rows. After this
process the pivot column is in the correct form for the final row-echelon matrix (U |y).
In example (a), we can use the row operations R2 = R2 + (−2)R1 and R3 = R3 + (−1)R1 to
reduce the pivot column to the required form. In recording the row operations, we can simply write
R2 = R2 − 2R1 and R3 = R3 −R1.
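\[
\text{a)}\ \left(\begin{array}{ccc|c}
\boxed{1} & 2 & 1 & 3\\
2 & 3 & 1 & 3\\
1 & 3 & 3 & 2
\end{array}\right)
\xrightarrow[R_3 = R_3 - R_1]{R_2 = R_2 - 2R_1}
\left(\begin{array}{ccc|c}
\boxed{1} & 2 & 1 & 3\\
0 & -1 & -1 & -3\\
0 & 1 & 2 & -1
\end{array}\right).
\]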
In example (b), the row operation R4 = R4 − 2R1 reduces the pivot column to the required
form. This gives
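\[
\text{b)}\ \left(\begin{array}{ccccc|c}
0 & \boxed{3} & -3 & 3 & 3 & -6\\
0 & 0 & 0 & 2 & 3 & -1\\
0 & 0 & 0 & 1 & 3 & -2\\
0 & 6 & -6 & 6 & 3 & -9
\end{array}\right)
\xrightarrow{R_4 = R_4 - 2R_1}
\left(\begin{array}{ccccc|c}
0 & \boxed{3} & -3 & 3 & 3 & -6\\
0 & 0 & 0 & 2 & 3 & -1\\
0 & 0 & 0 & 1 & 3 & -2\\
0 & 0 & 0 & 0 & -3 & 3
\end{array}\right).
\]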
Step 4. Repeat steps 1 to 3 on the submatrix of rows and columns strictly to the right
of and below the pivot element and stop when the augmented matrix is in row-echelon
form.
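Note that the top row in step 2 here should mean the top row of the submatrix. In example
(a), the required operation is
\[
\left(\begin{array}{ccc|c}
\boxed{1} & 2 & 1 & 3\\
0 & \boxed{-1} & -1 & -3\\
0 & 1 & 2 & -1
\end{array}\right)
\xrightarrow{R_3 = R_3 + R_2}
\left(\begin{array}{ccc|c}
\boxed{1} & 2 & 1 & 3\\
0 & \boxed{-1} & -1 & -3\\
0 & 0 & 1 & -4
\end{array}\right).
\]
We have then reduced the matrix to the row-echelon form
\[
\left(\begin{array}{ccc|c}
\boxed{1} & 2 & 1 & 3\\
0 & \boxed{-1} & -1 & -3\\
0 & 0 & \boxed{1} & -4
\end{array}\right).
\]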
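In example (b), the required operations are
\[
\left(\begin{array}{ccccc|c}
0 & \boxed{3} & -3 & 3 & 3 & -6\\
0 & 0 & 0 & 2 & 3 & -1\\
0 & 0 & 0 & 1 & 3 & -2\\
0 & 0 & 0 & 0 & -3 & 3
\end{array}\right)
\xrightarrow{R_3 = R_3 - \frac{1}{2}R_2}
\left(\begin{array}{ccccc|c}
0 & \boxed{3} & -3 & 3 & 3 & -6\\
0 & 0 & 0 & \boxed{2} & 3 & -1\\
0 & 0 & 0 & 0 & \frac{3}{2} & -\frac{3}{2}\\
0 & 0 & 0 & 0 & -3 & 3
\end{array}\right)
\]
\[
\xrightarrow{R_4 = R_4 + 2R_3}
\left(\begin{array}{ccccc|c}
0 & \boxed{3} & -3 & 3 & 3 & -6\\
0 & 0 & 0 & \boxed{2} & 3 & -1\\
0 & 0 & 0 & 0 & \boxed{\frac{3}{2}} & -\frac{3}{2}\\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right) = (U|\mathbf{y}).
\]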
In both examples we have now reached a matrix which is in row-echelon form, so we have
completed the process of Gaussian elimination for these examples.
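Steps 1 to 4 can be sketched in a few lines of code. The version below is a minimal illustration in plain Python, assuming a matrix is given as a list of rows; the function name is hypothetical, and exact fractions are used so that no rounding occurs. It follows the pivot rule stated in Step 1 (first non-zero entry in the leftmost non-zero column of the current submatrix), not the more careful pivot rules used in numerical software.

```python
from fractions import Fraction

def gaussian_elimination(M):
    """Reduce a matrix (list of rows) to a row-echelon form using Steps 1 to 4."""
    M = [[Fraction(entry) for entry in row] for row in M]
    rows, cols = len(M), len(M[0])
    top = 0                                  # top row of the current submatrix
    for col in range(cols):
        # Step 1: first non-zero entry in this column, at or below `top`.
        pivot = next((r for r in range(top, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue                         # nothing to pivot on; move one column right
        # Step 2: bring the pivot row to the top of the submatrix.
        M[top], M[pivot] = M[pivot], M[top]
        # Step 3: eliminate the entries below the pivot.
        for r in range(top + 1, rows):
            factor = M[r][col] / M[top][col]
            M[r] = [a - factor * p for a, p in zip(M[r], M[top])]
        top += 1                             # Step 4: repeat on the submatrix below
        if top == rows:
            break
    return M

example_a = [[1, 2, 1, 3], [2, 3, 1, 3], [1, 3, 3, 2]]
example_b = [[0, 0, 0, 2, 3, -1],
             [0, 3, -3, 3, 3, -6],
             [0, 0, 0, 1, 3, -2],
             [0, 6, -6, 6, 3, -9]]
echelon_a = gaussian_elimination(example_a)
echelon_b = gaussian_elimination(example_b)
```

Applied to examples (a) and (b), this reproduces the row-echelon forms obtained by hand above, including the all-zero last row in example (b).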
Note.
1. If you are doing Gaussian elimination without the aid of a computer, you don’t have to stick
rigidly to the pivot selection rule which we stated above. For example, the value 1 is very
convenient as a pivot entry, so when the leftmost non-zero column has a 1 in it you may find
it best to use this 1 as your pivot entry in preference to some other non-zero entry which is
above the 1.
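For example, if you have the augmented matrix
\[
\left(\begin{array}{cc|c} 13 & 27 & 174\\ 1 & 2 & 13 \end{array}\right)
\]
you should do
\[
\left(\begin{array}{cc|c} 13 & 27 & 174\\ 1 & 2 & 13 \end{array}\right)
\xrightarrow{R_1 \leftrightarrow R_2}
\left(\begin{array}{cc|c} 1 & 2 & 13\\ 13 & 27 & 174 \end{array}\right)
\xrightarrow{R_2 = R_2 - 13R_1}
\left(\begin{array}{cc|c} 1 & 2 & 13\\ 0 & 1 & 5 \end{array}\right)
\xrightarrow{R_1 = R_1 - 2R_2}
\left(\begin{array}{cc|c} 1 & 0 & 3\\ 0 & 1 & 5 \end{array}\right)
\]
rather than applying the rule stated above, which would give
\[
\left(\begin{array}{cc|c} 13 & 27 & 174\\ 1 & 2 & 13 \end{array}\right)
\xrightarrow{R_2 = R_2 - \frac{1}{13}R_1}
\left(\begin{array}{cc|c} 13 & 27 & 174\\ 0 & -\frac{1}{13} & -\frac{5}{13} \end{array}\right)
\]
etc.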
2. The legal row operations are just
Ri ↔ Rj,
Ri = λRi where λ ≠ 0,
Ri = Ri + λRj,
and they are supposed to be done one after the other. Be careful to avoid doing simultaneously
two operations like
R1 = R1 − 2R2 and R2 = R2 − 2R1.
See the warning at the end of Section 4.3.3. If all the entries in Ri are integers and have a common factor λ, we should use
Ri = (1/λ)Ri to produce integers with smaller magnitude.
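The three legal row operations, and the danger of applying two of them simultaneously, can be illustrated with a short sketch. The helper names below are illustrative assumptions; rows are numbered from 1 as in the notes, and each helper returns a new matrix so that operations are necessarily applied one after the other. The matrix is the one used in the warning in Section 4.3.3.

```python
from fractions import Fraction

def swap(M, i, j):
    """R_i <-> R_j (rows are numbered from 1, as in the notes)."""
    new = [row[:] for row in M]
    new[i - 1], new[j - 1] = new[j - 1], new[i - 1]
    return new

def scale(M, i, factor):
    """R_i = factor * R_i, where factor must be non-zero."""
    if factor == 0:
        raise ValueError("the multiplier must be non-zero")
    new = [row[:] for row in M]
    new[i - 1] = [factor * entry for entry in M[i - 1]]
    return new

def add_multiple(M, i, j, factor):
    """R_i = R_i + factor * R_j, with i != j."""
    new = [row[:] for row in M]
    new[i - 1] = [a + factor * c for a, c in zip(M[i - 1], M[j - 1])]
    return new

M = [[1, 2, 3, 5],
     [-1, -4, 1, 6],
     [2, 10, 7, 8]]

# One after the other: the second operation sees the updated R2.
sequential = add_multiple(add_multiple(M, 2, 1, 1), 1, 2, 1)

# The mistake: both new rows are computed from the original matrix.
wrong = [row[:] for row in M]
wrong[1] = [a + c for a, c in zip(M[1], M[0])]
wrong[0] = [a + c for a, c in zip(M[0], M[1])]

# Scaling with an exact fraction, as in R2 = (1/5) R2.
scaled = scale([[0, 0, 5, 6, 9]], 1, Fraction(1, 5))
```

The sequential version keeps three distinct rows, whereas the simultaneous version produces two identical rows and so loses one equation of the original system.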
4.4.3 Transformation to reduced row-echelon form
Given a matrix in row-echelon form, we can transform it into reduced row-echelon form by using
two types of elementary row operations — multiplication of a row by a constant and adding a
multiple of one row to another row.
The procedure is as follows.
Start with the lowest row which is not all zeros. Multiply it by a suitable constant to make its
leading entry 1. Then add multiples of this row to higher rows to get all zeros in the column above
the leading entry of this row. Repeat this procedure with the second lowest non-zero row, and so on.
We shall now see how this procedure applies to examples 1 to 6 of Section 4.4.1 (all of which are
already in row-echelon form) and how the solution (if any) of each system relates to the row-echelon
form.
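Example 1. The transformation to reduced row-echelon form goes like this.
\[
\left(\begin{array}{ccc|c} 2 & 3 & 4 & 11\\ 0 & -3 & 2 & 7\\ 0 & 0 & 4 & 8 \end{array}\right)
\xrightarrow{R_3 = \frac{1}{4}R_3}
\left(\begin{array}{ccc|c} 2 & 3 & 4 & 11\\ 0 & -3 & 2 & 7\\ 0 & 0 & 1 & 2 \end{array}\right)
\xrightarrow[R_2 = R_2 - 2R_3]{R_1 = R_1 - 4R_3}
\left(\begin{array}{ccc|c} 2 & 3 & 0 & 3\\ 0 & -3 & 0 & 3\\ 0 & 0 & 1 & 2 \end{array}\right)
\]
\[
\xrightarrow{R_2 = -\frac{1}{3}R_2}
\left(\begin{array}{ccc|c} 2 & 3 & 0 & 3\\ 0 & 1 & 0 & -1\\ 0 & 0 & 1 & 2 \end{array}\right)
\xrightarrow{R_1 = R_1 - 3R_2}
\left(\begin{array}{ccc|c} 2 & 0 & 0 & 6\\ 0 & 1 & 0 & -1\\ 0 & 0 & 1 & 2 \end{array}\right)
\xrightarrow{R_1 = \frac{1}{2}R_1}
\left(\begin{array}{ccc|c} 1 & 0 & 0 & 3\\ 0 & 1 & 0 & -1\\ 0 & 0 & 1 & 2 \end{array}\right)
\]
The final augmented matrix represents the system
\[
\begin{aligned}
x_1 + 0x_2 + 0x_3 &= 3\\
0x_1 + x_2 + 0x_3 &= -1\\
0x_1 + 0x_2 + x_3 &= 2,
\end{aligned}
\]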
so the solution is x1 = 3, x2 = −1, x3 = 2. ♦
Notice that in the original row-echelon form, every column of the coefficient matrix (the
part of the augmented matrix to the left of the vertical bar) is a leading column and that the
system of equations has a unique solution.
Example 2. In this case we can make every leading entry equal to 1 just by multiplying each row by
a suitable constant: R1 = −(1/5)R1, R2 = (1/2)R2 and R3 = (1/2)R3. This gives
\[
\left(\begin{array}{cccc|c}
1 & 0 & -\frac{4}{5} & -\frac{1}{5} & 0\\
0 & 1 & \frac{3}{2} & 1 & \frac{1}{2}\\
0 & 0 & 0 & 0 & 1
\end{array}\right).
\]
Strictly speaking, one further operation, R2 = R2 − (1/2)R3, is needed to clear the entry above the
leading 1 in the last column and so reach reduced row-echelon form, but this is not needed to draw
the conclusion. Since the third row represents the equation
0x1 + 0x2 + 0x3 + 0x4 = 1,
the system has no solution. ♦
In general, if an augmented matrix has a row of the form
(0 0 0 · · · 0 | α), where α ≠ 0,
then the corresponding system of equations has no solution. Note that if the right-hand-side
column is a leading column, then the system of equations has no solution, i.e. it is inconsistent.
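Example 3. The transformation to reduced row-echelon form goes like this.
\[
\left(\begin{array}{ccc|c} 6 & 2 & 3 & 4\\ 0 & 0 & 7 & 7 \end{array}\right)
\xrightarrow{R_2 = \frac{1}{7}R_2}
\left(\begin{array}{ccc|c} 6 & 2 & 3 & 4\\ 0 & 0 & 1 & 1 \end{array}\right)
\xrightarrow{R_1 = R_1 - 3R_2}
\left(\begin{array}{ccc|c} 6 & 2 & 0 & 1\\ 0 & 0 & 1 & 1 \end{array}\right)
\xrightarrow{R_1 = \frac{1}{6}R_1}
\left(\begin{array}{ccc|c} \boxed{1} & \frac{1}{3} & 0 & \frac{1}{6}\\ 0 & 0 & \boxed{1} & 1 \end{array}\right).
\]
The final augmented matrix represents the system
\[
x_1 + \tfrac{1}{3}x_2 = \tfrac{1}{6}, \qquad x_3 = 1.
\]
The second equation tells us that x3 must equal 1. In the first equation, x2 can take any value, so we
set x2 equal to a parameter, say λ. This means that the system has infinitely many solutions. In fact x will
be a solution if and only if
\[
x_1 = \tfrac{1}{6} - \tfrac{1}{3}\lambda, \qquad x_2 = \lambda, \qquad x_3 = 1.
\]
This can be rewritten in vector form as
\[
\mathbf{x} = \begin{pmatrix} x_1\\ x_2\\ x_3 \end{pmatrix}
= \begin{pmatrix} \frac{1}{6} - \frac{1}{3}\lambda\\ \lambda\\ 1 \end{pmatrix}
= \begin{pmatrix} \frac{1}{6}\\ 0\\ 1 \end{pmatrix}
+ \lambda\begin{pmatrix} -\frac{1}{3}\\ 1\\ 0 \end{pmatrix}
\quad\text{for } \lambda \in \mathbb{R}.
\]
This is a parametric vector form of a line in R3 through the point (1/6, 0, 1) parallel to the vector
$\begin{pmatrix} -\frac{1}{3}\\ 1\\ 0 \end{pmatrix}$. ♦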
A variable xi is a leading variable if the ith column of the row-echelon matrix is a leading
column. It is a non-leading variable if the ith column of the matrix is a non-leading column. Notice
that the leading variables in the echelon form are x1 and x3 while x2 is non-leading, so this example
illustrates a general rule which says that non-leading variables can be chosen arbitrarily by
assigning them parameters, and then the leading variables can be written in terms of the parameters.
Example 4. In this example, the third row corresponds to the equation
0x1 + 0x2 + 0x3 + 0x4 + 0x5 = 0
which is satisfied by every x, and so is superfluous. However, such rows should not be deleted.
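The transformation to reduced echelon form gives
\[
\left(\begin{array}{ccccc|c}
\boxed{1} & 0 & 0 & -\frac{1}{5} & \frac{1}{5} & \frac{1}{5}\\
0 & 0 & \boxed{1} & \frac{3}{2} & 1 & \frac{1}{2}\\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right).
\]
Columns 2, 4 and 5 are non-leading columns, so the variables x2, x4 and x5 are non-leading
variables and in terms of parameters we have x2 = λ1, x4 = λ2 and x5 = λ3. We can then read off
the values of the leading variables as
\[
x_1 = \tfrac{1}{5} + \tfrac{1}{5}\lambda_2 - \tfrac{1}{5}\lambda_3, \qquad
x_3 = \tfrac{1}{2} - \tfrac{3}{2}\lambda_2 - \lambda_3.
\]
This set of solutions can be expressed in vector form as
\[
\mathbf{x} = \begin{pmatrix} x_1\\ x_2\\ x_3\\ x_4\\ x_5 \end{pmatrix}
= \begin{pmatrix} \frac{1}{5}\\ 0\\ \frac{1}{2}\\ 0\\ 0 \end{pmatrix}
+ \lambda_1\begin{pmatrix} 0\\ 1\\ 0\\ 0\\ 0 \end{pmatrix}
+ \lambda_2\begin{pmatrix} \frac{1}{5}\\ 0\\ -\frac{3}{2}\\ 1\\ 0 \end{pmatrix}
+ \lambda_3\begin{pmatrix} -\frac{1}{5}\\ 0\\ -1\\ 0\\ 1 \end{pmatrix}
\quad\text{for } \lambda_1, \lambda_2, \lambda_3 \in \mathbb{R}.
\]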
♦
Examples 3 and 4 illustrate that the parameters in the solution are the values of the
non-leading variables, so the number of parameters in the solution equals the number
of non-leading columns in the row-echelon form of the coefficient matrix.
In general, if a system is consistent and there are non-leading columns in a row-echelon form of
the system, then the system has infinitely many solutions.
Example 5. In this example, row 4 is all zeros and so is superfluous. The remaining rows are the
same as in example 1, so the solution is identical to the solution in example 1. ♦
Example 6. Transformation to reduced row-echelon form gives
\[
\left(\begin{array}{cccc|c}
1 & 0 & -\frac{4}{5} & -\frac{1}{5} & 0\\
0 & 1 & \frac{3}{2} & 1 & 0\\
0 & 0 & 0 & 0 & 0
\end{array}\right)
\]
We let the values of the non-leading variables x3 and x4 be arbitrary parameters, say λ1 and λ2.
We can then read off the values for the leading variables x1 and x2 (in terms of λ1 and λ2) and
express the set of solutions in vector form as
\[
\mathbf{x} = \lambda_1\begin{pmatrix} \frac{4}{5}\\ -\frac{3}{2}\\ 1\\ 0 \end{pmatrix}
+ \lambda_2\begin{pmatrix} \frac{1}{5}\\ -1\\ 0\\ 1 \end{pmatrix}
\quad\text{for } \lambda_1, \lambda_2 \in \mathbb{R}.
\]
This is a parametric vector form of a plane in R4 which contains the origin and is parallel to the
vectors
\[
\begin{pmatrix} \frac{4}{5}\\ -\frac{3}{2}\\ 1\\ 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} \frac{1}{5}\\ -1\\ 0\\ 1 \end{pmatrix}.
\]
♦
We end this subsection by summarising the rule for getting solutions from a reduced echelon
form of the system:
Assign an arbitrary parameter as the value for each non-leading variable. Then read off expressions
for the leading variables in terms of the arbitrary parameters which you have introduced.
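The rule just summarised can be sketched in code. The following is a minimal illustration, not the procedure of the notes verbatim: for brevity it reaches reduced row-echelon form in a single pass (clearing above and below each leading entry as it goes) rather than in the two stages described in this section, and all function names are hypothetical. It returns a particular solution together with one direction vector per non-leading variable.

```python
from fractions import Fraction

def reduced_row_echelon(M):
    """Return a reduced row-echelon form of M and the list of leading columns."""
    M = [[Fraction(e) for e in row] for row in M]
    rows, cols = len(M), len(M[0])
    leading_cols, top = [], 0
    for col in range(cols):
        pivot = next((r for r in range(top, rows) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[top], M[pivot] = M[pivot], M[top]
        M[top] = [e / M[top][col] for e in M[top]]       # make the leading entry 1
        for r in range(rows):
            if r != top and M[r][col] != 0:              # clear the rest of the column
                factor = M[r][col]
                M[r] = [a - factor * p for a, p in zip(M[r], M[top])]
        leading_cols.append(col)
        top += 1
        if top == rows:
            break
    return M, leading_cols

def general_solution(augmented):
    """Return (x_p, [v_1, ..., v_k]) with x = x_p + sum(lambda_i * v_i), or None if inconsistent."""
    R, leading_cols = reduced_row_echelon(augmented)
    n = len(augmented[0]) - 1                            # number of variables
    if n in leading_cols:                                # right-hand column is a leading column
        return None
    x_p = [Fraction(0)] * n
    for i, j in enumerate(leading_cols):
        x_p[j] = R[i][n]
    directions = []
    for f in (j for j in range(n) if j not in leading_cols):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)                               # the parameter itself
        for i, j in enumerate(leading_cols):
            v[j] = -R[i][f]                              # leading variable in terms of it
        directions.append(v)
    return x_p, directions

example_4 = [[-5, 0, 2, 4, 1, 0], [0, 0, 2, 3, 2, 1], [0, 0, 0, 0, 0, 0]]
x_p, directions = general_solution(example_4)
```

For example (4) this gives the particular solution and the three direction vectors found in Example 4, and it returns None for a system such as example (2), whose right-hand column is a leading column.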
4.4.4 Back-substitution
When solving small systems by hand, you may prefer to use this procedure as an alternative to the
transformation of a row-echelon form into reduced row-echelon form. Instead of doing further row
operations on the augmented matrix, we go to the equations represented by the row-echelon form
and proceed as follows.
Assign an arbitrary parameter value to each non-leading variable. Then read off from the last non-
trivial equation an expression for the last leading variable in terms of your arbitrary parameters.
Substitute this expression back into the second last equation to get an expression for the second last
leading variable, and so on.
Example 7. We will redo example 4 of the last subsection, using back-substitution instead of
transformation to reduced echelon form. The row-echelon form is
\[
\left(\begin{array}{ccccc|c}
\boxed{-5} & 0 & 2 & 4 & 1 & 0\\
0 & 0 & \boxed{2} & 3 & 2 & 1\\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right).
\]
Solution. The non-leading variables are x2, x4 and x5 so we let x2 = λ1, x4 = λ2 and x5 = λ3.
Then the second row corresponds to the equation
\[
2x_3 + 3\lambda_2 + 2\lambda_3 = 1
\]
which gives
\[
x_3 = \frac{1}{2} - \frac{3}{2}\lambda_2 - \lambda_3.
\]
Substituting this back into the equation represented by the first row gives
\[
-5x_1 + 2\left(\frac{1}{2} - \frac{3}{2}\lambda_2 - \lambda_3\right) + 4\lambda_2 + \lambda_3 = 0.
\]
We solve this for x1 and get
\[
x_1 = \frac{1}{5} + \frac{1}{5}\lambda_2 - \frac{1}{5}\lambda_3.
\]
Note that this is the same solution as we got by transforming to reduced row-echelon form. ♦
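For the special case in which every variable is a leading variable, back-substitution is a short loop. The sketch below handles only that case (a square coefficient part with non-zero leading entries on the diagonal), so it does not cover systems with parameters such as Example 7; the function name is an illustrative assumption. It is applied to the row-echelon form (1) of Section 4.4.1.

```python
from fractions import Fraction

def back_substitute(U, y):
    """Solve Ux = y when U is square and upper triangular with non-zero diagonal entries."""
    n = len(U)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):           # last equation first
        # Contribution of the variables that have already been found.
        known = sum(Fraction(U[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(y[i]) - known) / Fraction(U[i][i])
    return x

U = [[2, 3, 4],
     [0, -3, 2],
     [0, 0, 4]]
y = [11, 7, 8]
solution = back_substitute(U, y)
```

The result agrees with the solution x1 = 3, x2 = −1, x3 = 2 obtained from the reduced row-echelon form in Example 1 of Section 4.4.3.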
4.5 Deducing solubility from row-echelon form
We can determine the number of solutions to a system by examining the position of leading
entries in a row-echelon form for the augmented matrix of the system. The rules for doing this
are summarised in the following proposition.
Proposition 1. If the augmented matrix for a system of linear equations can be transformed into
an equivalent row-echelon form (U |y) then:
1. The system has no solution if and only if the right hand column y is a leading column.
2. If the right hand column y is NOT a leading column then the system has:
a) A unique solution if and only if every variable is a leading variable.
b) Infinitely many solutions if and only if there is at least one non-leading variable.
In this case, each non-leading variable becomes a parameter in the general expression
for all solutions and the number of parameters in the solution equals the number of
non-leading variables.
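Proposition 1 amounts to a simple test on the positions of the leading entries. The sketch below is one way to express it, assuming the input is already a row-echelon form (U|y) given as a list of rows whose last entry is the right hand side; the function name and the returned strings are illustrative choices.

```python
def classify(echelon):
    """Classify a system from a row-echelon form (U|y) of its augmented matrix.

    Returns "no solution", "unique solution", or the pair
    ("infinitely many solutions", number_of_parameters).
    """
    n = len(echelon[0]) - 1                  # number of variables
    leading_cols = set()
    for row in echelon:
        for j, entry in enumerate(row):
            if entry != 0:                   # first non-zero entry of the row
                leading_cols.add(j)
                break
    if n in leading_cols:                    # right hand column is a leading column
        return "no solution"
    non_leading = n - len(leading_cols)      # number of non-leading variables
    if non_leading == 0:
        return "unique solution"
    return ("infinitely many solutions", non_leading)

example_1 = [[2, 3, 4, 11], [0, -3, 2, 7], [0, 0, 4, 8]]
example_2 = [[-5, 0, 4, 1, 0], [0, 2, 3, 2, 1], [0, 0, 0, 0, 2]]
example_4 = [[-5, 0, 2, 4, 1, 0], [0, 0, 2, 3, 2, 1], [0, 0, 0, 0, 0, 0]]
```

On the row-echelon forms of Section 4.4.1 this gives a unique solution for example (1), no solution for example (2), and infinitely many solutions with three parameters for example (4).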
It often happens in practice that you want to consider the system Ax = b for one fixed coefficient
matrix A but many different vectors b. For this reason it is useful to know the answer to the
following question: if we know a row-echelon form U for the coefficient matrix A (not the augmented
matrix (A|b)), what can we say about solubility of the system Ax = b for an arbitrary right hand
side b?
If U has an all-zero row then it will be possible to find at least one b ∈ Rm such that
the row-echelon form (U |y) for (A|b) has a row of the form
\[
\left(\begin{array}{cccc|c} 0 & 0 & \cdots & 0 & \alpha \end{array}\right), \qquad \alpha \neq 0,
\]
and this represents an equation which has no solution. On the other hand, if U has no non-leading
row (see Definition 1 in Section 4.4.1) then no b can give rise to an impossible row of the above type and we can
always get a solution by back-substitution. In this case the number of solutions will be infinite if
and only if there are non-leading columns in U .
The following proposition sums up what we can say about solubility for arbitrary b.
Proposition 2. If A is an m× n matrix which can be transformed by elementary row operations
into a row-echelon form U then the system Ax = b has
a) at least one solution for each b in Rm if and only if U has no non-leading rows,
b) at most one solution for each b in Rm if and only if U has no non-leading columns,
c) exactly one solution for each b in Rm if and only if U has no non-leading rows and no
non-leading columns.
When U has neither non-leading rows nor non-leading columns, Ax = b always has a unique
solution. Otherwise, whether the matrix equation has a solution may depend on the vector b. In the
next section we shall study examples of finding a condition on b for a solution to exist, and we shall
discuss how to find a specific formula for a solution x expressed in terms of an arbitrary right hand
side b.
4.6 Solving Ax = b for indeterminate b
To solve the equation Ax = b, we apply row operations to reduce the corresponding augmented
matrix (A|b).
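Example 1. Solve the system of equations
\[
\left\{\begin{aligned}
2x_1 + 3x_2 &= b_1\\
3x_1 + 4x_2 &= b_2
\end{aligned}\right.
\]
Solution. We reduce the augmented matrix as follows.
\[
\left(\begin{array}{cc|c} 2 & 3 & b_1\\ 3 & 4 & b_2 \end{array}\right)
\xrightarrow{R_2 = R_2 - \frac{3}{2}R_1}
\left(\begin{array}{cc|c} 2 & 3 & b_1\\ 0 & -\frac{1}{2} & -\frac{3}{2}b_1 + b_2 \end{array}\right).
\]
From the second row of the row-echelon form, we have
\[
-\frac{1}{2}x_2 = -\frac{3}{2}b_1 + b_2 \quad\text{or}\quad x_2 = 3b_1 - 2b_2.
\]
Substituting this into the equation represented by the first row gives
\[
2x_1 + 3(3b_1 - 2b_2) = b_1 \quad\text{or}\quad x_1 = -4b_1 + 3b_2.
\]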
♦
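Note. We could write the system of equations as
\[
\left\{\begin{aligned}
2x_1 + 3x_2 &= 1b_1 + 0b_2\\
3x_1 + 4x_2 &= 0b_1 + 1b_2
\end{aligned}\right.
\]
Provided the interpretation of the columns is clearly understood, the row reduction can be rewritten
as
\[
\left(\begin{array}{cc|cc} 2 & 3 & 1 & 0\\ 3 & 4 & 0 & 1 \end{array}\right)
\xrightarrow{R_2 = R_2 - \frac{3}{2}R_1}
\left(\begin{array}{cc|cc} 2 & 3 & 1 & 0\\ 0 & -\frac{1}{2} & -\frac{3}{2} & 1 \end{array}\right).
\]
Then we can proceed to reduce the matrix to reduced row-echelon form or apply back-substitution
as before.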
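As a worked illustration of this Note, one possible sequence of further row operations (chosen here for convenience; other sequences lead to the same reduced form) is
\[
\left(\begin{array}{cc|cc} 2 & 3 & 1 & 0\\ 0 & -\frac{1}{2} & -\frac{3}{2} & 1 \end{array}\right)
\xrightarrow{R_2 = -2R_2}
\left(\begin{array}{cc|cc} 2 & 3 & 1 & 0\\ 0 & 1 & 3 & -2 \end{array}\right)
\xrightarrow{R_1 = R_1 - 3R_2}
\left(\begin{array}{cc|cc} 2 & 0 & -8 & 6\\ 0 & 1 & 3 & -2 \end{array}\right)
\xrightarrow{R_1 = \frac{1}{2}R_1}
\left(\begin{array}{cc|cc} 1 & 0 & -4 & 3\\ 0 & 1 & 3 & -2 \end{array}\right).
\]
Reading the two right hand columns as the coefficients of b1 and b2 gives x1 = −4b1 + 3b2 and x2 = 3b1 − 2b2, in agreement with Example 1.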
In the above example, the system always has a unique solution for any b. The system of
equations in the next example does not always have a solution. A condition on b has to be satisfied
for a solution to exist.
Example 2. Find a condition on $\mathbf{b} = \begin{pmatrix} b_1\\ b_2 \end{pmatrix}$ so that the system of equations
\[
\left\{\begin{aligned}
2x_1 + 3x_2 &= b_1\\
4x_1 + 6x_2 &= b_2
\end{aligned}\right.
\]
has solutions. Find the solution set if the condition is satisfied.
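Solution. The reduction process yields
\[
\left(\begin{array}{cc|c} 2 & 3 & b_1\\ 4 & 6 & b_2 \end{array}\right)
\xrightarrow{R_2 = R_2 - 2R_1}
\left(\begin{array}{cc|c} 2 & 3 & b_1\\ 0 & 0 & b_2 - 2b_1 \end{array}\right).
\]
The second row represents the equation 0x1 + 0x2 = −2b1 + b2. Thus for solutions to exist, the
condition −2b1 + b2 = 0 has to be satisfied. If solutions exist, we assign a parameter λ to the
non-leading variable x2. Then the equation corresponding to the first row becomes
\[
2x_1 + 3\lambda = b_1, \quad\text{that is,}\quad x_1 = -\frac{3}{2}\lambda + \frac{1}{2}b_1.
\]
The solution set is then
\[
\left\{ \mathbf{x} \in \mathbb{R}^2 : \mathbf{x} = \begin{pmatrix} \frac{1}{2}b_1\\ 0 \end{pmatrix}
+ \lambda\begin{pmatrix} -\frac{3}{2}\\ 1 \end{pmatrix},\ \lambda \in \mathbb{R} \right\}.
\]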
♦
Sometimes there is more than one condition for solutions to exist.
Example 3. Find conditions on b1, b2, b3, b4 for the following system of equations to be consistent.
Find the solutions when these conditions are satisfied.
x1 + 2x2 = b1
3x1 + 5x2 = b2
5x1 + 7x2 = b3
7x1 + 9x2 = b4
Solution.
\[
\left(\begin{array}{cc|c}
1 & 2 & b_1\\
3 & 5 & b_2\\
5 & 7 & b_3\\
7 & 9 & b_4
\end{array}\right)
\xrightarrow[\substack{R_3 = R_3 - 5R_1\\ R_4 = R_4 - 7R_1}]{R_2 = R_2 - 3R_1}
\left(\begin{array}{cc|c}
1 & 2 & b_1\\
0 & -1 & b_2 - 3b_1\\
0 & -3 & b_3 - 5b_1\\
0 & -5 & b_4 - 7b_1
\end{array}\right)
\xrightarrow[R_4 = R_4 - 5R_2]{R_3 = R_3 - 3R_2}
\left(\begin{array}{cc|c}
1 & 2 & b_1\\
0 & -1 & -3b_1 + b_2\\
0 & 0 & 4b_1 - 3b_2 + b_3\\
0 & 0 & 8b_1 - 5b_2 + b_4
\end{array}\right)
\]
From rows 3 and 4, the original system of equations has a solution if and only if b3 = −4b1 + 3b2
and b4 = −8b1 + 5b2. If these conditions are satisfied, we get x2 = 3b1 − b2 from row 2. Then from
row 1, we have x1 = b1 − 2(3b1 − b2) = −5b1 + 2b2. ♦
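The conditions and formulas found in Example 3 are easy to check numerically. The sketch below is only a sanity check with one hand-picked right hand side; the function names and the test values are illustrative assumptions.

```python
def solve_example_3(b1, b2, b3, b4):
    """Return (x1, x2) for Example 3, or None if the consistency conditions fail."""
    if b3 != -4 * b1 + 3 * b2 or b4 != -8 * b1 + 5 * b2:
        return None
    return (-5 * b1 + 2 * b2, 3 * b1 - b2)

def residuals(x1, x2, b):
    """Left hand side minus right hand side for each of the four equations."""
    coefficients = [(1, 2), (3, 5), (5, 7), (7, 9)]
    return [p * x1 + q * x2 - bi for (p, q), bi in zip(coefficients, b)]

b = (1, 2, 2, 2)                             # chosen to satisfy both conditions
solution = solve_example_3(*b)
check = residuals(*solution, b)              # every entry should be zero
inconsistent = solve_example_3(1, 2, 3, 2)   # b3 violates the first condition
```

Here b = (1, 2, 2, 2) satisfies both conditions and gives x1 = −1, x2 = 1, which satisfies all four equations, while changing b3 to 3 makes the system inconsistent.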
4.7 General properties of the solution of Ax = b
We have seen that every system of linear equations can be written as Ax = b and every such
system has either no solution, a unique solution or an infinite number of solutions. For all systems
of linear equations that we have solved in this chapter which have infinitely many solutions, the
solutions can be written in parametric vector form as
x = xp + λ1v1 + λ2v2 + · · · + λkvk,
where v1, · · · ,vk are non-zero vectors and where the parameters λ1, . . . , λk are scalars. It is not
difficult to see that this is generally true for all systems of linear equations with infinitely many
solutions.
Furthermore, we shall prove that the vector xp is a solution of the system Ax = b, and the
vectors vi which go with the parameters in the solution are solutions of the homogeneous equation
Ax = 0. This holds regardless of how we reduce the matrix to an echelon form.
We begin with some important propositions about solutions of Ax = 0.
Proposition 1. Ax = 0 has x = 0 as a solution.
Proof. The equation Ax = 0 represents the system of equations
\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= 0\\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= 0\\
&\;\;\vdots\\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= 0,
\end{aligned}
\]
so x1 = x2 = · · · = xn = 0 satisfies all the equations.
Proposition 2. If v and w are solutions of Ax = 0 then so are v+w and λv for any scalar λ.
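Proof. Since $\mathbf{v} = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}$ and $\mathbf{w} = \begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix}$ are solutions, we have
\[
\begin{aligned}
a_{11}v_1 + a_{12}v_2 + \cdots + a_{1n}v_n &= 0 \\
a_{21}v_1 + a_{22}v_2 + \cdots + a_{2n}v_n &= 0 \\
&\;\;\vdots \\
a_{m1}v_1 + a_{m2}v_2 + \cdots + a_{mn}v_n &= 0,
\end{aligned}
\tag{1}
\]
\[
\begin{aligned}
a_{11}w_1 + a_{12}w_2 + \cdots + a_{1n}w_n &= 0 \\
a_{21}w_1 + a_{22}w_2 + \cdots + a_{2n}w_n &= 0 \\
&\;\;\vdots \\
a_{m1}w_1 + a_{m2}w_2 + \cdots + a_{mn}w_n &= 0.
\end{aligned}
\tag{2}
\]
If we add the ith equations of (1) and (2) for 1 ≤ i ≤ m, we get
\[
\begin{aligned}
a_{11}(v_1 + w_1) + a_{12}(v_2 + w_2) + \cdots + a_{1n}(v_n + w_n) &= 0 \\
a_{21}(v_1 + w_1) + a_{22}(v_2 + w_2) + \cdots + a_{2n}(v_n + w_n) &= 0 \\
&\;\;\vdots \\
a_{m1}(v_1 + w_1) + a_{m2}(v_2 + w_2) + \cdots + a_{mn}(v_n + w_n) &= 0.
\end{aligned}
\]
Hence v + w is a solution of Ax = 0.
If we multiply both sides of every equation of (1) by λ, then
\[
\begin{aligned}
a_{11}(\lambda v_1) + a_{12}(\lambda v_2) + \cdots + a_{1n}(\lambda v_n) &= 0 \\
a_{21}(\lambda v_1) + a_{22}(\lambda v_2) + \cdots + a_{2n}(\lambda v_n) &= 0 \\
&\;\;\vdots \\
a_{m1}(\lambda v_1) + a_{m2}(\lambda v_2) + \cdots + a_{mn}(\lambda v_n) &= 0.
\end{aligned}
\]
Hence λv is also a solution.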
A further important proposition about homogeneous systems can be obtained by induction from
the last proposition.
Proposition 3. Let v1, . . . , vk be solutions of Ax = 0. Then λ1v1 + · · · + λkvk is
also a solution of Ax = 0 for all λ1, . . . , λk ∈ R.
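Proposition 3 can be illustrated on a small homogeneous system. The sketch below uses a made-up system (x1 + 2x2 − x3 = 0 and 2x1 + 4x2 − 2x3 = 0, not taken from the Notes) with two hand-picked solutions, and checks that an arbitrary linear combination of them is again a solution. The helper names `apply` and `combine` are hypothetical.

```python
from fractions import Fraction

def apply(A, x):
    """Evaluate the left-hand side a_i1*x_1 + ... + a_in*x_n of every equation."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def combine(coeffs, vectors):
    """Form the linear combination coeffs[0]*vectors[0] + coeffs[1]*vectors[1] + ..."""
    n = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n)]

# Illustrative homogeneous system (an assumption for this sketch).
A = [[1, 2, -1],
     [2, 4, -2]]
v1 = [-2, 1, 0]   # a solution of Ax = 0
v2 = [1, 0, 1]    # another solution of Ax = 0

x = combine([Fraction(3), Fraction(-4)], [v1, v2])
print(x, apply(A, x))
```

Any other choice of the two coefficients gives zero left-hand sides as well, which is what the proposition asserts for k = 2.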
We now turn to the properties of the solutions of Ax = b.
Proposition 4. If v and w are solutions of Ax = b then v −w is a solution of Ax = 0.
Proof. Since $\mathbf{v} = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}$ and $\mathbf{w} = \begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix}$ are solutions of Ax = b, we have
\[
\begin{aligned}
a_{11}v_1 + a_{12}v_2 + \cdots + a_{1n}v_n &= b_1 \\
a_{21}v_1 + a_{22}v_2 + \cdots + a_{2n}v_n &= b_2 \\
&\;\;\vdots \\
a_{m1}v_1 + a_{m2}v_2 + \cdots + a_{mn}v_n &= b_m,
\end{aligned}
\tag{1}
\]
\[
\begin{aligned}
a_{11}w_1 + a_{12}w_2 + \cdots + a_{1n}w_n &= b_1 \\
a_{21}w_1 + a_{22}w_2 + \cdots + a_{2n}w_n &= b_2 \\
&\;\;\vdots \\
a_{m1}w_1 + a_{m2}w_2 + \cdots + a_{mn}w_n &= b_m.
\end{aligned}
\tag{2}
\]
If we subtract the ith equation of (2) from the ith equation of (1) for 1 ≤ i ≤ m, we get
\[
\begin{aligned}
a_{11}(v_1 - w_1) + a_{12}(v_2 - w_2) + \cdots + a_{1n}(v_n - w_n) &= 0 \\
a_{21}(v_1 - w_1) + a_{22}(v_2 - w_2) + \cdots + a_{2n}(v_n - w_n) &= 0 \\
&\;\;\vdots \\
a_{m1}(v_1 - w_1) + a_{m2}(v_2 - w_2) + \cdots + a_{mn}(v_n - w_n) &= 0.
\end{aligned}
\]
Hence v − w is a solution of Ax = 0.
We can now prove the following result.
Proposition 5. Let xp be a solution of Ax = b.
1. If x = 0 is the only solution of the homogeneous equation Ax = 0 then xp is the unique
solution of Ax = b.
2. If the homogeneous equation has non-zero solutions v1, . . . ,vk then
x = xp + λ1v1 + · · ·+ λkvk,
is also a solution of Ax = b for all λ1, . . . , λk ∈ R.
Proof. For part 1, let v be another solution of Ax = b. By Proposition 4, v − xp is a solution of
Ax = 0. Since 0 is the only solution of Ax = 0, we have v − xp = 0, that is, v = xp. Hence xp
is the unique solution of Ax = b.
For part 2, let v = λ1v1 + · · · + λkvk. By Proposition 3, v is also a solution of Ax = 0.
By an argument similar to the proof of Proposition 2, we can show that xp + v is a solution of
Ax = b.
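The last step of the proof can be written out for the ith equation. In the sketch below the components of xp are written p1, . . . , pn and those of v are written v1, . . . , vn (scalars here, not the vectors of the proposition); this notation is an assumption made for the display and is not used elsewhere in the Notes.

```latex
\begin{align*}
a_{i1}(p_1 + v_1) + \cdots + a_{in}(p_n + v_n)
  &= (a_{i1}p_1 + \cdots + a_{in}p_n) + (a_{i1}v_1 + \cdots + a_{in}v_n) \\
  &= b_i + 0 \\
  &= b_i , \qquad 1 \le i \le m .
\end{align*}
```

The first bracket equals b_i because xp solves Ax = b, and the second equals 0 because v solves Ax = 0.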
Note. The form of solutions in part 2 of Proposition 5 raises the question of what is the minimum
number of vectors v1, · · · ,vk required to yield all the solutions of Ax = b. We will return to this
question in Chapter 6 when we discuss the ideas of “spanning sets” and “linear independence”. For
the present, it is sufficient to note that the solution method we have used in this chapter does find
all solutions of Ax = b.
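Example 1. In Section 4.4.2 we found a row-echelon form for the augmented matrix of Ax = b,
where
\[
A = \begin{pmatrix}
0 & 0 & 0 & 2 & 3 \\
0 & 3 & -3 & 3 & 3 \\
0 & 0 & 0 & 1 & 3 \\
0 & 6 & -6 & 6 & 3
\end{pmatrix}, \quad
\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix}
\quad\text{and}\quad
\mathbf{b} = \begin{pmatrix} -1 \\ -6 \\ -2 \\ -9 \end{pmatrix},
\]
as
\[
\left(\begin{array}{ccccc|c}
0 & 3 & -3 & 3 & 3 & -6 \\
0 & 0 & 0 & 2 & 3 & -1 \\
0 & 0 & 0 & 0 & \tfrac{3}{2} & -\tfrac{3}{2} \\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right).
\]
By back-substitution, x5 = −1, x4 = 1. By setting x1 = λ1 and x3 = λ2, we also have x2 = λ2 − 2.
Hence the solutions are
\[
\mathbf{x} = \begin{pmatrix} \lambda_1 \\ \lambda_2 - 2 \\ \lambda_2 \\ 1 \\ -1 \end{pmatrix}
= \begin{pmatrix} 0 \\ -2 \\ 0 \\ 1 \\ -1 \end{pmatrix}
+ \lambda_1 \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}
+ \lambda_2 \begin{pmatrix} 0 \\ 1 \\ 1 \\ 0 \\ 0 \end{pmatrix}
\quad\text{for } \lambda_1, \lambda_2 \in \mathbb{R}.
\]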
The first vector is a particular solution (corresponding to λ1 = λ2 = 0) of Ax = b and the vectors
which go with the parameters are solutions of the homogeneous equation Ax = 0.
As an exercise, can you write down the two systems of 4 equations in 5 variables corresponding
to Ax = b and Ax = 0 and verify that the first vector is a solution to the non-homogeneous system
and the two other vectors are solutions to the homogeneous system? ♦
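After attempting the exercise by hand, the answer can be compared with a short computation. This is only a sketch: the names `lhs`, `xp`, `v1` and `v2` are chosen here for illustration, and the entries of A and b are copied from the example above.

```python
def lhs(A, x):
    """Left-hand sides of the equations of the system with coefficient matrix A."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A = [[0, 0, 0, 2, 3],
     [0, 3, -3, 3, 3],
     [0, 0, 0, 1, 3],
     [0, 6, -6, 6, 3]]
b = [-1, -6, -2, -9]

xp = [0, -2, 0, 1, -1]   # particular solution (lambda1 = lambda2 = 0)
v1 = [1, 0, 0, 0, 0]     # vector multiplying lambda1
v2 = [0, 1, 1, 0, 0]     # vector multiplying lambda2

print(lhs(A, xp) == b)                 # xp solves Ax = b
print(lhs(A, v1), lhs(A, v2))          # both should be all zeros

# A general member of the solution set, here with lambda1 = 4 and lambda2 = -3.
x = [p + 4 * a - 3 * c for p, a, c in zip(xp, v1, v2)]
print(lhs(A, x) == b)
```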
4.8 Applications
Linear equations appear in many applications. In this section we first present some geometric
applications and then some applications in other fields.
4.8.1 Geometry
In Section 4.1, we have shown some simple geometric applications of linear equations to lines in R2
and planes in R3. In this section we show how some common problems involving lines and planes
in Rn can be solved using linear equations.
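Example 1. Does $\begin{pmatrix} 3 \\ 0 \\ 5 \\ 6 \end{pmatrix}$ belong to $\operatorname{span}\left\{ \begin{pmatrix} 1 \\ -2 \\ 3 \\ 2 \end{pmatrix}, \begin{pmatrix} 0 \\ 4 \\ 1 \\ 2 \end{pmatrix} \right\}$?
Solution. The vector $\begin{pmatrix} 3 \\ 0 \\ 5 \\ 6 \end{pmatrix} \in \operatorname{span}\left\{ \begin{pmatrix} 1 \\ -2 \\ 3 \\ 2 \end{pmatrix}, \begin{pmatrix} 0 \\ 4 \\ 1 \\ 2 \end{pmatrix} \right\}$ if and only if
\[
\begin{pmatrix} 3 \\ 0 \\ 5 \\ 6 \end{pmatrix}
= \lambda_1 \begin{pmatrix} 1 \\ -2 \\ 3 \\ 2 \end{pmatrix}
+ \lambda_2 \begin{pmatrix} 0 \\ 4 \\ 1 \\ 2 \end{pmatrix}
\quad\text{for some } \lambda_1, \lambda_2 \in \mathbb{R}.
\]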
On equating corresponding coordinates on both sides of this vector equation, we have
λ1 = 3
−2λ1 + 4λ2 = 0
3λ1 + λ2 = 5
2λ1 + 2λ2 = 6
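The augmented matrix is
\[
(A|\mathbf{b}) = \left(\begin{array}{cc|c}
1 & 0 & 3 \\
-2 & 4 & 0 \\
3 & 1 & 5 \\
2 & 2 & 6
\end{array}\right).
\]
This is equivalent to
\[
(U|\mathbf{y}) = \left(\begin{array}{cc|c}
1 & 0 & 3 \\
0 & 4 & 6 \\
0 & 0 & -\tfrac{11}{2} \\
0 & 0 & 0
\end{array}\right).
\]
The third row corresponds to the equation 0 = −11/2, so the system has no solution. Hence $\begin{pmatrix} 3 \\ 0 \\ 5 \\ 6 \end{pmatrix}$ is not in $\operatorname{span}\left\{ \begin{pmatrix} 1 \\ -2 \\ 3 \\ 2 \end{pmatrix}, \begin{pmatrix} 0 \\ 4 \\ 1 \\ 2 \end{pmatrix} \right\}$.
♦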
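Example 2. Is $\mathbf{v} = \begin{pmatrix} 2 \\ 4 \\ 6 \end{pmatrix}$ parallel to the plane
\[
\mathbf{x} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}
+ \lambda_1 \begin{pmatrix} -1 \\ 3 \\ 2 \end{pmatrix}
+ \lambda_2 \begin{pmatrix} 3 \\ 1 \\ 4 \end{pmatrix}?
\]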
Solution. The vector $\mathbf{v} = \begin{pmatrix} 2 \\ 4 \\ 6 \end{pmatrix}$ is parallel to the plane if and only if it is a linear combination
of the two vectors $\begin{pmatrix} -1 \\ 3 \\ 2 \end{pmatrix}$ and $\begin{pmatrix} 3 \\ 1 \\ 4 \end{pmatrix}$. Hence, v is parallel to the plane if and only if there are real
numbers λ1 and λ2 such that
\[
\begin{pmatrix} 2 \\ 4 \\ 6 \end{pmatrix}
= \lambda_1 \begin{pmatrix} -1 \\ 3 \\ 2 \end{pmatrix}
+ \lambda_2 \begin{pmatrix} 3 \\ 1 \\ 4 \end{pmatrix}.
\]
On equating corresponding components of the vectors on both sides, we obtain
−λ1 + 3λ2 = 2
3λ1 + λ2 = 4
2λ1 + 4λ2 = 6
The augmented matrix is
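\[
(A|\mathbf{b}) = \left(\begin{array}{cc|c}
-1 & 3 & 2 \\
3 & 1 & 4 \\
2 & 4 & 6
\end{array}\right),
\]
which becomes, by Gaussian elimination,
\[
(U|\mathbf{y}) = \left(\begin{array}{cc|c}
-1 & 3 & 2 \\
0 & 10 & 10 \\
0 & 0 & 0
\end{array}\right).
\]
Since the right-hand column is non-leading, this system has a solution for λ1 and λ2, and hence
$\begin{pmatrix} 2 \\ 4 \\ 6 \end{pmatrix}$ is parallel to the plane. ♦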
Example 3. Find the intersection of the line through (3, 2, 1, 4) parallel to $\begin{pmatrix} 1 \\ 2 \\ -3 \\ 6 \end{pmatrix}$ and the plane
through (3, 1, −4, 7) parallel to $\begin{pmatrix} 2 \\ 1 \\ 4 \\ 5 \end{pmatrix}$ and $\begin{pmatrix} -1 \\ 3 \\ 0 \\ 6 \end{pmatrix}$.
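Solution. The equation of the line is
\[
\mathbf{x} = \begin{pmatrix} 3 \\ 2 \\ 1 \\ 4 \end{pmatrix}
+ \lambda \begin{pmatrix} 1 \\ 2 \\ -3 \\ 6 \end{pmatrix}
\quad\text{for } \lambda \in \mathbb{R},
\]
and the equation of the plane is
\[
\mathbf{x} = \begin{pmatrix} 3 \\ 1 \\ -4 \\ 7 \end{pmatrix}
+ \mu_1 \begin{pmatrix} 2 \\ 1 \\ 4 \\ 5 \end{pmatrix}
+ \mu_2 \begin{pmatrix} -1 \\ 3 \\ 0 \\ 6 \end{pmatrix}
\quad\text{for } \mu_1, \mu_2 \in \mathbb{R}.
\]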
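An intersection occurs for values of λ, µ1, µ2 for which
\[
\mathbf{x} = \begin{pmatrix} 3 \\ 2 \\ 1 \\ 4 \end{pmatrix}
+ \lambda \begin{pmatrix} 1 \\ 2 \\ -3 \\ 6 \end{pmatrix}
= \begin{pmatrix} 3 \\ 1 \\ -4 \\ 7 \end{pmatrix}
+ \mu_1 \begin{pmatrix} 2 \\ 1 \\ 4 \\ 5 \end{pmatrix}
+ \mu_2 \begin{pmatrix} -1 \\ 3 \\ 0 \\ 6 \end{pmatrix}.
\]
On rearranging, we have
\[
\lambda \begin{pmatrix} 1 \\ 2 \\ -3 \\ 6 \end{pmatrix}
- \mu_1 \begin{pmatrix} 2 \\ 1 \\ 4 \\ 5 \end{pmatrix}
- \mu_2 \begin{pmatrix} -1 \\ 3 \\ 0 \\ 6 \end{pmatrix}
= \begin{pmatrix} 0 \\ -1 \\ -5 \\ 3 \end{pmatrix}.
\]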
On equating coordinates on both sides, we obtain a set of 4 equations in 3 unknowns
λ − 2µ1 + µ2 = 0
2λ − µ1 − 3µ2 = −1
−3λ − 4µ1 = −5
6λ − 5µ1 − 6µ2 = 3
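After forming an augmented matrix, and using Gaussian elimination, we obtain
\[
(U|\mathbf{y}) = \left(\begin{array}{ccc|c}
1 & -2 & 1 & 0 \\
0 & 3 & -5 & -1 \\
0 & 0 & -\tfrac{41}{3} & -\tfrac{25}{3} \\
0 & 0 & 0 & \tfrac{681}{123}
\end{array}\right).
\]
The last row corresponds to the equation 0 = 681/123, so the system of equations has no
solution and the given line and plane in R4 do not
intersect. ♦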
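The inconsistency can be cross-checked without repeating the elimination. The sketch below is a check under stated assumptions, not the method of the Notes: it takes the values of λ, µ1, µ2 obtained by back-substitution from the first three rows of (U|y), confirms that they satisfy the first three original equations, and shows that the fourth equation then fails. The variable names are illustrative.

```python
from fractions import Fraction

# Back-substitution in rows 3, 2, 1 of (U|y):
mu2 = Fraction(-25, 3) / Fraction(-41, 3)   # row 3
mu1 = (-1 + 5 * mu2) / 3                    # row 2: 3*mu1 - 5*mu2 = -1
lam = 2 * mu1 - mu2                         # row 1: lam - 2*mu1 + mu2 = 0

# Left-hand sides of the four original equations.
lhs = [lam - 2 * mu1 + mu2,
       2 * lam - mu1 - 3 * mu2,
       -3 * lam - 4 * mu1,
       6 * lam - 5 * mu1 - 6 * mu2]
rhs = [0, -1, -5, 3]

print(lam, mu1, mu2)
print([r - l for l, r in zip(lhs, rhs)])   # gaps between right- and left-hand sides
```

The gap in the fourth equation equals the non-zero entry 681/123 in the last row of (U|y).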
4.8.2 Chemical engineering
A greatly simplified example which illustrates an application of systems of linear equations to oil
refining is as follows.
Example 4. An oil company has three refineries at Sydney, Melbourne and Brisbane. Each refinery
makes four products: super petrol, unleaded petrol, diesel fuel and aviation fuel. The amount of
each product that each refinery can make per hour is given in the following table.
                              Sydney    Melbourne    Brisbane
Super (litres/hour)           10000     20000        5000
Unleaded (litres/hour)        2000      5000         1000
Diesel (litres/hour)          500       800          200
Aviation fuel (litres/hour)   100       200          100
The oil company has to decide how many hours per day to run each refinery in order to produce
a required amount of product each day.
Do the following:
(a) Set up a linear equation model for the company.
(b) On Monday, the oil company requires exactly 610,000 litres of super, 137,000 litres of unleaded,
26,400 litres of diesel and 7,200 litres of aviation fuel. Find the number of hours each
refinery should be run or show that the desired production levels cannot be met.
(c) On Tuesday, the amount of super, unleaded and diesel required is the same, but the amount
of aviation fuel required is increased to 7,800 litres. Again find the number of hours each
refinery should be run or show that the desired production levels cannot be met.
Solution. Let
x1 = number of hours for which Sydney refinery runs,
x2 = number of hours for which Melbourne refinery runs,
x3 = number of hours for which Brisbane refinery runs,
and let b1, b2, b3 and b4 be the amounts of super, unleaded, diesel and aviation fuel respectively
required each day. Then the equations are:
Super 10000x1 + 20000x2 + 5000x3 = b1
Unleaded 2000x1 + 5000x2 + 1000x3 = b2
Diesel 500x1 + 800x2 + 200x3 = b3
Aviation 100x1 + 200x2 + 100x3 = b4
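The coefficient matrix, unknowns vector and right-hand side are therefore
\[
A = \begin{pmatrix}
10000 & 20000 & 5000 \\
2000 & 5000 & 1000 \\
500 & 800 & 200 \\
100 & 200 & 100
\end{pmatrix}, \quad
\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}, \quad
\mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix}.
\]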
On Monday
\[
(A|\mathbf{b}) = \left(\begin{array}{ccc|c}
10000 & 20000 & 5000 & 610000 \\
2000 & 5000 & 1000 & 137000 \\
500 & 800 & 200 & 26400 \\
100 & 200 & 100 & 7200
\end{array}\right)
\]
is equivalent to
\[
(U|\mathbf{y}) = \left(\begin{array}{ccc|c}
10000 & 20000 & 5000 & 610000 \\
0 & 1000 & 0 & 15000 \\
0 & 0 & -50 & -1100 \\
0 & 0 & 0 & 0
\end{array}\right)
\]
which has the solution x3 = 22, x2 = 15, x1 = 20.
On Tuesday
\[
(A|\mathbf{b}) = \left(\begin{array}{ccc|c}
10000 & 20000 & 5000 & 610000 \\
2000 & 5000 & 1000 & 137000 \\
500 & 800 & 200 & 26400 \\
100 & 200 & 100 & 7800
\end{array}\right)
\]
is equivalent to
\[
(U|\mathbf{y}) = \left(\begin{array}{ccc|c}
10000 & 20000 & 5000 & 610000 \\
0 & 1000 & 0 & 15000 \\
0 & 0 & -50 & -1100 \\
0 & 0 & 0 & 600
\end{array}\right)
\]
which has no solution, so production levels cannot be met. ♦
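The two answers can be checked by substituting the running hours back into the model. In the sketch below the dictionary `RATES` and the function `production` are illustrative names; the hourly rates come from the table in Example 4. Since the super, unleaded and diesel equations already force x1 = 20, x2 = 15, x3 = 22, the aviation output is fixed at 7,200 litres, which is why the Tuesday requirement of 7,800 litres cannot be met.

```python
# Litres per hour produced at (Sydney, Melbourne, Brisbane), from the table.
RATES = {
    "super":    (10000, 20000, 5000),
    "unleaded": (2000, 5000, 1000),
    "diesel":   (500, 800, 200),
    "aviation": (100, 200, 100),
}

def production(hours):
    """Daily output of each product when the refineries run for the given hours."""
    return {product: sum(rate * h for rate, h in zip(rates, hours))
            for product, rates in RATES.items()}

monday = {"super": 610000, "unleaded": 137000, "diesel": 26400, "aviation": 7200}
hours = (20, 15, 22)   # x1, x2, x3 found by back-substitution

output = production(hours)
print(output == monday)            # Monday's requirements are met exactly
print(output["aviation"] == 7800)  # Tuesday's aviation requirement is not
```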
The model used here is too simple to reflect accurately what is done in practice. In practice
the problem is generalised in two ways. First, the equations are replaced by inequalities; i.e., the
problem is changed from one of having to produce exactly the required amount of product to one of
having to produce at least the required amount of product. Secondly, the problem is changed to one
of having to minimise the cost of production subject to the constraints of having to produce the
required amount of product. This generalised problem is called a linear programming problem.
Linear programming is very important in economic, financial, manufacturing and other applications,
where problems involving thousands of variables and inequalities are routinely solved and
result in savings of millions of dollars.
4.8.3 Economics
A simplified example of an application of linear equations to economics is as follows.
Example 5. The island of Wotsit-Matta has a simple economy in which the only goods produced
are wheat, iron and pigs. The Wotsit-Mattas have worked out that the following amounts of
wheat, iron and pigs are required on 31 December of each year for each unit of wheat, iron and
pigs produced in the following year:
Requirements        Wheat          Iron           Pigs
                    (per tonne)    (per tonne)    (per hundred)
Wheat (tonnes)      .5333          4.286          2.0000
Iron (tonnes)       .02667         .2857          .0500
Pigs (hundreds)     .0400          .5714          .5000
That is, to produce 1 tonne of wheat in a particular year we need to have available on 31
December of the previous year 0.5333 tonnes of wheat, 0.02667 tonnes of iron and 0.0400 hundred
pigs, and similarly for iron and pigs.
Assuming that all of the wheat, iron and pigs available at 31 December of one year are used to
produce wheat, iron and pigs in the following year, find the amount of wheat, iron and pigs at 31
December, 1988 and then at 31 December, 1989 if
(a) the Wotsit-Mattas have 180 tonnes of wheat, 8.401 tonnes of iron and 2400 pigs available at
31 December, 1987, and
(b) the Wotsit-Mattas have 180 tonnes of wheat, 8 tonnes of iron and 2400 pigs available at 31
December, 1987.
Solution. We first set up a linear equations model for the Wotsit-Matta economy. Note that the
unknowns are the amounts of wheat, iron and pigs produced in a year, and the right-hand sides
are the amounts available from the previous year.
We let
x1 = number of tonnes of wheat produced in a year
x2 = number of tonnes of iron produced in a year
x3 = number of hundreds of pigs produced in a year
and let
b1 = number of tonnes of wheat available from the previous year
b2 = number of tonnes of iron available from the previous year
b3 = number of hundreds of pigs available from the previous year
Then the linear-equation model for the economy is
Wheat used .5333x1 + 4.286x2 + 2x3 = b1
Iron used .02667x1 + .2857x2 + .05x3 = b2
Pigs used .04x1 + .5714x2 + .5x3 = b3
We can now solve for the production in years 1988 and 1989.
CASE 1.(a). On 31 December, 1987, $\mathbf{b} = \begin{pmatrix} 180 \\ 8.401 \\ 24 \end{pmatrix}$. On solving for the production in 1988 we
obtain $\mathbf{x} = \begin{pmatrix} 180 \\ 8.403 \\ 24 \end{pmatrix}$, that is, 180 tonnes wheat, 8.403 tonnes iron and 2400 pigs.
This 1988 production is then used as the right-hand side b in the equations for the 1989
production. On solving $A\mathbf{x} = \begin{pmatrix} 180 \\ 8.403 \\ 24 \end{pmatrix}$ as before, we obtain the solution $\mathbf{x} = \begin{pmatrix} 179.9 \\ 8.425 \\ 23.98 \end{pmatrix}$, and so
the 1989 production is 179.9 tonnes wheat, 8.425 tonnes iron and 2398 pigs.
CASE 2.(b). On 31 December, 1987, $\mathbf{b} = \begin{pmatrix} 180 \\ 8 \\ 24 \end{pmatrix}$. We find that the solution for 1988 production
is 200.1 tonnes wheat, 4.658 tonnes iron and 2667 pigs.
On using this 1988 production as the right-hand side for the 1989 production, we obtain the
1989 production as 432.2 tonnes wheat, −34.37 tonnes iron and 5787 pigs.
♦
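The two-year calculation can be reproduced with a short script. The sketch below solves the 3 × 3 system by Cramer's rule, which is a technique swapped in here for brevity (determinants are not introduced until Chapter 5, and the Notes solve the system by Gaussian elimination). The coefficients are those of the model equations above; the names `det3`, `solve3` and `residual` are illustrative. Because floating-point arithmetic and four-figure coefficients are used, the printed values should be expected to agree with the quoted results only approximately.

```python
A = [[0.5333, 4.286, 2.0],
     [0.02667, 0.2857, 0.05],
     [0.04, 0.5714, 0.5]]

def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(A, b):
    """Solve a 3 x 3 system Ax = b by Cramer's rule (assumes det(A) is non-zero)."""
    d = det3(A)
    solution = []
    for j in range(3):
        # Replace column j of A by the right-hand side b.
        Aj = [row[:j] + [bi] + row[j + 1:] for row, bi in zip(A, b)]
        solution.append(det3(Aj) / d)
    return solution

def residual(A, x, b):
    """Largest absolute difference between the two sides of Ax = b."""
    return max(abs(sum(a * xi for a, xi in zip(row, x)) - bi)
               for row, bi in zip(A, b))

for label, b1987 in (("a", [180, 8.401, 24]), ("b", [180, 8, 24])):
    x1988 = solve3(A, b1987)   # production in 1988
    x1989 = solve3(A, x1988)   # production in 1989
    print(label, [round(v, 3) for v in x1988], [round(v, 3) for v in x1989])
```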
Note. The negative result for iron in case (b) means that the Wotsit-Matta economy is in deep
trouble. Possible actions that the Wotsit-Mattas might take include importing iron from Australia
(if they can find the foreign exchange) or alternatively, having a great party and eating sufficient
wheat and pigs to stop the overheating in these sections of their economy. A complete mathematical
explanation of the marked difference in behaviour between cases (a) and (b) requires a knowledge
of eigenvalues and eigenvectors (which will be studied in MATH1231).
4.9 Matrix reduction and Maple
This section deals with the reduction of partitioned matrices to row echelon and reduced row
echelon forms. It should be noted that Maple can handle symbolic entries, so we can solve systems
that involve general right-hand-side vectors. The first instruction needed is:
with(LinearAlgebra):
which loads the linear algebra package. If you wish to experiment with a matrix with, say, 3 rows
and 4 columns and with small integer entries, just try:
RandomMatrix(3,4);
though if you want to vary your random matrix you should try first:
randomize(301092);
Integers other than 301092 will give rise to other random matrices. To assign the name A to this,
you use:
A:=RandomMatrix(3,4);
The 3× 3 identity matrix may be obtained with the command:
Id:=IdentityMatrix(3);
An easy way to enter the values of a matrix is to command:
C:=< <1,4> | <2,5> | <3,6> >;
Note that we are inputting the entries of the matrix columnwise to give a 2 × 3 matrix called
C. You may recover the second column with the command:
Column(%,2);
If you want to augment two matrices A1, A2 with the same number (r) of rows to obtain what is
denoted in the main text by the partitioned matrix (A1|A2), the command
< A1 | A2 >;
will do the trick. However Maple is blind to the partitioning. The following example, using a given
3× 4 matrix A, is worth trying:
b:=< b1 , b2 , b3 >;
Ab:=< A | b >;
Here Ab is the name we have given to the resulting 3× 5 matrix.
The reduction to echelon form of the matrix A is achieved by:
GaussianElimination(A);
The reduced row-echelon form is obtained with the command:
ReducedRowEchelonForm(%);
either after using GaussianElimination or directly. Here the % refers to the previous expression.
GaussianElimination does not always do what one would like. Try GaussianElimination(A)
with
A:=< <1,2,3> | <1,2,3> | <1,2,3> | >;
There is a facility for finding the general solution of a set of linear equations. For example, the
pair 1x1 + 2x2 + 3x3 + 7x4 = b1, 2x1 + 4x2 + 5x3 + 9x4 = b2 can be handled by the commands:
A:=< <1,2> | <2,4> | <3,5> | <7,9> >;
b:=< b1,b2 >;
Ab:=< A | b >;
GaussianElimination(%);
BackwardSubstitute(%);
The same answer is obtained from:
GaussianElimination(Ab);
BackwardSubstitute(%);
Experimenting with this may help motivate the reduced row-echelon form. You should also
experiment with:
BackwardSubstitute(Ab);
to see what can go wrong if you leave out the Gaussian elimination step.
Rows 1 and 3 of the matrix C may be swapped with the command:
RowOperation(C,[1,3]);
You may replace row 4 of the matrix A by row 4 plus 7/2 times row 3 with
RowOperation(A,[4,3],7/2);
You may multiply row 4 of the matrix A by 7/2 with the command:
RowOperation(A,4,7/2);
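For readers who want to experiment without Maple, the three elementary row operations can be imitated in a few lines of Python. This is only a sketch: the function names `swap_rows`, `add_multiple` and `scale_row` are hypothetical and are not Maple or library commands. Rows are numbered from 1 as in the text, and each function returns a new matrix rather than changing its argument.

```python
from fractions import Fraction

def swap_rows(M, i, j):
    """Return a copy of M with rows i and j (1-based) interchanged."""
    N = [row[:] for row in M]
    N[i - 1], N[j - 1] = N[j - 1], N[i - 1]
    return N

def add_multiple(M, target, source, s):
    """Return a copy of M with row `target` replaced by row `target` + s * row `source`."""
    N = [row[:] for row in M]
    N[target - 1] = [t + s * u for t, u in zip(M[target - 1], M[source - 1])]
    return N

def scale_row(M, i, s):
    """Return a copy of M with row i multiplied by the scalar s."""
    N = [row[:] for row in M]
    N[i - 1] = [s * t for t in M[i - 1]]
    return N

M = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(swap_rows(M, 1, 3))
print(add_multiple(M, 4, 3, Fraction(7, 2)))
print(scale_row(M, 4, Fraction(7, 2)))
```

Using `Fraction` keeps the arithmetic exact, so results can be compared directly with hand calculations.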
You should now practise reducing a ‘random’ matrix A by hand and comparing your answer with
that obtained from GaussianElimination and ReducedRowEchelonForm. Other useful commands
may be found by typing in:
?Pivot
?LinearSolve
Problems for Chapter 4
Problems 4.1 : Introduction to linear equations
1. [R] Find the solution set of each of the following linear equations.
a) 2x1 − 5 = 0 as an equation of one variable, then as an equation in two variables, and
then three variables.
b) x1 + 2x2 = 4 as an equation of two variables, then three variables.
c) 2x1 − 3x2 + x3 = 2 as an equation of three variables.
2. [R][V] Determine algebraically whether the following systems of equations have a unique
solution, no solution, or an infinite number of solutions. Draw graphs to illustrate your
answers.
a) 3x1 + 2x2 = 6
9x1 + 6x2 = 36
b) 3x1 + 2x2 = 6
9x1 + 4x2 = 36
c) x1 − 5x2 = 5
6x1 − 30x2 = 30
3. [H] Find conditions on the coefficients a11, a12, a21, a22, b1, b2 so that the system of
equations
a11x1 + a12x2 = b1
a21x1 + a22x2 = b2
has a) a unique solution, b) no solution, and c) an infinite number of solutions.
For simplicity, assume a11 ≠ 0.
4. [R] Find and geometrically describe the solutions for the following systems of linear
equations.
a) x1 + 2x2 + 3x3 = 5
2x1 + 5x2 + 8x3 = 12
b) 4x1 + 5x2 − 2x3 = 16
8x1 + 10x2 − 4x3 = 20
c) 4x1 + 5x2 − 2x3 = 16
8x1 + 10x2 − 4x3 = 32
5. [R] Show that x1 = 2 − 2λ, x2 = λ, x3 = 3 + 2λ, where λ is any real number, satisfy the
system of equations
x1 + 4x2 − x3 = −1
2x1 + 4x2 = 4
6x2 − 3x3 = −9
Problems 4.2 : Systems of linear equations and matrix notation
6. [R][V] Write each of the following systems of equations in vector form, as a matrix equation
Ax = b, and in the augmented matrix (A|b) form.
a) 3x1 − 3x2 + 4x3 = 6
5x1 + 2x2 − 3x3 = 7
−x1 − x2 + 6x3 = 8
b) x1 + 3x2 + 7x3 + 8x4 = −2
3x1 + 2x2 − 5x3 − x4 = 7
3x2 + 6x3 − 6x4 = 5
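7. [R] Write the system of equations, the matrix equation and the augmented matrix form
corresponding to the vector equation
\[
x_1 \begin{pmatrix} 1 \\ 0 \\ -6 \\ 7 \end{pmatrix}
+ x_2 \begin{pmatrix} -3 \\ 6 \\ -1 \\ 9 \end{pmatrix}
+ x_3 \begin{pmatrix} 0 \\ 6 \\ -4 \\ 11 \end{pmatrix}
= \begin{pmatrix} 10 \\ -2 \\ 0 \\ 5 \end{pmatrix}.
\]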
Problems 4.3 : Elementary row operations
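8. [R] For each of the following matrices, find the appropriate elementary row operations to
describe the transformation from one matrix to the next. Also continue the row reduction
until the matrix is in row echelon form.
a) \( \begin{pmatrix} 1 & 4 & 2 & 3 \\ 2 & 6 & 3 & 0 \\ 4 & -2 & 4 & 4 \end{pmatrix} \to \begin{pmatrix} 1 & 4 & 2 & 3 \\ 0 & -2 & -1 & -6 \\ 0 & -18 & -4 & -8 \end{pmatrix} \),
b) \( \begin{pmatrix} 3 & 4 & 1 & 3 \\ 2 & 8 & 0 & 2 \\ 0 & 8 & 3 & 0 \end{pmatrix} \to \begin{pmatrix} 1 & -4 & 1 & 1 \\ 1 & 4 & 0 & 1 \\ 0 & 8 & 3 & 0 \end{pmatrix} \).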
9. [M] Write down the output when the Maple command RowOperation(A,[2,1],3); is
applied to the matrix
\[
A = \begin{pmatrix} 2 & 4 & 1 & 2 \\ 3 & 2 & 4 & 1 \\ 1 & 3 & 1 & 3 \end{pmatrix}.
\]
Problems 4.4 : Solving systems of equations
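10. [R] For each of the following augmented matrices do the following. Determine whether
the matrix is in row-echelon form as defined in Section 4.4.1. If the matrix is in
row-echelon form, identify the leading elements, leading rows, leading columns, and non-leading
columns.
a) \( \left(\begin{array}{ccc|c} 3 & 2 & 1 & 10 \\ 0 & 4 & 2 & 8 \\ 0 & 0 & -7 & 14 \end{array}\right) \),
b) \( \left(\begin{array}{ccc|c} 3 & 2 & 1 & 10 \end{array}\right) \),
c) \( \left(\begin{array}{ccc|c} 3 & 2 & 1 & 10 \\ 4 & 0 & 2 & 8 \\ 0 & 0 & -7 & 14 \end{array}\right) \),
d) \( \left(\begin{array}{ccc|c} 3 & 2 & 1 & 10 \\ 0 & 4 & 2 & 8 \end{array}\right) \),
e) \( \left(\begin{array}{ccc|c} 0 & 3 & 1 & 6 \\ 0 & 0 & 1 & 5 \end{array}\right) \),
f) \( \left(\begin{array}{ccc|c} 3 & 2 & 1 & 10 \\ 0 & 4 & 2 & 8 \\ 0 & 0 & -7 & 14 \\ 0 & 0 & 0 & 0 \end{array}\right) \),
g) \( \left(\begin{array}{ccc|c} 3 & 2 & 1 & 10 \\ 0 & 4 & 2 & 8 \\ 0 & 0 & -7 & 14 \\ 0 & 0 & 0 & 6 \end{array}\right) \),
h) \( \left(\begin{array}{ccc|c} 3 & 2 & 1 & 10 \\ 0 & 4 & 2 & 8 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 6 \end{array}\right) \).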
11. [R] Find the solutions to the following systems of equations. If possible give a geometric
interpretation of the solution.
a) 3x1 + 2x2 + x3 = 10
4x2 + 2x3 = 8
− 7x3 = 14
b) 3x1 + 2x2 + x3 + x4 = 10
4x2 + 2x3 − 4x4 = 8
− 7x3 + 14x4 = 14
12. [R][V] For each of the following systems of equations, do the following:
i) Write down the corresponding augmented matrix.
ii) Use Gaussian elimination to transform the augmented matrix into row-echelon form.
iii) Solve each system of equations writing your answer in vector form.
a) x1 − 2x2 = 5
3x1 + x2 = 8
b) x1 − 2x2 − 3x3 = 3
2x1 + 4x2 + 10x3 = 14
c) x1 − 2x2 + 3x3 = 11
2x1 − x2 + 3x3 = 10
4x1 + x2 − x3 = 4
d) 2x1 − 2x2 + 4x3 = −3
3x1 − 3x2 + 6x3 = −4
5x2 + 2x3 = 9
e) x1 + 2x2 + 4x3 = 10
−3x1 + 3x2 + 15x3 = 15
−2x1 − x2 + x3 = −5
f) x1 − 4x2 − 5x3 = −6
2x1 − x2 − x3 = 2
3x1 + 9x2 + 12x3 = 30
g) x1 + 2x2 − x3 + x4 = 4
x2 − x3 + x4 = 1
3x1 + 2x2 − 2x4 = 3
5x1 + 3x2 − x4 = 9
h) x1 + 2x2 − x3 + x4 = 4
x2 − x3 + x4 = 1
3x1 + 2x2 − 2x4 = 3
13. [R] For each of the following augmented matrices, find a reduced row-echelon form. Then
write down all solutions of the corresponding system of equations and try to give a
geometric interpretation of the solutions.
a) \( \left(\begin{array}{ccc|c} 2 & 4 & 1 & 4 \\ 0 & 1 & 2 & -2 \\ 0 & 0 & -1 & 2 \end{array}\right) \),
b) \( \left(\begin{array}{cccc|c} 1 & 2 & 3 & 4 & 1 \\ 0 & -1 & 5 & 6 & 2 \\ 0 & 0 & 1 & 7 & 3 \end{array}\right) \).
Problems 4.5 : Deducing solubility from row-echelon form
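14. [R][V] For each of the following augmented matrices, without solving, decide whether
or not the corresponding system of linear equations has a unique solution, no solution or
infinitely many solutions.
a) \( \left(\begin{array}{ccc|c} 3 & 2 & 1 & 10 \\ 0 & 4 & 2 & 8 \\ 0 & 0 & -7 & 14 \end{array}\right) \),
b) \( \left(\begin{array}{ccc|c} 3 & 2 & 1 & 10 \\ 0 & 4 & 2 & 8 \\ 0 & 0 & -7 & 14 \\ 0 & 0 & 0 & 6 \end{array}\right) \),
c) \( \left(\begin{array}{ccc|c} 3 & 2 & 1 & 10 \end{array}\right) \),
d) \( \left(\begin{array}{ccc|c} 3 & 2 & 1 & 10 \\ 0 & 4 & 2 & 8 \end{array}\right) \),
e) \( \left(\begin{array}{ccc|c} 3 & 2 & 1 & 10 \\ 0 & 4 & 2 & 8 \\ 0 & 0 & -7 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) \).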
15. [H][V] Determine which values of k, if any, will give a) a unique solution b) no
solution c) infinitely many solutions to the system of equations
x + y + kz = 2
3x + 4y + 2z = k
2x + 3y − z = 1.
16. [H] For which values of λ do the equations
x + 2y + λz = 1
−x + λy − z = 0
λx − 4y + λz = −1
have a) no solutions, b) infinitely many solutions, c) a unique solution?
17. [H] Consider the equation
\[
\begin{pmatrix}
1 & 2 & 3 & 0 \\
0 & 2 & 2 & -1 \\
0 & 0 & 3 & 1 \\
0 & 0 & 0 & a
\end{pmatrix}
\begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{pmatrix}
= \begin{pmatrix} 5 \\ 0 \\ a \\ a + 2b \end{pmatrix}.
\]
For what values of a and b does the equation have
a) a unique solution, b) no solution,
c) infinitely many solutions? d) In the case of (c), determine all solutions.
18. [H] You are an auditor for a company whose four executives make regular business trips
on four routes and you suspect that at least one of the executives has been overstating her
expenses. You don’t know how much it costs to travel each route, but you know that it is
the same for all the executives. You know the number of trips each executive made on each
route in a certain period and you know the total expenses claimed by each executive for
this period. If the numbers of trips are as shown in the table below, do you have sufficient
information to be sure that someone is cheating? State your reasoning clearly.
               Route
               1    2    3    4
Executive A    0    1    1    2
Executive B    1    2    0    1
Executive C    3    4    0    1
Executive D    2    1    3    3
19. [H][V] P,Q,R and S are four cities connected by highways which are labelled as shown
in the diagram.
[Diagram: the four cities P, Q, R and S joined by five highways labelled a, b, c, d and e.]
A hire car operator in P makes a note of the number of kilometres travelled by five
customers who made trips starting and ending at P . He knows that the routes travelled
by the five customers were as follows: abdc abdea cddc cdbec aedbec
Can he determine the length of each of the five highways? State your reasoning clearly.
Problems 4.6 : Solving Ax = b for indeterminate b
20. [R] For each of the following systems of linear equations, find x1, x2 and x3 in terms of
b1, b2 and b3.
a) x1 − 2x2 + 3x3 = b1
x2 − 3x3 = b2
−2x1 + 3x2 − 2x3 = b3
b) 2x1 − 4x3 = b1
3x1 + x2 − 2x3 = b2
−2x1 − x2 − x3 = b3
21. [R] Show that the system of equations x+ y + 2z = a, x+ z = b and 2x+ y + 3z = c are
consistent if and only if c = a+ b.
22. [R] For the following systems, find conditions on the right-hand-side vector b which ensure
that the system has a solution.
a) 2x1 − 4x3 = b1
3x1 + x2 − 2x3 = b2
−2x1 − x2 = b3
b) x1 + x2 + 3x3 − x4 = b1
2x1 − x2 + 2x4 = b2
x1 − 2x2 − 3x3 + 3x4 = b3
3x2 + 6x3 − 4x4 = b4
Problems 4.7 : General properties of the solution of Ax = b
23. [R] Show that $\mathbf{x} = \begin{pmatrix} 7 \\ 2 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} -2 \\ 0 \\ 1 \end{pmatrix}$, λ ∈ R are the solutions of
x1 − 2x2 + 2x3 = 3
2x1 − 6x2 + 4x3 = 2
−2x1 + 4x2 − 4x3 = −6
and that $\mathbf{x} = \lambda \begin{pmatrix} -2 \\ 0 \\ 1 \end{pmatrix}$, λ ∈ R are the solutions of the corresponding homogeneous system
x1 − 2x2 + 2x3 = 0
2x1 − 6x2 + 4x3 = 0
−2x1 + 4x2 − 4x3 = 0
Problems 4.8 : Applications
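24. [R] Does the point (−3, 3, 6) lie on the plane
\[
\mathbf{x} = \begin{pmatrix} 2 \\ 1 \\ -1 \end{pmatrix}
+ \lambda_1 \begin{pmatrix} -1 \\ 2 \\ 4 \end{pmatrix}
+ \lambda_2 \begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix}?
\]
25. [R] Is the vector $\begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix}$ in $\operatorname{span}\left\{ \begin{pmatrix} -1 \\ 3 \\ 4 \end{pmatrix}, \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix} \right\}$?
26. [R] Is the vector $\begin{pmatrix} 1 \\ 1 \\ 4 \\ 12 \end{pmatrix}$ in $\operatorname{span}\left\{ \begin{pmatrix} 3 \\ -1 \\ 4 \\ 6 \end{pmatrix}, \begin{pmatrix} 4 \\ -2 \\ 4 \\ 3 \end{pmatrix} \right\}$?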
27. [R] Can $\begin{pmatrix} 3 \\ 1 \\ -2 \\ 4 \end{pmatrix}$ be expressed as a linear combination of $\begin{pmatrix} 1 \\ 0 \\ -3 \\ 7 \end{pmatrix}$ and $\begin{pmatrix} 2 \\ -1 \\ 5 \\ 6 \end{pmatrix}$?
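28. [R] Do the lines $\mathbf{x} = \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix} + \lambda_1 \begin{pmatrix} 1 \\ 3 \\ 2 \end{pmatrix}$ and $\mathbf{x} = \begin{pmatrix} 12 \\ 15 \\ 7 \end{pmatrix} + \lambda_2 \begin{pmatrix} 3 \\ 1 \\ -2 \end{pmatrix}$ intersect?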
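29. [R] Is the vector $\begin{pmatrix} 5 \\ 7 \\ -1 \end{pmatrix}$ parallel to the plane $\mathbf{x} = \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix} + \lambda_1 \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} + \lambda_2 \begin{pmatrix} 3 \\ 5 \\ 1 \end{pmatrix}$ for λ1, λ2 ∈ R?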
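30. [H] Show that the line $\dfrac{x - 1}{2} = \dfrac{y}{3} = \dfrac{z + 1}{-1}$ is parallel to the plane
\[
\mathbf{x} = \lambda_1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}
+ \lambda_2 \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix}, \quad \lambda_1, \lambda_2 \in \mathbb{R}.
\]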
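31. [H] Find the intersection (if any) of the line $\mathbf{x} = \begin{pmatrix} 0 \\ 18 \\ 1 \end{pmatrix} + \mu \begin{pmatrix} 2 \\ -3 \\ 1 \end{pmatrix}$ for µ ∈ R and the
plane $\mathbf{x} = \begin{pmatrix} 1 \\ 0 \\ 4 \end{pmatrix} + \lambda_1 \begin{pmatrix} 1 \\ 4 \\ 1 \end{pmatrix} + \lambda_2 \begin{pmatrix} 3 \\ 1 \\ -2 \end{pmatrix}$ for λ1, λ2 ∈ R.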
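32. [R] Find the intersection (if any) of the planes 8x1 + 8x2 + x3 = 35 and
\[
\mathbf{x} = \begin{pmatrix} 6 \\ -2 \\ 3 \end{pmatrix}
+ \lambda_1 \begin{pmatrix} -2 \\ 1 \\ 3 \end{pmatrix}
+ \lambda_2 \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}
\quad\text{for } \lambda_1, \lambda_2 \in \mathbb{R}.
\]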
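33. [H] Are the planes
\[
\mathbf{x} = \begin{pmatrix} 1 \\ -4 \\ 2 \\ 3 \end{pmatrix}
+ \lambda_1 \begin{pmatrix} 2 \\ 1 \\ -2 \\ 7 \end{pmatrix}
+ \lambda_2 \begin{pmatrix} -3 \\ 1 \\ 5 \\ 2 \end{pmatrix}
\quad\text{for } \lambda_1, \lambda_2 \in \mathbb{R}
\]
and
\[
\mathbf{x} = \begin{pmatrix} 2 \\ -4 \\ 1 \\ 3 \end{pmatrix}
+ \mu_1 \begin{pmatrix} 3 \\ -1 \\ 2 \\ 4 \end{pmatrix}
+ \mu_2 \begin{pmatrix} -1 \\ 4 \\ 2 \\ 6 \end{pmatrix}
\quad\text{for } \mu_1, \mu_2 \in \mathbb{R}
\]
parallel?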
34. [R] Show that the 3 planes with Cartesian equations
x+ 3y + 2z = 5
2x+ y − z = 2
7x+ 11y + 4z = 13
do not intersect at one point.
35. [H] Consider the following system of equations
x+ y − z = 1
2x− 4y + 2z = 2
3x− 3y + z = 3
a) Use Gaussian elimination and back–substitution to find the solution(s), if any, of the
above equations.
b) Use your result in part a) to decide whether the three planes represented by the
equations are parallel, intersecting in a straight line, intersecting at a point or have
some other configuration.
36. [R] Find a polynomial p(x) of degree 2 satisfying p(1) = 5, p(2) = 7, p(3) = 13.
37. [R] The total of the ages of my brother, my sister and myself is 140 years. I am seven
times the difference between their ages (my sister is older than my brother) and in seven
years I will be half their combined ages now. How old are we?
38. [R] In a trip to Asia a traveller spent $90 a day for hotels in Bangkok, $60 a day in
Singapore and $60 a day in Kuala Lumpur. For food the traveller spent $60 a day in
Bangkok, $90 a day in Singapore, and $60 a day in Kuala Lumpur. In addition the
traveller spent $30 a day in other expenses in each city. The traveller’s diary shows that
the total hotel bill was $1020, total food bill was $960, and total other expenses were $420.
Find the number of days the traveller spent in each city, or show that the diary must be
wrong.
39. [R] A dietician is planning a meal consisting of three foods. A serving of the first food
contains 5 units of protein, 2 units of carbohydrates and 3 units of iron. A serving of the
second food contains 10 units of protein, 3 units of carbohydrates and 6 units of iron. A
serving of the third food contains 15 units of protein, 2 units of carbohydrates and 1 unit
of iron. How many servings of each food should be used to create a meal containing 55
units of protein, 13 units of carbohydrates and 17 units of iron?
Problems 4.9 : Matrix reduction and Maple
40. [M] The Maple session below (for which the package LinearAlgebra has been loaded)
calculates the intersection of 3 planes Π1, Π2 and Π3.
a) Write down the cartesian equations for Π1, Π2 and Π3.
b) Give a full geometric description of the intersection of Π1, Π2 and Π3.
c) Express the intersection of Π1, Π2 and Π3 in cartesian form.
> A:=<<1,3,2>|<2,6,4>|<-1,-1,-1>>;
\[
A := \begin{pmatrix} 1 & 2 & -1 \\ 3 & 6 & -1 \\ 2 & 4 & -1 \end{pmatrix}
\]
> b:=<2,12,7>;
\[
b := \begin{pmatrix} 2 \\ 12 \\ 7 \end{pmatrix}
\]
> LinearSolve(A,b);
\[
\begin{pmatrix} 5 - 2\,t_2 \\ t_2 \\ 3 \end{pmatrix}
\]
41. [M] > with(LinearAlgebra):
> A:=<<1,3,4,7>|<-2,6,2,-8>|<1,8,7,6>|<a,b,c,d>>;
\[
A := \begin{pmatrix}
1 & -2 & 1 & a \\
3 & 6 & 8 & b \\
4 & 2 & 7 & c \\
7 & -8 & 6 & d
\end{pmatrix}
\]
> GaussianElimination(A);
\[
\begin{pmatrix}
1 & -2 & 1 & a \\
0 & 12 & 5 & b - 3a \\
0 & 0 & -\dfrac{7}{6} & c - \dfrac{3a}{2} - \dfrac{5b}{6} \\
0 & 0 & 0 & d - a + 2b - 3c
\end{pmatrix}
\]
a) The above is a Maple session designed to calculate where 4 planes in R3 meet. What
are the equations of the planes?
b) What are the condition(s) on a, b, c, d for the planes to meet at a point?
c) If a = 1, b = 2, c = 3 and the planes meet, where do they meet?
Chapter 5
Matrices
“It seems very pretty,” she said when she had finished it,
“but it’s RATHER hard to understand!. . .
. . . Somehow it seems to fill my head with ideas
— only I don’t exactly know what they are!”
Lewis Carroll, Through the Looking Glass.
In Chapter 4, matrices are treated as devices to help in writing systems of linear equations.
They are also important in many other areas of mathematics and their applications.
In this chapter, we shall define the addition of two matrices, the multiplication of two matrices
and the multiplication of a matrix by a scalar. Under these operations of addition and scalar
multiplication matrices behave much like vectors and indeed like real numbers. For instance, both
matrices and real numbers obey associative laws and distributive laws. Yet, there are differences
between matrices and real numbers. Multiplication of real numbers is commutative, but, as we
shall see, matrix multiplication is not. Every non-zero real number has a multiplicative inverse
which is the reciprocal of that number. However, not all matrices have inverses. We shall study
the matrix arithmetic and algebra and an important function on the set of square matrices—the
determinant function.
5.1 Matrix arithmetic and algebra
In this section we describe the arithmetic of matrices, including equality and addition of matrices,
multiplication of a matrix by a scalar, and multiplication of matrices. In general, division is not
defined for matrices, although an inverse can be defined for some matrices (see Section 5.3).
Definition 1. An m × n (read “m by n”) matrix is an array of m rows and n
columns of numbers of the form
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}.$$
The number aij in the matrix A lies in row i and column j. It is called the ijth
entry or ijth element of the matrix.
Note.
1. An m× 1 matrix is called a column vector, while a 1× n matrix is called a row vector.
2. We use Mmn to stand for the set of all m × n matrices, i.e., the set of all matrices with m
rows and n columns.
In general, we will assume that the entries in a matrix can be complex numbers. However, it
is sometimes necessary to distinguish between real matrices, in which all entries aij are real
numbers, and complex matrices, in which all aij are complex numbers. In this case, we use
Mmn(R) for the set of all real m× n matrices and Mmn(C) for the set of all complex m× n
matrices. Likewise, Mmn(Q) is used for the set of all rational m× n matrices.
3. We say an m× n matrix is of size m× n.
4. When we say “let A = (aij)”, we are specifying a matrix of fixed size, in which, for each given
i, j, the ijth entry is aij. On the other hand, for a given matrix A, we denote the entry in
the ith row and jth column by [A]ij .
Example 1. The matrix
$$A = \begin{pmatrix} 1 & 2 & -1 & 2 \\ 3 & -4 & 2 & 5 \\ -6 & -3 & 1 & 4 \end{pmatrix}$$
is of size 3 × 4 and [A]24 = 5. ♦
When the number of rows or columns of a matrix is larger than 9, we need to use a comma
to separate the row number and the column number. If B is a matrix of size 5 × 12, we shall use
[B]2,4 and [B]3,11 to denote the entry in the second row and fourth column and the entry in the
third row and the eleventh column, respectively.
5.1.1 Equality, addition and multiplication by a scalar
The rules for equality of matrices, addition of matrices and multiplication of a matrix by a scalar
are straightforward generalisations of the corresponding rules for vectors in Rn.
Definition 2. Two matrices A and B are defined to be equal if
1. the number of rows of A equals the number of rows of B,
2. the number of columns of A equals the number of columns of B,
3. [A]ij = [B]ij for all i and j.
To summarise, equal matrices have the same size and their corresponding entries are equal.
Addition of matrices is defined as follows.
Definition 3. If A and B are m × n matrices, then the sum C = A + B is the
m× n matrix whose entries are
[C]ij = [A]ij + [B]ij for all i, j.
To understand the proofs of the properties of matrices, we need to know the notation [A]ij well.
The row i column j entry of the matrix A + B is denoted by [A + B]ij. The definition simply
says that this entry is the same as [A]ij + [B]ij , that is the sum of the corresponding entries of the
matrices A and B.
Note that addition of matrices of different sizes is not defined.
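Example 2.
$$\begin{pmatrix} 2 & 3 & 5 \\ 4 & -3 & 2 \end{pmatrix} + \begin{pmatrix} -1 & 1 & 4 \\ 2 & 4 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 4 & 9 \\ 6 & 1 & 1 \end{pmatrix}$$
but
$$\begin{pmatrix} -1 & 1 & 4 \\ 2 & 4 & -1 \end{pmatrix} + \begin{pmatrix} 1 & 2 \\ 1 & 0 \end{pmatrix}$$
is not defined as the sizes are different. ♦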
Proposition 1. Let A, B and C be m× n matrices.
1. A+B = B +A. (Commutative Law of Addition)
2. (A+B) + C = A+ (B + C). (Associative Law of Addition)
Proof. We only prove the Associative law.
[(A+B) + C]ij = [A+B]ij + [C]ij (by definition, the sum of (A+B) and C)
= ([A]ij + [B]ij) + [C]ij (by definition, the sum of A and B)
= [A]ij + ([B]ij + [C]ij) (associative law of numbers )
= [A]ij + [B + C]ij (by definition, the sum of B and C)
= [A+ (B + C)]ij (by definition, the sum of A and (B + C))
Hence (A+B) + C = A+ (B + C).
We now define what is meant by a zero matrix.
Definition 4. A zero matrix (written 0) is a matrix in which every entry is zero.
Example 3.
$$\mathbf{0} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$
is the zero matrix in M22, and
$$\mathbf{0} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
is the zero matrix in M23. ♦
Note that when writing the zero matrix by hand, you can write 0 to distinguish it from the zero
vector and the number 0.
Proposition 2. Let A be a matrix and 0 be the zero matrix, both in Mmn. Then
A+ 0 = 0+A = A.
Proof. As in the proof of the associative law of addition, but omitting the detailed justifications,
we have
[A+ 0]ij = [A]ij + 0 = [A]ij .
Hence A+ 0 = A. We can prove that 0+A = A in a similar way.
Definition 5. For any matrix A ∈ Mmn, the negative of A is the m× n matrix
−A with entries
[−A]ij = −[A]ij for all 1 ≤ i ≤ m, 1 ≤ j ≤ n.
Using the above definition of the negative of a matrix, we can define subtraction by
A−B = A+ (−B) for all A,B ∈Mmn.
As with addition, we cannot subtract one matrix from another one of different size.
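Example 4. Let
$$A = \begin{pmatrix} 1 & 3 & -2 \\ 2 & -1 & 3 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} 2 & -4 & 5 \\ 3 & 0 & -2 \end{pmatrix}.$$
We have
$$-B = \begin{pmatrix} -2 & 4 & -5 \\ -3 & 0 & 2 \end{pmatrix} \quad\text{and}\quad A - B = \begin{pmatrix} -1 & 7 & -7 \\ -1 & -1 & 5 \end{pmatrix}.$$
♦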
Proposition 3. If A is an m× n matrix, then
A+ (−A) = (−A) +A = 0.
We now define multiplication of a matrix by a scalar, that is, by a number (real or complex).
Definition 6. If A is an m×n matrix and λ is a scalar, then the scalar multiple
B = λA of A is the m× n matrix whose entries are
[B]ij = λ[A]ij for all 1 ≤ i ≤ m, 1 ≤ j ≤ n,
that is, each entry of the matrix is multiplied by the scalar λ.
Example 5.
$$3\begin{pmatrix} 2 & 3 & 5 \\ 4 & -3 & 2 \end{pmatrix} = \begin{pmatrix} 6 & 9 & 15 \\ 12 & -9 & 6 \end{pmatrix}.$$
♦
Proposition 4. Let λ, µ be scalars and A, B be matrices in Mmn.
1. λ(µA) = (λµ)A (Associative Law of Scalar Multiplication)
2. (λ+ µ)A = λA+ µA (Scalar Multiplication is distributive over Scalar Addition)
3. λ(A+B) = λA+ λB (Scalar Multiplication is distributive over Matrix Addition)
The above rules for equality, matrix addition and multiplication by a scalar are essentially the
same as the corresponding rules for vectors in Rn. The matrix operations also obey the basic laws
of arithmetic as do addition and multiplication by a scalar in Rn. For instance, in both Mmn and
Rn, the commutative laws, associative laws and distributive laws hold.
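The definitions of this subsection can be sketched in a few lines of Python. This is only an illustration, assuming a matrix is stored as a list of rows; the function names `mat_add` and `scalar_mul` are hypothetical and not part of these Notes. The data are taken from Examples 2 and 5.

```python
def mat_add(A, B):
    """Entrywise sum [A]ij + [B]ij; defined only for matrices of the same size."""
    same_size = len(A) == len(B) and all(len(ra) == len(rb) for ra, rb in zip(A, B))
    if not same_size:
        raise ValueError("addition of matrices of different sizes is not defined")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def scalar_mul(lam, A):
    """Scalar multiple: every entry of A is multiplied by lam."""
    return [[lam * a for a in row] for row in A]


A = [[2, 3, 5], [4, -3, 2]]
B = [[-1, 1, 4], [2, 4, -1]]

print(mat_add(A, B))     # the sum computed in Example 2
print(scalar_mul(3, A))  # the scalar multiple computed in Example 5
```

Calling `mat_add` with a 2 × 3 and a 2 × 2 matrix raises an error, mirroring the remark that such a sum is not defined.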
5.1.2 Matrix multiplication
In Section 4.2, we represented the system of linear equations
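$$\begin{array}{ccccccccc}
a_{11}x_1 & + & a_{12}x_2 & + & \cdots & + & a_{1n}x_n & = & b_1 \\
a_{21}x_1 & + & a_{22}x_2 & + & \cdots & + & a_{2n}x_n & = & b_2 \\
\vdots & & \vdots & & & & \vdots & & \vdots \\
a_{m1}x_1 & + & a_{m2}x_2 & + & \cdots & + & a_{mn}x_n & = & b_m
\end{array}$$
by the matrix equation Ax = b, where
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}, \quad \mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}.$$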
It makes sense to use the above relation between the matrix equation Ax = b and the system of
linear equations as the basis for the definition of the product of two matrices. The column vector
x ∈ Rn may be regarded as an n× 1 matrix and the column vector b = Ax ∈ Rm may be regarded
as an m× 1 matrix.
Definition 7. If A = (aij) is an m×n matrix and x is an n× 1 matrix with entries
xi, then the product b = Ax is the m× 1 matrix whose entries are given by
$$b_i = a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n = \sum_{k=1}^{n} a_{ik}x_k \quad\text{for } 1 \le i \le m.$$
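Example 6. If A is a 3× 2 matrix and x is a 2× 1 matrix given by
$$A = \begin{pmatrix} 2 & -3 \\ 5 & -1 \\ -7 & 4 \end{pmatrix}, \quad \mathbf{x} = \begin{pmatrix} 6 \\ 9 \end{pmatrix},$$
then the product Ax is the 3× 1 matrix
$$A\mathbf{x} = \begin{pmatrix} 2(6) + (-3)(9) \\ 5(6) + (-1)(9) \\ (-7)(6) + 4(9) \end{pmatrix} = \begin{pmatrix} -15 \\ 21 \\ -6 \end{pmatrix}.$$
♦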
To obtain a suitable general definition of multiplication of two matrices, we replace the matrix
x with 1 column by a matrix X with p columns. We can then think of the matrix A multiplying
each of the columns of X using the above rule.
Definition 8. Let A be an m× n matrix and X be an n× p matrix and let xj be
the jth column of X. Then the product B = AX is the m × p matrix whose jth
column bj is given by
bj = Axj for 1 ≤ j ≤ p.
The matrix X is the augmented matrix of the p column vectors, i.e. X = (x1| · · · |xp). So
the definition can be written as
A(x1| · · · |xp) = (Ax1| · · · |Axp).
By combining Definitions 7 and 8, we can give an equivalent definition of a matrix product in
terms of the entries of the matrices:
Definition 9. (Alternative) If A is an m × n matrix and X is an n × p matrix,
then the product AX is the m× p matrix whose entries are given by the formula
$$[AX]_{ij} = [A]_{i1}[X]_{1j} + \cdots + [A]_{in}[X]_{nj} = \sum_{k=1}^{n} [A]_{ik}[X]_{kj} \quad\text{for } 1 \le i \le m,\ 1 \le j \le p.$$
Note.
1. Let A be an m×n matrix and B be an r× s matrix. The product AB is defined only when
the number of columns of A is the same as the number of rows of B, i.e. n = r. If n = r, the
size of AB will be m× s.
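(Diagram: A is m × n and B is r × s. The product AB is defined if the inner sizes n and r are equal; the outer sizes m and s then give the size of AB.)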
2. Matrix multiplication is a row-times-column process: we get the (row i, column j) entry of
AX by going across the ith row of A and down the jth column of X multiplying and adding
as we go.
Example 7. For
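$$A = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 0 & -1 \end{pmatrix} \quad\text{and}\quad X = \begin{pmatrix} 1 & -1 & 2 \\ 2 & -2 & 3 \\ 3 & -3 & 1 \end{pmatrix}$$
the product AX = (bij) can be easily obtained.
$$A\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = \begin{pmatrix} 14 \\ -2 \end{pmatrix}, \quad A\begin{pmatrix} -1 \\ -2 \\ -3 \end{pmatrix} = \begin{pmatrix} -14 \\ 2 \end{pmatrix}, \quad A\begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix} = \begin{pmatrix} 11 \\ 1 \end{pmatrix}.$$
Hence
$$AX = \begin{pmatrix} 1 & 2 & 3 \\ 1 & 0 & -1 \end{pmatrix}\begin{pmatrix} 1 & -1 & 2 \\ 2 & -2 & 3 \\ 3 & -3 & 1 \end{pmatrix} = \begin{pmatrix} 14 & -14 & 11 \\ -2 & 2 & 1 \end{pmatrix}.$$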
♦
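The row-times-column rule of Definition 9 can be sketched as follows. This is an illustrative Python version only, assuming matrices are stored as lists of rows; the name `mat_mul` is hypothetical. It is applied to the matrices A and X of Example 7.

```python
def mat_mul(A, X):
    """Product AX with [AX]ij = sum over k of [A]ik [X]kj."""
    n = len(A[0])                      # number of columns of A
    if n != len(X):                    # must equal the number of rows of X
        raise ValueError("product is not defined: sizes do not match")
    p = len(X[0])
    return [[sum(A[i][k] * X[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]


A = [[1, 2, 3], [1, 0, -1]]
X = [[1, -1, 2], [2, -2, 3], [3, -3, 1]]

print(mat_mul(A, X))   # the 2 x 3 product found in Example 7
```

The size check is done before any arithmetic, so a product whose inner sizes differ is rejected rather than silently truncated.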
Warning. For some A and X, AX ≠ XA, i.e., matrix multiplication does not satisfy the commutative
law. This can happen when one product is defined and the other is not. It can happen even
when both are defined. For instance, for the A and X just given, the product XA is not defined as the
number of columns of X does not equal the number of rows of A.
(Diagram: X is 3 × 3 and A is 2 × 3. The inner sizes 3 and 2 are not equal, so XA is not defined.)
Here is an example in which both AX and XA are defined but are different:
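$$A = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} \quad\text{and}\quad X = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix},$$
in which case
$$AX = \begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix}\begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 4 & 8 \end{pmatrix},$$
whereas
$$XA = \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 2 \end{pmatrix} = \begin{pmatrix} 1 & 6 \\ 2 & 8 \end{pmatrix}.$$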
A matrix is said to be square if it has the same number of rows as columns. For example, of
the four matrices displayed below, the second and fourth are square:
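$$\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}, \quad \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$$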
The diagonal of a square matrix consists of the positions on the line from the top left to the bottom
right. More precisely, the diagonal entries of an n × n square matrix (aij) are a11, a22, . . . , ann.
The matrix is said to be an identity matrix if its diagonal entries are all 1 and all other entries
are 0. The following are identity matrices
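$$(1), \quad \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
For each positive integer n there is one and only one n× n identity matrix. It is denoted by In, or just by
I if there is no risk of ambiguity.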
Definition 10. An identity matrix (written I) is a square matrix with 1’s on the
diagonal and 0’s off the diagonal.
Example 8.
$$I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
is the identity matrix in M22, and
$$I = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
is the identity matrix in M33. ♦
Although matrix multiplication does not satisfy a commutative law of multiplication, the
operation does satisfy some laws.
Proposition 5 (Properties of Matrix Multiplication). Let A, B, C be matrices and λ be a scalar.
1. If the product AB exists, then A(λB) = λ(AB) = (λA)B.
2. Associative Law of Matrix Multiplication. If the products AB and BC exist, then
A(BC) = (AB)C.
3. AI = A and IA = A, where I represents identity matrices of the appropriate (possibly
different) sizes.
4. Right Distributive Law. If A+B and AC exist, then (A+B)C = AC +BC.
5. Left Distributive Law. If B + C and AB exist, then A(B + C) = AB +AC.
Proof. We will prove the right distributive law, and leave the proof of the remainder as a problem
at the end of the chapter. As A + B exists, A and B must be the same size. Also, as AC exists,
the number of columns of A must be equal to the number of rows of C. Therefore we let A,B be
m× n matrices and C be an n× p matrix. Then BC also exists and the row i, column j entry of
(A+B)C is
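$$\begin{aligned}
[(A+B)C]_{ij} &= \sum_{k=1}^{n} [A+B]_{ik}[C]_{kj} \\
&= \sum_{k=1}^{n} \left([A]_{ik} + [B]_{ik}\right)[C]_{kj} \\
&= \sum_{k=1}^{n} [A]_{ik}[C]_{kj} + \sum_{k=1}^{n} [B]_{ik}[C]_{kj} \\
&= [AC]_{ij} + [BC]_{ij} \\
&= [AC + BC]_{ij}.
\end{aligned}$$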
Hence (A+B)C = AC +BC as claimed.
Example 9. If A ∈M23 is given by
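$$A = \begin{pmatrix} 1 & 2 & 4 \\ -3 & 5 & 7 \end{pmatrix},$$
then by direct multiplication
$$AI = \begin{pmatrix} 1 & 2 & 4 \\ -3 & 5 & 7 \end{pmatrix}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = A,$$
and
$$IA = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 2 & 4 \\ -3 & 5 & 7 \end{pmatrix} = A.$$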
Note that in this case the right and left I’s must be of different sizes for the matrix products to
exist. ♦
Example 10. Here is an example to verify the associative law of multiplication. For
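$$A = \begin{pmatrix} 2 & 1 & -2 \\ 3 & 4 & 2 \end{pmatrix}, \quad B = \begin{pmatrix} -1 & 5 \\ 0 & -3 \\ 2 & 6 \end{pmatrix}, \quad C = \begin{pmatrix} -5 & 2 \\ -3 & -6 \end{pmatrix},$$
we have
$$A(BC) = \begin{pmatrix} 2 & 1 & -2 \\ 3 & 4 & 2 \end{pmatrix}\begin{pmatrix} -10 & -32 \\ 9 & 18 \\ -28 & -32 \end{pmatrix} = \begin{pmatrix} 45 & 18 \\ -50 & -88 \end{pmatrix}$$
and
$$(AB)C = \begin{pmatrix} -6 & -5 \\ 1 & 15 \end{pmatrix}\begin{pmatrix} -5 & 2 \\ -3 & -6 \end{pmatrix} = \begin{pmatrix} 45 & 18 \\ -50 & -88 \end{pmatrix}.$$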
♦
Using the properties of matrix operations, we can simplify expressions in unknown matrices in
almost the same way as simplifying expressions in algebra.
Example 11. By distributive laws and associative laws, we can write
$$A(A + 2B)A = A^3 + 2ABA.$$
Here $A^3$ obviously means AAA. Thanks to the associative law of matrix multiplication, we do not
have to specify ABA to be (AB)A or A(BA) because (AB)A = A(BA). Note that we cannot write
the second term as $2A^2B$ because matrix multiplication is not commutative.
Example 12. Expand the expression $(A + I)^2$.
Solution.
$$\begin{aligned}
(A + I)^2 &= (A + I)(A + I) \\
&= (A + I)A + (A + I)I \\
&= (A^2 + IA) + (AI + I) \\
&= A^2 + A + A + I \\
&= A^2 + 2A + I
\end{aligned}$$
Can you identify all the rules used in each step? ♦
Note that, in general, $(A + B)^2 \ne A^2 + 2AB + B^2$.
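To see where the last remark comes from, the square can be expanded with the two distributive laws. The following is a short worked sketch, assuming A and B are square matrices of the same size so that every product exists.

```latex
\begin{aligned}
(A + B)^2 &= (A + B)(A + B) \\
          &= (A + B)A + (A + B)B \\
          &= A^2 + BA + AB + B^2 .
\end{aligned}
```

The middle terms combine to 2AB only when AB = BA, which is why the familiar formula fails for matrices that do not commute.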
5.1.3 Matrix arithmetic and systems of linear equations
In Section 4.2 we used the matrix notation Ax = b as a shorthand notation for a system of linear
equations. Then we used this as the motivation to define multiplication in Definition 7 in 5.1.2.
That is, when we multiply the coefficient matrix A with the unknown column vector x, we obtain a
column vector Ax. If we equate the components of the column vector Ax with those of b, we shall
recover the system of equations. Thus Ax = b is not simply a shorthand notation, it is a matrix
equation. In other words, we have the following proposition.
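Proposition 6. Let
$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}, \quad \mathbf{x} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \quad \mathbf{b} = \begin{pmatrix} b_1 \\ \vdots \\ b_m \end{pmatrix}, \quad \mathbf{v} = \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}.$$
The vector v is a solution to the matrix equation Ax = b, i.e. Av = b, if and only if
x1 = v1, . . . , xn = vn is a solution to the system of equations
$$\begin{array}{ccccccc}
a_{11}x_1 & + & \cdots & + & a_{1n}x_n & = & b_1 \\
\vdots & & & & \vdots & & \vdots \\
a_{m1}x_1 & + & \cdots & + & a_{mn}x_n & = & b_m .
\end{array}$$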
Using this we can rewrite the proof of the propositions in Section 4.7 in a much shorter way.
As an example, let us rewrite the proof of the following.
Proposition 2 (Section 4.7). If v and w are solutions of Ax = 0 then so are v+w and λv for
any scalar λ.
Proof. Since v and w are solutions of Ax = 0, we have
A(v +w) = Av +Aw = 0+ 0 = 0.
Hence v +w is also a solution.
On the other hand, we have
A(λv) = λ(Av) = λ0 = 0.
Thus λv is also a solution.
5.2 The transpose of a matrix
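Definition 1. The transpose of an m×n matrix A is the n×m matrix $A^T$ (read
‘A transpose’) with entries given by
$$[A^T]_{ij} = [A]_{ji}.$$
Example 1. The transpose of
$$A = \begin{pmatrix} 3 & 2 \\ -4 & 0 \\ 0 & 1 \\ 1 & -5 \end{pmatrix} \quad\text{is}\quad A^T = \begin{pmatrix} 3 & -4 & 0 & 1 \\ 2 & 0 & 1 & -5 \end{pmatrix}.$$
♦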
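Example 2. The transpose of
$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{pmatrix} \in M_{23} \quad\text{is}\quad A^T = \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \\ a_{13} & a_{23} \end{pmatrix} \in M_{32}.$$
♦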
Note. The columns of AT are the rows of A and the rows of AT are the columns of A.
5.2.1 Some uses of transposes
Vectors. As noted in Chapter 3, column vectors are often written as the transpose of ‘row vectors’.
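Example 3. The transpose of the column vector
$$\mathbf{x} = \begin{pmatrix} 2 \\ 5 \\ -2 \\ 1 \end{pmatrix}$$
is the row vector $\mathbf{x}^T = \begin{pmatrix} 2 & 5 & -2 & 1 \end{pmatrix} = \mathbf{y}$; conversely, $\mathbf{x} = \mathbf{y}^T$. ♦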
Products of Vectors. So far we have not defined the product of two vectors. However, the
product of a row vector and a column vector makes sense as a special case of matrix multiplication.
Example 4. Let a and b be two vectors in Rn. Then, using the usual rule for matrix multiplication,
we have
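$$\mathbf{a}^T\mathbf{b} = \begin{pmatrix} a_1 & a_2 & \cdots & a_n \end{pmatrix}\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix} = a_1b_1 + a_2b_2 + \cdots + a_nb_n.$$
This product is a 1× 1 matrix (see footnote 1), which we sometimes regard simply as a scalar, and so $\mathbf{a}^T\mathbf{b}$ is often
called the scalar product of the two vectors a and b in Rn. Note also that $\mathbf{b}^T\mathbf{a} = \mathbf{a}^T\mathbf{b}$. ♦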
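Example 5. If
$$\mathbf{a} = \begin{pmatrix} 1 \\ 4 \\ -2 \\ 3 \end{pmatrix} \quad\text{and}\quad \mathbf{b} = \begin{pmatrix} 2 \\ -3 \\ 0 \\ 6 \end{pmatrix}$$
then
$$\mathbf{a}^T\mathbf{b} = \begin{pmatrix} 1 & 4 & -2 & 3 \end{pmatrix}\begin{pmatrix} 2 \\ -3 \\ 0 \\ 6 \end{pmatrix} = 2 - 12 + 0 + 18 = 8 = \begin{pmatrix} 2 & -3 & 0 & 6 \end{pmatrix}\begin{pmatrix} 1 \\ 4 \\ -2 \\ 3 \end{pmatrix} = \mathbf{b}^T\mathbf{a}.$$
The order of multiplication is very important. If a and b are (column) vectors, then the product
$\mathbf{a}\mathbf{b}^T$ is not a scalar but a matrix. ♦
Footnote 1. Technically, we should write the matrix product $\mathbf{a}^T\mathbf{b}$ as a matrix with brackets. In practice we write the result
as a single number without brackets. Thus if $\mathbf{a} = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$ and $\mathbf{b} = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$, then we write $\mathbf{a}^T\mathbf{b} = 8$, not $\mathbf{a}^T\mathbf{b} = (8)$.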
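Example 6. For a ∈ Rm and b ∈ Rn
$$\mathbf{a}\mathbf{b}^T = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_m \end{pmatrix}\begin{pmatrix} b_1 & b_2 & \cdots & b_n \end{pmatrix} = \begin{pmatrix} a_1b_1 & a_1b_2 & \cdots & a_1b_n \\ a_2b_1 & a_2b_2 & \cdots & a_2b_n \\ \vdots & \vdots & \ddots & \vdots \\ a_mb_1 & a_mb_2 & \cdots & a_mb_n \end{pmatrix}.$$
♦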
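Example 7. If
$$\mathbf{a} = \begin{pmatrix} 1 \\ 4 \\ -2 \\ 3 \end{pmatrix} \quad\text{and}\quad \mathbf{b} = \begin{pmatrix} 2 \\ -3 \\ 0 \\ 6 \end{pmatrix}$$
then
$$\mathbf{a}\mathbf{b}^T = \begin{pmatrix} 2 & -3 & 0 & 6 \\ 8 & -12 & 0 & 24 \\ -4 & 6 & 0 & -12 \\ 6 & -9 & 0 & 18 \end{pmatrix}.$$
♦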
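Note that the expressions $\mathbf{a}\mathbf{b}$ and $\mathbf{a}^T\mathbf{b}^T$ have no meaning. The expression $\mathbf{b}\mathbf{a}^T$ has a meaning as a
matrix. However, $\mathbf{b}\mathbf{a}^T \ne \mathbf{a}\mathbf{b}^T$ in general.
Example 8. For the vectors a and b of Example 7
$$\mathbf{b}\mathbf{a}^T = \begin{pmatrix} 2 & 8 & -4 & 6 \\ -3 & -12 & 6 & -9 \\ 0 & 0 & 0 & 0 \\ 6 & 24 & -12 & 18 \end{pmatrix}.$$
Note that the matrix in Example 8 is the transpose of the matrix in Example 7. ♦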
5.2.2 Some properties of transposes
Proposition 1. The transpose of a transpose is the original matrix, i.e., (AT )T = A.
Proof. A transpose is obtained by changing the order of the subscripts on each entry, and hence
$$[(A^T)^T]_{ij} = [A^T]_{ji} = [A]_{ij}.$$
Thus all matrix entries of (AT )T are equal to corresponding entries of A, and hence the matrices
are equal.
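Example 9.
$$\left(\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}^T\right)^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}^T = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}.$$
♦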
Proposition 2. If A,B ∈Mmn and λ, µ ∈ R, then (λA+ µB)T = λAT + µBT .
Proof. For A,B ∈Mmn and λ, µ ∈ R, the matrix (λA+ µB)T has matrix entries
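$$\begin{aligned}
[(\lambda A + \mu B)^T]_{ij} &= [\lambda A + \mu B]_{ji} \\
&= [\lambda A]_{ji} + [\mu B]_{ji} \\
&= \lambda [A]_{ji} + \mu [B]_{ji} \\
&= \lambda [A^T]_{ij} + \mu [B^T]_{ij} \\
&= [\lambda A^T + \mu B^T]_{ij}.
\end{aligned}$$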
Thus, as corresponding matrix entries are equal, we have (λA + µB)T = λAT + µBT , and the
result is proved.
Proposition 3. If AB exists, then (AB)T = BTAT .
That is, if a product exists, then the transpose of a product is the product of the transposes,
but with the order of multiplication reversed.
Proof. Since AB exists, the number of columns of A must equal the number of rows of B. Suppose
A ∈Mmn and B ∈Mnp. Then the matrix entries of (AB)T are
$$[(AB)^T]_{ij} = [AB]_{ji} = \sum_{k=1}^{n} [A]_{jk}[B]_{ki}.$$
Now, by definition of transpose, $B^T \in M_{pn}$ and $A^T \in M_{nm}$. Thus the product $B^TA^T$ exists. The
matrix entries of this product are
$$[B^TA^T]_{ij} = \sum_{k=1}^{n} [B^T]_{ik}[A^T]_{kj} = \sum_{k=1}^{n} [B]_{ki}[A]_{jk} = \sum_{k=1}^{n} [A]_{jk}[B]_{ki} = [(AB)^T]_{ij}.$$
The result is proved.
Example 10. Let
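$$A = \begin{pmatrix} 1 & 2 & -1 \\ 3 & -3 & 6 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} -2 & 5 & -1 \\ 4 & -1 & 5 \\ 1 & 3 & 2 \end{pmatrix}.$$
Then
$$(AB)^T = \begin{pmatrix} 5 & 0 & 7 \\ -12 & 36 & -6 \end{pmatrix}^T = \begin{pmatrix} 5 & -12 \\ 0 & 36 \\ 7 & -6 \end{pmatrix}$$
and
$$B^TA^T = \begin{pmatrix} -2 & 4 & 1 \\ 5 & -1 & 3 \\ -1 & 5 & 2 \end{pmatrix}\begin{pmatrix} 1 & 3 \\ 2 & -3 \\ -1 & 6 \end{pmatrix} = \begin{pmatrix} 5 & -12 \\ 0 & 36 \\ 7 & -6 \end{pmatrix}.$$
♦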
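The reversal of order in Proposition 3 can also be checked numerically. The sketch below is illustrative only: it assumes matrices stored as lists of rows, and the helper names `transpose` and `mat_mul` are hypothetical. The data are the matrices A and B of Example 10.

```python
def transpose(A):
    """Rows of the result are the columns of A, so [A^T]ij = [A]ji."""
    return [list(col) for col in zip(*A)]


def mat_mul(A, B):
    """Row-times-column product; assumes the sizes are compatible."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


A = [[1, 2, -1], [3, -3, 6]]
B = [[-2, 5, -1], [4, -1, 5], [1, 3, 2]]

lhs = transpose(mat_mul(A, B))             # (AB)^T
rhs = mat_mul(transpose(B), transpose(A))  # B^T A^T

print(lhs)
print(lhs == rhs)
```

Note that the product of the transposes in the original order, A^T B^T, is not even defined here, since A^T is 3 × 2 and B^T is 3 × 3.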
Example 11. For column vectors a, b, we have from Propositions 1 and 3 that
$$(\mathbf{a}\mathbf{b}^T)^T = (\mathbf{b}^T)^T\mathbf{a}^T = \mathbf{b}\mathbf{a}^T,$$
as noted for the special case of the vectors in Examples 7 and 8. ♦
Symmetric matrices are especially important in certain applications of matrices to geometry
and physics.
Definition 2. A matrix A is said to be a symmetric matrix if $A = A^T$.
Note that the entries of a symmetric matrix A satisfy $[A]_{ij} = [A^T]_{ji} = [A]_{ji}$ for all i, j, and
hence a symmetric matrix must be square.
Example 12. The matrix
$$A = \begin{pmatrix} 1 & 3 & -4 & 6 \\ 3 & 7 & 10 & -11 \\ -4 & 10 & -2 & 5 \\ 6 & -11 & 5 & 4 \end{pmatrix}$$
is a symmetric matrix. ♦
Example 13. Let A and B be symmetric matrices. Prove that A+B is symmetric but AB need
not be symmetric.
Proof. Since A and B are symmetric, we have $A^T = A$ and $B^T = B$. By Proposition 2,
$$(A + B)^T = A^T + B^T = A + B.$$
Hence A+B is symmetric.
Let $A = \begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 1 & 2 \\ 2 & 0 \end{pmatrix}$. Both A and B are symmetric. However,
$$AB = \begin{pmatrix} 4 & 4 \\ 1 & 2 \end{pmatrix}$$
is not symmetric.
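The counterexample can be understood through Proposition 3. As a short worked sketch, for symmetric A and B of the same size:

```latex
(AB)^T = B^T A^T = BA ,
\qquad\text{and here}\qquad
BA = \begin{pmatrix} 1 & 2 \\ 2 & 0 \end{pmatrix}
     \begin{pmatrix} 2 & 1 \\ 1 & 0 \end{pmatrix}
   = \begin{pmatrix} 4 & 1 \\ 4 & 2 \end{pmatrix}
   \neq AB .
```

So the product of two symmetric matrices is symmetric exactly when the two matrices commute.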
5.3 The inverse of a matrix
In Section 5.1, we defined the operations of addition, subtraction and multiplication for matrices.
We also emphasised that division is not defined for matrices. The closest we can come to
defining ‘division’ for matrices is to define the inverse $A^{-1}$ of a matrix A. Not all matrices,
however, have inverses.
Warning. If x and y are numbers, then $x^{-1} = \frac{1}{x}$ means 1 divided by x, and $\frac{x}{y}$ means x divided
by y. However, for a matrix A, we cannot use $\frac{I}{A}$ to represent $A^{-1}$. For matrices A and B, writing $\frac{A}{B}$
would be ambiguous. The reason is that matrix multiplication is not commutative, and so it is
not clear whether $\frac{A}{B}$ is supposed to mean $AB^{-1}$ or $B^{-1}A$.
Definition 1. A matrix X is said to be an inverse of a matrix A if both
AX = I and XA = I,
where I is an identity (or unit) matrix of the appropriate size.
If a matrix A has an inverse, then A is said to be an invertible matrix. An invertible matrix
is also called a non-singular matrix. If a matrix A is not an invertible matrix, then it is called a
singular matrix.
Proposition 1. All invertible matrices are square, that is, have the same number of rows as
columns.
[X] Proof. Suppose that the matrix A has more columns than rows. Then the equation Ax = 0
has some non-zero solution x. Left multiplication by the hypothetical inverse X yields x = Ix =
(XA)x = X0 = 0, which is a contradiction. So no matrix with more columns than rows has an
inverse. If the matrix A has more rows than columns and has an inverse X then X has more
columns than rows and has an inverse A. As before this leads to a contradiction.
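Example 1. The matrix
$$X = \begin{pmatrix} 1 & -\frac{1}{3} & -\frac{2}{3} \\ 1 & \frac{2}{3} & -\frac{2}{3} \\ 2 & \frac{2}{3} & -\frac{5}{3} \end{pmatrix}$$
is an inverse of the matrix
$$A = \begin{pmatrix} 2 & 3 & -2 \\ -1 & 1 & 0 \\ 2 & 4 & -3 \end{pmatrix},$$
since (check calculations)
$$AX = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} = XA.$$
♦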
Definition 2. A matrix X is said to be a right inverse of A if A is r × c, X is
c× r and AX = Ir. A matrix Y is said to be a left inverse of A if A is r× c, Y is
c× r and Y A = Ic.
Note.
1. The conditions in the definition of an invertible matrix are very restrictive, so many square
matrices are not invertible.
2. However, invertible matrices are very important as we shall begin to see below.
3. If A has a left inverse then c ≤ r. If A has a right inverse then r ≤ c. (See proof of Proposition 1).
5.3.1 Some useful properties of inverses
Theorem 2. If the matrix A has both a left inverse Y and a right inverse X, then Y = X. In
particular, if both Y and X are inverses of A, then Y = X.
That is, if A has a left and a right inverse, then A is both invertible and square.
Proof. Y = Y Ir = Y (AX) = (Y A)X = IcX = X.
Thus, AX = I = XA and so A is invertible. Hence A is a square matrix by Proposition 1.
Notation: Because the inverse of a matrix A (if one exists) is unique we can denote it by a special
symbol: $A^{-1}$ (read ‘A inverse’).
Proposition 3. If A is an invertible matrix, then $A^{-1}$ is also an invertible matrix and the inverse
of $A^{-1}$ is A. That is, $(A^{-1})^{-1} = A$.
Proof. If $X = A^{-1}$ exists, then AX = I and XA = I. But these two conditions also imply that A
is the inverse of X, and hence X is invertible and $X^{-1} = A$. Replacing X by $A^{-1}$ then gives the
result $(A^{-1})^{-1} = A$.
Proposition 4. If A and B are invertible matrices and the product AB exists, then AB is also an
invertible matrix and $(AB)^{-1} = B^{-1}A^{-1}$.
Note the reversal of the order in the product of the inverses.
Proof. On multiplying AB from the right by $B^{-1}A^{-1}$, we have
$$(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I.$$
Thus the matrix $X = B^{-1}A^{-1}$ satisfies (AB)X = I.
Similarly, on multiplying AB from the left by $B^{-1}A^{-1}$, we have
$$(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I.$$
Thus $X = B^{-1}A^{-1}$ also satisfies X(AB) = I. Thus, this X is the inverse of AB, and we have
$B^{-1}A^{-1} = X = (AB)^{-1}$.
This proposition can easily be extended to products with 3 or more factors to obtain, for
example,
$$(ABC)^{-1} = C^{-1}(AB)^{-1} = C^{-1}B^{-1}A^{-1}.$$
These results are sometimes useful in simplifying complicated expressions.
Example 2. Assuming F , G, and H are invertible and all products exist, simplify
$$A = HG(FHG)^{-1}FG.$$
Solution. Replacing $(FHG)^{-1}$ by $G^{-1}H^{-1}F^{-1}$, we have
$$A = HGG^{-1}H^{-1}F^{-1}FG.$$
Then using $GG^{-1} = I$ and $F^{-1}F = I$, we have
$$A = HIH^{-1}IG = HH^{-1}G = IG = G.$$
♦
5.3.2 Calculating the inverse of a matrix
Suppose A is a 2× 2 invertible matrix. Write the columns of $A^{-1}$ as x1, x2; then $AA^{-1} = I$ can be
written as
$$AA^{-1} = A(\mathbf{x}_1 \mid \mathbf{x}_2) = (\mathbf{e}_1 \mid \mathbf{e}_2),$$
where {e1, e2} is the standard basis for R2. Then x1 is the solution of Ax = e1 and x2 is the
solution of Ax = e2. We could find x1 by reducing (A|e1) to the reduced row-echelon form and
similarly for x2. This can be done simultaneously as shown in the following example.
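Example 3. Find the inverse of $A = \begin{pmatrix} 2 & 3 \\ 3 & 4 \end{pmatrix}$.
Solution.
$$(A \mid \mathbf{e}_1\ \mathbf{e}_2) = \left(\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 3 & 4 & 0 & 1 \end{array}\right) \xrightarrow{R_2 = R_2 - \frac{3}{2}R_1} \left(\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 0 & -\frac{1}{2} & -\frac{3}{2} & 1 \end{array}\right).$$
We further reduce the matrix to reduced row-echelon form:
$$\left(\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 0 & -\frac{1}{2} & -\frac{3}{2} & 1 \end{array}\right) \xrightarrow{R_2 = -2R_2} \left(\begin{array}{cc|cc} 2 & 3 & 1 & 0 \\ 0 & 1 & 3 & -2 \end{array}\right) \xrightarrow{R_1 = R_1 - 3R_2} \left(\begin{array}{cc|cc} 2 & 0 & -8 & 6 \\ 0 & 1 & 3 & -2 \end{array}\right) \xrightarrow{R_1 = \frac{1}{2}R_1} \left(\begin{array}{cc|cc} 1 & 0 & -4 & 3 \\ 0 & 1 & 3 & -2 \end{array}\right).$$
Hence $\mathbf{x}_1 = \begin{pmatrix} -4 \\ 3 \end{pmatrix}$ and $\mathbf{x}_2 = \begin{pmatrix} 3 \\ -2 \end{pmatrix}$, and $A^{-1} = \begin{pmatrix} -4 & 3 \\ 3 & -2 \end{pmatrix}$. ♦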
In general, if A is an invertible n × n matrix and the columns of the unique inverse, $A^{-1}$, are
x1, x2, . . . , xn, then xi is the unique solution of Ax = ei, where 1 ≤ i ≤ n. Here {e1, e2, . . . , en}
is the standard basis for Rn. Conversely if any equation Ax = ei does not have a unique solution,
then A is not invertible.
This suggests a method of finding the inverse of an (invertible) matrix A:
1. Form the augmented matrix (A | I) with n rows and 2n columns.
2. Use Gaussian elimination to convert (A | I) to row-echelon form (U | C). Then, if all entries
in the bottom row of U are zero, stop. In such circumstances A has no inverse.
3. Otherwise, use further row operations to reduce (U | C) to reduced row-echelon form (I | B).
The right hand half of this reduced row-echelon form is the inverse.
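The three steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not a definitive implementation: matrices are lists of rows with integer or rational entries, exact arithmetic is done with the standard `fractions` module, the function name `inverse` is hypothetical, and steps 2 and 3 are merged into a single Gauss-Jordan pass (each pivot column is cleared above and below the pivot at once).

```python
from fractions import Fraction


def inverse(A):
    """Return the inverse of the square matrix A, or None if A is not invertible."""
    n = len(A)
    # Step 1: form the augmented matrix (A | I) with n rows and 2n columns.
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look for a non-zero pivot in this column, at or below the diagonal.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                       # no pivot: A has no inverse
        M[col], M[pivot] = M[pivot], M[col]   # row swap
        p = M[col][col]
        M[col] = [x / p for x in M[col]]      # scale the pivot row to get a leading 1
        for r in range(n):                    # clear the rest of the column
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    # The left half is now I; the right half is the inverse.
    return [row[n:] for row in M]


print(inverse([[2, 3], [3, 4]]))                   # the matrix of Example 3
print(inverse([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))  # a matrix with no inverse
```

Exact fractions are used so that the zero test for a pivot is reliable; with floating-point entries that test would need a tolerance.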
The next proposition sums up the implications of these observations.
Proposition 5. A matrix A is invertible if and only if it can be reduced by elementary row
operations to an identity matrix I. If (A | I) can be reduced to (I | B), then $B = A^{-1}$.
Example 4. Determine if the matrix
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 1 & 2 \\ 1 & -1 & 1 \end{pmatrix}$$
is invertible and, if it is invertible, find its inverse.
Solution. We solve AX = I by forming an augmented matrix with the 3 columns of I on the
right, as in
$$(A \mid I) = \left(\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 2 & 1 & 2 & 0 & 1 & 0 \\ 1 & -1 & 1 & 0 & 0 & 1 \end{array}\right).$$
As usual in solving equations, we first use Gaussian elimination to reduce the augmented matrix
to row-echelon form. We obtain
$$(U \mid Y) = \left(\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & -3 & -4 & -2 & 1 & 0 \\ 0 & 0 & 2 & 1 & -1 & 1 \end{array}\right).$$
Because the bottom row of U contains a non-zero entry, A is invertible.
To obtain the inverse $X = A^{-1}$, we use further row operations to get U into reduced row-echelon
form and find
$$\left(\begin{array}{ccc|ccc} 1 & 0 & 0 & -\frac{1}{2} & \frac{5}{6} & -\frac{1}{6} \\ 0 & 1 & 0 & 0 & \frac{1}{3} & -\frac{2}{3} \\ 0 & 0 & 1 & \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \end{array}\right).$$
We now have an identity matrix on the left, so the solution X of the equations AX = I can be
read off immediately as the matrix on the right of the |. The solution is
$$A^{-1} = X = \begin{pmatrix} -\frac{1}{2} & \frac{5}{6} & -\frac{1}{6} \\ 0 & \frac{1}{3} & -\frac{2}{3} \\ \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \end{pmatrix}.$$
As a check, multiplication of A and the $A^{-1}$ obtained should give I. ♦
Example 5. Determine if
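$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{pmatrix}$$
is invertible and, if it is invertible, find its inverse.
Solution. The augmented matrix (A | I) is
$$(A \mid I) = \left(\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 4 & 5 & 6 & 0 & 1 & 0 \\ 7 & 8 & 9 & 0 & 0 & 1 \end{array}\right).$$
On reducing (A | I) to row-echelon form by Gaussian elimination, we have
$$(U \mid Y) = \left(\begin{array}{ccc|ccc} 1 & 2 & 3 & 1 & 0 & 0 \\ 0 & -3 & -6 & -4 & 1 & 0 \\ 0 & 0 & 0 & 1 & -2 & 1 \end{array}\right).$$
Here, all entries in the bottom row of U are zero, so A is not invertible. ♦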
5.3.3 Inverse of a 2× 2 matrix
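There is a simple formula for the inverse of an invertible 2× 2 matrix:
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}, \quad\text{provided } ad - bc \ne 0.$$
We can easily prove this by checking that the product
$$\frac{1}{ad - bc}\begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$
is I. This formula is useful and should be committed to memory.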
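The check mentioned above takes only one line in each order. As a worked sketch, using nothing beyond the row-times-column rule:

```latex
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}
= \begin{pmatrix} ad - bc & -ab + ba \\ cd - dc & -cb + da \end{pmatrix}
= (ad - bc)\, I ,
\qquad
\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}
\begin{pmatrix} a & b \\ c & d \end{pmatrix}
= \begin{pmatrix} da - bc & db - bd \\ -ca + ac & -cb + ad \end{pmatrix}
= (ad - bc)\, I .
```

Dividing both products by ad − bc, which is allowed because ad − bc ≠ 0, gives I in each case, so the stated matrix is a two-sided inverse.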
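Example 6. Find the inverse of $\begin{pmatrix} 2 & 3 \\ 3 & 4 \end{pmatrix}$ by the above formula.
Solution.
$$\begin{pmatrix} 2 & 3 \\ 3 & 4 \end{pmatrix}^{-1} = \frac{1}{2 \times 4 - 3 \times 3}\begin{pmatrix} 4 & -3 \\ -3 & 2 \end{pmatrix} = \begin{pmatrix} -4 & 3 \\ 3 & -2 \end{pmatrix}. \quad ♦$$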
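The 2 × 2 formula translates directly into code. The following Python sketch is illustrative only: it assumes integer (or `Fraction`) entries, and the function name `inverse_2x2` is hypothetical.

```python
from fractions import Fraction


def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]] by the formula, or None when ad - bc = 0."""
    (a, b), (c, d) = M
    det = a * d - b * c          # the quantity ad - bc from the formula
    if det == 0:
        return None              # the formula does not apply
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


print(inverse_2x2([[2, 3], [3, 4]]))   # the matrix of Example 6
print(inverse_2x2([[1, 2], [2, 4]]))   # ad - bc = 0, so None
```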
[X] 5.3.4 Elementary row operations and matrix multiplication
In this subsection we shall see that the three elementary row operations on a matrix A in Chapter 4
can be interpreted as multiplication of A by “elementary matrices”. As applications, we can use
this to prove that the algorithm to calculate the inverse of a matrix works and prove that the left
inverse of a square matrix is two-sided in the next subsection.
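Example 7.
1. If
$$E_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{pmatrix} \quad\text{and}\quad A = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix} \quad\text{then}\quad E_1A = \begin{pmatrix} a & b & c \\ 3d & 3e & 3f \\ g & h & i \end{pmatrix}$$
(check this), that is, the second row has been multiplied by 3.
2. If
$$E_2 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} a & b & c & d \\ e & f & g & h \\ i & j & k & l \\ m & n & o & p \end{pmatrix} \quad\text{then}\quad E_2B = \begin{pmatrix} a & b & c & d \\ e & f & g & h \\ m & n & o & p \\ i & j & k & l \end{pmatrix},$$
that is, the 3rd and 4th rows have been swapped.
3. If
$$E_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -5 & 0 & 1 \end{pmatrix} \quad\text{then}\quad E_3A = \begin{pmatrix} a & b & c \\ d & e & f \\ g - 5a & h - 5b & i - 5c \end{pmatrix},$$
that is, row three is replaced by row three minus five times row one. ♦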
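The same effect can be observed numerically. In the Python sketch below, the helper names `identity` and `mat_mul` and the numeric matrix A are assumptions made only for this illustration; E1 and E3 are the matrices of parts 1 and 3, while E2 is a 3 × 3 analogue of part 2 that swaps rows 2 and 3.

```python
def identity(n):
    """The n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def mat_mul(E, A):
    """Row-times-column product EA; assumes the sizes are compatible."""
    return [[sum(E[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))] for i in range(len(E))]


A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]   # arbitrary numeric matrix for the demonstration

E1 = identity(3)
E1[1][1] = 3                  # multiplies row 2 by 3

E2 = identity(3)
E2[1], E2[2] = E2[2], E2[1]   # swaps rows 2 and 3

E3 = identity(3)
E3[2][0] = -5                 # replaces row 3 by row 3 minus 5 times row 1

print(mat_mul(E1, A))
print(mat_mul(E2, A))
print(mat_mul(E3, A))
```

Each elementary matrix is obtained by applying the corresponding row operation to the identity matrix, which is why left multiplication by it repeats that operation on A.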
Proposition 6. The three elementary row operations can be effected by left multiplication by
matrices. These matrices are all invertible.
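Proof. Let
$$E_1 = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & \lambda & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix}, \quad E_2 = \begin{pmatrix} 1 & & & & & & \\ & \ddots & & & & & \\ & & 0 & \cdots & 1 & & \\ & & \vdots & \ddots & \vdots & & \\ & & 1 & \cdots & 0 & & \\ & & & & & \ddots & \\ & & & & & & 1 \end{pmatrix}, \quad E_3 = \begin{pmatrix} 1 & & & & & & \\ & \ddots & & & & & \\ & & 1 & & & & \\ & & \vdots & \ddots & & & \\ & & \lambda & \cdots & 1 & & \\ & & & & & \ddots & \\ & & & & & & 1 \end{pmatrix}.$$
Here E1 is the identity matrix with the (i, i) entry replaced by λ (with λ ≠ 0), E2 is the identity
matrix with rows i and j interchanged, so that its (i, i) and (j, j) entries are 0 and its (i, j) and
(j, i) entries are 1, and E3 is the identity matrix with the (j, i) entry replaced by λ.
The entries in the blank space are zeroes.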
As demonstrated by Example 7 it is easy to see that, when multiplied on the left of a matrix,
• E1 multiplies row i by λ
• E2 interchanges row i and row j
• E3 adds λ times row i to row j
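It is also clear that
$$E_1^{-1} = \begin{pmatrix} 1 & & & & \\ & \ddots & & & \\ & & \lambda^{-1} & & \\ & & & \ddots & \\ & & & & 1 \end{pmatrix}, \quad E_2^{-1} = E_2, \quad E_3^{-1} = \begin{pmatrix} 1 & & & & & & \\ & \ddots & & & & & \\ & & 1 & & & & \\ & & \vdots & \ddots & & & \\ & & -\lambda & \cdots & 1 & & \\ & & & & & \ddots & \\ & & & & & & 1 \end{pmatrix},$$
where $\lambda^{-1}$ is the (i, i) entry of $E_1^{-1}$ and −λ is the (j, i) entry of $E_3^{-1}$.
Indeed $E_1^{-1}$, $E_2^{-1}$ and $E_3^{-1}$ “undo” the operations done by E1, E2 and E3.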
Theorem 7. If an augmented matrix (A | I) is reduced to a matrix of the form (I | X) then X is
a left inverse of A.
Proof. In the reduction (A | I) → (I | X), each step is an elementary row operation. By
Proposition 6 each row operation is also left multiplication by a matrix of type E1, E2 or E3. So
there is a sequence of matrices S1, · · · , Sk, each of which is of type E1, E2 or E3, such that
Sk Sk−1 · · · S1A = I.
However, the same operations are performed on I, yielding X, so we have
Sk Sk−1 · · ·S1 I = X.
Hence XA = I, that is, X is a left inverse of A.
Theorem 8. The left inverse of A in Theorem 7 is also a right inverse of A.
Proof. Continuing from the proof of Theorem 7, X = Sk Sk−1 · · ·S1 is a left inverse of A. By
Proposition 6, S1, . . . , Sk are invertible with inverses $S_1^{-1}, \ldots, S_k^{-1}$. Let $B = S_1^{-1} S_2^{-1} \cdots S_k^{-1}$.
Now
$$XB = S_k S_{k-1} \cdots S_2 S_1 S_1^{-1} S_2^{-1} \cdots S_k^{-1} = I$$
and $BX = S_1^{-1} S_2^{-1} \cdots S_k^{-1} S_k S_{k-1} \cdots S_2 S_1 = I$. But XA = I also, hence
$$B = BI = B(XA) = (BX)A = IA = A.$$
Thus B is both a left and a right inverse of X, and so is A, since A = B. In particular AX = I, so X actually is $A^{-1}$.
Corollary 9. The algorithm described in 5.3.2 does work. That is, the matrix calculated by the
algorithm is the inverse of the given matrix.
5.3.5 Inverses and solution of Ax = b
Proposition 10. If A is a square matrix, then A is invertible if and only if Ax = 0 implies x = 0.
Proof. Suppose A is invertible and Ax = 0 then A−1Ax = 0, Ix = 0, x = 0.
Conversely suppose Ax = 0 implies x = 0 then the reduced echelon form of A is the identity matrix
and, as discussed in 5.3.4, A−1 exists.
Proposition 11. Let A be an n× n square matrix. Then A is invertible if and only if the matrix
equation Ax = b has a unique solution for all vectors b ∈ Rn. In this case, the unique solution is
x = A−1b.
Proof. Suppose A is invertible and Ax = b then A−1Ax = A−1b, Ix = A−1b, x = A−1b.
Conversely suppose Ax = b has a unique solution for all vectors b. Then, in particular, Ax = 0
has a unique solution, and this is clearly x = 0. Hence, by Proposition 10, A is invertible.
Note. In principle, this proposition can be used to obtain a solution of Ax = b by finding an
inverse A−1 and then forming x = A−1b. However, there are several serious practical problems
which occur if the proposition is used in this way. Some of these problems are as follows.
1. To obtain the inverse of an n × n matrix A it is necessary to solve n equations of the form
Axj = ej . Even with the use of the most efficient available method (the LU -factorisation
method), the calculation of an inverse takes longer than the solution of the original equation
Ax = b.
2. It is very easy to forget that many matrices do not have inverses. It is then very easy to make
the mistake of saying that the ‘solution’ of Ax = b is x = A−1b when, in fact, the equations
might actually have no solution or might have an infinite number of solutions.
3. The Gaussian elimination and back-substitution method of solving equations always works,
whereas the x = A−1b formula only works in the special case that Ax = b has a unique
solution.
4. In large-scale numerical calculations, use of the inverse may produce more ‘rounding error’.
However, the propositions can be used to prove some important results.
Corollary 12. Let A be a square matrix.
1. If X is a left inverse of A, then X is the (two-sided) inverse of A. That is,
if XA = I, then X = A−1.
2. If X is a right inverse of A, then X is the (two-sided) inverse of A.
Proof. For part 1, assume Ax = 0. Since X is a left inverse of A,
x = Ix = (XA)x = X(Ax) = X0 = 0.
Then by Proposition 10, A is invertible. Hence
XA = I ⇒ XAA−1 = IA−1 ⇒ X = A−1.
We can use similar arguments to prove part 2.
5.4 Determinants
In this section we shall define the determinant function. If A is a square matrix, then the determi-
nant of A, written det(A), is a number. Determinants arise in many areas of mathematics. As we
shall see in Chapter 5, they can be used to find the volume of a parallelepiped. Determinants are
very important in the theory of eigenvalues and eigenvectors, covered in Chapter 8. In later years
you will need to calculate determinants to perform changes of variables in multivariable calculus.
From a theoretical point of view, the determinant is also important in determining whether
a system of n linear equations in n unknowns has a solution for every right-hand-side vector, or
equivalently, whether an n× n matrix A is invertible. As we shall see, A is invertible if and only if
det(A) ≠ 0. Unfortunately, calculating det(A), even by the most efficient methods, takes the same
length of time as finding the row-echelon form for A, so determinants are not used for numerical
calculations of this kind. On the other hand, many general statements can be proved by using the
properties of the determinant function.
Determinants are defined only for square matrices. We shall give definitions for arbitrary-
size n× n matrices, but most of our examples will be restricted to determinants of 2× 2 and 3× 3
matrices. The case of 1 × 1 matrices is included for completeness. An alternative notation for
det(A) is |A|.
5.4.1 The definition of a determinant
The determinant of a 1 × 1 matrix A is defined to be its sole entry. As the notation |A| tends to
cause confusion with the absolute value, it should be used with caution for 1 × 1 matrices. Thus
|(−1)| = −1 whilst | − 1| = 1.
There are several ways of defining determinants, each of which has advantages and disadvan-
tages. We use a recursive definition which begins with the definition of a 2× 2 determinant. Here
‘recursive’ indicates that the definition of a determinant for square matrices of a given size invokes
that for matrices of a smaller size.
Definition 1. The determinant of a 2 × 2 matrix
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \]
is $\det(A) = a_{11}a_{22} - a_{12}a_{21}$.
Example 1. If $A = \begin{pmatrix} 2 & -4 \\ -3 & 4 \end{pmatrix}$
then $|A| = \det(A) = 2(4) - (-4)(-3) = -4$. ♦
The definition of the determinant of a general n × n matrix can be built up recursively from the
definition of the determinant of a 2× 2 matrix.
We first define the concept of a minor of an entry of a matrix.
Definition 2. For a matrix A, the (row i, column j) minor is the determinant of
the matrix obtained from A by deleting row i and column j from A.
Notation. We shall use the symbol |Aij | to represent the (row i, column j) minor in a matrix A.
Example 2. The minor of the (row 2, column 3) entry in
\[ A = \begin{pmatrix} 2 & -3 & 4 \\ 8 & -1 & -4 \\ 6 & 5 & -7 \end{pmatrix} \]
is, on deleting row 2 and column 3,
\[ |A_{23}| = \begin{vmatrix} 2 & -3 \\ 6 & 5 \end{vmatrix} = 28. \]
♦
Definition 3. The determinant of an n × n matrix A is
\[
|A| = a_{11}|A_{11}| - a_{12}|A_{12}| + a_{13}|A_{13}| - a_{14}|A_{14}| + \cdots + (-1)^{1+n} a_{1n}|A_{1n}|
= \sum_{k=1}^{n} (-1)^{1+k} a_{1k}|A_{1k}|.
\]
Note that each term in the definition of the determinant |A| is a product of an entry in the first
row of A with its corresponding minor. The signs of the terms alternate +, −, +, −, . . . starting
with a + on the (row 1, column 1) entry. The formula is called “expanding along the first row of
the determinant.”
Example 3. Evaluate the determinant
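\[ \det(A) = \begin{vmatrix} 5 & 1 & 7 \\ -2 & 3 & -4 \\ 6 & -1 & 2 \end{vmatrix}. \]
Solution.
\[
\det(A) = 5 \begin{vmatrix} 3 & -4 \\ -1 & 2 \end{vmatrix}
- 1 \begin{vmatrix} -2 & -4 \\ 6 & 2 \end{vmatrix}
+ 7 \begin{vmatrix} -2 & 3 \\ 6 & -1 \end{vmatrix}
= 5(6 - 4) - (-4 + 24) + 7(2 - 18) = -122.
\]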
♦
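The recursive nature of Definition 3 translates directly into a short recursive function. The following Python sketch is an illustration only (the helper names `submatrix` and `det` are assumptions, and indices start at 0); it expands along the first row exactly as in the definition.

```python
def submatrix(A, i, j):
    """Matrix obtained from A by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant by expansion along the first row (Definition 3)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    if n == 2:
        return A[0][0] * A[1][1] - A[0][1] * A[1][0]
    # (-1)**k with 0-based k has the same sign as (-1)**(1 + k) with 1-based k.
    return sum((-1) ** k * A[0][k] * det(submatrix(A, 0, k)) for k in range(n))

example_1 = det([[2, -4], [-3, 4]])
example_2_minor = det(submatrix([[2, -3, 4], [8, -1, -4], [6, 5, -7]], 1, 2))
example_3 = det([[5, 1, 7], [-2, 3, -4], [6, -1, 2]])
```

The three values reproduce Examples 1, 2 and 3 respectively.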
5.4.2 Properties of determinants
In this section we present some of the basic properties of determinants. Where it is possible to do
so in an efficient manner by elementary methods, we give proofs of the properties for the general
n×n case. For properties where the general proof requires more advanced methods, we give proofs
for the 2× 2 or 3× 3 cases instead.
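Proposition 1. $\det(A^T) = \det(A)$.
Proof. For the 3 × 3 case. If
\[
A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix}
\quad\text{then}\quad
A^T = \begin{pmatrix} a_{11} & a_{21} & a_{31} \\ a_{12} & a_{22} & a_{32} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}.
\]
By direct evaluation of $\det(A)$ and $\det(A^T)$ from Definition 3, we obtain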
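\[
\begin{aligned}
\det(A) &= a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
+ a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix} \\
&= a_{11}a_{22}a_{33} - a_{11}a_{23}a_{32} - a_{12}a_{21}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31}
\end{aligned}
\]
and
\[
\begin{aligned}
\det(A^T) &= a_{11} \begin{vmatrix} a_{22} & a_{32} \\ a_{23} & a_{33} \end{vmatrix}
- a_{21} \begin{vmatrix} a_{12} & a_{32} \\ a_{13} & a_{33} \end{vmatrix}
+ a_{31} \begin{vmatrix} a_{12} & a_{22} \\ a_{13} & a_{23} \end{vmatrix} \\
&= a_{11}a_{22}a_{33} - a_{11}a_{32}a_{23} - a_{21}a_{12}a_{33} + a_{21}a_{32}a_{13} + a_{31}a_{12}a_{23} - a_{31}a_{22}a_{13}.
\end{aligned}
\]
These two expressions are equal, and hence $\det(A) = \det(A^T)$.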
One immediate consequence of Proposition 1 is that a determinant can also be evaluated by
‘expansion down the first column’, that is
\[
|A| = a_{11}|A_{11}| - a_{21}|A_{21}| + a_{31}|A_{31}| - a_{41}|A_{41}| + \cdots + (-1)^{n+1} a_{n1}|A_{n1}|
= \sum_{k=1}^{n} (-1)^{k+1} a_{k1}|A_{k1}|.
\]
Example 4. Use expansion along the first column to evaluate the determinant of Example 3.
Solution. Expanding along the first column, we have
\[
\det(A) = 5 \begin{vmatrix} 3 & -4 \\ -1 & 2 \end{vmatrix}
- (-2) \begin{vmatrix} 1 & 7 \\ -1 & 2 \end{vmatrix}
+ 6 \begin{vmatrix} 1 & 7 \\ 3 & -4 \end{vmatrix}
= 5(6 - 4) + 2(2 + 7) + 6(-4 - 21) = -122.
\]
♦
Proposition 2. If any two rows (or any two columns) of A are interchanged, then the sign of
the determinant is reversed. More precisely if the matrix B is obtained from the matrix A by
interchanging two rows (or columns), then detB = −detA.
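Proof. For the 2 × 2 case. Let
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}. \]
Then, on interchanging columns 1 and 2 of A,
\[ \det \begin{pmatrix} a_{12} & a_{11} \\ a_{22} & a_{21} \end{pmatrix} = a_{12}a_{21} - a_{11}a_{22} = -\det(A). \]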
One important application of this proposition is that a determinant can be evaluated by
expanding along any row or any column, with the signs chosen from the following array.
\[
\begin{pmatrix}
+ & - & + & - & \cdots \\
- & + & - & + & \cdots \\
+ & - & + & - & \cdots \\
- & + & - & + & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}.
\]
If we evaluate the determinant by expanding along the ith row or the jth column, where 1 ≤ i, j ≤ n,
we have
\[
|A| = \sum_{k=1}^{n} (-1)^{i+k} a_{ik}|A_{ik}| = \sum_{k=1}^{n} (-1)^{k+j} a_{kj}|A_{kj}|.
\]
Example 5. Use expansion along the second row to evaluate det(A) of the previous two examples.
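Solution. To evaluate det(A) by expanding along the second row, we choose the signs from the
second row of
\[ \begin{pmatrix} + & - & + \\ - & + & - \\ + & - & + \end{pmatrix}. \]
So
\[
\det(A) = -(-2) \begin{vmatrix} 1 & 7 \\ -1 & 2 \end{vmatrix}
+ 3 \begin{vmatrix} 5 & 7 \\ 6 & 2 \end{vmatrix}
- (-4) \begin{vmatrix} 5 & 1 \\ 6 & -1 \end{vmatrix} = -122.
\]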
♦
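That every row and every column gives the same value can be checked for the matrix of Examples 3 to 5. This is a small Python sketch under the assumption of 0-based indices; the function names are hypothetical and chosen for this illustration.

```python
def submatrix(A, i, j):
    """Matrix obtained from A by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant by expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** k * A[0][k] * det(submatrix(A, 0, k)) for k in range(len(A)))

def expand_along_row(A, i):
    """Sum of (-1)^(i+k) a_ik |A_ik| over k; the sign pattern is the array above."""
    return sum((-1) ** (i + k) * A[i][k] * det(submatrix(A, i, k)) for k in range(len(A)))

def expand_along_column(A, j):
    """Sum of (-1)^(k+j) a_kj |A_kj| over k."""
    return sum((-1) ** (k + j) * A[k][j] * det(submatrix(A, k, j)) for k in range(len(A)))

A = [[5, 1, 7], [-2, 3, -4], [6, -1, 2]]
row_values = [expand_along_row(A, i) for i in range(3)]
column_values = [expand_along_column(A, j) for j in range(3)]
```

All six entries of `row_values` and `column_values` agree with the value found in Example 3.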
Obviously, we should evaluate a determinant by expanding along the row or column containing
the greatest number of zeros. We will soon see an even better method.
A second application of Proposition 2 is as follows.
Proposition 3. If a matrix contains a zero row or column then its determinant is zero.
Proof. Clearly, if we evaluate the determinant by expansion along the zero row or column, the value
obtained will be zero.
Most of the remaining important properties of determinants centre around the question of what
happens to the value of the determinant when one column (or row) of a matrix is changed, with
all other columns (or rows) remaining unchanged.
Proposition 4. If a row (or column) of A is multiplied by a scalar, then the value of detA is
multiplied by the same scalar. That is, if the matrix B is obtained from the matrix A by multiplying
a row (or column) of A by the scalar λ, then detB = λdetA.
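Proof. On multiplying the first row of
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
\]
by a scalar λ, we obtain
\[
B = \begin{pmatrix}
\lambda a_{11} & \lambda a_{12} & \cdots & \lambda a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}.
\]
Then, on expanding along the first row, we have
\[
\det(B) = \sum_{k=1}^{n} (-1)^{1+k} \lambda a_{1k} |A_{1k}| = \lambda \sum_{k=1}^{n} (-1)^{1+k} a_{1k} |A_{1k}| = \lambda \det(A),
\]
and hence the result is proved for a multiple of the first row. Then, from Propositions 1 and 2, the
result is also true for a scalar multiple of any row or column.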
An immediate consequence of Propositions 2 and 4 is the following useful result.
Proposition 5. If any column of a matrix A is a multiple of another column of A (or any
row is a multiple of another row), then the value of det(A) is zero.
Proof. The result is clearly true if the scalar multiple is zero.
Now, assume that the ith column of a matrix A is λ times the jth column. Then, if we multiply
the ith column by λ−1 we obtain a matrix B, whose determinant, from Proposition 4, has the value
det(B) = λ−1 det(A).
This matrix B has its ith and jth columns equal. On interchanging columns i and j, we have,
from Proposition 2, that the determinant of the new matrix has the value − det(B). But, inter-
changing equal columns does not change the matrix, so the new matrix is still B with determinant
det(B). Hence, det(B) = − det(B), and thus det(B) = 0 and det(A) = 0.
The proof of the proposition for rows follows immediately from Proposition 1.
Example 6.
\[ \begin{vmatrix} 1 & 2 & 3 \\ 2 & -5 & 7 \\ -3 & -6 & -9 \end{vmatrix} = 0 \]
as row 3 is a scalar multiple of row 1. This result can easily be checked directly by using expansion
along the second row to evaluate the determinant. ♦
Propositions 2 and 4 show the effect of the two elementary row operations of interchanging two
rows and of multiplying a row by a scalar. The effect of the third elementary row operation of
adding a multiple of one row to another row is given in the following proposition.
Proposition 6. If a multiple of one row (or column) is added to another row (or column), then
the value of the determinant is not changed.
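[X] Proof. On adding λ times row i to row 1 of
\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix},
\quad\text{we obtain}\quad
B = \begin{pmatrix}
a_{11} + \lambda a_{i1} & a_{12} + \lambda a_{i2} & \cdots & a_{1n} + \lambda a_{in} \\
\vdots & \vdots & \ddots & \vdots \\
a_{i1} & a_{i2} & \cdots & a_{in} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}.
\]
Then, on expanding along the first row, we have
\[
\begin{aligned}
\det(B) &= \sum_{k=1}^{n} (-1)^{1+k} (a_{1k} + \lambda a_{ik}) |A_{1k}| \\
&= \sum_{k=1}^{n} (-1)^{1+k} a_{1k} |A_{1k}| + \lambda \sum_{k=1}^{n} (-1)^{1+k} a_{ik} |A_{1k}| \\
&= \det(A) + \lambda \sum_{k=1}^{n} (-1)^{1+k} a_{ik} |A_{1k}|.
\end{aligned}
\]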
The last sum is the determinant of the matrix which is obtained from A by replacing the first row
of A by the ith row of A. Thus, the first and ith rows of this matrix are the same, and hence, from
Proposition 5, its determinant is zero. Thus the second sum in the above equation is zero, and
hence det(B) = det(A). We have therefore proved the result for the addition of a scalar multiple of
any other row to the first row. Then, from Propositions 1 and 2, the result is true for any two rows or
any two columns.
The final useful proposition that we shall give here is as follows.
Proposition 7. If A and B are square matrices such that the product AB exists, then
det(AB) = det(A) det(B).
All known proofs of this result are non–trivial, so we shall not give a proof here.
Example 7. For
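\[
A = \begin{pmatrix} 2 & -4 \\ -3 & 5 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} -1 & -8 \\ 2 & 9 \end{pmatrix},
\]
we have
\[
|A| = -2, \quad |B| = 7, \quad |AB| = \begin{vmatrix} -10 & -52 \\ 13 & 69 \end{vmatrix} = -14 = |A||B|.
\]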
♦
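Propositions 2, 4, 6 and 7 can be spot-checked numerically. The sketch below is in Python, which is not the language of these notes; the variable names are illustrative assumptions. It applies one row operation of each kind to the matrix of Example 3 and multiplies the two matrices of Example 7. A check on particular matrices is of course not a proof.

```python
def submatrix(A, i, j):
    """Matrix obtained from A by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant by expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** k * A[0][k] * det(submatrix(A, 0, k)) for k in range(len(A)))

def matmul(P, Q):
    """Matrix product PQ for matrices stored as lists of rows."""
    return [[sum(P[r][k] * Q[k][c] for k in range(len(Q)))
             for c in range(len(Q[0]))] for r in range(len(P))]

M = [[5, 1, 7], [-2, 3, -4], [6, -1, 2]]                      # matrix of Example 3
swapped = [M[1], M[0], M[2]]                                  # interchange rows 1 and 2
scaled = [[3 * x for x in M[0]], M[1], M[2]]                  # multiply row 1 by 3
added = [M[0], M[1], [z + 5 * x for x, z in zip(M[0], M[2])]] # add 5 * row 1 to row 3

A = [[2, -4], [-3, 5]]                                        # matrices of Example 7
B = [[-1, -8], [2, 9]]
AB = matmul(A, B)
```

Comparing `det(swapped)`, `det(scaled)` and `det(added)` with `det(M)` gives a sign change, a factor of 3 and no change respectively, and `det(AB)` equals `det(A) * det(B)`.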
5.4.3 The efficient numerical evaluation of determinants
With the exception of 2 × 2 and 3 × 3 matrices, the direct evaluation of a determinant from
Definition 3 is a lengthy procedure. For example, the value of an n × n determinant is given in
terms of n minors of size (n− 1)× (n − 1), each of which is given in terms of n− 1 minors of size
(n − 2) × (n − 2), and so on, ending with the evaluation of 2 × 2 minors. In fact, the
total number of terms to be evaluated is n!. Even for a small number such as n = 6, this is already
6! = 720 terms.
With the exception of the 2× 2 and 3× 3 cases, by far the most efficient method of evaluating
determinants of matrices with numerical entries is to use Gaussian elimination to reduce the matrix
to row-echelon form. The basic results required for this efficient method of evaluation are given in
the following two propositions.
Proposition 8. If U is a square row-echelon matrix, then det(U) is equal to the product of the
diagonal entries of U .
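Proof. Let
\[
U = \begin{pmatrix}
u_{11} & u_{12} & \cdots & u_{1n} \\
0 & u_{22} & \cdots & u_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & u_{nn}
\end{pmatrix}
\]
be an n × n matrix in row-echelon form. Then the proposition states that
\[ \det(U) = u_{11} u_{22} \cdots u_{nn}. \]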
We shall prove this result by induction. We first introduce some notation.
Let U(j) be the submatrix of U which is defined by
\[
U(j) = \begin{pmatrix}
u_{11} & u_{12} & \cdots & u_{1j} \\
0 & u_{22} & \cdots & u_{2j} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & u_{jj}
\end{pmatrix}.
\]
Note that U(j) is a square row-echelon form matrix for 1 ≤ j ≤ n, with $U(1) = (u_{11})$, $\det(U(1)) = u_{11}$ and U(n) = U.
We shall now prove that if the proposition is true for $\det(U(j))$, then it is also true for $\det(U(j+1))$. Now,
\[
\det(U(j+1)) = \begin{vmatrix}
u_{11} & u_{12} & \cdots & u_{1,j+1} \\
0 & u_{22} & \cdots & u_{2,j+1} \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & u_{j+1,j+1}
\end{vmatrix}.
\]
We can evaluate $\det(U(j+1))$ by expanding along the last row, whose only possibly non-zero entry is $u_{j+1,j+1}$, and we obtain
\[
\det(U(j+1)) = (-1)^{2(j+1)} u_{j+1,j+1} \det(U(j)) = u_{j+1,j+1} \det(U(j)).
\]
If we now assume that the proposition is true for $\det(U(j))$, we can replace $\det(U(j))$ by the
product of its diagonal entries to obtain
\[
\det(U(j+1)) = u_{11} \cdots u_{jj} u_{j+1,j+1}.
\]
Hence, if the result is true for $\det(U(j))$ it is also true for $\det(U(j+1))$.
However, the result is true for j = 1, since in this case we have $\det(U(1)) = u_{11}$. It is therefore
true for j = 2, 3, . . ., and the proposition is proven for all n ≥ 1 by induction.
Example 8.
\[
\begin{vmatrix}
3 & 2 & 9 & 3 & 12 \\
0 & -4 & -5 & 6 & 2 \\
0 & 0 & 6 & -7 & -3 \\
0 & 0 & 0 & 5 & 20 \\
0 & 0 & 0 & 0 & 8
\end{vmatrix}
= (3)(-4)(6)(5)(8) = -2880.
\]
♦
Proposition 9. If A is a square matrix and U is an equivalent row-echelon form obtained from
A by Gaussian elimination using row interchanges and adding a multiple of one row to another,
then det(A) = ε det(U), where ε = +1 if an even number of row interchanges have been made, and
ε = −1 if an odd number of row interchanges have been made.
Proof. From Proposition 2, each row interchange reverses the sign of the determinant. Thus the
sign is unchanged if 0, 2, 4, . . . row interchanges have been made, whereas the sign is reversed if
1, 3, 5 . . . row interchanges have been made.
From Proposition 6, the value of the determinant is unchanged if a multiple of one row is added
to another row. Thus row addition operations do not change the value of the determinant. The
proof is complete.
Propositions 8 and 9 show that the value of a determinant can be found by the highly efficient
Gaussian elimination method.
Example 9. Evaluate the determinant of
\[
A = \begin{pmatrix}
1 & -1 & 0 & 3 \\
2 & -2 & 6 & -1 \\
4 & -2 & 1 & 7 \\
3 & 5 & -7 & 2
\end{pmatrix}.
\]
Solution. The first leading element for Gaussian elimination is already in the first row, so no row
interchange is required. Hence, at this stage, ε = 1. Adding multiples of the first row to subsequent
rows does not change the value of the determinant. As is usual in Gaussian elimination, the multiples
are chosen to make the entries below the leading element all zero. After partly reducing the original
matrix we have:
\[
|A| = \begin{vmatrix}
1 & -1 & 0 & 3 \\
0 & 0 & 6 & -7 \\
0 & 2 & 1 & -5 \\
0 & 8 & -7 & -7
\end{vmatrix}.
\]
The second leading element is in row 3. We therefore interchange rows 2 and 3 to bring the
second pivot element to its required row 2 position. This interchange changes the sign of the
determinant, and hence we now have ε = −1. After interchanging rows 2 and 3 and then adding
suitable multiples of the new row 2 to the new row 3 and row 4, we have
\[
|A| = (-1) \begin{vmatrix}
1 & -1 & 0 & 3 \\
0 & 2 & 1 & -5 \\
0 & 0 & 6 & -7 \\
0 & 0 & -11 & 13
\end{vmatrix}.
\]
The third pivot element is in its correct row 3 position, so no row interchange is required. Hence
ε remains at its previous value of −1. After further reducing the matrix we have
\[
|A| = (-1) \begin{vmatrix}
1 & -1 & 0 & 3 \\
0 & 2 & 1 & -5 \\
0 & 0 & 6 & -7 \\
0 & 0 & 0 & \tfrac{1}{6}
\end{vmatrix}
= (-1)\left(1 \times 2 \times 6 \times \tfrac{1}{6}\right) = -2.
\]
The experienced evaluator of determinants will discard entries in these matrices as the calculation
progresses. This leads to the shortened calculation:
\[
\begin{vmatrix}
1 & -1 & 0 & 3 \\
2 & -2 & 6 & -1 \\
4 & -2 & 1 & 7 \\
3 & 5 & -7 & 2
\end{vmatrix}
= \begin{vmatrix}
1 & -1 & 0 & 3 \\
0 & 0 & 6 & -7 \\
0 & 2 & 1 & -5 \\
0 & 8 & -7 & -7
\end{vmatrix}
= (-1) \begin{vmatrix}
2 & 1 & -5 \\
0 & 6 & -7 \\
0 & -11 & 13
\end{vmatrix}
= -2 \begin{vmatrix} 6 & -7 \\ -11 & 13 \end{vmatrix} = -2.
\]
♦
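The method of Propositions 8 and 9 can be sketched as follows. This is a minimal Python illustration using exact fractions, not a production routine; the name `det_by_elimination` is an assumption made for this example. It uses only row interchanges and row additions, tracks the sign ε, and multiplies the diagonal entries at the end.

```python
from fractions import Fraction

def det_by_elimination(A):
    """Determinant via Gaussian elimination (Propositions 8 and 9)."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    sign = 1                                   # the factor epsilon of Proposition 9
    for col in range(n):
        # Look for a non-zero entry at or below the diagonal in this column.
        pivot = next((r for r in range(col, n) if U[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)                 # a zero diagonal entry in U gives det = 0
        if pivot != col:
            U[col], U[pivot] = U[pivot], U[col]
            sign = -sign                       # each interchange reverses the sign
        for r in range(col + 1, n):
            f = U[r][col] / U[col][col]
            # Adding a multiple of one row to another leaves det unchanged.
            U[r] = [x - f * y for x, y in zip(U[r], U[col])]
    product = Fraction(sign)
    for k in range(n):
        product *= U[k][k]                     # product of the diagonal entries
    return product

example_9 = det_by_elimination([[1, -1, 0, 3],
                                [2, -2, 6, -1],
                                [4, -2, 1, 7],
                                [3, 5, -7, 2]])
```

For the matrix of Example 9 the routine performs one row interchange, as in the worked solution, and returns the same value.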
Although any matrix can be reduced to row-echelon form using only row interchanges and row
additions, the third row operation, multiplying a row by a scalar, is often helpful, especially when
the calculations are done by hand. This row operation exploits the result of Proposition 4. We illustrate
the technique by the following example.
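[X] Example 10. Factorise
\[ \begin{vmatrix} 1 & a^2 & a^3 \\ 1 & b^2 & b^3 \\ 1 & c^2 & c^3 \end{vmatrix}. \]
Solution.
\[
\begin{aligned}
\begin{vmatrix} 1 & a^2 & a^3 \\ 1 & b^2 & b^3 \\ 1 & c^2 & c^3 \end{vmatrix}
&= \begin{vmatrix} 1 & a^2 & a^3 \\ 0 & b^2 - a^2 & b^3 - a^3 \\ 0 & c^2 - a^2 & c^3 - a^3 \end{vmatrix}
&& R_2 = R_2 - R_1,\ R_3 = R_3 - R_1 \\
&= (b-a)(c-a) \begin{vmatrix} 1 & a^2 & a^3 \\ 0 & a+b & a^2 + ab + b^2 \\ 0 & a+c & a^2 + ac + c^2 \end{vmatrix}
&& \text{Proposition 4: factorise } (b-a) \text{ from } R_2,\ (c-a) \text{ from } R_3 \\
&= (b-a)(c-a) \begin{vmatrix} 1 & a^2 & a^3 \\ 0 & a+b & a^2 + ab + b^2 \\ 0 & c-b & a(c-b) + c^2 - b^2 \end{vmatrix}
&& R_3 = R_3 - R_2 \\
&= (b-a)(c-a)(c-b) \begin{vmatrix} 1 & a^2 & a^3 \\ 0 & a+b & a^2 + ab + b^2 \\ 0 & 1 & a+b+c \end{vmatrix}
&& \text{Proposition 4: factorise } (c-b) \text{ from } R_3 \\
&= (b-a)(c-a)(c-b)\left[(a+b)(a+b+c) - (a^2 + ab + b^2)\right]
&& \text{expand along the first column} \\
&= (a-b)(b-c)(c-a)(ab + bc + ca).
\end{aligned}
\]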
♦
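A factorisation such as the one in Example 10 is easy to sanity-check by substituting integers. The following Python sketch is an illustration only, with function names that are assumptions made for this check; it compares the determinant with the factorised form for a range of integer values of a, b and c.

```python
def det3(M):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def determinant_form(a, b, c):
    """The determinant of Example 10 evaluated at particular a, b, c."""
    return det3([[1, a ** 2, a ** 3],
                 [1, b ** 2, b ** 3],
                 [1, c ** 2, c ** 3]])

def factorised_form(a, b, c):
    """The factorisation obtained at the end of Example 10."""
    return (a - b) * (b - c) * (c - a) * (a * b + b * c + c * a)

# Compare the two forms on every integer triple with entries from -3 to 3.
values = range(-3, 4)
all_agree = all(determinant_form(a, b, c) == factorised_form(a, b, c)
                for a in values for b in values for c in values)
```

Agreement on a grid of integers does not replace the algebra, but a sign or factor error in the hand calculation would show up immediately.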
5.4.4 Determinants and solutions of Ax = b
The propositions of Section 5.4.3 can be used to establish an important relation between the value
of det(A) and the solution of Ax = b for a square matrix A. The basic result is as follows.
Proposition 10. Let A be an n× n matrix.
1. If det(A) ≠ 0, the equation Ax = b has a solution and the solution is unique for all b ∈ Rn.
2. If det(A) = 0, the equation Ax = b either has no solution or an infinite number of solutions
for a given b.
Proof.
CASE 1. det(A) ≠ 0. From Propositions 8 and 9, det(A) ≠ 0 implies that all diagonal entries of
an equivalent row-echelon matrix U for A are non-zero. Then, since A, and hence U , is a square
matrix, U has no zero rows and no non-leading columns. As there are no zero rows, the equations
have a solution for all b, and, as there are no non-leading columns, each solution is unique.
CASE 2. det(A) = 0. In this case, U contains at least one zero diagonal entry, so at least one zero
row and at least one non-leading column. Thus there is a non-zero solution to Ax = 0. The result
follows from Proposition 5 of Section 4.7.
Note. In the case when det(A) ≠ 0 there is an explicit formula, called Cramer’s rule, available for
the solution of Ax = b. Each entry xi of x is specified as the quotient of two determinants.
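Example 11. Describe the type of solution for each of the following systems.
\[
\begin{cases}
2x_1 + 3x_2 = b_1 \\
4x_1 + 6x_2 = b_2
\end{cases}
\quad\text{and}\quad
\begin{cases}
x_1 - x_2 + 3x_3 = b_1 \\
2x_1 + 3x_2 + x_3 = b_2 \\
3x_1 + x_2 + 4x_3 = b_3
\end{cases}
\]
Solution. The determinants of the coefficient matrices of the equations are
\[
|A| = \begin{vmatrix} 2 & 3 \\ 4 & 6 \end{vmatrix} = 0
\quad\text{and}\quad
|B| = \begin{vmatrix} 1 & -1 & 3 \\ 2 & 3 & 1 \\ 3 & 1 & 4 \end{vmatrix} = -5 \neq 0.
\]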
Hence, depending on the values of b1 and b2, the first system of equations either has no solution or
an infinite number of solutions, whereas the second system of equations has a unique solution for
all b. ♦
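Cramer's rule, mentioned in the Note before this example, can be sketched as follows. The standard statement assumed here is that $x_i = \det(A_i)/\det(A)$, where $A_i$ is $A$ with column $i$ replaced by $b$; the notes do not state the formula in this section, so treat the code as an illustration under that assumption. The function names and the right-hand side chosen below are hypothetical.

```python
from fractions import Fraction

def submatrix(A, i, j):
    """Matrix obtained from A by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(A) if r != i]

def det(A):
    """Determinant by expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** k * A[0][k] * det(submatrix(A, 0, k)) for k in range(len(A)))

def cramer_solve(A, b):
    """Solve Ax = b by Cramer's rule for integer data; None if det(A) = 0."""
    d = det(A)
    if d == 0:
        return None        # no solution or infinitely many, by Proposition 10
    solution = []
    for i in range(len(A)):
        # A_i is A with column i replaced by the right-hand side b.
        A_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det(A_i), d))
    return solution

B = [[1, -1, 3], [2, 3, 1], [3, 1, 4]]     # second system of Example 11
b = [4, 6, 8]                              # an arbitrary right-hand side
x = cramer_solve(B, b)
first_system = cramer_solve([[2, 3], [4, 6]], [1, 2])
```

For the first system the function returns `None`, reflecting the fact that the determinant alone cannot distinguish between no solution and infinitely many.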
Note. With the possible exception of 2 × 2 and 3 × 3 matrices, Proposition 10 does not provide
a practical method of determining the number of solutions of Ax = b, because the most efficient
method for evaluating det(A) is first to use Gaussian elimination to reduce A to row-echelon form
U and then to multiply diagonal entries of U . However, as we have seen before, the number of
solutions that an equation Ax = b has can be seen directly from the row-echelon form itself,
without the need for any further calculation.
Proposition 10 leads immediately to two results which are sometimes useful.
Proposition 11. For a square matrix A, the homogeneous system of equations Ax = 0 has a
non-zero solution if and only if det(A) = 0.
Proof. Ax = 0 always has x = 0 as a solution. From Proposition 10, if det(A) ≠ 0, then x = 0 is the
unique solution, whereas, if det(A) = 0, then there are an infinite number of non-zero solutions.
Proposition 12. A square matrix A is invertible if and only if det(A) ≠ 0.
Proof. The result follows immediately on combining the results of Proposition 11 and of Proposi-
tion 10 of Section 5.3.
Example 12. Use determinants to check if the matrices
\[
A = \begin{pmatrix} 2 & 5 \\ -4 & 1 \end{pmatrix}
\quad\text{and}\quad
B = \begin{pmatrix} 1 & 3 & 4 \\ -2 & 4 & -8 \\ -3 & 1 & -12 \end{pmatrix}
\]
are invertible.
Solution. As |A| = 22 ≠ 0, A is invertible. As column 3 of B is a multiple of column 1 of B,
det(B) = 0. Thus B has no inverse. ♦
There is a very simple relationship between the determinant of a matrix and the determinant
of the inverse of the matrix.
Proposition 13. If A is an invertible matrix, then $\det(A^{-1}) = \dfrac{1}{\det(A)}$.
Proof. If $A^{-1}$ exists, then
\[ \det(AA^{-1}) = \det(I) = 1. \]
But, from Proposition 7, the determinant of a product is the product of the determinants, and
hence
\[ 1 = \det(AA^{-1}) = \det(A)\det(A^{-1}). \]
Finally, as A is invertible, $\det(A) \neq 0$, and hence $\det(A^{-1}) = \dfrac{1}{\det(A)}$.
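Proposition 13 can be illustrated for the 2 × 2 matrix A of Example 12. The Python sketch below assumes the standard closed formula for the inverse of a 2 × 2 matrix (swap the diagonal entries, negate the off-diagonal entries, divide by the determinant); the function names are hypothetical.

```python
from fractions import Fraction

def det2(M):
    """Determinant of a 2 x 2 matrix (Definition 1)."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def inverse2(M):
    """Inverse of a 2 x 2 integer matrix via the standard closed formula."""
    d = det2(M)
    if d == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(M[1][1], d), Fraction(-M[0][1], d)],
            [Fraction(-M[1][0], d), Fraction(M[0][0], d)]]

A = [[2, 5], [-4, 1]]                  # matrix A of Example 12
A_inv = inverse2(A)
# Product of the two determinants; Proposition 13 says this is 1.
product_of_determinants = det2(A) * det2(A_inv)
```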
5.5 Matrices and Maple
The Linear Algebra package becomes available on Maple only if it is ‘loaded’ with the command:
with(LinearAlgebra):
Maple treats matrices as individual entities, which can be added and subtracted (use + and -) and
multiplied by scalars (use *). Matrix multiplication is done with the dot.
The matrix A =
(
11 12
21 22
)
is entered by the command:
A := < < 11,21 > | < 12,22 > >;
Maple ignores irrelevant blanks! To display the matrix A you enter:
A;
In Chapter 3 we showed how to enter identity matrices. You can now try:
B := 3*A-2*IdentityMatrix(2);
C := B.A;
You can transpose a given matrix A and set B as this transpose with:
B := Transpose(A);
The determinant of a square matrix A can be evaluated by
Determinant(A);
The inverse of a square matrix A can be calculated by:
A^(-1);
though you should read Subsection 5.3.2 and then try reducing an augmented matrix.
Problems for Chapter 5
Problems 5.1 : Matrix arithmetic and algebra
1. [R] Given the matrices
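\[
A = \begin{pmatrix} 2 & -3 & 4 \\ 3 & 2 & -2 \\ 1 & -1 & 3 \end{pmatrix}, \quad
B = \begin{pmatrix} -2 & 1 \\ 3 & 4 \\ -1 & 5 \end{pmatrix}, \quad
C = \begin{pmatrix} -3 & 2 \\ 1 & -4 \\ 6 & 2 \end{pmatrix}, \quad
D = \begin{pmatrix} 2 & 3 & 1 \\ 1 & -2 & -3 \end{pmatrix}.
\]
Find the following matrices if they exist, or explain why they don’t exist. (I stands for an
identity matrix of the appropriate size.)
a) $3A$, b) $-2B$, c) $A+B$, d) $B+C$, e) $A+3I$,
f) $B+3I$, g) $AB$, h) $BA$, i) $BC$, j) $CD$,
k) $A^2$, l) $B^2$, m) $(BD)^2$.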
2. [H] Suppose A and B are matrices such that both AB and BA are defined.
a) Show that AB and BA are both square matrices.
b) If AB = BA, show that A and B are both square and of the same size.
c) If A and B are square matrices such that AB = BA, show that
$(A-B)(A+B) = A^2 - B^2$.
d) Find two 2 × 2 matrices A, B for which $(A-B)(A+B) \neq A^2 - B^2$.
e) Prove that $(A+B)^2 = A^2 + B^2 + 2AB$ if and only if AB = BA.
3. [H] Let A and B be matrices of the same size. By considering the general entries [A]ij ,
[B]ij, [A+B]ij and [B+A]ij , prove the commutative laws of addition, i.e. A+B = B+A.
4. [H] Suppose λ is a scalar and A,B ∈Mmn. Prove that λ(A+B) = λA+ λB.
5. [H] Let A and B be two matrices such that AB is defined. By considering the general entry
in both sides of the equation, show that A(λB) = λAB where λ is any real number.
6. [R] Let
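\[
A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 1 & 1 & 2 \end{pmatrix}, \quad
B = \begin{pmatrix} 1 & 2 \\ 2 & -2 \\ -1 & 4 \end{pmatrix}, \quad
C = \begin{pmatrix} 2 & 2 \\ 3 & -2 \\ -2 & 4 \end{pmatrix}.
\]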
Show that AB = AC and deduce that matrices cannot in general be cancelled from
products.
7. [R][V] Let
\[ A = \begin{pmatrix} 2 & 1 \\ 3 & -1 \end{pmatrix}. \]
Show that $A^2 = A + 5I$ and hence find $A^6$ as a linear combination of A and I.
8. [R] Let
\[ N = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}. \]
Find $N^2$ and $N^3$. Show that $(I + N)(I - N + N^2) = I$.
9. [H][V] Let A and B be n × n real matrices such that $A^2 = I$, $B^2 = I$ and $(AB)^2 = I$.
Prove that AB = BA.
10. [H] Let A be a 2× 2 real matrix such that AX = XA for all 2× 2 real matrices X. Show
that A = αI for some α ∈ R.
11. [H] Suppose
\[
A = \begin{pmatrix} 1 & -2 & 3 \\ 4 & 0 & 1 \\ 3 & 2 & -1 \end{pmatrix}, \quad
B = \begin{pmatrix} 7 & 0 & 3 \\ 2 & -1 & 6 \\ -1 & 0 & 5 \end{pmatrix}.
\]
a) Write down a column vector v such that Av is the second column of A.
b) Write down a row vector v such that vB is the third row of B.
c) Write down a column vector v such that Av is the second column of AB.
d) Write down a row vector v such that vB is the first row of AB.
Problems 5.2 : The transpose of a matrix
12. [R] Find the transposes of the following matrices:
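\[
A = \begin{pmatrix} 1 & -2 \\ -3 & 0 \\ 4 & 5 \end{pmatrix}, \quad
B = \begin{pmatrix} 2 & -5 & 4 & 3 \\ -4 & 6 & 5 & 5 \\ 5 & 0 & 8 & 6 \end{pmatrix}, \quad
C = \begin{pmatrix} 1 & 4 & 2 \\ 4 & -3 & 6 \\ 2 & 6 & 7 \end{pmatrix}.
\]
13. [R] Let $\mathbf{a} = (1, 3, -2)^T$ and $\mathbf{b} = (0, 4, 2)^T$. Evaluate all of the following expressions that
make sense and find those which are equal:
\[ \mathbf{a}\mathbf{b}, \quad \mathbf{a}^T\mathbf{b}, \quad \mathbf{a}\mathbf{b}^T, \quad \mathbf{a}^T\mathbf{b}^T, \quad \mathbf{b}^T\mathbf{a}, \quad \mathbf{b}\mathbf{a}^T. \]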
14. [R][V] Suppose that A is a square matrix.
a) Show that the matrix $B = A + A^T$ is symmetric.
b) Show that the matrix $C = AA^T$ is symmetric.
c) A matrix M with the property that $M^T = -M$ is called a skew symmetric matrix.
Show that $D = A - A^T$ is a skew symmetric matrix.
d) [H] Can you show how to write any square matrix as the sum of a symmetric and a
skew symmetric matrix?
15. [H] Show by constructing an example that, in general, $A^TA \neq AA^T$, even if A is square.
16. [H] Suppose there exists a real matrix G such that
$GG^T = \begin{pmatrix} \lambda & 0 \\ 0 & \mu \end{pmatrix}$ where λ, µ ∈ R. Prove
that λ and µ are non-negative. If λ = 45 and µ = 20 find an example of such a matrix G
with integer entries.
Problems 5.3 : The inverse of a matrix
17. [R][V] Find the inverses of those of the following 2× 2 matrices that have inverses.
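a) $\begin{pmatrix} 2 & 7 \\ 1 & 4 \end{pmatrix}$, b) $\begin{pmatrix} -4 & 7 \\ 3 & -5 \end{pmatrix}$, c) $\begin{pmatrix} 6 & 12 \\ 3 & 6 \end{pmatrix}$, d) $\begin{pmatrix} 8 & 9 \\ 3 & 4 \end{pmatrix}$, e) $\begin{pmatrix} 0 & 1 \\ 1 & 7 \end{pmatrix}$.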
18. [R] Use the matrix inversion algorithm of Section 5.3 to decide if the following matrices
are invertible, and find the inverses for those which are invertible.
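\[
A = \begin{pmatrix} 1 & 3 & -2 \\ 0 & -1 & 2 \\ 0 & 0 & 1 \end{pmatrix}, \quad
B = \begin{pmatrix} 0 & 2 & 0 \\ 1 & 2 & 3 \\ -1 & 4 & -2 \end{pmatrix}, \quad
C = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 4 & 5 & 6 \end{pmatrix}, \quad
D = \begin{pmatrix} 1 & 4 & 1 \\ 2 & 3 & 1 \\ 1 & -7 & -2 \end{pmatrix}.
\]
19. [H] Write down the inverse of each of the following matrices
a) $\begin{pmatrix} 1 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 6 \end{pmatrix}$, b) $\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 3 \\ -2 & 0 & 0 \end{pmatrix}$.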
20. [R] Decide if the following matrices are invertible, and find the inverses for those that are
invertible.
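\[
A = \begin{pmatrix} 1 & 2 & -1 & -1 \\ 1 & 2 & -2 & -2 \\ 0 & 1 & 1 & 1 \\ 1 & 4 & 0 & -1 \end{pmatrix}, \quad
B = \begin{pmatrix} 1 & 1 & 0 & 1 \\ 3 & 3 & 1 & 5 \\ 1 & 0 & 2 & 4 \\ 0 & -4 & 2 & 1 \end{pmatrix}, \quad
C = \begin{pmatrix} 1 & 2 & -2 & -7 \\ 2 & 4 & 3 & 14 \\ -1 & -2 & 3 & 11 \\ 3 & 5 & 2 & 12 \end{pmatrix}.
\]
21. [R] Given that A, B and C are invertible n × n matrices simplify
a) $A(CB^2A)^{-1}C$, b) $(ABA^{-1})^6$, c) $A(A^{-1} + A)^2A^{-1}$,
d) $A(I + (I - A) + \cdots + (I - A)^m)$.
HINT: Write the first A as $I - (I - A)$.
22. [R] a) Simplify $(B^{-1}A)^{-1}$.
b) Find $(B^{-1}A)^{-1}$ if
$A^{-1} = \begin{pmatrix} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 0 & 0 & 1 \end{pmatrix}$ and
$B = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 0 \\ 1 & 0 & 3 \end{pmatrix}$.
23. [H] a) Prove that $(A^T)^{-1} = (A^{-1})^T$ for any invertible matrix A.
b) If A, B, C are invertible matrices of the same size simplify
i) $A^{-1}(BA^T)^TB$, ii) $A^T(CA^T)^{-1}C^T$.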
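24. [R][V] Let $A = \begin{pmatrix} 1 & -1 & 1 \\ 1 & 1 & 0 \\ 3 & -2 & 2 \end{pmatrix}$.
a) Calculate $A^{-1}$. b) Solve $A\mathbf{x} = \mathbf{c}$ for $\mathbf{x}$, where $\mathbf{c} = \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix}$.
25. [H] A square matrix Q is said to be an orthogonal matrix if it has the property that
$Q^TQ = I$. That is, $Q^T = Q^{-1}$. Show that the matrix
\[
Q = \begin{pmatrix}
\tfrac{2}{3} & \tfrac{1}{3} & -\tfrac{2}{3} \\
\tfrac{2}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \\
\tfrac{1}{3} & \tfrac{2}{3} & \tfrac{2}{3}
\end{pmatrix}
\]
is orthogonal. Hence write down the solution of $Q\mathbf{x} = \mathbf{b}$ for $\mathbf{b} \in \mathbb{R}^3$.
26. [H] Show $Q = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$
is orthogonal. Show that $\mathbf{x} \in \mathbb{R}^2$ and $Q\mathbf{x}$ are equidistant
from the origin.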
Problems 5.4 : Determinants
27. [R] Evaluate the determinants of the following 2×2 matrices and hence determine whether
or not they are invertible.
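a) $\begin{pmatrix} 2 & 7 \\ 1 & 4 \end{pmatrix}$, b) $\begin{pmatrix} -4 & 7 \\ 3 & -5 \end{pmatrix}$, c) $\begin{pmatrix} 5 & 2 \\ 10 & 4 \end{pmatrix}$, d) $\begin{pmatrix} 8 & 9 \\ 3 & 4 \end{pmatrix}$, e) $\begin{pmatrix} 11 & 13 \\ 12 & 14 \end{pmatrix}$.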
28. [R] Evaluate the determinants for the following matrices by reducing to row echelon form.
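a) $\begin{pmatrix} -1 & 1 & 2 \\ 2 & 4 & -1 \\ 0 & -1 & 1 \end{pmatrix}$, b) $\begin{pmatrix} 1 & -2 & 4 \\ 3 & 1 & -2 \\ 1 & 5 & -10 \end{pmatrix}$, c) $\begin{pmatrix} 1 & 0 & 4 \\ 3 & 1 & -2 \\ 1 & 5 & -10 \end{pmatrix}$.
29. [R] Find the determinant of the matrix
\[ \begin{pmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 7 & 1 \\ 1 & 8 & 3 & 1 \\ 1 & 1 & 1 & 4 \end{pmatrix}. \]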
30. [H] Suppose $A = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & i \end{pmatrix}$ has determinant 5. Find
a) $\det \begin{pmatrix} 3a & 3b & 3c \\ 2d & 2e & 2f \\ -g & -h & -i \end{pmatrix}$, b) $\det \begin{pmatrix} a+2d & b+2e & c+2f \\ d-g & e-h & f-i \\ g & h & i \end{pmatrix}$,
c) $\det \begin{pmatrix} d & e & f \\ g & h & i \\ a & b & c \end{pmatrix}$, d) $\det(7A)$.
31. [R] Suppose that A is a 3 × 3 matrix with $\det A = -2$. Calculate:
a) $\det A^T$, b) $\det A^{-1}$, c) $\det A^5$.
32. [R] Evaluate det(A), det(B) and hence det(AB), where
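\[
A = \begin{pmatrix} 1 & -2 & 3 \\ 0 & 3 & 5 \\ 3 & 4 & -2 \end{pmatrix}, \quad
B = \begin{pmatrix} 5 & -1 & 0 \\ -3 & 2 & 4 \\ 2 & 5 & 0 \end{pmatrix}.
\]
33. [R] For what values of a is the matrix $\begin{pmatrix} 1 & 2 & 2 \\ 1 & 3 & 1 \\ 1 & 3 & a \end{pmatrix}$ invertible?
34. [H] Long long ago, a mathematician wrote C and $C^{-1}$ on a piece of paper. Unfortunately
insects have damaged the paper and all that is left is
\[
C = \begin{pmatrix} -2 & -1 & 1 \\ 1 & 2 & -1 \\ \ast & \ast & \ast \end{pmatrix}
\quad\text{and}\quad
C^{-1} = \begin{pmatrix} \ast & 0 & -1 \\ 2 & \ast & -1 \\ 5 & 1 & \ast \end{pmatrix}.
\]
a) Find $C^{-1}$. b) Find C. c) Find det C.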
35. [H] Show that
\[
\det \begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & 1+a & 1 & 1 \\
1 & 1 & 1+b & 1 \\
1 & 1 & 1 & 1+c
\end{pmatrix} = abc.
\]
36. [H] Let $U_1$ and $U_2$ be two n × n row-echelon matrices. Prove that
$\det(U_1)\det(U_2) = \det(U_1U_2)$.
37. [H] Let $A = \begin{pmatrix} \alpha & 1 & -1 \\ \alpha & 2\alpha+2 & \alpha \\ \alpha-3 & \alpha-3 & \alpha-3 \end{pmatrix}$.
a) Factorise det(A).
b) Hence find the values of α for which there is a nonzero solution of Ax = 0.
38. [R] Show by constructing an example that in general det(A+B) ≠ det(A) + det(B).
39. [R] Show by constructing an example that in general det(λA) ≠ λ det(A).
40. [H] Use the product rule for determinants to show that a square orthogonal matrix Q (see
Question 25) has a value for det(Q) of +1 or −1.
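
The row-reduction method asked for in Problem 28 can be sketched in code. This is a minimal Python sketch and not part of the original notes: the function name `det_by_row_reduction` is hypothetical, and the use of exact `Fraction` arithmetic is an assumption made here to avoid rounding. It uses only two facts: swapping two rows changes the sign of the determinant, and adding a multiple of one row to another leaves the determinant unchanged.

```python
from fractions import Fraction


def det_by_row_reduction(rows):
    """Determinant of a square matrix via reduction to row-echelon form."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    sign = 1
    for col in range(n):
        # Look for a nonzero entry in this column, on or below the diagonal.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot available: the determinant is zero
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign  # a row swap changes the sign of the determinant
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            # Subtracting a multiple of the pivot row leaves the determinant unchanged.
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= a[i][i]  # triangular matrix: product of the diagonal entries
    return result


# Matrix of Problem 28 a), as read from the text.
print(det_by_row_reduction([[-1, 1, 2], [2, 4, -1], [0, -1, 1]]))
# Matrix of Problem 29, as read from the text.
print(det_by_row_reduction([[1, 1, 1, 1], [1, 1, 7, 1], [1, 8, 3, 1], [1, 1, 1, 4]]))
```

The same function can be used to check any determinant in this problem set that has numerical entries.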
Answers to selected problems
Chapter 1
1. a) a + h, b) a − h, c) a + $\tfrac{1}{2}$h, d) $\tfrac{3}{4}$a, e) $\tfrac{3}{4}$a − $\tfrac{1}{2}$h.
2. a) 0, b) $2\overrightarrow{CA}$.
3. a) −4a + 5b, b) (2p + 3r)a + (2q − 3s)b.
4. a) $\tfrac{1}{2}$(b + a), $\tfrac{1}{2}$(b + c)
6. a) ≈ 14 cm N 75° E. b) ≈ 104 cm S 23° E.
[Figures: displacement triangles for parts a) and b).]
c) ≈ 18 km/h N 36° E. d) The rower must row 30° upstream.
[Figures: velocity triangles for parts c) and d).]
7. a)
(
3
4
)
, b)
 1615
−5
, c)

−7
2
−6
−1
, d) Not possible, e) 7i− 4j+ 3k.
8. 7.43, N 28◦ E.
12. a) not parallel, b) parallel, c) parallel.
Only in b) is ABCD a parallelogram.
17. (4, 5, 0), (−6,−1, 2), (4, 7, 6)
18. d+ e− f , d+ f − e, e+ f − d.
19. The midpoint is (3,−1, 3). The point Q is (10,−29, 31).
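20. t = $\tfrac{1}{3}$a + $\tfrac{2}{3}$b
21. $\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 0 \\ 0 \\ 1 \\ 0 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 1 \end{pmatrix}$.
22. $6$, $\frac{1}{6}\begin{pmatrix} 4 \\ -4 \\ 2 \end{pmatrix}$; $\sqrt{14}$, $\frac{1}{\sqrt{14}}\begin{pmatrix} 2 \\ 1 \\ 0 \\ 3 \end{pmatrix}$; $\sqrt{21}$, $\frac{1}{\sqrt{21}}\begin{pmatrix} 4 \\ 0 \\ 1 \\ -2 \\ 0 \end{pmatrix}$.
23. a) 15, b) 12, c) $\sqrt{62}$.
24. $\sqrt{35}$, $\sqrt{6}$, $\sqrt{41}$.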
25. A 4–cube has 16 vertices, say, V = {(a, b, c, d) | a, b, c, d = 0, 1}.
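26. a) $x = \begin{pmatrix} 1 \\ 2 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ 5 \end{pmatrix}$, λ ∈ R; b) $x = \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix} + \lambda \begin{pmatrix} -2 \\ -3 \\ 6 \end{pmatrix}$, λ ∈ R;
c) $x = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} + \lambda \begin{pmatrix} 6 \\ 0 \\ 2 \end{pmatrix}$, λ ∈ R; d) $x = \begin{pmatrix} 1 \\ 2 \\ -1 \\ 3 \end{pmatrix} + \lambda \begin{pmatrix} -2 \\ 1 \\ 2 \\ -2 \end{pmatrix}$, λ ∈ R.
27. Yes, it corresponds to λ = 1.
28. a) $x = \begin{pmatrix} 0 \\ 4 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ 3 \end{pmatrix}$, λ ∈ R; b) $x = \begin{pmatrix} 0 \\ 3 \end{pmatrix} + \lambda \begin{pmatrix} 2 \\ -3 \end{pmatrix}$, λ ∈ R;
c) $x = \lambda \begin{pmatrix} 1 \\ -7 \end{pmatrix}$, λ ∈ R; d) $x = \begin{pmatrix} 0 \\ 4 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ 0 \end{pmatrix}$, λ ∈ R;
e) $x = \begin{pmatrix} -2 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} 0 \\ 1 \end{pmatrix}$, λ ∈ R.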
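29. a) $x = \begin{pmatrix} -4 \\ 1 \\ 3 \end{pmatrix} + \lambda \begin{pmatrix} 6 \\ 1 \\ 0 \end{pmatrix}$ or $\frac{x_1 + 4}{6} = x_2 - 1$, $x_3 = 3$.
b) $x = \begin{pmatrix} 1 \\ 2 \\ -3 \end{pmatrix} + \lambda \begin{pmatrix} 4 \\ -5 \\ 6 \end{pmatrix}$ or $\frac{x_1 - 1}{4} = \frac{x_2 - 2}{-5} = \frac{x_3 + 3}{6}$.
c) $x = \begin{pmatrix} 1 \\ -1 \\ 1 \end{pmatrix} + \lambda \begin{pmatrix} 5 \\ -1 \\ 2 \end{pmatrix}$ or $\frac{x_1 - 1}{5} = \frac{x_2 + 1}{-1} = \frac{x_3 - 1}{2}$.
d) $x = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} 0 \\ 3 \\ 3 \end{pmatrix}$ or $x_1 = 1$, $x_2 = x_3$.
30. $\begin{pmatrix} 3 \\ -4 \\ 1 \end{pmatrix}$. a) $x = \begin{pmatrix} 1 \\ -1 \\ 2 \end{pmatrix} + \lambda \begin{pmatrix} 2 \\ -3 \\ -1 \end{pmatrix}$, λ ∈ R.
31. a) true, b) false, c) true, d) true.
32. a) $x = \lambda \begin{pmatrix} 3 \\ -1 \\ 2 \end{pmatrix} + \mu \begin{pmatrix} 1 \\ 4 \\ -6 \end{pmatrix}$; λ, µ ∈ R.
b) $x = \begin{pmatrix} 1 \\ 4 \\ -2 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ 2 \\ 6 \end{pmatrix} + \mu \begin{pmatrix} 0 \\ -14 \\ 5 \end{pmatrix}$; λ, µ ∈ R.
33. a) Plane through the origin parallel to $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$ and $\begin{pmatrix} -2 \\ 3 \\ 4 \end{pmatrix}$.
b) Line through (3, 1, 2, 4) parallel to $\begin{pmatrix} -2 \\ 1 \\ 3 \\ 2 \end{pmatrix}$.
c) Line through origin parallel to $\begin{pmatrix} 3 \\ 2 \\ 1 \\ 2 \end{pmatrix}$.
d) Plane through (1, 2, 3) parallel to $\begin{pmatrix} 4 \\ -1 \\ 2 \end{pmatrix}$ and $\begin{pmatrix} 8 \\ 2 \\ 4 \end{pmatrix}$.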
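34. a) $x = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} + \lambda_1 \begin{pmatrix} 2 \\ 1 \\ 3 \end{pmatrix} + \lambda_2 \begin{pmatrix} -1 \\ 2 \\ -3 \end{pmatrix}$ for $\lambda_1, \lambda_2 \in \mathbb{R}$;
b) $x = \begin{pmatrix} 3 \\ 1 \\ 4 \end{pmatrix} + \lambda_1 \begin{pmatrix} -4 \\ 1 \\ 0 \end{pmatrix} + \lambda_2 \begin{pmatrix} 3 \\ 6 \\ -6 \end{pmatrix}$ for $\lambda_1, \lambda_2 \in \mathbb{R}$;
c) $x = \begin{pmatrix} -2 \\ 4 \\ 1 \\ 6 \end{pmatrix} + \lambda_1 \begin{pmatrix} 5 \\ -2 \\ 5 \\ -7 \end{pmatrix} + \lambda_2 \begin{pmatrix} 3 \\ 0 \\ -1 \\ -6 \end{pmatrix}$ for $\lambda_1, \lambda_2 \in \mathbb{R}$;
d) $x = \begin{pmatrix} 3 \\ 0 \\ 0 \end{pmatrix} + \lambda_1 \begin{pmatrix} -3 \\ 0 \\ 2 \end{pmatrix} + \lambda_2 \begin{pmatrix} 3 \\ 4 \\ 0 \end{pmatrix}$ for $\lambda_1, \lambda_2 \in \mathbb{R}$;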
e) x =
01
0
+ λ1
10
0
+ λ2
 06
5
1
 for λ1, λ2 ∈ R;
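f) $x = \begin{pmatrix} 1 \\ 2 \\ 3 \\ 4 \end{pmatrix} + \lambda_1 \begin{pmatrix} 4 \\ 0 \\ -4 \\ 5 \end{pmatrix} + \lambda_2 \begin{pmatrix} 7 \\ 2 \\ -3 \\ -5 \end{pmatrix}$ for $\lambda_1, \lambda_2 \in \mathbb{R}$.
35. a) $x = \lambda_1 \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix} + \lambda_2 \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}$; $\lambda_1, \lambda_2 \in \mathbb{R}$.
b) $x = \begin{pmatrix} 4 \\ 0 \\ 0 \end{pmatrix} + \lambda_1 \begin{pmatrix} 1 \\ 3 \\ 0 \end{pmatrix} + \lambda_2 \begin{pmatrix} -4 \\ 0 \\ 3 \end{pmatrix}$; $\lambda_1, \lambda_2 \in \mathbb{R}$.
c) $x = \begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix} + \lambda_1 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \lambda_2 \begin{pmatrix} 0 \\ -6 \\ 1 \end{pmatrix}$; $\lambda_1, \lambda_2 \in \mathbb{R}$.
d) $x = \begin{pmatrix} 0 \\ 0 \\ 2 \end{pmatrix} + \lambda_1 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \lambda_2 \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$; $\lambda_1, \lambda_2 \in \mathbb{R}$.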
37. a) (3, 2, 4), b) (3,−4, 11).
38. a) 6x− 3y + 2z = −12, b) 6x− 12y + 13z = 100.
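39. a) $x = \begin{pmatrix} 3 \\ -2 \\ 1 \end{pmatrix} + \lambda \begin{pmatrix} -2 \\ 3 \\ 1 \end{pmatrix}$ for λ ∈ R. b) (−13, 22, 9).
40. a) $x = \begin{pmatrix} 6 \\ 4 \\ 1 \end{pmatrix} + \lambda \begin{pmatrix} 5 \\ 2 \\ -2 \end{pmatrix}$ for λ ∈ R. b) (1, 2, 3).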
Chapter 2
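1. a) $\frac{\pi}{4}$, b) $\cos^{-1}\left(\frac{1}{10\sqrt{3}}\right) \approx 86^\circ 41'$, c) $\frac{\pi}{2}$, d) $\cos^{-1}\left(\frac{7}{10\sqrt{13}}\right) \approx 78^\circ 48'$.
2. a) $0$, $\frac{2}{\sqrt{6}}$, $\frac{1}{\sqrt{3}}$; b) $\frac{4}{3\sqrt{2}}$, $-\frac{5}{\sqrt{33}}$, $\frac{8}{\sqrt{66}}$; c) $\frac{7}{3\sqrt{10}}$, $-\frac{1}{\sqrt{42}}$, $\frac{8}{\sqrt{105}}$.
3. $\cos^{-1}\left(\frac{1}{3}\right) \approx 70^\circ 32'$.
6. $\lambda_1 = a \cdot u_1 = \frac{1}{\sqrt{2}}$, $\lambda_2 = a \cdot u_2 = -3$, $\lambda_3 = a \cdot u_3 = \frac{3}{\sqrt{2}}$.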
7. a)
 55
2
1
. b) π
2
. c)
√
66
2
. d)
1
17
8050
22
.
8. a)
1
3
 2−4
2
, b) 3
14

−1
3
0
2
, c)
−33
6
.
9. a) 7, b) 3.
10. a)
 16−4
−2
, b)
−23−11
20
, c)
−459
−18
.
11.
 12−8
6
.
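13. a) $2\sqrt{21}$, $\begin{pmatrix} 8 \\ -4 \\ 2 \end{pmatrix}$; b) $2\sqrt{2}$, $\begin{pmatrix} 0 \\ -2 \\ -2 \end{pmatrix}$.
14. a) $\sqrt{2}$; b) $\frac{15}{2}$.
15. a) $-\frac{4}{3\sqrt{2}}$. b) $\frac{1}{\sqrt{2}}$.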
17. a) 14, b) 53.
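19. As usual, the answers for equations of planes are not unique.
a) $x = \begin{pmatrix} 3 \\ 0 \\ 0 \end{pmatrix} + \lambda_1 \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + \lambda_2 \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix}$; $\begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix} \cdot \left( x - \begin{pmatrix} 1 \\ 2 \\ -2 \end{pmatrix} \right) = 0$;
$x_1 - x_2 - 2x_3 = 3$.
b) $x = \begin{pmatrix} 1 \\ 2 \\ -2 \end{pmatrix} + \lambda_1 \begin{pmatrix} -1 \\ 1 \\ 2 \end{pmatrix} + \lambda_2 \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}$; $\begin{pmatrix} -5 \\ 5 \\ -5 \end{pmatrix} \cdot \left( x - \begin{pmatrix} 1 \\ 2 \\ -2 \end{pmatrix} \right) = 0$;
$x_1 - x_2 + x_3 = -3$.
c) $x = \begin{pmatrix} 1 \\ 2 \\ -2 \end{pmatrix} + \lambda_1 \begin{pmatrix} -2 \\ -1 \\ 4 \end{pmatrix} + \lambda_2 \begin{pmatrix} 1 \\ 1 \\ 3 \end{pmatrix}$, $\begin{pmatrix} -7 \\ 10 \\ -1 \end{pmatrix} \cdot \left( x - \begin{pmatrix} 1 \\ 2 \\ -2 \end{pmatrix} \right) = 0$;
$7x_1 - 10x_2 + x_3 = -15$.
d) $x = \begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix} + \lambda_1 \begin{pmatrix} 1/2 \\ 1 \\ 0 \end{pmatrix} + \lambda_2 \begin{pmatrix} -1/4 \\ 0 \\ 1 \end{pmatrix}$, $\begin{pmatrix} -4 \\ 2 \\ -1 \end{pmatrix} \cdot \left( x - \begin{pmatrix} -1 \\ 0 \\ 0 \end{pmatrix} \right) = 0$;
$4x_1 - 2x_2 + x_3 = -4$.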
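20. a) $x = \begin{pmatrix} 1 \\ 2 \\ 4 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + \mu \begin{pmatrix} 2 \\ -1 \\ -1 \end{pmatrix}$, for λ, µ ∈ R.
b) $\begin{pmatrix} -1 \\ -1 \\ -1 \end{pmatrix}$. c) $x_1 + x_2 + x_3 = 7$.
21. $\begin{pmatrix} 4 \\ 4 \\ 2 \end{pmatrix}$.
22. a) 3, b) $\sqrt{6}$, c) $\frac{13}{7}$, d) $\frac{25}{7}$.
23. a) $x = \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} -1 \\ -1 \\ 2 \end{pmatrix} + \mu \begin{pmatrix} -2 \\ 1 \\ 1 \end{pmatrix}$, λ, µ ∈ R.
b) $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$. c) $\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \cdot \left( x - \begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix} \right) = 0$. d) $\frac{8}{\sqrt{3}}$.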
Chapter 3
1.
x ∈ N x ∈ Z x ∈ Q x ∈ R
a) - −25 −25 −25
3 3 3 3
- −3 −3 −3
- - −103 −103
b) 1 1,−5 1,−5 1,−5
5 5 5,32 5,
3
2
- - - 1±
√
5
2
- - - -
c) 3j,j ∈ N 3j,j ∈ Z 3j,j ∈ Z 3j,j ∈ Z
0 0 0 3kπ,k ∈ Z
2. No.
3. Yes. The set {0} and the empty set ∅ = { }.
4. Yes.
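5. $3z = 6 + 9i$, $z^2 = -5 + 12i$, $z + 2w = 7i$, $z(w + 3) = -2 + 10i$, $\frac{z}{w} = \frac{1}{5}(4 - 7i)$, $\frac{w}{z} = \frac{1}{13}(4 + 7i)$.
6. a) $\frac{1}{5}(3 - i)$, b) $-\frac{1}{2}(1 - i)$.
7. a) $a^2 - b^2 + 2abi$, b) $\frac{a}{a^2 + b^2} - i\frac{b}{a^2 + b^2}$, c) $\frac{1}{(a - 1)^2 + b^2}\left((a^2 - 1 + b^2) - 2ib\right)$.
8. a) $\frac{1}{2}(-1 \pm \sqrt{3}\,i)$, b) $-1 \pm \sqrt{2}\,i$, c) $3 \pm i$, d) $\frac{1}{2}(3 \pm \sqrt{3})i$, e) $\pm i$, $\pm 2i$.
10. 16
11. $\frac{8abi(a^2 - b^2)}{(a^2 + b^2)^2}$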
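12.
z                     Re(z)              Im(z)               $\bar{z}$
$-1 + i$              $-1$               $1$                 $-1 - i$
$2 + 3i$              $2$                $3$                 $2 - 3i$
$2 - 3i$              $2$                $-3$                $2 + 3i$
$\frac{2 - i}{1 + i}$   $\frac{1}{2}$      $-\frac{3}{2}$      $\frac{1 + 3i}{2}$
$\frac{1}{(1 + i)^2}$   $0$                $-\frac{1}{2}$      $\frac{1}{2}i$
13. $-3 + 4i$, $\frac{11}{25} - \frac{2}{25}i$.
14. z = 2 + 3i, w = −1 + 2i.
17. b) $z^2 - 6z + 13$
18.
z                                          |z|            Arg(z)               Polar Form
$6 + 6i$                                   $6\sqrt{2}$    $\frac{\pi}{4}$      $6\sqrt{2}\left(\cos\frac{\pi}{4} + i\sin\frac{\pi}{4}\right)$
$-4$                                       $4$            $\pi$                $4(\cos\pi + i\sin\pi)$
$\sqrt{3} - i$                             $2$            $-\frac{\pi}{6}$     $2\left(\cos\frac{\pi}{6} - i\sin\frac{\pi}{6}\right)$
$-\frac{1}{\sqrt{2}} - \frac{i}{\sqrt{2}}$   $1$            $-\frac{3\pi}{4}$    $\cos\frac{3\pi}{4} - i\sin\frac{3\pi}{4}$
$-7 + 3i$                                  $\sqrt{58}$    $\alpha$             $\sqrt{58}(\cos\alpha + i\sin\alpha)$
Here $\alpha = \pi - \tan^{-1}\frac{3}{7}$.
19. $\sqrt{234}$, $-1$.
20. n = 4
21. a) $\frac{3}{2}\left(1 + \sqrt{3}\,i\right)$, b) $\frac{3}{2}\left(-\sqrt{3} + i\right)$, c) $-\frac{3}{2}\left(1 + \sqrt{3}\,i\right)$, d) $\frac{3}{2}\left(\sqrt{3} - i\right)$,
e) $\frac{3}{2}\left(\sqrt{2 + \sqrt{2}} + i\sqrt{2 - \sqrt{2}}\right)$ (Double angle formula used).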
27. 64, −(1 +√3)i, 1 +
√
3
2
+
√
3− 1
2
i.
28.
7
2
.
29. π.
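30. $\mathrm{Arg}(-1 + i) = \frac{3\pi}{4}$; $\mathrm{Arg}\left(-\sqrt{3} + i\right) = \frac{5\pi}{6}$;
$\mathrm{Arg}\left((-1 + i)\left(-\sqrt{3} + i\right)\right) = -\frac{5\pi}{12}$; $\mathrm{Arg}\left(\frac{-1 + i}{-\sqrt{3} + i}\right) = -\frac{\pi}{12}$.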
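31. $\sin\frac{7\pi}{12} = \frac{1 + \sqrt{3}}{2\sqrt{2}}$.
32. $zw = 2\sqrt{2}e^{i\pi/12} = 2\sqrt{2}\left[\cos\left(\frac{\pi}{12}\right) + i\sin\left(\frac{\pi}{12}\right)\right]$; $z^9 = -512$;
$\left(\frac{z}{w}\right)^{12} = 64e^{i\pi} = -64$.
33. a) $16\left(-\sqrt{3} + i\right)$, b) $-i$, c) $-\frac{1}{2} - i\frac{\sqrt{3}}{2}$.
34. a) ±(5 − 2i), b) ±(3 + 5i), c) ±(7 + 5i).
35. b) $\sqrt{2}e^{5\pi i/12} = \frac{1}{2}\left(\left(\sqrt{3} - 1\right) + i\left(\sqrt{3} + 1\right)\right)$, c) $\frac{1}{\sqrt{2}}(1 + 13i)$.
37. a) 2 + i, 1 − i; b) 4 + i, 3 − 2i; c) 1 − 2i, −5 + 3i.
38. $e^{i\pi/7}$, $e^{3i\pi/7}$, $e^{5i\pi/7}$, $e^{i\pi}$, $e^{-i\pi/7}$, $e^{-3i\pi/7}$, $e^{-5i\pi/7}$.
39. $e^{in\pi/12}$ for n = −11, −7, −3, 1, 5, 9.
40. $2e^{in\pi/15}$ for n = −13, −7, −1, 5, 11.
41. $\frac{15}{2} + i\left(\frac{3\sqrt{3}}{2} - 1\right)$, $3 - i$, $\frac{15}{2} - i\left(\frac{3\sqrt{3}}{2} + 1\right)$.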
43. a) Real part = cos(2θ). Imaginary part = sin(2θ).
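45. a) $\cos 6\theta = \cos^6\theta - 15\cos^4\theta\sin^2\theta + 15\cos^2\theta\sin^4\theta - \sin^6\theta$
$\sin 6\theta = 6\cos^5\theta\sin\theta - 20\cos^3\theta\sin^3\theta + 6\cos\theta\sin^5\theta$
b) $\cos 6\theta = 32\cos^6\theta - 48\cos^4\theta + 18\cos^2\theta - 1$
46. $\sin 7\theta = 7\cos^6\theta\sin\theta - 35\cos^4\theta\sin^3\theta + 21\cos^2\theta\sin^5\theta - \sin^7\theta$
$\cos 7\theta = \cos^7\theta - 21\cos^5\theta\sin^2\theta + 35\cos^3\theta\sin^4\theta - 7\cos\theta\sin^6\theta$.
47. a) $\cos\theta = \frac{1}{2}\left(e^{i\theta} + e^{-i\theta}\right)$.
b) $\cos^6\theta = \frac{1}{32}(\cos 6\theta + 6\cos 4\theta + 15\cos 2\theta + 10)$.
48. $\sin^5\theta = \frac{1}{16}(\sin 5\theta - 5\sin 3\theta + 10\sin\theta)$
$\int \sin^5\theta\,d\theta = \frac{1}{16}\left(-\frac{1}{5}\cos 5\theta + \frac{5}{3}\cos 3\theta - 10\cos\theta\right) + C$,
$\cos^4\theta = \frac{1}{8}\left[3 + 4\cos(2\theta) + \cos(4\theta)\right]$
$\int \cos^4\theta\,d\theta = \frac{1}{8}\left[3\theta + 2\sin(2\theta) + \frac{1}{4}\sin(4\theta)\right] + C$.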
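49. a) $|z - i| \le 2$. b) $|z - i| \le 2$ or $-\frac{\pi}{3} \le \mathrm{Arg}(z - i) \le \frac{2\pi}{3}$.
[Figures: sketches of the regions for a) and b), with the point i marked.]
c) $|z| > 2$ and $|\mathrm{Im}(z)| \le 3$. d) $y \le x$.
[Figures: sketches of the regions for c), showing the lines y = 3 and y = −3 and the circle of radius 2, and for d), showing the line y = x.]
e) The real axis. f) $|z - 1 - i| < 1$ and $-\frac{\pi}{4} < \mathrm{Arg}(z - 1 - i) \le \frac{\pi}{2}$.
[Figure: sketch of the region for f), with the point 1 + i marked.]
g) Circle: $x^2 + \left(y + \frac{5}{3}\right)^2 = \left(\frac{4}{3}\right)^2$.
[Figure: sketch of the circle, with the points $-3i$ and $-\frac{5}{3}i$ marked.]
50. a) $\mathrm{Re}(z) > 3\,\mathrm{Im}(z)$ and $|z - (3 + i)| > 2$.
[Figure: region S1, showing the line x = 3y and the circle of radius 2, centre 3 + i.]
b) $|z - i| < |z + i|$ and $-\frac{\pi}{6} \le \mathrm{Arg}(z - i) \le \frac{\pi}{6}$.
$|z - i| < |z + i|$ is equivalent to $\mathrm{Im}(z) > 0$.
[Figure: region S2, a wedge with vertex at i and half-angle $\frac{\pi}{6}$ on either side of the horizontal.]
51. a) $\mathrm{Im}(z) > -4$ and $|z - 1 - i| > 3$. b) Yes.
[Figure: the circle of radius 3 centred at 1 + i and the line Im(z) = −4.]
52. $|z - x| > |z - \mathrm{Re}(z)|$
[Figure: the points z, x and Re(z), with the distances $|z - x|$ and $|z - \mathrm{Re}(z)|$ marked.]
53. a) $w = e^{i\alpha}$, $-\pi < \alpha \le \pi$. c) $|z - e^{i\alpha}| > |z - e^{i\theta}|$, $\theta = \mathrm{Arg}(z)$.
[Figures: the unit circle with the angle α marked for a), and with z, θ, α and the distances $|z - e^{i\alpha}|$ and $|z - e^{i\theta}|$ marked for c).]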
54. a) 742, b) 129, c) 1 + 9i.
55. p(z) = (z − 2)(2z − 5)(z + 3).
56. p(z) = (z − 1)(z + 1)(z + 2)(z + 4).
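57. a) $\left(z - e^{-i\pi/10}\right)\left(z - e^{3\pi i/10}\right)\left(z - e^{7\pi i/10}\right)\left(z - e^{-i\pi/2}\right)\left(z - e^{-9\pi i/10}\right)$.
b) $\left(z - \sqrt{2}e^{i\pi/6}\right)\left(z - \sqrt{2}e^{i\pi/2}\right)\left(z - \sqrt{2}e^{5\pi i/6}\right)\left(z - \sqrt{2}e^{-i\pi/6}\right)\left(z - \sqrt{2}e^{-i\pi/2}\right)\left(z - \sqrt{2}e^{-5\pi i/6}\right)$.
58. a) $(x - 1)(x + 1)(x^2 + 1)(x^2 + \sqrt{2}x + 1)(x^2 - \sqrt{2}x + 1)$.
b) $(x^2 + 2)(x^2 + \sqrt{6}x + 2)(x^2 - \sqrt{6}x + 2)$.
59. $(z^2 + 2z + 2)(z^2 - 2z + 2)$
60. $\left(z - e^{-i\pi/8}\right)\left(z - e^{i3\pi/8}\right)\left(z - e^{i7\pi/8}\right)\left(z - e^{-i5\pi/8}\right)$
61. a) $e^{-5\pi i/6}$, $e^{-\pi i/2}$, $e^{-\pi i/6}$, $e^{\pi i/6}$, $e^{\pi i/2}$, $e^{5\pi i/6}$.
b) Note that the solutions are evenly spaced around the unit circle centred on 0.
[Figure: the six solutions $i$, $e^{5i\pi/6}$, $e^{-5i\pi/6}$, $-i$, $e^{-i\pi/6}$ and $e^{i\pi/6}$ plotted on the unit circle, with axes Re z and Im z.]
c) $\left(z - e^{-5\pi i/6}\right)\left(z - e^{-\pi i/2}\right)\left(z - e^{-\pi i/6}\right)\left(z - e^{\pi i/6}\right)\left(z - e^{\pi i/2}\right)\left(z - e^{5\pi i/6}\right)$.
d) $\left(z^2 + 1\right)\left(z^2 + \sqrt{3}z + 1\right)\left(z^2 - \sqrt{3}z + 1\right)$.
62. a) $e^{i\pi/4}$, $e^{i\pi/2}$, $e^{i3\pi/4}$, $e^{-i\pi/4}$, $e^{-i\pi/2}$, $e^{-i3\pi/4}$.
b) $\left(z - e^{i\pi/4}\right)\left(z - e^{-i\pi/4}\right)\left(z - e^{i3\pi/4}\right)\left(z - e^{-i3\pi/4}\right)\left(z - e^{i\pi/2}\right)\left(z - e^{-i\pi/2}\right)$
c) $\left(z^2 - \sqrt{2}z + 1\right)\left(z^2 + \sqrt{2}z + 1\right)\left(z^2 + 1\right)$.
63. a) $\left(z - e^{2i\pi/5}\right)\left(z - e^{-2i\pi/5}\right)\left(z - e^{4i\pi/5}\right)\left(z - e^{-4i\pi/5}\right)$
b) $\left(z^2 - 2z\cos\left(\frac{2\pi}{5}\right) + 1\right)\left(z^2 - 2z\cos\left(\frac{4\pi}{5}\right) + 1\right)$
64. a) $(t + 1 - i)(t + 1 + i)(t - 2)(t + 1)(t + i)(t - i)$,
b) $(t^2 + 2t + 2)(t - 2)(t + 1)(t^2 + 1)$.
65. $1 + i$, $1 - i$, $\sqrt[3]{5}$, $\frac{\sqrt[3]{5}}{2}\left(-1 + i\sqrt{3}\right)$, $\frac{\sqrt[3]{5}}{2}\left(-1 - i\sqrt{3}\right)$.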
70. evalc((sqrt(2)+7*I)^13);
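
The Maple command in answer 70 asks for a complex power in Cartesian form. For readers without Maple, a rough floating-point analogue can be written in Python. This sketch is not part of the original notes: Python's result is approximate rather than exact, and the helper `nth_roots` is a hypothetical name introduced here. It applies the polar-form method used in the root-finding answers above: take the real n-th root of the modulus and divide the argument, plus multiples of 2π, by n.

```python
import cmath
import math


def nth_roots(w, n):
    """Return the n complex solutions z of z**n == w (hypothetical helper)."""
    r, phi = cmath.polar(w)  # w = r * e^{i*phi}, phi being the principal argument
    return [
        cmath.rect(r ** (1.0 / n), (phi + 2 * math.pi * k) / n)
        for k in range(n)
    ]


# Floating-point analogue of the Maple command in answer 70.
power = (math.sqrt(2) + 7j) ** 13
print(power.real, power.imag)

# The six solutions of z^6 = -1, printed as (modulus, principal argument) pairs.
for z in nth_roots(-1, 6):
    print(cmath.polar(z))
```

Each printed argument should be an odd multiple of π/6, up to rounding error.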
Chapter 4
1. a)
{
5
2
}
,
{(
5
2
λ
)
: λ ∈ R
}
,

 52λ
µ
 : λ, µ ∈ R

b)
{(
4− 2λ
λ
)
: λ ∈ R
}
,

4− 2λλ
µ
 : λ, µ ∈ R

c)

 λµ
2− 2λ+ 3µ
 : λ, µ ∈ R

2. a) No solution. b) Unique solution $\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 8 \\ -9 \end{pmatrix}$.
c) Infinite number of solutions on the line $x = \begin{pmatrix} 5 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} 5 \\ 1 \end{pmatrix}$, λ ∈ R.
3. For $a_{11} \ne 0$ the conditions are as follows.
a) If $a_{11}a_{22} - a_{12}a_{21} \ne 0$, then the solution is unique.
b) If $a_{11}a_{22} - a_{12}a_{21} = 0$ and $a_{11}b_2 - a_{21}b_1 \ne 0$, then there is no solution.
c) If $a_{11}a_{22} - a_{12}a_{21} = 0$ and $a_{11}b_2 - a_{21}b_1 = 0$, then there are an infinite number of
solutions.
4. a) Solution set =

 1 + λ2− 2λ
λ
 : λ ∈ R
.
Planes intersect in line x =
12
0
+ λ
 1−2
1
 , λ ∈ R.
b) No solution. Planes are parallel.
c) Solution set =

4− 54λ+ 12µλ
µ
 : λ, µ ∈ R
. Equations represent the same plane.
6. a) In vector form,
x1
 35
−1
+ x2
−32
−1
+ x3
 4−3
6
 =
67
8
 .
As a matrix equation and augmented matrix, 3 −3 45 2 −3
−1 −1 6
x1x2
x3
 =
67
8
 ; (A|b) =
 3 −3 4 65 2 −3 7
−1 −1 6 8
 .
b) In vector form,
x1
13
0
+ x2
32
3
+ x3
 7−5
6
+ x4
 8−1
−6
 =
−27
5
 .
As a matrix equation and augmented matrix,
 1 3 7 83 2 −5 −1
0 3 6 −6


x1
x2
x3
x4
 =
−27
5
 ; (A|b) =
 1 3 7 8 −23 2 −5 −1 7
0 3 6 −6 5
 .
7. The system of equations is
x1 − 3x2 = 10
6x2 + 6x3 = −2
−6x1 − x2 − 4x3 = 0
7x1 + 9x2 + 11x3 = 5
The augmented matrix form is
A =

1 −3 0 10
0 6 6 −2
−6 −1 −4 0
7 9 11 5
 .
8. a) $R_2 = R_2 - 2R_1$, $R_3 = R_3 - 4R_1$; b) $R_1 = R_1 - R_2$, $R_2 = \frac{1}{2}R_2$.
9.
 2 4 1 29 14 7 7
1 3 1 3
.
10. All but c) and h) are in row-echelon form.
11. a) x =
 23
−2
. Point of intersection of 3 planes.
b) x =

2
3
−2
0
+ λ

−1
0
2
1
 , λ ∈ R.
A line in R4 through the point (2, 3,−2, 0) and parallel to

−1
0
2
1
.
12. a) x =
(
3
−1
)
. b) x =
51
0
+ λ
−1−2
1
 , λ ∈ R. c) x =
 2−3
1
 .
d) No solution. e) x =
05
0
+ λ
 2−3
1
 , λ ∈ R. f) No solution.
g) x =

1
2
3
2
. h) x =

−3
6
5
0
+ λ

2
−2
−1
1
 , λ ∈ R.
13. a)
 1 0 0 −10 1 0 2
0 0 1 −2
.
Solution: x =
−12
−2
 , which is the position vector of a point in R3.
b)
 1 0 0 −75 −340 1 0 29 13
0 0 1 7 3
.
Solution: x =

−34
13
3
0
+ λ

75
−29
−7
1
 , λ ∈ R, which is a line in R4.
14. a) Unique solution, b) no solution, c) infinitely many solutions,
d) infinitely many solutions, e) unique solution.
15. a) k ≠ 3, b) no such value of k, c) k = 3.
16. a) λ = ±2, b) λ = 1, c) all other values of λ.
17. a) a ≠ 0, b) a = 0, b ≠ 0, c) a = b = 0, d) x =

5
0
0
0
+ λ

−2
5
2
−1
3
 λ ∈ R.
18. Perhaps, if the costs are negative or very large then you can be sure that someone is cheating.
19. No.
20. a) $x_1 = 7b_1 + 5b_2 + 3b_3$
$x_2 = 6b_1 + 4b_2 + 3b_3$
$x_3 = 2b_1 + b_2 + b_3$
b) $x_1 = \frac{3}{2}b_1 - 2b_2 - 2b_3$
$x_2 = -\frac{7}{2}b_1 + 5b_2 + 4b_3$
$x_3 = \frac{1}{2}b_1 - b_2 - b_3$
22. a) b3 − 12b1 + b2 = 0. b) b1 − b2 + b3 = 0 and −2b1 + b2 + b4 = 0.
24. Yes.
25. No.
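26. Yes, since $\begin{pmatrix} 1 \\ 1 \\ 4 \\ 12 \end{pmatrix} = 3\begin{pmatrix} 3 \\ -1 \\ 4 \\ 6 \end{pmatrix} - 2\begin{pmatrix} 4 \\ -2 \\ 4 \\ 3 \end{pmatrix}$.
27. No.
28. Yes, at (6, 13, 11).
29. Yes, since $\begin{pmatrix} 5 \\ 7 \\ -1 \end{pmatrix} = 3\begin{pmatrix} 3 \\ 5 \\ 1 \end{pmatrix} - 4\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}$.
31. Meet at (6, 9, 4).
32. The planes intersect at the line $x = \begin{pmatrix} 6 \\ -2 \\ 3 \end{pmatrix} + \lambda \begin{pmatrix} -5 \\ 4 \\ 8 \end{pmatrix}$, λ ∈ R.
33. Planes are not parallel as $\lambda_1 \begin{pmatrix} 2 \\ 1 \\ -2 \\ 7 \end{pmatrix} + \lambda_2 \begin{pmatrix} -3 \\ 1 \\ 5 \\ 2 \end{pmatrix} = \mu_1 \begin{pmatrix} 3 \\ -1 \\ 2 \\ 4 \end{pmatrix} + \mu_2 \begin{pmatrix} -1 \\ 4 \\ 2 \\ 6 \end{pmatrix}$
only when $\lambda_1 = \lambda_2 = \mu_1 = \mu_2 = 0$.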
35. a) x =
10
0
+ λ
 132
3
1
 , λ ∈ R. b) The planes intersect in a line.
36. p(x) = 2x2 − 4x+ 7
37. I am 42, my brother is 46 and my sister is 52.
38. 6 days in Bangkok, 4 each in Singapore and Kuala Lumpur.
39. 3, 1, 2.
40. a) Π1 is x+ 2y − z = 2, Π2 is 3x+ 6y − z = 12, Π3 is 2x+ 4y − z = 7.
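b) $x = \begin{pmatrix} 5 - 2t_2 \\ t_2 \\ 3 \end{pmatrix} = \begin{pmatrix} 5 \\ 0 \\ 3 \end{pmatrix} + t_2 \begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}$, $t_2 \in \mathbb{R}$.
The intersection is a line through (5, 0, 3) and parallel to $\begin{pmatrix} -2 \\ 1 \\ 0 \end{pmatrix}$.
c) x = −2y + 5 and z = 3.
41. a) $\begin{cases} x - 2y + z = a \\ 3x + 6y + 8z = b \\ 4x + 2y + 7z = c \\ 7x - 8y + 6z = d \end{cases}$. b) $d - a + 2b - 3c = 0$. c) $\left(\frac{4}{7}, -\frac{1}{7}, \frac{1}{7}\right)$.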
Chapter 5
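1. a) $3A = \begin{pmatrix} 6 & -9 & 12 \\ 9 & 6 & -6 \\ 3 & -3 & 9 \end{pmatrix}$. b) $-2B = \begin{pmatrix} 4 & -2 \\ -6 & -8 \\ 2 & -10 \end{pmatrix}$.
c) A + B is not defined. d) $B + C = \begin{pmatrix} -5 & 3 \\ 4 & 0 \\ 5 & 7 \end{pmatrix}$.
e) $A + 3I = \begin{pmatrix} 5 & -3 & 4 \\ 3 & 5 & -2 \\ 1 & -1 & 6 \end{pmatrix}$. f) B + 3I is not defined.
g) $AB = \begin{pmatrix} -17 & 10 \\ 2 & 1 \\ -8 & 12 \end{pmatrix}$. h) BA is not defined.
i) BC is not defined. j) $CD = \begin{pmatrix} -4 & -13 & -9 \\ -2 & 11 & 13 \\ 14 & 14 & 0 \end{pmatrix}$.
k) $A^2 = \begin{pmatrix} -1 & -16 & 26 \\ 10 & -3 & 2 \\ 2 & -8 & 15 \end{pmatrix}$. l) $B^2$ is not defined.
m) $(BD)^2 = \begin{pmatrix} -86 & 81 & 167 \\ -47 & 38 & 85 \\ -187 & 171 & 358 \end{pmatrix}$.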
7. 96A + 205I.
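8. $N^2 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, $N^3 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$.
11. a) $\begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$, b) $\begin{pmatrix} 0 & 0 & 1 \end{pmatrix}$, c) $\begin{pmatrix} 0 \\ -1 \\ 0 \end{pmatrix}$, d) $\begin{pmatrix} 1 & -2 & 3 \end{pmatrix}$.
13. $A^T = \begin{pmatrix} 1 & -3 & 4 \\ -2 & 0 & 5 \end{pmatrix}$, $B^T = \begin{pmatrix} 2 & -4 & 5 \\ -5 & 6 & 0 \\ 4 & 5 & 8 \\ 3 & 5 & 6 \end{pmatrix}$, $C^T = \begin{pmatrix} 1 & 4 & 2 \\ 4 & -3 & 6 \\ 2 & 6 & 7 \end{pmatrix} = C$.
14. $a^T b = b^T a = 8$, $ab^T = \begin{pmatrix} 0 & 4 & 2 \\ 0 & 12 & 6 \\ 0 & -8 & -4 \end{pmatrix}$, $ba^T = \begin{pmatrix} 0 & 0 & 0 \\ 4 & 12 & -8 \\ 2 & 6 & -4 \end{pmatrix}$, $ab$ and $a^T b^T$ are not
defined.
17. A possible $G = \begin{pmatrix} 3 & 6 \\ -4 & 2 \end{pmatrix}$.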
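18. a) $\begin{pmatrix} 4 & -7 \\ -1 & 2 \end{pmatrix}$, b) $\begin{pmatrix} 5 & 7 \\ 3 & 4 \end{pmatrix}$, c) no inverse, d) $\frac{1}{5}\begin{pmatrix} 4 & -9 \\ -3 & 8 \end{pmatrix}$, e) $\begin{pmatrix} -7 & 1 \\ 1 & 0 \end{pmatrix}$.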
19. A−1 =
 1 3 −40 −1 2
0 0 1
, B−1 =
 8 −2 −31
2 0 0
−3 1 1
, C is not invertible,
D−1 =
1
4
 1 1 15 −3 1
−17 11 −5
.
20. a)
1 0 00 15 0
0 0 16
 b)
 0 0 −121 0 0
0 13 0

21. A−1 =

4 −3 −2 0
−1 1 1 0
1 −2 −2 1
0 1 2 −1
 ; B−1 =

6 −2 1 0
9 −4 3 −1
25 −11 8 −2
−14 6 −4 1
 ;
C−1 does not exist.
22. a)
(
B−1
)2
, b) AB6A−1, c) (A+A−1)2, d) I − (I −A)m+1.
23. a) A−1B. b)
2 4 41 −2 3
1 0 3
.
24. b) i) BTB, ii) C−1C T .
25. a) $\begin{pmatrix} -2 & 0 & 1 \\ 2 & 1 & -1 \\ 5 & 1 & -2 \end{pmatrix}$. b) $\begin{pmatrix} -2c_1 + c_3 \\ 2c_1 + c_2 - c_3 \\ 5c_1 + c_2 - 2c_3 \end{pmatrix}$.
26. $x = Q^T b$.
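28. a) 1, b) −1, c) 0, d) 5, e) −2. All are invertible except $\begin{pmatrix} 5 & 2 \\ 10 & 4 \end{pmatrix}$.
29. a) −9, b) 0, c) 56.
30. −126.
31. a) −30, b) 5, c) 5, d) $5 \times 7^3 = 1715$.
32. a) −2, b) $-\frac{1}{2}$, c) −32.
33. −83, −108, 8964.
34. a ≠ 1.
35. a) $\begin{pmatrix} 1 & 0 & -1 \\ 2 & 1 & -1 \\ 5 & 1 & -3 \end{pmatrix}$. b) $\begin{pmatrix} -2 & -1 & 1 \\ 1 & 2 & -1 \\ -3 & -1 & 1 \end{pmatrix}$. c) 1.
38. a) (α − 3)(α + 1)(α + 2). b) −1, −2, 3.
39. For example, $A = B = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$.
40. For example, $A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ and λ = −1.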
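
The two counterexamples given as answers 39 and 40 can be checked mechanically. The following Python sketch is not part of the original notes, and the helper names `det2`, `add` and `scale` are hypothetical. It evaluates both sides of each claimed inequality for the 2×2 identity matrix.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    return a * d - b * c


def add(m1, m2):
    """Entrywise sum of two matrices of the same size."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]


def scale(lam, m):
    """Scalar multiple lam * m."""
    return [[lam * x for x in row] for row in m]


identity = [[1, 0], [0, 1]]

# Answer 39: A = B = I, compare det(A + B) with det(A) + det(B).
print(det2(add(identity, identity)), det2(identity) + det2(identity))

# Answer 40: A = I and lambda = -1, compare det(lambda A) with lambda det(A).
print(det2(scale(-1, identity)), -1 * det2(identity))
```

In each printed pair the two numbers differ, which is all the counterexample requires.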
Past class tests
The following selection of past MATH1131 class tests can be used as a guide to the degree of
difficulty of algebra class tests in this course. However, due to variations in timing and frequency,
the material examined in each class test differs from the tests given here. Thus students must
consult the Information booklet for MA1131, or page 226 of these notes, to ascertain the precise
topics that may be examined in each algebra class test.
UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF MATHEMATICS AND STATISTICS
MATH1131/1141 Mathematics 1A Algebra S1 2014
TEST 1 VERSION 1a
This sheet must be filled in and stapled to the front of your answers
Student’s Family Name Initials Student Number
Tutorial Code Tutor’s Name Mark
Note: The use of a calculator is NOT permitted in this test
Show all your working
All answers should be given in the appropriately SIMPLIFIED form.
QUESTIONS (Time allowed: 25 minutes)
1. (2 marks)
Consider the points A (4, 2, 3), B (5, −7, −2) and C (7, −25, −10).
(i) Find a parametric vector equation of the straight line AB.
(ii) Determine, with reasons, whether or not the point C is on the straight line AB.
2. (2 marks)
Find a parametric vector equation of the plane in R3 with Cartesian equation
2x1 − 5x2 + x3 = 7 .
Hence give two non–parallel non–zero vectors which are parallel to the plane.
3. (3 marks)
For the points A (1, 2, 3), B (3, 4, 1), C (3, 3, 4) calculate
(i) $\overrightarrow{AB} \times \overrightarrow{AC}$.
(ii) Area of △ABC.
4. (3 marks)
Let ℓ be the straight line in $\mathbb{R}^3$ through the point P (1, 2, 3) and parallel to the vector $v = \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}$. Let Q be the point with co-ordinates (1, 4, 4).
(i) Find $\mathrm{proj}_v\left(\overrightarrow{PQ}\right)$.
(ii) Find the shortest distance d between the line ℓ and Q.
Please write your answers on lined A4 paper and staple to this cover sheet.
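
For revision, the projection and point-to-line distance in Question 4 of this test can be checked numerically. The Python sketch below is not part of the original test paper. The helper names are hypothetical, and the data P = (1, 2, 3), v = (2, 3, 1), Q = (1, 4, 4) are as read from the question. It uses the projection formula proj_v(w) = ((w · v)/(v · v)) v and returns the squared distance so that the arithmetic stays exact.

```python
from fractions import Fraction


def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(u, v))


def proj(v, w):
    """Projection of w onto the direction v."""
    c = Fraction(dot(w, v), dot(v, v))
    return [c * x for x in v]


def dist_sq_point_line(p, v, q):
    """Squared distance from the point q to the line through p parallel to v."""
    pq = [b - a for a, b in zip(p, q)]
    perp = [x - y for x, y in zip(pq, proj(v, pq))]  # component of PQ orthogonal to v
    return dot(perp, perp)


P, v, Q = (1, 2, 3), (2, 3, 1), (1, 4, 4)
PQ = [b - a for a, b in zip(P, Q)]
print(proj(v, PQ))                   # projection of PQ onto v
print(dist_sq_point_line(P, v, Q))   # square of the shortest distance d
```

The shortest distance d is the square root of the second printed value.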
UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF MATHEMATICS AND STATISTICS
MATH1131/1141 Mathematics 1A Algebra S1 2014
TEST 1 VERSION 1b
This sheet must be filled in and stapled to the front of your answers
Student’s Family Name Initials Student Number
Tutorial Code Tutor’s Name Mark
Note: The use of a calculator is NOT permitted in this test
Show all your working
All answers should be given in the appropriately SIMPLIFIED form.
QUESTIONS (Time allowed: 25 minutes)
1. (2 marks)
Determine, with reasons, whether or not the 3 points A (3, 5, 7), B (5,−4, 3) and C (−5, 41, 22)
are collinear (i.e. all in a straight line).
2. (2 marks)
Find a parametric vector equation for the plane through the points A (1, 2, 1), B (3, 4, 2), C (5, 2, 1).
3. (3 marks)
For the points A (1, 2, 3), B (5, 6, 4) and C (2, 1, 3) calculate;
(i) the distance d(A,B) between A and B.
(ii) the projection $\mathrm{proj}_{\overrightarrow{AC}}\left(\overrightarrow{AB}\right)$.
4. (3 marks)
A triangle has vertices at the origin O, at A(4,−4, 8) and at B (0,−3,−6).
Let X be a point on the side OA such that $OX = \frac{3}{4}OA$, and Y a point on the side OB such
that $OY = \frac{2}{3}OB$.
Find parametric vector equations for the lines AY and BX and show that they intersect at
the point P (2,−3, 2).
Please write your answers on lined A4 paper and staple to this cover sheet.
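
Collinearity questions such as Question 1 of this test can be checked with a cross product: three points A, B, C in R3 are collinear exactly when AB × AC is the zero vector. The Python sketch below is not part of the original test paper, and the helper names are hypothetical. A written answer would still need the reasoning the question asks for.

```python
def sub(p, q):
    """Vector from q to p, computed componentwise."""
    return tuple(a - b for a, b in zip(p, q))


def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def collinear(a, b, c):
    """True when the points a, b, c lie on one straight line."""
    return cross(sub(b, a), sub(c, a)) == (0, 0, 0)


# Points of Question 1 above.
A, B, C = (3, 5, 7), (5, -4, 3), (-5, 41, 22)
print(cross(sub(B, A), sub(C, A)))
print(collinear(A, B, C))
```

With integer co-ordinates the comparison with the zero vector is exact, so no tolerance is needed.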
UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF MATHEMATICS AND STATISTICS
MATH1131/1141 Mathematics 1A Algebra S1 2014
TEST 1 VERSION 2b
This sheet must be filled in and stapled to the front of your answers
Student’s Family Name Initials Student Number
Tutorial Code Tutor’s Name Mark
Note: The use of a calculator is NOT permitted in this test
Show all your working
All answers should be given in the appropriately SIMPLIFIED form.
QUESTIONS (Time allowed: 25 minutes)
1. (3 marks)
Consider the line ℓ and plane Π in $\mathbb{R}^3$ with Cartesian equations:
ℓ : $\frac{x - 2}{3} = \frac{y + 1}{4} = \frac{z + 3}{1}$
Π : $3x - 2y - 4z = 11$.
(i) Find a parametric equation of the line ℓ.
(ii) Find the co-ordinates of the point P where ℓ meets Π.
2. (3 marks)
For the points A (1, 2, 1), B (3, 1,−1) and C (2, 4, 1);
(i) Calculate $\overrightarrow{AB} \times \overrightarrow{AC}$;
(ii) Find the area of the parallelogram with two adjacent sides AB and AC.
3. (4 marks)
Let ℓ be the straight line in $\mathbb{R}^3$ through the point P (1, 2, 3) and parallel to the vector $v = \begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix}$. Let Q be the point with co-ordinates (2, 4, 4).
(i) Find $\mathrm{proj}_v\left(\overrightarrow{PQ}\right)$;
(ii) Find the shortest distance d between the line ℓ and Q;
(iii) Find the co-ordinates m of the point M on ℓ which is closest to Q.
Please write your answers on lined A4 paper and staple to this cover sheet.
UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF MATHEMATICS AND STATISTICS
MATH1131/1141 Mathematics 1A Algebra S1 2014
TEST 1 VERSION 3a
This sheet must be filled in and stapled to the front of your answers
Student’s Family Name Initials Student Number
Tutorial Code Tutor’s Name Mark
Note: The use of a calculator is NOT permitted in this test
Show all your working
All answers should be given in the appropriately SIMPLIFIED form.
QUESTIONS (Time allowed: 25 minutes)
1. (2 marks)
For the points A (3, 2, 1) and B (6, 3,−2)
(i) Find a parametric vector equation for the line AB.
(ii) Find Cartesian equations for the line AB.
2. (2 marks)
Find a parametric vector equation for the plane in R3 with cartesian equation
7x1 + 2x2 − x3 = 1 .
Hence give two non-parallel, non-zero vectors which are parallel to the plane.
3. (2 marks)
For the points A (1, 4, 1), B (3, 5,−2) and C (5, 1, 2),
(i) Find cos (∠BAC).
(ii) Find $\mathrm{proj}_{\overrightarrow{AC}}\left(\overrightarrow{AB}\right)$.
4. (4 marks)
In the plane with a cartesian co-ordinate system, let OACB be a parallelogram, with O the
origin and $\overrightarrow{OA} = a$, $\overrightarrow{OB} = b$, where a ∦ b.
(i) Write down (and label as such), parametric vector equations of the lines OC and AB in
terms of a and b.
(ii) Find the co-ordinates of the point P of intersection of lines OC and AB in terms of a
and b.
(iii) Show that $|\overrightarrow{OP}| = |\overrightarrow{PC}|$ and $|\overrightarrow{PA}| = |\overrightarrow{PB}|$.
Please write your answers on lined A4 paper and staple to this cover sheet.
UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF MATHEMATICS AND STATISTICS
MATH1131/1141 Mathematics 1A Algebra S1 2014
TEST 1 VERSION 4a
This sheet must be filled in and stapled to the front of your answers
Student’s Family Name Initials Student Number
Tutorial Code Tutor’s Name Mark
Note: The use of a calculator is NOT permitted in this test
Show all your working
All answers should be given in the appropriately SIMPLIFIED form.
QUESTIONS (Time allowed: 25 minutes)
1. (2 marks)
For the points A (1, 2, 3), B (5, 7,−2) and C (8,−3, 2) in R3;
(i) Find the co-ordinates t of the point T on AB such that $\overrightarrow{AT} = 2\overrightarrow{TB}$.
(ii) Find the co-ordinates d of the point D such that the quadrilateral ABCD (named in
cyclic order) is a parallelogram.
2. (2 marks)
Find a parametric vector equation for the plane in R3 with cartesian equation
3x1 − x2 + 2x3 = 8 .
Hence give two non–parallel non–zero vectors which are parallel to the plane.
3. (2 marks)
For $a = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$, $b = \begin{pmatrix} 3 \\ -1 \\ 1 \end{pmatrix}$, calculate a × b.
4. (4 marks)
Let ℓ be the straight line in $\mathbb{R}^3$ through the point P (1, 2, 3) and parallel to the vector $v = \begin{pmatrix} 2 \\ 3 \\ 1 \end{pmatrix}$. Let Q be the point with co-ordinates (1, 4, 4).
(i) Find $\mathrm{proj}_v\left(\overrightarrow{PQ}\right)$.
(ii) Find the shortest distance d between the line ℓ and Q.
(iii) Find the co-ordinates m of the point M on ℓ which is closest to Q.
Please write your answers on lined A4 paper and staple to this cover sheet.
UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF MATHEMATICS AND STATISTICS
MATH1131/1141 Mathematics 1A Algebra S1 2014
TEST 2 VERSION 1a
This sheet must be filled in and stapled to the front of your answers
Student’s Family Name Initials Student Number
Tutorial Code Tutor’s Name Mark
Note: The use of a calculator is NOT permitted in this test
Show all your working
All answers should be given in the appropriately SIMPLIFIED form.
QUESTIONS (Time allowed: 25 minutes)
1. (2 marks)
For the complex numbers z = 1 + 5i, w = 3− 2i calculate
Im(z + 3iw) , z/w , Arg(1− 4i− w)
in simplified cartesian form.
2. (4 marks)
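Determine what conditions on $b_1, b_2, b_3, b_4$ are needed to ensure that $\begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix}$ belongs to the
span of the vectors $\begin{pmatrix} 1 \\ -2 \\ -2 \\ 6 \end{pmatrix}$, $\begin{pmatrix} 3 \\ -5 \\ -4 \\ 3 \end{pmatrix}$, $\begin{pmatrix} -3 \\ 4 \\ 2 \\ 12 \end{pmatrix}$.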
3. (4 marks)
Use the identity
$\sin\theta = \frac{1}{2i}\left(e^{i\theta} - e^{-i\theta}\right)$
to write $\sin^5\theta$ in terms of sin θ, sin 2θ, sin 3θ, . . . .
Please write your answers on lined A4 paper and staple to this cover sheet.
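
The expansion asked for in Question 3 follows from the binomial theorem applied to the given identity: for odd n, pairing the terms $e^{i(n-2k)\theta}$ and $e^{-i(n-2k)\theta}$ gives $\sin^n\theta = \frac{1}{2^{n-1}}\sum_{k=0}^{(n-1)/2} (-1)^{(n-1)/2 + k}\binom{n}{k}\sin((n-2k)\theta)$. The Python sketch below is not part of the original test paper, and the function name `sin_power_coeffs` is hypothetical. It computes these coefficients and spot-checks the n = 5 case numerically.

```python
from fractions import Fraction
from math import comb, sin


def sin_power_coeffs(n):
    """For odd n >= 1, return {m: c} such that sin(t)**n == sum of c * sin(m * t)."""
    if n < 1 or n % 2 == 0:
        raise ValueError("this sketch only handles odd positive n")
    half = (n - 1) // 2
    return {
        n - 2 * k: Fraction((-1) ** (half + k) * comb(n, k), 2 ** (n - 1))
        for k in range(half + 1)
    }


coeffs = sin_power_coeffs(5)
print(coeffs)

# Numerical spot check of the identity at one angle.
t = 0.7
print(sin(t) ** 5, sum(float(c) * sin(m * t) for m, c in coeffs.items()))
```

The coefficients for n = 5 can be compared with answer 48 in the Chapter 3 answers.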
UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF MATHEMATICS AND STATISTICS
MATH1131/1141 Mathematics 1A Algebra S1 2014
TEST 2 VERSION 1b
This sheet must be filled in and stapled to the front of your answers
Student’s Family Name Initials Student Number
Tutorial Code Tutor’s Name Mark
Note: The use of a calculator is NOT permitted in this test
Show all your working
All answers should be given in the appropriately SIMPLIFIED form.
QUESTIONS (Time allowed: 25 minutes)
1. (3 marks)
For the complex numbers z = −2− 3i, w = 1− i calculate
Re
(
(1 + 3i)z
)
, |z2| , z + 1
w
in simplified cartesian form.
2. (4 marks)
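Determine what conditions on $b_1, b_2, b_3, b_4$ are needed to ensure that $\begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix}$ belongs to the
span of the vectors $\begin{pmatrix} 1 \\ 2 \\ 4 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 1 \\ 1 \\ -1 \end{pmatrix}$, $\begin{pmatrix} -2 \\ 1 \\ -3 \\ -7 \end{pmatrix}$.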
3. (3 marks)
Use the identity
$\cos\theta = \frac{1}{2}\left(e^{i\theta} + e^{-i\theta}\right)$
to write $\cos^5\theta$ in terms of cos θ, cos 2θ, cos 3θ, . . . .
Please write your answers on lined A4 paper and staple to this cover sheet.
UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF MATHEMATICS AND STATISTICS
MATH1131/1141 Mathematics 1A Algebra S1 2014
TEST 2 VERSION 2a
This sheet must be filled in and stapled to the front of your answers
Student’s Family Name Initials Student Number
Tutorial Code Tutor’s Name Mark
Note: The use of a calculator is NOT permitted in this test
Show all your working
All answers should be given in the appropriately SIMPLIFIED form.
QUESTIONS (Time allowed: 25 minutes)
1. (3 marks)
Find the complex square roots of −24 − 70i by solving $(x + iy)^2 = -24 - 70i$ for x, y real.
2. (3 marks)
Determine, with reasons, whether or not the lines
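$\ell_1$ : $x = \begin{pmatrix} -1 \\ 1 \\ 3 \end{pmatrix} + \lambda \begin{pmatrix} 1 \\ 2 \\ -5 \end{pmatrix}$
and
$\ell_2$ : $x = \begin{pmatrix} 0 \\ -1 \\ 5 \end{pmatrix} + \mu \begin{pmatrix} -1 \\ -1 \\ 6 \end{pmatrix}$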
intersect.
3. (4 marks)
(i) Find the complex roots of $z^6 + 64 = 0$.
(ii) Hence factorise $p(z) = z^6 + 64$ into real linear and real irreducible quadratic factors.
Please write your answers on lined A4 paper and staple to this cover sheet.
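
The method named in Question 1 amounts to solving $x^2 - y^2 = a$ and $2xy = b$ for real x and y. Adding the modulus equation $x^2 + y^2 = |a + ib|$ gives $x^2$ and $y^2$ directly. The Python sketch below is not part of the original test paper, and the function name `complex_square_roots` is hypothetical. It works in floating point, so results are approximate in general.

```python
import math


def complex_square_roots(a, b):
    """Solve (x + iy)^2 = a + ib for real x, y and return both roots."""
    modulus = math.hypot(a, b)         # x^2 + y^2 = |a + ib|
    x = math.sqrt((modulus + a) / 2)   # add the equation x^2 - y^2 = a
    y = math.sqrt((modulus - a) / 2)   # subtract the equation x^2 - y^2 = a
    if b < 0:
        y = -y                         # 2xy = b fixes the relative sign of x and y
    root = complex(x, y)
    return root, -root


for root in complex_square_roots(-24, -70):
    print(root, root ** 2)
```

Squaring each printed root should reproduce −24 − 70i up to rounding.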
UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF MATHEMATICS AND STATISTICS
MATH1131/1141 Mathematics 1A Algebra S1 2014
TEST 2 VERSION 2b
This sheet must be filled in and stapled to the front of your answers
Student’s Family Name Initials Student Number
Tutorial Code Tutor’s Name Mark
Note: The use of a calculator is NOT permitted in this test
Show all your working
All answers should be given in the appropriately SIMPLIFIED form.
QUESTIONS (Time allowed: 25 minutes)
1. (3 marks)
Find the complex square roots of 16 − 30i by solving $(x + iy)^2 = 16 - 30i$ for x, y real.
2. (3 marks)
Determine, with reasons, whether or not the lines
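$\ell_1$ : $x = \begin{pmatrix} -1 \\ 3 \\ 0 \end{pmatrix} + \lambda \begin{pmatrix} 2 \\ 1 \\ 4 \end{pmatrix}$, λ ∈ R
and
$\ell_2$ : $x = \begin{pmatrix} 1 \\ 2 \\ 2 \end{pmatrix} + \mu \begin{pmatrix} 2 \\ 2 \\ 5 \end{pmatrix}$, µ ∈ R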
intersect.
3. (4 marks)
(i) Find the complex roots of $z^5 - 32 = 0$.
(ii) Hence factorise $p(z) = z^5 - 32$ into real linear and real irreducible quadratic factors.
Please write your answers on lined A4 paper and staple to this cover sheet.
UNIVERSITY OF NEW SOUTH WALES
SCHOOL OF MATHEMATICS AND STATISTICS
MATH1131/1141 Mathematics 1A Algebra S1 2014
TEST 2 VERSION 3a
This sheet must be filled in and stapled to the front of your answers
Student’s Family Name Initials Student Number
Tutorial Code Tutor’s Name Mark
Note: The use of a calculator is NOT permitted in this test
Show all your working
All answers should be given in the appropriately SIMPLIFIED form.
QUESTIONS (Time allowed: 25 minutes)
1. (3 marks)
For the complex numbers z = −1 − i, w = −11 + 7i find
$(-5 - i)z + 2w$, $\frac{w}{1 + 3i}$, $\mathrm{Arg}(2z)$.
2. (3 marks)
Let $z = -\sqrt{3} + 3i$. Find a polar form for z and the principal argument and “a + ib” form of
$z^{19}$.
Powers of real numbers may be left unsimplified.
3. (4 marks)
Find the general solution for the following linear system of equations by setting up an
augmented matrix, performing Gaussian Elimination and solving by back substitution.
x1 + 3x2 − 2x3 + 4x4 = 2
−2x1 − 4x2 + 5x3 − 9x4 = 0
−x1 + x2 + 4x3 − 6x4 = 6
Please write your answers on lined A4 paper and staple to this cover sheet.
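
The row reduction in Question 3 can be checked by machine. The Python sketch below is not part of the original test paper, and the function name `rref` is hypothetical. It performs Gauss–Jordan elimination to reduced row-echelon form, rather than the Gaussian elimination with back substitution that the question asks for, and it assumes the last column of the input is the right-hand side. Exact `Fraction` arithmetic avoids rounding.

```python
from fractions import Fraction


def rref(rows):
    """Reduced row-echelon form of an augmented matrix (last column = right-hand side)."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols - 1):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue  # no leading entry here: this is a non-leading column
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]  # scale the leading entry to 1
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m


# Augmented matrix of the system in Question 3 above.
system = [
    [1, 3, -2, 4, 2],
    [-2, -4, 5, -9, 0],
    [-1, 1, 4, -6, 6],
]
for row in rref(system):
    print([str(x) for x in row])
```

Columns without a leading entry correspond to the non-leading variables, which become the free parameters of the general solution.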
Index
analytic geometry, 14
angle between vectors, 46
Argand diagram, 82
axioms
fields, 77
basis vectors, 9
binomial theorem, 96, 109
Cauchy-Schwarz, 47
closed
under addition, 76
under multiplication, 76
coefficient matrix, 126
complex number, 77
argument, 85
Cartesian form, 78
conjugate, 81
imaginary part, 80
modulus, 85
nth roots, 94
polar form, 90
real part, 80
complex plane, 99
components, 15
coordinate vector, 9, 17
cosine rule, 44
cross product, 54
De Moivre’s theorem, 88
determinant, 188
determinants, 188
numerical evaluation, 193
distance, 43
dot product, 45
elementary row operations, 129
equations of planes, 63
Euler’s formula, 89
factor theorem, 103
fundamental theorem of algebra, 103
Gaussian elimination, 132, 134
geom3d, 68
geometric vector, 2
identity matrix, 172
induction, 107
infinitely many, 141
leading column, 132
leading entry, 132
leading row, 132
leading variable, 138
line, 19
Cartesian form, 25, 27
parametric vector form, 22
spanned by a vector, 20
symmetric form, 27
linear combination, 29
linear equation, 121
mathematical induction, 107
matrix, 166
augmented, 125, 170
diagonal of a, 172
equality, 167
identity, 172
inverse, 180
invertible, 180
multiplication, 170
scalar multiplication, 169
size, 166
square, 172
sum, 167
symmetric, 179
transpose, 175
zero, 168
matrix algebra, 165
associative law, 167, 169, 173
commutative law, 167
distributive law, 169, 173
minor, 188
n-dimensional space, 16
coordinates, 17
direction, 17
distance, 19
length, 18, 43
orthogonal vectors, 48
orthonormal set of vectors, 49
parallel vectors, 17
standard basis vectors, 17
non-leading variable, 138
orthogonal, 48
parallelepiped, 62
pivot, 134
plane, 29
Cartesian form, 63
parametric vector form, 32
point-normal form, 64
spanned by two vectors, 30
polynomial, 103
factorisation theorem, 104
root, 103
position vector, 15–17
projection, 52
real factorisation, 106
reduced row-echelon form, 133
remainder theorem, 103
row operations, 129
row operations and mult., 185
row-echelon form, 132
scalar quantity, 1
scalar triple product, 60
span, 30
system of linear equations, 125
consistent, 125
homogeneous, 125
inconsistent, 125
solution, 125
triangle inequality, 48
tuple, 14, 15, 17
vector, 2, 11
addition, 2, 3, 11
components, 9, 11
negative, 4, 13
scalar multiplication, 5, 11
zero, 2, 13
vector quantity, 1