CSC336, Summer 2020 Assignment 2
CSC336: Assignment 2
Due June 16, 2020 by 11:59pm
General instructions
general expectations, submission instructions, and reminders of course policies.
• Your work is graded on both correctness and clarity of communication. Solutions that are technically
submitting them.
• Solutions may be handwritten or typeset as appropriate, but must be submitted as PDFs and must
be legible. Any code should be submitted as specified.
The required filenames for this problem set are listed under additional instructions.
• Your work must be submitted online through MarkUs.
• Your submitted file(s) should not be larger than 9MB. You might exceed this limit if you use a word
processor like Microsoft Word to create a PDF; if it does, you should look into PDF compression
tools to make your PDF smaller, although please make sure that your PDF is still legible before
submitting!
• Submissions must be made before the due date on MarkUs. You may submit as many times as you
• The work you submit must be your own.
For the report questions, submit your solutions in a file called a2-report.pdf and your code in a2-
report.py. Your answers for this question will be graded on both correctness and the quality of presen-
tation of your results and analysis. It must be typeset using LaTeX, a word processor, or similar
For the other question, submit your written (or typed) solutions in a2.pdf and your code in a2.py.
If you import any non-standard modules (e.g tabulate or pandas) please do so in the if
name == ‘ main ’ block of your a2.py if possible (or put the code in a2-report.py)
Please start early and ask questions on Piazza if you need clarification or help with any part
of it. The coding questions require you to read some documentation in order to complete
them, so don’t leave it to the last minute.
Page 1/6
CSC336, Summer 2020 Assignment 2
1. Report Question - Banded Matrices (Based on Heath Computer Problem 2.17)
In this question, we will consider the performance of several different scipy linear system solvers when
applied to a banded linear system.
We will be solving the n× n banded linear system:

9 −4 1 0 · · · · · · 0
−4 6 −4 1 . . . ...
1 −4 6 −4 1 . . . ...
0
. . .
. . .
. . .
. . .
. . . 0
...
. . . 1 −4 6 −4 1
...
. . . 1 −4 5 −2
0 · · · · · · 0 1 −2 1


d1
d2
...
dn−1
dn

=
−1
n4

1
1
...
1
1

The physical system being modelled is a horizontal cantilevered beam - the bar is fixed at the left end (i.e
at x = 0, the displacement in the bar,d(x), is 0) and is free at the right end (x = 1). The bar is uniformly
loaded (i.e the right hand side vector is constant). The factor of
1
n4
is necessary to make the solution
comparable as we vary n.
The linear system arises from a discretization of a
differential equation (using finite differences simi-
lar to what we considered for approximating deriva-
tives in A1). This discretization means that di
corresponds to d(xi), xi =
i
n
, i = 1, . . . , n.
For example, solving this with different values of
n results in the following approximate solutions:
(a) LU
For a given value of n, setup and solve the linear system using scipy.linalg.solve. Form the matrix
using scipy.sparse.diags as we did in the Week 6 worksheet, but make sure to convert it to its dense
format (e.g. if S is a sparse matrix, S.toarray() returns the full matrix)
Time your code using time.perf counter() as we have done on previous worksheets.
(b) banded LU
Solve the linear system using scipy.linalg.solve banded and time your code. Read the documen-
tation to understand what format the input has to be in.
(c) sparse LU
Solve the linear system using scipy.sparse.linalg.spsolve and time your code.
(d) prefactored
From Heath, it turns out that this matrix can be factored as RRT , where,
Page 2/6
CSC336, Summer 2020 Assignment 2
R =

2 −2 1 0 · · · 0
0 1 −2 1 . . . ...
...
. . .
. . .
. . .
. . . 0
...
. . . 1 −2 1
...
. . . 1 −2
0 · · · · · · · · · 0 1

Solve the linear system using this factorization.
To do this, form R and use two calls to scipy.linalg.solve triangular with appropriate arguments.
(e) banded prefactored
Solve the linear system again using this factorization, but use two calls to scipy.linalg.solve banded
with appropriate arguments. Time your code.
(f) sparse prefactored
Repeat the above, but use scipy.sparse.linalg.spsolve triangular. Note, use the optional argu-
ment format=’csr’ in scipy.sparse.diags when forming the sparse R or spsolve triangular will com-
(g) Cholesky
It turns out that the matrix is symmetric positive definite, so we can also solve the system using the
Cholesky decomposition. Use scipy.linalg.cho factor and scipy.cho solve to solve the linear system.
(h) Experiment
Using the code you have written, perform an appropriate experiment to allow you to compare the
performance of these 5 7 different solvers. Your experiment should time each method for suitable
values of n. Present your results in an appropriate table and/or plot - in whatever way you feel best
For example, your results may look something like:
\ | | | | | | | |
\ method | LU | banded |sparse LU| R | banded R| sparse R| chol |
n \ | | | | | | | |
----------------------------------------------------------------------------------
200 | 0.55ms | 0.07ms | 0.19ms | 0.18ms | 0.07ms | 4.42ms | 0.31ms |
400 | 2.40ms | 0.07ms | 0.23ms | 0.29ms | 0.09ms | 9.64ms | 1.36ms |
600 | 5.93ms | 0.09ms | 0.31ms | 0.70ms | 0.12ms | 13.35ms | 3.55ms |
800 | 10.82ms | 0.11ms | 0.40ms | 1.34ms | 0.22ms | 17.44ms | 7.62ms |
1000 | 19.39ms | 0.13ms | 0.47ms | 2.11ms | 0.17ms | 21.91ms | 16.02ms |
1200 | 34.74ms | 0.16ms | 0.56ms | 3.17ms | 0.19ms | 26.17ms | 18.94ms |
1400 | 36.91ms | 0.18ms | 0.63ms | 4.17ms | 0.22ms | 31.76ms | 25.52ms |
1600 | 56.64ms | 0.20ms | 0.72ms | 6.20ms | 0.24ms | 35.01ms | 39.41ms |
Please ask for help on Piazza if you have trouble getting any of the approaches working
When you time your code, don’t include the time taken to construct any of the matrices.
are correct. If you can’t get one of the approaches implemented, clearly indicate this
in your work and refer to the above results for that method in your discussion
Page 3/6
CSC336, Summer 2020 Assignment 2
(i) Discussion
Comment on your results. Make connections to anything we have talked about regarding the costs
of the various methods in lecture or discussed in section 2.5 of Heath. We haven’t discussed general
sparse solvers, so your comments about the two sparse solvers can be limited to what you observe
and their relative performance to the other methods.
Note: You might find that spsolve triangular is slower than you might expect - check the source code
(linked in the online documentation 1) and see if you can figure out why that might be.
Make sure to spend enough time on this part - the discussion of the results is the most
important part of this exercise.
2. Report Question - Banded Matrices II (Based on Heath Computer Problem 2.17)
We’ll now consider the accuracy of the computed solutions. It turns out that the analytical solution of
the mathematical model 2 is d(x) =
1
24
(
(1− x)(4− (1− x)3)− 3), so we will compare against this in our
experiment.
(a) Experiment
For the methods in (b) and (e) above, plot the relative error in a loglog plot, where you vary n.
Note, increasing n corresponds to refining the discretization of the bar - with a spacing of h =
1
n
between points on the bar. (Since the left end of the bar is fixed, d(0) = 0. And d(1) corresponds to
the amount of bend in the far left right of the bar.)
Use values of n = 16, 32, 64, . . . , 16384, 32768, 65536. (i.e 2i, i = 4, . . . , 16).
(b) Discussion
Comment on the accuracy achieved. How does it vary with n? Why might the pre-factored approach
be more accurate than the LU approach?
Make sure to spend enough time on this part - the discussion of the results is the most
important part of this exercise.
3. Efficiently Translating Math into Code (Based on Exercise 2.21 in Heath)
We’ll again consider implementing the formula x = B−1(2A+1)(C−1 +A)b, where all matrices are n×n,
b is a column vector of length n, and 1 denotes the identity matrix. On the last two homeworks we
considered implementing this in two ways - the first explicitly computed inverses, the second instead
solved linear systems.
(a) Give an operation count for the formula as written (with the inverses explicitly computed
and all calculations performed left to right). Make sure we can easily follow where your total
operation count came from.
(b) Give an operation count for the formula implemented without explicitly computing any inverses
and ordering the calculations to reduce the total number of operations performed. Make
sure we can easily follow where your total operation count came from.
(c) Implement the two formulas and perform an appropriate experiment to compare the performance
of the two formulas. Clearly present the results of your experiment using a plot or table of values
setup or results.
(d) Comment on how your results compare with your operation count analysis.
1see lines 472-607 in https://github.com/scipy/scipy/blob/v1.4.1/scipy/sparse/linalg/dsolve/linsolve.py
2For example, see slide 17 in http://web.ncyu.edu.tw/~lanjc/lesson/C3/class/Chap06-A.pdf
Page 4/6
CSC336, Summer 2020 Assignment 2
4. Representing Permutation Matrices [autotested]
As we have seen, when we implement linear algebra formulas in code, we often do things to avoid unnec-
essary work - usually involving zeros in the structure of the matrices involved.
Here we are going to walk through how this can be done for permutation matrices.
An n× n permutation matrix, P , only contains n non-zero entries, so it is inefficient to explicitly form it
and perform a generic matrix multiply to compute PA.
Note: for this question, we’ll use the notation x = [x0, x1, . . . , xn−1], so that indices are all
zero-based.
One way to represent P is with a vector q of length n, for which qi = j if Pi,j = 1 For example, the 4× 4
permutation matrix,
P =

0 0 1 0
0 0 0 1
1 0 0 0
0 1 0 0
 ,
can be represented by the vector, q = [2, 3, 0, 1].
Moreover, in the context of Gaussian Elimination with partial pivoting, on each step of the algorithm, we
exchange the current row with one other row, so we only actually need to store which row was swapped
with the current row.
To represent the n− 1 permutation matrices that arise in partial pivoting, we will use the representation
employed by the LAPACK piv vector (see help(scipy.linalg.lu factor)).
In this representation, we use a vector of length n, call it p, where pi = j means that on step i of the
algorithm, row i was interchanged with row j. Note, there are only n − 1 steps, so we always have that
pn−1 = n− 1.
Given this p vector, we can efficiently perform the action of multiplying the n− 1 permutation matrices
together, to get P = Pn−2 · · ·P1P0, represented in the q vector format.
For example, if we have,
P0 =

0 0 1 0
0 1 0 0
1 0 0 0
0 0 0 1
 , P1 =

1 0 0 0
0 0 0 1
0 0 1 0
0 1 0 0
 , and P2 =

1 0 0 0
0 1 0 0
0 0 1 0
0 0 0 1
 ,
then p = [2, 3, 2, 3]. The product P2P1P0 can be verified to be P from before (or as a q vector, q =
[2, 3, 0, 1].
(a) Write a function in a2.py, p to q(p), that takes in a vector p as described above and returns the
vector q. Hint: your code should start with q = [0, 1, . . . , n − 1] and perform a sequence of swaps
based on stepping through p (carefully read how we defined p above).
(see the example in help(scipy.linalg.lu factor) - one of the autotests will do something like this to
Page 5/6
CSC336, Summer 2020 Assignment 2
(b) Write a function in a2.py, solve plu(A,b), that uses scipy.linalg.lu factor,
two calls to scipy.linalg.solve triangular, and your function from (a) to solve Ax = b.
Hint: it is maybe a bit confusing, but the only change is that you need to permute the rows in b (see
page 73 of Heath) using your q vector (i.e b = b[q]).