辅导案例-CSC336-Assignment 2
CSC336, Summer 2020 Assignment 2 CSC336: Assignment 2 Due June 16, 2020 by 11:59pm General instructions Please read the following instructions carefully before starting. They contain important information about general expectations, submission instructions, and reminders of course policies. • Your work is graded on both correctness and clarity of communication. Solutions that are technically correct but poorly written will not receive full marks. Please read over your solutions carefully before submitting them. • Solutions may be handwritten or typeset as appropriate, but must be submitted as PDFs and must be legible. Any code should be submitted as specified. The required filenames for this problem set are listed under additional instructions. • Your work must be submitted online through MarkUs. • Your submitted file(s) should not be larger than 9MB. You might exceed this limit if you use a word processor like Microsoft Word to create a PDF; if it does, you should look into PDF compression tools to make your PDF smaller, although please make sure that your PDF is still legible before submitting! • Submissions must be made before the due date on MarkUs. You may submit as many times as you want. Your last submission will be graded. • The work you submit must be your own. Additional instructions For the report questions, submit your solutions in a file called a2-report.pdf and your code in a2- report.py. Your answers for this question will be graded on both correctness and the quality of presen- tation of your results and analysis. It must be typeset using LaTeX, a word processor, or similar (please ask if you aren’t sure what is expected) For the other question, submit your written (or typed) solutions in a2.pdf and your code in a2.py. If you import any non-standard modules (e.g tabulate or pandas) please do so in the if name == ‘ main ’ block of your a2.py if possible (or put the code in a2-report.py) Please start early and ask questions on Piazza if you need clarification or help with any part of it. The coding questions require you to read some documentation in order to complete them, so don’t leave it to the last minute. Page 1/6 CSC336, Summer 2020 Assignment 2 1. Report Question - Banded Matrices (Based on Heath Computer Problem 2.17) In this question, we will consider the performance of several different scipy linear system solvers when applied to a banded linear system. We will be solving the n× n banded linear system: 9 −4 1 0 · · · · · · 0 −4 6 −4 1 . . . ... 1 −4 6 −4 1 . . . ... 0 . . . . . . . . . . . . . . . 0 ... . . . 1 −4 6 −4 1 ... . . . 1 −4 5 −2 0 · · · · · · 0 1 −2 1 d1 d2 ... dn−1 dn = −1 n4 1 1 ... 1 1 The physical system being modelled is a horizontal cantilevered beam - the bar is fixed at the left end (i.e at x = 0, the displacement in the bar,d(x), is 0) and is free at the right end (x = 1). The bar is uniformly loaded (i.e the right hand side vector is constant). The factor of 1 n4 is necessary to make the solution comparable as we vary n. The linear system arises from a discretization of a differential equation (using finite differences simi- lar to what we considered for approximating deriva- tives in A1). This discretization means that di corresponds to d(xi), xi = i n , i = 1, . . . , n. For example, solving this with different values of n results in the following approximate solutions: (a) LU For a given value of n, setup and solve the linear system using scipy.linalg.solve. Form the matrix using scipy.sparse.diags as we did in the Week 6 worksheet, but make sure to convert it to its dense format (e.g. if S is a sparse matrix, S.toarray() returns the full matrix) Time your code using time.perf counter() as we have done on previous worksheets. (b) banded LU Solve the linear system using scipy.linalg.solve banded and time your code. Read the documen- tation to understand what format the input has to be in. (c) sparse LU Solve the linear system using scipy.sparse.linalg.spsolve and time your code. (d) prefactored From Heath, it turns out that this matrix can be factored as RRT , where, Page 2/6 CSC336, Summer 2020 Assignment 2 R = 2 −2 1 0 · · · 0 0 1 −2 1 . . . ... ... . . . . . . . . . . . . 0 ... . . . 1 −2 1 ... . . . 1 −2 0 · · · · · · · · · 0 1 Solve the linear system using this factorization. To do this, form R and use two calls to scipy.linalg.solve triangular with appropriate arguments. Time your code. (e) banded prefactored Solve the linear system again using this factorization, but use two calls to scipy.linalg.solve banded with appropriate arguments. Time your code. (f) sparse prefactored Repeat the above, but use scipy.sparse.linalg.spsolve triangular. Note, use the optional argu- ment format=’csr’ in scipy.sparse.diags when forming the sparse R or spsolve triangular will com- plain. Time your code. (g) Cholesky It turns out that the matrix is symmetric positive definite, so we can also solve the system using the Cholesky decomposition. Use scipy.linalg.cho factor and scipy.cho solve to solve the linear system. Time your code. (h) Experiment Using the code you have written, perform an appropriate experiment to allow you to compare the performance of these 5 7 different solvers. Your experiment should time each method for suitable values of n. Present your results in an appropriate table and/or plot - in whatever way you feel best gets your results across. For example, your results may look something like: \ | | | | | | | | \ method | LU | banded |sparse LU| R | banded R| sparse R| chol | n \ | | | | | | | | ---------------------------------------------------------------------------------- 200 | 0.55ms | 0.07ms | 0.19ms | 0.18ms | 0.07ms | 4.42ms | 0.31ms | 400 | 2.40ms | 0.07ms | 0.23ms | 0.29ms | 0.09ms | 9.64ms | 1.36ms | 600 | 5.93ms | 0.09ms | 0.31ms | 0.70ms | 0.12ms | 13.35ms | 3.55ms | 800 | 10.82ms | 0.11ms | 0.40ms | 1.34ms | 0.22ms | 17.44ms | 7.62ms | 1000 | 19.39ms | 0.13ms | 0.47ms | 2.11ms | 0.17ms | 21.91ms | 16.02ms | 1200 | 34.74ms | 0.16ms | 0.56ms | 3.17ms | 0.19ms | 26.17ms | 18.94ms | 1400 | 36.91ms | 0.18ms | 0.63ms | 4.17ms | 0.22ms | 31.76ms | 25.52ms | 1600 | 56.64ms | 0.20ms | 0.72ms | 6.20ms | 0.24ms | 35.01ms | 39.41ms | Please ask for help on Piazza if you have trouble getting any of the approaches working When you time your code, don’t include the time taken to construct any of the matrices. You don’t have to include it in your report, but please make sure your implementations are correct. If you can’t get one of the approaches implemented, clearly indicate this in your work and refer to the above results for that method in your discussion Page 3/6 CSC336, Summer 2020 Assignment 2 (i) Discussion Comment on your results. Make connections to anything we have talked about regarding the costs of the various methods in lecture or discussed in section 2.5 of Heath. We haven’t discussed general sparse solvers, so your comments about the two sparse solvers can be limited to what you observe and their relative performance to the other methods. Note: You might find that spsolve triangular is slower than you might expect - check the source code (linked in the online documentation 1) and see if you can figure out why that might be. Make sure to spend enough time on this part - the discussion of the results is the most important part of this exercise. 2. Report Question - Banded Matrices II (Based on Heath Computer Problem 2.17) We’ll now consider the accuracy of the computed solutions. It turns out that the analytical solution of the mathematical model 2 is d(x) = 1 24 ( (1− x)(4− (1− x)3)− 3), so we will compare against this in our experiment. (a) Experiment For the methods in (b) and (e) above, plot the relative error in a loglog plot, where you vary n. Note, increasing n corresponds to refining the discretization of the bar - with a spacing of h = 1 n between points on the bar. (Since the left end of the bar is fixed, d(0) = 0. And d(1) corresponds to the amount of bend in the far left right of the bar.) Use values of n = 16, 32, 64, . . . , 16384, 32768, 65536. (i.e 2i, i = 4, . . . , 16). (b) Discussion Comment on the accuracy achieved. How does it vary with n? Why might the pre-factored approach be more accurate than the LU approach? Make sure to spend enough time on this part - the discussion of the results is the most important part of this exercise. 3. Efficiently Translating Math into Code (Based on Exercise 2.21 in Heath) We’ll again consider implementing the formula x = B−1(2A+1)(C−1 +A)b, where all matrices are n×n, b is a column vector of length n, and 1 denotes the identity matrix. On the last two homeworks we considered implementing this in two ways - the first explicitly computed inverses, the second instead solved linear systems. (a) Give an operation count for the formula as written (with the inverses explicitly computed and all calculations performed left to right). Make sure we can easily follow where your total operation count came from. (b) Give an operation count for the formula implemented without explicitly computing any inverses and ordering the calculations to reduce the total number of operations performed. Make sure we can easily follow where your total operation count came from. (c) Implement the two formulas and perform an appropriate experiment to compare the performance of the two formulas. Clearly present the results of your experiment using a plot or table of values and include it in a2.pdf. Please ask on Piazza if you aren’t sure about your experimental setup or results. (d) Comment on how your results compare with your operation count analysis. 1see lines 472-607 in https://github.com/scipy/scipy/blob/v1.4.1/scipy/sparse/linalg/dsolve/linsolve.py 2For example, see slide 17 in http://web.ncyu.edu.tw/~lanjc/lesson/C3/class/Chap06-A.pdf Page 4/6 CSC336, Summer 2020 Assignment 2 4. Representing Permutation Matrices [autotested] As we have seen, when we implement linear algebra formulas in code, we often do things to avoid unnec- essary work - usually involving zeros in the structure of the matrices involved. Here we are going to walk through how this can be done for permutation matrices. An n× n permutation matrix, P , only contains n non-zero entries, so it is inefficient to explicitly form it and perform a generic matrix multiply to compute PA. Note: for this question, we’ll use the notation x = [x0, x1, . . . , xn−1], so that indices are all zero-based. One way to represent P is with a vector q of length n, for which qi = j if Pi,j = 1 For example, the 4× 4 permutation matrix, P = 0 0 1 0 0 0 0 1 1 0 0 0 0 1 0 0 , can be represented by the vector, q = [2, 3, 0, 1]. Moreover, in the context of Gaussian Elimination with partial pivoting, on each step of the algorithm, we exchange the current row with one other row, so we only actually need to store which row was swapped with the current row. To represent the n− 1 permutation matrices that arise in partial pivoting, we will use the representation employed by the LAPACK piv vector (see help(scipy.linalg.lu factor)). In this representation, we use a vector of length n, call it p, where pi = j means that on step i of the algorithm, row i was interchanged with row j. Note, there are only n − 1 steps, so we always have that pn−1 = n− 1. Given this p vector, we can efficiently perform the action of multiplying the n− 1 permutation matrices together, to get P = Pn−2 · · ·P1P0, represented in the q vector format. For example, if we have, P0 = 0 0 1 0 0 1 0 0 1 0 0 0 0 0 0 1 , P1 = 1 0 0 0 0 0 0 1 0 0 1 0 0 1 0 0 , and P2 = 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 , then p = [2, 3, 2, 3]. The product P2P1P0 can be verified to be P from before (or as a q vector, q = [2, 3, 0, 1]. (a) Write a function in a2.py, p to q(p), that takes in a vector p as described above and returns the vector q. Hint: your code should start with q = [0, 1, . . . , n − 1] and perform a sequence of swaps based on stepping through p (carefully read how we defined p above). (see the example in help(scipy.linalg.lu factor) - one of the autotests will do something like this to confirm your function works properly) Please ask if you aren’t sure how to get started here Page 5/6 CSC336, Summer 2020 Assignment 2 (b) Write a function in a2.py, solve plu(A,b), that uses scipy.linalg.lu factor, two calls to scipy.linalg.solve triangular, and your function from (a) to solve Ax = b. Hint: it is maybe a bit confusing, but the only change is that you need to permute the rows in b (see page 73 of Heath) using your q vector (i.e b = b[q]). Please ask if any of the above is unclear. Make sure you don’t explicitly form any permutation matrices and multiply them - you should only work with the permutation vector. Page 6/6