Course assignment
ENG5322M: Introduction to computer programming
October 28, 2019
Instructions
You should submit the following through the ENG5322M Moodle submission system by 4pm,
22nd November 2019:
1. Python notebook (.ipynb format) used to solve each of the assignment problems.
This can be a separate notebook for each problem or a single file.
You will be marked according to the following criteria:
• Program correctness (60%): Does your program solve the problem given to you?
• Program readability (20%): Have you written your program in such a way that it is
easy to follow? Have you included appropriate comments?
• Testing (20%): Have you done appropriate testing to show your program works as
intended?
You are given two problems to solve for your project assignment which will be outlined in more
detail shortly. The two problems are focussed on:
1. Conjugate gradient method
2. Multivariate least square method
Notebook format
I expect to see you explain your code throughout your notebook using both text and images,
much like I have done in the lectures during this course. As a quick guide, the default type of a
cell is code, but if you change it to 'markdown' (look in the toolbar) and then hit shift and enter,
you will see it become text in the notebook.
1. Testing: You should write appropriate code that performs tests on your program.
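For example, a simple sanity check for Problem 1 (a sketch, assuming your solver returns a solution vector x for a given A and b) is to verify that Ax reproduces b to within a small tolerance:

import numpy as np

def check_solution(A, b, x, tol=1.0e-6):
    # Basic correctness test: does A x reproduce b to within tol?
    return np.allclose(np.dot(A, x), b, atol=tol)

A similar idea works for Problem 2, where the fitted coefficients can be compared against the quoted exact solution.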
1 Problem 1: Conjugate gradient method
Overview of task: Write a program that solves equations of the form Ax = b using the conjugate
gradient method. Use this to solve the equation:
 4 −1 1−1 4 −2
1 −2 4
x1x2
x3
 =
12−1
5
 (1)
1.1 Derivation
The problem involves finding the vector x that minimizes the scalar function
f(x) =
1
2
xTAx− bTx (2)
where the matrix A is symmetric and positive definite. Because the gradient of f(x) is ∇f = Ax − b,
and f(x) is minimized where ∇f = 0, the minimization is equivalent to solving

Ax = b   (3)
Gradient methods solve this minimization problem by iteration: starting from an initial estimate
x_0 of the solution vector, they iterate for k = 0, 1, 2, ... using

x_{k+1} = x_k + \alpha_k s_k   (4)

where the step length α_k is chosen so that x_{k+1} minimizes f(x_{k+1}) along the search direction
s_k, i.e. so that x_{k+1} satisfies

A(x_k + \alpha_k s_k) = b   (5)
If we now introduce the residual

r_k = b - A x_k   (6)

then pre-multiplying both sides of Eq. (5) by s_k^T gives s_k^T A x_k + α_k s_k^T A s_k = s_k^T b;
solving for α_k and using Eq. (6) gives

\alpha_k = \frac{s_k^T r_k}{s_k^T A s_k}   (7)
Intuitively we could choose s_k = -∇f = r_k, as this is the direction of the largest negative
change in f(x) (the method of steepest descent), but it converges very slowly. A better choice is
the conjugate gradient method, which uses the search direction

s_{k+1} = r_{k+1} + \beta_k s_k   (8)
The constant β_k is then chosen so that two successive search directions are conjugate to
each other, i.e.

s_{k+1}^T A s_k = 0   (9)

Substituting Eq. (8) into Eq. (9) gives (r_{k+1} + β_k s_k)^T A s_k = 0, so that

\beta_k = -\frac{r_{k+1}^T A s_k}{s_k^T A s_k}   (10)
The algorithm is then as follows:
Choose x_0 (any vector will do).
r_0 ← b − A x_0
s_0 ← r_0   (lacking a previous search direction, choose the direction of steepest descent)
do with k = 0, 1, 2, ...:
    α_k ← (s_k^T r_k) / (s_k^T A s_k)
    x_{k+1} ← x_k + α_k s_k
    r_{k+1} ← b − A x_{k+1}
    if |r_{k+1}| ≤ ε, exit loop (ε is the error tolerance)
    β_k ← −(r_{k+1}^T A s_k) / (s_k^T A s_k)
    s_{k+1} ← r_{k+1} + β_k s_k
It can be shown that the residual vectors r_1, r_2, r_3, ... produced by the algorithm are
mutually orthogonal, i.e. r_i · r_j = 0 for i ≠ j. Now suppose that we have carried out enough
iterations to have computed the whole set of n residual vectors. The residual resulting from the
next iteration must then be a null vector (r_{n+1} = 0), indicating that the solution has been
obtained. It thus appears that the conjugate gradient algorithm is not an iterative method at all,
because it reaches the exact solution after n computational cycles. In practice, however,
convergence is usually achieved in fewer than n iterations.

The conjugate gradient method is not competitive with direct methods in the solution of small
sets of equations. Its strength lies in the handling of large, sparse systems (where most elements
of A are zero). It is important to note that A enters the algorithm only through its multiplication
by a vector, that is, in the form Av, where v is a vector (either x_{k+1} or s_k). If A is sparse,
it is possible to write an efficient subroutine for the multiplication and pass it, rather than A
itself, to the conjugate gradient algorithm.
Figure 1: Pseudo-algorithm for the conjugate gradient method.
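As the excerpt above notes, A enters the algorithm only through products of the form Av. Purely as an illustration (not required for this assignment), a matrix-free product for a hypothetical tridiagonal matrix, with 2 on the diagonal and -1 on the off-diagonals, could be written as a function and passed to the solver in place of A:

import numpy as np

def Av_tridiagonal(v):
    # Hypothetical example: forms A v for a tridiagonal A (2 on the diagonal,
    # -1 on the off-diagonals) without ever storing the full matrix
    n = len(v)
    Av = np.zeros(n)
    for i in range(n):
        Av[i] = 2.0 * v[i]
        if i > 0:
            Av[i] -= v[i - 1]
        if i < n - 1:
            Av[i] -= v[i + 1]
    return Av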
It can be shown that the residual vectors r_1, r_2, r_3, ... are mutually orthogonal, i.e. r_i · r_j = 0 for i ≠ j.
Please note that while writing the algorithm, if you want to multiply the transpose of a vector
with a regular vector, say x^T y, you can use np.dot(x, y) in numpy. While implementing the
algorithm, set the tolerance to a very low value, tol=1.0e-9.
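For example, for two one-dimensional numpy arrays x and y, np.dot(x, y) returns the scalar x^T y:

import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 5.0, 6.0])
print(np.dot(x, y))   # 1*4 + 2*5 + 3*6 = 32.0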
1.2 Hint
The algorithm requires calculating r, α, and β (if you can work back from the worked example
above), along with the modulus of r. You can then break this whole algorithm down into very
simple matrix/vector multiplications, additions and subtractions that you carry out in a loop,
whether that is a for or a while loop:
a) Multiplying the matrix A by x and subtracting the result from the vector b to get r. This can
all be done using numpy arrays,
import numpy as np

A = np.array([[4, -1, 1], [-1, 4, -2], [1, -2, 4]])
b = np.array([12, -1, 5])
x = np.ones(3)
then r = b − Ax is nothing but
r = b - np.dot(A, x)
b) You also have to solve for α and β. Taking α, the numerator is nothing but the transpose of s
multiplied by r (which can be done using np.dot). The denominator is the transpose of s multiplied
by As, which you can again compute using np.dot, provided you precalculate A times s using np.dot.
numerator = np.dot(s,r)
As = np.dot(A,s)
denominator = np.dot(s,As)
alpha = numerator / denominator
You can similarly take hints from the above to solve for β.
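For instance, a sketch of the β calculation following the same pattern (assuming r already holds the updated residual r_{k+1} and that As = np.dot(A, s) has been precalculated as above):

As = np.dot(A, s)
beta = -np.dot(r, As) / np.dot(s, As)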
c) Computing x_{k+1} = x_k + α_k s_k, i.e. the next value of x, is nothing but
x = x + alpha * s
or in short form
x += alpha * s
We have also covered the += operator in the lectures. You can similarly solve for s, as shown just below.
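For reference, the corresponding search-direction update from Eq. (8), assuming beta has already been computed, is simply

s = r + beta * s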
d) To calculate |r|, take the square root of the dot product of r with itself,
rnorm = np.sqrt(np.dot(r, r))
and then use an if statement to check whether rnorm is less than the tolerance (the tol=1.0e-9
suggested above) in order to break from the loop.
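Inside the loop, that check might be as simple as this sketch (using the tolerance suggested earlier):

if rnorm < 1.0e-9:
    break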
If I have said not to use a numpy library function, I mean do not cheat by using an equivalent
function such as np.linalg.solve(), which we already covered in the lectures and which acts as a
one-line solution:

def conjugate_gradient_cheating(A, b):
    x = np.linalg.solve(A, b)
    return x
Instead, I want you to implement your own algorithm:

def conjugate_gradient_snazzy(A, b):
    # ... your own implementation of the conjugate gradient method ...
    return x
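Putting hints (a)–(d) together, one possible sketch of such a function is the following (the starting guess, the iteration cap, and the variable names are illustrative assumptions rather than a prescribed solution):

import numpy as np

def conjugate_gradient_sketch(A, b, tol=1.0e-9):
    # Illustrative sketch of the algorithm in Figure 1
    x = np.ones(len(b))                         # starting guess (assumption)
    r = b - np.dot(A, x)                        # initial residual, Eq. (6)
    s = r.copy()                                # first direction: steepest descent
    for k in range(len(b)):                     # at most n iterations, as in the excerpt above
        As = np.dot(A, s)
        alpha = np.dot(s, r) / np.dot(s, As)    # Eq. (7)
        x = x + alpha * s                       # Eq. (4)
        r = b - np.dot(A, x)                    # updated residual
        if np.sqrt(np.dot(r, r)) < tol:         # converged?
            break
        beta = -np.dot(r, As) / np.dot(s, As)   # Eq. (10)
        s = r + beta * s                        # Eq. (8)
    return x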
2 Problem 2: Multivariate least squares method
We want to find the regression coefficients a_0, a_1, ..., a_m for the following equation:

y_i = a_0 + a_1 x_{1i} + a_2 x_{2i} + a_3 x_{3i} + \ldots + a_m x_{mi}   (11)
To solve this, let us suppose we take the case m = 2 and add an error term ε_i to our equation:

y_i + \epsilon_i = a_0 + a_1 x_{1i} + a_2 x_{2i}   (12)

We can then obtain an objective function S to minimize as:

\epsilon_i = a_0 + a_1 x_{1i} + a_2 x_{2i} - y_i   (13)

\epsilon_i^2 = (a_0 + a_1 x_{1i} + a_2 x_{2i} - y_i)^2   (14)

S = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n} (a_0 + a_1 x_{1i} + a_2 x_{2i} - y_i)^2   (15)
Minimizing S with respect to a_0, a_1, and a_2, we get:

\frac{\partial S}{\partial a_0} = 2 \sum_{i=1}^{n} (a_0 + a_1 x_{1i} + a_2 x_{2i} - y_i)   (16)

\frac{\partial S}{\partial a_1} = 2 \sum_{i=1}^{n} (a_0 + a_1 x_{1i} + a_2 x_{2i} - y_i) x_{1i}   (17)

\frac{\partial S}{\partial a_2} = 2 \sum_{i=1}^{n} (a_0 + a_1 x_{1i} + a_2 x_{2i} - y_i) x_{2i}   (18)
Equating these to zero, we obtain three equations in the three unknowns a_0, a_1, a_2:

a_0 + a_1 \langle x_1 \rangle + a_2 \langle x_2 \rangle = \langle y \rangle   (19)

a_0 \langle x_1 \rangle + a_1 \langle x_1 x_1 \rangle + a_2 \langle x_1 x_2 \rangle = \langle x_1 y \rangle   (20)

a_0 \langle x_2 \rangle + a_1 \langle x_1 x_2 \rangle + a_2 \langle x_2 x_2 \rangle = \langle x_2 y \rangle   (21)
where \langle \cdot \rangle denotes an average quantity, defined as

\langle x \rangle = \frac{1}{n} \sum_{i=1}^{n} x_i   (22)
We can write this in matrix form so as to use matrix methods:

\begin{bmatrix} 1 & \langle x_1 \rangle & \langle x_2 \rangle \\ \langle x_1 \rangle & \langle x_1 x_1 \rangle & \langle x_1 x_2 \rangle \\ \langle x_2 \rangle & \langle x_1 x_2 \rangle & \langle x_2 x_2 \rangle \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} \langle y \rangle \\ \langle x_1 y \rangle \\ \langle x_2 y \rangle \end{bmatrix}   (23)

or XA = Y. The matrix of unknown coefficients, A, is obtained by finding the inverse of X,
so that A = X^{-1} Y.
Extending to m variables, we get:

X = \begin{bmatrix} 1 & \langle x_1 \rangle & \langle x_2 \rangle & \ldots & \langle x_m \rangle \\ \langle x_1 \rangle & \langle x_1 x_1 \rangle & \langle x_1 x_2 \rangle & \ldots & \langle x_1 x_m \rangle \\ \langle x_2 \rangle & \langle x_1 x_2 \rangle & \langle x_2 x_2 \rangle & \ldots & \langle x_2 x_m \rangle \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \langle x_m \rangle & \langle x_1 x_m \rangle & \langle x_2 x_m \rangle & \ldots & \langle x_m x_m \rangle \end{bmatrix}, \quad Y = \begin{bmatrix} \langle y \rangle \\ \langle x_1 y \rangle \\ \langle x_2 y \rangle \\ \vdots \\ \langle x_m y \rangle \end{bmatrix}   (24)
Overview of task: The assignment is to use the above logic to find the coefficients a_0, a_1, a_2
for the data given in Table 1 (also provided as a tab-delimited text file). The exact solution is
y = -2.145 - 3.117 x_1 + 1.486 x_2.
2.1 Hint
To program this efficiently in Python, we can make use of a trick. Imagine we read the values of x
and y from a file one line at a time; rather than calculating the averages, we can simply maintain
the running sums, taking the factor of 1/n out and cancelling it on both sides:
\frac{1}{n} \begin{bmatrix} n & \sum_{i=1}^{n} x_{1i} & \sum_{i=1}^{n} x_{2i} \\ \sum_{i=1}^{n} x_{1i} & \sum_{i=1}^{n} x_{1i} x_{1i} & \sum_{i=1}^{n} x_{1i} x_{2i} \\ \sum_{i=1}^{n} x_{2i} & \sum_{i=1}^{n} x_{1i} x_{2i} & \sum_{i=1}^{n} x_{2i} x_{2i} \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \end{bmatrix} = \frac{1}{n} \begin{bmatrix} \sum_{i=1}^{n} y_i \\ \sum_{i=1}^{n} x_{1i} y_i \\ \sum_{i=1}^{n} x_{2i} y_i \end{bmatrix}   (25)
If you import all the functions from the numpy library (from numpy import *), define a variable m
as the number of independent variables, and create x=zeros((m+1,m+1)), y=zeros((m+1,1)) and
xi=[0.0]*(m+1), then you can populate both matrices x and y in a nested loop with the two
statements x[j,k] += xi[j]*xi[k] and y[j,0] += yi*xi[j], provided you set xi[0]=1, xi[1] to the
value of the first predictor (first column), xi[2] to the value of the second predictor (second
column), and so on; yi is the dependent variable (last column) in Table 1. Finally, you can simply
convert the numpy arrays x and y to numpy matrices so that you can get the inverse automatically:
X=mat(x.copy()), Y=mat(y.copy()), and A=X.I*Y.
Also, the data file provided is tab-delimited. Look up how to split a string using the .split()
function.
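Putting the hint together, a sketch of the whole procedure might look like the following (the file name data.txt and the header handling are illustrative assumptions; use the actual file you were given):

from numpy import zeros, mat

m = 2                               # number of independent variables
x = zeros((m + 1, m + 1))           # accumulates the left-hand sums of Eq. (25)
y = zeros((m + 1, 1))               # accumulates the right-hand sums of Eq. (25)

with open('data.txt') as infile:    # hypothetical file name
    for line in infile:
        fields = line.split()       # split() handles tabs and spaces alike
        try:
            values = [float(f) for f in fields]
        except ValueError:
            continue                # skip a header line, if any
        if len(values) != m + 1:
            continue                # skip blank or malformed lines
        xi = [1.0] + values[:m]     # xi[0] = 1, then the predictors x1, x2
        yi = values[m]              # dependent variable (last column)
        for j in range(m + 1):
            for k in range(m + 1):
                x[j, k] += xi[j] * xi[k]
            y[j, 0] += yi * xi[j]

X = mat(x.copy())
Y = mat(y.copy())
A = X.I * Y                         # regression coefficients a0, a1, a2
print(A)                            # expect values close to -2.145, -3.117, 1.486

The try/except simply skips a header or blank line if one is present; adapt it to the actual layout of the file provided.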
Table 1: Data to be fitted
x1 x2 y
0.408 0.58 -2.39
0.429 0.51 -2.53
0.492 0.53 -3.38
0.529 0.6 -2.72
0.569 0.58 -2.95
0.677 0.64 -3.32
0.703 0.61 -3.45
0.841 0.66 -3.81
0.911 0.73 -3.9
0.936 0.72 -4.11
0.96 0.57 -4.24
0.978 0.84 -4.81
0.993 0.69 -4.1
1.038 0.75 -4.28
1.051 0.76 -4.05
1.119 0.85 -4.23
1.134 0.75 -4.58
1.16 0.85 -4.4
1.209 0.72 -5.05
1.272 0.82 -4.79
1.303 0.86 -4.76
1.353 0.98 -4.65
1.367 0.9 -5.18
1.388 0.71 -5.5
1.425 1.04 -5.01
1.453 0.9 -5.15
1.484 0.77 -5.8
1.503 0.93 -5.33
1.536 0.81 -5.23
1.563 0.83 -6.11
1.655 1.2 -5.34
1.684 0.96 -6.13
1.897 1.24 -6.45