COMPSCI 369
THE UNIVERSITY OF AUCKLAND
FIRST SEMESTER, 2020 —MID-SEMESTER TEST
Version 1
Campus: VIRTUAL
COMPUTER SCIENCE
Computational Science
(Time allowed: ONE hour)
NOTE: The test has 15 questions, and it is worth 100 marks in total.
Answer all questions in a single document and upload it as a pdf file to
Canvas, like you do for the assignments.
This test has been designed as an open book one hour test. However
you have a full 24 hours to work on this test. You are allowed to use the full
24 hours and you are allowed to use the provided materials.
When a question requests to explain your answer, that means you need
to justify how you came up with the solution or why you made a certain
choice. Be concise and as clear as possible.
The work must be your own. You are not allowed to copy text or code
or solutions. We will not accept quotes to be valid answers. So, use your own
words when answering the questions. We will run the test through Turnitin.
You will be asked to include a sentence which states that the test is your
own work.
CONTINUED
QUESTION SHEET 2 COMPSCI 369
Section A: Integrity Statement
Copy down and sign the following statement.
For the 24 hour duration of this test, I, (insert your name here), confirm that I will not discuss the
content of the test with anyone else. I will not give any assistance to another student taking this test. I
will not receive any assistance from any person or tutoring service. I will not post any information about
this test on the Internet. [2 marks]
Section B: Computational Biology and Numerical Integration
The Runge-Kutta (RK4) Algorithm
1. Does RK4 take greater or fewer computations per iteration than the Euler method? [2 marks]
2. Explain why RK4 is preferable to the Euler method. [5 marks]
3. The RK4 algorithm calculates four ‘k-terms’ before combining them together in a weighted aver-
age. In your own words, once they are all combined together, what is this value an estimation of?
[5 marks]
4. Pretend that you are teaching a friend how RK4 works. The friend has never studied the algorithm
before. Write down your detailed explanation of the k2 term. What does it estimate and how is it
calculated? [7 marks]
5. Explain why finding a numerical artifact in a simulation is not a good thing. [3 marks]
Building a dynamical model
Pretend you have recently been hired by a farmer to build a computational model to help her understand
her dairy farm. She wants the model to simulate her grassy grazing fields, which are shared by many
cows and many sheep. Both cows and sheep eat the grass in these fields, and the number of cows grows
proportionally to how much grass they eat. Similarly, the sheep grow proportionally to how much grass
they eat. The grass grows at a constant rate.
6. Write down a set of differential equations for modelling this system. [12 marks]
7. List each VARIABLE in your model, and explain what it represents. [5 marks]
8. List each PARAMETER in your model and explain what it represents. [4 marks]
9. Identify two assumptions that are implicit in your model. [6 marks]
10. Draw a sketch of what you imagine a timeseries of this system might look like. Explain the main
features of the plot and why [10 marks]
CONTINUED
QUESTION SHEET 3 COMPSCI 369
Section C: Sequence Alignments
11. You are given the following two alignments generated for the same pair of sequences. Given that
the match score is sm, the mismatch score is sn, and the linear gap score is −g, which one aligns
better the two sequences and why? Consider also the biological relevance as a criterion in the
decision. Is there a better alignment than the two presented? Could you say that the two sequences
Alignment 1 Alignment 2
ATG-CCAGG- ATGCCAGG-
AT-CC-A-GG AT-CCA-GG
[7 marks]
12. What is the minimum length and the maximum possible length of an optimal sequence alignment
of two sequences A and B, where A can be any sequence of length 5 and B can be any sequence of
length 7, in the case of
(a) global alignment
(b) local alignment
(c) overlap alignment
You can assume that the mismatch and gap scores can take only non-positive values, and the match
[8 marks]
13. The following alignment score matrices were produced with one of the algorithms we discussed in
the lectures.
F1 :
B1 B2 B3 B4
0 −1 −2 −3 −4
A1 −1 1 0 −1 −2
A2 −2 0 0 1 0
A3 −3 −1 −1 1 0
A4 −4 −2 −2 0 0
F2 :
B1 B2 B3 B4 B5 B6
0 0 0 0 0 0 0
A1 0 0 0 0 0 0 0
A2 0 1 1 1 1 1 1
A3 0 1 2 2 2 2 2
A4 0 1 2 2 3 3 3
A5 0 1 2 2 3 4 4
A6 0 1 2 2 3 4 5
A7 0 1 2 2 3 4 5
Given that the mismatch and gap scores can take only non-positive values, and the match score is
CONTINUED
QUESTION SHEET 4 COMPSCI 369
(a) What alignment algorithm could have produced the matrix F1? What is the optimal score?
How will the optimal alignment look like?
(b) What alignment algorithm could have produced the matrix F2? The elements in red colour
and bold face trace the optimal path. Write down the alignment for the traced path. Is this the
only solution?
[10 marks]
14. You work with a biologist who just sequenced the complete genome of a new bacterial organism.
She does not know if the sequence is new or already exists, but found an online database of all
bacterial genomes sequenced so far. She does not know how to solve the problem computationally.
So, she comes to you with the database containing M sequences of length N1 . . . Nm and the
new sequence X of length Nx, and asks you to help her find out if the sequence is known. If the
sequence is unknown, she wants to know what is the closest bacterial genome to X.
(a) Explain how you will solve this problem, be explicit about the type of alignment algorithm
you will use and justify your choices.
(b) What will be the time complexity of the whole data analysis? Explain your result.
[8 marks]
15. Can we solve the multiple alignment problem exactly and efficiently? Explain your answer.
[6 marks]

Email:51zuoyejun

@gmail.com