PAPER CODE NO.: COMP219
EXAMINER: Xiaowei Huang
DEPARTMENT: Computer Science
Tel. No.: 0151 795 4260
First Semester Class Test 2020/21
Advanced Artificial Intelligence
TIME ALLOWED: 50 minutes
INSTRUCTIONS TO CANDIDATES
Answer FOUR questions.
If you attempt to answer more questions than the required number of questions (in any section),
the marks awarded for the excess questions answered will be discarded (starting with your
lowest mark).
PAPER CODE COMP219 page 1 of 15 Continued
Figure 1: Joint probability for student grade and intelligence
Question 1: Basic Knowledge (Total 25 marks)
1. Which learning task best suits the following description: given a set of training instances {(x^(1), y^(1)), ..., (x^(n), y^(n))} of an unknown target function f, where x^(i) is the feature vector and y^(i) is the label for i ∈ {1, ..., n}, it outputs a model h that best approximates f.
2 marks
(a) Unsupervised learning
(b) Supervised learning
(c) Reinforcement learning
(d) none of the above
2. Compute the following probability according to the table in Figure 1
P (Intelligence = High) =
2 marks
(a) 0.28
(b) 0.3
(c) 0.35
(d) 0.2
3. Compute the following conditional probability according to the table in Figure 1
P (Intelligence = High | Grade = C) =
2 marks
(a) 0.04/0.44
(b) 0.07/0.25
(c) 0.35/0.7
(d) 0.28/0.7
4. Which of the following statements are correct regarding Figure 2. 3 marks
Figure 2: Diagram for y = x2 function
Figure 3: Probabilistic Graph of Diseases and Symptoms
(a) min_x x^2 = 1
(b) min_x x^2 = 0
(c) 0 is in argmin_x x^2
(d) π is in argmin_x x^2
5. Use the information provided in Figure 3 to compute the following joint probability
P (A = a1, B = b1) =
2 marks
(a) 0.06
(b) 0.36
(c) 0.24
(d) 0.42
6. Use the information provided in Figure 3 to compute the following expression
max_{A,B} P(A,B) =
2 marks
(a) 0.42
(b) a1, b1
(c) 0.28
(d) a0, b1
7. Use the information provided in Figure 3 to compute the following maximum a posteriori expression
MAP (A,B) =
3 marks
(a) 0.36
(b) a1, b1
(c) 0.5
(d) a0, b1
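The three Figure 3 questions above all reduce to the chain rule P(A, B) = P(A) P(B | A). Figure 3's actual numbers are not reproduced in this text, so the sketch below uses purely illustrative placeholder values to show the mechanics of computing a joint entry, the maximum joint probability, and the MAP assignment:

```python
# Figure 3's CPT values are NOT reproduced in this text; the numbers below
# are illustrative placeholders only.
P_A = {"a0": 0.4, "a1": 0.6}               # prior P(A)
P_B_given_A = {                            # conditional P(B | A)
    "a0": {"b0": 0.3, "b1": 0.7},
    "a1": {"b0": 0.4, "b1": 0.6},
}

# Joint by the chain rule: P(A, B) = P(A) * P(B | A)
joint = {(a, b): P_A[a] * P_B_given_A[a][b]
         for a in P_A for b in ("b0", "b1")}

max_prob = max(joint.values())              # max_{A,B} P(A, B): a number
map_assignment = max(joint, key=joint.get)  # MAP(A, B): an assignment
print(joint[("a1", "b1")], max_prob, map_assignment)
```

Note the distinction the options are probing: the max is a probability value, while the MAP is the arg max, i.e. an assignment of values to A and B.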
8. Understanding a simple numpy command.
Assume that a = np.arange(10).reshape((2, 5)). Then a.T.shape = 2 marks
(a) 10
(b) (2,5)
(c) (5,2)
(d) 2
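The transpose behaviour can be checked directly in numpy:

```python
import numpy as np

a = np.arange(10).reshape((2, 5))  # 2 rows, 5 columns: [[0..4], [5..9]]
print(a.shape)    # (2, 5)
print(a.T.shape)  # transpose swaps the two axes: (5, 2)
```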
9. Let x = (−2, 0, 6,−4) be a vector. Then its L2 norm ||x||2 = 2 marks
(a) √30
(b) √56
(c) 12
(d) 2
10. Let x = (0, 2, 3,−4) be a vector. Then its L1 norm ||x||1 = 3 marks
(a) 2
(b) 9
(c) 4
(d) 3
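Both norm questions can be verified with numpy's `linalg.norm`, which takes the norm order via the `ord` argument:

```python
import numpy as np

x2 = np.array([-2, 0, 6, -4])
l2 = np.linalg.norm(x2, ord=2)   # sqrt(4 + 0 + 36 + 16) = sqrt(56)
print(l2)                        # ≈ 7.4833, i.e. sqrt(56)

x1 = np.array([0, 2, 3, -4])
l1 = np.linalg.norm(x1, ord=1)   # |0| + |2| + |3| + |-4| = 9
print(l1)                        # 9.0
```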
Figure 4: Decision Trees
Figure 5: Dataset for football players
Question 2: Simple Learning Models (Total: 33 marks)
1. Which decision trees in Figure 4 can represent the boolean formula (x2 ∧ x5) 3 marks
(a) A
(b) B
(c) C
(d) none of the above
2. Figure 5 gives an example dataset D about football players. Please indicate which of the following expressions computes its entropy H_D(Y), where Y is the random variable for labelling: 3 marks
(a) −(1/2) log2(1/2) − (1/2) log2(1/2)
(b) −(1/2) log2(1/2)
(c) (1/2) log2(1/2)
(d) none of the above
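As a quick sanity check of the fractions appearing in these options: assuming (an assumption, since Figure 5's rows are not reproduced here) that the labels split evenly into halves, the entropy evaluates to one bit:

```python
import math

# Hypothetical: assume the Figure 5 labels split 1/2 and 1/2, matching the
# fractions in the options above (the real dataset is not reproduced here).
p_pos, p_neg = 0.5, 0.5
entropy = -p_pos * math.log2(p_pos) - p_neg * math.log2(p_neg)
print(entropy)  # 1.0 bit: a 50/50 split has maximal binary entropy
```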
3. Figure 5 gives an example dataset D about football players. Please compute the information gain of splitting over the feature Wind, InfoGain(D, V1) = H_D(Y) − H_D(Y | V1): 3 marks
Figure 6: A set of two-dimensional input samples
(a)
(b)
(c)
(d)
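Since Figure 5's rows are not reproduced in this text, the sketch below computes InfoGain(D, V1) = H_D(Y) − H_D(Y | V1) on a hypothetical stand-in dataset; the `Wind` values and labels are illustrative assumptions, but the function itself follows the formulas collected at the end of this paper:

```python
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, feature, label):
    """InfoGain(D, V) = H_D(Y) - H_D(Y | V)."""
    labels = [r[label] for r in rows]
    n = len(rows)
    h_y_given_v = 0.0
    for v in {r[feature] for r in rows}:
        subset = [r[label] for r in rows if r[feature] == v]
        h_y_given_v += (len(subset) / n) * entropy(subset)
    return entropy(labels) - h_y_given_v

# Hypothetical stand-in for the Figure 5 dataset (the real rows are not
# reproduced in this text): a Wind feature and a binary label Y.
D = [
    {"Wind": "strong", "Y": "yes"}, {"Wind": "strong", "Y": "yes"},
    {"Wind": "weak", "Y": "no"},    {"Wind": "weak", "Y": "no"},
]
print(info_gain(D, "Wind", "Y"))  # 1.0: here Wind perfectly predicts Y
```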
4. Assume that, as shown in Figure 6, we have a set of training instances with two features
X1 and X2:
{(0.5, 3), (0.5, 0.5), (1, 0.5), (1, 2), (1.5, 0.5), (2, 4), (2.5, 3), (3, 0.5), (3, 3.5), (3.5, 4)}
such that
• the instances (0.5, 3), (1, 2), (2, 4), (2.5, 3), (3, 3.5), (3.5, 4) are labelled with value 1,
• the other points are labelled with value 2.
Now, we have a new input (2, 2.5); please indicate which of the following points are not considered for the 3-nn classification, according to the Manhattan (L1) distance. 3 marks
(a) (1,2)
(b) (1.5,0.5)
(c) (2.5,3)
(d) (2,4)
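This question and the regression question that follows both hinge on the same nearest-neighbour computation, which can be checked directly with the data given above:

```python
import numpy as np

# Training set from the question: (X1, X2) points and their labels.
points = np.array([(0.5, 3), (0.5, 0.5), (1, 0.5), (1, 2), (1.5, 0.5),
                   (2, 4), (2.5, 3), (3, 0.5), (3, 3.5), (3.5, 4)])
labels = np.array([1, 2, 2, 1, 2, 1, 1, 2, 1, 1])

query = np.array([2, 2.5])
dists = np.abs(points - query).sum(axis=1)      # Manhattan (L1) distance
nearest = np.argsort(dists, kind="stable")[:3]  # indices of the 3 closest
print(points[nearest])           # the 3 nearest neighbours of (2, 2.5)
print(labels[nearest].mean())    # 3-nn regression: mean of their labels
```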
5. Continue with the above. Now, for new input (2, 2.5), please compute its regression result
for the 3-nn regression, according to the Manhattan distance. 3 marks
(a)
(b)
(c)
(d)
Figure 7: A confusion matrix for two-class problem
6. Assume a two-class problem where each instance is classified as either 1 (positive) or -1
(negative). We have a training dataset of 1,000 instances, such that 600 of them are labeled
as 1 and 400 of them are labeled as -1. After training, we apply the trained model to classify
the 1,000 instances and find that 800 instances are classified the same as their true labels.
Moreover, we know that 500 instances are classified as 1 and, within these 500 instances, 50 instances are actually labelled as -1. Please indicate which numbers should be filled in for (A, B, C, D) in Figure 7. 3 marks
(a)
(b)
(c)
(d)
7. Continue with the above question. Please compute the error rate of the trained model.
3 marks
(a)
(b)
(c)
(d)
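The four cells of the confusion matrix, and the error rate asked for next, follow arithmetically from the counts stated in the question; how Figure 7's (A, B, C, D) map onto the four cells depends on the matrix layout, which is not reproduced in this text:

```python
# Reconstructing the confusion-matrix cells from the counts in the question.
total = 1000
pred_pos = 500                 # instances classified as 1
fp = 50                        # of those, actually labelled -1
tp = pred_pos - fp             # 450 true positives
fn = 600 - tp                  # 600 actual positives in total -> 150 missed
tn = 400 - fp                  # 400 actual negatives in total -> 350 correct

assert tp + tn == 800          # matches the 800 correctly classified
error_rate = (fp + fn) / total
print((tp, fn, fp, tn), error_rate)  # (450, 150, 50, 350) 0.2
```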
8. Given training data {(x^(i), y^(i)) | 1 ≤ i ≤ m}, which one of the following is closest to logistic regression? 3 marks
(a) minimise the loss L̂ = (1/m) Σ_{i=1}^{m} (w^T x^(i))
(b) minimise the loss L̂ = (1/m) Σ_{i=1}^{m} (σ(w^T x^(i)) − y^(i))^2, where σ(a) = 1/(1 + exp(−a))
(c) minimise the loss L̂ = (1/m) Σ_{i=1}^{m} (w^T x^(i) − y^(i))^2
(d) minimise the loss L̂ = (1/m) Σ_{i=1}^{m} (log(w^T x^(i)) − y^(i))^2
Figure 8: A simple 3-layer neural network
9. Let f(X) = 3X1^2 + 4X2^8 + 5X3 be a function, where X1, X2 and X3 are three variables. Please indicate which of the following gradient expressions is correct: 3 marks
(a) ∇_X f(X) = 6X1
(b) ∇_X f(X) = (3, 4, 5)
(c) ∇_X f(X) = (6X1, 28X2, 5)
(d) ∇_X f(X) = (6X1, 28X2^8, 5)
10. The Naive Bayes method is based on the following assumption, where Xi for i ∈ {1..n} represent features of an instance and Y represents the parameter: 3 marks
(a) P(X1, ..., Xn) = ∏_{i=1}^{n} P(Xi)
(b) P(X1, ..., Xn | Y) = ∏_{i=1}^{n} P(Xi | Y)
(c) P(X1, ..., Xn) = Σ_{i=1}^{n} P(Xi)
(d) P(X1, ..., Xn | Y) = Σ_{i=1}^{n} P(Xi | Y)
Part 3: Deep Learning
1. Figure 8 gives a simple 3-layer neural network with 2 inputs x1, x2 and a single output z.
Please indicate which of the following expressions is correct for the gradient ∂z/∂x2?
(a) (∂z/∂y1)(∂y1/∂x2)
(b) (∂z/∂y2)(∂y2/∂x2)
(c) (∂z/∂y1)(∂y1/∂x2) + (∂z/∂y2)(∂y2/∂x2) + (∂z/∂y3)(∂y3/∂x2)
(d) (∂z/∂y1)(∂y1/∂x2) (∂z/∂y2)(∂y2/∂x2) (∂z/∂y3)(∂y3/∂x2)
Figure 9: A two-dimensional input and a convolutional filter
2. The following four questions are related to Figure 9. In Figure 9, we have a two-dimensional input and a convolutional filter. Please indicate which of the following statements is correct if zero-padding is not applied?
(a) the result is a one dimensional array of length 16
(b) the result is a one dimensional array of length 25
(c) the result is a two dimensional array of shape (4, 4)
(d) the result is a two dimensional array of shape (5, 5)
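Figure 9's entries are not reproduced in this text; assuming (per the shapes named in the options) a 5x5 input and a 2x2 filter, the valid-convolution output shape follows the rule n - k + 1 per dimension:

```python
import numpy as np

# Assumed shapes (Figure 9's actual values are not reproduced here):
# a 5x5 input and a 2x2 filter, with illustrative entries.
inp = np.arange(25).reshape(5, 5)
filt = np.array([[1, 0], [0, 1]])

h = inp.shape[0] - filt.shape[0] + 1   # no zero-padding: 5 - 2 + 1 = 4
w = inp.shape[1] - filt.shape[1] + 1
out = np.array([[(inp[i:i+2, j:j+2] * filt).sum() for j in range(w)]
                for i in range(h)])
print(out.shape)  # (4, 4): without zero-padding the output shrinks
```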
3. Continue with the above question. Please indicate which of the following statements are correct for the result of applying the convolutional filter to the input?
(a) there is an element 47
(b) there is an element 39
(c) there is an element 34
(d) there is an element 48
4. Take the same input as in Figure 9 and apply maxpooling with a 2×2 filter. Assume that we ignore those entries on which no pooling operation is applied. Please indicate which of the following statements is correct?
(a) the result is a one dimensional array of length 2
(b) the result is a one dimensional array of length 4
(c) the result is a two dimensional array of shape (2, 2)
(d) the result is a two dimensional array of shape (4, 4)
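Under the same assumption of a 5x5 input (Figure 9's values are not reproduced here), non-overlapping 2x2 pooling that ignores the leftover fifth row and column yields a 2x2 result:

```python
import numpy as np

# Assumed 5x5 input with illustrative entries (not Figure 9's actual values).
inp = np.arange(25).reshape(5, 5)

# 2x2 max-pooling with stride 2; the leftover 5th row/column, on which no
# complete pooling window fits, is ignored as the question instructs.
pooled = np.array([[inp[i:i+2, j:j+2].max() for j in range(0, 4, 2)]
                   for i in range(0, 4, 2)])
print(pooled.shape)  # (2, 2)
print(pooled)        # the max of each non-overlapping 2x2 window
```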
5. Continue with the above question. Please indicate which of the following statements are correct for the result of applying maxpooling with a 2×2 filter to the input?
(a) there is a single element with value 6
(b) there is a single element with value 9
(c) there are two elements with value 6
(d) there are two elements with value 9
6. Which of the following statements are correct with respect to the features and feature manifolds?
(a) In an end-to-end learning of feature hierarchy, initial modules capture low-level features, middle modules capture mid-level features, and last modules capture high-level, class-specific features.
(b) It is very often the case that high-dimensional data lie in lower-dimensional feature manifolds.
(c) Feature manifolds are linear, so it is easier to compute.
(d) The computation of the coordinates of the data with respect to feature manifolds enables an easy separation of the data.
Part 4: Probabilistic Graphical Models
10. Please select two structures which are key for Bayesian networks to represent a joint probability distribution:
(a) chain rules
(b) joint probability distribution table
(c) graph
(d) conditional probability distributions
Figure 10: Simple Probabilistic Graphical Model
Figure 11: Joint probability of two random variables
11. Figure 10 provides a simple probabilistic graphical model of three variables S, G, and I .
We already know that
P |= (S⊥G | I)
Which of the following is the value of P (i1, s1, g2)?
(a) 0.0409
(b) 0.0408
(c) 0.235
(d) 0.480
Figure 12: A Bayesian network
12. Figure 11 (a) provides a joint probability P. Let I(P) be the set of conditional independence assertions of the form (X⊥Y | Z) that hold in P. Which of the following is correct?
(a) (X⊥∅ | Y ) ∈ I(P )
(b) (X⊥Y | ∅) ∈ I(P )
(c) (Y⊥∅ | X) ∈ I(P )
(d) I(P ) = ∅
13. Figure 11 (b) provides a joint probability P. Let I(P) be the set of conditional independence assertions of the form (X⊥Y | Z) that hold in P. Which of the following is correct?
(a) (X⊥∅ | Y ) ∈ I(P )
(b) (Y⊥∅ | X) ∈ I(P )
(c) (X⊥Y | ∅) ∈ I(P )
(d) I(P ) = ∅
Figure 13: A simple probabilistic graphical model
14. Consider the Bayesian network model G in Figure 12 and indicate which of the following
are in I(G):
(a) (L⊥I,D, S | G)
(b) (G⊥S | D, I)
(c) (I⊥D | ∅)
(d) (D⊥I, S | ∅)
15. Consider the Bayesian network model G in Figure 12 and calculate the following value
P (i1, d0, g2, s1, l0) =
(a) 0.4608
(b) 0.004608
(c) 0.5329
(d) 0.001435
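A joint entry of a Bayesian network is the product of one CPT entry per variable, following the graph's factorisation. Figure 12's CPT values are not reproduced in this text, so the numbers below are an assumption (they follow the widely used "student" network example); only the chain-rule structure is the point:

```python
# Assumed CPT entries (NOT taken from this text; Figure 12's tables are not
# reproduced here). The factorisation itself follows the network structure:
#   P(i1, d0, g2, s1, l0) = P(i1) P(d0) P(g2 | i1, d0) P(s1 | i1) P(l0 | g2)
p_i1 = 0.3
p_d0 = 0.6
p_g2_given_i1_d0 = 0.08
p_s1_given_i1 = 0.8
p_l0_given_g2 = 0.4

joint = p_i1 * p_d0 * p_g2_given_i1_d0 * p_s1_given_i1 * p_l0_given_g2
print(joint)  # ≈ 0.004608 under these assumed CPT values
```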
16. Consider the probabilistic graphical model in Figure 12 and indicate which of the following
statements are correct.
(a) We can do evidential reasoning by computing P (l1) and P (l1 | i0, d0)
(b) We can do causal reasoning by computing P (i1) and P (i1 | l0, g0)
(c) We can do intercausal reasoning by computing P (i1) and P (i1 | l0, g0)
(d) We can do causal reasoning by computing P (l1) and P (l1 | d0)
17. Consider the probabilistic graphical model in Figure 13 and indicate which of the following
statements are correct.
(a) D can influence I when G is not observed and L is observed
(b) D can influence L when G is observed
(c) G can influence S when I is not observed
(d) D can influence I when G is observed
18. Consider the probabilistic graphical model in Figure 13 and indicate which of the following observations enable the influence of D over S.
(a) G and L are observed, I is not observed
(b) G is observed, L and I are not observed
(c) G and I are observed, L is not observed
(d) L is observed, G and I are not observed
This page collects some formulas/expressions that may be used in this exam.
1. entropy:
H(Y) = − Σ_{y ∈ values(Y)} P(y) log2 P(y)
2. conditional entropy:
H(Y | X) = Σ_{x ∈ values(X)} P(X = x) H(Y | X = x)
where
H(Y | X = x) = − Σ_{y ∈ values(Y)} P(Y = y | X = x) log2 P(Y = y | X = x)
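The two formulas above translate directly into code; the example values below are illustrative (a fair binary Y carries 1 bit of entropy, and conditioning on a perfectly informative X removes all of it):

```python
import math

def entropy(probs):
    """H(Y) = -sum_y P(y) log2 P(y), skipping zero-probability values."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def conditional_entropy(p_x, p_y_given_x):
    """H(Y|X) = sum_x P(X=x) H(Y | X=x)."""
    return sum(px * entropy(py) for px, py in zip(p_x, p_y_given_x))

# Illustrative numbers: a fair binary Y, then Y fully determined by X.
print(entropy([0.5, 0.5]))                              # 1.0
print(conditional_entropy([0.5, 0.5], [[1.0], [1.0]]))  # 0.0
```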
