ECMM426
(with Answers)
UNIVERSITY OF EXETER
COLLEGE OF ENGINEERING, MATHEMATICS
AND PHYSICAL SCIENCES
COMPUTER SCIENCE
Examination, May 2020
Computer Vision
Module Leader: Dr Anjan Dutta
Duration: TWO HOURS + 30 MINUTES UPLOAD TIME
Answer ALL the questions.
Question 1 is worth 80 marks, while question 2 is worth 20 marks.
The marks for this module are calculated from 40% of the percentage mark for
this paper plus 60% of the percentage mark for associated coursework.
This is an OPEN BOOK examination.
ECMM426 – 22/Apr/2022 – with answers Page 1
SECTION A (Multiple Choice Questions)
Question 1
There are FORTY multiple choice questions with several possible choices each.
Clearly mark or write all the correct choices. Please note these questions might
have multiple correct answers, with partial marking.
1. Consider a grayscale image of size 200 × 300. How much space in kilobytes
(KB) would this image require when stored on disk?
(i) 20 KB
(ii) 60 KB
(iii) 300 KB
(iv) 100 KB
(2 marks)
(ii) [2]
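The arithmetic behind answer (ii) can be sketched as follows. This is a minimal check that assumes 8 bits (1 byte) per pixel, no compression, and 1 KB = 1000 bytes; none of these is stated in the question, and the function name is illustrative only.

```python
def grayscale_size_kb(height, width, bits_per_pixel=8, bytes_per_kb=1000):
    """Uncompressed storage for a single-channel image, in kilobytes.

    bits_per_pixel and bytes_per_kb are assumptions, not given in the question.
    """
    total_bytes = height * width * bits_per_pixel // 8
    return total_bytes / bytes_per_kb


print(grayscale_size_kb(200, 300))  # 60.0
```

With 1 KB taken as 1024 bytes the same image needs roughly 58.6 KB, which is still closest to option (ii).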
2. Which of the following is a challenge when dealing with computer vision
problems?
(i) Variations due to geometric changes (like pose, scale etc)
(ii) Variations due to photometric factors (like illumination, appearance etc)
(iii) Background clutter
(iv) All of the above
(2 marks)
(iv) [2]
3. Convolution of a Gaussian filter with another Gaussian filter generates:
(i) Box filter
(ii) Unsharp filter
(iii) Gaussian filter
(iv) None of the above
(2 marks)
ECMM426 (2020) 1
(iii) [2]
4. Suppose we have the following noisy image (Figure 1):
Figure 1: salt and pepper noise
This type of noise in the image is called ‘salt & pepper’ noise. Which type
of filter should be applied to denoise the image?
(i) Linear filter
(ii) Median filter
(iii) Sobel filter
(iv) None of the above
(2 marks)
(ii) [2]
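Why a median filter suits salt & pepper noise can be illustrated on a single image row. The sketch below is a simplified 1D version with a window of 3 and replicated borders (both are assumptions made for brevity); a mean filter is included for contrast.

```python
from statistics import median


def _pad(signal, half):
    # Replicate the border samples so every window has the same length.
    return [signal[0]] * half + list(signal) + [signal[-1]] * half


def median_filter_1d(signal, size=3):
    """Replace each sample by the median of its window."""
    padded = _pad(signal, size // 2)
    return [median(padded[i:i + size]) for i in range(len(signal))]


def mean_filter_1d(signal, size=3):
    """Linear (box) filter: replace each sample by the mean of its window."""
    padded = _pad(signal, size // 2)
    return [sum(padded[i:i + size]) / size for i in range(len(signal))]


row = [10, 10, 255, 10, 10, 0, 10, 10]  # 255 = "salt", 0 = "pepper"
print(median_filter_1d(row))  # [10, 10, 10, 10, 10, 10, 10, 10]
print(mean_filter_1d(row))    # the outliers are smeared into their neighbours
```

The median discards isolated outliers entirely, whereas the linear filter only spreads them over the window.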
5. ‘Ringing’ is an image artefact generated by:
(i) Box filter
(ii) Gaussian filter
(iii) Unsharp filter
(iv) All of the above
(2 marks)
(i) [2]
6. What would be the relation between the original and modified image if the
original image is convolved with the following filter (Figure 2)?
Figure 2: filter
(i) Blurred image
(ii) Sharpened image
(iii) Inverted image
(iv) Rotated image
(2 marks)
(ii) [2]
7. If we convolve an image with the filter given below (Figure 3), what would
be the relation between the original and modified image?
Figure 3: filter
(i) The original image will be shifted to the right by 1 pixel
(ii) The original image will be shifted down by 1 pixel
(iii) The original image will be shifted to the left by 1 pixel
(iv) The original image will be shifted up by 1 pixel
(2 marks)
(i) [2]
8. In Canny edge detection, we will get more continuous edges if we make the
following change to the hysteresis thresholding
(i) increase the high threshold
(ii) decrease the high threshold
(iii) increase the low threshold
(iv) decrease the low threshold
(2 marks)
(iv) [2]
9. In the following image (Figure 4), you can find an edge labelled in the red
region. Which form of discontinuity creates this kind of edge?
Figure 4: chair
(i) Depth Discontinuity
(ii) Surface colour Discontinuity
(iii) Illumination discontinuity
(iv) None of the above
(2 marks)
(i), (ii) [2]
10. What kind of edges would the Canny edge detector generate without doing
the non-maximum suppression step?
(i) Very thin edges
(ii) Thick edge regions
(iii) Perfect edges
(iv) None of the above
(2 marks)
(ii) [2]
11. What are the main benefits of detecting image edges using the zero-crossings
of Laplacian of Gaussian (LoG) of the image rather than thresholding its
gradient magnitude?
(i) Zero-crossing produces contours instead of regions
(ii) Zero-crossing is less sensitive to image noise
(iii) Zero-crossing is independent of threshold parameter
(iv) All of the above
(2 marks)
(i), (iii) [2]
12. Let $\lambda_1$ and $\lambda_2$ be the eigenvalues of the second order moment
matrix $M$, from which we can compute the measure for detecting Harris corners as
$R = \lambda_1 \lambda_2 - k(\lambda_1 + \lambda_2)^2$, where $k$ is a small
constant. Which of the following criteria on $R$ lead to a region being rejected
as a corner?
(i) R > 0
(ii) |R| is small
(iii) R < 0
(iv) All of the above
(2 marks)
(ii), (iii) [2]
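The rejection criteria can be sketched in code. The values k = 0.05 and the threshold of 1.0 on |R| are illustrative assumptions, as are the function names; the question only says that k is a small constant.

```python
def harris_response(lam1, lam2, k=0.05):
    """R = lam1*lam2 - k*(lam1 + lam2)**2, i.e. det(M) - k*trace(M)**2."""
    return lam1 * lam2 - k * (lam1 + lam2) ** 2


def classify(lam1, lam2, k=0.05, threshold=1.0):
    """Label a region from its eigenvalues (threshold is a hypothetical choice)."""
    r = harris_response(lam1, lam2, k)
    if abs(r) < threshold:
        return "flat"    # |R| small: rejected
    if r < 0:
        return "edge"    # R < 0: rejected
    return "corner"      # R large and positive: accepted


print(classify(10.0, 9.0))  # corner
print(classify(10.0, 0.1))  # edge
print(classify(0.1, 0.1))   # flat
```

Two large eigenvalues give a large positive R, one dominant eigenvalue gives a negative R, and two small eigenvalues give an R close to zero.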
13. Which of the following transformations is the Harris corner detector
invariant to?
(i) Translation
(ii) Scaling
(iii) Rotation
(iv) Photometric
(2 marks)
(i),(iii),(iv) [2]
14. Let $f_1^1$ be a SIFT descriptor from an image $I_1$, and let $f_2^1$ and
$f_2^2$ be two SIFT descriptors from another image $I_2$, which are respectively
the nearest and second nearest neighbours (in L2 distance) of $f_1^1$ in $I_2$.
$f_1^1$ from $I_1$ is said to be matched to $f_2^1$ in $I_2$ if it satisfies the
following criterion, where $\|\cdot\|$ denotes L2 distance:
(i) $\frac{\|f_1^1 - f_2^1\|}{\|f_1^1 - f_2^2\|} \approx 0$
(ii) $\frac{\|f_1^1 - f_2^1\|}{\|f_1^1 - f_2^2\|} \approx 1$
(iii) $\frac{\|f_1^1 - f_2^1\|}{\|f_1^1 - f_2^2\|}$ [relation symbol missing in source] $1$
(iv) $\frac{\|f_1^1 - f_2^1\|}{\|f_1^1 - f_2^2\|}$ [relation symbol missing in source] $1$
(2 marks)
(ii) [2]
15. Suppose you have to rotate an image (Figure 5). Rotating an image amounts to
multiplying its point co-ordinates by a specific matrix to get the transformed
image.
Figure 5: rotation
For simplicity, consider a single point in the image that is rotated from
co-ordinates (1, 0) to co-ordinates (0, 1). Which of the following matrices would
we have to multiply by?
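(i) $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$
(ii) $\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}$
(iii) $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$
(iv) $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$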
(2 marks)
(iii) [2]
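Answer (iii) is the counter-clockwise rotation by 90 degrees. A minimal check, treating the point as a column vector multiplied on the left by the matrix (the helper names are illustrative):

```python
import math


def rotation_matrix(theta):
    """2D counter-clockwise rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def apply(matrix, point):
    """Multiply a 2x2 matrix by a column vector (x, y)."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y,
            matrix[1][0] * x + matrix[1][1] * y)


option_iii = [[0, -1], [1, 0]]
print(apply(option_iii, (1, 0)))  # (0, 1)

# The same matrix falls out of the general formula at theta = 90 degrees,
# up to floating-point rounding.
print([[round(v) for v in row] for row in rotation_matrix(math.pi / 2)])
```

Option (iv) also sends (1, 0) to (0, 1), but it is a reflection about the line y = x (determinant -1), not a rotation.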
16. The Cartesian coordinate of the homogeneous coordinate (x, y, w) is
(i) $\left(\frac{x}{w}, \frac{y}{w}\right)$
(ii) $\left(\frac{x}{w}, \frac{y}{w}, 1\right)$
(iii) (x, y, 1)
(iv) (x, y)
(2 marks)
(i) [2]
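The conversion in answer (i) is a division by the last component. A short sketch, with a hypothetical function name and an explicit guard for w = 0 (a point at infinity, which has no Cartesian equivalent):

```python
def to_cartesian(x, y, w):
    """Convert the homogeneous point (x, y, w) to Cartesian (x/w, y/w)."""
    if w == 0:
        raise ValueError("w = 0 is a point at infinity")
    return (x / w, y / w)


print(to_cartesian(4, 6, 2))  # (2.0, 3.0)
print(to_cartesian(2, 3, 1))  # (2.0, 3.0), the same Cartesian point
```

Homogeneous coordinates that differ only by a non-zero scale factor map to the same Cartesian point.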
17. Let $R_1$ and $R_2$ be two matrices that define two different rotation
transformations. Which one of the following is true about them?
(i) $R_1 R_2 \neq R_2 R_1$
(ii) $R_1 R_2 R_1 = R_2 R_1 R_2$
(iii) $R_2 R_1 > R_1 R_2$
(iv) $R_1 R_2 < R_2 R_1$
(2 marks)
(i) [2]
18. In 2D coordinate system, mirroring about the line y = x can be achieved by
the following transformation matrix:
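(i) $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$
(ii) $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$
(iii) $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$
(iv) $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$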
(2 marks)
(i) [2]
19. Let $O$ be the origin of a 2D coordinate system $C$ and $P$ ($\neq O$) be any
point in $C$. We further assume that $R$ is a rotation about $O$ and $T$ is the
translation from the point $P$ to $O$. The transformation matrix that achieves
the rotation $R$ about the point $P$ can be written as:
(i) $R T R^{-1}$
(ii) $T^{-1} R T^{-1}$
(iii) $T^{-1} R T$
(iv) $T R T$
(2 marks)
(iii) [2]
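The composition in answer (iii) reads right to left: translate P to the origin, rotate, translate back. A minimal sketch using 3 × 3 homogeneous matrices; the point P = (2, 3) and the 90 degree angle are arbitrary illustrative choices.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def rotate_about(px, py, theta):
    t = translation(-px, -py)    # T: moves P to the origin O
    t_inv = translation(px, py)  # T^-1: moves O back to P
    return matmul(t_inv, matmul(rotation(theta), t))  # T^-1 R T


def transform(m, x, y):
    """Apply a homogeneous 3x3 matrix to the point (x, y)."""
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


m = rotate_about(2, 3, math.pi / 2)
print(transform(m, 2, 3))  # P itself stays fixed: approximately (2, 3)
print(transform(m, 3, 3))  # one unit right of P moves to one unit above: approximately (2, 4)
```

The centre of rotation is the only fixed point of the combined transform, which is a quick way to check the order of the factors.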
20. Which of the following could affect the intrinsic parameters of a camera?
(i) A crooked lens system
(ii) Diamond/Rhombus shaped pixels with non right angles
(iii) The aperture configuration and construction
(iv) Any offset of the image sensor from the lens’s optical centre
(2 marks)
(i), (ii), (iv) [2]
21. Which of the following statements describes an affine camera but not a
general perspective camera?
(i) Relative sizes of visible objects in a scene can be determined without
prior knowledge
(ii) Can be used to determine the distance from an object of a known height
(iii) Approximates the human visual system
(iv) An infinitely long plane can be viewed as a line from the right angle
(2 marks)
(i) [2]
22. Let us assume that P number of unknown 3D points are projected into F
number of images where the 2D coordinates of those P points and their
correspondences are known. Assuming W (shape: 2F × P ) as the 2D
coordinates of those P points in F images, R (shape: 2F × 3) as the camera
rotation matrix for F images and S as the reconstructed 3D real world points,
their relation can be expressed as W = R × S, where W , R are known and
S is unknown. The solution of S can be given by:
(i) $R^{-1} W$
(ii) $W^T R^{-1} W^T W$
(iii) $W^T R^{-1} W^T R^{-1}$
(iv) None of the above
(2 marks)
(ii) [2]
23. Let us assume that P number of unknown 3D points are projected into F
number of images where the 2D coordinates of those P points and their
correspondences are known. Assuming W (shape: 2F × P ) as the 2D
coordinates of those P points in F images, R (shape: 2F × 3) as the camera
rotation matrix for F images and S as the reconstructed 3D real world points,
their relation can be expressed as W = R× S, where W is known and R, S
are unknown. The solutions of R and S can be estimated by:
(i) Random matrices that satisfy the expression
(ii) Singular value decomposition (SVD) and then selecting appropriate
submatrix depending on matrix rank
(iii) Selecting those rows and columns that respectively maximise and
minimise the matrix rank
(iv) None of the above
(2 marks)
(ii) [2]
24. Recognising an ‘Armchair’ among a collection of ‘Wing chair’, ‘Deck
chair’, ‘Desk chair’, ‘Barber chair’, ‘Operator chair’, ‘Armchair’,
‘Executive chair’, ‘Garden chair’ is known as:
(i) Instance recognition
(ii) Category recognition
(iii) Deep recognition
(iv) None of the above
(2 marks)
(i) [2]
25. Which one of the following steps is not involved in bag-of-words model?
(i) Feature extraction
(ii) Feature quantisation
(iii) Non-maximum suppression
(iv) Visual vocabulary creation
(2 marks)
(iii) [2]
26. Let us assume that for creating a bag-of-visual-words (BoVW) model, we
have created a visual vocabulary of size 300. Now if we want to create
a bag-of-visual-words image descriptor with a 4 × 4 spatial pyramid, the
dimension of the feature should be:
(i) 4800
(ii) 1200
(iii) 300
(iv) 2400
(2 marks)
(i) [2]
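Answer (i) follows if the "4 × 4 spatial pyramid" is read as a single 4 × 4 grid with one visual-word histogram per cell; that reading is an assumption, chosen because it matches the listed answer. The function name is illustrative.

```python
def bovw_dimension(vocabulary_size, grid_rows, grid_cols):
    """One histogram of vocabulary_size bins per grid cell, concatenated."""
    return vocabulary_size * grid_rows * grid_cols


print(bovw_dimension(300, 4, 4))  # 4800
```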
27. In a bag-of-visual-words model, the optimal size of the visual vocabulary
should be determined on the evaluation performance on the following data
split:
(i) Train set
(ii) Validation set
(iii) Test set
(iv) Both Validation and Test set
(2 marks)
(ii) [2]
28. What is the regular practice to use linear SVM to classify two classes that
are not linearly separable?
(i) Cross validation
(ii) Kernel trick
(iii) Neural neighbour trick
(iv) None of the above
(2 marks)
(ii) [2]
29. In Viola-Jones face detection algorithm, how does one implement a ‘weak
classifier’?
(i) SIFT feature with thresholding
(ii) Rectangular feature with thresholding
(iii) HOG feature with SVM
(iv) Rectangular feature with SVM
(2 marks)
(ii) [2]
30. Suppose we have the following image (Figure 6):
Figure 6: image
Our task is to segment the objects in the image. A simple way to do this
is to represent the image in terms of pixel intensity and then cluster them
according to the values. On doing this, we got the following histogram
(Figure 7) of pixel intensity
Figure 7: histogram
Suppose we choose k-means clustering to solve the problem, what would be
the appropriate value of k from just a visual inspection of the pixel intensity
histogram?
(i) 1
(ii) 2
(iii) 3
(iv) 4
(2 marks)
(iii) [2]
31. Which of the following is a representation learning algorithm?
(i) Neural network
(ii) Random Forest
(iii) k-Nearest neighbour
(iv) None of the above
(2 marks)
(i) [2]
32. Which of the following gives non-linearity to a neural network?
(i) Stochastic Gradient Descent
(ii) Rectified Linear Unit (ReLU)
(iii) Sigmoid
(iv) None of the above
(2 marks)
(ii), (iii) [2]
33. Suppose you have 5 convolutional kernels of size 7 × 7 with zero padding
and stride 1 in the first layer of a convolutional neural network. You pass an
input of dimension 224×224×3 through this layer. What are the dimensions
of the data which the next layer will receive?
(i) 217 x 217 x 5
(ii) 217 x 217 x 8
(iii) 218 x 218 x 5
(iv) 220 x 220 x 7
(2 marks)
(iii) [2]
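The output size follows the usual formula (input - kernel + 2 × padding) / stride + 1 per spatial dimension, with one output channel per kernel. The sketch below reads "zero padding" as a padding of 0 pixels, which is the interpretation consistent with answer (iii); the function name is illustrative.

```python
def conv_output_shape(height, width, kernel, num_kernels, padding=0, stride=1):
    """Spatial size and channel count produced by a square-kernel conv layer."""
    out_h = (height - kernel + 2 * padding) // stride + 1
    out_w = (width - kernel + 2 * padding) // stride + 1
    return (out_h, out_w, num_kernels)


print(conv_output_shape(224, 224, kernel=7, num_kernels=5))  # (218, 218, 5)
```

The 3 input channels do not appear in the output shape because each kernel spans all input channels and produces a single feature map.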
34. Which of the following options can be used to reduce overfitting in deep
learning models?
1. Add more data
2. Use data augmentation
3. Use architecture that generalises well
4. Add regularisation
5. Reduce architectural complexity
(i) 1, 2, 3
(ii) 1, 4, 5
(iii) 1, 3, 4, 5
(iv) All of these
(2 marks)
(iv) [2]
35. Suppose the input to a max pooling layer is given below. The pooling size
of neurons in the layer is (3, 3):
3 4 6
5 7 3
4 3 7
What would be the output of this pooling layer?
(i) 3
(ii) 14
(iii) 5.5
(iv) 7
(2 marks)
(iv) [2]
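Because the pooling size equals the size of the 3 × 3 input, the layer produces a single value. The sketch below computes both the maximum and the mean of the window; the listed answer, 7, is the maximum.

```python
window = [[3, 4, 6],
          [5, 7, 3],
          [4, 3, 7]]

values = [v for row in window for v in row]
max_pooled = max(values)
avg_pooled = sum(values) / len(values)

print(max_pooled)            # 7
print(round(avg_pooled, 2))  # 4.67
```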
36. Which of the following is a data augmentation technique used in image
recognition tasks?
1. Horizontal flipping
2. Random cropping
3. Random scaling
4. Colour jittering
5. Random translation
6. Random shearing
(i) 1, 2, 4
(ii) 2, 3, 4, 5, 6
(iii) 1, 3, 5, 6
(iv) All of these
(2 marks)
(iv) [2]
37. What are the steps for using a gradient descent algorithm?
1. Calculate error between the actual value and the predicted value
2. Reiterate until you find the best weights of network
3. Pass an input through the network and get values from output layer
4. Initialise random weight and bias
5. Go to each neuron which contributes to the error and change its
respective values to reduce the error
(i) 1, 2, 3, 4, 5
(ii) 5, 4, 3, 2, 1
(iii) 3, 2, 1, 5, 4
(iv) 4, 3, 1, 5, 2
(2 marks)
(iv) [2]
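The ordering 4, 3, 1, 5, 2 can be seen in a minimal training loop. This is a sketch for a one-weight linear model with a squared-error loss; the learning rate, epoch count, seed, and data are illustrative assumptions, not part of the question.

```python
import random


def train(samples, learning_rate=0.05, epochs=500, seed=0):
    rng = random.Random(seed)
    w, b = rng.uniform(-1, 1), rng.uniform(-1, 1)  # step 4: random weight and bias
    for _ in range(epochs):                        # step 2: reiterate
        for x, target in samples:
            prediction = w * x + b                 # step 3: forward pass
            error = prediction - target            # step 1: compute the error
            w -= learning_rate * error * x         # step 5: adjust each parameter
            b -= learning_rate * error             #         to reduce the error
    return w, b


samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]  # points on y = 2x + 1
w, b = train(samples)
print(round(w, 2), round(b, 2))
```

The learned parameters should approach w = 2 and b = 1, the line the samples were drawn from.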
38. While training a neural network for image recognition task, we plot the graph
of training error (loss) and validation error for debugging as follows (Figure
8):
Figure 8: training curve
What is the best place in the graph to stop the training of the neural network?
(i) A
(ii) B
(iii) C
(iv) D
(2 marks)
(iii) [2]
39. What is the sequence of the following tasks in a perceptron?
1. Initialise weights of perceptron randomly
2. Go to the next batch of dataset
3. If the prediction does not match the output, change the weights
4. For a sample input, compute an output
(i) 1, 2, 3, 4
(ii) 4, 3, 2, 1
(iii) 3, 1, 2, 4
(iv) 1, 4, 3, 2
(2 marks)
(iv) [2]
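The sequence 1, 4, 3, 2 corresponds to the classic perceptron learning rule. A minimal sketch on the AND function, processing one sample at a time rather than a batch; the learning rate, seed, and the cap on epochs are illustrative assumptions.

```python
import random


def predict(weights, bias, inputs):
    activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, learning_rate=0.1, max_epochs=1000, seed=0):
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in range(2)]  # step 1: random init
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:                    # step 2: next sample
            output = predict(weights, bias, inputs)       # step 4: compute output
            if output != target:                          # step 3: update on mismatch
                delta = learning_rate * (target - output)
                weights = [w + delta * x for w, x in zip(weights, inputs)]
                bias += delta
                mistakes += 1
        if mistakes == 0:  # a full pass without errors: training has converged
            break
    return weights, bias


and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
print([predict(weights, bias, x) for x, _ in and_gate])  # [0, 0, 0, 1]
```

AND is linearly separable, so the update rule is guaranteed to stop making mistakes after a finite number of corrections.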
40. Assume a simple MLP model (single layer) with 3 hidden units with inputs
x = (3, 2, 1). The current weights and bias of the input units are respectively
w = (4, 5, 6) and b = 7. Assume the activation function is a linear constant
value of σ = 5. What will be the output?
(i) 64
(ii) 96
(iii) 175
(iv) 435
(2 marks)
(iii) [2]
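Answer (iii) can be reproduced by reading the question as a single unit whose activation multiplies the pre-activation by the constant 5. That reading is an assumption, chosen because it matches the listed answer; the function name is illustrative.

```python
def neuron_output(inputs, weights, bias, scale):
    """Weighted sum plus bias, followed by a linear activation f(z) = scale * z."""
    pre_activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return scale * pre_activation


# 4*3 + 5*2 + 6*1 = 28, plus bias 7 gives 35, times 5 gives 175.
print(neuron_output((3, 2, 1), (4, 5, 6), bias=7, scale=5))  # 175
```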
(Total 80 marks)
SECTION B (TRUE or FALSE)
Question 2
Read each of the TWENTY statements below carefully. Clearly write TRUE if
you think a statement is TRUE and FALSE if you think the statement is FALSE.
1. To blur an image, you can use any linear filter
(1 mark)
FALSE [1]
2. A box filter is a spatial domain linear filter in which each pixel in the
resulting image has a value equal to the average value of its neighbouring
pixels in the input image.
(1 mark)
TRUE [1]
3. Convolving twice with a Gaussian kernel of width σ is the same as convolving
once with a Gaussian kernel of width 2σ
(1 mark)
FALSE [1]
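The statement is false because variances add under convolution, so two passes with width σ are equivalent to one pass with width σ√2, not 2σ. A numerical sketch with sampled 1D kernels; σ = 2 and a truncation radius of 12 samples are illustrative choices.

```python
import math


def gaussian_kernel(sigma, radius):
    """Sampled, normalised 1D Gaussian on the integers -radius..radius."""
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def convolve_full(a, b):
    """Full discrete convolution of two 1D sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, av in enumerate(a):
        for j, bv in enumerate(b):
            out[i + j] += av * bv
    return out


def std_dev(kernel):
    """Standard deviation of a symmetric, normalised kernel about its centre."""
    centre = (len(kernel) - 1) / 2
    variance = sum(w * (i - centre) ** 2 for i, w in enumerate(kernel))
    return math.sqrt(variance)


sigma = 2.0
g = gaussian_kernel(sigma, radius=12)
gg = convolve_full(g, g)  # kernel equivalent to filtering twice with g
print(round(std_dev(g), 2), round(std_dev(gg), 2))  # about 2.0 and 2.83 (= 2 * sqrt(2))
```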
4. Thresholding is a linear filter.
(1 mark)
FALSE [1]
5. Convolution in spatial domain is equivalent to multiplication in frequency
domain, which is one of the advantages of Fourier transform for performing
convolution on images.
(1 mark)
TRUE [1]
6. An alternative and computationally cheaper way of detecting corners
involves computing the cornerness measure as $R = \mathrm{trace}(M) - k\,\det(M)^2$,
where k is a small constant.
(1 mark)
TRUE [1]
7. Blob detector is invariant to scaling but variant to illumination.
(1 mark)
FALSE [1]
8. Scale Invariant Feature Transform (SIFT) is a feature descriptor that
computes histogram of gradients in 8 directions within a local patch which
is divided into 4x4 grids.
(1 mark)
TRUE [1]
9. With homogeneous coordinates, all the transformations can be expressed as
linear mappings and be computed as matrix multiplication.
(1 mark)
TRUE [1]
10. In a pinhole camera, too large a diameter limits the amount of light entering
the camera and causes light diffraction, which eventually blurs the image.
(1 mark)
FALSE [1]
11. The assumption that corresponding pixel values remain the same in two
consecutive frames of a video is called the brightness constancy
constraint.
(1 mark)
TRUE [1]
12. In computer vision, the aperture problem refers to the relative darkness
that appears in an image because of a small camera aperture.
(1 mark)
FALSE [1]
13. Training a linear support vector machine is the process of finding the
hyperplanes that equally maximise the distance between the positive and
negative examples.
(1 mark)
FALSE [1]
14. In an integral image, each pixel represents the cumulative sum of a
corresponding input pixel with all pixels above and to the left of the input
pixel.
(1 mark)
TRUE [1]
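The definition of an integral image translates directly into a running-sum computation, and it allows the sum over any rectangle to be read from at most four entries. A minimal sketch on a 3 × 3 example; the function names and the inclusive-bounds convention are illustrative choices.

```python
def integral_image(image):
    """ii[r][c] = sum of image[0..r][0..c], inclusive."""
    rows, cols = len(image), len(image[0])
    ii = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += image[r][c]
            ii[r][c] = row_sum + (ii[r - 1][c] if r > 0 else 0)
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of image[top..bottom][left..right] using at most four lookups."""
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
ii = integral_image(image)
print(ii)                       # [[1, 3, 6], [5, 12, 21], [12, 27, 45]]
print(box_sum(ii, 1, 1, 2, 2))  # 28 (= 5 + 6 + 8 + 9)
```

This constant-time rectangle sum is what makes rectangular features cheap to evaluate at every position and scale.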
15. An attentional cascade of classifiers is built with a series of classifiers
starting with the simpler ones which reject many of the negative sub-
windows while correctly detecting almost all the positive responses which
trigger the evaluation of a second and more complex classifier, and so on.
(1 mark)
TRUE [1]
16. The difference between deep learning and machine learning algorithms is
that there is no need of feature engineering in machine learning algorithms,
whereas, it is recommended to do feature engineering first and then apply
deep learning.
(1 mark)
FALSE [1]
17. Increase in size of a convolutional kernel would necessarily increase the
performance of a convolutional neural network.
(1 mark)
FALSE [1]
18. Suppose we have a neural network with Rectified Linear Unit (ReLU)
activation function, which can approximate an XNOR function. Now, if we
replace the Rectified Linear Unit (ReLU) activations by linear activations,
the resulting neural network would not be able to approximate the XNOR
function anymore.
(1 mark)
TRUE [1]
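A small hand-built example illustrates the statement. The weights below are chosen by hand for illustration and are not taken from any particular trained network: with ReLU the two hidden units reproduce XNOR exactly, while with linear activations the same network collapses to the affine function a + b - 1.

```python
def relu(z):
    return max(0.0, float(z))


def identity(z):
    return z


def tiny_network(a, b, activation):
    # Hidden layer: two units that see a + b with different biases.
    h1 = activation(a + b)
    h2 = activation(a + b - 1)
    # Linear output layer with hand-picked weights.
    return 1 - h1 + 2 * h2


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
print([tiny_network(a, b, relu) for a, b in inputs])      # [1.0, 0.0, 0.0, 1.0]
print([tiny_network(a, b, identity) for a, b in inputs])  # [-1, 0, 0, 1]
```

More generally, any stack of layers with linear activations is itself an affine map of the inputs, and no affine map outputs 1 on (0, 0) and (1, 1) while outputting 0 on (0, 1) and (1, 0): the first pair forces f(0,1) + f(1,0) = 2, the second forces 0.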
19. The number of neurons in the output layer should match the number of
classes (where the number of classes is greater than 2) in a supervised
learning task.
(1 mark)
TRUE [1]
20. The function $f(x) = ax^3 + bx^2 + cx + d$ can be represented by a single fully
connected hidden layer without any non-linear activation.
(1 mark)
FALSE [1]
(Total 20 marks)
ECMM426 (2020) 29 End of Paper