EEEM007 ASP-Lab 2019-20 / v1.8 / AM,MDP / 11 Mar 20
Module: EEEM007 ADVANCED SIGNAL PROCESSING
Year: 2019/2020. Examiner: A Mustafa, M D Plumbley
Date Due: 4pm Tuesday 12 May 2020
LAB EXPERIMENT: PATTERN RECOGNITION
1. INTRODUCTION
The aim of the experiment is to provide practical support for the main aspects of the material
covered in EEEM007 Advanced Signal Processing, namely Pattern Classification. The experiment is
designed to reinforce the main theoretical results learnt in the course and to enable the student to
gain an intuitive feel for the effects of classifier design factors (such as training set size, class
separability, feature space dimensionality, classification rule) on the classification system
performance. It should also validate experimentally the results derived in the Assignment.
Two popular classifiers are investigated in the experiment: i) the Bayes decision rule for normally
distributed classes and ii) the k-Nearest Neighbour decision rule.
2. MATLAB COMMANDS
You may find the following Matlab commands useful:
mvnrnd    generate multivariate normal random numbers
fitcnb    construct Gaussian (naive Bayes) classifier
fitcknn   construct k-NN classifier
predict   function for finding class labels
mahal     Mahalanobis distance
Include the Matlab code you used for your experiments in an Appendix to your report.
3. REPORT FORMAT
The Report should be a single document, consisting of descriptions and results for the experiments
(approx. 8-10 pages), plus an Appendix containing text listings of your Matlab code.
[Remember: The TurnItIn submission system will identify similar reports, so please write your own
Matlab code and experiment descriptions.]
4. MARKING SCHEME
For each experiment:
1. Description of experiment, objectives, design choices [20%]
2. Predicted experimental outcome from the theory [20%]
3. Presentation of the experimental results obtained [20%]
4. Analysis of results, discussion and conclusions [20%]
5. Presentation of the Matlab code used [20%]
Each of Experiments 1 to 4 is worth 1/4 of the final mark.

5. EXPERIMENT DESCRIPTION
5.1 Experiment 1
The aim of this experiment is to investigate the effect of training sample size on the classifier
performance and the effect of the size of test set on the reliability of the empirical error count
estimator.
A Gaussian classifier for discriminating between two 2-dimensional classes will be used for the study.
Use the class parameters corresponding to your Assignment.

(a) Generate a design set $D$ and a test set $T$, each containing 100 patterns per class, distributed
according to the selected class parameters.
For each training set size $N_D = 3, 5, 10, 50, 100$, select $N_D$ samples from $D$ to form a
training set $D(N_D)$ and design a Gaussian classifier.
Test the designs with the same training samples to obtain error estimates $\varepsilon_{design}(N_D)$.
Test each classifier also using the full test set of 100 patterns to obtain the error estimate $\varepsilon_{test}(N_D)$.
Repeat the experiment ten times for independent design sets $D_i$, $i = 1, \ldots, 10$ (the same test set
may be used) and average the estimated errors to obtain:
$$\bar{\varepsilon}_{design}(N_D) = \frac{1}{10} \sum_{i=1}^{10} \varepsilon_{design}^{i}(N_D), \qquad \bar{\varepsilon}_{test}(N_D) = \frac{1}{10} \sum_{i=1}^{10} \varepsilon_{test}^{i}(N_D).$$
Plot the average errors $\bar{\varepsilon}_{design}$, $\bar{\varepsilon}_{test}$ as a function of $N_D$ and compare them with the theoretical
error.
Comment on your results and try to explain them.
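The procedure in (a) can be sketched as follows. This is only an illustration under stated assumptions, written in Python rather than Matlab: the class means and covariances are hypothetical placeholders (not the Assignment parameters), equal priors are assumed, and a small ridge term is added to the estimated covariance so that very small training sets remain invertible.

```python
import math
import random

# Hypothetical class parameters, for illustration only (NOT the Assignment values).
MU = {0: (0.0, 0.0), 1: (2.0, 1.0)}
SIGMA = {0: ((1.0, 0.3), (0.3, 1.0)), 1: ((1.5, -0.2), (-0.2, 0.8))}

def sample(mu, sigma, n, rng):
    """Draw n points from a 2-D normal via the Cholesky factor of sigma."""
    a = math.sqrt(sigma[0][0])
    b = sigma[0][1] / a
    c = math.sqrt(sigma[1][1] - b * b)
    points = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        points.append((mu[0] + a * z1, mu[1] + b * z1 + c * z2))
    return points

def fit(points):
    """Maximum-likelihood mean and covariance; the 1e-6 ridge is an assumption."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n + 1e-6
    syy = sum((p[1] - my) ** 2 for p in points) / n + 1e-6
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)

def log_density(p, model):
    """Log of the 2-D normal density, up to a constant shared by both classes."""
    (mx, my), (sxx, sxy, syy) = model
    det = sxx * syy - sxy * sxy
    dx, dy = p[0] - mx, p[1] - my
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * maha - 0.5 * math.log(det)

def error_rate(models, data):
    """Fraction of misclassified points; data maps class label -> list of points."""
    wrong = total = 0
    for label, points in data.items():
        for p in points:
            predicted = max(models, key=lambda c: log_density(p, models[c]))
            wrong += predicted != label
            total += 1
    return wrong / total

def run(sizes=(3, 5, 10, 50, 100), repeats=10, seed=0):
    rng = random.Random(seed)
    test = {c: sample(MU[c], SIGMA[c], 100, rng) for c in MU}
    designs = [{c: sample(MU[c], SIGMA[c], 100, rng) for c in MU}
               for _ in range(repeats)]
    averages = {}
    for n in sizes:
        e_design = e_test = 0.0
        for design in designs:
            train = {c: rng.sample(design[c], n) for c in MU}  # n samples per class
            models = {c: fit(train[c]) for c in MU}
            e_design += error_rate(models, train)
            e_test += error_rate(models, test)
        averages[n] = (e_design / repeats, e_test / repeats)
    return averages

results = run()  # maps training size -> (average design error, average test error)
```

The sketch selects the training samples per class, which is one possible reading of the instructions. Plotting the two values stored for each training size gives the two curves requested in (a).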

(b) Repeat (a) using the k-Nearest Neighbour classifier for all odd values of $k$ in the range 1 to 51,
noting that $k$ cannot exceed $2N_D$, where $N_D$ is the number of training samples selected in (a).
For a representative range of $k$, starting from $k = 1$, compare
your results with those obtained for the Gaussian classifier in (a).
For each value of $N_D$ select the best result $\varepsilon^{*}(N_D)$ (smallest error) over all values of $k$ and
record the corresponding $k^{*}(N_D)$.
Plot these best results as a function of $N_D$ in the same graph as that used for presenting the
results of (a) above.
Plot also $k^{*}(N_D)$ in the same graph.
Comment on your results.
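The search over k in (b) might look like the sketch below, again in Python and under assumptions: the two classes are hypothetical unit-variance Gaussians, each training size is run only once (the averaging over ten design sets is omitted for brevity), and the helper names `knn_predict`, `knn_error` and `make_set` are invented for this illustration.

```python
import random
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training points nearest to query (Euclidean)."""
    nearest = sorted(
        train,
        key=lambda item: (item[0][0] - query[0]) ** 2 + (item[0][1] - query[1]) ** 2,
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def knn_error(train, test, k):
    """Fraction of test points whose predicted label differs from the true one."""
    wrong = sum(knn_predict(train, point, k) != label for point, label in test)
    return wrong / len(test)

def make_set(n_per_class, rng):
    """Hypothetical classes: unit-variance Gaussians centred at (0, 0) and (2, 1)."""
    data = []
    for label, (mx, my) in enumerate([(0.0, 0.0), (2.0, 1.0)]):
        data += [((rng.gauss(mx, 1.0), rng.gauss(my, 1.0)), label)
                 for _ in range(n_per_class)]
    return data

rng = random.Random(1)
test_set = make_set(100, rng)
best = {}
for n in (3, 5, 10, 50, 100):
    train_set = make_set(n, rng)
    # Odd k only (no ties with two classes), capped at the 2n available samples.
    ks = [k for k in range(1, 52, 2) if k <= 2 * n]
    errors = {k: knn_error(train_set, test_set, k) for k in ks}
    k_star = min(errors, key=errors.get)  # smallest k among equally good ones
    best[n] = (k_star, errors[k_star])    # (best k, smallest test error)
```

When several values of k give the same error, this sketch keeps the smallest one; that tie-breaking rule is a design choice, not something the experiment prescribes.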
5.2 Experiment 2
In this experiment we shall investigate the effect of the size of test set on the reliability of the
empirical error count estimator and we will also learn how to classify test and training samples.

(a) Design a Gaussian classifier for a two class problem similar to Experiment 1 (a).
Generate ten independent test sets $T_i$, $i = 1, \ldots, 10$ of size 100 using the same class conditional
distribution parameters.
Choose several different numbers of test patterns $N_T = 3, 5, 10, 50, 100$.
For each $N_T$, select $N_T$ samples from each $T_i$ to form a test set $T_i(N_T)$ and obtain an estimate $\varepsilon_i(N_T)$ of the
classifier error probability.
Repeat the experiment ten times for independent test sets and find the mean value $\bar{\varepsilon}(N_T)$ and
the variance $\sigma^2(N_T)$ of the estimated error, i.e.
$$\bar{\varepsilon}(N_T) = \frac{1}{10} \sum_{i=1}^{10} \varepsilon_i(N_T), \qquad \sigma^2(N_T) = \frac{1}{9} \sum_{i=1}^{10} \left[ \varepsilon_i(N_T) - \bar{\varepsilon}(N_T) \right]^2$$

Plot your results and comment on how they compare with your theoretical predictions.
Comment on your results.
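For the theoretical prediction, it may help to recall that when a fixed classifier with true error probability e is tested on n independent patterns, the error count is binomially distributed, so the error-count estimate has mean e and variance e(1 - e)/n. The Python sketch below computes the sample mean and the unbiased (divide by 9 for ten values) sample variance next to that prediction. The ten estimates are made-up numbers used only to show the calculation, not experimental results, and the function names are hypothetical.

```python
import statistics

def summarise(estimates):
    """Sample mean and unbiased sample variance (divides by len - 1)."""
    return statistics.mean(estimates), statistics.variance(estimates)

def binomial_variance(true_error, n_test):
    """Variance of the error-count estimate for n_test independent test patterns."""
    return true_error * (1.0 - true_error) / n_test

# Made-up error estimates from ten test sets of 10 patterns each (illustrative only).
example_estimates = [0.10, 0.20, 0.10, 0.30, 0.20, 0.10, 0.00, 0.20, 0.10, 0.20]
mean_error, var_error = summarise(example_estimates)

# Prediction, using the sample mean as a stand-in for the unknown true error.
predicted_variance = binomial_variance(mean_error, 10)
```

Comparing `var_error` with `predicted_variance` for each test set size shows how quickly the estimate becomes reliable as the number of test patterns grows.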

(b) Now assume that you have 100 samples for both testing and training, such that $N_D + N_T = 100$,
where $N_D$ and $N_T$ are the numbers of training and test samples.
Assign $N_D = 3, 5, 10, 50, 75, 90$ and respectively $N_T = 97, 95, 90, 50, 25, 10$ to form training
$D(N_D)$ and test $T(N_T)$ sets from the $N_D$ and $N_T$ samples, and calculate the design
($\varepsilon_{design}(N_D)$ from Experiment 1 (a)) and test ($\varepsilon(N_T)$ from Experiment 2 (a)) error estimates for
each combination of training and test sets. Create a table as shown here depicting both design and
test errors for each pair of $N_D$ and $N_T$.

$N_D$   $N_T$   $\varepsilon_{design}(N_D)$   $\varepsilon(N_T)$
3       97      (results)                     …
(etc)

Comment on your results. Which combination of $N_D$ and $N_T$ is the best trade-off in terms of
error estimates?
5.3 Experiment 3
In this experiment we shall investigate the dependence of the $\varepsilon_{test}(N_D)$ curves on the
dimensionality of the pattern recognition problem. As the determination of the true error
probability in high dimensional spaces is difficult, we shall take the estimated error $\varepsilon_{test}$
obtained with a sample size of 500 as the true error. You may choose the covariance matrix to be an identity matrix in this
exercise. Choose the mean vectors so that the error probability is maintained in the range of 5-10%.
Try to estimate $\varepsilon_{test}(N_D)$ for training set sizes $N_D = 3, 5, 10, 20, 50, 100, 200$ for two class problems in $d =
5, 10, 15$ dimensional spaces. The changes in the test error with the increase in training set size are
different in different dimensions.
If you are unable to design the classifier, consider what is the minimum number of training samples
required as a function of dimensionality and why.
How many training samples as a function of dimensionality do you need to achieve a reasonable
performance (close to the true error rate)?
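One way to choose mean vectors for a target error is sketched below in Python. It assumes equal priors and an identity covariance for both classes, in which case the Bayes error depends only on the Euclidean distance between the means and equals the standard normal CDF evaluated at minus half that distance. The function names, and the choice of placing the second mean along the all-ones direction, are hypothetical.

```python
import math
from statistics import NormalDist

def separation_for_error(p_error):
    """Distance between the means that gives Bayes error p_error
    (identity covariance, equal priors)."""
    return -2.0 * NormalDist().inv_cdf(p_error)

def means_for_error(p_error, d):
    """Class 1 at the origin, class 2 along the all-ones direction in d dimensions."""
    delta = separation_for_error(p_error)
    step = delta / math.sqrt(d)  # equal offset in each coordinate
    return [0.0] * d, [step] * d

# Mean vectors aimed at an error inside the 5-10% range, for each dimensionality.
means = {d: means_for_error(0.075, d) for d in (5, 10, 15)}
```

Under these assumptions, a separation of roughly 2.56 corresponds to a 10% error and roughly 3.29 to a 5% error, so the distance between the means stays the same in every dimensionality while the per-coordinate offset shrinks.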
5.4 Experiment 4
The aim of this experiment is to explore the relationship between class separability and error
probability.
Choosing a suitable pattern space dimensionality $d$, generate a sequence of sets of normally
distributed training data containing patterns from two classes. The Mahalanobis distance between
the means of the two classes in each set should be monotonically increasing with the position of the
set in the sequence.
Estimate the error probability of the classifier in each case.
Plot the error as a function of Mahalanobis distance.
Comment on your findings.
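A sequence of class pairs with increasing Mahalanobis distance could be set up as in the following Python sketch. It assumes two dimensions, a shared covariance matrix and equal priors; the covariance values and the scale factors are hypothetical. Under those assumptions the theoretical error is the standard normal CDF at minus half the Mahalanobis distance, which gives a reference curve to plot against the measured errors.

```python
import math
from statistics import NormalDist

def mahalanobis_2d(mu1, mu2, cov):
    """Mahalanobis distance between two 2-D means for a shared covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = mu1[0] - mu2[0], mu1[1] - mu2[1]
    # Quadratic form with the closed-form inverse of a 2x2 matrix.
    q = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.sqrt(q)

COV = ((1.0, 0.5), (0.5, 2.0))  # hypothetical shared covariance
MU1 = (0.0, 0.0)

curve = []
for scale in (0.5, 1.0, 1.5, 2.0, 2.5, 3.0):
    mu2 = (scale, scale)  # moving the second mean away increases the distance
    delta = mahalanobis_2d(MU1, mu2, COV)
    curve.append((delta, NormalDist().cdf(-delta / 2.0)))  # (distance, predicted error)
```

Each entry of `curve` pairs a distance with its predicted error; the empirical error of the trained classifier for the same pair of means can then be plotted alongside it.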