
Assessed Lab ICE-4701


Weight: 20% of the total mark of the module.

Total number of points: 100

Deadline: 13th March, 8pm. The submission point will close automatically thereafter.

Submission: Submit one PDF file containing your lab report. Include in the report your solutions: code,
comments and output.

Notes:

Plagiarism: DO NOT copy from one another. Changing the variable names does not make the code your own! A
mark of zero will be awarded for assignments which are too similar.

I reserve the right to interview you about your submission and change your mark if I discover that you cannot
explain your code in enough detail.


1. Your data.
1 (a) Write code (preferably in Python) to load your data and create the arrays “data” and “labels” from it.
[5]
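A minimal sketch for 1 (a), assuming a comma-separated file whose feature columns come first and whose last column holds the class label (here an io.StringIO stands in for the file; both the layout and the use of NumPy are assumptions — adapt to your own data set):

```python
import io

import numpy as np

# Stand-in for your file: replace the StringIO with the path to your own
# CSV (assumed layout: feature columns first, class label last).
csv_text = io.StringIO("1.0,2.0,0\n1.5,1.8,0\n5.0,8.0,1\n")

raw = np.genfromtxt(csv_text, delimiter=",")
data = raw[:, :-1]                 # N x n feature matrix, one row per object
labels = raw[:, -1].astype(int)    # class label of each object

print(data.shape, labels)
```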
1 (b) Format and print the characteristics of the data: N (the number of objects), n (the number of features),
c (the number of classes), and the number of objects in each class. Estimate the prior probabilities for the classes. [10]
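One way to compute these quantities, assuming the arrays “data” and “labels” from 1 (a) (toy values stand in for them here), with the priors estimated as the class proportions:

```python
import numpy as np

# Toy stand-ins for the arrays built in 1 (a).
data = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0], [5.5, 8.5]])
labels = np.array([0, 0, 1, 1, 1])

N, n = data.shape                          # objects and features
classes, counts = np.unique(labels, return_counts=True)
c = classes.size                           # number of classes
priors = counts / N                        # class proportions as prior estimates

print(f"N = {N}, n = {n}, c = {c}")
for cls, cnt, p in zip(classes, counts, priors):
    print(f"class {cls}: {cnt} objects, prior {p:.2f}")
```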
1 (c) Prepare and show a scatterplot of the data. An example is shown in Figure 1. [6]


Figure 1. Example of an expected plot for 1 (c) (Note that you all have different data sets!)
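A minimal sketch of such a scatterplot with matplotlib; the toy arrays and the headless Agg backend are assumptions (drop the backend line to view the figure interactively):

```python
import matplotlib
matplotlib.use("Agg")          # headless backend; remove to show the window
import matplotlib.pyplot as plt
import numpy as np

# Toy stand-ins for the arrays built in part 1.
data = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0]])
labels = np.array([0, 0, 1, 1])

# One colour per class, as in the example figure.
plt.scatter(data[:, 0], data[:, 1], c=labels, cmap="coolwarm")
plt.xlabel("feature 1")
plt.ylabel("feature 2")
plt.title("Scatterplot of the data")
plt.savefig("scatterplot.png")
```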

2. Classification
For an example on classification using Python, look up this link:
https://www.kaggle.com/abhikaggle8/pima-diabetes-classification

2 (a) Write code to apply the Largest Prior classifier to your data set. Calculate and show the error rate of this
classifier. Prepare and display the respective confusion matrix. Format and include the confusion matrix in your
report. [15]
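The Largest Prior classifier assigns every object to the class with the largest estimated prior, i.e. the most frequent class. A sketch, assuming labels coded 0 .. c−1 (the toy labels are placeholders for yours):

```python
import numpy as np

# Toy stand-in; labels are assumed to be coded 0 .. c-1.
labels = np.array([0, 0, 1, 1, 1])

# Largest Prior: predict the most frequent class for every object.
classes, counts = np.unique(labels, return_counts=True)
pred = np.full_like(labels, classes[np.argmax(counts)])

error_rate = float(np.mean(pred != labels))

# Confusion matrix: rows = true class, columns = predicted class.
cm = np.zeros((classes.size, classes.size), dtype=int)
for t, p in zip(labels, pred):
    cm[t, p] += 1

print(f"error rate: {error_rate:.2f}")
print(cm)
```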
2 (b) Apply the decision tree classifier to this data set using a 10-fold cross-validation. Show the confusion matrix
in your report. [10]
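A sketch of 2 (b) using scikit-learn (an assumption — any library providing trees and cross-validation will do). cross_val_predict returns one out-of-fold prediction per object, from which the confusion matrix follows; two toy Gaussian clouds stand in for your data:

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Two toy Gaussian clouds stand in for your data set.
data = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
labels = np.array([0] * 20 + [1] * 20)

# 10-fold CV: every object is predicted by a tree trained on the other folds.
pred = cross_val_predict(DecisionTreeClassifier(random_state=0), data, labels, cv=10)
cm = confusion_matrix(labels, pred)
print(cm)
```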
2 (c) Plot the classification regions for the decision tree classifier. An example of the expected outcome is shown
in Figure 2. [10]


Figure 2. Example of an expected plot for 2 (c)
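One way to produce such a plot is to classify a dense grid over the feature space and colour it by the predicted label. A sketch, with two toy Gaussian clouds standing in for your data and the headless Agg backend assumed:

```python
import matplotlib
matplotlib.use("Agg")          # headless backend; remove to show the window
import matplotlib.pyplot as plt
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
labels = np.array([0] * 20 + [1] * 20)

clf = DecisionTreeClassifier(random_state=0).fit(data, labels)

# Classify every point of a fine grid and colour the grid by label.
xx, yy = np.meshgrid(
    np.linspace(data[:, 0].min() - 1, data[:, 0].max() + 1, 300),
    np.linspace(data[:, 1].min() - 1, data[:, 1].max() + 1, 300),
)
zz = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

plt.contourf(xx, yy, zz, alpha=0.3, cmap="coolwarm")
plt.scatter(data[:, 0], data[:, 1], c=labels, cmap="coolwarm", edgecolors="k")
plt.title("Decision tree classification regions")
plt.savefig("regions.png")
```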

2 (d) Using Weka, run a 10-fold cross-validation on your data for the following classifiers: [15]
• 1-NN (nearest neighbour)
• 3-NN (3-nearest neighbours)
• Decision tree
• Bagging ensemble
• AdaBoost ensemble
• Random Forest ensemble
• Rotation Forest ensemble

Prepare a table with the classification accuracies of the classifiers and comment on the results.
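Weka itself is driven from its GUI, but the same 10-fold comparison can be cross-checked in code (Rotation Forest is omitted, as scikit-learn does not provide it). The toy Gaussian data and the default ensemble settings below are assumptions:

```python
import numpy as np
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              RandomForestClassifier)
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Two toy Gaussian clouds stand in for your data set.
data = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
labels = np.array([0] * 20 + [1] * 20)

models = {
    "1-NN": KNeighborsClassifier(n_neighbors=1),
    "3-NN": KNeighborsClassifier(n_neighbors=3),
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Bagging": BaggingClassifier(random_state=0),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
}

# Mean accuracy over the 10 folds for each classifier.
accuracies = {name: cross_val_score(m, data, labels, cv=10).mean()
              for name, m in models.items()}
for name, acc in accuracies.items():
    print(f"{name:15s} {acc:.3f}")
```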

3. Classifier ensembles
You can solve this question by hand, using Excel, or by writing a piece of code. In all cases, show your
work, spreadsheet or code.
Take the first 5 points in your data set. Run them through a classifier ensemble consisting of the following three
classifiers, each one shown with its discriminant functions:

(Table: the discriminant functions g1(x), g2(x), g3(x) and g4(x) of Classifiers 1, 2 and 3; the expressions did not survive text extraction — refer to the table in the original brief.)

3 (a) Calculate the majority vote label for each of the 5 objects, then prepare and show the confusion matrix. [18]
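A sketch of the majority vote, with hypothetical discriminant functions standing in for those in the table (substitute the expressions from your brief) and toy points and true labels in place of your first 5 objects:

```python
from collections import Counter

import numpy as np

# First five objects (toy coordinates; use the first 5 points of your data).
points = np.array([[0.1, 0.2], [0.4, 0.1], [0.5, 0.5], [0.9, 0.3], [0.2, 0.8]])
true_labels = [1, 1, 2, 2, 1]      # toy true labels for the confusion matrix

# Hypothetical discriminant functions g1..g4 for each of the three
# classifiers -- substitute the expressions from the table.
ensemble = [
    [lambda x, y: x + y, lambda x, y: 1 - x, lambda x, y: y, lambda x, y: x * y],
    [lambda x, y: x - y, lambda x, y: y - x, lambda x, y: x, lambda x, y: y],
    [lambda x, y: x ** 2, lambda x, y: y ** 2, lambda x, y: x, lambda x, y: 1 - y],
]

def classify(gs, x, y):
    # Label = index of the largest discriminant value (classes 1..4).
    return int(np.argmax([g(x, y) for g in gs])) + 1

votes = [[classify(gs, x, y) for gs in ensemble] for x, y in points]
# Majority vote per object (three-way ties fall back to the first vote).
majority = [Counter(row).most_common(1)[0][0] for row in votes]

# Confusion matrix: rows = true class, columns = majority-vote class.
cm = np.zeros((4, 4), dtype=int)
for t, p in zip(true_labels, majority):
    cm[t - 1, p - 1] += 1
print(majority)
print(cm)
```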
3 (b) Calculate the class labels for the 5 objects using the following combination rules: [11]
• Average
• Minimum
• Maximum
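Each combination rule aggregates, per class, the discriminant values produced by the three classifiers; the label is the class with the largest combined support. A sketch for a single object, with a toy support matrix (rows = classifiers, columns = classes) standing in for the values computed from the table:

```python
import numpy as np

# Toy support matrix for one object: rows = classifiers, columns = classes.
# Replace with the discriminant values g1..g4 evaluated at that object.
support = np.array([[0.7, 0.5, 0.1, 0.2],
                    [0.1, 0.3, 0.2, 0.1],
                    [0.1, 0.6, 0.2, 0.1]])

labels_by_rule = {}
for name, rule in (("average", np.mean), ("minimum", np.min), ("maximum", np.max)):
    combined = rule(support, axis=0)                   # one value per class
    labels_by_rule[name] = int(np.argmax(combined)) + 1  # largest support wins
print(labels_by_rule)
```

Note that the rules need not agree: here the maximum rule picks class 1 while average and minimum pick class 2.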
