COMPSCI361: Machine Learning
Assignment 4: Association Rule Mining
Due: June 4, 2021, 23:59 NZT.
This is worth 10% of your final grade.
The main purpose of this project is to gain an in-depth understanding of how the Apriori
algorithm works.
Task 1: Given the dataset below, in which each line represents a transaction, what are the
frequent itemsets and rules produced if minsup is set to 0.15 and minconf is set to 0.80? Show
your working.
Trans ID Transactions
1 A, B, C, D, F
2 A, B, C, D
3 A, B, C, D
4 A, B
5 B, C, E
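
As a way of checking hand calculations for Task 1, the two measures it names can be sketched as follows. This is only an illustration under assumptions: the function names `support` and `confidence` are hypothetical, and support is taken to be the fraction of transactions containing an itemset, with confidence as support(X ∪ Y) / support(X).

```python
from fractions import Fraction

# The five transactions from the Task 1 table.
transactions = [
    {"A", "B", "C", "D", "F"},
    {"A", "B", "C", "D"},
    {"A", "B", "C", "D"},
    {"A", "B"},
    {"B", "C", "E"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

def confidence(antecedent, consequent, transactions):
    """support(antecedent ∪ consequent) / support(antecedent).

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

print(support({"A", "B"}, transactions))       # 4/5
print(support({"F"}, transactions))            # 1/5
print(confidence({"C"}, {"B"}, transactions))  # 1
```

Exact fractions are used here so that comparisons against thresholds such as 0.15 are not affected by floating-point rounding.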

Task 2: Your task is to write an Apriori program (in Python, using a Jupyter notebook) that takes
as parameters:
- minsup: minimum support,
- minconf: minimum confidence,
- minlift: minimum lift, and
- the name of the transaction file (in comma-separated format, like the
supermarket.csv file downloaded from Canvas). Each line within the data file represents a
transaction, where items are separated by commas. If you imagine going to a grocery
store, your transaction at the check-out counter would be one of the lines.
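
The file format described above (one transaction per line, items separated by commas) could be read along the following lines. This is a minimal sketch, not a required design: the names `parse_transactions` and `load_transactions` are hypothetical, and it assumes that surrounding whitespace and empty fields should be ignored, which the handout does not specify.

```python
import csv

def parse_transactions(lines):
    """Turn an iterable of comma-separated lines into a list of item sets."""
    transactions = []
    for row in csv.reader(lines):
        # Assumption: strip whitespace and drop empty fields (e.g. trailing commas).
        items = frozenset(field.strip() for field in row if field.strip())
        if items:  # skip blank lines
            transactions.append(items)
    return transactions

def load_transactions(path):
    """Read a transaction file such as supermarket.csv from disk."""
    with open(path, newline="") as f:
        return parse_transactions(f)

sample = ["A, B, C, D, F", "A, B", "", "B, C, E,"]
for t in parse_transactions(sample):
    print(sorted(t))
```

Storing each transaction as a `frozenset` makes subset tests cheap when counting support.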
The program must produce all association rules that can be mined from the transaction file and that satisfy
the minimum support, lift, and confidence requirements. The rules should be output sorted first
by the number of items that they contain (in decreasing order), then by lift, then by
confidence, and finally by support (also in decreasing order).
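
One possible way to express this ordering is a single sort with a composite key. The sketch below assumes that all four criteria are sorted in decreasing order (the handout states this explicitly only for item count and support), and the `Rule` tuple and its field names are hypothetical. The numeric values are made up for illustration and are not results from any dataset.

```python
from typing import FrozenSet, NamedTuple

class Rule(NamedTuple):
    antecedent: FrozenSet[str]
    consequent: FrozenSet[str]
    support: float
    confidence: float
    lift: float

def sort_rules(rules):
    """Sort by item count, then lift, then confidence, then support (all descending)."""
    return sorted(
        rules,
        key=lambda r: (
            len(r.antecedent) + len(r.consequent),
            r.lift,
            r.confidence,
            r.support,
        ),
        reverse=True,
    )

# Illustrative values only.
r1 = Rule(frozenset({"A"}), frozenset({"B"}), 0.4, 0.8, 1.2)
r2 = Rule(frozenset({"A", "B"}), frozenset({"C"}), 0.3, 0.9, 1.1)
r3 = Rule(frozenset({"A"}), frozenset({"C"}), 0.5, 0.8, 1.2)

for r in sort_rules([r1, r2, r3]):
    print(sorted(r.antecedent), "->", sorted(r.consequent))
```

Here `r2` comes first because it contains three items; `r3` precedes `r1` because the two tie on lift and confidence and `r3` has the higher support.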
You may use libraries such as pandas and NumPy, but you may NOT use any prebuilt Apriori packages.
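
Since prebuilt packages are not allowed, the level-wise candidate generation has to be written by hand. The following is a rough sketch of the usual join-and-prune step, using only the standard library; the function name `generate_candidates` is hypothetical, and itemsets are assumed to be stored as `frozenset` objects.

```python
from itertools import combinations

def generate_candidates(frequent_prev, k):
    """Build size-k candidates from the frequent (k-1)-itemsets.

    Join: union pairs of frequent (k-1)-itemsets that differ in one item.
    Prune: keep a candidate only if all of its (k-1)-subsets are frequent.
    """
    prev = set(frequent_prev)
    candidates = set()
    for a, b in combinations(prev, 2):
        union = a | b
        if len(union) != k:
            continue
        if all(frozenset(s) in prev for s in combinations(union, k - 1)):
            candidates.add(union)
    return candidates

frequent_2 = [
    frozenset({"A", "B"}),
    frozenset({"A", "C"}),
    frozenset({"B", "C"}),
    frozenset({"B", "D"}),
]
print(sorted(sorted(c) for c in generate_candidates(frequent_2, 3)))
# [['A', 'B', 'C']]
```

In this toy input, {A, B, D} and {B, C, D} are pruned because {A, D} and {C, D} are not among the frequent 2-itemsets. Comparing every pair is quadratic in the number of frequent itemsets, so a faster join may be worth considering for larger files.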
Task 3: Now run your implementation on the data from Task 1 and show that it produces
the same output as your Task 1 answer. This can be the output from your Jupyter notebook.
Task 4: Your task is to investigate a dataset and perform an association rule mining task.
- Run your Apriori code on the data downloaded from Canvas (supermarket.csv). Try
different values of minsup, e.g. 0.10, 0.15, 0.20. TIP: Please note that this may take
a while if your code is inefficient.
- Generate rules. You can try different measures (minsup, minlift, minconf) to see which
gives you more useful and interesting results. Explain why the rules are interesting to
you. TIP: Don’t overthink this. Just describe the measures for the rule and why you
think it is interesting.
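
When describing the measures for a rule, it can help to compute all three side by side. The sketch below uses the small Task 1 dataset rather than supermarket.csv (which is not reproduced here), and assumes the common definition lift(X → Y) = confidence(X → Y) / support(Y); the helper name `rule_measures` is hypothetical.

```python
from fractions import Fraction

# The five transactions from the Task 1 table, used as a stand-in dataset.
transactions = [
    {"A", "B", "C", "D", "F"},
    {"A", "B", "C", "D"},
    {"A", "B", "C", "D"},
    {"A", "B"},
    {"B", "C", "E"},
]

def support(itemset):
    """Fraction of transactions containing every item in `itemset`."""
    hits = sum(1 for t in transactions if set(itemset) <= t)
    return Fraction(hits, len(transactions))

def rule_measures(antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    sup = support(set(antecedent) | set(consequent))
    conf = sup / support(antecedent)
    lift = conf / support(consequent)
    return sup, conf, lift

print(*rule_measures({"A"}, {"D"}))  # 3/5 3/4 5/4
print(*rule_measures({"B"}, {"C"}))  # 4/5 4/5 1
```

A lift above 1, as for A → D, suggests the items occur together more often than independence would predict; a lift of exactly 1, as for B → C, suggests the antecedent adds no information about the consequent.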
Task 5: Modify your program to take an additional constraint and parameter called
minrelativesup. In addition to the minsup constraint, add the following pruning constraint at
each level k of candidate generation, where k > 2. Let S_k denote a k-itemset. A frequent
itemset S_k is a k-itemset for which support(S_k) / maxsubset ≥ minrelativesup. Here maxsubset is the
maximum support among the itemsets in S_{k-1}', where S_{k-1}' = { s | s ⊂ S_k, |s| = k-1 }; that is, each s is a
subset of S_k of size k-1, where k > 2.
- Discuss how this would change the itemsets and rules generated.
- Would this change the anti-monotonic property in Apriori? Discuss your answer.
You will need to work in a separate copy of your Apriori implementation so that it does not interfere
with Task 2. Please describe how you have done this in your report and link it to your code in
the Jupyter notebook.
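
The minrelativesup test defined in Task 5 could be sketched as a separate predicate applied to each candidate that already passes minsup. This is one reading of the definition, not a reference solution: the name `passes_relative_support` is hypothetical, and it assumes a dictionary `support_of` that already holds the support of the candidate and of each of its (k-1)-subsets. The supports in the example are those of the Task 1 dataset.

```python
from fractions import Fraction
from itertools import combinations

def passes_relative_support(itemset, support_of, min_relative_sup):
    """Check support(S_k) / maxsubset >= min_relative_sup for k > 2."""
    k = len(itemset)
    if k <= 2:
        return True  # the constraint is only defined for k > 2
    # maxsubset: largest support among the (k-1)-subsets of the itemset.
    max_subset = max(
        support_of[frozenset(s)] for s in combinations(itemset, k - 1)
    )
    return support_of[frozenset(itemset)] / max_subset >= min_relative_sup

# Supports taken from the five Task 1 transactions.
support_of = {
    frozenset({"A", "B"}): Fraction(4, 5),
    frozenset({"A", "C"}): Fraction(3, 5),
    frozenset({"B", "C"}): Fraction(4, 5),
    frozenset({"A", "B", "C"}): Fraction(3, 5),
}

abc = frozenset({"A", "B", "C"})
print(passes_relative_support(abc, support_of, 0.7))  # True  (ratio is 3/4)
print(passes_relative_support(abc, support_of, 0.8))  # False
```

Keeping the check in its own function is one way to leave the Task 2 code path untouched while adding the new constraint.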
What to submit?
A copy of your Jupyter notebook and a final report. They must be submitted to Canvas. Please
name your report file “Your_UPI.pdf”. Your report should be no more than two pages long. As a
rough guide to page length, you may use Times New Roman at 12pt with single
spacing.
The page limit includes any images or references you choose to show or use. Reproducible machine
learning is one of the criteria of this assignment, so you need to report the processing steps and
parameters needed to recreate your results. In the report, you should include a description of the different runs (if
you carried out multiple runs) and explain why you needed to make changes from your initial
choices. One simple question that indicates whether you have fulfilled
the minimum requirement of reproducibility is “Can someone reproduce your results based on your
explanation?”.

Grading rubric
1 mark for the correct output in Task 1.
1 mark for correct implementation of Apriori (candidate generation and rule generation) - Task 2.
1 mark for correct output formatting as defined in the assignment - Task 2.
1 mark for the correct output in Task 3.
1 mark for justification of interesting rules in the dataset provided (0.5 for discussion of each interestingness
measure and rule) - Task 4.
1 mark for correct implementation of the minrelativesup constraint - Task 5.
1 mark for discussion of how minrelativesup would impact/change the output of Apriori - Task 5.
1 mark for discussion of any changes to the anti-monotonic property in Apriori - Task 5.
1 mark for reproducibility: are all the parameters defined and the results repeatable?
1 mark for clarity of the report.

Copyright © University of Auckland. This material is provided to you for your own use. You may not copy or distribute any
part of this material to any other person. Failure to comply with this warning may expose you to legal action for copyright
infringement and/or disciplinary action by the University.
