The Australian National University Research School of Computer Science, CECS
COMP8430 – Data Wrangling – 2020
Assignment 4, due 11:55 pm on Friday 30 October 2020
Worth 10% of the final grade for COMP8430
This assignment is only for Master's students enrolled in COMP8430.
Draft – Last update August 28, 2020
Overview and Objectives
This assignment requires you to select and read a research paper relevant to data wrangling, and then to summarise and
critically analyse this paper. You will need to provide a summary of the paper in your own words, and answer a set of
questions about certain aspects of the paper that you have selected.
Important
• Answers to this assignment must be submitted online in Wattle; see the link Assignment 4 Submission in week
12 (26 to 30 October).
• Follow the instructions given on maximum text length for free-format answers. Answers that are too long will attract a
penalty (for details see the individual questions below and the corresponding answer submission forms in Wattle).
• You can edit your answers many times and they will be saved by Wattle.
• Make sure you submit the final version of your assignment answers before the submission deadline.
• Note that Wattle does not allow us to access any earlier edited versions of your answers, so check very
carefully what you submit as the final version!
You can only submit your assignment once!
Make sure you do not forget to submit your assignment!
Penalties
Textual questions have maximum line and word limits. If you exceed these limits, we will apply an
over-word-limit penalty. For details of the limits, see the individual questions below and the corresponding
pages in the assignment submission in Wattle.
Deadlines, Extensions, and Late Submissions
The assignment is due 11:55 pm on Friday 30 October 2020.
Students will only be granted an extension on the submission deadline in extenuating circumstances, as defined by ANU
policy (http://www.anu.edu.au/students/program-administration/assessments-exams/deferred-examinations).
If you think you have grounds for an extension, you must notify the course convener as soon as possible and
provide written evidence in support of your case (such as a medical certificate). The course convener will then decide
whether to grant an extension and inform you as soon as practical.
In accordance with the CECS and ANU late submission policy, no late submissions will be accepted, except where an
extension has been approved by the course convener.
Assignment Structure
The assignment consists of five (5) tasks as described below. Make sure you answer all aspects of each task.
If you have any questions about the assignment, please post them on Wattle. However, do not post any partial solutions,
program code, URLs, etc., or any hints on how to solve any of the assignment tasks.
Plagiarism
No group work is permitted for this assignment.
We do encourage you to discuss your work, but we expect you to do the assignment work by yourself. If you
are unsure about what constitutes plagiarism, make sure you carefully read the ANU Academic Honesty Policy
(http://academichonesty.anu.edu.au/).
If you do include ideas or material from other sources, you must clearly attribute them by providing a reference
to the material or source in your submitted assignment answers. We do not require a specific referencing format, as long as
you are consistent and your references allow us to find the source, should we need to while we are marking your assignment.
Marking
This assignment will be marked out of 10, and it will contribute 10% of your final course mark.
Note that not all questions might be equally difficult. For some questions there is no single right or wrong answer. Marks
will be awarded based on your description, reasoning, and explanations, as well as clarity and correctness of writing.
We will endeavour to release your marks and feedback within two teaching weeks after the submission deadline. If you feel
we have made an error in marking, you have two weeks following the release of marks to raise any issues with the course
convener, after which time your mark will be considered final. If you request that we re-mark your assignment, we
will re-mark the entire assignment and your mark may go up or down as a result.
Assignment Tasks
On Wattle in week 6, under the Assignment 4 specification (this document), you will find links to seven scientific
publications. These are papers from our group working on data wrangling here at the ANU. All of these papers were
published at different Pacific-Asia Conferences on Knowledge Discovery and Data Mining (PAKDD) in the past few years.
We selected these papers because we are very familiar with their topics and content, they all cover
different topics related to record linkage, and they all have the same length and format.
The seven listed papers are:
1. Adaptive Temporal Entity Resolution on Dynamic Databases (Christen and Gayler, 2013)
2. Efficient Interactive Training Selection for Large-Scale Entity Resolution
(Wang, Vatsalan, and Christen, 2015)
3. Improving Temporal Record Linkage Using Regression Classification (Hu, Wang, Vatsalan, and Christen, 2017)
4. Pattern-Mining Based Cryptanalysis of Bloom Filters for Privacy-Preserving Record Linkage
(Christen, Vidanage, Ranbaduge, and Schnell, 2018)
5. A Scalable and Efficient Subgroup Blocking Scheme for Multidatabase Record Linkage
(Ranbaduge, Vatsalan, and Christen, 2018)
6. Robust Temporal Graph Clustering for Group Record Linkage (Nanayakkara, Christen, and Ranbaduge, 2019)
7. Secure and Accurate Two-Step Hash Encoding for Privacy-Preserving Record Linkage
(Ranbaduge, Christen, and Schnell, 2020)
For this assignment, you must select one of these seven papers, and address the following five questions and provide
answers in the corresponding answer fields in Wattle (under the Assignment 4 Submission link in week 12).
• Task 1: Paper topic and the research question addressed (2 marks): Describe in your own words the topic
the paper covers and the research question(s) it aims to address. Write a maximum of 250 words (around
10 lines).
• Task 2: Proposed methods (2 marks): Describe in your own words the method / approach proposed by this paper.
How does this method work? What are the building blocks / components / techniques used by the proposed method?
Again write a maximum of 250 words (around 10 lines).
• Task 3: Data set(s) and evaluation used (2 marks): Describe in your own words the data set(s) used by the paper
and how the proposed method was evaluated. What measures were used, and what aspects of the proposed method were
evaluated (runtime, scalability, quality, accuracy, privacy, etc.)? Write a maximum of 250 words (around 10 lines).
• Task 4: Critiques and shortcomings (2 marks): In your own words, describe any criticism you have of this paper.
This can include, but is not limited to, an unclear or even wrong description of the method, an inappropriate or incomplete
evaluation that uses unsuitable evaluation measures or data sets, unclear or inappropriate conclusions, and so
on. Again write a maximum of 250 words (around 10 lines).
• Task 5: Paper summary (2 marks): Finally, summarise the paper in your own words using a maximum of 250 words
(around 10 lines). Briefly describe the method the paper proposes, how this method is assessed or evaluated, and what
the main findings are.
Other Aspects
You do not need to include a reference to the paper you selected in your answers.
For all answers in this assignment, English writing mistakes and typographical errors will attract small penalties.
