CSCI 4144 – Data Mining and Data Warehousing
Course Project: In-depth Understanding of an OLAP or
Data Mining Algorithm

- TA: Serikzhan Kazi ([email protected]), Miheer Kulkarni ([email protected])
- Tutorial/Lab: 11:35am - 12:55pm, Wednesdays; Room: Goldberg 127
- Additional TA Help Hours at CS Learning Center:
o Mondays (2pm-4pm): Zhenbang Wang ([email protected])
o Wednesdays (2pm-4pm): Hui Huang ([email protected])
o Fridays (3:35pm-5pm): Lauchlan Toal ([email protected])


1. Overview

In this project, you need to select a research paper that includes an OLAP or data mining algorithm,
implement the algorithm, and discuss the performance of the algorithm. The major objective of
this project is to learn how to find a useful OLAP/data mining algorithm and have an in-depth
understanding of it.

2. Detailed Requirements

1) Group Size: You are allowed to work in a group with up to 3 members. If you prefer to complete
an individual project, it is also acceptable.

2) Deliverables: In this project, you need to complete the following deliverables (Note that only
one member of each group needs to submit these deliverables on behalf of the group):
a) Project Proposal: Due 11:55pm, Mar. 1 (Sunday)
b) Project Presentation Slides: The slides are due 11:55pm, Apr. 3 (Friday).
a. The presentations will be held during the lecture/tutorial on Apr. 1 and 2. The
detailed schedule will be announced after the project groups are formed.
c) Project Code: Due 11:55pm Apr. 17 (Friday)
d) Project Report: Due 11:55pm Apr. 17 (Friday)

3) Paper Selection: You can select a paper using either approach (a) or (b) below, subject to the restriction in (c):
a) IEEE Xplore and ACM Digital Library are two widely-used online libraries. You can search
for research papers using varied keywords, such as OLAP, classification, clustering, web
classification, and web clustering. The links to these two libraries can be found here:
http://dal.ca.libguides.com/c.php?g=257110&p=1716818
b) Browse the webpage of varied conferences (such as KDD 2019) and journals (such as IEEE
Transactions on Knowledge and Data Engineering) in order to go through the latest
research papers, then select one that interests you. Once a paper is selected, you can use
the resources available at Dalhousie library (such as IEEE Xplore and ACM Digital Library)
to find the full paper. A pdf file that summarizes the major publication venues on data
mining and data warehousing can be found in brightspace.
c) You are NOT allowed to select a paper that corresponds to any of the algorithms that are
covered in this course. However, a revised/improved version of the algorithm covered in
my slides is acceptable. Here is a list of the algorithms that I plan to cover in this course:
a. OLAP: Multi-Way Array Aggregation, BUC, High-Dimensional OLAP
b. Frequent Itemset Mining: Apriori Algorithm
c. Classification: ID3, Naive Bayesian Classification, Classification Based on IF-THEN
Rules
d. Clustering: k-means, AGNES, DIANA, Dendrogram, DBSCAN

4) Project Proposal: The length of the project proposal is approximately 1 page. It should include
the following parts:
a) Tentative title of the project: Note that the title could be revised later.
b) Tentative selected research paper: It should include the list of authors, paper title,
publication venue (i.e. where the paper is published), and publication time. If you find a
more interesting research paper later, you can change the selected research paper
(although this is not preferred because it will leave you less time to implement the
algorithm and collect experimental results).
c) Problem description: It describes the problem to be solved with the algorithm presented
in the selected paper.
d) Project timeline: It should include the major milestones of the project and your own
deadlines for them.
e) Group Members: A list of the members in the project group

5) Project Presentation: The details about the presentation can be found below:
a) The presentation should include the following sections: problem to be solved, how the
algorithm works, implementation details (programming language, data set, etc.), and
experimental results (note that preliminary results are acceptable).
b) The presentation should be roughly 4 minutes long. All group members are encouraged
to participate in the presentation.
c) The presentation will be held during the lecture/tutorial on Apr. 1 and 2. The detailed
schedule will be announced after the project groups are formed.
d) The presentation slides need to be submitted via brightspace on Apr. 3 (Friday).

6) Project Code: In this project, you need to implement the selected algorithm in order to collect
the experimental results about the performance of the algorithm.
a) Required Programming Language: You can use Java, C, C++, or Python as the programming
language because bluenose supports these languages (note that you need to have a CSID
to access bluenose via SSH). In this project, you can use all kinds of libraries/APIs as long
as they are available on bluenose.
b) Online Data Sets: To study the performance of the selected OLAP or data mining algorithm,
you might need to use some data sets. You can either create the data sets yourself or
utilize the online data sources such as those on the following webpage:
a. http://www.kdnuggets.com/datasets/index.html
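
As a minimal sketch of how experimental results on performance might be collected, the Python snippet below times a placeholder function on synthetic data sets of increasing size. The names `make_dataset`, `run_algorithm`, and `measure` are hypothetical and not part of any required interface, and the random 2-D points are only an assumption for illustration. Replace `run_algorithm` with your implementation of the selected algorithm.

```python
import random
import time


def make_dataset(n, seed=0):
    """Create a synthetic data set of n random 2-D points (illustrative assumption)."""
    rng = random.Random(seed)
    return [(rng.random(), rng.random()) for _ in range(n)]


def run_algorithm(data):
    """Hypothetical placeholder: replace with the selected algorithm."""
    return sorted(data)


def measure(algorithm, sizes, repeats=3):
    """Return a list of (size, best wall-clock seconds) pairs."""
    results = []
    for n in sizes:
        data = make_dataset(n)
        timings = []
        for _ in range(repeats):
            start = time.perf_counter()
            algorithm(data)
            timings.append(time.perf_counter() - start)
        # Keep the fastest run to reduce noise from other processes.
        results.append((n, min(timings)))
    return results


if __name__ == "__main__":
    for n, seconds in measure(run_algorithm, [1000, 10000, 100000]):
        print(f"{n}\t{seconds:.6f}")
```

Taking the best of several repeats is one common choice on a shared server; reporting the mean and spread is an equally valid alternative, as long as the report states which was used.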

7) Readme File: You need to complete a readme file named “Readme.txt”, which includes the
instructions that the TA could use to compile and execute your program to generate the
experimental results.

8) Project Report:
a) Report components:
a. A cover page that includes the title of your project, your group ID (note that each
group will be assigned a group ID), and the names and banner IDs of the group members.
b. Introduction: A brief description of the problem to be tackled.
c. Algorithm: Please describe how the selected algorithm works.
d. Data Preparation and Algorithm Implementation:
i. Please describe how the algorithm is implemented. You could include the
information about the selected programming language, the structure of
the program, the major classes, etc.
ii. If a data set is involved, please describe how the data is obtained and
processed. If not, you do not need to include the description about data
preparation.
e. Experimental results: You need to demonstrate the performance of the algorithm by
including the detailed experimental results. When possible, please compare your
results with the results included in the selected paper.
f. Conclusion: Please include your comments about the algorithm.
g. List of References
b) Report format:
a. Line spacing: single space
b. Font size: 11 or smaller
c. Column per page: single-column
d. Report length: Your report should be at most 6 pages long, not counting the cover
page. That is, with the cover page, your report should be at most 7 pages long.

9) Submission: You need to submit the following deliverables via brightspace. Please note that
each group only needs to submit 1 Project Proposal, 1 copy of Project Presentation Slides, 1
Project Code file, and 1 Project Report. Namely, only one member of a group needs to submit the
project-related files on behalf of the whole group. In addition, please pay attention to the
following requirements:
a) Project Proposal: You need to convert your project proposal into a pdf file. The name of
the pdf file should be “CSCI4144-ProjectProposal-YourFirstname-YourLastName.pdf”. For
example, my proposal file should be named “CSCI4144-ProjectProposal-Qiang-Ye.pdf”.
The project proposal should be submitted via brightspace.
b) Project Presentation Slides: You need to convert your project presentation slides into a
pdf file. The name of the pdf file should be “CSCI4144-ProjectPresentation-Group-
YourGroupID.pdf”. For example, if the group ID is 12, then the pdf file should be named
“CSCI4144-ProjectPresentation-Group-12.pdf”. The project presentation slides should be
submitted via brightspace.
c) Project Code:
a. You should place “Readme.txt” in the directory where your program file is located.
b. Your program file and “Readme.txt” should be compressed into a zip file named
“CSCI4144-ProjectCode-Group-YourGroupID.zip”. For example, if the group ID is
12, then the zip file should be called “CSCI4144-ProjectCode-Group-12.zip”. Finally,
you need to submit your zip file via brightspace.
d) Project Report: You need to convert your project report into a pdf file. The name of the
pdf file should be “CSCI4144-ProjectReport-Group-YourGroupID.pdf”. For example, if the
group ID is 12, then the pdf file should be named “CSCI4144-ProjectReport-Group-12.pdf”.
The project report should be submitted via brightspace.
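
The packaging rules for the project code can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `zip_name` and `package` are hypothetical, and the sketch assumes all program files live under a single source directory that also holds “Readme.txt”.

```python
import zipfile
from pathlib import Path


def zip_name(group_id):
    """Build the zip file name required for the project code."""
    return f"CSCI4144-ProjectCode-Group-{group_id}.zip"


def package(source_dir, group_id, out_dir="."):
    """Zip every file under source_dir; fail if Readme.txt is missing."""
    source = Path(source_dir)
    if not (source / "Readme.txt").is_file():
        raise FileNotFoundError("Readme.txt must sit next to the program files")
    target = Path(out_dir) / zip_name(group_id)
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source.rglob("*")):
            # Skip directories and the archive itself if it lies inside source_dir.
            if path.is_file() and path.resolve() != target.resolve():
                archive.write(path, str(path.relative_to(source)))
    return target
```

For example, `package("my_project", 12)` would write “CSCI4144-ProjectCode-Group-12.zip” to the current directory, where `my_project` is a hypothetical directory name.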

3. Grading Criteria

The marker will use your submitted files (except for the presentation) to evaluate your project.
The grade of the project presentation will be based on the in-class presentation and the
submitted presentation slides.

1) Project Proposal (10 Points):
a) Tentative title of the project (1 Point)
b) Tentative selected research paper (1 Point)
c) Problem description (6 Points)
d) Project timeline (1 Point)
e) Group Members (1 Point)

2) Project Presentation (7 Points):
a) Content (3 Points):
a. Background
b. Algorithm Description
c. Implementation Description
d. Experimental Results
b) Clarity (3 Points):
a. Logical and systematic development of ideas
b. Precise use of formal language
c) Timing (1 Point):
a. Time is well distributed over varied components
b. Presentation is completed on time

3) Project Code (10 Points):
a) Does “Readme.txt” include enough information so that the TA can easily compile and
execute the program on bluenose? (1 Point)
b) Can the submitted program be executed on bluenose to generate the experimental
results? (6 Points)
c) Overall design of the program (3 Points)

4) Project Report (20 Points):
a) Cover page (1 Point)
b) Introduction (2 Points)
c) Algorithm (4 Points)
d) Data Preparation and Algorithm Implementation (4 Points)
e) Experimental results (4 Points)
f) Conclusion (1 Point)
g) List of References (1 Point)
h) Overall report quality (3 Points)

4. Academic Integrity

At Dalhousie University, we respect the values of academic integrity: honesty, trust, fairness,
responsibility and respect. As a student, adherence to the values of academic integrity and
related policies is a requirement of being part of the academic community at Dalhousie University.

1) What does academic integrity mean?

Academic integrity means being honest in the fulfillment of your academic responsibilities thus
establishing mutual trust. Fairness is essential to the interactions of the academic community and
is achieved through respect for the opinions and ideas of others. Violations of intellectual honesty
are offensive to the entire academic community, not just to the individual faculty member and
students in whose class an offence occurs (see the Intellectual Honesty section of the University Calendar).

2) How can you achieve academic integrity?

- Make sure you understand Dalhousie’s policies on academic integrity.
- Give appropriate credit to the sources used in your assignment such as written or oral work,
computer codes/programs, artistic or architectural works, scientific projects, performances, web
page designs, graphical representations, diagrams, videos, and images. Use RefWorks to keep
track of your research and edit and format bibliographies in the citation style required by the
instructor. (See http://www.library.dal.ca/How/RefWorks)
- Do not download the work of another from the Internet and submit it as your own.
- Do not submit work that has been completed through collaboration or previously submitted for
another assignment without permission from your instructor.
- Do not write an examination or test for someone else.
- Do not falsify data or lab results.
These examples should be considered only as a guide and not an exhaustive list.

3) What will happen if an allegation of an academic offence is made against you?

I am required to report a suspected offence. The full process is outlined in the Discipline flow
chart, which can be found at:
http://academicintegrity.dal.ca/Files/AcademicDisciplineProcess.pdf and includes the following:
a. Each Faculty has an Academic Integrity Officer (AIO) who receives allegations from instructors.
b. The AIO decides whether to proceed with the allegation and you will be notified of the process.
c. If the case proceeds, you will receive an INC (incomplete) grade until the matter is resolved.
d. If you are found guilty of an academic offence, a penalty will be assigned ranging from a
warning to a suspension or expulsion from the University and can include a notation on your
transcript, failure of the assignment or failure of the course. All penalties are academic in nature.

4) Where can you turn for help?

- If you are ever unsure about ANYTHING, contact me.
- The Academic Integrity website (http://academicintegrity.dal.ca) has links to policies,
definitions, online tutorials, tips on citing and paraphrasing.
- The Writing Center provides assistance with proofreading, writing styles, citations.
- Dalhousie Libraries have workshops, online tutorials, citation guides, Assignment Calculator,
RefWorks, etc.
- The Dalhousie Student Advocacy Service assists students with academic appeals and student
discipline procedures.
- The Senate Office provides links to a list of Academic Integrity Officers, discipline flow chart, and
Senate Discipline Committee.