Faculty of Information Technology
Semester 1, 2021
FIT5145: Introduction to Data Science
Assignments 2 & 4: Business and Data Case Study

1. Assignment 2: Write a proposal document introducing the data science project you are studying. The due date of Assignment 2 is 11:55 PM, Friday 23 April 2021 (Week 7).
2. Assignment 4: Write a report on your case study of the data science project, and record a short (maximum 5 minutes) video presentation of your study. The due date of Assignment 4 is 11:55 PM, Friday 21 May 2021 (Week 11).

This is an individual assignment.

Focus of the study
This case study needs to analyse (“study”) a data science project relevant to an example business scenario (“the case”). There are a couple of ways you can do the case study. You may study how an existing data science project has been implemented in a particular sector such as transportation or health. The project is NOT limited to those already established or completed: you can also propose an entirely new project of your own. For example, you can either study an existing data science project in New York or propose a completely new data science project to address a particular problem the Australian government faces, such as bushfires or floods. Talk to your tutors about any proposed project you are interested in.
Note: if we really like your project ideas, we may ask your permission to borrow some parts for an Industrial Experience project (FIT5122) the following year.
No programming is required for this assignment.

Assignment 2: Proposal
Weight: 5% of the unit mark
Size: up to 1000 words (approx. 2-3 pages)
Submission format: one PDF file
What you need to do:
● Choose a data science project.
● Write the initial two sections of the report, i.e., Project Description and Business Model (and References if necessary). You can be creative and include visuals to explain your idea more clearly and in fewer words.
This assignment is worth 5% of your unit mark.

Assignment 4: Report and Presentation
This assignment has two parts: (1) a report and (2) a presentation.

Report
Weight: 15% of the unit mark
Size: up to 3000 words
Submission format: one PDF file
This report is your analysis of how data science can be used to help solve a particular problem. In your report you need to identify the size and scope of both the problem and the data science project, as well as the requirements for enabling the project. Your report should have at least (but is not limited to) the following sections:
● Project Description: describe the data science project that you study or propose: what the project is, which data science roles are involved, and what their responsibilities are.
● Business Model: analyse the business or application area the project sits in, the challenges of the project, the value the project can create for that business area, the data curation and management issues or policies involved, etc.
● Characterising the Data and Data Processing: characterise the data in the project (i.e., the 4 V's) and analyse the technologies, software and tools required for data processing given those data characteristics.
● Resources: locate and assess existing or potential resources, software and tools for the project.
● Data Analysis: specify or propose the statistical methods used in the project, explain why you chose those methods, and discuss the high-level output.
These sections should draw on aspects of Weeks 1-10 of the unit as they apply to your chosen case study.
The maximum word limit for the report (Assignment 4) is 3000 words. It may include some or all of your Assignment 2, modified if needed (counted in the 3000 word total). References at the end of the report (consisting of a list of URLs and/or cited reports) are not included in the word count.
Note that staying within the word limit demonstrates your ability to write concisely. For this reason, in both Assignments 2 and 4, a penalty will be applied to reports exceeding the limit, or the marker may ignore the part of your submission that exceeds it.
Make sure that any resources you use are acknowledged in your report. You may need to review the FIT citation style to make yourself familiar with appropriate citing and referencing for this assessment. Also, review the demystifying citing and referencing guide for help.

Presentation
Weight: 5% of the unit mark
Size: maximum 5 minutes
Submission format:
● A PDF file containing your presentation slides and the link to your YouTube recording

Producing your video
The video needs to describe the business and data science project and the results of your study. The video should be no longer than 5 minutes. We recommend you produce the video by doing a 5 minute screen capture of your slides with a voiceover recorded at the same time via microphone. One option is to start a session on Zoom, share the slides and record the session. You don’t need to appear in the video yourself, but you must speak.
Note that the video must be uploaded to YouTube, and you will need to provide a LINK to your YouTube video in your case study report as well as on the first page of your slides. DO NOT include your video on Moodle as part of the submission. Make sure that you have set the uploaded video as "unlisted" (not private or public). If staff can’t access the video, you may be given no marks for that component of your work.
A good guide is to have no more than 1 slide per minute and to avoid placing too much content on each slide.

How you will be assessed
Assignment 2: The 5% awarded for your proposal is broken down into the following categories:
● clear description of the goals of the project
● appropriateness of topic
● clear description of the business benefits
● novelty/creativity
● overall clarity of the initial report
Assignment 4: See the grading rubric to understand how we will grade your report. Your report will be assessed on your ability to:
● analyse the role of data in the business model and identify the data curation and management issues
● discuss different parts of the data science project from the perspective of the data science process and from the perspective of roles such as statistician, archivist, analyst and systems architect
● analyse the size and scope of data storage and data processing, and present the basic technologies in use
● locate and assess resources, software and tools for a data science project
● discuss the kinds of data analysis and statistical methods suitable for the data science project
● think critically and creatively, providing justification and analysis
● support your project with some realistic data (if available in the public domain) or a mock-up/example dataset to clearly explain your proposition, your modelling approach and the visualisations that can be derived from it. Note: Implementation is not mandatory, but explanations using a realistic example increase your chance of getting a better mark.
The 5% awarded for your presentation is broken down into the following categories:
● understanding of case
● depth of content
● quality of slides
● quality of delivery

What you need to do
Before you begin, make sure you:
● Download the business and data case study samples (available on Moodle) and review them as examples. Either
  o select one case as the basis of your report, or
  o even better, propose your own topic (discuss it with your tutors).
● Have a look at examples of good past student reports (you will need to be logged into your Monash Google Webmail account to access this link).
  o Note that the length requirement and other aspects of the assignment were slightly different in the past.
● Download the marking guide or rubric (available on Moodle) as guidance on how you will be assessed.
● Prepare your 3000 word report and supplementary notes (if used).
● Organise your screen capture and microphone, or obtain or organise access to a video camera or mobile phone with a camera you can use to record your report.
● Check that you can access and upload videos to YouTube with your Monash details. You'll need a YouTube account to store your video report. Also review the Monash University rules for using YouTube.
Choose a data science project as a case, and then:
● Do preliminary research about your case and the relevant technologies.
● Write and submit your proposal (Assignment 2).
● Research and prepare your final report with cited references.
● Prepare your presentation slides and speech notes, and then rehearse prior to recording.
● Record your video, making sure it’s no more than 5 minutes in duration (edit your video as required).
● Upload your completed video to your YouTube channel and add the link to your report.
● Create a transcript (preferred but not mandatory). On your YouTube channel, click on the CC button under the video, which is "subtitles and CC", and then select "English (Automatic)" to create a transcript which you can then edit. Note: it may be much easier to prepare a script ahead of time, read from the script, then submit the script as a transcript.
● Submit your report and slides (Assignment 4).
You are free to modify the initial proposal sections submitted for Assignment 2 (especially in response to feedback from your marker), or even change topics.
It is recommended that you have most of the report done by the end of week 10 to give you time to make a presentation. Week 11 is then spent polishing and completing the slides and recording the video. Be warned: the slides must be submitted as a PDF, so do not use fancy PowerPoint effects, as they will be lost in the PDF!

How to Submit
Once you have completed your work, take the following steps to submit it. Penalties may be applied to your marks if the following instructions are not followed.
For Assignment 2
1. Please ensure you name the file containing your proposal correctly using the following format: LastName_StudentNumber_Assignment2.pdf
   e.g., Finn_21872187_Assignment2.pdf
2. Upload your assignment files via the assignment link provided on Moodle.
For Assignment 4
1. Please ensure you name the file containing your report correctly using the following format: LastName_StudentNumber_Assignment4_report.pdf
   e.g., Finn_21872187_Assignment4_report.pdf
2. Please ensure you name the file containing your slides correctly using the following format: LastName_StudentNumber_Assignment4_slides.pdf
   e.g., Finn_21872187_Assignment4_slides.pdf
3. Upload your assignment files via the assignment link provided on Moodle. Do not zip or archive any files!
Remember that the link to your recording MUST be included in your report as well as on the first page of your slides.

Further advice on the assignment
Here is some further advice from the lecturer and tutors regarding the assignment:
1. Make sure to carefully read the assignment specification above.
2. The project should be data-centred -- ideally combining multiple sources of data to develop a predictive model that can solve a real-world problem.
3. The project should contain a clear statement of the problem being tackled. What is the objective/purpose of the project? Have a look at the structure of the case studies in the NIST document -- each one starts with a clear definition of the problem.
4. Make sure that the benefit of the project is clear. What is it? Will the project have a financial benefit, or result in a social good?
5. The report needs to "tell a story" and convince somebody to "invest in your project" so that it can be built.
6. Try not to make the project too broad. It should be an achievable data science project.
7. You can include graphics to support claims. For example, depending on your project it may (or may not) make sense to include:
   o an influence diagram showing what data is available and how it relates to the decisions and objectives of the project,
   o graphs showing some exploratory data analysis (if applicable).
8. Make sure you understand the difference between business models and data models. They are not the same thing!
9. Read up as much as you can on the particular topic you've chosen in order to be able to describe the data (and software) requirements of the project.
10. Make it clear where the data would come from for the project:
   o Is the data proprietary? How would it be collected?
   o If the data is public, you could even do some exploratory data analysis on it.
11. What processing would be needed? How would the data need to be processed before it can be used? What software might be needed? Can the processing be distributed?
12. Finally, make sure you've seen the set of possible section headings suggested above and had a look at the examples of reports from past years. (Note though that the assignment specifications were slightly different in previous years.)