Assignment 2: COMP20008 Data Science Project Report

Created by Group 51: Henry Do, Jialiang Shen, Meng Yang, Ruotong Zhao

Research topic: Are there sufficient health services available for residents in Victoria?

With the COVID-19 pandemic persisting over the past year, each nation’s disease control and prevention measures have been put to the test, in particular the adequacy of the health services available to its population. In light of this, the report investigates whether there are sufficient health services available for residents in Victoria, Australia. It aims to assist government decisions that enhance the well-being of Victorians by improving the government's understanding of health factors pertaining to communities in Victoria, so that well-informed decisions can be made to develop both new and current health services.

The report is divided into three sections. The first describes the chosen datasets: where they were sourced, what they contain, and how they have been combined for processing. The second details the preprocessing and analysis methods used, along with justifications for their use and the assumptions made. The final section describes the results obtained from analysing the datasets, their significance, their limitations, and improvements for future research.

Data

The datasets selected for this project were sourced from AURIN, a national database network provider for the academic, government and private sectors in Australia, which supports research that empowers decisions to improve the well-being of Australian communities. The chosen datasets are all in CSV format:

1. Local Government Area (LGA) profiles data 2015 for VIC (6KB)
2. Admissions - Hospital Types and Sex (LGA) 2014-2015 (7KB)
3. Admissions - Hospital Types and Sex (LGA) 2016-2017 (7KB)
4. Avoidable Mortality - Selected Causes (LGA) 2011-2015 (18KB)
5. Potential Preventable Admissions - All Conditions (LGA) 2014-2015 (3KB)
6. Potential Preventable Admissions - All Conditions (LGA) 2016-2017 (3KB)
7. National Hospital Statistics 2012-2013 (7KB)

Limitations of datasets: year. The data may not be recent and may therefore be irrelevant.

2. What are the datasets you’ve used and how have you linked them together?

Preprocessing

Discussion

References

TO BE DELETED PRIOR TO SUBMISSION

Feedback from Akira to keep in mind:

A strong research topic with convincing motivation behind the choice of data. A suggestion I have is to also take a look at the location of hospitals/GPs to see if you can find some meaningful relation there. This is because you are looking at financial budget statements, and it's important to note that more spending does not imply better quality of health services. In fact, it might be possible to argue that an increased budget may be the result of a lack of quality originally, and building these services up can take years! It may also be a good idea to research some news articles surrounding the health sector. Although your datasets will provide the bulk of results, your analysis can draw on real events that can be referenced throughout the report to support your claims :)

This will be the structure of our report:

1. Describe data: where we got it from (the source), what it shows, and what it contains.
2. Preprocessing: Akira said it will be fine if we use dot points to describe the steps taken to process the data, but they must be proper sentences. Here we describe our assumptions and justifications (why we process the data this way), any bias in the data, and how this was addressed (maybe this can go in the discussion section).
3. Discussion: most of our analysis. Include graphs and plots, describe trends, deduce possible relationships, and note errors that may have arisen and limitations.

Your report should include the following information:

1. What is the research question and how is it related to the theme of understanding the liveability, inclusiveness, health and sustainability of communities in Victoria?
2. What are the datasets you’ve used and how have you linked them together?
3. What wrangling and analysis methods have you applied? Why have you chosen these methods over other alternatives?
4. What are the key results your research has obtained?
5. Why are your results significant and valuable?
6. What are the limitations of your results and how can the project be improved in the future?

Your report should make effective use of visualisations to support your argument.

GitHub Link: https://github.com/COMP20008/assignment-2-group-51.git