代写辅导接单-FIT5137 S2 2024

欢迎使用51辅导,51作业君孵化低价透明的学长辅导平台,服务保持优质,平均费用压低50%以上! 51fudao.top

FIT5137 S2 2024 Assignment 3: PTV Assignment Scenario

(Weight = 35%)

Due date: Friday, 25 October 2024, 4:30 PM

Version:2.0–21/08/2024

General Information and Submission

● This isanindividualassignment.

● Submission method:Submissionis online throughMoodle.

● Penalty forlatesubmission:5% deductionfor each day.

● Assignment FAQ: Thereis anAssignment3FAQpage set upontheEdStemforum.

Assignment Background

YouhavebeenhiredasadataanalystatPublicTransportVictoria(PTV),theVictoriaGovernment

authority responsible for public transport in the state. Some of your duties are data extraction,

integration and analysis to provide good understanding regarding the public transportation condition

inVictoriatothestakeholders.

After the COVID-19 restrictions were lifted, most companies are switching the workstyle from

work-from-home to face-to-face. Therefore, transportation infrastructure and network is one of the

most important aspects. While some people prefer to drive to work, some other people prefer to use

the public transportation network as their main transportation mode. PTVasthesoleproviderforthe

public transportation network reduced their services during the lockdown period. Now, PTV has

restored theservicestocoverasmanyareasaspossibleinthewholeregion.However,somequestions

remained mysteries. How good is the current PTV coverage? Are there anyuncoveredspots?Which

areahasthebestpublictransportationoptions?

Therefore, as a data analyst, your task is to evaluate the data and provide the spatial dataanalysisto

the stakeholders of PTV. The data should be presented inanarealevel,suchasmunicipality,suburbs

or postcode. For example, you may present “The number of bus services in Bundoora” or "The

numberofTrainsorTramsnetworkinBundoora”.

Data

There are two datasets that you have to obtain in this assignment, which are the PTV/GTFS

datasetandAustralianBoundarydata.

The General Transit Feed Specification (GTFS) is a data specification that allows public

transit agencies to publishtheirtransitdatainaformatthatcanbeconsumedbyawidevarietyof

software applications. Today, the GTFS data format is used by thousands of public transport

providers.

GTFS is split into a schedule component that contains schedule, fare, and geographic transit

information and a real-time component that contains arrival predictions, vehicle positions and

service advisories. A GTFS feed iscomposedofaseriesoftextfilescollectedinaZIPfile.Each

filemodelsaparticularaspectoftransitinformation:stops,routes,trips,andotherscheduledata.

For more detailed information about GTFS, you canrefertotheofficialdocumentationprovided

by Google at https://developers.google.com/transit/gtfs. Additionally, You can read further

explanation about the PTV-GTFS data from https://transitfeeds.com/p/ptv/497. For this

assignment,wewillbeusingthe17thMarch2023versionofthedataset.

TheGTFSdatastructureisshownbelow:

The Australian digital boundary is defined by the Australian Bureau of Statistics using the

Australian Statistical Geography Standard (ASGS).TheASGSisaclassificationofAustralia

into a hierarchy of statistical areas. It is a social geography, developed to reflect the location of

people and communities. It is used for the publicationandanalysisofofficialstatisticsandother

data. The ASGS is updated every 5 years to account for growth and change in Australia’s

population, economy and infrastructure. For the 2021 release, the ASGSwillbere-namedtothe

AustralianStatisticalGeographyStandard(ASGS)Edition3.

The ASGS is split into two parts, the ABS and Non ABS Structures. The ABS Structures are

geographies that the ABS designs specifically for the release and analysis of statistics. This

means that the statistical areas are designed to meet the requirements of statisticalcollectionsas

well as geographic concepts relevant to those statistics. This helps to ensure the confidentiality,

accuracyandrelevanceofABSdata.TheNonABSStructuresgenerallyrepresentadministrative

regions which are not defined or maintained bytheABS,butforwhichtheABSiscommittedto

directlyprovidingarangeofstatistics.

The Main Structure is developed by the ABS and isusedtoreleaseandanalyseabroadrangeof

social, demographic and economic statistics. It is a nested hierarchy of geographies, and each

level directly aggregates to the next level. Mesh Blocks (MBs) are the smallest geographic

areas defined by the ABS and form the building blocks forthelargerregionsoftheASGS.

MostMeshBlockscontain30to60dwellings.

Belowisthesimplified ABSandNonABSStructure.Youcanreadfurtherexplanationaboutthe

structurehere

https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition

-3/jul2021-jun2026#overview

TheDigitalboundaryfilesthatyouhavetogetistheMeshBlocksdataset.TheMeshBlocks

datasetisavailableasShapefile.YoucanreadfurtherexplanationabouttheMeshBlocksdataset

here

https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition

-3/jul2021-jun2026/access-and-downloads/digital-boundary-files

Allocationfilesarenon-spatialrepresentationsofhoweachgeographyisaggregatedfromtheir

buildingblockgeography.YoucanalsoreadfurtherexplanationabouttheAllocationfilesdataset

here

https://www.abs.gov.au/statistics/standards/australian-statistical-geography-standard-asgs-edition

-3/jul2021-jun2026/access-and-downloads/allocation-files

AssignmentTasklist

Your assignment consists of several parts. Always read the instruction one byone.Donotmove

tothestepwithoutcompletingthepreviousstep:

● Task 1: Data Restoration - Restore the data to the database. Monitor the success

indicatortoensuresuccessfulrestorationofthedata.

● Task 2: Data Preprocessing - Perform necessary structure maintenance and create

resulttablesforfurtherprocessing.

● Task 3: Data Analytics and Visualization - Develop SQL queriestoanalyzethedata

and evaluate performance & Create visualizations to present the results of the data

analytics.

● Nodatacleaningrequiredforthisassignment.

● Formoreinformation,seetheFAQforAssignment3.

For simplicity, all the data required for this assignment is readily available in thePostGIS

Docker container. You can access these datasets within the container by navigating to the

/data/adatafolder.Ifyoudon’tknowhowtodoit,refertothelabs10activities.

Verifyyourdatabeforetherestorationprocess.

As a data analyst, it is your responsibility to understand and

explore these publicly available data.

Assignment Task

Task 1: Data Restoration

Before you can start the data analyticprocesses,thefirstthingyouhavetodoistorestorethe

external data to your database. Make sure you prepare a destination schema to restore your

data.Thedestinationschemaforyourassignmentis“ptv”.

Note:

● Before initiating the data restoration process, it is essential to thoroughly explore

the dataset.Thisexplorationinvolvesidentifyingappropriatedatatypes,determining

field lengths, and making other relevant considerations that will inform the creation

ofthetablestructure.

● Ensure that you restore the data into the PTV schema using regular (local)

tables. Donotutiliseforeigntables,asthedatamustbestoreddirectlywithinthe

PostgreSQLdatabase.

● Ensurethatalltablesaresuccessfullyrestored,including8tablesfromGTFS

and3tablesforMB_2021,LGA_2021andSAL_2021respectively.

Theoutputs of thistask for Reportare:

a) AttachascreenshotoftheresultstoincludeallthetablesyourestoredinTask1,

includingthenumberofrowsforeachtableyourestoredbyusingfollowingcode:

withtblas

(selecttable_schema,TABLE_NAME

frominformation_schema.tables

wheretable_schemain('ptv'))

selecttable_schema,TABLE_NAME,

(xpath('/row/c/text()',query_to_xml(format('selectcount(*)ascfrom%I.%I',table_schema,

TABLE_NAME),FALSE,TRUE,'')))[1]::text::intASrows_n

fromtbl

orderbytable_name;

Task 2: Data Preprocessing for Melbourne Metropolitan area

The purpose of this section is to manipulate the data into a suitable format for the following

taskanalysis.Thistaskhastwoparts:MandatoryrequirementandOptionalrequirements.

Mandatoryrequirement

[Youmustmeetthemandatoryrequirementsdescribedinthissection.]

In this assignment, we aim to explore the transportation accessibility [Topic of report] of

the Melbourne Metropolitan area exclusively [Scope of report]. The mb_2021 table

contains mesh blocks for the entirecountryofAustralia.Tominimisequerycosts,ensurethat

you only use the mesh blocks within the Melbourne Metropolitan area for this assignment.

The Melbourne Metropolitan’s mesh blocks can be identified from the gcc_name21. If the

column contains “Greater Melbourne”,thismeshblockislocatedinMelbourneMetropolitan.

As a result, you need to create a table called "mb2021_mel" that contains ONLY the mesh

blocksinMelbourneMetropolitan.

Optionalrequirements:

[You are free to explore and manipulate the data creatively within the mandatory

requirements, which are limited to Melbourne Metropolitan for the topic of transport

accessibility.]

For optional requirements can be selected based on your specific data analysis needs.Make

sure to include a detailed explanationofyourrationaleinthereportforoptionalrequirements

youchoose.

Question:

DoIhavetoansweratleastoneoftheoptionalrequirements?

Answer:

No, you are free to explore and manipulate the data creatively as long as the data is

analysedinMelbourneMetropolitanforthetopicoftransportaccessibility.

The following suggestion may useful for data exploration and analysis transportation

accessibilityofMelbourneMetropolitanarea:

1.Since the working area will be Melbourne Metropolitan, it is important to have a polygon

for the boundary of our working area. Hint: aggregate all mesh blocks polygon tocreateone

largepolygonforMelbourneMetropolitanboundary.

2.Stops table does not have any geometry column. It might be useful to add a geometry

column, using the latitude and longitude values available in the table. Make sure you use

GDA2020(SRID:7844)forthiscolumn.

3.The Stops table does not show direct information regarding the vehicle types,

routes_short_nameandroutes_long_name.Theseinformationsarestoredintheroutestable.

4.If you want to explore the transportation situation for different vehicle types, such astram,

train, or bus, the vehicle type is determined by the corresponding route type in the routes

table,where:

● 0correspondstotram

● 2correspondstotrain

● 3correspondstobus

● Anyotherroutetypeislabelledas'Unknown'.

Theoutputs of thistask for Reportare:

b) Attach a screenshot of SQL script for creating a table named “mb2021_mel” that

containsONLYthemeshblocksinMelbourneMetropolitan.

c) Provide a detailed explanation of the remaining data processing steps you have

conducted, includingscreenshotsoftheSQLscriptsandtherationalebehindyourchoices

inthereport.

Task 3: Data Analytics and Visualisation

Inthissectionyouwillneedtoperformdataanalysisonthetablesyouhaverestored,focusing

on transport accessibility in metropolitan Melbourne. Use the techniques youhavelearnedin

the spatial database part to carry out your analysis. You are free to choose any specific

perspectives or aspects of data analysis relevant to your dataset,butensurethatyouranalysis

relatestothemaintopic:transportaccessibilityinmetropolitanMelbourne.

This could include exploring different statistical measures or carrying out other relevant

analyses.Presentyourfindingsclearlyandconcisely,demonstratingyourunderstandingofthe

datasetandhighlightinganynotableobservationsorpatterns.

As part of this data visualisation, you will also need to create at least one map-based

headmap using QGIS to present your findings related to the main topics. These

visualisations will be used in the next section of the assignment, the summary report. To

supportyouranalysis,youcanincludescreenshotsofthevisualisationsdirectlyinthereport.

Be sure to include the script or code used for data analysis and data visualisation in the

appendix of your report. The script should provide clearinstructionsonhowtheanalysiswas

performed and any necessary calculations or transformations applied to the data. This will

ensure that your analysis can be reproduced and verified. Remember to include appropriate

labels, titles, and legends in your visualisations to make them easy to understand. The

visualisations should be of sufficient quality and clarity to effectively convey your analysis

findings.

Note:

● UseSQLqueriestoinvestigatetherestoredtables.

● Conductathoroughdescriptiveanalysistouncoverinsightswithinthedata.

● SummariseandVisualiseyourfindingsclearlyandconcisely.

● Highlightkeyobservationsandpatternsdiscoveredduringtheanalysis.

● Ensureyourfindingsreflectadeepunderstandingofthedata.

Theoutputs of thistask for Reportare:

d) Data analysis and visualisation, including the screenshot of SQL scriptandvisualisation.

Forthevisualisation,itmustcontainatleastonemapbasefigure.

Submission Checklist

Summary Report for Task 1 to 3

As aprofessionaldataanalyst,yourtaskistoconsolidatealltheprevioustasks,includingdata

restoration, processing, analysis, and visualisations, into a comprehensive writtenreport.The

report should adhere to a word limitof2000wordsandfollowastructuredformat,consisting

of an introduction, methodology, results, conclusion, and appendix. Please note that a

question-and-answer format is not acceptable for this assignment, and marks will be

deductedforusingsuchaformat.

Please ensure that the report adheres to the given word limit and is well-organised, concise,

andcoherent.Thesamplereportshouldbeformattedasfollows:

Title:Writeyourtitlehereonaseparatepage,(Note:Abstractisnotrequired)

1.Introduction,suchas

● Brieflyexplainthepurposeofthereportandwhatyouaimtoachievewithyour

analysis.

● Highlightthekeyquestions youwanttoinvestigatethroughyouranalysis.

2.Methodology

Thissectionshouldprovideaclearexplanationofthedifferentstagesofyourwork.

● DatasetOverview,suchas

Provideanoverviewofthedataanditssource.

● DataRestorationandPreprocessing,suchas

Explainhowyouimportedandinitiallyexploredthedata.Includethesoftwareand

librariesused.

Provide a detailed explanation of the remaining data processing steps you have

conducted,andtherationalebehindyourchoicesinthereport.

● DataAnalysisandVisualization

Describe which area of transport accessibility in metropolitan Melbourne you are

primarilyexploring.

Describe the analysis you conducted and the types of visualisations you chose to

use, and why you felt they would effectively represent your data and findings.

Whatsoftwareorlibrarieswereusedtocreatethesevisualisations?

3.Results:

Presenttheresultsofyourin-depthinvestigations.Explainwhattheseresultsmean

andhowtheyansweryourinitialquestions.

Forthevisualisation,itmustcontainatleastonemapbasefigure.

4.Discussion

Discussyourfindingsandtheirimplications.

● Restatethemainfindingsofyourdescriptiveandadvancedanalyses.

● Discusshowthesefindingsansweryourinitialquestionsorhypotheses.

● Reflectontheprocessandanylimitationsorchallengesyoufacedduringyour

analysis.

5.References[Excludedfromthe2000-wordlimit]

If you have used external resources, don't forget to cite them properly according to the

chosenstyleguide(APA7thedition).

6.Appendix[Excludedfromthe2000-wordlimit]

Thescreenshotsofthefollowingtasks:

● AttachascreenshotoftheresultstoincludeallthetablesyourestoredinTask1,

includingthenumberofrowsforeachtableyourestored.

● Attach screenshots of theSQLscriptsusedinTask2,includingtheSQLscriptsfor

creating a table called "mb2021_mel" and screenshotsoftheSQLscriptsyouused

fortheremainingdataprocessingsteps.

● TheSQLscriptforDataanalysisandvisualisation

Video presentation

A five minute video presentation in mp4 format save as:

YourstudentID_A3_video.mp4

Based on the report you have created, present your design and findings in a

five-minute video presentation. Ensure you thoroughly understand both the dataset

and thereport toeffectivelyextract andcommunicatethe keypoints.

Assignment Submission

1. A combined pdf file save as: YourstudentID_A3_report.pdf, containing all of the

abovetasks1to 3.

2. A five minute video presentation in mp4 format save as:

YourstudentID_A3_video.mp4

Zip all abovefilesfrom step 1to3,andname theZIP folder asA3_YourstudentID.zip.

● The submission of this assignment must be in the form of a single ZIP file.

Only PDF and .mp4fileswill beaccepted withinthe zipfile.No otherformats

will be accepted.

● You must ensure that you have all the files listed in this checklist before

submitting your assignment to Moodle. Failure to submit a complete list of

fileswill lead tomarkpenalties.

● It's important to note that our support hours arelimited,and wedon'thave the

capacity to address submission issues outside of working hours. You must

ensure that you have allthe fileslisted inthis checklistbefore submittingyour

assignment to Moodle. Failuretosubmita completelistof fileswill resultina

markpenalty.

● Penalty for latesubmission: 5%deductionfor each day,including weekends

● Submission cut-off time:Friday, 1November2024,4:30PM.Submissionswill

not beacceptedafter this timeunlesstherearespecialconsiderations.

Authorship

This assignment is an individual assignment and the final submission must be identifiably

your own work. Breaches of this requirement willresult inanassignment notbeing accepted

for assessment andmayresult indisciplinaryaction.

Late Penalty

Late assignments submitted without an approved extension may be accepted up to a

maximum of seven days with theapprovalof theChief Examinerand/orLecturer butwill be

penalised at the rate of 5% per day (including weekends and public holidays).

Assignments submitted more than seven days after the due date will receive a zeromarkfor

that assignmentand maynot receiveany feedback.

Please note(late penaltyand extension):

1. An inability to manage your time or computing resources will not be accepted as a

valid excuse. (Several assignments being due at thesametimearea factofuniversity

life.)

2. Hardware failures, whether of personal or university equipment, are not normally

recognised as valid excuses.Failure tobackupassignment files isalso notrecognised

asa validexcuse.

Special Consideration

Students no longer seek extensions from chief examiner/teaching team. All extensions /

special considerations will now be handled by the central Spec Con team. Please do not

email teachingstaff to requestan extensionorspecialconsideration.

Extensions and other individual alterations to the assessment regime will only be

considered using the University Special Consideration Policy. Students should carefully

read the Special Consideration website, especially the details about what formal

documentation isrequired.

All special consideration requests should be made using the Special Consideration

Application.

Please do not assume that submission ofaSpecial Considerationapplication guarantees

that it will be granted – you must receive an official confirmation that it has been

granted.

Getting help and support

What canyougethelpfor?

● Consultationswith theTeaching Team

Talkto theTeachingTeam:

https://learning.monash.edu/course/view.php?id=19675§ion=5

● English language skills

Talkto English Connect:https://www.monash.edu/english-connect

● Study skills

Talkto alearningskills advisor:https://www.monash.edu/library/skills/contacts

● Counselling

Talkto acounsellor:https://www.monash.edu/health/counselling/appointments

Plagiarism and Collusion:

Monash University is committed to upholding standards and academic integrity andhonesty.

Pleasetake thetimetoview theselinks.

Academic Integrity Module

Student Academic Integrity Policy

Test your knowledge,collusion(FIT NoCollusion Module)

All the best for your Assignment!

51作业君

Email:51zuoyejun

@gmail.com

添加客服微信: Fudaojun0228