Assessment 2 Information

Subject Code: DATA4200
Subject Name: Data Acquisition and Management
Assessment Title: Sampling report
Assessment Type: Report
Word Count: 300 Words (+/-10%)
Weighting: 20 %
Total Marks: 20
Submission: via Turnitin
Due Date: Monday Week 8, 23:55 AEST

• Complete all parts below by the due date, Monday of Week 8 at 23:55 AEST. Consult the
rubric at the end of the assignment for guidance on structure and content.
• Submit the results as a Word file in Turnitin.

Assessment Description

• Business Problem: Suppose that you are a data analyst for the Police department in Boston,
USA. You have been asked to start analysing subgroups of crime statistics in your city. Your
boss wants you to sample the data rather than using all of it.

• Data sets: Access the following files on the KBS portal:
Boston_Crimes.csv
Offense_codes.xlsx

• Recall that we first saw this data set in weeks 5 and 6.

• Learning outcomes: LO2, LO3, LO4

Assessment Instructions

Part A:
In Tableau or Microsoft Excel,

1. Open the “Offense_codes.xlsx” file and choose a small range of offences, for example 311 (Robbery
-knife – Chain store) to 315 (Robbery -street) or 613 Larceny.

2. Open the Boston_Crimes.csv file and filter the data to keep only the crimes with the codes
you chose in step 1.

3. Recall the sampling methods below that you learnt about in week 7.

4. Apply one of the following sampling methods that you have learnt (quota, judgement, random
or stratified) to select a sample of 100 crime cases from the subgroup you chose in steps 1 and 2.

In a paragraph of approx. 250 words,
A) explain how you arrived at the sampled data, and
B) summarise where and when your crime type occurred.
[10 marks]

• For those choosing Quota sampling, imagine you have a quota of 100 cases.

• For those choosing Judgement sampling, choose 100 cases and make sure they are different.

• If choosing 100 cases randomly or using stratified sampling, you will need to make sure you
select the cases using the appropriate functions.

5. Discuss a possible disadvantage of the sampling method that you chose. [2 marks]
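
For illustration only, steps 2 and 4 above can also be sketched in Python. The assessment itself asks for Tableau or Microsoft Excel, so treat this as a way to check your understanding of the logic rather than as the required method. The column names OFFENSE_CODE and DISTRICT are assumptions (check the headers in the real file), the rows below are made-up placeholder data rather than the Boston data, and the function names are hypothetical. The sketch filters to a chosen code range, then draws either a simple random sample or a proportionally allocated stratified sample.

```python
import random
from collections import defaultdict

# Assumed column names; confirm them against the headers in Boston_Crimes.csv.
CODE_COL = "OFFENSE_CODE"
STRATUM_COL = "DISTRICT"


def filter_by_codes(rows, codes):
    """Step 2: keep only the rows whose offence code is in the chosen set."""
    wanted = {int(c) for c in codes}
    return [r for r in rows if int(r[CODE_COL]) in wanted]


def simple_random_sample(rows, n, seed=0):
    """Draw n rows without replacement; every row is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(rows, min(n, len(rows)))


def stratified_sample(rows, n, seed=0):
    """Sample each stratum in proportion to its share of the subgroup."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for r in rows:
        strata[r[STRATUM_COL]].append(r)
    n = min(n, len(rows))
    # Exact proportional share per stratum, rounded down first.
    shares = {k: n * len(v) / len(rows) for k, v in strata.items()}
    alloc = {k: int(s) for k, s in shares.items()}
    # Hand the remaining places to the strata with the largest remainders.
    leftover = n - sum(alloc.values())
    by_remainder = sorted(shares, key=lambda k: shares[k] - alloc[k], reverse=True)
    for k in by_remainder[:leftover]:
        alloc[k] += 1
    sample = []
    for k, group in strata.items():
        sample.extend(rng.sample(group, alloc[k]))
    return sample


# Made-up placeholder rows standing in for the real file.
demo_rng = random.Random(1)
demo_rows = [
    {CODE_COL: str(demo_rng.choice([311, 313, 315, 613, 802])),
     STRATUM_COL: demo_rng.choice(["A1", "B2", "C11"])}
    for _ in range(1000)
]
subgroup = filter_by_codes(demo_rows, range(311, 316))
random_100 = simple_random_sample(subgroup, 100)
stratified_100 = stratified_sample(subgroup, 100)
print(len(subgroup), len(random_100), len(stratified_100))
```

Fixing the seed makes the draw repeatable, which helps when you explain in part A) how you arrived at the sampled data. Stratifying by district is only one possible choice; any column that splits the subgroup into non-overlapping groups could serve as the stratum.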

Part B: In order to lower the crime rate in Boston, suggest another two data sets that could be
combined with the crimes data, and explain how you would use the information to obtain a better
understanding of why the particular group of crimes you chose is occurring. [4 marks]

Part C: Describe sampling and non-sampling error in your own words and relate it back to the data.
[4 marks]

Sampling methods:
• Non-probability sampling: Quota, Convenience, Judgement
• Probability sampling: Simple random, Systematic, Stratified

Assessment Marking Guide

Data Management Project
Criteria                    Requirement                                   Marks Available

Part A: Sampling methods    Select, apply, justify and interpret one      12 marks
                            sampling method.
                            Evaluate a disadvantage of the method
                            chosen.

Part B: Other data sets     Recommend another two relevant data sets      4 marks
                            to add to the crimes data, in order to
                            provide further understanding.

Part C: Sampling Error      Explain sampling and non-sampling error       4 marks
                            and relate this back to the data.

