# Tutoring Case - EGH404

EGH404 Portfolio
The EGH404 Portfolio is based on an individual dataset that can be downloaded at:

www.egh404.com
Please note that the dataset is too big to be loaded into Excel.

The structure of the dataset is as follows:

Column 1: Point Location in WGS84
Column 2: Temperature in degrees Celsius
Column 3: Rainfall in mm
Column 4: Number of people in the location
Column 5: Sensor value A
Column 6: Sensor value B

Part 1: Data Preparation (25%)
• Import the dataset into Matlab

• Determine minimum, maximum, and average for columns 2, 3, 4, 5 and 6

• Provide a box-and-whisker plot for columns 5 and 6

• Create a scatter plot to look for a correlation between columns 4 and 6

• Create a histogram for sensor value A using 10 bins of equal size
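
The summary statistics and the 10-bin histogram above can be sketched as follows. The portfolio itself asks for Matlab; this is only a Python illustration of the arithmetic, using hypothetical sample values rather than the real dataset. The function names `summarise` and `equal_width_histogram` are made up for this sketch, and placing the maximum value in the last bin is an assumption about how bin edges are handled.

```python
# Hypothetical sample values standing in for one numeric column (e.g. sensor value A).
values = [2.0, 4.0, 4.0, 5.0, 7.0, 9.0, 10.0, 12.0]

def summarise(column):
    """Return (minimum, maximum, arithmetic mean) of a non-empty list of numbers."""
    return min(column), max(column), sum(column) / len(column)

def equal_width_histogram(column, bins=10):
    """Count values in `bins` equal-width intervals spanning [min, max].

    Assumption: the maximum value is counted in the last bin,
    so every value is counted exactly once.
    """
    lo, hi = min(column), max(column)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in column:
        # If all values are equal (width 0), everything lands in bin 0.
        index = int((x - lo) / width) if width > 0 else 0
        counts[min(index, bins - 1)] += 1
    return counts

col_min, col_max, col_mean = summarise(values)
counts = equal_width_histogram(values, bins=10)
print(col_min, col_max, col_mean)
print(counts)
```

The same `summarise` call would be repeated for each of columns 2 to 6. The bin counts are what a histogram plot would draw as bar heights.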

Part 2: Data Analysis (50%)
The dataset contains 10 values per location. For each of the 153,709 locations:

• Determine the average number of people, and plot that average for the 100 locations with
the highest averages

• Determine the minimum and maximum temperature, and plot both values for the 100
locations with the highest maximum temperature
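
A minimal sketch of the per-location grouping described above, written in Python purely for illustration (the portfolio asks for Matlab). The rows, the location labels and the helper names `group_by_location` and `top_n` are hypothetical. The sample uses two readings per location and a top 2, where the real dataset has 10 readings per location and the task asks for the top 100.

```python
from collections import defaultdict

# Hypothetical rows: (location, temperature in degrees Celsius, number of people).
rows = [
    ("A", 20.0, 10), ("A", 24.0, 30),
    ("B", 15.0, 50), ("B", 18.0, 70),
    ("C", 30.0, 5), ("C", 28.0, 15),
]

def group_by_location(rows):
    """Collect the temperature and people readings belonging to each location."""
    temps, people = defaultdict(list), defaultdict(list)
    for location, temperature, count in rows:
        temps[location].append(temperature)
        people[location].append(count)
    return temps, people

def top_n(scores, n):
    """Return the n (location, score) pairs with the highest score, highest first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:n]

temps, people = group_by_location(rows)
mean_people = {loc: sum(v) / len(v) for loc, v in people.items()}
temp_range = {loc: (min(v), max(v)) for loc, v in temps.items()}
max_temp = {loc: hi for loc, (lo, hi) in temp_range.items()}

busiest = top_n(mean_people, 2)  # the assignment uses n = 100
hottest = top_n(max_temp, 2)     # ranked by maximum temperature
print(busiest)
print([(loc, temp_range[loc]) for loc, _ in hottest])
```

For the temperature plot, the ranking uses the maximum only, while `temp_range` still holds both the minimum and the maximum for each selected location.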

For the entire data set, is there:

• A correlation between temperature and rainfall? Provide a plot and written answer (3
sentences max, let the plot do the talking)

• A correlation between Sensor value B and the number of people? Provide a plot and written
answer (3 sentences max, let the plot do the talking)
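
Alongside the plots, a single correlation coefficient can back up the written answer. The brief does not name a measure, so using the Pearson coefficient here is an assumption, and it only captures linear association. The sketch below is in Python for illustration (the portfolio asks for Matlab), with hypothetical readings and a hypothetical function name `pearson_r`.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences.

    Assumes both sequences have at least two values and neither is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical readings in which rainfall falls as temperature rises.
temperature = [10.0, 15.0, 20.0, 25.0, 30.0]
rainfall = [50.0, 40.0, 30.0, 20.0, 10.0]

r = pearson_r(temperature, rainfall)
print(round(r, 3))
```

A value near +1 or -1 suggests a strong linear relationship and a value near 0 suggests little linear relationship. The scatter plot remains necessary to spot non-linear patterns and outliers.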

What is the expected temperature when rainfall is between 15 and 50 mm, based on the dataset
provided?
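
One possible reading of this question is the mean temperature over all rows whose rainfall falls in the stated range. That interpretation, the inclusive bounds, and the function name `expected_temperature` are assumptions of this sketch. It is written in Python for illustration (the portfolio asks for Matlab) and uses hypothetical readings.

```python
def expected_temperature(pairs, low=15.0, high=50.0):
    """Mean temperature over (temperature, rainfall) pairs with low <= rainfall <= high.

    Returns None when no reading falls inside the rainfall range.
    Assumption: both bounds are inclusive.
    """
    selected = [temperature for temperature, rain in pairs if low <= rain <= high]
    if not selected:
        return None
    return sum(selected) / len(selected)

# Hypothetical (temperature in degrees Celsius, rainfall in mm) readings.
readings = [(30.0, 5.0), (24.0, 15.0), (20.0, 30.0), (16.0, 50.0), (12.0, 80.0)]

result = expected_temperature(readings)
print(result)
```

Whether the bounds should be inclusive or exclusive is not stated in the brief, so it is worth stating the choice in the written answer.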

Part 3: Spatial Analysis (25%)
• Import the dataset into QGIS

• Create a map that colour-codes the locations based on sensor value A

• Create a map that colour-codes the locations based on minimum rainfall and uses marker sizes
based on the maximum number of people in the location

• Create a heat map of the maximum sensor value B per location

Submission templates will be discussed in the workshop.

