 MATH 4044 – Statistics for Data Sciences

Case Study SP5 2023

Due 29th Oct 2023 by 11:59pm

  Instructions

• This assignment is worth 35% of your final mark. It is due no later than 11:59pm on 29th Oct 2023.

• You will need to submit your assignment via learnonline.

• The submitted assignment needs to be a single file, in either Microsoft Word (doc or docx) or pdf format, and at most 25 pages excluding any appendices.

• The assignment is out of 100 marks. To achieve maximum marks for each question, you should aim to:

– Complete the requested statistical analysis in SAS using appropriate tasks or procedures (40%).

– Include only the output most relevant to the question and interpret all key results (40%). Do not include every piece of output produced by SAS!

– Discuss the results more broadly in the context of the given scenario (20%).

• Assignments submitted late, without an extension being granted, will attract a penalty of 10 marks for each working day, or part thereof, beyond the due date and time.

MATH 4044 Statistics for Data Sciences Case Study

  Introduction

Rental bikes have been introduced in many cities to improve mobility and comfort. It is important that rental bikes are available and accessible to the public at the right time, as this reduces waiting time. Providing the city with a stable supply of rental bikes therefore becomes a major concern. We would like to see how the rented bike count varies with factors such as season, rain, temperature and day of the week.

Data Description

The data file for this assignment is called Seoulbike.sas7bdat. If you are using SAS University Edition, the file can be downloaded from the assessment page. The data contain the count of public bike rentals at the peak-demand hour (6pm–7pm) in the Seoul bike hiring system. This is a processed version of the data downloaded from http://archive.ics.uci.edu/ml/datasets/Seoul+Bike+Sharing+Demand.

The dataset contains weather information (Temperature, Humidity, Windspeed, Visibility, Dewpoint, Solar radiation, Snowfall, Rain), the number of bikes rented at the peak hour, and date and season information. The variable descriptions are as follows.

Variable          Description
date              Date (day-month-year)
rented            Peak rented bike count
temperature       Temperature in Celsius
humidity          Relative humidity (%)
windspeed         Wind speed in m/s
visibility        Visibility (multiples of 10m)
dewpoint          Dew point temperature in Celsius
SolarRadiation    Solar radiation (MJ/m²)
Rain              1: rainy, 0: no rain
Snowfall          Snowfall (cm)
seasons           Winter, Spring, Summer, Autumn
Holiday           Yes; No
wkday             Day of the week, Monday – Sunday
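A minimal sketch of one way to make the data file available in a SAS session and check it against the variable descriptions above. The libref `bike` and the folder path are assumptions for illustration, not part of the assignment; replace the path with the folder where Seoulbike.sas7bdat was saved. On some Linux-based SAS installations the file name may need to be in lower case before SAS can find it.

```sas
/* Assumed libref and folder (hypothetical): replace with the folder
   that holds Seoulbike.sas7bdat */
libname bike "/folders/myfolders/casestudy";

/* List the variables and their types in file order, to compare
   against the variable descriptions */
proc contents data=bike.seoulbike varnum;
run;

/* Level counts for the categorical variables */
proc freq data=bike.seoulbike;
  tables wkday seasons Rain Holiday;
run;

/* Summary of the response within each day of the week */
proc means data=bike.seoulbike n mean std min max;
  class wkday;
  var rented;
run;
```

Checking how the levels of wkday are stored (full names, abbreviations or formatted numbers) at this point is useful, because later contrast coefficients depend on the sorted order of those levels.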


  Case Study Tasks

In all questions, provide relevant SAS output and interpretations. Remember to check the relevant assumptions, and to examine and comment on the residuals.

Question 1 (55 marks)

a) (25 marks) Carry out a one-way analysis of variance relating rented to wkday. Use a contrast to test at least one a priori hypothesis of your choice. Examine and comment on the residuals. Also carry out appropriate post-hoc comparisons and discuss your results. Comment on the suitability of ANOVA in this study.

b) (20 marks) Extend the analysis in part (a) to test whether there is evidence of interaction between wkday and rain. Study the simple effects. Carry out appropriate post-hoc comparisons and discuss your results.

c) (10 marks) If ANOVA is not suitable for the study in part (a), carry out the Kruskal-Wallis test relating rented to wkday. If appropriate, carry out the post-hoc analysis. Discuss your results. Note: consider using the option dscf to produce post-hoc comparisons.
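The procedures that parts (a) to (c) call for could be set up roughly as follows. This is a sketch under stated assumptions, not a model answer: the libref `bike`, the folder path, the output dataset name `resid_a`, and the weekday-versus-weekend contrast are all illustrative choices, and the contrast coefficients assume that the wkday levels are full day names sorted alphabetically.

```sas
/* Assumed libref and folder (hypothetical): point this at the data folder */
libname bike "/folders/myfolders/casestudy";

/* (a) One-way ANOVA of rented on wkday */
proc glm data=bike.seoulbike plots=diagnostics;
  class wkday;
  model rented = wkday / ss3;
  /* Levene test for equal variances and Tukey post-hoc comparisons */
  means wkday / hovtest=levene tukey;
  /* Illustrative a priori contrast: weekdays vs weekend.
     Coefficients follow the sorted order of the CLASS levels; the order
     assumed here is
     Friday Monday Saturday Sunday Thursday Tuesday Wednesday.
     Check the "Class Level Information" table before relying on it. */
  contrast 'Weekday vs weekend' wkday 2 2 -5 -5 2 2 2;
  /* Save residuals and fitted values for further checks */
  output out=resid_a r=residual p=predicted;
run;
quit;

/* Normality checks on the residuals from (a) */
proc univariate data=resid_a normal;
  var residual;
  qqplot residual / normal(mu=est sigma=est);
run;

/* (b) Two-way ANOVA with the wkday by Rain interaction */
proc glm data=bike.seoulbike plots=diagnostics;
  class wkday Rain;
  model rented = wkday Rain wkday*Rain / ss3;
  /* Simple effects: wkday within each level of Rain, and Rain within
     each level of wkday */
  lsmeans wkday*Rain / slice=Rain slice=wkday;
  /* Tukey-adjusted pairwise comparisons of the cell means */
  lsmeans wkday*Rain / pdiff adjust=tukey;
run;
quit;

/* (c) Kruskal-Wallis test with DSCF pairwise comparisons */
proc npar1way data=bike.seoulbike wilcoxon dscf;
  class wkday;
  var rented;
run;
```

With more than two levels in the CLASS variable, the WILCOXON option of PROC NPAR1WAY reports the Kruskal-Wallis test, and DSCF adds the Dwass, Steel, Critchlow-Fligner pairwise comparisons mentioned in the note to part (c).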

Question 2 (30 marks)

Use SAS to perform a one-way ANCOVA relating rented and wkday with temperature as a covariate, including appropriate post-hoc comparisons:

• Confirm that there is a linear relationship between the response variable and the covariate (a scatterplot and a correlation coefficient plus a comment will suffice).

• Check the two additional ANCOVA assumptions (report and comment only on the parts of the output most directly relevant to condition checking):

– Independence of the covariate and the treatment effect (perform a one-way ANOVA test). Will the covariate help enhance the difference in rented by seasons, or will it be a confounding factor?

– Equality of slopes (add the interaction term and check its significance).

• Report and briefly discuss your results. Compare your results with Question 1(a).

Technical note: Make sure you obtain and examine the Type III sums of squares (ss3). Also obtain estimates of the ‘least squares means’ (lsmeans), which are the treatment means adjusted for the covariate.
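The checks and the ANCOVA fit listed above could be sketched in SAS along the following lines. The libref `bike` and the folder path are assumptions for illustration, and the order of the steps simply mirrors the bullet points; this is one possible setup rather than a prescribed solution.

```sas
/* Assumed libref and folder (hypothetical): point this at the data folder */
libname bike "/folders/myfolders/casestudy";

/* Linearity: scatterplot with a fitted regression line */
proc sgplot data=bike.seoulbike;
  scatter x=temperature y=rented;
  reg x=temperature y=rented / nomarkers;
run;

/* Pearson correlation between the response and the covariate */
proc corr data=bike.seoulbike pearson;
  var rented temperature;
run;

/* Independence of covariate and treatment:
   one-way ANOVA of temperature on wkday */
proc glm data=bike.seoulbike;
  class wkday;
  model temperature = wkday / ss3;
run;
quit;

/* Equality of slopes: add the wkday*temperature interaction and
   examine its Type III test */
proc glm data=bike.seoulbike;
  class wkday;
  model rented = wkday temperature wkday*temperature / ss3;
run;
quit;

/* ANCOVA with a common slope, plus covariate-adjusted means
   and Tukey-adjusted pairwise comparisons */
proc glm data=bike.seoulbike plots=diagnostics;
  class wkday;
  model rented = wkday temperature / ss3 solution;
  lsmeans wkday / pdiff adjust=tukey cl;
run;
quit;
```

The final common-slope model is only appropriate if the interaction term in the preceding step is not significant; otherwise the adjusted means depend on the temperature at which they are evaluated.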

Question 3 (15 marks)

Write a summary of your findings from Questions 1 and 2. Keep the technical details of the analyses that led you to these conclusions to the absolute minimum. Rather, focus on practical significance and present your findings in non-specialist terms. One to two paragraphs (up to a page) will be sufficient.
