QBUS2810: Statistical Modelling for Business
Individual Assignment Task #1
Submission Due Date: Sunday, 27th September, 2020 (Week 5) before 11:59 pm (Sydney
time)
Instructions:
1. You are required to type up your entire assignment, including any equations. Copy and paste relevant
outputs into your text. If you are using Word, you should use the equation editor for any maths notation.
2. You should attach relevant analysis outputs (graphs, tables, etc.) while discussing your answer in the text.
3. Please answer all questions in the given order; i.e., 1a, 1b, etc. You do not need to rewrite the assignment
questions. Keep your answers clear, brief, and concise.
4. There is no requirement for font size and line spacing, but it must be legible and correctly oriented.
5. Please convert and submit your assignment in pdf, which must be uploaded to the Turnitin assignment
box on Canvas.
6. For hypothesis test questions, use the p-value approach. Your answer should include the hypotheses
(H0 and H1), decision, and conclusion.
7. You are encouraged to discuss the assignment with your classmates, tutors, and lecturer. However, you
MUST write up solutions on your own. Students caught cheating will automatically receive a mark of 0
and are subject to disciplinary action.
1. A bank branch located in a commercial district of a city has the business objective of developing
an improved process for serving customers during the noon-to-1 p.m. lunch period. The waiting
time, in minutes, is defined as the time the customer enters the line to when he or she reaches the
teller window. Data are collected from a sample of 15 customers during this hour. The results are
as follows:
4.21, 5.55, 3.02, 5.13, 4.77, 2.34, 3.54, 3.20, 4.50, 6.10, 0.38, 5.12, 6.46, 6.19, 3.79
Another bank branch, located in a residential area, is also concerned with the noon-to-1 p.m. lunch
hour. The waiting times, in minutes, collected from a sample of 15 customers during this hour, are
listed below:
9.66, 5.90, 8.02, 5.79, 8.73, 3.82, 8.01, 8.35, 10.49, 6.68, 5.64, 4.08, 6.17, 9.91, 5.47
(a) List the five-number summaries of the waiting times at the two bank branches; 5-number
summary = [minimum, Q1, Q2, Q3, maximum].
(b) Construct the comparative boxplots and identify similarities and differences in the distributions
of the waiting times at the two bank branches in terms of their central location, spread, shape,
and outliers. Use the ± 1.5*IQR rule to help you detect outliers.
(c) Construct and comment on quantile-quantile plots (qqplots) for the distributions of the waiting
times at the 2 bank branches.
(d) What are the assumptions of the two-sample t test? Are they satisfied?
(e) Assuming that the data met all the relevant assumptions of the two-sample t test, is there
evidence of a difference in the mean waiting time between the two branches? Use α = 0.05.
(f) Construct and interpret a 95% confidence interval estimate of the difference between the pop-
ulation means in the two branches.
(g) Compare this result in part (e) with that found in part (f).
(h) The nonparametric analog to the two-sample t test is the Mann-Whitney U test or Wilcoxon
rank sum test. They are mathematically equivalent based on $U_i = SR_i - 0.5\,n_i(n_i + 1)$. What
are the assumptions of the Mann-Whitney U test?
(i) Assuming that the data met all the relevant assumptions, conduct the Mann-Whitney U test or
Wilcoxon rank sum test to determine if there is evidence of a difference in the median waiting
time between the two branches. Use α = 0.05.
(j) The median of the combined data is 5.595. Construct the following frequency table and fill in
the missing information; i.e., find a, b, c, and d.
                          Branch
Waiting times      Commercial    Residential
Above median           a              b
Below median           c              d
(k) What are the assumptions of the median test?
(l) Assuming that the data met all the relevant assumptions, perform the median test.
(m) Calculate the probability of observing the frequency table in part (j) under the assumption of
independence.
(n) The following lists all the different ways of rearranging cell frequencies in the frequency table
constructed in part (j), but with the marginal totals remaining the same.
Table a b c d P(a, b, c, d)
1 3 12 12 3 0.0013
2 2 13 13 2 0.0001
3 1 14 14 1 0.0000
4 0 15 15 0 0.0000
5 4 11 11 4 0.0120
6 5 10 10 5 0.0581
7 6 9 9 6 0.1615
8 7 8 8 7 0.2670
9 8 7 7 8 0.2670
10 9 6 6 9 0.1615
11 10 5 5 10 0.0581
12 11 4 4 11 0.0120
13 12 3 3 12 0.0013
14 13 2 2 13 0.0001
15 14 1 1 14 0.0000
16 15 0 0 15 0.0000
Perform the Fisher’s exact test.
(o) Of all the tests conducted, which test (two-sample t test, Mann-Whitney U test or Wilcoxon
rank sum test, Median test) is most powerful?
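The probabilities listed in part (n) are consistent with the hypergeometric distribution for a 2x2 table whose margins are all fixed at 15. The following is a minimal sketch, not a prescribed solution method: the function names `table_probability` and `fisher_two_sided` are illustrative, and the two-sided rule used here (summing every table that is no more probable than the observed one) is one common convention and is an assumption about what the question intends.

```python
from math import comb


def table_probability(a, b, c, d):
    """Hypergeometric probability of a 2x2 table with all margins held fixed.

    Layout assumed: row 1 = (a, b), row 2 = (c, d).
    """
    n = a + b + c + d
    return comb(a + c, a) * comb(b + d, b) / comb(n, a + b)


def fisher_two_sided(a, b, c, d):
    """Two-sided p-value: sum of all tables no more probable than the observed one."""
    row1, col1, n = a + b, a + c, a + b + c + d
    p_obs = table_probability(a, b, c, d)
    total = 0.0
    # x ranges over every feasible value of the top-left cell.
    for x in range(max(0, row1 + col1 - n), min(row1, col1) + 1):
        p = table_probability(x, row1 - x, col1 - x, n - row1 - col1 + x)
        if p <= p_obs * (1 + 1e-9):  # small tolerance for floating-point ties
            total += p
    return total


# Reproduce the P(a, b, c, d) column for the margins used in part (n).
for a in range(16):
    print(a, 15 - a, 15 - a, a, f"{table_probability(a, 15 - a, 15 - a, a):.4f}")
```

Calling `fisher_two_sided(a, b, c, d)` with the cell counts found in part (j) gives a p-value under the stated convention, which can be compared against a hand calculation from the listed probabilities.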
2. Let $Y_1, Y_2, \ldots, Y_5$ be a random sample of size 5 from a normal population with mean 0 and variance
1, and let $\bar{Y} = \frac{1}{5}\sum_{i=1}^{5} Y_i$. Let $Y_6$ be another independent observation from the same population.
Find the distribution of
(a) $W = \sum_{i=1}^{5} Y_i^2$.
(b) $U = \sum_{i=1}^{5} (Y_i - \bar{Y})^2$.
(c) $V = U + Y_6^2$.
(d) $Q = \dfrac{\sqrt{5}\,Y_6}{\sqrt{W}}$.
(e) $R = \dfrac{2Y_6}{\sqrt{U}}$.
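A conjectured answer to parts (a) and (b) can be sanity-checked by simulation. A chi-square variable with k degrees of freedom has mean k and variance 2k, so comparing the simulated mean and variance of the two sums of squares against a candidate k is a quick consistency check. This is only a rough Monte Carlo sketch; the function name, the seed, and the number of replications are arbitrary choices, and agreement of two moments does not prove a distributional result.

```python
import random
from statistics import fmean, pvariance


def simulate_w_and_u(n_sims=20000, seed=2810):
    """Simulate the sum of squares and the sum of squared deviations from the sample mean."""
    rng = random.Random(seed)
    w_vals, u_vals = [], []
    for _ in range(n_sims):
        y = [rng.gauss(0.0, 1.0) for _ in range(5)]  # five standard normal draws
        y_bar = fmean(y)
        w_vals.append(sum(v * v for v in y))
        u_vals.append(sum((v - y_bar) ** 2 for v in y))
    return w_vals, u_vals


w_vals, u_vals = simulate_w_and_u()
print("W: mean", round(fmean(w_vals), 2), "variance", round(pvariance(w_vals), 2))
print("U: mean", round(fmean(u_vals), 2), "variance", round(pvariance(u_vals), 2))
```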
3. A sports training institute examines its swimmers to determine if they show signs of erosion of their
dental enamel, and records whether the person swims for less than 12 hours per week or more than
twelve hours per week. The results are shown below.
                      Erosion present    Erosion absent
More than 12 hours          33                120
Less than 12 hours          16                128
(a) Find the odds ratio (OR) for having erosion present for the people who swim more than twelve
hours per week compared with those who swim less than twelve hours per week.
(b) A 100(1 − α)% confidence interval for the log of the odds ratio is given by
$$\ln(\mathrm{OR}) \pm z_{\alpha/2}\sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$
Construct a 95% confidence interval for ln(OR).
(c) Using your answer to part (b), back-transform to obtain and interpret a 95% confidence interval
for OR.
(d) Use the confidence interval in part (c) to determine if the odds for the two groups are significantly
different.
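The interval formula in part (b) can be sketched in a few lines of Python. The cell labelling is an assumption, since the question does not define it: here a and b are the erosion-present and erosion-absent counts for the group swimming more than 12 hours, and c and d are the same counts for the group swimming less than 12 hours. The function name `odds_ratio_ci` is illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(a, b, c, d, conf_level=0.95):
    """Odds ratio with a normal-approximation interval on the log scale.

    Assumed layout: row 1 = (a, b), row 2 = (c, d), with columns
    (erosion present, erosion absent).
    """
    odds_ratio = (a * d) / (b * c)  # (a / b) divided by (c / d)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # z_{alpha/2}
    log_lo = log(odds_ratio) - z * se_log_or
    log_hi = log(odds_ratio) + z * se_log_or
    # Back-transform the log-scale limits to the odds-ratio scale.
    return odds_ratio, (log_lo, log_hi), (exp(log_lo), exp(log_hi))


odds_ratio, log_ci, or_ci = odds_ratio_ci(33, 120, 16, 128)
print("OR:", odds_ratio)
print("CI for ln(OR):", log_ci)
print("CI for OR:", or_ci)
```

Because the interval is symmetric on the log scale, the back-transformed interval for OR is not symmetric about the point estimate.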
4. Consider a simple linear regression model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, $i = 1, \ldots, n$, where
$\varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma)$, and $\beta_0$ is known.
(a) Explain why there is no need to derive the least squares estimator of β0.
(b) Find the least squares estimator of β1.
(c) Refer to part (b). Is $\hat{\beta}_1$ linear in the $Y_i$'s? Justify your answer with an explanation.
(d) Refer to part (b). Is $\hat{\beta}_1$ unbiased for $\beta_1$? Justify your answer with an explanation.
(e) Find the variance of the estimator of the slope in part (b).
(f) Write out an expression for the confidence interval on E(Y|X) for this model.
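A closed-form estimator derived for part (b) can be checked numerically by minimising the sum of squared errors directly with the intercept held at its known value. The sketch below uses ternary search, which is valid here because the criterion is a convex quadratic in the slope. The data, the value of the known intercept, and the function names are all hypothetical and exist only to exercise the check.

```python
def sse(beta1, x, y, beta0):
    """Sum of squared errors for a given slope, with the intercept fixed at beta0."""
    return sum((yi - beta0 - beta1 * xi) ** 2 for xi, yi in zip(x, y))


def minimise_sse(x, y, beta0, lo=-100.0, hi=100.0, iterations=200):
    """Ternary search for the slope that minimises the SSE on [lo, hi]."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if sse(m1, x, y, beta0) < sse(m2, x, y, beta0):
            hi = m2  # the minimum lies to the left of m2
        else:
            lo = m1  # the minimum lies to the right of m1
    return (lo + hi) / 2


# Hypothetical data and a hypothetical known intercept.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.1]
beta0_known = 1.0

beta1_numeric = minimise_sse(x, y, beta0_known)
print("numerical minimiser of the SSE:", beta1_numeric)
```

Evaluating a hand-derived formula on the same `x`, `y`, and `beta0_known` should reproduce `beta1_numeric` up to numerical tolerance.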
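5. Consider a simple linear regression model $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, $i = 1, \ldots, n$, where
$\varepsilon_i \overset{\text{iid}}{\sim} N(0, \sigma)$.
Show that the test statistic for testing $H_0: \beta_1 = 0$ versus $H_1: \beta_1 \neq 0$,
$$t = \frac{\hat{\beta}_1}{s / \sqrt{S_{xx}}},$$
is algebraically equivalent to the test statistic for testing $H_0: \rho = 0$ versus $H_1: \rho \neq 0$,
$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}.$$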
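Before attempting the algebra, it can help to confirm numerically that the slope t statistic and the correlation t statistic coincide. The sketch below assumes the usual definitions, which the question does not spell out: s is the residual standard error with n − 2 degrees of freedom, S_xx is the sum of squared deviations of X, and r is the sample correlation. The data are hypothetical, and a numerical match on one data set is a check, not a proof.

```python
from math import sqrt


def slope_and_correlation_t(x, y):
    """Return (t for the slope, t for the correlation) from the same data."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Ordinary least squares fit with both coefficients estimated.
    beta1_hat = sxy / sxx
    beta0_hat = y_bar - beta1_hat * x_bar
    residual_ss = sum((yi - beta0_hat - beta1_hat * xi) ** 2 for xi, yi in zip(x, y))
    s = sqrt(residual_ss / (n - 2))  # residual standard error

    t_slope = beta1_hat / (s / sqrt(sxx))
    r = sxy / sqrt(sxx * syy)  # sample correlation
    t_corr = r * sqrt(n - 2) / sqrt(1 - r ** 2)
    return t_slope, t_corr


# Hypothetical data, chosen only to exercise the two formulas.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.1, 3.9, 6.5, 7.2, 7.9]
t_slope, t_corr = slope_and_correlation_t(x, y)
print(t_slope, t_corr)
```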
