程序代写案例-FIT3154-Assignment 1

FIT3154 Assignment 1
Due Date: 12:00PM, Monday, 12/9/2022
Introduction
There are a total of three questions worth 17 + 15 + 14 = 46 marks in this as
signment.
This assignment is worth a total of 20% of your final mark, subject to hurdles and any other matters
(e.g., late penalties, special consideration, etc.) as specified in the FIT3154 Unit Guide or elsewhere
in the FIT3154 Moodle site (including Faculty of I.T. and Monash University policies).
Students are reminded of the Academic Integrity Awareness Training Tutorial Activity and, in par-
ticular, of Monash University’s policies on academic integrity. In submitting this assignment, you
acknowledge your awareness of Monash University’s policies on academic integrity and that work is
done and submitted in accordance with these policies.
Submission: No files are to be submitted via e-mail. Correct files are to be submitted to Moodle.
Scans of handwritten answers are acceptable but theymust be clean and legible. You must ensure your
submission contains answers to the questions in the order they appear in the assignment. Submission
must occur before 12:00 PM Monday, 12th of September, and late submissions will incur penalties as
per Faculty of I.T. policies.
1
Question 1 (17 marks)
Background: The Gamma Distribution
In this question we will be looking at the gamma distribution. This is a distribution frequently used
in the analysis of non-negative real numbers. This is introduced in Lecture 6, but here we will be
using a different (but equivalent) form parameterised instead in terms of the mean µ and shape φ with
probability density
p(y |µ, φ) = 1Γ(φ)
(
φ
µ

yφ−1 exp
(
−φy
µ
)
.
where Γ(·) is known as the “gamma function”. Figure 1 shows several example gamma distributions.
The following facts about the gamma distribution will prove very helpful for answering the questions.
If Y ∼ Ga(µ, φ) is a random variable following a gamma distribution with mean µ and shape φ then
E [Y ] = µ, and V [Y ] = µ
2
φ
so that µ controls the mean, and φ inversely scales the variance. Given a sample y = (y1, . . . , yn) the
maximum likelihood estimate, and Fisher information, for µ are well known to be
µˆ = y¯ and Jn(µ) =

µ2
(1)
respectively, where y¯ = (1/n)
∑n
i=1 yi is the sample mean. The estimates for φ are complicated, so
initially we will proceed assuming φ is known; we will later relax this assumption.
Bayesian Analysis of the Gamma Distribution with Known φ
Let us now consider a Bayesian analysis of the gamma distribution with known φ using the hierarchy:
yi |µ ∼ Ga(µ, φ), i = 1, . . . , n
µ | a, b ∼ IG(a, b)
where IG(a, b) denotes an inverse-gamma distribution with shape a and scale b1. A useful fact is that
if Y ∼ IG(a, b) and a > 1 then
E [Y ] = b
a− 1 .
One reason people use the inverse-gamma distribution as a prior for the mean of the gamma distribution
is that it is flexible; Figure 2 demonstrates several possible different inverse-gamma distributions. It
is also convenient, as the posterior distribution is then also an inverse gamma distribution of the form
µ |y, φ, a, b ∼ IG (nφ+ a, nφy¯ + b) .
The posterior mean of µ is then given by
E [µ |y] ≡ µˆa,b = φny¯ + b
φn+ (a− 1) . (2)
which can be used as an estimate (best guess) of the unknown population µ. You should install
the R package invgamma. This gives you access to the functions dinvgamma(), pinvgamma() and
qinvgamma() which will be crucial in answering some questions.
Important: for some reason, this package refers to the scale parameter of an inverse
gamma distribution as a rate parameter.
1https://en.wikipedia.org/wiki/Inverse-gamma_distribution
2
Consider using (2) as an estimate of the unknown population µ from a sample of data y =
(y1, . . . , yn) coming from a gamma distribution with mean µ and shape φ. The bias and variance
of this estimator is
biasµ(µˆa,b) =
(1− a)µ+ b
nφ+ a− 1 , Vµ [µˆa,b] =
nµ2φ
(nφ+ a− 1)2 (3)
Please answer the following questions. Provide appropriate working/mathematical justification.
1. If φ = 1, how can we interpret the hyperparameters a and b? [2 marks]
2. Describe the dependency of the variance of the estimator on the population values of µ and φ.
[1 mark]
3. Is the estimator (2) asymptotically efficient as n→∞? [2 marks]
It is common to use so called “normalized” squared-error for many problems in which the mean and
variance are linked. For the gamma distribution the normalised squared-error is
L(µ, µˆ) = (µˆ− µ)
2
(µ2/φ) .
and the normalized squared-error risk is therefore
R(µ, µˆ) =
(
φ
µ2
)
E
[
(µˆ− µ)2] .
Please answer the following questions regarding this loss function.
4. Why do you think the normalized squared-error loss might be preferable to the usual (unnor-
malized) squared-error loss? [1 mark]
5. Making use of (3), what is the normalized squared-error risk for the estimator when b = 0?
Comment on the risk. [2 marks]
6. Prove that when a = 1, the choice b = 0 dominates the choice b > 0. [1 mark]
7. Plot the normalized squared-error risk (expected normalized squared-error) for n = 5, φ = 2 and
µ ∈ (0.01, 10) for (a = 1, b = 0) (i.e., maximum likelihood), (a = 1/2, b = 1) and (a = 2, b = 1/2).
You may plot all curves on the one plot. For readability, use a logarithmic scale for the y-axis
and set the limits for the y-axis from (0.1, 2). Describe the key features of the risk curves. Over
the range of µ values considered in our plot do any of the estimators appear to be inadmissable
(that is, not admissable)? [3 marks]
8. Given an a > 1 and a b > 0, for which population value of µ does the estimator (2) attain the
smallest normalized squared-error risk? [2 marks]
Sometimes the alternative log-mean parameterisation v = logµ is used for the gamma distribution.
This is in some (statistical) ways a more natural parameterisation than the mean µ. Under this
parameterisation the probability density is
p(y | v, φ) = 1Γ(φ)φ
φyφ−1 exp
(−φe−vy − φv)
where ex ≡ exp(x). Please answer the following questions.
7. Derive the Fisher information Jn(v) for the log-scale parameter v. (hint: remember the observa-
tions are independently and identically distributed) [2 marks]
8. Provide an argument as to why v = logµ might be more “natural” by comparing the Fisher
informations for v, Jn(v), to the Fisher information for µ, Jn(µ) given by (1). [1 mark]
3
Question 2 (15 marks)
In this question we will use the gamma distribution to analyse some topical and relevant data: case
numbers of people in Victoria, Australia infected with the novel coronavirus (Covid-19). The data we
will use was obtained from the the Victorian government public health website. In particular, we will
analyse the daily case numbers for the month of July, 2022 from covid.daily.cases.07.2022.csv.
You will also need to download the file gamma.fit.R. This contains functions that will assist you
in fitting a gamma distribution to the data; in particular it contains code for maximum likelihood
estimation and Bayesian sampling.
Let us first analyse the data using maximum likelihood and the asymptotic distribution of maximum
likelihood estimates. Please answer the following questions. Provide working/justifications/code as
required to support your answers.
1. Fit a gamma distribution to the data using maximum likelihood via the function gamma.ml(y).
What is the estimated mean number of daily cases µ for the month of July 2022? Provide an
approximate 95% confidence interval for this estimate using the asymptotic distribution of the
maximum likelihood estimate. [3 marks]
2. The average daily case numbers in the previous month was roughly 7,200. Using the asymptotic
distribution of the ML estimate of µ, test the hypothesis that the average number of daily cases
in July 2022 is the same as in June 2022. Clearly specify the hypothesis you are testing and
discuss the resulting p-value. [3 marks]
Let us perform a Bayesian analysis. We will use the function gamma.sampling(y,a,b,n.samples)
to generate samples from the posterior distribution for the following hierarchy:
yj |µ, φ ∼ Ga(µ, φ)
µ | a, b ∼ IG(a, b)
φ ∼ C+(0, 1)
where y is the data, a and b are the hyperparameters for the inverse-gamma prior distribution for µ,
and n.samples is the number of samples to generate. The hierarchy uses a default weakly informative
half-Cauchy prior for φ, so we do not need to specify any hyperparameters for this prior. Using the
daily case numbers from the previous four and a half months, we find that a reasonable prior guess at
the average daily case numbers is around 8,500, with a 95% prior probability of the average daily case
numbers ranging anywhere from 5,000 to 13,000 cases.
3. Select appropriate values of a and b for our inverse gamma prior distribution to (approximately)
encode the specified information (hint: use the pinvgamma() function). [2 marks]
4. Generate 100, 000 samples from the posterior distribution using your specified prior hyperparam-
eters. Plot the prior and posterior distributions of µ (hint: use the dinvgamma() function and
histogram, as appropriate). [2 marks]
5. Report on the posterior mean and 95% credible intervals for µ. How do these compare to our
maximum likelihood estimates? (hint: use the quantile() function) [2 marks]
6. Do you think the Bayesian analysis supports the hypothesis that the average daily case numbers
for July, 2022 are the same as June, 2022? [1 mark]
7. It is important for authorities to have an idea on the likely range of daily case numbers. Provide
an estimate of the 97.5-th percentile of daily case numbers using your Bayesian samples. Provide
a 95% credible interval for this estimate (hint: use the qgamma.mu() function provided). [2
marks]
4
Question 3 (14 marks)
This question will require you to analyse a regression dataset using Bayesian lasso regression. In
particular, you will be looking at predicting the fuel efficiency of a car (in kilometers per litre) based
on characteristics of the car and its engine. This is clearly an important and useful problem. The
dataset fuel.2022.csv contains n = 1, 000 observations on p = 9 predictors obtained from actual fuel
efficiency tables for car models available for sale during the years 2017 through to 2020. The data
dictionary for this dataset is given in Table 1.
To analyse this data in a Bayesian fashion you will use the bayesreg package in R.
1. Fit a regression model to the fuel efficiency data using the least-squares method in R (i.e., using
lm()). From this output, what variables do you believe are potentially associated with increased
fuel efficiency, and why? [2 marks]
2. Analyse the fuel efficiency data using the Bayesian lasso; use at least 10, 000 samples from the
posterior. Compare posterior means, standard deviations and t-statistics to those obtained using
least-squares. How do they compare, and if you believe they are different, why do you think they
are different? [2 marks]
3. Write down your best estimate of the regression equation relating fuel efficiency to the explana-
tory variables. [2 marks]
4. If we wanted to advise a car manufacturer regarding the choice of an engine to have likely good
fuel efficiency, what does this model suggest we should look for? [2 marks]
5. From your Bayesian analysis, which variables seem unimportant/unlikely to be associated with
fuel efficiency? Justify your answer [2 marks]
6. Imagine that a friend proposes purchasing a particular model of car with the properties given in
Table 2. Use your Bayesian model to:
(a) Predict the mean fuel efficiency for this car. [1 mark]
(b) Provide a histogram of the posterior distribution of the mean fuel efficiency for this car,
and a 95% credible interval on the plausible range of mean fuel efficiency for this car. [1
mark]
(hint: use the rv$beta matrix in the return value from bayesreg() to access the samples of the
coefficients)
7. An engineer suggests they believe there is likely an interaction effect between engine displacement
and the number of cylinders. Using the Bayesian lasso, and statistics produced by the Bayesian
lasso code, assess whether you think this is the case, and what effect it has on the model? [2
marks]
5
Variable name Description Values
Model.Year Year of sale 2017− 2020
Eng.Displacement Engine Displacement (litres, l) 0.9− 8.4
No.Cylinders Number of Cylinders 3− 16
Aspiration Engine Aspiration (Oxygen intake) N: Naturally∗
OT: Other
SC: Supercharged
TC: Turbocharged
TS: Turbo+supercharged
No.Gears Number of Gears 1− 10
Lockup.Torque.Converter Lockup torque converter present? N∗ and Y
Drive.Sys Drive System 4∗: 4-wheel drive
A:All-wheel
F:Front-wheel
P:Part-time 4-wheel
R:Rear-wheel
Max.Ethanol Maximum % of Ethanol allowed 10− 85
Fuel.Type Type of Fuel G∗: Regular Unleaded
GM: Mid-grade Unleaded Recommended
GP: Premium Unleaded Recommended
GPR: Premium Unleaded Required
Comb.FE Fuel Efficiency (km/l) 4.974− 26.224
Table 1: Fuel efficiency data dictionary. The ∗ denotes the reference category for each categorical
variable.
Variable Model.Year Eng.Displacement No.Cylinders Aspiration No.Gears Lockup.Torque.Converter Drive.Sys Max.Ethanol Fuel.Type
Value 2018 3.5 6 N 8 N R 15 GPR
Table 2: Engine Properties.
some text
6
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
0
0.2
0.4
0.6
0.8
1
1.2
1.4
Figure 1: Example probability density functions for the gamma distribution with mean µ and shape
φ.
0 0.5 1 1.5 2 2.5 3 3.5 4 4.5 5
0
0.2
0.4
0.6
0.8
1
1.2
Figure 2: Example probability density functions for the inverse gamma distribution with shape a and
scale b.
7

欢迎咨询51作业君
51作业君

Email:51zuoyejun

@gmail.com

添加客服微信: abby12468