STA 247 - Assignment #2
Due: April 3, 2020 @ 11:59 PM - Submit through Crowdmark
This is an individual assignment - all work and ideas presented should be entirely your own.
You should not discuss the problems or brainstorm ideas with others. This is not a group
assignment! This also means that posting publicly on Piazza is NOT permitted. Remember
to show all your work. Solutions without justification will not earn any marks.
Assignments are an opportunity for you to demonstrate how well you are able to apply what
you have been learning in the course, with constructive feedback returned to you so that you
can make improvements for evaluations in the course. While there may be different paths to
the solution, it is your responsibility to show, without a doubt, through your solutions that
you have learned the course material. This includes clearly defining any relevant random
variables/events and their full distributions as appropriate, using appropriate
notation, and interpreting your results in plain English.
For problems that require R, consider using R-Studio Cloud so you have the flexibility of
saving plots and exporting your script as a .pdf file. Sample scripts will be provided for you
under ‘R Files’ on Quercus by end of day Friday.
Problem 1 [22 points]. Consider a binomial random variable X ∼ Bin(n, p). As you will
soon be learning, it turns out that under certain conditions, a binomial distribution can
be well approximated using a normal distribution. In this exercise, you will be comparing
and contrasting probabilities calculated using the exact distribution and the approximate
distribution. For parts that use R: type and save your code in the R-script
section of R-Studio.
Submission Instructions: Follow the instructions on Crowdmark carefully. There will be
three submission parts:
(i) Your written responses, including calculations and numerical answers,
(ii) Your two histograms,
(iii) A .pdf file of your R script for all parts of the problem.
Use comments (#Insert comments using the hashtag symbol#) to separate the different
parts of the problem. Failure to submit according to instructions will result in an automatic
deduction of 5 points.
a) (1 point) Suppose that X1 ∼ Bin(100, 0.2). Using R, find the exact probability that
X1 is between 5 and 15, inclusive.
b) (5 points) Since the parameters of a normal distribution are the mean and variance of
the random variable, find the mean and variance of X1. Using a normal distribution
with those parameters, calculate the approximate probability that X1 is between 5 and
15, inclusive. Remember to apply any continuity corrections as needed.
c) (1 point) Compare your results in (a) and (b).
d) (4 points) Using the syntax provided in the lecture slides, create a vector in R that
will save 10,000 samples from a Bin(100, 0.2) distribution. Plot a histogram of your
samples in R. Label your axes accordingly and include ‘Bin(100, 0.2)’ as the title of
your histogram.
You can do this easily using the ‘hist( )’ command in R. If ever you’re unsure how to
use a command in R, simply type ‘?(command)’ in the console in R-Studio. (e.g. to
find out how to use the histogram command, including any features, type ?hist in the
console and read the help page that pops up).
e) (3 points) Repeat (a) and (b) for X2 ∼ Bin(100, 0.02).
f) (2 points) How do the exact and approximate probabilities compare in (e)? Is the
approximation for X2 significantly better, significantly worse, or neither compared
with the approximation for X1?
g) (4 points) Repeat (d) by plotting a histogram of 10,000 binomial outcomes from a
Bin(100, 0.02) distribution, with appropriately labeled axes. Include ‘Bin(100, 0.02)’
as the title of your histogram.
h) (2 points) Using your histograms, come up with an explanation for how well/poorly
the normal distribution approximates the binomial distribution.
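As a rough illustration of the workflow in parts (a), (b), and (d), the R sketch below uses a deliberately different distribution, Bin(50, 0.4), and the range 15 to 25, so none of its numbers are answers to this problem. It assumes the base R functions `pbinom`, `dbinom`, `pnorm`, `rbinom`, and `hist`; the syntax in the lecture slides may differ, and all variable names are illustrative.

```r
# Illustrative parameters only (NOT the ones from the assignment)
n <- 50
p <- 0.4
lo <- 15
hi <- 25

# Exact probability P(lo <= X <= hi) from the binomial CDF
exact_prob <- pbinom(hi, size = n, prob = p) - pbinom(lo - 1, size = n, prob = p)

# Same quantity obtained by summing the probability mass function
exact_check <- sum(dbinom(lo:hi, size = n, prob = p))

# Normal approximation with matching mean and variance
mu <- n * p
sigma <- sqrt(n * p * (1 - p))

# Continuity correction: widen the integer range by 0.5 on each side
approx_prob <- pnorm(hi + 0.5, mean = mu, sd = sigma) -
  pnorm(lo - 0.5, mean = mu, sd = sigma)

print(c(exact = exact_prob, approx = approx_prob))

# Simulate 10,000 binomial outcomes and plot a labelled histogram
set.seed(247)  # arbitrary seed so the plot is reproducible
samples <- rbinom(10000, size = n, prob = p)
hist(samples,
     main = "Bin(50, 0.4)",
     xlab = "Number of successes",
     ylab = "Frequency")
```

Subtracting `pbinom(lo - 1, ...)` rather than `pbinom(lo, ...)` keeps the lower endpoint in the interval, which is what "inclusive" requires for a discrete random variable.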
Problem 2 [5 points]. Prove that any two independent random variables X and Y
with finite variance have zero covariance. Show this for the two cases where X and Y
are both discrete and where they are both continuous.
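A simulation cannot replace the proof asked for here, but it can serve as a quick sanity check of the claim. The sketch below draws independent samples from two arbitrarily chosen continuous distributions (a standard normal and an exponential with rate 1, both assumptions made only for illustration) and shows that the sample covariance is close to zero.

```r
set.seed(1)        # arbitrary seed for reproducibility
n_sim <- 100000    # number of simulated pairs (illustrative choice)

x <- rnorm(n_sim)            # N(0, 1)
y <- rexp(n_sim, rate = 1)   # Exp(1), drawn independently of x

# Sample covariance as computed by base R (divides by n_sim - 1)
cov_xy <- cov(x, y)

# Plug-in version of E[XY] - E[X]E[Y] (divides by n_sim instead)
cov_alt <- mean(x * y) - mean(x) * mean(y)

print(c(cov_xy = cov_xy, cov_alt = cov_alt))
```

Both values are near zero but not exactly zero, because they are estimates from a finite sample.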
Problem 3 [20 points]. An exam consists of a problem section and a short-answer section.
Let X1 denote the amount of time in hours that a student spends on the problem section
and X2 represent the amount of time the same student spends on the short answer section.
Suppose the joint probability density function of these two times is:
\[
f(x_1, x_2) =
\begin{cases}
c\,x_1 x_2, & \dfrac{x_1}{3} < x_2 < \dfrac{x_1}{2},\ 0 < x_1 < 1 \\
0, & \text{otherwise}
\end{cases}
\]
a) (3 points) Sketch and fully label the support (i.e. label your axes, include the number
scale on your graph). Shade in the area corresponding to the support, and label all
boundaries.
b) (4 points) Find the value of c that would make this a valid probability density function.
c) (6 points) Derive the marginal distributions of X1 and X2. Are the two times
independent? Explain how you determined this.
d) (4 points) If the student spends exactly 0.25 hours on the short answer section, what
is the probability that at most 0.60 hours was spent on the problem section?
e) (3 points) If a student spends 0.25 hours on the short answer section, what’s the
expected time they will spend on the problem section?
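For a joint density on a non-rectangular support, a candidate normalizing constant can be checked numerically with nested calls to base R's `integrate`. The sketch below uses a different, purely illustrative density, f(x, y) = c·x·y on 0 < y < x < 1, so it does not give the answer to part (b); the helper name `inner_integral` is hypothetical.

```r
# Illustrative density (NOT the one in Problem 3):
# f(x, y) = c * x * y on the triangle 0 < y < x < 1

# Inner integral over y for each fixed x; the limits of y depend on x.
# sapply is needed because integrate() passes a vector of x values.
inner_integral <- function(x) {
  sapply(x, function(xi) {
    integrate(function(y) xi * y, lower = 0, upper = xi)$value
  })
}

# Outer integral over x gives the integral of x * y over the whole support
total <- integrate(inner_integral, lower = 0, upper = 1)$value

# c must make the density integrate to 1
c_value <- 1 / total
print(c_value)  # analytically the integral is 1/8, so c should be 8
```

The same pattern applies to any support where the inner limits are functions of the outer variable; only the integrand and the limits change.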
Problem 4 [8 points]. If X is uniformly distributed on [-3, 1], find the probability density
function of Y = |X|. Hint: Sketch a graph of the transformation to help you determine the
corresponding supports of Y.
Problem 5 [5 points]. Complete textbook exercise 5.67 on page 232, without using moment
generating functions. Numerical results without appropriate work and detail in steps
will not earn credit.
Problem 6 [6 points]. Use the appropriate transformation method to find the distribution
of the sample mean of n independent observations from a Gamma(α, β) distribution.
Sample mean is the average of a collection (x1, x2, ..., xn) of observations from a population
(e.g. the average height of 10 randomly selected students) and is denoted by
\[
\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}
\]
Remember that finding the distribution means identifying the distribution type, where possible,
including all relevant parameters.
Problem 7 [18 points]. A random variable R has probability density function given by:
\[
g(r) =
\begin{cases}
k\,r^{6} e^{-r/5}, & r > 0 \\
0, & \text{otherwise}
\end{cases}
\]
a) (2 points) What is the distribution of R?
b) (2 points) Find the value of k that makes g(r) a density function.
c) (2 points) Find the mean and variance of R.
d) (6 points) Suppose R models the total distance (in km) between 8 randomly selected
inter-city bus stops, where the distance is computed as the distance to the nearest bus
stop to its east. That is, R = D1 + D2 + ... + D7 where Di is the distance between
the ith and the (i + 1)th bus stops. What distribution might be used to model Di?
State at least two assumption(s) that must be made about Di. Critique whether the
assumption(s) is/are reasonable or not in the given context.
e) (3 points) Referring to part (d), what is the probability that the next closest bus stop
east of the first bus stop is within 5 km?
f) (3 points) Referring to part (d), give a lower bound estimate for the probability that
the total distance between the first and eighth bus stop is at most 82 km.