STAT4051, Fall 2020
Midterm II ()
Time Limit: 24 hours
Name (Print):
Student ID:
Instructions:
• The exam has five problems. You should try to solve all five problems before deciding which
four problems to submit. Make it clear on the next page which four problems you want graded.
Submitting solutions to all five problems will not gain more points. If you submit solutions
to all five problems, then we will grade just the first four regardless.
• In your analysis, remember to check for assumptions and think about interactions. Your
analysis should go beyond just the linear model and what is significant. It should try to
explain what is going on in the data and provide graphics when appropriate.
• Do your own work! Discuss any non-statistical questions only with the instructor. Once
the exam is distributed, your TA and I cannot answer statistical questions.
• Upload one .pdf file of your final solutions to Canvas by Monday, November 23 at 11:00
a.m. You may use R Markdown to generate your results or cut and paste your output into
your file. If you cut and paste R output into your document, change the font of the R output to
Courier to maintain formatting. I will not accept handwritten work or screenshots of your
analysis. Also, I will not accept .html files.
• This exam contains 7 pages (including this cover page).
• This exam is worth 100 points.
• Show all your work on each problem for full credit. The following rules apply:
– Organize your work, in a reasonably neat and coherent way, in the space provided. Work
without a clear ordering will receive very little credit.
– Mysterious or unsupported answers will not receive full credit.
STAT4051, Fall 2020 Midterm II () - Page 2 of 7 Initials:
List the problems you want graded here:




Problem I. Short Answer (25 points total)
Show all work for full credit unless noted otherwise.
Data were collected on six variables for 20 individuals: height, weight, BMI, chest, waist and
hip measurements. The first ten individuals in the dataset are male and the remaining ten are
female.
Download the measurement.csv dataset and provide a thorough analysis which addresses
the questions listed below. Here are the variables and their units of measurement:
• height (inches)
• weight (pounds)
• BMI (kg/m²)
• chest (inches)
• waist (inches)
• hips (inches)
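Because BMI is reported in metric units while height and weight are in imperial units, a unit conversion is needed to relate the three. The following is a minimal sketch of that conversion, not part of the exam materials; the function name bmi_from_imperial is hypothetical, and the conversion constants are the standard definitions of the pound and the inch.

```python
# Hypothetical helper, not part of the exam materials.
LB_TO_KG = 0.45359237  # kilograms per pound (exact by definition)
IN_TO_M = 0.0254       # metres per inch (exact by definition)


def bmi_from_imperial(weight_lb: float, height_in: float) -> float:
    """Return BMI in kg/m^2 from weight in pounds and height in inches."""
    if height_in <= 0:
        raise ValueError("height must be positive")
    weight_kg = weight_lb * LB_TO_KG
    height_m = height_in * IN_TO_M
    return weight_kg / height_m ** 2
```

Since BMI is a deterministic function of height and weight, strong dependence among these three variables is to be expected in the data.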
For this problem, show a logical flow and justify the decisions you made along the way to
answer these questions:
i. Describe the relationship(s) among these six variables revealed by the analysis.
ii. Is it possible to reduce the number of variables in the dataset? If so, how many will you
retain?
iii. Is it possible to classify the sex of future individuals based on your analysis? If so, generate
a graphic in support and discuss what you see.
Problem II. Short Answer (25 points total)
Show all work for full credit unless noted otherwise.
A physiologist studied the effect of three treatments on muscle tissue in cats. Ten litters of
three cats each were randomly selected and the three treatments were randomly assigned to
the three cats in each litter. Each cat received only one treatment. The physiologist recorded
the litter, treatment and muscle activity 1 hour after application.
Download the catactivity.csv dataset and provide a thorough analysis which determines
whether there are any differences among treatments with regard to muscle activity. If so,
identify the treatment(s) that produce(s) the largest activity. Use α = 0.05.
Problem III. Short Answer (25 points total)
Show all work for full credit unless noted otherwise.
Kidney failure patients are commonly treated on dialysis machines that filter toxic substances
from the blood. The appropriate "dose" for effective treatment depends, among other things,
on duration of treatment and weight gain between treatments as a result of fluid buildup. To
study the effect of these two factors on the number of days hospitalized (attributed to the
disease) during a year, a random sample of 56 patients who had undergone treatment at a
large dialysis facility was obtained. Treatment duration was categorized into two groups: short
duration (average dialyzing time for the year under four hours) and long duration (average
dialyzing time for the year equal to or greater than four hours). Average weight gain between
treatments during the year was also categorized into three groups: slight, moderate and severe.
The response is the number of days of hospitalization.
Download the kidneyfailure.csv dataset.
Hint: Use the transformation log_e(y + 1) to stabilize the variances.
Provide a thorough analysis that answers the following questions:
i. Which factor(s) have a statistically significant effect on the length of hospitalization? Use α = 0.05.
ii. Which group of patients is hospitalized the longest?
iii. How could you improve this experiment? Be specific.
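The log_e(y + 1) transformation in the hint can be sketched as follows. This is an illustration only: the function names and the day counts are hypothetical and are not taken from kidneyfailure.csv. The "+ 1" keeps the transformation defined for patients with zero days of hospitalization.

```python
import math


def log_transform(days: float) -> float:
    """Apply log_e(y + 1); math.log1p(y) computes log(1 + y)."""
    return math.log1p(days)


def back_transform(value: float) -> float:
    """Invert log_e(y + 1); math.expm1(v) computes exp(v) - 1."""
    return math.expm1(value)


# Hypothetical day counts for illustration, not values from the dataset.
hypothetical_days = [0, 2, 7, 15]
transformed = [log_transform(d) for d in hypothetical_days]
```

Note that back-transforming a mean computed on the log scale does not recover the arithmetic mean of the original counts, so estimates reported in days should state which scale they were computed on.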
Problem IV. Short Answer (25 points total)
Show all work for full credit unless noted otherwise.
Doctors wish to assess the influence of three anti-viral drugs (two actual drugs and a control)
on the length of SARS in patients. As new patients are identified, they are randomly assigned
to receive one of the three drugs. The response is the length of time after receiving treatment
until a patient recovers and no longer tests positive for SARS. It is suspected that the length of
time the patient experienced SARS symptoms before being randomized to treatment affects the
response; therefore, duration was recorded for each patient.
Download the antiviral.csv dataset to answer the following questions. Here are the details
about the data:
• treatment (one of three treatments)
• recovery.length (days) - length of recovery time post treatment
• duration (days) - how long a patient had SARS symptoms before they were treated.
For this problem:
i. Provide a thorough analysis that addresses whether there are any statistically significant differences
among the three treatments. Show the logical flow and the decisions you made along the
way to arrive at your final model. Use α = 0.05.
ii. Estimate the mean length of recovery for each treatment.
Problem V. Short Answer (25 points total)
Show all work for full credit unless noted otherwise.
A service center employs a large number of technicians who specialize in repairing laptops.
Four technicians were randomly selected from all technicians working at the service center to
be included in a study about service time. Five laptop brands were randomly selected from all
the brands currently serviced by the center. The goal was to study the effects of technician
and laptop brand on the variability of service time.
Download the laptop.csv dataset and provide a thorough analysis which answers the following
questions:
i. Estimate the various sources of variation in this study.
ii. What percentage of the total variability does the technician component contribute?
iii. Test whether each variance component is statistically different from zero. Use α = 0.05.
iv. The study was criticized by one of the center’s managers who said that the results could
only be applied to the four technicians in the study. Is this statement accurate or not?
Discuss.
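For reference, one common way to write the sources of variation in a design with two random factors is sketched below. This is an assumption about the model, not something the exam states: it presumes replicate service times within each technician-brand combination, and the symbols are introduced here for illustration. Without replication, the interaction component cannot be separated from the error component.

```latex
% Assumed two-factor random-effects model (notation is illustrative):
% i indexes technicians, j indexes brands, k indexes replicates.
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk},
\qquad
\alpha_i \sim N(0, \sigma_\alpha^2),\;
\beta_j \sim N(0, \sigma_\beta^2),\;
(\alpha\beta)_{ij} \sim N(0, \sigma_{\alpha\beta}^2),\;
\varepsilon_{ijk} \sim N(0, \sigma^2)

% Total variance and the technician share as a percentage:
\operatorname{Var}(y_{ijk}) = \sigma_\alpha^2 + \sigma_\beta^2 + \sigma_{\alpha\beta}^2 + \sigma^2,
\qquad
\text{technician share} = 100 \times
\frac{\sigma_\alpha^2}{\sigma_\alpha^2 + \sigma_\beta^2 + \sigma_{\alpha\beta}^2 + \sigma^2}
```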