STAT331: Final Project
Due: August 4, 2021 at 5 pm ET on Crowdmark
General Instructions
• Due: August 4 at 5pm.
• Each group consists of 3–4 students (see below). Students who have not enrolled in a group
by Wednesday July 12th will be randomly assigned to a group.
• Each project consists of a typed report of 7–10 pages (12-point font, standard 1-inch
margins, single-spaced). The page count includes figures and tables but excludes a mandatory
Appendix containing (but not limited to) all R code.
• Reports may be written in R Markdown, LaTeX, Word, or any other reasonable format as
long as all R code is included in the Appendix.
• Reports must be submitted online via Crowdmark.
• Late Penalty: 10% per day. Projects turned in after August 8 will not be graded.
• Your project grade will be worth 35% of your final grade.
Group Enrolment
• Log in to LEARN and join a Group: At the top of the screen, click Connect > Groups.
• Agree on a Group number between 1–70 with your other team members, and select a group.
• The names of all collaborators must be written on your report.
Project Details
Data
The dataset pollution.Rdata (posted on LEARN) contains a sample of n = 1000 births included
in a study investigating the relationship between several chemical and non-chemical exposures
during pregnancy and birthweight. Specific variable names and descriptions (including any variable
transformations that have been applied; e.g., some variables are log-transformed) can be found
in codebook.csv. The outcome of interest is birthweight in grams (variable e3_bw). The data
include several possible exposures of interest, including chemical exposures measured in the mother’s
blood/urine/hair, as well as outdoor exposures measured in the surrounding environment. The
data also include 7 other covariates (maternal age, education, BMI, weight gain during pregnancy,
child’s year of birth, gestational age at birth, and sex).
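A minimal sketch of one way to load the data in R, assuming pollution.Rdata and codebook.csv sit in the working directory. The helper name load_project_data is hypothetical, and the name of the data frame stored inside the file is not stated here, so the sketch looks it up rather than guessing it.

```r
# Hypothetical helper: returns the first object stored in an .Rdata file,
# or NULL if the file is not found.
load_project_data <- function(path) {
  if (!file.exists(path)) {
    return(NULL)
  }
  env <- new.env()
  # load() returns the names of the objects it restores, so the name of
  # the data frame does not need to be known in advance.
  loaded_names <- load(path, envir = env)
  message("Objects in file: ", paste(loaded_names, collapse = ", "))
  get(loaded_names[1], envir = env)
}

dat <- load_project_data("pollution.Rdata")
if (!is.null(dat)) {
  print(dim(dat))                  # expect 1000 rows if this is the full sample
  # Assumption: the outcome column is stored as e3_bw.
  if ("e3_bw" %in% names(dat)) {
    print(summary(dat$e3_bw))      # birthweight in grams
  }
}

if (file.exists("codebook.csv")) {
  codebook <- read.csv("codebook.csv", stringsAsFactors = FALSE)
  print(head(codebook))            # variable names, descriptions, transformations
}
```

Checking the codebook before modelling is worthwhile because some variables are already log-transformed, which changes how their coefficients should be interpreted.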
Goals
The goal of this project is to analyze the pollution.Rdata data and write a report on your
analysis. The specific goals of your analysis are up to you to decide. Examples could include:
building the best possible predictive model for birthweight; investigating interactions among
chemical exposures; identifying the most important predictors of low birthweight; evaluating how
much chemical exposures improve predictions; and many others! You can be creative here: the more
specific and interesting the goal, the better. You can use these data in any way you like, but
birthweight must be your outcome.
Report
Your 7–10 page report must contain the following components:
1. Summary:
• A maximum of 200 words describing the objective of the report, an overview of the
statistical analysis, and summary of the main results.
2. Objective:
• Describe your goals for the analysis.
3. Exploratory Data Analysis:
• Conduct exploratory data analyses: report summary statistics, visualize data (histograms,
scatter plots, etc.). Report on any interesting findings and comment on how
these inform the rest of your analysis.
4. Methods:
• Describe your statistical analysis: What is your model? Did you use any transformations
or extensions of the basic multiple linear regression model? How did you select a model?
Does the model fit the data well? Are the necessary assumptions met? Be sure to explain
and justify your decisions.
5. Results:
• Report on the findings of your analysis.
6. Discussion:
• Comment on your findings/conclusions; describe any limitations of your analysis.
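As an illustration of the kind of workflow the Methods component asks about (choosing between models and checking assumptions), here is a minimal sketch in R. It uses simulated stand-in data so that it runs on its own; the variable names gest_age, mat_bmi, log_expo and bw are hypothetical and are not the names used in pollution.Rdata. It is one possible approach, not a required or recommended analysis.

```r
# Simulated stand-in data (all variable names here are hypothetical).
set.seed(331)
n <- 200
sim <- data.frame(
  gest_age = rnorm(n, mean = 39, sd = 1.5),  # gestational age in weeks
  mat_bmi  = rnorm(n, mean = 25, sd = 4),    # maternal BMI
  log_expo = rnorm(n)                        # a log-transformed exposure
)
sim$bw <- 3300 + 150 * (sim$gest_age - 39) + 10 * (sim$mat_bmi - 25) -
  40 * sim$log_expo + rnorm(n, sd = 400)     # birthweight in grams

# Two nested multiple linear regression models.
fit_base <- lm(bw ~ gest_age + mat_bmi, data = sim)
fit_full <- lm(bw ~ gest_age + mat_bmi + log_expo, data = sim)

# Model selection: partial F-test and AIC for the added exposure term.
print(anova(fit_base, fit_full))
print(AIC(fit_base, fit_full))

# Assumption checks: constant variance and normality of residuals.
res <- rstandard(fit_full)
plot(fitted(fit_full), res,
     xlab = "Fitted birthweight (g)", ylab = "Standardized residuals")
abline(h = 0, lty = 2)
qqnorm(res)
qqline(res)
```

In a report, each of these choices (which terms to compare, which criterion to use, how the diagnostic plots were read) would need to be explained and justified in the text.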
Grading
• Project grades will consider the following:
– All required components are included.
– Ideas are well organized (please use the sections as described above, but you can further
divide material into subsections as appropriate).
– Ideas are clearly expressed, and written in complete sentences.
– Subjective analysis decisions are reasonable and well-justified.
– Statistical challenges are well described and addressed appropriately.
– The most important/relevant results and findings are shown and discussed in the report
(optionally, any supporting analyses or results you wish to show can be included in the
Appendix).
– Results are presented and interpreted correctly.
– Analysis limitations are acknowledged.
– Conclusions are insightful and well-justified.
– Choice of tables and figures is effective.
– Tables and figures are well presented: captions and labels are informative; axes/scales
are appropriate; no needless digits (3 or 4 significant digits is usually ok); no wasted
space.
– R code is clear, well commented, and reproducible.