Assignment 1 Q2
Analyzing wine data (30 points)
The data for this exercise comes from a paper by Cortez et al. (2009)
(https://www.sciencedirect.com/science/article/abs/pii/S0167923609001377?via%3Dihub), in which the authors
tried to relate various chemical properties of red and white wine to perceived quality. For this question,
we will analyze only the chemical properties, not the quality. Although the original paper looked at both red
and white wine, we will use only the data for red wine.
The data can be read in via:
library(tidyverse)
wine_data <- read_csv("red_wine_data.csv") # Be sure this file is in your current working directory
glimpse(wine_data)
Rows: 1,599
Columns: 12
$ `fixed acidity`        <dbl> 7.4, 7.8, 7.8, 11.2, 7.4, 7.4, 7.9, 7.3, 7.8, 7…
$ `volatile acidity`     <dbl> 0.700, 0.880, 0.760, 0.280, 0.700, 0.660, 0.600…
$ `citric acid`          <dbl> 0.00, 0.00, 0.04, 0.56, 0.00, 0.00, 0.06, 0.00,…
$ `residual sugar`       <dbl> 1.9, 2.6, 2.3, 1.9, 1.9, 1.8, 1.6, 1.2, 2.0, 6.…
$ chlorides              <dbl> 0.076, 0.098, 0.092, 0.075, 0.076, 0.075, 0.069…
$ `free sulfur dioxide`  <dbl> 11, 25, 15, 17, 11, 13, 15, 15, 9, 17, 15, 17, …
$ `total sulfur dioxide` <dbl> 34, 67, 54, 60, 34, 40, 59, 21, 18, 102, 65, 10…
$ density                <dbl> 0.9978, 0.9968, 0.9970, 0.9980, 0.9978, 0.9978,…
$ pH                     <dbl> 3.51, 3.20, 3.26, 3.16, 3.51, 3.51, 3.30, 3.39,…
$ sulphates              <dbl> 0.56, 0.68, 0.65, 0.58, 0.56, 0.56, 0.46, 0.47,…
$ alcohol                <dbl> 9.4, 9.8, 9.8, 9.8, 9.4, 9.4, 9.4, 10.0, 9.5, 1…
$ quality                <dbl> 5, 5, 5, 6, 5, 5, 5, 7, 7, 5, 5, 5, 5, 5, 5, 5,…
The variable names are self-explanatory. We do not want to use the quality variable, so we create a
new dataset without it via:
wine_data_chem <- wine_data %>% select(-quality)
head(wine_data_chem)
# A tibble: 6 x 11
  `fixed acidity` `volatile acidity` `citric acid` `residual sugar` chlorides
            <dbl>              <dbl>         <dbl>            <dbl>     <dbl>
1             7.4               0.7           0                 1.9     0.076
2             7.8               0.88          0                 2.6     0.098
3             7.8               0.76          0.04              2.3     0.092
4            11.2               0.28          0.56              1.9     0.075
5             7.4               0.7           0                 1.9     0.076
6             7.4               0.66          0                 1.8     0.075
# … with 6 more variables: free sulfur dioxide <dbl>,
#   total sulfur dioxide <dbl>, density <dbl>, pH <dbl>, sulphates <dbl>,
#   alcohol <dbl>
This is the data you should analyze.
a. (10 points) Using only scatterplots and the sample correlation matrices, summarize what you believe to
be the most interesting associations you observe among these characteristics. Show both the
plots and the numerical summaries you generate to support your conclusions.
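One possible starting point for part (a) is sketched below in base R. It assumes `wine_data_chem` exists as created above; the helper `strongest_pairs()` is a hypothetical name introduced here for illustration, not part of the assignment or of any package. It ranks variable pairs by absolute sample correlation, which may help decide which scatterplots deserve a closer look.

```r
# Sketch only. `strongest_pairs()` is a hypothetical helper, not a library function.
# It returns the n variable pairs with the largest absolute Pearson correlation.
strongest_pairs <- function(df, n = 5) {
  cor_mat <- cor(df)                                  # sample correlation matrix
  idx <- which(upper.tri(cor_mat), arr.ind = TRUE)    # each pair listed once
  out <- data.frame(
    var1 = rownames(cor_mat)[idx[, "row"]],
    var2 = colnames(cor_mat)[idx[, "col"]],
    r    = cor_mat[idx]
  )
  out <- out[order(-abs(out$r)), ]                    # strongest associations first
  head(out, n)
}

# Assumes `wine_data_chem` was created as shown earlier; skipped otherwise.
if (exists("wine_data_chem")) {
  print(round(cor(wine_data_chem), 2))   # full 11 x 11 correlation matrix
  pairs(wine_data_chem, pch = ".")       # scatterplot matrix of all pairs
  print(strongest_pairs(wine_data_chem)) # pairs worth a closer look
}
```

A scatterplot matrix of 11 variables is dense, so it may be worth re-plotting the few pairs flagged by the ranking individually. Correlation only measures linear association, so the plots should still be checked for curved patterns and outliers.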
b. (20 points) Perform a principal component analysis of this data using your preferred function. As part of
this analysis, please be sure to complete the following tasks:
- Report the eigenvalues for all 11 principal components.
- For the first two principal components, plot and interpret the components in terms of the original
  variables. In particular, explain which variables are most highly correlated with each of these two
  components and how these components differ from each other.
- Choose the smallest number of principal components that you believe can be used to summarize
  the information in the data, and justify your choice.
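The tasks in part (b) could be approached along the following lines. This is a minimal sketch, not the expected solution: it uses `prcomp()` as the "preferred function" and standardizes the variables (`scale. = TRUE`), which is an assumption made here because the chemical measurements are on very different scales. The wrapper `pca_summary()` is a hypothetical name introduced for illustration.

```r
# Sketch only. `pca_summary()` is a hypothetical wrapper around stats::prcomp().
# Assumption: PCA on standardized variables (i.e. on the correlation matrix).
pca_summary <- function(df) {
  pca <- prcomp(df, center = TRUE, scale. = TRUE)
  eig <- pca$sdev^2                            # eigenvalues = component variances
  list(
    pca         = pca,
    eigenvalues = eig,
    cum_prop    = cumsum(eig) / sum(eig),      # cumulative proportion of variance
    # For standardized data, cor(variable j, PC k) = loading[j, k] * sdev[k]
    var_pc_cor  = sweep(pca$rotation, 2, pca$sdev, "*")
  )
}

# Assumes `wine_data_chem` was created as shown earlier; skipped otherwise.
if (exists("wine_data_chem")) {
  res <- pca_summary(wine_data_chem)
  print(round(res$eigenvalues, 3))             # all 11 eigenvalues
  print(round(res$cum_prop, 3))                # helps choose how many PCs to keep
  print(round(res$var_pc_cor[, 1:2], 2))       # variable correlations with PC1, PC2
  biplot(res$pca, cex = 0.5)                   # PC1 vs PC2 with variable arrows
  screeplot(res$pca, npcs = 11, type = "lines")
}
```

With standardized variables the eigenvalues sum to the number of variables (11 here), so an eigenvalue above 1 means a component explains more than a single original variable would. That rule, the elbow of the scree plot, and a cumulative-variance threshold are common but informal criteria, and whichever is used should be stated in the justification.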