ETC2420/ETC5242 Statistical Thinking 2020
Learning goals and additional points, week by week

For each workshop week: learning goals, followed by additional specific areas of focus for the exam.

Week 1: Introduction to R and RStudio
Learning goals:
• Learn how to set up R and RStudio on your own device.
• Learn to install and load R packages.
• Learn what RMarkdown files and reproducible research are.
• Learn what 'the tidyverse' is.
• Learn some basic R commands to manipulate and plot data.
Additional specific areas of focus for exam:
• Recognise R code
• Be able to explain what reproducibility means and why it is important
• head()
• tibble::glimpse()

Week 2: Introduction to data, visualisation and wrangling
Learning goals:
• Identify types of variables, summarise them appropriately, and characterise relationships between them.
• Describe scientific data collection principles.
• Classify variables as being numerical or categorical.
• Illustrate 'tidy data' organisational principles.
• Produce descriptive summaries of numerical and categorical data using appropriate ggplot2, tidyr and dplyr functions.
Additional specific areas of focus for exam:
• Know the meaning of commonly used descriptive statistics and types of plots for different types of data (e.g. histograms, kernel density plots, bar plots, column plots)

Week 3: Randomisation and simulation for testing proportions
Learning goals:
• Explain terms relevant to statistical hypothesis testing and inference problems
• Demonstrate the sampling distribution of a statistic
• Construct a randomisation test for independence of two binary variables
• Build parametric tests for one and two proportions using the Central Limit Theorem
• Reiterate the framework for frequentist inference
Additional specific areas of focus for exam:
• Permutations
• Sampling without replacement
• Permutation test for simulation consistent with H0: p1 = p2
• Recognise the prop.test() function and its use for one sample, and two independent samples (not paired)
• Construction of CLT-based confidence intervals
• One-sided and two-sided alternative hypotheses
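
As an illustration of the Week 3 items, a permutation test consistent with H0: p1 = p2 can be set next to the CLT-based prop.test(). The following is a minimal base R sketch; the data, seed and object names are hypothetical and not taken from the unit materials.

```r
# Hypothetical data: 1 = success, 0 = failure, for two independent groups.
set.seed(2420)
group   <- rep(c("A", "B"), times = c(50, 50))
outcome <- c(rep(1, 30), rep(0, 20),   # group A: 30 successes out of 50
             rep(1, 20), rep(0, 30))   # group B: 20 successes out of 50

# Difference in sample proportions between the two groups.
diff_prop <- function(y, g) mean(y[g == "A"]) - mean(y[g == "B"])
obs_diff  <- diff_prop(outcome, group)

# Permutation null distribution: shuffle the group labels
# (sampling without replacement), which is consistent with H0: p1 = p2.
perm_diffs <- replicate(5000, diff_prop(outcome, sample(group)))

# Two-sided permutation p-value.
p_perm <- mean(abs(perm_diffs) >= abs(obs_diff))

# CLT-based test and confidence interval for two independent samples.
clt_test <- prop.test(x = c(30, 20), n = c(50, 50))

p_perm
clt_test$p.value
clt_test$conf.int
```

A one-sided alternative could be requested in prop.test() through its alternative argument ("less" or "greater"); the permutation p-value would then drop the absolute values.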

Weeks 4/5: Resampling techniques for assessing variability in means
Learning goals:
• Review the Central Limit Theorem
• Apply one and two sample t-tests and confidence intervals
• Build bootstrap confidence intervals for numerical data
• Distinguish between independent and paired samples
Additional specific areas of focus for exam:
• Recognise the t.test() function and its use for one sample, and two samples (both paired and independent)
• Construction of CLT-based confidence intervals
• Sampling with replacement
• Bootstrap sampling distribution
• Bootplot.f()
• Use of simulation to understand methodology
• Interpretation of confidence intervals, p-values
• Permutation test for independent means, H0: µ1 = µ2
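
To make the Weeks 4/5 items concrete, here is a minimal base R sketch of a bootstrap sampling distribution for a mean, with the t.test() counterparts. The simulated data and all object names are assumptions for illustration, and the percentile interval is only one of several ways to build a bootstrap confidence interval.

```r
set.seed(5242)
x <- rexp(40, rate = 1 / 10)   # hypothetical numerical sample

# Bootstrap sampling distribution of the sample mean:
# resample the data with replacement, keeping the original sample size.
boot_means <- replicate(5000, mean(sample(x, replace = TRUE)))

# 95% percentile bootstrap confidence interval for the population mean.
boot_ci <- quantile(boot_means, probs = c(0.025, 0.975))

# t-based confidence interval from a one-sample t-test.
t_ci <- t.test(x, conf.level = 0.95)$conf.int

# Paired versus independent two-sample tests on a hypothetical
# second measurement taken on the same 40 units.
y <- x + rnorm(40, mean = 1)
paired_test      <- t.test(y, x, paired = TRUE)
independent_test <- t.test(y, x)   # Welch test by default

boot_ci
t_ci
```

With paired data the paired test works on the within-unit differences, so it usually gives a much narrower interval than the independent-samples version.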

Week 6: Distributional models and maximum likelihood
Learning goals:
• Apply elementary probability and conditional probability rules
• Identify common discrete and continuous univariate distributions
• Develop distributional models for i.i.d. data and estimate them using maximum likelihood methods
• Use CLT- and bootstrap-based confidence intervals to characterise uncertainty in MLEs
Additional specific areas of focus for exam:
• Use the MASS::fitdistr() function and interpret its output
• How to obtain the "fitted" theoretical distribution using an estimate of the parameters (e.g. the MLE)
• Use N(0,1) quantiles for CLT tests/confidence intervals
• Implement the bootstrap for both scalar and vector-valued parameters
• Interpretation of MLE-based confidence intervals, p-values
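
The Week 6 workflow (fit by maximum likelihood, then quantify uncertainty with CLT and bootstrap intervals) might be sketched as follows. An Exponential model is assumed purely for illustration, and the MLE is found with base R's optimize() instead of MASS::fitdistr() so that the sketch needs no extra packages. All data and names are hypothetical.

```r
set.seed(6)
x <- rexp(200, rate = 0.5)   # hypothetical i.i.d. sample
n <- length(x)

# Exponential log-likelihood as a function of the rate parameter.
loglik <- function(rate) sum(dexp(x, rate = rate, log = TRUE))

# Numerical MLE; for this model the closed form is 1 / mean(x).
mle <- optimize(loglik, interval = c(1e-6, 10), maximum = TRUE)$maximum

# Asymptotic standard error from the Fisher information n / rate^2.
se <- mle / sqrt(n)

# 95% CLT-based confidence interval using N(0,1) quantiles.
clt_ci <- mle + c(-1, 1) * qnorm(0.975) * se

# Bootstrap alternative: recompute the MLE on resampled data.
boot_mle <- replicate(2000, 1 / mean(sample(x, replace = TRUE)))
boot_ci  <- quantile(boot_mle, probs = c(0.025, 0.975))

# "Fitted" theoretical distribution: plug the MLE into the model,
# for example to obtain the fitted median.
fitted_median <- qexp(0.5, rate = mle)
```

For a vector-valued parameter the same bootstrap loop would return a vector of estimates per resample, with one interval taken per component.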

Week 7: Updating discrete probabilities
Learning goals:
• Discuss model assessment tools for distributions fitted using MLE
• Transition to Bayesian Statistical Thinking
• Apply Bayes' theorem in discrete cases
Additional specific areas of focus for exam:
• Different strategies for QQ-plots
• Application of Bayes' theorem to a spell-checking algorithm
• A continuous density for Y|θ with a discrete prior for θ is still a discrete application of Bayes' theorem
• The denominator of Bayes' theorem is a constant (in terms of θ)
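
A small base R sketch of the last two Week 7 points: a continuous density for Y|θ combined with a discrete prior for θ still gives a discrete posterior, and the denominator is a single number that does not vary with θ. The candidate values, prior weights and observation are hypothetical.

```r
theta <- c(0, 1, 2)          # hypothetical candidate values for theta
prior <- c(0.5, 0.3, 0.2)    # hypothetical discrete prior, sums to 1
y     <- 1.4                 # one hypothetical observation, Y | theta ~ N(theta, 1)

# Continuous density of Y | theta, evaluated at the observed y
# for each candidate theta.
likelihood <- dnorm(y, mean = theta, sd = 1)

# Bayes' theorem: posterior is proportional to likelihood times prior.
numerator   <- likelihood * prior
denominator <- sum(numerator)   # a single constant, the same for every theta
posterior   <- numerator / denominator

data.frame(theta, prior, likelihood, posterior)
```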

Week 8: Bayesian inference for numerical data and decision rules
Learning goals:
• Review Bayesian statistical thinking
• Apply Bayes' theorem with conjugate continuous priors
• Consider loss functions and decision rules
• Construct credibility factors
Additional specific areas of focus for exam:
• Bayesian A/B testing (application of 2 independent Beta-Binomials)
• Denominator of Bayes' theorem is a constant function of θ
• Use of simulation in place of analytical posterior
• Relationship between A/B testing and the test of 2 independent proportions
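
Bayesian A/B testing with two independent Beta-Binomial models can be sketched in base R as below, using simulation in place of an analytical posterior for the difference. The counts and the Beta(1, 1) priors are assumptions chosen for illustration only.

```r
set.seed(8)
# Hypothetical data: successes out of trials for variants A and B.
x_a <- 45; n_a <- 200
x_b <- 60; n_b <- 200

# Beta(1, 1) priors (an assumption). By conjugacy each posterior is
# Beta(1 + successes, 1 + failures); draw from each independently.
post_a <- rbeta(10000, 1 + x_a, 1 + n_a - x_a)
post_b <- rbeta(10000, 1 + x_b, 1 + n_b - x_b)

# Simulation-based posterior summaries for the difference p_B - p_A.
prob_b_better <- mean(post_b > post_a)
cred_int      <- quantile(post_b - post_a, probs = c(0.025, 0.975))

# Frequentist counterpart: test of two independent proportions.
freq_test <- prop.test(x = c(x_a, x_b), n = c(n_a, n_b))

prob_b_better
cred_int
freq_test$p.value
```

The posterior probability that B beats A answers a different question from the frequentist p-value, even though both are computed from the same two sets of counts.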

Week 9: Regression models
Learning goals:
• Synthesise the Bayesian approach
• Compare frequentist and Bayesian inference
• Recognise when transformations may be required
• Review frequentist simple linear regression
• Diagnose problems with a regression model
Additional specific areas of focus for exam:
• Fit MLE to Olympic medal count data
• Relationship between Lognormal and Normal
• Adding a constant before taking a log (if zeroes in data)
• Checking MLE fit using QQ-plots
• MLE-based prediction distribution
• Check regression fit using residual plots, LOOCV
• R-squared
• broom::tidy(), glance(), augment()
• Leverage and Cook's D
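
The regression checks listed for Week 9 (R-squared, leverage, Cook's D, LOOCV) can be sketched with base R functions. The broom functions named above return similar quantities in tidy data frames but are not used here, so that the sketch has no package dependencies. The simulated data and object names are hypothetical.

```r
set.seed(9)
dat   <- data.frame(x = runif(50, 0, 10))
dat$y <- 2 + 0.5 * dat$x + rnorm(50)   # hypothetical data

fit <- lm(y ~ x, data = dat)

r_squared <- summary(fit)$r.squared
leverage  <- hatvalues(fit)        # diagonal of the hat matrix
cooks_d   <- cooks.distance(fit)   # Cook's D for each observation

# LOOCV shortcut for linear models: the leave-one-out prediction
# error for observation i equals e_i / (1 - h_i).
loo_resid <- residuals(fit) / (1 - leverage)
loocv_mse <- mean(loo_resid^2)

# The same errors by explicitly refitting without observation i.
loo_explicit <- sapply(seq_len(nrow(dat)), function(i) {
  fit_i <- lm(y ~ x, data = dat[-i, ])
  dat$y[i] - predict(fit_i, newdata = dat[i, , drop = FALSE])
})

# Residuals against fitted values, for a visual check of the fit.
plot(fitted(fit), residuals(fit), xlab = "Fitted", ylab = "Residual")
```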

Week 10: Multiple Linear Regression
Learning goals:
• Apply multiple linear regression models
• Diagnose issues related to multicollinearity
• Apply model performance measures
• Formulate a general strategy for building a regression model
Additional specific areas of focus for exam:
• Techniques to select model regressors
• Multicollinearity and the variance inflation factor
• ggscatmat()
• meifly::fitall()
• adjusted R-squared
• AIC and negAIC
• BIC and negBIC
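
For the Week 10 items, a minimal base R sketch of the variance inflation factor and of comparing candidate models by adjusted R-squared, AIC and BIC is given below. The VIF is computed from its definition, not from a package function, and the data are simulated with a deliberately collinear pair of regressors. All names are hypothetical.

```r
set.seed(10)
n  <- 100
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.3)   # deliberately correlated with x1
x3 <- rnorm(n)
y  <- 1 + x1 + 0.5 * x3 + rnorm(n)
dat <- data.frame(y, x1, x2, x3)

# VIF_j = 1 / (1 - R^2_j), where R^2_j is the R-squared from
# regressing x_j on the remaining regressors.
regressors <- c("x1", "x2", "x3")
vif <- sapply(regressors, function(v) {
  others <- setdiff(regressors, v)
  r2 <- summary(lm(reformulate(others, response = v), data = dat))$r.squared
  1 / (1 - r2)
})

# Compare two candidate models on common performance measures.
full    <- lm(y ~ x1 + x2 + x3, data = dat)
reduced <- lm(y ~ x1 + x3, data = dat)
comparison <- data.frame(
  model  = c("full", "reduced"),
  adj_r2 = c(summary(full)$adj.r.squared, summary(reduced)$adj.r.squared),
  AIC    = c(AIC(full), AIC(reduced)),
  BIC    = c(BIC(full), BIC(reduced))
)

vif
comparison
```

Larger adjusted R-squared and smaller AIC or BIC indicate the preferred model under each criterion.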

Week 11: Bayesian Multiple Linear Regression
Learning goals:
• Introduce Markov chain Monte Carlo methods
• Apply Bayesian multiple regression models
• Consider Bayesian ensembles
Additional specific areas of focus for exam:
• Conditionally conjugate prior for the regression model: Normal (N) times independent Inverse-Gamma (IG)
• MCMCpack::MCMCregress()
• Posterior trace plots
• Bayesian prediction averages conditional predictions with respect to the posterior of θ
