Business in Practice: Data Analytics
Week 7 – OPTIONAL SLIDES
Dr. Markos Kyritsis
OBJECTIVES – LEARNING OUTCOMES
• By the end of today’s session, you should be able to:
• Run an n-way AN(C)OVA and discuss its parametric assumptions (OPTIONAL
FOR NON-DATA ANALYTICS STUDENTS)
N-WAY FACTORIAL ANOVA
OPTIONAL SLIDES. YOU ONLY NEED TO READ THESE IF YOU’RE IN THE DATA ANALYTICS STREAM,
OR PLAN TO TRANSFER OVER
2,3,4 OR MORE!
• The N in N-Way ANOVA represents the number of categorical independent
variables (factors) you plan to include in your analysis.
• N-Way ANOVA, again, keeps the Type I error rate under control by testing all
groups in a single model instead of running many separate comparisons. We can
also take it a step further and investigate whether the factors interact.
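To make "N factors plus their interactions" concrete, here is a minimal Python sketch (standard library only) that lists every main effect and interaction term a full factorial model could contain. The function name model_terms and the factor names are hypothetical, chosen only for illustration.

```python
from itertools import combinations


def model_terms(factors):
    """List every main effect and interaction of a full factorial model."""
    terms = []
    # Combinations of size 1 are main effects; size 2 and above are interactions.
    for size in range(1, len(factors) + 1):
        for combo in combinations(factors, size):
            terms.append(":".join(combo))
    return terms


# Hypothetical factor names, used only to illustrate the idea.
two_way = model_terms(["ulcer", "sex"])
three_way = model_terms(["ulcer", "sex", "status"])

print(two_way)         # ['ulcer', 'sex', 'ulcer:sex']
print(len(three_way))  # 7
```

With N factors there are 2^N - 1 possible terms, so the number of interactions to interpret grows quickly as factors are added.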
FACTORS?
• As a side note, when building ANOVA models you may come across terms such as factors with X
levels, response variable, etc.
• Factors are your categorical independent variables. The levels are the different groups within a
factor. The response variable is your dependent variable.
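The vocabulary can be illustrated with a small Python sketch. The records below are invented for illustration (they are not real patient data), and the helper name levels is hypothetical.

```python
# Hypothetical records, invented for illustration only.
records = [
    {"ulcer": "present", "sex": "male", "time": 10},
    {"ulcer": "absent", "sex": "female", "time": 35},
    {"ulcer": "present", "sex": "female", "time": 99},
    {"ulcer": "absent", "sex": "male", "time": 185},
]

factors = ["ulcer", "sex"]  # categorical independent variables
response = "time"           # dependent (response) variable


def levels(rows, factor):
    """Return the sorted distinct groups (levels) of a categorical factor."""
    return sorted({row[factor] for row in rows})


for factor in factors:
    found = levels(records, factor)
    print(f"{factor}: {len(found)} levels {found}")
```

Here each factor has two levels, so this would be described as a two-way design with two levels per factor.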
EXAMPLE TWO-WAY ANOVA ON MELANOMA
DATASET
RESULTS (ONLY ONE MAIN EFFECT)
INCLUDING COVARIATES (ANCOVA)
COVARIANCE
• Covariance is a non-standardised measure of the relationship between two variables:
• For our example the covariance is 29.21
• Let’s now consider these two variables:
• The covariance is 2921.43
• As you may have noticed, the second set of variables is just the first set multiplied by 10, yet
the covariance is 100 times larger. The value of cov(x,y) therefore depends on the scale of
the variables.
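The scale dependence can be checked with a short Python sketch. It assumes the usual sample covariance (sum of the products of deviations from the means, divided by n - 1) and uses hypothetical data, not the values from the slides.

```python
def covariance(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


# Hypothetical data; these are not the values used on the slides.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

# The same variables, each multiplied by 10.
x10 = [10 * v for v in x]
y10 = [10 * v for v in y]

cov_small = covariance(x, y)
cov_large = covariance(x10, y10)

print(cov_small)              # 1.5
print(cov_large)              # 150.0
print(cov_large / cov_small)  # 100.0
```

Multiplying both variables by 10 multiplies the covariance by 10 x 10 = 100, even though the relationship between the variables has not changed.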
A COVARIATE IV IN THE FACTORIAL ANOVA
• If you suspect that a continuous variable changes the strength of a
main effect, then you should include this variable as a covariate in
your ANOVA.
• In R this is simple: you add the covariate to the model just as you would a factor.
• Let’s take a look at an example using R Commander.
DOES TUMOUR THICKNESS IN MELANOMA
COVARY WITH SURVIVAL TIME?
THICKNESS AS A COVARIATE
INTERPRETING AND REPORTING THE
RESULTS OF THE ANCOVA
• What we see is that, yes, the presence of an ulcer decreases the expected
survival time.
• We also see that the thickness of the tumour has a significant negative effect
on survival time.
• In other words: the omnibus tests indicated that the presence of an ulcer
had a small but significant effect on survival time [F(1,202) = 45.29, p < 0.001].
The thickness of the tumour also had a small but significant negative effect
on survival time [F(1,202) = 4.09, p < 0.05].
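When reading a result such as F(1,202), the first number is the degrees of freedom of the effect and the second is the residual degrees of freedom. The Python sketch below shows how the residual value arises for an additive model with no interaction terms. The sample size of 205 is an assumption, picked because it reproduces the reported 202; the slides do not state n. The function name residual_df is hypothetical.

```python
def residual_df(n_observations, factor_levels, n_covariates):
    """Residual degrees of freedom for an additive AN(C)OVA model (no interactions)."""
    # One df for the intercept, (levels - 1) per factor, and one per covariate.
    used = 1 + sum(k - 1 for k in factor_levels) + n_covariates
    return n_observations - used


# Assumption: 205 observations, one two-level factor (ulcer present or absent)
# and one continuous covariate (tumour thickness).
print(residual_df(205, [2], 1))  # 202
```

Each effect reported above has 1 degree of freedom: the ulcer factor has two levels (2 - 1 = 1), and a single continuous covariate always uses 1.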
SUMMARY
• If you have more than one factor, use n-way ANOVA (2,3,4…n)
• If you have a covariate, use ANCOVA
• Covariance is non-standardised: its value depends on the scale of the
variables, so it is hard to interpret on its own.