POPH90242 Epidemiology 2
Semester 2, 2024
Assignment 2
Task type:
Written task
Task length:
2000 words (+10% allowed; submissions may be shorter, including under 1750 words, with no penalty)
Weighting:
40%
Due Date/Time:
22nd September, 11.59pm. A penalty of 5% per day will be applied for every day after the due date.
Submission:
Submit as a Microsoft Word document electronically via Turnitin and Gradescope
Learning Outcomes:
Apply standardisation, inverse probability weighting and g-computation to control for confounding
Apply quantitative bias techniques to quantify the direction and magnitude of bias
Critique experimental and observational epidemiological studies
Task Purpose:
In this assessment task you will apply many of the epidemiological analytic techniques that you have learnt in Epidemiology 2. Beyond these specific analyses, thinking through and conducting an epidemiological analysis is a core skill in epidemiology. Hence, this assessment will give you the opportunity to practise that skill.
Additionally, a very common task in epidemiology is reading and analysing previous studies. This is a key skill if you work in government or non-government organisations or research. In government or non-government organisations you need to know when to trust the results of a study and develop programs or policies based on study findings, and when not to do so. Hence there is a critical appraisal of an article in this assessment. We will do another critical appraisal in the next assessment as well, so this is a good chance to learn and improve. It takes practice.
Section A.
Learning Outcomes:
Apply standardisation, regression, propensity scores, and g-computation to control for confounding
Research question
Does living rurally, compared to urban living, increase systolic blood pressure over a five-year period in adults living in the United States?
Methods
Study design: Cohort study, follow-up period of 5 years.
Population: Adults living in the United States
Participants: randomly selected individuals from across the USA for the NHANES longitudinal study https://www.cdc.gov/nchs/nhanes-ls/index.htm.
Data information
Variable | Definition | Measurement | Categorisation | Use in this study
sampl | Individual id number | - | - | Individual identification number
rural | Living rurally / urban setting | Based on categorization of addresses | Urban=0, Living rurally=1 | Exposure
age_grp | 6 age groups in 10 year brackets | Questionnaire data at baseline | Categorical | Potential confounding factor
sex1 | Dichotomous sex variable (USA categorisations) | Questionnaire data at baseline | Male=0, Female=1 | Potential confounding factor
race1 | Dichotomous race variable (USA categorisations) | Questionnaire data at baseline | White=0, African American=1 | Potential confounding factor
bmi | Body mass index (kg/m²) | Questionnaire data at baseline | Continuous measure | Potential confounding factor
bpsystol_2 | Systolic blood pressure at follow-up (mmHg) | Questionnaire data at follow-up | Continuous measure | Outcome
bpsystol_1 | Systolic blood pressure at baseline (mmHg) | Questionnaire data at baseline | Continuous measure | Potential confounding factor
For the purposes of questions 1 and 2, assume that there are no other potential confounding factors of the association between living rurally and blood pressure. Do not assume this for question 3.
Results
Table 1. Participant characteristics
Characteristic | Participants (%) (n=10,137) | Urban (%) (n=3,897) | Rural (%) (n=6,240)
Sex
Male (%) | 4,806 (47.4) | 1,793 (46.0) | 3,013 (48.3)
Female (%) | 5,331 (52.6) | 2,104 (54.0) | 3,227 (51.7)
Age groups (%)
20 - 29 years | 2,261 (22.3) | 933 (23.9) | 1,328 (21.3)
30 - 39 years | 1,589 (15.7) | 612 (15.7) | 977 (15.7)
40 - 49 years | 1,242 (12.3) | 469 (12.0) | 773 (12.4)
50 - 59 years | 1,267 (12.5) | 491 (12.6) | 776 (12.4)
60 - 69 years | 2,804 (27.7) | 1,036 (26.6) | 1,768 (28.3)
70+ years | 974 (9.6) | 356 (9.1) | 618 (9.9)
Race
Identifies as white (%) | 8,548 (84.3) | 2,435 (62.5) | 6,113 (98.0)
Identifies as African American (%) | 1,589 (15.7) | 1,462 (37.5) | 127 (2.0)
Mean BMI (SD) | 25.6 (4.9) | 25.6 (5.1) | 25.5 (4.8)
Mean baseline systolic blood pressure (SD) | 127.7 (12.9) | 128.3 (13.2) | 127.4 (12.7)
Mean follow-up systolic blood pressure (SD) | 130.9 (23.4) | 131.4 (23.9) | 130.6 (22.9)
SD: standard deviation, BMI: body mass index.
This table has been included so you do not need to repeat the table in your assignment.
Questions
Question 1
Write a plan for your analysis (see module 5.6 for a guide). Choose either IPW or G-computation to address the research question.
The characteristics of the sample were described, including a summary of the binary exposure (rural/urban living) and the continuous outcome (systolic blood pressure at follow-up). For categorical variables (sex, race and age group), frequencies and percentages were calculated. For continuous variables (BMI and systolic blood pressure), means and standard deviations were reported. The unadjusted association between rural/urban living and systolic blood pressure was examined by comparing the exposure groups: the difference in mean systolic blood pressure at follow-up between participants living rurally and those living in urban settings was calculated.
To examine the causal effect of rural living on systolic blood pressure, G-computation was used. First, we fitted a linear regression model for systolic blood pressure at follow-up that included rural living (exposure), the potential confounders (sex, age group, race, BMI at baseline and systolic blood pressure at baseline) and the interaction terms between rural living and each of these covariates, to obtain the conditional means for the exposed and unexposed groups. Second, we predicted the potential outcomes for the entire study population by applying the fitted model to every participant:
Generate the predicted systolic blood pressure as if they were all exposed (living rurally)
Generate the predicted systolic blood pressure as if they were all unexposed (urban living)
Third, we averaged these predictions to obtain the standardized mean systolic blood pressure under each exposure:
Generate the mean systolic blood pressure as if they were all exposed (living rurally)
Generate the mean systolic blood pressure as if they were all unexposed (urban living)
Finally, the difference between the two standardized means was calculated to estimate the average causal effect of rural living on systolic blood pressure.
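The four G-computation steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the analysis used for the assignment: the data are simulated (not the NHANES sample), only two confounders are included, the function name `g_computation` is hypothetical, and the outcome model is fitted by ordinary least squares with `numpy`.

```python
import numpy as np


def g_computation(y, exposure, confounders):
    """Return (mean if all exposed, mean if all unexposed, difference)."""
    y = np.asarray(y, dtype=float)
    a = np.asarray(exposure, dtype=float).reshape(-1, 1)
    L = np.asarray(confounders, dtype=float)
    if L.ndim == 1:
        L = L.reshape(-1, 1)
    n = len(y)

    def design(a_col):
        # Intercept, exposure, confounders, and exposure-by-confounder interactions.
        return np.hstack([np.ones((n, 1)), a_col, L, a_col * L])

    # Step 1: fit the outcome model by ordinary least squares.
    beta, *_ = np.linalg.lstsq(design(a), y, rcond=None)
    # Step 2: predict each person's outcome with exposure set to 1, then to 0.
    y_if_exposed = design(np.ones((n, 1))) @ beta
    y_if_unexposed = design(np.zeros((n, 1))) @ beta
    # Steps 3 and 4: average the predictions and take their difference.
    m1, m0 = y_if_exposed.mean(), y_if_unexposed.mean()
    return m1, m0, m1 - m0


# Simulated data (NOT the NHANES sample): the true effect of exposure is +2 mmHg.
rng = np.random.default_rng(0)
n = 5000
bmi = rng.normal(25.6, 4.9, n)
race = rng.binomial(1, 0.16, n)
rural = rng.binomial(1, np.where(race == 1, 0.1, 0.7))  # race confounds exposure
sbp = 100 + 2.0 * rural + 1.0 * bmi + 5.0 * race + rng.normal(0, 2, n)

mean_rural, mean_urban, effect = g_computation(sbp, rural, np.column_stack([bmi, race]))
crude = sbp[rural == 1].mean() - sbp[rural == 0].mean()
print(f"crude difference: {crude:.2f} mmHg, g-computation estimate: {effect:.2f} mmHg")
```

In the simulation the crude difference is pulled away from the true effect by confounding, while the G-computation estimate should land close to it. A confidence interval would typically be obtained by bootstrapping the whole procedure, which is omitted here for brevity.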
A sensitivity analysis was conducted by including all the identified potential confounders: sex, age group, race, BMI at baseline and systolic blood pressure at baseline. These covariates were considered potential confounding factors because they might influence both the exposure and the outcome. The analysis examined whether the effect of rural living on systolic blood pressure at follow-up remained consistent after controlling for these confounders. Isolating the effect of rural living in this way allows the causal effect to be investigated and increases the robustness of the findings.
Question 2
Write up the results from your analytic plan. Include:
a short summary of the key findings from your descriptive results (from Table 1 above) (i.e., 3 sentences)
the results of your analyses
interpretations of the results from your IPW or G computation analyses
From Table 1, the distributions of sex and age group are approximately balanced between the rural and urban groups, although females (urban: 54.0%; rural: 51.7%) outnumber males (urban: 46.0%; rural: 48.3%) in both. The majority (84.3%) of participants identified as white, and few (9.6%) were aged 70+ years. Mean systolic blood pressure was slightly higher for urban than for rural participants both at baseline (urban: 128.3 mmHg; rural: 127.4 mmHg) and at follow-up (urban: 131.4 mmHg; rural: 130.6 mmHg).
In the unadjusted regression analysis, the estimated mean systolic blood pressure at follow-up for people living in urban settings was 131.4 mmHg. Mean systolic blood pressure at follow-up was an estimated 0.73 mmHg lower for people living rurally than for people living in urban settings.
The standardized mean systolic blood pressure at follow-up if all participants lived in urban settings was 131 mmHg. The standardized mean systolic blood pressure at follow-up if all participants lived rurally was 131.1 mmHg.
The results of G-computation suggest that mean systolic blood pressure at follow-up was 0.164 mmHg (95% CI: -0.568, 0.895) higher for people living rurally than for people living in urban settings, after controlling for age group, sex, race, BMI at baseline and systolic blood pressure at baseline. The confidence interval includes zero, so the data are also compatible with no effect of rural living.
Question 3
Do you think the four causal conditions have been met in this analysis? Explain your answer by exploring each of the four causal conditions separately.
In the analysis, temporality was met because the exposure (rural/urban living) was measured before the outcome (systolic blood pressure at follow-up). The cohort study also measured systolic blood pressure at baseline, 5 years before the follow-up measurement. Thus, the exposure and the other covariates were all measured before the outcome.
Exchangeability might not be fully met. Achieving exchangeability in observational studies is challenging because participants are not randomly assigned to exposure and people are exposed for different reasons. All the identified potential confounding factors (age group, sex, race, BMI at baseline and systolic blood pressure at baseline) were included in the analysis, which increases the likelihood that differences in the outcome are due to the exposure (rural/urban living) rather than to these factors. However, there might be other unmeasured confounders.
Positivity might be met since the participants were randomly selected and both exposure groups include individuals with diverse backgrounds. However, some subgroups might be underrepresented in one exposure group, which would threaten positivity.
Causal consistency might not be perfectly met. G-computation was used to estimate the potential outcomes under each exposure (rural/urban living), isolating the effect of the exposure on the outcome by including all the identified potential confounding factors. However, the challenges in meeting exchangeability might also affect causal consistency.
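One rough way to probe the positivity claim is to check that each confounder stratum contains both rural and urban participants. The sketch below does this for race only, using the counts in Table 1. It is a partial check under that limitation: positivity strictly concerns the joint strata of all confounders, which cannot be recovered from the marginal counts in the table.

```python
# (urban n, rural n) by race, taken from Table 1.
counts = {
    "White": (2435, 6113),
    "African American": (1462, 127),
}

# Proportion living rurally within each race stratum.
proportion_rural = {
    group: rural / (urban + rural) for group, (urban, rural) in counts.items()
}

for group, p in proportion_rural.items():
    # Positivity requires 0 < p < 1; values near 0 or 1 signal sparse strata.
    print(f"{group}: proportion rural = {p:.3f}")
```

Both proportions are strictly between 0 and 1, but the African American stratum has few rural participants, which is the kind of underrepresentation the paragraph above refers to.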
Question 4
Briefly write three to four sentences of Discussion for this analysis, taking into account your answers to questions 2 and 3. Include a summary of your findings, limitations and a recommendation for future research.
The association between rural living and systolic blood pressure at follow-up was examined after adjusting for all the identified potential confounding factors: age group, sex, race, BMI at baseline and systolic blood pressure at baseline. The results indicate a minimal increase in mean systolic blood pressure at follow-up for people living rurally compared with people living in urban settings after controlling for these factors. This differs from the unadjusted regression analysis, which showed a slight decrease in mean systolic blood pressure at follow-up for people living rurally. This suggests that confounding factors played a role in the association between exposure and outcome.
The study has some limitations. There might be unmeasured confounding factors that affect both the exposure and the outcome, resulting in residual confounding and making exchangeability difficult to meet. This limits the ability to establish causality. Future research could recruit more diverse participants and measure the confounders not considered here to investigate the causal effect of the exposure on the outcome.
Section B
Learning outcome:
Critique experimental and observational epidemiological studies
All Questions in Section B will refer to this study:
Petit D, Touchette E, Pennestri MH, Paquet J, Côté S, Tremblay RE, Boivin M, Montplaisir JY. Nocturnal sleep duration trajectories in early childhood and school performance at age 10 years. J Sleep Res. 2023 Oct;32(5):e13893. doi: 10.1111/jsr.13893. Epub 2023 Mar 27. PMID: 36973015. https://onlinelibrary.wiley.com/doi/full/10.1111/jsr.13893
Questions
Question 5
Answer the questions in Domain 1, Domain 5 and Domain 6 of the ROBINS-E Adapted for POPH90242 Epidemiology 2 document. Copy and paste the question number and your answer into your assessment answer page.
A1 is filled in below, otherwise the initial sections are not included in this assessment. Include what you think is relevant from these sections (i.e., important confounding factors) in the responses to the Domain questions. This will make it cleaner to write and read.
A1. Specify the numerical result being assessed
Association between sleep trajectory 1 and Reading level: OR: 2.4 (95%CI 1.3-4.6), p value 0.007. Taken from Table 4 in the study above.
Question 6
Based on your answer to Question 5, answer the questions in the ‘Overall risk of bias’ section of the ROBINS-E Adapted for POPH90242 Epidemiology 2 document. Copy and paste this section into your assessment answer page.
Section C.
Learning outcome:
Apply quantitative bias techniques to quantify the direction and magnitude of bias
In this section of the assignment we will be looking at this study:
Bruinsma FJ, Jordan S, Bassett JK, et al. Analgesic use and the risk of renal cell carcinoma - Findings from the Consortium for the Investigation of Renal Malignancies (CONFIRM) study. Cancer Epidemiol. 2021 Dec;75:102036. doi: 10.1016/j.canep.2021.102036. Epub 2021 Sep 22. PMID: 34562747.
The question we will focus on is:
Is there an increased risk of incident renal cell carcinoma (RCC) in Australian adults with a higher paracetamol intake compared to those with a lower paracetamol intake?
Methods
Participants: This study was conducted across Victoria and Queensland in Australia, using Cancer Registry data.
Cases with a renal cell carcinoma diagnosis on the Cancer Registries of participating states were invited to participate.
Controls were family members of the case; a sibling or spouse.
Measurement: Regular paracetamol intake was measured through a questionnaire. Participants who used paracetamol at least five times per month, for six months or more, were defined as regular users.
Analysis: Odds ratios were calculated and adjustment for age, sex, smoking and hypertension was undertaken.
Results:
Regular paracetamol use was associated with increased odds of renal cell carcinoma (OR 1.32, 95%CI 1.09, 1.61)*
Table 1. The number of cases and controls using paracetamol regularly.
Paracetamol use | Cases | Controls | Total
Regular paracetamol users | 514 | 300 | 814
Non regular paracetamol users | 550 | 424 | 974
Total | 1064 | 724 | 1788
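As a starting point for the bias analysis, the crude (unadjusted) odds ratio can be computed directly from the counts in Table 1. The sketch below uses the Woolf log-based confidence interval, which is an assumption about the interval method and not necessarily the one used in the published study. The crude value rounds to 1.32, the same as the reported odds ratio.

```python
import math

# Counts from the 2x2 table above.
cases_exposed, controls_exposed = 514, 300
cases_unexposed, controls_unexposed = 550, 424

# Cross-product (odds) ratio.
crude_or = (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)

# Woolf (log-based) standard error and 95% confidence interval.
se_log_or = math.sqrt(
    1 / cases_exposed + 1 / controls_exposed + 1 / cases_unexposed + 1 / controls_unexposed
)
ci_low = math.exp(math.log(crude_or) - 1.96 * se_log_or)
ci_high = math.exp(math.log(crude_or) + 1.96 * se_log_or)
print(f"crude OR = {crude_or:.2f} (95% CI {ci_low:.2f}, {ci_high:.2f})")
```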
You decide to explore the role bias may play in this finding by completing a quantitative bias analysis.
The information below will help you plan your bias analysis:
Recall of over-the-counter medications is often prone to measurement error. You find that the sensitivity and specificity of self-reported use of these medications are low (1). Using the information from this published study you decide that you will conduct a QBA under the hypothesis that in controls the sensitivity and specificity are 0.55 (95%CI 0.50, 0.60) and 0.89 (95%CI 0.81, 0.94), respectively. You think those with RCC are more likely to ‘recall’ their taking of these medications; hence you estimate that, compared to controls, the sensitivity in cases will be 5% higher and the specificity 5% lower.
At the start of the study there were logically 3484 potential case-control pairs that could participate. In the final analysis 1211/3484 (34.8%) cases and 724/3484 (20.8%) controls had data available. You are concerned that low education is a cause of paracetamol use (2), and that two causes of low participation are low education and being a control participant. Hence you hypothesise that the participation fraction for the controls taking paracetamol is 1% to 2% lower than for those not taking paracetamol.
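A simple (non-probabilistic) correction for this differential exposure misclassification can be sketched as below, using the counts from Table 1 and the point estimates above. Two assumptions are made purely for illustration: "5% higher" and "5% lower" are read as 5 percentage points (sensitivity 0.60 and specificity 0.84 in cases), and the confidence intervals around sensitivity and specificity are ignored. The helper name `corrected_exposed` is hypothetical.

```python
def corrected_exposed(observed_exposed, total, sensitivity, specificity):
    """Back-calculate the expected true number exposed from misclassified counts."""
    false_positive_rate = 1 - specificity
    return (observed_exposed - false_positive_rate * total) / (sensitivity + specificity - 1)


# Controls: sensitivity and specificity point estimates from the text.
se_controls, sp_controls = 0.55, 0.89
# Cases: ASSUMPTION that "5% higher/lower" means 5 percentage points.
se_cases, sp_cases = se_controls + 0.05, sp_controls - 0.05

# Corrected cell counts for cases (n=1064) and controls (n=724).
cases_exp = corrected_exposed(514, 1064, se_cases, sp_cases)
controls_exp = corrected_exposed(300, 724, se_controls, sp_controls)
cases_unexp = 1064 - cases_exp
controls_unexp = 724 - controls_exp

corrected_or = (cases_exp * controls_unexp) / (cases_unexp * controls_exp)
print(f"misclassification-corrected OR = {corrected_or:.2f}")
```

Under these assumptions the corrected odds ratio is somewhat closer to the null than the observed one. A probabilistic analysis would instead draw sensitivity and specificity repeatedly from distributions informed by the confidence intervals.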
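The selection bias hypothesis can be explored by dividing the observed odds ratio by a selection bias factor built from the four participation fractions. The sketch below rests on assumptions that are not stated in the text: participation among cases does not depend on paracetamol use, the overall control fraction (724/3484) applies to controls not taking paracetamol, and "1% to 2% lower" means 1 to 2 percentage points. The function name is hypothetical.

```python
observed_or = (514 * 424) / (550 * 300)

# Participation fractions from the text.
p_cases = 1211 / 3484      # ASSUMED equal for exposed and unexposed cases
p_controls = 724 / 3484    # ASSUMED to apply to controls not taking paracetamol


def selection_corrected_or(difference):
    """Correct the OR when exposed controls participate `difference` less often."""
    p_controls_exposed = p_controls - difference
    # Bias factor: (S_case,exposed * S_control,unexposed) / (S_case,unexposed * S_control,exposed)
    bias_factor = (p_cases * p_controls) / (p_cases * p_controls_exposed)
    return observed_or / bias_factor


for diff in (0.01, 0.02):
    print(f"difference {diff:.2f}: corrected OR = {selection_corrected_or(diff):.2f}")
```

Because exposed controls are hypothesised to participate less, the observed odds ratio is biased upward and the corrected values fall below it.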
Question
Question 7
Write-up your bias analysis plan, include a DAG of all potential biases discussed above.
Notes:
we will return to complete this analysis in Assessment 3.
do not include Stata commands.
*Supplementary Table 3. Note there are some differences in the confidence intervals between our analysis and that published due to differences in the data available.
References
1. Lacasse A, Ware MA, Bourgault P, Lanctôt H, Dorais M, Boulanger A, Cloutier C, Shir Y, Choinière M. Accuracy of Self-reported Prescribed Analgesic Medication Use: Linkage Between the Quebec Pain Registry and the Quebec Administrative Prescription Claims Databases. Clin J Pain. 2016 Feb;32(2):95-102. doi: 10.1097/AJP.0000000000000248. PMID: 25924096.
2. Algarni M, Hadi MA, Yahyouche A, et al. A mixed-methods systematic review of the prevalence, reasons, associated harms and risk-reduction interventions of over-the-counter (OTC) medicines misuse, abuse and dependence in adults. J Pharm Policy Pract. 2021;14:76. https://doi.org/10.1186/s40545-021-00350-7