PS918 Modelling Assignment
PS918 Psychological Models of Choice
Emmanouil Konstantinidis | University of Warwick
March 5, 2021
This assignment is due on Wednesday, April 14, 2021. Submit your report on Moodle as one HTML or
PDF file.
Medical Decision Making
The file medical_dm.csv contains part of the data in Trueblood et al. (2018, also discussed in Lecture 3)
investigating medical decision making among medical professionals (pathologists) and novices (undergraduate
students). The task of participants was to judge whether pictures of blood cells show cancerous cells (i.e.,
blast cells) or non-cancerous cells (i.e., non-blast cells). The current data set contains 200 such decisions per
participant. At the beginning of the experiment, both novices and medical experts completed a training phase to
familiarise themselves with blast cells. After that, each participant performed the main task in which they
judged whether the presented images showed blast cells or non-blast cells. An additional group of experts had
classified each image as either an easy or a difficult trial. Figure 1 contains
examples of blast cells and non-blast cells. Here, we only consider the data from the “accuracy” condition
(Trueblood et al. considered additional conditions that are not part of the assignment dataset).
Figure 1: Sample images of blast and non-blast cells that were classified as easy and difficult. Panel a is
an easy blast image, panel b is a hard blast image, panel c is an easy non-blast image, and panel d is a hard
non-blast image.
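med <- read.csv("medical_dm.csv", stringsAsFactors = TRUE) # factors as in the output below; R >= 4.0 reads strings as character by default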
str(med)
## 'data.frame': 11000 obs. of 9 variables:
## $ id : int 2 2 2 2 2 2 2 2 2 2 ...
## $ group : Factor w/ 3 levels "experienced",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ block : int 3 3 3 3 3 3 3 3 3 3 ...
## $ trial : int 1 2 3 4 5 6 7 8 9 10 ...
## $ classification: Factor w/ 2 levels "blast","non-blast": 1 2 2 2 1 1 1 1 2 1 ...
## $ difficulty : Factor w/ 2 levels "easy","hard": 1 1 2 2 1 1 2 2 1 2 ...
## $ response : Factor w/ 2 levels "blast","non-blast": 1 2 1 2 1 1 1 1 2 1 ...
## $ rt : num 0.853 0.575 1.136 0.875 0.748 ...
## $ stimulus : Factor w/ 312 levels "blastEasy/AuerRod.jpg",..: 8 167 246 273 47 32 132 98 217 85 ...
The data contains 9 variables:
• id: Participant identifier (note, participant identifier is not unique across groups).
• group: Group identifier with three levels: experienced, inexperienced, and novice. The first two
levels refer to different types of medical professionals that we will consider together for the purposes
of the assignment (i.e., in your answer, do not distinguish between experienced and inexperienced
doctors).
• block: The block in which the trial was shown.
• trial: Trial number.
• classification: True/Correct category of each image, either blast or non-blast.
• difficulty: Difficulty of trial, either easy or hard.
• response: Response of participant, either blast or non-blast.
• rt: Response time in seconds.
• stimulus: Stimulus shown.
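Because id is not unique across groups and the two professional groups are to be analysed together, some preprocessing is needed before fitting. The following is a minimal sketch of one way to do this. The data frame med_demo is a made-up stand-in for med, and the column names expertise and pid are introduced here purely for illustration.

```r
# Tiny stand-in for `med` (made-up rows, only the columns needed here)
med_demo <- data.frame(
  id    = c(2, 2, 2, 5, 2, 5),
  group = factor(c("experienced", "experienced", "inexperienced",
                   "inexperienced", "novice", "novice"))
)

# Pool both kinds of medical professionals into a single "expert" level
med_demo$expertise <- factor(
  ifelse(med_demo$group == "novice", "novice", "expert"),
  levels = c("novice", "expert")
)

# id repeats across groups, so combine it with the original group label
med_demo$pid <- interaction(med_demo$group, med_demo$id, drop = TRUE)

nlevels(med_demo$pid)      # number of distinct participants
table(med_demo$expertise)  # rows per pooled group
```

The same two steps could be applied to med itself once it has been read in.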
Task Description (i.e., what you need to do)
Your main task is to analyse the data with a diffusion model using a no-pooling approach (i.e., fit the
diffusion model to the data of each individual participant separately) and trial-wise maximum likelihood
estimation. Then, use ANOVA to compare the individual-level parameter estimates between the two groups
(experts versus novices).
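As a rough illustration of the group comparison, the sketch below runs a one-way ANOVA on a single parameter using base R's aov(). The estimates are random placeholders, not results, and the names est, expertise, and fit_a are assumptions made for this example only.

```r
set.seed(1)

# Placeholder estimates, NOT real results: one row per participant
est <- data.frame(
  expertise = factor(rep(c("expert", "novice"), each = 10)),
  a = c(rnorm(10, mean = 1.4, sd = 0.2),   # boundary separation, experts
        rnorm(10, mean = 1.2, sd = 0.2))   # boundary separation, novices
)

# One-way between-participants ANOVA on the estimates of parameter a
fit_a <- aov(a ~ expertise, data = est)
summary(fit_a)

# Group means to report alongside the F test
aggregate(a ~ expertise, data = est, FUN = mean)
```

The same call could be repeated for each estimated parameter, with the real per-participant estimates in place of the placeholders.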
There are 2 research questions here:
1. Is the diffusion model able to describe real-life medical decision making for both experts (i.e., medical
professionals) and novices?
2. And if so, do the cognitive processes captured by the diffusion model differ between experts and novices?
Put differently, how do cognitive processes underlying medical decision making differ between experts
and non-experts?
There should be two sections in your answer. In the first section, your answer should consist of “complete
sentences”, as you would do in an essay. This section should start with a description of the design and
research question, the diffusion model used here (e.g., what are the free parameters used and what do they
represent?), the benefits of using diffusion models, and relevant aspects of the data (e.g., how many trials per
participant on average, how many trials were excluded). Next, your answer should say something about the
model fit or adequacy. Next, present the results comparing the parameter estimates across the two groups
(experts versus novices). Remember to describe ANOVAs sufficiently. The final part should contain some
sort of summary with respect to the research questions. Feel free to use headings to separate the parts in
the first section in a reasonable manner. Include descriptive statistics in the text, or in tables or figures as
appropriate. Tables and figures should be of publication quality (i.e., fully labelled). Integrate inferential
statistics into your description of the results.
Provided that the model and statistical analysis are correct and appropriate, the first section
will largely determine your mark. If an analysis is performed in the second section, but
not reported in the first, it will not be considered. Do not forget to consider the research
question in your answer. The first part may be up to 2500 words long (but can of course be shorter).
Please note that too many figures or tables in this part can also reduce your mark.
The second section should include the complete R code that you used and its output. Where helpful, add
comments (after a #) to explain what the code does. The code should show all of the commands that you
used, enough for me to replicate exactly what you did (I will be copying and pasting code to run it, so make
sure that works). You can include figures here that you used to explore the data that you do not wish to
include in the first section. I will use the second section to help identify the source of any mistakes. For
practical reports and papers you would only submit the first section, and thus the first section should stand
alone without the second section.
Model Specification
For your diffusion model, please estimate the same 6 model parameters as above: a, v, t0, z, sv, and st0
(i.e., fix sz to 0). Note that in contrast to all analyses performed in the worksheet, we now have two stimulus
classes, blast or non-blast images. Consequently, both stimulus classes need to have independent drift
rates. All other parameters should be shared across both stimulus classes. In total, the simplest possible
diffusion model has 7 parameters per individual data set.
You can also fit and report a more complicated model. This model should have separate drift rates for easy
and difficult trials, separately for each stimulus class. Such a model would have 9 parameters per individual
data set.
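One possible way to lay out the parameter vector and draw random starting values for either model is sketched below. The function name get_start, the parameter names, and all sampling ranges are assumptions chosen for illustration; in particular, z is treated here as a proportion of a, which may differ from the convention used in the worksheet.

```r
# Draw random starting values; drift_names decides how many drift rates exist
get_start <- function(drift_names = c("v_blast", "v_nonblast")) {
  drifts <- rnorm(length(drift_names), mean = 0, sd = 2)
  names(drifts) <- drift_names
  c(
    a   = runif(1, 0.5, 3),    # boundary separation
    drifts,                    # one drift rate per stimulus class (or class x difficulty)
    t0  = runif(1, 0, 0.2),    # non-decision time in seconds
    z   = runif(1, 0.4, 0.6),  # relative starting point (assumed proportion of a)
    sv  = runif(1, 0, 0.5),    # between-trial variability of drift rate
    st0 = runif(1, 0, 0.2)     # between-trial variability of non-decision time
  )
}

set.seed(42)
start7 <- get_start()  # simplest model: 7 parameters
start9 <- get_start(c("v_blast_easy", "v_blast_hard",
                      "v_nonblast_easy", "v_nonblast_hard"))  # 9 parameters
length(start7)
length(start9)
```

Naming the elements means a likelihood function can pick out each drift rate by name rather than by position.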
Note that in either case, due to having two drift rates, you will have to write a new function for generating
starting values. In addition, you will need to modify the likelihood function, or write a new function that is
a wrapper for the likelihood function that makes sure that for each data point the correct drift rate is used.
For example, the function could split the data into two subsets, one for blast and one for non-blast stimuli,
apply the correct drift rate to each subset, and sum the resulting log-likelihoods. Note again that all other
parameters are shared across the two stimulus classes, so you cannot fit the two stimulus classes
separately.
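The wrapper idea described above can be sketched as follows. This assumes the rtdists package and its ddiffusion() density; the worksheet's own likelihood function may look different. The function name nll_two_drifts, the parameter names, the mapping of a "blast" response to the upper boundary, the treatment of z as a proportion of a, and the penalty value 1e6 are all assumptions made for this example.

```r
library(rtdists)  # assumed to be installed; provides ddiffusion()

# Negative log-likelihood for one participant with two drift rates.
# pars: named vector with a, v_blast, v_nonblast, t0, z, sv, st0
#       (z is treated as a proportion of a; sz is fixed to 0).
nll_two_drifts <- function(pars, rt, response, classification) {
  # Assumption: a "blast" response corresponds to the upper boundary
  boundary <- ifelse(response == "blast", "upper", "lower")
  total <- 0
  for (cls in c("blast", "non-blast")) {
    idx <- classification == cls
    if (!any(idx)) next
    # Pick the drift rate belonging to this stimulus class
    v <- if (cls == "blast") pars[["v_blast"]] else pars[["v_nonblast"]]
    dens <- tryCatch(
      ddiffusion(rt[idx], response = boundary[idx],
                 a = pars[["a"]], v = v, t0 = pars[["t0"]],
                 z = pars[["z"]] * pars[["a"]], sz = 0,
                 sv = pars[["sv"]], st0 = pars[["st0"]]),
      error = function(e) 0
    )
    # Penalise parameter values under which some trial is impossible
    if (any(!is.finite(dens)) || any(dens <= 0)) return(1e6)
    total <- total - sum(log(dens))
  }
  total
}

# Usage with a few made-up trials (not real data)
demo_pars <- c(a = 1.2, v_blast = 1.5, v_nonblast = -1.5,
               t0 = 0.3, z = 0.5, sv = 0.1, st0 = 0.1)
demo <- data.frame(
  rt             = c(0.62, 0.71, 0.85, 0.94),
  response       = c("blast", "non-blast", "blast", "non-blast"),
  classification = c("blast", "non-blast", "non-blast", "blast")
)
demo_nll <- nll_two_drifts(demo_pars, demo$rt, demo$response, demo$classification)
demo_nll
```

Because both subsets are evaluated inside one function call with one parameter vector, the shared parameters stay shared; a function like this could then be handed to a general-purpose optimiser together with random starting values.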
Fitting the model to all participants can be time consuming. It may not make sense to re-fit the data each
time you want to work on the assignment. To avoid this, note that you can save R objects via save() and
load them with load(). For example, if you have saved your fits in res_med, the following code can be used
for saving and loading, respectively:
save(res_med, file="res_med.rda")
load("res_med.rda")
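A common pattern built on save() and load() is to fit only when no saved file exists yet. The sketch below illustrates this with a placeholder object in a temporary directory; the path fit_file and the placeholder contents of res_med are assumptions for the demo, not part of the assignment.

```r
# Hypothetical location for this demo; in practice this could be "res_med.rda"
fit_file <- file.path(tempdir(), "res_med_demo.rda")

if (file.exists(fit_file)) {
  load(fit_file)  # restores the object res_med from disk
} else {
  # Stand-in for the slow fitting step
  res_med <- list(note = "placeholder for the fitted results")
  save(res_med, file = fit_file)
}

str(res_med)
```

Deleting the file forces a fresh fit the next time the script runs.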
Additional Considerations
Note that before fitting the model to the data, you should exclude trials that are too fast or too slow. For example,
exclude all trials that are faster than 250 ms or slower than 2.5 seconds. Such a cut-off is not uncommon
when applying the diffusion model to data. If you use a cut-off (which probably makes sense), make sure
to describe this in your report and indicate how many trials were excluded that way (ideally using relative
frequencies and not absolute values). If you use other tricks to ensure the validity and quality of your
results (such as multiple fitting runs with different random starting values to avoid local optima or checks
for parameter identifiability) make sure to describe this as well.
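The exclusion step and the relative frequency of excluded trials could be computed along these lines. The data frame demo contains made-up response times standing in for med, and the object names keep, prop_excluded, and demo_clean are illustrative.

```r
# Made-up response times in seconds, standing in for med$rt
demo <- data.frame(
  id = rep(1:2, each = 5),
  rt = c(0.12, 0.48, 0.95, 1.70, 3.10, 0.30, 0.26, 2.49, 2.60, 0.80)
)

# Keep trials between 250 ms and 2.5 s
keep <- demo$rt >= 0.25 & demo$rt <= 2.5

prop_excluded <- mean(!keep)   # overall relative frequency of excluded trials
demo_clean <- demo[keep, ]     # data used for fitting

round(100 * prop_excluded, 1)  # percentage excluded overall
tapply(!keep, demo$id, mean)   # proportion excluded per participant
```

Reporting the per-participant proportions as well can reveal whether exclusions are concentrated in a few individuals.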
As described above, your answer should also include some sentences on the adequacy of the model. How well
does it describe the data? A common way to evaluate this across participants is to plot observed and
predicted response proportions in a scatter plot and report their correlation. If the model fits the data well,
the correlation should be quite high (around .8 or .9) and most points should be near the main diagonal.
One could do the same for the observed and predicted RT quantiles (e.g., the median). As an example,
see Trueblood et al. (2018), Figure 4 (p. 9).
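The scatter plot and correlation described above might be produced as in the following sketch. Both vectors are random placeholders, not results; in an actual analysis, observed would hold each participant's observed response proportion and predicted the proportion implied by that participant's fitted parameters.

```r
set.seed(123)

# Placeholder values, NOT real results: one point per participant
observed  <- runif(30, 0.3, 0.9)  # observed proportion of "blast" responses
predicted <- pmin(pmax(observed + rnorm(30, 0, 0.05), 0), 1)  # stand-in predictions

r <- cor(observed, predicted)

plot(observed, predicted, xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Observed proportion of blast responses",
     ylab = "Predicted proportion of blast responses",
     main = sprintf("r = %.2f", r))
abline(0, 1, lty = 2)  # main diagonal: perfect agreement
```

The same layout works for RT quantiles by swapping in observed and predicted medians and adjusting the axis limits.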
If you see additional reasonable ways to extend this analysis and you have not yet reached 2500 words for
Section 1, feel free to add such additional analyses. However, make sure they address the research question(s).
References
Trueblood, J. S., Holmes, W. R., Seegmiller, A. C., Douds, J., Compton, M., Szentirmai, E., . . . Eichbaum,
Q. (2018). The impact of speed and bias on the cognitive processes of experts and novices in medical
image decision-making. Cognitive Research: Principles and Implications, 3(1), 28. doi: 10.1186/s41235-018-0119-2
