MLIM Information Retrieval Theory and Practice
Annotated Knowledge Graph
In this assignment, you will create an annotated knowledge graph to explore the process of
information retrieval. In particular, you will explore how different pieces of information are
interconnected and how knowledge is embedded in a networked fashion. It is a graded task
that gives you an opportunity to explore a topic of interest in a systematic way with the
support of the network theories we have discussed in class.
Ask a Question
You will need to identify one question that can be answered by empirical research. That is,
the question has to be, at least somewhat, answerable through acquiring experiences from the
real world (e.g., observation, measurement, surveying, experimentation). You do not need
to have the best research question before you start. Refining the question is often an iterative
process facilitated by searches.
Identify Relevant Academic Publications
You will identify 5-10 relevant academic publications (e.g., journal articles, book chapters,
conference proceedings) about the question you have asked. This will most likely not be a
one-shot search in which you pick the 5-10 highest-ranked papers. Please document the process
and iterations of your searches (e.g., search engines, queries, results, evaluation criteria). You
will have to read these papers for later discussion.
Once you have a collection of relevant documents, write one or two sentences about each
article to describe what it is about. At the same time, document the basic information of the
documents (e.g., author, title, keywords). You can make some adjustments if needed, for
example, combining some very similar keywords or using the thesauri/descriptors provided by
the bibliographic database.
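The keyword clean-up described above can be sketched in Python. This is only a minimal illustration, not a required method: the example record, its keywords, and the SYNONYMS mapping are hypothetical placeholders that you would replace with your own data or with the descriptors from your bibliographic database.

```python
# Hypothetical synonym map: variant keyword -> preferred term.
SYNONYMS = {
    "game based learning": "game-based learning",
    "gbl": "game-based learning",
}

def normalize_keyword(keyword):
    """Lowercase, collapse whitespace, and map variants to one preferred term."""
    cleaned = " ".join(keyword.lower().split())
    return SYNONYMS.get(cleaned, cleaned)

def normalize_record(record):
    """Return a copy of a document record with de-duplicated, sorted keywords."""
    keywords = sorted({normalize_keyword(k) for k in record["keywords"]})
    return {**record, "keywords": keywords}

# Hypothetical example record (not a real publication).
paper = {
    "author": "Doe, J.",
    "title": "An Example Study",
    "keywords": ["GBL", "Game based learning", "  Motivation "],
}
print(normalize_record(paper)["keywords"])
```

Keeping the mapping in one place makes it easy to report which keywords you merged when you document your process.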
Network Analyses
There are a few different network analyses you can choose to do once you have your data.
Choose one or more from the following list:
(1) Co-occurrence of keywords/words in title/abstract: Understand how topics are
linked and grouped.
(2) Collaborator network: Understand how the authors work with each other. You
may pull more related articles from some authors’ profiles.
(3a) Co-citation network: Explore how these publications may be connected through
ConnectedPapers (https://www.connectedpapers.com/).
(3b) Co-citation network: Analyze the reference lists and how the knowledge base of
these publications is connected.
Zhichun Liu, Ph.D.
(4) Multimodal network: Combine the networks mentioned above.
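As an illustration of option (1), the sketch below counts how often each pair of keywords appears in the same document and prints the result as a weighted edge list. It is one possible approach under simple assumptions, not the required one: the document identifiers and keywords are hypothetical, and each keyword is counted at most once per document.

```python
from collections import Counter
from itertools import combinations

# Hypothetical keyword lists, one per document.
doc_keywords = {
    "paper_a": ["feedback", "motivation", "game-based learning"],
    "paper_b": ["feedback", "motivation"],
    "paper_c": ["motivation", "assessment"],
}

def cooccurrence_edges(doc_keywords):
    """Count how many documents contain each unordered keyword pair."""
    counts = Counter()
    for keywords in doc_keywords.values():
        # sorted(set(...)) gives every pair one canonical order per document.
        for pair in combinations(sorted(set(keywords)), 2):
            counts[pair] += 1
    return counts

edges = cooccurrence_edges(doc_keywords)
for (source, target), weight in sorted(edges.items()):
    print(f"{source},{target},{weight}")
```

The printed source, target, weight rows form an edge list, a format that most network tools can read once it is saved as a file.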
You are encouraged to be creative (e.g., connect the keywords with documents and then
use tf-idf to assign weights to the links)!
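The tf-idf idea mentioned above can be sketched as follows. This uses one common form of the weight, tf × log(N / df), where tf is a keyword's share of a document's terms, N is the number of documents, and df is the number of documents containing the keyword. Other variants exist, and the documents and terms here are hypothetical.

```python
import math
from collections import Counter

# Hypothetical documents represented as lists of keyword tokens.
docs = {
    "paper_a": ["feedback", "feedback", "motivation"],
    "paper_b": ["motivation", "assessment"],
    "paper_c": ["motivation", "feedback"],
}

def tfidf_edges(docs):
    """Return {(document, keyword): weight} using tf * log(N / df)."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each keyword.
    df = Counter(term for terms in docs.values() for term in set(terms))
    weights = {}
    for doc_id, terms in docs.items():
        counts = Counter(terms)
        for term, count in counts.items():
            tf = count / len(terms)
            weights[(doc_id, term)] = tf * math.log(n_docs / df[term])
    return weights

weights = tfidf_edges(docs)
for (doc_id, term), weight in sorted(weights.items()):
    print(f"{doc_id},{term},{weight:.3f}")
```

In this form, a keyword that appears in every document receives a weight of zero, which reflects that it does not help distinguish one document from another.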
Depending on your preference, you can choose one of the following:
(1) Old-fashioned paper-and-pencil or sketch board
(2) Online visualization tool, such as Nocodefunctions (https://nocodefunctions.com/)
or Flourish (https://app.flourish.studio/)
(3) Local network analysis software, such as Gephi (https://gephi.org/) or UCINET
(https://sites.google.com/site/ucinetsoftware/home?pli=1)
(4) Network analysis packages such as igraph in R or NetworkX in Python
The latter two options will allow you to analyze the network data quantitatively, but the choice is
up to you.
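To give a sense of what a quantitative analysis involves, the sketch below computes two basic measures, degree centrality and density, in plain Python. Packages such as igraph and NetworkX provide these measures built in, so this is only meant to show what they calculate. The edge list is hypothetical, and the code assumes an undirected network with at least two nodes and no self-loops.

```python
from collections import defaultdict

# Hypothetical undirected edge list (e.g., co-authorship ties).
edges = [("Ann", "Bo"), ("Ann", "Cy"), ("Bo", "Cy"), ("Cy", "Di")]

def build_adjacency(edges):
    """Map each node to the set of its neighbors."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency

def degree_centrality(adjacency):
    """Degree divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}

def density(adjacency):
    """Share of all possible undirected ties that are present."""
    n = len(adjacency)
    m = sum(len(nbrs) for nbrs in adjacency.values()) / 2
    return 2 * m / (n * (n - 1))

adjacency = build_adjacency(edges)
centrality = degree_centrality(adjacency)
for node in sorted(centrality, key=centrality.get, reverse=True):
    print(f"{node}: {centrality[node]:.2f}")
print(f"density: {density(adjacency):.2f}")
```

Measures like these give you numbers to compare against your own impression of which authors, keywords, or papers are most central.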
Discussion
In the discussion section, you will reflect on the results of your searches and analyses.
Consider some of the following questions:
(1) Was your initial/revised search strategy effective? What does your impression from
reading the papers tell you about its effectiveness, and what do the analyses tell you
about it?
(2) How do your network analysis results compare to your impression from reading
the papers?
(3) Are there any interesting insights you might get from your network analyses?
(4) What are some implications of network analysis? Remember, this assignment is
only at the scale of 5-10 papers. Look beyond the scope of it.
Submission Guidelines
The deliverable of this assignment is a written essay that includes at least the four
components (i.e., question, search process and results, network analysis, and discussion). The
written narrative should not exceed 1200 words (excluding tables, appendices, and
references). The written narrative counts for 15% of your grade. The essay should be
submitted no later than February 18, 11:59 pm.