程序代写案例-COMP2521-Assignment 2
COMP2521: Assignment 2
Social Network Analysis
A notice on the class web page will be posted after each major revision. Please check the class notice board and this assignment page
frequently. The specification may change.
Change log:
[08/04 16:50] Removed some incorrect comments and preprocessor instructions from the provided .c files.
Objectives
to implement graph based data analysis functions to mine a given social network
to give you further practice with C and data structures
Admin
Marks 20 marks
Individual
Assignment This assignment is an individual assignment.
Due 11:59:59pm Friday 23 April 2021 (Friday of Week 10)
Late Penalty 2 marks per day off the ceiling. The latest you can submit the assignment (with late penalty) is 8pm Monday 26April 2021.
Submit Read instructions in the Submission section below.
Aim
In this assignment, your task is to implement graph-based data analysis functions to mine a given social network. You should start by
reading the Wikipedia entries on these topics:
Floyd-Warshall algorithm
Edge betweenness and community structure
The main focus of this assignment is to calculate measures that could identify "influencers", "followers", etc., and also discover possible
"communities" in a given social network.
Dos and Don'ts!
Please note that:
For this assignment you can use source code that is available as part of the course material (lectures, exercises, tutes and labs).
However, you must properly acknowledge it in your solution.
All the required code for each part must be in the respective *.c file. You may not create additional C files.
You may implement additional helper functions.
Helper functions should be declared as static.
After implementing the functions in FloydWarshall.h and CentralityMeasures.h, you may use these functions in other tasks.
You must not modify any .h files.
None of your submitted files should contain a "main" function.
If you have not implemented any part, you must still submit an empty file with the corresponding file name.
Provided Files
We have provided an implementation of a directed graph ADT that you should use. However, you must not modify Graph.h or Graph.c.
Further, you should only interact with a graph via the functions and data structures in Graph.h.
Also note:
you may assume that all edge weights will be greater than zero.
10/4/21 下午1:11
⻚码:1/4
you may assume that graphs will not contain parallel edges or self-loops.
Download files:
Ass2_files.zip - after you have downloaded the files, you should remove _template from the name of all .c files (the _template was
added as a measure to prevent you from overwriting your work if you redownload the files). If you are connected to CSE, you can
download the files from the command-line using the command:
Testing files will be released later this week.
Part 1: Floyd-Warshall algorithm
In order to discover "influencers", we need to repeatedly find shortest paths between all pairs of nodes. In this section, you need to
implement the Floyd-Warshall algorithm, which is a shortest-path algorithm for graphs. Unlike Dijkstra's algorithm, which is a single-
source, shortest-path algorithm, the Floyd-Warshall algorithm computes the shortest distances between every pair of vertices in a graph.
The algorithm begins with the following observation: If the shortest path from to passes through , then the partial paths from to and
to must be minimal as well. A similar idea is adopted in the Warshall algorithm (in Week 7 lecture) for constructing the transitive closure
matrix of a graph.
Consider a graph with vertices numbered 1 through (note: in the provided files, vertices are numbered from 0 to ). In the
-th step, let be a function that returns the shortest path from to that only uses vertices from the set
as intermediate vertices along the way in the graph. In the -th step, the algorithm will then have to find the shortest
paths between all pairs using only the vertices from .
For all pairs of vertices it holds that the shortest path must either only contain vertices in the set , or otherwise must be a
path that goes from to via vertex . This implies that in the -th step, the shortest path from to either remains
or is being improved to , depending on which of
these paths is shorter. Therefore, each shortest path remains the same, or contains the node whenever it is improved. In each iteration,
all pairs of nodes are assigned the cost for the shortest path found so far:
This formula is the heart of the Floyd-Warshall algorithm. The algorithm works by first computing for all pairs
for , then , and so on. This process continues until , and we have found the shortest path for all pairs using any
intermediate vertices.
The Floyd-Warshall algorithm typically only provides the lengths of the shortest paths between all pairs of vertices. We can construct the
shortest path between any two endpoint vertices and by recording the "next vertex" of along the shortest path to . The idea of
recording the "next vertex" is similar to the idea of recording the "predecessor vertex" in Dijkstra's Algorithm. Note that multiple shortest
paths may exist between two vertices. For simplicity, in this assignment, we will only consider the first found shortest path between two
vertices. In the lectures, we will discuss how to construct all of the shortest paths between two vertices in Dijkstra's Algorithm.
Your task: In this section, you need to implement the following file:
FloydWarshall.c
See FloydWarshall.h for details on the data structure that needs to be returned. Note: if there is no path from vertex to vertex , the
distance from to should be set to infinity, defined in FloydWarshall.h, and the next vertex of vertex along the (non-existant) path
to should be set to -1. For all vertices , the next vertex along the path from to itself should be set to -1.
Part 2: Edge Betweenness Centrality
Centrality measures play a very important role in analysing a social network. For example, vertices with higher "betweenness" measures
often correspond to "influencers" in the given social network. In this part you will implement a well known centrality measure, edge
betweenness, for a given graph.
Edge betweenness is an indicator of highly central edges in networks. Mathematically, betweenness centrality of an edge is the sum of
the fraction of all-pairs shortest paths that pass through :
cp /web/cs2521/21T1/assigns/ass2/Ass2_files.zip .
i j k i k
k j
G N N − 1
(k− 1) shortestPath(i, j, k− 1) i j
{1, 2, . . . , k− 1} k
(i, j) {1, 2, . . . , k}
{1, . . . , k− 1}
i j k k i j
shortestPath(i, j, k− 1) shortestPath(i, k, k− 1) + shortestPath(k, j, k− 1)
k
shortestPath(i, j, k) =
min[shortestPath(i, j, k− 1), shortestPath(i, k, k− 1) + shortestPath(k, j, k− 1)]
shortestPath(i, j, k) (i, j)
k = 1 k = 2 k = N (i, j)
i j i j
v w
v w v
w v v
e
e
cB(e) = ∑
s,t∈V , t is reachable from s
δ(s, t|e)
δ(s, t)
( , ) ( , | )
10/4/21 下午1:11
⻚码:2/4
where is the set of nodes, is the number of shortest paths from to , and is the number of those paths passing
through edge .
In this assignment, as we have made an assumption that there is at most one shortest path between each pair of vertices, the formula can
be simplified to the following:
Note that due to our assumption, will be 1 if the shortest path from to passes through , and 0 otherwise.
Your task: In this section, you need to implement the following file:
CentralityMeasures.c
See CentralityMeasures.h for details on the data structure that needs to be returned. It is highly recommended that you use the
Floyd-Warshall algorithm from Part 1 to compute the shortest paths between vertices. Note: the edge betweenness of a non-existant edge
should be set to -1.
Part 3: Discovering Community
Discovering communities in a network, known as community detection/discovery, is a fundamental problem in network science. A network
is said to have community structure if the nodes of the network can be easily grouped into sets of nodes such that each set of nodes is
densely connected internally.
The Girvan-Newman algorithm detects communities by progressively removing edges from the original network. The connected
components of the remaining network are the communities. The Girvan-Newman algorithm focuses on edges that are most likely
"between" communities. It adopts the "edge betweenness" centrality introduced in Part 2. The algorithm's steps for community detection
are summarized below:
1. Calculate the edge betweenness of all edges in the network.
2. Remove the edge(s) with the highest edge betweenness.
3. Recalculate the edge betweenness of all edges affected by the removal.
4. Repeat Steps 2 and 3 until no edges remain.
After the removal of each edge, we only have to recalculate the betweennesses of those edges that were affected by the removal, which is
at most only those in the same component as the removed edge.
The end result of the Girvan-Newman algorithm is a dendrogram. As the Girvan-Newman algorithm runs, the dendrogram is produced from
the top down (i.e. the network splits up into different communities with the successive removal of links). The leaves of the dendrogram are
individual nodes.
Your task: In this section, you need to implement the following file:
GirvanNewman.c
See GirvanNewman.h for details on the data structure that needs to be returned. It is highly recommended that you use the edge
betweenness algorithm from Part 2 to help you calculate the edge betweenness of each edge.
Assessment Criteria
Part 1: Floyd-Warshall Algorithm (25%)
Part 2: Edge Betweenness Centrality (25%)
Part 3: Discovering Community (30%)
Style: (20%)
Submission
You need to submit the following files and only these files:
FloydWarshall.c
CentralityMeasures.c
GirvanNewman.c
V δ(s, t) s t δ(s, t|e)
e
cB(e) = ∑
s,t∈V , t is reachable from s
δ(s, t|e)
δ(s, t|e) s t e
10/4/21 下午1:11
⻚码:3/4
You can submit from the command-line using the following command:
or you can submit via WebCMS.
Note: Currently, submission does not perform any dryrun testing. Thus, if you submit now, it is your responsibility to ensure that your code
compiles and behaves as expected.
As mentioned earlier, please note that:
For this assignment you can use source code that is available as part of the course material (lectures, exercises, tutes and labs).
However, you must properly acknowledge it in your solution.
All the required code for each part must be in the respective *.c file. You may not create additional C files.
You may implement additional helper functions.
Helper functions should be declared as static.
After implementing the functions in FloydWarshall.h and CentralityMeasures.h, you may use these functions in other tasks.
You must not modify any .h files.
None of your submitted files should contain a "main" function.
If you have not implemented any part, you must still submit an empty file with the corresponding file name.
Plagiarism
This is an individual assignment. Each student will have to develop their own solution without help from other people. You are not permitted
to exchange code or pseudocode. If you have questions about the assignment, ask your tutor. All work submitted for assessment must be
entirely your own work. We regard unacknowledged copying of material, in whole or part, as an extremely serious offence. For further
information, read the Student Conduct.
give cs2521 ass2 FloydWarshall.c CentralityMeasures.c GirvanNewman.c
10/4/21 下午1:11
⻚码:4/4

欢迎咨询51作业君
51作业君 51作业君

Email:51zuoyejun

@gmail.com

添加客服微信: ITCSdaixie