CITS3402 Assignment 2 2019: Shortest Paths

Nicholas Pritchard

September 2019

1 Details

• Due date: Fri 25/10 11:59pm

• Worth: 25%

• You are allowed to undertake this assignment as a pair or individually

The goal of this project is to design an implement parallel algorithms to solve

the all-pairs-shortest path problem for a number of large graphs using either

Dijkstra’s or Floyd-Warshall algorithms. You will use C with the Message-

Passing Interface (MPI) to write parallelized code. You will also need to observe

and comment on how speedup is affected by changing the size of the problem and

the number of nodes you use. A reasonable range for the size of your problems

is 1024 to 4096 vertices. A reasonable range for the number of processors is one

to 16.

Your submission will include:

1.1 Code

A copy of all source files, job scripts and build-files used. Source files must in-

clude your names and student numbers at the top. Your program must compile,

execute correctly and be well documented. Make sure to include comments to

explain any MPI function the first time it is used.

1.2 Report

Alongside your code you will also need to write a report addressing the following

points:

• How your approach partitions data and processing

• A table showing speedup for graphs of 2048 vertices vs 1, 2, 4, 8, 16 pro-

cessors

• A table showing speedup for 4 processors vs graphs of 256, 512, 1024, 2048, 4096

vertices.

1

When discussing the performance of your approach, try to relate performance to

various factors in your computation (number of compute nodes, load balancing,

communication overhead etc.)

2 Code Requirements

File Reading

Your solution should read a matrix description from file. It is acceptable to

implement this with or without flags, for example:

mpirun -n 4 ./my_solution -f file.in

is acceptable as is

mpirun -n 4 ./my_solution file.in

The input file format is essentially a flattened adjacency matrix for a di-

rected, weighted graph. All weights are positive integers. These are binary files

(not text files) and will need to be read in as such. More specifically, each file

contains:

• The number of vertices (numV)

• numV x numV integers specifying the weights for all connections

You can read all the information needed by reading the first integer to determine

how many vertices there are in the graph, then read numV x numV integers into

a freshly allocated array. For example the graph (assuming nodes are labelled

from 0 upwards)

2

corresponds to the following adjacency matrix

0 15 1 1

0 0 3 0

1 3 0 0

0 1 1 0

(1)

which would be specified by the file example.in

4 0 15 1 1 0 0 3 0 1 3 0 0 0 1 1 0

Being a binary file however, you would not be able to ’read’ the files yourself as

such. We have added an .txt files for all of our example inputs if you would

like to read them yourself to make sure you are correct.

File Writing

The output of either algorithm should be a matrix containing the length of the

shortest path from vertex i to vertex j. This file has the same form as the input

files. That is, a single array length numV*numV + 1 where the first integer is the

number of vertices in the graph. The solution for the example.in file would be

example.out

4 0 2 1 1 4 0 3 5 1 3 0 2 2 1 1 0

You can choose to write this file by default, or enable this feature with command

line arguments. In any case, specify your choice at the top of your code file.

3 Hints

• When timing your program for speedup analysis do not count the time

taken for file operations.

• Passing command line arguments to MPI programs can be a little tricky

compared to standard programs, be careful.

• Dijkstra’s algorithm is made very concise by using a priority queue. Since

C has no default data structure library you are not required to implement

one. A simpler implementation will be accepted.

• A hint for parallelizing Dijkstra’s is that it is a single-source sortest path

algorithm

• A hint for parallelizing the Floyd-Warshall algorithm is that it can be

viewed as iterating over 2D distance matrix.

3

Marking Rubric

Critera Highly

Satisfactory

Satisfactory Unsatisfactory

File Reading (5) Program is able to

read input files as

specified.

Reads in

adjacency files

correctly.

Attempts to read

adjacency files.

No attempt is

made to read

adjacency files.

File Writing (5) Program is able to

write output files

as specified.

Write output files

correctly and

efficiently.

Attempts to write

output files but are

either poorly

formatted or

written

inefficiently.

No attempt is

made to write

output files.

Dijkstra’s

Algorithm or

Floyd Warshall

(60)

Dijkstra’s all-pairs

shortest path or

the Floyd-

Warshall

algorithm is

implemented

correctly in a

parallel fashion.

MPI calls are

explained.

Algorithm is

implemented

correctly and

efficiently.

Comments

indicate an

understanding of

the parallelism

present.

Algorithm is

either incorrect for

some cases or

inefficient.

Parallelism is

attempted but not

efficiently.

Algorithm is

implemented

incorrectly for all

cases or missing.

Parallelism is not

attempted.

Report (30) Addresses all

required points

and is presented

clearly. Testing is

thorough and

clear.

Report addresses

all points

effectively and is

presented clearly.

Performance

testing is thorough

and meaningful.

Report does not

address all points

required or some

minor readiblity

issues.

Report addresses

few points or is

very difficult to

read.

A Graphs

Graphs and graph theory provides a natural way to model many problems.

Power-grids, transport systems, social structures and nervous systems can all be

modelled by large graphs for example. For completeness we provide a definition

of a graph below:

• A graph G is specified by two sets V and E

• V is a finite set of points called vertices

• E is a finite set of edges

• An edge e is an ordered pair of two vertices (u, v) indicating that vertex

u is connected to vertex v.

• Edges can be weighted or un-weighted. For our assignment, edges are

weighted by positive integers. This forms an extra set of weights w to go

along with our vertices and edges.

• Graphs can be directed or un-directed (for every (u, v) there exists an edge

(v, u))

• Vertices u and v are adjacent to each other if there is an edge (u, v) or

(v, u) present

• A path from vertex u to vertex v is a sequence of vertices following edges

that exist in our graph.

• The shortest-path from vertex u to vertex v in a weighted graph is a

path with minimum sum-weight

B All-Pairs Shortest Paths

For a weighted graph, the all-pairs shortest paths problem is to find the

shortest path between all pairs of vertices.

C Dijkstra’s Algorithm

Dijkstra’s algorithm finds the shortest paths from a starting vertex s to all

other vertices in the graph. It incrementally selects edges that appear closest

to a vertex to build a set of paths, updating the length of a path if a superior

option is found. For more reading, you can start with 1, 2 or 3. An all-pairs

shortest path implementation is found by running Dijkstra’s for each possible

source.

5

D Floyd-Warshall Algorithm

The Floyd-Warshall algorithm is an all-pairs shortest path algorithm which con-

siders the shortest paths possible over zero ’hops’ upwards. Trivially a path of

a single hop is the adjacency of all vertices. The computation of all subsequent

paths is based on the following observation.

For three vertices i, j, k the shortest path from i to j either includes the

paths i to k plus k to j or it does not. This boils down to exceedingly concise

pseudo-code. For more reading consult 1 2 3

6