23/02/2020 Scientific Computation Project 2 (last updated 19/2) — M345SC2020 1.0 documentation

https://m345sc.bitbucket.io/p2.html#p2 1/4

Scientific Computation Project 2 (last

updated 19/2)

Clarifications

Part 1.1 (19/2): A simple example Alist for a 3-node, 2-link network: Alist = [[2,1],[0],[0]]

Part 1.2 (19/2): If no path exists between start and dest, return an empty list.

Part 1.3 (19/2): You may assume that at least 2 in-network stations have a cycle route between them.

Part 2.2i) (19/2): Aside from i at node x, all variables should be set to zero at t=0

You are not required to use latex or the provided latex template. Any pdf with a simple, clear presentation

is perfectly fine.

Due date/time: 2/3/2020, 18:00 GMT

This project consists of two parts – in the first, you will be developing/analyzing algorithms on graphs, and

in the second you will be analyzing dynamical processes on graphs.

Getting Started: Template files have been added to the course repo in the project2/ directory. You can

also download the files directly from here: | part1_template.py | part2_template.py |

project2_template.tex

Place these files in a folder, and create copies of the template files named part1_dev.py and part2_dev.py

in this new directory. Ultimately, the final files that you submit should be titled part1.py, part2.py, and

project2.pdf. Browse through the Python template files; there are a few function headers, and you will

have to add code for each of these functions as described below. First, however, you should add your 8-

digit college id to the docstring at the top of each file.

Part 1.

You are tasked with analyzing public transportation networks. You are allowed to use the collections mod-

ule for part 1, please do not use any other external modules.

For each of the questions below, you should develop a correct, efficient algorithm, implement the algo-

rithm in Python, and discuss the running time and efficiency of your implementation in your pdf. Your dis-

cussion should include an estimate of the asymptotic running time and a concise description of how you

constructed this estimate. You should also include a brief description of each algorithm.

1) (2 pts) You are given a set of airports and connections corresponding to a air transportation network.

Specifically, you know if there is a direct connection between any two airports in the network. Complete

the function, flightLegs, so that it efficiently computes the minimum number of flights required to travel

between two distinct airports on the network. The function should also efficiently determine the number of

distinct routes between the two airports which require this minimum number of flights. The function is

provided an N- element adjacency list, Alist, as input, and airports are numbered from 0 to N-1. The ith

element of Alist is a list containing the airports which can be reached directly from airport i. The network

is symmetric, so if i can be reached directly from j, then j can be reached directly from i. The starting

(start) and destination (dest) airports are also provided. The function output should be a two-element list

containing the minimum number of flights required to travel between start and dest and the number of

distinct journeys between start and dest with this minimum number.

2) (5 pts) You will now analyze the journeys on a subway network. You are provided with a list of stations

and direct routes between stations.

i) You are also provided with densities for the network where corresponds to the average number of

people on a direct journey from station i to j (no stations other than i and j are visited during a direct jour-

ney). For convenience, we will assume that . A safety factor, , for a journey between stations a

=

23/02/2020 Scientific Computation Project 2 (last updated 19/2) — M345SC2020 1.0 documentation

https://m345sc.bitbucket.io/p2.html#p2 2/4

and b (not necessarily a direct journey) is defined to be the maximum density encountered during the jour-

ney (lower S corresponds to higher safety). Given a starting and destination station, develop an algorithm

to efficiently find the safest journey between the two stations. Implement your method in safeJourney in

part1.py. Along with the starting and destination stations, the function is provided an adjacency list, Alist,

for the network. The ith element of the list is a list of tuples. Each tuple contains a station which can be di-

rectly reached from i, and the density for travel between that station and i. The function should return a

list containing the sequence of stations encountered during the safest journey and the safety factor for the

journey. E.g., if the function returns [[1,5,3],10] then the full journey is 1 -> 5 -> 3 and the safety factor of

the journey is 10. If multiple journeys produce the same minimum S, you can return any one of those jour-

neys. If no path exists between the starting and destination stations, return an empty list.

ii) Now consider the fastest journeys between stations start and dest. You are provided with durations of

journeys between stations i and j, (rounded to the nearest minute and ), and if there are multi-

ple shortest journeys, you should select the one with the least number of steps. Complete shortJourney so

that it efficiently finds this fastest journey and returns the stations included in the journey (as a list) and

the duration of the journey. If no path exists between the starting and destination stations, return an emp-

ty list. The overall setup of the function is very similar to safeJourney, and further details are provided in

the function docstring.

3) (3 pts) You are provided a network of train stations and, for a given station outside of this network (x),

you know the minimum fare for reaching each in-network station from x, and you also know the minimum

fare for returning to x from any in-network station. You are also given a list of 2-way dedicated cycling

routes connecting pairs of in-network stations. Complete the function, cheapCycling, so that it efficiently

finds the cheapest cyling journey and returns the pair of distinct in-network stations corresponding to this

journey. Here, a journey consists of 1) arrival at an in-network station from x, 2) some amount of cost-free

cycling using dedicated routes, and 3) return to x from an in-network station which is distinct from the ar-

rival station. The train stations are numbered from 0 to N-1, and the function is provided two N-element

lists as input (Slist and Clist). The ith element of Slist contains two numbers – the minimum arrival and

return fares for routes between stations i and x. Clist is an adjacency list. Its ith element contains a list of

integers which correspond to in-network stations that can be reached directly by bicycle from station i.

Note for Part 1: You are not allowed to use heapq, however, if you determine that the best approach to a

problem requires a binary heap, you should choose this approach, implement it without a heap, and then

discuss the benefits of the heap (and its influence on the asymptotic running time) in your analysis.

Part 2.

1) (5 pts) For this question, you will investigate random walks on unweighted, undirected graphs. One step

of a random walk is defined as follows. The probability of a walker moving from node i to node j is

where is the adjacency matrix for the graph, and is the degree of node i.

i) Complete rwgraph so that it efficiently simulates M Nt- step random walks on the NetworkX graph pro-

vided as input. All walkers should start at the same node with the this node set via input variable, i0. Pro-

vide a 1-2 sentence description of your algorithm for 1 simulation step in your pdf.

For the next two questions, you should use Barabasi-Albert graphs with n=2000 and m=4 (see NetworkX

documentation for the definitions of m and n). You should also fix seed so that the same graph is generat-

ed each simulation.

ii) Analyze your simulation results as Nt and M are increased. The initial positions for all walkers should

be the node with maximum degree. If multiple nodes have the same maximum degree, select the node with

the smallest node number. Compare large-Nt results to the node degrees for the graph. Make 1-2 figures

supporting your main findings, and place them along with accompanying discussion in your pdf. The code

used to generate the figures should be placed in rwgraph_analyze1

iii) Recall that linear diffusion on an unweighted graph was defined using the graph Laplacian matrix,

where Q is a diagonal matrix with the graph degrees on the diagonal, and A is the adjacency

matrix. The scaled Laplacian, , is defined as . Consider the linear spreading produced

by both this operator and . Investigate if any one of these three diffusion operators produces dynamics

with significant similarities to your simulated random walk results (all generated using the same B-A

=

= /

= −

= −

−1

23/02/2020 Scientific Computation Project 2 (last updated 19/2) — M345SC2020 1.0 documentation

https://m345sc.bitbucket.io/p2.html#p2 3/4

graph). Identify 2-3 key points and provide explanations of what you have found in your pdf. These may be

accompanied by figures as needed. Place any relevant code in rwgraph_analyze2. You are not required to

establish quantitative correspondence between simulated diffusion and random walks on a time-step by

time-step basis.

2. (5 pts) Consider the following two models for transport processes on graphs:

Model A:

Model B:

Here, is the Laplacian matrix and is the adjacency matrix for an N-node graph.

i) Complete modelA so that it computes simulations of modelA on the NetworkX graph provided as input.

Complete the function, RHS, so that it computes and returns the right-hand side of the model equations

(see the function docstring). You should assume that RHS will be called a very large number of times with-

in one call to model A and construct your code accordingly. Construct an estimate of the number of opera-

tions required to compute in one call to RHS. Add this estimate to your pdf along

with a concise explanation of how it was constructed. Add the needed code below RHS to simulate the

model for Nt time steps from t=0 to t=tf with the initial condition at node x with x and provided

as input, and finally return .

ii) Investigate the similarities and differences between the dynamics generated by model A, model B, and

linear diffusion on Barabasi-Albert graphs. You can initially consider , , and ,

though you are free to vary these parameters as you like. You should initially (at t=0) set all variables to

zero except for i at the node with maximum degree. You should design your own approach to the problem,

and identify 3-4 key points, carefully discuss them (and provide explanations for highlighted trends), and

produce a few figures illustrating numerical results related to the key points. Among the points you could

consider is the initial spread of a perturbation placed at some node x. Add code for generating your figures

to the function, transport and add the figures and accompanying discussion to your pdf. It is sufficient to

consider B-A graphs with n=100 and m=5. Add code in the name==main portion of the code to call trans-

port and generate the figures you are submitting with your assignment

Notes for Part 2:

You may use numpy, scipy, matplotlib, and networkX for part2. Do not use any other modules without

the instructor’s permission.

You should design your codes to efficiently work with large complex networks. You may assume that

where N is the number of nodes and L is the number of links. You may also assume

that all graphs in this part consist of one connected part and that there are no self-loops.

You do not need to account for the randomness of Barabasi-Albert graphs by, say, running simulations

with several graphs and averaging the results.

Further Notes:

1. Marking will consider both the correctness and efficiency of your code as well as the soundness of your

analysis. For part 2, consideration of efficiency will focus on computationally intensive tasks. Code for gen-

erating figures or processing simulation results are unlikely to be closely examined.

2. All figures created by your code should be well-made and properly labeled.

= − +

(

1 −

)

∑

=0

−1

=

=

(

−

)

∑

=0

−1

/, = 0, . . . , − 1

=

0

0

()

= 0.01 = 0.5 = 0.1

≪ ≪

2

23/02/2020 Scientific Computation Project 2 (last updated 19/2) — M345SC2020 1.0 documentation

https://m345sc.bitbucket.io/p2.html#p2 4/4

3. In order to assign partial credit, markers must be able to understand how your code is organized and

what you are trying to do.

4. Lab 6 contains an exercise on simulation of 1-D random walks.

5. You should aim to limit your discussion for part 1 to 2 pages. For part 2, you should aim to fit your dis-

cussion within 5 pages including up to 6 figures. These constraints are provided for guidance, and marks

will not be deducted for exceeding these limits.

6. Add additional functions as needed, but these functions should not use any external modules outside of

those explicitly allowed for each part.

7. You may use codes provided in lectures and labs, but you should not assume that these codes are

“efficient”.

8. Please be sensible with the amount of time you devote to the open-ended portions of the assignment. If

it seems that your approach for, say, question 2.2 will require an extraordinary amount of time, it may be

helpful to consult with the instructor.

Submitting the assignment

To submit the assignment for assessment, go to the course blackboard page, and find the link for “Project

2” Click on this link and then click on “Write Submission” and add the statement: “This is all my own un-

aided work unless stated otherwise.”

Click on “Browse My Computer” and upload your final files, part1.py, part2.py, and project2.pdf. Finally,

click “Submit”.