Introduction to Analysis of Algorithms October 16, 2020 Boston University CS 330 Professor Adam Smith, Dora Erdos Fall 2020 Homework 6 Solutions 1 Dijkstra and negative edges Usually, Dijkstra’s shortest-path algorithm is not used on graphs with negative-weight edges because it may fail and give us an incorrect answer. However, sometimes Dijkstra’s will give us the correct answer even if the graph has negative edges. (See Lab 4, question 1.2 for examples.) You are given graph G with at least one negative edge, and a source s. Write an algorithm that tests whether Dijkstra’s algorithm will give the correct shortest paths from s. If it does, return the shortest paths. If not, return ‘no.’ The time complexity should not be longer than that of Djiksta’s algorithm itself, which is ⇥(|E| log |V |). (Hint: First, use Djikstra’s algorithm to come up with candidate paths. Then, write an algorithm to verify whether they are in fact the shortest paths from s.) Solution: First, use Djiksta’s algorithm to come up with D, a dictionary of the lengths of the shortest paths from s to every node. Then, for each edge (u, v) in the graph, we check if D[v] D[u]+weight(u, v). If there is any edge where that inequality is not true, we reject. 1 Time complexity: Dijkstra’s algorithm takes ⇥(|E| log |V |) time complexity. Afterwards, we visit each vertex v in G, as well as every incoming edge to v. Since all edges point at exactly one vertex, we will examine each edge once. Thus, the time complexity of the second step is ⇥(|V |+ |E|), and the time complexity of the entire algorithm is ⇥(|E| log |V |). Proof of correctness: We need to prove two things: 1Technically, we should also handle the case where some vertices are not reachable so some of the distances are infinite. Disjktra will always get those correct, even if some edge weights are negative, so we will ignore that case here. One way to handle it in general is to just say adopt the convention that the inequality is true if D[u] =1. Algorithm 1: Does Djikstra Work(G(V,E), s) 1 D, tree = Djikstra(G, s); /* returns a dictionary of candidate shortest path lengths from s to each vertex v in V */ 2 for v in V do 3 for u in incoming edges(v) do 4 if D[v] > D[u] + weight(u, v) then 5 return ‘no’ // That is, we accept the edge if D[v] D[u] + weight(u, v) 6 return D, tree 1 Homework-6-sols 1 of 5 Claim 1: If Dijkstra’s algorithm returns the shortest paths, our algorithm will output those distances D. If Dijkstra’s produces the shortest paths, we know that for every v in G, there is at least one u such that D[v] = D[u]+weight(u, v). (This u is the last node we encounter before v on the shortest path from the source to v.) The algorithm outputs ‘no’ only when it cannot find a node u that fits this criteria, or when it finds a shorter path than what is currently stored at D[v]. Therefore, if Dijkstra’s algorithm works, our algorithm would have no reason to reject it. Claim 2: If Dijkstra’s algorithm does not return the shortest paths, our algorithm will output ‘no.’ Proof: We start with one observation that will make our lives easier: Even if some edge weights are negative, Disjktra’s algorithm will never underestimate shortest path distances. The only mistakes it can make are overestimates. That is because the distances that Dijsktra produces are all actually the length of some path (recall that we only update a distance in Dijsktra when we find a better path to some vertex). So we only need to worry about distances that are higher than they should be. Consider a vertex t that is reachable from s. There is some true shortest path from s to t. Let’s say it passes through vertices u1, ..., uk, like this: s! u1 ! u2 ! · · ·! uk ! t. Let D⇤[t] be the length of this true shortest path (that is, D⇤[t] = w(s, u1)+w(u1, u2)+· · ·+w(uk, t)). We want to show that D[t] = D⇤[t]. Our algorithm checks each of these edges. So if it accepts the distances, then we know the following inequalities are true: D[u1] D[s] + w(s, u1) D[u2] D[u1] + w(u1, u2) ... D[uk] D[uk 1] + w(uk 1, uk) D[t] D[uk] + w(uk, t) If we add these inequalities, the terms D[u1], ...,D[uk] cancel (since they each appear once on each side) and we are left with D[t] D[s]|{z} This is always 0 in Dijkstra’s alg. + w(s, u1) + w(u1, u2) + · · ·+ w(uk, t)| {z } this is the true distance D⇤[t] = D⇤[t] Therefore, D[t] D⇤[t]. But we know Dijsktra’s algorithm can only make mistakes by overestimat- ing, that is D⇤[t] D[t] (since Dijkstra’s D[t] is always the length of some path from s to t). So the distance to t must be correct. Remark 1: Another way to prove correctness here would be to proceed by induction on the length of the true shortest path, and prove that D[ui] = D⇤[ui] for each of the vertices along the way from s to t. That prove doesn’t involve adding lots of inequalities, so some students might find it more intuitive. 2 Homework-6-sols 2 of 5 Remark 2: There is something pretty subtle in the proof above: we assume that a shortest path really does exist from s to every other vertex really does exist. But with negative-cost edges, that isn’t always true! If there is a negative-cost cycle (reachable from s), then one can always make paths to some vertices shorter by going around the cycle one more time. But our algorithm will always reject if there is a negative cost cycle. Suppose the cycle is v1, ..., vk. Then if our algorithm were to accept, it would ahve to be that D[vi+1] D[vi] + w(vi, vi+1) for all i = 1, ..., k 1, and that D[v1] D[vk] + w(vk, v1). Adding up those inequalities, all the distances D[vi] cancel out and we wind up with 0 w(v1, v2) + w(v2, v3) + · · ·+ w(vk 1, vk) + w(vk, v1). But that inequality says that the cycle’s total cost is not negative, so we get a contradiction. Thus our algorithm couldn’t have accepted. (In other words: if our algorithm accepts a set of distances, it must be that there are no negative-cost cycles.) 2 Scheduling a software project You are in charge of a software project that has n subprojects. Each subproject si takes ti time to complete. Some subprojects have dependencies, meaning that some subprojects cannot be started until certain other subprojects have been completed. Otherwise, any number of subprojects can be performed simultaneously. Write a linear-time algorithm that finds the shortest possible completion time for the project, as well as a schedule with start and finish times for each si that achieves the shortest total time. Here is an example: Subproject ti Dependencies s1 5 s4, s10 s2 20 None s3 5 s6 s4 40 None s5 15 s9 s6 10 s1, s10 s7 15 s2 s8 35 s2, s4, s7 s9 10 None s10 30 s2 Schedule Subproject Start time s2 0 s4 0 s9 0 s5 10 s7 20 s10 20 s8 40 s1 50 s6 55 s3 65 Total time: 75 (Hint: How can you turn this problem into a graph so that you may use an algorithm you already know as part of the solution?) 3 Homework-6-sols 3 of 5 Algorithm 2: Find Schedule(S : subprojects, T : times,D : dependencies) 1 G = create graph(S,T,D) /* make each subproject si into a vertex vi, make each dependency si on sj into a directed edge (vj , vi), give each edge vj , vi is tj the weight ti */ 2 D ={} /* this will be used to store the length of the longest path from a vertex with no incoming edges for each vi */ 3 T = topological sort(G) /* find the longest path from a node with no incoming edges to each node */ 4 for v in order of T do 5 max path length = 0 6 for u in incoming edges(v) do 7 max path length = maximum(D[u] + weight(u, v), max path length) 8 D[v] = max path length /* We will also make a dictionary of the finish times, saving the largest one as the total time to finish the entire project */ 9 F = {}; 10 total time = 0; 11 for vi in G do 12 F [vi] = D[vi] + ti; 13 total time = maximum(F [vi], total time); 14 RETURN total time,D,F Solution: First, form a graphG in which each vertex vi represents a subproject si, each dependency si on sj is a directed edge (vj , vi), and the weight of each edge vj , vi is tj . This will form a Directed Acyclic Graph (DAG). The longest possible path to each node in the DAG will represent the earliest possible start time of the subproject represented by that node. Next, we create a dictionary D to store the lengths of the longest paths to each vertex from a node with no incoming edges. We then perform a topological sort of graph G, which outputs list T . We will examine each node in the order of T . For every vertex v, we will find the incoming edge u such that D[u] + weight (u, v) has the largest value, and then store D[v] = D[u]+ weight(u, v). When this is finished, we will calculate D[vi] + ti for each node, keeping track of the largest one T , which is the shortest possible completion time for the software project. At the end, we return T , as well as D, which is our schedule of start times. Time complexity: Creating the graph is linear in the number of subprojects and dependencies, which becomes ⇥(|V |+ |E|) in the graph. Topological sort also takes ⇥(|V |+ |E|) time. Next, to find the maximum path lengths, we examine each vertex and each edge exactly once, which is also ⇥(|V | + |E|). Finally, we examine each vertex to find the end time for each subproject, which is ⇥(|V |). So the entire algorithm is linear, or ⇥(|V |+ |E|). Correctness: We will prove this in two parts. Claim 1: The longest path from a node without an incoming edge to vi in G gives us the earliest possible start time for vi. 4 Homework-6-sols 4 of 5 We prove this by induction: Base case: Without loss of generality, suppose v1 only has incoming edges coming from vertices that themselves do not have incoming edges. The longest edge to v1 comes from u, and we start subproject u at time 0. Since weight(u, v)is the time it takes for the subproject u to finish, and v1 cannot start until u is done, and all other prerequisite subprojects to v will finish sooner than u. Therefore, the earliest start time of v is weight(u, v). Inductive hypothesis: For every uj with an incoming edge to vi, the earliest start time for subproject uj is the same as the distance of the longest possible path from a node with no incoming edges in G. Prove for vi: Each prerequisite subproject uj will finish at time D[uj ] + tj , and the largest of these is the time at which the last prerequisite will finish and vi can start. By taking into account the longest possible paths to each uj , this also represents longest possible path to vi. Claim 2: Adding path lengths in the order of topological sort as in Algorithm 2 gives us the longest path from a node with no incoming edges. Topological sort puts vertices in a order such that for every vertex v that follows vertex u in the sort, there is no directed edge (v, u). So, with a quick inductive proof: Base case: The first vertex in the sort, vi, has no incoming edges, so the longest path to it is of length 0. Inductive hypothesis: For vertices v2 through vi in the topological sort, the longest path is calculated as in Algorithm 2. Prove for vi+1: We know the longest distance to every vertex uj with an incoming edge to vi+1 because they are all earlier in the topological sort. Therefore, the longest path to vi+1 will be the largest of all possible D[u] + weight(uj , vi+1). 5 Homework-6-sols 5 of 5
欢迎咨询51作业君