Final Practice Problems

These problems are ungraded, and are intended as a study aid. They are very similar in

style and level to the problems that will appear on the final. The final exam will have between

6 and 8 problem (including the true/false question).

The final covers all the material until lecture 23 (including all the material covered by HW1

through HW6), with a bigger emphasis on the post-midterm material. There final is 2h30min.

Like in midterm, most problems won’t require full analysis (correctness + runtime); instead

each problem will ask explicitly which part of the analysis, if any, is required.

Problem 1 (DFS) If we perform depth-first-search on graph, we obtain a forest containing

all the vertices of the graph and a subset of the edges. Each node u has a “discovery time” u.d

and a “finish time” u.f . We consider the possible relationships between the intervals [u.d, u.f ]

and [v.d, v.f ] for two distinct vertices u and v. For each, either draw a DFS tree where the

relationship holds, or explain why it is impossible.

• u.d < v.d and v.f < u.f

• u.f < v.d

• u.d < v.d < u.f < v.f

Solution. 1. This situation arises when v is a descendant of u in the DFS tree. An example

is pictured in Figure 1. below.

2. This situation can arise when u is discovered before v and v is not reachable from u. An

example is pictured in Figure 2. below for a directed graph.

3. This is impossible. Since u is discovered before v but not yet finished, all of the nodes

reachable from v will be discovered before u is finished, hence u.d < v.d implies that v.f < u.f .

Problem 2 (MST) Consider the undirected, connected, weighted graph pictured in Fig. 3

below. Use an algorithm of your choice to find a minimum spanning tree in that graph. Briefly

explain your algorithm, indicate which edges of the graph are contained in your final tree, and

the order in which you add them to your tree.

Solution. We will use Kruskal’s algorithm, which initially considers these 8 vertices as 8

separate components and iteratively builds a minimal weight spanning tree by adding a lightest

remaining edge that connects two previously disconnected components. When there is more

than one lightest such edge (i.e. there is a tie for the lightest edge with this property), we can

choose among these tied candidates arbitrarily.

We start by adding the light edge of weight 1 from b to e. We next add the edge of weight 1

from e to h. Next we add the edge of weight 2 from e to a, then the edge of weight 2 from b to d.

1

Figure 1: 1. v is a descendant of u in the DFS tree.

Now, we add the edge of weight 3 from c to f. At this point, we have 3 connected components:

one containing b,e,h,a,d, one containing c and f, and one containing just g. As a lightest edge

connecting two of these components, we can next take the edge of weight 4 connecting e and c,

and finally the edge of weight 4 connecting h and g.

Problem 3 (Hardness: Search to Decision) Another important hard problem (NP-hard)

is the SUBSET-SUM problem. The decision version of the problem is formulated as follows:

we are given n numbers S = {x1, x2, . . . xn}, in the range {0, 1, . . . 2n − 1} (note that it takes n

bits to describe each number), and a target sum T , determine if there’s a subset S′ ⊆ S such

that

∑

i∈S′ xi = T .

Suppose we have a magic algorithm M that solves the decision version of the SUBSET-SUM

problem in polynomial time. Prove that, in this case, we can actually recover the set S′ in

polynomial time as well. In particular, assuming M, design a polynomial time algorithm for

the following problem: given a set S = {x1, . . . xn}, and a target T , find a set S′ such that∑

i∈S′ xi = T .

Solution. Note that for a subset S′ = {xi1 , xi2 . . . xik} of S, we have

∑

ij∈S′ xij = T if and

only if

∑

ij∈S′\{i1} xij = T−xi1 . This forms the basis of algorithm: we first check if such a subset

S′ exists. If not, we output ∅ and stop. If yes, however, we iterate over the elements of S and

for each xi, we check whether there exists a subset S

′′ of S \ {xi} such that

∑

ij∈S′′ xi = T − xi

with M. If so, we add xi to S′, subtract xi from T , and move on to the next element of S.

Once T hits 0 during any iteration, we output S′ and stop.

Runtime: it’s just (at most) n calls to the procedure M, and hence polynomial overall.

2

Figure 2: 2. u is discovered first and v is not reachable from u.

Problem 4 (FLOW)

• Consider a flow network as a directed graph G where each edge (u, v) has a capacity

c(u, v) ≥ 0. We designate a source node s and a sink node t. Recall that a cut (S, T ) is

a partition of the nodes into sets S and T with s ∈ S and t ∈ T . The capacity of the cut

(S, T ) is the sum of the capacities of the edges that go from a vertex in S to a vertex in

T . Is it possible to have such a G where the max flow from s to t is 13 and there exists a

cut (S, T ) with capacity 10? Explain your answer.

• Consider a directed graph G, containing vertices s, u, t (along with many others). Describe

an efficient way to compute the number of edge disjoint paths between s and t that do

not go through the vertex u.

Solution. For the first part, recall the max-flow, min-cut theorem. This states that the

maximum flow we can send from s to t is equal to the minimum capacity of a cut (S, T ) such

that s ∈ S and t ∈ T . Thus, if there is a cut with capacity 10, the max flow must be at most

10, and hence cannot be 13. So no, this is not possible.

For the second part, turn G into a flow network by assigning a capacity to each edge. If the

edge does not involve u, then assign it a capacity of 1. If it does involve u, assign it a capacity

of 0. Thus, edges that travel through u will not contribute to the max flow. Now, if we compute

the max flow from s to t in such a graph, it will count the number of edge disjoint paths, since

a flow of 1 can be sent along each edge disjoint path, and no additional flow can then be sent.

Problem 4 (Path Width in DAG) Suppose we are given a directed acyclic graph G =

(N,E) with weights wu,v ≥ 0 for each edge (u, v). The graph represents a network of (one-way)

roads, where wu,v represents the width of the road from u to v. We are also given two node in-

dices: s, the start node, and t, the destination node. The goal of the problem is to determine the

3

ab

c

d

e

f

12

g

h

2

5

1

7

2

4

8 1

4

7

3

Figure 3: graph for the problem MST.

maximal width W so that there is a path from s to t composed entirely of roads of width at least

W (in other words, we need to determine the width of the widest car that can travel from s to t.)

Design an algorithm that, given graph G, widths wu,v for (u, v) ∈ E, and nodes s, t, determines

the maximal width W . Your runtime should be O(|V |+ |E|). Briefly argue runtime bound and

correctness.

Solution. Our algorithm is as follows:

Algorithm 1 Path Width in DAG

Given: G = (V,E), s, t

∀v ∈ N , set d[v]← 0, p[v]←⊥

d[s]← +∞

for each node a ∈ N iterating in topological order do

for each neighbor b of a do

if d[b] < min{d[a], w(a, b)} then

d[b] = min{d[a], w(a, b)}

p[b] = a

end if

end for

end for

return d[t]

Correctness: After running topological sort, we can employ a dynamic programming approach

4

to compute the maximum width. When iterating in topological order, we can write a recurrence

relation that computes the max width d(v) up to some node v i.e. a value such that we can

always find a path from s to some node v such that all roads along that path have width at

least the max width value. We can compute this max width by considering the subproblems of

finding the max width to all neighbors of v and simply comparing to the weight of the given

edge, which yields the recurrence relation:

d(v) = max

(q,v)∈E

{min{w(q, v), d[q]}}

Runtime: We first run topological sort, which takes time O(|V |+ |E|). Then, we iterate over

all the vertices and edges again to update the array d, which gives an overall running time of

O(|V |+ |E|).

Problem 6 (Hardness) Recall the 0-1 Knapsack problem, where there are n items with

(positive) weights w1, . . . , wn and (positive) values v1, . . . , vn. The optimization version of this

problem is to maximize the sum of the values for a subset of the items while staying within a

given weight constraint W . An item must be taken in whole and cannot be split into fractional

pieces.

1. State the decision version of this problem.

2. Suppose someone proposes a dynamic programming approach to solving the optimization

problem. He suggests defining V (k,W ) to be the optimal value that can obtained using

only the first k items with a weight bound of W , and proposes to compute V (k,W ) by

maximizing over V (k − 1,W ) and vk + V (k − 1,W − wk). Will this approach yield a

polynomial time algorithm to find the optimal value V (n,W )? Explain why or why not.

Solution. 1. The decision version of the problem can be stated as follows: Given a bound

b ≥ 0, does there exist a subset S of the items such that the sum of the weights of the items in

S is ≤W while the sum of the values is ≥ b?

2. Since this problem is NP-complete, we should guess that this approach will not lead to a

polynomial time algorithm. One reason is that the computation of V (k,W ) requires computing

values of the form V (k−1,W−wk): if we try to compute and store all of the values V (k−1,W ′)

for all the W ′ ≤ W we will encounter in this dependence, it will be far too many values of W ′

(more than polynomially many). This is because there are 2n different subsets of the weights

to subtract from W , and we need to run in time polynomial in n and the number of bits to

represent W , b, {vi}, {wi} in binary.

Problem 7 (True/False) For each of the claims below circle True or False, and justify your

answer. Your justifications are more important than the True/False designations, and must be

succinct, precise and convincing.

All the parts can be adequately answered and justified in a few lines.

All the times below refer to worst-case time.

• Suppose that a problem can be solved in two ways. Approach 1 is a Dynamic Programming

algorithm that runs in O(n2) time. Approach 2 is a divide and conquer algorithm that

reduces an instance of size n to 8 instances of size n/3 and spends Θ(n) time in the divide

and combine steps.

Claim. Approach 1 has a better asymptotic running time.

5

True False

Solution. False

The recurrence for Approach 2 is: T (n) = 8T (n/3) + Θ(n). Using Master Theorem,

a = 8, b = 3, f(n) = Θ(n), we have nlogb a = nlog3 8 = Ω(n1+); hence it is Case 1.

Solution: T (n) = Θ(nlog3 8). Since log3 8 < 2 , Approach 2 has a better asymptotic

running time.

• We are given an array A with 3n distinct numbers. We want to partition the elements

of A into three equal-sized sets A1, A2, A3 (i.e., |A1| = |A2| = |A3| = n), such that all

elements of A1 are smaller than all elements of A2, and all elements of A2 are smaller than

all elements of A3.

Claim. This task requires Ω(n · log n) time in the comparison-based model.

True False

Solution. False.

We can do this in O(n) time. Use a linear-time selection algorithm to select the element

x of A with rank n and element y with rank 2n. Then compare all the elements of A

to x and y, and partition them into the three sets A1, A2, A3, where A1 contains the

elements that are ≤ x, A2 contains the elements that are > x and ≤ y, and A3 contains

the remaining elements that are > y.

• We are given a directed graph G = (N,E). We wish to determine if there exist two nodes

u, v such that G does not contain any path from u to v.

Claim. This problem can be solved in O(|N |+ |E|) time.

True False

Solution. True.

There exist two nodes u, v such that G does not contain any path from u to v if and only

if the graph is not strongly connected. We can use DFS from class to test if a graph is

strongly connected in O(|N |+ |E|) time.

• We are given a network G = (N,E) with positive integer capacities on the edges, source

node s, sink node t, and a maximum s-t flow f (i.e., a value fe for each edge E satifying

the flow property).

Claim: We can compute a minimum s− t cut in O(|N |+ |E|) time. (Note that the output

is the actual set S).

True False

6

Solution. True.

Construct the residual network Gr with respect to f and compute the set S of reachable

nodes from s in Gr. Both parts can be done in O(|N | + |E|) time. The minimum s − t

cut is (S,N \ S).

• Claim. Suppose we are given a matrix A of size n × n, where Aij is the length of the

shortest path from node i to node j in an unweighted graph G. Then we can sort all the

n2 numbers {Aij}i,j=1,...n in time O(n2).

True False

Solution. True.

From the special structure of A, we can see that the maximum value of any element in

A is n, since that is the max distance between any two vertices. Thus, we can use any

linear-time sort such as counting sort or bucket sort to sort the n2 numbers in O(n2).

• Consider an algorithm A that has the following runtime recurrence: T (n) = 4 · T (√n) +

O(log n). Another algorithm B has runtime O(log n).

Claim. Algorithm A has the same asymptotic running time as the algorithm B.

True False

Solution. False.

We can solve the recurrence for Algorithm A by changing variabl es so we can apply Master

Theorem. Solving gives that the recurrence should be O(log2(n)), while Algorithm B has

running time O(log n). These are not the same asymptotic running time, so the answer

is False.

• The MAX-CUT problem is as follows: given an undirected, unweighted graph G = (N,E),

and nodes s, t, find an s− t cut of maximum value — i.e., partition N into sets S, T , with

s ∈ S, t ∈ T , maximizing the number of edges (u, v) such that u ∈ S, v ∈ T .

Claim. We can solve MAX-CUT problem by running a max-flow algorithm from the

source s to the terminal t.

True False

Solution. False.

Max-flow is equivalent to finding the min-cut, not the max-cut. (Side remark: MAX-CUT

is actually an NP-hard problem, so that would also imply that P=NP.)

7

Problem 8 (LP) Suppose we have an extremely efficient LP solver at hand (i.e., implementa-

tion of some new fancy algorithm for the generic Linear Programming problem), and we decide

that we are better off computing graph connectivity using this LP solver.

So let P be the problem: given an undirected unweighted graph G = (N,E), as well as two

nodes s, t, determine if there a path between s and t.

Show how to formulate the problem P as an instance of Linear Program of small size. For

example, you may want to have variables fu,v and fv,u for each (undirected) edge (u, v) ∈ E.

(The number of constraints in the LP should be O(|N |+ |E|).

Solution. The idea is that we can formulate the connectivity problem as a special case of the

flow problem, and just use the LP for the max-flow problem we discussed in class. Specifically,

we think of G as a graph, where each undirected edge {u, v} is replaced by 2 directed edges

(u, v) and (v, u), and associate flow variables fu,v, fv,u to each such edge. The capacity for each

edge is 1. Let the set of directed edges be called E′ (note that |E′| = 2|E|).

Now the LP formulation is exactly as for the max-flow problem discussed. In particular the

LP is to maximize

∑

u:(s,u)∈E′ fs,u −

∑

v:(v,s)∈E′ fv,s subject to the constraints:

fu,v ≥ 0, ∀(u, v) ∈ E′

fu,v ≤ 1, ∀(u, v) ∈ E′∑

v:(u,v)∈E′

fu,v −

∑

v:(v,u)∈E′

fv,u = 0, ∀u ∈ N \ {s, t}

If the optimal LP solution is at least 1, there is a path from s to t (and vice versa). If the

LP solution is 0, s, t are disconnected.

Problem 9 (Greedy: Intercluster Distance) Suppose we are given a set of points X =

{x1, x2 . . . xn} and the distances d(xi, xj) between any pair of them (distances here are sym-

metric i.e. d(xi, xj) = d(xj , xi)). We want to find a partition of these points in to k subsets

C1, C2 . . . Ck that maximizes the minimum intercluster distance between any pair of these sub-

sets. More formally, we want to find a collection on k disjoint subsets {C1, C2 . . . Ck} of X that

covers X and maximizes the quantity

min

i,j

min

x∈Ci,y∈Cj

i 6=j

d(x, y)

Design an efficient algorithm for this problem. For partial credit, show this for k = 2. Hint:

use an algorithm for Minimum Spanning Tree.

Solution. Start by building a complete graph Kn on these these points with the weight of

each edge equal to the distance between the pair of vertices it connects i.e. associate a vertex vi

with each xi and for an edge (vi, vj), set its weight w(vi, vj) to d(xi, xj). Now, find an MST T

on Kn and delete the k− 1 heaviest edges in it. This forms a forest F of k trees and we return

the nodes of these trees as our k-partition.

Running Time: Finding an MST takes O(n2 log n) with Kruskal’s and since Kruskal’s sorts

these edges in non-decreasing order of weight, we can remove the k heaviest edges from the

MST and report the k-partition in the same amount of time.

Correctness. (Note that the problem didn’t ask for correctness – this is just FYI.) Let

E = {e1, e2...e(n2)} be the edges of Kn sorted in non-decreasing order of weight, i.e. we have

8

w(ei) ≤ w(ei+1)∀1 ≤ i ≤

(

n

2

) − 1 and let E = {eT (1), eT (2)...eT (n−1)} be the edges of T also

sorted in non-decreasing order of weight.

Lemma. The k-partition F induced by the edges {eT (1), eT (2) . . . eT (n−k)} maximizes the

minimum intercluster distance.

Proof. We begin by noting that the minimum intercluster distance for F is given by

w(eT (n−k+1)). This follows since T is an MST. Now, let C ′ = {C ′1, C ′2...C ′k} be any other

k-partition of X that maximizes the minimum intercluster distance and let ei be the edge that

realizes this distance for C ′. We claim that w(ei) ≤ w(eT (n−k+1)). Suppose, towards a contradic-

tion, that this was not the case and we had w(ei) > w(eT (n−k+1)) > w(eT (n−k)) . . . > w(eT (1)).

Then all the edges ej of Kn such that w(ej) < w(ei) are inner-cluster edges of C

′. This implies

that the graph induced by the edge set E′ = {eT (1), eT (2) . . . eT (n−k+1)} has at least k connected

components. However, we know that E′ induces a graph with k − 1 connected components

(since they are part of a tree) and we have a contradiction.

Problem 10 (Hardness: Search to Decision for SAT) For the purposes of this exercise,

a boolean formula is a function f : {0, 1}n → {0, 1} defined as f(x1, . . . xn) =

∏m

i=1(maxj∈Si xj),

where Si ⊆ [n] (this is a CNF formula in fact). Note that such a formula is described in O(mn)

bits. A satisfying assignment for f is a vector x = (x1, . . . xn) ∈ {0, 1}n such that f(x) = 1.

Suppose someone discovered a polynomial algorithm for deciding whether or not, for a

given boolean formula f , there exists a satisfying x. Describe how to derive a polynomial time

algorithm for finding a satisfying assignment (whenever one exists). Could you find all satisfying

assignments in polynomial time? Explain why or why not.

Solution. Suppose there are n variables in a boolean formula. If we fix the first variable

to take the value 0, we can rewrite the (now simplified) formula as a new formula with n − 1

variables (it will be of comparable size to the original formula). We can then run the polynomial

time algorithm to see if this new formula is satisfiable or not. If it is satisfiable, we are on track

to find a satisfying assignment with x1 = 0. If is it not, we have learned that no satisfying

assignment exists with x1 = 0, so we set x1 = 1. Either way, we set x1 appropriately and now

continue with this new formula, next setting x2 = 0, simplifying, and testing satisfiability for

a formula with n − 2 variables. We continue in this way until we have produced a satisfying

assignment or proven that none exists. Note that this will take time O(nR) where R is the

running time of the satisfiability algorithm we are using as a subroutine. So if R is polynomial,

so is our algorithm to produce a satisfying assignment. We cannot, however, hope to find all

of the satisfying assignments in polynomial time for an arbitrary formula. For example, the

formula could be satisfied by all assignments, and this would take us time 2n just to write them

all down.

Problem 11 [Costly nodes] Suppose we are given an undirected graph G = (N,E) which

has costs for both edges and nodes, denoted by (positive) integers w(u, v) for each edge (u, v),

and c(u) for each node u. We are also given two node indices: s, the start node, and t, the

destination node. The goal is to find the least costly path from s to t taking into account the

costs of both edges and nodes. In particular, for a path P = (u0, u1, u2, . . . uk−1, uk), where

u0 = s and uk = t, the cost of the path P is cost(P ) = c(u0) +

∑k

i=1 (w(ui−1, ui) + c(ui)).

Design an algorithm that, given graph G, costs w(u, v) for (u, v) ∈ E and c(u) for u ∈ N ,

and nodes s, t, finds the cost of the least costly s-t path (it’s enough to output the cost only).

What’s your runtime?

9

Solution. We first consider the directed version of the graph, where for each edge, we include

two directed edges of the same cost. We then augment the graph by splitting each node u ∈ N

into two u1 and u2, connected by an edge (u1, u2) with weight c(u). We attach all incoming

edges of u to u1, and attach all outgoing edges of u to u2. Therefore, any path going through

u will come in through u1, take the edge (u1, u2), and go out through u2. Thus, we compute

the shortest path s1 to t2 in the augmented graph, which will correspond to the shortest path

taking into account node costs.

We note that we may use any shortest paths algorithm, for instance Djikstra’s (since all

weights are positive).

Formulating the problem as an LP is also possible; but note that that leads to a slower

solution in general (since a generic LP solver is slower than a direct algorithm). Hence that

would only lead to partial credit.

Problem 12 [Multi-Sum DP] In this problem, you are given n positive interegers, a1, . . . an ≥

1, as well as a target integer T ≥ 1. You goal is to establish whether you can compose the tar-

get T using the integers ai: i.e., whether there exists positive integers f1, . . . fn ≥ 0 such that∑

i aifi = T . Design an algorithm running in time O(Tn). You don’t need to argue correctness

or runtime; but if you are using dynamic programming, you need to write formally what your

arrays/functions mean.

Solution. Fix a ∈ Nn. We define the set of feasible compositions (achievable integer linear

combinations of aj ’s) Ea ⊆ N. Our goal is to test whether T ∈ Ea. Let ei , 1[i ∈ Ea], then we

can compute ei using e

*is at least some aj such that ei−aj ∈ Ea). By convention, we let e0 = 1 and e<0 = 0.*

Our overall DP algorithm iterates over i ∈ [T ] and updates ei according the above in each

step i (noting each such ei computation takes O(n) time). We output eT after O(Tn) time.

10

欢迎咨询51作业君

Our overall DP algorithm iterates over i ∈ [T ] and updates ei according the above in each

step i (noting each such ei computation takes O(n) time). We output eT after O(Tn) time.

10

欢迎咨询51作业君