Johnson’s Algorithm

47th Friday Fun Session – 19th Jan 2018

We have seen why Dijkstra’s algorithm cannot work with negative edge and that we cannot trivially add a constant to each of the edge weights and make them non-negative to proceed further. It is where Johnson’s algorithm comes into play. It finds a special set of offset values to remove the negative edges (change the negative edge weights to non-negative edge weights) and now this transformed graph is all set to work with Dijkstra’s algorithm.

How Does Johnson’s Algorithm work?

Johnson’s algorithm starts with a graph having negative edge(s). Let’s go through it using an example as shown below.

1

Add a New Node

It then adds a new vertex, let’s call it s, with edges starting from it and ending to each of the vertices of the existing graph, each having a cost of 0, as we have done earlier.

2.png

Apply Bellman-Ford

Then it applies Bellman-Ford, a Single Source Shortest Path (SSSP) algorithm that can work with a graph having negative edge(s). We will use s as the source, and find shortest path from it to all other vertices.

We also need to check whether a negative cycle exists, something that Bellman-Ford can detect. If it exists then we cannot proceed further as we cannot find shortest path in a graph with negative cycle. In our example graph, there is no negative cycle.

We find d[s, 1] = 0, d[s, 2] = -30, and d[s, 3] = 0 as shown below, using this code where d[s, t] indicates the shortest path from s to t.

3.png

Adjust Original Edge Weights

Now using these shortest path costs, original edges will be updated using the formula: w’[u, v] = w[u, v] + d[s, u] – d[s, v]. Applying the same for the original 3 edges in the original graph, we find,

w’[1, 2] = w[1, 2] + d[s, 1] – d[s, 2] = 20 + 0 – (-30) = 50

w’[1, 3] = w[1, 3] + d[s, 1] – d[s, 3] = 40 + 0 – 0 = 40

w’[3, 2] = w[3, 2] + d[s, 3] – d[s, 2] = (-30) + 0 – (-30) = 0

Now that we have adjusted the original edge costs, the new (cost) adjusted graph (without s and associated edges) does not have any more negative edge. Let’s see how the cost adjusted graph looks like.

4

Apply Dijkstra

With this non-negative edge graph we can proceed with Dijkstra’s algorithm. For each shortest path found in this graph from u to v, we have to adjust back the cost by subtracting d[s, u] – d[s, v] from it.

Is the Shortest Path Still the Same?

We are adjusting edge cost to remove negative edge. That way, we are changing the graph to some extent. However, while doing so we must preserve certain things of it. What was the cheapest cost in the original graph must still remain the cheapest path in the transformed graph. Let’s first verify whether that is indeed the case.

We will first look at the original graph (before edge cost adjustment). Let’s take a certain source destination pair (1, 2). There are two paths to reach from vertex 1 to vertex 2.

The first one (original):

d1[1, 2]

= from vertex 1 to vertex 2 directly using edge 1->2

= 20.

The second one (original):

d2[1, 2]

= from vertex 1 to 3 and then from 3 to 2

= 40 + (-30)

= 10.

Now let’s see how the costs of the same two paths change in the new cost adjusted graph.

The first one (cost adjusted):

d’1[1, 2]

= from vertex 1 to vertex 2 directly using edge 1->2

= 50.

The second one (cost adjusted):

d’2[1, 2]

= from vertex 1 to 3 and then from 3 to 2

= 40 + 0

= 40.

We see both the path costs have increased by 30, a constant. So what was earlier the shortest from vertex 1 to vertex 2, in the original graph, which was the second path, using two edges: edge 1->3 and edge 3->2, still remains the shortest path in the cost adjusted graph.

So how did that happen? Let’s have a closer look as to how the path cost changes.

The first one (cost adjusted):

d’1[1, 2]

= w’[1, 2]

= w[1, 2] + d[s, 1] – d[s, 2]

= d1[1, 2]  + d[s, 1] – d[s, 2]

The second one (cost adjusted):

d’2[1, 2]

= w’[1, 3] + w’[3, 2]

= w[1, 3] + d[s, 1] – d[s, 3] + w[3, 2] + d[s, 3] – d[s, 2]

= w[1, 3] + d[s, 1] + w[3, 2] – d[s, 2]

= w[1, 3] + w[3, 2] + d[s, 1] – d[s, 2]

= d2[1, 2] + d[s, 1] – d[s, 2]

So we see both the paths, with a certain source u and a certain destination v, have increased with a constant cost = d[s, u] – d[s, v], where s is the extra node that we added before applying Bellman-Ford algorithm.

We can easily find, no matter how many paths are present between a certain source s and a certain destination v, and no matter how many edges each of those paths uses, each of them would be adjusted by adding a constant cost = d[s, u] – d[s, v] to it. And hence, the shortest path in the original graph remains the shortest path in the new cost adjusted, non-negative edge graph.

Let’s consider a path that goes through 5 vertices: u, x1, x2, x3, and v.

In the cost adjusted graph the cost

d’[u, v]

= w’[u, x1] + w’[x1, x2] + w’[x2, x3] + w’[x3, v]

= w[u, x1] + d[s, u] – d[s, x1] + w[x1, x2] + d[s, x1] – d[s, x2] + w[x2, x3] + d[s, x2] – d[s, x3] + w[x3, v] + d[s, x3] – d[s, v]

= w[u, x1] + d[s, u] + w[x1, x2] + w[x2, x3] + w[x3, v] – d[s, v]

= w[u, x1] + w[x1, x2] + w[x2, x3] + w[x3, v] + d[s, u] – d[s, v]

= d[u, v] + d[s, u] – d[s, v]

By generalizing the above, we see that a constant cost d[s, u] – d[s, v] is getting added to all paths from u to v.

Are all Negative Edge Removed?

The second thing that we need to prove is: no longer there exists a negative edge in the adjusted graph. After applying Bellman-Ford, we computed the shortest paths from source s. Let’s assume, d[s, u] and d[s, v] are the shortest paths from s to any two vertices, u and v, respectively. In that case, we can say,

d[s, v] <= d[s, u] + w[u, v]

=> 0 <= d[s, u] + w[u, v] – d[s, v]

=> 0 <= w[u, v] + d[s, u] – d[s, v]

=> 0 <= w’[u, v]

We prove that the new edge cost, w’[u, v] is always non-negative.

Why Would We Use Johnson’s algorithm?

So here with Johnson’s algorithm, first we use Bellman-Ford to get a set of values; using which we transform the graph with negative edge to a graph with all non-negative edges so that we can apply Dijkstra’s algorithm.

But why would anyone want to do that? After all, both Bellman-Ford and Dijkstra are SSSP algorithms. What is the point of using one SSSP algorithm to transform a graph so that another SSSP algorithm can be used on the transformed graph?

Dijkstra’s Algorithm is Faster

Well, the reason being, the latter SSSP algorithm, namely Dijkstra’s, is much faster than Bellman-Ford. So, if we need to find shortest paths many times, then it is better that first we apply a bit more expensive SSSP alogorithm – Bellman-Ford to get the graph ready to work with Dijkstra’s algorithm. Then we execute much cheaper Dijkstra’s algorithm on this transformed graph, as many times as we want – later.

Sparse Graph

But in such a situation is it not better to run an ALL-Pairs Shortest Paths (APSP) algorithm like Floyd-Warshall? After all, Floyd-Warshall can compute APSP at a cost of O(V3) while Bellman-Ford costs O(|V| * |E|) that can shoot up to O(V3), when E=|V|2 for a dense graph.

Yes, that is correct. For a dense graph Johnson’s algorithm won’t possibly be useful. Johnson’s algorithm is preferable for a sparse graph when Bellman-Ford is reasonably efficient to work with it.

Index

Currency Arbitrage with Increasing Rate

13th JLTi Code Jam – Mar 2018

After adding new node to Floyd-Warshall algorithm incrementally and dealing with decreasing rate (of an existing currency pair), the next logical thing is how to deal with increasing rate (of an existing currency pair). By an existing currency pair we mean both the currencies were already present and there was already a rate between the two.

Just like before, given that we have an existing best path cost matrix, when a rate between two currencies increases what shall we do? Once again, we have two options:  i) re-compute the cost matrix from the scratch, using Floyd-Warshall, at a cost O(V3) and ii) update the already computed cost matrix using some partial computations. This problem expects a solution using the second option.

Input:

1 USD = 1.380 SGD

1 SGD = 3.080 MYR

1 MYR = 15.120 INR

1 INR = 0.012 GBP

1 GBP = 1.29 USD

I CAD = 0.57 GBP

1 GBP = 1.30 USD

Explanation: We have 7 inputs here. Each time an input is given, we need to present an output and hence, we have 7 lines of output.

The first 6 inputs do not result in any arbitrage; we output “No luck here”.

At 7th input, we see the existing rate from GBP to USD, which was 1.29 last time, has changed (increased) to 1.30 now. With this new rate in effect, an arbitrage comes into picture now. We need to output the path that creates that arbitrage.

Since in this problem, we are dealing with only increasing rate, in input, between two currencies, rate will only increase. For example, an input like 1 GBP = 1.25 USD will never appear.

When multiple arbitrages exist, printing any one will do.

Output:

No luck here

No luck here

No luck here

No luck here

No luck here

No luck here

USD -> SGD -> MYR -> INR -> GBP -> USD

Task: For each line of input, for each new vertex, incrementally adjust/add shortest paths at a cost (time) of O(|V|2), detect the presence of an arbitrage and output as specified. Use existing solution for this.

If input contains a rate that has increased since last time, accommodate that change in the best path cost matrix using some partial computations, instead of computing the whole matrix from the scratch.

Index

Dijkstra’s Problem with Negative Edge

46th Friday Fun Session – 12th Jan 2018

Dijkstra’s algorithm cannot work with negative edge. Also, we cannot trivially add a constant to each of the edge weights and make them non-negative to proceed further.

Why Does Dijkstra’s Algorithm not Work with Negative Edge?

negative edge

In the above figure, we are trying to get shortest paths from source node 1 to all other nodes (node 2 and node 3). Since Dijkstra’s algorithm works by employing a greedy process, it outputs 20 as the shortest path cost to node 2.

As we can see, from node 1, we can go to two nodes – node 2 and node 3, at a cost of 20 and 40 respectively. Hence, going to node 2 is cheaper. And that is why, it outputs 20 to be the cheapest cost to reach node 2.

However, we know that the cheapest cost to reach node 2 is through node 3. And the associated cost is: 40 + (-30) = 10. So Dijkstra’s algorithm gets it wrong. It gets it wrong because it cannot foresee that later, a negative edge can bring down the total cost to below 20.

If we carefully observe, we see that the wrong calculation by Dijkstra’s algorithm happens due to the negative edge. Had cost from node 3 to node 2 not been negative, it could never bring down the total cost to lower than 20, after getting added to 40.

Why Does Adding a Constant Cost to Each Edge not Work?

Now that we realize, Dijkstra’s algorithm fails due to the negative edge from node 3 to node 2, having the value -30, we might be tempted to add 30 to each of the edges. We might think, this way we can remove the negative edge. And doing so would be fair; after all, we are adding the same value to each of the edges. Let’s do it and see what happens.

adjusting negative edge.png

After updating the edge costs, the graph looks as shown above. So what is the cheapest path from node 1 to node 3 now?

Well, now the cheapest cost is 50, which uses the direct edge from node 1 to node 2. But this is not supposed to be the cheapest path, right? The cheapest path was node 1 -> node 3 -> node 2, before we adjusted the edge cost. Adjusting edge cost should not change the graph. It must not change the cheapest path, right?

So why does that happen? Well, if we observe, we find that path node 1 -> node 3 -> node 2 uses two edges/segments – node 1 to node 3 and node 3 to node 2. On the other hand, path node 1 -> node 2 uses just one edge/segment. The way we have updated the edge cost – adding a constant to each path segment – is not fair to a path using more path segments. For the path that uses two path segments, which was originally the cheapest path, we have added the constant 30 twice. On the other hand, for the path that uses just one path segment, we have added 30 only once. That way, we are unfair to the path using more path segments.

We must add a constant to each of the paths, not to each of the path segments.

Solution

Johnson’s algorithm does this – add a constant cost to each path with a certain source s to a certain target t. It does so, by finding a special set of offset values to remove the negative edges from a graph. Once that is done Dijkstra’s algorithm can work. But that works in absence of a negative cycle in the graph.

Index

Currency Arbitrage with Decreasing Rate

12th JLTi Code Jam – Feb 2018

Here we extend the Currency Arbitrage problem that expected us to re-compute the best path cost matrix when a new currency arrived along with some rates from and/or to the existing currencies. We came up with a solution at O(n2) cost.

If we look at the currency exchange issues, we realize that new currencies are not arriving every day. Rather, rates among the existing currencies are the ones which are changing every now and then. So it’s equally, if not more, important to find a solution as to how to address the rate changes and compute the cost matrix.

Rate between two currencies can either increase or decrease. Here we will focus only on rate decrease.

Given that we have an existing best path cost matrix, when a rate between two currencies decreases what shall we do? Well, just like Currency Arbitrage problem, we have two options:  i) re-compute the cost matrix from the scratch, using Floyd-Warshall, at a cost O(V3) and ii) update the already computed cost matrix using some partial computations. This problem expects a solution using the second option.

Input:

1 USD = 1.380 SGD

1 SGD = 3.080 MYR

1 MYR = 15.120 INR

1 INR = 0.012 GBP

1 GBP = 1.30 USD

I CAD = 0.57 GBP

1 GBP = 1.29 USD

Explanation: We have 7 inputs here. Each time an input is given, we need to present an output and hence, we have 7 lines of output. We already know how to incrementally add a new currency. The first 4 do not result in any arbitrage; we output “No luck here”. For 5th, we have an arbitrage and we output the same. For 6th, we continue to have the same arbitrage and we again output the same.

At 7th input, we see the existing rate from GBP to USD, which was 1.30 last time, has changed to 1.29 now. With this new rate in effect, the arbitrage disappears now. Output should be “No luck here”.

Since we are dealing with only decreasing rate, in input, between two currencies, rate will only decrease. For example, an input like 1 GBP = 1.31 USD will never appear.

When multiple arbitrages exist, printing any one will do.

Output:

No luck here

No luck here

No luck here

No luck here

USD -> SGD -> MYR -> INR -> GBP -> USD

USD -> SGD -> MYR -> INR -> GBP -> USD

No luck here

Task: For each line of input, for each new vertex, incrementally adjust/add shortest paths at a cost (time) of O(|V|2), detect the presence of an arbitrage and output as specified. Use existing solution for this.

If input contains a rate that decreases since last time, accommodate that change in the best path cost matrix using some partial computations, instead of computing the whole matrix from the scratch.

Index

Solution – Currency Arbitrage

49th Friday Fun Session – 2nd Feb 2018

Negative Cycle can be identified by looking at the diagonals of the dist[][] matrix generated by Floyd-Warshall algorithm. After all, diagonal dist[2][2] value is smaller than 0 means, a path starting from 2 and ending at 2 results in a negative cycle – an arbitrage exists.

However, we are asked to incrementally compute the same, at cost of O(n2) for each new vertex.

Floyd-Warshall algorithm takes O(n3) time to compute All-Pairs Shortest Path (APSP), where n is the number of vertices. However, given that it already computed APSP for n nodes, when (n+1)th node arrives, it can reuse the existing result and extend APSP to accommodate the new node incrementally at a cost of O(n2).

This is the solution for JLTI Code Jam – Jan 2018.

Converting Rates

If USD to SGD rate is r1 and SGD to GBP rate is r2, to get the rate from USD to GBP, we multiply the two rates and get the new rate that is r1*r2. Our target is to maximize rate, that is maximizing r1*r2.

In paths algorithm, we talk about minimizing path cost (sum). Hence maximizing multiplication of rates (r1*r2) would translate into minimizing 1/(r1*r2) => log (1/(r1*r2))  => log (r1*r2) -1 => – log r1 – log r2 => (–log r1) + (–log r2) => sum of (–log r1) and (–log r2). Rate r1 should be converted into – log r1 and that is what we need to use in this algorithm as edge weight.

While giving output, say the best rate from the solution, the rate as used in the dist[][] matrix should be multiplied by -1 first and then raised to the bth power, where b is the base (say one of 2, 10 etc.) of the log as used earlier.

Visualizing Floyd-Warshall

We have seen the DP algorithm that Floyd-Warshall deploys to compute APSP. Let us visualize to some extent as to how it is done for 4 vertices.

What Cells Are Used to Optimize

The computation will be done using k = 1 to 4, in the following order – starting with cell 1-1, 1-2, . . . . .2-1, 2-2, …….3-1, ……. 4-3, 4-4.

At first, using k = 1.

Let us see how the paths are improving using the following two examples.

dist[2][3] = min (dist[2][3],  dist[2][1] + dist[1][3])

1

and dist[3][4] = min (dist[3][4],  dist[3][1] + dist[1][4])

2

We see that for k = 1, all paths are optimized using paths from 1st (kth) row and 1st (kth) column.

Kth Row and Column do not Change

What about paths on kth row and kth column?

dist[1][2] = min(dist[1][2], dist[1][1] + dist[1][2]) – well, there is no point in updating dist[1][2] by adding something more to it.

So we see, at a certain kth iteration, kth row and kth column used to update the rest of the paths while they themselves are not changed.

At k = 1

3

At k = 2

4

At k = 3

5

At k = 4

6

Consider Only 3X3 Matrix Was Computed

Now assume that we did not consider that we had 4 vertices. Rather we considered that we had 3 vertices and completed APSP computations for all paths in the 3X3 matrix. We ignored the 4th row and column altogether.

So we have APSP computed for the following matrix using k = 1, 2 and 3.

7

Add 4th Vertex

Let’s say, 4th vertex arrives now. First, we can compare the computations used for the above 3X3 matrix with the same for the 4X4 matrix as shown earlier and find out what all computations need to be done now to extend this 3X3 matrix to 4X4 matrix to accommodate the new 4th vertex.

We will find that at first we have to optimize the 4th row and column using k = 1, 2 and 3. Let us do that.

8

Note that at this point, 4th row and column are not used to optimize paths for the older 3X3 matrix. So now that we have the 4th row and column optimized using k = 1, 2 and 3, we have to optimize that 3X3 matrix using k = 4.

9

This way, we don’t miss out any computation had we considered all the 4 vertices at one go. And thus we are done with optimizing all the paths in the 4X4 matrix.

Code

dist[][] //APSP matrix, already computed for n-1 vertices

p[][] //predecessor matrix, already computed for n-1 vertices


dist[n][] = ∞

dist[][n] = ∞

dist[n][n] = 0


for each edge (i, n)

  dist[i][n] = weight(i, n)

  p[i][n] = n


for each edge (n, i)

  dist[n][i] = weight(n, i)

  p[n][i] = i


for k = 1 to n-1

  for i = 1 to n-1

    if dist[i][n] > dist[i][k] + dist[k][n]

      dist[i][n] = dist[i][k] + dist[k][n]

      p[i][n] = p[i][k]

  for j = 1 to n

    if dist[n][j] > dist[n][k] + dist[k][j]

      dist[n][j] = dist[n][k] + dist[k][j]

      p[n][j] = p[n][k]


for i = 1 to n-1

    for j = 1 to n-1

      if dist[i][j] > dist[i][n] + dist[n][j]

        dist[i][j] = dist[i][n] + dist[n][j]

        p[i][j] = p[i][n]

Complexity

The complexity for this incremental building for a new vertex is clearly O(n2). That makes sense. After all, for n vertices the cost is O(n3) that is the cost of Floyd-Warshall, had all n vertices were considered at one go.

But this incremental building makes a huge difference. For example, consider that we have 1000 vertices for which we have already computed APSP using 1 billion computations. Now that 1001st vertex arrives, we can accommodate the new vertex with a cost of 1 million (approx.) computations instead of doing 1 billion+ computations again from the scratch – something that can be infeasible for many applications.

Printing Arbitrage Path

We can find the first negative cycle by looking (for a negative value) at the diagonals of the dist[][] matrix, if exists and then print the associated path. For path reconstruction, we can follow the steps as described here.

GitHub: Code will be updated in a week

Index

Dijkstra’s Algorithm

10th Friday Fun Session – 17th Mar 2017

Dijkstra’s algorithm is a Single-Source Shortest Path (SSSP) algorithm developed by Edsger Wybe Dijkstra. It uses a greedy process and yet finds the optimal solution. It looks similar to Breadth-first search.

Compare to Bellman-Ford

It is asymptotically the fastest SSSP algorithm, at a cost O(|E|+|V|log|V|), when min-priority queue implemented by Fibonacci heap is used.

That is quite cheap, given Bellman-Ford’s complexity of O(|V||E|) to find the same, something that can become prohibitively expensive for a dense graph having |V|2 edges.

However, while Bellman-Ford can work with negative edge and can detect negative cycle, Dijkstra’s algorithm cannot work with negative edge. Since it cannot work with negative edge, there is no question of detecting negative cycle at all.

Standard Algorithm

dist[] //shortest path vector

p[] //predecessor vector, used to reconstruct the path

Q //vertex set

for each vertex v in Graph
  dist[v] = ∞
  p[v] = undefined
  add v to Q

dist[s] = 0

while Q is not empty
  u = vertex with min dist[] value
  remove u from Q

  for each neighbor v of u
    alt = dist[u] + weight(u, v)
    if alt < dist[v]
      dist[v] = alt
      p[v] = u

return dist[], p[]

Given source vertex s, it finds the shortest distance from s to all other vertices. At first, it initializes dist[] vector to infinite to mean that it cannot reach any other vertex. And sets dist[s] = 0 to mean that it can reach itself at a cost of 0, the cheapest. All vertices including itself are added to the vertex set Q.

Then, it chooses the vertex with min dist[] value. At first, s (set to u) would be chosen. Then using each of the outgoing edges of u to v, it tries to minimize dist[v] by checking whether v can be reached via u using edge (u, v). If yes, dist[v] is updated. Then again it retrieves vertex u with the cheapest dist[u] value and repeats the same. This continues till Q is not empty. Whenever, a vertex u is removed from Q, it means that the shortest distance from s to u is found.

Since we are retrieving |V| vertices from Q, and for each vertex, trying with all its edges (=|V|, at max), to minimize distance to other vertices, the cost can be |V|2.

So, here we see a greedy process where it is retrieving the vertex with min dist[] value.

Since retrieving a vertex u from Q means that we found the minimum distance from s to u, if we are solving shortest path from a single source s to a single destination d, then when u matches the destination d, we are done and can exit.

It can also be noted that from source s, we find the shortest distances to all other vertices, in the ascending order of their distances.

Finally, we see that dist[] vector is continuously changing. And each time when we retrieve a vertex u, we choose the one with min dist[] value. That indicates using min-priority queue might be the right choice of data structure for this algorithm.

Using Fibonacci Heap

dist[] //shortest path vector
p[] //predecessor vector, used to reconstruct the path
Q //priority queue, implemented by Fibonacci Heap

dist[s] = 0

for each vertex v
  if(s != v)
    dist[v] = ∞
    p[v] = undefined
  
  Q.insert_with_priority(v, dist[v]) // insert

while Q.is_empty() = false
  u = Q.pull_with_min_priority() // find min and delete min
  
  for each neighbor v of u
    alt = dist[u] + weight(u, v)
    if alt < dist[v]
      dist[v] = alt
      p[v] = u
      Q.decrease_priority(v, alt) //decrease key

return dist[], p[]

In the above algorithm, we have used a function called decrease_priority(), something that is absent in standard priority queue but present in Fibonacci heap. So the above algorithm is implemented using Fibonacci heap.

Fibonacci heap is a special implementation of a priority queue that supports decrease key (decrease_priority()) operation. Meaning, we can decrease the value of a key while it is still inside the priority queue. And this can be achieved by using constant amortized time for insert, find min and decrease key operation and log (n) time for delete min operation.

As for cost, since we have called delete operation for each of the v vertices, and we have treated each of the |E| edges once, the cost here is O(|E|+|V|log|V|), as mentioned at the beginning of this post, as the cost of Dijkstra’s algorithm.

Using Standard Priority Queue

Standard priority queue implementation takes log (n) time for both insert and delete operation and constant time for find min operation. But there is no way to change the key value (decrease key) while the item is still in the priority queue, something Dijkstra’s algorithm might need to do quite frequently as we have already seen.

If standard priority queue is used, one has to delete the item from the priority queue and then insert into it again, costing log (n) each time, or an alternative to that effect. However, as long as standard priority queue is used, it is going to be slower than Fibonacci heap. With standard priority queue, the algorithm would look like below:

dist[] //shortest path vector
p[] //predecessor vector, used to reconstruct the path
Q //standard priority queue

for each vertex v
  dist[v] = ∞
  p[v] = undefined

dist[u] = 0
Q.insert_with_priority(u, dist[u]) // insert

while Q.is_empty() = false
  u = Q.pull_with_min_priority() // find min and delete min
  
  for each neighbor v of u
    alt = dist[u] + weight(u, v)
    if alt < dist[v]
      dist[v] = alt
      p[v] = u
      insert_with_priority(v, alt) //insert v 
                                     even if already exists 
return dist[], p[]

There are two differences from the earlier algorithm:

First, we have not inserted all vertices into the standard priority queue at first, rather inserted the source only.

Second, instead of decreasing priority that we cannot do using standard priority queue, we have kept on inserting vertex v when dist[v] decreases. That might mean, inserting a vertex v again while it is already there inside the queue with a higher priority/dist[v]. That is another way of pushing aside the old entry (same v but with higher priority) out of consideration for the algorithm. When shortest distances from source s to all other vertices v are found, those pushed aside vertices will be pulled one by one from the priority queue and removed. They will not affect dist[] vector anymore. And thus the queue will be emptied and the algorithm will exit.

Negative Edge

Please check Dijkstra’s Problem with Negative Edge for further details.

Index

Floyd-Warshall Algorithm

35th Friday Fun Session – 29th Sep 2017

Floyd-Warshall, also known as Roy-Warshall is an All-Pairs Shortest Path (APSP) algorithm developed by Robert Floyd, Bernard Roy, and Stephen Warshall. It is an example of dynamic programming that uses 3 nested loops. At a cost O(|V|3), it is quite impressive, given that Bellman-Ford might encounter the same cost (O(|V||E|)) to find only Single Source Shortest Path (SSSP) for dense graph having |V|2 edges. Floyd-Warshall can work with negative edges just like Bellman-Ford. After all, both are based on dynamic programming. As for detecting negative cycle, once again, both can detect it. However, in presence of negative cycle, results from both are invalid.

Three Nested Loops

dist[][] //shortest path matrix
p[][] //predecessor matrix, used to reconstruct the path

dist[][] = ∞

for each vertex i
  dist[i][i] = 0

for each edge (i, j)
  dist[i][j] = weight(i, j)
  p[i][j] = j

for k = 1 to |V|
  for i = 1 to |V|
    for j = 1 to |V|
      if dist[i][j] > dist[i][k] + dist[k][j]
        dist[i][j] = dist[i][k] + dist[k][j]
        p[i][j] = p[i][k]

To compute the shortest path between any pair (s, t), we have considered each of the |V| vertices as intermediate points k, and chosen the cheaper between i) existing (s, t) and ii) the sum of s to k and then from k to t, meaning s to t via k.

Short-circuiting an SSSP?

Does it mean that we can derive a SSSP solution for any pair (s, t), at a cost of O(|V|2)? To be precise, can we do the following?

for k = 1 to |V|
  if dist[i][j] > dist[i][k] + dist[k][j]
    dist[i][j] = dist[i][k] + dist[k][j]

After all, we have relaxed via all the intermediate nodes. Well, that will not work! Why?

Dynamic Programming

If we want to get the shortest path between (i, j) using k (1 to k) intermediate nodes then we have to choose the cheaper between the below paths:

  1. Without using k: dist[i][j] using intermediate nodes 1 to k-1.
  2. Using k: dist[i][k] + dist[k][j], where both dist[i][k] and dist[j][k] should make use of intermediate nodes 1 to k-1.

At k = 0, dist[][] is initialized using edge weights where exists, 0 for diagonals (dist[v][v]) and infinite for the rests.

An Example

Suppose, we want to compute dist[2][3] when k = 5.

Then, dist[2][3] = min { dist[2][3], dist[2][5] + dist[5][3] }

Here, all three distances – dist[2][3], dist[2][5] and dist[5][3] must already use intermediate nodes 1 to 4. Meaning, dist[2][5] is not the static cost set at k=0; possibly the edge cost, 0 or infinite. Rather, dist[2][5] is already computed using k from 1 to 4. Similarly, dist[5][3] (and dist[2][3] as well) is also computed using k from 1 to 4.

In other words, we cannot compute a certain dist[s][t] alone, using the intermediate nodes 1 to k. Rather for each intermediate node k, we need to compute dist[i][j] progressively, using the 3 nested loops, as shown earlier.

Obviously we can use recursion without the loops. That will not save any work for us. In fact, while using recursion, if we are not reusing existing solutions for the sub-problems, we will repeat the computation – something very expensive.

Path Reconstruction

The predecessor matrix p, keeps track of the shortest path. If we have to find the best path from s to t, we know for sure that we start with s. We print s. To know where we went from there, we have to look at p[s][t]. If that is t, we are done as that is the destination. However, if that is not the case, that means we find another node r. Then we know from s we went to an intermediate node r. So this becomes the new start s for the rest of the path. However, destination remains the same t. Again we look at p[s][t] and continue the same till we reach t, all along printing r (=p[s][t]).

Incremental Node Addition

Suppose as of now, we have 4 nodes and APSP is computed. At this point 5th node arrives, along with some edges connecting the existing nodes. Instead of computing APSP from the scratch, at a cost of O(|V|3) = O(125), we can use the already computed APSP and extend that to complete it for 5 nodes, at a cost of O(|V|2) = O(25).

Adjusting Edge Weight Changes

What if weight for an edge changes (increases or decreases)? Do we need to re-compute APSP from scratch? Or we can adjust the existing results using some partial computations?

Index