Johnson’s Algorithm

47th Friday Fun Session – 19th Jan 2018

We have seen why Dijkstra’s algorithm cannot work with negative edge and that we cannot trivially add a constant to each of the edge weights and make them non-negative to proceed further. It is where Johnson’s algorithm comes into play. It finds a special set of offset values to remove the negative edges (change the negative edge weights to non-negative edge weights) and now this transformed graph is all set to work with Dijkstra’s algorithm.

How Does Johnson’s Algorithm work?

Johnson’s algorithm starts with a graph having negative edge(s). Let’s go through it using an example as shown below.

1

Add a New Node

It then adds a new vertex, let’s call it s, with edges starting from it and ending to each of the vertices of the existing graph, each having a cost of 0, as we have done earlier.

2.png

Apply Bellman-Ford

Then it applies Bellman-Ford, a Single Source Shortest Path (SSSP) algorithm that can work with a graph having negative edge(s). We will use s as the source, and find shortest path from it to all other vertices.

We also need to check whether a negative cycle exists, something that Bellman-Ford can detect. If it exists then we cannot proceed further as we cannot find shortest path in a graph with negative cycle. In our example graph, there is no negative cycle.

We find d[s, 1] = 0, d[s, 2] = -30, and d[s, 3] = 0 as shown below, using this code where d[s, t] indicates the shortest path from s to t.

3.png

Adjust Original Edge Weights

Now using these shortest path costs, original edges will be updated using the formula: w’[u, v] = w[u, v] + d[s, u] – d[s, v]. Applying the same for the original 3 edges in the original graph, we find,

w’[1, 2] = w[1, 2] + d[s, 1] – d[s, 2] = 20 + 0 – (-30) = 50

w’[1, 3] = w[1, 3] + d[s, 1] – d[s, 3] = 40 + 0 – 0 = 40

w’[3, 2] = w[3, 2] + d[s, 3] – d[s, 2] = (-30) + 0 – (-30) = 0

Now that we have adjusted the original edge costs, the new (cost) adjusted graph (without s and associated edges) does not have any more negative edge. Let’s see how the cost adjusted graph looks like.

4

Apply Dijkstra

With this non-negative edge graph we can proceed with Dijkstra’s algorithm. For each shortest path found in this graph from u to v, we have to adjust back the cost by subtracting d[s, u] – d[s, v] from it.

Is the Shortest Path Still the Same?

We are adjusting edge cost to remove negative edge. That way, we are changing the graph to some extent. However, while doing so we must preserve certain things of it. What was the cheapest cost in the original graph must still remain the cheapest path in the transformed graph. Let’s first verify whether that is indeed the case.

We will first look at the original graph (before edge cost adjustment). Let’s take a certain source destination pair (1, 2). There are two paths to reach from vertex 1 to vertex 2.

The first one (original):

d1[1, 2]

= from vertex 1 to vertex 2 directly using edge 1->2

= 20.

The second one (original):

d2[1, 2]

= from vertex 1 to 3 and then from 3 to 2

= 40 + (-30)

= 10.

Now let’s see how the costs of the same two paths change in the new cost adjusted graph.

The first one (cost adjusted):

d’1[1, 2]

= from vertex 1 to vertex 2 directly using edge 1->2

= 50.

The second one (cost adjusted):

d’2[1, 2]

= from vertex 1 to 3 and then from 3 to 2

= 40 + 0

= 40.

We see both the path costs have increased by 30, a constant. So what was earlier the shortest from vertex 1 to vertex 2, in the original graph, which was the second path, using two edges: edge 1->3 and edge 3->2, still remains the shortest path in the cost adjusted graph.

So how did that happen? Let’s have a closer look as to how the path cost changes.

The first one (cost adjusted):

d’1[1, 2]

= w’[1, 2]

= w[1, 2] + d[s, 1] – d[s, 2]

= d1[1, 2]  + d[s, 1] – d[s, 2]

The second one (cost adjusted):

d’2[1, 2]

= w’[1, 3] + w’[3, 2]

= w[1, 3] + d[s, 1] – d[s, 3] + w[3, 2] + d[s, 3] – d[s, 2]

= w[1, 3] + d[s, 1] + w[3, 2] – d[s, 2]

= w[1, 3] + w[3, 2] + d[s, 1] – d[s, 2]

= d2[1, 2] + d[s, 1] – d[s, 2]

So we see both the paths, with a certain source u and a certain destination v, have increased with a constant cost = d[s, u] – d[s, v], where s is the extra node that we added before applying Bellman-Ford algorithm.

We can easily find, no matter how many paths are present between a certain source s and a certain destination v, and no matter how many edges each of those paths uses, each of them would be adjusted by adding a constant cost = d[s, u] – d[s, v] to it. And hence, the shortest path in the original graph remains the shortest path in the new cost adjusted, non-negative edge graph.

Let’s consider a path that goes through 5 vertices: u, x1, x2, x3, and v.

In the cost adjusted graph the cost

d’[u, v]

= w’[u, x1] + w’[x1, x2] + w’[x2, x3] + w’[x3, v]

= w[u, x1] + d[s, u] – d[s, x1] + w[x1, x2] + d[s, x1] – d[s, x2] + w[x2, x3] + d[s, x2] – d[s, x3] + w[x3, v] + d[s, x3] – d[s, v]

= w[u, x1] + d[s, u] + w[x1, x2] + w[x2, x3] + w[x3, v] – d[s, v]

= w[u, x1] + w[x1, x2] + w[x2, x3] + w[x3, v] + d[s, u] – d[s, v]

= d[u, v] + d[s, u] – d[s, v]

By generalizing the above, we see that a constant cost d[s, u] – d[s, v] is getting added to all paths from u to v.

Are all Negative Edge Removed?

The second thing that we need to prove is: no longer there exists a negative edge in the adjusted graph. After applying Bellman-Ford, we computed the shortest paths from source s. Let’s assume, d[s, u] and d[s, v] are the shortest paths from s to any two vertices, u and v, respectively. In that case, we can say,

d[s, v] <= d[s, u] + w[u, v]

=> 0 <= d[s, u] + w[u, v] – d[s, v]

=> 0 <= w[u, v] + d[s, u] – d[s, v]

=> 0 <= w’[u, v]

We prove that the new edge cost, w’[u, v] is always non-negative.

Why Would We Use Johnson’s algorithm?

So here with Johnson’s algorithm, first we use Bellman-Ford to get a set of values; using which we transform the graph with negative edge to a graph with all non-negative edges so that we can apply Dijkstra’s algorithm.

But why would anyone want to do that? After all, both Bellman-Ford and Dijkstra are SSSP algorithms. What is the point of using one SSSP algorithm to transform a graph so that another SSSP algorithm can be used on the transformed graph?

Dijkstra’s Algorithm is Faster

Well, the reason being, the latter SSSP algorithm, namely Dijkstra’s, is much faster than Bellman-Ford. So, if we need to find shortest paths many times, then it is better that first we apply a bit more expensive SSSP alogorithm – Bellman-Ford to get the graph ready to work with Dijkstra’s algorithm. Then we execute much cheaper Dijkstra’s algorithm on this transformed graph, as many times as we want – later.

Sparse Graph

But in such a situation is it not better to run an ALL-Pairs Shortest Paths (APSP) algorithm like Floyd-Warshall? After all, Floyd-Warshall can compute APSP at a cost of O(V3) while Bellman-Ford costs O(|V| * |E|) that can shoot up to O(V3), when E=|V|2 for a dense graph.

Yes, that is correct. For a dense graph Johnson’s algorithm won’t possibly be useful. Johnson’s algorithm is preferable for a sparse graph when Bellman-Ford is reasonably efficient to work with it.

Index

Solution – Currency Arbitrage

49th Friday Fun Session – 2nd Feb 2018

Negative Cycle can be identified by looking at the diagonals of the dist[][] matrix generated by Floyd-Warshall algorithm. After all, diagonal dist[2][2] value is smaller than 0 means, a path starting from 2 and ending at 2 results in a negative cycle – an arbitrage exists.

However, we are asked to incrementally compute the same, at cost of O(n2) for each new vertex.

Floyd-Warshall algorithm takes O(n3) time to compute All-Pairs Shortest Path (APSP), where n is the number of vertices. However, given that it already computed APSP for n nodes, when (n+1)th node arrives, it can reuse the existing result and extend APSP to accommodate the new node incrementally at a cost of O(n2).

This is the solution for JLTI Code Jam – Jan 2018.

Converting Rates

If USD to SGD rate is r1 and SGD to GBP rate is r2, to get the rate from USD to GBP, we multiply the two rates and get the new rate that is r1*r2. Our target is to maximize rate, that is maximizing r1*r2.

In paths algorithm, we talk about minimizing path cost (sum). Hence maximizing multiplication of rates (r1*r2) would translate into minimizing 1/(r1*r2) => log (1/(r1*r2))  => log (r1*r2) -1 => – log r1 – log r2 => (–log r1) + (–log r2) => sum of (–log r1) and (–log r2). Rate r1 should be converted into – log r1 and that is what we need to use in this algorithm as edge weight.

While giving output, say the best rate from the solution, the rate as used in the dist[][] matrix should be multiplied by -1 first and then raised to the bth power, where b is the base (say one of 2, 10 etc.) of the log as used earlier.

Visualizing Floyd-Warshall

We have seen the DP algorithm that Floyd-Warshall deploys to compute APSP. Let us visualize to some extent as to how it is done for 4 vertices.

What Cells Are Used to Optimize

The computation will be done using k = 1 to 4, in the following order – starting with cell 1-1, 1-2, . . . . .2-1, 2-2, …….3-1, ……. 4-3, 4-4.

At first, using k = 1.

Let us see how the paths are improving using the following two examples.

dist[2][3] = min (dist[2][3],  dist[2][1] + dist[1][3])

1

and dist[3][4] = min (dist[3][4],  dist[3][1] + dist[1][4])

2

We see that for k = 1, all paths are optimized using paths from 1st (kth) row and 1st (kth) column.

Kth Row and Column do not Change

What about paths on kth row and kth column?

dist[1][2] = min(dist[1][2], dist[1][1] + dist[1][2]) – well, there is no point in updating dist[1][2] by adding something more to it.

So we see, at a certain kth iteration, kth row and kth column used to update the rest of the paths while they themselves are not changed.

At k = 1

3

At k = 2

4

At k = 3

5

At k = 4

6

Consider Only 3X3 Matrix Was Computed

Now assume that we did not consider that we had 4 vertices. Rather we considered that we had 3 vertices and completed APSP computations for all paths in the 3X3 matrix. We ignored the 4th row and column altogether.

So we have APSP computed for the following matrix using k = 1, 2 and 3.

7

Add 4th Vertex

Let’s say, 4th vertex arrives now. First, we can compare the computations used for the above 3X3 matrix with the same for the 4X4 matrix as shown earlier and find out what all computations need to be done now to extend this 3X3 matrix to 4X4 matrix to accommodate the new 4th vertex.

We will find that at first we have to optimize the 4th row and column using k = 1, 2 and 3. Let us do that.

8

Note that at this point, 4th row and column are not used to optimize paths for the older 3X3 matrix. So now that we have the 4th row and column optimized using k = 1, 2 and 3, we have to optimize that 3X3 matrix using k = 4.

9

This way, we don’t miss out any computation had we considered all the 4 vertices at one go. And thus we are done with optimizing all the paths in the 4X4 matrix.

Code

dist[][] //APSP matrix, already computed for n-1 vertices

p[][] //predecessor matrix, already computed for n-1 vertices


dist[n][] = ∞

dist[][n] = ∞

dist[n][n] = 0


for each edge (i, n)

  dist[i][n] = weight(i, n)

  p[i][n] = n


for each edge (n, i)

  dist[n][i] = weight(n, i)

  p[n][i] = i


for k = 1 to n-1

  for i = 1 to n-1

    if dist[i][n] > dist[i][k] + dist[k][n]

      dist[i][n] = dist[i][k] + dist[k][n]

      p[i][n] = p[i][k]

  for j = 1 to n

    if dist[n][j] > dist[n][k] + dist[k][j]

      dist[n][j] = dist[n][k] + dist[k][j]

      p[n][j] = p[n][k]


for i = 1 to n-1

    for j = 1 to n-1

      if dist[i][j] > dist[i][n] + dist[n][j]

        dist[i][j] = dist[i][n] + dist[n][j]

        p[i][j] = p[i][n]

Complexity

The complexity for this incremental building for a new vertex is clearly O(n2). That makes sense. After all, for n vertices the cost is O(n3) that is the cost of Floyd-Warshall, had all n vertices were considered at one go.

But this incremental building makes a huge difference. For example, consider that we have 1000 vertices for which we have already computed APSP using 1 billion computations. Now that 1001st vertex arrives, we can accommodate the new vertex with a cost of 1 million (approx.) computations instead of doing 1 billion+ computations again from the scratch – something that can be infeasible for many applications.

Printing Arbitrage Path

We can find the first negative cycle by looking (for a negative value) at the diagonals of the dist[][] matrix, if exists and then print the associated path. For path reconstruction, we can follow the steps as described here.

GitHub: Code will be updated in a week

Index

Floyd-Warshall Algorithm

35th Friday Fun Session – 29th Sep 2017

Floyd-Warshall, also known as Roy-Warshall is an All-Pairs Shortest Path (APSP) algorithm developed by Robert Floyd, Bernard Roy, and Stephen Warshall. It is an example of dynamic programming that uses 3 nested loops. At a cost O(|V|3), it is quite impressive, given that Bellman-Ford might encounter the same cost (O(|V||E|)) to find only Single Source Shortest Path (SSSP) for dense graph having |V|2 edges. Floyd-Warshall can work with negative edges just like Bellman-Ford. After all, both are based on dynamic programming. As for detecting negative cycle, once again, both can detect it. However, in presence of negative cycle, results from both are invalid.

Three Nested Loops

dist[][] //shortest path matrix
p[][] //predecessor matrix, used to reconstruct the path

dist[][] = ∞

for each vertex i
  dist[i][i] = 0

for each edge (i, j)
  dist[i][j] = weight(i, j)
  p[i][j] = j

for k = 1 to |V|
  for i = 1 to |V|
    for j = 1 to |V|
      if dist[i][j] > dist[i][k] + dist[k][j]
        dist[i][j] = dist[i][k] + dist[k][j]
        p[i][j] = p[i][k]

To compute the shortest path between any pair (s, t), we have considered each of the |V| vertices as intermediate points k, and chosen the cheaper between i) existing (s, t) and ii) the sum of s to k and then from k to t, meaning s to t via k.

Short-circuiting an SSSP?

Does it mean that we can derive a SSSP solution for any pair (s, t), at a cost of O(|V|2)? To be precise, can we do the following?

for k = 1 to |V|
  if dist[i][j] > dist[i][k] + dist[k][j]
    dist[i][j] = dist[i][k] + dist[k][j]

After all, we have relaxed via all the intermediate nodes. Well, that will not work! Why?

Dynamic Programming

If we want to get the shortest path between (i, j) using k (1 to k) intermediate nodes then we have to choose the cheaper between the below paths:

  1. Without using k: dist[i][j] using intermediate nodes 1 to k-1.
  2. Using k: dist[i][k] + dist[k][j], where both dist[i][k] and dist[j][k] should make use of intermediate nodes 1 to k-1.

At k = 0, dist[][] is initialized using edge weights where exists, 0 for diagonals (dist[v][v]) and infinite for the rests.

An Example

Suppose, we want to compute dist[2][3] when k = 5.

Then, dist[2][3] = min { dist[2][3], dist[2][5] + dist[5][3] }

Here, all three distances – dist[2][3], dist[2][5] and dist[5][3] must already use intermediate nodes 1 to 4. Meaning, dist[2][5] is not the static cost set at k=0; possibly the edge cost, 0 or infinite. Rather, dist[2][5] is already computed using k from 1 to 4. Similarly, dist[5][3] (and dist[2][3] as well) is also computed using k from 1 to 4.

In other words, we cannot compute a certain dist[s][t] alone, using the intermediate nodes 1 to k. Rather for each intermediate node k, we need to compute dist[i][j] progressively, using the 3 nested loops, as shown earlier.

Obviously we can use recursion without the loops. That will not save any work for us. In fact, while using recursion, if we are not reusing existing solutions for the sub-problems, we will repeat the computation – something very expensive.

Path Reconstruction

The predecessor matrix p, keeps track of the shortest path. If we have to find the best path from s to t, we know for sure that we start with s. We print s. To know where we went from there, we have to look at p[s][t]. If that is t, we are done as that is the destination. However, if that is not the case, that means we find another node r. Then we know from s we went to an intermediate node r. So this becomes the new start s for the rest of the path. However, destination remains the same t. Again we look at p[s][t] and continue the same till we reach t, all along printing r (=p[s][t]).

Incremental Node Addition

Suppose as of now, we have 4 nodes and APSP is computed. At this point 5th node arrives, along with some edges connecting the existing nodes. Instead of computing APSP from the scratch, at a cost of O(|V|3) = O(125), we can use the already computed APSP and extend that to complete it for 5 nodes, at a cost of O(|V|2) = O(25).

Adjusting Edge Weight Changes

What if weight for an edge changes (increases or decreases)? Do we need to re-compute APSP from scratch? Or we can adjust the existing results using some partial computations?

Index

Solution – Manipulating Money Exchange

25th Friday Fun Session – 7th Jul 2017

Given a set of currencies and some exchange rates among them, we want to find if there exists an arbitrage; meaning, if it is possible to exploit the discrepancies in the exchange rates and transform one unit of a certain currency to more than one unit of the same, thus making a profit.

This is the solution to JLTi Code Jam – Jun 2017 problem.

Let us walk through an example

Let us take the example as mentioned here. We can start with 1 USD, convert that to SGD (1.380 SGD), then convert that to MYR (1.380 * 3.080 MYR), then convert to INR (1.380 * 3.080 * 15.120 INR), then convert to GBP (1.380 * 3.080 * 15.120 * 0.012 GBP), then convert that back to USD (1.380 * 3.080 * 15.120 * 0.012 * 1.30 = 1.0025503488 USD).

We end up with more than 1 USD. That means, we have an arbitrage in this set of exchange rates. The profit making cycle here is: USD -> SGD -> MYR -> INR -> GBP. And since it is a cycle, we can start from any currency within it. For example, SGD -> MYR -> INR -> GBP -> USD also represents the same cycle.

The transformation

In general, if we have to make a profit, the respective rates in the cycle, when multiplied, should give more than 1, as we have seen in the above example.

Formula

Negative cycle in Bellman-Ford

After some simple transformation of the profit making condition, we see, if we take negative of log rate, and use that as the edge cost/distance, then finding profit making cycle is equivalent to finding negative cycle in the corresponding graph. And we can do so using Bellman-Ford algorithm.

To be precise, each of the currencies would be considered as a vertex. If there exists an exchange rate r between two currencies then there would be a directed edge between the corresponding vertices, and –log r would be the associated cost/distance of that edge.

Source of Bellman-Ford

The next question comes: using which vertex as source shall we run the Bellman-Ford? Let us see the below graph.

Souce.png

Suppose, we have a single profit making cycle here: GBP-> AUD -> CAD. In that case, if we start with USD as source vertex, we will never detect this cycle.

Add extra currency as source

To solve this problem, we need to add an extra currency, and then create edges from it to all the existing currencies with cost 0. Now using this extra vertex (EXT) as source we have to run Bellman-Ford and that would ensure that we can detect a cycle, if there exist one.

Extra Souce

GitHub: Manipulating Money Exchange

Index

Bellman-Ford Algorithm

17th Friday Fun Session – 12th May 2017

We use Bellman-Ford Algorithm to find the shortest path from a single source node/vertex (red color) to all destination nodes/vertices.

Let’s use our intuition

Image1

Given that distance from city-1 (city-1 is node 1 here) to city-2 is 5, and from city-2 to city-3 is 6, if I have travel from city-1 to city-2 and city-3 respectively what would be the cheapest way to do so?

We can start from city-1 and reach city-2 at cost 5. Now that we have arrived at city-2 at cost 5, we can add cost 6 to it and reach city-3 via city-2. Thus, the shortest path from city-1 to city-2 and city-3 are respectively 5 and 11.

Image2

Distance

Since city-1 is the source, we can add a self-loop on it with cost 0. That means, reaching city-1 from itself would cost 0. We also set that reaching city-2 and city-3 would cost infinity. We set so, because as of now we don’t know what would be the cost to reach there. So we put the maximum possible cost. Let’s call it distance. So we have distance [1] = 0, distance [2] = ∞ and distance [3] = ∞.

Predecessor

Let’s also maintain another array, called predecessor to indicate the last node from which we arrived here. We set predecessor [1] = 0, predecessor [2] = 0, predecessor [3] = 0.  Since we have not arrived to city-2 and city-3 yet, we set the predecessor value for them to something invalid (0). For city-1, it is the source, can be set to 0 as well.

Relaxation

Now we take each path/edge. We have two edges here. First one is from city-1 to city-2 with cost 5 and the second edge is from city-2 to city-3 with cost 3. Now let’s do what is called relaxation on each of the edges.

We see that using first edge we can arrive at city-2 from city-1 at a cost of 5 (distance [1] + cost of first edge). Since 5 is less than the existing distance of city-2 that is ∞, we update distance [2] to 5. We also note that, we arrived here from city-1 and hence got this new distance, that means we also update predecessor [2] = 1.

Now let’s do relaxation on edge 2. We see that distance [3] that is ∞ as of now can be improved by using edge 2. We set distance [3] = distance [2] + cost of edge 2 = 5 + 6 = 11. Since we arrived here from city-2, let’s update predecessor [3] = 2.

Does order of edges for relaxation matter?

Now we are done with relaxation for all the edges once. First, we did the relaxation on first edge. Then we did the relaxation on the second edge. What would happen if the we changed the order? That means do the relaxation on the second edge first and then do it on the first edge.

Let’s do it. Start the relaxation afresh with new edge order. As of now we have predecessor [1] = 0, predecessor [2] = 0, predecessor [3] = 0. Also distance [1] = 0, distance [2] = ∞ and distance [3] = ∞.

We do relaxation on second edge (city-2 to city-3 at cost 6) first. We see that both distance [2] and distance [3] = ∞. Hence there is no chance to improve distance [3] since distance [2] + 6 = ∞ + 6 = ∞, that is no better than the existing distance [3] that also ∞. Hence relaxing second edge did not yield anything.

Let’s do relaxation on first edge. We know that would result in distance [2] = 5 and predecessor [2] = 1.

Well, at this point we see that we are done with relaxation on all the edges once and yet we have not found the shortest path to reach city-3.

So when shall we get the result?

Iteration

That brings us to the next concept called iteration. Relaxation on all the edges once is called an iteration. So how many iterations do we need to get the shortest path to all destination nodes? Let’s use our intuition on the example that we are working on. We got 3 nodes. So if we have to reach from one end (say node 1) to the other end (say node 3), the maximum edges we might have to travel is 2, that is, the number of nodes minus 1. If we do the relaxation on the edges in an order that would choose the furthest edge from source (or the closest to the destination, in this case, second edge that is going from city-2 to city-3 at cost 6), we see that at each iteration we would increase the path (source node to furthest node) by at least one edge. And hence at 2 iterations we will certainly reach all reachable (I am saying reachable because all destinations might not be reachable) destinations.

Let’s continue our workout from where we left. Let’s start iteration 2 for the cases when we did the relaxation on second edge first. At this 2nd iteration, we again start with second edge. This time we can update distance [3] = distance [2] + 6 = 5 + 6 = 11. Predecessor [3] = 2.

We are done with2 iterations and we have found the result to reach from city-1 to both city-2 and city-3.

Are all nodes reachable?

Let’s consider the below example, that is constructed by adding one more node 4.

Image3

We will see that distance [4] will remain ∞ and predecessor [4] will remain 0 (invalid) after |V| -1 = 4 – 1 = 3 iterations, where |V| is the number of nodes/vertices. This is because there is no incoming edge (path) to city-4. Hence city-4 is unreachable.

Did order of edges for relaxation really matter?

We have seen that intermediate results (distance and predecessor values) might vary based on the order of edges we chose for relaxation but final result after all the iterations will still be the same. Hence the order of edges on which we do relaxation does not really matter (as far as final result is concerned).

So after |V| – 1 iterations we have got the correct result?

Unfortunately not! Well, we did get the final result. But as of now we don’t know whether the result is valid or not. That sounds interesting. So, we have found the result and still we don’t know whether the result is correct/valid or not. So what is the issue? Well, let’s consider the below case.

Image4

I have added a third edge from city-3 to city-1. And the cost is -12. Negative cost? Why? We don’t answer the why question but let’s answer the what question. Cost -12 means reaching city-1 from city-4 would cost -1.

Let’s continue our workout where we have finished 2 iterations by considering the second edge first. Let’s assume we also considered the third edge and that third edge was considered at first for relaxation at each iteration. Distance [3] got a less than infinity value after 2nd iteration. Since we used 3rd edge at first, that means third edge was not used till 2nd iteration to update any distance for any node. That means the values (distance and predecessor) we got last time would be the same value even with the presence of third edge after 2 iterations.

Since we just added an extra edge (3rd edge) but no new nodes, the number of iterations we have to do still remains 2. That also means the result we found so far still valid in this case. But is the result (distance [2] = 5 and distance [3] = 11) correct with this new situation?

Negative cycle

Now that you arrived at city-3 at cost 11, you can go to city-1 at cost 11 + (-12) = -1 and then city-2 at -4 and so on. The more you travel, the less cost you would incur. And hence the shortest path found after 2 iterations are not valid.

So how do we find that the result is invalid? Well after we are done with |V|-1 iterations, we have to do one more iteration that is the |V|th one (a cycle involving |V| nodes can be found with |V| edges). If that changes distance value for any node that means there exists a negative cycle (a cycle whose edges (costs) sum to a negative value). When there is negative cycle present in a graph then the answer found is invalid.

Negative edge vs. negative cycle

Does negative edge means negative cycle? Does presence of negative edge mean no answer can be found?

Image5

Negative edge is fine with Bellman-Ford as in the above example. A correct solution can still be found. A correct solution cannot be found when there is a negative cycle. But Bellman-Ford can detect a negative cycle and in that case it can indicate that a correct solution is not found.

Are all iterations required?

Not really. When we did the relaxation on the first edge first, we already found the shortest paths to both city-2 and city-3. How do we know? Well, at iteration 2 we would have found that no distance got updated. If an iteration does not change any distance value then we can terminate the algorithm there and return valid result. Because in that case, subsequent iterations are not going to change anything. It also means there is no negative cycle.

The shortest path sequence

We can use the predecessor array recursively to get the shortest path sequence. For example, earlier after 2 iterations we got the following result.

distance [1] = 0, distance [2] = 5, distance [3] = 11

predecessor [1] = 0, predecessor [2] = 1, predecessor [3] = 2

If we want to find the shortest path sequence to city-3, we can find the predecessor [3] that is 2, recursively we can check the predecessor [2] that is 1 and that equals to source node, that is city-1. So we stop and the sequence is city-1 to city-2 to city-3.

Distance for a particular node can be updated more than once in an iteration

Image6

In this example above we have two nodes. We have to do one iteration. We have two edges: first with cost 5, second with cost 2. Source is node 1. If we relax the first edge then distance [2] will be 5. Subsequent relaxation on second edge would result the node 2 distance to be updated again with 2 (because, existing distance 5 > (0 + 2)). We see that node 2 distance got updated twice within the same iteration.

The algorithm

Now that we have done with the workout, let’s write down the algorithm.

Function BellmanFord()
{
  input = G {V, E};
  distance[] = ∞;
  predecessor[] = -1;
  distance[sourceNode] = 0;

  for i = 1 to |V|-1
  {
    valueChanged = false;
    for j = 1 to |E|
      valueChanged = Relax (E[j]) || valueChanged;

    if(!valueChanged)
      return Result();
  }

  for j = 1 to |E|
    if(Relax(E[j])
      print ‘negative cycle detected, solution not possible’;
}

Function Relax (e)
{
  if(distance[e.to]) > distance[e.from] + e.cost)
  {
    distance[e.to] = distance[e.from] + e.cost;
    predecessor[e.to] = e.from;
    return true;
  }

  return false;
}

Function Result()
{
  print ‘success’;
  print distance[];
  print predecessor[];
}

The complexity

For each iteration (number of vertices – 1), we are iterating over all the edges. That means the complexity is O(|V|. |E|).

GitHub: Manipulating Money Exchange

Index