Currency Arbitrage with Decreasing Rate

12th JLTi Code Jam – Feb 2018

Here we extend the Currency Arbitrage problem, which asked us to re-compute the best path cost matrix when a new currency arrived along with some rates from and/or to the existing currencies. We came up with a solution at O(n²) cost.

If we look at how currency exchange works, we realize that new currencies do not arrive every day. Rather, it is the rates among the existing currencies that change every now and then. So it is equally, if not more, important to find a way to handle rate changes and update the cost matrix.

Rate between two currencies can either increase or decrease. Here we will focus only on rate decrease.

Given that we have an existing best path cost matrix, what shall we do when a rate between two currencies decreases? Just like in the Currency Arbitrage problem, we have two options: i) re-compute the cost matrix from scratch, using Floyd-Warshall, at a cost of O(|V|³), or ii) update the already computed cost matrix using some partial computations. This problem expects a solution using the second option.

Input:

1 USD = 1.380 SGD

1 SGD = 3.080 MYR

1 MYR = 15.120 INR

1 INR = 0.012 GBP

1 GBP = 1.30 USD

1 CAD = 0.57 GBP

1 GBP = 1.29 USD

Explanation: We have 7 inputs here. Each time an input is given, we need to present an output and hence we have 7 lines of output. We already know how to incrementally add a new currency. The first 4 inputs do not result in any arbitrage; we output “No luck here”. At the 5th, we have an arbitrage and we output its path. At the 6th, the same arbitrage still exists and we output it again.

At the 7th input, we see that the existing rate from GBP to USD, which was 1.30 last time, has changed to 1.29. With this new rate in effect, the arbitrage disappears. The output should be “No luck here”.
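The arithmetic behind this explanation can be checked with a short sketch. It multiplies the sample rates around the cycle USD -> SGD -> MYR -> INR -> GBP -> USD; the helper name `round_trip` is only illustrative and not part of the problem statement.

```python
import math

# Rates along USD -> SGD -> MYR -> INR -> GBP, copied from the sample input.
LEG_RATES = [1.380, 3.080, 15.120, 0.012]


def round_trip(gbp_to_usd):
    """Return how many USD come back after converting 1 USD around the cycle."""
    return math.prod(LEG_RATES) * gbp_to_usd


before = round_trip(1.30)  # rate given at the 5th input, about 1.0026
after = round_trip(1.29)   # rate given at the 7th input, about 0.9948

# A product above 1 means we end up with more USD than we started with.
print("arbitrage at 1.30:", before > 1)
print("arbitrage at 1.29:", after > 1)
```

A round trip worth more than 1 USD is an arbitrage, so the cycle pays off at 1.30 and stops paying off at 1.29.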

Since we are dealing with only decreasing rate, in input, between two currencies, rate will only decrease. For example, an input like 1 GBP = 1.31 USD will never appear.

When multiple arbitrages exist, printing any one will do.

Output:

No luck here

No luck here

No luck here

No luck here

USD -> SGD -> MYR -> INR -> GBP -> USD

USD -> SGD -> MYR -> INR -> GBP -> USD

No luck here

Task: For each line of input, for each new vertex, incrementally adjust/add shortest paths at a cost (time) of O(|V|²), detect the presence of an arbitrage and output as specified. Use the existing solution for this.

If the input contains a rate that has decreased since last time, accommodate that change in the best path cost matrix using some partial computations, instead of computing the whole matrix from scratch.

Index

Solution – Currency Arbitrage

49th Friday Fun Session – 2nd Feb 2018

A negative cycle can be identified by looking at the diagonal of the dist[][] matrix generated by the Floyd-Warshall algorithm. After all, a diagonal value dist[2][2] smaller than 0 means that a path starting from 2 and ending at 2 forms a negative cycle, that is, an arbitrage exists.

However, we are asked to compute the same incrementally, at a cost of O(n²) for each new vertex.

The Floyd-Warshall algorithm takes O(n³) time to compute All-Pairs Shortest Path (APSP), where n is the number of vertices. However, given that it has already computed APSP for n vertices, when the (n+1)th vertex arrives, it can reuse the existing result and extend APSP to accommodate the new vertex incrementally at a cost of O(n²).

This is the solution for JLTi Code Jam – Jan 2018.

Converting Rates

If the USD to SGD rate is r1 and the SGD to GBP rate is r2, then to get the rate from USD to GBP we multiply the two rates and get the new rate r1*r2. Our target is to maximize the rate, that is, to maximize r1*r2.

In shortest path algorithms, we talk about minimizing path cost (a sum). Hence maximizing the product of rates (r1*r2) translates into minimizing 1/(r1*r2). Taking the log, which preserves the ordering, we minimize log(1/(r1*r2)) = log((r1*r2)^-1) = -log(r1*r2) = (-log r1) + (-log r2), the sum of (-log r1) and (-log r2). So rate r1 should be converted into -log r1, and that is what we use in this algorithm as the edge weight.

While giving output, say the best rate from the solution, the value stored in the dist[][] matrix should first be multiplied by -1 and then used as the exponent of b, where b is the base (say 2, 10 etc.) of the log used earlier. That is, rate = b^(-dist).
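The conversion in both directions can be sketched as follows. This is a minimal illustration that assumes the natural log (base e); the function names `rate_to_weight` and `weight_to_rate` are made up for this example.

```python
import math


def rate_to_weight(rate):
    """Edge weight for an exchange rate: the negated (natural) log of the rate."""
    return -math.log(rate)


def weight_to_rate(weight):
    """Inverse conversion: negate the path cost and use it as the exponent of e."""
    return math.exp(-weight)


# Adding the weights of two hops corresponds to multiplying their rates.
r1, r2 = 1.380, 3.080  # USD -> SGD and SGD -> MYR from the sample input
path_cost = rate_to_weight(r1) + rate_to_weight(r2)
print(weight_to_rate(path_cost), r1 * r2)  # equal up to floating point rounding
```

Any other base works the same way, as long as the same base is used for both conversions.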

Visualizing Floyd-Warshall

We have seen the DP algorithm that Floyd-Warshall deploys to compute APSP. Let us visualize to some extent as to how it is done for 4 vertices.

What Cells Are Used to Optimize

The computation is done for k = 1 to 4. For each k, the cells are processed in the following order: 1-1, 1-2, ..., 2-1, 2-2, ..., 3-1, ..., 4-3, 4-4.

At first, using k = 1.

Let us see how the paths are improving using the following two examples.

dist[2][3] = min (dist[2][3],  dist[2][1] + dist[1][3])

[Figure 1]

and dist[3][4] = min (dist[3][4],  dist[3][1] + dist[1][4])

[Figure 2]

We see that for k = 1, all paths are optimized using paths from 1st (kth) row and 1st (kth) column.

Kth Row and Column do not Change

What about paths on kth row and kth column?

dist[1][2] = min(dist[1][2], dist[1][1] + dist[1][2]). As long as dist[1][1] is 0 (no negative cycle through vertex 1), there is no point in updating dist[1][2] by adding something more to it.

So we see that, at the kth iteration, the kth row and kth column are used to update the rest of the paths while they themselves are not changed.

[Figure 3: at k = 1]

[Figure 4: at k = 2]

[Figure 5: at k = 3]

[Figure 6: at k = 4]
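For reference, the computation visualized above is the standard Floyd-Warshall triple loop. The sketch below is a plain version of it in Python with 0-based vertex numbers; the 4-vertex graph at the end is an invented example, not data from this post.

```python
import math

INF = math.inf


def floyd_warshall(weights):
    """Return the all-pairs shortest path matrix for a square weight matrix.

    weights[i][j] is the weight of edge i -> j, INF if there is no edge,
    and 0 on the diagonal.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # work on a copy
    for k in range(n):
        # Every cell (i, j) is improved using column k (dist[i][k])
        # and row k (dist[k][j]).
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Invented example: edges 0->1 (4), 1->2 (1), 2->0 (2), 2->3 (1), 3->1 (-1).
example = [
    [0, 4, INF, INF],
    [INF, 0, 1, INF],
    [2, INF, 0, 1],
    [INF, -1, INF, 0],
]
for row in floyd_warshall(example):
    print(row)
```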

Consider Only 3X3 Matrix Was Computed

Now assume that we did not consider that we had 4 vertices. Rather we considered that we had 3 vertices and completed APSP computations for all paths in the 3X3 matrix. We ignored the 4th row and column altogether.

So we have APSP computed for the following matrix using k = 1, 2 and 3.

[Figure 7]

Add 4th Vertex

Let’s say, 4th vertex arrives now. First, we can compare the computations used for the above 3X3 matrix with the same for the 4X4 matrix as shown earlier and find out what all computations need to be done now to extend this 3X3 matrix to 4X4 matrix to accommodate the new 4th vertex.

We will find that at first we have to optimize the 4th row and column using k = 1, 2 and 3. Let us do that.

[Figure 8]

Note that at this point, 4th row and column are not used to optimize paths for the older 3X3 matrix. So now that we have the 4th row and column optimized using k = 1, 2 and 3, we have to optimize that 3X3 matrix using k = 4.

[Figure 9]

This way, we do not miss any computation that would have been done had we considered all 4 vertices in one go. And thus we are done with optimizing all the paths in the 4X4 matrix.

Code

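dist[][] // APSP matrix, already computed for vertices 1 .. n-1
p[][]    // path matrix, already computed for vertices 1 .. n-1;
         // p[i][j] is the vertex that follows i on the best path from i to j

// start with the new vertex n disconnected from everything else
dist[n][] = ∞
dist[][n] = ∞
dist[n][n] = 0

// direct edges into and out of n
for each edge (i, n)
  dist[i][n] = weight(i, n)
  p[i][n] = n

for each edge (n, i)
  dist[n][i] = weight(n, i)
  p[n][i] = i

// optimize the nth column and the nth row using k = 1 .. n-1
for k = 1 to n-1
  for i = 1 to n-1
    if dist[i][n] > dist[i][k] + dist[k][n]
      dist[i][n] = dist[i][k] + dist[k][n]
      p[i][n] = p[i][k]

  for j = 1 to n   // j = n also covers dist[n][n], the diagonal cell of n
    if dist[n][j] > dist[n][k] + dist[k][j]
      dist[n][j] = dist[n][k] + dist[k][j]
      p[n][j] = p[n][k]

// optimize the older (n-1) x (n-1) matrix using k = n
for i = 1 to n-1
  for j = 1 to n-1
    if dist[i][j] > dist[i][n] + dist[n][j]
      dist[i][j] = dist[i][n] + dist[n][j]
      p[i][j] = p[i][n]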
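The pseudocode above could be turned into runnable Python along the following lines. This is a sketch under a few assumptions that are not in the original: vertices are numbered from 0, the matrices are lists of lists that grow in place, and the new vertex's edges are passed as two dictionaries (`in_edges` and `out_edges`, both hypothetical names).

```python
import math

INF = math.inf


def add_vertex(dist, nxt, in_edges=None, out_edges=None):
    """Extend the APSP matrices in place by one vertex in O(n^2) time.

    dist, nxt : matrices already computed for the existing vertices;
                nxt[i][j] is the vertex that follows i on the best i -> j path.
    in_edges  : {i: weight} for edges i -> new vertex (assumed input shape).
    out_edges : {i: weight} for edges new vertex -> i (assumed input shape).
    Returns the index of the new vertex.
    """
    n = len(dist)  # index of the new vertex
    for row in dist:
        row.append(INF)
    for row in nxt:
        row.append(None)
    dist.append([INF] * (n + 1))
    nxt.append([None] * (n + 1))
    dist[n][n] = 0

    for i, w in (in_edges or {}).items():
        dist[i][n], nxt[i][n] = w, n
    for i, w in (out_edges or {}).items():
        dist[n][i], nxt[n][i] = w, i

    # Optimize the new column and the new row through every older vertex k.
    for k in range(n):
        for i in range(n):
            if dist[i][k] + dist[k][n] < dist[i][n]:
                dist[i][n] = dist[i][k] + dist[k][n]
                nxt[i][n] = nxt[i][k]
        for j in range(n + 1):  # j == n covers the new diagonal cell
            if dist[n][k] + dist[k][j] < dist[n][j]:
                dist[n][j] = dist[n][k] + dist[k][j]
                nxt[n][j] = nxt[n][k]

    # Optimize the older matrix using the new vertex as the intermediate one.
    for i in range(n):
        for j in range(n):
            if dist[i][n] + dist[n][j] < dist[i][j]:
                dist[i][j] = dist[i][n] + dist[n][j]
                nxt[i][j] = nxt[i][n]
    return n


# Usage with an invented graph: vertices arrive one at a time with their edges.
dist, nxt = [], []
add_vertex(dist, nxt)                                   # vertex 0
add_vertex(dist, nxt, in_edges={0: 4})                  # vertex 1, edge 0->1
add_vertex(dist, nxt, in_edges={1: 1}, out_edges={0: 2})    # vertex 2
add_vertex(dist, nxt, in_edges={2: 1}, out_edges={1: -1})   # vertex 3
for row in dist:
    print(row)
```

Starting from empty matrices keeps the usage short; in practice the matrices would already hold the result for the existing currencies.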

Complexity

The complexity of this incremental building for a new vertex is clearly O(n²). That makes sense. After all, adding n vertices one by one then costs O(n³) in total, which is the cost of Floyd-Warshall had all n vertices been considered in one go.

But this incremental building makes a huge difference. For example, consider that we have 1000 vertices for which we have already computed APSP using 1 billion computations. Now that the 1001st vertex arrives, we can accommodate it at a cost of approximately 1 million computations instead of doing 1 billion+ computations again from scratch, something that can be infeasible for many applications.

Printing Arbitrage Path

We can find the first negative cycle by looking for a negative value on the diagonal of the dist[][] matrix and, if one exists, print the associated path. For path reconstruction, we can follow the steps as described here.
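One possible way to do this is sketched below. It assumes the path matrix stores, for each pair (i, j), the vertex that follows i on the best path to j, and it caps the walk at n steps because with negative cycles present the stored pointers are not guaranteed to lead back to the start. The function name and the sample matrices are illustrative; the matrices come from running Floyd-Warshall by hand on a 3-vertex cycle with weights 1, 1 and -3.

```python
def find_negative_cycle(dist, nxt):
    """Return a cycle as a list of vertices (first == last), or None.

    dist : APSP matrix; a negative diagonal entry marks a negative cycle.
    nxt  : nxt[i][j] is the vertex that follows i on the best i -> j path.
    """
    n = len(dist)
    for start in range(n):
        if dist[start][start] >= 0:
            continue
        # Walk from start back to start, giving up after n hops.
        path = [start]
        u = nxt[start][start]
        while u is not None and len(path) <= n:
            path.append(u)
            if u == start:
                return path
            u = nxt[u][start]
    return None


# Matrices for the cycle 0 -> 1 -> 2 -> 0 with weights 1, 1, -3.
dist = [[-1, 0, 1], [-2, -1, 0], [-4, -3, -2]]
nxt = [[1, 1, 1], [2, 2, 2], [0, 0, 0]]
names = ["A", "B", "C"]  # placeholder labels

cycle = find_negative_cycle(dist, nxt)
if cycle is None:
    print("No luck here")
else:
    print(" -> ".join(names[v] for v in cycle))
```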

GitHub: Code will be updated in a week


Currency Arbitrage

11th JLTi Code Jam – Jan 2018

Here we revisit Manipulating Money Exchange problem where we tried to find currency arbitrage using Bellman-Ford at a time complexity of O(|V||E|).

In general, this kind of graph can be dense. Suppose there are 4 currencies: USD, SGD, GBP and INR. Usually, a rate is given from each currency to all other currencies, resulting in about |V|² edges. Hence, Bellman-Ford ends up with O(|V||E|) = O(|V||V|²) = O(|V|³), which is quite expensive, especially when you consider that apart from the few hundred fiat currencies, there are 1000+ cryptocurrencies out there.

Also we should not forget that currency exchange rate is not a factor of solely the currency itself, rather it is tied with an exchange. For example, suppose, Moneycorp exchange has a USD to SGD rate 1.4 while for HiFX it is 1.396 for the same. So we see, USD appearing twice in the graph – once as part of Moneycorp and again as part of HiFX.

However, computing shortest paths, a prerequisite for finding arbitrage, is something quite expensive. In this problem, we need to incrementally compute shortest paths when a new vertex, nth one arrives, assuming we have pre-computed results for (n-1) vertices that we can re-use.

To be more specific, suppose, at this moment, we have 12344 vertices and we already know whether there is an arbitrage, after computing the necessary shortest paths. And then, a new currency, JioCoin arrives with some new rates (from JioCoin to some existing currencies, say, JioCoin to INR and from some existing currencies to JioCoin, say, SGD to JioCoin). Now we have 12345 vertices. Computing shortest paths for a dense graph with 12345 vertices would take a very long time (try running 3 nested for loops, each looping 12345 times), doing billions of computations.

At this moment, would it not be wise to use the existing results for 12344 vertices, and then incrementally adjust the existing shortest paths and compute some new ones? That is precisely what this problem is all about. We need to incrementally adjust/add shortest paths once a new vertex arrives, and this is to be done at a time complexity of O(|V|²), something that is comfortably manageable. After this, we have to say whether an arbitrage exists.

Input:

1 USD = 1.380 SGD

1 SGD = 3.080 MYR

1 MYR = 15.120 INR

1 INR = 0.012 GBP

1 GBP = 1.30 USD

1 CAD = 0.57 GBP

Explanation: Whenever a rate arrives, starting from the first, for each new vertex, we need to incrementally adjust/add shortest paths, find whether an arbitrage exists or not and output the same. We have 6 inputs here. Each time an input comes, we need to output and hence, we have 6 lines of output. The first 4 did not result in any arbitrage, we output “No luck here”. From 5th we have an arbitrage and we output the same.

Once an arbitrage is found, it is going to last. Note that, there might exist more than one arbitrage. Printing any one will do.

An important thing: a rate between a certain currency pair will not appear twice in the input. That means once the GBP to USD rate arrives at line 5, a new rate between the two will not arrive again.

Output:

No luck here

No luck here

No luck here

No luck here

USD -> SGD -> MYR -> INR -> GBP -> USD

USD -> SGD -> MYR -> INR -> GBP -> USD

Input:

1 USD = 1.38295 SGD

1 SGD = 3.08614 MYR

1 MYR = 15.0996 INR

1 INR = 0.0119755 GBP

1 GBP = 1.295 USD

Output:

No luck here

No luck here

No luck here

No luck here

No luck here

Task: For each line of input, for each new vertex, incrementally adjust/add shortest paths at a cost (time) of O(|V|²), detect the presence of an arbitrage and output as specified.


Sprint Completion Time

10th JLTi Code Jam – Dec 2017

At the start of a sprint we are given a list of deliverables. The first thing in our mind is whether the team can deliver it in time. Thus estimating time to complete a sprint is something very important.

The first thing we do is split the deliverables into a number of tasks and estimate the time required to complete each of them. Saying a task takes 3 days means it takes one person 3 days to complete; it cannot be split further to have 3 persons finish it in a single day.

While some tasks can be completed independently, others might be dependent tasks, meaning we cannot start them unless the prerequisite tasks are completed first. For example, work on a report cannot start until we are done with the database design/creation. Testing or deployment cannot be done unless we develop the solution. Suppose completing task 1 and task 2 takes 4 and 6 days respectively, and task 2 is dependent on task 1; in other words, task 1 is a prerequisite for task 2. In this case, task 2 is completed only after 10 days (4 days for task 1 followed by 6 days for task 2).

Finally, we don’t have an infinite number of people available. And for simplicity, assume each person is capable of doing any of the tasks.

Input:

3

4 3 2 1 4 6

1 2 4

2 3 4

4 3

5 6

6 3

Output: 12

Explanation:

The first line says the team has 3 persons. Second line lists the number of days required to complete each of the tasks. Here we have 6 numbers. It says we have 6 tasks – task 1 takes 4 days to complete, task 2 takes 3 days to complete and so on. The last, task 6, takes 6 days.

The subsequent lines list the dependent tasks. 1 2 4 means task 1 depends on tasks 2 and 4. 6 3 means task 3 is a prerequisite for task 6. No line starts with 3, which means task 3 does not depend on any other task.

Task 3 can be completed in 2 days by one person. For these first 2 days the other two persons have to sit idle, as all other tasks are dependent tasks. After 2 days, tasks 4 and 6, which depend only on task 3, can start; task 2 depends on tasks 3 and 4, so it can start once task 4 is also done. Any free person can take any of them. Once task 6 is done, task 5 can start. Similarly, once tasks 2 and 4 are done, task 1 can start. Completing all of them takes 12 days.

Input:

2

4 3 6 2

1 2

2 3

3 1

Output: Infeasible

Explanation: We have 2 persons to complete 4 tasks; completing them takes 4, 3, 6 and 2 days respectively. However, we see that task 1 is dependent on task 2, task 2 is dependent on task 3 and task 3 is dependent back on task 1. While we can finish task 4 easily, we cannot start any of the first 3 tasks. They are dependent on each other and thus create a dependency cycle.
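Both examples can be reproduced with a small simulation. The sketch below is not the official solution: it uses a greedy rule (whenever a person is free, start the longest task that is ready), which gives 12 for the first example but is not guaranteed to find the minimum time for every input. Tasks left unfinished at the end indicate a dependency cycle. The function name `sprint_days` and the dictionary input shape are assumptions made for this example.

```python
import heapq


def sprint_days(persons, durations, deps):
    """Simulate the sprint; return total days, or None if it is infeasible.

    persons   : number of people (assumed to be at least 1)
    durations : durations[t - 1] is the time needed for task t (tasks from 1)
    deps      : {task: [prerequisite tasks]}
    """
    n = len(durations)
    waiting_on = {t: len(set(deps.get(t, []))) for t in range(1, n + 1)}
    dependents = {t: [] for t in range(1, n + 1)}
    for task, prereqs in deps.items():
        for p in set(prereqs):
            dependents[p].append(task)

    ready = [t for t in range(1, n + 1) if waiting_on[t] == 0]
    running = []  # min-heap of (finish_time, task)
    now, done = 0, 0
    while ready or running:
        # Hand ready tasks to free persons, longest task first (greedy choice).
        ready.sort(key=lambda t: -durations[t - 1])
        while ready and len(running) < persons:
            t = ready.pop(0)
            heapq.heappush(running, (now + durations[t - 1], t))
        # Jump to the moment the next running task finishes.
        now, finished = heapq.heappop(running)
        done += 1
        for d in dependents[finished]:
            waiting_on[d] -= 1
            if waiting_on[d] == 0:
                ready.append(d)
    # Tasks stuck in a dependency cycle never become ready.
    return now if done == n else None


first = sprint_days(3, [4, 3, 2, 1, 4, 6],
                    {1: [2, 4], 2: [3, 4], 4: [3], 5: [6], 6: [3]})
second = sprint_days(2, [4, 3, 6, 2], {1: [2], 2: [3], 3: [1]})
print(first if first is not None else "Infeasible")
print(second if second is not None else "Infeasible")
```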

Task: Manually calculating the minimum time required to complete the tasks is time consuming and prone to error, especially when we need to estimate this very often. Why not write a small program that can do it for us?


RC Election Result

9th JLTi Code Jam – Nov 2017

The whole of JLT is buzzing with the Recreational Committee (RC), aka Fun Ministry, election 2018. It is more palpable in JLTi, where a fierce competition is taking place between two candidates representing the Millennial Party and the Traditional Party. In this two-party system, the Millennial Party claims that they know the magic of how people can be entertained, while the Traditional Party cannot stop laughing at them, saying they are just inexperienced kids incapable of running the massive Fun Ministry.

This time, the voting mechanism has changed. Instead of one person one vote, which was how it worked till last year, a person's vote weight now equals the number of years he/she has been working at JLT. For example, I have been working here for 4 years and hence my vote counts as 4. If somebody has been working for just 1 year, his/her vote weight is 1. For obvious reasons, the Millennial Party is unhappy about this new legislation that was recently passed by the incumbent Traditional Party. They call it unfair. But law is law.

Voting stopped on 10th Nov 2017 and counting votes would commence on 13th Nov 2017 followed by the announcement of result on the same day.

Being a member of the existing RC, my concern is a little different. I am worried about a tie and, if that happens, what the next course of action would be.

Hence, I am checking the possibility of a tie. Manually doing so is quite problematic, if not impossible, for several hundred employees that we have in Singapore. Being the only software engineer in the existing RC, I am tasked to write a program that would take vote weight of each of the voters as input and output whether a tie is possible.

Input:

20 10 4 6

Output: Possible

Explanation: Vote weight 20 and 10, sounds familiar? Anyway, we see that if the first voter votes for one candidate and the rest for the other, a tie is inevitable.

Input:

8 7 2 5 16

Output: Not Possible

Explanation: As you see, the total vote weight for the above 5 voters is 38. For a tie to happen, each candidate should get a vote count of 19. However, no combination of these vote weights adds up to 19.

Input:

5, 6, 7, 3

Output: Not Possible

Explanation: We see, the total vote weight for the above 4 voters is 21. An odd number cannot be divided by two.

Task: Given a list of vote weights, one for each voter, we need to find whether a tie is possible. We are assuming that all voters in the input will vote for sure.
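The reasoning in the explanations above (an odd total can never be split, and otherwise some group of voters must add up to exactly half of the total) can be sketched as a subset-sum check. The version below is one possible approach, not necessarily the intended solution; it tracks every reachable vote total as a bit in a Python integer. The function name `tie_possible` is made up for this example.

```python
def tie_possible(weights):
    """Return True if the voters can be split into two groups of equal weight."""
    total = sum(weights)
    if total % 2 == 1:
        return False  # an odd total cannot be divided by two
    target = total // 2

    # Bit s of `reachable` is 1 when some group of voters adds up to s.
    reachable = 1  # only the empty group (sum 0) at the start
    for w in weights:
        reachable |= reachable << w
    return (reachable >> target) & 1 == 1


for votes in ([20, 10, 4, 6], [8, 7, 2, 5, 16], [5, 6, 7, 3]):
    print("Possible" if tie_possible(votes) else "Not Possible")
```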


Solution – Choosing Oranges

38th Friday Fun Session – 3rd Nov 2017

Given a set of goodness scores of oranges and a window length, we need to find the highest scoring orange within the window as we move it from left to right.

This is the solution for JLTi Code Jam – Oct 2017.

Using priority queue

Suppose we have n scores and the window length is m. We can simply move the window from left to right, take the m consecutive scores within the window, compute their max each time, and output it if it has not been output already. Finding the max of m scores takes O(m), and as we do it n times (n-m+1 times, to be precise), the complexity would be O(mn). However, it was expected that the complexity would be better than this.

Alternatively, we can use a max-heap, where we push each score as we encounter it. At the same time, we retrieve the top of the max-heap and, if all is fine, output it. By "all is fine" we mean that the orange has not already been output and that it belongs to the current window. If the top item is out-dated, we can pop it, meaning take it out of the heap. Note that a max-heap is a data structure that retains the max element at the top.

Let us walk through an example

Let us take the first example as mentioned here. For the scores 1 3 5 7 3 5 9 1 2 5 with window size 5, let us walk through the process.

At first, we push the first 4 items (4 is one less than the window size 5). The max-heap would look like: 1 3 5 7 where 7 is the top element.

Then for each of the remaining items, we do the following:

  1. If the (new) item is greater than or equal to the top item in the max-heap, pop the top and push the new item. Output the new item (if the same orange has not already been output). We do it because the new item is the max in the present window, and the existing top one is of no further use. We can now move to the next item.
  2. Keep on popping the top as long as it does not belong to the current window. We do it because we are interested in the max within the window, not in out-dated items that are no longer inside the window.
  3. Output the top (if the same orange has not already been output). We do it as it is the max within the current window.
  4. If the top item is the oldest (left-most/first/earliest/starting one) in the current window, pop it. We do it because this item is going to go out of the window as the next item gets in.
  5. Push the new item into the max-heap, as the walkthrough below does at the end of each score.

Score 3:

3 is not >= 7 (top in heap)

Existing top (7) is within the current window. Output it.

Push 3; max-heap looks like: 1 3 3 5 7

Score 5:

5 is not >= 7 (top in heap)

Existing top (7) is within the current window (3 5 7 3 5). But this orange is already outputted (we can use index of the item to track it, meaning instead of just pushing the score, retain the index along with it). No output this time.

Push 5; max-heap looks like: 1 3 3 5 5 7

Score 9:

9 >= 7 (top in heap)

Pop 7, push 9, output 9.

New max-heap: 1 3 3 5 5 9

Score 1:

1 is not >= 9 (top in heap)

Existing top (9) is within the current window. But this orange is already outputted. No output this time.

Push 1; max-heap looks like: 1 1 3 3 5 5 9

Score 2:

2 is not >= 9 (top in heap)

Existing top (9) is within the current window. But this orange is already outputted. No output this time.

Push 2; max-heap looks like: 1 1 2 3 3 5 5 9

Score 5:

5 is not >= 9 (top in heap)

Existing top (9) is within the current window. But this orange is already outputted. No output this time.

Push 5; max-heap looks like: 1 1 2 3 3 5 5 5 9

No item left. We are done!

Finally, output is 7, 9.
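The walkthrough can be reproduced with a short program. The sketch below is a simplified variant, not the exact four steps listed earlier: every score is pushed together with its index, and out-dated tops are removed lazily only when they surface. Because old small items may stay in the heap, this variant is O(n log n) in the worst case rather than O(n log m). Python's `heapq` is a min-heap, so scores are negated to get max-heap behaviour; the function name `choose_oranges` is made up for this example.

```python
import heapq


def choose_oranges(scores, m):
    """Return the scores of the oranges chosen with a window of length m."""
    heap = []       # entries are (-score, index), so the best score is on top
    seen = set()    # indices of oranges that were already output
    chosen = []
    for i, score in enumerate(scores):
        heapq.heappush(heap, (-score, i))
        if i < m - 1:
            continue  # the first window is not complete yet
        # Drop tops that fell out of the current window [i - m + 1, i].
        while heap[0][1] <= i - m:
            heapq.heappop(heap)
        neg_score, idx = heap[0]
        if idx not in seen:
            seen.add(idx)
            chosen.append(-neg_score)
    return chosen


print(choose_oranges([1, 3, 5, 7, 3, 5, 9, 1, 2, 5], 5))  # [7, 9]
print(choose_oranges([1, 3, 5], 4))                       # []
print(choose_oranges([1, 2, 4, 9], 3))                    # [4, 9]
```

An empty result corresponds to the output "None" in the problem statement.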

Complexity

If we observe closely, we see that the size of the max-heap always stays around m. If the new item is greater than or equal to the top, we pop the top before pushing, so the max-heap size does not increase. If the new item is smaller, we push it and the size of the max-heap increases, true; but soon the top becomes out-dated and we pop that. So the max-heap size remains around m. Pushing (in) or popping (out) an item costs O(log m), and since we do it n times, the complexity is O(n log m). Please note that getting the top of the max-heap costs O(1).

GitHub: Choosing Oranges


Choosing Oranges

8th JLTi Code Jam – Oct 2017

Orange is one of my favourite fruits that I buy for our Friday Fun Session participants. How would you choose the good ones from hundreds of them; especially, on the way to office, when you stop by the supermarket, in the morning rush hour?

To speed up the selection while still choosing the good ones (firm, smooth and heavy compared to their size), I have devised a selection process. I go from left to right, scoring each of the oranges on a scale from 0 to 9, 9 being the best; once a row is done, I go to the next row and so on. As I go and score, I also choose the best one among each consecutive, say, 5 oranges.

What would that look like?

Input:

5

1 3 5 7 3 5 9 1 2 5

Output: 7, 9

Explanation:

The first line says: choose the best among consecutive 5. The second line shows the score for each of the 10 oranges. The first 5 are: 1, 3, 5, 7, and 3; best among them is 7. We choose 7. The next 5 are:  3, 5, 7, 3, and 5; best among them 7 – already chosen. Move on to the next 5: 5, 7, 3, 5, and 9; best among them 9, pick that. Move to the next 5: 7, 3, 5, 9, and 1; best among them is 9, already chosen. Next 5 are: 3, 5, 9, 1, and 2; once again the best among them 9 is already chosen. Final 5 are: 5, 9, 1, 2, and 5; same as before. We cannot move further as we don’t have 5 oranges after this point.

We end up with two oranges: 7 and 9. I am not doing a bad job of selecting the best oranges for you, am I?

Input:

4

1, 3, 5

Output: None

The first line says: choose the best among 4. However, the second line shows only 3 oranges. Obviously we cannot choose any.

Input:

3

1 2 4 9

Output: 4, 9

Choose 4 from 1, 2 and 4. And then choose 9 from the next consecutive 3: 2, 4 and 9. And we are done!

Task: If we have a total of n oranges and we have to choose the best from each consecutive m, I am looking for a solution with better than O(mn) time complexity.
