Currency Arbitrage

11th JLTi Code Jam – Jan 2018

Here we revisit Manipulating Money Exchange problem where we tried to find currency arbitrage using Bellman-Ford at a time complexity of O(|V||E|).

In general, this kind of graph can be dense. Suppose, there are 4 currencies: USD, SGD, GBP and INR. Usually, a rate is given from each currency to all other currencies, resulting in |V|2 edges. Hence, Bellman-Ford ends up with O(|V||E|) = O(|V||V|2) = O(|V|3), that is quite expensive. Specially, when you consider the fact that apart from the few hundred fiat currencies, there are 1000+ cryptocurrencies out there.

Also we should not forget that currency exchange rate is not a factor of solely the currency itself, rather it is tied with an exchange. For example, suppose, Moneycorp exchange has a USD to SGD rate 1.4 while for HiFX it is 1.396 for the same. So we see, USD appearing twice in the graph – once as part of Moneycorp and again as part of HiFX.

However, computing shortest paths, a prerequisite for finding arbitrage, is something quite expensive. In this problem, we need to incrementally compute shortest paths when a new vertex, nth one arrives, assuming we have pre-computed results for (n-1) vertices that we can re-use.

To be more specific, suppose, at this moment, we have 12344 vertices and we already know whether there is an arbitrage, after computing the necessary shortest paths. And then, a new currency, JioCoin arrives with some new rates (from JioCoin to some existing currencies, say, JioCoin to INR and from some existing currencies to JioCoin, say, SGD to JioCoin). Now we have 12345 vertices. Computing shortest paths for a dense graph with 12345 vertices would take a very long time (try running 3 nested for loops, each looping 12345 times), doing billions of computations.

At this moment, would it be not wise to use the existing results for 12344 vertices? And then incrementally adjust the new shortest paths and compute some new ones? That is precisely, this problem is all about. We need to incrementally, adjust/add shortest paths once a new vertex arrives. And this is to be done at a time complexity of O(|V|2), something that is comfortably manageable. After this, we have to now say, whether an arbitrage exists.

Input:

1 USD = 1.380 SGD

1 SGD = 3.080 MYR

1 MYR = 15.120 INR

1 INR = 0.012 GBP

1 GBP = 1.30 USD

I CAD = 0.57 GBP

Explanation: Whenever a rate arrives, starting from the first, for each new vertex, we need to incrementally adjust/add shortest paths, find whether an arbitrage exists or not and output the same. We have 6 inputs here. Each time an input comes, we need to output and hence, we have 6 lines of output. The first 4 did not result in any arbitrage, we output “No luck here”. From 5th we have an arbitrage and we output the same.

Once an arbitrage is found, it is going to last. Note that, there might exist more than one arbitrage. Printing any one will do.

An important thing: rate between a certain currency pair will not appear twice in the input. Meaning once, GBP to USD rate arrives at line 5, a new rate between the two won’t arrive again.

Output:

No luck here

No luck here

No luck here

No luck here

USD -> SGD -> MYR -> INR -> GBP -> USD

USD -> SGD -> MYR -> INR -> GBP -> USD

Input:

1 USD = 1.38295 SGD

1 SGD = 3.08614 MYR

1 MYR = 15.0996 INR

1 INR = 0.0119755 GBP

1 GBP = 1.295 USD

Output:

No luck here

No luck here

No luck here

No luck here

No luck here

Task: For each line of input, for each new vertex, incrementally adjust/add shortest paths at a cost (time) of O(|V|2), detect the presence of an arbitrage and output as specified.

Index

Sprint Completion Time

10th JLTi Code Jam – Dec 2017

At the start of a sprint we are given a list of deliverables. The first thing in our mind is whether the team can deliver it in time. Thus estimating time to complete a sprint is something very important.

The first thing we do is, split the deliverables into a number of tasks, and estimate the time required to complete each of them. A task takes 3 days to complete means it takes 3 days for one person to complete; it cannot be split further to get 3 persons doing it on a single day.

While some tasks can be completed independently, others might be dependent tasks – meaning we cannot start them unless the prerequisite tasks are completed first. For example, work on a report cannot start until we are done with the database design/creation. Testing or deployment cannot be done unless we develop the solution. Suppose, completing task 1, and task 2 takes 4, and 6 days respectively and task 2 is dependent on task 1 – in other words – task 1 is a prerequisite for task 2. In this case, completing task 2 would take 10 days.

Finally, we don’t have an infinite number of people available. And for simplicity, assume each person is capable of doing any of the tasks.

Input:

3

4 3 2 1 4 6

1 2 4

2 3 4

4 3

5 6

6 3

Output: 12

Explanation:

The first line says the team has 3 persons. Second line lists the number of days required to complete each of the tasks. Here we have 6 numbers. It says we have 6 tasks – task 1 takes 4 days to complete, task 2 takes 3 days to complete and so on. The last, task 6, takes 6 days.

The subsequent lines list the dependent tasks. 1 2 4 means task 1 depends on tasks 2 and 4. 6 3 means task 3 is a prerequisite for task 6. No line starts with 3 means task 3 does not depend on any other task.

Task 3 can be completed in 2 days by one person. These first 2 days the other two persons have to sit idle as all other tasks are dependent tasks. After 2 days, task 2, 4 or 6 – all of which were dependent on task 3 can start. Each of the 3 persons can start any of them. Once task 6 is done task 5 can start. Similarly, when task 2 is done task 1 can start. We will see completing all of them takes 12 days.

Input:

2

4 3 6 2

1 2

2 3

3 1

Output: Infeasible

Explanation: We have 2 persons to complete 4 tasks – completing them take 4, 3, 6 and 2 days respectively.  However, we see that task 1 is dependent on task 2, task 2 is dependent on task 3 and task 3 is dependent back on task 1. While we can finish task 4 easily, we cannot start any of the first 3 tasks. They are dependent on each other and thus creating a dependency cycle.

Task: Manually calculating the minimum time required to complete the tasks is time consuming and prone to error, especially when we need to estimate this very often. Why not write a small program that can do it for us?

Index

RC Election Result

9th JLTi Code Jam – Nov 2017

Whole JLT is buzzing with Recreational Committee (RC) aka Fun Ministry election 2018. It is more palpable in JLTi where a fierce competition is taking place between two candidates representing Millennial Party and Traditional Party. In this two-party system, Millennial Party is claiming that they know the magic as to how people can be entertained while the Traditional Party cannot stop laughing at them saying they are just inexperienced kids incapable of running the massive Fun Ministry.

This time, voting mechanism has changed. Instead of one person one vote that was how it worked till last year, one person’s vote weight would now equal to the number of years he/she is working at JLT. For example, I am working here for 4 years and hence my vote would count as 4. If somebody is working for just 1 year, his/her vote weight would be 1. For obvious reason, Millennial Party is unhappy about this new legislation that was recently passed by the incumbent Traditional Party. They call it unfair. But law is law.

Voting stopped on 10th Nov 2017 and counting votes would commence on 13th Nov 2017 followed by the announcement of result on the same day.

Being a member of the existing RC, my concern is little different. I am worried about a tie, and if that happens, what would be the next course of action.

Hence, I am checking the possibility of a tie. Manually doing so is quite problematic, if not impossible, for several hundred employees that we have in Singapore. Being the only software engineer in the existing RC, I am tasked to write a program that would take vote weight of each of the voters as input and output whether a tie is possible.

Input:

20 10 4 6

Output: Possible

ExplanationVote weight 20 and 10 – sounds familiar? Anyway, we see that if the first voter votes for one candidate and the rests for another – a tie is inevitable.

Input:

8 7 2 5 16

Output: Not Possible

Explanation: As you see the total vote weight for the above 5 voters is 38. For a tie to happen each candidate should get a vote count of 19. However, we can see, no way a vote count of 19 is possible here.

Input:

5, 6, 7, 3

Output: Not Possible

Explanation: We see, the total vote weight for the above 4 voters is 21. An odd number cannot be divided by two.

Task: Given a list of vote weight, one for each voter, we need to find whether a tie is possible. We are assuming that all voters in the input would vote for sure.

Index

Solution – Choosing Oranges

38th Friday Fun Session – 3rd Nov 2017

Given a set of goodness scores of oranges and a window length, we need to find the highest scoring oranges within the window as we move it from left to the end.

This is the solution for JLTI Code Jam – Oct 2017.

Using priority queue

Suppose we have n scores and the window length is m. We can simply move the window from left to to right and take (consecutive) m scores within the window and each time compute the max of them, and output it, if it is already not outputted. Finding max from m scores would take O(m) and as we do it n times (n-m+1 times, to be precise), the complexity would be O(mn). However, it was expected that the complexity would be better than this.

Alternatively, we can use a max-heap, where we push each score as we encounter it. At the same time, we retrieve the top of the max-heap and if all is fine – output it. By if all is fine, we mean to say, we need to make sure that the orange has not been already outputted and that it belongs to the current window. At the same time, if the top item is out-dated, we can pop it, meaning take it out of the heap. Note that, max-heap is a data structure that retains the max element at the top.

Let us walk through an example

Let us take the first example as mentioned here. For the scores 1 3 5 7 3 5 9 1 2 5 with window size 5, let us walk through the process.

At first, we push the first 4 items (4 is one less than the window size 5). The max-heap would look like: 1 3 5 7 where 7 is the top element.

Then for each of the remaining items, we do the following:

  1. If the (new) item is greater than or equal to the top item in the max-heap, pop it (out) and push the new item into it. Output the new item (if the same orange is not already outputted). We do it because the new item is the max in the present window. And the existing top one is of no further use. We can now move to the next item.
  2. Keep on popping the top as long it is not one belonging to the current window. We do it, as we are interested to find the max within the window, not the out-dated ones those are no longer inside the window.
  3. Output the top (if the same orange is not already outputted). We do it as it the max within the current window.
  4. If the top item is the oldest (left-most/first/earliest/starting one) in the current window, pop it. We do it because this item is going to go out of the window as the next item gets in.

Score 3:

3 is not >= 7 (top in heap)

Existing top (7) is within the current window. Output it.

Push 3; max-heap looks like: 1 3 3 5 7

Score 5:

5 is not >= 7 (top in heap)

Existing top (7) is within the current window (3 5 7 3 5). But this orange is already outputted (we can use index of the item to track it, meaning instead of just pushing the score, retain the index along with it). No output this time.

Push 5; max-heap looks like: 1 3 3 5 5 7

Score 9:

9 >= 7 (top in heap)

Pop 7, push 9, output 9.

New max-heap: 1 3 3 5 5 9

Score 1:

1 is not >= 9 (top in heap)

Existing top (9) is within the current window. But this orange is already outputted. No output this time.

Push 1; max-heap looks like: 1 1 3 3 5 5 9

Score 2:

2 is not >= 9 (top in heap)

Existing top (9) is within the current window. But this orange is already outputted. No output this time.

Push 2; max-heap looks like: 1 1 2 3 3 5 5 9

Score 5:

Existing top (9) is within the current window. But this orange is already outputted. No output this time.

Push 5; max-heap looks like: 1 1 2 3 3 5 5 5 9

No item left. We are done!

Finally, output is 7, 9.

Complexity

If we closely observe, we see that the size of the max-heap would be always around m. Because, if the new item is greater or equal we are popping the top – hence the max-heap size is not increasing. If new item is smaller, we are pushing it and the size of the max-heap is increasing – true; but then soon the top would be out-dated and then we would pop that. So the max-heap size remains around m. Pushing (in) or popping (out) an item would cost log m, and since we would do it n times – the complexity would be O(n log m). Please note that getting the top of the max-heap costs O(1).

GitHub: Choosing Oranges

Index

Choosing Oranges

8th JLTi Code Jam – Oct 2017

Orange is one of my favourite fruits that I buy for our Friday Fun Session participants. How would you choose the good ones from hundreds of them; especially, on the way to office, when you stop by the supermarket, in the morning rush hour?

To speed up the selection while at the same time choosing the good – firm, smooth and heavier compare to its size – I have devised a selection process. I would go from left to right, scoring each of the oranges, in a scale from 0 to 9, 9 being the best; and once a row is done, I go to the next row and so on. As I go and score, I would also choose the best one among each consecutive, say 5 oranges.

How that would look like?

Input:

5

1 3 5 7 3 5 9 1 2 5

Output: 7, 9

Explanation:

The first line says: choose the best among consecutive 5. The second line shows the score for each of the 10 oranges. The first 5 are: 1, 3, 5, 7, and 3; best among them is 7. We choose 7. The next 5 are:  3, 5, 7, 3, and 5; best among them 7 – already chosen. Move on to the next 5: 5, 7, 3, 5, and 9; best among them 9, pick that. Move to the next 5: 7, 3, 5, 9, and 1; best among them is 9, already chosen. Next 5 are: 3, 5, 9, 1, and 2; once again the best among them 9 is already chosen. Final 5 are: 5, 9, 1, 2, and 5; same as before. We cannot move further as we don’t have 5 oranges after this point.

We end up with two oranges: 7 and 9. I am not doing a bad job of selecting the best oranges for you, am I?

Input:

4

1, 3, 5

Output: None

The first line says: choose the best among 4. However, the second line shows only 3 oranges. Obviously we cannot choose any.

Input:

3

1 2 4 9

Output: 4, 9

Choose 4 from 1, 2 and 4. And then choose 9 from the next consecutive 3: 2, 4 and 9. And we are done!

Task: If we have a total of n oranges and we got to choose the best from each consecutive m, I am looking for a solution having better than O(mn) time complexity.

Index

Pseudo-polynomial Complexity

33rd Friday Fun Session (Part 2) – 15th Sep 2017

The complexity for FaaS solution is O(n), where n is the largest day number. It looks like polynomial. However, it is actually pseudo-polynomial.

Size of input

Complexity is measured in terms of the size of input, say, in bits. Suppose, there are b bits in n. Then O(n) = O(2b) and hence, it is exponential.

Let’s assume n increases from 10 to 1125899906842624. More specifically, lunch schedule, as used in the previous example, changes from 1, 3, 4, 5, 6, 7, 10 to 1, 3, 4, 5, 6, 7, 1125899906842624. We still have the same 7 days to go for lunch. Yet, we are running 1,125,899,906,842,624 loops. In our layman understanding, the problem is still the same and should have taken the same amount of time to execute, and yet, for the latter, the algorithm takes way too long!

Spot a pseudo-polynomial

This is how we spot a pseudo-polynomial time algorithm. Ideally, we would like to express the complexity using the number of inputs; here, it should have been 7. But the above algorithm works in a way, where the complexity has been expressed in one of the numeric values of the input, the maximum value of the input – 1125899906842624, to be precise. This is where we are tricked into believing it to be a polynomial time algorithm, linear (polynomial) in the (max) numeric value of the input. But if we apply the definition of complexity that takes into consideration the size/length of the input, then it is actually exponential.

To be more specific, if we look at the input size, 4 bits are required to represent 10, while 50 bits are required to represent 1,125,899,906,842,624. Complexity has gone from O(24) = 10 loops to O(250) = 1,125,899,906,842,624 loops.

That is essentially exponential in the number of bits, meaning exponential in the size of the input but polynomial in the numeric value of the input. Algorithm with this kind of running time is called pseudo-polynomial.

Truly polynomial

At this point, you might wonder what is a truly polynomial time algorithm. For example, when we add n numbers using a loop running n times, we say, the complexity of it to be O(n). But here this n can also be written as 2b. So, shall we also say, adding n numbers is a pseudo-polynomial time algorithm?

Well, when we say, adding n numbers, we implicitly say, we want to find the sum of n 32 bit numbers/integers. Then the size of n numbers is 32 * n. Once again, the formal definition of complexity is defined in terms of input size, in bits. What is the input size here? The size here is 32n. The complexity is O(32n) and removing the constant terms it is O(n), a truly polynomial time algorithm.

Index

Solution – FaaS

33rd Friday Fun Session – 15th Sep 2017

Given a lunch schedule – a sequence of days when lunch is planned, and three price plans – daily, weekly and monthly, we want to get the cheapest lunch price.

This is the solution to JLTi Code Jam – Aug 2017 problem.

Let us walk through an example

Let us take an example as mentioned here: 1, 2, 4, 5, 17, 18. Since first day is 1 and last day is 18, it can be put under a month that consists of 20 consecutive days (not calendar month). We can use a monthly plan. But it would be too expensive (S$ 99.99) for just 6 days.

The days: 1, 2, 4 and 5 fall within a week that requires consecutive 5 days (not a calendar week). We have an option to buy a weekly plan for these 4 days that would cost S$ 27.99. However, that would be higher than had we bought day-wise for 4 days at a price of S$24.

Dynamic Programming

In general, at any given day, we have three options:

  1. We buy lunch for this day alone, using daily price S$ 6. Add that to the best price found for the previous day.
  2. We treat this as the last day of a week, if applicable, and buy a weekly plan at a cost of S$ 27.99. Add that to the best price for the day immediately prior to the first day of this week.
  3. We treat this as the last day of a month, if applicable, and buy a monthly plan at a cost of S$ 99.99. Add that to the best price for the day immediately prior to the first day of this month.

This is an optimization problem that can be solved with dynamic programming where we use the result of already solved sub-problems.

Bottom-up

We have two options: top-down and bottom-up. We realize that, at the end, all the sub-problems (for each of the days) have to be solved. We also find that it is easy to visualize the problem bottom-up. And if we do use bottom-up then the required space would be limited by the last day number.

Hence, we will solve it using bottom-up dynamic programming.

Blue colored days are when lunch is scheduled.

DP table1.png

On day 1:

Cost S$ 6.

On day 2:

Daily basis: S$ 6 + price at day 1 = S$ 12

Weekly basis: S$ 27.99

Monthly basis: S$ 99.99

Best price: S$ 12

On day 3:

No lunch schedule, cost of previous day S$ 12 is its cost.

On day 4:

Daily basis: S$ 6 + price at day 3 = S$ 18

Weekly basis: S$ 27.99

Monthly basis: S$ 99.99

Best price: S$ 18

On day 5:

Daily basis: S$ 6 + price at day 4 = S$ 24

Weekly basis: S$ 27.99

Monthly basis: S$ 99.99

Best price: S$ 24

From day 6 to day 16:

No lunch schedule, cost of previous day will be carried forward: S$ 24.

On day 17:

Daily basis: S$ 6 + price at day 16 = S$ 30

Weekly basis: S$ 27.99 + price at day 12 = S$ 51.99

Monthly basis: S$ 99.99

Best price: S$ 30

On day 18:

Daily basis: S$ 6 + price at day 17 = S$ 36

Weekly basis: S$ 27.99 + price at day 13 = S$ 51.99

Monthly basis: S$ 99.99

Best price: S$ 36

Finally, the best price is S$ 36.

Another example

Let us work with another example: 1, 3, 4, 5, 6, 7, 10.

DP table2

On day 7:

Daily basis: S$ 6 + price at day 6 = S$ 36

Weekly basis: S$ 27.99 + price at day 2 = S$ 33.99

Monthly basis: S$ 99.99

Best price: S$ 33.99

Finally, the best price at the end is S$ 39.99.

Complexity

The complexity is O(n), where n is the largest day number. It is a pseudo-polynomial time algorithm.

GitHub: FaaS

Index