Synchronizing Web System

22nd Friday Fun Session (Part 3) – 16th Jun 2017

When we have to implement an idempotent operation in web application, where the respective method execution is not idempotent when executed simultaneously, it is essential that we use concurrency control. Here we focus on .NET application deployed in IIS.

Idempotent operation

Suppose user requests to approve a certain transaction, say, transaction Id 100, hence it is data specific. We need to make sure this is an idempotent operation, meaning, no matter how many times the request comes, simultaneously or serially, the end result should be the same.

Concurrency and thread synchronization

If more than one request comes, from the same user or from multiple users simultaneously, we have seen IIS might launch multiple threads, one for each request. It is possible that they are executed simultaneously.

Let’s take a closer look. Suppose the two requests were initiated at 12:00:00:000 PM. The requests ended up in IIS at 12:00:01:000 PM, two threads starts processing them at the same time at 12:00:01:001 PM. Both the threads find that the transaction has not been approved yet, and proceed to do the same thing, like send an email, make some database entries etc. At the end both mark that the transaction is approved.

So we see that, what was supposed to be an idempotent operation has ended up sending two emails and so on, clearly failing to be one. If we don’t have any concurrency control in place, nothing is stopping from sending two emails from the two threads. After all, both the threads are executing simultaneously checking, finding and doing the same thing, at the exact same time.

Concurrency control is essential here. That means, at the beginning we need to place a gate through which only one thread can enter, at a time. Once it enters, it should mark that it is approving a certain transaction. Then all other threads who would enter the gate later serially (one after another), can detect that somebody else is processing that exact same request and quit gracefully.

Intra-process synchronization

We will walk through some options, even though they would not qualify to be the desired solution. We would do so just to explain the issues around the options, so that we can rationalize the final solution(s).

Let us start with lock, provided by C#. lock can implement critical section, meaning it can make a certain portion of the code executable only by one thread at a time. So the transaction processing method can be wrapped using a lock. That will make sure only one thread of a process is executing the method at any given point of time.

However, we have two issues here:

First, we are locking the whole method. We wanted to make sure only transaction Id 100 gets approved only by one thread at any point of time. But we end up blocking approval for all other transactions, say, transaction Id 99 or 101, when approval for transaction Id 100 is going on.

We can solve this issue by implementing a named lock, meaning the lock will have a name, say, TranApproval_100. That way, only threads executing approval requests for the same transaction will be executed serially. Instead of using lock, we can achieve the same using interned string, named mutex etcas well.

Second, the scope of the lock is within the process. Nobody outside the process knows about it. However, we know that in web garden configuration, there might be more than one process running for the same web application and the two threads can come from two different processes. In that case, the thread in the first process would not know that the thread in the second process is having a lock. Hence both the threads would happily execute the same method simultaneously. We can solve this problem by using named mutex, an inter-process synchronization mechanism.

Inter-process synchronization

We see that using a named mutex, say, the name of the mutex is TranApproval_100, we can make sure we don’t block approval for other transactions. Meaning only thread approving transaction Id 100 will execute, without blocking approval for, say, transaction Id 99 or 101.

We also see that between the two threads running in two processes in web garden configuration, only one would do the approval and other would quit. This is because the named mutex is visible to all processes throughout the operating system.

However, we also know that in web farm configuration, the two processes, executing the two threads, both intending to serve the approval processing for transaction Id 100, might be running in two different web servers. In such a case, this inter-process synchronization mechanism that we just explored, would not work.

External and Common

So we see that we have to use something that is both external and common to web servers. If we have one common database that is used by all the processes then we can use something from the database to make sure that the threads are processing the requests one by one.

Using transaction within application

Using the concurrency control support provided by database, we can make sure only one method is doing the database operations, at any point of time. If we are using ORM, say Entity Framework, then we can use the transaction support provided from version 6 onwards (Database.BeginTransaction(), Commit, Rollback etc.). Since we want them execute serially, we know we have to use serializable isolation level System.Data.IsolationLevel.Serializable. We can begin the transaction with this isolation level parameter.

There are two problems associated with this. First, we are again serializing the whole transaction approval process, meaning blocking approval for transaction Id 101, while processing the same for transaction Id 100.

Second, we cannot stop non-database operations like sending email etc.

Named synchronization using database

The first problem can be solved by, as we have seen previously, using a named synchronization mechanism. The second problem can also be solved using the same, but we need to make sure we are using the synchronization mechanism to serialize the whole method, not just the database transaction.

Named lock table

We first create the following table that will keep track of the transactions, presently under processing.

CREATE TABLE [Web].[LockInfo]
(
  [LockInfoId] [int] IDENTITY(1,1) NOT NULL, 
  [LockName] [nvarchar](256) NULL, 
  [DurationInSeconds] [int] NULL, 
  [CreatedOn] [datetime] NOT NULL, 
  CONSTRAINT [PK_LockInfoId] PRIMARY KEY CLUSTERED 
  ( 
    [LockInfoId] ASC
  )
  WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, 
  IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) 
  ON [PRIMARY]
) ON [PRIMARY]

GO

ALTER TABLE [Web].[LockInfo] ADD  DEFAULT ((60)) FOR [DurationInSeconds]
Lock name

LockName is the name of the lock, like TranAproval_100.

Duration

We should not hold the lock forever. If after acquiring the lock things go wrong, for example, the caller somehow misses to unlock it or there is a crash, this resource cannot be locked again by any thread in future. The created date along with the default duration of 1 minute would make sure, after a minute this lock becomes invalid. This is based on the assumption that approval processing would be done within a minute.

Using stored procedure

We create the following stored procedure.

CREATE Procedure [Web].[usp_Lock] 
  (@Lock BIT, @LockName NVARCHAR(256), @LockDuration INT = NULL) 
  AS 
    BEGIN SET TRANSACTION ISOLATION LEVEL SERIALIZABLE 
    BEGIN TRY 
      BEGIN TRANSACTION 

      DECLARE @Success AS BIT 
      SET @Success = 0

      IF(@Lock IS NULL OR @LockName IS NULL) 
      BEGIN  
        SELECT @Success AS Success; 
        ROLLBACK TRANSACTION
        RETURN; 
      END 

      IF(@Lock = 1) -- LOCK
      BEGIN 
        DELETE FROM [Web].[LockInfo] 
        WHERE @LockName = [LockName] AND 
        DATEADD(SECOND, [DurationInSeconds], [CreatedOn]) < GETUTCDATE();

        IF NOT EXISTS (
                       SELECT * FROM [Web].[LockInfo] 
                       WHERE @LockName = [LockName]) 
        BEGIN 
          IF(@LockDuration IS NULL) 
            INSERT INTO [Web].[LockInfo]  
            ([LockName], [CreatedOn]) 
            VALUES(@LockName, GETUTCDATE()) 
          ELSE 
            INSERT INTO [Web].[LockInfo]  
            ([LockName], [DurationInSeconds], [CreatedOn]) 
            VALUES(@LockName, @LockDuration, GETUTCDATE())
          
          SET @Success = 1 
        END 
      END 
      
      ELSE -- UNLOCK
      BEGIN 
        DELETE FROM [Web].[LockInfo] 
        WHERE @LockName = [LockName] 
        SET @Success = 1 
      END 
      
      COMMIT TRANSACTION 
    END TRY
  
    BEGIN CATCH 
      SET @Success = 0 
      IF(@@TRANCOUNT > 0) 
        ROLLBACK TRANSACTION
    END CATCH;
 
    SELECT @Success AS Success; 
  END

Lock and unlock

This stored procedure can be used for both lock and unlock operation of a named resource. The lock can be acquired for a given interval. In absence of any duration parameter, default duration of 1 minute will be used.

Leveraging isolation level

We make sure that this stored procedure can be executed serially, thus making sure only one transaction execute, at any point of time.

Here we focus on making the lock/unlock function faster, compromising concurrency by using the highest isolation level.

Leveraging consistency

We could have used unique constraint on lock name column in the Web.LockInfo table. In that case, we could use the default READ COMMITTED isolation of MS SQL Server, in the stored procedure, increasing concurrency. If a second thread would want to take the lock on the same resource, it would try to insert a row with the same lock name, resulting in failure, due to the unique constraint checking.

Performance

The two have different performance implications that we explain using an example.

Suppose two threads started executing simultaneously. The faster stored procedure takes one millisecond to execute. The first thread takes one millisecond to lock and then 100 milliseconds to execute the main approval processing, taking a total of 101 milliseconds.

The second stored procedure would wait one millisecond for the first stored procedure to lock. Then it would take one more millisecond to check that it is already locked, and hence it would take a total of 2 milliseconds to quit the processing.

On the other hand, suppose the stored procedure using default isolation level and unique constraint takes 3 milliseconds to lock. The first thread would take 103 milliseconds to execute.

The second thread that starts at the same time would take 3 milliseconds to fail while acquiring the lock and quit.

Based on the load and usage pattern, an appropriate mechanism can be chosen.

Usage for general purpose locking

This mechanism can be used not only for synchronizing transaction approval but all other cases that use named resources. Based on the load, more than one lock table (and/or stored procedure) can be used, each supporting certain modules.

GitHub: Named Resource Lock

Index

Interpreting IIS Internals

22nd Friday Fun Session (Part 2) – 16th Jun 2017

We are trying to understand how an HTTP request is processed by .NET web application, hosted in IIS in various scenarios with a focus on synchronization of the processing of that request. To be precise, we are interested in the thread and process contexts involved while serving an HTTP request.

Client request

Multiple clients across the globe, using their respective browsers, sending HTTP request to the web server.

Web server

All these http requests are ending up in IIS web server, hosting the application.

IIS kernel mode

In web server, HTTP listener, a kernel mode device driver, part of network subsystem, and part of the IIS – kernel mode of IIS to be precise, called http protocol stack (Http.sys), listens for http requests.

HTTPS.sys, as a forwarder, might directly pass the request to the right worker process, or as a request queuer, queues it unless a worker process picks it up. Once the response of that request reaches to it, it returns that back to client browser. Also as a kernel level cacher, it does some kernel level caching and if possible, returns the cached output directly, without involving any user level processing.

Worker process

Worker process, w3wp.exe, an executable, a process to OS, runs inside IIS user mode, is little different than other processes in the operating system (OS), in the sense that it can contain multiple application domains.

Application domain

Application domain represented by AppDomain object, a .NET concept, is a virtual process within a process. As said, it runs within a worker process. One application domain does not share its static variables etc. with another application domain running within the same worker process.

Usually, all static variables etc. of a process are visible to all within a process. It is not the case for worker process. This is how worker process is a special process, where one or more application domains are running inside it, as if each of them is a separate process, providing isolation. So what usually we are used to thinking as per process, in the world of IIS, inside worker process, it is actually per application domain.

Why Application domain, you might ask. Well, a web server can have many applications. Creating one worker process for each of them will end up creating many processes that is quite expensive. Creating an application domain for each of them and putting them together inside a single process is much cheaper.

Note that, one of the application domains can be stopped without affecting other application domains inside a worker process.

Worker thread

When worker process receives a request, it uses worker thread to process that request. If two users send two requests to the same web application, both of them can be simultaneously executed by two worker threads and served.

At any point of time, a worker thread can only be executed within a single application domain. An application domain can have multiple worker threads. But these worker threads are not confined to a single application domain. Rather they belong to worker process. When a thread is done with serving a request for a particular application domain, it is freed. At a later point of time, the same thread can be used to serve another request, belonging to a different application domain.

Web application

We develop web application. We are interested to know how this ending up running in IIS environment. Application domains are associated with web application. One web application has typically, one application domain running inside a worker process. One worker process might be running many application domains, each supporting a separate web application.

Application pool

Application pool is the container for (web) applications. Every application has to be assigned to a single application pool. A number of web applications can be assigned to a single application pool. But as mentioned earlier, a single application cannot be assigned to multiple application pools. All applications belonging to an application pool share the same configuration.

Worker process runs inside the application pool.

Application pool can be recycled, restarted. Applications belonging to other application pool are not affected by this. Thus application pool provides isolation.

So we see that a number of applications are contained within an application pool. And then a worker process running inside an application pool is again running a number of application domains, each application domain serving a different web application. Thus, we have two different levels of isolation.

Web garden

How many worker processes can be there inside an application pool? Well, it is configurable. By default, it is only one worker process running inside an application pool, supporting multiple web applications, by creating one application domain for each of the applications. And all these application domains are running as separate processes within that single worker process.

However, application pool can be configured to run more than one worker processes. In that case, it is called a web garden. In this situation, multiple worker processes can be running in a single application pool. Each of these worker processes, once again running multiple application domains, each belonging to one application.

In this scenario, each of the worker processes can have its own application domain for the same application. In other words, for a certain web application, we can have multiple application domains, each running in a separate worker process, all in the same application pool. To be precise, one application or web site can have multiple instances running in a single web server, if web garden is enabled.

This is important as it renders uses of static variables, application-wide variables etc. problematic in web application.

Web farm

When one web server is not enough to serve the clients requests, we need more of them to host the same application/web site. We call it web farm.

A load balancer would sit in front of the web servers and its IP will be exposed to external world. HTTP requests will come to load balancer first and it will then distribute the load to different web servers.

Individual web server can share the same database or replicated/partitioned database.

In a nutshell

Single server, application pool running one worker process

So we see that, multiple https requests for the same web application would be simultaneously served by multiple threads. Those threads can be executed within a single application domain belonging to a single worker process. This happens when only one worker process is set to run for an application pool.

Simple.png

In the above image, we see IIS having two parts – system and user mode. HTTP.sys is in kernel mode, forwarding HTTP request to 3rd application pool, belonging to application X. We further see that a single worker process inside that 3rd application pool is running two application domains X and Y. Two threads within application domain X – Thread 1 and Thread 2 are serving the two requests, respectively. The response will go back to client browser through HTTP.sys.

Single web server, application pool running more than one worker process, called web garden

Or the threads can come from different application domains associated with the same web application or web site, running inside different worker processes, all contained within the same application pool. This can happen in web garden configuration, where multiple worker processes are allowed to execute within a single application pool. We can understand any locking mechanism that works within a single process would not work in this setup. We would need to implement an inter-process synchronization mechanism, if our application is deployed in web garden.

Web Garden

In the above image, showing web garden, two requests are being served by two worker threads, belonging to two application domains (both associated with same web application), each running in a separate worker process, both of them (worker processes) contained within the same application pool.

Multiple web servers behind load balancer, called web farm

Or the threads can come from different physical web servers. This can happen in a web farm scenario, where multiple web servers sit behind a load balancer. We can understand that an inter-process synchronization mechanism, which works across the processes within an OS, would not work here. Since we have multiple web servers here, each running its own OS, inter-process synchronization mechanism would not work for application-wide synchronization.

Web Farm.png

In the above image, showing web farm, two requests are being served by two worker threads, each running in a separate web server.

Index

Incision into Isolation Levels

22nd Friday Fun Session (Part 1) – 16th Jun 2017

We are trying to see how isolation level, serializable to be precise, can help us implementing a synchronization mechanism for web application.

Let us start with ACID

ACID stands for Atomicity, Consistency, Isolation and Durability. It is detailed in ISO standard. Database systems implement this so that a sequence of operations, called as transaction, can be perceived as a single logical operation.

Atomicity

All operations of a transaction are all done or nothing done. Logging with undo capability can be used to achieve this.

Consistency

Given that all database constraints (foreign key, unique etc.) are valid at the beginning, the same should be maintained, at the end of the transaction as well.

Durability

All changes done by a committed transaction must go to storage even if database system crashes in the middle. Logging with redo capability can be used to achieve this.

Why Logging?

We talked about logging and then redo/undo in the previous sections. Why Logging? Well, when some transactions changes data, they are not immediately written to disk. Rather those pages are marked as dirty. Lazy writing flushes them to disk later. Instant writing to disk is expensive. Instead, logging the operation that is directly written to disk immediately, is much cheaper.

However, performance, while important is not a must. Logging is essential to ensure atomicity and durability. Any modification must be written to log before applying to actual database. This is known as write-ahead logging (WAL) This is to make sure that in case of a crash (say, 2 out 5 operations of a transaction are written to database storage and then it crashes), system can come back, read the log and figure out what was supposed to be done and what was not supposed to be done. By redoing and undoing necessary operations, durability and atomicity is ensured.

Focus on the I of ACID

Today we focus on the I of ACID, called isolation. When we are writing a transaction, we write the operations inside it thinking nobody else is doing anything else to the data that we are dealing with. Isolation property defines such an environment and database systems implements that.

So, why do we need such an environment? Well, without this, in a highly concurrent transaction execution environment, our understanding of the data we are working with will not hold true, as other would change them simultaneously. It will happen largely due to three problems: dirty read, non-repeatable read, and phantom read.

However, creating such an isolated environment is expensive in terms of performance. Hence, a number of other isolation levels are introduced, giving various degrees of isolation rather than a complete isolation.

The ISO standard defines the following Isolation levels that we will describe in terms of two transactions T1 and T2 that executes in parallel.

Read Uncommitted

Transaction 1 (T1) updates salary for Joe
Transaction 2 (T2) reads updated salary for Joe
T1 aborts transaction

We see that, T2 read dirty (because T1 did not commit the updated salary) data and went ahead with his decisions/operations inside it based on it, that was of course a wrong thing it did.

As the name implies Read Uncommitted reads uncommitted data, also called dirty data that is wrong. So we see, this isolation level does not guarantee isolation property and it is an example of a weaker isolation level. Note that along with dirty read, it also has non-repeatable read and phantom read problems.

Read Committed

The next better isolation level, as the name Read Committed implies, reads only committed data and solves the dirty read problem encountered previously in Read Uncommitted isolation level. Let us see through an example. Now T1 is running in Read Committed isolation level.

T1 reads the salary of Joe
T2 updates the salary of Joe and commits
T1 reads the salary of Joe

So we see T1 reads the salary of Joe twice, and it is different in the two cases. In the second case, it reads the data that was modified and committed by T2. No more dirty read by T1. Good.

But the isolation property expects each transaction to happen in complete isolation, meaning it would assume it is the only transaction that is taking place now. Joe’s salary was not updated by T1. Then why should T1 see different data when it reads the second time?

So we see, T1 could not repeat a read (the same salary for Joe). Hence, this problem is called non-repeatable read. Read Committed, like Read Uncommitted is another weaker isolation level. Again, note that, along with non-repeatable read it also has the phantom read problem.

Repeatable Read

To solve the non-repeatable read problem, Repeatable Read isolation level comes into picture. Since T1 reads the salary of Joe, no other transaction should be able to modify Joe’s data if we run T1 in Repeatable Read isolation level.

If we repeat the previous transactions we did earlier we would see T2 waits for T1 to finish first. Because T1 would use the right locks on the rows it reads so that others cannot delete/modify it. Repeatable read is solving the non-repeatable read problem as the name implies.

However, that won’t stop new data insertion. After all, Repeatable Read put necessary locks only on the data that it has read, not on future data. Hence, we will see ghost/phantom data. Let’s see an example.

T1 reads 4 rows in employee table
T2 inserts one record in employee table and commits
T1 reads 5 rows in employee table

We see that T1 sees a phantom row (the newly inserted row by T2) in its second read of employee table. Repeatable Read, once again, another weaker isolation level.

Serializable

So far, we see different isolation levels providing different degrees of isolation level but not what I of ACID really defines as isolation. We also know that weaker isolation levels are introduced to avoid the performance penalty that occurs for executing transactions in complete isolation. But at times, it becomes an absolute necessity to execute transaction in full isolation. Serializable comes into picture to implement that complete isolation. In serializable isolation level, it is ensured that we get the effect as if all transactions are happened one after another, in the order they started.

So if we rerun the earlier two transactions, we would see T2 waiting for T1 to complete first. Hence both the reads of T1 would read 4 rows. Only after T1 is done that T2 would insert a new row.

This solves all the three problems: dirty read, non-repeatable read and phantom reads.

At this point, it can be mentioned here that ISO standard expects serializable, not serial. The end result of a serializable execution is to produce a result equivalent to executing them one after another. Serializable does not necessarily executing transaction one after another, just that the end result is the same, had they executed serially.

MS SQL Server implementation

With Serializable, we are done with the 4 ISO transaction isolation levels. MS SQL Server implements all of them. In addition, it implements a fifth one, called Snapshot.

Snapshot

It is an isolation level that solves all the three problems just like serializable. So, why do we have two isolation levels doing the same thing? What special thing snapshot is doing?

If we closely observe the earlier serializable isolation level, implemented using locks, we see that it is too pessimistic. T2 has to wait for T1 to finish. But T2 could be simultaneously executed. After all, T1 is only reading, not modifying any data.

Snapshot comes into picture with optimistic concurrency control. It uses multiversion concurrency control (MVCC) to implement this. Every transaction starts with the latest committed copy it sees and keeps on executing the operations inside it.

So, for the last example we saw, in snapshot, T1 would read 4 rows in both the reads. After all, it had its own private copy. On the other hand T2 would start with its own copy, add a row in the middle. At the end, it would see there was no conflict. This is because no other transaction, T1 in this case, did anything conflicting. So, two transactions are simultaneously executed without violating the requirements of a serializable solation level.

What would happen if both T1 and T2 modify the same data, creating a conflicting situation? Well, both the transaction started with its own copy hoping that at the end there would no conflict, hence it is called optimistic. But if there is a conflict, the one committed first would win. The other would fail and rollback.

By the way, even though it is called serializable, write skew, anomaly is still present and it cannot be called serializable in ISO definition.

Database needs to be configured, which at times takes a while, to use this isolation. Again, since each transaction uses its own private copy, it is resource intensive.

At a glance

Isolation levels

Default transaction isolation level for MS SQL Server

Read Committed is the default isolation level set for MS SQL Server. Keeping performance in mind, it is done this way. So all along if you had thought, by default, you were getting the I of ACID by SQL Server, you are wrong. You are living with non-repeatable read and phantom read unless you have explicitly changed the isolation level or used locks.

How to set isolation level in MS SQL Server?

We can set one isolation level at a time using the following command:

SET TRANSACTION ISOLATION LEVEL 
   { READ UNCOMMITTED 
   | READ COMMITTED 
   | REPEATABLE READ 
   | SNAPSHOT 
   | SERIALIZABLE 
   } 
[ ; ]

As mentioned earlier, to set snapshot isolation level some database specific configuration is required before executing the above command.

How long isolation level remains active?

Once set, it lasts for the session, the duration of which is largely controlled by the component that creates it. When another session starts, it starts with the default Read Committed.

Transaction

It is obvious yet important to remember, isolation levels works on transaction. After all, it is called transaction isolation level. If we want to isolate (using isolation level) the execution of a set operation as a single logical operation, then they have to be wrapped with Begin and Commit transaction.

Index

Maximum Subarray Problem

21st Friday Fun Session – 9th Jun 2017

Maximum subarray finds the contiguous subarray within a one-dimensional array having the largest sum.

Visualizing the divide and conquer solution

For the time being, let us forget about maximum subarray problem and focus on the divide and conquer solution that we discussed in the previous session.

If we visualize the tree, we see that from the left subtree the smallest value is propagated upwards.  On the way up, it is treated as the buy value and the right side values are treated as sell values. This way profits are calculated and maximum among them is retained. So we see two themes of processing as we go from left to right of the array:

  1. Retain the minimum value and treat it as the buy value.
  2. Calculate profit by treating each value seen as we go right and retain the maximum profit.

DP1.png

The above table shows day number in first row and the corresponding stock prices in second row. Third row shows the minimum value seen so far. The fourth row shows the profit had we sold on this day, buy price being the minimum value seen so far (shown in green).

The intuition

The intuition being, when we see a new lower value than the one already seen, we treat that as the new buy value. For example, when we see the new lower value 1 on day 5, onward we treat that as the new buy value and calculate profits considering each of the following days as sell days. This is because the new lower value (lowest till now) would give a better profit when the following days are treated as potential sell days. By treating the previous lower value 2 that was found on day 1, we already considered all possible profits prior to 5th day and retained the best among them. On 5th day, the utility of the previous lower value, which is 2, stops.

From divide and conquer to dynamic programming

Now let us now consider the dynamic programming (DP) point of view. In dynamic programming we make use of the result of an already solved overlapping subproblem.

On the first day, we can buy but cannot sell. After all, no profit would be made selling on the first day with the same price as the buy price. Also note that we have to buy and only then we can sell. So on day 1, profit is 0. Now if we want to find the best profit on day 2, can we use the solution of the previously solved overlapping subproblem? What is that already solved overlapping subproblem at day 2? Well, it is the best profit found for day 1, which is 0. How can we make use of the previous solution to find the best profit at day 2? Well, we have to consider two things:

  1. If we have to make the most profit by selling today, then we have to buy using the lowest price seen so far.
  2. If the profit calculated above is better than the best seen on previous day, then this is the new best. Else previous day’s best is still the best for today.

For example, on day 2 we realize that we can make a profit of (8-0) = 8 and it is better than the profit at day 1, which is 0. Hence, the best profit for day 2 is updated to 8. On day 3, we find we can make a profit of 3 but the best profit till day 2 is better than this. So, we retain day 2’s best profit as day 3 best profit.

So we realize, what we found by visualizing and transforming the divide and conquer solution is nothing but this dynamic programming. In fact, this is possibly one of the simplest forms of dynamic programming.

The below code would find the solution. For brevity buy day and sell day is not tracked that is easy to accommodate.

void StockDpN(double price[], int n, double &maxProfit)
{
  double minPriceSoFar = price[0]; 
  maxProfit = 0;
  
  for(int i=1; i<n; i++)  
  { 
    if(price[i] - minPriceSoFar > maxProfit) 
      maxProfit = price[i] - minPriceSoFar;

    if(price[i] < minPriceSoFar) 
     minPriceSoFar = price[i]; 
  }
}

The reverse can also be used. If we start from right and move leftwards, we have to keep track of the maximum value seen so far and that is the sell value. As we go left, we see new values and they are buy values. The associated code is not shown here.

Moving to maximum subarray problem

Suppose we buy a stock at one day and then sell it on the following day. For example, buy at day 1 and then sell on day 2. Buy at day 2 and then sell on day 3 and so on. Each day we make a profit, incur a loss and sometimes it is neutral, meaning no profit or loss (buy value and sell value being the same). The third row of the below table shows the same (loss shown in red).

MSData.png

The optimal solution to our stock profit problem with our example set is to buy on day 1 at price 2 and sell it on day 4 at price 12, thus making a profit of 10. It is the same as saying:

  1. We buy at day 1 and sell at day 2 making profit 8 and then
  2. Buy at day 2 and sell at day 3 making loss 5 and then
  3. Buy at 3 and sell at day 4 making profit 7 and then
  4. Add all profits/losses made in our buy/sell operations that started by buying on day 1 and ended by selling on day 4. The final and best profit is: 8 + (-5) + 7 = 10.

Thus we have transformed the previous stock profit problem to a maximum subarray problem. As explained earlier, we are interested to find contiguous portion of array that gives the maximum sum. In the above 8 values that we have, we got two such subarrays each giving a sum of 10. They are showed in colored boxes.

Kadane’s algorithm

Kadane’s algorithm also deploys DP to solve this. Once again in DP, we have to make use of already solved overlapping subproblems. Here it is done by this way:

  1. Maximum subarray ending in position i+1 includes already solved maximum subarray ending at i, if doing so increases the sum for subarray ending at i+1
  2. Else maximum subarray ending in position i+1 will only have itself.

MSDP

Maximum subarray at day 1: day 1 value which is 0.

Maximum subarray at day 2: since adding the subarray sum for day 1, which is 0, is not increasing the sum for day 2, maximum subarray at day 2 will have only day 2 value itself, meaning 8.

Maximum subarray at day 3: subarray sum at day 2 is positive, which is 8, and helping day 3, so subarray at day 3 includes day 2. Subarray sum at day 3 = 8 + (-5) = 3.

It boils down to a simple thing. If the previous sum is positive then take it forward else not. The red color in the Maximum subarray sum row (4th row) shows the cases where it does not include the (immediately) prior subarray. In two cases it happens (8 at day 2 and 2 at day 6) because the prior sums (0 and -1 respectively) are not more than zero.

The code shown below implements this. Note that the input array profit contains the profit and loss unlike the earlier DP function where we passed the stock prices. It is also noteworthy that if all values are positive then the whole array is the maximum subarray. After all, adding all of them would give the highest sum.

void StockKadaneDpN(double profit[], int n, double &maxProfit)
{  
  double curProfit = 0; maxProfit = 0;
  
  for(int i=1; i<n; i++) 
  { 
    curProfit = curProfit > 0 ? curProfit + profit[i] : profit[i]; 
    if(curProfit > maxProfit) 
      maxProfit = curProfit; 
  }
}

If we observe closely, we see that this DP is essentially the same as the one we discussed earlier in this post.

Backtrace

At the end, when we find the maximum subarray sum 10 at day 4, we will do what is called backtrace, typical of DP to find the path, in this case, the maximum subarray. We know that at day 4, we included the subarray ending at day 3. At day 3, we included the subarray ending at day 2. At day 2, we did not include prior subarray. So the maximum subarray starts at day 2 and ends at day 4. It could be easily tracked/stored as we went ahead in the computation using appropriate data structure and would not require a come back.

Map maximum subarray solution to stock profit

If we want to map this solution back to our stock profit problem, then we know the profit at start day of the maximum subarray, that is day 2, is essentially found by buying stock at the previous day that is day 1. So the solution is: buy at day 1 and sell at the last day of the maximum subarray that is day 4. And the profit would be the maximum subarray sum that is 10.

The transformations

This is an interesting problem to observe as we started with a O(n^2) brute force accumulator pattern, moved to O(n log n) divide and conquer that we optimized later to O(n). Finally, we transformed that to a O(n) DP solution only to find that it is interchangeable to O(n) maximum subarray problem that is also a DP solution.

Can we do better than O(n)? Well, that is not possible. After all, we cannot decide the best solution unless we read all the data at least once. Reading the data once is already O(n).

Where is pattern recognition here?

Maximum subarray essentially gives the brightest spot in a one-dimensional array. Finding this brightest spot is one kind of pattern recognition. Note that we just solved a problem that reads like this: given the profit/ loss made by a company over the period find the longest duration(s) when the company performed the best. The answer here is: from day 2 to day 4 or from day 6 to day 7.

Even though we focused on finding the single brightest spot, it is also possible to find, k brightest spots.

Again, maximum subarray considers only one dimension. In real life, data sets typically contain more than one dimension. For example, a problem involving two dimensions might read like: can you find the largest segment of the customers buying product x based on age and income? A potential answer might be: customer from age 30 to 40 years with income range $3000 – $6000. There are other algorithms to deal with multi-dimensional data.

GitHub: Stock Profit Kadane Code

Index

Manipulating Money Exchange

4th JLTi Code Jam – Jun 2017

Input:

1 USD = 1.380 SGD

1 SGD = 3.080 MYR

1 MYR = 15.120 INR

1 INR = 0.012 GBP

1 GBP = 1.30 USD

I CAD = 0.57 GBP

Explanation: Now that we realize, we have to wait substantial amount of time to make any meaningful gain from stock market, we change our focus to money exchange. I am particularly very excited after collecting the above exchange rates using Google search today at 8th Jun 2017. If I start with 10, 000 USD, then convert them to MYR, then convert them to INR and then to GBP and then back to USD, I realize I will end up with 10, 025.50 USD, making USD 25.50 on the same spot within minutes.

Output: USD -> SGD -> MYR -> INR -> GBP -> USD

The next step that I am going to take is to get a list of all money exchanges available to me, somehow collect their exchange rates daily, and push it to a program that will tell me when it sees there is a chance to make some profit.

I might not always be lucky. To check whether my adrenaline rush is justified I looked at the same rates in an online money exchange. The rates from it look like below:

Input:

1 USD = 1.38295 SGD

1 SGD = 3.08614 MYR

1 MYR = 15.0996 INR

1 INR = 0.0119755 GBP

1 GBP = 1.295 USD

As you realize, we will end up losing money.

Output: No luck here

Task: I also realize, given a few hundred currencies and thousands of exchange rates among them, there is a possibility of having a number of ways we can make money. For example, given a set of rates (not as shown in the above example), we could make USD 10, starting with USD 10, 000, using this route: USD -> SGD -> MYR -> USD. Again, we could possibly make SGD 50, starting with SGD 10, 000, using another route: SGD -> MYR -> INR -> SGD. I am happy with just one such route, not necessarily the one providing the most profit.

Index

Solution – Making Money at Stock Market

20th Friday Fun Session – 2nd Jun 2017

Given the stock prices for a number of days, in order, we have to buy one stock at one day and then sell it on a later day to maximize the profit.

This is the solution to JLTi Code Jam – May 2017  problem.

Let us walk through an example

Suppose, we have 8 days’ stock prices, starting at day 1, in order and they are 2, 10, 5, 12, 1, 3, 11, and 9 respectively. We can clearly see, if we buy a stock at day 1, at a price of 2 and then sell it on day 4, at a price of 12; we can make a profit of 10. We could make the same profit had we bought it on day 5, at a price of 1 and then sold it on day 7, at a price of 11. Since we want to make the most profit only once, we would choose say, the first one.

Accumulator pattern

We could simply run two loops, consider all possible buys and sells, calculate all possible profits (like buy on day 1 and sell it on day 2 with profit 8, buy on day 1 and sell it on day 3 with profit 3, and so on) and find the maximum profit. The below code using accumulator pattern  – traverse a sequence and accumulate a value (here it is maximum), as it goes, can be used to do so.

void StockN2(double price[], int n, double &maxProfit, int &buyDay, int &sellDay)
{
  for(int i=0; i<n; i++)
    for(int j=i+1; j<n-1; j++)     {       if(price[j] - price[i] > maxProfit)
      {
        maxProfit = price[j] - price[i];
        buyDay = i;
        sellDay = j;
      }
    }
}

For the outer loop that runs (n-1) times the inner loop runs (n-1) times resulting in O(n2), also known as quadratic complexity. Can we do better?

Divide and conquer solution

When we have an algorithm with O(n2) complexity, and we think about optimizing it, we ask ourselves – what could be better than this. And then O(n log n) comes to our mind. Even though there are many other complexities in between these two, usually O(n log n) is next best after O(n2). And when we see log n, divide and concur comes to our mind.

If we use divide and conquer, we have to divide the input into two sets, find some kind of solutions from both sides and then combine them. What decision can be found from two sides? We can get the maximum profit from each of the two sides. For example, suppose, we have only 4 days’ stock prices: 2, 10, 5 and 12. If we divide them into two, we would get left side: 2 and 10. Right side: 5 and 12. Left side profit would be 8 and the same for right side would be 7. The best between the two would be 8.

However, we can clearly see that the maximum profit is not 8, but 10 when the buy happens in left and sell happens in the right side. It is understandable that local solutions from left and right sides alone would not result in a global optimal solution. Somehow we have to compute a third profit combining the two sides. It is also obvious that buy happens in the left side and sell happens in the right side (after all, we cannot sell before we buy). So we see that the merge phase of the divide and conquer should consider the below 3 profits and find the best among them.

  • Maximum profit from left side
  • Maximum profit from right side
  • Profit by buying at the lowest from left and then selling at the highest in right

The below code is doing this. By the way, we are not tracking the buy day and sell day to keep the focus on the main points. Buy day and sell day can be easily accommodated.

void StockDivideAndConquerNlogN(double price[], int start, int end, double &maxProfit)
{
  if(start == end)
  {
    // just one value, return
    maxProfit = 0;
    return;
  }

  int mid = start + (end-start)/2;

  double leftMaxProfit;
  StockDivideAndConquerNlogN(price, start, mid, leftMaxProfit);

  double rightMaxProfit;
  StockDivideAndConquerNlogN(price, mid+1, end, rightMaxProfit);

  double minLeft = GetMin(price, start, end);
  double maxRight = GetMax(price, start, end);

  double minValue = GetMin(price, start, mid);
  double maxValue = GetMax(price, mid+1, end);
  maxProfit = maxOutOfThree(leftMaxProfit, rightMaxProfit, maxValue - minValue);
}

For our working example, with 8 days’ stock prices, it looks like below:

Divide and Conquer

The value inside the circle indicates the profit. It is coming from the best of the three as detailed earlier.

Computing complexity

How much is the cost? To compute it, we have to find two things – how many levels and how much work is done at each level.

How many levels? We have 8 items. At each level, it is halved. 8 -> 4 -> 2 -> 1. Suppose, in general, we are halving it k times. That means, n is divided by 2, k times. That means, n is divided by 2k and then it becomes 1. 1 = n/2k => n = 2k => log n = k log 2. Since, log 2 = 1, base being 2, k = log n. For n = 8, k = 3.

Level starts at 0 (the root) and ends at k. Hence, the actual number of levels is k+1. For simplicity, the rest of the post would consider the total number of levels to be log n, ignoring the small constant issue of 1.

Next thing to find: how much work is done at each level. We see, at each level, minimum is found from left, costing n/2 and maximum is found from right, also costing n/2. At each level, merging them (computing the third profit and then finding the best among the three) costs iterating n items and a few constant comparisons. Note that, at each level, all n items are present. It is the number of total processed items from all the sub-problems present in that level, not necessarily just from two sub-problems. So for log n levels the total cost is (n * log n) = n log n.

This calculation is explained/performed using master theorem.

Optimized divide and conquer solution

At each level, we are running loops and iterating n items to get the minimum from left and maximum from right. This loop is not necessary. The minimum and maximum can be computed bottom up doing a constant number of comparisons, at each level.

The below code shows this optimized version:

void StockDivideAndConquerOptimizedN(double price[], int start, int end, double &maxProfit, double &minValue, double &maxValue)
{
  if(start == end)
  {
    // just one value, return
    maxProfit = 0;
    minValue = maxValue = price[end];
    return;
  }

  int mid = start + (end-start)/2;

  double leftMaxProfit, leftMinValue, leftMaxValue;
  StockDivideAndConquerOptimizedN(price, start, mid, leftMaxProfit, leftMinValue, leftMaxValue);

  double rightMaxProfit, rightMinValue, rightMaxValue;
  StockDivideAndConquerOptimizedN(price, mid + 1, end, rightMaxProfit, rightMinValue, rightMaxValue);

  maxProfit = maxOutOfThree(leftMaxProfit, rightMaxProfit, rightMaxValue - leftMinValue);

  minValue = leftMinValue > rightMinValue ? rightMinValue : leftMinValue;
  maxValue = leftMaxValue > rightMaxValue ? leftMaxValue : rightMaxValue;
}

Computing complexity for this optimized version

In this optimized version, for each of the log n levels, we are still doing some processing. It is no longer n items. Rather, the number of items is decreasing by half at each level, upwards. At the bottom-most level there are n items. One level up, it is reduced by half (due to the merging), shrinking to n/2 items. As it goes up, it gets reduced and at the topmost level it becomes only one item. Let us add the items from each level that we we processing.

n + n/2 + n/22 + n/2+ n/24 + .  .  .  . + n/2k)

=> n (1 + 1/2 + 1/4 + 1/8 + . . .)

=> n * 2 (the convergent series gives 2)

=> 2 n

By discarding the constant terms, we get a complexity of O(n), meaning we can get the maximum profit in linear time.

GitHub: Stock Profit

Index

Finding Fibonacci – Exponential vs. Linear

2nd Friday Fun Session – 13th Jan 2017

What is Fibonacci number?

0, 1, 2, 3, 5, 8 . . . is Fibonacci series where 1st number is 0, 2nd number 1, 3rd number 2 and so on, each number being the sum of its two predecessors.

What are we talking about here?

We will see that nth Fibonacci number can be found using both recursive and iterative methods. Recursive one will be prohibitively expensive while iterative one will be much more efficient.

Recursive solution

We can use the following recursive function to get nth Fibonacci number.

int FibonacciExponential(int n, int &opCounter)
{
  opCounter++;

  if(n == 0 || n == 1)
    return n;

  return FibonacciExponential(n-1, opCounter) + 
         FibonacciExponential(n-2, opCounter);
}

We have passed an extra parameter to count the number of times the recursive function gets called. For example, for n = 4, we see we have the following calls, each node showing the value of n, with which the function is called. Many a times, for a certain value of n, the function is called numerous times.

Fibonacci Call Tree

We see, if we increase n by one, the tree also expands by one more level. For n = 4 we end up with 24 (2 raised to the 4th power) calls. It will be few less as the right sub-tree is one level less than the left, but they are negligible. When the input number n (input size) goes as an exponent, we call the complexity – exponential. Especially, when we talk about Big O notation, we express it using the upper asymptotic bound. The complexity here is O(2n).

The value 2n when n = 100, is 2100 = 1267650600228229401496703205376. What if each operation takes a millisecond? The execution time would be trillions of years ~ just for n = 100!

Iterative solution

We could as well run a simple loop by retaining the previous two values and add them to find the present Fibonacci number. For n = 100, we just needed to loop maximum 100 times, requiring 100 operations. That means, we could call the complexity linear and in terms of big O notation it would be O(n).

The below function finds nth Fibonacci number iteratively, in linear time.

int FibonacciLinear(int n, int &opCounter)
{
  opCounter++;
 
  if(n == 0 || n == 1)
    return n;

  int result = 0;
  int previousPrevious = 0;
  int immediatePrevious = 1; 

  for(int i=2; i<=n; i++)
  {
    opCounter++;
 
    result = immediatePrevious + previousPrevious;

    previousPrevious = immediatePrevious;
    immediatePrevious = result;
  }
 
  return result;;
}

Considering each operation taking 1 millisecond, we are talking about 100 milliseconds for linear algorithm vs. trillions of years for exponential algorithm.

GitHub: Fibonacci Code

Index