Transactions in Nested SP

34th Friday Fun Session (Part 2) – 22nd Sep 2017

When one (MS SQL Server) Stored Procedure (SP), say, A calls another SP, say, B inside it; we call the SPs are nested. It is not always the case that A calls B. B can be called directly as well. Now if they are required to execute within transaction, in both the cases – when executed individually or nested, we face some interesting situations. Here we discuss how to manage transactions for those cases, when scope is limited within a database.

Requirements

We usually face two kinds of requirements.

  1. Full Commit: either all in both parent (A) and child (B) have to be executed or all should fail.
  2. Partial Commit: if one or more calls to child fail, let them fail. But persist/commit the changes done by parent and the successful calls to child.

Nesting does not need to be limited within two levels (parent and child) as stated here. For example, B can further call another SP C, or A can call any of B or C and so on. The requirements can vary in some other ways as well. However, if we solve the above two cases, the same mechanism/logic can be used/extended to handle all other situations.

Note that, we are also considering the situation that the child, B can be directly called, and in that situation, we are making sure that it is executing within its own transaction.

Full commit

The way we have written the SPs A and B as shown below, A is calling B 6 times. However B fails on the last call. A is inserting data into a table called OuterTable and B is doing the same for InnerTable.

In all or nothing case, we want to make sure, when we execute A, if any call to B fails, no data should be inserted in any of the tables; all should fail. It is obvious that if all the calls pass, then all data changed by them should persist.

It is also required that if B is executed directly, everything happening inside it must happen within a transaction.

Code for SP A:

CREATE Procedure [A]
AS
BEGIN
  BEGIN TRANSACTION;

  BEGIN TRY
    DECLARE @Success AS INT
    DECLARE @Counter AS INT

    SET @Success = 0
    SET @Counter = 0

    INSERT INTO [OuterTable] ([Name]) VALUES(@Counter)

    WHILE < 6
    BEGIN
      EXEC @Success = [B] @Counter;

      IF @Success = 0
        THROW 50001, 'B Failed! Rollback all!', 1

      SET @Counter = @Counter + 1;
    END

    IF @@TRANCOUNT > 0
      COMMIT TRANSACTION;
  END TRY

  BEGIN CATCH
    SET @Success = 0

    IF @@TRANCOUNT > 0
      ROLLBACK TRANSACTION;
  END CATCH;

  RETURN @Success
END

Code for SP B:

CREATE Procedure [B] (@CallCounter INT)
AS
BEGIN
  BEGIN TRANSACTION;

  BEGIN TRY
    DECLARE @Success AS INT
    SET @Success = 0

    IF @CallCounter = 5
      THROW 50001, 'Last call cannot be executed! Rollback all!', 1

    INSERT INTO [InnerTable] ([Name]) VALUES(@Counter)

    SET @Success = 1

    IF @@TRANCOUNT > 0
      COMMIT TRANSACTION;
  END TRY

  BEGIN CATCH
    SET @Success = 0

    IF @@TRANCOUNT > 0
      ROLLBACK TRANSACTION;
  END CATCH;

  RETURN @Success
END

When A gets called it starts a transaction. It then calls B, that succeeds 5 times, and subsequently fails on the last one.

@@TRANCOUNT is important

When B starts, it begins a transaction, unconditionally. Each time A calls B, @@TRANCOUNT gets incremented by 1 and reaches 2. Each time B succeeds, it executes a commit, thus reducing @@TRANCOUNT Back to 1, corresponding to the transaction started by A.

However, the last call to B fails. It then throws an exception, that gets the execution flow jump into the catch block, where it checks @@TRANCOUNT, that is now 2. It then rollbacks the transaction. A rollback discards all transactions in the nested call and resets @@TRANCOUNT. Before commit or rollback, it is a safe practice to make sure that it is still under a transaction, by checking @@TRANCOUNT value that should be greater than 0.

At this situation, B returns 0 to indicate failure (usual practice is to use 0 for indicating success though). A gets to know that B has failed. A throws exception, and ends up in catch block, where it checks @@TRANCOUNT, that is already 0. No further rollback is called and at the end 0 is returned from A to mean that it has failed.

Everything that A and B have done so far are all rolled back. After all, no matter how many transactions are started, a single rollback can discard all. At the end, no data changes happen to either of the tables, where they inserted data.

If B executes directly, depending on the parameter, data would persist or it would fail. However, all inside it would happen within a transaction.

Partial Commit

In this case, we want to persist the changes made by 5 successful calls to B, along with what A has done. The only thing that should get discarded is the failed 6th call to B.

Code for SP A:

CREATE Procedure [A]
AS
BEGIN
  BEGIN TRANSACTION;

  BEGIN TRY
    DECLARE @Success AS INT
    DECLARE @Counter AS INT

    SET @Success = 0
    SET @Counter = 0

    INSERT INTO [OuterTable] ([Name]) VALUES(@Counter)

    WHILE < 6
    BEGIN
      EXEC @Success = [B] @Counter;

      IF @Success = 0
        PRINT 'I know B Failed! But I am just ignoring that!', 1

      IF @Success = 2
        THROW 50001, 'Something terribly wrong! Rollback all!', 1

      SET @Counter = @Counter + 1;
    END

    IF @@TRANCOUNT > 0
      COMMIT TRANSACTION;
  END TRY

  BEGIN CATCH
    SET @Success = 0

    IF @@TRANCOUNT > 0
      ROLLBACK TRANSACTION;
  END CATCH;

  RETURN @Success
END

Code for SP B:

CREATE Procedure [B] (@CallCounter INT)
AS
BEGIN
  DECLARE @ProcedureSave NVARCHAR(MAX)
  DECLARE @TranCounter INT;

  SET @ProcedureSave = 'B-' + CONVERT(CHAR(23), GETDATE(), 121)
  SET @TranCounter = @@TRANCOUNT;

  IF @TranCounter > 0
    SAVE TRANSACTION @ProcedureSave;
  ELSE
    BEGIN TRANSACTION;

  BEGIN TRY
    DECLARE @Success AS INT
    SET @Success = 0

    IF @CallCounter = 5
      THROW 50001, 'Last call cannot be executed! Rollback all!', 1

    INSERT INTO [InnerTable] ([Name]) VALUES(@Counter)

    SET @Success = 1

    IF @TranCounter = 0 AND @@TRANCOUNT > 0
      COMMIT TRANSACTION;
  END TRY

  BEGIN CATCH
    SET @Success = 0

    IF @TranCounter = 0
      ROLLBACK TRANSACTION;
    ELSE
      IF XACT_STATE() <> -1
        ROLLBACK TRANSACTION @ProcedureSave;
      ELSE
        SET @Success = 2
  END CATCH;

  RETURN @Success
END

Savepoint

B starts a transaction, only if it is already not inside one. Meaning, if it is directly called then it starts a transaction of its own. However, if it is called by A, it does not do so. In that case, it creates a savepoint. A savepoint name is created in a way so that for each instance of the SP’s execution, it remains unique. A savepoint is a point where it can come back by executing a partial rollback. In such a situation, all changes between the savepoint and the place where partial rollback is called get discarded.

That is precisely what happens, when B fails and an exception is thrown. In catch block, it checks whether it started the transaction on its own. If that is the case, it simply calls a (full) rollback. However, if it was called by A, that also means it created a savepoint; it would simply execute a partial rollback using the savepoint. A partial rollback only discards the changes done by B itself.

XACT_STATE() is a safe choice

However, prior to executing the partial rollback it should check whether it is safe to do so. If XACT_STATE() returns -1, that means something went wrong, something that should happen rarely. In this situation, a partial rollback is no longer possible – only a full rollback can be done. Special failure code 2 is returned by B to indicate such a colossal failure.

A remains mostly the same. The only thing that gets changed is when it knows B has failed. Earlier, it would have thrown an exception. But this time, if the returned failure code is 0, it simply ignores that B has failed. If the returned failure code is 2, it knows that B fails to execute partial rollback  – something really bad – A executes a (full) rollback.

In the above case, we see that the outer table gets one new row, and the inner table gets 5 new rows; notwithstanding the fact that 6th call to B fails.

When B is called directly – not from A – it manages its own transaction; savepoint is never used.

Index

Executing SP using EF

34th Friday Fun Session (Part 1) – 22nd Sep 2017

Many a times, we use Entity Framework (EF), Microsoft’s recommended data access technology for an application, to execute (MS SQL Server) Stored Procedure (SP), and retrieve the results emitted by them. Here we discuss different kinds of output that a SP can produce and how we can retrieve them using EF.

SP output

A SP typically provides the following kinds of results:

  1. Return code (single integer)
  2. Output parameters (one or more, any data type)
  3. Result set
    1. Single result set
    2. Multiple result set
      1. All having the same schema
      2. Having different schema

Return code

SP can return a single integer return code. Return statement without any value (null) would automatically return 0. It is mostly used to exit execution of a SP when certain condition is met.

CREATE Procedure SpReturnCode
AS
BEGIN
  RETURN 1;
END

Using T-SQL we can execute SP like below.

DECLARE @Success AS INT
EXEC @Success = [SpReturnCode];
PRINT @Success

Output parameter

SP can return one or more values, each having its own data type.

CREATE Procedure SpOutputParameter (@InputValue INT, @OutputValue INT OUTPUT)
AS
BEGIN
  SET @OutputValue = @InputValue + 1;
  RETURN 0;
END

Using T-SQL we can execute SP like below.

DECLARE @ReturnValue INT;
EXECUTE [SpOutputParameter] 2, @ReturnValue OUTPUT;
PRINT @ReturnValue

Single result set

Returns a result set having 0 or more rows of a certain schema. The following SP returns a result set with a single column named, Success.

CREATE Procedure SpSingleResultSet
AS
BEGIN
  SELECT 3 AS Success
  RETURN 0;
END

Multiple result set, same schema

The following SP returns the result set for Employee schema twice.

CREATE Procedure SpMultipleResultSetSameSchema
AS
BEGIN
  SELECT * FROM [Employee]
  SELECT * FROM [Employee] SELECT [EmployeeId] > 10

  RETURN 0;
END

The following SP returns a result set that is not associated with any database entity.

CREATE Procedure SpMultipleResultSetNonDbContextEntity
AS
BEGIN
  DECLARE @Loop AS INT
  SET @Loop = 0

  WHILE @Loop < 10
  BEGIN
    EXEC SpSingleResultSet
    SET @Loop = @Loop + 1
  END

  RETURN 0;
END

Multiple result set, multiple schema

The following SP returns two different result sets: one for Company and another for Employee.

CREATE Procedure SpMultipleResultSetMultipleSchema
AS
BEGIN
  SELECT * FROM [Company]
  SELECT * FROM [Employee]

  RETURN 0;
END

Executing SP using EF

We will use the following different ways in EF to read different kinds of SP output as described earlier:

  1. ExecuteSqlCommand
  2. SqlQuery
  3. CreateCommand/ExecuteReader

ExecuteSqlCommand

This executes a given DDL/DML command. This can be executed when no result set needs to be returned.

Return code

Return code does not require any explicit output parameter to be used in the SP. However, while calling from EF, it should be treated as an output parameter by specifying the direction for the parameter.

SqlParameter returnCode = new SqlParameter("@ReturnCode", SqlDbType.Int);
returnCode.Direction = ParameterDirection.Output;

Db.Database.ExecuteSqlCommand("exec @ReturnCode = [SpReturnCode] ", returnCode);
var returnCodeValue = (int)returnCode.Value;

Output Parameter

For each of the output parameters, we need to declare an output parameter in EF matching the appropriate data type.

SqlParameter inputParam = new SqlParameter("@InputValue", SqlDbType.Int);
inputParam.Direction = ParameterDirection.Input;

SqlParameter outputParam = new SqlParameter("@OutputValue ", SqlDbType.Int);
outputParam.Direction = ParameterDirection.Output;

Db.Database.ExecuteSqlCommand("[SpOutputParameter] @InputValue, @OutputValue OUT", inputParam, outputParam);
var returnValue = (int)outputParam.Value;

SqlQuery

SqlQuery is usually used when SP returns a single result set. However, it can return any data type including primitive types – not necessarily only entity type. If the SP returns multiple result sets, it will only get the first one. However, the complete execution of the entire SP does happen.

public class SuccessSet
{
  public int Success { get; set; }
}

var result = Db.Database.SqlQuery("[SpSingleResultSet]").ToList();

CreateCommand/ExecuteReader

When multiple result sets to be returned this method can be used. We need to use IObjectContextAdapter interface that makes use explicit interface implementation.

Db.Database.Initialize(force: false);
var cmd = Db.Database.Connection.CreateCommand();
cmd.CommandText = "[SpMultipleResultSetSameSchema]";

Db.Database.Connection.Open();

var reader = cmd.ExecuteReader();
var employees =
  ((System.Data.Entity.Infrastructure.IObjectContextAdapter)Db)
    .ObjectContext
    .Translate<Employee>(reader, "Employee", System.Data.Entity.Core.Objects.MergeOption.AppendOnly);

foreach (var employee in employees)
  Console.WriteLine(employee. Name);

While(!reader.NextResult())
{
  employees =
    ((System.Data.Entity.Infrastructure.IObjectContextAdapter)Db)
    .ObjectContext
    .Translate<Employee>(reader, "Employee", System.Data.Entity.Core.Objects.MergeOption.AppendOnly);

  foreach (var employee in employees)
    Console.WriteLine(employee. Name);
}

Db.GetDatabase().Connection.Close();

When result set is not in DbContext, slight modification is required.

var successSets =
  ((System.Data.Entity.Infrastructure.IObjectContextAdapter)Db)
  .ObjectContext
  .Translate<SuccessSet>(reader);

When different schema are used we can still use the same, we just need to use the right object while translating.

var companies =
  ((System.Data.Entity.Infrastructure.IObjectContextAdapter)Db)
  .ObjectContext
  .Translate<Company>(reader, "Company", System.Data.Entity.Core.Objects.MergeOption.AppendOnly);

reader.NextResult();

var employees =
  ((System.Data.Entity.Infrastructure.IObjectContextAdapter)Db)
  .ObjectContext
  .Translate<Employee>(reader, "Employee", System.Data.Entity.Core.Objects.MergeOption.AppendOnly);

Index

Pseudo-polynomial Complexity

33rd Friday Fun Session (Part 2) – 15th Sep 2017

The complexity for FaaS solution is O(n), where n is the largest day number. It looks like polynomial. However, it is actually pseudo-polynomial.

Size of input

Complexity is measured in terms of the size of input, say, in bits. Suppose, there are b bits in n. Then O(n) = O(2b) and hence, it is exponential.

Let’s assume n increases from 10 to 1125899906842624. More specifically, lunch schedule, as used in the previous example, changes from 1, 3, 4, 5, 6, 7, 10 to 1, 3, 4, 5, 6, 7, 1125899906842624. We still have the same 7 days to go for lunch. Yet, we are running 1,125,899,906,842,624 loops. In our layman understanding, the problem is still the same and should have taken the same amount of time to execute, and yet, for the latter, the algorithm takes way too long!

Spot a pseudo-polynomial

This is how we spot a pseudo-polynomial time algorithm. Ideally, we would like to express the complexity using the number of inputs; here, it should have been 7. But the above algorithm works in a way, where the complexity has been expressed in one of the numeric values of the input, the maximum value of the input – 1125899906842624, to be precise. This is where we are tricked into believing it to be a polynomial time algorithm, linear (polynomial) in the (max) numeric value of the input. But if we apply the definition of complexity that takes into consideration the size/length of the input, then it is actually exponential.

To be more specific, if we look at the input size, 4 bits are required to represent 10, while 50 bits are required to represent 1,125,899,906,842,624. Complexity has gone from O(24) = 10 loops to O(250) = 1,125,899,906,842,624 loops.

That is essentially exponential in the number of bits, meaning exponential in the size of the input but polynomial in the numeric value of the input. Algorithm with this kind of running time is called pseudo-polynomial.

Truly polynomial

At this point, you might wonder what is a truly polynomial time algorithm. For example, when we add n numbers using a loop running n times, we say, the complexity of it to be O(n). But here this n can also be written as 2b. So, shall we also say, adding n numbers is a pseudo-polynomial time algorithm?

Well, when we say, adding n numbers, we implicitly say, we want to find the sum of n 32 bit numbers/integers. Then the size of n numbers is 32 * n. Once again, the formal definition of complexity is defined in terms of input size, in bits. What is the input size here? The size here is 32n. The complexity is O(32n) and removing the constant terms it is O(n), a truly polynomial time algorithm.

Index

Solution – FaaS

33rd Friday Fun Session – 15th Sep 2017

Given a lunch schedule – a sequence of days when lunch is planned, and three price plans – daily, weekly and monthly, we want to get the cheapest lunch price.

This is the solution to JLTi Code Jam – Aug 2017 problem.

Let us walk through an example

Let us take an example as mentioned here: 1, 2, 4, 5, 17, 18. Since first day is 1 and last day is 18, it can be put under a month that consists of 20 consecutive days (not calendar month). We can use a monthly plan. But it would be too expensive (S$ 99.99) for just 6 days.

The days: 1, 2, 4 and 5 fall within a week that requires consecutive 5 days (not a calendar week). We have an option to buy a weekly plan for these 4 days that would cost S$ 27.99. However, that would be higher than had we bought day-wise for 4 days at a price of S$24.

Dynamic Programming

In general, at any given day, we have three options:

  1. We buy lunch for this day alone, using daily price S$ 6. Add that to the best price found for the previous day.
  2. We treat this as the last day of a week, if applicable, and buy a weekly plan at a cost of S$ 27.99. Add that to the best price for the day immediately prior to the first day of this week.
  3. We treat this as the last day of a month, if applicable, and buy a monthly plan at a cost of S$ 99.99. Add that to the best price for the day immediately prior to the first day of this month.

This is an optimization problem that can be solved with dynamic programming where we use the result of already solved sub-problems.

Bottom-up

We have two options: top-down and bottom-up. We realize that, at the end, all the sub-problems (for each of the days) have to be solved. We also find that it is easy to visualize the problem bottom-up. And if we do use bottom-up then the required space would be limited by the last day number.

Hence, we will solve it using bottom-up dynamic programming.

Blue colored days are when lunch is scheduled.

DP table1.png

On day 1:

Cost S$ 6.

On day 2:

Daily basis: S$ 6 + price at day 1 = S$ 12

Weekly basis: S$ 27.99

Monthly basis: S$ 99.99

Best price: S$ 12

On day 3:

No lunch schedule, cost of previous day S$ 12 is its cost.

On day 4:

Daily basis: S$ 6 + price at day 3 = S$ 18

Weekly basis: S$ 27.99

Monthly basis: S$ 99.99

Best price: S$ 18

On day 5:

Daily basis: S$ 6 + price at day 4 = S$ 24

Weekly basis: S$ 27.99

Monthly basis: S$ 99.99

Best price: S$ 24

From day 6 to day 16:

No lunch schedule, cost of previous day will be carried forward: S$ 24.

On day 17:

Daily basis: S$ 6 + price at day 16 = S$ 30

Weekly basis: S$ 27.99 + price at day 12 = S$ 51.99

Monthly basis: S$ 99.99

Best price: S$ 30

On day 18:

Daily basis: S$ 6 + price at day 17 = S$ 36

Weekly basis: S$ 27.99 + price at day 13 = S$ 51.99

Monthly basis: S$ 99.99

Best price: S$ 36

Finally, the best price is S$ 36.

Another example

Let us work with another example: 1, 3, 4, 5, 6, 7, 10.

DP table2

On day 7:

Daily basis: S$ 6 + price at day 6 = S$ 36

Weekly basis: S$ 27.99 + price at day 2 = S$ 33.99

Monthly basis: S$ 99.99

Best price: S$ 33.99

Finally, the best price at the end is S$ 39.99.

Complexity

The complexity is O(n), where n is the largest day number. It is a pseudo-polynomial time algorithm.

GitHub: FaaS

Index

Team Lunch

7th JLTi Code Jam – Sep 2017

Since our JLTi Mumbai colleagues started vising our Singapore office, we are having more team/project lunches. Usually, a number of them come together, and after a short while they also leave together. It is only few days before they leave that we start organizing team lunches. Suppose, there are three colleagues belonging to three different teams, then there would be three team lunches, one for each team.

However, not all members work for an exclusive team. For example, I create an impression as if I work for more than one team, and due to the good grace of those teams, I also get invited in their team lunches.

However, due to the rush of deliverables, that is the norm here, the team lunches are squeezed in the last few days, and at times, multiple team lunches fall on the same day, typically on the last day.

That is all fine and good for most. However, I have a big problem. If two team lunches fall on the same day, and I belong to both, I miss one for obvious reason. I skip lunch does not necessarily mean I skip free lunches.

Hence, I decided to write a small program that will take the team composition in certain way and output the minimum days required to schedule the lunches so that people working on multiple teams don’t miss out any.

Yes, I am not the only person but there are some other colleagues who also work across more than one team. Let us also assume that, for our 7 or 8 teams, it might be easy to calculate it manually. But when the number of team exceeds, say 100, then a program is a must.

Input:

1 2

Output: 2

Explanation:

Input 1 2 (1 and 2 separated by a space) means there are one or more members who belong to both team 1 and team 2.

Output 2 means, at least 2 days are required to arrange lunches for the teams. On day 1, one of the two teams can go for lunch. On the second day, the other can go.

How many teams are there? Well, there are at least 2. There can be more, but that is irrelevant. Suppose there are 4 more teams – team 3, team 4, team 5 and team 6, they can go either on first day or on second day. This is because, no member working in those 4 teams work for a second team. After all, the input says, only team 1 and team 2 have some common members.

Input:

1 2

2 3

Output: 2

We have some members common to team 1 and team 2. And there are some members, who are common to team 2 and team 3, as shown in the second line.

Each line in the input would indicate the presence of common members between two teams, where the two team numbers are separated by a space. There would be at least one line of input, meaning somebody would run this program only if there exist at least one member working for more than one team.

For the above input, we would still require at least two days to avoid any conflict. On one day team 1 and team 3 can go. Team 2 must go on a separate day.

Input:

1 2

2 3

1 3

Output: 3

Now we need 3 separate days. Team 1 cannot go on the same day as team 2 or team 3. This is because team 1 has members working for both team 2 and team 3. Similarly, team 2 cannot go for lunch on the same day as team 3 as they have common members. Hence, team 1, team 2 and team 3 – all need exclusive lunch days.

Task: Given a list of team pairs (like 1 2 is a team pair, as shown in input) sharing common members, we need to write a program, that would output the minimum number of days required to set aside for team lunches, so that nobody who work across multiple teams misses his/her share of team lunches.

Index