Are You Blindly Trusting Plans Generated by MS SQL Server?

Imagine the following situation:

Your query (SELECT statement) involves a number of big tables including T (a big table itself). Suppose the execution usually takes 1 second.

After each time you execute the query you also insert some data into T.

Suddenly in some cases the same query taking 60 seconds to execute! And this is happening even though the overall data volume involved in this query did not change significantly (apart from some data you inserted into T, that is quite small compare to the total data).

If you have to call this stored procedure say 1000 times in the who processing that it is involved in, you will end up being really slow.

Why this is happening?

Well, if you compare the two plans, you might see a small change in the two plans (the one taking 1 second and the one taking 60 seconds). Say, a Nested Loop Join has replaced a Hash Join. This is possible. Since you were inserting some data into T, statistics got changed for that table. Due to the inherent weakness of the cost model, a wrong plan generation is entirely possible. Since you are dealing with big tables, instead of creating hash table used by a Hash Join, a Nested Loop might cost you severely.

What can you do?

Use your business domain knowledge to use query hints. For example, one might know the right side usually produces 5K to 10K rows (say, some special transactions that this right side is supposed to produce after a lot of joins). In that case, you specify query (join) hint to use Hash Join. That way, a small change in statistics won’t make Query Plan Generator to produce a small change in the plan (Nested Loop selection instead of Hash Join) that will cost you significantly.

Index