Are You Blindly Trusting Plans Generated by MS SQL Server?

Imagine the following situation:

Your query (SELECT statement) involves a number of big tables including T (a big table itself). Suppose the execution usually takes 1 second.

After each time you execute the query you also insert some data into T.

Suddenly in some cases the same query taking 60 seconds to execute! And this is happening even though the overall data volume involved in this query did not change significantly (apart from some data you inserted into T, that is quite small compare to the total data).

If you have to call this stored procedure say 1000 times in the who processing that it is involved in, you will end up being really slow.

Why this is happening?

Well, if you compare the two plans, you might see a small change in the two plans (the one taking 1 second and the one taking 60 seconds). Say, a Nested Loop Join has replaced a Hash Join. This is possible. Since you were inserting some data into T, statistics got changed for that table. Due to the inherent weakness of the cost model, a wrong plan generation is entirely possible. Since you are dealing with big tables, instead of creating hash table used by a Hash Join, a Nested Loop might cost you severely.

What can you do?

Use your business domain knowledge to use query hints. For example, one might know the right side usually produces 5K to 10K rows (say, some special transactions that this right side is supposed to produce after a lot of joins). In that case, you specify query (join) hint to use Hash Join. That way, a small change in statistics won’t make Query Plan Generator to produce a small change in the plan (Nested Loop selection instead of Hash Join) that will cost you significantly.

Index

Searching Just One Record Taking Several Seconds?

You need just one record from a database. Say, the record for a particular transaction number. You need this information quite frequently (say, from your web application). However, every time you go and search it, it is taking several seconds.

There is something wrong. Just to pick one row should be done quite fast, even if your database is substantially big.

Did you look at the query plan? Is it using an index?

Not using an index? If you do have an index on this column, give hints to the query to use it, re-run (do not forget to execute DBCC DROPCLEANBUFFERS before every time you run a query, especially when you are doing performance testing) and check the time. If it is doing fine, save the query that way to force it to use the index every time you call this query later.

You don’t have any index on that column? Create a nonclustered index on it (assuming you have some clustered index on something more important). It should be fine now.

Index