SQL Server: Blocking Problems
Date: May 24, 2002
Article is provided courtesy of InformIT.
SQL Server transactions are extremely powerfulthey provide a way to incorporate business rules into your applications and protect data integrity. Behind the scenes, the transactions are implemented through the SQL Server's locking mechanism, which we discussed in a previous article. Due to the poor application design, some of the incorrect settings on the server, or poorly written transactions, the locks can conflict with other active locks. This article will tell you why the blocking conflicts occur, and suggest a few ways to resolve them.
Blocking and Deadlocks
Many people confuse blocking with deadlocks. Let's make sure we're on the same page: blocking and deadlocks are two very different occurrences. Blocking occurs due to user A's transaction locking the resources that user B wants to read or modify. Most blocking conflicts are temporary in nature, and will resolve themselves eventually. Deadlocks are much worse then blocking. A deadlock occurs when user A has locks on the resources that user B wants to modify, and user B has locks on the resources that user A intends to modify. So a deadlock is much like an infinite loop: If you let it go, it will continue until the end of time, until your server crashes, or until the power goes out (whichever comes first). Fortunately, SQL Server has a built-in algorithm for resolving deadlocks. It will choose one of the deadlock participants and roll back its transaction. The user will see a friendly message:
Your server command (process id N) was deadlocked with another process and has been chosen as deadlock victim. Re-run your command.
On the other hand, blocking will resolve itself, but it might take quite a bit longer than your users would like. That's why you should try to avoid blocking conflicts in your applications.
Common Causes of Blocking
In this section, we'll go over a few most common causes of blocking.
More often than not, blocking problems are due to poor application design. A transactional database model should be highly normalized. There are several normalization rules that you should adhere to when designing your database. We won't go into the details of normalization, but to summarize the conceptyou should not keep any redundant data in your database. Transactional databases should not have any repeating columns, and each piece of data should be stored only once. That way, the transactions modify lean tables, complete, and release locks quickly.
The lack of appropriate indexes can often cause blocking problems as well. If you have a transaction that modifies many rows of a table without a clustered index, SQL Server might decide to acquire a table lock for the duration of your transaction. The rest of the transactions will be blocked until your transaction is committed or rolled back.
Poorly written transactions are by far the most common cause of blocking. Such transactions ask for an input value from a graphical user interface (or a Web screen) in the middle of a transaction. Imagine what happens if user Lucy gets interrupted by a phone call in the middle of a transaction. The transaction will hold the locks until Lucy gets off the phone and decides to input the requested value. If a phone call is fairly quick, we might get away with a few hate emails from the rest of the application users. But Lucy might decide to take a lunch break or go home right after her phone conversation. Therefore, never ask for a user input inside a transaction.
Inappropriate use of locking hints can be yet another cause of blocking. If you force SQL Server to acquire 50000 row level locks, your transaction might have to wait until other transactions complete, and this many locks are available.
Troubleshooting Blocking
The first step in troubleshooting a problem is figuring out the cause and type of problem. If you have several phone calls from users whose screens just freeze when they hit the CREATE RECORD button, chances are you have some blocking issues. Fortunately, there are some tools that can help you identify the root of the problem.
Your first line of defense should be the system stored procedures sp_lock, sp_who, and sp_who2. The sp_lock procedure lets you see the type of locks acquired by one, many, or all sessions connected to the server. The syntax is as follows:
Executing sp_lock without any parameters shows the current locks on my server:
Spid |
Dbid |
Objid |
Indid |
Type |
Resource |
Mode |
Status |
51 |
6 |
0 |
0 |
DB |
|
S |
GRANT |
51 |
1 |
85575343 |
0 |
TAB |
|
IS |
GRANT |
52 |
2 |
3 |
2 |
KEY |
(d000d5e60ffd) |
X |
GRANT |
52 |
2 |
3 |
2 |
KEY |
(0701e6e322ae) |
X |
GRANT |
52 |
2 |
1 |
3 |
KEY |
(0400353572f0) |
X |
GRANT |
52 |
2 |
11 |
0 |
TAB |
|
IX |
GRANT |
52 |
2 |
99 |
0 |
RID |
0.094444444 |
X |
GRANT |
52 |
2 |
0 |
0 |
PAG |
0.094444444 |
X |
GRANT |
52 |
2 |
0 |
0 |
PAG |
0.09375 |
X |
GRANT |
52 |
5 |
0 |
0 |
DB |
|
S |
GRANT |
52 |
2 |
6 |
0 |
TAB |
|
IX |
GRANT |
52 |
2 |
1 |
0 |
TAB |
|
IX |
GRANT |
52 |
2 |
3 |
0 |
TAB |
|
IX |
GRANT |
52 |
2 |
2 |
0 |
TAB |
|
IX |
GRANT |
52 |
2 |
12 |
0 |
TAB |
|
IX |
GRANT |
52 |
2 |
9 |
0 |
TAB |
|
IX |
GRANT |
52 |
2 |
1 |
1 |
KEY |
(0400403514b4) |
X |
GRANT |
52 |
2 |
3 |
1 |
KEY |
(0500e3b8aae3) |
X |
GRANT |
52 |
2 |
3 |
1 |
KEY |
(06000d171ff1) |
X |
GRANT |
52 |
2 |
0 |
0 |
IDX |
IDX: 2:37575172: |
X |
GRANT |
52 |
2 |
37575172 |
0 |
TAB |
|
Sch-M |
GRANT |
52 |
2 |
2 |
1 |
KEY |
(0400323fcaa6) |
X |
GRANT |
52 |
2 |
1 |
2 |
KEY |
(a71682514e25) |
X |
GRANT |
Although the information returned by sp_lock is somewhat cryptic, it isn't too difficult to decipher. The following paragraph gives you the explanation of each column:
Spid: Tells you the system process id of the connection that has acquired a particular lock. Each end user or system process may have multiple connections, and therefore could show up several times with a single spid.
Dbid: The database id where the lock was acquired.
ObjId: The object that has been locked.
IndId: The index that has been used to execute the query (if appropriate, otherwise 0).
Type: The type of lock that has been requested: row identifier lock, page lock, table lock, file lock, index lock, key lock, or database lock.
Resource: The lock resource.
Mode: Whether the lock is shared, intent shared, update, or exclusive.
Status: Whether the lock request has been granted (values of GRANT, CONVERT, or WAIT).
If you want to get into the details of exactly which objects have been locked in each database, you can use functions such as db_name(database_id) and object_name(object_id). For instance, the following query shows the database and object names for the ids of 1 and 3 respectively:
SELECT DB_NAME(1) AS Db_name_for_db_id_of_1, OBJECT_NAME(3) AS object_name_for_object_id_3
Result:
Db_name_for_db_id_of_1 |
object_name_for_object_id_3 |
master |
Syscolumns |
More importantly, sp_lock gives you an overall picture of the locking activity on your server. If you see hundreds of locks in the output of sp_lock, with many of them having a status of WAIT, it spells trouble. To get to the bottom of your blocking problems though you really need the output of sp_who (or sp_who2).
The other two procedures (sp_who and sp_who2) are very similar in nature (as their names suggest). sp_who2 simply returns a few additional columns. sp_who returns following columns:
Spid: System process id that requested the lock.
Ecid: Execution context of the thread associated with the spid. Zero means the main thread; all other numbers mean subthreads.
Status: Runnable, sleeping, or background. If the status is runnable, it means that the process is actually performing work; sleeping means that the process is connected to the server, but is idle at the moment.
Loginname: Login that has initiated the lock request.
Hostname: Name of the computer where the lock request was initiated.
Blk: The connection that is blocking the lock request from the current connection.
Dbname: Database name where the lock has been requested.
Cmd: General command type that requested the lock.
sp_who2, the newer version of sp_who, returns the following information:
Spid: System process id that requested the lock.
Status: Background, sleeping, or runnable.
Login: Login name that has requested the lock.
HostName: Computer where the lock request has been initiated.
BlkBy: Spid of the connection that is blocking the current connection.
DbName: Database name where the lock request has been generated.
Command: General command type that requested the lock.
CPUTime: Number of milliseconds the request has used.
DiskIO: Disk input/output that the command has used.
LastBatch: Date and time of the last batch executed by the connection.
ProgramName the Name of the application that issued the connection
Spid: In case you can't read the spid from the beginning of the output, it is repeated here.
The following is an output of sp_who on my server (I intentionally created some blocking issues):
spid |
ecid |
Status |
Loginname |
Hostname |
blk |
Dbname |
cmd |
1 |
0 |
background |
Sa |
0 |
NULL |
LAZY WRITER |
|
2 |
0 |
sleeping |
Sa |
0 |
NULL |
LOG WRITER |
|
3 |
0 |
background |
Sa |
0 |
Master |
SIGNAL HANDLER |
|
4 |
0 |
background |
Sa |
0 |
NULL |
LOCK MONITOR |
|
5 |
0 |
background |
Sa |
0 |
Master |
TASK MANAGER |
|
6 |
0 |
background |
Sa |
0 |
Master |
TASK MANAGER |
|
7 |
0 |
sleeping |
Sa |
0 |
NULL |
CHECKPOINT SLEEP |
|
8 |
0 |
background |
Sa |
0 |
Master |
TASK MANAGER |
|
9 |
0 |
background |
Sa |
0 |
Master |
TASK MANAGER |
|
10 |
0 |
background |
Sa |
0 |
Master |
TASK MANAGER |
|
11 |
0 |
background |
Sa |
0 |
Master |
TASK MANAGER |
|
12 |
0 |
background |
Sa |
0 |
Master |
TASK MANAGER |
|
51 |
0 |
sleeping |
Sa |
BP-5CHSFFH2HEJ1 |
52 |
Pubs |
UPDATE |
52 |
0 |
runnable |
Sa |
BP-5CHSFFH2HEJ1 |
0 |
Pubs |
SELECT |
53 |
0 |
runnable |
Sa |
BP-5CHSFFH2HEJ1 |
0 |
Pubs |
SELECT |
54 |
0 |
sleeping |
Sa |
BP-5CHSFFH2HEJ1 |
51 |
Pubs |
UPDATE |
We only really care about a few rows from this output. The system process ids of 1 through 12 are just background processes that take care of normal SQL Server operations. The ids of 51, 52, 53, and 54 are the connections that I initiated for this article. Notice that connection 54 is blocked by connection 51. However, that's not the end of the story. Connection 51 is also blocked by connection 52. So the root of the blocking problem is the statement executed on connection 52. The next step is to figure out what that statement was. To do so, use DBCC INPUTBUFFER (spid) syntax, as shown below:
DBCC INPUTBUFFER (52)
Result:
EventType |
Parameters |
EventInfo |
Language Event |
0 |
SELECT * FROM sales(HOLDLOCK) SET @i = @i + 1 END |
The DBCC INPUTBUFFER command shows the last 128 characters of the last TRASACT-SQL statement executed from connection 52. That way, I can track down the stored procedure or the portion of the application that caused the problem.
Occasionally, DBCC INPUTBUFFER isn't very helpfulit might show RPC in the event info column. If so, you have to resort to using SQL Server Profiler. (In earlier versions of SQL Server the Profiler was referred to as SQL Trace.) The Profiler is an excellent tool for gathering information about the activity on your server. I don't have room to discuss the Profiler in this article, but for resolving blocking problems, you can try to spy on the few connections you are interested in. The Profiler will tell you exactly what statements those connections execute, and in what order. One other word of caution here: The Profiler collects much useful info, but has a tendency to overwhelm you. Be sure to select only the events and connections that you're interested in before running your trace.
If you prefer using the Enterprise Manager to execute TRANSACT-SQL statements, the output of sp_who2 and sp_lock are available from the Current Activity Window of the Enterprise Manager.
Resolving Blocking Issues
So far, this article has revealed to you what the blocking problems are, what causes them, and how to identify the roots of such problems. Resolving the blocking problems might take quite a few steps; we'll go over some of the more common ones in the next few paragraphs.
A quick way to resolve a blocking problem is to disconnect the connection blocking the rest of the users. All you have to do is ask your database administrator to execute the KILL statement. Obviously, this is not a permanent solution, and won't really help anything except take care of an immediate need.
If you deduce that the blocking is caused by a poorly written transaction, try to rewrite it. Often, a single transaction might bring the entire application to its knees. Other times, you'll have to review many (or all) stored procedures of your application before you can resolve problems.
If you see many table locks in your database, you might want to evaluate your indexing strategy. The blocking problems can often be resolved by adding appropriate indexes to the affected tables. The index study is a serious exercise, and it takes quite a bit of effortso don't take it lightly.
If a problem is caused by poor database design, then you can try to rework the database model. If you experience blocking in the early stages of the application design, you're in luck because you might be able to afford reworking the design. If the problems did not crop up until the application is deployed to the users, reworking the design might not be an option. In such cases, you might wish to direct some of the transactions to another server, and use replication to keep two databases synchronized.
Finally, if you think the query hints are forcing the activity that you don't care to see, try rewriting your queries without such hints. Nine times out of ten, SQL Server can make the best locking decision on its own.
Summary
In this article, I showed you some of the problems you might experience with conflicting locks. Temporary blocking is normal, and should not scare you. But if you have many users complaining about the application being slow, you have some work ahead of you. This article taught you how to determine the cause of your problems, and how to go about resolving them.