Thursday, July 18, 2013

All about Tempdb

TempDB Basics

TempDB is one of the critical system databases and needs to be configured carefully; otherwise the performance of your application may suffer. I am not suggesting that the current implementation of TempDB addresses the challenges of every application, but SQL Server has made significant improvements starting with SQL Server 2005.


TempDB is like any other user database in that it is created using the specifications in the Model database and can be used to create and query tables. Unlike its user database counterparts, however, it is re-created every time SQL Server starts. This is a significant difference, with the following implications:

         It does not need to be recovered. So there is no reason for TempDB to run under the FULL or BULK LOGGED recovery model. In fact, TempDB is configured to use the SIMPLE recovery model and this cannot be changed.

         Though TempDB does not need to be recovered, that does not mean operations on objects in TempDB need no logging. Now here comes the tricky part. Each operation in SQL Server is done inside an explicit or implicit transaction (each statement is encapsulated by a transaction), and that includes operations on objects in TempDB. Let us say you update a row in a global temporary table and then roll back the transaction. There would be no way to undo this change unless we had logged the 'before value' of the update. However, there is no need to log the 'after value' of the update, because that value is only needed when the transaction has to be 'redone', which happens during database recovery. As I indicated earlier, TempDB is not recovered, so there is no reason to log the 'after value' of the update. As another example, consider an insert operation. When you insert a row into, say, a global temporary table, the actual row value is not logged, because SQL Server does not need the row to undo the insert; it only needs to set the offsets within the page appropriately or, if the insert had caused a new page to be allocated, to de-allocate that page. So the key point is: we only log 'undo' information, not 'redo'. This change in logging is available starting with SQL Server 2005, but at this time it is supported only on a HEAP. To complicate matters further, not all objects in TempDB are subject to logging; I will explain this in more detail later. Let us consider one example to illustrate the point I just made.

 
 

use tempdb
go

---------------
-- SETUP
---------------
drop table ##t_tempdblog
go

-- create a global temp table. Notice, each row is
-- greater than 1000 bytes. I did this so that we can easily
-- identify if the row is indeed logged.
create table ##t_tempdblog (c1 int, c2 char(1000))
go

-- load 10 rows into this table
declare @i int
select @i = 0
while (@i < 10)
begin
    insert into ##t_tempdblog values
        (@i, replicate(cast('a' as varchar), 1000))
    select @i = @i + 1
end
go

-- Now run the following command to see the log records
select top 10 operation, context,
    [log record fixed length],
    [log record length]
from fn_dblog(null, null)
where allocunitname = 'dbo.##t_tempdblog'
order by [log record length] desc

Here is the output of the above query. You will notice that all log records are < 100 bytes so clearly the row to be inserted was not logged.

operation context log record fixed length log record length

-----------------------------------------------------------------------

LOP_MODIFY_ROW LCX_IAM 62 88

LOP_MODIFY_ROW LCX_IAM 62 88

LOP_FORMAT_PAGE LCX_HEAP 80 84

LOP_FORMAT_PAGE LCX_IAM 80 84

LOP_FORMAT_PAGE LCX_HEAP 80 84

LOP_MODIFY_ROW LCX_PFS 62 80

LOP_MODIFY_ROW LCX_PFS 62 80

LOP_INSERT_ROWS LCX_HEAP 62 72

LOP_INSERT_ROWS LCX_HEAP 62 72

LOP_INSERT_ROWS LCX_HEAP 62 72

 
 

Now, just to prove that logging is different in a user database, I will use the same example as above but with a user table. Here are the log records that I get:

operation context log record fixed length log record length

----------------------------------------------------------------------

LOP_INSERT_ROWS LCX_HEAP 62 1104

LOP_INSERT_ROWS LCX_HEAP 62 1104

LOP_INSERT_ROWS LCX_HEAP 62 1104

LOP_INSERT_ROWS LCX_HEAP 62 1104

LOP_INSERT_ROWS LCX_HEAP 62 1104

LOP_INSERT_ROWS LCX_HEAP 62 1104

LOP_INSERT_ROWS LCX_HEAP 62 1104

LOP_INSERT_ROWS LCX_HEAP 62 1104

LOP_INSERT_ROWS LCX_HEAP 62 1101

LOP_INSERT_ROWS LCX_HEAP 62 1104

Now, to illustrate that UNDO information is indeed logged, I ran the following operation

update ##t_tempdblog
set c2 = replicate(cast('b' as varchar), 1000)

And now I run the following query

select top 10 operation, context,
    [log record fixed length], [log record length]
from fn_dblog(null, null)
where allocunitname = 'dbo.##t_tempdblog'
order by [log record length] desc

 
 

Here is the output I got. You will notice that each log record is > 1000 bytes; in fact, it logs the previous value of the column c2 that I just updated.

 
 

 
 

operation context log record fixed length log record length

-----------------------------------------------------------------------

LOP_MODIFY_ROW LCX_HEAP 62 1072

LOP_MODIFY_ROW LCX_HEAP 62 1072

LOP_MODIFY_ROW LCX_HEAP 62 1072

LOP_MODIFY_ROW LCX_HEAP 62 1072

LOP_MODIFY_ROW LCX_HEAP 62 1072

LOP_MODIFY_ROW LCX_HEAP 62 1072

LOP_MODIFY_ROW LCX_HEAP 62 1072

LOP_MODIFY_ROW LCX_HEAP 62 1072

LOP_MODIFY_ROW LCX_HEAP 62 1072

LOP_MODIFY_ROW LCX_HEAP 62 1072

As I indicated, TempDB is created every time SQL Server starts, so it is tempting to let it grow from the size initially set in the Model database. While this works, it will fragment your TempDB, because by default the size of TempDB increases in increments of 10% of its current size, which starts in the low-MB range. If your workload needs TempDB to be 500GB in steady state, you can imagine the number of times TempDB needs to grow. On top of this, each time TempDB grows, all transactional activity may come to a halt, because TempDB is used by most operations and these activities will be blocked until more disk space is allocated to TempDB. So the key recommendation is to pre-allocate a TempDB size that matches the needs of your workload. For example, if your workload needs a 500GB TempDB, it makes sense to pre-allocate, say, 600GB of disk space so it does not need to grow further. Auto-grow should be treated as an exception, not as a strategy. One interesting aspect is that once the size of the database is explicitly set to, say, 600GB, that size is retained the next time you start SQL Server. You can use the instant file initialization feature available starting with SQL Server 2005: to use it, you must run the SQL Server (MSSQLSERVER) service under a Windows account that has been granted the SE_MANAGE_VOLUME_NAME special permission. Similarly, the log file(s) should be appropriately sized.
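The pre-sizing recommendation above can be sketched with ALTER DATABASE. This is a hedged sketch: the logical file names (tempdev, templog) are the SQL Server defaults, and the sizes are purely illustrative, not prescriptive.

```sql
-- Pre-size tempdb so it does not rely on auto-grow (sizes are examples only)
ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, SIZE = 600GB, FILEGROWTH = 1GB);
ALTER DATABASE tempdb
    MODIFY FILE (NAME = templog, SIZE = 50GB, FILEGROWTH = 1GB);
```

The explicit sizes survive instance restarts, while the fixed-size FILEGROWTH keeps any residual auto-grow events from producing 10%-style increments.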

 
 

         There is one filegroup for data files and one filegroup for log files. It is recommended that you create at least as many data files as the number of cores used by the SQL Server process to minimize allocation bottlenecks. I will explain this in more detail in my troubleshooting section.

         Types of Objects in TempDB: The following three kinds of objects are created in TempDB

o   User Objects: These objects correspond to ## and # tables and table variables created explicitly by the application. BOL describes these objects well. For a discussion of the differences between a # table and a table variable, please refer to the blog http://blogs.msdn.com/sqlserverstorageengine/archive/2008/03/30/sql-server-table-variable-vs-local-temporary-table.aspx. Some key points about changes for user objects in SQL Server 2005:

  REDO information is not logged

  Improved caching for temporary objects. SQL Server caches 1 IAM page and 1 data page with the associated query plan. The benefit of this is significant. First, if the same plan is executed multiple times, the temporary table only needs to be created once, thereby reducing DDL contention. Imagine if a temporary table with, say, 5 columns were created and destroyed every time: SQL Server would need to insert one row into the system table for 'tables' and 5 rows into 'columns', and then remove these entries when the table is removed. Second, it reduces allocation overhead. Let me illustrate this with the following example.

 
 

 
 

use general
go

set nocount on
go

-- create the stored procedure
if object_id('test_temptable_caching') IS NOT NULL
    drop procedure test_temptable_caching;
go

create procedure test_temptable_caching
as

create table #t1 (c1 int, c2 int, c3 char(5000))
--create index foo on #t1(c1);
declare @i int
select @i = 0
while (@i < 10)
begin
    insert into #t1 values (@i, @i + 1000, 'hello')
    select @i = @i+1
end
print 'done with the stored proc'
go

 
 

Now if I execute this stored procedure in a loop, the #table is only created on the first execution. You can use the following script to know how many times the #table was created.

 
 

declare @table_counter_before_test bigint;

select @table_counter_before_test = cntr_value
from sys.dm_os_performance_counters
where counter_name = 'Temp Tables Creation Rate';

declare @i int
select @i = 0
while (@i < 10)
begin
    exec test_temptable_caching
    select @i = @i+1
end

declare @table_counter_after_test bigint;

select @table_counter_after_test = cntr_value
from sys.dm_os_performance_counters
where counter_name = 'Temp Tables Creation Rate';

print 'Temp tables created during the test: ' +
    convert(varchar(100), @table_counter_after_test - @table_counter_before_test);

 
 

                       Temp objects are cached only when all of the following conditions hold:

a.    No named constraints are created.

b.    No Data Definition Language (DDL) statements that affect the table are run after the temp table has been created, such as CREATE INDEX or CREATE STATISTICS.

c.    The temp object is not created by using dynamic SQL, such as: sp_executesql N'create table #t(a int)'
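As a hedged illustration of the DDL condition above, the following procedure's #table would not be cached, because a DDL statement runs after the table is created; the procedure name, table and index names are made up for this sketch.

```sql
create procedure test_no_temptable_caching
as
create table #t2 (c1 int, c2 int)
-- DDL after creation: this CREATE INDEX disables caching of #t2
create index ix_t2_c1 on #t2 (c1)
insert into #t2 values (1, 100)
go
```

Running such a procedure in a loop and sampling 'Temp Tables Creation Rate' as in the earlier script should show the counter increasing on every execution rather than only once.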

 
 

o   Internal Objects: These objects are created internally by SQL Server to execute user commands. For example, when processing an Order By clause, SQL Server uses TempDB for sorting. Similarly, when doing a hash join, the hash table uses TempDB for its persistence. Some examples of internal objects are work tables, work files, spool operators and so on. Like user objects, the caching behavior of internal objects has improved in SQL Server 2005: SQL Server now caches 1 IAM page and 1 full extent (i.e., 8 pages). You can identify work table caching using the perfmon counter 'Worktables Created/sec' in a similar way to the #table example earlier. Another key point is that operations on internal objects are not logged, because internal objects are created and destroyed within the scope of a single statement.
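The work-table counter mentioned above can also be read from T-SQL rather than perfmon; a minimal sketch (the counter is cumulative, so sample it before and after a test, as with the temp-table counter earlier):

```sql
-- Read the cumulative work-table creation counter (Access Methods object)
select cntr_value as worktables_created
from sys.dm_os_performance_counters
where counter_name = 'Worktables Created/sec';
```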

 
 

o   Version Store: This is a new construct in SQL Server 2005 and it needs a separate blog of its own.


 

Managing TempDB in SQL Server: TempDB Basics (Version Store: logical structure)

Now that we know a few things about the version store, let us look into its structure to understand how it stores rows from different tables/indexes with different schemas. Interestingly, you don't need to look far: the answer is available when you examine the DMV sys.dm_tran_version_store.

This DMV shows the full logical structure of the version store. There are two key points to note. First, the version store consists of 8K pages, just like data or index pages. These pages exist in the buffer pool and can be flushed to disk, in this case to TempDB, under memory pressure; the rows in the version store follow the same rules as any data/index row. Second, row versions are stored as their full binary image, the same way they are stored in the data page. The binary image is broken into a 'first part' and a 'second part', which SQL Server combines internally and interprets just as it would the actual data/index row according to its schema. This makes row-version storage independent of the schema of the owning object. So a given version store page can contain rows from multiple tables and indexes, and in fact they can come from any database in the SQL Server instance; in other words, the version store is shared among all databases in the instance. Just as the pages of a table or an index may be kicked out of the buffer pool under memory pressure, so may the pages of the version store. For this reason the version store is backed by persistence in TempDB.

If you look at the fields in the sys.dm_tran_version_store DMV, you will note that a version row stores many other attributes, like database-id, row-length and so on, which are not part of the original data/index row. So you may wonder how the version of a row of size 8060 bytes (the maximum allowed length of a row in SQL Server) can be stored in the version store, given that the version store row carries these extra attributes. The answer is that the data row is physically broken into 2 rows on the version store page; the DMV, being virtual, does not show this and presents it as 1 big row.

Here is an example of the contents of the version store, where a transaction with XSN 57 has updated three different rows while a transaction with XSN 58 has updated only 1 row. Note, if a transaction updates the same row multiple times, only 1 version is created, because to other transactions it looks like the first transaction is holding an X lock.
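An output like the one below can be produced with a query along these lines (column list limited to the fields shown):

```sql
select transaction_sequence_num, version_sequence_num, database_id,
       rowset_id, status, min_length_in_bytes,
       record_length_first_part_in_bytes, record_image_first_part,
       record_length_second_part_in_bytes, record_image_second_part
from sys.dm_tran_version_store;
```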

transaction_sequence_num version_sequence_num database_id

------------------------ -------------------- -----------

57 1 9

57 2 9

57 3 9

58 1 9

 
 

rowset_id status min_length_in_bytes

-------------------- ------ -------------------

72057594038321152 0 12

72057594038321152 0 12

72057594038321152 0 12

72057594038386688 0 16

 
 

record_length_first_part_in_bytes

---------------------------------

29

29

29

33

 
 

record_image_first_part

--------------------------------------------------------------------

0x50000C0073000000010000000200FCB000000001000000270000000000

0x50000C0073000000020000000200FCB000000001000100270000000000

0x50000C0073000000030000000200FCB000000001000200270000000000

0x500010000100000002000000030000000300F800000000000000002E0000000000

 
 

record_length_second_part_in_bytes record_image_second_part

---------------------------------- ----------------------

0 NULL

0 NULL

0 NULL

0 NULL

 
 

In the next blog, I will describe how the pages in the version store are grouped together, along with the subsequent garbage collection and its gotchas.

Managing TempDB in SQL Server: TempDB Basics (Version Store: Growth and removing stale row versions)

In the previous chapter, I described the logical/physical structure of a row in the version store. A natural question that comes up is how these pages are grouped together and how SQL Server garbage collects these rows. You may recall that if you need to DELETE a large number of rows from a table, a more efficient option is to truncate or even drop the table rather than deleting one row at a time. Similarly, if SQL Server had to delete versioned rows 1 row at a time, it would be quite expensive, so a better option is to DELETE a bunch of rows together. For this reason, SQL Server groups row versions in a physical unit called an 'Append-Only store'. All the versioned rows in an Append-Only store are garbage collected together. Alternatively, you can think of an Append-Only store as an internal table, though it is not really a table like a #table, and this table is dropped once SQL Server determines that it does not need the row versions in it. You can see that this operation is much more efficient than deleting 1 versioned row at a time. A new Append-Only store is created every minute, if needed (i.e., if there is no transactional activity that generates row versions, the Append-Only store will not be created). The version store is the aggregation of all active Append-Only stores.

Let us now work through an example to illustrate the functioning of the version store. For simplicity, assume that the transactional activity is such that a new Append-Only store is created every minute. Say Append-Only store AS1 was created at time T and the two subsequent Append-Only stores AS2 and AS3 were created at times (T+1) and (T+2) respectively. You may now wonder how SQL Server chooses a specific Append-Only store to 'insert' a row version into. Well, SQL Server associates an Append-Only store with a transaction at its start. So, for transactions that start between T and (T+1), the row versions are stored in AS1. Similarly, for transactions that start between (T+1) and (T+2), the row versions are stored in AS2. Note, a transaction may be long running, say 5 minutes, and may generate row versions throughout its lifetime, but all the row versions it creates still go to the Append-Only store assigned to it at the beginning of the transaction. You can also imagine that each Append-Only store will end up with a different size, depending on the row versions generated by its associated transactions. One important point I did not mention earlier is that inserts into the Append-Only store are not logged. Why? Well, if SQL Server were to fail, all active transactions would be rolled back anyway, so SQL Server does not need the 'state' of the version store as of the time SQL Server stopped.

Let us now look at when it is safe to remove an Append-Only store. To remove an Append-Only store, the SQL Server must guarantee that no transaction will need to look at the row versions it contains. For the above example, to remove Append-Only store AS1, the following conditions must be met

1.    All transactions that started between T and (T + 1) have completed. Let us say that these transactions completed at time T2.

2.    All transactions that started between (T + 1) and T2 and needed row versions (i.e. the transactions running with RCSI or SI) from AS1 have completed.

3.    All Append-Only stores created before AS1 have been removed.

 
 

Here are some key observations: (a) typically, transactions commit in the order they were started, so this scheme of deleting Append-Only stores works well; (b) a long running transaction will prevent the deletion of any Append-Only stores created since it started. This can cause the version store to grow until TempDB runs out of space. While this may sound alarming, it is similar to a long running transaction causing the transaction log to fill up; the key difference is that the impact of versioning is wider, as it can affect all databases that need to create row versions. Typically a long running transaction indicates a poor application design. SQL Server provides tools to detect version store growth; I will cover this in the troubleshooting section in later blogs.
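As a preview of that troubleshooting, here is a minimal hedged sketch for detecting version store growth, assuming the 'Version Store Size (KB)' counter and the snapshot-transaction DMV are available on your build:

```sql
-- Current size of the version store (Transactions object)
select cntr_value as version_store_kb
from sys.dm_os_performance_counters
where counter_name = 'Version Store Size (KB)';

-- Longest-running snapshot transactions, which pin old Append-Only stores
select top 5 session_id, transaction_sequence_num, elapsed_time_seconds
from sys.dm_tran_active_snapshot_database_transactions
order by elapsed_time_seconds desc;
```

A steadily climbing version store size together with a large elapsed_time_seconds at the top of the second result is the classic signature of a long running transaction holding back garbage collection.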

On another note, you may wonder how the version store is managed for an ONLINE index build, which can potentially take an order of magnitude more time than a typical transaction. SQL Server gets around this issue by internally creating another version store, distinct from the one I just described, for the ONLINE index build, containing the row versions of the table on which the index is being built.

This concludes my discussion on the version store. I will cover the troubleshooting of version store as part of overall troubleshooting of TempDB. Hope this provided you with some deeper understanding and I look forward to your comments.

What is allocation bottleneck?

Allocation bottleneck refers to contention on the system pages that store allocation structures. There are three types of pages that store allocation structures, as follows:

         PFS (Page Free Space): it tracks the following key information about each page, using 1 byte per page; each PFS page stores the status of 8088 pages.

o   Free space available

o   If it is allocated or not

o   Has ghost records (when a row is deleted, it is marked as ghost)

         GAM (Global Allocation Map): Tracks whether a uniform extent is free or not. A bit '1' indicates that the extent is available for allocation. Each GAM page tracks 64000 extents, or 4GB.

         SGAM (Shared Global Allocation Map): Tracks whether an extent is a mixed extent or not. A bit '1' indicates that it is a mixed extent with one or more free pages. Mixed extents are used to allocate the first 8 pages of an object; once an object grows beyond 8 pages, its pages are allocated from uniform extents. Each SGAM page covers 64000 extents, or 4GB.

So, for example, a database file of size 16GB will have 4 GAM pages (each GAM covers 4GB), 4 SGAM pages and roughly 256 PFS pages. It is worth mentioning that each file has its own PFS, GAM and SGAM pages.
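The arithmetic behind that example can be sketched in T-SQL. This is a rough sketch using the intervals given above (8KB pages, so 128 pages per MB; a GAM/SGAM page covers roughly 4GB, a PFS page covers 8088 pages):

```sql
declare @file_size_mb bigint = 16 * 1024;            -- 16GB data file
declare @pages bigint = @file_size_mb * 128;         -- 128 8KB-pages per MB
select ceiling(@pages / 8088.0)        as pfs_pages, -- 1 PFS page per 8088 pages
       ceiling(@file_size_mb / 4096.0) as gam_pages, -- ~1 GAM page per 4GB
       ceiling(@file_size_mb / 4096.0) as sgam_pages;-- ~1 SGAM page per 4GB
```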

Now, let us look at a high level at what happens during allocation. A page allocation is done when an object is first created or when inserting a row requires a new page to be allocated. Please note, my intent in the following description is not to describe the exact allocation algorithm of SQL Server (in fact, the allocation mechanism is much more complex and possibly quite different) but to give you an idea of how contention can arise on allocation structures.

         SQL Server checks the target extent for pages available for allocation by examining the PFS page under an SH latch. If a page is found, it is allocated and the PFS information is updated under an UPDATE latch. For HEAPs, but not for B-trees, SQL Server will also look for other allocated pages with enough free space to store the new row, and then updates the new free-space information by taking the UPDATE latch on the PFS page.

         If no such page is found, SQL Server looks at a GAM page under an SH latch to find a free uniform extent (assuming the object already has > 8 pages). If no such extent is found, it looks at the next GAM page, and so on. When a free extent is found, SQL Server takes an UPDATE latch on the GAM page and updates the information.

         If the object has <= 8 pages, a page from a mixed extent needs to be allocated. SQL Server looks at an SGAM page under an SH latch and, if a mixed extent with one or more free pages is found, the page is allocated and the SGAM page is updated under an UPDATE latch.

Similarly, when a page is deallocated, an object is dropped, or a row is deleted (for HEAPs), the allocation structures need to be updated. As you can now imagine, if there is significant allocation/deallocation activity in a SQL Server database, multiple threads may need to wait to acquire UPDATE or SH latches in conflicting modes, which can lead to an allocation bottleneck.

Starting with SQL Server 2005, the caching mechanism for objects in TempDB has been improved significantly, which reduces the allocation contention incurred by your workload. You can address an SGAM bottleneck by enabling TF-1118. Please refer to Managing TempDB in SQL Server: TempDB Configuration.

Managing TempDB in SQL Server: TempDB Configuration

In my previous chapters, I described the types of objects in TempDB and how they are managed. I hope that it provided you with a good working knowledge of TempDB. Now the next question is how do I configure the TempDB for my production workload? In this context, there are three common questions as follows:

1.   Where should I create TempDB?

2.   What should be the size of TempDB?

3.   Single file vs multiple file?

Let us tackle each of these questions in order.

Where (i.e. what disks) should I create TempDB?

More often than not, customers may not realize that the cause of a workload slowdown is that TempDB sits on a slower device. This can happen as the workload changes over time, or because TempDB was configured on a slower device to begin with. The performance of TempDB is critical to your workload because, as I indicated in my earlier blogs, SQL Server uses TempDB to store intermediate results as part of executing a query, for example to create a hash table or to sort for an order by. So what should you do? Well, it is not any different from what you need to do for your user databases: measure the IO bandwidth needed to meet the demands of your workload. Since the persistence requirements of TempDB are different (i.e., no REDO needed), you may want to consider creating TempDB on its own spindles. In fact, you may even consider using a RAM disk to achieve better performance. Please refer to the KB article http://support.microsoft.com/kb/917047

What should be the size of TempDB?

Unlike user databases, where you can probably estimate the growth rate of tables over time based on the workload, estimating the size of TempDB is non-trivial. The size of TempDB can be affected, for example, by a change in query plans, by the version store when there is a long running transaction, and so on. The best way to estimate the size of TempDB is by running your workload in a test environment. Once you have a good estimate, use the Alter Database command to set its size with a safety factor that you feel is appropriate. Never, I mean never, let TempDB grow to its steady-state size through auto-grow. You should only use auto-grow as a last resort, not as a strategy. Also, remember that TempDB is created every time you restart SQL Server, but its size is set to either the default from the Model database or the size you set using the Alter Database command (the recommended option).

         Don't rely on auto-grow to manage file sizes. You should set the appropriate size of TempDB using the Alter Database command. Here are the pitfalls of auto-grow:

o   Auto-grow causes a pause during processing when you can least afford it

o   Less of an issue with instant file initialization

o   Auto-grow leads to physical fragmentation
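To check what sizes and growth settings are currently in effect, a quick sketch against the catalog (size and growth are stored in 8KB pages):

```sql
-- Inspect tempdb file sizes and growth settings
select name, type_desc,
       size * 8 / 1024 as size_mb,
       case when is_percent_growth = 1
            then convert(varchar(12), growth) + ' percent'
            else convert(varchar(12), growth * 8 / 1024) + ' MB'
       end as growth_setting
from tempdb.sys.database_files;
```

A percent-based growth_setting on a small file is exactly the fragmentation trap described above.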

Should I use 1 file or multiple files for TempDB?

It is recommended that you create as many files as there are cores or CPUs used by SQL Server to minimize allocation bottlenecks. Yes, starting with SQL Server 2005 there is better caching of TempDB objects, but there are cases where the caching does not apply, and even with caching, when the size of temp objects grows beyond 1 page (# tables) or 1 extent (internal objects), SQL Server can potentially incur an allocation bottleneck. Please refer to the allocation-bottleneck post for details on what causes it. Before we proceed further, you may wonder why we talk about allocation contention in the context of TempDB and not in the context of other databases; clearly, allocation contention can happen even in a user database. The main reason is that objects in TempDB are created and destroyed all the time to process the customer workload, which leads to an order of magnitude more allocation contention.

To minimize allocation bottleneck, here is the recommendation

         Spread TempDB across at least as many equally sized files as there are cores or CPUs. Since allocation in SQL Server uses proportional fill, allocations will be evenly distributed, and so will the access and manipulation of the allocation structures across all files. Note, you can always have more files than cores, but you may not see much improvement. One question that people often ask is whether they should create multiple files on the same physical spindle or across multiple spindles. The allocation bottleneck can be reduced by creating multiple files on a single spindle or on multiple spindles; creating files across multiple spindles helps increase the IO bandwidth but has no additional impact on the allocation bottleneck.

         If you are still encountering an allocation bottleneck, you may want to consider enabling TF-1118. Under this trace flag, SQL Server allocates full extents to each TempDB object, thereby eliminating the contention on SGAM pages, at the expense of some wasted disk space in TempDB. This trace flag has been available since SQL Server 2000. With the improvements in TempDB object caching in SQL Server 2005, there should be significantly less contention on allocation structures; use this trace flag only if you actually see contention on SGAM pages. If your workload is not encountering SGAM contention, enabling TF-1118 will not help. Also, this TF is instance-wide and impacts user databases as well.
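Both recommendations above can be sketched as follows. The file name, path, and sizes are examples only; TF-1118 is enabled at runtime here, while adding -T1118 as a startup parameter is the persistent alternative.

```sql
-- Add a second equally sized data file (repeat per core, adjusting names/paths)
ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev2, FILENAME = 'E:\tempdb\tempdev2.ndf',
              SIZE = 8GB, FILEGROWTH = 1GB);

-- Enable TF-1118 instance-wide (-1): full extents for each tempdb object,
-- which removes SGAM (mixed extent) contention
DBCC TRACEON (1118, -1);
```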

TempDB Monitoring and Troubleshooting: IO Bottleneck

I hope my earlier blogs on TempDB (http://blogs.msdn.com/sqlserverstorageengine/archive/tags/TempDB/default.aspx) have given you a good background on the objects contained in TempDB and its configuration, and that you have deployed your workload in production and it runs forever without any glitches. Oh, really? Well, don't we all wish it were so, but as you may have already experienced, perhaps one time too many, performance and configuration problems do happen. In fact, most DBAs/administrators spend 20-30% of their time troubleshooting such issues. With that context, let us look into the key issues with TempDB in production and the steps to troubleshoot them.

TempDB is more vulnerable because it is a shared resource: any misbehaving application or ad hoc query can cause extreme resource issues in TempDB, thereby affecting other applications running on the same instance. Here is a list of 4 common issues that you may encounter with TempDB:

         IO bottleneck

         Allocation Contention

         DDL Contention

         Running out of space

I will cover the IO bottleneck in this blog and the others in my next blogs, with emphasis on how to detect that an issue is indeed happening and the steps that you can take to address it.

IO Bottleneck: Now, an IO bottleneck is nothing specific to TempDB. Any database can incur an IO bottleneck if the IO bandwidth is not configured to meet the workload demands. It is possible, and in fact likely, that your workload has changed over time such that the initial configuration can no longer handle it.

When a customer hits an IO bottleneck, it is tempting to conclude that the IO subsystem needs to be upgraded. While this may in fact be the final solution, it should not be the first and only step. Here are the recommended actions:

1.   Check if you have memory pressure: If you don't have sufficient physical memory allocated to the SQL Server process, you may incur additional IO that could otherwise be avoided. As a first step, make sure you don't have a memory bottleneck by looking at the following counters:

a.   Buffer Cache Hit ratio

b.   Page Life Expectancy

c.   Checkpoint pages/sec

d.   Lazywrites/sec

 
 

I would also advise you to look at the DBCC Memorystatus command, which gives details on how memory is being used. You may want to refer to
http://support.microsoft.com/kb/907877
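The four counters above can also be sampled from T-SQL; a hedged sketch, using the counter names as they appear in sys.dm_os_performance_counters (exact casing may vary slightly by version):

```sql
select counter_name, cntr_value
from sys.dm_os_performance_counters
where counter_name in ('Buffer cache hit ratio',
                       'Page life expectancy',
                       'Checkpoint pages/sec',
                       'Lazy writes/sec');
```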

 
 

2.   Identify the queries that generate the most IOs: you can run the following DMV query to do so. The output shows the top 10 queries (the SQL text and the query plan handle) that generate the most IOs across all their executions. Agreed, these IOs include IOs in TempDB as well as in other databases, but it is a good place to start. You can investigate these queries to identify a bad query plan (perhaps an index was not chosen) and/or possibly look into rewriting the queries to minimize IOs.

 
 

SELECT TOP 10
    (total_logical_reads/execution_count) AS avg_logical_reads,
    (total_logical_writes/execution_count) AS avg_logical_writes,
    (total_physical_reads/execution_count) AS avg_phys_reads,
    execution_count,
    statement_start_offset AS stmt_start_offset,
    (SELECT SUBSTRING(text, statement_start_offset/2 + 1,
        (CASE WHEN statement_end_offset = -1
              THEN LEN(CONVERT(nvarchar(MAX), text)) * 2
              ELSE statement_end_offset
         END - statement_start_offset)/2)
     FROM sys.dm_exec_sql_text(sql_handle)) AS query_text,
    plan_handle
FROM sys.dm_exec_query_stats
ORDER BY (total_logical_reads + total_logical_writes) DESC


3.   Once you are satisfied that you have configured your memory correctly and the query plans are OK, it makes sense to add more IO bandwidth for TempDB. Now, here is one catch. Say you add one new file to TempDB on a faster disk. Guess what happens? Due to the proportional fill algorithm, allocations will be skewed in favor of the new file, and this may in fact lead to a worse IO and allocation bottleneck. The recommended action is to add the new file(s) with the same size as the existing files and then restart SQL Server. This guarantees that the IOs are distributed across all files. Another alternative is to move all the TempDB files to a different disk subsystem.
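As a sketch, adding such a file might look like the following; the file name, path and 8 GB size are hypothetical and should match your existing TempDB files:

```sql
-- Sketch: add a TempDB data file of the SAME size as the existing files.
-- The logical name, path and size below are placeholders.
ALTER DATABASE tempdb
ADD FILE (NAME = tempdev2,
          FILENAME = 'T:\TempDB\tempdev2.ndf',
          SIZE = 8GB);
-- Restart SQL Server afterwards so that allocations are evenly distributed.
```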

TempDB Monitoring and Troubleshooting: Allocation Bottleneck

This blog continues the discussion of the common issues in TempDB that you may need to troubleshoot. In the above post, we discussed how to identify and troubleshoot an IO bottleneck in TempDB. In this blog, I will describe how to identify an allocation bottleneck and how to troubleshoot it.

As you may recall, an allocation bottleneck is caused when allocation structures are accessed by concurrent threads in conflicting modes. Please refer to http://blogs.msdn.com/sqlserverstorageengine/archive/2009/01/04/what-is-allocation-bottleneck.aspx for details. In this case, the concurrent thread(s) will need to wait to acquire the page latch, thereby slowing down the operation. The good thing is that the pages containing allocation structures (GAM, SGAM, PFS) are well known and have fixed page numbers in each file. For example, in file-id 1, the allocation page IDs are

         PFS – 1

         GAM – 2

         SGAM - 3

After that, a PFS page recurs every 8,088 pages in a file, and the GAM and SGAM pages recur every 511,232 pages. The same layout repeats in every other file.
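The intervals above boil down to simple modulo arithmetic. This is an illustrative sketch (the variable is hypothetical) that classifies a page number within a file:

```sql
-- Sketch: classify a page-in-file number as PFS, GAM, SGAM or a regular
-- page, based on the fixed intervals described above.
DECLARE @page_id int = 8088;  -- hypothetical page number
SELECT CASE
         WHEN @page_id = 1 OR @page_id % 8088 = 0          THEN 'PFS'
         WHEN @page_id = 2 OR @page_id % 511232 = 0        THEN 'GAM'
         WHEN @page_id = 3 OR (@page_id - 1) % 511232 = 0  THEN 'SGAM'
         ELSE 'regular page'
       END AS page_type;
```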

Diagnosing:

You can run the following DMV query to find any latch waits occurring on allocation pages:

select session_id, wait_duration_ms, resource_description
from sys.dm_os_waiting_tasks
where wait_type like 'PAGE%LATCH_%' and
      resource_description like '2:%'

Since the database id of TempDB is 2, the search argument '2:%' matches any page in TempDB across any file. If such a page happens to be a GAM, SGAM or PFS page, the wait represents an allocation bottleneck. Note that in a concurrent application some blocking is expected, so you will need to baseline the allocation waits while your application is performing normally. Only when the waits significantly exceed the baseline does it signal that you are incurring an allocation bottleneck.
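To narrow the result down to allocation pages only, the page number can be parsed out of resource_description (which for page latches has the form 'dbid:fileid:pageid') and matched against the PFS/GAM/SGAM intervals. This is a sketch under that assumption:

```sql
-- Sketch: keep only waits on TempDB allocation pages (PFS/GAM/SGAM).
SELECT session_id, wait_duration_ms, resource_description
FROM sys.dm_os_waiting_tasks
CROSS APPLY (SELECT CAST(RIGHT(resource_description,
                  CHARINDEX(':', REVERSE(resource_description)) - 1) AS int)
                    AS page_id) AS p
WHERE wait_type LIKE 'PAGE%LATCH_%'
  AND resource_description LIKE '2:%'
  AND (p.page_id IN (1, 2, 3)
       OR p.page_id % 8088 = 0            -- PFS
       OR p.page_id % 511232 IN (0, 1));  -- GAM / SGAM
```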

Troubleshooting:

SQL Server recommends the following:

Create at least as many files of equal size as there are cores/CPUs available to the SQL Server process. The rationale is that at any given time, the number of concurrently running threads is <= the number of cores/CPUs. Don't confuse this with the number of active sessions/batches.

Enable TF-1118 if you are encountering a bottleneck in the SGAM structures.

If you are still encountering an allocation bottleneck, you will need to look at your application to see which query plans are creating/dropping objects in TempDB and whether these objects are being cached, and take corrective action where possible. Most IT shops have very limited choice here, as they don't own the application code.
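For the trace-flag recommendation above, this is what enabling it could look like; note that DBCC TRACEON only lasts until the next restart, so for a permanent setting you would add -T1118 to the SQL Server startup parameters instead:

```sql
-- Sketch: enable trace flag 1118 globally (the -1 argument) and verify it.
DBCC TRACEON (1118, -1);
DBCC TRACESTATUS (1118);
```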

Here I want to point out one seemingly harmless step to solve an allocation bottleneck that can in fact make it worse. Say you are encountering an allocation bottleneck and you decide to add one more file, hoping that allocations will spread further. Well, if you recall the proportional fill methodology, all new allocations will favor the newer file because it has the most free space, so suddenly the allocation problem becomes even worse. It is a good practice to restart SQL Server when you add a new file to TempDB; clearly, the new file needs to be the same size as the other files.

TempDB Monitoring and Troubleshooting: DDL Bottleneck

This blog continues the discussion of the common issues in TempDB that you may need to troubleshoot. In previous blogs, I discussed how to identify and troubleshoot IO and allocation bottlenecks in TempDB. In this blog, I will describe how to identify a DDL bottleneck in TempDB and how to troubleshoot it.

It will be useful to understand why a DDL bottleneck is most commonly seen in TempDB and not in user databases. If you think about user database(s), they are created as part of application design. Once the application is tested and deployed, the schema in user databases does not change, except perhaps to add/drop indexes or when the application is upgraded to a newer version. TempDB, on the other hand, is quite different: it is used as a scratch pad both by user applications, to store intermediate data in # and ## tables as well as table variables, and by SQL Server itself as part of processing queries. Please refer to http://blogs.msdn.com/sqlserverstorageengine/archive/2008/12/22/tempdb-basics-cont.aspx for some background.

Now, let us take a simple case where an application creates a #table containing 10 columns in a stored procedure. When this stored procedure is invoked, SQL Server creates this #table by inserting one row in the system table that stores meta-information about tables, and 10 rows, each representing one column of the #table, in the system table that stores meta-information about columns. Other system tables may be affected as well, for example when you create index(es) on the #table, but clearly, at a minimum, 11 rows must be inserted into system tables at create time and then deleted when the stored procedure terminates. Now, for a moment, assume that this table is not cached and that this stored procedure is executed thousands of times by concurrent threads. This will cause a large number of inserts/deletes in the system tables in TempDB. I hope this gives you a good understanding of why and how DDL contention, i.e. blocking when accessing system tables, can happen inside TempDB. Note that it is possible for an application to create another 'user' database as a scratch pad. In that case, this database can potentially incur a DDL bottleneck, but then you may wonder why one would create a scratch database when TempDB provides optimized logging and easy manageability.

Now let us discuss how to detect DDL contention and what you can do to minimize it.

Diagnosing DDL Contention

As I indicated in the previous paragraph, DDL contention refers to contention on pages belonging to system tables. For this specific case, we need to look at PAGELATCH contention on system tables within TempDB. You can use the following query to identify it:

select session_id, wait_duration_ms, resource_description
from sys.dm_os_waiting_tasks
where wait_type like 'PAGE%LATCH_%' and
      resource_description like '2:%'

You may recall that this query is very similar, in fact identical, to the DMV query used to detect an allocation bottleneck. The difference, however, is that you will need to look at pages other than GAM, SGAM and PFS that are incurring PAGELATCH waits, and verify, using the undocumented DBCC PAGE command, whether these pages indeed belong to system table(s) in TempDB. This will show you that you are indeed incurring a DDL bottleneck. As with any performance metric, you will need to compare this number against a baseline taken when the workload was performing normally, to see if you are incurring an excessive DDL bottleneck.
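As an illustrative sketch, inspecting one such page could look like this; the page address 2:1:120 is hypothetical and would come from resource_description in the query above:

```sql
-- Sketch: dump a TempDB page header to find out which object owns it.
-- DBCC PAGE is undocumented; its arguments are (dbid, fileid, pageid, printopt).
DBCC TRACEON (3604);        -- route DBCC output to the client
DBCC PAGE (2, 1, 120, 0);
-- In the output, look at 'Metadata: ObjectId'; a small object id typically
-- indicates a system base table, and OBJECT_NAME(<ObjectId>, 2) returns its name.
```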

You can also look at the following PERFMON counters to see if you are encountering a significant change in these numbers. Note that internal objects in TempDB don't cause DDL contention.

Temp Tables Creation Rate: the number of temporary tables or table variables created per second.

Temp Tables For Destruction: the number of temporary tables or table variables waiting to be destroyed by the cleanup system thread.
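The same counters can also be read from within SQL Server; a sketch:

```sql
-- Sketch: read the temp-table counters via DMV instead of PERFMON.
SELECT counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN ('Temp Tables Creation Rate',
                       'Temp Tables For Destruction');
```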


Troubleshooting:

OK, now that you have determined that you are indeed encountering a DDL bottleneck, what can you do? Clearly, the goal is to minimize the creation/destruction of user objects (i.e. #, ## tables and table variables). Unfortunately, there are no easy fixes. My suggestions are:

(1)        Check if the user objects are indeed getting cached. Please refer to http://blogs.msdn.com/sqlserverstorageengine/archive/2008/12/22/tempdb-basics-cont.aspx for details on detecting user object caching.

(2)        Look at query plans for queries that are creating/destroying user objects in TempDB and see if you can do something to minimize this. For example, a #table may be created inside a loop when it could perhaps be created once outside the loop.
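As an illustrative sketch of suggestion (2) (the table, columns and loop are hypothetical), instead of creating and dropping a #table on every iteration, create it once and empty it between iterations:

```sql
-- Sketch: hoist #table creation out of a loop to avoid repeated DDL
-- against TempDB system tables.
CREATE TABLE #work (id int, payload varchar(100));

DECLARE @i int = 0;
WHILE @i < 10
BEGIN
    TRUNCATE TABLE #work;                  -- cheap: no DDL, metadata untouched
    INSERT INTO #work VALUES (@i, 'item'); -- ... per-iteration processing ...
    SET @i = @i + 1;
END

DROP TABLE #work;
```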