Transaction Log Management
The right of Tony Davis and Gail Shaw to be identified as the authors of this work has been asserted by them in accordance with the Copyright, Designs and Patents Act 1988. All rights reserved. No part of this publication may be reproduced, stored or introduced into a retrieval system, or transmitted, in any form, or by any means (electronic, mechanical, photocopying, recording or otherwise) without the prior written consent of the publisher. Any person who does any unauthorized act in relation to this publication may be liable to criminal prosecution and civil claims for damages. This book is sold subject to the condition that it shall not, by way of trade or otherwise, be lent, re-sold, hired out, or otherwise circulated without the publisher's prior consent in any form other than that in which it is published and without a similar condition including this condition being imposed on the subsequent purchaser.
Technical Review and Edit: Kalen Delaney
Additional Material: Jonathan Kehayias and Shawn McGehee
Cover Image by Andy Martin
Typeset by Peter Woodhouse and Gower Associates
Table of Contents
Introduction ____ 11
Chapter 1: Meet the Transaction Log ____ 15
How SQL Server Uses the Transaction Log ____ 15
Write Ahead Logging and Transactional Consistency ____ 16
Transaction Log Backup and Restore ____ 18
Controlling the Size of the Log ____ 20
A Brief Example of Backing up the Transaction Log ____ 22
Summary ____ 26
Chapter 2: Some, But Not Too Much, Log Internals ____ 27
Virtual Log Files and the Active Log ____ 27
Log Truncation and Space Reuse ____ 31
A Peek at How SQL Server Grows the Log ____ 33
Summary ____ 42
Chapter 3: Transaction Logs, Backup and Recovery ____ 43
The Importance of Backups ____ 43
What Backups Do I Need to Take? ____ 45
Database Recovery Models ____ 47
Choosing the right recovery model ____ 48
Setting the recovery model ____ 49
Discovering the recovery model ____ 50
Switching models ____ 53
Log Backup Logistics ____ 54
Frequency of log backups ____ 54
Preserving the log chain ____ 55
Storing log backups ____ 56
Automating and verifying backups ____ 57
Summary ____ 58
Chapter 4: Managing the Log in SIMPLE Recovery Model ____ 59
Working in SIMPLE Recovery Model ____ 59
Pros and Cons of SIMPLE Recovery Model ____ 61
Chapter 5: Managing the Log in FULL Recovery Model ____ 62
What Gets Logged? ____ 62
Basics of Log Backup ____ 63
Are log backups being taken? ____ 63
How to back up the transaction log ____ 64
Tail log backups ____ 65
Performing Restore and Recovery ____ 67
Full restore to point of failure ____ 68
Restore to end of log backup ____ 71
Point-in-time restores ____ 76
Tail log backups when the database is offline ____ 84
Summary ____ 85
Chapter 6: Managing the Log in BULK_LOGGED Recovery Model
Minimally Logged Operations ____ 88
Advantages of Minimal Logging and BULK_LOGGED Recovery ____ 96
Implications of Minimally Logged Operations ____ 100
Crash recovery ____ 101
Database restores ____ 102
Log backup size ____ 106
Tail log backups ____ 107
Best Practices for Use of BULK_LOGGED ____ 113
Summary ____ 115
Chapter 7: Dealing with Excessive Log Growth
Sizing and Growing the Log ____ 118
Diagnosing a Runaway Transaction Log ____ 119
Excessive logging: index maintenance operations ____ 120
Lack of log space reuse ____ 128
Other possible causes of log growth ____ 140
Handling a Transaction Log Full Error ____ 144
Mismanagement or What Not To Do ____ 147
Detach database, delete log file ____ 147
Forcing log file truncation ____ 148
Scheduled shrinking of the transaction log ____ 149
Proper Log Management ____ 150
Summary ____ 151
Acknowledgements
Tony Davis would like to thank: Gail Shaw. It's been a pleasure to work with Gail on this book. She was tireless in keeping me honest and accurate, and commendably patient in helping me understand some of the finer points of transaction log mechanics. Kalen Delaney. It was both reassuring and slightly intimidating to have Kalen edit and review my efforts, given how much she knows about this topic, and SQL Server in general. Her efforts have undoubtedly made this a far better book than it would have been otherwise. All of the people who give their time to contribute to the SQL Server community knowledge base, in the form of books, articles, blogs and forum posts, as well as technical presentations at events such as SQL Saturday and the PASS Summit. I've worked with many of these people over the years, and learned from all of them. Sarah, my wife. For her tolerance and patience during the trials and tribulations of writing this book, and for far too many other things to mention here.
Introduction
Associated with every SQL Server database is a primary data file (.mdf), possibly some secondary data files (.ndf), and a transaction log file (.ldf). The purpose of the data files is self-explanatory, but the purpose of the log file remains, for many, shrouded in mystery, and yet it is critical to the everyday operation of SQL Server, and to its ability to recover a database in the event of failure. A transaction log is a file in which SQL Server stores a record of all the transactions performed on the database with which the log file is associated. In the event of a disaster that causes SQL Server to shut down unexpectedly, such as an instance-level failure or a hardware failure, SQL Server uses the transaction log in order to recover the database to a consistent state, with data integrity intact. Upon restart, a database enters a crash recovery process in which the transaction log is read to ensure that all valid, committed data is written to the data files (rolled forward) and the effects of any partial, uncommitted transactions are undone (rolled back). SQL Server also uses the transaction log during normal operation, to identify what it needs to do when a transaction rolls back, due to either an error or a user-specified ROLLBACK statement. In short, the transaction log is the fundamental means by which SQL Server ensures database integrity and the ACID properties of transactions, notably durability. In addition, backups of the transaction log enable DBAs to restore a database to the state in which it existed at a previous, arbitrary point in time. During the restore phase, SQL Server rolls forward the data changes described in a series of log backup files. We can then recover the database and, via the recovery process described above, SQL Server will ensure that the database is in a transactionally consistent state, at the recovery point. When a SQL Server database is operating smoothly and performing well, there is no need to be particularly conscious of exactly what the transaction log does, or how it works. As a DBA, you just need to be confident that every database has the correct backup regime in place. When things go wrong, however, a deeper understanding of the transaction log
is important for taking corrective action, particularly when a point-in-time restore of a database is urgently required! In this book, we strive to offer just the right level of detail so that every DBA can perform all of the most important duties concerning management of the transaction log. This covers performing transaction log backup and restore, monitoring and managing log growth, including quick diagnosis of the causes of rapid growth, and optimizing log architecture to maximize log throughput and availability.
Book Structure
The book starts with a discussion, in Chapters 1 and 2, of how SQL Server uses the transaction log during normal operation as well as during database restore operations and crash recovery. These chapters aim to provide just enough information so that you understand the basic role of the log, how it works, and a little about its internal structure. With this foundation laid, the book sets about describing all of the major necessary management techniques for the transaction log, covering how to:
Choose the right recovery model
SQL Server offers three database recovery models: FULL (the default), SIMPLE, and BULK_LOGGED. The DBA must choose the appropriate model according to the business requirements for the database, and then establish maintenance procedures appropriate to that model. We cover this in Chapter 3, Transaction Logs, Backup and Recovery.
Perform transaction log backup and restore
Unless working in SIMPLE model, it is vital that the DBA perform regular backups of the transaction log. Once captured in a backup file, the log records can subsequently be applied to a full database backup in order to perform a database restore, and so re-create the database as it existed at a previous point in time, for example, right before a failure. Chapters 4 to 6 cover this in detail, for each of the recovery models.
Monitor and manage log growth
In a busy database, the transaction log can grow rapidly in size. If not regularly backed up, or if inappropriately sized, or assigned
incorrect growth characteristics, the transaction log file can fill up, leading to the infamous "9002" (transaction log full) error, which puts the database into a "read-only" mode. We deal with this topic in Chapter 7, Dealing with Excessive Log Growth.
Optimize log throughput and availability
In addition to basic maintenance such as taking backups, the DBA must take steps to ensure adequate performance of the transaction log. This includes hardware considerations, as well as avoiding situations such as log fragmentation, which can affect the performance of operations that read the log. We cover this topic in Chapter 8, Optimizing Log Throughput.
Monitor the transaction log
Having configured hardware for the log array, and pre-sized the log according to workload and maintenance requirements, it's very important that we monitor the I/O performance of the log array, as well as log activity, tracking log growth, log fragmentation, and so on. Chapter 9, Monitoring the Transaction Log, covers log monitoring tools and techniques.
Code Examples
You can download every script (denoted Listing X.X) in this book from the following URL: http://www.simple-talk.com/RedGateBooks/DavisShaw/SQLServerTransactionLog_Code.zip Most of the examples use custom-built databases, but a few rely on the readily available AdventureWorks database. For SQL Server 2008 and later, you can download it from Microsoft's codeplex site: http://msftdbprodsamples.codeplex.com/releases/ For SQL Server 2005, use: http://msftdbprodsamples.codeplex.com/releases/view/4004.
For SQL Server 2005 and 2008, run the file AdventureWorksDB.msi (SQL Server 2005) or simply copy the data and log files to your hard drive (SQL Server 2008) and then, in SQL Server Management Studio, attach the files to create the AdventureWorks database on your preferred SQL Server instance. For SQL Server 2008 R2 and later, simply follow the instructions here: http://social.technet.microsoft.com/wiki/contents/articles/3735.sql-server-samples-readme-en-us.aspx#Readme_for_Adventure_Works_Sample_Databases.
Feedback
We've tried our very best to ensure that this book is useful, technically accurate, and written in plain, simple language. If we've erred on any of these elements, we'd like to hear about it. Please post your feedback and errata to the book page, here: http://www.simple-talk.com/books/sql-books/sql-server-transaction-log-management/.
Chapter 1: Meet the Transaction Log
The log records for each transaction are stored in the right order (more on this in Chapter 2), and so SQL Server can construct a complete description of the actions of each transaction, which can be performed again as a part of redo, or rolled back as a part of undo, during a crash recovery operation.
The transaction log is not an audit trail
The transaction log does not provide an audit trail of changes made to the database; it does not keep a record of the commands executed against the database, just how the data changed as a result.
At each checkpoint, SQL Server scans the data cache and flushes to disk all dirty pages in memory. A dirty page is any page in the cache that has changed since SQL Server read it from disk, so that the page in cache is different from what's on disk. Again, this is not a selective flushing; SQL Server flushes out all dirty pages, regardless of whether they contain changes associated with open (uncommitted) transactions. However, the log buffer manager always guarantees to write the change descriptions (log records) to the transaction log, on disk, before it writes the changed data pages to the physical data files. This mechanism, termed write ahead logging, allows SQL Server to ensure some of the ACID properties (http://msdn.microsoft.com/en-gb/library/aa719484(VS.71).aspx) of database transactions, notably durability. By writing changes to the log file first, SQL Server can guarantee that a committed change will persist, even under exceptional circumstances. For example, let's say a transaction (T1) is committed, and hardened to the log, but SQL Server crashes before a checkpoint occurs. Upon restart, the crash recovery process is initiated, which attempts to reconcile the contents of the transaction log file and the data files. It will read the transaction log file and find a series of log records relating to T1, ending with one that confirms the COMMIT. Therefore, it will ensure that all of the operations that comprise transaction T1 are "rolled forward" (redone) so that they are reflected in the data files.
The database checkpoint process
By regularly flushing dirty pages from cache to disk, the database checkpoint process controls the amount of work SQL Server needs to do during crash recovery. If SQL Server had to roll forward the changes for a huge number of committed transactions, then the recovery process could be very lengthy.
Conversely, let's say that T1 starts, a checkpoint occurs, and then, before T1 commits, SQL Server crashes. At the checkpoint, SQL Server would have flushed to disk the dirty pages relating to T1 (along with many others) but the log buffer manager would have ensured that, first, the relevant log records, in cache, were flushed to disk. Since there will be no log record on disk confirming a commit for T1, the crash recovery process will read the relevant operations from the log file, relating to T1, and perform the reverse physical operation on the data, thereby "rolling back" (undoing) the partial effects of an uncommitted transaction. In this manner, SQL Server can return the database to a consistent state in the event of a crash. More generally, the rollback (undo) process occurs if:
• A ROLLBACK command is issued for an explicit transaction.
• An error occurs and XACT_ABORT is turned on.
• The database detects that communication has been broken between the database and the client that instigated the transaction.
In such circumstances, the log records pertaining to an interrupted transaction, or one for which the ROLLBACK command is explicitly issued, are read and the changes rolled back. In these ways, SQL Server ensures transaction atomicity, i.e. that either all the actions associated with a transaction succeed as a unit, or that they all fail. As such, the transaction log represents one of the fundamental means by which SQL Server ensures data consistency and integrity during normal day-to-day operation.
As discussed earlier, a transaction log contains a series of log records, each one describing a single database action, stored sequentially according to when that action occurred in the database. For a given transaction, first a log record will be recorded signifying the start of the transaction, then a series of log records describing each action performed, and finally a record for the transaction commit. It's a mistake, however, to think of the log records for each transaction being stored neatly together; they will be interspersed with records relating to the actions of many other transactions occurring at around the same time. When operating a database in FULL or BULK_LOGGED recovery model (we cover recovery models in more detail in Chapter 3), it is possible to take log backups. These log backups write into a backup file all the log records that have been entered since the last log backup or, if this is the first-ever log backup, since the first full database backup. If the log chain (discussed in Chapter 3) is broken, then the first log backup we take after this event will capture all records added to the log since the first full or differential backup taken after the chain was broken. During a restore operation, we can then restore the most recent full database backup, followed by the complete chain of log backup files up to the one that covers the point in time to which we wish to restore the database. If we restore completely the last log backup file in the chain, then the recovery process will return the database to the state consistent with the time of the last committed transaction in that backup file. Alternatively, we can stop the restore process at a specific point in time within the final log backup, and then the database recovery process will return the database to a consistent state for that time. We'll discuss these operations in much more detail in Chapter 5. For reasons that will become clearer as we progress, and which we discuss in detail in Chapter 4, it is not possible to take log backups for databases operating in SIMPLE recovery model, and so any database restore operations for such databases must rely solely on full (and differential) database backups.
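To make the mechanics concrete, such a restore sequence is simply a RESTORE DATABASE followed by a chain of RESTORE LOG commands; a minimal sketch, using hypothetical database and file names and an arbitrary stop time, might look as follows.

-- Hypothetical names; restore the full backup, then the log chain,
-- recovering to a point in time within the second log backup.
RESTORE DATABASE MyDB
FROM DISK = 'D:\SQLBackups\MyDB.bak'
WITH NORECOVERY;        -- leave the database able to accept further log backups

RESTORE LOG MyDB
FROM DISK = 'D:\SQLBackups\MyDB_Log1.trn'
WITH NORECOVERY;

RESTORE LOG MyDB
FROM DISK = 'D:\SQLBackups\MyDB_Log2.trn'
WITH STOPAT = '2012-06-01T14:30:00', RECOVERY;   -- recover to the chosen point in time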
Therefore, when working in FULL and BULK_LOGGED recovery models, it is vital that you perform regular transaction log backups, in addition to full backups and, optionally, differential backups. Many novice or part-time DBAs perform full backups on their databases, but they don't perform transaction log backups. As a result, the transaction log is not truncated, and it grows and grows until the drive it is on runs out of disk space, causing the database to become read-only. When a log backup occurs, any VLF that is no longer active becomes eligible for truncation, meaning simply that SQL Server can reuse its space to store new log records. Truncation of the log will occur as soon as the log backup is taken, assuming no other factors, such as an in-progress database backup operation, are delaying truncation. We'll cover some of the other factors that may delay truncation of VLFs even after log backup, as well as factors that keep large swathes of the log active that otherwise wouldn't need to be, such as a rogue, long-running uncommitted transaction, or database mirroring, or replication processes, in Chapter 7.
COPY_ONLY backups of the transaction log
An exception to the rule of log backups truncating the log is the COPY_ONLY log backup. A COPY_ONLY log backup exists "independently" of the normal log backup scheme; it does not break the log backup chain.
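Syntactically, it is an ordinary log backup with the COPY_ONLY option added; the file name below is purely illustrative.

BACKUP LOG TestDB
TO DISK = 'D:\SQLBackups\TestDB_log_copyonly.trn'
WITH COPY_ONLY;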
In a SIMPLE recovery model database, by contrast, log truncation can occur immediately upon checkpoint. SQL Server flushes cached dirty pages to disk (after first writing the transaction details) and then immediately truncates any VLFs that are no longer active, so that the space can store new log records. This also explains why log backups are meaningless for SIMPLE recovery model databases.
In Listing 1.1, we create a new TestDB database on a SQL Server 2008 instance, and then immediately obtain the size of the log file using the DBCC SQLPERF (LOGSPACE) command.
USE master ;
IF DB_ID('TestDB') IS NOT NULL
    DROP DATABASE TestDB ;

CREATE DATABASE TestDB ON
(
    NAME = TestDB_dat,
    FILENAME = 'D:\SQLData\TestDB.mdf'
) LOG ON
(
    NAME = TestDB_log,
    FILENAME = 'D:\SQLData\TestDB.ldf'
) ;

DBCC SQLPERF(LOGSPACE) ;
Listing 1.1:
As you can see, the log file is currently about 1 MB in size, and about 30% full.
The model database
The properties of the model database determine the initial size and growth characteristics of new user databases on an instance, as well as the default recovery model that each database will adopt (FULL, in this case). We'll discuss the impact of these properties in more detail in Chapter 8.
We can confirm the size of the file simply by locating the physical files on disk, as shown in Figure 1.1.
Figure 1.1:
Let's now perform a full database backup (i.e. backing up the data file) for TestDB, as shown in Listing 1.2 (you'll first need to create the SQLBackups directory). Note that this backup operation ensures the database truly is operating in FULL recovery model; more on this in Chapter 3.
-- full backup of the database
BACKUP DATABASE TestDB
TO DISK ='D:\SQLBackups\TestDB.bak'
WITH INIT;
GO
Listing 1.2:
There is no change in the size of the data or log file as a result of this backup operation, or in the percentage of log space used, which is perhaps unsurprising given that there are no user tables or data in the database as yet. Let's put that right, and create a table called LogTest on this database, fill it with a million rows of data, and recheck the log file size, as shown in Listing 1.3. Jeff Moden, the author of this script, is seen regularly on the SQLServerCentral.com forums (http://www.sqlservercentral.com/Forums/), and we reproduce it here with his kind permission. Do not worry about the details of the code (we use it several times throughout the book); the only important thing here is that it's a highly efficient way to insert many rows. Still, this code may take several seconds to execute on your machine, due to all the work going on behind the scenes, writing to, and growing, the data and log files.
USE TestDB ;
GO
IF OBJECT_ID('dbo.LogTest', 'U') IS NOT NULL
    DROP TABLE dbo.LogTest ;
--===== AUTHOR: Jeff Moden
--===== Create and populate 1,000,000 row test table.
--      "SomeID" has range of 1 to 1000000 unique numbers
--      "SomeInt" has range of 1 to 50000 non-unique numbers
--      "SomeLetters2"; "AA"-"ZZ" non-unique 2-char strings
--      "SomeMoney"; 0.0000 to 99.9999 non-unique numbers
--      "SomeDate" ; >=01/01/2000 and <01/01/2010 non-unique
--      "SomeHex12"; 12 random hex characters (ie, 0-9,A-F)
SELECT TOP 1000000
        SomeID = IDENTITY( INT,1,1 ),
        SomeInt = ABS(CHECKSUM(NEWID())) % 50000 + 1 ,
        SomeLetters2 = CHAR(ABS(CHECKSUM(NEWID())) % 26 + 65)
                       + CHAR(ABS(CHECKSUM(NEWID())) % 26 + 65) ,
        SomeMoney = CAST(ABS(CHECKSUM(NEWID())) % 10000 / 100.0 AS MONEY) ,
        SomeDate = CAST(RAND(CHECKSUM(NEWID())) * 3653.0 + 36524.0 AS DATETIME) ,
        SomeHex12 = RIGHT(NEWID(), 12)
INTO    dbo.LogTest
FROM    sys.all_columns ac1
        CROSS JOIN sys.all_columns ac2 ;

DBCC SQLPERF(LOGSPACE) ;
Listing 1.3:
Notice that the log file size has ballooned to almost 100 MB and the log is 93% full (the figures might be slightly different on your system). If we were to insert more data, the log would have to grow in size to accommodate more log records. Again, we can confirm the size increases from the physical files (the data file has grown to 64 MB). We can perform another full database backup at this point, by rerunning Listing 1.2, and it will make no difference to the size of the log file, or the percentage of space used in the file. Instead, however, let's back up the transaction log file and recheck the values, as shown in Listing 1.4.
-- now back up the transaction log
BACKUP Log TestDB
TO DISK ='D:\SQLBackups\TestDB_log.trn'
WITH INIT;
GO
DBCC SQLPERF(LOGSPACE) ;
Listing 1.4:
The log file is still the same physical size but, by backing up the file, SQL Server is able to truncate the log, making space in the inactive VLFs in the log file available for reuse; it can write new log records without needing to grow the log file. In addition, of course, we've captured the log records into a backup file and can use that backup file as part of the database recovery process, should we need to restore the TestDB database to a previous state.
Summary
In this first chapter, we've introduced the transaction log, and explained how SQL Server uses it to maintain data consistency and integrity, via a write ahead logging mechanism. We've also described, and briefly demonstrated, how a DBA can capture the contents of the transaction log file into a backup file, for use as part of a database recovery operation. Finally, we stressed the importance of backups in controlling the size of the transaction log. In the next chapter, we'll take a closer look at the architecture of the transaction log.
Chapter 2: Some, But Not Too Much, Log Internals
Figure 2.1:
Transaction log files are sequential files; in other words, SQL Server writes to the transaction log sequentially (unlike data files, which SQL Server writes in a more random fashion, as applications modify data in random data pages).
SQL Server stamps each log record inserted into the log file with a Logical Sequence Number (LSN). When we create a new database, with its associated log file, the first log record marks the logical start of the log file, which at this stage will coincide with the start of the physical file. The LSNs are then ever increasing, so a log record with an LSN of 5 records an action that occurred immediately before the one with an LSN of 6. The most recent log record will always have the highest LSN, and marks the end of the logical file (discussed in more detail shortly). Log records associated with a given transaction are linked in an LSN chain. Specifically, log records are "backwards chained;" each log record stores the sequence number of the log record that preceded it in a particular transaction. This enables SQL Server to perform rollback, undoing each log record in exact reverse order. An important concept is that of the "active log." The start of the active log is the "oldest log record that is required for a successful database-wide rollback or by another activity or operation in the database." The LSN of this log record is MinLSN. In other words, the MinLSN is the LSN of the log record relating to the oldest open transaction, or to one that is still required by some other database operation or activity. The log record with the highest LSN (i.e. the most recent record added) marks the end of the active log. SQL Server writes all subsequent records to the logical end of the log.
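If you want to see these log records and their LSNs for yourself, the undocumented (and unsupported) fn_dblog table-valued function returns the contents of the active log; the column names below are those it exposes in recent versions, and this should be treated as a diagnostic curiosity only, never a production tool.

USE TestDB;
GO
-- One row per log record in the active log, in LSN order
SELECT  [Current LSN] ,
        Operation ,
        [Transaction ID] ,
        [Previous LSN]      -- the backwards chain used for rollback
FROM    fn_dblog(NULL, NULL);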
Figure 2.2:
Note that you may occasionally hear the MinLSN referred to as the tail of the log, and MaxLSN as the head of the log. A VLF that contains any part of the active log is active; a VLF that contains no part of the active log is inactive. In Figure 2.2, VLFs 1 to 3 are all active. In short, a log record is no longer part of the active log if it is older (i.e. has a lower LSN) than the MinLSN record, and this will be true only if the following two conditions are met:
1. No other database process, including a transaction log backup when using FULL or BULK_LOGGED recovery models, requires it to remain in the active log.
2. It relates to a transaction that is committed, and so is no longer required for rollback.
For example, consider a case where the MinLSN is a log record for an open transaction (T1) that started at 9.00 a.m. and takes 30 minutes to run. A subsequent transaction (T2) starts at 9.10 a.m. and finishes at 9.11 a.m. Figure 2.3 shows a very simplified depiction of this situation (our VLFs hold only four log records!).
Figure 2.3:
VLF2 contains no log records relating to open transactions, but all these log records remain part of the active log since they all have LSNs greater than MinLSN. In Figure 2.4, the action has moved on. A third transaction (T3) has started, and T1 has completed its final update, and committed.
Figure 2.4:
Now, the log record for the oldest open transaction is LSN8, but the MinLSN is still LSN1, and so VLFs 1 to 3 are all active. Why? Our database is operating in the FULL recovery model and, until the next log backup operation, all of these log records must remain in the active log. In other words, LSN1 is the oldest log record still required by another database process or activity, in this case a log backup.
Figure 2.5:
In this manner, as transactions start and are committed, we can imagine (somewhat simplistically) the tail and head of the log moving left to right across Figure 2.5, so that VLFs that previously contained part of the active log now become inactive (VLF1), and VLFs that were previously untouched will become part of the active log (VLF4). In a SIMPLE recovery model database, we would get the same situation as depicted in Figure 2.5, if we replaced the log backup with a checkpoint operation.
It's important to remember that even after a checkpoint (for SIMPLE recovery) or a log backup (for FULL or BULK_LOGGED recovery) a VLF may still not be eligible for truncation if it contains log records that must remain part of the active log for other reasons. For example, it may be that these log records relate to an open transaction and so SQL Server requires them for possible rollback. If, in Figure 2.5, we swap the positions of the log records for the T1 commit and the log backup, then VLF1 would not be eligible for truncation, despite the log backup, since the MinLSN would be "pinned" at LSN1 until T1 committed. Alternatively, these records may still be required because some other database operation, such as database mirroring or replication or CDC (Change Data Capture) has yet to process them. In either case, the VLFs containing these log records will remain active, despite the log backup or checkpoint. A later log backup will mark these VLFs as inactive, assuming that, by that point, they no longer contain any part of the active log.
The final question to consider is what happens when the head of the log reaches the end of VLF4. It's easiest to think of space in the log file as being reused in a circular fashion, though there are complicating factors that can sometimes make space reuse patterns seem rather arbitrary, and which we're not going to delve deeper into in this book. Nevertheless, in the simplest case, once the head of the log reaches the end of a VLF, SQL Server will start to reuse the next sequential VLF that is inactive. In Figure 2.5, the head will "wrap back round" and reuse VLF1 (or SQL Server may need to grow the log, depending on how much extra space it requires). If no further VLFs were available at all, the log would need to auto-grow and add more VLFs. If this is not possible, because auto-growth is disabled, or the disk housing the log file is full, then the logical end of the active log will meet the physical end of the log file, the transaction log is full, and SQL Server will issue the 9002 "transaction log full" error and the database will become read-only.
This architecture explains the reason why a very long-running transaction, or a replicated transaction that fails to dispatch to the distribution database, or a disconnected mirror, among others, can cause the log to grow very large. For example, in Figure 2.5, we can imagine what happens if T1 does not commit. Other transactions start and eventually VLF 4 is full and there aren't any inactive VLFs. Even if every transaction that started after MinLSN has committed, SQL Server can reuse none of the space in these VLFs, as all the VLFs are still part of the active log. You can see this in action quite easily if you return to our example from Chapter 1. First, rerun Listing 1.1 to drop and re-create the TestDB database. Then create a small sample table, update one of the rows in the table within an explicit transaction, and leave the transaction open (do not COMMIT it). In a second tab within SSMS, run the scripts in Listings 1.2 to 1.4. This time you should see that the log backup does not make space available for reuse. However, if you then commit the transaction and rerun the log backup, SQL Server reuses the space. In such cases, where the "area" occupied by the active log is very large, with a lot of space that is not reusable, at some point the log file will have to grow in size (and grow, and grow). We'll discuss other factors that delay the truncation of the log file in Chapter 7.
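A sketch of the open-transaction experiment described above might look like this (the table and data are purely illustrative):

USE TestDB;
GO
CREATE TABLE dbo.SampleTable ( ID INT PRIMARY KEY, SomeValue CHAR(10) );
INSERT INTO dbo.SampleTable ( ID, SomeValue ) VALUES ( 1, 'original' );
GO
BEGIN TRANSACTION;    -- deliberately left open
UPDATE dbo.SampleTable SET SomeValue = 'changed' WHERE ID = 1;
-- In a second session, run Listings 1.2 to 1.4; the log backup cannot free the space.
-- COMMIT TRANSACTION;   -- once committed, rerun the log backup and the space is reused.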
In Listing 2.1, we re-create the TestDB database, give the log file a small initial size, and allow it to grow in small, fixed increments of 16 MB (settings based on those for AdventureWorks2008). We set the database to FULL recovery model, create in it a table, PrimaryTable_Large, perform a full database backup and, finally, check the log space use.
USE master ;
IF DB_ID('TestDB') IS NOT NULL
    DROP DATABASE TestDB ;

CREATE DATABASE TestDB ON PRIMARY
(
    NAME = N'TestDB' ,
    FILENAME = N'D:\SQLData\TestDB.mdf' ,
    SIZE = 199680KB ,
    FILEGROWTH = 16384KB
) LOG ON
(
    NAME = N'TestDB_log' ,
    FILENAME = N'D:\SQLData\TestDB_log.ldf' ,
    SIZE = 2048KB ,
    FILEGROWTH = 16384KB
);
GO

USE TestDB
Go
IF OBJECT_ID('dbo.PrimaryTable_Large', 'U') IS NOT NULL
    DROP TABLE dbo.PrimaryTable_Large
GO

CREATE TABLE PrimaryTable_Large
(
    ID INT IDENTITY PRIMARY KEY ,
    SomeColumn CHAR(4) NULL ,
    Filler CHAR(1500) DEFAULT ''
);
GO

ALTER DATABASE TestDB SET RECOVERY FULL;
Listing 2.1:
We can interrogate the VLF architecture using a command called DBCC LogInfo, as shown in Listing 2.2.
Interrogating VLFs using DBCC LogInfo
DBCC LogInfo is an undocumented and unsupported command; at least, there is very little written about it by Microsoft. We'll use it in this chapter to peek at the VLFs, but we won't go into detail about all the information it returns.
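In its simplest form, we just run the command in the context of the database of interest, along the following lines.

USE TestDB;
GO
-- Returns one row per VLF in the current database's transaction log
DBCC LOGINFO;
GO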
Listing 2.2:
Note that four rows are returned, meaning that we have four VLFs. We are not too concerned right now with the meaning of the columns returned, but we'll note the following:
• The initial log size was 2 MB, and three of the four VLFs are just under 0.5 MB in size, and one is over, as seen in the FileSize column (in Bytes).
• A Status value of 2 indicates an active VLF that cannot be overwritten, and a value of 0 indicates an inactive VLF where space can be reused. Currently, we have one active VLF.
• The FSeqNo (first sequence number) column indicates the logical order of the VLFs in the log. Our single active VLF is logically the first VLF (the value starts at one more than the highest FSeqNo in the model database, as explained by Paul Randal, http://www.sqlskills.com/BLOGS/PAUL/post/Initial-VLF-sequence-numbers-and-default-log-file-size.aspx).
• The CreateLSN is the LSN of the auto-growth event, or ALTER DATABASE operation, which grew the log and so created the VLF. Zero indicates that these VLFs were created when the database was created.
Let's now add 1,500 rows to our table, recheck log space use, and re-interrogate DBCC LogInfo.
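Any statement that generates 1,500 rows will do; a filler insert along the following lines (the values themselves are unimportant) is enough to drive the behavior described below.

USE TestDB;
GO
-- Insert 1,500 rows; adjust TOP to produce larger inserts in the later tests
INSERT  INTO dbo.PrimaryTable_Large ( SomeColumn, Filler )
        SELECT TOP 1500
                'abcd' ,
                REPLICATE('a', 1500)
        FROM    sys.all_columns ac1
                CROSS JOIN sys.all_columns ac2;
GO
DBCC SQLPERF(LOGSPACE);
DBCC LOGINFO;
GO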
Listing 2.3:
Having inserted only 1,500 rows, the log has already auto-grown. We see four new VLFs (note that the CreateLSN values are non-zero, and the same in each case, because SQL Server created them in a single auto-grow event). The auto-growth increment was 16 MB and each new VLF is roughly 4 MB in size. The first of the new VLFs (FSeqNo 34) is active, but the rest are as yet unused. The log is now 18 MB, as expected, and is 16% full.
Let's now perform a log backup and recheck our stats.
/*now a log backup*/
BACKUP Log TestDB
TO DISK ='D:\SQLBackups\TestDB_log.bak'
WITH INIT;
GO
DBCC SQLPERF(LOGSPACE) ;
/*Log Size: 18 MB ; Log Space Used: 6%*/

-- how many VLFs?
USE TestDB;
GO
DBCC Loginfo
GO
Listing 2.4:
Thanks to the log backup, the log is now only 6% full. The first four VLFs are truncated, and so are available for reuse. Let's grow the log a second time, adding 15K rows (simply rerun Listing 2.3 and adjust the number of rows).
Figure 2.6:
SQL Server fills VLFs 34 to 37 (arising from the previous auto-growth). The original four VLFs (30 to 33) are available for reuse, but SQL Server opts to grow the log immediately, rather than reuse them first, and then starts filling the first three of the new VLFs (38 to 40). In this case, SQL Server grows the log simply because, if it had used the original four small VLFs, it would not have had the required log reservation space necessary in the event that it needed to roll back the transaction. The process of rolling back a transaction requires a series of operations that will undo the effects of that transaction. These compensating operations generate log records, just as does any other database change. Log reservation is the amount of log space (per transaction) that SQL Server must keep free in order to accommodate these compensation log records. Accordingly, big transactions that write many records will also need quite a large log reservation. SQL Server releases this additional space once a transaction commits. Chapter 7 (Listing 7.1) provides a query that will display log bytes written and reserved by a transaction. If we had run the 15K-row insert inside an explicit transaction, without
committing it, and in another session had run Listing 7.1 to find out how much log space SQL Server reserved for this transaction, we'd have seen it was about 5.3 MB. The total space in those first four VLFs is only 2 MB, which is insufficient to store the compensation operations that may be necessary, and so SQL Server grows the log immediately. In the absence of any log reservation calculation, SQL Server would simply have filled the available VLFs and then grown the log. As such, the exact growth behavior depends very much on the patterns of user activity, and the size of the transactions. If we replace the 15K-row insert with a series of three smaller inserts (1,500 rows, 5,000 rows, and 3,500 rows, for example) then we see something like Figure 2.7.
Figure 2.7:
Notice that SQL Server fills the VLFs added by the first auto-growth (34 to 37), and then cycles back to the start and uses the first two of the original VLFs (now reused as 38 and 39). This happens because, here, these VLFs will accommodate the log reservation required by the smaller initial inserts. SQL Server then grows the log again and starts to use the first of these (40) rather than the remaining two original VLFs.
We hope that this provides a little insight into how SQL Server auto-grows the log. Of course, in these examples, we've been adding a relatively small number of rows. As a final test, rerun Listing 2.1, followed by Listing 2.3, but this time add 1 million rows. The log will now be about 2.1 GB in size and about 80% full. Figure 2.8 shows the output of DBCC LogInfo.
Figure 2.8:
Now we have 536 VLFs (436 of which are in use). Note that it may seem as if SQL Server has "overgrown" the log but, again, we must factor in log reservation requirements. In this case, SQL Server had to add considerable extra space in case it needed to roll back the insert. A transaction log that auto-grows frequently, in small increments, will have a very large number of small VLFs. This phenomenon is known as log fragmentation. Essentially, the initial size and relatively small growth increments we've chosen for this database are inappropriate for this sort of data load and lead to the creation of a large number of VLFs. It's possible that log file fragmentation can degrade the performance of SQL Server processes that need to read the log, such as crash recovery and log backups. We will discuss this in more detail in Chapter 8, and show how to avoid fragmentation by correctly sizing and growing the log file.
Summary
This chapter provides minimal background information regarding the architecture of the transaction log; hopefully, just enough so that you can understand the basic issues and potential problems relating to truncation, space reuse, and fragmentation, in the log le. In Chapter 3, we move on to a more detailed discussion of the role of the transaction log in database restore and recovery.
Chapter 3: Transaction Logs, Backup and Recovery
A DBA can perform three main types of backup (although only the first two apply for a database in the SIMPLE recovery model).
• Full database backup: A copy of the data files. Essentially, this provides a complete archive of your database as it existed at the time the backup operation finished (the backup will include the effects of any transaction that completed before the data-copying portion of the backup was finished. Any transactions that are still open when SQL Server starts backing up the necessary portion of the transaction log are not included). A full backup includes all user objects and data as well as relevant system information.
• Differential database backup: A copy of any changes made to the database since the last full backup (an example of the syntax appears after this list). There is an inextricable link between a differential backup and the most recent full backup. Without the latter, the former is unusable.
• Transaction log backup: A copy of all log records inserted into the transaction log since the last log backup (or database checkpoint, if working in SIMPLE recovery model). When a log backup is made, the log generally gets truncated so that space in the file can be reused, although some factors can delay this (see Chapter 7). No inextricable link exists between a log backup and any specific full or differential database backup (unless it is the first log backup, which depends on the first full database backup). We can restore any full backup, followed by a complete chain of log backups, up to the required recovery point.
Some junior DBAs and many developers, perhaps misled by the term "full," assume that a full database backup backs up everything, both the data and contents of the transaction log. This is not correct. Essentially, both full and differential backups only back up the data files, although they do also back up enough of the transaction log to enable recovery of the backed up data, and reproduce any changes made while the backup was in progress. However, in practical terms, a full database backup does not back up the transaction log. We must take separate log backups for database restore purposes.
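For reference, a differential backup uses the same BACKUP DATABASE command as a full backup, with the DIFFERENTIAL option added; the file names here are illustrative.

-- A differential backup: captures only the changes since the last full backup
BACKUP DATABASE TestDB
TO DISK = 'D:\SQLBackups\TestDB_diff.bak'
WITH DIFFERENTIAL;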
If a database holds business-critical data and you would prefer to measure the exposure to data loss in minutes rather than hours, then you will need to take transaction log backups. Let's finally assume that we supplement the full and differential backups with transaction log backups, taken every 30 minutes. In other words, we take a full database backup, followed by a series of transaction log backups, followed by a differential database backup, followed by a series of transaction log backups, followed by another full backup, and so on. Figure 3.1 shows a simplified picture of this (with a reduced number of log backups).
Figure 3.1: A backup strategy comprising full and differential database backups and log backups.
In this case, we could restore the most recent full backup, the differential backup and the whole chain of log backups (files 7 to 12 in Figure 3.1) up to 12.30 a.m., and we would lose only 30 minutes' worth of data. In fact, if we still had access to the transaction log, we might be able to perform a tail log backup (covered in detail in Chapter 4) and minimize our data loss to close to zero. SQL Server will roll forward all of the actions recorded in the log backups, up to the specified point, and then we can recover the database to a consistent state at a point in time very close to the time of the disaster. Of course, this scheme assumes that all backups are valid and non-corrupt and, in the case of the log backups, that you have a full and complete log chain. We'll get to this in more detail shortly, but let's say that between the eighth and ninth log backups someone switched the database to SIMPLE recovery model, and then back to FULL, without taking any further backups. Upon switching to SIMPLE recovery model, SQL Server would have truncated the transaction log and we'd only be able to restore to the end of Log 8. In addition, any subsequent log backups would fail, until we took another full (or differential) database backup.
However, assuming this is not the case, then the scheme in Figure 3.1 offers much more "redundancy" to our restore options. If, for some reason, the differential backup is unavailable, we can restore the full backup followed by Logs 1 to 12. If Full Backup 1 was missing, we could even go back to the full backup before that, followed by the differential associated with that backup, followed by the long chain of log files, up to the required point. Of course, with all these backups comes a much higher maintenance overhead, in terms of the extra effort of creating and monitoring the jobs required to run frequent transaction log backups, the I/O resources that these backups require (albeit for short periods of time), and the disk space required to store a large number of backup files. Due consideration needs to be given to this at a business level, before choosing the appropriate recovery model for a given database.
FULL
SQL Server maintains the active log as normal and does not truncate the log after a database checkpoint. Only a log backup operation can truncate the log. SQL Server can apply log backups during database restore operations, essentially replaying (redoing) the operations therein, so that it can recover the database to the state in which it existed at a previous point in time. There is also a third recovery model, BULK_LOGGED, in which certain operations that would normally generate a lot of writing to the log perform less logging, in order not to overwhelm the transaction log.
Minimally logged operations can, however, limit your ability to recover a database to a specific point in time. We will discuss this in a lot more detail in Chapter 6, but suffice to say for now that operating a database permanently in BULK_LOGGED model is not a viable way to minimize the size of the transaction log.
Minimally logged operations
Examples of operations that SQL Server can minimally log include bulk import operations (using, for example, bcp or BULK INSERT), SELECT/INTO operations and certain index operations, such as index rebuilds. For a full list, see: http://msdn.microsoft.com/en-us/library/ms191244.aspx.
Listing 3.1:
A database will adopt the default recovery model specified by the model database. In many cases, this will mean that the default recovery model for a database is FULL, but different editions of SQL Server may have different defaults for the model database.
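To discover the recovery model to which a given database is currently assigned, the simplest check is a query against sys.databases, along these lines.

SELECT  name ,
        recovery_model_desc
FROM    sys.databases
WHERE   name = 'TestDB';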
Listing 3.2:
However, be careful with this query, as it may not always tell the whole truth. For example, if we create a brand new database and then immediately run the command from Listing 3.2, it would report that the database was in FULL recovery model. However, in fact, until we take a full database backup the database will be operating in auto-truncate mode, sometimes referred to as pseudo-SIMPLE recovery model. We can see this in action by creating a new database on a SQL Server 2008 instance, where the default recovery model is FULL. To make doubly sure, we explicitly set the recovery model to FULL and then query sys.databases to confirm, as shown in Listing 3.3.
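A sketch of that setup, assuming the same file locations used in earlier examples, might be:

USE master;
IF DB_ID('TestDB') IS NOT NULL
    DROP DATABASE TestDB;
CREATE DATABASE TestDB ON
( NAME = TestDB_dat, FILENAME = 'D:\SQLData\TestDB.mdf' )
LOG ON
( NAME = TestDB_log, FILENAME = 'D:\SQLData\TestDB.ldf' );
GO
ALTER DATABASE TestDB SET RECOVERY FULL;
GO
SELECT  name ,
        recovery_model_desc
FROM    sys.databases
WHERE   name = 'TestDB';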
Listing 3.3:
This indicates that we're in FULL recovery model, so let's insert some data, check the log space usage, force a checkpoint operation and then recheck the log usage, as shown in Listing 3.4.
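A sketch of those steps (the table is just a scratch object used to generate log records) might be:

USE TestDB;
GO
IF OBJECT_ID('dbo.Scratch', 'U') IS NOT NULL
    DROP TABLE dbo.Scratch;
-- Generate a reasonable volume of log records
SELECT TOP 100000
        SomeID = IDENTITY( INT,1,1 )
INTO    dbo.Scratch
FROM    sys.all_columns ac1
        CROSS JOIN sys.all_columns ac2;
GO
DBCC SQLPERF(LOGSPACE);   -- note the percentage of log space used
GO
CHECKPOINT;               -- force a checkpoint
GO
DBCC SQLPERF(LOGSPACE);   -- the percentage used drops: the log was truncated
GO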
Listing 3.4:
Note that the log file is roughly the same size after the checkpoint, but is only 11% full; because of the checkpoint operation, SQL Server truncated the log and made the space available for reuse. Although the database is assigned to FULL recovery model, it is not actually operating in that model until the first full database backup is taken. Interestingly, this means we could have achieved the same effect by running that full backup of the TestDB database, instead of explicitly forcing a CHECKPOINT. The first full backup operation triggers a CHECKPOINT and the log is truncated. All full backups start by running a checkpoint. This is to ensure that as much of the changed data as possible is on disk, and to minimize the portion of the log that SQL Server needs to read at the end of the backup.
To tell for sure what recovery model is in operation, execute the query shown in Listing 3.5.
SELECT  db_name(database_id) AS 'DatabaseName' ,
        last_log_backup_lsn
FROM    master.sys.database_recovery_status
WHERE   database_id = db_id('TestDB') ;
GO
Listing 3.5:
If a value of NULL appears in the column, then the database is actually in auto-truncate mode, and so SQL Server will truncate the log when database checkpoints occur. Having performed a full database backup, you will find that the column is populated with the LSN of the log record that recorded the backup operation, and at this point the database is truly in FULL recovery model. From this point on, a full database backup will have no effect on the transaction log; the only way to truncate the log will be to back up the log.
Switching models
If you ever switch a database from FULL or BULK_LOGGED model to SIMPLE model, this will break the log chain and you'll only be able to recover the database up to the point of the last log backup taken before you switched. Therefore, always take that log backup immediately before switching. If you subsequently switch the database back from SIMPLE to FULL or BULK_LOGGED model, remember that the database will actually continue operating in auto-truncate mode (Listing 3.5 will display NULL) until you perform another full or differential backup.
If you switch from FULL to BULK_LOGGED model then this will not break the log chain. However, any bulk operations that occurred while in BULK_LOGGED model will not be fully logged in the transaction log and so cannot be controlled on an operation-by-operation basis, in the same way that fully logged operations can. This means that recovering a database to a point in time within a transaction log that contains bulk operations is not possible. You can only recover to the end of that log backup. In order to "re-enable" point-in-time restore, switch back to FULL model after the bulk operation is complete and immediately take a log backup.
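A typical sequence around a bulk operation, using illustrative database and file names, might be:

-- Final log backup while still in FULL recovery model
BACKUP LOG TestDB TO DISK = 'D:\SQLBackups\TestDB_log_prebulk.trn';
GO
ALTER DATABASE TestDB SET RECOVERY BULK_LOGGED;
GO
-- ...perform the bulk operation (BULK INSERT, SELECT INTO, index rebuild, etc.) here...
ALTER DATABASE TestDB SET RECOVERY FULL;
GO
-- Immediately restart point-in-time capability with another log backup
BACKUP LOG TestDB TO DISK = 'D:\SQLBackups\TestDB_log_postbulk.trn';
GO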
Ideally, we would adopt a backup scheme consisting of full backups interspersed with differential backups, interspersed with transaction log backups. In reality, the backup scheme is often more of a compromise between the ideal and the practical, between an assessment of the true risk of data loss, and what it will cost the company, and the cost involved in mitigating that risk. Many very important business applications use somewhat simpler, but rigorous, backup schemes, perhaps involving regular nightly full backups coupled with hourly transaction log backups. The transaction rate will also, to some extent, influence the frequency of log backups. For very busy databases, it may be necessary to back up more frequently in order to control the size of the log. There is no easy way to calculate how often to take log backups. Most DBAs will take their best estimate at how often to take log backups, observe the growth characteristics of the files and then adjust the backup scheme as necessary to prevent them from getting oversized.
Loss or corruption of a transaction log backup file
You will only be able to recover to the last preceding good log backup. The log chain will start again at the next good full or differential backup.
Switch to SIMPLE recovery model
If you ever switch from FULL to SIMPLE recovery model, this will break the log chain, as SQL Server instigates a checkpoint and truncates the transaction log immediately. When and if you return to FULL model, you will need to take another full backup to restart the log chain, or a differential backup to bridge the LSN gap that will now exist. In fact, until you take that full or differential backup, the database will remain in auto-truncate mode and you won't be able to back up the log file. Pre-SQL Server 2008, there were a couple of commands, namely BACKUP LOG WITH NO_LOG or BACKUP LOG WITH TRUNCATE_ONLY (they are functionally equivalent) that would force a log file truncation and so break the log chain. You should not issue these commands in any version of SQL Server, but we mention them here as many still use them, when trying to deal with a "runaway log file," without understanding their implications for database restore. See Chapter 7 for more details.
PowerShell/SMO scripting
More powerful and versatile than T-SQL scripting, but with a steeper learning curve for many DBAs, PowerShell can be used to script and automate almost any maintenance task. See, for example: http://www.simple-talk.com/author/allen-white/ or http://www.simple-talk.com/author/laerte-junior/.
Third-party backup tools
Several third-party tools exist that can automate backups, as well as verify and monitor them. Most offer backup compression and encryption as well as additional features to ease backup management, verify backups, and so on. See, for example, Red Gate's SQL Backup: (http://www.red-gate.com/products/dba/sql-backup/).
Summary
In this chapter, we discussed the importance of devising the right backup regime for the needs of your most important production databases. Most of these databases will be operating in FULL recovery model, with regular log backups alongside full and, possibly, differential database backups. These backups are vital, not only for database restore and recovery, but also for controlling the size of the transaction log. Some databases don't require point-in-time restore, or need the log backups for any other purpose (such as database mirroring), and have data-loss objectives which can be met with full and differential database backups only. In these cases, we can operate them in SIMPLE recovery model, greatly simplifying log maintenance. We also briefly discussed log backup logistics, and the need to automate log backups, schedule them at the right frequency, verify them, and store them securely. In Chapter 4, we begin a series of three chapters examining in more detail log management in each of the three database recovery models, starting with SIMPLE.
Chapter 4: Managing the Log in SIMPLE Recovery Model
A database in SIMPLE recovery model is always in auto-truncate mode. As noted in Chapter 3, all user databases, even those designated as operating in FULL recovery model, will be in auto-truncate mode until we perform the first full database backup.
How often do checkpoints happen?
The SQL Server engine decides how often to perform a checkpoint based on how many log records it will need to process in order to recover a database in the time specified by the recovery interval server configuration option. If your database is mainly read-only, the time between checkpoints may be long. However, on busy, frequently updated systems, checkpoints can occur about every minute. See http://msdn.microsoft.com/en-gb/library/ms189573.aspx for more details.
While SQL Server still writes to the log a complete description of all actions performed, it does not retain them such that log backups can capture this complete description. The LSN chain will be incomplete. In short, log backups have no meaning in SIMPLE recovery model and, in fact, we cannot even perform transaction log backups, as Listing 4.1 demonstrates.
USE master;
ALTER DATABASE TestDB SET RECOVERY SIMPLE;
BACKUP LOG TestDB
TO DISK ='D:\SQLBackups\TestDB_log.trn'
GO
Msg 4208, Level 16, State 1, Line 1
The statement BACKUP LOG is not allowed while the recovery model is SIMPLE. Use BACKUP DATABASE or change the recovery model using ALTER DATABASE.
Msg 3013, Level 16, State 1, Line 1
BACKUP LOG is terminating abnormally.
Listing 4.1: Attempting, and failing, to back up the log of a database in SIMPLE recovery model.
This means that our backup and restore scheme will consist entirely of full (and possibly differential) database backups.
Chapter 5: Managing the Log in FULL Recovery Model

Kalen Delaney has published some investigations into logging for SELECT INTO (http://sqlblog.com/blogs/kalen_delaney/archive/2011/03/15/what-gets-loggedfor-select-into.aspx) and index rebuild (http://sqlblog.com/blogs/kalen_delaney/archive/2011/03/08/what-gets-logged-for-index-rebuilds.aspx) operations, both in FULL and BULK_LOGGED recovery models. The differences in logging of minimally logged operations, when working in BULK_LOGGED model, are discussed in more detail in Chapter 6.
Listing 5.1:
In the type column, a D represents a database backup, L represents a log backup, and I represents a differential backup. Note that, since the data in this backupset table could be manipulated without affecting backup and restore behavior, it's wise to verify findings from this query. We can do this either by querying sys.database_recovery_status to see the value of last_log_backup_lsn (see Listing 3.5), or the sys.databases table to see the value of log_reuse_wait_desc (which will return LOG_BACKUP if a backup is required).
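For example, a verification query along the following lines might be used; the database name is an assumption, and the exact columns you inspect are a matter of preference.

-- Confirm whether SQL Server is actually waiting on a log backup,
-- rather than relying solely on the backupset history table
SELECT  name ,
        recovery_model_desc ,
        log_reuse_wait_desc
FROM    sys.databases
WHERE   name = 'TestDB';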
If this were the first database backup of this name, SQL Server would create the DatabaseName.bak file in the specified directory. If such a file already existed, then the default behavior is to append subsequent backups to that file. To override this behavior, and stipulate that SQL Server should overwrite any existing file, we can use the INIT option, as follows:
BACKUP DATABASE DatabaseName TO DISK ='FileLocation\DatabaseName.bak' WITH INIT;
Of course, during a normal backup regime, we do not want to overwrite existing backups. We use this option in the book simply as a convenience, to facilitate repeatable tests. Most commonly, we'd give each subsequent backup a unique name; more on this in the forthcoming section, Full restore to point of failure. After each regular (e.g. daily) full backup, there will be frequent (e.g. hourly) log backups, the basic command for which is very similar:
BACKUP LOG DatabaseName TO DISK ='FileLocation\DatabaseName_Log.trn';
Let's say that at 1:45 p.m. a drive hosting the secondary data file failed; the database is still online and we still have access to the transaction log, contained on a separate, dedicated array. In such a situation, we can perform a special form of log backup, a tail log backup, which will capture the remaining contents of the transaction log, and place the damaged database in a restoring state, as follows:
BACKUP LOG DatabaseName TO DISK ='FileLocation\DatabaseName_Log_tail.trn' WITH NORECOVERY
The NORECOVERY option puts the database in a restoring state, ensures that no further transactions will succeed after the tail log backup, and assumes that the next action we wish to perform is a RESTORE. We can then restore our full database (or file) backups, followed by the full chain of log backups, finishing with the tail log backup, and then recover the database. However, let's say instead that the 1:45 p.m. failure causes damage so bad that the database will not come back online. In this case, a normal log backup (or a tail log backup using WITH NORECOVERY) will fail because it will attempt to truncate the log and, as part of that operation, it needs to write the new log MinLSN into the database's header. In such cases, we can still attempt to back up the tail of the log, using the NO_TRUNCATE option instead of the NORECOVERY option.
BACKUP LOG DatabaseName TO DISK ='FileLocation\DatabaseName_Log_tail.trn' WITH NO_TRUNCATE
Note that we should only use this option if a failure has damaged the database severely and it won't come online.
After each restore operation we perform using the WITH NORECOVERY option, SQL Server will roll forward the contents of the applied log backups, and leave the database in a restoring state, ready to accept further log backups. If we omit the WITH NORECOVERY option, then by default the RESTORE command will proceed WITH RECOVERY. In other words, SQL Server will reconcile the data and log files, rolling forward completed transactions and then rolling back uncompleted transactions as necessary. After restoring the last backup in the restore sequence, we can then restore WITH RECOVERY and SQL Server will perform the necessary roll forward and roll back to recover the database to a consistent state.
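As a minimal sketch, a typical restore sequence therefore looks like the following; the database and file names are placeholders.

RESTORE DATABASE DatabaseName
FROM DISK = 'FileLocation\DatabaseName.bak'
WITH NORECOVERY;

RESTORE LOG DatabaseName
FROM DISK = 'FileLocation\DatabaseName_Log1.trn'
WITH NORECOVERY;

-- ...apply any remaining log backups WITH NORECOVERY, then recover
RESTORE LOG DatabaseName
FROM DISK = 'FileLocation\DatabaseName_Log2.trn'
WITH RECOVERY;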
A common requirement is to restore the database to a different location, in which case you can simply move the files as part of the restore process, as described here: http://msdn.microsoft.com/en-us/library/ms190255.aspx, and demonstrated later in the chapter.
and then insert a few sample rows of data (for convenience, the script to do this, CreateAndPopulateTestDB.sql, is included with the code download for this chapter). As usual, you'll also need to have a SQLBackups directory on the local D:\ drive of your database server, or modify the file paths as appropriate. This example assumes that the database is still online when we perform the tail log backup.
-- Perform a full backup of the Test database
-- The WITH FORMAT option starts a new backup set
-- Be careful, as it will overwrite any existing sets
-- The full backup becomes the first file in the set
BACKUP DATABASE TestDB
TO DISK = 'D:\SQLBackups\TestDB.bak'
WITH FORMAT;
GO
-- Perform a transaction log backup of the Test database
-- This is the second file in the set
BACKUP LOG TestDB
TO DISK = 'D:\SQLBackups\TestDB.bak'
GO
-- ....<FAILURE OCCURS HERE>....
-- The RESTORE HEADERONLY command is optional.
-- It simply confirms the files that comprise
-- the current set
RESTORE HEADERONLY
FROM DISK = 'D:\SQLBackups\TestDB.bak'
GO
-- Back up the tail of the log to prepare for restore
-- This will become the third file of the backup set
BACKUP LOG TestDB
TO DISK = 'D:\SQLBackups\TestDB.bak'
WITH NORECOVERY;
GO
Listing 5.2:
However, using backup sets seems to be a relic from times when we backed up databases to tape. When backing up to disk, it is a bad idea to use this scheme because, of course, the backup file will quickly grow very large. In practice, it is far more common that each full backup and transaction log backup file will be individually named, and probably stamped with the time and date we took the backup. For example, most third-party backup tools, popular community-generated scripts, plus the maintenance plan wizard/designer in SSMS, will all create separate date-stamped files, e.g. AdventureWorks_FULL_20080904_000001.bak. For all further examples, we'll adopt the scheme of uniquely named backups.
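One simple way to generate such names is with a little dynamic SQL; the sketch below is illustrative only, and the exact naming convention (database name, path, and date format) is an assumption.

DECLARE @FileName NVARCHAR(260);
SET @FileName = N'D:\SQLBackups\TestDB_FULL_'
                + CONVERT(NVARCHAR(8), GETDATE(), 112)                      -- yyyymmdd
                + N'_'
                + REPLACE(CONVERT(NVARCHAR(8), GETDATE(), 108), ':', '')    -- hhmmss
                + N'.bak';
BACKUP DATABASE TestDB TO DISK = @FileName;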
Listing 5.3:
In Listing 5.4, we create the SomeTable table, insert some data, and then take a log backup.
USE FullRecovery
GO
IF OBJECT_ID('dbo.SomeTable', 'U') IS NOT NULL
  DROP TABLE dbo.SomeTable
GO
CREATE TABLE SomeTable
    (
      SomeInt INT IDENTITY ,
      SomeCol VARCHAR(5)
    );
INSERT  INTO SomeTable
        ( SomeCol )
        SELECT TOP ( 10 )
                REPLICATE('a', 5)
        FROM    sys.columns AS c;
Listing 5.4: Create and populate SomeTable, in the FullRecovery database, and run a log backup.
Finally, in Listing 5.5, we update a row in SomeTable, make a note of the time immediately afterwards (we'll need this for a later restore), and then perform a named, marked transaction (again, more on this shortly) which accidentally empties SomeTable.
Listing 5.5: Update a row in SomeTable, then delete the table contents in a marked transaction.
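The code for Listing 5.5 is not reproduced here, but the marked transaction at its heart would look something like the following sketch. The update values are assumptions for illustration; the mark name, Delete_SomeTable, is the one referenced by the later restores, and the time you note down is used for the STOPAT restore later in the chapter.

USE FullRecovery
GO
UPDATE  SomeTable
SET     SomeCol = 'bbbbb'
WHERE   SomeInt = 1;
SELECT  GETDATE();   -- note this time for the later point-in-time restore
GO
BEGIN TRANSACTION Delete_SomeTable
    WITH MARK 'Delete_SomeTable';
DELETE  FROM SomeTable;
COMMIT TRANSACTION Delete_SomeTable;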
For this example, we're simply going to restore over the top of the existing FullRecovery database, to return it to the state in which it existed at the time of the log backup. Listing 5.6 restores our full database backup over the top of the existing database and then applies the log backup.
USE master
GO
--restore the full database backup
RESTORE DATABASE FullRecovery
FROM DISK='D:\SQLBackups\FullRecovery.bak'
WITH NORECOVERY;
--restore the log backup
RESTORE LOG FullRecovery
FROM DISK='D:\SQLBackups\FullRecovery_log.trn'
WITH RECOVERY;
Listing 5.6: Restoring to the end of the log backup (no tail log backup).
In this case, however, we'll get a useful and descriptive error.
Msg 3159, Level 16, State 1, Line 1
The tail of the log for the database "FullRecovery" has not been backed up. Use BACKUP LOG WITH NORECOVERY to back up the log if it contains work you do not want to lose. Use the WITH REPLACE or WITH STOPAT clause of the RESTORE statement to just overwrite the contents of the log.
Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.
Msg 3117, Level 16, State 1, Line 5
The log or differential backup cannot be restored because no files are ready to rollforward.
Msg 3013, Level 16, State 1, Line 5
RESTORE DATABASE is terminating abnormally.
SQL Server is warning us that we're about to overwrite the transaction log of the existing FullRecovery database, and we have not done a tail log backup, so any operations in there would be lost forever. In this example, it means that after the proposed restore operation we'd lose forever the effects of Listing 5.5, which we haven't backed up. If we're certain that we don't need to perform that backup, we can override this check using WITH REPLACE when we restore the full backup, as follows, and then perform the log restore as normal.
RESTORE DATABASE FullRecovery FROM DISK='D:\SQLBackups\FullRecovery.bak' WITH NORECOVERY, REPLACE;
Conversely, only use REPLACE when certain that a tail log backup is not required. In this example, we'll take that tail log backup, even though we don't need it right away. This pitches the database into a restoring state and we proceed immediately with the restore operation, as shown in Listing 5.7.
Listing 5.7:
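The code for Listing 5.7 is not reproduced here; the sequence it describes amounts to something like the following sketch, with the file names assumed to match those used earlier in this chapter.

-- Back up the tail of the log, leaving the database in the restoring state
BACKUP LOG FullRecovery
TO DISK = 'D:\SQLBackups\FullRecovery_tail.trn'
WITH NORECOVERY;
-- Restore the full backup, then the first log backup only
RESTORE DATABASE FullRecovery
FROM DISK = 'D:\SQLBackups\FullRecovery.bak'
WITH NORECOVERY;
RESTORE LOG FullRecovery
FROM DISK = 'D:\SQLBackups\FullRecovery_log.trn'
WITH RECOVERY;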
As you can see, we've returned the database to the state in which it existed after the first log backup. Of course, we could restore the database to the end of the tail log backup, as shown in Listing 5.8, but then we'd just be returning the database to the state where all data had been lost from SomeTable.
Listing 5.8:
Query SomeTable and you'll see that there is no data. Let's now see how to get it back; in order to do this, we'll need to stop our restore operation after the update in Listing 5.5, but before the rogue delete.
Point-in-time restores
Whether DBAs like it or not, developers often have access to production databases to perform ad hoc data loads and changes. It is the joint responsibility of the DBA and the developer to ensure these proceed smoothly, but accidents can and do happen, and may result in the accidental deletion or modification of data. In such an event, the DBA may need to perform a point-in-time restore, to return a database to the state in which it existed prior to the rogue transaction, and then copy the lost data back into the production database. We'll cover two such techniques to perform a point-in-time restore:
RESTORE WITH STOPBEFOREMARK: restoring to a marked transaction.
RESTORE WITH STANDBY: restoring a standby database to a specific point in time.
Of course, the exact nature of the reparative action required depends on the nature of the "bad" transaction. If someone accidentally dropped a table, and we don't know exactly when it happened, then it's possible we'll be heading down the RESTORE WITH STANDBY route. At other times, we may get away with simply creating a script to "reverse out" the rogue modifications. If the damage affected only a single column, or a limited number of rows, then it may be possible, as an alternative, to use a tool such as SQL Data Compare, which can compare directly to backup files, and can do row-level restores. If we have SQL Server 2005 (or later) Enterprise Edition plus a recent database snapshot, we may be able to run a query against the snapshot to retrieve the data as it looked at the time the database snapshot was taken. We can then write an UPDATE or INSERT command to pull the data from the database snapshot into the live, source database. Finally, as a last resort, a specialized log reader tool may help us reverse out the effects of a transaction, although I'm not aware of any that work reliably in SQL Server 2005 and later. However, the ultimate safeguard, and most likely route to restoring the database without losing any data, is to have a complete set of backups, and to use them to perform a point-in-time restore.
Listing 5.9:
We've left the FullRecovery_Copy database in a restoring state, and the next step is to restore the first log backup, using WITH STOPBEFOREMARK, because we want to return our database to a consistent state as of the last committed transaction at the time the marked transaction started.
RESTORE LOG [FullRecovery_Copy] FROM DISK = N'D:\SQLBackups\FullRecovery_Log.trn' WITH NORECOVERY, STOPBEFOREMARK = N'Delete_SomeTable' GO
At this point, we'll see the following message since the log backup does not contain the designated mark:
Processed 0 pages for database 'FullRecovery_Copy', file 'FullRecovery_dat' on file 1.
Processed 4 pages for database 'FullRecovery_Copy', file 'FullRecovery_log' on file 1.
This log file contains records logged before the designated mark. The database is being left in the Restoring state so you can apply another log file.
RESTORE LOG successfully processed 4 pages in 0.007 seconds (4.115 MB/sec).
In this example, the tail log backup contains our marked transaction and so, in Listing 5.11, we restore that tail log backup.
RESTORE LOG [FullRecovery_Copy] FROM DISK = N'D:\SQLBackups\FullRecovery_tail.trn' WITH NORECOVERY, STOPBEFOREMARK = N'Delete_SomeTable' GO
Listing 5.11:
At this point, we can recover the FullRecovery_Copy database and query it, as shown in Listing 5.12.
RESTORE DATABASE [FullRecovery_Copy] WITH RECOVERY
GO
USE FullRecovery_copy
SELECT TOP 1 *
FROM   SomeTable
Listing 5.12:
As you can see, our restored copy of the database contains the effects of our update, but not the accidental delete. We can now think about copying the contents of SomeTable back into our production copy of the FullRecovery database (we'll discuss this in more detail in the next section).
Of course, this process is not necessarily straightforward, and can be quite time-consuming. Unless we've purchased a specialized log reading tool, and can interrogate the log backup directly, rolling the logs forward can mean a series of painstaking steps involving restoring a log, checking the data, restoring a bit further, and so on, until we've worked out exactly where the bad transaction occurred. Step 3 can be difficult, too, since we will be introducing data into the live system that is not necessarily consistent with the current state of the database, so there could be referential integrity issues. Nevertheless, let's look at how this works. In order to try to retrieve the lost data without interrupting normal business operation, we're going to restore our backups to create a Standby_FullRecovery database, in STANDBY mode.
USE master
GO
IF DB_ID('Standby_FullRecovery') IS NOT NULL
  DROP DATABASE Standby_FullRecovery;
GO
RESTORE DATABASE Standby_FullRecovery
FROM DISK = N'D:\SQLBackups\FullRecovery.bak'
WITH MOVE N'FullRecovery_Dat' TO N'D:\SQLDATA\Standby_FullRecovery.mdf',
     MOVE N'FullRecovery_Log' TO N'D:\SQLDATA\Standby_FullRecovery.ldf',
     STANDBY = N'D:\SQLBackups\Standby_FullRecovery_Copy_UNDO.bak'
GO
Listing 5.13:
We now have a new database, called Standby_FullRecovery, and it's in "Standby / Read-Only" mode, as shown in Figure 5.1. As you can see, there are currently no user tables.
Figure 5.1:
The next step is to restore the first log backup to the standby-mode database. In doing so, we must create an undo file for the standby database. This undo file contains information relating to which operations in the log backup SQL Server had to undo during the restore operation, in order to make the database readable. When the next log backup is applied, SQL Server first has to "redo" whatever undo is stored in the undo file.
RESTORE LOG Standby_FullRecovery
FROM DISK = N'D:\SQLBackups\Fullrecovery_Log.trn'
WITH STANDBY = N'D:\SQLBackups\Standby_FullRecovery_Copy_UNDO.bak'
--   STOPAT = '2012-10-05 16:23:06.740'
GO
Listing 5.14: Restore the first log backup to the standby database.
Next, we can fully restore the next log backup (in this example, the tail log backup). Having done so, we'll find that all the data in SomeTable is gone, and so we know that this log backup contains the rogue delete transaction. Fortunately, in this case, not only do we know that already, we also know the exact time at which we wish to stop the restore. Therefore, we can proceed as shown in Listing 5.15, restoring the tail log backup using WITH STOPAT to stop the restore at the point of the last committed transaction, at the time specified, just before we lost the data.
USE master
GO
RESTORE LOG Standby_FullRecovery
FROM DISK = N'D:\SQLBackups\Fullrecovery_tail.trn'
WITH STANDBY = N'D:\SQLBackups\Standby_FullRecovery_Copy_UNDO.bak',
     STOPAT = '2012-10-05 16:23:06.740'
GO
Listing 5.15:
Query the Standby_FullRecovery database and you'll see that the lost data is back, and we can attempt to reintroduce it to the live database, as discussed earlier. Of course, the less sure we are of the exact time to which we need to restore, the trickier the process can become. Aside from third-party log readers (very few of which offer support beyond SQL Server 2005), there are a couple of undocumented and unsupported functions that can be used to interrogate the contents of log files (fn_dblog) and log backups (fn_dump_dblog). So, for example, we can look at the contents of our second log backup file as shown in Listing 5.16.
SELECT  *
FROM    fn_dump_dblog(DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        'C:\SQLBackups\FullRecovery_tail.trn',
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT,
        DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT, DEFAULT);
It's not pretty and it's unsupported, so use it with caution. It accepts a whole host of parameters, the only one we've defined being the path to the log backup. It returns a vast array of information that we're not going to begin to delve into here, but it does return the Begin Time for each of the transactions contained in the file, and it may offer some help in working out where to stop a restore operation. A different technique to point-in-time restores using STOPAT is to try to work out the LSN value associated with, for example, the start of the rogue transaction that deleted your data. We're not going to walk through an LSN-based restore here, but you can find a good explanation of some of the practicalities involved at: http://janiceclee.com/2010/07/25/alternative-to-restoring-to-a-point-in-time/. An alternative to doing a standby restore is to consider use of a third-party tool such as Red Gate's SQL Virtual Restore (http://www.red-gate.com/products/dba/sql-virtual-restore/), which provides a way to mount backups as live, fully functional databases, without a physical restore.
Listing 5.17:
We can then proceed with the restore operation, replacing the existing database, as discussed previously. We'll return to this topic in Chapter 6, when we discuss the BULK_LOGGED recovery model, as there are cases there where this technique won't work.
Summary
In this chapter, we've covered the basics of backing up and restoring log files for databases operating in FULL recovery model, which will be the norm for many production databases. For most DBAs, the need to perform a point-in-time restore is a rare event, but it's one of those tasks where, if it is necessary, it is critical that it is done, and done well; the DBA's reputation depends on it.
In the case of corruption, drive failure, and so on, point-in-time recovery might, if you're lucky, involve backing up the tail of the transaction log and restoring right to the point of failure. If the transaction log is not available, or if you're restoring in order to revert to some point in time before a "bad transaction" occurred, then the situation becomes trickier, but hopefully some of the techniques covered here will help.
Chapter 6: Managing the Log in BULK_LOGGED Recovery Model

Having said all this, for many databases, the ability to switch to BULK_LOGGED recovery, so that SQL Server will minimally log certain operations, is a very useful weapon in the fight against excessive log growth. In most cases, the SLA will allow enough leeway to make its use acceptable and, with careful planning and procedures, the risks will be minimal. This chapter will cover:
what we mean by "minimal logging"
advantages of minimal logging in terms of log space use
implications of minimal logging for crash recovery, point-in-time restore, and tail log backups
best practices for use of BULK_LOGGED recovery.
You can find a full list of the operations that SQL Server can minimally log here: http://msdn.microsoft.com/en-us/library/ms191244.aspx. Some of the more common ones are as follows:
bulk load operations, such as via SSIS, bcp, or BULK INSERT
SELECT INTO operations
creating and rebuilding indexes.
It's worth noting that "can be" minimally logged is not the same as "will be" minimally logged. Depending on the indexes that are in place and the plan chosen by the optimizer, SQL Server might still fully log bulk load operations that, in theory, it can minimally log. Due (mainly) to recoverability requirements, SQL Server only minimally logs a bulk data load that is allocating new extents. For example, if we perform a bulk load into a clustered index that already has some data, the load will consist of a mix of adding to pages, splitting pages, and allocating new pages, and so SQL Server can't minimally log it. Similarly, it's possible for SQL Server to minimally log the inserts into the table, but fully log the inserts into the non-clustered indexes. See the white paper, The Data Loading Performance Guide (http://msdn.microsoft.com/en-us/library/dd425070.aspx), for a fuller discussion. Books Online describes minimally logged operations as "logging only the information that is required to recover the transaction without supporting point-in-time recovery." Similarly, Kalen Delaney, in her book, SQL Server 2008 Internals (Chapter 4, page 199), defines minimally logged operations as "ones that log only enough information to roll back the transaction, without supporting point-in-time recovery." In order to understand the difference in what is logged for a "minimally logged" operation, depending on whether the database uses FULL or BULK_LOGGED recovery, let's try it out and see!
We'll use a SELECT INTO statement, which can be minimally logged in SQL Server 2008, to insert 200 rows, of 2,000 bytes each, into a table called SomeTable. Since the page size in SQL Server is 8 KB, we should get four rows per page and so 50 data pages in all (plus some allocation pages). Listing 6.1 creates a test database, FullRecovery, ensures that it is operating in FULL recovery model, and then runs the SELECT INTO statement.
USE master
GO
IF DB_ID('FullRecovery') IS NOT NULL
  DROP DATABASE FullRecovery;
GO
-- Clear backup history
EXEC msdb.dbo.sp_delete_database_backuphistory @database_name = N'FullRecovery'
GO
CREATE DATABASE FullRecovery ON
(NAME = FullRecovery_dat,
 FILENAME = 'D:\SQLData\FullRecovery.mdf'
) LOG ON
( NAME = FullRecovery_log,
  FILENAME = 'D:\SQLData\FullRecovery.ldf'
);
ALTER DATABASE FullRecovery SET RECOVERY FULL
GO
USE FullRecovery
GO
Listing 6.1:
Now, at this point, we'd like to peek inside the log, and understand what SQL Server recorded in the log because of our fully logged SELECT INTO statement. There are some third-party log readers for this purpose, but very few of them offer support beyond SQL Server 2005. However, we can use two undocumented and unsupported functions to interrogate the contents of log files (fn_dblog) and log backups (fn_dump_dblog), as shown in Listing 6.2.
SELECT  Operation ,
        Context ,
        AllocUnitName ,
        Description ,
        [Log Record Length] ,
        [Log Record]
FROM    fn_dblog(NULL, NULL)
Listing 6.2:
Figure 6.1 shows a small portion of the output, consisting of a set of eight pages (where the allocation unit is dbo.SomeTable). Notice that the context in each case is LCX_HEAP, so these are the data pages. We also see some allocation pages, in this case a Differential Changed Map, tracking extents that have changes since the last database backup (to facilitate differential backups), and some Page Free Space (PFS) pages, tracking page allocation and available free space on pages.
Figure 6.1:
The log records describing the changes made to SomeTable are all of type LOP_FORMAT_PAGE; they always appear in sets of 8, and each one is 8,276 bytes long. The fact that they appear in sets of 8 indicates that SQL Server was processing the insert one extent at a time and writing one log record for each page. The fact that each one is 8,276 bytes shows that each one contains the image of an entire page, plus log headers. In other words, for the INSERT INTO command, and others that SQL Server will minimally log in BULK_LOGGED recovery, SQL Server does not log every individual row when run in FULL recovery; rather, it just logs each page image, as it is filled. A closer examination of the Log Record column shows many bytes containing the hex value 0x61, as shown in Figure 6.2. This translates to decimal 97, which is the ASCII value for 'a', so these are the actual data rows in the log file.
Figure 6.2:
So in FULL recovery model, SQL Server knows, just by reading the log file, which extents changed and exactly how this affected the contents of the pages. Let's now compare this to the log records that result from performing the same SELECT INTO operation on a database operating in BULK_LOGGED recovery.
USE master
GO
IF DB_ID('BulkLoggedRecovery') IS NOT NULL
  DROP DATABASE BulkLoggedRecovery;
GO
EXEC msdb.dbo.sp_delete_database_backuphistory @database_name = N'BulkLoggedRecovery'
GO
CREATE DATABASE BulkLoggedRecovery ON
(NAME = BulkLoggedRecovery_dat,
 FILENAME = 'D:\SQLData\BulkLoggedRecovery.mdf'
) LOG ON
( NAME = BulkLoggedRecovery_log,
  FILENAME = 'D:\SQLData\BulkLoggedRecovery.ldf'
);
GO
ALTER DATABASE BulkLoggedRecovery SET RECOVERY BULK_LOGGED
GO
BACKUP DATABASE BulkLoggedRecovery
TO DISK = 'D:\SQLBackups\BulkLoggedRecovery.bak'
WITH INIT
GO
USE BulkLoggedRecovery
GO
IF OBJECT_ID('dbo.SomeTable', 'U') IS NOT NULL
  DROP TABLE dbo.SomeTable
GO
SELECT TOP ( 200 )
        REPLICATE('a', 2000) AS SomeCol
INTO    SomeTable
FROM    sys.columns AS c;
Listing 6.3:
Rerun the fn_dblog function, from Listing 6.2, in the BulkLoggedRecovery database, and what we get this time is a very different set of log records. There are no LOP_FORMAT_PAGE log records at all.
Figure 6.3:
This time, the log records for the changes made to SomeTable appear in the context of Global Allocation Maps (GAMs) and Index Allocation Maps (IAMs), tracking extent allocation, plus some PFS pages. In other words, SQL Server is logging the extent allocations (and any changes to the metadata, i.e. system tables, which we don't show in Figure 6.3), but the data pages themselves are not there. There is no reference in the log as to what data is on the allocated pages. We do not see here the 0x61 pattern that appeared in the log records for the FullRecovery database, and most of the log records are around 100 bytes in size. Therefore, we now have a clearer picture of exactly what it means for SQL Server to minimally log an operation: it is one where SQL Server logs allocations of the relevant extents, but not the actual content of those extents (i.e. data pages).
The effects of this are twofold. Firstly, it means that SQL Server writes much less information to the transaction log, so the log file will grow at a significantly slower rate than for equivalent operations in the FULL recovery model. It also means that the bulk load operations may be faster (but see the later discussion on this topic, in the section on Advantages of Minimal Logging and BULK_LOGGED Recovery). Secondly, however, it means that SQL Server logs only enough information to undo (roll back) the transaction to which the minimally logged operation belongs, but not to redo it (roll it forward). To roll back the transaction containing the SELECT INTO operation, all SQL Server needs to do is de-allocate the affected pages. Since page allocations are logged, as shown in Figure 6.3, that's possible. To roll the transaction forward is another matter. The log records can be used to re-allocate the pages, but there is no way, when an operation is minimally logged, for SQL Server to use these log records to re-create the contents of the pages.
Minimally logged versus "extent de-allocation only"
For DROP TABLE and TRUNCATE TABLE operations, as for bulk operations, SQL Server logs only the extent de-allocations. However, the former are not true minimally logged operations, because their behavior is the same in all recovery models. The behavior of true minimally logged operations is different in FULL recovery from in BULK_LOGGED (or SIMPLE) recovery, in terms of what SQL Server logs. Also, for true minimally logged operations, when log backups are taken, SQL Server captures into the backup file all the data pages affected by a minimally logged operation (we'll discuss this in more detail a little later), for use in restore operations. This does not happen for the DROP TABLE and TRUNCATE TABLE commands.
In BULK_LOGGED recovery, only the page allocations for the new index are logged, so the impact on the transaction log can be substantially less than it would be in FULL recovery. A similar argument applies to large data loads via bcp or BULK INSERT, or for copying data into new tables via SELECT INTO or INSERT INTO...SELECT. To get an idea of just how much difference this makes, let's see an example. First, we'll create a table with a clustered index and load it with data (the Filler column adds 1,500 bytes per row and ensures we get a table with many pages).
USE FullRecovery
GO
IF OBJECT_ID('dbo.PrimaryTable_Large', 'U') IS NOT NULL
  DROP TABLE dbo.PrimaryTable_Large
GO
CREATE TABLE PrimaryTable_Large
    (
      ID INT IDENTITY PRIMARY KEY ,
      SomeColumn CHAR(4) NULL ,
      Filler CHAR(1500) DEFAULT ''
    );
GO
INSERT  INTO PrimaryTable_Large
        ( SomeColumn )
        SELECT TOP 100000
                'abcd '
        FROM    msdb.sys.columns a
                CROSS JOIN msdb.sys.columns b
GO
Listing 6.4:
Let's now rebuild the clustered index (consisting of 20,034 pages, according to sys.dm_db_index_physical_stats) in our FULL recovery model database, and see how much log space the index rebuild needs, using the sys.dm_tran_database_transactions DMV.
--truncate the log
USE master
GO
BACKUP LOG FullRecovery TO DISK = 'D:\SQLBackups\FullRecovery_log.trn'
WITH INIT
GO
-- rebuild index and interrogate log space use, within a transaction
USE FullRecovery
GO
BEGIN TRANSACTION
ALTER INDEX ALL ON dbo.PrimaryTable_Large REBUILD
-- there's only the clustered index
SELECT  d.name ,
        session_id ,
        d.recovery_model_desc ,
        database_transaction_begin_time ,
        database_transaction_log_record_count ,
        database_transaction_log_bytes_used ,
        DATEDIFF(ss, database_transaction_begin_time, GETDATE()) AS SecondsToRebuild
FROM    sys.dm_tran_database_transactions AS dt
        INNER JOIN sys.dm_tran_session_transactions AS st
            ON dt.transaction_id = st.transaction_id
        INNER JOIN sys.databases AS d
            ON dt.database_id = d.database_id
WHERE   d.name = 'FullRecovery'
COMMIT TRANSACTION
Listing 6.5:
When I ran this code on a database operating in FULL recovery, the output from the DMV was as shown in Figure 6.4.
Figure 6.4:
It took approximately 5 seconds to rebuild the index, and the rebuilds required about 166 MB of log space for 20,131 log records; this is ignoring the log reservation in case of rollback, so the total log space required is larger. If we run the same example in the BulkLoggedRecovery database, the output is as shown in Figure 6.5.
Figure 6.5:
The rebuild appears to be a bit faster, at 4 seconds; however, because the index in this example is quite small, and because the data and log files are both on a single drive, not much can be concluded from that time difference. The main point here is the difference in the log space used. In BULK_LOGGED recovery, that index rebuild only used about 0.6 MB of log space, compared to the 166 MB of log space in FULL recovery. That's a major saving considering that this was quite a small table, at only 160 MB in size. In case anyone's wondering whether the behavior will be any different in SIMPLE recovery, look at Figure 6.6 (the code to reproduce the example in a SIMPLE recovery database, as well as BULK_LOGGED, is included in the code download for this chapter).
Figure 6.6:
As expected, the behavior is the same as for BULK_LOGGED recovery, since operations that are minimally logged in BULK_LOGGED recovery are also minimally logged in SIMPLE recovery. This is the major reason for running a database in BULK_LOGGED recovery; it offers the database recovery options of FULL recovery (mostly, see the coming sections), but it reduces the amount of log space used by certain operations, in the same way as SIMPLE recovery. Note also that, if the database is a log-shipping primary, we cannot switch the database into SIMPLE recovery for index rebuilds without having to redo the log shipping afterwards, but we can switch it to BULK_LOGGED for the index rebuilds. Finally, note that database mirroring requires FULL recovery only and, as such, a database that is a database mirroring principal cannot use BULK_LOGGED recovery.
Crash recovery
Crash recovery, also called restart recovery, is a process that SQL Server performs whenever it brings a database online. So, for example, if a database does not shut down cleanly, then upon restart SQL Server goes through the database's transaction log. It undoes any transaction that had not committed at the time of the shutdown and redoes any transaction that had committed but whose changes had not been persisted to disk. This is possible because, as discussed in Chapter 1, the Write Ahead Logging mechanism ensures that the log records associated with a data modification are written to disk before either the transaction commits or the data modification is written to disk, whichever happens first. SQL Server can write the changes to the data file at any time, before the transaction commits or after, via the checkpoint or Lazy Writer. Hence, for normal operations (i.e. ones that are fully logged), SQL Server has sufficient information in the transaction log to tell whether an operation needs to be undone or redone, and has sufficient information to roll forward or roll back. For operations that were minimally logged, however, roll forward is not possible as there's not enough information in the log. Therefore, when dealing with minimally logged operations in SIMPLE or BULK_LOGGED recovery model, another process, Eager Write, guarantees that the thread that is executing the bulk operation hardens to disk any extents modified by the minimally logged operation, before the transaction is complete. This is in contrast to normal operations, where only the log records have to be hardened before the transaction is complete, and the data pages are written later by a system process (Lazy Writer or checkpoint). This means that crash recovery will never have to redo a minimally logged operation, since SQL Server guarantees that the modified data pages will be on disk at the time the transaction commits, and hence the minimal logging has no effect on the crash recovery process.
One side effect of this requirement that SQL Server writes both log records and modified data pages to disk before the transaction commits, is that it may actually result in the minimally logged operation being slower than a regular transaction, if the data file is not able to handle the large volume of writes. Minimally logged operations are usually faster than normal operations, but there is no guarantee. The only guarantee is that they write less information to the transaction log.
Database restores
SQL Server also needs to perform redo operations when restoring full, differential, or log backups. As we've discussed, for a minimally logged operation the affected pages are on disk at the point the transaction completes, and so SQL Server simply copies those pages into any full or differential backup files, and restores from these backups are unaffected. Restores from log backups, however, are more interesting. If the log backup only contained the log records relating to extent allocation, then on restoring the log backup there would be no way to re-create the contents of the extents that the minimally logged operation affected. This is because, as we saw earlier, the log does not contain the inserted data, just the extent allocations and metadata. In order to enable log restores when there are minimally logged operations, included in the log backup are not just the log records, but also the images of any extent (set of eight pages) affected by the minimally logged operation. This doesn't mean images of them as they were after the minimally logged operation, but the pages as they are at the time of the log backup. SQL Server maintains a bitmap allocation page, called the ML map or bulk-logged change map, with a bit for every extent. Any extents affected by the minimally logged operation have their bit set to 1. The log backup operation reads this page and so knows exactly what extents to include in the backup. That log backup will then clear the ML map.
So, for example, let's say we have a timeline like that shown in Figure 6.7, where a log backup occurs at 10:00, then at (1) a minimally logged operation (let's say a BULK INSERT) affected pages 1308-1315, among others, at (2) an UPDATE affected pages 1310 and 1311, and at (3) another log backup occurred.
Figure 6.7:
The log backup at 10:30 backs up the log records covering the period 10:00-10:30. Since there was a minimally logged operation within that log interval, it copies the extents affected by the minimally logged operations into the log backup. It copies them as they appear at the time of the log backup, so they will reflect the effects of the BULK INSERT and the UPDATE, plus any further modifications that may have taken place between the UPDATE and the log backup. This affects how we can restore the log. It also affects the size of the log backups and, under certain circumstances, may affect tail log backups, but we'll get to that in more detail in the next section. Let's take a look at an example of how minimally logged operations can affect a point-in-time restore. Figure 6.8 depicts an identical backup timeline for two databases. The green bar represents a full database backup and the yellow bars represent a series of log backups. The only difference between the two databases is that the first is operating in FULL recovery model, and the second in BULK_LOGGED.
Figure 6.8:
The time span of the fifth log backup is 10:00 to 10:30. At 10:10, a BULK INSERT command (1) loaded a set of data. This bulk data load completed without a hitch but, in an unrelated incident at 10:20, a user ran a "rogue" data modification (2) and crucial data was lost. The project manager informs the DBA team and requests that they restore the database to a point in time just before the transaction that resulted in data loss started, at 10:20. In the FULL recovery model database, this is not an issue. The bulk data load was fully logged and we can restore the database to any point in time within that log file. We simply restore the last full database backup, without recovery, and apply the log files to the point in time right before the unfortunate data loss incident occurred, using the RESTORE LOG command with the STOPAT parameter, to stop the restore operation sometime before 10:20. In the BULK_LOGGED database, we have a problem. We can restore to any point in time within the first four log backups, but not to any point in time within the fifth log backup, which contains the minimally logged operations. Remember that for this log backup we only have the extents affected by the minimally logged operation, as they existed at the time of the log backup. The restore of the fifth log backup is "all or nothing:" either we apply none of the operations in this log file, stopping the restore at the end of the fourth file, or we apply all of them, restoring to the end of the file, or proceeding to restore to any point in time within the sixth log backup. If we tried to restore that fifth log backup, with STOPAT 10:15 (a time between the minimally logged operation and the rogue modification), SQL is not going to walk the rest of the log backup figuring out what operations it needs to undo on the pages that were affected by the minimally logged operation. It has a simpler reaction:
Msg 4341, Level 16, State 1, Line 2
This log backup contains bulk-logged changes. It cannot be used to stop at an arbitrary point in time.
Unfortunately, if we apply the whole of the fifth log file backup, this would defeat the purpose of the recovery, since the errant process committed its changes somewhere inside of that log backup file, so we'd simply be removing the data we were trying to get back! We have little choice but to restore up to the end of the fourth log, recover the database, and report the loss of any data changes made after this time. This inability to restore the database to a point in time if there are minimally logged operations within the log interval is something that must be considered when choosing to run a database in BULK_LOGGED recovery, either for short or long periods. It is easy to identify whether or not a specific log backup contains any minimally logged operations. A RESTORE HEADERONLY returns a set of details about the backup in question, including a column HasBulkLoggedData. In addition, the msdb backupset table has a column, has_bulk_logged_data. If the column value is 1, then the log backup contains minimally logged operations and can only be restored entirely or not at all. That said, finding this out while planning or executing a restore may be an unpleasant surprise.
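A check ahead of time might look like the following sketch; the log backup path is an assumption, and which of the two sources you query is a matter of preference.

-- Inspect a specific log backup file
RESTORE HEADERONLY FROM DISK = 'D:\SQLBackups\FullRecovery_log.trn';
-- Or check the backup history in msdb
SELECT  database_name ,
        backup_start_date ,
        has_bulk_logged_data
FROM    msdb.dbo.backupset
WHERE   type = 'L'
ORDER BY backup_start_date DESC;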
Listing 6.6:
Next, we check current log space usage, and then back up the log.
Listing 6.7:
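The code for Listing 6.7 is not reproduced here, but a check of this kind can be as simple as the following sketch; the backup file name is an assumption.

-- Check current log size and space used
DBCC SQLPERF(LOGSPACE);
-- Back up the log; in BULK_LOGGED recovery this backup also copies in
-- the extents affected by the minimally logged SELECT INTO
BACKUP LOG BulkLoggedRecovery
TO DISK = 'D:\SQLBackups\BulkLoggedRecovery_log.trn'
WITH INIT;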
Given that the log size is only about 24 MB, you may be surprised to see the size of the log backup, about 1 GB in my test! For a database in FULL recovery, you'll find that the log size and log backup size are both about 1 GB.
With the existing backups (full, perhaps differential, and then log) and the tail log backup, we can restore the database up to the exact point that it failed. We can do that because the log contains sufficient information to re-create all of the committed transactions. However, if the database is in BULK_LOGGED recovery and there are minimally logged operations in the active portion of the transaction log, then the log does not contain sufficient information to re-create all of the committed transactions, and the actual data pages are required to be available when we take the tail log backup. If the data file is not available, then that log backup cannot copy the data pages it needs to restore a consistent database. Let's see this in action, using the BulkLoggedRecovery database.
BACKUP DATABASE BulkLoggedRecovery
TO DISK = 'D:\SQLBackups\BulkLoggedRecovery2.bak'
WITH INIT
GO
USE BulkLoggedRecovery
GO
IF OBJECT_ID('dbo.SomeTable', 'U') IS NOT NULL
  DROP TABLE dbo.SomeTable ;
SELECT TOP 200
        SomeCol = REPLICATE('a', 2000)
INTO    dbo.SomeTable
FROM    sys.all_columns ac1
GO
SHUTDOWN WITH NOWAIT
Listing 6.8: Create the BulkLoggedRecovery database, perform a SELECT INTO, and then shut down.
With the SQL Server service shut down, go to the data folder and delete the mdf file for the BulkLoggedRecovery database, and then restart SQL Server. It's not a complete simulation of a drive failure, but it's close enough for the purposes of this demo.
When SQL Server restarts, the database is not available, which is no surprise since its primary data file is missing. The state is Recovery_Pending, meaning that SQL couldn't open the database to run crash recovery on it.
USE master
GO
SELECT  name ,
        state_desc
FROM    sys.databases
WHERE   name = 'BulkLoggedRecovery'
name                  state_desc
--------------------  ----------------
BulkLoggedRecovery    RECOVERY_PENDING
Listing 6.9:
In Listing 6.10, we attempt to take a tail log backup (note that NO_TRUNCATE implies COPY_ONLY and CONTINUE_AFTER_ERROR):
BACKUP LOG BulkLoggedRecovery TO DISK = 'D:\SQLBackups\BulkLoggedRecovery_tail.trn' WITH NO_TRUNCATE
Processed 7 pages for database 'BulkLoggedRecovery', file 'BulkLoggedRecovery_log' on file 1.
BACKUP WITH CONTINUE_AFTER_ERROR successfully generated a backup of the damaged database. Refer to the SQL Server error log for information about the errors that were encountered.
BACKUP LOG successfully processed 7 pages in 0.007 seconds (7.463 MB/sec).
Listing 6.10: Attempting a tail log backup using BACKUP LOG WITH NO_TRUNCATE.
Well, it said that it succeeded (and despite the warning, there were no errors in the error log). Now, let's try to restore this database, as shown in Listing 6.11.
Listing 6.11:
Well, that didn't work. Let's follow the advice of the error message and see if we can restore the log using CONTINUE_AFTER_ERROR.
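A sketch of that second restore attempt follows; the file names match those used earlier in this example, but the exact commands of the book's listing are not reproduced here.

RESTORE DATABASE BulkLoggedRecovery
FROM DISK = 'D:\SQLBackups\BulkLoggedRecovery2.bak'
WITH NORECOVERY, REPLACE;
RESTORE LOG BulkLoggedRecovery
FROM DISK = 'D:\SQLBackups\BulkLoggedRecovery_tail.trn'
WITH RECOVERY, CONTINUE_AFTER_ERROR;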
That worked, so let's investigate the state of the restored database. Rerun Listing 6.9 and you'll see it's reported as ONLINE, and SomeTable, the target of the SELECT INTO, exists, so let's see if any of the data made it into the table (remember, the page allocations were logged, the contents of the pages were not).
Msg 824, Level 24, State 2, Line 1 SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 1:184; actual 0:0). It occurred during a read of page (1:184) in database ID 32 at offset 0x00000000170000 in file 'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\ DATA\BulkLoggedRecovery.mdf'. Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
Listing 6.13:
Note that this was the error message from SQL Server 2008 Enterprise Edition. It's possible you'll see different errors on other versions. In any event, this doesn't look good; let's see what DBCC CHECKDB says about the state of the database.
DBCC CHECKDB ('BulkLoggedRecovery') WITH NO_INFOMSGS, ALL_ERRORMSGS
Msg 8921, Level 16, State 1, Line 1
Check terminated. A failure was detected while collecting facts. Possibly tempdb out of space or a system table is inconsistent. Check previous errors.
This doesn't look good at all, unfortunately (and TempDB was not out of space). About the only sensible option here is to restore again and leave off the tail log backup. It means that any transactions that committed between the last normal log backup and the point of failure are lost.
This is another important consideration when deciding to run a database in BULK_LOGGED recovery model for long or short periods. In FULL recovery model, a tail log backup requires access only to the transaction log. Therefore, we can still back up the transaction log even if the MDF files are unavailable, due to disk failure, for example. However, in BULK_LOGGED model, if any minimally logged operations have occurred since the last log backup, we cannot perform a tail log backup if the data files containing the data affected by the minimally logged operations become unavailable. The reason for this is that, when performing a transaction log backup in BULK_LOGGED model, SQL Server has to back up to the transaction log backup file all the actual extents (i.e. the data) that the bulk operation modified, as well as the transaction log entries. In other words, SQL Server needs access to the data files in order to do the tail log backup.
To illustrate how this reduces risk, consider the following scenario:
1:00 a.m. - Full backup
1:15 a.m. - Transaction log backup 1
2:15 a.m. - Transaction log backup 2
2:40 a.m. - Switch to BULK_LOGGED; bulk operation begins
3:05 a.m. - Bulk operation ends
3:10 a.m. - FAILURE: MDF becomes unavailable
3:15 a.m. - Transaction log backup 3
In this case, the 3:15 a.m. log backup would fail, as would a subsequent attempt to do a tail log backup. All we could do is restore the full backup followed by the first two log backups, so we would lose 55 minutes' worth of data. If, instead, we had adopted the following regime, we would have been in a much better situation:
1:00 a.m. - Full backup
1:15 a.m. - Transaction log backup 1
2:15 a.m. - Transaction log backup 2
2:35 a.m. - Transaction log backup 3
2:40 a.m. - Switch to BULK_LOGGED; bulk operation begins
3:05 a.m. - Bulk operation ends
3:05 a.m. - Switch back to FULL and perform transaction log backup 4
3:10 a.m. - FAILURE: MDF becomes unavailable
3:15 a.m. - Transaction log backup 5
Here, the 3:15 log backup would also fail, but we would subsequently be able to perform a tail log backup, since log backup 4 ensures that there are no minimally logged operations in the live log. We could then restore the full backup, the four transaction log backups, and the tail log backup to recover to the point of failure at 3:15 a.m. Even given these precautionary log backups, it is best to perform any minimally logged operations out of hours, when very few, if any, other transactions are being performed. This way, if anything goes wrong, we may simply be able to replay the bulk load to restore the data. Even if you minimize risk by taking extra log backups before and after every bulk operation, it is inadvisable to operate a database continually in BULK_LOGGED model. It can be very hard, depending on your environment, to exert total control over who might perform a minimally logged operation, and when. Bear in mind that any table owner can create or rebuild indexes on that table; anyone who can create a table can also run SELECT INTO statements. Finally, we recommend reading The Data Loading Performance Guide (http://msdn.microsoft.com/en-us/library/dd425070.aspx), which offers a lot of advice on achieving high-speed bulk data modifications, and discusses how to measure the potential benefit that will arise from minimal logging, using Trace Flag 610.
Summary
The BULK_LOGGED recovery model offers a way to perform data loads and some database maintenance operations, such as index rebuilds, without the transaction log overhead that they would normally have in FULL recovery model, while still keeping the log chain intact. The downsides of this include greater potential data loss if a disaster occurs during, or over the same time span as, the bulk operation. In other words, you won't be able to use the STOPAT option when restoring a log file that contains minimally logged operations. It is still possible to restore the entire transaction log backup to roll the database forward, and it is still possible to restore to a point in time in a subsequent log file which doesn't contain any minimally logged operations. However, in the event of an application bug, or a user change that causes data to be deleted, around the same period as the minimally logged operation, it will not be possible to stop at a specific point in time in the log in which these changes are recorded, in order to recover the data that was deleted. When using BULK_LOGGED recovery, keep in mind the increased risks of data loss, and use the recovery model only for the duration of the maintenance or data load operations, and not as a default recovery model. Even with the increased risks, it is a viable and useful option and is something that DBAs should consider when planning data loads or index maintenance.
Acknowledgements
Many thanks to Shawn McGehee, author of SQL Server Backup and Restore (http://www.simple-talk.com/books/sql-books/sql-backup-and-restore/) for contributing additional material to the Database restores section of this chapter.
Chapter 7: Dealing with Excessive Log Growth

Conversely, if the log file has only a few VLFs that are very large, we risk tying up large portions of the log for long periods. Each VLF will hold a very large number of log records, and SQL Server cannot truncate a VLF until it contains no part of the active log. In cases where truncation is delayed for some reason (see the Lack of log space reuse section), this can lead to rapid log growth. For example, let's assume that each VLF is 1 GB in size and that the log is full. You perform a log backup, but all VLFs contain some part of the active log and so SQL Server cannot truncate the log. It has no option but to add more VLFs and, if the growth increment for the log is set to a similarly large size, then the log might grow rapidly, until some existing VLFs become eligible for truncation. As such, it's important that we size the log appropriately initially, and then grow it in appropriately sized steps, to minimize log fragmentation but also to avoid rapid growth. There is also a second reason why it is very important to size the log appropriately and grow it in a very controlled fashion: for log files, each growth event is a relatively expensive operation. It is natural that both data and log files will grow in size over time. SQL Server can optimize the process of adding new data files and expanding existing data files, via instant file initialization (introduced in SQL Server 2005, this allows the data files to allocate space on disk without having to fill the space with zeros). Unfortunately, the same is not true for log files, which still require initialization and "zeroing out" whenever space is allocated for log file creation or growth.
Why can't the transaction log use instant initialization?
For further information about transaction log zeroing, see Paul Randal's blog post, Search Engine Q&A #24: Why can't the transaction log use instant initialization? (http://sqlskills.com/BLOGS/PAUL/post/Search-Engine-QA-24-Why-cant-the-transaction-log-use-instant-initialization.aspx).
If the growth is due primarily to excessive log activity, you need to investigate whether there might be log activity that could be avoided, for example by adjusting how you carry out bulk data and index maintenance operations, so that these operations are not fully logged (i.e. the BULK_LOGGED recovery model is used for these operations). However, any bulk-logged operation will immediately prevent point-in-time recovery to any point within a log backup that contains records relating to the minimally logged operations (refer to Chapter 6 for full details). If this is not acceptable, you must simply accept a large log as a fact, and plan its growth and management (such as frequency of log backups) accordingly, as described in the Proper Log Management section, later in this chapter. If the growth is due to a lack of log space reuse, you need to find out what is preventing this reuse and take steps to correct the issue.
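A common first check, sketched below as a general query rather than one of the book's listings, is the log_reuse_wait_desc column, which reports why SQL Server is currently unable to reuse space in a database's log.

-- Why is SQL Server unable to reuse space in the log?
SELECT  name ,
        recovery_model_desc ,
        log_reuse_wait_desc
FROM    sys.databases
WHERE   log_reuse_wait_desc <> 'NOTHING';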
Index rebuilds
Rebuilding an index offline, using ALTER INDEX REBUILD (or the deprecated DBCC DBREINDEX in SQL Server 2000) drops the target index and rebuilds it from scratch (online index rebuilds do not drop the existing index until the end of the rebuild operation).
Logging and online index rebuilds
Online index rebuild is a fully logged operation in SQL Server 2008 and later, whereas it is minimally logged in SQL Server 2005. Therefore, performing such operations in later SQL Server versions will require substantially more transaction log space. See http://support.microsoft.com/kb/2407439, as well as Kalen Delaney's blog post investigating logging during online and offline index rebuilds, for both FULL and BULK_LOGGED recovery model databases: http://sqlblog.com/blogs/kalen_delaney/archive/2011/03/08/what-gets-logged-for-index-rebuilds.aspx.
In the FULL recovery model, an index rebuild can be a very resource-intensive operation, requiring a lot of space in the transaction log. In the SIMPLE or BULK_LOGGED recovery model, rebuilding an index is a minimally logged operation, meaning that SQL Server logs only the page allocations, not the full contents of the changed pages, therefore reducing the amount of log space required by the operation.
If you switch to the SIMPLE model to perform an index rebuild, the LSN chain will be immediately broken. You'll only be able to recover your database to a point in time contained in the previous transaction log backup. To restart the chain, you'll need to switch back to the FULL model and immediately take a full or differential database backup.
If you switch to the BULK_LOGGED model (see Chapter 6), the LSN chain is always maintained, but there are still implications for your ability to perform point-in-time restores, since a log backup that contains a minimally logged operation can't be used to recover to a point in time. If the ability to perform a point-in-time recovery is paramount for a database, then don't use the BULK_LOGGED recovery model for index rebuilds or any other minimally logged operation, unless you can do it at a time when there is no concurrent user activity in the database. Alternatively, consider performing index reorganizations, in FULL recovery model, where possible.
If the BULK_LOGGED model is used, take steps to minimize the time period during which point-in-time restore is unavailable, and so minimize exposure to data loss. To do this, take a log backup in FULL model, switch to BULK_LOGGED, perform the index rebuild, then switch back to FULL and take another log backup.
A final important point to note is that an ALTER INDEX REBUILD operation occurs in a single transaction. If the index is large, this could represent a long-running transaction that will prevent space reuse in the log for its duration. This means that, even if you rebuild an index in SIMPLE model, where you might think that the log should remain small since it is auto-truncated after a CHECKPOINT operation, and the rebuild is minimally logged, the log file can still expand quite rapidly during an extensive rebuild operation.
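As a rough sketch of the bracketing pattern just described (the database name, table name and backup paths here are purely illustrative, not taken from the book's code download):

USE master;
GO
BACKUP LOG FullRecovery TO DISK = 'D:\SQLBackups\FullRecovery_PreRebuild.trn';
GO
ALTER DATABASE FullRecovery SET RECOVERY BULK_LOGGED;
GO
USE FullRecovery;
GO
ALTER INDEX ALL ON dbo.LogTest REBUILD;  -- minimally logged while in BULK_LOGGED
GO
USE master;
GO
ALTER DATABASE FullRecovery SET RECOVERY FULL;
GO
BACKUP LOG FullRecovery TO DISK = 'D:\SQLBackups\FullRecovery_PostRebuild.trn';
GO

Only the interval between the two log backups is exposed to the "no point-in-time restore" limitation.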
Index reorganization
In contrast to rebuilding an index, reorganizing (defragmenting) an index, using ALTER INDEX REORGANIZE or, in SQL Server 2000, DBCC INDEXDEFRAG (since deprecated), is always a fully logged operation, regardless of the recovery model, and so the actual page changes are always logged. However, index reorganizations generally require less log space than an index rebuild, although this is a function of the degree of fragmentation in the index; a heavily fragmented index will require more log space to reorganize than a minimally fragmented one. Furthermore, the ALTER INDEX REORGANIZE operation is accomplished using multiple, shorter transactions. Therefore, when performed in conjunction with frequent log backups (or when working in SIMPLE model), log space can be made available for reuse during the operation, minimizing the size requirements for the transaction log during the operation.
For example, rebuilding a 20 GB index can require more than 20 GB of space for the rebuild operation because it occurs in a single transaction. However, reorganizing a 20 GB index may require much less log space because each page allocation change in the reorganization is a separate transaction, and so the log records can be truncated with frequent log backups, allowing the log space to be reused.
Chapter 7: Dealing with Excessive Log Growth However, the most eective weapon in guarding against excessive log growth during index maintenance is to maintain only those indexes that really need it. With the SSMS Maintenance Plans Wizard, index maintenance is an all-or-nothing process: you either rebuild (or reorganize) all indexes in your database (and all databases in the maintenance plan) or you maintain none of them. A better approach is to use the sys.dm_db_index_physical_stats DMV to investigate fragmentation and so determine a rebuild/reorganize strategy based on need.
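The following is a minimal sketch of such a needs-based check; the 1,000-page floor and the 5%/30% fragmentation thresholds are the common guideline values discussed below, not fixed rules.

SELECT  OBJECT_NAME(ips.object_id) AS table_name ,
        i.name AS index_name ,
        ips.avg_fragmentation_in_percent ,
        ips.page_count ,
        CASE WHEN ips.avg_fragmentation_in_percent > 30 THEN 'REBUILD'
             WHEN ips.avg_fragmentation_in_percent > 5 THEN 'REORGANIZE'
             ELSE 'NONE'
        END AS suggested_action
FROM    sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') ips
        JOIN sys.indexes i ON ips.object_id = i.object_id
                              AND ips.index_id = i.index_id
WHERE   ips.page_count > 1000
ORDER BY ips.avg_fragmentation_in_percent DESC;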
Ola Hallengren's free maintenance scripts
Ola Hallengren offers a comprehensive set of free maintenance scripts which demonstrate how to use sys.dm_db_index_physical_stats to perform index analysis for intelligent maintenance, and which can be used as a replacement for the Database Maintenance Plans created by the wizards in SSMS (http://ola.hallengren.com).
The best approach, however, is to schedule regular maintenance on only those indexes where you can prove a positive, sustained impact on query performance. Logical fragmentation (index pages in the wrong order) thwarts SQL Server's read-ahead mechanism (http://msdn.microsoft.com/en-us/library/ms191475(v=sql.105).aspx) and makes it less I/O-efficient at reading contiguous pages on disk. However, this only really affects large range scans from disk. Even for very fragmented indexes, if you are not scanning the table, rebuilding or reorganizing indexes might not help performance. Reduced page density (many gaps caused by page splits and deletes) will cause pages to take up more space on disk and in memory, and require more I/O bandwidth to transfer the data. Again, though, this form of fragmentation won't really affect infrequently modified indexes, and so rebuilding them won't help.
Before scheduling index maintenance, ask yourself what performance metrics benefited from the maintenance. Did it reduce I/O significantly? How much did it improve the performance of your most expensive queries? Was the positive impact a sustained one? If the answers are "no" or "don't know," then it's probable that regular index maintenance is not the right long-term answer.
Finally, it's also worth noting that maintaining small indexes is generally not worthwhile. The commonly cited threshold is around 1,000 pages. Paul Randal suggested these values as guidelines when he was managing the storage engine development team at Microsoft, and they are documented in Books Online. Note, though, that this is guideline advice only and may not be appropriate for all environments, as discussed by Paul in his blog post, Where do the Books Online index fragmentation thresholds come from? (http://www.sqlskills.com/BLOGS/PAUL/post/Where-do-the-Books-Online-index-fragmentation-thresholds-come-from.aspx).
Listing 7.1:
Figure 7.1 shows some sample output (we split the result set in two, for readability).
Figure 7.1:
Incidentally, if we rerun this example but with ALTER INDEX REORGANIZE, then the value in the Log Bytes Used column reduces from about 159 MB to around 0.5 MB.
Listing 7.2:
The value of the log_reuse_wait_desc column will show the current reason why log space cannot be reused. If you've run the previous example (Listing 7.1), then it's likely that the FullRecovery database will display the value LOG_BACKUP in this column (more on this in the next section). It may be that more than one factor is preventing log reuse; the sys.databases view will only show one of the reasons, so it is possible to resolve one problem, query sys.databases again, and see a different log_reuse_wait reason. The possible values for log_reuse_wait_desc are listed in Books Online (http://msdn.microsoft.com/en-us/library/ms178534.aspx), but we'll cover the most common causes here, and explain how to safely ensure that space can start to be reused.
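The full text of Listing 7.2 isn't reproduced in this extract; in essence it is a query against sys.databases, along these lines (a minimal sketch, filtered to the example database):

SELECT  name ,
        recovery_model_desc ,
        log_reuse_wait_desc
FROM    sys.databases
WHERE   name = 'FullRecovery';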
Listing 7.3:
If a lack of log backups is the cause of log growth problems, the first thing to do is to verify that the database in question really does need to be operating in FULL recovery. This will be true if it must be possible to restore the database to an arbitrary point in time, or to the point of failure in the case of a disaster, or if FULL recovery model is required for another reason (such as database mirroring). If the Recovery Point Objective (RPO) in the SLA stipulates a maximum of 15 minutes of potential data loss, then it's highly unlikely you can fulfill this with only full and differential database backups and, again, log backups will be necessary.
However, if it turns out there are no log backups simply because they are not required, then the database should not be operating in FULL recovery; we can switch to using the SIMPLE recovery model, where the inactive portion of the transaction log is automatically marked as reusable, at checkpoint.
If the database does need to operate in the FULL recovery model, then start taking log backups, or investigate the need to take more frequent log backups. The frequency of the transaction log backups depends on a number of factors, such as the frequency of data changes, and on SLAs for acceptable data loss in the event of a crash. In addition, you should take steps to ensure that log growth is controlled and predictable in future, as described in the Proper Log Management section, later in this chapter.
Active transactions
If the value returned for log_reuse_wait_desc is ACTIVE_TRANSACTION, then you are suffering from the second most common cause of a full or large transaction log in SQL Server: long-running or uncommitted transactions. Rerun the transaction from Listing 7.1, but without committing it, and then rerun Listing 7.2 and you should see this value listed (don't forget to go back and commit the transaction).
As discussed in the Log Truncation and Space Reuse section of Chapter 2, a VLF inside the transaction log can only be truncated when it contains no part of the active log. If the database is using the FULL or BULK_LOGGED recovery model, only a log backup operation can perform this truncation. Long-running transactions in a database delay truncation of the VLFs that contain the log records generated after the start of the transaction, including the log records generated by changes to data in the database by other concurrent sessions, even when those changes have been committed. Additionally, the amount of space required by a long-running transaction is increased by space reservations for "compensation log records," which are the log records that would be generated if the transaction were rolled back. This reservation is required to ensure that the transaction can be reverted successfully without running out of log space during the rollback.
Another common cause of the ACTIVE_TRANSACTION value for log_reuse_wait_desc is the presence of "orphaned" explicit transactions that somehow never got committed. Applications that allow for user input inside a transaction are especially prone to this kind of problem.
Long-running transactions
One of the most common operations that results in a long-running transaction, and also generates large numbers of log records, is archiving or purging of data from a database. Data retention tends to be an afterthought in database design, usually considered only after the database has been active for a period and is approaching the capacity limits of the available storage on a server.
Usually, when the need to archive data arises, the first reaction is to remove the unneeded data from the database using a single DELETE statement, as shown in Listing 7.4. To produce some simple test data, this script uses a simplified version of Jeff Moden's random data generator (see Chapter 1, Listing 1.3), modified slightly to produce dates into 2012.
USE FullRecovery ;
GO
IF OBJECT_ID('dbo.LogTest', 'U') IS NOT NULL
    DROP TABLE dbo.LogTest ;

SELECT TOP 500000
        SomeDate = CAST(RAND(CHECKSUM(NEWID())) * 3653.0 + 37534.0 AS DATETIME)
INTO    dbo.LogTest
FROM    sys.all_columns ac1
        CROSS JOIN sys.all_columns ac2 ;

-- delete all but the last 60 days of data
DELETE  dbo.LogTest
WHERE   SomeDate < GETDATE() - 60
Listing 7.4:
Depending on the number of rows that exist in the date range to be deleted, this can become a long-running transaction that will cause transaction log growth issues, even when the database is using the SIMPLE recovery model. The presence of cascading FOREIGN KEY constraints or auditing triggers exacerbates the problem. If other tables reference the target table via FOREIGN KEY constraints designed to CASCADE ON DELETE, then SQL Server will also log details of the rows deleted through the cascading constraint. If the table has a DELETE trigger on it, for auditing data changes, SQL Server will also log the operations performed during the trigger's execution.
To minimize the impact on the transaction log, the data purge operation should be broken down into a number of shorter, individual transactions. There are a number of ways to break a long-running transaction down into smaller batches. If cascading constraints or a DELETE trigger exist for a table, we can perform the DELETE operation inside of a loop, to delete one day of data at a time, as shown in Listing 7.5. Note that, in this simple example, there are insufficient rows in our table to justify use of this technique over a simple DELETE; it is better suited to data purges of millions of rows. Note also that speed is not necessarily the primary concern with batch deletes (Listing 7.5 will run much slower than Listing 7.4). The bigger concerns are avoiding explosive log growth and lock escalation.
DECLARE @StopDate DATETIME ,
    @PurgeDate DATETIME

SELECT  @PurgeDate = DATEADD(DAY, DATEDIFF(DAY, 0, MIN(SomeDate)), 0) ,
        @StopDate = DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()) - 60, 0)
FROM    dbo.LogTest

WHILE @PurgeDate < @StopDate
    BEGIN
        DELETE  dbo.LogTest
        WHERE   SomeDate < @PurgeDate
        SELECT  @PurgeDate = DATEADD(DAY, 1, @PurgeDate)
    END
Listing 7.5:
Using this model for purging data, the duration of each DELETE transaction is only the time required to delete a single day's data from the table, plus the time required for any triggers or cascading constraints to perform their operations. If the database uses the SIMPLE recovery model, the next checkpoint will truncate the log records generated by each daily purge. If the database uses the FULL or BULK_LOGGED recovery model, the next log backup will truncate the log records generated by each daily purge, as long as no part of the active log exists inside the VLFs containing log records relating to the data purge.
When cascading constraints or auditing triggers are not a factor in the process, we can use a different method to purge the data from the table while minimizing the transaction duration. Instead of performing a single-day DELETE operation, which can affect more or less data, depending on the number of rows that exist for a specific date, use of the TOP operator inside the DELETE statement will limit the number of rows affected by each loop of the operation. By capturing into a variable the number of rows affected by the DELETE operation, using @@ROWCOUNT, the operation can continue to purge data from the table in small batches, until the value of @@ROWCOUNT is less than the number of rows specified in the TOP clause of the DELETE statement, as shown in Listing 7.6. This method only works when triggers and cascading constraints aren't being used because, when they are, the result of @@ROWCOUNT will not be the actual number of rows deleted from the base table, but instead the number of rows affected by the trigger execution or by enforcing the cascading constraint.
DECLARE @Criteria DATETIME ,
    @RowCount INT

SELECT  @Criteria = GETDATE() - 60 ,
        @RowCount = 10000

WHILE @RowCount = 10000
    BEGIN
        DELETE TOP ( 10000 )
        FROM    dbo.LogTest
        WHERE   SomeDate < @Criteria
        SELECT  @RowCount = @@ROWCOUNT
    END
Listing 7.6: Using the TOP operator inside the DELETE statement for data purges.
These methods work in any edition of SQL Server 2000, 2005, and 2008 to minimize transaction duration during data purge operations. However, if the database is on SQL Server 2005 or 2008 Enterprise Edition, and the data purging process runs regularly, then an even better way to purge the data is to partition the table using a sliding window partition on the column used to delete the data. This will have even less impact on the transaction log, since the partition containing the data can be switched out of the table and truncated, an operation for which SQL Server logs only the extent de-allocations. A minimal sketch of the switch-and-truncate step follows.
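The sketch below assumes dbo.LogTest has been partitioned on SomeDate, and that dbo.LogTest_Archive is a hypothetical empty table with an identical structure, created on the filegroup that holds partition 1 (the oldest data); neither assumption comes from the book's own examples.

ALTER TABLE dbo.LogTest
    SWITCH PARTITION 1 TO dbo.LogTest_Archive;  -- metadata-only change

TRUNCATE TABLE dbo.LogTest_Archive;             -- logs only the extent de-allocations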
Managing archiving
It is well outside the scope of this book to delve into full, automated archiving schemes. However, a possible archiving process could involve partitioning, with duplicate schemas between tables, allowing a partition to be switched out of one table and into another, minimizing the active portion of data in the main OLTP table and reducing the archiving process to metadata changes only. Kimberly Tripp has produced a detailed white paper called Partitioned Tables and Indexes in SQL Server 2005, which also covers the sliding window technique (see http://msdn.microsoft.com/en-us/library/ms345146(v=sql.90).aspx).
Uncommitted transactions
By default, SQL Server wraps any data modification statement in an implicit transaction to ensure that, in the event of a failure, it can roll back the changes already made at the point of failure, returning the data to a consistent state. If the changes succeed, the implicit transaction is committed to the database. In contrast to implicit transactions, which occur automatically, we create explicit transactions in code to wrap multiple changes into a single transaction, ensuring that all the changes can be undone by issuing a ROLLBACK command, or persisted by issuing a COMMIT. When used properly, explicit transactions can ensure that data modifications that span multiple tables complete successfully as a unit, or not at all. When used incorrectly, however, orphaned transactions remain active in the database, preventing truncation of the transaction log, and so resulting in the transaction log growing or filling up. A defensive coding pattern is sketched below.
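The following is a minimal sketch of an explicit transaction written so that every code path ends in a COMMIT or a ROLLBACK; the table and column names are hypothetical.

BEGIN TRY
    BEGIN TRANSACTION;

    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountID = 2;

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    -- every failure path closes the transaction, so it cannot be left orphaned
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;

    -- re-raise the error to the caller
    DECLARE @ErrMsg NVARCHAR(2048) = ERROR_MESSAGE();
    RAISERROR(@ErrMsg, 16, 1);
END CATCH;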
There are a number of causes of orphaned transactions in SQL Server, and it's beyond the scope of this chapter to investigate them in full detail. However, some of the most common causes are:
• application timeouts caused by a long-running transaction
• incorrect error handling in T-SQL or application code
• failure during trigger execution
• linked server failures resulting in orphaned distributed transactions
• no corresponding COMMIT/ROLLBACK statement to a BEGIN TRANSACTION command.
Once a transaction starts, it will remain active until the connection that created the transaction issues a COMMIT or ROLLBACK statement, or the connection disconnects from the SQL Server (the exception is when using bound connections, which allow sessions to share locks). Modern applications generally utilize connection pooling, keeping connections to the SQL Server in a pool for reuse by the application, even when the application code calls the Close() method on the connection. It is critical that you understand this last point when troubleshooting orphaned transactions since, even though the connection is reset before being returned to the application's connection pool, open transactions continue to exist in the database if they have not been properly terminated.
The DBCC OPENTRAN command can accept the database name as an input parameter, in the format DBCC OPENTRAN(DatabaseName), where DatabaseName is the name of the database to check for open transactions. If an active transaction exists in the database, this command will output information similar to that shown in Listing 7.7.
DBCC OPENTRAN (FullRecovery)

Transaction information for database 'FullRecovery'.

Oldest active transaction:
    SPID (server process ID): 56
    UID (user ID) : -1
    Name          : user_transaction
    LSN           : (897:15322:1)
    Start time    : Sep 18 2012  1:01:29:390PM
    SID           : 0x010500000000000515000000fd43461e19525f12828ba628ee0a0000
DBCC execution completed. If DBCC printed error messages, contact your system administrator.
Listing 7.7:
DBCC OPENTRAN reports only the oldest active transaction, and the primary indicator of whether or not the active transaction is problematic is the Start Time. Generally, uncommitted transactions that become problematic with regard to transaction log growth have been open for a long period of time. The other important piece of information is the SPID (server process ID; in the DMVs this is replaced by session_id), which identifies the session that created the open transaction. We can use the SPID to determine whether the transaction is actually an orphaned transaction or just a long-running one, by querying the sysprocesses view (in SQL Server 2000) or the sys.dm_exec_sessions and sys.dm_exec_connections DMVs in SQL Server 2005 and later, as shown in Listing 7.8.
Note that the sysprocesses view is still available in SQL Server 2005 and later for backwards compatibility. In each query, simply replace the session_id value with the one you saw when running Listing 7.7 (we have commented out certain columns, simply for readability of the output).
USE master
GO
SELECT  spid ,
        status ,
        --hostname ,
        --program_name ,
        --loginame ,
        login_time ,
        last_batch ,
        ( SELECT    text
          FROM      :: fn_get_sql(sql_handle)
        ) AS [sql_text]
FROM    sysprocesses
WHERE   spid = 56

USE FullRecovery
GO
SELECT  s.session_id ,
        s.status ,
        --s.host_name ,
        --s.program_name ,
        --s.login_name ,
        s.login_time ,
        s.last_request_start_time ,
        s.last_request_end_time ,
        t.text
FROM    sys.dm_exec_sessions s
        JOIN sys.dm_exec_connections c ON s.session_id = c.session_id
        CROSS APPLY sys.dm_exec_sql_text(c.most_recent_sql_handle) t
WHERE   s.session_id = 56
Listing 7.8:
If the session is in a runnable, running, or suspended status, then it is likely that the source of the problem is a long-running, rather than orphaned, transaction; however, only further investigation will confirm this. It is possible that an earlier transaction failed and the connection was reset, for use under connection pooling, and that the currently executing statement is not associated with the open transaction. In SQL Server 2005 and later, we can use the sys.dm_tran_session_transactions and sys.dm_tran_database_transactions DMVs to gather information specific to the open transaction, including the transaction start time, the number of log records used by the open transaction, and the bytes of log space used, as we saw previously in Listing 7.1. Listing 7.9 shows a simplified version, with sample output.
SELECT  st.session_id ,
        st.is_user_transaction ,
        dt.database_transaction_begin_time ,
        dt.database_transaction_log_record_count ,
        dt.database_transaction_log_bytes_used
FROM    sys.dm_tran_session_transactions st
        JOIN sys.dm_tran_database_transactions dt
            ON st.transaction_id = dt.transaction_id
               AND dt.database_id = DB_ID('FullRecovery')
WHERE   st.session_id = 56
Listing 7.9:
Unless the application was specifically designed to check for, and handle, orphaned transactions, the only way to clear the transaction is to KILL the session (sketched below). This will cause the transaction to roll back as the connection terminates, allowing the space in the log to be made available for reuse during the next log backup. However, the ramifications of performing the rollback must be understood.
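As a minimal sketch, using the example SPID (56) reported by DBCC OPENTRAN in Listing 7.7:

KILL 56;                     -- terminates the session; its open transaction rolls back

KILL 56 WITH STATUSONLY;     -- check progress of the rollback, if it is long running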
Replication
During transactional replication, it is the job of the log reader agent to read the transaction log, looking for log records that are associated with changes that need to be replicated to subscribers (i.e. are "pending replication"). Once the changes are replicated, it marks the log entry as "replicated." Slow or delayed log reader activity can lead to records being left as "pending replication" for long periods, during which time they will remain part of the active log, and so the parent VLF cannot be truncated. A similar problem exists for log records required by the Change Data Capture (CDC) feature.
In either case, the log_reuse_wait_desc column of sys.databases will show REPLICATION as the root cause of the problem. The problem will also reveal itself in the form of bottlenecks in the throughput performance of the transaction log disk array, specifically, delayed read operations under concurrent write loads. Writes to the log file occur sequentially, but the read operations associated with the log reader agent and log backups also read the file sequentially. Having sequential reads and writes occurring at the same time can, depending on the level of activity in the system and the size of the active portion of the log, result in random I/O activity, as the disk heads have to change position to read from the beginning of the active log and then write to the end of the active log. We can use the Physical Disk\Disk Reads/sec and Physical Disk\Disk Writes/sec Perfmon counters to troubleshoot this type of problem. See Chapter 2 of the free eBook, Troubleshooting SQL Server (http://www.simple-talk.com/books/sql-books/troubleshooting-sql-server-a-guide-for-the-accidental-dba/), for further details on this topic.
The first step in troubleshooting these REPLICATION wait issues is to verify that the log reader SQL Agent jobs are actually running. If they are not, attempt to start them. If this fails, you'll need to find out why. If the jobs are running but the REPLICATION waits persist, and the transaction log is growing rapidly, you need to find some way to get the relevant log entries marked as "replicated" so that space in their parent VLFs can be reused. Unfortunately, there is no perfect solution that will avoid side effects to replication or CDC in the environment, but you could try one of the solutions below.
• In the case of transactional replication, use the sp_repldone command to mark all of the log records currently waiting on the log reader as replicated, but this will require re-initialization of the subscribers. With CDC, this command will not resolve the problem with transaction log growth.
• Disable CDC or replication and perform a manual resynchronization of the data. Having disabled CDC or replication, the pending replication log records in the transaction log will no longer be pending, and the next log backup, in FULL or BULK_LOGGED recovery, or CHECKPOINT operation in SIMPLE recovery, will clear them out. However, the trade-off is that the environment will require manual synchronization of the data for CDC, or re-initialization of the subscribers for replication, if these features are added back to the database.
Remember that simply switching to the SIMPLE recovery model, in the hope of truncating the log, will not work, since replication and CDC are both supported under SIMPLE recovery, and the log records will continue to be required until the log reader SQL Agent process has processed them.
Snapshot Replication schema change issue
There is a known issue with Snapshot Replication in SQL Server 2005 that causes log entries that are marked for replication of schema changes not to be unmarked when the changes are replicated. This problem, and how to work around it using sp_repldone, is explained in the blog post Size of the transaction log increasing and cannot be truncated or shrunk due to Snapshot Replication (http://blogs.msdn.com/b/sqlserverfaq/archive/2009/06/01/size-of-the-transaction-log-increasing-and-cannot-be-truncated-or-shrinked-due-to-snapshot-replication.aspx).
ACTIVE_BACKUP_OR_RESTORE
When the log_reuse_wait_desc column shows ACTIVE_BACKUP_OR_RESTORE as the current wait description, a long-running full or differential backup of the database is the most likely cause of the log reuse problems. During a full or differential backup of the database, the backup process delays log truncation so that the active portion of the transaction log can be included in the backup. This allows changes made to database pages during the backup operation to be undone when the backup is restored WITH RECOVERY, to bring the database to a consistent state. If such waits are causing persistent problems, you'll need to investigate ways to optimize the backup process, such as improving the performance of the backups (via backup compression) or improving the performance of the underlying disk I/O system.
DATABASE_MIRRORING
When the log_reuse_wait_desc column shows DATABASE_MIRRORING as the current wait description, asynchronous database mirroring operations may be the cause of the log reuse issues.
In synchronous mirroring, transactions on the principal are only committed once their related log records have been transferred to the mirror database. For asynchronous database mirroring, the log records are transferred later and the principal's log can't be truncated until they are. When mirroring problems arise, a large number of log records on the principal can remain part of the active log, preventing log space reuse, until they are copied over to the mirror.
For synchronous database mirroring, we may see a value of DATABASE_MIRRORING if the mirror is not contactable, due to a broken or very slow connection, or suspension of the mirroring session. For asynchronous database mirroring, we may well see this value during normal operation, as well as during connection problems.
In such cases, I would first check the status of the mirroring session for the affected database(s). If they are not synchronizing correctly, then you will need to troubleshoot the cause of the failed connection between the principal and the mirror. One of the most common problems with database mirroring, when certificates are used to secure the endpoints, is the expiration of the certificates, requiring that they be re-created. A full discussion of troubleshooting mirroring connectivity problems is outside the scope of this book but, unless the databases are properly synchronizing, so that the log records are being sent to the mirror, the active portion of the transaction log on the principal will continue to grow and cannot be truncated without breaking the mirroring setup.
If the transaction rate on the principal greatly exceeds the rate at which log records can be transferred to the mirror, then the log on the principal can grow rapidly. If the mirror server is being used for reporting, by creating database snapshots, verify that the disk I/O configuration for the mirror is not saturated, by using the Perfmon disk counters mentioned earlier. If this is where the problem lies, eliminating use of the mirror server for reporting may provide temporary relief. If the problem is strictly the sheer volume of transactions, and the database is not running on SQL Server 2008 or later, then upgrading may resolve the problem, thanks to the log stream compression introduced in SQL Server 2008.
The best approach, however, is to determine the cause of the mirroring issue and resolve it. For example, tuning operations that produce a significant number of log records, such as bulk loading data or reorganizing indexes, may reduce the overall impact on the system during the operation.
In the type column (of msdb.dbo.backupset), a D represents a full database backup, an L represents a log backup, and an I represents a differential backup. If there are no log backups, or they are very infrequent, then your best course of action is to take a log backup (assuming the database is operating in FULL or BULK_LOGGED recovery model). Hopefully, this will free up substantial space within the log, and you can then implement an appropriate log backup scheme and log file growth management strategy.
If, for some reason, it is not possible to perform a log backup, due to a lack of disk space, or because the time it would take to perform a log backup exceeds the acceptable time to resolve the problem, then, depending on the disaster recovery policy for the database in question, it might be acceptable to force a truncation of the log by temporarily switching the database to the SIMPLE recovery model, so that inactive VLFs in the log can be truncated on CHECKPOINT. You can then switch the recovery model back to FULL and perform a new full database backup (or a differential backup, assuming a full backup was taken at some previous time) to restart the log chain for point-in-time recovery. A sketch of this technique follows. Of course, you'll still need to investigate the problem fully, in order to make sure that the space isn't simply devoured again. Bear in mind also that, as discussed previously, if the problem preventing space reuse is anything other than LOG_BACKUP, this technique won't work, since those records will simply remain part of the active log, preventing truncation.
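A minimal sketch of the "temporary switch to SIMPLE" technique; the database name and backup path are illustrative, and this is only appropriate if your disaster recovery policy can tolerate the interrupted log chain.

USE master;
GO
ALTER DATABASE FullRecovery SET RECOVERY SIMPLE;
GO
CHECKPOINT;   -- inactive VLFs become eligible for truncation
GO
ALTER DATABASE FullRecovery SET RECOVERY FULL;
GO
-- restart the log chain for point-in-time recovery
BACKUP DATABASE FullRecovery
TO DISK = 'D:\SQLBackups\FullRecovery_PostSwitch.bak';
GO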
If a lack of log backups isn't the problem, or taking a log backup doesn't solve the problem, then investigating the cause will require a little more time. If it is quick and easy to make extra space on the log drive, then do so. This might mean shifting off other files, adding capacity to the current log drive, or adding an extra log file on a different disk array, but it will buy you the bit of breathing space you need to get the database out of read-only mode and perform a log backup.
If a log backup fails to free up space, you need to find out what is preventing space reuse in the log. Interrogate sys.databases (Listing 7.2) to find out if anything is preventing reuse of space in the log, and take appropriate action, as described in the earlier Lack of log space reuse section. If this reveals nothing, you'll need to investigate further and find out which operations are causing the excessive logging that led to the log growth, as described in the Diagnosing a Runaway Transaction Log section.
Ultimately, having resolved any space reuse issue, we may still have a log file that consumes the vast majority of the space on the drive. As a one-off measure, i.e. assuming we will take steps to ensure proper management of log growth in the future (see the Proper Log Management section, following shortly), it is acceptable to use DBCC SHRINKFILE (see http://msdn.microsoft.com/en-us/library/ms189493.aspx) to reclaim the space used by a bloated transaction log file. We'll provide an example of how to do this in Chapter 8, and a brief sketch follows. We can either specify a target_size to which to shrink the log file, or we can specify 0 (zero) as the target size, shrink the log to its smallest possible size, and then immediately resize it to a sensible size using ALTER DATABASE. The latter is the recommended way, as it minimizes fragmentation of the log file. This fragmentation issue is the main reason why you should never schedule regular DBCC SHRINKFILE tasks as a means of controlling the size of the log; we discuss this in more detail in the next section.
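A minimal, one-off sketch of the shrink-then-resize approach; the logical log file name and the target size are assumptions for illustration only.

USE FullRecovery;
GO
DBCC SHRINKFILE (FullRecovery_log, 0);      -- shrink to the smallest possible size
GO
ALTER DATABASE FullRecovery
MODIFY FILE ( NAME = FullRecovery_log, SIZE = 8192MB );   -- then resize in one step
GO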
The really nasty twist to this technique is that, unlike with BACKUP LOG WITH TRUNCATE_ONLY, SQL Server is unaware that the log records have simply been discarded. As far as SQL Server is concerned, a log backup has been performed, the log records are safely stored in a backup file, so the LSN chain is intact and any inactive VLFs in the live log can safely be truncated. Any subsequent, conventional log backups will succeed, but they will be entirely useless from the point of view of disaster recovery, since a log backup file is "missing" and so the database can only be restored to some point in time covered by the last standard log backup taken before BACKUP LOG TO DISK='NUL' was issued.
Do not use either of these techniques. The right way to "force" log truncation is to temporarily switch the database into the SIMPLE recovery model, as discussed earlier.
The best practice for the transaction log file continues to be to size it appropriately up front, so that it does not have to grow under normal operations. Then, monitor its usage periodically to determine when it needs to be grown manually, allowing you to choose an appropriate growth size and to control the number and size of VLFs that will be added to the log file. We'll discuss this in more detail in Chapter 8.
Summary
The transaction log is critical to the operation of a SQL Server database, and to the ability to minimize data loss in the event of a disaster. In a situation where the log is growing explosively, or is even full, the DBA needs to act quickly to diagnose and fix the problem, but it's also important to act calmly and avoid unthinking reactions, such as forcing log truncation and then scheduling regular log shrinks, which will cause more harm than good.
Acknowledgements
Many thanks to Jonathan Kehayias, lead author of Troubleshooting SQL Server (http://www.simple-talk.com/books/sql-books/troubleshooting-sql-server-a-guide-for-the-accidental-dba/), available as a free eBook, for contributing additional material to this chapter.
Physical Architecture
The correct physical hardware and architecture will help ensure that you get the best possible log throughput, and there are a few "golden rules." Others have covered this before, notably Kimberly Tripp in her much-referenced blog post, 8 Steps to better Transaction Log throughput (http://www.sqlskills.com/blogs/kimberly/post/8-Steps-to-better-Transaction-Log-throughput.aspx), so we won't go into detail again here. Note that, in designing the underlying physical architecture for the log file, our primary goal is to optimize log write throughput. SQL Server writes to the log in response to every transaction that adds, removes or modifies data, as well as in response to database maintenance operations such as index rebuilds or reorganizations, statistics updates, and more.
Note that we've based the SIZE and FILEGROWTH settings for the data and log files on those for AdventureWorks2008. The code for this example (Listings 8.1 to 8.3) is in the YouOnlyNeed1Log.sql file, in the code download.
Listing 8.1:
GO
CREATE TABLE dbo.Persons
    (
      PersonID INT NOT NULL
                   IDENTITY ,
      FName VARCHAR(20) NOT NULL ,
      LName VARCHAR(30) NOT NULL ,
      Email VARCHAR(7000) NOT NULL
    );
GO
Listing 8.2:
Now, we'll add 15K rows to the table and run DBCC LOGINFO. Note that, in our tests, we filled the Persons table with data from the Person.Contact table in the AdventureWorks 2005 database. However, the code download file contains alternative code, which will work with AdventureWorks2008 and AdventureWorks2012.
GO
Listing 8.3:
SQL Server is sequentially filling the VLFs in the primary log file (FileID 2), followed by those in the secondary one (FileID 3). It has also auto-grown the primary log file (please refer back to Chapter 2 for a more detailed example of how SQL Server auto-grows the transaction log).
If we continue to add rows, SQL Server will continue to grow both files as required, and fill the VLFs sequentially, one VLF at a time. Figure 8.1 shows the situation after rerunning Listing 8.3 for 95,000 rows (110K rows added in total). Now we have 12 VLFs for the primary log file and 8 for the secondary log file.
Figure 8.1:
In this situation, any operation that reads the log will start by reading a block of four VLFs in the primary log file (FSeqNo 36-39), followed by four VLFs in the secondary log file (FSeqNo 40-43), followed by four more in the primary log file, and so on. This is why multiple log files can reduce I/O efficiency, as we'll discuss further in the next section.
The only reason to add an additional log file is in exceptional circumstances where, for example, the disk housing the log file is full (see Chapter 7) and we need, temporarily, to add an additional log file as the quickest means to get the SQL Server out of read-only mode. However, as soon as it's no longer required, the additional file should be removed, as discussed later, in the section What to Do If Things Go Wrong.
Ideally, every database will have one log file, on a dedicated disk array, though in many systems this remains an ideal rather than a practical reality. For similar reasons, relating to sequential I/O efficiency, it's also important to defragment the physical disk drives before we create the log file.
RAID 1+0 is a nested RAID level known as a "striped pair of mirrors." It provides redundancy by first mirroring each disk, using RAID 1, and then striping those mirrored disks with RAID 0, to improve performance. There is a significant monetary cost increase associated with this configuration, since only half of the disk space is available for use. However, it offers the best redundancy since, potentially, it allows for multiple disk failures while still leaving the system operational, and without degrading system performance.
A common and somewhat cheaper alternative is RAID 5, "striping with parity," which stripes data across multiple disks, as per RAID 0, and stores parity data in order to provide protection from single disk failure. RAID 5 requires fewer disks for the same storage capacity, compared to RAID 1+0, and offers excellent read performance. However, the need to maintain parity data incurs a performance penalty for writes. While this is less of a problem for modern storage arrays, it's the reason why many DBAs don't recommend it for transaction log files, which primarily perform sequential writes and require the lowest possible write latency.
If, as per our previous suggestion, you are able to isolate a single database's log file on a dedicated array, at least for those databases with the heaviest I/O workload, then it may be possible to use the more expensive RAID 1+0 for those arrays, and RAID 5, or RAID 1, for lower-workload databases.
To give an idea of the I/O performance offered by various RAID levels, following are three possible RAID configurations for a 400 GB database that performs a balanced mix of random read and write operations, along with the resulting theoretical I/O throughput rates, based on a 64K random I/O workload for SQL Server.
1. RAID 1 using 2 x 600 GB 15K RPM disks => 185 IOPS at 11.5 MB/sec.
2. RAID 5 using 5 x 146 GB 15K RPM disks => 345 IOPS at 22 MB/sec.
3. RAID 10 using 14 x 73 GB 15K RPM disks => 1609 IOPS at 101 MB/sec.
Remember, though, that these numbers are theoretical, based only on the potential I/O capacity of the disks in a given configuration. They take no account of other factors that can, and will, have an impact on overall throughput, including RAID controller cache size and configuration, RAID stripe size, disk partition alignment, and NTFS allocation unit size. The only way to be sure that your selected disk configuration will cope gracefully with the workload placed on it by your databases is to perform proper benchmarking of the I/O subsystem, prior to usage.
Benchmarking storage configurations for SQL Server
A number of tools exist for measuring the I/O throughput of a given configuration, the two most common being SQLIO (http://www.microsoft.com/download/en/details.aspx?id=20163) and IOmeter (http://www.iometer.org/). In addition, there is SQLIOSim, for testing the reliability and integrity of a disk configuration (http://support.microsoft.com/kb/231619/en-us).
SQL Server decides how many VLFs to add at each growth event based on the size of that growth increment, as follows (though, for very small growth increments, it will sometimes add fewer than four VLFs):
• less than 64 MB: each auto-growth event will create 4 new VLFs
• 64 MB to 1 GB: 8 VLFs
• more than 1 GB: 16 VLFs.
For example, if we create a 64 MB log file and set it to auto-grow in 16 MB increments, then the log file will initially have 8 VLFs, each 8 MB in size, and SQL Server will add 4 VLFs, each 4 MB in size, every time the log grows. If the database attracts many more users than anticipated, but the file settings are left untouched, by the time the log reaches 10 GB in size, it will have grown about 640 times, and will have over 2,500 VLFs.
Towards the other end of the scale, if we grow a log in 16 GB chunks, then each growth will add 16 VLFs, each 1 GB in size. With large VLFs, we risk tying up large portions of the log that SQL Server cannot truncate, and if some factor further delays truncation, meaning the log has to grow, the growth will be rapid.
The trick is to obtain the right balance. The maximum recommended auto-growth size is about 8 GB (advice offered by Paul Randal in his Log File Internals and Maintenance video, http://technet.microsoft.com/en-us/sqlserver/gg313762.aspx). Conversely, the growth increments must be large enough to avoid an unreasonably large number of VLFs.
There are two main reasons to avoid frequent small auto-grow events. One is that, as discussed in Chapter 7, log files cannot take advantage of instant file initialization, so each log growth event is relatively expensive, compared to data file growth, in terms of resources. A second is that a fragmented log can impede the performance of operations that read the log.
Many operations will need to read the transaction log, including:
• Full, differential and log backups (though only the latter will need to read substantial portions of the log).
• The crash recovery process, to reconcile the contents of data and log files, undoing the effects of any uncommitted transactions, and redoing the effects of any that were committed and hardened to the log, but not to the data files (see Chapter 1).
• Transactional replication, where the log reader reads the log when moving changes from the publisher to the distributor.
• Database mirroring, where the log on the mirror database gets read when transferring the latest changes from the principal to the mirror.
• Creating a database snapshot, which requires the crash recovery process to run.
• DBCC CHECKDB, which creates a database snapshot when it runs.
• Change Data Capture, which uses the transactional replication log reader to track data changes.
Ultimately, the question of a "reasonable" number of VLFs in a log file will depend on the size of the log. In general, Microsoft regards more than about 200 VLFs as a possible cause for concern, but in a very big log file (say 500 GB) having only 200 VLFs could also be a problem, with the VLFs being too large and limiting space reuse. A quick way to check the VLF count is sketched below.
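One way to count VLFs on the SQL Server 2005/2008-era instances used for these tests is to capture the output of DBCC LOGINFO, which returns one row per VLF (the column list below matches those versions; SQL Server 2012 adds an extra RecoveryUnitId column that would need to be added to the temporary table).

CREATE TABLE #LogInfo
    (
      FileID INT ,
      FileSize BIGINT ,
      StartOffset BIGINT ,
      FSeqNo BIGINT ,
      Status INT ,
      Parity INT ,
      CreateLSN NUMERIC(38)
    );

INSERT  INTO #LogInfo
        EXEC ( 'DBCC LOGINFO' );   -- run in the database you want to check

SELECT  COUNT(*) AS VLFCount
FROM    #LogInfo;

DROP TABLE #LogInfo;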
Transaction Log VLFs too many or too few?
Kimberly Tripp's article discusses this topic in more detail: http://www.sqlskills.com/BLOGS/KIMBERLY/post/Transaction-Log-VLFs-too-many-or-too-few.aspx.
In order to get at least some idea of the size of the impact of a fragmented log on operations that need to read it, we'll run some tests to look at the impact on two processes that read the log extensively, namely log backups and the crash recovery process.
Disclaimer
The tests that follow in no way reflect the reality of busy multi-user databases running on server-grade hardware, with specific RAID configurations and so on. We ran them on an isolated SQL Server 2008 instance, installed on a virtual machine. Your figures are likely to vary, and the observed impact will obviously be lower for faster disks. The idea is simply to offer some insight into potential log fragmentation issues, and some pointers on how to investigate their potential impact.
Note, finally, that Linchi Shea has demonstrated a big effect on the performance of data modifications when comparing a database with 20,000 VLFs to one with 16 VLFs. See: http://sqlblog.com/blogs/linchi_shea/archive/2009/02/09/performance-impact-a-large-number-of-virtual-log-files-part-i.aspx.
Listing 8.4:
Now, we're going to grow the log in lots of very small increments, as shown in Listing 8.5, in order to produce a very fragmented log file.
DECLARE @LogGrowth INT = 0;
DECLARE @sSQL NVARCHAR(4000)

WHILE @LogGrowth < 4096
    BEGIN
        SET @sSQL = 'ALTER DATABASE PersonsLots
                    MODIFY FILE (NAME = PersonsLots_log,
                    SIZE = ' + CAST(4096 + 2048 * @LogGrowth AS VARCHAR(10))
                    + 'KB );'
        EXEC ( @sSQL );
        SET @LogGrowth = @LogGrowth + 1;
    END
Listing 8.5:
Here we grow the log in 4,096 small increments (2,048 KB each) to a total size of 8 GB (4096 + 2048*4096 KB). The log will grow 4,096 times, adding 4 VLFs each time, for a grand total of 4 + (4096*4) = 16,388 VLFs.
Now rerun Listing 8.2 to re-create the Persons table, but this time in the PersonsLots database, and then adapt Listing 8.3 to populate the table with 1 million rows. Next, we're going to update the Persons table to create many log records. Depending on the specs of your machine, you may have time for a cup of coffee while Listing 8.6 runs.
USE PersonsLots
GO
/* this took 6 mins*/
DECLARE @cnt INT;
SET @cnt = 1;
WHILE @cnt < 6
    BEGIN;
        SET @cnt = @cnt + 1;
        UPDATE  dbo.Persons
        SET     Email = LEFT(Email + Email, 7000)
    END;

DBCC SQLPERF(LOGSPACE) ;
--8194 MB, 67% used
DBCC LOGINFO;
-- 16388 VLFs
Listing 8.6:
Finally, we're ready to take a log backup and see how long it takes. We've included the backup statistics in a comment after the backup code.
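The backup command itself isn't reproduced in this extract; it is a straightforward log backup, along these lines (the backup path and the STATS option are illustrative, and may differ from the original Listing 8.7):

USE master;
GO
BACKUP LOG PersonsLots
TO DISK = 'D:\SQLBackups\PersonsLots_log.trn'
WITH STATS = 10;    -- report progress, and timings, in the output messages
GO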
Listing 8.7:
For comparison, we'll repeat the same test, but this time we'll carefully size our database log to have a reasonable number of reasonably sized VLFs. In Listing 8.8, we re-create the Persons database, with an initial log size of 2 GB (= 16 VLFs, each 128 MB in size). We then manually grow the log, in just three steps, to 8 GB in size, comprising 64 VLFs (each roughly 128 MB in size).
USE master
GO
IF DB_ID('Persons') IS NOT NULL
    DROP DATABASE Persons;
GO

CREATE DATABASE [Persons] ON PRIMARY
(   NAME = N'Persons'
  , FILENAME = N'C:\SQLData\Persons.mdf'
  , SIZE = 2097152KB
  , FILEGROWTH = 1048576KB
)
LOG ON
(   NAME = N'Persons_log'
  , FILENAME = N'D:\SQLData\Persons_log.ldf'
  , SIZE = 2097152KB
  , FILEGROWTH = 2097152KB
)
GO
Listing 8.8:
Now rerun Listings 8.2, 8.3 (with 1 million rows) and 8.6, exactly as for the previous test. You should find that, in the absence of any log growth events, Listing 8.6 runs a lot quicker (in half the time, in our tests). Finally, rerun a log backup.
Listing 8.9:
The effect on log backup time is relatively small, but reproducible, for this size of log: about a 15-20% increase in backup time for a log with 14,292 VLFs compared to one with 64 and, of course, this is a relatively small database (albeit with a very heavily fragmented log).
Lots of redo
In the first example, we reuse the PersonsLots database. Drop and re-create it, set the recovery model to FULL, take a full backup and then insert 1 million rows, as per previous listings. You can find the full code for the examples in this section in the files PersonsLots_RedoTest.sql and Persons_RedoTest.sql, as part of the code download for this book. Now, before we update these rows, we're going to disable automatic checkpoints.
When we commit the subsequent updates, we'll immediately shut down the database so that all of the updates are hardened to the log but not to the data file. Therefore, during crash recovery, SQL Server will need to read all of the related log records and redo all of the operations.
USE PersonsLots
GO
/*Disable Automatic checkpoints*/
DBCC TRACEON( 3505 )
/*Turn the flag off once the test is complete!*/
--DBCC TRACEOFF (3505)

/* this took 5 mins*/
BEGIN TRANSACTION
DECLARE @cnt INT;
SET @cnt = 1;
WHILE @cnt < 6
    BEGIN;
        SET @cnt = @cnt + 1;
        UPDATE  dbo.Persons
        SET     Email = LEFT(Email + Email, 7000)
    END;

DBCC SQLPERF(LOGSPACE) ;
--11170 MB, 100% used

USE PersonsLots
GO
DBCC LOGINFO;
-- 22340 VLFs
Listing 8.10: PersonsLots disable automatic checkpoints and run update in an explicit transaction.
Now we commit the transaction and shut down.
/*Commit and immediately shut down*/
COMMIT TRANSACTION;
SHUTDOWN WITH NOWAIT
Listing 8.11:
After restarting the SQL Server service, try to access PersonsLots, and you'll see a message stating that it is in recovery.
USE PersonsLots
Go
/*Msg 922, Level 14, State 2, Line 1
Database 'PersonsLots' is being recovered. Waiting until recovery is finished.*/
SQL Server has to open the log and read each VLF before it starts recovering the database. So the impact of many VLFs is that it could extend the time between SQL Server restarting the database, and the actual start of the recovery process. Therefore, once the database is accessible, we can interrogate the error log for time between these two events, as well as the total recovery time.
EXEC sys.xp_readerrorlog 0, 1, 'PersonsLots'
/*
2012-10-03 11:28:14.240  Starting up database 'PersonsLots'.
2012-10-03 11:28:26.710  Recovery of database 'PersonsLots' (6) is 0% complete (approximately 155 seconds remain).
2012-10-03 11:28:33.000  140 transactions rolled forward in database 'PersonsLots' (6).
2012-10-03 11:28:33.010  Recovery completed for database PersonsLots (database ID 6) in 6 second(s) (analysis 2238 ms, redo 4144 ms, undo 12 ms.)
*/
Listing 8.13:
There were approximately 12.5 seconds between SQL Server starting up the database and the start of the recovery process. This is why it's possible to see a database listed as "in recovery," without initially seeing any recovery messages in the error log. The recovery process then took under 7 seconds. Notice that, of the three recovery phases, SQL Server spent most time in redo. Let's now repeat the same test with the Persons database (pre-sized log file).
USE Persons
GO
/*Disable Automatic checkpoints*/
DBCC TRACEON( 3505 )
--DBCC TRACEOFF (3505)

USE Persons
GO
BEGIN TRANSACTION
DECLARE @cnt INT;
SET @cnt = 1;
WHILE @cnt < 6
    BEGIN;
        SET @cnt = @cnt + 1;
        UPDATE  dbo.Persons
        SET     Email = LEFT(Email + Email, 7000)
    END;

DBCC SQLPERF(LOGSPACE) ;
-- 12288 MB, 87.2% used
USE Persons
GO
DBCC LOGINFO;
-- 96 VLFs

/*Commit and immediately Shut down*/
COMMIT TRANSACTION;
SHUTDOWN WITH NOWAIT
Listing 8.14: Persons: disable automatic checkpoints, run and commit explicit transaction, shut down SQL Server.
Finally, we interrogate the error log again, for the time between these two events, as well as the total recovery time.
EXEC sys.xp_readerrorlog 0, 1, 'Persons'
/*
2012-10-03 11:54:21.410  Starting up database 'Persons'.
2012-10-03 11:54:21.890  Recovery of database 'Persons' (6) is 0% complete (approximately 108 seconds remain).
2012-10-03 11:54:30.690  1 transactions rolled forward in database 'Persons' (6).
2012-10-03 11:54:30.710  Recovery completed for database Persons (database ID 6) in 3 second(s) (analysis 2177 ms, redo 1058 ms, undo 10 ms.)
*/
Listing 8.15:
Note that, this time, there is less than 0.5 seconds between SQL Server starting up the database and recovery beginning. The recovery process took just over 9 seconds. You will see that, in these tests, we haven't created a situation in which all other things are equal, aside from the log fragmentation. For a start, the recovery process for the fragmented log database rolls forward 140 transactions, whereas the second test only rolls forward 1 transaction. However, it is clear from the tests that a fragmented log can significantly delay the onset of the actual database recovery process, while SQL Server reads in all of the VLFs.
Lots of undo
As an alternative example, we could execute our long update transaction, run a checkpoint and then shut down SQL Server with the transaction uncommitted, and see how long SQL Server takes to recover the database, first when the log is fragmented and then when it is not. In each case, this forces SQL Server to perform a lot of undo work in order to complete recovery, and we'll see the effect, if any, of an internally fragmented log. We won't show the full code for these tests here, as we've presented it all previously, but it's available in the code download files for this book (see PersonsLots_UndoTest.sql and Persons_UndoTest.sql).
/* (1) Recreate PersonsLots, with a fragmented log (Listings 8.4 and 8.5)
   (2) Create Persons table, insert 1 million rows (Listings 8.2 and 8.3)
*/

BEGIN TRANSACTION

/* run update from Listing 8.6*/

/*Force a checkpoint*/
CHECKPOINT;

/*In a second session, immediately shut down without committing*/
SHUTDOWN WITH NOWAIT
Repeat the same test for the Persons database (as defined in Persons_UndoTest.sql). Listing 8.17 shows the results from the error logs for each database.
/*
Persons (non-fragmented log)
2012-10-03 13:21:23.250   Starting up database 'Persons'.
2012-10-03 13:21:23.740   Recovery of database 'Persons' (6) is 0% complete (approximately 10775 seconds remain).
2012-10-03 13:26:03.840   1 transactions rolled back in database 'Persons' (6).
2012-10-03 13:26:03.990   Recovery completed for database Persons (database ID 6) in 279 second(s)
                          (analysis 24 ms, redo 57468 ms, undo 221671 ms.)
*/
Listing 8.17: Error log database startup and recovery information for PersonsLots and Persons.
For PersonsLots the delay between database startup and the start of recovery is over 11 seconds, whereas for Persons it is about 0.5 seconds. The overall recovery times are much longer in these undo examples, compared to the previous redo examples. For PersonsLots, the total recovery time was 326 seconds, compared to 279 seconds for Persons, with the non-fragmented log.
So, how do we size the log correctly? There is no easy answer. There is no sound logic behind suggestions such as "The log should be at least 25% of the size of the database." We must simply pick a reasonable size based on the following considerations, and then track log growth.

• The log must be big enough to accommodate the largest single transaction, for example the largest index rebuild. This means the log must be bigger than the largest index in the database, to allow for the logging requirements to rebuild the index under FULL recovery, and must be big enough to accommodate all activity that might occur concurrently while that largest transaction is running.
• Log sizing must account for how much log is generated between log backups (e.g. in 30 minutes, or 1 hour).
• Log sizing must account for any processes that may delay truncation, such as replication, where the log reader agent job may only run once per hour.

We must also remember to factor in log reservation. The logging subsystem reserves space when logging a transaction to ensure that the log can't run out of space during a rollback. As such, the required log space is actually greater than the total size of the log records for the operation. In short, a rollback operation logs compensation log records, and if a rollback were to run out of log space, SQL Server would have to mark the database as suspect.

This log reservation is not actually "used" log space; it's just a specific amount of space that must remain free. However, it can trigger auto-growth events if the log fills to the point where "used space + reserved space = log size," and it is counted as used space for the purposes of DBCC SQLPERF(LOGSPACE). Hence, it is possible to see the space used reported by DBCC SQLPERF(LOGSPACE) drop after a transaction commits, even if the database is in FULL recovery model, and no log backup has run.

To see this in action, we just need a FULL recovery model database with a table containing about 50K rows. We won't repeat the full code for that here, but it's included in the code download (Persons_LogReservation.sql); a simplified sketch follows.
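As a minimal, hedged sketch only (it assumes the Persons database and dbo.Persons table used elsewhere in this chapter, rather than the full Persons_LogReservation.sql script), the effect can be observed like this:

USE Persons;
GO
BEGIN TRANSACTION;

-- A reasonably large, fully logged update; the rollback reservation is held while it is open
UPDATE  dbo.Persons
SET     Email = LEFT(Email + Email, 7000);

-- "Used" log space at this point includes the rollback reservation
DBCC SQLPERF(LOGSPACE);

COMMIT TRANSACTION;

-- After the commit the reservation is released, so the "used" percentage drops,
-- even though no log backup has run and the log has not been truncated
DBCC SQLPERF(LOGSPACE);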
Note that the log space used has dropped from 87% to 34%, even though this is a FULL recovery model database, and there was no log backup after the transaction committed. SQL Server has not truncated the log in this case, merely released the log reservation space, after the transaction committed. Having set the initial log size, based on all of these requirements, and set a sensible autogrowth safety net, it's wise to monitor log usage, and set alerts for log auto-growth events, since, if we've done our job properly, they should be rare. Chapter 9 will discuss log monitoring in more detail.
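As an aside, not from the book: one way to review recent auto-growth events without a third-party tool is to query the default trace, assuming it is enabled; event class 93 is "Log File Auto Grow." A minimal sketch:

DECLARE @tracefile NVARCHAR(260);

-- Path of the current default trace file
SELECT  @tracefile = path
FROM    sys.traces
WHERE   is_default = 1;

-- Log auto-growth events: IntegerData is the growth in 8-KB pages,
-- Duration is reported in microseconds
SELECT  DatabaseName ,
        FileName ,
        StartTime ,
        Duration / 1000 AS duration_ms ,
        IntegerData * 8 / 1024 AS growth_mb
FROM    sys.fn_trace_gettable(@tracefile, DEFAULT)
WHERE   EventClass = 93
ORDER BY StartTime DESC;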
Some time later, we've fixed the problem that resulted in delayed log truncation; there is now plenty of reusable space in the primary log file, and we no longer need this secondary log file, but it still exists. Let's restore the PersonsLots database.
USE master
GO
RESTORE DATABASE PersonsLots
FROM DISK = 'D:\SQLBackups\PersonsLots_full.bak'
WITH NORECOVERY;

RESTORE LOG PersonsLots
FROM DISK = 'D:\SQLBackups\PersonsLots.trn'
WITH RECOVERY;
/*<output truncated>
Processed 18094 pages for database 'PersonsLots', file 'PersonsLots_log' on file 1.
Processed 0 pages for database 'PersonsLots', file 'PersonsLots_Log2' on file 1.
RESTORE LOG successfully processed 18094 pages in 62.141 seconds (2.274 MB/sec).*/
The restore took over 60 seconds. If we repeat the exact same steps, but without adding the secondary log file, the comparative restore, in our tests, took about 8 seconds.
In order to remove the secondary log file, we need to wait until it contains no part of the active log. Since our goal is to remove it, it's permissible to shrink this secondary log file to zero (demonstrated shortly) and turn off auto-growth for this file, as this will "encourage" the active log to move swiftly back into the primary log file. It's important to note that this will not move any log records in the secondary log over to the primary log. (Some people expect this behavior because, if we specify the EMPTYFILE parameter when shrinking a data file, SQL Server will move the data to another data file in the same filegroup.)
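As a minimal sketch of that preparatory step (hedged; the logical file name, PersonsLots_Log2, is taken from the earlier restore output):

USE PersonsLots
GO
-- Shrink the secondary log file as far as possible (0 requests the minimum size)...
DBCC SHRINKFILE (PersonsLots_Log2, 0);
GO
-- ...and stop it growing again, so the active log moves back to the primary file
ALTER DATABASE PersonsLots
MODIFY FILE (NAME = PersonsLots_Log2, FILEGROWTH = 0);
GO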
Once the secondary log file contains no part of the active log, we can simply remove it.

USE PersonsLots
GO
ALTER DATABASE PersonsLots REMOVE FILE PersonsLots_Log2
GO
This is one problem solved, but we may still have a bloated and fragmented primary log. While we should never shrink the log as part of our standard maintenance operations, as discussed in Chapter 7, it is permissible in situations such as this, in the knowledge that we have investigated and resolved the cause of the excessive log growth, and so shrinking the log should be a "one-off" event. The recommended approach is to use DBCC SHRINKFILE (see http://msdn.microsoft.com/en-us/library/ms189493.aspx) to reclaim the space. If we don't specify a target size, or if we specify 0 (zero) as the target size, we can shrink the log back to its original size (in this case, 2 MB) and minimize fragmentation of the log file. If the initial size of the log was large, we may wish to shrink the log to a smaller size than this, in which case we specify a target_size, such as "1".
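The book's actual shrink command for the primary log isn't reproduced in this extract; a minimal sketch, using the logical file name from the earlier restore output and a target size of zero, would be:

USE PersonsLots
GO
-- Attempt to shrink the primary log file back towards its initial size
DBCC SHRINKFILE (PersonsLots_log, 0);
GO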
In the output from this command, we see the current size of the file (24128 * 8-KB pages) and the minimum possible size after shrinking (256 * 8-KB pages). This is actually an indication that the shrink did not work fully. SQL Server shrank the log to the point where the last VLF in the file contained part of the active log, and then stopped. Check the Messages tab for confirmation.
/*Cannot shrink log file 2 (PersonsLots_log) because the logical log file
  located at the end of the file is in use.

(1 row(s) affected)
DBCC execution completed. If DBCC printed error messages, contact your system administrator.*/

Perform a log backup and try again.

USE master
GO
BACKUP DATABASE PersonsLots
TO DISK = 'D:\SQLBackups\PersonsLots_full.bak'
WITH INIT;
GO
BACKUP LOG PersonsLots
TO DISK = 'D:\SQLBackups\PersonsLots.trn'
WITH INIT;
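With the backups taken, the active portion of the log should no longer be pinned at the end of the file, and the same shrink sketch as before can be run again:

USE PersonsLots
GO
DBCC SHRINKFILE (PersonsLots_log, 0);
GO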
Having done this, we can now manually grow the log to the required size, as demonstrated previously in Listing 8.8.
Summary
We started with a brief overview of the physical architecture factors that can affect log throughput, such as the need to separate log file I/O onto a dedicated array, and to choose the optimal RAID level for this array.

This chapter then emphasized the need to manage transaction log growth explicitly, rather than let SQL Server auto-growth events "manage" it for us. If we undersize the log initially, and then let SQL Server auto-grow it in small increments, we'll end up with a fragmented log. Examples in the chapter demonstrated how this might affect the performance of any SQL Server operations that need to read the log.

Finally, we discussed the factors that determine the correct log size, and the correct auto-growth increment for a given database, and we offered advice on how to recover from a situation where a database suffers from multiple log files and an oversized and fragmented primary log.
The next chapter, the final one in this book, will describe the various tools and techniques for monitoring log activity, throughput, and fragmentation.
Further Reading
Manage the Size of the Transaction Log File
http://msdn.microsoft.com/en-us/library/ms365418.aspx

The Trouble with Transaction Logs
http://thomaslarock.com/2012/08/the-trouble-with-transaction-logs/

How to shrink the SQL Server log
http://rusanu.com/2012/07/27/how-to-shrink-the-sql-server-log/

Multiple log files and why they're bad
http://www.sqlskills.com/BLOGS/PAUL/post/Multiple-log-files-and-why-theyre-bad.aspx
Acknowledgements
Many thanks to Jonathan Kehayias, who contributed the RAID section of this chapter.
Chapter 9: Monitoring the Transaction Log
Monitoring Tools
Several tools are available that will, among other things, allow us to monitor activity on our database files, including the log. Here, we'll consider just two in brief: a built-in tool (Perfmon) and a third-party tool (Red Gate SQL Monitor).
Windows Perfmon
A popular "built-in" tool for monitoring SQL Server activity is Windows Performance Monitor (Perfmon). It is a Windows OS monitoring tool that provides a vast range of counters for monitoring memory, disk I/O, CPU and network usage on a server (for example, see http://technet.microsoft.com/en-us/library/cc768048.aspx), and also exposes the counters maintained by SQL Server. Generally, the DBA or system administrator would set up Perfmon to record statistics from various counters at regular intervals, storing the data in a le and then importing it into Excel, or a similar tool, for analysis. Amongst its many counters, it oers a number to measure disk read and write performance, as well as specic counters for log monitoring. There are plenty of available tutorials on the use of Perfmon, and we won't repeat those in any detail here. In addition to documentation on TechNet (http://technet.microsoft. com/en-us/library/cc749249.aspx), we recommend the articles below for those new to this tool. SQL Server Perfmon Best Practices, by Brent Ozar A comprehensive tutorial on use of the tools and recommended counters for monitoring SQL Server, and how to analyze the saved data in Excel. http://www.brentozar.com/archive/2006/12/ dba-101-using-perfmon-for-sql-performance-tuning/
Correlating SQL Server Profiler with Performance Monitor, by Brad McGehee
A "quick start" tutorial on use of Perfmon, plus how to correlate Perfmon data with Profiler trace data.
http://www.simple-talk.com/sql/database-administration/correlating-sql-server-profiler-with-performance-monitor/

Baselining and Benchmarking, by Brent Ozar
A video tutorial providing an overview of monitoring using Performance Monitor, and Profiler, as well as benchmarking "whys" and "hows."
http://technet.microsoft.com/en-us/sqlserver/gg429821.aspx

In terms of the general performance of the disks housing the log (and data) files, we can monitor the following pairs of counters:

• Physical Disk\Disk Reads/sec and Physical Disk\Disk Writes/sec: we need to know the values of these counters and, having established a baseline, look out for marked rises in either, and investigate their cause.
• Physical Disk\Avg. Disk sec/Read and Physical Disk\Avg. Disk sec/Write: the average times (in ms) of a read and write to disk; these counters provide the disk latency stats that can be used to pinpoint potential I/O bottlenecks.

Popular guideline values for the disk latency counters suggest that less than 10 ms is "good," anything over 20-30 ms is "OK" but potentially worrying, and anything over about 50 ms indicates a possible I/O bottleneck. Of course, these figures are entirely dependent on the specification and configuration of the disk array in your environment.

For a very simple demo of Perfmon in action, we'll just set it up to monitor counter activity for the Physical Disk\Avg. Disk sec/Read and Physical Disk\Avg. Disk sec/Write counters whilst one of the databases is under a heavy write load.
For our examples, we're going to re-create a new version of the Persons database and table from Chapter 8. In the code download for this chapter, the script Ch09_Persons.sql re-creates the Persons database, in FULL recovery model, followed by the Persons table. It then populates the table with 1 million rows of data (from the AdventureWorks2008 database, though it will also work with AdventureWorks2012).

We can now start Windows Perfmon (just search for "perfmon" from the Start menu). If you just want to see an instant snapshot of certain activity, you can simply add counters directly to the Perfmon graph. However, if you wish to define a particular set of counters to monitor over time, on a schedule, then you'll need to define a new data collector set.

Under Data Collector Sets, right-click on User Defined and select New | Data Collector Set (these data collector sets are unrelated to the SQL Server Data Collector mentioned in the introduction to this chapter). Give your new Data Collector Set a name, and choose whether to select the counters based on a pre-supplied template or manually (I chose the latter for this simple example). At the next screen, opt to include only performance counter data, and at the one after, you get to select the performance counters.

Click Add, type in the path to the appropriate server, hit the Enter key and wait a few seconds (or longer, sometimes!) while it connects and populates the box below with the available counters. You can now go through and add the ones you want. On this occasion, we're only interested in the PhysicalDisk class, so click on it to expand it and select the relevant counters. Notice that we're monitoring the counters for two separate disk drives (the data file for the Persons database is on the C:\ drive and the log file on the D:\ drive), rather than capturing the total values for all drives. If both of these files are on the same drive, on your test machine, then you can simply collect the counters for the appropriate single drive.
Figure 9.1: Setting up Perfmon.
Having clicked OK, we return to the previous screen and simply need to confirm the collection interval (this will vary, depending on what you're monitoring; I just chose one second for this simple example) and click Finish. We can run this new data collector set on a schedule, of course, but here we'll simply right-click on it and select Start (after a second or two, a green "Play" symbol will appear). Back in SSMS, we create a reasonably heavy write load on our Persons database, as shown in Listing 9.1.
USE Persons
GO
DECLARE @cnt INT;
SET @cnt = 1;
-- may take several minutes; reduce the number of loops, if required
WHILE @cnt < 6
    BEGIN;
        SET @cnt = @cnt + 1;
        UPDATE  dbo.Persons
        SET     Email = LEFT(Email + Email, 7000)
    END;
Listing 9.1: Creating a heavy write load on the Persons database.
Once the code completes, return to Perfmon, right-click on your data collector set and stop the data collection. Navigate to the Reports menu item and, under User Defined, you should find a list of reports, with names of the form ClientName_Date_Number. Locate the relevant report and double-click on it to view the data in graphical form, as shown in Figure 9.2.
Figure 9.2: The collected Perfmon data, viewed in graphical form.
You can select and deselect the counters you wish to display and, by double-clicking on any one in the counter listing below the graph, you can modify their appearance on the graph, the scale, and so on. You can also zoom in to an area of the graph by clicking on the graph and dragging across to highlight the required area, and then clicking the magnifying glass from the top menu (use the horizontal scroll bar directly below the graph to zoom back out).

The dominant activity on the graph is disk writes to the D: drive, hosting our log file. We can see that, for periods, the latency sits at around 40 ms, but with frequent higher spikes. We can use the Change Graph Type icon from the top menu to view a report. Over the entire collection period, the report shows an average latency for writes to the D: drive of 55 ms, which could be a cause for concern if seen for sustained periods. Of course, many other PhysicalDisk counters, and other counters, are available that can offer insight into the underlying performance of your disks (% Disk Time, Disk Transfers/sec, and so on), and we should undertake a deeper analysis before jumping to conclusions.

In addition, and in a similar manner, we could collect other log-related counters, such as those available in the SQL Server:Databases object. This object offers various counters for investigation of log activity, including, among others (see http://msdn.microsoft.com/en-us/library/ms189883.aspx):

• Log File(s) Size (KB) and Log File(s) Used Size (KB): total size of the transaction log and how much is currently being used.
• Log Flushes/sec and Log Bytes Flushed/sec: number and size of writes from log cache to disk.
Figure 9.3: Monitoring log activity with Red Gate SQL Monitor.
A nice feature of SQL Monitor is that it makes it very easy, much easier than with Perfmon, to compare the same type of activity across different periods. In the Time range drop-down, we can change the time range, set a custom range, as well as compare values for this metric from today (or "this week") to those captured yesterday (or "last week"), and so on.
Here, we'll examine just three examples, starting with sys.dm_db_log_space_usage, then moving on to sys.dm_io_virtual_file_stats and then sys.dm_os_performance_counters, to expose detailed information on log activity and growth.
Listing 9.2:
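Listing 9.2 itself isn't reproduced in this extract. As an illustrative sketch only (it assumes SQL Server 2012 or later, where sys.dm_db_log_space_usage is available), a query against this DMV for the current database might look like this:

USE Persons;
GO
-- Log size and space used for the current database
SELECT  DB_NAME(database_id) AS database_name ,
        total_log_size_in_bytes / 1048576.0 AS total_log_size_mb ,
        used_log_space_in_bytes / 1048576.0 AS used_log_space_mb ,
        used_log_space_in_percent
FROM    sys.dm_db_log_space_usage;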
Using sys.dm_io_virtual_file_stats
For each database file that SQL Server uses, data files as well as log (and full text) files, the sys.dm_io_virtual_file_stats function gives cumulative physical I/O statistics, indicating how frequently the file has been used by the database for reads and writes since the server was last rebooted. It also provides a very useful metric in the form of the "I/O stall" time, which indicates the total amount of time that user processes have waited for I/O to be completed on the file in question. Note that this DMV measures physical I/O only. Logical I/O operations that read from cached data will not show up here.

The function accepts a database_id and a file_id, which we can use to investigate specific files or databases, or we can simply return the data for all files on the server. In order to start with a clean slate, rerun the script Persons.sql to drop and re-create the Persons database and table, and populate it with 1 million rows, and then run Listing 9.3 to capture into a temporary table some baseline data for the server.
SELECT  DB_NAME(mf.database_id) AS databaseName ,
        mf.physical_name ,
        divfs.num_of_reads ,
        divfs.num_of_bytes_read ,
        divfs.io_stall_read_ms ,
        divfs.num_of_writes ,
        divfs.num_of_bytes_written ,
        divfs.io_stall_write_ms ,
        divfs.io_stall ,
        size_on_disk_bytes ,
        GETDATE() AS baselineDate
INTO    #baseline
FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS divfs
        JOIN sys.master_files AS mf ON mf.database_id = divfs.database_id
                                       AND mf.file_id = divfs.file_id
Listing 9.3: Capturing baseline file I/O statistics for the server in a temporary table.
Listing 9.4 shows a query against the #baseline table, returning some of the read and write statistics for the Persons database.
-- Only the FROM and WHERE keywords of the original listing survive in this extract;
-- the column list shown here is a plausible reconstruction, not the book's exact code
SELECT  physical_name ,
        num_of_reads ,
        num_of_bytes_read ,
        io_stall_read_ms ,
        num_of_writes ,
        num_of_bytes_written ,
        io_stall_write_ms
FROM    #baseline
WHERE   databaseName = 'Persons' ;
Listing 9.4: Returning baseline read and write statistics for the Persons database.
As noted, the data provided by this function is cumulative from when the server last restarted; in other words, the values in the data columns increment continuously, from the point when the server was last restarted. As such, a single "snapshot" of the data is rather meaningless, on its own. What we need to do is take a "baseline" measurement, wait for a set period, perhaps while a specic set of operations completes, then take a second measurement and subtract the two, so that you can see where I/O is "accumulating." Rerun Listing 9.1 to update our Persons table, and run Listing 9.5 to collect a second set of data and subtract the baseline data values (we have omitted a few columns from the output, due purely to space constraints).
WITH    currentLine
          AS ( SELECT   DB_NAME(mf.database_id) AS databaseName ,
                        mf.physical_name ,
                        num_of_reads ,
                        num_of_bytes_read ,
                        io_stall_read_ms ,
                        num_of_writes ,
                        num_of_bytes_written ,
                        io_stall_write_ms ,
                        io_stall ,
                        size_on_disk_bytes ,
                        GETDATE() AS currentlineDate
               FROM     sys.dm_io_virtual_file_stats(NULL, NULL) AS divfs
                        JOIN sys.master_files AS mf
                             ON mf.database_id = divfs.database_id
                                AND mf.file_id = divfs.file_id
             )
    SELECT  currentLine.databaseName ,
            LEFT(currentLine.physical_name, 1) AS drive ,
            currentLine.physical_name ,
            DATEDIFF(millisecond, baseLineDate, currentLineDate) AS elapsed_ms ,
            currentLine.io_stall - #baseline.io_stall AS io_stall_ms ,
            currentLine.io_stall_read_ms - #baseline.io_stall_read_ms AS io_stall_read_ms ,
            currentLine.io_stall_write_ms - #baseline.io_stall_write_ms AS io_stall_write_ms ,
            currentLine.num_of_reads - #baseline.num_of_reads AS num_of_reads ,
            currentLine.num_of_bytes_read - #baseline.num_of_bytes_read AS num_of_bytes_read ,
            currentLine.num_of_writes - #baseline.num_of_writes AS num_of_writes ,
            currentLine.num_of_bytes_written - #baseline.num_of_bytes_written AS num_of_bytes_written
    FROM    currentLine
            INNER JOIN #baseline ON #baseLine.databaseName = currentLine.databaseName
                                    AND #baseLine.physical_name = currentLine.physical_name
    WHERE   #baseline.databaseName = 'Persons' ;
Listing 9.5: Capturing the current file I/O statistics and subtracting the baseline values.
Clearly, we forced some very heavy write activity on the log file in this example! Pinpointing the cause of high I/O stalls, and resolving the problem, is sometimes a complex process. If you suspect that I/O stall rates are causing issues, then the first course of action might be to attempt to reduce the overall I/O load to acceptable levels. For example, we can use data from the execution- and index-related DMVs to attempt to reduce the overall I/O load on the server through tuning and indexing. We could also increase the amount of RAM, so that more data can be held in the data cache, so reducing the occurrence of physical file reads. Armed with the I/O stall rate, and the amount of data read and written, we can also identify opportunities to implement partitioning, or to at least separate tables onto different filegroups.

Ultimately, however, high stall rates could simply indicate that the disk I/O subsystem is inadequate to handle the required I/O throughput. If attempts to reduce the overall I/O load fail to bring the stall rates down to acceptable levels, then there is little choice but to consider adding more or faster disks, more or faster I/O paths, or to investigate potential problems with the configuration of the I/O subsystem.

Finally, remember that the data in this DMV reflects only SQL Server's perspective of disk I/O. If the disk subsystem is shared at a server level with other applications, another application may be the actual cause of poor disk performance, not SQL Server. Furthermore, with the use of SANs, virtualization software, and so on, there are often several "intermediary" layers between SQL Server and the actual disk storage. In short, analyze carefully the data obtained from this DMV, and consider it in conjunction with data obtained from Windows OS counters, Profiler, and other DMVs, before deciding on a course of action.
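By way of illustration (not from the book), one quick way to find the statements responsible for most of the read I/O in the plan cache is to query the execution-related DMVs; a hedged sketch:

-- Top 10 cached statements by total logical reads, as a starting point for tuning
SELECT TOP 10
        qs.total_logical_reads ,
        qs.execution_count ,
        SUBSTRING(st.text, qs.statement_start_offset / 2 + 1,
                  ( CASE WHEN qs.statement_end_offset = -1
                         THEN DATALENGTH(st.text)
                         ELSE qs.statement_end_offset
                    END - qs.statement_start_offset ) / 2 + 1) AS statement_text
FROM    sys.dm_exec_query_stats AS qs
        CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_logical_reads DESC;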
Using sys.dm_os_performance_counters
Generally, it's probably easiest to collect performance counters using Performance Monitor (Perfmon), as discussed previously. However, if you prefer to save the statistics in a database table and interrogate them using SQL, the sys.dm_os_performance_counters DMV is a very useful tool. Just write the query to retrieve the data from the DMV, add INSERT INTO CounterTrendingTableName, and you have a rudimentary monitoring system! In addition, it's not always possible to get direct access to Perfmon, and accessing it from a different machine can be slow.

Unfortunately, using this DMV is far from plain sailing and a full description of its intricacies is out of scope here. Instead, we refer you to the book, Performance Tuning with SQL Server Dynamic Management Views (http://www.simple-talk.com/books/sql-books/performance-tuning-with-sql-server-dynamic-management-views/), which is available as a free eBook.

Listing 9.6, below, simply provides an example of how to report on log growth or shrink events. The output indicates that the Persons database (initial log size 2 MB, auto-growth increment 2 MB) underwent a huge number of log growth events, due to inserting the initial load of 1 million rows and then performing the update in Listing 9.2. This is obviously a cause for concern and the DBA would need to investigate the log sizing and growth settings, and possibly perform a one-off shrink followed by an appropriate resize, as described in Chapter 8.
Listing 9.6: Reporting on log growth or shrink events.
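The full listing isn't reproduced in this extract; as a rough sketch of the kind of query involved (the counter and object names are standard, but the exact filters are assumptions), the Log Growths and Log Shrinks counters can be read per database as follows:

SELECT  RTRIM(instance_name) AS database_name ,
        RTRIM(counter_name)  AS counter_name ,
        cntr_value
FROM    sys.dm_os_performance_counters
WHERE   object_name LIKE '%:Databases%'
        AND RTRIM(counter_name) IN ( 'Log Growths', 'Log Shrinks' )
        AND instance_name = 'Persons';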
PowerShell
PowerShell, with Server Management Objects, forms a powerful automation tool for managing and documenting SQL Server databases. It's a rather steep learning curve for any DBA brought up on T-SQL and GUI management tools, but a few short scripts can gather all manner of data across all your servers, and all SQL Server versions.
The following two scripts are adapted from the work of PowerShell enthusiast Phil Factor; see http://www.simple-talk.com/sql/database-administration/powershell-smo-just-writing-things-once/.

The PowerShell routine in Listing 9.7 takes a list of SQL Server instance names and checks all the databases within those server instances, listing out, via SMO:

• name of the log file and the path
• auto-growth setting (either KB or percent)
• current file size (MB), amount of space used (MB) and maximum file size (MB)
• number of disk reads, number of disk writes
• bytes read from disk and bytes written to disk.

Simply update the script with your server name and run it from Windows PowerShell.
Listing 9.7: Using PowerShell and SMO to investigate log file size, location and activity.
We've displayed the output in the grid view, and filtered to show only databases with the term "persons" in their name. If you'd prefer to output to Excel, simply replace Out-GridView with the following:

ConvertTo-Csv -UseCulture > Export.csv

If you want to run the script from within SSMS, right-click on the server or database and select Start PowerShell. If you're using anything other than the SQL 2012 version, then you'll need to first download, import and install the sqlps module, in order to get access to the Out-GridView and ConvertTo-Csv cmdlets. See Michael Sorens's article for more details: http://www.simple-talk.com/sql/database-administration/practical-powershell-for-sql-server-developers-and-dbas-%E2%80%93-part-1/. Alternatively, simply replace the last line with Format-Table.

Listing 9.8 shows a second routine, concerned with investigating log file fragmentation. Again, the routine uses SMO to query each database on all specified SQL Server instances. It queries DBCC LogInfo, via T-SQL, to get the number of VLFs for each log file. It groups the results to report the number of virtual log files, the size of the biggest VLF in MB, the smallest VLF in MB, and the average and total size across all VLFs, for each database. This time, we use Format-Table for the output.
Listing 9.8: Investigating log fragmentation with PowerShell and SMO (server name omitted from output).
Summary
This final chapter reviewed just a few of the tools available to the DBA for monitoring log growth and performance, including Windows Perfmon, third-party monitoring tools, Dynamic Management Views, and PowerShell or T-SQL scripting. We tried to offer a reasonable feel for what each tool can do, so that you can explore further if it looks like a good fit for your needs.

It is essential that every DBA maintains a healthy transaction log. Ideally, this will consist of a single log file, on a dedicated RAID 1+0 array (or as close to this ideal as you can get) in order to support maximum write performance and throughput. We must capture "baseline" statistics that characterize log write performance under a typical workload, and then monitor this data over time, checking for abnormal activity, or sudden deterioration in performance.

Likewise, we should size the log in accordance with the current and predicted data load, rather than let SQL Server "manage" log growth via auto-growth events. We should enable SQL Server's auto-growth facility, but only as a safeguard, and the DBA should receive an alert when log growth occurs, and investigate. By carefully monitoring log growth, we can avoid situations such as a full transaction log, or a highly fragmented log, which might impede the performance of operations that read the log, such as log backups and the crash recovery process.
Further Reading
Storage Testing and Monitoring (video)
http://technet.microsoft.com/en-us/sqlserver/gg508910.aspx

Baselining and Benchmarking (video)
http://technet.microsoft.com/en-us/sqlserver/gg429821.aspx
Back to Basics: Capturing Baselines on Production SQL Servers
http://www.sqlservercentral.com/articles/baselines/94656/

Diagnosing Transaction Log Performance Issues and Limits of the Log Manager
http://sqlcat.com/sqlcat/b/technicalnotes/archive/2008/12/09/diagnosing-transaction-log-performance-issues-and-limits-of-the-log-manager.aspx

Monitoring SQL Server Virtual Log File Fragmentation
http://www.simple-talk.com/sql/database-administration/monitoring-sql-server-virtual-log-file-fragmentation/

Monitoring SQL Server database transaction log space
http://www.mssqltips.com/sqlservertip/1178/monitoring-sql-server-database-transaction-log-space/

System Data Collection Sets
http://msdn.microsoft.com/en-us/library/bb964725%28v=sql.100%29.aspx

The Future fn_dblog() No More? Tracking Transaction Log Activity in Denali
http://www.sqlskills.com/blogs/jonathan/post/An-XEvent-a-Day-(22-of-31)e28093-The-Future-e28093-fn_dblog()-No-More-Tracking-Transaction-Log-Activity-in-Denali.aspx
Acknowledgements
We would like to thank:

• Louis Davidson, who, in the Dynamic Management Objects section of this chapter, allowed us to reproduce material from the book he co-authored with Tim Ford: Performance Tuning with SQL Server Dynamic Management Views (http://www.simple-talk.com/books/sql-books/performance-tuning-with-sql-server-dynamic-management-views/)
• Phil Factor, for his help with the PowerShell script.
Pricing and information about Red Gate tools are correct at the time of going to print. For the latest information and pricing on all Red Gate's tools, visit www.red-gate.com
$595
"Just purchased SQL Compare. With the productivity I'll get out of this tool, it's like buying time."
Robert Sondles Blueberry Island Media Ltd
$595
"We use SQL Data Compare daily and it has become an indispensable part of delivering our service to our customers. It has also streamlined our daily update process and cut back literally a good solid hour per day."
George Pantela GPAnalysis.com
$295
"SQL Prompt is hands-down one of the coolest applications I've used. Makes querying/developing so much easier and faster."
Jorge Segarra University Community Hospital
from $395
Deployment Manager
from
$295
Deploys your whole application ASP.NET sites, dependent assemblies, and databases in one process Makes deployment repeatable with a minimum of custom scripting Shows you which version of your software is running on each dev, test, staging, and production environment, from a central dashboard Works with local, remote, and Cloud-based servers Uses public/private key encryption to keep deployments over the Internet secure
"This tool ranks up there with NuGet and Team City as the tools that have made the biggest difference to web application development productivity these last months it truly deserves to be a roaring success!"
Mark Roberts .NET Web Developer, Red Gate Software
$795
Compress SQL Server database backups by up to 95% for faster, smaller backups Protect your data with up to 256-bit AES encryption Strengthen your backups with network resilience to enable a fault-tolerant transfer of backups across flaky networks Control your backup activities through an intuitive interface, with powerful job management and an interactive timeline Get integrated backup verification schedule regular restores and include a database integrity check (DBCC CHECKDB)
"SQL Backup Pro cut the backup time for our most mission-critical database by 92%, and provided us with 95% compression. Builtin network resilience has also reduced our failure rate to zero. I'm absolutely amazed at how well it performs."
Kiara Rodemaker Manager, IT Accounting Services, Czarnowski
SQL Monitor
SQL Server performance monitoring and alerting
from
$795
Intuitive overviews at global, cluster, machine, SQL Server, and database levels for up-to-the-minute performance data Use SQL Monitor's web UI to keep an eye on server performance in real time on desktop machines and mobile devices Intelligent SQL Server alerts via email and an alert inbox in the UI, so you know about problems first Comprehensive historical data, so you can go back in time to identify the source of a problem View the top 10 expensive queries for an instance or database based on CPU usage, duration, and reads and writes PagerDuty integration for phone and SMS alerting Fast, simple installation and administration Add your own T-SQL scripts with the custom metrics feature to expand SQL Monitor's range
"Being web based, SQL Monitor is readily available to you, wherever you may be on your network. You can check on your servers from almost any location, via most mobile devices that support a web browser."
Jonathan Allen Senior DBA, Careers South West Ltd
$1,395
Protect your organization's data by creating highly compressed, fully verified and encrypted backups, and ensure reliable restores.
Troubleshooting
Get an alert within seconds of a problem arising, gain insight into what happened, and diagnose the issue, fast.
Security
Protect your data from unauthorized access with strong backup encryption.
The tools in the bundle can be bought separately with a combined value of $3,375, or purchased together for $1,395, saving 60% on the individual tool prices.
SQL Toolbelt
The essential SQL Server tools for database professionals
$1,995
You can buy our acclaimed SQL Server tools individually or bundled. Our most popular deal is the SQL Toolbelt: sixteen of our SQL Server tools in a single installer, with a combined value of $5,930 but an actual price of $1,995, a saving of 66%. Fully compatible with SQL Server 2000, 2005, and 2008. SQL Toolbelt contains:
SQL Compare Pro SQL Data Compare Pro SQL Source Control SQL Backup Pro SQL Monitor SQL Prompt Pro SQL Data Generator SQL Doc SQL Test
SQL Dependency Tracker SQL Packager SQL Multi Script Unlimited SQL Search SQL Comparison SDK SQL Object Level Recovery Native SQL Connect
"The SQL Toolbelt provides tools that database developers, as well as DBAs, should not live without."
William Van Orden Senior Database Developer, Lockheed Martin
$495
Find memory leaks and optimize memory usage of your .NET applications
Zero in on the causes of memory leaks, fast
Visualize the relationship between your objects and identify references which should no longer be held in memory

"Freaking sweet! We have a known memory leak that took me about four hours to find using our current tool, so I fired up ANTS Memory Profiler and went at it like I didn't know the leak existed. Not only did I come to the conclusion much faster, but I found another one!"
Aaron Smith IT Manager, R.C. Systems Inc.
from
$395
"ANTS Performance Proler took us straight to the specic areas of our code which were the cause of our performance issues."
Terry Phillips Sr Developer, Harley-Davidson Dealer Systems
"I have integrated ANTS Proler into my entire development process and it has truly helped us write better performing applications from the beginning by ensuring that we always know exactly what is going on."
Mitchell Sellers MVP
.NET Reflector
$95
"One of the most useful, practical debugging tools that I have ever worked with in .NET! It provides complete browsing and debugging features for .NET assemblies, and has clean integration with Visual Studio."
Tom Baker Consultant Software Engineer, EMC Corporation
SmartAssembly
Prepare your application for the world
Obfuscation: Obfuscate your .NET code to secure it against reverse engineering. Multiple layers of protection defend your software against decompilation and cracking.
$795
"Knowing the frequency of problems (especially immediately after a release) is extremely helpful in prioritizing and triaging bugs that are reported internally. Additionally, by having the context of where those errors occurred, including debugging information, really gives you that leap forward to start troubleshooting and diagnosing the issue."
Ed Blankenship Technical Lead and MVP
Automated Error Reporting: Get quick and automatic reports on exceptions your end-users encounter, and identify unforeseen bugs within hours or days of shipping. Receive detailed reports containing a stack trace and values of the local variables, making debugging easier.