Press "Enter" to skip to content

Category: DBCC

Offloading Maintenance Operations

Taryn Pratt has a process for offloading maintenance operations onto another server:

Early on when I started working on the SQL Servers at Stack Overflow, we were taking daily backups. We had a handful of databases that were being restored for other processes, but the majority weren’t actively tested to ensure the backups were good. Since you never want to be in a situation where you need to restore a database and find it doesn’t work, my goal was to create a process to automatically restore our backups to a separate server, and then run DBCC CHECKDB on it.

This is a T-SQL-driven process and I appreciate that. If you want a Powershell-driven process, Kevin Hill has you covered.

Comments closed

DBCC CHECKDB on Large Databases

Aaron Bertrand shares some thoughts on CHECKDB:

We have a lot of data. Some of that data is stored in large databases (dozens of terabytes each). In some shops, this is an excuse to not run integrity checks. We are not one of those shops.

But we don’t run full CHECKDB operations in production; we have a set of servers dedicated to testing our restores and running checks. We follow a lot of the guidance in these articles:

CHECKDB From Every Angle: Consistency Checking Options for a VLDB

Minimizing the impact of DBCC CHECKDB : DOs and DON’Ts

Minimize performance impact of SQL Server DBCC CHECKDB

Read the whole thing, even if you aren’t dealing with 30+ TB databases.

Comments closed

Building an Azure Function to Automate CHECKDB

Arun Sirpal shows us how to build an Azure Function:

The title is a mouthful and so is this post. In the past I have linked to blog posts from Microsoft that say consistency checks for Azure SQL Database is the responsibility of Microsoft. (https://azure.microsoft.com/en-gb/blog/data-integrity-in-azure-sql-database/)

However, Paul Randal got me thinking about his thoughts on it (via his insider email). Forming the core of this post. If you desire to run DBCC CHECKDB against Azure SQL Database (which I know people do) – how can you do this? There are many ways, but for this blog post – Enter Azure functions. There are many moving parts to this, but once setup and coded it is a very satisfying experience. Let’s dig in. I am NOT going to copy and paste every little element of the high-level guide from Microsoft, there is no point in that but I will show you the links that you need to setup the relevant function app project then the tailored bits around CHECKDB forms the bulk of this post.

This isn’t necessary to do, but if you want to learn how Azure Functions work, it’s a good example of working through the mechanics.

Comments closed

Emergency Mode in SQL Server

Paul Randal answers a reader question:

I had a blog comment question a few days ago that asked why emergency-mode repair requires the database to be in EMERGENCY mode as well as SINGLE_USER mode.

All repair operations that DBCC CHECKDB (and related commands) performs require the database to be in single-user mode so there’s a guarantee that nothing can be changing while the checks and repairs are done. But that doesn’t change the behavior of what repair does – that needs emergency mode too.

Read on for an explanation of what emergency mode is and why we need it to run CHECKDB repair operations.

Comments closed

Dealing With CheckDB Error Message 824 Level 24

Steve Stedman has a post on fixing a database which has experienced an incorrect pageid error:

Msg 824, Level 24, State 2, Line 1

SQL Server detected a logical consistency-based I/O error: incorrect pageid (expected 1:2806320; actual 0:0).  It occurred during a read of page (1:xxxxx) in database ID 5 at offset 0x00000xxxxx0000 in file ‘C:\Program Files\Microsoft SQL Server\MSSQL12.MSSQLSERVER\MSSQL\DATA\YourDatabaseName.mdf’.  Additional messages in the SQL Server error log or system event log may provide more detail. This is a severe error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.

Since this is one of those things that I regularly work with, I thought I would see what other people are saying about this error message, and boy oh boy did I found some crazy and outright damaging suggestions

Steve puts together a bunch of really bad advice and explains why you shouldn’t follow it.  Read the whole thing and listen to Steve’s advice, not the bad advice.

Comments closed

When Using DBCC DROPCLEANBUFFERS

Dan Guzman shares words of wisdom with using DBCC DROPCLEANBUFFERS for testing query performance in SQL Server:

One can make the argument that DBCC DROPCLEANBUFFERS might not be particularly valuable for testing. First, the storage engine in SQL Server Enterprise Edition (or Developer Edition, which is often used when testing) behaves differently with a cold cache versus a warm one. With a warm cache, a page not already in cache (e.g. index seek by primary key) will be fetched from disk using a single 8K page IO request as one expects. However, when the cache isn’t fully warmed up (Buffer Manager’s Target Pages not yet met), the entire 64K extent (8 contiguous 8K pages) is read for the single page request regardless of whether the adjacent pages are actually needed by the query. This has the benefit of warming the cache much more quickly than would otherwise occur, but given that the normal steady state of a production SQL Server is a warm cache, testing with a cold cache isn’t a fair comparison of different plans. More data than normal will be transferred from storage so timings may not be indicative of actual performance.

I don’t think I agree 100% with that argument, but I am sympathetic to it.  Still, Dan has great advice in this post.

Comments closed

Using DBCC OPTIMIZER_WHATIF To Mimic Production Hardware

Max Vernon has a technique for mimicking production hardware layouts when testing queries in development:

Attempting to debug production performance problems in your development environment can be problematic in many ways, leading to a frustrating troubleshooting experience. One very common situation is the resources on the development environment are substantially less robust than on the production system; for instance prod has 128 GB of RAM, while dev only has 16 GB, prod has 16 cores, while dev only has 4 cores. Unintuitively, this disparity can result in queries running faster in development than in production.

SQL Server has a little-known (and undocumented and unsupported) troubleshooting-related DBCC command that can be used to mimic production resource levels in your development environment. As with all undocumented features, do not try this in production.

Read on to learn how DBCC OPTIMIZER_WHATIF can lead the optimizer to choose different plans.  I almost never use this command, but it is helpful to have it in your back pocket.

Comments closed

Diagnosing Database Corruption

Jeff Mlakar has started a series on database corruption.  His first post involves finding corruption:

The cause lies in layers below SQL Server. The most common are hardware faults; in particular, issues with the I/O Subsystem. Any component in the I/O Subsystem can fail and be the cause of database corruption: disks, controllers, CPU, memory, network switch, network cables, SAN, etc.

Database corruption cannot entirely be prevented. It is not a matter of if but rather when

Disks go bad. So do NICs, cables, routers, and everything else physical below the SQL Server Instance. This is why it is important to know that we cannot entirely prevent corruption – only deal with and mitigate it.

Click through for a few ways to find potential corruption.

Comments closed