Recently I was doing some work with a friend around some database restores. It was pretty routine stuff. However, after one restore my friend came across something in the SQL Error Log that caught him by surprise. As part of the restore, there was a CHECKDB message for the restored database:
My friend’s first reaction was “why is SQL Server doing a DBCC CHECKDB as part of the restore?” He was concerned, because CHECKDB is a pretty hefty operation and this could really impact the restore time if he had to wait on a CHECKDB to complete. But the other confusing thing was that the date for the CHECKDB didn’t match up with the restore timing.
Click through to learn the answer.
Whenever I’m teaching about recovering from corruption, I always stress that if the REPAIR_ALLOW_DATA_LOSS option of DBCC CHECKDB (which I’ll just call ‘repair’ from now on) has to be used, you should do the following:
- Run another DBCC CHECKDB to make sure that repair fixed everything and no new problems were exposed (by the repair fixing something that was blocking DBCC CHECKDB's access to a portion of the database); see the sketch after this list
- Run DBCC CHECKCONSTRAINTS on any tables or indexes that were repaired and are involved in constraints (stated in Books Online too)
- Reinitialize any replication subscriptions that are based on tables that were repaired (stated in Books Online too)
- Reevaluate your HA/DR strategy so you don’t have to use repair again in future
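A minimal sketch of those first two steps (the database name is hypothetical; repair requires single-user mode, and you should back up first since data loss is possible):

-- Last resort only: REPAIR_ALLOW_DATA_LOSS can delete data to restore consistency
ALTER DATABASE CorruptDb SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DBCC CHECKDB (CorruptDb, REPAIR_ALLOW_DATA_LOSS) WITH NO_INFOMSGS, ALL_ERRORMSGS;
ALTER DATABASE CorruptDb SET MULTI_USER;

-- Step 1: run CHECKDB again to confirm the repair fixed everything and exposed nothing new
DBCC CHECKDB (CorruptDb) WITH NO_INFOMSGS, ALL_ERRORMSGS;

-- Step 2: re-check constraints (run in the repaired database; ALL_CONSTRAINTS checks every one)
USE CorruptDb;
DBCC CHECKCONSTRAINTS WITH ALL_CONSTRAINTS;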
One question I’m often asked is why replication can be broken by running repair.
Read on for the answer.
Running this statement interactively doesn’t return any data; it just loads the data into DBADatabase.dbo.DBCCMemoryStatus. Running the commented-out SELECT at the bottom of the script as written will query that table for all rows with the counter VM Reserved (virtual memory reserved), but there is much more data than that available if you modify the SELECT.
This query can be dropped into a SQL Agent job step as-is and it will run. Just like the interactive run, it will create the database and permanent table if they don’t exist and then store those nuggets of data in the permanent table for later use; you never know when you may need them!
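For illustration, the follow-up query looks something like this (the column names are my assumption about the table's schema, not necessarily the script's exact definition):

-- Pull the captured virtual-memory-reserved rows back out of the permanent table
SELECT CaptureDate, Section, Counter, Value
FROM DBADatabase.dbo.DBCCMemoryStatus
WHERE Counter = 'VM Reserved'
ORDER BY CaptureDate;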
Click through for the script.
Wednesday I walk into the office and immediately hear that CHECKDB is the source of issues on one of the servers and is the reason behind some errors that have been happening. While I don’t think this is the case (it might look like it on the surface, but something else is the actual cause), I also wanted to find out whether CHECKDB was running at the time the errors occurred.
I needed information on when CHECKDB ran for each database. When you look for what you can run to find when CHECKDB was last run, you find this blog post and also this blog post on grabbing this info. While these were very informative, they were for one database at a time. I needed this for all the databases so I could not only find out when each one ran, but also use those timestamps to figure out the duration.
The big recommendation I’d make with regard to this is not to use sp_msforeachdb. Otherwise, click through for a good script.
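For reference, one way to get the last-known-good CHECKDB date for every database without sp_msforeachdb is a plain cursor over sys.databases feeding the undocumented DBCC DBINFO (a sketch, not necessarily the script behind the link):

-- Holds DBCC DBINFO output; dbi_dbccLastKnownGood is the field we want
CREATE TABLE #DBCCResults
(
    DatabaseName sysname NULL,
    ParentObject varchar(255),
    [Object] varchar(255),
    Field varchar(255),
    [Value] varchar(255)
);

DECLARE @db sysname, @sql nvarchar(400);

DECLARE db_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM sys.databases WHERE state_desc = 'ONLINE';

OPEN db_cur;
FETCH NEXT FROM db_cur INTO @db;

WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'DBCC DBINFO(' + QUOTENAME(@db, '''') + N') WITH TABLERESULTS';

    INSERT #DBCCResults (ParentObject, [Object], Field, [Value])
    EXEC (@sql);

    -- Tag the rows we just captured with the database they came from
    UPDATE #DBCCResults SET DatabaseName = @db WHERE DatabaseName IS NULL;

    FETCH NEXT FROM db_cur INTO @db;
END

CLOSE db_cur;
DEALLOCATE db_cur;

SELECT DatabaseName, [Value] AS LastKnownGoodCheckDb
FROM #DBCCResults
WHERE Field = 'dbi_dbccLastKnownGood'
ORDER BY DatabaseName;

DROP TABLE #DBCCResults;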
DBCC CHECKDB has the ability to perform parallel checking of objects. However, this depends entirely on the edition of SQL Server: it only happens when using Enterprise Edition.
Let’s see this in action. I propose the following tests for this blog post:
- Test on SQL Server Enterprise Edition.
- Test on a non-enterprise edition of SQL Server.
I don’t have 2017 Enterprise at hand, but I do have 2014 Enterprise and Express handy, so it makes sense to use these versions for my “experiment”.
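One way to eyeball the difference yourself (a sketch, not necessarily the author's method): start DBCC CHECKDB in one session and, from a second session, count the tasks servicing it.

-- While CHECKDB runs elsewhere, count its worker tasks;
-- expect several on Enterprise Edition and a single task on Express
SELECT r.session_id, r.command, COUNT(t.task_address) AS task_count
FROM sys.dm_exec_requests AS r
JOIN sys.dm_os_tasks AS t
    ON t.session_id = r.session_id
WHERE r.command LIKE 'DBCC%'
GROUP BY r.session_id, r.command;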
Read on to see the difference.
If you know about DBCC CHECKDB then most likely you will know about DBCC CHECKTABLE. Quite simply, this command performs primitive system-catalog consistency checks, per-table consistency checks on the single table specified, and cross-table consistency checks on indexed views that reference the specified table. (Microsoft SQL Server Internals 2012, Chapter 14, Page 899, Paul Randal)
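The basic form is simple (the table name here is hypothetical):

-- Check one table's consistency, including its indexes; show all errors, hide informational chatter
DBCC CHECKTABLE ('dbo.Orders') WITH NO_INFOMSGS, ALL_ERRORMSGS;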
There is one concept about this command that I did not know about until recently. Intrigued? Read on.
I was going to say “read on” but Arun already used that line.
I was exchanging messages with Azure Support, and even though I didn’t get a concrete answer to confirm this, I ended up asking the question within a Microsoft-based Yammer group: yes, they do automatically carry out consistency checks.
This is great: it is one less thing for me to worry about, and if there is serious corruption with potential data loss (which would be rare), they will definitely tell you and work with you.
However, that doesn’t mean you CAN’T run it. I was curious, so I ran DBCC CHECKDB on my Azure SQL Databases; as with any other consistency check, it is best to do it during OFF-PEAK hours. I would probably take it a step further and not even bother running it.
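If you do want to try it anyway, it is the same command as on-premises, run in the context of the Azure SQL Database in question:

-- Scoped to the current database; best saved for off-peak hours
DBCC CHECKDB WITH NO_INFOMSGS;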
It’s an interesting post, reminding us that administering an Azure database isn’t the same as on-prem.
Looking in the SQL Error Log, I found hundreds of these message combinations in the minutes immediately preceding the job failure:
The operating system returned error 665(The requested operation could not be completed due to a file system limitation) to SQL Server during a write at offset 0x000048a123e000 in file 'E:\SQL_Data\VLDB01.mdf:MSSQL_DBCC17'. Additional messages in the SQL Server error log and system event log may provide more detail. This is a severe system-level error condition that threatens database integrity and must be corrected immediately. Complete a full database consistency check (DBCC CHECKDB). This error can be caused by many factors; for more information, see SQL Server Books Online.
Error: 17053, Severity: 16, State: 1.
E:\SQL_Data\VLDB01.mdf:MSSQL_DBCC17: Operating system error 665(The requested operation could not be completed due to a file system limitation) encountered.
Read on for more information, including a rough idea of how much space the snapshot requires as well as a few workarounds and hints.
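One frequently cited workaround for error 665 (a sketch; the logical file name and paths are hypothetical) is to skip the hidden snapshot on the constrained volume by creating your own database snapshot on a roomier drive and checking that instead:

-- Create a user-defined snapshot on a volume that can better handle the sparse file
CREATE DATABASE VLDB01_checksnap ON
    (NAME = VLDB01_data, FILENAME = 'F:\Snapshots\VLDB01_checksnap.ss')
AS SNAPSHOT OF VLDB01;

-- Checking the snapshot means no internal snapshot is created on E:
DBCC CHECKDB ('VLDB01_checksnap') WITH NO_INFOMSGS, ALL_ERRORMSGS;

DROP DATABASE VLDB01_checksnap;

Running DBCC CHECKDB WITH TABLOCK, which uses locks instead of a snapshot at the cost of blocking, is another commonly mentioned option.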
An interesting situation was discussed online recently which prompted me to write this post. A fellow MVP was seeing periodic corruption messages in his error log, but DBCC CHECKDB on all databases didn’t find any corruptions. A subsequent restart of the instance caused the problem to go away.
My diagnosis? Memory corruption. Something had corrupted a page in memory – maybe it was bad memory chips or a memory scribbler (something that writes into SQL Server’s buffer pool, like a poorly-written extended stored procedure), or maybe a SQL Server bug. Whatever it was, restarting the instance wiped the buffer pool clean, removing the corrupt page.
So why didn’t DBCC CHECKDB encounter the corrupt page?
Read on for the rest of the story.
Let me explain a few things about the script. I am getting the path of the current trace file and placing it into a variable. The current file name will almost certainly have a suffix of _nn just before the .trc extension. If I were to run the script as-is, I would only be reading the current log file and not the other four that preceded it. If all you care about is the current log file then fine, but most will want to search all the existing log files. One way to do this is to simply replace the current file name with just log.trc and use default as the second parameter, as I did above in the fn_trace_gettable function. The default parameter value tells the function to read all files from that one onward. Even though log.trc doesn’t actually exist, the function knows how to handle it and reads all of the existing trace files in order.
So if the string that we search on (here we use ‘dbcc free%’) is in any of the files, it will return the matching rows. You may have to adjust the wildcards and such, but I think you get the idea. Again, remember that the data is transient, so always look at the StartTime column in the logs to ensure you know which date and time range you are looking at. You can do something like this, but I will leave that up to you:
SELECT MIN(StartTime) AS [Begin], MAX(StartTime) AS [End] FROM sys.fn_trace_gettable(@Path, default);
A word of caution: I never bothered to see just how resource-intensive this function is. While I don’t expect any issues with normal use, it is not something you want to be running searches with every second. Be sensible and you should have no problems.
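Putting the pieces together, a sketch along the lines described (not necessarily Andy's exact script; the string manipulation assumes the standard default-trace file naming):

DECLARE @Path nvarchar(260);

-- Find the current default trace file, then swap its log_nn.trc name for log.trc
SELECT @Path = REVERSE(SUBSTRING(REVERSE(path), CHARINDEX(N'\', REVERSE(path)), 260)) + N'log.trc'
FROM sys.traces
WHERE is_default = 1;

-- default as the second parameter reads all existing trace files from that one onward
SELECT StartTime, TextData, DatabaseName, HostName
FROM sys.fn_trace_gettable(@Path, default)
WHERE TextData LIKE 'dbcc free%'
ORDER BY StartTime;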
Click through for more details, including the script Andy uses to do this search.