Deleting Lots Of Data

Kenneth Fisher wants to delete a lot of rows:

I recently had the task of deleting a bit over a billion rows from a table. Now I could have done just this:

DELETE FROM tablename WHERE createdate >= '1/1/2017'

But I have a few problems here. The table has no index on createdate, potentially causing problems with tempdb (the sort on createdate). Although in this case tempdb is pretty large because of some large batch work done at various times. I’m also going to be deleting > billion rows of ~6 billion which is probably going to fill up the log of the database (which fortunately isn’t in use yet) and end up rolling back my delete anyway. Even if I don’t fill up the log, I’m still going to bloat it pretty badly (autogrowth). And last, and anything but least, this is on a production server. Even if this database was on its own drive (meaning growth of the log can’t cause a problem with any other databases) that tempdb thing (let alone other resource usage) is going to be an issue.

Read on to see how to delete in batches.  My pattern is to have an explicit transaction within the WHILE loop, opening and closing for each deletion operation.  That has worked pretty well in the past when deleting large numbers of rows from a table.  It might also make sense to put a temporary filtered index on the table, dropping it afterward.

Related Posts

Running The SQL Server Features Discovery Report

Dave Mason shows us how to run the SQL Server features discovery report via command prompt and PowerShell: I don’t need to validate SQL Server installations on a regular basis. When the need arises, my preference is to run the SQL Server features discovery report. Further, I prefer to run it from the command line. After […]

Read More

Configuring Telegraf To Monitor Windows Servers

Tracy Boggiano continues her series on configuring Telegraf: The solution for this is fairly simple now that you have setup Part 1 of this series.  You can download the Windows conf file for Telegraf from my presentation.  Below are the important pieces of the file. The main part of the OUTPUT PLUGINS being to place the data […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *


April 2018
« Mar