Category: Administration

Approaches to Deleting Data in Batches

Andy Mallon shares a couple approaches to deleting data in batches:

In this scenario, we’re going to keep the data for X days after it’s created. Then we delete it. That’s it. X could be 3 days or 3 years–it doesn’t matter, we’ll follow the same design pattern.

In today’s world, we generate loads of log data, sensor data, telemetry data, etc. All that data is super duper valuable. But only for a while. Eventually, all that granular data becomes less useful, and isn’t worth keeping around. Maybe it gets aggregated, summarized, or maybe it just gets thrown out.

You’ll have a lot of data with more complex requirements, but I think you’ll also be surprised at how much data has simple date-based retention based on its creation.

Also read the comments, as they include additional techniques.
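If you haven’t seen the pattern before, here is a rough sketch of what batching that kind of delete can look like (my illustration, not necessarily Andy’s exact approach), assuming a hypothetical dbo.Transactions table with a CreateDate column and a 90-day retention window:

DECLARE @BatchSize int = 5000;
DECLARE @Cutoff datetime2 = DATEADD(DAY, -90, GETDATE());

WHILE 1 = 1
BEGIN
    -- Delete one small chunk at a time to keep locking and log growth in check
    DELETE TOP (@BatchSize) t
    FROM dbo.Transactions AS t
    WHERE t.CreateDate <= @Cutoff;

    -- Stop once a pass deletes fewer rows than a full batch
    IF @@ROWCOUNT < @BatchSize
        BREAK;

    -- Optional: give other workloads a moment between batches
    WAITFOR DELAY '00:00:01';
END;

Keeping each batch small is the whole point: the statement never escalates to a huge, long-running transaction.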

Managing Kubernetes Resources

Vincent-Philippe Lauzon takes us through some thoughts on Kubernetes resource allocation:

In this article we will look at how to inform Kubernetes about pods’ resources and how we can optimize for different scenarios.

A scenario that typically comes up is when a cluster has a bunch of pods where a lot of them are dormant, i.e. they don’t consume CPU or memory. Do we have to carve them a space they won’t use most of the time? The answer is no. As usual, it’s safer to provision capacity for a workload than to rely on the optimistic heuristic that not all workloads will require resources at the same time. So, we can configure Kubernetes optimistically or pessimistically.

Read the whole thing.

Deleting In Batches

Andy Mallon explains why deleting in batches is a good idea:

Yesterday, I wrote about the importance of planning your data purges. Today, let’s look at the simplest requirement for purging data: Keep data for X days after it’s created. There are tons of cases where this is the rule. Log data, telemetry data, even transactional data is often retained based on X days since it was created.

Given how common this scenario is, let’s talk in a bit more detail about how I like to go about deleting it. Deleting data is easy, right?

DELETE t
FROM dbo.Transactions AS t
WHERE CreateDate <= DATEADD(dd, -90, GETDATE());

This part of the series covers the why; the next part will cover the how.

Row Versioning and 14 Bytes

Kendra Little explains why enabling row versioning adds 14 bytes per row:

I love it when someone sends me a repro script, but in this case I didn’t need to run it. The first thing I did was to look at the two numbers given for row size, and to subtract the smaller one from the larger one: 724 – 710 = 14 bytes of difference.

That bit of information alone gave me an immediate guess of what was going on.

Click through for the solution as well as a more detailed explanation of one of the trickier scenarios.
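If you want to see the 14 bytes for yourself, here is a minimal repro sketch (the table, column, and sizes are my own invented example, and it assumes a test database you can switch to READ_COMMITTED_SNAPSHOT):

CREATE TABLE dbo.VersionDemo (id int IDENTITY(1,1) PRIMARY KEY, payload char(700) NOT NULL);
INSERT INTO dbo.VersionDemo (payload) VALUES (REPLICATE('x', 700));

-- Baseline row size
SELECT avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.VersionDemo'), NULL, NULL, 'DETAILED')
WHERE index_level = 0;

-- Turn on a row-versioning-based isolation level, then touch the row
ALTER DATABASE CURRENT SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;
UPDATE dbo.VersionDemo SET payload = payload;

-- Row size again: expect the baseline plus the 14-byte version pointer
SELECT avg_record_size_in_bytes
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.VersionDemo'), NULL, NULL, 'DETAILED')
WHERE index_level = 0;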

Putting TempDB Files On Azure IaaS D Drive

John McCormack tries out using the temporary drive on Azure VMs for tempdb:

Azure warns you not to store data on the D drive in Azure VMs, but following this advice could mean you are missing out on some very fast local storage. It’s good general advice because this local storage is not permanently attached to your instance, meaning you could lose data or log files if your VM is stopped and restarted. But what if you could afford to lose certain files? Say, files that are recreated during startup anyway.

TempDB is the ideal candidate for this. No other database is suitable! Putting the tempdb data and log files onto D drive can be achieved quite easily with a little bit of effort. And you will most likely see a big improvement in tempdb read/write latency.

John ended up seeing much bigger gains than I did when I tried this, but with a difference that big, it’s definitely worth using the temporary drive for tempdb.
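The T-SQL side of the move is small. Something like the following does it (the D:\TempDB folder is an assumption, and the logical file names shown are the defaults; the folder must exist, or be recreated by a startup task, before the instance restarts, or SQL Server will not start):

-- Point tempdb's files at the local (temporary) D: drive; takes effect at the next restart
ALTER DATABASE tempdb MODIFY FILE (NAME = tempdev, FILENAME = 'D:\TempDB\tempdb.mdf');
ALTER DATABASE tempdb MODIFY FILE (NAME = templog, FILENAME = 'D:\TempDB\templog.ldf');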

Sizing Azure SQL Database

Arun Sirpal takes us through finding the right size for Azure SQL Database:

Do you want to identify the correct Service Tier and Compute Size (once known as performance level) for your Azure SQL Database? How would you go about it? Would you use the DTU (Database Transaction Unit) calculator? What about the new pricing model, vCore? How would you translate your current on-premises workload to the cloud?

It can be a form of trial and error, especially if you are new to this, but I really do recommend trying out the PowerShell script that you can access once you have installed DMA (Database Migration Assistant).

Read on to see how to run this tool and potentially save some money.
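Once a tier is in place, one quick sanity check I like (separate from the DMA approach Arun covers) is sys.dm_db_resource_stats, run inside the database itself; it reports resource use as a percentage of the current tier’s limits in roughly 15-second slices for about the last hour:

SELECT TOP (20)
       end_time,
       avg_cpu_percent,
       avg_data_io_percent,
       avg_log_write_percent,
       avg_memory_usage_percent
FROM sys.dm_db_resource_stats
ORDER BY end_time DESC;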

Cleaning Up After Yourself in Azure Data Factory

Rayis Imayev shows how you can automatically delete old files in Azure Data Factory:

File management may not be at the top of my list of priorities during data integration projects. I assume that once I learn enough about the source data systems and the target destination platform, I’m ready to design and build a data integration solution between two or more connecting points. Then a historical file management process becomes a necessity, or a need arises to log and remove some of the incorrectly loaded data files. Basically, a step in my data integration process to remove (or clean) such files would be helpful.

Click through to see how to do this.

Retaining a Few Tables From a Large Set

Jana Sattainathan has a PowerShell-based solution to eliminate all but a few tables in a database:

Recently, I received a request to back up a dozen or so tables out of 12 thousand tables. I had to retain all the indexes, statistics, etc. The goal was to hand this over to the vendor for analysis as a database backup.

I could have copied the selected tables over to a new database using the PowerShell function I had published earlier and backed that up, but since the tables to back up were quite large, I skipped that route.

Read on to see Jana’s solution.

Fixing High VLF Counts

Ajay Dwivedi shares a technique for optimizing VLF counts on log files:

DBAs! I guess everyone knows that a huge number of Virtual Log Files (VLFs) in SQL Server can slow down the backup/restore and database recovery processes. In rare cases, it can even introduce slowness at the transaction level.
https://sqlperformance.com/2013/02/system-configuration/transaction-log-configuration

Even though we are all aware of it, it is still a challenge to remove high VLF counts from SQL Server log files, since it involves a lot of manual effort in shrinking and re-growing the log files.

This is where my ‘Space-Capacity-Automation’ open source project comes to your rescue. It has a parameter option @optimizeLogFiles that can help you optimize your log files by performing a series of tasks.

Click through to see Ajay’s technique.
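For context, the manual version of what the script automates looks roughly like this (database name, logical log file name, and the target size are placeholders; sys.dm_db_log_info requires SQL Server 2016 SP2 or later, and older versions can use DBCC LOGINFO instead):

-- How many VLFs does the log currently have?
SELECT COUNT(*) AS vlf_count
FROM sys.dm_db_log_info(DB_ID('YourDatabase'));

USE YourDatabase;

-- Shrink the log as far as possible (may need a log backup or two first),
-- then grow it back in one or a few large steps to keep the VLF count low
DBCC SHRINKFILE (YourDatabase_log, 1);

ALTER DATABASE YourDatabase
    MODIFY FILE (NAME = YourDatabase_log, SIZE = 8000MB);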
