Press "Enter" to skip to content

Author: Kevin Feasel

High-Performance ETL via Buffer Table

Daniel Hutmacher needs things to zoom:

It’s almost like a myth – one that I’ve heard people talk about, but never actually seen myself. The “shock absorber” is a pretty clever data flow design pattern to ingest data where a regular ETL process would choke on the throughput or spikes. The idea is to use a buffer table to capture incoming data, and then run an asynchronous process that loads that data in batches from the buffer into its intended target table.

While I’ve seen whitepapers and blog posts mention the concept loosely along with claims of “7x or 10x performance”, none of them go into technical detail on how it’s done, so I decided to try my hand at it.

I’ve compiled my findings, along with some pre-baked framework code if you want to try building something yourself. Professional driver on closed roads. It’s gonna get pretty technical.

Combine that with Eitan Blumin’s post yesterday and you’d think it were buffer week.

This shock absorber pattern works well for warehouse loading, especially when you’re trickle-loading data into columnstore indexes and don’t want to have open rowgroups slowing everything down.

Comments closed

Renaming a YAML Pipeline in Azure DevOps

Hamish Watson figures out what’s in a name:

I had created a pipeline using YAML – which was called InfrastructureAsCode as the YAMP file was in the root directory.

However I wanted to move it into a folder .\InfrastructureAsCode\pipelines\… and run the YAML file from there – as I would have a non-prod and PROD version of them (as the schedule was different for each).

Click through to see how Hamish was able to resolve this.

Comments closed

SQL Server Baselines with the TIG Stack

Mark Wilkinson combines Telegraf, InfluxDB, and Grafana:

Lots of folks wonder why I would go through the trouble of building out a system when so many vendors have already solved the problem of collecting baseline metrics. The answer at the time was simple: cost. With my setup I could monitor close to 600 instances (including dev) for $3,000 USD per year. That includes data retention of ~2 years! Are there some administration costs as far as my time is concerned? Of course. In the begining things were a little rough as I learned more about InfluxDB, but once things were configured correctly the most work I’ve had to do is to expand the size of the data drive as we started collecting more metrics.

Click through for more info and check out the GitHub repo.

Comments closed

Power BI Report Server and Query Authentication

Emanuele Meazzo shows us the power of configuration:

With the end of the IE support for Power Bi (and in general tbh), companies are scrambling finally to move their users from the legacy browser to modern ones; it was about time if you ask me.
However, there’s an edge case where using anything but IE is not as straightforward as it could be; in my case Power Bi RS worked fine for any report in any browser, except with direct query reports that were set up to authenticate via Windows Authentication as the user viewing the report:

Read on to see how to fix this so that it works well in browsers like Edge and Chrome.

Comments closed

Identify SQL Server Configuration Drift

Garry Bargsley won’t let the cattle become pets:

As you sit and wonder about when the next Star Wars movie is going to come out, do you ever get the thought of “I wonder if all my SQL Servers are configured the same?”

Occasionally, I get a thought like that run through my mind. Or I might see something on Twitter or Blog post about something, and it sparks the question.  Not about Star Wars, but my SQL Server environment.

Today something triggered me to confirm that all of my SQL Servers had the default backup compression setting set to enabled.

Garry mentions dbachecks at the end, and it’s a really good way of performing a fairly large number of such checks easily.

Comments closed

Filtering Unique Objects in Powershell

Jeffrey Hicks solves an interesting problem:

A few weeks ago my friend, Gladys Kravitz, was lamenting about a challenge related to filtering for unique objects. PowerShell has a Get-Unique cmdlet, and Select-Object has a -Unique parameter, but these options are limited. On one hand, I’d say most things we manage with PowerShell are guaranteed to be unique. Objects might have a GUID , ID, or SID, to guarantee uniqueness. But, and this is Gladys’ situation, sometimes the things we are managing come from an external source. Such as importing data from a CSV file. In this situation, it is definitely possible to have duplicate objects.

This turns out to be a bit harder than expected.

Comments closed

Reviewing the ConstantCare Population Report

Brent Ozar surveys the landscape:

Ever wonder how fast people are adopting new versions of SQL Server, or what’s “normal” out there for SQL Server adoption rates, hardware sizes, or numbers of databases? Let’s find out in the summer 2021 version of our SQL ConstantCare® population report.

Click through for Brent’s findings. It’s interesting to compare these against Steve Stedman’s findings, and these come with the same caveat about population.

Comments closed

Buffering Events in SQL Server

Eitan Blumin has a technique to reduce expensive upserts:

Do you find yourself facing performance problems and long lock chains caused by very frequent INSERT, UPDATE, or DELETE statements being executed on a table? Check out this neat trick that could help you out and make all the difference in the world.

Okay, I admit that title ended up being quite long. But I wanted something that could be easily found in search engines by people facing similar problems.

I’ve done something similar, though without the partition switch and instead deleting batches into a temp table. This is a good example of something I like to say about scalable processes in T-SQL: many times, the most scalable technique involves a mental pivot (and sometimes a literal pivot, such as using tally tables to work with string data) of the straightforward answer.

1 Comment