Press "Enter" to skip to content

Author: Kevin Feasel

Multi-Threading with dbatools

Andy Levy has some lessons learned:

Over the summer, I spent some (a lot of) time working on updates to a script at work which runs multiple processes in parallel. Everything seemed to work OK for a while, but then everything broke. It broke right around the time dbatools 1.1 dropped, so I started thinking that something must have changed there. Turns out, it was entirely my fault and I hope this post will help you avoid the same trap.

Don’t fall into the same traps Andy did; read the whole thing.

Comments closed

More Efficient Pivoting

Dave Mason is on the hunt:

While working with some poorly performing code in T-SQL that used a PIVOT operator, I wondered if there was a more efficient way to get a result set of pivoted data. It may have been a fool’s errand, but I still wanted to try. It dawned on me that I could use the STRING_AGG() function to build a delimited list of pivoted column names and values. From there, I’d have to “shred” the delimited data to rows and columns. But how?

Read on to see how.

Comments closed

Adding an Animated GIF to Power BI Reports

Ed Hansberry works around a Power BI limitation:

It is easy to add an animated GIF to your Power BI Reports. However, if you just add it as an image, it won’t animate. You’ll just get a static image.

Animated GIFs can be useful in reports to explain new features to users, or on hidden developer pages showing how certain things were done.

Click through for instructions on how to include an animated GIF on your Power BI report. Just make sure to pronounce it the right way.

Comments closed

Compressing Multiple Files into an Archive with Powershell

Patrick Gruenauer zips all the files:

Compressing files is a common task. For saving time, you can use PowerShell to automate the compression process. In this blog post I will show you how to compress multiple folders at once with the PowerShell Cmdlet Compress-Archive. The compressed zip files will be stored separately in a specific file. Let’s dive in.

Click through for a script which compresses each folder in its own zip file.

Comments closed

Push-Based Shuffle in Apache Spark 3.2 via Project Magnet

Venkata Krishnan Sowrirajan and Min Shen announce that Project Magnet will be in Apache Spark 3.2:

Push-based shuffle is an implementation of shuffle where the shuffle blocks are pushed to the remote shuffle services from the mapper tasks in order to address shuffle scalability and reliability issues. In a nutshell, with push-based shuffle, a large number of small, random reads is converted into a small number of large, sequential reads, which significantly improves disk I/O efficiency and shuffle data locality.

This is explained in greater detail in an earlier blog post, Magnet: A scalable and performant shuffle architecture for Apache Spark, which you can read for more information about how we achieve push-based shuffle.

Read on to see when this matters and how you can make use of it once you’re in Spark 3.2 (whose first release was exactly two weeks ago, October 13th).

Comments closed

Connection Leaks with MARS

Josh Darnell warns that, if you go to MARS, a doctor will warn you that you have a schizoid embolism and it will be up to you to determine whether the doctor is lying or not:

I recently looked at a SQL Server instance that had a large number of MARS connections under a single “parent” connection. Most of these “child” connections had been idle for quite a while, but they were still hanging around.

Read the whole thing. Because I’ve used MARS so little, I’ll instead add a follow-up point to my Total Recall reference above. In the commentary track for the movie, director Paul Verhoeven notes that Douglas Quaid actually did die and that it really was just a fantasy concocted in his mind and he really did die at the end. You can tell because, instead of a fade to black like normal movies, he fades to white, indicating that this wasn’t a proper ending. But then again, considering the follow-on media which happened (and was slated to happen but didn’t make it to the finish line), I don’t think the studios would have let Verhoeven keep his unhappy ending.

Comments closed

Caching Strategies with Redis

Camilo Reyes shares performance data from four Redis caching strategies:

Redis is a cache database that stores documents in memory. The data store has a key-value pair lookup with O(1) time complexity. This makes the cache fast and convenient because it does not have to deal with complex execution plans to get data. The cache service can be trusted to find a cache entry with a value in almost no time.

When datasets in cache begin to grow, it can be surprising to realize that any latency is not a Redis issue. In this take, I will show you several strategies to cache Redis data structures then show what you can do to pick the best approach.

Read on for the contenders and how they do. ProtoBuf’s results on small datasets surprised me.

Comments closed

Serverless SQL Pool CI/CD

Kevin Chant doesn’t have time for manual deployments:

I want to cover one way you can do CI/CD for Azure Synapse Analytics serverless SQL pools using Azure DevOps in this post. Because I know it is a popular topic.

It’s related to my post about how you can create a dacpac for an Azure Synapse Analytics dedicated SQL pool using Azure DevOps. Since they are both based in the same service.

Plus, a while ago I wrote about the increase in demand for Data Platform automation. So, I really wanted to do a post about how you can do CI/CD for Azure Synapse Analytics serverless SQL pools.

Read on to learn how.

Comments closed