Press "Enter" to skip to content

Month: August 2022

A Wake for Distributed Replay

Grant Fritchey thinks about what could have been:

Honestly, sincerely, no kidding, I love Distributed Replay. Yes, I get it. Proof positive I’m an idiot. As we needed proof. To be a little fair to me, I love what Distributed Replay could have been, with a little more love. However, fact is, it’s on the deprecation list for 2022. Which means, what minimal amount of love, if any, that Microsoft was giving to it, it’s all gone, forever. Unlike the Little Engine That Could, turns out that Distributed Replay was the Little Engine That Almost Could, But Didn’t. Really Didn’t. Let’s discuss it a bit.

I liked the concept a lot but the tooling was so finicky and there were just so many built-in assumptions that tended to fall apart in real life. Grant actually got it to work outside of toy environments, which was one step further than I could ever get.

Comments closed

Attaching All SQL Server Data Files in a Single Directory

David Fowler migrated a bunch of databases:

Have you ever had the need to attach a large number of database in one go? There’s no way to attach multiple databases in SSMS or via script, so you’re probably going to be left with the slow, arduous task of doing them one by one.

I recently had to deal with a DR situation (I won’t go into details of what happened just yet as things are still quite sensitive, but I might look at it at some point in the future) where I faced exactly that issue. For one reason or another I needed to attach several hundred databases quickly. I didn’t fancy doing that via SSMS or script each one individually so I knocked together this script to do the job for me.

Click through for that script and instructions. Alternatively, a bit of Powershell and the right dbatools command could get you to the same result but this is good in the event that you can’t leave SSMS.

Comments closed

Comparing Pre- and Post-Event in Power BI

Marco Russo and Alberto Ferrari do a bit of comparative analysis:

Many article ideas come from questions we receive from our customers and readers, and this article is no exception. The requirement is to analyze the sales volume before and after a selected time period, to a specific customer segment. In this case we define the segment as the brand that the customers have purchased in the past. For example, considering the customers who purchased Contoso-branded products in February 2018, what is the Sales Amount of Contoso and other products sold to those same customers 180 days before and after February 2018?

Read on to find out.

Comments closed

Solving the Traveling Salesman Problem in R

Tomaz Kastrun gives us a solution to the Traveling Salesman Problem:

Travelling Salesman Problem is an NP-complete problem and an old mathematical problem. For this useless function, we will look for the nearest city from the previous city (or starting point) and repeat until we visit all cities. The greedy solution is fairly simplified but one disadvantage; it might not give you the best path (optimal solution) and proving that the solution is correct is an additional issue 

As Tomaz notes, this is not guaranteed to be the best solution, just a solution. Considering that TSP is NP-hard, if Tomaz did have a globally optimal solution for us, he certainly wouldn’t be calling it ‘useless-useful’ but instead would be calling it “My prize-winning algorithm.”

Comments closed

Building a Data Serving API in Azure

Justice Zishanhi has some recommendations for serving data in Azure:

Data is an important asset to all organizations big and small. As these organizations mature, building an end-to-end data platform to enable BI and AI at scale has become part of that journey. Some organizations, have the requirement to expose modelled data in a data warehouse or data lake (Azure Data Lake Storage Gen2) to downstream consumer applications (mobile or web apps) where access patterns can be unpredictable in respect to frequency of access and/or type of data that is requested.

Data warehouse engines and data lakes are not designed for singleton transactional (request / response) interactions.  To serve these requests at scale and to meet the different SLAs and access pattern unpredictability, data needs to be offloaded to a suitable database engine (i.e., a caching layer) that is built to serve such queries.  

The “Design Patterns” section of this article highlights a generalized pattern for implementing a data serving API which meets this requirement – consisting of a Data Platform component and an API component. For implementing the API, two patterns are commonly adopted – a synchronous pattern or an asynchronous pattern. Both are explored in the “API Implementation Patterns” section of this article.

The example focuses on Cosmos DB and provides quite a bit of helpful guidance.

Comments closed

Converting a Referenced Query Column to a Table

Erik Svensen does a bit of magic:

Let’s imagine I have a query with a column called Reportid – and I want to create another query with the unique values from the column.

Normally I would create a new blank query and than reference the column with – queryName[ColumnName]

And then there are a bunch of other steps which eventually lead you to the result. Or you can go straight to easy mode. That sounds like a pretty good idea to do instead.

Comments closed

SQL Server 2022 and S3 Object Integration

Anthony Nocentino does some PolyBase work:

In this blog post, I’ve implemented two example environments for using SQL Server 2022’s s3 object integration. One for backup and restore to s3 compatible object storage and the other for data virtualization using Polybase connectivity to s3 compatible object storage. This work aims to get you up and running as quickly as possible to work with these new features. I implemented this in Docker Compose since that handles all the implementation and configuration steps for you. The complete code for this is available on my GitHub repo…I’m walking you through the implementation here in this post.

In my post Setting up MinIO for SQL Server 2022 s3 Object Storage Integration we did this step by step at the command line. Using Docker Compose it will do all the hard work for you and you can get up and running fast.

Let’s walk through what you’ll get in each environment.

Read the whole thing, especially the note at the bottom about minimum RAM requirements on the Docker container.

Comments closed

Custom Infix Functions in R

Dominik Rafacz loves infix functions:

Custom infix functions are one of my favorite features in R. This article is my love letter to them. But first, a quick recap.

For those unfamiliar with the terminology, infix function is a function fun which is called using infix notation, e.g., x fun y instead of fun(x, y). Those functions are also called infix operators by base R, and I will use those terms and name infixes interchangeably. There are a lot of infix operators in base R used very frequently, i.e., arithmetic or logical operators. We use them so often that we usually forget that they are functions. And that we can call them just like regular functions.

Infix functions are something I tend to forget entirely about developing on my own but they can be extremely useful, as Dominik shows. H/T R-Bloggers.

Comments closed