2023-10-27 – Curated SQL

Building a Bland-Altman Plot in R

Published 2023-10-27 by Kevin Feasel

Before we dive into the code, let’s briefly understand what a Bland-Altman plot is. It’s a graphical method to visualize the agreement between two measurement techniques, often used in fields like medicine or any domain with comparative measurements. The plot displays the differences between two measurements (Y-axis) against their means (X-axis).

Click through to see how this works and how you can interpret the results.

Comments closed

Trying out Batch Mode on Rowstore

Published 2023-10-27 by Kevin Feasel

Etienne Lopes has some fun with a feature:

Before 2012, creating analytical queries (that usually scan many rows and have lots of aggregations) from big OLTP databases to feed real-time based reports used in decision making processes, could be quite challenging. Then ColumnStore Indexes arrived and they’ve been enhanced overtime, offering amazing gains both in performance and storage.

Unfortunately, regarding pure OLTP databases, there are many situations in which ColumnStore Indexes can’t (or won’t) be used. There are some great performance enhancements present in columnstore that’s for sure and today I’m going to speak about one that became automatically available since SQL Server 2019 for “traditional” RowStore tables. It’s called “Batch Mode on Rowstore” and it can really boost some of our analytical queries over the “traditional tables” without any effort from our side!

There are a series of specific rules you need to hit but if you hit them, I’ve noticed about a 3x performance gain with you doing nothing at all.

Comments closed

Analyzing Dependencies in Microsoft Fabric Measures

Published 2023-10-27 by Kevin Feasel

Sandeep Pawar has an interesting library for us:

In my previous blog post, I introduced Semantic-Link, discussing its use cases and explained how it enables us to create solutions that were either not possible or not easily achievable before. In this blog post, I would like to present another powerful use case that, although possible in the past, could not be created and used seamlessly in Power BI. Allow me to introduce the MeasureMaze Python library, which helps uncover insights from a complex maze of dependencies in a Power BI semantic model using Semantic-Link and the power of network analysis.

Sandeep dives into the approach and what you get out of this library. It’s very interesting.

Comments closed

Automating SQL Dump Analysis

Published 2023-10-27 by Kevin Feasel

Sean Gallardy has been busy:

One of the things that is lacking in the community is for some simple dump analysis for SQL Server, well no longer!

Introducing the Beta version of SQL Dump Analysis… sure it’s not super pretty, sure it’s not 100% yet… but it’ll give you the basic goods and get you on your way.

This is a website you upload SQL dumps to, rather than an executable you’d download. Still, check it out.

Comments closed

Skipping through Rows in a Large File with Powershell

Published 2023-10-27 by Kevin Feasel

Jay Robinson has a one-liner for us:

It came from working with enormous text files, and the gotchas that come with them. In this case, I had a 50gb data file that had something wrong with it, buried about 25% in. The file was being processed just fine, until it hit this unexpected data. And because the ETL app was written for performance first, there wasn’t a lot of data validation being done. So it’d just go boom when it hit that batch.

So what was wrong with the file? Well, in order to determine that, I had to see what was in that batch. But you can’t just open a 50gb file in Notepad. Yes, there are other ways to get around this, but here’s the one I chose:

Click through for Jay’s solution to the problem.

Comments closed

SSIS Catalog Error 27150

Published 2023-10-27 by Kevin Feasel

Andy Leonard diagnoses an error:

In case you’ve never encountered SSIS Catalog Error 27150, the error message reads:

The version of the project has changed since the instance of the execution has been created. Create a new execution instance and try again.

A customer recently encountered this error.
So… what happened?

Click through for the answer. I’ve had this happen as well for the same reason as Andy.

Comments closed

Packed/Batched Remote Procedure Calls in SQL Server

Published 2023-10-27 by Kevin Feasel

Bob Dorr digs in:

The SQL Server TDS protocol provides two main paths for query execution (Language and RPC events.) You can trace these events using the Batch::Starting/Completed (Language) and RPC:Starting/Completed (RPC) XEvents.

Language events are text streams requiring full parsing and may be susceptible to injection attacks. Language events also require educated guesses. For example, should select 1 return a smallint, bigint?

Bob goes on to show an example of a simple call being susceptible to SQL injection, explains why Remote Procedure Calls (RPCs) are superior, and what packing (or batching) RPCs does for you.

Comments closed

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31

Day: October 27, 2023

Building a Bland-Altman Plot in R

Trying out Batch Mode on Rowstore

Analyzing Dependencies in Microsoft Fabric Measures

Automating SQL Dump Analysis

Skipping through Rows in a Large File with Powershell

SSIS Catalog Error 27150

Packed/Batched Remote Procedure Calls in SQL Server

Language events are text streams requiring full parsing and may be susceptible to injection attacks. Language events also require educated guesses. For example, should select 1 return a smallint, bigint?