Curated SQL – Page 392 – A Fine Slice Of SQL Server

Building a Bland-Altman Plot in R

Published 2023-10-27 by Kevin Feasel

Before we dive into the code, let’s briefly understand what a Bland-Altman plot is. It’s a graphical method to visualize the agreement between two measurement techniques, often used in fields like medicine or any domain with comparative measurements. The plot displays the differences between two measurements (Y-axis) against their means (X-axis).

Click through to see how this works and how you can interpret the results.

Comments closed

Trying out Batch Mode on Rowstore

Published 2023-10-27 by Kevin Feasel

Etienne Lopes has some fun with a feature:

Before 2012, creating analytical queries (that usually scan many rows and have lots of aggregations) from big OLTP databases to feed real-time based reports used in decision making processes, could be quite challenging. Then ColumnStore Indexes arrived and they’ve been enhanced overtime, offering amazing gains both in performance and storage.

Unfortunately, regarding pure OLTP databases, there are many situations in which ColumnStore Indexes can’t (or won’t) be used. There are some great performance enhancements present in columnstore that’s for sure and today I’m going to speak about one that became automatically available since SQL Server 2019 for “traditional” RowStore tables. It’s called “Batch Mode on Rowstore” and it can really boost some of our analytical queries over the “traditional tables” without any effort from our side!

There are a series of specific rules you need to hit but if you hit them, I’ve noticed about a 3x performance gain with you doing nothing at all.

Comments closed

Analyzing Dependencies in Microsoft Fabric Measures

Published 2023-10-27 by Kevin Feasel

Sandeep Pawar has an interesting library for us:

In my previous blog post, I introduced Semantic-Link, discussing its use cases and explained how it enables us to create solutions that were either not possible or not easily achievable before. In this blog post, I would like to present another powerful use case that, although possible in the past, could not be created and used seamlessly in Power BI. Allow me to introduce the MeasureMaze Python library, which helps uncover insights from a complex maze of dependencies in a Power BI semantic model using Semantic-Link and the power of network analysis.

Sandeep dives into the approach and what you get out of this library. It’s very interesting.

Comments closed

Automating SQL Dump Analysis

Published 2023-10-27 by Kevin Feasel

Sean Gallardy has been busy:

One of the things that is lacking in the community is for some simple dump analysis for SQL Server, well no longer!

Introducing the Beta version of SQL Dump Analysis… sure it’s not super pretty, sure it’s not 100% yet… but it’ll give you the basic goods and get you on your way.

This is a website you upload SQL dumps to, rather than an executable you’d download. Still, check it out.

Comments closed

Skipping through Rows in a Large File with Powershell

Published 2023-10-27 by Kevin Feasel

Jay Robinson has a one-liner for us:

It came from working with enormous text files, and the gotchas that come with them. In this case, I had a 50gb data file that had something wrong with it, buried about 25% in. The file was being processed just fine, until it hit this unexpected data. And because the ETL app was written for performance first, there wasn’t a lot of data validation being done. So it’d just go boom when it hit that batch.

So what was wrong with the file? Well, in order to determine that, I had to see what was in that batch. But you can’t just open a 50gb file in Notepad. Yes, there are other ways to get around this, but here’s the one I chose:

Click through for Jay’s solution to the problem.

Comments closed

SSIS Catalog Error 27150

Published 2023-10-27 by Kevin Feasel

Andy Leonard diagnoses an error:

In case you’ve never encountered SSIS Catalog Error 27150, the error message reads:

The version of the project has changed since the instance of the execution has been created. Create a new execution instance and try again.

A customer recently encountered this error.
So… what happened?

Click through for the answer. I’ve had this happen as well for the same reason as Andy.

Comments closed

Packed/Batched Remote Procedure Calls in SQL Server

Published 2023-10-27 by Kevin Feasel

Bob Dorr digs in:

The SQL Server TDS protocol provides two main paths for query execution (Language and RPC events.) You can trace these events using the Batch::Starting/Completed (Language) and RPC:Starting/Completed (RPC) XEvents.

Language events are text streams requiring full parsing and may be susceptible to injection attacks. Language events also require educated guesses. For example, should select 1 return a smallint, bigint?

Bob goes on to show an example of a simple call being susceptible to SQL injection, explains why Remote Procedure Calls (RPCs) are superior, and what packing (or batching) RPCs does for you.

Comments closed

Functional Programming and R

Published 2023-10-26 by Kevin Feasel

Anirban Shaw ties functional programming to R:

Functional Programming‘s relevance in the R programming language, a language primarily known for its prowess in data analysis and statistical computing, is particularly noteworthy. By leveraging functional programming, organizations can improve operational efficiency and gain a competitive edge.

R’s ecosystem is enriched by functional programming paradigms, which enable developers and data scientists to write concise and expressive code for tasks such as data manipulation, transformation, and visualization.

In this article, we take a deep dive into the fundamental characteristics of R, the advantages of adopting functional programming within it and the essential concepts ingrained in the core of R.

Read on to see how the two fit together. H/T R-Bloggers.

Comments closed

Working with Oracle Cloud Infrastructure from Python

Published 2023-10-26 by Kevin Feasel

Brendan Tierney peruses an SDK:

Having set up your Oracle Cloud account and perhaps initiated some services, such as an Autonomous Database or an AI Service, you have some options with how you’ll interact with those services. The first is to use the online interfaces and tools for them. Another is to access these OCI services from your local PC or laptop. This latter method gives you greater flexibility for accessing these services, automating various tasks and building out some applications. This post will step through how to install the necessary Python library, how to configure for connecting to OCI and will demo some of the simpler APIs for accessing the OCI services. This will help you verify your connection is working correctly.

Click through for the details.

Comments closed

Understanding Postgres Explain Plans

Published 2023-10-26 by Kevin Feasel

Muhammad Ali explains explain plans:

In a previous post titled Exploring Postgres Performance: A Deep Dive into pg_stat_statements, we discussed the utility of pg_stat_statements as a tool for identifying resource-intensive queries in terms of memory, CPU, or I/O usage.

Now, let’s suppose you’ve pinpointed certain queries you’d like to investigate further. The EXPLAIN command is used to generate execution plans. It includes:

Read on to see what it includes, how you can generate an explain plan, and some of the optional settings.

Comments closed

M	T	W	T	F	S	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30	31

Curated SQL Posts

Language events are text streams requiring full parsing and may be susceptible to injection attacks. Language events also require educated guesses. For example, should select 1 return a smallint, bigint?