Month: October 2022

Rewriting Tricky Functions in SQL Server

Erik Darling fights dragons:

Far and away, some of the trickiest situations I run into when helping clients involve rewriting scalar functions that have WHILE loops in them.

This sort of procedural code is often difficult, but not impossible, to replace with set-based logic.

Erik improves a function in this post, though often, the best way to improve a function is not to play the game at all.

IsNull and IsEmpty in KQL

Robert Cain’s fuel gauge is running on E:

In writing queries, it is not uncommon to get results where a column has missing values. This can cause concerns or questions from your users: “Why is this blank?” or “There must be something wrong with your query; it’s missing data!”

To avoid this, Kusto provides two functions to check for missing values: isnull and isempty. You can combine this with the iif function (covered in the Fun With KQL – IIF post) to provide clarifying text to the end user.

Check out the examples of how to use these two functions in Robert’s post.

Solving the CanSum Problem in R

Tomaz Kastrun knows if you can sum those together:

The CanSum problem is as follows: given an array of integers (nums) and a target integer (target), return a boolean (TRUE / FALSE) indicating whether the target integer can be calculated using two or more numbers in the array nums.

You may assume that each integer from the array can be used multiple times. You can also assume that any integer (in nums or in target) is from 0 to +1e9 and the length of the nums array is from 2 to 1000 elements.

Click through for an example of one brute-force solution, followed by a much faster solution.
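As a rough illustration of what a faster-than-brute-force approach can look like, here is a minimal bottom-up dynamic programming sketch in Python. It is not Tomaz's R code. It also assumes the classic CanSum variant (any count of numbers, with reuse allowed, rather than "two or more"), and it only suits small targets because it builds a table of size target + 1. The function name can_sum is made up for this example.

```python
def can_sum(nums, target):
    """Return True if target can be written as a sum of values from nums.

    Assumptions (not from the linked post): values may be reused any
    number of times, the empty sum counts for target 0, and target is
    small enough that a list of target + 1 booleans fits in memory.
    """
    # reachable[t] is True when some combination of nums adds up to t.
    reachable = [False] * (target + 1)
    reachable[0] = True  # the empty sum

    for total in range(1, target + 1):
        # total is reachable if removing one usable number leaves a
        # reachable remainder. Zeros are skipped because they never help.
        reachable[total] = any(
            0 < n <= total and reachable[total - n] for n in nums
        )

    return reachable[target]


print(can_sum([5, 3, 4, 7], 7))
print(can_sum([2, 4], 7))
```

Each subtotal is computed once, so the work grows with target times the length of nums instead of exponentially, which is the general reason memoized or tabulated versions beat naive recursion on this kind of problem.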

Creating a SQL Server Assessment Dashboard

Robert Blackburn builds a dashboard:

We must periodically evaluate the state of our databases. Luckily for SQL Server, Microsoft provides us with a customizable assessment through their SQL Assessment API Repo and API Documentation. You can change the rules per database and output the results to a database to track history.

However, that will take more than an hour. Let’s create a dashboard with the default rules in under an hour. We will use Azure Data Studio (ADS) and Power BI Desktop (PBI). If you are not familiar with them, both are free. Azure Data Studio is automatically installed with SSMS 18.7 and higher. You can also install them individually.

Read on to see how this works. Granted, it will not auto-update, but unless the assessment output format changes between runs, you won’t need to modify the Power BI report and can just refresh the data.

Finding Faulty Rows in Tabular Server Errors

Teo Lachev goes error-hunting:

A scheduled SSIS job that executes a massive DAX query to an on-prem Tabular server (Power BI can also generate this error) one day decided to throw an error: “Source: ‘Microsoft OLE DB Provider for Analysis Services.’ Hresult: 0x80004005 Description: ‘MdxScript(Model) (2020, 98) Calculation error in measure 'Account Snapshot'[Average utilisation % of all CR active current accounts last 3 months]: The result of a conversion or arithmetic operation is either too large or too small.’” At least we know the offending measure, but which row is causing the error? The query requests some 300+ measures for 120 million customers, so I thought someone might find the troubleshooting technique useful. Let’s ignore what the measure does for now except mentioning that it performs a division of two other measures.

Click through for the technique.

Auto Partitioning Recommendations for Oracle

Brendan Tierney checks out some recommendations:

In a previous blog post I gave an overview of the DBMS_AUTO_PARTITION package in Oracle Autonomous Database. That post looked at how to get started, how to set up Auto Partitioning, and how to allow it to automatically implement partitioning.

This might not be something the DBAs will want to happen, for lots of different reasons. An alternative is to use DBMS_AUTO_PARTITION to make recommendations for tables where partitioning will yield a performance improvement. The DBA can inspect these recommendations and decide which of them to implement.

Read on to see how you can run the recommender, as well as what a recommendation looks like.

Geo-Zone Redundant Storage for SQL MI Backups

Niko Neugebauer moves the backups pretty far away:

The new Geo-Zone Redundant Storage (GZRS) backup storage option combines the best of two worlds – Geo-Redundant and Zone-Redundant storage, keeping backups safe from both regional (Geo-Redundant) and Data Center (Zone-Redundant) failures. It provides the highest availability for storage currently offered on Azure, improving recovery speed and enabling Point-In-Time Restore (PITR) of backups in the event of a zone failure.  

Geo-Zone Redundant Storage for Azure SQL Managed Instance backups provides 3 synchronous copies in different availability zones within the same primary region, plus an additional asynchronous copy within a single availability zone in the paired secondary region, as shown in the following picture:

Click through for that picture and what it does for expected availability. Basically, a whole bunch of data centers would need to fail before you lose a backup. Or someone messes up DNS and makes everything unavailable for a day, not that that’s ever happened before with a large cloud service provider…

Checking All Metrics when Query Tuning

Grant Fritchey has some query tuning advice:

Recently, a person asked about the cost differences in an execution plan, referencing them as if they were performance measures. The key to understanding performance is to check every metric. When it comes to execution plans, I’m sure I’ve said this before, so please allow me to repeat myself.

The cost numbers shown in an execution plan, which, barring a recompile, will be the same for an execution plan or an execution plan with runtime metrics (aka, estimated and actual plans), are not measures of performance. They do not represent actual metrics. Instead, they are calculations of a theoretical actual performance measurement. So, you can’t look at two plans, with two costs, and say, “this plan will perform better.” Instead, you can say, “this plan has a lower estimated cost.” To really see performance metrics, you must measure performance.

Read on for the full set of advice.

Thoughts on Current 2022

Markos Sfikas, et al, recap the Current 2022 conference:

Throughout the conference, the theme of Batch vs. Streaming was apparent. Discussions covered how they can be unified, how batch processing’s performance can / must be improved for real-time applications, and more. There was even a dedicated panel discussion with Adi Polak, Amy Chen, Eric Sammer and Tyler Akidau discussing the state of streaming adoption today, and debating if streaming will ever fully replace batch. You can view some interesting points from the panel discussion in the Twitter thread from Robin Moffatt.

Click through for the full recap.
