
Month: October 2024

Functions in JOIN and Performance Problems

Steve Stedman lays out a warning:

When writing SQL queries, it’s easy to focus on getting the right results without thinking too much about performance. One common mistake that can lead to significant slowdowns is using functions in the JOIN ON clause of a SELECT statement.

While SQL Server supports a wide range of built-in functions, using them incorrectly—especially in the join conditions—can severely impact performance. Let’s break down why this happens, how it can affect your query speed, and what you can do to avoid it.

Read on for the answer.
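As a rough illustration of the pattern the excerpt warns about (not Steve's own example), here is a minimal T-SQL sketch. The tables dbo.Orders and dbo.Customers, their columns, and the index names are hypothetical. It assumes the join has to tolerate stray whitespace in a code column.

```sql
-- Hypothetical tables (assumptions for illustration only):
--   dbo.Orders(OrderID, CustomerCode varchar(20))
--   dbo.Customers(CustomerID, CustomerCode varchar(20))

-- Problem pattern: functions wrapped around both join columns.
-- SQL Server generally cannot seek on an index over CustomerCode here,
-- because every value has to be transformed before it can be compared.
SELECT o.OrderID, c.CustomerID
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c
    ON LTRIM(RTRIM(o.CustomerCode)) = LTRIM(RTRIM(c.CustomerCode));
GO

-- One possible alternative: compute the cleaned value once, persist it,
-- and index it, so the join compares plain columns.
ALTER TABLE dbo.Orders
    ADD CustomerCodeClean AS LTRIM(RTRIM(CustomerCode)) PERSISTED;
ALTER TABLE dbo.Customers
    ADD CustomerCodeClean AS LTRIM(RTRIM(CustomerCode)) PERSISTED;
GO

CREATE INDEX IX_Orders_CustomerCodeClean
    ON dbo.Orders (CustomerCodeClean);
CREATE INDEX IX_Customers_CustomerCodeClean
    ON dbo.Customers (CustomerCodeClean);
GO

-- The join condition is now a simple column-to-column comparison.
SELECT o.OrderID, c.CustomerID
FROM dbo.Orders AS o
INNER JOIN dbo.Customers AS c
    ON o.CustomerCodeClean = c.CustomerCodeClean;
```

Cleaning the data at load time, so that no function is needed at all, is often the simpler fix when you control the source tables.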


Restarting Failed Control Flows in Azure Data Factory

Meagan Longoria doesn’t want to repeat good work:

I presented at SQL Saturday Pittsburgh this past weekend about populating your data warehouse with a metadata-driven, pattern-based approach. One of the benefits I mentioned is that it’s easy to employ this pattern for restartability.

For instance, let’s say I am loading data from 30 tables and 5 files into the staging area of my data mart or data warehouse, and one of the table loads fails. I don’t want to reload the other tables I just loaded. I want to load only the ones that have not been recently loaded. Or let’s say I have 5 dimensions and 4 facts, and I had a failure loading a fact table. I don’t want to reload my dimensions, and I only want to reload the failed facts. How do we accomplish this?

Read on to learn how.
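To make the idea concrete, here is a minimal T-SQL sketch of a control table that could drive this kind of restartability. It is not Meagan's implementation. The table dbo.LoadControl, its columns, the 12-hour window, and the source name dbo.Customer are all assumptions for illustration.

```sql
-- Hypothetical control table: one row per source table or file.
CREATE TABLE dbo.LoadControl
(
    LoadControlID  int IDENTITY(1,1) PRIMARY KEY,
    SourceName     nvarchar(200) NOT NULL,
    LastSuccessUtc datetime2(0)  NULL  -- NULL means never loaded successfully
);
GO

-- Return only the sources that have NOT loaded successfully within the
-- window. A rerun after a failure skips everything that already finished.
DECLARE @WindowHours int = 12;  -- assumed definition of "recently loaded"

SELECT lc.SourceName
FROM dbo.LoadControl AS lc
WHERE lc.LastSuccessUtc IS NULL
   OR lc.LastSuccessUtc < DATEADD(HOUR, -@WindowHours, SYSUTCDATETIME());
GO

-- After a source loads successfully, record the time so that a restart
-- inside the window leaves it alone.
UPDATE dbo.LoadControl
SET LastSuccessUtc = SYSUTCDATETIME()
WHERE SourceName = N'dbo.Customer';  -- hypothetical source name
```

In Azure Data Factory, a query like the SELECT above could feed a Lookup activity whose output drives a ForEach loop, with the UPDATE running only after each copy succeeds.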


Configuration Is Code

Steve Jones has a public service announcement:

I posted a note on Twitter/X with this quote: “The content updates had not previously been treated as code because they were strictly configuration information.” This is from testimony given by CrowdStrike to a US Congressional committee while trying to explain how they grounded much of the airline industry a few months ago. That was a mess of a situation, and apparently, the vendor didn’t think their configuration was part of their code.

That’s an amazing viewpoint to me. The fact that any developer or manager thinks that their configuration data isn’t a part of their code is worth testing. Yet, I see this attitude all the time, where developers, QA, managers, and more think that the code is the only thing that changes or doesn’t change, ignoring the fact that there are configuration items that affect the code and need to be managed appropriately. Certainly, if the config data were in enums rather than in a file or database they’d feel differently.

Read on for Steve’s extended thoughts. I can understand the urge to call something “just a configuration file” so that you don’t have to do as much work. But that can lead to disaster.


Tips for Optimizing Power BI Semantic Models

Koen Verbeeck shares some tips:

Power BI is designed to be user-friendly. With just a few clicks, you can import data from various sources, combine them in one data model and start analyzing it using powerful data visualizations. This sometimes leads to a scenario where people are just importing data into the tool without giving it too much thought. When you’re working on a solo project on a small dataset, there probably won’t be too many issues. But what if your report is successful and you want to share it with your colleagues and maybe other departments? Or more data is loaded into the model, and refreshes are taking more and more time? Or other data sources are added to your model, writing DAX formulas has become hard, and reports are slowing down?

In this article, we’ll cover a couple of tricks that will help you make your Power BI models smaller, faster and easier to maintain. In the immortal words of Daft Punk: “Harder. Better. Faster. Stronger”.

Click through for those tricks and tips.


Implementing Role-Playing Dimensions in Power BI

Teo Lachev puts on a mask:

Role-playing dimensions are a popular business requirement, yet they are challenging to implement in Power BI (and Tabular) due to a long-standing limitation that two tables can’t be joined multiple times with active relationships. Declarative relationships are both a blessing and a curse and, in this case, we are confronted with their limitations. Had Power BI allowed multiple relationships, the user would have to be prompted for which path to take. Interestingly, a long time ago Microsoft considered a user interface for the prompting but dropped the idea for unknown reasons.

Given the existing technology limitations, you have two choices for implementing subsequent role-playing dimensions: duplicating the dimension table (either in the DW or the semantic model) or denormalizing the dimension fields into the fact table. The following table presents pros and cons of each option:

Click through for that table, as well as some thoughts on viable approaches, including an edge case.
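For the first option, duplicating the dimension in the data warehouse, one lightweight way to do it is with views over a single physical table. This is a sketch under assumptions rather than Teo's recommendation. dbo.DimDate, dbo.FactSales, and every column name below are hypothetical.

```sql
-- Hypothetical star schema (assumptions for illustration only):
--   dbo.DimDate(DateKey, [Date], CalendarYear)
--   dbo.FactSales(OrderDateKey, ShipDateKey, SalesAmount)

-- One view per role, each with role-specific column names so the fields
-- are easy to tell apart once imported into the semantic model.
CREATE VIEW dbo.DimOrderDate
AS
SELECT DateKey      AS OrderDateKey,
       [Date]       AS OrderDate,
       CalendarYear AS OrderCalendarYear
FROM dbo.DimDate;
GO

CREATE VIEW dbo.DimShipDate
AS
SELECT DateKey      AS ShipDateKey,
       [Date]       AS ShipDate,
       CalendarYear AS ShipCalendarYear
FROM dbo.DimDate;
GO

-- Each view joins to the fact table on its own key, which corresponds to
-- one active relationship per role in the model.
SELECT od.OrderCalendarYear,
       sd.ShipCalendarYear,
       SUM(f.SalesAmount) AS SalesAmount
FROM dbo.FactSales AS f
INNER JOIN dbo.DimOrderDate AS od ON od.OrderDateKey = f.OrderDateKey
INNER JOIN dbo.DimShipDate  AS sd ON sd.ShipDateKey  = f.ShipDateKey
GROUP BY od.OrderCalendarYear, sd.ShipCalendarYear;
```

The views keep a single copy of the data in the warehouse, but an Import-mode model still stores each role as a separate table.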


Capturing Database Object Changes in SQL Server

Sebastiao Pereira creates a trigger:

Data Definition Language (DDL) is a group of SQL statements that can be executed to manage database objects. The idea is to create a database trigger that tracks and logs all changes to database schemas, including modifications to stored procedures, tables, views, and other schema objects, and stores each event in one table.

Read on for an example of a database-level trigger. These are much less likely to cause performance problems compared to table-level triggers, though you could certainly cause issues via poor trigger definition.
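A minimal sketch of such a database-level DDL trigger might look like the following. It is not Sebastiao's code. The log table dbo.DdlChangeLog, its columns, and the trigger name are assumptions. EVENTDATA(), ORIGINAL_LOGIN(), and the DDL_DATABASE_LEVEL_EVENTS event group are standard SQL Server features.

```sql
-- Hypothetical log table: one row per captured DDL event.
CREATE TABLE dbo.DdlChangeLog
(
    DdlChangeLogID int IDENTITY(1,1) PRIMARY KEY,
    EventTimeUtc   datetime2(3)  NOT NULL DEFAULT SYSUTCDATETIME(),
    LoginName      sysname       NOT NULL,
    EventType      nvarchar(128) NULL,
    SchemaName     nvarchar(128) NULL,
    ObjectName     nvarchar(128) NULL,
    CommandText    nvarchar(max) NULL,
    EventXml       xml           NOT NULL
);
GO

-- Database-scoped trigger: fires on CREATE, ALTER, and DROP of
-- database-level objects such as tables, views, and procedures.
CREATE TRIGGER trg_LogDdlChanges
ON DATABASE
FOR DDL_DATABASE_LEVEL_EVENTS
AS
BEGIN
    SET NOCOUNT ON;

    -- EVENTDATA() returns an XML document describing the event.
    DECLARE @e xml = EVENTDATA();

    INSERT INTO dbo.DdlChangeLog
        (LoginName, EventType, SchemaName, ObjectName, CommandText, EventXml)
    VALUES
    (
        ORIGINAL_LOGIN(),
        @e.value('(/EVENT_INSTANCE/EventType)[1]',  'nvarchar(128)'),
        @e.value('(/EVENT_INSTANCE/SchemaName)[1]', 'nvarchar(128)'),
        @e.value('(/EVENT_INSTANCE/ObjectName)[1]', 'nvarchar(128)'),
        @e.value('(/EVENT_INSTANCE/TSQLCommand/CommandText)[1]', 'nvarchar(max)'),
        @e
    );
END;
```

The trigger runs inside the transaction of the DDL statement. If the insert fails, for example because the caller lacks permission on the log table, the DDL statement rolls back with it.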


Querying Audit Log (.xel) Files in Azure SQL DB

Tanayankar Chakraborty reads an audit log:

A recent issue was brought to our attention: customers could not query .xel log files in an Azure SQL DB using a T-SQL command. The customers complained that when they ran the command, they received column headers but no content, even though they knew there was content in the logs because they were able to open them with SSMS using Merge Extended Event Files. Here was the T-SQL command used by the customer:

select * from sys.fn_get_audit_file ('https://mydbastorage.blob.core.windows.net/sqldbauditlogs/servername/dbname/SqlDbAuditing_Audit_NoRetention/*.xel', NULL, NULL);

Click through for the solution, which came down to two separate issues.


Contained Availability Groups and Database Mail

Ben Miller points out a gap in functionality:

Call to action for Microsoft. Contained Availability Groups came out in SQL Server 2022 and they definitely have their use. But there were some artifacts left behind that need some fixing, namely when you use DBMail in jobs or operations inside the Availability Group. Let’s see what there is left.

First, here is the link to the Feedback Item that is out there for voting to get Microsoft to fix this issue. There has already been an issue fixed with the msdb proc to activate dbmail in a Contained AG ([dbo].[sp_sysmail_activate]).

Read on to learn more about the issue.


Dealing with the Lack of Identity Columns in Microsoft Fabric

Nikola Ilic forges a new identity:

If you’ve ever worked with traditional relational database management systems (RDBMS) and/or data warehouses, and you’re now trying to be a “modern data platform professional” and apply your skills in Microsoft Fabric, you may find yourself in uncharted territory. Not only because of the SaaS-ification of the environment, but also due to many puzzling “solutions”, or maybe it’s better to say – lack of the features that we were taking for granted in the “previous” (pre-Fabric) life.

The goal of this article is to introduce you to different approaches for overcoming the lack of identity columns in Microsoft Fabric. Please keep in mind that all of these approaches are considered workarounds, and Microsoft may provide an out-of-the-box solution in the future.

Missing the identity column attribute can be a bit annoying when building out dimensions, so Nikola provides a few tips on how to emulate this functionality.
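One commonly used workaround, sketched here in T-SQL under assumptions and not necessarily one of the approaches Nikola covers, is to offset ROW_NUMBER() by the current maximum key. The tables stg.Customer and dbo.DimCustomer and their columns are hypothetical.

```sql
-- Hypothetical tables (assumptions for illustration only):
--   stg.Customer(CustomerCode, CustomerName)
--   dbo.DimCustomer(CustomerKey bigint, CustomerCode, CustomerName)

-- Start numbering after the highest key already in the dimension
-- (0 when the dimension is empty).
DECLARE @MaxKey bigint =
    (SELECT COALESCE(MAX(CustomerKey), 0) FROM dbo.DimCustomer);

-- Assign new surrogate keys only to rows that are not in the dimension yet.
INSERT INTO dbo.DimCustomer (CustomerKey, CustomerCode, CustomerName)
SELECT @MaxKey + ROW_NUMBER() OVER (ORDER BY s.CustomerCode) AS CustomerKey,
       s.CustomerCode,
       s.CustomerName
FROM stg.Customer AS s
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.DimCustomer AS d
    WHERE d.CustomerCode = s.CustomerCode
);
```

This sketch assumes a single loader per dimension. Two loads running at the same time could read the same maximum and generate colliding keys.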
