Press "Enter" to skip to content

Month: May 2023

Bugs in SQL Server 2022 CU4

Brent Ozar facepalms:

<sigh> Okay, so, the last few Cumulative Updates have had known issues around broken remote queries using the generic ODBC connector and errors with contained availability groups, but I couldn’t really care less about those. If you use those features, I give you bombastic side eye anyway.

However, in the last few days, two more known issues have surfaced.

And those two issues are definitely bad. Click through to see what they are and what the workarounds are until fixes come in.

Leave a Comment

Data Pipelines and Dataflows in Fabric Data Factory

Reza Rad has two videos and posts for us. First up is a primer on data pipelines in Microsoft Fabric Data Factory:

The Pipeline comes from Azure Data Factory. A Pipeline is a group of activities bundled together into a workflow. For example, a Pipeline can generate a process around the Dataflow. For example, you may want to run a Dataflow in a loop until something happens, and with the failure or success of each execution, you want to perform a task such as sending out an email, copying data somewhere, running a stored procedure, etc.

Reza then gets into Dataflows:

Through the years, the Data Transformation engines evolved. In the past, much coding was involved, and the user interface was not the best experience. These days, most actions can be done through pre-built transformations; less coding is needed, and a hardcore developer is not needed for preliminary tasks. This enables citizen data engineers to work with these tools.

Power Query is the data transformation engine of the new generation of Microsoft Data Integration tools and services. Power Query is the data transformation engine used in Power BI. However, Power Query can be used as a standalone cloud-based data transformation service when it is used as Dataflow. Dataflow is the ETL in the cloud offered by Microsoft, which uses the Power Query engine.

Leave a Comment

Keeping VLF Counts Low (Enough)

Eitan Blumin has a script for us:

In this ultimate guide, learn how to fix high VLF counts and shrink transaction logs in SQL Server to improve performance. Discover best practices for managing transaction log sizes and VLFs, and get your hands on a robust script to help you out. Boost your SQL Server’s performance and optimize your database administration with these tips.

Click through for the guide, and for a script Eitan has put together.

Leave a Comment

Contrasting Power BI Import and DirectQuery Modes

Jason Cockington digs into the two modes:

Over the last few months, I have received a number of very similar questions at either one of our Live Training courses or from a student attending one of our Ask Anything Power BI Q&A sessions.  These questions all revolved around DirectQuery and were often specifically about performance issues or complexity of DAX calculations being written on the Power BI reports.  This prompted me to write this post, as it became abundantly clear that most students were opting for DirectQuery when they should have been selecting Import.  If you are not sure if DirectQuery is right for you, then read on, as we explore further.

Jason includes some Real Talk from Chris Webb, fairly eviscerating DirectQuery mode in the majority of cases.

Leave a Comment

From Pie Chart to Stacked Bar Chart

Simon Rowe gives a multi-pie chart a once-over:

This data is from a small company that is looking to get a sense of its sales pipeline across its four regions–North, South, East and West. The data is broken into five categories which relate to the progression of any given opportunity–Prospect, Conversation, Presentation, Agreement and Formal Offer. 

The pipeline composition is expected to have minor variations between regions, but the proportion of opportunities at each stage should be consistent overall. Any notable discrepancies should be flagged to management immediately. 

Here we run into one of the main issues people have with data visualization: context is critical. A stacked bar chart is often not a good alternative for any cluster of pie charts. In this case it was a very good answer because of the nature of the exact problem to solve. If I need to make a generic report given the same raw data, it would be difficult for me to land on this specific visual.

Leave a Comment

The Importance of backup_label Files for Postgres

Robert Haas says, don’t delete that file:

I’m sure you already know what I’m going to tell you: “Of course you need that backup_label file. How could you even think that you don’t need that backup_label file?” Well, you’re right. That is what I’m going to say. But do you know why you need that backup_label file? If you were to remove that backup_label file (or fail to create in the first place, in cases where that is your responsibility), what exactly is the bad thing that would happen to you?

Read on to learn what the file does, how you get it, and why it’s so important to keep around.

Leave a Comment

MATCHBY for DAX Window Functions

Jeffrey Wang plays matchmaker:

In this final installment of our four-part mini-series on DAX window functions, we’ll be focusing on a new development. The May release of Power BI Desktop has enriched all DAX window functions – namely OFFSETWINDOWINDEXRANK, and ROWNUMBER – with an additional sub-function, MATCHBY, supplementing the existing sub-functions, ORDERBY and PARTITIONBY. In this article, we’ll delve into the purpose of the MATCHBY function, along with the three specific problems it aims to resolve.

Read on to understand what the MATCHBY sub-function does and why it can be important.

Leave a Comment