Press "Enter" to skip to content

Day: August 15, 2023

Adaptive Query Execution in Spark 3.0

The Big Data in Real World team talks on-the-fly execution plan changes:

Apache Spark, the popular distributed computing framework, has been widely adopted for processing large-scale data. With the release of Apache Spark 3.0, a groundbreaking feature called Adaptive Query Execution (AQE) was introduced. AQE addresses the limitations of traditional static execution plans by dynamically optimizing query execution based on runtime statistics. In this blog post, we will explore how AQE works and how it significantly improves the performance of Spark applications.

This sounds pretty similar to adaptive query processing in SQL Server, though a look at the Spark documentation shows that there are some practical differences in implementation versus what SQL Server does.

Comments closed

Dense Rank in KQL

Robert Cain continues a series on KQL:

The Kusto Windowing Function row_rank_dense is an interesting function. It lets you get a unique count of a specific column in a dataset. Unlike other methods of getting counts, row_rank_dense allows you to see each individual row of data.

First though, if you’ve not read the introductory post on Windowing Functions, Fun With KQL Windowing Functions – Serialize and Row_Number, you should do so now as it introduced several important concepts needed to understand how these Windowing Functions work.

Read on to see how this window function works.

Comments closed

Thoughts on Parameter Sensitive Plan Optimization

Jared Poche shares some thoughts:

First, I wanted to point out that there are fixes to issues specific to Parameter Sensitive Plan Optimization in SQL Server 2022. The issue itself is documented at the link below.

When I heard about the issue, I heard that it could cause SQL Server 2022 to cause stack dumps every 15 minutes. The time interval made me think it was related to the default interval for Query Store, and that seems to be the case.

Jared has several thoughts on the topic as he puts together a talk on PSPO.

Comments closed

Tracking Historical Database Snapshots

David Fowler goes sleuthing:

This is a question that’s come up twice this morning, firstly where can we find a history of database snapshots and secondly where can we find a history of restores from snapshot?

Frustratingly, SQL doesn’t make this at all easy for us and if this is something that you want to record, you’re going to have a do a little extra work.

Let’s take a look at each part in turn.

Read on for a messy solution.

Comments closed

Working with Similar but Different Schemas across SQL Server

Aaron Bertrand writes a query, or maybe two:

At Stack Overflow, our environment has multiple implementations of a largely – but not 100% – identical schema. (By “schema,” I mostly mean the set of tables within a database.) I sometimes need to retrieve or update data across a large number of sites. Let’s say, pulling or removing information from the Users table, and related tables, across all of our databases.

Most of the time, we think of “schema drift” as “different from a Platonic ideal of our schema” and try to correct it. In this case, Aaron shows how we can still write queries while embracing schema drift as a basic fact.

Comments closed

Microsoft Fabric Notebooks and Compute Limits

Reitse Eskens hits a wall:

In this case, my notebook threw an error at me but the command seemed to finish without any issue. Sounds vague? It did to me. The notebookcell I tried to run had a lot of stuff happening at the same time.

As you can see in the above screenshot, the status shows green checkmarks but there’s an error as well. The error message was not really clear to me, but that can really be me lack of deep level experience. So, I logged a call with Microsoft Support and see what they could come up with.

I’ve had enough experience with Spark to see the issue and figure the response, but click through for the screenshot and what Reitse did to resolve the issue.

Comments closed

Creating Wireframes in Power BI

Seth Bauer has my attention:

In the dynamic landscape of data visualization, staying ahead of the curve is crucial. That’s why we’re thrilled to introduce the latest game-changing feature in the PowerBI.Tips Theme Generator: WireFrames! This innovation not only revolutionizes the way Power BI is customized but also opens doors for UI/UX teams and designers to interact with Power BI developers like never before.

Most of this does require a subscription, but it does look quite interesting.

Comments closed

Creating Sprint Review Reports with Azure DevOps and Fabric

Kevin Chant checks the burndown:

In this post I want to cover using Azure DevOps Analytics views and Microsoft Fabric to create Sprint review dashboards.

I consider this post to be a sequel to one of my post popular posts that covered using Azure DevOps Analytics views and Power BI to create Sprint review dashboards. For four very good reasons.

Read on for those reasons, along with the steps Kevin took.

Comments closed