Press "Enter" to skip to content

Category: Performance Tuning

Improving Performance Counters with PowerShell

Jeffrey Hicks has an improvement to Get-Counter in PowerShell:

I wanted to tell you about another addition to the latest release of the PSScriptTools module. This is something I’ve written about before but I decided to add the function to the module. I hope you find it a much easier way to work with performance counters. And it works in Windows PowerShell and PowerShell 7.x.

Click through to see what has changed.

Comments closed

The Default Cardinality Estimator and Ascending Keys

Erik Darling compares cardinality estimators:

Look, I’m not saying there’s only one thing that the “Default” cardinality estimator does better than the “Legacy” cardinality estimator. All I’m saying is that this is one thing that I think it does better.

What’s that one thing? Ascending keys. In particular, when queries search for values that haven’t quite made it to the histogram yet because a stats update hasn’t occurred since they landed in the mix.

Read the whole thing.
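If you want to see the behavior for yourself, here is a minimal sketch (the table and values are my own illustration, not Erik’s script): load an ever-ascending date column, update statistics, sneak in new rows past the top of the histogram, and compare the two estimators via USE HINT.

```sql
-- Illustrative demo; table and values are made up, not from Erik's post.
CREATE TABLE dbo.Orders
(
    OrderID   int IDENTITY PRIMARY KEY,
    OrderDate datetime NOT NULL
);
CREATE INDEX IX_OrderDate ON dbo.Orders (OrderDate);

-- 100,000 rows of steadily-increasing dates.
INSERT dbo.Orders (OrderDate)
SELECT TOP (100000)
    DATEADD(MINUTE, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), '20200101')
FROM sys.all_columns AS a
CROSS JOIN sys.all_columns AS b;

UPDATE STATISTICS dbo.Orders;  -- the histogram now tops out at the current max

-- New rows land above the histogram's last step; too few of them to
-- trigger an automatic statistics update.
INSERT dbo.Orders (OrderDate)
SELECT TOP (1000)
    DATEADD(MINUTE, 200000 + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)), '20200101')
FROM sys.all_columns;

-- Legacy CE: tends to estimate ~1 row past the end of the histogram.
SELECT COUNT(*) FROM dbo.Orders WHERE OrderDate >= '20200501'
OPTION (USE HINT('FORCE_LEGACY_CARDINALITY_ESTIMATION'));

-- Default CE: assumes rows exist beyond the histogram and estimates higher.
SELECT COUNT(*) FROM dbo.Orders WHERE OrderDate >= '20200501'
OPTION (USE HINT('FORCE_DEFAULT_CARDINALITY_ESTIMATION'));
```

Compare the estimated number of rows on each plan’s index operations to see the difference Erik describes.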

Comments closed

PASS Summit Q&A: Intelligent Query Processing

Kathi Kellenberger answers some follow-up questions from a PASS Virtual Summit session:

Last week, I presented a session on Intelligent Query Processing for the first ever Virtual PASS Summit. This summit had a mix of live and pre-recorded sessions. During the pre-recorded sessions, the speaker could hang out with the attendees in the discussion room and join a virtual chat room at the end. My session was live, so I answered a few questions during the session itself. There were a couple of questions that I couldn’t answer fully during the session, but all the questions were interesting, so I’ll go through them here.

Click through for the questions and answers.

Comments closed

Parallelism and Nested Loops Joins

Erik Darling talks about the intersection of two performance tuning topics:

Yesterday we saw a case where the Gather Streams operator was costed quite highly, and it prevented a parallel plan from being chosen, despite the parallel plan in this case being much faster.

It’s important to note that costing for plans is not a direct reflection of actual time or effort, nor is it accurate to your local configuration.

These costs are estimates used to come up with a plan. When you get an actual plan, there are no added-in “Actual Cost” metrics.

Read on to see how you can monkey’s paw your way through this problem by introducing exciting, new problems.
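For reference, the modern way to nudge the optimizer toward a parallel plan without resorting to trace flags is the ENABLE_PARALLEL_PLAN_PREFERENCE hint. The query below is a placeholder of my own; the hint is the point.

```sql
-- Biases plan choice toward parallelism rather than letting the estimated
-- cost of the Gather Streams operator talk the optimizer out of it.
SELECT o.CustomerID, COUNT_BIG(*) AS OrderCount
FROM dbo.Orders AS o
GROUP BY o.CustomerID
OPTION (USE HINT('ENABLE_PARALLEL_PLAN_PREFERENCE'));
```

As with any plan-forcing trick, it is a blanket preference rather than a targeted fix, which is where those exciting new problems tend to come from.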

Comments closed

Tips for Optimizing Dedicated SQL Pools in Synapse Analytics

Tsuyoshi Matsuzaki shares some tips for improving query performance when using Dedicated SQL Pools in Azure Synapse Analytics:

With the BROADCAST_MOVE operation above, the rows in the dimension_City table are all copied into a temporary table (called TEMP_ID_3) on every distributed database. (See below.)
Since dimension_City is small, all of its rows are duplicated in every database before joining. Here we join only two tables; if many tables need to be joined, however, this data movement becomes a large overhead for query execution.

The short version: replicate smaller dimensions and align distribution keys for large tables which get joined together. Both of these minimize the chances of the engine needing to shuffle data between nodes. These sorts of things can make a huge difference when working with Dedicated SQL Pools, cutting query time down by an order of magnitude in some extreme cases.
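As a sketch of what those two table designs look like in dedicated SQL pool DDL (the dimension_City name comes from the quoted example; the column lists are invented for brevity):

```sql
-- Small dimension: REPLICATE keeps a full copy on every compute node,
-- so joins against it need no BROADCAST_MOVE at query time.
CREATE TABLE dbo.dimension_City
(
    CityKey int NOT NULL,
    City    nvarchar(100) NOT NULL
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);

-- Large fact table: hash-distribute on the join key so matching rows from
-- other tables hash-distributed on the same key land on the same node.
CREATE TABLE dbo.fact_Sale
(
    CityKey    int NOT NULL,
    SaleAmount decimal(18, 2) NOT NULL
)
WITH (DISTRIBUTION = HASH(CityKey), CLUSTERED COLUMNSTORE INDEX);
```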

Comments closed

Monitoring Storage Metrics

Robert Sheldon continues a series on storage concepts for the DBA:

When monitoring storage systems, engineers should track a variety of metrics to ensure the systems continue to meet application requirements. Three of the most important and commonly cited metrics are latency, I/O operations per second (IOPS), and throughput. In addition to these three, queue length and I/O splitting can also provide valuable insights into storage performance.

In this article, I discuss all five of these metrics and demonstrate them in action. Despite my focus on these five, they’re not the only important metrics to monitor. For example, engineers should also track storage capacity, device cache usage, controller operations, and storage networks. Even seemingly unrelated components can be a factor, such as low CPU utilization, which can indicate that the processor is waiting on storage to complete requests from the application.

Robert also shows off the new perfmon, which may or may not be better than Perfmon Classic.
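If you would rather sample the latency and throughput story from inside SQL Server than from perfmon, here is one common approach (my sketch, not from Robert’s article); keep in mind that the numbers are cumulative since instance startup.

```sql
SELECT
    DB_NAME(vfs.database_id) AS database_name,
    mf.physical_name,
    vfs.num_of_reads,
    vfs.num_of_writes,
    -- Average latency in ms per operation, guarding against divide-by-zero.
    vfs.io_stall_read_ms  * 1.0 / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
    vfs.io_stall_write_ms * 1.0 / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms,
    (vfs.num_of_bytes_read + vfs.num_of_bytes_written) / 1048576.0 AS total_mb_transferred
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
    ON  vfs.database_id = mf.database_id
    AND vfs.file_id     = mf.file_id
ORDER BY avg_read_latency_ms DESC;
```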

Comments closed

Working with Active Power BI Sessions

Marc Lelijveld provides some insights:

Getting more insight into the telemetry of your Power BI dataset is always interesting. Especially if you share your Power BI dataset for self-service purposes, so that others can build new reports on top of your managed dataset, you might want to know who is actually using it and what queries they are executing against your datasets.

Besides that, there might be a whole lot of other valid use cases where you want more insight into currently running sessions on your dataset. An example I recently ran into was canceling a Power BI dataset refresh. As there is no button in the Power BI Service to do this, I had to find a different way.

Read on to see how, using DAX Studio.
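As a taste of the technique, these standard Analysis Services DMVs run fine from DAX Studio against a dataset; Marc’s post has the full walkthrough, including the cancellation step.

```sql
-- Who is connected to the dataset and what they last ran:
SELECT * FROM $SYSTEM.DISCOVER_SESSIONS;

-- Commands currently executing; a long-running refresh shows up here:
SELECT * FROM $SYSTEM.DISCOVER_COMMANDS;
```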

Comments closed

Power BI Report Performance: Number of Visuals on a Page Edition

Chris Webb looks at some extreme scenarios:

You may have read the title of this post and guessed that I’m going to talk about reducing the number of visuals that display data from your dataset as a way of improving performance, but that’s not the case. In this blog post I want to show how visuals that do not display any data from your dataset can have a significant impact on report performance. Before we carry on I suggest you read the series of posts I wrote late last year on measuring the performance of reports in the browser using Chrome/Edge DevTools (part 1 is here, part 2 is here, part 3 is here) because I’ll be using techniques described in these posts in my testing.

Click through for an interesting demo.

Comments closed

R: Avoid apply() with Large Datasets

George Pipis walks us through the performance cost of using the apply() function against a large dataset in R:

When we are dealing with large datasets and need to calculate values like the row/column min/max/rank/mean, etc., we should avoid the apply() function because it takes a lot of time. Instead, we can use the matrixStats package and its corresponding functions. Let’s provide some comparisons.

Click through for a demonstration of how much slower it can be in certain cases. H/T R-Bloggers.
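The shape of the comparison looks something like this (a toy benchmark of my own, not George’s numbers):

```r
library(matrixStats)

# 100,000 rows x 100 columns of random data.
m <- matrix(rnorm(1e7), nrow = 1e5)

system.time(apply(m, 1, max))   # loops row by row in R: slow
system.time(rowMaxs(m))         # matrixStats' optimized C code: fast
```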

Comments closed

Optimizing Cross Join Performance in Power BI

Chris Webb tries an experiment:

In a late addition to the series of posts that I started here a few months ago, I thought it would be useful to look at how to optimise the performance of a rare (but nonetheless useful) type of merge: the cross join. Cross joins are not available as a join kind in the Table.Join or Table.NestedJoin M functions but there are two ways of creating them in Power Query:

1. Creating a custom column in your query that contains the query that you want to cross join with and then expanding, as Matt Allington describes here
2. Creating a custom column that contains a single value, such as the number 1, on both your queries and then using that column to merge the two queries on (something I mentioned a long time ago here)

Which one gives you the best performance? 

Chris has the answer to that question and the tests to prove it.
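For reference, here is a minimal sketch of the first approach in M, assuming two existing queries named Colors and Sizes (hypothetical names, and assuming their column names don’t overlap):

```m
let
    Source = Colors,
    // Add a column holding the entire Sizes table on every row...
    AddAll = Table.AddColumn(Source, "AllSizes", each Sizes),
    // ...then expand it to get one row per Colors x Sizes combination.
    CrossJoined = Table.ExpandTableColumn(
        AddAll,
        "AllSizes",
        Table.ColumnNames(Sizes)
    )
in
    CrossJoined
```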

Comments closed