Press "Enter" to skip to content

Category: Performance Tuning

Working with Active Power BI Sessions

Marc Lelijveld provides us insights:

Getting more insights in the telemetry of your Power BI dataset is always interesting. Especially if you share your Power BI dataset for self-service purposes to build new reports on top of your managed dataset, you might want to know who is actually using in and what queries are they executing against your datasets.

Besides that, there might be a whole lot of other valid use cases where you want to have more insights in currently running sessions on your dataset. An example where I recently ran into, was canceling a Power BI Dataset refresh. As there is no button in the Power BI Service to do this, I had to find a different way to do this.

Read on to see how, using DAX Studio.

Leave a Comment

Power BI Report Performance: Number of Visuals on a Page Edition

Chris Webb looks at some extreme scenarios:

You may have read the title of this post and guessed that I’m going to talk about reducing the number of visuals that display data from your dataset as a way of improving performance, but that’s not the case. In this blog post I want to show how visuals that do not display any data from your dataset can have a significant impact on report performance. Before we carry on I suggest you read the series of posts I wrote late last year on measuring the performance of reports in the browser using Chrome/Edge DevTools (part 1 is here, part 2 is here, part 3 is here) because I’ll be using techniques described in these posts in my testing.

Click through for an interesting demo.

Leave a Comment

R: Avoid apply() with Large Datasets

George Pipis walks us through the performance cost of using the apply() function against a large dataset in R:

When we are dealing with large datasets and there is a need to calculate some values like the row/column min/max/rank/mean etc we should avoid the apply function because it takes a lot of time. Instead, we can use the matrixStats package and its corresponding functions. Let’s provide some comparisons.

Click through for a demonstration of how much slower it can be in certain cases. H/T R-Bloggers.

Comments closed

Optimizing Cross Join Performance in Power BI

Chris Webb tries an experiment:

In a late addition to the series of posts that I started here a few months ago, I thought it would be useful to look at how to optimise the performance of a rare (but nonetheless useful) type of merge: the cross join. Cross joins are not available as a join kind in the Table.Join or Table.NestedJoin M functions but there are two ways of creating them in Power Query:

1. Creating a custom column in your query that contains the query that you want to cross join with and then expanding, as Matt Allington describes here
2. Creating a custom column that contains a single value, such as the number 1, on both your queries and then using that column to merge the two queries on (something I mentioned a long time ago here)

Which one gives you the best performance? 

Chris has the answer to that question and the tests to prove it.

Comments closed

Troubleshooting SQL Server with PSSDIAG and SQL Nexus

Mat Hayward-Hill shows how to use PSSDIAG and SQL Nexus. First, PSSDIAG:

Once you have worked on enough SQL Server performance issues, you start to develop an approach and lean on the same tools, to collect and present your data. I will walk you through how I troubleshoot a SQL Server performance issue, having no prior knowledge of the system. Typically, the description of the problem goes along the lines of “it was fine last week, but today queries are taking forever or timing out.”

If the issue is with one query, I will start with the Query Plan. Otherwise, when the problem is system-wide, and the SQL instance sits on top of a Windows on-prem or IaaS server, I use PSSDIAG to gather the data.

Then, SQL Nexus:

This post is the second part of how I troubleshoot a SQL Server performance issue like a pro. In Part 1, we used PSSDIAG to collect performance data. Now, we will look at how to load and analyse that data using SQL Nexus.

Before we dive straight into SQL Nexus, there are a few prerequisites that you need to install. I will confess I didn’t get this right on my first attempt so to make it easier for you, I have included the requirement along with the link of where to find it.

Check out both posts.

Comments closed

Enabling SQL Server Optimizer Hotfixes

John Morehouse takes us through the step-by-step for enabling optimizer hotfixes in SQL Server:

There are a number of knobs and switches that are available to database administrators that can be used to enable better performance.  There are three options in particular that this blog will be discussing, trace flag 4199, the database scoped configuration QUERY_OPIMIZER_HOTFIXES and the qeury hint ENABLE_QUERY_OPTIMIZER_HOTFIXES. Understanding how these options function will give you a hand up on ensuring the query optimizer is running as optimally as possible.

Let’s take a look at the three options.

Read on to learn more. There is some potential risk of regression with new optimizer updates, so standard rules around testing apply.

Comments closed

All About The Compute Scalar Operator

Hugo Kornelis continues a quest to document query plan operators:

The Compute Scalar operator is used to compute new values based on other columns in the same row. These new values are then added as extra column(s) in the output rows.

The expressions used to compute the new values can only refer to constant values and to columns in the input rows of the Compute Scalar operator. Other than that, there are, to my knowledge, no restrictions. The expressions can vary from very simple to extremely complex. The expressions can even include references to scalar user-defined functions, to CLR user-defined functions, and to built-in CLR functions.

Read on for a good deal of information about the operator.

Comments closed

Performance Tuning for Cloudera’s Operational Database

Liliana Kadar, et al, show us the tools we can use to tune Cloudera’s Operatioanl Database:

A query optimizer determined the most efficient way to run a query. Query optimization helps you to reduce the hardware resources required to run a query and also speeds up your query-response time. Cloudera’s Operational Database provides you with various tools such as plan analyzers to make optimal use of your computing resources. 

Cloudera’s OpDB provides various cost-based and rules-based optimizers. You can use different optimizers based on your use cases. OpDB is primarily used for Online Transactional Processing (OLTP) use cases with Apache Phoenix in the OpDB used as a SQL engine. But you can also use Hive and Impala for Online Analytical Processing (OLAP) use cases. 

Read on for recommendations on platform choice as well as indexing and tuning options.

Comments closed

Parallel Processing in Analysis Services

Kasper de Jonge takes us through parallel processing of partitions and tables in Analysis Services:

One thing that has come up several times in the last few weeks is a misconception that you cannot process multiple partitions on the same table or tables in parallel as it would cause a lock. This could be true if you try to do the parallelism yourself. Like in SQL Server you need to think about transactions, the AS engine is a transactional system too.

So, the AS engine is definitely capable of loading data in parallel but only if you let him do the puzzling on concurrency. This means you must send processing commands to the AS engine in one transaction so the AS engine can manage the locks itself. There are other benefits of letting AS doing the work like recalculating the calculated items (tables, columns etc) once instead of multiple times which improves processing performance.

Read on for an example.

Comments closed