Press "Enter" to skip to content

Category: Performance Tuning

Sorting When Columns are Ordered the Same Way

Erik Darling thinks outside the column:

Sorts aren’t just for the column(s) in your order by — if you SELECT *, you need to all the columns in the * by all the columns in the order by.

I know I basically repeated myself. That’s for emphasis. It’s something professional writers do.

Dig it.

The principle Erik is talking about is having a monotonic relationship (that is, as one column increases, the other always increases; and as one column decreases, the other always decreases). If you can guarantee that, and if one of those happens to be indexed already, you can get a nice performance boost.

But wait, I was told there would be no math.

Comments closed

Reducing Visual Count to Improve Performance

Chris Webb explains that you might get better performance in Power BI with fewer visuals:

Before we go any further, I don’t want you to go and change your reports if you’re not going to get any benefit from doing so. Use Performance Analyzer (as shown here) to determine which visuals on your report are the cause of slow performance – there’s no point redesigning visuals that are fast anyway.

As a general rule the more visuals you put on a report page the slower it’s going to get. It’s logical if you think about it: the more visuals there are, the more queries have to be run against your dataset and the more work Power BI has to do to render the report. I know there is a tendency to try to pack as much information onto a page as possible and this often happens when someone else has designed the report you’re trying to build, but you should always try to resist this. Splitting a single large page into multiple smaller pages, using slicers or filters to reduce the amount of data shown at any one time and avoiding gigantic Excel-like tables are a good idea.

It certainly doesn’t mean “get rid of all of your visuals;” after all, speed is only one part of the story. Read the whole thing.

Comments closed

Power BI Query Diagnostics

Paul Turley has a video covering Power BI query performance:

This post demonstrates how the order of steps added to a query can make a big performance difference and drastically effect the number of steps generated by the designer. I’ll demonstrate how to use the new query Diagnostics tools to compare and understand query performance.

The Power Query Editor for Power BI simplifies data transformation processing by generating query steps for each action you perform in the query designer. This whiteboard diagram shows the high-level flow of information through a Power BI solution. Every query has a source (“SRC” in the diagram) followed by a connection. The query consists of a series of transformations (“XForm”) prior to populating a table in the data model.

Read on for a high-level explanation followed by a video which covers the Query Diagnostics feature.

Comments closed

Building a Cache in ksqlDB

Michael Drogalis shows how to build a materialized cache to reduce the load on your Kafka Streams servers:

There are a lot of ways that you can introduce a materialized cache into your architecture. One such way is to leverage ksqlDB, an event streaming database purpose-built for stream processing applications. With native Kafka integration, ksqlDB makes it easy to replicate the pattern of scaling out many sets of distributed caches.

Let’s look at how this works in action with an example application. Imagine that you have a database storing geospatial data of pings from drivers at a ridesharing company. You have a particular piece of logic that you want to move out of the database—a frequently run query to aggregate how active a territory is. You can build a materialized cache for it using ksqlDB.

The tutorial starts you from “grab the Docker container” and takes you through the process.

Comments closed

Troubleshooting Slow Power BI Report Server Reports

Jamie Wick helps us figure out why that Power BI Report Server report is loading so slowly:

Troubleshooting “slow” reports in PowerBI Report Server (or SQL Server Reporting Services) can be an arduous task. End users are often unable to provide detailed (or reliable) data that a report took longer to load today than it did the last time it was run. Even if a user states that the report is now taking 10 seconds longer to load, that additional time needs to be attributed to a specific step in the report generation process before it can be improved/fixed.

In the report server database (ReportServer by default) there is a view (ExecutionLog) that can provide detailed statistics about each execution of a report. Note: ExecutionLog3 view is the newest/current version and the ExecutionLog and ExecutionLog2 views are for backwards compatibility. By default the execution log entries are retained for 60 days.

The view that Jamie shows also works for SQL Server Reporting Services reports, so it can help there as well.

Comments closed

Quick Hits on Azure Databricks Performance

Rayis Imayev has a few thoughts on optimizing delta table-based workloads in Azure Databricks:

2) Enable the Delta cache – spark.databricks.io.cache.enabledtrue
There is a very good resource available on configuring this Spark config setting: https://docs.microsoft.com/en-us/azure/databricks/delta/optimizations/delta-cache

And this will be very helpful in your Databricks notebook’s queries when you try to access a similar dataset multiple times. Once you read this dataset for the first time, Spark places it into internal local storage cache and will speed up the process of further referencing it for you.

Click through for several more along these lines.

Comments closed

Ghost Records and Availability Groups

Aaron Bertrand ran into an interesting problem:

I recently came across an issue in a SQL Server Availability Group scenario where queries against a heavily-used queue table were taking longer and longer over time. The symptoms of the query were that logical reads were increasing rapidly, and we ultimately tracked it down to ghost records (and version ghost records) that were being created as rows from the queue table were consumed and deleted. Because the database was being used in a readable secondary, the ghost cleanup process simply wasn’t able to keep up with the volume of deletes against the table. The first workaround implemented was to suspend the queue consumers and rebuild the table.

Read on to see how Aaron used filtered indexes and soft deletes to mitigate some of the pain.

Comments closed

Diagnosing and Resolving CPU Issues

Jeff Iannucci has a story to tell:

Anyhow, [Erik] Darling’s tool is designed to provide five data sets showing a quick view of memory and processor usage based on several dynamic management views (DMVs). It’s gives you a quick look into what’s going on when the CPU and/or memory of your SQL Server instance are feeling discomfort.

Why do I mention this? Because little did I know the very next day I would be presented with a fine opportunity to use this tool. And by opportunity I mean a server on fire. (Well, not literally, but still…)

Click through for a description of sp_pressure_detector and a testimonial of its utility.

Comments closed

Wait Stats: Necessary but not Sufficient

Greg Gonzalez explains how wait stats are not the only thing you should look at to determine system health:

Waits and Queues has been used as a SQL Server performance tuning methodology since Tom Davidson published the above article as well as the well-known SQL Server 2005 Waits and Queues whitepaper in 2006. When used in combination with resource metrics, waits can be valuable for assessing certain performance characteristics of the workload and aid in steering tuning efforts. Waits data is surfaced by many SQL Server performance monitoring solutions, and I’ve been an advocate of tuning using this methodology since the beginning. The approach was influential in the design of the SQL Sentry performance dashboard, which presents waits flanked by queues (key resource metrics) to deliver a comprehensive view of server performance.

However, some seem to have missed Davidson’s point regarding the importance of resources and rely almost entirely on waits to present a picture of query performance and system health. Waitstats come directly from the SQL Server engine and are easy to consume and categorize. Waiting queries mean waiting applications and users, and no one likes to wait! From a marketing standpoint this is pure gold for a SQL Server monitoring tools vendor – it is easier to evangelize waits analysis as a singular solution for making queries and applications faster than the full story, which is more involved.

Unfortunately, a waits-focused approach to the exclusion of resource analysis can mislead users, and worst-case leave them flying blind. SentryOne team members Kevin Kline and Steve Wright have previously touched on this here and here. In this post I’m going to take a deeper dive into some recent research made possible by Query Store that has shed new light on how deficient waits-focused tuning can truly be.

Interesting research and Greg does a great job of explaining it.

Comments closed