
Category: Performance Tuning

Analyzing Microsoft Fabric Lakehouse Query Performance

Dennes Torres takes a peek at some views:

You may have already discovered the 4 special views the lakehouse has in the queryinsights schema to track query performance. I made a video about the lakehouse special tables, but since then, they evolved a lot:

  • queryinsights.exec_requests_history
  • queryinsights.exec_sessions_history
  • queryinsights.frequently_run_queries
  • queryinsights.long_running_queries

Let’s discover what these tables have to offer for us to analyze the lakehouse performance.

Click through to see what each one of these holds.


Indexing for PostgreSQL in pgNow

Ryan Booz continues a series on pgNow:

In that first article, I shared how pgNow can be a lifesaver when you need immediate performance insights, highlighting features like query tuning and current activity monitoring. The tool’s ability to take periodic snapshots of query activity and spotlight active sessions has already been a significant help for early users.

Today, I wanted to look at another area of information that pgNow can help you explore during times of performance degradation or even as part of regular database maintenance and hygiene: the Indexing tab.

Click through to see what’s in the feature and to get a free copy of the preview for pgNow.


Handling a Sort Operation in SQL Server Integration Services

Andy Brownsword knows that sometimes, the only winning move is not to play:

Last time out we discussed blocking transformations, what they are, the impact of them, and touched on how to deal with them. In this post we’re going a step further to tackle one of them head on.

Here we’ll demonstrate the impact of blocking caused by the Sort transformation, and look at two options for solving this and slashing execution time.

Sorts aren’t the only blocking transformation that you should push back down to your source (if possible), but it is the most common example.


SQL Server Performance Office Hours

Erik Darling answers a set of user questions:

You have said that table variables, CTEs, Change Tracking, and Azure Managed Instances all suck. Do you have a full list of “features” to avoid?

Click through for a video of Erik answering questions around deadlocks, terrible things, UTF-8, and more. And I like the nuance behind Erik’s answer to this particular question. It’s easy to say “this thing is awful” and be done with it, but oftentimes, the answer is more of “In this particular circumstance, don’t use this thing because of reasons X, Y, and Z; instead, use this thing.” That’s a rather different answer.


Table Compaction in Apache Spark

Miles Cole groups things together:

If there’s anything that data engineers agree about, it’s that table compaction is important. Often one of the first big lessons that folks will learn early on is that not compacting tables can present serious performance issues: you’ve gotten your lakehouse pilot approved and it’s been running for a couple months in production and you find that both reads and writes are increasingly getting slower and slower while your data volumes have not increased drastically. Guess what, you almost surely have a “small file problem”.

What engineers won’t always sing the same tune on is how and when to perform table compaction.

Read on for a dive into the power of compaction (converting a large number of small files into a small number of large files) and plenty of tips along the way.
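To make the "many small files into few large files" idea concrete, here is a minimal Python sketch of the planning step only: deciding which small files would be rewritten together. It is not how Spark or Delta Lake implement compaction. The `DataFile` type, the `plan_compaction` function, and the sizes used are all hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DataFile:
    """Hypothetical stand-in for one data file in a table."""
    path: str
    size_bytes: int


def plan_compaction(files: List[DataFile], target_bytes: int) -> List[List[DataFile]]:
    """Greedily group files so each group stays at or under target_bytes.

    A file that is already larger than target_bytes ends up in a group
    by itself. Each returned group represents one output file to write.
    """
    groups: List[List[DataFile]] = []
    current: List[DataFile] = []
    current_size = 0
    # Smallest files first, so tiny files are packed together.
    for f in sorted(files, key=lambda item: item.size_bytes):
        if current and current_size + f.size_bytes > target_bytes:
            groups.append(current)
            current, current_size = [], 0
        current.append(f)
        current_size += f.size_bytes
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    small_files = [DataFile(f"part-{i:05d}.parquet", 10) for i in range(10)]
    for group in plan_compaction(small_files, target_bytes=50):
        print(len(group), "files ->", sum(f.size_bytes for f in group), "bytes")
```

The sketch leaves out the parts that the linked post is actually about: when to trigger the rewrite and how to do it safely alongside concurrent readers and writers.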


A List of PostgreSQL Parameters

Semab Tariq has a list:

Have you ever experienced your database slowing down as the amount of data increases? If so, one important factor to consider is tuning PostgreSQL parameters to match your specific workload and requirements. 

PostgreSQL has many parameters because it is designed to be highly flexible and customizable to meet a wide range of use cases and workloads. Each parameter allows you to fine-tune different aspects of the database, such as memory management, query optimization, connection handling, and more. This flexibility helps database administrators to optimize performance based on hardware resources, workload requirements, and specific business needs.

In this blog, I will cover some of the important PostgreSQL parameters, explain their role, and provide recommended values to help you fine-tune your database for better performance and scalability. 

Click through for those parameters, including descriptions, default values, and recommendations.


An Overview of PostgreSQL Performance Monitoring via pgNow

Grant Fritchey announces a product:

I’ve been putting together a new PostgreSQL session called “Performance Monitoring for the Absolute Beginner.” There are several ways to get an understanding of how well your queries are running in PostgreSQL, but, frankly, all of them are a bit of a pain to someone coming from the land of Extended Events (ah, my one true love). Because of this, I saw it as an opportunity to help those just getting going in PostgreSQL. I’ll be presenting it for the first time at Postgres Conference in Orlando on March 19, 2025. Come on by.

Anyhoo, wouldn’t it be nice to maybe have a shortcut, an easier way to look at this information?

Well, there is. Redgate has been working on a completely free tool for leveraging just this sort of data called pgNow. Go here to check it out yourself, but I’ll do a quick run through here.

Click through to see how it works.


Tips for Scaling Apache Kafka

Narendra Lakshmana Gowda tunes a Kafka cluster:

Apache Kafka is known for its ability to process a huge quantity of events in real time. However, to handle millions of events, we need to follow certain best practices while implementing both Kafka producer services and consumer services.

Before you start using Kafka in your projects, let’s understand when to use Kafka:

Much of the advice is pretty standard for performance tuning in Kafka, like setting batch size and linger time on the producer or managing consumers in a consumer group.
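For readers who have not met the batch size and linger time trade-off before, the following Python sketch mimics the idea in miniature: buffer records and send them either when the batch is full or when the oldest record has waited long enough. It is a toy model, not the Kafka client. The real producer settings are `batch.size` (measured in bytes) and `linger.ms`; the `MicroBatcher` class, its record-count limit, and the `send` callback are assumptions made for this illustration.

```python
import time
from typing import Any, Callable, List


class MicroBatcher:
    """Toy model of producer-side batching (hypothetical, not a Kafka API).

    Records are buffered and handed to `send` as a list when either
    `batch_size` records have accumulated or the oldest buffered record
    has waited at least `linger_seconds`.
    """

    def __init__(
        self,
        batch_size: int,
        linger_seconds: float,
        send: Callable[[List[Any]], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._batch_size = batch_size
        self._linger = linger_seconds
        self._send = send
        self._clock = clock
        self._buffer: List[Any] = []
        self._first_at = 0.0

    def add(self, record: Any) -> None:
        """Buffer a record; flush immediately if the batch is now full."""
        if not self._buffer:
            self._first_at = self._clock()
        self._buffer.append(record)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def poll(self) -> None:
        """Flush if the oldest buffered record has lingered long enough."""
        if self._buffer and self._clock() - self._first_at >= self._linger:
            self.flush()

    def flush(self) -> None:
        """Send whatever is buffered as one batch."""
        if self._buffer:
            self._send(list(self._buffer))
            self._buffer.clear()


if __name__ == "__main__":
    batcher = MicroBatcher(batch_size=3, linger_seconds=0.05, send=print)
    for event in ["a", "b", "c", "d"]:
        batcher.add(event)
    time.sleep(0.06)
    batcher.poll()
```

A larger batch or a longer linger means fewer, bigger sends at the cost of added latency per record, which is the trade-off those two producer settings control.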


Improving Power Query CSV File Performance with Date Columns

Chris Webb makes things go faster:

A few weeks ago I replied to a question on reddit where someone was experiencing extremely slow performance when importing data from a CSV file using Power Query. The original poster worked out the cause of the problem and the solution themselves: they saw that removing all date columns from their query made their Power Query query much faster and that using the Date.FromText function and specifying the date format solved the problem. While I couldn’t reproduce the extreme slowness that was reported I was able to reproduce a performance difference between the two approaches and Curt Hagenlocher of the Power Query team confirmed that this was expected behaviour.

Read on for the example and explanation.
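As a loose analogue in Python (not Power Query M, and making no claim about Power Query's internals), the sketch below contrasts parsing a date with a stated format against trying several candidate formats in turn. Both function names and the candidate format list are hypothetical, chosen only to illustrate why stating the format removes guesswork.

```python
from datetime import date, datetime
from typing import Sequence


def parse_with_explicit_format(text: str, fmt: str = "%d/%m/%Y") -> date:
    """Parse using one known format: a single attempt, no ambiguity."""
    return datetime.strptime(text, fmt).date()


def parse_by_guessing(
    text: str,
    candidates: Sequence[str] = ("%m/%d/%Y", "%d/%m/%Y", "%Y-%m-%d"),
) -> date:
    """Try each candidate format until one succeeds.

    This costs extra attempts per value, and the first format that
    happens to fit wins even when it is not the intended one.
    """
    for fmt in candidates:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"no candidate format matched {text!r}")


if __name__ == "__main__":
    raw = "03/04/2025"
    print(parse_with_explicit_format(raw, "%d/%m/%Y"))
    print(parse_by_guessing(raw))
```

With the same input, the two functions can disagree about which day is meant, which is a second reason, beyond speed, to state the format when it is known.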
