Performance Tuning – Page 24

Optimization Parameters in Oracle 19c

Published 2021-06-28 by Kevin Feasel

Kellyn Pot’Vin-Gorman enters a time warp:

As I and the dedicated CSA were working to optimize the ETL load on Oracle in Azure IaaS, I noticed that there wasn’t a significant improvement with physical VM and storage changes as expected. As I dug into the code and database design, I started to document what I’ve summarized above and realized that the database was quite frozen in time. Even though I couldn’t make changes to the code, (per the customer request) I was quickly understanding why we had such limited success and why I was failing miserably as I attempted to put recommended practices in place at the parameter level for the Oracle 19c database from what they had originally.
As I thought this through, I had an epiphany- This database was doing everything in its power to be a 10g or earlier database so why shouldn’t I optimize it like one?

Read on to see what this entails.

Comments closed

FGCB_ADD_REMOVE Latch

Published 2021-06-22 by Kevin Feasel

Paul Randal looks at a particular latch:

Most latch class names are tied directly to the data structure that they protect. The FGCB_ADD_REMOVE latch protects a data structure called an FGCB, or File Group Control Block, and there will be one of these latches for each online filegroup of each online database in a SQL Server instance. Whenever a file in a filegroup is added, dropped, grown, or shrunk, the latch must be acquired in EX mode, and when figuring out the next file to allocate from, the latch must be acquired in SH mode to prevent any filegroup changes. (Remember that extent allocations for a filegroup are performed on a round-robin basis through the files in the filegroup, and also take into account proportional fill, which I explain here.)

Read on to understand what can cause this particular latch to become a bottleneck in your system.

Comments closed

Searching for Key Lookups

Published 2021-06-15 by Kevin Feasel

Grant Fritchey answers a question:

While teaching about Extended Events and Execution Plans last week, Jason, one of the people in the class, asked: Is there a way in Extended Events to find queries using a Key Lookup operation? Sadly, the answer is no. However, you can query the Execution Plans in cache or in the Query Store to find this. Thanks for the question Jason. Here’s your answer.

Read on to see how.

Comments closed

Changing Power BI Evaluation Container Numbers

Published 2021-06-14 by Kevin Feasel

Chris Webb shows how we can optimize the number of evaluation containers in Power BI:

Last week I showed how the new MaxEvaluationWorkingSetInMB registry setting could increase the performance of memory-hungry Power Query queries in Power BI Desktop. In this post I’ll show how the other new registry setting, ForegroundEvaluationContainerCount, can also help performance. Before I carry on I recommend you read the documentation on these new registry settings if you haven’t done so already.
To illustrate the effect of this setting I created ten identical Power Query queries feeding an Import mode dataset in a new .pbix file, each of which read data from the same 150MB CSV file, apply the a filter and then count the number of rows returned.

I don’t think I like having to modify a registry setting each time; that’s leading me to believe I should rarely (or never) mess with this.

Comments closed

Maximizing Availability Group Performance

Published 2021-06-10 by Kevin Feasel

Jonathan Kehayias has a few tips for improving performance of your Availability Groups:

Since Microsoft first introduced the Always On Availability Groups (AGs) feature in SQL Server 2012, there’s been a lot of interest in using AGs for both high availability and disaster recovery (HADR), as well as for offloading read-only workloads. The combination of the best features for failover clustering, the simplicity of data movement and synchronization from database mirroring, and the ability to offload read-only workloads to secondaries has given businesses a compelling reason to upgrade to leverage AGs.
But, as the saying goes, there’s no such thing as a free lunch, and there are several performance implications and considerations you must be aware of to have a successful deployment using AGs. This blog post will explore some of the considerations and look at how to plan, architect, and implement an AG with minimal latency and performance impact on the production workload.

Click through for those tips.

Comments closed

Speeding Up Power Query with Evaluation Container Memory

Published 2021-06-07 by Kevin Feasel

Chris Webb notes a new toggle in Power Query:

However if you have just read the docs you may be wondering what these two new registry key settings actually do. In this post I’m only going to talk about one, MaxEvaluationWorkingSetInMB; I’ll leave ForegroundEvaluationContainerCount for a future post.
At various times in the past I have blogged about how, when you run a Power Query query, the query itself is executed inside a separate process called an evaluation (or mashup) container and how this process has a limit on the amount of memory it can use. Some transformations such as sorting a table, doing a group by, pivoting and unpivoting require an entire table of data to be held in memory and if these operations require more memory than the evaluation container is able to use then it starts paging and query performance gets a lot worse.

Read on to see where setting the max evaluation working set in memory can help, as well as the caveats that Chris lays out.

Comments closed

Memory Grants for CHAR vs VARCHAR

Published 2021-06-04 by Kevin Feasel

Erik Darling has a head-slapping moment:

While working with a client recently, we found that someone, at some point in time, probably during the original migration from Access, had chosen CHAR columns rather than VARCHAR columns.
Okay, fine. How bad could it be?
Bad enough that… A whole bunch of columns that only had a single character in them were stored in CHAR(1000) columns.

I like CHAR…well, to be specific, NCHAR. But only when you’ll need exactly that many characters.

Comments closed

Speeding Up Azure Data Factory Pipelines

Published 2021-06-01 by Kevin Feasel

Hiram Fleitas doesn’t have all day to wait for that pipeline to finish:

His issue was pretty much as mentioned on the tile. Our bank’s Azure Data Factory pipeline is running slow moving data from on-prem, we’re copying all tables in a SQL Server database, files from ftp sites and network share drives to Azure SQL DB Managed Instance and to blob storage (our datalake) , do you have some recommendations how to make it go faster? Its around 300GBs and takes over 8 hrs.
So I replied with the following and figured to post it here as it may help others.

Hiram has a video, as well as specific advice to offer.

1 Comment

Tips for Improving Code Performance in R

Published 2021-05-05 by Kevin Feasel

Mira Celine Klein continues a series on code performance in R:

This is the second part of our series about code performance in R. It contains a lot of approaches to reduce the time your code needs to run. It’s useful to know those ideas before starting to write new code, but it also helps to optimize existing code.
If you have already written some code you want to speed up, but don’t know which part of it is actually slow, I recommend you to read the first part of this series on profiling. That article also introduces the microbenchmark package which we are going to use to measure code performance in this article.
Let’s start with a seemingly obvious rule, which is however not always easy to follow.

Read on for some tips. H/T R-bloggers.

Comments closed

When PyODBC fast_executemany Isn’t

Published 2021-05-04 by Kevin Feasel

Jon Morisi troubleshoots a performance issue:

I recently had a project in which I needed to transfer a 60 GB SQLite database to SQL Server. After some research I found the sqlite3 and pyodbc modules, and set about scripting connections and insert statements.
The basic form of my script is to import the modules, setup the database connections, and iterate (via cursor) over the rows of the select statement creating insert statements and executing them.
The issue here is that this method results in single inserts being sent one at a time yielding less than satisfactory performance. Inserting 35m+ rows in this fashion takes ~5hrs on my system.

Jon tries out a few different options. It would appear that there is no easy bulk insertion operation with PyODBC.

Comments closed

Category: Performance Tuning