Limitations on Inlining Functions

Erik Darling takes us through a limitation to SQL Server 2019 scalar function inlining:

There’s a lot of excitement (alright, maybe I’m sort of in a bubble with these things) about SQL Server 2019 being able to inline most scalar UDFs.

But there’s a sort of weird catch with them. It’s documented, but still.

If you use GETDATE in the function, it can’t be inlined.

GETDATE() and its bretheren are non-deterministic so I figured that would be an issue. Check out the documentation for the other limitations.

Understanding Key Lookups

Monica Rathbun explains what a key lookup is in SQL Server:

One of the easiest things to fix when performance tuning queries are Key Lookups or RID Lookups. The key lookup operator occurs when the query optimizer performs an index seek against a specific table and that index does not have all of the columns needed to fulfill the result set. SQL Server is forced to go back to the clustered index using the Primary Key and retrieve the remaining columns it needs to satisfy the request. A RID lookup is the same operation but is performed on a table with no clustered index, otherwise known as a heap.  It uses a row id instead of a primary key to do the lookup.

As you can see these can very expensive and can result in substantial performance hits in both I/O and CPU. Imagine a query that runs thousands of times per minute that includes one or more key lookups. This can result in tremendous overhead which is generated by these extra reads it effects the overall engine performance.

Monica’s absolutely right: key lookups can take a decent query and make it into a performance hog.

Power BI: Comparing Web.Contents and File.Contents Performance

Chris Webb compares the performance of File.Contents and Web.Contents:

In my last post I mentioned the Power Query engine’s persistent cache, which in some scenarios caches the data read from a data source when a query is refreshed. Another important nugget of information that Ehren von Lehe of the Power Query dev team mentioned in a post on the Power Query MSDN forum recently is the fact that if you use File.Contents to get data from a file then the persistent cache is not used, but if you use Web.Contents to get data from the same file then the persistent cache is used. I guess the thinking here is that there is no point creating an on-disk cache containing the contents of a file that is already on disk.

Chris takes us through a couple of unexpected twists, so check it out.

Getting the Last Actual Execution Plan for a Query

Pedro Lopes takes us through a new dynamic management function in SQL Server 2019 CTP 2.4:

In CTP 2.4 not all actual execution plans will be available, you can see more details on that here.

For an upcoming CTP version, all queries will be available with the equivalent of the actual execution plan. At least those where the plan was cached in the first place, or those where the plan has not been evicted from cache.

That caveat aside, I’m happy with this.

Important Considerations with Indexed Views

Jason Brimhall takes us through a few important considerations with indexed views:

The creation of this view has chewed up a bunch of storage. It has jumped right up to the number two spot on the biggest objects list within this database. You can see that differences by comparing the highlighted rows to the previous image. The vPerson view is highlighted in red in this second image to help point it out quickly.

Surely this must be a contrived example and people don’t really do this in the real world, right? The answer to that is simply: NO! It DOES happen. I see situations like this all too often. Far too often, large text fields are added to an indexed view to make retrieval faster. I have mimicked that by adding in two XML columns from the Person.Person table. This is definitely overkill because a simple join back to the table based on the BusinessEntityID would get me those two columns. All I have effectively done is duplicated data being stored and I have achieved that at the low low cost of increased storage of 25% for this small database. If you are curious, the column count between the Person.Person table and this new view is 13 columns each.

Jason takes us through a couple more gotchas and provides some important advice you should follow if you think indexed views might be a fit for you.

Creating Threadpools with ExecutorService in Kafka

Prasanth Nair shows how we can use Java’s ExecutorService to create threadpools for Kafka consumers:

Apache Kafka is one of today’s most commonly used event streaming platforms. While using the Kafka platform, quite often, we run into a scenario where we have to process a large number of events/messages that are placed on a broker. Traditional approaches, where a consumer is listening to a topic and then processes these message within the consumer itself, can become a performance bottleneck if the number of messages being placed on the topic is high. In such cases, the rate at which a consumer can process messages will be very low, as there are a large number of messages getting placed on the topic. A potential solution that can be applied in such a scenario is to offload message processing to the worker threads in a thread pool.

Click through for the Java code.

Tuning Azure SQL Database

Tim Radney walks us through some of the tools we have available to tune Azure SQL Databases:

Many instance-level items that you have been used to configuring on full installations are off limits. Some of these items include:
– Setting min and max server memory
– Enabling optimize for ad hoc workloads
– Changing cost threshold for parallelism
– Changing instance-level max degree of parallelism
– Optimizing tempdb with multiple data files
– Trace flags

Tim does point out workarounds for some of these and gives us the list of things which are possible, so check that out.

Performance Testing Simple Scalar UDF Functions

Wayne Sheffield tries out a simple scalar UDF in SQL Server 2019 to see how it performs:

I recently published a post detailing the new Scalar UDF Inlining feature in SQL 2019 here. That post introduced the new feature in a way that I used to compare performance to the other function types, continuing the performance evaluation of functions that I had previously posted here and here. In the Scalar UDF Inlining post, I used a function to strip all non-numeral values from a string, and to return the result. This used the FOR XML output option.

In thinking about how scalar functions are commonly used, I’ve decided to revisit this feature with a simpler function. I will still compare it to all the other types of functions to see how Scalar UDF Inlining compares to the others.

Wayne’s results are music to the product team’s ears, I’m sure.

Performance Troubleshooting Plus Wait Stats

Jeff Mlakar builds up some thoughts on performance troubleshooting, including wait stats:

Queries go through the cycle of the SPIDS / worker threads waiting in a series like this. A thread uses the resource e.g. CPU until it needs to yield to another that is waiting. It then moves to an unordered list of threads that are SUSPENDED. The next thread on the FIFO queue of threads waiting then becomes RUNNING. If a thread on the SUSPENDED list is notified that its resource is available, it becomes RUNNABLE and goes to the bottom of the queue.

Click through for an analogy using a microwave and plenty more.

CTP 2.3 and Multi-Scoped Temp Tables

Joe Sack announces a performance optimization when you generate temp tables and then work with them in a sub-procedure:

SQL Server 2019 introduces several performance optimizations which will improve performance with minimal changes required to your application code.   In this blog post we’ll discuss one such improvement available in CTP 2.3: reduced recompilations for workloads using temporary tables in multiple scopes.

In order to understand this improvement, we’ll first go over the current behavior in SQL Server 2017 and prior. When referencing a temporary table with a DML statement (SELECT, INSERT, UPDATE, DELETE), if the temporary table was created by an outer scope batch, we will recompile the DML statement each time it is executed.

There’s a pretty big performance improvement here, but architecturally, I really don’t like it. This makes the inner procedure unrunnable unless you know that there should be some temp table(s) and what should be in those temp table(s).


August 2019
« Jul