Press "Enter" to skip to content

Curated SQL Posts

Describing the Physical Join Operators

Deepthi Goguri explains the three physical join operators in SQL Server:

Merge join is not a bad thing and it may be efficient already in your execution plan. What you have to observe when you see the merge joins and performance slow on that plan is to focus on the upstream operations that are going into the merge join. Whether the data is presorted as you already have an index or whether the data is presorted in SQL Server own way then in that case, you can simply check if you can just add that missing column in the index and place in the last key column in the index or use a different join algorithm will be better. The other scenario might be you have lots of duplicate values in your data. If that is the case SQL Server will be using the work tables to handle how the duplicate values can be joined on. So, if you see the duplicate values or using tempdb, then finding the better options will be good.

Click through for more detail. Each physical operator has its place and does quite well within it, but the challenge comes when the optimizer thinks a particular route is better than the one you had in mind.

Comments closed

Comparing CSV to Parquet File Loading Performance in Power BI

Chris Webb has a comparison for us:

Earlier in this series on importing data from ADLSgen2 into Power BI I showed how partitioning a table in your dataset can improve refresh performance. In that post I used CSV files in ADLSgen2 as my source and created one partition per CSV file, but after my recent discovery that importing data from multiple Parquet files can be tuned to be a lot faster than importing data from CSV files, I decided to try creating partitions linked to Parquet files instead.

Click through for the experiment and its results.

Comments closed

Query Plans and Window Functions

Erik Darling has a two-fer here. First, window functions and parallelism:

When windowing functions don’t have a Partition By, the parallel zone ends much earlier on than it does with one.

That doesn’t mean it’s always slower, though. My general experience is the opposite, unless you have a good supporting index.

But “good supporting index” is for tomorrow. You’re just going to have to deal with that.

Second, columnstore behavior with respect to window functions:

Not only is the parallel version of the row mode plan a full second slower, but… look at that batch mode plan.

Look at it real close. There’s a sort before the Window Aggregate, despite reading from the same nonclustered index that the row mode plan uses.

But the row mode plan doesn’t have a Sort in it. Why?

Check out both posts.

Comments closed

Automated File Cleanup with Powershell

Jeffrey Hicks does some spring cleaning:

Springtime is approaching in North America. Where I live, the snow has finally melted and we have blue skies with warmer temperatures. Of course, this means Spring Cleaning. Time to clear out the winter debris and spruce up the house. For me, this is also a good time for some computing housecleaning as well. I don’t know about your Windows environment, but I tend to accumulate a lot of junk. Most of the time I don’t see it, but I know it’s there. While the junk normally doesn’t have a negative impact, I think mentally, I like clearing things out and tidying up. So I pulled out some older PowerShell code, freshened it up, and now I have a set of tools for clearing out junk and temporary folders. Let me show you what I came up with.

Click through for the scripts.

Comments closed

Auto-Failover Groups and Grace Periods

Taiob Ali clears up some misunderstanding:

The auto-failover groups feature for the Azure SQL database can be configured with an automatic failover policy. Azure triggers failover after the failure is detected and the grace period has expired. Grace period is determined by a setting called ‘GracePeriodWithDataLossHours’ that cannot be set under one hour. Why is it not allowed to set a time which is less than an hour? Can your business tolerate the application be down for that period? Should your turn off Auto Fail-over and set it to manual?

I noticed a lot of confusion around this setting, including my own. Some of the confusion is due to a lack of clarity in the documentation. I checked with the Microsoft Azure SQL team, and they are actively working on clarifying some of the questions I raised.

I want to thank Dimitri Furman and Roberto Bustos from the Azure SQL Team for clarifying some of my confusion that I will share here.

Read on for a Q&A style explanation of auto-failover and grace periods.

Comments closed

What to Do if a Database Isn’t Synchronizing

Lee Markup has some advice:

I currently manage 5 AlwaysOn Availability Groups. Two are on SQL Server 2014 and overdue for an upgrade, while three of them are on SQL Server 2017.  From time to time I have run into a couple of different situations that I needed to troubleshoot and I want to tell you where to look and what to check on in these scenarios. I can’t possibly tell you about everything that could go wrong, but I can tell you about my experience with AlwaysOn Availability Groups and let you decide if that experience helps you or not.

Click through for more.

Comments closed

Fixed Server and Database Roles in SQL Server

Greg Larsen takes us through the built-in set of server and database roles in SQL Server:

Managing Security for SQL Server is extremely important. As a DBA or security administrator, you need to provide access for logins and database users to resources within SQL Server. SQL Server has both server and database resources where access might need to be granted. Access to these resources can be granted to either individual logins or database users or can be granted to roles, for which logins or database users can be members. Granting access via a role is known as role-based security.

There are two types of roles: fixed or user-defined. In this article, I will discuss the different fixed server and database roles provided with SQL Server and how these roles can be used to support role-based security to simplify providing access to the different SQL Server resources. In a future article, I will discuss user-defined server and database roles.

Click through for more information. Or just give everybody sysadmin. I mean, that’s probably fine, right?

Comments closed

Power Query and the First Day of the Week

Ed Hansberry decrees that henceforth, Thursday shall be the first day of the week:

By default, Power BI starts its week on Sunday, but that doesn’t work for every situation. Some companies prefer their week start on Monday, and I recently helped someone that needed their week to start on Thursday. Fortunately, by tweaking your date table in Power Query, you can make this change with no calculations at all.

Click through to see how.

Comments closed

VMware In-Guest Time Synchronization

David Klee reviews a product update:

I just found out that VMware has updated their in-guest time synchronization with the 7.0 Update 1 release. Previously, we had to manually disable some of the advanced time synchronization ‘features’ that didn’t adhere to the front-end GUI option that said to not synchronize the guest time with the host. For most VMs, it is not that big of a deal, but for SQL Servers running in a highly available configuration, this act could break your availability solution.

Click through to see what has changed in the product.

Comments closed

Retrieving Counts of Cosmos DB Collections

Manoj Pandey shows how you can retrieve counts of records in Cosmos DB using the .NET client:

Here in this post we will use C# .net code (for beginners like me) to see how to:
1. Connect to a Cosmos DB instance
2. Get list of all Databases in a Cosmos DB
3. Iterate through all the Databases and get the list of all Collections (or Tables)
4. Get COUNT of all documents/items (or records) in these Collections

Click through to see how.

Comments closed