Press "Enter" to skip to content

Curated SQL Posts

The Performance Cost of AT TIME ZONE

Erik Darling shows that AT TIME ZONE does not scale well when used in filters against columns:

Databases really do make you pay dearly for mistakes, and new linguistic functionality is not implemented with performance in mind.

I’ve written before about how to approach date math in where clauses: Where To Do Date Math In Your Where Clause

And it turns out that this lesson is brutally true if you need to pass time zones around, too.

Read the whole thing. In this respect, AT TIME ZONE is similar to pretty much all other date operators and functions.

Comments closed

GREATEST and LEAST in Azure SQL Database

Arun Sirpal shows off some missing functionality in SQL Server:

Being in the cloud does have many benefits, from lower administration to fast scaling but another “side effect” of operating in Azure SQL Database is the cloud first nature of changes. By this I basically mean new features always get pushed to Azure first before the classic on-premises version so some gems come to light.

Here is one for you. Have you ever wanted MySQL’s functinality to apply LEAST() and GREATEST() argument? Well, you can now, in Azure.

I can’t say that I would use this every day or anything, but I have felt the pain of not having it. There are workarounds, though nothing as convenient as syntax. Hopefully this shows up on-prem in the next version of SQL Server.

Comments closed

Saving Transcripts in Powershell

Garry Bargsley shows how easy it is to save a transcript in Powershell:

This week we are going to launch a blog series geared towards folks that are new to PowerShell. The growing popularity of automation is seeing people getting started with the PowerShell scripting language.

The Start-Transcript is a built-in command that allows you to quickly build a log of actions being taken by your script. There are other ways to build logs, but for beginners using the commands that are available to you is pretty easy.

Read on for a demo.

Comments closed

Beyond 10GB for Power BI Users

Gilbert Quevauvilliers wants to go to infinity and beyond:

By default, when using Power BI Premium or Power BI Premium per user the dataset size is set to 10GB.

I have had the wonderful experience of refreshing my dataset and getting the following error:

In the steps below I will show you how to change this setting to allow for larger dataset sizes.

There are a few steps involved, but hey, if you’re paying for Premium, it’s worth a few steps to get this.

Comments closed

How Foreign Keys Appear in Execution Plans

Hugo Kornelis shows us how foreign key contraints modify certain execution plans:

The top left of this execution plan looks very unsurprising. A Clustered Index Scan to read all rows from the #Products temporary table, and those rows are then passed to a Clustered Index Insert operator for insertion into our permanent table dbo.Products. But wait? Why is there no table or index name listed below the operator name? That’s actually because the operator has not one but three items in its Object property. When a Clustered Index Insert targets just a single index, SSMS can work out the name and show it. But when Clustered Index Insert targets multiple objects, SSMS plays it safe and displays none.

Read on to learn more.

Comments closed

Thinking about Virtual Log Files

Julie Behrens, via Kevin Hill, covers the concept of virtual log files:

It is especially evident there is an issue with VLFs when SQL Server takes a long time to recover from a restart. Other symptoms may be slowness with autogrowth, log shipping, replication, and general transactional slowness. Anything that touches the log file, in other words.

Read on to understand how to figure out if you have a problem with virtual log file counts and a resolution.

Comments closed

Identifying Straggler Tasks in Spark Applications

Ajay Gupta clues us in on a process:

What Is a Straggler in a Spark Application?

A straggler refers to a very very slow executing Task belonging to a particular stage of a Spark application (Every stage in Spark is composed of one or more Tasks, each one computing a single partition out of the total partitions designated for the stage). A straggler Task takes an exceptionally high time for completion as compared to the median or average time taken by other tasks belonging to the same stage. There could be multiple stragglers in a Spark Job being present either in the same stage or across multiple stages. 

Read on to understand the consequences and causes of these straggler tasks, as well as what you can do about them.

Comments closed

Query Store Checks in dbatools

Jess Pomfret contributes to dbatools:

Once I was happy with my settings, I realised we were missing a ‘test’ command for dbatools. The suite of ‘test’ functions in dbatools (a lot that end up as checks in dbachecks btw!), give us an easy way to check our environment against best practices, or our desired settings.

Since dbatools is open-source I was able to write this function (Test-DbaDbQueryStore) and get it added into the module. It’s included as of version 1.0.131, so make sure you’re up to date.  Taking Erin’s suggestions and wrapping them in a little PowerShell, I can make it easier for myself and everyone else to make sure we’re following her guidelines.

Click through to see what those settings look like and how you can compare against current settings.

Comments closed

Creating Sequence Diagrams for SSIS Packages

Aveek Das has an idea for documentation:

In this article, I am going to explain in detail how to document SSIS packages using Sequence Diagrams and the importance of these diagrams in the field of software engineering, no matter which programming language are you using. In my previous article, I have talked about the various UML Diagrams that are being used to document various software engineering processes. Also, I have talked about modular ETL architecture and how to create such a modular package in SSIS. Sequence diagrams are also a part of the broader UML Diagrams which define the interaction between the various components in the system in a chronological manner.

My gut feeling is that this works best with medium-sized collections of packages, where we’re talking 10-30 or so packages in total, and that for something much larger, I’d want an automated tool to build diagrams for me. But I could be way off base on that.

Comments closed