
Curated SQL Posts

Implementing GREATEST in SQL Server 2019

Ronen Ariely is on a mission to be the greatest:

The function GREATEST returns the maximum value from a list of one or more expressions. It returns the data type with the highest precedence from the set of types passed to the function.

This function was added to Azure SQL. At this time, it is supported in Azure SQL Database, Azure SQL Managed Instance and Azure Synapse Analytics serverless. 

Unfortunately, it is not yet supported in on-premises SQL Server or in Azure Synapse Analytics dedicated SQL pools.

Click through for a pair of alternative constructs while we wait for GREATEST on-premises.
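
For a flavor of what such a workaround can look like, here is a minimal sketch that wraps MAX() around a VALUES constructor. The table and column names are hypothetical, and this is not necessarily either of the constructs Ronen demonstrates:

-- Hypothetical table and columns; one common stand-in for GREATEST()
SELECT o.OrderID,
       (SELECT MAX(val)
        FROM (VALUES (o.PriceA), (o.PriceB), (o.PriceC)) AS v(val)) AS GreatestPrice
FROM dbo.Orders AS o;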


Getting Started with KQL

Steve Jones starts learning about the Kusto Query Language:

I saw an episode of Data Exposed with my good friend, Hamish Watson. He talked about KQL (Kusto Query Language) being the next query language you need to learn. I was skeptical of the title, but I decided to give this a try.

In the episode, Hamish points out a cheat sheet from Microsoft, which I thought was a good resource. However, while watching the video, I browsed over to the demo site Microsoft has at https://aka.ms/lademo. You need an Azure account to log in, but this is a demo site where you can query some Log Analytics data. The new query window below is what appears when you go here:

If you’re already familiar with the way Splunk’s filtering language works, KQL will feel familiar. It’s a worthwhile language for Azure-based administrators to know, as it’s the most powerful way to get data out of Log Analytics.


Why Did It Recompile?

Grant Fritchey answers an interesting question:

Strictly speaking, a recompile isn’t really a performance tuning problem. In fact, a lot of time, recompiles are desired because they reflect changes in statistics which are likely to need a new plan. However, you can get really excessive levels of recompiles because of a variety of different issues. So, identifying the causes can be a pain. Here’s one way to get it done.

Click through for an extended event which does the job.
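
For the general shape of such a session (not necessarily Grant’s exact definition), an Extended Events session built around the sql_statement_recompile event might look like this:

-- Minimal sketch: capture statement-level recompiles along with the statement text
CREATE EVENT SESSION [Recompiles] ON SERVER
ADD EVENT sqlserver.sql_statement_recompile
    (ACTION (sqlserver.sql_text, sqlserver.database_name))
ADD TARGET package0.event_file (SET filename = N'Recompiles');
GO

ALTER EVENT SESSION [Recompiles] ON SERVER STATE = START;
GO

The event’s recompile_cause field is the part that answers the “why” question.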


Building an SSMS Database Solution

Andy Leonard has a four-parter for us on database solutions in SQL Server Management Studio. Part one provides an introduction:

I like Microsoft Visual Studio a lot. I know some members of the team that developed Visual Studio, and they are scary-smart individuals who have forgotten more about developing software than I will ever know.

For some reason, I am not fond of SQL Server projects in Visual Studio. I believe the reason is that I am not familiar with the template. Please note I used the word fond intentionally. It’s an emotion. In this case, it’s all about me. I believe my emotion would change if I took the time to learn more about the Visual Studio SQL Server project template.

I continue to attempt to learn VS database projects. In the meantime, I prefer SQL Server Management Studio solutions.

Part two shows how to add a new query:

One solution is to add instrumentation to T-SQL scripts. I personally like to write T-SQL scripts that are idempotent (a fancy way to describe “re-executable with the same results”). One way to write idempotent T-SQL is:

1. First check for the current state

2. Provide feedback (instrumentation) on the status

3. Provide more feedback on actions driven by the status (yep, more instrumentation)
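
As a minimal sketch of that pattern (the schema name is made up and this is not Andy’s exact code):

-- 1. Check the current state; 2. report it; 3. act and report the action
IF NOT EXISTS (SELECT 1 FROM sys.schemas WHERE name = N'rpt')
BEGIN
    PRINT 'Schema rpt does not exist. Creating it...';
    EXEC (N'CREATE SCHEMA rpt AUTHORIZATION dbo;');
    PRINT 'Schema rpt created.';
END
ELSE
BEGIN
    PRINT 'Schema rpt already exists. Nothing to do.';
END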

Part three includes tables and views in the mix:

Click the “New Query” button in SSMS and add the following T-SQL:

Part four includes stored procedures:

Note the DDL to manage stored procedures is very similar to the DDL for managing views.

If all goes according to plan, the first execution of the s.i DDL T-SQL statement should generate the following messages:

Andy also shows how to use SQLCMD to create a proper deployment script.
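
The SQLCMD piece is worth calling out. As a hedged sketch (the file names and path are hypothetical, not Andy’s), a master deployment script run in SSMS with SQLCMD Mode enabled can stitch the individual scripts together using :setvar and :r:

:setvar ScriptDir "C:\Projects\MyDatabase"

-- Run each script in dependency order (enable Query > SQLCMD Mode in SSMS first)
:r $(ScriptDir)\01_Schemas.sql
:r $(ScriptDir)\02_Tables.sql
:r $(ScriptDir)\03_Views.sql
:r $(ScriptDir)\04_StoredProcedures.sql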


Checking if a Spark DataFrame is Empty

The Hadoop in Real World team has a one-liner for us:

A quick answer that might come to your mind is to call the count() function on the dataframe and check if the count is greater than 0. count() on a dataframe with a lot of records is super inefficient.

count() will do a global count of records in the dataframe from all partitions and then add all the intermediate counts together to get the final count. You will find this approach very slow for big dataframes.

Click through for a much faster one-liner.


Creating Fireworks with R

Tomaz Kastrun is ready for Silvester:

New Year’s Eve is almost here, and what better way to celebrate than with fireworks? Snap, pop, crack, boom. This is the most peaceful, animal-friendly, harmless, eco-friendly, child-friendly, no-fire-needed, educational, and nifty fireworks.

To get the fireworks, fire up the following R function.

I mean, but I enjoy fire… Though you could launch these in R and save the good stuff for the 4th of July.


Cleaning SQL Express Databases

Kevin Hill knows the pain:

I was contacted by a lawyer that was using a 3rd party application to store emails, keep track of time, etc.

The backend of the application is SQL Server Express edition, which has a hard limit of 10GB for the data file.

One quick note for people with lots of LOB data, remember to reorganize with LOB_COMPACTION = ON as that’s the only way to be sure. Also, depending on how old the version of SQL Server is, there was a bug with LOB compaction which affected SQL Server 2014 and earlier. But, uh, hopefully you’re patched past that point…

Also, getting up to 2016 SP1 means that Express Edition gets data compression. It wouldn’t directly help in this case, but if you have a lot of non-LOB data on Express Edition, it can work wonders, for some definition of “wonders.” After all, if you’re using Express Edition, wonders are by definition pretty small.
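
For reference, the two commands in question look roughly like this (the table and index names are hypothetical):

-- Reorganize and compact LOB data
ALTER INDEX ALL ON dbo.EmailAttachments
    REORGANIZE WITH (LOB_COMPACTION = ON);

-- Data compression, available in Express Edition as of SQL Server 2016 SP1
ALTER TABLE dbo.Emails
    REBUILD WITH (DATA_COMPRESSION = PAGE);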
