Press "Enter" to skip to content

Curated SQL Posts

Expressive T-SQL with Equivalent Performance

Aaron Bertrand shows off one of the reasons I like SQL so much:

As an analogy, there are many routes you can take from New York City to Dallas. Some may be faster than others, some may be fewer miles but take longer, some are more fuel-efficient due to average speed limits, some more scenic, and some more toll-friendly. The beauty is that if you and I are independently planning the same trip, we can choose our routes based on our individual priorities. I may not like interstates, or I may prefer to drive more westerly until the sun starts setting, and you may want to see a particular tourist attraction, visit an uncle, or stop in a certain city.

A query is similar. Usually, performance is of utmost importance, but even that isn’t always true. When two or more queries give the same answer and have identical (or “close enough”) performance, the choice can come down to other factors, as mentioned above. I recently answered a question on Stack Overflow where the user was asking how to filter a grouping where an aggregate condition was true.

The fundamental insight here is that SQL is a 4th generation language, otherwise known as a declarative language: we tell the interpreter what we want and let it determine a path to get us there. By contrast, 3rd generation languages like C are imperative languages: it does what we tell it to do, no more, no less (until the compiler gets in there and re-writes our code to make it better…but these are academic ideas languages approach, not hard-and-fast mandates). There are benefits and drawbacks to either language depending on how creative you are and how good the interpreter/compiler is.

Comments closed

Unit Testing ADX Functions

David Giard builds some tests:

Our application contains many functions that return data stored in Azure Data Explorer (ADX). We wrote these functions in Kusto Query Language (KQL) and each function returns a set of data based on the arguments passed. Although developers tested these functions as they wrote them, we needed a way to validate that the functions continued to work as the code and the data changed.

Automated Unit testing is an essential part of any application development life cycle. It validates that code works properly and minimizes the risk that future code changes will break existing functionality.

In this article, I will discuss the approach we took in automating the testing of ADX functions.

Click through to see how to use the assert() function and build some tests.

Comments closed

Language Translation via Power BI Field Parameters

Gerhard Brueckl shows off a great use of Power BI Field Parameters:

The current approaches when it comes to data and value translations are more workarounds than actual solutions. They probably work fine for small data models and very specific use-cases but usually fall short in performance, usability or maintainability when implemented on a larger scale enterprise models.

The recently introduced Field Parameters in Power BI give us a bit more flexibility here and another potential solution to implement data and value translations in Power BI.

Click through for an example which shows data in English, Spanish, and French.

Comments closed

Defining Data Products Down

Paul Andrew plays Papers, Please! with data mesh:

The reasons for calling out what should not be declared as a data product can take many different forms and in some cases even result in anti-patterns that should be addressed as technical debt etc. Governanace and compliance has always been a key part of successful platform delivers, but who/how do we go about policing this both technically and in the context of people/process? Maybe ‘Data Product Police’ could be a thing! 

Read on to see how to spot a data pretender in a field of data products.

Comments closed

Consuming an Azure ML AutoML Model in Excel

Lewis Prince needs to do some heavy lifting in Excel:

It has come back to my turn to write a blog post, and if you remember my previous one concerned why you should use Azure based AutoMl and subsequently how to do so. If you followed that then you will be left with a model of which you’ve scored and know the performance of, but no way of how to then deploy and use your model. I will outline the steps needed to do this (which involves a major shortcut as we are using an AutoMl model), and then show you the required VBA needed to consume this in Microsoft Excel.

Read on to see how you can do this. Back in the really old Azure ML days, you could download an Excel workbook which would have things set up and you could feed in a bunch of input data and get predictions.

Comments closed

Azure Synapse Analytics May 2022 Updates

Ryan Majidimehr lays out some updates for Azure Synapse Analytics:

Serverless SQL pools let you query files in the data lake without knowing the schema upfront. The best practice was to specify the lengths of character columns to get optimal performance. Not anymore!  

Previously, you had to explicitly define the schema to get optimal query performance. In this case, the column countries_and_territories is defined as varchar(50):  

There are some interesting updates in this month’s release, including the public preview of Azure Synapse Link for SQL, which connects to Azure SQL DB and SQL Server 2022.

Comments closed

Using Power BI Field Parameters with Data from Kusto

Dany Hoter combines Azure Data Explorer and a new feature in Power BI:

Field parameters are a new feature in Power BI as of the May version.

With field parameters you can give the consumer of a report a lot of flexibility about the content of the report, what fields are used in the visuals, what time granularity is used and what measures are displayed.

All this without writing any DAX or M code.

Click through for an example of how this works.

Comments closed

Perspective on Spinlocks

Erik Darling speaks with wisdom:

The more people want to avoid fixing what’s really wrong with their server, the more they go out and find all the weird stuff that they can blame on something else (usually the product), or keep doing the same things that aren’t fixing the problem.

Spinlocks are one of those things. People will measure them, stare them, Google them, and have no idea what to make of them, all while being sure there’s something going on with them.

I don’t want to discount when spinlocks can actually cause a problem, but you shouldn’t treat every performance problem like it’s a bridge too far from what you can solve.

I have seen performance problems which actually did come down to spinlock issues. For every one of those, I’ve seen, oh, about 95-100 or so which came down to inefficient code.

Comments closed

How It Works: Power BI Field Parameters Edition

Gilbert Quevauvilliers figures out how field parameters work:

In this blog post I want to give a visual representation as to how field parameters works and what the current limitations are.

It is important to be aware of the limitations so that you do not get caught out later or you are trying to figure out why it is not working.

I do hope my descriptions and pictures below help you understand how it works and when it does not work!

Click through for some detailed graphics and explanation.

Comments closed