Press "Enter" to skip to content

Curated SQL Posts

Window Functions in DAX

Jeffrey Wang is speaking my language:

The December 2022 release of Power BI Desktop includes three new DAX functions: OFFSETINDEX, and WINDOW. They are collectively called window functions because they are closely related to SQL window functions, a powerful feature of the SQL language that allows users to perform calculations on a set of rows that are related to the current row. Because these functions are often used for data analysis, they are sometimes called analytical functions. In contrast, DAX, a language invented specifically for data analysis, had been missing similar functionalities. As a result, users found it hard to write cross-row calculations, such as calculating the difference of the values of a column between two rows or the moving average of the values of a column over a set of rows.

Read on to learn more about how these functions work and how they differ from their SQL Server counterparts.

Comments closed

Encryption Recommendations for SQL Server

Matthew McGiffen shares some thoughts:

We looked earlier at the various places our data lives – on disk, in memory, in-transit across the network. A good strategy will protect all of these locations – sometimes with multiple layers. Exactly how you go about doing that for your applications may vary – but as long as you are on SQL Server 2016 or higher there is a default strategy that you should consider. This combines a number of the available SQL Server features to provide the best protection.

This story might look a bit different for Azure SQL Database (though you can still use Always Encrypted there) and will look very different on Azure Synapse Analytics dedicated SQL pools, which don’t have Always Encrypted at all.

Comments closed

Paginated Reports in Power BI Pro

Andy Jones has some good news for us:

Microsoft recently announced that Power BI Paginated Reports is now available to users with a Pro licence. Previously, you required a Premium licence to take advantage of Paginated Reports in Power BI.

Read on to learn how paginated reports work in Power BI (hint: almost like SSRS). Even with Power BI Premium Per User being a fairly reasonable $20 per user per month, it’s nice to see Pro get a bit more value.

Comments closed

Isolated Spark Testing with lakeFS

Adi Polak demonstrates lakeFS:

This tutorial demonstrates how to build a development and testing environment for validating your logic on a full-blown production data volume and variety, working with lakeFS and Spark. You will walk through the journey of creating a repository and building a Spark application while using lakeFS capabilities. You will learn how to data changes, revert them in cases of mistakes or other hiccups, and lately merge separate branches to reflect data changes from the isolated environments.

Not too long ago, I had a couple conversations with developers and data engineers about decentralized development and devs having their own environments and data. This seems like it would be a good approach to that common problem, and it works for Azure Synapse Analytics as well.

Comments closed

AutoML and Model Registration in AML

Tomaz Kastrun continues an advent of Azure Machine Learning. Day 13 covers the topic of Automated ML:

Automated ML is a no-code automated machine learning task. It iterates over many combinations of algorithms and hyperparameters in order to find the best model for your dataset and your prediction variable(s). The final solution is a model, that can be downloaded and later reused. So Automated ML is not just giving you the best model out of a family of algorithms, but lets you use the model, generate the scripts and create the artefacts.

Day 14 concerns model registration:

Important asset is the “Models” in navigation bar. This feature allows you to work with different model types -> custom, MLflow, and Triton. What you do here is, you register a model from different locations (e.g.: local file, AzureML Datastore, AzureML Job, MLflow Job, Model asset in AzureML workspace, and Model asset in AzureML Registry).

Once you open the Models asset, you will see, that you can do many things here. I have already model register from the running the notebook on day4.

Comments closed

Column Exclusion and Rename in Snowflake

Kevin Wilkie plays duck-duck-goose with columns:

With Snowflake, we could do many different things that we’re not used to seeing with a SELECT statement. We’re all used to seeing this – SELECT * and it shows all kinds of columns.

With Snowflake, we can tell Snowflake NOT to show certain columns by using the EXCLUDE operator.

Read on to see how it works and specific requirements around operation. In addition, Kevin shows a way to perform aliasing.

Comments closed

Identifying Rows in sp_wait_for_database_copy)_sync

Jose Manuel Jurado Diaz troubleshoots a problem:

As you could see in this public documentation Auto-failover groups overview & best practices – Azure SQL Database | Microsoft Learn about sp_wait_for_database_copy_sync “sp_wait_for_database_copy_sync prevents data loss after geo-failover for specific transactions, but does not guarantee full synchronization for read access. The delay caused by a sp_wait_for_database_copy_sync procedure call can be significant and depends on the size of the not yet transmitted transaction log on the primary at the time of the call.”

Our customer asked about several scenarios to understand this behaviour and also, verify if there is possible to identify the rows that have not been synced. For this, I developed a POC to test it:

Read on to see what you’d need to do.

Comments closed

FHIR and Azure Health Services

Steve Hughes provides an overview of FHIR and what Azure has to offer:

With the recent updated mandates in the healthcare environment in the United States, Microsoft has continued to expand its capability to support the FHIR standard for integrating healthcare data. While the standard is well documented and Microsoft’s capabilities are expansive, it falls on data professionals to interpret that data and build meaningful reports and produce meaningful insights from the data as it is collected and integrated across environments. This requires a good working knowledge of JSON in SQL to manipulate complex data models. In the session, we did a short review of the FHIR standard and the overall implementation of FHIR in Azure. From there we reviewed the resulting data in the data lake and in Synapse. That was followed up with an overview into the heart of complex SQL using JSON functions in Synapse. Whether or not you are active in healthcare today, this will be an enlightening session on how to use JSON SQL functions within the Azure SQL platforms.

Read on to learn more.

Comments closed

Finding “Stolen” CPU on SQL Server

Ajay Dwiveldi has a public service announcement for us:

The above panel is showing any SQLServer that needs DBA attention. But there is this server highlighted in the above screenshot showing a significant gap b/w OS CPU and SQL CPU. Any gap of over 15-20% b/w OS & SQL CPU consumption is something that should be investigated. This was an indicator that there is some other non sqlservr process that is consuming high CPU on the host.

Friends don’t let friends run anything other than SQL Server on a SQL Server machine.

Comments closed