Press "Enter" to skip to content

Category: Syntax

Data Analysis with Window Functions

Erika Balla looks out the window:

Window functions are an advanced feature of SQL that provides powerful tools for detailed data analysis and manipulation without grouping data into single output rows, which is common in aggregate functions. These functions operate on a set of rows and return a value for each row based on the calculation against the set.

In this article, we delve into window functions in SQL Server. You will learn how to apply various window functions, including moving averages, ranking, and cumulative sums, to achieve comprehensive analytics on data sets. 

Click through for several examples.

Leave a Comment

Common Table Expressions in Postgres

Ryan Booz builds a CTE:

In the first article in this transforming data series, I discussed how powerful PostgreSQL can be in ingesting and transforming data for analysis. Over the last few decades, this was traditionally done with a methodology called Extract-Transform-Load (ETL) which usually requires external tools. The goal of ETL is to do the transformation work outside of the database and only import the final form of data that is needed for further analysis and reporting.

However, as databases have improved and matured, there are more capabilities to do much of the raw data transformation inside of the database. In doing so, we flip the process just slightly so that we Extract-Load-Transform (ELT), focusing on getting the raw data into the database and transforming it internally. In many circumstances this can dramatically improve the iteration of development because we can use SQL rather than external tools.

Read on for the scenario and how you can use common table expressions in Postgres. They’re very similar to what we have in SQL Server, though a few differences do exist.

Leave a Comment

EXPAND() and COLLAPSE() in Power BI Visual Calculations

Marco Russo and Alberto Ferrari play the accordion:

In previous articles, we introduced the concepts of visual context, the visual lattice, and the two visual context navigation operations: EXPAND and COLLAPSE. In this article, we build on that knowledge to first compute a simple percentage over the parent: a simple calculation to consolidate the understanding of EXPAND and COLLAPSE. Next, we move on to a much more complex scenario, computing the inflation-adjusted sales using visual calculations.

Read on to see what the two functions do, using a pair of examples.

Leave a Comment

Working with INTERSECT and EXCEPT

Erik Darling wounds me:

I have never once seen anyone use these. The most glaring issue with them is that unlike a lot of other directives in SQL, these ones just don’t do a good job of telling you what they do, and their behavior is sort of weird.

Unlike EXISTS and NOT EXISTS, which state their case very plainly, as do UNION and UNION ALL, figuring these out is not the most straightforward thing. Especially since INTERSECT has operator precedence rules that many other directives do not.

I’ve used EXCEPT to check if two datasets are equivalent for testing purposes: A EXCEPT B should be zero rows, and B EXCEPT A should be zero rows. It has built-in handling of any NULL madness. Set intersections have their uses as well.

Comments closed

UNION and UNION ALL

Erik Darling hires the Pinkertons:

UNION and UNION ALL seem to get used with the same lack of discretion and testing as several other things in T-SQL: CTEs vs temp tables, temp tables vs. table variables, etc.

There are many times I’ve seen developers use UNION when result sets have no chance of being non-unique anyway, and many times I’ve seen them use UNION ALL when there would be a great benefit to discarding unnecessary duplicates.

Erik’s explanation goes about three steps beyond “UNION is bad so always use UNION ALL.” It’s a must-read for anybody who regularly writes T-SQL queries.

Comments closed

EXPAND and COLLAPSE in Power BI Visual Calculations

Marco Russo and Alberto Ferrari build on a foundation:

In a previous article, we introduced VISUAL SHAPE, the table modifier that adds a hierarchical structure to a table, which is needed to implement visual calculations. In this article, we introduce the concept of visual context, the virtual table lattice, and the two main operators to navigate in the visual context: EXPAND and COLLAPSE.

Click through to learn more about these operators and how they fit into the rubric of visual calculations.

Comments closed

Correlated Subqueries in SQL

Joseph Yeates classifies subqueries:

I’ve recently been brushing up on my SQL skills, as I’ve used the language for a while but less so recently. Through this process, I’ve found that I’m comfortable with the topics of complex joins, Common Table Expressions (CTEs), and nested subqueries. However, it was my deep dive into subqueries where I found something new: correlated subqueries. In this post, we’ll explore the intricacies of subqueries, with a spotlight on the often overlooked (at least for me) correlated subqueries.

Click through to understand what correlated subqueries are, in contrast to other forms of subquery.

Comments closed

Regex Support in Azure SQL DB

Abhiman Tiwari has a big announcement:

We are pleased to announce the private preview of regular expressions (regex) support in Azure SQL Database. Regex is a powerful tool that allows you to search, manipulate, and validate text data in flexible ways. With regex support, you can enhance your SQL queries with pattern matching, extraction, replacement, and more. You can also combine them with other SQL functions and operators to create complex expressions and logic.

This is something I’ve wanted to see in SQL Server for years, and I’m excited that there’s official support now. Prior to that, you could use SQL# to perform some regular expression operations using the CLR, but as long as performance is reasonable on these, it’s a huge feature to include.

Comments closed

The Proper Use of Views and Inline UDFs

Erik Darling plays tic-tac-toe:

The problem is really the stuff that people stick into views. They’re sort of like a junk drawer for data. Someone builds a view that returns a correct set of results, which becomes a source of truth. Then someone else comes along and uses that view in another view, because they know it returns the correct results, and so on and so on. Worse, views tend to do a bunch of data massaging, left joining and coalescing and substringing and replacing and case expressioning and converting things to other things. The bottom line is that views are as bad as you make them.

The end result is a trash monster with a query plan that can only be viewed in full from deep space.

Read on to learn the use cases for views and inline UDFs, as well as a few important notes regarding performance of each. Views are like mogwai: they’re fine as long as you never get them wet and never let them eat after midnight. The problem is, far too many companies are apparently the business equivalent of all-you-can-eat buffets at water parks.

Inline user-defined functions are like patenting a device that lets you shoot yourself in both feet with one pull of the trigger. Which, if I understand things correctly, means you’ll need a Form 4 for each inline UDF.

Comments closed