Press "Enter" to skip to content

Category: Syntax

The Value of QUOTENAME

Quoth Chad Callihan, “Occasionally more”:

QUOTENAME can be used to make sure database objects are valid in your query. Most of the time, objects like table names only contain valid characters, so there’s nothing to worry about. But nobody’s perfect. Let’s look at an example of what can happen when somebody creates a table with a forward slash in the name and see how QUOTENAME can be used to query against it.

QUOTENAME is also a good way of preventing SQL injection, though I still prefer appropriate use of exec sp_executesql in any case in which it’s possible to use it.
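As a quick sketch of both points (the table name Sales/2023 and the surrounding query are invented for illustration), QUOTENAME wraps an identifier in square brackets by default, and sp_executesql runs the assembled string:

-- A table name containing a forward slash breaks an unquoted reference
SELECT QUOTENAME(N'Sales/2023');  -- returns [Sales/2023]

-- Quoting the name when building dynamic SQL keeps the statement valid
DECLARE @table sysname = N'Sales/2023';
DECLARE @sql nvarchar(max) = N'SELECT COUNT(*) FROM dbo.' + QUOTENAME(@table) + N';';
EXEC sys.sp_executesql @sql;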


MATCHBY for DAX Window Functions

Jeffrey Wang plays matchmaker:

In this final installment of our four-part mini-series on DAX window functions, we’ll be focusing on a new development. The May release of Power BI Desktop has enriched all DAX window functions – namely OFFSET, WINDOW, INDEX, RANK, and ROWNUMBER – with an additional sub-function, MATCHBY, supplementing the existing sub-functions, ORDERBY and PARTITIONBY. In this article, we’ll delve into the purpose of the MATCHBY function, along with the three specific problems it aims to resolve.

Read on to understand what the MATCHBY sub-function does and why it can be important.


Recursive Common Table Expressions in Snowflake

Kevin Wilkie is too fancy for simple joins:

Today, I want to talk about that fun edge case when you’re having to join a table to itself in Snowflake. Does it happen often? Not unless your architect just hates you.

Let’s use the normal pieces of data that everyone uses for this kind of thing – employee/manager relationships. We have the employee table that we’ve been working with, and we’ll play with it for this example.

The syntax is a bit different from T-SQL, but the concept is still the same.
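To make the difference concrete, here is a minimal sketch of the Snowflake form, assuming a hypothetical EMPLOYEES table with EMPLOYEE_ID, EMPLOYEE_NAME, and MANAGER_ID columns; the notable departure from T-SQL is that Snowflake requires the RECURSIVE keyword:

WITH RECURSIVE org_chart AS (
    -- Anchor member: employees with no manager sit at the top
    SELECT employee_id, employee_name, manager_id, 1 AS org_level
    FROM employees
    WHERE manager_id IS NULL
    UNION ALL
    -- Recursive member: attach each employee to the chain built so far
    SELECT e.employee_id, e.employee_name, e.manager_id, oc.org_level + 1
    FROM employees e
    JOIN org_chart oc
      ON e.manager_id = oc.employee_id
)
SELECT employee_name, org_level
FROM org_chart
ORDER BY org_level, employee_name;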


Trying NTILE

Chad Callihan looks at the fourth ranking window function:

Have you ever used the NTILE function? Or have you even heard of the NTILE function? It seems to be one of the lesser known, lesser used window functions in SQL Server. I’ve never come across it in the wild but maybe there are those that use it all the time. Either way, let’s have a look at what it does and how it can be used.

Click through for a demo. I definitely use it a lot less than ROW_NUMBER(), RANK(), and DENSE_RANK(), but I have used it to some good effect in the past, mostly in cases where I’ve wanted to focus on the top X% of data for an analysis.
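As a sketch of that top-X% pattern (table and column names invented for the example), NTILE(4) deals rows into four evenly sized buckets, so keeping bucket 1 keeps roughly the top 25%:

WITH Ranked AS (
    SELECT SaleID,
           SaleAmount,
           NTILE(4) OVER (ORDER BY SaleAmount DESC) AS Quartile
    FROM dbo.Sales
)
SELECT SaleID, SaleAmount
FROM Ranked
WHERE Quartile = 1;  -- roughly the top 25% of rows by SaleAmount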


NTILE and QUALIFY in Snowflake

Kevin Wilkie continues to build a sample:

In our last post, we went over one way to get a sample of data. In the end, was it right? Heck, no – at least not if we wanted a percentage of rows returned. Now, SAMPLE does work fine if you want a specific number of random rows returned each time.

But, let’s face it – sometimes we will be asked for 10% of a table – especially in this world of Machine Learning. So let’s attempt to find a way to make this happen, shall we?

QUALIFY would be a nice clause to have in T-SQL. Instead, you’re probably used to creating a common table expression or subquery and then applying a WHERE clause to the output of that CTE/subquery.
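A rough sketch of both shapes, using an invented ORDERS table: Snowflake’s QUALIFY filters on the window function inline, while T-SQL needs the CTE-plus-WHERE detour.

-- Snowflake: filter window-function output directly
SELECT order_id, order_total
FROM orders
QUALIFY NTILE(10) OVER (ORDER BY order_total DESC) = 1;  -- ~top 10% of rows

-- T-SQL equivalent: wrap the window function in a CTE, then filter
WITH buckets AS (
    SELECT order_id,
           order_total,
           NTILE(10) OVER (ORDER BY order_total DESC) AS bucket
    FROM orders
)
SELECT order_id, order_total
FROM buckets
WHERE bucket = 1;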


Choosing from a List of Values

Greg Dodd doesn’t need no steenkin’ tables:

Sometimes you have a list of values or parameters that you’d like to run a select statement over the top of to get a result. There are a few ways of doing this; usually, I see people create a temporary table and insert the data. But is there an easier way?

Yes, we can select from VALUES.

The result of this is called a virtual table, and as far as other parts of the query are concerned, it’s just another table.
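Here is a minimal sketch of the idea, with made-up values; the VALUES list is aliased as a derived table, complete with column names:

SELECT v.Id, v.Name
FROM (VALUES
        (1, N'Alpha'),
        (2, N'Beta'),
        (3, N'Gamma')
     ) AS v (Id, Name)
WHERE v.Id > 1;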


Sparse Columns in SQL Server

Chad Callihan occasionally inserts something:

Have you ever maxed out the SQL Server table column limit yet still needed more columns? Hopefully not considering SQL Server has a max limit of 1024 columns per table. But as I found out, it’s possible for someone to reach out and ask for even more. Sparse columns are an option to consider when you can’t get enough. Let’s take a look at what sparse columns are and how they can be used.

Sparse columns have very little utility, except in the most “I don’t think you’re doing it right” scenarios. Still, if you happen to end up in that scenario, there is a way out, though I’d really want to understand the nature of the data in that problem. Knowing just the amount of detail in the scenario that I do, I would lean toward storing the data either in an unpivoted fashion (one row per entity * attribute combination in an EAV-style “additional attributes” table) or as a JSON string, letting the client sort it out.
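For reference, declaring a column as sparse is a one-keyword change. A minimal sketch with an invented table follows; NULLs in a sparse column take no storage, in exchange for a few extra bytes on each non-NULL value:

CREATE TABLE dbo.WidgetAttributes
(
    WidgetID      int          NOT NULL PRIMARY KEY,
    -- SPARSE: optimized for columns that are overwhelmingly NULL
    RareAttribute varchar(100) SPARSE NULL
);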


A Thought on Query Granularity

Chris Johnson shares some thoughts:

This query is pretty simple: we want to return a set of data about the orders placed, including the total cost of each order. However, the orders in the database are split into an Orders table that holds the order-level information and an OrderLines table that holds information on each product ordered in each order, and we need that OrderLines table to get the cost.

So we write our query, and we join to the OrderLines table, and we sum the cost. But then, because we have an aggregate, we need to group by every other column we’re returning. And to me that just doesn’t look right, and it doesn’t convey the intention of the query properly.

In Chris’s simple example, I’m not sure I’d push it very much, but Chris does have a good point in terms of explaining query intent. Also, depending on how many order lines there are relative to orders (the next step in the chain for that query), aggregation in a common table expression could be faster than waiting until after the join to aggregate on all of the columns. In reality, that’s the most likely reason I’d make this change, assuming that it made a big enough performance difference. But if you take a much more complicated query of this sort, then I’d be more amenable to the argument.
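A sketch of that pre-aggregation shape, with hypothetical Orders and OrderLines tables: sum the lines first in a CTE, then join one-to-one, and the outer query needs no GROUP BY at all.

WITH LineTotals AS (
    SELECT OrderID, SUM(Cost) AS TotalCost
    FROM dbo.OrderLines
    GROUP BY OrderID
)
SELECT o.OrderID, o.OrderDate, o.CustomerID, lt.TotalCost
FROM dbo.Orders o
JOIN LineTotals lt
  ON lt.OrderID = o.OrderID;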


Finding the Max (or Min) Value across Columns

Greg Dodd tries out some new syntax:

Have you ever had 2 columns in a table that you need to select the maximum value from? Not the maximum value from each column, but the biggest value after comparing the columns?

Usually where I have this problem, I’d use a CASE statement, but there’s another way. 

I do like GREATEST() and LEAST() (think MAX() and MIN() evaluated across the columns of a single row rather than down a column), though Greg does include how you can get the same functionality in versions prior to SQL Server 2022.
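A quick sketch of both forms, with invented column names (GREATEST() and LEAST() arrived in SQL Server 2022; CASE covers earlier versions):

-- SQL Server 2022 and later
SELECT GREATEST(StartDate, EndDate) AS LatestDate,
       LEAST(StartDate, EndDate)   AS EarliestDate
FROM dbo.Projects;

-- Pre-2022 equivalent for the larger of two columns
SELECT CASE WHEN StartDate >= EndDate THEN StartDate ELSE EndDate END AS LatestDate
FROM dbo.Projects;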
