Syntax – Page 32 – Curated SQL

Generating Random Data in Snowflake

Published 2023-06-28 by Kevin Feasel

Kevin Wilkie generates some random data:

One of the many things that the business team asks me to do is to create random-ish data. Thankfully, in Snowflake, there are many ways to make this happen. Today, I want to go thru just a few of them.

Perhaps the one that most people are familiar with is making Snowflake create a random number.

Click through for initial coverage of the RANDOM() function, as well as how you can generate data across a uniform distribution over a given range.

Comments closed

Indexing Multiple Columns in Oracle with DBMS_SEARCH

Published 2023-06-27 by Kevin Feasel

Brendan Tierney rounds up the usual suspects:

This type of index is a little different to your traditional index. With DBMS_SEARCH we can create an index across multiple schema objects using just a single index. This gives us greater indexing capabilities for scenarios where we need to search data across multiple objects. You can create a ubiquitous search index on multiple columns of a table or multiple columns from different tables in a given schema. All done using one index, rather than having to use multiples. Because of this wider search capability, you will see this (DBMS_SEARCH) being referred to as a Ubiquitous Search Index. A ubiquitous search index is a JSON search index and can be used for full-text and range-based searches.

This is an interesting approach to the problem, though as I think about it, it makes me wonder, if you’re constantly searching in A+B+C+D, is that really four separate attributes or has something gone wrong in the design? It’s early enough in the morning for me that I’m willing to accede to there being use cases in a well-designed database.

Comments closed

Repetition in R with rep()

Published 2023-06-26 by Kevin Feasel

Steven Sanderson shows off a function in R:

As a programmer, you’re constantly faced with the need to repeat tasks efficiently. Repetition is a fundamental concept in programming, and R provides a powerful tool to accomplish this: the rep() function. In this blog post, we will explore the syntax of the rep() function and delve into several examples to showcase its versatility and practical applications. Whether you’re working with data manipulation, generating sequences, or creating repeated patterns, rep() will become your go-to function for mastering repetition in R.

Read on for the basic syntax, as well as several examples of how to use the rep() function.

Comments closed

Finding the Actual Error Line in sp_executesql

Published 2023-06-23 by Kevin Feasel

Thom Andrews solves a problem:

Notice that the error line states line 2 not line 8, which is the line the sys.sp_executesql was called on. Knowing the line the error occured on within the dynamic batch is certainly important, but if you’re working with multiple dynamic batches you have no way of knowing which dynamic batch might have produced the error; was it the one executed on line 100? Line 200? Line 350?

Solving this problem wasn’t wasn’t exactly simple, and it came with a couple of caveats.

Click through for the approach. As Thom mentions, it isn’t perfect, but it is reasonable and interesting.

Comments closed

Calculating Time Series Differences

Published 2023-06-19 by Kevin Feasel

Steven Sanderson notices the difference:

The diff() function in R calculates the differences between consecutive elements in a vector or a time series. It takes a single argument, which is the input vector, and returns a new vector with the differences. This function is particularly useful for analyzing the rate of change, identifying patterns, and detecting anomalies in your data.

Read on to see how you can use it, as well as some examples of usage.

Comments closed

The Value of QUOTENAME

Published 2023-05-31 by Kevin Feasel

Quoth Chad Callihan, “Occasionally more”:

QUOTENAME can be used to make sure database objects are valid in your query. Most of the time, objects like table names only contain valid characters, so there’s nothing to worry about. But nobody’s perfect. Let’s look at an example of what can happen when somebody creates a table with a forward slash in the name and see how QUOTENAME can be used to query against it.

QUOTENAME is also a good way of preventing SQL injection, though I still prefer appropriate use of exec sp_executesql in any case in which it’s possible to use.

Comments closed

MATCHBY for DAX Window Functions

Published 2023-05-26 by Kevin Feasel

Jeffrey Wang plays matchmaker:

In this final installment of our four-part mini-series on DAX window functions, we’ll be focusing on a new development. The May release of Power BI Desktop has enriched all DAX window functions – namely OFFSET, WINDOW, INDEX, RA N K, and ROWNUMBER – with an additional sub-function, MATCHBY, supplementing the existing sub-functions, ORDERBY and PARTITIONBY. In this article, we’ll delve into the purpose of the MATCHBY function, along with the three specific problems it aims to resolve.

Read on to understand what the MATCHBY sub-function does and why it can be important.

Comments closed

Recursive Common Table Expressions in Snowflake

Published 2023-05-18 by Kevin Feasel

Kevin Wilkie is too fancy for simple joins:

Today, I want to talk about that fun edge case when you’re having to join a table to itself in Snowflake. Does it happen often? Not unless your architect just hates you.

Let’s use the normal pieces of data that everyone uses for this kind of thing – employee/manager relationships. We have our employee table that we’ve been working off that we’ll play with for this example.

The syntax is a bit different from T-SQL, but the concept is still the same.

Comments closed

Trying NTILE

Published 2023-05-17 by Kevin Feasel

Chad Callihan looks at the fourth ranking window function:

Have you ever used the NTILE function? Or have you even heard of the NTILE function? It seems to be one of the lesser known, lesser used window functions in SQL Server. I’ve never come across it in the wild but maybe there are those that use it all the time. Either way, let’s have a look at what it does and how it can be used.

Click through for a demo. I definitely use it a lot less than ROW_NUMBER(), RANK(), and DENSE_RANK(), but I have used it to some good effect in the past, mostly in cases where I’ve wanted to focus on the top X% of data for an analysis.

Comments closed

NTILE and QUALIFY in Snowflake

Published 2023-05-10 by Kevin Feasel

Kevin Wilkie continues to build a sample:

In our last post, we went over one way to get a sample of data. In the end was it right? Heck, no – at least not if we wanted a percentage of rows returned. Now, SAMPLE does work fine if you want a specific number of random rows returned each time.

But, let’s face it – sometimes we will be asked for 10% of a table – especially in this world of Machine Learning. So let’s attempt to find a way to make this happen, shall we?

QUALIFY would be a nice clause to have. Instead, if you’re in the T-SQL world, you’re probably used to creating a common table expression or subquery and then applying a WHERE clause to the outputs of that CTE/subquery.

Comments closed

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31

Category: Syntax