Syntax – Page 38 – Curated SQL

Deleting Data from SQL Server

Published 2022-11-08 by Kevin Feasel

Greg Larsen fills us in on an important command:

Over time data in SQL Server tables needs to be modified. There are two major different aspects of modifying data: updating and deleting. In my last article “Updating SQL Server Data” I discussed using the UPDATE statement to change data in existing rows of a SQL Server table. In this article I will be demonstrating how to use the DELETE statement to remove rows from a SQL Server Table.

This stays pretty simple but provides an effective overview of how to keep those tables tidy.

Comments closed

LATERAL and APPLY

Published 2022-11-04 by Kevin Feasel

Lukas Eder shows off one of my favorite operators:

The SQL:1999 standard specifies the <lateral derived table>, which is SQL’s way of allowing for a derived table (a subquery in the FROM clause) to access all the lexically preceding objects in the FROM clause. It’s a bit weird in terms of syntax, I personally think that Microsoft SQL Server has a much nicer solution for this concept via APPLY. Oracle supports both syntaxes (standard and T-SQL’s). Db2, Firebird, MySQL, PostgreSQL only have LATERAL.

Click through to see how the operator works.

Comments closed

Using the T-SQL OUTPUT Clause

Published 2022-11-02 by Kevin Feasel

Chad Callihan doesn’t make two calls:

Are you familiar with the OUTPUT clause in SQL Server? It can be used with commands like INSERT, UPDATE, DELETE, and MERGE. It can also be used with setting variables in stored procedures. Using the tried and true StackOverflow2013, we’ll narrow it down today to focus on how INSERT/DELETE are typically used for logging table changes as well as an example of how to use OUTPUT with stored procedures.

For really busy transactional systems, this provides a nice boost over making an update and then selecting the new values.

Comments closed

Approximate Percentiles in SQL DB and SQL MI

Published 2022-10-27 by Kevin Feasel

Balmukund Lakhani has an announcement:

Approximate query processing was introduced to enable operations across large data sets where responsiveness is more critical than absolute precision. Approximate operations can be used effectively for scenarios such as KPI and telemetry dashboards, data science exploration, anomaly detection, and big data analysis and visualization. Approximate query processing family has enabled a new market of big data HTAP customer scenarios, including fast-performing dashboard and data science exploration requirements.
Today, we are announcing preview of native implementation of APPROX_PERCENTILE in Azure SQL Database and Azure SQL Managed Instance. This function will calculate the approximated value at a provided percentile from a distribution of numeric values.

This is way faster than using the PERCENTILE_CONT() or PERCENTILE_DISC() window functions. For a decent-sized query, I was getting anywhere from 5-20x performance improvements, and the larger the dataset, the bigger the gains. It is important to note that the approximate percentiles are not window functions, so you don’t get one row back per row of input.

Comments closed

Generating a Series of Dates in SQL Server 2022

Published 2022-10-26 by Kevin Feasel

Hasan Savran won’t settle for just one date:

GENERATE_SERIES function is introduced in SQL Server 2022 version. It is a simple function that generates a series of numbers within a given interval. It requires compatibility level 160.

The natural use for it is to generate a set of numbers, but as Hasan shows, you can easily turn that into a set of dates or dates and times.

Comments closed

Contrasting INSERT INTO and SELECT INTO

Published 2022-10-26 by Kevin Feasel

Chad Callihan embraces the power of AND:

Data can be inserted into one temp table from another a couple of ways. There is the INSERT INTO option and the SELECT INTO option.
Are you devoted to one option over the other? Maybe you’re used to one and never experimented with the other. Let’s test each and compare performance to find out which is more efficient.

Both of these are useful, though Chad does mention a performance improvement with SELECT INTO. I tend to prefer INSERT INTO for “structured” scenarios because it lets me define the shape of the output table. When I don’t care what the shape is—for example, when I just need some data one time to perform an analysis—then I prefer SELECT INTO for its simplicity.

Comments closed

Getting Row Counts for Different DBMS Platforms

Published 2022-10-25 by Kevin Feasel

Brendan Tierney wants rowcounts:

A little warning before using these queries. They may or may not give the true accurate number of records in the tables. These examples illustrate extracting the number of records from the data dictionaries of the databases. This is dependent on background processes being run to gather this information. These background processes run from time to time, anything from a few minutes to many tens of minutes. So, these results are good indication of the number of records in each table.

Click through for examples in Oracle, MySQL, Postgres, SQL Server, and Snowflake. Though the SQL Server one does need a GROUP BY clause because it’s a sum of the partitions’ rows.

Comments closed

Rewriting Tricky Functions in SQL Server

Published 2022-10-19 by Kevin Feasel

Erik Darling fights dragons:

Far and away, some of the trickiest situations I run into when helping clients is rewriting scalar functions that have WHILE loops in them.
This sort of procedural code is often difficult, but not impossible, to replace with set-based logic.

Erik improves a function in this post, though often, the best way to improve a function is not to play the game at all.

Comments closed

Concatenation in Spark SQL

Published 2022-10-12 by Kevin Feasel

Patrick LeBlanc learns about string concatenation in Spark SQL:

Using PySpark in Azure Synapse Analytics and found that my regular way of concatenating two strings in T-SQL didn’t work. I figured out what you need to do to make this work.

Click through for a short video.

Comments closed

Generating Code to Run Across All Databases via Dynamic SQL

Published 2022-10-12 by Kevin Feasel

Aaron Bertrand provides a warning around dynamic SQL:

For October’s T-SQL Tuesday, Steve Jones asks us to talk about ways we’ve used dynamic SQL to solve problems. Dynamic SQL gets a bad rap simply because, like a lot of tools, it can be abused. It doesn’t help that a lot of code samples out there show that “good enough” doesn’t meet the bar most of us have, especially in terms of security.
In a series I started last year, I talked about ways to do <X> to every <Y> inside a database, focusing on the example of refreshing every view (in a single database or across all databases). I already touched on what I want to dig into today: that it can be dangerous to try to parameterize things that can’t be parameterized in the ways people typically try.

Read the whole thing. I do find it funny how often people aren’t allowed to install well-known, third-party stored procedures (like Aaron’s sp_ineachdb) but it’s perfectly okay to write terrible code which is vulnerable to exploit because it was written in-house and is therefore more trustworthy somehow.

I don’t want to dunk on security teams too much in this regard, as I understand and really do appreciate the principle, though it often has counterintuitive first- and second-order consequences.

Comments closed

Category: Syntax