Syntax – Page 30 – Curated SQL

Contrasting INSERT INTO and SELECT INTO

Published 2022-10-26 by Kevin Feasel

Chad Callihan embraces the power of AND:

Data can be inserted into one temp table from another a couple of ways. There is the INSERT INTO option and the SELECT INTO option.
Are you devoted to one option over the other? Maybe you’re used to one and never experimented with the other. Let’s test each and compare performance to find out which is more efficient.

Both of these are useful, though Chad does mention a performance improvement with SELECT INTO. I tend to prefer INSERT INTO for “structured” scenarios because it lets me define the shape of the output table. When I don’t care what the shape is—for example, when I just need some data one time to perform an analysis—then I prefer SELECT INTO for its simplicity.

Comments closed

Getting Row Counts for Different DBMS Platforms

Published 2022-10-25 by Kevin Feasel

Brendan Tierney wants rowcounts:

A little warning before using these queries. They may or may not give the true accurate number of records in the tables. These examples illustrate extracting the number of records from the data dictionaries of the databases. This is dependent on background processes being run to gather this information. These background processes run from time to time, anything from a few minutes to many tens of minutes. So, these results are good indication of the number of records in each table.

Click through for examples in Oracle, MySQL, Postgres, SQL Server, and Snowflake. Though the SQL Server one does need a GROUP BY clause because it’s a sum of the partitions’ rows.

Comments closed

Rewriting Tricky Functions in SQL Server

Published 2022-10-19 by Kevin Feasel

Erik Darling fights dragons:

Far and away, some of the trickiest situations I run into when helping clients is rewriting scalar functions that have WHILE loops in them.
This sort of procedural code is often difficult, but not impossible, to replace with set-based logic.

Erik improves a function in this post, though often, the best way to improve a function is not to play the game at all.

Comments closed

Concatenation in Spark SQL

Published 2022-10-12 by Kevin Feasel

Patrick LeBlanc learns about string concatenation in Spark SQL:

Using PySpark in Azure Synapse Analytics and found that my regular way of concatenating two strings in T-SQL didn’t work. I figured out what you need to do to make this work.

Click through for a short video.

Comments closed

Generating Code to Run Across All Databases via Dynamic SQL

Published 2022-10-12 by Kevin Feasel

Aaron Bertrand provides a warning around dynamic SQL:

For October’s T-SQL Tuesday, Steve Jones asks us to talk about ways we’ve used dynamic SQL to solve problems. Dynamic SQL gets a bad rap simply because, like a lot of tools, it can be abused. It doesn’t help that a lot of code samples out there show that “good enough” doesn’t meet the bar most of us have, especially in terms of security.
In a series I started last year, I talked about ways to do <X> to every <Y> inside a database, focusing on the example of refreshing every view (in a single database or across all databases). I already touched on what I want to dig into today: that it can be dangerous to try to parameterize things that can’t be parameterized in the ways people typically try.

Read the whole thing. I do find it funny how often people aren’t allowed to install well-known, third-party stored procedures (like Aaron’s sp_ineachdb) but it’s perfectly okay to write terrible code which is vulnerable to exploit because it was written in-house and is therefore more trustworthy somehow.

I don’t want to dunk on security teams too much in this regard, as I understand and really do appreciate the principle, though it often has counterintuitive first- and second-order consequences.

Comments closed

Merge Joins in SQL Server

Published 2022-10-12 by Kevin Feasel

Jared Poche continues a series on join types:

Merge joins traverse both inputs once, advancing a row at a time and comparing the values from each input. Since they are in the same order, this is very efficient. We don’t have to pay the cost to create a hash table, and we don’t have the much larger number of index seeks nested loops would encounter.

Read the whole thing. Remember: merge joins are also the best strategy for when two lanes of the road come together.

Comments closed

Finding the Nth Row in a Dataset

Published 2022-10-12 by Kevin Feasel

Reitse Eskens shows off a good use case for window functions:

Last week, one of my coworkers had a nice case. From a dataset, he needed the third row based on an indicator and a timestamp. That sounded a bit vague so I started out with creating a table that contained some records with the columns he’d have to filter on.

This turns out to be an ideal use of the ROW_NUMBER() function.

Comments closed

Inserting Data into MySQL

Published 2022-10-11 by Kevin Feasel

Robert Sheldon takes us through data insertion in MySQL:

In most cases, adding a single row of data to table is a reasonably straightforward process. You define the INSERT clause and VALUES clause and usually specify the column list in between. The column list should include only those columns for which you provide values. The list can include the primary key column, generated columns, or columns defined with default constraints, but you must be careful how you handle them, as you’ll see later in the article.

This is definitely aimed at people new to MySQL and SQL in general.

Comments closed

Strategies for De-Functionizing Your T-SQL

Published 2022-10-03 by Kevin Feasel

Tom Zika continues the 12-step program for scalar function abusers:

In the previous posts, we have learned why Scalar Functions (UDFs) are bad for parallelism and performance and what the options are for their removal.
The only remaining question is where to start. You’ve probably guessed there isn’t one true way to approach this. So instead, I offer several strategies that you can combine.

Plus Tom has an offer for you at the bottom of the post: conversion from Scalar UDFs to Inline Table-Valued Functions, which tend to perform much better.

Comments closed

MERGE in Dedicated SQL Pools

Published 2022-09-28 by Kevin Feasel

Emily Tehrani notes an addition to Azure Synapse Analytics:

We are thrilled to announce that the MERGE T-SQL command for Azure Synapse Dedicated SQL pools is now Generally Available! MERGE has been a highly requested addition to the Synapse T-SQL library that encapsulates INSERTs/UPDATEs/DELETEs into a single statement, drastically easing migrations and making table synchronization a breeze.

If you do decide to use this, I’d expect it to have the same bugs which make its use on-premises a mess. As always, MERGE responsibly.

Comments closed

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31

Category: Syntax