Press "Enter" to skip to content

Category: T-SQL

Error Handling with OPENROWSET

Deborah Melkin handles missing servers with aplomb:

OPENROWSET is a functionality that allows you to access data sources outside your current server. This could be reading from an Excel file or calling another SQL Server instance. You’re able to treat that other data source as record set, or derived table, and work with the rows returned as you would a local table. One reason you may want to do this is that you need to use a stored procedures to query data from other servers and bring the data together, effectively creating an ELT (Extract – Load – Transform) process without having to use SSIS or Azure Data Factory (ADF).

Read on to see how OPENROWSET() works, what happens if you try to access a remote server which doesn’t exist (or times out), and how you can capture that error message in a CATCH block—something that is not possible to do by default.

Comments closed

Antipattern Queries Extended Event in SQL Server

Bob Ward enumerates some anti-patterns SQL Server can guilt you over:

If a query uses certain antipatterns, it will be detected during query optimization.  For both SQL Server and Azure SQL (internally on by default), if these antipatterns are detected when optimizing the query, and the query_antipattern event has been added as part of a running extended event session, the output will be captured.  The output will contain the relevant capture fields configured for the extended event session, allowing one to quickly identify which queries contain these antipatterns and are, therefore, prime candidates for tuning.

Read on for more information about this extended event, which is new to SQL Server 2022. I haven’t used this yet, so the two caveats I’m about to give are speculative in nature…though when has that ever stopped me? Caveat the first: just because something shows up as an anti-pattern doesn’t mean it needs to be fixed. There can be good reasons why you have chosen what is normally a less-efficient path. Caveat the second: just because something doesn’t show up as an anti-pattern doesn’t mean it’s fine. These are likely directional and my guess is that SQL Server will be fairly conservative in its estimation of what constitutes an anti-pattern so that you don’t get a lot of false positives.

Comments closed

Sessions and Execution of Dynamic SQL

Deborah Melkin riddles us this on dynamic SQL:

I admit it – I do waaayyyy too much with dynamic SQL. But I just keep running into situations that require it. That being said, I ran into an interesting problem that had me puzzled. I found a bunch of different blog posts that pointed to me to the right direction but required a little extra work to find the solution.

There are several concepts that are at play here, so I’ll try to break this out so we can put the pieces together. The first once is centered around dynamic SQL. There are two parts of this I want to make sure we understand first – how it fits into sessions and how it gets executed.

Read the whole thing.

Comments closed

Data Types and CONCAT_WS

Koen Verbeeck hits a concatenation issue:

I was writing some dynamic SQL that generates some SQL statements to load my facts and dimensions into a data warehouse. Some of those SQL statements can become very long, for example if a dimension has a lot of columns. When debugging, I noticed a couple of statements failing with various errors. Turns out, they were truncated after 4000 characters. What was going on?

Read on to see what happened.

Comments closed

External Tables and the Serverless SQL Pool

Ryan Adams continues a series on querying the serverless SQL pool in Azure Synapse Analytics:

There are two ways to read data inside Data Lake using the Synapse Serverless engine.  In this article, we’ll look at the second method which uses an external table to query a path within the lake.

Synapse is a collection of tools with four different analytical engines (Dedicated PoolSpark PoolServerless PoolData Explorer Pool).  This gives you a lot of options for ingesting, transforming, storing, and querying your data.  Here you will use the Synapse Serverless Pool to query the data in your ADLS account.   

Read on for a demonstration.

Comments closed

Reading the Data Lake with the Serverless Pool via OPENROWSET

Ryan Adams begins a series on reading data from the data lake:

There are two ways to read data inside Data Lake using the Synapse Serverless engine.  In this article, we’ll look at the first method which uses OPENROWSET to query a path within the lake. 

Synapse is a collection of tools with four different analytical engines (Dedicated PoolSpark PoolServerless PoolData Explorer Pool).  This gives you a lot of options for ingesting, transforming, storing, and querying your data.  The article will focus on how you can use the Synapse Serverless Pool to query the data in your ADLS account.   

Click through for a primer on the topic, as well as a demo video.

Comments closed

Performance Comparing DISTINCT to GROUP BY

Reitse Eskens does a performance comparison:

A few days ago, I heard someone stating that Group By was much quicker than Distinct. Less disk impact, less memory etc.
So, I thought I’d find out if it’s true or not because I found it interesting. I always thought there was no difference. I tested a single small table and found no difference in speed, reads or execution plan. But that’s no real world example. Usually the tables contain a lot of data and are joined to other tables.

Click through for the results of Reitse’s analysis.

Comments closed

Time Series Features in SQL Server 2022

Kendal Van Dyke walks us through a few new bits of T-SQL in SQL Server 2022:

Time series data is often used for historical comparisons, anomaly detection and alerting, predictive analysis, and reporting, where time is a meaningful axis for viewing or analyzing data.

Time series capabilities in SQL Server were introduced in Azure SQL Edge, Microsoft’s version of SQL Server for the Internet of Things (IoT) which combines capabilities such as data streaming and time series with built-in machine learning and graph features.

I am happy to see that these operators and functions made the leap from Azure SQL Edge and am hopeful that we’ll see a bit more of what makes databases like influxdb so useful for time series make their way in as well.

Comments closed

Defending Less-than-Ideal Practices

Deborah Melkin has a confession and a defense:

I feel like this is where I should say something like, “Hi, my name is Deborah and I’ve used nolock in production.” I would also have to confess to doing things like using correlated sub queries, not using a foreign key, implemented complicated triggers, etc. I often talk about how the first real SQL script I wrote had cursors running over temp tables in SQL Server 6.5, which I’m fairly certain was one of the first thing I read you were NOT supposed to do. And oh, hello there, denomalized table and dynamic SQL! I’m sure I’ve done more things than this too. These are just the ones I can remember doing, or at least I’m willing to admit in public.

With some of these, the answer is “that’s the best alternative I had at the time.” With correlated sub-queries, I wouldn’t even consider that a bad thing. Granted, I personally prefer a combination of common table expressions and the APPLY operator but that’s usually not for performance reasons.

Comments closed

Finding Skipped Identity Values in a Table

Brent Ozar minds the gap:

When someone says, “Find all the rows that have been deleted,” it’s a lot easier when the table has an Id/Identity column. Let’s take the Stack Overflow Users table:

It has Ids -1, 1, 2, 3, 4, 5 … but no 6 or 7. (Or 0.) If someone asks you to find all the Ids that got deleted or skipped, how do we do it?

Click through for two methods, one specific to SQL Server 2022 and one which works for all versions of SQL Server.

Comments closed