Press "Enter" to skip to content

Category: T-SQL

Finding Rogue Line Feeds

Bill Fellows finds out whose line (feed) it was, anyway:

I ran into an issue today that I wanted to write about so that maybe I remember the solution. We ran into a case where the source data in a column had an unprintable character. In this case, it was a line feed character, which is ASCII value 10, and they had 7 instances in this one row. “How did that get in there? Surely that’s an edge case and we can just ignore it,” and dear reader, I’ve been around long enough to know that this is likely a systemic situation.

Click through for two ways of answering this against data already in SQL Server, as well as one additional route explained but sans demonstration.

Comments closed

Comparing Table Records with T-SQL

Chad Callihan compares and contrasts:

We recently looked at looked at comparing schemas using Azure Data Studio. What if we need to compare tables by using a query? For this post we’ll compare using EXCEPT, NOT IN, and NOT EXISTS to find differences between two tables.

Our two tables to compare will be Comic and Comic_Copy. Based on counts, we have 48 more records in Comic than we do in Comic_Copy. Let’s find the differences.

In Chad’s specific query, NOT EXISTS works great. Where I like EXCEPT is when you need to see if any of the non-key columns differ. For example, if you also needed to compare titles for rows with the same ID and ensure those titles matched.

Comments closed

The Value (and Cost) of DATETRUNC

Brent Ozar points out the ups and downs of DATETRUNC():

The first one, passing in a specific start & end date, gets the best plan, runs the most quickly, and does the least logical reads (4,299.) It’s a winner by every possible measure except ease of writing the query. When SQL Server is handed a specific start date, it can seek to that specific part of the index, and read only the rows that matched.

DATETRUNC and YEAR both produce much less efficient plans. They scan the entire index (19,918 pages), reading every single row in the table, and run the function against every row, burning more CPU.

SQL Server’s thought process is, and has always been, “I have no idea what’s the first date that would produce YEAR(2017). There’s just no way I could possibly guess that. I might as well read every date since the dawn of time.”

Read on for the upshot.

Comments closed

Bit Twiddling in T-SQL

Louis Davidson explains how bit operations work in T-SQL:

I expect that 99% of the people reading this looks at this probably would expect there to be a status table that contained the values of status. Seeing that this is a base 2 number, you may be in that 1% that thinks this might be a bitmask. but unless you have and eidetic memory, you probably don’t know what all of the bits mean.

A bitmask is a type of denormalization of values where instead of having a set of columns that have on or off values (no Null values), you encode it like:

Bitmasks make me break out the angry nun ruler. You can almost guarantee you’re doing something wrong if you design a bitmask as a column in a table.

Comments closed

GENERATE_SERIES and Data Types

Bill Fellows runs into an issue:

Perfect, now I have a row for each second from midnight to approximately 5.5 hours later. What if my duration need to vary because I’m going to compute these ranges for a number of different scenarios? I should make that 19565 into a variable and let’s overengineer this by making it a bigint.

Things don’t work out quite the way you might have expected there. Read on and see what Bill found and how you can circumvent the problem.

Comments closed

AT TIME ZONE Performance and an Alternative

Joe Obbish looks at time zones:

Queries that use the AT TIME ZONE operator may perform worse than desired. For example, Jonathan Kehayias blogged about performance struggles with AT TIME ZONE at sqlskills.com. One key difference between Jonathan’s opinion and my own is that I enjoy writing these kinds of posts.

Read on for the details and check out some work Joe has done around time zone friendly functions as an alternative to AT TIME ZONE.

Comments closed

Qualities of Production-Grade Code

Aaron Bertrand pulls out the list:

In a lot of programming languages, efficiency is almost always the guidepost. Sometimes, minimizing character count or line count is a “fool’s gold” measure of the quality of the code itself. Other times, unfortunately, engineers are judged not by quality at all, but rather the sheer number of lines of code produced – where more is, somehow, obviously better. Over my career, “how much code there is” has never been a very meaningful measure in any language.

But I’m here to talk about T-SQL, where certainly efficiency is a good thing to measure – though there are some caveats to that:

Read on for those caveats and what Aaron considers to be the hallmarks of high-quality code.

Comments closed

Deleting Data from SQL Server

Greg Larsen fills us in on an important command:

Over time data in SQL Server tables needs to be modified. There are two major different aspects of modifying data: updating and deleting. In my last article “Updating SQL Server Data” I discussed using the UPDATE statement to change data in existing rows of a SQL Server table. In this article I will be demonstrating how to use the DELETE statement to remove rows from a SQL Server Table.

This stays pretty simple but provides an effective overview of how to keep those tables tidy.

Comments closed