Press "Enter" to skip to content

Category: T-SQL

Custom Sorts

Rob Farley looks at ways of sorting data more efficiently:

Another option, which is more longwinded (some might suggest that would suit me – and if you thought that: Oi! Don’t be so rude!), and uses more reads, is to consider what we’d do in real life if we needed to do this.

If I had a pile of 73,595 orders, sorted by Salesperson order, and I needed to return them with a particular Salesperson first, I wouldn’t disregard the order they were in and simply sort them all, I’d start by diving in and finding the ones for Salesperson 7 – keeping them in the order they were in. Then I’d find the ones that weren’t the ones that weren’t Salesperson 7 – putting them next, and again keeping them in the order they were already in.

My first inclination is to think that this is a fragile solution—what about parameterization?  Will that affect the execution plan in unexpected ways?  I like the approach, however, and will have to add it to the toolbox for those cases in which it makes sense.

Comments closed

Association Rules

Tomaz Kastrun discusses product variants:

To sum up, association rules is a great and powerful algorithm for finding the correlations between items and the fact that you can use this straight from SSMS, it just gives me goosebumps. Currently just the performance is a bit of a drawback. Also comparing this algorithm to Analysis services (SSAS) association rules, there are many advantages on R side, because of maneuverability and extracting the data to T-SQL, but keep in mind, SSAS is still very awesome and powerful tool for statistical analysis and data predictions.

Figuring out variations after the fact is an all-too-common task, and this is a good way of getting some ideas on how to do that.

Comments closed

Handling Large Data Modifications

Jeff Mlakar shows how to insert, update, and delete large numbers of records with T-SQL:

Using T-SQL to insert, update, or delete large amounts of data from a table will results in some unexpected difficulties if you’ve never taken it to task.

Let’s say you have a table in which you want to delete millions of records. If the goal was to remove all then we could simply use TRUNCATE. However, if we want to remove records which meet some certain criteria then executing something like this will cause more trouble that it is worth.

I do like the delete process.  The update process is going to run tableRows/batchSize full scans, so I’m not as fond of that one.  Do read the whole thing.

Comments closed

Format Your Code

Grant Fritchey looks at some poorly-formatted code:

You are going to cause all sorts of problems if you write code like this. First off, if you really do have these three queries within the same stored procedure, how hard will it be to confuse which is table ‘a’ in each of the queries when you go back to edit them? Pretty easy.

It gets worse though. I know that none of us will ever write a query that exceeds 26 tables in a JOIN… well, except that one time… and that other time. In fact, it happens. Oh, it’s not always a good thing, but it’s a thing. How do we respond to that? I’ve seen this:

Some of this stuff is egregious.  Some of it is debatable.  But either way, well-formatted code helps prevent bugs and aid understanding of queries.

Comments closed

(Re-)Design For Today’s Needs

Andy Levy sees common problems when dealing with brownfield applications:

The primary system I deal with on a daily basis was originally developed as a DOS application and several of the above examples are drawn from it. Looking at the core tables and columns, it’s easy to identify those that began life in those early days – they all have 8-character names. Time moved on and the system grew and evolved. DOS to Windows. Windows to the web. But the database, and the practices and patterns used in the database, haven’t come along for the ride.

Data schema conversions can be hard and disruptive – you need to update your application, your stored procedures, and provide customers/users with a clean migration path. Code changes require testing. Complexity and cost grows every time you introduce changes. I get that.

There’s a lot of effort in Andy’s advice, but it’s well worth it.

Comments closed

T-SQL And R Performance Comparisons

Tomaz Kastrun does several performance comparisons between various R packages and T-SQL constructs:

Couple of packages I will mention for data manipulations are plyr, dplyr and data.table and compare the execution time, simplicity and ease of writing with general T-SQL code and RevoScaleR package. For this blog post I will use R packagedplyr and T-SQL with possibilites of RevoScaleR computation functions.

My initial query will be. Available in WideWorldImportersDW database. No other alterations have been done to underlying tables (fact.sale or dimension.city).

Read on for code and conclusions.  I don’t think there are any shocking conclusions:  the upshot is to filter data as early as possible.

Comments closed

ISNULL And COALESCE Behavior Difference

Vladimir Oselsky notes an edge case where ISNULL and COALESCE can behave differently:

Even though we would expect to see both records returned we only get 1 record. Huh? This is exactly what puzzled a coworker, ofcourse query was not as simple as this one but same issue caused him to hit a road block.

In the case of COALESCE and OR methods, results are identical.

The underlying issue here is that the variable data type differs from the column’s data type, and exposes a difference in how COALESCE and ISNULL work.

Comments closed

Rounding

Kenneth Fisher looks at decimal truncation methods:

ROUND

This is the most complicated of the three. It does a standard rounding. If value is .5 or over then you get back 1. If it’s less than .5 you get back 0. On top of that you get to pass the place you want to round to. So for example 0 rounds to the nearest ones place, -1 rounds to the tens place, 2 rounds to the hundredths.

There’s a bonus here:  if you use ROUND in T-SQL, the results are different than the default Round method in .NET; that method uses banker’s rounding by default whereas the T-SQL rounding does not.

Comments closed

Returning Defult Rows

Christopher Huntley wants to return a default record when there are no results:

Or if you’re ready to take it to the ╰[ ⁰﹏⁰ ]╯level then change the column to NChar and use the hex of your favorite emoji like:

DECLARE @testtable1 TABLE (
testid int identity (1,1),
testvalue  nchar (255))

–use the below for the final query

SELECT
ISNULL((SELECT testvalue from @testtable1 where testvalue > 101),  NCHAR(0xD83D)+ NCHAR(0xDE20) ) as testvaluethatworks

There are a few other alternatives, such as loading results into a temp table and inserting a default row if the temp table is empty.

Comments closed