Press "Enter" to skip to content

Category: T-SQL

Bucketing Tables By Size

Bill Fellows has an interesting approach to bucketing tables into groups of similar size:

You need to do something to all of the tables in SQL Server. That something can be anything: reindex/reorg, export the data, perform some other maintenance—it really doesn’t matter. What does matter is that you’d like to get it done sooner rather than later. If time is no consideration, then you’d likely just do one table at a time until you’ve done them all. Sometimes, a maximum degree of parallelization of one is less than ideal. You’re paying for more than one processor core, you might as well use it. The devil in splitting a workload out can be ensuring the tasks are well balanced. When I’m staging data in SSIS, I often use a row count as an approximation for a time cost. It’s not perfect – a million row table 430 columns wide might actually take longer than the 250 million row key-value table.

Click through for the script.  For the R version, this Stack Overflow post shows how to do it with cumulative sums and the cut function.

Comments closed

Searching For Valid Characters For T-SQL Regular Identifiers

Solomon Rutzky is on a mission to find a national treasure:

Quite often these types of things are not that easy. Yes, it is very tempting to assume that the limited test is good enough and that we did indeed find the exact list of characters (plus we would need to add in the four extra characters: at sign (@), dollar sign ($), number sign (#), and underscore (_)). However, based on my experiences, it seems that more often than not, doing an exhaustive test results in a slightly different answer that invalidates the previous conclusion (which was based on the limited test). So, while it does take more time to do more extensive testing, it seems like we have little choice if we truly want to know how these things actually work.

What that means is, at the very least, we need to get the complete list of characters accepted by SQL Server for non-delimited identifiers to make sure that the totals match the number of code points returned by the searches done in Step 1.

This post is an interesting dive into the oddities of Unicode, but leaves us on a cliffhanger.  Also, full-crazy Nicolas Cage beats mullet-wearing Tom Hanks any day.

1 Comment

Will It Bit?

Louis Davidson wants to see what he can cast to a bit type:

There are no other textual/alpha string values that will cast to a bit value, but the numeric values that will cast to a bit are voluminous (even some that are in string format). Consider the following eight statements:

SELECT CAST(100 AS bit);
SELECT CAST(-100 AS bit);
SELECT CAST(99999999999999999999999999999999999999 AS bit);
SELECT CAST(-99999999999999999999999999999999999999 AS bit);
SELECT CAST(88.999999 AS bit);
SELECT CAST('1' AS bit);
SELECT CAST('2' AS bit);
SELECT CAST('999999' AS bit);

Danged if they didn’t all work, and all return 1.

Check out what else Louis tries to cast to a bit type.

Comments closed

Using CONCAT_WS

Dave Mason points out another nice addition to the T-SQL toolbelt in SQL Server 2017:

In the last post, I looked at a new T-SQL function for SQL Server 2017. Let’s continue down that path and look at CONCAT_WS(), which is also new for SQL Server 2017. Here’s the definition of the function from Microsoft Docs:

“Concatenates a variable number of arguments with a delimiter specified in the 1st argument. (CONCAT_WS indicates concatenate with separator.)”

Read on for an example using CONCAT_WS.  It’s one of those functions that I haven’t quite committed to memory, but every time I get reminded of it, I remember that I really need to remember it.

Comments closed

Comparing Distinctness

Michael J. Swart shows several options for comparing whether an attribute’s value is distinct from a parameter:

Check it:

DECLARE @TeamId bigint = NULL,
    @SubTeamId bigint = NULL;
 
SELECT TOP 1 TaskId
FROM tasks
WHERE assignedTeamId IS NOT DISTINCT FROM @TeamId
  AND assignedSubTeamId IS NOT DISTINCT FROM @SubTeamId

Talk about elegant! That’s what we wanted from the beginning. It’s part of ANSI’s SQL 1999 standard. Paul White tells us it’s implemented internally as part of the query processor, but it’s not part of T-SQL! There’s a connect item for it… err. Or whatever they’re calling it these days. Go read all the comments and then give it a vote. There are lots of examples of problems that this feature would solve.

PROS: Super-elegant!
CONS: Invalid syntax (vote to have it included).

This would be nice to have.  In the meantime, Michael shows several options which are currently valid syntax.

Comments closed

Using SQL Server’s PIVOT Operator

Dan Blank shows how to use the PIVOT operator in SQL Server:

I was recently approached by my firms Marketing Manager with a request for some information.  She wanted to know “Which departments have our top clients never done any work for?”.  For some clarity, I work in a law firm with 11 departments. The request seems pretty straightforward at first. Then once I got thinking about how the output of this report would be presented it made me reconsider quite how simple a request this was.  She handed me a drawing with her vision for the output.  What she wanted was :

  1. Client’s names down the left,

  2. List of Departments across the top,

  3. Ticks and crosses at the intersection to show whether they had or had not done work for them.

You can make a good argument that presentation mechanics like this are meant for a different tool (a presentation layer) but it’s useful to know how to pivot and unpivot data within T-SQL for more reasons than just presentation.

Comments closed

The Secret Power Of TRIM

Dave Mason points out something quite useful about TRIM in SQL Server 2017:

Now I can tidy things up and remove both leading and trailing spaces with a single call to TRIM():

SELECT TRIM(b.foo)
FROM dbo.bar b;

On the surface, this may not seem like that big of a deal. And I would tend to agree. But, in this example, it saves a few key strokes and makes the code slightly more readable. And it is nice for T-SQL to finally have a function that has been around in other languages for far longer than I’ve been writing code for a living.

But Wait, There’s More!

Click through for that more.  This makes TRIM a lot more useful, so go check it out.

Comments closed

ORIGINAL_DB_NAME()

Kenneth Fisher explains a couple of database name functions in SQL Server:

I’d never seen ORIGINAL_DB_NAME until recently and I thought it would be interesting to highlight it out, and in particular the difference between it and DB_NAME. I use DB_NAME and DB_ID fairly frequently in support queries (for example what database context is a query running from or what database are given DB files from). So starting with DB_NAME.

Click through to know when to use each.

Comments closed

Using STRING_AGG In SQL Server 2017

Derik Hammer talks about one of the nicer T-SQL additions in SQL Server 2017:

Creating comma separated strings from a column, or delimited strings as I like to call it, is a very common problem in SQL. Beginning with SQL Server 2017 and Azure SQL Database, there is now another option to the existing set of solutions, STRING_AGG().

I would like to convince you to use STRING_AGG over the other methods. So, let us begin with the competing solutions.

I completely agree and have been switching code over to use STRING_AGG since upgrading to 2017.  The code is so much clearer as a result compared to STUFF + FOR XML PATH concatenation.

Comments closed

Getting A Random Row

Brent Ozar shares four methods for getting a random row from a table:

Method 1, Bad: ORDER BY NEWID()

Easy to write, but it performs like hot, hot garbage because it scans the entire clustered index, calculating NEWID() on every row:

That took 6 seconds on my machine, going parallel across multiple threads, using tens of seconds of CPU for all that computing and sorting. (And the Users table isn’t even 1GB.)

Click through for the other three methods.  The really tricky part is when you want to get a random sample from the table, as TABLESAMPLE is an awful choice for that.

Comments closed