Category: T-SQL

Preventing SQL Injection in Stored Procedures

Vlad Drumea fixes a procedure:

In the past few years, I’ve seen quite a few stored procedures that rely on dynamic T-SQL without properly guarding against SQL injection.

Some cases were reporting stored procedures, while others were maintenance type stored procedures (e.g. stats updates) that could be kicked off from the app, or even stored procedures that handled app upgrades/patching.

In all these cases, certain portions of the dynamic T-SQL relied on input provided by users via input parameters.

Read on for an example. The solution is still the classic combination of QUOTENAME() and sp_executesql whenever you have user input.
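The two halves of that advice can be sketched outside of T-SQL as well. Below is a minimal illustration in Python with the standard-library sqlite3 module, not Vlad's code: the hypothetical `quote_identifier` helper stands in for QUOTENAME(), and the `?` bind parameter stands in for a parameter passed through sp_executesql. The `reports` table and its columns are made up for the example.

```python
import sqlite3


def quote_identifier(name: str) -> str:
    """Wrap an identifier in double quotes, doubling any embedded quotes.

    Hypothetical stand-in for QUOTENAME(): the name can only ever be read
    as a single identifier, never as additional SQL.
    """
    return '"' + name.replace('"', '""') + '"'


def count_matching(conn, table_name, owner):
    # An identifier cannot be a bind parameter, so it is quoted instead.
    sql = f"SELECT COUNT(*) FROM {quote_identifier(table_name)} WHERE owner = ?"
    # The value travels as a parameter and is never spliced into the SQL text.
    return conn.execute(sql, (owner,)).fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reports (id INTEGER, owner TEXT)")
conn.executemany("INSERT INTO reports VALUES (?, ?)", [(1, "ann"), (2, "bob")])

safe_count = count_matching(conn, "reports", "ann")
# A classic injection string is compared as a literal value and matches nothing.
injected_count = count_matching(conn, "reports", "ann' OR '1'='1")
```

The split matters: values go through parameters, and only the parts that cannot be parameterized (object names) get quoted.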

Splitting to Table via STRING_SPLIT() and REGEXP_SPLIT_TO_TABLE()

Greg Low might have violated Betteridge’s Law of Headlines:

Using T-SQL it’s quite easy to build a table-valued function that can step through a string, character-by-character, and (based on a delimiter) output the delimited strings. The problem is that the performance of these functions is appalling.

When XML support appeared in SQL Server 2005, there were many attempts to create more efficient string-splitting functions. For many strings, these work quite well, but do have a few oddities that you need to cope with. Plus, most have limitations on the strings that you can split.

Ultimately, what was really needed was an efficient and native built-in function.  

Greg points out two mechanisms and contrasts them.

Implementing SOFTMAX in SQL Server

Sebastiao Pereira is back with another formula:

The SOFTMAX function takes raw scores and converts them into a probability distribution. This mathematical function is used in neural network training, multiclass classification methods, multinomial logistic regression, multiclass linear discriminant analysis, and naïve Bayes classifiers. How can this function be built in SQL Server?

Click through for the implementation.
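For reference, the calculation itself is short. Here is a minimal sketch in Python rather than T-SQL, and not Sebastiao's implementation: each score is exponentiated and divided by the sum of all the exponentials. Subtracting the maximum score first is a common numerical-stability step that I am assuming here; it does not change the result.

```python
import math


def softmax(scores):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Shifting by the maximum keeps exp() from overflowing on large scores;
    # the shift cancels out in the ratio below.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probabilities = softmax([2.0, 1.0, 0.1])
```

Higher scores get a larger share of the probability mass, and equal scores get equal shares.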

Optional SUBSTRING() Length in SQL Server 2025

Louis Davidson points out a neat update:

Sometimes along comes a feature that seems so obvious, so natural, that you wonder why it took so long for Microsoft to implement it. One of those features in SQL Server 2025 is the optional length parameter in the SUBSTRING function. For a long time, when you wrote a SUBSTRING expression to go from the Nth character to the end of the string, the nagging question was how many characters to ask for. And for the most part, it didn’t really matter.

But sometimes it did (especially when dealing with nvarchar(max) data).

I learned about this when putting together an update to my Teaching Old Dogs New Tricks presentation. This capability is pretty nifty and something I wish I had a while ago.

TOP(1) with Ties

Andy Brownsword can’t stop at one:

Having TOP (1) return multiple rows feels wrong… but that’s what WITH TIES can do.

For a long time I used patterns like this to get the first record in a group:

Andy goes on to explain how WITH TIES works in T-SQL, shows an alternative to using a common table expression + window function to narrow down to the first logical group, and digs into when you might not want to use that alternative.

ANY_VALUE() in Fabric Data Warehouse

Jovan Popovic notes a feature going GA:

Fabric Data Warehouse now supports the ANY_VALUE() aggregate, making it easier to write readable, efficient T-SQL when you want to group by a key but still return descriptive columns that are functionally the same for every row in the group.

Right now, this is only available in the Fabric Data Warehouse, so no Azure SQL DB, Managed Instance, or box product support at this time.

Word Order and Constraint Naming

Andy Levy is looking for a name:

Ten years (and a couple jobs) ago, I wrote about naming default constraints to avoid having SQL Server name them for you. I closed with the following statement:

SQL Server needs a name for the constraint regardless; it’s worth specifying it yourself.

We’re back with a new wrinkle in the story.

Read on for an interesting scenario where Andy very clearly named a constraint, yet the name didn’t take.

Calculating Net Present Value and Internal Rate of Return in T-SQL

Sebastiao Pereira is back with more calculations:

Many organizations store cash-flow data inside SQL Server and decision-makers often need metrics like Net Present Value (NPV) and Internal Rate of Return (IRR) to evaluate those cash flows. Is it possible to calculate NPV and IRR values in SQL Server without the use of external tools?

These are quite easy to pull off in Excel and a bit more complex in T-SQL. Though with Net Present Value, in particular, I’m pretty sure I could rewrite it not to use the cursor.
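To make the two definitions concrete, here is a minimal sketch in Python rather than T-SQL, and not the approach from the article. NPV is a plain sum of discounted cash flows, which is why a set-based rewrite seems plausible. IRR is the rate at which NPV equals zero, so it needs an iterative search; bisection is my choice here. I am assuming the first cash flow happens at time 0 and is not discounted (Excel's NPV function treats the first value as one period out), and that the series changes sign only once.

```python
def npv(rate, cash_flows):
    """Net present value; cash_flows[0] occurs at time 0 (undiscounted)."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def irr(cash_flows, low=-0.99, high=10.0, tolerance=1e-9):
    """Rate at which NPV is zero, found by bisection.

    Assumes NPV changes sign exactly once between low and high, which holds
    for a conventional series (one outflow followed by inflows).
    """
    npv_low = npv(low, cash_flows)
    if npv_low * npv(high, cash_flows) > 0:
        raise ValueError("NPV does not change sign on the search interval")
    while high - low > tolerance:
        mid = (low + high) / 2.0
        npv_mid = npv(mid, cash_flows)
        if npv_low * npv_mid <= 0:
            high = mid                    # root lies in [low, mid]
        else:
            low, npv_low = mid, npv_mid   # root lies in [mid, high]
    return (low + high) / 2.0


# Illustrative figures only: one outflow followed by three equal inflows.
flows = [-1000.0, 500.0, 500.0, 500.0]
value_at_10_percent = npv(0.10, flows)
rate = irr(flows)
```

Feeding the resulting rate back into `npv` should give a value very close to zero, which is a handy sanity check for any implementation.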

Performance Testing DATE_BUCKET()

Louis Davidson runs some tests:

A month and a half ago, I wrote a blog on using DATE_BUCKET. It is a cool feature that makes doing some grouping really quite easy. It is here: Cool features in SQL Server I missed…DATE_BUCKET. One of the comments that came in was about performance of the DATE_BUCKET versus using things like DATEDIFF or a date table.

I started working on it then, but it got a bit involved (as performance comparison tests often do), so it took me a bit longer to get to than expected. But here it is, and the results are kind of what you would expect. The uses for DATE_BUCKET are really straightforward, and would rarely involve an index or a lot of filtering using the function. But over a large number of rows, if it takes more time (even a millisecond more) than another method, you would notice it pretty quickly adding up.

Read on to see how DATE_BUCKET() performs compared to other methods of solving the same problem.

Code to Perform Binary Search in SQL Server

Andy Brownsword has a procedure:

Let’s recap what we’re doing here:

Large append-heavy tables – like logs or audits – often don’t have a useful index on the timestamp. These types of tables do however have a strong correlation between their clustering key and the timestamp due to chronological inserts.

A binary search approach splits the table in half to narrow down the search space with each iteration. By abusing the incremental relationship between the clustering key and timestamps, we can quickly zero in on the point in time we’re after. If you want to see the mechanics, check out last week’s post.

I love the approach for log tables, assuming that a timestamp is part of the filter. This is a clever application of a very common computer science algorithm to database operations.
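The idea can be sketched in a few lines. This is a Python illustration rather than Andy's procedure: a dictionary keyed by id stands in for seeks on the clustering key, and the search looks for the smallest id whose timestamp is at or after the target. The table contents are made up, and I am assuming contiguous ids; with gaps, each probe would need to resolve to the nearest existing row.

```python
from datetime import datetime, timedelta

# Hypothetical log table: ids increase with insert order, so timestamps
# increase with them. The dict stands in for clustering key seeks.
start = datetime(2024, 1, 1)
log_rows = {row_id: start + timedelta(minutes=5 * row_id)
            for row_id in range(1, 10001)}


def first_id_at_or_after(target, min_id, max_id, lookup):
    """Return (id, probes): the smallest id whose timestamp is >= target.

    id is None when every row is older than the target. probes counts how
    many rows were read along the way.
    """
    low, high = min_id, max_id
    answer = None
    probes = 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1
        if lookup(mid) >= target:
            answer = mid          # candidate; keep looking for an earlier one
            high = mid - 1
        else:
            low = mid + 1
    return answer, probes


target = start + timedelta(days=7)
found_id, probes = first_id_at_or_after(target, 1, 10000, log_rows.__getitem__)
```

Each probe halves the remaining id range, so 10,000 rows need at most 14 single-row reads instead of a scan.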
