New Use Hint In SQL Server 2017 CU10

Pedro Lopes shows us a new use hint introduced in SQL Server 2017 CU10:

In this scenario, you only have this one query that apparently does better in SQL Server 2014 than 2017. That’s all “New CE” – there’s no CE70 vs CE 120+ at issue here. Using any known trace flag, the FORCE_LEGACY_CARDINALITY_ESTIMATION hint or the FORCE_DEFAULT_CARDINALITY_ESTIMATION hint doesn’t help. Rewriting the query is an option, but in the interim, I need a quick fix. How?

In SQL Server 2017 CU10, we have introduced a few new USE HINTs: the QUERY_OPTIMIZER_COMPATIBILITY_LEVEL_n, where n is a supported database compatibility level. This forces the query optimizer behavior at a query level, as if the query was compiled with database compatibility level. You can refer to sys.dm_exec_valid_use_hints for a list of currently supported values for n.

So to be clear, the new hint is not forcing only a specific CE model, it’s forcing the equivalent of the specific database compatibility level’s query optimizer behavior, including any query optimizer fixes that are enabled by default in that database compatibility level.

Something to keep in mind, though ideally not something you’d want to use regularly.

What’s In SQL Server 2019 CTP 2.0?

Aaron Bertrand gives us the highlights:

  • Certificate Management in Config Manager View and validate all of your certificates from a single interface, and manage and deploy certificate changes across all of the replicas in an Availability Group or all of the nodes in a Failover Cluster Instance.

  • Built-in data classification A new ADD SENSITIVITY CLASSIFICATION statement helps you identify and automatically audit sensitive data, a huge step up from the previous SSMS wizard (which just used extended properties).

Aaron also digs into the engine a bit:


This new aggregate function is designed for data warehouse scenarios, and is an equivalent for COUNT(DISTINCT()). Instead of performing expensive distinct sort operations to determine actual counts, it relies instead on statistics to get something relatively accurate. You should find that the margin of error is within 2% of the precise count, 97% of the time, which is usually fine for high-level analytics, values that populate a dashboard, or quick estimates.

On my system I created a table with integer columns ranging from 100 to 1,000,000 unique values, and string columns ranging from 100 to 100,000 unique values. There were no indexes other than a clustered primary key on the leading integer column. Here are the results of COUNT(DISTINCT()) vs. APPROX_COUNT_DISTINCT() against those columns, so you can see where it is off by a bit (but always well within 2%):

By the way, APPROX_COUNT_DISTINCT() is a really good idea, and I’m glad it’s here.

Dealing With Large JSON Values

Bert Wagner investigates an issue he found where his long JSON strings were becoming NULL in SQL Server:

After a little bit more research, I discovered that the return type for JSON_VALUE is limited to 4000 characters.   Since JSON_VALUE is in lax mode by default, if the output has more than 4000 characters, it fails silently.

To force an error in future code I could use SELECT JSON_VALUE(@json, ‘strict $.FiveThousandAs’)  so at least I would be notified immediately of an problem with my  query/data (via failure).

Although strict mode will notify me of issues sooner, it still doesn’t help me extract all of the data from my JSON property.

Read on for the answer.

Deleting Top Records With An Order By Clause

Kenneth Fisher shows that deleting the top N records with an ORDER BY clause is not straightforward:

Did you know you can’t do this?

FROM SalesOrderDetail

Msg 156, Level 15, State 1, Line 8
Incorrect syntax near the keyword ‘ORDER’.

I didn’t. Until I tried it anyway. Turns out, it says so right in the limitations section of BOL. Fortunately, that same section does have a recommendation on how to move forward.

Read on for a couple of methods to do this.

Gaps And Islands: Solving Stochastic Islands Problems

Itzik Ben-Gan shares with us a special case of the islands problem:

In your database you keep track of services your company supports in a table called CompanyServices, and each service normally reports about once a minute that it’s online in a table called EventLog. The following code creates these tables and populates them with small sets of sample data:


The special islands task is to identify the availability periods (serviced, starttime, endtime). One catch is that there’s no assurance that a service will report that it’s online exactly every minute; you’re supposed to tolerate an interval of up to, say, 66 seconds from the previous log entry and still consider it part of the same availability period (island). Beyond 66 seconds, the new log entry starts a new availability period. So, for the input sample data above, your solution is supposed to return the following result set (not necessarily in this order):

It’s a neat twist on an old problem.

Ad Hoc Functions In T-SQL

Riley Major shows a couple techniques for including ad hoc functions in T-SQL, namely Common Table Expressions and the APPLY operator:

It’s helpful to think of each APPLY as a pipe operation, taking the values from the previous derived table and passing them into the next to be manipulated. Programming T-SQL in this manner (loosely) approximates modern functional programming techniques.

It keeps each step of the logic smaller, so that it’s easier to understand. And you can expose the intermediary columns to help with debugging.

This is one of my favorite uses of the APPLY operator, as it lets you think through a problem step-by-step while still allowing the optimizer to create a set-based solution for you.

Using Table-Valued Parameters With sp_executesql

Kenneth Fisher shows how to include table-valued parameters in a dynamic SQL query:

Recently I did a presentation on dynamic SQL. In the presentation I pointed out the similarity of using sp_executesql to creating a stored procedure to do the same task. After the session I was asked: If that’s the case, can I pass a TVP (table valued parameter) into sp_executesql?

Awesome question! Let’s give it a shot.

Read on to see how to do this.

Multiple Mentions Of A Column In An UPDATE Statement

Doug Lane walks us through various scenarios with updates including the same column multiple times:

An application developer came to me with this question recently: “Can I use the same column twice in a SQL UPDATE statement?”

Yes and no.

It depends on what you mean by “use”.

Read on to see what Doug means.

Issues With Bulk Inserting Multi-Byte Characters In Fixed Width Files

Randolph West shares an example of an issue with BULK INSERT:

Fellow Canadian Doran Douglas brought this issue to my attention recently, and I wanted to share it with you as well.

Let’s say you have a file in UTF-8 format. What this means is that some of the characters will be single-byte, and some may be more than that.

Where this becomes problematic is that a fixed-width file has fields that are, well, fixed in size. If a Unicode character requires more than one byte, it’s going to cry havoc and let slip the dogs of truncation.

Click through for an example.  This seems like a bug to me—I interpret fixed-width as fixed number of characters, not fixed number of bytes.  At the very least, it’s liable to cause confusion.

When Scalar Functions Go Bad

Daniel Janik head-fakes us a few times when looking at scalar user-defined function performance:

I’ve read a lot of things lately pointing to scalar functions as if they were the devil. In this blog I’m going to explore if that’s the case. Let’s have a look.

It’s true that in many situations a scalar function is often a performance bottleneck; but, is there a situation where they could be responsibly used?

What if you had a lookup table that almost never changed? Is it worth doing a join on the lookup to get the data you need?

Let’s examine a simple join between a customer address and a state lookup table.

Things are not always as they seem.


September 2018
« Aug