Data Types – Page 8 – Curated SQL

Implicit Conversions and SARGability

Published 2021-09-03 by Kevin Feasel

Data types are one of those core things you need to get right. Whether it’s matching types between join columns, or between parameters and predicates, not doing so can really squash application performance in quite similar ways to writing non-SARGable predicates.
That’s because — wait for it — a lot of the stuff we’ve talked about over the last week that can happen with poorly written predicates can happen with poorly matched data types, too.

Click through for an example. If this keeps up, we may never save Sarge.

Comments closed

sysname and Default Nullability

Published 2021-08-27 by Kevin Feasel

Kenneth Fisher discovers a secret of the sysname datatype:

The other day I was creating a table to store some metadata. Since the metadata I was collecting (sys.databases.name for example) uses the datatype sysname I used the same datatype. In case you didn’t know sysname is an nvarchar(128) user defined datatype used in a number of the system views.

Click through for Kenneth’s finding.

Comments closed

Working with Complex Return Types in Power Query

Published 2021-07-26 by Kevin Feasel

Gerhard Brueckl works with some complex data:

When working with Power Query, you have probably already realized that every expression you write returns a value of a specific type. Usually this will be a primitive type like text, number, or date. (A full list of types available in Power Query can be found here: https://docs.microsoft.com/en-us/powerquery-m/m-spec-types). If for some reason the type of an expression cannot be defined, the special type *any* will be used. For sure you already encountered this when using Table.AddColumn which, by default, results in the new column being of type *any*.

Read on to learn more, including what you can do if you’re creating Power Query functions.

Comments closed

Avoid Deprecated Data Types

Published 2021-07-01 by Kevin Feasel

Aaron Bertrand has some advice for us:

I’ve recently discussed a few features that Microsoft advises against using, and that I think you should forget exist, too. There was the case where a colleague constantly promoted the deprecated backward compatibility view sys.sysprocesses instead of newer dynamic management views (DMVs), and another case where a different colleague took down a production server using SQL Server Profiler.
My latest run-in with things best forgotten is a new stored procedure with an ntext parameter. I checked and, sure enough, the data type matches the schema for the underlying table. My mind started racing about these older data types to explain why we really shouldn’t be using them anymore:

I can’t remember the last time I saw new SQL Server tables created with ntext or other deprecated types, but apparently it happens.

Comments closed

Data Type Conversions and Query Folding

Published 2021-06-30 by Kevin Feasel

Chris Webb explains how data type conversions and query folding might not mix:

It’s surprisingly easy to stop query folding happening in Power Query by changing the data type of a column. This is mentioned in the docs here, and it’s something several people have blogged about already (for example here). However there is something new to note: an option that will allow you to convert text columns to number or date columns in a foldable way for SQL Server data sources.
Consider the following table in a SQL Server database that consists of a single nvarchar(50) column containing numeric values:

Click through for the example, and also check out the comments below for more info.

Comments closed

Storing DATETIMEOFFSETs

Published 2021-06-17 by Kevin Feasel

Randolph West shows us how the DATETIMEOFFSET type is stored in SQL Server:

Cast your mind back to our discussion on DATETIME2. As you know, DATETIME2 is basically the same as squishing DATE (3 bytes) and TIME (between 3 and 5 bytes depending on the scale) into the same column. You end up with a persisted value that is between 6 and 8 bytes wide.
DATETIMEOFFSET is kinda sorta the same thing, but with more bytes on the end. If you take a look at the Microsoft Docs page, the similar idea of a varied column size is retained. For a scale of 0 fractions of a second you only need 8 bytes to store your value, while the default scale of 7 decimal places for storing seconds requires the full 10 bytes.

Click through to understand how the sordid details.

Comments closed

Memory Grants for CHAR vs VARCHAR

Published 2021-06-04 by Kevin Feasel

Erik Darling has a head-slapping moment:

While working with a client recently, we found that someone, at some point in time, probably during the original migration from Access, had chosen CHAR columns rather than VARCHAR columns.
Okay, fine. How bad could it be?
Bad enough that… A whole bunch of columns that only had a single character in them were stored in CHAR(1000) columns.

I like CHAR…well, to be specific, NCHAR. But only when you’ll need exactly that many characters.

Comments closed

Arithmetic Operations on DATETIME Data Types

Published 2021-04-29 by Kevin Feasel

Eitan Bluman shows off some math skills:

Mathematical addition and subtraction can be performed between two datetime data types:
SET @d2 = '1900-03-30 18:00'SELECT@d1 + @d2 -- result: 1900-04-01 10:15:15.900, @d1 - @d2 -- result: 1899-10-05 22:15:15.900, @d2 - @d1 -- result: 1900-03-29 01:44:44.100
This means that we can have basic datetime arithmetics in SQL server. We can use subtraction to find an accurate difference between two dates, and use addition to add an accurate interval to a datetime column or variable.

This is one of those things you can do, but I’m not very fond of. First of all, as Eitan points out, you can’t do these in the (in all ways superior) DATETIME2 data type. Secondly, it adds some confusion to the code, as you don’t always get what you expect.

Comments closed

T-SQL Tuesday 136 Wrap-Up

Published 2021-03-16 by Kevin Feasel

Brent Ozar rounds up the usual suspects, plus several more:

For this month’s T-SQL Tuesday, I asked you to blog about your most-loved and least-loved data types.
Crazy, right? How could people possibly love or hate data types? Well, if you’ve been working with them for a while, I figured you’d have built up an array of tips or pain points, and y’all delivered with 29 interesting blog posts.

Click through for a lengthy list of interesting posts.

Comments closed

Disliking User-Defined Data Types

Published 2021-03-12 by Kevin Feasel

Andy Levy has a bone to pick:

Here’s the thing – these types are really just aliases for native types in SQL Server, but more constrained. Constrain yourself to UDTs and you’ll have trouble right-sizing your fields. Let’s say you’ve got three data types for text data:
– myShortString (varchar(10))
– myString (varchar(256))
– myBigString (varchar(8000))
These lengths are not helping anyone. You can’t store email addresses or names in myShortString. But myString is probably way too much for that data. You’re going to waste memory because of how SQL Server estimates memory grants and your indexes will be bloated. Maybe you just need to create more UDTs to cover these situations. But that just compounds the other problems, doesn’t it?

Pushes glasses up the bridge of his nose. Teeeeechnically, an e-mail address may be up to 256 characters long, including a username of up to 64 characters (and maybe two angle brackets, depending on the host). So myString would actually be perfect. Steve Jones has a comment about 300, but that was probably the original standard of 320. Regardless, I realize how far beside the point that is, and Andy’s point is a good one, as well as the several others he makes in the post.

One quick note on defined types: they really do make a lot of sense in a domain-driven design, especially when working with functional programming languages. Defining a CustomerID as an int is fine, but if I know my customer IDs are natural numbers (1, 2, 3, …), 9 digits long, and do not contain the sequence 2345 (because my company considers this an unlucky number sequence), creating a CustomerID type which provides this sort of type checking is great because you keep the rules as close to the data as possible and ensure consistency. It’s also more restrictive than int, so you can cast back down to an int when you’re ready to interact with some remote system. So short answer, do this all day in F#, but not in the database.

Comments closed

Category: Data Types