Press "Enter" to skip to content

Category: Data Types

Working with Complex Return Types in Power Query

Gerhard Brueckl works with some complex data:

When working with Power Query, you have probably already realized that every expression you write returns a value of a specific type. Usually this will be a primitive type like text, number, or date. (A full list of types available in Power Query can be found here: https://docs.microsoft.com/en-us/powerquery-m/m-spec-types). If for some reason the type of an expression cannot be defined, the special type *any* will be used. For sure you already encountered this when using Table.AddColumn which, by default, results in the new column being of type *any*.

Read on to learn more, including what you can do if you’re creating Power Query functions.

Comments closed

Avoid Deprecated Data Types

Aaron Bertrand has some advice for us:

I’ve recently discussed a few features that Microsoft advises against using, and that I think you should forget exist, too. There was the case where a colleague constantly promoted the deprecated backward compatibility view sys.sysprocesses instead of newer dynamic management views (DMVs), and another case where a different colleague took down a production server using SQL Server Profiler.

My latest run-in with things best forgotten is a new stored procedure with an ntext parameter. I checked and, sure enough, the data type matches the schema for the underlying table. My mind started racing about these older data types to explain why we really shouldn’t be using them anymore:

I can’t remember the last time I saw new SQL Server tables created with ntext or other deprecated types, but apparently it happens.

Comments closed

Data Type Conversions and Query Folding

Chris Webb explains how data type conversions and query folding might not mix:

It’s surprisingly easy to stop query folding happening in Power Query by changing the data type of a column. This is mentioned in the docs here, and it’s something several people have blogged about already (for example here). However there is something new to note: an option that will allow you to convert text columns to number or date columns in a foldable way for SQL Server data sources.

Consider the following table in a SQL Server database that consists of a single nvarchar(50) column containing numeric values:

Click through for the example, and also check out the comments below for more info.

Comments closed

Storing DATETIMEOFFSETs

Randolph West shows us how the DATETIMEOFFSET type is stored in SQL Server:

Cast your mind back to our discussion on DATETIME2. As you know, DATETIME2 is basically the same as squishing DATE (3 bytes) and TIME (between 3 and 5 bytes depending on the scale) into the same column. You end up with a persisted value that is between 6 and 8 bytes wide.

DATETIMEOFFSET is kinda sorta the same thing, but with more bytes on the end. If you take a look at the Microsoft Docs page, the similar idea of a varied column size is retained. For a scale of 0 fractions of a second you only need 8 bytes to store your value, while the default scale of 7 decimal places for storing seconds requires the full 10 bytes.

Click through to understand how the sordid details.

Comments closed

Memory Grants for CHAR vs VARCHAR

Erik Darling has a head-slapping moment:

While working with a client recently, we found that someone, at some point in time, probably during the original migration from Access, had chosen CHAR columns rather than VARCHAR columns.

Okay, fine. How bad could it be?

Bad enough that… A whole bunch of columns that only had a single character in them were stored in CHAR(1000) columns.

I like CHAR…well, to be specific, NCHAR. But only when you’ll need exactly that many characters.

Comments closed

Arithmetic Operations on DATETIME Data Types

Eitan Bluman shows off some math skills:

Mathematical addition and subtraction can be performed between two datetime data types:

SET @d2 = '1900-03-30 18:00'SELECT@d1 + @d2 -- result: 1900-04-01 10:15:15.900, @d1 - @d2 -- result: 1899-10-05 22:15:15.900, @d2 - @d1 -- result: 1900-03-29 01:44:44.100

This means that we can have basic datetime arithmetics in SQL server. We can use subtraction to find an accurate difference between two dates, and use addition to add an accurate interval to a datetime column or variable.

This is one of those things you can do, but I’m not very fond of. First of all, as Eitan points out, you can’t do these in the (in all ways superior) DATETIME2 data type. Secondly, it adds some confusion to the code, as you don’t always get what you expect.

Comments closed

T-SQL Tuesday 136 Wrap-Up

Brent Ozar rounds up the usual suspects, plus several more:

For this month’s T-SQL Tuesday, I asked you to blog about your most-loved and least-loved data types.

Crazy, right? How could people possibly love or hate data types? Well, if you’ve been working with them for a while, I figured you’d have built up an array of tips or pain points, and y’all delivered with 29 interesting blog posts.

Click through for a lengthy list of interesting posts.

Comments closed

Disliking User-Defined Data Types

Andy Levy has a bone to pick:

Here’s the thing – these types are really just aliases for native types in SQL Server, but more constrained. Constrain yourself to UDTs and you’ll have trouble right-sizing your fields. Let’s say you’ve got three data types for text data:

– myShortString (varchar(10))
– myString (varchar(256))
– myBigString (varchar(8000))

These lengths are not helping anyone. You can’t store email addresses or names in myShortString. But myString is probably way too much for that data. You’re going to waste memory because of how SQL Server estimates memory grants and your indexes will be bloated. Maybe you just need to create more UDTs to cover these situations. But that just compounds the other problems, doesn’t it?

Pushes glasses up the bridge of his nose. Teeeeechnically, an e-mail address may be up to 256 characters long, including a username of up to 64 characters (and maybe two angle brackets, depending on the host). So myString would actually be perfect. Steve Jones has a comment about 300, but that was probably the original standard of 320. Regardless, I realize how far beside the point that is, and Andy’s point is a good one, as well as the several others he makes in the post.

One quick note on defined types: they really do make a lot of sense in a domain-driven design, especially when working with functional programming languages. Defining a CustomerID as an int is fine, but if I know my customer IDs are natural numbers (1, 2, 3, …), 9 digits long, and do not contain the sequence 2345 (because my company considers this an unlucky number sequence), creating a CustomerID type which provides this sort of type checking is great because you keep the rules as close to the data as possible and ensure consistency. It’s also more restrictive than int, so you can cast back down to an int when you’re ready to interact with some remote system. So short answer, do this all day in F#, but not in the database.

Comments closed

In Defense of Float

Hugo Kornelis levies a defense of floating point data types:

Let’s return to the database. Let’s figure out a way to store these numbers appropriately.

Could we use decimal (or its synonym numeric)? Well, yes. We can. We need 25 digits after the decimal place for 0.0000000000000000000004052, and 9 digits before the decimal place for 299900000, so that would fit in a decimal(34,25). But if you try to compute c2 so you can then multiply that to the m, you’ll run into an overflow error.

Hugo does a good job of defending the float data type.

Comments closed

Two Disliked Data Types

Aaron Bertrand has two bones to pick:

I am not often one to do the bare minimum, unless it comes to looking for things around the house. After about 22 seconds I throw my hands in the air and exclaim, “I can’t find it!” As my wife so kindly added: It usually turns out to be in a spot I already looked.

But when it revolves around SQL Server and opinions, I’m all over it. So I’m not going to talk about a data type today; I’m going to talk about two of them. One of them Brent already mentioned in his invite:

I kinda-sorta disagree with Aaron’s second choice. By that I mean that I fully agree with his premise: use UTC everywhere. But if you don’t use UTC everywhere, then use DATETIMEOFFSET everywhere and apply the time zones.

Comments closed