Press "Enter" to skip to content

Category: Data Types

sp_help And Column Lengths

Jon Morisi looks at an oddity with the way that sp_help shows column length:

I was working with a contractor today who was having difficulty providing me back details on a table definition.  I was specifically interested in a particular column’s data type and size.  (This was related to an ETL process I was working on, and my desire to avoid any implicit conversions).

The reply I got back was, “the column you’re interested in is an nvarchar(100)”.  After continued digging and troubleshooting, I was eventually able to sort out that it was actually an nvarchar(50).

I put together this TEST table to illustrate where the confusion came from.  Can you spot what’s going on?

There’s an interesting explanation which makes me dislike sp_help just a little bit more.

Comments closed

Sparse Columns

Slava Murygin discusses sparse columns:

“Sparsing” is the way SQL Server optimizes spacing for NULL values at the cost of overhead for non-NULL values.
In other words, if you expect having in your column more nulls than not nulls you can SPARSE that column to optimize the space.

I’ve seen the situations when a lot of columns in Data Mart tables were almost completely filled with NULLS and I started wondering if “SPARSE” can be a good tool to gain some space.

Read the whole thing.  I am not a fan of sparse columns because they prohibit things like page-level compression.  Be sure to read the restrictions on using sparse columns before you give them a try; on net, I think they’re more trouble than they’re worth except in edge cases like extremely denormalized tables collecting thousands of data points from sensors.

Comments closed

Max And Min Decimal Values

Robert Davis gives us the formula for the max and min decimal values given a scale and precision:

Unfortunately, the mathematical approach has flaws. First of all, Power(10, 38) exceeds the range of any numerical data type in SQL Server. There is no way to store or work with this value in calculations. Secondly, once you try to raise 10 to a power greater than 30, you start seeing floating point calculations (the values are approximate). For example, select Power(Cast(10 as decimal(38, 0)), 31) — casting as decimal(38, 0) because it exceeds int or bigint — yields 9999999999999999600000000000000. That’s clearly an approximated value and is not going to work for calculations where we’re expecting a precise value. So, that leaves the hacky version I didn’t want to do because I just cannot count on the mathematical approach working. Here is the base query using the hacky calculation

Read on for the hacky version.

Comments closed

Changing Computed Columns

Steve Jones shows a how to modify a computed column:

I was working with a computed column the other day, and realized I had the wrong definition. In this case, I was performing some large calculation, and the result was larger than an int. However the first part of the formula was an int, which resulted in an implicit conversion to an int.

I needed to change the formula, and then realized that plenty of people might not work with computed columns much, and not realize how you alter a computed column.

You don’t.

In fact, you need to drop the column and add it back. In my case, this was what I did.

Fortunately, this tends to be a pretty quick operation, especially if the computed column is non-persisted.

Comments closed

Data Types

Grant Fritchey implores you to use the correct data type:

Far too often then, the easy answer, just change the column to varchar. You can trim the time and output in exactly the format needed by the business. Problem solved and it was easy…

Well, until someone inputs “Janry 20, 2009” slightly mangling the spelling and suddenly your report looks all messed up. Or, they ask you to start filtering just the last two weeks, regardless of when the report was run and you now can’t easily dodate math on the column. Even after you get over that problem with a little formatting using CAST (along with ISDATE to try to catch all those other typos that are in the system now)  you notice that the performance is really slow so you go to put an index on the column and now you have an index key that is 50 bytes wide instead of the 3 bytes that the DATE data type would have been, making the index less efficient (not to mention, sorting the data is going to put February ahead of January, more formatting).

Read the whole thing.

Comments closed


Kenneth Fisher warns against low VARCHAR sizes:

The first thing you’ll notice is that a single space is stored the same way in both columns. With an empty string, on the other hand, we see a difference. Char columns are fixed length. So even though we inserted an empty string into it we get back a single space.

The next major difference is that varchar columns require an extra two bytes of storage. So a varchar(1) column actually uses three bytes not just the one byte that char(1) does.

This is exactly the type of scenario row-level compression improves.

Comments closed