# Category: Data Types

“Sparsing” is the way SQL Server optimizes spacing for NULL values at the cost of overhead for non-NULL values.
In other words, if you expect having in your column more nulls than not nulls you can SPARSE that column to optimize the space.

I’ve seen the situations when a lot of columns in Data Mart tables were almost completely filled with NULLS and I started wondering if “SPARSE” can be a good tool to gain some space.

Read the whole thing.  I am not a fan of sparse columns because they prohibit things like page-level compression.  Be sure to read the restrictions on using sparse columns before you give them a try; on net, I think they’re more trouble than they’re worth except in edge cases like extremely denormalized tables collecting thousands of data points from sensors.

Unfortunately, the mathematical approach has flaws. First of all, Power(10, 38) exceeds the range of any numerical data type in SQL Server. There is no way to store or work with this value in calculations. Secondly, once you try to raise 10 to a power greater than 30, you start seeing floating point calculations (the values are approximate). For example, select Power(Cast(10 as decimal(38, 0)), 31) — casting as decimal(38, 0) because it exceeds int or bigint — yields 9999999999999999600000000000000. That’s clearly an approximated value and is not going to work for calculations where we’re expecting a precise value. So, that leaves the hacky version I didn’t want to do because I just cannot count on the mathematical approach working. Here is the base query using the hacky calculation

Read on for the hacky version.

I was working with a computed column the other day, and realized I had the wrong definition. In this case, I was performing some large calculation, and the result was larger than an int. However the first part of the formula was an int, which resulted in an implicit conversion to an int.

I needed to change the formula, and then realized that plenty of people might not work with computed columns much, and not realize how you alter a computed column.

You don’t.

In fact, you need to drop the column and add it back. In my case, this was what I did.

Fortunately, this tends to be a pretty quick operation, especially if the computed column is non-persisted.

Far too often then, the easy answer, just change the column to varchar. You can trim the time and output in exactly the format needed by the business. Problem solved and it was easy…

Well, until someone inputs “Janry 20, 2009” slightly mangling the spelling and suddenly your report looks all messed up. Or, they ask you to start filtering just the last two weeks, regardless of when the report was run and you now can’t easily dodate math on the column. Even after you get over that problem with a little formatting using CAST (along with ISDATE to try to catch all those other typos that are in the system now)  you notice that the performance is really slow so you go to put an index on the column and now you have an index key that is 50 bytes wide instead of the 3 bytes that the DATE data type would have been, making the index less efficient (not to mention, sorting the data is going to put February ahead of January, more formatting).

The first thing you’ll notice is that a single space is stored the same way in both columns. With an empty string, on the other hand, we see a difference. Char columns are fixed length. So even though we inserted an empty string into it we get back a single space.

The next major difference is that varchar columns require an extra two bytes of storage. So a varchar(1) column actually uses three bytes not just the one byte that char(1) does.

This is exactly the type of scenario row-level compression improves.