Category: Data Types

Unicode Character Generation in Power Query

Meagan Longoria needs more Unicode:

You may have used the UNICHAR() function in DAX to return Unicode characters in DAX measures. If you haven’t yet read Chris Webb’s blog post on the topic, I recommend you do. But did you know there is a Power Query function that can return Unicode characters? This can be useful in cases when you want to assign a Unicode character to a categorical value.

Click through to see how this works.

Resolving Implicit Conversion on a Join

Andrea Allred has a process:

As I was troubleshooting a performance issue, I noticed that there was an implicit conversion (SQL Server automatically converts the data from one data type to another) happening in my join. The join was on a column that was named the same in both tables, but one was datatype INT (integer) and the other was a datatype of VARCHAR(50) (variable character up to 50 places).

Read on for one way to resolve this issue…so long as no other calling code expects a string on a call.
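
As a rough sketch of the situation (the table and column names here are made up, not from Andrea's post), the mismatch and one possible fix look something like this:

-- Hypothetical repro: the join key is INT on one side and VARCHAR(50) on the other,
-- so SQL Server must implicitly convert one side before it can compare them.
CREATE TABLE dbo.Orders      (CustomerID INT         NOT NULL);
CREATE TABLE dbo.CustomerRef (CustomerID VARCHAR(50) NOT NULL);

SELECT o.CustomerID
FROM dbo.Orders AS o
    INNER JOIN dbo.CustomerRef AS c
        ON o.CustomerID = c.CustomerID;  -- implicit conversion happens here

-- One fix: align the data types so no conversion is needed, assuming nothing
-- downstream relies on the column being a string.
ALTER TABLE dbo.CustomerRef
    ALTER COLUMN CustomerID INT NOT NULL;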

Linear Data Structures

Dave Saunders describes the concept of linear data structures:

As we said in the introduction, a data structure is ‘linear’ if the elements form a sequence.

That means that the data structure has a first and last element, and each element is connected to its previous and next element.

– An ‘array’ is a linear data structure; the items are stored sequentially.

– A ‘graph’ is not a linear data structure; any node can be linked to any other node in the graph – there is no fixed ‘sequence’.

Click through for several examples of linear data structures.

Unicode and Data Length

Kevin Wilkie lays out an argument:

If you truly need the UNICODE characters in your data, go ahead and use them! If not, though, please make your DBA happy by not using them. Since UNICODE characters take up twice as much space as the ASCII versions do, your DBAs will recommend using the ASCII versions if you are not going to be using any UNICODE characters.

Read on for the justification. But I’m still NVARCHAR (Almost) Everywhere.
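
The space argument is easy to see with DATALENGTH(); a minimal sketch:

-- NVARCHAR stores two bytes per character (for characters in the base plane),
-- while VARCHAR stores one, so the same text takes twice the space.
DECLARE @ascii   VARCHAR(50)  = 'Data Types';
DECLARE @unicode NVARCHAR(50) = N'Data Types';

SELECT
    DATALENGTH(@ascii)   AS VarcharBytes,   -- 10
    DATALENGTH(@unicode) AS NvarcharBytes;  -- 20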

Using CHAR

Steve Jones wants to know if people are still using CHAR:

I ran across an article, titled When to use CHAR, VARCHAR, or VARCHAR(MAX), which struck me as something I never do. I mean, I do use varchar (and nvarchar), but I can’t remember the last time I actually created a char column. The article is worth a read, and it sets the stage for you to think about your database design process and the choices you make.

It’s Friday, and if you take a few minutes and think about the last few times you’ve added columns to a table, have you used CHAR as a data type? Or do you default to varchar of any size as a general rule?

Yeah, using CHAR (or NCHAR, which you’d probably want to use instead) is pretty rare. This is a consequence of storage becoming much less of a constraint over the past couple decades, as well as considerable gains in efficiency for data platform products. (N)CHAR might be a little faster than (N)VARCHAR for certain operations and may be a bit more efficient for single-character columns, but I agree with Steve’s premise.
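
If it has been a while, the padding behavior is the main thing to remember about CHAR; a quick sketch:

-- CHAR always stores its full declared length, padding with trailing spaces;
-- VARCHAR stores only the characters you actually supply.
DECLARE @fixed    CHAR(10)    = 'ID-42';
DECLARE @variable VARCHAR(10) = 'ID-42';

SELECT
    DATALENGTH(@fixed)    AS CharBytes,     -- 10
    DATALENGTH(@variable) AS VarcharBytes;  -- 5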

Choosing between String Data Types in SQL

Greg Larsen compares CHAR, VARCHAR, and VARCHAR(MAX):

In every database, there are different kinds of data that need to be stored. Some data is strictly numeric, while other data consists of only letters or a combination of letters, numbers, and even special symbols. Whether it is just stored in memory or on disk, each piece of data requires a data type. Picking the correct data type depends on the characteristics of the data being stored. This article explains the differences between CHAR, VARCHAR, and VARCHAR(MAX).

Click through for Greg’s explanation. My official rule of thumb is as follows:

  • If you have a fixed-length code which you display to customers, use NCHAR.
  • If you have a fixed-length code which you only use internally and you know that the code will never include characters outside of your SQL Server installation’s code page and you know that the code page will never change…probably still use NCHAR, though if you twist my arm enough I’d say fine, CHAR.
  • Otherwise, use NVARCHAR.

Three decades ago, the choice was a lot trickier given performance differences between the two. Today? Unless you’re hunting for microseconds, I don’t think you’ll see a practical difference. And if you are hunting for microseconds, you probably want more than rules of thumb.
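
Put into a table definition, that rule of thumb might look something like this (the table and columns are invented for illustration):

CREATE TABLE dbo.Product
(
    ProductID   INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    ProductCode NCHAR(8)      NOT NULL,  -- fixed-length code shown to customers
    ProductName NVARCHAR(200) NOT NULL   -- variable-length text, so NVARCHAR
);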

Swap an Identity Column with a Sequence

Greg Larsen looks at sequences:

Microsoft introduced the sequence number objects starting with SQL Server 2012. A sequence object generates sequence numbers based on starting and increment values, similar to an identity column, but it has additional features. Over time, you might find that the additional benefits of a sequence number have you wanting to replace an identity column with a sequence number. This article demonstrates two options for replacing an identity column with a sequence number.

When it came out, I thought I was going to use sequences quite often. In practice, I’ve used them a few times, but IDENTITY is so much easier to type and I’ve rarely needed sequence generators. That said, the times I have needed them, I definitely appreciate their existence.
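
For anyone who hasn’t looked at them since 2012, the basic shape of a sequence (the object names here are hypothetical) is:

-- A sequence is a free-standing number generator, not tied to any one table.
CREATE SEQUENCE dbo.OrderNumber
    AS INT
    START WITH 1
    INCREMENT BY 1;

-- It can stand in for IDENTITY as a column default...
CREATE TABLE dbo.OrderHeader
(
    OrderID INT NOT NULL
        CONSTRAINT DF_OrderHeader_OrderID DEFAULT (NEXT VALUE FOR dbo.OrderNumber)
);

-- ...and, unlike IDENTITY, you can grab a value without inserting anything.
SELECT NEXT VALUE FOR dbo.OrderNumber;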

Wildcards and Data Type Precedence

Daniel Hutmacher has fun with implicit conversion:

Implicit conversions in SQL Server follow a specific, predictable order, called data type precedence. This means that if you compare or add/concatenate two values, a and b, with different data types, you can predict which one will be implicitly converted to the data type of the other one in order to be able to complete the operation.

I stumbled on an interesting exception to this rule the other day.

Click through for an example of date searching with LIKE.
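
To see the baseline rule before the exception, a minimal sketch (nothing here is from Daniel's post):

-- Under data type precedence, the lower-precedence type is converted to the
-- higher-precedence one: comparing INT to a string converts the string to INT.
DECLARE @i INT = 42;
SELECT CASE WHEN @i = '42' THEN 'Match' ELSE 'No match' END AS IntVersusString;

-- The same rule says a string compared to a DATETIME becomes a DATETIME,
-- which is what makes the LIKE-on-dates behavior an interesting exception.
DECLARE @d DATETIME = '2021-07-01';
SELECT CASE WHEN @d = '2021-07-01' THEN 'Match' ELSE 'No match' END AS DateVersusString;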

On Avoiding NULL

Aaron Bertrand defends NULL:

A long time ago, I answered a question about NULL on Stack Exchange entitled, “Why shouldn’t we allow NULLs?” I have my share of pet peeves and passions, and the fear of NULLs is pretty high up on my list. A colleague recently said to me, after expressing a preference to force an empty string instead of allowing NULL:

“I don’t like dealing with nulls in code.”

I’m sorry, but that’s not a good reason. How the presentation layer deals with empty strings or NULLs shouldn’t be the driver for your table design and data model. And if you’re allowing a “lack of value” in some column, does it matter to you from a logical standpoint whether the “lack of value” is represented by a zero-length string or a NULL? Or worse, a token value like 0 or -1 for integers, or 1900-01-01 for dates?

Definitely read what Aaron has to say. I disagree with the tenor of his point enough that, now that I’m actually blogging again, I’ll have a post up tomorrow laying out the core of my disagreement. Stay tuned!
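
Whichever side of the debate you land on, Aaron’s point about token values is easy to demonstrate: aggregates ignore NULL but happily treat 1900-01-01 as real data. A small sketch with a made-up table:

CREATE TABLE #Customer
(
    CustomerID INT  NOT NULL,
    BirthDate  DATE NULL
);

INSERT INTO #Customer (CustomerID, BirthDate)
VALUES (1, '1985-06-15'),
       (2, NULL),           -- unknown, represented honestly
       (3, '1900-01-01');   -- unknown, represented as a token value

-- NULL is skipped by the aggregate, but the token value wins and lies to you.
SELECT MIN(BirthDate) AS OldestBirthDate   -- returns 1900-01-01
FROM #Customer;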
