More Collation Comparison Fun

Solomon Rutzky wraps up his collation comparison series:

Just in case you were wondering: No, the test directly above test does not prove the documentation (as quoted at the top of Part A) correct. Yes, the documentation did state that characters would be converted to the Code Page specified by the Collation of the Database or Column, which does appear to be what is going on here. But, the differences are:

  1. The documentation states that the transformation happens if you reference a Unicode datatype, but what we just saw in the most recent test is the exact opposite:

    • only the NVARCHAR columns of the “Subscript 2” row match because they are still the “Subscript 2” character, while the NVARCHAR columns of the other two rows do not match due to being either “2” or “?”.
    • transformation did occur in the Latin1 and Hebrew VARCHAR columns, which is how it matched both rows with “2” in the Latin1 column and both rows with “?” in the Hebrew column.
  2. In the scenario involving another column where it would help to prefix the string literal with a capital-“N” (i.e. a VARCHAR column using a Collation that specifies a different Code Page than the Database’s Code Page), the Collation of the Database is not used for the transformation; it is only the referenced column’s Collation. The assumption here is that the string literal without the capital-“N” is being used in a Database where the Collation specifies a Code Page that has all of the characters.

  3. In the scenarios where the Database’s Collation, via its specified Code Page, did transform a string literal that was not prefixed with a capital-“N” (the two tests in the previous post), there would have already been unintended behavior no matter how the string literal was used.

I still think it’s a duck.

Related Posts

The Hidden Performance Costs Of Collation Mismatch

Nate Johnson explains why you want collation consistency when joining tables together on varchar/nvarchar columns: There’s a subtle difference here, vs. those many community blog posts, which I’ll repeat.  The columns are of the same type.  Just different collations. And when the collation on the join predicates is different, bad things happen. Let’s take CustomerNumber for example. On the ERP […]

Read More

Collations Used In NVarchar To Varchar Conversions

Solomon Rutzky digs into collations: So, the actual question is a bit more specific than would reasonably fit into a title, and it is: In a WHERE condition (in Microsoft SQL Server, if that’s not obvious), when a string constant containing Unicode characters but not prefixed with a capital-N is compared to a column, which Collation (and hence Code Page) […]

Read More


December 2017
« Nov Jan »