Thoughts On UTF-8 Encoding In SQL Server 2019

Let’s start with what we are told about this new feature. According to the documentation, the new UTF-8 Collations:

1. can be used …
   a. as a database-level default Collation
   b. as a column-level Collation
   c. by appending “_UTF8” to the end of any Supplementary Character-Aware Collation (i.e. either having “_SC” in their name, or being of level 140 or newer)
   d. with only the CHAR and VARCHAR data types
2. (implied) have no effect on NCHAR and NVARCHAR data (meaning: for these types, the UTF-8 Collations behave the same as their non-UTF-8 equivalents)
3. “This feature may provide significant storage savings, depending on the character set in use.” (emphasis mine)
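
The “depending on the character set in use” caveat is easy to see with a quick byte count. The sketch below is not from Solomon's post; it is a rough Python illustration that assumes VARCHAR data under a “_UTF8” Collation is stored as UTF-8 and NVARCHAR data as UTF-16 (little endian). The helper name `byte_sizes` and the sample strings are made up for illustration, and the comparison ignores row overhead and data compression.

```python
def byte_sizes(text):
    """Return (utf8_bytes, utf16_bytes) for a string.

    Assumption: UTF-8 approximates VARCHAR storage under a _UTF8 Collation,
    and UTF-16 LE approximates NVARCHAR storage. Row overhead and
    compression are not modeled.
    """
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


# Hypothetical sample strings, one per code point range of interest.
samples = {
    "ASCII": "Hello",
    "Accented Latin": "\u00e9",            # e with acute accent
    "CJK": "\u65e5\u672c\u8a9e",           # three BMP code points above U+07FF
    "Supplementary": "\U0001F600",         # one code point outside the BMP
}

for label, text in samples.items():
    utf8_len, utf16_len = byte_sizes(text)
    print(f"{label}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

Per code point, UTF-8 needs 1 byte for U+0000 to U+007F, 2 bytes for U+0080 to U+07FF, 3 bytes for U+0800 to U+FFFF, and 4 bytes for Supplementary Characters, while UTF-16 needs 2 bytes for anything in the BMP and 4 bytes for Supplementary Characters. So the savings only show up for mostly-ASCII data, and text dominated by the 3-byte range takes more space in UTF-8 than in UTF-16.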

Solomon takes his normal, thorough approach to the problem and finds several issues.