For
VARCHAR
, some of you might be thinking that it was “1” until recently when SQL Server 2019 introduced the “_UTF8
” collations. Nope. The last time “1” was correct forVARCHAR
was back in SQL Server 7.0, before SQL Server 2000 introduced the Windows collations which offered some Double-Byte Character Sets.For
NVARCHAR
, some of you might be thinking that it was “2” until SQL Server 2012 introduced the “_SC
” collations that fully support Supplementary Characters (UTF-16). Sorry, still incorrect. “2” was never technically correct forNVARCHAR
, it was only temporarily correct for the first few years (until Supplementary Characters were defined in Unicode 3.1, released in March, 2001). Ever since SQL Server 7.0 introduced theNCHAR
,NVARCHAR
, andNTEXT
datatypes, it has been possible to store whatever UTF-16 byte sequences you want, even if they are currently undefined. The older collations do not recognize surrogate pairs / Supplementary Characters, but that’s not related to SQL Server’s ability to store and retrieve any 16-bit code point. As long as you are using a font that supports Supplementary Characters, they should display correctly.
Solomon is one of a handful of people I’ve met who has collations and characters down cold.