Thoughts On UTF-8 Encoding In SQL Server 2019

Solomon Rutzky digs into UTF-8 support in SQL Server 2019 and has found a few bugs:

Let’s start with what we are told about this new feature. According to the documentation, the new UTF-8 Collations:

  1. can be used …

    1. as a database-level default Collation
    2. as a column-level Collation
    3. by appending “_UTF8” to the end of any Supplementary Character-Aware Collation (i.e. either having “_SC” in their name, or being of level 140 or newer)
    4. with only the CHAR and VARCHAR
  2. (implied) have no effect on NCHAR and NVARCHAR data (meaning: for these types, the UTF-8 Collations behave the same as their non-UTF-8 equivalents

  3. “This feature may provide significant storage savings, depending on the character set in use.” (emphasis mine)

Solomon takes his normal, thorough approach to the problem and finds several issues.

Related Posts

String or Binary Data and Associated Bug

Brent Ozar looks at the “String or binary data would be truncated” improvement: Don’t leave this trace flag enabled.There’s at least one bug with it as of today on SQL Server 2017 CU13: table variables will throw errors saying their contents are being truncated even when no data is going into them. Andrew Pruski reports: […]

Read More

Microsoft Data Platform Bug Reporting Links

Kevin Feasel

2019-02-28

Bugs

Brent Ozar has put together a compendium of where you should go if you want to file bug reports or feature requests for different products in the Microsoft data platform space: Azure Data Studio – open an issue in the Github repo. While you open an issue, Github helps by searching the existing issues as you’re typing, so […]

Read More

Categories

October 2018
MTWTFSS
« Sep Nov »
1234567
891011121314
15161718192021
22232425262728
293031