BCP For UTF-8

Sanjay Mishra notes that SQL Server 2016 and 2014 SP2 support UTF-8 for BCP and BULK INSERT:

This requirement has been addressed in SQL Server 2016 (and backported to SQL Server 2014 SP2). To test this, I obtained a UTF-8 dataset from http://www.columbia.edu/~fdc/utf8/. The dataset is translation of the sentence “I can eat glass and it doesn’t hurt me” in several languages. A few lines of sample data are shown here:

(As an aside, it is entirely possible to load Unicode text such as above into SQL Server even without this improvement, as long as the source text file uses a Unicode encoding other than UTF-8.)

I ran into this problem before, where developers wanted to bulk load UTF-8 but had to settle for an inferior solution.

Related Posts

Using The Spark Connector To Speed Up Data Loads

Denzil Riberio explains how you can use the Spark connector for Azure SQL DB and SQL Server to speed up inserting data from Spark into SQL Server 15x over the native JDBC client: Since the load was taking longer than expected, we examined the sys.dm_exec_requests DMV while load was running, and saw that there was a fair […]

Read More

Automatic Retry With Optimistic Concurrency

Vladimir Khorikov explains an anti-pattern when dealing with a model using optimistic concurrency (for example, memory-optimized tables): Alright, back to the original question. So, how to combine optimistic locking and automatic retry? In other words, when the application gets an error from the database saying that the versions of a Product don’t match, how to […]

Read More

Categories

October 2016
MTWTFSS
« Sep Nov »
 12
3456789
10111213141516
17181920212223
24252627282930
31