BCP For UTF-8

Sanjay Mishra notes that SQL Server 2016 and 2014 SP2 support UTF-8 for BCP and BULK INSERT:

This requirement has been addressed in SQL Server 2016 (and backported to SQL Server 2014 SP2). To test this, I obtained a UTF-8 dataset from http://www.columbia.edu/~fdc/utf8/. The dataset is translation of the sentence “I can eat glass and it doesn’t hurt me” in several languages. A few lines of sample data are shown here:

(As an aside, it is entirely possible to load Unicode text such as above into SQL Server even without this improvement, as long as the source text file uses a Unicode encoding other than UTF-8.)

I ran into this problem before, where developers wanted to bulk load UTF-8 but had to settle for an inferior solution.

Related Posts

Automatic Retry With Optimistic Concurrency

Vladimir Khorikov explains an anti-pattern when dealing with a model using optimistic concurrency (for example, memory-optimized tables): Alright, back to the original question. So, how to combine optimistic locking and automatic retry? In other words, when the application gets an error from the database saying that the versions of a Product don’t match, how to […]

Read More

MERGE In Hive

Carter Shanklin introduces the MERGE operator in Hive: USE CASE 2: UPDATE HIVE PARTITIONS. A common strategy in Hive is to partition data by date. This simplifies data loads and improves performance. Regardless of your partitioning strategy you will occasionally have data in the wrong partition. For example, suppose customer data is supplied by a […]

Read More

Categories

October 2016
MTWTFSS
« Sep Nov »
 12
3456789
10111213141516
17181920212223
24252627282930
31